Holberton School and The Low-Level Algorithms

The memory and the processes

How processes interact with the virtual memory

Gustavo Adolfo Mejía


Header image: a combination of works by Filindmitriy86 and oldschoolroleplaying.

Every time you run a program on your PC (bash, vim, Photoshop, Chrome or MATLAB), the executable or binary file and its variables must be loaded somewhere. The question is where.

Many people think that everything lives in RAM. That is not entirely correct, because parts of it can sometimes sit on disk. This may be new to you, but take it easy, I’m not crazy. In this entry, I’m going to use an example to illustrate what happens with the memory of a process when you execute a binary file.

The process

Every time you run a program or a binary file, the OS creates an instance of it, called a process. Each process has an associated ID, which you can see with the ps program. Use ps -A to list all the processes.

Running ps.

We are going to use this process ID (PID) to identify our program.
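As a rough sketch of where that list of PIDs comes from on Linux: each running process shows up as a numerically named directory under /proc (the directory explored later in this post). The helper name list_pids and its proc_root parameter are made up for this illustration, and the demo simply skips the listing on systems without /proc.

```python
import os


def list_pids(proc_root="/proc"):
    """Return the numeric directory names under proc_root as sorted ints."""
    pids = []
    for name in os.listdir(proc_root):
        # Each running process appears as a directory named after its PID.
        if name.isdigit() and os.path.isdir(os.path.join(proc_root, name)):
            pids.append(int(name))
    return sorted(pids)


if __name__ == "__main__":
    print("my own PID:", os.getpid())
    if os.path.isdir("/proc"):
        print("first PIDs found:", list_pids()[:10])
```

This is not a replacement for ps, just a way to see that the PID is the key used to look a process up.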

The binary

If we want to know how the memory works, we first need a simple program: our «hello world».

This binary must have some features:

  • It has to allocate some memory (we need something in the heap).
  • It has to show the PID and the memory allocated.
  • And let’s include a counter and a sleep for better visualization.

#include <stdio.h>  /* printf */
#include <string.h> /* strdup */
#include <unistd.h> /* sleep, getpid */

/**
 * main - simple program which shows
 * the PID and the allocated memory
 * Return: 1 if the allocation fails, otherwise it never returns
 */
int main(void)
{
    char *msg;
    pid_t pid;
    int counter;

    /* strdup copies the string into heap memory */
    msg = strdup("Welcome to Holberton");
    if (msg == NULL)
        return (1);
    pid = getpid();
    counter = 0;
    while (1)
    {
        printf("[%d] %4d: %s\n", (int)pid, counter++, msg);
        sleep(1);
    }
}

OK, now let's create the file, compile it with gcc and execute it as ./a.out:

Creating Infinite program.

The output of the program was:

[68461]    0: Welcome to Holberton
[68461]    1: Welcome to Holberton
[68461]    2: Welcome to Holberton
.
.
[68637]    1: Welcome to Holberton
[68637]    2: Welcome to Holberton
.
.

It printed the PID, a counter and Welcome to Holberton every second. But as you can see, across runs the PID was 68461, 68637, 68705 and 68830. This is because every time the binary was executed, the OS created a new process.
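The same effect can be sketched in Python. Here a tiny child interpreter stands in for ./a.out (that substitution is an assumption of this example, as are the names CHILD_CODE and spawn_and_collect): every launch is a separate process, so each one reports its own PID.

```python
import subprocess
import sys

# Stand-in for ./a.out: a child that only prints its own PID.
CHILD_CODE = "import os; print(os.getpid())"


def spawn_and_collect(n=2):
    """Start n child interpreters and return the PID each one printed."""
    # Start every child before waiting on any of them, so that all the
    # PIDs are in use at the same time and cannot be recycled.
    children = [
        subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                         stdout=subprocess.PIPE, text=True)
        for _ in range(n)
    ]
    return [int(child.communicate()[0]) for child in children]


if __name__ == "__main__":
    print(spawn_and_collect())
```

The actual numbers will differ on every machine and every run, which is exactly the point.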

/proc

OK, we have our program running, and we have its PID. The next step is to check what has been generated. This information can be found in the directory /proc/<PID>/.

But what is proc?

The proc file system acts as an interface to internal data structures in the
kernel. It can be used to obtain information about the system and to change
certain kernel parameters at runtime (sysctl)[source].

We are going to work with the process 2724 and this is a screenshot of what the /proc/<PID> directory has inside:

Content of /proc/<PID>.

Here, there are some interesting files, links, and directories. Let’s check some of them:

  • fd/: This directory contains one entry per open file descriptor, including stdin (0), stdout (1) and stderr (2). If you want to check the output or input of your process, this is the place.
  • mem: It is the content of the process’s virtual memory. This is what we want to hack.
  • maps: It shows the layout of the virtual memory as human-readable information.
  • status: This file contains human-readable information about the process: PID, PPID, name, UID, groups, etc.
  • environ: This file has the environment variables of the process.
  • cmdline: This file has the command-line arguments of the program.
  • cwd/: This is a symbolic link to the current working directory of the process.
  • root/: This is another symbolic link, but it points to the process’s root directory, normally /.
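One detail worth knowing before opening cmdline or environ: their entries are separated by NUL bytes rather than spaces or newlines, so printing them raw looks like one glued-together string. A minimal sketch of splitting them follows; the helper names split_nul and read_cmdline are made up for this example.

```python
def split_nul(raw):
    """Split a NUL-separated /proc blob (cmdline or environ) into strings."""
    # Empty pieces (such as the one after the trailing NUL) are dropped.
    # Note that this also drops arguments that are genuinely empty.
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def read_cmdline(pid, proc_root="/proc"):
    """Return the argument list of a process, as stored in its cmdline file."""
    with open(f"{proc_root}/{pid}/cmdline", "rb") as f:
        return split_nul(f.read())


if __name__ == "__main__":
    print(split_nul(b"./a.out\0"))
    print(split_nul(b"HOME=/root\0SHELL=/bin/bash\0"))
```

For our process, read_cmdline(2724) would be the call to try; the sample byte strings above are invented only to show the format.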

But now we are going to focus on virtual memory.

Let’s meet maps and mem

Let’s check what maps has inside with the cat program.

$ cat maps
Content of maps file.

Let’s understand the maps file.

  • This image shows the distribution of the virtual memory of the process 2724.
  • Each line is a region of memory. You can see where the stack, the heap and vsyscall start and end.
  • The lower addresses are at the top and the higher addresses are at the bottom.
  • The lowest address is 0x55996bb15000 and the highest is 0xffffffffff601000.
  • For example, suppose we want to know what is happening in the heap. It starts at 0x55996d276000 and ends at 0x55996d297000, which corresponds to a length of 0x21000 bytes, or 132 KiB (kibibytes).
  • It has read and write permissions (rw-p). This is useful for hacking the mem file.

Virtual memory representation

Remember: maps describes the layout of the virtual memory, like an index, and the mem file holds its content.
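To make the arithmetic concrete, here is a small sketch that splits one maps line into its fields and computes the size of the region. The function name parse_maps_line and the dictionary keys are choices made for this example; the sample line is the heap line of process 2724.

```python
def parse_maps_line(line):
    """Parse one line of /proc/<PID>/maps into a small dictionary."""
    # Fields: address range, permissions, offset, device, inode, [name]
    fields = line.split(maxsplit=5)
    start_hex, end_hex = fields[0].split("-")
    start, end = int(start_hex, 16), int(end_hex, 16)
    return {
        "start": start,
        "end": end,
        "size": end - start,
        "perms": fields[1],
        # Anonymous regions have no name column at all.
        "name": fields[5].strip() if len(fields) > 5 else "",
    }


HEAP_LINE = "55996d276000-55996d297000 rw-p 00000000 00:00 0 [heap]"

if __name__ == "__main__":
    heap = parse_maps_line(HEAP_LINE)
    print(heap["name"], heap["perms"], hex(heap["size"]),
          heap["size"] // 1024, "KiB")
```

For the heap line this gives a size of 0x21000 bytes, that is, 132 KiB, matching the numbers above.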

Now that we are on the same page, let’s dive into the mem file with the cat program.

$ sudo cat mem
cat: mem: Input/output error

Error!! (Don’t you love it when the terminal is so dramatic?) If we try to read it, cat shows us an Input/output error. This happens because cat tries to read the whole file from the beginning, and only the regions listed in maps can actually be read. So, we need a program that can show us an isolated section of the file.

Say hi to my little xxd

This is where xxd enters the scene. This program can dump an isolated section of a binary file in hexadecimal format.

Fragment of xxd’s manual.

First, let’s check these options: -s and -l:

  • seek (-s): it starts the dump at the given offset into the file. This is perfect if we only want to see the heap section. The syntax is -s offset.
  • len (-l): it makes the program stop after reading len bytes, or octets (8 bits). The syntax is -l len.

So, we only need a start point and a length. We can find this information in the maps file.
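The seek-then-read idea behind -s and -l can be sketched in a few lines of Python. This is only an illustration, not how xxd itself is implemented: the names read_region, hexdump_line and dump_region are invented here, and the output format is loosely modelled on a hex dump rather than copied from xxd.

```python
def read_region(path, start, length):
    """Read up to `length` bytes of `path`, starting at byte offset `start`."""
    # buffering=0 avoids read-ahead, so nothing outside the requested
    # range is touched (useful for files like /proc/<PID>/mem).
    with open(path, "rb", buffering=0) as f:
        f.seek(start)            # same role as xxd -s
        return f.read(length)    # same role as xxd -l


def hexdump_line(address, chunk):
    """Format one row: address, hex bytes, printable characters."""
    hex_part = " ".join(f"{b:02x}" for b in chunk)
    text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return f"{address:08x}: {hex_part}  {text}"


def dump_region(path, start, length, width=16):
    """Return the formatted rows for one region of the file."""
    data = read_region(path, start, length)
    return [hexdump_line(start + i, data[i:i + width])
            for i in range(0, len(data), width)]
```

Calling dump_region("/proc/2724/mem", start, length) with the heap bounds from maps would be the equivalent experiment, assuming the script runs with enough privileges to open that file (the post uses sudo for the same reason).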

Read the heap

First, let’s check the heap location with cat maps | grep heap:

$ cat maps | grep heap
55996d276000-55996d297000 rw-p 00000000 00:00 0 [heap]

The output shows that the heap starts at 0x5..76000 and ends at 0x5..97000. So the length is end minus start, which the shell can compute as $((-0x5..76000 + 0x5..97000)).

Second, I’m going to pipe the output to the less command for easier viewing:

$ sudo xxd -s 0x55996d276000 -l $((-0x55996d276000+0x55996d297000)) mem | less
Finding the start of the allocated string.

After some scrolling, we’ll find the allocated memory and the output message at address 0x55996d2762a0.

Third, let’s change the world!
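Scrolling works, but the search can also be automated. A minimal sketch, assuming the heap bytes have already been read into memory: find the string inside the region and add the region’s start address to get the absolute address. The function name find_in_region is hypothetical, and the demo builds a fake, zero-filled heap instead of reading a real process.

```python
def find_in_region(region_bytes, needle, region_start):
    """Return the absolute address of the first match, or None."""
    offset = region_bytes.find(needle)
    if offset == -1:
        return None
    # Offset inside the dump + address where the region begins.
    return region_start + offset


if __name__ == "__main__":
    HEAP_START = 0x55996d276000
    # Fake heap for the demo: zeros, with the string placed at offset 0x2a0.
    fake_heap = bytearray(0x21000)
    message = b"Welcome to Holberton"
    fake_heap[0x2a0:0x2a0 + len(message)] = message
    print(hex(find_in_region(bytes(fake_heap), message, HEAP_START)))
```

Only the first match is returned, so if the same text appears more than once in the heap, the other copies need a second search starting after that address.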

Edit with dd

Now, if we want to edit the running memory we need:

  • A program that can skip the forbidden memory.
  • To edit the mem file byte by byte.

This program is dd.

The first line of dd’s manual.

It can convert and copy a file, which is what we want. Now let’s see some flags:

  • Output file (of=): this flag sets the output file. In our case, it’s the mem file. Our syntax is of=mem.
  • Block size (bs=): this flag sets the read/write block size. We set it to 1 so that seek is counted in bytes and the write lands exactly on the allocated memory. Our syntax is bs=1.
  • Conversions (conv=): by default dd truncates the output file, but we don’t want to lose information, so we need the notrunc option. Our syntax is conv=notrunc (for more information check this link).
  • Seek (seek=): it skips N blocks at the start of the output. We want to start writing at the address of the allocated memory. But watch out! It must be a decimal integer, not a hexadecimal value. Our syntax is seek=<allocated-memory-address-int>.

Now that we have all this information, let’s convert the address of Welcome... from hex to int. There are many ways, but in this example we are going to use Python: python3 -c "print(int(<hex-num>))".

$ python3 -c "print(int(0x55996d2762a0))"
94117449654944

Now that we have the integer value for the seek option, let’s apply it to the code.

$ printf "Welcome to Bogota :)" | sudo dd of=mem conv=notrunc seek=94117449654944 bs=1
20+0 records in
20+0 records out
20 bytes copied, 0.000119827 s, 167 kB/s
...
$
Changing virtual memory.

Here in the image, the left window is running the endless Welcome to Holberton program. In the right window, we are replacing the content in mem with Welcome to Bogota :).
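For comparison, the same in-place overwrite can be sketched in Python: open the file for update without truncating it, seek to the offset, and write the new bytes. The function name overwrite_bytes is invented for this example, and the demo below is meant for an ordinary file; pointing it at /proc/<PID>/mem is assumed to need the same privileges as the dd command.

```python
def overwrite_bytes(path, offset, data):
    """Overwrite len(data) bytes of `path` in place, starting at `offset`."""
    # "r+b" opens for reading and writing without truncating the file,
    # which plays the role of conv=notrunc in the dd command.
    with open(path, "r+b", buffering=0) as f:
        f.seek(offset)           # same role as seek= with bs=1
        return f.write(data)     # number of bytes written
```

Both messages in this post are 20 bytes long, so the new text fits exactly over the old one. Writing a longer string would run past the original string and its terminating NUL byte.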

Conclusions

This example showed that:

  • A process is a running instance of a binary file.
  • A process has an associated ID, or PID.
  • The process information is located in the /proc/<PID>/ directory.
  • The /proc directory has information about every process: arguments, environment variables, file descriptors, virtual memory.
  • The /proc/<PID>/maps file shows the virtual memory distribution.
  • The /proc/<PID>/mem file holds the content of the virtual memory for the process.
  • The /proc/<PID>/mem file has sections, and you need an offset to read from it or write to it.
  • And if you want to know more about virtual memory, check my other post ;).

Thanks

Nataly and Melany, for your help, your patience and your advice.

About the Author

Tavo

Hi, I’m Gustavo Mejía.
My journey with Linux started with a Raspberry.
And now I have really fallen in love with this whole world.
If you like my post, follow me on Twitter and LinkedIn.
