Holberton School and The Low-Level Algorithms
The memory and the processes
How processes interact with the virtual memory
Every time you run a program on your PC (bash, vim, photoshop, chrome or matlab), the executable or binary file and its variables must be loaded somewhere. The question is where.
Many people think they always live in RAM. That is not totally correct, because a process works with virtual memory, and part of it can sit on the disk (for example, in swap). It could be something new to you, but take it easy. I’m not crazy. In this entry, I’m going to use an example to illustrate what happens with the memory of a process when you execute a binary file.
The process
Every time you run a program or a binary file, the OS creates an instance of it: a process. Each process has an associated ID, and you can see its value with the program ps. You can use ps -A to see all the processes.
We are going to use this process ID (PID) to identify our program.
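As a rough illustration of where ps -A gets its list, here is a minimal sketch that collects PIDs by scanning a proc-style directory for numeric entries. It assumes a Linux-like /proc layout; the function name list_pids and its proc_root parameter are made up for this example.

```python
import os


def list_pids(proc_root="/proc"):
    """Return the sorted PIDs found as numeric directory names under proc_root."""
    pids = []
    for name in os.listdir(proc_root):
        # Each process appears as a directory whose name is its PID.
        if name.isdecimal() and os.path.isdir(os.path.join(proc_root, name)):
            pids.append(int(name))
    return sorted(pids)


if __name__ == "__main__" and os.path.isdir("/proc"):
    # On Linux, our own PID should be part of the list.
    print(os.getpid() in list_pids())
```

The directory is a parameter only so the function can be tried against any folder, not just the real /proc.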
The binary
To see how the memory works, we first need a simple program: our «hello world».
This binary must have some features:
- It has to allocate some memory (we need something in the heap).
- It has to show the PID and the memory allocated.
- And let’s include a counter and a sleep for better visualization.
#include <string.h> /* strdup */
#include <unistd.h> /* sleep, getpid */
#include <stdio.h> /* printf */

/**
 * main - simple program which shows
 * the PID and allocated memory
 * Return: 0 (never reached, the loop runs forever), 1 if strdup fails
 */
int main(void)
{
	char *msg;
	pid_t pid;
	int counter;

	msg = strdup("Welcome to Holberton"); /* the copy lives in the heap */
	if (msg == NULL)
		return (1);
	pid = getpid();
	counter = 0;
	while (1)
	{
		printf("[%d] %4d: %s\n", (int)pid, counter++, msg);
		sleep(1);
	}
	return (0);
}
OK, now let’s create the file, compile it with gcc and execute it as ./a.out.
The output of the program was:
[68461] 0: Welcome to Holberton
[68461] 1: Welcome to Holberton
[68461] 2: Welcome to Holberton
.
.
[68637] 1: Welcome to Holberton
[68637] 2: Welcome to Holberton
.
.
It printed the PID, a counter and Welcome to Holberton every second. But as you can see, the PID changed between runs: it was 68461, 68637, 68705 and 68830. This is because every time the binary is executed, the OS creates a new process.
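The same effect can be reproduced in a few lines. This is only a sketch: it launches the Python interpreter itself as a stand-in for ./a.out (an assumption made so the example runs anywhere), and spawn_two is a hypothetical helper name.

```python
import subprocess
import sys


def spawn_two():
    """Start the same command twice and return both child PIDs."""
    cmd = [sys.executable, "-c", "pass"]  # stand-in for ./a.out
    first = subprocess.Popen(cmd)
    second = subprocess.Popen(cmd)
    pids = (first.pid, second.pid)
    # Wait for both children so they do not linger as zombies.
    first.wait()
    second.wait()
    return pids


if __name__ == "__main__":
    print(spawn_two())
```

Each launch gets its own process, so the two PIDs differ even though the command is identical.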
/proc
OK, we have our program running, and we have its PID. The next step is to check what the OS has generated for it. This information can be found in the directory /proc/<PID>/.
But what is proc?
The proc file system acts as an interface to internal data structures in the
kernel. It can be used to obtain information about the system and to change
certain kernel parameters at runtime (sysctl)[source].
We are going to work with the process 2724, and this is a screenshot of what the /proc/<PID> directory has inside:
Here, there are some interesting files, links, and directories. Let’s check some of them:
- fd/: This directory contains the file descriptors of the process, including stdin (0), stdout (1) and stderr (2). If you want to check the output or input of your process, this is the place.
- mem: It is the content of the process’s virtual memory. This is what we want to hack.
- maps: It shows the distribution of the virtual memory as human-readable information.
- status: This file contains human-readable information about the process: PID, PPID, name, UID, groups, etc.
- environ: This file has the environment variables of the process.
- cmdline: This file has the arguments of the program.
- cwd/: This is a symbolic link to the current directory of the program.
- root/: This is another symbolic link, but it points to the program’s root directory, normally /.
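Two of these files, cmdline and environ, are not plain text: their entries are separated by NUL bytes. As a small sketch of how they could be decoded (parse_cmdline and parse_environ are names invented for this example, and the code assumes the NUL-separated layout that Linux uses):

```python
import os


def parse_cmdline(raw):
    """Split the bytes of /proc/<PID>/cmdline into a list of arguments."""
    # Arguments are separated by NUL bytes; empty pieces are dropped,
    # so an empty-string argument would be lost by this simple version.
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def parse_environ(raw):
    """Turn the bytes of /proc/<PID>/environ into a dict of NAME -> value."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            name, _, value = entry.partition(b"=")
            env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


if __name__ == "__main__" and os.path.exists("/proc/self/cmdline"):
    with open("/proc/self/cmdline", "rb") as f:
        print(parse_cmdline(f.read()))
```

Here /proc/self is used instead of a fixed PID, so the script inspects its own process.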
But now we are going to focus on virtual memory.
Let’s meet maps and mem
Let’s check what maps has inside with the cat program.
$ cat maps
Let’s understand the maps file.
- This output shows the distribution of the virtual memory of the process 2724.
- Each line is a section of memory. You can see where stack, heap and vsyscall start and end.
- The lower memory is at the top and the higher memory is at the bottom.
- The lowest address is at 0x55996bb15000 and the highest is at 0xffffffffff601000.
- For example, we want to know what is happening in the heap. It starts at 0x55996d276000 and ends at 0x55996d297000, which corresponds to a length of 0x21000 bytes, or 132 KiB (kibibytes).
- It has read and write permissions (rw-p). This is useful for hacking the mem file.
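To double-check the arithmetic on the heap, here is a minimal sketch that parses one line of maps and computes the size of the region. The helper parse_maps_line is hypothetical, and it assumes the usual column order of the file: address range, permissions, offset, device, inode and an optional name.

```python
def parse_maps_line(line):
    """Parse one line of /proc/<PID>/maps into a dict."""
    fields = line.split(None, 5)  # the name (6th field) may be missing
    start_hex, end_hex = fields[0].split("-")
    start, end = int(start_hex, 16), int(end_hex, 16)
    return {
        "start": start,
        "end": end,
        "size": end - start,
        "perms": fields[1],
        "name": fields[5].strip() if len(fields) > 5 else "",
    }


heap = parse_maps_line("55996d276000-55996d297000 rw-p 00000000 00:00 0 [heap]")
print(hex(heap["size"]), heap["size"] // 1024, "KiB")  # 0x21000 132 KiB
```

The addresses in maps carry no 0x prefix, which is why int(..., 16) is used to read them.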
Remember: maps describes the distribution of the virtual memory, like an index, and the mem file has its content.
Now that we are on the same page, let’s dive into the mem file with the program cat.
$ sudo cat mem
cat: mem: Input/output error
Error!! (Don’t you love when the terminal is so dramatic?) If we try to see it, the program shows us an Input/output error. This happens because cat tries to read the whole file from the beginning, and only the parts of the file that are listed in maps can be read. So, we need a program that can show us an isolated section of the file.
Say hi to my little xxd
Here is where xxd enters the scene. This program can dump an isolated section of a binary file in hexadecimal format.
First, let’s check these options, -s and -l:
- seek (-s): it will seek to an offset in the binary. This is perfect if we only want to see the heap section. The syntax is -s offset.
- len (-l): it will make the program stop after reading len bytes or octets (8 bits). The syntax is -l len.
So, we only need a start point and a length. We can find this information in the maps file.
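The idea behind -s and -l is just "seek, then read a fixed number of bytes". The sketch below shows it with a hypothetical dump_slice helper; its output layout is a simplified hex dump and does not try to match the exact format of xxd.

```python
def dump_slice(path, offset, length, width=16):
    """Return hex dump lines for `length` bytes of `path` starting at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)         # same role as xxd -s offset
        data = f.read(length)  # same role as xxd -l len
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset + i:08x}: {hex_part.ljust(width * 3 - 1)}  {text}")
    return lines
```

Pointing it at /proc/<PID>/mem would need the same privileges as the xxd command, and the offset and length would come from the maps file.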
Read the heap
First, let’s check the heap location with cat maps | grep heap:
$ cat maps | grep heap
55996d276000-55996d297000 rw-p 00000000 00:00 0 [heap]
The output shows that the heap starts at 0x5..76000 and ends at 0x5..97000. So the length is $((-0x5..76000 + 0x5..97000)).
Second, I’m going to send the output to the less command for easy visualization:
$ sudo xxd -s 0x55996d276000 -l $((-0x55996d276000+0x55996d297000)) mem | less
After some scrolling, we’ll find the allocated memory and the output message at the address 0x55996d2762a0.
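Scrolling works, but the search can also be automated. A minimal sketch, assuming the region fits in memory and that the string is stored as plain ASCII bytes; find_in_region is a name invented for this example:

```python
def find_in_region(path, start, end, needle):
    """Return the absolute offset of `needle` inside [start, end), or None."""
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read(end - start)
    index = data.find(needle)
    # Translate the position inside the region back to an absolute address.
    return None if index == -1 else start + index
```

With the values of this post, the call would look like find_in_region("/proc/2724/mem", 0x55996d276000, 0x55996d297000, b"Welcome to Holberton"), run with enough privileges to read mem.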
Third, let’s change the world!
Edit with dd
Now, if we want to edit the running memory, we need a program that can:
- Skip the forbidden memory.
- Edit the mem file byte by byte.
This program is dd.
It can convert and copy a file, which is what we want. Now let’s see some flags:
- Output file (of=): this flag sets the output file. In our case, it’s the mem file. Our syntax is of=mem.
- Block size (bs=): this flag sets the read/write size. We have to set it to 1 so that seek is counted in bytes and we don’t write outside of the allocated memory. Our syntax is bs=1.
- Conversions (conv=): by default dd truncates the output, but we don’t want to lose information. So we need to include the flag notrunc. Our syntax is conv=notrunc (for more information, check the Stack Overflow link in the References).
- Seek (seek=): it will skip N blocks of bs bytes at the start of the output. We want to write to the file starting at the address of the allocated memory. But watch out! It must be a decimal integer. Our syntax is seek=<allocated-memory-address-int>.
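To make the role of each flag concrete, here is a rough Python equivalent of that dd invocation. It is a sketch, not a replacement for dd: write_at is a hypothetical helper, and opening the file in "r+b" mode is my stand-in for conv=notrunc.

```python
def write_at(path, offset, payload):
    """Overwrite len(payload) bytes of an existing file, starting at `offset`."""
    # "r+b" opens the file for update without truncating it (like conv=notrunc);
    # buffering=0 sends the bytes straight to the file.
    with open(path, "r+b", buffering=0) as f:
        f.seek(offset)           # like seek=<offset> with bs=1
        return f.write(payload)  # number of bytes written
```

Everything before and after the written bytes stays as it was, which is exactly what we want when the target is the memory of a running process.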
Now that we have all this information, let’s convert the address of Welcome... from hex to int. There are many ways, but in this example we are going to use Python: python3 -c "print(int(<hex-num>))".
$ python3 -c "print(int(0x55996d2762a0))"
94117449654944
Now that we have the integer value for the seek option, let’s apply it to the command.
$ printf "Welcome to Bogota :)" | sudo dd of=mem conv=notrunc seek=94117449654944 bs=1
20+0 records in
20+0 records out
20 bytes copied, 0.000119827 s, 167 kB/s
...
$
Here in the image, the left window is running the eternal program Welcome to Holberton. And in the right window, we are replacing the mem information with Welcome to Bogota :).
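One detail worth noticing: dd reported 20 bytes, and both messages are exactly 20 bytes long, so the NUL terminator that strdup placed after the original string stays untouched. A longer message would overwrite memory past the string, and a shorter one would leave the tail of the old text visible unless it is padded. The sketch below is a check one could run before writing; fit_replacement is a hypothetical helper, not part of the original experiment.

```python
def fit_replacement(original, replacement):
    """Return `replacement` NUL-padded to the length of `original`.

    Raises ValueError if the replacement is longer than the original.
    """
    if len(replacement) > len(original):
        raise ValueError("replacement would overflow the original string")
    return replacement + b"\0" * (len(original) - len(replacement))


old = b"Welcome to Holberton"
new = b"Welcome to Bogota :)"
print(len(old), len(new))  # 20 20
```

Padding with NUL bytes makes printf stop at the end of the shorter message.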
Conclusions
In this example, it was shown that:
- A process is an instance of an executed binary file.
- A process has an associated ID, or PID.
- The process information is located in the /proc directory.
- The /proc directory has information about every process: arguments, environment variables, file descriptors, virtual memory.
- The /proc/<PID>/maps file shows the virtual memory distribution.
- The /proc/<PID>/mem file shows the content of the virtual memory for the process.
- The /proc/<PID>/mem file has sections and it needs an offset for writing or reading.
- And if you want to know about virtual memory, check my other post ;).
References
Articles
- https://stackoverflow.com/questions/20526198/why-using-conv-notrunc-when-cloning-a-disk-with-dd/20531600#20531600
- https://www.kernel.org/doc/Documentation/filesystems/proc.txt
- https://blog.holbertonschool.com/hack-the-virtual-memory-c-strings-proc/
Thanks
Nataly and Melany, for your help, your patience and your advice.
About the Author
Hi, I’m Gustavo Mejía.
My idea with Linux started with a Raspberry.
And now I have really fallen in love with all this world.
If you like my post, follow me on Twitter and Linkedin.