Buffer Overflows, Shellcode, and Memory Corruption

Published in CodeX, Aug 8, 2022 (8 min read)

Contributors: Jake Mellichamp, Steven Griffin, Wes Bailey

I. Introduction

Buffer overflows and memory corruption exploits are among the earliest computer security issues, and they have consumed countless hours of effort from both attackers and defenders.

The first description of a buffer overflow attack is recorded in the 142-page 1972 USAF publication ‘Computer Security Technology Planning Study’.[1] In section 3.2, titled ‘The Malicious User Threat’, the document asserts: “The fact that operating systems were not designed to be secure provides a malicious user with any number of opportunities to subvert the operating system itself”.

The report goes on to discuss a vulnerable program that handled pointers, noting that: “By supplying addresses outside the space allocated to the users programs, it is often possible to get the monitor to obtain unauthorized data for that user, or at the very least… [cause] a system crash.”

A more succinct and canonical description of a buffer overflow attack would be difficult to find. Clearly, the focus in the early days was on making systems work, not making systems secure. That would change.

While most of the early exploits can only be reproduced on a modern OS by explicitly disarming the default countermeasures, it’s still instructive to learn something about the history of these attacks. Nothing in cyber security has been created in a vacuum, and features of modern systems that seem arbitrary are often rooted directly in something from the past. Understanding the history sheds light on the whats and whys of current platforms and their vulnerabilities.

II. Historical Perspective

In the fall of 1988, a graduate student at Cornell University named Robert Tappan Morris wrote a bit of malicious code that became notorious as the “Morris Worm”. [2] It exploited several weaknesses in Unix architecture including a buffer overflow in the ‘fingerd’ network service. The worm spread rapidly through machines connected to the nascent internet and caused serious disruptions.

Morris insisted that he was merely “demonstrating the inadequacies of current security measures on computer networks by exploiting the security defects he had discovered.” His justification fell on deaf ears, and Morris became the first person to be indicted under the “Computer Fraud and Abuse Act”. He was later convicted and sentenced to three years of probation and a $10,050 fine. He rebounded rather well, though, later completing his Ph.D. at Harvard and co-founding several tech companies, including the highly successful incubator Y Combinator. His current estimated net worth is $4.9 billion.

History of the BufferOverflow visualized

A lot of work has been done to address this type of vulnerability over the last three decades. In 1997, StackGuard was announced, implementing the concept of a stack canary. The same year, a hacker going by the name ‘Solar Designer’ demonstrated the return-to-libc attack, which bypasses non-executable stack countermeasures. In the wake of this discovery, heap overflows, pointer overwrites, format string attacks, and a host of other fiendishly creative exploits were developed.[3] Meanwhile, defensive measures were continually bolstered in such forms as ASLR, StackGhost, PaX, and more. No matter what one side does, the other always seems to be just ahead or close on its heels; the cat and mouse game never ends.[4]

III. Stack Buffer Overflow Concepts

With that said, in our experiments we play the cat (the attacker) chasing the mouse (modern stack protections). First, some context on what a stack buffer overflow is. Stack buffer overflows (SBOF) have long been a fundamental way of exploiting ELF binaries. A successful attacker can overwrite local variable data or execute malicious code. Such bugs usually arise when a programmer copies input into a fixed-size stack buffer without checking its length, often because of unfamiliarity with how stack frames work.

Turning source code into an executable binary is a complex process. The source code must be preprocessed, compiled, assembled, and linked against library functions (by a linker) before it is finally loaded into memory (by a loader). When this process is complete, the program’s text, data, heap, and stack segments reside in memory and are used during execution.

The program layout will be handy in understanding the nature of SBOF attacks, but the real vulnerability lies inside the stack frame.

A stack frame is a segment of memory created every time a function is called. It holds the function’s local variables (including any buffers), the caller’s saved base pointer, the return address, and the incoming function parameters. Two registers track the frame: the base pointer (EBP/RBP), which stays fixed at the base of the current frame for the duration of the call, and the stack pointer (ESP/RSP), which moves as values are pushed onto or popped off the stack. The return address is the value loaded into the instruction pointer (EIP/RIP) when the function returns. The vulnerability arises because a local buffer sits at lower addresses than the saved base pointer and the return address, so writing past the end of the buffer overwrites them.

To exploit this situation, the attacker overflows the buffer with a combination of NOP instructions and shellcode. The overflow continues past the end of the buffer until the saved base pointer (EBP) has been overwritten. The final step is to overwrite the saved return address with an address that points back into the buffer, at or before the start of the shellcode. If these conditions can be met, then the binary in question is a liability.
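To make the mechanics concrete, here is a minimal Python sketch that models a stack frame as a flat byte array. It is a toy model rather than a real process: the 16-byte buffer, the helper names, and the two example addresses are all assumptions chosen purely for illustration.

```python
import struct

BUF_SIZE = 16  # toy buffer size, chosen only for illustration


def make_frame(saved_ebp: int, ret_addr: int) -> bytearray:
    """Model one frame as flat bytes: [buffer][saved EBP][return address]."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<II", saved_ebp, ret_addr))


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data to the start of the buffer with no length check, like strcpy."""
    frame[0:len(data)] = data


def read_control_data(frame: bytearray) -> tuple:
    """Return (saved EBP, return address) as stored just past the buffer."""
    return struct.unpack_from("<II", frame, BUF_SIZE)


# Hypothetical saved EBP and return address values.
frame = make_frame(0xBFFFF0C8, 0x08048456)

# Input that fits in the buffer leaves the control data untouched.
unchecked_copy(frame, b"A" * BUF_SIZE)
in_bounds = read_control_data(frame)

# Eight extra bytes spill over the saved EBP and the return address.
unchecked_copy(frame, b"A" * (BUF_SIZE + 8))
overflowed = read_control_data(frame)

print([hex(v) for v in in_bounds])
print([hex(v) for v in overflowed])
```

In the second copy both values become 0x41414141, the byte pattern of four ‘A’ characters, which is the same signature to look for in a debugger when probing a real binary.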

IV. Exploit Demonstration on Linux 32-bit x86 system

In our SBOF exploit, our buffer for input is 500 bytes in size. To start off, we have to discover:

The distance from the start of the buffer to the return address so we can place a malicious return address in the EIP register.

To do this, our team used GDB. We started by randomly selecting a small overflow of the buffer, in this case, 8 bytes over the buffer (Figure 4.2).

A segfault occurs! The command (gdb) info registers shows which registers have been affected. As the image below shows, we successfully overwrote the EBP register but not the EIP. The value loaded into EIP when the function returns is the saved return address, and from Figure 4.1 we know it sits on the stack immediately after the 4-byte saved EBP. So we simply need to add 4 to our input length of 508, making it 512.
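The offset arithmetic can be sketched in a few lines of Python. The constants come from the measurements described above; the function names are hypothetical, and how the bytes reach the program (command-line argument or standard input) depends on the target binary, which is not specified here.

```python
BUFFER_SIZE = 500    # size of the input buffer, as stated in the text
SAVED_EBP_END = 508  # 8 bytes past the buffer clobbered EBP but not EIP
RET_ADDR_SIZE = 4    # one 32-bit return address


def probe(length: int) -> bytes:
    """Build a probe input made of `length` 'A' characters (0x41)."""
    return b"A" * length


def fully_overwrites_return_address(length: int) -> bool:
    """True if an input of this length covers all four return-address bytes."""
    return length >= SAVED_EBP_END + RET_ADDR_SIZE


for length in (BUFFER_SIZE, 508, 512):
    print(length, fully_overwrites_return_address(length))
```

Only the 512-byte probe reports True, matching the observation that 508 bytes reach EBP while four more are needed to reach the return address.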

Successfully Overwriting the Return Address

Launching the program with 512 ‘A’ characters results in the following:

Figure 4.3: Successful Overwrite of the Return Address

Crafting a Malicious Payload

Using GDB, we discovered that an input must extend exactly 12 bytes past the end of the 500-byte buffer to fully overwrite the saved return address (500 + 12 = 512). Knowing this, we can craft a malicious payload. So, where to begin?

Shellcode: Using https://shell-storm.org/ our team found a 43-byte snippet of shellcode that executes the system call execve(/bin/sh). This shellcode launches a shell with the current process’s permissions.
Return Address: We need to overwrite the saved return address so that EIP ends up pointing at our shellcode. The command (gdb) x/200wx $esp-550 displays 200 four-byte words of memory starting 550 bytes below the stack pointer, which lets us inspect the region around our overflowed buffer. We place the shellcode inside the overflowed buffer and choose a return address that points at or just before it. The address doesn’t have to be exact; anywhere in the NOP instructions preceding the shellcode will do.

Figure 4.4 — Possible Return Address Locations

NOP Slide: The payload needs to be 512 bytes long (Figure 4.3). We have the shellcode (43 bytes) and a 40-byte return address block. The block is 40 bytes rather than 4 because we repeat the address, which leaves some slack so that a copy still lands on the saved return address even if things shift slightly in memory. After some quick math (512 - 40 - 43 = 429), we know that we need to fill the remaining 429 bytes of the buffer with NOP instructions.
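The three pieces above can be assembled with a short Python sketch. This is an illustration under stated assumptions, not the script used in the experiment: the return address is a hypothetical value, and the shellcode is replaced by 43 placeholder breakpoint bytes (0xCC) so that only the layout is shown.

```python
import struct

PAYLOAD_LEN = 512    # total length that reaches the return address
SHELLCODE_LEN = 43   # size of the shellcode described in the text
RET_BLOCK_LEN = 40   # ten copies of a 4-byte address
NOP = b"\x90"        # x86 NOP opcode

# Placeholder only: breakpoint bytes stand in for the real shellcode.
placeholder_shellcode = b"\xcc" * SHELLCODE_LEN

# Hypothetical address somewhere inside the NOP region of the buffer.
hypothetical_ret_addr = 0xBFFFF1A0


def build_payload(shellcode: bytes, ret_addr: int,
                  total: int = PAYLOAD_LEN,
                  ret_block: int = RET_BLOCK_LEN) -> bytes:
    """Lay out [NOPs][shellcode][repeated little-endian return address]."""
    if ret_block % 4 != 0:
        raise ValueError("return address block must be a multiple of 4 bytes")
    nop_len = total - ret_block - len(shellcode)
    if nop_len < 0:
        raise ValueError("shellcode and return block do not fit in the payload")
    rets = struct.pack("<I", ret_addr) * (ret_block // 4)
    return NOP * nop_len + shellcode + rets


payload = build_payload(placeholder_shellcode, hypothetical_ret_addr)
print(len(payload), payload.count(NOP))
```

Because the NOPs and shellcode together occupy 472 bytes, a multiple of 4, the repeated addresses stay 4-byte aligned and the final copy falls on bytes 508 to 511, where the saved return address was found.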

Executing the Code

Once the Python script has been created, we run the ELF binary as any typical user would, except that our input is a combination of NOPs, shellcode, and return addresses. After execution, you should have a shell open and ready to use!

V. Advanced Exploits

While this is an enlightening and instructive exercise, this attack is not possible without at least partially disarming countermeasures in the operating system and/or the compiler. A non-executable stack (the NX bit, known as DEP on Windows) is a hardware-backed protection that prevents injected code on the stack from executing. StackGuard places a canary between the local buffers and the return address, and the program aborts if this simple overflow overwrites it. Address space layout randomization (ASLR) can be defeated by brute force on a 32-bit machine because of the comparatively small address space. On 64-bit machines, brute-forcing ASLR is impractical, though not strictly impossible.
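A rough back-of-the-envelope model shows why address space size matters for brute force. The sketch below assumes the layout is re-randomized uniformly on every attempt and that each guess is independent; the entropy values of 16 and 28 bits are illustrative assumptions, not measurements of any particular kernel.

```python
def expected_attempts(entropy_bits: int) -> int:
    """Mean number of guesses when each one succeeds with probability 2**-bits."""
    return 2 ** entropy_bits


def success_probability(entropy_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` independent guesses is correct."""
    p = 1.0 / (2 ** entropy_bits)
    return 1.0 - (1.0 - p) ** attempts


# Illustrative entropy values only.
for bits in (16, 28):
    tries = expected_attempts(bits)
    chance = success_probability(bits, tries)
    print(f"{bits} bits: about {tries} expected attempts, "
          f"{chance:.0%} chance of success within that many")
```

Each additional bit of randomness doubles the expected work, which is why the larger address space of a 64-bit system pushes brute force from minutes of crashing and retrying toward impracticality.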

VI. Conclusion

Stack overflow exploits have been around for a very long time, and they remain a present and viable threat vector. Despite over three decades of operating system and compiler engineering, these vulnerabilities have not been completely mitigated. This raises the question: can buffer overflows ever be fully ‘engineered out’ of software?

The Open Web Application Security Project (OWASP), a non-profit organization working to improve software security, provides educational content relevant to the topic. Its online whitepaper on Buffer Overflow states that:

“part of the problem is due to the wide variety of ways buffer overflows can occur, and part is due to the error-prone techniques often used to prevent them.”

The paper also notes that some languages, such as C, C++, Fortran, and Assembly, are more susceptible to overflows than others. Interpreted languages are much safer, if not altogether immune to the technique. All operating systems are vulnerable to some extent.[5]

It seems that the best defense is a well planned and vigilant ‘defense-in-depth’, addressing software design and build best practices at all phases and maintaining up-to-date OS patches needed to address current vulnerabilities as they are identified.

Final Note: This article would not have been possible without the incredible collaboration of my colleagues: Steven Griffin and Wes Bailey.

Stay tuned… we will be attempting to conduct a Modern Stack Overflow exploit without safeguards shortly.

VII. References

[1] James P. Anderson. 1972. Computer Security Technology Planning Study. Deputy for Command and Management Systems HQ Electronic Systems Division (AFSC). https://csrc.nist.gov/csrc/media/publications/conference-paper/1998/10/08/proceedings-of-the-21st-nissc-1998/documents/early-cs-papers/ande72a.pdf

[2] Wikipedia contributors. Robert Tappan Morris. Wikipedia, The Free Encyclopedia. September 21, 2020, https://en.wikipedia.org/w/index.php?title=Robert_Tappan_Morris&oldid=979630920. Accessed October 21, 2020.

[3] Aleph One. Smashing the Stack for Fun and Profit. Underground.org. v7, Issue 49. Nov. 1996. https://seclists.org/bugtraq/1996/Nov/17

[4] Haroon Meer. Memory Corruption Attacks The (almost) Complete History. BlackHat 2010. https://thinkst.com/resources/papers/BlackHat-USA-2010-Meer-History-of-Memory-Corruption-Attacks-wp.pdf

[5] Buffer Overflow. OWASP website. Apr 2020. https://owasp.org/www-community/vulnerabilities/Buffer_Overflow#