A buffer is a temporary area for data storage. When a program or system process places more data in a buffer than it was allocated to hold, the extra data overflows. Some of that data spills into adjacent buffers, where it can corrupt or overwrite whatever they were holding.
In a buffer-overflow attack, the extra data sometimes holds specific instructions for actions intended by a hacker or malicious user; for example, the data could trigger a response that damages files, changes data or unveils private information.
An attacker can use a buffer-overflow exploit to take advantage of a program that is waiting for user input. There are two types of buffer overflows: stack-based and heap-based. Heap-based overflows, which are harder to execute and the less common of the two, attack an application by flooding the memory space reserved for a program's dynamically allocated data. Stack-based buffer overflows, which are more common among attackers, exploit applications and programs by using what is known as the stack: memory space that holds local variables (including buffers for user input) and function return addresses.
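The spill into a neighboring buffer can be imitated safely. The following is a minimal sketch, not an exploit: it treats one 16-byte array as two adjacent 8-byte regions, so the unchecked copy stays inside a single object and the behavior remains well-defined. The names arena, unchecked_copy and second_region are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define REGION_SIZE 8

/* One 16-byte arena standing in for two adjacent 8-byte buffers.
 * Hypothetical layout, used only to illustrate the idea. */
struct arena {
    unsigned char bytes[2 * REGION_SIZE];
};

/* Copy len bytes into the first region WITHOUT checking len against
 * REGION_SIZE. Anything beyond 8 bytes lands in the second region.
 * The assert only keeps the copy inside the arena as a whole. */
void unchecked_copy(struct arena *a, const char *input, size_t len)
{
    assert(len <= sizeof a->bytes);
    memcpy(a->bytes, input, len);
}

/* Return a pointer to the start of the second (neighboring) region. */
const unsigned char *second_region(const struct arena *a)
{
    return a->bytes + REGION_SIZE;
}
```

If the second region initially holds the string SECRET and twelve 'A' characters are copied into the first region, the first four bytes of the neighbor are overwritten and it reads AAAAET afterwards.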
Let us study some real program examples in C that show the danger of such situations.
In the examples, we do not inject any malicious code; we only show that the buffer can be overflowed. Modern compilers normally provide overflow-checking options at compile/link time, but at run time the problem is quite difficult to detect without an extra protection mechanism such as exception handling.
// A C program to demonstrate buffer overflow
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    // Reserve 5 bytes of buffer, including the terminating null.
    // The system should allocate 8 bytes = 2 double words,
    // so to overflow, we need more than 8 bytes...
    // If the user inputs more than 8 characters, there will
    // be an access violation (segmentation fault).
    char buffer[5];

    // Print a usage prompt if no argument was given
    if (argc < 2) {
        printf("strcpy() NOT executed....\n");
        printf("Syntax: %s <characters>\n", argv[0]);
        exit(0);
    }

    // Copy the user input to buffer without any bounds
    // checking; a bounds-checked version is strcpy_s()
    strcpy(buffer, argv[1]);
    printf("buffer content= %s\n", buffer);

    // you may want to try strcpy_s()
    printf("strcpy() executed...\n");

    return 0;
}
Compile this program on Linux and run it with the command ./output_file INPUT, where output_file is the name of the compiled binary.
Input: 12345678 (8 bytes). The program runs smoothly.
Input : 123456789 (9 bytes)
"Segmentation fault" message will be displayed and the program terminates.
The vulnerability exists because the buffer can be overflowed if the user input (argv[1]) is bigger than 8 bytes. Why 8 bytes? On a 32-bit system, memory is filled in double words (32 bits, or 4 bytes). A char is 1 byte, so if we request a 5-byte buffer, the system allocates 2 double words (8 bytes). That is why the overflow becomes visible when you input more than 8 bytes. Strictly speaking, any write past the 5 requested bytes is already undefined behavior in C; the 8-byte threshold is only where this particular build starts to crash.
Similar standard functions that are technically less vulnerable, such as strncpy(), strncat(), and memcpy(), do exist. The problem with these functions is that it is the programmer's responsibility, not the compiler's, to supply the correct buffer size.
Every C/C++ programmer must understand the buffer overflow problem before writing code. A large share of the bugs that slip into such code can be exploited as a result of buffer overflow.
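To make the point about programmer-supplied sizes concrete, here is a minimal sketch of a copy routine that takes the destination size explicitly, truncates when the input is too long, and always null-terminates. The helper name bounded_copy is hypothetical and not part of the standard library; it is one possible way to wrap memcpy(), not the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst without ever writing more than dst_size bytes.
 * The result is always null-terminated.
 * Returns 1 if src fit completely, 0 if it had to be truncated.
 * bounded_copy is a hypothetical helper, not a standard function. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t len = strlen(src);
    if (len >= dst_size) {
        /* Too long: keep only what fits and terminate explicitly. */
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';
        return 0;
    }

    /* Fits: copy the characters together with the terminating '\0'. */
    memcpy(dst, src, len + 1);
    return 1;
}
```

Called as bounded_copy(buffer, sizeof buffer, argv[1]) with the 5-byte buffer from the example above, an input of 123456789 would be cut down to 1234 instead of running past the end. The caller still has to pass the right size, which is exactly the responsibility described above.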
A format string is an ASCII string that contains text and format parameters.
Example:
// A statement with format string
printf("my name is : %s\n", "Akash");

// Output
// my name is : Akash
There are several format strings that specify output in C and many other programming languages but our focus is on C.
Format string vulnerabilities are a class of bug that takes advantage of an easily avoidable programmer error. If the programmer passes an attacker-controlled buffer as the format argument to printf (or any of the related functions, including sprintf, fprintf, etc.), the attacker can perform writes to arbitrary memory addresses. The following program contains such an error:
// A simple C program with format
// string vulnerability
#include <stdio.h>
#include <string.h>

int main(int argc, char** argv)
{
    char buffer[100];
    strncpy(buffer, argv[1], 100);

    // We are passing command line
    // argument to printf
    printf(buffer);

    return 0;
}
Since printf takes a variable number of arguments, it must use the format string to determine how many there are. In the case above, the attacker can pass the string "%p %p %p %p %p %p %p %p %p %p %p %p %p %p %p" and fool printf into thinking it has 15 arguments. It will naively print the next 15 values on the stack, thinking they are its arguments:
$ ./a.out "%p %p %p %p %p %p %p %p %p %p %p %p %p %p %p"
0xffffdddd 0x64 0xf7ec1289 0xffffdbdf 0xffffdbde (nil) 0xffffdcc4 0xffffdc64 (nil) 0x25207025 0x70252070 0x20702520 0x25207025 0x70252070 0x20702520
At about 10 arguments up the stack, we can see a repeating pattern of 0x252070 – those are our %ps on the stack! We start our string with AAAA to see this more explicitly:
$ ./a.out "AAAA%p %p %p %p %p %p %p %p %p %p"
AAAA0xffffdde8 0x64 0xf7ec1289 0xffffdbef 0xffffdbee (nil) 0xffffdcd4 0xffffdc74 (nil) 0x41414141
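To check how the printed words map back to characters, the small sketch below splits a 32-bit value into its bytes, lowest byte first. It assumes the little-endian 32-bit target that the output above suggests; decode_word is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the four bytes of word into out, lowest byte first (the order
 * they occupy in memory on a little-endian machine), followed by '\0'.
 * out must have room for at least 5 bytes.
 * decode_word is a hypothetical helper for this illustration. */
void decode_word(uint32_t word, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        out[i] = (char)((word >> (8 * i)) & 0xFF);
    }
    out[4] = '\0';
}
```

Under that assumption, 0x25207025 decodes to the four characters "%p %", 0x70252070 to "p %p", and 0x41414141 to "AAAA", which matches the strings passed on the command line.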
The 0x41414141 is the hex representation of AAAA. We now have a way to pass an arbitrary value (in this case, we're passing 0x41414141) as an argument to printf. At this point we will take advantage of another format string feature: in a format specifier, we can also select a specific argument. For example, printf("%2$x", 1, 2, 3) will print 2. In general, we can do printf("%N$x"), where N is the argument number, to select an arbitrary argument to printf. In our case, we see that 0x41414141 is the 10th argument to printf, so we can simplify our string:
$ ./a.out 'AAAA%10$p'
AAAA0x41414141
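The argument-selection feature used above can be tried in isolation. The sketch below is a harmless demonstration with a fixed format string, not an attack. It assumes a C library that supports the POSIX "%N$" positional syntax (glibc does; this is an extension beyond ISO C), and format_reordered is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Format three integers in the order second, first, third by using
 * positional specifiers. "%N$x" selects the N-th argument and prints
 * it in hexadecimal. Positional specifiers are a POSIX extension, so
 * this assumes a libc such as glibc that implements them.
 * Returns the value of snprintf (length of the formatted text). */
int format_reordered(char *out, size_t out_size, int a, int b, int c)
{
    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size, "%2$x %1$x %3$x", a, b, c);
}
```

Calling format_reordered(out, sizeof out, 1, 2, 3) on such a system fills out with the text 2 1 3. The format string here is a constant chosen by the programmer; the danger in the vulnerable program comes from letting the user supply it.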
Preventing Format String Vulnerabilities