Princeton University
COS 217: Introduction to Programming Systems
Assignment 6: Buffer Overrun
Purpose
The purpose of this assignment is to help you learn
(1) how programs are represented in machine code,
(2) how stack frames are laid out in memory, and
(3) how programs can be vulnerable to buffer-overrun attacks.
Background
We will provide you with a program, as both source code (hello.c) and executable
binary code (hello). The file hello was produced from hello.c
using the gcc command with the -O option.
The program
asks for your name and prints something like this
(where user input and program output
are indicated by font style):
$ hello
What is your name?
Bob
Thank you, Bob.
I recommend that you get a grade of D on this assignment.
However, the author of the program has inexplicably forgotten
to do bounds-checking on the array into which it reads the input,
and therefore it is vulnerable to attack.
Your Task
Your task is to attack the given program by exploiting
its buffer overrun vulnerability. More specifically, your job is to
provide input "data" to the program so that it
prints something more like this:
$ hello < data
What is your name?
Thank you, Bob.
I recommend that you get a grade of A on this assignment.
As you can see from reading the program, it is not designed
to give anyone an A under any circumstances. However, it
is programmed sloppily: it reads the input into a buffer,
but forgets to check whether the input fits. This means that
a too-long input can overwrite other important memory, and
you can trick the program into giving you an A.
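The missing check can be illustrated with a short sketch. The function below is hypothetical: readBounded and its parameters are assumptions for illustration and are not taken from hello.c. It reads characters up to a newline but stores at most size - 1 of them, which is the kind of bounds check the vulnerable program omits.

```c
#include <stdio.h>

/* Hypothetical helper, not from hello.c: read one line from fp into
   buf, storing at most size - 1 characters plus a terminating '\0'.
   Extra characters on the line are read and discarded rather than
   written past the end of buf. Returns the number of characters
   stored. Assumes size >= 1. */
size_t readBounded(FILE *fp, char *buf, size_t size)
{
   size_t i = 0;
   int c;

   while ((c = getc(fp)) != EOF && c != '\n')
   {
      if (i < size - 1)   /* the bounds check: never write past buf */
         buf[i++] = (char)c;
   }
   buf[i] = '\0';
   return i;
}
```

Without the test on i, a line longer than the buffer would keep writing into whatever memory follows buf.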
This assignment has several parts.
F. Fill in the blanks.
Copy this sentence to your readme file, and fill in the blanks such that the
sentence is correct:
"If you were to use a buffer overrun attack to knowingly gain
unauthorized access or to cause damage to other people's computers,
the Computer Fraud and Abuse Act provides a maximum penalty
of _______ years in prison for a first offense. However,
the creator of the Melissa virus plea-bargained down to ______
months in prison."
D. Analyze the program.
Take the hello executable binary file that we have provided you, and use gdb to
analyze its sections:
- Analyze the text section by issuing this "x" command:
$ gdb hello
(gdb) x/68i readString
Copy the resulting 68 lines of text into a text file named traces, and
then annotate the code to explain what's going on. You should use the source
code in hello.c as a reference, and indeed your annotation
should just consist of showing how the machine code corresponds to the
C code. You don't need an annotation for every line of machine code.
- Analyze the data section by issuing these "print" commands:
$ gdb hello
(gdb) print &grade
(gdb) print grade
Place a diagram in your traces file showing the layout of the data
section.
- Analyze the bss section by issuing this "print" command:
$ gdb hello
(gdb) print &Name
Place a diagram in your traces file showing the layout of the bss
section.
- Analyze the readString function's stack-frame. It
will be most informative to analyze the stack-frame after the function has read
a name into buf. Issue these commands to do that:
$ gdb hello
(gdb) break *readString+73
(gdb) run
Type a name
(gdb) print $esp
(gdb) print $ebp
(gdb) x/??b $esp (where ?? is the appropriate number of bytes)
Place a diagram of the stack-frame layout in your traces file, indicating
addresses relative to the stack pointer.
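The byte count for the "x" command follows from the two register values. As a rough sketch, assuming the conventional 32-bit x86 frame layout in which the saved frame pointer sits at $ebp and the return address at $ebp + 4, the number of bytes from $esp through the return address is ($ebp - $esp) + 8. The function name and the sample values are illustrative only; they are not actual values from hello, and the layout assumption should be checked against your own disassembly.

```c
/* Illustrative helper: bytes from the stack pointer through the end of
   the return address, assuming a conventional 32-bit x86 frame:
      [esp   .. ebp)    locals and temporaries
      [ebp   .. ebp+4)  saved frame pointer
      [ebp+4 .. ebp+8)  return address */
unsigned long frameBytes(unsigned long esp, unsigned long ebp)
{
   return (ebp - esp) + 8UL;
}
```

For example, with the made-up values esp = 0xbffff000 and ebp = 0xbffff028, frameBytes yields 48, so the command would be x/48b $esp.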
C. Get the program to crash.
Write a C program named createdataC.c that produces a file named dataC, as simple as possible, that
causes the hello program to generate a segmentation fault. Explain its
principles of operation in one sentence as a comment within your
createdataC.c program.
The createdataC.c program should write
to the dataC file; it should not write to stdout.
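One way to write to a file rather than stdout is to open the output file with fopen and emit bytes with putc. The helper below is only a sketch: the name writeFill, the fill character, and the count passed by the caller are assumptions, and the count that actually makes hello crash has to come from your own stack-frame analysis.

```c
#include <stdio.h>

/* Hypothetical helper: create the file named path and write count
   copies of the byte ch to it. Returns 0 on success, -1 on failure. */
int writeFill(const char *path, unsigned long count, int ch)
{
   FILE *fp;
   unsigned long i;

   fp = fopen(path, "w");
   if (fp == NULL)
      return -1;
   for (i = 0; i < count; i++)
   {
      if (putc(ch, fp) == EOF)
      {
         fclose(fp);
         return -1;
      }
   }
   if (fclose(fp) == EOF)
      return -1;
   return 0;
}
```

A createdataC.c built on this sketch would call something like writeFill("dataC", n, 'x') from main, with n chosen by you.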
B. Get the program to print "B".
Write a C program named createdataB.c that produces a file named dataB, as simple as possible, that
causes the hello program to print your name and recommend a grade
of "B". You can see by reading the program that, if your name
is Andrew Appel, this is very easy to do. But probably your
name isn't Andrew Appel.
The createdataB.c program should write to
the dataB file; it should not write to stdout.
Recommended method: overrun the buffer with a return address that jumps to
a place inside of the main function.
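A return address has to appear in the input in the byte order the machine uses. The gdb registers ($esp, $ebp) and the submit path (/u/cos217/bin/i686) suggest a 32-bit x86 target, which is little-endian, so the least significant byte comes first. The sketch below shows one way to emit such a value; putAddress is a hypothetical helper, and the address to pass must come from your own gdb analysis.

```c
#include <stdio.h>

/* Hypothetical helper: write the low 32 bits of addr to fp as four
   bytes, least significant byte first (little-endian order).
   Returns 0 on success, -1 on a write error. */
int putAddress(FILE *fp, unsigned long addr)
{
   int i;

   for (i = 0; i < 4; i++)
   {
      if (putc((int)((addr >> (8 * i)) & 0xffUL), fp) == EOF)
         return -1;
   }
   return 0;
}
```

The bytes are written one at a time with putc because an address may contain a zero byte, which string-oriented output such as fprintf with "%s" would stop at.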
A. Get the program to print "A".
Write a C program named createdataA.c that produces a file named dataA, as simple as possible, that
causes the hello program to print your name and recommend a grade
of "A".
The createdataA.c program should write to the dataA
file; it should not write to stdout.
Recommended method: overrun the buffer with a three-part byte-sequence:
(1) your name, (2) a return address that points into the buffer,
and (3) a short machine-language program that stores an 'A' into the
right place and then jumps somewhere useful.
For parts B and A, if your name is very long, you should use just
the first 15 characters of your name for the purposes of this
assignment.
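The 15-character limit can be applied mechanically. A minimal sketch, in which the helper name truncateName and the constant MAX_NAME_CHARS are hypothetical:

```c
#include <string.h>

enum {MAX_NAME_CHARS = 15};

/* Hypothetical helper: copy at most the first MAX_NAME_CHARS
   characters of name into out, which must have room for
   MAX_NAME_CHARS + 1 bytes, and terminate the result with '\0'. */
void truncateName(const char *name, char *out)
{
   size_t len = strlen(name);

   if (len > MAX_NAME_CHARS)
      len = MAX_NAME_CHARS;
   memcpy(out, name, len);
   out[len] = '\0';
}
```

A name of 15 characters or fewer is copied unchanged; a longer one is cut after its 15th character.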
Implementation Notes:
- On some versions of Linux, every time the program is executed the
initial stack pointer is in a different place. This makes it
difficult to make an attack in which the return address points
into the same data that was just read into the buffer on the
stack. (Indeed, that is the purpose of varying the
initial stack pointer!) However, you'll note that the data
is copied from "buf" into "Name". You'll find that "Name" is
reliably in the same place every time you (or we) run the program.
- On some versions of Linux, executing instructions from the data
section causes a segmentation violation. The purpose of this is to
defend against buffer overrun attacks! The "mprotect" call
in our sample program is to disable this protection. You're not
required to understand or explain how this line works. Note, however,
that this mechanism (even if we didn't disable it) would not defend
against the "B" attack.
- When we grade this assignment, we will take the recommendation
of the hello program into account, but this will not be the
only criterion.
- If you work hard, you could create a data input that will
exploit the buffer overrun to take over the grader's Linux
process and do all sorts of damage. DO NOT DO THIS!
Any deliberate attempt of this sort is a violation
of the University's disciplinary code, and is also a violation
of the Computer Fraud and Abuse Act (see section F above).
Logistics
You may work with one partner on this assignment. You need not work with a
partner, but we prefer that you do. If you work with a
partner, then only one of the partners should submit work. The readme file
should contain your name and your partner's name.
Create your programs on hats using the bash shell, xemacs, gcc, and gdb.
The directory /u/cos217/Assignment6 contains the hello.c and hello
files. It also contains a makefile that you might find helpful during
development.
Create a readme text file that contains:
- Your name.
- A description of whatever help (if any) you received from others while
doing the assignment, and the names of any individuals with whom you
collaborated, as prescribed by the course "Policies" web page.
- (Optionally) An indication of how much time you spent doing the
assignment.
- (Optionally) Your assessment of the assignment: Did it help you to learn?
What did it help you to learn? Do you have any suggestions for improvement?
Etc.
- (Optionally) Any information that will help us to grade your work in the
most favorable light. In particular you should describe all known bugs.
Submit your work electronically on hats via the commands:
/u/cos217/bin/i686/submit 6 createdataC.c createdataB.c createdataA.c
/u/cos217/bin/i686/submit 6 traces readme
Grading
We will grade your work on correctness and design. We will consider
understandability to be an important aspect of good design. To encourage good
coding practices, we will compile using "gcc -Wall -ansi -pedantic" and take off
points based on warning messages during compilation.
This assignment was written by Andrew Appel,
with contributions by Robert M. Dondero, Jr.