Princeton University
COS 217: Introduction to Programming Systems

Assignment 7: An Execution Profiler

Purpose

The purpose of this assignment is to help you gain a richer understanding of IA-32 assembly language, learn about UNIX signal handling, and learn about the mechanisms that underlie execution profiling tools such as gprof. The assignment will also give you experience writing ADT client code. Finally, the assignment will give you the opportunity to learn more about the GNU/UNIX programming tools, especially xemacs, gdb, and gprof.

Background

Consider this sequence of commands:

(1) gcc -Wall -ansi -pedantic -pg -S someprogram.c
(2) gcc -c someprogram.s
(3) gcc -pg -o someprogram someprogram.o

In the first command, the -pg option informs the gcc compiler to instrument the assembly language file that it produces (someprogram.s), that is, to insert assembly language code that calls functions which collect execution profiling statistics. In the second command, the -c option informs the assembler to produce someprogram.s, but not to link the program to form an executable file. In the third command, the -pg option informs the gcc linker to merge the definitions of those statistics-collecting functions with those in someprogram.o, and so into the resulting executable file (someprogram). Subsequently, the execution of someprogram calls the statistics-collecting functions, and thereby produces the execution profiling statistics in a file named gmon.out.

Thus gcc's execution profiling mechanism consists of two components:

Your Task

Your task in this assignment is to develop those two components (in C), thereby creating a simple execution profiler. You may assume that the program to be profiled (someprogram.c in the above example) resides in one source code file; your mechanism need not handle multi-file C programs.

The first component should be a program named instrument. That is, you should create C source code in a file named instrument.c, from which you can use gcc to create an executable binary file named instrument. The second component should be an object file named count.o. That is, you should create C source code in a file named count.c, from which you can use gcc to create an object file named count.o.

The instrument and count.o files should be such that you can use them in this way to profile a C program whose source code is in file someprogram.c:

(1) gcc -Wall -ansi -pedantic -S someprogram.c
(2) instrument someprogram.s
(3) gcc -c someprogram.s
(4) gcc -o someprogram someprogram.o count.o symtablehash.o (and dynarray.o if necessary)
(5) someprogram
(6) cat stats

The resulting "stats" file should contain text that describes the execution profile of someprogram. See below for details.

Note that the assignment involves using your SymTable ADT from Assignment 3. If your SymTable ADT from Assignment 3 is not working properly, then we will provide you with one upon request. Also note that the assignment may involve using the DynArray ADT from precepts. Again, see below for details.

The instrument Program

The instrument program should accept a single command-line argument: someprogram.s. It should instrument someprogram.s by inserting assembly language code at appropriate places.

Specifically, your program should instrument each function in someprogram.s in this manner. The code that your program should insert is shown in boldface.

   .type somefunction,@function 
somefunction: 

#------------------------- 
   pushl $__nameofsomefunction
   call __count 
   addl $4, %esp 
#------------------------- 

   pushl %ebp 
   movl %esp, %ebp
   ...
   ret 

#------------------------- 
__endofsomefunction: 
#------------------------- 

Then, your program should insert this code at the end of someprogram.s:

#------------------------- 
   .section ".rodata" 
__nameofsomefunction: 
   .asciz "somefunction" 
(Repeat the previous two lines for each function name)
#------------------------- 

#------------------------- 
   .section ".rodata" 
   .globl __table 
__table: 
   .long __nameofsomefunction    # function name 
   .long somefunction            # function start address 
   .long __endofsomefunction     # function end address 
   (Repeat the previous three lines for each function)
   .long 0                       # to mark the end of the table
#------------------------- 

Allowable assumptions:

Suggestions:

There should be no memory leaks in your instrument program.

The __count() Function

The count.o file should contain the definition of the __count() function and associated functions. The __count() function should have one formal parameter: the name of the function that was just called.

The first time (and only the first time) the __count() function is called, it should:

Each time the __count() function is called, it should examine its formal parameter (a function name) to determine which function has just been called, and should increment that function's "call count" in its data structures.

Requirements:

Suggestion:

There should be no memory leaks in your __count() function or in its subordinates.

Testing

You should test your profiler on several C programs. In each case, you should compare the output of your profiler with that of gprof, making sure that the results are reasonably consistent. You need not submit the C programs that you used as test cases. However you should thoroughly describe your testing strategy in your readme file.

Your description of your testing strategy in your readme file should be structured using these major headings:

Your description should be addressed to another programmer, and should be thorough enough that the programmer would know how to test your code.

Suggestions:

#!/bin/bash
#---------------------------------------------------------------------
# Build dynarray.o symtablehash.o, instrument, and count.o.
#---------------------------------------------------------------------

set -o verbose

gcc -Wall -ansi -pedantic -g -c dynarray.c
gcc -Wall -ansi -pedantic -g -c symtablehash.c
gcc -Wall -ansi -pedantic -g -o instrument instrument.c dynarray.o
gcc -Wall -ansi -pedantic -g -c count.c
#!/bin/bash
#---------------------------------------------------------------------
# Given the name of a C source code file without the filename
# extension (e.g. testsort), build an instrumented executable program
# having that name.
#---------------------------------------------------------------------

set -o verbose

gcc -S $1.c
instrument $1.s
gcc -Wall -ansi -pedantic -g -o $1 $1.s count.o symtablehash.o dynarray.o

Logistics

Develop on hats, using xemacs to create source code, gdb to debug, and gprof to check the behavior of your execution profiler.

Create a "readme" text file that contains:

Submit your work electronically on hats via the command:

/u/cos217/bin/i686/submit 7 instrument.c count.c symtable.h symtablehash.c readme

You should also submit any other source code files that comprise your profiler. Specifically, you should submit dynarray.h and dynarray.c if you use the DynArray ADT.

Grading

As always, we will grade your work on correctness and design, and will consider understandability to be an important aspect of good design. Please see the statements of previous assignments for some guidelines concerning understandability. To encourage good coding practices, we will compile using "gcc -Wall -ansi -pedantic" and take off points based on warning messages during compilation.

Extra Credit

For extra credit (up to 10 percent), design your execution profiler so it can instrument a C program that consists of multiple source code files. Doing so is challenging, especially with respect to handling static functions. If you choose to attempt the extra credit, you probably should make an appointment with your preceptor to discuss the matter.