The purpose of this assignment is to help you understand how heap memory management works in C. It also will give you more opportunity to use the GNU/Linux programming tools, especially bash
, emacs
, gcc217
, gdb
, and make
.
Students from past semesters reported taking, on average, 14 hours (each) to complete this assignment.
You may work with one teammate on this assignment. You need not work with a teammate, but we prefer that you do. Your preceptor will help you with your teammate search, if you so desire.
Your teammate must be from your precept. Sorry, no exceptions. So make sure you find a teammate as soon as you can. After all, if your precept has an odd number of students, then someone must work alone; don't let that person be you!
The "challenge" part for this assignment is detailed below. While doing the challenge part of the assignment, you are bound to observe the course policies regarding assignment conduct as given in the course Policies web page, plus one additional policy: you may not use any "human" sources of information. That is, you may not consult with the course's staff members, the lab teaching assistants, other current students via Piazza, or any other people while working on the challenge part of an assignment, except for clarification of requirements. However you may consult with your teammate, with no limitations.
The challenge part is worth 6 percent of the assignment. So if you don't do any of the challenge part and all other parts of your assignment solution are perfect and submitted on time, then your grade for the assignment will be 94 percent.
A standard C programming environment contains four functions that allow dynamic memory management: malloc
, free
, calloc
, and realloc
. Those functions are used heavily in many C programs.
The COS 217 lectures describes multiple implementations of malloc
and free
, five of which use the heap section of memory:
heapmgr1 is a minimal implementation. Its algorithms and data structures are described in lectures, and its code is provided in lectures.
heapmgr2 is a pad implementation. Its algorithms and data structures are described in lectures and its code is provided in lectures.
heapmgr3 is a linked list implementation. Its algorithms and data structures described briefly in lectures and thoroughly in precepts, and its code is provided in precepts.
heapmgr4 is a doubly-linked list implementation. Its algorithms and data structures are described briefly in lectures and thoroughly in precepts.
heapmgr5 is a bins implementation. Its algorithms and data structures are described briefly in lectures and thoroughly in precepts.
Some background reading:
Section 8.7 of the book The C Programming Language (Kernighan and Ritchie) shows code for a linked list implementation. That book section is available through Blackboard in the Course Materials section. The key data structure in that implementation is a circular singly-linked list; each free memory chunk is stored in that list. Each memory chunk contains a header which specifies its size and, if free, the address of the next chunk in the list. Although elegant in its simplicity, that implementation can be inefficient.
The web page http://gee.cs.oswego.edu/dl/html/malloc.html (Doug Lea) describes a bins implementation. (Here is a copy of that web page, just in case the original is unavailable.) The key data structure is an array of non-circular doubly-linked lists, that is, an array of bins. Each bin contains all free chunks of a prescribed size. The use of multiple bins instead of a single linked list allows malloc
to be more efficient. Moreover, each memory chunk contains both a header and a footer. The header contains three fields: the size of the chunk, an indication of whether the chunk is free, and, if free, a pointer to the next free chunk in its bin. The footer contains two fields: the size of the chunk, and, if free, a pointer to the previous free chunk in its bin. That chunk structure allows free
to be more efficient.
You are given heapmgr.h
, the interface of a HeapMgr
(heap manager) module. It declares two functions:
void *HeapMgr_malloc(size_t uSize); void HeapMgr_free(void *pv);
You also are given the heapmgr1.c
, heapmgr2.c
, and heapmgr3.c
implementations of the HeapMgr
module. Your task is to compose heapmgr4.c
, heapmgr5.c
, and some supporting files.
heapmgr4.c
must enhance heapmgr3.c
so the HeapMgr_free
function is efficient. To do that it must use a doubly-linked list and chunks that contain headers and footers (as described above, in lectures, and in precepts).
If designed properly, the HeapMgr_free
function in your heapmgr4.c
implementation will be efficient in all cases. However, the HeapMgr_malloc
function in your heapmgr4.c
implementation will be subject to poor worst-case behavior. Your heapmgr5.c
implementation must enhance your heapmgr4.c
implementation so the worst-case behavior of its HeapMgr_malloc
function is not poor. To do that it must use multiple doubly-linked lists, alias bins (as described above, in lectures, and in precepts).
Your heapmgr4.c
and heapmgr5,c
implementations must not call the standard malloc
, free
, calloc
, or realloc
functions.
Your heapmgr4.c
and heapmgr5.c
implementations must check invariants by defining a thorough Checker_isValid
function, and calling assert(Checker_isValid(...))
at the leading and trailing edges of the HeapMgr_malloc
and HeapMgr_free
functions.
The CourseLab directory /u/cos217/Assignment6
contains files that support the assignment. For reference, we list and briefly describe them here. The following sections of this document describe how to use them.
Makefile
: a minimal Makefile that automates building of programs using the given code and your code.heapmgr.h
: the interface of the HeapMgr
module.heapmgr1.c
: a minimal implementation of the HeapMgr
module.heapmgr2.c
: a pad implementation of the HeapMgr
module.heapmgr3.c
: a linked list implementation of the HeapMgr
module.checker3.h
and checker3.c
: a Checker
module used in the heapmgr3.c
implementation.chunk3.h
and chunk3.c
: a Chunk
module used in the heapmgr3.c
implementation.chunk4.h
and chunk4.c
: a Chunk
module to be used in your heapmgr4.c
implementation.checker4.h
: the interface of a Checker
module to be used in your heapmgr4.c
implementation.heapmgr4bad*.o
: buggy versions of the heapmgr4.c
implementation.heapmgr4good.o
: a correct version of the heapmgr4.c
implementation.chunk5.h
and chunk5.c
: a Chunk
module to be used in your heapmgr5.c
implementation. They're the same as chunk4.h
and chunk4.c
.checker5.h
: the interface of a Checker
module to be used in your heapmgr5.c
implementation.heapmgr5bad*.o
: buggy versions of the heapmgr5.c
implementationheapmgr5good.o
: a correct version of the heapmgr5.c
implementation.testheapmgr.c
: a client module that tests the HeapMgr
module (with any of its implementations), and reports timing and heap usage statistics.testheap
and testheapimp
: bash
shell scripts that automate testing.words.c
: a HeapMgr
and SymTable
client used in the challenge part of this assignment.symtable.h
: a SymTable interface.symtablehash.o
: a SymTable implementation.pride32.txt
: a file containing 32 copies of the Pride and Prejudice novel, to be used as input to the words.c
program.readme
: a template readme
file.testheapmgr
ClientThe testheapmgr.c
client requires three command-line arguments. The first must be any one of seven strings, as shown in the following table, indicating which of seven tests the client should run:
Argument | Test Performed |
LifoFixed |
LIFO with fixed size chunks |
FifoFixed |
FIFO with fixed size chunks |
LifoRandom |
LIFO with random size chunks |
FifoRandom |
FIFO with random size chunks |
RandomFixed |
Random order with fixed size chunks |
RandomRandom |
Random order with random size chunks |
Worst |
Worst case order for a heap manager implemented using a single linked list |
The second command-line argument is the number of calls of HeapMgr_malloc
and HeapMgr_free
that the client should execute. The third command-line argument is the (maximum) size, in bytes, of each memory chunk that the client should allocate and free.
Immediately before termination testheapmgr.c
writes to stdout
an indication of how much CPU time and heap memory it consumed. See the testheapmgr.c
file for more information.
Develop on CourseLab, using emacs
to create source code and gdb
to debug. Perform these steps:
Makefile
The CourseLab /u/cos217/Assignment6
directory contains the given files that are described previously in this document. Create a project directory, and copy all files except pride32.txt
from the /u/cos217/Assignment6
directory to your project directory.
Then study the given Makefile
. It is not a proper one: it does not maintain .o
files to allow for partial builds, but instead builds directly from .c
files (or given .o
files) to executable binary files. Nevertheless, you will find it helpful. Using it will allow you to avoid much typing.
checker4.c
Compose your checker4.c
implementation. Then issue these commands to build programs with buggy heapmgr4
implementations:
gcc217 -g testheapmgr.c heapmgr4bada.o checker4.c chunk4.c -o test4bada gcc217 -g testheapmgr.c heapmgr4badb.o checker4.c chunk4.c -o test4badb gcc217 -g testheapmgr.c heapmgr4badc.o checker4.c chunk4.c -o test4badc gcc217 -g testheapmgr.c heapmgr4badd.o checker4.c chunk4.c -o test4badd gcc217 -g testheapmgr.c heapmgr4bade.o checker4.c chunk4.c -o test4bade gcc217 -g testheapmgr.c heapmgr4badf.o checker4.c chunk4.c -o test4badf gcc217 -g testheapmgr.c heapmgr4badg.o checker4.c chunk4.c -o test4badg
Then run those programs using these commands:
./test4bada RandomRandom 1000 20000 ./test4badb RandomRandom 1000 20000 ./test4badc RandomRandom 1000 20000 ./test4badd RandomRandom 1000 20000 ./test4bade RandomRandom 1000 20000 ./test4badf RandomRandom 1000 20000 ./test4badg RandomRandom 1000 20000
If (and only if) each program writes a reasonable descriptive error message and aborts via an assert, then your checker4.c
implementation is working properly.
It will be challenging to compose a checker4.c
implementation that detects the errors in all of the given heapmgr4bad*.o
files. But your checker4.c
implementation should detect the errors in many of the given heapmgr4bad*.o
files before you proceed to the next step.
heapmgr4.c
Compose your heapmgr4.c
implementation. Then issue this command:
gcc217 -g testheapmgr.c heapmgr4.c checker4.c chunk4.c -o test4d
to build a debug version of a testing program. Run the program with various command-line arguments to make sure your heapmgr4.c
implementation passes all tests defined in your checker4.c
implementation.
Then issue these commands:
gcc217 -D NDEBUG -O testheapmgr.c heapmgr4.c chunk4.c -o test4 gcc217 -D NDEBUG -O testheapmgr.c heapmgr4good.o chunk4.c -o test4good
The-O
(that's uppercase "oh") argument commandsgcc
to optimize the machine language code that it produces. When given the-O
argument,gcc
spends more time compiling your code so, subsequently, the computer spends less time executing your code. The-D NDEBUG
argument commandsgcc
to define theNDEBUG
macro, just as if the preprocessor directive#define NDEBUG
appeared in the specified.c
file(s). Defining theNDEBUG
macro disables the calls of theassert
macro within theHeapMgr
implementations. Doing so also disables code withintestheapmgr.c
that performs (time consuming) checks of memory contents.
Those commands build a release version of a testing program using your heapmgr4.c
implementation and a release version of a testing program using the given heapmgr4good.o
implementation. Run the two programs with various command-line arguments. If your test4
program consumes exactly the same amount of memory as the test4good
program does, and the two sometimes consume approximately the same amount of time, then you can be reasonably sure that your heapmgr4.c
implementation is correct.
checker4.c
and heapmgr4.c
ImplementationsCritique your testheapmgr.c/heapmgr4.c/checker4.c/chunk4.c
program using splint
. Each time splint
generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme
file.
Similarly, critique your checker4.c
and heapmgr4.c
code using critTer
. Each time critTer
generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme
file.
WhencritTer
critiques thetestheapmgr.c
file, it generates warnings about the use of "magic numbers," the length of some loops, and the length of the file. We judged that using magic numbers, defining deeply nested code, and defining long loops intestheapmgr.c
was clearer than the alternative. And splittingtestheapmgr.c
into multiple files would have complicated the build process unnecessarily. So please ignore those warnings.
Similarly, whencritTer
critiques theheapmgr3.c
file, it generates a warning which suggests that theHeapMgr_free
function must validate itspv
parameter through anassert
. In fact theHeapMgr_free
function must not validate itspv
parameter through anassert
, so please ignore that warning.
checker5.c
Compose your checker5.c
implementation. Then issue these commands to build programs with buggy heapmgr5
implementations:
gcc217 -g testheapmgr.c heapmgr5bada.o checker5.c chunk5.c -o test5bada gcc217 -g testheapmgr.c heapmgr5badb.o checker5.c chunk5.c -o test5badb gcc217 -g testheapmgr.c heapmgr5badc.o checker5.c chunk5.c -o test5badc gcc217 -g testheapmgr.c heapmgr5badd.o checker5.c chunk5.c -o test5badd gcc217 -g testheapmgr.c heapmgr5bade.o checker5.c chunk5.c -o test5bade gcc217 -g testheapmgr.c heapmgr5badf.o checker5.c chunk5.c -o test5badf gcc217 -g testheapmgr.c heapmgr5badg.o checker5.c chunk5.c -o test5badg gcc217 -g testheapmgr.c heapmgr5badh.o checker5.c chunk5.c -o test5badh
Then run those programs using these commands:
./test5bada RandomRandom 1000 20000 ./test5badb RandomRandom 1000 20000 ./test5badc RandomRandom 1000 20000 ./test5badd RandomRandom 1000 20000 ./test5bade RandomRandom 1000 20000 ./test5badf RandomRandom 1000 20000 ./test5badg RandomRandom 1000 20000 ./test5badh RandomRandom 1000 20000
If (and only if) each program writes a reasonable descriptive error message and aborts via an assert, then your checker5.c
implementation is working properly.
It will be challenging to compose a checker5.c
implementation that detects the errors in all of the given heapmgr5bad*.o
files. But your checker5.c
implementation should detect the errors in many of the given heapmgr5bad*.o
files before you proceed to the next step.
heapmgr5.c
Compose your heapmgr5.c
implementation. Then issue this command:
gcc217 -g testheapmgr.c heapmgr5.c checker5.c chunk5.c -o test5d
to build a debug version of a testing program. Run the program with various command-line arguments to make sure your heapmgr5.c
implementation passes all tests defined in your checker5.
implementation.
Then issue these commands:
gcc217 -D NDEBUG -O testheapmgr.c heapmgr5.c chunk5.c -o test5 gcc217 -D NDEBUG -O testheapmgr.c heapmgr5good.o chunk5.c -o test5good
Those commands build a release version of a testing program using your heapmgr5.c
implementation and a release version of a testing program using the given heapmgr5good.o
implementation. Run the two programs with various command-line arguments. If your test5
program consumes exactly the same amount of memory as the test5good
program does, and the two sometimes consume approximately the same amount of time, then you can be reasonably sure that your heapmgr5.c
implementation is correct.
checker5.c
and heapmgr5.c
ImplementationsCritique your testheapmgr.c/heapmgr5.c/checker5.c/chunk5.c
program using splint
. Each time splint
generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme
file.
Similarly, critique your checker5.c
and heapmgr5.c
code using critTer
. Each time critTer
generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme
file.
Issue these commands to build programs using the given HeapMgr
implementations:
gcc217 -D NDEBUG -O testheapmgr.c heapmgr1.c -o test1 gcc217 -D NDEBUG -O testheapmgr.c heapmgr2.c -o test2 gcc217 -D NDEBUG -O testheapmgr.c heapmgr3.c chunk3.c -o test3
readme
FileEdit your copy of the given readme
file by answering each question that is expressed therein.
In particular, in your readme
file record the CPU times and heap memory consumed by the test1
, test2
, test3
, test4
, and test5
programs with tests RandomRandom
and Worst
, with call count 100000, and with maximum chunk sizes 2000 and 20000. Note that if the CPU time consumed is more than 5 minutes, testheapmgr
will abort execution. To report the time and memory consumption, paste the output of the testheap
script into your readme
file.
You must paste the output of thetestheap
script into yourreadme
file. An alternative approach would be to (1) generate time and memory consumption statistics by running thetest1
...test5
programs directly from the bash command prompt rather than via thetestheap
script, and (2) manually type the statistics into your readme file using a format of your choosing. But the alternative would be more difficult for you and your grader. So the alternative approach is unacceptable; you will lose points if you use it.
You are given the words.c
file. The words
program reads words from the file whose name is argv[1]
, and then writes to stdout
each distinct word and its occurrence count. Then the program does the same for argv[2]
, argv[3]
, and so forth. Finally the program writes CPU time and heap memory usage statistics to stderr
.
Answer these questions in your readme
file:
Suppose you run the words
program with one command-line argument which is pride32.txt
. Which implementation (heapmgr1.c
, heapmgr2.c
, heapmgr3.c
, heapmgr4.c
, or heapmgr5.c
) is the best? The "best" implementation is the one that offers the best balance of time and space efficiency.
Suppose you run the words
program with fifteen command-line arguments, each of which is pride32.txt
. Which implementation (heapmgr1.c
, heapmgr2.c
, heapmgr3.c
, heapmgr4.c
, or heapmgr5.c
) is the best? Again, the "best" implementation is the one that offers the best balance of time and space efficiency.
You will find that your answer to Question 1 differs from your answer to Question 2. Why is that so? Explain in terms of the salient characteristics of the words
program and the HeapMgr
implementations.
You could develop your answers simply by considering the characteristics of the
words
program and theHeapMgr
implementations. But it would be wise to do some experiments to confirm your answers.To do such experiments, consider using the given
symtable.h
andsymtablehash.o
files. Thesymtable.h
file contains the interface of aSymTable
ADT. Thesymtablehash.o
file contains an implementation of aSymTable
ADT that does not callmalloc
,calloc
,realloc
, orfree
; instead it callsHeapMgr_malloc
andHeapMgr_free
. You may use those files, along with the givenHeapMgr
implementations and yourHeapMgr
implementations, to build somewords
programs. Then you may analyze CPU time and heap memory usage statistics of the programs by running them usingpride32.txt
.
The
pride32.txt
file is large. To avoid the possibility of exhausting your CourseLab disk quota, we recommend that you not copypride32.txt
from the/u/cos217/Assignment6
directory to your working directory. Instead we recommend that you use the file that already exists in the/u/cos217/Assignment6
directory. To do that you'll need to use the absolute file name/u/cos217/Assignment6/pride32.txt
as command-line arguments to yourwords
programs.
Provide the instructors with your feedback on the assignment. To do that, issue this command:
FeedbackCOS217.py 6
and answer the questions that it asks. (When answering the numeric questions, please enter the average of the responses that you and your teammate would provide individually.) That command stores its questions and your answers in a file named feedback
in your working directory.
Submit your work electronically on CourseLab. If you worked with a teammate, then one of the teammates must submit all of your team's files, and the other teammate must submit none of your team's files. Your readme
file and your source code files must contain both your name and your teammate's name. Use these commands to submit:
submit 6 checker4.c heapmgr4.c submit 6 checker5.c heapmgr5.c submit 6 readme feedback
In part, good program style is defined by the splint
and critTer
tools, and by the rules given in The Practice of Programming (Kernighan and Pike) as summarized by the Rules of Programming Style document.
The more course-specific style rules listed in the previous assignment specifications also apply, as does this one: your heapmgr4.c
and heapmgr5.c
implementations must have good function-level modularity.
We will grade your work on two kinds of quality:
Quality from the user's point of view. From the user's point of view, your code has quality if it behaves as it must. The correct behavior of your code is defined by the previous sections of this assignment specification.
Quality from the programmer's point of view. From the programmer's point of view, your code has quality if it is well styled and thereby easy to maintain. Good program style is defined by the previous section of this assignment specification.
The thoroughness of your Checker_isValid
functions will be a prominent part of your grade.
To encourage good coding practices, we will deduct points if gcc217
generates warning messages.
There are three keys to success for this assignment:
Your code must have proper function modularity. The heapmgr3.c
implementation defines functions HeapMgr_malloc
, HeapMgr_free
, HeapMgr_useChunk
, and HeapMgr_getMoreMemory
. In your heapmgr4.c
and heapmgr5.c
implementations you must define additional functions that help make your code easier to read and maintain.
We can't suggest the specific additional functions you should define; part of the assignment is that you determine what they should be. We suggest that you study the pseudocode given in precepts. Consider each important verb in the pseudocode. Some candidates for additional functions probably will be obvious.
Your code must check invariants thoroughly. That is, for each implementation you must define a thorough Checker_isValid
function. The Checker_isValid
function must check the HeapMgr
object's data structures, making sure that those aspects of the data structures that must not vary indeed have not varied. You must call the Checker_isValid
function at the leading and trailing edges of your HeapMgr_malloc
and HeapMgr_free
functions.
The Checker_isValid
function for the heapmgr3
implementation defines several checks of invariants. Some of them apply to your implementations; some of them do not. Your implementations must use more complex data structures, so the Checker_isValid
functions for your implementations must define more checks of invariants.
It would be a mistake to develop your Checker_isValid
functions after developing your heapmgr4.c
and heapmgr5.c
implementations. That is, it would be a mistake to proceed to Step 2 before doing a thorough job on Step 1, or to proceed to Step 5 before doing a thorough job on Step 4. After all, your Checker_isValid
functions will help you to develop your heapmgr4.c
and heapmgr5.c
implementations.
Use the given testheapmgr.c
client properly. Run the test4
and test4good
programs repeatedly, progressing from simple tests (e.g., LifoFixed
) to more complex ones (e.g., RandomRandom
), from small numbers of calls of HeapMgr_malloc
and HeapMgr_free
to large number of calls, and from small (maximum) chunk sizes to large (maximum) chunk sizes. For each run compare the performance — especially the memory consumption — of the two programs. Do the same for test5
and test5good
.
This assignment was written by Robert M. Dondero, Jr. and Iasonas Petras.