A symbol table is an unordered collection of bindings. A binding consists of a key and a value. A key is a string that uniquely identifies its binding; a value is data that is somehow pertinent to its key. A symbol table allows its client to insert (put) new bindings, to retrieve (get) the values of bindings with specified keys, and to remove bindings with specified keys. Symbol tables are used often in programming systems; compilers, assemblers, and execution profilers use them extensively.
There are several reasonable ways to implement a symbol table. A simple implementation might store the bindings in a linked list. Linked lists are described in Section 2.7 of The Practice of Programming (Kernighan and Pike) and Section 17.5 of C Programming: A Modern Approach (King). A more efficient implementation might use a hash table. Hash tables are described in Section 2.9 of The Practice of Programming, Section 6.6 of The C Programming Language (Kernighan and Ritchie), Chapter 8 of C Interfaces and Implementations (Hanson), and Chapter 14 of Algorithms in C, Parts 1-4 (Sedgewick).
Your task in this assignment is to create an ADT named "SymTable". Each instance of the SymTable ADT should be a symbol table. You should design your SymTable ADT so it is "generic." That is, you should design your SymTable so its values are void pointers, and thus can point to data of any type.
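To illustrate what "generic" means here, the following sketch stores values as void pointers and lets the client cast them back to their real types. The struct Binding type and the Binding_getValue() function are hypothetical names chosen for this illustration; the assignment does not prescribe them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical binding type, for illustration only; the assignment
   does not prescribe this layout. */
struct Binding
{
   const char *pcKey;    /* the key (a string) */
   const void *pvValue;  /* the value: may point to data of any type */
};

/* Return the value of *psBinding as an untyped pointer. The client,
   which knows the real type, casts the result back. */
void *Binding_getValue(const struct Binding *psBinding)
{
   assert(psBinding != NULL);
   return (void *)psBinding->pvValue;
}
```

A client that stored the address of an int would write *(int *)Binding_getValue(&sBinding) to recover the int. The table itself never needs to know the type.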
You should create two implementations of your SymTable ADT: one that uses a linked list and another that uses a hash table.
A subsequent assignment will ask you to create a program that is a client of your SymTable ADT.
The SymTable interface should be stored in a file named symtable.h. It should contain these function declarations:
   SymTable_T SymTable_new(void);
   void SymTable_free(SymTable_T oSymTable);
   int SymTable_getLength(SymTable_T oSymTable);
   int SymTable_put(SymTable_T oSymTable, const char *pcKey, const void *pvValue);
   int SymTable_remove(SymTable_T oSymTable, const char *pcKey);
   int SymTable_contains(SymTable_T oSymTable, const char *pcKey);
   void *SymTable_get(SymTable_T oSymTable, const char *pcKey);
   void SymTable_map(SymTable_T oSymTable,
      void (*pfApply)(const char *pcKey, void *pvValue, void *pvExtra),
      const void *pvExtra);

SymTable_new() should return a new SymTable_T that contains no bindings.
SymTable_free() should free all memory occupied by oSymTable. If oSymTable is NULL, then the function should do nothing.
SymTable_getLength() should return the number of bindings in oSymTable. It should be a checked runtime error for oSymTable to be NULL.
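This handout does not say how a checked runtime error should be detected. One common approach, assumed here, is the assert() macro from assert.h, which aborts the program when its condition is false. The struct Table type and the Table_getLength() function below are hypothetical stand-ins, not the required SymTable representation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table representation, for illustration only. */
struct Table
{
   int iLength;  /* number of bindings currently stored */
};

/* Return the number of bindings in *psTable. A NULL psTable is
   treated as a checked runtime error: assert() aborts the program. */
int Table_getLength(const struct Table *psTable)
{
   assert(psTable != NULL);
   return psTable->iLength;
}
```

Note that assert() checks are compiled out when the NDEBUG macro is defined, so they are active only in builds that leave NDEBUG undefined.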
If no binding with key pcKey exists in oSymTable, then SymTable_put() should add a new binding to oSymTable consisting of key pcKey and value pvValue, and should return 1 (TRUE). Otherwise the function should not change oSymTable, and should return 0 (FALSE). It should be a checked runtime error for oSymTable or pcKey to be NULL.
If a binding with key pcKey exists within oSymTable, then SymTable_remove() should remove that binding from oSymTable and should return 1 (TRUE). Otherwise the function should not change oSymTable, and should return 0 (FALSE). It should be a checked runtime error for oSymTable or pcKey to be NULL.
SymTable_contains() should return 1 (TRUE) if oSymTable contains a binding whose key is pcKey, and 0 (FALSE) otherwise. It should be a checked runtime error for oSymTable or pcKey to be NULL.
SymTable_get() should return the value of the binding within oSymTable whose key is pcKey, or NULL if no such binding exists. It should be a checked runtime error for oSymTable or pcKey to be NULL.
SymTable_map() should apply function *pfApply to each binding in oSymTable, passing pvExtra as an extra parameter. It should be a checked runtime error for oSymTable or pfApply to be NULL.
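The callback mechanism behind SymTable_map() can be sketched without the ADT itself. In the sketch below, applyToAll() is a hypothetical stand-in that walks a fixed array of bindings instead of a real SymTable, and addToTotal() is a hypothetical client callback that uses pvExtra to accumulate a sum. The cast on pvExtra is needed because the interface declares the parameter as const void * while the callback receives a void *.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a table's contents: one array element
   per binding. */
struct Binding
{
   const char *pcKey;
   void *pvValue;
};

/* Apply *pfApply to each of the iCount bindings in asBindings,
   passing pvExtra through unchanged. This mimics the traversal that
   SymTable_map() performs; it is not the ADT itself. */
void applyToAll(struct Binding asBindings[], int iCount,
   void (*pfApply)(const char *pcKey, void *pvValue, void *pvExtra),
   const void *pvExtra)
{
   int i;
   assert(asBindings != NULL);
   assert(pfApply != NULL);
   for (i = 0; i < iCount; i++)
      (*pfApply)(asBindings[i].pcKey, asBindings[i].pvValue,
         (void *)pvExtra);
}

/* Callback: treat pvValue as a pointer to an int and add that int to
   the int total that pvExtra points to. pcKey is unused here. */
void addToTotal(const char *pcKey, void *pvValue, void *pvExtra)
{
   (void)pcKey;
   *(int *)pvExtra += *(int *)pvValue;
}
```

A client would declare an int total, initialize it to 0, and pass its address as pvExtra. After the traversal, the total holds the sum of all values.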
A SymTable should "own" its keys. That is, the SymTable_put function should not simply store the value of pcKey within the binding that it creates. Rather, the SymTable_put function should make a copy of string pcKey, and store the address of that copy within the new binding. You will find the standard C functions strlen() and malloc() useful for making the copy. Conversely, a SymTable should not own its values. Indeed it cannot own its values; since it cannot determine the types of its values, it cannot create copies of them.

Your SymTable linked list implementation should:
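Returning to key ownership: the copying step described in the ownership paragraph above might look like the following sketch. The name copyKey() is hypothetical, and returning NULL when malloc() fails is an assumption; the handout does not say how SymTable_put() should react to memory exhaustion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of pcKey, or NULL if memory is
   exhausted. The caller owns the copy and must eventually free() it.
   copyKey is a hypothetical helper name. */
char *copyKey(const char *pcKey)
{
   char *pcCopy;
   assert(pcKey != NULL);
   pcCopy = malloc(strlen(pcKey) + 1);  /* +1 for the trailing '\0' */
   if (pcCopy == NULL)
      return NULL;
   strcpy(pcCopy, pcKey);
   return pcCopy;
}
```

Because the binding stores the copy, the client may later change or free its own string without affecting the key held by the table.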
#define HASH_MULTIPLIER 65599U

...

static int SymTable_hash(const char *pcKey, int iBucketCount)

/* Return a hash code for pcKey that is between 0 and iBucketCount-1,
   inclusive. Adapted from the Spring 2005 COS 217 lecture notes. */

{
   int i;
   unsigned int uiHash = 0U;
   for (i = 0; pcKey[i] != '\0'; i++)
      uiHash = uiHash * HASH_MULTIPLIER + (unsigned int)pcKey[i];
   return (int)(uiHash % (unsigned int)iBucketCount);
}

See page 578 of the Sedgewick textbook for an alternative.
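As a worked example of the hash function above, assume an ASCII character set, in which 'a' is 97 and 'b' is 98. For the key "ab" and 509 buckets, the loop computes 0 * 65599 + 97 = 97, then 97 * 65599 + 98 = 6363201, and finally 6363201 % 509 = 192. The sketch below repeats the function so that the block compiles on its own, and adds a hypothetical helper, sameBucket(), that reports whether two keys land in the same bucket.

```c
#include <assert.h>
#include <stddef.h>

#define HASH_MULTIPLIER 65599U

/* Return a hash code for pcKey that is between 0 and iBucketCount-1,
   inclusive. Repeated from the handout. */
static int SymTable_hash(const char *pcKey, int iBucketCount)
{
   int i;
   unsigned int uiHash = 0U;
   for (i = 0; pcKey[i] != '\0'; i++)
      uiHash = uiHash * HASH_MULTIPLIER + (unsigned int)pcKey[i];
   return (int)(uiHash % (unsigned int)iBucketCount);
}

/* Return 1 (TRUE) if pcKey1 and pcKey2 hash to the same bucket of a
   table with iBucketCount buckets, and 0 (FALSE) otherwise.
   sameBucket is a hypothetical helper name. */
int sameBucket(const char *pcKey1, const char *pcKey2, int iBucketCount)
{
   assert(pcKey1 != NULL);
   assert(pcKey2 != NULL);
   assert(iBucketCount > 0);
   return SymTable_hash(pcKey1, iBucketCount)
      == SymTable_hash(pcKey2, iBucketCount);
}
```

Distinct keys can share a bucket, so each bucket must be able to hold more than one binding.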
Implementing those features is worth 94% of the assignment. To receive the remaining 6%, your SymTable hash table implementation also should:
More precisely, page 126 of the Hanson textbook provides a sequence of integers that are appropriate bucket counts: 509, 1021, 2053, 4093, 8191, 16381, 32771, and 65521. When SymTable_put() detects that the new binding count exceeds 509, it should increase the bucket count to 1021. When the function detects that the new binding count exceeds 1021, it should increase the bucket count to 2053. Etc. When SymTable_put() detects that the new binding count exceeds 65521, it should not increase the bucket count. Thus 65521 is the maximum number of buckets that the SymTable should contain.
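The expansion rule above can be sketched as a lookup in the sequence of bucket counts. The function name nextBucketCount() and the array name aiBucketCounts are hypothetical. The sketch covers only the choice of the new bucket count, not the expansion itself.

```c
#include <assert.h>
#include <stddef.h>

/* Bucket counts from the sequence cited above, smallest first. */
static const int aiBucketCounts[] =
   {509, 1021, 2053, 4093, 8191, 16381, 32771, 65521};

/* Return the bucket count to use after an insertion leaves the table
   with iBindingCount bindings in iBucketCount buckets: the next
   larger count in the sequence if iBindingCount exceeds
   iBucketCount, and iBucketCount itself otherwise (including when
   iBucketCount is already the maximum, 65521). */
int nextBucketCount(int iBucketCount, int iBindingCount)
{
   size_t i;
   size_t uLength = sizeof(aiBucketCounts) / sizeof(aiBucketCounts[0]);
   assert(iBucketCount > 0);
   if (iBindingCount <= iBucketCount)
      return iBucketCount;
   for (i = 0; i + 1 < uLength; i++)
      if (aiBucketCounts[i] == iBucketCount)
         return aiBucketCounts[i + 1];
   return iBucketCount;
}
```

Keep in mind that a binding's bucket depends on the bucket count, so an actual expansion must also recompute the bucket of every existing binding.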
You will receive a better grade if you submit a working non-expanding hash table implementation instead of a non-working expanding hash table implementation. If your attempts to develop the expansion code fail, then leave the expansion code in your symtablehash.c file as comments, and describe your attempts in your readme file.
A client program is available in the file /u/cos217/Assignment3/testsymtable.c. That program requires you to provide a single command-line argument, which should be an integer that specifies a binding count. The program tests your SymTable ADT by manipulating several SymTable objects. One of those SymTable objects contains the specified number of bindings. The program prints to stdout an indication of how much CPU time it consumed while manipulating that SymTable object. You should use testsymtable.c to test your SymTable ADT. You should create additional test programs, as you deem necessary, to test your SymTable ADT.
Create a "readme" text file that contains:
Submit your work electronically on hats via the command:
/u/cos217/bin/i686/submit 3 symtable.h symtablelist.c symtablehash.c readme