Princeton University
COS 217: Introduction to Programming Systems

Assignment 2: A String Replacement Program

Purpose

The purpose of this assignment is to help you learn or review pointer handling and dynamic memory management in the C programming language.  It will also give you the opportunity to learn more about the UNIX/GNU programming tools, especially bash, Emacs, gcc, and gdb.

Background

Every programming system contains a text editor.  Text editors typically provide a "replace" operation that allows the user to systematically replace instances of one character string with another.

Your Task

Your task in this assignment is to create a C program named strreplace.  As the name implies, the program should allow the user to replace strings, thus implementing one facet of a text editor.  Specifically, when invoked from the shell in this manner:
strreplace fromstring tostring
the program should read lines from stdin and write them to stdout, replacing all occurrences of the string fromstring with the string tostring.  The strings fromstring and tostring are not necessarily the same length.

For example, if the file typical.txt contains:
Now is the time
for all good men
to come to the aid of their country.
then the command:
strreplace the first <typical.txt
should print this to stdout:
Now is first time
for all good men
to come to first aid of firstir country.
In addition to printing the transformed text to stdout, strreplace should also print diagnostic information to stderr (not stdout) indicating the number of substitutions performed and the number of characters by which the size of the file changed.  For the above input, the output to stderr should be:
3 substitutions
+6 characters
The tostring command-line argument is optional; if it is omitted, the effect should be to delete occurrences of fromstring.  Thus, the command:
strreplace the <typical.txt
should print this to stdout:
Now is  time
for all good men
to come to  aid of ir country.
Note the two spaces between "is" and "time" in the first line and between "to" and "aid" in the last line.

The output to stderr should be:

3 substitutions
-9 characters
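
The diagnostic figures in the two examples above follow from simple arithmetic: each substitution changes the size by the length of tostring minus the length of fromstring, so 3 * (5 - 3) = +6 and 3 * (0 - 3) = -9.  The following is a minimal sketch of that calculation, not the sample solution.  The function names countOccurrences, sizeChange, and printDiagnostics are hypothetical, and the sketch assumes that matches are found left to right and do not overlap, which the assignment does not state explicitly.  It also assumes a "%+ld" format for the sign; compare against samplestrreplace for cases such as a change of zero.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the number of occurrences of pcFrom in pcLine.
   Assumption: occurrences are matched left to right and do not
   overlap.  pcFrom must not be the empty string. */
size_t countOccurrences(const char *pcLine, const char *pcFrom)
{
   size_t uCount = 0;
   size_t uFromLen;
   const char *pcMatch;

   assert(pcLine != NULL);
   assert(pcFrom != NULL);
   assert(*pcFrom != '\0');

   uFromLen = strlen(pcFrom);
   for (pcMatch = strstr(pcLine, pcFrom);
        pcMatch != NULL;
        pcMatch = strstr(pcMatch + uFromLen, pcFrom))
      uCount++;
   return uCount;
}

/* Return the net change in character count caused by uCount
   substitutions of pcTo for pcFrom. */
long sizeChange(size_t uCount, const char *pcFrom, const char *pcTo)
{
   assert(pcFrom != NULL);
   assert(pcTo != NULL);
   return (long)uCount * ((long)strlen(pcTo) - (long)strlen(pcFrom));
}

/* Write the two diagnostic lines to psStream (stderr in strreplace).
   The + flag makes printf emit an explicit sign. */
void printDiagnostics(FILE *psStream, size_t uCount, long lChange)
{
   assert(psStream != NULL);
   fprintf(psStream, "%lu substitutions\n", (unsigned long)uCount);
   fprintf(psStream, "%+ld characters\n", lChange);
}
```

A program built this way would accumulate the count over all lines of stdin and call printDiagnostics(stderr, ...) once, after the last line has been written.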

Your program should validate its command-line arguments: if the user specifies zero command-line arguments or more than two command-line arguments, your program should print a "usage" message to stderr indicating proper usage of the program. 

Your program should handle boundary cases reasonably.  In particular, strreplace should treat an empty tostring (e.g., strreplace the "") as it treats a missing tostring.  It should handle an empty fromstring (e.g., strreplace "" first) by printing a "usage" message to stderr.  It should not crash when the stdin file is empty.
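
The argument rules above can be sketched as a pair of small helper functions.  This is only an illustration under stated assumptions: the names argsAreValid and getToString are hypothetical, and the wording of the "usage" message is left to you.  Recall that argc counts the program name, so one or two command-line arguments correspond to an argc of 2 or 3.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the command line is acceptable, 0 if a usage message
   should be printed.  argc includes the program name, so valid
   invocations have argc equal to 2 or 3.  An empty fromstring
   (argv[1]) is rejected. */
int argsAreValid(int argc, char *argv[])
{
   assert(argv != NULL);

   if (argc < 2 || argc > 3)
      return 0;
   if (argv[1][0] == '\0')
      return 0;
   return 1;
}

/* Return the replacement string.  A missing tostring is treated as
   the empty string, so it behaves the same as an explicit "".
   Call only after argsAreValid has returned 1. */
const char *getToString(int argc, char *argv[])
{
   assert(argv != NULL);
   return (argc == 3) ? argv[2] : "";
}
```

With this arrangement the rest of the program never needs to distinguish a missing tostring from an empty one.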

There are several reasonable ways to write strreplace.  To achieve the stated purpose of the assignment, you should write your program so it processes one line at a time.  That is, your program should read an entire line from stdin into memory, apply the appropriate substitutions to produce the entire resulting line in memory, print the resulting line to stdout, and repeat. Using that approach is a critical aspect of the assignment.  (Hint:  Read about the gets, fgets, puts, and fputs functions.)

You may assume that no stdin line is longer than 80 characters; your program may ignore characters in the 81st column and beyond.  But your program should not corrupt memory when stdin contains a line whose length is greater than 80 characters.  (Hint: Again, read about the gets and fgets functions.)  You should not make any assumptions about the length of the line that results from applying substitutions.
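
One way to honor the 80-character limit without corrupting memory is to let fgets bound the read and then discard whatever remains of an overlong line.  The sketch below is a hedged illustration, not the sample solution.  The names readLine, MAX_LINE_LENGTH, and BUFFER_SIZE are hypothetical, and the buffer size assumes that the 80 characters do not include the newline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: a line holds at most 80 characters plus a newline,
   and the buffer needs one more byte for the terminating '\0'. */
enum {MAX_LINE_LENGTH = 80};
enum {BUFFER_SIZE = MAX_LINE_LENGTH + 2};

/* Read one line from psFile into pcBuffer, which has room for iSize
   bytes.  fgets never stores more than iSize - 1 characters, so the
   buffer cannot overflow.  If the line did not fit, read and discard
   the rest of it so that the next call starts on a fresh line.
   Return 1 if a line was read, 0 at end-of-file or on error. */
int readLine(char *pcBuffer, int iSize, FILE *psFile)
{
   int iChar;

   assert(pcBuffer != NULL);
   assert(iSize > 1);
   assert(psFile != NULL);

   if (fgets(pcBuffer, iSize, psFile) == NULL)
      return 0;

   if (strchr(pcBuffer, '\n') == NULL)
   {
      /* No newline stored: the line was too long, or it is the last
         line of a file that does not end with a newline. */
      while ((iChar = getc(psFile)) != EOF && iChar != '\n')
         ;
   }
   return 1;
}
```

A caller would declare char acLine[BUFFER_SIZE] and loop while readLine(acLine, BUFFER_SIZE, stdin) returns 1.  Note that a truncated line comes back without its newline, so the caller must decide whether to write one.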

Your program should contain no "memory leaks."  Any memory that it acquires via the malloc or calloc functions should be released via the free function.
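
Because the length of the resulting line cannot be assumed, one reasonable approach is to compute it exactly, allocate that much memory, build the line, and free the memory once the line has been printed.  The following is a minimal sketch of that idea, not the sample solution.  The function name replaceAll is hypothetical, and the sketch assumes left-to-right, non-overlapping matching.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of pcLine in which every occurrence
   of pcFrom is replaced by pcTo, or NULL if memory is exhausted.
   pcFrom must not be empty.  The caller owns the result and must
   pass it to free. */
char *replaceAll(const char *pcLine, const char *pcFrom,
                 const char *pcTo)
{
   size_t uFromLen;
   size_t uToLen;
   size_t uCount = 0;
   size_t uPrefixLen;
   const char *pcScan;
   const char *pcMatch;
   char *pcResult;
   char *pcOut;

   assert(pcLine != NULL);
   assert(pcFrom != NULL);
   assert(pcTo != NULL);
   assert(*pcFrom != '\0');

   uFromLen = strlen(pcFrom);
   uToLen = strlen(pcTo);

   /* First pass: count matches so the exact size is known. */
   for (pcScan = pcLine;
        (pcMatch = strstr(pcScan, pcFrom)) != NULL;
        pcScan = pcMatch + uFromLen)
      uCount++;

   pcResult = (char *)malloc(strlen(pcLine) - uCount * uFromLen
                             + uCount * uToLen + 1);
   if (pcResult == NULL)
      return NULL;

   /* Second pass: copy the text before each match, then pcTo. */
   pcOut = pcResult;
   pcScan = pcLine;
   while ((pcMatch = strstr(pcScan, pcFrom)) != NULL)
   {
      uPrefixLen = (size_t)(pcMatch - pcScan);
      memcpy(pcOut, pcScan, uPrefixLen);
      pcOut += uPrefixLen;
      memcpy(pcOut, pcTo, uToLen);
      pcOut += uToLen;
      pcScan = pcMatch + uFromLen;
   }
   strcpy(pcOut, pcScan);   /* copy the tail and the final '\0' */
   return pcResult;
}
```

Each call to replaceAll should be paired with exactly one call to free, made after the resulting line has been written to stdout.  Checking the malloc result for NULL before using it is part of the same discipline.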

Your program probably will call several functions from the standard C library.  (The sample solution that we developed calls approximately 10 of them.)  Many of those functions will be from the standard string library.  It would be appropriate to define additional string manipulation functions as the need arises.

Logistics

You should develop on arizona using the bash shell. Use Emacs to create source code. Use gcc to compile, assemble, and link. Use gdb to debug.

You should store your source code in a file named strreplace.c.  You should store the executable program in a file named strreplace.  The file /u/cs217/Assignment2/samplestrreplace is an executable solution; to test your strreplace program you may compare its output to that of samplestrreplace.  You may also use the shell script in the file /u/cs217/Assignment2/teststrreplace and the data files in the Assignment2 directory to test your program.  Your program should pass all of the tests defined in the teststrreplace shell script.

Create a "readme" text file containing your name and descriptions of any known bugs. Submit your work electronically via the command /u/cs217/bin/submit 2 strreplace.c readme.

Grading

We will grade your work on correctness and understandability.  See Assignment 1 for guidelines concerning program understandability.  To encourage good coding practices, you will lose points for any warning messages generated by gcc during the compilation of your work.