The C preprocessor is an important part of the C programming system. It performs two jobs. The first is to handle the preprocessor directives (#define, #include, etc.) that appear in the given source code file. The second is to "de-comment" the given source code file, that is, to remove its comments. The first job is the more difficult of the two; nevertheless, the second is substantial.
Your program should be structured as a UNIX filter. That is, your program should read characters from standard input, and write characters to standard output (and possibly to standard error). Specifically, your program should read text (which presumably comprises a C program) from standard input, write that same text -- devoid of comments -- to standard output, and write error messages as appropriate to standard error. A typical command-line execution of your program might look like this:
decomment < somefile.c > somefilewithoutcomments.c 2> errormessages
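The filter structure described above can be sketched as follows. This is only a minimal illustration, not the required design: the function name copyStream and its parameter names are hypothetical, and the function copies its input unchanged rather than removing comments. It reads one character at a time, so it makes no assumption about the length of an input line.

```c
#include <stdio.h>

/* Copy every character from psIn to psOut, one character at a time,
   and return the number of characters copied. No line buffer is
   used, so the length of an input line does not matter.
   (copyStream is a hypothetical name used for illustration.) */
unsigned long copyStream(FILE *psIn, FILE *psOut)
{
   unsigned long ulCount = 0;
   int iChar;   /* int rather than char, so EOF can be distinguished */

   while ((iChar = getc(psIn)) != EOF)
   {
      putc(iChar, psOut);
      ulCount++;
   }
   return ulCount;
}
```

A main function built on this sketch would call copyStream(stdin, stdout) and could write any diagnostics with fprintf(stderr, ...), which keeps them out of the redirected standard output.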
Your program should:
Your program should detect and report these three errors:
You may assume that the last line of the input file ends with a newline character.
Your program should make sure that the text that it writes to standard output ends with a newline character. In particular, your program should write a newline character after an unterminated comment.
You should not make any assumptions about the maximum length of an input line.
Suggestion: The concept of a deterministic finite state automaton (DFA), as described in the COS 126 course, is pertinent to this assignment. Designing your program as a DFA can simplify its logic significantly.
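To illustrate how a DFA can be expressed in C, here is a sketch for a deliberately different, simpler task: capitalizing the first letter of each word. It is not a decommenter. The state names, function names, and the task itself are assumptions chosen for illustration; a decommenting DFA would have its own states (for example, one for ordinary text and one for the inside of a comment) and its own transitions.

```c
#include <ctype.h>
#include <stdio.h>

/* The two states of this toy DFA (hypothetical names). */
enum State { OUTSIDE_WORD, INSIDE_WORD };

/* Transition for state OUTSIDE_WORD. Store in *piOut the character
   to write for input character iChar, and return the next state. */
enum State handleOutsideWord(int iChar, int *piOut)
{
   if (isalpha(iChar))
   {
      *piOut = toupper(iChar);   /* a word starts here */
      return INSIDE_WORD;
   }
   *piOut = iChar;
   return OUTSIDE_WORD;
}

/* Transition for state INSIDE_WORD. Store in *piOut the character
   to write for input character iChar, and return the next state. */
enum State handleInsideWord(int iChar, int *piOut)
{
   *piOut = iChar;
   return isalpha(iChar) ? INSIDE_WORD : OUTSIDE_WORD;
}

/* Run the DFA: read characters from psIn until end of file and
   write the transformed characters to psOut. */
void capitalizeWords(FILE *psIn, FILE *psOut)
{
   enum State eState = OUTSIDE_WORD;
   int iChar;
   int iOut = 0;

   while ((iChar = getc(psIn)) != EOF)
   {
      switch (eState)
      {
         case OUTSIDE_WORD:
            eState = handleOutsideWord(iChar, &iOut);
            break;
         case INSIDE_WORD:
            eState = handleInsideWord(iChar, &iOut);
            break;
      }
      putc(iOut, psOut);
   }
}
```

Giving each state its own handler function keeps the main loop to a single switch statement, and the state in which the loop ends tells the program what kind of text the input stopped in.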
Use xemacs to create source code in a file named decomment.c.
Use the gcc command with the -Wall, -ansi, and -pedantic options to preprocess, compile, assemble, and link your program.
Execute your program multiple times on various input files that test all logical paths through your code.
To help you test your program, we have provided several files in hats directory /u/cos217/Assignment1:
sampledecomment < somefile.c > output1 2> errors1
decomment < somefile.c > output2 2> errors2
diff output1 output2
diff errors1 errors2
rm output1 errors1 output2 errors2
The UNIX diff command finds differences between two given files. The executions of the diff command shown above should produce no output. If the command "diff output1 output2" produces output, then sampledecomment and your program have written different characters to standard output. Similarly, if the command "diff errors1 errors2" produces output, then sampledecomment and your program have written different characters to standard error.
Use xemacs to create a "readme" text file that contains:
Note that comments describing your code should not be in the readme file. Rather they should be integrated into the code itself.
Submit your work electronically on hats via the command:
/u/cos217/bin/i686/submit 1 decomment.c readme
If the directory /u/cos217/bin/i686 is in your PATH environment variable, then you can abbreviate that command as:
submit 1 decomment.c readme
If you are using the bash shell and have copied the files .bashrc and .bash_profile to your HOME directory, then the directory /u/cos217/bin/i686 is already in your PATH environment variable. You can examine the value of your PATH environment variable by executing the command "printenv PATH".
(1) Uses a consistent and appropriate indentation scheme. All statements that are nested within a compound, if, switch, while, for, or do...while statement should be indented. Most programmers use either a 3- or 4-space indentation scheme. Note that the xemacs editor can automatically apply a consistent indentation scheme to your program.
(2) Uses descriptive identifiers. The names of variables, constants, structures, types, and functions should indicate their purpose. Remember: C can handle identifiers of any length, and at least the first 31 characters of an internal identifier are significant. We encourage you to prefix each variable name with characters that indicate its type. For example, the prefix "c" might indicate that the variable is of type "char," "i" might indicate "int," "pc" might mean "pointer to char," "ui" might mean "unsigned int," etc.
(3) Contains carefully worded comments. You should begin each program file with a comment that includes your name, the number of the assignment, and the name of the file. Each function -- especially the main function -- should begin with a comment that describes what the computer does when it executes that function. That comment should explicitly state what (if anything) the computer reads from standard input (or any other stream), and what (if anything) the computer writes to standard output and standard error (or any other stream). The function's comment should also describe what the computer does when it executes that function by explicitly referring to the function's parameters and return value.
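The following fragment sketches how the three guidelines above might look together in practice. Everything in it is illustrative: the file name, the placeholder author line, and the function countNewlines are hypothetical and are not part of the assignment.

```c
/* style_example.c (hypothetical file name)
   Author: Your Name
   Assignment: 1 */

/* Return the number of newline characters in the string pcText.
   pcText must point to a null-terminated string. This function
   reads nothing from standard input and writes nothing to standard
   output or standard error. */
unsigned int countNewlines(const char *pcText)
{
   unsigned int uiCount = 0;   /* prefix "ui": unsigned int */
   const char *pcCurrent;      /* prefix "pc": pointer to char */
   char cCurrent;              /* prefix "c": char */

   for (pcCurrent = pcText; *pcCurrent != '\0'; pcCurrent++)
   {
      cCurrent = *pcCurrent;
      if (cCurrent == '\n')
         uiCount++;
   }
   return uiCount;
}
```

The fragment uses a 3-space indentation scheme, names that state each variable's purpose and type, and a function comment that refers explicitly to the parameter, the return value, and the standard streams.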