COS 333 Assignment 2: Feeping Creaturism

Due midnight, Friday, February 19

Thu Feb 11 09:14:00 EST 2010

Clarification added 2/11: Replace /*...*/ comments by a single blank. And when you see /*, assume it begins a comment, not a regular expression. The purpose of the assignment is for you to work in a big program, not get trapped in dusty corners.

One of the most common experiences for a programmer coming into a new job or working with some open-source code is to have to fix a bug or make a small change in a large unfamiliar program. This requires the ability to quickly find the relevant parts of the program and change them in a minimal way, while ignoring irrelevant parts and being sure not to break anything.

This assignment is an exercise in adding some new features to an existing program that we have talked about in class, but whose innards are almost certainly unfamiliar. To get started, become familiar enough with AWK that you know in broad outline what it does; this awk help file might help. Then download the source from this page and skim the source code to see how it is implemented.

Your specific tasks are to

add a repeat-until statement that parallels the existing do-while statement but reverses the sense of the test. That is,
```
	repeat {
		statements
	} until (expression)
```
repeats the statements until the expression becomes true.
add a new built-in function htoi(s) that returns the result of converting the hexadecimal digits of the string s into a numeric value, analogous to the C function atoi. Your function should ignore leading whitespace and any 0x or 0X prefix, handle an optional sign, and stop scanning at the first character not part of a hex number. Use strtol; do not parse hex with your own code! (strtoul would be an alternative, but its results are somewhat surprising for negative hex strings. In this narrow context strtol is better.)
modify lexical analysis to permit /*...*/ comments as in C, C++, Java, etc., in addition to the existing # comments. You do not need to support nested comments; the first */ ends the comment even if there is more than one /*.
create enough test cases to exercise your code thoroughly.

For this assignment, you can get a very long way merely by finding code that already does more or less what's needed and is already in the right place and right form; most of the job amounts to intelligent cut and paste, using grep to locate places that might be relevant.

For repeat-until, the AWK grammar needs a new rule that specifies the syntax of a repeat-until statement and creates the right kind of node in the parse tree. Lexical analysis has to recognize two new keywords, and a new function has to be written and added to the program to provide the semantics.

For htoi, there are no grammar changes but there is a new function name to be recognized, and you have to add new semantics in the right place in run.c.

For the new comment syntax, you have to fiddle the lexical analysis in lex.c.

For the last part, testing, automate as much as possible. Create a shell file awk.test (for ksh and bash, not tcsh). It should contain at least a dozen tests in total, enough to ensure that your features are properly tested. The file awk.test should be self-contained, requiring no input from a user and generating its own test data somehow. It should produce no output if the tests work, and one line per failure, of the form

	Error: test N failed

if the N-th test fails. It should assume that the program being tested is named a.out and is in the current directory. For example, this file contains one minimal test of repeat-until:

	#!/bin/bash
	# test 1: count down from 1 to 0; no braces
	./a.out '
	BEGIN {
		n = 1	/* a comment in the new syntax */
		repeat
			print n
		until (--n <= 0)
	}' >temp1
	echo '1' >temp2
	cmp temp1 temp2 >/dev/null 2>&1 || echo 'Error: test 1 failed'

Your awk.test should contain other tests, with comments to explain what each test does. Don't forget to test for syntax errors.
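
One way to keep a dozen or more tests manageable is to route each one through a small helper. The sketch below is only an illustration, not a required layout: the helper names check and check_error and the AWK variable are made up here, and the expected values assume that htoi behaves like strtol with base 16, as described in the task list. It avoids bash-only features so that it should also run under ksh.

```shell
#!/bin/bash
# Sketch of a table-driven awk.test. The helper names and the AWK
# variable are illustrative assumptions, not required by the assignment.
AWK=${AWK:-./a.out}	# program under test: a.out in the current directory

# check N EXPECTED PROGRAM
# Run PROGRAM with no input; complain if its output is not EXPECTED.
check() {
	actual=$("$AWK" "$3" </dev/null 2>/dev/null) || true
	[ "$actual" = "$2" ] || echo "Error: test $1 failed"
}

# check_error N PROGRAM
# PROGRAM is deliberately wrong; complain if it is accepted (exit status 0).
check_error() {
	if "$AWK" "$2" </dev/null >/dev/null 2>&1; then
		echo "Error: test $1 failed"
	fi
}

# test 2: htoi skips an optional 0x prefix
check 2 '31' 'BEGIN { print htoi("0x1F") }'
# test 3: htoi skips leading blanks and honors a sign
check 3 '-255' 'BEGIN { print htoi("  -ff") }'
# test 4: htoi stops at the first character that is not a hex digit
check 4 '18' 'BEGIN { print htoi("12xyz") }'
# test 5: a comment acts as a blank; the body runs until the test is true
check 5 '3' 'BEGIN { n = 0 /* start */; repeat { n++ } until (n >= 3); print n }'
# test 6: repeat with no matching until must be rejected
check_error 6 'BEGIN { repeat { n++ } }'
```

Note that check_error cannot tell a rejected program from a missing a.out, since both give a non-zero exit status, so it is worth having at least one ordinary check in the file as well.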

Here is some other advice:

Talking over the precise meaning of the assignment with friends is encouraged. You must write your code yourself, however, so once you start coding, you may no longer discuss specific programming details with other students.
It is almost always best to attack such a task in stages, making sure each one works before the next one is started. First compile what you downloaded. Then make a tiny change, recompile, and test:
```
	repeat {tiny change; recompile; test} until (it's done)
```
Prepare some good test cases ahead of time. Test early, test often, know what you are testing. Automate the testing as much as you can.
Learn to use gdb. This short gdb help file might help you get started or remind you of basic commands.
No core dumps, please.

For my version, I added about 25-30 lines, mostly for repeat-until, spread over 6 source files; almost all are simple variants of what's there already. The htoi function required about 5 new lines in 3 files. Comment stripping added about 10-12 lines. If you're doing a lot more, you are off on the wrong track. If you get stuck, here are some hints that might help. No penalty for using them, no reward for not using them. You will probably find it most instructive to try hard before looking at the hints.

We will assess the quality of your test cases, so your tests should be good ones. We will also make sure you didn't break something else, so be careful of that. Don't make unnecessary or irrelevant changes in any file. In particular, do not replace newlines by carriage returns (Mac) or CRLF (Windows), and do not reformat the code. And try to make your tests correct; it would be nice if we could run your tests over everyone else's implementations.
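
A quick way to catch accidental line-ending changes is to search the edited tree for carriage returns. This is a minimal sketch: the function name files_with_cr is invented for illustration, and it relies on the recursive -r option that GNU and BSD grep provide.

```shell
#!/bin/bash
# files_with_cr DIR
# Print the name of every file under DIR that contains a carriage return.
# Compiled files such as a.out may be listed too; only source files matter.
files_with_cr() {
	grep -rl $'\r' "$1"
}
```

Run as files_with_cr followed by the directory that holds your modified source, it should print nothing if the line endings are intact.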

Submission

Updates to significant pieces of software are often distributed as "patch" files, that is, the set of changes necessary to convert the old version into the new version. For Unix and Linux source, a patch file is usually made by running diff to create a file of editing commands on the system where the changes were made, and running patch with that file as input on the system where the changes are to be applied.

For this assignment, you have to submit a patch file awk.patch that contains your changes. The easiest way to do this is to place the original program in one directory, say old, and the new version in new. Clean out all the junk like Yacc-generated files (ytab*), proctab.c, and binary files like a.out and *.o. (The make clean rule in the makefile doesn't get rid of all of ytab*.) Then in the parent of these directories, say

     diff -ur old new >awk.patch

The recipient (the grader in this case) will say, on his/her system,

    cd old
    patch --verbose --backup <../awk.patch

to update the old version with your changes, in place.

Try the patch process yourself to be sure it works right before you submit. Back up your work before you start experimenting!! My patch file is around 170 lines long; if yours is a lot bigger, you are probably including something like a Yacc output file or you have somehow changed line feeds or tabs in your source files. Fix those up and try again.
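
The whole cycle can be rehearsed with a short script. The following is a rough sketch under stated assumptions, not a prescribed procedure: the function names make_patch and rehearse_patch are invented here, and both expect to be run in the parent directory of old and new. The rehearsal leaves out --backup only because it works on a disposable copy of old, where the saved .orig files would show up as spurious differences.

```shell
#!/bin/bash
# Sketch only: make_patch and rehearse_patch are hypothetical names.
# Both assume the current directory is the parent of old and new.

# Remove the generated files named in the assignment, then write awk.patch
# and print its length in lines.
make_patch() {
	local d
	for d in old new; do
		( cd "$d" && rm -f ytab* proctab.c a.out *.o ) || return 1
	done
	# diff exits with 1 when the trees differ, which is the normal case
	# here; only a status above 1 means trouble.
	diff -ur old new >awk.patch || [ $? -eq 1 ] || return 1
	wc -l <awk.patch
}

# Apply awk.patch to a scratch copy of old, as the grader would in old,
# and succeed only if the patched copy is identical to new.
rehearse_patch() {
	local here=$PWD scratch status
	scratch=$(mktemp -d) || return 1
	if cp -r old "$scratch/old" &&
	   ( cd "$scratch/old" && patch <"$here/awk.patch" >/dev/null ) &&
	   diff -r "$scratch/old" new >/dev/null
	then status=0
	else status=1
	fi
	rm -rf "$scratch"
	return $status
}
```

After defining both functions, running make_patch and then rehearse_patch should leave old untouched; a non-zero status from rehearse_patch means the patch did not turn old into new.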

When you're all done, submit awk.patch and awk.test, using dropbox.

PLEASE follow the rules on what to submit. It will be a help if you get the filenames right and submit exactly what's asked for. Thanks.