Princeton University
COS 217: Introduction to Programming Systems

Assignment 7: A Linux Shell


Purpose

The purpose of this assignment is to help you learn about Linux processes, low-level input/output, and signals. It also will give you ample opportunity to define software modules; in that sense the assignment is a capstone for the course.

Students from past semesters reported taking, on average, 26.5 hours to complete this assignment.


Rules

This assignment is an individual assignment, not a team assignment.

Signal handling (as described below) is the challenge part of this assignment. While doing the challenge part of the assignment, you are bound to observe the course policies regarding assignment conduct as given in the course Policies web page, plus one additional policy: you may not use any "human" sources of information. That is, you may not consult with the course's staff members, the lab teaching assistants, other current students via Piazza, or any other people while working on the challenge part of an assignment, except for clarification of requirements.

The challenge part is worth 5 percent of the assignment. So if you don't do any of the challenge part and all other parts of your assignment solution are perfect and submitted on time, then your grade for the assignment will be 95 percent.


Background

A Linux shell is a program that makes the facilities of the operating system available to interactive users. There are several popular Linux/Unix shells: sh (the Bourne shell), csh (the C shell), and bash (the Bourne Again shell) are a few.


Your Task

Your task in this assignment is to create a series of three related programs. The programs must be named ishlex, ishsyn, and ish. Your ish program must be a minimal but realistic interactive Linux shell. Your development of the simpler ishlex and ishsyn programs will help you to develop your ish program. A Supplementary Information page lists detailed requirements and recommendations.


The Procedure

Develop on CourseLab. Use emacs to create source code. Use make to automate the build process. Use gdb to debug.


Stage 0: Preliminaries

Read this entire assignment specification and the entire assignment Supplementary Information page. Review the lecture slides and precept material from the first half of the course on testing, building, debugging, style, and especially modularity. Study the lecture slides and precept material from the second half of the course on exceptions and processes, process management, I/O management, signals, and alarms. Complete the pertinent required reading, especially Chapter 8 of Computer Systems: A Programmer's Perspective (Bryant & O'Hallaron).

The CourseLab /u/cos217/Assignment7 directory contains files that you will find useful. Subsequent stages describe them. Create a project directory, and copy all files from the /u/cos217/Assignment7 directory to your project directory.


Stage 1: Lexical Analysis

Compose a lexical analyzer for your programs. Your lexical analyzer must be defined in a distinct module. Your lexical analyzer must accept an array of characters, and return a DynArray object containing tokens. (The DynArray ADT was described in precepts. The source code defining the DynArray ADT is available in the CourseLab /u/cos217/Assignment7 directory.) Compose additional modules that are used by your lexical analyzer, as appropriate.

From the user's point of view, a token is a word. (Your program may represent a token as a string, or as a richer data structure.) More formally, from the user's point of view a token consists of a sequence of non-white-space characters that is separated from other tokens by white-space characters. There are two exceptions:

Special characters inside of strings are not separate tokens. It is an error for an "opening" double quote within a line to be unmatched by a "closing" double quote.

Make no assumptions about the length of each line. Your lexical analyzer must work for lines of any length.
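One way to meet the "any length" requirement is to grow a heap buffer as characters arrive. The following is a minimal sketch, not a required design: readLine is a hypothetical helper name, and the starting capacity of 64 is an arbitrary choice.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read one line of any length from stream. Return a heap-allocated,
   NUL-terminated string without the trailing newline, or NULL if
   end-of-file is reached before any character is read or if memory
   is exhausted. The caller owns the returned string and must free it.
   (readLine is a hypothetical helper, not one of the given files.) */
char *readLine(FILE *stream)
{
   size_t capacity = 64;   /* arbitrary starting size */
   size_t length = 0;
   char *line;
   int c;

   assert(stream != NULL);

   line = malloc(capacity);
   if (line == NULL)
      return NULL;

   while ((c = fgetc(stream)) != EOF && c != '\n')
   {
      /* Keep one slot free for the terminating NUL. */
      if (length + 1 == capacity)
      {
         char *grown = realloc(line, capacity * 2);
         if (grown == NULL)
         {
            free(line);
            return NULL;
         }
         line = grown;
         capacity *= 2;
      }
      line[length++] = (char)c;
   }

   /* End-of-file before any character: there is no line to return. */
   if (c == EOF && length == 0)
   {
      free(line);
      return NULL;
   }

   line[length] = '\0';
   return line;
}
```

Doubling the capacity keeps the total copying cost proportional to the line length. A final line that lacks a trailing newline is still returned as a line.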

Then compose a client of your lexical analyzer. The client must be defined in a file named ishlex.c. Use the ishlex.c client, your lexical analyzer module, and other modules that you have composed to build a program named ishlex. Your ishlex must:

  1. Write to stdout a prompt consisting of a percent sign and a space.
  2. Read a line (that is, an array of characters) from stdin.
  3. Write that line (array of characters) to stdout.
  4. Flush the stdout buffer.
  5. Pass the line (array of characters) to your lexical analyzer to create a DynArray object containing tokens.
  6. Write the tokens to stdout, using precisely the format specified in the Supplementary Information page.

It must do that repeatedly until the program reaches end-of-file of stdin. Recall that typing Ctrl-d simulates end-of-file when stdin is bound to the terminal.
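The prompt, read, echo, and flush steps above can be sketched as a loop like the one below. This is only an illustration under stated assumptions: runPromptLoop is a hypothetical name, the streams are parameters so that the loop can be exercised without a terminal, and the POSIX getline function is assumed to be available with your build flags. What your program writes at end-of-file must match the sample program, which this sketch does not attempt to reproduce.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline; an assumption about the build */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Prompt, read, echo, and flush lines from in until end-of-file,
   writing to out. Return the number of lines read.
   (runPromptLoop is a hypothetical name used for illustration.) */
size_t runPromptLoop(FILE *in, FILE *out)
{
   char *line = NULL;      /* getline allocates and grows this buffer */
   size_t capacity = 0;
   size_t count = 0;

   assert(in != NULL);
   assert(out != NULL);

   for (;;)
   {
      fputs("% ", out);                          /* step 1: prompt */
      if (getline(&line, &capacity, in) == -1)   /* step 2: read */
         break;
      fputs(line, out);                          /* step 3: echo */
      fflush(out);                               /* step 4: flush */
      /* Steps 5 and 6 would go here: pass line to the lexical
         analyzer, then write the resulting tokens. */
      count++;
   }

   fflush(out);
   free(line);
   return count;
}
```

A client would call runPromptLoop(stdin, stdout). Note that getline keeps the trailing newline in the buffer, so the lexical analyzer must treat it as white space.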

Test your ishlex thoroughly. These given files will help you with your testing:


Stage 2: Syntactic Analysis

Compose a syntactic analyzer for your programs. Your syntactic analyzer must be defined in a distinct module. Your syntactic analyzer must accept a DynArray object containing tokens, and return a command. Compose additional modules as appropriate.

The DynArray object containing tokens must begin with an ordinary token, which is the command's name. It is an error for the DynArray object not to begin with an ordinary token. The command name token might be followed by tokens which are command-line arguments, tokens which indicate redirection of stdin, and/or tokens which indicate redirection of stdout.

Your syntactic analyzer must handle redirection in these ways:

Then compose a client of your syntactic and lexical analyzer modules. The client must be defined in a file named ishsyn.c. Use the ishsyn.c client, your syntactic and lexical analyzer modules, and other modules that you have composed to build a program named ishsyn.

Your ishsyn must use the same lexical analyzer module as your ishlex does.

The behavior of your ishsyn must be a superset of the behavior of your ishlex, except that your ishsyn must not write tokens to stdout. More precisely, your ishsyn must:

  1. Write to stdout a prompt consisting of a percent sign and a space.
  2. Read a line (that is, an array of characters) from stdin.
  3. Write that line (array of characters) to stdout.
  4. Flush the stdout buffer.
  5. Pass the array of characters to your lexical analyzer to create a DynArray object containing tokens.
  6. Pass the DynArray object containing tokens to your syntactic analyzer to create a command.
  7. Write the command to stdout, using precisely the format specified in the Supplementary Information page.

It must do that repeatedly until the program reaches end-of-file of stdin.

Test your ishsyn thoroughly. These given files will help you with your testing:


Stage 3: Handling External Commands

Compose a "first draft" of ish. At this stage ish must handle simple external commands, that is, commands that contain no redirection (via < or >).

Specifically, compose a file named ish.c. Use ish.c, your lexical and syntactic analyzer modules, and other modules that you have composed to build a program named ish. Compose additional modules as appropriate.

The behavior of your ish must be a superset of the behavior of your ishsyn, except that your ish must not write commands to stdout. More precisely, your ish must:

  1. Write to stdout a prompt consisting of a percent sign and a space.
  2. Read a line (that is, an array of characters) from stdin.
  3. Write the line (array of characters) to stdout.
  4. Flush the stdout buffer.
  5. Pass the line (array of characters) to your lexical analyzer to create a DynArray object containing tokens.
  6. Pass the DynArray object containing tokens to your syntactic analyzer to create a command.
  7. Execute the command.

It must do that repeatedly until the program reaches end-of-file of stdin.
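Step 7, executing an external command, typically follows the fork, exec, and wait pattern covered in the lectures on process management. The sketch below shows one possible shape. runExternal is a hypothetical name, and the error messages printed by perror are an assumption: your messages must match those of sampleish.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the external command argv[0] with the NULL-terminated argument
   array argv in a child process, and wait for it to finish. Return
   the child's exit status, or -1 if the child could not be created
   or did not exit normally. programName labels error messages.
   (runExternal is a hypothetical name used for illustration.) */
int runExternal(const char *programName, char *const argv[])
{
   pid_t pid;
   int status;

   assert(programName != NULL);
   assert(argv != NULL);
   assert(argv[0] != NULL);

   /* Flush now so that buffered output is not duplicated in the child. */
   fflush(NULL);

   pid = fork();
   if (pid == -1)
   {
      perror(programName);
      return -1;
   }

   if (pid == 0)
   {
      /* Child: replace this process image with the command. */
      execvp(argv[0], argv);
      /* execvp returns only if it failed. */
      perror(programName);
      exit(EXIT_FAILURE);
   }

   /* Parent: wait for that particular child. */
   if (waitpid(pid, &status, 0) == -1)
   {
      perror(programName);
      return -1;
   }
   return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child must call exit after a failed execvp. Otherwise two copies of the shell would continue reading commands.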

Test your ish thoroughly. These given files will help you with your testing:

Your ish must use the same lexical analyzer module and syntactic analyzer module as your ishsyn does.


Stage 4: Handling Shell Built-In Commands

Enhance ish so it handles shell built-in commands. Specifically, ish must interpret four shell built-in commands:

  setenv var [value]: If environment variable var does not exist, then your ish must create it. Your ish must set the value of var to value, or to the empty string if value is omitted. Note: Initially, your ish inherits environment variables from its parent. Your ish must be able to modify the value of an existing environment variable or create a new environment variable via the setenv command. Your ish must be able to set the value of any environment variable; but the only environment variable that it explicitly uses is HOME. It is an error for a setenv command to have zero or more than two command-line arguments.
  unsetenv var: Your ish must destroy the environment variable var. It is an error for an unsetenv command to have zero command-line arguments or more than one command-line argument.
  cd [dir]: Your ish must change its working directory to dir, or to the HOME directory if dir is omitted. It is an error for a cd command to have more than one command-line argument. It is an error for a cd command to have zero command-line arguments if the HOME environment variable is not set.
  exit: Your ish must exit with status 0. It is an error for an exit command to have any command-line arguments.
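The first three built-in commands map closely onto standard library and system calls. The following sketch shows that mapping under some assumptions: the builtin* function names are hypothetical, argument-count checking is left to the caller, and the POSIX setenv and unsetenv functions are assumed to be available with your build flags.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv and unsetenv; an assumption */
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* setenv var [value]: a NULL value means the empty string.
   Return 0 on success, -1 on failure. */
int builtinSetenv(const char *var, const char *value)
{
   assert(var != NULL);
   /* The final argument 1 asks setenv to overwrite an existing value. */
   return setenv(var, (value != NULL) ? value : "", 1);
}

/* unsetenv var. Return 0 on success, -1 on failure. */
int builtinUnsetenv(const char *var)
{
   assert(var != NULL);
   return unsetenv(var);
}

/* cd [dir]: a NULL dir means the HOME directory.
   Return 0 on success, -1 on failure. */
int builtinCd(const char *dir)
{
   if (dir == NULL)
   {
      dir = getenv("HOME");
      if (dir == NULL)
         return -1;   /* HOME is not set, which is an error */
   }
   return chdir(dir);
}
```

These calls must run in the shell process itself, not in a child. A change made to the environment or working directory of a child process would disappear when that child exits.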

Test your ish thoroughly. Your ish must have exactly the same behavior as sampleish does with respect to its handling of shell built-in commands. You will find the aforementioned testish and testishdiff scripts helpful.


Stage 5: Handling Redirection

Enhance your ish so it handles redirection of stdin and/or stdout.

It is erroneous for stdin to be redirected to a file that does not exist.

If stdout is redirected to a file that does not exist, then your ish must create it. If stdout is redirected to a file that already exists, then your ish must destroy the file's contents and rewrite the file from scratch. Your ish must set the permissions of the file to 0600.

It is erroneous for stdout to be redirected to a file whose name is invalid. For example, it is erroneous for stdout to be redirected to a file named "/" or ".", or for stdout to be redirected to a file in some directory whose contents the user cannot change.
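The redirection rules above can be sketched with open and dup2. This is an illustration only: redirectStdin and redirectStdout are hypothetical names, and a real ish would normally call them in the child process between fork and exec so that the shell's own stdin and stdout are unaffected.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Make file descriptor target refer to the open file fd, then close fd.
   Return 0 on success, -1 on failure. */
static int replaceDescriptor(int fd, int target)
{
   int result = (dup2(fd, target) == -1) ? -1 : 0;
   close(fd);
   return result;
}

/* Redirect stdin of the calling process to fileName, which must
   already exist. Return 0 on success, -1 on failure. */
int redirectStdin(const char *fileName)
{
   int fd;
   assert(fileName != NULL);
   fd = open(fileName, O_RDONLY);
   if (fd == -1)
      return -1;   /* for example, the file does not exist */
   return replaceDescriptor(fd, STDIN_FILENO);
}

/* Redirect stdout of the calling process to fileName, creating the
   file if it does not exist and discarding its contents if it does.
   Return 0 on success, -1 on failure. */
int redirectStdout(const char *fileName)
{
   int fd;
   assert(fileName != NULL);
   /* The mode 0600 applies only when the file is created. */
   fd = open(fileName, O_WRONLY | O_CREAT | O_TRUNC, 0600);
   if (fd == -1)
      return -1;   /* for example, the name is "/" or "." */
   return replaceDescriptor(fd, STDOUT_FILENO);
}
```

Both functions report failure through their return value, so the caller can write the error message in whatever format sampleish uses.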

Note that the four shell built-in commands neither read from stdin nor write to stdout. So it would be pointless (but not erroneous) for the user to redirect stdin or stdout within any of those commands. More precisely, when given a shell built-in command containing redirection of stdin or stdout, your ish must lexically and syntactically analyze the entire command, including the part that redirects stdin or stdout (just as your ishlex and your ishsyn do), and must report any lexical or syntactic errors that it encounters. However, your ish must not implement the specified file redirection.

Test your ish thoroughly. Your ish must have exactly the same behavior as sampleish does with respect to handling of redirection. You will find the aforementioned testish and testishdiff scripts helpful.


Stage 6: Handling Signals

Enhance your ish to handle SIGINT signals.

When the user types Ctrl-c, Linux sends a SIGINT signal to your ish (parent) process and to its child process. Upon receiving a SIGINT signal:

Test your ish thoroughly. Your ish must have exactly the same behavior as sampleish does with respect to handling of signals.


Finishing Up

Critique your programs using the splint tool. Each time splint generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme file.

Similarly, critique your programs using the critTer tool. Each time critTer generates a warning on your code, you must either (1) edit your code to eliminate the warning, or (2) explain your disagreement with the warning in your readme file.

Create a Makefile. The first dependency rule must build all three programs. That is, the first dependency rule in the Makefile must be

all: ishlex ishsyn ish

The Makefile must maintain object (.o) files to allow for partial builds, and encode the dependencies among the files that comprise your programs. As always, use the gcc217 command to build.

Edit your copy of the given readme file by answering each question that is expressed therein.

One of the sections of the readme file requires you to list the authorized sources of information that you used to complete the assignment. Another section requires you to list the unauthorized sources of information that you used to complete the assignment. Your grader will not grade your submission unless you have completed those sections. To complete the "authorized sources" section of your readme file, copy the list of authorized sources given in the "Policies" web page to that section, and edit it as appropriate.

Provide the instructors with your feedback on the assignment. To do that, issue this command:

FeedbackCOS217.py 7

and answer the questions that it asks. That command stores its questions and your answers in a file named feedback in your working directory.

Submit your work electronically on CourseLab using these commands:

submit 7 readme feedback Makefile ishlex.c ishsyn.c ish.c
submit 7 dynarray.h dynarray.c
submit 7 allOtherModuleFiles

Don't forget to submit both your .h files and your .c files.

To make sure that your submission is complete, use this approach. Create a temporary directory. Copy the files that comprise your submission to that directory. Build your programs in that directory to make sure that no files are missing. Delete from that directory all files that you do not wish to submit, for example, executable binary files and .o files. Finally, submit all of the files in that directory by issuing the command submit 7 *.


Handling Errors

Your programs must handle each erroneous line gracefully by writing a descriptive error message to stderr and rejecting the line. Any error message written by your programs must begin with "programName: " where programName is argv[0], that is, the name of your program's executable binary file. Note that argv[0] typically will be ishlex, ishsyn, or ish, but need not be so.
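A small helper can keep the "programName: " prefix consistent across modules. The sketch below is one possibility: reportError is a hypothetical name, the stream is a parameter only so that the function is easy to test, and the message text shown in the usage note is a placeholder rather than a message taken from the sample programs.

```c
#include <assert.h>
#include <stdio.h>

/* Write "programName: message" followed by a newline to stream.
   Return the number of characters written, or a negative value on
   an output error.
   (reportError is a hypothetical name used for illustration.) */
int reportError(FILE *stream, const char *programName,
                const char *message)
{
   assert(stream != NULL);
   assert(programName != NULL);
   assert(message != NULL);
   return fprintf(stream, "%s: %s\n", programName, message);
}
```

A client would call it as reportError(stderr, argv[0], "placeholder message"). Passing argv[0] rather than a hard-coded name keeps the prefix correct even when the executable is renamed or run by a path such as ./ish.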

The error messages written by your programs must be identical to those written by sampleishlex, sampleishsyn, and sampleish. However, if your programs read a line that contains multiple errors, then your programs can report any one of the errors, not necessarily the same error that sampleishlex, sampleishsyn, and sampleish report.

It must be impossible for the user's input to cause your programs to terminate abnormally — via a failed assert, heap corruption, a segmentation fault, etc.


Memory Management

Your programs must contain no memory leaks. For every call of malloc or calloc, eventually there must be a corresponding call of free. More specifically, your programs must produce clean meminfo reports when the user terminates your programs by typing Ctrl-d. Your ish need not produce a clean meminfo report when the user terminates the program by issuing the exit command or by typing Ctrl-c twice within three seconds.


Program Style

In part, good program style is defined by the splint and critTer tools, and by the rules given in The Practice of Programming (Kernighan and Pike) as summarized by the Rules of Programming Style document.

The more course-specific style rules listed in the previous assignment specifications also apply, as do these: your code must have proper file-level and function-level modularity.


Grading

To receive any credit for your ishlex, the program must build. To receive any credit for your ishsyn, the program must build. To receive any credit for your ish, the program must build.

We will grade your work on two kinds of quality:

To encourage good coding practices, we will deduct points if gcc217 generates warning messages.

Remember that the Supplementary Information page lists detailed implementation requirements and recommendations.


This assignment was written by Robert M. Dondero, Jr.
with contributions by many other faculty members and students.