|
What are the goals of this assignment? To learn about binary trees and data compression.
Do I need to use a symbol table on this assignment? No. However, note that the encoding table is really a symbol table that maps from characters to sequences of 0s and 1s. To encode the message, you would build this symbol table, and then compress the message by replacing each character with the corresponding sequence of 0s and 1s. But your task on this assignment is to decode, so you don't need to build an explicit symbol table.
Can there be whitespace characters in the preorder representation of the tree? Yes, if the original message uses whitespace, then the tree will contain whitespace characters. You should not need to do anything special to handle them.
How do I detect the newline between the preorder and the sequence of 0s and 1s? The easiest way is to simply ignore anything other than 0s and 1s when traversing the tree: if 0 go left, if 1 go right, otherwise ignore.
The .pre file seems to have a trailing newline character after all of the 0s and 1s. Why is this? Unfortunately, all files end with a newline. When decoding, ignore any characters other than 0s or 1s as in the prevous question. If you get the error message "Error: tried to read from empty input stream", this is the likely cause. Be sure to check if (!CharStdIn.isEmpty()) before reading each character.
How do I compare characters? If c is of type char, then if (c == '*') is true if and only if c represents the asterisk character.
If I invoke the method preorder with tree.preorder(), how can I access the invoking object tree from with the method preorder? The statement PrefixTree x = this; stores a reference to the invoking object in the variable x.
Do I have to organize my program the way it is outlined below. No, you are free to organize your program as you see fit. However, we will compile your program with java PrefixTree and execute it with java PrefixTree < input.pre.
Can I assume the uncompressed message will have at least two characters? Yes.
Do I need to format my output exactly as specified in the assignment? No, but you should put the symbol encoding table first, then the uncompressed message, then the statistics. Don't worry about lining things up perfectly.
How come System.out.print(c) doesn't work when c is a char? Java is a bit inconsistent here. The println method supports primitive input types, but print does not. Instead, use string concatenation to ensure that you're sending a string, e.g., System.out.print(c + "").
|
Input. Here are some sample input files (with extension .pre). The input file consists of two parts. The first part is the preorder traversal of the tree. The second part is the string of 0's and 1's. There will be a single newline character separating the two parts.
Execution. Execute with the following command:
To avoid seeing the text whiz by, usejava PrefixTree < monalisa.pre
java PrefixTree < monalisa.pre | more
|
Submission. Your programs must be named PrefixTree.java. Don't forget to hit the "Run Script" button on the submission system to test that it compiles cleanly.
readme.txt. Use the following readme file template.
|
These are purely suggestions for how you might make progress. You do not have to follow these steps.
public static void main(String[] args) { PrefixTree tree = new PrefixTree(); tree.preorder(); }
|
See Project Gutenberg for an amazing selection of free etexts. You'll notice that the zip'd versions are roughly half the size of the corresponding raw text versions. The method of data compression is very similar to the one you are using on this assignment!