COS 126 Hamming Codes in TOY |
Programming Assignment Due: Wednesday |
Write a TOY program to encode data using Hamming codes. Then write a TOY program to correct encoded data that has been corrupted.
Perspective. Error-correcting codes enable data to be sent through a noisy communication channel without corruption. To accomplish this, the sender appends redundant information to the message, so that even if some of the original data is corrupted during transmission, the receiver can still recover the original message intact. Transmission errors are common and can arise from scratches on a CD, static on a cell phone, or atmospheric interference. In a noisy environment, error-correcting codes can increase the throughput of a communication link since there is no need to retransmit the message if it becomes partially corrupted during transmission. For this reason, error-correcting codes are used in many common systems including: storage devices (CD, DVD, DRAM), mobile communication (cell phones, wireless, microwave links), digital television, and high-speed modems (ADSL, xDSL).
Hamming Codes. A Hamming code is a specific type of error-correcting code that allows the detection and correction of single-bit transmission errors. Hamming codes are used in many applications where such errors are common, including DRAM memory chips and satellite communication hardware. Hamming codes work by repeatedly reading four message bits, which we denote by m1, m2, m3, m4, and then inserting three parity bits, which we denote by p1, p2, and p3. If any one of these seven bits is corrupted during transmission, the receiver can detect the error and recover the original four message bits intact. This is called single-bit error correction because at most one bit can be corrected per unit of data sent. The overhead for using this method is a 75% increase in bandwidth because it requires three extra parity bits for every four message bits. This compares favorably with the naive approach of sending three copies of each bit (and using the one that appears most frequently), which results in a 200% increase in bandwidth.
Before we describe the algebra of Hamming codes, we first visualize the calculation of the parity bits using Venn diagrams. As an example, suppose we wish to send the message 1101. We associate each of the four message bits with a specific intersection region of three pairwise overlapping circles, as illustrated below:
Now imagine this picture is transmitted over a noisy communication channel, and that one bit is corrupted so that the following picture arrives at the receiving station (corresponding to 1001100):
Of course, in practice, only the 7 bits are transmitted, rather than the circle diagrams.
Part 1. Write a TOY program encode.toy to encode a binary message using the scheme described above. Your program should repeatedly read four bits m1, m2, m3, and m4 from TOY standard input, and write the seven bits m1, m2, m3, m4, p1, p2, p3 to TOY standard output, where
Recall that ^ is the exclusive or operator in Java and TOY. This captures the parity notion described above.Part 2. Write a TOY program decode.toy to correct a Hamming encoded message. Your program should repeatedly read seven bits m1, m2, m3, m4, p1, p2, p3 from TOY standard input, and write four bits to TOY standard output. Recall, to determine which one, if any, of the message bits is corrupted, perform the following three parity checks:
Input format. For convenience, we'll transmit each bit individually (as 0000 or 0001), instead of packing 16 bits per TOY word as would be done in a real application. Your program should repeat until FFFF appears on TOY standard input. You may assume that the number of transmitted bits (not including the terminating value) is a multiple of four when encoding, and a multiple of seven when correcting. The directory hamming contains a few sample input files for encode.toy and decode.toy:
% more encode3.txt % more decode3.txt 0001 0001 0000 0001 0001 0000 0000 0001 0001 0000 0000 0001 0001 0001 0000 0000 0001 0001 0000 0000 0000 0000 0001 0001 0001 0001 0001 0001 0001 0001 0001 0001 0000 FFFF FFFF
TOY simulator. To execute your TOY program, use either the command-line simulator TOY.java or the graphical Visual X-TOY simulator.
% java TOY encode.toy < encode3.txt % java TOY decode.toy < decode3.txt ... ... Terminal Terminal --------------------------------------- --------------------------------------- 0001 0001 0001 0001 0000 0000 0001 0001 0001 0001 0000 0001 0000 0001 0001 0000 0001 0001 0001 0001 0000 0001 0000 0001 0000 ... 0000 0001 0001 0001 0001 0001 0001 0001 ...
Submission. Submit encode.toy and decode.toy. Be sure to comment your programs to make it clear to the grader what you are doing and what each variable represents. Also, submit a readme.txt file and answer the questions.
Extra credit. Rewrite decode.toy using the fewest number of (nonzero) words of TOY main memory.
Midterm evaluation. Please fill out the following anonymous questionnaire to provide us with feedback to help us improve this course.