|
|
What is the origin of the Markov text generator? It was first described by Claude Shannon in 1948. The first computer version was apparently written by Don P. Mitchell, adapted by Bruce Ellis, and popularized by A. K. Dewdney in the Computing Recreations section of Scientific American. Brian Kernighan and Rob Pike revived the program in a University setting and described it as an example of design in The Practice of Programming. The program is also described in Jon Bentley's Programming Pearls.
How do I read in characters from standard input? If you are using java 1.5 or newer, then you can use the StdIn.readAll() method to read the entire text file into one String.
If you have an older version of java, either update it, or ask a preceptor for instructions on using the In.readAll() method.String text = StdIn.readAll();
Given a string s, is there an efficient way to get a substring? Use s.substring(i, i+k) to get the k-character substring starting at i. In Java, this operation takes constant time, regardless of the size of k. This efficient construction is possible because strings are immutable, which enables the library to reuse the relevant part of the original string's character array. See Section 3.1 for more information on strings.
Why treat the text as a cyclic string? This prevents the Markov chain from ever getting "stuck", i.e., entering a state with no outgoing transitions.
How do I emulate the behavior of a cyclic string? There are a number of approaches. The easiest might be to concatenate the first k characters to the end of the text.
Should my program generate a different output each time I run it? Yes.
For which values of k should my program work? It's fine to assume k > 0. Naturally, as k gets larger, your program will use more memory and take longer.
I get an OutOfMemoryException. How do I tell Java to use more of my computer's memory? Depending on your operating system, you may have to ask the Java Virtual Machine for more main memory beyond the 64MB default.
The 100m means 100MB, and you should adjust this number depending on the size of the input text.java -Xmx100m TextGenerator 7 1000 < input.txt
Why not store the probability of each transition instead of using parallel edges? It's easier to implement this way. Storing the probabilities would be more efficient and more general, but it's not necessary on this assignment.
|
Input. Here are a number of additional sample input files.
|
Submission. Submit MarkovChain.java, TextGenerator.java, AdjList.java and readme.txt. Do not modify or submit In.java or SymbolTable.java.
readme.txt. Use the following readme file template.
|
These are purely suggestions for how you might make progress. You do not have to follow these steps.
|