Part 1 (reading in input)
|
Both the BST method and the qsort method suggested in the
assignment are capable of solving the text-indexing problem for
the supplied data in a reasonable amount of time.
We also encourage you to think of your own method for solving the
problem. If it's faster, great. If not, that's OK too,
since you will have learned more by experimenting.
Use the submit command:
/u/cs126/bin/submit 8 readme index.c
Also submit any other auxilliary files that are needed to
compile your program.
It's not necessary to submit a brute force solution, if you
submit a different method.
The readme
file should contain:
Name, precept number, any problems encountered,
and whatever help your received.
Give a high level description of the basic approach
you took to solve the problem. Indicate advantages and
disadvantages of your method, and why you chose it.
Include instructions for compiling your code.
Be sure to submit all necessary files.
Output of query0/corpus0 and query1/corpus1.
First and last 15 lines of output from query2/corpus2.
Indicate how long it takes for query2/corpus2
to complete - use the Unix /usr/bin/time command.
Find all occurrences of a given query string.
Depending on your method, this may require only 5 or so extra
lines of code, so give it a shot.
No extra credit awarded here if you do a brute force approach.
Here's an easy but "malicious" input with only 1000 words in
the corpus and 7000 query phrases:
query3,
corpus3.
Design an improved algorithm that does not grind to halt, even
on "pathological" inputs.
Kevin Wayne