Special collaboration policy for Spring '03: for this assignment, you are permitted to work jointly with one classmate. If you choose to do so, you and your partner should submit your code and the readme.txt jointly (under one login name only). You and your partner will receive the same grade.
|
What's the maximum number of movies and actors? The Internet Movie Database contains less than 125,000 movies and 500,000 actors. We won't test your program on anything larger than this.
What if there are multiple chains of the same length. Which one do I print out? Yes, in general there will be many paths of the same length from an actor to Kevin Bacon. Print out any one you like.
Do I always use "Kevin Bacon" as the source? Yes, although you're welcome to experiment with others. Apparently, Christopher Lee is the center of the Hollywood Universe. For your final submission, we recommend something like
#define SOURCE "Bacon, Kevin"
What do I print if an actor is not in the same connected component as Kevin Bacon? Print that their Bacon number is "infinity." Or if the actor does not even appear in the database, you can print out a message to this effect.
When I run my program, I get a segmentation fault or bus error in tst-index.c. Who's to blame? Probably you. It's very likely that you a memory allocation error or an array out-of-bounds access error somewhere else in your program and the error remains hidden for a while. Errors that result from trampling memory can be difficult to debug for exactly this reason.
|
Input and output. Here are a number of sample input files. The numbers below are slightly off because we are now editing the data files to remove movies with no actors and remove '/' characters from movie titles.
File | Movies | Actors | Description |
input-bacon.txt | 46 | 1,498 | With Kevin Bacon |
input-top-grossing.txt | 187 | 8,265 | Top grossing |
input-hero.txt | 193 | 2,513 | Contains string "hero" |
input-mpaa-g.txt | 967 | 13,850 | Rated G by MPAA |
input-year2000.txt | 4,757 | 43,940 | Released in 2000 |
input-mpaa.txt | 14,192 | 170,539 | Rated by MPAA |
input-all.txt | 122,938 | 418,577 | All |
Our data is taken from the Internet Movie Database. Feel free to download the full 850MB+ database for yourself. In addition to cast lists, it has movie reviews, audio clips, ratings, etc. If you want to create interesting input files (in our format), send them to us.
Extra credit. Create your own movie database file. If it's sufficiently interesting, we'll award extra credit and post them for other students to use. Feel free to create a database of other social relationships. For example, get the Registrar to release a roster list of all classes taken at Princeton since 1900 and use those to compute your Woodrow Wilson number. :) Or connnect Kazaa users if they've downloaded/uploaded a file.
Reference solutions. For reference, we provide executable code for our solution in Windows, Solaris, and OS X. As usual it's a challenge to beat the reference solutions, but it has been done several times in the past! Note that your solution to queries on input-all.txt will not agree precisely with The Oracle of Bacon because our input file includes TV movies and excludes a few movies that our perl script doesn't parse.
|
We will compile your program with "gcc *.c" and execute it with "a.out moviedb.txt < actors.txt"so you are free to organize your program as you see fit. As usual, we encourage and reward sound design principles.
Here is a template readme.txt file. It should contain the following information:
Unix users may find the shell script timeit.csh useful for computing a table of CPU times.
|
|