COS 435, Spring 2006 - Homework 1: Search engine test
Due 5:00pm Monday, February 27, 2006.
General Collaboration Policy
You may discuss problems with other students in the class. However,
each student must write up his or her own solution to each problem
independently. That is, while you may formulate the solutions to
problems
in collaboration with classmates, you must be able to articulate the
solutions on your own.
Lateness Policy
A late penalty will be applied, unless there
are extraordinary circumstances and/or prior arrangements:
- Penalized 10% of the earned score if submitted by 9am Tuesday (2/28/06).
- Penalized 25% of the earned score if submitted by 5pm Wednesday (3/1/06).
- Penalized 50% of the earned score if submitted later than 5pm Wednesday (3/1/06).
This homework is our class experiment with evaluating search
engines.
This is only meant to be an exercise, so I do not expect we can do a
thorough enough job to call the study valid. But it will have all the
components of a full evaluation and hopefully we will get something
interesting.
Each of you has chosen one query, which you will run on each
of three search engines: Google,
Yahoo, and
MSN Search. I chose these for their
popularity and
distinct underlying search indexes. Consider only the regular search
results, not sponsored links. Also, ignore the clustering by site
in Google and MSN Search; count each result returned. If you get
many results in languages other than English, you can use the
advanced search to restrict results to English only, but then do this
for all of the search engines. (In my trials, I did not get
foreign-language results with a regular search, so this may not
be an issue.) Be forewarned that I once got fewer than 10 results
on the first results page from MSN Search even though it was set to 10
results per page (and there were more than 10 results in total).
Before running the query on the search engines, write a
description of what you will consider a relevant document and what you
will consider a highly relevant document for your own hand assessment
of search engine results. Use the narrative section of the TREC topic
specifications as your model; this is how the TREC experiments define
relevance (for examples, see the class presentation notes for
February 9 on "Relevance by TREC method"). You will hand in this
description.
After writing your description of relevance and high relevance, run
your query on each search engine and record the first 20 results
returned.
To get a pool for hand assessment, take the first 15 results from
each search engine. Collect the 45 results from the 3 search engines,
remove duplicates, and visit each result to decide relevance. Score
each result as one of:
- 0 == wrong, totally irrelevant material
- 1 == on topic, relevant
- 10 == very good / totally relevant, would want to use
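If it helps to keep the bookkeeping straight, here is a minimal Python
sketch of the pooling step. The function name and the
results_by_engine structure (an engine name mapped to its ranked list
of result URLs) are just assumptions for illustration; you are free to
do this step entirely by hand.

    def assessment_pool(results_by_engine, depth=15):
        """Collect the pool of documents to judge by hand.

        results_by_engine: dict mapping an engine name to its ranked
        list of result URLs (an assumed structure for this sketch).
        Takes the top `depth` results from each engine and removes
        duplicate URLs; each URL in the returned pool is then visited
        and scored 0, 1, or 10 by hand.
        """
        pool = []
        seen = set()
        for urls in results_by_engine.values():
            for url in urls[:depth]:
                if url not in seen:
                    seen.add(url)
                    pool.append(url)
        return pool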
After constructing the pool, go back and rate each of the first 20
results
returned by each search engine. If a result does not appear in the
pool,
it receives a rating of 0 (irrelevant). If a document appears twice
under different URLs in the list for one search engine, count it only
at its better ranking for that search engine and delete any additional
appearances within the same list. In this case there will be fewer
than 20 distinct documents returned by the search engine. Do not go
back to the search engine to get more documents. Keep only what was
returned in the first 20, in the order they were ranked, and give the
trailing positions, where documents are now missing, ratings of 0.
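A small Python sketch of this bookkeeping, again only illustrative:
engine_urls and assessments are assumed names for the ranked URLs from
one engine and your hand ratings, and recognizing that two different
URLs are really the same document is still a judgment you make
yourself (for example, by mapping both URLs to one key before calling
this).

    def gain_vector(engine_urls, assessments, length=20):
        """Build the length-20 gain vector for one search engine.

        engine_urls: the first 20 result URLs in ranked order.
        assessments: dict mapping URL -> rating (0, 1, or 10).
        Later appearances of a URL already seen are dropped, anything
        not in the assessments dict gets 0, and short lists are padded
        with 0s so every engine ends up with `length` positions.
        """
        gains = []
        seen = set()
        for url in engine_urls:
            if url in seen:
                continue                      # keep only the better ranking
            seen.add(url)
            gains.append(assessments.get(url, 0))
        gains += [0] * (length - len(gains))  # missing documents rated 0
        return gains[:length]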
For each search engine, calculate the discounted cumulative
gain (see the paper in your readings, "Evaluation by highly
relevant documents") for each document rank from 1 through 20.
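Here is a minimal sketch of the computation, assuming the discounting
used in that paper with log base 2: gains at ranks below the base are
added undiscounted, and from rank b on each gain is divided by the
base-b logarithm of its rank.

    import math

    def dcg(gains, base=2):
        """Discounted cumulative gain at every rank from 1 to len(gains).

        gains: the gain vector for one engine (ratings of 0, 1, or 10).
        Returns the running DCG totals, so dcg(gains)[19] is the value
        at rank 20.
        """
        totals = []
        total = 0.0
        for rank, g in enumerate(gains, start=1):
            if rank < base:
                total += g                         # no discount before rank b
            else:
                total += g / math.log(rank, base)  # discount by log_b(rank)
            totals.append(total)
        return totals

Running this on the gain vector for one engine produces the length-20
list described next.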
You should turn in three length-20 lists of discounted cumulative gains
-- one list for each search engine. Please do this by email so I can
combine them
easily. I will do the averaging across queries for each search engine
and compare the results.
In your email, also include the actual query used, your description of
relevance and high relevance, and any observations you find
interesting about the search results overall or about a particular
engine.
Also save all the data you have collected -- just in case.
Search Engine Watch has a lot of useful information about search
engines. See "Our Departments" below the news on their home page.