Computer Science 425
Database & Information Management Systems
Fall 2006
Project Overview
The goal of the course project is to have you explore in greater detail
some aspect of the design or use of database systems. The choice of
project
is yours, but it does need to be approved. You may work individually or
in pairs.
Examples
Most projects are primarily experimental (this includes application
development), but theoretical analysis and in-depth literature research
are also possible. If
you choose to implement a database for an application, the database
must be complex enough to make the project more than a straightforward
exercise (see the second suggestion below). An experimental project may also
study in depth one or more algorithms from database or search
systems (see the first suggestion below). Theoretical projects analyzing some
aspect of database design must add substantial depth to the
textbook treatments of the subject. A project may be primarily
in-depth
literature research on the state of the art in an area within databases
or information retrieval
that we have not covered in class. However, such projects must include
critical analysis of the results in the literature.
Important note: If you are a
graduate
student and wish to satisfy your programming requirement with the
COS
425
project, you must choose an implementation project, and you must notify
me in advance that you want to use the project to satisfy the
requirement.
Here are some suggestions. You may choose one or use them as guides
in specifying your own project. Note that many of these
suggestions are framed in terms of either database systems or information
retrieval (search) systems. In reality, database and information
retrieval techniques are combined in many tools and applications; your
project can certainly combine them.
- In earlier offerings of COS 425, the same project was done by all
students.
It was the implementation and evaluation of the main query optimization
algorithm presented in Ramakrishnan and Gehrke, Database Management
Systems (3rd edition).
A detailed specification of that project can be found in the COS
425 spring 2001 project description.
- Implement an application that requires database support.
Implement the
user interface, the application interface to the database and the
database.
This application needs to have some complexity in functionality,
constraint
maintenance, reliability or user interface. The user interface
may be minimalist if the focus of the project is elsewhere. The
application should be something
in which you are interested and for which you can obtain or generate a
reasonable set of interactions and data for testing. The
application need not be "serious": one previous student
implemented a database and rule system for the game Warhammer. The
database may be relational or in XML. Depending on the system
being used and the student's
background, learning the use and configuration of an API for a database
system may be considered a substantial project goal. (See
Chapters 6 and
7 of Ramakrishnan and Gehrke, Database Management
Systems (3rd edition),
for a discussion of application development for SQL. See Chapter 10 of
Silberschatz, Korth and Sudarshan, Database
System Concepts (5th edition), for a brief discussion
of application development in XML. Problem sets 3 and 4 give you
pointers to SQL and XML servers.)
- Database techniques and systems exist for special kinds of data.
For example,
Chapter 28 of our text presents techniques for spatial (geometric)
data.
Your project may focus on techniques for a special kind of data.
Possibilities
include:
- Select a special kind of data and
write a review paper describing the important techniques and issues,
with
references and critical analysis.
- Implement and evaluate an important method for specialized
data, e.g. multidimensional
indexing (see Section 28.3 of our text).
- There are also customized information retrieval techniques for
special kinds of data.
- Select an example of a special kind of data (e.g. music,
images, video) and
write a review paper describing the methods that have been developed
for information retrieval beyond the use of keyword search of text
labels. What are the
important techniques and issues for information retrieval of this
information medium?
Provide up-to-date
references and critical analysis.
- Explore the use of information retrieval software posted
on the Web for some specific type of non-text data; do a set of
well-designed experiments or implement a small application that uses
it. A home-grown example of such software is Marsyas for music
(begun as a Ph.D. thesis in CS at Princeton).
- Do an empirical study of two or three search services that use
different search engines, i.e. the crawling, indexing and search
software are different. To the best of my knowledge Google, Yahoo, Ask
and MSN Search are all different,
but you should check that nothing has changed in this fast-changing
area (e.g. until 2004, Yahoo used Google for its Web search). To
do your study, read about well-established studies like
those conducted by TREC. Be
forewarned that such studies involve testing the search engines on
identical queries and analyzing the results, which can be very
time-consuming.
- Explore methods for enhancing and changing queries that
are specified by search terms -- i.e. the kind of queries we use every
day.
Such enhancements and changes attempt to deal with ambiguity, synonyms
and other natural-language issues. Read about
state-of-the-art methods for changing queries. Do some
analysis of methods -- for example, do a comparative analysis
from the literature or conduct experiments to evaluate methods.
You may design and analyze a method of your own, but you must
have knowledge of prior methods and results.
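As one illustration of the kind of experiment the last suggestion has in
mind, the sketch below expands a query with synonyms and compares recall
before and after expansion. Everything in it is an assumption made for
illustration: the synonym table, the four-document corpus, the relevance
judgments and the function names are all hypothetical. A real project would
use an established thesaurus and a standard test collection such as those
from TREC.

```python
# Hypothetical synonym table and toy corpus, for illustration only.
SYNONYMS = {
    "car": {"automobile", "auto"},
    "film": {"movie"},
}

DOCS = {
    1: "used automobile prices",
    2: "car rental rates",
    3: "movie reviews",
    4: "database indexing",
}


def expand_query(terms):
    """Return the original terms plus any synonyms listed in SYNONYMS."""
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded


def search(terms):
    """Return the ids of documents containing at least one query term."""
    wanted = set(terms)
    return {doc_id for doc_id, text in DOCS.items()
            if set(text.split()) & wanted}


def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


relevant = {1, 2}  # assumed hand-labeled relevance judgments for "car"
plain = search({"car"})
expanded = search(expand_query({"car"}))
print(recall(plain, relevant), recall(expanded, relevant))
```

Recall alone rewards indiscriminate expansion, so an actual study would
also report precision and would use many more queries than this one.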
Requirements
Each individual or pair must:
- By 5pm on Wed. Nov 8, 2006 send email to Professor
LaPaugh
containing
a one-paragraph description of your project. Each individual must
email
Professor LaPaugh to confirm partnerships.
- During the week of Nov. 27, 2006 meet with Professor
LaPaugh
for
15-20 minutes to discuss project progress and issues.
- Submit by 5pm Tues. Jan 16, 2007 (Dean's date) a report
that
describes
your project. This must include the goals of the project, your
methodology
and the results. If
it is an application implementation, you need to describe the
application, your design requirements, the major implementation
decisions, and your assessment of the result. If it is an experimental
algorithm study, you need to describe
what was implemented, the major implementation decisions, how you
designed
the experiments, and the experimental results. If it is a
theoretical study, you need to describe the problem,
review what was known about the problem before your analysis, and give
the details and the results of your theoretical analysis. If it is a
literature-based
project, you need to describe the major issues under study, summarize
the
major techniques and experiments presented in the literature and critically
analyze the results; you must have a bibliography that includes
recent
research. For any project that involves programming, all source
code you write should be in an
appendix or made accessible on the Web.
- After the project report is submitted and before 5pm Mon.
Jan. 22,
2007
each individual or pair must meet with Professor LaPaugh for a project
demonstration (where applicable) and discussion.
Projects will be graded on thoroughness and depth of analysis.
Difficulty
will be taken into consideration. Keep in mind that evaluation
is an important part of any project. Be clear on the goals of your
project
and how you demonstrate or measure success.
A.S. LaPaugh Mon Oct 30 16:05:26 EST 2006