COS 425, Fall 2006 - Problem Set 3:   Part I: SQL

Due at 1:30pm, Friday October 20 , 2006.


Collaboration Policy

You may discuss problems with other students in the class. However, each student must write up his or her own solution to each problem independently. That is, while you may formulate the solutions to problems in collaboration with classmates, you must be able to articulate the solutions on your own.

Late Penalties


Part I, Section A: writing queries for a preexisting database.

Problem 1.
The Department of Computer Science maintains a database of technical reports using the software MySQL. Normally, this is accessed from the department's technical reports server. However, the technical staff have prepared a SQL Web interface so that you may play with a real database (or rather a snapshot through July, 2006). This database is somewhat different from the examples we have been using in that it has much more text and no numerical data. There are only two tables, defined as:
TABLE main (
  id varchar(10),
  entry date,
  org varchar(75),
  language varchar(20),
  title tinytext,
  date varchar(30),
  pages varchar(15),
  abstract text,
  ps enum('Y','N'),
  pdf enum('Y','N'),
  PRIMARY KEY (id),
  UNIQUE id (id)
);


TABLE authors (
  id char(10),
  author char(50),
  ord tinyint(3) unsigned
);
The different types are extensions of the basic types and are described in the Language Reference linked from the SQL Web interface. However, all you really need to know is that tinyint(3) unsigned is a type of integer and that varchar, char, tinytext and text are all types of strings so that one can use the SQL operator "LIKE".

Part 0 (warm-up, don't turn in) To get an understanding of what the entries in main and authors look like, go to the SQL Web interface and enter and submit these three SQL queries:

select *
from main M
where M.date='April 2006'
select *
from authors A
where A.id='TR-595-99'
select *
from authors A
where A.author LIKE 'Walker%'
Part 1. Write and submit SQL queries to find the following. Hand in a print-out of the results of each query. 

Query a: Find the names of all authors of TR's dated in the summer of 2002 (June, July and August).

Query b: Find the ids, titles, and dates of the TRs with the word "security" in their abstracts.

Query c: Find the names of all co-authors of Professor Appel in technical reports dated sometime in 2001.

Query d: For those authors who have published more than 1 TR in 2004, find their names and the number of TRs they published in 2004.

Part 2 Why do you think the database was organized the way it was: two tables with the data recorded as indicated in the table definitions? Would you organize it differently? Be sure to consider the use of this database as represented by the technical report server.

Part I, Section B: defining a database.

Problem 2 
One database system to which you have access is the Princeton University Instructional Oracle Facility. Go to the Princeton University Instructional Oracle Facility Web site and follow the directions to establish an account. Then go to the Instructional Oracle SQL Editor. You will use the Web interface provided there to define tables and insert and delete values.

Important: 
The local copies of  Oracle 8 documentation linked at the bottom of the  Princeton University Instructional Oracle Facility Web site no lonter exist.  Furthermore, the current version being run by the facility is Oracle 9.2   Documentation can be obtained from Oracle:  Oracle9i SQL Reference: Table of Contents.  To turn in your work, use the browser print command to print your sequence of SQL commands and the results. Note that the full sequence of commands must be executed at one time (one transaction) in the editor window to appear as a sequence in the bottom frame, where the results also appear. Also, you must have the bottom frame selected to print both the command sequence and the results.  If you wish to submit your work by email, cut and paste the contents of the bottom frame to a text file and email that.   Be sure to save your tables in your Oracle database until your work  has been graded.

For this problem we will use the following database for a consulting firm:

Underlined attributes constitute the primary key for a relation -- our usual convention.

Part 1 Define tables corresponding to the 4 relations in your Oracle database. Include definitions of PRIMARY KEY constraints and FOREIGN KEY constraints.

Part 2 Add two consultants with at least two clients each and two consultants with no clients; at least one consultant must have a supervisor for each client with which he/she is paired. Of course, you can use minimalist names so that you don't need to do a lot of typing.

Part 3  Show that the Oracle facility enforces FOREIGN KEY constraints by executing a sequence of inserts and deletes for two of the relations (your choice of which) to illustrate enforcement. Show (i) that a tuple cannot be deleted in the referenced table if the primary key value of that tuple is present as the foreign key value of a tuple in the referencing table, (ii) that the primary key value of a tuple cannot be changed in the referenced table if that primary key  value is present as the foreign key value of a tuple in the referencing table, and (iii) that a tuple cannot be added in the referencing table if the foreign key value in that tuple is not present as a primary key value in a tuple of the referenced table.

WARNING: You must follow each INSERT or DELETE or UPDATE command with a COMMIT command on a separate line for the ORACLE database system to permanently store or delete the values. For example:

        insert into works (name, c_name, fee_rate) values ('smith', 'jones', 100);
        commit;
See Item 3 under Guidelines on Specifying SQL Commands To SQL Editor in the SQL Editor Tutorial.

Problem 3
In many applications, there is a large amount of initial data when a new database is defined. This data is bulk loaded into the database tables. The Princeton University Instructional Oracle Facility provides a simple bulk loading facility, which is documented in The SQL Loader Demo. The bulk loader uses data and control files, which must reside in your top-level directory on the OIT central file server (for Windows machines on the OIT network, this is usually drive H; this is also the home directory of your OIT unix (arizona) account). The loader loads into tables defined using the Instructional Oracle SQL Editor.

You will bulk load data into a table "stugrade" with two attributes: a student id "sid" of type NUMBER and "grade" of type CHAR(1), which holds a letter grade from A through F without + or -.

  1. Define the table stugrade in the Instructional Oracle SQL Editor.
  2.  Download the text files dbex.ctl and dbex.dat from the COS 425 web site and save them in your OIT Central file server top-level (home) directory. (If you prefer to work in your CS Department Unix account, you can use the Unix command "scp" (secure copy) to copy files to and from your OIT Unix home directory.)
  3. Go to the Oracle SQL Loader and bulk load stugrade using files dbex.ctl and dbex.dat.
  4. Construct a SQL query that will count, for each of the 6 letter grades, the number of students in the table receiving that grade. Enter this query in the Instructional  Oracle SQL Editor. Turn in your result.



NOTE:  If you have a Computer Science Department account, then you also have access to a student MySQL database server running on machine "studentdb" that can be used for course projects or further exploration of database systems.   See CS Guide: Getting Started With MySQL .   If you have a CS account,  you can get a studentdb account by filling out the form "database setup" on CS Guide.  WARNING:  CS Guide pages are restricted to local (i.e. cs.princeton.edu) access.  If you do not have a CS account but would like to work with the MySQL database server, please email Professor LaPaugh.