COS 318 : Operating system
Fall 2006, Princeton University

Final Project: File System


Important Dates

Project Due: Jan. 16, 2007 (Dean's Date) 23:59pm (NO LATE SUBMISSIONS)

Precept: Dec 6th, 2006. (Only one precept will be given, and it will be used to help clarify the requirement only.)

 

Overall Requirement

In this project you will implement a simple UNIX-like file system with a hierarchical directory structure.  Files and directories can grow and shrink, and you need to manage free disc space.  You will be able to browse the directory structure, create new files and directories, delete them, etc.  The functionality is similar to UNIX file systems, but it does not include permission and user management.  Neither do we require concurrency or high performance of your file system.  

Since this is the final project rather than normal course project, you would have to do the design and implementation by yourself. You can not discuss the project design/implementation with other students. The TA will be able to help clarify the requirement, but not give any hint or help to the project.

Design Document Requirement (5 points)

In the design document (submitted together with your final code, there would be no design review), you would need to provide details about the design of your file system. Since we will not go through the design document with you, you would need to provide a more comprehensive design document so that we can fully understand your design by just reading your document. We will grade the design document in the following way, each bullet will worth one point:

·         File system disk layout

·         Super block structure/content

·         Inode data structure on disk

·         Free block data structure and management

·         How to implement link/unlink

If you choose to implement extra credits, you would also need to provide corresponding design details in your design document in order to get the full extra credits.

Implementation Details

We provide you with the kernel we have built so far, including a number of functions to access the disk (see block.h for details), and also several shell command (explained later) to test the file system.  You are required to implement most of the standard UNIX file system calls as listed in the following:

1. File system module initialization

Prototype:    void fs_init (void);
Function description:

This function initializes the data structures and resources used by the file system subsystem, and if it detects that the disk is already formatted, it automatically mounts it to the root directory.  It is invoked at kernel initialization time, after USB subsystem has been initialized, but before the block module (block.c) is initialized.  So you need to invoke block_init in fs_init.  Note that by the time fs_init is called, it disk is not necessarily formatted.  As a result, you need to devise a mechanism so that a formatted disk is recognized (need the help of fs_mkfs).

This function is the only non-syscall function you are required to implement.

2. Format a disk

Prototype:    int fs_mkfs (void);
Function description:

This function formats a disk, either a raw disk, or one that is previous formatted, and then mount the newly formatted file system to the root directory.  The size of disk (in blocks) is defined as FS_SIZE at fs.h.(You can increase the size of disk if you would like to in order to support bigger file system.)  We assume there is one and only one disk present in system.  You need to set up a flag (magic number) somewhere in the disk, so that it can be later recognized as a formatted disk (ie: You can still access a formatted disk and get its content after the shell exists and gets restarted).  The function should return 0 on success, and -1 on failure.

Note: The bootblock, kernel image and process images are all outside the file system.  The file system starts at the first block after the last process image in the disk and that block is given the block number 0 when using block_read/write to access the disk.


3. Open and possibly create a file

Prototype:    int fs_open (char *filename, int flags);
Function description:

Given a filename, fs_open() returns a file descriptor, a small, non-negative integer for use in the subsequent system calls (fs_read, fs_write, fs_lseek, etc).  The file descriptor returned by a successful call will be the lowest-numbered file descriptor not currently open for the process.

The parameter flags must include one of the following access modes:  FS_O_RDONLY, FS_O_WRONLY, FS_O_RDWR.  These request opening the file read-only, write-only, or read/write respectively.  The constants are defined in the file common.h.  Open return the new file descriptor, or -1 if an error occurred.

If a non-existing file is opened for write, it should be created.  An attempt to open a non-existing file read-only should fail.

To make your life easier, we assume filename passed to the syscalls can only be ".", "..", or a filename in the current directory.  So you don't have to parse the path as "/" separated directory and file names.  You can also assume that the length of the filename (and dirname in the future) will be less than 32 bytes (MAX_FILE_NAME). These assumptions remain the same for the following functions.

You do not need to worry about user management or access control list.

4. Close a file descriptor

Prototype:    int fs_close (int fd);
Function description:

fs_close() closes a file descriptor, so that it no longer refers to any file and may be reused.  It returns zero on success, and -1 on failure.  If the descriptor was the last reference to a file which has been removed using unlink the file is deleted.

5. Read a file

Prototype:    int fs_read (int fd, char *buf, int count);
Function description:

fs_read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.  If count is zero, fs_read() returns zero and has no other results.  On success, the number of bytes successfully read is returned (zero indicates end of file), and the file position is advanced by this number.  It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now.  On error, -1 is returned.  In this case it is left unspecified whether the file position changes.

6. Write a file

Prototype:    int fs_write (int fd, char *buf, int count);
Function description:

fs_write() writes up to count bytes to the file referenced by the file descriptor fd from the buffer starting at buf.  A fs_read() occurs after a fs_write() has returned should return the new data.  

On success, the number of bytes written are returned (zero indicate nothing was written).  On error, -1 is returned.  If count is zero, 0 will be returned without causing any other effect.


7. Reposition read/write file offset

Prototype:    int fs_lseek (int fd, int offset);
Function description:

The fs_lseek() function repositions the offset of the open file associated with the file descriptor fd to the argument offset.

The fs_lseek() function allows the file offset to be set beyond the end of file (but this does not change the size of the file).  If data is later written at this point, subsequent read of data in the gap (a "hole") return bytes ('\0') until data is actually written into the gap.

Upon successful completion, fs_lseek() returns the resulting offset location as measured in bytes from the beginning of the file.  Otherwise, a value of -1 is returned.

8. Create a directory

Prototype:    int fs_mkdir (char *dirname);
Function description:

fs_mkdir() attempts to create a directory named dirname.  It returns zero on success, or -1 if an error occurred.  fs_mkdir() should fail if the directory dirname already exists.

9. Delete a directory

Prototype:    int fs_rmdir (char *dirname);
Function description:

fs_rmdir() deletes a directory, which must be empty.  On success, zero is returned; on error, -1 is returned (eg, attempt to deleting a non-empty directory).

10. Change the current directory

Prototype:    int fs_chdir (char *dirname);
Function description:

fs_chdir() changes the current directory to that specified in dirname.  On success, zero is returned.  On error, -1 is returned.

11. Make a new name for a file

Prototype:    int fs_link (char *oldpath, char *newpath);
Function description:

fs_link() creates a new link (also known as a hard link) to an existing file oldpath.  If newpath exits it will not be overwritten.  The new name may be used exactly as the old one for any operation; both names refer to the same file and it is impossible to tell which name was the `original'.

On success, zero is returned.  On error, -1 is returned.

Note because we excluded the usage of paths, oldpath and newpath are actually both filenames and can only be in the same directory.

12. Delete a file.

Prototype:    int fs_unlink (char *filename);
Function description:

fs_unlink() deletes a name from the file system.  If that name was the last link to a file and no process has the file open, the file is deleted and the space it was using is made available for reuse.

If the name was the last link to a file but any process still have the file open the file will remain in existence until the last file descriptor referring to it is closed.

On success, zero is returned.  On error, -1 is returned.

13. Get file/directory status.

Prototype:    int fs_stat (char *filename, fileStat *buf);
Function description:

fs_stat() returns information about a file.  It returns a fileStat structure (defined in common.h), which contains the following fields:

typedef struct {
    int inodeNo;        /* the file i-node number */
    short type;            /* the file i-node type, DIRECTORY, FILE_TYPE (there's another value FREE_INODE which never appears here */
    char links;            /* number of links to the i-node */
    int size;            /* file size in bytes */
    int numBlocks;    /* number of blocks used by the file */
} fileStat;

Note: if your implementation will need different kind of fileStat structure, you can modify this fileStat and the corresponding code to reflect your change. Please try to keep some essential fields like file size, type so that our testing script will be able to work with your finished code.


Shell Commands:

You will be able to test your file system implementation through the following shell commands (provided through shell.c). We will also test your file system through this standard shell, so please do not change the existing shell commands. (You can add extra shell commands for your own testing purpose.)

Shell commands

Arguments

Description

mkfs

 

Make a new file system, i.e., format the disk so that it is ready for other file system operations..

open

<filename> <flag>

Open a file with the given <flag>, return a file descriptor (fd) associated with this file.

<flag>: 1: FS_O_RDONLY; 2: FS_O_WRONLY; 3: FS_O_RDWR

The flag is same as the flag used by the open system call. The current file offset will be 0 when the file is opened.

read

<fd> <size>

Read <size> bytes from the file associated with <fd>, from current file offset. The current file offset will move forward <size> bytes after read.

write

<fd> <string>

Write <string> into file associated with <fd>, from current file offset. The current file offset will move forward <size> bytes after write.

lseek

<fd> <offset>

Move the current file offset associated with <fd> to a new file offset at <offset>. The <offset> means the number of bytes from the beginning of the file.

close

<fd>

Close the file associated with <fd>.

mkdir

<dirname>

Create a sub-directory <dirname> under the current directory.

rmdir

<dirname>

Remove the sub-directory <dirname>.

chdir

<dirname>

Change the current directory to <dirname>.

link

<src> <dest>

Create a link named <dest> to an existing file or directory named <src>.

unlink

<name>

Remove a link to the name. (When link count drop to 0, delete the file or directory).

stat

<name>

Show the status of the file or directory with name <name>. It should display its inode information; whether it is a file or directory; its link count; size of the file/directory; number of blocks allocated; other information stored about this file/directory.

ls

 

Show the content of the current directory.

cat

<filename>

Show the content of the file.

create

<filename> <size>

Create a file with <filename> as the name in the current directory, and fill it with <size> amount of data, then close the file. (For testing purpose)

flush

 

Flush the dirty cache of the file system to the disk. (Only needed if you are implementing the extra credit for file system cache, you would need to find a way to add this shell command into the shell.)

 

Implementation grading (15 points for implementation):

We will provide a test script which will contain a set of sample testing cases to test your implementation. During the submission, you are also encouraged to provide extra test cases to show case your implementation. When we grade the final project, we will review the code as well as run various test cases including sample testing cases, extra test cases provided by us and extra test cases selected from your submission! You could potentially get more points if your test cases proved to be difficult for other people.

For the basic test cases, we will test things like:

n        can mkfs and mount fs properly

n        can create new files.

n        can create directories.

n        can create nested directory.

n        file read/write is functioning.

n        link/unlink is functioning.

n        stat is functioning.

n        basic error checking (eg, delete non-existing files, etc).

We will publish the test cases and testing script soon to the mailing list.

Passing all those sample test cases will give you 10 points (We will provide 10 test cases, each test case will worth 1 point). The general code quality (through code inspection) will worth 2 points. The extra test cases will worth another 3 points (including the ones submitted by you, TA and other group).


Extra Credits

There are two extra credit opportunities in this project, one point each.

1.  File system caching.


You need to implement caching mechanism for the file system in order to earn the 1 extra point.  There are two kinds of data that can be cached in a file system -- the i-nodes and the data blocks.  You need to at least implement caching for data blocks.  Specifically, you need to do the following jobs.

·         Maintain cache data structure corresponding to part of the data on disk.  Make the system calls go to cache for data, and let cache module to handle the misses.

·         Implement the LRU cache replace policy.

·         Implement a system call int fs_flush (void); which can be invoked to flush the cached data to the disk.

You need to be careful to make the file system consistent.

2.  Large file support

You can get an extra point if you implement the single, double and triple indirect pointers in the i-node structure of your file system.  This upgrade allows larger files to be stored.


Files for this assignment

You should use the code available in /u/cos318/6_pre/ as a starting point for your project. Don't be daunted by the large number of files since you are already familar with many of them and also most files do not require to be touched at all.  You only need to change two of the files: fs.h and fs.c.   However, feel free to add other files so that you can better organize your code.

You should use the functions provided in block.h to read and write blocks in the file system. These functions will use a file (on linux) or the USB disk (on our OS) as the store. Two different implementations of block.c are available for the linux/USB systems.

Apart from the source code you have modified, you will also need to submit a README file with your design documentation.  We do not have a design review for this project, but you still need to have the designing part in README.   Other than the basic design requirements stated earlier, you can also provide more information about your file system, e.g.:

·         Provide the following parameters of your filesystem (given the disc size being 32M, or 8000 blocks)

o        The maximum number of total files/directories supported (i.e. the number of inodes in the UNIX filesystem).

o        The maximum number of files/subdirectories in each directory.

o        The maximum size of file.

o        The maximum depth of directory tree (without considering the path length limit.)

·         Other things to describe your design and implementation to help to AI better appreciate your job.

Testing on Linux

We have designed the project so that you can build two targets from it, the image file and a Linux executable.  The following files can be compiled and linked to the filesystem implementation to make a Linux executable, so that you can debug your code on Linux before trying it on a bare machine.

·         shellutilFake.c : Fake some helper functions for the shell

·         blockFake.c : Uses a UNIX file as a disk

You can test your code on linux by typing "make lnxsh" and executing "lnxsh" on linux. This program will read/write to a file called "disk".  "make image" will still produce the image that you can use to boot a lab machine.

Note that for this project, we will only test your implementation by running "lnxsh" on Linux.  We will not test your "image" on a bare machine.  However, your code still needs to be able to produce the full image.

Submitting your program

Submit your final solution electronically from arizona.princeton.edu, using the following command:

/u/cos318/bin/318submit 6 README changed-files ...

The submit command copies your files to the directory /u/cos318/submit/UserLogin/number and lists all the files that you have submitted for assignment number. UserLogin is your user account name. If you execute submit after the project deadline, your files are placed in directory number-late. You can run submit more than once, and you can submit partial lists of files. (README should contain both the design document and other information you want TA to know about your system.)

CS318 Staff