Final Project : Unix File System
COS 318, Fall 2016, Princeton University


Due: January 17 (Dean’s date), 2016 @ 5:00PM (late submissions can not be accepted)

Precept: Mon December 12th, 2015

Office Hour: Thursdays from 9:30am-11:30am.

No in-person Design Review

Precept slides 2016

Precept slides 2015

Test cases and expected output are available on Blackboard

Grading rubric

Project Code: on lab machines

Overall Requirement

In this project, you will implement a simple UNIX-like file system with a hierarchical directory structure.  Files grow and directories grow and shrink, so you need to manage free disk space.  You will be able to browse the directory structure, create new files and directories, delete them, etc.  The functionality is similar to UNIX file systems, but it does not include permission and user management.  Neither do we require concurrency or high performance from your file system. In other words, you can assume it is accessed only by one process at a time.  

Since this is the final project, you will design and implement it by yourself. Do not discuss the design or implementation with other students. Piazza will be primarily for requirement clarification.

Resources

Your textbook has information that will help you to complete this project, but we also highly recommend The File System section in Ritchie & Thompson's The UNIX Time-Sharing System paper from 1974.

API

We provide you with the kernel we have built so far, including a number of functions to access the disk (see block.h for details), and also several shell commands (explained later) to test the file system.  You are required to implement most of the standard UNIX file system calls as listed in the following:

1. File system module initialization

Prototype:    void fs_init (void);
Function description:

This function initializes the data structures and resources used by the file system subsystem, and if it detects that the disk is already formatted, it automatically mounts it to the root directory.  It is invoked at kernel initialization time, but before the block module (block.c) is initialized.  So you need to invoke block_init in fs_init.  Note that by the time fs_init is called, the disk is not necessarily formatted.  As a result, you need to devise a mechanism so that a formatted disk is recognized (see fs_mkfs).

This function is the only non-syscall function you are required to implement.

2. Format a disk

Prototype:    int fs_mkfs (void);
Function description:

This function formats a disk: either a raw disk, or one that has previously been formatted. It then mounts the newly formatted file system to the root directory.  The size of the disk (in blocks) is defined as FS_SIZE in fs.h. (You can increase the size of the disk for testing if you would like.)  We assume there is one and only one disk present in the system.  You need to write a magic number in the super block so that it can later be recognized as a formatted disk (i.e. you can still access a formatted disk and get its content after the shell exits and restarts).  The function should return 0 on success, and -1 on failure.

Note: The bootblock, kernel image and process images are all outside the file system.  The file system starts at the first block after the last process image in the disk and that block is given the block number 0 when using block_read/write to access the disk.


3. Open and possibly create a file

Prototype:    int fs_open (char *filename, int flags);
Function description:

Given a filename, fs_open() returns a file descriptor, a small, non-negative integer for use in subsequent system calls (fs_read, fs_write, fs_lseek, etc.).  The file descriptor returned by a successful call will be the lowest-numbered file descriptor not currently open.

The parameter flags must include one of the following access modes:  FS_O_RDONLY, FS_O_WRONLY, FS_O_RDWR.  These request opening the file read-only, write-only, or read/write respectively.  The constants are defined in the file common.h.

Open returns the new file descriptor, or -1 if an error occurred.

If a non-existent file is opened for writing, it should be created.  An attempt to open a non-existent file read-only should fail.

To make your life easier, we assume filename passed to the syscalls can only be ".", "..", a directory, or a filename in the current directory.  So you don't have to parse the path as "/" separated directory and file names. You can also assume that the length of the filename (and dirname) will be less than 32 bytes (MAX_FILE_NAME). These assumptions remain the same for the following functions.

It is considered an error to open a directory in any mode besides FS_O_RDONLY.

You can use a shared file descriptor table. You do not need to worry about user management or access control lists.

4. Close a file descriptor

Prototype:    int fs_close (int fd);
Function description:

fs_close() closes a file descriptor, so that it no longer refers to any file and may be reused.  It returns zero on success, and -1 on failure.  If the descriptor was the last reference to a file which has been removed using unlink, the file is deleted.

5. Read a file

Prototype:    int fs_read (int fd, char *buf, int count);
Function description:

fs_read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.  If count is zero, fs_read() returns zero and has no other results.  On success, the number of bytes successfully read is returned, and the file position is advanced by this number.  It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now.  On error, -1 is returned. In the error case it is left unspecified whether the file position changed.

6. Write a file

Prototype:    int fs_write (int fd, char *buf, int count);
Function description:

fs_write() writes up to count bytes to the file referenced by the file descriptor fd from the buffer starting at buf.

On success, the number of bytes written from buf are returned (a number less than count can be returned), and the file position is advanced by this number.  On error, -1 is returned. It is an error to attempt a write at a point beyond the maximum size of a file. It is an error if count is greater than zero but no bytes were written.

A file of size zero should not take up any data blocks.

Writing padding (see fs_lseek()) should be all or nothing.

If count is zero, 0 will be returned without causing any other effects.

7. Reposition read/write file offset

Prototype:    int fs_lseek (int fd, int offset);
Function description:

The fs_lseek() function repositions the offset of the open file associated with the file descriptor fd to the argument offset.

The fs_lseek() function allows the file offset to be set beyond the end of file (but this does not change the size of the file).  If data is later written at this offset, the file is padded with '\0's in the intervening space.

Upon successful completion, fs_lseek() returns the resulting offset location as measured in bytes from the beginning of the file.  Otherwise, a value of -1 is returned.

8. Create a directory

Prototype:    int fs_mkdir (char *dirname);
Function description:

fs_mkdir() attempts to create a directory named dirname.  It returns zero on success, or -1 if an error occurred.  fs_mkdir() should fail if the directory dirname already exists.

New directories must contain "." and ".." entries. It is an error to try to create a directory without them.

9. Delete a directory

Prototype:    int fs_rmdir (char *dirname);
Function description:

fs_rmdir() deletes a directory, which must be empty.  On success, zero is returned; on error, -1 is returned (e.g. attempting to delete a non-empty directory).

10. Change the current directory

Prototype:    int fs_cd (char *dirname);
Function description:

fs_cd() changes the current directory to that specified in dirname.  On success, zero is returned.  On error, -1 is returned.

11. Make a new name for a file

Prototype:    int fs_link (char *oldpath, char *newpath);
Function description:

fs_link() creates a new link (also known as a hard link) to an existing file oldpath.  If newpath exists it will not be overwritten.  The new name may be used exactly as the old one for any operation; both names refer to the same file and it is impossible to tell which name was the "original."

On success, zero is returned.  On error, -1 is returned.  It is an error to use this function on a directory.

Note that because there are no "paths" beyond the current directory, the parent, or a child directory, oldpath and newpath are actually both filenames and can only be in the same directory.

12. Delete a file

Prototype:    int fs_unlink (char *filename);
Function description:

fs_unlink() deletes a name from the file system.  If that name was the last link to a file and no process has the file open, the file is deleted and the space it was using is made available for reuse.

If the name was the last link to a file but it is still open under an existing file descriptor, then it remains in existence until the last file descriptor referencing it is closed.

On success, zero is returned.  On error, -1 is returned.  It is an error to use this function on a directory.

13. Get file/directory status.

Prototype:    int fs_stat (char *filename, fileStat *buf);
Function description:

fs_stat() returns information about a file.  It returns a fileStat structure (defined in common.h), which contains the following fields:

typedef struct {
    int inodeNo;        /* the file i-node number */
    short type;            /* the file i-node type, DIRECTORY, FILE_TYPE (there's another value FREE_INODE which never appears here */
    char links;            /* number of links to the i-node */
    int size;            /* file size in bytes */
    int numBlocks;    /* number of blocks used by the file */
} fileStat;

Do not reuse this struct for your inodes.

Note: if your implementation needs different members you can modify this struct, but please only add to it and update the members that are already there.

14. List current directory content

Prototype:    void shell_ls (void);
Function description:

Note that this function is in shell.c.

shell_ls() prints a list of files in the current directory. The list does not have to be sorted alphabetically.

Shell Commands

You will be able to test your file system implementation through the following shell commands (provided through shell.c). We will also test your file system through this standard shell, so please do not change the existing shell commands. You can add extra shell commands for your own testing purposes.

Shell commands

Arguments

Description

mkfs

 

Make a new file system, i.e., format the disk so that it is ready for other file system operations.

open

<filename> <flag>

Open a file with the given <flag>, return a file descriptor (fd) associated with this file.

<flag>: 1: FS_O_RDONLY; 2: FS_O_WRONLY; 3: FS_O_RDWR

The flag is same as the flag used by the open system call. The current file offset will be 0 when the file is opened.

read

<fd> <size>

Read <size> bytes from the file associated with <fd>, from current file offset. The current file offset will move forward <size> bytes after read.

write

<fd> <string>

Write <string> into file associated with <fd>, from current file offset. The current file offset will move forward <size> bytes after write.

lseek

<fd> <offset>

Move the current file offset associated with <fd> to a new file offset at <offset>. The <offset> means the number of bytes from the beginning of the file.

close

<fd>

Close the file associated with <fd>.

mkdir

<dirname>

Create a sub-directory <dirname> under the current directory.

rmdir

<dirname>

Remove the sub-directory <dirname>.

cd

<dirname>

Change the current directory to <dirname>.

link

<src> <dest>

Create a link named <dest> to an existing file named <src>.

unlink

<name>

Remove a link to the name. (When link count drop to 0, delete the file or directory).

stat

<name>

Show the status of the file or directory with name <name>. It should display its inode information; whether it is a file or directory; its link count; size of the file/directory; number of blocks allocated; other information stored about this file/directory.

ls

 

Show the content of the current directory.

Note! You must implement this function.

cat

<filename>

Show the content of the file.

create

<filename> <size>

Create a file with <filename> as the name in the current directory, and fill it with <size> amount of data, then close the file. (For testing purpose)

Directory Fragmentation

As soon as a directory's entries can fit within less than the current number of data blocks allocated for the directory, they will. A data block should be given up and the holes filled in.

For example, say that an implementation can fit four entries within a data block. There are currently eleven entries within some directory, which thus requires three data blocks. When any three entries are removed from the directory, the directory’s data should be consolidated to fit within two data blocks.

Testing

You can test your code on linux by typing "make" and executing "./lnxsh". This program will read/write to a file called "disk".

A test script will be provided that you can use to test your file system. You will also write your own test cases and submit these with your code.

Since your implementation's output should be similar to the given tests' output, your final submission should not output custom error messages.

Files

You should use the code template as a starting point for your project. You only need to change three of the files: fs.h, fs.c, and shell.c.  However, feel free to add other files so that you can better organize your code, or modify existing files, but make sure to describe in the design document any additional files added or changes made to other files. Note: Please do not submit the whole tarball, only pick the files you modified and compress it into tarball and check it in dropbox.

You should use the functions provided in block.h to read and write blocks in the file system. These functions will use a file on linux.

The following files are compiled and linked to the filesystem implementation to make a Linux executable:

Design Document

You will submit a design document with your final code in lieu of the typical design review. In the design document you need to provide details about the design of your file system. Since we will not go through the design document with you, you should provide a comprehensive and concise design description. Your document should contain at least the following sections:

The design document should be in PDF format. If what you are describing is best explained with a diagram, please create one (or more). Limit your design document to five pages.

Grading

Your submission will be graded based on the following items:

Submission

Submit via Dropbox