Final Project : Unix File System |
Due: January 13
(Dean’s date), 2015 @ 5:00PM (late submissions can not be accepted)
Precept: Mon December 8th,
2014 (only one precept will be given)
Office Hour: Wednesday December 10th from 4-5. Aditionally, I will be monitoring Piazza and email the entire break.
No in-person Design Review
Precept slides: pdf
Test cases and expected output are available on the fishbowl file system
Project Code: on lab machines
TA: Kelvin Zou (kelvinzou@cs.princeton.edu)
In this project you will implement a simple UNIX-like file system with a hierarchical directory structure. Files grow and directories grow and shrink, so you need to manage free disk space. You will be able to browse the directory structure, create new files and directories, delete them, etc. The functionality is similar to UNIX file systems, but it does not include permission and user management. Neither do we require concurrency or high performance from your file system. In other words, you can assume it is accessed only by one process at a time.
Since this is the final project, you will design and implement it by yourself. Do not discuss the design or implementation with other students. Piazza will be primarily for requirement clarification.
Your textbook has information that will help you to complete this project, but I also highly recommend The File System section in Ritchie & Thompson's The UNIX Time-Sharing System paper from 1974.
We provide you with the kernel we have built so far, including a number of functions to access the disk (see block.h for details), and also several shell commands (explained later) to test the file system. You are required to implement most of the standard UNIX file system calls as listed in the following:
1.
File system module initialization
Prototype: void fs_init (void);
Function description:
This function initializes the data structures and resources used by the file
system subsystem, and if it detects that the disk is already formatted, it
automatically mounts it to the root directory. It is invoked at kernel
initialization time, but before the
block module (block.c) is initialized. So you
need to invoke block_init in fs_init. Note that by the time fs_init is called, the disk is not necessarily
formatted. As a result, you need to devise a mechanism so that a
formatted disk is recognized (see fs_mkfs).
This function is the only non-syscall function you
are required to implement.
2.
Format a disk
Prototype: int fs_mkfs (void);
Function description:
This function formats a disk: either a raw disk, or one that has previously
been formatted. It then mounts the newly formatted file system to the root
directory. The size of the disk (in blocks) is defined as FS_SIZE in fs.h. (You can increase the
size of the disk for testing if you would like.)
We assume there is one and only one disk present in the system. You
need to write a magic number in the super block so that it can
later be recognized as a formatted disk (i.e. you can
still access a formatted disk and get its content after the shell exits and
restarts). The function should return 0 on success, and -1 on
failure.
Note: The bootblock, kernel image and process images are all outside the file system. The file system starts at the first block after the last process image in the disk and that block is given the block number 0 when using block_read/write to access the disk.
3.
Open and possibly create a file
Prototype: int fs_open (char *filename, int
flags);
Function description:
Given a filename, fs_open()
returns a file descriptor, a small, non-negative integer for use in
subsequent system calls (fs_read, fs_write, fs_lseek,
etc.). The file descriptor returned by a successful call will be the
lowest-numbered file descriptor not currently open.
The parameter flags must include one of the following access modes:
FS_O_RDONLY, FS_O_WRONLY, FS_O_RDWR.
These request opening the file read-only, write-only, or read/write
respectively. The constants are defined in the file common.h.
Open returns the new file descriptor, or -1 if an error occurred.
If a non-existent file is opened for writing, it should be created. An
attempt to open a non-existent file read-only should fail.
To make your life easier, we assume filename passed to the syscalls can only be ".", "..", a directory, or a filename in the current directory. So
you don't have to parse the path as "/" separated directory and file
names. You can also assume that the length of the filename (and dirname) will be less than 32 bytes
(MAX_FILE_NAME). These assumptions remain the same for the following functions.
It is considered an error to open a directory in any mode besides FS_O_RDONLY.
You can use a shared file descriptor table. You do not need to worry about user management or access control lists.
4.
Close a file descriptor
Prototype: int fs_close (int fd);
Function description:
fs_close() closes a file descriptor, so
that it no longer refers to any file and may be reused. It returns zero
on success, and -1 on failure. If the descriptor was the last reference
to a file which has been removed using unlink, the file is deleted.
5.
Read a file
Prototype: int fs_read (int fd,
char *buf, int count);
Function description:
fs_read() attempts to read up to count
bytes from file descriptor fd into the buffer
starting at buf. If count is zero, fs_read()
returns zero and has no other results. On success, the number of bytes successfully
read is returned, and the file position is
advanced by this number. It is not an error if this number is smaller
than the number of bytes requested; this may happen for example because fewer
bytes are actually available right now. On error, -1 is returned.
In the error case it is left unspecified whether the file position changed.
6.
Write a file
Prototype: int fs_write (int fd,
char *buf, int count);
Function description:
fs_write() writes up to count
bytes to the file referenced by the file descriptor fd
from the buffer starting at buf.
On success, the number of bytes written from buf are returned (a number
less than count can be returned), and the file position is advanced by
this number. On error, -1 is returned. It is an error to attempt a write
at a point beyond the maximum size of a file. It is an error if count is
greater than zero but no bytes were written.
A file of size zero should not take up any data blocks.
Writing padding (see fs_lseek()) should be all or nothing.
If count is zero, 0 will be returned without causing any other effects.
7.
Reposition read/write file offset
Prototype: int fs_lseek (int fd,
int offset);
Function description:
The fs_lseek() function repositions the
offset of the open file associated with the file descriptor fd
to the argument offset.
The fs_lseek() function allows the file offset to
be set beyond the end of file (but this does not change the size of the file).
If data is later written at this offset, the file is padded with '\0's in
the intervening space.
Upon successful completion, fs_lseek() returns the resulting offset location as measured in
bytes from the beginning of the file. Otherwise, a value of -1 is
returned.
8.
Create a directory
Prototype: int fs_mkdir (char *dirname);
Function description:
fs_mkdir() attempts to create a directory
named dirname. It returns zero on
success, or -1 if an error occurred. fs_mkdir()
should fail if the directory dirname already
exists.
New directories must contain "." and ".." entries. It is an
error to try to create a directory without them.
9.
Delete a directory
Prototype: int fs_rmdir (char *dirname);
Function description:
fs_rmdir() deletes a directory, which
must be empty. On success, zero is returned; on error, -1 is returned (e.g. attempting to delete a non-empty directory).
10.
Change the current directory
Prototype: int fs_cd (char *dirname);
Function description:
fs_cd() changes the current
directory to that specified in dirname.
On success, zero is returned. On error, -1 is returned.
11.
Make a new name for a file
Prototype: int fs_link (char *oldpath, char *newpath);
Function description:
fs_link() creates a new link (also
known as a hard link) to an existing file oldpath.
If newpath exists it will not be
overwritten. The new name may be used exactly as the old one for any
operation; both names refer to the same file and it is impossible to tell which
name was the "original."
On success, zero is returned. On error, -1 is returned. It is an error to use this function on a directory.
Note that because there are no "paths" beyond the current directory, the parent, or a child directory, oldpath
and newpath are actually both filenames and
can only be in the same directory.
12.
Delete a file
Prototype: int fs_unlink (char *filename);
Function description:
fs_unlink() deletes a name from the
file system. If that name was the last link to a file and no process has
the file open, the file is deleted and the space it was using is made available
for reuse.
If the name was the last link to a file but it is still open under an existing file descriptor, then it
remains in existence until the last file descriptor referencing
it is closed.
On success, zero is returned. On error, -1 is returned. It is an error to use this function on a directory.
13.
Get file/directory status.
Prototype: int fs_stat (char *filename, fileStat
*buf);
Function description:
fs_stat() returns information about a
file. It returns a fileStat structure
(defined in common.h), which contains the
following fields:
typedef struct {
int inodeNo;
/* the file i-node number */
short type;
/* the file i-node type,
DIRECTORY, FILE_TYPE (there's another value FREE_INODE which never appears here
*/
char links;
/* number of links to the i-node
*/
int size;
/* file size in bytes */
int numBlocks;
/* number of blocks used by the file */
} fileStat;
Do not reuse this struct for your inodes.
Note: if your implementation needs different members you can modify this struct, but please only add to it and update the members that are already there.
14.
List current directory content
Prototype: void shell_ls
(void);
Function description:
Note that
this function is in shell.c.
shell_ls() prints a list of files in the current directory.
The list does not have to be sorted alphabetically.
You will be able to test your file system implementation through the
following shell commands (provided through shell.c).
We will also test your file system through this standard shell, so please do
not change the existing shell commands. You can add extra shell commands for
your own testing purposes.
Shell
commands |
Arguments |
Description |
mkfs |
|
Make a new file system, i.e., format the disk so that it is ready for other file system operations. |
open |
<filename> <flag> |
Open a file with the given <flag>, return a file descriptor (fd) associated with this file. <flag>: 1: FS_O_RDONLY; 2: FS_O_WRONLY; 3: FS_O_RDWR The flag is same as the flag used by the open system call. The current file offset will be 0 when the file is opened. |
read |
<fd> <size> |
Read <size> bytes from the file associated with <fd>, from current file offset. The current file offset will move forward <size> bytes after read. |
write |
<fd> <string> |
Write <string> into file associated with <fd>, from current file offset. The current file offset will move forward <size> bytes after write. |
lseek |
<fd> <offset> |
Move the current file offset associated with <fd> to a new file offset at <offset>. The <offset> means the number of bytes from the beginning of the file. |
close |
<fd> |
Close the file associated with <fd>. |
mkdir |
<dirname> |
Create a sub-directory <dirname> under the current directory. |
rmdir |
<dirname> |
Remove the sub-directory <dirname>. |
cd |
<dirname> |
Change the current directory to <dirname>. |
link |
<src> <dest> |
Create a link named <dest> to an existing file or directory named <src>. |
unlink |
<name> |
Remove a link to the name. (When link count drop to 0, delete the file or directory). |
stat |
<name> |
Show the status of the file or directory with name <name>. It should display its inode information; whether it is a file or directory; its link count; size of the file/directory; number of blocks allocated; other information stored about this file/directory. |
ls |
|
Show the content of the current directory. Note! You must implement this function. |
cat |
<filename> |
Show the content of the file. |
create |
<filename> <size> |
Create a file with <filename> as the name in the current directory, and fill it with <size> amount of data, then close the file. (For testing purpose) |
As soon as a directory's entries can fit within less than the current number of data blocks allocated for the directory, they will. A data block should be given up and the holes filled in.
For example, say that an implementation can fit four entries within a data block. There are currently eleven entries within some directory, which thus requires three data blocks. When any three entries are removed from the directory, the directory’s data should be consolidated to fit within two data blocks.
You can test your code on linux by typing "make" and executing "./lnxsh". This program will read/write to a file called "disk".
A test script will be provided that you can use to test your file system. You will also write your own test cases and submit these with your code.
Since your implementation's output should be similar to the given tests' output, your final submission should not output custom error messages.
You should use the code template as a starting point for your project. You only need to change three of the files: fs.h, fs.c, and shell.c. However, feel free to add other files so that you can better organize your code, or modify existing files, but make sure to describe in the design document any additional files added or changes made to other files. Note: Please do not submit the whole tarball, only pick the files you modified and compress it into tarball and check it in dropbox.
You should use the functions provided in block.h to read and write blocks in the file system. These functions will use a file on linux.
The following files are compiled and linked to the filesystem implementation to make a Linux executable:
· shellutilFake.c : Fake some helper functions for the shell
· blockFake.c : Uses a UNIX file as a disk
You will submit a design document with your final code in lieu of the typical design review. In the design document you need to provide details about the design of your file system. Since we will not go through the design document with you, you should provide a comprehensive and concise design description. Your document should contain at least the following sections:
· Overall file system layout
· Super block
· Inodes
· Free block data structure and management
· Link/Unlink implementation (how you did it)
The design document should be in PDF format. If what you are describing is best explained with a diagram, please create one (or more). Limit your design doc to 2-3 pages.
Your submission will be graded based on the following items:
Submit via Dropbox