Programming Project #4
| Assigned: | November 19th |
| Due: | Friday, December 3rd at 11:59 PM (free extension to Sunday, December 5th at 11:59 PM) |
Please make sure you've read the general project information page, including information on handing in your project.
Ground rules
The basic rules and project information are the same for this assignment as for Project #2. If you're working in a group, please remember that the student whose handin directory isn't being used should submit a single file called partner containing the name and CATS account of both project members into her directory. The grade received by each project partner will be the same.
As with other projects, the following suggestions apply:
- Start early!
- Do your design document before writing a single line of code. You're welcome to discuss the design with the professor, TA, or tutor before you start implementation, and we're going to ask to see the design document before helping you debug code. Remember, you can get nearly half of the credit for the assignment for just doing the design well without writing a single line of code!
Unlike previous programs, you may not hand this project in late (past December 5th) because we need to have grades done by the end of that week. Late submissions would not give us time to get project grading done in time.
Goals
The goal for this project is to implement a simple file system in DLXOS. The file system should have the following features:
- The file system has a block size of 4KB.
- The file system has to support the following calls. The actual location
of the implementation is described in the specifications section.
- fileID = Open (filename, mode)
- retval = Close (fileID)
- bytesread = Read (fileID, buffer, bytesToRead)
- byteswritten = Write (fileID, buffer, bytesToWrite)
- retval = Seek (fileID, offset, whence)
- retval = Delete (filename)
- There is a single directory in which files reside.
- The file system uses a file allocation table (FAT).
- User traps corresponding to the above calls should be implemented. There are stubs for these calls in the latest version of traps.c; you may need to ensure that the corresponding calls are in usertraps.s. Of course, you also have to make sure these calls actually call FsOpen(), FsRead(), FsWrite(), etc.
- You need to write a DLX program that lists all of the files in the (single) directory. This can be done easily by treating "/" as a file, allowing it to be opened and read. Then, your user program need only be able to parse the contents....
You need not have Project #3 working in order to complete this project. If you're worried about how well your VM code works, I'd suggest using the "stock" dlxos.tgz. Don't lose time trying to get your VM code working; spend it on the file system instead.
Specifications
Your file system calls will be integrated into DLXOS in filesys.c. There are routines called FsDlxRead(), and so on for the other calls you need to implement. You need to modify these calls to use the file system you are designing. Your routines will automatically be called when you call FsOpen() with a filename that starts with "dlx:". For example, calling FsOpen() on the file "dlx:/foo" will result in FsDlxOpen() being called with the file /foo. Other calls (except for FsDelete()) are switched automatically based on the file ID set in FsOpen(), which remembers which file system (Unix or DLXOS) the file was opened in.
Reading and writing disk blocks
For this assignment, you'll be reading and writing blocks from a disk that
you specify with arguments to the simulator:
dlxsim -d 1 mydisk 50000 -x os.exe
This says that you want DLX disk 1 to be called mydisk, and that you
want it to be at most 50000 blocks long (a block is 512 bytes). Note that the
actual file will grow to the size necessary; if you only write block 5 of a
50,000 block disk, the associated file will only grow to 6 blocks long (blocks
0–5).
You can read and write disk blocks with the DiskIo() routine as described below. You can also see an example in sysproc.c. Note that reading and writing disk blocks involves (simulated) latency; a process that calls DiskIo() should expect to sleep until the read or write is complete. Currently, this latency is exactly 10ms (regardless of the number of blocks transferred); this may be updated (in the simulator—no DLXOS changes) to a more realistic estimate.
A sample call to DiskIo() is:
int startBlock; // first block to transfer (one disk block is 512 bytes)
int nBlocks; // Number of blocks to transfer
void *buf; // PHYSICAL address of the buffer. Transfers must be to contiguous
// physical addresses; if you want to transfer to non-contiguous
// buffers, use multiple calls to DiskIo()
int diskNumber; // This should match the disk number you specified above (the first
// argument after -d )
int op; // This should be either DISK_IO_READ or DISK_IO_WRITE
int blocksXferred; // DiskIo() returns the actual number of blocks transferred
blocksXferred = DiskIo (startBlock, nBlocks, buf, diskNumber, op);
Note that file system block size is 4KB; thus each file block is eight disk blocks.
Superblock
You'll need to have a "superblock" in your file system that contains some basic information: the size of the file system, the location and length of the FAT, and the first block in the directory. You should read this in when the OS starts up and use the information to configure the file system in the OS. The superblock should look something like this:
struct superblock {
unsigned int magic; // you can use this to make sure you're reading the right block
// You can leave the field out if you want.
int blocksInDisk;
int fatStartBlock;
int fatLengthInBlocks;
int dirStartBlock
int dirLengthInEntries; // Could be bytes or blocks instead?
};
The superblock needs to be in a fixed location regardless of the size of the file system. We recommend block 0 of the disk because every disk must have a block 0.
File allocation table (FAT)
Your file system will use a file allocation table to keep track of disk blocks. Your disk will have a maximum of 64K blocks, making the maximum file system size 64K * 4K = 256MB. This also means that you may use unsigned 16 bit integers in the FAT, making its size a maximum of 64K*2 = 128KB. However, your FAT should only be as long as it needs to be—if your disk holds 16 MB, the FAT should have 4K entries and thus occupy 8KB of disk space.
Each entry in the FAT contains either the block number of the next block in the file or a "special purpose" number to indicate one of two things: the block is empty, or the block is the last one in the file. Since you'll likely use block 0 for the super block and store the FAT itself starting at block 1, you can use the numbers 0 and 1 for these special purpose numbers.
The FAT should be in a well-known location in the file system; read its location from the superblock.
You can keep the FAT in memory if you want; just make sure you write changes back to disk every so often.
Directory
Your file system will have a single directory that can accommodate file names of up to 23 characters. For each file in the directory, you should store the file name (\0 terminated), file length (in bytes), and the first block number in the file. You can store all of this information in a fixed-size structure large enough for the string (24 bytes) and two other values. This information can be used to read and write the individual blocks of a file. Of course, the directory itself is like a file, and uses the FAT the same way as every other file. The first block in the directory should be readily accessible; as described above, you should store its location in the superblock in other words, the file system should automatically translate "/" into the block number of the directory's first block.
Don't keep the directory sorted; instead, sort it in the directory listing program if you want. This the code simpler, albeit slower, but it's what Linux does, so it can't be such a terrible thing.
The root directory, as in Unix, is referred to as "/". This means that the call to open the file foo in DLXOS is FsOpen("dlx:/foo", ...). Similar naming applies to the delete call.
Opening files
There are three possible flags to open: read, write, and create. The read and write flags are used to indicate which operations are permitted on an open file. The create flag is set by the caller when the user wants to create a file if it's not already found, and should be used in conjunction with the other flags. If create isn't set and the file isn't found, an error should be returned. If create is set and the file doesn't yet exist, it should be created with 0 size. Much of the open code is already written for you.
Reading & writing files
Reading files is self-explanatory. Your file system simply needs to find the appropriate data and read it in. Writing is a bit more complex you may need to allocate a new block. Otherwise, data is written as you'd expect.
System calls to read and write files need not be on file block boundaries; it can be tricky to handle this case if you're not careful.
Seeking
Seeks in the DLXOS file system work just the way seeks work in Unix. A seek updates the "current" pointer but doesn't transfer data. Seeks past of the end of the file fail and return a (negative) error code.
Error conditions
Your file system should catch errors and return appropriate error codes. Possible errors you may want to catch include:
- File not found (open, delete)
- Out of space (open/create, write)
- File not open (read/write) this may already be caught
- End of file (read)
- Seek past end of file (read,write)
Directory listing
You need to write a DLX user program to list all of the files in the (single) directory. You need not list file sizes, though you're certainly allowed to. Listing the directory is simply a matter of being able to open the directory file (from a user program) and parse the contents.
Hints
- Write your design document first!
- Start early!
- Implement the pieces in the following order:
- Unix program to build a file system on a "disk" file. Use this to make sure your file system design makes sense. You may also want to build a Unix program to list the contents of the file system and other utilities. IMPORTANT: this is straightforward on Solaris (unix.ic) and Mac OS X, but can be difficult on Linux because Linux is little-endian and DLX is big-endian. If you do this program right, much of your code can be ported straight to DLXOS.
- Manage the FAT on disk. You may want to cache a copy of this in memory.
- Read existing files. You may use your Unix utility to create files to test this.
- Create and write files.
- Delete files
- Directory lister
- Use a Unix program to debug the contents of your disk. You may not be able to follow what's happening in your OS that easily, but you can certainly analyze what the OS does to the disk.
- Try your file system code with both small (~1 MB) and large (~20 MB) file systems. Your Unix program should be able to build a disk file of any size; the only change should be the length of the FAT and the first block used in the directory.
- Almost all of your code will be in filesys.c and filesys.h, with a bit of code in traps.c and traps.h. This project will require more code than prior projects, but the code will be easier to write.
- Remember that the system stack is very small. Don't allocate buffers on the system stack! Instead, get buffers using MemoryAllocatePage() to allocate pages to use as buffers.
- Use debugging statements or gdb (see the dlxsim page) to track where your code is going.
What to hand in
As with other projects, you'll need to hand in your design documentation and your code. Because your code may include both operating system code, user programs, and Unix programs, please make sure your GNUmakefile builds them separately. For this project, you'll need to be particularly careful because you might have programs that need to be compiled with regular (not DLX) gcc. Do not hand in any disk files!
Extra credit
For extra credit, you may implement the following features. As usual, the required portion of your code should be substantially working before you attempt extra credit. Please include a list of the extra credit features you've implemented in your README file so we can test them.
- 10 points: The fastest program on our benchmark will receive a 10 point bonus. The benchmark will create 50 files (randomly named), write 100KB to each of them, and then read the data back (again, in a random order). You don't have to do any extra work to qualify for this extra credit, but you could spend time optimizing data layout if you want.
- 10 points: Multi-level (tree-structured) directories (i.e., more than one, and structured in a tree like Unix).
- 20 points: Improve performance with a file block cache. To get full extra credit for this, you also need to implement block replacement algorithms.
Last updated 1 Dec 2004 by Ethan L. Miller (elm at ucsc dot edu)