CS372: Homework 11
Problem 1:
Is it fundamentally necessary to store on disk the information about the
unallocated disk sectors? Explain why.
Problem 2:
The FastFile file system uses an inode array to
organize the files on disk. Each inode consists of a user id (2 bytes),
three time stamps (4 bytes each), protection bits (2 bytes), a reference
count (2 bytes), a file type (2 bytes) and the size (4 bytes). Additionally,
the inode contains 13 direct indexes, 1 index to a 1st-level index table,
1 index to a 2nd-level index table, and 1 index to a 3rd level index table.
The file system also stores the first 436 bytes of each file in the inode.
- Assume a disk sector is 512 bytes and that any auxiliary index table
takes up an entire sector. What is the maximum size for a file in this
system?
- Is there any benefit to including the first 436
bytes of the file in the inode?
Problem 3:
Can we implement hard links in DOS? Why or why not?
Problem 4:
In an early release of an operating system that
shall remain nameless, when a file was deleted, its sectors reverted to
the free list but they were not erased. What problems do you think may
result from this? Why do you think the blocks were not erased?
Problem 5:
You are designing a file system from scratch.
The disk driver allows you complete control over the placement of data
on the disk. Assuming that you have settled on a File Allocation Table
(FAT) architecture, where would be the best place to store the table on
disk?
Problem 6:
Pooh Software Ltd. is selling a file system that
uses a UNIX-like file system with multi-level indexing. For more reliability,
the inode array is actually replicated on the disk in two different places.
The intent is that if one or more of the sectors storing either replica
of the array go bad, the system can always recover from the other
replica. Discuss the effect of having this replicated data structure on
performance.
Problem 7:
The FAT file system uses 16-bit numbers to represent
the cluster number that starts the linked list of clusters that
make up a file. Explain the implications of limiting the cluster numbers
to 16 bits. Modify the FAT file system to use 32-bit numbers to represent
the clusters and show how the limitations of FAT can be lifted.
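The relation between entry width, cluster size, and addressable capacity can be sketched in C. This is only an illustrative helper, not part of any real FAT implementation: the function names are made up for this example, reserved entry values are ignored, and the 512-byte starting cluster size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Largest volume, in bytes, that can be addressed when each table entry
 * holds entry_bits bits and every cluster is cluster_bytes long.
 * Simplification: reserved entry values are ignored. */
uint64_t fat_max_volume_bytes(unsigned entry_bits, uint64_t cluster_bytes)
{
    assert(entry_bits >= 1 && entry_bits <= 32);
    return ((uint64_t)1 << entry_bits) * cluster_bytes;
}

/* Smallest power-of-two cluster size, in bytes, that lets entry_bits-bit
 * cluster numbers cover a disk of disk_bytes bytes.
 * Assumption: clusters are never smaller than one 512-byte sector. */
uint64_t fat_min_cluster_bytes(unsigned entry_bits, uint64_t disk_bytes)
{
    uint64_t cluster = 512;
    while (fat_max_volume_bytes(entry_bits, cluster) < disk_bytes)
        cluster *= 2;
    return cluster;
}
```

For example, under these assumptions fat_max_volume_bytes(16, 512) shows that 16-bit cluster numbers with one-sector clusters cover only 32 MiB, so larger disks force larger clusters.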
Problem 8:
Contiguous allocation of files leads to disk fragmentation.
Is this internal or external fragmentation?
Problem 9:
Can we implement symbolic links in DOS (FAT file
system)? If so, show how, and if not, explain why.
Problem 10:
Consider an indexed file allocation using index
nodes (inodes). An inode contains among other things, 7 indexes, one indirect
index, one double index, and one triple index.
- What usually is stored in the inode in addition to the indexes?
- What is the disadvantage of storing the file name
in the inode? Where should the file name be stored?
- If the disk sector is 512 bytes, what is the maximum
file size in this allocation scheme?
- Suppose we would like to enhance the file system
by supporting versioning. That is, when a file is updated, the system creates
a new version, leaving the previous one intact. How would you modify the inode
allocation scheme to support versioning? Your answer should consider how
a new version is created and deleted.
- In a file system supporting versioning, would
you put information about the version number in the inode, or in the directory
tree? Justify your answer.
Problem 11:
The LoneStar backup system for UNIX works as follows:
At the beginning of each week, the backup system traverses the file system
tree structure and saves all directories and files. This stage is called
full backup. Then, on a daily basis, it performs incremental backups. It
does so by traversing the directory tree, and saving only the files whose
time stamps show that they have been modified since the previous incremental
backup. Note that the LoneStar system does not follow the symbolic links
when saving files, instead storing them as symbolic links. This preserves
the integrity of the symbolic link.
- What would happen if the backup system were to follow
the symbolic links and save the files the links point to?
- Identify problems with this backup scheme.
- Instead of doing the above, the KeyStone backup
system requires the file system to maintain a bit map with one bit for every
sector on disk. If a disk sector is updated, the file system sets the
corresponding bit to one. When the KeyStone system performs an incremental
backup, it saves only the disk sectors whose bits are set, then it resets the
entire bit map. Compare the two backup schemes. What problems do they have
in common? What are the key advantages of each? Which one would
you use?
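The bookkeeping that the KeyStone description implies can be sketched as follows. This is a minimal illustration only: the 64-sector disk, the function names, and the save_sector callback are all hypothetical and stand in for whatever the real file system and backup tool would provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_SECTORS 64u               /* hypothetical, tiny disk */

static uint8_t dirty_map[NUM_SECTORS / 8];   /* one bit per sector */

/* Called by the (hypothetical) file system whenever a sector is updated. */
void mark_sector_dirty(uint32_t sector)
{
    assert(sector < NUM_SECTORS);
    dirty_map[sector / 8] |= (uint8_t)(1u << (sector % 8));
}

/* Returns 1 if the sector was updated since the last incremental backup. */
int sector_is_dirty(uint32_t sector)
{
    assert(sector < NUM_SECTORS);
    return (dirty_map[sector / 8] >> (sector % 8)) & 1;
}

/* Incremental backup: pass every dirty sector number to save_sector
 * (which may be NULL for a dry run), then reset the entire bit map.
 * Returns the number of sectors that were dirty. */
size_t incremental_backup(void (*save_sector)(uint32_t sector))
{
    size_t saved = 0;
    for (uint32_t s = 0; s < NUM_SECTORS; s++) {
        if (sector_is_dirty(s)) {
            if (save_sector != NULL)
                save_sector(s);
            saved++;
        }
    }
    memset(dirty_map, 0, sizeof dirty_map);
    return saved;
}
```

Note that the sketch tracks sectors only; it knows nothing about which file or directory a sector belongs to.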
Problem 12:
Explain what happens to the disk in UNIX when a user in a
text editor saves a new file called "foo" into the current directory.
Assume that foo is 15678 bytes in length, that a disk block is 1024 bytes,
that the file system supports up to 2^32 sectors, and that an inode contains
10 direct block pointers, 1 indirect block pointer, 1 doubly-indirect block
pointer, and 1 triply-indirect block pointer.
The system uses on-disk free maps for tracking free blocks and free inodes.
Here's the set of actions that you may assume that the editor makes:
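    /* editBuffer holds the editBufferSize bytes of text to be saved */
    fd = open("foo", O_CREAT | O_WRONLY, 0644);  /* O_CREAT needs a mode */
    p = editBuffer;
    numBlocks = editBufferSize / 1024;
    for (i = 0; i < numBlocks; i++) {
        write(fd, p, 1024);                      /* one full block per call */
        p += 1024;
    }
    lengthOfLastPartialBlock = editBufferSize % 1024;
    write(fd, p, lengthOfLastPartialBlock);      /* trailing partial block */
    close(fd);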
Assume that creating a file is synchronous -- all writes to
disk complete before the call returns. Assume that write() calls are
asynchronous (they write data and metadata to memory, but don't force the
writes to disk), but that close() does not return until all writes to the file
and metadata are safely on disk. Assume that all of the data structures that
must be read to perform this action are already in the cache.
- Assuming the system does not use logging, explain what blocks get written
out to the disk and what each block contains.
- Suppose the machine crashes after writing the inode, free map, and data
blocks, but before writing any indirect blocks. Describe how, if not fixed
when the system reboots, such an inconsistency could cause security violations
in which one user could access the files of another user (be specific: How
would this happen? Which blocks are vulnerable? Can any of the user's data be
exposed to other users? Can any other user's data be exposed to this user?)
- Suppose you were to implement write-ahead logging to fix this problem.
Describe exactly what your system would write to the log to satisfy this
user's requests, when the writes occur, and when the system can "apply" the
updates -- copy them to their normal positions on disk.
Problem 13:
- An engineer has designed a FAT-like system and he has used 24 bits for
each entry. For a 32-GB disk, what is the minimum size of a file allocation
in this system? Justify your answer.
- Consider an index-based file system with the inode containing 64 indexes,
1 indirect index pointing to a disk block containing an array of direct
indexes, and 1 2-level index in the usual way. Assume that each index takes
4 bytes.
  - What is the maximum file size under this arrangement,
    if a disk block is 1024 bytes? Explain how you compute
    this maximum size.
  - How many disk accesses does it take to read one disk block at location
    3000321 within a file, assuming no caching? Justify your answer.
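The general shape of the computation for multi-level index schemes like this one can be sketched in C. The function names are invented for this illustration, and the code assumes one index tree of each depth from 1 up to levels, with every index block filled entirely with fixed-size indexes; it is a sketch under those assumptions rather than a worked solution.

```c
#include <assert.h>
#include <stdint.h>

/* Maximum file size in bytes for an inode with n_direct direct indexes
 * plus one index tree of each depth 1..levels (levels = 2 means one
 * single-indirect and one double-indirect index). */
uint64_t max_file_bytes(uint64_t n_direct, unsigned levels,
                        uint64_t block_bytes, uint64_t index_bytes)
{
    assert(index_bytes > 0 && block_bytes >= index_bytes);
    uint64_t per_block = block_bytes / index_bytes; /* indexes per block */
    uint64_t blocks = n_direct;
    uint64_t reach = 1;
    for (unsigned l = 1; l <= levels; l++) {
        reach *= per_block;   /* data blocks reachable via a depth-l tree */
        blocks += reach;
    }
    return blocks * block_bytes;
}

/* Number of index blocks that must be read to locate data block block_no
 * (0-based) of a file: 0 for a direct index, 1 for single-indirect, and
 * so on. Returns -1 if block_no lies beyond the maximum file size. */
int index_depth_for_block(uint64_t block_no, uint64_t n_direct,
                          unsigned levels, uint64_t block_bytes,
                          uint64_t index_bytes)
{
    assert(index_bytes > 0 && block_bytes >= index_bytes);
    uint64_t per_block = block_bytes / index_bytes;
    if (block_no < n_direct)
        return 0;
    block_no -= n_direct;
    uint64_t reach = 1;
    for (unsigned l = 1; l <= levels; l++) {
        reach *= per_block;
        if (block_no < reach)
            return (int)l;
        block_no -= reach;
    }
    return -1;
}
```

The number of disk accesses for a read is then the returned depth plus one for the data block itself, plus one more if the inode must also be fetched from disk.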
Problem 14:
Versioning: It is often desirable for users to
maintain different versions of the same file (for example, during program
development, for recovering in the case of wrong updates applied to a version,
etc.). The VMS file system implemented versioning by creating a new copy
of the file every time it was opened for writing. The system also supported
versioning in the naming structure, appending a version number to the name
of each file. For instance, when foo.c;1 was updated, the system created
foo.c;2, and so on. The system provided commands to manipulate versions; for
example, the PURGE command would delete all versions except the most recent
one. Also, users could disable versioning for some files. Discuss the pros
and cons of this scheme.
Problem 15:
An implementation of the fsck program traverses
the file system tree and builds two lists of the disk blocks. One list
records the blocks that are shown to be in use, while the other is read
from the free disk-block information on disk. Suppose the two lists for
4 blocks were as shown:
In use | 0 | 1 | 0 | 1 |
Free   | 0 | 0 | 1 | 1 |
Identify the problems, if any, and state
what fsck should do for each block.
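The cross-check between the two lists can be sketched in C as below. The enum and function names are made up for this illustration, and the in-use entry is treated as a simple 0/1 flag as in the table above; a real fsck typically counts references per block, which this sketch does not model. The repair action is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

enum block_state {
    BLOCK_OK,        /* exactly one of the two lists claims the block */
    BLOCK_MISSING,   /* the block is in neither list */
    BLOCK_CONFLICT   /* the block is marked both in use and free */
};

/* Compare one block's entries in the two lists. */
enum block_state classify_block(int in_use, int is_free)
{
    if (in_use && is_free)
        return BLOCK_CONFLICT;
    if (!in_use && !is_free)
        return BLOCK_MISSING;
    return BLOCK_OK;
}

/* Count the blocks whose two entries disagree with a consistent state. */
size_t count_inconsistent(const int *in_use, const int *is_free, size_t n)
{
    assert(in_use != NULL && is_free != NULL);
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (classify_block(in_use[i], is_free[i]) != BLOCK_OK)
            bad++;
    }
    return bad;
}
```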