When is Project 6 due?

Dean's Date. I think it's policy that anything due after the last day of classes has to be due on Dean's Date, but I might be wrong.

So, high-bandwidth, high-delay networks should have large socket buffers. Should slow modems also have them, and if so, is there any advantage?

The bandwidth-delay product of a network tells you how many bits are "in flight" at any given time, and gives you an idea of the minimum buffer size you need to efficiently use the full capacity of the system from a networking standpoint. If we assume a 56kbps modem talking from here to the other coast (assume 80ms round-trip), then we get a bandwidth-delay product of about 560 bytes (56,000 bits/s x 0.08 s = 4,480 bits). If the socket buffers were only that big, trying to send any more data than that at one time would cause the application to block. So, it's often a good idea to have buffers larger than the bandwidth-delay product, just to make the application run faster.

What does it mean to "convert" a CD-ROM filesystem to the native filesystem for the OS?

One could imagine reading all of the data off the CD-ROM and making a copy of it in the native filesystem, hence the term "convert". When accessing old data, it's often preferable to convert it in a separate program than to keep adding backwards compatibility for every old version. However, for CD-ROMs, people expect that your computer should be able to access a file on the disk without having to convert all of it.

How do you disable an LED?

Control over the LED in question was actually done via a memory-mapped device register. So, if you set a particular bit, the LED was turned on, and if you cleared that bit, the LED was turned off. However, there's no way to set or clear just a single bit, so the entire byte had to be read, modified, and written back on each on/off operation.
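To make the read-modify-write sequence concrete, here's a minimal C sketch. The register address and the bit position are invented for illustration - the real ones depend entirely on the hardware in question:

    #include <stdint.h>

    /* Hypothetical values -- the actual register location and LED bit
     * are defined by the hardware, not by us. */
    #define DEV_REG_ADDR 0xFFFF0000u   /* assumed memory-mapped register */
    #define LED_BIT      (1u << 3)     /* assumed bit controlling the LED */

    static volatile uint8_t *const dev_reg = (volatile uint8_t *)DEV_REG_ADDR;

    void led_on(void)
    {
        uint8_t val = *dev_reg;        /* read the whole byte */
        val |= LED_BIT;                /* modify: set just the LED bit */
        *dev_reg = val;                /* write the whole byte back */
    }

    void led_off(void)
    {
        uint8_t val = *dev_reg;        /* read */
        val &= (uint8_t)~LED_BIT;      /* modify: clear just the LED bit */
        *dev_reg = val;                /* write back */
    }

The 'volatile' qualifier matters here - it tells the compiler that accesses to the register have side effects, so the reads and writes can't be optimized away or reordered.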
If more than one device is using a device controller or driver, can both be used at the same time, or do they interrupt each other?

It depends entirely on the hardware. IDE drives were assumed to be used for lower-end systems than SCSI drives. As a result, SCSI controllers can often support more drives per controller, and the drives can operate independently. In contrast, IDE controllers often allow a "master" drive and a "slave" drive, and only one can be accessed at a time. My experience, however, is that most motherboards and controller cards have two "channels", so at the cost of using two separate ribbon cables, two IDE drives can be used at the same time.

What exactly is the kernel? How does it relate to the hardware or the OS? Is it the OS itself?

This may be splitting hairs, but what we think of as the OS is a collection of programs that allow you to use a computer usefully. The kernel is essentially the main program out of all of these, and is what you've been writing on your bootable floppy labs. Most people use the terms kernel and OS interchangeably, since they usually come from the same people. In some cases they don't, such as with the Linux kernel - for a somewhat political take on the matter, see http://www.gnu.org/gnu/linux-and-gnu.html and http://www.gnu.org/gnu/gnu-linux-faq.html

Is the difference between the Celeron and the Pentium similar to the difference between the 486DX and 486SX?

In the case of the 486, I think the major difference was the presence or absence of the floating-point unit. If your code really needed floating point and you had a 486SX, then you probably used an emulation library and it was painfully slow.

In the case of the Celeron, the difference seems to be the amount of cache on the chip and the speed of its interconnects to memory. The Celerons tend to have less cache and a slower memory bus interconnect, which makes them somewhat slower than the Pentiums. However, I don't know of any operations where the relative speed difference between the two will even be a factor of two. They're pretty close in performance for a lot of things.

Can you mmap a network connection?

You can achieve similar sorts of goals using various (mostly experimental) techniques, but you can't mmap a socket fd. So, I guess the short answer is "no".

If the request is longer than the buffer, does the process block, and does this cause problems with the network? What happens if the reader's network cache is smaller than the writer's cache?

By default, sockets are marked as "synchronous" when they are created. If you try writing to a socket and exceed the amount of socket buffer space allocated for writing, the write call does block. The socket buffers get drained as the data reaches the read buffers on the other side of the connection (and we receive confirmation of that). As the write buffers get drained, more of the data gets copied in, and finally, when the last of the data is in the write buffers, the process resumes. If the receiver has a small read buffer, and the receiver is slow to read data from it, the sending process will get blocked for a while. Note that the network isn't affected. Remember that we said that the network is "packet switched" and not "circuit switched", so there's nobody holding a lock on the network itself. If programs want to make sure that they don't block while trying to do network activity, they can mark their sockets as "nonblocking"; then, if a read or write can't complete fully, it will either transfer as much as it can (a partial success) or fail with the 'errno' variable set to EAGAIN.
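As a rough sketch of the nonblocking case, here's what that might look like in C. The helper names are mine, and a real program would usually wait with select() rather than retry in a loop:

    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Mark an existing socket as nonblocking. Returns 0 on success. */
    int make_nonblocking(int fd)
    {
        int flags = fcntl(fd, F_GETFL, 0);
        if (flags < 0)
            return -1;
        return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
    }

    /* Try to write. On a nonblocking socket, write() copies as much as
     * fits into the socket's write buffer and returns that count; if the
     * buffer is completely full, it fails with errno set to EAGAIN
     * instead of blocking. */
    ssize_t write_some(int fd, const char *buf, size_t len)
    {
        ssize_t n = write(fd, buf, len);
        if (n < 0 && errno == EAGAIN)
            return 0;    /* buffer full -- nothing written this time */
        return n;        /* bytes accepted, or -1 on a real error */
    }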
Can you explain how to use sockets for interprocess communication?

Let's say we have a bunch of threads that share access to a "task queue", and elements get added to this task queue as the computation proceeds. In other words, each thread grabs a task, processes it, and then possibly adds any new tasks to the queue. We could do this with shared memory, and have the threads block on a condition variable. We could instead have a bunch of helper processes all connect to a coordinator process, and have the coordinator process dole out tasks. Each helper process would do its work and signal completion to the coordinator, possibly passing back any new tasks that got created. The drawback in this case is that if the tasks rely on having access to some other data that gets constantly changed, you would then have to package up the relevant data and ship it off to the helpers whenever you gave them a new task.

On pages 289-290 of the reading, the author mentions the idea of device drivers running in user space. Isn't this going to cause a performance hit? What are the advantages and disadvantages?

There are some cases where doing this can make a lot of sense. If the device in question can only be accessed by one program, then allowing that program to get directly to the device can eliminate going through the kernel. In other cases, if the program is trusted not to misbehave, then going straight to the device might be necessary to get better performance. Some kinds of networking and supercomputer environments did this to avoid the data copying between the kernel and the user process.

However, if you're going to try to have general-purpose device drivers running in user-space processes, then you end up with the same issues that faced microkernels. It's possibly more robust, since a buggy device driver will crash its own program rather than the whole system. However, the cost of the robustness is the added copying, which will hurt performance. That said, we're getting to the point where most people aren't hitting the performance limits of their systems, and are getting more and more annoyed by instability. So, these kinds of tradeoffs might make sense.

What exactly happens in the checksum process mentioned in the reading? How does the controller check that the data was transferred correctly?

A checksum can range from very simple to very complicated. The XOR schemes we discussed for RAID controllers can serve as checksums. When you do a read, you compare the XOR value from all of the disks with the XOR value that was previously computed and stored. If they don't match, you know that an error has occurred, but not on which disk. You see this same sort of thing for downloading files - if you run "cksum" or "md5sum" on a file, you'll get a string of digits, and you can verify that you receive that same value after transferring the file somewhere else. If those values don't match, then you know that the transfer got corrupted.

In the case of disks, it's not good enough just to know that the data is bad - you'd like to be able to recover it if at all possible. So, they use more complicated schemes that pad out the original data in such a way that it's possible to tell what's changed. They use a class of codes called Error Correcting Codes (ECC), and in this particular case, I believe they use what are known as Reed-Solomon codes. For memory systems, they use what are known as Hamming codes. The main idea is to keep extra error-correcting information alongside the data, and use this to figure out what data got corrupted and what it originally was. In RAID, we know which data is unavailable since the disk is gone, so it's possible to reconstruct the data that way - there's a small sketch of this XOR trick at the end of this page. If you want more details on Reed-Solomon codes, see http://www.siam.org/siamnews/mtc/mtc193.htm

You bought a $1000 hard drive? Why?

This was around 1987. I was one of the founders of a game software company, and we had enough source files that using floppies wasn't practical. It was much faster to do a compile with the hard drive than from floppies. We had enough files that the linker (the final stage of the compilation process) took about an hour. It made us a lot more careful about fixing bugs in the program. I checked about a year ago, and pirated versions of our games were available on some nostalgia sites via emulators. I guess some things never die.

Random tidbit of info - I was at the Metropolitan Museum of Art in New York over Thanksgiving break, and decided to hit the South Asian Art section. After much fruitless searching, I asked one of the museum employees if they had any Mohenjo Daro artifacts, and was saddened to find that they did not.
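As promised in the checksum answer above, here's a tiny C sketch of the XOR parity idea. The number of disks and the block size are made up, and a real RAID controller does this in hardware over whole disks:

    #include <stddef.h>

    #define NDISKS 4   /* data disks; a separate block holds the parity */
    #define BLK    512 /* bytes per block, chosen arbitrarily here */

    /* Parity is the XOR of the corresponding bytes on every data disk. */
    void compute_parity(unsigned char disk[NDISKS][BLK],
                        unsigned char parity[BLK])
    {
        for (size_t i = 0; i < BLK; i++) {
            parity[i] = 0;
            for (int d = 0; d < NDISKS; d++)
                parity[i] ^= disk[d][i];
        }
    }

    /* Rebuild a lost disk by XORing the parity with all surviving disks.
     * This works only because we know WHICH disk is gone; the same XOR
     * used purely as a checksum can detect a mismatch but not locate it. */
    void rebuild(unsigned char disk[NDISKS][BLK],
                 const unsigned char parity[BLK], int lost)
    {
        for (size_t i = 0; i < BLK; i++) {
            unsigned char v = parity[i];
            for (int d = 0; d < NDISKS; d++)
                if (d != lost)
                    v ^= disk[d][i];
            disk[lost][i] = v;   /* the recovered byte */
        }
    }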