Project 3: Preemptive Scheduler
Overview
In this third project, you will add support for preemptive scheduling and synchronization
to the kernel.
Specifically, it will be your job to modify
entry.S,
scheduler.c,
sync.h and
sync.c.
WARNING: Do not view the starter code for this project before completing the previous one. Doing so is a violation of the honor code.
Unlike project 2, the tasks in this project can be preempted by another task via a time interrupt. You need to deal with the time interrupt and handle the critical section issues so that a task can be preempted safely. Additionally, you will implement three kinds of synchronization primitives: condition variables, semaphores, and barriers.
To fulfill the requirements of this project, you must determine:
- when interrupts are on, and when they are off
- how round-robin scheduling works
- how tasks are put to sleep and woken up
- the properties of synchronization primitives
We have provided starter code which contains 24 files:
- entry.S: Contains the entry points to several kernel facilities
- scheduler.[ch]: The kernel's process scheduler
- sync.[ch]: The kernel's synchronization facility
- interrupt.[ch]: The kernel's interrupt facility
- syslib.[ch]: The kernel's system call facility
- kernel.[ch]: A 264-sector kernel
- queue.[ch]: The library and implementation for a queue
- util.[ch]: Useful helper functions
- printf.[ch]: Printing functions for debugging
- common.h: Library for common type definitions
- createimage: The executable for creating a bootable image
- bootblock: The executable for the bootloader
- helper.S: Helper function for some of the provided test programs
- settest: Script to set up a specific test for the preemptive scheduler
- Makefile: Used to compile and create your files
- bochsrc: The configuration file for the Bochs emulator
We also provide five test programs for you to test your code at various stages. See
Testing for instructions on how to use these tests.
The start code is available at /u/318/code/project3 on the FC010 machines.
Preemptive Scheduling
Processes and threads in x86 are preempted using a timer interrupt. In x86, interrupts can be triggered
in several different ways: hardware, software, exceptions. For this project, the relevant interrupt, the timer interrupt IRQ 0, is a hardware interrupt.
The start code includes handlers for the system call
interrupt, irq0 interrupt, and the various exceptions (in particular, IRQ 7 used in the test programs). The first part of implementing preemptive scheduling is to
complete
irq0_entry and the
TEST_NESTED_COUNT macro in
entry.S.
At the same time, the second part of implementing preemptive scheduling is to complete the
put_current_running() function in
scheduler.c.
You will want to consult
scheduler.h and
queue.[ch] to familiarize yourself with the queue data structure we provide, and since the definition of
the PCB structure has moved to
scheduler.h.
The processor takes the following actions on an interrupt:
- Disable interrupts
-
Push the flags register, CS, and return IP (in that order)
- Jump to the interrupt handler
Once the interrupt handler has run, it returns control to the interrupted task with
the
iret instruction, which pops the instruction pointer
and the flags register, then reenables interrupts.
IRQ 0 is a hardware interrupt,
so you must acknowledge it quickly using the macro
entry.S:
SEND_EOI --
otherwise the hardware won't deliver any more timer interrupts. The tasks
entry.S:
irq0_entry must perform are
-
Increment the 64-bit counter,
time_elapsed
-
Increment entry.S:disable_count to reflect the fact that interrupts
are off.
-
Check whether nested_count is zero. If it is, enter the kernel the
way a system call would yield, otherwise do nothing.
- Check whether there are tasks ready to be awakened (Note: You'll need to implement blocking sleep to finish this part!).
-
Decrement
entry.S:disable_count
in anticipation of
iret
nested_count is a field in the PCB. For
processes, it is 1 during a system call and 0 otherwise. For threads, it is
always 1. You should study how
interrupt.c
changes this field during a system call. This will help you implement the macro
TEST_NESTED_COUNT in
entry.S.
This may not look very hard, but the tricky part is managing when
interrupts are on, and when they are off. Your solution should satisfy the
following two properties:
-
Safety: To avoid race conditions, interrupts should be off
whenever your code is accessing kernel data structures, primarily the
PCBs and task queues.
-
Liveness: Interrupts should be on most of the time so that
processes that take too long are preempted.
Once you have the timer interrupt handler, you will want to implement
put_current_running() function in
scheduler.c. This function saves the currently running task in anticipation of a preemption.
Blocking Sleep
The other major aspect to preemptive scheduling is blocking sleep. This is supposed to suspend the currently running task, and allow the next task in line to execute for some time period, until the next task is woken up and run for some time.
For this part of the project, it is your job to implement the
do_sleep() and
check_sleeping() functions in
scheduler.c so that the currently running process can be put to sleep and the next one in line woken up.
check_sleeping() takes care of checking when it is time to awaken all sleeping processes whose deadlines have been reached. The passing of the deadline is determined in part by looking at
time_elapsed.
A non-conforming implemention of do_sleep() has been provided. You must modify the implementation of do_sleep() so that calls to sleep block the current process (i.e. take it out of the ready queue until some appropriate time in the future) instead of busy waiting.
Synchronization Primitives
An important part of preemptive scheduling is to handle synchronization. Thus, you will implement three synchronization primitives in this project: condition variables, semaphores, and barriers.
The files you will edit here are
sync.c and
sync.h. You will need to consult
queue.[ch]. In particular, in
sync.h, you must define the properties of each of the sync primitives, where knowing how to use queues will be very useful.
We've given you an implementation of
locks as a reference. These primitives must operate correctly in the face of
preemption by turning interrupts on and off with the functions
enter_critical and
leave_critical.
In the descriptions below, if an operation is performed on an uninitialized
structure, the results are up to you. All unblocks should place the
unblocked thread at the end of the ready queue. For each of the primitives, you must implement the following functions.
Condition Variables
condition_init(cond) |
: Initializes the specified memory to hold a condition variable.
|
condition_wait(lock, cond) |
: Releases lock, blocks until signalled,
then reacquires lock. The calling thread is
responsible for acquiring lock before
calling this function.
|
condition_signal(cond) |
: Unblocks the first thread to have blocked on
condition_wait(lock,cond). If no thread is blocked on
cond, then do nothing.
|
condition_broadcast(cond) |
: Unblocks all threads that have blocked on
condition_wait(lock,cond). The order in which threads are
unblocked is not important.
|
Semaphores
semaphore_init(sem, value) |
: Initializes the specified memory to hold a semaphore whose starting
value is value; a non-negative integer.
|
semaphore_down(sem) |
: Decrement the semaphore, and block the current process if the value is now negative.
|
semaphore_up(sem) |
: Increment the semaphore, and unblock one waiting process if the value was negative.
|
Barriers
barrier_init(bar, n) |
: Initializes a barrier for use by n threads.
|
barrier_wait(bar) |
: Blocks until a total of n threads have
called barrier_wait(bar). The order in which
threads are unblocked is unimportant. Immediately after the last call to
barrier_wait(bar) returns, and possibly
before the other calls have returned, the barrier is available for
another round of waits.
|
Testing
After you complete each phase of the preemptive scheduler, you should test it with one of our provided test programs.
The script:
settest prepares a specified test for your OS to run. Its usage is
./settest test_name_here. Then simply run
bochs to see the test in action.
In order to clean up your directory after running a test you may run
make cleantest, which cleans and removes all the testing files. If you just want to reset your kernel, scheduler etc. you should run
make clean.
These are the five provided tests:
- test_regs
observes the value of all registers both before
and after a loop. Presuming that the process
is preempted at some time during the loop, it
will compare the values before and after, to
see if any registers were improperly clobbered
by preemption. Although this will test all
registers (and flags), note that some classes
of errors may manifest themselves rarely,
and so passing this test does not mean that
your code is perfect.
- test_preempt
creates two
processes which spin as quickly as
possible, but never yield or sleep.
If you see that
both counters are incrementing, then preemption is working
(at least a little bit). Notice that for some reason, the counter growing speeds
of two processes may be different, but the numbers of entry counts must be roughly the same
if you are doing the round robin scheduling.
- test_blocksleep
creates two threads, and one goes to sleep.
It tests four things: (i) that the scheduler
survives when all processes are sleeping,
(ii) that a process is not re-entered
during sleep, (iii) that sleep lasts about
the right amount of time, and (iv) the
non-sleeping process is scheduled during
the sleep.
Note: This test may fail on the Lab machines
due to differences beween the physical machines and
Bochs. This is fine.
- test_barrier
creates 3 threads. The threads sleep for a random period of time, and then
barrier_wait on
a common barrier. They repeat this process
several times.
- test_all is the final test for your project 3 solution. It tests
priorities, locks, condition variables,
semaphores, barriers, etc.
Note:
- The outputs of all the tests are self-explanatory, and tell you if something has gone wrong.
- If you implement the extra credit, you will have to uncomment some portions of some of the tests in order to test your solution.
- All tests need all the files in the test directories in order to function properly, even if some of these files don't contain any code. Don't waste time trying to change this format. It has to do with how the kernel is built by createimage.
Lastly, here is an example for how to test your code. Suppose you want to test the barrier primitive on your OS, run
./settest test_barrier and then
bochs
If you wanted to create your
own test set, or to create
a test set with no processes,
you could copy one of these test
sets to a new directory, and
then modify it to suit your needs.
Use this to test your kernel.
We will use similiar test cases when grading.
Don't forget to test the final version via USB!
When you have finished the project, the test_all test should look similar to this:
Design Review
As always, please take a look at
what we
expect from you in general.
For this project,
at your design review, we want you to be able to describe the workflow of all the preemptive scheduling components you will be implementing.
You should be able to describe:
-
irq0_entry:
- The workflow of the timer interrupt.
We suggest you use the provided system call handler to guide your timer interrupt workflow.
- Blocking sleep:
- How you make a process/thread sleep, how you wake up a process/thread.
- Show how you handle the case when every process and every thread is sleeping, and the ready queue is empty.
When answering these questions, it's useful to keep in mind what the relationship with the ready queue is.
- Synchronization primitives - condition variables, semaphores, barriers:
- For each one, the data structure you will use.
- The pseudo code for condition_wait, semaphore_up, semaphore_down, and barrier_wait.
Specifically, make sure your design for these functions is race-condition free.
You need to convince the TA that you understand the workflow of the timer interrupt and the synchronization primitives. It would be helpful to write the pseudo code down to show to the TA.
Though not required for the design review, we strongly suggest you to study the new system call mechanism of this project, familiarize yourself with the mechanics of round-robin scheduling, and devise the steps the scheduler needs to take in order to complete one cycle of scheduling. Knowing this is key to understanding what each component of the scheduler must take care of.
Each group must schedule a 10-minute design review slot. The
Review will take place on Monday, Oct. 22 between 3:00pm and 7:00pm.
Some tips:
- In order to get the points for organization, come into the review prepared with your answers to all questions written down/typed up.
- Make sure that your answers can be discussed within the allotted 10-minute time slot.
- To respect the time of your classmates, please save your questions for the end of the review. If there is time, we can address them, otherwise, don't fret, you can always post them on Piazza or talk to one of the TAs during office hours.
Use the signup sheet to schedule a time to see the TA in the fishbowl.
Hints
- The counter time_elapsed in scheduler.c counts the number of clock ticks.
- The variable nested_count is not very clear, so here's a rephrasing of what it's trying to say. nested_count indicates whether we are in kernel mode (e.g. when the kernel is currently performing a system call, or a kernel thread is being executed), or not. If we are in kernel mode, nested_count = 1, otherwise nested_count = 0.
- Let's rephrase what "If nested_count is zero, enter the kernel the way a system call would yield, else do nothing." is trying to say. If nested_count is zero, we know we are not in a system call. So we need to enter the kernel. We want to use the fact that a timer interrupt was invoked in order to preempt tasks. In particular, what part of the kernel do we need to enter? The scheduler. This is where the actual switching between tasks happens. But what if nested_count is not zero? The specs tell us to "do nothing". So we just return? Well, what got us here is the fact that the currently running task was performing a system call (or is a kernel thread), and the processor timer went off in the middle of it (If you look at system_call_helper() in interrupt.c you'll see that it is possible for this to happen.) Since we're in the middle of a system call, we want to let that system call finish. So in irq0_entry, the comment that says do system call actually goes right above the return label, because do we want to switch tasks if we're in the middle of a system call? This should help you determine the workflow of irq0_entry.
-
If interrupts are disabled, will
the kernel clock time_elapsed
continue to increment?
-
What does the incrementing of disable_count imply, what does the decrementing of disable_count imply? Recall that we do this to indicate that interrupts are on/off. When does this happen?
-
Use the provided macros in entry.S whenever you need to save a context, restore a stack etc. They are there to save you some work, and to make your code more readable and uniform.
-
For completing the semaphores, a notion that should help you avoid implementation mistakes is the idea that for every semaphore_up invocation, there should be a corresponding semaphore_down invocation. What this means is that, if there is an unbalance because there are more processes calling one or the other, your code should not allow race conditions to arise. Calling semaphore_init(sem, value) counts as value ups.
-
C functions are allowed to trash registers
%eax, %ecx, and
%edx. Keep this in mind when calling them from assembly.
Checklist
Here is a rough checklist to help you complete the three phases of the project:
-
Preemptive Scheduling
- implement irq0_entry
- yields the current process
- edit: entry.S and scheduler.c
-
Blocking Sleep
- Implement do_sleep and check_sleeping
- Compute some future time at which the process should unblock.
- Remove the current process from the ready queue, save it somewhere else
- Periodically check if these blocked processes can be added to the ready queue
- edit: scheduler.c
-
Synchronization Primitives
- Ensure all tests work both on Bochs and via USB.
Extra Credit
Prioritized Task Scheduling
For an additional 1 point of extra credit, change the scheduler algorithm so
that it allocates processor time proportionally to each task's priority.
If you choose to do this, you should come up with your own priority scheduling algorithm. We provide the functions do_getpriority and do_setpriority in scheduler.[ch] to help you with this.
Describe your algorithm in your README_EXTRA.
One way to test your implementation of priority scheduling
would be to modify test_preempt,
so that each task has a different priority. If you observe that
two counters are growing at different rates and the priority numbers and entry counts of
processes/threads are changing as you expect, your algorithm probably works.
You do not need to submit these test cases.
If you do attempt the extra credit, insert the line #define EC_PRIORITIES near the top of common.h. You may ignore any mentions of deadlock extra credit in this assignment.
When you have finished the extra credit, the test_all test should look something like this:
Program Submission
Don't forget to complete the general
requirements for each project! In particular, you need to
submit a README file, which concisely describes your design and
implementation, and what parts work and what parts you didn't get to work.
You don't have to repeat anything you presented in the design review.
If you implement the extra credit, describe your priority scheduling algorithm.
Submit via
Dropbox;
only one person per group should submit. When submitting the assignment, please turn in the files specified in the Dropbox, along with your readme (less than 500 words in text format).