Project 3 - Fair-Share Scheduling

Out: Monday February 11th, 2002
Part 1 Due: Friday-Sunday, February 15-17, 2002
Part 2 Due: Monday, February 25, 2002

Note that Monday, February 18, is a holiday, and midterm 2 is scheduled for Friday, February 22.

Assignment Goals

Assignment Overview

As with many schedulers, one of the key concepts in the Linux scheduler is round-robin (RR) scheduling.  Part of the point of RR scheduling is fairness.  In particular, there is fairness among processes, in the sense that each process (of a given priority) receives an equal allocation of CPU time over an appropriately measured interval. One problem with RR scheduling is that it provides per-process fairness, not per-user fairness.  What this means is that a user who starts many processes will receive, in total, a much larger fraction of system resources than will a user who has only a few processes running.

Fair-share schedulers address this problem by using the RR mechanism, but with the goal of providing equal allocation per user (that is, summing over all processes being run for that user), not per process.  As an example, if user A has four times as many processes running as user B, each of user A's processes would receive only one fourth of the allocation (per second, say) given to each of user B's processes, so that users A and B split the CPU resource equally.  Under the default scheduler, user A would receive 80% of the CPU.
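
The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch, assuming every process is CPU-bound and has the same priority; the function name cpu_share_per_user is made up for this example and is not part of the assignment utilities.

```python
from collections import Counter

def cpu_share_per_user(process_owners, fair_share):
    """Return each user's CPU fraction, assuming every process is CPU-bound."""
    counts = Counter(process_owners)
    if fair_share:
        # Fair-share goal: equal split per user, regardless of process count.
        return {user: 1.0 / len(counts) for user in counts}
    # Per-process round-robin: each process gets 1/N, so a user gets n_user/N.
    total = len(process_owners)
    return {user: n / total for user, n in counts.items()}

owners = ["A"] * 4 + ["B"]          # user A runs four processes, user B runs one
print(cpu_share_per_user(owners, fair_share=False))  # {'A': 0.8, 'B': 0.2}
print(cpu_share_per_user(owners, fair_share=True))   # {'A': 0.5, 'B': 0.5}
```

Under fair-share, each of A's four processes ends up with 0.5 / 4 = 0.125 of the CPU, one fourth of what B's single process receives.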

This assignment requires you to design, implement, and evaluate a fair-share scheduler in Linux.

Complications:

The example above assumes that all processes are in tight CPU loops.  That isn't the typical case.  Instead, processes perform IO and other blocking operations.  This complicates the definition of "fair share".
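
To see why, suppose user B's processes block so often that they could use only 20% of the CPU even if run alone. One possible reading of "fair share" (an assumption made here purely for illustration, not a requirement of the assignment) is max-min fairness: no user receives more than it can use, and the leftover is split equally among the users that want more. A minimal sketch, with the hypothetical name max_min_shares:

```python
def max_min_shares(demands):
    """Split one CPU among users so that no user gets more than it demands.

    demands: dict user -> fraction of the CPU the user's processes could use
             if run alone (1.0 means fully CPU-bound).
    """
    shares = {}
    remaining = 1.0
    pending = dict(demands)
    while pending:
        equal = remaining / len(pending)
        # Users wanting no more than the current equal split are fully satisfied.
        satisfied = {u: d for u, d in pending.items() if d <= equal}
        if not satisfied:
            # Everyone left wants more than an equal split: divide what remains.
            for u in pending:
                shares[u] = equal
            break
        for u, d in satisfied.items():
            shares[u] = d
            remaining -= d
            del pending[u]
    return shares

print(max_min_shares({"A": 1.0, "B": 0.2}))
```

Here B receives the 20% it can use and A receives the remaining 80%, which no longer looks like an "equal split" even though neither user is being shortchanged. Other definitions are equally defensible.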

Part of this assignment is for you to decide just what the objectives of your scheduler are.  The objectives are distinct from the mechanisms used to actually perform the scheduling.  Typically, the mechanisms do not achieve the objectives, except in extreme cases (e.g., all running processes are in tight CPU loops).  In fact, typically it's difficult or impossible to give a convincing argument about just what "should happen" for a given set of real processes.

If it were me doing this assignment, I'd try to preserve the flavor of the basic mechanisms used in the current Linux scheduler, subject to the change to fair-share scheduling, on the theory (which is true) that the current scheduler represents a survival of the fittest evolution of the algorithm based on observations of its performance in actual use over a number of years.

In any case, though, you're free to make whatever decisions you think are appropriate, but you should try to be complete in your write-up about what those decisions are.

Background: Linux Scheduling

Linux actually contains three scheduling policies, one for "general use" and two for a kind of real-time processing. We'll be concerned with the general use policy, which is a descendant of the multi-level feedback approach discussed in the book. However, the other two have some penetration into the code you'll end up modifying, so you need to understand at least a tiny bit about them.

Quoting from "man sched_setscheduler":

SCHED_OTHER is the default universal time-sharing scheduler policy used by most processes, SCHED_FIFO and SCHED_RR are intended for special time-critical applications that need precise control over the way in which runnable processes are selected for execution. Processes scheduled with SCHED_OTHER must be assigned the static priority 0, processes scheduled under SCHED_FIFO or SCHED_RR can have a static priority in the range 1 to 99. Only processes with superuser privileges can get a static priority higher than 0 and can therefore be scheduled under SCHED_FIFO or SCHED_RR. The system calls sched_get_priority_min and sched_get_priority_max can be used to find out the valid priority range for a scheduling policy in a portable way on all POSIX.1b conforming systems.

and:

SCHED_OTHER: Default Linux time-sharing scheduling.

SCHED_OTHER can only be used at static priority 0. SCHED_OTHER is the standard Linux time-sharing scheduler that is intended for all processes that do not require special static priority real-time mechanisms. The process to run is chosen from the static priority 0 list based on a dynamic priority that is determined only inside this list. The dynamic priority is based on the nice level (set by the nice or setpriority system call) and increased for each time quantum the process is ready to run, but denied to run by the scheduler. This ensures fair progress among all SCHED_OTHER processes.
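
The priority ranges mentioned in the first excerpt can be queried from user level. The sketch below uses Python's os module wrappers around sched_get_priority_min and sched_get_priority_max; it assumes a Linux host with a reasonably recent Python, and the helper name priority_range is made up for this example.

```python
import os

def priority_range(policy_name):
    """Return (min, max) static priority for a policy, or None if unavailable here."""
    policy = getattr(os, policy_name, None)
    if policy is None or not hasattr(os, "sched_get_priority_min"):
        return None  # platform does not expose this policy or these calls
    return (os.sched_get_priority_min(policy), os.sched_get_priority_max(policy))

for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR"):
    print(name, priority_range(name))
```

On a system that matches the man page excerpt, one would expect SCHED_OTHER to report a range containing only 0 and the two real-time policies to report 1 to 99.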

Note that these describe the logical operation of the scheduler;  the actual implementation might (and probably will) not be quite what you'd expect from reading these descriptions.
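
As a concrete picture of the logical description only, here is a toy model in which a process's dynamic priority starts from its nice level and rises by one for each quantum it is ready but not chosen. The formula, the names pick_next and simulate, and the reset-on-run rule are all assumptions made for illustration; this is not how the 2.4 kernel computes priorities.

```python
def pick_next(ready, waited):
    """Pick the ready process with the highest toy dynamic priority.

    ready:  dict name -> nice value (lower nice means higher base priority)
    waited: dict name -> quanta spent ready but not chosen since last run
    """
    # Toy dynamic priority: base from nice, plus one point per quantum waited.
    return max(ready, key=lambda name: -ready[name] + waited[name])

def simulate(ready, quanta):
    """Return the list of process names chosen in each quantum."""
    waited = {name: 0 for name in ready}
    schedule = []
    for _ in range(quanta):
        chosen = pick_next(ready, waited)
        schedule.append(chosen)
        for name in ready:
            # The chosen process loses its accumulated bonus; the others age.
            waited[name] = 0 if name == chosen else waited[name] + 1
    return schedule

print(simulate({"hi": 0, "lo": 5}, 14))
```

In this toy model the low-priority process still runs periodically, which is the "fair progress" property the man page describes.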

You might want to take a look at this link before you dive too deeply into this assignment. Note that this link describes the Linux 2.2 kernel, and we're working with the 2.4 kernel in this course. So, some things have changed (in particular, multiprocessor scheduling has been simplified).

Background: Fair-Share Scheduling

There are many approaches to fair-share scheduling in the literature. Among the best known is Lottery Scheduling. A variation by the same authors that is easier to implement and has better accuracy is Stride Scheduling. (The paper is available through the "Cached:" links in the upper right corner of that page.) Le Moal et al. present another look at stride scheduling and, particularly, its implementation with priorities.
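
For readers who skip the papers, the core idea of basic stride scheduling fits in a few lines: each client holds tickets, its stride is inversely proportional to its ticket count, and the client with the smallest accumulated "pass" value runs next. The sketch below is a user-level illustration only; the constant STRIDE1, the function name stride_schedule, and the tie-breaking rule are choices made for this example, and it ignores clients joining, leaving, or blocking.

```python
STRIDE1 = 1 << 20  # large constant so integer strides keep reasonable precision

def stride_schedule(tickets, quanta):
    """Return the order in which clients run under basic stride scheduling.

    tickets: dict client -> ticket count (relative share of the CPU)
    """
    stride = {c: STRIDE1 // t for c, t in tickets.items()}
    # Each client starts with its pass equal to its stride.
    passes = dict(stride)
    order = []
    for _ in range(quanta):
        # Run the client with the smallest pass; break ties by name.
        client = min(passes, key=lambda c: (passes[c], c))
        order.append(client)
        passes[client] += stride[client]
    return order

order = stride_schedule({"A": 3, "B": 1}, 8)
print(order.count("A"), order.count("B"))  # 6 2
```

For fair-share purposes one could imagine giving each user the same number of tickets and dividing them among that user's processes, but that is only one of many possible designs.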

Note that all these papers contain a section that evaluates the effectiveness of their approaches. You might find these sections useful in forming ideas for how to validate your implementation, and as a yardstick to measure your results against, whether or not you adopt the particular discipline advocated in the paper.
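
One simple yardstick, sketched here as an assumption rather than as the method of any of these papers, is the difference between the CPU share each user actually received and the share your objectives say it should have received. The function name share_error and the default target (an equal split among users) are hypothetical.

```python
def share_error(cpu_seconds_by_user, target_share=None):
    """Return per-user (observed share - target share) and the worst absolute error.

    cpu_seconds_by_user: dict user -> measured CPU seconds over the experiment
    target_share:        dict user -> intended fraction (defaults to an equal split)
    """
    total = sum(cpu_seconds_by_user.values())
    users = list(cpu_seconds_by_user)
    if target_share is None:
        target_share = {u: 1.0 / len(users) for u in users}
    errors = {u: cpu_seconds_by_user[u] / total - target_share[u] for u in users}
    return errors, max(abs(e) for e in errors.values())

errors, worst = share_error({"A": 8.0, "B": 2.0})
print(errors, worst)
```

With the measurements above (what the default scheduler would give in the four-to-one example), user A is about 0.3 over an equal split and user B about 0.3 under it.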

You do NOT have to read these papers. I think it will be easier to complete the assignment if you read the evaluation section of at least one, but even that is not required if you think you can develop a reasonable evaluation plan on your own.

Please note that, as always, the goal of this assignment is not the most effective artifact (scheduler) possible, but rather the process you'll go through to produce it. That process includes making reasonable design choices about what to implement (which should include understanding the current Linux scheduler and estimating how to alter it, and how difficult that will be), then doing the implementation, and then evaluating it. You could do a full-credit job of each of those steps and still end up with a scheduler that isn't very good at fair-share, because some unforeseeable problem arises when you actually implement.

Utilities

Before you begin the assignment, log into spinlock or coredump, and execute the following from your /cse451/username directory:
tar -xvzf /cse451/zahorjan/public/fairshare.tar.gz
The tar file contains the source code and binaries of a number of utilities that you will find useful during the course of this project. The tar command will create a directory called fairshare/, and the utilities will be extracted into it. The utilities are intended to make it easier to start up a batch of processes running under different user IDs. A description of the utilities is given here.

Implementation

Warning: it should be obvious that if you can find an implementation of fair-share scheduling for Linux on the web, we can find that same implementation. You should do this project without referring to an existing implementation of fair-share scheduling.

Linux Source and Configuration

The most directly relevant code is in .../kernel/sched.c.  It is certain that you will need to look at other source files as well.

You should make decisions appropriate to execution on multi-processors when writing your code.  Of course, our (virtual) machines are not multi-processors, so this is a bit of pretending.  We MAY have an actual multi-processor available, on loan, for testing by those who are interested.  However, the bottom line is that you should write code that would be sensible for execution on multi-processors, but your code has to be tested only on uni-processors.  This means that you must worry about race conditions.  However, I don't believe that you need to create any new policy or mechanism specific to multiprocessor scheduling.

Recommended Procedure

It is very inconvenient to have a hard bug in the scheduler - the machine simply won't run your copy of the kernel, and you'll spend days rebooting VMware.

For that reason, I recommend that you (a) think hard about your code before you recompile and install, and (b) make modifications incrementally, not all at once.  If the machine fails to boot, or to run, it's a good guess that the last set of changes you made is the problem.  (It's not certain, of course.)  Making sure that set of changes is small can help find the problem. Additionally, you might find that instrumenting the code (printk's, or similar sorts of things) helps you, and even that building some additional tools (e.g., a new system call, and a user-level application that invokes it) is worth the effort.

Part 1: Understand the Linux Scheduler and Design your Fair-Share Scheduler (Due by e-mail 2/15-17)
This first task is simply to read and understand the source code to the existing Linux scheduler, to verify that you've understood it by running some simple tests using the assignment utilities, to design (on paper) how you want your fair-share scheduler to work, and to design the experiments you will run to evaluate your implementation.

WHAT TO TURN IN FOR PART 1

As part of your writeup for this assignment, you need to answer the following questions:

  1. What does the current Linux scheduler do? Make sure you address all of the following questions:
    • What is the current scheduling mechanism?  Be specific and thorough.  (It should be possible to re-implement the behavior of the current scheduler from your description, which should be a kind of specification.)

    • Is starvation possible with the current mechanism?  If so, give an example of how it might occur.
    • Is there aging?  That is, are the priorities of processes that have low recent CPU consumption raised to avoid effective starvation?

    • Are IO bound processes (those whose ratio of IO to CPU use is high) given any kind of favoritism?

    • Suppose that all processes on the system are scheduled using SCHED_OTHER, and that all are in tight CPU loops.  Give an expression that indicates the fraction of CPU time that will be allocated to a single process, as a function of its base ("nice") priority.  (Note: The answer to this depends on at least one Linux configuration parameter.)

    • Verify your equation through measurements on a VMware machine. The assignment utilities can help with this.

  2. What are the objectives of your fair-share scheduler? Make sure you tell us how you want to allocate CPU among processes (a) that all belong to a single user, and (separately) (b) that are not in tight CPU loops. That is, tell us what you want "fair-share" to mean.  It is okay, expected even, that some of these questions can be answered best by referring to the implementation (just as exactly what the current scheduler does can be defined fully only by its implementation).

  3. What are your plans to modify the existing scheduler implementation to achieve fair-share scheduling? Your answer to this question should be "more than a specification":  it should be a specification (sufficient for some other trained person to implement from), plus it should identify specific source files to be modified.

  4. How will you evaluate your modified scheduler? Your evaluation should test whether each of the objectives you set for your scheduler is met, within the bounds of what is realistically possible given the time allotted for this assignment.
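
For the measurements in question 1 and the evaluation in question 4, a small user-level harness may be a useful starting point alongside the assignment utilities. The sketch below is an assumption-laden illustration, not one of the provided utilities: it forks one CPU-bound child per nice level, pins the children to a single CPU where the platform allows it (so that a multi-core host behaves more like our uniprocessor VMs), and reports each child's fraction of the total CPU time consumed. The names spin and measure_shares are made up, and the children start a few moments apart, so short runs will be noisy.

```python
import os
import time

def spin(seconds):
    """Busy-loop for `seconds` of wall-clock time; return CPU seconds consumed."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass
    return time.process_time()

def measure_shares(nice_levels, seconds=0.5):
    """Fork one CPU-bound child per nice level; return each child's CPU-time share."""
    can_pin = hasattr(os, "sched_setaffinity")
    cpu = min(os.sched_getaffinity(0)) if can_pin else None
    children = []
    for nice in nice_levels:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:                          # child process
            status = 1
            try:
                os.close(read_fd)
                if can_pin:
                    os.sched_setaffinity(0, {cpu})   # share one CPU with siblings
                os.nice(nice)                 # raising niceness needs no privilege
                os.write(write_fd, repr(spin(seconds)).encode())
                status = 0
            finally:
                os._exit(status)              # never fall through into parent code
        os.close(write_fd)
        children.append((pid, read_fd))
    cpu_times = []
    for pid, read_fd in children:
        cpu_times.append(float(os.read(read_fd, 64)))
        os.close(read_fd)
        os.waitpid(pid, 0)
    total = sum(cpu_times)
    return [t / total for t in cpu_times]

print(measure_shares([0, 10], seconds=0.5))
```

To compare users rather than processes, the children would also need to run under different user IDs, which is what the assignment utilities are intended to help with.
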
Our Hopes and Dreams: There are two reasons to have an early Part 1 due date. One is to make sure you get started early enough to complete Part 2 on time. The other is so that we can provide feedback on your design before you head into the implementation phase.

This means that we'll try to get feedback to you as quickly as humanly possible. We hope you'll have some understanding if it takes a few days to respond. In return, the due date for this part is "any time between Friday and Sunday."

To expedite this process, you will e-mail your Part 1 write-ups to us. Details will be posted to the class mailing list early this week.

Part 2: Implement and Evaluate Your Fair-Share Scheduler (Due 2/25)
Implement your fair-share scheduler, as you designed it in Part 1, possibly modified in response to our comments.

WHAT TO TURN IN FOR PART 2: