From: Brian Milnes (brianmilnes_at_qwest.net)
Date: Mon Jan 26 2004 - 15:13:34 PST
Lottery Scheduling: Flexible Proportional-Share Resource Management -
Waldspurger and Weihl
The authors argue for more flexibility in schedulers. Fair-share and
microeconomic schedulers provide better control than Unix's decay-usage
priority scheduler, but they are complex and slow to adapt. The authors have
developed a lottery scheduler which uses randomization to provide each client
a proportional share of the CPU. It works well for compute-bound and
interactive jobs and can be generalized to I/O bandwidth, memory and lock
access.
The idea is that competing processes hold lottery tickets, and a lottery is
held to decide who gets the resource. The algorithm is probabilistically fair:
the expected number of wins in n lotteries is np, where p is the proportion of
the tickets held, and the expected number of lotteries until a client first
wins is 1/p. With a scheduling quantum of 10 ms, reasonable fairness can
emerge in under a second.
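The basic lottery can be sketched in a few lines. This is our own minimal
simulation, not the paper's code; the client names and ticket counts are
made up to illustrate that wins converge to np:

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

def hold_lottery(tickets):
    """Draw a number in [1, total] and walk the clients, accumulating
    ticket counts, until the running sum covers the draw."""
    total = sum(tickets.values())
    draw = random.randint(1, total)
    for client, count in tickets.items():
        draw -= count
        if draw <= 0:
            return client

# Two clients holding tickets in a 3:1 ratio (hypothetical allocation).
tickets = {"A": 75, "B": 25}
wins = {"A": 0, "B": 0}
rounds = 10_000
for _ in range(rounds):
    wins[hold_lottery(tickets)] += 1

# E[wins] = n * p, so A should win about 7500 of the 10000 rounds.
print(wins)
```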
Tickets can be transferred; for example, a client blocked on an RPC
automatically lends its tickets to the server for the duration of the call.
Tickets can also be inflated by allowing trusted clients to create more of
them. A process that does not use its full quantum is issued compensation
tickets that boost its share until it is scheduled again. This keeps
allocations proportional and lets I/O-bound threads start quickly after
blocking.
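The compensation idea is that a thread which consumed only a fraction f of
its quantum competes with its tickets scaled by 1/f until it next wins. A
sketch under that reading (the function name and 10 ms quantum constant are
our own):

```python
def effective_tickets(base, ms_used, quantum_ms=10):
    """Scale a thread's tickets by 1/f, where f is the fraction of its
    10 ms quantum it actually consumed, so short-running I/O-bound
    threads keep their proportional CPU share."""
    if ms_used <= 0 or ms_used >= quantum_ms:
        return base  # used the whole quantum: no compensation
    f = ms_used / quantum_ms
    return round(base / f)

# A thread holding 100 tickets that ran only 2 ms of its quantum
# competes with 100 / 0.2 = 500 effective tickets until it wins again.
print(effective_tickets(100, 2))   # → 500
print(effective_tickets(100, 10))  # → 100
```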
They implemented this under the Mach 3.0 microkernel, using a Park-Miller
pseudo-random number generator that takes about ten instructions. The
implementation draws a random number between 1 and the sum of the outstanding
tickets, then traverses a list of clients accumulating ticket counts until
the running sum covers the draw; a tree reduces this to O(log n) for larger
numbers of clients.
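The Park-Miller "minimal standard" generator they mention is just a modular
multiplication. A sketch from the published recurrence (the generator
structure here is ours, not the paper's kernel code):

```python
# Park-Miller minimal-standard PRNG: x' = (16807 * x) mod (2**31 - 1).
M = 2**31 - 1  # the Mersenne prime 2147483647
A = 16807      # 7**5

def park_miller(seed):
    """Yield an endless stream of pseudo-random integers in [1, M-1]."""
    x = seed
    while True:
        x = (A * x) % M
        yield x

gen = park_miller(1)
print(next(gen))  # → 16807
```

A well-known check on this generator: starting from seed 1, the 10,000th
value produced is 1043618065.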
They benchmarked their system on a DECstation running Mach 3.0. The
scheduler is quite fair for ticket ratios of up to 20:1 between two
processes running Dhrystone, and it also tracks dynamically changing ticket
allocations. In a Mach client-server example they could produce an 8:3:1
client ratio fairly accurately, with nearly matching per-call latencies whose
standard deviation was under 7% of the mean. They could also tune three MPEG
players to control their relative frame rates. The cost of their unoptimized
scheduler was within a few percent of Mach's scheduler.
The most interesting thing that they can do with this is to provide
per-process control of synchronization waits. The standard problem with
processes of different priorities waiting on a lock is that they all compete
with equal priority. In their scheme, the threads blocked on a mutex fund
the lock holder with their tickets, and when the holder releases the lock it
runs a lottery among the waiters to determine which one acquires the mutex.
This way no high-ticket thread is starved on a resource while many
low-ticket threads consume it. An inverse lottery can be used for negative
decisions, such as which process should give up a needed page of physical
memory.
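The release step can be sketched as a lottery over the wait queue. This is
our own illustration of the idea, not the paper's Mach implementation; the
thread names and ticket counts are hypothetical:

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

def release_and_pick(waiters):
    """On unlock, hold a lottery among the blocked threads, weighted by
    the tickets each waiter holds, to choose the next lock holder."""
    total = sum(t for _, t in waiters)
    draw = random.randint(1, total)
    for thread, t in waiters:
        draw -= t
        if draw <= 0:
            return thread

# One high-ticket waiter competing with two low-ticket waiters.
waiters = [("high", 400), ("low1", 50), ("low2", 50)]

# With 400 of 500 tickets, "high" should win roughly 80% of releases,
# so it is not starved behind a crowd of low-ticket threads.
wins = sum(release_and_pick(waiters) == "high" for _ in range(1000))
print(wins)
```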
One significant shortcoming of the paper is that the authors did not
benchmark their lottery-scheduled mutexes.
This archive was generated by hypermail 2.1.6 : Mon Jan 26 2004 - 15:13:43 PST