From: Justin Voskuhl (justinv_at_microsoft.com)
Date: Mon Jan 26 2004 - 14:02:17 PST
This paper presents an interface between the kernel and a user-mode
threads package that provides the benefits of kernel-mode threads with
the flexibility of a user-mode threads implementation. The abstraction
that is provided to the threads package is that of a "virtual
multiprocessor" whereby each process gets to see which processors in the
system have been allocated to it. The processors it is allocated can
change dynamically over time as the kernel sees fit.
The paper identifies two primary drawbacks of kernel threads:
* Thread primitives are very expensive in time, because you have
to transition across a protection boundary to access them.
* The kernel decides the semantics of the threads, leaving
applications unable to customize this behavior if they should need to do
so.
The problem with user-mode threads packages is that they do not
integrate with the kernel: when one user-mode thread blocks for I/O,
the virtual processor becomes underutilized. Furthermore, user-mode
threads packages built on top of kernel threads can exhibit correctness
problems when they rely on all threads making finite progress. If a
thread becomes blocked, this assumption can be invalidated and the
user-mode thread system can deadlock.
The paper then describes the interface from the kernel to a user-mode
threads package. It defines four "upcall" points: Add a Processor,
Processor Preempted, Scheduler Activation Blocked, and Scheduler
Activation Unblocked. The interface from the user-mode threads package
to the kernel includes two notifications: Add More Processors and
Processor Idle. These two interfaces allow the user-mode process to
maintain whatever threading policy it likes while allowing the kernel to
deal primarily with the protection-related aspects of threads.
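To make the shape of that interface concrete, here is a minimal sketch
in Python. It is only an illustration under stated assumptions, not the
paper's implementation: the class names (UserThreadPackage, ToyKernel),
the method names, and the FIFO policy are all hypothetical, and the real
mechanism (where each upcall arrives on a fresh activation) is
simplified down to plain method calls.

```python
from collections import deque


class UserThreadPackage:
    """Hypothetical user-mode threads package with a FIFO policy."""

    def __init__(self):
        self.processors = 0    # processors the kernel has allocated to us
        self.ready = deque()   # runnable user-level threads
        self.running = set()   # threads currently on a processor
        self.blocked = set()   # threads blocked inside the kernel

    def _dispatch(self):
        # Fill every free processor with a ready thread, if there is one.
        while self.ready and len(self.running) < self.processors:
            self.running.add(self.ready.popleft())

    # --- upcalls: invoked by the kernel ---
    def add_processor(self):
        self.processors += 1
        self._dispatch()

    def processor_preempted(self, thread):
        # The kernel took a processor away; requeue the thread it ran.
        self.processors -= 1
        self.running.discard(thread)
        self.ready.appendleft(thread)
        self._dispatch()

    def activation_blocked(self, thread):
        # The thread blocked in the kernel; its processor can run another.
        self.running.discard(thread)
        self.blocked.add(thread)
        self._dispatch()

    def activation_unblocked(self, thread):
        # The thread may continue; it waits in the ready queue for a slot.
        self.blocked.discard(thread)
        self.ready.append(thread)
        self._dispatch()


class ToyKernel:
    """Hypothetical kernel side; it only tracks how many processors are free."""

    def __init__(self, total_processors):
        self.free = total_processors

    # --- notifications: invoked by the threads package ---
    def add_more_processors(self, package, wanted):
        # Grant as many as are free, one Add a Processor upcall each.
        granted = min(wanted, self.free)
        self.free -= granted
        for _ in range(granted):
            package.add_processor()
        return granted

    def processor_idle(self, package):
        # The package has a processor it cannot use; take it back.
        package.processors -= 1
        self.free += 1
```

The point of the split is visible even in this toy: the scheduling
policy lives entirely in _dispatch on the user side, and the kernel side
only decides how many processors the process holds.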
The performance of the prototype is not exceptional; however, if the
goal is additional flexibility, this performance hit may be acceptable.
It's also noted that the systems being compared have different levels of
tuning: one is written in hand-tuned assembler, the other in Modula-2+.
It's interesting how this compares to completion ports in Windows. In
some ways completion ports cover the most common case you would want to
use Scheduler Activations for. With user-level threading I would imagine
it would be easier to integrate parallelism into programming languages
more tightly. Because threads are so expensive, they can't be used in
many situations where you would expect to be able to apply them.
Instead, most applications create a handful of long-lived threads.
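As a rough illustration of that "handful of long-lived threads" pattern,
here is a small sketch using Python's standard ThreadPoolExecutor. The
names handle_request and run_with_pool and the worker count of 4 are
made up for the example; nothing here is specific to Windows or to the
paper.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(n):
    # Placeholder work item (hypothetical); a real server would do I/O here.
    return n * n


def run_with_pool(items, workers=4):
    # A few long-lived worker threads service many short tasks, so the
    # thread creation cost is paid at most `workers` times rather than
    # once per item.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, items))
```

The tasks are multiplexed onto the fixed set of workers, which is the
workaround applications reach for when a thread per task is too costly.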