From: shearerje_at_comcast.net
Date: Sun Jan 25 2004 - 18:21:57 PST
“Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism” (Anderson et al., 1992) proposes that effective thread management requires support at both the user and kernel levels. Specifically, it addresses the dilemma that purely user-level implementations suffer badly when blocking kernel operations such as I/O are invoked, while kernel-level implementations have insufficient knowledge of user-level requirements to make good thread management decisions. The proposed hybrid approach has the kernel allocating processors to the application and letting the application itself (through a user-level library) manage the assignment of threads to those processors. The kernel and the application communicate with each other through “scheduler activations” to work out the details. For example, the application notifies the kernel when it wants n more processors or when it is done with a processor, and the kernel notifies the application when a thread has done something that blocks (such as an I/O call) so that the application can use the processor for another thread while waiting. The kernel can also unilaterally deallocate a processor from an application, but it first tells the application it is going to do so, giving the application an opportunity to complete any critical code section or reassign any thread that happens to be running on that processor at the time. Testing showed that the added complexity imposed a moderate throughput penalty (compared to FastThreads) for applications that don’t use kernel services, but under heavy memory load or heavy I/O the proposed hybrid outperformed both the FastThreads user-space solution and the Topaz kernel-space solution.
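For concreteness, here is a rough C-style sketch of that kernel/application interface as I read it. The names loosely follow the paper's upcall/downcall table, but the signatures and types are my own illustration rather than anything taken from the paper or from Topaz:

    typedef int processor_t;
    typedef int activation_t;
    struct machine_state;                      /* saved registers of the affected thread */

    /* Downcalls: application (user-level thread library) -> kernel. */
    void add_more_processors(int how_many);    /* "I have runnable threads waiting"   */
    void this_processor_is_idle(void);         /* "take this processor back"          */

    /* Upcalls: kernel -> application, each delivered on a fresh scheduler
     * activation so the user-level library can run its own scheduler. */
    void add_this_processor(processor_t p);
    void processor_has_been_preempted(activation_t preempted,
                                      struct machine_state *state);
    void activation_has_blocked(activation_t blocked);     /* e.g. thread entered an I/O call */
    void activation_has_unblocked(activation_t unblocked,
                                  struct machine_state *state);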
The paper went into great detail characterizing the problems with both user-level and kernel-level thread management. At the user level these have to do with the lack of global visibility (starving processors or blocking high-priority tasks) and contention for kernel resources (e.g., I/O). At the kernel level they include crossing the kernel protection boundary for operations that could be (and often eventually are) handled within one application's context. Kernel implementations also suffer from the need to be all things to all users, i.e., they become too complex and feature-laden. The worst-case scenario seems to be running a user-level and a separate kernel-level implementation at the same time; the paper briefly discussed the potential for nested spin-locks in this scenario.
The paper went into much less detail on the options and implications for when an application should request more processors or free up processors, leaving this as a simple trigger: transitioning from threads pending to processors idling or vice versa (see the sketch below). Since the application should have more knowledge about the nature of the pending threads (estimated duration, etc.), I think there is room here for improvement. The discussion of hanging onto a processor long enough to complete a critical section was, I think, profound enough to warrant more discussion than it received. Finally, I agree with the authors' lament that the test environment, the Firefly, did not support nearly enough processors to make a really convincing demonstration.
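To make that trigger concrete, here is a minimal sketch of the policy as I understand it, assuming the downcall names from the earlier sketch; the counters and hook names are my own illustration:

    static int ready_threads;          /* user-level threads waiting for a processor */

    /* Called by the library when a thread becomes runnable. */
    void thread_became_runnable(void) {
        if (++ready_threads == 1)
            add_more_processors(1);    /* first pending thread: ask the kernel for a processor */
    }

    /* Called by the library when a processor dequeues a thread to run. */
    void thread_dispatched(void) {
        --ready_threads;
    }

    /* Called by the library when a processor finds nothing to run. */
    void processor_went_idle(void) {
        if (ready_threads == 0)
            this_processor_is_idle();  /* idle processor, no pending threads: give it back */
    }

My point is that a decision this simple ignores everything the library could know about the pending threads, which is exactly the kind of knowledge the hybrid design is supposed to exploit.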