From: David Coleman (David.Coleman_at_roxio.com)
Date: Sat Jan 24 2004 - 18:10:58 PST
This article presents several alternatives for managing threads on
shared-memory multiprocessor systems. It describes different approaches
a thread library can use to create and start threads on available
processors, and it compares five designs for the data structures needed
for thread management. Measurements of the various approaches are
given, with a clear winner for larger numbers of processors. Different
approaches to implementing spinlocks in shared-memory environments are
also compared, again with empirical data and a clear winner for larger
numbers of processors.
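To make the spinlock discussion concrete, here is a rough sketch of my
own (not code from the article) of one commonly discussed refinement:
spinning on an ordinary read before attempting the atomic test-and-set,
written with GCC's __sync builtins. It only illustrates the idea; the
article's own variants and measurements are what matter.

    /* Sketch only: a "spin on read" (test-and-test-and-set) lock.
     * The busy-wait uses an ordinary load, so the atomic operation
     * (and its bus traffic) happens only when the lock looks free. */
    typedef volatile int spinlock_t;       /* 0 = free, 1 = held */

    static void spin_acquire(spinlock_t *lock)
    {
        for (;;) {
            while (*lock != 0)
                ;                           /* spin locally in cache */
            if (__sync_lock_test_and_set(lock, 1) == 0)
                return;                     /* won the lock */
        }
    }

    static void spin_release(spinlock_t *lock)
    {
        __sync_lock_release(lock);          /* store 0, release barrier */
    }

Spinning on the read keeps the waiting processor inside its own cache,
so the shared bus is disturbed only when the lock actually appears free.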
The thread library described uses free processors to create and start
threads, so the library is itself essentially a multithreaded
application. As such, its design for minimizing bottlenecks in thread
creation and startup applies to multithreaded, shared-memory
applications in general. It should be remembered, however, that the
number of instructions needed to implement these operations is small,
and the measurements for the different designs might differ for larger
applications.
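As an illustration of why the library's internal data structures
matter, here is a rough sketch (names and layout are mine, not the
paper's) of a per-processor free list of thread control blocks, which
keeps the common allocation path off any shared lock during thread
creation.

    /* Sketch only: per-processor free lists of thread control blocks.
     * Allocation normally touches only the local list; a locked global
     * pool (not shown) is the fallback when the local list runs dry. */
    #include <stddef.h>

    typedef struct tcb {
        struct tcb *next;
        /* stack pointer, start routine, argument, ... */
    } tcb_t;

    #define CACHE_LINE 64    /* assumed line size */
    #define NCPU       16

    typedef struct {
        tcb_t *free_list;                         /* local to one CPU */
        char   pad[CACHE_LINE - sizeof(tcb_t *)]; /* separate lines  */
    } per_cpu_t;

    static per_cpu_t cpu[NCPU];

    static tcb_t *alloc_tcb(int self)
    {
        tcb_t *t = cpu[self].free_list;
        if (t != NULL) {
            cpu[self].free_list = t->next;  /* common case: no lock */
            return t;
        }
        return NULL;   /* fall back to a locked global pool */
    }

    static void free_tcb(int self, tcb_t *t)
    {
        t->next = cpu[self].free_list;      /* return to local list */
        cpu[self].free_list = t;
    }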
The thread model presented seems to indicate that threads either run
until they relinquish the processor or run to completion; they don’t
appear to be interruptible. A statement from the introduction supports
this: “No locking is needed inside thread routines, since only one
routine can be executing at any one time.” The code presented in Table
III also seems to indicate this.
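My reading of that model, sketched below with invented names, is a
dispatch loop in which an idle processor pulls a thread off a ready
queue and runs its routine until it returns or voluntarily yields.

    /* Sketch of my reading of the model; names are mine, not the
     * paper's.  Each processor loops, pulling a runnable thread off
     * a ready queue and running it to completion (or until it
     * voluntarily yields). */
    typedef struct thread {
        void (*run)(void *);    /* the thread routine */
        void *arg;
    } thread_t;

    /* Assumed helper: removes a runnable thread from a (locked or
     * per-processor) ready queue, or returns NULL if none is ready. */
    extern thread_t *dequeue_ready(void);

    static void processor_loop(void)
    {
        for (;;) {
            thread_t *t = dequeue_ready();
            if (t == NULL)
                continue;       /* idle: keep searching for work */
            t->run(t->arg);     /* runs until it returns or yields */
        }
    }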
It took me a little time to get my mind around two things: that threads
are not preemptively interrupted, and that we are operating from the
perspective of the thread library's implementation. The discussion of
processors searching for threads caught me a little by surprise.
Nevertheless, this is a very interesting discussion of thread
contention for shared resources. I attended a performance talk at
Microsoft describing some of these approaches to partitioning problems
to avoid bottlenecks and bus thrashing (there are issues with different
variables sharing a cache line, so that a write invalidates the whole
line even though separate memory addresses are used for per-thread
flags).
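That cache-line problem is usually called false sharing. A common
remedy, sketched below with an assumed 64-byte line size, is to pad
each per-thread flag out to its own cache line.

    /* Sketch only: padding per-thread flags to avoid false sharing.
     * 64 bytes is an assumed cache-line size; the right value is
     * machine dependent. */
    #define CACHE_LINE 64
    #define NTHREADS   8

    struct padded_flag {
        volatile int flag;
        char pad[CACHE_LINE - sizeof(int)];  /* one flag per line */
    };

    static struct padded_flag flags[NTHREADS];

    /* Thread i writes only flags[i].flag, so the write no longer
     * invalidates the cache line holding any other thread's flag. */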