The Performance Implications of Thread Management Alternatives for Shared-Memory Multiprocessors

From: Greg Green (ggreen_at_cs.washington.edu)
Date: Sun Jan 25 2004 - 21:58:39 PST

  • Next message: Richard Jackson: "Review: Thomas Anderson, et al. The Performance Implications of Thread Management Alternatives for Shared-Memory Multiprocessors."

    This paper studies five thread management alternatives on a
    shared-memory multiprocessor, a Sequent Symmetry Model A. The
    machine's write-through caches have a substantial effect on
    performance.

    The first management alternative is a single lock on the central
    data structures. This was the slowest, as all thread management
    operations are serialized through that one lock.
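
    A minimal sketch of this single-lock design (the class and method
    names are my own, not the paper's): every operation, whether it
    touches the ready queue or the free list, funnels through the same
    lock.

```python
import threading
from collections import deque

# Sketch of the first alternative: one lock serializes ALL access to
# the central ready queue and free list of thread control blocks.
class CentralLockScheduler:
    def __init__(self):
        self.lock = threading.Lock()        # the single central lock
        self.ready_queue = deque()          # central ready queue
        self.free_list = deque()            # recycled control blocks

    def create_thread(self, work):
        with self.lock:                     # every creation serializes here
            tcb = self.free_list.pop() if self.free_list else {}
            tcb["work"] = work
            self.ready_queue.append(tcb)

    def get_next_thread(self):
        with self.lock:                     # dispatch contends with creation
            return self.ready_queue.popleft() if self.ready_queue else None

    def finish_thread(self, tcb):
        with self.lock:                     # even recycling a block contends
            self.free_list.append(tcb)
```

    Because creation, dispatch, and teardown all take the same lock,
    adding processors cannot speed up thread management at all.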

    The second is to give each central data structure its own lock.
    This was faster than the first, but latency increased because
    managing a single thread requires acquiring multiple locks.
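
    A sketch of the per-structure-lock variant (names again my own):
    unrelated operations can now proceed in parallel, but a single
    thread creation pays for two lock acquisitions instead of one.

```python
import threading
from collections import deque

# Sketch of the second alternative: one lock per central data
# structure, so the ready queue and the free list contend separately.
class PerStructureLockScheduler:
    def __init__(self):
        self.ready_lock = threading.Lock()   # guards ready_queue only
        self.free_lock = threading.Lock()    # guards free_list only
        self.ready_queue = deque()
        self.free_list = deque()

    def create_thread(self, work):
        with self.free_lock:                 # first acquisition: get a block
            tcb = self.free_list.pop() if self.free_list else {}
        tcb["work"] = work
        with self.ready_lock:                # second acquisition: enqueue it
            self.ready_queue.append(tcb)

    def get_next_thread(self):
        with self.ready_lock:
            return self.ready_queue.popleft() if self.ready_queue else None

    def finish_thread(self, tcb):
        with self.free_lock:
            self.free_list.append(tcb)
```

    Throughput improves because a dispatcher and a finishing thread no
    longer block each other, but the two acquisitions on the creation
    path are exactly the added latency the review describes.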

    The third is to keep a free list of control blocks and stacks per
    processor, removing the need for locks on the common path. This
    choice is expensive in space for stacks. The per-processor free
    lists of control blocks must also be balanced among the processors
    for efficient use of space, and stacks must be managed in a central
    pool or the space required becomes impractical. This arrangement is
    faster than the first two and has the lowest latency, because there
    is very little locking. The final two methods have the highest
    throughput.
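
    A sketch of per-processor free lists with the balancing the review
    mentions (the threshold and the balancing policy here are my own
    illustration, not the paper's):

```python
from collections import deque

# Sketch of the third alternative: each processor keeps a private free
# list of control blocks, so the common-case alloc/free needs no lock.
class PerProcessorFreeLists:
    def __init__(self, nprocs, threshold=4):
        self.free_lists = [deque() for _ in range(nprocs)]
        self.threshold = threshold           # trigger for rebalancing

    def alloc(self, proc):
        lst = self.free_lists[proc]          # private to proc: no lock
        return lst.pop() if lst else {}      # allocate fresh if list empty

    def free(self, proc, tcb):
        self.free_lists[proc].append(tcb)    # private to proc: no lock
        if len(self.free_lists[proc]) > self.threshold:
            self._balance(proc)

    def _balance(self, proc):
        # Hand surplus blocks to the emptiest list so space is not
        # stranded on one processor. On a real machine this slow path
        # would need synchronization.
        target = min(range(len(self.free_lists)),
                     key=lambda i: len(self.free_lists[i]))
        if target == proc:
            return
        while len(self.free_lists[proc]) > self.threshold:
            self.free_lists[target].append(self.free_lists[proc].pop())
```

    The fast path touches only the caller's own list; locking is
    confined to the rare balancing step, which is why this variant has
    the lowest latency.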

    The fourth is to keep a central queue of idle processors. When a
    processor goes idle, it preallocates a control block and stack and
    then waits on a spin lock until a thread needs to be created. This
    method is faster when there are idle processors, but latency
    increases when there are more ready threads than processors.
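
    A sketch of the idle-processor queue (my own naming; an `Event`
    stands in for the spin lock the paper describes, since Python has
    no true busy-wait primitive): the idle processor pays the setup
    cost in advance, so creation is cheap while a processor is parked.

```python
import threading
from collections import deque

# Sketch of the fourth alternative: idle processors pre-build a control
# block and stack, then park on a central queue waiting for work.
class IdleProcessor:
    def __init__(self):
        self.tcb = {"stack": bytearray(64)}   # preallocated while idle
        self.work = None
        self.has_work = threading.Event()     # stand-in for the spin lock

idle_queue = deque()                          # central queue of idle procs
idle_lock = threading.Lock()                  # guards the central queue

def go_idle(proc):
    with idle_lock:
        idle_queue.append(proc)               # park on the central queue
    proc.has_work.wait()                      # a real kernel would spin here
    return proc.work

def create_thread(work):
    with idle_lock:
        proc = idle_queue.popleft() if idle_queue else None
    if proc is None:
        return False                          # no idle processor available
    proc.work = work                          # setup cost was already paid
    proc.has_work.set()                       # release the waiting processor
    return True
```

    When the idle queue is empty (more ready threads than processors),
    `create_thread` falls through and the creator must do the work
    itself, which is where the latency reappears.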

    The fifth is to keep a ready queue per processor along with the
    per-processor free lists. This case is difficult to balance because
    ready threads are a small fraction of all threads: one processor
    may have an empty queue while another has several threads ready to
    run, leaving the load imbalanced.
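
    A sketch of per-processor ready queues showing the imbalance
    problem (the scan of other queues is my own illustration of how an
    idle processor finds work):

```python
from collections import deque

# Sketch of the fifth alternative: each processor has its own ready
# queue, so the common enqueue/dequeue path needs no lock, but nothing
# keeps the queues balanced.
class PerProcessorReadyQueues:
    def __init__(self, nprocs):
        self.queues = [deque() for _ in range(nprocs)]

    def enqueue(self, proc, thread):
        self.queues[proc].append(thread)     # local queue: no lock

    def dequeue(self, proc):
        if self.queues[proc]:
            return self.queues[proc].popleft()
        # Local queue empty: scan the other processors' queues. This
        # is the imbalance case, and touching remote queues brings
        # back the synchronization the design tried to avoid.
        for q in self.queues:
            if q:
                return q.popleft()
        return None
```

    With few ready threads overall, the scan fires often, so the cost
    of imbalance dominates exactly when the system is lightly loaded.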

    I found the discussion of spin locks interesting; I wasn't familiar
    with that level of detail on cache invalidation. The discussion of
    ways to mitigate the cost of many threads waiting on a lock was
    also informative. There are many considerations here that I hadn't
    seen before.
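
    One of the mitigations in that discussion is spinning on a read of
    the lock word rather than on the atomic instruction itself, so
    waiters hit their cache instead of the bus. A rough sketch of that
    test-and-test-and-set idea (here `Lock.acquire(blocking=False)`
    stands in for the hardware test-and-set instruction):

```python
import threading

# Sketch of a test-and-test-and-set spin lock: spin on a cheap read of
# the lock word, and attempt the expensive atomic operation only when
# the lock looks free.
class TTASLock:
    def __init__(self):
        self._flag = threading.Lock()

    def acquire(self):
        while True:
            while self._flag.locked():      # spin on a read: on a real
                pass                        # machine this hits the cache
            if self._flag.acquire(blocking=False):
                return                      # atomic "test-and-set" won
            # lost the race to another waiter: go back to spinning

    def release(self):
        self._flag.release()
```

    On a write-through-cache machine like the Symmetry, the read-only
    spin matters even more, since every atomic attempt generates bus
    traffic that slows the lock holder.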

    The modeling of bus and lock contention was somewhat sparse; I
    would have liked to see more. I liked the paper, though it was
    quite dense and difficult to read.

    -- 
    Greg Green
    email: ggreen_at_u.washington.edu
    

    This archive was generated by hypermail 2.1.6 : Sun Jan 25 2004 - 21:57:33 PST