The Performance Implications of Thread Management Alternatives for Shared-Memory Multiprocessors

From: Greg Green (ggreen_at_cs.washington.edu)
Date: Sun Jan 25 2004 - 21:58:39 PST

  • Next message: Richard Jackson: "Review: Thomas Anderson, et al. The Performance Implications of Thread Management Alternatives for Shared-Memory Multiprocessors."

    This paper studies five thread management alternatives on a
    shared-memory multiprocessor, a Sequent Symmetry Model A. The
    machine's write-through caches have a substantial effect on
    performance.

    The first management alternative is a single lock on the central
    data structures. This was the slowest, as all thread management
    operations are serialized through that one lock.
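
    A minimal sketch of this single-lock design (the class and method
    names are my own, not the paper's): every operation, whether it
    touches the ready queue or the free list, funnels through the same
    lock.

```python
import threading
from collections import deque

# Sketch of the first alternative: one lock serializes ALL access to
# the central ready queue and free list of thread control blocks.
class CentralLockScheduler:
    def __init__(self):
        self.lock = threading.Lock()        # the single central lock
        self.ready_queue = deque()          # central ready queue
        self.free_list = deque()            # recycled control blocks

    def create_thread(self, work):
        with self.lock:                     # every creation serializes here
            tcb = self.free_list.pop() if self.free_list else {}
            tcb["work"] = work
            self.ready_queue.append(tcb)

    def get_next_thread(self):
        with self.lock:                     # dispatch contends with creation
            return self.ready_queue.popleft() if self.ready_queue else None

    def finish_thread(self, tcb):
        with self.lock:                     # even recycling a block contends
            self.free_list.append(tcb)
```

    Because creation, dispatch, and teardown all take the same lock,
    adding processors cannot speed up thread management at all.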

    The second is to give each central data structure its own lock.
    This was faster than the first, but latency increased because
    managing a single thread requires acquiring multiple locks.
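
    A sketch of the per-structure-lock variant (names again my own):
    unrelated operations can now proceed in parallel, but a single
    thread creation pays for two lock acquisitions instead of one.

```python
import threading
from collections import deque

# Sketch of the second alternative: one lock per central data
# structure, so the ready queue and the free list contend separately.
class PerStructureLockScheduler:
    def __init__(self):
        self.ready_lock = threading.Lock()   # guards ready_queue only
        self.free_lock = threading.Lock()    # guards free_list only
        self.ready_queue = deque()
        self.free_list = deque()

    def create_thread(self, work):
        with self.free_lock:                 # first acquisition: get a block
            tcb = self.free_list.pop() if self.free_list else {}
        tcb["work"] = work
        with self.ready_lock:                # second acquisition: enqueue it
            self.ready_queue.append(tcb)

    def get_next_thread(self):
        with self.ready_lock:
            return self.ready_queue.popleft() if self.ready_queue else None

    def finish_thread(self, tcb):
        with self.free_lock:
            self.free_list.append(tcb)
```

    Throughput improves because a dispatcher and a finishing thread no
    longer block each other, but the two acquisitions on the creation
    path are exactly the added latency the review describes.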

    The third is to keep a free list of control blocks and stacks per
    processor, removing the need for locks on the common path. This
    choice is expensive in space for stacks. The per-processor free
    lists of control blocks must also be balanced among the processors
    for efficient use of space, and stacks must be managed in a central
    pool or the space required becomes impractical. This arrangement is
    faster than the first two and has the lowest latency, because there
    is very little locking. The final two methods have the highest
    throughput.
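
    A sketch of per-processor free lists with the balancing the review
    mentions (the threshold and the balancing policy here are my own
    illustration, not the paper's):

```python
from collections import deque

# Sketch of the third alternative: each processor keeps a private free
# list of control blocks, so the common-case alloc/free needs no lock.
class PerProcessorFreeLists:
    def __init__(self, nprocs, threshold=4):
        self.free_lists = [deque() for _ in range(nprocs)]
        self.threshold = threshold           # trigger for rebalancing

    def alloc(self, proc):
        lst = self.free_lists[proc]          # private to proc: no lock
        return lst.pop() if lst else {}      # allocate fresh if list empty

    def free(self, proc, tcb):
        self.free_lists[proc].append(tcb)    # private to proc: no lock
        if len(self.free_lists[proc]) > self.threshold:
            self._balance(proc)

    def _balance(self, proc):
        # Hand surplus blocks to the emptiest list so space is not
        # stranded on one processor. On a real machine this slow path
        # would need synchronization.
        target = min(range(len(self.free_lists)),
                     key=lambda i: len(self.free_lists[i]))
        if target == proc:
            return
        while len(self.free_lists[proc]) > self.threshold:
            self.free_lists[target].append(self.free_lists[proc].pop())
```

    The fast path touches only the caller's own list; locking is
    confined to the rare balancing step, which is why this variant has
    the lowest latency.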

    The fourth is to keep a central queue of idle processors. When a
    processor goes idle, it preallocates a control block and stack and
    then waits on a spin lock until a thread needs to be created. This
    method is faster when there are idle processors, but latency
    increases when there are more ready threads than processors.
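
    A sketch of the idle-processor queue (my own naming; an `Event`
    stands in for the spin lock the paper describes, since Python has
    no true busy-wait primitive): the idle processor pays the setup
    cost in advance, so creation is cheap while a processor is parked.

```python
import threading
from collections import deque

# Sketch of the fourth alternative: idle processors pre-build a control
# block and stack, then park on a central queue waiting for work.
class IdleProcessor:
    def __init__(self):
        self.tcb = {"stack": bytearray(64)}   # preallocated while idle
        self.work = None
        self.has_work = threading.Event()     # stand-in for the spin lock

idle_queue = deque()                          # central queue of idle procs
idle_lock = threading.Lock()                  # guards the central queue

def go_idle(proc):
    with idle_lock:
        idle_queue.append(proc)               # park on the central queue
    proc.has_work.wait()                      # a real kernel would spin here
    return proc.work

def create_thread(work):
    with idle_lock:
        proc = idle_queue.popleft() if idle_queue else None
    if proc is None:
        return False                          # no idle processor available
    proc.work = work                          # setup cost was already paid
    proc.has_work.set()                       # release the waiting processor
    return True
```

    When the idle queue is empty (more ready threads than processors),
    `create_thread` falls through and the creator must do the work
    itself, which is where the latency reappears.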

    The fifth is to keep a ready queue per processor along with the
    per-processor free lists. This case is difficult to balance because
    ready threads are a small fraction of all threads: one processor
    may have an empty queue while another has several threads ready to
    run, leaving the load imbalanced.
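
    A sketch of per-processor ready queues showing the imbalance
    problem (the scan of other queues is my own illustration of how an
    idle processor finds work):

```python
from collections import deque

# Sketch of the fifth alternative: each processor has its own ready
# queue, so the common enqueue/dequeue path needs no lock, but nothing
# keeps the queues balanced.
class PerProcessorReadyQueues:
    def __init__(self, nprocs):
        self.queues = [deque() for _ in range(nprocs)]

    def enqueue(self, proc, thread):
        self.queues[proc].append(thread)     # local queue: no lock

    def dequeue(self, proc):
        if self.queues[proc]:
            return self.queues[proc].popleft()
        # Local queue empty: scan the other processors' queues. This
        # is the imbalance case, and touching remote queues brings
        # back the synchronization the design tried to avoid.
        for q in self.queues:
            if q:
                return q.popleft()
        return None
```

    With few ready threads overall, the scan fires often, so the cost
    of imbalance dominates exactly when the system is lightly loaded.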

    I found the discussion of spin locks interesting; I wasn't familiar
    with that level of detail on cache invalidation. The discussion of
    ways to mitigate the cost of many threads waiting on a lock was
    also informative. There are many considerations here that I hadn't
    seen before.
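
    One of the mitigations in that discussion is spinning on a read of
    the lock word rather than on the atomic instruction itself, so
    waiters hit their cache instead of the bus. A rough sketch of that
    test-and-test-and-set idea (here `Lock.acquire(blocking=False)`
    stands in for the hardware test-and-set instruction):

```python
import threading

# Sketch of a test-and-test-and-set spin lock: spin on a cheap read of
# the lock word, and attempt the expensive atomic operation only when
# the lock looks free.
class TTASLock:
    def __init__(self):
        self._flag = threading.Lock()

    def acquire(self):
        while True:
            while self._flag.locked():      # spin on a read: on a real
                pass                        # machine this hits the cache
            if self._flag.acquire(blocking=False):
                return                      # atomic "test-and-set" won
            # lost the race to another waiter: go back to spinning

    def release(self):
        self._flag.release()
```

    On a write-through-cache machine like the Symmetry, the read-only
    spin matters even more, since every atomic attempt generates bus
    traffic that slows the lock holder.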

    The modeling of bus and lock contention was somewhat sparse; I
    would have liked to see more. I liked the paper, though it was
    quite dense and difficult to read.

    -- 
    Greg Green
    email: ggreen_at_u.washington.edu
    

    This archive was generated by hypermail 2.1.6 : Sun Jan 25 2004 - 21:57:33 PST