An Analysis of Linux Scalability to Many Cores, OSDI 2010
Below is a simple sloppy counter implementation based on Section 4.3. Suppose (1) four threads execute the work function on four different cores; (2) the four threads start after the init function. Consider the counter update function. Do you think the implementation is correct?
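As a reference point for this question (and not the listing under review), here is a minimal sketch of the general sloppy-counter idea in C: each core accumulates updates in its own slot and only touches the shared counter when a threshold is reached. The paper's design in Section 4.3 tracks spare references per core; this sketch uses a simpler threshold-and-flush variant. All names (`sloppy_counter`, `sloppy_add`, `sloppy_flush`, `run_sloppy_demo`), the threshold of 64, and the 64-byte cache-line size are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NCORES     4       /* assumed: one thread per core, as in the question */
#define THRESHOLD  64      /* assumed flush threshold */
#define INCREMENTS 100000  /* assumed updates per thread */

/* One slot per core, aligned to an assumed 64-byte cache line
 * so that slots do not share a line (avoids false sharing). */
struct slot { _Alignas(64) long count; };

struct sloppy_counter {
    atomic_long global;          /* shared, updated rarely */
    struct slot local[NCORES];   /* private to each core, updated often */
};

struct worker_arg { struct sloppy_counter *c; int core; };

static void sloppy_init(struct sloppy_counter *c) {
    atomic_init(&c->global, 0);
    for (int i = 0; i < NCORES; i++)
        c->local[i].count = 0;
}

/* Move this core's pending count into the shared counter. */
static void sloppy_flush(struct sloppy_counter *c, int core) {
    atomic_fetch_add(&c->global, c->local[core].count);
    c->local[core].count = 0;
}

/* Only the owning thread touches local[core], so the local update
 * needs no synchronization; the shared update is atomic. */
static void sloppy_add(struct sloppy_counter *c, int core, long amt) {
    c->local[core].count += amt;
    if (c->local[core].count >= THRESHOLD)
        sloppy_flush(c, core);
}

static void *work(void *p) {
    struct worker_arg *a = p;
    for (int i = 0; i < INCREMENTS; i++)
        sloppy_add(a->c, a->core, 1);
    sloppy_flush(a->c, a->core);  /* push any remainder below the threshold */
    return NULL;
}

/* Initializes the counter, runs NCORES workers, and returns the
 * shared total once every worker has flushed and been joined. */
long run_sloppy_demo(void) {
    static struct sloppy_counter c;
    pthread_t t[NCORES];
    struct worker_arg args[NCORES];

    sloppy_init(&c);  /* init completes before any thread starts */
    for (int i = 0; i < NCORES; i++) {
        args[i] = (struct worker_arg){ &c, i };
        int rc = pthread_create(&t[i], NULL, work, &args[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NCORES; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&c.global);
}
```

In this sketch the shared value lags the true total by at most `NCORES * (THRESHOLD - 1)` while threads are running, and is exact only after every slot has been flushed. When comparing against the listing in the question, the points worth checking are which state is shared, which state is private to a core, and whether every access to shared state is atomic or protected by a lock.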
Below is a conditional set function implementation on a lock-free hashtable. It applies a version-based lock-free protocol and allows concurrent updates, whereas the one described in Section 4.4 permits only a single exclusive writer. What can go wrong with this implementation?
As Figure 3 shows, gmake scales well as the number of cores increases. However, it still falls short of perfect scalability. Why do you think that's the case?
Provide a list of questions you would like to discuss in class. Feel free to provide any comments on the paper and related topics (e.g., which parts you like and which parts you find confusing).