Memory Sharing on a CTA
•With P processors referencing a single memory image, and references spread evenly across the processors' memories, the probability that a given reference is nonlocal is (P-1)/P
•So, by the CTA, nearly all memory references will take λ time (the nonlocal latency), implying that each processor can do little work
•Responses
•A “sea” of small processors that can be used “inefficiently”
•Have many threads for processors to switch among
•Find a way to greatly reduce λ, the nonlocal latency
•Change processors from von Neumann (vN) to something else
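The (P-1)/P argument above can be made concrete with a small calculation. The following is a minimal sketch, not part of the original slide. It assumes references are spread uniformly over P equal memory partitions and that a local reference costs 1 time unit. The function names and the parameter `lam` (standing for the nonlocal latency) are hypothetical, chosen for illustration only.

```python
from fractions import Fraction


def nonlocal_probability(p: int) -> Fraction:
    """Probability that a reference is nonlocal, assuming it is equally
    likely to fall in any of the p processors' memories."""
    if p < 1:
        raise ValueError("need at least one processor")
    return Fraction(p - 1, p)


def expected_reference_time(p: int, lam: float, local_time: float = 1.0) -> float:
    """Average cost of one memory reference under the same assumption:
    local with probability 1/p, nonlocal (cost lam) otherwise."""
    q = nonlocal_probability(p)
    return float(1 - q) * local_time + float(q) * lam


if __name__ == "__main__":
    # Illustrative values only: lam = 100 is an assumed latency, not a measurement.
    for p in (1, 4, 64, 1024):
        print(p, nonlocal_probability(p), expected_reference_time(p, lam=100.0))
```

As P grows, the average cost per reference approaches the nonlocal latency itself, which is why the responses listed above aim either to tolerate that latency (many small processors, many threads) or to shrink it.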