Hiding Latency
•Hiding memory reference latency (λ of the CTA model) requires many more threads (units of work) than there are processors
•A thread is a sequence of instructions operating on a small quantity of data -- for example, a loop iteration
•The idea is that a processor with many threads to execute can switch to another thread whenever the current one stalls on a memory reference, so productive work gets done during the wait
•The idea can be used in either a programming model or a hardware implementation
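
The effect of switching threads during a stall can be illustrated with a toy simulation. This is a minimal sketch under assumptions that are not in the slide: a single processor, threads that alternate a fixed compute burst with one memory reference, free thread switches, and made-up cycle counts. The name `utilization` and its parameters are hypothetical.

```python
def utilization(num_threads, compute_cycles, latency, bursts_per_thread):
    """Fraction of time one processor spends computing (toy model).

    Assumptions (illustrative only): each thread repeats
    "compute for compute_cycles, then issue one memory reference
    that takes `latency` cycles"; switching threads costs nothing.
    """
    ready_at = [0] * num_threads            # time each thread can run again
    remaining = [bursts_per_thread] * num_threads
    now = 0                                 # current time in cycles
    busy = 0                                # cycles spent computing

    while any(remaining):
        runnable = [t for t in range(num_threads)
                    if remaining[t] and ready_at[t] <= now]
        if not runnable:
            # Every unfinished thread is waiting on memory: the processor stalls.
            now = min(ready_at[t] for t in range(num_threads) if remaining[t])
            continue
        t = runnable[0]                     # switch to a ready thread
        now += compute_cycles
        busy += compute_cycles
        remaining[t] -= 1
        ready_at[t] = now + latency         # thread now waits on its reference

    return busy / now


# Assumed numbers: 10 cycles of compute per 90-cycle memory reference.
for n in (1, 2, 5, 10, 20):
    print(n, "threads:", round(utilization(n, 10, 90, 100), 3))
```

In this model a single thread leaves the processor idle most of the time, and utilization rises as threads are added until the combined compute time of the other threads covers one thread's wait. With the assumed numbers that happens at 10 threads, since 10 x 10 cycles equals one compute burst plus one 90-cycle reference. Adding threads beyond that point does not help further.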