•Summary for successful parallel computation
–Rather than using a shared memory abstraction, use the CTA
model; it reflects costs accurately
–Use ZPL for programming to get convenience, speed and portability; use message passing (MP) only as a last resort
–Be suspicious of claims that the “problems” with shared memory have been solved by a new machine
–When choosing an architecture, prefer support for global addressing, 1-sided communication, a point-to-point network, (randomizing) non-minimal adaptive routing, and SMP nodes
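
To illustrate why the CTA model is said to reflect costs accurately, here is a minimal sketch, assuming the usual CTA convention that a local memory reference costs 1 time unit and a non-local reference costs lam (λ) units. The function names and the numbers in the usage note are hypothetical, chosen only for illustration. It compares two ways of summing n values spread evenly over p processors.

```python
def cta_cost(local_refs, nonlocal_refs, lam):
    """Estimated time, in units of one local memory reference.

    Assumption: local reference = 1 unit, non-local reference = lam units.
    """
    return local_refs + lam * nonlocal_refs


def naive_sum_cost(n, p, lam):
    """One processor reads all n values as if memory were flat.

    Only n // p of the values live in its own memory; the rest are non-local.
    """
    local = n // p
    return cta_cost(local, n - local, lam)


def local_then_tree_cost(n, p, lam):
    """Each processor sums its n // p local values, then partial sums
    are combined pairwise in a tree of ceil(log2(p)) rounds.

    Each round on the critical path costs one non-local reference
    (fetch a partner's partial sum) plus one local add.
    """
    rounds = (p - 1).bit_length()  # ceil(log2(p)) for p >= 1
    return cta_cost(n // p + rounds, rounds, lam)
```

With the hypothetical values n = 1024, p = 16 and lam = 100, the naive version costs 96064 units and the locality-aware version costs 468. A flat shared-memory abstraction hides this difference; the CTA model makes it visible.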