Communication Pattern
•Architectures differ, but row and column broadcasts are often fast
•Each processor transfers only its locally stored segment of the row to the other processors in its column
–For 1 block, P_uv is a sender
–For P^(1/2) - 1 blocks, P_uv is a receiver
–Space required is only 4t elements -- 2t for the segments being processed and 2t for the segments arriving
[Figure: processors Pd, Ph, Pl, Pp]
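The sender and receiver counts above can be checked with a small simulation. This is a minimal sketch, not the slide's own code. It assumes a square q x q processor grid with q = P^(1/2), and that in each block step the processors of one grid row own the row segments and broadcast them down their columns. The names `column_broadcast_roles` and `buffer_elements` are hypothetical.

```python
def column_broadcast_roles(q):
    """Count send/receive block steps per processor on a q x q grid.

    Assumption: in block step `owner_row`, the processors in that grid row
    hold the row segments and each broadcasts its segment down its column.
    """
    sends = {(u, v): 0 for u in range(q) for v in range(q)}
    recvs = {(u, v): 0 for u in range(q) for v in range(q)}
    for owner_row in range(q):            # one block step per grid row
        for v in range(q):                # columns broadcast independently
            sends[(owner_row, v)] += 1    # the owner sends its local segment
            for u in range(q):
                if u != owner_row:
                    recvs[(u, v)] += 1    # everyone else in the column receives
    return sends, recvs


def buffer_elements(t):
    """Space per processor: 2t being processed plus 2t arriving."""
    return 2 * t + 2 * t


if __name__ == "__main__":
    q = 4                                 # P = 16 processors
    sends, recvs = column_broadcast_roles(q)
    print(sends[(1, 3)], recvs[(1, 3)])   # sender once, receiver q - 1 times
    print(buffer_elements(8))
```

Under these assumptions every processor is a sender for exactly one block and a receiver for the other q - 1, which matches the counts on the slide.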