Tale Of Two Multiplies
“It was the best of times” is what we want from our parallel matrix-multiply (MM) programs, but which of the hall-of-fame algorithms, Cannon’s or SUMMA, gets the best times?
Analytically, which one is better?
Recall the schema of each program:
Cannon’s            SUMMA
Skew A              loop thru n
Skew B                flood A[,k]
loop thru n           flood B[k,]
  C+=A*B              C+=A*B
  rotate A,B
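
To make the two schemas concrete, here is a minimal sequential sketch in Python. It is a simulation, not a parallel implementation. It assumes an n x n grid of simulated processes with one matrix element each, so that "loop thru n" runs exactly n steps. It also reads "flood" as a broadcast along a process row or column. The function names `cannon` and `summa` are illustrative only.

```python
# Sequential simulation of both schemas. Assumption: n x n simulated
# processes, each owning one element of A, B, and C. No real messages
# are sent; shifts and floods are modeled by re-indexing lists.

def cannon(A, B):
    n = len(A)
    # Skew A: row i is shifted left by i positions.
    a = [[A[i][(j + i) % n] for j in range(n)] for i in range(n)]
    # Skew B: column j is shifted up by j positions.
    b = [[B[(i + j) % n][j] for j in range(n)] for i in range(n)]
    C = [[0] * n for _ in range(n)]
    for _ in range(n):  # loop thru n
        # C += A*B, computed locally on every simulated process.
        for i in range(n):
            for j in range(n):
                C[i][j] += a[i][j] * b[i][j]
        # Rotate A one step left along rows, B one step up along columns.
        a = [[a[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b = [[b[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return C


def summa(A, B):
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for k in range(n):  # loop thru n
        # Flood A[,k]: process row i receives A[i][k].
        a_col = [A[i][k] for i in range(n)]
        # Flood B[k,]: process column j receives B[k][j].
        b_row = [B[k][j] for j in range(n)]
        # C += A*B, a rank-1 update spread over the grid.
        for i in range(n):
            for j in range(n):
                C[i][j] += a_col[i] * b_row[j]
    return C
```

Both functions do the same n multiply-accumulate steps per process. They differ only in data movement: Cannon's pays for the two skews up front and then uses nearest-neighbor rotations, while SUMMA has no setup but floods a column and a row at every step.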