CTA charges for operations, memory references, and communication
PRAM solution is of little use
Tournament algorithm finds the maximum of n elements in O(log n) time, like the global sum
Odd PEs send their value to the next lower processor
Even PEs receive, compare, and send the larger value to the parent
Divide the PE index by 2, and repeat until one processor is left
Time is O(log n) communication and parallel compare steps
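The tournament steps above can be sketched as a small sequential simulation. This is only an illustration, not a real parallel implementation: the function name `tournament_max` is hypothetical, each pass of the outer loop stands in for one parallel round, and the handling of an unpaired PE when n is not a power of two is an assumption the slide does not spell out.

```python
def tournament_max(values):
    """Simulate the tournament; return (maximum, number of parallel rounds).

    Hypothetical helper: a sequential simulation in which active[i] is the
    value currently held by logical PE i.
    """
    if not values:
        raise ValueError("need at least one value")
    active = list(values)
    rounds = 0
    while len(active) > 1:
        survivors = []
        # Pair each even logical PE 2k with the odd PE 2k + 1 above it.
        for even in range(0, len(active), 2):
            odd = even + 1
            if odd < len(active):
                # The odd PE sends its value to the next lower PE,
                # which compares and keeps the larger value.
                survivors.append(max(active[even], active[odd]))
            else:
                # Assumption: an unpaired even PE advances unchanged.
                survivors.append(active[even])
        # The survivor from even PE 2k becomes logical PE k (index / 2).
        active = survivors
        rounds += 1
    return active[0], rounds


if __name__ == "__main__":
    maximum, rounds = tournament_max([3, 9, 2, 7, 5, 1, 8, 4])
    print(maximum, rounds)
```

With 8 values the simulation takes 3 rounds, matching log2(8); in a real run each round would cost one communication plus one compare per surviving PE.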