ZPL’s Latency Tolerance
ZPL exploits blocked data transfer in two ways:
- Vectorization moves array slices as a single unit -- ZPL naturally vectorizes because it is compiling array operations
- Combining communications bound for the same destination reduces per-message overhead and benefits from pipelining
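To make the overhead argument concrete, here is a minimal sketch in Python (not ZPL) using a simple cost model in which a message of n elements costs alpha + beta * n. The model, the constants ALPHA and BETA, and all function names are assumptions for illustration; they are not taken from ZPL or its runtime.

```python
# Hypothetical cost model (an assumption, not ZPL's): one message of
# n elements costs alpha + beta * n, where alpha is the fixed
# per-message startup cost and beta is the per-element transfer cost.
ALPHA = 50.0  # assumed startup cost, arbitrary time units
BETA = 0.5    # assumed per-element cost, arbitrary time units


def message_cost(n_elements, alpha=ALPHA, beta=BETA):
    """Cost of a single message carrying n_elements values."""
    return alpha + beta * n_elements


def elementwise_cost(n_elements):
    """Each element travels in its own message: alpha is paid n times."""
    return sum(message_cost(1) for _ in range(n_elements))


def vectorized_cost(n_elements):
    """The whole array slice travels as one message: alpha is paid once."""
    return message_cost(n_elements)


def separate_cost(slice_sizes):
    """Several slices to the same destination, one message per slice."""
    return sum(message_cost(n) for n in slice_sizes)


def combined_cost(slice_sizes):
    """The same slices merged into a single message to that destination."""
    return message_cost(sum(slice_sizes))
```

Under these assumed constants, sending a 100-element slice as one message costs far less than sending 100 single-element messages, and merging two slices saves one startup cost. The gap grows as the startup cost grows relative to the per-element cost.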
Communication is also pipelined, so data transfer can overlap with computation
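The goals of combining and pipelining can conflict: combining waits until every piece of the merged message is ready, while pipelining wants each transfer started as early as possible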
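One way to see the conflict is to compare how much time each strategy leaves for overlapping communication with computation. The sketch below is in Python (not ZPL) and rests on an assumed model: each transfer is a (ready, need) pair, where ready is the time the data is produced and need is the time the receiver first uses it. The function names and the sample times are hypothetical.

```python
def overlap_window(ready, need):
    """Time available to hide one transfer that is sent as soon as it is ready."""
    return need - ready


def combined_window(transfers):
    """Time available when several transfers are merged into one message.

    The merged message cannot leave before the last piece is ready and
    must arrive before the earliest first use, so the window shrinks.
    """
    latest_ready = max(ready for ready, _ in transfers)
    earliest_need = min(need for _, need in transfers)
    return earliest_need - latest_ready


# Two transfers to the same destination (assumed times, arbitrary units).
transfers = [(0, 10), (6, 16)]
separate = [overlap_window(ready, need) for ready, need in transfers]
merged = combined_window(transfers)
```

With these sample times each separate transfer has a window of 10 units, while the merged message has only 4. Combining saves a message startup but leaves less computation to overlap with, which is the trade-off a compiler has to weigh.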