Optimization in Distributed DBMS
•A distributed database (2-minute tutorial):
–Data is distributed over multiple nodes, but appears to the user as one uniform database.
–Query execution can be distributed to sites.
–Communication costs are significant.
•Consequences for optimization:
–Optimizer needs to decide locality (at which site each operation executes).
–Need to exploit independent parallelism.
–Need operators that reduce communication costs (semi-joins).
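The semi-join point above can be illustrated with a minimal sketch. It assumes two hypothetical relations, R at site 1 and S at site 2, joined on a column named `k`, and it uses a deliberately crude cost model that counts each shipped key value or row as one unit. The function names and the cost model are illustrative assumptions, not part of any particular DBMS.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def ship_whole_plan(r: List[Row], s: List[Row], key: str) -> Tuple[List[Row], int]:
    """Baseline: ship all of S to R's site, then join there."""
    joined = [{**a, **b} for a in r for b in s if a[key] == b[key]]
    shipped = len(s)  # every S row crosses the network
    return joined, shipped


def semi_join_plan(r: List[Row], s: List[Row], key: str) -> Tuple[List[Row], int]:
    """Semi-join: ship R's join keys to S's site, ship back only matching S rows."""
    # Step 1 (site 1): project the distinct join keys of R and ship them.
    r_keys = {row[key] for row in r}
    # Step 2 (site 2): reduce S to the rows that can participate in the join.
    s_reduced = [row for row in s if row[key] in r_keys]
    # Step 3 (site 1): receive the reduced S and perform the join locally.
    joined = [{**a, **b} for a in r for b in s_reduced if a[key] == b[key]]
    shipped = len(r_keys) + len(s_reduced)  # keys out, matching rows back
    return joined, shipped


# Hypothetical data: a small R and a much larger S.
r = [{"k": 1, "a": 10}, {"k": 2, "a": 20}]
s = [{"k": i, "b": i * i} for i in range(100)]

naive_result, naive_cost = ship_whole_plan(r, s, "k")
semi_result, semi_cost = semi_join_plan(r, s, "k")
print(naive_cost, semi_cost)
```

Both plans return the same join result; the semi-join pays off here because few S rows match. If most S rows matched, shipping the keys first would only add overhead, which is why the optimizer has to estimate selectivity before choosing this operator.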