Lecture 12: MultiPaxos — Whiteboard Descriptions

These are text descriptions of the whiteboard PDF from the lecture on April 24, 2026. See also the whiteboard PDF.

These materials were drafted by AI based on the live whiteboard PDF and audio transcript from the corresponding lecture and then reviewed and edited by course staff. They may contain errors. Please let us know if you spot any.

Diagram: State machine replication via logs. Each replica maintains a local view of the log of operations to execute on the application. Clients broadcast their requests to all replicas. The replicas use some kind of consensus algorithm to come to agreement on which operations to execute in which order, putting them in the log. We can think of the consensus algorithm as helping implement a "global" view of the log that is consistent across replicas once values are chosen.
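The replica behavior in the diagram can be sketched in a few lines. This is an illustrative toy, not code from the lecture: the class and method names are made up, and the "application" is just a counter. The key invariant is that a replica executes an operation only once every earlier slot is chosen, so all replicas execute the same sequence.

```python
class Replica:
    """Toy replica: applies the chosen log, in slot order, to a counter."""

    def __init__(self):
        self.log = []        # chosen operations, indexed by slot
        self.state = 0       # toy application state: a running sum
        self.applied = 0     # next slot to execute

    def on_chosen(self, slot, op):
        # The consensus algorithm has chosen `op` for `slot`; record it.
        while len(self.log) <= slot:
            self.log.append(None)
        self.log[slot] = op
        self.apply_prefix()

    def apply_prefix(self):
        # Execute only the contiguous chosen prefix of the log.
        while self.applied < len(self.log) and self.log[self.applied] is not None:
            self.state += self.log[self.applied]  # "execute" the operation
            self.applied += 1

r = Replica()
r.on_chosen(1, 10)   # slot 1 chosen first: a gap at slot 0 blocks execution
r.on_chosen(0, 5)    # slot 0 fills the gap; now both slots execute
```

After both calls, the replica has executed slots 0 and 1 in order, regardless of the order in which they were chosen.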

Differences from Single Decree

  • all nodes play all roles
  • ballot "numbers" are pairs (seq num, server-id) with lexicographic order
    • (1, S_2) < (4, S_1)
    • (1, S_2) > (1, S_1)
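The two inequalities above are just lexicographic tuple comparison. As a quick sanity check (representing S_1 and S_2 by their indices 1 and 2, which is an encoding choice, not something fixed by the lecture), Python tuples already compare this way:

```python
# Ballots as (seq_num, server_id) pairs; tuples compare lexicographically.
b1 = (1, 2)   # (1, S_2)
b2 = (4, 1)   # (4, S_1)
b3 = (1, 1)   # (1, S_1)

assert b1 < b2   # (1, S_2) < (4, S_1): sequence numbers compared first
assert b1 > b3   # (1, S_2) > (1, S_1): tie on seq num, server id breaks it
```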

Diagram: Three vertical process lines labeled S_1, S_2, S_3. From S_1, an arrow loops back to itself and then continues rightward to S_2 and on to S_3, labeled 1a(r = (1, S_1)). The point is that ballot numbers carry the sender's server id as the tiebreaker.

Unoptimized MultiPaxos

  • SDP (single-decree Paxos) in each slot
  • clients broadcast requests to all servers
    • find an empty slot, propose this request
  • wait for prefix of log chosen
    • execute request
    • respond to client
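Two of the bookkeeping steps above — "find an empty slot" and "wait for prefix of log chosen" — can be sketched as small helpers. These function names are illustrative, not from the lecture; the actual per-slot single-decree Paxos is omitted.

```python
def next_empty_slot(log):
    """First slot with no value: propose an incoming request here."""
    for i, entry in enumerate(log):
        if entry is None:
            return i
    return len(log)

def chosen_prefix(log):
    """Length of the contiguous chosen prefix; only this much may execute."""
    n = 0
    while n < len(log) and log[n] is not None:
        n += 1
    return n

log = ["op_a", None, "op_c"]   # slot 2 chosen, but slot 1 still open
# next_empty_slot(log) is 1: propose the next request in the gap.
# chosen_prefix(log) is 1: only slot 0 may execute and be answered.
```

Note that a value chosen in a later slot (here slot 2) cannot be executed or acknowledged to the client until the gap before it is filled.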

Diagram: Spacetime diagram with five vertical process lines labeled C_1, S_1, S_2, S_3, C_2 from left to right. Time flows downward.

  • C_1 sends req_1 to S_1.
  • S_1 runs phase 1 of single-decree Paxos for slot 0: it broadcasts 1a(s = 0, r = (1, S_1)) to itself, S_2, and S_3 (the self-arrow loops back to S_1).
  • S_2 replies 1b(s = 0, r = (1, S_1), null) back to S_1.
  • S_1 then sends 2a(s = 0, r = (1, S_1), v = req_1) to itself and the other servers.
  • S_2 replies 2b(s = 0, r = (1, S_1)) to S_1.
  • S_1 sends resp_1 back to C_1.
  • Meanwhile C_2 sends req_2 toward S_3. A tangle of arrows among S_1, S_2, and S_3 indicates a second, separate Paxos instance is running concurrently in some other slot.
  • Eventually resp_2 is delivered to C_2.

The visual point: every request runs its own full two-phase Paxos, and concurrent client requests cause overlapping protocol traffic across the servers.

Distinguished Proposer Optimization (Leader Election)

  • use phase 1 to elect the leader
  • combine phase 1 across all slots
    • delete slot # from 1a / 1b
    • 1b summarizes votes in all slots (a list)
  • phase 2 unchanged
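The "1b summarizes votes in all slots" idea can be sketched as follows. This is a hedged illustration: the field names (`summ`, `voted_ballot`, `voted_value`) and the dict-based message shape are made up for the example, matching only the spirit of the diagram.

```python
def make_1b(accepted, ballot):
    """Build a combined 1b reply covering every slot at once.

    accepted: dict mapping slot -> (voted_ballot, voted_value),
    the acceptor's highest-ballot vote in each slot so far.
    """
    summ = [(slot, vb, vv) for slot, (vb, vv) in sorted(accepted.items())]
    return {"type": "1b", "r": ballot, "summ": summ}

# An acceptor with no prior votes sends an empty summary, as in the diagram.
fresh = make_1b({}, (1, 1))
# fresh["summ"] == []

# One that previously voted in slot 0 reports that vote.
voted = make_1b({0: ((1, 2), "req_1")}, (2, 1))
# voted["summ"] == [(0, (1, 2), "req_1")]
```

Because the single 1b carries votes for every slot, the new leader learns everything it needs to safely run phase 2 in all slots without ever sending a slot-specific 1a again.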

Diagram: Spacetime diagram with five process lines C_1, S_1, S_2, S_3, C_2. Time flows downward.

  • S_1 broadcasts 1a(r = (1, S_1)) (no slot number) to itself, S_2, and S_3.
  • S_2 replies 1b(r = (1, S_1), summ = []) back to S_1 — an empty list of prior votes across all slots.
  • After leader election, C_1 sends req_1 to S_1 (the leader).
  • S_1 sends 2a(s = 0, r = (1, S_1), v = req_1) to the other servers.
  • S_2 replies 2b(s = 0, r = (1, S_1)) to S_1.
  • S_1 sends resp_1 to C_1.
  • C_2 sends req_2 toward S_1 (since S_1 is the leader). A subsequent exchange between S_1 and the followers (drawn as a couple of arrows) handles slot 1, and resp_2 goes back to C_2.

The visual point: phase 1 runs once at the start to elect a leader, and after that each request only needs phase 2.

Heartbeats

  • the leader tells followers that it is still up
  • followers set a timer to check if they've heard from the leader recently
    • if not, try to become leader
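The follower-side timer logic can be sketched as below. The timeout value, class, and method names are illustrative assumptions, not specified in the lecture.

```python
class Follower:
    """Toy follower-side failure detector for the leader."""

    TIMEOUT = 2.0  # seconds of silence before suspecting the leader (assumed)

    def __init__(self, now):
        self.last_heartbeat = now

    def on_heartbeat(self, now):
        # Leader said it is still up; reset the clock.
        self.last_heartbeat = now

    def should_run_for_leader(self, now):
        # Checked when the timer fires: if we haven't heard from the
        # leader recently, start phase 1 with a higher ballot.
        return now - self.last_heartbeat > self.TIMEOUT

f = Follower(now=0.0)
f.on_heartbeat(now=1.0)
f.should_run_for_leader(now=2.5)   # False: heard from the leader recently
f.should_run_for_leader(now=4.0)   # True: silence too long, try to lead
```

In a real system the timeouts would typically be randomized per follower so that two followers are unlikely to start competing elections at the same moment.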