Problem Set 4: Dynamic Partitioning

Due: February 27, 2024

Consider a high-level protocol for a partitioned key-value store where keys are statically assigned to "shards" and shards are dynamically assigned to groups by a ShardMaster that numbers these configurations, similar to lab 4 part 2. Each group is a Paxos cluster that provides a key-value API. Groups move through the sequence of configurations in order without skipping (i.e., if a group is currently in config 3 but config 5 is the most recent, then the group will first change to config 4 before changing to config 5). When a group receives a new configuration, it "reconfigures" by exchanging data with other groups.
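To make the setup concrete, here is a minimal sketch of what a configuration might look like. All names here (Configuration, configNum, shardToGroup, NUM_SHARDS, keyToShard) are illustrative assumptions for this write-up, not the actual lab interfaces.

```java
import java.util.Map;

// Hypothetical sketch of a configuration (illustrative names, not the lab's
// actual interface). Keys are statically assigned to shards; each shard is
// assigned to exactly one group in a given configuration.
final class Configuration {
    static final int NUM_SHARDS = 10;           // assumed fixed shard count

    final int configNum;                        // configurations are numbered 1, 2, 3, ...
    final Map<Integer, Integer> shardToGroup;   // shard number -> group id

    Configuration(int configNum, Map<Integer, Integer> shardToGroup) {
        this.configNum = configNum;
        this.shardToGroup = shardToGroup;
    }

    // Static key -> shard assignment (e.g., by hashing the key).
    static int keyToShard(String key) {
        return Math.floorMod(key.hashCode(), NUM_SHARDS);
    }
}
```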

More specifically:

  • Each configuration is numbered.
  • Each configuration maps shards to groups.
  • Each group has a current configuration.
  • When a group receives a new configuration, it enters a "reconfiguring" state where it refuses to execute client requests until it has finished exchanging data with other groups.
    • During reconfiguration, each group tracks what data it needs to receive and what data it needs to send.
      • "Data it needs to receive" means key-value data for keys that the group did not own in the previous configuration but does own in the next configuration.
      • "Data it needs to send" means key-value data for keys that the group owned in the previous configuration but does not own in the next configuration.
    • Groups send data to other groups in "ShardMove" messages. A ShardMove message contains a configuration number and the key-value store data being transferred to the receiving group.
    • Each group repeatedly retransmits the data it needs to send that has not been acknowledged by the new owner yet.
    • When a group is in reconfiguration and receives a ShardMove message from another group in the same configuration number, it adds the data to its key-value store, marks the data as received, and sends a ShardMoveAck to the sending group.
    • When a group is in reconfiguration and receives a ShardMoveAck message from another group in the same configuration number, it marks that data as acknowledged locally.
    • Once a group has received all required data and received acks for all sent data, it exits reconfiguration and enters the new configuration, deleting any data it no longer owns. It then begins accepting client requests again.

In this problem set, we ignore the Paxos layer and just pretend that each group is a single very reliable machine.
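As a rough illustration of the rules above, here is a sketch of one way a group could organize its reconfiguration bookkeeping under that single-reliable-machine simplification. It tracks moved data at shard granularity, elides actual message sending and retransmission timers, and reuses the hypothetical Configuration class from the earlier sketch; all class and method names are assumptions made for this sketch, not the lab's actual code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a group's reconfiguration bookkeeping (illustrative
// names, not the lab's actual code). The group is treated as a single
// reliable machine, as the problem set assumes.
final class Group {
    final int groupId;
    int configNum;                               // current configuration number
    boolean reconfiguring = false;               // while true, client requests are refused
    final Map<String, String> kvStore = new HashMap<>();

    final Set<Integer> shardsToReceive = new HashSet<>();       // shards we now own but have not received yet
    final Map<Integer, Integer> shardsToSend = new HashMap<>(); // shard -> new owner, not yet acknowledged
    final Set<Integer> shardsToDelete = new HashSet<>();        // shards to drop once reconfiguration ends

    Group(int groupId, int initialConfigNum) {
        this.groupId = groupId;
        this.configNum = initialConfigNum;
    }

    // Called when the group learns of the next configuration (configNum + 1).
    void handleNewConfig(Configuration oldCfg, Configuration newCfg) {
        reconfiguring = true;                    // stop executing client requests
        for (int shard = 0; shard < Configuration.NUM_SHARDS; shard++) {
            boolean ownedBefore = oldCfg.shardToGroup.get(shard) == groupId;
            boolean ownsNow = newCfg.shardToGroup.get(shard) == groupId;
            if (!ownedBefore && ownsNow) shardsToReceive.add(shard);
            if (ownedBefore && !ownsNow) {
                shardsToSend.put(shard, newCfg.shardToGroup.get(shard));
                shardsToDelete.add(shard);
            }
        }
        configNum = newCfg.configNum;
        // ... start sending (and periodically retransmitting) ShardMove(configNum, data) to each new owner ...
        maybeFinishReconfiguration();
    }

    // ShardMove from another group with the same configuration number.
    void handleShardMove(int senderConfigNum, int shard, Map<String, String> data) {
        if (!reconfiguring || senderConfigNum != configNum) return;
        kvStore.putAll(data);                    // absorb the transferred key-value data
        shardsToReceive.remove(shard);           // mark the data as received
        // ... send ShardMoveAck(configNum, shard) back to the sender ...
        maybeFinishReconfiguration();
    }

    // ShardMoveAck from the new owner, with the same configuration number.
    void handleShardMoveAck(int senderConfigNum, int shard) {
        if (!reconfiguring || senderConfigNum != configNum) return;
        shardsToSend.remove(shard);              // mark the sent data as acknowledged
        maybeFinishReconfiguration();
    }

    // Exit reconfiguration only once all needed data is received and all sent data is acked.
    private void maybeFinishReconfiguration() {
        if (reconfiguring && shardsToReceive.isEmpty() && shardsToSend.isEmpty()) {
            // Delete data we no longer own, then resume serving client requests.
            kvStore.keySet().removeIf(k -> shardsToDelete.contains(Configuration.keyToShard(k)));
            shardsToDelete.clear();
            reconfiguring = false;
        }
    }
}
```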

Each problem is worth 8 points.

  1. Suppose we "optimize" the protocol by modifying reconfiguration so that a group does not wait for sent data to be acknowledged before deleting it. Give an example execution where permanent data loss occurs (i.e., the system reaches a state where data that should be held by some group is held by no group).

  2. Suppose we "optimize" the protocol by allowing client requests during reconfiguration. Give an example execution that violates linearizability. In 1-2 sentences, explain why your execution is not linearizable.

    Hint: Consider client operations that modify data that is being moved.

  3. Would it be linearizable to allow client operations during reconfiguration that modify data that is not being moved (i.e., the data is owned by the same group in the old configuration and the new configuration)? If yes, explain why in 1-2 sentences. If not, give an execution that violates linearizability while only modifying data that is not moved during reconfiguration (in this case, no need to explain your execution, just give it).

  4. Different groups enter the reconfiguration phase at different times. Suppose that all groups are currently in configuration 1. Then:

    • group G1 hears about configuration 2, enters reconfiguration, finishes reconfiguration, moves fully to configuration 2, and begins serving client requests again
    • group G2 hears about configuration 2, enters reconfiguration, but has not finished reconfiguring yet, so it does not serve client requests at the moment
    • group G3 has not yet heard about configuration 2, but it continues to serve client requests in configuration 1

    In this situation, G1 is serving client requests in configuration 2, and G3 is serving client requests in configuration 1. Why does this situation not violate linearizability from a client's perspective? Justify your answer in 1-2 sentences.

  5. In lecture we discussed the problem of moving the whole AMOApplication, not just the KVStore data. Suppose we "optimize" the protocol by not doing that, and instead send only the KVStore data during reconfiguration. Give an example execution, with only a single message drop and no other failures, where a client receives a response that violates linearizability (in other words, the only way the system can remain linearizable is to purposefully ignore the client). Also, in 1 sentence, explain intuitively why the client receives a non-linearizable response (or must wait forever and cannot recover).

  6. At a high level, describe an execution where two groups are two or more configuration numbers "apart" from each other. (In other words, an execution that reaches a state where one group is currently in configuration number \(n\) and another group is in configuration number \(m\), where \(n\) and \(m\) differ by two or more.)