Due: February 27, 2024
Consider a high-level protocol for a partitioned key-value store where keys are statically assigned to "shards" and shards are dynamically assigned to groups by a ShardMaster that numbers these configurations, similar to lab 4 part 2. Each group is a Paxos cluster that provides a key-value API. Groups move through the sequence of configurations in order without skipping (i.e., if a group is currently in config 3 but config 5 is the most recent, then the group will first change to config 4 before changing to config 5). When a group receives a new configuration, it "reconfigures" by exchanging data with other groups.
More specifically:
In this problem set, we ignore the Paxos layer and just pretend that each group is a single very reliable machine.
Each problem is worth 8 points.
Suppose we "optimize" the protocol by modifying the reconfiguration protocol so that a group does not wait for sent data to be acknowledged before deleting it. Give an example execution where permanent data loss occurs (i.e., the system reaches a state where no group holds the data for a key that should be somewhere but isn't).
Suppose we "optimize" the protocol by allowing client requests during reconfiguration. Give an example execution that violates linearizability. In 1-2 sentences, explain why your execution is not linearizable.
Hint: Consider client operations that modify data that is being moved.
Would it be linearizable to allow client operations during reconfiguration that modify data that is not being moved (i.e., the data is owned by the same group in the old configuration and the new configuration)? If yes, explain why in 1-2 sentences. If not, give an execution that violates linearizability while only modifying data that is not moved during reconfiguration (in this case, no need to explain your execution, just give it).
Different groups enter the reconfiguration phase at different times. Suppose that all groups are currently in configuration 1. Then:
In this situation, G1 is serving client requests in configuration 2, and G3 is serving client requests in configuration 1. Why does this situation not violate linearizability from a client's perspective? Justify your answer in 1-2 sentences.
In lecture we discussed the problem of moving the whole AMOApplication, not just the KVStore data. Suppose we "optimize" the protocol by not doing that, and instead just sending the KVStore data during reconfiguration. Give an example execution with only a single message drop and no other failures, where a client receives a response that violates linearizability (or, in other words, the only correct linearizable act the system can take is to purposefully ignore the client). Also, in 1 sentence, explain intuitively why the client would receive a non-linearizable response (or must wait forever and cannot recover).
At a high level, describe an execution where two groups are two or more configuration numbers "apart" from each other. (In other words, an execution that reaches a state where one group is currently in configuration number \(n\) and another group is in configuration number \(m\), and \(n\) and \(m\) have a difference of two or more.)