Design documents
This page describes how to write a design document.
- A design document describes a distributed protocol at a high-level but with
enough detail that a competent programmer could implement it without having to
think about any distributed systems-y aspects of the problem (a big ask!).
- Notice that this definition of a design document means that it is very different
from other forms of software documentation that you may be familiar with. It is
not a goal of a design document to describe how your implementation works at a
low level. Instead, you should describe conceptually how your system solves
the problem it is trying to solve, in such a way that someone else could implement it
without looking at your implementation.
- We will follow a highly structured template in this class, given below.
Advice on writing a design doc
- You can generally omit pieces of "state" that never change, such as the
addresses of other nodes. Think of these as configuration parameters.
- Try to use math/high-level concepts, rather than Java-specific concepts. For
example, say "map" instead of "
HashMap
". Similarly, prefer saying that some
piece of state is "optional" rather than "null
". (Though null
is ok too,
but you should explicitly document any piece of state that is allowed to be
null
.)
- When describing a piece of state that is optional/possibly
null
, be sure to
describe what it means if that state is not present. Can the state only be
missing at the beginning of time, or can it be present sometimes and then
later absent again?
- Feel free to include commentary that describes alternative designs you
considered, and why you chose your design over those alternatives.
- Timers should almost always follow one of these two design patterns:
- Tick pattern:
- Timer contains no data
- Timer is set in the initial state or near the "beginning of time"
- Timer is unconditionally reset every time it fires
- Discard pattern:
- Timer contains enough data to tell whether it is "current"
- Timer is set when some region of time starts
- When the timer fires, it is checked whether it is still "current", and
if not, it is ignored and not reset. Otherwise it is processed and
reset.
- You are not bound to the implementation you describe in your design doc. You
may (should!) change your implementation after you submit or after you see
your feedback, if that you think that will produce a cleaner design and/or one
that is more likely to work.
- We recommend (but do not require) that you follow the template. The
questions in the template force you to address important edge cases.
- The bulk of your protocol section is in the answer to the question “what happens when this message is received at this destination?”, so be comprehensive.
- What edge cases might you be checking for before you do a certain execution?
- Do certain states change how you handle a specific message?
- Multiple short bullet points are way more readable than fewer long bullet points.
- Feel free to give messages/objects/things names, define them, and reference them throughout your design doc. This is preferred because it makes the doc more concise and clear.
- Try to make your design doc application agnostic (avoid mentioning a specific application and commands, like KVStore and Get/Puts) and language agnostic (avoid Java specific details).
- Make note of the length of your timers, especially in relation to other timers (eg, how many times will your ping timer go off versus your ping check timer?)
- Correctness section should look something like this:
- For each message type, state why the protocol remains correct and live in the following cases:
- One or more of the message is delayed
- One or more of the message is dropped
- One or more of the message is duplicated
- One or more of the message is reordered
- State why the protocol remains correct and live even when nodes crash at any point in time in the execution (barring any limitations/assumptions you have previously stated)
- State why the protocol correct/live even when a group of nodes are partitioned from some but not all nodes, and then the partition is resolved (messages resume between nodes that had been partitioned).
(Commentary is in [square brackets].)
- Preface
- Goals: [What is the system trying to achieve?]
- Desired fault model: [What failures does the system handle automatically?]
- Challenges: [What makes the goal hard to achieve in the fault model? (If it was easy, we probably wouldn't need a design document.)]
- Assumptions: [What assumptions about client behavior or fault model does the protocol make in order to work?]
- Protocol
- Kinds of node: [For example, "client" or "server". Each node is permanently of one kind.
You cannot start a node as one kind and then later switch it to another kind.]
- For each kind of node, what temporary roles it can play, if any. [For
example, "primary server" or "backup server" or "idle server". A role
is temporary and might change over time.]
- State at each kind of node: [What data does each kind of node need to do its job? Is certain data only present when the node is performing a certain role?]
- For each piece of state, what type of data is it (integer, set of strings, etc.)?
- If you refer to any auxiliary types, be sure to define them!
- For each piece of state, what does it mean intuitively (if not obvious)?
- For each piece of state, how is it initialized at the beginning of time?
- For each piece of state, are there any important constraints on how it evolves over time (monotonically increasing, etc.)?
- Messages: [How do the nodes communicate with each other over the network?]
- List all the kinds of messages
- For each kind of message, list:
- Source: [What kind of node (and role, if applicable) sends the message?]
- Destination: [What kind of node (and role, if applicable) receives the message?]
- Contents: [What data is sent in the message?]
- When is it sent? [In other words, under what conditions?]
- Can this message be sent spontaneously by a node, or only in response to another message/timer?
- If yes, describe what else (if anything) happens when the message is sent spontaneously by a node.
- What happens at the destination when it is received?
- [When should the message be ignored (e.g., because it is old or no longer relevant, or because it is "from the future" and not relevant yet)?]
- [If a message is not ignored, how does the receiver's state get updated as a result?]
- [What messages get sent as a result of receiving this message?]
- [What timers get set as a result of receiving this message?]
- Timers:
- List all the kinds of timers
- For each kind of timer, list:
- What kind of node sets it? [What kind of node (and role, if applicable)?]
- Contents: [What data (if any) is stored in the timer?]
- When is it set? [In other words, under what conditions?]
- What happens when it fires?
- [When should the timer be ignored (e.g., because it is old or no longer relevant)?]
- [If a timer is not ignored, how does the local node's state get updated as a result?]
- [What messages get sent as a result of this timer firing?]
- [Should this timer get reset?]
- [What (other) timers get set as a result of this timer firing?]
- Correctness/Liveness analysis
- For each message, why is the protocol correct/live even if the network
delays, reorders, drops, or duplicates messages. [There is overlap in some
cases—a long enough delay can cause a reordering.]
- Why is the protocol correct/live even if the network delays or drops a
group of messages? [For example, what if the network drops both the reply
and the resend of an RPC, or a long enough sequence of resends, or a large
portion of messages sent?]
- For each node, why is the protocol correct/live even if that node crashes
or disappears (doesn't send or recieve messages) for a long time and then
later is reconnected? [Note that this overlaps with the previous
case—disappearing is like dropping or delaying a group of messages.]
- Why is the protocol correct/live even when a group of nodes are
partitioned from some but not all nodes, and then the partition is
resolved? [Again, note the overlap with the prior case. You may assume
that the network is always symmetric and transitive—if A can reach B,
then B can reach A, and if A can reach B and B can reach C, then A can
reach C—provided they retry for long enough. This is a simplification
since real networks, e.g., the Internet,are sometimes are not symmetric
or not transitive.]
- [Correctness: the program doesn't do the wrong thing; liveness: the
program does something useful (esp if it is also the correct thing to do.
In general, the failure model of each particular lab will define which
cases need to be handled for correctness and/or liveness, as the previous
bullet explains for the network model used across the labs.]
- [If you choose to ignore drops because they are similar to arbitrary
delays, be sure your discussion of delays includes the case where the
delay is extremely long. (So, don't say something like "if the message is
delayed, it's fine because it will eventually be delivered". Instead,
describe how your system would handle the case where the message takes so
long to arrive that you're not willing to wait that long.) Alternatively,
you can describe how you handle drops explicitly.]
- Conclusion
- Goals achieved:
- For each goal, summarize why the protocol achieves the goal.
- Limitations:
- What goals are only achieved sometimes? [For example, because too many failures occurred.]
- Preface
- Goals:
- Desired fault model:
- Challenges:
- Assumptions:
- Protocol
- Kinds of node:
- For each kind of node, what temporary roles it can play, if any.
- State at each kind of node:
- For each piece of state, what type of data is it (integer, set of strings, etc.)?
- If you refer to any auxiliary types, be sure to define them!
- For each piece of state, what does it mean intuitively (if not obvious)?
- For each piece of state, how is it initialized at the beginning of time?
- For each piece of state, are there any important constraints on how it evolves over time (monotonically increasing, etc.)?
- Messages:
- List all the kinds of messages
- For each kind of message, list:
- Source:
- Destination:
- Contents:
- When is it sent?
- Can this message be sent spontaneously by a node, or only in response to another message/timer?
- If yes, describe what else (if anything) happens when the message is sent spontaneously by a node.
- What happens at the destination when it is received?
- Timers:
- List all the kinds of timers
- For each kind of timer, list:
- What kind of node sets it?
- Contents:
- When is it set?
- What happens when it fires?
- Correctness/Liveness Analysis
- For each message, why is the protocol correct/live even if the network
delays, reorders, drops, or duplicates that message?
- Why is the protocol correct/live even if the network delays or drops a
group of messages?
- For each node, why is the protocol correct/live even if that node crashes
or disappears (doesn't send or recieve messages) for a long time and then
later is reconnected?
- Why is the protocol correct/live even when a group of nodes are
partitioned from some but not all nodes, and then the partition is resolved?
- Conclusion
- Goals achieved:
- For each goal, summarize why the protocol achieves the goal.
- Limitations:
- What goals are only achieved sometimes?