Reading Questions. Lecture 9 (01/31). Distributed Stream Processing: Fault Tolerance.

 

As you read the papers, consider the following questions:

 

Question 1: What types of failures can occur in a distributed stream processing system, and how can these failures affect the output seen by users?

 

Question 2: In the paper by Hwang et al., the authors examine several techniques for recovering the state of a failed processing node. How do these techniques compare in terms of recovery speed and runtime overhead?

 

Question 3: What is the main challenge in handling network partitions? What are possible approaches to dealing with this type of failure?