---------------------------------------------------
From: rgrimm@cs.washington.edu (Robert Grimm)
To: bershad@cs
Date: Wed, 5 Feb 1997 23:46:53 -0800
Subject: EndToEnd 552-reading summary

The end-to-end argument, as introduced by Saltzer et al., states that a function can only be completely and correctly implemented by the applications at the endpoints of a communication system. As a result, the same function cannot be fully provided at a lower level within the communication system. Classical examples of functionality that requires end-to-end support are delivery and ordering guarantees for messages, the suppression of duplicate communication, and the secure transmission of data over network connections. But as indicated in the paper, the end-to-end argument can apply to storage subsystems and overall operating system structure as well.

While the end-to-end argument calls for the communication endpoints to implement a certain function, the underlying communication system might have to offer some form of support for this same function to ensure an efficient implementation. At the same time, different applications have different functional requirements. Consequently, different applications may need different forms of support (e.g., reliable, sequential and duplicate-suppressed streams vs. datagrams). The communication system should thus give applications some choice of functionality (e.g., TCP vs. UDP).
The crucial insight of the paper is the need for this complex trade-off, made for the sake of efficiency, between support in the communication system and implementation within an application, together with the resulting need for a choice of abstractions. And, while the paper was published in 1984, its impact is still clearly felt today (see Engler et al. in the 1995 SOSP). I think this is one of _the_ seminal papers in systems design and the end-to-end argument should always be considered when designing operating system functionality.

---------------------------------------------------
From: Sujay Parekh
To: bershad@cs
Date: Thu, 6 Feb 1997 01:18:04 -0800
Subject: EndToEnd 552-reading summary

The End-To-End principle aims to clarify the issue of where to place a given functionality in a modular, layered system. It is inspired by the layering inherent in a typical networking service. The basic contention is that the only way to completely and correctly implement a certain functionality (like reliable data transfer) is with the assistance of the top-level application. In the case of reliable transfer, the only guarantee a network layer can provide is that the data was reliably transferred to the corresponding layer on the other side. However, what one really wants is that the application at the other end received the data and acted correctly on it, and this information is only obtainable with an application-level reliability protocol. This renders the network-level reliability redundant.
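
A minimal sketch of the reliable-transfer point above, under stated assumptions: `flaky_channel` is a hypothetical stand-in for everything below the application, and `end_to_end_send` is an illustrative name, not an API from the paper. The check is a digest comparison made by the receiving endpoint, and the retry is driven by the application.

```python
import hashlib
import random


def flaky_channel(data: bytes, rng: random.Random, corrupt_prob: float) -> bytes:
    """Hypothetical lower layer: delivers data, but sometimes flips one byte."""
    if data and rng.random() < corrupt_prob:
        i = rng.randrange(len(data))
        return data[:i] + bytes([data[i] ^ 0xFF]) + data[i + 1:]
    return data


def end_to_end_send(data: bytes, rng: random.Random,
                    corrupt_prob: float = 0.3, max_tries: int = 50) -> int:
    """Resend until the endpoint's digest check passes; return the tries used."""
    expected = hashlib.sha256(data).hexdigest()
    for attempt in range(1, max_tries + 1):
        received = flaky_channel(data, rng, corrupt_prob)
        # The check runs at the endpoint, after all lower layers are done.
        if hashlib.sha256(received).hexdigest() == expected:
            return attempt
    raise RuntimeError("transfer failed after max_tries attempts")
```

Whatever the channel does internally, the application only trusts the transfer once its own check passes, which is why a lower-layer guarantee alone does not remove the need for this loop.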

The other argument against placing advanced functionality in lower layers is that this functionality may not be utilized by all applications of that layer and hence it may lead to unacceptable overhead or even incorrect behavior. An example of the latter is a real-time data transmission layered over a reliable transport protocol.

The argument for putting functionality in lower layers has to do with performance. Sticking with the reliability example, efforts at lower layers may significantly affect application performance. In addition, they may reduce the amount of work required of application programmers.

These opposing arguments naturally lead to a performance trade-off. Thus, the placement of functionality has to be carefully evaluated, often taking into consideration the actual top-level applications that may be involved.

---------------------------------------------------
From: Tian Lim
To: Brian Bershad
Date: Thu, 6 Feb 1997 03:41:22 -0800 (PST)
Subject: EndToEnd 552-reading summary

The end-to-end argument is essentially an argument against low-level service implementation. The reason for this is that no matter how reliable or functional lower-level services are, they do not know anything about the application code using them. The argument is best framed in terms of communication endpoints. While the transport layer may provide 100% reliable message passing, the application code only knows that a message sent was received by the destination machine and accepted by the transport layer.
This says nothing about the correct handling of the message by the ultimate recipient of the message. The application may still have to receive its own acknowledgement or resend the message. As a result, it is no longer clear that the underlying layer should be wholly reliable, since message loss would already be handled by the application.

However, performance considerations make the decision complicated. In the case of checksums, having the transport layer check the integrity of small packets and retransmit small replacements when necessary is far cheaper than having FTP retransmit an entire file in case of corruption.

In the case of an OS, choosing to implement a given service has a "global" effect on all apps that run on it. The choice of abstraction and layering is therefore a serious design decision that really should be made by the application. Generalizing the end-to-end argument, what applications need is flexibility: a system that provides alternatives for lower-level services, perhaps even allowing them to be replaced. Gee, that sounds like SPIN. Even so, the end-to-end argument provides a guideline as to where functionality should be placed by whoever it is that makes the layering decisions.
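
The checksum cost comparison above can be made concrete with a rough model. This is a sketch under assumptions that are not in the summary: each packet is corrupted independently with a fixed probability p, corruption is always detected, and acknowledgement overhead is ignored. The function names are illustrative.

```python
def expected_packets_per_packet_retry(n: int, p: float) -> float:
    """Expected packets sent when only corrupted packets are resent.

    Each packet needs 1 / (1 - p) transmissions on average.
    """
    return n / (1.0 - p)


def expected_packets_whole_file_retry(n: int, p: float) -> float:
    """Expected packets sent when any corruption forces a full-file resend.

    A whole attempt succeeds with probability (1 - p) ** n.
    """
    return n / (1.0 - p) ** n


if __name__ == "__main__":
    n, p = 1000, 0.001
    print(expected_packets_per_packet_retry(n, p))
    print(expected_packets_whole_file_retry(n, p))
```

With 1000 packets and a 0.1% per-packet corruption rate, this model puts the whole-file strategy at roughly 2.7 times the packets of the per-packet strategy, and the gap grows quickly with file size.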
---------------------------------------------------
From: Matthai Philipose
To: bershad@franklin.cs.washington.edu
cc: Matthai Philipose
Date: Thu, 06 Feb 1997 08:30:58 PST
Subject: EndToEnd 552-reading summary

When data or control is communicated in a distributed system, two key questions related to reliability arise:

1) Who is responsible for verifying that the communication succeeded (e.g., that the integrity of the data has been preserved, or that the appropriate action has been taken on the basis of the control message)?
2) Who is responsible for retrying in case the communication failed?

In the particular case of an application running on a distributed operating system, we would like to know how to distribute these responsibilities between the application and the low-level system.

The end-to-end argument notes that since the definition of whether the communication is successful is application-specific, an application that cares enough about reliability must always provide both verification and retry capability. However, retrying at the level of the application may be an extremely heavyweight operation (re-transmitting a file, for instance). Retries should therefore be required very infrequently: the error rate of the lower-level system should be low enough that the high-level application rarely needs to retry. Saltzer et al.
argue that since the error rate of the lower-level system needs only to match that of the application, it is unnecessary to design systems of "perfect" reliability. They suggest that the "proper trade-off" of increasing the complexity of the underlying system versus increasing its reliability (or more generally, endowing it with more powerful semantics) requires careful thought. However, their main method for deciding this trade-off seems to be to assess the allowable retry rate of the application and then set the semantics of the lower-level system.

Since the semantics of lower-level services of conventional systems are generally fixed, and since there is no measure of the allowable failure rate of the applications above them at the time these systems are designed, there is no practical way to provide "strictly necessary reliability". In other words, since the definition of "strictly necessary" changes with applications, whereas the lower-level system does not, the only practical course is to make the lower level as reliable as possible.

Conventional systems could be changed in the following ways to accommodate the end-to-end argument:

i) When they are designed, they could be designed with required reliability in mind. This is difficult, because it requires characterizing all future applications.
ii) The reliability of system services could be parameterizable, so that each application may set it appropriately. Ideally, the application should be able to turn a few knobs (parameters) to get the service it needs. In general, though, service semantics may have to be parameterized by a function, leading to:
iii) The semantics of system services could be defined by the application. This is the approach suggested, for instance, by Lampson in his "open operating system", and adopted recently in systems like SPIN. This method may have the undesirable effect of requiring application programmers to implement versions of lower-level services.
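
Options (ii) and (iii) above might be sketched as follows. This is only an illustration: `ReliabilityPolicy`, `send_with_policy`, and the `transmit` callback are hypothetical names, not interfaces from SPIN or from Lampson's work. The integer knob stands for option (ii) and the application-supplied predicate for option (iii).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReliabilityPolicy:
    """Hypothetical settings an application hands to a lower-level service."""
    max_retries: int = 0                                   # option (ii): a knob
    accept: Callable[[bytes], bool] = lambda reply: True   # option (iii): a function


def send_with_policy(transmit: Callable[[bytes], Optional[bytes]],
                     message: bytes,
                     policy: ReliabilityPolicy) -> Optional[bytes]:
    """Call transmit until the policy accepts a reply or the retries run out."""
    for _ in range(policy.max_retries + 1):
        reply = transmit(message)  # None models a lost message
        if reply is not None and policy.accept(reply):
            return reply
    return None
```

An application that tolerates loss can leave `max_retries` at 0 and pay nothing extra, while one with stricter needs raises the knob or supplies its own `accept` check, at the cost of writing that check itself.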