--------------------------------------------------- Return-Path: yasushi@silk Received: from silk.cs.washington.edu (silk.cs.washington.edu [128.95.2.238]) by whistler.cs.washington.edu (8.8.3/7.2ws+) with ESMTP id XAA24744 for ; Wed, 5 Feb 1997 23:07:17 -0800 Received: (yasushi@localhost) by silk.cs.washington.edu (8.7.2/7.2ws+) id XAA05659; Wed, 5 Feb 1997 23:07:17 -0800 (PST) Date: Wed, 5 Feb 1997 23:07:17 -0800 (PST) From: yasushi@silk Message-Id: <199702060707.XAA05659@silk.cs.washington.edu> To: bershad@silk Subject: Quicksilver 552-reading summary Quicksilver is a research operating system developed at IBM. It has a microkernel structure and relies heavily on RPC to connect system components. RPC is always executed in a transaction context, and the central transaction manager coordinates commit and abort. There are two novelties in Quicksilver. One is the concept of "transaction everywhere". It extended the notion of transaction to non-persistent services like a window manager or a process manager. Quicksilver introduced various transaction modes to process commit efficiently for those services. The other novelty is that a distributed transactional file system (DFS) was provided in Quicksilver, and it was really used. By actually using the DFS, the authors found many problems in the strict transactional file system semantics, and they implemented a relaxed consistency model to make it usable. The only real (i.e., persistent) service written for QuickSilver is DFS. Although the authors' discussion about the usefulness of DFS is convincing, I don't see an equally convincing argument for other transactional services. Therefore, in my opinion, the validity of the "transaction everywhere" concept is not well proved in this paper. 
--------------------------------------------------- Return-Path: rgrimm@cs.washington.edu Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.2.4]) by whistler.cs.washington.edu (8.8.3/7.2ws+) with ESMTP id AAA25608 for ; Thu, 6 Feb 1997 00:48:29 -0800 Received: from [128.95.8.129] (h127.dyn.cs.washington.edu [128.95.8.127]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id AAA12323 for ; Thu, 6 Feb 1997 00:48:24 -0800 X-Sender: rgrimm@june.cs.washington.edu Message-Id: Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Date: Thu, 6 Feb 1997 00:45:57 -0800 To: bershad@cs From: rgrimm@cs.washington.edu (Robert Grimm) Subject: Quicksilver 552-reading summary The Quicksilver distributed system, as described by Schmuck and Wyllie, uses transactions to perform all interprocess communication, be it local or remote. By using transactions, Quicksilver ensures that persistent data is consistent in the event of failures and access to shared data is always synchronized. Furthermore, transactions can be used to undo several changes, thus aiding recovery, and to notify servers of client termination. The implementation uses three user-space servers on top of a micro-kernel to provide the transaction functionality, namely a transaction manager (which initiates and terminates all transactions), a communication manager (which handles the network transport and ensures its reliability) and a log manager (which provides an append-only file). Many of the details of transaction management, as they would apply to applications in Quicksilver, are hidden within the standard libraries and the system consequently has (limited) binary compatibility with Unix (well, AIX). 
Since providing full transaction semantics (i.e., all ACID properties) is expensive, several classes of transactions with different commit protocols as well as different concurrency protocols are supported (which, in a sense, makes some classes very un-transactional and obscures important insights since the high-level notion of a transaction is still applied). Different applications can thus choose the desired class of transaction and still perform efficiently. The presented performance evaluation seems to corroborate this view as Quicksilver seems to perform somewhat worse in microbenchmarks but shows only minimal performance degradation in higher-level benchmarks. I think that the performance evaluation is somewhat dubious and does not really provide any real insights into the cost of using transaction-like IPC. For example, almost 19% (89% of 21%) of all transactions observed in a week-long trace have no use, but are generously attributed to bad programming. Furthermore, AIX does not provide a good baseline for performance comparisons, since the two systems are differently structured and, besides running some of the same applications, use different algorithms and different subsystems (e.g., NFS in AIX and DFS in Quicksilver). Next, the uses described in the paper are very strongly geared towards applications from exactly one domain (software development with text editors and compilers) and no data is given for applications from other domains. Lastly, the literature (especially the literature on file systems including the issues raised by temporary file data and meta-data updates) provides plenty of evidence for the inappropriateness and inefficiency of many of Unix's semantics and abstractions. Structuring Quicksilver around Unix binary compatibility may thus very well hide the real cost of transactional IPC. 
However, a more fundamental problem exists with using a rather high-level paradigm (i.e., transactions) to structure all distributed operations in Quicksilver. The best indication for this problem is the laundry list of lessons which repeatedly focus on a perceived mismatch between the synchronization features of transactions (i.e., the locking mechanism) and particular applications (within the limited domain explored by the authors!). Thus, a form of the end-to-end argument does apply to Quicksilver: Applications know better which synchronization (and, for that matter, which atomicity and reliability guarantees) they require and should thus (at least partially) provide that functionality themselves, or at the very least be given a real choice (the paper instead insists that "all interprocess communication must be done on behalf of some transaction"...). --------------------------------------------------- Return-Path: sparekh@crocus Received: from crocus.cs.washington.edu (crocus.cs.washington.edu [128.95.1.67]) by whistler.cs.washington.edu (8.8.3/7.2ws+) with SMTP id BAA25716 for ; Thu, 6 Feb 1997 01:00:38 -0800 Received: (sparekh@localhost) by crocus.cs.washington.edu (8.6.12/7.2ws+) id BAA27211; Thu, 6 Feb 1997 01:00:35 -0800 Date: Thu, 6 Feb 1997 01:00:35 -0800 Message-Id: <199702060900.BAA27211@crocus.cs.washington.edu> From: Sujay Parekh To: bershad@cs Subject: Quicksilver 552-reading summary QuickSilver (QS) is a novel system which integrates support for database-style transactions into the Operating System. It has a small kernel for some services, but most of the O/S services are implemented as separate servers. Services are obtained by making network-transparent IPCs to these servers. By integrating transactions into these IPCs, QS pervasively introduces the transaction model into many O/S services. Each QS node has a Transaction Manager process that provides the proper functionality of transactions. 
Several advantages are cited for including transaction support as an O/S service: failure recovery, atomic updates and distributed notification/resource control. However, many services do not need strict 2-phase commit semantics, so the QS notion of transactions is extended to include the notion of volatile-state servers. This allows them to provide a unified resource-management and notification model. An added feature of their implementation is that they are able to directly run many Unix utilities written for AIX/RT. We are presented with several examples of where transactions prove useful, but the really compelling ones had to do with their Distributed File System (DFS). While the end-to-end rationale argues against putting such a complex service into the kernel, the QS contention is that it does not add much overhead, and it allows one to build and structure better applications. Indeed, their performance relative to AIX does indicate the overhead is not overbearing, although the systems are not similar enough to allow a fair comparison. They do make the important observation that transactions allow one to write quick-and-dirty applications that are nevertheless robust in the face of failures. However, it is also quite clear that in order to get good performance and reasonable behavior it is necessary to restructure some application servers (e.g., break up a large transaction into smaller pieces). End-to-end argument again. The final obvious yet important point of note is that operating system services require drastically different semantics from a transaction system than does a database. All the special cases and extensions mentioned in the final section are a testament to this. 
--------------------------------------------------- Return-Path: ddion@june Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.2.4]) by whistler.cs.washington.edu (8.8.3/7.2ws+) with ESMTP id BAA25751 for ; Thu, 6 Feb 1997 01:03:47 -0800 Received: (ddion@localhost) by june.cs.washington.edu (8.8.5+CS/7.2ju) id BAA12943 for bershad@cs; Thu, 6 Feb 1997 01:03:47 -0800 From: ddion@june (David Dion) Message-Id: <199702060903.BAA12943@june.cs.washington.edu> Subject: Quicksilver 552-reading summary To: bershad@cs Date: Thu, 6 Feb 1997 01:03:46 -0800 (PST) X-Mailer: ELM [version 2.4 PL23] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit QuickSilver uses transactions to solve complicated problems in the design and implementation of a distributed system. The transaction model brings several advantages to a distributed system. Locking and concurrency protocols are fundamental parts of a transaction system. Similarly, dealing with distributed failure conditions becomes easier, as atomicity and rollbacks are built in. The commit/abort model can also be used as a means for failure detection or termination notification. Transaction management in QuickSilver is coordinated by a toolkit consisting of transactional IPC, a transaction manager, and a log manager. Transactional IPC requires that all IPC be associated with a transaction. In addition, a communication manager on each machine can transparently connect to peer communication managers on other hosts to deliver messages to remote processes. The transaction manager is responsible for initiation and termination of transactions. Specifically, it coordinates the commit and abort operations among potentially distributed participants. The log manager buffers the necessary data to perform rollback operations in the event of a failure. QuickSilver services are implemented as a set of user-level server processes which interact with these managers. 
The QuickSilver distributed file system, for instance, stresses many of QuickSilver's transaction management services. The benefits of QuickSilver's transaction model are unclear. Performance measurements are unable to isolate the effects of transactions on system performance. They do show that QuickSilver performs on average only slightly slower than AIX2.2. However, a comparison of the semantics of the benchmarks on the two systems is not given. In addition, the portability of existing applications remains in question. QuickSilver provides a "default transaction" to encompass non-transaction based applications, but how this meshes with complicated applications is not described. --------------------------------------------------- Return-Path: tian@wally Received: from wally.cs.washington.edu (wally.cs.washington.edu [128.95.2.122]) by whistler.cs.washington.edu (8.8.3/7.2ws+) with ESMTP id DAA26624 for ; Thu, 6 Feb 1997 03:09:09 -0800 Received: (tian@localhost) by wally.cs.washington.edu (8.8.3+CSE/7.2ws+) id DAA31284; Thu, 6 Feb 1997 03:09:08 -0800 (PST) Date: Thu, 6 Feb 1997 03:09:08 -0800 (PST) From: Tian Lim To: Brian Bershad Subject: Quicksilver 552-reading summary Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Quicksilver is a distributed system that uses transactions extensively to provide atomic and recoverable core services (most notably the file system, process control and ipc). Since the same mechanisms are used throughout, many issues in the system are simplified, such as distributed resource management. For instance, process termination is represented as a transaction commit or abort, so any other services maintaining state for the dead process are notified as part of the commit/abort protocol. The participants, remote or local, are implicitly enrolled in the transaction by having engaged in IPC with the process, since IPC occurs in the context of the transaction. 
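The implicit enrollment described above can be made concrete with a small sketch. This is not QuickSilver's interface: the `Transaction` and `Server` classes and their method names are hypothetical, and the model runs in a single process with no real IPC. It only illustrates the idea that a server becomes a participant by being contacted within a transaction, and that ending the transaction (for example when the owning process dies) notifies exactly those servers so they can release any state they hold.

```python
class Transaction:
    """Hypothetical stand-in for a transaction; not QuickSilver's API."""

    def __init__(self, tid):
        self.tid = tid
        self.participants = []  # servers enrolled so far, in first-contact order
        self.outcome = None

    def enroll(self, server):
        # Enrollment is idempotent: a server joins once, on first contact.
        if server not in self.participants:
            self.participants.append(server)

    def finish(self, outcome):
        # Process termination is modeled as ending the transaction;
        # every enrolled server is told the outcome.
        self.outcome = outcome
        for server in self.participants:
            server.transaction_ended(self.tid, outcome)


class Server:
    """Hypothetical service that keeps per-transaction state for clients."""

    def __init__(self, name):
        self.name = name
        self.state = {}    # tid -> list of requests handled for that transaction
        self.notices = []  # (tid, outcome) pairs this server was notified of

    def request(self, txn, payload):
        # Handling a request in the context of txn implicitly enrolls the server.
        txn.enroll(self)
        self.state.setdefault(txn.tid, []).append(payload)
        return len(self.state[txn.tid])

    def transaction_ended(self, tid, outcome):
        # Release whatever was held on behalf of the finished transaction.
        self.state.pop(tid, None)
        self.notices.append((tid, outcome))


window_mgr = Server("window manager")
file_srv = Server("file server")
idle = Server("never contacted")

txn = Transaction(tid=1)
window_mgr.request(txn, "open window")
file_srv.request(txn, "open file")
window_mgr.request(txn, "draw")

# The client process dies: its transaction aborts and participants clean up.
txn.finish("abort")
```

In this model the server that was never contacted receives no notification, which is the point of enrolling participants through IPC rather than through explicit registration.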
Performance of the system is naturally degraded, but the measurements given do not measure the overhead of the transaction support. They compared Quicksilver against AIX, a substantially different system. Comparing Quicksilver against Quicksilver minus transactions would have been better, although it is unclear what would remain if transactions were ripped out. Nevertheless, the authors assert it is usable with performance comparable to a production system. The paper does not directly address the question of whether transactions are the appropriate paradigm for all OS services (at least, services that look like AIX). They note many changes that had to be performed in order to achieve reasonable semantics (such as fragmenting large transactions into smaller ones, or relaxing consistency constraints). Transactions are used in databases because clients care about atomicity, consistency and durability; it is not clear all clients of OS services also do. --------------------------------------------------- Return-Path: matthai@franklin.cs.washington.edu Received: from franklin.cs.washington.edu (franklin.cs.washington.edu [128.95.2.103]) by whistler.cs.washington.edu (8.8.3/7.2ws+) with ESMTP id IAA28484 for ; Thu, 6 Feb 1997 08:33:46 -0800 Received: from localhost (localhost [127.0.0.1]) by franklin.cs.washington.edu (8.8.3+CSE/7.2ws+) with SMTP id IAA04857; Thu, 6 Feb 1997 08:33:45 -0800 (PST) Message-Id: <199702061633.IAA04857@franklin.cs.washington.edu> X-Mailer: exmh version 1.5.3 12/28/94 To: bershad@franklin.cs.washington.edu cc: Matthai Philipose Subject: Quicksilver 552-reading summary Reply-to: Matthai Philipose Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Date: Thu, 06 Feb 1997 08:33:45 PST From: Matthai Philipose The Quicksilver system uses transactions as a framework for providing end-to-end guarantees. 
The participation graph for each transaction (which is an application-level concept) identifies all the "ends" of the transaction, essentially all persistent state updated by the transaction; end-to-end verification then involves verifying that all the "end" nodes completed their part of the transaction successfully. In the case that even one of these nodes claims failure (or "aborts"), the entire transaction is aborted, and a retry is attempted if necessary, providing the atomicity characteristic of transactional systems. Schmuck & Wyllie provide many instances (a file-system, a parallel-make utility, etc) to show how the end-to-end guarantees provided by transactions are natural and useful. Further, they show that the overhead of the transactional framework is not too large, especially when the transactions are reasonably heavyweight (transferring non-empty files vs. empty ones, for instance). The key seems to be that two-phase commits seem to happen only when permanent state is being modified, so that in most communication-related (and system book-keeping) state-updates, little overhead is incurred. On the other hand, though they claim that programs using transactions are easy to write, S&W point out that a large number of "unnecessary" transactions were created in the implementation of the shell and the pmake facilities. Further, like any end-to-end retry scheme, it seems that the penalty of even one component of the transaction failing is that the entire transaction is retried. If the failure frequency of the underlying system is not well matched with the allowable retry frequency of the transaction, this could be expensive; S&W note this problem and suggest that "nested" transactions could solve it. On the whole, end-to-end guarantees on the integrity of updates to permanent state sound like something most distributed applications could use; transactions therefore seem like a good idea, given fairly reliable lower-level system components. 
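The retry cost raised in the summary above can be illustrated with a short sketch. It is a simplified vote-then-decide loop in the spirit of two-phase commit, not the commit protocol QuickSilver actually implements; `Participant`, `run_transaction`, and the `failing_attempts` parameter are all hypothetical names introduced for illustration. One "no" vote aborts every participant, and the whole transaction is then retried from the start.

```python
class Participant:
    """Hypothetical 'end' node; votes no on the attempts listed in failing_attempts."""

    def __init__(self, name, failing_attempts=()):
        self.name = name
        self.failing_attempts = set(failing_attempts)
        self.committed = False
        self.aborts = 0  # how many times this node had to throw its work away

    def prepare(self, attempt):
        # Phase one: vote yes unless this attempt is scripted to fail.
        return attempt not in self.failing_attempts

    def commit(self):
        self.committed = True

    def abort(self):
        self.aborts += 1


def run_transaction(participants, max_attempts=3):
    """Return the attempt number that committed, or None if all attempts aborted."""
    for attempt in range(1, max_attempts + 1):
        votes = [p.prepare(attempt) for p in participants]
        if all(votes):
            # Phase two: everyone voted yes, so everyone commits.
            for p in participants:
                p.commit()
            return attempt
        # A single no vote aborts the transaction at every participant.
        for p in participants:
            p.abort()
    return None


disk = Participant("disk")
net = Participant("net", failing_attempts={1, 2})
attempts_used = run_transaction([disk, net])

always_down = Participant("flaky", failing_attempts={1, 2, 3})
gave_up = run_transaction([always_down])
```

Here the healthy `disk` participant discards its work twice because `net` failed, which is the mismatch between failure frequency and retry budget that the summary points out.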
--------------------------------------------------- Return-Path: mernst@ebi.cs.washington.edu Received: from trout.cs.washington.edu (trout.cs.washington.edu [128.95.1.178]) by whistler.cs.washington.edu (8.8.3/7.2ws+) with ESMTP id IAA28653 for ; Thu, 6 Feb 1997 08:47:32 -0800 Received: from ebi.cs.washington.edu (mernst@ebi.cs.washington.edu [128.95.4.37]) by trout.cs.washington.edu (8.8.5+CS/7.2ws+) with ESMTP id IAA19230; Thu, 6 Feb 1997 08:45:02 -0800 (PST) Received: (from mernst@localhost) by ebi.cs.washington.edu (8.8.4/8.8.2) id IAA04740; Thu, 6 Feb 1997 08:44:22 -0800 Date: Thu, 6 Feb 1997 08:44:22 -0800 Message-Id: <199702061644.IAA04740@ebi.cs.washington.edu> From: Michael Ernst To: bershad@cs.washington.edu Subject: Quicksilver 552-reading summary Experience with transactions in QuickSilver By Frank Schmuck and Jim Wyllie QuickSilver is a distributed operating system based on the concept of transactions; every process runs within a transaction, and processes may create additional transactions as well. The notion of "transaction" is substantially weakened from the ordinary one, as a transaction need not provide failure atomicity, recoverability, or isolation. The authors make the point that their system runs with reasonable performance and does what it is intended to do, but I'm not convinced about the necessity for the system. They don't justify what problems are soluble only by such a system. (Some weak examples include elimination of files left over when a program is aborted -- but I want the log files to remain, even if some other outputs can be done away with.) Also, having just read the end-to-end paper, I wonder if their mechanism really obviates the need for a higher-level mechanism: what if the transaction commits, but for some reason the application isn't notified of this? 
I'm concerned about running ordinary binaries in such a system; if I run my editor for a week, then it dies (say, my machine crashes or I log out without explicitly exiting the editor), then apparently I lose a week of work. In short, I'm not convinced that this is interesting and worthwhile work. --------------------------------------------------- Return-Path: sungeun@wormwood Received: from wormwood.cs.washington.edu (wormwood.cs.washington.edu [128.95.2.107]) by whistler.cs.washington.edu (8.8.3/7.2ws+) with ESMTP id IAA28733 for ; Thu, 6 Feb 1997 08:49:52 -0800 Received: (sungeun@localhost) by wormwood.cs.washington.edu (8.8.3+CSE/7.2ws+) id IAA08957; Thu, 6 Feb 1997 08:49:51 -0800 (PST) Date: Thu, 6 Feb 1997 08:49:51 -0800 (PST) Message-Id: <199702061649.IAA08957@wormwood.cs.washington.edu> From: Sung-Eun Choi To: bershad@whistler Subject: Quicksilver 552-reading summary Reply-To: sungeun@cs.washington.edu This paper describes the use of transactions in QuickSilver, a transaction-based distributed system. In QuickSilver, servers and clients (as well as all other applications) are required to use transactions, though each server may employ transactions so as to define a certain behaviour. This allows the servers to provide a certain service with a defined behaviour and (I guess) ensure this behaviour -- or notify of failure -- using transactions. Apparently, providing transactions only mildly affects the performance of the system as a whole. The paper lacks sufficient details for me to be convinced that transactions are an appropriate abstraction for all distributed applications. 
--------------------------------------------------- Return-Path: echris@merganser Received: from merganser.cs.washington.edu (merganser.cs.washington.edu [128.95.2.192]) by whistler.cs.washington.edu (8.8.3/7.2ws+) with ESMTP id IAA28748 for ; Thu, 6 Feb 1997 08:51:00 -0800 Received: (echris@localhost) by merganser.cs.washington.edu (8.8.3+CSE/7.2ws+) id IAA10146; Thu, 6 Feb 1997 08:50:59 -0800 (PST) Date: Thu, 6 Feb 1997 08:50:59 -0800 (PST) Message-Id: <199702061650.IAA10146@merganser.cs.washington.edu> From: E Christopher Lewis To: bershad@whistler Subject: Quicksilver 552-reading summary Reply-To: echris@cs.washington.edu Schmuck and Wyllie give their experiences in the development and use of QuickSilver, a transaction-based distributed operating system. In summarizing the QuickSilver architecture and arguing the value of using transactions for general distributed computing, the authors fail to impart any deep intuition for how one really develops an application in QuickSilver (what exactly does the OS provide?) or why a transaction approach is the right model. Sure, we can come up with a bunch of examples where transactions on disk access are useful, but what else is there? (PMake is not a good answer.) Perhaps all the answers are in the TOCS paper from '88? The authors present "lessons learned" from several years of QuickSilver experience. Each lesson can be rephrased as "transactions and QuickSilver are good," but the discussion for each describes workarounds for situations where transactions are unnatural. Again, I am left wondering why transactions are the right abstraction for the development of general distributed systems. 
--------------------------------------------------- Return-Path: govindk@shasta.ee.washington.edu Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.2.4]) by whistler.cs.washington.edu (8.8.3/7.2ws+) with ESMTP id KAA29512 for ; Thu, 6 Feb 1997 10:28:31 -0800 Received: from shasta.ee.washington.edu (shasta.ee.washington.edu [128.95.28.11]) by june.cs.washington.edu (8.8.5+CS/7.2ju) with SMTP id KAA03147 for ; Thu, 6 Feb 1997 10:28:30 -0800 Received: from andes.ee.washington.edu by shasta.ee.washington.edu (4.1/SMI-4.1) id AA01439; Thu, 6 Feb 97 08:38:49 GMT Received: by andes.ee.washington.edu (SMI-8.6/SMI-SVR4) id IAA07459; Thu, 6 Feb 1997 08:38:19 -0800 Date: Thu, 6 Feb 1997 08:38:19 -0800 From: govindk@shasta.ee.washington.edu (Govindarajan K) Message-Id: <199702061638.IAA07459@andes.ee.washington.edu> To: bershad@cs Subject: Quicksilver 552-reading Summary X-Sun-Charset: US-ASCII Quicksilver is a distributed operating system built at the IBM Almaden Research Center. The Quicksilver transaction management toolkit consists of a transactional IPC, a transaction manager and a log manager. Quicksilver borrows the notion of transactions from databases and applies it to resource management in a distributed domain. This means that either a process completes fully or all the changes made by the process are undone. An interesting example to illustrate this point is the ftp example given in the paper: either the whole file comes in or you get no file. The main advantages of transactions are: * data is consistent in the presence of failures * changes can be collectively undone * access to shared data is synchronized. The performance of Quicksilver is compared with NFS in AIX and it comes within 2-5% of AIX in most cases. I thought a more suitable comparison would be with another scheme which provides the same sort of guarantees rather than a normal NFS. 
For example, group communication guarantees that within a group everyone has the same view of the system. Essentially, group communication is also used (in the context of fault tolerance) to provide consistency of data and to keep data reliable in the presence of failures. Another paradigm which would be interesting to compare this with is process migration using some sort of distributed checkpointing. It would be interesting to know whether resuming a process at some other equivalent server would serve this purpose better.