Outline for 3/4/98
- Last time: File protection / access control
- Administrative
- Optional problems will be on the Homework webpage later today. Solutions will be posted eventually (before studying for finals begins)
- Questions?
- Objective:
- Introduce distributed systems
- Distributed file systems
Why Distributed Systems?
- Economic argument - lots of small machines are more cost-effective than a single supercomputer-class machine.
- Sharing - physical devices (printers, scanners) and logical resources (files, databases) can be used from many machines
- Geographically distributed applications (banking, collaboration)
- Parallel processing - collections of machines cooperating on a single problem.
Variations on Theme

Some Issues in Distributed Systems
- Transparency - degree to which the location and boundaries between nodes are visible.
- Performance - latency, bandwidth
- Scalability - behavior as the size of the system grows.
- Reliability and fault tolerance - Parts go down
- How to detect? What to do with the surviving parts?
- Security
Networks 101
- Various technologies (multiaccess bus - twisted pair, fiber optics; wireless - IR, radio; switched; store-and-forward)
- Usual distinctions
- LANs (household Ethernet); WANs (the Internet)
- Packet-switching; Circuit-switching
- Trends - network technology is undergoing significant performance improvements - big impact on future system design
Distributed File Systems
- Naming
- Location transparency / independence
- Caching
- Consistency
- Replication
- Availability and updates
Naming
- \\His\d\pictures\castle.jpg
- Not location transparent - both machine and drive embedded in name.
- NFS mounting
- Remote directory mounted over a local directory in the local naming hierarchy.
- /usr/m_pt/A
- No global view
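The mounting idea above can be sketched as a per-client table lookup. This is a minimal illustration, not the NFS implementation: the server names and the exported directories are hypothetical, and only the mount point /usr/m_pt comes from the example above.

```python
# Hypothetical per-client mount table: local mount point -> (server, remote directory).
CLIENT_1_MOUNTS = {"/usr/m_pt": ("server1", "/export/shared")}
CLIENT_2_MOUNTS = {"/usr/m_pt": ("server2", "/export/other")}


def resolve(path, mounts):
    """Return (server, remote_path) if path lies under a mount point,
    or (None, path) if the name is purely local."""
    best = None
    for mount_point in mounts:
        under = path == mount_point or path.startswith(mount_point + "/")
        # Prefer the longest matching mount point.
        if under and (best is None or len(mount_point) > len(best)):
            best = mount_point
    if best is None:
        return (None, path)
    server, remote_dir = mounts[best]
    return (server, remote_dir + path[len(best):])
```

Resolving /usr/m_pt/A with the two tables yields two different servers, which is what "no global view" means: the same name can denote different files on different clients.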
Global Name Space

Hints
- A valuable distributed systems design technique that can be illustrated in naming.
- Definition: information that is not guaranteed to be correct. If it is correct, using it improves performance; if it is wrong, the system still works, only more slowly. The user of a hint must be able to validate it.
- Example: Sprite prefix tables
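A minimal sketch of a location hint, assuming a client that remembers which server last held a file. It is a generic illustration of the technique, not the actual Sprite prefix-table protocol. The Server and Client classes and the slow_lookups counter are hypothetical names introduced here.

```python
class Server:
    def __init__(self, name, files):
        self.name = name
        self.files = set(files)

    def has(self, path):
        return path in self.files


class Client:
    def __init__(self, servers):
        self.servers = servers   # authoritative, but slow to search
        self.hints = {}          # path -> Server; may be stale
        self.slow_lookups = 0    # counts fallbacks to the slow path

    def locate(self, path):
        hint = self.hints.get(path)
        # Validate the hint on use: it is only trusted if the server confirms it.
        if hint is not None and hint.has(path):
            return hint
        # Hint missing or wrong: fall back to the slow, always-correct search.
        self.slow_lookups += 1
        self.hints.pop(path, None)
        for server in self.servers:
            if server.has(path):
                self.hints[path] = server   # remember for next time
                return server
        return None
```

A correct hint saves the search; a stale hint costs one failed check and then the search, but never a wrong answer.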
Caching
- Location of cache on client - disk or memory
- Update policy
- write through
- delayed writeback
- write-on-close
- Consistency
- Client does validity check, contacting server
- Server call-backs
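The three update policies differ only in when a modified block reaches the server. A minimal sketch, assuming a dictionary stands in for the server's storage. The ClientCache class and its method names are hypothetical, and a real delayed-writeback cache would call flush from a timer.

```python
class ClientCache:
    POLICIES = ("write-through", "delayed", "write-on-close")

    def __init__(self, server, policy):
        if policy not in self.POLICIES:
            raise ValueError("unknown policy: " + policy)
        self.server = server   # dict standing in for the server's storage
        self.policy = policy
        self.cache = {}        # client-side copies
        self.dirty = set()     # names modified locally but not yet on the server

    def write(self, name, data):
        self.cache[name] = data
        if self.policy == "write-through":
            self.server[name] = data      # server updated on every write
        else:
            self.dirty.add(name)          # server updated later

    def flush(self):
        """Push all dirty entries to the server (periodic under delayed writeback)."""
        for name in self.dirty:
            self.server[name] = self.cache[name]
        self.dirty.clear()

    def close(self, name):
        """Under write-on-close, closing the file is what updates the server."""
        if self.policy == "write-on-close" and name in self.dirty:
            self.server[name] = self.cache[name]
            self.dirty.discard(name)
```

Write-through keeps the server current at the cost of a network round trip per write. The other two policies trade a window of inconsistency (and possible data loss if the client crashes) for fewer server writes.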
Reliability Issues
- Server crashes
- Any state the server kept is lost; it must be reconstructed upon recovery (dialog with clients?)
- Stateless server - all requests from clients are self-contained
- Network partitions
- Client response - optimistic (continue to use what's in cache) or pessimistic (conservative)
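A minimal sketch of what "self-contained" means for a stateless server: each read request names the file, the offset, and the byte count, so the server needs no memory of earlier requests. The request fields, the handle "fh1", and the file contents are hypothetical.

```python
# Stand-in for the server's stable storage: file handle -> contents.
FILES = {"fh1": b"hello distributed world"}


def handle_read(request):
    """Serve one read using only information carried in the request itself.
    No open-file table or per-client position is kept between calls."""
    data = FILES[request["handle"]]
    start = request["offset"]
    return data[start:start + request["count"]]


def read_with_retry(request, attempts=3):
    """Client side: because the request is self-contained, a retry after a
    server crash and restart is just the same request sent again."""
    for _ in range(attempts):
        try:
            return handle_read(request)
        except ConnectionError:   # a real client would see timeouts here
            continue
    raise ConnectionError("server unreachable")
```

The contrast is with a stateful design, where the server remembers each client's current file position and must rebuild that table after a crash.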
Replication
- File name maps to set of replicas, one of which will be used to satisfy request
- Goal: availability
- Update strategy
- Atomic updates - all or none
- Primary copy approach
- Voting schemes
- Optimistic, then detection of conflicts
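One common form of voting scheme uses read and write quorums with version numbers. The sketch below assumes n replicas, a read quorum r and a write quorum w chosen so that r + w > n and 2w > n, which forces every read quorum to overlap the most recent write quorum. The Replica class and the function names are hypothetical, and failures are modeled by a simple up flag.

```python
class Replica:
    def __init__(self):
        self.version = 0    # incremented on every successful write
        self.value = None
        self.up = True      # False models a crashed or unreachable replica


def quorum_write(replicas, w, value):
    """Write to w live replicas; return False if no write quorum is available."""
    live = [rep for rep in replicas if rep.up]
    if len(live) < w:
        return False
    quorum = live[:w]
    # Because 2w > n, this quorum overlaps the previous write quorum,
    # so the maximum version seen here is the latest one.
    version = max(rep.version for rep in quorum) + 1
    for rep in quorum:
        rep.version, rep.value = version, value
    return True


def quorum_read(replicas, r):
    """Read from r live replicas; return (ok, value) using the newest version."""
    live = [rep for rep in replicas if rep.up]
    if len(live) < r:
        return (False, None)
    newest = max(live[:r], key=lambda rep: rep.version)
    return (True, newest.value)
```

With n = 3 and r = w = 2, the file stays readable and writable with any one replica down, while a stale replica can never win a read because at least one member of the read quorum holds the latest version.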