IntServ & DiffServ (Oct 27, 2004) by Xiaowei Yang

What we have seen:
- FIFO + TCP -> best-effort service
- Scheduling (WFQ), signaling, admission control, and flowspecs -> guaranteed service (rate and delay bound)

Today's lecture: engineering design exercises. We work out the mechanisms needed to provide different levels of service in the network. We are equipped with the basic building blocks; now we work through two variants of QoS networks. Two important techniques: soft state and dynamic packet state.

IntServ

Motivation: provide end-to-end service guarantees.
1. fine-grained, per-flow service guarantees
2. end-to-end service guarantees

Building blocks:
1. Routers: scheduling algorithms and buffer-management policies
2. Applications: flowspecs
3. From applications to routers: signaling
4. From routers to applications: admission control

Service classes in IntServ. The IETF specifies the service interfaces; it does not say how to implement them. We first present the three service classes, then describe the scheduling algorithms and admission-control decisions. The Clark, Shenker & Zhang (CSZ) paper laid out mechanisms to implement these service classes.

1. Guaranteed service. Provides a client firm guarantees: guaranteed bandwidth and bounded delay.
- scheduling: could be WFQ.
- flowspec: (r, b); the IETF specification contains more fields, but they are less important.
- admission decision: admit if the requested rate plus the sum of allocated rates is less than the capacity allocated to guaranteed service.
- disadvantages: (1) potential under-utilization if traffic is very bursty; (2) high per-flow jitter caused by WFQ spreading out bursts.
- policing: not needed, since packets from different users are put into different queues.

2. Controlled-load service. Provides the client's traffic with a quality of service closely approximating what the flow would receive from an unloaded network, but uses admission control to assure that this service is still received when the network is overloaded. It overcomes the two disadvantages of guaranteed service.
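To make the two admission styles concrete, here is a minimal Python sketch; the class names, parameters, and the EWMA load measurement are my own illustrative assumptions, not from the RFCs. The guaranteed-service test above compares the requested rate plus the worst-case sum of allocated rates against the class capacity, while the controlled-load test described next admits against a measured load, so the sum of declared rates may exceed capacity.

```python
# Illustrative sketch only: guaranteed-service admission (worst-case
# sum of allocated rates) vs. controlled-load admission (measured load
# plus the new flow's worst-case rate r). Names/parameters are assumed.

class GuaranteedAdmission:
    def __init__(self, capacity):
        self.capacity = capacity   # bandwidth set aside for guaranteed service
        self.allocated = 0.0       # sum of admitted flows' requested rates

    def admit(self, r):
        """Admit iff r + sum of allocated rates <= class capacity."""
        if self.allocated + r <= self.capacity:
            self.allocated += r
            return True
        return False

class ControlledLoadAdmission:
    def __init__(self, capacity, alpha=0.125):
        self.capacity = capacity
        self.alpha = alpha         # EWMA weight for the load measurement
        self.measured = 0.0        # smoothed estimate of existing flows' load

    def observe(self, load_sample):
        """Characterize existing flows by measurement, not declared rates."""
        self.measured = (1 - self.alpha) * self.measured + self.alpha * load_sample

    def admit(self, r):
        """Admit iff measured load + worst-case new rate r <= capacity."""
        return self.measured + r <= self.capacity

gs = GuaranteedAdmission(capacity=100.0)
print(gs.admit(60.0))   # True
print(gs.admit(50.0))   # False: 60 + 50 exceeds the 100 set aside

cl = ControlledLoadAdmission(capacity=100.0)
for _ in range(50):
    cl.observe(40.0)     # bursty flows measure well below their declared rates
print(cl.admit(50.0))    # True even if declared rates already sum near capacity
```

Because controlled load trusts measurements rather than declared rates, it can admit more flows than the worst-case test would; that is exactly the statistical-multiplexing gain this class is after.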
It does not give firm guarantees. The network makes measurements and, based on the results, makes an admission decision. The network may admit a flow even though the sum of all admitted flows' requested rates exceeds the network's capacity.
- scheduling: a separate queue; inside the queue, FIFO+ scheduling to minimize jitter. This is one way to do it.
- admission decision: based on heuristics. Use measurements to characterize existing flows, and the worst case for the new source: admit if the sum of the measured load and r is less than the bandwidth allocated to controlled-load service.
- policing: traffic filtering at the entrance of the network.

3. Best effort.
- scheduling: FIFO. No signaling, admission control, or flowspecs are needed.

Unified scheduling algorithm (in the CSZ paper; not mentioned by the RFCs): WFQ, with packets put into different queues. The question is how to deal with non-conforming traffic. If we put non-conforming traffic into a separate queue, packets may be sent out of order; alternatively, non-conforming flows take too much best-effort bandwidth because they do not back off, starving TCP flows. This problem is addressed in the CF paper.

We have studied a router scheduling algorithm (WFQ) and a traffic descriptor (the token bucket). Now let's look at signaling and admission control. We need a protocol that exchanges flowspecs and admission decisions between applications and routers.

Q: How do we design such a protocol? A general signaling and admission-control protocol. (Draw a picture with a sender, a bunch of routers, and a receiver.)

Step 1: explain why the protocol is receiver-driven, and how to implement that. The straightforward approach is for a sender to send its flowspec into the network; the flowspec is passed to the routers en route, and each router decides whether to admit the flow. If so, it passes on the request; if not, it sends an error back. The receiver then echoes back the decision. Disadvantages: (1) with one-to-many communication, a sender needs to make reservations for many receivers; (2)
a sender does not know what service level a receiver wants. So the RSVP designers decided to let the receiver make the reservation.

A sender sends a Path message to a receiver. The Path message contains the TSpec, which specifies the traffic characteristics of the flow (e.g., the token-bucket rate and size). When the receiver receives the Path message, it sends a Resv message to reserve resources for the desired service. The Resv message contains both the TSpec and the RSpec (reservation spec, e.g., guaranteed service with the reserved rate, or controlled-load service). The Resv message must follow the exact reverse of the path the Path message took, but Internet paths are asymmetric. So the Path message carries a field that stores the previous hop's address and sets up path state at each router en route from sender to receiver; the Resv message is then sent hop by hop. If a Resv fails, an error message is sent back to the receiver; otherwise it goes all the way back to the sender, and a reservation is made.

Step 2: explain the advantage of being receiver-driven: reservation merging. A sender sends one Path message; receivers decide what they want, and Resv messages can be merged. (Add a multicast tree to the previous picture.)

Step 3: explain soft state. What happens if the sender or the receiver dies? How do we garbage-collect the resources? The solution is soft state, which does not require explicit removal. A sender periodically sends Path messages while it has data to send, and a receiver periodically sends Resv messages. If the soft state times out without a renewal message, the path or resv state is removed and the resources are automatically reclaimed. This also helps with route failover: if a route fails, the routing algorithm finds a different path, and the resources reserved along the old route should be recycled. The periodic Path and Resv messages establish a new reservation over the new path when the old one fails.
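The soft-state idea above can be sketched in a few lines: state is installed or renewed by periodic refresh messages and silently garbage-collected on timeout. The lifetime value and all names here are illustrative, not RSVP's actual timer rules.

```python
# Minimal sketch of RSVP-style soft state: path/resv state stays alive
# only while periodic refresh (Path/Resv) messages keep arriving, so a
# dead sender, dead receiver, or abandoned old route is reclaimed
# automatically. Timing values are illustrative assumptions.

class SoftStateTable:
    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.entries = {}  # flow id -> time of last refresh

    def refresh(self, flow_id, now):
        """A Path or Resv message installs or renews the state."""
        self.entries[flow_id] = now

    def expire(self, now):
        """Remove timed-out state, implicitly reclaiming its resources."""
        dead = [f for f, t in self.entries.items()
                if now - t > self.lifetime]
        for f in dead:
            del self.entries[f]
        return dead

table = SoftStateTable(lifetime=30)
table.refresh("flow-1", now=0)
table.refresh("flow-2", now=0)
table.refresh("flow-1", now=25)   # flow-1 is renewed; flow-2 is not
print(table.expire(now=40))       # ['flow-2'] is garbage-collected
print(sorted(table.entries))      # ['flow-1'] survives
```

Note how no explicit teardown message is ever required: the same refresh mechanism handles crashed endpoints and route failover uniformly.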
Examples of how guaranteed service and controlled-load service can be implemented with RSVP.

Guaranteed service:
- TSpec: r, b, peak rate, minimum packet size, maximum packet size.
- RSpec: service class, rate, and a slack delay term. A node can use this term to degrade the service request. (From the RFC: "This slack term can be utilized by the network element to reduce its resource reservation for this flow. When a network element chooses to utilize some of the slack in the RSpec, it MUST follow specific rules in updating the R and S fields of the RSpec.")

Controlled-load service:
- TSpec: same as above.
- RSpec: just the service class.

IntServ summary:
1. Fine-grained, end-to-end guarantees: guaranteed service, controlled-load service, and best-effort service.
2. WFQ scheduling for guaranteed service.
3. RSVP for signaling: soft-state, receiver-driven. Complexity is added to tolerate multicast and failures.

DiffServ

Motivation. In the old days, the network was used by one big happy research family, and congestion control was enough: when we saw congestion, we backed off and let other people have their bandwidth share. Then we thought about giving users end-to-end service guarantees. That motivation is good, but it does not seem appealing to service providers. DiffServ aims to align the interests of providers and users. What are the providers' interests? Making money, of course. How? ISPs can make money by offering services that are better than best effort. It is similar to airline ticket pricing: a first-class ticket costs more but gets better treatment. Moreover, DiffServ only provides per-hop behaviors and makes no claim of end-to-end guarantees; the hope is that, by stitching together a sequence of DS domains, users get end-to-end QoS.

Q: Is there any reason to believe that better treatment at one hop brings better overall end-to-end performance? In other words, does incremental deployment bring incremental benefit?
A: Yes, in two situations: (1) if the first hop is your bottleneck; (2)
if most of your destinations belong to the same ISP.

Service interfaces:
1. Per-hop behavior. No end-to-end service guarantee, no firm guarantee; only relative guarantees (class 1 always gets better service than class 2).
2. Different services for different traffic aggregates.

Advantages over IntServ:
1. Easier to implement, since it needs no per-flow admission control or cooperation between ISPs.
2. Service-level agreements (SLAs) are negotiated between a customer domain and a provider domain; this fits the current business models.

Architecture: two-tier. A DS domain consists of boundary nodes and DS interior nodes. Packets are classified at the boundaries of the network and assigned to traffic classes; a traffic class is identified by a codepoint. At the core, packets are forwarded according to the per-hop behavior associated with the traffic class. (Draw a picture with different clouds, boundary nodes, and interior nodes.) A DS region is a contiguous set of DS domains.

                       +-------+
                       |       |----------------------+
                +----->| Meter |                      |
                |      |       |---+                  |
                |      +-------+   |                  |
                |                  V                  V
              +------------+      +--------+      +---------+
              |            |      |        |      | Shaper/ |
packets =====>| Classifier |=====>| Marker |=====>| Dropper |=====>
              |            |      |        |      |         |
              +------------+      +--------+      +---------+

Q: How is a packet assigned to a traffic class?
A: Traffic conditioning (pictured above). An SLA usually specifies how much of a customer's traffic will be granted what type of service; for example, UW could purchase 10 Mbps of class-1 service from a provider. Since traffic from one customer is put into queues shared with other customers, the provider must make sure one customer's excess traffic does not degrade the service other customers' legitimate traffic receives, so metering and marking are necessary. A traffic profile specifies the service granted to a traffic stream; the classifier locates the appropriate traffic profile for the meter to check against, and packets are marked as in- or out-of-profile.
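The classifier/meter/marker pipeline above can be sketched as follows, with the meter implemented as a token bucket; the profile parameters, customer name, and packet representation are made up for the example.

```python
# Illustrative sketch of DiffServ traffic conditioning at a boundary
# node: classify the packet to find its traffic profile, meter it with
# a token bucket, and mark it in- or out-of-profile. All parameters
# (rate, depth, the "uw" customer) are assumptions for the example.

class TokenBucketMeter:
    def __init__(self, rate, depth):
        self.rate = rate      # token fill rate (bytes per time unit)
        self.depth = depth    # bucket depth, i.e., allowed burst (bytes)
        self.tokens = depth
        self.last = 0.0

    def conforms(self, size, now):
        """Refill tokens for elapsed time, then test conformance."""
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False

def condition(packet, profiles):
    """Classify by customer, meter against that profile, mark the packet."""
    meter = profiles[packet["customer"]]
    packet["mark"] = "in" if meter.conforms(packet["size"], packet["time"]) else "out"
    return packet

profiles = {"uw": TokenBucketMeter(rate=10.0, depth=100)}
p1 = condition({"customer": "uw", "size": 80, "time": 0.0}, profiles)
p2 = condition({"customer": "uw", "size": 80, "time": 1.0}, profiles)
print(p1["mark"], p2["mark"])  # in out: the second burst exceeds remaining tokens
```

A real conditioner would then shape or drop the out-of-profile packets, as described next; here the mark is simply attached so downstream elements can act on it.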
Out-of-profile traffic can be shaped to become in-profile, or dropped. This block is called traffic conditioning, and it usually happens at administrative boundaries.

Q: Meters could be at both ingress and egress points. The ingress point serves checking; what about the egress point? Is there any advantage?
A: The egress point enforces policy.

Two-layer architecture: packets are marked at edge routers; core routers simply use the marks to forward packets with the corresponding PHBs. The marks are called DSCPs (DiffServ codepoints, in IETF terminology).

Dynamic packet state: the codepoint is marked at the edge. A core router does not need to do extra marking and does not have to know an individual traffic stream's profile. Each packet carries some dynamically computed state, and routers can coordinate their actions using that state, simulating a stateful network. Without the dynamically computed codepoint, each router would need to keep traffic profiles and do the traffic conditioning itself.

Q: We've seen this before. Where?
A: XCP. Without packet-carried state, each router would need to keep per-TCP-flow round-trip times and cwnds.

We now discuss the types of per-hop behavior specified by the IETF.

1. Expedited forwarding (EF): forward with minimal delay and loss. EF traffic needs to be rate-limited before entering the network. Possible implementations: priority queue, WFQ.

2. Assured forwarding (AF): can be used to emulate best effort under different loads. Olympic service: gold (lightest load), silver, bronze. The actual service depends on provisioning: if a provider under-provisions the gold service, gold may get worse service than bronze, so it is the provider's responsibility to provision its network to deliver the advertised service. Service quality is controlled by provisioning, not admission. (You don't say no to users when they start sending packets; you turn them away when they purchase SLAs.) Implementation: WFQ with different drop precedences.
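One possible EF implementation mentioned above is a strict priority queue, sketched below; the DSCP labels and packet fields are illustrative, not the IETF's encodings.

```python
# Sketch of expedited forwarding via a strict priority queue: packets
# marked EF are always dequeued before best-effort traffic. Labels
# ("EF", "BE") and the packet dicts are assumptions for the example.

from collections import deque

class PriorityScheduler:
    def __init__(self):
        self.ef = deque()   # expedited-forwarding queue
        self.be = deque()   # best-effort queue

    def enqueue(self, packet):
        (self.ef if packet["dscp"] == "EF" else self.be).append(packet)

    def dequeue(self):
        """EF preempts best effort, which is why EF must be rate-limited
        at the edge; otherwise it can starve everything else."""
        if self.ef:
            return self.ef.popleft()
        if self.be:
            return self.be.popleft()
        return None

s = PriorityScheduler()
s.enqueue({"dscp": "BE", "id": 1})
s.enqueue({"dscp": "EF", "id": 2})
s.enqueue({"dscp": "BE", "id": 3})
print([s.dequeue()["id"] for _ in range(3)])  # [2, 1, 3]: EF jumps ahead
```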
Flows cannot be separated into different queues, because out-of-order delivery would cause TCP retransmissions. Within each queue, drop precedence separates traffic into different classes. The IETF defines four AF classes, each with three drop precedences.

RIO is a technique for dealing with excess traffic; it makes the best use of statistical multiplexing. Traffic is bursty, so when the network is not congested we might not want to drop packets at all. Mark non-conforming packets as out-of-profile; if the network becomes congested, drop them first. The out-of-profile traffic represents opportunistic traffic.

To summarize: best effort -> IntServ -> DiffServ. What comes after DiffServ: my work, and overlay networks.
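The RIO idea above can be sketched as two RED-style drop curves sharing one queue, with tighter thresholds for out-of-profile packets so opportunistic traffic is shed first under congestion. This is a deliberately simplified toy: real RIO uses averaged queue lengths (and an in-packets-only average for the in-profile curve), and all thresholds here are made up.

```python
# Simplified sketch of RIO (RED with In/Out): out-of-profile packets
# face lower thresholds and a higher drop probability, so they are
# dropped first when the shared queue fills. Uses instantaneous queue
# length and invented thresholds; real RIO averages the queue length.

import random

def rio_drop(queue_len, mark, rng=random.random):
    """Return True if the arriving packet should be dropped."""
    # (min_th, max_th, max_p): out-of-profile thresholds are tighter.
    params = {"in":  (40, 80, 0.02),
              "out": (10, 40, 0.50)}
    min_th, max_th, max_p = params[mark]
    if queue_len < min_th:
        return False               # queue short enough for this class
    if queue_len >= max_th:
        return True                # forced drop
    p = max_p * (queue_len - min_th) / (max_th - min_th)
    return rng() < p               # probabilistic early drop

# With a moderately full queue, out-of-profile packets go first:
print(rio_drop(50, "in", rng=lambda: 0.5))   # False: tiny drop probability
print(rio_drop(50, "out", rng=lambda: 0.5))  # True: past the out-profile max_th
```

When the queue is short, neither class is dropped, which is the statistical-multiplexing point: bursts ride through an uncongested network untouched.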