IntServ & DiffServ (Oct 27, 2004) by Xiaowei Yang

What we have seen:
- FIFO + TCP -> best-effort service
- Scheduling (WFQ), signaling, admission control, and flowspecs -> guaranteed service (rate and delay bound)

Today's lecture: engineering design exercises. We work out the mechanisms needed to provide different levels of service in the network. We are equipped with the basic building blocks; now we work through two variants of QoS networks. Two important techniques: soft state and dynamic packet state.

IntServ

Motivation: provide end-to-end service guarantees.
1. fine-grained, per-flow service guarantees
2. end-to-end service guarantees

Building blocks:
1. Routers: scheduling algorithms and buffer-management policies
2. Applications: flowspecs
3. From applications to routers: signaling
4. From routers to applications: admission control

Service classes in IntServ. The IETF specifies the service interfaces; it does not say how to implement them. We first present the three service classes, then describe the scheduling algorithms and admission-control decisions. The Clark, Shenker & Zhang (CSZ) paper laid out mechanisms to implement these service classes.

1. Guaranteed service. Provides a client firm guarantees: guaranteed bandwidth and bounded delay.
- scheduling: could be WFQ.
- flowspec: (r, b); the IETF specification contains more fields, but they are less important.
- admission decision: admit if the requested rate plus the sum of allocated rates is less than the capacity allocated to guaranteed service.
- disadvantages: (1) potential under-utilization if traffic is very bursty; (2) high per-flow jitter caused by WFQ spreading out bursts.
- policing: not needed, since packets from different users are put into different queues.

2. Controlled-load service. Provides the client's traffic with a quality of service closely approximating what the flow would receive from an unloaded network, but uses admission control to assure that this service is still received when the network is overloaded. It overcomes the two disadvantages of guaranteed service.
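To make the two admission styles concrete, here is a minimal Python sketch; the class names, parameters, and the EWMA load measurement are my own illustrative assumptions, not from the RFCs. The guaranteed-service test above compares the requested rate plus the worst-case sum of allocated rates against the class capacity, while the controlled-load test described next admits against a measured load, so the sum of declared rates may exceed capacity.

```python
# Illustrative sketch only: guaranteed-service admission (worst-case
# sum of allocated rates) vs. controlled-load admission (measured load
# plus the new flow's worst-case rate r). Names/parameters are assumed.

class GuaranteedAdmission:
    def __init__(self, capacity):
        self.capacity = capacity   # bandwidth set aside for guaranteed service
        self.allocated = 0.0       # sum of admitted flows' requested rates

    def admit(self, r):
        """Admit iff r + sum of allocated rates <= class capacity."""
        if self.allocated + r <= self.capacity:
            self.allocated += r
            return True
        return False

class ControlledLoadAdmission:
    def __init__(self, capacity, alpha=0.125):
        self.capacity = capacity
        self.alpha = alpha         # EWMA weight for the load measurement
        self.measured = 0.0        # smoothed estimate of existing flows' load

    def observe(self, load_sample):
        """Characterize existing flows by measurement, not declared rates."""
        self.measured = (1 - self.alpha) * self.measured + self.alpha * load_sample

    def admit(self, r):
        """Admit iff measured load + worst-case new rate r <= capacity."""
        return self.measured + r <= self.capacity

gs = GuaranteedAdmission(capacity=100.0)
print(gs.admit(60.0))   # True
print(gs.admit(50.0))   # False: 60 + 50 exceeds the 100 set aside

cl = ControlledLoadAdmission(capacity=100.0)
for _ in range(50):
    cl.observe(40.0)     # bursty flows measure well below their declared rates
print(cl.admit(50.0))    # True even if declared rates already sum near capacity
```

Because controlled load trusts measurements rather than declared rates, it can admit more flows than the worst-case test would; that is exactly the statistical-multiplexing gain this class is after.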
It does not give firm guarantees. The network makes measurements and, based on the results, makes an admission decision. The network may admit a flow even though the sum of all admitted flows' requested rates exceeds the network's capacity.
- scheduling: a separate queue; inside the queue, FIFO+ scheduling to minimize jitter. This is one way to do it.
- admission decision: based on heuristics. Use measurements to characterize existing flows, and the worst case for the new source: admit if the sum of the measured load and r is less than the bandwidth allocated to controlled-load service.
- policing: traffic filtering at the entrance of the network.

3. Best effort.
- scheduling: FIFO. No signaling, admission control, or flowspecs are needed.

Unified scheduling algorithm (in the CSZ paper; not mentioned by the RFCs): WFQ, with packets put into different queues. The question is how to deal with non-conforming traffic. If we put non-conforming traffic into a separate queue, packets may be sent out of order; alternatively, non-conforming flows take too much best-effort bandwidth because they do not back off, starving TCP flows. This problem is addressed in the CF paper.

We have studied a router scheduling algorithm (WFQ) and a traffic descriptor (the token bucket). Now let's look at signaling and admission control. We need a protocol that exchanges flowspecs and admission decisions between applications and routers.

Q: How do we design such a protocol? A general signaling and admission-control protocol. (Draw a picture with a sender, a bunch of routers, and a receiver.)

Step 1: explain why the protocol is receiver-driven, and how to implement that. The straightforward approach is for a sender to send its flowspec into the network; the flowspec is passed to the routers en route, and each router decides whether to admit the flow. If so, it passes on the request; if not, it sends an error back. The receiver then echoes back the decision. Disadvantages: (1) with one-to-many communication, a sender needs to make reservations for many receivers; (2)
a sender does not know what service level a receiver wants. So the RSVP designers decided to let the receiver make the reservation.

A sender sends a Path message to a receiver. The Path message contains the TSpec, which specifies the traffic characteristics of the flow (e.g., the token-bucket rate and size). When the receiver receives the Path message, it sends a Resv message to reserve resources for the desired service. The Resv message contains both the TSpec and the RSpec (reservation spec, e.g., guaranteed service with the reserved rate, or controlled-load service). The Resv message must follow the exact reverse of the path the Path message took, but Internet paths are asymmetric. So the Path message carries a field that stores the previous hop's address and sets up path state at each router en route from sender to receiver; the Resv message is then sent hop by hop. If a Resv fails, an error message is sent back to the receiver; otherwise it goes all the way back to the sender, and a reservation is made.

Step 2: explain the advantage of being receiver-driven: reservation merging. A sender sends one Path message; receivers decide what they want, and Resv messages can be merged. (Add a multicast tree to the previous picture.)

Step 3: explain soft state. What happens if the sender or the receiver dies? How do we garbage-collect the resources? The solution is soft state, which does not require explicit removal. A sender periodically sends Path messages while it has data to send, and a receiver periodically sends Resv messages. If the soft state times out without a renewal message, the path or resv state is removed and the resources are automatically reclaimed. This also helps with route failover: if a route fails, the routing algorithm finds a different path, and the resources reserved along the old route should be recycled. The periodic Path and Resv messages establish a new reservation over the new path when the old one fails.
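The soft-state idea above can be sketched in a few lines: state is installed or renewed by periodic refresh messages and silently garbage-collected on timeout. The lifetime value and all names here are illustrative, not RSVP's actual timer rules.

```python
# Minimal sketch of RSVP-style soft state: path/resv state stays alive
# only while periodic refresh (Path/Resv) messages keep arriving, so a
# dead sender, dead receiver, or abandoned old route is reclaimed
# automatically. Timing values are illustrative assumptions.

class SoftStateTable:
    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.entries = {}  # flow id -> time of last refresh

    def refresh(self, flow_id, now):
        """A Path or Resv message installs or renews the state."""
        self.entries[flow_id] = now

    def expire(self, now):
        """Remove timed-out state, implicitly reclaiming its resources."""
        dead = [f for f, t in self.entries.items()
                if now - t > self.lifetime]
        for f in dead:
            del self.entries[f]
        return dead

table = SoftStateTable(lifetime=30)
table.refresh("flow-1", now=0)
table.refresh("flow-2", now=0)
table.refresh("flow-1", now=25)   # flow-1 is renewed; flow-2 is not
print(table.expire(now=40))       # ['flow-2'] is garbage-collected
print(sorted(table.entries))      # ['flow-1'] survives
```

Note how no explicit teardown message is ever required: the same refresh mechanism handles crashed endpoints and route failover uniformly.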
Examples of how guaranteed service and controlled-load service can be implemented with RSVP.

Guaranteed service:
- TSpec: r, b, peak rate, minimum packet size, maximum packet size.
- RSpec: service class, rate, and a slack delay term. A node can use this term to degrade the service request. (From the RFC: "This slack term can be utilized by the network element to reduce its resource reservation for this flow. When a network element chooses to utilize some of the slack in the RSpec, it MUST follow specific rules in updating the R and S fields of the RSpec.")

Controlled-load service:
- TSpec: same as above.
- RSpec: just the service class.

IntServ summary:
1. Fine-grained, end-to-end guarantees: guaranteed service, controlled-load service, and best-effort service.
2. WFQ scheduling for guaranteed service.
3. RSVP for signaling: soft-state, receiver-driven. Complexity is added to tolerate multicast and failures.

DiffServ

Motivation. In the old days, the network was used by one big happy research family, and congestion control was enough: when we saw congestion, we backed off and let other people have their bandwidth share. Then we thought about giving users end-to-end service guarantees. That motivation is good, but it does not seem appealing to service providers. DiffServ aims to align the interests of providers and users. What are the providers' interests? Making money, of course. How? ISPs can make money by offering services that are better than best effort. It is similar to airline ticket pricing: a first-class ticket costs more but gets better treatment. Moreover, DiffServ only provides per-hop behaviors and makes no claim of end-to-end guarantees; the hope is that, by stitching together a sequence of DS domains, users get end-to-end QoS.

Q: Is there any reason to believe that better treatment at one hop brings better overall end-to-end performance? In other words, does incremental deployment bring incremental benefit?
A: Yes, in two situations: (1) if the first hop is your bottleneck; (2)
if most of your destinations belong to the same ISP.

Service interfaces:
1. Per-hop behavior. No end-to-end service guarantee, no firm guarantee; only relative guarantees (class 1 always gets better service than class 2).
2. Different services for different traffic aggregates.

Advantages over IntServ:
1. Easier to implement, since it needs no per-flow admission control or cooperation between ISPs.
2. Service-level agreements (SLAs) are negotiated between a customer domain and a provider domain; this fits the current business models.

Architecture: two-tier. A DS domain consists of boundary nodes and DS interior nodes. Packets are classified at the boundaries of the network and assigned to traffic classes; a traffic class is identified by a codepoint. At the core, packets are forwarded according to the per-hop behavior associated with the traffic class. (Draw a picture with different clouds, boundary nodes, and interior nodes.) A DS region is a contiguous set of DS domains.

                       +-------+
                       |       |----------------------+
                +----->| Meter |                      |
                |      |       |---+                  |
                |      +-------+   |                  |
                |                  V                  V
              +------------+      +--------+      +---------+
              |            |      |        |      | Shaper/ |
packets =====>| Classifier |=====>| Marker |=====>| Dropper |=====>
              |            |      |        |      |         |
              +------------+      +--------+      +---------+

Q: How is a packet assigned to a traffic class?
A: Traffic conditioning (pictured above). An SLA usually specifies how much of a customer's traffic will be granted what type of service; for example, UW could purchase 10 Mbps of class-1 service from a provider. Since traffic from one customer is put into queues shared with other customers, the provider must make sure one customer's excess traffic does not degrade the service other customers' legitimate traffic receives, so metering and marking are necessary. A traffic profile specifies the service granted to a traffic stream; the classifier locates the appropriate traffic profile for the meter to check against, and packets are marked as in- or out-of-profile.
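The classifier/meter/marker pipeline above can be sketched as follows, with the meter implemented as a token bucket; the profile parameters, customer name, and packet representation are made up for the example.

```python
# Illustrative sketch of DiffServ traffic conditioning at a boundary
# node: classify the packet to find its traffic profile, meter it with
# a token bucket, and mark it in- or out-of-profile. All parameters
# (rate, depth, the "uw" customer) are assumptions for the example.

class TokenBucketMeter:
    def __init__(self, rate, depth):
        self.rate = rate      # token fill rate (bytes per time unit)
        self.depth = depth    # bucket depth, i.e., allowed burst (bytes)
        self.tokens = depth
        self.last = 0.0

    def conforms(self, size, now):
        """Refill tokens for elapsed time, then test conformance."""
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False

def condition(packet, profiles):
    """Classify by customer, meter against that profile, mark the packet."""
    meter = profiles[packet["customer"]]
    packet["mark"] = "in" if meter.conforms(packet["size"], packet["time"]) else "out"
    return packet

profiles = {"uw": TokenBucketMeter(rate=10.0, depth=100)}
p1 = condition({"customer": "uw", "size": 80, "time": 0.0}, profiles)
p2 = condition({"customer": "uw", "size": 80, "time": 1.0}, profiles)
print(p1["mark"], p2["mark"])  # in out: the second burst exceeds remaining tokens
```

A real conditioner would then shape or drop the out-of-profile packets, as described next; here the mark is simply attached so downstream elements can act on it.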
Out-of-profile traffic can be shaped to become in-profile, or dropped. This block is called traffic conditioning, and it usually happens at administrative boundaries.

Q: Meters could be at both ingress and egress points. The ingress point serves checking; what about the egress point? Is there any advantage?
A: The egress point enforces policy.

Two-layer architecture: packets are marked at edge routers; core routers simply use the marks to forward packets with the corresponding PHBs. The marks are called DSCPs (DiffServ codepoints, in IETF terminology).

Dynamic packet state: the codepoint is marked at the edge. A core router does not need to do extra marking and does not have to know an individual traffic stream's profile. Each packet carries some dynamically computed state, and routers can coordinate their actions using that state, simulating a stateful network. Without the dynamically computed codepoint, each router would need to keep traffic profiles and do the traffic conditioning itself.

Q: We've seen this before. Where?
A: XCP. Without packet-carried state, each router would need to keep per-TCP-flow round-trip times and cwnds.

We now discuss the types of per-hop behavior specified by the IETF.

1. Expedited forwarding (EF): forward with minimal delay and loss. EF traffic needs to be rate-limited before entering the network. Possible implementations: priority queue, WFQ.

2. Assured forwarding (AF): can be used to emulate best effort under different loads. Olympic service: gold (lightest load), silver, bronze. The actual service depends on provisioning: if a provider under-provisions the gold service, gold may get worse service than bronze, so it is the provider's responsibility to provision its network to deliver the advertised service. Service quality is controlled by provisioning, not admission. (You don't say no to users when they start sending packets; you turn them away when they purchase SLAs.) Implementation: WFQ with different drop precedences.
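One possible EF implementation mentioned above is a strict priority queue, sketched below; the DSCP labels and packet fields are illustrative, not the IETF's encodings.

```python
# Sketch of expedited forwarding via a strict priority queue: packets
# marked EF are always dequeued before best-effort traffic. Labels
# ("EF", "BE") and the packet dicts are assumptions for the example.

from collections import deque

class PriorityScheduler:
    def __init__(self):
        self.ef = deque()   # expedited-forwarding queue
        self.be = deque()   # best-effort queue

    def enqueue(self, packet):
        (self.ef if packet["dscp"] == "EF" else self.be).append(packet)

    def dequeue(self):
        """EF preempts best effort, which is why EF must be rate-limited
        at the edge; otherwise it can starve everything else."""
        if self.ef:
            return self.ef.popleft()
        if self.be:
            return self.be.popleft()
        return None

s = PriorityScheduler()
s.enqueue({"dscp": "BE", "id": 1})
s.enqueue({"dscp": "EF", "id": 2})
s.enqueue({"dscp": "BE", "id": 3})
print([s.dequeue()["id"] for _ in range(3)])  # [2, 1, 3]: EF jumps ahead
```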
Flows cannot be separated into different queues, because out-of-order delivery would cause TCP retransmissions. Within each queue, drop precedence separates traffic into different classes. The IETF defines four AF classes, each with three drop precedences.

RIO is a technique for dealing with excess traffic; it makes the best use of statistical multiplexing. Traffic is bursty, so when the network is not congested we might not want to drop packets at all. Mark non-conforming packets as out-of-profile; if the network becomes congested, drop them first. The out-of-profile traffic represents opportunistic traffic.

To summarize: best effort -> IntServ -> DiffServ. What comes after DiffServ: my work, and overlay networks.
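The RIO idea above can be sketched as two RED-style drop curves sharing one queue, with tighter thresholds for out-of-profile packets so opportunistic traffic is shed first under congestion. This is a deliberately simplified toy: real RIO uses averaged queue lengths (and an in-packets-only average for the in-profile curve), and all thresholds here are made up.

```python
# Simplified sketch of RIO (RED with In/Out): out-of-profile packets
# face lower thresholds and a higher drop probability, so they are
# dropped first when the shared queue fills. Uses instantaneous queue
# length and invented thresholds; real RIO averages the queue length.

import random

def rio_drop(queue_len, mark, rng=random.random):
    """Return True if the arriving packet should be dropped."""
    # (min_th, max_th, max_p): out-of-profile thresholds are tighter.
    params = {"in":  (40, 80, 0.02),
              "out": (10, 40, 0.50)}
    min_th, max_th, max_p = params[mark]
    if queue_len < min_th:
        return False               # queue short enough for this class
    if queue_len >= max_th:
        return True                # forced drop
    p = max_p * (queue_len - min_th) / (max_th - min_th)
    return rng() < p               # probabilistic early drop

# With a moderately full queue, out-of-profile packets go first:
print(rio_drop(50, "in", rng=lambda: 0.5))   # False: tiny drop probability
print(rio_drop(50, "out", rng=lambda: 0.5))  # True: past the out-profile max_th
```

When the queue is short, neither class is dropped, which is the statistical-multiplexing point: bursts ride through an uncongested network untouched.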