From: Chandrika Jayant (cjayant@cs.washington.edu)
Date: Mon Nov 08 2004 - 01:07:09 PST
“An Analysis of Internet Content Delivery Systems”
Written by Saroiu, Gummadi, Dunn, Gribble, and Levy
Reviewed by Chandrika Jayant
This paper analyzes Internet content delivery with respect to four systems: HTTP web traffic, the Akamai CDN, Kazaa P2P, and Gnutella P2P (HTTP is used for P2P downloads, non-HTTP protocols for P2P searches). The traces are taken on all incoming and outgoing Internet traffic at the University of Washington's border.
P2P traffic accounts for the majority of transferred HTTP bytes. P2P documents are three orders of magnitude larger than web objects, which isn't surprising, but it IS surprising that such a low P2P request rate and small population of hosts still produces twice the flows of web traffic. A small number of large objects accounts for a very large fraction of observed P2P traffic. So many concurrent requests are being serviced because P2P transfers take so long (about 1000 times longer than transfers of web objects). In P2P, clients and servers are not clearly divided; load travels "similarly" in both directions. However, load is very poorly spread out: a small number of clients/servers makes up most of the observed traffic. I'm surprised that the load is not inherently balanced better, especially in Kazaa, which has supernodes.
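As a sanity check on that claim, here is a back-of-the-envelope Little's law calculation; all the numbers are invented for illustration, only the rough ratios (a much lower P2P request rate, roughly 1000x longer transfers) come from the paper:

    # Little's law: average concurrent flows N = request rate (lambda) * average
    # transfer duration (T). All numbers below are hypothetical, for illustration.
    web_rate = 100.0        # hypothetical web requests per second
    web_duration = 0.5      # hypothetical average web transfer time (seconds)
    p2p_rate = 1.0          # hypothetical P2P requests per second (100x lower)
    p2p_duration = 500.0    # hypothetical average P2P transfer time (1000x longer)

    print(web_rate * web_duration)   # 50 concurrent web flows
    print(p2p_rate * p2p_duration)   # 500 concurrent P2P flows

Even with 100 times fewer requests, transfers that take 1000 times longer leave far more flows in progress at any instant.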
A problem with growing P2P networks is that they don't seem to scale well at all. The bandwidth cost of each Kazaa peer turned out to be about 90 times that of a web client, so each added peer significantly increases the load on the network.
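To make that concrete, here is a small sketch; only the 90x ratio comes from the paper, while the per-client bandwidth figure and population mix are hypothetical:

    # If each Kazaa peer costs ~90x the external bandwidth of a web client (the
    # paper's ratio), a small peer population dominates the border link.
    # The per-client figure and host counts below are invented for illustration.
    web_client_load = 1.0                   # hypothetical bandwidth units per web client
    kazaa_peer_load = 90.0 * web_client_load

    web_clients, kazaa_peers = 10000, 500   # hypothetical population mix
    p2p_total = kazaa_peers * kazaa_peer_load
    total = web_clients * web_client_load + p2p_total
    print(f"P2P share of bandwidth: {p2p_total / total:.0%}")   # ~82% from under 5% of hosts

So even a modest number of additional peers shifts most of the border bandwidth to P2P.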
The authors present caching as a possible solution to the problems of growing CDN and P2P systems. They come to the conclusion that since much CDN content is static (unlike the WWW at large), a local web proxy could reduce the need for a separate CDN. This is not discussed much and is almost glossed over in comparison to the discussion of P2P networks; perhaps another paper should explore whether it would even be useful at all.
Proxy caching could help in P2P systems as well, but the authors present only a very preliminary proposal. P2P traffic appears to be a good candidate for caching since it is very repetitive; if such a cache were successfully deployed, wide-area bandwidth demands would be greatly reduced. The authors make the idea seem natural, but why hasn't anyone else pursued it yet (or why isn't related work mentioned)?
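To see why the repetitiveness matters, here is a toy simulation of the kind of P2P proxy cache the authors gesture at; the object sizes, popularity skew, and cache capacity are my own invented numbers, not anything from the paper:

    from collections import OrderedDict
    import random

    # Toy LRU proxy cache for outbound P2P requests. Because most requests go to
    # a handful of large, popular objects, the byte hit rate ends up high.
    random.seed(0)
    objects = {f"obj{i}": 100 + i for i in range(100)}   # object -> size (MB), invented
    popular = list(objects)[:10]                         # a few objects get most requests

    cache, cached_bytes, capacity = OrderedDict(), 0, 2000   # cache capacity in MB
    hit_bytes = total_bytes = 0
    for _ in range(10000):
        name = random.choice(popular) if random.random() < 0.9 else random.choice(list(objects))
        size = objects[name]
        total_bytes += size
        if name in cache:
            hit_bytes += size
            cache.move_to_end(name)                  # mark as recently used
        else:
            cache[name] = size
            cached_bytes += size
            while cached_bytes > capacity:           # evict least recently used
                _, evicted_size = cache.popitem(last=False)
                cached_bytes -= evicted_size
    print(f"byte hit rate: {hit_bytes / total_bytes:.0%}")

With a popularity distribution this skewed, even a simple LRU cache absorbs most of the bytes, which is the intuition behind the authors' proposal.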
The lack of generality in the paper bothered me. First of all, the authors picked Akamai as their CDN, and Kazaa and Gnutella as their P2P systems, without explaining how indicative these specific systems are of general trends in their categories (CDN vs. P2P). Also, the whole paper is written in the context of a large university setting; there is no discussion of what the results mean in non-university or smaller settings. I would assume that in non-university settings P2P traffic wouldn't be nearly as prevalent. Also, since better bandwidth is available at a university than at home, many people will try to download files from that type of setting, possibly biasing the outbound/inbound traffic model presented here. I appreciate the value of this paper in its specific setting and would even believe that it could model parts of other networks quite well, but I would need to be convinced.
The fact that, compared with a similar study three years earlier, video traffic had increased by almost 400% and MP3 traffic by 300% speaks for itself: this is an exciting and new branch of networking. Obviously, P2P sharing needs to be handled in a drastically different way for it ever to scale as populations and networks grow. Caching seems like a plausible way to battle this inevitable problem.