From: Ethan Katz-Bassett (ethan@cs.washington.edu)
Date: Mon Nov 08 2004 - 00:05:50 PST
In this paper, the authors present an investigation of content delivery on
the Internet by way of a trace of UW network traffic. They break the
traffic down into three categories of content delivery: WWW, content
delivery networks (CDN), and P2P. They pick Kazaa and Gnutella as P2P
systems to monitor, and select Akamai as their representative CDN. They did
not explain much about their choice of Akamai; I am curious as to what
proportion of CDN traffic it represents and how they made this selection (vs
selecting another or selecting multiple). They find that proxy caches could
largely replace a content delivery network. Comparing these results with
those of their similar 1999 study shows a clear change in web
traffic over that time: P2P now dominates in terms of bandwidth usage, and
audio and video traffic increased while images decreased as a proportion of
bandwidth.
The paper details how much P2P content dominates traditional web traffic (in
terms of bandwidth) and how P2P traffic differs from standard web traffic.
The P2P objects tend to be much larger, leading to longer transfer times. A
small number of Kazaa users represent most of the traffic. Similarly, a
small number of objects represents a large part of the transferred bytes.
This is partially due to the fact that objects are very popular for a short
period of time; this idea of "hot" downloads seems to indicate why
BitTorrent can work. They saw that UW "exports these large [P2P] objects
more than it imports them." This fact is perhaps to be expected, as the
demand of a P2P user does not necessarily increase with their bandwidth, but
users are likely to select downloads from users with fast connections (such
as UW students).
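The concentration described above (a few users or objects accounting for
most of the bytes) is easy to check on a trace. Here is a minimal sketch,
assuming a trace reduced to (key, bytes) pairs where the key is a user or
object identifier; the function name and the toy numbers are my own
illustration, not taken from the paper.

```python
from collections import Counter


def top_share(transfers, fraction):
    """Return the share of all bytes attributable to the top `fraction` of keys.

    transfers: iterable of (key, size_bytes) pairs; key is a user or object id.
    fraction:  portion of distinct keys to count, e.g. 0.1 for the top 10%.
    """
    bytes_per_key = Counter()
    for key, size in transfers:
        bytes_per_key[key] += size

    totals = sorted(bytes_per_key.values(), reverse=True)
    if not totals:
        return 0.0

    # Always count at least one key so tiny traces still give an answer.
    k = max(1, round(len(totals) * fraction))
    return sum(totals[:k]) / sum(totals)


# Hypothetical toy trace: one heavy user and three light ones.
toy_transfers = [("u1", 900), ("u2", 50), ("u3", 30), ("u4", 20)]
share = top_share(toy_transfers, 0.25)
```

On the toy trace the single heaviest user accounts for 900 of 1000 bytes, so
share comes out to 0.9.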
The authors argue that P2P peers demand so much bandwidth that the systems
do not scale well. This remark made me think of Clark's "Explicit
Allocation" paper; perhaps service allocation profiles could control how
much in-service P2P traffic users are allowed. They simulate a cache for
Kazaa traffic and find that such a system would dramatically reduce the
bandwidth needed. Of course, such a system is not currently feasible; I
doubt the RIAA would think kindly of UW facilitating P2P use by caching the
most popular songs and movies.
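To make the cache result concrete, here is a rough sketch of how a
trace-driven cache simulation might estimate bandwidth savings. It assumes
an idealized cache with unlimited capacity and no expiry, fed a trace of
(object id, size) requests; I do not know whether this matches the authors'
simulator, and the names and toy trace are hypothetical.

```python
def byte_hit_rate(trace):
    """Fraction of requested bytes that an ideal, unbounded cache would serve.

    trace: iterable of (object_id, size_bytes) requests in arrival order.
    The first request for an object is a miss; every later one is a hit.
    """
    cached = set()
    hit_bytes = 0
    total_bytes = 0
    for object_id, size in trace:
        total_bytes += size
        if object_id in cached:
            hit_bytes += size
        else:
            cached.add(object_id)
    return hit_bytes / total_bytes if total_bytes else 0.0


# Hypothetical toy trace: object "a" is requested three times.
toy_trace = [("a", 100), ("b", 50), ("a", 100), ("a", 100), ("c", 50)]
rate = byte_hit_rate(toy_trace)
```

Here 200 of the 400 requested bytes are repeat requests for "a", so rate is
0.5. A short burst of popularity for a few large objects is exactly the case
where this number gets large.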
They explain a bit about the architectural differences between Kazaa and
Gnutella; I am interested to know more about the architecture of different
P2P systems. They seem to have grouped Kazaa fragments into their composite
whole object; I wonder what percentage of transfers were fragmented and
whether a few peers end up delivering most of the fragments of a given
object.
The paper presents an interesting breakdown of web traffic. To
properly improve a network, it is helpful to understand how it is being
used. I had not realized that P2P services accounted for such a large
portion of traffic.