From: Ethan Phelps-Goodman (ethanpg@cs.washington.edu)
Date: Sun Nov 07 2004 - 22:50:48 PST
Analysis of Internet Content Delivery Systems
Saroiu et al.
In this paper the authors collect and analyze data about the type of
content delivery system used for traffic at the UW. They collect their
data through a nine day trace of all incoming and outgoing packets on
the UW border routers. Content is divided into four types: WWW, Akamai,
Gnutella, and Kazaa. (Akamai is a commercial system for replicating
popular static web content.) Their data is primarily about the number
of objects requested and total bytes transmitted for each content type.
Much of what they find has already been reported in an earlier study of
theirs and in other places. They find that Kazaa traffic dwarfs WWW
traffic, and that this traffic is 80% video files. Kazaa content even
outpaces UW web servers on outbound content. In addition, this content
is highly skewed: a small number of clients and a small number of files
account for a large portion of the bandwidth. They have a number of
specific findings, but few of them struck me as particularly new or
surprising. Also, the four categories they consider accounted for only
57% of overall bandwidth, leaving almost half the usage unaccounted
for. Also, their sample is certainly skewed by looking at a university.
I would guess the influence of P2P traffic is much higher here than in
other sectors.
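To make the skew claim concrete, here is a minimal sketch of how one might
measure the share of bytes due to the heaviest few clients or files. The
function name, the dictionary layout, and the sample numbers are all
assumptions for illustration; none of them come from the paper.

```python
def top_share(bytes_by_key, k):
    """Fraction of total bytes attributable to the k heaviest keys.

    bytes_by_key maps a client (or file) identifier to bytes transferred.
    """
    total = sum(bytes_by_key.values())
    if total == 0:
        return 0.0
    heaviest = sorted(bytes_by_key.values(), reverse=True)[:k]
    return sum(heaviest) / total


# Made-up byte counts, purely illustrative (not data from the paper).
sample = {"a": 700, "b": 200, "c": 50, "d": 30, "e": 20}
share_of_top_client = top_share(sample, 1)
```

A value close to 1 for a small k would indicate the kind of skew described
above.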
One surprising piece is that 600 non-UW peers were serving 25% of the
incoming Kazaa traffic. This suggests that the P2P network is doing a
poor job of distributing requests. They conclude that the average Kazaa
user consumes 90 times the bandwidth of a web user. Maybe the most
interesting result is that traffic served from UW Kazaa peers shows an
85% outbound cache hit rate. Since Kazaa servers in the UW account for
a majority of the outbound traffic considered, placing an outbound
cache would have a large impact on network usage.
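As a rough illustration of what a cache hit rate over a trace means, the
sketch below computes the byte hit rate of an idealized cache with unlimited
capacity and no expiry: every request for an object after the first counts as
a hit. Whether this matches the paper's exact methodology is an assumption on
my part, and the trace format and sample values are hypothetical.

```python
def ideal_byte_hit_rate(requests):
    """Byte hit rate of an infinite, never-expiring cache.

    requests is an iterable of (object_id, size_in_bytes) in trace order.
    The first request for an object is a miss; every repeat is a hit.
    """
    seen = set()
    hit_bytes = 0
    total_bytes = 0
    for object_id, size in requests:
        total_bytes += size
        if object_id in seen:
            hit_bytes += size
        else:
            seen.add(object_id)
    return hit_bytes / total_bytes if total_bytes else 0.0


# Made-up trace, purely illustrative (not data from the paper).
trace = [("x", 100), ("y", 50), ("x", 100), ("x", 100), ("z", 50)]
rate = ideal_byte_hit_rate(trace)
```

A real cache has finite storage and must cope with objects that change, so
the hit rate it achieves would presumably be lower than this ideal figure.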
Their study of Akamai seemed flawed. They say Akamai maintains 13,000
servers, but they were able to identify only 4,000. Presumably they are
undercounting Akamai traffic by a factor of 3? Also, in talking about
caching, they show that Akamai content has a very high hit rate. This
is hardly surprising though, given that Akamai exists to distribute
content that is static and popular. They suggest that Akamai could be
replaced by proxy caches, but to my understanding this is exactly what
Akamai is--a commercial, explicitly managed cache.
Ethan