From: Pravin Bhat (pravinb@u.washington.edu)
Date: Mon Nov 08 2004 - 03:11:43 PST
Paper Summary:
The paper presents a statistical analysis of a large trace of network traffic
collected at the edge routers of the University of Washington network. Three
classes of traffic (HTTP web traffic, Akamai CDN traffic, and P2P traffic) are
compared in terms of the network resources they consume and how that
consumption is distributed among network users. The analysis yields several
suggestions and conclusions specific to the UW network that could apply to
other networks as well.
Paper Strengths:
The paper presents several statistics from real network traffic recorded over
a period of nine days - some of which are quite surprising.
A key observation made by the paper is that bandwidth consumption at the
network edges has shifted toward a Zipf distribution in recent years due to
the increased popularity of P2P applications. It's surprising to see that the
top 200 Kazaa clients accounted for 27% of all inbound HTTP traffic in the UW
network. This observation is orthogonal to the common knowledge that web object
requests follow a Zipf distribution, which can be dealt with by technologies
like CDNs.
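To make the "top N clients account for X% of bytes" style of statistic concrete, here is a minimal Python sketch. It is not the paper's methodology: the function names, the 10,000-client population, and the Zipf exponent of 1.0 are all illustrative assumptions, and the synthetic volumes stand in for a real per-client byte count extracted from a trace.

```python
def top_k_share(bytes_per_client, k):
    """Fraction of total bytes attributable to the k heaviest clients."""
    total = sum(bytes_per_client)
    if total == 0:
        return 0.0
    heaviest = sorted(bytes_per_client, reverse=True)[:k]
    return sum(heaviest) / total


def zipf_volumes(n_clients, alpha=1.0, scale=1_000_000):
    """Hypothetical byte counts: the client of rank r gets scale / r**alpha."""
    return [scale / (rank ** alpha) for rank in range(1, n_clients + 1)]


# Assumed population size and exponent; neither comes from the paper.
volumes = zipf_volumes(10_000)
share = top_k_share(volumes, 200)
print(f"top 200 of {len(volumes)} clients carry {share:.1%} of the bytes")
```

With a real trace, `volumes` would be replaced by the measured bytes per client, and the same `top_k_share` call would reproduce the kind of figure quoted above.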
Along similar lines P2P software has changed the nature of participation at the
network edges. More hosts are now acting as servers. The top 170 Kazaa peers in
the UW network accounted for 50% of the outbound HTTP traffic. Unfortunately,
current networks are built on the assumption that servers are located towards
the center of the network. This assumption is reflected in the fact that
- more bandwidth is allocated at the center of the network
- nodes near the network edges have lower edge-connectivity
- there is less upstream bandwidth than downstream bandwidth near network edges.
If this trend continues to shift servers towards network edges, we will need to
fundamentally change the way we design networks.
The paper also identifies the fact that popular P2P software like Kazaa is
currently unable to distribute its traffic uniformly over the entire overlay
network. This calls for improved P2P algorithms and better caching
technologies at network/ISP peering points to minimize redundant inter-domain
traffic.
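As a rough illustration of what "minimizing redundant inter-domain traffic" means, the sketch below estimates how many bytes an idealized cache at the network border would save. It assumes an infinite cache and objects that never change, which is an upper bound rather than a realistic design; the function name and the toy trace are hypothetical.

```python
def ideal_cache_savings(requests):
    """Estimate savings from an idealized (infinite, never-stale) cache.

    requests: iterable of (object_id, size_bytes) in arrival order.
    Returns (bytes_requested, bytes_fetched_externally, byte_hit_rate).
    """
    seen = set()
    requested = 0
    fetched = 0
    for object_id, size in requests:
        requested += size
        if object_id not in seen:
            # First request for this object: it must cross the border link.
            seen.add(object_id)
            fetched += size
    hit_rate = (requested - fetched) / requested if requested else 0.0
    return requested, fetched, hit_rate


# Toy trace (made up): one large object requested three times.
trace = [("a", 700), ("b", 300), ("a", 700), ("a", 700)]
requested, fetched, hit_rate = ideal_cache_savings(trace)
print(requested, fetched, round(hit_rate, 3))
```

The byte hit rate is the fraction of requested bytes that never had to leave the local network; repeated transfers of a few large objects push it up quickly.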
Limitations and Areas for improvement:
While the conclusions drawn from the statistics presented in the paper might
apply to most campus networks, it is a stretch for the authors to make
sweeping generalizations of the form "Peer-to-peer traffic now accounts for
the majority of HTTP bytes transferred...". A campus network might not be
representative of general Internet traffic. For example, a considerable
fraction of Internet traffic must involve large corporations, where P2P
applications are generally blocked.
The drastic inequalities in bandwidth consumption reported in the paper can
occur only in the absence of a proper queuing strategy in the UW routers, which
can be easily fixed by patching the routers to use Fair Queuing. Caching can
only improve network efficiency by reducing redundant transfers; it will still
allow certain users to use up a large fraction of the internal bandwidth.
Fair Queuing, on the other hand, would ensure that no legitimate users are
robbed of the bandwidth they deserve. I'm surprised that the authors didn't
suggest this option. On the other hand, if the UW routers did use Fair Queuing
at the time of this experiment, then the "worst offenders" were simply using
bandwidth that would have otherwise gone unused.
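The argument above can be sketched numerically. Fair Queuing schedules packets, but the bandwidth split it approximates is the max-min fair allocation, so the Python sketch below computes that allocation directly instead of simulating a router. The flow names, demands, and link capacities are made-up assumptions.

```python
def max_min_fair(capacity, demands):
    """Return per-flow rates under max-min fairness.

    capacity: total link rate; demands: dict of flow id -> demanded rate.
    """
    allocation = {flow: 0.0 for flow in demands}
    remaining = float(capacity)
    unsatisfied = dict(demands)
    while unsatisfied and remaining > 1e-12:
        fair_share = remaining / len(unsatisfied)
        # Flows asking for no more than the fair share get their full demand.
        satisfied = {f: d for f, d in unsatisfied.items() if d <= fair_share}
        if not satisfied:
            # Every remaining flow wants more: split what is left evenly.
            for f in unsatisfied:
                allocation[f] = fair_share
            break
        for f, d in satisfied.items():
            allocation[f] = d
            remaining -= d
            del unsatisfied[f]
    return allocation


demands = {"heavy": 90, "light1": 1, "light2": 1}
congested = max_min_fair(30, demands)   # link is the bottleneck
idle = max_min_fair(100, demands)       # link has spare capacity
print(congested, idle)
```

On the congested link the light users get their full demand and the heavy user is limited to the remainder. On the under-used link the heavy user gets everything it asks for without hurting anyone, which is the second case described above.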
While this body of research has several merits, I'm worried about the ethical
issues involved in monitoring private Internet traffic. The authors take care
to anonymize sensitive data in the traffic trace, but it's a slippery slope
towards the point where it becomes common practice for networks to target
unpopular users using network monitoring.
Relevance and Future work:
It should be obvious that the paper is relevant to the future evolution of the
internet as it identifies key issues that are bound to dominate internet
traffic for a long time. In light of the growing popularity of P2P traffic, we
have to rethink our strategies regarding
- fair resource sharing
- network design to accommodate more servers at the network edges
- improved caching techniques for P2P traffic
- improved P2P algorithms that uniformly distribute load over the network and
exploit spatial and temporal locality