From: Tarik Nesh-Nash (tarikn_at_microsoft.com)
Date: Sun Mar 07 2004 - 11:22:16 PST
This paper studies user and object patterns in a P2P file-sharing
system and compares them with those of the web and other systems. It
concludes by recommending locality awareness to reduce bandwidth
consumption.
The paper opens with observations about the forces that drive P2P
systems, based on an analysis of Kazaa traffic collected at UW. Kazaa
serves media files; unlike files on the web, these objects are fewer,
larger, and immutable. The typical Kazaa user is greedy, rarely
available, and patient. Older users consume fewer bytes, either through
attrition or because they slow down over time. Old objects seem to be
less popular than short-lived objects, yet most requests target old
objects. This phenomenon is explained by "fetch-at-most-once" behavior:
a web user repeatedly fetches the same page because the page keeps
changing, whereas a media file is immutable and the user needs to
download it only once. This is clearly reflected in the short-term and
long-term patterns of users and objects. As
a conclusion, unlike web traffic, the P2P workload does not follow
Zipf's law: the most popular objects are requested far less often than
Zipf's law would predict. A comparison with other non-Zipf systems
confirms this. Behind a proxy cache, web traffic shows the same
behavior in its cache misses, which is to be expected since an infinite
cache leads to fetch-at-most-once behavior. Grouping the objects into
categories leads to Zipf's law behavior, which is consistent with the
non-Zipf behavior of the individual objects. The principle is similar
to movie sales, whether online (VoD) or through the box office, which do
indeed show the same sales patterns.
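
To make the fetch-at-most-once argument concrete, here is a small
simulation sketch. It is my own illustration, not the model from the
paper: the function name, the parameters, and the choice of a Zipf
exponent of 1.0 are all assumptions. Every client draws objects from the
same Zipf distribution; in the fetch-at-most-once case a client redraws
whenever it picks an object it already has.

```python
import itertools
import random
from collections import Counter


def simulate(num_clients, num_objects, requests_per_client,
             fetch_once, alpha=1.0, seed=0):
    """Return request counts per object, listed in popularity-rank order.

    Hypothetical toy model: object popularity is Zipf with exponent alpha.
    """
    if fetch_once and requests_per_client > num_objects:
        raise ValueError("a client cannot fetch more distinct objects than exist")
    rng = random.Random(seed)
    # Cumulative Zipf weights: the object at rank r has weight 1 / r**alpha.
    cum_weights = list(itertools.accumulate(
        1.0 / rank ** alpha for rank in range(1, num_objects + 1)))
    objects = range(num_objects)
    counts = Counter()
    for _ in range(num_clients):
        already_fetched = set()
        for _ in range(requests_per_client):
            obj = rng.choices(objects, cum_weights=cum_weights)[0]
            # Immutable media file: redraw until the client picks
            # something it has not downloaded yet.
            while fetch_once and obj in already_fetched:
                obj = rng.choices(objects, cum_weights=cum_weights)[0]
            already_fetched.add(obj)
            counts[obj] += 1
    return [counts[obj] for obj in objects]


if __name__ == "__main__":
    web_like = simulate(200, 500, 20, fetch_once=False)
    p2p_like = simulate(200, 500, 20, fetch_once=True)
    print("rank  repeat-fetch  fetch-at-most-once")
    for rank in (1, 2, 5, 10, 50):
        print(f"{rank:>4}  {web_like[rank - 1]:>12}  {p2p_like[rank - 1]:>18}")
```

With fetch-at-most-once, no object can receive more requests than there
are clients, so the head of the popularity curve is capped while the
tail picks up the displaced requests.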
A model was built to parameterize the different variables and analyze
their effects on the system. It was used to investigate changes in
request rates and in the number of clients and objects, and the
simulation results were very close to the predictions. The load slows
down as the clients age; although new clients improve performance, they
are not enough to compensate for this slowdown.
To reduce bandwidth, the idea of locality awareness seems very
attractive. Effective use of content that already exists within the
organization (e.g. UW) would eliminate a significant share of downloads
from external nodes. A proxy cache (ideally unlimited) achieves this
goal: any object that is downloaded gets cached, so future requests for
it are served from the proxy. The end result is that each object is
fetched from outside at most once. Legal and political restrictions may
prevent the implementation of this model. An alternative is static or
dynamic redirection, in which a redirector indexes the locations of the
objects. Locality awareness yields very impressive performance
results.
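
The ideal proxy cache described above can be sketched in a few lines.
This is only an illustration under my own assumptions: the class name,
the method names, and the byte-hit-rate metric are hypothetical and are
not taken from the paper. The cache has no size limit and never evicts,
so each object crosses the organization's external link at most once.

```python
class UnlimitedProxyCache:
    """Hypothetical ideal proxy: unlimited capacity, no eviction."""

    def __init__(self):
        self.cached_ids = set()   # objects already held inside the organization
        self.total_bytes = 0      # bytes requested by internal clients
        self.external_bytes = 0   # bytes actually pulled from external peers

    def request(self, object_id, size_bytes):
        """Serve one client request and report whether it was a hit or a miss."""
        self.total_bytes += size_bytes
        if object_id in self.cached_ids:
            return "hit"
        # First request for this object: fetch it from outside exactly once.
        self.cached_ids.add(object_id)
        self.external_bytes += size_bytes
        return "miss"

    def byte_hit_rate(self):
        """Fraction of requested bytes that never left the organization."""
        if self.total_bytes == 0:
            return 0.0
        return 1.0 - self.external_bytes / self.total_bytes


if __name__ == "__main__":
    cache = UnlimitedProxyCache()
    # Made-up trace of (object id, size in bytes) pairs.
    for object_id, size in [("a", 700), ("b", 300), ("a", 700), ("a", 700)]:
        print(object_id, cache.request(object_id, size))
    print("byte hit rate:", cache.byte_hit_rate())
```

A redirector would keep a similar index, but it would map each object to
the internal peers that hold it instead of storing the bytes itself.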
As usual, this UW publication is clean, clear and well structured.