From: Prasanna Kumar Jayapal (prasak_at_winse.microsoft.com)
Date: Mon Mar 08 2004 - 17:39:29 PST
This paper ("Measurement, Modeling, and Analysis of a P2P workload")
analyses the file-sharing workload characteristics of a p2p system and
compares them to those of typical web surfing. The authors present a
systematic analysis and show that fetch-at-most-once behavior (a
consequence of p2p objects being immutable) is the main source of the
difference from the web workload. The key point made in this paper is
that distributed data caching would be effective in reducing the
bandwidth consumption of the p2p system.
The data for this analysis comes from a 200-day trace of Kazaa p2p
traffic collected at UW, and the methodology is nicely described in the
paper. However, Kazaa is just one of the numerous p2p systems available,
and I would have liked to see some observations on other systems as
well, since the paper targets p2p systems in general. The main qualities
of a p2p system (or at least of Kazaa) are: p2p users are more patient
than web users, users issue fewer requests as they age, the workload
consists of a large set of immutable objects, and users fetch each
object at most once.
A model built from the observed statistics showed that p2p file-sharing
systems are driven primarily by the introduction of new data objects, by
new clients joining the system, and by clients' fetch-at-most-once
behavior. The resulting workload differs from the Zipf curves that are
often used to describe web workloads. I was not very familiar with Zipf
curves and found this part a little difficult to follow. From the
analysis of the model the authors conclude that, without the
introduction of new objects and clients, p2p system performance
decreases over time, and that new objects act as a rejuvenating factor
that counterbalances the impact of fetch-at-most-once behavior.
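To make the contrast with Zipf curves concrete, here is a minimal sketch
of the idea. It is not the paper's actual model: the function name, the
parameter values, and the "redraw until unfetched" rule are assumptions
chosen for illustration. Every client draws objects from the same
Zipf-like popularity distribution, either with repeats allowed
(web-like) or with each object fetched at most once per client
(p2p-like).

```python
import random
from collections import Counter
from itertools import accumulate


def simulate_requests(n_clients, n_objects, requests_per_client,
                      at_most_once, alpha=1.0, seed=0):
    """Return a Counter mapping object rank (0 = most popular) to request count."""
    if at_most_once and requests_per_client > n_objects:
        raise ValueError("a client cannot fetch more distinct objects than exist")
    rng = random.Random(seed)
    # Zipf-like popularity: the object at rank r has weight 1 / r**alpha.
    cum_weights = list(accumulate(1.0 / (r ** alpha)
                                  for r in range(1, n_objects + 1)))
    objects = list(range(n_objects))
    counts = Counter()
    for _ in range(n_clients):
        fetched = set()  # objects this client already holds
        for _ in range(requests_per_client):
            obj = rng.choices(objects, cum_weights=cum_weights)[0]
            # Assumed rule: under fetch-at-most-once, redraw until the
            # client picks an object it has not fetched before.
            while at_most_once and obj in fetched:
                obj = rng.choices(objects, cum_weights=cum_weights)[0]
            fetched.add(obj)
            counts[obj] += 1
    return counts


if __name__ == "__main__":
    web_like = simulate_requests(50, 200, 40, at_most_once=False)
    p2p_like = simulate_requests(50, 200, 40, at_most_once=True)
    # Print rank, web-like count, p2p-like count for a few ranks.
    for rank in (0, 1, 2, 9, 49):
        print(rank + 1, web_like[rank], p2p_like[rank])
```

In the at-most-once case no object can be requested more often than
there are clients, so the head of the popularity curve is flattened
compared with the plain Zipf case, and the displaced requests go to
less popular objects.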
The authors then use their workload to explore how locality and caching
could be exploited in p2p systems to reduce network traffic. They
discuss different techniques such as an organizational proxy cache,
request redirectors, and a locality-aware mechanism, and finally
demonstrate that the locality-aware mechanism would reduce external
bandwidth consumption by maximizing the use of data stored at local
peers. In summary, there is a good deal of locality in the p2p workload,
so a protocol for caching or redirecting can potentially lighten the
bandwidth load for the entire network. Overall, this was an interesting
and insightful paper to read.
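The locality argument can be sketched in a few lines. This is an
idealized illustration, not the paper's evaluation: the function name
and the trace format are hypothetical, and it assumes that local peers
are always online and keep every object they have fetched. A request
consumes external bandwidth only when no other peer inside the
organization already holds the object.

```python
def external_bytes(trace, locality_aware):
    """Sum the bytes fetched from outside the organization.

    trace is an iterable of (client, object_id, size_in_bytes) tuples in
    request order (a hypothetical format chosen for this sketch).
    """
    holders = {}  # object_id -> set of local clients that hold a copy
    external = 0
    for client, obj, size in trace:
        local_holders = holders.setdefault(obj, set())
        # With locality awareness, any other local peer holding the
        # object can serve it without touching the external link.
        served_locally = locality_aware and any(
            peer != client for peer in local_holders)
        if not served_locally:
            external += size
        local_holders.add(client)
    return external


if __name__ == "__main__":
    trace = [("a", "x", 100), ("b", "x", 100), ("c", "y", 50), ("a", "y", 50)]
    print(external_bytes(trace, locality_aware=False))
    print(external_bytes(trace, locality_aware=True))
```

Relaxing the always-online assumption would lower the savings, since a
request can only be redirected to a peer that is available at that
moment.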