Review of "Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload"

From: Jeff Duzak (jduzak_at_exchange.microsoft.com)
Date: Mon Mar 08 2004 - 11:19:45 PST


    This paper describes the results of the analysis of a 200-day-long trace
    of Kazaa requests from within UW. The analysis finds that the P2P
    file-sharing workload is significantly different from the typical web
    workload in a number of ways. The authors then show how the Kazaa
    workload benefits less from caching than a web workload. Last, the
    paper demonstrates how a system of file-sharing among users within the
    UW can decrease external request load from the UW.
     
    The first few sections describe a number of characteristics of the Kazaa
    workload. Many of these are intuitive, such as the often long
    transaction times and the drop-off in bytes requested by clients as the
    clients age. The fact that the frequency of requests by a client does
    not also drop off with a client's age surprised me. The paper goes on
    to describe how large file requests dominate the workload of the system,
    and how the most frequently requested objects are the most recently
    "born". These findings, likewise, are fairly intuitive.
     
    Next, the paper shows that the Kazaa workload, as well as the workloads
    of multimedia systems described in previous papers, does not follow a
    Zipf distribution. A model is used to show that the fetch-at-most-once
    nature of multimedia objects explains the non-Zipf distribution of a
    multimedia workload. Again, this is reasonable.
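
    To make the fetch-at-most-once argument concrete, here is a minimal
    sketch of such a model. It is not the paper's simulator: the function
    names, the rejection-sampling approach, and the parameter values are
    all assumptions chosen for illustration. Each client draws objects
    from a Zipf popularity distribution; in fetch-at-most-once mode a
    client redraws whenever it picks an object it already has.

```python
import random
from collections import Counter
from itertools import accumulate


def zipf_cum_weights(n_objects, alpha=1.0):
    """Cumulative Zipf weights: the object of rank r has weight 1 / r**alpha."""
    return list(accumulate(1.0 / (rank ** alpha) for rank in range(1, n_objects + 1)))


def simulate(n_clients, n_objects, requests_per_client, fetch_at_most_once, seed=0):
    """Return a Counter mapping object index (0 = most popular) to request count.

    Hypothetical model: every client issues the same number of requests,
    each drawn from the same Zipf distribution.
    """
    if fetch_at_most_once and requests_per_client > n_objects:
        raise ValueError("a client cannot fetch more distinct objects than exist")
    rng = random.Random(seed)
    cum = zipf_cum_weights(n_objects)
    objects = range(n_objects)
    counts = Counter()
    for _ in range(n_clients):
        already_fetched = set()
        fetched = 0
        while fetched < requests_per_client:
            obj = rng.choices(objects, cum_weights=cum, k=1)[0]
            if fetch_at_most_once and obj in already_fetched:
                continue  # the client already has this object, so redraw
            already_fetched.add(obj)
            counts[obj] += 1
            fetched += 1
    return counts


if __name__ == "__main__":
    zipf = simulate(100, 1000, 50, fetch_at_most_once=False)
    once = simulate(100, 1000, 50, fetch_at_most_once=True)
    for rank in (0, 1, 9, 99):
        print(rank + 1, zipf[rank], once[rank])
```

    In fetch-at-most-once mode no object can be requested more often
    than there are clients, so the head of the popularity curve is
    flattened relative to Zipf while the total number of requests stays
    the same.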
     
    One part of the model that seems a bit unreasonable is the assumption
    that clients will continue to make requests at a constant rate (2
    objects / day) even after they have already fetched the most popular
    objects. I would expect the client request rate to drop as the popular
    objects are used up. I believe the assumed constant request rate
    exaggerates the rate of requests for unpopular objects, and therefore
    exaggerates the decrease in cache hit rate. Nonetheless, the conclusion
    that fetch-at-most-once behavior causes decreased cache efficiency
    seems valid, and is a surprise.
     
    Last, the paper describes a system of redirecting object requests
    originating within the UW user community to peers within the same
    community. The system achieves a large reduction in requests to peers
    external to UW. More interestingly, tests of this system show that the
    most active clients are the most important for achieving this caching
    effect.
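
    The redirection idea can be sketched as a replay over a request
    trace. This is only an illustration under strong assumptions, not
    the paper's evaluation: the trace format, the function name
    external_fraction, and the rule that every internal peer that has
    fetched an object is always available to serve it are all
    hypothetical.

```python
def external_fraction(trace, excluded=frozenset()):
    """Fraction of requests that must be served by peers outside the organization.

    trace    -- iterable of (client, object) pairs in request order
    excluded -- internal clients that still request objects but never serve them
    """
    holders = {}  # object -> set of internal clients that have fetched it
    external = 0
    total = 0
    for client, obj in trace:
        total += 1
        # Internal peers able to serve this request: earlier fetchers of the
        # object, minus excluded clients and the requester itself.
        servers = holders.get(obj, set()) - excluded - {client}
        if not servers:
            external += 1
        holders.setdefault(obj, set()).add(client)
    return external / total if total else 0.0


if __name__ == "__main__":
    # Toy trace in which client "a" is the most active early fetcher.
    toy_trace = [("a", "x"), ("b", "x"), ("c", "x"),
                 ("a", "y"), ("c", "y"), ("b", "z")]
    print(external_fraction(toy_trace))
    print(external_fraction(toy_trace, excluded={"a"}))
```

    Comparing the result with and without a set of excluded clients
    gives a rough way to see how much the external load depends on the
    most active internal peers.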
     
    Overall, the paper presented a number of interesting results. However,
    it repeated the point that a P2P file-sharing workload is not Zipf
    more often than necessary. I wish the paper had spent more time
    discussing the implications of this finding.

