Reading Review 11-08-2004

From: Craig M Prince (cmprince@cs.washington.edu)
Date: Sun Nov 07 2004 - 21:48:46 PST

    Reading Review 11-08-2004
    -------------------------
    Craig Prince

    The paper titled "An Analysis of Internet Content Delivery Systems"
    attempts to compare and quantify the types of TCP traffic that occur over
    a campus network (namely the UW campus). More specifically, the paper
    compares four different ways that content is delivered: WWW requests,
    Akamai CDN HTTP requests, Gnutella P2P file transfers, and Kazaa P2P file
    transfers. The paper makes very interesting observations about the
    bandwidth consumed by each content delivery system, as well as the type
    of content, the number of unique flows in each system, the ratio of
    inbound to outbound traffic, etc.

    What I liked about this work was that even though the experiment was
    simple, the authors were still able to make some amazing observations.
    What surprised me the most was the sheer volume of P2P traffic on the
    network. Less surprising was the observation that most of the P2P
    traffic consists of large audio and video files. One of the unique
    contributions of this work, and what makes it more than a simple
    measurement study, is that the authors analyze the effectiveness of
    caching on these content delivery systems. They find that caching would
    improve performance for outbound WWW and Akamai requests to some extent,
    and they also find that a ton of bandwidth could be saved if incoming
    P2P traffic were serviced by a cache.
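
    To make the caching claim concrete, here is a minimal Python sketch (my
    own illustration, not the paper's actual methodology) of the kind of
    best-case analysis involved: given a trace of download records, it
    computes the byte hit rate an infinite cache with no expiry could have
    achieved, counting every transfer of an object after its first as a
    hit. The (object_id, bytes) record format and the function name are
    assumptions made for illustration.

        def ideal_byte_hit_rate(trace):
            """Fraction of transferred bytes that an infinite,
            never-expiring cache could have served locally."""
            seen = set()
            total_bytes = 0
            hit_bytes = 0
            for object_id, nbytes in trace:
                total_bytes += nbytes
                if object_id in seen:
                    hit_bytes += nbytes  # repeat download: a cache hit
                else:
                    seen.add(object_id)  # first download: compulsory miss
            return hit_bytes / total_bytes if total_bytes else 0.0

        # Example: four 100 MB downloads, three of the same file.
        trace = [("a", 100), ("b", 100), ("a", 100), ("a", 100)]
        print(ideal_byte_hit_rate(trace))  # 0.5 -- half the bytes repeat

    The intuition matches the paper's finding: because P2P downloads
    concentrate on a small set of large, immutable files, repeat downloads
    dominate the byte count and the achievable hit rate is high.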

    One assumption made by the authors is that e-mail is insignificant as a
    content delivery system. The authors ignore e-mail traffic, which can
    include attachments, etc. While I suspect the bandwidth used by e-mail
    is small in comparison to P2P, I wish e-mail traffic were included just
    for comparison.

    While the authors propose an interesting reverse P2P cache to handle the
    load of duplicate requests to the campus network, what concerned me were
    the real-world implications of doing so. Namely, does caching content
    place any liability on the university for the nature of that content?
    Legal matters aside, though, the observation that much of the P2P
    bandwidth involves the same small set of files is a very useful
    discovery.

    Overall I liked the thoroughness of the paper; however, I would have
    liked to be given some possible explanations for the results. Why has
    P2P traffic become so popular -- are there engineering/economic/legal
    reasons? In what ways does the fact that this is a campus network affect
    the results (this is addressed briefly)? Why are there just a few files
    that are so popular? Why aren't there many large WWW file transfers? Can
    we expect these results to continue to hold?

