Novel challenges arising from the different structures of Web 2.0 sites, richer methods of user interaction, new technologies, and a fundamentally different philosophy are identified.
An enhancement to CDNs is proposed that offers Web sites better protection against flash events, and trace-driven simulations are used to study the enhancement's effect on CDNs and Web sites.
This work designs a variant of the sketch data structure, the k-ary sketch, which uses a small, constant amount of memory and has constant per-record update and reconstruction cost; it summarizes traffic at various levels and detects significant changes by looking for flows with large forecast errors.
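A minimal Python sketch of the k-ary sketch idea (illustrative only: the class and parameter names are hypothetical, and Python's built-in hash stands in for the independent hash functions the data structure assumes):

```python
import random
import statistics

class KArySketch:
    """H rows of K counters; each record updates one counter per row in O(H)."""
    def __init__(self, rows=5, buckets=1024, seed=0):
        rng = random.Random(seed)
        self.rows, self.buckets = rows, buckets
        self.seeds = [rng.getrandbits(32) for _ in range(rows)]  # stand-in hash seeds
        self.table = [[0.0] * buckets for _ in range(rows)]
        self.total = 0.0

    def _bucket(self, row, key):
        return hash((self.seeds[row], key)) % self.buckets

    def update(self, key, value=1.0):
        # Constant per-record cost: one counter increment per row.
        for r in range(self.rows):
            self.table[r][self._bucket(r, key)] += value
        self.total += value

    def estimate(self, key):
        # Per-row unbiased estimate, combined by taking the median across rows.
        k = self.buckets
        per_row = [(self.table[r][self._bucket(r, key)] - self.total / k) / (1 - 1 / k)
                   for r in range(self.rows)]
        return statistics.median(per_row)
```

Change detection would then build a forecast sketch from past epochs (for example, by exponential smoothing of the counter arrays), subtract it from the observed sketch, and flag keys whose estimated forecast error exceeds a threshold.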
A detailed characterization of Twitter, an application that allows users to send short messages, is presented; it identifies distinct classes of Twitter users and their behaviors, geographic growth patterns, and the current size of the network.
A methodology for measuring personalization in Web search results is developed, and it is found that, on average, 11.7% of results show differences due to personalization, though this varies widely by search query and by result ranking.
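As an illustrative sketch of how such per-query differences can be quantified (function names are hypothetical, and the paper's exact metrics may differ), one can compare the ranked result lists returned to a test account and to a control account for the same query:

```python
def jaccard(results_a, results_b):
    """Jaccard index of the two URL sets (1.0 = identical sets)."""
    a, b = set(results_a), set(results_b)
    return len(a & b) / len(a | b) if a | b else 1.0

def differing_ranks(results_a, results_b):
    """Ranks at which the two ordered result lists disagree."""
    n = max(len(results_a), len(results_b))
    pad = lambda xs: list(xs) + [None] * (n - len(xs))
    return [i + 1 for i, (x, y) in enumerate(zip(pad(results_a), pad(results_b))) if x != y]
```

Averaging such scores over many queries, and comparing identically configured control accounts to estimate baseline noise, yields the kind of aggregate personalization rate reported above.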
A survey is deployed to 200 Facebook users recruited via Amazon Mechanical Turk; it finds that 36% of content remains shared with the default privacy settings, that privacy settings match users' expectations only 37% of the time, and that, when incorrect, they almost always expose content to more users than expected.
This paper reports on a longitudinal study, consisting of multiple snapshots, that examines how users' private information diffuses to third-party data aggregators as those users visit various Web sites.
How CDNs are commonly used on the Web is described and a methodology to study how well they perform is defined; it is found that placing a DNS lookup in the critical path of a resource retrieval does not generally result in better server choices, relative to client response time, in either average or worst-case situations.
It is shown that delta encoding can provide remarkable improvements in response size and response delay for an important subset of HTTP content types, and that the combination of delta encoding and data compression yields the best results.
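A toy Python illustration of the idea, using difflib's textual diff as a stand-in for an HTTP delta format such as vcdiff (the names and inputs are hypothetical):

```python
import difflib
import zlib

def delta_then_compress(cached_body: str, current_body: str) -> bytes:
    """Encode the current response as a compressed textual delta against a cached copy."""
    delta = "".join(difflib.unified_diff(
        cached_body.splitlines(keepends=True),
        current_body.splitlines(keepends=True),
        fromfile="cached", tofile="current"))
    return zlib.compress(delta.encode())

cached = "<html><body><p>price: 10</p></body></html>\n"
current = "<html><body><p>price: 11</p></body></html>\n"
compressed_full = zlib.compress(current.encode())        # compression alone
compressed_delta = delta_then_compress(cached, current)  # delta encoding + compression
```

On toy strings like these the container overhead dominates, but for realistically sized pages that change only slightly between requests, the compressed delta is typically far smaller than the compressed full response, which is the combination the paper finds most effective.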
This work presents a technique based on Principal Component Analysis (PCA) that accurately models the behavior of normal users and flags significant deviations from it as anomalous; the technique is applied to detect click-spam in Facebook ads, and a surprisingly large fraction of clicks is found to come from anomalous users.
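A minimal, hypothetical NumPy sketch of PCA-based anomaly scoring (not the paper's feature set or thresholds): fit the principal subspace of normal-user feature vectors, then score each user by the squared residual outside that subspace.

```python
import numpy as np

def fit_normal_subspace(X, k):
    """Top-k principal directions of the normal-user feature matrix X (n x d)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def residual_score(x, mean, components):
    """Squared distance of x from the normal subspace (larger = more anomalous)."""
    centered = x - mean
    projection = components.T @ (components @ centered)
    return float(np.sum((centered - projection) ** 2))

# Hypothetical usage: rows of X_normal are per-user features such as click counts
# and inter-click times; users whose score exceeds a threshold are flagged.
rng = np.random.default_rng(0)
X_normal = rng.normal(size=(500, 8))
mean, comps = fit_normal_subspace(X_normal, k=3)
scores = [residual_score(x, mean, comps) for x in rng.normal(size=(10, 8))]
```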