Measurements, analysis, and modeling of BitTorrent-like systems
Existing studies on BitTorrent systems are single-torrent based, while more than 85% of all peers participate in multiple torrents according to our trace analysis. In addition, these studies are not…
  • 446
  • 37
Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems
Cache partitioning and sharing is critical to the effective utilization of multicore processors. However, almost all existing studies have been evaluated by simulation that often has several…
  • 374
  • 27
A performance study of BitTorrent-like peer-to-peer systems
This paper presents a performance study of BitTorrent-like P2P systems by modeling, based on extensive measurements and trace analysis. Existing studies on BitTorrent systems are single-torrent based…
  • 165
  • 16
DULO: an effective buffer cache management scheme to exploit both temporal and spatial locality
Sequentiality of requested blocks on disks, or their spatial locality, is critical to the performance of disks, where the throughput of accesses to sequentially placed disk blocks can be an order of…
  • 170
  • 11
ULCC: a user-level facility for optimizing shared cache performance on multicores
Scientific applications face serious performance challenges on multicore processors, one of which is caused by access contention in last level shared caches from multiple running threads. The…
  • 56
  • 9
DiskSeen: Exploiting Disk Layout and Access History to Enhance I/O Prefetch
Current disk prefetch policies in major operating systems track access patterns at the level of the file abstraction. While this is useful for exploiting application-level access patterns, file-level…
  • 152
  • 8
BWS: balanced work stealing for time-sharing multicores
Running multithreaded programs in multicore systems has become a common practice for many application domains. Work stealing is a widely-adopted and effective approach for managing and scheduling the…
  • 30
  • 7
Concurrent Analytical Query Processing with GPUs
In current databases, GPUs are used as dedicated accelerators to process each individual query. Sharing GPUs among concurrent queries is not supported, causing serious resource underutilization.…
  • 36
  • 5
Soft-OLP: Improving Hardware Cache Performance through Software-Controlled Object-Level Partitioning
Performance degradation of memory-intensive programs caused by the LRU policy's inability to handle weak-locality data accesses in the last level cache is increasingly serious for two reasons. First,…
  • 56
  • 4
Enabling software management for multicore caches with a lightweight hardware support
The management of shared caches in multicore processors is a critical and challenging task. Many hardware and OS-based methods have been proposed. However, they may be hardly adopted in practice due…
  • 46
  • 4