A new perspective for efficient virtual-cache coherence

@inproceedings{Kaxiras2013ANP,
  title={A new perspective for efficient virtual-cache coherence},
  author={S. Kaxiras and Alberto Ros},
  booktitle={ISCA},
  year={2013}
}
  • S. Kaxiras, Alberto Ros
  • Published in ISCA 2013
  • Computer Science
  • Coherent shared virtual memory (cSVM) is highly coveted for heterogeneous architectures as it will simplify programming across different cores and manycore accelerators. In this context, virtual L1 caches can be used to great advantage, e.g., saving energy consumption by eliminating address translation for hits. Unfortunately, multicore virtual-cache coherence is complex and costly because it requires reverse translation for any coherence request directed towards a virtual L1. The reason is the…
    52 Citations
    Early Experiences with Separate Caches for Private and Shared Data
    • 3
    Efficient Intra-SM Slicing through Dynamic Resource Partitioning for GPU Multiprogramming
    • 5
    • Highly Influenced
    Reducing address translation overheads with virtual caching
    Architectural Support for Address Translation on GPUs
    • 24
    Devirtualizing virtual memory for heterogeneous systems
    Architectural Support for Virtual Memory in GPUs
    TLB Shootdown Mitigation for Low-Power Many-Core Servers with L1 Virtual Caches
    • 10
    Efficient synonym filtering and scalable delayed translation for hybrid virtual caching
    • 1
    • Highly Influenced
    Efficient synonym filtering and scalable delayed translation for hybrid virtual
    • 2
    • Highly Influenced

    References

    The Synonym Lookaside Buffer: A Solution to the Synonym Problem in Virtual Caches
    • 17
    • Highly Influential
    Reducing memory reference energy with opportunistic virtual caching
    • 69
    • Highly Influential
    Enigma: architectural and operating system support for reducing the impact of address translation
    • 27
    • Highly Influential
    DeNovo: Rethinking the Memory Hierarchy for Disciplined Parallelism
    • 152
    • Highly Influential
    Shared last-level TLBs for chip multiprocessors
    • 124
    • Highly Influential
    Organization and performance of a two-level virtual-real cache hierarchy
    • 100
    • Highly Influential
    Coherency for multiprocessor virtual address caches
    • 58
    • Highly Influential
    Reactive NUCA: near-optimal block placement and replication in distributed caches
    • 401
    • Highly Influential
    Virtual-address caches. Part 1: problems and solutions in uniprocessors
    • 86
    • Highly Influential
    Subspace snooping: Filtering snoops with operating system support
    • 57
    • Highly Influential