Remote memory in the age of fast networks

@inproceedings{Aguilera2017RemoteMI,
  title={Remote memory in the age of fast networks},
  author={Marcos K. Aguilera and Nadav Amit and Irina Calciu and Xavier Deguillard and Jayneel Gandhi and Pratap Subrahmanyam and Lalith Suresh and Kiran Tati and Rajesh Venkatasubramanian and Michael Yung Chung Wei},
  booktitle={Proceedings of the 2017 Symposium on Cloud Computing},
  year={2017}
}
Published 24 September 2017.
As the latency of the network approaches that of memory, it becomes increasingly attractive for applications to use remote memory---random-access memory at another computer that is accessed using the virtual memory subsystem. This is an old idea whose time has come, in the age of fast networks. To work effectively, remote memory must address many technical challenges. In this paper, we enumerate these challenges, discuss their feasibility, explain how some of them are addressed by recent work…
Citations

Can far memory improve job throughput?
It is found that while far memory is not a panacea, for memory-intensive workloads it can provide performance improvements on the order of 10% or more, even without changing the total amount of memory available.
More Exploration to Composable Infrastructure: The Application and Analysis of Composable Memory
This is the first design to provide significant memory extension without any software modification, together with large-scale performance evaluations against other state-of-the-art network-based remote memory systems.
The Case for Physical Memory Pools: A Vision Paper
It is argued that physical memory pools are essential for cheaper and more efficient cloud computing infrastructures, and the research challenges to implementing them are identified.
A Survey on the Challenges of Implementing Physical Memory Pools
This article identifies enabling technologies for physical memory pools, such as OS design, distributed shared memory structures, and virtualization, with regard to their relevance and impact on eliminating memory limits, and discusses the challenges for physical memory pools shared by multiple servers.
Rethinking software runtimes for disaggregated memory
A new software runtime for disaggregated memory is implemented that improves average memory access time by 1.7-5X and reduces dirty data amplification by 2-10X compared to state-of-the-art systems.
Cooperative Memory Expansion via OS Kernel Support for Networked Computing Systems
Cooperative memory expansion (COMEX), an OS kernel extension, establishes a stable pool of memory across the nodes of a cluster; it extends the OS memory subsystem so that a process's page table can track remote memory page frames, aggregating memory from connected machines without programmer effort or modifications to application code.
AIFM: High-Performance, Application-Integrated Far Memory
Application-integrated far memory (AIFM) makes remote, "far" memory available to applications through a simple, high-performance API and allows data structure engineers to build remoteable, hybrid near/far-memory data structures.
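The core idea, remoteable objects that migrate between a bounded local cache and far memory on access, can be illustrated in miniature. This is a hypothetical sketch, not AIFM's actual API: `RemPtr`, `FAR_MEMORY`, and `deref` are illustrative names, and a dict stands in for memory on a remote host.

```python
import pickle

FAR_MEMORY = {}          # stands in for memory on a remote host
LOCAL_CAPACITY = 2       # max objects resident in local memory

class RemPtr:
    """Illustrative 'remoteable pointer': hot objects stay local,
    cold ones are evicted to the simulated far-memory store."""
    _local = {}          # object id -> value (local cache)
    _lru = []            # resident ids, least-recently-used first

    def __init__(self, oid, value):
        self.oid = oid
        RemPtr._local[oid] = value
        self._touch(oid)

    @classmethod
    def _touch(cls, oid):
        if oid in cls._lru:
            cls._lru.remove(oid)
        cls._lru.append(oid)
        while len(cls._local) > LOCAL_CAPACITY:
            victim = cls._lru.pop(0)
            # evict: serialize and ship the cold object to "far" memory
            FAR_MEMORY[victim] = pickle.dumps(cls._local.pop(victim))

    def deref(self):
        # swap in from far memory on access, like a software page fault
        if self.oid not in RemPtr._local:
            RemPtr._local[self.oid] = pickle.loads(FAR_MEMORY.pop(self.oid))
        RemPtr._touch(self.oid)
        return RemPtr._local[self.oid]

a, b, c = RemPtr("a", [1]), RemPtr("b", [2]), RemPtr("c", [3])
assert "a" in FAR_MEMORY      # "a" was evicted when "c" arrived
assert a.deref() == [1]       # dereferencing faults it back in
```

The real system does this in C++ with pauseless evacuation and prefetching; the sketch only shows the near/far placement decision happening at dereference time rather than at page granularity.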
Rethinking Software Runtimes for Disaggregated Memory (Extended Abstract)
Disaggregated memory addresses resource-provisioning inefficiencies in current datacenters by improving memory utilization and decreasing the total memory over-provisioning necessary to avoid…
Project PBerry: FPGA Acceleration for Remote Memory
This approach uses emerging cache-coherent FPGAs to expose cache-coherence events to the operating system; it also enables other use cases, such as live virtual machine migration, unified virtual memory, security, and code analysis, which open up many promising research directions.
Remote regions: a simple abstraction for remote memory
An intuitive abstraction is proposed, and implemented in the Linux kernel, for a process to export its memory to remote hosts and to access the memory exported by others; remote regions are shown to be easy to use and to perform close to RDMA.
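The export/attach-by-name flow can be mirrored on a single machine with Python's `multiprocessing.shared_memory` module. This is only an analogue: real remote regions cross hosts through the kernel, whereas here both "processes" share local memory, and the variable names are illustrative.

```python
from multiprocessing import shared_memory

# Exporter: create a region and write into it.
region = shared_memory.SharedMemory(create=True, size=4096)
region.buf[:5] = b"hello"

# Importer (normally a separate process, possibly informed of the
# region's name out of band): attach to the region by name.
peer = shared_memory.SharedMemory(name=region.name)
received = bytes(peer.buf[:5])

peer.close()
region.close()
region.unlink()   # tear down the exported region

assert received == b"hello"
```

The analogous steps in the paper's design are export (create), attach by name (map into the importer's address space), and detach/teardown.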

References

Showing 1-10 of 46 references.
Swapping to Remote Memory over InfiniBand: An Approach using a High Performance Network Block Device
This paper presents the design and implementation of a high-performance network block device (HPBD) over InfiniBand fabric, which serves as a swap device for the kernel virtual memory (VM) system, enabling efficient page transfer to and from remote memory servers.
FaRM: Fast Remote Memory
We describe the design and implementation of FaRM, a new main-memory distributed computing platform that exploits RDMA to improve both latency and throughput by an order of magnitude relative to…
TreadMarks: Shared Memory Computing on Networks of Workstations
This work discusses experience with parallel computing on networks of workstations using the TreadMarks distributed shared memory system, which allows processes to assume a globally shared virtual memory even though they execute on nodes that do not physically share memory.
Adaptive Main Memory Compression
I. Tuduce and T. Gross, USENIX Annual Technical Conference, General Track, 2005.
A memory compression solution is described that adapts the allocation of real memory between uncompressed and compressed pages, and that manages fragmentation without user involvement.
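The mechanism underneath, keeping a cold page resident in compressed form instead of paging it to disk and decompressing on the next access, can be sketched with the standard `zlib` module. This is illustrative only; the paper's adaptive sizing of the compressed region is not modeled.

```python
import zlib

PAGE_SIZE = 4096
cold_page = bytes(range(256)) * (PAGE_SIZE // 256)   # a compressible page

packed = zlib.compress(cold_page)   # "evict" into the compressed pool
assert len(packed) < PAGE_SIZE      # the compressed copy uses less RAM

restored = zlib.decompress(packed)  # "fault" the page back in on access
assert restored == cold_page
```

The win is that a compressed-pool hit costs a decompression (microseconds) rather than a disk or network round trip, at the price of RAM reserved for the pool, which is what the adaptive allocation balances.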
The Network RamDisk: Using remote memory on heterogeneous NOWs
This paper describes the design, implementation, and evaluation of a Network RamDisk device that uses the main memory of remote workstations as a faster-than-disk storage device, and proposes various reliability policies that make the device tolerant to single-workstation crashes.
System-level implications of disaggregated memory
A software-based prototype is developed by extending the Xen hypervisor to emulate a disaggregated memory design in which remote pages are swapped into local memory on demand upon access, showing that low-latency remote memory calls for a different regime of replacement policies than conventional disk paging.
Latency-Tolerant Software Distributed Shared Memory
Grappa enables users to program a cluster as if it were a single, large, non-uniform memory access (NUMA) machine; it addresses deficiencies of previous DSM systems by exploiting application parallelism and trading latency for throughput.
Using One-Sided RDMA Reads to Build a Fast, CPU-Efficient Key-Value Store
This paper explores the design of Pilaf, a distributed in-memory key-value store that takes advantage of remote direct memory access to achieve high performance with low CPU overhead, and introduces the notion of self-verifying data structures that can detect read-write races without client-server coordination.
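The self-verifying idea is that each entry carries a checksum of its payload, so a reader racing with a concurrent writer (e.g. over one-sided RDMA reads, with no server mediation) can detect a torn read and retry. A minimal sketch, with illustrative function names:

```python
import zlib

def make_entry(payload: bytes) -> bytes:
    """Prefix the payload with a CRC32 checksum of its contents."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def read_entry(entry: bytes):
    """Return the payload, or None if verification fails (torn read)."""
    checksum, payload = entry[:4], entry[4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        return None   # caller should retry the read
    return payload

good = make_entry(b"value-1")
assert read_entry(good) == b"value-1"

# A read that observes half-old, half-new bytes fails verification:
torn = good[:6] + make_entry(b"value-2")[6:]
assert read_entry(torn) is None
```

A production design would use a stronger checksum and handle the (rare) chance of a collision; the point here is only that verification needs no coordination with the server.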
Disaggregated memory for expansion and sharing in blade servers
It is demonstrated that memory disaggregation can provide substantial performance benefits (10X on average) in memory-constrained environments, while the sharing it enables can improve performance per dollar by up to 57% when memory provisioning is optimized across multiple servers.
Efficient Memory Disaggregation with Infiniswap
The design and implementation of Infiniswap is described, a remote memory paging system designed specifically for RDMA networks; it increases the overall memory utilization of a cluster and works well at scale.