Re-architecting datacenter networks and stacks for low latency and high performance

@article{Handley2017RearchitectingDN,
  title={Re-architecting datacenter networks and stacks for low latency and high performance},
  author={Mark Handley and Costin Raiciu and Alexandru Agache and Andrei Voinescu and Andrew W. Moore and Gianni Antichi and Marcin W{\'o}jcik},
  journal={Proceedings of the Conference of the ACM Special Interest Group on Data Communication},
  year={2017}
}
  • M. Handley, C. Raiciu, Marcin Wójcik
  • Published 7 August 2017
  • Computer Science
  • Proceedings of the Conference of the ACM Special Interest Group on Data Communication
Modern datacenter networks provide very high capacity via redundant Clos topologies and low switch latency, but transport protocols rarely deliver matching performance. We present NDP, a novel data-center transport architecture that achieves near-optimal completion times for short transfers and high flow throughput in a wide range of scenarios, including incast. NDP switch buffers are very shallow, and when they fill, the switches trim packets to headers and priority-forward the headers. This…
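The trimming behaviour described in the abstract can be illustrated with a minimal sketch. It models a single output port under simplifying assumptions that are not taken from the abstract: the names `Packet` and `TrimmingPort` are hypothetical, the data-queue capacity is an arbitrary illustrative value, the header queue is unbounded, and headers are served with strict priority over data packets.

```python
from collections import deque
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    flow_id: int
    seq: int
    payload: bytes          # empty once the packet has been trimmed
    trimmed: bool = False


class TrimmingPort:
    """One output port: a shallow data queue plus a priority header queue.

    Hypothetical model for illustration only, not the NDP implementation.
    """

    def __init__(self, data_capacity: int = 8) -> None:
        # data_capacity is an assumed value chosen for this sketch.
        self.data_capacity = data_capacity
        self.data_queue: deque = deque()
        self.header_queue: deque = deque()

    def enqueue(self, pkt: Packet) -> None:
        if pkt.trimmed:
            # Already a bare header (trimmed upstream): keep it prioritized.
            self.header_queue.append(pkt)
        elif len(self.data_queue) < self.data_capacity:
            self.data_queue.append(pkt)
        else:
            # Data queue full: discard the payload, keep the header.
            self.header_queue.append(replace(pkt, payload=b"", trimmed=True))

    def dequeue(self) -> Optional[Packet]:
        # Strict priority (an assumption of this sketch): headers go first.
        if self.header_queue:
            return self.header_queue.popleft()
        if self.data_queue:
            return self.data_queue.popleft()
        return None
```

With `data_capacity=2`, enqueuing three full packets leaves two in the data queue and turns the third into a header, which is then the first item dequeued. The receiver therefore learns quickly that a payload was cut, rather than having to infer a loss from a timeout.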
Expanding across time to deliver bandwidth efficiency and low latency
TLDR
Opera is presented, a dynamic network that delivers latency-sensitive traffic quickly by relying on multi-hop forwarding in the same way as expander-graph-based approaches, but provides near-optimal bandwidth for bulk flows through direct forwarding over time-varying source-to-destination circuits.
FatPaths: Routing in Supercomputers, Data Centers, and Clouds with Low-Diameter Networks when Shortest Paths Fall Short
We introduce FatPaths: a simple, generic, and robust routing architecture for Ethernet stacks. FatPaths enables state-of-the-art low-diameter topologies such as Slim Fly to achieve unprecedented
Homa: a receiver-driven low-latency transport protocol using network priorities
TLDR
The implementation of Homa delivers 99th percentile round-trip times less than 15 μs for short messages on a 10 Gbps network running at 80% load, almost 100x lower than the best published measurements of an implementation.
RPO: Receiver-driven Transport Protocol Using Opportunistic Transmission in Data Center
TLDR
Small-scale testbed experiments and large-scale simulations show that RPO significantly improves the network utilization by up to 35% under high workload over the state-of-the-art receiver-driven transmission schemes, without introducing additional queueing delay.
PL2: Towards Predictable Low Latency in Rack-Scale Networks
TLDR
A Predictable Low Latency (PL2) network architecture for rack-scale systems with Ethernet as interconnecting fabric that leverages programmable Ethernet switches to carefully schedule packets such that they incur no loss with NIC and switch queues maintained at small, near-zero levels.
Superways: A Datacenter Topology for Incast-heavy workloads
TLDR
This work proposes Superways, a heterogeneous datacenter topology that provides higher bandwidth for some servers to absorb incasts, as incasts occur only at a small number of servers that aggregate responses from other senders.
Exploring Token-Oriented In-Network Prioritization in Datacenter Networks
TLDR
This article proposes a readily-deployable remedy to achieve in-network prioritization by pushing both switch and end-host hardware capacity to an extreme end and implements a running TOP system with Linux hosts and commodity switches, and investigates the applicability of TOP.
Polo: Receiver-Driven Congestion Control for Low Latency over Commodity Network Fabric
TLDR
Polo is presented to realize low latency for flows over commodity network fabric, relying on Explicit Congestion Notification (ECN) and priority queues. Results show that Polo outperforms the state-of-the-art receiver-driven protocols in a wide range of scenarios, including incast.
PowerTCP: Pushing the Performance Limits of Datacenter Networks
TLDR
It is shown analytically and empirically that PowerTCP can significantly outperform the state-of-the-art in both traditional datacenter topologies and emerging reconfigurable datacenters where frequent bandwidth changes make congestion control challenging.
FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short
TLDR
FatPaths is introduced, a simple, generic, and robust routing architecture that enables state-of-the-art low-diameter topologies such as Slim Fly to achieve unprecedented performance and may become a standard routing scheme for modern topologies.

References

SHOWING 1-10 OF 40 REFERENCES
Congestion Control for Large-Scale RDMA Deployments
TLDR
DCQCN, an end-to-end congestion control scheme for RoCEv2, is introduced and it is shown that DCQCN dramatically improves throughput and fairness of RoCEv2 RDMA traffic.
Scalable, optimal flow routing in datacenters via local link balancing
TLDR
An optimization decomposition is used to prove LocalFlow's optimality when combined with unmodified end hosts' TCP, and it is shown that since LocalFlow acts independently on each switch, it is highly scalable, adapts quickly to dynamic workloads, and admits flexible deployment strategies.
Fastpass: A Centralized “Zero-Queue” Datacenter Network
TLDR
This paper describes Fastpass, a datacenter network architecture built using this principle that achieves high throughput comparable to current networks at a 240x reduction in queue lengths, and achieves much fairer and more consistent flow throughputs than the baseline TCP.
Presto: Edge-based Load Balancing for Fast Datacenter Networks
TLDR
Presto, a soft-edge load balancing scheme whose performance closely tracks that of a single, non-blocking switch over many workloads and that adapts to failures and topology asymmetry, is designed and implemented.
Data center TCP (DCTCP)
TLDR
DCTCP enables the applications to handle 10X the current background traffic, without impacting foreground traffic, thus largely eliminating incast problems, and delivers the same or better throughput than TCP, while using 90% less buffer space.
Safe and effective fine-grained TCP retransmissions for datacenter communication
TLDR
This paper uses high-resolution timers to enable microsecond-granularity TCP timeouts and shows that eliminating the minimum retransmission timeout bound is safe for all environments, including the wide-area.
Improving datacenter performance and robustness with multipath TCP
TLDR
This work proposes using Multipath TCP as a replacement for TCP in large-scale data centers, as it can effectively and seamlessly use available bandwidth, giving improved throughput and better fairness on many topologies.
VL2: a scalable and flexible data center network
TLDR
VL2 is a practical network architecture that scales to support huge data centers with uniform high capacity between servers, performance isolation between services, and Ethernet layer-2 semantics, and is built on a working prototype.
pHost: distributed near-optimal datacenter transport over commodity network fabric
The importance of minimizing flow completion times (FCT) in datacenters has led to a growing literature on new network transport designs. Of particular note is pFabric, a protocol that achieves
Less Is More: Trading a Little Bandwidth for Ultra-Low Latency in the Data Center
TLDR
The HULL (High-bandwidth Ultra-Low Latency) architecture is presented to balance two seemingly contradictory goals: near baseline fabric latency and high bandwidth utilization. Results show that by sacrificing a small amount of bandwidth, HULL can dramatically reduce average and tail latencies in the data center.