Language-Directed Hardware Design for Network Performance Monitoring

@article{Narayana2017LanguageDirectedHD,
  title={Language-Directed Hardware Design for Network Performance Monitoring},
  author={Srinivas Narayana and Anirudh Sivaraman and Vikram Nathan and Prateesh Goyal and Venkat Arun and Mohammad Alizadeh and Vimalkumar Jeyakumar and Changhoon Kim},
  journal={Proceedings of the Conference of the ACM Special Interest Group on Data Communication},
  year={2017}
}
Network performance monitoring today is restricted by existing switch support for measurement, forcing operators to rely heavily on endpoints with poor visibility into the network core. Switch vendors have added progressively more monitoring features to switches, but the current trajectory of adding specific features is unsustainable given the ever-changing demands of network operators. Instead, we ask what switch hardware primitives are required to support an expressive language of network… 
Memory-Efficient Performance Monitoring on Programmable Switches with Lean Algorithms
TLDR
This work defines a new class of lean algorithms that use memory sublinear in both the size of input data and the number of flows and introduces lean algorithms for a set of important statistics, such as identifying flows with high latency, loss, out-of-order, or retransmitted packets.
Demonstration of the Marple System for Network Performance Monitoring
TLDR
Marple is a system that allows network operators to measure a wide variety of performance metrics in real time. It consists of a performance query language, also named Marple, modeled on familiar functional operators like map, filter, and groupby, and supported by a programmable key-value store on switches.
Distributed Network Monitoring and Debugging with SwitchPointer
TLDR
The key contribution of SwitchPointer is to efficiently provide network visibility by using switch memory as a “directory service” — each switch, rather than storing the data necessary for monitoring functionalities, stores pointers to end-hosts where relevant telemetry data is stored.
Designing Heavy-Hitter Detection Algorithms for Programmable Switches
TLDR
This work introduces PRECISION, an algorithm that uses Partial Recirculation to find top flows on a programmable switch and achieves higher accuracy than previous heavy hitter detection algorithms that avoid recirculation, and suggests two algorithms for the hierarchical heavy hitters detection problem.
Scalable, Network-Wide Telemetry with Programmable Switches
TLDR
This work presents Sonata, a flexible and scalable network telemetry system that uses the compute resources of both stream-processing servers and a single Protocol Independent Switch Architecture (PISA) switch, and Herd, a system for implementing a subset of Sonata queries distributed across several switches.
Mantis: Reactive Programmable Switches
TLDR
Mantis combines a language for specifying dynamic components of packet processing with an optimized, general, and safe control loop for implementing them. It provides a simple-to-reason-about set of abstractions for users, and the Mantis control plane can react to changes in the network in tens of μs.
Scaling Hardware Accelerated Network Monitoring to Concurrent and Dynamic Queries With *Flow
TLDR
This work introduces *Flow, a switch-accelerated telemetry system for efficient, concurrent, and dynamic measurement that carefully partitions processing between switch ASICs and application software.
PacketScope: Monitoring the Packet Lifecycle Inside a Switch
TLDR
PacketScope is a network telemetry system that lets network operators peek inside network switches to ask a suite of useful queries about how switches modify, drop, delay, and forward packets, and gives network operators an intuitive and powerful Spark-like dataflow language to express these queries.
Martini: Bridging the Gap between Network Measurement and Control Using Switching ASICs
TLDR
Evaluation results show that Martini can effectively support a wide range of fine-timescale management tasks such as microburst detection and fast load balancing by reducing the control loop from seconds to nanoseconds.
Dynamic Property Enforcement in Programmable Data Planes
TLDR
This paper developed P4box, a system for deploying runtime monitors in programmable data planes that allows programmers to easily express a broad range of properties and demonstrates that runtime monitors represent a small overhead to network devices in terms of latency and resource consumption.

References

SHOWING 1-10 OF 46 REFERENCES
Frenetic: a network programming language
TLDR
Frenetic provides a declarative query language for classifying and aggregating network traffic as well as a functional reactive combinator library for describing high-level packet-forwarding policies, which facilitates modular reasoning and enables code reuse.
Simplifying Datacenter Network Debugging with PathDump
TLDR
Evaluation results show that PathDump requires minimal switch and edge resources, while enabling network debugging at fine-grained time scales, and can support a surprisingly large class of network debugging problems.
Packet Transactions: High-Level Programming for Line-Rate Switches
TLDR
This paper introduces the notion of a packet transaction: a sequential packet-processing code block that is atomic and isolated from other such code blocks that can run at line rate on emerging programmable line-rate switching chips.
I Know What Your Packet Did Last Hop: Using Packet Histories to Troubleshoot Networks
TLDR
This paper built NetSight, an extensible platform that captures packet histories and enables applications to concisely and flexibly retrieve packet histories of interest and built four applications that illustrate its flexibility: an interactive network debugger, a live invariant monitor, a path-aware history logger, and a hierarchical network profiler.
Packet-Level Telemetry in Large Datacenter Networks
TLDR
This work presents Everflow, a packet-level network telemetry system for large DCNs, and presents experiments that demonstrate Everflow's scalability, and shares experiences of troubleshooting network faults gathered from running it for over 6 months in Microsoft's DCNs.
Toward Predictable Performance in Software Packet-Processing Platforms
TLDR
This work presents a general-purpose packet-processing system that combines ease of programmability with predictable performance, while running a diverse set of applications and serving multiple clients with different needs, and constitutes the first evidence that, when designing software network equipment, flexibility and predictability are not mutually exclusive goals.
Trumpet: Timely and Precise Triggers in Data Centers
TLDR
This work proposes Trumpet, an event monitoring system that leverages CPU resources and end-host programmability to monitor every packet and report events at millisecond timescales, and allows operators to describe new network events such as detecting correlated bursts and loss.
Compiling Path Queries
TLDR
This paper presents a declarative query language for efficient path-based traffic monitoring that can enable "interactive debugging" (compiling multiple queries in a few seconds) while fitting rules comfortably in modern switch TCAMs and the automaton state into two bytes.
FlowRadar: A Better NetFlow for Data Centers
TLDR
The key idea of FlowRadar is to encode per-flow counters with a small memory and constant insertion time at switches, and then to leverage the computing power at the remote collector to perform network-wide decoding and analysis of the flow counters.
Network Monitoring as a Streaming Analytics Problem
TLDR
This paper shows with a simple example query involving DNS reflection attacks and traffic traces from one of the world's largest IXPs that Sonata can capture 95% of all traffic pertaining to the query, while reducing the overall data rate and the number of required counters by four orders of magnitude.