A Practical Dynamic Buffer Overflow Detector
tl;dr
We present CRED, a practical dynamic buffer overflow detector for C programs.
  • 350
  • 31
Accelerating Deep Convolutional Neural Networks Using Specialized Hardware
tl;dr
General-purpose computing on GPUs (GPGPU), field-programmable gate arrays (FPGAs), and application-specific integrated circuits (ASICs).
  • 264
  • 19
Flexible Hardware Acceleration for Instruction-Grain Program Monitoring
tl;dr
In this paper, we propose a flexible hardware solution for accelerating a wide range of instruction-grain program monitoring tools.
  • 136
  • 18
Parallelizing dynamic information flow tracking
tl;dr
We present a parallel algorithm for relaxed DIFT, based on symbolic inheritance tracking, which achieves linear speed-up asymptotically and reduces the overhead to as low as 1.2X using 9 monitoring cores.
  • 66
  • 9
Ditto: a system for opportunistic caching in multi-hop wireless networks
tl;dr
This paper presents the design, implementation, and evaluation of Ditto, a system that opportunistically caches overheard data to improve subsequent transfer throughput in wireless mesh networks.
  • 61
  • 7
Performance Modeling and Scalability Optimization of Distributed Deep Learning Systems
tl;dr
This paper develops performance models that quantify the impact of partitioning and provisioning decisions on overall distributed system performance and scalability.
  • 43
  • 5
HyperDrive: exploring hyperparameters with POP scheduling
tl;dr
We develop a scheduling algorithm, POP, that quickly distinguishes among promising, opportunistic, and poor hyperparameter configurations.
  • 20
  • 4
Page overlays: An enhanced virtual memory framework to enable fine-grained memory management
tl;dr
We propose a new virtual memory framework that enables efficient implementation of a variety of fine-grained memory management techniques, each of which has a wide variety of applications.
  • 40
  • 3
SERF: Efficient Scheduling for Fast Deep Neural Network Serving via Judicious Parallelism
tl;dr
We identify and model two important properties of DNN workloads: homogeneous request service demand, and interference among requests running concurrently due to cache/memory contention.
  • 19
  • 3
Efficient Deep Neural Network Serving: Fast and Furious
tl;dr
The emergence of deep neural networks (DNNs) as a state-of-the-art machine learning technique has enabled a variety of artificial intelligence applications for image recognition, speech recognition and translation, drug discovery, and machine vision.
  • 7
  • 3