Sensor faults: Detection methods and prevalence in real-world datasets
This work explores and characterizes four qualitatively different classes of fault detection methods, and finds that time-series-analysis-based methods are more effective for detecting short-duration faults than long-duration ones, and incur more false positives than the other methods.
Data centers power reduction: A two time scale approach for delay tolerant workloads
In this work we focus on a stochastic-optimization-based approach to make distributed routing and server management decisions in the context of large-scale, geographically distributed data centers, …
Early prediction of software component reliability
This paper develops a software component reliability prediction framework by exploiting architectural models and associated analysis techniques, stochastic modeling approaches, and information sources available early in the development lifecycle to illustrate its utility as an early reliability prediction approach.
Improving QoS in BitTorrent-like VoD Systems
This paper considers a BitTorrent-like VoD system, focuses on how the lack of load balance affects performance and what steps can be taken to remedy it, and proposes several practical schemes aimed at addressing these questions.
On the Prevalence of Sensor Faults in Real-World Deployments
This work first explores and characterizes three qualitatively different classes of fault detection methods, which are qualitatively consistent in identifying sensor faults in real-world data sets, a first step towards automated on-line fault detection and classification.
Wide-area analytics with multiple resources
This work proposes Tetrium, a system for multi-resource allocation in geo-distributed clusters that jointly considers both compute and network resources for task placement and job scheduling and significantly reduces job response time, while incorporating several other performance goals with simple control knobs.
Scheduling jobs across geo-distributed datacenters
An extensive simulation study with realistic job traces shows that the proposed scheduling algorithms result in up to 50% improvement in average job completion time over approaches based on Shortest Remaining Processing Time (SRPT).
VideoEdge: Processing Camera Streams using Hierarchical Clusters
This work proposes VideoEdge, a system that introduces dominant demand to identify the best tradeoff between multiple resources and accuracy, and narrows the search space by identifying a "Pareto band" of promising configurations.
Striping doesn't scale: how to achieve scalability for continuous media servers with replication
A study of scalability characteristics of CM servers as a function of tradeoffs between striping and replication.
Stochastic Complement Analysis of Multi-Server Threshold Queues with Hysteresis
The authors solve a limited form of this multi-server threshold-based queueing system with hysteresis using Green’s function method with stochastic complementation, which is a more intuitive and more easily extensible method.