We consider the problem of jointly allocating compute and network resources in a large Infrastructure-as-a-Service cloud. We formulate the problem of optimally allocating resources to virtual data centers (VDCs) for four well-known management objectives: balanced load, energy efficiency, fair allocation, and service differentiation. Then, we outline an…
While real-time service assurance is critical for emerging telecom cloud services, understanding and predicting performance metrics for such services is hard. In this paper, we pursue an approach based upon statistical learning whereby the behavior of the target system is learned from observations. We use methods that learn from device statistics and…
Many regions of the world do not have access to the Internet because they lack proper communication infrastructure. A Delay Tolerant Network (DTN) provides communication under challenging network conditions such as high communication delay and intermittent connectivity. DTN is a promising solution to the lack of connectivity in developing regions such…
We address the problem of resource allocation in a large-scale cloud environment, which we formalize as that of dynamically optimizing a cloud configuration for green computing objectives under CPU and memory constraints. We propose a generic gossip protocol for resource allocation, which can be instantiated for specific objectives. We develop an…
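As a rough illustration of what a gossip-style protocol looks like in general, the sketch below shows pairwise load averaging, a textbook gossip scheme. It is not the protocol proposed in the abstract above (whose details are truncated here), and it ignores the CPU and memory constraints mentioned there. The function names `gossip_round` and `run_gossip` and the example loads are hypothetical.

```python
import random


def gossip_round(loads, rng):
    """One round: each node contacts one random peer and the pair
    splits its combined load evenly. Requires at least two nodes."""
    n = len(loads)
    for i in range(n):
        # Pick a peer j != i uniformly at random.
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1
        avg = (loads[i] + loads[j]) / 2
        loads[i] = loads[j] = avg
    return loads


def run_gossip(loads, rounds, seed=0):
    """Run several rounds on a copy of the input (hypothetical helper)."""
    rng = random.Random(seed)
    loads = list(loads)
    for _ in range(rounds):
        gossip_round(loads, rng)
    return loads


if __name__ == "__main__":
    # Made-up starting state: one overloaded node, three idle ones.
    print(run_gossip([8.0, 0.0, 0.0, 0.0], rounds=20))
```

Each exchange uses only local information from two nodes and preserves the total load, which is why schemes of this kind scale to large systems without a central coordinator.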
We model and evaluate the performance of a distributed key-value storage system that is part of the Spotify backend. Spotify is an on-demand music streaming service that offers low-latency access to a library of over 16 million tracks and currently serves over 10 million users. We first present a simplified model of the Spotify storage architecture, in order…
Predicting the performance of cloud services is intrinsically hard. In this work, we pursue an approach based upon statistical learning, whereby the behaviour of a system is learned from observations. Specifically, our testbed implementation collects device statistics from a server cluster and uses a regression method that accurately predicts, in real time,…
In recent years, access to ICT has become not only an economic imperative for nations seeking progress, but also an essential asset for the development of societies as a whole. However, as of 2010, more than 90% of the population of Africa still had no access to the Internet. The Bytewalla project was started with the objective of narrowing the global…
Predicting performance metrics for cloud services is critical for real-time service assurance. We demonstrate a platform for estimating real-time service-level metrics. Statistical learning methods on device statistics are used to predict metrics for services running on these devices.
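The idea of predicting a service-level metric from device statistics can be sketched with a one-variable least-squares fit. This is a minimal illustration only: the abstract does not say which learning method or features the platform uses, and the feature (CPU utilization), the metric (video frame rate), the function names, and all numbers below are made up for the example.

```python
def fit_least_squares(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the service metric for one new device-statistic sample."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Synthetic training data (assumed, not measured):
    # server CPU utilization versus client-side frame rate.
    cpu_util = [0.1, 0.2, 0.3, 0.4]
    frame_rate = [28.0, 26.0, 24.0, 22.0]
    model = fit_least_squares(cpu_util, frame_rate)
    print(predict(model, 0.5))
```

A real deployment would use many device statistics per sample and would retrain as load conditions change; the single-feature fit only shows the shape of the approach.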
Detecting faults and SLA violations in a timely manner is critical for telecom providers, in order to avoid loss of business, revenue, and reputation. At the same time, predicting SLA violations for user services in telecom environments is difficult, due to time-varying user demands and infrastructure load conditions. In this paper, we propose a…
Service assurance for the telecom cloud is a challenging task that academia and industry continue to address. One promising approach is to use machine learning to predict service quality so that mitigation actions can be taken early. In previous work we have shown how to predict service-level metrics, such as frame rate for a video…