Learn More
The promise of cloud computing is to provide computing resources instantly whenever they are needed. The state-of-art virtual machine (VM) provisioning technology can provision a VM in tens of minutes. This latency is unacceptable for jobs that need to scale out during computation. To truly enable on-the-fly scaling, new VM needs to be ready in seconds upon(More)
The popularity of cloud service spurs the increasing demands of virtual resources to the service vendors. Along with the promising business opportunities, it also brings new technique challenges such as effective capacity planning and instant cloud resource provisioning. In this paper, we describe our research efforts on improving the service quality for(More)
The popularity of cloud service spurs the increasing demands of cloud resources to the cloud service providers. Along with the new business opportunities, the pay-as-you-go model drastically changes the usage pattern and brings technology challenges to effective capacity planning. In this paper, we propose a new method for cloud capacity planning with the(More)
Modern scientific databases and web databases maintain large and heterogeneous data. These real-world databases contain hundreds or even thousands of relations and attributes. Traditional predefined query forms are not able to satisfy various ad-hoc queries from users on those databases. This paper proposes DQF, a novel database query form interface, which(More)
The cold-start problem has attracted extensive attention among various online services that provide personalized recommendation. Many online vendors employ contextual bandit strategies to tackle the so-called <i>exploration/exploitation</i> dilemma rooted from the cold-start problem. However, due to high-dimensional user/item features and the underlying(More)
Monitoring cluster evolution in data streams is a major research topic in data streams mining. Previous clustering methods for evolving data streams focus on global clustering result. It may lose critical information about individual cluster. This paper introduces some basic movements of evolution of an individual cluster. Based on the measurement of the(More)
The advent of Big Data era drives data analysts from different domains to use data mining techniques for data analysis. However, performing data analysis in a specific domain is not trivial; it often requires complex task configuration, onerous integration of algorithms, and efficient execution in distributed environments.Few efforts have been paid on(More)
Event mining is a useful way to understand computer system behaviors. The focus of recent works on event mining has been shifted to event summarization from discovering frequent patterns. Event summarization seeks to provide a comprehensible explanation of the event sequence on certain aspects. Previous methods have several limitations such as ignoring(More)
Anomaly detection has always been a critical and challenging problem in many application areas such as industry, healthcare, environment and finance. This problem becomes more di cult in the Big Data era as the data scale increases dramatically and the type of anomalies gets more complicated. In time sensitive applications like real time monitoring, data(More)
Personalized recommendation services have gained increasing popularity and attention in recent years as most useful information can be accessed online in real-time. Most online recommender systems try to address the information needs of users by virtue of both user and content information. Despite extensive recent advances, the problem of personalized(More)