author={Alberto Acerbi and Vasileios Lampos and R. Alexander Bentley},
  booktitle={Big Data 2013},

Multiple-Attribute Decision Making-based Optimization of Robustness against Cascading Failures in Interdependent network
  • Qi Wang
  • 2020 2nd International Conference on Information Technology and Computer Application (ITCA)
  • 2020
The existence of functional interaction between coupled layers in multiple systems makes interdependent networks more vulnerable to cascading failures. To mitigate cascade propagation, prior…
Autoencoding Binary Classifiers for Supervised Anomaly Detection
The proposed Autoencoding Binary Classifiers (ABC) is a probabilistic binary classifier that effectively exploits label information, models normal data points using an autoencoder (AE) as a component, and achieves higher detection performance than existing supervised and unsupervised methods.
Image Analysis and Infrastructure Support for Data Mining the Farm Security Administration: Office of War Information Photography Collection
The team is developing and utilizing existing algorithms and running them on Comet to analyze the Farm Security Administration - Office of War Information image corpus from 1935-1944, held by the Library of Congress and accessible online to the public.
A cloud software system for visualization of game-based learning data collected on mobile devices
A cloud software system (CSS) under the client-server architecture with an educational iPad game called Taffy Town, where players (students) and teachers can log in and view dynamically created visualizations of the collected learning data.
Analysis and optimization in smart manufacturing based on a reusable knowledge base for process performance models
An architectural design and software framework for fast development of descriptive, diagnostic, predictive, and prescriptive analytics solutions for dynamic production processes and an organization and key structure for the reusable KB, composed of atomic and composite process performance models and domain-specific dashboards are proposed.
Kernel Spectral Clustering and applications
This chapter reviews the main literature related to kernel spectral clustering (KSC), an approach to clustering cast within a kernel-based optimization setting, and shows how it is possible to handle large-scale data.
MRPrePost—A parallel algorithm adapted for mining big data
MRPrePost is a parallel algorithm based on the Hadoop platform that improves PrePost by adding a prefix pattern and, on this basis, incorporates parallel design ideas, so that MRPrePost can adapt to mining association rules from large data.
Understanding the YouTube partners and their data: Measurement and analysis
This paper makes effective use of Insight, a new analytics service of YouTube that offers simple data analysis for partners, to provide practical guidance from the raw Insight data and to enable more complex investigations of the inherent features that affect the popularity of videos.


4S: Scalable subspace search scheme overcoming traditional Apriori processing
A scalable subspace search scheme (4S) is proposed, which overcomes the efficiency problem by departing from the traditional levelwise search, along with a new generalized notion of correlated subspaces that allows the search space to be transformed into a correlation graph of dimensions.
A framework of spatial co-location mining on MapReduce
A framework for parallel co-location mining based on MapReduce is presented, along with a proposal to parallelize spatial co-location mining on distributed machines.
A parallel computing platform for training large scale neural networks
Artificial neural networks (ANNs) have been used successfully in a variety of pattern recognition and data mining applications. However, training ANNs on large-scale datasets is both…
ADraw: A novel social network visualization tool with attribute-based layout and coloring
A novel visualization tool with attribute-based layout and coloring is developed in this paper, designed on the principle of placing nodes with the same attribute values closer together in the diagram.
Agglomerative co-clustering for synonymous phrases based on common effects and influences
This paper proposes an approach to clustering synonymous noun phrases focusing on two types of predicate argument relations extracted from potentially big textual data using a parallel distributed programming model, MapReduce, to handle the large matrix.
Alarm prediction in large-scale sensor networks — A case study in railroad
This work collaborates with a US Class I railway company and applies advanced analytics techniques to predict alarms associated with catastrophic equipment failures several days ahead of time, and builds a customized SVM algorithm to meet the requirements.
Bibliometric-enhanced retrieval models for big scholarly information systems
This paper will explore how statistical modelling of scholarship, such as Bradfordizing or network analysis of coauthorship networks, can improve retrieval services for specific communities, as well as for large, cross-domain collections.
Complete storm identification algorithms from big raw rainfall data using MapReduce framework
This paper proposes a MapReduce-based overall storm identification algorithm, which greatly improves performance compared to the depth-first search (DFS) graph traversal approach introduced in previous work.
Computing betweenness centrality in external memory
This paper describes the first known external-memory and cache-oblivious algorithms for computing betweenness centrality, including general algorithms for networks with weighted and unweighted edges and a specialized algorithm for networks with small diameters, as is common in social networks exhibiting the “small worlds” phenomenon.
Data chaos: An entropy based MapReduce framework for scalable learning
This paper proposes an entropy-based theoretical framework for machine learning, which states that chaos in sample data will decrease and rule will advance as learning progresses, and proposes a MapReduce-based distributed computational framework for scalable learning.