Analyzing Analytics

@inproceedings{Bordawekar2015AnalyzingA,
  title={Analyzing Analytics},
  author={Rajesh R. Bordawekar and Bob Blainey and Ruchir Puri},
  booktitle={Analyzing Analytics},
  year={2015}
}
Many organizations today are faced with the challenge of processing and distilling information from huge and growing collections of data. Such organizations are increasingly deploying sophisticated mathematical algorithms to model the behavior of their business processes, to discover correlations in the data, to predict trends, and ultimately to drive decisions to optimize their operations. These techniques are known collectively as analytics and draw upon multiple disciplines, including…

Data Processing and Analytics for Data-Centric Sciences
TLDR
This chapter describes the data analytics framework that has been designed and developed in the ENVRIplus project to be suitable for serving the needs of researchers in several domains including environmental sciences, open and extensible both with respect to the algorithms and methods it enables and the computing platforms it relies on to execute them.
Millipede: Die-Stacked Memory Optimizations for Big Data Machine Learning Analytics
TLDR
This paper proposes memory optimizations for a "sea of simple MIMD cores (SSMC)" PNM architecture, called Millipede, which (pre)fetches and operates on entire memory rows to exploit BMLAs' row-density and employs cross-corelet flow-control to prevent eviction.
Progressive Evaluation of Queries over Untagged Data
TLDR
This paper considers a scenario where tagging can be performed using several techniques that differ in cost and accuracy and develops a progressive approach to answering Select-Project-Join queries (with a restricted version of the join predicates) that enriches the right data to the right degree so as to maximize the quality of the query results.
Accelerating database workloads by software-hardware-system co-design
TLDR
This tutorial provides a concise system-level characterization of different types of data management technologies, namely, the relational and NoSQL databases and data stream management systems from the perspective of analytical workloads and discusses opportunities for accelerating key data management workloads using software and hardware approaches.
PIQUE: Progressive Integrated QUery Operator with Pay-As-You-Go Enrichment
TLDR
This paper explores a novel approach that supports progressive data enrichment during query processing in order to support interactive exploratory analysis and is based on integrating an operator, entitled PIQUE, to support a prioritized execution of the enrichment functions during query processing.
How Machine Learning is Changing e-Government
TLDR
Through the analysis, quite interesting findings have been identified, containing both benefits and barriers from the public sectors' perspective, pinpointing a wide adoption of Machine Learning approaches in the public sector.
The role of business analytics in supporting strategy processes: Opportunities and limitations
TLDR
Business analytics can provide important data-driven insights into strategy processes; its further integration with other traditional OR and strategy tools is recommended in order to support strategic decision-makers.
Cognitive Database: A Step towards Endowing Relational Databases with Artificial Intelligence Capabilities
TLDR
This work proposes Cognitive Databases, an approach for transparently enabling Artificial Intelligence (AI) capabilities in relational databases that exemplifies using AI functionality to endow relational databases with capabilities that were previously very hard to realize in practice.
Privacy-Aware personal Information Discovery model based on the cloud
TLDR
The proposed model for Privacy-Aware Information Discovery (PAID-M) provides privacy awareness by executing data analytics algorithms encapsulated with privacy preserving techniques and presents how it intends to address the privacy issue in the cloud deployment process by considering differences in privacy regulations and jurisdictions.
Clustering Undergraduate Computer Science Student Final Project Based on Frequent Itemset
TLDR
The purpose of this study is to apply an association rule mining method, namely the ECLAT algorithm, to find the most common term combinations and to group a collection of abstracts.

References

Analyzing Analytics: Part 1: A Survey of Business Analytics Models and Algorithms
TLDR
This survey paper and the accompanying research report identify some of the key techniques employed in analytics both to serve as an introduction for the non-specialist and to explore the opportunity for greater optimization for parallel computer architectures and systems software.
Enabling analysts in managed services for CRM analytics
TLDR
New areas that open up for KDD research in terms of 'time-to-insight' and repeatability for analysts are identified in the form of a managed service offering for CRM analytics.
Mining of Massive Datasets
TLDR
Determining relevant data is key to delivering value from massive amounts of data and big data is defined less by volume which is a constantly moving target than by its ever-increasing variety, velocity, variability and complexity.
Scaling up machine learning: parallel and distributed approaches
TLDR
This tutorial gives a broad view of modern approaches for scaling up machine learning and data mining methods on parallel/distributed platforms and provides an integrated overview of state-of-the-art platforms and algorithm choices.
Mining of Massive Datasets
TLDR
This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets, and explains the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing.
Bridging two worlds with RICE
TLDR
This work proposes an alternative data exchange mechanism with R, SQL-SHM, a shared memory-based data exchange to incorporate R's vertical data structure, and extends this approach to R-Op, introducing R scripts equivalent to native database operations like join or aggregation within the execution plans.
Bridging Two Worlds with RICE Integrating R into the SAP In-Memory Computing Engine
TLDR
This work proposes an alternative data exchange mechanism with R, SQL-SHM, a shared memory-based data exchange to incorporate R’s vertical data structure, and extends this approach to R-Op, introducing R scripts equivalent to native database operations like join or aggregation within the execution plans.
Competing on Analytics: The New Science of Winning
You have more information at hand about your business environment than ever before. But are you using it to "out-think" your rivals? If not, you may be missing out on a potent competitive tool. In
On the structural properties of massive telecom call graphs: findings and implications
TLDR
This paper uses the Call Detail Records of a mobile operator from four geographically disparate regions to construct call graphs, and introduces the Treasure-Hunt model to describe the shape of mobile call graphs.
Extracting insights from social media with large-scale matrix approximations
TLDR
A flexible new family of low-rank matrix approximation algorithms for modeling topics in a given corpus of documents (e.g., blog posts and tweets) is described, and distributed optimization algorithms for running these models in a Hadoop-enabled cluster environment are benchmarked.