Analyzing Analytics
@inproceedings{Bordawekar2015AnalyzingA, title={Analyzing Analytics}, author={Rajesh R. Bordawekar and Bob Blainey and Ruchir Puri}, booktitle={Analyzing Analytics}, year={2015} }
Many organizations today are faced with the challenge of processing and distilling information from huge and growing collections of data. Such organizations are increasingly deploying sophisticated mathematical algorithms to model the behavior of their business processes to discover correlations in the data, to predict trends and ultimately drive decisions to optimize their operations. These techniques are known collectively as analytics and draw upon multiple disciplines, including…
20 Citations
Data Processing and Analytics for Data-Centric Sciences
- Computer Science
- Towards Interoperable Research Infrastructures for Environmental and Earth Sciences
- 2020
This chapter describes the data analytics framework that has been designed and developed in the ENVRIplus project to be suitable for serving the needs of researchers in several domains including environmental sciences, open and extensible both with respect to the algorithms and methods it enables and the computing platforms it relies on to execute them.
Millipede: Die-Stacked Memory Optimizations for Big Data Machine Learning Analytics
- Computer Science
- 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
- 2018
This paper proposes memory optimizations for a "sea of simple MIMD cores (SSMC)" PNM architecture, called Millipede, which (pre)fetches and operates on entire memory rows to exploit BMLAs' row-density and employs cross-corelet flow-control to prevent eviction.
Progressive Evaluation of Queries over Untagged Data
- Computer Science
- ArXiv
- 2018
This paper considers a scenario where tagging can be performed using several techniques that differ in cost and accuracy and develops a progressive approach to answering Select-Project-Join queries (with a restricted version of the join predicates) that enriches the right data to the right degree so as to maximize the quality of the query results.
Accelerating database workloads by software-hardware-system co-design
- Computer Science
- 2016 IEEE 32nd International Conference on Data Engineering (ICDE)
- 2016
This tutorial provides a concise system-level characterization of different types of data management technologies, namely, the relational and NoSQL databases and data stream management systems from the perspective of analytical workloads and discusses opportunities for accelerating key data management workloads using software and hardware approaches.
PIQUE: Progressive Integrated QUery Operator with Pay-As-You-Go Enrichment
- Computer Science
- 2018
This paper explores a novel approach that supports progressive data enrichment during query processing in order to support interactive exploratory analysis and is based on integrating an operator, entitled PIQUE, to support a prioritized execution of the enrichment functions during query processing.
How Machine Learning is Changing e-Government
- Computer Science
- ICEGOV
- 2019
Through the analysis, quite interesting findings have been identified, containing both benefits and barriers from the public sectors' perspective, pinpointing a wide adoption of Machine Learning approaches in the public sector.
The role of business analytics in supporting strategy processes: Opportunities and limitations
- Business
- J. Oper. Res. Soc.
- 2019
Business analytics can provide important data-driven insights into strategy processes; its further integration with other traditional OR and strategy tools is recommended in order to support strategic decision-makers.
Cognitive Database: A Step towards Endowing Relational Databases with Artificial Intelligence Capabilities
- Computer Science
- ArXiv
- 2017
This work proposes Cognitive Databases, an approach for transparently enabling Artificial Intelligence (AI) capabilities in relational databases that exemplifies using AI functionality to endow relational databases with capabilities that were previously very hard to realize in practice.
Privacy-Aware personal Information Discovery model based on the cloud
- Computer Science
- 2015 Latin American Network Operations and Management Symposium (LANOMS)
- 2015
The proposed model for Privacy-Aware Information Discovery (PAID-M) provides privacy awareness by executing data analytics algorithms encapsulated with privacy preserving techniques and presents how it intends to address the privacy issue in the cloud deployment process by considering differences in privacy regulations and jurisdictions.
Clustering Undergraduate Computer Science Student Final Project Based on Frequent Itemset
- Computer Science
- 2016
The purpose of this study is to apply an association rule mining method, namely the ECLAT algorithm, to find the most common term combinations and to group a collection of abstracts.
References
SHOWING 1-10 OF 51 REFERENCES
Analyzing Analytics: Part 1: A Survey of Business Analytics Models and Algorithms
- Computer Science
- 2011
This survey paper and the accompanying research report identify some of the key techniques employed in analytics, both to serve as an introduction for the non-specialist and to explore the opportunity for greater optimization for parallel computer architectures and systems software.
Enabling analysts in managed services for CRM analytics
- Computer Science
- KDD
- 2009
New areas that open up for KDD research in terms of 'time-to-insight' and repeatability for analysts are identified in the form of a managed service offering for CRM analytics.
Mining of Massive Datasets
- Computer Science
- 2014
Determining relevant data is key to delivering value from massive amounts of data, and big data is defined less by volume, which is a constantly moving target, than by its ever-increasing variety, velocity, variability and complexity.
Scaling up machine learning: parallel and distributed approaches
- Computer Science
- KDD '11 Tutorials
- 2011
This tutorial gives a broad view of modern approaches for scaling up machine learning and data mining methods on parallel/distributed platforms and provides an integrated overview of state-of-the-art platforms and algorithm choices.
Mining of Massive Datasets
- Computer Science
- 2011
This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets, and explains the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing.
Bridging two worlds with RICE
- Computer Science
- VLDB 2011
- 2011
This work proposes an alternative data exchange mechanism with R, SQL-SHM, a shared memory-based data exchange to incorporate R's vertical data structure, and extends this approach to R-Op, introducing R scripts equivalent to native database operations like join or aggregation within the execution plans.
Bridging Two Worlds with RICE Integrating R into the SAP In-Memory Computing Engine
- Computer Science
- Proc. VLDB Endow.
- 2011
This work proposes an alternative data exchange mechanism with R, SQL-SHM, a shared memory-based data exchange to incorporate R’s vertical data structure, and extends this approach to R-Op, introducing R scripts equivalent to native database operations like join or aggregation within the execution plans.
Competing on Analytics: The New Science of Winning
- Business
- 2007
You have more information at hand about your business environment than ever before. But are you using it to "out-think" your rivals? If not, you may be missing out on a potent competitive tool. In…
On the structural properties of massive telecom call graphs: findings and implications
- Computer Science
- CIKM '06
- 2006
This paper uses the Call Detail Records of a mobile operator from four geographically disparate regions to construct call graphs, and introduces the Treasure-Hunt model to describe the shape of mobile call graphs.
Extracting insights from social media with large-scale matrix approximations
- Computer Science
- IBM J. Res. Dev.
- 2011
This paper describes a flexible new family of low-rank matrix approximation algorithms for modeling topics in a given corpus of documents (e.g., blog posts and tweets) and benchmarks distributed optimization algorithms for running these models in a Hadoop-enabled cluster environment.