An algebra for distributed Big Data analytics

@article{Fegaras2017AnAF,
  title={An algebra for distributed Big Data analytics},
  author={L. Fegaras},
  journal={Journal of Functional Programming},
  year={2017},
  volume={27}
}
  • L. Fegaras
  • Published 2017
  • Computer Science
  • Journal of Functional Programming
Abstract: We present an algebra for data-intensive scalable computing based on monoid homomorphisms. It consists of a small set of operations that capture most features supported by current domain-specific languages for data-centric distributed computing. This algebra is being used as the formal basis of MRQL, a query processing and optimization system for large-scale distributed data analysis. The MRQL semantics is given in terms of monoid comprehensions, which support group-by and…
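To make the phrase "monoid homomorphism" concrete, here is a minimal Python sketch. It is an illustration only, not MRQL syntax and not the operators defined in the paper. The names `hom`, `merge_counts`, and `word_count` are hypothetical. The point it shows is the general property that a homomorphism applied to a concatenation of partitions equals the monoid combination of the per-partition results, which is what allows each partition to be processed independently.

```python
from functools import reduce


def hom(zero, plus, unit, xs):
    """Fold a list into a target monoid (zero, plus).

    Maps the empty list to `zero`, a singleton [x] to `unit(x)`,
    and list concatenation to `plus`. Hypothetical helper, not an MRQL operator.
    """
    acc = zero
    for x in xs:
        acc = plus(acc, unit(x))
    return acc


def merge_counts(a, b):
    """Monoid operation on count dictionaries: key-wise addition.

    Returns a new dict, so the identity element {} is never mutated.
    """
    out = dict(a)
    for key, value in b.items():
        out[key] = out.get(key, 0) + value
    return out


def word_count(words):
    """A group-by-and-count expressed as a homomorphism into count dictionaries."""
    return hom({}, merge_counts, lambda w: {w: 1}, words)


# Assumed toy data: three partitions standing in for distributed chunks.
partitions = [["a", "b", "a"], ["b", "c"], []]
whole = [w for p in partitions for w in p]

# Evaluate each partition on its own, then combine the partial results.
partials = [word_count(p) for p in partitions]
combined = reduce(merge_counts, partials, {})

# The homomorphism property: both evaluation orders agree.
assert combined == word_count(whole)
```

Because `merge_counts` is associative and commutative with identity `{}`, the partial results can be combined in any order, which is the property a distributed runtime relies on when it reduces per-partition outputs.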
17 Citations
A Query Processing Framework for Large-Scale Scientific Data Analysis
  • L. Fegaras
  • Computer Science
  • Trans. Large Scale Data Knowl. Centered Syst.
  • 2018
  • 2
Compile-Time Query Optimization for Big Data Analytics
  • 3
Scalable Linear Algebra Programming for Big Data Analysis
Scalable Querying of Nested Data
  • 2
Modeling Big Data Processing Programs
  • Highly Influenced
Translation of array-based loops to distributed data-parallel programs
  • 2
Compile-Time Code Generation for Embedded Data-Intensive Query Languages
  • 5
The optics of language-integrated query
Execution Strategies for Compute Intensive Queries in Particle Physics
  • Highly Influenced
Charting the Design Space of Query Execution using VOILA
