Apache Spark

Known as: Resilient Distributed Datasets, Resilient Distributed Dataset, Spark (cluster computing framework) 
Apache Spark is an open source cluster computing framework. Originally developed at the University of California, Berkeley's AMPLab, the Spark… 
Wikipedia

Papers overview

Highly Cited
2016
Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning… 
Highly Cited
2016
This open source computing framework unifies streaming, batch, and interactive big data workloads to unlock new applications. 
Review
2016
Apache Spark has emerged as the de facto framework for big data analytics with its advanced in-memory programming model and upper… 
Review
2016
The proliferation of mobile devices, such as smartphones and Internet of Things gadgets, has resulted in the recent mobile big… 
Highly Cited
2015
Spark SQL is a new module in Apache Spark that integrates relational processing with Spark's functional programming API. Built on… 
Highly Cited
2015
The boom in technology has resulted in the emergence of new concepts and challenges. Big data is one of those often-spoken-about terms… 
Highly Cited
2015
  • Kewen Wang, M. Khan
  • IEEE 17th International Conference on High…
  • 2015
  • Corpus ID: 16465129
Apache Spark is an open source distributed data processing platform that uses distributed memory abstraction to process large… 
Highly Cited
2015
Data has long been the topic of fascination for Computer Science enthusiasts around the world, and has gained even more… 
Review
2015
Apache Spark is an open-source cluster computing framework for big data processing. It has emerged as the next generation big… 
Highly Cited
2010
MapReduce and its variants have been highly successful in implementing large-scale data-intensive applications on commodity…