Apache Spark

Known as: Resilient Distributed Datasets, Resilient Distributed Dataset, Spark (cluster computing framework) 
Apache Spark is an open source cluster computing framework. Originally developed at the University of California, Berkeley's AMPLab, the Spark… 
Wikipedia

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
2017
With the spreading prevalence of Big Data, many advances have recently been made in this field. Frameworks such as Apache Hadoop… 
Highly Cited
2016
A huge amount of digital data containing useful information, called Big Data, is generated every day. To mine such useful… 
Highly Cited
2016
Non-technical losses (NTL) such as electricity theft cause significant harm to our economies, as in some countries they may range… 
Review
2015
Apache Spark is an open-source cluster computing framework for big data processing. It has emerged as the next generation big… 
Highly Cited
2015
In the context of drug discovery, a key problem is the identification of candidate molecules that affect proteins associated with… 
2011
Dense fine-grained PbTe bulk materials without oxide phases are fabricated using a process that combines cryomilling (mechanical…