Apache Spark

Known as: Resilient Distributed Dataset, Resilient Distributed Datasets, Spark 
Apache Spark is an open source cluster computing framework. Originally developed at the University of California, Berkeley's AMPLab, the Spark… (More)
Wikipedia

Topic mentions per year

Topic mentions per year

2004-2017
010020030020042017

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
2016
Highly Cited
2016
Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning… (More)
  • figure 1
  • figure 2
Is this relevant?
2016
2016
We describe matrix computations available in the cluster programming framework, Apache Spark. Out of the box, Spark provides… (More)
  • table 1
  • figure 1
  • figure 2
Is this relevant?
2016
2016
The number and the size of linked open data graphs keep growing at a fast pace and confronts semantic RDF services with problems… (More)
  • figure 1
  • figure 2
  • figure 3
  • figure 4
  • table 1
Is this relevant?
2016
2016
Recommending news articles is a challenging task due to the continuous changes in the set of available news articles and the… (More)
  • figure 1
  • figure 2
  • figure 3
  • figure 4
  • table 1
Is this relevant?
2016
2016
To maximize resource utilization and system throughput, hardware resources are often shared across multiple Apache Spark jobs… (More)
  • figure 1
  • figure 3
  • figure 2
  • table I
  • figure 5
Is this relevant?
Highly Cited
2015
Highly Cited
2015
Spark SQL is a new module in Apache Spark that integrates relational processing with Spark's functional programming API. Built on… (More)
  • figure 1
  • figure 2
  • figure 3
  • figure 4
  • figure 5
Is this relevant?
2015
2015
Apache Spark is an open source distributed data processing platform that uses distributed memory abstraction to process large… (More)
  • figure 1
  • figure 2
  • figure 3
  • figure 7
  • figure 8
Is this relevant?
2015
2015
We describe matrix computations available in the cluster programming framework, Apache Spark. Out of the box, Spark provides… (More)
Is this relevant?
2014
2014
A significant part of the data produced every day by online services is structured as a graph. Therefore, there is the need for… (More)
  • figure 1
  • figure 2
  • figure 3
  • figure 4
Is this relevant?
Highly Cited
2013
Highly Cited
2013
The initial design of Apache Hadoop [1] was tightly focused on running massive, MapReduce jobs to process a web crawl. For… (More)
  • figure 1
  • figure 2
  • figure 3
  • figure 4
  • table 1
Is this relevant?