A survey of open source tools for machine learning with big data in the Hadoop ecosystem

Sara Landset, Taghi M. Khoshgoftaar, Aaron N. Richter, and Tawfiq Hasanin
Journal of Big Data
With an ever-increasing number of options, selecting machine learning tools for big data can be difficult. The available tools each have advantages and drawbacks, and many have overlapping uses. The world's data is growing rapidly, and traditional machine learning tools are becoming insufficient as we move towards distributed and real-time processing. This paper is intended to aid the researcher or professional who understands machine learning but is inexperienced with big data. In…
Highly Influential
This paper has highly influenced 10 other papers.
Highly Cited
This paper has 100 citations.
Recent Discussions
This paper has been referenced on Twitter 34 times over the past 90 days.



