Scaling big data mining infrastructure: the Twitter experience

Authors: Jimmy J. Lin and Dmitriy V. Ryaboy
Published in: SIGKDD Explorations
The analytics platform at Twitter has experienced tremendous growth over the past few years in terms of size, complexity, number of users, and variety of use cases. In this paper, we discuss the evolution of our infrastructure and the development of capabilities for data mining on "big data". One important lesson is that successful big data mining in practice is about much more than what most academics would consider data mining: life "in the trenches" is occupied by much preparatory work that…



