Kafka: a Distributed Messaging System for Log Processing

Author: Jay Kreps
Log processing has become a critical component of the data pipeline for consumer internet companies. We introduce Kafka, a distributed messaging system that we developed for collecting and delivering high volumes of log data with low latency. Our system incorporates ideas from existing log aggregators and messaging systems, and is suitable for both offline and online message consumption. We made quite a few unconventional yet practical design choices in Kafka to make our system efficient and…
This paper has 486 citations and has highly influenced 73 other papers.




Publications referenced by this paper.

Efficient data transfer through zero copy: https://www.ibm.com/developerworks/linux/library

Cloudera's Flume, https://github.com/cloudera

Facebook's Scribe, http://www.facebook.com/note.php?note_id=32008268919

apache.org/hdfs/
[10] http://hadoop.apache.org/zookeeper/
[11] http://www.slideshare.net/cloudera/hw09-hadoop-based-data-mining-platform-for-the-telecom-industry
