Corpus ID: 18534081

Kafka: a Distributed Messaging System for Log Processing

@inproceedings{Kreps2011KafkaA,
  title={Kafka: a Distributed Messaging System for Log Processing},
  author={Jay Kreps},
  year={2011}
}
  • Jay Kreps
  • Published 2011
  • Computer Science
  • Log processing has become a critical component of the data pipeline for consumer internet companies. We introduce Kafka, a distributed messaging system that we developed for collecting and delivering high volumes of log data with low latency. Our system incorporates ideas from existing log aggregators and messaging systems, and is suitable for both offline and online message consumption. We made quite a few unconventional yet practical design choices in Kafka to make our system efficient and…


    A study on Modern Messaging Systems- Kafka, RabbitMQ and NATS Streaming
    Data Ingestion for the Connected World
    DZMQ: A Decentralized Distributed Messaging System for Realtime Web Applications and Services
    DistributedLog: A High Performance Replicated Log Service
    Kafka and Its Using in High-throughput and Reliable Message Distribution
    Kafka, Samza and the Unix Philosophy of Distributed Data
    Performance Prediction for the Apache Kafka Messaging System
