Live blogging from the Big Data DC Meetup #4: Chris Burroughs is presenting on Kafka.
Essentially, Kafka is a distributed publish-subscribe messaging system. It provides persistent messaging, that is, messages are stored on disk and survive restarts and shutdowns. It also uses O(1) disk structures, so performance stays constant-time even with many TB of stored messages. (That alone is fairly impressive.)
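To make the "persistent" and "O(1)" claims a bit more concrete, here is a toy sketch of one way to get both: an append-only file in which a message's ID is simply its byte offset. This is my own illustration, not Kafka's code; the `AppendOnlyLog` class, its method names, and the 4-byte length prefix are all assumptions made up for the example.

```python
import os
import struct
import tempfile


class AppendOnlyLog:
    """Toy append-only log (hypothetical, not Kafka's implementation).

    A message is addressed by the byte offset at which it starts, so a
    read is one seek plus one read, regardless of how large the file is.
    """

    HEADER = struct.Struct(">I")  # 4-byte big-endian length prefix (assumed format)

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # create the file if it does not exist yet

    def append(self, payload: bytes) -> int:
        """Write one message at the end of the file and return its offset."""
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)
            f.write(self.HEADER.pack(len(payload)))
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the bytes to disk before returning
        return offset

    def read(self, offset: int):
        """Return (payload, offset_of_next_message) for the message at offset."""
        with open(self.path, "rb") as f:
            f.seek(offset)
            header = f.read(self.HEADER.size)
            if len(header) < self.HEADER.size:
                raise IndexError("no message at offset %d" % offset)
            (length,) = self.HEADER.unpack(header)
            payload = f.read(length)
        return payload, offset + self.HEADER.size + length


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "topic.log")
    log = AppendOnlyLog(path)
    first = log.append(b"hello")
    log.append(b"world")

    # Simulate a restart: a fresh object over the same file still sees the data.
    reopened = AppendOnlyLog(path)
    payload, next_offset = reopened.read(first)
    print(payload, reopened.read(next_offset)[0])
```

A consumer in this sketch only has to remember the last offset it read; the cost of fetching the next message does not depend on how many messages are already stored.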
The main advantage of Kafka seems to be its high throughput: even on simple hardware, Kafka can support hundreds of thousands of messages per second.