Skip to content
Charlie edited this page Sep 30, 2021 · 7 revisions

Fundamentals

(With an emphasis on fun...and another emphasis on mental)

Consumer Groups

  • Each consumer in a group can be assigned [0 -> p] partitions

  • Each partition can only be assigned to a single consumer at a time

  • partition -> consumer assignment is configurable

  • Provides automatic load balancing of partitions and consumers in a group

Balancing

  • Rebalance: the process of assigning partitions to consumers in a group

  • Occurs when a new consumer is added to a group, a consumer is shut down, etc.

  • Small window of time when all consumers are unable to consume

  • Consumers send heartbeats to a particular broker designated as the group coordinator

  • Heartbeats are sent during polling, commit, and sometimes as a separate operation

  • session.timeout.ms - Amount of time a consumer can be out of contact with the brokers while still considered alive

  • heartbeat.interval.ms - How frequently the client / consumer will send a heartbeat to the group coordinator

  • heartbeat is usually set to 1/3 the session timeout

Delivery Guarantees

Idempotence

When enabled, same message sent by a producer will only be written to the log a single time

Transactions

Atomic writes across multiple partitions

Producers produce to topics using a new transactional API Consumers read from topics using an isolation level

How Each batch of messages contain a sequence number which the broker uses to dedupe

Useful Commands

Produce Messages with Keys

./kafka-console-producer.sh --broker-list localhost:9092 \
  --topic my_topic \
  --property "parse.key=true" --property "key.separator=:" < input.txt

Get Earliest Offset in Topic

./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic my_topic --time -2

Additional Reading

Clone this wiki locally