-
Notifications
You must be signed in to change notification settings - Fork 0
Kafka
(With an emphasis on fun...and another emphasis on mental)
-
Each consumer in a group can be assigned [0 -> p] partitions
-
Each partition can only be assigned to a single consumer at a time
-
partition -> consumer assignment is configurable
-
Provides automatic load balancing of partitions and consumers in a group
-
Rebalance: the process of assigning partitions to consumers in a group
-
Occurs when a new consumer is added to a group, a consumer is shut down, etc.
-
Small window of time when all consumers are unable to consume
-
Consumers send heartbeats to a particular broker designated as the group coordinator
-
Heartbeats are sent during polling, commit, and sometimes as a separate operation
-
session.timeout.ms
- Amount of time a consumer can be out of contact with the brokers while still considered alive -
heartbeat.interval.ms
- How frequently the client / consumer will send a heartbeat to the group coordinator -
heartbeat is usually set to 1/3 the session timeout
When enabled, same message sent by a producer will only be written to the log a single time
Atomic writes across multiple partitions
Producers produce to topics using a new transactional API Consumers read from topics using an isolation level
How Each batch of messages contain a sequence number which the broker uses to dedupe
./kafka-console-producer.sh --broker-list localhost:9092 \
--topic my_topic \
--property "parse.key=true" --property "key.separator=:" < input.txt
./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic my_topic --time -2