what is a consumer group?
A consumer group is basically a set of one or more consumers working together in parallel to consume messages from topic partitions. Messages are equally divided among all the consumers pf a group, with no two consumer receiving the same message.
Kafka ensures that only a single consumer reads messages from any partition within a consumer group. In other words, topic partitions are a unit of parallelism – only one consumer can work on a partition in a consumer group at a time. If a consumer stops, Kafka spreads partitions across the remaining consumers in the same consumer group. Similarly, every time a consumer is added to or removed from a group, the consumption is rebalanced within the group.
Kafka stores the current offset per consumer group per topic per partition, as it would for a single consumer. This means that unique messages are only sent to a single consumer in a consumer group, and the load is balanced across consumers as equally as possible.
Here is a summary of how Kafka manages the distribution of partitions to consumers within a consumer group:
- Number of consumers in a group = number of partitions: each consumer consumes one partition.
- Number of consumers in a group > number of partitions: some consumers will be idle.
- Number of consumers in a group < number of partitions: some consumers will consume more partitions than others.