Jacek Laskowski (37)
#1
Hi,

While reviewing the KafkaStreams class, I noticed the num.stream.threads configuration property, which defaults to 1. If I'm not mistaken, that means only one thread handles the topology.

When would it make sense to increase the number of stream processor threads? What are the drawbacks (the ones that made the default 1 rather than 2 or more)?

Jacek
Bill Bejeck (45)
#2
Hi Jacek,

Another good question. The default is one thread because it's not known in advance what a user's topology will look like. You are correct that setting num.stream.threads to 1 means only one stream thread will handle the topology.

Depending on what kind of development work you're doing on your local machine, one thread may be sufficient. The critical point to keep in mind is that Kafka Streams creates StreamTasks based on the number of partitions.

The total number of tasks created is the maximum number of partitions across all input topics. For example, if you have a topic with ten partitions, you'll end up with ten tasks. However, if you have three source topics (A, B, and C) with partition counts A=2, B=5, C=3, you would have five tasks, since max(2, 5, 3) = 5.
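
As a rough illustration of that rule, here is a minimal Java sketch. The TaskCount class and its taskCount helper are hypothetical names, not part of the Kafka Streams API; the sketch only mirrors the max-over-partitions arithmetic described above and uses plain JDK collections.

```java
import java.util.Collections;
import java.util.Map;

public class TaskCount {
    // Hypothetical helper (not a Kafka Streams API): the task count is taken
    // to be the largest partition count among the input topics.
    public static int taskCount(Map<String, Integer> partitionsByTopic) {
        if (partitionsByTopic.isEmpty()) {
            return 0; // no input topics, so no tasks
        }
        return Collections.max(partitionsByTopic.values());
    }

    public static void main(String[] args) {
        // The example from the post: A=2, B=5, C=3
        Map<String, Integer> partitions = Map.of("A", 2, "B", 5, "C", 3);
        System.out.println(taskCount(partitions)); // prints 5
    }
}
```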

Once you figure out the number of tasks, you'll want to set the number of threads t so that (t >= 1 && t <= total number of tasks). If you set the number of threads above the number of tasks, the extra threads will be idle.
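
A minimal sketch of that bound, assuming you already know the total task count. The StreamThreads class and its methods are hypothetical names used only for illustration; the one real name is the num.stream.threads property key from the question. The value is stored as a string in a plain java.util.Properties, which you would then pass to your streams configuration along with the other required settings.

```java
import java.util.Properties;

public class StreamThreads {
    // Hypothetical helper: keep the thread count within 1..totalTasks,
    // since threads beyond the task count would have nothing to do.
    public static int clampThreads(int requested, int totalTasks) {
        return Math.max(1, Math.min(requested, totalTasks));
    }

    // Builds a Properties object carrying only the thread setting.
    // A real application needs other settings as well.
    public static Properties withThreads(int requested, int totalTasks) {
        Properties props = new Properties();
        props.setProperty("num.stream.threads",
                String.valueOf(clampThreads(requested, totalTasks)));
        return props;
    }

    public static void main(String[] args) {
        // Asking for 8 threads with only 5 tasks is capped at 5.
        Properties props = withThreads(8, 5);
        System.out.println(props.getProperty("num.stream.threads")); // prints 5
    }
}
```

Capping in code rather than by hand is only a convenience; the point is that the useful range is bounded by the task count.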

HTH,

Bill