svencowart
I am unsure how to separate the concept of a StreamTask from that of a partition. The text says: “State Stores are Assigned Per Task -
The statement above could be interpreted to mean that each partition has its own state store, but that is not the case. Partitions are assigned to a StreamTask and each StreamTask has its own state store.”

What defines a StreamTask? At what point are there multiple copies of a stream topology running via multiple copies of the same StreamTask? As I understand it, if I run a StreamTask, there is only one instance of it, so why would the data need to be repartitioned? The only way I can make sense of the excerpt above is if each broker has its own copy of a StreamTask; in that case I can see why repartitioning would be necessary.

I apologize if my question seems silly; I am still fairly new to Kafka and Kafka Streams. I just find the excerpt about a StateStore important but not properly explained.