48456 (1)

The chapters about logging infrastructure were really interesting (as were the other parts of the book).

I have one question that I don't see addressed in those chapters: are there any guidelines regarding the number of queues/topics needed to process the logs of an infrastructure?

Would it be better to put everything in one Kafka topic if the Kafka cluster can handle it, or are there other criteria for deciding whether some partitioning would be a better idea?

Any feedback would be really appreciated!

Julien Vehent (14)
It's a very Kafka-specific question, and to be honest I didn't want to get into the weeds of implementation for that chapter. Kafka is just one possible implementation of a logging pipeline, and depending on where your infrastructure runs, you may want to use other platforms (Kinesis in AWS, for example, is a great tool).

That said, as a general rule, I would suggest starting with the simplest possible setup, a single queue, then increasing the complexity of your configuration as needed. It's never a good idea to start with a large number of queues before having a clear picture of how you will consume them.
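
To make the "single queue first" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the queue is an in-memory stand-in rather than a real Kafka or Kinesis client, and the names `publish`, `consume`, and the `source` field are hypothetical, not taken from the book. The point is that with every event tagged by its origin, consumers can route on that tag while everything still flows through one queue.

```python
from collections import defaultdict, deque

# Hypothetical in-memory stand-in for a single log queue/topic.
# No real Kafka or Kinesis client is used here.
queue = deque()


def publish(source, message):
    """Append an event to the single shared queue, tagged with its source."""
    queue.append({"source": source, "message": message})


def consume(handlers):
    """Drain the queue, dispatching each event to the handler for its source.

    Events with no registered handler are collected and returned so they
    are not silently dropped.
    """
    unhandled = []
    while queue:
        event = queue.popleft()
        handler = handlers.get(event["source"])
        if handler is None:
            unhandled.append(event)
        else:
            handler(event)
    return unhandled


# Each consumer only cares about one source, yet all share one queue.
seen = defaultdict(list)
handlers = {
    "nginx": lambda e: seen["nginx"].append(e["message"]),
    "sshd": lambda e: seen["sshd"].append(e["message"]),
}

publish("nginx", "GET /index.html 200")
publish("sshd", "Accepted publickey for deploy")
publish("cron", "job finished")

leftover = consume(handlers)
```

Once you have a clear picture of how each consumer actually reads the data, moving one source to its own queue is mostly a matter of changing where `publish` sends it, so little is lost by starting simple.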