
Petar Korudzhiev (2)
Hi Yan,

I am looking at the source code and see that you have the same event source trigger for both lambdas, 'notify-restaurant' and 'notify-user'.
I guess this is the reason for having a filter in the source code, e.g.: let orderPlaced = events.filter(r => r.eventType === 'order_placed');
Basically you have two consumers on the same data stream, and each Lambda is interested in a different set of records.
Is that the idea?
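To make the pattern concrete, here is a minimal sketch (not the book's actual code; the helper names and the eventType field are assumptions based on the snippet above) of how two Lambdas subscribed to the same Kinesis stream can each keep only the events they care about:

```javascript
// Kinesis delivers record payloads base64-encoded; decode them into domain events.
const parseEvents = (kinesisEvent) =>
  kinesisEvent.Records.map(r =>
    JSON.parse(Buffer.from(r.kinesis.data, 'base64').toString('utf8')));

// Each consumer filters the shared stream for its own event type.
const filterByType = (events, eventType) =>
  events.filter(e => e.eventType === eventType);

// notify-restaurant only acts on 'order_placed' events; notify-user would
// do the same with a different event type against the same stream.
const notifyRestaurantHandler = async (kinesisEvent) => {
  const orderPlaced = filterByType(parseEvents(kinesisEvent), 'order_placed');
  for (const evt of orderPlaced) {
    // ...notify the restaurant about evt here...
  }
  return orderPlaced.length;
};
```

Both functions receive every record in the batch; the filter is what gives each consumer its own view of the stream.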

I am curious to know your opinion: which is the better approach, one generic Kinesis stream used everywhere in your application, or separate streams for different purposes? Have you had this kind of dilemma in mind when you tried to implement something?
Yan Cui (73)
Hi Petar,

Yes, it's a constant struggle I have, and honestly, I go back-and-forth on this A LOT! The thing is, you have to balance between two opposing forces:

1) the limit on read throughput (5 reads per second per shard) can be problematic when you have more than a handful of subscribers, which pushes you towards having specialised streams.

2) to keep things simple for producers, to minimise cost (there's overhead per stream as you have to leave some headroom in throughput), to preserve ordering across multiple types of events for the same partition key (e.g. user id), and to make BI easier (which typically involves aggregating ALL the events into one place - e.g. Athena - to make them easy to query), you're pushed towards having one stream for everything.

The good news is that you can always bridge the gap, e.g. if you go with 1) then you can use Lambda to ship events from the specialised streams to a centralised stream; if you go with 2) then you can use something like Kinesis Analytics to create specialised streams as and when those use cases come up.
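The first bridge can be sketched roughly like this (a hypothetical fan-in Lambda, not code from the book; the publish function is injected so the sketch stays self-contained, and in a real deployment it would wrap kinesis.putRecords from the AWS SDK):

```javascript
// Map incoming Kinesis records to PutRecords entries, keeping the original
// partition key so per-key ordering survives in the central stream.
const toPutRecords = (kinesisEvent) =>
  kinesisEvent.Records.map(r => ({
    Data: Buffer.from(r.kinesis.data, 'base64'),
    PartitionKey: r.kinesis.partitionKey,
  }));

// publish: async ({ StreamName, Records }) => void. In production this would
// call new AWS.Kinesis().putRecords(params).promise(); here it is injected.
const makeBridgeHandler = (publish, centralStream) => async (kinesisEvent) => {
  const records = toPutRecords(kinesisEvent);
  if (records.length > 0) {
    await publish({ StreamName: centralStream, Records: records });
  }
  return records.length;
};
```

Because the Lambda simply re-publishes the raw payload with the same partition key, downstream consumers of the central stream see the same per-key ordering the specialised stream had.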

Another factor to consider is ownership, which is not a problem when you're a startup with a handful of engineers. In a bigger company, the centralised stream would need to have clear ownership and you need some governance around contract changes, etc. This is an overhead on the development teams and creates inter-team dependencies, but I think they're necessary to prevent unintended breaking changes from one team causing havoc for another team.

Sorry it's not a clear-cut answer, but hopefully I have given you enough context to make your own decision in your particular case.