hhadi (1)
#1
I have a program with the following three processes:
Process #1 reads a file line by line
Process #2 is applied on each line to manipulate the content
Process #3 takes the content from Process #2 and posts it to a database.

To speed up the program, Process #3 uses Futures to post the content to the database asynchronously. This obviously improves performance, because the program does not have to wait for Process #3 to finish posting before it reads the next line.
Here are my questions:
1) Is it a potential drawback that, if Process #3 is slow and Processes #1 and #2 are both fast, the program will run out of memory because of the growing number of unprocessed pieces of content waiting for Process #3 to post them to the database?

2) One way to solve such a problem is to use the actor model, where requests are sent to the “mailbox” of Process #3, which may exist on a separate machine with its own memory, right?

3) How would reactive programming improve the program?
roland.kuhn (39)
#2
Re: Question to clarify my understanding of reactive programming ch3
Sorry for the delay; ScalaDays took up most of my time during the past weeks.

You are right that just spawning Futures is not going to solve the problem, and sending to an Actor instead is going to suffer from the same problem: if data is produced faster than it is consumed, then any system you design will eventually run out of memory. The solution is to react both to the availability of data (flowing from process 1 -> 2 -> 3) and to the availability of database bandwidth (signaled back from process 3 -> 2 -> 1). In other words, suspend reading from the file while more than X lines are still in flight towards the database.
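To make the "at most X lines in flight" idea concrete, here is a minimal sketch in Python rather than Scala. It is only one possible way to bound the pipeline, not the pattern from the book: a semaphore with X permits blocks the reader until a pending post completes. The names `transform`, `run_pipeline`, `post`, and `max_in_flight` are hypothetical, and `post` stands in for whatever call writes a record to the database.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def transform(line):
    # Process 2 (hypothetical stand-in): manipulate the content of one line.
    return line.strip().upper()


def run_pipeline(lines, post, max_in_flight=4, workers=8):
    """Read, transform, and post asynchronously with a bounded number in flight.

    `lines` is any iterable of strings (for example an open file) and
    `post` is a caller-supplied function that stores one record.
    Returns the number of records submitted.
    """
    # One permit per record that may be pending at the same time (the "X").
    slots = threading.BoundedSemaphore(max_in_flight)
    errors = []

    def on_done(future):
        # Runs once the post has finished; remember failures, free the slot.
        exc = future.exception()
        if exc is not None:
            errors.append(exc)
        slots.release()

    submitted = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for line in lines:                  # Process 1: read line by line
            record = transform(line)        # Process 2
            slots.acquire()                 # suspends reading while X posts are pending
            pool.submit(post, record).add_done_callback(on_done)  # Process 3
            submitted += 1
    # Leaving the `with` block waits for the remaining posts to finish.
    if errors:
        raise errors[0]
    return submitted
```

Passing an open file as `lines` keeps the reading lazy, so a slow `post` slows down the reader instead of growing a backlog in memory. Blocking the reader thread is the simplest way to propagate the signal backwards; a non-blocking design would carry the same demand signal without tying up a thread.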

We will cover patterns for this in the flow control chapter.