hhadi (1)
#1
I have a program with the following three processes:
Process #1 reads a file line by line.
Process #2 is applied to each line to manipulate its content.
Process #3 takes the content from Process #2 and posts it to a database.

To speed things up, Process #3 uses Futures to post the content to the database asynchronously. This should improve performance, because the program no longer has to wait for Process #3 to finish posting before it reads the next line.
Here are my questions:
1) Is it a potential drawback that, if Process #3 is slow while Processes #1 and #2 are fast, the program will run out of memory because of a growing backlog of unposted content waiting for Process #3 to post it to the database?

2) One way to solve such a problem is to use the actor model, where requests are sent to the “mailbox” of Process #3, which may exist on a separate machine with its own memory, right?

3) How would reactive programming improve the program?
roland.kuhn (39)
#2
Re: Question to clarify my understanding of reactive programming ch3
Sorry for the delay, ScalaDays took up most of my time during the past weeks.

You are right that just spawning Futures is not going to solve the problem, and sending to an Actor instead suffers from the same issue: its mailbox simply becomes the place where the backlog accumulates. If data is produced faster than it is consumed, then any system you design will eventually run out of memory. The solution is to react both to the availability of data (flowing from process 1 -> 2 -> 3) and to the availability of database bandwidth (signalled back from process 3 -> 2 -> 1). In other words, suspend reading from the file while more than X lines are still in flight towards the database.
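
As a rough illustration of "suspend reading while more than X lines are in flight", here is a minimal sketch in Python using the standard concurrent.futures module. It is not the pattern from the book: transform, post_to_database, run_pipeline and the value of MAX_IN_FLIGHT are all hypothetical names and numbers chosen for the example, and the "database" is just a list with an artificial delay. A semaphore holds X permits; the reader takes one before handing a line off and the posting task returns it when done, so the reader blocks as soon as X posts are pending.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 4  # the "X" from the text; arbitrary value for this sketch


def transform(line):
    # Stand-in for process 2 (hypothetical manipulation of a line).
    return line.strip().upper()


def post_to_database(record, sink):
    # Stand-in for process 3; the sleep simulates a slow database.
    time.sleep(0.01)
    sink.append(record)


def run_pipeline(lines, max_in_flight=MAX_IN_FLIGHT):
    sink = []                                   # pretend database
    slots = threading.Semaphore(max_in_flight)  # one permit per in-flight line
    lock = threading.Lock()
    in_flight = 0
    peak = 0                                    # highest in-flight count seen

    def post_and_release(record):
        nonlocal in_flight
        try:
            post_to_database(record, sink)
        finally:
            with lock:
                in_flight -= 1
            slots.release()  # signal upstream: capacity is available again

    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        for line in lines:   # process 1: read line by line
            slots.acquire()  # blocks while max_in_flight posts are pending
            with lock:
                in_flight += 1
                peak = max(peak, in_flight)
            pool.submit(post_and_release, transform(line))
    # Leaving the with-block waits for all submitted posts to finish.
    return sink, peak
```

Calling run_pipeline(open("input.txt")) (the file name is a placeholder) would read lazily from the file, so memory use is bounded by X lines rather than by the file size. Without the semaphore, the executor's internal queue would play the role of the unbounded mailbox and grow as fast as the reader can fill it.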

We will cover patterns for this in the flow control chapter.