361939 (1)
#1
Page 30: "The workers saved the data in their partition to the database. In our scenario, we had
four workers, this means four connections to the database when we saved the data. "

This is not correct. Connections to Postgres are opened per task (one task per partition), not per worker. The picture shows three workers, not four, but there are four partitions and therefore four tasks. So the four connections correspond to the four partitions, not to the workers.
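
To make the counting concrete, here is a minimal sketch in plain Python (no Spark involved). The function name `simulate_save` and the round-robin placement of partitions on workers are assumptions made for illustration only; Spark's scheduler decides the real placement. The point is simply that the number of connections follows the number of partitions, whatever the number of workers.

```python
from collections import Counter


def simulate_save(num_partitions, num_workers):
    """Return (total connections opened, connections opened per worker).

    Toy model: each partition is saved by one task, and each task opens
    its own database connection.
    """
    per_worker = Counter()
    for partition_id in range(num_partitions):
        # Hypothetical round-robin placement, for illustration only.
        worker_id = partition_id % num_workers
        # The task handling this partition opens one connection.
        per_worker[worker_id] += 1
    return sum(per_worker.values()), dict(per_worker)


total, per_worker = simulate_save(num_partitions=4, num_workers=3)
print(total)       # 4
print(per_worker)  # {0: 2, 1: 1, 2: 1}
```

With three workers and four partitions, one worker ends up running two tasks, and the database still sees four connections in total.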
Jean Georges Perrin (2)
#2
Sorry about the confusion. Let me check all that and get back here!
Jean Georges Perrin (2)
#3
I corrected that in the manuscript; the fix will be in MEAP v04. It now says:

The workers saved the data in their partition to the database. In this scenario, you had four partitions, which means four connections to the database when you saved the data. Imagine a similar scenario with 200k tasks, each first trying to connect to the database and then inserting data. A fine-tuned database server will refuse too many connections, which will require more control in the application. A solution to this load issue is addressed in chapters 17 and 19 through repartitioning and the options available when exporting to a database.


Does that look better to you?
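
For readers who want to see what the repartitioning idea could look like in practice, here is a rough sketch in Python. It is not taken from the book. The helper name `write_with_connection_cap` and its parameters are hypothetical; `df` is assumed to be a PySpark DataFrame, and the only Spark calls used are `df.rdd.getNumPartitions()`, `df.coalesce()`, and `df.write.mode(...).jdbc(...)`.

```python
def write_with_connection_cap(df, jdbc_url, table, max_connections, properties):
    """Append df to a JDBC table using at most max_connections partitions.

    df is assumed to be a pyspark.sql.DataFrame. Nothing is imported here
    because the function only calls methods on the object it receives.
    Returns the number of partitions (write tasks) actually used.
    """
    if max_connections < 1:
        raise ValueError("max_connections must be at least 1")

    current = df.rdd.getNumPartitions()
    if current > max_connections:
        # coalesce() merges partitions without a full shuffle. Each remaining
        # partition is written by one task, which opens one connection.
        df = df.coalesce(max_connections)

    df.write.mode("append").jdbc(jdbc_url, table, properties=properties)
    return min(current, max_connections)
```

A call might look like `write_with_connection_cap(df, "jdbc:postgresql://localhost/mydb", "my_table", 8, {"user": "...", "password": "..."})`, where the URL, table name, and credentials are placeholders. Spark's JDBC data source also has a `numPartitions` option that caps write parallelism in a similar way, so the explicit `coalesce()` is only one possible approach.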