Yevhenii Kurtov (6)
#1
Hello, Saša

In section `8.3.5 Restart frequency` you note that a supervisor terminates itself when the maximum restart frequency is exceeded:

It’s important to keep in mind that a supervisor won’t restart a child process forever. The supervisor relies on the maximum restart frequency, which defines how many restarts are allowed in a given time period. By default, the maximum restart frequency is five restarts in five seconds. If this frequency is exceeded, the supervisor gives up and terminates itself.


Does that apply to GenServers that shut themselves down by returning {:stop, reason, reply, new_state}?
What if :normal or :shutdown is specified as the stop reason?
Will it happen if those GenServers are started with the temporary restart option?

The use case I have in mind is the following: a bunch of workers export documents, and if a worker fails to do its job after N attempts (mostly due to network errors), it should exit in a way that gets logged but doesn't affect the parent supervisor.
sjuric (109)
#2
Maximum restart frequency is the number of restarts per time interval after which the supervisor terminates all of its children, and then itself. This can be tuned through the max_restarts and max_seconds options to supervise/2.
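
As a rough sketch of how these two options are passed, here is a minimal example. It assumes a recent Elixir version, where the same options go to Supervisor.start_link/2 instead of supervise/2. The Demo.Worker module and the chosen values (3 restarts in 5 seconds) are made up for illustration.

```elixir
# Hypothetical worker, used only to have something to supervise.
defmodule Demo.Worker do
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg), do: {:ok, arg}
end

# Allow at most 3 restarts within any 5-second window. If children are
# restarted more often than that, the supervisor terminates its children
# and then itself.
{:ok, sup} =
  Supervisor.start_link(
    [{Demo.Worker, :ok}],
    strategy: :one_for_one,
    max_restarts: 3,
    max_seconds: 5
  )

counts = Supervisor.count_children(sup)
```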

Whether a child process (e.g. GenServer) is going to be restarted after a termination depends on the restart option given to worker/supervisor functions. If this is not specified, the default value is permanent, meaning the child is restarted regardless of the exit reason (even normal or shutdown).
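
To illustrate the difference, here is a small sketch that stops a permanent and a temporary worker with reason :normal. It assumes the current Supervisor.child_spec/2 API rather than the worker/supervisor helper functions. Demo.Stopper and the registered names are hypothetical. Since a temporary child is never restarted, its exit does not count toward the restart frequency.

```elixir
# Hypothetical GenServer that stops itself with reason :normal on request.
defmodule Demo.Stopper do
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, :ok, name: name)

  @impl true
  def init(:ok), do: {:ok, :no_state}

  @impl true
  def handle_call(:stop, _from, state), do: {:stop, :normal, :ok, state}
end

children = [
  Supervisor.child_spec({Demo.Stopper, :permanent_worker},
    id: :permanent_worker,
    restart: :permanent
  ),
  Supervisor.child_spec({Demo.Stopper, :temporary_worker},
    id: :temporary_worker,
    restart: :temporary
  )
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

# Both workers stop themselves with a :normal reason.
:ok = GenServer.call(:permanent_worker, :stop)
:ok = GenServer.call(:temporary_worker, :stop)

# Crude wait so the supervisor has time to process the exits.
Process.sleep(100)

permanent_after = Process.whereis(:permanent_worker)
temporary_after = Process.whereis(:temporary_worker)
```

After the sleep, permanent_after should hold the pid of a freshly restarted process, while temporary_after should be nil.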

If you explicitly want to stop a permanent worker without restarting it, you could use Supervisor.terminate_child. However, I can't recall a single time I used that function in practice.
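
For completeness, a minimal sketch of that call, assuming a recent Elixir version. Demo.Idle is a hypothetical worker, and its child id defaults to the module name.

```elixir
# Hypothetical permanent worker that does nothing.
defmodule Demo.Idle do
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg), do: {:ok, arg}
end

{:ok, sup} = Supervisor.start_link([{Demo.Idle, :ok}], strategy: :one_for_one)

# Stops the child but keeps its spec, so the supervisor does not restart it.
:ok = Supervisor.terminate_child(sup, Demo.Idle)
counts_after_stop = Supervisor.count_children(sup)

# The child can be brought back explicitly later on.
{:ok, _pid} = Supervisor.restart_child(sup, Demo.Idle)
counts_after_restart = Supervisor.count_children(sup)
```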

For your use case, I'd recommend using temporary workers with explicit error control. Retrying with supervisors is meant to help with unexpected bugs. A network failure is a very expected situation, so IMO it should be handled explicitly. In your GenServer you could issue a network call (preferably an asynchronous one). Then, if the call fails or times out, you can retry with a (possibly growing) delay. Once you've reached the maximum number of attempts, the worker can just stop itself by returning a :stop tuple. Using a non-normal reason will also ensure the termination is logged as an error.
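
A minimal sketch of such a worker could look like this. Everything here is illustrative: Demo.ExportWorker, the attempt limit, the delays, and the export_fun argument (a zero-arity function standing in for the real network call) are assumptions. For brevity the call is made synchronously inside handle_info/2, not asynchronously.

```elixir
defmodule Demo.ExportWorker do
  # :temporary means the supervisor never restarts this worker.
  use GenServer, restart: :temporary

  @max_attempts 3
  @base_delay_ms 50

  # export_fun is a hypothetical stand-in for the network call;
  # it must return :ok or {:error, reason}.
  def start_link(export_fun), do: GenServer.start_link(__MODULE__, export_fun)

  @impl true
  def init(export_fun) do
    send(self(), :attempt)
    {:ok, %{export_fun: export_fun, attempt: 1}}
  end

  @impl true
  def handle_info(:attempt, %{export_fun: export_fun, attempt: attempt} = state) do
    case export_fun.() do
      :ok ->
        # Work done, stop quietly.
        {:stop, :normal, state}

      {:error, _reason} when attempt < @max_attempts ->
        # Retry later; the delay grows with each attempt.
        Process.send_after(self(), :attempt, @base_delay_ms * attempt)
        {:noreply, %{state | attempt: attempt + 1}}

      {:error, reason} ->
        # Out of attempts: a non-normal reason gets logged as an error.
        {:stop, {:export_failed, reason}, state}
    end
  end
end

{:ok, sup} = Supervisor.start_link([], strategy: :one_for_one)

# This fake export always fails, so the worker gives up after 3 attempts.
{:ok, worker} =
  Supervisor.start_child(sup, {Demo.ExportWorker, fn -> {:error, :timeout} end})

ref = Process.monitor(worker)

exit_reason =
  receive do
    {:DOWN, ^ref, :process, ^worker, reason} -> reason
  after
    2_000 -> :no_exit_observed
  end
```

The worker exits with {:export_failed, :timeout}, which is logged as an error, while the supervisor keeps running because the child is temporary.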