The Author Online Book Forums are Moving

The Author Online Book Forums will soon redirect to Manning's liveBook and liveVideo. All book forum content will migrate to liveBook's discussion forum and all video forum content will migrate to liveVideo. Log in to liveBook or liveVideo with your Manning credentials to join the discussion!

Thank you for your engagement in the AoF over the years! We look forward to offering you an enhanced forum experience.

ebert (13)
In chapter 3, there is a difference in the way the same training algorithm is executed in two scripts that I don't quite understand.

In listing 3.3, the code for running the algorithm is:

for epoch in range(training_epochs):
    for (x, y) in zip(trX, trY):
        sess.run(train_op, feed_dict={X: x, Y: y})

However, in listing 3.5, which applies regularization, a similar step (inside the reg_lambda loop) is coded as:

for reg_lambda in np.linspace(0, 1, 100):
    for epoch in range(training_epochs):
        sess.run(train_op, feed_dict={X: x_train, Y: y_train})

Why is it not necessary to loop through the (x, y) values in the second case?


I've finally found the reason for the difference in another tutorial. With gradient descent you can always decide how many points to include in each training step: one at a time, which is the case in the first script (stochastic gradient descent), or all of them at once, which is the case in the second one (batch gradient descent). It is also possible to proceed slice by slice (mini-batch gradient descent).

The approaches are not equivalent: each one makes a different trade-off between how quickly the model parameters reach a good value and how many computational resources are used.
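To make the difference concrete, here is a minimal sketch in plain Python (not the book's TensorFlow code; all names here are made up for illustration). The three variants run the exact same update rule on a toy linear model y = 2x; the only thing that changes is how many points feed each update, which is precisely the difference between listings 3.3 and 3.5:

```python
# Toy data: y = 2 * x, so the learned weight should approach 2.0.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]

def grad(w, batch_x, batch_y):
    """Gradient of the mean squared error 0.5 * (w*x - y)**2 over a batch."""
    n = len(batch_x)
    return sum((w * x - y) * x for x, y in zip(batch_x, batch_y)) / n

def train(batch_size, epochs=200, lr=0.005):
    """One gradient-descent update per slice of `batch_size` points.

    In practice the data would also be shuffled each epoch; that is
    omitted here to keep the sketch short.
    """
    w = 0.0
    for _ in range(epochs):
        for i in range(0, len(xs), batch_size):
            bx, by = xs[i:i + batch_size], ys[i:i + batch_size]
            w -= lr * grad(w, bx, by)
    return w

print(train(batch_size=1))        # stochastic: one point per update (listing 3.3)
print(train(batch_size=4))        # mini-batch: a slice per update
print(train(batch_size=len(xs)))  # batch: all points per update (listing 3.5)
```

All three converge to roughly the same weight on this noiseless toy problem; the trade-off shows up in how many gradient evaluations each variant performs per epoch and how noisy the individual updates are.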

I don't know if this topic will be covered in later chapters. If not, it might be worth adding a comment on it.
Nishant Shukla (52)
Hi ebert,

You bring up a great point that I did not have a chance to address. I think it would be a good idea to outline the differences more explicitly.
Sorry for the late response, but I hope you're finding your read informative!