
627201
#1
In the batch gradient descent code examples from the repository, there is the following fragment:

    for k in range(batch_size):
        correct_cnt += int(np.argmax(layer_2[k:k+1]) == np.argmax(labels[batch_start+k:batch_start+k+1]))

        layer_2_delta = (labels[batch_start:batch_end] - layer_2) / batch_size
        layer_1_delta = layer_2_delta.dot(weights_1_2.T) * relu2deriv(layer_1)
        layer_1_delta *= dropout_mask

        weights_1_2 += alpha * layer_1.T.dot(layer_2_delta)
        weights_0_1 += alpha * layer_0.T.dot(layer_1_delta)

It seems to me that the code from the calculation of layer_2_delta onward does not actually depend on k, and therefore should be executed only once, outside the loop, i.e.
    for k in range(batch_size):
        correct_cnt += int(np.argmax(layer_2[k:k+1]) == np.argmax(labels[batch_start+k:batch_start+k+1]))

    layer_2_delta = (labels[batch_start:batch_end] - layer_2) / batch_size
    layer_1_delta = layer_2_delta.dot(weights_1_2.T) * relu2deriv(layer_1)
    layer_1_delta *= dropout_mask

    weights_1_2 += alpha * layer_1.T.dot(layer_2_delta)
    weights_0_1 += alpha * layer_0.T.dot(layer_1_delta)
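
For anyone who wants to check the fix in isolation, here is a minimal, self-contained sketch of one training pass with the corrected indentation. The toy data, layer sizes, and the relu/relu2deriv helpers are my own stand-ins, not the book's exact setup:

    import numpy as np

    np.random.seed(1)

    def relu(x):
        return (x > 0) * x

    def relu2deriv(x):
        return x > 0

    alpha, batch_size = 0.1, 8
    images = np.random.rand(64, 784)                    # stand-in for MNIST pixels
    labels = np.eye(10)[np.random.randint(0, 10, 64)]   # stand-in one-hot labels

    weights_0_1 = 0.2 * np.random.random((784, 40)) - 0.1
    weights_1_2 = 0.2 * np.random.random((40, 10)) - 0.1

    correct_cnt = 0
    for batch_start in range(0, len(images), batch_size):
        batch_end = batch_start + batch_size

        # forward pass over the whole batch at once
        layer_0 = images[batch_start:batch_end]
        layer_1 = relu(layer_0.dot(weights_0_1))
        dropout_mask = np.random.randint(2, size=layer_1.shape)
        layer_1 *= dropout_mask * 2
        layer_2 = layer_1.dot(weights_1_2)

        # per-example bookkeeping is the only thing that needs k
        for k in range(batch_size):
            correct_cnt += int(np.argmax(layer_2[k:k+1]) ==
                               np.argmax(labels[batch_start+k:batch_start+k+1]))

        # backprop and the weight updates run once per batch
        layer_2_delta = (labels[batch_start:batch_end] - layer_2) / batch_size
        layer_1_delta = layer_2_delta.dot(weights_1_2.T) * relu2deriv(layer_1)
        layer_1_delta *= dropout_mask

        weights_1_2 += alpha * layer_1.T.dot(layer_2_delta)
        weights_0_1 += alpha * layer_0.T.dot(layer_1_delta)

With the updates outside the inner loop, each batch does one set of matrix products instead of batch_size of them.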
613046
#2
Yes, absolutely! I noticed it too. It seemed suspicious to me because the neural net was training so slowly, even though Andrew stated that the first thing we would notice when running the code was that it runs much faster. Anyway, in chapter 9 we still get to train the neural net using batch gradient descent, and the code in that section is correct.
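
That slowdown makes sense: with the update indented inside the loop, the backprop matrix products run batch_size times per batch instead of once (and, since weights_1_2 changes between repetitions, layer_1_delta drifts from what the math intends). A rough, hypothetical way to measure just the repeated work, using stand-in shapes like those above:

    import timeit
    import numpy as np

    np.random.seed(1)
    batch_size = 100
    layer_0 = np.random.rand(batch_size, 784)
    layer_1 = np.random.rand(batch_size, 40)
    layer_2 = np.random.rand(batch_size, 10)
    labels = np.eye(10)[np.random.randint(0, 10, batch_size)]
    weights_1_2 = 0.2 * np.random.random((40, 10)) - 0.1

    def backprop_step():
        # one full delta computation, as in the corrected code
        layer_2_delta = (labels - layer_2) / batch_size
        layer_1_delta = layer_2_delta.dot(weights_1_2.T) * (layer_1 > 0)
        return layer_1.T.dot(layer_2_delta), layer_0.T.dot(layer_1_delta)

    once = timeit.timeit(backprop_step, number=100)
    per_k = timeit.timeit(lambda: [backprop_step() for _ in range(batch_size)], number=100)
    print(once, per_k)  # the inner-loop version pays roughly batch_size times more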