First of all: kudos on this book! For the first time, a read is actually increasing my understanding step by step.
Maybe not enough though, as I do not yet understand why the batch_size parameter changes from 1 to 16 in the final model (listing 3.62, as compared to, e.g., listing 3.58). Could you please add an explanatory sentence in the final version of the book? Or did I miss it somewhere?