pietz (3)
I have been following the development of this book for a while and have been completely satisfied with the tone, pace, quality and choice of content. At this point I just wanted to highlight chapter 9 and its explanation of an LSTM cell. I've never come across such a detailed and well-explained piece of text regarding LSTMs. Great stuff!

I also wanted to express a request regarding additional content. In early March a new paper showed an empirical evaluation of RNNs vs CNNs for sequence modeling. They concluded CNNs should be regarded as a natural starting point for sequence modeling tasks. They also described an architecture referred to as a temporal convolutional network (TCN).


While this book shows how CNNs can be used for text classification, I'd be very interested in how one would implement a CNN where input and output have the same length in characters, as well as a seq2seq example.
425504 (2)
Hi Pietz,

Thanks for the feedback. The LSTM chapter was certainly a fun one to write.

I certainly agree with you and the paper that CNNs are a good entry point into NLP, if for no other reason than that the speed difference in training and inference times between CNNs and RNNs makes iteration so much faster.

We do have a chapter devoted to Seq2Seq and Attention Networks that should be available in MEAP now. Hopefully that will give some further insight into dealing with language as sequences. Unfortunately, we won't be able to cover Temporal Convolutional Networks in this edition of the book, but there is definitely interesting work being done there. It is interesting and exciting because the greatest struggle with LSTMs and GRUs is their computational cost.
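
On the question of a CNN whose output has the same length as its input: the usual trick is a causal, dilated 1-D convolution that pads only on the left. Below is a rough NumPy sketch of that idea, not the paper's implementation and not code from the book. The function names (`causal_conv1d`, `tiny_tcn`), the random weights, and the sizes are all made up for illustration, and it leaves out residual connections and training entirely.

```python
import numpy as np

def causal_conv1d(x, w, dilation=1):
    """Causal dilated 1-D convolution (illustrative helper, hypothetical name).

    x: (seq_len, in_channels)
    w: (kernel_size, in_channels, out_channels)
    Returns (seq_len, out_channels); output step t depends only on x[:t + 1].
    """
    kernel_size = w.shape[0]
    pad = (kernel_size - 1) * dilation
    # Left-pad with zeros so the output keeps the input length
    # and no output step can see future inputs.
    x_padded = np.concatenate([np.zeros((pad, x.shape[1])), x], axis=0)
    seq_len = x.shape[0]
    out = np.zeros((seq_len, w.shape[2]))
    for k in range(kernel_size):
        # Tap k looks back (kernel_size - 1 - k) * dilation steps.
        start = k * dilation
        out += x_padded[start:start + seq_len] @ w[k]
    return out

def tiny_tcn(x, weights):
    """Stack causal layers with dilations 1, 2, 4, ... and ReLU in between."""
    h = x
    for i, w in enumerate(weights):
        h = causal_conv1d(h, w, dilation=2 ** i)
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers only
    return h

# Assumed toy setup: 12 characters from a 5-symbol vocabulary, one-hot encoded.
rng = np.random.default_rng(0)
seq_len, vocab, hidden = 12, 5, 8
x = np.eye(vocab)[rng.integers(0, vocab, size=seq_len)]
weights = [
    rng.normal(size=(3, vocab, hidden)),
    rng.normal(size=(3, hidden, hidden)),
    rng.normal(size=(3, hidden, vocab)),
]
y = tiny_tcn(x, weights)
print(y.shape)  # (12, 5): one score vector per input character
```

Doubling the dilation at each layer lets the receptive field grow quickly with depth while every layer still returns one output per input step, which is what makes this shape convenient for character-level tagging or next-character prediction.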