kb0000
I can't tell if this is a typo or not.

The variable df.spam is not defined (i.e., df seems to be undefined in this section on spam detection). If df.spam is the same as sms.spam, then the code blocks in 4.5.6 and at the end of 4.8.1 seem to be identical. Are those sections supposed to be different? It wasn't clear where the LSA preprocessing for 4.8.1 comes from, since both blocks use the PCA vector variables.

The code seems to switch from sms.spam to df.spam in 4.6.3, back to sms.spam in 4.6.4, and then to df.spam again in 4.8.1. I might have missed something important, though; that is a long span of pages.

Whichever it is, I would really appreciate having a single section (rather than 2-3 spread far apart) that walks through everything from LDA overfitting to using LSA/LDiA to improve generalization.
hobs
Thank you! That is valid feedback. I will fix those typos and organize those sections better.