john.mount (79)
#1
Some of the steps from chapters 4 and 6 of Practical Data Science with R are now wrapped into a preview package, vtreat (not yet on CRAN), available here: http://www.win-vector.com/blog/2014/08/vtreat-designing-a-package-for-variable-treatment/ . For teaching (i.e., in the book) you don't want these steps hidden inside a package. In practice, though, once you understand their limits you want to automate many of them, so you can spend more time on the important domain-specific feature engineering.
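For readers who want a concrete picture of the kind of step that gets automated, here is a minimal by-hand sketch of one common variable treatment: filling in missing numeric values and recording where they were missing. It is written in plain Python rather than R, it does not use vtreat, and it is not the package's or the book's implementation. The function name `treat_numeric` and the choice of the mean as the fill value are assumptions made for illustration.

```python
from statistics import mean


def treat_numeric(values):
    """Hypothetical helper: replace missing entries (None) with the mean of
    the observed entries and return an is-missing indicator alongside."""
    observed = [v for v in values if v is not None]
    # Assumption: fall back to 0.0 when nothing at all was observed.
    fill = mean(observed) if observed else 0.0
    cleaned = [fill if v is None else float(v) for v in values]
    was_missing = [1 if v is None else 0 for v in values]
    return cleaned, was_missing


cleaned, was_missing = treat_numeric([3.0, None, 5.0, None, 4.0])
print(cleaned)
print(was_missing)
```

Keeping the indicator column lets a downstream model still see that a value was absent, which a bare fill would hide. A step like this is easy to write once and tedious to repeat for every column, which is the case for automating it.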
john.mount (79)
#2
Re: Preview variable treatment package from Win-Vector LLC
Here is a knitr page with a more fully worked example of using the vtreat package to automate many of the steps from chapter 6 (though the by-hand calibration set remains the most important component): https://github.com/WinVector/vtreat/blob/master/notes/KDD2009example.md
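To make the role of held-out rows concrete, here is a minimal sketch of a single-variable treatment for a categorical column: estimate the positive-outcome rate per level on one set of rows, then score other rows with those rates. It is plain Python, not R and not vtreat, and it is not taken from the linked example. The function names, the toy data, and the smoothing toward the overall rate are all assumptions made for illustration.

```python
from collections import defaultdict


def fit_level_rates(levels, outcomes, smoothing=1.0):
    """Hypothetical helper: estimate P(outcome = 1 | level) from one set of
    rows, shrinking each level's rate toward the overall rate."""
    base_rate = sum(outcomes) / len(outcomes)
    counts = defaultdict(lambda: [0, 0])  # level -> [positives, rows]
    for level, y in zip(levels, outcomes):
        counts[level][0] += y
        counts[level][1] += 1
    rates = {
        level: (pos + smoothing * base_rate) / (n + smoothing)
        for level, (pos, n) in counts.items()
    }
    return rates, base_rate


def apply_level_rates(levels, rates, base_rate):
    """Score rows with the fitted rates; levels never seen during fitting
    fall back to the overall rate."""
    return [rates.get(level, base_rate) for level in levels]


# Toy rows used only to estimate the rates (not real data).
fit_levels = ["a", "a", "b", "b", "b", "c"]
fit_outcomes = [1, 0, 1, 1, 1, 0]
rates, base_rate = fit_level_rates(fit_levels, fit_outcomes)

# Different rows are scored with those rates; "d" was never seen above.
scores = apply_level_rates(["a", "b", "d"], rates, base_rate)
print(scores)
```

The point of the sketch is the separation between the rows that estimate the rates and the rows that are scored afterward. A level with few rows can look far better or worse than it is if it is judged on the same rows that produced its rate, and that separation is a decision made by hand rather than something a package can make for you.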