285443 (3)
They'd make the code samples more useful, especially for the Spark section.
Jeff Smith (14)
Ah, good observation. That had been my intention, but I notice that I haven't pushed any working files to the GitHub repo. I'll put that back on my to-do list.
426683 (1)

Could you please share the test files for the code in Chapter 2? They would make the chapters more useful.

407209 (1)
Yes, I'd also find sample data for this chapter useful. Any chance you could supply some?
Jeff Smith (14)
The resources for Chapter 6 contain some LibSVM-formatted data, but if I remember correctly, Chapter 2 presumes CSV files. I see how sample data could be useful, and I'll make sure to put up some properly formatted example data in the GitHub repo.

If you want a solution right now, you could likely use this old example file from the Spark project, which is probably pretty close to what I was using when I wrote that chapter: https://github.com/apache/spark/blob/branch-1.0/mllib/data/sample_tree_data.csv
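
For anyone unsure how the CSV and LibSVM formats mentioned above differ, here is a rough sketch in plain Python (no Spark needed). It assumes each CSV row puts the label first and the numeric features after it; that layout is my assumption, so check it against the actual file. The function names and the sample row are made up for illustration and are not taken from the book or from Spark.

```python
def parse_csv_row(line):
    """Split one CSV row into (label, features), assuming the label comes first."""
    values = [float(v) for v in line.strip().split(",")]
    return values[0], values[1:]


def to_libsvm(label, features):
    """Render a row as LibSVM text: 'label index:value ...'.

    Indices are 1-based and zero-valued features are omitted,
    which is the usual sparse convention for this format.
    """
    pairs = [f"{i}:{v}" for i, v in enumerate(features, start=1) if v != 0.0]
    return " ".join([str(label)] + pairs)


# Hypothetical row for illustration only; not copied from the Spark sample file.
row = "1,17,-105,0,0.5"
label, features = parse_csv_row(row)
libsvm_line = to_libsvm(label, features)
print(libsvm_line)
```

The same dense row ends up shorter in LibSVM form because the zero feature is dropped, which is why the two kinds of sample file are not interchangeable without a conversion step like this.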