I suggest that the examples in this book use the Spark 2.0+ API. I am not suggesting that the examples switch to DataFrames/Datasets; working with RDDs is fine. However, for things like initializing Spark, readers are going to be confused when most tutorials they see use SparkSession instead of SparkContext. Spark 2.0 has been out long enough that it's used in production, and I don't think a MEAP book should still be using the 1.6 API.