About this Book

Apache Spark is a general-purpose data processing framework, which means you can use it for all kinds of computing tasks. It also means that any book on Apache Spark needs to cover a lot of different topics. We’ve tried to describe all aspects of using Spark: from configuring runtime options and running standalone and interactive jobs, to writing batch, streaming, or machine learning applications. We’ve also tried to pick examples and example data sets that run on your personal computer, are easy to understand, and illustrate the concepts well.

We hope you’ll find this book and its examples useful for understanding how to use and run Spark, and that they will help you write production-ready Spark applications in the future.

Who should read this book

How this book is organized

About the code

Author Online