The next release of Apache Spark will be 2.0, marking a big milestone for the project. In this talk, I’ll cover how the community has grown to reach this point, and some of the major features in 2.0. The largest additions are performance improvements for Datasets, DataFrames and SQL through Project Tungsten, as well as a new Structured Streaming API that provides simpler and more powerful stream processing. I’ll also discuss a bit of what’s in the works for future versions.