You’d be forgiven for passing by the announcement of Apache Spark 2.3. After all, it’s a point release, isn’t it? Sure, there will be some bug fixes, maybe an improvement or two to the MLlib framework, maybe an extra operator or something, but nothing all that major. That will be saved for Apache Spark 3.0, surely?
In fact, this is no mere point release. Apache Spark 2.3 ships with two major new features. One is perhaps the biggest (and most frequently requested) change to streaming operations since Spark Streaming was added to the project. The other is native Kubernetes integration, which lets Spark jobs run on container clusters.