Some Things You Learn Running Apache Spark in Production for Three Years


April 4, 2017

Presented at Enterprise Data World (Atlanta, Georgia)

Apache Spark is one of the most exciting open-source data-processing frameworks today. It features a range of useful capabilities and an unusually developer-friendly programming model. However, the ease of getting a simple Spark application running can hide some of the challenges you might face when moving from a proof of concept to a real-world application. This talk will distill our experiences as early adopters of Spark in production, present a case study where using Spark effectively provided huge benefits over legacy solutions, and provide concrete advice regarding: