Big Data In Production: Bare Metal to OpenShift

Published

January 29, 2017

Presented at DevConf.cz  (Brno, Czechia)

Apache Spark is one of the most exciting open-source data-processing frameworks today. It features a range of useful capabilities and an unusually developer-friendly programming model. However, the ease of getting a simple Spark application running can hide some of the challenges you might face while going from a proof of concept to a real-world application. This talk will distill our experiences as early adopters of Spark in production, present a case study where using Spark effectively provided huge benefits over legacy solutions, explain why we migrated from a dedicated Spark cluster to OpenShift, and provide concrete advice regarding:

This talk assumes some familiarity with Apache Spark but will provide context for attendees who are new to Spark. You’ll learn from a seasoned Red Hat engineer with over three years of experience running Spark in production and contributing to the Spark community.

Talk video