I’m speaking at Spark Summit today about using Spark to analyze operational data from the Fedora project. Here are some links to further resources related to my talk:

You should also check out my team’s Silex library, which contains useful code factored out of real Spark applications we’ve built in Red Hat’s Emerging Technology group. It includes a lot of cool functionality, but the part I mentioned in the talk is this handy interface for preprocessing JSON data before inferring a schema.
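
To give a rough sense of why preprocessing JSON before schema inference is useful, here is a minimal sketch in plain Python. It is not Silex's API. The function names (`preprocess_json_lines`, `sanitize_keys`) and the sample records are hypothetical, made up for illustration. The idea is simply to parse each record, apply a cleanup function (here, rewriting keys that contain `.` or `-`, which tend to be awkward as column names), and re-serialize the result so a schema-inferring reader sees only the cleaned records.

```python
import json


def sanitize_keys(value):
    """Recursively replace '.' and '-' in object keys with '_'.

    Hypothetical cleanup rule, chosen only for illustration.
    """
    if isinstance(value, dict):
        return {
            key.replace(".", "_").replace("-", "_"): sanitize_keys(inner)
            for key, inner in value.items()
        }
    if isinstance(value, list):
        return [sanitize_keys(item) for item in value]
    return value


def preprocess_json_lines(lines, transform):
    """Parse each JSON line, apply `transform`, and yield it re-serialized.

    Blank lines are skipped; malformed JSON raises json.JSONDecodeError.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        yield json.dumps(transform(json.loads(line)), sort_keys=True)


# Made-up sample records, one JSON object per line.
raw = [
    '{"msg.topic": "example.a", "timestamp": 1}',
    '',
    '{"msg.topic": "example.b", "meta": {"user-name": "alice"}}',
]

cleaned = list(preprocess_json_lines(raw, sanitize_keys))
for record in cleaned:
    print(record)
```

In a Spark job, the same kind of per-record transformation would run over a distributed collection of strings, and the cleaned strings would then be handed to Spark's JSON reader for schema inference.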

