If you’ll be at Apache: Big Data next week, you should definitely check out some talks from my teammates in Red Hat’s Emerging Technology group and our colleague Suneel Marthi from the CTO office:
- Random Forest Clustering with Apache Spark by Erik Erlandson,
- Using a Relative Index of Performance (RIP) to Determine Optimum Configuration Settings Compared to Random Forest Assessment Using Spark by Diane Feddema,
- Distributed Machine Learning with Apache Mahout by Suneel Marthi, and
- Data Science for the Datacenter: Analyzing Logs with Apache Spark by William Benton.
Unfortunately, my talk is at the same time as Suneel’s, so I won’t be able to attend his. These are all great talks, though, so be sure to put as many as possible on your schedule if you’ll be in Vancouver!