Here’s a quick video I put together introducing infrastructure log processing in Spark. At the end, there are a couple of nice graphs contrasting PCA and t-SNE for embedding high-dimensional log metadata into two dimensions.
Log processing demo from William Benton on Vimeo.
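
If you'd like a feel for the PCA half of that comparison before watching, here is a minimal sketch in plain Python. It is not the code from the video, which runs on Spark. The feature rows and every function name below are hypothetical, made up for illustration. It assumes each log record has already been turned into a numeric feature vector, and it finds the top two principal components by power iteration with deflation on the covariance matrix.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else v


def center(rows):
    """Subtract the per-column mean so each feature has mean zero."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1)
             for j in range(dims)] for i in range(dims)]


def top_component(cov, iterations=200, seed=0):
    """Approximate the dominant eigenvector of cov by power iteration."""
    rng = random.Random(seed)
    v = normalize([rng.random() + 0.1 for _ in cov])
    for _ in range(iterations):
        w = mat_vec(cov, v)
        if not any(w):
            break  # no variance left in any direction
        v = normalize(w)
    return v


def deflate(cov, v):
    """Remove the variance explained by unit vector v from cov."""
    lam = dot(v, mat_vec(cov, v))
    d = len(v)
    return [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
            for i in range(d)]


def principal_axes(rows):
    """Return the first two principal directions of the data."""
    cov = covariance(center(rows))
    pc1 = top_component(cov)
    pc2 = top_component(deflate(cov, pc1))
    return pc1, pc2


def pca_2d(rows):
    """Project each row onto the first two principal directions."""
    pc1, pc2 = principal_axes(rows)
    return [[dot(r, pc1), dot(r, pc2)] for r in center(rows)]


# Hypothetical per-record features: message length, field count,
# severity level, and a 0..1 time-of-day fraction.
rows = [
    [120.0, 5.0, 1.0, 0.2],
    [130.0, 6.0, 1.0, 0.1],
    [400.0, 12.0, 3.0, 0.9],
    [410.0, 11.0, 3.0, 0.8],
    [125.0, 5.0, 2.0, 0.3],
    [395.0, 13.0, 3.0, 0.7],
]

points = pca_2d(rows)
for x, y in points:
    print(f"{x:10.3f} {y:10.3f}")
```

PCA is sensitive to feature scale, so in this toy data the message-length column dominates the first component; standardizing each column first is a common remedy. t-SNE has no comparably short from-scratch version, since it iteratively optimizes the layout to preserve local neighborhoods rather than applying a single linear projection, and that difference is much of why the two kinds of plots can look so unlike each other.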