Apache Spark is one of the most popular general-purpose distributed systems, and has driven a lot of growth in the Scala community. This talk will look at the magic which makes Spark work, peeling back the curtain to reveal the several hundred gnomes that secretly power most distributed systems.
In essence, parts of this talk could be considered “why Spark is built the way it is, why it’s not perfect, and how to work around our mistakes.” It’s not all doom and gloom, though: we will explore the new APIs and the exciting new things we can do with them, with a brief detour into how to work around some of the trade-offs in the new APIs, but the focus is mostly on the new, exciting, shiny things we can play with.
A basic background in Apache Spark will probably make the talk more exciting (or depressing, depending on your point of view), but for those new to Apache Spark, just enough to understand what’s going on will be covered at the start. The presenter would of course encourage you to buy and read her books on the topic (“Learning Spark” and “High Performance Spark”), because which presenter doesn’t do that?
Even if distributed systems aren't your jam, there will be pictures of cats, gnomes, and maybe even a panda to keep things exciting. Also, learning how systems like Spark have been designed and have evolved can help you avoid our mistakes (or make you feel better about your own).
Keynote: The Magic Behind Spark
Holden is a transgender Canadian open source developer advocate at Google with a focus on Apache Spark, BEAM, and related "big data" tools. She is the co-author of Learning Spark, High Performance Spark, and another Spark book that's a bit more out of date. She is a committer and PMC member on Apache Spark and a committer on the SystemML and Mahout projects. She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal.