Scala has suddenly become an important language within the Apache Hadoop ecosystem, with the arrival of Scala-based projects like Apache Kafka and Apache Spark, which is in essence "distributed Scala". In fact, it is not a surprising marriage: Hadoop has been building on functional paradigms and immutability in its own way for years, through MapReduce/HDFS and projects like Crunch and Cascading. This talk gives a Hadoop-centric take on the evolution of Scala, its benefits to Hadoop-related projects, why it succeeds in Hadoop where other languages don't, and some quirks that remain barriers to its further adoption in "big data".
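To illustrate the point about functional paradigms, here is a minimal sketch (an illustration, not from the talk) of word counting with plain Scala collections, no Spark dependency: the same map/group/reduce idioms that MapReduce and Spark distribute across a cluster, applied locally. Spark's RDD API deliberately mirrors these collection methods (`map`, `flatMap`, `reduce`), which is part of why Spark is often described as "distributed Scala".

```scala
// Word count in the map/reduce style, on local immutable collections.
// In Spark, `lines` would be an RDD[String] and the same chain of
// transformations would run across a cluster.
object WordCount {
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))             // "map" phase: emit words
      .filter(_.nonEmpty)
      .groupBy(identity)                    // "shuffle" phase: group by key
      .map { case (w, ws) => (w, ws.size) } // "reduce" phase: count per key

  def main(args: Array[String]): Unit = {
    val result = count(Seq("hadoop spark", "spark scala scala"))
    println(result) // e.g. Map(hadoop -> 1, spark -> 2, scala -> 2)
  }
}
```

Everything here is immutable: each step produces a new collection rather than mutating state, which is exactly the property that makes the pattern straightforward to distribute.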
Keynote: Spark+Hadoop and how it relates to Scala
Sean is Director of Data Science at Cloudera in London. Before Cloudera, he founded Myrrix Ltd (now the Oryx project) to commercialize large-scale real-time recommender systems on Apache Hadoop. He is an Apache Spark committer and co-author of Advanced Analytics with Spark. He was a committer and VP for Apache Mahout, and co-author of Mahout in Action. Previously, Sean was a senior engineer at Google.