R has been described as 'a DSL for statistical analysis'. Hadoop is built for large-scale computing. Between them, they can take on a number of interesting problems once you get them to play together, which is both easier and more accessible than you might think. In this demo I will solve a simple map/reduce problem in R and run it on an Amazon EMR cluster.
An example of a map/reduce algorithm using R and Hadoop
Anette is a consultant for ThoughtWorks where she builds people, teams, projects and occasionally a bit of code. She has worked in a number of different countries, industries and development stacks to solve all sorts of problems, but lately it has be