We will go through all the installation and setup steps, then use PySpark, the Python interface to Spark, to show a few examples of big-data crunching (perhaps with public data on S3).
Setting up a Spark cluster on AWS
Robert Hardy is a full-stack quant with over 12 years of experience in the front-office teams of major financial institutions. He has built professional portfolio management systems entirely from open-source components. He experienced an epiphany when he was introduced to TDD, pair programming and Agile methods. Robert talks and blogs on topics related to software and mathematics, and with his diploma in painting and ceramics in hand he claims to even have some level of expertise in the Fine Arts.