We will walk through all the installation and setup steps, then use PySpark, Spark's Python interface, to show a few examples of big-data crunching (perhaps with public data on S3).
Setting up a Spark cluster on AWS
Robert Hardy is a full-stack quant with over 12 years of experience in the front-office teams of major financial institutions. He has built professional portfolio management systems entirely from open-source components. He experienced an epiphany when he was introduced to TDD, pair programming and Agile methods. Robert talks and blogs on topics related to software and mathematics, and with his diploma in painting and ceramics in hand, he even claims some level of expertise in the Fine Arts.