This session was not filmed.
Robert Hardy will host an interactive workshop on using Spark with Python to run machine learning algorithms when you have too much data to work comfortably with pandas. Attendees should bring a laptop and open an AWS account in advance. All code and setup scripts will be available in a public GitHub repo.
You will cover all steps of the workflow:
- Spinning up your Spark instance on AWS
- Trimming and cleaning data
- Using different storage formats for faster handling
- Browsing subsets of the data to get a feel for which features might be the most useful
- Applying models from the Spark MLlib and scikit-learn libraries
- Viewing results and assessing the quality of your predictions
Workshop: Big Data Machine Learning with Python and Spark on AWS
Robert Hardy is a full-stack quant with over 12 years of experience in the front office teams of major financial institutions. He has built professional portfolio management systems entirely from open source components, and experienced an epiphany when he was introduced to TDD, pair programming and Agile methods. Robert talks and blogs on topics related to software and mathematics, and with his diploma in painting and ceramics in hand he claims even to have some level of expertise in the Fine Arts.