We will walk through every installation and setup step, then use PySpark, Spark's Python interface, to show a few examples of big-data crunching (perhaps with public data on S3).
Robert Hardy
Robert Hardy is a full-stack quant with over 12 years of experience in the front-office teams of major financial institutions. He has built professional portfolio management systems entirely from open-source components. He experienced an epiphany when he was introduced to TDD, pair programming and Agile methods. Robert talks and blogs on topics related to software and mathematics and, with his diploma in painting and ceramics in hand, even claims some level of expertise in the fine arts.