SkillsCast
Apache Spark for Machine Learning on Large Data Sets
Apache Spark is a general-purpose framework for distributed data processing. With MLlib, Spark's machine learning library, fitting a model to a very large data set becomes straightforward. Spark's general-purpose compute primitives also make it easy to apply an already-trained model across a large collection of observations. We'll walk through fitting a model to a big data set using MLlib and applying a trained scikit-learn model to a large set of observations.
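As a flavour of what the walkthrough covers, here is a minimal sketch of fitting a model with Spark's MLlib (spark.ml) API. The file path, column names, and the choice of logistic regression are illustrative assumptions, not details taken from the talk.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("mllib-fit-sketch").getOrCreate()

# Read a large training set from distributed storage (placeholder path and columns).
train = spark.read.parquet("hdfs:///data/training.parquet")

# MLlib estimators expect the features packed into a single vector column.
assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features")
train_vec = assembler.transform(train)

# Fit a logistic regression; the optimisation runs distributed across the cluster.
lr = LogisticRegression(featuresCol="features", labelCol="label")
model = lr.fit(train_vec)

# model.transform scores any DataFrame with the same schema; here we reuse the training data.
model.transform(train_vec).select("label", "prediction").show(5)
```

And a sketch of the second part: applying a scikit-learn model trained on the driver to a large Spark DataFrame. This version broadcasts the fitted model and scores rows with a pandas UDF (available in recent PySpark releases); the talk may have used a different mechanism, such as mapPartitions over an RDD, and the feature columns, path, and toy training data are again placeholders.

```python
import numpy as np
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import DoubleType
from sklearn.linear_model import LogisticRegression

spark = SparkSession.builder.appName("sklearn-scoring-sketch").getOrCreate()

# Train a scikit-learn model on the driver; random data stands in for a real training set.
X = np.random.rand(1000, 3)
y = (X.sum(axis=1) > 1.5).astype(int)
clf = LogisticRegression().fit(X, y)

# Broadcast the fitted model so each executor deserialises it once, not once per task.
model_bc = spark.sparkContext.broadcast(clf)

@pandas_udf(DoubleType())
def predict(f1: pd.Series, f2: pd.Series, f3: pd.Series) -> pd.Series:
    # Score a batch of rows with the broadcast scikit-learn model.
    features = pd.concat([f1, f2, f3], axis=1).values
    return pd.Series(model_bc.value.predict(features).astype(float))

# Apply the model across a large, distributed collection of observations (placeholder path).
observations = spark.read.parquet("hdfs:///data/observations.parquet")
scored = observations.withColumn("prediction", predict("f1", "f2", "f3"))
scored.show(5)
```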
YOU MAY ALSO LIKE:
- How to Experiment Quickly (SkillsCast recorded in December 2019)
- Introduction to Knowledge Graphs with Howard Knowles (Online Workshop on 21st July 2022)
- Deep Learning Fundamentals with Leonardo De Marchi (Online Workshop on 12th - 15th September 2022)
- YOW! Data 2022 (Online Conference on 1st - 2nd June 2022)
- How Data Analytics is Changing the Way Internal Auditors Work (Online Meetup on 29th - 30th May 2022)
- Getting Geospatial Data on The Web (SkillsCast recorded in February 2022)
- Deep Learning with F#: An Experience Report (SkillsCast recorded in October 2021)
About the Speaker
Juliet Hougland
Data Vagabond | Bagged & Boosted