Are you ready to demystify Machine Learning and put it to work on real problems? Kick-start your skills in this hands-on workshop and learn the secrets of this powerful method for data analysis!
Join Louis Dorard for this Machine Learning course and gain the skills to integrate ML into your applications, using cutting-edge industry techniques.
In this workshop you'll gain an understanding of the possibilities and limitations of modern Machine Learning, and how to put it to work on real cases. You'll learn to prepare data, to create ML models, to evaluate them in your domain of application, to optimize them, and to deploy them. Adopt a top-down, results-first and experimentation-driven approach, and focus on practical techniques applied to concrete examples.
If you're interested in leveraging the power of machine learning to improve your applications, then this course is for you!
Learn how to:
- Understand the possibilities and limitations of Machine Learning
- Build predictive models from data, with Decision Trees and Random Forests
- Analyze models' behavior, errors, and performance, and optimize their parameters
- Transform text variables into useful numerical representations for ML
- Package and deploy models to production with APIs
Introduction to Machine Learning
- Key ML concepts and terminology
- Formalizing supervised learning problems: classification and regression
- Possibilities and example use cases (web applications, mobile, enterprise data science)
- Learning techniques: Nearest Neighbors and Decision Trees
- [Hands-on] Introduction to Jupyter notebooks
- [Exercise] Decision Trees in scikit-learn (open source ML library) and BigML (ML-as-a-Service tool)
  - Model creation on classification and regression datasets
  - Visualization and interpretation
- Performance criteria for ML models and evaluation procedure
- Aggregate metrics for regression (MAE, MSE, R-squared, MAPE) and classification (accuracy, confusion and cost matrices, precision, recall, AUC)
- [Exercise] Evaluating models with Python, scikit-learn and BigML on previous datasets
- [Hands-on] Procedure for individual error inspection and interpretation
- [Hands-on] Tuning model complexity: under-fitting vs. over-fitting
- Improving predictions with Ensembles; application to Decision Trees: Random Forests
- [Exercise] Comparing multiple evaluations of Decision Trees and Random Forests on previous datasets
- [Hands-on] Optimizing classifiers by tuning probability thresholds and trading off between competing metrics
- Embracing randomness with cross-validation
- [Exercise] Tuning all models' hyper-parameters with grid search and competing in a Kaggle challenge
ML on text — Natural Language Processing
- [Hands-on] Text pre-processing tips with the NLTK library
- [Hands-on] Feature extraction (bag of words and n-grams) and feature selection with scikit-learn
- [Exercise] Creating and optimizing a model to detect fake hotel reviews
- [Hands-on] Why and how to use REST APIs for ML use in production
- [Exercise] Deploying your own Python models as APIs with the Flask library
- [Hands-on] Using your API with curl and Postman, and to fill in missing values in a spreadsheet program
- Critical overview of open source and cloud ML products and deployment solutions
- Recap of key take-aways
- Other ML techniques
  - Introduction to neural networks and usage of BigML's automated deep learning feature
  - Unsupervised learning: clustering and anomaly detection
  - Time series forecasting (by reduction to a regression problem)
  - Recommender systems (by reduction to a classification problem)
- Resources to go further and customized suggestions
Attendees should have:
- Programming experience and basic knowledge of Python syntax. Code samples will be provided throughout the course, and the programming exercises can be done by combining and adapting these samples. To learn or revise the basics of Python, please consult Codecademy's Learn Python and Robert Johansson's Introduction to Python programming (in particular the sections Python program files, Modules, Assignment, Fundamental types, Control Flow and Functions).
- Usage of a spreadsheet program (e.g. Microsoft Excel)
- Basic knowledge of scientific calculus, linear algebra and statistics (undergraduate level) will be useful to better understand some of the theory behind learning techniques, but it isn’t a hard requirement.
Bring your own hardware
To participate in this Machine Learning course you are required to bring your own laptop for practical work, with Python 3 and a recent version of Chrome.
If you are unable to bring your own laptop, let us know at least 2 weeks before the course and our team will provide you with a laptop pre-installed with the above environment.