Data science usage at Netflix goes well beyond our eponymous recommendation systems. It touches almost all aspects of our business - from optimizing content delivery and informing buying decisions to fighting fraud. Our unique culture affords our data scientists extraordinary freedom in their choice of ML tools and libraries, which results in an ever-expanding set of interesting problem statements and a diverse set of ML approaches to tackle them. At the same time, our data scientists are expected to build, deploy, and operate complex ML workloads autonomously, without needing deep experience in systems or data engineering. In this talk, I will discuss some of the challenges involved in improving the development and deployment experience for ML workloads. I will focus on Metaflow, our ML framework, which offers useful abstractions for managing a model's lifecycle end to end, and on how a focus on human-centric design improves our data scientists' velocity.
Q&A
Question: Does Metaflow provide an option for hyperparameter tuning using Bayesian methods rather than just grid search?
Answer: Great question. Yes, we are working on HPO integrations. This PR lays the groundwork for it - https://github.com/Netflix/metaflow/pull/510. We are exploring integrations with Optuna. I would love to know which HPO libraries/services you are using today!
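For teams that want to parallelize a sweep today, before a dedicated HPO integration lands, Metaflow's built-in foreach fan-out can already evaluate candidates in parallel. The sketch below is illustrative only: the flow name, the learning-rate grid, and the scoring logic are placeholders for this example, not part of Metaflow or of the PR above.

from metaflow import FlowSpec, step

class GridSearchFlow(FlowSpec):
    # A simple hyperparameter sweep expressed as a Metaflow foreach fan-out.

    @step
    def start(self):
        # Candidate values to evaluate; an HPO library could propose these instead.
        self.learning_rates = [0.001, 0.01, 0.1]
        self.next(self.train, foreach="learning_rates")

    @step
    def train(self):
        # self.input holds the learning rate assigned to this parallel branch.
        self.lr = self.input
        self.score = 1.0 - self.lr  # placeholder for a real validation metric
        self.next(self.join)

    @step
    def join(self, inputs):
        # Compare the branches and keep the best configuration.
        best = max(inputs, key=lambda task: task.score)
        self.best_lr = best.lr
        self.next(self.end)

    @step
    def end(self):
        print("Best learning rate:", self.best_lr)

if __name__ == "__main__":
    GridSearchFlow()

Running "python grid_search_flow.py run" executes the branches in parallel and joins them at the end.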
Question: Metaflow is awesome, but if it uses AWS underneath, what is its advantage over similar MLOps features in AWS SageMaker?
Answer: Currently, our integrations are with the AWS cloud, but all our integrations are plugins, so it is straightforward to support Azure, GCS, and on-prem as well.
Question: First of all, we at REA Group recently decided to use Metaflow as our orchestration tool. It is an amazing tool - thanks for open-sourcing it! The endpoint function is really neat. Have you thought about integrations with AWS SageMaker inference? If yes, would that be open-sourced?
Answer: Nice! I would love to get feedback on your experience with Metaflow. Yes, indeed, we will surface a bunch of inference backends when we open-source @endpoint. However, if you would like to get started without a formal integration, here is how Metaflow can work with Cortex (which is similar to SageMaker inference) - https://site-cd1e85.webflow.io/post/reproducible-machine-learning-pipelines-with-metaflow-and-cortex
Question: Are the steps of the training flow mentioned in the presentation (snapshot, restore, etc.) implemented via Metaflow?
Answer: Within Metaflow you can execute arbitrary Python code. We don't enforce many constraints, except that the code needs to be organized as a well-formed DAG with a single entry point and a single exit point. Here is some more information - https://docs.metaflow.org/metaflow/basics
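As a rough illustration of that structure (the flow name and step bodies below are made up for this example, not taken from the talk), a minimal Metaflow flow is a Python class whose @step methods form a DAG from start to end; anything assigned to self is snapshotted as an artifact and restored in downstream steps:

from metaflow import FlowSpec, step

class HelloFlow(FlowSpec):
    # A minimal linear DAG: every Metaflow flow begins at 'start' and finishes at 'end'.

    @step
    def start(self):
        # Arbitrary Python code can run here: loading data, preprocessing, etc.
        self.message = "hello"
        self.next(self.train)

    @step
    def train(self):
        # Artifacts assigned to self are persisted (snapshotted) automatically
        # and restored for downstream steps and later inspection.
        self.model = self.message.upper()  # stand-in for real training
        self.next(self.end)

    @step
    def end(self):
        print(self.model)

if __name__ == "__main__":
    HelloFlow()

Running "python hello_flow.py run" executes the DAG, and "python hello_flow.py show" prints its structure.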
Question: Just wondering how Metaflow approaches the reusability of models. Other than grid search, does Metaflow reuse similar previously cached workflows or model pickles automatically, or is that left solely to the user to piece together separately?
Answer: You can access any prior result (from within a Metaflow flow or from any Python process) via the Client API - https://docs.metaflow.org/metaflow/client
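As a short sketch (the flow and artifact names here - TrainingFlow, model, best_lr - are hypothetical), the Client API lets any Python process look up a past run and pull its artifacts, which is how previously trained models can be reused without re-running the flow:

from metaflow import Flow

# Grab the latest successful run of a flow and reuse the artifacts it stored on self.
run = Flow("TrainingFlow").latest_successful_run
print("Reusing run:", run.pathspec)

model = run.data.model      # hypothetical artifact holding a trained model
best_lr = run.data.best_lr  # hypothetical artifact holding a tuned hyperparameter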
Taming the Long Tail of Industrial ML Applications
Savin Goyal
Machine Learning Infrastructure, Netflix