HealthUnlocked is a social network centred around health issues, where people find information about chronic conditions. Our users share 4.5 pieces of health content every minute, which we classify into 700 different categories within milliseconds using machine learning.
The data science team at HealthUnlocked is accustomed to working with mature Python libraries to process text and implement machine learning algorithms. During this talk, you will follow the journey of translating a Python model prototype into Clojure production code.
You will learn how the HealthUnlocked (HU) team implemented its natural language processing pipeline, including tokenisation and vectorisation, as well as the core Naive Bayes algorithm, from first principles.
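To make the three steps named above concrete, here is a minimal Python sketch of tokenisation, bag-of-words vectorisation, and a multinomial Naive Bayes classifier with add-one smoothing. It is an illustration only, not the HealthUnlocked implementation: the function names, the regular expression, the choice of smoothing, and the toy training sentences and labels are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenise(text):
    """Lowercase the text and split it into word tokens (assumed tokenisation rule)."""
    return re.findall(r"[a-z']+", text.lower())


def vectorise(tokens):
    """Turn a token list into a bag-of-words count vector (token -> count)."""
    return Counter(tokens)


def train(documents):
    """Collect the counts that multinomial Naive Bayes needs from (text, label) pairs."""
    doc_counts = Counter()               # number of documents per label
    word_counts = defaultdict(Counter)   # token counts per label
    vocabulary = set()
    for text, label in documents:
        counts = vectorise(tokenise(text))
        doc_counts[label] += 1
        word_counts[label].update(counts)
        vocabulary.update(counts)
    return doc_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log posterior score for the text."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    features = vectorise(tokenise(text))
    scores = {}
    for label in doc_counts:
        # Log prior: share of training documents carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token, count in features.items():
            if token not in vocabulary:
                continue  # tokens never seen in training are ignored
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += count * math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented for this sketch.
training = [
    ("my blood sugar is high after meals", "diabetes"),
    ("insulin dose and blood sugar readings", "diabetes"),
    ("wheezing and short of breath at night", "asthma"),
    ("my inhaler helps when I am wheezing", "asthma"),
]
model = train(training)
print(predict(model, "blood sugar spikes after insulin"))
print(predict(model, "wheezing at night"))
```

Working in log space keeps the product of many small probabilities from underflowing, which matters once the vocabulary and the number of categories grow.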
Clojure for Data Science: from a Prototype in Python to Clojure in Production
Chloe is currently part of the data science team at HealthUnlocked, which aims to improve user experience on the platform and facilitate user data analysis. She has always been passionate about healthcare and psychology, and she enjoys digging for new insights in medical data.
Maria is the lead data scientist at HealthUnlocked, a health social network. Her main role is to build the data pipelines and machine learning algorithms that power the team's content recommender and intelligent content tagger. Before working at HealthUnlocked, she did a PhD in Cambridge in signal processing and machine learning. She then moved to a startup, where she worked mainly with big data tools (Spark) and Python.