How do you detect antisocial behaviour in online communities? This is a very real problem for the Guardian, which receives tens of thousands of comments a day. During this talk, you will learn how, as a small team of non-data-scientists, Nicolas and team used Apache Spark running on Amazon’s Elastic MapReduce (‘EMR’) platform to detect abusive comments. The results? A better community and a significantly reduced workload for our moderation team.
You will explore Nicolas and Thomas's real-life example to introduce machine learning concepts and Apache Spark. The particular focus will be on patterns of behaviour in online communities, but the talk should also be interesting to you if you happen to be relatively new to machine learning, or interested in classification problems more generally - for example, spam detection. You will also discover problems the team encountered along the way and how to avoid them yourself!
Detecting antisocial comments: an adventure in machine learning at theguardian.com - Intermediate
Thomas Kaliakos is a Software Engineer at Ovo Energy. He is a machine learning enthusiast. He's also a professional day-dreamer, amateur philosopher and lover of asking "why?"
Nic Long is Tech Lead for the Discussion (user commenting) team at the Guardian, where he has been working for the last couple of years. He loves building scalable APIs in functional languages - in particular Scala and Clojure. He is passionate about developing new ways of journalism on the web and other technology platforms.