Queries such as whether a user has seen a tweet, how many unique users have favourited a tweet, or how many times a tweet was seen over a period of time are all hard to answer exactly when the datasets are huge and the time constraints are tight. However, an exact number is often unnecessary as long as the error is bounded. Certain algebraic data structures give approximate answers to these queries: their algebraic properties guarantee that computations distributed over them remain correct.
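The talk's title names the Count-Min Sketch as one such structure. The following is a minimal illustrative sketch, not code from the talk: the class name, the default dimensions, the salted BLAKE2b hashing scheme and the tweet IDs are all assumptions made for the example. It counts how often each tweet was seen and shows why merging is safe to distribute: combining two sketches is plain cell-wise addition.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter whose estimates never undercount."""

    def __init__(self, width=2719, depth=5):
        # Hypothetical defaults: width ~ e / 0.001 and depth ~ ln(1 / 0.01).
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One column per row, taken from a BLAKE2b digest salted with the row index.
        data = str(item).encode("utf-8")
        for row in range(self.depth):
            salt = row.to_bytes(8, "little")
            digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions only inflate cells, so the minimum is the tightest bound.
        return min(self.table[row][col] for row, col in self._cells(item))

    def merge(self, other):
        # Cell-wise addition: associative and commutative, empty sketch as identity.
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must share dimensions")
        merged = CountMinSketch(self.width, self.depth)
        merged.table = [
            [a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(self.table, other.table)
        ]
        return merged


# Two workers each sketch their own shard of (hypothetical) tweet-view events.
shard_a = CountMinSketch()
shard_b = CountMinSketch()
for tweet_id in ["t1", "t2", "t1"]:
    shard_a.add(tweet_id)
for tweet_id in ["t1", "t3"]:
    shard_b.add(tweet_id)

combined = shard_a.merge(shard_b)
print(combined.estimate("t1"))  # at least 3, the true count
```

Because merging is addition, the combined sketch is identical to one built from all the events on a single machine, whatever order the shards are merged in. The usual error bound (overestimating by at most a fraction of the total count, with high probability) assumes well-behaved hash functions, which this example takes on trust.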
Count-Min Sketch in Real Data Applications
Laura is a data scientist at Twitter with a taste for functional programming. She mostly writes Scala, though she really admires OCaml as well. Laura knows a bit about recommender systems, statistics and math. She did her BSc in Financial and Actuarial Mathematics at Vilnius University, where she and several friends wrote a thesis on "The Role of Tail Index in Analysis of Currency Returns".