How can a small team with a limited budget enable the analysis of large volumes of data in a world of constantly changing requirements?
Lindsey and Phil will share how the Guardian has used a range of technologies, including Apache Spark and PrestoDB on AWS, to support simple ingestion and fast querying of a wide range of datasets. Learn why it’s important to decouple storage from compute, and raw data sources from optimised query formats, and why there’s still no single perfect solution.
The Agile Data Warehouse - Beginners
Phil is a Senior Developer Manager at the Guardian.
Lindsey is a Data Engineer at Deliveroo and is passionate about developing processes and systems that support data-driven decisions across an organisation. She previously worked on the data technology team at the Guardian.