The complexity in complex distributed systems isn’t in the code; it’s between the services or functions. A lot of failures are hard to predict and maybe even hard to detect. When your system is made up of multiple microservices or a bunch of lambdas and some queues, how do you know whether it’s working the way you think it should?
Quality in these systems isn’t so much about testing up front: if you’re releasing 20 times a day, you can’t pay the cost of running full regression tests every time. You need to take a risk-based approach and focus your testing effort where it really matters. More importantly, you need to be able to find out quickly when things are going wrong, and fix them quickly. Your production system is the only place the full complexity comes into play, so you should be doing a lot of your quality work there. Make sure you can find out about problems as early as possible, and do as much of your ‘testing’ in production as you can.
During Sarah's keynote you will learn about:
- the importance of observability - building in log aggregation, metrics and tracing so you can tell what’s up
- business-focussed monitoring, including synthetic monitoring
- why documentation is important and how to encourage people to keep it up to date
- how chaos experiments help
You should go away knowing more about what it takes to make your complex distributed systems easier to build and to run with high quality and stability.
Keynote: Quality for 'Cloud Natives': What Changes When Your Systems Are Complex And Distributed?
I've been a developer for 15 years, leading delivery teams across consultancy, financial services and media. Over the last few years I have developed a deep interest in operability, observability and DevOps, and at the beginning of 2018 this led to my taking over responsibility for Operations and Reliability at the Financial Times.