Building Fault-Tolerant Microservices

Tuesday, 4th November at Skills Matter, London

This meetup was organised by LJC: London Java Community in November 2014


The topics covered will be:

  • What to do when one of the dependencies fails to respond in time
  • When to use network-level timeouts vs application-level timeouts
  • What to monitor and how to monitor it, e.g. connection pools, thread pools, queue sizes and latency
  • How to test for when the network is slow or saturated
  • How to test for when traffic is lost in transit
  • How to train your stakeholders to expect failure and get them to agree to fallbacks, so that they can choose availability over other requirements
  • When to use automated circuit breakers vs manual kill switches
  • Tips, hints and tricks for doing all of the above in Java

The topics covered are especially relevant if your application has a lot of dependencies that it communicates with over a network, i.e. microservices. They are even more applicable if your application is deployed to an environment that is prone to failure, e.g. a "cloud".

With supporting PowerPoint slides, I'll cover the theory and motivation behind moving to a more distributed architecture, then go through the pitfalls and the strategies for improving fault tolerance, backed up with real examples from Sky.

Who should attend:

Developers, testers and architects. Junior developers should be able to follow it as well.

Christopher Batey

Christopher is a Senior Engineer at Lightbend. He is currently on the core Akka team, responsible for developing Akka, Akka HTTP, Akka Streams, Reactive Kafka and Alpakka. He has previously built trading systems and online television platforms, and has worked extensively with Apache Cassandra. Likes: Scala, Java, the JVM, Akka, distributed databases, XP, TDD, pairing. Dislikes: untested software and code ownership.
