Presentation: Low Latency Microservices in Java



4:10pm - 5:00pm

Day of week:



Key Takeaways

  • Learn lessons on developing low latency microservices with Java.
  • Understand how you can make asynchronous messaging simpler.
  • Gain practical advice on handling failure from large scale, low latency JVM based microservice implementations.


In this talk we will look at the differences between microservice and monolith architectures and their relative benefits and disadvantages. We will look at design patterns which will allow us to utilize these different strategies as a deployment concern without significant changes to the business logic.

We will look at how a microservice architecture can be implemented under low latency constraints of 10 to 100 microseconds, in Java in particular, and how these strategies reduce the impact of serializing data and logging.


What can you tell me about low latency microservices in Java?
A concern around using a microservice design is that it can add a lot of overhead in two ways. This can be development overhead in terms of trying to test and debug your integrated system, but also overhead just in terms of latency between your services. There are people who have come up with different approaches to solve this problem, but I have never seen anyone go into much depth about it.
I consider the choice between microservices and a monolith to be a deployment issue. When you are writing unit tests, you don’t necessarily need to test the infrastructure. You just want to see your services talking to one another correctly.
You can do this without any type of transport being involved. Without a transport, it is no harder to test, debug or profile than a normal monolith component. Once you have proven that your services can work without a transport, then, for deployment purposes, you put in a transport that suits your needs.
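As a rough sketch of what testing without a transport can look like, the plain Java example below wires a component straight to a recording listener. The names (`OrderListener`, `PositionListener`, `PositionService`) are hypothetical and not taken from the talk or from any library; the only assumption is that services talk to each other through interfaces whose methods return void.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All names here are hypothetical; this illustrates the pattern, not the speaker's code.
class NoTransportDemo {
    // One-way messages: methods return void, so a caller never waits for a reply.
    interface OrderListener {
        void onOrder(String symbol, int quantity);
    }

    interface PositionListener {
        void onPosition(String symbol, int netQuantity);
    }

    // Business logic only: consumes orders and publishes net positions via an interface.
    static final class PositionService implements OrderListener {
        private final Map<String, Integer> positions = new HashMap<>();
        private final PositionListener out;

        PositionService(PositionListener out) {
            this.out = out;
        }

        @Override
        public void onOrder(String symbol, int quantity) {
            int net = positions.merge(symbol, quantity, Integer::sum);
            out.onPosition(symbol, net);
        }
    }

    // The "test": no network and no queue, just method calls on one thread.
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        PositionListener recorder = (symbol, net) -> seen.add(symbol + "=" + net);
        OrderListener service = new PositionService(recorder);
        service.onOrder("ABC", 100);
        service.onOrder("ABC", -40);
        service.onOrder("XYZ", 5);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [ABC=100, ABC=60, XYZ=5]
    }
}
```

Because `PositionService` only sees interfaces, it can be stepped through in a debugger or profiled like any other object in a monolith.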
In my talk, I explore one particular transport that is designed for low latency, but this design principle can apply whether you are using our transport or some other transport. You can take a design approach where you are not locked into the assumption that you have a particular transport, or indeed any transport, between every component in your system. This flexibility means you only add a transport where it is appropriate to the use case you need to run. You are not tied to a solution before you’ve worked out what problem you are trying to solve.
After outlining the approach, I look at a sample service that acts as a gateway to a JDBC database. I examine what the code looks like and what performance you can get out of it, along with some benchmarks. By having a persistent queue, you have guaranteed delivery of updates to the database. If you have to restart, it just continues from where it was up to, writing the updates out to the database as fast as the database can keep up.
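The restart behaviour can be illustrated with a deliberately simple stand-in, assuming Java 11 or later. This is not the transport discussed in the talk: `FileQueue` is a hypothetical append-only text file plus a cursor file, and `UpdateSink` is a hypothetical placeholder for the JDBC side so that the sketch runs without a database driver.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; a toy persistent queue, not a production design.
class PersistentQueueDemo {
    // Stand-in for "execute this update against the database".
    interface UpdateSink {
        void apply(String update);
    }

    static final class FileQueue {
        private final Path log;    // append-only, one update per line
        private final Path cursor; // number of updates already applied

        FileQueue(Path dir) {
            this.log = dir.resolve("updates.log");
            this.cursor = dir.resolve("updates.cursor");
        }

        void append(String update) throws IOException {
            Files.write(log, (update + "\n").getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }

        // Applies up to max pending updates, saving progress after each one.
        // A crash between apply() and the cursor write would re-apply one update
        // on restart, so this simple form gives at-least-once delivery.
        int drainTo(UpdateSink sink, int max) throws IOException {
            List<String> all = Files.exists(log)
                    ? Files.readAllLines(log, StandardCharsets.UTF_8)
                    : new ArrayList<String>();
            int next = Files.exists(cursor)
                    ? Integer.parseInt(Files.readString(cursor).trim())
                    : 0;
            int applied = 0;
            while (next < all.size() && applied < max) {
                sink.apply(all.get(next));
                next++;
                Files.writeString(cursor, Integer.toString(next));
                applied++;
            }
            return applied;
        }
    }

    static List<String> run() {
        try {
            Path dir = Files.createTempDirectory("queue-demo");
            List<String> database = new ArrayList<>(); // pretend database table
            FileQueue writer = new FileQueue(dir);
            writer.append("update-1");
            writer.append("update-2");
            writer.append("update-3");
            // The gateway applies two updates, then the process "stops".
            new FileQueue(dir).drainTo(database::add, 2);
            // After a "restart", a fresh gateway resumes from the saved cursor.
            new FileQueue(dir).drainTo(database::add, Integer.MAX_VALUE);
            return database;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // [update-1, update-2, update-3]
    }
}
```

Each update reaches the sink once and in order here because the cursor survives the simulated restart. A real gateway would also need to decide how to handle the crash window noted in the comment.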
Tell me about some of the gotchas that you will be exploring for the audience?
The main gotcha in low latency is the need to avoid anything which takes a non-trivial amount of time. You want to minimise network hops, disk activity, garbage collection, and pretty much any operating system call. We care about how our services are distributed amongst the cores of our CPUs, as well as how our applications are distributed across machines.
The reason why we are not necessarily worried about running it across many machines is that we are looking at systems that can do 100,000 operations per second through a single thread, let alone through a single machine, and at that throughput you may not need multiple machines or, if you do, you probably don’t need many.
We have a client that takes peak message rates of 24 million messages a second, with a cluster of six machines. Today each machine might have 80 free logical CPUs, which would allow you to deploy up to 80 single threaded services each.
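To put those figures side by side, here is a back-of-the-envelope calculation. It assumes, purely for illustration, that load is spread evenly across machines and threads and that the 100,000 operations per second figure applies to every thread; neither assumption comes from the talk, and the class and method names are hypothetical.

```java
// Rough capacity arithmetic on the figures quoted above; all names are hypothetical.
class ThroughputBudget {
    // Even split of the cluster-wide rate across machines (an assumption).
    static long perMachine(long clusterMessagesPerSecond, int machines) {
        return clusterMessagesPerSecond / machines;
    }

    // Average time available per message on one thread, in microseconds.
    static double microsPerMessage(long messagesPerSecondPerThread) {
        return 1_000_000.0 / messagesPerSecondPerThread;
    }

    // Threads needed to sustain a rate, rounded up.
    static long threadsNeeded(long messagesPerSecond, long perThreadRate) {
        return (messagesPerSecond + perThreadRate - 1) / perThreadRate;
    }

    public static void main(String[] args) {
        long machineRate = perMachine(24_000_000L, 6);
        System.out.println(machineRate);                          // 4000000
        System.out.println(microsPerMessage(100_000L));           // 10.0
        System.out.println(threadsNeeded(machineRate, 100_000L)); // 40
    }
}
```

Under these assumptions each machine handles 4 million messages a second, a thread running at 100,000 operations per second has about 10 microseconds per operation, and 40 such threads would fit within 80 logical CPUs.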
The more effectively each machine runs, the fewer machines you need. This is not only cost effective, but when latency is critical, having lots of machines might add too much latency to be practical. More machines also add complexity. Microservices tools can help you reduce the overhead of managing these machines, though you are still better off if you don’t need so many.
Another consideration is the development lifecycle for each developer. This usually involves repeated testing. To maximise productivity, you want a short development cycle, with fast feedback from your tests, without having to deploy to shared systems. Most developers don’t have access to racks of machines to do their development and testing on. You can have a much more productive team if they can test their software in a short timeframe on the hardware that is available to them individually.
What would you consider to be the key takeaways for a Java developer coming to your talk?
The main actionable thing is about how you can make asynchronous messaging simpler. You don’t need to use a complex framework. In fact, the example I start with is so trivial that you would think there is nothing happening at all, and that is the case. It is a design pattern. You don’t have to have a framework to get a solution.
Only after your components work do you need to consider what transport is best for your use case. Many people go into microservices and assume Microservices=REST. That may be appropriate for some use cases, but one of the things REST is not good at is performance.
Generally speaking, that doesn’t matter. Performance isn’t the only criterion, but if performance does matter -- or you are trying to talk to a service which isn’t designed for REST -- that shouldn’t change the design radically. All you need is a different transport, matching what you are interacting with. It doesn’t have to be all about REST.
In conclusion, you take a very simple design decision, and treat transport as an optional and swappable configuration, enabling you to choose the transport that suits your needs. For me, some of the problems people have with microservices come from locking themselves into a particular idea of what the solution has to look like and that is just simply not the case.
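One way to picture transport as a swappable configuration is sketched below. The `Transport` enum, `QueueTransport`, and `MaxPriceTracker` are hypothetical names, and the in-memory queue is only a placeholder for whatever real transport a deployment would choose; the point is that the business logic class is identical in both wirings.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical names; the queue below stands in for a real transport.
class SwappableTransportDemo {
    interface PriceListener {
        void onPrice(String symbol, double price);
    }

    // Business logic: unaware of how messages reach it.
    static final class MaxPriceTracker implements PriceListener {
        private double max = Double.NEGATIVE_INFINITY;

        @Override
        public void onPrice(String symbol, double price) {
            max = Math.max(max, price);
        }

        double max() {
            return max;
        }
    }

    enum Transport { NONE, IN_MEMORY_QUEUE }

    // Implements the same interface, but defers delivery until drain() is called.
    static final class QueueTransport implements PriceListener {
        private final Queue<Runnable> pending = new ArrayDeque<>();
        private final PriceListener target;

        QueueTransport(PriceListener target) {
            this.target = target;
        }

        @Override
        public void onPrice(String symbol, double price) {
            pending.add(() -> target.onPrice(symbol, price));
        }

        void drain() {
            Runnable next;
            while ((next = pending.poll()) != null) {
                next.run();
            }
        }
    }

    // The transport is chosen by configuration; the publisher code does not change.
    static double run(Transport transport) {
        MaxPriceTracker tracker = new MaxPriceTracker();
        QueueTransport queue = transport == Transport.IN_MEMORY_QUEUE
                ? new QueueTransport(tracker) : null;
        PriceListener publisher = queue != null ? queue : tracker;

        publisher.onPrice("ABC", 10.5);
        publisher.onPrice("ABC", 11.25);
        publisher.onPrice("ABC", 9.75);

        if (queue != null) {
            queue.drain(); // deliver the queued messages to the service
        }
        return tracker.max();
    }

    public static void main(String[] args) {
        System.out.println(run(Transport.NONE));            // 11.25
        System.out.println(run(Transport.IN_MEMORY_QUEUE)); // 11.25
    }
}
```

Both wirings produce the same result, which is what allows the choice to be deferred until deployment.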
If you have a failure, how do you know exactly the path that your execution went through the system and how do you reproduce that?
You can model each service or component as a state machine. If you replay the inputs in the same order with the same code, you will get the same results every time.
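A minimal sketch of this idea follows, assuming the service's behaviour depends only on its inputs and current state (no clocks, random numbers, or other hidden inputs). The `Account` class and its text message format are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example of a service modelled as a deterministic state machine.
class ReplayInputsDemo {
    static final class Account {
        private long balance;
        private final List<String> outputs = new ArrayList<>();

        // State changes only in response to an input message.
        void onInput(String input) {
            String[] parts = input.split(" ");
            long amount = Long.parseLong(parts[1]);
            if (parts[0].equals("DEPOSIT")) {
                balance += amount;
                outputs.add("ACCEPTED balance=" + balance);
            } else if (parts[0].equals("WITHDRAW") && amount <= balance) {
                balance -= amount;
                outputs.add("ACCEPTED balance=" + balance);
            } else {
                outputs.add("REJECTED " + input);
            }
        }

        List<String> outputs() {
            return outputs;
        }
    }

    // Feeds recorded inputs, in their original order, into a fresh instance.
    static List<String> replay(List<String> recordedInputs) {
        Account account = new Account();
        for (String input : recordedInputs) {
            account.onInput(input);
        }
        return account.outputs();
    }

    public static void main(String[] args) {
        List<String> inputs = List.of("DEPOSIT 100", "WITHDRAW 30", "WITHDRAW 500");
        System.out.println(replay(inputs));
        // [ACCEPTED balance=100, ACCEPTED balance=70, REJECTED WITHDRAW 500]
        System.out.println(replay(inputs).equals(replay(inputs))); // true
    }
}
```

Replaying the recorded inputs reproduces the path taken through the service, including the rejected withdrawal, which is what makes a failure reproducible.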
What if you have multiple input streams, or you want to upgrade your code? To handle these cases, we recommend replaying the *output* rather than the input. You can record not only the exact order in which messages were processed but also the result of each message as it was produced at the time. After upgrading your software, the service can replay the outputs to see which decisions it needs to honour, even if your new service would have made different decisions.
This way you can upgrade the software in the middle of the day and have it continue on, and know that it’s not going to break anything that happened previously.
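The difference between replaying inputs and replaying outputs can be sketched as follows. `RiskService`, its exposure limit, and the `ACCEPT`/`REJECT` message format are hypothetical, chosen only so that an "upgrade" (a stricter limit) would have decided differently from the version that ran earlier in the day.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an upgraded service rebuilds state from recorded outputs.
class ReplayOutputsDemo {
    static final class RiskService {
        private final long limit;
        private long exposure;

        RiskService(long limit) {
            this.limit = limit;
        }

        // Live path: decide using the current rules and emit the decision as output.
        String onOrder(String orderId, long quantity) {
            boolean accepted = exposure + quantity <= limit;
            if (accepted) {
                exposure += quantity;
            }
            return (accepted ? "ACCEPT " : "REJECT ") + orderId + " " + quantity;
        }

        // Recovery path: honour a recorded decision without re-evaluating it.
        void replayOutput(String recorded) {
            String[] parts = recorded.split(" ");
            if (parts[0].equals("ACCEPT")) {
                exposure += Long.parseLong(parts[2]);
            }
        }

        long exposure() {
            return exposure;
        }
    }

    static List<String> run() {
        // Version 1 runs in the morning with a limit of 1000 and records its outputs.
        RiskService v1 = new RiskService(1_000);
        List<String> outputLog = new ArrayList<>();
        outputLog.add(v1.onOrder("A", 600)); // ACCEPT A 600
        outputLog.add(v1.onOrder("B", 300)); // ACCEPT B 300

        // Version 2 is deployed mid-day with a stricter limit of 500.
        // It would have rejected order A, but it honours what already happened.
        RiskService v2 = new RiskService(500);
        for (String recorded : outputLog) {
            v2.replayOutput(recorded);
        }

        List<String> result = new ArrayList<>();
        result.add("exposure=" + v2.exposure());
        result.add(v2.onOrder("C", 50)); // new rules apply from here on
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [exposure=900, REJECT C 50]
    }
}
```

Had version 2 replayed the inputs instead, it would have rejected order A and ended up with a state that disagrees with the decisions already sent to other systems.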

Speaker: Peter Lawrey

@StackOverflow Gold Badges Java, JVM, Memory, & Performance / CEO Higher Frequency Trading Ltd

Some background on Peter: Most answers for Java and JVM on (~12K), "Vanilla Java" blog with four million views and ~300 posts, founder of the Performance Java User's Group, a virtual JUG with 1800+ members, architect of Chronicle Software, open source project for high performance, low latency libraries in Java, & Java Champion

Monday, 13 June

Tuesday, 14 June

Wednesday, 15 June