DEV Community

Building resilient services

Namc on April 17, 2019

Original article published at: namc.in - Retries, Timeouts and Backoff. Distributed systems are hard. While we learn a lot about making highly ava...
Phil Ashby

Very nice - thanks Namc :)

We recently had to deal with a situation in our production environment where a series of timeouts and retries interacted badly with each other, in particular because the clients making API calls sometimes have their own retries in place. We find that almost all static configurations are fragile and work only under some conditions. We are now looking at using service mesh techniques to isolate these communication patterns from the services themselves and to make them adaptive, so that whole-system performance is maintained. A nice article introducing this approach is here:
medium.com/@autoletics/macro-appro...
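
To make the "retries interacting badly" point concrete, here is a rough Python sketch of how independently configured retries can multiply across layers. The function name and the three-layer example (client, gateway, service) are hypothetical and are not taken from the comment or the linked article; it only counts the worst case, where every attempt at every layer fails.

```python
from math import prod


def worst_case_attempts(retries_per_layer):
    """Worst-case number of calls that reach the innermost dependency
    for a single user request when every layer retries on its own.

    retries_per_layer: retries (not total attempts) configured at each
    layer, outermost first, e.g. [client, gateway, service].
    These layers are illustrative assumptions, not a real topology.
    """
    # Each layer makes 1 initial attempt plus its retries, and every
    # attempt at one layer triggers a full round at the layer below,
    # so the per-layer attempt counts multiply.
    return prod(1 + r for r in retries_per_layer)
```

For example, `worst_case_attempts([3, 3, 3])` returns 64: three layers that each look harmless with "3 retries" can turn one failing request into 64 calls against the struggling dependency, which is one reason a static per-service setting can be fragile.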