DEV Community

Let's talk about Health Checks

Douglas Makey Mendez Molero on January 12, 2019

According to the Azure documentation, in this excellent article, they state: "It's a good practice, and often a business requirement, to monit...
 
mornindew

Thanks for the article. I appreciate it.

One thing to consider is that health checks should be used to drive the behavior of the orchestration platform (e.g. Kubernetes). If a check fails, then k8s will act on that failure (typically with a restart). This is very useful as it starts to "self heal" outages, but it also means that your health check should include only things that are recoverable and can benefit from a restart.

Redis is actually a good example of this. Maybe your application will operate just fine without its cache (albeit slower). In that case a restart isn't the best response and a 200 is acceptable. I typically rely on errors in the log files to trigger monitoring alerts instead. It's really case by case, but I usually only add checks to the health endpoint for things that are recoverable and also owned by the microservice that is hosting the health check endpoint.

Just my .02

Phil Ashby

Nicely done! We are currently implementing health checks in our microservice platform at work, and have chosen to support two variants:

  • internal: similar to your description above, this ensures the service is able to accept requests itself.
  • external: as an extension to the internal check, this also passes the health check call on to dependent services, ensuring that the end-to-end chain of services is able to accept requests.

We added the external check for use by platform monitoring, avoiding the need to have internal network access, and as a way to dynamically record request routing, as we collect service IDs and versions in results. This helps with configuration drift detection / reconciliation in a complex service mesh platform.

Ajinkya Borade

Please write more Golang tutorials :) Gin, maybe!