Karolis for Webhook Relay

Creating your own health check monitor with Node-RED

If you are running your own blog, a SaaS application or a forum, you have probably encountered uptime/health monitors such as https://uptimerobot.com and its competitors (there are plenty of them: https://alternativeto.net/software/uptimerobot). In this short tutorial we will build our own monitor: simple, but flexible enough that you can extend it well beyond what other tools offer.

Our website health monitor will be:

  1. Querying 3 websites
  2. Checking their response status codes and contents
  3. Rate-limiting notifications
  4. Sending notifications to both email and Slack

The flow looks like this:

[Image: the flow]

What is Node-RED?

From https://nodered.org/:

Node-RED is a programming tool for wiring together hardware devices, APIs and online services in new and interesting ways. It provides a browser-based editor that makes it easy to wire together flows using the wide range of nodes in the palette that can be deployed to its runtime in a single-click.

The getting started guide can be found on the Node-RED website.

Even though my preferred language is Go, I find working with Node-RED a lot of fun :)

Step 1: Time ticker

The time ticker is a simple inject node from the input category. Configure the interval based on your needs. We will be adding message rate limiting later, so you can even set it to 1 or 5 seconds. In my example I set it to 30s:

[Image: configure ticker]

We don't care about the topic or payload; the node can inject anything. The only thing we need from it is to trigger further actions.
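The list at the top mentions three websites. One way to drive them from a single ticker, sketched here under assumptions, is a function node that turns each tick into one message per site. The URLs are placeholders, `fanOut` is a name made up for this sketch, and it relies on the http request node reading the target from `msg.url` when its own URL field is left blank.

```javascript
// Hypothetical list of sites to check; replace with your own URLs.
const SITES = [
    "https://example.com",
    "https://example.org",
    "https://example.net"
];

// Turn one tick into one message per site.
// topic is set to the URL so later nodes can tell the sites apart.
function fanOut(sites) {
    return sites.map((url) => ({ url: url, topic: url }));
}

// In a function node the body would end with:
//   return [fanOut(SITES)];
// An array inside the outer array sends the messages one after
// another on the first output.
```

If you prefer one http request node per website, you can skip this and wire the ticker to each of them directly.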

Step 2: Making requests

To make requests we will use an http request node from the function category.

[Image: http request node configuration]

As you can see, it's really straightforward. No additional configuration is required here. This node will make a GET request and return a message with multiple fields, but we only care about:

  • statusCode - we will check that this is 200.
  • responseUrl - we will incorporate it into the warning message.
  • payload - this is the actual response body that contains the HTML. We will check whether it contains a certain phrase that we know should be there.

Step 3: Validating responses

In this step we will add a simple HTTP status code validation (if the website is down, you won't get a 200 response). However, sometimes you can still get a 200, for example from a reverse proxy displaying an empty page or after a website update goes wrong. In those cases you will want to check the response body for specific phrases or keywords that should be there.

Checking response status code

To validate response status code we will use a switch node from the function category:

[Image: switch node config to check status code]

Checking response body contents

To check response body contents, I couldn't find a "doesn't contain" option in the switch node, so I just inverted the logic and chose the second output like this:

[Image: switch node config for 'does not contain' path]

Then, we just connect the second output (leaving the first one empty) and we get what we want :)
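If you would rather keep both checks in one place, the same logic can be sketched as a function node with two outputs. This is only an illustration of the two switch nodes above, not a replacement for them: `classify`, `EXPECTED_PHRASE` and the result names are made up for this sketch, and the phrase is a placeholder for whatever text your page should always contain.

```javascript
// Hypothetical phrase that should always appear on a healthy page.
const EXPECTED_PHRASE = "Welcome";

// Classify a message coming from the http request node.
// Returns "ok", "bad_status" or "bad_body".
function classify(msg, phrase) {
    if (msg.statusCode !== 200) {
        return "bad_status";
    }
    // payload is the response body; treat a non-string body as empty.
    const body = typeof msg.payload === "string" ? msg.payload : "";
    return body.includes(phrase) ? "ok" : "bad_body";
}

// In a function node with two outputs, the body could end with:
//   const result = classify(msg, EXPECTED_PHRASE);
//   if (result === "bad_status") return [msg, null];
//   if (result === "bad_body") return [null, msg];
//   return null; // healthy response: send nothing
```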

Step 4: Generate payloads

This step is really up to you and what you want to display. For the Slack payload we need to format a simple JSON message so we will use a function node from the function category.

The function for the bad response body looks like:

return {
    payload: `{"response_type": "in_channel", "text": "[WARNING] ${msg.responseUrl} URL returned unexpected contents, please investigate" }`,
    topic: msg.topic
}

And for the wrong status code:

return {
    payload: `{"response_type": "in_channel", "text": "[WARNING] ${msg.responseUrl} responded with status code '${msg.statusCode}'" }`,
    topic: msg.topic
}

The email doesn't need a JSON payload, so it looks like:

return {
    payload: `[WARNING] ${msg.responseUrl} responded with status code '${msg.statusCode}'`,
    topic: msg.topic
}

You can try adding more information based on what triggered the flow. In this case we want to differentiate payloads based on whether the status code or the response body contents were unexpected.
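One caveat with building the JSON by hand in a template string is that a quote or backslash inside the URL or status would produce invalid JSON. A minimal sketch of an alternative, assuming you are happy to let `JSON.stringify` do the escaping, is below. `slackPayload` and the `reason` values are names invented for this sketch; the message texts are the ones used above.

```javascript
// Build the Slack webhook body for a failed check.
// reason is "bad_status" or "bad_body" (names chosen for this sketch).
function slackPayload(msg, reason) {
    const text = reason === "bad_status"
        ? `[WARNING] ${msg.responseUrl} responded with status code '${msg.statusCode}'`
        : `[WARNING] ${msg.responseUrl} URL returned unexpected contents, please investigate`;
    // JSON.stringify escapes quotes and backslashes inside the text.
    return JSON.stringify({ response_type: "in_channel", text: text });
}

// In a function node the body could end with:
//   return { payload: slackPayload(msg, "bad_status"), topic: msg.topic };
```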

Step 5: Slack and email notifications

Before setting up notification nodes, I would really recommend adding rate limiting to your flow, as a stream of emails/Slack messages will distract you at the worst possible time :)

Rate limiting can be added with a delay node from the function category. The configuration looks like:

[Image: rate limit config]

TIP: if you add a rate limit per topic, your messages will not appear right away. After the first message, the node waits for the full duration to receive any new messages and only sends them then. You could, however, create a separate delay node for each website to rate-limit events there.
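If a separate delay node per website becomes unwieldy, per-site rate limiting can also be sketched in a function node. This is an assumption-laden alternative to the delay node, not what the flow above uses: `shouldNotify`, `lastSent` and the 10 minute interval are all invented for this example, and in Node-RED the `lastSent` object would be kept in the function node's context.

```javascript
// Minimum gap between two notifications for the same URL
// (10 minutes is an arbitrary value picked for this sketch).
const MIN_INTERVAL_MS = 10 * 60 * 1000;

// lastSent maps a URL to the time (in ms) of its last notification.
// Returns true when a notification may go out, and records the time.
function shouldNotify(lastSent, url, now) {
    const previous = lastSent[url];
    if (typeof previous === "number" && now - previous < MIN_INTERVAL_MS) {
        return false; // still inside the quiet period, drop this one
    }
    lastSent[url] = now;
    return true;
}

// In a function node the body could look like:
//   const lastSent = context.get("lastSent") || {};
//   const ok = shouldNotify(lastSent, msg.responseUrl, Date.now());
//   context.set("lastSent", lastSent);
//   return ok ? msg : null;
```

Unlike the per-topic delay described in the tip, this lets the first warning for each site through immediately and only suppresses the repeats.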

As for notifications, there are many ways. I chose two: Slack and email. For Slack notifications, we create an http request node that will send the payloads (that we generated in the previous step) to an 'incoming webhooks URL' such as https://hooks.slack.com/services/............. You can read about them here: https://api.slack.com/incoming-webhooks.
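The http request node for Slack can be configured entirely in the editor (method POST, URL set to your webhook). If you would rather set those from the message, a function node in front of it could look roughly like the sketch below. It assumes the http request node has its method set to "set by msg.method" and its URL field left blank; `slackRequest` is a made-up name, and the webhook URL is passed in rather than shown.

```javascript
// Wrap an already generated Slack payload (the JSON string from step 4)
// into a message the http request node can send.
// webhookUrl is your own incoming webhook URL.
function slackRequest(webhookUrl, msg) {
    return {
        method: "POST",
        url: webhookUrl,
        headers: { "content-type": "application/json" },
        payload: msg.payload,
        topic: msg.topic
    };
}

// In a function node the body could end with:
//   return slackRequest(env.get("SLACK_WEBHOOK_URL"), msg);
// where SLACK_WEBHOOK_URL is a hypothetical environment variable name.
```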

For email, we will use an email node from the social category. Gmail users can generate an 'App Password' here: https://support.google.com/accounts/answer/185833.

./wrap_up

While there are plenty of monitoring services that have free tiers, they usually can't match the flexibility of Node-RED when it comes to testing specific features. With Node-RED we can:

  • Set whatever frequency of checks we want
  • Do multiple actions on the websites or use different, non-HTTP protocols
  • Integrate into whatever notification system we have (send webhooks, Slack, Telegram, Twilio or even create a new ticket in our internal issue tracker)

Obviously, there are also downsides, such as:

  • Even though it's easy, you actually have to create these flows instead of just supplying a URL to that 3rd party service
  • Those services usually have multiple deployments of their applications around the world, so the datacentres where they host their apps can fail without ruining their business (if your RPI with Node-RED dies, you won't get warnings unless you monitor your RPI too, which is totally doable :) ).

I would suggest having a mix of public SaaS offerings (you can have a free tier on them) and your own custom monitoring applications that do better, deeper tests of your main services. You can also register a monitor in uptimerobot to test your Node-RED monitoring app. It is highly unlikely that your Node-RED instance, uptimerobot, and your SaaS application would fail at the same time without you getting notified :)

What's next

In the next post, I will demonstrate how to create a much more interesting, asynchronous flow that performs an end-to-end test of a SaaS application.

Top comments (1)

Cristian Alonso

Hi @krusenas ,

Congrats on the post!

Just a quick tip: the http request node will raise an unhandled error if it can't reach the server. To avoid that, it would be great if the http request is preceded by a ping so you don't run into that situation.

Cheers