Himanshu Gupta for Solace Developers

Originally published at solace.com

Configuring PubSub+ Event Broker for High Availability in AWS

When architecting a system, it’s important to make sure it’s redundant, resilient, and fault-tolerant so it can carry on if a node fails, or even if all the nodes in one datacenter fail.

PubSub+ Event Broker can be deployed in a high availability (HA) group, and configured for disaster recovery (DR). In this post, I’ll focus on deploying for HA in AWS. There is an AWS CloudFormation template available for you to easily spin up your HA group, but I’d like to show you how to do it manually so you understand how it works.

How does HA work with PubSub+ Event Broker?

A typical HA deployment of Solace’s PubSub+ Event Broker consists of three nodes:

  1. Primary
  2. Backup
  3. Monitor

The primary and backup brokers are set up in active-standby configuration, while the third acts as a monitoring node. The two brokers have their own storage, so they share nothing and are completely independent. The monitoring node is just responsible for maintaining quorum between the primary and backup brokers.

When a message is published to the primary broker, it is persisted locally and synchronously pushed to the backup broker. Once the backup broker receives the message, it acknowledges receipt to the primary broker, at which point the primary broker sends an acknowledgement back to the publisher.

If there is a subscriber interested in this message, it will be forwarded to that subscriber. Otherwise, the broker keeps the message and forwards it later. Once the subscriber receives the message, it will send a confirmation back to the primary broker. Upon receiving this confirmation, the primary and backup brokers will both delete their copies of the message.
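The flow described above can be sketched as a toy model. This is only an illustration of the ordering of steps (persist, replicate, acknowledge, delete); the `Broker` class and the function names are hypothetical and are not part of any Solace API.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Toy stand-in for a broker's local message store (hypothetical)."""
    name: str
    spool: dict = field(default_factory=dict)

    def persist(self, msg_id: int, payload: str) -> bool:
        # Store the message and acknowledge receipt.
        self.spool[msg_id] = payload
        return True


def publish(primary: Broker, backup: Broker, msg_id: int, payload: str) -> bool:
    """Persist on the primary, push synchronously to the backup,
    and only then acknowledge the publisher."""
    primary.persist(msg_id, payload)
    backup_acked = backup.persist(msg_id, payload)
    return backup_acked  # the publisher's acknowledgement


def subscriber_confirm(primary: Broker, backup: Broker, msg_id: int) -> None:
    """Once the subscriber confirms delivery, both brokers drop their copy."""
    primary.spool.pop(msg_id, None)
    backup.spool.pop(msg_id, None)
```

The point of the ordering is that the publisher is never acknowledged before both brokers hold a copy, so a failure of the primary right after the acknowledgement does not lose the message.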

Note that at any given time, only the active broker will accept connections. For example, let’s say your primary broker is your active broker. If and when the primary broker fails, the backup broker will become the active broker and start accepting connections. When the primary broker becomes available again, you can either configure your system to automatically make it the active broker again, or leave the backup broker active. It’s recommended that you run the primary and backup brokers on different servers within the same datacenter.
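As a rough sketch of the failover rule just described, the function below decides which broker should be active. It is a simplification under stated assumptions: the function name and the `auto_revert` flag are illustrative, and the real brokers also rely on the monitoring node for quorum, which is not modeled here.

```python
from typing import Optional


def next_active(current: Optional[str], primary_up: bool,
                backup_up: bool, auto_revert: bool) -> Optional[str]:
    """Return "primary", "backup", or None if neither broker is available."""
    # The backup stays active after a failover unless auto-revert is on
    # and the primary has come back.
    if current == "backup" and backup_up and not (auto_revert and primary_up):
        return "backup"
    if primary_up:
        return "primary"
    if backup_up:
        return "backup"
    return None
```

With `auto_revert=False`, a recovered primary simply waits as the standby until the backup fails or an operator switches activity back.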

This HA setup makes systems very resilient, because if the primary broker fails, the backup will quickly take over and minimize the impact to your system.

Now that we know how an HA group works, let’s spin up three instances of PubSub+ Event Broker on AWS and configure them to be part of an HA group.

Launching EC2 instances with PubSub+ Event Broker

Let’s see how we can configure three broker instances to be our primary broker, backup broker, and monitoring node. We will be running three instances of PubSub+ Event Broker on three different EC2 instances on AWS. See my blog post on how to launch an EC2 instance with PubSub+ Event Broker.

Once you have an EC2 instance up and running, reuse the security group that was created for this instance and attach it to the backup and monitoring EC2 instances as well. Do not create the 3 instances with 3 different security groups.

I also edited the security group that was created for the first EC2 instance to allow all incoming traffic from any instance attached to the same security group. To do this, open the security group, click Edit under the Inbound section, click Add Rule, select All Traffic, and enter the security group ID in the Source field.

This is just for demo purposes, to make sure all our instances can speak to each other.
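If you prefer to script the inbound rule instead of clicking through the console, the sketch below builds the request parameters for a self-referencing "all traffic" rule. The helper name and the security group ID are placeholders I made up; the dictionary shape follows what boto3's EC2 `authorize_security_group_ingress` call accepts, but treat it as a starting point rather than a tested deployment script.

```python
def self_referencing_ingress(group_id: str) -> dict:
    """Build parameters for an inbound rule that allows all traffic
    from any instance attached to the same security group."""
    return {
        "GroupId": group_id,
        "IpPermissions": [
            {
                "IpProtocol": "-1",  # -1 means all protocols (All Traffic)
                # The source is the security group itself.
                "UserIdGroupPairs": [{"GroupId": group_id}],
            }
        ],
    }


# "sg-0123456789abcdef0" is a placeholder ID, not one from this demo.
params = self_referencing_ingress("sg-0123456789abcdef0")
```

Assuming boto3 is installed and your AWS credentials are configured, you would then pass these parameters along with `boto3.client("ec2").authorize_security_group_ingress(**params)`.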

Configuring HA group

Now that we have all 3 EC2 instances with PubSub+ Event Broker up and running, we can begin to configure them as part of one HA group. The following steps have been thoroughly documented in the HA Group Configuration section of Solace PubSub+ Technical Documentation.

Before we begin, we need to update the router name on each instance. For an HA group to be configured, each broker's router name must match its node name. When we launched the EC2 instances from the Solace AMI, each was given a default router name (it’s usually based on the private IP address). We could use the given router names as our node names, but I would rather give our nodes more descriptive names such as ha-demo-primary, ha-demo-backup, and ha-demo-monitor.

Primary node

Log in to your primary node and change the router name to ha-demo-primary:

// Activate Solace CLI
[sysadmin@ip-172-31-26-146 ~]$ solacectl cli
Solace PubSub+ Standard Version 9.3.1.5
Operating Mode: Message Routing Node
ip-172-31-26-146> enable
ip-172-31-26-146# configure
// Change router name to primary
ip-172-31-26-146(configure)# router-name ha-demo-primary
This command causes a reload of the system.
Do you want to continue (y/n)? y

Note that a router name change requires a reboot of PubSub+ Event Broker so you will have to wait a minute or so before it comes back up. Run the following command once the broker has been rebooted to confirm the router name was changed:

ip-172-31-26-146> show router-name
Router Name:          ha-demo-primary
Mirroring Hostname:   No
Deferred Router Name: ha-demo-primary
Mirroring Hostname:   No

Once you have updated the router name, you will need to run some commands to:

  1. Create and designate different instances as primary, backup, and monitor
  2. Provide the authentication key

From the Solace docs: the pre-shared-key is 44 to 344 characters long (which translates into 32 to 256 bytes of binary data encoded in base 64). It provides authentication between nodes in an HA group and must be the same on each node. I used Base64 Encode to create my key. Once you have your key, run the following commands, replacing my key with the one you created.
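If you would rather generate the key locally than use an online encoder, here is a minimal sketch using only the Python standard library. The function name is my own; the length limits come from the Solace docs quoted above.

```python
import base64
import secrets


def make_pre_shared_key(num_bytes: int = 32) -> str:
    """Return random bytes encoded as base64, suitable as a pre-shared key.

    32 to 256 bytes of binary data encode to 44 to 344 base64 characters.
    """
    if not 32 <= num_bytes <= 256:
        raise ValueError("num_bytes must be between 32 and 256")
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


key = make_pre_shared_key()
print(key)  # a different 44-character string on every run
```

Generate the key once and paste the same value on all three nodes, since the key must match across the HA group.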

ip-172-31-26-146(configure)# hardware message-spool shutdown
All message spooling will be stopped.
Do you want to continue (y/n)? y
ip-172-31-26-146(configure)# redundancy
ip-172-31-26-146(configure/redundancy)# switchover-mechanism hostlist
ip-172-31-26-146(configure/redundancy)# exit
ip-172-31-26-146(configure)# redundancy
ip-172-31-26-146(configure/redundancy)# group
ip-172-31-26-146(configure/redundancy/group)# create node ha-demo-primary
ip-172-31-26-146(configure/redundancy/group/node)# connect-via 172.31.26.146
ip-172-31-26-146(configure/redundancy/group/node)# node-type message-routing-node
ip-172-31-26-146(configure/redundancy/group/node)# exit
ip-172-31-26-146(configure/redundancy/group)# create node ha-demo-backup
ip-172-31-26-146(configure/redundancy/group/node)# connect-via 172.31.26.208
ip-172-31-26-146(configure/redundancy/group/node)# node-type message-routing-node
ip-172-31-26-146(configure/redundancy/group/node)# exit
ip-172-31-26-146(configure/redundancy/group)# create node ha-demo-monitor
ip-172-31-26-146(configure/redundancy/group/node)# connect-via 172.31.22.77
ip-172-31-26-146(configure/redundancy/group/node)# node-type monitor-node
ip-172-31-26-146(configure/redundancy/group/node)# exit
ip-172-31-26-146(configure/redundancy/group)# exit
ip-172-31-26-146(configure/redundancy)# authentication
ip-172-31-26-146(configure/redundancy/authentication)# pre-shared-key key c29sYWNlaXNhZ3JlYXRldmVudHBsYXRmb3Jtd2hpY2h5b3VzaG91bGR1c2Vmb3JhbGx5b3VyZXZlbnRpbmduZWVkcw==
ip-172-31-26-146(configure/redundancy/authentication)# exit
ip-172-31-26-146(configure/redundancy)# active-standby-role primary
ip-172-31-26-146(configure/redundancy)# no shutdown

Backup node

You will mostly run the same commands on your backup node that you ran on your primary node. The only difference is that you will designate it as a backup node by running this command instead:

active-standby-role backup

Here are the commands you need to run on your backup node:

ip-172-31-26-208> enable
ip-172-31-26-208# configure
ip-172-31-26-208(configure)# router-name ha-demo-backup
This command causes a reload of the system.
Do you want to continue (y/n)? y
ip-172-31-26-208(configure)#

[sysadmin@ip-172-31-26-208 ~]$ solacectl cli
Solace PubSub+ Standard Version 9.3.1.5
Operating Mode: Message Routing Node
ip-172-31-26-208> show router-name
Router Name:          ha-demo-backup
Mirroring Hostname:   No
Deferred Router Name: ha-demo-backup
Mirroring Hostname:   No
ip-172-31-26-208> enable
ip-172-31-26-208# configure
ip-172-31-26-208(configure)# hardware message-spool shutdown
All message spooling will be stopped.
Do you want to continue (y/n)? y
ip-172-31-26-208(configure)# redundancy
ip-172-31-26-208(configure/redundancy)# switchover-mechanism hostlist
ip-172-31-26-208(configure/redundancy)# group
ip-172-31-26-208(configure/redundancy/group)# create node ha-demo-primary
ip-172-31-26-208(configure/redundancy/group/node)# connect-via 172.31.26.146
ip-172-31-26-208(configure/redundancy/group/node)# node-type message-routing-node
ip-172-31-26-208(configure/redundancy/group/node)# exit
ip-172-31-26-208(configure/redundancy/group)# create node ha-demo-backup
ip-172-31-26-208(configure/redundancy/group/node)# connect-via 172.31.26.208
ip-172-31-26-208(configure/redundancy/group/node)# node-type message-routing-node
ip-172-31-26-208(configure/redundancy/group/node)# exit
ip-172-31-26-208(configure/redundancy/group)# create node ha-demo-monitor
ip-172-31-26-208(configure/redundancy/group/node)# connect-via 172.31.22.77
ip-172-31-26-208(configure/redundancy/group/node)# node-type monitor-node
ip-172-31-26-208(configure/redundancy/group/node)# exit
ip-172-31-26-208(configure/redundancy/group)# exit
ip-172-31-26-208(configure/redundancy)# authentication
ip-172-31-26-208(configure/redundancy/authentication)# pre-shared-key key c29sYWNlaXNhZ3JlYXRldmVudHBsYXRmb3Jtd2hpY2h5b3VzaG91bGR1c2Vmb3JhbGx5b3VyZXZlbnRpbmduZWVkcw==
ip-172-31-26-208(configure/redundancy/authentication)# exit
ip-172-31-26-208(configure/redundancy)# active-standby-role backup
ip-172-31-26-208(configure/redundancy)# no shutdown
ip-172-31-26-208(configure/redundancy)#

Monitor node

The commands you need to run on your monitor node are slightly different, but for the most part you are doing the same sort of configuration that you did on the primary and backup nodes.

Before changing the router name on the monitor node, you need to run this command:

[sysadmin@ip-172-31-22-77 ~]$ solacectl cli
Solace PubSub+ Standard Version 9.3.1.5
Operating Mode: Message Routing Node
ip-172-31-22-77> enable
ip-172-31-22-77# reload default-config monitoring-node
This command causes a reload of the system which will discard all configuration
and messaging data stored on this system.
Do you want to continue (y/n)? y

This will restart your node with default configs and designate it as a monitoring node (instead of a message routing node).

Now, you can change the router name like we did with our primary and backup nodes.

ip-172-31-22-77(configure)# router-name ha-demo-monitor
This command causes a reload of the system.
Do you want to continue (y/n)? y

Confirm the name was changed by running this command:

ip-172-31-22-77# show router-name
Router Name:          ha-demo-monitor
Mirroring Hostname:   No
Deferred Router Name: ha-demo-monitor
Mirroring Hostname:   No

You can now go ahead and run the following commands (just like you did with primary and backup nodes).

ip-172-31-22-77(configure)# redundancy
ip-172-31-22-77(configure/redundancy)# switchover-mechanism hostlist
ip-172-31-22-77(configure/redundancy)# group
ip-172-31-22-77(configure/redundancy/group)# create node ha-demo-primary
ip-172-31-22-77(configure/redundancy/group/node)# connect-via 172.31.26.146
ip-172-31-22-77(configure/redundancy/group/node)# node-type message-routing-node
ip-172-31-22-77(configure/redundancy/group/node)# exit
ip-172-31-22-77(configure/redundancy/group)# create node ha-demo-backup
ip-172-31-22-77(configure/redundancy/group/node)# connect-via 172.31.26.208
ip-172-31-22-77(configure/redundancy/group/node)# node-type message-routing-node
ip-172-31-22-77(configure/redundancy/group/node)# exit
ip-172-31-22-77(configure/redundancy/group)# create node ha-demo-monitor
ip-172-31-22-77(configure/redundancy/group/node)# connect-via 172.31.22.77
ip-172-31-22-77(configure/redundancy/group/node)# node-type monitor-node
ip-172-31-22-77(configure/redundancy/group/node)# exit
ip-172-31-22-77(configure/redundancy/group)# exit
ip-172-31-22-77(configure/redundancy)# authentication
ip-172-31-22-77(configure/redundancy/authentication)# pre-shared-key key c29sYWNlaXNhZ3JlYXRldmVudHBsYXRmb3Jtd2hpY2h5b3VzaG91bGR1c2Vmb3JhbGx5b3VyZXZlbnRpbmduZWVkcw==
ip-172-31-22-77(configure/redundancy/authentication)# exit
ip-172-31-22-77(configure/redundancy)# no shutdown
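Since the three nodes get nearly identical configuration, it can help to see the differences side by side. The sketch below only generates the command text for each role so you can review or paste it; it does not connect to a broker, and it does not answer the y/n confirmation prompt that follows the message spool shutdown. The function name and the `NODES` structure are hypothetical, and the command strings are the ones entered in the sessions in this post.

```python
from typing import List, Optional

# Node name, private IP, and node type, as used in this demo.
NODES = [
    ("ha-demo-primary", "172.31.26.146", "message-routing-node"),
    ("ha-demo-backup", "172.31.26.208", "message-routing-node"),
    ("ha-demo-monitor", "172.31.22.77", "monitor-node"),
]


def redundancy_commands(role: Optional[str], pre_shared_key: str,
                        nodes=NODES) -> List[str]:
    """Return the CLI commands to enter from the (configure) prompt.

    role is "primary" or "backup" for a message routing node,
    or None for the monitor node.
    """
    cmds: List[str] = []
    if role is not None:
        # Only the message routing nodes shut down the message spool first.
        cmds.append("hardware message-spool shutdown")
    cmds += ["redundancy", "switchover-mechanism hostlist", "group"]
    for name, address, node_type in nodes:
        cmds += [f"create node {name}", f"connect-via {address}",
                 f"node-type {node_type}", "exit"]
    cmds += ["exit", "authentication",
             f"pre-shared-key key {pre_shared_key}", "exit"]
    if role is not None:
        cmds.append(f"active-standby-role {role}")
    cmds.append("no shutdown")
    return cmds


print("\n".join(redundancy_commands("backup", "<your-key>")))
```

Every node lists all three group members, including itself; only the spool shutdown and the `active-standby-role` line vary by role.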

Once you are done configuring, you can run the following command on any of the nodes to confirm that all three nodes are Online and part of your HA group:

ip-172-31-26-146> show redundancy group
Node Router-Name   Node Type       Address           Status
-----------------  --------------  ----------------  ---------
ha-demo-backup     Message-Router  172.31.26.208     Online
ha-demo-monitor    Monitor         172.31.22.77      Online
ha-demo-primary*   Message-Router  172.31.26.146     Online

* - indicates the current node
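If you capture that output in a script, a small parser can confirm that every node reports Online. This is a sketch that assumes the column layout shown in this post (four whitespace-separated fields per node row, under a dashed separator line); the function names are my own.

```python
SAMPLE = """\
Node Router-Name   Node Type       Address           Status
-----------------  --------------  ----------------  ---------
ha-demo-backup     Message-Router  172.31.26.208     Online
ha-demo-monitor    Monitor         172.31.22.77      Online
ha-demo-primary*   Message-Router  172.31.26.146     Online

* - indicates the current node
"""


def parse_redundancy_group(output: str) -> dict:
    """Map each node's router name to its status."""
    statuses = {}
    in_table = False
    for line in output.splitlines():
        if line.startswith("---"):
            in_table = True  # node rows start after the separator
            continue
        if not in_table or not line.strip():
            continue
        if line.lstrip().startswith("*"):
            continue  # legend line, not a node row
        fields = line.split()
        if len(fields) == 4:
            name, _node_type, _address, status = fields
            statuses[name.rstrip("*")] = status  # drop current-node marker
    return statuses


def all_online(statuses: dict, expected_nodes: int = 3) -> bool:
    return (len(statuses) == expected_nodes
            and all(s == "Online" for s in statuses.values()))


print(all_online(parse_redundancy_group(SAMPLE)))
```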

That’s it! You now have a functional HA group with your primary and backup brokers and monitoring node.

Note that by default, guaranteed messaging is disabled in an HA group and can only be enabled once you have an HA group set up. It is recommended that you enable guaranteed messaging, but for the purpose of this post, I will leave it disabled.

More info on your HA group

You can get more information about your redundancy settings by running this command:

ip-172-31-26-146> show redundancy
Configuration Status     : Enabled
Redundancy Status        : Up
Operating Mode           : Message Routing Node
Switchover Mechanism     : Hostlist
Auto Revert              : No
Redundancy Mode          : Active/Standby
Active-Standby Role      : Primary
Mate Router Name         : ha-demo-backup
ADB Link To Mate         : Up
ADB Hello To Mate        : Down

                               Primary Virtual Router  Backup Virtual Router
                               ----------------------  ----------------------
Activity Status                Local Active            Shutdown
Routing Interface              intf0:1                 intf0:1
Routing Interface Status       Up
VRRP Status                    Initialize
VRRP Priority                  250
Message Spool Status           AD-Disabled
Priority Reported By Mate      Standby

You can get even more information by running this command: show redundancy detail (I am not going to show the output here).

I hope you found this post useful. For more information, visit PubSub+ for Developers. If you have any questions, post them to the Solace Developer Community.

The post Configuring PubSub+ Event Broker for High Availability in AWS appeared first on Solace.
