Let’s improve Node.js application latency using different YugabyteDB distributed SQL database configurations.
Application developers rely on different database configurations (and sometimes different databases altogether) to improve latency. The Largest River, my first distributed web application, is no different.
From the start of this project, I set out to explore how different YugabyteDB configurations could be deployed to improve latency for users across the globe. Along the way, I deployed three databases with various configurations in YugabyteDB Managed.
This is how it went…
Initially, I deployed a single-region, multi-zone cluster in Google Cloud in the us-west2 cloud region.
Considering that databases have traditionally scaled vertically (adding more compute power to existing nodes) rather than horizontally (adding more instances), even this basic distributed SQL configuration had advantages over many standard offerings.
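Before digging into latency numbers, here's roughly what connecting to that cluster looks like from Node.js. This is a minimal sketch using the node-postgres (`pg`) driver; the host, credentials, and certificate path are placeholders, not values from this project.

```javascript
// Minimal connection sketch using the node-postgres ("pg") driver.
// Host, credentials, and certificate path are all placeholders.
const fs = require("fs");
const { Pool } = require("pg");

const pool = new Pool({
  host: "your-cluster.gcp.ybdb.io", // YugabyteDB Managed host (placeholder)
  port: 5433,                       // YugabyteDB's default YSQL port
  database: "yugabyte",
  user: "admin",
  password: process.env.DB_PASSWORD,
  max: 10,                          // cap concurrent connections per app instance
  ssl: {
    rejectUnauthorized: true,
    ca: fs.readFileSync("/path/to/root.crt").toString(),
  },
});

// Later, anywhere in the app:
// const { rows } = await pool.query("SELECT ...");
```

Because YugabyteDB speaks the PostgreSQL wire protocol, the standard `pg` driver works unchanged; only the host and port differ from a typical Postgres setup.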
While I'm optimizing for latency, it's worth noting that a multi-zone deployment also safeguards against outages in a particular data center. If one of our nodes failed, our YugabyteDB cluster would continue to operate by serving requests from the remaining nodes.
However, in terms of latency, this deployment does pose some issues. Users connecting from a nearby application instance will experience low latency (as little as 4ms), but those connecting from the other side of the globe, say in Australia, will suffer high latency (250ms or more in some cases).
This is often a reasonable tradeoff, but here we're building the next big global business (well, we aren't, but pretend we are!) and demand faster reads and writes.
Our friends on other continents are suffering. A multi-region, multi-zone configuration with read replicas to the rescue!
With this cluster deployment, we are putting our data closer to our end users, which drastically improves read latency. To illustrate, let's return to our users in Australia: their reads are now down to as little as 4ms, served by the nearby read replica node. This is a major performance win on reads, but how about writes?
Writes from Sydney, however, are still relatively slow in this deployment. While our application is able to connect and send writes to the nearest replica node, each write is forwarded to the primary cluster to be committed. After all, a replica node is just that: a replica of the primary database.
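For reads to actually be served by the nearby replica, the session has to opt in to follower reads, which are read-only and return slightly stale (bounded-staleness) data. Here's a sketch of what that might look like with the `pg` driver; the connection string and the `products` table are illustrative, not from the original application.

```javascript
// Sketch: opting a session into follower reads so SELECTs can be
// served by the nearest replica instead of the leader.
// The connection string and "products" table are illustrative.
const { Client } = require("pg");

async function readFromNearbyReplica(connectionString) {
  const client = new Client({ connectionString });
  await client.connect();

  // Follower reads only apply to read-only transactions, and trade
  // a bounded amount of staleness for lower latency.
  await client.query("SET SESSION CHARACTERISTICS AS TRANSACTION READ ONLY");
  await client.query("SET yb_read_from_followers = true");

  const { rows } = await client.query("SELECT id, name FROM products LIMIT 10");
  await client.end();
  return rows;
}
```

Writes, by contrast, cannot take this path: they always travel back to the primary cluster to be committed, which is exactly the bottleneck described above.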
The DocDB replication layer has numerous options for replicating your data depending on your needs, so I suggest you explore further if going global is in your sights!
Geo-partitioned configurations can be used both to improve latency and to uphold compliance regulations. For instance, data-residency rules in jurisdictions such as the European Union and India can require that user data collected in those territories also be stored there.
This configuration might also make sense for a global e-commerce application, as product catalogs might be different across geographies. Serving reads from these nodes is extremely fast, much like our previous multi-region deployment. In addition, writes are fast because we are able to commit writes to the geo-partitioned node in our user's local geography.
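At the SQL level, row-level geo-partitioning pins partitions of a table to tablespaces placed in specific regions. The sketch below shows the general shape; the tablespace name, region, and `users` table are illustrative, not from the original article.

```sql
-- Sketch: pinning one partition of a table to EU-based nodes.
-- Tablespace name, region/zone, and table are illustrative.
CREATE TABLESPACE eu_tablespace WITH (
  replica_placement = '{"num_replicas": 1, "placement_blocks":
    [{"cloud": "gcp", "region": "europe-west3",
      "zone": "europe-west3-a", "min_num_replicas": 1}]}'
);

CREATE TABLE users (
  id    UUID NOT NULL,
  geo   TEXT NOT NULL,
  email TEXT NOT NULL,
  PRIMARY KEY (id, geo)
) PARTITION BY LIST (geo);

-- Rows tagged 'eu' live (and commit) on nodes pinned to the EU tablespace.
CREATE TABLE users_eu PARTITION OF users
  FOR VALUES IN ('eu') TABLESPACE eu_tablespace;
```

Because each partition's data and its Raft leaders live in the user's own geography, both reads and writes for that partition stay local.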
So, these are just a few of the many distributed database configurations at your disposal. Which one you should choose depends entirely on your application's needs. More often than not, though, distributing your data layer will improve latency (as well as resiliency, data compliance, and more!).
Look out for my next article on managing your database connections in Node.js!