But let's take a moment to consider Wikipedia's definition.
A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another from any system. The components interact with one another in order to achieve a common goal.
Three significant characteristics of distributed systems are:
1. concurrency of components
2. lack of a global clock
3. independent failure of components
If we stretch this definition a bit, we'll discover that Ruby on Rails supports creating distributed systems. In fact, most full-stack web frameworks do.
Consider that you're already running multiple services, often on separate hardware: things like databases and caches. Now consider purpose-driven deployments, such as one server dedicated to standard web requests and another to API requests. Or think of the third-party services that web applications rely on for logging, email, search, analytics, image processing, and so on. The list of such services is long; just look at what's offered in Heroku's marketplace.
While it may not be self-evident, thinking of Rails as a distributed system platform will help frame the mental model for the ensuing discussion.
I want to focus on a particular Rails feature that exhibits some of these distributed system characteristics: ActiveJob, which provides background job processing for Rails.
The principal component in the Rails toolbelt for managing background jobs is the backplane or data-bus which allows us to store metadata about expensive operations that should move off the critical path and be processed in the background. These days, Redis is generally used as this backplane for serious Rails applications. It's a multipurpose in-memory data store that serves several functions: a cache, message broker, and database.
ActiveJob itself is a common interface over queueing backends. With a Redis-backed backend, Redis acts as the message broker: metadata about operations to run in the background is saved (queued) there, then dequeued by workers that run those operations. Sidekiq is a popular Redis-backed library for this purpose. It can serve as an ActiveJob backend and includes several advanced features not available through ActiveJob alone.
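To make the enqueue and dequeue cycle concrete, here is a minimal sketch in plain Ruby. It is not how ActiveJob or Sidekiq are implemented. `InMemoryBroker` is a hypothetical stand-in for Redis, and `WelcomeEmailJob`, `enqueue`, and `work_one` are made-up names for illustration. The point is only that the queue holds metadata (a class name and arguments) rather than the work itself.

```ruby
require "json"

# Hypothetical stand-in for Redis: queue name => list of JSON strings.
class InMemoryBroker
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  # Append a serialized payload to the tail of the named queue.
  def push(queue, payload)
    @lists[queue] << payload
  end

  # Remove and return the payload at the head of the queue (nil when empty).
  def pop(queue)
    @lists[queue].shift
  end
end

# Hypothetical job; a real ActiveJob class would inherit from ActiveJob::Base.
class WelcomeEmailJob
  def perform(user_id)
    "sent welcome email to user #{user_id}"
  end
end

# Enqueue: store only metadata about the operation, not the operation itself.
def enqueue(broker, queue, job_class, *args)
  payload = JSON.generate("job_class" => job_class.name, "arguments" => args)
  broker.push(queue, payload)
end

# Dequeue: read the metadata, rebuild the job, and run it.
# Returns nil when the queue is empty.
def work_one(broker, queue)
  payload = broker.pop(queue)
  return nil if payload.nil?

  data = JSON.parse(payload)
  Object.const_get(data["job_class"]).new.perform(*data["arguments"])
end

broker = InMemoryBroker.new
enqueue(broker, "default", WelcomeEmailJob, 42) # e.g. called from a web request
puts work_one(broker, "default")                # e.g. called from a worker process
```

In a real deployment the web process and the worker process would be separate programs, possibly on separate machines, sharing nothing but the broker.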
It's common for Rails applications to run ActiveJob background workers on separate hardware, but sophisticated deployments might even run isolated worker pools on different machines for different types of background work. It's also viable to run multiple instances of Redis to further separate different types of background work.
The possibilities are endless, and Rails supports all of them. I hope it's becoming clear that purpose-driven deployments can comprise a distributed system even when all the code is contained inside a monolith. For our purposes, it might help to think of distributed systems in terms of units of deployment instead of isolated units of execution.
Now, let's consider some scenarios where segregating different types of background work makes sense.
- Limiting resource contention on a centralized database
- Constraining requests to third party API endpoints
- Partitioning customers to ensure a single user can't exhaust system resources and cause problems for everyone
- Isolating ETL processes to prevent disrupting production
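What these scenarios share is that each type of work gets its own queue and its own bounded pool of workers, so a flood of one kind of job cannot starve the others. Here is a rough sketch of that idea in plain Ruby, with threads standing in for what would be separate worker processes or machines in production. The queue names, pool sizes, and the `run_segregated` helper are all assumptions made for illustration. In a Rails app, a job class would typically declare its queue with `queue_as`.

```ruby
# Hypothetical pool sizes per queue; names and numbers are illustrative only.
POOL_SIZES = { "default" => 3, "third_party_api" => 1, "etl" => 1 }.freeze

# Runs every job in jobs_by_queue (queue name => array of callables) with a
# dedicated pool of threads per queue. Returns queue name => sorted results.
def run_segregated(jobs_by_queue, pool_sizes = POOL_SIZES)
  results = Hash.new { |hash, key| hash[key] = [] }
  lock = Mutex.new

  threads = jobs_by_queue.flat_map do |name, jobs|
    queue = Queue.new
    jobs.each { |job| queue << job }
    queue.close # pop returns nil once the closed queue is drained

    # Each queue gets its own workers; they never touch another queue.
    Array.new(pool_sizes.fetch(name)) do
      Thread.new do
        while (job = queue.pop)
          value = job.call
          lock.synchronize { results[name] << value }
        end
      end
    end
  end

  threads.each(&:join)
  results.transform_values(&:sort)
end

out = run_segregated({
  "default" => [-> { "email 1" }, -> { "email 2" }],
  "etl" => [-> { "nightly import" }]
})
puts out.inspect
```

Capping `third_party_api` and `etl` at one worker each is the sketch's version of constraining API requests and keeping ETL from disrupting production: however many of those jobs pile up, they only ever occupy their own pool.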
All of these scenarios can be managed by a single monolithic Rails codebase. Why might this be a good idea? Team productivity is exceptionally high in a well-designed Rails monolith. Even if the code is more tightly coupled, it's a tradeoff worth considering, as you'll get several benefits of "distributed systems" without all the downsides.
Stay tuned for the next post covering how to structure a queueing system to manage all that background work.