For years, developers have struggled to find a way to build highly scalable projects. Complex software poses plenty of challenges, and if they go unsolved, the app is likely to fail and bring in no income.
Uber-like applications can handle thousands of requests per second and can be scaled easily when necessary. What’s more, users can still access the main functionality when some parts of the system are down. So how do you develop such high-availability solutions, where critical functions keep working even if something fails?
The answer lies in distributed architecture. Centralized systems are limited in availability and scalability, whereas distributed software systems can reach high levels of both while keeping data consistent across nodes. Certainly, developing distributed systems is more complicated, but the result is worth it.
A classic example of a distributed system is the client-server architecture. It forms the basis for multi-tier architectures, in which functions such as presentation, application processing, and data management are separated from each other. Alternatives include the broker architecture and Service-Oriented Architecture (SOA).
Distributed architecture is built on a set of distributed system concepts: availability, consistency, idempotency, durability, and persistence.
When an application follows these concepts, it can withstand high loads, process thousands of requests per second, execute every operation correctly, and deliver every message successfully.
Distributed system concepts
1. High availability
Availability is the percentage of time a service is operational, and high availability is one of the most important characteristics of successful software.
Though developers dream of achieving 100% availability, doing so is very challenging and expensive. Even systems as large and complex as Gmail and the VISA card network don’t provide 100% availability.
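Distributed software systems are often built on top of machines with a lower level of availability than the system as a whole. For example, an application can reach 99.99% (four nines) availability by running on several redundant machines/nodes that each offer lower availability, as long as their failures are independent of each other.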
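To make the arithmetic concrete, here is a minimal sketch of how redundancy raises availability. It rests on two simplifying assumptions that are not from the article: node failures are independent, and the service counts as "up" whenever at least one replica is up. The function names are hypothetical.

```python
def combined_availability(node_availability: float, replicas: int) -> float:
    """Availability of a service that is up while at least one of
    `replicas` independent nodes is up (a simplifying assumption)."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The service is down only when every replica is down at the same time.
    return 1.0 - (1.0 - node_availability) ** replicas


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per (non-leap) year for a given availability."""
    return (1.0 - availability) * 365 * 24 * 60


if __name__ == "__main__":
    for replicas in (1, 2, 3):
        a = combined_availability(0.99, replicas)
        print(replicas, f"{a:.6f}", f"{downtime_minutes_per_year(a):.1f} min/year")
```

Under these assumptions, two nodes with 99% availability each already give roughly four nines. Real deployments rarely have fully independent failures (shared networks, shared power, shared bugs), so treat the result as an upper bound rather than a promise.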
2. Consistency
In a consistent system, all nodes see and return the same data at the same time. To ensure that all nodes hold the same data, they need to exchange messages and stay synchronized.
However, communication between nodes can run into difficulties: message delivery may fail, messages may get lost, or some nodes may be unavailable at some point.
Generally, the weaker the required level of consistency, the faster the system can work, but the higher the chance that it won’t return the latest data.
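One common way to tune this trade-off is quorum replication, which the article does not name explicitly, so take it as an illustrative technique rather than the only option. The toy model below keeps N replicas in memory; a write must reach W of them and a read must consult R of them. When R + W > N, every read set overlaps every write set, so the read sees the latest version. The class names and the `reachable` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Replica:
    version: int = 0
    value: Optional[str] = None


class QuorumStore:
    """Toy in-memory model of quorum reads and writes (not a real database)."""

    def __init__(self, n: int, w: int, r: int) -> None:
        self.n, self.w, self.r = n, w, r
        self.replicas = [Replica() for _ in range(n)]
        self._next_version = 0

    @property
    def is_strict(self) -> bool:
        # Read and write sets always overlap when R + W > N.
        return self.r + self.w > self.n

    def write(self, value: str, reachable: Sequence[int]) -> None:
        """Apply the write to the reachable replicas; needs at least W of them."""
        if len(reachable) < self.w:
            raise RuntimeError("not enough replicas to satisfy write quorum")
        self._next_version += 1
        for i in reachable:
            self.replicas[i] = Replica(self._next_version, value)

    def read(self, reachable: Sequence[int]) -> Optional[str]:
        """Return the newest value among the reachable replicas; needs at least R."""
        if len(reachable) < self.r:
            raise RuntimeError("not enough replicas to satisfy read quorum")
        newest = max((self.replicas[i] for i in reachable), key=lambda rep: rep.version)
        return newest.value
```

With `QuorumStore(n=3, w=2, r=2)`, a write that lands on replicas 0 and 1 is visible to a read from replicas 1 and 2. With `w=1, r=1`, operations touch fewer nodes and finish sooner, but a read from replica 2 can miss a write that only reached replica 0.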
3. Idempotency
Idempotency means that an operation takes effect only once, no matter how many times the same request is sent. By making operations idempotent, developers avoid the bad consequences of dropped connections, request errors, and similar failures.
For example, if a customer tries to make a payment but nothing seems to happen, they may try again. In an idempotent system, the payment is charged only once, whereas a non-idempotent system can’t rule out double charges and users having to ask for their money back.
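The payment scenario can be sketched with an idempotency key: the client generates one key per logical payment and sends the same key on every retry. This is a minimal in-memory illustration, assuming a hypothetical `PaymentService`; a production system would keep the keys in durable storage shared by all servers.

```python
import threading
from typing import Dict, List, Tuple


class PaymentService:
    """Hypothetical payment handler that deduplicates retries by idempotency key."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._receipts: Dict[str, dict] = {}       # idempotency key -> stored receipt
        self.charges: List[Tuple[str, int]] = []   # every charge actually made

    def charge(self, idempotency_key: str, customer_id: str, amount_cents: int) -> dict:
        with self._lock:
            # A retry with a known key returns the stored receipt
            # instead of charging the customer again.
            if idempotency_key in self._receipts:
                return self._receipts[idempotency_key]
            self.charges.append((customer_id, amount_cents))
            receipt = {
                "charge_id": len(self.charges),
                "customer_id": customer_id,
                "amount_cents": amount_cents,
            }
            self._receipts[idempotency_key] = receipt
            return receipt
```

Calling `charge("order-42", "alice", 1500)` twice records a single charge and returns the same receipt both times, which is exactly the behavior the customer retrying a payment needs.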
4. Data durability
Durability is one of the key concerns of distributed systems. It means that once data has been written to storage, it remains available in the future, even if some of the system’s nodes go offline or have their data corrupted.
Different distributed databases have different levels of data durability. Some databases support data durability at the machine/node level, some of them maintain it at the cluster level, and some don’t offer this functionality out of the box.
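As an illustration of durability at the machine/node level, the sketch below appends each record to a log file and forces it to disk before reporting success, so the record can be replayed after a process restart. The `DurableLog` class and its file format are assumptions made for this example. Cluster-level durability would additionally copy each record to other nodes, which this sketch does not attempt.

```python
import json
import os
from typing import List


class DurableLog:
    """Hypothetical append-only log that syncs every record to disk."""

    def __init__(self, path: str) -> None:
        self.path = path

    def append(self, record: dict) -> None:
        line = json.dumps(record) + "\n"
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(line)
            f.flush()              # push Python's buffer to the operating system
            os.fsync(f.fileno())   # ask the OS to write the data to the disk

    def replay(self) -> List[dict]:
        """Read back every record, e.g. after a restart."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
```

Syncing on every write is slow, which is one reason databases expose durability as a configurable level instead of a fixed guarantee.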
Data durability plays an important role in developing highly scalable applications that process millions of events per day.
In many cases, product owners/companies can’t afford data loss, especially when dealing with transactions and other critical operations. That’s why developers need to focus strongly on providing a high level of data durability.
Nowadays, most distributed data storage services, e.g. Cassandra, MongoDB, and DynamoDB, offer durability support at different levels and can all be configured to ensure data durability at the cluster level.
5. Message persistence
When the node that is processing a message goes offline, or some other failure happens, there is a risk that the message will be lost. Message persistence means that the message is saved and will be processed once the issue is resolved.
Message persistence is one of the most important characteristics of a quality application.
However, implementing a system that is protected from message loss, such as a messaging app with billions of users or an Uber-like app with millions of payments per day, is quite difficult and requires proven technologies and developer expertise.
One solution to this challenge is a messaging system that delivers each message at least once, running on a lossless cluster.
In distributed systems, messaging is generally handled by a distributed messaging service such as RabbitMQ or Kafka. These services support various levels of delivery reliability and make it possible to build robust app architectures.
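The at-least-once idea can be sketched without any broker: a message stays tracked until the consumer acknowledges it, and anything left unacknowledged is put back in the queue. The `AtLeastOnceQueue` class below is a hypothetical in-memory model, not the RabbitMQ or Kafka API, and it leaves out the disk persistence that real brokers add on top.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class AtLeastOnceQueue:
    """Toy queue that redelivers messages the consumer never acknowledged."""

    def __init__(self) -> None:
        self._pending: Deque[Tuple[int, str]] = deque()
        self._in_flight: Dict[int, str] = {}   # delivered but not yet acknowledged
        self._next_id = 0

    def publish(self, body: str) -> int:
        self._next_id += 1
        self._pending.append((self._next_id, body))
        return self._next_id

    def receive(self) -> Optional[Tuple[int, str]]:
        """Hand out the next message and remember it until it is acknowledged."""
        if not self._pending:
            return None
        msg_id, body = self._pending.popleft()
        self._in_flight[msg_id] = body
        return msg_id, body

    def ack(self, msg_id: int) -> None:
        """The consumer confirms it finished processing the message."""
        self._in_flight.pop(msg_id, None)

    def requeue_unacked(self) -> int:
        """Call when a consumer is presumed dead: redeliver its messages."""
        count = len(self._in_flight)
        for msg_id, body in sorted(self._in_flight.items()):
            self._pending.append((msg_id, body))
        self._in_flight.clear()
        return count
```

Because a message can be delivered more than once (for example, when the consumer finished the work but crashed before acknowledging), at-least-once delivery is usually paired with idempotent processing on the consumer side.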