DEV Community

Sergey V.

Top 5 principles of software distributed systems that you need to know

High scalability is crucial when it comes to adding new features to your product and handling thousands or even millions of requests per second. Software engineers and architects are always looking for the best way to build applications that scale elastically with customer needs. It’s also important to keep the product accessible when something goes wrong, for instance when the internet connection is lost.

Distributed architecture was invented to help companies withstand high loads, extend their projects, and keep the service available even when some components fail. Unlike centralized software systems, solutions built on a distributed architecture can provide a high level of scalability, availability, and consistency.

Although such systems are harder to build and may take more time, the end result is worth it, especially if you’re designing a product such as an Uber-like app, an online booking platform, an e-commerce store, or an international e-learning platform. Related architectural styles include broker and SOA (service-oriented architecture).

Distributed software applications are exemplified by the client-server architecture, which laid the foundation for so-called multi-tier architectures. The latter are characterized by the separation of service functions, for example presentation, application processing, and data management.

Middleware is an essential element of any distributed architecture: it is the infrastructure that supports the development and execution of software solutions. Acting as a buffer between the network and the application, it sits in the middle of the system and supports and manages its components.

Engineers can employ a wide range of tools and frameworks for developing distributed software systems: .NET, .NET Web services, AXIS Java Web services, J2EE, CORBA, and others.

Because functions in a distributed architecture are separated from each other, the system is much easier to maintain and extend when needed. As a consequence, it can handle high loads and keep the product available even if some part of it fails.

The main principles of distributed systems

1. Availability

Service availability is the share of time your product is accessible for use, despite high loads, connectivity loss, system failures, data synchronization, and other factors. Guaranteeing constant 100% availability is very hard: even world-famous services like Gmail, Visa, and Mastercard can’t offer it.

If your software system is built on a legacy technology stack, consider migrating to newer technologies and a scalable architecture. This will help you increase the availability and flexibility of your application.

Another thing you can do is add more nodes (machines) to the cluster. Thanks to this redundancy, the application keeps working even if some part of the system underperforms or fails. So, when creating a software product, plan for a high level of availability from the start.
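
To get a feel for why extra nodes help, here is a rough back-of-the-envelope sketch. It assumes that replicas fail independently and that a single healthy replica is enough to serve traffic; real clusters rarely meet both assumptions exactly, and the function names and the 99% figure are purely illustrative.

```python
def cluster_availability(node_availability: float, nodes: int) -> float:
    """Probability that at least one of `nodes` independent replicas is up."""
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The cluster is down only when every replica is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


def yearly_downtime_hours(availability: float) -> float:
    """Expected downtime per non-leap year for a given availability."""
    return (1.0 - availability) * 365 * 24


if __name__ == "__main__":
    for n in (1, 2, 3):
        a = cluster_availability(0.99, n)
        print(f"{n} node(s): availability={a:.6f}, "
              f"downtime={yearly_downtime_hours(a):.2f} h/year")
```

Under these assumptions each extra replica multiplies the remaining downtime by the failure probability of a single node, which is why redundancy pays off quickly.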

2. Consistency

Consistency implies that all the system nodes see, hold, and return the same data at the same time. To achieve this, the nodes have to stay synchronized, which depends on reliable message delivery and request processing. If the system is implemented properly, you will avoid message losses and reduce delivery delays, thus improving the quality of customer service.

As a rule, the weaker the consistency level, the faster the product responds. Hence, during app development, you need to decide which level you need. If you’re building a project that must store data consistently (payment initiation, financial transactions, or similar information), make that part of the functionality strongly consistent.
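
One common way to reason about this trade-off in replicated data stores is through quorum sizes. The sketch below is an illustration rather than part of any specific product: it assumes `replicas` copies of the data, a write that waits for `write_acks` acknowledgements, and a read that consults `read_acks` copies. All three parameter names are hypothetical.

```python
def read_sees_latest_write(replicas: int, write_acks: int, read_acks: int) -> bool:
    """True when every read quorum overlaps every write quorum (R + W > N)."""
    if not (1 <= write_acks <= replicas and 1 <= read_acks <= replicas):
        raise ValueError("quorum sizes must be between 1 and the replica count")
    # If the two quorums must share at least one replica, a read always
    # touches a copy that has the most recent acknowledged write.
    return read_acks + write_acks > replicas


if __name__ == "__main__":
    print(read_sees_latest_write(replicas=3, write_acks=2, read_acks=2))
    print(read_sees_latest_write(replicas=3, write_acks=1, read_acks=1))
```

Waiting for fewer acknowledgements makes each request faster but gives up the overlap guarantee, which is the "weaker is quicker" rule in miniature.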

3. Idempotency

Idempotency is another important characteristic of a distributed system: it lets the system recover safely from dropped connections and other errors. In an idempotent system, an operation takes effect only once, no matter how many times the same request is sent.

If a user attempts to make a payment and it doesn’t seem to go through, they are likely to try again. When the application is idempotent, the payment is charged only once, so customers can be sure they won’t lose their money.

Non-idempotent systems, by contrast, can’t rule out double charges, which may lead to lost money and the unpleasant process of getting a refund. Therefore, it’s very important to make message, payment, and data delivery safe to retry.
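
The retry scenario above can be sketched with an idempotency key, a widely used technique in which the client sends the same unique key with every retry of one logical payment. This is a minimal in-memory illustration; `PaymentService`, `charge`, and the key format are hypothetical names, and a production system would need to store keys durably and make the check-and-charge step atomic.

```python
import uuid


class PaymentService:
    """Toy payment processor that deduplicates requests by idempotency key."""

    def __init__(self):
        self.charged = {}    # account -> total amount charged, in cents
        self._receipts = {}  # idempotency key -> receipt of the first attempt

    def charge(self, idempotency_key: str, account: str, amount_cents: int) -> str:
        # A retry with a known key returns the stored receipt and charges nothing.
        if idempotency_key in self._receipts:
            return self._receipts[idempotency_key]
        self.charged[account] = self.charged.get(account, 0) + amount_cents
        receipt = f"receipt-{uuid.uuid4().hex[:8]}"
        self._receipts[idempotency_key] = receipt
        return receipt


if __name__ == "__main__":
    service = PaymentService()
    first = service.charge("order-42", "alice", 1500)
    retry = service.charge("order-42", "alice", 1500)  # e.g. after a timeout
    print(first == retry, service.charged["alice"])
```

The retry returns the original receipt, so the customer is charged once even though the request arrived twice.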

Here at Smartym Pro, we recommend adding versioning and optimistic locking and integrating a consistent data store to achieve a high level of idempotency.
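
As a rough illustration of versioning with optimistic locking, the sketch below keeps a version number next to each value and rejects a write if the record changed since the caller read it. `VersionedStore` and `VersionConflict` are hypothetical names, and the in-memory dictionary stands in for a consistent data store.

```python
class VersionConflict(Exception):
    """Raised when a record changed since the caller last read it."""


class VersionedStore:
    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); a missing key reads as version 0, value None."""
        return self._rows.get(key, (0, None))

    def write(self, key, expected_version, value):
        """Store the value only if the row is still at expected_version."""
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise VersionConflict(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._rows[key] = (current_version + 1, value)
        return current_version + 1


if __name__ == "__main__":
    store = VersionedStore()
    version, _ = store.read("order-1")
    store.write("order-1", version, "PAID")
    try:
        store.write("order-1", version, "PAID")  # stale retry of the same update
    except VersionConflict as exc:
        print("rejected:", exc)
```

A duplicate or stale update fails loudly instead of being applied twice, and the caller can re-read the record to decide whether any work is still needed.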

4. Data durability

Once some information is written to the data storage, you should always be able to access it, even if some component fails or some nodes go offline. Sounds great, doesn’t it? Data durability is about turning this into reality and building truly reliable software systems that keep a lot of sensitive data.

Banks, insurance companies, healthcare providers, logistics companies, government entities, and other organizations that deal with personal and business data simply can’t afford to lose it. This is the key reason why engineers have to ensure the highest level of data durability.

Fortunately, most distributed data stores today, including MongoDB, DynamoDB, and Cassandra, support durability at different levels (node level, cluster level, etc.). What’s more, you can configure them to make the end system durable at the cluster level.

5. Message persistence

Having provided web application development services for many years, we believe that message persistence is one of the most crucial elements of any quality software product. When the internet connection is lost and data can’t be synchronized, a sent message can be lost. With message persistence, it is saved and delivered as soon as the connection is restored.

However, providing this feature is quite challenging if you need to build (or already have and want to improve) an app with millions or even billions of users that has to process thousands of requests per second. It requires proven technologies and sound engineering practices.

When building the project, integrate a lossless cluster and use a reliable messaging service such as RabbitMQ or Kafka. In our work, we generally employ RabbitMQ and can say that it’s a great tool to enable message persistence.
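
The save-now, deliver-later idea can be sketched without a broker. The example below is not the RabbitMQ or Kafka API; it is a local stand-in that uses Python’s built-in `sqlite3` module, and `Outbox`, `enqueue`, and `flush` are hypothetical names. It assumes the sender signals a lost connection by raising `ConnectionError`.

```python
import sqlite3


class Outbox:
    """Keeps outgoing messages in SQLite until a send attempt succeeds."""

    def __init__(self, path: str = "outbox.db"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self._db.commit()

    def enqueue(self, body: str) -> None:
        # Commit before returning so the message survives a crash or restart.
        self._db.execute("INSERT INTO outbox (body) VALUES (?)", (body,))
        self._db.commit()

    def flush(self, send) -> int:
        """Deliver pending messages in order; stop at the first failure."""
        delivered = 0
        rows = self._db.execute("SELECT id, body FROM outbox ORDER BY id").fetchall()
        for msg_id, body in rows:
            try:
                send(body)
            except ConnectionError:
                break  # still offline: keep the remaining messages for later
            # Delete only after a successful send (at-least-once delivery).
            self._db.execute("DELETE FROM outbox WHERE id = ?", (msg_id,))
            self._db.commit()
            delivered += 1
        return delivered

    def pending(self) -> int:
        return self._db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
```

Because a message is deleted only after `send` returns, a crash between sending and deleting can cause a duplicate delivery, so the receiving side should be idempotent. On the broker side, RabbitMQ’s durable queues combined with persistent messages serve a similar purpose.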

Architecture scaling issues

When building or improving your software system, you should prepare for situations like an influx of users, an unexpected load increase, or the need to add new agents to the system.

Otherwise, the product won’t be able to cope with high loads and will fail, leading to poor customer experience and client losses. Therefore, you should take care of high scalability and easy maintenance before the actual development.

There are two principal ways to scale the system that you should consider: vertical and horizontal. Vertical scaling means purchasing a more powerful machine (more cores, more processing power, more memory, etc.), while horizontal scaling means adding new machines to the cluster to increase the overall capacity. Since horizontal scaling is generally the more cost-effective option, it’s more popular than vertical.

Sharding

Distributed applications are intended to store large amounts of data, which is difficult to do on a single node. This is where sharding comes in: it is one of the most popular and efficient ways to spread data sets across a number of nodes.

Simply put, a shard is a horizontal partition of the data in a database, and some sort of hash of a key is typically used to assign each record to a partition. Each shard is held on a separate database server instance and acts as the single source for its subset of the data.

Thanks to this method, the load is spread across multiple nodes, so each node handles less of it, which can dramatically improve the overall product performance.
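
A minimal sketch of hash-based shard assignment might look like the following. The function name `shard_for`, the key format, and the choice of SHA-256 with a modulo are illustrative assumptions; real databases use their own partitioning schemes.

```python
import hashlib
from collections import Counter


def shard_for(key: str, shard_count: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # hashlib gives the same result in every process, unlike the built-in
    # hash(), which is randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


if __name__ == "__main__":
    # Count how many of 10,000 sample keys land on each of 4 shards.
    counts = Counter(shard_for(f"user-{i}", 4) for i in range(10_000))
    print(sorted(counts.items()))
```

One design caveat: with plain modulo hashing, changing the number of shards moves most keys to a different shard, which is why techniques such as consistent hashing are often used when the cluster is expected to grow.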

Closing thoughts

To sum up, developing scalable applications is crucial for project success, and a good way to achieve this is to adopt a distributed architecture. With high scalability, ease of maintenance, and great availability, you can reduce technical failures, ensure quality customer service, and boost sales. If you’re interested in system performance, we also recommend reading the article about building a high-performance architecture.
