Dr. Gernot Starke

Posted on Jul 27, 2021

Quality-Driven Software Architecture

#architecture #codequality #method

Getting your product quality right

Quality is the raison d'être for software architects:
Our systems should be reliable, performant, scalable and user-friendly. Systems should be built and maintained cost-effectively and remain future-proof. Every IT professional knows that this combination of characteristics means hard work.
This article shows how you can methodically construct quality.

Tea with sugar?

How do you drink your morning tea: with sugar or without? With a drop of milk and some honey? Do you prefer it with lemon? Black or green, hot or with ice? Or do you prefer coffee after all? As you can see, the requirements for breakfast drinks, just like those for software, depend very much on the person and the context. If we want to create a high-quality morning beverage (or a system in general), we definitely need to determine the specific (quality) requirements of the relevant stakeholders beforehand. Overly general requirements are only of limited help.

Nowadays, customers and users expect software to meet a large number of (quality) requirements, of which "correct function" is only one among many. In classical software development, however, functionality is often treated as the center of attention: The design and implementation activities are focussed on getting the functionality right - with potentially fatal consequences. Let us illustrate this with a small example.

Functionality is (usually) not enough

Since the triumph of digital cameras and smartphones, practically all of us have felt the need to somehow organize the many individual images on the computer.
Let's use this as an example of a requirements description for a software system: manage digital photos.
Imagine you are the software architect who is to develop such a system together with a small development team.
The customer tells you the following story:

"We want to organize photos in folders and tag them with keywords.
Later, we want to be able to search and display images according to various search criteria (such as: date, keyword, etc.). In our private collection, we can get by with a few thousand photos."

As an architect, you might reformulate these coarse statements into some features
(in real development, somebody like your product owner should handle that...):

add images
add image to folder
delete images
display images
search images for location, date, keywords, folder
change metadata (like location, date, keyword)
add folder
rename folder
remove folder

(yes, I know that it would be nice to have additional features like albums, but let us keep things simple for now)

Fig. 1: Sample photo with some attributes

Now you need to derive a solution approach from these (admittedly very rough) requirements.
A model helps us to clarify things. We can derive it from an example (see Fig. 1).
In our case, each photo has only a handful of attributes (date, location, file name, folder, etc.).
We map the keywords as a list so that each photo can have several.
Again, I'm well aware that this approach is overly simplistic - but give me a minute.

Fig. 2: A model for a "solution approach"

In our example, the domain model seems almost trivial (see Figure 2).
Only three small entities are sufficient to represent the desired functionality from a purely business perspective. Location, date and keywords are stored via key-value pairs in the Metadata class. It could hardly be simpler.
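As a rough illustration, the three entities could be sketched in Python like this. The class and field names (`Photo`, `Folder`, `Metadata`, `entries`, `keywords`, `search`) and the sample values are assumptions made for this sketch; the figure itself only names the concepts.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    """Key-value pairs such as location and date, plus a list of keywords."""
    entries: dict[str, str] = field(default_factory=dict)
    keywords: list[str] = field(default_factory=list)


@dataclass
class Photo:
    file_name: str
    metadata: Metadata = field(default_factory=Metadata)


@dataclass
class Folder:
    name: str
    photos: list[Photo] = field(default_factory=list)

    def search(self, keyword: str) -> list[Photo]:
        """Return all photos in this folder tagged with the given keyword."""
        return [p for p in self.photos if keyword in p.metadata.keywords]


# Example values are made up for illustration.
holiday = Folder("holiday")
holiday.photos.append(
    Photo("beach.jpg", Metadata({"location": "seaside"}, ["sea", "summer"]))
)
found = [p.file_name for p in holiday.search("sea")]
```

Nothing in this sketch says anything about how many photos or users the system has to handle, which is exactly the gap the rest of the article is about.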

Based upon such a model, architects and developers can start implementing business functionality and corresponding tests.
In our example, we would need to implement the classic CRUD tasks (create, read, update, delete) for our domain entities (photos, folders, and metadata), and additionally solve a few technical tasks to bring the photo system to life:

display jpeg files
create a graphical user interface for our features

From an architect's perspective, it is valid to ask the "make-or-buy" question: Has anybody else already solved my problem, and could we potentially buy or re-use an existing solution?
The answer is obviously "yes", as several commercial and open-source packages exist for photo management,
e.g. Shotwell, Darktable and Lightzone (see Fig. 3). But that would be pretty boring for us developers, so we build the solution ourselves (even though it is usually very clever and efficient to prefer existing solutions over home-grown stuff...).

Fig. 3: Options for Photo Management

But ...

The customer is really happy that you delivered within a week and plays around with your solution. With a big smile on their faces, they comment:

"It is great, but we require a small change."

Only three letters, sounding insignificant, that "but".
Our customer, excited by what they saw, adds the following requirements:

"We want to roll out our photo management to multiple users and provide 10 GByte of free storage for each of them. Of course, people shall be able to share photos amongst each other. Otherwise, don't worry - the functionality will remain identical..."

Ok. We should have known there was something else coming - otherwise they wouldn't have asked us in the first place. So, not only local storage on users' machines any longer, as users get 10 Gig for free. Sharing photos among users definitely adds some more functional requirements - but having centrally managed storage and multiple users changes the whole scope and context of our development.

As you are an experienced architect and therefore know about "Murphy's law", you ask the customer for the expected number of users. They answer promptly: millions, hopefully tens of millions of users.

Millions of users

A quick calculation reveals that we now have to re-think our file-based storage approach:
Let's assume we get 10 million users, and everyone adds only half of the allowed free storage (5 GByte) of images to our system. Without considering backup and index space, that alone will require a quite impressive amount of storage:

    10 million * 5 GByte
  = 10,000,000 * 5 GByte
  = 10,000     * 5 Terabyte
  = 10         * 5 Petabyte

50 Petabyte. Definitely more than we can squeeze onto one standard SSD drive, and likely to become pretty expensive if we use a standard cloud storage provider.
Additionally, with millions of potentially concurrent users, we should think about concurrency, scaling, elasticity and a few other things not contained in our domain model.
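The same back-of-the-envelope calculation can be written as a few lines of Python, which makes it easy to try other user counts. This is only a sketch: the function name `required_storage_pb` is made up here, and it assumes decimal units (1 Petabyte = 1,000 Terabyte = 1,000,000 GByte), as the calculation above does.

```python
GB_PER_PB = 1_000_000  # decimal units: 1 PB = 1,000 TB = 1,000,000 GB


def required_storage_pb(users: int, gb_per_user: float) -> float:
    """Raw storage in petabytes, ignoring backup and index space."""
    return users * gb_per_user / GB_PER_PB


# 10 million users, each using half of the 10 GByte allowance
total_pb = required_storage_pb(users=10_000_000, gb_per_user=5)
print(f"{total_pb:.0f} PB")  # 10,000,000 * 5 GB = 50,000,000 GB = 50 PB
```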

Interestingly, though, our domain model remains extremely simple (see Fig. 4).
Just a single domain class is added (User), denoting the owner of the photo, plus a relationship to share photos with other users.

Fig. 4: Enhanced Domain Model

Practically, we will now need some changes to functionality, namely all things related to user management (add/update/remove users, handle authentication, handle forgotten credentials and so on).
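To show how small the change to the domain model really is, here is a minimal sketch of the added `User` class and the sharing relationship. The method names `share_with` and `is_visible_to` are assumptions for illustration, and the sketch deliberately leaves out authentication and credential handling.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str


@dataclass
class Photo:
    file_name: str
    owner: User
    shared_with: list[User] = field(default_factory=list)

    def share_with(self, user: User) -> None:
        """Grant access; sharing with the owner or sharing twice is a no-op."""
        if user != self.owner and user not in self.shared_with:
            self.shared_with.append(user)

    def is_visible_to(self, user: User) -> bool:
        """A photo is visible to its owner and to everyone it is shared with."""
        return user == self.owner or user in self.shared_with


alice, bob, carol = User("alice"), User("bob"), User("carol")
photo = Photo("tea.jpg", owner=alice)
photo.share_with(bob)
photo.share_with(bob)  # second call changes nothing
```

A few extra lines cover the functional side of "multiple users"; none of them help with petabytes of storage or millions of concurrent sessions.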

Just a single knob...

In contrast to the original requirements, no (!) relevant functionality was added, just a single quality attribute was adjusted: We had single-user operation before, and did not explicitly care for storage capacity. Our customer just turned a single knob ("capacity"), resulting in important under-the-hood changes.

That tiny change in requirements resulted in a completely different solution approach.
Compared to the simple functional requirements, this worries even hardened architects and developers. Millions of users and petabytes of data confront a development team with completely new tasks:

Scalability to keep the system performant and responsive even with the planned number of users. This includes clustering and load balancing to be able to react to load peaks at short notice.
Storage of large amounts of data (also known as Big Data), which does not work satisfactorily with conventional databases or file systems.
Distributed data storage, because for performance and risk reasons the entire data stock should be distributed over several locations or data centers.
High availability, because large numbers of international users do not tolerate long maintenance windows.
Transmission costs, for example for replication of image data.

What's missing in Domain-Driven Design

A purely domain-driven approach to the design and development of our photo example might have ignored the essential quality requirements (number of users and data volume). In general, if there is too much focus on functionality and domain structure, there is a risk of overlooking important quality features or ignoring them for a long time.

Parallel to understanding and considering the domain you should therefore explicitly design the required or needed qualities into your system. But let's consider a relevant objection first:

You might think: "Hey, we are neither addressing millions of users nor giving away petabytes of free storage. This example is exaggerated."

Ok, let's turn another knob

If you are of the opinion that the quality requirements mentioned here are too ambitious, because you will never have millions of users yourself, let's choose a completely different requirement:

Your customer wants to store images for a few dozen employees, each with a maximum of one hundred images.

This results in a total volume of data that can easily be stored on a single hard disk. We can easily manage a few dozen user IDs without any capacity or storage problems. However:

The stored images are top secret (military grade security)

Again, we have to deal with problems beyond pure functionality: secure encryption algorithms, secure transfer of images to users, key or credential management, highly secure authentication of users. These tasks are also difficult to retrofit to already finished software.

What is "Quality"?

The term "quality" was the subject of another short post here on dev.to.

In the past, many researchers and practitioners have discussed how to treat the "quality" of software-based products. Their common approach is to treat quality as a hierarchy of smaller properties:

The well-known ISO 25010 standard divides quality into eight categories, each of which is refined by additional subcharacteristics (see Fig. 5).

Fig. 5: ISO 25010 Software Product Quality

Our photo management system from the introduction serves as an example: We now describe the (quality) requirements as a hierarchy: The required software quality is decomposed into subcharacteristics, which in turn are further decomposed as required. This so-called quality tree describes or defines the specific quality requirements for a system. In Fig. 6 below we rotated the tree to the left, and removed most of the branches from the "big" ISO 25010 model.

Fig. 6: Sample Quality Tree

These textual descriptions of specific quality attributes (e.g. "Photo search yields results in < 1 sec on average") are called quality scenarios and have been described in many books and articles, see this nice post from Frank Rosner. Scenarios provide a practical and pragmatic way to describe quality requirements; they are easy to understand and create.
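Because a scenario such as "Photo search yields results in < 1 sec on average" names a concrete threshold, it can be turned into an automated check. The following is a minimal sketch under stated assumptions: `average_latency` and `fake_search` are hypothetical names, and `fake_search` is only a stand-in for whatever search the real system offers.

```python
import time
from statistics import mean
from typing import Callable, Iterable


def average_latency(search: Callable[[str], object], queries: Iterable[str]) -> float:
    """Run each query once and return the mean wall-clock time in seconds."""
    durations = []
    for query in queries:
        start = time.perf_counter()
        search(query)
        durations.append(time.perf_counter() - start)
    return mean(durations)


def fake_search(query: str) -> list[str]:
    # Stand-in for the real search; a real check would call the running system.
    return [name for name in ("beach.jpg", "tea.jpg") if query in name]


latency = average_latency(fake_search, ["beach", "tea", "mountain"])
scenario_met = latency < 1.0  # threshold taken from the scenario text
```

A check like this only says something about the scenario if it runs against a realistic data volume, so in practice the query set and the environment matter more than the measuring code.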

Achieving Quality

So how can you achieve this often quite extensive set of quality requirements for a system? We have already seen that concentrating on the purely functional or technical characteristics is often not enough.

The short answer: Explicitly consider quality requirements - don't expect that the system will be fast, secure, reliable or maintainable enough just by producing clean code.

Systematically Constructing Quality

As a software architect, you must therefore place great emphasis on achieving the important quality characteristics from the very beginning when designing and implementing your systems. Fortunately, this is quite simple in practice:

First of all, you should make quality a very important goal and pay attention to it accordingly. Then follow an iterative cycle (analogous to the classic plan-do-check-act cycle, also known as the PDCA or Deming cycle) with the following steps:

aim: get specific and concrete quality requirements.
plan: define (together with your team) approaches or strategies to achieve the respective quality requirement. Approaches might be purely technical (e.g. use Elasticsearch for image search queries), organizational, business-related (e.g. restrict free storage to 1 GByte in first year of usage), or a mixture of all.
build: implement some of these approaches.
check or test: check the effectiveness of your approaches. To do this, you need to measure and test as close as possible to the running system or parts of it. If necessary, re-adjust by continuing with step 1 (aim).

This cyclical, iterative procedure makes the review of goal achievement a systematic part of development. It ensures that high-priority quality requirements are taken into account in a timely manner in design and development. Neal Ford calls this "evolutionary architecture."

Fig. 7: Quality Driving Architectural Decisions

The build and check activities are completely normal architecture and project business - but the aim and plan activities require some explanation.

Aim: Getting Quality Requirements Right

In the ideal world, you receive concrete and specific quality requirements from requirements engineering or your product owner. However, in my long years in the IT industry, I have rarely encountered ideal conditions - I regularly had to rework the quality requirements for systems. The simple reason for this: The functional requirements and processes of a system can usually be described quite simply, but clients and other stakeholders usually have a hard time with the specific, precise description of maintainability, flexibility, robustness, ergonomics, efficiency and so on (the qualities of systems).

There has been a remedy for this problem for a long time: Methods such as quality scenarios can help to describe quality requirements and goals in a pragmatic and operationalized way. We have already roughly addressed this above, here again in more detail:

You remember the hierarchical nature of quality: quality scenarios form the leaves of the quality tree. We concretize these leaves with scenarios in order to describe more precisely what the respective stakeholders mean by this term. Several different scenarios may well refer to a single quality characteristic. Again, instead of gray theory, a few examples (in Table 1) will help you understand this approach. In general, scenarios describe the reaction of a system to events. Many quality characteristics can be well concretized with the help of scenarios, especially efficiency, performance, flexibility, extensibility and reliability.

Feature	Subfeature	Scenario	Priority
Reliability	Robustness	When uploading a corrupted jpg photo, the system gives a meaningful indication without crashing.	B
	Robustness	When uploading an unknown file type (such as: pdf, svg, tiff), the system gives an appropriate message and saves the data without crashing.	A
	Data Integrity	The system will not modify or corrupt photos uploaded by users under any circumstances.	A
Performance	Capacity: Number of Users	The system supports up to 10 million users, each of whom is allowed to use up to 5 GB of storage for photos and metadata.	B
Ease of Use	Learnability	Casual users should be able to save their first photo in less than 2 minutes.	B
	Learnability	Users should be able to create more complex searches without the help of documentation	C

Table 1: More Examples for Quality Scenarios

Please note the "Priority" column in Table 1: This puts the quality requirements in an appropriate order from a business point of view. The finished system should achieve all quality goals; the priority guides design and implementation decisions during architecture and development work.
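If you prefer to keep scenarios next to the code rather than in a wiki table, a few rows of Table 1 might be represented as follows. This is a sketch only: the class name `QualityScenario` is made up, the scenario texts are shortened, and it assumes that "A" denotes the highest priority, which the table does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityScenario:
    characteristic: str
    subcharacteristic: str
    scenario: str
    priority: str  # assumed: "A" is highest, "C" is lowest


scenarios = [
    QualityScenario("Reliability", "Robustness",
                    "Corrupted jpg upload yields a meaningful message, no crash.", "B"),
    QualityScenario("Reliability", "Data Integrity",
                    "Uploaded photos are never modified or corrupted.", "A"),
    QualityScenario("Ease of Use", "Learnability",
                    "Complex searches are possible without documentation.", "C"),
]

# sorted() is stable, so scenarios with equal priority keep their table order.
by_priority = sorted(scenarios, key=lambda s: s.priority)
top = [s.subcharacteristic for s in by_priority if s.priority == "A"]
```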

Now that we have clarified the quality requirements and made them operational and hopefully testable, we move on to the next step: strategies, tactics or other approaches to achieve these requirements.

Plan: Define Approaches to Achieve Quality

Once we know the specific requirements, we can design, decide and define specific approaches, measures, concepts and technology to achieve these requirements. In addition to established good practices of software engineering, we can rely on published strategies, tactics and patterns.

The well-known Software Engineering Institute has introduced the term "quality tactics" for this, and published a few for individual quality characteristics.

To date, there is no complete collection of such quality tactics, but the current literature provides you with a large pool: The patterns that have been widespread for years already help with quality characteristics such as flexibility, robustness, performance and security - as long as you read them with a focus on quality-enhancing measures.

QDSA in Practice

In development practice, QDSA consists of really simple mechanisms:

Write down your stakeholders' explicit quality requirements.
Design, decide and define specific approaches, tactics or strategies to achieve these specific requirements.
Implement one or several, and finally
Check if your approach(es) worked out as planned.

A simple board or table in your project wiki suffices; no additional tools are required.

QDSA for Photo Management

Let's come back to our photo management example one last time and apply this approach to our demanding quality goal "high number of users and large amount of data". Table 2 contains some proposals for how your team might achieve that goal. There is still a lot of work to do, but at least we have now collected some ideas which we can pursue during development.

Q-Requirement	Proposals / Approaches / Tactics / Strategies
Capacity (Number of Users): System supports up to 10 million users, each is allowed up to 5 GB of storage.	* NoSQL datastore with auto-replication * Use content-delivery network * Use Auth0 for user management * hire expert for large-scale load testing
Learnability: Casual users should be able to save their first photo in less than 2 minutes.	* get usability expert on team * add drop image function for upload * design-thinking workshop to optimize flows
...	...

Some of these proposals are expensive, others are difficult to implement. But if your customer has difficult requirements, sometimes you have to take big leaps to achieve these goals! Brainstorming, collecting and systematically analyzing proposals for action is the core of Quality-Driven Software Architecture.
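A wiki table is all you need, but the same board can also be sketched as a small data structure if you want to track which proposals have been tried. The names `Status`, `Approach` and `BoardRow` and the four status values are assumptions for this sketch, not part of QDSA itself; the requirement and approaches are taken from Table 2.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PROPOSED = "proposed"
    IMPLEMENTED = "implemented"
    VERIFIED = "verified"
    REJECTED = "rejected"


@dataclass
class Approach:
    description: str
    status: Status = Status.PROPOSED


@dataclass
class BoardRow:
    requirement: str
    approaches: list[Approach] = field(default_factory=list)

    def is_open(self) -> bool:
        """A requirement stays open until at least one approach is verified."""
        return not any(a.status is Status.VERIFIED for a in self.approaches)


row = BoardRow(
    "Capacity: up to 10 million users, each with up to 5 GB of storage",
    [Approach("NoSQL datastore with auto-replication"),
     Approach("Use content-delivery network")],
)
open_before = row.is_open()
row.approaches[1].status = Status.VERIFIED  # outcome of the "check" step
open_after = row.is_open()
```

Treating a requirement as closed after a single verified approach is a simplification; a demanding goal such as capacity will usually need several approaches working together.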

Beware of Side Effects

In QDSA, you should be particularly careful about one thing: Individual measures can easily impact other quality attributes. A positive effect on one characteristic such as performance can easily have negative effects on other characteristics, such as memory requirements, robustness, or comprehensibility. That's the classical side effect: There is no free lunch. Many things you improve will have an impact on budget or schedule. You should therefore systematically question possible consequences before you start implementing your quality approaches.
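One lightweight way to question consequences systematically is to note, for each approach, which quality attributes it is expected to help or hurt. The sketch below assumes a simple +1/-1 rating; the approach "cache search results" and all ratings are hypothetical examples, not measurements.

```python
# Hypothetical impact ratings: +1 helps, -1 hurts the named quality attribute.
impacts = {
    "cache search results": {
        "performance": +1,
        "memory requirements": -1,
        "comprehensibility": -1,
    },
    "add drop image function": {"learnability": +1},
}


def side_effects(approach: str) -> list[str]:
    """Return the quality attributes an approach is expected to harm."""
    return sorted(q for q, rating in impacts[approach].items() if rating < 0)


harmed = side_effects("cache search results")
```

The value of such a list lies in the team discussion that produces the ratings, not in the numbers themselves.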

The rest is classic development business, namely planning the measures as part of everyday project work and then implementing them. Whether you work according to Scrum or other process models, QDSA requires nothing special compared to normal procedures. From my point of view, the combination of DDD and QDSA is very promising: On the one hand, you get your domain structure under control, on the other hand, you explicitly address the important quality topics. Your customers or clients will be thrilled!

Quality Creates Value

Constructing quality systematically in this way creates value for your stakeholders: software systems that achieve specific quality goals. As mentioned above, many quality-enhancing approaches or strategies are initially expensive or even risky with regard to other goals. In the long term, however, such investment in achieving quality goals virtually always pays off. Because some investments pay off only after some delay, quality orientation is often unpopular with purely short-term management.

Quality can hardly be achieved without systematic design (aka architecture). And good things never come for free.

Acknowledgements

Way back in the year 2000, some software architects from Siemens Medical (mainly Christine Hofmeister) published a book called "Applied Software Architecture", describing a method for systematically designing systems based upon given requirements and constraints. They called their approach global analysis - which became the foundation of the QDSA approach described here. The Siemens guys did not focus on quality, and proposed a rather waterfall-like top-down approach to architecture that is not suited for modern development practices. Nevertheless, their idea of developing solution approaches based upon specific (quality) requirements or constraints remains useful.

Image credits

The cheerful lady was photographed by Sincerely Media on Unsplash.

The header image is by Isaac M Smith.
