DEV Community

carlo-moretto for Prodigysgroup

Originally published at agregg.cloud

Machine learning: is it really the best option for you?

Introduction

In recent years, machine learning and deep learning have been applied to a wide range of problems: self-driving cars, face recognition, virtual assistants, shopping forecasts and even anti-crime applications. The reason for this success is that developing and deploying ML systems is relatively fast and cheap, yet high performance is still achievable.

But is it really the best solution to a specific problem?

To answer this question properly, we have to consider a hidden factor that can really make a difference when choosing the best technologies for a project: _technical debt_.

Causes of Technical Debt

“Technical Debt” is a metaphor introduced by Ward Cunningham in 1992 to help understand the long-term costs that can be incurred by moving too quickly in software engineering.

The causes of technical debt can be intentional, like time constraints placed on development and source code complexity, or unintentional, like a lack of coding standards and guidelines or a lack of planning for future developments. This debt can be paid off by refactoring code, improving unit tests, reducing dependencies and deleting dead code. However, this is not easy, and identifying the debt requires careful analysis.

For this reason it is always best to keep it as low as possible from the beginning of the project.

So, what is the main cost of using ML? Primarily, the fact that maintaining ML systems over time is challenging and expensive.

Technical Debt in ML Systems

One of the most common kinds of technical debt that arise with ML is _entanglement_. Machine learning systems mix signals together, entangling them and making isolated improvements impossible: changing one input can change how the model uses all the others. In this case, a solution could be to isolate ML models (if possible) or to focus on detecting changes in prediction behaviour as they occur.
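As a rough illustration of detecting changes in prediction behaviour, the sketch below compares the mean prediction of a baseline batch with a newer batch. The function names, the sample scores and the `threshold` of 0.1 are all assumptions made for this example; a real system would likely use a more robust statistic.

```python
from statistics import mean


def prediction_shift(baseline, current):
    """Absolute difference between the mean predictions of two batches."""
    return abs(mean(current) - mean(baseline))


def has_drifted(baseline, current, threshold=0.1):
    """Flag a change when the mean prediction moves by more than `threshold`.

    The threshold is a hypothetical value chosen for this example.
    """
    return prediction_shift(baseline, current) > threshold


# Made-up scores from the same model before and after an input change.
baseline_scores = [0.20, 0.25, 0.22, 0.18, 0.21]
after_change = [0.40, 0.45, 0.38, 0.42, 0.41]

if has_drifted(baseline_scores, after_change):
    print("Prediction behaviour changed: investigate recent input changes")
```

Running a check like this on every deployment or data refresh gives an early warning that a change in one signal has affected the model as a whole.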

Moreover, a prediction from a machine learning model is often made widely accessible to other systems, which can consume the information at runtime or later by reading files. Without access controls, some of these _consumers_ may be undeclared. In software engineering this problem is also known as visibility debt.
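One way to picture a simple access control is a feed that only serves consumers that have been declared in advance. The `PredictionFeed` class and the consumer names below are hypothetical and only meant as a minimal sketch of the idea.

```python
class PredictionFeed:
    """Serves predictions only to consumers that were declared beforehand."""

    def __init__(self):
        self._consumers = set()
        self._access_log = []

    def declare_consumer(self, name):
        """Register a downstream system as an official consumer."""
        self._consumers.add(name)

    def read(self, consumer, prediction):
        """Return the prediction, refusing consumers that were never declared."""
        if consumer not in self._consumers:
            raise PermissionError(f"undeclared consumer: {consumer}")
        self._access_log.append(consumer)
        return prediction

    def declared_consumers(self):
        """List every system known to depend on this feed."""
        return sorted(self._consumers)


feed = PredictionFeed()
feed.declare_consumer("billing")    # hypothetical downstream system
score = feed.read("billing", 0.73)  # allowed: "billing" was declared
```

With a registry like this, the list of systems affected by a model change is always known, instead of being discovered when something breaks.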

Other kinds of technical debt you need to be aware of concern _data dependencies_, such as _unstable input signals_. These are mostly used because it is convenient to take signals produced by other systems as inputs, but they are unstable because their behaviour changes over time. A solution for unstable data dependencies is to create a versioned copy of a given signal. Another source of technical debt is _underutilized dependencies_, that is, unnecessary inputs or packages that add little value; this debt can be paid off by identifying and removing them through a detailed analysis.

Furthermore, there is reproducibility debt. Machine learning makes it difficult to re-run experiments and get similar results.
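A versioned copy of a signal can be sketched as a store that freezes a snapshot under a version label, so later upstream changes do not silently reach the model. The `VersionedSignal` class and the sample data are assumptions for illustration only.

```python
import copy


class VersionedSignal:
    """Keeps frozen snapshots of an upstream signal under version labels."""

    def __init__(self):
        self._versions = {}

    def freeze(self, version, values):
        """Store a deep copy of the signal; existing versions are never overwritten."""
        if version in self._versions:
            raise ValueError(f"version already exists: {version}")
        self._versions[version] = copy.deepcopy(values)

    def get(self, version):
        """Return a copy of the snapshot so callers cannot mutate the stored one."""
        return copy.deepcopy(self._versions[version])


# Hypothetical upstream signal that another team may change at any time.
upstream = {"user_42": 0.8, "user_43": 0.1}

store = VersionedSignal()
store.freeze("v1", upstream)

upstream["user_42"] = 0.3     # the upstream system changes its output
pinned = store.get("v1")      # the model still reads the frozen values
```

The trade-off is staleness: a frozen copy has to be refreshed deliberately, which is exactly the point, since the update then becomes a visible and testable change.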
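A small part of this debt can be reduced by fixing random seeds. The sketch below uses only the standard library and a toy `run_experiment` function (a stand-in, not a real training loop); real projects usually also need to seed their ML framework and record data and code versions.

```python
import random


def run_experiment(seed, n=5):
    """Toy 'experiment': draws n pseudo-random numbers from a seeded generator."""
    rng = random.Random(seed)  # local generator, independent of global state
    return [rng.random() for _ in range(n)]


first_run = run_experiment(seed=42)
second_run = run_experiment(seed=42)

# The same seed gives the same sequence, so the run can be repeated exactly.
print(first_run == second_run)
```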

Conclusions: Measuring Debt and Paying it Off

Technical debt can be measured by asking yourself a few useful questions, like:

• How easily can an entirely new algorithmic approach be tested at full scale?

• What is the transitive closure of all data dependencies?

• How precisely can the impact of a change to the system be measured?

• Does improving one model or signal degrade the others?

• How quickly can new members of the team be brought up to speed?
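The question about the transitive closure of data dependencies can be made concrete with a small graph traversal. The dependency names below are invented for the example, and the graph is assumed to be a plain dictionary mapping each component to its direct inputs.

```python
def transitive_dependencies(graph, node):
    """Return every direct and indirect dependency of `node`."""
    seen = set()
    stack = list(graph.get(node, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:  # skipping visited nodes also guards against cycles
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


# Hypothetical dependency graph: each key lists its direct inputs.
deps = {
    "churn_model": ["user_features", "click_score"],
    "click_score": ["click_logs"],
    "user_features": ["crm_export"],
}

all_inputs = transitive_dependencies(deps, "churn_model")
print(sorted(all_inputs))
```

If this set is hard to compute or surprisingly large, that is itself a sign of accumulated data dependency debt.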

Beyond these questions, simply acknowledging technical debt from the beginning of the project, rather than ignoring it, can already go a long way. These are good practices that matter for the success of the project, its maintenance over time and the health of the team itself.

Why AgrEGG?

AgrEGG helps to detect and track all possible causes of technical debt. With us, you will be able to test and see all data dependencies in a much easier way.

Also, thanks to our team of experts, you will have the opportunity to receive an in-depth analysis based on your project needs.

References

Hidden Technical Debt in Machine Learning Systems
