## DEV Community

Berk Hakbilen


# Evaluation for Regression Models in Machine Learning

Model evaluation is very important since we need to understand how well our model is performing. In comparison to classification, the performance of a regression model is slightly harder to determine because, unlike in classification, it is almost impossible to predict the exact value of a continuous target variable. Therefore, we need a way to calculate how close our predicted value is to the real value.

There are several evaluation metrics that are popularly used for regression models, which we will dive into in the following sections.

## Mean Absolute Error

Mean absolute error (MAE) is a very intuitive and simple metric, which is why it is also popular. It is the average of the absolute distances between the predicted and the true values; these distances are the individual error terms. The overall error for the whole dataset is the average of all prediction error terms. We take the absolute value of the distances/errors so that negative and positive errors do not cancel each other out.

• MAE is less sensitive to outliers than squared-error metrics. Use MAE when you do not want outliers to play a big role in the calculated error.

• MAE is not differentiable everywhere (it has no derivative at zero error). This is inconvenient when we use it as a loss function with gradient-based optimization methods.
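As a quick illustration, MAE can be computed in a few lines of NumPy (the values below are made up purely for the example):

```python
import numpy as np

# Hypothetical true and predicted values, for illustration only
y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.0])

# MAE: average of the absolute differences between truth and prediction
mae = np.mean(np.abs(y_true - y_pred))
print(mae)  # 0.5
```

The same value comes out of `sklearn.metrics.mean_absolute_error(y_true, y_pred)`.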

## Mean Squared Error (MSE)

MSE is one of the most widely used metrics for regression problems. It is the average of the squared distances between the actual values and the predicted values. Squaring the terms also takes negative differences into account and avoids cancellation between positive and negative errors in the total.

• The graph of MSE is differentiable, which means it can easily be used as a loss function.

• MSE can be decomposed into variance and squared bias. This helps us understand the effect of variance or bias in the data on the overall error.

• The value calculated by MSE has a different unit than the target variable since it is squared (e.g. meters → meters²).

• If there are outliers in the data, they will result in a larger error. Therefore, MSE is not robust to outliers (this can also be an advantage if you are looking to penalize outliers).
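A small sketch (with made-up numbers) shows both the MSE calculation and how a single outlier inflates it:

```python
import numpy as np

# Hypothetical values, for illustration only
y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.0])

# MSE: average of the squared differences
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.375

# One badly wrong prediction (an outlier error of 10) dominates the metric
y_pred_outlier = np.array([2.5, 5.0, 8.0, 20.0])
mse_outlier = np.mean((y_true - y_pred_outlier) ** 2)
print(mse_outlier)  # 25.125
```

The MSE jumps by a factor of about 67, while the MAE on the same data would only grow from 0.5 to 2.75 — this is the outlier sensitivity described above.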

## Root Mean Squared Error (RMSE)

As the name already suggests, in RMSE we take the root of the mean of the squared distances, i.e. the root of MSE. RMSE is also a popular evaluation metric, especially in deep learning techniques.

• The calculated error has the same unit as the target variable, making the interpretation relatively easy.

• Just like MSE, RMSE is also susceptible to outliers.
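Continuing the same made-up example, RMSE is just the square root of MSE:

```python
import numpy as np

# Hypothetical values, for illustration only
y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.0])

# RMSE: square root of MSE, back in the units of the target variable
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
print(round(rmse, 3))  # 0.612
```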

## R-Squared

R-squared is a different metric compared to the ones we have discussed so far. It does not directly measure the error of the model.

R-squared evaluates the scatter of the data points around the fitted regression line. It is the percentage of the target-variable variation that the model explains, compared to the total variance of the target variable. It is also known as the “coefficient of determination” or goodness of fit.

R-squared is calculated as one minus the ratio of the sum of squared prediction errors to the total sum of squares, where the total sum of squares uses the mean of the real values in place of the predictions.

R-squared usually lies between 0 and 1 (it can even become negative for a model that fits worse than simply predicting the mean). A value of 0 indicates that the model does not explain any of the variation in the target variable around its mean value; such a regression model basically predicts the mean of the target variable. A value of 1 indicates that the model explains all the variance in the target variable around its mean.

A larger R-squared value usually indicates that the regression model fits the data better. However, a high R-squared does not necessarily mean a good model.
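The definition above can be checked by hand with NumPy (again with made-up values): compute the residual sum of squares, the total sum of squares around the mean, and take one minus their ratio.

```python
import numpy as np

# Hypothetical values, for illustration only
y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.0])

ss_res = np.sum((y_true - y_pred) ** 2)           # sum of squared prediction errors
ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares around the mean
r2 = 1 - ss_res / ss_tot
print(round(r2, 3))  # 0.946
```

This matches `sklearn.metrics.r2_score(y_true, y_pred)`, and it is the same quantity that `model.score(X, y)` returns for regressors in the examples below.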

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
import seaborn as sns; sns.set_theme(color_codes=True)

# Synthetic regression dataset with moderate noise
X, y = make_regression(n_samples=80, n_features=1,
                       n_informative=1, bias=50, noise=15, random_state=42)

plt.figure()
ax = sns.regplot(x=X[:, 0], y=y)  # regplot expects 1-D arrays

model = LinearRegression()
model.fit(X, y)
print('R-squared score: {:.3f}'.format(model.score(X, y)))
```

```python
# Same dataset, but with much larger noise -> a lower R-squared score
X, y = make_regression(n_samples=80, n_features=1,
                       n_informative=1, bias=50, noise=200, random_state=42)

plt.figure()
ax = sns.regplot(x=X[:, 0], y=y)

model = LinearRegression()
model.fit(X, y)
print('R-squared score: {:.3f}'.format(model.score(X, y)))
```

• R-squared is a handy and intuitive metric of how well the model fits the data. Therefore, it is a good metric for a baseline model evaluation. However, due to the disadvantages we are going to discuss next, it should be used carefully.