Machine Learning Evaluation Measures


When developing machine learning models there is a long list of possible evaluation measures. On the one hand this is good, as it gives us lots of insight into the models and lets us select the model that best meets the requirements. (BTW this is different to choosing the best model based on the evaluation measures!) On the other hand, it can be confusing to work out what they all mean, as there appear to be so many of them. In this post I’ll look at some of these evaluation measures.

I’m not going to go into the basic set of evaluation measures that come from the typical use of the Confusion Matrix, including True/False Positives, True/False Negatives, Accuracy, Misclassification rate, Precision, Recall, Sensitivity and F1 score.

The following evaluation measures will be discussed:

  • R-Squared (R²)
  • Mean Squared Error (MSE)
  • Sum of Squared Error (SSE)
  • Root Mean Square Error (RMSE)

R-Squared (R²)

R-squared measures how well a regression line fits your data. It compares the variation of the model's predicted values from the actual values with the variation of the actual values around their mean. It is typically given as a percentage or in the range of zero to one (although you can have negative values when a model fits worse than simply predicting the mean). It is also known as the Coefficient of Determination. The higher the value for R², the better.

For a least-squares linear regression with an intercept, evaluated on the data it was fitted to, R² is between 0 and 100%:

  • 0% indicates that the model explains none of the variability of the response data around its mean.
  • 100% indicates that the model explains all the variability of the response data around its mean.


But R² cannot determine whether the coefficient estimates and predictions are biased.
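To make the definition concrete, here is a minimal sketch of R² in plain Python, computed as one minus the ratio of the squared prediction errors to the squared variation around the mean. The function name `r_squared` and the sample numbers are made up for illustration. It also shows how a negative value can appear when the predictions are worse than just predicting the mean.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual variation / total variation)."""
    mean_actual = sum(actual) / len(actual)
    # Variation the model fails to explain (squared prediction errors).
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total variation of the actual values around their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Illustrative values only.
actual = [3.0, 5.0, 7.0, 9.0]
good_predictions = [2.5, 5.5, 7.0, 9.0]   # close to the actual values
mean_predictions = [6.0, 6.0, 6.0, 6.0]   # always predicts the mean
bad_predictions = [9.0, 7.0, 5.0, 3.0]    # trend in the wrong direction

print(round(r_squared(actual, good_predictions), 3))  # 0.975
print(r_squared(actual, mean_predictions))            # 0.0
print(r_squared(actual, bad_predictions))             # -3.0
```

Predicting the mean for every point gives exactly 0, which is why 0% is read as "explains none of the variability".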

Mean Squared Error (MSE)

MSE measures the average squared error of our predictions. For each point, it calculates the squared difference between the prediction and the target, and then averages those values. The higher this value, the worse the model is.

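$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

where n is the number of data points, y_i is the observed value and \hat{y}_i is the predicted value for point i.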

The larger the number, the larger the error. Error in this case means the difference between the observed values and the predicted values. Each difference is squared, which ensures that negative and positive values do not cancel each other out.
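The point about cancelling can be seen in a small sketch. The function names and the numbers below are hypothetical, chosen so that the raw errors sum to zero even though every prediction is off by one.

```python
def mean_error(actual, predicted):
    """Average of the raw (signed) errors; positives and negatives can cancel."""
    return sum(a - p for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of the squared errors; every miss adds a non-negative amount."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only: each prediction misses by exactly 1.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [4.0, 4.0, 8.0, 8.0]

print(mean_error(actual, predicted))          # 0.0, looks perfect but is not
print(mean_squared_error(actual, predicted))  # 1.0
```

The signed average suggests a perfect model, while MSE correctly reports that the predictions are off.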

Sum of Squared Error (SSE)

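SSE is the sum of the squared differences between each observation and the value predicted for it by the model. (When the "model" is simply a group mean, as in clustering or ANOVA, this is the sum of squared differences between each observation and its group’s mean.) It measures the overall difference between your data and the values predicted by your estimation model.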

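$$\mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

where n is the number of data points, y_i is the observed value and \hat{y}_i is the predicted value for point i.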
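As a rough sketch (function name and data are made up for illustration), SSE is the same sum used for MSE but without dividing by the number of points. One consequence is that SSE grows as you add more data, even when the typical error per point stays the same.

```python
def sum_squared_error(actual, predicted):
    """Sum of squared differences between observed and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# Illustrative values only.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 11.0]

sse = sum_squared_error(actual, predicted)
mse = sse / len(actual)  # dividing SSE by n gives the MSE
print(sse)  # 6.0
print(mse)  # 1.5

# Doubling the data set (same errors repeated) doubles SSE but leaves MSE unchanged.
sse_doubled = sum_squared_error(actual * 2, predicted * 2)
print(sse_doubled)                      # 12.0
print(sse_doubled / (2 * len(actual)))  # 1.5
```

This is why SSE is mostly useful for comparing models on the same data set, while MSE can be compared across data sets of different sizes.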

Root Mean Square Error (RMSE) 

RMSE is just the square root of MSE. The square root is introduced so that the errors are on the same scale, and in the same units, as the targets. As the square root of a variance, RMSE can be interpreted as the standard deviation of the unexplained variance. Lower values of RMSE indicate a better fit. RMSE is a good measure of how accurately the model predicts the response.
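A small sketch of the scale point, using hypothetical house prices in thousands (the function name and values are assumptions for illustration): the MSE comes out in "thousands squared", while the RMSE is back in thousands and can be read directly against the targets.

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error, in the same units as the targets."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Illustrative values only: prices in thousands, each prediction off by 2.
actual = [200.0, 250.0, 300.0, 350.0]
predicted = [202.0, 248.0, 302.0, 348.0]

mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
rmse = root_mean_squared_error(actual, predicted)
print(mse)   # 4.0 (thousands squared)
print(rmse)  # 2.0 (thousands, matches the typical miss)
```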