
Table 3 Description of performance metrics and their formulas

From: AUD-DSS: a decision support system for early detection of patients with alcohol use disorder

Metric

Description

Formula

Precision

Precision or Positive Predictive Value (PPV) is a performance metric that measures how many of the records predicted as positive are truly positive. The main aim of tracking this metric is to reduce the number of false positives

\(Precision= \frac{TP}{TP+FP}\)
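For illustration, a minimal Python sketch (not part of the original table) that applies this formula to hypothetical confusion-matrix counts tp and fp:

```python
def precision(tp: int, fp: int) -> float:
    """Positive Predictive Value: fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

# Hypothetical counts: 40 true positives, 10 false positives -> 0.8
print(precision(tp=40, fp=10))
```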

Recall

Recall or True Positive Rate (TPR) describes the sensitivity of the classifier: it measures the proportion of positive samples that are captured by correct predictions. Recall is the metric of choice when all positive samples must be identified and false negatives must be avoided

\(Recall\, (Sensitivity) = \frac{TP}{TP+FN}\)
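A similar sketch for Recall, again using hypothetical counts tp and fn rather than values from the study:

```python
def recall(tp: int, fn: int) -> float:
    """True Positive Rate (sensitivity): fraction of actual positives that are detected."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

# Hypothetical counts: 40 true positives, 20 false negatives -> ~0.667
print(recall(tp=40, fn=20))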

F1-Score

The F1-Score is the harmonic mean of Precision and Recall. As a result, it reflects the performance of the classifier in detecting positive records: the higher the F1-Score, the better the classifier performs on the positive class. For binary classification on imbalanced datasets, the F1-Score can be a more appropriate metric than accuracy

\(F1\text{-}Score=2\times \frac{Precision \times Recall}{Precision+Recall}\)
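A sketch combining the two previous quantities into the F1-Score; the input values are assumed for illustration, not taken from the study:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of Precision and Recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical values from the sketches above: precision 0.8, recall ~0.667 -> F1 ~0.727
print(f1_score(0.8, 2 / 3))
```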

Predictive Accuracy

The most popular measure of a classifier's performance is predictive accuracy, which evaluates the algorithm's overall effectiveness as the proportion of records whose class label is predicted correctly. Measuring predictive accuracy is the fastest way to check whether the predictive model has been trained correctly and how it performs overall. However, it should not be relied on alone, since it gives no detailed information about the classifier's performance on the individual classes

\(Accuracy= \frac{TP+TN}{TP+TN+FP+FN}\)
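A corresponding sketch for accuracy over hypothetical confusion-matrix counts:

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all records whose class label is predicted correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total > 0 else 0.0

# Hypothetical counts: 40 TP, 30 TN, 10 FP, 20 FN -> 0.7
print(accuracy(tp=40, tn=30, fp=10, fn=20))
```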

AUROC

The AUROC is a single number that measures the total area underneath the ROC curve and thereby summarizes the performance of a classifier, provided that FP and FN are treated as equally costly mistakes. In most medical settings, an FN is considered more serious, because these people are not identified by the test; individuals given an FP classification will be tested further, which provides the opportunity to correct the classification. The ROC curve visualizes the trade-off between TPR and False Positive Rate (FPR) by displaying them for various threshold settings (cutoff points). In particular, it plots the cumulative distribution function of the classifier's score for truly positive events against that for falsely identified events. In this curve, the y-axis is the TPR and the x-axis is the FPR, which is calculated as

\(False\, Positive \,Rate= \frac{FP}{TN+FP}\)
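As an illustrative sketch, the ROC curve and AUROC can be computed with scikit-learn's roc_curve and roc_auc_score; the labels y_true and scores y_score below are hypothetical, not data from the study:

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical ground-truth labels and predicted probabilities (illustrative only)
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]

# TPR and FPR at each cutoff point, then the area under the resulting curve
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print("AUROC:", roc_auc_score(y_true, y_score))
```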

AUPRC

The AUPRC is another widely used performance metric for binary classification problems. It is a threshold-independent measure that estimates the area under the curve formed by the trade-off between Precision and Recall as the model's prediction threshold changes. In the precision-recall curve, Recall is on the x-axis and Precision is on the y-axis
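Similarly, a minimal sketch of the AUPRC using scikit-learn's precision_recall_curve and auc, again on hypothetical labels and scores:

```python
from sklearn.metrics import auc, precision_recall_curve

# Same hypothetical labels and scores as in the AUROC sketch
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]

# Precision and Recall at each threshold, then the area under the precision-recall curve
precision, recall, thresholds = precision_recall_curve(y_true, y_score)
print("AUPRC:", auc(recall, precision))
```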

 
AUROC: area under the receiver operating characteristic curve; AUPRC: area under the precision-recall curve