Table 3 Performance measures for the binary classification problem: TP – true positives, TN – true negatives, FP – false positives, FN – false negatives

From: Exploring the potential of 3D Zernike descriptors and SVM for protein–protein interface prediction

| Measure | Mathematical formulation | Comment |
| --- | --- | --- |
| Accuracy | \(A = \frac{\text{TP}+\text{TN}}{\text{TP}+\text{TN}+\text{FP}+\text{FN}}\) | Fraction of correct predictions over the total; not very informative when dealing with imbalanced data. |
| Precision | \(P = \frac{\text{TP}}{\text{TP}+\text{FP}}\) | Fraction of relevant instances among the retrieved ones. |
| Recall | \(R = \frac{\text{TP}}{\text{TP}+\text{FN}}\) | Fraction of relevant instances that have been retrieved over the total relevant instances. |
| F1 score | \(F_{1} = 2 \times \frac{P \times R}{P + R}\) | Harmonic mean of precision and recall. |
| Matthews correlation coefficient | \(\text{MCC} = \frac{\text{TP}\times\text{TN} - \text{FP}\times\text{FN}}{\sqrt{(\text{TP}+\text{FP})(\text{TP}+\text{FN})(\text{TN}+\text{FP})(\text{TN}+\text{FN})}}\) | Takes a value between −1 and +1: +1 represents a perfect prediction, 0 a prediction no better than random, and −1 total disagreement between prediction and observation. |
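
As an illustration of the formulas in the table, the following is a minimal Python sketch that computes all five measures from the four confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and do not come from the paper. Returning 0.0 when a denominator is zero is an assumed convention; the table does not say how such cases are handled.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute the measures of Table 3 from confusion-matrix counts.

    Assumption (not from the source): any measure whose denominator
    is zero is reported as 0.0.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F1 is the harmonic mean of precision and recall.
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0

    # Matthews correlation coefficient, bounded in [-1, +1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical imbalanced example: 8 positives, 92 negatives.
    for name, value in binary_metrics(tp=5, tn=90, fp=2, fn=3).items():
        print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is 0.95 even though only 5 of the 8 positives are recovered. This is the effect the table's comment on imbalanced data refers to, and F1 and MCC give a more cautious picture in such cases.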