Table 3 Performance measures for the binary classification problem: TP – true positives, TN – true negatives, FP – false positives, FN – false negatives

From: Exploring the potential of 3D Zernike descriptors and SVM for protein–protein interface prediction

| Measure | Mathematical formulation | Comment |
| --- | --- | --- |
| Accuracy | \(\mathrm{A} = \frac{\text{TP}+\text{TN}}{\text{TP}+\text{TN}+\text{FP}+\text{FN}}\) | Fraction of correct predictions over all predictions; not very informative when the data are imbalanced. |
| Precision | \(\mathrm{P} = \frac{\text{TP}}{\text{TP}+\text{FP}}\) | Fraction of relevant instances among the retrieved ones. |
| Recall | \(\mathrm{R} = \frac{\text{TP}}{\text{TP}+\text{FN}}\) | Fraction of all relevant instances that have been retrieved. |
| F1 score | \(\mathrm{F}_{1} = 2 \times \frac{\mathrm{P} \times \mathrm{R}}{\mathrm{P} + \mathrm{R}}\) | Harmonic mean of precision and recall. |
| Matthews correlation coefficient | \(\text{MCC} = \frac{\text{TP}\times \text{TN} - \text{FP}\times \text{FN}}{\sqrt{(\text{TP}+\text{FP})(\text{TP}+\text{FN})(\text{TN}+\text{FP})(\text{TN}+\text{FN})}}\) | Takes values between −1 and +1: +1 represents a perfect prediction, 0 a prediction no better than random, and −1 total disagreement between prediction and observation. |
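
The measures in Table 3 can be computed directly from the four confusion-matrix counts. The following is a minimal sketch, not code from the paper: the function name `binary_metrics`, the example counts, and the convention of returning 0 when a denominator is zero are all assumptions made for illustration.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute the Table 3 measures from confusion-matrix counts.

    Assumption (not from the source): any measure whose denominator
    is zero is reported as 0.0 instead of raising an error.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0

    # MCC denominator: square root of the product of the four marginals.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical imbalanced data set: 8 positives, 92 negatives.
    # A classifier that predicts "negative" for every sample:
    always_negative = binary_metrics(tp=0, tn=92, fp=0, fn=8)
    # A classifier that finds 5 of the 8 positives with 2 false alarms:
    informative = binary_metrics(tp=5, tn=90, fp=2, fn=3)

    for name, m in [("always negative", always_negative),
                    ("informative", informative)]:
        print(name, {k: round(v, 3) for k, v in m.items()})
```

With these hypothetical counts, the always-negative classifier reaches an accuracy of 0.92 although its recall is 0 and, under the zero-denominator convention assumed here, its MCC is 0. This illustrates why accuracy alone is not very informative on imbalanced data.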