Exploring the potential of 3D Zernike descriptors and SVM for protein–protein interface prediction

BMC Bioinformatics

Table 3 Performance measures for the binary classification problem: TP – true positives, TN – true negatives, FP – false positives, FN – false negatives

Measure	Mathematical formulation	Comment
Accuracy	A \(=\frac {\text {TP}+\text {TN}}{\text {TP}+\text {TN}+\text {FP}+\text {FN}}\)	Indicates the fraction of correct predictions over the total; not very informative when dealing with imbalanced data, since always predicting the majority class already scores highly.
Precision	P \(=\frac {\text {TP}}{\text {TP}+\text {FP}}\)	Indicates the fraction of relevant instances among the retrieved ones.
Recall	R \(=\frac {\text {TP}}{\text {TP}+\text {FN}}\)	Indicates the fraction of relevant instances that have been retrieved over the total relevant instances.
F₁ score	F\(_{1} = 2 \times \frac {\mathrm {P} \times \mathrm {R}}{\mathrm {P} + \mathrm {R}}\)	It is the harmonic mean of precision and recall.
Matthews correlation coefficient	MCC \(=\frac {\text {TP}\times \text {TN} - \text {FP}\times \text {FN}}{\sqrt {(\text {TP}+\text {FP})(\text {TP}+\text {FN})(\text {TN}+\text {FP})(\text {TN}+\text {FN})}}\)	Returns a value between −1 and +1: +1 represents a perfect prediction, 0 a prediction no better than random, and −1 total disagreement between prediction and observation.
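
The measures in Table 3 can be computed directly from the four confusion-matrix counts. The following is a minimal sketch, not the authors' implementation: the function names `safe_div` and `classification_metrics` are hypothetical, and returning 0.0 whenever a denominator is zero is an assumed convention (the table leaves those cases undefined).

```python
import math


def safe_div(num, den):
    """Divide num by den, returning 0.0 when den is zero (assumed convention)."""
    return num / den if den else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Compute the Table 3 measures from confusion-matrix counts."""
    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = safe_div(2 * precision * recall, precision + recall)
    # MCC denominator: square root of the product of the four marginal sums.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, mcc_den)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the paper.
    balanced = classification_metrics(tp=40, tn=45, fp=5, fn=10)
    # A classifier that labels everything negative on imbalanced data.
    all_negative = classification_metrics(tp=0, tn=95, fp=0, fn=5)
    for name, scores in (("balanced", balanced), ("all_negative", all_negative)):
        print(name, {k: round(v, 3) for k, v in scores.items()})
```

The second call illustrates the comment on accuracy: a predictor that never reports a positive still reaches an accuracy of 0.95 on these made-up counts, while recall, F₁ and MCC are all 0 under the zero-denominator convention assumed here.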

ISSN: 1471-2105