Table 3 More detailed classifier performance statistics. For each set of genes tested, five statistics reflecting performance were calculated. Accuracy is the overall accuracy of the classifier; precision reflects how specific the classifier's positive predictions are (the proportion of predicted disease genes that are correct), and recall reflects classifier sensitivity. The area under curve (AUC) is the area underneath the ROC curve drawn for each set of genes (see Figure 3) and represents classifier performance across all combinations of sensitivity and specificity. It ranges from 0 to 1, where 1 represents 100% accuracy, 0.5 represents performance no better than random and 0 represents 0% accuracy. The Kappa statistic measures agreement between predicted and actual classifications and takes false positive rates into account. It ranges from 0 (symbolising no agreement between predicted and actual classifications) to 1 (symbolising perfect agreement).

From: Speeding disease gene discovery by sequence based candidate prioritization

Test Set                Nodes in tree   Accuracy   Precision   Recall   AUC    Kappa
Training (OMIM) set     15              67%        65%         77%      0.75   0.35
10 × cross validation   15              63%        62%         70%      0.70   0.27
HGMD set                15              64.5%      63%         71%      0.69   0.29
Oligogenic set          15              65%        63%         72%      0.76   0.31
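The five statistics in the table can all be derived from a binary confusion matrix plus the classifier's ranking scores. As a minimal sketch (the labels and scores below are illustrative, not data from the paper), accuracy, precision, recall and Cohen's kappa follow directly from the confusion-matrix counts, and AUC can be computed as the probability that a randomly chosen positive outranks a randomly chosen negative (the Mann-Whitney formulation, equivalent to the area under the ROC curve):

```python
def confusion(y_true, y_pred):
    """Counts for a binary classifier: 1 = disease gene, 0 = non-disease gene."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, kappa)."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)   # proportion of predicted positives that are correct
    recall = tp / (tp + fn)      # sensitivity
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_o = accuracy
    p_e = ((tp + fp) / n) * ((tp + fn) / n) + ((fn + tn) / n) * ((fp + tn) / n)
    kappa = (p_o - p_e) / (1 - p_e)
    return accuracy, precision, recall, kappa

def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

A kappa of 0.35, as on the training set, thus indicates agreement moderately better than would be expected by chance alone, even though raw accuracy is 67%.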