
Table 3 More detailed classifier performance statistics. For each set of genes tested, five performance statistics were calculated. Accuracy is the overall accuracy of the classifier; precision reflects the classifier's specificity and recall reflects its sensitivity. The area under the curve (AUC) is the area beneath the ROC curve drawn for each gene set (see Figure 3) and summarises classifier performance across all combinations of sensitivity and specificity. It ranges from 0 to 1, where 1 represents 100% accuracy, 0.5 represents performance no better than random and 0 represents 0% accuracy. The Kappa statistic measures agreement between predicted and actual classifications and takes false-positive rates into account. It ranges from 0 (symbolising no agreement) to 1 (symbolising perfect agreement between predicted and actual classifications).

From: Speeding disease gene discovery by sequence based candidate prioritization

| Test set | Nodes in tree | Accuracy | Precision | Recall | AUC | Kappa |
|---|---|---|---|---|---|---|
| Training (OMIM) set | 15 | 67% | 65% | 77% | 0.75 | 0.35 |
| 10× cross-validation | 15 | 63% | 62% | 70% | 0.70 | 0.27 |
| HGMD set | 15 | 64.5% | 63% | 71% | 0.69 | 0.29 |
| Oligogenic set | 15 | 65% | 63% | 72% | 0.76 | 0.31 |
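As a minimal sketch of how the caption's statistics relate to a 2×2 confusion matrix, the snippet below derives accuracy, precision, recall, and Cohen's Kappa from hypothetical true/false positive and negative counts. The counts are illustrative assumptions, not taken from the paper; they are merely chosen so the derived values come out close to the training-set row above.

```python
# Hypothetical confusion-matrix counts for a binary disease-gene classifier
# (illustrative only; the paper does not report these raw counts).
tp, fp, fn, tn = 77, 42, 23, 58

total = tp + fp + fn + tn
accuracy = (tp + tn) / total        # overall fraction classified correctly
precision = tp / (tp + fp)          # positive predictive value
recall = tp / (tp + fn)             # sensitivity

# Cohen's Kappa: observed agreement corrected for chance agreement,
# where chance agreement comes from the row and column marginals.
p_observed = accuracy
p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
kappa = (p_observed - p_chance) / (1 - p_chance)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} kappa={kappa:.2f}")
```

With these assumed counts the script yields accuracy 0.675, recall 0.77 and Kappa 0.35, matching the rounded figures reported for the training (OMIM) set.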