Table 4. Relative contribution of each feature to classification as a disease gene. Estimated relative contribution of each sequence feature to the final score that the alternating decision tree uses to classify genes as involved in disease. Percentages are the average absolute contribution of each feature to the cumulative absolute score of each disease gene in the training set.

From: Speeding disease gene discovery by sequence based candidate prioritization

Feature                     % contribution to final score
Signal peptide              23%
Mouse homolog % identity    21%
Length of 3' UTR            12%
Number of exons             7%
Rat homolog % identity      7%
Worm homolog % identity     6%
GC                          6%
CDS length                  5%
Gene length                 4%
Mouse homolog Ka            3%
Paralog % identity          2%
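
The caption describes the percentages only in words, so a small sketch may help make the calculation concrete. The Python below shows one possible reading of it: for each gene, take each feature's absolute contribution as a fraction of that gene's summed absolute contributions, then average those fractions over all genes. The function name `relative_contributions`, the input layout (gene ID mapped to per-feature signed scores), and the toy numbers are all assumptions made for illustration; they are not taken from the paper, and the authors' exact procedure may differ.

```python
from collections import defaultdict


def relative_contributions(per_gene_scores):
    """Return each feature's average share (in %) of the absolute score.

    per_gene_scores maps a gene ID to a dict of
    {feature name: signed contribution to that gene's final score}.
    This input layout is hypothetical, chosen only for illustration.
    """
    share_sums = defaultdict(float)
    genes_counted = 0
    for feature_scores in per_gene_scores.values():
        # Cumulative absolute score of this gene.
        total = sum(abs(value) for value in feature_scores.values())
        if total == 0:
            continue  # no feature contributed; skip to avoid dividing by zero
        genes_counted += 1
        for feature, value in feature_scores.items():
            # Fraction of this gene's absolute score owed to this feature.
            share_sums[feature] += abs(value) / total
    if genes_counted == 0:
        return {}
    # Average the per-gene fractions and express them as percentages.
    return {
        feature: 100.0 * share_sum / genes_counted
        for feature, share_sum in share_sums.items()
    }


# Toy data (invented): two genes, three features, signed score contributions.
toy_scores = {
    "geneA": {"signal_peptide": 3.0, "utr3_length": -1.0, "exon_count": 1.0},
    "geneB": {"signal_peptide": -1.0, "utr3_length": 3.0, "exon_count": 1.0},
}

shares = relative_contributions(toy_scores)
for feature, percent in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{feature}: {percent:.0f}%")
```

Absolute values are used so that a feature that lowers the score counts as much as one that raises it by the same amount, which is how the caption's "absolute contribution" is read here. A feature missing from a gene's dictionary is simply treated as contributing nothing to that gene.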