Table 4. Relative contribution of each feature to classification as a disease gene. An estimate of the relative contribution of each sequence feature to the final score that the alternating decision tree uses to classify genes as involved in disease. Each percentage is the feature's average absolute contribution to the cumulative absolute score, taken over the disease genes in the training set.

From: Speeding disease gene discovery by sequence based candidate prioritization

| Feature | % Contribution to final score |
| --- | --- |
| Signal peptide | 23% |
| Mouse homolog % identity | 21% |
| Length of 3' UTR | 12% |
| Number of exons | 7% |
| Rat homolog % identity | 7% |
| Worm homolog % identity | 6% |
| GC | 6% |
| CDS length | 5% |
| Gene length | 4% |
| Mouse homolog Ka | 3% |
| Paralog % identity | 2% |
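
The caption's description of how the percentages are derived can be read in more than one way. The sketch below shows one plausible reading, not the authors' actual procedure: for each disease gene, each feature's absolute score contribution is divided by the sum of absolute contributions over all features, and these per-gene shares are then averaged across genes. The function name `relative_contributions`, the input layout (one dictionary of signed per-feature contributions per gene), and the toy numbers are all assumptions made for illustration; none come from the paper.

```python
def relative_contributions(per_gene_contributions):
    """Average each feature's share of the absolute score across genes.

    per_gene_contributions: list of dicts, one per gene, mapping a feature
    name to that feature's signed contribution to the gene's final score
    (hypothetical layout, assumed for this sketch).
    Returns a dict mapping feature name to a percentage.
    """
    share_totals = {}
    genes_counted = 0
    for contributions in per_gene_contributions:
        # Cumulative absolute score for this gene.
        absolute_sum = sum(abs(value) for value in contributions.values())
        if absolute_sum == 0:
            continue  # a gene with no contributions carries no information
        genes_counted += 1
        for feature, value in contributions.items():
            share = abs(value) / absolute_sum
            share_totals[feature] = share_totals.get(feature, 0.0) + share
    if genes_counted == 0:
        return {}
    return {
        feature: 100.0 * total / genes_counted
        for feature, total in share_totals.items()
    }


# Toy input with made-up contributions for two genes (not data from the paper).
toy_genes = [
    {"signal_peptide": 5, "mouse_identity": -3, "utr3_length": 2},
    {"signal_peptide": -1, "mouse_identity": 6, "utr3_length": 3},
]
shares = relative_contributions(toy_genes)
for feature, percent in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{feature}: {percent:.0f}%")
```

Using absolute values means a feature that lowers the score counts as much as one that raises it by the same amount, which matches the caption's wording about absolute contribution. With this reading the shares computed over a full feature set sum to 100% before rounding.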