Table 2. Significant differences between the control set and the disease set of genes. Features that differ significantly between Ensembl genes found in OMIM (the disease set) and those not in OMIM (the control set). Significance was calculated using the Mann-Whitney U test unless otherwise noted.

From: Speeding disease gene discovery by sequence based candidate prioritization

| Feature | Median in control set | Median in disease set | Significance |
|---|---|---|---|
| Gene length | 19 kb | 27 kb | P < 0.001 |
| cDNA length | 2,126 bp | 2,442 bp | P < 0.001 |
| Protein length | 383 aa | 494 aa | P < 0.001 |
| 3' UTR length | 446 bp | 488 bp | P < 0.01 |
| Exon number | 8 | 10 | P < 0.001 |
| Distance to neighbouring gene | 46 kb | 52 kb | P < 0.01 |
| Protein identity with BRH in mouse | 80% | 87% | P < 0.001 |
| Gene encodes signal peptide | 17% | 35% | P < 0.0001 (calculated using the chi squared test) |
| 5' CpG islands | 12% | 16% | P < 0.028 (calculated using the chi squared test) |
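
The two tests named in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names `mann_whitney_u` and `chi_squared_2x2` and all sample values are hypothetical, the Mann-Whitney p-value assumes the large-sample normal approximation (with tie correction, without continuity correction), and the chi-squared test is the plain Pearson test on a 2x2 table.

```python
import math


def mann_whitney_u(xs, ys):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for xs, approximate p-value).
    """
    n1, n2 = len(xs), len(ys)
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - mean_u) / math.sqrt(var_u)
    return u1, math.erfc(abs(z) / math.sqrt(2))


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test on the 2x2 table [[a, b], [c, d]].

    Returns (chi-squared statistic, p-value for 1 degree of freedom).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the survival function reduces to erfc.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical gene lengths in kb for a control and a disease sample.
control_lengths = [12, 15, 19, 21, 18, 16, 22]
disease_lengths = [25, 27, 31, 24, 29, 26, 30]
u_stat, p_lengths = mann_whitney_u(control_lengths, disease_lengths)

# Hypothetical counts: genes with / without a signal peptide in each set,
# chosen only to mirror the 17% versus 35% proportions in the table.
chi2_stat, p_signal = chi_squared_2x2(170, 830, 350, 650)
```

The Mann-Whitney test suits the length and count features because it compares ranks and makes no normality assumption, while the chi-squared test applies to the two yes/no features (signal peptide, 5' CpG island), where each gene contributes a category rather than a measurement.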