Table 4 Significant Parameters Utilized by C4.5 Classification Tree as Properties to Determine aTIS

From: Bioinformatic analyses of mammalian 5'-UTR sequence properties of mRNAs predicts alternative translation initiation sites

Parameters                   aTIS Sequences         non-aTIS Sequences
                             Mean      Std. Dev.    Mean      Std. Dev.
Number of AUGs               2.41      3.16         1.72      2.73
5' UTR Length                477.23    1303.24      100.24    278.97
(-7,-6) Consensus Sequence   60%       n/a          27.6%     n/a
IRES                         58%       n/a          11.84%    n/a
G/C Ratio                    1.11      0.47         1.00      0.33
Critical parameters in the classification tree are shown with their average values for alternative start sites (aTIS) and non-alternative start sites (non-aTIS). Differences between these values enabled effective classification of mRNAs with aTIS. The number of AUGs is the count of AUG triplets in the 5' UTR. The consensus sequence parameter is a match against a C at the -7 position relative to the start site and a G or C at the -6 position. The sequences are also scanned for IRES secondary structure, and the ratio of guanine to cytosine in the 5' UTR is measured. Although the G/C ratio was the least important variable, it produced a split in the classification tree and is therefore listed as a parameter required for the final classification results.
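
The sequence-derived parameters described above can be illustrated with a short Python sketch. It is not the authors' pipeline. The function name `utr_features`, the returned keys, and the indexing convention are assumptions made for illustration: the input is taken to be the 5' UTR only, ending immediately before the start codon, so that position -1 is the last base of the string. IRES detection is omitted because it requires a secondary-structure scan that the table does not specify.

```python
from typing import Optional


def utr_features(utr: str) -> dict:
    """Compute illustrative 5' UTR features (hypothetical helper).

    Assumes `utr` holds only the 5' UTR, ending right before the
    start codon, so the base at position -k is utr[-k].
    """
    # Normalise to upper-case RNA so DNA-style input (T) also works.
    seq = utr.upper().replace("T", "U")

    # Count AUG triplets in the UTR (AUG cannot overlap itself).
    n_aug = seq.count("AUG")

    # Consensus: C at -7 and G or C at -6 relative to the start site.
    has_consensus = len(seq) >= 7 and seq[-7] == "C" and seq[-6] in "GC"

    # Ratio of guanine to cytosine; undefined when there is no C.
    g, c = seq.count("G"), seq.count("C")
    gc_ratio: Optional[float] = g / c if c else None

    return {
        "n_aug": n_aug,
        "utr_length": len(seq),
        "consensus_7_6": has_consensus,
        "gc_ratio": gc_ratio,
    }


if __name__ == "__main__":
    # Toy sequence, not taken from the paper's data set.
    print(utr_features("GGATGCGAAAAA"))
```

Per-sequence values like these would then be averaged within the aTIS and non-aTIS groups to obtain summary rows of the kind shown in the table.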