Fig. 3 | BMC Bioinformatics

Fig. 3

From: IRESpy: an XGBoost model for prediction of internal ribosome entry sites

Calculation of triplet features. An example of triplet features in the Cricket paralysis virus (CrPV) intergenic region (IGR) is shown. The secondary structure of the candidate sequence was predicted using UNAfold [29]. Each nucleotide has only two possible states, paired or unpaired. Parentheses “()” and dots “.” represent paired and unpaired nucleotides in the predicted secondary structure, respectively. For any 3 adjacent bases, there are 8 possible structural states: “(((”, “((.”, “(..”, “(.(”, “.((”, “.(.”, “..(”, and “...”. Triplet features comprise the structural state plus the identity of the central base (A, C, G, or U), so there are 32 (8 × 4 = 32) triplet features in total. Triplet features are normalized by dividing the observed count of each triplet by the total count of all triplet features.
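The counting and normalization described in the caption can be illustrated with a minimal Python sketch. It is not the IRESpy implementation. It assumes the structure is supplied as a dot-bracket string of the same length as the sequence, that opening and closing parentheses both count as "paired" (which is what yields 8 structural states), and that the first and last nucleotides are skipped because they have no full triplet. The function name `triplet_features` and the feature labels (central base followed by the structural state, e.g. `G(((`) are hypothetical choices made for this example.

```python
from collections import Counter
from itertools import product

BASES = "ACGU"
# 8 structural states: every combination of paired "(" and unpaired "." over 3 positions.
STATES = ["".join(p) for p in product("(.", repeat=3)]
# 32 feature labels (hypothetical naming): central base + structural state.
FEATURES = [base + state for state in STATES for base in BASES]


def triplet_features(sequence, structure):
    """Return normalized triplet feature frequencies for one sequence.

    Assumption: `structure` is dot-bracket notation; ")" is treated as "(".
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    seq = sequence.upper().replace("T", "U")
    struct = structure.replace(")", "(")  # only paired vs. unpaired matters

    counts = Counter()
    # Slide over every position that has a neighbour on both sides.
    for i in range(1, len(seq) - 1):
        base = seq[i]
        state = struct[i - 1:i + 2]
        # Skip ambiguous bases or unexpected structure symbols.
        if base in BASES and set(state) <= {"(", "."}:
            counts[base + state] += 1

    total = sum(counts.values())
    # Normalize each count by the total number of observed triplets.
    return {f: (counts[f] / total if total else 0.0) for f in FEATURES}


if __name__ == "__main__":
    # Toy hairpin, not the CrPV IGR sequence from the figure.
    feats = triplet_features("GGGAAACCC", "(((...)))")
    for name, value in feats.items():
        if value > 0:
            print(name, round(value, 3))
```

Returning all 32 features, including those with zero frequency, keeps the feature vector at a fixed length regardless of the input sequence.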
