Feature | # variables | Normalization method | Ref. |
---|---|---|---|
Nucleotide composition* | 84 | Individual nucleotide frequency divided by total nucleotide frequency | [25] |
Transcript length** | 4 | Binary coding: length intervals < 100, 400, 900 and > 900 | [26] |
Amino acid composition§ | 20 | Individual amino acid frequency divided by total amino acid frequency | [27] |
ORF length§ | 4 | Binary coding: length intervals < 20, 60, 100 and > 100 | [28] |
Isoelectric point§ | 1 | Value divided by 14 | [29] |
Compositional entropy§ | 1 | Amount of low complexity residues divided by sequence length | [30] |
Mean hydropathy§ | 1 | Summed means from sliding 3nt window | [31] |
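
The composition and length normalizations in the table can be illustrated with a short Python sketch. It is not the authors' implementation. The function names `kmer_frequencies` and `length_bins` are hypothetical. The sketch assumes that the 84 nucleotide variables are the 4 mono-, 16 di- and 64 trinucleotide frequencies, each normalized within its own k, and that the length intervals are disjoint and one-hot coded. The table does not say which interval a boundary value such as 900 belongs to, so the handling below is also an assumption.

```python
from itertools import product


def kmer_frequencies(seq, k_values=(1, 2, 3), alphabet="ACGT"):
    """Relative frequency of every k-mer, normalized separately for each k.

    Assumption: 4 + 16 + 64 = 84 variables for k = 1, 2, 3.
    """
    seq = seq.upper()
    features = {}
    for k in k_values:
        kmers = ["".join(p) for p in product(alphabet, repeat=k)]
        counts = dict.fromkeys(kmers, 0)
        # Count overlapping k-mers; windows with other symbols (e.g. N) are skipped.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in counts:
                counts[kmer] += 1
        total = sum(counts.values())
        for kmer in kmers:
            features[kmer] = counts[kmer] / total if total else 0.0
    return features


def length_bins(length, upper_bounds=(100, 400, 900)):
    """One-hot coding of a length over four intervals.

    Assumption: a value equal to a bound falls into the next interval up.
    """
    vector = [0] * (len(upper_bounds) + 1)
    for i, bound in enumerate(upper_bounds):
        if length < bound:
            vector[i] = 1
            return vector
    vector[-1] = 1  # length is at or above the last bound
    return vector


def scaled_isoelectric_point(pi_value):
    """Scale an isoelectric point to [0, 1] by dividing by 14, as in the table."""
    return pi_value / 14
```

Passing `upper_bounds=(20, 60, 100)` to `length_bins` gives the ORF length coding, and the same frequency calculation with a 20-letter alphabet and `k_values=(1,)` gives the amino acid composition.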