Table 7 Interpretable biological features selected by the decision tree model based on sequence and structural (Seq + Str) features for the 746 dataset

From: Prediction of HIV-1 protease cleavage site using a combination of sequence, structural, and physicochemical features

Rank  Variable  Description                             Importance
1     RSA_4     RSA at the 4th position of an octamer   1.0000
2     MT        DipC of methionine & threonine          0.7855
3     C         AAC of cysteine                         0.6060
4     RSA_5     RSA at the 5th position of an octamer   0.5238
5     VH        DipC of valine & histidine              0.4059
6     AE        DipC of alanine & glutamic acid         0.3769
7     FL        DipC of phenylalanine & leucine         0.3268
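
As a rough illustration of the sequence-derived rows in the table, the sketch below computes the AAC and DipC variables for a single octamer. It assumes that AAC and DipC denote amino acid composition and dipeptide composition, and that each is normalized as a simple fraction (residue count over peptide length, and dipeptide count over the number of adjacent pairs); the paper's exact normalization may differ. The helper names `aac` and `dipc` and the example octamer are hypothetical choices for illustration, and the RSA variables are omitted because they require a separate solvent-accessibility predictor.

```python
def aac(peptide: str, residue: str) -> float:
    """Fraction of positions in the peptide occupied by `residue` (assumed AAC definition)."""
    return peptide.count(residue) / len(peptide)


def dipc(peptide: str, dipeptide: str) -> float:
    """Fraction of adjacent residue pairs equal to `dipeptide` (assumed DipC definition)."""
    # Enumerate overlapping pairs explicitly; str.count would skip overlapping matches.
    pairs = [peptide[i:i + 2] for i in range(len(peptide) - 1)]
    return pairs.count(dipeptide) / len(pairs)


# Illustrative octamer only; not taken from the 746 dataset.
octamer = "SQNYPIVQ"

# Sequence-based variables from the table (RSA_4 and RSA_5 are not computed here).
features = {
    "C": aac(octamer, "C"),
    "MT": dipc(octamer, "MT"),
    "VH": dipc(octamer, "VH"),
    "AE": dipc(octamer, "AE"),
    "FL": dipc(octamer, "FL"),
}

for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

An octamer has 8 residues and 7 adjacent pairs, so under these assumptions each AAC value is a multiple of 1/8 and each DipC value a multiple of 1/7.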