Fig. 5 | BMC Bioinformatics

Fig. 5

From: A machine learning strategy for predicting localization of post-translational modification sites in protein-protein interacting regions

The sampling strategy for balancing the interacting and non-interacting sub-datasets. The larger non-interacting sub-dataset was clustered by GibbsCluster into 10 clusters, each containing sequences that represent a different characteristic motif; the motifs shown here are illustrative examples from the phosphorylation dataset. Equal numbers of sequences were randomly selected from each cluster and combined to create a reduced non-interacting sub-dataset that was similar in size to the interacting sub-dataset.
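The equal-per-cluster sampling step described in the caption can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes the cluster assignments have already been produced (for example by GibbsCluster) and are available as a list of labels. The function name `balanced_sample`, its parameters, and the rule for clusters smaller than the quota are hypothetical choices made for this sketch.

```python
import random
from collections import defaultdict


def balanced_sample(sequences, cluster_ids, target_size, seed=0):
    """Draw about target_size sequences, taking an equal number from each cluster.

    sequences   -- list of peptide sequences (the larger, non-interacting set)
    cluster_ids -- cluster label for each sequence, same length as sequences
    target_size -- desired size of the reduced set (e.g. the size of the
                   interacting sub-dataset)
    """
    rng = random.Random(seed)

    # Group sequences by their cluster label.
    clusters = defaultdict(list)
    for seq, cid in zip(sequences, cluster_ids):
        clusters[cid].append(seq)

    # Equal quota per cluster; any remainder is dropped, so the result is
    # "similar in size" to the target rather than exactly equal.
    per_cluster = target_size // len(clusters)

    reduced = []
    for cid in sorted(clusters):
        members = clusters[cid]
        # Assumption: a cluster smaller than the quota contributes all members.
        k = min(per_cluster, len(members))
        reduced.extend(rng.sample(members, k))
    return reduced


# Toy usage: 1000 placeholder sequences spread over 10 clusters, reduced to 200.
toy_sequences = [f"SEQ{i:04d}" for i in range(1000)]
toy_clusters = [i % 10 for i in range(1000)]
reduced_set = balanced_sample(toy_sequences, toy_clusters, target_size=200)
```

Sampling within clusters rather than from the pooled set keeps every motif group represented in the reduced sub-dataset, which a plain random down-sample would not guarantee.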
