Fig. 4 | BMC Bioinformatics

Fig. 4

From: Inferring RNA sequence preferences for poorly studied RNA-binding proteins based on co-evolution

Analyses of the number of homologous RBPs and their sequence similarities to the target RBP for the KNN algorithm. The figure is based on the RRM-FL set from the InVitro dataset. a Box plot of the preference prediction performance for five sequence similarity bins. The x-axis shows the similarity between the target RBP and its nearest neighbor (1NN). The y-axis shows the in vitro performance (PCC) obtained when 1NN is used for preference prediction. The red dashed line connects the mean values of the bins. A significant positive correlation (0.3, p-value < 0.05) was observed between the PCC and the sequence similarity values. b Box plot of the number of neighbors needed for the five sequence similarity bins. The x-axis is the same as that in a. The y-axis denotes the optimal number of neighbors (optK) to use in the KNN algorithm. The optK value was determined through cross-validation. A negative correlation (-0.17) was observed between the optK and the sequence similarity values.
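
The binning and correlation summarized in panel a can be illustrated with a minimal sketch. It assumes five equal-width similarity bins on [0, 1] and a Pearson correlation; the caption specifies neither the bin edges nor the correlation type. The function names `bin_means` and `pearson` and the toy values are hypothetical and are not data from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def bin_means(similarities, scores, n_bins=5):
    """Mean score per equal-width similarity bin on [0, 1].

    Assumption: equal-width bins; an empty bin yields None.
    """
    bins = [[] for _ in range(n_bins)]
    for sim, score in zip(similarities, scores):
        # A similarity of exactly 1.0 is placed in the last bin.
        index = min(int(sim * n_bins), n_bins - 1)
        bins[index].append(score)
    return [sum(b) / len(b) if b else None for b in bins]


# Toy values for illustration only (not taken from the figure).
similarity_to_1nn = [0.1, 0.3, 0.5, 0.7, 0.9]
pcc_with_1nn = [0.2, 0.3, 0.35, 0.5, 0.6]

means_per_bin = bin_means(similarity_to_1nn, pcc_with_1nn)
correlation = pearson(similarity_to_1nn, pcc_with_1nn)
```

Here `means_per_bin` corresponds to the points joined by the red dashed line, and `correlation` plays the role of the reported correlation between PCC and sequence similarity. The same two functions could be applied to optK values to mirror panel b.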
