Figure 1 | BMC Bioinformatics

Figure 1

From: Prediction of RNA-binding amino acids from protein and RNA sequences

Loss of binding information during redundancy reduction based on sequence similarity. The protein sequences shown were clustered by CD-HIT. Protein chain A of the protein-RNA complex 1F7Y (1F7Y:A) was selected as the representative sequence because it was the longest. In the representative sequence, the boxed residues were determined to be non-binding, whereas the residues at similar positions in the non-selected protein sequences were determined to be binding. Hence, the binding information of the non-selected sequences is absent from the training data, which include only the binding information of the selected sequences.
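
The effect described in the caption can be illustrated with a minimal sketch. Everything below is a hypothetical toy: the chain names, sequences, and binding labels are invented (they are not the 1F7Y data), the sequences are assumed to be already aligned, and CD-HIT itself is not called. The code only mimics the "keep the longest member as representative" rule and then lists the alignment columns where the representative is labeled non-binding while another cluster member is labeled binding.

```python
# Toy cluster (invented data): each member maps to
#   (aligned sequence with '-' for gaps,
#    per-column label: '1' binding, '0' non-binding, '.' gap).
cluster = {
    "chainA": ("MKTAYIAKQR", "0010001000"),
    "chainB": ("MKTAYIAK--", "01100011.."),
    "chainC": ("-KTAYIAKQ-", ".01010100."),
}


def pick_representative(cluster):
    """Return the ID of the member with the most non-gap residues."""
    return max(cluster, key=lambda k: sum(c != "-" for c in cluster[k][0]))


def lost_binding_columns(cluster, rep_id):
    """Columns where the representative is non-binding but another member binds."""
    rep_labels = cluster[rep_id][1]
    lost = []
    for col, rep_label in enumerate(rep_labels):
        if rep_label != "0":
            continue  # representative binds here (or has a gap)
        others = (labels[col] for k, (_, labels) in cluster.items() if k != rep_id)
        if any(label == "1" for label in others):
            lost.append(col)
    return lost


rep = pick_representative(cluster)
lost = lost_binding_columns(cluster, rep)
print(rep, lost)
```

In this toy cluster the longest member is kept, and the reported columns are binding sites that would be dropped from the training data once the other members are discarded.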
