Table 3. Number of sequences and number of positive (+) and negative (-) instances in the non-redundant RNA- and DNA-protein sequence data sets used in our experiments, for 30%, 60%, and 90% identity cutoffs.

From: Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling

| Data Set | Number of Sequences | Number of + Instances | Number of - Instances |
|---|---|---|---|
| RNA-prot 30% | 180 | 5398 | 27837 |
| RNA-prot 60% | 215 | 6689 | 32073 |
| RNA-prot 90% | 246 | 7798 | 34675 |
| DNA-prot 30% | 257 | 5326 | 53494 |
| DNA-prot 60% | 289 | 5974 | 58031 |
| DNA-prot 90% | 317 | 6551 | 60877 |
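
The counts in Table 3 show that negative instances heavily outnumber positive ones in every data set. As a minimal sketch, the share of positive instances can be computed directly from the table. The names `TABLE_3` and `positive_fraction` are hypothetical and chosen for illustration only; they do not come from the article.

```python
# Counts copied from Table 3: (sequences, positive instances, negative instances).
# The dictionary name and layout are illustrative assumptions, not the authors' code.
TABLE_3 = {
    "RNA-prot 30%": (180, 5398, 27837),
    "RNA-prot 60%": (215, 6689, 32073),
    "RNA-prot 90%": (246, 7798, 34675),
    "DNA-prot 30%": (257, 5326, 53494),
    "DNA-prot 60%": (289, 5974, 58031),
    "DNA-prot 90%": (317, 6551, 60877),
}


def positive_fraction(pos: int, neg: int) -> float:
    """Return the share of positive instances among all instances."""
    return pos / (pos + neg)


if __name__ == "__main__":
    for name, (n_seq, pos, neg) in TABLE_3.items():
        total = pos + neg
        frac = positive_fraction(pos, neg)
        print(f"{name}: {n_seq} sequences, {total} instances, {frac:.1%} positive")
```

With these counts, positive instances make up about 16% to 18% of the RNA-protein data sets and about 9% to 10% of the DNA-protein data sets.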