Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling

BMC Bioinformatics

Table 3 Number of sequences and of positive (+) and negative (-) instances in the non-redundant RNA- and DNA-protein sequence data sets used in our experiments, at 30%, 60%, and 90% identity cutoffs.

Data Sets	Number of Sequences	Number of + Instances	Number of - Instances
RNA-prot 30%	180	5398	27837
RNA-prot 60%	215	6689	32073
RNA-prot 90%	246	7798	34675
DNA-prot 30%	257	5326	53494
DNA-prot 60%	289	5974	58031
DNA-prot 90%	317	6551	60877
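
The counts in Table 3 imply a marked class imbalance, which can be made explicit with a small calculation. The following is a minimal sketch, not code from the paper: the names `TABLE_3`, `positive_fraction`, and `summarize` are illustrative assumptions, and the numbers are copied directly from the table rows.

```python
# Counts copied from Table 3: (sequences, positive instances, negative instances).
# The dictionary name and layout are illustrative, not taken from the paper.
TABLE_3 = {
    "RNA-prot 30%": (180, 5398, 27837),
    "RNA-prot 60%": (215, 6689, 32073),
    "RNA-prot 90%": (246, 7798, 34675),
    "DNA-prot 30%": (257, 5326, 53494),
    "DNA-prot 60%": (289, 5974, 58031),
    "DNA-prot 90%": (317, 6551, 60877),
}


def positive_fraction(positives, negatives):
    """Return the fraction of all labeled instances that are positive."""
    return positives / (positives + negatives)


def summarize(table):
    """Map each data set name to (total instances, positive fraction)."""
    return {
        name: (pos + neg, positive_fraction(pos, neg))
        for name, (_, pos, neg) in table.items()
    }


if __name__ == "__main__":
    for name, (total, fraction) in summarize(TABLE_3).items():
        print(f"{name}\t{total}\t{fraction:.3f}")
```

Under these counts, positive instances make up roughly 16 to 18% of the RNA-protein data sets and roughly 9 to 10% of the DNA-protein data sets.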

ISSN: 1471-2105