Table 3. Number of sequences, as well as positive and negative instances, used in our experiments for the RNA- and DNA-protein data sets. Number of sequences, as well as number of positive (+) and negative (-) instances, in the non-redundant RNA- and DNA-protein sequence data sets for 30%, 60%, and 90% identity cutoffs.

From: Mixture of experts models to exploit global sequence similarity on biomolecular sequence labeling

| Data Sets | Number of Sequences | Number of + Instances | Number of - Instances |
|---|---|---|---|
| RNA-prot 30% | 180 | 5398 | 27837 |
| RNA-prot 60% | 215 | 6689 | 32073 |
| RNA-prot 90% | 246 | 7798 | 34675 |
| DNA-prot 30% | 257 | 5326 | 53494 |
| DNA-prot 60% | 289 | 5974 | 58031 |
| DNA-prot 90% | 317 | 6551 | 60877 |
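
The counts in Table 3 can be turned into a rough picture of class balance with a little arithmetic. The following is a minimal sketch, not part of the original article: the names `TABLE3` and `positive_fraction` are hypothetical, and the only data used are the counts listed in the table. It computes, for each data set, the share of instances that are positive.

```python
# Counts copied from Table 3: (number of sequences, + instances, - instances).
# TABLE3 and positive_fraction are illustrative names, not from the source.
TABLE3 = {
    "RNA-prot 30%": (180, 5398, 27837),
    "RNA-prot 60%": (215, 6689, 32073),
    "RNA-prot 90%": (246, 7798, 34675),
    "DNA-prot 30%": (257, 5326, 53494),
    "DNA-prot 60%": (289, 5974, 58031),
    "DNA-prot 90%": (317, 6551, 60877),
}


def positive_fraction(n_pos: int, n_neg: int) -> float:
    """Return the fraction of all instances that are positive."""
    total = n_pos + n_neg
    if total == 0:
        raise ValueError("data set has no instances")
    return n_pos / total


if __name__ == "__main__":
    for name, (n_seq, n_pos, n_neg) in TABLE3.items():
        frac = positive_fraction(n_pos, n_neg)
        print(f"{name}: {n_seq} sequences, {frac:.1%} positive instances")
```

Under these counts, positive instances make up less than one fifth of every data set, and the DNA-protein sets are more skewed toward negative instances than the RNA-protein sets.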