| Name | #Positives | #Negatives | Imbalance |
|---|---|---|---|
| human | 1 406 | 81 228 | 57.8 |
| arabidopsis | 231 | 28 359 | 122.8 |
| animal | 7 053 | 218 154 | 30.9 |
| plant | 2 172 | 114 929 | 52.9 |
| virus | 237 | 839 | 3.5 |
| microPred | 691 | 9 248 | 13.4 |
- Characteristics of the biological datasets used in the experiments. Imbalance is defined as the ratio of #Negatives to #Positives. We limited dataset imbalance to several tens for practical reasons, even though the proportions of miRNAs to non-miRNAs in genomes are more extreme. In the case of the virus dataset, the imbalance is exceptionally low because we wanted to know how the methods perform on moderately imbalanced problems. In addition, it is difficult to create a representative dataset for viruses, as their genomes differ significantly in size and most of them do not contain miRNAs.
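
The Imbalance column can be recomputed from the two count columns using the definition in the caption (#Negatives divided by #Positives). The following is a minimal sketch of that arithmetic. The names `DATASETS`, `imbalance_ratio`, and `imbalance_table` are illustrative assumptions and do not come from the original work. Rounding to one decimal place is likewise assumed, chosen to match the precision shown in the table.

```python
# Counts copied from the table: name -> (#Positives, #Negatives).
DATASETS = {
    "human": (1406, 81228),
    "arabidopsis": (231, 28359),
    "animal": (7053, 218154),
    "plant": (2172, 114929),
    "virus": (237, 839),
    "microPred": (691, 9248),
}


def imbalance_ratio(positives: int, negatives: int) -> float:
    """Return #Negatives / #Positives (hypothetical helper, not the authors' code)."""
    if positives <= 0:
        raise ValueError("positives must be greater than zero")
    return negatives / positives


def imbalance_table(datasets: dict) -> dict:
    """Map each dataset name to its imbalance, rounded to one decimal place."""
    return {
        name: round(imbalance_ratio(pos, neg), 1)
        for name, (pos, neg) in datasets.items()
    }


if __name__ == "__main__":
    for name, ratio in imbalance_table(DATASETS).items():
        print(f"{name}: {ratio}")
```

Running the script prints one ratio per dataset, which can be compared directly against the Imbalance column.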