From: A statistical approach for 5′ splice site prediction using short sequence motifs and without encoding sequence data
| Type of data | Total number of sequences (TSS, FSS) | Number of redundant sequences (TSS, FSS) |
|---|---|---|
| Balanced | (2796, 2796) | (830, 102) |
| Imbalanced-I | (2796, 5000) | (830, 231) |
| Imbalanced-II | (2796, 10000) | (830, 828) |
| Imbalanced-III | (2796, 15000) | (830, 1727) |