
Table 3 Data efficiency of the STSS algorithm.

From: Self-training in significance space of support vectors for imbalanced biomedical event data

| Class label | Original training dataset: number of instances | Original training dataset: imbalance ratio | STSS training dataset (Ds): number of instances | STSS training dataset (Ds): imbalance ratio |
|---|---|---|---|---|
| AtLoc | Pos: 48; Neg: 3661527 | 1:76282 | Pos: 48; Neg: 128761 | 1:2682 |
| Cause | Pos: 1117; Neg: 366045 | 1:3277 | Pos: 1117; Neg: 27505 | 1:24 |
| Cause-Theme | Pos: 6; Neg: 3661521 | 1:610261 | Pos: 6; Neg: 6000 | 1:1000 |
| Site | Pos: 425; Neg: 36661150 | 1:8614 | Pos: 425; Neg: 36627 | 1:86 |
| Theme | Pos: 9246; Neg: 3652329 | 1:395 | Pos: 9246; Neg: 30915 | 1:3 |
| ToLoc | Pos: 50; Neg: 3661525 | 1:73231 | Pos: 50; Neg: 167120 | 1:3342 |
  1. An imbalance ratio close to 1.0 is preferred.
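
The imbalance ratios in the table appear to be the number of negative instances per positive instance, written as 1:N. The following is a minimal sketch of that arithmetic, not the authors' code. The function name `imbalance_ratio` is hypothetical, and rounding to the nearest integer is an assumption, since the source does not state how the ratios were rounded.

```python
def imbalance_ratio(pos: int, neg: int) -> str:
    """Return the positive-to-negative ratio as '1:N'.

    N is neg / pos rounded to the nearest integer (assumed convention).
    """
    if pos <= 0:
        raise ValueError("pos must be a positive count")
    return f"1:{round(neg / pos)}"


# Counts copied from rows of Table 3.
rows = {
    "AtLoc (original)": (48, 3661527),
    "Theme (original)": (9246, 3652329),
    "Theme (STSS)": (9246, 30915),
    "Cause-Theme (STSS)": (6, 6000),
}

for label, (pos, neg) in rows.items():
    print(f"{label}: {imbalance_ratio(pos, neg)}")
```

For these four rows the computed values agree with the ratios listed in the table; other rows may differ slightly depending on the rounding convention used.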