Table 2. Number of data points from the unannotated corpus used in our system, averaged over the six separate SVMs (one per relation type).

From: Semi-supervised Learning for the BioNLP Gene Regulation Network

| System | Positives | Negatives |
| --- | --- | --- |
| [BASIC] | 616 | 0 |
| [PRE-SEL], select POS, no NEG | 563 (= 91.3%) | 0 |
| [PRE-SEL], select POS, select NEG | 563 | 9,967 (= 90.6%) |
  1. Percentages in parentheses give the selected data points as a share of the respective total candidate pool. The positive candidate pool consists of all data points that satisfy the distant supervision criterion (n = 616), as used in the [BASIC] system; the negative pool is the complement of the positive pool (n = 11,162).
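
As a rough illustration of how the pool shares in the footnote can be read, the following sketch recomputes them from the averaged counts in the table. The function name `pool_share` and the constant names are hypothetical and not taken from the paper, and the sketch assumes a share is simply the selected count divided by the pool size.

```python
# Averaged counts copied from Table 2 (names are illustrative, not the authors').
POSITIVE_POOL = 616      # candidates meeting the distant supervision criterion
NEGATIVE_POOL = 11_162   # complement of the positive pool
SELECTED_POSITIVES = 563
SELECTED_NEGATIVES = 9_967


def pool_share(selected: float, pool: float) -> float:
    """Return `selected` as a percentage of `pool`, rounded to one decimal."""
    if pool <= 0:
        raise ValueError("pool must be positive")
    return round(100.0 * selected / pool, 1)


# Assumption: the two pools are disjoint and together cover all candidates.
total_candidates = POSITIVE_POOL + NEGATIVE_POOL
positive_share = pool_share(SELECTED_POSITIVES, POSITIVE_POOL)
negative_share = pool_share(SELECTED_NEGATIVES, NEGATIVE_POOL)

print(f"total candidates: {total_candidates}")
print(f"positives kept:   {positive_share}%")
print(f"negatives kept:   {negative_share}%")
```

Run on the averaged counts, this gives 91.4% and 89.3%, which differ from the 91.3% and 90.6% reported in the table. Because the table's values are averages over six SVMs, the reported percentages may have been computed per classifier and then averaged; that reading is an assumption, as the source does not say how they were derived.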