Table 4 Composition of the training sets

From: Building a protein name dictionary from full text: a machine learning term extraction approach

| Training lists | PG | C | IK | Pr |
|---|---|---|---|---|
| # n-grams, n>=2 | 304 | 193 | 111 | 254 |
| # occurrences in articles where the n-gram is most frequent | 16,543 | 10,862 | 5,853 | 12,547 |