Table 3 Dataset used for training SVM classifiers

From: Identifying named entities from PubMed® for enriching semantic categories

| Headwords | Positive | Negative | SemCat categories |
|---|---|---|---|
| Gene | 3532163 | 1631676 | GENE_OR_PROTEIN, DNA_MOLECULE |
| Protein | 3533621 | 1630690 | GENE_OR_PROTEIN, PROTEIN_MOLECULE |
| Disease | 88653 | 5096888 | DISEASE_OR_SYNDROME, INJURY_OR_POISONING, SIGN_OR_SYMPTOM |
| Cell(s) | 14581 | 5178142 | CELL |

  1. For each keyword, terms from relevant SemCat categories were merged and used for the classifiers.
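
The merging step described in the note can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name `build_training_sets`, the dictionary layout, and the toy terms are hypothetical, and treating every term outside the relevant categories as a negative example is an assumption that the table itself does not state.

```python
# Headword -> SemCat categories, as listed in the table.
HEADWORD_CATEGORIES = {
    "Gene": ["GENE_OR_PROTEIN", "DNA_MOLECULE"],
    "Protein": ["GENE_OR_PROTEIN", "PROTEIN_MOLECULE"],
    "Disease": ["DISEASE_OR_SYNDROME", "INJURY_OR_POISONING", "SIGN_OR_SYMPTOM"],
    "Cell(s)": ["CELL"],
}


def build_training_sets(semcat_terms, headword):
    """Merge the terms of a headword's categories into one positive set.

    semcat_terms maps a category name to an iterable of terms (hypothetical layout).
    Assumption: terms from all other categories serve as negatives.
    """
    relevant = set(HEADWORD_CATEGORIES[headword])

    # Positive set: union of terms across the relevant categories.
    positive = set()
    for category in relevant:
        positive |= set(semcat_terms.get(category, ()))

    # Negative set: terms from the remaining categories, minus any overlap.
    negative = set()
    for category, terms in semcat_terms.items():
        if category not in relevant:
            negative |= set(terms)
    negative -= positive

    return positive, negative


# Toy data for illustration only; these are not SemCat contents.
toy_semcat = {
    "GENE_OR_PROTEIN": ["p53", "BRCA1"],
    "DNA_MOLECULE": ["plasmid DNA"],
    "PROTEIN_MOLECULE": ["hemoglobin"],
    "CELL": ["T cell"],
}

positive, negative = build_training_sets(toy_semcat, "Gene")
```

Because GENE_OR_PROTEIN is shared by the Gene and Protein headwords, a set union keeps shared terms from being counted twice within one classifier's positive set.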