Table 3 Dataset used for training SVM classifiers

From: Identifying named entities from PubMed® for enriching semantic categories

Headwords  Positive  Negative  SemCat categories
Gene       3532163   1631676   GENE_OR_PROTEIN, DNA_MOLECULE
Protein    3533621   1630690   GENE_OR_PROTEIN, PROTEIN_MOLECULE
Disease    88653     5096888   DISEASE_OR_SYNDROME, INJURY_OR_POISONING, SIGN_OR_SYMPTOM
Cell(s)    14581     5178142   CELL
  1. For each headword, terms from the relevant SemCat categories were merged and used for the classifiers.
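
The merging step described in the note can be sketched as follows. This is a minimal illustration, not the authors' code: the headword-to-category mapping is taken from the table, while the function name `merge_terms`, the dictionary layout of `toy_semcat`, and the toy terms themselves are assumptions made up for the example. It covers only the merge of category terms and does not attempt to show how the positive and negative sets were built.

```python
# Headword -> SemCat categories, as listed in Table 3.
HEADWORD_CATEGORIES = {
    "Gene": ["GENE_OR_PROTEIN", "DNA_MOLECULE"],
    "Protein": ["GENE_OR_PROTEIN", "PROTEIN_MOLECULE"],
    "Disease": ["DISEASE_OR_SYNDROME", "INJURY_OR_POISONING", "SIGN_OR_SYMPTOM"],
    "Cell(s)": ["CELL"],
}


def merge_terms(headword, semcat):
    """Return the union of terms over the categories mapped to `headword`.

    `semcat` is a hypothetical dict from category name to an iterable of
    terms; categories missing from it contribute nothing.
    """
    merged = set()
    for category in HEADWORD_CATEGORIES[headword]:
        merged |= set(semcat.get(category, ()))
    return merged


# Toy data, invented for illustration only (not taken from SemCat).
toy_semcat = {
    "GENE_OR_PROTEIN": ["p53", "BRCA1"],
    "DNA_MOLECULE": ["plasmid DNA", "BRCA1"],
    "CELL": ["T cell"],
}

gene_terms = merge_terms("Gene", toy_semcat)
print(sorted(gene_terms))
```

Using a set for the merged pool means a term that appears in more than one category (here `BRCA1`) is counted once per headword.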