Table 6 Number of abstracts, keywords and FScore per class

From: The first step in the development of text mining technology for cancer risk assessment: identifying and organizing scientific evidence in risk assessment literature

Class | Abstracts | Keywords | F-Measure
Carcinogenic activity | 1023 | 1157 | 92.8
   Human study/epidemiology | 190 (171) | 44 | 77.7
      Tumor related | 39 | 28 | 56.3
      Morphological effect on tissue/organ | 2 | 1 |
      Biochemical/cellbiological effects | 2 | 3 |
      Biomarkers | 35 | 14 | 68.4
      Polymorphism | 37 | 32 | 79.5
   Animal study | 629 (546) | 46 | 80.2
      Study length | 156 (3) | 3 |
      2-year cancer bioassay | 14 | 9 |
      Short and medium | 143 | 110 | 45.9
      Tumors | 186 | 73 | 74.3
      Preneoplastic lesions | 150 | 121 | 81.2
      Morphological effect on tissue/organ | 60 | 50 | 46.3
      Biochemical/cellbiological effects | 135 | 198 | 52.1
      Biomarker | 6 | 3 |
      Type of animal | 452 (388) | 166 | 70.5
      Genetically modified animals | 73 | 76 | 73.5
   Cell experiments | 319 (313) | 28 | 78.5
      Biochemical/cellbiological effects | 100 | 128 | 58.7
   Subcellular systems | 2 | 2 |
   Study on microorganisms | 44 | 22 | 85.2
Mode of Action | 653 | 316 | 85.5
   Genotoxic | 426 (72) | 16 | 89.1
      Strand breaks | 32 | 12 | 77.4
      Adducts | 174 | 11 | 89.8
      Chromosomal change | 84 (36) | 23 | 68.2
      Micronucleus | 47 | 5 | 85.9
      Chromosomal aberration | 35 | 10 | 68.2
      Mutations | 145 | 38 | 85.4
      Other DNA mods | 100 | 52 | 62.0
   Non-genotoxic | 324 (8) | 4 | 76.3
      Reactive oxygen species | 54 | 26 | 70.5
      Cytotoxicity | 50 | 7 | 62.0
      DNA repair | 29 | 8 | 64.2
      Hormonal receptor | 47 | 30 | 61.6
      Effects on cell proliferation | 113 | 30 | 69.6
      Effects on cell death | 110 | 10 | 83.3
      Transcriptional, translational, posttranslational modifications | 27 | 22 | 61.2
      Peroxisome proliferation | 3 | 2 |
      Inflammation | 15 | 10 |
Toxicokinetics | 365 | 269 | 77.7
   Absorption, uptake, distribution, excretion | 117 | 45 | 69.8
   Bioaccumulation/Lipophilicity | 0 | 0 |
   Metabolism | 275 (152) | 36 | 76.4
      Activation or deactivation | 191 | 161 | 74.8
      Reactive oxygen species | 7 | 6 |
   Toxicokinetic modeling | 31 | 21 | 84.6
  1. The first column shows the name of a class in the taxonomy. The second column shows the total number of abstracts classified in the class or any of its sub-classes; the value in brackets is the number of abstracts classified in the class itself, without taking the sub-classes into account. The third column shows the total number of unique keyword annotations for each class. This count does not include the annotations for sub-classes, except for the three top-level classes, where the keywords of all sub-classes are also included.
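
The two abstract counts described in the note can be illustrated with a minimal Python sketch. The toy taxonomy, the abstract IDs, and the function name `abstract_counts` are hypothetical and are not taken from the article. The sketch also assumes that the bracketed value counts abstracts annotated directly with the class itself, and that an abstract appearing in several sub-classes is counted only once in the total.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy taxonomy: class name -> direct sub-classes.
CHILDREN: Dict[str, List[str]] = {
    "Genotoxic": ["Strand breaks", "Adducts"],
    "Strand breaks": [],
    "Adducts": [],
}

# Hypothetical data: class name -> IDs of abstracts annotated directly with it.
DIRECT: Dict[str, Set[int]] = {
    "Genotoxic": {1, 2},
    "Strand breaks": {2, 3},
    "Adducts": {3, 4, 5},
}


def abstracts_in_subtree(cls: str,
                         children: Dict[str, List[str]],
                         direct: Dict[str, Set[int]]) -> Set[int]:
    """Return the abstracts classified in `cls` or any of its sub-classes."""
    found = set(direct.get(cls, set()))
    for child in children.get(cls, []):
        # Set union, so an abstract shared by several classes is counted once.
        found |= abstracts_in_subtree(child, children, direct)
    return found


def abstract_counts(cls: str,
                    children: Dict[str, List[str]],
                    direct: Dict[str, Set[int]]) -> Tuple[int, int]:
    """Return (total including sub-classes, count for the class itself)."""
    total = len(abstracts_in_subtree(cls, children, direct))
    own = len(direct.get(cls, set()))
    return total, own


total, own = abstract_counts("Genotoxic", CHILDREN, DIRECT)
print(f"Genotoxic: {total} ({own})")
```

Under these assumptions the total is a set union rather than a sum of the sub-class counts, which would explain why a parent's total in the table can be smaller than the sum of its sub-class rows.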