Table 6 Number of abstracts, keywords and F-measure per class

From: The first step in the development of text mining technology for cancer risk assessment: identifying and organizing scientific evidence in risk assessment literature

| Class | Abstracts | Keywords | F-Measure |
|---|---|---|---|
| Carcinogenic activity | 1023 | 1157 | 92.8 |
|    Human study/epidemiology | 190 (171) | 44 | 77.7 |
| Tumor related | 39 | 28 | 56.3 |
| Morphological effect on tissue/organ | 2 | 1 | |
| Biochemical/cellbiological effects | 2 | 3 | |
| Biomarkers | 35 | 14 | 68.4 |
| Polymorphism | 37 | 32 | 79.5 |
|    Animal study | 629 (546) | 46 | 80.2 |
| Study length | 156 (3) | 3 | |
| 2-year cancer bioassay | 14 | 9 | |
| Short and medium | 143 | 110 | 45.9 |
| Tumors | 186 | 73 | 74.3 |
| Preneoplastic lesions | 150 | 121 | 81.2 |
| Morphological effect on tissue/organ | 60 | 50 | 46.3 |
| Biochemical/cellbiological effects | 135 | 198 | 52.1 |
| Biomarker | 6 | 3 | |
| Type of animal | 452 (388) | 166 | 70.5 |
| Genetically modified animals | 73 | 76 | 73.5 |
|    Cell experiments | 319 (313) | 28 | 78.5 |
| Biochemical/cellbiological effects | 100 | 128 | 58.7 |
|    Subcellular systems | 2 | 2 | |
|    Study on microorganisms | 44 | 22 | 85.2 |
| Mode of Action | 653 | 316 | 85.5 |
|    Genotoxic | 426 (72) | 16 | 89.1 |
| Strand breaks | 32 | 12 | 77.4 |
| Adducts | 174 | 11 | 89.8 |
| Chromosomal change | 84 (36) | 23 | 68.2 |
| Micronucleus | 47 | 5 | 85.9 |
| Chromosomal aberration | 35 | 10 | 68.2 |
| Mutations | 145 | 38 | 85.4 |
| Other DNA mods | 100 | 52 | 62.0 |
|    Non-genotoxic | 324 (8) | 4 | 76.3 |
| Reactive oxygen species | 54 | 26 | 70.5 |
| Cytotoxicity | 50 | 7 | 62.0 |
| DNA repair | 29 | 8 | 64.2 |
| Hormonal receptor | 47 | 30 | 61.6 |
| Effects on cell proliferation | 113 | 30 | 69.6 |
| Effects on cell death | 110 | 10 | 83.3 |
| Transcriptional, translational, posttranslational modifications | 27 | 22 | 61.2 |
| Peroxisome proliferation | 3 | 2 | |
| Inflammation | 15 | 10 | |
| Toxicokinetics | 365 | 269 | 77.7 |
|    Absorption, uptake, distribution, excretion | 117 | 45 | 69.8 |
|    Bioaccumulation/Lipophilicity | 0 | 0 | |
|    Metabolism | 275 (152) | 36 | 76.4 |
| Activation or deactivation | 191 | 161 | 74.8 |
| Reactive oxygen species | 7 | 6 | |
|    Toxicokinetic modeling | 31 | 21 | 84.6 |

  1. The first column shows the name of a class in the taxonomy. The second column shows the total number of abstracts classified in the class (or its sub-classes). The value in brackets is the number of abstracts classified in the class without taking the sub-classes into account. The third column shows the total number of unique keyword annotations for each class. This count does not include the annotations for sub-classes, except for the three top-level classes, where the keywords of all sub-classes are also included.
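
The counting rule described in the note can be illustrated with a minimal sketch. The taxonomy fragment, the abstract IDs, and the helper names `abstracts_in` and `table_cell` below are all hypothetical and are not taken from the paper's data. The sketch assumes that the bracketed value counts abstracts labeled directly with the class, and that an abstract may carry more than one label, so a class total is a set union rather than a sum of its sub-class counts.

```python
# Hypothetical toy taxonomy: class -> list of sub-classes (not the paper's data).
children = {
    "Genotoxic": ["Adducts", "Mutations"],
    "Adducts": [],
    "Mutations": [],
}

# Hypothetical abstract IDs labeled directly with each class.
direct = {
    "Genotoxic": {1, 2},
    "Adducts": {2, 3},
    "Mutations": {3, 4, 5},
}


def abstracts_in(cls):
    """Return the set of abstracts labeled with cls or any of its sub-classes."""
    found = set(direct.get(cls, set()))
    for sub in children.get(cls, []):
        found |= abstracts_in(sub)  # union, so shared abstracts count once
    return found


def table_cell(cls):
    """Format an 'Abstracts' cell: total, plus the direct count in brackets
    when the class has sub-classes."""
    total = len(abstracts_in(cls))
    if children.get(cls):
        return f"{total} ({len(direct.get(cls, set()))})"
    return str(total)


print(table_cell("Genotoxic"))  # 5 (2)
print(table_cell("Adducts"))    # 2
```

In this toy example the sub-class counts (2 and 3) plus the direct count (2) add up to 7, but the total is 5 because abstracts 2 and 3 each appear under two labels.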