Table 8 The distribution of annotated named entity categories

From: Semantic annotation of consumer health questions

| Category | CHQA-email, Practice: # questions | CHQA-email, Practice: % (Rank) | CHQA-email, Unadjudicated: # questions | CHQA-email, Unadjudicated: % (Rank) | CHQA-web: # questions | CHQA-web: % (Rank) |
|---|---|---|---|---|---|---|
| anatomy | 31 | 12.8 (4) | 5,339 | 15.8 (2) | 153 | 10.8 (3) |
| cellular_entity | 0 | 0 (17) | 224 | 0.7 (16) | 13 | 0.9 (12) |
| diagnostic_procedure | 3 | 1.2 (8) | 967 | 2.9 (8) | 101 | 7.2 (4) |
| drug_supplement | 26 | 10.8 (5) | 3,264 | 9.7 (4) | 237 | 16.8 (2) |
| food | 3 | 1.2 (8) | 474 | 1.4 (11) | 35 | 2.5 (11) |
| gene_protein | 1 | 0.4 (16) | 156 | 0.5 (17) | 9 | 0.6 (14) |
| geographic_location | 2 | 0.8 (13) | 455 | 1.4 (13) | 3 | 0.2 (17) |
| lifestyle | 2 | 0.8 (13) | 438 | 1.3 (14) | 44 | 3.1 (9) |
| measurement | 3 | 1.2 (8) | 331 | 1.0 (15) | 10 | 0.7 (13) |
| organism_function | 3 | 1.2 (8) | 469 | 1.4 (12) | 66 | 4.7 (6) |
| organization | 7 | 2.9 (7) | 576 | 1.7 (9) | 1 | 0.1 (18) |
| person_population | 36 | 14.9 (2) | 3,763 | 11.2 (3) | 60 | 4.2 (7) |
| problem | 75 | 31.0 (1) | 11,711 | 34.7 (1) | 476 | 33.7 (1) |
| procedure_device | 32 | 13.2 (3) | 2,481 | 7.4 (5) | 99 | 7.0 (5) |
| profession | 2 | 0.8 (13) | 1,144 | 3.4 (7) | 8 | 0.6 (15) |
| research_cue | - | - | - | - | 4 | 0.3 (16) |
| substance | 13 | 5.4 (6) | 1,466 | 4.3 (6) | 56 | 4.0 (8) |
| other | 3 | 1.2 (8) | 489 | 1.5 (10) | 38 | 2.7 (10) |
| Total | 242 | 100.0 | 33,747 | 100.0 | 1,413 | 100.0 |
| Average | 12.1 | | 9.8 | | 1.6 | |
| Range | 1-35 | | 1-84 | | 1-5 | |

1. Note that questions in the unadjudicated set are counted twice since this set is double-annotated.