Table 7 Basic corpus statistics

From: Semantic annotation of consumer health questions

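| Corpus Part | # questions | # tokens | Average | Range | Std. Dev. |
| --- | --- | --- | --- | --- | --- |
| CHQA-email | 1,740 | 95,834 | 55.1 | 2-427 | 51.3 |
| - Practice | 20 | | | | |
| - Unadjudicated | 1,720 | | | | |
| CHQA-web | 874 | 6,597 | 7.5 | 3-51 | 4.1 |
| Total | 2,614 | 102,431 | 39.2 | 2-427 | 136.7 |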
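
The Average column appears to be the number of tokens divided by the number of questions. This is an inference from the arithmetic, not something the table states. A minimal sketch that recomputes the averages and the Total row from the reported counts follows. The variable names are illustrative and not taken from the source.

```python
# Counts and averages copied from Table 7; names here are illustrative only.
corpus = {
    "CHQA-email": {"questions": 1740, "tokens": 95834, "average": 55.1},
    "CHQA-web": {"questions": 874, "tokens": 6597, "average": 7.5},
}

# Assumption: "Average" means tokens per question, rounded to one decimal.
computed = {
    name: round(part["tokens"] / part["questions"], 1)
    for name, part in corpus.items()
}

# The Total row should be the sum of the two corpus parts.
total_questions = sum(part["questions"] for part in corpus.values())
total_tokens = sum(part["tokens"] for part in corpus.values())
total_average = round(total_tokens / total_questions, 1)

for name, part in corpus.items():
    print(f"{name}: computed {computed[name]}, reported {part['average']}")
print(f"Total: {total_questions} questions, {total_tokens} tokens, "
      f"average {total_average}")
```

Under that assumption the recomputed averages match the reported ones, and the Practice and Unadjudicated counts (20 and 1,720) sum to the 1,740 CHQA-email questions.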