Table 27 Distribution of data across the folds

From: A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools

| Fold | Number of sentences | IDs of files in fold |
|---|---|---|
| Fold 0 | 3,066 | 11532192 - 15005800 |
| Fold 1 | 3,990 | 15040800 - 15630473 |
| Fold 2 | 3,951 | 15676071 - 16110338 |
| Fold 3 | 3,723 | 16121255 - 16507151 |
| Fold 4 | 4,200 | 16539743 - 17083276 |
| Training | 18,930 | 11532192 - 17083276 |
| Development | 2,780 | 17194222 - 17696610 |
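
The Training row appears to be the union of Folds 0 to 4: its ID range runs from the first ID of Fold 0 to the last ID of Fold 4, and the five fold counts add up to 18,930. The following is a minimal Python sketch that checks this arithmetic and computes each fold's share of the training sentences. The variable and function names are illustrative assumptions and do not come from the article; only the counts are taken from the table.

```python
# Sentence counts per fold, copied from Table 27.
fold_sentences = {
    "Fold 0": 3066,
    "Fold 1": 3990,
    "Fold 2": 3951,
    "Fold 3": 3723,
    "Fold 4": 4200,
}
training_total = 18930  # "Training" row of Table 27


def fold_shares(counts):
    """Return each fold's fraction of the summed sentence count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# The five folds should account for every training sentence.
folds_total = sum(fold_sentences.values())
assert folds_total == training_total

shares = fold_shares(fold_sentences)
for name, share in shares.items():
    print(f"{name}: {share:.1%} of training sentences")
```

The folds are not equal in sentence count (Fold 0 is the smallest and Fold 4 the largest), which is consistent with splitting by file ID range instead of by sentence.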