Table 10 Distribution of corpus sentences in translation experiments

From: Combining MEDLINE and publisher data to create parallel corpora for the automatic translation of biomedical text

 

Language  Data set             Training   Tuning   Testing
French    Titles                458,543   57,317    57,317
          Abstracts (hr=0.0)     28,882   28,882    28,881
          Abstracts (hr=0.29)    17,351   17,365    28,881
Spanish   Titles                198,512   24,814    24,814
          Abstracts (hr=0.0)      7,772    7,772     7,772
          Abstracts (hr=0.29)     5,403    5,418     7,772