Table 4 Results of systematic corpus evaluation using edit distance

From: Combining MEDLINE and publisher data to create parallel corpora for the automatic translation of biomedical text

| Extracted data | Total number | Average edit distance to MEDLINE data | Number with edit distance above 50 |
|---|---|---|---|
| French (14,817 citations) | | | |
| TIFR | 14,817 | 5.82 | 325 |
| TIEN | 14,815 | 5.99 | 347 |
| ABEN | 14,089 | 8.20 | 1,180 |
| ABFR | 14,153 | - | - |
| Spanish (3,371 citations) | | | |
| TIES | 3,371 | 6.82 | 70 |
| TIEN | 3,371 | 4.39 | 14 |
| ABEN | 2,961 | 7.50 | 148 |
| ABES | 2,968 | - | - |
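
For readers who want to reproduce statistics of the kind reported in this table, the following is a minimal sketch, not the authors' implementation. It assumes the edit distance is the character-level Levenshtein distance between an extracted string and its MEDLINE counterpart, which the table itself does not specify. The names `edit_distance` and `summarize` are hypothetical, and the default threshold of 50 is taken from the last column header.

```python
from statistics import mean


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (an assumed metric)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def summarize(pairs, threshold=50):
    """Summarize (extracted, medline) string pairs as in the table columns."""
    distances = [edit_distance(extracted, medline) for extracted, medline in pairs]
    return {
        "total": len(distances),
        "average": mean(distances) if distances else None,
        "above_threshold": sum(d > threshold for d in distances),
    }
```

Calling `summarize` on a list of (extracted title, MEDLINE title) pairs would return one row's worth of figures: the number of pairs, the mean distance, and the count of pairs whose distance exceeds the threshold.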