Table 1 Composition of article corpus by source

From: FamPlex: a resource for entity recognition and relationship resolution of human protein families and complexes in biomedical text mining

| Text source type | Number | % |
| --- | ---: | ---: |
| MEDLINE Abstract | 183,019 | 67.9 |
| Elsevier XML | 40,261 | 14.9 |
| Pubmed Central Open Access Subset XML | 32,113 | 11.9 |
| Pubmed Central Author Manuscript XML | 13,777 | 5.1 |
| No content retrieved | 267 | 0.1 |
| Pubmed Central Open Access Subset text file | 52 | 0.02 |
| Total | 269,489 | 100.0 |
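
The percentage column can be checked against the counts with a few lines of arithmetic. The following is a minimal sketch, not part of the original article: the variable names `counts`, `total`, and `percentages` are illustrative assumptions, and the only data used are the counts reported in Table 1.

```python
# Article counts per text source, copied from Table 1.
counts = {
    "MEDLINE Abstract": 183_019,
    "Elsevier XML": 40_261,
    "Pubmed Central Open Access Subset XML": 32_113,
    "Pubmed Central Author Manuscript XML": 13_777,
    "No content retrieved": 267,
    "Pubmed Central Open Access Subset text file": 52,
}

# The total is the sum over all sources.
total = sum(counts.values())

# Share of each source as a percentage of the total.
percentages = {source: 100 * n / total for source, n in counts.items()}

# Print one row per source, then the total row.
for source, n in counts.items():
    print(f"{source:<45} {n:>8,} {percentages[source]:>7.2f}")
print(f"{'Total':<45} {total:>8,} {sum(percentages.values()):>7.2f}")
```

The summed counts reproduce the reported total of 269,489, and the computed shares agree with the table once rounded to the precision shown there.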