
Table 4 BioALBERT variants: model version, model size, combination of text corpora used for pre-training, and number of training steps

From: Benchmarking for biomedical natural language processing tasks with a domain specific ALBERT

| Model version | BioALBERT size | Combination of corpora used for training | Number of training steps |
|---|---|---|---|
| 1 | Base1 | Wikipedia + BooksCorpus + PubMed | 200K |
| 2 | Base2 | Wikipedia + BooksCorpus + PubMed + PMC | 470K |
| 3 | Large1 | Wikipedia + BooksCorpus + PubMed | 200K |
| 4 | Large2 | Wikipedia + BooksCorpus + PubMed + PMC | 470K |
| 5 | Base3 | Wikipedia + BooksCorpus + PubMed + MIMIC-III | 200K |
| 6 | Base4 | Wikipedia + BooksCorpus + PubMed + PMC + MIMIC-III | 200K |
| 7 | Large3 | Wikipedia + BooksCorpus + PubMed + MIMIC-III | 270K |
| 8 | Large4 | Wikipedia + BooksCorpus + PubMed + PMC + MIMIC-III | 270K |
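For readers who want to refer to these configurations programmatically (for example, when scripting experiments over the eight variants), the table can be restated as a small mapping. The sketch below simply re-encodes the rows above; the name `BIOALBERT_CONFIGS` and the field names are illustrative conveniences, not identifiers from the paper's released code.

```python
# Table 4 re-encoded as a Python mapping: model version -> configuration.
# Contents are copied verbatim from the table; only the structure is new.
BIOALBERT_CONFIGS = {
    1: {"size": "Base1",  "corpora": ["Wikipedia", "BooksCorpus", "PubMed"],                          "steps": 200_000},
    2: {"size": "Base2",  "corpora": ["Wikipedia", "BooksCorpus", "PubMed", "PMC"],                   "steps": 470_000},
    3: {"size": "Large1", "corpora": ["Wikipedia", "BooksCorpus", "PubMed"],                          "steps": 200_000},
    4: {"size": "Large2", "corpora": ["Wikipedia", "BooksCorpus", "PubMed", "PMC"],                   "steps": 470_000},
    5: {"size": "Base3",  "corpora": ["Wikipedia", "BooksCorpus", "PubMed", "MIMIC-III"],             "steps": 200_000},
    6: {"size": "Base4",  "corpora": ["Wikipedia", "BooksCorpus", "PubMed", "PMC", "MIMIC-III"],      "steps": 200_000},
    7: {"size": "Large3", "corpora": ["Wikipedia", "BooksCorpus", "PubMed", "MIMIC-III"],             "steps": 270_000},
    8: {"size": "Large4", "corpora": ["Wikipedia", "BooksCorpus", "PubMed", "PMC", "MIMIC-III"],      "steps": 270_000},
}

# Example lookup: which variants were pre-trained on clinical notes (MIMIC-III)?
clinical_variants = [v for v, cfg in BIOALBERT_CONFIGS.items()
                     if "MIMIC-III" in cfg["corpora"]]
print(clinical_variants)  # [5, 6, 7, 8]
```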