Table 2 BERT model statistics: model parameters, vocabulary size in wordpieces, and number of English-language words in the pretraining data

From: Dependency parsing of biomedical text with BERT

| Model | Params (M) | Vocab (K) | Words (Eng.) (B) |
|-------|-----------:|----------:|-----------------:|
| Google BERT large | 340 | 29 | 3.3 |
| Google mBERT | 180 | 120 | 2.5 |
| SciBERT base scivocab uncased | 110 | 31 | 3.2 |
| BioBERT large v1.1 custom vocab | 360 | 59 | 21.3 |
| BlueBERT base P+M | 110 | 31 | 4.5 |