Table 1 Data used in prior state-of-the-art studies compared to ours (BioALBERT)

From: Benchmarking for biomedical natural language processing tasks with a domain specific ALBERT

| Training corpus | BioBERT [13] | SciBERT [11] | BLUE [12] | PubMedBERT [14] | KeBioLM [15] | BioALBERT |
|---|---|---|---|---|---|---|
| General | \(\checkmark\) | \(\times\) | \(\checkmark\) | \(\times\) | \(\times\) | \(\checkmark\) |
| PMC | \(\checkmark\) | \(\times\) | \(\times\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) |
| PubMed | \(\checkmark\) | \(\times\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) |
| Clinical notes | \(\times\) | \(\times\) | \(\checkmark\) | \(\times\) | \(\times\) | \(\checkmark\) |