Table 2 Performance metrics of different fine-tuned language models

From: Leveraging pre-trained language models for mining microbiome-disease relationships

 

| Model | Accuracy | F1 score | Precision | Recall |
| --- | --- | --- | --- | --- |
| Baseline | | | | |
| BERE_TL(MDI) | NA | 0.738 | 0.736 | 0.740 |
| Our models (fine-tuned) | | | | |
| Bert-base-uncased | \(0.733 \pm 0.018\) | \(0.731 \pm 0.015\) | \(0.742 \pm 0.020\) | \(0.733 \pm 0.018\) |
| BioMegatron | \(0.778 \pm 0.008\) | \(0.769 \pm 0.013\) | \(0.771 \pm 0.013\) | \(0.778 \pm 0.008\) |
| PubMedBERT | \(0.782 \pm 0.022\) | \(0.778 \pm 0.019\) | \(0.783 \pm 0.021\) | \(0.782 \pm 0.022\) |
| BioClinicalBERT | \(0.729 \pm 0.032\) | \(0.724 \pm 0.029\) | \(0.731 \pm 0.032\) | \(0.729 \pm 0.032\) |
| BioLinkBERT-base | \(\textbf{0.811} \pm \textbf{0.029}\) | \(\textbf{0.804} \pm \textbf{0.036}\) | \(\textbf{0.813} \pm \textbf{0.034}\) | \(\textbf{0.811} \pm \textbf{0.028}\) |
| BioMedLM | \(0.806 \pm 0.028\) | \(0.804 \pm 0.028\) | \(\textbf{0.822} \pm \textbf{0.030}\) | \(0.806 \pm 0.028\) |
| BioGPT | \(0.732 \pm 0.017\) | \(0.726 \pm 0.017\) | \(0.732 \pm 0.025\) | \(0.736 \pm 0.016\) |
| GPT-3 | \(\textbf{0.814} \pm \textbf{0.021}\) | \(\textbf{0.810} \pm \textbf{0.025}\) | \(0.810 \pm 0.021\) | \(\textbf{0.814} \pm \textbf{0.021}\) |

  1. Bold indicates the best performance achieved for each metric
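
The table reports each metric as mean ± standard deviation over repeated fine-tuning runs. For reference, the following is a minimal sketch, not the authors' code, of how such summaries might be computed with scikit-learn; the `run_predictions` input, the weighted averaging scheme, and the dummy labels in the example are assumptions for illustration only.

```python
# Sketch: mean +/- std of accuracy, F1, precision, recall across fine-tuning runs.
# `run_predictions` is a hypothetical list of (y_true, y_pred) pairs, one per run.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def summarize_runs(run_predictions):
    """Return (mean, std) of accuracy, F1, precision, and recall across runs."""
    scores = {"accuracy": [], "f1": [], "precision": [], "recall": []}
    for y_true, y_pred in run_predictions:
        scores["accuracy"].append(accuracy_score(y_true, y_pred))
        # Weighted averaging is an assumption; the paper may use a different scheme.
        p, r, f1, _ = precision_recall_fscore_support(
            y_true, y_pred, average="weighted", zero_division=0
        )
        scores["precision"].append(p)
        scores["recall"].append(r)
        scores["f1"].append(f1)
    return {name: (np.mean(vals), np.std(vals)) for name, vals in scores.items()}

# Example with dummy labels for two runs
runs = [
    (np.array([1, 0, 1, 1, 0]), np.array([1, 0, 0, 1, 0])),
    (np.array([1, 0, 1, 1, 0]), np.array([1, 1, 1, 1, 0])),
]
for metric, (mean, std) in summarize_runs(runs).items():
    print(f"{metric}: {mean:.3f} ± {std:.3f}")
```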