From: Leveraging pre-trained language models for mining microbiome-disease relationships
Group | Model | Accuracy | F1 score | Precision | Recall
---|---|---|---|---|---
Baseline | BERE_TL(MDI) | NA | 0.738 | 0.736 | 0.740
Our models (fine-tuned) | Bert-base-uncased | \(0.733 \pm 0.018\) | \(0.731 \pm 0.015\) | \(0.742 \pm 0.02\) | \(0.733 \pm 0.018\)
Our models (fine-tuned) | BioMegatron | \(0.778 \pm 0.008\) | \(0.769 \pm 0.013\) | \(0.771 \pm 0.013\) | \(0.778 \pm 0.008\)
Our models (fine-tuned) | PubMedBERT | \(0.782 \pm 0.022\) | \(0.778 \pm 0.019\) | \(0.783 \pm 0.021\) | \(0.782 \pm 0.022\)
Our models (fine-tuned) | BioClinicalBERT | \(0.729 \pm 0.032\) | \(0.724 \pm 0.029\) | \(0.731 \pm 0.032\) | \(0.729 \pm 0.032\)
Our models (fine-tuned) | BioLinkBERT-base | \({\textbf{0.811}} \pm {\textbf{0.029}}\) | \({\textbf{0.804}} \pm {\textbf{0.036}}\) | \({\textbf{0.813}} \pm {\textbf{0.034}}\) | \({\textbf{0.811}} \pm {\textbf{0.028}}\)
Our models (fine-tuned) | BioMedLM | \(0.806 \pm 0.028\) | \(0.804 \pm 0.028\) | \({\textbf{0.822}} \pm {\textbf{0.030}}\) | \(0.806 \pm 0.028\)
Our models (fine-tuned) | BioGPT | \(0.732 \pm 0.017\) | \(0.726 \pm 0.017\) | \(0.732 \pm 0.025\) | \(0.736 \pm 0.016\)
Our models (fine-tuned) | GPT-3 | \({\textbf{0.814}} \pm {\textbf{0.021}}\) | \({\textbf{0.810}} \pm {\textbf{0.025}}\) | \(0.810 \pm 0.021\) | \({\textbf{0.814}} \pm {\textbf{0.021}}\)