From: A compressed large language model embedding dataset of ICD 10 CM descriptions
Model | Embedding dimension | Accuracy | Balanced accuracy |
---|---|---|---|
BioGPT Compressed | 1000 | 0.960 | 0.927 |
BioGPT Compressed | 100 | 0.935 | 0.891 |
BioGPT Compressed | 50 | 0.925 | 0.873 |
BioGPT Compressed | 10 | 0.815 | 0.698 |
ClinicalBERT | 768 | 0.200 | 0.634 |
PubMedBERT-MS-MARCO | 768 | 0.158 | 0.629 |
SapBERT-PubMedBERT | 768 | 0.159 | 0.616 |
MedBERT | 768 | 0.171 | 0.613 |
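
The table reports both accuracy and balanced accuracy, and the two can diverge sharply when classes are imbalanced. The sketch below is a minimal illustration, assuming the common definition of balanced accuracy as the unweighted mean of per-class recall (the definition used by scikit-learn's `balanced_accuracy_score`); the source does not state which definition it uses. The labels `y_true` and `y_pred` are hypothetical toy data, not values from the dataset.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed definition)."""
    hits = defaultdict(int)    # correct predictions per true class
    totals = defaultdict(int)  # number of samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Hypothetical toy data: 8 samples of class 0, 2 samples of class 1.
y_true = [0] * 8 + [1] * 2
# The classifier labels 7 of the 8 class-0 samples as class 1.
y_pred = [1] * 7 + [0] + [1, 1]

print(accuracy(y_true, y_pred))           # 0.3
print(balanced_accuracy(y_true, y_pred))  # 0.5625
```

In this toy case the majority class is mostly misclassified, so plain accuracy is low, while perfect recall on the minority class raises the per-class average. This shows only that the two metrics can differ in this direction; it is not an explanation of the specific values reported in the table.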