Table 3. Performance of the supervised models, ordered by decreasing balanced accuracy

From: A compressed large language model embedding dataset of ICD 10 CM descriptions

| Model | Embedding dimension | Accuracy | Balanced accuracy |
|---|---|---|---|
| BioGPT Compressed | 1000 | 0.960 | 0.927 |
| BioGPT Compressed | 100 | 0.935 | 0.891 |
| BioGPT Compressed | 50 | 0.925 | 0.873 |
| BioGPT Compressed | 10 | 0.815 | 0.698 |
| ClinicalBERT | 768 | 0.200 | 0.634 |
| PubMedBERT-MS-MARCO | 768 | 0.158 | 0.629 |
| SapBERT-PubMedBERT | 768 | 0.159 | 0.616 |
| MedBERT | 768 | 0.171 | 0.613 |
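
The table reports both accuracy and balanced accuracy, and the two can diverge in either direction when classes are imbalanced. The table itself does not state how balanced accuracy was computed; the sketch below assumes the common definition, the unweighted mean of per-class recall. The function names and the toy labels are illustrative only and are not taken from the paper or its data.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed definition)."""
    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels: 8 samples of class "A", 2 samples of class "B".
y_true = ["A"] * 8 + ["B"] * 2

# Case 1: strong on the frequent class, weak on the rare class.
pred_common = ["A"] * 8 + ["A", "B"]
print(accuracy(y_true, pred_common), balanced_accuracy(y_true, pred_common))

# Case 2: weak on the frequent class, strong on the rare class.
pred_rare = ["A"] * 2 + ["B"] * 6 + ["B"] * 2
print(accuracy(y_true, pred_rare), balanced_accuracy(y_true, pred_rare))
```

In the first toy case accuracy exceeds balanced accuracy, and in the second the order is reversed. Whether either pattern explains the rows above cannot be determined from the table alone.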