From: Neural sentence embedding models for semantic similarity estimation in the biomedical domain
Method | Pearson's r |
---|---|
String-based methods | |
Jaccard | 0.751 |
Q-gram (q = 3) | 0.723 |
Unsupervised | |
fastText (skip-gram, max pooling) | 0.766 |
fastText (CBOW, max pooling) | 0.253 |
Sent2vec | 0.798 |
Skip-thoughts | 0.485 |
Paragraph vector (PV-DM) | 0.819 |
Paragraph vector (PV-DBOW) | 0.804 |
Unsupervised combination of several methods (mean) | |
Jaccard, q-gram, paragraph vector (PV-DBOW) and sent2vec | 0.846 |
Supervised combination of several methods | |
Supervised linear regression (combination of Jaccard, q-gram, sent2vec, paragraph vector (PV-DM), skip-thoughts and fastText) | 0.871 |
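
The string-based rows and the unsupervised mean combination in the table can be illustrated with a minimal Python sketch. It is not the authors' implementation. The whitespace tokenization, the lowercasing, the use of set overlap as the q-gram similarity, and all function names (`jaccard`, `qgram_similarity`, `mean_combination`, `pearson_r`) are assumptions made for illustration only.

```python
import math


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercased whitespace tokens (assumed tokenization)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # two empty sentences are treated as identical
    return len(sa & sb) / len(sa | sb)


def qgrams(s: str, q: int = 3) -> set:
    """Set of character q-grams of the lowercased string."""
    s = s.lower()
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def qgram_similarity(a: str, b: str, q: int = 3) -> float:
    """Set overlap of character q-grams (one possible q-gram similarity; assumed)."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


def mean_combination(scores: list) -> float:
    """Unsupervised combination: plain mean of the individual similarity scores."""
    return sum(scores) / len(scores)


def pearson_r(x: list, y: list) -> float:
    """Pearson correlation between predicted and reference similarity scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


if __name__ == "__main__":
    s1 = "the protein binds the receptor"
    s2 = "the protein activates the receptor"
    combined = mean_combination([jaccard(s1, s2), qgram_similarity(s1, s2)])
    print(round(combined, 3))
```

In an evaluation like the one summarized in the table, a score of this kind would be computed for every sentence pair and then passed to `pearson_r` together with the human similarity ratings. The embedding-based methods would contribute further scores to the list given to `mean_combination`, which requires all scores to be on a comparable scale.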