Embedding model | Language | Domain | Type | Corpus size | Vocabulary size | Vector dimension | Algorithm | Property |
---|---|---|---|---|---|---|---|---|
W2V-SBWC | Spanish | General | Word | 1.5 billion | 68k | 300 | Word2Vec Skip-gram BOW | Pre-trained |
FastText-SBWC | Spanish | General | Word | 1.5 billion | 81.2k | 300 | FastText Skip-gram BOW | Pre-trained |
FastText-SBC | Spanish | Specific (Biomedical) | Word | 600 billion | 91.7k | 300 | FastText Skip-gram BOW | Own |
Scielo+Wiki cased | Spanish | Specific (Biomedical) | Word | | 50k | 300 | FastText Skip-gram BOW | Pre-trained |
SNOMED-SBC | Spanish | Specific (Biomedical) | Concept | 600 billion | 88.1k | 300 | FastText Skip-gram BOW | Own |
Pubmed and PMC | English | Specific (Biomedical) | Word | 2 billion | 400k | 300 | Word2Vec Skip-gram BOW | Pre-trained |
FastText-2M | English | General | Word | 600 billion | 2 million | 300 | FastText Skip-gram BOW | Pre-trained |
Sense2vec Reddit | English/Spanish | General | Sense | 2 billion | 120k | 128 | Sense2Vec | Pre-trained |