Fig. 1 | BMC Bioinformatics

Fig. 1

From: Prioritization, clustering and functional annotation of MicroRNAs using latent semantic indexing of MEDLINE abstracts

Overview of Latent Semantic Indexing. In a vector-space model, the semantic structure of a document is represented as a vector (essentially, a bag of words) in word space, and the similarity between two documents is calculated as the cosine of the angle between their vectors. The vectors consist of weighted terms, where each weight is a function of the term's frequency within the document and across all documents in the collection. A variant of the vector-space model, called Latent Semantic Indexing, improves retrieval by applying singular value decomposition (SVD) to create a lower-dimensional subspace in which text documents are represented as vectors. Each component of the subspace may be regarded as a concept derived from the word usage patterns in the documents. Hence, relevant documents are retrieved based on their degree of conceptual similarity.
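
The pipeline described in the caption can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the authors' implementation. The caption does not specify the term weighting, so the TF-IDF variant used here is an assumption, as are the function names (`tfidf_matrix`, `lsi_document_vectors`, `cosine`), the whitespace tokenizer, the choice to scale document coordinates by the singular values, and the toy documents.

```python
import numpy as np

def tfidf_matrix(docs):
    """Build a term-by-document matrix of weighted term counts.

    Assumption: weight = raw term count * (log(N / df) + 1), one common
    TF-IDF variant. The source does not say which weighting it uses.
    """
    tokenized = [d.lower().split() for d in docs]  # naive whitespace tokenizer
    vocab = sorted({w for doc in tokenized for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokenized):
        for w in doc:
            counts[index[w], j] += 1
    df = np.count_nonzero(counts, axis=1)      # documents containing each term
    idf = np.log(len(docs) / df) + 1.0
    return vocab, counts * idf[:, None]

def lsi_document_vectors(A, k):
    """Project documents into a k-dimensional subspace via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Row i holds document i's coordinates along the top-k components.
    return (s[:k, None] * Vt[:k]).T

def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

# Hypothetical toy collection: two documents per topic, no shared terms
# between the two topics.
docs = [
    "mirna regulates gene expression",
    "microrna regulates gene expression in cancer",
    "protein folding and structure prediction",
    "protein structure prediction methods",
]
vocab, A = tfidf_matrix(docs)
vecs = lsi_document_vectors(A, k=2)
same_topic = cosine(vecs[0], vecs[1])   # documents on the same topic
cross_topic = cosine(vecs[0], vecs[2])  # documents on different topics
```

In this toy collection the two topics share no vocabulary, so `same_topic` should come out much higher than `cross_topic`. On real abstracts the number of retained components `k` is a tuning choice that the caption does not fix.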
