Fig. 2 | BMC Bioinformatics

Fig. 2

From: BioReader: a text mining tool for performing classification of biomedical literature

Workflow of a typical database curation process involving data extraction from the primary literature. First, an initial search is performed using a publication search engine such as PubMed, after which corpora of both relevant and irrelevant articles are defined. These corpora are then used to train a text mining classifier, which is applied in subsequent searches to minimize the time spent reading irrelevant articles. With each iteration of data extraction, the corpora grow, which in turn improves the performance of the classification algorithm.
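
The loop described in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BioReader's actual implementation: a small naive Bayes model with add-one smoothing stands in for the text mining classifier, and the names `AbstractClassifier`, `curation_round`, and `review` are hypothetical. The `review` argument represents the curator's manual decision on each article.

```python
import math
from collections import Counter

LABELS = ("relevant", "irrelevant")


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class AbstractClassifier:
    """Naive Bayes stand-in for the classifier in the workflow (assumption)."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = Counter()

    def add(self, text, label):
        """Add one labelled abstract to the corresponding corpus."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def score(self, text):
        """Return the log-odds of 'relevant' versus 'irrelevant'."""
        vocab_size = len(set().union(*self.word_counts.values())) or 1
        total_docs = sum(self.doc_counts.values())
        log_odds = 0.0
        for label, sign in zip(LABELS, (1, -1)):
            total_words = sum(self.word_counts[label].values())
            # Smoothed class prior, then smoothed per-word likelihoods.
            log_prob = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                count = self.word_counts[label][word]
                log_prob += math.log((count + 1) / (total_words + vocab_size))
            log_odds += sign * log_prob
        return log_odds


def curation_round(classifier, search_results, review):
    """Run one iteration: screen new search results, then grow the corpora.

    Only abstracts predicted relevant are passed to `review`, a hypothetical
    callback that returns the curator's label for an abstract.
    """
    predicted = [a for a in search_results if classifier.score(a) > 0]
    for abstract in predicted:
        classifier.add(abstract, review(abstract))
    return predicted
```

In this sketch the curator's labels from each round are fed straight back into the corpora, so the classifier used in the next search has been trained on more data than the one before it.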
