Figure 1
From: Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text

Pharmspresso pipeline for data processing. Full-text PDFs of articles are downloaded, converted to text, and tokenized into individual words and sentences. Next, the text is parsed to identify words or phrases that are members of specific categories within the ontology. These are marked as such and indexed for future search accessibility.
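
The stages in this caption (tokenize, match against ontology categories, index) can be illustrated with a minimal Python sketch. This is not Pharmspresso's actual implementation: the `ONTOLOGY` dictionary, its two categories, and the function names `split_sentences`, `tokenize`, `annotate`, and `build_index` are all hypothetical, and the sketch assumes the PDF has already been converted to plain text. It also matches single-word terms only, whereas the caption mentions phrases as well.

```python
import re
from collections import defaultdict

# Hypothetical toy ontology (category -> lowercase terms), for illustration only.
ONTOLOGY = {
    "gene": {"cyp2d6", "tpmt"},
    "drug": {"codeine", "warfarin"},
}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Split a sentence into alphanumeric word tokens."""
    return re.findall(r"[A-Za-z0-9]+", sentence)


def annotate(sentence):
    """Return (category, token) pairs for tokens that belong to an ontology category."""
    hits = []
    for token in tokenize(sentence):
        for category, terms in ONTOLOGY.items():
            if token.lower() in terms:
                hits.append((category, token))
    return hits


def build_index(text):
    """Map each ontology category to the sorted indices of sentences that mention it."""
    index = defaultdict(set)
    for sentence_id, sentence in enumerate(split_sentences(text)):
        for category, _token in annotate(sentence):
            index[category].add(sentence_id)
    return {category: sorted(ids) for category, ids in index.items()}


if __name__ == "__main__":
    sample = "CYP2D6 metabolizes codeine. The weather was fine. Warfarin dosing varies."
    print(build_index(sample))
```

Indexing by sentence rather than by document keeps the sketch close to the caption's description of sentence-level tokenization, and it would let a search return the specific sentences in which, say, a gene and a drug co-occur.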