- Oral presentation
- Open Access
Pitfalls in applying text mining to scientific literature
BMC Bioinformatics volume 11, Article number: O4 (2010)
Numbers and data mining are easy. Our numerical system has 10 digits, any combination of them is possible, and every measured value can be captured as a number. Large quantities of measurements can be analysed efficiently using incredibly powerful calculators, and the resulting information can be shown in simple, clear graphs.
Text is hard. Hundreds of letters and millions of different combinations can be used in the personal interpretation of information, in words and phrases that reflect one's personality rather than objective measurements. Depending on context and language, the same expression carries totally different information, or no meaning at all.
Text mining requires 'education' at different levels: in providing information; in capturing, storing and retrieving that information; and in interpreting the results of the mining process.
I will provide a few examples of text mining tools in daily practice.
Rights and permissions
Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Neefs, JM. Pitfalls in applying text mining to scientific literature. BMC Bioinformatics 11 (Suppl 5), O4 (2010). https://doi.org/10.1186/1471-2105-11-S5-O4
Keywords
- Data Mining
- Scientific Literature
- Objective Measurement
- Powerful Calculator
- Daily Practice