Volume 6 Supplement 1

A critical assessment of text mining methods in molecular biology

Reports

Edited by Christian Blaschke, Lynette Hirschman, Alfonso Valencia, Alexander Yeh

Granada, Spain, March 28-31, 2004

Overview of BioCreAtIvE: critical assessment of information extraction for biology

The goal of the first BioCreAtIvE challenge (Critical Assessment of Information Extraction in Biology) was to provide a set of common evaluation tasks to assess the state of the art for text mining applied to ...

Authors: Lynette Hirschman, Alexander Yeh, Christian Blaschke and Alfonso Valencia

Citation: BMC Bioinformatics 2005 6(Suppl 1):S1

Content type: Introduction
Published on: 24 May 2005

BioCreAtIvE Task 1A: gene mention finding evaluation

The biological research literature is a major repository of knowledge. As the amount of literature increases, it will get harder to find the information of interest on a particular topic. There has been an inc...

Authors: Alexander Yeh, Alexander Morgan, Marc Colosimo and Lynette Hirschman

Citation: BMC Bioinformatics 2005 6(Suppl 1):S2

Content type: Report
Published on: 24 May 2005

GENETAG: a tagged corpus for gene/protein named entity recognition

Named entity recognition (NER) is an important first step for text mining the biomedical literature. Evaluating the performance of biomedical NER systems is impossible without a standardized test corpus. The a...

Authors: Lorraine Tanabe, Natalie Xie, Lynne H Thom, Wayne Matten and W John Wilbur

Citation: BMC Bioinformatics 2005 6(Suppl 1):S3

Content type: Report
Published on: 24 May 2005

BioCreAtIvE Task1A: entity identification with a stochastic tagger

Our approach to Task 1A was inspired by Tanabe and Wilbur's ABGene system [1, 2]. Like Tanabe and Wilbur, we approached the problem as one of part-of-speech tagging, adding a GENE tag to the standard tag set. Whe...

Authors: Shuhei Kinoshita, K Bretonnel Cohen, Philip V Ogren and Lawrence Hunter

Citation: BMC Bioinformatics 2005 6(Suppl 1):S4

Content type: Report
Published on: 24 May 2005

Exploring the boundaries: gene and protein identification in biomedical text

Good automatic information extraction tools offer hope for automatic processing of the exploding biomedical literature, and successful named entity recognition is a key component for such tools.

Authors: Jenny Finkel, Shipra Dingare, Christopher D Manning, Malvina Nissim, Beatrice Alex and Claire Grover

Citation: BMC Bioinformatics 2005 6(Suppl 1):S5

Content type: Report
Published on: 24 May 2005

Identifying gene and protein mentions in text using conditional random fields

We present a model for tagging gene and protein mentions from text using the probabilistic sequence tagging framework of conditional random fields (CRFs). Conditional random fields model the probability P(t|o) of...

Authors: Ryan McDonald and Fernando Pereira

Citation: BMC Bioinformatics 2005 6(Suppl 1):S6

Content type: Report
Published on: 24 May 2005

Recognition of protein/gene names from text using an ensemble of classifiers

This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using...

Authors: GuoDong Zhou, Dan Shen, Jie Zhang, Jian Su and SoonHeng Tan

Citation: BMC Bioinformatics 2005 6(Suppl 1):S7

Content type: Report
Published on: 24 May 2005

Gene/protein name recognition based on support vector machine using dictionary as features

Automated information extraction from biomedical literature is important because a vast amount of biomedical literature has been published. Recognition of the biomedical named entities is the first step in inf...

Authors: Tomohiro Mitsumori, Sevrani Fation, Masaki Murata, Kouichi Doi and Hirohumi Doi

Citation: BMC Bioinformatics 2005 6(Suppl 1):S8

Content type: Report
Published on: 24 May 2005

Systematic feature evaluation for gene name recognition

In task 1A of the BioCreAtIvE evaluation, systems had to be devised that recognize words and phrases forming gene or protein names in natural language sentences. We approach this problem by building a word cla...

Authors: Jörg Hakenberg, Steffen Bickel, Conrad Plake, Ulf Brefeld, Hagen Zahn, Lukas Faulstich, Ulf Leser and Tobias Scheffer

Citation: BMC Bioinformatics 2005 6(Suppl 1):S9

Content type: Report
Published on: 24 May 2005

Text Detective: a rule-based system for gene annotation in biomedical texts

The identification of mentions of gene or gene products in biomedical texts is a critical step in the development of text mining applications in biosciences. The complexity and ambiguity of gene nomenclature m...

Authors: Javier Tamames

Citation: BMC Bioinformatics 2005 6(Suppl 1):S10

Content type: Report
Published on: 24 May 2005

Overview of BioCreAtIvE task 1B: normalized gene lists

Our goal in BioCreAtIvE has been to assess the state of the art in text mining, with emphasis on applications that reflect real biological applications, e.g., the curation process for model organism databases....

Authors: Lynette Hirschman, Marc Colosimo, Alexander Morgan and Alexander Yeh

Citation: BMC Bioinformatics 2005 6(Suppl 1):S11

Content type: Report
Published on: 24 May 2005

Data preparation and interannotator agreement: BioCreAtIvE Task 1B

We prepared and evaluated training and test materials for an assessment of text mining methods in molecular biology. The goal of the assessment was to evaluate the ability of automated systems to generate a li...

Authors: Marc E Colosimo, Alexander A Morgan, Alexander S Yeh, Jeffrey B Colombe and Lynette Hirschman

Citation: BMC Bioinformatics 2005 6(Suppl 1):S12

Content type: Report
Published on: 24 May 2005

Automatically annotating documents with normalized gene lists

Document gene normalization is the problem of creating a list of unique identifiers for genes that are mentioned within a document. Automating this process has many potential applications in both information e...

Authors: Jeremiah Crim, Ryan McDonald and Fernando Pereira

Citation: BMC Bioinformatics 2005 6(Suppl 1):S13

Content type: Report
Published on: 24 May 2005

ProMiner: rule-based protein and gene entity recognition

Identification of gene and protein names in biomedical text is a challenging task as the corresponding nomenclature has evolved over time. This has led to multiple synonyms for individual genes and proteins, a...

Authors: Daniel Hanisch, Katrin Fundel, Heinz-Theodor Mevissen, Ralf Zimmer and Juliane Fluck

Citation: BMC Bioinformatics 2005 6(Suppl 1):S14

Content type: Report
Published on: 24 May 2005

A simple approach for protein name identification: prospects and limits

Significant parts of biological knowledge are available only as unstructured text in articles of biomedical journals. By automatically identifying gene and gene product (protein) names and mapping these to uni...

Authors: Katrin Fundel, Daniel Güttler, Ralf Zimmer and Joannis Apostolakis

Citation: BMC Bioinformatics 2005 6(Suppl 1):S15

Content type: Report
Published on: 24 May 2005

Evaluation of BioCreAtIvE assessment of task 2

Molecular biology has accumulated substantial amounts of data concerning functions of genes and proteins. Information relating to functional descriptions is generally extracted manually from textual data and store...

Authors: Christian Blaschke, Eduardo Andres Leon, Martin Krallinger and Alfonso Valencia

Citation: BMC Bioinformatics 2005 6(Suppl 1):S16

Content type: Report
Published on: 24 May 2005

An evaluation of GO annotation retrieval for BioCreAtIvE and GOA

The Gene Ontology Annotation (GOA) database (http://www.ebi.ac.uk/GOA) aims to provide high-quality supplementary GO annotation to proteins in the UniProt Know...

Authors: Evelyn B Camon, Daniel G Barrell, Emily C Dimmer, Vivian Lee, Michele Magrane, John Maslen, David Binns and Rolf Apweiler

Citation: BMC Bioinformatics 2005 6(Suppl 1):S17

Content type: Report
Published on: 24 May 2005

Learning Statistical Models for Annotating Proteins with Function Information using Biomedical Text

The BioCreative text mining evaluation investigated the application of text mining methods to the task of automatically extracting information from text in biomedical research articles. We participated in Task...

Authors: Soumya Ray and Mark Craven

Citation: BMC Bioinformatics 2005 6(Suppl 1):S18

Content type: Report
Published on: 24 May 2005

A sentence sliding window approach to extract protein annotations from biomedical articles

Within the emerging field of text mining and statistical natural language processing (NLP) applied to biomedical articles, a broad variety of techniques have been developed during the past years. Nevertheless,...

Authors: Martin Krallinger, Maria Padron and Alfonso Valencia

Citation: BMC Bioinformatics 2005 6(Suppl 1):S19

Content type: Report
Published on: 24 May 2005

Protein annotation as term categorization in the gene ontology using word proximity networks

We participated in the BioCreAtIvE Task 2, which addressed the annotation of proteins into the Gene Ontology (GO) based on the text of a given document and the selection of evidence text from the document just...

Authors: Karin Verspoor, Judith Cohn, Cliff Joslyn, Sue Mniszewski, Andreas Rechtsteiner, Luis M Rocha and Tiago Simas

Citation: BMC Bioinformatics 2005 6(Suppl 1):S20

Content type: Report
Published on: 24 May 2005

Finding genomic ontology terms in text using evidence content

The development of text mining systems that annotate biological entities with their properties using scientific literature is an important recent research topic. These systems need first to recognize the biolo...

Authors: Francisco M Couto, Mário J Silva and Pedro M Coutinho

Citation: BMC Bioinformatics 2005 6(Suppl 1):S21

Content type: Report
Published on: 24 May 2005

Mining protein function from text using term-based support vector machines

Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We participated in Task 2, which addressed a...

Authors: Simon B Rice, Goran Nenadic and Benjamin J Stapley

Citation: BMC Bioinformatics 2005 6(Suppl 1):S22

Content type: Report
Published on: 24 May 2005

Data-poor categorization and passage retrieval for Gene Ontology Annotation in Swiss-Prot

In the context of the BioCreative competition, where training data were very sparse, we investigated two complementary tasks: 1) given a Swiss-Prot triplet, containing a protein, a GO (Gene Ontology) term and ...

Authors: Frédéric Ehrler, Antoine Geissbühler, Antonio Jimeno and Patrick Ruch

Citation: BMC Bioinformatics 2005 6(Suppl 1):S23

Content type: Report
Published on: 24 May 2005