Figure 2 | BMC Bioinformatics

Figure 2

From: Moara: a Java library for extracting and normalizing gene and protein mentions

Code example and output for extracting and normalizing gene/protein mentions. A: Text extracted from PubMed abstract 1385987 (cf. Figure 1). Extraction was performed with CBR-Tagger and ABNER, both trained on the BioCreative 2 Gene Mention corpus alone. Normalization was performed for human genes using flexible matching and multiple cosine disambiguation. B: The output presents the text of each extracted mention together with its start and end positions. The gene/protein candidates matched to each mention are listed below it, each with its identifier in the Entrez Gene database, the synonym to which the mention text was matched, and the disambiguation score. Candidates marked with an asterisk (*) were selected by the system according to the disambiguation strategy. Because a multiple disambiguation procedure was used in this example, more than one candidate may be selected for the same mention.
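The output layout described in panel B can be illustrated with a minimal Java sketch. It is not Moara's API: the class names `Mention` and `Candidate`, their fields, the `format` method, and the exact text layout are all assumptions made for illustration, and the values in `main` are invented placeholders rather than data from the figure. The sketch only models what the caption states: a mention with start and end positions, a list of candidates carrying an identifier, a matched synonym and a score, and an asterisk on each selected candidate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class MentionOutputSketch {

    // Hypothetical type: one database candidate matched to a mention.
    public static final class Candidate {
        public final String geneId;          // identifier in the gene database
        public final String matchedSynonym;  // synonym the mention text matched
        public final double score;           // disambiguation score
        public final boolean selected;       // true if chosen by disambiguation

        public Candidate(String geneId, String matchedSynonym,
                         double score, boolean selected) {
            this.geneId = geneId;
            this.matchedSynonym = matchedSynonym;
            this.score = score;
            this.selected = selected;
        }
    }

    // Hypothetical type: one extracted mention with its character offsets.
    public static final class Mention {
        public final String text;
        public final int start;
        public final int end;
        public final List<Candidate> candidates;

        public Mention(String text, int start, int end,
                       List<Candidate> candidates) {
            this.text = text;
            this.start = start;
            this.end = end;
            this.candidates = candidates;
        }
    }

    // Renders a mention followed by its candidates, one per line.
    // Selected candidates are prefixed with "* "; the layout is an assumption.
    public static String format(Mention m) {
        StringBuilder sb = new StringBuilder();
        sb.append(m.text).append(" [").append(m.start).append('-')
          .append(m.end).append("]\n");
        for (Candidate c : m.candidates) {
            sb.append(c.selected ? "* " : "  ")
              .append(c.geneId).append('\t')
              .append(c.matchedSynonym).append('\t')
              .append(String.format(Locale.ROOT, "%.2f", c.score))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder values only; with multiple disambiguation, more than
        // one candidate of the same mention may be flagged as selected.
        Mention m = new Mention("geneX", 5, 10, Arrays.asList(
                new Candidate("ID1", "gene x", 0.90, true),
                new Candidate("ID2", "gene-x", 0.85, true),
                new Candidate("ID3", "genex like", 0.20, false)));
        System.out.print(format(m));
    }
}
```

Keeping the `selected` flag on each candidate, rather than storing a single chosen identifier per mention, is what lets the structure represent a multiple disambiguation result in which several candidates are retained for one mention.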
