BMC Bioinformatics

Table 1 Sensitivity of gene name extraction of PSE. Five datasets were used for the evaluation. Sets 1A and 1B each contain forty abstracts retrieved with randomly generated PubMed IDs. Sets 2A and 2B each contain forty abstracts randomly selected from the 4,548 abstracts retrieved with the keywords "gene AND disease AND activation". The GENIA dataset is the GENIA corpus, which contains 2,000 abstracts. The results for Sets 1A and 1B, for Sets 2A and 2B, and for GENIA are referred to as Ev1, Ev2 and Ev3 in this manuscript, respectively. TP, FP and FN denote the numbers of true positives, false positives and false negatives, respectively.

From: PSE: A tool for browsing a large amount of MEDLINE/PubMed abstracts with gene names and common words as the keywords

Dataset	TP	FP	FN	Precision	Recall	F-measure
Set 1A	50	15	61	76.9%	45.0%	56.8%
Set 1B	40	11	21	78.4%	65.6%	71.4%
Set 2A	287	23	157	92.6%	64.6%	76.1%
Set 2B	291	49	210	85.6%	58.1%	69.2%
GENIA	12,842	7,752	6,912	62.4%	65.0%	63.7%
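
The percentage columns can be recomputed from the TP, FP and FN counts. The sketch below assumes the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), and the balanced F-measure as the harmonic mean of the two); the table itself does not state the formulas, and the function name `prf` is only illustrative.

```python
# Minimal sketch: recompute precision, recall and F-measure from the counts
# in Table 1. The formulas are the standard definitions and are an assumption
# here, since the table does not spell them out.

def prf(tp, fp, fn):
    """Return (precision, recall, f_measure) as fractions between 0 and 1."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# (TP, FP, FN) counts copied from the table rows.
counts = {
    "Set 1A": (50, 15, 61),
    "Set 1B": (40, 11, 21),
    "Set 2A": (287, 23, 157),
    "Set 2B": (291, 49, 210),
    "GENIA": (12842, 7752, 6912),
}

for name, (tp, fp, fn) in counts.items():
    p, r, f = prf(tp, fp, fn)
    # Print one tab-separated row per dataset, percentages to one decimal.
    print(f"{name}\t{p:.1%}\t{r:.1%}\t{f:.1%}")
```

Under these assumed definitions, the recomputed percentages agree with the Precision, Recall and F-measure columns above to one decimal place.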

ISSN: 1471-2105
