Table 4 Example of an article that presents name ambiguity between gene names, and between a gene name and a term from another domain (PMC2275796).

From: BioCreative III interactive task: an overview

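PMC2275796 (Team columns show system raw output)

| Gene ID | Gene names | Species | Central Vote | Curated Output^a | Team 78 | Team 68 | Team 65 | Team 93 | Team 89 |
|---|---|---|---|---|---|---|---|---|---|
| 56606 | GLUT9/SLC2A9 | human | 7 | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C |
| 9948 | WDR1/AIP1 | human | | Y | Y | Y | Y | Y | - |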
Some examples of ambiguity found in the systems’ output
11182 GLUT9/SLC2A6 human     N, C N,C N,C  
  CAD     N   N   
  MI     N     N
139741 MAGI2/AIP1 human    N   N   N
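| Performance | Measure | Curated Output^a | Team 78 | Team 68 | Team 65 | Team 93 | Team 89 |
|---|---|---|---|---|---|---|---|
| | Total genes detected | 2 | 6 | 4 | 44 | 4 | 15 |
| Performance for total of genes in the article | FP | 0 | 4 | 2 | 42 | 2 | 14 |
| | FN | 0 | 0 | 0 | 0 | 0 | 1 |
| | TP | 2 | 2 | 2 | 2 | 2 | 1 |
| | Precision | 1 | 0.33 | 0.50 | 0.05 | 0.50 | 0.07 |
| | Recall | 1 | 1 | 1 | 1 | 1 | 0.5 |
| | Total central genes | 1 | 1 | 2 | 2 | 2 | 1 |
| Performance for detecting central genes^b | FP | 0 | 0 | 1 | 1 | 1 | 0 |
| | FN | 0 | 0 | 0 | 0 | 0 | 0 |
| | TP | 1 | 1 | 1 | 1 | 1 | 1 |
| | Precision | 1 | 1 | 0.50 | 0.50 | 0.50 | 1 |
| | Recall | 1 | 1 | 1 | 1 | 1 | 1 |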
List of Entrez Gene IDs, gene names and species found in PMC2275796. The Central Vote column indicates the number of curators who selected the gene as central. “Y”: a gene mentioned in the article was detected; “-”: a gene mentioned in the article was missed; “N”: the entity detected was not a gene or was the wrong gene; “C”: central gene as determined by majority vote; for the systems, it means that the gene was ranked high (higher than non-central genes). “Total genes detected”: the total number of gene mentions provided by a given system (what the system considered a gene). FP, FN and TP stand for false positive, false negative and true positive, respectively.
^a Curated output from manual curation (2 curators) and from system-assisted curation (5 curators) was identical, so it is shown as a single column.
^b The FP for central gene performance was calculated by comparing the list of manually curated central genes with the gene ranking produced by the system. If any non-central gene is ranked higher than a central one, it is considered an FP.
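
The precision and recall rows can be checked against the FP, FN and TP counts. The sketch below assumes the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN). The table does not state these formulas explicitly, although its values are consistent with them. The function name `precision_recall` and the dictionary `all_gene_counts` are hypothetical names introduced for illustration; the counts are copied from the "total of genes in the article" rows.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from raw counts.

    Assumes the standard definitions; a zero denominator yields 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# (TP, FP, FN) per column, copied from the table rows for all genes in the article
all_gene_counts = {
    "Curated": (2, 0, 0),
    "Team 78": (2, 4, 0),
    "Team 68": (2, 2, 0),
    "Team 65": (2, 42, 0),
    "Team 93": (2, 2, 0),
    "Team 89": (1, 14, 1),
}

scores = {name: precision_recall(*counts) for name, counts in all_gene_counts.items()}

for name, (precision, recall) in scores.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

Rounded to two decimals, the computed values agree with the table: for example, Team 65 has 2 true positives among 44 detected genes, which gives a precision of about 0.05, and Team 89 misses one of the two genes, which gives a recall of 0.5.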