
Table 5 Example of an article containing multiple gene and species mentions (PMC2680910)

From: BioCreative III interactive task: an overview

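PMCID 2680910. Columns 1 to 5 are curated output (a); columns 78, 68, 65, 93 and 89 are system raw output, labeled by team.

| Gene ID | Gene name | Species | Central vote | 1 | 2 | 3 | 4 | 5 | 78 | 68 | 65 | 93 | 89 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10015 | ALIX | human | 7 | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C |
| 57630 | POSH | human | 7 | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C | Y, C |
| 155030 | Gag | HIV-1 | 6 | Y, C | Y, C | - | Y, C | Y, C | Y, C | - | - | Y, C | - |
| 36990 | POSH | Drosophila | | Y | Y | Y | Y | Y | Y | Y | - | Y | Y |
| 43330 | ALIX | Drosophila | | Y | Y | Y | Y | Y | Y | Y | - | - | Y |
| 128866 | CHMP4B | human | | Y | Y | Y | Y | - | Y | - | Y | - | Y |
| 39659 | TAK-1 | Drosophila | | Y | Y | Y | Y | Y | - | Y | - | - | Y |
| 3355106 | ALG-2 | Drosophila | | Y | Y | Y | Y | Y | - | - | - | Y | - |
| 7323 | UbcH5c | human | | Y | Y | Y | Y | - | - | Y | Y | - | - |
| 1489984 | p9 | EIAV | | Y | Y | Y | Y | - | - | - | - | - | - |
| 137492 | HCRP1 | human | | Y | Y | Y | Y | - | Y | - | Y | - | - |
| 7251 | TSG101 | human | | Y | Y | Y | Y | - | Y | - | Y | - | - |
| 155030 | p6 | HIV-1 | | Y | - | Y | Y | - | - | - | - | - | - |
| 7334 | UBC13 | human | 1 | Y | - | Y, C | Y | - | Y | Y | Y | - | - |
| Total genes detected | | | | 14 | 19 | 13 | 26 | 10 | 90 | 22 | 120 | 9 | 52 |
| FP | | | | 0 | 5 | 0 | 0 | 3 | 81 | 15 | 113 | 4 | 46 |
| FN | | | | 0 | 2 | 1 | 0 | 7 | 5 | 7 | 7 | 8 | 8 |
| TP | | | | 14 | 12 | 13 | 14 | 7 | 9 | 7 | 7 | 5 | 6 |
| Precision | | | | 1.00 | 0.71 | 1.00 | 1.00 | 0.70 | 0.10 | 0.32 | 0.06 | 0.56 | 0.12 |
| Recall | | | | 1.00 | 0.86 | 0.93 | 1.00 | 0.50 | 0.64 | 0.50 | 0.50 | 0.38 | 0.43 |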
List of Entrez Gene IDs, gene names and species found in PMC2680910. The Central Vote column gives the number of curators who selected the gene as central. “Y”: a gene mentioned in the article was detected; “-”: a gene mentioned in the article was missed; “C”: central gene as determined by majority vote; for the systems, “C” means that the system ranked the gene higher than the non-central genes. “Total genes detected”: all gene mentions returned by a given system (everything the system considered a gene). FP, FN and TP stand for false positives, false negatives and true positives, respectively.
a Curated output by manual curation (2 curators, columns 1-2) and system-assisted curation (5 curators, of whom 3 are shown, columns 3-5).
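
The Precision and Recall rows can be recomputed from the TP, FP and FN rows of Table 5. The sketch below assumes the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN); the source does not state the formulas, but these definitions reproduce the tabulated values to two decimals. The function and variable names are illustrative and do not come from the article.

```python
# Counts copied from the TP, FP and FN rows of Table 5.
# Column labels: curators 1-5, then teams 78, 68, 65, 93 and 89.
COLUMNS = ["1", "2", "3", "4", "5", "78", "68", "65", "93", "89"]
TP = [14, 12, 13, 14, 7, 9, 7, 7, 5, 6]
FP = [0, 5, 0, 0, 3, 81, 15, 113, 4, 46]
FN = [0, 2, 1, 0, 7, 5, 7, 7, 8, 8]


def precision(tp, fp):
    """Assumed definition: fraction of reported genes that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Assumed definition: fraction of expected genes that were reported."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def score_table():
    """Return {column label: (precision, recall)}, rounded to two decimals."""
    return {
        col: (round(precision(tp, fp), 2), round(recall(tp, fn), 2))
        for col, tp, fp, fn in zip(COLUMNS, TP, FP, FN)
    }


if __name__ == "__main__":
    for col, (p, r) in score_table().items():
        print(f"{col:>3}  precision={p:.2f}  recall={r:.2f}")
```

Under these assumed definitions, team 65, for example, has 7 true positives and 113 false positives, so its precision is 7 / 120, which rounds to 0.06 as in the table.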