Table 2 Main features used by the participating teams. The table shows the features and strategies adopted by the different participants and the number of users.

From: Evaluation of BioCreAtIvE assessment of task 2

| Characteristics (C), resources (R) and methods (M) | Users |
| --- | --- |
| (C) Sentence level (retrieval unit) | [19,20,22,25,26] |
| (C) Paragraph level (retrieval unit) | [21,23,24] |
| (C) Full article processed | [19,21,22,24,25] |
| (C) Full article processed except methods section | [26] |
| (C) Only abstract processed | [20] |
| (C) GO term – Protein distance | [22,24,25] |
| (M) Stemming | [20,22,24,26] |
| (M) POS tagging | [25,26] |
| (M) Shallow parsing | [25] |
| (M) Finite state automata | [20,25] |
| (M) Edit distance ranking | [20] |
| (M) Vector space model | [20,21] |
| (M) Machine learning technique | [23-25] |
| (M) Support Vector Machines | [23] |
| (M) Naïve Bayes models | [24,25] |
| (M) N-gram models | [24] |
| (M) External resource – tool: GATE NLP tool | [21] |
| (M) External resource – tool: Morphological normalizer BioMorpher | [21] |
| (M) External resource – tool: qtile query based ranking tool | [26] |
| (M) External resource – tool: Grok POS tagger | [25] |
| (M) Heuristic rules | [22,24-26] |
| (M) Regular expressions/pattern matching | [19,20,22,24,25] |
| (M) Literal string matching | [22,24] |
| (R) Protein name aliases (link to external databases) | [22,24,26] |
| (R) GO terms used | [19-26] |
| (R) GOA data used | [22-24] |
| (R) GO term forming words/tokens | [19,22,24,26] |
| (R) GO term variants | [22,25] |
| (R) External resource – data: Dictionary of suffixes | [24] |
| (R) External resource – data: UMLS/MeSH dictionary | [20,24] |
| (R) External resource – data: HUGO database | [22,24,26] |
| (R) External resource – data: SGD database | [24] |
| (R) External resource – data: MGI database | [24] |
| (R) External resource – data: RGD database | [24] |
| (R) External resource – data: TAIR database | [24] |
| (R) External resource – data: Procter and Gamble protein synonyms | [21] |