
Table 1 Features extracted.

From: Gene/protein name recognition based on support vector machine using dictionary as features

Feature                       Value
word                          all words in the training data
orthography                   capital, symbol, etc. (see Table 2)
prefix                        1-, 2-, or 3-grams of the starting letters of a word
suffix                        1-, 2-, or 3-grams of the ending letters of a word
part of speech                tags assigned by the Brill tagger
preceding class               -2, -1 (the two preceding positions)
gene/protein name dictionary  protein names collected from SWISS-PROT and TrEMBL
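
As an illustration of how the features in Table 1 might be assembled for a single token, the following is a minimal Python sketch and not the authors' implementation. The function names (`affixes`, `token_features`), the feature keys, the `"BOS"` padding value, and the three orthographic flags are assumptions made for this example; the actual orthographic features are those listed in Table 2, and the dictionary lookup here is a simple exact match on single words, which may differ from the matching used in the paper. Part-of-speech tags are passed in as input rather than computed.

```python
from typing import Dict, List, Set, Union

Feature = Union[str, bool]


def affixes(word: str, max_n: int = 3) -> Dict[str, str]:
    """Return the 1-, 2-, and 3-gram prefixes and suffixes of a word."""
    feats: Dict[str, str] = {}
    for n in range(1, max_n + 1):
        if len(word) >= n:  # skip n-grams longer than the word itself
            feats[f"prefix{n}"] = word[:n]
            feats[f"suffix{n}"] = word[-n:]
    return feats


def token_features(
    tokens: List[str],
    pos_tags: List[str],
    prev_classes: List[str],
    i: int,
    dictionary: Set[str],
) -> Dict[str, Feature]:
    """Build a feature dictionary for tokens[i].

    prev_classes holds the classes already assigned to tokens[0..i-1].
    """
    word = tokens[i]
    feats: Dict[str, Feature] = {"word": word, "pos": pos_tags[i]}
    feats.update(affixes(word))

    # Orthography: hypothetical stand-ins for the features of Table 2.
    feats["has_capital"] = any(c.isupper() for c in word)
    feats["has_digit"] = any(c.isdigit() for c in word)
    feats["has_symbol"] = any(not c.isalnum() for c in word)

    # Preceding class at positions -1 and -2; "BOS" is an assumed padding value.
    feats["class-1"] = prev_classes[i - 1] if i >= 1 else "BOS"
    feats["class-2"] = prev_classes[i - 2] if i >= 2 else "BOS"

    # Dictionary feature: simplified exact match against a set of names.
    feats["in_dictionary"] = word in dictionary
    return feats


if __name__ == "__main__":
    tokens = ["the", "p53", "protein"]
    pos_tags = ["DT", "NN", "NN"]  # assumed to come from an external tagger
    feats = token_features(tokens, pos_tags, ["O"], 1, {"p53"})
    for name, value in feats.items():
        print(name, value)
```

In a setup like this, each feature dictionary would typically be converted to a sparse binary vector before being passed to a support vector machine; that step is omitted here.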