Figure 1 | BMC Bioinformatics

Figure 1

From: Building a protein name dictionary from full text: a machine learning term extraction approach

Filtering process and exclusion lists. The oval boxes on the left of the figure show the exclusion lists used as input to the filtering process; each list is connected to the filtering step that uses it. The steps run in sequence, and each one removes the terms that match its exclusion list according to the rule described in the corresponding box of the second column. Some steps match against the entire term (e.g., the first step at the top), whereas others match only specific words within a term. The lists were built and are maintained manually. The rectangular boxes on the right show the number of terms excluded at each step when processing the most frequent terms from JBC2000 (numbers in parentheses indicate counts of unique terms).
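The sequential filtering described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the two matching rules (whole-term match and any-word match), the sample terms, and the exclusion entries are all hypothetical, and the sketch assumes that the first number reported per step is the total number of occurrences removed while the second is the number of unique terms.

```python
from collections import Counter


def match_whole_term(term, exclusions):
    # Hypothetical rule: the entire term must appear in the exclusion list.
    return term.lower() in exclusions


def match_any_word(term, exclusions):
    # Hypothetical rule: any single word of the term appears in the list.
    return any(word in exclusions for word in term.lower().split())


def run_filters(term_counts, steps):
    """Apply the filtering steps in sequence.

    term_counts: mapping of term -> frequency.
    steps: list of (step_name, matcher, exclusion_set) tuples.
    Returns the surviving terms and, per step, a tuple of
    (step_name, occurrences_removed, unique_terms_removed).
    """
    remaining = dict(term_counts)
    report = []
    for name, matcher, exclusions in steps:
        dropped = {t: n for t, n in remaining.items() if matcher(t, exclusions)}
        for term in dropped:
            del remaining[term]
        report.append((name, sum(dropped.values()), len(dropped)))
    return remaining, report


# Illustrative data only; these terms and lists are not taken from JBC2000.
terms = Counter({
    "protein kinase C": 5,
    "room temperature": 3,
    "Fig": 4,
    "insulin receptor": 2,
    "mM NaCl": 6,
})
steps = [
    ("whole term", match_whole_term, {"fig", "room temperature"}),
    ("any word", match_any_word, {"mm"}),
]
remaining, report = run_filters(terms, steps)
for name, occurrences, unique in report:
    print(f"{name}: {occurrences} ({unique})")
```

Because each step only sees the terms that survived the previous one, the per-step counts depend on the order of the steps, which mirrors the top-to-bottom sequence shown in the figure.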
