| PTM | Filtering token | Generic corpus: # filtered abstracts | Generic corpus: # retrieved abstracts | Generic corpus: precision | Positive corpora: # abstracts | Positive corpora: recall |
|---|---|---|---|---|---|---|
| Acetylation | “acet” | 26,144 | 1,753 | 65% | 97 | 89% |
| Amidation | “amid” | 21,861 | 1,515 | 73% | 61 | 95% |
| Disulfide bond | “disulf” | 6,933 | 1,095 | 94% | 514 | 75% |
| Glycosylation | “glyco” | 31,379 | 2,746 | 73% | 464 | 85% |
| Methylation | “methyl” | 28,015 | 664 | 57% | 47 | 87% |
| Phosphorylation | “phospho” | 61,144 | 16,129 | 71% | 906 | 93% |
| Sulfation | “sulf” | 20,834 | 256 | 65% | 40 | 92% |
- “Filtering token” is the term used to select the abstracts, “# filtered abstracts” is the number of abstracts that contain this term, and “# retrieved abstracts” is the number of abstracts selected by the complete sentence extraction procedure. Precision was estimated by manual analysis of 100 positive abstracts.
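
As a rough illustration of the quantities in the table, the sketch below filters abstracts by a token and computes precision and recall as simple ratios. It is a minimal sketch, not the procedure used in the study: the function names (`filter_abstracts`, `precision`, `recall`) are hypothetical, the case-insensitive substring match is an assumption, and the sample abstracts are invented for the example.

```python
def filter_abstracts(abstracts, token):
    """Keep abstracts containing the filtering token.

    Assumption: a case-insensitive substring match; the source does not
    specify how the token is matched.
    """
    needle = token.lower()
    return [text for text in abstracts if needle in text.lower()]


def precision(correct, evaluated):
    """Fraction of manually evaluated retrieved abstracts judged correct."""
    return correct / evaluated


def recall(retrieved_positive, total_positive):
    """Fraction of a positive corpus that the procedure retrieves."""
    return retrieved_positive / total_positive


# Toy data, invented for illustration only.
sample = [
    "Histone acetylation regulates chromatin structure.",
    "The kinase phosphorylates its substrate at Ser-15.",
    "N-terminal Acetyl groups were detected by mass spectrometry.",
]
filtered = filter_abstracts(sample, "acet")
print(len(filtered))       # number of abstracts passing the token filter
print(precision(3, 4))     # e.g. 3 of 4 checked abstracts judged correct
print(recall(9, 10))       # e.g. 9 of 10 positive abstracts retrieved
```

A short token such as “sulf” also matches unrelated words (for example “sulfate” in a buffer description), which is one reason the filtered set is much larger than the retrieved set.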