Table 4 Overall text-mining performance with the residue filtering based on analysis of sentence parse tree

From: Natural language processing in text mining for structural modeling of protein complexes

| Method of parse tree analysis | L_tot | L_int | Coverage (%) | Success (%) | Accuracy (%) | ΔN(0) | ΔN(1) |
|---|---|---|---|---|---|---|---|
| Method 1. Scoring of the residue-containing sentence only | 222 | 173 | 38.3 | 29.9 | 77.9 | −13 | +10 |
| Method 2. Scoring of the residue-containing sentence and keyword spotting in the context sentences | 208 | 154 | 35.9 | 26.6 | 74.0 | −7 | +3 |
| Method 3. SVM model with scores of the residue-containing and context sentences | 182 | 146 | 31.4 | 25.2 | 80.2 | −27 | +21 |

  1. Keywords used in the analysis were taken from our dictionary (Table 3). For definitions of columns 2–8, see footnotes to Table 1.
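
The column definitions are given in the footnotes to Table 1, which are not reproduced here. As a rough consistency check only, the sketch below assumes that Accuracy (%) is the third column (L int) divided by the second column (L tot), times 100. That assumption is not stated in this table, but it reproduces the tabulated accuracy for all three methods. The function name `accuracy_pct` and the `ROWS` layout are hypothetical, introduced purely for illustration.

```python
# Consistency check for the Accuracy (%) column.
# ASSUMPTION (not stated in this table): accuracy = 100 * L_int / L_tot.

# (method label, L_tot, L_int, reported accuracy in %), copied from the table.
ROWS = [
    ("Method 1", 222, 173, 77.9),
    ("Method 2", 208, 154, 74.0),
    ("Method 3", 182, 146, 80.2),
]


def accuracy_pct(l_int: int, l_tot: int) -> float:
    """Return 100 * l_int / l_tot, the assumed definition of accuracy."""
    if l_tot <= 0:
        raise ValueError("l_tot must be positive")
    return 100.0 * l_int / l_tot


if __name__ == "__main__":
    for label, l_tot, l_int, reported in ROWS:
        computed = round(accuracy_pct(l_int, l_tot), 1)
        status = "matches" if computed == reported else "differs from"
        print(f"{label}: computed {computed} {status} reported {reported}")
```

Under this assumed definition, Method 3 returns the fewest pairs overall (L tot = 182) but the highest share of correct ones, which is in line with the accuracy values reported in the table.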