Natural language processing in text mining for structural modeling of protein complexes

BMC Bioinformatics

Table 4 Overall text-mining performance with residue filtering based on analysis of the sentence parse tree

Method of parse tree analysis	L_tot	L_int	Coverage (%)	Success (%)	Accuracy (%)	ΔN(0)	ΔN(1)
Method 1. Scoring of the residue-containing sentence only	222	173	38.3	29.9	77.9	−13	+10
Method 2. Scoring of the residue-containing sentence and keyword spotting in the context sentences	208	154	35.9	26.6	74.0	−7	+3
Method 3. SVM model with scores of the residue-containing and context sentences	182	146	31.4	25.2	80.2	−27	+21

Keywords used in the analysis were taken from our dictionary (Table 3). For definitions of columns 2–8, see footnotes to Table 1.
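The column definitions are given in the footnotes to Table 1, which are not reproduced here. As a consistency check only, the sketch below assumes that Accuracy is the ratio of L_int to L_tot expressed as a percentage; this assumption is inferred from the numbers in Table 4, not taken from the footnotes. The function name `accuracy_percent` and the `ROWS` list are illustrative.

```python
# Rows of Table 4: (method, L_tot, L_int, reported Accuracy in %).
ROWS = [
    ("Method 1", 222, 173, 77.9),
    ("Method 2", 208, 154, 74.0),
    ("Method 3", 182, 146, 80.2),
]


def accuracy_percent(l_tot: int, l_int: int) -> float:
    """Assumed definition (not from the source): 100 * L_int / L_tot, to one decimal."""
    return round(100.0 * l_int / l_tot, 1)


if __name__ == "__main__":
    for name, l_tot, l_int, reported in ROWS:
        computed = accuracy_percent(l_tot, l_int)
        # Print the recomputed value next to the value reported in the table.
        print(f"{name}: computed {computed:.1f}%, reported {reported:.1f}%")
```

Under this assumption the recomputed values agree with the reported Accuracy column for all three methods. Coverage and Success depend on a total count that Table 4 does not list, so they are not recomputed here.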

ISSN: 1471-2105