Natural language processing in text mining for structural modeling of protein complexes

BMC Bioinformatics

Table 2 Overall text-mining performance with residue filtering based on spotting keyword(s) from specialized dictionaries in the residue-containing sentences

Dictionary and reference	Number of PPI keywords	L_tot^a	L_int^b	Coverage (%)^c	Success (%)^d	Accuracy (%)^e	ΔN(0)^f	ΔN(1)^f
Blaschke et al. [20]	43	265	205	45.8	35.4	77.4	0	−8
Chowdhary et al. [58]	191	284	233	49.1	40.2	82.0	−7	−4
Hakenberg et al. [59]	234	297	232	51.3	40.1	78.1	6	−7
Plake et al. [60]	73	291	230	50.3	39.7	79.0	1	−1
Raja et al. [23]	412	302	247	52.2	42.7	81.8	0	−5
Schuhmann et al. [57]	64	212	152	36.6	26.3	71.7	−1	5
Temkin et al. [21]	174	283	223	48.9	38.5	78.8	0	−9
Own dictionary	16	224	169	38.7	29.2	75.4	−6	8

For definitions of columns 3–9, see the footnotes to Table 1. The full content of the in-house dictionary is given in Table 3, but only its PPI+ive part was used to calculate the data in this table.
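
The column definitions are given in the footnotes to Table 1 and are not reproduced here. As a consistency check on the rows above, the following minimal sketch assumes that Accuracy (%) is the ratio L_int / L_tot expressed in percent. This is an assumption inferred only from the numbers in this table, not a definition taken from the paper, and the names `ROWS`, `accuracy_percent` and `check_rows` are hypothetical.

```python
# (L_tot, L_int, reported Accuracy %) copied from the rows of Table 2.
ROWS = {
    "Blaschke et al. [20]": (265, 205, 77.4),
    "Chowdhary et al. [58]": (284, 233, 82.0),
    "Hakenberg et al. [59]": (297, 232, 78.1),
    "Plake et al. [60]": (291, 230, 79.0),
    "Raja et al. [23]": (302, 247, 81.8),
    "Schuhmann et al. [57]": (212, 152, 71.7),
    "Temkin et al. [21]": (283, 223, 78.8),
    "Own dictionary": (224, 169, 75.4),
}


def accuracy_percent(l_tot, l_int):
    """Assumed definition: L_int as a percentage of L_tot, to one decimal."""
    return round(100.0 * l_int / l_tot, 1)


def check_rows(rows=ROWS):
    """Return {dictionary: (recomputed accuracy, reported accuracy)}."""
    return {
        name: (accuracy_percent(l_tot, l_int), reported)
        for name, (l_tot, l_int, reported) in rows.items()
    }


if __name__ == "__main__":
    for name, (computed, reported) in check_rows().items():
        status = "ok" if abs(computed - reported) < 0.05 else "differs"
        print(f"{name}: computed {computed}, reported {reported} ({status})")
```

Under this assumption the recomputed values agree with the reported Accuracy column for all eight dictionaries, which suggests the table rows are internally consistent.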

ISSN: 1471-2105