
Table 2 Overall text-mining performance with residue filtering based on spotting keyword(s) from specialized dictionaries in the residue-containing sentences

From: Natural language processing in text mining for structural modeling of protein complexes

| Dictionary and reference | Number of PPI keywords | L_tot^a | L_int^b | Coverage (%)^c | Success (%)^d | Accuracy (%)^e | ΔN(0)^f | ΔN(1)^f |
|---|---|---|---|---|---|---|---|---|
| Blaschke et al. [20] | 43 | 265 | 205 | 45.8 | 35.4 | 77.4 | 0 | −8 |
| Chowdhary et al. [58] | 191 | 284 | 233 | 49.1 | 40.2 | 82.0 | −7 | −4 |
| Hakenberg et al. [59] | 234 | 297 | 232 | 51.3 | 40.1 | 78.1 | 6 | −7 |
| Plake et al. [60] | 73 | 291 | 230 | 50.3 | 39.7 | 79.0 | 1 | −1 |
| Raja et al. [23] | 412 | 302 | 247 | 52.2 | 42.7 | 81.8 | 0 | −5 |
| Schuhmann et al. [57] | 64 | 212 | 152 | 36.6 | 26.3 | 71.7 | −1 | 5 |
| Temkin et al. [21] | 174 | 283 | 223 | 48.9 | 38.5 | 78.8 | 0 | −9 |
| Own dictionary | 16 | 224 | 169 | 38.7 | 29.2 | 75.4 | −6 | 8 |

  1. For definitions of columns 3–9, see footnotes to Table 1. The full content of the in-house dictionary is given in Table 3, but only its PPI+ive part was used to calculate the data in this table.
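
The column definitions are given in the footnotes to Table 1 and are not repeated here. As a reading aid only, the sketch below checks an assumed relation, Accuracy = 100 × L_int / L_tot, against the values in this table. The relation is an inference from the numbers, not a definition taken from the source, and the names `ROWS`, `accuracy_percent`, and `max_deviation` are hypothetical.

```python
# Rows copied from Table 2: (dictionary, L_tot, L_int, reported accuracy in %).
ROWS = [
    ("Blaschke et al. [20]", 265, 205, 77.4),
    ("Chowdhary et al. [58]", 284, 233, 82.0),
    ("Hakenberg et al. [59]", 297, 232, 78.1),
    ("Plake et al. [60]", 291, 230, 79.0),
    ("Raja et al. [23]", 302, 247, 81.8),
    ("Schuhmann et al. [57]", 212, 152, 71.7),
    ("Temkin et al. [21]", 283, 223, 78.8),
    ("Own dictionary", 224, 169, 75.4),
]


def accuracy_percent(l_tot, l_int):
    """ASSUMED relation (not from the source): accuracy = 100 * L_int / L_tot."""
    return 100.0 * l_int / l_tot


def max_deviation(rows=ROWS):
    """Largest absolute gap between the assumed formula and the reported value."""
    return max(abs(accuracy_percent(t, i) - reported) for _, t, i, reported in rows)


if __name__ == "__main__":
    for name, l_tot, l_int, reported in ROWS:
        computed = accuracy_percent(l_tot, l_int)
        print(f"{name:24s} computed {computed:5.1f}  reported {reported:5.1f}")
    print(f"max deviation: {max_deviation():.3f}")
```

If the assumption holds, every computed value should agree with the reported one to within rounding at one decimal place. A larger gap would suggest either a transcription error in a row or a different definition in Table 1.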