
Table 1 Overall text-mining performance with residue filtering based on the semantic similarity of words in a residue-containing sentence to a generic concept in the WordNet vocabulary. For comparison, the results with basic residue filtering are also shown.

From: Natural language processing in text mining for structural modeling of protein complexes

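| Query | Similarity measure | L tot ᵃ | L int ᵇ | Coverage (%) ᶜ | Success (%) ᵈ | Accuracy (%) ᵉ | ΔN(0) ᶠ | ΔN(1) ᶠ |
|---|---|---|---|---|---|---|---|---|
| AND | – | 128 | 108 | 22.1 | 18.7 | 84.4 | | |
| OR | – | 328 | 273 | 56.6 | 47.2 | 83.2 | | |
| OR | Lesk [39, 40] | 319 | 267 | 55.1 | 46.1 | 83.7 | −3 | −1 |
| OR | Lin [41] | 251 | 184 | 43.4 | 31.8 | 73.3 | +8 | −8 |
| OR | Path [42, 43] | 316 | 265 | 54.6 | 45.8 | 83.9 | −3 | +1 |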

  1. ᵃ Number of complexes for which the TM protocol found at least one abstract with residues
  2. ᵇ Number of complexes with at least one interface residue found in abstracts
  3. ᶜ Ratio of L tot to the total number of complexes
  4. ᵈ Ratio of L int to the total number of complexes
  5. ᵉ Ratio of L int to L tot
  6. ᶠ Calculated by Eq. (2)
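
The ratios defined in footnotes c to e can be checked against the table with a short Python sketch. The function name `table_metrics` and the constant `N_TOTAL` are illustrative and do not come from the source. The total number of complexes is not stated in this table; the value 579 used below is an assumption, chosen because it reproduces the reported coverage and success percentages. Accuracy depends only on L tot and L int, so it does not rely on that assumption.

```python
def table_metrics(l_tot, l_int, n_total):
    """Return (coverage, success, accuracy) in percent, following footnotes c-e."""
    coverage = 100.0 * l_tot / n_total  # footnote c: L tot / total complexes
    success = 100.0 * l_int / n_total   # footnote d: L int / total complexes
    accuracy = 100.0 * l_int / l_tot    # footnote e: L int / L tot
    return coverage, success, accuracy


# ASSUMPTION: total number of complexes, not given in the table itself.
N_TOTAL = 579

# (L tot, L int) pairs copied from the table rows.
rows = {
    "AND, basic": (128, 108),
    "OR, basic": (328, 273),
    "OR, Lesk": (319, 267),
    "OR, Lin": (251, 184),
    "OR, Path": (316, 265),
}

for name, (l_tot, l_int) in rows.items():
    cov, suc, acc = table_metrics(l_tot, l_int, N_TOTAL)
    print(f"{name}: coverage={cov:.1f}% success={suc:.1f}% accuracy={acc:.1f}%")
```

The ΔN(0) and ΔN(1) columns are left out of the sketch because they depend on Eq. (2), which is not reproduced alongside this table.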