Table 3 Statistics for the various word subsets

From: Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs

| Data set | Word subset | Word number | Lp max | nb sf* |
| --- | --- | --- | --- | --- |
| Initial data set | All words | 11 294 | 4.3 (5.6) | 0.2 (0.7) |
| | Over-represented words | 1 705 | 11.3 (12.1) | 1.3 (1.4) |
| | Extreme ubiquitous words | 23 | 26 (14) | 10.33 (5.5) |
| | Extreme superfamily-specific words | 24 | 89 (47) | 1.4 (0.4) |
| Initial data set + random SCOP^a | All words | 11 294 | 2.5 (0.9) | 0.006 (0.4) |
| | Over-represented words | 45 (7) | 10.7 (11.9) | 1.9 (2.2) |
Values are averages, with standard deviations in parentheses. a: Twelve random SCOP classifications were generated by permuting the loops in the real SCOP classification.
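
The randomization mentioned in note a can be illustrated with a minimal sketch. It assumes that "permuting the loops" means shuffling the superfamily labels across loops, so that each superfamily keeps its original size; the source does not spell out the exact procedure, so this is one plausible reading and not the authors' implementation. The function name `permute_classification`, the toy loop identifiers, and the superfamily labels are all hypothetical.

```python
import random
from collections import Counter


def permute_classification(loop_to_sf, rng):
    """Return a randomized copy of a loop -> superfamily mapping.

    Assumption: labels are shuffled across loops, so the number of loops
    per superfamily is preserved while the loop/superfamily link is broken.
    """
    loops = list(loop_to_sf)
    labels = [loop_to_sf[loop] for loop in loops]
    rng.shuffle(labels)  # in-place permutation of the label list
    return dict(zip(loops, labels))


# Toy data (hypothetical identifiers, not taken from the paper).
real = {
    "loop1": "sfA",
    "loop2": "sfA",
    "loop3": "sfB",
    "loop4": "sfC",
    "loop5": "sfC",
    "loop6": "sfC",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Twelve randomized classifications, matching the count given in note a.
random_classifications = [permute_classification(real, rng) for _ in range(12)]
```

Statistics such as the "random SCOP" rows of the table would then be computed on each randomized classification and summarized as a mean and standard deviation over the twelve replicates.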