Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs

BMC Bioinformatics

Table 3 Statistics for the various word subsets

Data set	Word subset	Number of words	Lp_max	nb_sf*
Initial data set	All words	11 294	4.3 (5.6)	0.2 (0.7)
	Over-represented words	1 705	11.3 (12.1)	1.3 (1.4)
	Extreme ubiquitous words	23	26 (14)	10.33 (5.5)
	Extreme superfamily-specific words	24	89 (47)	1.4 (0.4)
Initial data set+random SCOP^a	All words	11 294	2.5 (0.9)	0.006 (0.4)
	Over-represented words	45 (7)	10.7 (11.9)	1.9 (2.2)

Values are averages, with standard deviations in parentheses. ^a: Twelve random SCOP classifications were generated by permuting the loops in the real SCOP classification.
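The randomization behind footnote a can be illustrated with a minimal sketch. The exact permutation scheme is not specified here, so the code below assumes that permuting the loops amounts to shuffling superfamily labels across loops, which keeps the size of each superfamily unchanged. The function name `permuted_classifications`, the `loop_to_superfamily` mapping, and the `seed` parameter are hypothetical names introduced for illustration only.

```python
import random


def permuted_classifications(loop_to_superfamily, n_permutations=12, seed=0):
    """Return label-shuffled copies of a loop -> superfamily mapping.

    Assumption: each random classification reassigns the existing
    superfamily labels to loops at random, so every superfamily keeps
    its original number of loops.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    loops = list(loop_to_superfamily)
    labels = [loop_to_superfamily[loop] for loop in loops]

    randomized = []
    for _ in range(n_permutations):
        shuffled = labels[:]      # copy so the real labels stay intact
        rng.shuffle(shuffled)     # permute labels across loops
        randomized.append(dict(zip(loops, shuffled)))
    return randomized
```

Word statistics computed on each of the twelve shuffled mappings could then be averaged to give a baseline of the kind reported in the "Initial data set+random SCOP" rows.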

ISSN: 1471-2105