BMC Bioinformatics

Table 1 Statistics of the sequences included in the two datasets

From: String kernels for protein sequence comparisons: improved fold recognition

CATH fold ID	CATH2833 N^a	CATH2833 L (SD)^b	CATH793 N^a	CATH793 L (SD)^b
1.10.10	381	79 (26)	36	135 (10)
2.60.40	555	110 (29)	130	140 (16)
3.20.20	251	294 (69)	2	157 (14)
3.30.70	368	182 (59)	52	141 (18)
3.40.50	1278	153 (77)	573	151 (17)

^aNumber of proteins in the fold
^bMean (standard deviation) of the lengths of the proteins in the fold
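
As a quick consistency check, the per-fold counts in column N can be summed for each dataset. The sketch below copies the N values from the table; it assumes (the table does not state this) that the numeric suffixes in the names CATH2833 and CATH793 denote the total number of sequences in each dataset. The variable names are illustrative only.

```python
# Per-fold sequence counts (column N), copied from Table 1.
counts = {
    "1.10.10": {"CATH2833": 381, "CATH793": 36},
    "2.60.40": {"CATH2833": 555, "CATH793": 130},
    "3.20.20": {"CATH2833": 251, "CATH793": 2},
    "3.30.70": {"CATH2833": 368, "CATH793": 52},
    "3.40.50": {"CATH2833": 1278, "CATH793": 573},
}

# Sum N over all folds for each dataset.
totals = {
    name: sum(row[name] for row in counts.values())
    for name in ("CATH2833", "CATH793")
}

# Assumption: the suffix in the dataset name is the expected total.
for name, total in totals.items():
    expected = int(name.removeprefix("CATH"))
    print(f"{name}: sum of N = {total}, matches name: {total == expected}")

# Share of each dataset taken up by its largest fold (class imbalance).
for name, total in totals.items():
    fold, row = max(counts.items(), key=lambda item: item[1][name])
    print(f"{name}: largest fold {fold} holds {row[name] / total:.1%}")
```

The sums come to 2833 and 793, so the counts agree with the dataset names under that assumption. The second loop shows how unevenly the folds are represented, which is worth keeping in mind when reading per-fold results.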

ISSN: 1471-2105
