Table 1 Expected length of the longest common subsequence computed for several protein datasets. Columns: DS, the protein dataset tested; NS, the number of protein sequences tested; AEL, the average expected length of the longest common subsequence; SD, the standard deviation.

From: CLUSS: Clustering of protein sequences based on a new similarity measure

| DS | NS | AEL | SD |
|---|---|---|---|
| COG database | 144298 | 3.934 | 0.363 |
| KOG database | 60748 | 4.062 | 0.458 |
| G-proteins family | 381 | 3.718 | 0.200 |
| GH2 family | 316 | 4.355 | 0.232 |
| ROK family | 730 | 4.074 | 0.324 |
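
The table reports only summary statistics and does not say how the expected length is estimated. To make the AEL and SD columns concrete, the following is a minimal empirical sketch, not the CLUSS procedure. It assumes that "common subsequence" means a contiguous run of identical residues shared by two sequences, and that the mean and standard deviation are taken over all sequence pairs. Both are assumptions, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def longest_common_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch of residues shared by a and b.

    Assumption: "common subsequence" is read as a contiguous match; the
    source excerpt does not define it.
    """
    best = 0
    prev = [0] * (len(b) + 1)  # run lengths ending at the previous residue of a
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the run that ended one residue earlier in both sequences.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def summarize(sequences):
    """Mean and population standard deviation over all sequence pairs.

    Assumption: an empirical pairwise average stands in for the table's
    "expected length"; the paper may compute it differently.
    """
    lengths = [longest_common_run(a, b) for a, b in combinations(sequences, 2)]
    return mean(lengths), pstdev(lengths)


if __name__ == "__main__":
    toy = ["MKTAYIAK", "QRTAYIG", "GGTAYW"]  # hypothetical toy sequences
    avg, sd = summarize(toy)
    print(f"mean={avg:.3f} sd={sd:.3f}")
```

The dynamic program costs time proportional to the product of the two sequence lengths for each pair, so for datasets as large as the COG or KOG rows a sampled subset of pairs would be a more practical input than all pairs.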