Table 1 Refinement of the initial dataset (A) through the application of successive filters.

From: AnEnPi: identification and annotation of analogous enzymes

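| Datasets | # Clusters: 1 | # Clusters: 2 | # Clusters: 3 | # Clusters: > 3 | Max. Clusters | % Analogous |
|----------|---------------|---------------|---------------|-----------------|---------------|-------------|
| A        | 1447          | 459           | 199           | 328             | 131           | 40.5        |
| B        | 1600          | 345           | 113           | 180             | 78            | 26.2        |
| C        | 1560          | 316           | 91            | 97              | 46            | 20.7        |
| D        | 1619          | 302           | 73            | 70              | 23            | 19.4        |
| E        | 1897          | 142           | 23            | 1               | 5             | 8.1         |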

Table 1: A, dataset obtained after clustering; B, dataset obtained after the exclusion of singlets (clusters with only one sequence); C, dataset obtained after the exclusion of ECs that are not defined up to the fourth level (incomplete ECs); D, dataset obtained after the joining of clusters where some sequences were annotated as 'subunits'; E, dataset obtained after the joining of clusters with putative intragenomic analogy. # Clusters, number of functions with, respectively, 1, 2, 3 or more than 3 clusters; Max. Clusters, the maximum number of clusters found for one specific enzymatic activity; % Analogous, fraction of enzymatic activities where analogy was detected.
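The first two filters (A to B, B to C) and the summary columns can be illustrated with a minimal Python sketch. It is not the AnEnPi implementation: the data layout (a mapping from EC number to a list of clusters, each cluster a list of sequence IDs), all function names, and the toy sequences are hypothetical. It also assumes that an EC number is "complete" when it has four numeric levels, and that an enzymatic activity counts as analogous when it retains more than one cluster, which the caption does not state explicitly.

```python
import re

# Assumption: a complete EC number has four numeric levels, e.g. "2.7.1.1";
# partial ones such as "3.4.-.-" or "2.7" are treated as incomplete.
COMPLETE_EC = re.compile(r"^\d+\.\d+\.\d+\.\d+$")


def drop_singlets(clusters_by_ec):
    """Filter A -> B: remove clusters that contain only one sequence."""
    filtered = {}
    for ec, clusters in clusters_by_ec.items():
        kept = [c for c in clusters if len(c) > 1]
        if kept:  # an activity left with no clusters disappears entirely
            filtered[ec] = kept
    return filtered


def drop_incomplete_ecs(clusters_by_ec):
    """Filter B -> C: keep only activities with a four-level EC number."""
    return {ec: cl for ec, cl in clusters_by_ec.items() if COMPLETE_EC.match(ec)}


def summarize(clusters_by_ec):
    """Count activities with 1, 2, 3 or > 3 clusters, as in the table columns."""
    bins = {"1": 0, "2": 0, "3": 0, ">3": 0}
    for clusters in clusters_by_ec.values():
        n = len(clusters)
        if n == 0:
            continue  # ignore activities without any cluster
        bins[str(n) if n <= 3 else ">3"] += 1
    total = sum(bins.values())
    # Assumption: "analogous" means more than one cluster for the activity.
    analogous = total - bins["1"]
    return {
        "bins": bins,
        "max_clusters": max((len(c) for c in clusters_by_ec.values()), default=0),
        "pct_analogous": round(100 * analogous / total, 1) if total else 0.0,
    }


# Toy input (hypothetical sequence IDs, not data from the paper).
toy = {
    "1.1.1.1": [["a", "b"], ["c", "d"], ["e"]],  # three clusters, one singlet
    "2.7.1.1": [["f", "g"]],
    "3.4.-.-": [["h", "i"], ["j", "k"]],         # incomplete EC number
    "4.2.1.1": [["l"]],                          # only a singlet
}

dataset_b = drop_singlets(toy)
dataset_c = drop_incomplete_ecs(dataset_b)
print(summarize(toy))
print(summarize(dataset_c))
```

Applying the filters in sequence mirrors how each dataset in the table is derived from the previous one; the later steps (joining 'subunit' clusters and clusters with putative intragenomic analogy) depend on annotation details not given here and are left out of the sketch.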