AnEnPi: identification and annotation of analogous enzymes

BMC Bioinformatics

Table 1. Refinement of the initial dataset (A) through the application of successive filters.

Datasets	# Clusters				Max. Clusters	% Analogous
	1	2	3	> 3
A	1447	459	199	328	131	40.5
B	1600	345	113	180	78	26.2
C	1560	316	91	97	46	20.7
D	1619	302	73	70	23	19.4
E	1897	142	23	1	5	8.1

Table 1: A, dataset obtained after clustering; B, dataset obtained after the exclusion of singlets (clusters with only one sequence); C, dataset obtained after the exclusion of EC numbers that are not defined up to the fourth level (incomplete EC numbers); D, dataset obtained after the joining of clusters where some sequences were annotated as 'subunits'; E, dataset obtained after the joining of clusters with putative intragenomic analogy. # Clusters, number of functions with, respectively, 1, 2, 3 or more than 3 clusters; Max. Clusters, the maximum number of clusters found for one specific enzymatic activity; % Analogous, percentage of enzymatic activities in which analogy was detected.
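
The first two filters in the legend (B and C) and the "# Clusters" tally can be illustrated with a minimal Python sketch. This is not the AnEnPi implementation. The data layout (a dictionary mapping each EC number to a list of clusters, each cluster a list of sequence identifiers), the function names, and the toy data are all hypothetical. The sketch also assumes that an EC number left with no clusters after singlet removal is dropped, and that a complete EC number consists of four purely numeric levels.

```python
import re
from collections import Counter

# Assumption: a complete EC number has four numeric levels ("2.7.1.1");
# incomplete ones carry "-" in undefined levels ("2.7.1.-").
COMPLETE_EC = re.compile(r"^\d+\.\d+\.\d+\.\d+$")


def drop_singlets(clusters_by_ec):
    """Filter B (assumed form): remove one-sequence clusters,
    then drop EC numbers that have no clusters left."""
    out = {}
    for ec, clusters in clusters_by_ec.items():
        kept = [c for c in clusters if len(c) > 1]
        if kept:
            out[ec] = kept
    return out


def drop_incomplete_ecs(clusters_by_ec):
    """Filter C (assumed form): keep only EC numbers defined to the fourth level."""
    return {ec: cl for ec, cl in clusters_by_ec.items() if COMPLETE_EC.match(ec)}


def tally(clusters_by_ec):
    """Count functions with 1, 2, 3 or more than 3 clusters, plus the maximum."""
    counts = Counter()
    for clusters in clusters_by_ec.values():
        n = len(clusters)
        counts[str(n) if n <= 3 else ">3"] += 1
    max_clusters = max((len(c) for c in clusters_by_ec.values()), default=0)
    return dict(counts), max_clusters


# Hypothetical toy input, not taken from the paper.
toy = {
    "1.1.1.1": [["s1", "s2"], ["s3", "s4"], ["s5"]],
    "2.7.1.-": [["s6", "s7"]],
    "3.1.3.2": [["s8"]],
    "4.2.1.1": [["s9", "s10"]],
}

dataset_b = drop_singlets(toy)
dataset_c = drop_incomplete_ecs(dataset_b)
counts, max_clusters = tally(dataset_c)
```

In this toy example, "3.1.3.2" disappears at step B because its only cluster is a singlet, and "2.7.1.-" disappears at step C because its fourth level is undefined. Filters D and E require annotation and genome information and are not sketched here.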

ISSN: 1471-2105