Table 2 Protein domain hierarchies generated automatically either from a single curated alignment or from non-aligned sequences

From: Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures

| Identifier | Protein superfamily name | # seqs | # nodes amcBPPS | LLR | Run time (min)§ |
|---|---|---|---|---|---|
| Started from curated alignments: | | | | | |
| cd00075 | Histidine kinase-like ATPase c | 87,258 | 95(62) | 518062 | 119.27 |
| cd00130 | PAS | 50,200 | 117(115) | 416375 | 103.95 |
| cd00174 | SH3 | 13,890 | 44(35) | 26971 | 3.83 |
| cd00590 | RRM | 107,488 | 63(56) | 557782 | 63.75 |
| cd01427 | HAD-like hydrolases | 41,818 | 85(73) | 324699 | 59.77 |
| cd02440 | AdoMet_MTases | 150,872 | 112(99) | 1417985 | 250.27 |
| cd04301 | NAT-SF | 43,486 | 71 | 244420 | 23.30 |
| cl02566 | SET (pfam00856) | 8,946 | 21 | 54230 | 2.58 |
| cl10444 | P-loop GTPases‡ | 198,624 | 115(109) | 3826672 | 464.67 |
| none | AAA+ ATPases‡ | 84,695 | 86(85) | 1779227 | 173.73 |
| Started from unaligned sequences: | | | | | |
| none | α,β-hydrolase fold | 50,811 | 109(104) | 752259 | 139.82 |
| none | Helicases | 86,287 | 117(111) | 1935380 | 342.10 |

  1. ‡ For these, non-CDD curated alignments were used as input.
  2. § The time (in minutes) is for Steps 2 and 3 of the algorithm only.
  3. Unaligned sequences were aligned using the multiple alignment procedures cited in Methods to generate an input alignment for the amcBPPS program.