Table 1 Data sets.

From: Predicting conserved protein motifs with Sub-HMMs

| Name | Size | Description |
| --- | --- | --- |
| Pfam proteins | 2,990,695 | Proteins in Pfam database |
| Pfam HMMs | 9,318 | Domains in Pfam database |
| DKFs | 7,435 | Pfam domains of known function |
| DUFs | 1,883 | Pfam domains of unknown function |
| Sub-HMMs | 48,535 | Sub-HMMs excised from Pfam domains |
| Sub-DKFs | 39,217 | Sub-HMMs excised from DKFs |
| Sub-DUFs | 9,318 | Sub-HMMs excised from DUFs |

  1. The table lists the sizes of the data sets used in, and generated by, this study; all are based on Pfam 22.0.
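
The sizes in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below is not part of the original study. It assumes, based only on the row descriptions, that DKFs and DUFs together make up the Pfam HMMs and that Sub-DKFs and Sub-DUFs together make up the Sub-HMMs. The dictionary `sizes` and the helper `mean_sub_hmms` are hypothetical names introduced for illustration.

```python
# Data set sizes copied from Table 1 (Pfam 22.0).
sizes = {
    "Pfam HMMs": 9_318,
    "DKFs": 7_435,
    "DUFs": 1_883,
    "Sub-HMMs": 48_535,
    "Sub-DKFs": 39_217,
    "Sub-DUFs": 9_318,
}

# Assumption: DKFs and DUFs split the Pfam HMMs into two disjoint groups,
# and the Sub-HMM sets split the same way. The table's totals agree with this.
assert sizes["DKFs"] + sizes["DUFs"] == sizes["Pfam HMMs"]
assert sizes["Sub-DKFs"] + sizes["Sub-DUFs"] == sizes["Sub-HMMs"]


def mean_sub_hmms(parent: str, children: str) -> float:
    """Average number of Sub-HMMs excised per parent domain (hypothetical helper)."""
    return sizes[children] / sizes[parent]


for parent, children in [
    ("Pfam HMMs", "Sub-HMMs"),
    ("DKFs", "Sub-DKFs"),
    ("DUFs", "Sub-DUFs"),
]:
    print(f"{children} per entry in {parent}: {mean_sub_hmms(parent, children):.2f}")
```

Under these assumptions, each Pfam domain yields roughly five Sub-HMMs on average, with similar ratios for domains of known and unknown function.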