Flow diagram of the method. First, filter a primary database with a profile or motif database to obtain the subset of sequences that will comprise the protein superfamily database. Next, partition the protein superfamily database into subfamilies according to the chosen subfamily criterion. Then, build an MSA for each subfamily and build an HMM for every window of width w in the MSA. Finally, tabulate matches with an e-value below 100 to identify subfamily signatures for the HMM database of the superfamily, and tabulate matches with an e-value below 0.1 to identify potentially significant functional regions in the subfamily.
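The windowing and tabulation steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `sliding_windows` and `tabulate_hits`, the representation of an MSA as a list of equal-length strings, and the representation of matches as `(window_start, e_value)` pairs are all assumptions made for illustration. Building and searching the HMMs is left to an external tool and is not shown.

```python
from typing import Dict, List, Tuple


def sliding_windows(msa: List[str], w: int) -> List[Tuple[int, List[str]]]:
    """Return (start_column, sub-alignment) for every width-w window of an MSA.

    The MSA is assumed to be a list of aligned rows of equal length.
    """
    if w <= 0:
        raise ValueError("window width w must be positive")
    if not msa:
        return []
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all MSA rows must have the same length")
    # One window per start column; there are length - w + 1 of them.
    return [(s, [row[s:s + w] for row in msa]) for s in range(length - w + 1)]


def tabulate_hits(
    hits: List[Tuple[int, float]],
    signature_cutoff: float = 100.0,   # e-value cutoff from the caption
    functional_cutoff: float = 0.1,    # e-value cutoff from the caption
) -> Dict[str, List[Tuple[int, float]]]:
    """Split hypothetical (window_start, e_value) matches by the two cutoffs."""
    return {
        "signature_candidates": [h for h in hits if h[1] < signature_cutoff],
        "functional_candidates": [h for h in hits if h[1] < functional_cutoff],
    }


# Toy alignment of three sequences and six columns, with w = 4.
toy_msa = ["ACDEFG", "ACDQFG", "AC-EFG"]
toy_windows = sliding_windows(toy_msa, 4)

# Made-up e-values, one per window, standing in for real HMM search results.
toy_hits = [(0, 50.0), (1, 0.01), (2, 500.0)]
toy_table = tabulate_hits(toy_hits)
```

Each sub-alignment returned by `sliding_windows` would be the input for one window HMM. The two lists returned by `tabulate_hits` correspond to the two tabulations named in the caption; any match below 0.1 also appears in the below-100 list.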