Schematic overview of the clustering procedures. We start with a single-linkage tree constructed from pairwise distances. Each leaf in the tree corresponds to a protein sequence. Superfamilies are determined from the internal structure of the tree. For each superfamily, a distinct superfamily distance graph is built. This weighted graph is then cut at weak connections into subclusters.
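
The two-stage idea in this overview can be illustrated with a minimal sketch on toy data. It rests on several assumptions that are not taken from the text: superfamilies are obtained by cutting the single-linkage tree at a fixed height (equivalent to taking connected components of the graph of all pairs at or below that distance), a "weak connection" is an edge whose distance exceeds a second, tighter cutoff, and the names `components`, `cluster`, `family_cutoff`, and `weak_cutoff` are hypothetical. The actual criteria used to read superfamilies off the tree and to cut the graph may differ.

```python
def components(nodes, edges):
    """Connected components via union-find; edges is an iterable of (a, b)."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(sorted(g) for g in groups.values())


def cluster(dist, family_cutoff, weak_cutoff):
    """Two-stage clustering sketch (both cutoffs are assumed parameters).

    dist maps unordered sequence pairs (a, b) to a pairwise distance.
    """
    nodes = sorted({n for pair in dist for n in pair})

    # Stage 1: cutting a single-linkage tree at height h yields the connected
    # components of the graph containing every pair with distance <= h.
    families = components(
        nodes, [p for p, d in dist.items() if d <= family_cutoff]
    )

    result = {}
    for fam in families:
        members = set(fam)
        # Stage 2: weighted distance graph restricted to this superfamily.
        graph = {
            p: d for p, d in dist.items()
            if p[0] in members and p[1] in members
        }
        # Drop weak connections (large distances); the pieces that remain
        # connected are the subclusters.
        strong = [p for p, d in graph.items() if d <= weak_cutoff]
        result[tuple(fam)] = components(fam, strong)
    return result


# Toy distances between five hypothetical sequences.
dist = {("A", "B"): 0.1, ("B", "C"): 0.5, ("A", "C"): 0.6, ("D", "E"): 0.2}
for a in "ABC":
    for b in "DE":
        dist[(a, b)] = 0.9

result = cluster(dist, family_cutoff=0.7, weak_cutoff=0.3)
for family, subclusters in result.items():
    print(family, "->", subclusters)
```

In this toy case the sequences A, B, and C form one superfamily and D and E another; within the first, C is only weakly connected and is split off as its own subcluster.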