Fig. 2 | BMC Bioinformatics

Fig. 2

From: UQlust: combining profile hashing with linear-time ranking for efficient clustering and analysis of big macromolecular data

Hierarchical clustering of 98,000 protein chains from the Protein Data Bank, using the fragment-based FragBag profile and the uQlust:Tree algorithm. The leaves of the tree are the initial micro-clusters of structures deemed closely related, i.e. those with identical hash keys; these include large “micro-clusters” of nearly identical structures such as those of globins or lysozymes. CATH class-level assignments are shown as colored bars: red for majority alpha clusters, blue for majority alpha/beta (or alpha + beta) clusters, and yellow for majority beta clusters. The uQlust graphical user interface enables interactive exploration of dendrograms generated in this way, as well as other representations of large data sets.
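
The idea of forming micro-clusters from identical hash keys can be illustrated with a minimal sketch. This is not the uQlust implementation: the function names (`hash_key`, `micro_clusters`), the toy four-element profiles, and the choice of binarizing each profile position against its median are all assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import median


def hash_key(profile, thresholds):
    """Binarize a profile against per-position thresholds (hypothetical scheme)."""
    return tuple(int(value > t) for value, t in zip(profile, thresholds))


def micro_clusters(profiles):
    """Group structure IDs whose profiles map to the same hash key."""
    ids = list(profiles)
    # Assumed thresholds: the median of each profile position across all structures.
    columns = zip(*(profiles[sid] for sid in ids))
    thresholds = [median(col) for col in columns]

    groups = defaultdict(list)
    for sid in ids:
        groups[hash_key(profiles[sid], thresholds)].append(sid)
    return dict(groups)


# Toy fragment-count profiles (made-up numbers, not real FragBag data).
profiles = {
    "globin_a": [9, 1, 0, 4],
    "globin_b": [8, 1, 0, 5],
    "lysozyme_a": [1, 7, 3, 0],
    "lysozyme_b": [0, 8, 3, 0],
}
clusters = micro_clusters(profiles)
for key, members in clusters.items():
    print(key, members)
```

Because grouping by key takes a single pass over the structures, each resulting group can then serve as one leaf when a tree is built over the much smaller set of micro-clusters.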
