Figure 1 | BMC Bioinformatics

Figure 1

From: Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data

Sequence reads organized in a graph structure. Single reads are represented by vertices (nodes) and their sequence overlaps by edges. (A) Examples of different types of clusters that can be found in the graph structure. Graph parts in the shaded areas represent connected components of the graph. Nodes with the same color correspond to clusters (communities) identified using a hierarchical agglomeration algorithm. In some cases, connected components are identical to the clusters identified by the hierarchical agglomeration method (green nodes in gray shading and turquoise nodes in pink shading). The magenta node represents a singlet, a read with no similarity to other sequences. (B-D) Principles of sequence read clustering and cluster analysis. (B) An example of a graph built from reads sampled from the largest connected component of P. sativum. Communities of reads were identified by the hierarchical agglomeration algorithm and labeled to distinguish different classes of repeats. (C) Schematic representation of the resulting clusters (colored circles), showing the number of reads (v) and the number of edges (e) within and between the clusters. (D) Graph layouts calculated using the Fruchterman and Reingold algorithm for three clusters differing in structure.
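
The graph concepts named in the caption (connected components, singlets, and the per-cluster counts v and e of panel C) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the read IDs, the overlap list, the color labels, and the function names `connected_components` and `cluster_summary` are all hypothetical, and the communities are assigned by hand rather than computed by the hierarchical agglomeration algorithm.

```python
from collections import defaultdict, deque


def connected_components(reads, overlaps):
    """Return a list of sets of read IDs, one set per connected component."""
    adjacency = defaultdict(set)
    for a, b in overlaps:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for read in reads:
        if read in seen:
            continue
        # Breadth-first search collects every read reachable from this one.
        component = set()
        queue = deque([read])
        seen.add(read)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def cluster_summary(clusters, overlaps):
    """Count reads (v) per cluster and edges (e) within and between clusters."""
    membership = {
        read: name for name, members in clusters.items() for read in members
    }
    vertices = {name: len(members) for name, members in clusters.items()}
    edges = defaultdict(int)
    for a, b in overlaps:
        # Sort the pair so (x, y) and (y, x) count as the same cluster pair.
        key = tuple(sorted((membership[a], membership[b])))
        edges[key] += 1
    return vertices, dict(edges)


# Hypothetical toy data: six reads and four pairwise sequence overlaps.
reads = ["r1", "r2", "r3", "r4", "r5", "r6"]
overlaps = [("r1", "r2"), ("r2", "r3"), ("r1", "r3"), ("r4", "r5")]

components = connected_components(reads, overlaps)
singlets = [c for c in components if len(c) == 1]

# Hand-assigned communities (an assumption; not the output of any algorithm).
clusters = {
    "green": {"r1", "r2"},
    "blue": {"r3"},
    "red": {"r4", "r5"},
    "magenta": {"r6"},
}
vertices, edges = cluster_summary(clusters, overlaps)
```

In this toy graph, r6 has no overlaps and therefore forms a singlet, while the "green" and "blue" communities share one connected component and are joined by between-cluster edges, analogous to a component that is split into several communities in panel A.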
