Figure 1 | BMC Bioinformatics

Figure 1

From: Rebooting the human mitochondrial phylogeny: an automated and scalable methodology with expert knowledge


Individual sequence features. These descriptors operate directly on sequences. Because they are simple, they are the first tests applied to any prospective member of a dataset. (a) The sequence length histogram locates unusually short or long sequences; it typically classifies correct genomes as belonging to the strict or flexible sets and also detects outliers that cannot be straightforwardly ascribed to either group. Blue dots mark accepted strict sequences; red dots, outlier strict sequences; and green dots, flexible but not strict sequences. (b) The ambiguity covering histogram helps determine acceptable ambiguity thresholds and provides a simple, approximate measure of aggregate quality. (The green dot shows the base covering of fully defined sequences, which have zero ambiguous positions.)
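
As a rough illustration of the two descriptors in this caption, the following minimal Python sketch computes a sequence length histogram and a histogram of ambiguous-position counts. It is not the authors' implementation: the function names, the bin width, the toy sequences, and the choice to treat any symbol other than A, C, G, or T as ambiguous are all assumptions made for this example.

```python
from collections import Counter

# Assumption: any symbol outside A, C, G, T counts as ambiguous (e.g. N, R, Y).
UNAMBIGUOUS = set("ACGT")


def ambiguous_count(seq):
    """Count positions in seq that are not one of A, C, G, T."""
    return sum(1 for base in seq.upper() if base not in UNAMBIGUOUS)


def length_histogram(seqs, bin_width=10):
    """Map each bin's lower edge to the number of sequences whose length falls in it."""
    return Counter((len(s) // bin_width) * bin_width for s in seqs)


def ambiguity_histogram(seqs):
    """Map each ambiguous-position count to the number of sequences with that count."""
    return Counter(ambiguous_count(s) for s in seqs)


# Toy sequences invented for illustration; real human mitochondrial
# genomes are about 16.6 kb long.
toy = ["ACGTACGTAC", "ACGTNCGTAC", "ACGTRYGTACGT", "ACG"]

print(sorted(length_histogram(toy, bin_width=5).items()))
print(sorted(ambiguity_histogram(toy).items()))
```

In this sketch, sequences that land in sparsely populated length bins would be candidates for the outlier check in panel (a), and the bin at zero in the ambiguity histogram corresponds to fully defined sequences as in panel (b). Any acceptance threshold would still have to be chosen by inspecting the histograms.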
