Fig. 3 | BMC Bioinformatics

Fig. 3

From: Profile hidden Markov model sequence analysis can help remove putative pseudogenes from DNA barcoding and metabarcoding datasets

Reducing GC content and introducing frameshifts reduce ORF lengths and bit scores. Each column shows the results from a particular perturbed community dataset: a controlled community with nuMTs absent, a community with nuMTs that have a reduced GC content, and a community with nuMTs in which we introduced frameshift mutations. The top panel shows the distribution of lengths of the longest retained open reading frame (ORF) in each sequence. The solid vertical line indicates the length of a typical COI barcode, 658 bp. The two vertical dashed lines show the boundaries for identifying ORFs with outlier lengths. The bottom panel shows the HMMER3 sequence bit scores. The vertical dashed line shows the cutoff for identifying sequences with low outlier scores.
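
The caption does not say how the dashed-line cutoffs were computed. As an illustration only, the sketch below assumes a conventional interquartile-range rule (below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR) for ORF lengths, and the lower bound alone for bit scores. The function names, the multiplier `k`, and the percentile method are hypothetical choices, not taken from the article.

```python
from typing import List, Sequence, Tuple


def percentile(sorted_vals: Sequence[float], p: float) -> float:
    """Linear-interpolation percentile (p in [0, 100]) of a pre-sorted sequence."""
    if not sorted_vals:
        raise ValueError("need at least one value")
    rank = (len(sorted_vals) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (rank - lo)


def iqr_bounds(values: Sequence[float], k: float = 1.5) -> Tuple[float, float]:
    """Return (lower, upper) outlier bounds; the IQR rule is an assumption here."""
    s = sorted(values)
    q1, q3 = percentile(s, 25), percentile(s, 75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_putative_pseudogenes(
    orf_lengths: Sequence[float],
    bit_scores: Sequence[float],
    k: float = 1.5,
) -> List[bool]:
    """Flag sequences whose ORF length is an outlier in either direction
    or whose bit score is a low outlier (hypothetical filtering rule)."""
    len_lo, len_hi = iqr_bounds(orf_lengths, k)
    score_lo, _ = iqr_bounds(bit_scores, k)
    return [
        length < len_lo or length > len_hi or score < score_lo
        for length, score in zip(orf_lengths, bit_scores)
    ]
```

In this sketch the two length bounds play the role of the two dashed lines in the top panel, and the single score bound plays the role of the dashed line in the bottom panel.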
