Fig. 2 | BMC Bioinformatics

From: Profile hidden Markov model sequence analysis can help remove putative pseudogenes from DNA barcoding and metabarcoding datasets

Fig. 2

NuMTs tend to have lower GC content, shorter open reading frames, and smaller bit scores than gene sequences. Results are based on the artificial DNA barcoding dataset described in Table 1. The top panel shows GC content (%) in gene and nuMT sequences. The middle panel shows the length distribution of the longest retained open reading frame (ORF); the solid vertical line marks the length of a typical COI barcode (658 bp), and the two vertical dashed lines show the boundaries for identifying ORFs with outlier lengths. The bottom panel shows the HMMER3 sequence bit score distribution; the vertical dashed line shows the cutoff for identifying outlier scores.
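
The quantities plotted in this figure can be illustrated with a minimal Python sketch. It computes GC content for a sequence and derives lower and upper outlier boundaries for a set of values such as ORF lengths. The caption does not state how the boundaries were chosen, so the 1.5 × interquartile range rule used here is an assumption, and the function names `gc_content` and `iqr_bounds` are hypothetical. For bit scores, where the figure shows a single cutoff, only the lower boundary would presumably apply.

```python
from statistics import quantiles


def gc_content(seq: str) -> float:
    """Return GC content as a percentage of the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) outlier boundaries.

    ASSUMPTION: boundaries are the quartiles extended by k * IQR;
    the figure caption does not specify the rule actually used.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Made-up ORF lengths (bp) for illustration only, not data from the study.
orf_lengths = [650, 655, 658, 658, 660, 665]
lower, upper = iqr_bounds(orf_lengths)
short_orfs = [n for n in orf_lengths + [300] if n < lower]
```

Under this rule, a 300 bp ORF falls below the lower boundary and would be flagged as a putative pseudogene candidate, while a 658 bp ORF lies inside the boundaries.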
