Table 2 Sensitivity and specificity for two pseudogene filtering methods

From: Profile hidden Markov model sequence analysis can help remove putative pseudogenes from DNA barcoding and metabarcoding datasets

| Experiment | Dataset | Type of mutations | Sensitivity (%): ORFfinder | Sensitivity (%): ORFfinder + profile HMM analysis | Specificity (%): ORFfinder | Specificity (%): ORFfinder + profile HMM analysis |
|---|---|---|---|---|---|---|
| Artificial DNA barcoding dataset. COI genes and nuMTs from 10 species | Full length COI barcode and nuMT sequences | N/A | 70 | 73 | 90 | 90 |
| Perturbed community dataset | Full length COI barcode and simulated nuMTs | GC → AT | 31 | 27 | 99 | ~100 |
| Perturbed community dataset | Full length COI barcode and simulated nuMTs | Frameshift | 88 | 94 | ~100 | ~100 |
| Perturbed community dataset | Short COI barcode and simulated nuMTs | GC → AT | 17**–50* | 6**–15* | 99 | ~100 |
| Perturbed community dataset | Short COI barcode and simulated nuMTs | Frameshift | 42**–58* | 61**–87* | 99 | 99*–~100** |
| Perturbed community dataset | Full length COI barcode and twice as many nuMTs | GC → AT | 17 | 0 | 99 | ~100 |
| Perturbed community dataset | Full length COI barcode and twice as many nuMTs | Frameshift | 0 | 0 | ~100 | ~100 |
| Perturbed community dataset | Full length COI barcode and half as many nuMTs | GC → AT | 39 | 36 | 95 | 96 |
| Perturbed community dataset | Full length COI barcode and half as many nuMTs | Frameshift | 95 | 98 | 96 | 99 |

Sensitivity refers to the true positive rate, our ability to correctly identify known or simulated nuMTs. Specificity refers to the true negative rate, our ability to correctly identify COI genes. * 5′ fragment. ** 3′ fragment.
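
The two rates defined in the table note can be sketched in a few lines of Python. This is a minimal illustration only, assuming nuMTs are the positive class and COI genes the negative class, as the note describes. The function names and the counts in the usage lines are hypothetical and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """True positive rate as a percentage: TP / (TP + FN).

    true_pos: nuMTs correctly flagged as pseudogenes.
    false_neg: nuMTs that passed the filter undetected.
    """
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no nuMT (positive) sequences to evaluate")
    return 100.0 * true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """True negative rate as a percentage: TN / (TN + FP).

    true_neg: COI genes correctly retained.
    false_pos: COI genes wrongly flagged as pseudogenes.
    """
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no COI (negative) sequences to evaluate")
    return 100.0 * true_neg / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(sensitivity(true_pos=70, false_neg=30))  # 70.0
print(specificity(true_neg=90, false_pos=10))  # 90.0
```

Under these definitions, a sensitivity of 0 means that none of the nuMTs were flagged, while a specificity near 100 means that almost no genuine COI sequences were removed.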