Table 5 Classification Performance with Reference Genomes of genus Brassica

From: Machine learning on alignment features for parent-of-origin classification of simulated hybrid RNA-seq

| Brassica DNA | A: HiSat2 | B: Hi_AS | C: Hi_RF | D: STAR | E: St_AS | F: St_RF | G: bwa |
|---|---|---|---|---|---|---|---|
| Accuracy | 93.0% | 93.1% | 95.1% | 94.5% | 92.7% | 94.3% | 94.8% |
| Sensitivity | 95.8% | 95.7% | 94.7% | 97.1% | 94.8% | 94.0% | 97.3% |
| Specificity | 90.3% | 90.6% | 95.6% | 91.9% | 90.5% | 94.6% | 92.3% |
| Precision | 90.8% | 91.0% | 95.6% | 92.3% | 90.9% | 94.6% | 92.6% |
| F1-score | 93.2% | 93.3% | 95.1% | 94.6% | 92.8% | 94.6% | 94.9% |
| MCC | 0.862 | 0.863 | 0.903 | 0.890 | 0.854 | 0.886 | 0.897 |
| AUPRC | – | – | 99.0% | – | – | 98.5% | – |
| AUROC | – | – | 98.9% | – | – | 98.5% | – |
| Pos Pref | 52.8% | 52.6% | 49.5% | 52.6% | 52.2% | 49.7% | 52.5% |
| Ties | – | 3.4% | – | – | 4.8% | – | – |

  1. Performance metrics for genome-based parent-of-origin classification in Brassica. Whether used with HiSat2 or STAR, the random forest improved the accuracy, F1, and MCC. For directional statistics, the species B. rapa and B. oleracea were considered the negative and positive classes, respectively. (A, B, C) Parent chosen by the HiSat2 aligner, by comparing HiSat2 alignment scores, or by the random forest using HiSat2 alignment features, respectively. (D, E, F) As for columns A, B, and C, but using the STAR aligner configured for splicing. (G) Parent chosen by bwa.
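
The metrics in this table are linked through the confusion matrix, so several of them can be sanity-checked from sensitivity and specificity alone. The following is a minimal sketch, not the authors' code. It assumes that the positive and negative classes are equally represented in the test set (the table does not state this) and that "Pos Pref" denotes the fraction of reads predicted as the positive class. The function name `metrics_from_rates` and its parameter `pos_fraction` are hypothetical, introduced only for illustration.

```python
from math import sqrt


def metrics_from_rates(sensitivity, specificity, pos_fraction=0.5):
    """Derive summary metrics from sensitivity, specificity, and class balance.

    pos_fraction is the share of truly positive examples; 0.5 is an
    ASSUMPTION (balanced classes), not a value given in the table.
    """
    # Confusion-matrix cells expressed as fractions of all examples.
    tp = pos_fraction * sensitivity
    fn = pos_fraction * (1 - sensitivity)
    tn = (1 - pos_fraction) * specificity
    fp = (1 - pos_fraction) * (1 - specificity)

    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": tp + tn,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
        # Fraction of examples predicted positive (assumed meaning of "Pos Pref").
        "predicted_positive": tp + fp,
    }


# Column A (HiSat2): sensitivity 95.8%, specificity 90.3%.
column_a = metrics_from_rates(0.958, 0.903)
for name, value in column_a.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, the derived values for column A come out close to the reported accuracy, precision, F1-score, MCC, and Pos Pref. Small differences are expected because the inputs are already rounded to one decimal place.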