Table 2 Classification Performance with Reference Transcripts of genus Arabidopsis

From: Machine learning on alignment features for parent-of-origin classification of simulated hybrid RNA-seq

| Arabidopsis RNA | A (Bowtie2) | B (Bo_AS) | C (Bo_RF) | D (STAR) | E (St_AS) | F (St_RF) | G (Salmon) | H (bwa) |
|---|---|---|---|---|---|---|---|---|
| Accuracy | 72.7% | 81.0% | 95.0% | 73.0% | 80.9% | 88.5% | 70.8% | 75.2% |
| Sensitivity | 56.9% | 70.7% | 90.6% | 56.0% | 69.9% | 87.2% | 48.1% | 60.9% |
| Specificity | 88.5% | 91.3% | 99.5% | 90.0% | 91.9% | 89.9% | 93.5% | 89.0% |
| Precision | 83.2% | 89.1% | 99.4% | 84.9% | 89.6% | 89.6% | 88.0% | 84.7% |
| F1-score | 67.5% | 78.8% | 94.8% | 67.5% | 78.5% | 88.3% | 62.2% | 70.9% |
| MCC | 0.478 | 0.634 | 0.904 | 0.489 | 0.633 | 0.770 | 0.466 | 0.521 |
| AUPRC | – | – | 99.5% | – | – | 96.5% | – | – |
| AUROC | – | – | 99.4% | – | – | 96.2% | – | – |
| Pos Pref | 34.2% | 39.7% | 45.6% | 33.0% | 39.0% | 48.6% | 27.3% | 36.0% |
| Ties | – | 14.0% | – | – | 14.7% | – | – | – |

Performance metrics for parent-of-origin classification in Arabidopsis. In all eight approaches, RNA-seq read pairs were assigned to one of two reference transcriptomes. With either Bowtie2 or STAR, the random forest method showed superior performance. For directional statistics such as sensitivity, A. lyrata was designated the negative class and A. halleri the positive class. (A) Parent chosen by the Bowtie2 aligner. (B) Parent chosen by comparing Bowtie2 alignment scores. (C) Parent chosen by the random forest classifier using Bowtie2 alignment features. (D, E, F) As columns A, B, C, but using the STAR aligner, configured to avoid splicing. (G, H) Parent chosen by Salmon or bwa, respectively.