Arabidopsis DNA | A HiSat2 | B Hi_AS | C Hi_RF | D STAR | E St_AS | F St_RF | G bwa |
---|---|---|---|---|---|---|---|
Accuracy | 73.3% | 82.8% | 94.5% | 72.9% | 81.4% | 88.6% | 76.4% |
Sensitivity | 50.1% | 72.1% | 91.2% | 49.6% | 69.9% | 87.6% | 56.1% |
Specificity | 96.4% | 93.5% | 98.3% | 96.2% | 93.0% | 89.7% | 96.7% |
Precision | 93.4% | 91.7% | 98.1% | 92.9% | 90.9% | 89.5% | 94.4% |
F1-score | 65.2% | 80.7% | 94.5% | 64.6% | 79.0% | 88.5% | 70.4% |
MCC | 0.525 | 0.671 | 0.897 | 0.518 | 0.646 | 0.773 | 0.578 |
AUPRC | – | – | 99.3% | – | – | 96.5% | – |
AUROC | – | – | 99.2% | – | – | 96.2% | – |
Pos Pref | 26.8% | 39.3% | 46.5% | 26.7% | 38.4% | 48.9% | 29.7% |
Ties | – | 13.3% | – | – | 14.7% | – | – |
- Performance metrics for parent-of-origin classification in Arabidopsis. In every approach, RNA-seq read pairs were assigned to one of two reference genomes. Whether used with HiSat2 or STAR, the random forest gave the highest accuracy, F1-score, and MCC. For directional statistics such as sensitivity, A. lyrata was designated the negative class and A. halleri the positive class. (A) Parent chosen by the HiSat2 aligner. (B) Parent chosen by comparing HiSat2 alignment scores. (C) Parent chosen by the random forest classifier using HiSat2 alignment features. (D, E, F) As in columns A, B, and C, but using the STAR aligner, configured for splicing. (G) Parent chosen by bwa
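The threshold-based metrics in the table can all be derived from a two-class confusion matrix. The following is a minimal Python sketch of those standard formulas, not the authors' code. The function name `classification_metrics` and the example counts are hypothetical. "Pos Pref" is not defined in the source; it is assumed here to mean the fraction of read pairs predicted as the positive class. AUPRC and AUROC are omitted because they require per-read scores rather than counts, and the handling of ties is not modeled.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Compute two-class metrics from confusion-matrix counts.

    tp/fn: positive-class read pairs called correctly/incorrectly.
    tn/fp: negative-class read pairs called correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)     # fraction of positive calls that are right
    accuracy = (tp + tn) / total
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient; defined as 0 if any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Assumed meaning of "Pos Pref": share of all calls that are positive.
    pos_pref = (tp + fp) / total
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
        "pos_pref": pos_pref,
    }


# Hypothetical counts for illustration only; these are not from the table.
metrics = classification_metrics(tp=90, fn=10, tn=95, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this reading, a Pos Pref below 50% on a balanced test set indicates a classifier that leans toward the negative class, which is consistent with the low sensitivity and high specificity of the aligner-only columns.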