Fig. 2 | BMC Bioinformatics

Fig. 2

From: DeltaMSI: artificial intelligence-based modeling of microsatellite instability scoring on next-generation sequencing data

Diagnostic power of 29 individual microsatellite marker loci in the various models to predict dMMR status. Of 36 marker loci (Y-axis), 29 showed acceptable coverage within and between samples and were used for machine learning by the indicated models (X-axis). Marker loci are denoted by gene name (capitals) and source (Salipante/Idylla® assay/Hause/Bethesda panel), with full genomic coordinates in Additional file 2: Table S1. Models were trained against the sample-level outcome by IHC (dMMR/pMMR), assuming that all loci of a sample classified by IHC as pMMR are stable and all loci of a sample classified as dMMR are unstable. Models included isolation forest, local outlier factor, one-class support vector machine (SVM), logistic regression, random forest, naive Bayes and support vector classifier (SVC). The heat map shows representative AUC at locus level in the validation set to predict dMMR status at sample level (low to high AUC from red to blue). 28 of 29 loci (with the exception of MSI_PBMR1_Salipante) achieved acceptable AUC and were retained in the final DeltaMSI script. Logistic regression and SVC consistently achieved the highest AUC and were integrated into the combined voting model of DeltaMSI.
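The locus-level evaluation described in the caption can be illustrated with a minimal sketch. It assumes each locus already has one instability score per sample from some model, and it propagates the sample-level IHC label (dMMR = 1, pMMR = 0) to every locus of that sample, as the caption states. The function names, the toy locus names and scores, and the 0.7 retention cutoff are all hypothetical; the caption does not state what AUC counted as acceptable, and this is not the DeltaMSI implementation.

```python
from typing import Dict, List


def auc(scores: List[float], labels: List[int]) -> float:
    """Rank-based AUC: the probability that a randomly chosen positive
    sample scores higher than a randomly chosen negative one (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def per_locus_auc(locus_scores: Dict[str, List[float]],
                  sample_is_dmmr: List[int]) -> Dict[str, float]:
    """Score each locus against the sample-level IHC label.

    Every locus inherits the label of its sample, so the same label
    vector is reused for all loci.
    """
    return {locus: auc(scores, sample_is_dmmr)
            for locus, scores in locus_scores.items()}


# Hypothetical toy data: six samples, three dMMR (1) and three pMMR (0).
sample_is_dmmr = [1, 1, 1, 0, 0, 0]
locus_scores = {
    "LOCUS_A": [0.9, 0.8, 0.7, 0.2, 0.1, 0.3],  # separates the groups cleanly
    "LOCUS_B": [0.5, 0.2, 0.6, 0.4, 0.5, 0.3],  # overlaps heavily
}

aucs = per_locus_auc(locus_scores, sample_is_dmmr)

# Hypothetical cutoff; the source does not give the threshold it used.
ACCEPTABLE_AUC = 0.7
retained = [locus for locus, value in aucs.items() if value >= ACCEPTABLE_AUC]
```

With these toy values, LOCUS_A separates dMMR from pMMR samples perfectly and would be retained, while LOCUS_B falls below the assumed cutoff and would be dropped, mirroring how a single poorly performing locus was excluded from the final set.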
