Figure 2 | BMC Bioinformatics

Figure 2

From: An integrative variant analysis suite for whole exome next-generation sequencing data

Theoretical Performance of the Regression Models. (a) The Atlas-SNP2 model is evaluated on the subset of the training data with a minimum total depth of 10 base-pairs. (b) The Atlas-Indel2 model is evaluated on the subset of the training data with at least 2 variant reads (a default heuristic filter). To estimate the effectiveness of the regression models and to test for overfitting, a series of cross-validation tests was performed: half of the training data was randomly sampled to train the model, and the model was then evaluated on the remaining data. This process was repeated 100 times, with each result plotted as a gray line. The average of these lines is plotted as a bold, color-coded line, where the color indicates the p cutoff that yields the performance at that point. The suite's default cutoff of 0.5 is marked. The actual model, evaluated on the full set, is plotted as a black line but is mostly covered by the average line.
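
The repeated half-split procedure described in the caption can be sketched roughly as follows. This is a minimal illustration and not the Atlas suite's code: `half_split_cv` and `train_toy` are hypothetical names, the one-feature logistic scorer is only a stand-in for the actual regression models, and sensitivity and specificity at a single cutoff are assumed here as the performance measure, since the caption does not name the plotted axes.

```python
import math
import random


def train_toy(train_set):
    """Hypothetical stand-in for model fitting (NOT the Atlas regression).

    Places a logistic curve at the midpoint between the mean feature value
    of true variants (label 1) and false variants (label 0).
    """
    pos = [x for x, y in train_set if y == 1]
    neg = [x for x, y in train_set if y == 0]
    midpoint = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1.0 / (1.0 + math.exp(-(x - midpoint)))


def half_split_cv(examples, train, n_repeats=100, cutoff=0.5, seed=0):
    """Train on a random half, evaluate on the other half, n_repeats times.

    examples: list of (feature, label) pairs with label in {0, 1}
    train:    callable mapping a training list to a function feature -> p
    Returns one (sensitivity, specificity) pair per repeat.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_repeats):
        shuffled = list(examples)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train_set, test_set = shuffled[:half], shuffled[half:]
        predict = train(train_set)

        tp = fp = tn = fn = 0
        for feature, label in test_set:
            called = predict(feature) >= cutoff  # variant call at this p cutoff
            if called and label == 1:
                tp += 1
            elif called and label == 0:
                fp += 1
            elif label == 1:
                fn += 1
            else:
                tn += 1

        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        results.append((sensitivity, specificity))
    return results


def average(results):
    """Mean (sensitivity, specificity) across repeats, i.e. the bold line."""
    n = len(results)
    return (sum(r[0] for r in results) / n, sum(r[1] for r in results) / n)


# Synthetic, made-up data purely for demonstration.
_rng = random.Random(1)
demo = [(_rng.gauss(2.0, 1.0), 1) for _ in range(200)] + \
       [(_rng.gauss(-2.0, 1.0), 0) for _ in range(200)]
runs = half_split_cv(demo, train_toy, n_repeats=100, cutoff=0.5)
mean_sensitivity, mean_specificity = average(runs)
```

In this sketch each element of `runs` plays the role of one gray line (reduced to a single point at the 0.5 cutoff), and `average(runs)` plays the role of the bold average line. Sweeping `cutoff` over a range of values would trace out a full curve per repeat. A small spread across repeats, and close agreement with a model trained on the full set, is the kind of evidence against overfitting that the caption describes.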
