Fig. 2 | BMC Bioinformatics

Fig. 2

From: The MOBSTER R package for tumour subclonal deconvolution from bulk DNA whole-genome sequencing data

MOBSTER analysis of WGS sample PD4120a (breast cancer). a MOBSTER fit for \(n=4643\) somatic mutations mapping to chromosome 3, which is largely diploid. The input VAF (Observed frequency) is adjusted for tumour purity. MOBSTER identifies \(K=3\) Beta components and one tail, as shown previously in [CG20]. b Sequencing coverage for this sample, shown as a histogram of the sequencing depth at the input mutations; the median coverage is 169×. c Mixing proportions obtained from MOBSTER’s clustering assignments, i.e. the proportion of mutations assigned to each cluster of the fitted mixture. d Model-selection scores used by MOBSTER. Here the model is selected with reICL, the ICL score computed with reduced entropy. All other scores select the same model (red point in the score plot), so the chosen model is optimal under every scoring system considered. e Entropy of the model’s latent variables: the standard entropy (solid line) and the reduced entropy, which is computed only over mutations assigned to Beta clusters. As expected, the reduced entropy is bounded from above by the standard entropy. f Value of the latent variable (cluster assignment probability) for each mutation. All mutations are hard-assigned to clusters regardless of the value of their latent variables. Uncertainty is highest for mutations that map to clusters C3, C2 and Tail, because the mixture density functions of these components have overlapping support.
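The quantities in panels c, e and f can be illustrated with a minimal Python sketch. This is not MOBSTER's code (MOBSTER is an R package) and it uses none of its API. The function names, the toy responsibility matrix and the label `"Tail"` are all hypothetical. The reduced entropy here follows one literal reading of the caption, namely summing per-mutation entropy only over mutations hard-assigned to a Beta cluster; this is an assumption, and the package may define it differently.

```python
import math
from collections import Counter


def hard_assign(z, labels):
    """Assign each mutation to the component with the largest latent-variable value."""
    return [labels[max(range(len(row)), key=row.__getitem__)] for row in z]


def mixing_proportions(assignments):
    """Fraction of mutations hard-assigned to each cluster (panel c, as described)."""
    counts = Counter(assignments)
    n = len(assignments)
    return {label: counts[label] / n for label in counts}


def row_entropy(row):
    """Shannon entropy (natural log) of one mutation's assignment probabilities."""
    return -sum(p * math.log(p) for p in row if p > 0)


def entropies(z, labels, tail_label="Tail"):
    """Return (standard, reduced) entropy.

    Assumption: 'reduced' sums the per-mutation entropy only over mutations
    that are hard-assigned to a Beta cluster (i.e. not to the tail).
    """
    assignments = hard_assign(z, labels)
    per_row = [row_entropy(row) for row in z]
    standard = sum(per_row)
    reduced = sum(h for h, a in zip(per_row, assignments) if a != tail_label)
    return standard, reduced


# Toy latent variables (rows: mutations; columns: Tail, C1, C2, C3). Invented values.
labels = ["Tail", "C1", "C2", "C3"]
z = [
    [0.00, 1.00, 0.00, 0.00],  # unambiguous C1 mutation
    [0.05, 0.00, 0.60, 0.35],  # C2, overlapping with C3
    [0.10, 0.00, 0.30, 0.60],  # C3, overlapping with C2 and the tail
    [0.70, 0.00, 0.05, 0.25],  # tail, overlapping with C3
]

assignments = hard_assign(z, labels)
proportions = mixing_proportions(assignments)
standard, reduced = entropies(z, labels)
```

Under this reading, the bound stated for panel e holds by construction: every per-mutation entropy is non-negative, and the reduced entropy sums a subset of the terms that make up the standard entropy.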
