Fig. 1 | BMC Bioinformatics

Fig. 1

From: GMMchi: gene expression clustering using Gaussian mixture modeling

Common distributions in gene expression analysis. Examples of histogram distributions commonly seen in gene expression data from a panel of 78 colorectal cancer (CRC)-derived cell lines. Microarray-based expression levels are plotted as log2 values on the x-axis, and the number of samples at each expression level on the y-axis. The continuous curves are fitted normal distributions, and the vertical dotted red line marks the best estimate of the cutoff separating low from high expression, derived as described later. The mRNA for GJC2, which encodes a gap junction protein, shows a unimodal distribution consistent with a single Gaussian, whereas the mRNA for CDX1, which encodes a homeobox protein, shows a bimodal distribution consistent with two distinct Gaussians. The gene CDH1, encoding E-cadherin, shows a unimodal distribution with a tail of low expression values; estimating this tail is one of the main challenges for Gaussian mixture modeling (GMM).
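
To make the bimodal case concrete, the following is a minimal sketch of fitting two normal curves to log2 expression values and locating a low/high cutoff between them. It is an illustration under stated assumptions, not the GMMchi procedure: the data are synthetic (78 simulated values, chosen only to mirror the panel size), the fit uses a plain expectation-maximization loop, and the function names `fit_two_gaussians` and `low_high_cutoff` are hypothetical. The cutoff here is simply the point between the two means where the weighted densities are equal, which may differ from how the article derives its dotted line.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, sds) with components ordered by mean.
    """
    data = sorted(values)
    half = len(data) // 2
    # Crude initialisation: lower half vs. upper half of the sorted data.
    groups = [data[:half], data[half:]]
    weights = [0.5, 0.5]
    means = [sum(g) / len(g) for g in groups]
    sds = [
        max(1e-3, math.sqrt(sum((v - m) ** 2 for v in g) / len(g)))
        for g, m in zip(groups, means)
    ]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for v in data:
            dens = [w * normal_pdf(v, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300  # guard against underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and sd of each component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * v for r, v in zip(resp, data)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, data)) / nk
            sds[k] = max(1e-3, math.sqrt(var))

    order = sorted(range(2), key=lambda k: means[k])
    return ([weights[k] for k in order],
            [means[k] for k in order],
            [sds[k] for k in order])


def low_high_cutoff(weights, means, sds, tol=1e-6):
    """Bisect between the two means for the point of equal weighted density.

    Assumes the components are well separated, so the low component
    dominates at its own mean and the high component at its own mean.
    """
    lo, hi = means[0], means[1]
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        low_d = weights[0] * normal_pdf(mid, means[0], sds[0])
        high_d = weights[1] * normal_pdf(mid, means[1], sds[1])
        if low_d > high_d:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Synthetic log2 expression values (assumption: NOT data from the article).
rng = random.Random(0)
expression = ([rng.gauss(4.0, 0.5) for _ in range(40)]
              + [rng.gauss(9.0, 0.7) for _ in range(38)])

weights, means, sds = fit_two_gaussians(expression)
cutoff = low_high_cutoff(weights, means, sds)
print("means:", means)
print("cutoff between low and high expression:", cutoff)
```

A unimodal gene with a low-expression tail is harder than this example suggests: the tail contains few samples, so a second component fitted to it is poorly constrained, which is the difficulty the caption points to.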