Fig. 4 | BMC Bioinformatics

Fig. 4

From: GMMchi: gene expression clustering using Gaussian mixture modeling

Determining the chi-square threshold value for identifying a poorly fitted tail. The histogram shows the distribution of the log2-transformed χ2 values of all genes identified by GMM as bimodal in our panel of 78 cell lines. The histogram is clearly bimodal, with two normal components that are well fitted by GMM estimation. We assume the two components represent (1) on the low side, well-fitted distributions with lower χ2 values, and (2) on the high side, inadequately fitted distributions with higher χ2 values, which contain non-normal tail components. Using the estimated mean of 8.48 and standard deviation of 1.5 for the upper component, we estimate a lower threshold of 5.52 for the log2 χ2 values that indicate the presence of a non-normal tail, at a false-negative rate of 5%.
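
The threshold arithmetic in the caption can be checked with a short sketch. It assumes the upper component is exactly normal with the rounded mean and standard deviation quoted above; the variable names are illustrative and are not taken from the GMMchi code, and the caption does not say which quantile convention was used, so both common readings of "5%" are computed.

```python
from statistics import NormalDist

# Values quoted in the caption (rounded as published).
mean_upper = 8.48          # mean of the upper (poorly fitted) component, log2 chi-square
sd_upper = 1.5             # standard deviation of the upper component
reported_threshold = 5.52  # lower threshold reported in the caption

# Assumption: the upper component is treated as an exact normal distribution.
upper = NormalDist(mu=mean_upper, sigma=sd_upper)

# How many standard deviations below the mean the reported threshold lies.
z_implied = (reported_threshold - mean_upper) / sd_upper

# Fraction of the upper component falling below the reported threshold,
# i.e. poorly fitted genes that the threshold would miss.
fraction_below = upper.cdf(reported_threshold)

# Candidate thresholds under two readings of "5%" (both are assumptions).
one_sided_5pct = upper.inv_cdf(0.05)    # 5% of the component lies below
two_sided_5pct = upper.inv_cdf(0.025)   # lower bound of a central 95% interval

# The same threshold expressed on the untransformed chi-square scale.
chi2_threshold = 2 ** reported_threshold

print(f"implied z-score:          {z_implied:.3f}")
print(f"fraction below threshold: {fraction_below:.4f}")
print(f"one-sided 5% threshold:   {one_sided_5pct:.2f}")
print(f"two-sided 5% threshold:   {two_sided_5pct:.2f}")
print(f"chi-square threshold:     {chi2_threshold:.1f}")
```

With these rounded inputs, the lower bound of a central 95% interval comes out near 5.54, close to the reported 5.52. The remaining gap is presumably due to rounding of the published mean and standard deviation, though that is an assumption.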
