Figure 4 | BMC Bioinformatics

Figure 4

From: Combining Shapley value and statistics to the analysis of gene expression data in children exposed to air pollution

Clustering on genes selected by CASh and t-test at p < 0.01. (a) Heat-map of the log-expression values, together with hierarchical clustering (Ward method, Euclidean distance) and K-means (number of clusters fixed a priori at K = 2), of 47 subjects (columns) and the 159 genes (rows) with the highest Shapley values and unadjusted p-values smaller than 0.01 (from CASh, Algorithm 1). (b) Heat-map of the log-expression values, together with hierarchical clustering (Ward method, Euclidean distance) and K-means (number of clusters fixed a priori at K = 2), of 47 subjects (columns) and the 265 genes (rows) with p-values smaller than 0.01 (from t-test). Yellow indicates up-regulation and blue indicates down-regulation. In the subject labels, 1 denotes an exposed subject and 0 a non-exposed subject. The red and green labels at the top of each heat-map mark the two clusters of subjects produced by K-means. Orange rectangles highlight misclassified subjects. The vertical dashed line shows the separation between the two main clusters.
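The two clustering steps named in the caption can be illustrated with a minimal sketch. This is not the authors' code: the expression matrix is synthetic, the 23/24 split between exposed and non-exposed subjects is an assumption (the caption gives only the total of 47), and the helper names `kmeans` and `misclassified` are hypothetical. Ward linkage comes from SciPy; K-means is a plain Lloyd iteration written out in NumPy.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Synthetic stand-in for the log-expression matrix (NOT the study data).
# Rows are genes, columns are subjects, as in the heat-map layout.
n_genes = 159
exposure = np.array([1] * 23 + [0] * 24)  # assumed split of the 47 subjects
gene_effect = rng.normal(0.0, 1.0, size=n_genes)
X = rng.normal(0.0, 1.0, size=(n_genes, exposure.size))
X += np.outer(gene_effect, exposure)  # shift exposed subjects gene by gene

subjects = X.T  # cluster subjects, so each subject becomes one row

# Hierarchical clustering: Ward method on Euclidean distances, cut into 2 groups.
Z = linkage(subjects, method="ward", metric="euclidean")
hc_labels = fcluster(Z, t=2, criterion="maxclust")  # labels are 1 or 2


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with k fixed in advance."""
    local_rng = np.random.default_rng(seed)
    centers = points[local_rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def misclassified(labels, truth):
    """Count subjects whose cluster disagrees with the exposure label.

    Cluster ids are arbitrary, so the better of the two possible
    cluster-to-label assignments is used.
    """
    binary = (labels == labels[0]).astype(int)
    errors = int(np.sum(binary != truth))
    return min(errors, truth.size - errors)


km_labels = kmeans(subjects, k=2)
print("Ward misclassified:", misclassified(hc_labels, exposure))
print("K-means misclassified:", misclassified(km_labels, exposure))
```

The misclassification count plays the role of the orange rectangles in the figure: it is the number of subjects whose cluster does not match their exposure label. Running the same steps on a second gene set would give the comparison shown between panels (a) and (b).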
