Figure 5 | BMC Bioinformatics

Figure 5

From: Combining Shapley value and statistics to the analysis of gene expression data in children exposed to air pollution

CASh p-values versus differential Shapley value. (a) Scatterplot of the p-values provided by the CASh method versus the differential Shapley values of the first 450 genes with the smallest p-value from CASh applied to $\bar{v}^{TP+}$ vs. $\bar{v}^{PR+}$ and to $\bar{v}^{TP-}$ vs. $\bar{v}^{PR-}$. 
Red points correspond to 80 genes selected with respect to the differential Shapley value $\phi(\bar{v}^{TP+}) - \phi(\bar{v}^{PR+})$. Green points correspond to 80 genes selected with respect to the differential Shapley value $\phi(\bar{v}^{TP-}) - \phi(\bar{v}^{PR-})$. The blue dashed line intercepts the y-axis at p = 0.05; the brown dashed line intercepts the y-axis at p = 0.01. 
Blue arrows indicate the two genes shown in the table (column 'Probe ID'), which have different p-values from CASh (column 'p-value') but the same Shapley value difference $\phi(\bar{v}^{TP+}) - \phi(\bar{v}^{PR+}) = 0.00063$ (see column 'Shapley value'). The medians of the statistics of the Shapley value in $\bar{v}^{TP+}$ and $\bar{v}^{PR+}$, together with the difference of the medians, are shown in column 'Median Shapley value'. (b) Hierarchical clustering (Ward method, Euclidean distance) of 47 subjects (columns) on 160 genes selected according to the differential Shapley value (green and red points in (a)). 
(c) Hierarchical clustering (Ward method, Euclidean distance) of 47 subjects (columns) on the 113 genes, out of the 160 genes with the highest differential Shapley value, whose p-value from CASh is lower than 0.01 (green and red points in (a) below the brown dashed line). In the subject labels, 1 denotes an exposed subject and 0 a non-exposed subject. Orange rectangles highlight misclassified subjects.
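
The two-stage gene selection behind panels (b) and (c) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that genes are ranked by the signed differential Shapley value in descending order, and the names `Gene`, `select_genes`, `top_k` and `alpha`, as well as the toy probe IDs and numbers, are hypothetical and not taken from the article.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    probe_id: str        # hypothetical probe identifier
    diff_shapley: float  # differential Shapley value, e.g. phi(v_TP) - phi(v_PR)
    p_value: float       # p-value assumed to come from a CASh-like test


def select_genes(genes: List[Gene], top_k: int, alpha: float) -> List[Gene]:
    """Keep the top_k genes by differential Shapley value,
    then retain only those with a p-value strictly below alpha."""
    # Assumption: "highest" means largest signed value, not largest magnitude.
    ranked = sorted(genes, key=lambda g: g.diff_shapley, reverse=True)
    return [g for g in ranked[:top_k] if g.p_value < alpha]


# Toy data, invented purely for illustration.
genes = [
    Gene("probe_a", 0.0009, 0.002),
    Gene("probe_b", 0.0007, 0.030),
    Gene("probe_c", 0.0005, 0.001),
    Gene("probe_d", 0.0001, 0.0005),
]

selected = select_genes(genes, top_k=3, alpha=0.01)
print([g.probe_id for g in selected])
```

In the setting of the caption, the first stage would correspond to keeping 80 genes per game (160 in total) and the second to the p < 0.01 cut that leaves 113 genes. The clustering step itself is not shown here.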
