Figure 4 | BMC Bioinformatics

Figure 4

From: Combining gene expression data from different generations of oligonucleotide arrays

The effect of data filtering on the identification of differentially expressed genes and on the correlation between array types for the same genes. (a) Percentage of differentially expressed genes common to the U95Av2 and U133A datasets. When we considered only the 1,000 genes most highly correlated across U95Av2 and U133A, the overlap between the lists of differentially expressed genes increased dramatically (solid line). For comparison, we show the result without gene selection by correlation (dashed line). For the latter, we repeatedly subsampled a random gene set of the same size to eliminate the effect of total size; we also filtered using Present and Absent calls to increase the overlaps. (b) Distribution of the correlation coefficients of probe sets, stratified by their mean expression value across U95Av2 and U133A. The density was estimated for the upper quartiles using a Gaussian kernel. Filtering by expression values clearly enhances the correlation of probe sets across array types, thus improving the reproducibility of the selection of differentially expressed genes.
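
The comparison in panel (a) can be illustrated with a minimal sketch. It assumes each dataset is given as a list of gene identifiers ranked by evidence for differential expression (strongest first). The function names, the ranking inputs, and the number of repeats are hypothetical choices made for illustration; they are not taken from the article, and this is not the authors' implementation.

```python
import random


def overlap_percentage(ranked_a, ranked_b, top_n):
    """Percentage of genes shared by the top_n entries of two ranked lists."""
    top_a = set(ranked_a[:top_n])
    top_b = set(ranked_b[:top_n])
    return 100.0 * len(top_a & top_b) / top_n


def restrict(ranked, keep):
    """Keep only the genes in `keep`, preserving the original ranking order."""
    return [gene for gene in ranked if gene in keep]


def random_baseline(ranked_a, ranked_b, subset_size, top_n, n_repeats=100, seed=0):
    """Mean overlap over repeated random gene subsets of a fixed size.

    This mimics a baseline without correlation-based selection: the subset
    has the same size as the selected set, so any difference in overlap is
    not explained by the total number of genes considered.
    Assumes top_n <= subset_size <= number of genes shared by both lists.
    """
    rng = random.Random(seed)
    shared = sorted(set(ranked_a) & set(ranked_b))
    total = 0.0
    for _ in range(n_repeats):
        keep = set(rng.sample(shared, subset_size))
        total += overlap_percentage(
            restrict(ranked_a, keep), restrict(ranked_b, keep), top_n
        )
    return total / n_repeats
```

To compare against a correlation-selected set, one would call `overlap_percentage` on both rankings restricted to the selected genes and set `subset_size` in `random_baseline` to the size of that set.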
