Number of hits in non-tumor tissues versus number of hits in tumor tissues. The top-left panel plots one point per transcript in the human genome (total number of hits ≤ 1,000). The other panels are filtered to show only those transcripts whose "most expressed in" TissueInfo attribute has the value "gland", "lung", or "spleen". Filtering shows that transcripts most expressed in "gland" are, on average, equally likely to appear in tumor tissues as in non-tumor tissues. In contrast, transcripts most expressed in "spleen" are five times more likely to appear in non-tumor tissues. Filtering on other tissues gives average expression ratios intermediate between those of gland and spleen. These plots show that EST libraries sequenced from different tissues have different ratios of tumor to non-tumor representation, and that this ratio can be determined by grouping transcripts according to the calculated "most expressed in" TissueInfo attribute. As a consequence of this grouping, the scatter observed in the "all tissues" panel is greatly reduced within each filtered panel. This observation motivated the development of the EST mining approach described in this manuscript.
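The grouping described in this caption can be illustrated with a minimal Python sketch. It is not the TissueInfo implementation: the function name `ratio_by_tissue`, the tuple layout, and the hit counts are hypothetical and chosen only to mirror the gland and spleen ratios stated above. The sketch also assumes that the average ratio is the pooled non-tumor hit count divided by the pooled tumor hit count within each group, which the caption does not specify.

```python
from collections import defaultdict


def ratio_by_tissue(transcripts, max_total_hits=1000):
    """Pool hits per 'most expressed in' tissue and return non-tumor/tumor ratios.

    transcripts: iterable of (tissue, non_tumor_hits, tumor_hits) tuples
    (a hypothetical layout, not the TissueInfo data format).
    """
    totals = defaultdict(lambda: [0, 0])  # tissue -> [non-tumor hits, tumor hits]
    for tissue, non_tumor, tumor in transcripts:
        # Mirror the caption's filter: keep transcripts with at most 1,000 hits in total.
        if non_tumor + tumor > max_total_hits:
            continue
        totals[tissue][0] += non_tumor
        totals[tissue][1] += tumor
    # Skip groups without tumor hits to avoid dividing by zero.
    return {t: nt / tu for t, (nt, tu) in totals.items() if tu > 0}


# Made-up counts for illustration only; they are not taken from the figure.
toy_transcripts = [
    ("gland", 40, 40),
    ("gland", 10, 10),
    ("spleen", 50, 10),
    ("spleen", 25, 5),
    ("lung", 2000, 10),  # dropped: more than 1,000 hits in total
]

ratios = ratio_by_tissue(toy_transcripts)
print(ratios)
```

With real data, each tuple would carry a transcript's "most expressed in" value and its hit counts in non-tumor and tumor EST libraries. A per-transcript mean of ratios would be an equally plausible reading of "average" and could give different numbers.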