Figure 1 | BMC Bioinformatics

Figure 1

From: Identification and analysis of methylation call differences between bisulfite microarray and bisulfite sequencing data with statistical learning techniques

Data preprocessing. (1) Only reads overlapping with a CpG on the Infinium 450K chip are retained. (2) Windows are extended to the left and right of each CpG according to the maximum read length, yielding a uniform feature representation. (3) For each CpG, a consensus sequence is formed from its corresponding set of reads. Additionally, the position-specific frequency of each base is extracted. (4) Finally, CpG positions are masked by introducing gaps in the sequence or zeroing frequencies.
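
The four steps in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `(sequence, start)` read representation, the window layout (`2 * max_len` positions with the CpG at indices `max_len - 1` and `max_len`), and the tie-breaking rule for the consensus are all assumptions made for illustration.

```python
from collections import Counter

BASES = "ACGT"
GAP = "-"

def overlaps_cpg(seq, start, cpg_pos):
    """Step 1: True if the read covers the C (cpg_pos) or the G (cpg_pos + 1)."""
    return start <= cpg_pos + 1 and start + len(seq) - 1 >= cpg_pos

def align_to_window(seq, start, cpg_pos, max_len):
    """Step 2: place a read in a fixed window of 2 * max_len positions,
    padding uncovered positions with gaps (assumed layout)."""
    win_start = cpg_pos - max_len + 1
    window = [GAP] * (2 * max_len)
    for i, base in enumerate(seq):
        j = start - win_start + i
        if 0 <= j < len(window):
            window[j] = base
    return "".join(window)

def base_frequencies(rows):
    """Step 3a: per-position relative frequency of each base, ignoring gaps."""
    freqs = []
    for column in zip(*rows):
        counts = Counter(b for b in column if b in BASES)
        total = sum(counts.values())
        freqs.append({b: (counts[b] / total if total else 0.0) for b in BASES})
    return freqs

def consensus(freqs):
    """Step 3b: most frequent base per position; gap where no read covers it.
    Ties go to the first base in BASES order (an arbitrary choice)."""
    out = []
    for f in freqs:
        out.append(max(BASES, key=lambda b: f[b]) if any(f.values()) else GAP)
    return "".join(out)

def mask_cpg(cons, freqs, max_len):
    """Step 4: gap out the CpG in the consensus and zero its frequencies."""
    cpg_idx = (max_len - 1, max_len)
    masked_cons = "".join(GAP if i in cpg_idx else c for i, c in enumerate(cons))
    masked_freqs = [dict.fromkeys(BASES, 0.0) if i in cpg_idx else f
                    for i, f in enumerate(freqs)]
    return masked_cons, masked_freqs

# Toy data: a CpG at position 10, maximum read length 4 (hypothetical values).
cpg_pos, max_len = 10, 4
reads = [("ATCG", 8), ("TTGA", 9), ("GGCC", 20)]  # the last read misses the CpG

kept = [(s, p) for s, p in reads if overlaps_cpg(s, p, cpg_pos)]
rows = [align_to_window(s, p, cpg_pos, max_len) for s, p in kept]
freqs = base_frequencies(rows)
cons = consensus(freqs)
masked_cons, masked_freqs = mask_cpg(cons, freqs, max_len)

print(rows)         # ['-ATCG---', '--TTGA--']
print(cons)         # -ATCGA--
print(masked_cons)  # -AT--A--
```

In this toy example the two retained reads disagree at the cytosine (one C, one T), so its unmasked frequencies are 0.5 each. Masking removes exactly this information from both the consensus and the frequency representation, leaving only the sequence context around the CpG.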
