Probabilistic alignment leads to improved accuracy and read coverage for bisulfite sequencing data
© Hong et al.; licensee BioMed Central Ltd. 2013
Received: 7 June 2013
Accepted: 19 November 2013
Published: 21 November 2013
DNA methylation has been linked to many important biological phenomena. Researchers have recently begun to sequence bisulfite treated DNA to determine its pattern of methylation. However, sequencing reads from bisulfite-converted DNA can vary significantly from the reference genome because of incomplete bisulfite conversion, genome variation, sequencing errors, and poor quality bases. Therefore, it is often difficult to align reads to the correct locations in the reference genome. Furthermore, bisulfite sequencing experiments have the additional complexity of having to estimate the DNA methylation levels within the sample.
Here, we present a highly accurate probabilistic algorithm, which is an extension of the Genomic Next-generation Universal MAPper to accommodate bisulfite sequencing data (GNUMAP-bs), that addresses the computational problems associated with aligning bisulfite sequencing data to a reference genome. GNUMAP-bs integrates uncertainty from read and mapping qualities to help resolve the difference between poor quality bases and the ambiguity inherent in bisulfite conversion. We tested GNUMAP-bs and other commonly-used bisulfite alignment methods using both simulated and real bisulfite reads and found that GNUMAP-bs and other dynamic programming methods were more accurate than the more heuristic methods.
The GNUMAP-bs aligner is a highly accurate alignment approach for processing the data from bisulfite sequencing experiments. The GNUMAP-bs algorithm is freely available for download at: http://dna.cs.byu.edu/gnumap. The software runs on multiple threads and multiple processors to increase the alignment speed.
KeywordsDNA methylation Bisulfite sequencing Probabilistic alignment Parallel processing
DNA methylation is a chemical process in which a methyl group is added to the carbon-5 position of a DNA cytosine. In most vertebrates, DNA methylation typically occurs on the cytosine of a CpG dinucleotide [1, 2], although some specific examples of other types of methylation have been shown to play roles in specific tissues [3-6]. Since its discovery over 60 years ago, DNA methylation has been linked to many important biological phenomena such as the suppression of gene expression [7, 8], imprinting , X chromosome inactivation , epigenetic reprogramming during mammalian development , and cancer development . Therefore, the study of genome-wide methylation patterns is currently of great interest to researchers, particularly in areas related to the molecular mechanisms of development, cancer, and chromatin dynamics.
When DNA is treated with sodium bisulfite, unmethylated cytosine residues are converted to uracil, while 5-methylcytosine residues are unaffected. Later in bisulfite sequencing (BS-seq) experimental protocols, PCR amplification or sequencing converts the uracil residues to thymines. The next step of finding the correct genomic location for a bisulfite-treated read (BSR) is a complicated and difficult process. Although cytosine to thymine changes are allowed for when mapping the BSRs to the genomic sequence, methylated and unmethylated CpG locations are often identified and, in some cases, it is impossible to distinguish between a bisulfite (BS)-treated thymine that originated from an unmethylated cytosine, and a true thymine from a different genomic location or an individual genomic variation at that location [13, 14]. This ambiguity is often magnified by the presence of sequencing errors or low-quality bases. As a result, sophisticated computational strategies are required for aligning reads from BS-seq experiments.
The first whole-genome methylation profiles were performed on Arabidopsis thaliana. To map the resultant BSRs, alignment algorithms based on a probabilistic formulation and a suffix tree  as well as reference genome conversion  were used. However, it is not computationally feasible to apply this approach to larger genomes such as the human genome, or to experiments with the current deeper sequencing depths. Later algorithms, such as BSMAP , constructed seed tables of locations from both the original reference sequence and the BS variants, and then extends the seeds to form a possible mapping location. This seed extension process can be somewhat unreliable because the seeds must be exact and do not take into account BS-treated variations. BS alignment methods such as BS Seeker , Bismark , and BRAT-BW  have been used to map BSRs. These methods employ a Burrows-Wheeler transformation  for fast (in)exact string matching and then combine the results with either a pre-processing or post-processing script to handle BSRs with three letters after converting all Cs to T. The strength of these methods lie with reads with fewer mismatches, but they offer very limited support for aligning reads with insertions or deletions (indels). These methods also have difficulty in aligning Ts in the reads to Cs in the genome without also (incorrectly) aligning for cysteines in the reads to thymidines in the genome. More recently, the LAST alignment program has been adapted to align BSRs . LAST uses a seed extension approach similar to the one used by NCBI BLAST , but the speed and mapping accuracy are increased by using variable-length seeds and base quality information .
In this study, we present a highly accurate probabilistic mapping algorithm, Genomic Next-generation Universal MAPper for Bisulfite Sequencing (GNUMAP-bs), for BSR alignment. GNUMAP-bs is an extension of the Genomic Next-Generation Universal MAPper (GNUMAP)  that can accommodate the alignment of BSRs to a reference genome. GNUMAP-bs was developed to achieve higher accuracy than other BS-seq approaches by including base quality scores in the alignment process. In addition, the GNUMAP-bs probabilistic mapping approach allows for the unbiased estimation of DNA methylation, especially when reads are aligned to multiple genomic locations.
Results and discussion
Parameters used in the two experiments for each of the aligners tested
-m 17 -s 1 -T 20 -a 0.90 (-a 0.92) -b
-k 17 -s 2 -r -A 20 -t 75 (-t 90) -b2
-n 0 -w 100 -v 3
-n 2 -l 50
-N 1 -L 20 -bowtie2 -min-score L,0,-0.6
-Q1 -j1 -d120 -n20 -f1 | last-map-probs.py -s150 -m0.95
default settings for single-end reads
We used two datasets to evaluate the performance of each of the alignment methods. The first was a simulated BSR dataset, which was carefully designed to mimic a typical human methylation experiment. The second was a real human BS-seq dataset, which was compared with an experimentally-derived human methylome profile.
Simulated bisulfite sequencing experiment
For the simulation study, we randomly assigned 20% of the CGs in the whole human genome (NCBI build37/HG19) to represent unmethylated cytosines (Cs) by changing them to thymidines (Ts), thereby simulating complete BS conversion. For the remaining 80% of the CGs, 75% was randomly assigned to be fully methylated and, therefore the Cs remained as Cs. The remaining 5% was assigned to be methylated in proportions between 0.1 and 0.9. In this dataset, we assumed that all non-CG sites remained unmethylated, so these Cs were all changed to Ts for read generation. We used the dwgsim (http://sourceforge.net/projects/dnaa/files/dwgsim) simulation tool with parameters -e 0.001-0.008 -1 100 -y 0.05 -r 0.002 -R 0.2 -C 10, to generate 100-bp BSRs with a 10? read depth across the genome. This simulation produced a BS-seq dataset that contained approximately 180 million (M) reads with a sequencing error rate that ranged from 0.001 to 0.008 and increased from the 5’ to 3’ ends, plus 5% randomly generated sequence, and a mutation rate of 0.002 (in which 0.2% were indels).
Simulated bisulfite read experiment
Overall mapping results:
Total reads aligned (%)
Correctly aligned (%)
Incorrectly aligned (%)
With ≥1 sequence variant:
Total reads aligned (%)
Correctly aligned (%)
Incorrectly aligned (%)
Ave. absolute estimation err.
Total compute time (16 CPUs)
39 h 50 m
29 h 25 m
4 h 28 m
46 h 16 m
97 h 26 m
58 h 20 m
Peak memory usage (GB)
Reads per second per CPU
Differences in sensitivities of these methods were much more pronounced for BSRs that contained at least one sequencing error or genome variant, and the GNUMAP-bs sensitivity was clearly better than the sensitivities of the other approaches (Table 2). We also evaluated how accurately each aligner predicted the CG methylation levels when the true methylation level ranged from 10% to 90%. All the alignment methods tested performed well with mean absolute errors less than 0.01 and with standard deviations less than 0.07.
When the computational performances of each of the methods were compared (Table 2), we observed that GNUMAP-bs required the most RAM (44.8 GB) and Bismark required the least (5.9 GB). Because some of the software applications presented here support computation on multiple threads and some do not, we presented two different measures of computational speed: 1) the total run time on a 16 CPU linux server and 2) the number of reads processed per CPU per second. GNUMAP-bs required approximately 40 hours of total compute time to process the 180 M BSRs, while BSMAP, the fastest algorithm, was nearly nine times faster than GNUMAP-bs. The LAST application does not support parallel computing, so LAST had the longest total alignment time. However, the LAST algorithm aligned the most reads per second per CPU (753), while Bismark-Bowtie2 aligned the least (26).
Human BS-seq dataset
We also evaluated the performances of the alignment methods using a BS-seq dataset generated from samples from a healthy human donor collected at the Andrology Laboratory at the University of Utah (Salt Lake City, UT, USA). The BS-seq data were generated by coupling BS conversion and the Illumina HiSeq2000 platform, which generated 101-bp BSRs for analysis. We aligned 283.6 M reads from three lanes of BS sequencing data containing 85 M to 100 M sequencing reads each, to the recent build of the human genome (NCBI build37/HG19). The human BSRs were processed for quality control as suggested previously . Briefly, the quality control involved masking low quality bases or trimming consecutive lowest quality bases at the 3’ ends of the reads.
We used the same parameters on these data as were used for the simulation experiment (Table 1) with two exceptions: -a 0.92 for GNUMAP-bs and -t 90 for Novoalign. The LAST algorithm again aligned the highest proportion (93.7%) of BSRs to the genome, followed by 70.0% for GNUMAP-bs, 68.2% for Novoalign, 67.1% for BSMAP, 67.0% for Bismark, 62.9% for Bismark-Bowtie2, and only 50.3% for BRAT-BW.
We compared these mapping results with the Sanger-based BS sequencing control data available from the Human Epigenome Project (HEP)  by selecting the data that were obtained using the same type of tissue as was used in our BS-seq dataset. The HEP data provides a natural gold standard for algorithmic evaluation. For this comparison, we focused on the 13,563 HEP CG sites on chromosome 22 (chr22), because these data provide the most comprehensive chromosomal CG coverage that is available in the HEP database. The data for the other chromosomes in the HEP showed similar profiles as chr22, but the coverage was much sparser.
Human bisulfite read experiment
CG read coverage
CG Read coverage
The differences between GNUMAP-bs and LAST become even less pronounced when the overlap between the mapped BSRs and the 13,563 CGs with HEP coverage on chr22 was considered. Last aligned the BSRs to 63.5% (8,606) of the CGs HEP sites compared with 58.3% (7,902) for GNUMAP-bs, and 57.5% (7,802) for Novoalign, 57.1% (7,747) for BSMAP, 55.7% (7,561) for Bismark, 54.1% (7,331) for Bismark-Bowtie2, and 49.3% (6,690) for BRAT-BW (Table 3).
BS sequencing presents difficult challenges to researchers attempting to process the sequencing reads from BS-seq experiments. In this work, we present GNUMAP-bs, a highly accurate and effective alignment algorithm that is specifically designed to estimate DNA methylation levels with base-level resolution in BS-seq data. GNUMAP-bs uses a probabilistic approach to align BSRs to a reference genome. GNUMAP-bs was developed to achieve higher coverage and accuracy than other published BS-seq approaches by integrating base quality and alignment quality information in the mapping process. We have shown that the GNUMAP-bs probabilistic mapping approach results in an improved unbiased estimation of DNA methylation across the human genome. In simulated and real datasets, we showed that GNUMAP-bs outperforms other BS-seq alignment methods when both coverage and consistency were balanced with Sanger based BS sequencing controls.
In addition, GNUMAP-bs provides many high-demand features needed for constructing a high quality methylome from BS-seq data. First, GNUMAP-bs incorporates quality sequencing data into a dynamic programming framework. This feature gives GNUMAP-bs the best balance between sensitivity and specificity of the tested BSR aligners, especially for reads that contain short polymorphisms. Second, GNUMAP-bs adaptively assigns an optimal mapping stringency based on an effective read length after the original read is trimmed. Third, GNUMAP-bs not only relies on the maximal score alignment but also probabilistically considers suboptimal alignments; that is, the alignment score is converted to a posterior probability and the probabilistic scores quantify the likelihood of the true source location for each read across the reference genome. As a result, for both the simulated and real datasets, we showed that GNUMAP-bs was more effective that the other methods in detecting read locations in the presence of sequencing errors. GNUMAP-bs also displayed the highest consistency with a known HEP methylation database. Because GNUMAP-bs supports Message Passing Interface (MPI) processing, the computational burden of the dynamic programming can be alleviated. Moreover, with computer resources becoming cheaper, computing clusters with large numbers of nodes and cores, and more computing clouds are becoming available. Therefore, memory and CPU running times are less of a bottleneck, which is especially useful for GNUMAP-bs alignments. For this reason, accuracy should currently be a more important concern in BS-seq data analysis.
In the second step of the GNUMAP algorithm, all the BSRs are aligned to the genome at the hashed locations. The alignment is performed using a novel probabilistic alignment algorithm that uses base quality information. All the regions that meet a predefined alignment score threshold are retained for the final step. In GNUMAP-bs, the probabilistic alignment algorithm matches the unconverted (original) reads to the unconverted genome while allowing matches between Ts in the reads and Cs in the genome. Using dynamic programming in probability space for mapping BSRs to a genome has several benefits: 1) the Needleman-Wunsch algorithm is guaranteed to find the optimal alignment for a BSR; 2) by incorporating base qualities into the probabilistic algorithm, true DNA methylation can be more accurately identified in the alignment, especially in areas where the reads have low base quality; and 3) by making only a small change in the alignment scoring matrix, namely by removing the T (read) to C (genome) 'mismatch’ penalty and scoring these alignments as a 'match’, the probabilistic Needleman-Wunsch algorithm can accurately account for the BS changes in BSRs without losing reads that are partially converted, or that have genome variations or sequencing errors.
During PCR amplification after BS conversion, the C to T conversions in the BSRs lead to G to A conversions on the PCR-synthesized strand as described previously . These G to A changes provide evidence of non-methylation on the original strand. The phenomenon also occurs for C to T changes on the PCR-synthesized strand. Therefore, the GNUMAP-bs algorithm maps each BSR twice, once looking for only C to T changes and then looking for only G to A changes. When a G to A change is observed, the read is considered to be a PCR-synthesized strand, and the non-methylation event is attributed to the original strand.
It is worth noting that the calibrated posterior probability is a special case of the LAST  mapping probabilities equation where a scaling factor for a bit score in the simple E-value statistics is not explicitly computed.
This probability provides an intuitive value between 0 and 1 representing the methylation signal for each read at each CG site. Other BS mapping programs use instead of Equation (1), which may not accurately capture the true posterior probability. The posterior probability indicates the relative significance of mapping a read to a particular position. The probabilistic Needleman-Wunsch computes a log-likelihood score; therefore, if a particular BSR has multiple possible mapping locations in the reference genome, the significance of each mapping decreases. After GNUMAP-bs computes P(r d ) for each read, the algorithm combines the probabilities into one methylation profile vector and infers a true methylation ratio at each CG site.
Intuitively, the reads that cover each C location across the reference genome provide evidence of true mapping so that a reliable methylation percentage can be obtained.
Software implementation and availability
The GNUMAP-bs algorithm is integrated into the is GNUMAP software suite, and is freely available for download at http://dna.cs.byu.edu/gnumap. The GNUMAP-bs pipeline can accommodate single-end or pair-end reads in either FASTA or FASTQ file formats. A reference genome file (or multiple reference files) in FASTA format is also required. However, to increase the efficiency of the workflow, users can opt to write the reference genome hash table to a file, which can be used in future runs. GNUMAP outputs read alignments in standard SAM file format. The GNUMAP-bs software suite contains an addition application function (sam2gmp) that summarizes a SAM file and writes it into a text file that contains one line for each cytosine in the genome. Each line in this file contains the chromosome number, the location of the cytosine on the chromosome, the number of reads, and the numbers of As, Cs, Gs, Ts, and Ns covering the location. In addition, the text file gives a likelihood ratio p-value that indicates whether there is a significant number of Cs at the location (i.e. methylation significance).
In addition to the adaptations for BSRs, the GNUMAP-bs software also contains several modifications to reduce the computational time and memory needed for the alignments. For example, the initial read and genome conversion step reduces the genome alphabet to three bases, leading to an increase in the number of genome locations identified in the hash table. To reduce this effect, multiple seeds from each read are referenced by GNUMAP-bs, and the alignment is only conducted on the locations with the top two most k-mer hash references. Furthermore, although the original GNUMAP software supports multi-threaded computing within the same node using pthread, GNUMAP-bs is fully enabled to support MPI processing. This feature allows a large-scale alignment to be spread across multiple nodes in a cluster or supercomputing facility . In this implementation, the genome is divided into equal parts across nodes, and then the same batches of reads are aligned by each node to their individual portion of the genome. Once the batch is completed, the nodes communicate via MPI to calculate the posterior probability scores. Because most of the CPU time is spent on the alignments, the communication overhead is relatively small, resulting in a highly efficient parallel algorithm.
This research was conducted using the Fulton Supercomputing Lab at Brigham Young University (Provo, UT, USA) and the Linux Clusters for Genetic Analysis (LinGA) computing resources at Boston University Medical Campus (Boston, MA, USA). The study was supported financially by a grant from the National Institutes of Health (NIH) (R01 HG005692).
- Clark S, Harrison J, Frommer M, et al: CpNpG methylation in mammalian cells. Nat Genet. 1995, 10: 20-27. 10.1038/ng0595-20.View ArticlePubMed
- Bernstein BE, Meissner A, Lander ES: The mammalian epigenome. Cell. 2007, 128 (4): 669-681. 10.1016/j.cell.2007.01.033.View ArticlePubMed
- Ramsahoye BH, Biniszkiewicz D, Lyko F, Clark V, Bird AP, Jaenisch R: Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a. Proc Nat Acad Sci. 2000, 97 (10): 5237-5242. 10.1073/pnas.97.10.5237. http://www.pnas.org/content/97/10/5237.abstract,PubMed CentralView ArticlePubMed
- Haines TR, Rodenhiser DI, Ainsworth PJ: Allele-Specific non-CpG methylation of the Nf1 Gene during early mouse development. Dev Biol. 2001, 240 (2): 585-598. 10.1006/dbio.2001.0504. http://www.sciencedirect.com/science/article/pii/S0012160601905044,View ArticlePubMed
- Grandjean V, Yaman R, Cuzin F, Rassoulzadegan M: Inheritance of an epigenetic mark: the CpG DNA Methyltransferase 1 is required for de novo establishment of a complex pattern of non-CpG methylation. PLoS ONE. 2007, 2 (11): e1136-10.1371/journal.pone.0001136. http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0001136,PubMed CentralView ArticlePubMed
- Lister R, Pelizzola M, Dowen R, Hawkins R, Hon G, Tonti-Filippini J, Nery J, Lee L, Ye Z, Ngo Q, et al: Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009, 462 (7271): 315-322. 10.1038/nature08514.PubMed CentralView ArticlePubMed
- Riggs AD: X inactivation, differentiation, and DNA methylation. Cytogenet Cell Genet. 1975, 14: 9-25. 10.1159/000130315.View ArticlePubMed
- Holliday R, Pugh JE: DNA modification mechanisms and gene activity during development. Science. 1975, 187 (4173): 226-232. 10.1126/science.1111098.View ArticlePubMed
- Morison IM, Ramsay JP, Spencer HG: A census of mammalian imprinting. Trends Genet. 2005, 21 (8): 457-465. 10.1016/j.tig.2005.06.008.View ArticlePubMed
- Chow JC, Yen Z, Ziesche SM, Brown CJ: Silencing of the mammalian X chromosome. Annu Rev Genomics Hum Genet. 2005, 6: 69-92. 10.1146/annurev.genom.6.080604.162350.View ArticlePubMed
- Reik W: Stability and flexibility of epigenetic gene regulation in mammalian development. Nature. 2007, 447 (7143): 425-432. 10.1038/nature05918.View ArticlePubMed
- Feinberg AP, Vogelstein B: Hypomethylation distinguishes genes of some human cancers from their normal counterparts. Nature. 1983, 301 (5895): 89-92. 10.1038/301089a0.View ArticlePubMed
- Frommer M, McDonald L, Millar D, Collis C, Watt F, Grigg G, Molloy P, Paul C: A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands. Proc Nat Acad Sci. 1992, 89 (5): 1827-10.1073/pnas.89.5.1827.PubMed CentralView ArticlePubMed
- Xi Y, Li W: BSMAP: whole genome bisulfite sequence MAPping program. BMC Bioinformatics. 2009, 10: 232-10.1186/1471-2105-10-232.PubMed CentralView ArticlePubMed
- Cokus S, Feng S, Zhang X, Chen Z, Merriman B, Haudenschild C, Pradhan S, Nelson S, Pellegrini M, Jacobsen S: Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature. 2008, 452 (7184): 215-219. 10.1038/nature06745.PubMed CentralView ArticlePubMed
- Lister R, O’Malley R, Tonti-Filippini J, Gregory B, Berry C, Millar A, Ecker J: Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell. 2008, 133 (3): 523-536. 10.1016/j.cell.2008.03.029.PubMed CentralView ArticlePubMed
- Chen P, Cokus S, Pellegrini M: BS Seeker: precise mapping for bisulfite sequencing. BMC Bioinformatics. 2010, 11: 203-10.1186/1471-2105-11-203. http://www.biomedcentral.com/1471-2105/11/203,PubMed CentralView ArticlePubMed
- Krueger F, Andrews SR: Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics. 2011, 27 (11): 1571-1572. 10.1093/bioinformatics/btr167. http://bioinformatics.oxfordjournals.org/content/27/11/1571.long; http://dx.doi.org/10.1093/bioinformatics/btr167,PubMed CentralView ArticlePubMed
- Harris EY, Ponts N, Roch KGL, Lonardi S: BRAT-BW: efficient and accurate mapping of bisulfite-treated reads. Bioinformatics. 2012, 28 (13): 1795-1796. 10.1093/bioinformatics/bts264. http://bioinformatics.oxfordjournals.org/content/28/13/1795.long; http://dx.doi.org/10.1093/bioinformatics/bts264,PubMed CentralView ArticlePubMed
- Burrows M, Wheeler D: A block-sorting lossless data compression algorithm. 1994,, Technical Report SRC-RR-124, Digital Equipment Corporation, 1-16
- Frith MC, Mori R, Asai K: A mostly traditional approach improves alignment of bisulfite-converted DNA. Nucleic Acids Res. 2012, 40 (13): e100-e100. 10.1093/nar/gks275. http://nar.oxfordjournals.org/content/38/7/e96.short,PubMed CentralView ArticlePubMed
- Altschul SF, Madden TL, Shaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-402. 10.1093/nar/25.17.3389.PubMed CentralView ArticlePubMed
- Frith MC, Wan R, Horton P: Incorporating sequence quality data into alignment improves DNA read mapping. Nucleic Acids Res. 2010, 38 (7): 100-100. 10.1093/nar/gkq010. http://nar.oxfordjournals.org/content/38/7/e100.long,View Article
- Clement NL, Snell Q, Clement MJ, Hollenhorst PC, Purwar J, Graves BJ, Cairns BR, Johnson WE: The GNUMAP algorithm: unbiased probabilistic mapping of oligonucleotides from next-generation sequencing. Bioinformatics. 2010, 26: 38-45. 10.1093/bioinformatics/btp614. http://bioinformatics.oxfordjournals.org/content/26/1/38; http://dx.doi.org/10.1093/bioinformatics/btp614,View ArticlePubMed
- Chatterjee A, Stockwell P, Rodger E, Morison I: Comparison of alignment software for genome-wide bisulphite sequence data. Nucleic Acids Res. 2012, 40 (10): e79-10.1093/nar/gks150.PubMed CentralView ArticlePubMed
- Eckhardt F, Lewin J, Cortese R, Rakyan V, Attwood J, Burger M, Burton J, Cox T, Davies R, Down T, et al: DNA methylation profiling of human chromosomes 6, 20 and 22. Nat Genet. 2006, 38 (12): 1378-1385. 10.1038/ng1909.PubMed CentralView ArticlePubMed
- Harris RA, Wang T, Coarfa C, Nagarajan RP, Hong C, Downey SL, Johnson BE, Fouse SD, Delaney A, Zhao Y, Olshen A, Ballinger T, Zhou X, Forsberg KJ, Gu J, Echipare L, O’Geen H, Lister R, Pelizzola M, Xi Y, Epstein CB, Bernstein BE, Hawkins RD, Ren B, Chung WY, Gu H, Bock C, Gnirke A, Zhang MQ, Haussler D, Ecker JR, Li W, Farnham PJ, Waterland RA, Meissner A, Marra MA, Hirst M, Milosavljevic A, Costello JF: Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications. Nat Biotechnol. 2010, 28 (10): 1097-1105. 10.1038/nbt.1682.PubMed CentralView ArticlePubMed
- Clement N, Clement M, Snell Q, Johnson W: Parallel Mapping Approaches for GNUMAP. 2012 IEEE 26th Int Parallel Distributed Process Symp Workshops PhD Forum. 2011, 0 (1530-2075): 435-443.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.