  • Poster presentation
  • Open access

Alignment of short reads to multiple genomes using hashing

Background

Recent advances in biotechnology have enabled high-throughput sequencing of genomes based on large numbers of short reads. Most current methods [1, 2], however, align reads to only one reference genome at a time, which makes it difficult to distinguish sequencing errors from single nucleotide variants (SNVs).

Materials and methods

Inspired by [3], we propose a new method that uses multiple genomes and SNV information to align reads. This approach is promising because it allows us to distinguish sequencing errors from SNVs. The proposed alignment algorithm uses read fragments to identify seeds and extends these seeds to find occurrences of reads in the genomes. In this study, we developed and implemented an algorithm that captures genomic variation across multiple genomes, indexes those genomes, and performs short read alignment on the whole collection. Preliminary results were validated on Aspergillus fumigatus.
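The seed-and-extend idea over a hashed collection of genomes can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (build_index, align_read), the fixed seed length k, the non-overlapping seed choice, the mismatch-only extension, and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(genomes, k):
    """Hash every k-mer of every genome to its (genome_id, position) occurrences."""
    index = defaultdict(list)
    for gid, seq in genomes.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((gid, pos))
    return index


def align_read(read, genomes, index, k, max_mismatches=1):
    """Seed with non-overlapping k-mers of the read, then extend each seed hit
    by comparing the full read against the genome (mismatches only, no gaps)."""
    hits = {}
    for offset in range(0, len(read) - k + 1, k):
        seed = read[offset:offset + k]
        for gid, pos in index.get(seed, ()):
            start = pos - offset  # implied start of the read in this genome
            if start < 0 or start + len(read) > len(genomes[gid]):
                continue
            if (gid, start) in hits:
                continue  # this placement was already evaluated via another seed
            window = genomes[gid][start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits[(gid, start)] = mismatches
    return hits


# Hypothetical toy data: two strains differing by one SNV (G vs C at position 6).
genomes = {"strainA": "ACGTACGTTAGC", "strainB": "ACGTACCTTAGC"}
index = build_index(genomes, k=4)
hits = align_read("ACGTACCT", genomes, index, k=4)
for (gid, start), mismatches in sorted(hits.items()):
    print(gid, start, mismatches)
```

In this toy case the read matches strainB exactly and strainA with one mismatch at the same position. Under the assumptions above, that pattern suggests the mismatch reflects a variant carried by one of the genomes rather than a sequencing error, which is the kind of distinction a single-reference aligner cannot make.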

References

  1. Gontarz PM, Berger J, Wong CF: SRmapper: a fast and sensitive genome-hashing alignment tool. Bioinformatics. 2013, 29 (3): 316-321.

  2. Langmead B, Salzberg SL: Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012, 9 (4): 357-359.

  3. Huang L, Popic V, Batzoglou S: Short read alignment with populations of genomes. Bioinformatics. 2013, 29 (13): i361-i370.

Author information

Corresponding author

Correspondence to Quang Tran.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Cite this article

Tran, Q., Phan, V. Alignment of short reads to multiple genomes using hashing. BMC Bioinformatics 15 (Suppl 10), P23 (2014). https://doi.org/10.1186/1471-2105-15-S10-P23

  • DOI: https://doi.org/10.1186/1471-2105-15-S10-P23
