Flipping NextGen: using biological systems to characterize NextGen sequencing technologies

Glasscock, Jarret; Richt, Ryan; Hickenbotham, Matt

doi:10.1186/1471-2105-10-S7-A18

Volume 10 Supplement 7

UT-ORNL-KBRIN Bioinformatics Summit 2009

Meeting abstract
Open access
Published: 25 June 2009

Flipping NextGen: using biological systems to characterize NextGen sequencing technologies

Jarret Glasscock¹,
Ryan Richt¹ &
Matt Hickenbotham¹

BMC Bioinformatics volume 10, Article number: A18 (2009) Cite this article

2302 Accesses
Metrics details

Background

At a current 12 gigabases per sequencing run (and growing), there have been significant advancements in DNA sequencing technologies resulting in next generation (NextGen) sequencing platforms that produce 5 orders of magnitude more data than platforms used for the human genome project.

Results

A broad range of genomes was surveyed in order to assess characteristics necessary to sufficiently analyze these biological systems. In the context of genome re-sequencing projects we found 15 bp was needed to uniquely map 98% of loci in many bacteria, while 20 bp was needed before hitting lower asymptotes to uniquely characterize a fraction of more complex genomes (Figure 1).

Transcriptomes on the other hand were much less variable and required fewer bases (x) to uniquely map a much larger percentage (y) of their sequence space. For example, more than 98% of the complex human transcriptome could be uniquely characterized with as few as 20 bp.

Finally, de-novo sequencing (i.e. without a reference) would require a minimum of 1/2 of the sequence length to be unique in order to allow sufficient contig extension in the assembly process. For example, 40–50 bp reads are necessary for de-novo characterization of these systems uniquely defined by 20–25 bp reads. As of 2009, short read NextGen sequencing technologies have moved to 50 bp and beyond, ushering in what is expected to be the start of a revolution in genomics.

Conclusion

These results establish a lower bound on sequence length (x) required to sufficiently conduct re-sequencing, transcriptome, and de-novo sequencing projects. The asymptotic nature of the results also provides a guide for what percentage of the total space (y) we might expect to define in genomes/transcriptomes of similar size and complexity.

Author information

Authors and Affiliations

Cofactor Genomics, St. Louis, MO, 63103, USA
Jarret Glasscock, Ryan Richt & Matt Hickenbotham

Authors

Jarret Glasscock
View author publications
You can also search for this author in PubMed Google Scholar
Ryan Richt
View author publications
You can also search for this author in PubMed Google Scholar
Matt Hickenbotham
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jarret Glasscock.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Glasscock, J., Richt, R. & Hickenbotham, M. Flipping NextGen: using biological systems to characterize NextGen sequencing technologies. BMC Bioinformatics 10 (Suppl 7), A18 (2009). https://doi.org/10.1186/1471-2105-10-S7-A18

Download citation

Published: 25 June 2009
DOI: https://doi.org/10.1186/1471-2105-10-S7-A18

UT-ORNL-KBRIN Bioinformatics Summit 2009

Flipping NextGen: using biological systems to characterize NextGen sequencing technologies

Background

Results

Conclusion

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

BMC Bioinformatics

Contact us

UT-ORNL-KBRIN Bioinformatics Summit 2009

Flipping NextGen: using biological systems to characterize NextGen sequencing technologies

Background

Results

Conclusion

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

BMC Bioinformatics

Contact us