Correlation of geographical distribution of mycobacteriophages with genomic clusters and genome distance

Parrish, Miranda L; Rinehart, Claire A

doi:10.1186/1471-2105-14-S17-A8

Volume 14 Supplement 17

Proceedings of the 12th Annual UT-ORNL-KBRIN Bioinformatics Summit 2013

Meeting abstract
Open access
Published: 22 October 2013


BMC Bioinformatics volume 14, Article number: A8 (2013)


Background

Mycobacterium smegmatis is a soil bacterium. More than 448 mycobacteriophages specific for M. smegmatis have been sequenced and grouped into clusters of related genomes based on the similarity of their products and genome organization. The phagesdb.org database contains not only the sequence information but also the geographic location of the sampling site for each of these mycobacteriophages. From these data we addressed two questions: first, whether the mycobacteriophage clusters are randomly distributed geographically; second, how strongly geographic distance correlates with genetic distance within each cluster.

Materials and methods

Since the geographic sampling was not evenly distributed, the sampling frequency for each geographic and hydrologic [1] region was determined. Samples were drawn at random so that the total number of samples in a cohort was equal to the number of mycobacteriophages contained in a cluster. Each sample was assigned a sampling frequency based on the region from which it was drawn. The sampling probability for the cohort was calculated by multiplying the sampling frequencies for each draw. This was repeated 100,000 times and a sampling probability distribution was generated. The sampling probability for the actual cluster was also calculated from the geographic frequencies, and a probability for the cluster data was determined from the cumulative distribution function (CDF) of the sampling distribution. Over half of the clusters showed non-random distribution by both geographic and hydrologic regions (α = 0.05). Those showing non-random distribution at the hydrologic scale were generally part of adjacent continuous drainage regions (Figure 1).
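The resampling procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation (the abstract does not give one): the function names are hypothetical, cohorts are drawn with replacement from the pooled region labels, and log-probabilities are summed instead of multiplying raw frequencies, purely to avoid floating-point underflow for large clusters.

```python
import math
import random
from bisect import bisect_right
from collections import Counter


def region_frequencies(all_regions):
    """Sampling frequency of each region among all sampled phages."""
    counts = Counter(all_regions)
    total = len(all_regions)
    return {region: n / total for region, n in counts.items()}


def log_cohort_probability(regions, freq):
    """Log of the product of the per-draw sampling frequencies."""
    return sum(math.log(freq[r]) for r in regions)


def empirical_cdf_value(cluster_regions, all_regions, n_iter=100_000, seed=0):
    """Empirical CDF of the null distribution, evaluated at the observed cluster.

    Assumption: random cohorts are drawn with replacement from the pooled
    region labels, so each region is drawn in proportion to its frequency.
    Every region in cluster_regions must also occur in all_regions.
    """
    rng = random.Random(seed)
    freq = region_frequencies(all_regions)
    k = len(cluster_regions)  # cohort size equals the cluster size
    null = sorted(
        log_cohort_probability(rng.choices(all_regions, k=k), freq)
        for _ in range(n_iter)
    )
    observed = log_cohort_probability(cluster_regions, freq)
    # Fraction of random cohorts whose probability is <= the observed one.
    return bisect_right(null, observed) / n_iter
```

Here `all_regions` would hold one region label per sequenced phage and `cluster_regions` the labels for the members of one cluster. The abstract does not say which tail of the distribution was tested, so the sketch returns the raw CDF value and leaves the comparison against α to the caller.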

Geographic distance was determined with the GeoDistance function in Mathematica®, which gives the distance between positions projected onto a reference ellipsoid; heights are ignored. The genetic distance between nucleotide sequences was determined with the DamerauLevenshteinDistance function in Mathematica®, which gives the number of one-element deletions, insertions, substitutions and transpositions required to transform one sequence into the other. A correlation analysis showed very weak or no correlation between geographic distance and genetic distance within clusters.
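For readers without Mathematica, the two distance measures and the correlation step can be approximated in plain Python. The sketch below rests on several assumptions and is not the authors' code: all function names are hypothetical; the haversine formula on a sphere of radius 6371 km stands in for GeoDistance, which uses a reference ellipsoid; the edit distance is the optimal string alignment variant of Damerau-Levenshtein, which may differ from Mathematica's result on some inputs; and Pearson's r is assumed as the correlation measure, since the abstract does not name one.

```python
import math
from itertools import combinations


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, h)))


def osa_distance(a, b):
    """Edit distance counting insertions, deletions, substitutions and
    adjacent transpositions (optimal string alignment variant)."""
    prev2 = None
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # deletion
                         cur[j - 1] + 1,      # insertion
                         prev[j - 1] + cost)  # match or substitution
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                cur[j] = min(cur[j], prev2[j - 2] + 1)  # transposition
        prev2, prev = prev, cur
    return prev[len(b)]


def pearson(xs, ys):
    """Pearson correlation coefficient; undefined if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def distance_correlation(phages):
    """Correlate geographic and genetic distance over all pairs in a cluster.

    phages: list of (latitude, longitude, sequence) tuples.
    """
    geo, gen = [], []
    for (lat1, lon1, s1), (lat2, lon2, s2) in combinations(phages, 2):
        geo.append(haversine_km((lat1, lon1), (lat2, lon2)))
        gen.append(osa_distance(s1, s2))
    return pearson(geo, gen)
```

The dynamic-programming edit distance takes time proportional to the product of the two sequence lengths, so this pure-Python version is practical only for short test sequences; comparing whole phage genomes would call for a compiled implementation.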

References

1. U.S. Geological Survey: USGS: The National Map Viewer. Watershed Boundary Dataset [Data File]. 2012, [http://viewer.nationalmap.gov/viewer/]

Author information

Authors and Affiliations

Department of Biology, Western Kentucky University, Bowling Green, KY, 42101, USA
Miranda L Parrish & Claire A Rinehart

Corresponding author

Correspondence to Claire A Rinehart.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Parrish, M.L., Rinehart, C.A. Correlation of geographical distribution of mycobacteriophages with genomic clusters and genome distance. BMC Bioinformatics 14 (Suppl 17), A8 (2013). https://doi.org/10.1186/1471-2105-14-S17-A8
