Correlation of geographical distribution of mycobacteriophages with genomic clusters and genome distance
© Parrish and Rinehart; licensee BioMed Central Ltd. 2013
Published: 22 October 2013
Mycobacterium smegmatis is a soil bacterium. More than 448 mycobacteriophages specific for M. smegmatis have been sequenced and grouped into clusters of related genomes based on the similarity of their products and genome organization. The phagesdb.org database contains not only the sequence information but also the geographic location of the sampling site for each of these mycobacteriophages. Using these data, we addressed two questions in this study: first, whether the mycobacteriophage clusters are randomly distributed geographically; second, how strongly geographic distance correlates with genetic distance within each cluster.
Materials and methods
Geographic distance was determined with the GeoDistance function in Mathematica®, which gives the distance between positions projected onto a reference ellipsoid; heights are ignored. The genetic distance between nucleotide sequences was determined with the DamerauLevenshteinDistance function in Mathematica®, which gives the number of one-element deletions, insertions, substitutions and transpositions required to transform one sequence into the other. A correlation analysis showed very weak or no correlation between geographic distance and genetic distance within clusters.
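The pairwise comparison described above can be illustrated with a minimal Python sketch. This is not the authors' Mathematica code, and it makes several assumptions: geographic distance is approximated by the haversine formula on a sphere rather than on a reference ellipsoid, genetic distance uses the restricted (optimal string alignment) form of the Damerau-Levenshtein distance, which may differ from Mathematica's result on some inputs, and the correlation is taken to be Pearson's r, which the abstract does not specify. All function names are hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation (assumption)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def osa_distance(s, t):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Counts one-element deletions, insertions, substitutions and
    transpositions of adjacent elements, keeping only three DP rows.
    """
    prev2 = None
    prev = list(range(len(t) + 1))
    for i in range(1, len(s) + 1):
        cur = [i] + [0] * len(t)
        for j in range(1, len(t) + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # deletion
                         cur[j - 1] + 1,      # insertion
                         prev[j - 1] + cost)  # match or substitution
            if (i > 1 and j > 1
                    and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]):
                cur[j] = min(cur[j], prev2[j - 2] + 1)  # transposition
        prev2, prev = prev, cur
    return prev[len(t)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def geo_genetic_correlation(phages):
    """Correlate pairwise geographic and genetic distances within one cluster.

    `phages` is a list of (lat, lon, sequence) tuples (hypothetical layout).
    """
    geo, gen = [], []
    for (la1, lo1, s1), (la2, lo2, s2) in combinations(phages, 2):
        geo.append(haversine_km((la1, lo1), (la2, lo2)))
        gen.append(osa_distance(s1, s2))
    return pearson_r(geo, gen)
```

The dynamic-programming edit distance takes time proportional to the product of the two sequence lengths, so this pure-Python version is practical only for short illustrative sequences, not for whole phage genomes.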
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.