VITCOMIC: visualization tool for taxonomic compositions of microbial communities based on 16S rRNA gene sequences
© Mori et al; licensee BioMed Central Ltd. 2010
Received: 17 February 2010
Accepted: 18 June 2010
Published: 18 June 2010
Understanding the community structure of microbes is typically accomplished by sequencing 16S ribosomal RNA (16S rRNA) genes. These community data can be represented by constructing a phylogenetic tree and comparing it with other samples using statistical methods. However, owing to high computational complexity, these methods are insufficient to effectively analyze the millions of sequences produced by new sequencing technologies such as pyrosequencing.
We introduce a web tool named VITCOMIC (VI sualization tool for T axonomic CO mpositions of MI crobial C ommunity) that can analyze millions of bacterial 16S rRNA gene sequences and calculate the overall taxonomic composition for a microbial community. The 16S rRNA gene sequences of genome-sequenced strains are used as references to identify the nearest relative of each sample sequence. With this information, VITCOMIC plots all sequences in a single figure and indicates relative evolutionary distances.
VITCOMIC yields a clear representation of the overall taxonomic composition of each sample and facilitates an intuitive understanding of differences in community structure between samples. VITCOMIC is freely available at http://mg.bio.titech.ac.jp/vitcomic/.
The number of sequenced bacterial genomes has increased rapidly and now exceeds 1,000 ; however, we have little information regarding environmental microbes, largely because the majority of them are unculturable . The taxonomic composition of a microbial community can provide important clues to better understand its structure and ecology . Analysis using 16S rRNA genes is a frequently used method to obtain the taxonomic composition of a microbial community [4, 5]. Features of 16S rRNA genes include essentiality for all Bacteria and Archaea, mosaic structures of highly conserved regions and variable regions [6, 7], and little possibility for horizontal gene transfer . Moreover, the availability of numerous tools and databases specific for the 16S rRNA genes has potentiated taxonomic analyses [9–12].
Ultra-deep sequencing of microbial communities using a massively parallel pyrosequencer has recently uncovered relatively rare species in communities [5, 13–15]. However, the enormous amounts of sequencing data produced by recent pyrosequencing studies are difficult to effectively analyze using existing computational tools (Additional file 1) . For example, the overall taxonomic composition of each sample is traditionally presented graphically in phylogenetic trees [9, 17]. However, graphical representation and comparison of overall taxonomic compositions for pyrosequencing data is difficult due to the high computational complexity involved in constructing multiple alignments and phylogenetic trees from millions of sequences [16, 18]. Therefore, researchers tend to use a compressed representation of taxonomic composition such as a bar graph or pie chart of the phylum-level composition. Unfortunately, these compressed representations of overall taxonomic composition can be difficult to represent differences among microbial communities, especially differences attributable to minority taxa .
To address deficiencies in the analysis of taxonomic compositions of microbial communities, we developed a rapid visualization tool, named VITCOMIC, that presents overall taxonomic compositions based on large datasets of 16S rRNA genes from microbial communities. VITCOMIC can facilitate intuitive understanding of microbial communities and compare taxonomic compositions between communities.
Creation of a reference 16S rRNA gene database and their distance matrix
The reference 16S rRNA gene sequence database was constructed using 16S rRNA gene sequences from genome-sequenced strains. These data are suitable as reference data because they are accurate and have well-defined taxonomic information. Genomic sequences of Bacteria and Archaea were obtained from the NCBI Genome Database  in September 2009. The 16S rRNA genes of each strain were detected using RNAmmer . One 16S rRNA gene was randomly sampled per species because there are only small sequence differences among 16S rRNA genes within the same genome and the same species [22, 23]. A total of 601 16S rRNA gene sequences from 601 species of Bacteria and Archaea were obtained. To calculate phylogenetic distances among them, all sequences were aligned using MAFFT 6.713 with default parameters . After constructing multiple alignments, genetic distances between sequences with Kimura's two-parameter model of base substitution  were calculated using the dnadist program in PHYLIP 3.69 . The phylogenetic tree was constructed using the neighbor-joining method in the neighbor program in PHYLIP 3.69. The phylum-level taxonomy of the species was obtained from the NCBI Taxonomy Database .
Sample data for testing VITCOMIC
We used human gut microbiome data from Turnbaugh et al.  to test VITCOMIC. In their study, each individual was categorized as obese, lean, or overweight using body mass index. DNA was extracted from the feces of each individual, and the V2 variable regions of 16S rRNA genes were PCR amplified prior to pyrosequencing using a 454 GS FLX system . We used the sequences from obese and lean individuals. The obese sample consisted of 704,369 sequences from 196 individuals; the lean sample consisted of 291,993 sequences from 61 individuals.
Inference of a nearest relative for each sequence
Alignments less than 50 bp were excluded to avoid inaccurate alignments. Because variable regions are nearly neutral, false alignments between a variable region and a conserved region or other variable regions are sometimes constructed and included in calculations of total BLAST scores (Figure 1B). To calculate total BLAST scores, it is necessary to develop the function "alignments consistency check". The alignments consistency check detects false alignment using information on positions of aligned regions of the sample sequence and matched database sequence. Normally, the order of aligned regions of the sample sequence is consistent with that of the matched database sequence (Figure 1A). On the other hand, most pairs of sequences that contain false alignments are not consistent with respect to the order of aligned regions (Figure 1B). The alignments consistency check detects collapses of these consistencies and excludes these pairs of sample and database sequences in the target calculation of total BLAST scores.
Graphical representation of the taxonomic composition of the sample
Comparison of taxonomic compositions between samples
Using VITCOMIC, the overall taxonomic compositions of both the obese and lean samples could be clearly visualized (Figure 2 = obese; Figure 3 = lean). Large colored dots indicate relatively abundant taxa in each sample (relative abundance > 1%). These large colored dots are distributed almost identically between obese and lean samples and are located at related species of Clostridium, Eubacterim, and Bacteroides. These taxa are the abundant in the normal human gut microbiome . Small dots that are located at the most lateral circle indicate closely related strains of the genome-sequenced strains. These strains are Escherichia coli and Proteus mirabilis in Proteobacteria, Enterococcus faecalis and the group of Lactobacillus in Firmicutes, groups of Bifidobacterium and Propionibacterium in Actinobacteria, and Akkermansia muciniphila in Verrucomicrobia. It is well established that some of these strains inhabit the human gut, whereas others do not [33–39]. In Figures 2 and 3, several dots are distributed on the 80-90% lines, indicating that several taxa distantly related to genome-sequenced strains inhabit the human gut. These results were consistent with the study of Turnbaugh et al. .
VITCOMIC can easily visualize overall taxonomic compositions of large amounts of 16S rRNA gene-based community analysis data. Traditional visualization methods by constructing phylogenetic trees require a lot of computation time when analyzing large amounts of data . Even if researchers are able to construct a phylogenetic tree, the tree itself can be difficult to analyze because it may contain too many branches . By contrast, taxonomic assignments based on BLAST are fast and can be highly parallelized . Although several highly accurate taxonomic assignment tools have been developed [41, 42], the accuracy of BLAST-based taxonomic assignments is also well validated [29, 43]. In addition, calculations of total BLAST scores and applications of the alignments consistency check improve the accuracy of the assignment, especially when long sequences are examined. Longer sequences containing more variable regions will generate a greater number of alignment divisions. The alignments consistency check may be necessary for the study using the pyrosequencer because recently developed pyrosequencer has improved the read length by over 400 bp . Although the taxonomic assignment using only genome-sequenced species for the reference would not yield the best assignment compared with the assignment using larger database that contains uncultured bacteria [12, 45], this provisional taxonomy provided by VITCOMIC is accurate enough for the visual comparisons of taxonomic composition between samples.
Compared with other tools, the most unique function of VITCOMIC is a simultaneous visualization and comparison of taxonomic compositions between samples (Additional file 1). Comparison of taxonomic compositions between samples from different microbial communities is an effective means to better understand similarities and differences between microbial communities . However, the comparison of several microbial communities can be difficult given a large number of sequences . VITCOMIC can simultaneously visualize large amounts of data by merging sequence data from several community analysis projects (Additional files 2, 3, and 4). Additional file 2 visualizes 139,356 16S rRNA gene sequences obtained from various soils . Additional file 3 presents seawater microbial communities data derived from 452 different 16S rRNA gene surveys containing 11,144,358 sequences, which were obtained from the NCBI Sequence Read Archive . Additional file 4 presents data for the human microbial communities derived from 60 different 16S rRNA gene surveys containing 4,363,040 sequences, which were obtained from NCBI Sequence Read Archive. Although detailed comparisons among samples from different microbial communities are difficult due to the large number of sequences and differing primers, VITCOMIC showed that overall taxonomic compositions and abundant taxa are distinctly different between environments.
VITCOMIC only uses the 16S rRNA gene sequences from 601 genome-sequenced bacteria as references. The reason why we selected the reference database from 601 species is the quality and quantity of the biological information. These sequences are derived from genome-sequenced species, from which we can generally obtain much information about ecophysiology (i.e., metabolic potentials, habitats, gene repertoires). Therefore, by adopting genome-sequenced species as the reference database, we can retrieve several biological information for each taxon inductively by analyzing the genomic information of the nearest genome-sequenced species from the 16S rRNA gene-targeted analysis. These features provide valuable initiative knowledge for a following metagenomic analysis. To address the increasing number of genome-sequenced species, the reference database of VITCOMIC will be updated periodically.
Using a phylogenetic relationship with genome-sequenced strains, VITCOMIC clearly presents the overall taxonomic composition of 16S rRNA gene-based microbial community analysis data. VITCOMIC facilitates an intuitive understanding of differences in community structure between samples.
Availability and requirements
Project name: VITCOMIC
Project home page: http://mg.bio.titech.ac.jp/vitcomic/
Operating system(s): Platform independent
Programming language: Perl
Other requirements: None
License: GNU GPL
Any restrictions to use by non-academics: None
We thank Hiroyuki Toh, Tetsuya Hayashi and Takehiko Itoh for helpful discussions. This work was supported by a Grant-in-Aid from the Institute for Bioinformatics Research and Development, the Japan Science and Technology Agency (BIRD-JST) and a Grant-in-Aid for Scientific Research (C: 22592032).
- Liolios K, Chen IM, Mavromatis K, Tavernarakis N, Hugenholtz P, Markowitz VM, Kyrpides NC: The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 2010, 38: D346-D354. 10.1093/nar/gkp848View ArticlePubMedPubMed CentralGoogle Scholar
- Rappé MS, Giovannoni SJ: The uncultured microbial majority. Annu Rev Microbiol 2003, 57: 369–394. 10.1146/annurev.micro.57.030502.090759View ArticlePubMedGoogle Scholar
- Pace NR: A molecular view of microbial diversity and the biosphere. Science 1997, 276: 734–740. 10.1126/science.276.5313.734View ArticlePubMedGoogle Scholar
- Eckburg PB, Bik EM, Bernstein CN, Purdom E, Dethlefsen L, Sargent M, Gill SR, Nelson KE, Relman DA: Diversity of the human intestinal microbial flora. Science 2005, 308: 1635–1638. 10.1126/science.1110591View ArticlePubMedPubMed CentralGoogle Scholar
- Sogin ML, Morrison HG, Huber JA, Welch DM, Huse SM, Neal PR, Arrieta JM, Herndl GJ: Microbial diversity in the deep sea and the underexplored "rare biosphere". Proc Natl Acad Sci USA 2006, 103: 12115–12120. 10.1073/pnas.0605127103View ArticlePubMedPubMed CentralGoogle Scholar
- Van de Peer Y, Chapelle S, De Wachter R: A quantitative map of nucleotide substitution rates in bacterial rRNA. Nucleic Acids Res 1996, 24: 3381–3391. 10.1093/nar/24.17.3381View ArticlePubMedPubMed CentralGoogle Scholar
- Mears JA, Cannone JJ, Stagg SM, Gutell RR, Agrawal RK, Harvey SC: Modeling a minimal ribosome based on comparative sequence analysis. J Mol Biol 2002, 321: 215–234. 10.1016/S0022-2836(02)00568-5View ArticlePubMedGoogle Scholar
- Jain R, Rivera MC, Lake JA: Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci USA 1999, 96: 3801–3806. 10.1073/pnas.96.7.3801View ArticlePubMedPubMed CentralGoogle Scholar
- Ludwig W, Strunk O, Westram R, Richter L, Meier H, Buchner A, Lai T, Steppi S, Jobb G, Förster W, Brettske I, Gerber S, Ginhart AW, Gross O, Grumann S, Hermann S, Jost R, König A, Liss T, Lüssmann R, May M, Nonhoff B, Reichel B, Strehlow R, Stamatakis A, Stuckmann N, Vilbig A, Lenke M, Ludwig T, Bode A, Schleifer KH: ARB: a software environment for sequence data. Nucleic Acids Res 2004, 32: 1363–1371. 10.1093/nar/gkh293View ArticlePubMedPubMed CentralGoogle Scholar
- Lozupone C, Knight R: UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol 2005, 71: 8228–8235. 10.1128/AEM.71.12.8228-8235.2005View ArticlePubMedPubMed CentralGoogle Scholar
- Schloss PD, Handelsman J: Introducing DOTUR, a computer program for defining operational taxonomic units and estimating species richness. Appl Environ Microbiol 2005, 71: 1501–1506. 10.1128/AEM.71.3.1501-1506.2005View ArticlePubMedPubMed CentralGoogle Scholar
- Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, Kulam-Syed-Mohideen AS, McGarrell DM, Marsh T, Garrity GM, Tiedje JM: The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res 2009, 37: D141-D145. 10.1093/nar/gkn879View ArticlePubMedPubMed CentralGoogle Scholar
- Roesch LF, Fulthorpe RR, Riva A, Casella G, Hadwin AK, Kent AD, Daroub SH, Camargo FA, Farmerie WG, Triplett EW: Pyrosequencing enumerates and contrasts soil microbial diversity. ISME J 2007, 1: 283–290.PubMedPubMed CentralGoogle Scholar
- Armougom F, Raoult D: Exploring microbial diversity using 16S rRNA high-throughput methods. J Comput Sci Syst Biol 2009, 2: 69–92. 10.4172/jcsb.1000019View ArticleGoogle Scholar
- Turnbaugh PJ, Hamady M, Yatsunenko T, Cantarel BL, Duncan A, Ley RE, Sogin ML, Jones WJ, Roe BA, Affourtit JP, Egholm M, Henrissat B, Heath AC, Knight R, Gordon JI: A core gut microbiome in obese and lean twins. Nature 2009, 457: 480–484. 10.1038/nature07540View ArticlePubMedPubMed CentralGoogle Scholar
- Sun Y, Cai Y, Liu L, Yu F, Farrell ML, McKendree W, Farmerie W: ESPRIT: estimating species richness using large collections of 16S rRNA pyrosequences. Nucleic Acids Res 2009, 37: e76. 10.1093/nar/gkp285View ArticlePubMedPubMed CentralGoogle Scholar
- Letunic I, Bork P: Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 2007, 23: 127–128. 10.1093/bioinformatics/btl529View ArticlePubMedGoogle Scholar
- Kemena C, Notredame C: Upcoming challenges for multiple sequence alignment methods in the high-throughput era. Bioinformatics 2009, 25: 2455–2465. 10.1093/bioinformatics/btp452View ArticlePubMedPubMed CentralGoogle Scholar
- Bent SJ, Forney LJ: The tragedy of the uncommon: understanding limitations in the analysis of microbial diversity. ISME J 2008, 2: 689–695. 10.1038/ismej.2008.44View ArticlePubMedGoogle Scholar
- NCBI Genome Database[ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/]
- Lagesen K, Hallin P, Rødland EA, Staerfeldt HH, Rognes T, Ussery DW: RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 2007, 35: 3100–3108. 10.1093/nar/gkm160View ArticlePubMedPubMed CentralGoogle Scholar
- Acinas SG, Marcelino LA, Klepac-Ceraj V, Polz MF: Divergence and redundancy of 16S rRNA sequences in genomes with multiple rrn operons. J Bacteriol 2004, 186: 2629–2635. 10.1128/JB.186.9.2629-2635.2004View ArticlePubMedPubMed CentralGoogle Scholar
- Yarza P, Richter M, Peplies J, Euzeby J, Amann R, Schleifer KH, Ludwig W, Glöckner FO, Rosselló-Móra R: The All-Species Living Tree project: a 16S rRNA-based phylogenetic tree of all sequenced type strains. Syst Appl Microbiol 2008, 31: 241–250. 10.1016/j.syapm.2008.07.001View ArticlePubMedGoogle Scholar
- Katoh K, Toh H: Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework. BMC Bioinformatics 2008, 9: 212. 10.1186/1471-2105-9-212View ArticlePubMedPubMed CentralGoogle Scholar
- Kimura M: A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 1980, 16: 111–120. 10.1007/BF01731581View ArticlePubMedGoogle Scholar
- Felsenstein J: PHYLIP-Phylogeny inference package (Version 3.2). Cladistics 1989, 5: 164–166.Google Scholar
- NCBI Taxonomy Database[http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi]
- Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Irzyk GP, Jando SC, Alenquer ML, Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP, Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT, Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt KA, Volkmer GA, Wang SH, Wang Y, Weiner MP, Yu P, Begley RF, Rothberg JM: Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005, 437: 376–380.PubMedPubMed CentralGoogle Scholar
- Hamady M, Lozupone C, Knight R: Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data. ISME J 2010, 4: 17–27. 10.1038/ismej.2009.97View ArticlePubMedPubMed CentralGoogle Scholar
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215: 403–410.View ArticlePubMedGoogle Scholar
- Chao A, Chazdon RL, Colwell RK, Shen TJ: Abundance-based similarity indices and their estimation when there are unseen species in samples. Biometrics 2006, 62: 361–371. 10.1111/j.1541-0420.2005.00489.xView ArticlePubMedGoogle Scholar
- Kurokawa K, Itoh T, Kuwahara T, Oshima K, Toh H, Toyoda A, Takami H, Morita H, Sharma VK, Srivastava TP, Taylor TD, Noguchi H, Mori H, Ogura Y, Ehrlich DS, Itoh K, Takagi T, Sakaki Y, Hayashi T, Hattori M: Comparative metagenomics revealed commonly enriched gene sets in human gut microbiomes. DNA Res 2007, 14: 169–181. 10.1093/dnares/dsm018View ArticlePubMedPubMed CentralGoogle Scholar
- Paulsen IT, Banerjei L, Myers GS, Nelson KE, Seshadri R, Read TD, Fouts DE, Eisen JA, Gill SR, Heidelberg JF, Tettelin H, Dodson RJ, Umayam L, Brinkac L, Beanan M, Daugherty S, DeBoy RT, Durkin S, Kolonay J, Madupu R, Nelson W, Vamathevan J, Tran B, Upton J, Hansen T, Shetty J, Khouri H, Utterback T, Radune D, Ketchum KA, Dougherty BA, Fraser CM: Role of mobile DNA in the evolution of vancomycin-resistant Enterococcus faecalis . Science 2003, 299: 2071–2074. 10.1126/science.1080613View ArticlePubMedGoogle Scholar
- Brüggemann H, Henne A, Hoster F, Liesegang H, Wiezer A, Strittmatter A, Hujer S, Dürre P, Gottschalk G: The complete genome sequence of Propionibacterium acnes , a commensal of human skin. Science 2004, 305: 671–673. 10.1126/science.1100330View ArticlePubMedGoogle Scholar
- Derrien M, Collado MC, Ben-Amor K, Salminen S, de Vos WM: The Mucin degrader Akkermansia muciniphila is an abundant resident of the human intestinal tract. Appl Environ Microbiol 2008, 74: 1646–1648. 10.1128/AEM.01226-07View ArticlePubMedPubMed CentralGoogle Scholar
- Morita H, Toh H, Fukuda S, Horikawa H, Oshima K, Suzuki T, Murakami M, Hisamatsu S, Kato Y, Takizawa T, Fukuoka H, Yoshimura T, Itoh K, O'Sullivan DJ, McKay LL, Ohno H, Kikuchi J, Masaoka T, Hattori M: Comparative genome analysis of Lactobacillus reuteri and Lactobacillus fermentum reveal a genomic island for reuterin and cobalamin production. DNA Res 2008, 15: 151–161. 10.1093/dnares/dsn009View ArticlePubMedPubMed CentralGoogle Scholar
- Oshima K, Toh H, Ogura Y, Sasamoto H, Morita H, Park SH, Ooka T, Iyoda S, Taylor TD, Hayashi T, Itoh K, Hattori M: Complete genome sequence and comparative analysis of the wild-type commensal Escherichia coli strain SE11 isolated from a healthy adult. DNA Res 2008, 15: 375–386. 10.1093/dnares/dsn026View ArticlePubMedPubMed CentralGoogle Scholar
- Pearson MM, Sebaihia M, Churcher C, Quail MA, Seshasayee AS, Luscombe NM, Abdellah Z, Arrosmith C, Atkin B, Chillingworth T, Hauser H, Jagels K, Moule S, Mungall K, Norbertczak H, Rabbinowitsch E, Walker D, Whithead S, Thomson NR, Rather PN, Parkhill J, Mobley HL: Complete genome sequence of uropathogenic Proteus mirabilis , a master of both adherence and motility. J Bacteriol 2008, 190: 4027–4037. 10.1128/JB.01981-07View ArticlePubMedPubMed CentralGoogle Scholar
- Sela DA, Chapman J, Adeuya A, Kim JH, Chen F, Whitehead TR, Lapidus A, Rokhsar DS, Lebrilla CB, German JB, Price NP, Richardson PM, Mills DA: The genome sequence of Bifidobacterium longum subsp. infantis reveals adaptations for milk utilization within the infant microbiome. Proc Natl Acad Sci USA 2008, 105: 18964–18969. 10.1073/pnas.0809584105View ArticlePubMedPubMed CentralGoogle Scholar
- Mathog DR: Parallel BLAST on split databases. Bioinformatics 2003, 19: 1865–1866. 10.1093/bioinformatics/btg250View ArticlePubMedGoogle Scholar
- Wang Q, Garrity GM, Tiedje JM, Cole JR: Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol 2007, 73: 5261–5267. 10.1128/AEM.00062-07View ArticlePubMedPubMed CentralGoogle Scholar
- Clemente JC, Jansson J, Valiente G: Accurate taxonomic assignment of short pyrosequencing reads. Pac Symp Biocomput 2010, 3–9.Google Scholar
- Wu D, Hartman A, Ward N, Eisen JA: An automated phylogenetic tree-based small subunit rRNA taxonomy and alignment pipeline (STAP). PLoS One 2008, 3: e2566. 10.1371/journal.pone.0002566View ArticlePubMedPubMed CentralGoogle Scholar
- Roche 454 sequencer web page[http://454.com/]
- Desantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, Huber T, Dalevi D, Hu P, Andersen GL: Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol 2006, 72: 5069–5072. 10.1128/AEM.03006-05View ArticlePubMedPubMed CentralGoogle Scholar
- NCBI Sequence Read Archive[http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.