Comparative analysis of microbiome measurement platforms using latent variable structural equation modeling
- Xiao Wu^{1} (corresponding author),
- Kathryn Berkow^{1},
- Daniel N Frank^{2},
- Ellen Li^{3, 4},
- Ajay S Gulati^{5} and
- Wei Zhu^{1}
DOI: 10.1186/1471-2105-14-79
© Wu et al.; licensee BioMed Central Ltd. 2013
Received: 10 July 2012
Accepted: 3 February 2013
Published: 5 March 2013
Abstract
Background
Culture-independent phylogenetic analysis of 16S ribosomal RNA (rRNA) gene sequences has emerged as an incisive method of profiling bacteria present in a specimen. Currently, multiple techniques are available to enumerate the abundance of bacterial taxa in specimens, including Sanger sequencing, ‘next generation’ pyrosequencing, microarrays, quantitative PCR, and the rapidly emerging third- and fourth-generation sequencing methods. An efficient statistical tool is urgently needed for the following tasks: (1) to compare the agreement between these measurement platforms, (2) to select the most reliable platform(s), and (3) to combine different platforms of complementary strengths for a unified analysis.
Results
We present latent variable structural equation modeling (SEM) as a novel statistical application for the comparative analysis of measurement platforms. The latent variable SEM model treats the true (unknown) relative frequency of a given bacterial taxon in a specimen as the latent (unobserved) variable, estimates the reliabilities of, and similarities between, different measurement platforms, and subsequently weights those measurements optimally for a unified analysis of the microbiome composition. The latent variable SEM contains the repeated measures ANOVA (both the univariate and the multivariate models) as special cases and, as a more general and realistic modeling approach, yields superior goodness-of-fit and more reliable analysis results, as demonstrated by a microbiome study of human inflammatory bowel diseases.
Conclusions
Given the rapid evolution of modern biotechnologies, the tasks of measurement platform comparison, selection and combination are here to stay and to grow, and the latent variable SEM method is readily applicable to other biological settings beyond the microbiome study presented here.
Keywords
Bioinformatics; Latent variable structural equation modeling; Measurement model; Reliability; Repeated measures ANOVA
Background
Complex microbial communities, like those of the human gastrointestinal (GI) tract and other environmental specimens, have gained increased attention in recent years, thanks to technological advances in culture-independent methods based on the amplification of 16S rRNA genes [1, 2]. The NIH Roadmap Human Microbiome Project (HMP) has undertaken a large scale effort to characterize 16S rRNA sequences from healthy human subjects and from human subjects with various diseases. In the course of conducting the project, the various sequencing centers used both ABI 3730 Sanger sequencing and 454 FLX Titanium pyrosequencing platforms to generate and release reference data from multiple body sites sampled in 300 healthy human subjects [3, 4]. Traditional phylogenetic analysis of a sample is performed by amplifying 16S rRNA genes, cloning, and sequencing by the Sanger method [5]. An advantage of this method is the sufficiency of single pass Sanger sequencing of 900-1000 bases for classifying bacteria. Disadvantages include potential cloning bias [6], as well as time and expense, which can be prohibitive for in-depth sampling of complex microbial communities.
Next-generation sequencing (NGS) technology provides a promising alternative for quantifying the microbiome without the limitations of cloning/Sanger sequencing. For instance, a single run of the 454 Life Sciences pyrosequencing platform can produce 1.2 million sequences in 8 hours [7], which would require months or years of work with the older methods. The high throughput per run means the unit cost of NGS is only a fraction of that for Sanger sequencing. The new technology also eliminates the cloning bias by directly sequencing the 16S rRNA genes generated by polymerase chain reaction (PCR). High throughput sequencing is therefore attractive, provided it can be adapted to the requirements of microbiome work. Its main limitation is read length: reads from NGS technologies are considerably shorter than those from Sanger sequencing. Illumina’s Solexa and Applied Biosystems’ SOLiD platforms generate reads of about 25-100 bases, while 454 sequencing technology reads up to 400-500 bases per sequence. The concern is loss of classification accuracy with shorter sequence reads [8, 9]. In addition, the bias associated with PCR amplification is a concern for PCR-based next generation sequencing [10]. Several strategies have been tried to maximize the information obtained from short sequences. One is to target hypervariable regions (HVR) that are most informative for a specific microbiome of interest [11, 12]. In comparison to the Sanger and NGS methods, quantitative PCR (qPCR) employs primers specific for a particular bacterium to detect and quantify bacteria. Although qPCR is a reliable and accurate measure of the absolute amount of 16S rRNA genes from one specific organism [13], its accuracy relies on proper design of the primers [14].
To date, few attempts have been made to systematically compare and combine different measurement modalities for microbiome analysis. Nossa et al. [15] surveyed broad-range 16S rRNA primers for use in 454 pyrosequencing to classify bacteria from the human foregut microbiome. Reads 900 bases long were simulated as Sanger sequences and their taxonomic assignments were treated as accurate. The group concluded that the 347F/803R primer pair (covering the 16S rRNA V3V4 region) is the most suitable for pyrosequencing-based classification of foregut 16S rRNA genes. Frank et al. [16] observed similar results from Sanger sequencing and pyrosequencing in the human nasal microbiota. One recent work has demonstrated that the measured profile (identification and abundance) of microbial communities depends highly on the selection of sequencing platforms: Sanger sequencing and pyrosequencing with different target regions (V1V3, V4V6, V7V9) yielded varying patterns for different genera [17]. It is thus arduous to compare the accuracies of different sequencing platforms for measuring microbiome compositions in an experimental approach.
Here we propose an alternative analytical approach using latent variable structural equation modeling (SEM) to compare and integrate microbiome measurements from different measurement platforms. The latent variable SEM treats the true bacterial composition of a sample as the latent (unobserved) variable, estimates the relations between, and the reliabilities of, different measurement platforms, and, if necessary, subsequently combines them for a joint analysis with each platform weighted by its reliability [18]. The latent variable SEM includes the repeated measures ANOVA, both the univariate and the multivariate versions, as special cases, and is free from the rigid assumptions of the latter approaches, such as weighting each platform equally in the analysis regardless of its reliability and assuming equal measurement error variances [19]. Furthermore, as with the repeated measures ANOVA, the latent variable SEM can easily incorporate covariates such as disease phenotypes and genotypes [20, 21] to examine their influences on the underlying microbiome composition/bacteria expression.
In this paper, we demonstrate the latent variable SEM approach through a study of the microbiome in inflammatory bowel diseases (IBD). Our primary goal is to identify the most reliable microbiome measurement platform. A secondary goal is to examine the impact of IBD disease phenotypes (Crohn’s Disease [CD] and ulcerative colitis [UC]) on the enteric microbiota. The measurement platforms compared in this study are: 1) ABI 3730 (Sanger) sequencing of the entire 16S rRNA gene; 2) 454 sequencing of the V1-V3 hypervariable regions; 3) 454 sequencing of the V3-V5 hypervariable region. In the case of a single bacterial taxon, Faecalibacterium spp., we compared the three sequencing platforms with an established qPCR assay.
Methods
In this section, we illustrate the general methodology for platform comparison and combination using latent variable SEM. We start with the simpler latent variable SEM measurement model in which covariates are not involved to better elucidate how latent variable SEM gauges platform reliability and consistency. Subsequently, we introduce latent variable SEM with covariates and describe its two special cases -- repeated measures ANOVA in the univariate and multivariate approaches. To better assist readers with a less mathematical background in this section, each general model is accompanied by the corresponding example from the microbiome study on IBD.
Measurement model of latent variable SEM
where S is the sample covariance matrix. This in turn reduces to minimizing the difference between S and Σ(θ).
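For orientation, the maximum likelihood fitting function can be sketched in its standard textbook form (see, e.g., Bollen [22]). This is the usual expression from the SEM literature, given here as an assumption rather than as a transcription of the authors' display equation; m denotes the number of observed measurements and θ the vector of free model parameters.

```latex
F_{ML}(\theta) = \log\left|\Sigma(\theta)\right|
  + \mathrm{tr}\left(S\,\Sigma(\theta)^{-1}\right)
  - \log\left|S\right| - m
```

In this form F_{ML} is non-negative and equals zero exactly when Σ(θ) = S, which is why minimizing it amounts to making the model-implied covariance matrix as close as possible to the sample covariance matrix.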
To fix ideas, we now illustrate the modeling and estimation of the latent variable SEM in detail by setting m = 3 in Figure 1(A). The SEM equations are $Y_1 = \lambda_1\zeta + \epsilon_1$, $Y_2 = \lambda_2\zeta + \epsilon_2$ and $Y_3 = \lambda_3\zeta + \epsilon_3$, where $E(Y_i) = 0$, $E(\epsilon_i) = 0$, $\mathrm{Var}(Y_i) = \sigma_{y_i}^{2}$, $\mathrm{Var}(\zeta) = \sigma_{\zeta}^{2}$, $\mathrm{Var}(\epsilon_i) = \sigma_{\epsilon_i}^{2}$, $\mathrm{Cov}(\zeta, \epsilon_i) = 0$ and $\mathrm{Cov}(\epsilon_i, \epsilon_j) = 0$ for $i \neq j$.
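As a worked sketch of what these assumptions imply, the model-implied covariance matrix for m = 3 and the resulting correlations can be written out as follows. Only the symbols defined above are used, plus the standardized loading $\lambda_i^{*} = \lambda_i\sigma_{\zeta}/\sigma_{y_i}$, which is spelled out here for convenience.

```latex
\Sigma(\theta) =
\begin{pmatrix}
\lambda_1^{2}\sigma_{\zeta}^{2} + \sigma_{\epsilon_1}^{2} & \lambda_1\lambda_2\sigma_{\zeta}^{2} & \lambda_1\lambda_3\sigma_{\zeta}^{2} \\
\lambda_1\lambda_2\sigma_{\zeta}^{2} & \lambda_2^{2}\sigma_{\zeta}^{2} + \sigma_{\epsilon_2}^{2} & \lambda_2\lambda_3\sigma_{\zeta}^{2} \\
\lambda_1\lambda_3\sigma_{\zeta}^{2} & \lambda_2\lambda_3\sigma_{\zeta}^{2} & \lambda_3^{2}\sigma_{\zeta}^{2} + \sigma_{\epsilon_3}^{2}
\end{pmatrix},
\qquad
\mathrm{Corr}(Y_i, Y_j) = \frac{\lambda_i\lambda_j\sigma_{\zeta}^{2}}{\sigma_{y_i}\sigma_{y_j}} = \lambda_i^{*}\lambda_j^{*} \quad (i \neq j)

\frac{\mathrm{Corr}(Y_1,Y_2)\,\mathrm{Corr}(Y_1,Y_3)}{\mathrm{Corr}(Y_2,Y_3)}
= \frac{\lambda_1^{*}\lambda_2^{*}\cdot\lambda_1^{*}\lambda_3^{*}}{\lambda_2^{*}\lambda_3^{*}}
= \left(\lambda_1^{*}\right)^{2}
```

With three measurements there are six distinct variances and covariances. If the scale of the latent variable is set by fixing λ_{1} = 1 (a common convention, assumed here), the model has six free parameters and is just-identified, so the parameters can be solved for directly from the sample moments.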
Platform reliability measure
For three measurement platforms, the reliability of the second platform can be estimated as $\widehat{R}_{y_2}^{2} = \frac{r_{12}r_{23}}{r_{13}}$, where r_{ ij } is the sample Pearson product moment correlation coefficient between the observed variables Y_{ i } and Y_{ j }. Similarly, we have $\widehat{R}_{y_1}^{2} = \frac{r_{12}r_{13}}{r_{23}}$ and $\widehat{R}_{y_3}^{2} = \frac{r_{13}r_{23}}{r_{12}}$. This shows how to compute the R-square from the data and how the R-square is related to the correlations between the observed variables. Suppose the first two of the three measurement platforms are perfectly correlated (r_{12} = 1) while the third measure is poorly correlated with the first two, with r_{13} = r_{23} = 0.5. Then we have $R_{y_1}^{2} = R_{y_2}^{2} = 1$ and $R_{y_3}^{2} = 0.25$. That is, the first two measurements are deemed perfectly reliable on the strength of their perfect consistency, while the third one is considered relatively unreliable due to its poor correlation with the other measures.
The standardized path coefficients are defined as $\widehat{\lambda}_i^{*} = \widehat{\lambda}_i\,\widehat{\sigma}_{\zeta}/\widehat{\sigma}_{y_i}$. Together with the definition of reliability, $\widehat{R}_{y_i}^{2} = \widehat{\lambda}_i^{2}\widehat{\sigma}_{\zeta}^{2}/\widehat{\sigma}_{y_i}^{2}$, it follows that $\widehat{R}_{y_i}^{2} = (\widehat{\lambda}_i^{*})^{2}$. Therefore, the standardized path coefficient $\widehat{\lambda}_i^{*}$ is the estimated correlation between the observed measurement Y_{ i } and the latent variable ζ, and the estimated reliability of the i^{ th } platform equals the squared standardized path coefficient in the latent variable SEM measurement model.
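The closed-form expressions above are easy to check numerically. The following minimal Python sketch applies them to the rounded Pearson correlations reported for Faecalibacterium (Sanger, 454_V3V5 and qPCR); the function name is hypothetical, and the sketch is only a hand calculation for the three-indicator case, not the authors' estimation code.

```python
# Closed-form reliability estimates for a one-factor model with three indicators.
# Inputs are the three pairwise Pearson correlations between the indicators.

def three_indicator_reliabilities(r12, r13, r23):
    """Return (R2_y1, R2_y2, R2_y3) computed from the pairwise correlations."""
    return (r12 * r13 / r23, r12 * r23 / r13, r13 * r23 / r12)

# Rounded correlations from the Faecalibacterium correlation table (N = 142):
# indicator 1 = Sanger, indicator 2 = 454_V3V5, indicator 3 = qPCR.
r_sanger_v3v5 = 0.866
r_sanger_qpcr = 0.642
r_v3v5_qpcr = 0.610

reliabilities = three_indicator_reliabilities(r_sanger_v3v5, r_sanger_qpcr, r_v3v5_qpcr)
for name, r2 in zip(("Sanger", "454_V3V5", "qPCR"), reliabilities):
    # The correlation with the latent variable is the square root of the reliability.
    print(f"{name}: reliability = {r2:.3f}, correlation to latent = {r2 ** 0.5:.3f}")
```

Up to rounding of the input correlations, the three values agree with the reliabilities reported for the Sanger, 454_V3V5 and qPCR three-modality model, which is expected because a three-indicator model is just-identified.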
Comparison to repeated measures ANOVA
This particular structure of the variance covariance matrix is called “compound symmetry”. The univariate repeated measures ANOVA can be obtained from the more general latent variable SEM shown in Figure 2(A) by imposing equal measurement error variances and equal path coefficients from the measurements to the latent variable. That is, $\lambda_i \equiv 1$ and $\sigma_{\epsilon_i}^{2} \equiv \sigma_{\epsilon}^{2}$ (i = 1, 2, …, m).
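To make the compound symmetry structure concrete, substituting the two constraints into the model-implied covariance matrix gives the following form. This is a sketch derived from the constraints stated above; $\mathbf{1}_m$ (a column vector of ones) and $I_m$ (the identity matrix) are notation introduced here.

```latex
\Sigma(\theta) = \sigma_{\zeta}^{2}\,\mathbf{1}_m\mathbf{1}_m^{\top} + \sigma_{\epsilon}^{2} I_m
= \begin{pmatrix}
\sigma_{\zeta}^{2}+\sigma_{\epsilon}^{2} & \sigma_{\zeta}^{2} & \cdots & \sigma_{\zeta}^{2} \\
\sigma_{\zeta}^{2} & \sigma_{\zeta}^{2}+\sigma_{\epsilon}^{2} & \cdots & \sigma_{\zeta}^{2} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{\zeta}^{2} & \sigma_{\zeta}^{2} & \cdots & \sigma_{\zeta}^{2}+\sigma_{\epsilon}^{2}
\end{pmatrix}
```

Only two parameters remain free, so with m = 4 measurements and 10 distinct variances and covariances the constrained model has 8 degrees of freedom. Keeping λ_{i} ≡ 1 but freeing the m error variances leaves m + 1 = 5 parameters and 5 degrees of freedom. Both counts match the degrees of freedom reported in the goodness-of-fit comparison for the two ANOVA-equivalent models.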
In summary, the repeated measures ANOVA models, both the univariate and the multivariate approaches, are special cases of latent variable SEM with constraints on the error variances and path coefficients. The general latent variable SEM is a more realistic, flexible and better-fitting model to evaluate the latent variable with several measurements, especially when the reliability of each measurement is unclear and the assumption of equal error variances is questionable. This general principle is fully illustrated in the ensuing example of a microbiome study where we compared the latent SEM measurement model with both repeated measures ANOVA models.
Latent variable SEM with covariates
Thus the parameters can be estimated through minimizing the ML fitting function, or equivalently, by equating Σ(θ) and S, the sample covariance matrix for both X and Y.
Nonparametric analysis of latent variable SEM
In the above, we presented the analysis of latent variable SEM based on the most widely used maximum likelihood estimation (MLE) framework, which depends on normality assumptions. In practice, SEM with continuous variables, including ordinal variables of five or more categories, will not have severe problems with non-normality. When the normality assumption is not tenable, one cannot directly employ the parametric hypothesis tests or confidence intervals. Instead, one can employ bootstrap resampling procedures to perform nonparametric significance tests and to construct nonparametric confidence intervals [22, 24]. Here we have adopted Efron’s nonparametric bootstrap, resampling from the original data with replacement and subsequently obtaining the nonparametric bootstrap estimates [25].
To fully analyze the following application example on IBD and the microbiome, we developed a modified boot.sem function by adapting the boot.sem function from the R package sem (version 0.9-21) to estimate platform reliability, the standardized latent variable SEM path coefficients and other parameters whenever the normality assumption is not tenable. Our modified boot.sem function is available for free download at http://www.ams.sunysb.edu/~zhu/wei/SEM.html. As an example, the 95% bootstrap confidence intervals of the reliabilities, based on the 2.5^{th} and the 97.5^{th} percentiles of the bootstrap estimates, are shown in the following section.
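The percentile bootstrap described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' modified boot.sem function: the data are simulated (the loadings, sample size and seeds are hypothetical choices), and the reliability of the first platform is computed with the closed-form three-indicator expression instead of a full ML fit.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def reliability_y1(rows):
    """Closed-form reliability of indicator 1 in a three-indicator model."""
    y1, y2, y3 = zip(*rows)
    return pearson(y1, y2) * pearson(y1, y3) / pearson(y2, y3)

def percentile_bootstrap_ci(rows, stat, n_boot=1000, alpha=0.05, seed=0):
    """Resample rows with replacement and return percentile limits of stat."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = sorted(
        stat([rows[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

def simulate(n=142, loadings=(0.9, 0.9, 0.65), seed=1):
    """Simulated standardized measurements sharing one latent variable (hypothetical)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        latent = rng.gauss(0, 1)
        rows.append(tuple(l * latent + math.sqrt(1 - l * l) * rng.gauss(0, 1)
                          for l in loadings))
    return rows

rows = simulate()
point = reliability_y1(rows)
lower, upper = percentile_bootstrap_ci(rows, reliability_y1)
print(f"reliability = {point:.3f}, 95% percentile CI = ({lower:.3f}, {upper:.3f})")
```

Whole subjects (rows) are resampled so that the dependence between platforms measured on the same specimen is preserved in every bootstrap sample.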
Results and discussion
Data and model descriptions
Inflammatory bowel diseases (IBD), including Crohn’s disease (CD) and ulcerative colitis (UC), are chronic inflammatory conditions of the small intestine and/or the colon. The IBD study reported here includes 39 ileal CD patients, 50 UC patients, and 53 non-IBD control subjects, specimens from which were subjected to microbiome analysis. The abundance of the bacterial genus Faecalibacterium (a member of the Clostridium Group IV of the phylum Firmicutes) in disease-unaffected ileal samples collected from the proximal margin of the resected ileum of each subject was determined using four measurement modalities: Sanger sequencing, 454 pyrosequencing of two hypervariable regions of the 16S rRNA gene (V1V3 and V3V5), and quantitative PCR (qPCR) [26]. Assembled Sanger sequences were deposited in GenBank under accessions HQ739096-HQ821395. 454 V1V3 and V3V5 sequences were deposited in the Sequence Read Archive under accessions SRX021348-SRX021368 and SRX037800-SRX037802. The qPCR assay was performed for Faecalibacterium prausnitzii and total bacteria using established primers [27]. F. prausnitzii is a predominant species found in the human gastrointestinal microbiome that has been implicated in CD [28, 29]. For each sequencing platform, the relative frequency of this bacterial taxon was calculated and then subjected to the empirical logit transformation as described in Li et al. [26]. The qPCR data (dCT) were converted as qPCR = logit(2^{dCT}) so that all four measurements were on the same transformed scale. The IBD phenotypes (CD and UC) are also incorporated as two covariates into the SEM model for an association analysis. Path diagrams for the latent variable SEM measurement model and covariate model for Faecalibacterium are shown in Figure 1(B) and Figure 3(B), respectively.
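The two transformations can be illustrated with a short Python sketch. The exact empirical logit variant used in [26] is not given here, so the continuity-corrected form below (adding 0.5 to both counts) is an assumption, as are the function names and the convention that dCT is negative so that 2^{dCT} lies strictly between 0 and 1.

```python
import math

def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def empirical_logit(count, total, eps=0.5):
    """Continuity-corrected logit of count/total (assumed form; stays finite at 0 counts)."""
    return math.log((count + eps) / (total - count + eps))

def qpcr_logit(dct):
    """qPCR = logit(2^dCT), treating 2^dCT as a relative abundance in (0, 1)."""
    return logit(2.0 ** dct)

# Hypothetical examples: 12 Faecalibacterium reads out of 300, and dCT = -3.
print(round(empirical_logit(12, 300), 3))
print(round(qpcr_logit(-3.0), 3))
```

The continuity correction matters mainly for specimens with zero or very few reads of the taxon, where the plain logit would be undefined or unstable.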
Consistency and reliability of different measurement modalities
Table 1. Pearson correlations among four different measurement modalities for the logit transformed relative frequency of Faecalibacterium (N = 142)
| | Sanger | 454_V1V3 (p value) | 454_V3V5 (p value) | qPCR (p value) |
|---|---|---|---|---|
| Sanger | 1 | 0.828 (<.001) | 0.866 (<.001) | 0.642 (<.001) |
| 454_V1V3 | | 1 | 0.887 (<.001) | 0.624 (<.001) |
| 454_V3V5 | | | 1 | 0.610 (<.001) |
| qPCR | | | | 1 |
Table 2. Reliability of each measurement platform in the four-modality latent variable SEM measurement model, and its correlation to the latent variable (true relative frequency of Faecalibacterium)
| Four-modality measurement model | Sanger | 454_V1V3 | 454_V3V5 | qPCR |
|---|---|---|---|---|
| Reliability | 0.819 | 0.857 | 0.912 | 0.441 |
| (95% CI) | (0.689, 0.907) | (0.774, 0.917) | (0.865, 0.963) | (0.303, 0.553) |
| Correlation to the latent variable | 0.905 | 0.926 | 0.955 | 0.664 |
| (95% CI) | (0.830, 0.952) | (0.880, 0.958) | (0.930, 0.981) | (0.550, 0.744) |
Table 3. Reliability of each measurement platform in the three-modality latent variable SEM measurement model, and its correlation to the latent variable (true relative frequency of Faecalibacterium)
| Three-modality measurement model | Sanger | 454_V3V5 | qPCR |
|---|---|---|---|
| Reliability | 0.911 | 0.822 | 0.452 |
| (95% CI) | (0.775, 1.000) | (0.720, 0.912) | (0.323, 0.610) |
| Correlation to the latent variable | 0.955 | 0.907 | 0.672 |
| (95% CI) | (0.880, 1.000) | (0.849, 0.955) | (0.568, 0.781) |

| Three-modality measurement model | Sanger | 454_V1V3 | qPCR |
|---|---|---|---|
| Reliability | 0.851 | 0.806 | 0.483 |
| (95% CI) | (0.671, 1.000) | (0.645, 0.905) | (0.350, 0.648) |
| Correlation to the latent variable | 0.922 | 0.898 | 0.696 |
| (95% CI) | (0.819, 1.000) | (0.803, 0.951) | (0.592, 0.805) |
Table 4. Reliability for more bacterial taxa in the three-modality latent variable SEM measurement model (Sanger, 454_V1V3 and 454_V3V5), and its correlation to the latent variable
| Three-modality measurement model | Sanger | 454_V1V3 | 454_V3V5 |
|---|---|---|---|
| (A) Proteobacteria | | | |
| Reliability | 0.657 | 0.641 | 0.974 |
| (95% CI) | (0.524, 0.793) | (0.529, 0.724) | (0.878, 1.000) |
| Correlation to the latent variable | 0.811 | 0.801 | 0.987 |
| (95% CI) | (0.724, 0.891) | (0.727, 0.851) | (0.937, 1.000) |
| (B) Firmicutes/Clostridia/Clostridiales/LachnoIV | | | |
| Reliability | 0.685 | 0.923 | 0.793 |
| (95% CI) | (0.582, 0.804) | (0.837, 1.000) | (0.688, 0.903) |
| Correlation to the latent variable | 0.827 | 0.961 | 0.890 |
| (95% CI) | (0.763, 0.897) | (0.915, 1.000) | (0.829, 0.950) |
| (C) Actinobacteria | | | |
| Reliability | 0.582 | 0.854 | 0.882 |
| (95% CI) | (0.424, 0.700) | (0.743, 0.942) | (0.765, 0.976) |
| Correlation to the latent variable | 0.763 | 0.924 | 0.939 |
| (95% CI) | (0.652, 0.837) | (0.862, 0.970) | (0.875, 0.988) |
| (D) Bacteroidetes | | | |
| Reliability | 0.684 | 0.828 | 0.980 |
| (95% CI) | (0.323, 0.922) | (0.652, 1.000) | (0.941, 1.000) |
| Correlation to the latent variable | 0.827 | 0.910 | 0.990 |
| (95% CI) | (0.569, 0.960) | (0.808, 1.000) | (0.970, 1.000) |
| (E) Firmicutes/Bacilli | | | |
| Reliability | 0.698 | 0.953 | 0.959 |
| (95% CI) | (0.553, 0.797) | (0.888, 1.000) | (0.913, 0.995) |
| Correlation to the latent variable | 0.835 | 0.976 | 0.979 |
| (95% CI) | (0.744, 0.893) | (0.942, 1.000) | (0.956, 0.998) |
Comparison to repeated measures ANOVA
Table 5. Model goodness-of-fit comparison between the latent variable SEM and the repeated measures ANOVA approaches for Faecalibacterium based on four measurements (Sanger, 454 pyrosequencing V1V3, 454 pyrosequencing V3V5 and qPCR)
| MODEL | MODEL CONSTRAINT | GOODNESS-OF-FIT INDEX | VALUE |
|---|---|---|---|
| A: Latent variable SEM | set only λ_{1} = 1 | Chi-square | 5.089 (df = 2), Pr > χ^{2}: 0.079 |
| | | RMSEA | 0.105 |
| | | CFI | 0.994 |
| B: Equivalent to repeated measures ANOVA (multivariate approach) | set all indicator path coefficients λ_{i} ≡ 1 (i = 1, 2, 3, 4) | Chi-square | 129.955 (df = 5), Pr > χ^{2}: < .001 |
| | | RMSEA | 0.421 |
| | | CFI | 0.750 |
| C: Equivalent to repeated measures ANOVA (univariate approach) | set all indicator path coefficients λ_{i} ≡ 1; set all indicator error variances to be equal, var(ε_{i}) ≡ σ^{2} (i = 1, 2, 3, 4) | Chi-square | 172.068 (df = 8), Pr > χ^{2}: < .001 |
| | | RMSEA | 0.381 |
| | | CFI | 0.671 |
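The RMSEA values in the goodness-of-fit comparison can be reproduced from the reported chi-square statistics. The sketch below assumes the common definition RMSEA = sqrt(max(χ² − df, 0) / (df (N − 1))) with N = 142; some software divides by N instead of N − 1, so this is one convention, not necessarily the one used by the authors' software. For df = 2 the chi-square tail probability has the closed form exp(−χ²/2), which allows a check of the reported p value for model A without any statistics library.

```python
import math

def rmsea(chi_square, df, n):
    """Root mean square error of approximation (N - 1 convention, assumed here)."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

def chi_square_pvalue_df2(chi_square):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-chi_square / 2.0)

N = 142  # number of subjects in the IBD study
models = {
    "A: latent variable SEM": (5.089, 2),
    "B: multivariate repeated measures ANOVA": (129.955, 5),
    "C: univariate repeated measures ANOVA": (172.068, 8),
}
for label, (chi_square, df) in models.items():
    print(f"{label}: RMSEA = {rmsea(chi_square, df, N):.3f}")

print(f"Model A p value = {chi_square_pvalue_df2(5.089):.3f}")
```

Because models B and C are nested within model A, the differences in chi-square (124.866 on 3 degrees of freedom and 166.979 on 6 degrees of freedom) can also be read as tests of the equality constraints themselves.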
Estimation of the latent variable SEM model with IBD phenotypes
Similarly, UC patients are found to have 4.1% less Faecalibacterium than the control subjects (p = 0.048), since $\widehat{\pi}(\mathit{CD}=0,\mathit{UC}=1)-\widehat{\pi}(\mathit{CD}=0,\mathit{UC}=0)=-0.041$.
Conclusions
In this work, we introduced the latent variable SEM as a versatile and effective analytical tool for measurement platform comparison and combination. Traditional SEM relies on the normality assumption for its parametric inference. With contemporary nonparametric techniques such as bootstrap resampling [22, 24] and the rapid advancement of modern computers, one can readily perform a nonparametric analysis of the latent variable SEM when the data are not normal, as we have shown in the analysis of a microbiome study of human inflammatory bowel diseases.
In the study of the gastrointestinal microbiome, we demonstrated that latent variable SEM can provide a robust means of integrating datasets derived from different experimental platforms. Moreover, it can effectively gauge the relative merits of different measurement platforms, in this example Sanger sequencing, 454 pyrosequencing with two different target regions/windows, and qPCR. Joint panel studies [4] have shown that different 454 pyrosequencing windows may be optimal for different bacterial taxa. Their observations are confirmed by our own analysis of the given IBD study using the latent variable SEM measurement models (Table 4): the 454_V3V5 window is shown to be a better measurement platform for Proteobacteria, Actinobacteria, Bacteroidetes and Firmicutes/Bacilli, in addition to Faecalibacterium, while the 454_V1V3 window is found to be more reliable for Firmicutes/Clostridia/Clostridiales/LachnoIV.
The joint study panel has also recommended sequencing the microbiome with two 454 pyrosequencing windows such as V1V3 and V3V5, which we can readily combine using the latent variable SEM for a unified joint analysis. Nevertheless, more work needs to be done for a thorough treatment of the platform comparison problem. For example, we have yet to examine the rare taxa issue. Given that data from rare taxa will feature near-zero counts and artificially low or suspiciously high variances, a robust version of the current latent variable SEM method needs to be developed for such cases. We expect to address this issue in a follow-up paper.
To our knowledge, this is the first application of latent variable SEM to the study of the human microbiome and to the comparison and combination of modern sequencing platforms. Since human gastrointestinal microbial communities are typically complex and difficult to study in situ, multiple experimental/measurement modalities are required to provide a deep description of the dynamic microbe-microbe and microbe-host interactions in the gut. Given the rapid evolution of modern sequencing technologies, with the debut of Sanger sequencing quickly followed by the higher throughput ‘next generation sequencing’ (a.k.a. pyrosequencing) with shorter sequence reads, and with a variety of third- and fourth-generation sequencing technologies already on the horizon, the platform comparison and combination task is becoming increasingly critical.
Declarations
Acknowledgements
This work was supported by the Crohn's and Colitis Foundation of America (EL, WZ), the Simons Foundation (EL) and National Institutes of Health (HG005964, DNF), UH2 DK083994 (EL), EB007530 (WZ), HL091939 (WZ), and MH090134 (WZ), and the Children's Digestive Health and Nutrition Foundation and the CCFA (ASG). We acknowledge use of the Washington University Digestive Diseases Research Core Center Tissue Procurement Facility (P30 DK52574). We thank Drs. George Weinstock and Erica Sodergren at the Genome Institute of Washington University for generating the sequence data. We also thank Dr. R. Balfour Sartor at the School of Medicine of the University of North Carolina for helpful discussions. Our thanks also go to the BMC Bioinformatics Review Panel for their insightful comments that have improved this work substantially.
Authors’ Affiliations
References
- Weisburg WG, Barns SM, Pelletier DA, Lane DJ: 16S ribosomal DNA amplification for phylogenetic study. J Bacteriol 1991, 173(2):697-703.
- Lane DJ, Pace B, Olsen GJ, Stahl DA, Sogin ML, Pace NR: Rapid determination of 16S ribosomal RNA sequences for phylogenetic analyses. Proc Natl Acad Sci USA 1985, 82(20):6955-6959. doi:10.1073/pnas.82.20.6955
- Peterson J, Garges S, Giovanni M, McInnes P, Wang L, Schloss JA, Bonazzi V, McEwen JE, Wetterstrand KA, Deal C: The NIH Human Microbiome Project. Genome Res 2009, 19(12):2317-2323.
- Jumpstart Consortium Human Microbiome Project Data Generation Working Group: Evaluation of 16S rDNA-Based Community Profiling for Human Microbiome Research. PLoS ONE 2012, 7(6):e39315. doi:10.1371/journal.pone.0039315
- Sanger F, Coulson AR: A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase. J Mol Biol 1975, 94(3):441-448. doi:10.1016/0022-2836(75)90213-2
- Zoetendal EG, Akkermans ADL, De Vos WM: Temperature gradient gel electrophoresis analysis of 16S rRNA from human fecal samples reveals stable and host-specific communities of active bacteria. Appl Environ Microb 1998, 64(10):3854-3859.
- Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z: Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005, 437(7057):376-380.
- Roesch LF, Fulthorpe RR, Riva A, Casella G, Hadwin AK, Kent AD, Daroub SH, Camargo FA, Farmerie WG, Triplett EW: Pyrosequencing enumerates and contrasts soil microbial diversity. ISME J 2007, 1(4):283-290.
- Dowd SE, Sun Y, Secor PR, Rhoads DD, Wolcott BM, James GA, Wolcott RD: Survey of bacterial diversity in chronic wounds using pyrosequencing, DGGE, and full ribosome shotgun sequencing. BMC Microbiol 2008, 8:43. doi:10.1186/1471-2180-8-43
- Inglis GD, Thomas MC, Thomas DK, Kalmokoff ML, Brooks SP, Selinger LB: Molecular methods to measure intestinal bacteria: a review. J AOAC Int 2012, 95(1):5-23. doi:10.5740/jaoacint.SGE_Inglis
- Spear GT, Sikaroodi M, Zariffard MR, Landay AL, French AL, Gillevet PM: Comparison of the diversity of the vaginal microbiota in HIV-infected and HIV-uninfected women with or without bacterial vaginosis. J Infect Dis 2008, 198(8):1131-1140. doi:10.1086/591942
- Chakravorty S, Helb D, Burday M, Connell N, Alland D: A detailed analysis of 16S ribosomal RNA gene segments for the diagnosis of pathogenic bacteria. J Microbiol Methods 2007, 69(2):330-339. doi:10.1016/j.mimet.2007.02.005
- Zemanick ET, Wagner BD, Sagel SD, Stevens MJ, Accurso FJ, Harris JK: Reliability of quantitative real-time PCR for bacterial detection in cystic fibrosis airway specimens. PLoS One 2010, 5(11):e15101. doi:10.1371/journal.pone.0015101
- Rosey AL, Abachin E, Quesnes G, Cadilhac C, Pejin Z, Glorion C, Berche P, Ferroni A: Development of a broad-range 16S rDNA real-time PCR for the diagnosis of septic arthritis in children. J Microbiol Methods 2007, 68(1):88-93. doi:10.1016/j.mimet.2006.06.010
- Nossa CW, Oberdorf WE, Yang L, Aas JA, Paster BJ, Desantis TZ, Brodie EL, Malamud D, Poles MA, Pei Z: Design of 16S rRNA gene primers for 454 pyrosequencing of the human foregut microbiome. World J Gastroenterol 2010, 16(33):4135-4144. doi:10.3748/wjg.v16.i33.4135
- Frank DN, Feazel LM, Bessesen MT, Price CS, Janoff EN, Pace NR: The human nasal microbiota and Staphylococcus aureus carriage. PLoS One 2010, 5(5):e10598. doi:10.1371/journal.pone.0010598
- Kumar PS, Brooker MR, Dowd SE, Camerlengo T: Target region selection is a critical determinant of community fingerprints generated by 16S pyrosequencing. PLoS One 2011, 6(6):e20956. doi:10.1371/journal.pone.0020956
- Frank DN, Zhu W, Sartor RB, Li E: Investigating the biological and clinical significance of human dysbioses. Trends Microbiol 2011, 19(9):427-434. doi:10.1016/j.tim.2011.06.005
- Kline RB: Principles and Practice of Structural Equation Modeling. New York: The Guilford Press; 1998.
- Frank DN, St Amand AL, Feldman RA, Boedeker EC, Harpaz N, Pace NR: Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. Proc Natl Acad Sci USA 2007, 104(34):13780-13785. doi:10.1073/pnas.0706625104
- Frank DN, Robertson CE, Hamm CM, Kpadeh Z, Zhang T, Chen H, Zhu W, Sartor RB, Boedeker EC, Harpaz N: Disease phenotype and genotype are associated with shifts in intestinal-associated microbiota in inflammatory bowel diseases. Inflamm Bowel Dis 2011, 17(1):179-184. doi:10.1002/ibd.21339
- Bollen KA: Structural equations with latent variables. New York: John Wiley & Sons, Inc; 1989.
- Allen MJ, Yen WM: Introduction to Measurement Theory. Long Grove, IL: Waveland Press; 2002.
- Fox J: Structural Equation Modeling With the sem Package in R. Structural Equation Modeling 2006, 13:465-486. doi:10.1207/s15328007sem1303_7
- Efron B: The jackknife, the bootstrap, and other resampling plans. Philadelphia, Pa: Society for Industrial and Applied Mathematics; 1982.
- Li E, Hamm CM, Gulati AS, Sartor RB, Chen H, Wu X, Zhang T, Rohlf FJ, Zhu W, Gu C: Inflammatory bowel diseases phenotype, C. difficile and NOD2 genotype are associated with shifts in human ileum associated microbial composition. PLoS One 2012, 7(6):e26284. doi:10.1371/journal.pone.0026284
- Rinttila T, Kassinen A, Malinen E, Krogius L, Palva A: Development of an extensive set of 16S rDNA-targeted primers for quantification of pathogenic and indigenous bacteria in faecal samples by real-time PCR. J Appl Microbiol 2004, 97(6):1166-1177. doi:10.1111/j.1365-2672.2004.02409.x
- Sokol H, Pigneur B, Watterlot L, Lakhdari O, Bermudez-Humaran LG, Gratadoux JJ, Blugeon S, Bridonneau C, Furet JP, Corthier G: Faecalibacterium prausnitzii is an anti-inflammatory commensal bacterium identified by gut microbiota analysis of Crohn disease patients. Proc Natl Acad Sci USA 2008, 105(43):16731-16736. doi:10.1073/pnas.0804812105
- Sokol H, Seksik P, Furet JP, Firmesse O, Nion-Larmurier I, Beaugerie L, Cosnes J, Corthier G, Marteau P, Dore J: Low counts of Faecalibacterium prausnitzii in colitis microbiota. Inflamm Bowel Dis 2009, 15(8):1183-1189. doi:10.1002/ibd.20903
- Hu L, Bentler PM: Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling 1999, 6(1):1-55. doi:10.1080/10705519909540118
- Seksik P, Sokol H, Pigneur B, Watterlot L, Lakhdari O, Bermudez-Humaran LG, Gratadoux JJ, Blugeon S, Bridonneau C, Furet JP: Faecalibacterium prausnitzii is an anti-inflammatory commensal bacterium identified by gut microbiota analysis of Crohn disease patients. P Natl Acad Sci USA 2008, 105(43):16731-16736. doi:10.1073/pnas.0804812105
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.