Fig. 1 | BMC Bioinformatics

Fig. 1

From: Towards a unified medical microbiome ecology of the OMU for metagenomes and the OTU for microbes


Diagram of the study design towards a unified medical microbiome ecology of metagenomes (OMU) and organisms (OTU). The diagram consists of a top section, a bottom section, and a formal set-theoretic (mathematical) definition of the OMU, which are interpreted below. (1) The top section displays the study design, which consists of two parts: the previous works (for the OTU and for three topics of the OMU, namely diversity, heterogeneity and biogeography) and the contents planned for this study. The new contents with the OMU in this study include six approaches: two network approaches, the core/periphery network (CPN) and the high-salience skeleton network (HSN); the Sloan near-neutral model; the normalized stochasticity ratio (NSR); and two statistical test approaches (randomization tests and shared OMU analysis) for detecting disease effects. (2) In the bottom section, the ad-hoc concept OMU (operational metagenomic unit), or its shorthand MU, is introduced as the counterpart of the OTU (see the methods section for their interpretations). The MG (metagenomic gene) is considered the basic (‘atomic’) unit of the OMU and is the counterpart of the species in the OTU taxonomic hierarchy (97% similarity in 16S-rRNA sequences for bacteria). Both MGs and species exist as basic (indivisible) units. Each MG may have one or more functions, but the ‘components’ of an MG (if it were forced to be divided) are sequencing reads that do not have a corresponding function, hence the MG is atomic. The species is the foundation of the taxonomic system in the OTU hierarchy. The other entities of the OMU include MF/MP/MFGC (defined by Ma & Li 2018: Mol. Ecol. Res.) and CAG and MGS (by Li et al. 2014; Nielsen et al. 2014, both in Nature Biotechnology). All of them are generated from MGs, just as other taxonomic units such as genus and family are combinations of species. (3) The formal mathematical definitions of MF/MP/MFGC in terms of MGs are as follows. 
Assuming there are n MGs, i.e., MG1, MG2, …, MGn, we can define MF/MP/MFGC with mathematical set notation: \(MF = \left\{ {MG1, \, MG2, \, MG3, \ldots } \right\}\), where MG1, MG2, MG3, … are mapped to the same metagenomic function. MP can be defined similarly, except that all of its genes (MGs) are mapped to the same metagenomic pathway. Therefore, an MF (or MP) can be described as the set of MGs annotated to the same metagenomic function (or pathway). Conceptually, MFGC is a set of subsets of MFs (or MPs); that is, the elements of the MFGC set are combinations of MFs. For example, in \(MFGC = \{ \{ MF1, \, MF7\} \}\), the MFGC consists of the MGs that are simultaneously annotated to the two metagenomic functions MF1 and MF7. Assuming there are m possible MFs, the total possible number of MFGCs is \(M = C_{m}^{2} + C_{m}^{3} + \cdots + C_{m}^{m - 1} + C_{m}^{m}\). In practice, only a tiny portion of these possible combinations exists naturally.
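
The set definitions above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the annotation table, the MG and MF labels, and the function names `build_mf`, `build_mfgc` and `max_mfgc_count` are all hypothetical. The sketch also assumes that each MG is assigned to the MFGC that matches its full set of annotated functions, which is one possible reading of the caption. By the standard binomial identity, the sum \(C_{m}^{2} + \cdots + C_{m}^{m}\) equals \(2^{m} - m - 1\), which the last function reproduces term by term.

```python
from collections import defaultdict
from math import comb

# Hypothetical annotation table (illustration only): MG identifier -> set of MF labels.
annotations = {
    "MG1": {"MF1"},
    "MG2": {"MF1", "MF7"},
    "MG3": {"MF1", "MF7"},
    "MG4": {"MF7"},
    "MG5": {"MF2", "MF3", "MF7"},
}


def build_mf(annotations):
    """Return MF label -> set of MGs annotated to that function."""
    mf = defaultdict(set)
    for mg, functions in annotations.items():
        for label in functions:
            mf[label].add(mg)
    return dict(mf)


def build_mfgc(annotations):
    """Return combination of MFs -> set of MGs carrying exactly that combination.

    Assumption: an MG belongs to the MFGC given by its full set of functions,
    and only combinations of two or more functions count as an MFGC.
    """
    mfgc = defaultdict(set)
    for mg, functions in annotations.items():
        if len(functions) >= 2:
            mfgc[frozenset(functions)].add(mg)
    return dict(mfgc)


def max_mfgc_count(m):
    """Upper bound M = C(m,2) + C(m,3) + ... + C(m,m) on the number of MFGCs."""
    return sum(comb(m, k) for k in range(2, m + 1))


mf = build_mf(annotations)
mfgc = build_mfgc(annotations)
print(len(mfgc), "observed MFGCs out of", max_mfgc_count(len(mf)), "possible")
```

With the four functions in this toy table, at most 11 combinations are possible, while only two actually occur, which mirrors the remark that only a small portion of the possible MFGCs exists in practice.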
