 Research
 Open access
 Published:
A study on multiomic oscillations in Escherichia coli metabolic networks
BMC Bioinformatics volumeĀ 19, ArticleĀ number:Ā 194 (2018)
Abstract
Background
Two important challenges in the analysis of molecular biology information are data (multiomic information) integration and the detection of patterns across large scale molecular networks and sequences. They are are actually coupled beause the integration of omic information may provide better means to detect multiomic patterns that could reveal multiscale or emerging properties at the phenotype levels.
Results
Here we address the problem of integrating various types of molecular information (a large collection of gene expression and sequence data, codon usage and protein abundances) to analyse the E.coli metabolic response to treatments at the whole network level. Our algorithm, MORA (Multiomic relations adjacency) is able to detect patterns which may represent metabolic network motifs at pathway and supra pathway levels which could hint at some functional role. We provide a description and insights on the algorithm by testing it on a large database of responses to antibiotics. Along with the algorithm MORA, a novel model for the analysis of oscillating multiomics has been proposed. Interestingly, the resulting analysis suggests that some motifs reveal recurring oscillating or position variation patterns on multiomics metabolic networks. Our framework, implemented in R, provides effective and friendly means to design intervention scenarios on real data. By analysing how multiomics data build up multiscale phenotypes, the software allows to compare and test metabolic models, design new pathways or redesign existing metabolic pathways and validate in silico metabolic models using nearby species.
Conclusions
The integration of multiomic data reveals that E.coli multiomic metabolic networks contain position dependent and recurring patterns which could provide clues of long range correlations in the bacterial genome.
Background
In the last decades, the study of E.coli treatment tolerance metabolic response through multiomics is emerging as an essential part of several approaches to molecular biology, metabolic engineering and medicine [1]. Nowadays promising models on multiomics are based on statistical methodologies [2] and, recently, on multiplex approaches [3, 4]. Highthroughput omics technologies [5] enrich complex relational data structures (i.e., XML documents [6], complex networks or maps of multiview omics [7]) and provide increasing elements for the multiomic integration at different layers of quantitative and relational information.
In several works, the bacterial metabolic response upon perturbations is modelled through the multiomics dynamic changes on metabolic, signalling and regulatory networks [8]. Then, the multiomic analysis leads to several engineering and optimizations approaches [9, 10] that reveal hidden biological motifs and pattern regularities [11, 12].
The integration of single omics, even if not biologically comparable (e.g. codon usage and protein abundance), can increase the total information about the system [13]. Ishii and Tomita [14] describe in depth the concept of multiomic spaces as a powerful datadriven approach to understanding biological processes and systems. The information elicited from specific multiomic spaces is multilayered and phenotypic responsive [15].
In conclusion, identifiable multiomic motifs could reveal the dynamical behavior of the total cellular system in standard conditions and after perturbations. Multiomic metabolic network motifs are short recurring patterns that are presumed to have a biological function. Often they indicate sequencespecific parts of pathways with associated oscillating multiomics. In this work, multiomic metabolic network motifs [16] are identified and their recurring oscillating multiomic patterns are analysed. Oscillations are defined in a binary or, at least, discrete space of features. We represented the oscillating multiomics in two different ways: (i) as linked nodes with opposite characteristics on networks (refer to bluered nodes of Fig. 1b (1)), (ii) as a sequence of highlow adjacent values (refer to bluered cylinders of Fig. 1b (3)). Then, the oscillating multiomic features on sequences and networks are linked into a multilayer relational structure (MLS) strengthening the relations between the sequence patterns and the network motifs.
Multiomic data integration is well documented in several works [17, 18]. In our paper we adopt noise robust techniques on uptodate data (Additional file 1: Section S5). We will show that oscillating multiomics are found on E.coli metabolic networks as motifs and on sequences as patterns.
For this reason, we introduce the MLS on which an ad hoc algorithm (MORA  multiomic relational adjacency) finds the reciprocal influences of the neighbouring multiomics on sequences and projected on networks. Oscillating multiomics and their variations are helpful in the analysis of the impact of new drugs and in applications of metabolic engineering. Moreover, this work contributes to the study and to the creation of new interesting metabolic circuits based on multiomic structural relations. Furthermore, the MORA reciprocal influences could seen as an index of the topological interplay between the gene order (considered in our sequences) and the pathways. Gene order along sequences and pathway structures are evolutionary conserved, then this index could be useful in evolutionary organism comparisons. Oscillating multiomic motifs and patterns coupled with the MORA reciprocal influences describe in a new fashion system homeostasis processes and their regulatory functions unveiling the extent of multiscale oscillating multiomics and their network plasticity [19].
Methods
The subject organism of this study is the E.coli K12 MG1655 [20].
In āDefinition of MLSsā section MLSs are described in detail. The global impact of antibiotics on the whole network and the local impact on pathways have been taken into account on these structures. Therefore, the multiomic feature scaling and normalization are applied twice (please refer to āBinary discretization of multiomics taking into account the global and the local effectā section for more detail). Multiomics are discretized into a binary field in order to be analysed. Through the MLS it is possible to outline the relations (or reciprocal influences) of oscillating multiomics across sequences and small networks (pathways). For this latter purpose, in āAn algorithm to discover multiomics relational adjacency (MORA)ā section, the algorithm MORA is introduced. Reciprocal influences are not enough informative for understanding if the oscillating multiomics are actual motifs of the bacterial system. Then, two models to represent the extent of oscillating multiomic motifs/patterns as sequences (paragraph Oscillating multiomics on patterns) and on pathways (āOscillating multiomics on networksā section) are introduced. A detailed description of datasources, procedures and methods of dataintegration is provided in āData sources and multiomics data integrationā section and in the Additional file 1: Section S5. To facilitate the reader a block diagram of the overall procedure is shown in Fig. 2. We have concentrated our attention on E.coli organism but it is possible to extend the methodologies to other bacterial organisms. One of the most important preconditions is the availability of data: (i) the whole genome, (ii) the protein abundance, (iii) operon and protein complex information and (iv) the whole metabolic network. The most relevant bottlenecks in the preliminary data integration processes come from the availability of the protein abundance and operon information. In the PaxDB (protein abundance database) [21] the data coverage for (i) E.coli is of 98%, (ii) H.pylori is of 98%, (iii) B. henselae if of 85% and (iv) S.enterica is of 59%. Other proteobacteria have a data coverage lower than the 47%. Moreover, the operon information (DOOR DB [22]) is conspicuous only on E.coli (152,785 entries), instead, in the other PaxDB listed organisms is less than half or completely not numerically relevant. Nevertheless, the main lack of information, except for E.coli, comes from the reconstruction of the whole metabolic network. In particular, this information is important in two steps of the procedure that we will present: (1) in searching a parameter (the average path length) of the algorithm MORA, (2) in the computation of the pathways with extensions and/or operon compressions.
The most of the times incomplete metabolic networks (i.e. obtained merging only the KEGG pathways) do not present the properties of complex networks, such as the powerlaw degree distribution, the small world, the average path length, etc. Moreover, the powerlaw degree parameter Ī± is important to assess if a network is biological one or not (see the duplicaton model of Chung et al. [23]). For this reason, as described in Additional file 1: Section S5, our integrated network is deeply studied under its biological aspects. In the domain of bacteria, for all we know, there is not another E.coli proteincentric network more complete than this one. For this reason, it is made available in the annexed repository.
Definition of MLSs
In an integrated multiomic space, as that in Fig. 1a, each omic layer is arranged on the basis of its datastructure relations. In the genomic layer, the genes disposed along the double strand with specific positions are transformed in a one line sequence. The gene order is considered as an organismspecific order relation (ā¤) and is highly conserved in duplication and during the translational processes [24]. As shown in Fig. 1b(2), we described a gene relationship of the type g_{1}ā¤g_{2}ā¤āÆā¤g_{n} for each gene g_{i}. The order induced on the multiomic sequence reproduces the gene order relation on the double strand. In particular, multiomics are said to be adjacent with respect to the gene pairs when they are in the positions g_{i} and g_{i+1} āi ā[1ā¦n]. As it is shown in Fig. 1c(1)(2) g_{4},g_{5},g_{6} could be on the same strand or on both. As said previously, the gene reciprocal positions on the double strand are merged and represented on a line sequence. Thus, when the gene order changes in one of the two strands, then the multiomic pattern (Fig. 1b(3)) changes the arrangement of its values. Indeed, in Fig. 1c(1) and (2) the fragment of the pattern changes because of the swap of g_{5} and g_{6}.
In an integrative approach, some specific datastructure relations could involve more than one omic layer. For example, the proteomic and the metabolomic layers can be represented as a proteincentric network of reactions G(V,E) where the nodeset V contains proteins and the edgeset E represents the enzymatic reactions. In the proteincentric network representation, as illustrated in Fig. 1b(1), the reversible reactions are depicted as a double arrow link (ā) and the irreversible reactions are represented as single arrow links (āorā) [25].
In this setting, two proteins p_{i},p_{j}āV(G)) are said to have a strong relationship if they are linked by an edge e(p_{i},p_{j}) āE(G) or if they are the end points of an undirected shortest path that must not be longer than the average path length (APL) [26]. The proteins in a strong relationship will be the subject of thorough analysis as described in āAn algorithm to discover multiomics relational adjacency (MORA)ā section.
In literature, it is proved that the gene adjacency is conserved across prokaryotes with a relevant operon architecture [27]. In particular, it has been shown that the proteins encoded by conserved adjacent genes present interactions on the metabolomic layer [28]. In presence of protein complexes, these interactions are physical, while, when dealing with anabolic and catabolic processes, they are functional. Then, the genes are positioned on DNA depending on their association to metabolic functions.
In order to model the relationship between the gene order and the pathways, the concept of MLS is introduced.
The MLS represents the pathways (Fig. 1b(1)) in combination with the gene order information (Fig. 1b(2)) studying the patterns on multiomic sequences (Fig. 1b(3)). The abbreviation mov is used when we refer to a sequence of multiomic values on the multiomic sequences.
The multiomics related to each pathway are used to build a multiomic subspace that represents the values of each MLS. Furthermore, MLS represents the interactions g_{i}āp_{i} āiāG between the elements in different omic layers of the same subspace (Fig. 1b). Additionally, Operon compression (Fig. 3a) and path extension (Fig. 3b) are modifications of MLS, introduced to identify relevant multiomic pattern variations. The operon compression maintains unaltered the MLS gene order. In this case, the elements that belong to the same operon (Fig. 3a e_{2} e_{3} e_{4}) are compressed to the more frequent multiomic. Moreover, path extension is a multiomic pattern modification accomplished into two steps: 1) adjacent non oscillating elements on the pattern are labeled as end nodes of the pathway (i.e. in Fig. 3b the multiomics in positions e_{2} e_{3} and e_{3} e_{4}); 2) we search for the best oscillating shortest path that links the above end nodes.
As a result, the chosen multiomic path is the shortest between the most oscillating multiomics. The alternation is measured by using the score defined in Eq. 7. Then, these nodes extend the pathway and insert new multiomics in the multiomic sequence, breaking in some cases the gene order (Fig. 3b). Moreover, detailed aspects of oscillating multiomics are illustrated in the following paragraphs.
Binary discretization of multiomics taking into account the global and the local effect
Protein abundance is a measure of the part per million quantity of the proteins inside a cell, as provided, for example, in Wang et al. [29]. Its definition and the protein abundance variation used in this paper, respectively pa and pv, can be found in Additional file 1: S5 Section 5.0.3). Codon Adaptation Index (CAI) is an index of nonuniform codon use defined by Sharp and Li [30] (a deeper discussion is in Additional file 1: S5 Section 5.0.1). All these quantities are extracted, integrated and normalized with the purpose of identifying multiomic patterns and their mov.
Then, the zeromean unitvariance normalization is applied twice: i) the first time it is computed on the complete data set, i.e. considering the whole organism (see Fig. 4a(1)). The second time it is computed considering the same omics but on a small sample filtered from the multiomic space by a specific pathway of N elements (see also Fig. 4a(2)). Then, a vector of numbers called local effect, is obtained for each pathway: \({mov}_{1} = (\widetilde {e}_{1}, \widetilde {e}_{2} \dots, \widetilde {e}_{N})\). The same elements of the multiomic space are selected from the normalized complete data set getting a vector of real numbers called global effect: \({mov}_{2} = (\widehat {e}_{1}, \widehat {e}_{2} \dots, \widehat {e}_{N})\) (see Fig. 4). Both the vectors represent the same elements and have the same length of N. The normalization of the omics is described in the following Eq. 1:
where Ī¼ is the mean and Ļ is the standard deviation. Then, the local effect (mov_{1}) and the global effect (mov_{2}) in steady state (see Eq. 2) and after the tth treatment (see Eq. 3) vectors are discretized into two classes: 0 and 1 (Fig. 4b(1) redhead and bluehead cylinders, respectively). Binary discretized multiomics in steadystate conditions are obtained considering \(\overline {{pa}_{j}}\) and \( \overline {cai}\) ājā1,...,N and i=1,2 as in the following Eq. 2:
Instead, the binary discretized multiomics perturbed by treatments are obtained considering the protein variation for the tth treatment \(\overline {pv}^{t}\) and \( \overline {cai}\) ājā1..N and i=1,2 as in Eq. 3
In some cases, the mov_{1}\(\left (\text {or}~mov^{t}_{1}\right)\) binary discretization is not enough sensitive to discover an alternation. Therefore, in the same positions it is possible to find oscillating multiomics on mov_{2} (or \(mov^{t}_{2}\)) and not on mov_{1}. If it happens then the missing oscillating multiomics on mov_{1} are substituted with the oscillating multiomics of mov_{2}. In this way, it is possible to combine in a binary field the information about the system global response and the pathway local response to antibiotics. Formally, as shown in Fig. 4b(2), the ith local \(\left (a_{i}^{local}\right)\) and global \(\left (a_{i}^{global}\right)\) oscillations are taken in OR. The binary mov (or mov^{t}) resulting from the multiomic pattern is projected on the associated pathway.
Normalization processes are suited to deal with the assumption of independent and dependent systematic biases [31]. Moreover, the scale on which data should be included in these processes (global versus local scale) has an extensive application to highthroughput omics analysis [32]. In particular, local normalization has the advantage of correcting systematic stress response bias within small groups of multiomics. Then, it is possible to account inconsistencies among the multiomics once they are discretized in a binary field. The local variabilities in standard and perturbed measurement conditions could be more relevant with respect to global normalization even if they could be affected by noise. This is also the reason because we adopted noise robust techniques of dataintegration as described in āData sources and multiomics data integrationā section. However, the local normalization process may over fit the data, reducing accuracy, especially, i.e., if the multiomics are integrated from genes that are not randomly spotted on the array [33] and subject to local and global responses determined by the interaction of global and local regulatory mechanisms, such the E.coli oxygen response [34]. For these reasons, it is more accurate to combine the information that comes from the global normalization with the local one.
An algorithm to discover multiomics relational adjacency (MORA)
MORA is a search algorithm that weights the multiomics with respect to their positions on the MLS. Its purpose is to assign a high score to the multiomics that have two properties at the same time: they are adjacent on the sequence and strictly connected on the pathway. The algorithm assigns a score equal to zero to the multiomics that are adjacent on the sequence but unconnected to the pathway. In Fig. 5, the algorithm weights two adjacent multiomics evaluating their positions (i.e. j and j+1 on the pattern mov_{i}=[e_{1},e_{2},ā¦,e_{j},e_{j+1},ā¦e_{n}]). A high score is assigned when the multiomics e_{j} and e_{j+1} are directly connected on the pathway (green dotted lines) or with the shortest path (magenta dotted lines), otherwise a score of 0 is assigned. The multiomic reciprocal adjacency is computed for all the adjacent couples c. The median value of the MORA weights is a summary index of the reciprocal influences between multiomics and underlines the interplay among them on the MLS. Note that in multiomic sequence the propagation of influences of an element becomes gradually weaker as the distance from its neighbouring elements increases: the propagation of influences is limited to the metabolic network average path length (APL) (see Additional file 1: Section S5  Section 5.0.4), decreasing its influence gradually and in an inversely proportional way with respect to the path length. If sequences and pathways do not present reciprocal influences (RI=0), the oscillating multiomics lose their significance with respect to the pathways and viceversa. In these cases, MLS interactions do not present a real biological meaning.
MORA algorithm takes as input the following parameters: mov^{i}, which is the pattern of the ith MLS, G^{i}, which is its pathway and APL. Let us define Ī“=2 as the distance on the sequence from e_{j} to e_{j+1},ājā [0,nā(Ī“ā1)]. Ī“ is fixed to 2 and identifies the adjacency of positions j and j+1 on the pattern (as it is shown in Fig. 5). Iteratively, each couple of (e_{j},e_{j+1}) is associated with a couple of nodes in the pathway with the same index for simplicity: (p_{j},p_{j+1}). Then, the algorithm will search in G^{i} the shortest paths with the end nodes node_{from}=p_{j} and node_{to}=p_{j+1}. Also, we define Ļā [1, āAPLā] as the path distance between p_{j} and p_{j+1}. For example, in Fig. 5, MORA finds a direct link between the pathway nodes (green dotted lines) and a path of length 2 (in magenta dotted lines).
We define the array of weights infl as an array of all zeros with the same length of mov^{i}. When the algorithm finds the shortest paths on G^{i} from node_{from} to node_{to}, it takes the positions z of the path nodes on the mov^{i} and in correspondence of these positions he gives a weight to the vector element infl_{z} (i.e the algorithm in Fig. 5a takes for Ļ=2 the positions z={j,j+1,h}). The weights are computed with the following formula: \( w = \frac {1}{\psi _{y} 1} \), for the y th shortest path found. These weights are summed up in the vector infl as follows: infl[z]=infl[z]+w. If there are c couples with Ī“=2 and Ļ=2 then a perfect adjacency on mov_{i} and a direct edge on G_{i} are present, and then w=1. The weight w becomes progressively smaller if the distance Ļ between the end nodes increases. The maximum distance between the end nodes is limited by the upper bound that is properly āAPLā. The supplementary Section S6Figure 4 (Additional file 1) describes the whole procedure of the algorithm with a toy example and a related Figure. MORA is tested against several extreme structures proving its correctness. In particular, MORA is shown to be stable in case of cliques (supplementary Figure 4 (Additional file 1) part (b)), in snaked networks (Additional file 1: Figure S4) part (c) and on complete networks (not shown).
MORA is given as yardstick for validating and deciding if the measures that are taken in next Sections for oscillating multiomic patterns and pathways are comparable and at what level of reliability.
Algorithm 1 presents a schematic description of the MORA methodology. The code is available at https://github.com/lodeguns/MORA.
Oscillating multiomics on patterns
In this work, the multiomics are discretized in a binary field as described in the previous āBinary discretization of multiomics taking into account the global and the local effectā section, so that these values are classified into two classes (N=2). A multiomic pattern presents an alternation if the two adjacent e_{j} and e_{j+1} in the pattern sequence are subtracted and ā£e_{j}āe_{j+1}ā£=1. If the result of the subtraction is 0 then e_{j} and e_{j+1} are equal and there is not an alternation (i.e. Fig. 1d(2)a_{1}).
Furthermore, l is defined as the length of mov and div=lā1 is the number of divisors between the e_{j}. For each pattern, the relative multiomic pattern score a.s (Eq. 4) is the number of adjacent alternated couples of values divided by N:
We are interested in patterns of oscillating multiomics with values close to the maximal score. We can obtain the alternation score of multiomic patterns leveraging Eq. 4. A maximal score corresponds to a sequence of fully oscillating multiomic values, for example mov_{s}=0ā1ā0ā1ā...0ā1 or alternatively mov_{s}=1ā0ā1ā0....1ā0ā1ā0. Equation 4 returns the maximal score if and only if the pattern sequence presents fully oscillating multiomics. The correctness of the last statement is proved in Theorem 1 (Additional file 1: Section S4  Theorem 1). Fully oscillating sequences of different lengths, l_{1} and l_{2}, present different scores depending on their length. Thus, it would be better to normalize the score dividing it by the number of divisors. Then, the absolute score of alternation is obtained, as in Eq. 5.
The maximal absolute score is proved to exist and it is unique for each pattern sequence with a specific length (Additional file 1: Section S4  Theorem 2). Theorems 1 and 2 are important because they prove the correctness of how to compute the distance d of an observed oscillating multiomic pattern mov_{obs} (i.e., mov_{obs}:1ā0ā1ā0ā0ā1ā0ā1ā1ā0) from the maximal absolute score (that can be considered as the ideal score). Thus, we have a powerful instrument to investigate how much the oscillations observed in mov_{obs} are found to be distant from the ideal (maximal) oscillations mov_{id}. The distance d in Eq. 6 is the classical Manhattan distance:
d is a geometric distances only if mov_{obs} and mov_{id} have the same length. The similarity index mov_{obs} with respect to mov_{id} is computed using Eq. 7:
Oscillating multiomics on networks
In this Section, we illustrate the effect of oscillating multiomics when projected on pathways. We consider the Park and BarabĆ”si [35] dyadic model on the pathways. In particular, a couple of proteins, linked in the pathway, presents a dyadic property if they both have the same multiomic value (i.e., the couples of redred nodes or blueblue nodes in Fig. 1b(1)). Instead, they show an antidyadic property if they have oscillating multiomics (couples of redblue nodes). Following the model of Park and BarabĆ”si, it is possible to compute the expected value of an antidyadic effect on the couples of nodes that present an alternation. Let n_{1} be the number of pathway nodes with multiomic values equal to 1, and n_{0} be the number of pathway nodes with multiomic value equal to 0. In this way, the total number of nodes in the pathway is N=n_{1}+n_{0}. The number of links between nodes that show the antidyadic property are described by the variables m_{10} and m_{01}. The number of links between nodes that do not show an alternation (dyadic property) are represented by the variables m_{11} and m_{00}. Therefore, the total number of edges for a pathway is M=m_{10}+m_{01}+m_{11}+m_{00}. Note that the maximal possible number of edges in a directed network is equal to N(Nā1). The network density Ī“ [36] of a directed graph is described in the following Eq. 8:
The random variables \(X_{m_{10}}, X_{m_{01}}\), \(X_{m_{11}}\), \(X_{m_{00}}\) (where \(X_{m_{10}} = X_{m_{01}}\)) describe the events of oscillations or nonoscillations in a network. The expected value of observing an alternation (antidyadic property) on two nodes given by Eq. 9 is the following:
On the other hand, the expected value of not observing an alternation (dyadic property) is defined by Eqs. 10 and 11:
The last three Equations describe the link of every pair of nodes with a given probability [37, 38] as described by the Gilbertās model for random networks. Therefore, if the counts of m_{10}, m_{01}, m_{11} and m_{00} deviate from the expected values described above, then it is possible that the multiomics on the pathways are disposed in a structured manner, differently from Gilbertās model. The ratio between the observed alternation and its expected value is a direct measure of deviation (a.k.a. magnitude) of the observed pathway from a random configuration. In particular, the pathways present oscillating multiomics with a magnitude \(\widehat {m_{10}}\) given by Eq. 12 :
Following the properties of the network structure, if the nodes present a dyadic effect, their magnitudes \(\widehat {m_{11}}\) and \(\widehat {m_{00}}\) are given by Eqs. 13 and 14:
If \(\widehat {m_{01}}\) is greater than 1, it indicates that multiomic nodes are oscillating more than expected by a random configuration. Multiomics are not oscillating in the same way when \(\widehat {m_{00}} > 1\) or \(\widehat {m_{11}} > 1\). Due to the configuration of the pathways, it is possible that the antidyadic magnitude (see Eq. 12) and both the dyadic magnitudes (see Eqs. 13, 14) present values greater than 1; therefore, a network is mainly oscillating if \(\widehat {m_{10}} > \widehat {m_{11}}\) and \(\widehat {m_{10}} > \widehat {m_{00}}\).
The average dyadiceffect\(\widehat {m_{0011}}\) is defined as the average between the magnitudes \(\widehat {m_{11}}\) and \(\widehat {m_{00}}\).
The whole procedure performance
Given an unweighted graph G(N,E), with N the set of pathway nodes and E the set of edges/reactions, MORA searches the reciprocal influences with a polynomial complexity on N. Additionally, MORA is coupled with the measurement of oscillations making the whole procedure exponential. The function get.shortest.paths is a modified function that implements a breadthfirst search (BFS) structure [26] and takes a track of all the shortest paths between two end nodes. The worst case time complexity for this search algorithm is of O(N+E), but 0ā¤Eā¤N, thus, the complexity could be quadratic on N: O(N^{2}). The latter is called for each adjacent couple (Nā1) of multiomics along the sequence plus the one taken considering the couples between the first and the last elements (O(N(N+E)) [39]. Moreover, the computation of the dyadic/antidyadic effect requires an exhaustive enumeration of all the possible configurations of the n_{1} nodes on the whole node set N: \(\binom {N}{n_{1}} = O\left (N^{n_{1}}\right)\). The latter turns out to be exponential in time but it is possible to apply an approximation by analyzing the boundary configurations of a phase diagram [35]. Unfortunately, in this case, due to the high specificity of enzymesubstrate reactions, with the latter approximation a large part of the biological information is lost. The computation of the oscillating multiomics on sequences require a linear time with respect the number of nodes (O(N)). As consequence, the whole procedure is of f\(\left (\mathrm {N}, n_{1}\right) = O\left (N^{n_{1}}\right) + O\left (N^{3}\right)) + {O}(N)\).
Data sources and multiomics data integration
In the Additional file 1: Section S3 the data sources and the differences between static and dynamic omic values are described. Static omic values, as CAI, are those not responsive to antibiotics. Dynamic omic values are those sensitive to treatments, as, for example, mRNA amount, protein abundance and its variation. In particular, for what concerns the transcriptomic layer, the microarray compendia are extracted with relevant betweenstudies reliabilities [40] as described in the Additional file 1: Section S5 pp.5.0.2. Furthermore, in the proteomic layer, a random effect model is designed with affordable predictors with the aim of obtaining a noiserobust protein variation pv as a fundamental dynamic omic (See [41] and Additional file 1: Section S5 pp.5.0.3 for more details on the model). For what concern the metabolomic layer, a novel proteincentric network of reactions is obtained by integrating two sources [42, 43] (Additional file 1: Section S5 pp.5.0.4). Finally, in the genomic layer, relevant information comes from the codon usage extraction as described in the Additional file 1: Section S5 pp.5.0.1.
Results
The experiments were performed on 66 pathways in standard conditions and taking into account the average effect of 69 antibiotic perturbations.
The sequence patterns and network motifs are studied on 66 multilayered structures four times. In fact, four different experimental setups were compared on the same dataset: (i) MLS without modifications; (ii) MLS with operon compression; (iii) MLS with path extension; (iv) MLS with operon compression followed by path extension. (see violet block in Fig. 2)
The MORA algorithm evaluated the reciprocal influences RI on the 66 x4 MLS (see white block in Fig. 2). Once the associations between the recurring patterns and network motifs were evaluated, the oscillating multiomics were computed with the following scores: (i) the similarity between observed oscillating multiomic patterns and the ideal patterns (Eq. 7); (ii) the pathway dyadic/antidyadic effect magnitudes, as illustrated in Eqs. 12, 13, 14 (see also green blocks in Fig. 2).
For a detailed description of the experimental parameters see in the Additional file 2: Tables S1 and S2. Row names are labelled with their unique E.coli KEGG identifier codes (eco:pathNNNNN).
A small case study
Multiomic oscillations of E.coli Glycolysis are shown in the Additional file 1: Figure S5 Section S7. In this case, in standard conditions, the similarity computed as in Eq. 7 is =0.6 (red line) and the \(\widehat {m_{01}}\) is =1.9, while the magnitude of the dyadic effect \(\widehat {m_{0011}}\) is =1.5. These values could change due to the perturbations, as shown in Additional file 1: Figure S5 part (b). In this small case study, a single pathway alternation analysis with the effect of path extensions is shown in Additional file 1: Figure S6 highlighting that with or without path extensions, the oscillations on motifs are preserved.
Multiomic oscillation features are present in standard conditions and show a multiscale behavior
For the 66 pathways and suprapathways extracted from KEGG the multiomic oscillations in standard conditions through their associated 66 MLSs and relative modifications are studied.
Network Motifs:
It is clearly shown (Fig. 6a) that most of the pathways without path extensions (black dots) presents a relevant antidyadic effect (median \(\widehat {m_{01}} = 1.87, sd= \pm 0.57\)) even if it is also present a relevant but more variable dyadic effect (median 2.03,sd=Ā±0.81). In the pathways with path extensions (blue dots) the antidyadic effect is decreased with a median 1.35,sd=Ā±0.48; contextually, also the average dyadic effect is decreased (median 1.45,sd=Ā±0.57). In some cases, path extensions, more than other MLS modifications, could increase the oscillations. In some other cases MLS modifications introduce new proteins as nodes and new edges as reactions that could decrease the oscillations; it depends on the network topologies. In this case, in the most part of the pathways, the antidyadic effect magnitude is decreased even if it remains relevant. In standard conditions, pathways without path extensions present only 18/66 combined MLSs with a Ļ_{obs}>0.7 and \(\widehat {m_{01}} >1\).
Sequence Patterns:
In Fig. 6b it is shown how combined and competitor structures interact. Furthermore, pathways with path extensions (blue triangles) are more oscillating on patterns (median Ļ_{obs}=0.73,sd=0.10) with respect to those without extensions (black dots) (median Ļ_{obs}=0.53,sd=0.14). On one hand, pathways with path extensions present a high antidyadic effect, on the other hand, they decrement the Ļ_{obs}, moving the density center to a value nearer to 0.5. This means that pathways with path extensions seem to be more in combination than pathways without extensions. Multiomic pattern operon compression returns similarity scores which are a little higher (median Ļ_{obs}=0.54,sd=0.15) than in unmodified MLSs. Modifying MLSs, coupling operon compression and path extension, leads to lower oscillations in patterns (median Ļ_{obs}=0.70,sd=0.12). The presence of operons on the patterns does not cause always the same effect: due to the compression, in some pathways, for example Glycolysis (KEGG identifier path:eco00010) Ļ_{obs} changes from 0.62 to 0.66, while, in the Citrate cycle (TCA cycle) path:eco00020, Ļ_{obs} changes from 0.38 to 0.62. In other cases, Ļ_{obs} decreases its value from 0.64 to 0.38, as, for example, in the Lysine degradation (KEGG identifier path:eco00310). After all, oscillation is still present for the most part of the pathways without the MLS modifications. In this way, multiomic oscillations allow to uncover similarities between the network structures, which can reveal the existence of generic organization principles. A comparison of MLSs with and without modifications reveals a multiscale presence of oscillations in sequences and networks of different dimensions but with a widespread tendency to homeostasis.
Multiomic oscillations are related to MLS reciprocal influences showing a particular interplay between the sequence gene order and the pathway structure
It is possible to assess that MLSs are related to multiomic oscillations underlining the interplay between the gene order and the particular schema of reactions. In both the plots of Fig. 6, pathways without extensions maintain a relevant reciprocal influence (median 1,sd=Ā±0.27), increasing their value for all the pathways with extensions (median 2,sd=Ā±0.63). In both the cases, the reciprocal influences are not near to 0, thus associating the antidyadic motifs to the neighboring multiomics along the patterns. In this way, it is also proved that the network plasticity [19] does not depends only on a particular circuit of reactions but could be investigated through the extent of the MORA reciprocal influences between the sequence gene order and the network structures. From an evolutionary point of view, the sequence gene order and the pathway structures are conserved along the organisms [44, 45]. In future works, it could be interesting to investigate the interplay of gene order and network structures on several species taking as a measure of the evolutionary pressure the comparisons between MORA reciprocal influences.
Multiomic oscillation features change in configuration due to perturbations and reveal different regulatory behavior
We measured the multiomic oscillations on the 66 MLSs perturbed by considering the average effect of 69 antibiotic perturbations. A strong response from the MLSs, as shown in Fig. 7, is obtained when pathways with extensions in standard conditions (black dots) are compared against the average effect of 69 treatments (blue triangles). The treatment effects strongly lower the patterns oscillation score moving the median from Ļ_{obs}=0.72,sd=0.10 to Ļ_{obs}=0.57,sd=0.11. On the other hand, on the networks, they led the median values of \(\widehat {m_{01}}\) from 1.36 to 1.41, with an increase in variability from sd=0.50, to 0.87. These values show that the organism, in response to the treatments, activates other oscillating circuits on the same pathway, silencing some others. For example, in Fig. 7, base excision repair (KEGG pathway id eco03410 with extension) reasonably increases its antidyadic effect of more than 0.5. Mismatch repair (KEGG pathway eco03430) also increases its antidyadic effect magnitude, and in this case, its oscillating similarity on the pattern is lowered. The same is for the Flagellar assembly pathway (KEGG pathway eco02040) that, due to the label overlapping not shown on the plot, in standard conditions, is near to the black node 360 while in the treatments it is behind the blue triangle 280, always in Fig. 7. The structures (sequence and pathway) that form the MLS are defined in combination if oscillating multiomics are present at the same time on the patterns and on the network motifs. Instead, when oscillating multiomics in one structure are found on patterns and not on pathways or viceversa, then these structures are defined in competition. In our case, it is observed that the combined structures in standard conditions are 47/66 considering pathways with extension, those with operon compression are 14/66, while those with operon compression and extension are 43/66. The average effect of the 69 treatments causes only 25/66 MLSs in combination. Similar results are obtained considering MLSs with operon compression and with operon compression and path extension. Some structures that are combined in steady state conditions due to the perturbations become competitors and viceversa. Configuration changes imply the activation or the deactivation of oscillating multiomic circuits on the pathways (as shown in Fig. 8) highlighting the presence of different and unbalanced regulatory functions.
Discussion
The proposed methodology with regard to performances could be comparable with wellknown mining methodologies of sequence patterns [46] and dyad motifs [35, 47]. From the biological point of view, it is based on an extensive perturbation multiomic analysis related to network and sequence structures. In this field, recent studies are focused on frequency patterns, that, conditioned by biochemical oscillators, are activated or deactivated on regulatory network circuits. In particular, it is proven, that different types of regulatory functions appear to be related to particular network structures to such an extent that different biochemical oscillators are associated with specific structures [48]. Furthermore, some other studies are focused on bacterial network motifs and sequences which deepen the topological features on complex structures [49]. As a consequence, our methodology is focused on multiomic patterns and motifs by putting them in a biological and structural relation. In this way, it is possible to leverage the topological interplay between networks and sequences for understanding the effect of perturbations and the role of regulatory functions. Consequently, discovered patterns and motifs are considered part of the same multilayered structure (MLS) and they come from quite easily data integration processes useful to compare dynamic omic sources and perturbations (protein variation, mRNA amounts, etc.) with static omic sources (codon usage, gene order, pathways). Therefore, it is analyzed the extent of oscillating multiomics and their reciprocal influences investigating whether sequence patterns and network motifs are combined or competitor. In this setting, we have shown how perturbations alter the MLS dynamically. These changes were proved to trigger the activation and deactivation of oscillating multiomic circuits with major implications in the metabolic response to antibiotics. Moreover, unknown structural features are revealed, e.g. it is shown that the gene order and the bacterial reaction circuits reveal a strong interplay in combined structures. MLSs were studied when subject to modifications (operon compression and path extensions) considering multiscale multiomic metabolic networks. When the motifs present a lack of oscillating multiomics, then through the path extensions we have seen how the metabolic network affects the recurring patterns more than the operon compression. In this setting, some patterns show high scores while others not. The reason is that, in some cases, the network helps the recurring pattern to maintain a structural oscillation, while, in other cases, the network influences negatively the oscillations. In this way, the variable multiomic quantities could be analyzed in order to discover unknown regulation mechanisms between the different omiclayers. Impressively, competitor MLSs, after a perturbation, could become combined showing a sort of synchronization used to realize a catabolic or an anabolic process in an optimized manner. Furthermore, the latter observations coupled with the discretization of multiomics depict the complexity of metabolic processes and their response to antibiotics unveiling the rules behind the metabolic network robustness and plasticity [19]. After all, MORA reciprocal influences between motifs and patterns are obtained in a reasonable time (see 2) on an almost complete dataset. In fact, the whole proteincentric network is of 1.644 vertices and 369.863 edges and, in our knowledge, it does not exist a larger network (in terms of vertices) on which to project more multiomics. The selected experiments come from two compendia specifically suited for antibiotic treatments (see Paragraph Data sources and multiomics data integration). In order to make results comparable, it is possible to extend the whole procedure to other experiments considering perturbations of the same typology. Unfortunately, the complete proposed methodology is exponential (see Paragraph The whole procedure performance) because of the measurement of oscillating multiomics on networks (see Paragraph Oscillating multiomics on networks). It is possible to achieve ad hoc algorithmic improvements reducing the pathway redundancy [50] or saving and reusing the computations done on overlapped pathway subcircuits/subgraphs in specialized datastructures. After all, functional plasticity, homeostasis, redundancy, and promiscuity conserved biological aspects of metabolic networks and biological processes which makes it possible to survive to external perturbations [51]. Therefore, it is a better choice to not remove redundancy preserving the biological mining of the research. For the same thing it is recommended to pay attention in the network edge pruning due to the promiscuity of small metabolites (i.e H_{2}O,O_{2},AMP, etc ā¦) [25, 52]. Oscillating multiomic patterns associated with dynamic rates of substrate/compound transformations might lead to predicting some specific biochemical processes also in presence of missing data, by using inferential methods based on the profiling of reaction velocities and their dynamics. Future improvements might lead us to study oscillations with N>2 classes. In literature there are several methodologies to define N as a discrete space of features [53]. We suggest to apply a consensus criterium in order to project multiomics on a discrete space with N>2 because there is not an a priori best methodology. For example, we adopted the Sturgeās formula in one of our precedent works [12]. Once the number of discretisation levels N is decided, the procedure to obtain Ļ_{obs} is very similar to the one described in the paragraph Oscillating multiomics on patterns. Note that if we have a set of discretization levels Ī£={0,1,2,..N}, then each discretisation level contributes to the pattern oscillation by its fraction 1/Ī£. The ideal Ļ_{obs} is equal to \(\frac {(max(\Sigma)0) * div}{max(\Sigma)*div} = 1\). It follows that the ideal oscillation with N>2 assumes the form of a series of minmax discretisation levels: max(Ī£)āmin(Ī£)āmax(Ī£)ā¦min(Ī£)āmax(Ī£). More the adjacent values are nearer to max(Ī£) or min(Ī£) the less we have oscillating multiomics. The multiclass dyadic effect follows the same rules presented above where the maximal oscillation is equal to N and the others are ā [0,N].
Conclusion
In this paper a multiomic integration based methodology is introduced to analyse bacterial oscillating multiomics on sequence patterns and network motifs. The subject of our analysis is E.coli. The lack of methodologies for multiomics integration decreases the chance to detect emerging motifs and patterns across sequences, networks or pathways. The current need of analytics to compare and test metabolic models, the accurate design of a new pathway or the redesign of an existing metabolic pathway or the experimental validation of an in silico metabolic model using a nearby species require the information of how multiomics data build up multiscale phenotypes. The high level comparison is based on novel algorithms which take the low level approach at pathway level framework and generate multiscale multiomic metabolic networks. The goal is to find relations between oscillating multiomic patterns on sequences and oscillating motifs on metabolic networks. Recurring oscillating multiomic patterns are discovered to be related to multiomic metabolic network motifs. This last novel feature can be related to the highly conserved gene order and to the highly specificity of enzymatic reactions and their network topology. The discovered motifs can be not only useful in the study of bacterial phenotypic responses but also in applications of metabolic engineering and optimizations. Then, this work can contribute to the study and to the creation of new interesting metabolic circuits based on binary multiomic network motifs and their recurring patterns. Our framework is implemented in a software written in R which provides effective and friendly means to design intervention scenarios in the perspective of the synthetic biology.
Abbreviations
 Alphabetical list of abbreviations:

APL: Average path length
 CAI:

Codon adaptation index
 FB:

Francesco Bardozzo
 MLS:

Multilayered structures
 MORA:

Multiomic reciprocal adjacencies
 PA:

Protein abundance
 PL:

Pietro LiĆ³
 PV:

Protein variation
 RT:

Roberto Tagliaferri
References
Kohl M, Megger DA, Trippler M, Meckel H, Ahrens M, Bracht T, Weber F, Hoffmann AC, Baba HA, Sitek B, et al.A practical data processing workflow for multiomics projects. Biochim Biophys Acta (BBA)  Protein Proteomics. 2014; 1844(1):52ā62.
Bersanelli M, Mosca E, Remondini D, Giampieri E, Sala C, Castellani G, Milanesi L. Methods for the integration of multiomics data: mathematical aspects. BMC Bioinformatics. 2016; 17(Suppl 2):15.
Angione C, Conway M, LiĆ³ P. Multiplex methods provide effective integration of multiomic data in genomescale models. BMC Bioinformatics. 2016; 17(4):83.
Menichetti G, Remondini D, Panzarasa P, MondragĆ³n RJ, Bianconi G. Weighted multiplex networks. PLoS ONE. 2014; 9(6):97857.
Suravajhala P, Kogelman LJ, Kadarmideen HN. Multiomic data integration and analysis using systems genomics approaches: methods and applications in animal production, health and welfare. Genet Sel Evol. 2016; 48(1):38.
Mesiti M, JimĆ©nezRuiz E, Sanz I, BerlangaLlavori R, Perlasca P, Valentini G, Manset D. Xmlbased approaches for the integration of heterogeneous biomolecular data. BMC Bioinformatics. 2009; 10(Suppl 12):7.
Serra A, Fratello M, Fortino V, Raiconi G, Tagliaferri R, Greco D. Mvda: a multiview genomic data integration methodology. BMC Bioinformatics. 2015; 16(1):261.
ChiappinoPepe A, Pandey V, Ataman M, Hatzimanikatis V. Integration of metabolic, regulatory and signaling networks towards analysis of perturbation and dynamic responses. Curr Opin Syst Biol. 2017.
Zhang W, Li F, Nie L. Integrating multiple āomicsā analysis for microbial biology: application and methodologies. Microbiology. 2010; 156(2):287ā301.
Aksenov SV, Church B, Dhiman A, Georgieva A, Sarangapani R, Helmlinger G, Khalil IG. An integrated approach for inference and mechanistic modeling for advancing drug development. FEBS Lett. 2005; 579(8):1878ā83.
Ebrahim A, Brunk E, Tan J, Oābrien EJ, Kim D, Szubin R, Lerman JA, Lechner A, Sastry A, Bordbar A, et al.Multiomic data integration enables discovery of hidden biological regularities. Nat Commun. 2016; 7:1ā9.
Bardozzo F, LiĆ³ P, Tagliaferri R. Multi omic oscillations in bacterial pathways. In: Neural Networks (IJCNN), 2015 International Joint Conference On. Killarney: IEEE: 2015. p. 1ā8.
Chakraborty S, Nag D, Mazumder TH, Uddin A. Codon usage pattern and prediction of gene expression level in bungarus species. Gene. 2017; 604:48ā60.
Ishii N, Tomita M. Multiomics datadriven systems biology of e. coli. In: Systems biology and biotechnology of Escherichia coli, volume 1: 2009. p. 41ā57.
Toyoda T, Wada A. Omic space: coordinatebased integration and analysis of genomic phenomic interactions. Bioinformatics. 2004; 20(11):1759ā65.
ShenOrr SS, Milo R, Mangan S, Alon U. Network motifs in the transcriptional regulation network of escherichia coli. Nat Genet. 2002; 31(1):64ā8.
Kim M, Rai N, Zorraquino V, Tagkopoulos I. Multiomics integration accurately predicts cellular state in unexplored conditions for escherichia coli. Nat Commun. 2016; 7:1ā12.
Fondi M, LiĆ² P. Multiomics and metabolic modelling pipelines: challenges and tools for systems microbiology. Microbiol Res. 2015; 171:52ā64.
Almaas E, Oltvai Z, BarabĆ”si AL. The activity reaction core and plasticity of metabolic networks. PLoS Comput Biol. 2005; 1(7):68.
Blattner FR, Plunkett G, Bloch CA, Perna NT, Burland V, Riley M, ColladoVides J, Glasner JD, Rode CK, Mayhew GF, et al.The complete genome sequence of escherichia coli k12. science. 1997; 277(5331):1453ā62.
Wang M, Weiss M, Simonovic M, Haertinger G, Schrimpf SP, Hengartner MO, von Mering C. Paxdb, a database of protein abundance averages across all three domains of life. Mol Cell Proteomics. 2012; 11(8):492ā500.
Mao F, Dam P, Chou J, Olman V, Xu Y. Door: a database for prokaryotic operons. Nucleic Acids Res. 2008; 37(suppl_1):459ā63.
Chung F, Lu L, Dewey TG, Galas DJ. Duplication models for biological networks. J Comput Biol. 2003; 10(5):677ā87.
Edwards MT, Rison SC, Stoker NG, Wernisch L. A universally applicable method of operon map prediction on minimally annotated genomes using conserved genomic context. Nucleic Acids Res. 2005; 33(10):3253ā62.
Light S, Kraulis P. Network analysis of metabolic enzyme evolution in escherichia coli. Bmc Bioinformatics. 2004; 5(1):1.
Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal, Complex Syst. 2006; 1695(5):1ā9.
Cho BK, Zengler K, Qiu Y, Park YS, Knight EM, Barrett CL, Gao Y, Palsson BĆ. The transcription unit architecture of the escherichia coli genome. Nat Biotechnol. 2009; 27(11):1043ā9.
Dandekar T, Snel B, Huynen M, Bork P. Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem Sci. 1998; 23(9):324ā8.
Wang M, Herrmann CJ, Simonovic M, Szklarczyk D, Mering C. Version 4.0 of paxdb: Protein abundance data, integrated across model organisms, tissues, and celllines. Proteomics. 2015; 15(18):3163ā8.
Sharp PM, Li WH. The codon adaptation indexa measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 1987; 15(3):1281ā95.
Ludbrook J. Special article comparing methods of measurement. Clin Exp Pharmacol Physiol. 1997; 24(2):193ā203.
Callister SJ, Barry RC, Adkins JN, Johnson ET, Qian WJ, WebbRobertson BJM, Smith RD, Lipton MS. Normalization approaches for removing systematic biases associated with mass spectrometry and labelfree proteomics. J Proteome Res. 2006; 5(2):277ā86.
Quackenbush J. Microarray data normalization and transformation. Nat Genet. 2002; 32(Supp):496.
Bettenbrock K, Bai H, Ederer M, Green J, Hellingwerf KJ, Holcombe M, Kunz S, Rolfe MD, Sanguinetti G, Sawodny O, et al.Towards a systems level understanding of the oxygen response of escherichia coli. Adv Microb Physiol. 2014; 64:65ā114.
Park J, BarabĆ”si AL. Distribution of node characteristics in complex networks. Proc Natl Acad Sci. 2007; 104(46):17916ā20.
Schaeffer SE. Graph clustering. Comput Sci Rev. 2007; 1(1):27ā64.
Albert R, BarabĆ”si AL. Statistical mechanics of complex networks. Rev Mod Phys. 2002; 74(1):47.
Goldenberg A, Zheng AX, Fienberg SE, Airoldi EM, et al.A survey of statistical network models. Found TrendsĀ®; Mach Learn. 2010; 2(2):129ā233.
Erciyes K. Complex networks: an algorithmic perspective, volume 1: CRC Press; 2014, pp. 93ā4.
CsĆ”rdi G, Franks A, Choi DS, Airoldi EM, Drummond DA. Accounting for experimental noise reveals that mrna levels, amplified by posttranscriptional processes, largely determine steadystate protein levels in yeast. PLoS Genet. 2015; 11(5):1005206.
Winter B. Linear models and linear mixed effects models in R with linguistic applications. CoRR. 2013; abs/1308.5499:1ā42. http://arxiv.org/abs/1308.5499, https://dblp.org/rec/bib/journals/corr/Winter13.
Kanehisa M, Goto S. Kegg: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000; 28(1):27ā30.
Keseler IM, BonavidesMartĆnez C, ColladoVides J, GamaCastro S, Gunsalus RP, Johnson DA, Krummenacker M, Nolan LM, Paley S, Paulsen IT, et al.Ecocyc: a comprehensive view of escherichia coli biology. Nucleic Acids Res. 2009; 37(suppl 1):464ā70.
Kelley BP, Sharan R, Karp RM, Sittler T, Root DE, Stockwell BR, Ideker T. Conserved pathways within bacteria and yeast as revealed by global protein network alignment. Proc Natl Acad Sci. 2003; 100(20):11394ā99.
Tamames J. Evolution of gene order conservation in prokaryotes. Genome Biol. 2001; 2(6):0020ā1.
Yu CC, Chen YL. Mining sequential patterns from multidimensional sequence data. IEEE Trans Knowl Data Eng. 2005; 17(1):136ā40.
Vazquez A, Dobrin R, Sergi D, Eckmann JP, Oltvai Z, BarabĆ”si AL. The topological relationship between the largescale attributes and local interaction patterns of complex networks. Proc Natl Acad Sci. 2004; 101(52):17940ā5.
Kang JH, Cho KH. A novel interaction perturbation analysis reveals a comprehensive regulatory principle underlying various biochemical oscillators. BMC Syst Biol. 2017; 11(1):95.
Gao S, Chen A, Rahmani A, Zeng J, Tan M, Alhajj R, Rokne J, Demetrick D, Wei X. Multiscale modularity and motif distributional effect in metabolic networks. Curr Protein Pept Sci. 2016; 17(1):82ā92.
Vivar JC, Pemu P, McPherson R, Ghosh S. Redundancy control in pathway databases (recipa): an application for improving geneset enrichment analysis in omics studies and ābig dataā biology. Omics J Integr Biol. 2013; 17(8):414ā22.
GĆ¼ell O, SaguĆ©s F, Serrano MĆ. Essential plasticity and redundancy of metabolism unveiled by synthetic lethality analysis. PLoS Comput Biol. 2014; 10(5):1003637.
Carbonell P, Lecointre G, Faulon JL. Origins of specificity and promiscuity in metabolic networks. J Biol Chem. 2011; 286(51):43994ā4004.
Dougherty J, Kohavi R, Sahami M. Supervised and unsupervised discretization of continuous features. In: Machine Learning Proceedings 1995. Tahoe City: Elsevier: 1995. p. 194ā202.
R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2016. R Foundation for Statistical Computing. https://www.Rproject.org/.
Acknowledgements
Authors would like to thanks Dr. Angela Serra and Dr. Paola Galdi for the read back of the manuscript.
Funding
Publication of this article is partly funded by the Universities of Cambridge and Salerno.
Availability of data and materials
Algorithm and procedures: Multiomic oscillation analyses were carried out using R [54] with custom scripts which may be downloaded from GitHub repository (github.com/lodeguns/MORA).
Furthermore, an implementation of MORA and the complete procedure of computation for identifying oscillating multiomics on sequences and pathways is given, with some ad hoc plots and toy examples.
Data: In these repository files there are all the data divided into pathways under the effect of antibiotic perturbations and in standard conditions. Furthermore, the CAI, PA, PV data have been uploaded.
E.coli protein centric metabolic network: Finally, the whole E.coli integrated metabolic network obtained merging the data from EcoCyc [43] and KEGG [42], and an associated matrix for identifying operons and protein complexes are provided in the repository.
About this supplement
This article has been published as part of BMC Bioinformatics Volume 19 Supplement 7, 2018: 12th and 13th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB 2015/16). The full contents of the supplement are available online at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume19supplement7.
Author information
Authors and Affiliations
Contributions
FB PL RT conceived the study and devised the methodology. FB collected the data, built the models and wrote the code. FB PL RT wrote the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
No human, animal or plant experiments were performed in this study, and ethics committee approval was therefore not required.
Competing interests
The authors declare that they have no competing interests.
Publisherās Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional files
Additional file 1
This file describes all the technical details and procedures used for the multiomic dataintegration. In addition, some Sections show some math proofs for what is concerning the oscillating multiomics, a toy example for the multiomic relational adjacencies and some concrete examples of pathway and patterns. (PDF 767 kb)
Additional file 2
It is a PDF document which contains 66 measures of dyadic/antidyadic effects, oscillation similarity scores, and reciprocal influences computed with MORA. Data are reported in standard conditions and considering the average effect of 69 treatments. (PDF 108 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Bardozzo, F., LiĆ³, P. & Tagliaferri, R. A study on multiomic oscillations in Escherichia coli metabolic networks. BMC Bioinformatics 19 (Suppl 7), 194 (2018). https://doi.org/10.1186/s1285901821755
Published:
DOI: https://doi.org/10.1186/s1285901821755