Degeneracy measures in biologically plausible random Boolean networks

Background Degeneracy—the ability of structurally different elements to perform similar functions—is a property of many biological systems. Highly degenerate systems show resilience to perturbations and damage because the system can compensate for compromised function due to reconfiguration of the underlying network dynamics. Degeneracy thus suggests how biological systems can thrive despite changes to internal and external demands. Although degeneracy is a feature of network topologies and seems to be implicated in a wide variety of biological processes, research on degeneracy in biological networks is mostly limited to weighted networks. In this study, we test an information theoretic definition of degeneracy on random Boolean networks, frequently used to model gene regulatory networks. Random Boolean networks are discrete dynamical systems with binary connectivity and thus, these networks are well-suited for tracing information flow and the causal effects. By generating networks with random binary wiring diagrams, we test the effects of systematic lesioning of connections and perturbations of the network nodes on the degeneracy measure. Results Our analysis shows that degeneracy, on average, is the highest in networks in which ~ 20% of the connections are lesioned while 50% of the nodes are perturbed. Moreover, our results for the networks with no lesions and the fully-lesioned networks are comparable to the degeneracy measures from weighted networks, thus we show that the degeneracy measure is applicable to different networks. Conclusions Such a generalized applicability implies that degeneracy measures may be a useful tool for investigating a wide range of biological networks and, therefore, can be used to make predictions about the variety of systems’ ability to recover function. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04601-5.

A biological network with high degeneracy means that the system can show the same macroscopic behavior following a lesion even though the underlying network dynamics are significantly different. In other words, if the system is highly degenerate, after a lesion, the function can be recovered by a structurally different (i.e., performing a different function under normal conditions) component taking over a new function. For example, in the brain many different neural clusters can affect the same motor outputs, and if some of the brain areas are damaged, an alternative (non-redundant) pathway can be recruited in order to generate functionally equivalent behaviors [46][47][48][49][50]. Degeneracy thus suggests how biological systems can thrive despite changes to internal and external demands.
It has been shown that degeneracy also plays a role in complexity and evolvability of biological systems [8,10,27,28,32,36,51]. Higher levels of degeneracy correlate with an increase in the degree of both the functional integration and local segregation of a system, and therefore, higher degeneracy is accompanied by higher degree of complexity of the systems [33].While local segregation (namely, functional specialization) enables system to be flexible against environmental stress (due to diversity of functions), functional integration allows system to be robust [52][53][54]. If a component, or a group of components are compromised in a highly degenerate system, functions can be reassigned among distinct elements (that are locally segregated) while the macro-level behavior (which requires the system to be functionally integrated) is conserved. This adaptability brings an obvious advantage over the course of natural selection [46].
To measure degeneracy in systems, Tononi et al. [33] introduced a quantitative measure for neural networks (see also alternatives [55,56]) using an information theoretic approach. Information theory [57][58][59] provides a set of tools to describe how information is processed in systems. It allows us to measure the statistical (in)dependencies in terms of the information content of the components. Mutual information (MI), which is a measure provided by information theory, can also capture nonlinear dependencies(/ relationships) that are not detectable by correlation analysis [57,60]. However, direction of the interaction between the components cannot be discerned from MI alone [33,60,61]. Incorporating MI with systematic perturbations (to determine directionality), degeneracy is formalized [33] in terms of the causal effects of the changes in the state of the subsets (components and/or subgroups of components) on the system's output. If the output activity of the system is not affected by the change (e.g., perturbation) in a subset's state, then the system is highly degenerate with regard to the function that is performed by that subset. This information theoretic measure of degeneracy is, first, applied to highly abstract networks in the work by Tononi et al. [33] which is followed by applications to the weighted networks with a high degree of biological fidelity (e.g., Hodgkin-Huxley type neural networks [40] and genetic networks with epistasis [62]).
Although it has been shown that degeneracy as a network property exists at different levels of biological organization (from molecules to behavior [30,36,46]), a quantitative analysis of degeneracy at such levels is sparse and methods are individualized to specific cases (see the different versions of degeneracy measurements in other works [10,55,63]). In systems biology, information theoretic measures are widely applied to many problems [64], yet, to date, there is not a comprehensive study applying this measure for biologically realistic networks other than networks with weighted connections.
Neural networks offer one example of how biological systems can incorporate degeneracy to ensure survival after being damaged. However, other biological networks are likewise capable of recovering partial or full function following damage. For example, in between-species interaction networks a species loss can be compensated by other species contributing to ecosystem functioning [65]. Likewise, on a smaller scale, it has been shown that loss of functioning in some (non-redundant) genes has a weak or no effect on the fitness of the gene networks [8]. Although degeneracy might not be detectable under normal conditions, after perturbating or lesioning the biological networks, changes in the environment may also evoke degenerate responses ('degeneracy lifting' effect [9]). Environmental (evolutionary) pressure in receptor-signal transduction networks [55] can push the signaling pathways to reconfigure into a degenerate form.
Although degeneracy is a feature of biological gene networks, it is unclear whether models of gene networks can be analyzed using the same information theoretic approaches as used for neural network models. A GRN is a network of gene-gene interactions through their regulators that control the gene expression levels of the products (mRNAs and proteins) which, ultimately, determine the cell fate (final cell type, i.e., function of the cell) [66]. Measurements of degeneracy at the level of gene transcription control may provide insights on how functions of genes that determine the cell function, can be recovered as a consequence of the network properties (GRN topology).
Random Boolean network (RBN) models, as discrete models of GRNs, are well-suited to study degeneracy since with RBNs we can induce and trace the effects of targeted lesions while environmental/external pressure is a parameter that can be controlled over in silico experiments. Unlike neuronal networks where edges are (synaptic) weight vectors, RBNs have a static wiring diagrams [67] governed by logic equations that represent the functions of gene regulatory factors (e.g., transcription factors). Logic equations describe the underlying network architecture. For example, for a simple network of 3 genes, if gene G1 is regulated both directly by G3 and through an indirect link from G2, this architecture is represented by the logic function of "G1 = (G2 OR G3)". Likewise, if there is an inhibitory regulation of G1 through G2 while the same architecture is preserved from the network described above, this structure can be represented by the logic equation of "G1 = ((NOT G2) OR G3)". Since each state of gene expression is the direct outcome of the activity of (regulatory interactions in) the previous state, one can assess the effects of circuit architecture on gene expression levels [66]. Hence, in RBNs, it is feasible to trace the information flow at each (discrete) time step and so, causal influences.
While RBNs have been widely used in system biology to study GRNs, recent work [68] has sought to match the complexity of gene expression patterns that are observed in developmental processes (such as, mammalian cortical area development [69]) by incorporating additional biological details to simulations. This is attained by converting Boolean models to equivalent systems of differential equations that describe synergistic relations in gene regulatory processes [68]. Unlike widely-used in-silico single-cell gene expression data generators such as GeneNetWeaver [70] where simulated networks are re-constructed from a limited number of known structures (from E. coli and S. cerevisiae [70]), here, the network structure is defined by a (Boolean) ruleset that can be generated and manipulated by users [68].
In this study we test to what extent the information theoretic measure of degeneracy applies to RBNs as well as continuous models derived from RBN approaches. Furthermore, we test systematic lesions in randomly generated Boolean networks while varying the number of perturbed nodes. This enables us to explore how degeneracy quantitively changes as a function of interventions to the nodes and induced topological alterations in the networks. Results show that degeneracy measures, in most cases, can be applied to networks based on Boolean logic in addition to more typical weighted networks.

Lesioning
We first investigate the effects of systematic lesioning on measures of degeneracy in randomly wired RBNs. This is consistent with classical models of RBNs [67] where the system is deterministic and no noise term is involved. Edges between network nodes were lesioned in two ways. In type-1 lesioning, only incoming edges were lesioned incrementally while it is possible (due to the pseudo-random algorithm to generate the logic functions) that outgoing edges stayed intact (for details see "Methods"). In the second type of lesioning (hence the name type-2 lesion), we lesioned all incoming and outgoing edges for randomly chosen nodes incrementally.
For classical RBNs, both types of lesioning result in an increase on degeneracy measures ( Fig. 1). That is, removing edges from the network paradoxically increased degeneracy. One reason for this may be that progressive lesioning increases the chances of a node becoming isolated. In the context of classical RBNs, the activity of the isolated node remains constant over time. However, a constant activity profile may also be a result of a function of a (set of ) node(s) that is highly robust and yields same output over time. Therefore, the degeneracy metric might fall short in cases where the outputs of the isolated nodes remain static over time. Here, we hypothesize that degeneracy only works as a measure when all units have varying activity.
To test this hypothesis, we induced a small probability that the activity each node in the network would flip from 1 to 0, or 0 to 1, on each iteration (see "Methods" for details). Results show that the networks with type-1 lesions decrease in average degeneracy values as the cut percentage increases (Fig. 2). Similarly, for type-2 lesions, average Degeneracy, (grey area) is computed as the average MI between subsets of X and O under perturbation over increasing perturbed subset size k, in networks with b no lesions and with c 100%-cut condition. In panel a, on the x axis, the cut percentage represents the affected number of nodes (in total of 10 nodes) whose edges are lesioned given the networks. As cuts becomes larger, degeneracy increases. In panels b and c, degeneracy is calculated according to definition given in "Methods" Eq. 3

Fig. 2
Average degeneracy values compared between type-1 (green line) and type-2 (blue line) lesioning in RBNs with varying unit activity. On the x axis, the cut percentage represents the affected number of nodes (in total of 10 nodes) whose edges are lesioned given the networks. For both lesioning types, degeneracy was lowest in the 100% cut condition where edges of all nodes (10) were cut degeneracy decreases as a function of lesioning. This validates that degeneracy emerges as a network property for RBNs, but only when the activity of all nodes in the network changes over time.
Furthermore, we hypothesized that there could be a difference between the effects of lesioning types as a direct consequence of the partial lesioning (the outgoing edges are preserved) in type-1 condition which can lead some nodes to become dead ends since the activity ends in those nodes. Active nodes without incoming edges means that such nodes do not serve a function, and this eventually would result in lower degeneracy values. The comparison of two lesioning types, in Fig. 2, demonstrates that these lesioning types are not significantly different in their effects on degeneracy (f(1,10) = 0, p > 0.05).

Perturbation
Degeneracy is calculated as the area between the average MI P (X k ;O) (mutual information, MI, between the portion of entropy shared by the system for each perturbed subset k and the output O) and overall-MI (mutual information between the system and its output) for different perturbed subset sizes, k (see Eq. 3 in "Methods"). This area shows a characteristic shape of the degeneracy function: a non-zero value that declines to zero as perturbed subset size k approaches k = O, following an increase that is "higher than would be expected from a linear increase" [33]. This characteristic shape has furthermore been replicated in networks composed of in Hodgkin-Huxley neurons [40].
In weighted networks with no connections, the average (overall-) MI shows a linear increase where degeneracy is zero [33]. Here, this condition (i.e., 100%-of-edges-cut) is compared for RBNs with varying unit activity. Our results for no-edges-cut condition and the characteristic profile of degeneracy are comparable to corresponding findings in previous studies (Fig. 3). Further inspection of partial degeneracy (see "Methods" for details) values from individual simulations showed that partial degeneracy can have a negative value in some instances (only two such conditions captured here for comparison, Fig. 4a, b). Although we have observed that partial degeneracy can be negative for different cut-conditions and different sizes of perturbed subset, when MI is averaged over the simulations for all the perturbed subset sizes k, 〈MI P (X k ;O)〉, degeneracy D N (X; O) was above zero in all conditions.

Interactions of lesions and perturbations
Increases in both lesion extent and the number of nodes perturbed give rise to different metrics in terms of degeneracy, raising the question of how these two factors may interact. We therefore conducted additional simulations in which each lesion condition (0-100%, see "Methods") was crossed with each perturbation condition (k = 1-10). Figure 5a, b show how average partial degeneracy changes as a function of perturbation subset size k, given cut percentages. For both lesioning types, partial degeneracy peaks around when half of the nodes (k ~ 5) are perturbed in the system. The measure of degeneracy can detect existing isofunctionality between the different structures only when one of the structures is perturbed. When half of the nodes in the system are perturbed, we, thereby, maximize the probability of selecting/measuring the right structure for given degeneracy in cases where network structure is random or unknown.

Results for continuous models with RBN connectivity scheme (cBNs)
We apply the same perturbation and lesioning procedures to RBNs that generate continuous expression data. These networks also have initial variance due to the model (see Fig. 4 Partial degeneracy values from individual networks for each perturbed subset size k. Data from 10 (k) × 1000 simulations of networks with a no-cut condition and b 100%-cut condition details in "Methods" and Additional file 1) which is a system of stochastic differential equations (SDEs).
The comparison of the two lesioning types in cBNs (Fig. 6a), demonstrates that both conditions have similar effects on the average degeneracy, where there is a no significant difference (two-way analysis of variance, ANOVA) found between both types of lesioning (f(1,10) = 9.49, p > 0.01). However, average degeneracy varies with cut conditions. The degeneracy measures-the area between the average MI P (X k ;O) and overall-MIdemonstrate similar characteristics (Fig. 6c, d), and thus are comparable, to RBN measures. For the 100%-cut condition (Fig. 6c), degeneracy is closest to zero, consistent with findings in previous studies [33]. Likewise, partial degeneracy shows a negative value in some instances (Fig. 6f, g) although on average (MI) degeneracy was above zero for all conditions and subset sizes k. Moreover, comparable to RBNs, the peak for partial degeneracy measures is around when half of the nodes (k ~ 5) are perturbed in the system for both lesioning types (Fig. 6).

Discussion
Although a variety of types of biological networks are thought to exhibit degeneracy, previous theoretical work has primarily focused on networks with weighted connections [33,40,62]. In this study, we demonstrate that degeneracy measures are also suitable for RBNs, with some caveats. Although our simulations largely replicated previous studies investigating degeneracy in neurally-inspired networks, RBNs use Boolean logic operators rather than weighted connections to determine function. It therefore might have been the case that information-theoretic approaches developed for one class of networks might not have generalized correctly to a new class. By replicating previous findings using RBN (which are discrete systems) and cBNs, we demonstrate that information-theoretic approaches are applicable to a broad range of network types. Network and graph theoretic approaches, frequently formulated in terms of information theory, have been applied extensively to neuroscience [52-54, 61, 71-80] to predict individual differences, consequences of lesions, and ability to recover function following injury. Extending this approach to the study of GRNs opens the door to investigating the consequences of, and possible remedies for, genetic dysfunction. Because most genetic functions are performed by subsets of many components within functional modules [5,81], diseases may emerge due to disorganization of the components in these modules. Degeneracy measures can be recruited for predicting and inducing topological modifications (for example, 'rewiring of diseased modules' [81]) to achieve desired functional outcomes that have clinical significance, such as enhanced pharmaceuticals with better drug targets.
In addition to replicating previous results, we explored the impact of systematically manipulating network connectivity (lesioning) while decomposing degeneracy by size of the perturbed subset. In doing so, we identify a potential interaction between the number of perturbed nodes and the magnitude of the impact of lesions on partial network degeneracy. In networks in which ~ 20% of the connections are lesioned while 50% of the nodes are perturbed, it is observed that average partial degeneracy reaches its highest value among all other cut conditions and for all perturbed subset sizes. These are likely conditions in which sufficient connectivity (here, k = 5, ~ 10%cut for RBNs and ~ 20%-cut for cBNs) allows for the expression of degenerate structures without compromising their function, whereas more lesioning would diminish both primary and degenerate structures, and more perturbation would confound the functions of the nodes. Likewise, if perturbation (of the number of nodes) is smaller, not all possible degenerate structures might be expressed in the network or, when the lesioning is less, degenerate structures might be unobservable since most of the primary structures are intact.
In the literature it has been shown that additional damage (gene/node deletions) can restore the function of previously compromised (metabolic) networks [82][83][84].
Here, we show that progressively lesioning a network results in reduced partial degeneracy across all perturbation subsets. However, when a perturbation subset includes approximately half of the nodes in the network, we observe an increase in partial degeneracy relative to larger or smaller perturbation subsets. This finding suggests that it might be possible to determine an optimum degree of lesioning and perturbation given a network to achieve higher degeneracy in the systems. Thus, partial degeneracy measures might be helpful to develop strategies to predict how to recover the function after damage.
As originally conceived, degeneracy was intended to capture the idea that identical functions could be carried out by distinct network structures. Intuitively, therefore, degeneracy would seem to have a lower bound at zero-in a network with no degeneracy, all structures would serve their own individual functions, and perturbation of those structures would disrupt network output related to the function served.
Although on average degeneracy in our simulations tended to be above or equal to zero, we observed individual simulations in which partial degeneracy values were below zero. In previous studies [40,62], negative degeneracy has been observed especially for network models with increased biological fidelity.
One possible reason for the observation of negative degeneracy may be that the information-theoretic measure for degeneracy was originally developed for and tested on neural networks with no initial variance. As the biological fidelity of the models (thus, inherent variance in the systems) increases, for some conditions (lower coupling and lower connection probability [40] and networks with lower complexity [62]) negative degeneracy has been shown. However, we have not observed such effects on overall degeneracy measurements where 1000 MI values for each possible subset size k were averaged across random networks. Mathematically, degeneracy gets a negative value when the portion (k/n) of the MI between the whole system (n) and the output sheet (n/2) is higher than the average of the MI between the (perturbed) subset of the system (k) and the output sheet (n/2). However, the biological meaning/equivalence of negative degeneracy remains unclear. For studying more biologically realistic complex networks, adjustments in the tools for quantifying degeneracy may be needed.
Another challenge for degeneracy metrics is that, when the units have no varying activity over time due to complete isolation from rest of the network, this may result in higher partial degeneracy values. In that case, the degeneracy measure cannot accommodate the difference between robustness of a node's output and time-invariant output of an isolated node. To account for such an effect, the 'function outcome' definition in degeneracy tools may need to be revised.
Furthermore, in this study, the total number of nodes given a network was set to 10 which is comparable to the network sizes studied in previous works on degeneracy [33,40,62]. However real systems such as biological networks often consist of thousands of connected nodes. When we scale up our network simulations to networks with 100 nodes (while other parameters remained the same), we observed that the entropy calculated for conditions in which the number of perturbed units is greater than around 20 reaches an upper bound (see Additional file 1: Text 4 for simulation results).
Since calculating degeneracy depends on accurately calculating entropy over all possible perturbed subset sizes, the solution seems to be to increase the number of timesteps in our simulations. However, we note that when the number of perturbed units is, for example, 100, there are 2 100 (~ 10 29 ) unique states, indicating the absolute minimum number of timesteps needed to visit each network state once (and accurately estimating entropy would likely require simulations of length 10 30 timesteps). Simply put, it is unfeasible to calculate degeneracy for even moderately sized networks in the way described by previous authors. We acknowledge this as a weakness of the degeneracy measures that applies both to RBNs as well as weighted network, and suggest that future work should aim at developing definitions of degeneracy that can be applied to larger networks.

Network architecture
RBNs were initially proposed as simplified models for gene regulatory networks by Kauffmann [85] where network nodes represent the genes, and the edges represent the regulatory functions. A RBN constitutes a discrete dynamical system that has N nodes, each with K incoming edges (hence, also referred to as N-K models). Each node (gene) can be ON or OFF (1 or 0); a network of N binary nodes therefore has 2 N distinct states [85]. This system is state determined [85] according to Boolean functions that are assigned to each node randomly (from 2 2^K possible functions [86]) where each node has a minimum of zero to a maximum of N outputs. Such a state-space allows random network configurations which often leads to nonlinear dynamics. The total number of nodes representing the genes, here, is N = 10, thus there are 2 10 possible states. In RBNs, the state of the nodes in the network can be updated synchronously or asynchronously in discrete time steps. In this study, for simplicity purposes, a synchronous update rule is chosen.
RBNs have a well-defined function mapping scheme through logic (Boolean) operators which constitute the rules for connections that control the state of gene regulators. In our simulations, RBNs have different combinations of the following operators: AND, OR, NOT, XOR, COPY. For cBNs, operators AND NOT, OR, NOT, COPY are randomly placed to generate rulesets (functions). All operators have equal probability to be assigned (see Additional file 1: Text 1 and Additional file 1: Text 3 for details). Incoming edges are randomly distributed for each node with the condition that each node has at least one (thereby, connectivity is preserved) and at most two (more than one Boolean operator) edges mapped to the other nodes. Thus, outgoing edges are assigned to the nodes in a completely random fashion (allowing for the emergence of highly regulated nodes).
Although the networks that are simulated in this study carry the general characteristics of N-K models-as they are binary networks connected with Boolean functions, these networks differ in some aspects. For example, N is limited to 10, in order to generate comparable conditions with previous studies on degeneracy measures ( [33,40,56]). The number of edges for each rule, is a randomly assigned value that can be either 1 or 2 while a node can have many outgoing edges up to N nodes. Finally, there is no constraint on the expected density of truth values (namely, bias parameter p).

Network lesioning
Degeneracy, as a strategy (or design principle [10,21,87]) for networks to recover their function, refers to the rearrangement of (structurally different) components in a way that function/output remains the same even after a damage. In a network with high degeneracy, there are many possible network reconfigurations that can produce/recover the function. To test the potential factors that give rise to (higher/lower) degeneracy in networks, here, we induce interventions to the systems at the network-level by lesioning the edges.
Two different lesions were introduced to the synthetic networks. In type-1 lesioning, all incoming edges from randomly chosen nodes were cut while the outgoing edges were preserved. In the second type of lesioning, all incoming and outgoing edges were cut from randomly selected nodes given the percentage of total lesioned edges. In both types of lesioning, the edges are lesioned in increments of ten percent of the total number of the nodes given a network. For example, in type-1 30% cut condition, we have lesioned all the incoming edges of the 3 randomly chosen nodes given a network of 10 nodes. Likewise, in type-2 30% cut condition, all the edges (incoming and outgoing) of 3 randomly chosen nodes (out of 10 nodes total in a network) were lesioned. By 100%-cut condition (in both lesioning types), we refer to networks where no node is connected to the other, and so the nodes are isolated.
For both lesioning types, 0%-cut condition refers to networks that are not lesioned, yet the edges are randomly disturbed (according to the method that is defined previously). This may lead some nodes to not have any edges due to random assignments of the rules, therefore, mimicking a (partially) lesioned condition.

Biologically realistic random Boolean networks: discrete to continuous
Boolean functions describe how the states of the regulators control the state of the target genes [68]. In our study, Boolean functions are randomly generated for each simulation with incremental lesioning. To execute numerical simulations, we used two types of RBNs: RBNs (classical model), and cBNs (BoolODE [84]).
Classical RBNs are discrete dynamical systems where system output consists of binary values. To induce perturbation, we set the bit-flip probability [88][89][90][91][92] to 0.25 for the perturbed subset. For RBNs with varying unit activity, besides the perturbation subset, all units have a bit-flip probability of 0.05 at a random time step. The total number of time steps in a simulation is 1000, and we discard the first 10 time points as a burn-in period.
The BoolODE pipeline by Pratapa et al. [68], similar to classical RBN simulations, takes input files that are randomly generated Boolean networks. Then, BoolODE systematically converts a random Boolean network into a system of SDEs that is a continuous model of gene regulation (for model specifications see Additional file 1: Text 2 and Additional file 1: Text 3). Time points in the numerical solution result in vectors of gene expression values that correspond to individual cells. That means for every analysis, each sampled time point is from a cell [68], and in this study, we sample from 990 time points (1000-10, first 10 timepoints treated as burn-in) for each gene in a single simulation and total of 1000 simulations are run for each lesioning percentage increment of 10 s (from no cut condition to all 10 genes cut) which makes 10,000 simulations for each type of lesioning and thus, 2 (lesioning type) × 10 (k subset of perturbed genes) × 11 (cut conditions, no-cut condition inclusive) × 10 synthetic cells with random gene regulatory mechanisms.

Quantification of degeneracy in neural networks
To measure degeneracy, we used the mathematical framework described by Tononi et al. [33]. In this framework, degeneracy is characterized in terms of the average mutual information between subsets of elements within a system and an output sheet (which is also a subset of network X). The output sheet is a set of randomly chosen nodes in a network and its activity is a result of the interactions the other nodes in the system. Thus, activity in the output sheet represents the behavior or the response of the whole system.
From information theory, entropy (Shannon entropy with log base 2 for binary representation) is calculated from probability density functions for subsets of X (Eq. 1). Then mutual information that measures the portion of entropy shared by the system subset To measure degeneracy in the network, we need to determine the effects of the (subset of ) element(s) on the entropy of the output-the behavior of the network. Since mutual information does not capture direction, however, mere calculation of mutual information is not enough to determine the contribution of the elements to the output of the system. To overcome this, perturbations (variance) are injected to the system. If no initial variance is assumed in the system, the value of mutual information between the network and the output is zero before any perturbation [33]. Variance (perturbation) is injected as uncorrelated random noise to each subset k.
Under such perturbations, mutual information of the system is computed as in EQ3 and this procedure is repeated for all subsets of sizes 1 ≤ k ≤ n. Then, degeneracy D N (X;O) of X with respect to O can be calculated as: MI P (X;O) is MI for all elements to the output sheet, and 〈MI P (X j k ;O)〉 is the average of the contribution of each perturbed subset size k to the output sheet.

Quantification of degeneracy in cBNs
Biological RBN simulations (via BoolODE) results in continuous unit activity in terms of gene expression vectors since the simulated networks are translated into nonlinear dynamical system (Additional file 1: Text 2 and Additional file 1: Text 3, Additional file 1: Figure 1). To quantify degeneracy in RBNs we therefore discretize the gene expression vectors by taking the median activity for a unit. Activity that is above the median is set to 1, and activity below the median is set to 0.
We apply the degeneracy measures to the discretized gene expression vectors generated from the simulations. In our simulations, perturbations are systematically injected to the subset size k of genes as normally distributed (with mean = 0, and standard deviation = 0.01) random noise through the governing SDE. The number of elements (namely, the genes) is n = 10 for all simulations with output sheet consisted of the activity of O = n/2 = 5 elements, that is also randomized for each trial.

Partial degeneracy in random networks
Degeneracy can be measured by alternate ways that are mathematically equivalent (see other definitions in [33]). The formal definition that we use in this study requires averaging over every MI measured between each node (unit, j) which are incrementally perturbed (from k = 1 to k = n) and the output sheet for a given network structure (〈MI P (X j k ;O)〉). However, in case where all the networks are randomly generated and the output . Perturbation (represented as syringes) of the nodes in boxes with k notation, is injected as a variance (uncorrelated noise). The box with O notation represents output sheet that is also consisted of randomly chosen set of (n/2 = 5) nodes (dark blue circles). For each network, MI is calculated for perturbed set size k and the output sheet O, for all subset sizes of perturbed set noted as j