Superlevel sets
Definition 4.1
Consider an arbitrary function \(f:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\). For a given \(u\in {\mathbb {R}}\), a superlevel set of f is a set of the form
$$\begin{aligned}U_u(f)=\{x\in {\mathbb {R}}^n\mid f(x)\ge u\}.\end{aligned}$$
When \(u=1\) we drop the index and write only U(f). Naturally, a polynomial superlevel set is a superlevel set of a polynomial.
Polynomial sublevel sets are defined similarly to Definition 4.1, the only difference being the direction of the inequality. However, in this paper we only focus on superlevel sets. For \(d\in {\mathbb {Z}}_{\ge 0}\) let \(P_d\) denote the set of polynomials of total degree at most d. A sum of squares (SOS) polynomial of degree 2d is a polynomial \(p\in P_{2d}\) such that there exist \(p_1,\dots ,p_m\in P_d\) with \(p=\sum _{i=1}^m p_i^2\). We denote the set of SOS polynomials of degree at most 2d by \(\Sigma _{2d}\).
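As a concrete illustration of the SOS definition (with a small quadratic chosen for this sketch, not taken from the text), a polynomial is SOS exactly when it admits a positive semidefinite Gram matrix in a monomial basis; a Cholesky factor of that matrix then gives the explicit squares. A minimal numerical check in Python:

```python
import numpy as np

# Illustrative example: p(x, y) = 2x^2 + 2xy + y^2.
# In the monomial basis z = (x, y) we have p = z^T G z with Gram matrix G.
G = np.array([[2.0, 1.0],
              [1.0, 1.0]])

# G is positive semidefinite, hence p is a sum of squares.
eigenvalues = np.linalg.eigvalsh(G)
assert np.all(eigenvalues >= 0)

# A Cholesky factor G = L L^T gives the squares: p = sum_i (L^T z)_i^2.
L = np.linalg.cholesky(G)

# Sanity check at a sample point (x, y) = (0.7, -1.3):
z = np.array([0.7, -1.3])
p_direct = 2 * z[0]**2 + 2 * z[0] * z[1] + z[1]**2
p_sos = np.sum((L.T @ z) ** 2)
```

For larger polynomials this Gram-matrix search is exactly the semidefinite program that solvers such as SeDuMi handle, as used later in this section.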
Theorem 4.2
([17, Theorem 2]) Let \(B\subseteq {\mathbb {R}}^n\) be a compact set and K a closed subset of B. For \(d\in {\mathbb {N}}\) define
$$\begin{aligned}S_d=\{p\in P_d\mid p\ge 0\text { on }B,\;p\ge 1\text { on }K\}.\end{aligned}$$
Then there exists a polynomial \(p_d\in S_d\) such that
$$\begin{aligned} \int _Bp_d(x)dx=\inf \Big \{\int _Bp(x)dx\mid p\in S_d\Big \}. \end{aligned}$$
Furthermore \(\lim _{d\rightarrow \infty }{\mathrm{Vol}}\big (U(p_d)\setminus K\big )=0\).
Given a pair (B, K), where \(B\subseteq {\mathbb {R}}^n\) is a compact set and \(K\subseteq B\) a closed set, and given \(d\in {\mathbb {N}}\), we call the polynomial superlevel set U(p) (with p being the polynomial \(p_d\in S_d\) found in Theorem 4.2) the PSS representation of \(K\subseteq B\) of degree d. When K is a semialgebraic set, one can find \(p_d\) numerically using a minimization problem subject to some positivity constraints [18, Equation 13].
Let \(B=[a_B,b_B]\) and \(K_i=[a_{K_i},b_{K_i}]\), \(i=1,\dots ,m\), be hyperrectangles in \({\mathbb {R}}^n\) such that \(K:=\cup _{i=1}^mK_i\subseteq B\). By solving a similar optimization problem it is possible to find the PSS representation of \(K\subseteq B\). Let \(d\in {\mathbb {N}}\). The goal is to find the coefficients of a polynomial of degree d such that \(\int _Bp(x)dx\) is minimized subject to some conditions. Before presenting the constraints, let us look at the target function. A polynomial p(x) of degree d can be written as \(\sum _{\alpha \in {\mathbb {N}}_d^n}c_\alpha x^\alpha\), where \({\mathbb {N}}_d^n\) is the set of \(\alpha =(\alpha _1,\dots ,\alpha _n)\in {\mathbb {Z}}_{\ge 0}^n\) such that \(\sum _{i=1}^n\alpha _i\le d\). Now the integral can be simplified as below.
$$\begin{aligned} \int _Bp(x)dx&= \int _B\Big (\sum _{\alpha \in {\mathbb {N}}_d^n}c_\alpha x^\alpha \Big )dx \\ &= \sum _{\alpha \in {\mathbb {N}}_d^n}c_\alpha \int _Bx^\alpha dx \\ &= \sum _{\alpha \in {\mathbb {N}}_d^n}\Big (\int _Bx^\alpha dx\Big )c_\alpha . \end{aligned}$$
Since the integrals \(\int _Bx^\alpha dx\) are constant real numbers independent of the coefficients of the polynomial, the target function is a linear function of the coefficients of p(x), which are the variables of the optimization problem.
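Since B is a hyperrectangle, each moment \(\int _Bx^\alpha dx\) factors into one-dimensional integrals, \(\prod _j (b_j^{\alpha _j+1}-a_j^{\alpha _j+1})/(\alpha _j+1)\). A short Python sketch computing the objective vector (the box \([0,1]\times [0,2]\) below is only an illustration):

```python
from itertools import product

def box_moment(a, b, alpha):
    """Integral of x^alpha over the hyperrectangle [a, b],
    computed as a product of one-dimensional integrals."""
    m = 1.0
    for aj, bj, ej in zip(a, b, alpha):
        m *= (bj**(ej + 1) - aj**(ej + 1)) / (ej + 1)
    return m

def exponents(n, d):
    """All alpha in N_d^n, i.e. multi-indices with |alpha| <= d."""
    return [alpha for alpha in product(range(d + 1), repeat=n)
            if sum(alpha) <= d]

# Objective coefficients for degree-2 polynomials on B = [0,1] x [0,2]:
a, b, d = (0.0, 0.0), (1.0, 2.0), 2
obj = {alpha: box_moment(a, b, alpha) for alpha in exponents(2, d)}
# e.g. the coefficient of c_(1,1) is int_B x1*x2 dx = (1/2) * 2 = 1.
```

The dictionary `obj` is exactly the linear objective described above, one entry per monomial \(x^\alpha\).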
Now let us look at the constraints. First of all p(x) has to be nonnegative on B. This can be enforced by letting
$$ p(x)-\sum _{j=1}^ns_{B,j}(x)\big (x_j-a_{B,j}\big )\big (b_{B,j}-x_j\big )\in \Sigma _{2r}, \quad s_{B,j}\in \Sigma _{2r-2},\;j=1,\dots ,n, $$
(2)
where \(r=\lfloor \frac{d}{2}\rfloor\) is the largest integer less than or equal to \(\frac{d}{2}\). Secondly we need \(p(x)\ge 1\) on K, in other words \(p(x)-1\ge 0\) on K. This holds if and only if \(p(x)-1\ge 0\) on each \(K_i\). Therefore for every \(i=1,\dots ,m\) one more constraint of the shape (2) has to be added:
$$ p(x)-1-\sum _{j=1}^ns_{K_i,j}(x)\big (x_j-a_{K_i,j}\big )\big (b_{K_i,j}-x_j\big )\in \Sigma _{2r}, \quad s_{K_i,j}\in \Sigma _{2r-2},\;j=1,\dots ,n. $$
Recall Definition 2.3: the multistationarity region of a network is in fact a superlevel set, \(U_2(\Phi _f^0)\). The goal is to find a PSS representation of the set \(U_2(\Phi _f^0)\). One way to accomplish this is to find a rectangular representation of the multistationarity region and then solve the SOS optimization problem described above. The next example illustrates this idea. To tackle it we use a Matlab toolbox called YALMIP [31, 32], which accepts an SOS optimization problem, processes it and passes it to an external solver. For the solver used by YALMIP, we chose SeDuMi [33].
Examples
We continue with the example from Fig. 3. Consider the rectangular representation of the multistationarity region of that example given in Fig. 4c. To find the PSS representation of this set, we let \(B=[(0.0005,0),(0.0010,2)]\) and K be the union of rectangles \(K_i\)s such that their associated number is greater than or equal to 2. From the 100 subrectangles of B, 9 of them satisfy this condition. These subrectangles are colored orange in Fig. 5. We use the YALMIP and SeDuMi packages of Matlab to solve the SOS optimization discussed before this example. To report the computation time we add the two times reported in the output of YALMIP: the “yalmiptime” and “solvertime”. It takes about 2 s to get the coefficients of the polynomial p of the PSS representation of degree 2. Figure 5 shows the plot of U(p).
Unfortunately the same code does not produce a better approximation when we increase d, the degree of p, from 2 to 4, 8 or 16; in these cases the Matlab output gives figures similar to Fig. 5.
Consider another gene regulatory example from [19, Chapter 2]. To avoid lengthening the text, we only reproduce the system of equations needed to study the multistationarity of the network:
$$\begin{aligned} \begin{array}{ll} k_1x_7x_5-k_5x_1=0 &\quad k_2x_8x_6-k_6x_2=0\\ k_3x_1-k_7x_3=0 &\quad k_4x_2-k_8x_4=0\\ k_9x_7x_4-k_{11}x_9=0 &\quad k_{10}x_8x_3-k_{12}x_{10}=0\\ k_{13}x_9x_4-k_{15}x_{11}=0 &\quad k_{14}x_{10}x_3-k_{16}x_{12}=0\\ x_5=k_{17} &\quad x_6=k_{18}\\ x_7+x_9+x_{11}=k_{19} &\quad x_8+x_{10}+x_{12}=k_{20}. \end{array} \end{aligned}$$
(3)
We fix all parameters other than \(k_7\) and \(k_8\) to the following values coming from Equation (2.10) of [19, Chapter 2]:
$$\begin{aligned} \begin{array}{l} k_1=k_2=k_3=k_4=1,\;k_5=0.0082,\;k_6=0.0149,\;\\ k_9=k_{10}=0.01,\;k_{11}=k_{12}=10000,\;k_{13}=2,\;\\ k_{14}=25,\;k_{15}=1,\;k_{16}=9,\;k_{17}=k_{18}=k_{19}=1,\;\\ k_{20}=4. \end{array} \end{aligned}$$
(4)
We reproduced the rectangular representation of the multistationarity region of this network in Matlab, as shown in Fig. 6a. Of the 100 subrectangles in total, for 28 the average number of steady states is greater than or equal to 2. Using YALMIP and SeDuMi, it took between 1 and 2 s to get the polynomials of the PSS representations of degrees 2, 4 and 8, shown in Fig. 6b–d respectively. For this example, the PSS approximation of degree 4 looks different from that of degree 2, but for degree 8 the plot looks similar to degree 4.
Advantages of PSS representation over rectangular representation
It is natural to ask why one should find a PSS representation of the multistationarity region from the rectangular representation when one already has the rectangular representation. Let \(B\subseteq {\mathbb {R}}^r\) be the parameter region, in the form of a hyperrectangle, and \(K\subseteq B\) the multistationarity region. In the rectangular representation we have \(K\simeq \cup _{i=1}^m K_i\) where the \(K_i=[a_{K_i},b_{K_i}]\) are hyperrectangles. In the PSS representation we have \(K\simeq U(p)\) where p is a polynomial of degree d.

1
When \(r\ge 4\), plotting K is impossible. In order to save or show the rectangular representation one needs a matrix of size \(m\times 2r\), where each row stands for one \(K_i\), the first r columns hold the coordinates of the point \(a_{K_i}\) and the second r columns the coordinates of the point \(b_{K_i}\). However, for the PSS representation one needs only a vector of size \(\left( {\begin{array}{c}r+d\\ r\end{array}}\right) =\sum _{i=0}^d\left( {\begin{array}{c}r-1+i\\ r-1\end{array}}\right)\), where \(\left( {\begin{array}{c}r-1+i\\ r-1\end{array}}\right)\) entries are the coefficients of the terms of degree i. The terms are ordered from smaller total degree to larger, and terms of the same total degree are ordered lexicographically.

2
To test if a point \(k^\star \in B\) belongs to K using the rectangular representation one should check m conditions of the form \(k^\star \in K_i\), which means verifying an inequality on each coordinate of the point, i.e. \(a_{K_i,j}\le k^\star _j\le b_{K_i,j}\). If one of the conditions \(k^\star \in K_i\) is satisfied, then there is no need to check the rest; otherwise all must fail to conclude that \(k^\star \not \in K\). However, using the PSS representation one needs to check only one condition of an evaluation form, i.e. \(p(k^\star )\ge 1\).

3
Recall from the last paragraph of the “Approximation by sampling” section that parameters near the boundary of the multistationarity region are not suitable choices for an experimentalist. To check the distance of a point \(k^\star \in B\setminus K\) to the boundary of K using the rectangular representation, one should find the distance of \(k^\star\) from the boundary of each \(K_i\) and then take the minimum. However, using the PSS representation, in both cases \(k^\star \in B\setminus K\) and \(k^\star \in K\), one just needs to find the distance of \(k^\star\) from the algebraic set defined by \(p(k)-1=0\), for example by Lagrange multipliers, as in the next section.
To conclude, if \(\left( {\begin{array}{c}r+d\\ r\end{array}}\right)\) is considerably smaller than 2mr, then storing the PSS representation instead of the initial rectangular representation will save memory without losing information about the multistationarity region.
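The membership test of item 2 can be sketched in a few lines of Python; the rectangles and the quadratic p below are toy data (not from the text), chosen so that the two tests agree on the probe points:

```python
def in_rectangular(kstar, rects):
    """Membership scan over rectangles K_i = [a_i, b_i]; early exit
    as soon as one rectangle contains the point."""
    return any(all(aj <= kj <= bj for kj, aj, bj in zip(kstar, a, b))
               for a, b in rects)

def in_pss(kstar, p):
    """PSS membership: a single polynomial evaluation p(k*) >= 1."""
    return p(kstar) >= 1.0

# Toy data: two rectangles and an illustrative quadratic polynomial.
rects = [((0.0, 0.0), (0.5, 0.5)), ((0.5, 0.5), (1.0, 1.0))]
p = lambda k: 2.0 - 4.0 * ((k[0] - 0.5) ** 2 + (k[1] - 0.5) ** 2)

inside = (0.25, 0.25)   # lies in the first rectangle, and p = 1.5 >= 1
outside = (0.1, 0.9)    # lies in neither rectangle, and p = 0.72 < 1
```

Note the rectangular test costs up to 2rm comparisons, while the PSS test is one polynomial evaluation regardless of m.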
Approximating the distance of parameter point from the boundary
To illustrate how to approximate the distance of a parameter point from the boundary of the multistationarity region using a PSS representation, we continue with the gene regulatory network example of the system (3).
Let p be the polynomial of degree 2 in the two variables \(k_7\) and \(k_8\) corresponding to U(p) in Fig. 6b. We approximate the distance of the point \(k^\star =(0.08,0.02)\) from the boundary of the multistationarity region by the distance of \(k^\star\) from the algebraic set defined by \(p(k_7,k_8)-1=0\). This is equivalent to minimizing the Euclidean distance of a point \((k_7,k_8)\) from \(k^\star\) subject to the constraint \((k_7,k_8)\in L_1(p)\), the level set \(p(k_7,k_8)=1\). The target function is \(\sqrt{(k_7-0.08)^2+(k_8-0.02)^2}\), which is minimized if and only if \((k_7-0.08)^2+(k_8-0.02)^2\) is minimized. An elementary way to solve this minimization problem is the method of Lagrange multipliers [34, Chapter 7, Theorem 1.13]. Define
$$ F(k_7,k_8,\lambda )= (k_7-0.08)^2+(k_8-0.02)^2 + \lambda \big (p(k_7,k_8)-1\big ). $$
Now we must find the critical points of \(F(k_7,k_8,\lambda )\), i.e. solve the system of equations \(\frac{\partial F}{\partial k_7}=\frac{\partial F}{\partial k_8}=\frac{\partial F}{\partial \lambda }=0\). It takes 0.167 s to solve this system of equations by the solve command in Maple. It has 4 solutions, of which 2 belong to the rectangle \(B=[(0,0),(0.1,0.1)]\), and the minimum of the target function is attained at the point (0.04499222669, 0.04161251428). The distance of this point from \(k^\star\) is 0.04114176669.
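The same critical-point computation can be reproduced outside Maple. The sketch below uses SymPy's solve on an illustrative stand-in polynomial (the unit circle \(p=k_7^2+k_8^2\), not the polynomial of Fig. 6b), where the distance from (2, 0) to the level set \(p=1\) is known to be exactly 1:

```python
import sympy as sp

k7, k8, lam = sp.symbols('k7 k8 lam', real=True)

# Illustrative stand-in for the PSS polynomial: p = k7^2 + k8^2,
# so the level set p = 1 is the unit circle.
p = k7**2 + k8**2
kstar = (2, 0)

# Lagrange function F and its critical points.
F = (k7 - kstar[0])**2 + (k8 - kstar[1])**2 + lam * (p - 1)
critical = sp.solve([sp.diff(F, v) for v in (k7, k8, lam)],
                    [k7, k8, lam], dict=True)

# Pick the critical point minimising the squared distance.
sq = (k7 - kstar[0])**2 + (k8 - kstar[1])**2
best = min(critical, key=lambda s: sq.subs(s))
distance = sp.sqrt(sq.subs(best))
```

Here the system has two critical points, \((\pm 1, 0)\), and the minimum is attained at (1, 0) with distance 1, matching the geometric expectation.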
Constructing a PSS representation from a sampling representation
It is not necessary to have a rectangular representation to get the PSS representation. Let \(B=[a_B,b_B]\) be a hyperrectangle and \(K=\{a^{(1)},\dots ,a^{(m)}\}\) a finite set. Let \(d\in {\mathbb {N}}\); the goal is to find the coefficients of a polynomial of degree d such that \(\int _Bp(x)dx\) is minimized subject to some conditions. We already saw that the target function is linear. The constraint \(p\ge 1\) on K can be enforced by \(p(a^{(i)})\ge 1\) for every i, which are linear constraints. The positivity of p on B can be enforced by Eq. (2), or by choosing a large enough number of random points a from B and imposing the constraint \(p(a)>0\). The latter idea makes the problem solvable by any common linear programming tool. However, here we still use Eq. (2).
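The linear-programming variant just mentioned can be sketched as follows; everything here is synthetic (degree 2, \(B=[0,1]^2\), K sampled from a disk, strict positivity relaxed to \(p(a)\ge 0\), and a coefficient box to keep the LP bounded), so this is an illustration of the idea rather than the paper's computation:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Monomial exponents for degree-2 polynomials in 2 variables.
alphas = [(i, j) for i in range(3) for j in range(3) if i + j <= 2]

def row(point):
    """Evaluation row: dot(row, c) = p(point)."""
    x, y = point
    return [x**i * y**j for i, j in alphas]

# Objective: integral of p over B = [0,1]^2 is linear in c.
obj = [1.0 / ((i + 1) * (j + 1)) for i, j in alphas]

# Synthetic "multistationarity" samples K: points in a disk (toy data).
K = [pt for pt in rng.random((400, 2))
     if (pt[0] - 0.5)**2 + (pt[1] - 0.5)**2 <= 0.1]
Bpts = rng.random((300, 2))          # positivity enforced on samples

# linprog solves min obj.c subject to A_ub @ c <= b_ub:
#   p(a) >= 1 on K            ->  -row(a) . c <= -1
#   p(b) >= 0 on B samples    ->  -row(b) . c <=  0  (strict > relaxed)
A_ub = [[-v for v in row(a)] for a in K] + \
       [[-v for v in row(b)] for b in Bpts]
b_ub = [-1.0] * len(K) + [0.0] * len(Bpts)

res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
              bounds=(-100, 100))    # coefficient box keeps the LP bounded
```

After solving, `res.x` holds the coefficients \(c_\alpha\); by construction the resulting p is at least 1 on every sampled K point and nonnegative on the sampled B points.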
Let us illustrate this with our ongoing example. Consider the sampling representation of the multistationarity region of the network of Example 3.4 given in Fig. 4b. To find the PSS representation of this set, we let \(B=[(0.0005,0),(0.001,2)]\) and K be the set of points for which the system \(f_k(x)=0\) had more than one positive solution. Of the 1000 points, 78 are parameter choices where the network has three steady states. Using the YALMIP package of Matlab, it takes less than a second to get the coefficients of each of the polynomials p of the PSS representations of degrees 2, 6 and 10. Figure 7a–c shows the plots of U(p) for degrees 2, 6 and 10 respectively. The plot for degree 6 actually looks worse than the plot for degree 2 (further away from the actual result in Fig. 4a), although the one for degree 10 looks a little better. In all these cases YALMIP finished the computations with the message ‘Numerical problems (SeDuMi)’, indicating that the solver found the problem to be numerically ill-posed. Rescaling the parameter region of interest, B, to [(0, 0), (1, 1)] and then transforming the PSS polynomial back to the original B allows better PSS approximations via YALMIP. For degrees 2 and 6 the numerical problem message is avoided, but for degree 10 it remains. The results are shown in Fig. 7d–f.
To demonstrate that the number of free parameters need not be 2 to be able to compute the PSS representation, we repeated the above process with all 8 parameters of the system being free in the following hyperrectangle:
$$ B= [\, (1, 0, 0.0005, 0, 1, 1, 40, 0), (4, 2, 0.001, 2, 4, 3, 50, 2) \,]. $$
Solving the system at 1000 random points chosen uniformly from B takes about the same amount of time as in the previous case, where only 2 of the coordinates varied. It took about 1.5 s to compute the 45 coefficients of the polynomial of the PSS representation of degree 2. The polynomial found is the following.
$$\begin{aligned} p=&- 0.0232593410 k_{1}^{2} - 0.2965863774 k_{2}^{2}+ 559.9760177000 k_{3}^{2}\\ &- 0.0618678914 k_{4}^{2}+ 0.0098179292 k_{5}^{2} - 0.0061375489 k_{6}^{2}\\ &- 0.0036871825 k_{7}^{2} - 0.0716829116 k_{8}^{2} - 0.0110701085 k_{1} k_{2}\\ &- 5.3373104650 k_{1} k_{3}+ 0.0186292240 k_{1} k_{4}+ 12.5539028600 k_{2} k_{3}\\ &+ 0.0051935911 k_{1} k_{5} - 0.0820767495 k_{2} k_{4}+ 0.0250525965 k_{1} k_{6}\\ &- 0.0649618436 k_{2} k_{5} - 8.2165668850 k_{3} k_{4}+ 0.0092355627 k_{1} k_{7}\\ &- 0.0310047257 k_{2} k_{6} - 4.2926137530 k_{3} k_{5} - 0.0527315151 k_{1} k_{8}\\ &+ 0.0017547696 k_{2} k_{7} - 2.7647028150 k_{3} k_{6}+ 0.0288291263 k_{4} k_{5}\\ &+ 0.0785940967 k_{2} k_{8} - 0.4049844472 k_{3} k_{7}+ 0.0536360086 k_{4} k_{6}\\ &+ 12.6014929300 k_{3} k_{8} - 0.0091147734 k_{4} k_{7}+ 0.0198800552 k_{5} k_{6}\\ &- 0.0567174819 k_{4} k_{8}+ 0.0009448114 k_{5} k_{7} - 0.0190068615 k_{5} k_{8}\\ &+ 0.0102920949 k_{6} k_{7}+ 0.0316830973 k_{6} k_{8}+ 0.0022468043 k_{7} k_{8}\\ &- 0.6043381476 k_{1}+ 0.7152344469 k_{2}+ 0.3287484744 k_{4}\\ &+ 36.3811699100 k_{3} - 0.0617318336 k_{5} - 0.5940226892 k_{6}\\ &+ 0.2987521874 k_{7}+ 0.2558883329 k_{8} - 5.0441247030. \end{aligned}$$
Advantages of PSS representation over the sampling representation
Let us address why one should find a PSS representation of the multistationarity region using the sampling representation if one already has a sampling representation. Let \(B\subseteq {\mathbb {R}}^r\) be the parameter region of the form of a hyperrectangle, \(K\subseteq B\) be the multistationarity region. In the sampling representation we have \(\{a^{(1)},\dots ,a^{(m)}\}\subseteq K\). In the PSS representation we have \(K\simeq U(p)\) where p is a polynomial of degree d.

1
When \(r\ge 4\), plotting K is impossible. In order to save or show the sampling representation one needs a matrix of size \(m \times r\), where each row stands for one point \(a^{(i)}\) and the columns correspond to the coordinates of the points. However, for the PSS representation one needs only a vector of size \(\left( {\begin{array}{c}r+d\\ r\end{array}}\right)\), as explained in the first item of the “Advantages of PSS representation over rectangular representation” section.

2
Testing whether a point \(k^\star \in B\) belongs to K using the sampling representation is not a straightforward task. However, using the PSS representation one needs to verify only one condition of the evaluation form, \(p(k^\star )\ge 1\).

3
To compute the distance of a point \(k^\star \in B\) to the boundary of K using the sampling representation, if \(k^\star \not \in K\), one should compute the distance of \(k^\star\) from each point in the sampling representation of K and then take the minimum. Using the PSS representation, whether or not \(k^\star \in K\), one just needs to find the distance of \(k^\star\) from the algebraic set defined by \(p(k)-1=0\).
In a typical example from CRN theory, r is usually much larger than 2, and therefore item 1 is particularly important. When \(\left( {\begin{array}{c}r+d\\ r\end{array}}\right)\) is lower than rm, one can use less memory by saving the PSS representation instead of keeping all the points of the sampling representation in memory. Further, as items 2 and 3 show, this does not cause a loss of information about the multistationarity region.
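The size comparison above is easy to check directly; the sketch below verifies the binomial identity used earlier and the count of 45 coefficients reported for the 8-parameter, degree-2 example (the sample count m = 1000 matches that example):

```python
from math import comb

def pss_size(r, d):
    """Number of coefficients of a degree-d polynomial in r variables."""
    return comb(r + d, r)

# The identity used in the text: C(r+d, r) = sum_i C(r-1+i, r-1).
r, d = 8, 2
identity_holds = (pss_size(r, d)
                  == sum(comb(r - 1 + i, r - 1) for i in range(d + 1)))

# Memory comparison against m sampled points in r dimensions:
m = 1000
sampling_size = m * r          # an m x r matrix of coordinates
pss_smaller = pss_size(r, d) < sampling_size
```

With r = 8 and d = 2 this gives 45 coefficients versus 8000 stored coordinates, in line with the discussion above.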
More involved example
We showed earlier that the PSS representation can be generated for examples with more than two parameters. There, we let all 8 parameters of the network in Fig. 3a be free and found the PSS representation of degree two in 8 indeterminates. Here we present another such example, which also serves to emphasize Remark 2.2 item (ii): to have a PSS representation of the multistationarity region, the right-hand side functions of the ODE system need not be polynomial or even rational functions.
Consider a gene expression system with 4 species \(X_i\), \(i=1,\dots ,4\), where these species can be mRNA, protein molecules or other relevant factors, with the ODE system as in Fig. 8a, which was introduced in [35, Figure 2]. As one can see, the right-hand side functions involve at least a square root, and as a result this system is neither polynomial nor defined by rational functions. Let us fix the parameter values other than the three degradation rates \(\beta _i\), \(i=2,3,4\), to the following values, chosen the same as in [35]:
$$\begin{aligned} \begin{array}{l} \alpha _1=1,\;n=4,\;v_1=4,\;v_2=8,\;v_3=4,\;\\ h_{1,1}=h_{2,2}=0.5,\;h_{3,3}=1,\;h_{4,4}=0.75,\;\\ g_{2,1}=g_{3,2}=g_{4,3}=1,\;\beta _1=0.5,\;\\ \alpha _2=1,\;\alpha _3=2,\;\alpha _4=3. \end{array} \end{aligned}$$
(5)
In [35] it is shown that for \((\beta _2,\beta _3,\beta _4)=(5,3,12)\) the network is bistable, with three steady states. Here we rename these three parameters \(k_i\), \(i=1,2,3\), respectively, let them vary in the 3-dimensional cube \(B=[(3,1,10),(7,5,14)]\), and look for the multistationarity region. That is, we seek the relation between these three parameters so that the number of steady states of the network remains the same. Substituting the values into Fig. 8a and letting \(dx_i/dt=0\), \(i=1,\dots ,4\), gives the following system of equations.
$$\begin{aligned} \begin{array}{l} 4+\frac{8x_4^4}{256+x_4^4}-\frac{1}{2}\sqrt{x_1x_3}=0,\\ x_1-k_1\sqrt{x_2}=0,\\ 2x_2-k_2x_3=0,\\ 3x_3-k_3\root 4 \of {x_4^3}=0. \end{array} \end{aligned}$$
(6)
With a simple calculation one can see that the set of positive solutions of the system of Eq. (6) is in one-to-one correspondence with the set of positive roots of the following univariate polynomial:
$$ f(y) = \root 4 \of {24k_1^2k_2k_3^3}\,y^{73} - 144y^{64} + 256\root 4 \of {24k_1^2k_2k_3^3}\,y^9 - 12288. $$
(7)
For any positive root of the polynomial in (7), a positive solution for (6) is
$$\begin{aligned} (x_1,x_2,x_3,x_4) = \Big (k_1\sqrt{\frac{k_2k_3}{6}}y^6,\,\frac{k_2k_3}{6}y^{12},\frac{k_3}{3}y^{12},y^{16}\Big ). \end{aligned}$$
By Descartes' rule of signs it is clear that the polynomial in (7) has 1 or 3 positive roots, counted with multiplicity, for any choice of the \(k_i\). To find the sampling representation of the multistationarity region of our system we solved the equation \(f(y)=0\) for 1000 random choices of the parameters using Matlab, which took about 8 min. At 532 of these choices the polynomial had 3 positive roots. The sampling representation is shown in Fig. 8b, with the points where the system has 3 steady states shown as orange spheres. We then used the information from the sampling representation to compute the PSS representation of the multistationarity region of degree two, which took less than 2 s. The resulting polynomial is below, with coefficients given to 6 decimal places.
$$\begin{aligned} \begin{array}{l} p(k_1,k_2,k_3) = - 0.043653k_1^2 - 0.029919k_2^2 \\ - 0.013650k_3^2 - 0.082681k_1k_2 - 0.045005k_1k_3 \\ - 0.055895k_2k_3 + 1.226648k_1 + 1.271835k_2 \\ + 0.710556k_3 - 8.112402. \end{array} \end{aligned}$$
Therefore the PSS representation of the multistationarity region is the set of points \((k_1,k_2,k_3)\in B\) that satisfy \(p(k_1,k_2,k_3)\ge 1\), which is the region between the two yellow surfaces in Fig. 8b. Note that the point (5, 3, 12), corresponding to the value examined in [35], is in the middle of this region.
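The per-parameter-point root count behind the sampling representation can be sketched with a simple sign-change scan of f on a fine grid (this stands in for the Matlab root solving used above, and assumes all positive roots lie in the scanned interval and are separated by more than the grid spacing):

```python
import numpy as np

def f(y, k1, k2, k3):
    """The univariate polynomial from (7)."""
    c = (24 * k1**2 * k2 * k3**3) ** 0.25
    return c * y**73 - 144 * y**64 + 256 * c * y**9 - 12288

def count_positive_roots(k1, k2, k3, y_max=3.0, n=300_000):
    """Count sign changes of f on a fine grid in (0, y_max].
    Assumes all positive roots are below y_max and are separated
    by more than the grid spacing."""
    ys = np.linspace(1e-6, y_max, n)
    signs = np.sign(f(ys, k1, k2, k3))
    return int(np.count_nonzero(np.diff(signs)))

# At (beta2, beta3, beta4) = (5, 3, 12) the network is bistable,
# so f has three positive roots here.
n_roots = count_positive_roots(5, 3, 12)
```

By Descartes' rule the count can only be 1 or 3; classifying the 1000 sampled parameter points by this count is exactly what separates the orange spheres from the rest in Fig. 8b.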