Volume 14 Supplement 5
Proceedings of the Third Annual RECOMB Satellite Workshop on Massively Parallel Sequencing (RECOMB-seq 2013)
Assembling contigs in draft genomes using reversals and block-interchanges
- Chi-Long Li^{1},
- Kun-Tze Chen^{1} and
- Chin Lung Lu^{1}
DOI: 10.1186/1471-2105-14-S5-S9
© Li et al.; licensee BioMed Central Ltd. 2013
Published: 10 April 2013
Abstract
Next-generation sequencing techniques allow an increasing number of draft genomes to be produced rapidly at a decreasing cost. However, these draft genomes are usually only partially sequenced and are published as collections of unassembled contigs, which cannot be used directly by existing algorithms for studying genome rearrangements and reconstructing phylogenies. In this work, we study the one-sided block (or contig) ordering problem with weighted reversal and block-interchange distance. Given a partially assembled genome π and a completely assembled genome σ, the problem is to find an optimal ordering to assemble (i.e., order and orient) the contigs of π such that the rearrangement distance between the assembled contigs of π and σ, measured by reversals and block-interchanges (also called generalized transpositions) with the weight ratio 1:2, is minimized. In addition to genome rearrangements and phylogeny reconstruction, the one-sided block ordering problem has a particularly useful application in genome resequencing, because its algorithms can be used to assemble the contigs of a draft genome π based on a reference genome σ. By using permutation groups, we design an efficient algorithm to solve this one-sided block ordering problem in $\mathcal{O}\left(\delta n\right)$ time, where n is the number of genes or markers and δ is the number of reversals and block-interchanges used. We also show that the assembly of the partially assembled genome can be done in $\mathcal{O}\left(n\right)$ time and that its weighted rearrangement distance from the completely assembled genome can be calculated in advance in $\mathcal{O}\left(n\right)$ time. Finally, we have implemented our algorithm as a program and used simulated datasets to compare its accuracy with that of SIS, an existing similar tool based on a heuristic algorithm that considers only reversals, on assembling the contigs of draft genomes based on their reference genomes.
Our experimental results show that our program is more accurate than SIS when the number of reversals and transpositions involved in the rearrangement events between the complete genomes of π and σ increases. In particular, the more transpositions are involved in the rearrangement events, the wider the accuracy gap between our program and SIS becomes.
Background
Next-generation sequencing techniques have advanced greatly in the past decade [1–3], allowing an increasing number of draft genomes to be produced rapidly at a decreasing cost. Usually, these draft genomes are only partially sequenced and are therefore published as collections of unassembled contigs (short for contiguous fragments). Draft genomes in contig form, however, cannot be used immediately in some bioinformatics applications, such as the study of genome rearrangements, which requires completely assembled genomes to calculate rearrangement distances [4]. To address this issue, Gaul and Blanchette [5] introduced and studied the so-called block ordering problem, defined as follows. Given two partially assembled genomes, each represented as an unordered set of blocks, the block ordering problem is to assemble (i.e., order and orient) the blocks of the two genomes such that the genome rearrangement distance between the two assembled genomes is minimized. The blocks mentioned above are the contigs, each of which can be represented by an ordered list of genes or markers. In their work [5], Gaul and Blanchette proposed a linear-time algorithm that solves the block ordering problem when the objective is simplified to maximizing the number of cycles in the breakpoint graph corresponding to the assembled genomes. The rationale behind this simplification is a result obtained by Bourque and Pevzner [6], showing that the reversal distance between two assembled genomes can be approximated well by maximizing the number of cycles in their corresponding breakpoint graph. However, in addition to the number of cycles, the number of hurdles and the presence or absence of a fortress are also needed to determine the actual reversal distance [7]. Therefore, efficiently solving the block ordering problem while optimizing the true rearrangement distance remains a challenge.
In the literature, many different kinds of genome rearrangements have been extensively studied [4], such as reversal (also called inversion), transposition, block-interchange (also called generalized transposition), translocation, fusion and fission. A reversal affects a segment on a chromosome by reversing this segment as well as exchanging its strands. A transposition rearranges a chromosome by interchanging two adjacent and nonoverlapping segments. A block-interchange is a generalized transposition that exchanges two nonoverlapping but not necessarily adjacent segments on a chromosome. A translocation acts on two chromosomes by exchanging their end fragments. A fusion is a special translocation that joins two chromosomes into one, and a fission is a special translocation that splits a chromosome into two. In this study, we consider a variant of the block ordering problem in which one of the two input genomes is still partially assembled but the other is completely assembled, and in which the genome rearrangement distance to be optimized is measured by weighted reversals and block-interchanges, whose weights are 1 and 2, respectively. To distinguish this special block ordering problem from the original one, we call it the one-sided block (or contig) ordering problem. In fact, an efficient algorithm for the one-sided block ordering problem has a useful application in genome resequencing [8, 9], because the reference genome of a resequenced organism can serve as the completely assembled genome in the one-sided block ordering problem, and the contigs of the partially assembled resequenced genome can then be assembled into one or several scaffolds based on the reference genome. In this respect, the one-sided block ordering problem can be considered a kind of contig scaffolding (or assembly) problem that aims to use genome rearrangements to create contig scaffolds for a draft genome based on a reference genome.
Currently, several contig scaffolding tools based on reference genomes have been developed, such as Projector 2 [10], OSLay [11], ABACAS [12], Mauve Aligner [13], fillScaffolds [14], r2cat [15] and SIS [16]. Among these tools, both SIS and fillScaffolds use the concept of genome rearrangements to generate contig scaffolds for a draft genome. SIS deals with only reversals, while fillScaffolds considers, in addition to reversals, other rearrangements such as transpositions and translocations (including fissions and fusions). Basically, SIS is dedicated to creating contig scaffolds for prokaryotic draft genomes by heuristically searching for their inversion signatures, where an inversion signature is a pair of adjacent genes or markers appearing along a contig such that they form a breakpoint and are also located on different transcriptional strands. fillScaffolds, in contrast, uses the traditional technique of breakpoint graphs to assemble the contigs of draft genomes. In their study [16], Dias and colleagues used real prokaryotic draft genomes to demonstrate that SIS had the best overall accuracy when compared to the other tools mentioned above.
In this study, we utilize permutation groups in algebra, instead of the breakpoint graphs used by Gaul and Blanchette [5], to design an efficient algorithm, whose time complexity is $\mathcal{O}\left(\delta n\right)$, for solving the one-sided block ordering problem with weighted reversal and block-interchange distance, where n is the number of genes or markers and δ is the number of reversals and block-interchanges used to transform the assembly of the partially assembled genome (i.e., draft genome) into the completely assembled genome (i.e., reference genome). In particular, we also show that the assembly of the partially assembled genome can be done in $\mathcal{O}\left(n\right)$ time and its weighted reversal and block-interchange distance from the completely assembled genome can be calculated in advance in $\mathcal{O}\left(n\right)$ time. In addition, we have implemented our algorithm into a program and used some simulated datasets to compare its accuracy performance to SIS on assembling the contigs in the draft genomes based on their reference genomes. Our experimental results have shown that the averaged normalized contig mis-join errors of our program are lower than those of SIS, when the number of reversals and transpositions involved in the rearrangement events between the complete genomes of the partially and completely assembled organisms is increased. In particular, if there are more transpositions involved in the rearrangement events, then the gap of accuracy performance between our program and SIS is increasing.
Preliminaries
One-sided block ordering problem
In the following, we dedicate ourselves to linear, uni-chromosomal genomes. With a slight modification, however, our algorithmic result can still apply to circular, uni-chromosomal genomes, or to multi-chromosomal genomes with linear or circular chromosomes in a chromosome-by-chromosome manner. Once completely assembled, a uni-chromosomal genome can be represented by a signed permutation of n integers between 1 and n, with each integer representing a gene or marker and its associated sign indicating the strandedness of the corresponding gene or marker. If the genome is partially assembled, then it will be represented by an unordered set of blocks, where a block B of size k, denoted by B = [b_{1}, b_{2}, ..., b_{k}], is an ordered list of k signed integers. Let $\overline{B}=\left[-b_{k}, -b_{k-1}, \dots, -b_{1}\right]$ denote the reverse of B. Given an unordered set of m blocks, say $\mathcal{B}=\left\{B_{1}, B_{2}, \dots, B_{m}\right\}$, corresponding to a partially assembled genome, an ordering (or assembly) of $\mathcal{B}$ is an ordered list of m blocks in which each block B_{i} or its reverse $\overline{B_{i}}$ appears exactly once, where 1 ≤ i ≤ m. For instance, suppose that $\mathcal{B}=\left\{B_{1}, B_{2}, B_{3}\right\}=\left\{\left[1,4\right],\left[3,2\right],\left[-5,6\right]\right\}$. Then (B_{1}, B_{3}, B_{2}) = ([1, 4], [-5, 6], [3, 2]) and (B_{1}, $\overline{B_{3}}$, B_{2}) = ([1, 4], [-6, 5], [3, 2]) are two orderings of $\mathcal{B}$. Basically, each ordering of $\mathcal{B}$ induces (or defines) a signed permutation of size n, which is obtained by concatenating the blocks in this ordered list.
For instance, the ordering (B_{1}, B_{3}, B_{2}) in the above exemplified $\mathcal{B}$ induces the signed permutation (1, 4, -5, 6, 3, 2), which simply is denoted by B_{1} ⊙ B_{3} ⊙ B_{2}. Clearly, the permutation induced by an ordering of $\mathcal{B}$ corresponds to an assembly of the blocks in $\mathcal{B}$. Now, the one-sided block ordering problem we study in this paper is formally defined as follows:
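As an illustration only, the definitions of block reversal, ordering and induced permutation can be sketched in Python. The function names `reverse_block`, `orderings` and `induced_permutation` are hypothetical and do not come from our program; the sketch simply enumerates all m!·2^m orderings by brute force, which is feasible only for very small m.

```python
from itertools import permutations, product

def reverse_block(block):
    """Reverse of a block: reversed order with every sign flipped."""
    return [-g for g in reversed(block)]

def orderings(blocks):
    """Yield every ordering: each block appears once, as is or reversed."""
    m = len(blocks)
    for order in permutations(range(m)):
        for flips in product((False, True), repeat=m):
            yield [reverse_block(blocks[i]) if f else list(blocks[i])
                   for i, f in zip(order, flips)]

def induced_permutation(ordering):
    """Concatenate the blocks of an ordering into one signed permutation."""
    return [g for block in ordering for g in block]

# The example set of blocks used in the text.
blocks = [[1, 4], [3, 2], [-5, 6]]
all_orderings = list(orderings(blocks))
print(len(all_orderings))
print(induced_permutation([[1, 4], [-5, 6], [3, 2]]))
```

The exhaustive enumeration is shown only to make the search space concrete; it is not how the problem is solved in this paper.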
One-sided block ordering problem with reversal and block-interchange distance
Input: A partially assembled genome π and a completely assembled genome σ.
Output: Find an ordering of π such that the rearrangement distance measured by reversals and block-interchanges with the weight ratio 1:2 between the permutation induced by the ordering of π and σ is minimized.
As discussed in our previous study [17], it is biologically meaningful to assign block-interchanges twice the weight of reversals, because biological data suggest that transpositions occur with about half the frequency of reversals [18].
Permutation groups
Permutation groups have proven to be a very useful tool in the study of genome rearrangements [17]. Below, we recall some useful definitions, notations and properties borrowed from our previous work [17]. Basically, given a set E = {1, 2, ..., n}, a permutation is defined to be a one-to-one function from E into itself and is usually expressed as a product of cycles in the study of genome rearrangements. For instance, π = (1)(3, 2) is a product of two cycles representing a permutation of E = {1, 2, 3} and means that π(1) = 1, π(2) = 3 and π(3) = 2. The elements in a cycle can be arranged in any cyclic order, and hence the cycle (3, 2) in the permutation π exemplified above can be rewritten as (2, 3). Moreover, if the cycles in a permutation are all disjoint (i.e., no two cycles share an element), then the product of these cycles is called the cycle decomposition of the permutation. In fact, a permutation in the cycle decomposition can be used to model a genome containing several circular chromosomes, with each disjoint cycle representing a circular chromosome. Notice that in the rest of this article, we say "cycle in a permutation" to mean "cycle in the cycle decomposition of this permutation" for simplicity, unless otherwise specified. A cycle with k elements is further called a k-cycle. By convention, the 1-cycles in a permutation are not written explicitly since their elements are fixed by the permutation. For instance, the above exemplified permutation π can be written as π = (2, 3). If the cycles in a permutation are all 1-cycles, then this permutation is called the identity permutation and denoted by 1. Suppose that α and β are two permutations of E. Then their product αβ, also called their composition, defines a permutation of E satisfying αβ(x) = α(β(x)) for all $x\in E$. If α and β are disjoint, then αβ = βα. If αβ = 1, then α is called the inverse of β, denoted by β^{-1}, and vice versa.
Moreover, the conjugation of β by α, denoted by α · β, is defined to be the permutation $\alpha\beta\alpha^{-1}$. It can be verified that if y = β(x), then α(y) = αβ(x) = αβα^{-1}α(x) = (α · β)(α(x)). Hence, α · β can be obtained from β by simply replacing each element x with α(x). In other words, if β = (b_{1}, b_{2}, ..., b_{k}), then α · β = (α(b_{1}), α(b_{2}), ..., α(b_{k})).
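The conventions above (right-to-left composition, inverse and conjugation) can be checked with a small Python sketch. It is a minimal illustration under the assumption that a permutation is stored as a dictionary mapping each element to its image; the helper names `from_cycles`, `compose`, `inverse` and `conjugate` are hypothetical.

```python
def from_cycles(cycles, elements):
    """Build a permutation (dict) from disjoint cycles; other elements stay fixed."""
    perm = {e: e for e in elements}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm

def compose(alpha, beta):
    """Product alpha beta, defined by (alpha beta)(x) = alpha(beta(x))."""
    return {x: alpha[beta[x]] for x in beta}

def inverse(alpha):
    """Inverse permutation: swap every element with its image."""
    return {image: x for x, image in alpha.items()}

def conjugate(alpha, beta):
    """Conjugation of beta by alpha, i.e. alpha beta alpha^{-1}."""
    return compose(compose(alpha, beta), inverse(alpha))

# pi = (1)(3, 2) on E = {1, 2, 3}: pi(1) = 1, pi(2) = 3, pi(3) = 2.
pi = from_cycles([(1,), (3, 2)], [1, 2, 3])

# Conjugating beta = (1, 4) by alpha = (1, 2, 3) relabels 1 as alpha(1) = 2
# and 4 as alpha(4) = 4, so the result is the 2-cycle (2, 4).
E = [1, 2, 3, 4]
alpha = from_cycles([(1, 2, 3)], E)
beta = from_cycles([(1, 4)], E)
conj = conjugate(alpha, beta)
print(conj)
```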
It is a fact that every permutation can be expressed as a product of 2-cycles, in which 1-cycles are still written implicitly. Given a permutation α of E, its norm, denoted by ||α||, is defined to be the minimum number, say k, such that α can be expressed as a product of k 2-cycles. In the cycle decomposition of α, let n_{c}(α) denote the number of its disjoint cycles, notably including the 1-cycles not written explicitly. Given two permutations α and β of E, α is said to divide β, denoted by α|β, if and only if ||βα^{-1}|| = ||β|| - ||α||. In our previous work [17], it has been shown that ||α|| = |E| - n_{c}(α) and that any k elements of E, say a_{1}, a_{2}, ..., a_{k}, all appear in a cycle of α in the order a_{1}, a_{2}, ..., a_{k} if and only if (a_{1}, a_{2}, ..., a_{k}) | α.
Let α = (a_{1}, a_{2}) be a 2-cycle and β be an arbitrary permutation of E. If α|β, that is, both a_{1} and a_{2} appear in the same cycle of β, then the composition αβ, as well as βα, has the effect of fission by breaking this cycle into two smaller cycles. For instance, let α = (1, 3) and β = (1, 2, 3, 4). Then α|β, since both 1 and 3 are in the cycle (1, 2, 3, 4), and αβ = (1, 2)(3, 4) and βα = (4, 1)(2, 3). On the other hand, if $\alpha \nmid \beta $, that is, a_{1} and a_{2} appear in different cycles of β, then αβ, as well as βα, has the effect of fusion by joining the two cycles into a bigger cycle. For example, if α = (1, 3) and β = (1, 2)(3, 4), then $\alpha \nmid \beta $ and, as a result, αβ = (1, 2, 3, 4) and βα = (2, 1, 4, 3).
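The norm, the divisibility test and the fission and fusion examples above can be reproduced with the following Python sketch. It is an illustration only: permutations are assumed to be stored as dictionaries, the norm is computed through the identity ||α|| = |E| - n_{c}(α), and the helper names are hypothetical.

```python
def from_cycles(cycles, elements):
    """Build a permutation (dict) from disjoint cycles; other elements stay fixed."""
    perm = {e: e for e in elements}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm

def compose(alpha, beta):
    """Product alpha beta, defined by (alpha beta)(x) = alpha(beta(x))."""
    return {x: alpha[beta[x]] for x in beta}

def cycles_of(perm):
    """Cycle decomposition, including 1-cycles, each cycle starting at its smallest element."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = perm[x]
        result.append(tuple(cyc))
    return result

def norm(perm):
    """||perm|| = |E| - n_c(perm)."""
    return len(perm) - len(cycles_of(perm))

def divides(alpha, beta):
    """alpha | beta  iff  ||beta alpha^{-1}|| = ||beta|| - ||alpha||."""
    alpha_inv = {image: x for x, image in alpha.items()}
    return norm(compose(beta, alpha_inv)) == norm(beta) - norm(alpha)

E = [1, 2, 3, 4]
alpha = from_cycles([(1, 3)], E)

beta = from_cycles([(1, 2, 3, 4)], E)      # 1 and 3 share a cycle: fission
fission_left = cycles_of(compose(alpha, beta))
fission_right = cycles_of(compose(beta, alpha))

beta2 = from_cycles([(1, 2), (3, 4)], E)   # 1 and 3 in different cycles: fusion
fusion_left = cycles_of(compose(alpha, beta2))
fusion_right = cycles_of(compose(beta2, alpha))
print(fission_left, fission_right, fusion_left, fusion_right)
```

Cycles are printed starting from their smallest element, so (4, 1)(2, 3) appears as (1, 4)(2, 3) and (2, 1, 4, 3) appears as (1, 4, 3, 2); these are the same permutations written in a different cyclic order.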
A model for representing DNA molecules
As mentioned before, a permutation in the form of the cycle decomposition can be used to model a genome containing multiple chromosomes (or a chromosome with multiple contigs), with each cycle representing a chromosome (or contig). To facilitate modelling reversals using permutation groups, however, we need to use two cycles to represent a chromosome, with one cycle representing a strand of the chromosome and the other representing the complementary strand. For this purpose, we first let E = {-1, 1, -2, 2, ..., -n, n} and Γ = (1, -1)(2, -2) ... (n, -n). We then use an admissible cycle, which is a cycle that does not contain both i and its opposite -i for any $i\in E$, to represent a chromosomal strand, say π^{+}, and use π^{-} = Γ · (π^{+})^{-1}, which is the reverse complement of π^{+}, to represent the opposite strand of π^{+}. As demonstrated in our previous work [17], it is useful to represent a double stranded chromosome π by the product of its two strands π^{+} and π^{-}, that is, $\pi ={\pi}^{+}{\pi}^{-}={\pi}^{-}{\pi}^{+}$, because a reversal (respectively, block-interchange) acting on this DNA molecule can be mimicked by multiplying two (respectively, four) 2-cycles with π, as described in the following lemmas.
Lemma 1 ([17]) Let π = π^{+}π^{-} denote a double stranded DNA and let x and y be two elements in E. If $\left(x,y\right)\phantom{\rule{2.77695pt}{0ex}}\nmid \phantom{\rule{2.77695pt}{0ex}}\pi $, that is, x and y are in the different strands of π, then the effect of (π Γ(y), π Γ(x))(x, y)π is a reversal acting on π.
Lemma 2 ([17]) Let π = π^{+}π^{-} denote a double stranded DNA and let u, v, x and y be four elements in E. If (x, u, y, v)|π, that is, x, u, y and v appear in the same strand of π in this order, then the effect of (π Γ(v), π Γ(u)) (π Γ(y), π Γ(x)) (u, v)(x, y)π is a block-interchange acting on π.
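Lemmas 1 and 2 can be tried out on small examples with the Python sketch below. It is a hedged illustration rather than our implementation: permutations are assumed to be dictionaries, Γ(z) is assumed to be implemented as the negation -z, each chromosome is treated as a pair of cycles, and the helper names (`double_strand`, `reversal`, `block_interchange`, `strand_from`) are hypothetical.

```python
def from_cycles(cycles, elements):
    """Build a permutation (dict) from disjoint cycles; other elements stay fixed."""
    perm = {e: e for e in elements}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm

def compose(*perms):
    """Right-to-left product: compose(a, b)[x] == a[b[x]]."""
    result = dict(perms[-1])
    for p in reversed(perms[:-1]):
        result = {x: p[y] for x, y in result.items()}
    return result

def strand_from(perm, start):
    """Return the cycle of perm that contains start, written from start."""
    cyc, x = [start], perm[start]
    while x != start:
        cyc.append(x)
        x = perm[x]
    return tuple(cyc)

def double_strand(plus):
    """pi = pi+ pi-, where pi- = Gamma . (pi+)^{-1} is the reverse complement of pi+."""
    n = len(plus)
    E = [s * i for i in range(1, n + 1) for s in (1, -1)]
    minus = tuple(-g for g in reversed(plus))
    return from_cycles([plus, minus], E), E

def reversal(pi, E, x, y):
    """Lemma 1: (pi Gamma(y), pi Gamma(x)) (x, y) pi, with Gamma(z) = -z."""
    return compose(from_cycles([(pi[-y], pi[-x])], E),
                   from_cycles([(x, y)], E), pi)

def block_interchange(pi, E, x, u, y, v):
    """Lemma 2: (pi Gamma(v), pi Gamma(u)) (pi Gamma(y), pi Gamma(x)) (u, v) (x, y) pi."""
    return compose(from_cycles([(pi[-v], pi[-u])], E),
                   from_cycles([(pi[-y], pi[-x])], E),
                   from_cycles([(u, v)], E),
                   from_cycles([(x, y)], E), pi)

# x = 2 and y = -4 lie on different strands, so Lemma 1 applies.
pi5, E5 = double_strand((1, 2, 3, 4, 5))
after_reversal = reversal(pi5, E5, 2, -4)
print(strand_from(after_reversal, 1))

# x = 1, u = 3, y = 4, v = 6 appear on the same strand in this order, so Lemma 2 applies.
pi6, E6 = double_strand((1, 2, 3, 4, 5, 6))
after_interchange = block_interchange(pi6, E6, 1, 3, 4, 6)
print(strand_from(after_interchange, 1))
```

In the first example the segment 2, 3, 4 is reversed and its signs are flipped; in the second the single-gene blocks 3 and 6 exchange their positions. In both cases the other strand is again the reverse complement of the one printed.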
Moreover, as described in the following lemma, we have shown in [17] that given two different DNA molecules π and σ, every cycle α in (the cycle decomposition of) σπ^{-1} always has a mate cycle (π Γ) · α^{-1} that also appears in σπ^{-1}. In fact, α and (π Γ) · α^{-1} in σπ^{-1} are each other's mate cycle.
Lemma 3 ([17]) Let π and σ be two different double-stranded DNA molecules. If α is a cycle in σπ^{-1}, then (π Γ) · α^{-1} is also a cycle in σπ^{-1}.
An efficient algorithm for the one-sided block ordering problem
To clarify our algorithm, we start by defining some notation. Let α denote an arbitrary linear DNA molecule (or contig). As mentioned previously, it is represented by the product of its two strands α^{+} and α^{-}, that is, α = α^{+}α^{-}. If α contains k genes (or markers), we also denote its α^{+} by (α^{+}[1], α^{+}[2], ..., α^{+}[k]), where α^{+}[i] is the i-th gene in α, and its α^{-} by (α^{-}[1], α^{-}[2], ..., α^{-}[k]). By convention, α^{+}[1] and α^{-}[1] are called the tails of α. Let π = π_{1}π_{2} ... π_{m} be a linear, uni-chromosomal genome that is partially assembled into m contigs π_{1}, π_{2}, ..., π_{m}, where π_{i} contains n_{i} genes, and σ = (1, 2, ..., n) be a linear, uni-chromosomal genome that is assembled completely. Let C = {c_{k} = n + k + 1: 0 ≤ k ≤ 2m - 1} ∪ {-c_{k} = -n - k - 1: 0 ≤ k ≤ 2m - 1} be a set of 4m distinct integers, called caps, which are different from the genes in E. Let $\hat{E}=E\cup C$ and $\hat{\Gamma}=\left(1, -1\right)\left(2, -2\right)\dots\left(n+2m, -n-2m\right)$.
For the purpose of designing our algorithm later, we add four caps $c_{2(i-1)}$, $c_{2(i-1)+1}$, $-c_{2(i-1)}$ and $-c_{2(i-1)+1}$ to the ends of each contig π_{i}, where 1 ≤ i ≤ m, leading to a capping contig $\widehat{\pi}_{i}$ with $\widehat{\pi}_{i}^{+}[1]=c_{2(i-1)}$, $\widehat{\pi}_{i}^{+}[j]=\pi_{i}^{+}[j-1]$ for 2 ≤ j ≤ n_{i} + 1, $\widehat{\pi}_{i}^{+}[n_{i}+2]=c_{2(i-1)+1}$, $\widehat{\pi}_{i}^{-}[1]=\hat{\Gamma}(c_{2(i-1)+1})$, $\widehat{\pi}_{i}^{-}[j]=\pi_{i}^{-}[j-1]$ for 2 ≤ j ≤ n_{i} + 1, and $\widehat{\pi}_{i}^{-}[n_{i}+2]=\hat{\Gamma}(c_{2(i-1)})$. Moreover, we insert m - 1 dummy contigs without any genes (i.e., null contigs) σ_{2}, σ_{3}, ..., σ_{m} into σ, where the original contig in σ becomes σ_{1} now, and add four caps $c_{2(i-1)}$, $c_{2(i-1)+1}$, $-c_{2(i-1)}$ and $-c_{2(i-1)+1}$ to the ends of each contig σ_{i} to obtain a capping contig $\widehat{\sigma}_{i}$, where $\widehat{\sigma}_{i}^{+}[1]=c_{2(i-1)}$, $\widehat{\sigma}_{i}^{+}[j]=\sigma_{i}^{+}[j-1]$ for 2 ≤ j ≤ n_{i} + 1, $\widehat{\sigma}_{i}^{+}[n_{i}+2]=c_{2(i-1)+1}$, $\widehat{\sigma}_{i}^{-}[1]=\hat{\Gamma}(c_{2(i-1)+1})$, $\widehat{\sigma}_{i}^{-}[j]=\sigma_{i}^{-}[j-1]$ for 2 ≤ j ≤ n_{i} + 1, and $\widehat{\sigma}_{i}^{-}[n_{i}+2]=\hat{\Gamma}(c_{2(i-1)})$.
Notice that the purpose of adding caps to the ends of the contigs is to serve as delimiters when we use permutation groups to model translocations of multiple contigs later. We denote the capping π and σ by $\hat{\pi}$ and $\hat{\sigma}$, respectively. To distinguish the four caps in a capping contig, say ${\hat{\pi}}_{i}$, we call the left caps ${\widehat{\pi}}_{i}^{+}\left[1\right]$ and ${\widehat{\pi}}_{i}^{-}\left[1\right]$ as 5' caps and the right caps ${\widehat{\pi}}_{i}^{+}\left[{n}_{i}+2\right]$ and ${\widehat{\pi}}_{i}^{-}\left[{n}_{i}+2\right]$ as 3' caps.
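The capping construction described above can be sketched in Python as follows. This is an illustration under stated assumptions, not our implementation: each contig is given as the list of signed genes on its plus strand, the minus strand is derived as the reverse complement, and the function name `cap_contigs` is hypothetical.

```python
def cap_contigs(contigs, n):
    """Return the capped (plus, minus) strands of every contig.

    contigs: list of plus strands, each a list of signed genes (possibly empty).
    n: total number of genes, so that cap c_k equals n + k + 1.
    """
    capped = []
    for i, contig in enumerate(contigs, start=1):
        cap5 = n + 2 * (i - 1) + 1   # c_{2(i-1)}, the 5' cap of the plus strand
        cap3 = n + 2 * (i - 1) + 2   # c_{2(i-1)+1}, the 3' cap of the plus strand
        plus = [cap5] + list(contig) + [cap3]
        # The minus strand starts with Gamma(c_{2(i-1)+1}) and ends with Gamma(c_{2(i-1)}).
        minus = [-cap3] + [-g for g in reversed(contig)] + [-cap5]
        capped.append((plus, minus))
    return capped

# A draft genome with m = 3 contigs over n = 6 genes.
capped_pi = cap_contigs([[1, 4], [3, 2], [-5, 6]], 6)

# The reference genome with m - 1 = 2 null contigs appended.
capped_sigma = cap_contigs([[1, 2, 3, 4, 5, 6], [], []], 6)
print(capped_pi)
print(capped_sigma)
```

Every capped strand begins with a 5' cap and ends with a 3' cap, and a null contig of σ is reduced to its two caps on each strand.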
In addition, we define 5cap$\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\alpha}\right)$ to be the 5' cap in the strand of $\widehat{\alpha}$ that contains x. For convenience, we extend the definitions above from the capping contig to the capping genome. For instance, given a capping genome, say $\widehat{\pi}$, char$\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$ denotes the character of x in a capping contig ${\widehat{\pi}}_{i}$ of $\widehat{\pi}$ that contains x, and 5cap$\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$ denotes the 5' cap of the strand in ${\widehat{\pi}}_{i}$ containing x, that is, $\mathsf{\text{char}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)=\mathsf{\text{char}}\left(x,\phantom{\rule{2.77695pt}{0ex}}{\widehat{\pi}}_{i}\right)$ and 5cap$\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$ = 5cap$\left(x,\phantom{\rule{2.77695pt}{0ex}}{\widehat{\pi}}_{i}\right)$. In our previous work [17], we have shown the following lemma.
Lemma 4 ([17]) For a capping genome $\widehat{\pi}$ and $x\in \hat{E}$, if $\mathsf{char}\left(x,\widehat{\pi}\right)=\mathsf{C3}$ (respectively, T), then $\mathsf{char}\left(\widehat{\pi}\hat{\Gamma}\left(x\right),\widehat{\pi}\right)$ is T (respectively, C3), and if $\mathsf{char}\left(x,\widehat{\pi}\right)=\mathsf{O}$ (respectively, N3 and C5), then $\mathsf{char}\left(\widehat{\pi}\hat{\Gamma}\left(x\right),\widehat{\pi}\right)$ is O (respectively, N3 and C5).
Basically, we design our algorithm to solve the one-sided block ordering problem by dealing with the contigs of the capping genome $\widehat{\pi}$ as if they were linear chromosomes. Let $c_{1}=\left(x, y\right)$ and $c_{2}=\left(u, v\right)$ be two 2-cycles with character pairs of (non-C5, non-C5) and (C5, C5), respectively, and let $c_{1}^{\prime}=\left(\widehat{\pi}\hat{\Gamma}\left(y\right), \widehat{\pi}\hat{\Gamma}\left(x\right)\right)$ and $c_{2}^{\prime}=\left(\widehat{\pi}\hat{\Gamma}\left(v\right), \widehat{\pi}\hat{\Gamma}\left(u\right)\right)$. Notice that the character pair of $c_{2}^{\prime}$ is (C5, C5) by Lemma 4. In our previous study [17], we have proven that performing a translocation $\tau$ on $\widehat{\pi}$ can be mimicked by the composition $c_{2}^{\prime}c_{1}^{\prime}c_{2}c_{1}\widehat{\pi}$ (i.e., $\tau=c_{2}^{\prime}c_{1}^{\prime}c_{2}c_{1}$), if $\left(x, u\right)\mid\widehat{\pi}$, $\left(y, v\right)\mid\widehat{\pi}$, $\left(x, y\right)\nmid\widehat{\pi}$ and $\left(x, \hat{\Gamma}\left(y\right)\right)\nmid\widehat{\pi}$ (i.e., x and u, as well as y and v, lie in the same contig strand in $\widehat{\pi}$, but x and y appear in different contigs in $\widehat{\pi}$).
Moreover, if the character pair of $c_{1}$ is in CEpair = {(C3, C3), (C3, N3), (T, T), (T, N3), (N3, N3)}, then $\tau$ acts on $\widehat{\pi}$ by exchanging the two caps of some contig in $\widehat{\pi}$ with the two caps of another contig and, as a result, leaves the original genome $\pi$ unaffected. Notice that the character pair of $c_{1}^{\prime}$ also belongs to CEpair and that of $c_{2}^{\prime}$ is (C5, C5) according to Lemma 4. Furthermore, if $c_{1}$ is a 2-cycle of character pair (T, C3) (respectively, (O, N3)), then performing τ on $\widehat{\pi}$ becomes a fusion (respectively, fission) acting on π. Hence, we have the following lemma, where it can be verified that $\widehat{\pi}\hat{\Gamma}\left(\mathsf{5cap}\left(x,\widehat{\pi}\right)\right)=\mathsf{5cap}\left(\widehat{\pi}\hat{\Gamma}\left(x\right),\widehat{\pi}\right)$ and $\widehat{\pi}\hat{\Gamma}\left(\mathsf{5cap}\left(y,\widehat{\pi}\right)\right)=\mathsf{5cap}\left(\widehat{\pi}\hat{\Gamma}\left(y\right),\widehat{\pi}\right)$.
Lemma 5 ([17]) Let c_{1} = (x, y) denote a 2-cycle with $\mathsf{char}\left(x, \widehat{\pi}\right)=\mathsf{T}$ and $\mathsf{char}\left(y, \widehat{\pi}\right)=\mathsf{C3}$, and let $c_{2}=\left(\mathsf{5cap}\left(x, \widehat{\pi}\right), \mathsf{5cap}\left(y, \widehat{\pi}\right)\right)$, $c_{1}^{\prime}=\left(\widehat{\pi}\hat{\Gamma}\left(y\right), \widehat{\pi}\hat{\Gamma}\left(x\right)\right)$ and $c_{2}^{\prime}=\left(\widehat{\pi}\hat{\Gamma}\left(\mathsf{5cap}\left(y, \widehat{\pi}\right)\right), \widehat{\pi}\hat{\Gamma}\left(\mathsf{5cap}\left(x, \widehat{\pi}\right)\right)\right)$. If $\left(x, y\right)\nmid\widehat{\pi}$ and $\left(x, \hat{\Gamma}\left(y\right)\right)\nmid\widehat{\pi}$, then the effect of $c_{2}^{\prime}c_{1}^{\prime}c_{2}c_{1}\widehat{\pi}$ is a fusion that acts on π by concatenating the contig containing y with the contig containing x.
It is not hard to see that the permutation induced by an ordering of the uncapped genome π can be considered as the result of applying consecutive m - 1 fusions to the m contigs in π. Based on the above discussion, it can be realized that our purpose is to find m - 1 translocations to act on $\widehat{\pi}$ such that their rearrangement effects on the original π are m - 1 fusions and the genome rearrangement distance measured by weighted reversals and block-interchanges between the resulting assembly of the contigs in π and σ is minimum. In Algorithm 1 below, we describe our algorithm for efficiently solving the one-sided block ordering problem, where reversals are weighted one and block-interchanges are weighted two. Basically, we try to derive m - 1 fusions from $\widehat{\sigma}{\widehat{\pi}}^{-1}$ to act on π in Algorithm 1.
Algorithm 1
Input: A partially assembled, linear, uni-chromosomal genome π = π_{1}π_{2} ... π_{ m } and a completely assembled, linear, uni-chromosomal genome σ = σ_{1}.
Output: An optimally assembled genome of π, denoted by assembly(π), and the weighted reversal and block-interchange distance Δ(π, σ) between assembly(π) and σ.
1: Add m - 1 null contigs ${\sigma}_{2},{\sigma}_{3},\dots ,{\sigma}_{m}$ into σ such that $\sigma ={\sigma}_{1}{\sigma}_{2}\dots {\sigma}_{m}$.
Obtain $\widehat{\pi}={\widehat{\pi}}_{1}{\widehat{\pi}}_{2}\phantom{\rule{0.3em}{0ex}}\dots \phantom{\rule{2.77695pt}{0ex}}{\widehat{\pi}}_{m}$ and $\widehat{\sigma}={\widehat{\sigma}}_{1}{\widehat{\sigma}}_{2}\phantom{\rule{2.77695pt}{0ex}}\dots \phantom{\rule{2.77695pt}{0ex}}{\widehat{\sigma}}_{m}$ by capping π and σ.
2: Compute $\widehat{\sigma}{\widehat{\pi}}^{-1}$ and $\widehat{\pi}\hat{\text{\Gamma}}$.
3: /* To perform cap exchanges */
Let i = 0.
while there are x and y in a cycle of $\widehat{\sigma}{\widehat{\pi}}^{-1}$ such that $\left(\mathsf{\text{char}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right),\phantom{\rule{2.77695pt}{0ex}}\mathsf{\text{char}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)\right)$ ∈ CEpair do
Let i = i + 1.
Find x and y in a cycle of $\widehat{\sigma}{\widehat{\pi}}^{-1}$ with $\left(\mathsf{\text{char}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right),\phantom{\rule{2.77695pt}{0ex}}\mathsf{\text{char}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)\right)$ ∈ CEpair.
Let ${\chi}_{i}=(\widehat{\pi}\hat{\text{\Gamma}}\left(5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)\right)$, $\widehat{\pi}\hat{\text{\Gamma}}\left(5\mathsf{\text{cap}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)\right))\phantom{\rule{0.3em}{0ex}}(\widehat{\pi}\hat{\text{\Gamma}}\left(y\right)$, $\widehat{\pi}\hat{\text{\Gamma}}\left(x\right))\phantom{\rule{0.3em}{0ex}}(5\mathsf{\text{cap}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$, $5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right))\phantom{\rule{0.3em}{0ex}}\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)$.
Calculate new $\widehat{\pi}={\chi}_{i}\widehat{\pi}$, new $\widehat{\pi}\hat{\text{\Gamma}}={\chi}_{i}\widehat{\pi}\hat{\text{\Gamma}}$ and new $\widehat{\sigma}{\widehat{\pi}}^{-1}=\widehat{\sigma}{\widehat{\pi}}^{-1}{\chi}_{i}^{-1}$.
end while
4: /* To find consecutive m - 1 fusions */
Let i = 0.
while there are two adjacent elements x and y in a cycle of $\widehat{\sigma}{\widehat{\pi}}^{-1}$ such that $\left(\mathsf{char}\left(x, \widehat{\pi}\right), \mathsf{char}\left(y, \widehat{\pi}\right)\right)=\left(\mathsf{T}, \mathsf{C3}\right)$ and $\left(x, y\right)\nmid \widehat{\pi}$ do
Let i = i + 1.
Find two adjacent elements x and y in a cycle of $\widehat{\sigma}{\widehat{\pi}}^{-1}$ such that $\left(\mathsf{\text{char}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right),\phantom{\rule{2.77695pt}{0ex}}\mathsf{\text{char}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)\right)=\left(\mathsf{\text{T}},\phantom{\rule{2.77695pt}{0ex}}\mathsf{\text{C}}3\right)$ and $\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)\nmid \widehat{\pi}$.
Let ${\tau}_{i}=(\widehat{\pi}\hat{\text{\Gamma}}(5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right))$, $\widehat{\pi}\hat{\text{\Gamma}}\left(5\mathsf{\text{cap}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)\right))\phantom{\rule{0.3em}{0ex}}(\widehat{\pi}\hat{\text{\Gamma}}\left(y\right)$, $\widehat{\pi}\hat{\text{\Gamma}}\left(x\right))\phantom{\rule{0.3em}{0ex}}(5\mathsf{\text{cap}}\left(x,\phantom{\rule{0.3em}{0ex}}\widehat{\pi}\right)$, $5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right))\phantom{\rule{0.3em}{0ex}}\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)$.
Calculate new $\widehat{\pi}={\tau}_{i}\widehat{\pi}$, new $\widehat{\pi}\hat{\text{\Gamma}}={\tau}_{i}\widehat{\pi}\hat{\text{\Gamma}}$ and new $\widehat{\sigma}{\widehat{\pi}}^{-1}=\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}_{i}^{-1}$.
end while
while i < m - 1 do
Let i = i + 1.
Find two adjacent elements x and y in a cycle of $\widehat{\sigma}{\widehat{\pi}}^{-1}$ such that $\left(\mathsf{\text{char}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right),\phantom{\rule{2.77695pt}{0ex}}\mathsf{\text{char}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)\right)=\left(\mathsf{\text{T}},\phantom{\rule{2.77695pt}{0ex}}\mathsf{\text{C}}3\right)$ and $\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)|\widehat{\pi}$.
Find the strand of a different contig in $\widehat{\pi}$ with at least a non-cap integer and its 3' cap, say z, different from y.
Let ${\tau}_{i}=(\widehat{\pi}\hat{\text{\Gamma}}\left(z\right)$, $\widehat{\pi}\hat{\text{\Gamma}}\left(x\right))\phantom{\rule{0.3em}{0ex}}(\widehat{\pi}\hat{\text{\Gamma}}\left(z\right)$, $\widehat{\pi}\hat{\text{\Gamma}}\left(y\right))\phantom{\rule{0.3em}{0ex}}\left(y,\phantom{\rule{2.77695pt}{0ex}}z\right)\left(x,\phantom{\rule{2.77695pt}{0ex}}z\right)$.
Calculate new $\widehat{\pi}={\tau}_{i}\widehat{\pi}$, new $\widehat{\pi}\hat{\text{\Gamma}}={\tau}_{i}\widehat{\pi}\hat{\text{\Gamma}}$ and new $\widehat{\sigma}{\widehat{\pi}}^{-1}=\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}_{i}^{-1}$.
end while
Let assembly(π) denote the assembled contig in current $\widehat{\pi}$ whose caps are removed.
5: /* To find reversals */
Let ${n}_{\gamma}=0$.
while there are two adjacent elements x and y in a cycle of $\widehat{\sigma}\widehat{\pi}^{-1}$ such that $(x,\hat{\Gamma}(y))\mid\widehat{\pi}$ do
Let ${n}_{\gamma}={n}_{\gamma}+1$.
Find two adjacent elements x and y in a cycle of $\widehat{\sigma}{\widehat{\pi}}^{-1}$ such that $\left(x,\hat{\text{\Gamma}}\left(y\right)\right)|\widehat{\pi}$.
Let ${\gamma}_{{n}_{\gamma}}=(\widehat{\pi}\hat{\text{\Gamma}}\left(y\right)$, $\widehat{\pi}\hat{\text{\Gamma}}\left(x\right))\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)$.
Calculate new $\widehat{\pi}={\gamma}_{{n}_{\gamma}}\widehat{\pi}$, new $\widehat{\pi}\hat{\text{\Gamma}}={\gamma}_{{n}_{\gamma}}\widehat{\pi}\hat{\text{\Gamma}}$ and new $\widehat{\sigma}{\widehat{\pi}}^{-1}=\widehat{\sigma}{\widehat{\pi}}^{-1}{\gamma}_{{n}_{\gamma}}^{-1}$.
end while
6: /* To find block-interchanges */
Let ${n}_{\beta}=0$.
while $\widehat{\sigma}{\widehat{\pi}}^{-1}\ne 1$ do
Let ${n}_{\beta}={n}_{\beta}+1$.
Choose any two adjacent elements x and y in a cycle of $\widehat{\sigma}{\widehat{\pi}}^{-1}$.
Find two adjacent integers u and v in a cycle of $\widehat{\sigma}{\widehat{\pi}}^{-1}\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)$ such that $\left(u,\phantom{\rule{2.77695pt}{0ex}}v\right)\nmid \left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)\widehat{\pi}$.
Let $\beta_{n_\beta}=(\widehat{\pi}\hat{\Gamma}(v),\widehat{\pi}\hat{\Gamma}(u))\,(\widehat{\pi}\hat{\Gamma}(y),\widehat{\pi}\hat{\Gamma}(x))\,(u,v)(x,y)$.
Calculate new $\widehat{\pi}=\beta_{n_\beta}\widehat{\pi}$, new $\widehat{\pi}\hat{\Gamma}=\beta_{n_\beta}\widehat{\pi}\hat{\Gamma}$ and new $\widehat{\sigma}\widehat{\pi}^{-1}=\widehat{\sigma}\widehat{\pi}^{-1}\beta_{n_\beta}^{-1}$.
end while
7: Output assembly(π) and $\text{\Delta}\left(\pi ,\phantom{\rule{2.77695pt}{0ex}}\sigma \right)={n}_{\gamma}+2{n}_{\beta}$.
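The reversal loop of step 5 can be sketched in Python. This is a minimal illustration, not the authors' implementation: permutations are stored as dictionaries, products are composed right to left, and $\hat{\Gamma}(a)=-a$ is assumed (this matches the numbers in the worked example of this section). The helper names `from_cycles`, `compose`, `inverse`, `tr`, `same_cycle` and `find_reversals` are hypothetical. The input is the capped genome that the worked example reaches just before its reversal.

```python
def from_cycles(*cycles):
    """Permutation as a dict; each element maps to its successor in its cycle."""
    perm = {}
    for cyc in cycles:
        for i, a in enumerate(cyc):
            b = cyc[(i + 1) % len(cyc)]
            if a != b:  # omit fixed points so equal maps compare equal
                perm[a] = b
    return perm


def compose(*perms):
    """Right-to-left product: compose(f, g) maps x to f(g(x))."""
    out = {}
    for x in set().union(*perms):
        y = x
        for p in reversed(perms):
            y = p.get(y, y)
        if y != x:
            out[x] = y
    return out


def inverse(perm):
    return {b: a for a, b in perm.items()}


def tr(a, b):
    """The 2-cycle (a, b)."""
    return {a: b, b: a}


def same_cycle(perm, a, b):
    """True if a and b lie in the same cycle of perm, written (a, b) | perm in the text."""
    c = a
    while True:
        if c == b:
            return True
        c = perm.get(c, c)
        if c == a:
            return False


def find_reversals(pi_hat, sigma_hat):
    """Repeat step 5 until no adjacent pair (x, y) with (x, -y) | pi_hat remains."""
    reversals = []
    while True:
        product = compose(sigma_hat, inverse(pi_hat))
        pair = next(((x, y) for x, y in sorted(product.items())
                     if same_cycle(pi_hat, x, -y)), None)
        if pair is None:
            return pi_hat, reversals
        x, y = pair
        # gamma = (pi Gamma(y), pi Gamma(x)) (x, y), with Gamma(a) = -a assumed
        gamma = compose(tr(pi_hat[-y], pi_hat[-x]), tr(x, y))
        pi_hat = compose(gamma, pi_hat)
        reversals.append(gamma)


pi_hat = from_cycles((7, 1, 4, -5, 6, 3, 2, 8), (-8, -2, -3, -6, 5, -4, -1, -7),
                     (9, 10), (-10, -9), (11, 12), (-12, -11))
sigma_hat = from_cycles((7, 1, 2, 3, 4, 5, 6, 8), (-8, -6, -5, -4, -3, -2, -1, -7),
                        (9, 10), (-10, -9), (11, 12), (-12, -11))
new_pi, reversals = find_reversals(pi_hat, sigma_hat)
```

The sketch recomputes $\widehat{\sigma}\widehat{\pi}^{-1}$ in every iteration for readability, so it does not reflect the running time claimed for Algorithm 1, which updates the product incrementally.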
Below, we consider an example to clarify Algorithm 1. Let π = {[1, 4], [-5, 6], [3, 2]} and σ = {[1, 2, ..., 6]} be the input linear, uni-chromosomal genomes of Algorithm 1. In our algorithm, these two genomes will be further represented by π = (1, 4)(-4, -1)(-5, 6)(-6, 5)(3, 2)(-2, -3) and σ = (1, 2, ..., 6)(-6, - 5, ..., -1). First of all, we add two null contigs into σ and cap all the contigs in π and σ in a way such that $\widehat{\pi}=\left(7,\phantom{\rule{2.77695pt}{0ex}}1,\phantom{\rule{2.77695pt}{0ex}}4,\phantom{\rule{2.77695pt}{0ex}}8\right)\phantom{\rule{0.3em}{0ex}}\left(-8,\phantom{\rule{2.77695pt}{0ex}}-4,\phantom{\rule{2.77695pt}{0ex}}-1,\phantom{\rule{2.77695pt}{0ex}}-7\right)\phantom{\rule{0.3em}{0ex}}\left(9,\phantom{\rule{2.77695pt}{0ex}}-5,\phantom{\rule{2.77695pt}{0ex}}6,\phantom{\rule{2.77695pt}{0ex}}10\right)\phantom{\rule{0.3em}{0ex}}\left(-10,\phantom{\rule{2.77695pt}{0ex}}-6,\phantom{\rule{2.77695pt}{0ex}}5,\phantom{\rule{0.3em}{0ex}}-9\right)\phantom{\rule{0.3em}{0ex}}\left(11,\phantom{\rule{2.77695pt}{0ex}}3,\phantom{\rule{2.77695pt}{0ex}}2,\phantom{\rule{2.77695pt}{0ex}}12\right)\left(-12,\phantom{\rule{2.77695pt}{0ex}}-2,\phantom{\rule{2.77695pt}{0ex}}-3,\phantom{\rule{2.77695pt}{0ex}}-11\right)$ and 
$\widehat{\sigma}=\left(7,\phantom{\rule{2.77695pt}{0ex}}1,\phantom{\rule{2.77695pt}{0ex}}2,\phantom{\rule{2.77695pt}{0ex}}.\phantom{\rule{2.77695pt}{0ex}}.\phantom{\rule{2.77695pt}{0ex}}.\phantom{\rule{2.77695pt}{0ex}},\phantom{\rule{2.77695pt}{0ex}}6,\phantom{\rule{2.77695pt}{0ex}}8\right)\phantom{\rule{0.3em}{0ex}}\left(-8,\phantom{\rule{2.77695pt}{0ex}}-6,\phantom{\rule{2.77695pt}{0ex}}-5,\phantom{\rule{2.77695pt}{0ex}}.\phantom{\rule{2.77695pt}{0ex}}.\phantom{\rule{2.77695pt}{0ex}}.\phantom{\rule{2.77695pt}{0ex}},\phantom{\rule{2.77695pt}{0ex}}-1,\phantom{\rule{2.77695pt}{0ex}}-7\right)\phantom{\rule{0.3em}{0ex}}\left(9,\phantom{\rule{2.77695pt}{0ex}}10\right)\phantom{\rule{0.3em}{0ex}}\left(-10,\phantom{\rule{2.77695pt}{0ex}}-9\right)\phantom{\rule{0.3em}{0ex}}\left(11,\phantom{\rule{2.77695pt}{0ex}}12\right)\phantom{\rule{0.3em}{0ex}}\left(-12,\phantom{\rule{2.77695pt}{0ex}}-11\right)$. Next, we compute $\widehat{\sigma}{\widehat{\pi}}^{-1}=\left(2,\phantom{\rule{2.77695pt}{0ex}}4\right)\phantom{\rule{0.3em}{0ex}}\left(-1,\phantom{\rule{2.77695pt}{0ex}}-3\right)\phantom{\rule{0.3em}{0ex}}\left(3,\phantom{\rule{2.77695pt}{0ex}}12\right)\phantom{\rule{0.3em}{0ex}}\left(-2,\phantom{\rule{2.77695pt}{0ex}}-11\right)\phantom{\rule{0.3em}{0ex}}\left(5,\phantom{\rule{2.77695pt}{0ex}}-5,\phantom{\rule{2.77695pt}{0ex}}10,\phantom{\rule{2.77695pt}{0ex}}8\right)\phantom{\rule{0.3em}{0ex}}\left(-4,\phantom{\rule{2.77695pt}{0ex}}-6,\phantom{\rule{2.77695pt}{0ex}}-9,\phantom{\rule{2.77695pt}{0ex}}6\right)$. It can be found that 10 and 8 are in a cycle of current $\widehat{\sigma}{\widehat{\pi}}^{-1}$ with $\left(\mathsf{\text{char}}\phantom{\rule{2.77695pt}{0ex}}\left(10,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right),\phantom{\rule{2.77695pt}{0ex}}\mathsf{\text{char}}\left(8,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)\right)=\left(\mathsf{\text{C}}3,\phantom{\rule{2.77695pt}{0ex}}\mathsf{\text{C}}3\right)\in \mathsf{\text{CEpair}}$. 
We perform a cap exchange on $\widehat{\pi}$ by multiplying $(\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(8,\widehat{\pi})),\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(10,\widehat{\pi})))\,(\widehat{\pi}\hat{\Gamma}(8),\widehat{\pi}\hat{\Gamma}(10))\,(5\mathsf{cap}(10,\widehat{\pi}),5\mathsf{cap}(8,\widehat{\pi}))\,(10,8)=(-8,-10)\,(-4,-6)\,(9,7)\,(10,8)$ with $\widehat{\pi}$, resulting in new $\widehat{\pi}=(7,1,4,10)\,(-10,-4,-1,-7)\,(9,-5,6,8)\,(-8,-6,5,-9)\,(11,3,2,12)\,(-12,-2,-3,-11)$.
In addition, we have new $\widehat{\sigma}\widehat{\pi}^{-1}=(2,4)\,(-1,-3)\,(3,12)\,(-2,-11)\,(5,-5,10)\,(-4,-9,6)\,(9,7)\,(-10,-8)$. It can be observed that -5 and 10 are adjacent in the same cycle of $\widehat{\sigma}\widehat{\pi}^{-1}$ and satisfy $\mathsf{char}(-5,\widehat{\pi})=\mathsf{T}$, $\mathsf{char}(10,\widehat{\pi})=\mathsf{C}3$ and $(-5,10)\nmid\widehat{\pi}$ (since -5 and 10 are in different contigs in current $\widehat{\pi}$).
Therefore, we perform a fusion on $\widehat{\pi}$ by multiplying $\tau_1=(\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(10,\widehat{\pi})),\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(-5,\widehat{\pi})))\,(\widehat{\pi}\hat{\Gamma}(10),\widehat{\pi}\hat{\Gamma}(-5))\,(5\mathsf{cap}(-5,\widehat{\pi}),5\mathsf{cap}(10,\widehat{\pi}))\,(-5,10)=(-10,-8)\,(-4,-9)\,(9,7)\,(-5,10)$ with $\widehat{\pi}$, to obtain new $\widehat{\pi}=(7,1,4,-5,6,8)\,(-8,-6,5,-4,-1,-7)\,(9,10)\,(-10,-9)\,(11,3,2,12)\,(-12,-2,-3,-11)$.
Moreover, we have new $\widehat{\sigma}{\widehat{\pi}}^{-1}=\left(2,\phantom{\rule{2.77695pt}{0ex}}4\right)\phantom{\rule{0.3em}{0ex}}\left(-1,\phantom{\rule{2.77695pt}{0ex}}-3\right)\phantom{\rule{0.3em}{0ex}}\left(3,\phantom{\rule{2.77695pt}{0ex}}12\right)\phantom{\rule{0.3em}{0ex}}\left(-2,\phantom{\rule{2.77695pt}{0ex}}-11\right)\phantom{\rule{0.3em}{0ex}}\left(5,\phantom{\rule{2.77695pt}{0ex}}-5\right)\phantom{\rule{0.3em}{0ex}}\left(-4,\phantom{\rule{2.77695pt}{0ex}}6\right)$, in which 3 and 12 form a (T, C3) pair but they belong to the same contig strand in $\widehat{\pi}$, that is, $\left(3,\phantom{\rule{2.77695pt}{0ex}}12\right)|\widehat{\pi}$. In this case, $\widehat{\pi}$ has a contig strand (7, 1, 4, -5, 6, 8) whose 3' cap is 8 that is different from 12. Hence, we multiply $(\widehat{\pi}\hat{\text{\Gamma}}\left(8\right)$, $\widehat{\pi}\hat{\text{\Gamma}}\left(3\right))\phantom{\rule{0.3em}{0ex}}(\widehat{\pi}\hat{\text{\Gamma}}\left(8\right)$, $\widehat{\pi}\hat{\text{\Gamma}}\left(12\right))\phantom{\rule{0.3em}{0ex}}\left(12,\phantom{\rule{2.77695pt}{0ex}}8\right)\left(3,\phantom{\rule{2.77695pt}{0ex}}8\right)=\left(-6,\phantom{\rule{2.77695pt}{0ex}}-11\right)\phantom{\rule{2.77695pt}{0ex}}\left(-6,\phantom{\rule{2.77695pt}{0ex}}-2\right)\phantom{\rule{2.77695pt}{0ex}}\left(12,\phantom{\rule{2.77695pt}{0ex}}8\right)\phantom{\rule{2.77695pt}{0ex}}\left(3,\phantom{\rule{2.77695pt}{0ex}}8\right)$ with $\widehat{\pi}$ to obtain new 
$\widehat{\pi}=\left(7,\phantom{\rule{2.77695pt}{0ex}}1,\phantom{\rule{2.77695pt}{0ex}}4,\phantom{\rule{2.77695pt}{0ex}}-5,\phantom{\rule{2.77695pt}{0ex}}6,\phantom{\rule{2.77695pt}{0ex}}3,\phantom{\rule{2.77695pt}{0ex}}2,\phantom{\rule{2.77695pt}{0ex}}8\right)\phantom{\rule{0.3em}{0ex}}\left(-8,\phantom{\rule{2.77695pt}{0ex}}-2,\phantom{\rule{2.77695pt}{0ex}}-3,\phantom{\rule{2.77695pt}{0ex}}-6,\phantom{\rule{2.77695pt}{0ex}}5,\phantom{\rule{2.77695pt}{0ex}}-4,\phantom{\rule{2.77695pt}{0ex}}-1,\phantom{\rule{2.77695pt}{0ex}}-7\right)\phantom{\rule{0.3em}{0ex}}\left(9,\phantom{\rule{2.77695pt}{0ex}}10\right)\phantom{\rule{0.3em}{0ex}}\left(-10,\phantom{\rule{2.77695pt}{0ex}}-9\right)\phantom{\rule{0.3em}{0ex}}\left(11,\phantom{\rule{2.77695pt}{0ex}}12\right)\phantom{\rule{0.3em}{0ex}}\left(-12,\phantom{\rule{2.77695pt}{0ex}}-11\right)$ and new $\widehat{\sigma}{\widehat{\pi}}^{-1}=\left(2,\phantom{\rule{2.77695pt}{0ex}}4\right)\phantom{\rule{0.3em}{0ex}}\left(-1,\phantom{\rule{2.77695pt}{0ex}}-3\right)\phantom{\rule{0.3em}{0ex}}\left(5,\phantom{\rule{2.77695pt}{0ex}}-5\right)\phantom{\rule{0.3em}{0ex}}\left(-4,\phantom{\rule{2.77695pt}{0ex}}6\right)\phantom{\rule{0.3em}{0ex}}\left(3,\phantom{\rule{2.77695pt}{0ex}}8\right)\phantom{\rule{0.3em}{0ex}}\left(-2,\phantom{\rule{2.77695pt}{0ex}}-6\right)$. Notice that -4 and 6 are adjacent in a cycle of current $\widehat{\sigma}{\widehat{\pi}}^{-1}$ and they are in different strands in current $\widehat{\pi}$ since $\left(-4,\phantom{\rule{0.3em}{0ex}}\hat{\text{\Gamma}}\left(6\right)\right)|\widehat{\pi}$. 
Thus, we can find a reversal, which is $\left(\widehat{\pi}\hat{\text{\Gamma}}\left(6\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\hat{\text{\Gamma}}\left(-4\right)\right)\left(-4,\phantom{\rule{2.77695pt}{0ex}}6\right)=\left(5,\phantom{\rule{2.77695pt}{0ex}}-5\right)\phantom{\rule{0.3em}{0ex}}\left(-4,6\right)$, from $\widehat{\sigma}{\widehat{\pi}}^{-1}$ to transform $\widehat{\pi}$ into (7, 1, 4, 5, 6, 3, 2, 8) (-8, -2, -3, -6, -5, -4, -1, -7) (9, 10) (-10, -9) (11, 12) (-12, -11). After that, we have new $\widehat{\sigma}{\widehat{\pi}}^{-1}=\left(2,\phantom{\rule{2.77695pt}{0ex}}4\right)\phantom{\rule{2.77695pt}{0ex}}\left(-1,\phantom{\rule{2.77695pt}{0ex}}-3\right)\phantom{\rule{2.77695pt}{0ex}}\left(3,\phantom{\rule{2.77695pt}{0ex}}8\right)\phantom{\rule{2.77695pt}{0ex}}\left(-2,\phantom{\rule{2.77695pt}{0ex}}-6\right)$, which can serve as a block-interchange to further transform $\widehat{\pi}$ into (7, 1, 2, 3, 4, 5, 6, 8)(-8, -6, -5, -4, -3, -2, -1, -7) (9, 10) (-10, -9) (11, 12) (-12, -11), which is equal to $\widehat{\sigma}$. As a result, we obtain an ordering ([1, 4], [-5, 6], [3, 2]) of π whose induced permutation [1, 4] ⊙ [-5, 6] ⊙ [3, 2] = (1, 4, -5, 6, 3, 2) can be transformed into the permutation (1, 2, ..., 6) of σ using a reversal and a block-interchange (i.e., Δ(π, σ) = 3).
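The example above can be replayed mechanically. The following Python sketch is only an illustration under stated assumptions: permutations are dictionaries, a cycle maps each element to its successor, and a product is applied right to left, which is the convention that reproduces the numbers in the text. The helper names `from_cycles`, `compose` and `inverse` are hypothetical, and the five operations are copied from the example rather than discovered by the algorithm.

```python
def from_cycles(*cycles):
    """Permutation as a dict; each element maps to its successor in its cycle."""
    perm = {}
    for cyc in cycles:
        for i, a in enumerate(cyc):
            b = cyc[(i + 1) % len(cyc)]
            if a != b:  # omit fixed points so equal maps compare equal
                perm[a] = b
    return perm


def compose(*perms):
    """Right-to-left product: compose(f, g) maps x to f(g(x))."""
    out = {}
    for x in set().union(*perms):
        y = x
        for p in reversed(perms):
            y = p.get(y, y)
        if y != x:
            out[x] = y
    return out


def inverse(perm):
    return {b: a for a, b in perm.items()}


# Capped genomes from the example.
pi_hat = from_cycles((7, 1, 4, 8), (-8, -4, -1, -7), (9, -5, 6, 10),
                     (-10, -6, 5, -9), (11, 3, 2, 12), (-12, -2, -3, -11))
sigma_hat = from_cycles((7, 1, 2, 3, 4, 5, 6, 8), (-8, -6, -5, -4, -3, -2, -1, -7),
                        (9, 10), (-10, -9), (11, 12), (-12, -11))

initial_product = compose(sigma_hat, inverse(pi_hat))

# Operations of the example, each written as a product of 2-cycles.
operations = [
    ("cap exchange", [(-8, -10), (-4, -6), (9, 7), (10, 8)]),
    ("fusion", [(-10, -8), (-4, -9), (9, 7), (-5, 10)]),
    ("fusion", [(-6, -11), (-6, -2), (12, 8), (3, 8)]),
    ("reversal", [(5, -5), (-4, 6)]),
    ("block-interchange", [(2, 4), (-1, -3), (3, 8), (-2, -6)]),
]

current = pi_hat
for name, factors in operations:
    op = compose(*(from_cycles(c) for c in factors))
    current = compose(op, current)  # new pi_hat = op * pi_hat

n_reversals = sum(name == "reversal" for name, _ in operations)
n_block_interchanges = sum(name == "block-interchange" for name, _ in operations)
distance = n_reversals + 2 * n_block_interchanges
print(current == sigma_hat, distance)
```

Comparing dictionaries avoids having to normalise the rotation of each cycle; the final comparison checks that the transformed $\widehat{\pi}$ coincides with $\widehat{\sigma}$ on every element.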
Actually, after running step 3 of Algorithm 1, it can be verified, according to the capping of π and σ and Lemma 3, that for any two adjacent elements x and y in a cycle of $\widehat{\sigma}\widehat{\pi}^{-1}$ with $(\mathsf{char}(x,\widehat{\pi}),\mathsf{char}(y,\widehat{\pi}))=(\mathsf{T},\mathsf{C}3)$, if $(x,y)\nmid\widehat{\pi}$, then $(x,\hat{\Gamma}(y))\nmid\widehat{\pi}$. Moreover, the operation $\tau_i=(\widehat{\pi}\hat{\Gamma}(z),\widehat{\pi}\hat{\Gamma}(x))\,(\widehat{\pi}\hat{\Gamma}(z),\widehat{\pi}\hat{\Gamma}(y))\,(y,z)(x,z)$ used in step 4 of Algorithm 1 still acts on $\widehat{\pi}$ as a fusion of π, as explained below. Notice that $(x,y)\mid\widehat{\pi}$, meaning that x and y are in the same cycle of $\widehat{\pi}$ and hence $5\mathsf{cap}(x,\widehat{\pi})=5\mathsf{cap}(y,\widehat{\pi})$. It follows that $(5\mathsf{cap}(y,\widehat{\pi}),5\mathsf{cap}(z,\widehat{\pi}))\,(5\mathsf{cap}(x,\widehat{\pi}),5\mathsf{cap}(z,\widehat{\pi}))=1$, since the two 2-cycles coincide.
Since $(x,y)\mid\widehat{\pi}$, we have $(\widehat{\pi}\hat{\Gamma}(y),\widehat{\pi}\hat{\Gamma}(x))\mid\widehat{\pi}$ and hence $(5\mathsf{cap}(\widehat{\pi}\hat{\Gamma}(z),\widehat{\pi}),5\mathsf{cap}(\widehat{\pi}\hat{\Gamma}(y),\widehat{\pi}))\,(5\mathsf{cap}(\widehat{\pi}\hat{\Gamma}(z),\widehat{\pi}),5\mathsf{cap}(\widehat{\pi}\hat{\Gamma}(x),\widehat{\pi}))=1$. It is not hard to see that $(\widehat{\pi}\hat{\Gamma}(z),\widehat{\pi}\hat{\Gamma}(x))\,(\widehat{\pi}\hat{\Gamma}(z),\widehat{\pi}\hat{\Gamma}(y))=(\widehat{\pi}\hat{\Gamma}(x),\widehat{\pi}\hat{\Gamma}(y))\,(\widehat{\pi}\hat{\Gamma}(z),\widehat{\pi}\hat{\Gamma}(x))$.
Thus, τ_{ i } can be rewritten as ${\tau}_{i}={\alpha}_{2}{\alpha}_{1}$, where ${\alpha}_{1}=(5\mathsf{\text{cap}}\left(\widehat{\pi}\hat{\text{\Gamma}}\left(z\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$, $5\mathsf{\text{cap}}\left(\widehat{\pi}\hat{\text{\Gamma}}\left(x\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right))(\widehat{\pi}\hat{\text{\Gamma}}\left(z\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\hat{\text{\Gamma}}\left(x\right))(5\mathsf{\text{cap}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$, $5\mathsf{\text{cap}}\left(z,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right))\left(x,\phantom{\rule{2.77695pt}{0ex}}z\right)$ and ${\alpha}_{2}=(5\mathsf{\text{cap}}\left(\widehat{\pi}\hat{\text{\Gamma}}\left(z\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$, $5\mathsf{\text{cap}}\left(\widehat{\pi}\hat{\text{\Gamma}}\left(y\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right))(\widehat{\pi}\hat{\text{\Gamma}}\left(x\right),\widehat{\pi}\hat{\text{\Gamma}}\left(y\right))(5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$, $5\mathsf{\text{cap}}\left(z,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right))\left(y,\phantom{\rule{2.77695pt}{0ex}}z\right)$. It can be verified that ${\alpha}_{2}=(5\mathsf{\text{cap}}\left({\alpha}_{1}\widehat{\pi}\hat{\text{\Gamma}}\left(z\right),\phantom{\rule{2.77695pt}{0ex}}{\alpha}_{1}\widehat{\pi}\right)$, $5\mathsf{\text{cap}}\left({\alpha}_{1}\widehat{\pi}\hat{\text{\Gamma}}\left(y\right),\phantom{\rule{2.77695pt}{0ex}}{\alpha}_{1}\widehat{\pi}\right))({\alpha}_{1}\widehat{\pi}\hat{\text{\Gamma}}\left(z\right),\phantom{\rule{2.77695pt}{0ex}}{\alpha}_{1}\widehat{\pi}\hat{\text{\Gamma}}\left(y\right))(5\mathsf{\text{cap}}\left(z,\phantom{\rule{2.77695pt}{0ex}}{\alpha}_{1}\widehat{\pi}\right)$, $5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}{\alpha}_{1}\widehat{\pi}\right))\left(y,\phantom{\rule{2.77695pt}{0ex}}z\right)$. 
By Lemma 5, as well as the previous discussion, it can be seen that α_{1} acts on $\widehat{\pi}$ as a fusion of π and α_{2} then acts on ${\alpha}_{1}\widehat{\pi}$ as a cap exchange. As a result, the rearrangement effect of applying τ_{ i } to $\widehat{\pi}$ is still equivalent to a fusion acting on π. The above discussion indicates that a fusion of π can be mimicked by a translocation τ, which acts on $\widehat{\pi}$ as a fusion of π, followed by zero or more translocations acting on $\tau \widehat{\pi}$ as cap exchanges.
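The decomposition τ_{ i } = α_{2}α_{1} can be checked numerically on the second fusion of the worked example (x = 3, y = 12, z = 8). The Python sketch below is an illustration under assumptions: $\hat{\Gamma}(a)=-a$, the 5' cap of a strand is taken to be its first listed element, products are composed right to left, and the helper names (`from_cycles`, `compose`, `tr`, `pg`, `cap5`) are hypothetical.

```python
def from_cycles(*cycles):
    """Permutation as a dict; each element maps to its successor in its cycle."""
    perm = {}
    for cyc in cycles:
        for i, a in enumerate(cyc):
            b = cyc[(i + 1) % len(cyc)]
            if a != b:  # omit fixed points so equal maps compare equal
                perm[a] = b
    return perm


def compose(*perms):
    """Right-to-left product: compose(f, g) maps x to f(g(x))."""
    out = {}
    for x in set().union(*perms):
        y = x
        for p in reversed(perms):
            y = p.get(y, y)
        if y != x:
            out[x] = y
    return out


def tr(a, b):
    """The 2-cycle (a, b)."""
    return {a: b, b: a}


# Capped genome of the example just before the second fusion.
strands = [(7, 1, 4, -5, 6, 8), (-8, -6, 5, -4, -1, -7), (9, 10), (-10, -9),
           (11, 3, 2, 12), (-12, -2, -3, -11)]
pi_hat = from_cycles(*strands)


def pg(a):
    """pi_hat applied after Gamma, assuming Gamma(a) = -a."""
    return pi_hat[-a]


def cap5(a):
    """5' cap of the strand containing a (assumed to be its first listed element)."""
    return next(s[0] for s in strands if a in s)


x, y, z = 3, 12, 8
tau = compose(tr(pg(z), pg(x)), tr(pg(z), pg(y)), tr(y, z), tr(x, z))
alpha1 = compose(tr(cap5(pg(z)), cap5(pg(x))), tr(pg(z), pg(x)),
                 tr(cap5(x), cap5(z)), tr(x, z))
alpha2 = compose(tr(cap5(pg(z)), cap5(pg(y))), tr(pg(x), pg(y)),
                 tr(cap5(y), cap5(z)), tr(y, z))

after_fusion = compose(alpha1, pi_hat)          # fused contig plus a null contig (11, 8)
after_exchange = compose(alpha2, after_fusion)  # caps 12 and 8 swapped back
print(compose(alpha2, alpha1) == tau)
```

In this instance the intermediate genome `after_fusion` contains the fused strand (7, 1, 4, -5, 6, 3, 2, 12) and the null contig (11, 8), and the subsequent cap exchange restores 8 as the 3' cap of the fused strand.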
In the following, we prove the correctness of Algorithm 1. Initially, it is not hard to see that all the 5' caps are fixed in $\widehat{\sigma}\widehat{\pi}^{-1}$ and $\mathsf{char}(x,\widehat{\pi})\ne\mathsf{N}3$ for all $x\in\hat{E}$. For any element $x\in\hat{E}$ with $\mathsf{char}(x,\widehat{\pi}_i)=\mathsf{T}$, where $1\le i\le m$, if $\widehat{\pi}^{-1}(x)\ne\widehat{\sigma}_1^{+}[1]$ and $\widehat{\pi}^{-1}(x)\ne\widehat{\sigma}_1^{-}[1]$, that is, the 5' cap of $\widehat{\pi}_i$ is not equal to that of $\widehat{\sigma}_1$, then the character of $\widehat{\sigma}\widehat{\pi}^{-1}(x)$ in $\widehat{\pi}$ must be C3. If any cycle in $\widehat{\sigma}\widehat{\pi}^{-1}$ contains any two elements x and y with the same character (either T or C3) in $\widehat{\pi}$, then we can extract two 2-cycles c_{1} = (x, y) and $c_1'=(\widehat{\pi}\hat{\Gamma}(y),\widehat{\pi}\hat{\Gamma}(x))$ from two mate cycles in $\widehat{\sigma}\widehat{\pi}^{-1}$ and multiply $c_2'c_1'c_2c_1$ with $\widehat{\pi}$ to exchange the caps of the contigs containing x and y, respectively, in $\widehat{\pi}$, where $c_2=(5\mathsf{cap}(x,\widehat{\pi}),5\mathsf{cap}(y,\widehat{\pi}))$ and $c_2'=(\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(y,\widehat{\pi})),\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(x,\widehat{\pi})))$. This is the job performed in step 3 of Algorithm 1.
Moreover, after finishing the cap exchanges in step 3, each cycle in the remaining $\widehat{\sigma}\widehat{\pi}^{-1}$ has at most one element with T character and at most one element with C3 character. In other words, after running step 3, there are at least 2(m - 1) cycles in the resulting $\widehat{\sigma}\widehat{\pi}^{-1}$ such that each such cycle contains exactly one element, say x, with $\mathsf{char}(x,\widehat{\pi})=\mathsf{T}$ and exactly one element, say y, with $\mathsf{char}(y,\widehat{\pi})=\mathsf{C}3$, and $\widehat{\sigma}\widehat{\pi}^{-1}(x)=y$. In this case, we can further derive 2(m - 1) 2-cycles from these cycles in $\widehat{\sigma}\widehat{\pi}^{-1}$ with each 2-cycle having a character pair of (T, C3). Intriguingly, we shall show below that these 2(m - 1) 2-cycles with character pair (T, C3), denoted by $f_1,f_1',\dots,f_{m-1},f_{m-1}'$, can be used to obtain an optimal ordering of π such that the weighted reversal and block-interchange distance between the permutation induced by this ordering of π and σ is minimum. In fact, f_{ k } and $f_k'$, where 1 ≤ k ≤ m - 1, are derived from two mate cycles in $\widehat{\sigma}\widehat{\pi}^{-1}$ and hence we call them mate 2-cycles below. Moreover, if $f_k=(x,y)$, then $f_k'=(\widehat{\pi}\hat{\Gamma}(y),\widehat{\pi}\hat{\Gamma}(x))$.
For 1 ≤ k ≤ m - 1, we simply let $f_k=(x_k,y_k)$, where $\mathsf{char}(x_k,\widehat{\pi})=\mathsf{T}$ and $\mathsf{char}(y_k,\widehat{\pi})=\mathsf{C}3$. Then $f_k'=(\widehat{\pi}\hat{\Gamma}(y_k),\widehat{\pi}\hat{\Gamma}(x_k))$. As mentioned previously, the permutation induced by an ordering of π can be mimicked by performing m - 1 consecutive fusions on π that has m contigs initially. According to Lemma 5 and our previous discussion, if $f_k\nmid\widehat{\pi}$, where 1 ≤ k ≤ m - 1, then $g_k'f_k'g_kf_k$ can be applied to $\widehat{\pi}$ to function as a fusion of two contigs in π, where $g_k=(5\mathsf{cap}(x_k,\widehat{\pi}),5\mathsf{cap}(y_k,\widehat{\pi}))$ and $g_k'=(\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(y_k,\widehat{\pi})),\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(x_k,\widehat{\pi})))$. Notice that g_{ k } and $g_k'$ are mate 2-cycles. However, it is not necessarily the case that none of $f_1,f_2,\dots,f_{m-1}$ divides $\widehat{\pi}$. Suppose that only the first λ 2-cycles $f_1,f_2,\dots,f_\lambda$ do not divide $\widehat{\pi}$, where 0 ≤ λ ≤ m - 1, that is, $f_k\nmid\widehat{\pi}$ for 1 ≤ k ≤ λ, but $f_k\mid\widehat{\pi}$ for λ + 1 ≤ k ≤ m - 1. In this situation, we shall show below that we can still use $f_1,f_2,\dots,f_{m-1}$, as well as their mate 2-cycles, to derive an optimal ordering of π, as we did in step 4 of Algorithm 1.
Recall that the 5' caps are all fixed in the initial $\widehat{\sigma}\widehat{\pi}^{-1}$ (before step 3 of Algorithm 1). As mentioned before, any translocation performed on $\widehat{\pi}$ can be expressed as four 2-cycles, two with character pair (non-C5, non-C5) and the other two with (C5, C5). It can be verified that during the process of step 3, no two elements x and y with $\mathsf{char}(x,\widehat{\pi})=\mathsf{C}5$ but $\mathsf{char}(y,\widehat{\pi})\ne\mathsf{C}5$ can be found in a cycle of $\widehat{\sigma}\widehat{\pi}^{-1}$ [17], that is, C5 and non-C5 elements are not mixed together in the same cycle of $\widehat{\sigma}\widehat{\pi}^{-1}$. Actually, this property continues to hold when we later perform any translocation on $\widehat{\pi}$ to function as a fusion of π. Let us now pay attention to those cycles in $\widehat{\sigma}\widehat{\pi}^{-1}$ with only non-C5 elements and temporarily denote the composition of these cycles by $\varphi(\widehat{\sigma}\widehat{\pi}^{-1})$. If we can still find any two elements x and y from a cycle in $\varphi(\widehat{\sigma}\widehat{\pi}^{-1})$ such that $(\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(y,\widehat{\pi})),\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(x,\widehat{\pi})))\,(\widehat{\pi}\hat{\Gamma}(y),\widehat{\pi}\hat{\Gamma}(x))\,(5\mathsf{cap}(x,\widehat{\pi}),5\mathsf{cap}(y,\widehat{\pi}))\,(x,y)$ is an exchange of caps when applied to $\widehat{\pi}$, then we apply this cap exchange to $\widehat{\pi}$, and we repeat this until no such pair can be found in $\varphi(\widehat{\sigma}\widehat{\pi}^{-1})$.
Finally, we denote such a $\varphi(\widehat{\sigma}\widehat{\pi}^{-1})$ without any cap exchange by $\psi(\widehat{\sigma}\widehat{\pi}^{-1})$. Basically, $\psi(\widehat{\sigma}\widehat{\pi}^{-1})$ can be considered as a permutation of $E'=E\cup\{-c_{2i},\,c_{2i+1} : 0\le i\le m-1\}$ and hence its norm $\|\psi(\widehat{\sigma}\widehat{\pi}^{-1})\|$ is equal to $|E'|-n_c(\psi(\widehat{\sigma}\widehat{\pi}^{-1}))$ according to the formula we mentioned before.
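The norm formula used here, the size of the ground set minus the number of disjoint cycles (fixed points counted as cycles of length one), is simple to compute. The Python sketch below is a generic illustration with hypothetical helper names; it does not construct ψ or E' themselves, and the ground sets passed in are chosen only for demonstration.

```python
def from_cycles(*cycles):
    """Permutation as a dict; each element maps to its successor in its cycle."""
    perm = {}
    for cyc in cycles:
        for i, a in enumerate(cyc):
            b = cyc[(i + 1) % len(cyc)]
            if a != b:
                perm[a] = b
    return perm


def cycle_lengths(perm, domain):
    """Lengths of all disjoint cycles of perm on domain, fixed points included."""
    seen, lengths = set(), []
    for start in domain:
        if start in seen:
            continue
        length, a = 0, start
        while a not in seen:
            seen.add(a)
            a = perm.get(a, a)
            length += 1
        lengths.append(length)
    return lengths


def norm(perm, domain):
    """|domain| minus the number of cycles; domain must contain every moved element."""
    return len(domain) - len(cycle_lengths(perm, domain))


# The product left after the reversal in the worked example: four 2-cycles.
product = from_cycles((2, 4), (-1, -3), (3, 8), (-2, -6))
small_domain = sorted(product)                      # moved elements only
large_domain = [e for e in range(-12, 13) if e != 0]  # all 24 capped elements
print(norm(product, small_domain), norm(product, large_domain))
```

Because each fixed point adds one to both the size of the ground set and the cycle count, the norm equals the sum of (cycle length - 1) over all cycles and does not change when fixed elements are added to the ground set.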
Lemma 6 Let $\tau=(\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(y,\widehat{\pi})),\widehat{\pi}\hat{\Gamma}(5\mathsf{cap}(x,\widehat{\pi})))\,(\widehat{\pi}\hat{\Gamma}(y),\widehat{\pi}\hat{\Gamma}(x))\,(5\mathsf{cap}(x,\widehat{\pi}),5\mathsf{cap}(y,\widehat{\pi}))\,(x,y)$ be a fusion to act on π, where $\mathsf{char}(x,\widehat{\pi})=\mathsf{T}$ and $\mathsf{char}(y,\widehat{\pi})=\mathsf{C}3$. Then $\|\psi(\widehat{\sigma}\widehat{\pi}^{-1})\|-\|\psi(\widehat{\sigma}\widehat{\pi}^{-1}\tau^{-1})\|\in\{-2,0,2\}$.
Proof. For simplicity, it is assumed that we cannot find any cap exchange from $\widehat{\sigma}{\widehat{\pi}}^{-1}$ to perform on $\widehat{\pi}$. We then consider the following two cases.
Case 1: Suppose that $\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)|\widehat{\sigma}{\widehat{\pi}}^{-1}$, that is, both x and y lie in the same cycle, say α, in $\widehat{\sigma}{\widehat{\pi}}^{-1}$. Without loss of generality, let $\alpha =\left({a}_{1},\phantom{\rule{2.77695pt}{0ex}}{a}_{2},\phantom{\rule{2.77695pt}{0ex}}\dots ,\phantom{\rule{2.77695pt}{0ex}}{a}_{i}\equiv x,\phantom{\rule{2.77695pt}{0ex}}\dots ,\phantom{\rule{2.77695pt}{0ex}}{a}_{j}\equiv \phantom{\rule{2.77695pt}{0ex}}y\right)$. Then α can be expressed as α = α_{1}α_{2}(x, y), where α_{1} = (a_{1}, ..., a_{ i }) and α_{2} = (a_{i+1}, ..., a_{ j }). Let ${\alpha}^{\prime}$ denote the mate cycle of α in $\widehat{\sigma}{\widehat{\pi}}^{-1}$, that is, ${\alpha}^{\prime}=(\widehat{\pi}\hat{\text{\Gamma}}\left({a}_{j}\right),\phantom{\rule{2.77695pt}{0ex}}\dots ,\widehat{\pi}\hat{\text{\Gamma}}\left({a}_{i}\right),\phantom{\rule{2.77695pt}{0ex}}\dots ,\widehat{\pi}\hat{\text{\Gamma}}\left({a}_{2}\right),\widehat{\pi}\hat{\text{\Gamma}}\left({a}_{1}\right))$. Then ${\alpha}^{\prime}$ can be expressed as ${\alpha}^{\prime}={\alpha}_{1}^{\prime}{\alpha}_{2}^{\prime}(\widehat{\pi}\hat{\text{\Gamma}}\left(y\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\hat{\text{\Gamma}}\left(x\right))$, where ${\alpha}_{1}^{\prime}=(\widehat{\pi}\hat{\text{\Gamma}}\left({a}_{i-1}\right),\phantom{\rule{2.77695pt}{0ex}}\dots ,\widehat{\pi}\hat{\text{\Gamma}}\left({a}_{1}\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\hat{\text{\Gamma}}\left({a}_{j}\right))$ and ${\alpha}_{2}^{\prime}=(\widehat{\pi}\hat{\text{\Gamma}}\left({a}_{j-1}\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\hat{\text{\Gamma}}\left({a}_{j-2}\right),\phantom{\rule{2.77695pt}{0ex}}\dots ,\widehat{\pi}\hat{\text{\Gamma}}\left({a}_{i}\right))$.
Clearly, after applying τ to $\widehat{\pi}$, the cycle α becomes two disjoint cycles α_{1} and α_{2} in $\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}$, and ${\alpha}^{\prime}$ becomes two disjoint cycles ${\alpha}_{1}^{\prime}$ and ${\alpha}_{2}^{\prime}$. This means that ${n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}\right))={n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right))+2$ and hence $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=2$.
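The splitting step used in Case 1 can be checked on a toy cycle. The following Python sketch is only an illustration under stated assumptions: permutations are dictionaries, the composition αβ applies β first and then α (the convention under which α = α_{1}α_{2}(x, y) holds for the cycles defined above), and the helper names are hypothetical. It handles a single cycle only; the mate cycle, which splits in the same way, is not modeled.

```python
def from_cycles(cycles, elements):
    """Build a permutation (dict: element -> image) from disjoint cycles."""
    perm = {e: e for e in elements}
    for cyc in cycles:
        for i, a in enumerate(cyc):
            perm[a] = cyc[(i + 1) % len(cyc)]
    return perm


def compose(a, b):
    """Return the composition a.b, which applies b first and then a."""
    return {e: a[b[e]] for e in b}


def cycles_of(perm):
    """List the disjoint cycles of perm as tuples (fixed points included)."""
    seen, result = set(), []
    for start in perm:
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = perm[x]
        result.append(tuple(cyc))
    return result


def norm(perm):
    """Number of elements minus number of cycles."""
    return len(perm) - len(cycles_of(perm))


E = [1, 2, 3, 4, 5]
alpha = from_cycles([(1, 2, 3, 4, 5)], E)  # a_i = x = 3, a_j = y = 5
swap = from_cycles([(3, 5)], E)            # the 2-cycle (x, y), its own inverse

after = compose(alpha, swap)               # alpha composed with (x, y)^{-1}
print(cycles_of(after))                    # [(1, 2, 3), (4, 5)]
print(norm(alpha) - norm(after))           # 1
```

Here the single cycle loses one unit of norm. In the lemma, the mate cycle is split at the same time by the mate 2-cycle, which gives the total decrease of two. Composing `after` with `swap` once more restores `alpha`, which shows the converse effect: a 2-cycle whose elements lie in different cycles merges them.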
Case 2: Suppose that $\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)\nmid \widehat{\sigma}{\widehat{\pi}}^{-1}$, that is, x and y lie in two different cycles, say α_{1} and α_{2}, in $\widehat{\sigma}{\widehat{\pi}}^{-1}$. In this case, $\widehat{\pi}\hat{\text{\Gamma}}\left(x\right)$ and $\widehat{\pi}\hat{\text{\Gamma}}\left(y\right)$ are also in two different cycles, say ${\alpha}_{1}^{\prime}$ and ${\alpha}_{2}^{\prime}$, that are the mate cycles of α_{1} and α_{2}, respectively, in $\widehat{\sigma}{\widehat{\pi}}^{-1}$. By Lemma 4, char $\left(\widehat{\pi}\hat{\text{\Gamma}}\left(x\right),\widehat{\pi}\right)=\mathsf{\text{C}}3$ and char $\left(\widehat{\pi}\hat{\text{\Gamma}}\left(y\right),\widehat{\pi}\right)=\mathsf{\text{T}}$. Then performing τ on $\widehat{\pi}$ joins α_{1} and α_{2} into a single cycle, say α, in $\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}$ and joins ${\alpha}_{1}^{\prime}$ and ${\alpha}_{2}^{\prime}$ into another cycle, say ${\alpha}^{\prime}$. If neither α_{1} nor α_{2}, and likewise neither ${\alpha}_{1}^{\prime}$ nor ${\alpha}_{2}^{\prime}$, contains both T and C3 elements simultaneously, then ${n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}\right))={n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right))-2$ and hence $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=-2$.
If exactly one of α_{1} and α_{2}, as well as exactly one of ${\alpha}_{1}^{\prime}$ and ${\alpha}_{2}^{\prime}$, contains both T and C3 elements simultaneously, then joining α_{1} and α_{2} will also change char $\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$ from T to O and char $\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$ from C3 to N3, and joining ${\alpha}_{1}^{\prime}$ and ${\alpha}_{2}^{\prime}$ will change char $\left(\widehat{\pi}\hat{\text{\Gamma}}\left(x\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$ from C3 to N3 and char $\left(\widehat{\pi}\hat{\text{\Gamma}}\left(y\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$ from T to O. Therefore, the cycle α, as well as ${\alpha}^{\prime}$, contains a C3 (or T) element and an N3 element. In this case, we can use these four elements, along with their corresponding 5' caps in $\widehat{\pi}$, as a cap exchange to perform on $\widehat{\pi}$, resulting in that each of the cycles α and ${\alpha}^{\prime}$ is divided into two smaller ones in new $\widehat{\sigma}{\widehat{\pi}}^{-1}$. As a result, ${n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}\right))={n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right))$ and hence $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=0$. Suppose that both α_{1} and α_{2}, as well as both ${\alpha}_{1}^{\prime}$ and ${\alpha}_{2}^{\prime}$, contain T and C3 elements at the same time. Then, after applying τ to $\widehat{\pi}$, one of the above two T elements becomes an O element in new $\widehat{\pi}$, leading to α, as well as ${\alpha}^{\prime}$, containing only a T element, along with a C3 element and an N3 element. 
Next, we can use the T and N3 elements (or the C3 and N3 elements) in α and ${\alpha}^{\prime}$ and their corresponding 5' caps in $\widehat{\pi}$ to exchange the caps of $\widehat{\pi}$. After that, α, as well as ${\alpha}^{\prime}$, is divided into two cycles in the new $\widehat{\sigma}{\widehat{\pi}}^{-1}$ and, consequently, ${n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}\right))={n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right))$ and hence $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=0$.
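For reference, the outcomes of the two cases in this proof can be collected in one display. This is only a restatement of the argument above, under the proof's assumption that no cap exchange can be derived from $\widehat{\sigma}{\widehat{\pi}}^{-1}$ beforehand; here α_{1} and α_{2} are the cycles containing x and y in Case 2, and the same conditions hold for their mate cycles.

$$
\left\Vert \psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right)\right\Vert -\left\Vert \psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}\right)\right\Vert =
\begin{cases}
2 & \text{if } (x,y)\mid \widehat{\sigma}{\widehat{\pi}}^{-1},\\
-2 & \text{if } (x,y)\nmid \widehat{\sigma}{\widehat{\pi}}^{-1} \text{ and neither } {\alpha}_{1} \text{ nor } {\alpha}_{2} \text{ contains both T and C3 elements},\\
0 & \text{if } (x,y)\nmid \widehat{\sigma}{\widehat{\pi}}^{-1} \text{ and at least one of } {\alpha}_{1}, {\alpha}_{2} \text{ contains both T and C3 elements}.
\end{cases}
$$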
Notice that if $\widehat{\pi}=\widehat{\sigma}$, then $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=0$. According to Lemmas 5 and 6, any translocation τ that acts on $\widehat{\pi}$ as a fusion of π decreases the norm $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right)\left|\right|$ by at most two. Hence, we call τ a good fusion of π if $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\pi}}^{-1}{\tau}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=2$. By the discussion in the proof of Lemma 6, we have the following corollary.
Corollary 1 Let $\tau =(\widehat{\pi}\hat{\text{\Gamma}}(5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right))$, $\widehat{\pi}\hat{\text{\Gamma}}(5\mathsf{\text{cap}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)))(\widehat{\pi}\hat{\text{\Gamma}}\left(y\right)$, $\widehat{\pi}\hat{\text{\Gamma}}\left(x\right))(5\mathsf{\text{cap}}\left(x,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right)$, $5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\right))\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)$ be a fusion to act on π, where char $\left(x,\widehat{\pi}\right)=\mathsf{\text{T}}$ and char $\left(y,\widehat{\pi}\right)=\mathsf{\text{C}}3$. If $\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)|\widehat{\sigma}{\widehat{\pi}}^{-1}$, then τ is a good fusion to perform on π.
According to Corollary 1, each f_{ k }, together with its mate 2-cycle ${f}_{k}^{\prime}$, derives a good fusion to act on π, where 1 ≤ k ≤ λ. If λ = m - 1, then performing the m - 1 fusions on π, as we did in Algorithm 1, corresponds to an optimal ordering of π such that the weighted reversal and block-interchange distance between the assembly of π and σ is minimum. To simplify the discussion below, we assume that the λ good fusions derived from f_{1}, f_{2}, ..., f_{λ} and their mate 2-cycles can assemble λ + 1 contigs of π into several super-contigs. If λ < m - 1, then we show below that the fusions of m - 1 contigs in π performed by our algorithm utilizing f_{1}, f_{2}, ..., f_{m-1} are still optimal.
Lemma 7 Let ${\tau}_{1},{\tau}_{2},\dots ,{\tau}_{m-1}$ be any sequence of m - 1 translocations that act on $\widehat{\pi}$ as fusions to assemble m - 1 contigs in π. Let ${\widehat{\omega}}_{k}$ be the genome obtained by performing τ_{ k } and zero or more following cap exchanges on ${\widehat{\omega}}_{k-1}$ such that no more cap exchange can be derived from $\widehat{\sigma}{\widehat{\omega}}_{k}^{-1}$, where ${\widehat{\omega}}_{0}=\widehat{\pi}$ and 1 ≤ k ≤ m - 1. Then $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{0}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{m-1}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}\le 2\lambda $.
Proof. For simplicity, we assume that in the beginning, no cap exchange can be derived from $\widehat{\sigma}{\widehat{\omega}}_{0}^{-1}$ to act on ${\widehat{\omega}}_{0}$. Let ${\omega}_{k}$ denote the genome obtained from ${\widehat{\omega}}_{k}$ by removing its caps, where 1 ≤ k ≤ m - 1. By Lemma 6, $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k-1}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}\in \left\{-2,0,2\right\}$ and by Corollary 1, $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k-1}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=2$ if ${\tau}_{k}$ is a good fusion to ${\omega}_{k-1}$. In fact, at most λ of the translocations ${\tau}_{1},{\tau}_{2},\dots ,{\tau}_{m-1}$ are good fusions. The reason is as follows. As mentioned before, we can obtain 2λ 2-cycles ${f}_{1},{f}_{1}^{\prime},\dots ,{f}_{\lambda},{f}_{\lambda}^{\prime}$ from $\widehat{\sigma}{\widehat{\pi}}^{-1}$ that can derive λ good fusions to act on π, say ${\tau}_{1},{\tau}_{2},\dots ,{\tau}_{\lambda}$, as well as 2(m - λ - 1) other 2-cycles ${f}_{\lambda +1},{f}_{\lambda +1}^{\prime},\dots ,{f}_{m-1},{f}_{m-1}^{\prime}$ that cannot derive any good fusions to act on π since their T and C3 elements lie in the same contig strand in $\widehat{\pi}$. If we can further extract two 2-cycles, say f and its mate 2-cycle f', from $\widehat{\sigma}{\widehat{\pi}}^{-1}$ that can derive a good fusion, say τ, to act on π, then the C3 elements in both f and f' must be located in a contig whose T elements are in some f_{ k } and ${f}_{k}^{\prime}$, respectively, where 1 ≤ k ≤ λ.
This implies that the good fusion τ cannot act on $\widehat{\pi}$ together with ${\tau}_{1},{\tau}_{2},\dots ,{\tau}_{\lambda}$ at the same time, since they will assemble a circular contig that is not allowed. Now, we suppose that ${\tau}_{1},{\tau}_{2},\dots ,{\tau}_{m-1}$ are the fusions obtained by the step 4 of Algorithm 1. Clearly, for $1\le k\le \lambda $, $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k-1}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=2$ since τ_{ k } is a good fusion to ${\omega}_{k-1}$. Moreover, for $\lambda +1\le k\le m-1,\phantom{\rule{2.77695pt}{0ex}}\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k-1}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=0$, due to the following reason. According to Algorithm 1, we have ${\tau}_{k}=({\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({z}_{k}\right)$, ${\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({x}_{k}\right))({\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({z}_{k}\right)$, ${\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({y}_{k}\right))\left({y}_{k},\phantom{\rule{2.77695pt}{0ex}}{z}_{k}\right)\left({x}_{k},\phantom{\rule{2.77695pt}{0ex}}{z}_{k}\right)$, which actually equals to $({\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({x}_{k}\right)$, ${\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({z}_{k}\right)$, ${\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({y}_{k}\right))\left({y}_{k},\phantom{\rule{2.77695pt}{0ex}}{z}_{k},\phantom{\rule{2.77695pt}{0ex}}{x}_{k}\right)$. 
Moreover, we have $\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k}^{-1}\right)=\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k-1}^{-1}\right){\tau}_{k}^{-1}$, in which the composition of $\left({x}_{k},\phantom{\rule{2.77695pt}{0ex}}{y}_{k}\right){\left({y}_{k},\phantom{\rule{2.77695pt}{0ex}}{z}_{k},\phantom{\rule{2.77695pt}{0ex}}{x}_{k}\right)}^{-1}$ equals $\left({x}_{k},\phantom{\rule{2.77695pt}{0ex}}{z}_{k}\right)$ and the composition of $({\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({y}_{k}\right)$, ${\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({x}_{k}\right))({\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({x}_{k}\right)$, ${\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({z}_{k}\right)$, ${\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({y}_{k}\right){)}^{-1}$ equals $({\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({y}_{k}\right),{\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({z}_{k}\right))$. Recall that ${f}_{k}=\left({x}_{k},\phantom{\rule{2.77695pt}{0ex}}{y}_{k}\right)$ and ${f}_{k}^{\prime}=({\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({y}_{k}\right),{\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}\left({x}_{k}\right))$, both of which are extracted from two mate cycles in $\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k-1}^{-1}\right)$. According to the above discussion, both y_{ k } and $\widehat{\pi}\hat{\text{\Gamma}}\left({x}_{k}\right)$ will be fixed in $\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k}^{-1}\right)$, thus increasing the number of cycles by two.
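The two cycle identities used in this step can be verified mechanically. The sketch below is a hedged illustration: it assumes the composition αβ applies β first and then α, and it uses the hypothetical labels `gx`, `gy`, `gz` as stand-ins for the images of x_{ k }, y_{ k }, z_{ k } under ${\widehat{\omega}}_{k-1}\hat{\text{\Gamma}}$. It checks only the algebra of the cycles, not the genome model.

```python
def from_cycles(cycles, elements):
    """Build a permutation (dict: element -> image) from disjoint cycles."""
    perm = {e: e for e in elements}
    for cyc in cycles:
        for i, a in enumerate(cyc):
            perm[a] = cyc[(i + 1) % len(cyc)]
    return perm


def compose(a, b):
    """Return the composition a.b, which applies b first and then a."""
    return {e: a[b[e]] for e in b}


def inverse(p):
    """Return the inverse permutation."""
    return {image: e for e, image in p.items()}


E = ["x", "y", "z", "gx", "gy", "gz"]

# (x, y) composed with (y, z, x)^{-1} should equal (x, z), leaving y fixed.
f_k = from_cycles([("x", "y")], E)
three = from_cycles([("y", "z", "x")], E)
lhs = compose(f_k, inverse(three))
print(lhs == from_cycles([("x", "z")], E))   # True

# (gy, gx) composed with (gx, gz, gy)^{-1} should equal (gy, gz), leaving gx fixed.
f_k_mate = from_cycles([("gy", "gx")], E)
three_mate = from_cycles([("gx", "gz", "gy")], E)
lhs_mate = compose(f_k_mate, inverse(three_mate))
print(lhs_mate == from_cycles([("gy", "gz")], E))  # True
```

Under the opposite composition order the first product would be (y, z) rather than (x, z), so the identities in the text depend on this convention.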
However, the 2-cycle $\left({x}_{k},\phantom{\rule{2.77695pt}{0ex}}{z}_{k}\right)$ will further join other two cycles respectively containing x_{ k } and z_{ k } together into one cycle and $(\widehat{\pi}\hat{\text{\Gamma}}\left({y}_{k}\right),\phantom{\rule{2.77695pt}{0ex}}\widehat{\pi}\hat{\text{\Gamma}}\left({z}_{k}\right))$ will join another two cycles respectively containing $\widehat{\pi}\hat{\text{\Gamma}}\left({y}_{k}\right)$ and $\widehat{\pi}\hat{\text{\Gamma}}\left({z}_{k}\right)$ together into one cycle, thus decreasing the number of cycles by two. As a result, ${n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k}^{-1}\right))={n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\omega}}_{k-1}^{-1}\right))$. Therefore, we have $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{0}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{m-1}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}\le 2\lambda $ for the (m -1) fusions obtained by the step 4 of Algorithm 1. 
In fact, for $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{0}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{m-1}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}>2\lambda $ to hold, there must be a translocation τ_{ i } that acts on ${\widehat{\omega}}_{i-1}$ as a fusion of ${\omega}_{i-1}$ satisfying either (1) $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=0$, the number of good fusions newly created by τ_{ i } and its following cap exchanges minus that of good fusions currently destroyed by τ_{ i } and the following cap exchanges is greater than or equal to one, and the total available good fusions can assemble more contigs than before, or (2) $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=-2$, the number of good fusions created by τ_{ i } and its following cap exchanges minus that of the currently destroyed good fusions is greater than or equal to two, and the total good fusions can assemble more contigs than before. However, we show below that no such translocation τ_{ i } exists.
Let ${\tau}_{i}=({\widehat{\omega}}_{i-1}\hat{\text{\Gamma}}(5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i-1}\right))$, ${\widehat{\omega}}_{i-1}\hat{\text{\Gamma}}(5\mathsf{\text{cap}}\left(x,\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i-1}\right)))({\widehat{\omega}}_{i-1}\hat{\text{\Gamma}}\left(y\right),\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i-1}\hat{\text{\Gamma}}\left(x\right))(5\mathsf{\text{cap}}\left(x,\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i-1}\right)$, $5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i-1}\right))\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)$ be a fusion (but not a good one) to ${\omega}_{i-1}$, where char $\left(x,\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{T}}$ and char $\left(y,\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{C}}3$. According to Corollary 1, we have $\left(x,\phantom{\rule{2.77695pt}{0ex}}y\right)\nmid \widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}$, that is, x and y are in different cycles of $\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}$. Moreover, char $\left(x,\phantom{\rule{2.77695pt}{0ex}}{\tau}_{i}{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{O}}$ and char $\left(y,\phantom{\rule{2.77695pt}{0ex}}{\tau}_{i}{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{N}}3$ after applying τ_{ i } to ${\widehat{\omega}}_{i-1}$. Below, we consider two cases.
Case 1: Suppose that there is a 2-cycle ${f}_{j}=\left({x}_{j},\phantom{\rule{2.77695pt}{0ex}}{y}_{j}\right)$ such that ${x}_{j}=x$, where $1\le j\le m-1$, char $\left({x}_{j},\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{T}}$ and char $\left({y}_{j},\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{C}}3$. To simplify the discussion, we assume that f_{ j } is disjoint from the other cycles in $\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}\right)$ and y is in the cycle $\alpha =\left({a}_{1},\phantom{\rule{2.77695pt}{0ex}}{a}_{2},\phantom{\rule{2.77695pt}{0ex}}\dots ,\phantom{\rule{2.77695pt}{0ex}}{a}_{h}\equiv y\right)$ of $\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}\right)$. Then in $\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}\right){\tau}_{i}^{-1}$, the cycles f_{ j } and α are joined into a cycle $\beta =\left({a}_{1},\phantom{\rule{2.77695pt}{0ex}}{a}_{2},\phantom{\rule{2.77695pt}{0ex}}\dots ,\phantom{\rule{2.77695pt}{0ex}}{a}_{h-1},\phantom{\rule{2.77695pt}{0ex}}y,\phantom{\rule{2.77695pt}{0ex}}{y}_{j},\phantom{\rule{2.77695pt}{0ex}}x\right)$, which can be expressed as $\gamma \left(y,\phantom{\rule{2.77695pt}{0ex}}{y}_{j}\right)$, where $\gamma =\left({a}_{1},\phantom{\rule{2.77695pt}{0ex}}{a}_{2},\phantom{\rule{2.77695pt}{0ex}}\dots ,\phantom{\rule{2.77695pt}{0ex}}{a}_{h-1},\phantom{\rule{2.77695pt}{0ex}}y,\phantom{\rule{2.77695pt}{0ex}}x\right)$, char $\left(y,\phantom{\rule{2.77695pt}{0ex}}{\tau}_{i}{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{N}}3$ and char $\left({y}_{j},\phantom{\rule{2.77695pt}{0ex}}{\tau}_{i}{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{C}}3$. According to Lemma 3, there is a cycle ${\beta}^{\prime}=\left({\tau}_{i}{\widehat{\omega}}_{i-1}\hat{\text{\Gamma}}\right)\cdot {\beta}^{-1}$ that is the mate cycle of β in $\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}\right){\tau}_{i}^{-1}$.
In other words, we can extract c_{1} = (y, y_{ j }) from β and ${c}_{1}^{\prime}=({\tau}_{i}{\widehat{\omega}}_{i-1}\hat{\text{\Gamma}}\left({y}_{j}\right)$, ${\tau}_{i}{\widehat{\omega}}_{i-1}\hat{\text{\Gamma}}\left(y\right))$ from ${\beta}^{\prime}$, and then apply ${\tau}_{i}^{\prime}={c}_{2}^{\prime}{c}_{1}^{\prime}{c}_{2}{c}_{1}$ to ${\tau}_{i}{\widehat{\omega}}_{i-1}$ as a cap exchange, where ${c}_{2}=(5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}{\tau}_{i}{\widehat{\omega}}_{i-1}\right),\phantom{\rule{2.77695pt}{0ex}}5\mathsf{\text{cap}}\left({y}_{j},\phantom{\rule{2.77695pt}{0ex}}{\tau}_{i}{\widehat{\omega}}_{i-1}\right))$ and ${c}_{2}^{\prime}=({\tau}_{i}{\widehat{\omega}}_{i-1}\hat{\text{\Gamma}}(5\mathsf{\text{cap}}\left({y}_{j},\phantom{\rule{2.77695pt}{0ex}}{\tau}_{i}{\widehat{\omega}}_{i-1}\right))$, ${\tau}_{i}{\widehat{\omega}}_{i-1}\hat{\text{\Gamma}}(5\mathsf{\text{cap}}\left(y,\phantom{\rule{2.77695pt}{0ex}}{\tau}_{i}{\widehat{\omega}}_{i-1}\right)))$, since the character pair (C3, N3) of (y_{ j }, y) belongs to CEpair. After that, y_{ j }, as well as ${\tau}_{i}{\widehat{\omega}}_{i-1}\widehat{\text{\Gamma}}\left(y\right)$, will be fixed in the resulting $\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i}^{-1}\right)$ and char $\left(y,\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i}\right)$ will become C3. As a result, ${n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i}^{-1}\right))={n}_{c}(\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}\right))$ and hence $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=0$. 
According to the above discussion, if j > λ, that is, f_{ j } cannot be used to derive a good fusion to ${\omega}_{i-1}$, then acting ${\tau}_{i}^{\prime}{\tau}_{i}$ on ${\widehat{\omega}}_{i-1}$ still serves as a fusion of ${\omega}_{i-1}$ and after that, it can be verified that no existing good fusion is destroyed and no new good fusion is created. If j ≤ λ, that is, f_{ j } can be used to derive a good fusion to ${\omega}_{i-1}$, then this good fusion will be destroyed when we perform ${\tau}_{i}^{\prime}{\tau}_{i}$ on ${\widehat{\omega}}_{i-1}$. Suppose that char $\left({a}_{h-1},\phantom{\rule{2.77695pt}{0ex}}{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{T}}$. Then after further performing the cap exchange ${\tau}_{i}^{\prime}$ on ${\tau}_{i}{\widehat{\omega}}_{i-1}$, we still can extract a 2-cycle (a_{h-1}, y) from γ with character pair of (T, C3) in the resulting ${\widehat{\omega}}_{i}$. Clearly, if (a_{h-1}, y) = f_{ k } with k > λ, that is, f_{ k } cannot derive a good fusion to ω_{i-1} (a_{h-1} and y are in the same cycle of ${\widehat{\omega}}_{i-1}$), then after performing the cap exchange ${\tau}_{i}^{\prime}$ on ${\tau}_{i}{\widehat{\omega}}_{i-1}$, it can be used to derive a good fusion to ${\widehat{\omega}}_{i}$, since a_{ h-1 } and y will be separated by ${\tau}_{i}^{\prime}$ into two different cycles in the resulting ${\widehat{\omega}}_{i}$. If k ≤ λ, that is, f_{ k } can derive a good fusion to ω_{i-1}, then after performing the cap exchange ${\tau}_{i}^{\prime}$ on ${\tau}_{i}{\widehat{\omega}}_{i-1}$, f_{ k } may or may not derive a good fusion to ${\widehat{\omega}}_{i}$. Based on the above discussion, the number of good fusions newly created by τ_{ i } and ${\tau}_{i}^{\prime}$ minus that of good fusions currently destroyed by τ_{ i } and ${\tau}_{i}^{\prime}$ must be less than or equal to zero.
Case 2: Suppose that there is no ${f}_{j}=\left({x}_{j},\phantom{\rule{2.77695pt}{0ex}}{y}_{j}\right)$ such that x_{ j } = x, where $1\le j\le m-1$, char $\left({x}_{j},{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{T}}$ and char $\left({y}_{j},{\widehat{\omega}}_{i-1}\right)=\mathsf{\text{C}}3$. Let α_{1} denote the cycle containing x and α_{2} denote the cycle containing y in $\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}$. Also let ${\alpha}_{1}^{\prime}$ and ${\alpha}_{2}^{\prime}$ be the mate cycles of α_{1} and α_{2}, respectively, in $\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}$. Note that after applying τ_{ i } to ${\widehat{\omega}}_{i-1}$, the cycles α_{1} and α_{2} will be merged into a single cycle, say α, in $\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}{\tau}_{i}^{-1}$ and ${\alpha}_{1}^{\prime}$ and ${\alpha}_{2}^{\prime}$ will be merged into a single cycle, say ${\alpha}^{\prime}$. Moreover, the characters of x and y in ${\tau}_{i}{\widehat{\omega}}_{i-1}$ will become O and N3, respectively. As discussed in the proof of Lemma 6, if both α_{1} and α_{2}, as well as both ${\alpha}_{1}^{\prime}$ and ${\alpha}_{2}^{\prime}$, do not contain T and C3 elements simultaneously, then $\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i-1}^{-1}\right)\left|\right|-\left|\right|\psi \left(\widehat{\sigma}{\widehat{\omega}}_{i}^{-1}\right)\left|\right|\phantom{\rule{2.77695pt}{0ex}}=-2$. In this case, it can be verified that no existing good fusion is destroyed by τ_{ i } and no new good fusion is created by τ_{ i }. In other words, the number of the increased good fusions minus that of the destroyed good fusions is zero. If at least one of α_{1} and α_{2}, as well as at least one of ${\alpha}_{1}^{\prime}$ and ${\alpha}_{2}^{\prime}$, has both T and C3 elements at the same time, then $\left|\right|\psi (\widehat{\sigma}{}_{}^{}$