The hardness result in the previous section suggests that the standard isomorphism is too strict a measure of similarity for 2-generation pedigrees. In practice, ambiguities exist in pedigree-related datasets. In fact, it is estimated that 2-10% of people do not know their biological father [2, 25]. Moreover, 2-generation pedigrees are in general not monogamous. So, we need a new, weaker measure to describe the similarity of two 2-generation pedigrees.

For a general 2-generation pedigree *P*, it is not difficult to identify all (not necessarily disjoint) 〈*i*, *j*〉-families (or simply families, when 〈*i*, *j*〉's are used). (For instance, the left component in Figure 2 can be decomposed into two families: 〈2, 0〉 and 〈0, 1〉.) Then, we try to decompose the generation-2 nodes in these families so that the resulting number of isomorphic sub-families is minimized. Note that in this process a generation-1 pair can appear in more than one sub-family. This can in turn be formulated as the Minimum Common Integer Pair Partition (MCIPP) problem.

### MCIP and MCIPP Problems

Throughout this paper, for MCIP, we focus on integers in $\mathcal{N}=\{1,2,3,\dots\}$. A partition of an integer *n* is a multiset *τ*(*n*) = {*n*_{1}, *n*_{2},..., *n*_{t}} such that $\sum_{1\le i\le t} n_i = n$. For example, when *n* = 9, {1, 2, 2, 4} is a partition of *n*. It should be noted that while it is simple to partition an integer, the number of such partitions is usually (counter-intuitively) huge. For instance, the integer 100 has 190569292 distinct partitions [3].
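To make the growth of the number of partitions concrete, the following minimal Python sketch counts the partitions of an integer with a standard dynamic program over the allowed part sizes. The function name `count_partitions` is only illustrative and does not come from the paper.

```python
def count_partitions(n: int) -> int:
    """Count the multisets of positive integers that sum to n."""
    # ways[t] = number of partitions of t using the part sizes seen so far
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


print(count_partitions(9))    # 30
print(count_partitions(100))  # 190569292
```

Even for *n* = 9 there are already 30 partitions, of which {1, 2, 2, 4} is one.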

A partition of a multiset *X* = {*x*_{1}, *x*_{2},..., *x*_{p}} is a multiset union of all the partitions *τ*(*x*_{i}), i.e., ∪_{1≤i≤p}*τ*(*x*_{i}). A multiset *Z* is a common partition of two multisets *X* = {*x*_{1}, *x*_{2},..., *x*_{p}}, *Y* = {*y*_{1}, *y*_{2},..., *y*_{q}} if there are partitions *τ*_{1}, *τ*_{2} with ∪_{1≤i≤p}*τ*_{1}(*x*_{i}) = ∪_{1≤j≤q}*τ*_{2}(*y*_{j}) = *Z*. The size of the partition *Z* is denoted as |*Z*|. For example, given *X* = {5, 8}, *Y* = {3, 10}, a common partition of *X*, *Y* is *Z* = {1, 2, 2, 4, 4}, and the size of this partition is 5. It is easily seen that the necessary condition for *X* and *Y* to admit a common partition is that the sums of the integers in *X* and *Y* are equal. Throughout this paper, whenever we talk about a common partition for sets of integers *X* and *Y*, we always assume that this condition is met.
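The definition of a common partition can be checked mechanically on small inputs. The sketch below is a plain backtracking test, not an algorithm from the paper; the helper names `splits_into` and `is_common_partition` are assumptions made for illustration. It verifies that the pieces of *Z* can be grouped to rebuild every integer of *X* and, independently, every integer of *Y*.

```python
from typing import List


def splits_into(pieces: List[int], targets: List[int]) -> bool:
    """True if `pieces` can be grouped so that the group sums equal `targets`."""
    remaining = list(targets)
    ordered = sorted(pieces, reverse=True)  # place large pieces first

    def place(idx: int) -> bool:
        if idx == len(ordered):
            return all(room == 0 for room in remaining)
        tried = set()  # skip targets with identical remaining room
        for j, room in enumerate(remaining):
            if room >= ordered[idx] and room not in tried:
                tried.add(room)
                remaining[j] -= ordered[idx]
                if place(idx + 1):
                    return True
                remaining[j] += ordered[idx]
        return False

    return sum(ordered) == sum(targets) and place(0)


def is_common_partition(Z: List[int], X: List[int], Y: List[int]) -> bool:
    """Z is a common partition if it refines both X and Y."""
    return splits_into(Z, X) and splits_into(Z, Y)


print(is_common_partition([1, 2, 2, 4, 4], [5, 8], [3, 10]))  # True
print(is_common_partition([2, 3, 8], [5, 8], [3, 10]))        # True
```

The second call shows that the example instance also admits a smaller common partition, {2, 3, 8}, of size 3. The search is exponential in the worst case and is meant only for hand-sized examples.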

#### MCIP (Minimum Common Integer Partition)

Instance: Two multisets of integers *A* and *B*, and an integer *k*.

Question: Do *A* and *B* admit a common partition of size *k*?

For ease of presentation, we use MCIP(*A*, *B*) to represent this instance.

Given a 2-tuple of integers 〈*a, b*〉, its projections are $\mathcal{P}_1(\langle a,b\rangle)=a$ and $\mathcal{P}_2(\langle a,b\rangle)=b$. Let *S* be a set of 2-tuples of integers; then $\mathcal{P}_1(S)=\cup_{s\in S}\mathcal{P}_1(s)$ and $\mathcal{P}_2(S)=\cup_{s\in S}\mathcal{P}_2(s)$, where the unions are taken as multiset unions.

Given two sets of 2-tuples *S*, *T*, a common partition of *S* and *T* is a set of 2-tuples *H* = {〈*g*_{1}, *h*_{1}〉, 〈*g*_{2}, *h*_{2}〉, ⋯, 〈*g*_{k}, *h*_{k}〉} such that $\mathcal{P}_1(H)$ is a common partition of $\mathcal{P}_1(S)$ and $\mathcal{P}_1(T)$, and $\mathcal{P}_2(H)$ is a common partition of $\mathcal{P}_2(S)$ and $\mathcal{P}_2(T)$. Here *k* is the size of the partition *H*. Again, it is easily seen that the necessary condition for *S* and *T* to admit a common partition is that the sums of the integers in $\mathcal{P}_1(S)$ and $\mathcal{P}_1(T)$ are equal, and so are those in $\mathcal{P}_2(S)$ and $\mathcal{P}_2(T)$. Throughout this paper, whenever we talk about any common partition of sets of 2-tuples *S, T*, we always assume that this condition is met.
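The projections and the necessary condition on the coordinate sums are easy to compute. The following sketch represents 2-tuples as Python tuples and multisets as lists; the names `projection` and `may_admit_common_partition` are hypothetical. The instance used is the example *S*, *T* discussed with the heuristic algorithm in this paper.

```python
from typing import List, Tuple

Pair = Tuple[int, int]


def projection(tuples: List[Pair], coordinate: int) -> List[int]:
    """Multiset of first (coordinate=0) or second (coordinate=1) components."""
    return [t[coordinate] for t in tuples]


def may_admit_common_partition(S: List[Pair], T: List[Pair]) -> bool:
    """Necessary condition only: both coordinate sums of S and T agree."""
    return all(
        sum(projection(S, c)) == sum(projection(T, c)) for c in (0, 1)
    )


S = [(9, 4), (1, 11), (6, 3)]
T = [(2, 8), (12, 1), (2, 9)]
print(projection(S, 0), projection(T, 0))  # [9, 1, 6] [2, 12, 2]
print(may_admit_common_partition(S, T))    # True
```

Both first projections sum to 16 and both second projections sum to 18, so this instance satisfies the condition.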

#### MCIPP (Minimum Common Integer Pair Partition)

Instance: Two multisets of 2-tuples of integers *S* and *T*, and an integer *k*.

Question: Do *S* and *T* admit a common partition of size *k*?

Recall that a 2-tuple 〈*i, j*〉 represents the pedigree of a couple which has *i* female and *j* male children. Again, we use MCIPP(*S*, *T*) to represent this instance. As MCIPP is a generalization of MCIP, all the known negative results regarding MCIP hold for MCIPP; i.e., MCIP and MCIPP are both NP-complete and APX-hard, following [7]. (In the past, *d*-MCIP has also been considered, where the input is *d* multisets with the same sum. Efficient asymptotic approximation algorithms have been obtained for large *d* [7, 33, 34], the best factor being 0.5625 · *d* + O(1) [34]. We will only consider *d* = 2 in this paper.) Also, note that the integer 0 in a solution for MCIP is meaningless, while 0 can appear either in the input or in the solution for MCIPP. So for MCIPP, we focus on integers in $\mathcal{N}\cup\{0\}=\{0,1,2,3,\dots\}$.

Finally, a Fixed-Parameter Tractable (FPT) algorithm is an algorithm for a decision problem with input size *n* and parameter *k* whose running time is *O*(*f* (*k*)*n*^{c}) = *O**(*f*(*k*)), where *f* (−) is any computable function on *k* and *c* is a constant. FPT algorithms are efficient tools for handling some NP-complete problems, especially when *k* is small in practical datasets [9, 11].

### Some properties of MCIPP

Given a pair of integers *a, c*, we say *a dominates c* if *a* >*c*. Given a pair of 2-tuples of integers 〈*a, b*〉 and 〈*c*, *d*〉, we say 〈*a, b*〉 *dominates* 〈*c*, *d*〉 if *a* ≥ *c* and *b* ≥ *d*. To simplify the writing, we say that 〈*a, b*〉 and 〈*c*, *d*〉 form a *dominating pair* if either 〈*a, b*〉 *dominates* 〈*c*, *d*〉 or vice versa. Likewise, 〈*a, b*〉 and 〈*c*, *d*〉 form a *non-dominating pair* if either *a* >*c*, *b* <*d* or *a* <*c*, *b* >*d*.
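The domination relations on 2-tuples translate directly into predicates. The sketch below uses hypothetical function names and plain Python tuples; it shows that any two 2-tuples form either a dominating pair or a non-dominating pair, but never both.

```python
from typing import Tuple

Pair = Tuple[int, int]


def dominates(p: Pair, q: Pair) -> bool:
    """<a, b> dominates <c, d> if a >= c and b >= d."""
    return p[0] >= q[0] and p[1] >= q[1]


def is_dominating_pair(p: Pair, q: Pair) -> bool:
    """True if one of the two tuples dominates the other."""
    return dominates(p, q) or dominates(q, p)


def is_non_dominating_pair(p: Pair, q: Pair) -> bool:
    """True if the tuples are ordered oppositely in the two coordinates."""
    return (p[0] > q[0] and p[1] < q[1]) or (p[0] < q[0] and p[1] > q[1])


print(is_dominating_pair((4, 5), (2, 4)))      # True
print(is_non_dominating_pair((1, 4), (2, 3)))  # True
```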

We first describe some optimality properties for both the optimization versions of MCIP and MCIPP. When the context is clear, we still use MCIP(-,-) and MCIPP(-,-) to denote the corresponding optimization versions of the instances.

**Lemma 1** *Let A, B be the input for MCIP. In any feasible solution, if a partition for some x ∈ A, τ*(*x*) = {*x*_{1}, *x*_{2},..., *x*_{p}}, *and a partition for some y* ∈ *B, τ*(*y*) = {*y*_{1}, *y*_{2},..., *y*_{q}}, *satisfy |τ*(*x*) ∩ *τ*(*y*)| > 1, *then this solution for MCIP is not optimal*.

*Proof*. Suppose to the contrary that |*τ*(*x*) ∩ *τ*(*y*)| > 1, and the corresponding partition for *A, B* is optimal. WLOG, suppose *τ*(*x*) = {*x*_{1}, *x*_{2},..., *x*_{p}} and *τ*(*y*) = {*y*_{1}, *y*_{2},..., *y*_{q}} contain *r* > 1 common elements {*z*_{1}, *z*_{2},..., *z*_{r}}. Then we can update *τ*(*x*) ← (*τ*(*x*) − {*z*_{1}, *z*_{2},..., *z*_{r}}) ∪ {*z*_{1} + *z*_{2} + ... + *z*_{r}} and *τ*(*y*) ← (*τ*(*y*) − {*z*_{1}, *z*_{2},..., *z*_{r}}) ∪ {*z*_{1} + *z*_{2} + ... + *z*_{r}}. The solution size for MCIP on *A, B* is thereby reduced by *r* − 1, contradicting the assumed optimality. □

With the above lemma, we can now assume that in any optimal partition, the parts of some *x* ∈ *A* and some *y* ∈ *B* share at most one common element. Notice that this lemma also holds for MCIPP, i.e., in an optimal partition of 〈*s*_{1}, *s*_{2}〉 ∈ *S* and 〈*t*_{1}, *t*_{2}〉 ∈ *T*, *τ*(〈*s*_{1}, *s*_{2}〉) and *τ*(〈*t*_{1}, *t*_{2}〉) share at most one common 2-tuple. Similarly, we can assume that in the input for *MCIP* (*A, B*) (resp. *MCIPP* (*S, T*)) no integer appears in both *A* and *B* (resp. no 2-tuple appears in both *S* and *T*), as such a common element must be put in the optimal solution.

The following property is trivial and holds for both MCIP and MCIPP.

**Lemma 2** *Let* |*MCIPP**(*S, T*)| *be the optimal solution size for MCIPP*(*S, T*). *Then* |*MCIPP**(*S, T*)| ≥ *max{|S|, |T|}*.

For a pair of dominating 2-tuples 〈*a, b*〉 and 〈*c*, *d*〉, we can use subtraction to partition them into two common pairs. For example, if *a* ≤ *c* and *b* ≤ *d*, then we can obtain the common partition {〈*a, b*〉, 〈*c* − *a, d* − *b*〉}. So, given 〈2, 4〉 and 〈4, 5〉 we can obtain a partition {〈2, 4〉, 〈2, 1〉} for 〈4, 5〉. We call this a *dominating-partition* operation. Clearly, for MCIP, this gives a way to partition a pair of integers as well. For instance, given 2 and 6, we can subtract 2 from 6 to obtain a partition {2, 4} for 6.
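The subtraction step for a dominating pair can be written as a small function. This is a minimal sketch; the name `dominating_partition` and the tuple representation are illustrative choices rather than notation from the paper.

```python
from typing import Tuple

Pair = Tuple[int, int]


def dominating_partition(p: Pair, q: Pair) -> Tuple[Pair, Pair]:
    """Split the larger tuple of a dominating pair by subtracting the smaller.

    Returns (common, rest): `common` is the dominated tuple and
    `rest` is the componentwise difference.
    """
    if p[0] <= q[0] and p[1] <= q[1]:
        small, large = p, q
    elif q[0] <= p[0] and q[1] <= p[1]:
        small, large = q, p
    else:
        raise ValueError("the two tuples do not form a dominating pair")
    return small, (large[0] - small[0], large[1] - small[1])


print(dominating_partition((2, 4), (4, 5)))  # ((2, 4), (2, 1))
```

The returned pair reproduces the partition {〈2, 4〉, 〈2, 1〉} of 〈4, 5〉 from the text.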

We next describe some properties of non-dominating pairs of 2-tuples which are unique to MCIPP. For MCIP, a pair of integers *a, b* has the property that either *a* dominates *b* or vice versa. This is not the case for a non-dominating pair of 2-tuples, e.g., 〈1, 4〉 and 〈2, 3〉. We start with this fundamental lemma.

**Lemma 3** *Let A', B' be two multisets of positive integers with the same total sum, and suppose |A'| = m* + 1, *|B'| = m. Then there must exist elements a* ∈ *A', b* ∈ *B' such that a* <*b*.

*Proof*. As |*A'*| > |*B'*|, we can arbitrarily select *m* elements from *A'* and match them with those in *B'* in a one-to-one fashion. As the sums of the integers in *A'* and *B'* are the same and the unmatched element of *A'* is positive, at least one matched element *a* ∈ *A'* must be smaller than its counterpart *b* ∈ *B'*; otherwise, the sum of the integers in *A'* would be larger than that of *B'*. □

**Corollary 1** *Let A, B be two sets of n* > 1 *positive integers with the same total sum. WLOG, let A = {a*_{1}, *a*_{2}, ⋯, *a*_{n}}, *B* = {*b*_{1}, *b*_{2}, ⋯, *b*_{n}}. *Then there must exist an element b* ∈ *B' = B* − {*b*_{j}} *which is greater than some element a* ∈ *A'* = {*a*_{1}, *a*_{2}, ⋯, *a*_{i−1}, *a*_{i} − *b*_{j}, *a*_{i+1}, ⋯, *a*_{n}}, *where a*_{i} >*b*_{j}.

*Proof*. Obviously we have |*A'*| = *n* and |*B'*| = *n* − 1. Then this corollary follows directly from Lemma 3. □

The implication of Corollary 1 for MCIP with input *A, B* is clear: we can successively find pairs of dominating integers. In fact, in the proof of Corollary 1, once we obtain *a'* = *a*_{k} ∈ *A'* and *b'* = *b*_{ℓ} ∈ *B'* such that *a'* <*b'*, we can repeatedly apply the above argument to *A''* = *A'* − {*a'*} = {*a*_{1}, *a*_{2}, ⋯, *a*_{k−1}, *a*_{k+1}, ⋯, *a*_{n}} and *B''* = {*b*_{1}, *b*_{2}, ⋯, *b*_{ℓ−1}, *b*_{ℓ} − *a', b*_{ℓ+1}, ⋯, *b*_{n}}, where |*A''*| = |*B''*| = *n* − 1 and the two sets *A''*, *B''* have the same sum.

Now let us see how this can be applied to MCIPP. When we have an instance of MCIPP whose inputs *S* and *T* each consist of *m* non-dominating 2-tuples, we can find a pair *s* = 〈*s*_{1}, *s*_{2}〉 ∈ *S* and *t* = 〈*t*_{1}, *t*_{2}〉 ∈ *T* (assuming *s*_{1} >*t*_{1} and *s*_{2} <*t*_{2}) such that we can put 〈*t*_{1}, *s*_{2}〉 in some solution set while the resulting instance *S'* = (*S* − {〈*s*_{1}, *s*_{2}〉}) ∪ {〈*s*_{1} − *t*_{1}, 0〉}, *T'* = (*T* − {〈*t*_{1}, *t*_{2}〉}) ∪ {〈0, *t*_{2} − *s*_{2}〉} is still a valid instance for MCIPP. (We call this operation *non-dominating-partition*.) Then, following Corollary 1, there exists a pair of dominating 2-tuples *s'* ∈ *S', t'* ∈ *T'*. Moreover, if we apply the dominating-partition process on these two tuples *s', t'*, following Corollary 1, we can repeatedly find dominating tuples until all tuples in *S, T* are commonly partitioned. This is because after we apply the dominating-partition on *s', t'* (say *s'* <*t'*) to obtain an MCIPP instance *S'', T''*, we have $|\mathcal{P}_1(S'')| \ne |\mathcal{P}_1(T'')|$, $|\mathcal{P}_2(S'')| \ne |\mathcal{P}_2(T'')|$, and the two pairs of sizes in fact differ by one. Following Corollary 1, we can then repeatedly obtain dominating pairs.
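The non-dominating-partition operation can be sketched as follows. The common piece is the componentwise minimum of the two tuples, which equals 〈*t*_{1}, *s*_{2}〉 when *s*_{1} > *t*_{1} and *s*_{2} < *t*_{2}. The function name and the return convention are assumptions made for illustration.

```python
from typing import Tuple

Pair = Tuple[int, int]


def non_dominating_partition(s: Pair, t: Pair) -> Tuple[Pair, Pair, Pair]:
    """Split a non-dominating pair into a common piece and two remainders.

    Returns (common, s_rest, t_rest), where s = common + s_rest and
    t = common + t_rest componentwise.
    """
    common = (min(s[0], t[0]), min(s[1], t[1]))
    if common == s or common == t:
        raise ValueError("the two tuples form a dominating pair")
    s_rest = (s[0] - common[0], s[1] - common[1])
    t_rest = (t[0] - common[0], t[1] - common[1])
    return common, s_rest, t_rest


print(non_dominating_partition((6, 3), (2, 9)))  # ((2, 3), (4, 0), (0, 6))
```

Each remainder has a zero in one coordinate, which is why 0 must be allowed in MCIPP instances.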

*Algorithm Heuristic-MCIPP*(*S, T*)

Input: *S*, *T*

Output: A common partition *τ*(*S*, *T*) for *S*, *T*, initially empty.

1 While |*S*| ≥ 2 and |*T*| ≥ 2

2 Repeat

2.1 select a pair of dominating 2-tuples, *s* ∈ *S* and *t* ∈ *T*,

2.2 compute two decomposing 2-tuples by subtraction,

2.3 update *S* ← *S* − {*s*}, *T* ← (*T* − {*t*}) ∪{*t* − *s*} if *s* <*t*,

2.4 update *S* ← (*S* − {*s*}) ∪{*s* − *t*}, *T* ← *T* − {*t*} if *s* >*t*,

2.5 update *τ*(*S*, *T*) ←*τ*(*S*, *T*) ∪ {min(*s*, *t*)},

3 Until no dominating 2-tuples can be found.

4 If there are at least two pairs of non-dominating 2-tuples in *S* and *T*

5 Then

5.1 use a brute-force method to select two non-dominating 2-tuples *s'* ∈ *S*, *t'* ∈ *T* which leads to successive dominating pairs.

6 If |*S*| = 1, |*T*| = 1, then find the smaller tuple in *S* and *T*, *x*.

7 Return *τ*(*S*, *T*) ← *τ*(*S*, *T*) ∪ {*x*}.

Of course, due to the 'existence' constraint in Lemma 3 and Corollary 1, we would have to use a brute-force method to find a pair *s* ∈ *S, t* ∈ *T* which can make the process of repeatedly processing dominating pairs possible. Let us show an example, *S* = {〈9, 4〉, 〈1, 11〉, 〈6, 3〉} and *T* = {〈2, 8〉, 〈12, 1〉, 〈2, 9〉}. In this example, among the 9 non-dominating pairs between *S* and *T*, there are 4 solutions enabling us to successively find dominating pairs. One of them is *s* = 〈6, 3〉 and *t* = 〈2, 9〉, which gives us a common partition of size 6. The other 5 solutions all lead to a common partition of size 7.
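As a rough illustration of how the two partition operations interact, the sketch below implements a simplified greedy variant, not Heuristic-MCIPP itself. It prefers dominating pairs and otherwise cuts an arbitrary non-dominating pair, without the brute-force selection of Step 5.1, so the size bound of the heuristic is not guaranteed. The function name and the tie-breaking order are assumptions.

```python
from typing import List, Tuple

Pair = Tuple[int, int]


def greedy_common_partition(S: List[Pair], T: List[Pair]) -> List[Pair]:
    """Build some common partition of S and T by repeated subtraction."""
    S = [tuple(s) for s in S if tuple(s) != (0, 0)]
    T = [tuple(t) for t in T if tuple(t) != (0, 0)]
    pieces: List[Pair] = []
    while S and T:
        best = None  # (is_dominating, index in S, index in T, common piece)
        for i, s in enumerate(S):
            for j, t in enumerate(T):
                piece = (min(s[0], t[0]), min(s[1], t[1]))
                if piece == (0, 0):
                    continue  # cutting nothing would make no progress
                is_dominating = piece in (s, t)
                if best is None or (is_dominating and not best[0]):
                    best = (is_dominating, i, j, piece)
        if best is None:
            raise ValueError("coordinate sums of S and T differ")
        _, i, j, piece = best
        s, t = S.pop(i), T.pop(j)
        pieces.append(piece)
        # Put back whatever is left of s and t after removing the piece.
        s_rest = (s[0] - piece[0], s[1] - piece[1])
        t_rest = (t[0] - piece[0], t[1] - piece[1])
        if s_rest != (0, 0):
            S.append(s_rest)
        if t_rest != (0, 0):
            T.append(t_rest)
    if S or T:
        raise ValueError("coordinate sums of S and T differ")
    return pieces


S = [(9, 4), (1, 11), (6, 3)]
T = [(2, 8), (12, 1), (2, 9)]
H = greedy_common_partition(S, T)
print(len(H), H)
```

Every piece is cut from exactly one tuple of *S* and one tuple of *T*, so the output is always a valid common partition when the coordinate sums agree. Its size depends on which pairs are chosen, which is the reason the heuristic needs the brute-force selection.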

The above discussion enables us to design an algorithm Heuristic-MCIPP to prove the next lemma.

**Lemma 4** *Let* |*MCIPP*(*S, T*)| *be the size of the solution returned by Heuristic-MCIPP. Then* |*MCIPP*(*S, T*)| ≤ |*S*| + |*T*|.

*Proof*. When there are no non-dominating pairs in the input, running the algorithm Heuristic-MCIPP gives |*MCIPP*(*S*, *T*)| ≤ |*S*| + |*T*| − 1. The reason is that when each of *S* and *T* has at least two 2-tuples, we can use the dominating-partition procedure to obtain two 2-tuples in the solution set for each pair of dominating 2-tuples from *S, T*. When there are a total of three elements in *S, T*, say, one in *S* and two in *T*, we just need to return the two elements in *T* as their sum already matches the one in *S*. (This is certainly true for MCIP as pointed out in [7].)

When there are *p* ≥ 2 non-dominating pairs at Steps 4-5 of the Heuristic-MCIPP algorithm, following Corollary 1 and the subsequent arguments, there exists a non-dominating pair *s* = 〈*s*_{1}, *s*_{2}〉 ∈ *S*, *t* = 〈*t*_{1}, *t*_{2}〉 ∈ *T* which leads to successive dominating pairs. (We can use the brute-force method to find this in *O*(*p*^{2}(|*S*| + |*T*|)) time.) In this case, the solution obtained by Heuristic-MCIPP has size at most 1 + (|*S*| + |*T*| − 1) = |*S*| + |*T*|, where the first term corresponds to (*s*_{1}, *t*_{2}) (if *s*_{1} <*t*_{1}) or (*t*_{1}, *s*_{2}) (if *s*_{1} >*t*_{1}). □

In fact, the above three lemmas imply that Heuristic-MCIPP provides a factor-2 approximation for MCIPP, as we have |*S*| + |*T*| ≤ 2max{|*S*|, |*T*|} ≤ 2|*MCIPP**(*S, T*)|. On the other hand, designing approximation algorithms is not our focus for this paper; in fact, by a simple modification for the Maximum Packing method in [7] we can obtain a similar factor-1.25 approximation for MCIPP. In the remainder of this paper, we solely focus on the exact or FPT algorithm.

Note that Lemma 4 is different from its counterpart for MCIP, which, according to Lemma 2.2 in [7], states that |*MCIP*(*A, B*)| ≤ |*A*| + |*B*| − 1. The latter in fact immediately implies that for MCIP there is always an optimal solution which leaves at least one integer from either *A* or *B* unpartitioned. That further implies that there is a simple FPT algorithm for MCIP based on the bounded-degree search. We will show a stronger property in the next section to improve the FPT algorithm for MCIP, and subsequently, an FPT algorithm for MCIPP can be obtained.

### An FPT algorithm for MCIPP

We first give the following lemma for MCIP.

**Lemma 5** *Let A, B be the input for MCIP and let a be the smallest element in A or B. Then there is an optimal solution for MCIP which contains a, i.e., there is an optimal solution which does not partition the smallest element in A and B*.

*Proof*. We first show the following claim: in an optimal solution *τ* for MCIP with input *A, B*, let *a*_{1}, *a*_{2} be a pair of elements in *τ*(*z*), *z* ∈ *A* ∪ *B*, such that (1) *a*_{1} + *a*_{2} ≤ *a*, and (2) *a*_{2} is the minimum among all pairs of elements in *τ* satisfying condition (1). Then there is an optimal solution *τ'* which partitions some element *z* ∈ *A* ∪ *B* with *τ'*(*z*) = *τ*(*z*) − {*a*_{1}, *a*_{2}} ∪ {*a*_{1} + *a*_{2}}.

The proof for the above claim is as follows. WLOG, let *z* ∈ *A* and let *a*_{1} ∈ *τ*(*y*_{1}) and *a*_{2} ∈ *τ*(*y*_{2}) for two distinct integers *y*_{1}, *y*_{2} ∈ *B*. Following the definition of 〈*a*_{1}, *a*_{2}〉, *τ*(*y*_{1}) contains at least one more element other than *a*_{1}, say *a*_{3}; and following (2) we have *a*_{3} >*a*_{2}. Suppose that *a*_{3} ∈ *τ*(*x*) for some *x* ∈ *A*. By Lemma 1, *x* ≠ *z*. We replace *a*_{3} by $a_3' = a_3 - a_2$, and *a*_{1} by $a_1' = a_1 + a_2$. Subsequently, we obtain another optimal partition *τ'* with *τ'*(*x*) = *τ*(*x*) − {*a*_{3}} ∪ {$a_3'$, *a*_{2}}, *τ'*(*y*_{1}) = *τ*(*y*_{1}) − {*a*_{1}, *a*_{3}} ∪ {$a_1'$, $a_3'$}, and *τ'*(*z*) = *τ*(*z*) − {*a*_{1}, *a*_{2}} ∪ {$a_1'$}. Clearly, *τ'* has the same size as *τ*, so it is also a minimum size common partition for *A, B*.

It is obvious that, as long as the smallest element *a* is partitioned in some optimal partition *τ* with $\tau(a)=\{a_1', a_2', \dots, a_t'\}$, we can repeatedly apply the above steps to obtain another optimal partition *τ'* with $\tau'(a)=\{a_1', a_2', \dots, a_{i-1}', a_i'+a_j', a_{i+1}', \dots, a_{j-1}', a_{j+1}', \dots, a_t'\}$. After *t* − 1 such steps, we obtain an optimal partition which contains the minimum element *a* in *A* ∪ *B*. □

An example of the above proof is given as follows. We have *A* = {2, 5, 5}, *B* = {6, 6}, and an optimal partition *τ* = {1, 1, 5, 5} where *τ*(2) = {*a*_{1} = 1, *a*_{2} = 1}. By the construction in the proof of Lemma 5, *y*_{1} = 6, *y*_{2} = 6, *a*_{3} = 5, $a_3' = 4$ and $a_1' = 2$. The new optimal solution is *τ'* = {1, 2, 4, 5}, where *a* = 2 is kept.

We comment that we can use Lemma 2.2 by Chen *et al*. [7] directly to prove a weaker claim: as |*MCIP*(*A*, *B*)| ≤ |*A*| + |*B*| − 1, there must be an optimal solution whose corresponding matching graph between the partitioned elements in *A, B* contains no cycle, which means there is at least one leaf node. This leaf node then corresponds to an unpartitioned integer in *A* or *B*. The above lemma in fact implies a faster FPT algorithm for MCIP. Pick the smallest element *a* ∈ *A* ∪ *B* (say *a* ∈ *A*); we try to partition some other integer *z* ∈ *B* by subtracting *a* from it. Then we repeat on the new problem instance involving *z* − *a*. This process is repeated at most *k* times, until either a solution is found or we have to report that there is no solution of size *k*. The running time is *O**((max{|*A*|, |*B*|})^{k}) = *O**(*k*^{k}).
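The bounded search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: inputs are lists of positive integers, the decision variant asks for a common partition of size at most *k*, and the names `has_common_partition` and `min_common_partition_size` are hypothetical. By Lemma 5, the smallest element can be kept whole, so the search only branches on which integer of the other side it is subtracted from.

```python
from typing import List


def has_common_partition(A: List[int], B: List[int], k: int) -> bool:
    """Decide whether A and B admit a common partition of size at most k."""
    A, B = sorted(A), sorted(B)
    if not A and not B:
        return True
    if k == 0 or not A or not B:
        return False
    if B[0] < A[0]:
        A, B = B, A  # make sure the overall smallest element lies in A
    a, rest_A = A[0], A[1:]
    for z in set(B):  # branch on the distinct values a can be cut from
        rest_B = list(B)
        rest_B.remove(z)
        if z > a:
            rest_B.append(z - a)
        if has_common_partition(rest_A, rest_B, k - 1):
            return True
    return False


def min_common_partition_size(A: List[int], B: List[int]) -> int:
    """Smallest k for which a common partition of size k exists."""
    if sum(A) != sum(B):
        raise ValueError("A and B must have the same sum")
    k = max(len(A), len(B))  # no solution can be smaller than this
    while not has_common_partition(A, B, k):
        k += 1
    return k


print(min_common_partition_size([5, 8], [3, 10]))    # 3
print(min_common_partition_size([2, 5, 5], [6, 6]))  # 4
```

The second call reproduces the example above, where the optimal common partition of *A* = {2, 5, 5} and *B* = {6, 6} has size 4. Each recursion level branches over at most max{|*A*|, |*B*|} values and the depth is at most *k*, in line with the running time stated in the text.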

To obtain an FPT algorithm for MCIPP, we also need a similar lemma.

**Lemma 6** *Let S, T be the input for MCIPP. Then there is an optimal solution for MCIPP which either contains* 〈*a, b*〉 ∈ *S* ∪ *T or* 〈*c, d*〉 ∈ *S* ∪ *T, or contains* 〈*a, d*〉, *where a is the minimum element in* $\mathcal{P}_1(S\cup T)$ *and d is the minimum element in* $\mathcal{P}_2(S\cup T)$.

*Proof*. Again, we first show the following claim: in an optimal solution *τ* for MCIPP with input *S, T*, let 〈*a*_{1}, *a*_{2}〉, 〈*b*_{1}, *b*_{2}〉 be two 2-tuples in *τ*(*z*), *z* ∈ *S* ∪ *T*, such that (1) *a*_{1} + *b*_{1} ≤ *a*, and (2) *b*_{1} is the minimum among all pairs of 2-tuples in *τ* satisfying (1). Then there is an optimal solution *τ'* which partitions some 2-tuple *z* ∈ *S* ∪ *T* with *τ'*(*z*) = *τ*(*z*) − {〈*a*_{1}, *a*_{2}〉, 〈*b*_{1}, *b*_{2}〉} ∪ {〈*a*_{1} + *b*_{1}, *a*_{2}〉}. (Symmetrically, we can have a claim on the second component of 2-tuples in *S* ∪ *T*, i.e., *d*.)

WLOG, let *z* = 〈*z*_{1}, *z*_{2}〉 ∈ *S* and let 〈*a*_{1}, *a*_{2}〉 ∈ *τ*(*y*_{1}) and 〈*b*_{1}, *b*_{2}〉 ∈ *τ*(*y*_{2}) for two distinct 2-tuples *y*_{1}, *y*_{2} ∈ *T*. Following the definition of (*a*_{1}, *b*_{1}), *τ*(*y*_{1}) contains at least one more pair 〈*c*_{1}, *c*_{2}〉, with *c*_{1} ≥ *b*_{1}. Suppose that 〈*c*_{1}, *c*_{2}〉 ∈ *τ*(*x*) for some *x* ∈ *S*. Again, by Lemma 1, *x* ≠ *z*. We replace 〈*c*_{1}, *c*_{2}〉 by 〈*c*_{1} − *b*_{1}, *c*_{2}〉, and 〈*a*_{1}, *a*_{2}〉 by 〈*a*_{1} + *b*_{1}, *a*_{2}〉. Subsequently, we obtain another optimal partition *τ'* with *τ'*(*x*) = *τ*(*x*) − {〈*c*_{1}, *c*_{2}〉} ∪ {〈*c*_{1} − *b*_{1}, *c*_{2}〉, 〈*a*_{2}, *b*_{2}〉}, *τ'*(*y*_{1}) = *τ*(*y*_{1}) − {〈*a*_{1}, *a*_{2}〉, 〈*c*_{1}, *c*_{2}〉} ∪ {〈*a*_{1} + *b*_{1}, *a*_{2}〉, 〈*c*_{1} − *b*_{1}, *c*_{2}〉}, and *τ'*(*z*) = *τ*(*z*) − {〈*a*_{1}, *a*_{2}〉, 〈*b*_{1}, *b*_{2}〉} ∪ {〈*a*_{1} + *b*_{1}, *a*_{2}〉}. Again, *τ'* is also a minimum size common partition for *S, T*.

Similar to Lemma 5, it is obvious that we can repeatedly apply the above steps to obtain an optimal solution which does not partition the smallest element in $\mathcal{P}_1(S\cup T)$ (and, symmetrically, $\mathcal{P}_2(S\cup T)$). Hence the lemma is proven. □

With the above lemma, it is again possible to have an FPT algorithm, Exact-MCIPP, for MCIPP using bounded degree search. At each step, we search for 〈*a, b*〉, 〈*c, d*〉 ∈ *S* ∪ *T* or 〈*a, d*〉 ∈ *S* ∪ *T*, where *a* is the minimum element in $\mathcal{P}_1(S\cup T)$ and *d* is the minimum element in $\mathcal{P}_2(S\cup T)$, such that some optimal solution for MCIPP contains 〈*a*, *b*〉, 〈*c, d*〉 or 〈*a*, *d*〉. For one step, the running time is *O*(*k*_{1} + *k*_{2}) for the first two cases and also *O*(*k*_{1} + *k*_{2}) for the third, as 〈*a, d*〉 could be subtracted from *O*(*k*_{1} + *k*_{2}) pairs, where *k*_{1} = |*S*|, *k*_{2} = |*T*|. As *k*_{1}, *k*_{2} ≤ *k*, the running time of this step is bounded by *O*(2*k*). Running this for *k* steps, the running time of the whole algorithm is *O**(2^{k}*k*^{k}). Hence, we have the following theorem.

Algorithm *Exact-MCIPP(S, T)*

Input: *S, T, k*

Output: A common partition *τ*(*S*, *T*) for *S*, *T*, initially empty.

1 While *k* ≥ 1

2 Repeat

2.1 let *a* be the minimum element in $\mathcal{P}_1(S\cup T)$,

2.2 let *d* be the minimum element in $\mathcal{P}_2(S\cup T)$,

2.3 if 〈*a*, *d*〉 ∈ *S* ∪ *T* then *τ*(*S*, *T*) ← *τ*(*S*, *T*) ∪ {〈*a*, *d*〉}, delete 〈*a*, *d*〉 from *S* ∪ *T*, and update *S*, *T* and *k* ← *k* − 1,

2.4 if 〈*a*, *b*〉 ∈ *S* ∪ *T* then *τ*(*S*, *T*) ← *τ*(*S*, *T*) ∪ {〈*a*, *b*〉}, delete 〈*a*, *b*〉 from *S* ∪ *T*, and update *S*, *T* and *k* ← *k* − 1,

2.5 if 〈*c*, *d*〉 ∈ *S* ∪ *T* then *τ*(*S*, *T*) ← *τ*(*S*, *T*) ∪ {〈*c*, *d*〉}, delete 〈*c*, *d*〉 from *S* ∪ *T*, and update *S*, *T* and *k* ← *k* − 1,

3 Until *S* = ∅ or *T* = ∅ or *k* = 0.

4 If both *S* = ∅ and *T* = ∅

4.1 Then return *τ*(*S*, *T*),

4.2 Else return 'no solution'.

**Theorem 3** *Minimum Common Integer Pair Partition is FPT*.

The running time of the above FPT algorithm is still too high to be applied alone to the similarity comparison for arbitrary 2-generation pedigrees, i.e., when *k* is large. In [14], the salmon data contains 60 individuals from each family, with hundreds of families. To handle some data like that, we either need to speed up the running time of our algorithm or combine the FPT algorithm with some existing approximation algorithms (which will be discussed next). Nevertheless, it lays down a solid theoretical foundation for further research on this problem, especially when *k* is relatively small.

In practice, to handle datasets possibly of varying *k* values, we suggest a combination of the FPT algorithm and approximation algorithms [7, 31]. That is, when the value of *k* is not too large, we can run this FPT algorithm; when *k* is too large for the FPT algorithm to handle, we can then use the approximation algorithms. (We comment that the approximation algorithms in [7, 31], though presented for MCIP, can be easily adapted for MCIPP.)