Time complexity
Here we prove the NP-hardness of the α,β-WQB_{P(C)} problem by a reduction from the maximum edge biclique problem. Note that the query_{P(C)} problem is a generalization of the α,β-WQB_{P(C)} problem and hence is also NP-hard.
Lemma 1. The α,β-WQB_{P(C)} problem is NP-hard.
Proof. Given a bipartite graph G := (U + V, E) and an integer k, the maximum edge biclique problem asks if G contains a biclique with at least k edges. The maximum edge biclique problem is NP-complete [10]. Let G' := (U + V, E', ω) be a weighted bipartite graph where ω(u, v) is set to 1 if (u, v) ∈ E and to 0 otherwise. Note that there is a biclique with at least k edges in G if and only if the maximum weighted α,β-WQB_{P} in G' has a weight of at least k when α and β are set to 1. Similarly, there is a biclique with at least k edges in G if and only if the maximum weighted α,β-WQB_{C} in G' has a weight of at least k when α and β are set to 0. Therefore, the α,β-WQB_{P(C)} problem is NP-hard.
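The reduction in Lemma 1 can be illustrated with a small sketch. It assumes the percentage-version properties in the form ∑_{v∈V'} ω(u, v) ≥ α|V'| and ∑_{u∈U'} ω(u, v) ≥ β|U'| on a non-empty pair, which is how they read in the generalized formulation later in this section; the function names and the toy graph are hypothetical.

```python
from itertools import product

def unit_weights(U, V, E):
    """Weights of G': 1 on the edges of G, 0 on every other pair (u, v)."""
    return {(u, v): 1 if (u, v) in E else 0 for u, v in product(U, V)}

def is_wqb_p(U_sub, V_sub, w, alpha, beta):
    """Assumed percentage-version test on a non-empty pair (U_sub, V_sub)."""
    if not U_sub or not V_sub:
        return False
    rows = all(sum(w[u, v] for v in V_sub) >= alpha * len(V_sub) for u in U_sub)
    cols = all(sum(w[u, v] for u in U_sub) >= beta * len(U_sub) for v in V_sub)
    return rows and cols

def total_weight(U_sub, V_sub, w):
    """Sum of the weights of all pairs induced by (U_sub, V_sub)."""
    return sum(w[u, v] for u in U_sub for v in V_sub)

# Toy graph: a 2x2 biclique with the edge (u2, v2) missing.
U, V = ["u1", "u2"], ["v1", "v2"]
E = {("u1", "v1"), ("u1", "v2"), ("u2", "v1")}
w = unit_weights(U, V, E)

# With alpha = beta = 1 only bicliques of G pass, and their weight is their edge count.
print(is_wqb_p(["u1"], V, w, 1, 1), total_weight(["u1"], V, w))
print(is_wqb_p(U, V, w, 1, 1))
```

With 0/1 weights and both thresholds at 1, every selected pair must carry weight 1, so the feasible pairs are exactly the bicliques of G.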
We now prove that checking for the existence of a percentage α,β-WQB in a bipartite graph is NP-complete. Note that checking the existence of a constant version α,β-WQB in a bipartite graph can be done in polynomial time. For the rest of the section we refer only to percentage version α,β-WQBs.
Problem 3 (Existence).
Instance: A weighted bipartite graph G := (U + V, E, ω), values α, β ∈ [0, 1].
Find: If there exists an α,β-WQB_{P} (U', V') in G.
To prove the hardness of the existence problem we need some auxiliary definitions. A modified weighted bipartite graph, denoted by (U + V, E, Ω), is a complete bipartite graph (U + V, E) with a weight function Ω: E → [0, 1] where, for any two edges e and e', Ω(e) − Ω(e') ≤ 1.
Definition 4 (Modified α,β-WQB (MOWQB)).
Let G := (U + V, E, Ω) be a modified weighted bipartite graph. A nonempty pair (U', V') included in (U, V) is a MOWQB of G if it satisfies the following three properties: (1) (U', V') includes (∅, V), (2) ∀u ∈ U' : ∑_{v∈V'} Ω(u, v) ≥ 0, and (3) ∀v ∈ V' : ∑_{u∈U'} Ω(u, v) ≥ 0.
Problem 4 (One sided existence).
Instance: A weighted bipartite graph G := (U + V, E, ω), values α, β ∈ [0, 1].
Find: If there exists an α,β-WQB_{P} (U', V') in G which includes the pair (∅, V).
Problem 5 (Modified existence).
Instance: A modified weighted bipartite graph G := (U + V, E, Ω).
Find: If there exists a MOWQB in G.
The series of reductions used to prove the hardness of the existence problem is as follows. We first reduce the partition problem, which is NP-complete [18], to the modified existence problem. The modified existence problem is then reduced to the one sided existence problem, which in turn reduces to the existence problem.
Lemma 2. The modified existence problem is NP-complete.
Proof. We first show that MOWQB ∈ NP. Given a modified weighted bipartite graph G := (U + V, E, Ω) and a pair (U', V') included in (U, V), it can be verified in polynomial time whether the pair (U', V') satisfies all the MOWQB constraints for G. So, the modified existence problem belongs to the class NP. The reduction from the partition problem is as follows.
We are left to show that partition ≤_{p} MOWQB. Given a finite set A and a size s(a) ∈ Z^{+} associated with every element a of A, the partition problem asks if A can be partitioned into two sets (A_{1}, A_{2}) such that ∑_{a∈A_{1}} s(a) = ∑_{a∈A_{2}} s(a).
a. Construction: Let SUM be the sum of the sizes of all elements in A. Build a modified weighted bipartite graph G := (U + V, E, Ω) as follows. For every element a in A there is a corresponding vertex u_{a} in U. The set V contains two vertices v_{+} and v_{−}. For every vertex u_{a} ∈ U, Ω(u_{a}, v_{+}) = s(a)/(2 × SUM) and Ω(u_{a}, v_{−}) = −s(a)/(2 × SUM). Add an additional vertex u_{sum} to the set U. Set Ω(u_{sum}, v_{+}) to −1/4 and Ω(u_{sum}, v_{−}) to 1/4. Note that the weights assigned to the edges of G satisfy the constraint on Ω for a modified weighted bipartite graph.
b. ⇒: Let (A_{1}, A_{2}) be a partition of A such that the sum of the sizes of the elements in A_{1} is equal to the sum of the sizes of the elements in A_{2}. Let U_{1} = {u_{a} : a ∈ A_{1}}. The sum of the weights of all edges from v_{+} to the vertices in U_{1} is equal to 1/4. Let U' = U_{1} ∪ {u_{sum}}. The sum of the weights of all edges from v_{+} to the vertices of U' is 0. Similarly, the sum of the weights of all edges from v_{−} to the vertices of U' is 0. Moreover, for every vertex of U' the weights of its two edges sum to 0. Thus, (U', V) is a MOWQB of G.
⇐: Let (P, V) be a MOWQB of G. The edge from v_{−} to u_{sum} is the only positive weighted edge incident on vertex v_{−}. So, P will contain vertex u_{sum}. Since Ω(u_{sum}, v_{+}) is negative, the set P will also contain vertices from U − {u_{sum}}. The sum of the weights of the edges from v_{−} to the vertices in P − {u_{sum}} cannot be smaller than −1/4. Similarly, the sum of the weights of the edges from v_{+} to the vertices in P − {u_{sum}} cannot be smaller than 1/4. So, the sum of the sizes of all elements in A corresponding to the vertices in P − {u_{sum}} should be equal to SUM/2. This proves that if G contains a MOWQB, the set A can be partitioned.
Hence, the modified existence problem is NP-complete.
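The construction of Lemma 2 can be tried on small partition instances. The sketch below is only an illustration under stated assumptions: the signs of the weights follow the proof (Ω(u_{sum}, v_{+}) negative, Ω(u_{sum}, v_{−}) positive), a MOWQB is taken to need a non-empty U', and the exhaustive search is exponential. All function and vertex names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def build_modified_graph(sizes):
    """Return (U, V, Omega) for a partition instance given as {element: size}."""
    total = sum(sizes.values())  # SUM in the text
    V = ["v+", "v-"]
    Omega = {}
    for a, s in sizes.items():
        Omega["u_" + str(a), "v+"] = Fraction(s, 2 * total)
        Omega["u_" + str(a), "v-"] = Fraction(-s, 2 * total)
    Omega["u_sum", "v+"] = Fraction(-1, 4)
    Omega["u_sum", "v-"] = Fraction(1, 4)
    U = ["u_" + str(a) for a in sizes] + ["u_sum"]
    return U, V, Omega

def is_mowqb(U_sub, V, Omega):
    """(U_sub, V) passes if U_sub is non-empty and every row and column sum is >= 0."""
    if not U_sub:
        return False
    rows = all(sum(Omega[u, v] for v in V) >= 0 for u in U_sub)
    cols = all(sum(Omega[u, v] for u in U_sub) >= 0 for v in V)
    return rows and cols

def has_mowqb(U, V, Omega):
    """Exhaustive search over all non-empty subsets of U; tiny inputs only."""
    return any(is_mowqb(U_sub, V, Omega)
               for r in range(1, len(U) + 1)
               for U_sub in combinations(U, r))

yes_instance = build_modified_graph({"a": 1, "b": 2, "c": 3})  # {a, b} versus {c}
no_instance = build_modified_graph({"a": 1, "b": 2, "c": 4})   # odd total, no partition
print(has_mowqb(*yes_instance), has_mowqb(*no_instance))
```

Exact rational arithmetic is used because the argument depends on sums being exactly 0, which floating point would not guarantee.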
Lemma 3. The one sided existence problem is NP-complete.
Proof. The proof of one sided existence ∈ NP is omitted for brevity. Next we show MOWQB ≤_{p} one sided existence, that is, we reduce the modified existence problem to the one sided existence problem. The reduction is as follows.
a. Construction: Let G := (U + V, E, Ω) be the modified weighted bipartite graph in an instance of the modified existence problem. We build a graph G' := (U + V, E, ω) for an instance of the one sided existence problem from G. Notice that the partition and the vertices remain the same. If the weight of every edge in G is nonnegative, set α = β = 0 and ω(u, v) = Ω(u, v) for every edge (u, v) ∈ E. Otherwise, set α and β to −x and ω(u, v) = Ω(u, v) − x for every edge (u, v) ∈ E, where x is the minimum edge weight in G.
b. ⇒ and ⇐: Let (U', V) be a MOWQB of graph G. If the weights of all edges in G are nonnegative, the constraints for both problems are the same. If G has negative weighted edges, the constraints of both problems will be the same when α, β and ω for the one sided existence problem instance are set as mentioned in the construction. It can be seen that there is a MOWQB in G if and only if there is an α,β-WQB_{P} in the graph G' which includes the pair (∅, V).
This proves that the one sided existence problem is NP-complete.
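The weight shift used in Lemma 3 can be checked numerically. The following sketch assumes that the minimum edge weight x is negative in the second case, so the thresholds become −x and every weight is raised by −x; the helper names and the four-edge example are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def shift_weights(Omega):
    """Map modified weights Omega to (omega, alpha, beta) of a one sided instance."""
    x = min(Omega.values())
    if x >= 0:
        return dict(Omega), Fraction(0), Fraction(0)
    omega = {edge: weight - x for edge, weight in Omega.items()}
    return omega, -x, -x

def feasible(U_sub, V, w, alpha, beta):
    """Row and column thresholds for the pair (U_sub, V); alpha = beta = 0 gives MOWQB."""
    rows = all(sum(w[u, v] for v in V) >= alpha * len(V) for u in U_sub)
    cols = all(sum(w[u, v] for u in U_sub) >= beta * len(U_sub) for v in V)
    return rows and cols

U, V = ["u1", "u_sum"], ["v+", "v-"]
Omega = {("u1", "v+"): Fraction(1, 4), ("u1", "v-"): Fraction(-1, 4),
         ("u_sum", "v+"): Fraction(-1, 4), ("u_sum", "v-"): Fraction(1, 4)}
omega, alpha, beta = shift_weights(Omega)

# Both instances accept exactly the same non-empty subsets U' of U.
subsets = [s for r in range(1, len(U) + 1) for s in combinations(U, r)]
agree = all(feasible(s, V, Omega, 0, 0) == feasible(s, V, omega, alpha, beta)
            for s in subsets)
print(alpha, agree)
```

Subtracting x from every weight adds −x|V| to each row sum and −x|U'| to each column sum, which is exactly what the thresholds α|V| and β|U'| require when α = β = −x.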
Lemma 4. The existence problem is NP-complete.
Proof. Given a pair of vertex sets (U', V'), a weighted bipartite graph G = (U + V, E, ω) and values α, β ∈ [0, 1], it can be verified in polynomial time if (U', V') is an α,β-WQB_{P} in G. Thus, the existence problem belongs to NP. We now show that one sided existence ≤_{p} existence.
a. Construction: Let G' = (U + V, E', ω), α', β' ∈ [0, 1] be the parameters of the one sided existence problem. We build the weighted bipartite graph G = (U_{p} + V, E, ω) for the instance of the existence problem as follows. First, set G = G'. For every vertex u ∈ U, let S_{u} denote the sum of the weights of all edges incident on u. Delete every vertex u ∈ U whose S_{u} is less than α'|V|. Let (U_{p} + V) denote the remaining vertices, and E the remaining edges in G. For the instance of the existence problem, set α = 0 and β = β'.
b. ⇒ and ⇐: Any α,β-WQB_{P} in G' which includes (∅, V) is also an α,β-WQB_{P} in G. Consider an α,β-WQB_{P} (U', V') in G. If V' = V, then (U', V') is an α,β-WQB_{P} in G' which includes the pair (∅, V). If V' is not the same set as V, the pair (U', V) is still an α,β-WQB_{P} in G' and it includes the pair (∅, V).
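The pruning step in the construction of Lemma 4 is short enough to sketch directly. This is a minimal illustration, assuming the weight function is stored for every pair (u, v) with 0 for missing edges; the function name and the example weights are hypothetical.

```python
from fractions import Fraction

def prune_for_existence(U, V, omega, alpha_prime, beta_prime):
    """Build the existence instance (U_p, V, omega, alpha, beta) from a one sided one.

    A vertex u is kept only if S_u, the total weight of its incident edges,
    reaches alpha' * |V|. The new thresholds are alpha = 0 and beta = beta'.
    """
    threshold = alpha_prime * len(V)
    U_p = [u for u in U if sum(omega[u, v] for v in V) >= threshold]
    omega_p = {(u, v): omega[u, v] for u in U_p for v in V}
    return U_p, V, omega_p, 0, beta_prime

U, V = ["u1", "u2"], ["v1", "v2"]
omega = {("u1", "v1"): 1, ("u1", "v2"): 1,
         ("u2", "v1"): Fraction(1, 2), ("u2", "v2"): 0}
U_p, V_p, omega_p, alpha, beta = prune_for_existence(
    U, V, omega, Fraction(1, 2), Fraction(1, 4))
print(U_p, alpha, beta)
```

Here u2 is deleted because its incident weight 1/2 is below α'|V| = 1, so it could never satisfy the row threshold in a pair that contains all of V.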
IP formulations for the α,β-WQB problem
Although greedy approaches are often used in problems of a similar structure, e.g., multidimensional knapsack [19] and δ-QB [3], in our experiments neither the greedy nor the randomized approach identified solutions close enough to the exact solutions. Simple greedy and randomized solutions yielded accuracies ranging from 60% to 95%, depending on various parameters, without any performance guarantee. Hence we consider it important to find exact solutions in order to demonstrate the usefulness of α,β-WQBs. Here we present integer programming (IP) formulations that solve the α,β-WQB problem exactly.
Due to the similarity in formulating constraints between α,β-WQB_{C} and α,β-WQB_{P}, we start by formulating a solution to α,β-WQB_{P}. Our initial IP requires quadratic constraints, which are then replaced by linear constraints so that it can be solved by various optimization software packages. Our final formulation is further improved by adopting the implication rule to simplify the variables involved. This improved formulation requires a number of variables and constraints linear in the number of input edges, and is thus better suited to sparse graphs. Throughout the section, unless stated otherwise, G := (U + V, E, ω) represents a weighted bipartite graph, G' = (U', V') represents the maximum weighted α,β-WQB of G, and E' represents the edges induced by G' in G.
Quadratic programming
For each u ∈ U (v ∈ V), a binary variable x_{u} (x_{v}) is introduced. The variable x_{u} (x_{v}) is 1 if and only if vertex u (v) is in U' (V'). The integer program to find the solution G' can be formulated as follows.
The quadratic terms in the constraints are necessary because the α and β thresholds apply only to vertices in U' and V'. This formulation uses a number of variables and constraints linear in the number of input vertices, i.e., O(|U| + |V|). Since solving a quadratic program usually requires a proprietary solver, we reformulate the program so that all expressions are linear.
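As a reading aid, one quadratic program that matches this description can be written as below. It is a sketch under assumptions, not necessarily the exact formulation: it takes the percentage-version thresholds, sets ω(u, v) = 0 for pairs that are not edges, leaves out any non-emptiness constraint, and carries no equation numbers.

```latex
\begin{align*}
\text{maximize}\quad & \sum_{(u,v)\in E} \omega(u,v)\, x_u x_v \\
\text{subject to}\quad
 & \sum_{v\in V} \omega(u,v)\, x_u x_v \;\ge\; \alpha\, x_u \sum_{v\in V} x_v
   && \forall u\in U, \\
 & \sum_{u\in U} \omega(u,v)\, x_u x_v \;\ge\; \beta\, x_v \sum_{u\in U} x_u
   && \forall v\in V, \\
 & x_u,\ x_v \in \{0,1\} && \forall u\in U,\ \forall v\in V.
\end{align*}
```

When x_u = 0 both sides of the first constraint vanish, so the threshold binds only for vertices selected into U'; the same holds for x_v and V'. It has one variable and one threshold constraint per vertex.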
Converted linear programming
A standard approach to convert a quadratic program to a linear one is to introduce auxiliary variables that replace the quadratic terms. Here we introduce a binary variable y_{uv} for every edge (u, v) in G, such that y_{uv} = 1 if and only if x_{u} = x_{v} = 1, i.e., the edge (u, v) is in G'. The linear program to find the solution G' is formulated as follows.
Expressions (7) and (8) state the condition that y_{uv} = 1 if and only if x_{u} = x_{v} = 1. Expression (8) ensures that, for any edge whose end points (u, v) are chosen to be in G', y_{uv} is set to 1. Due to the use of the y_{uv} variables, this formulation requires O(|U||V|) variables and constraints.
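The linking condition can be verified by enumeration. The sketch below uses the textbook linearization of a product of two binary variables (y ≤ x_u, y ≤ x_v, x_u + x_v ≤ 1 + y); that expressions (7) and (8) take exactly this shape is an assumption, and the function names are hypothetical.

```python
from itertools import product

def linearized_ok(x_u, x_v, y):
    """Linear constraints that tie y to the product x_u * x_v."""
    upper = y <= x_u and y <= x_v   # y = 1 forces both end points to be chosen
    lower = x_u + x_v <= 1 + y      # both end points chosen forces y = 1
    return upper and lower

def feasible_y(x_u, x_v):
    """All binary values of y that the constraints allow for fixed x_u, x_v."""
    return [y for y in (0, 1) if linearized_ok(x_u, x_v, y)]

for x_u, x_v in product((0, 1), repeat=2):
    print(x_u, x_v, feasible_y(x_u, x_v))
```

For every assignment of x_u and x_v exactly one value of y remains feasible, and it equals x_u · x_v, so the quadratic terms can be replaced without changing the feasible set.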
Improved linear programming
Observe that constraint (7) becomes trivial if y_{uv} = 0. In other words, this constraint formulates an implication: for binary variables p and q, the expression p ≤ q is equivalent to p → q. Expanding on this idea, we eliminate the variables y_{uv} from constraints (9) and (10) in the next formulation, while keeping the rest of the aforementioned linear program.
There is a variable x_{v} for every vertex v in G. There is a variable y_{uv} for every edge (u, v) in G whose weight is not 0. The variable y_{uv} is set to 1 if and only if both x_{u} and x_{v} are set to 1. For any vertex u ∈ U (v ∈ V), the variable x_{u} (x_{v}) is set to 1 if and only if vertex u (v) is in G'. Constraint (12) can also be explained as follows. If x_{u} = 1, the constraint transforms to the second constraint in the α,β-WQB definition. If x_{u} = 0, constraint (12) becomes trivial. Constraint (13) can be explained in a similar manner.
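One way to obtain a constraint with this behaviour is a big-M encoding of the implication x_{u} = 1 → ∑_{v} (ω(u, v) − α) x_{v} ≥ 0. The sketch below is an assumption about the general technique, not a transcription of constraint (12): it uses |V| as the constant M, which is large enough when weights and α lie in [0, 1], and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def implied_row_constraint(x_u, x_V, omega_u, alpha):
    """Check sum_v (omega(u, v) - alpha) * x_v >= -|V| * (1 - x_u)."""
    lhs = sum((omega_u[v] - alpha) * x_V[v] for v in x_V)
    return lhs >= -len(x_V) * (1 - x_u)

def definition_row(x_V, omega_u, alpha):
    """Percentage property for one u: weight into V' is at least alpha * |V'|."""
    chosen = [v for v in x_V if x_V[v] == 1]
    return sum(omega_u[v] for v in chosen) >= alpha * len(chosen)

omega_u = {"v1": Fraction(1), "v2": Fraction(1, 4), "v3": Fraction(0)}
alpha = Fraction(1, 2)
assignments = [dict(zip(omega_u, bits)) for bits in product((0, 1), repeat=3)]

# x_u = 1 reproduces the definition; x_u = 0 never restricts the choice of V'.
same_when_selected = all(
    implied_row_constraint(1, x_V, omega_u, alpha) == definition_row(x_V, omega_u, alpha)
    for x_V in assignments)
trivial_when_unselected = all(
    implied_row_constraint(0, x_V, omega_u, alpha) for x_V in assignments)
print(same_when_selected, trivial_when_unselected)
```

The constraint is linear in x_{u} and the x_{v} alone, so no y_{uv} variable is needed in it; the y_{uv} variables remain only where the objective uses them.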
Generalized formulation for α,β-WQB_{P} and α,β-WQB_{C}
Recall that the difference between the two problems α,β-WQB_{P} and α,β-WQB_{C} is in the edge weight summation, which we can combine as the following properties: (1) ∀u ∈ U' : ∑_{v∈V'} ω(u, v) ≥ α_{P}|V'| − α_{C}, and (2) ∀v ∈ V' : ∑_{u∈U'} ω(u, v) ≥ β_{P}|U'| − β_{C}, where (α_{P}, β_{P}) and (α_{C}, β_{C}) are the parameters given in α,β-WQB_{P} and α,β-WQB_{C} respectively. Following the same reasoning as in the previous paragraphs, linear constraints (12) and (13) are now updated as follows.
As a result, the problem instance is an α,β-WQB_{C} problem if (α_{P}, β_{P}) = (1, 1), and it is an α,β-WQB_{P} problem if (α_{C}, β_{C}) = (0, 0). Note that the formulation does not require either condition to hold; it essentially defines a generalization of the α,β-WQB problems when all four parameters are valid and nonzero.
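The two combined properties can be turned into a small feasibility check that shows how the four parameters select the constant or the percentage version. This is a sketch with hypothetical names; it assumes missing edges have weight 0 and that the pair must be non-empty.

```python
from fractions import Fraction

def is_generalized_wqb(U_sub, V_sub, omega, alpha_p, alpha_c, beta_p, beta_c):
    """Check properties (1) and (2) of the generalized formulation."""
    if not U_sub or not V_sub:
        return False
    rows = all(
        sum(omega.get((u, v), 0) for v in V_sub) >= alpha_p * len(V_sub) - alpha_c
        for u in U_sub)
    cols = all(
        sum(omega.get((u, v), 0) for u in U_sub) >= beta_p * len(U_sub) - beta_c
        for v in V_sub)
    return rows and cols

# A 2x2 biclique of unit weights with the edge (u2, v2) missing.
U, V = ["u1", "u2"], ["v1", "v2"]
omega = {("u1", "v1"): 1, ("u1", "v2"): 1, ("u2", "v1"): 1}
half = Fraction(1, 2)

constant_version = is_generalized_wqb(U, V, omega, 1, 1, 1, 1)         # may miss 1 per vertex
strict_biclique = is_generalized_wqb(U, V, omega, 1, 0, 1, 0)          # nothing may be missing
percentage_version = is_generalized_wqb(U, V, omega, half, 0, half, 0)  # half the weight suffices
print(constant_version, strict_biclique, percentage_version)
```

Setting (α_{P}, β_{P}) = (1, 1) leaves only the additive slack α_{C}, β_{C}, while setting (α_{C}, β_{C}) = (0, 0) leaves only the proportional thresholds.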
If there are n vertices in U and m vertices in V, there will be a total of m + n + 2k constraints and m + n + k variables, where k is the number of edges whose weight is not equal to 0. The above formulations can be extended to solve the query problem by adding an additional constraint x_{v} = 1 to the formulation for every vertex v ∈ P ∪ Q. Similar constraints also help us explore suboptimal solutions, e.g., by excluding known vertices in subsequent solutions, or provide a lower bound on the required query items in the optimal solution.