Matching instance and its reduction to a cost flow network. (a) A bipartite graph corresponding to sets X and Y. In our particular application, X represents a set of reads and Y represents a set of genomic segments, where the expected coverage of each read is one and segments are expected to be uniformly covered. Each read x ∈ X potentially maps to multiple segments, illustrated by the edges in the graph. An edge (x, y) has the weight w
(x, y), reflecting the best similarity between read x and a substring of of the genome starting at segment y. (b) and (c) depict two possible matchings. In (b), one of the y segments is covered by four reads, while the other two segments are covered by one read each. In (c), each segment is covered by two reads. It is possible that the matching in (b) is better in terms of sequence similarity, though is unrealistic in terms of segment coverage, which would make the matching in (c) preferable. (d) The corresponding network. Each pair of consecutive layers is a bipartite graph with capacities c and costs w' as described.