Fig. 3 | BMC Bioinformatics

Fig. 3

From: Accurate determination of node and arc multiplicities in de bruijn graphs using conditional random fields

Subgraph of the de Bruijn graph built from the real P. aeruginosa dataset (30×). Nodes and arcs are labelled with their (average) coverage. The fitted node and arc models are shown on the left. The nodes of the graph are coloured according to their true multiplicity. The neighbourhoods of size 1, 2 and 3 surrounding node n1 are shown as coloured ellipses. The inset table shows the categorical distribution \(P(Y_{\mathbf{n_{1}}})\) over the multiplicities {0,1,2} for node n1 at different neighbourhood sizes. For small neighbourhoods (sizes 0 and 1), node n1 (incorrectly) appears to represent a sequencing error because of its relatively low coverage. At neighbourhood sizes 2 and 3, the CRF model has enough information to (correctly) infer multiplicity 1 for node n1 with high probability.
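
The effect described in the caption can be illustrated with a toy calculation. The following is a minimal sketch, not the authors' CRF implementation. It assumes a Poisson coverage model (the caption does not specify the fitted models), a hypothetical error-coverage mean `err_mean`, a uniform prior over multiplicities, and the simplification that all nodes on an unbranched path share one multiplicity. The coverage values are invented for illustration and are not read from the figure.

```python
import math

MULTIPLICITIES = (0, 1, 2)


def log_likelihoods(coverage, mean_cov, err_mean=1.0):
    """Poisson log-likelihood of the observed coverage for each multiplicity.

    Multiplicity 0 (sequencing error) uses the hypothetical mean `err_mean`;
    multiplicity m >= 1 uses m * mean_cov. Both are illustrative assumptions.
    """
    means = [err_mean if m == 0 else m * mean_cov for m in MULTIPLICITIES]
    return [
        coverage * math.log(mu) - mu - math.lgamma(coverage + 1)
        for mu in means
    ]


def normalise(logs):
    """Turn log-scores into a categorical distribution (stable softmax)."""
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]


def isolated_posterior(coverage, mean_cov):
    """Neighbourhood size 0: only the node's own coverage is used."""
    return normalise(log_likelihoods(coverage, mean_cov))


def path_posterior(coverages, mean_cov):
    """Larger neighbourhood, simplified: nodes on an unbranched path are
    assumed to share one multiplicity, so their log-evidence is summed."""
    per_node = [log_likelihoods(c, mean_cov) for c in coverages]
    summed = [sum(column) for column in zip(*per_node)]
    return normalise(summed)


# Invented example: a low-coverage node next to two well-covered nodes.
alone = isolated_posterior(4, mean_cov=30.0)
in_context = path_posterior([4, 28, 31], mean_cov=30.0)
print("isolated:", alone)
print("with neighbours:", in_context)
```

In this sketch the low-coverage node looks like an error when considered alone, while pooling evidence from its neighbours shifts the probability mass to multiplicity 1. The real model handles branching nodes and arcs, which this simplification ignores.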
