Fig. 5 | BMC Bioinformatics

Fig. 5

From: Filling gaps of genome scaffolds via probabilistic searching optical maps against assembly graph

Searching the site graph for the site sequence that best matches a gap. In this example, the gap has site sequence \(x_1x_2x_3x_4\) with inter-site distances 8, 5, and 3, respectively. From locating gaps in Step 1, we know that the beginning site \(x_1\) matches \(s_1\) and the ending site \(x_4\) matches \(s_7\). Thus, our objective is to find the path from \(s_1\) to \(s_7\) that best matches the gap \(x_1x_2x_3x_4\). a Initially, we set \(\Pr [x_1 = s_1] = 1\) because \(x_1\) is known to match \(s_1\). Next, we propagated this probability to downstream site pairs and calculated the following matching beliefs for site \(x_2\): \(Belief(x_2 = s_2) = \Pr [x_1 = s_1] S(8, 8)\), \(Belief(x_2 = s_3) = \Pr [x_1 = s_1] S(8, 7)\), and \(Belief(x_2 = s_5) = \Pr [x_1 = s_1] S(8, 14)\). After normalization, we obtained the site matching probabilities: \(\Pr [x_2 = s_2] = 0.81\), \(\Pr [x_2 = s_3] = 0.19\), and \(\Pr [x_2 = s_5] = 0\). b We propagated these probabilities further and obtained the following beliefs for site \(x_3\): \(Belief(x_3 = s_4) = \Pr [x_2=s_2] S(5, 5)\), \(Belief(x_3 = s_5) = \Pr [x_2=s_3] S(5, 4)\), and \(Belief(x_3 = s_7) = \Pr [x_2=s_5] S(5, 4)\). After normalization, we obtained the site matching probabilities: \(\Pr [x_3 = s_4] = 0.95\), \(\Pr [x_3 = s_5] = 0.05\), and \(\Pr [x_3 = s_7] = 0\). c For site \(x_4\), we calculated its matching beliefs similarly. Note that two paths reach site \(s_7\), so we took the maximum over the two paths: \(Belief(x_4 = s_7) = \max \{\Pr [x_3=s_4]S(3,3), \Pr [x_3 = s_5] S(3,4)\}\). After calculating \(\Pr [x_4=s_7]\), we traced back and reported the best matching site path as \(s_1s_2s_4s_7\).
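The propagate, normalize, and trace-back procedure in panels a to c can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation. The edge list `EDGES` is read off the second arguments of \(S(\cdot ,\cdot )\) in the caption and may not match the full graph drawn in the figure. The function `score` is a hypothetical Gaussian-shaped stand-in for \(S\), whose actual definition is given in the article rather than in this caption, so the intermediate probabilities will differ from the values quoted above. The names `score`, `best_path`, and `sigma` are likewise made up for illustration.

```python
import math

# Hypothetical site graph: EDGES[u][v] is the distance from site u to site v,
# taken from the second argument of S(., .) in the caption.
EDGES = {
    "s1": {"s2": 8, "s3": 7, "s5": 14},
    "s2": {"s4": 5},
    "s3": {"s5": 4},
    "s4": {"s7": 3},
    "s5": {"s7": 4},
}


def score(gap_dist, graph_dist, sigma=1.0):
    """Assumed stand-in for S: near 1 when the distances agree, near 0 otherwise."""
    return math.exp(-((gap_dist - graph_dist) ** 2) / (2 * sigma ** 2))


def best_path(edges, gap_dists, start, end):
    """Propagate matching probabilities site by site, then trace back from `end`."""
    probs = {start: 1.0}          # Pr[x_1 = start] = 1
    backptrs = []                 # one predecessor map per gap interval
    for gap_dist in gap_dists:
        beliefs, back = {}, {}
        for site, p in probs.items():
            for nxt, graph_dist in edges.get(site, {}).items():
                belief = p * score(gap_dist, graph_dist)
                # Keep only the best incoming path when several reach `nxt`.
                if belief > beliefs.get(nxt, -1.0):
                    beliefs[nxt], back[nxt] = belief, site
        total = sum(beliefs.values())
        if total == 0:
            raise ValueError("no candidate site has non-zero belief")
        # Normalize beliefs into matching probabilities.
        probs = {site: b / total for site, b in beliefs.items()}
        backptrs.append(back)
    if end not in probs:
        raise ValueError("end site is not reachable in the required number of steps")
    # Follow the stored predecessors from the end site back to the start.
    path = [end]
    for back in reversed(backptrs):
        path.append(back[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    print(best_path(EDGES, [8, 5, 3], "s1", "s7"))
```

With the assumed score and edge list, the trace-back selects the same site path as the caption, \(s_1s_2s_4s_7\). A different choice of \(S\) changes the intermediate probabilities and could in principle change the path.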
