Fig. 1 | BMC Bioinformatics

Fig. 1

From: fLPS: Fast discovery of compositional biases for the protein universe

The algorithm. Three stages of bias annotation are depicted. QUICK SCAN: For each amino-acid residue type, from the maximum window size M down to the minimum m, the sequence is scanned for windows that contain more residues of that type than expected for a high binomial P-value threshold (=0.001). Windows that overlap each other are merged into a contig. MINIMIZE: For each contig, the lowest-probability subsequences (LPSs) are computed by searching down from the contig length to the minimum m. MERGE: LPSs for different residue types are then sorted together in increasing order of binomial P-value and iteratively assessed for merger into multiple-residue LPSs. Two LPSs are merged if the merged LPS would have a lower P-value. This assessment entails checking whether the multiple-residue LPS can be trimmed or extended, as depicted. Mergers are assessed until no more can be performed. OUTPUT: Both single- and multiple-residue LPSs are output in increasing order of binomial P-value.
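
The QUICK SCAN and MINIMIZE stages described in the caption can be illustrated with a minimal Python sketch. It is not the fLPS implementation: the function names (`binomial_tail`, `quick_scan`, `lowest_probability_subsequence`), the background frequency of 0.05, and the toy sequence are all assumptions made for illustration. It also returns only the single best subsequence per contig and omits the MERGE stage entirely.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more residues in a window of n."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def quick_scan(seq, residue, freq, max_window, min_window, threshold=0.001):
    """QUICK SCAN sketch: flag biased windows, then merge overlapping ones.

    Returns (start, end) contigs with `end` exclusive. `freq` is an assumed
    background frequency for `residue`.
    """
    hits = []
    for w in range(min(max_window, len(seq)), min_window - 1, -1):
        count = seq[:w].count(residue)  # count for the window starting at 0
        for start in range(len(seq) - w + 1):
            if start > 0:
                # Slide by one: add the entering residue, drop the leaving one.
                count += (seq[start + w - 1] == residue) - (seq[start - 1] == residue)
            if binomial_tail(w, count, freq) <= threshold:
                hits.append((start, start + w))
    hits.sort()
    contigs = []
    for start, end in hits:
        if contigs and start < contigs[-1][1]:  # shares at least one position
            contigs[-1] = (contigs[-1][0], max(contigs[-1][1], end))
        else:
            contigs.append((start, end))
    return contigs

def lowest_probability_subsequence(seq, residue, freq, contig, min_window):
    """MINIMIZE sketch: search lengths from the contig length down to min_window.

    Returns (p_value, start, end) for the single lowest-P-value subsequence
    (a simplification: only one subsequence per contig is reported here).
    """
    c_start, c_end = contig
    best = None
    for length in range(c_end - c_start, min_window - 1, -1):
        for start in range(c_start, c_end - length + 1):
            count = seq[start:start + length].count(residue)
            p = binomial_tail(length, count, freq)
            if best is None or p < best[0]:
                best = (p, start, start + length)
    return best

# Toy sequence (hypothetical): a run of ten Q residues flanked by mixed residues.
seq = "ACDEFGHIKL" + "Q" * 10 + "MNPRSTVWYA"
contigs = quick_scan(seq, "Q", freq=0.05, max_window=10, min_window=5)
lps = lowest_probability_subsequence(seq, "Q", 0.05, contigs[0], min_window=5)
print(contigs, lps)
```

In this toy case the permissive scan yields one contig that extends beyond the Q run, and the minimization step narrows it back to the run itself, which is the relationship between the two stages that the figure depicts.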
