Figure 6 | BMC Bioinformatics

Figure 6

From: The distance-profile representation and its application to detection of distantly related protein families

Distribution of pvalue-distances between feature vectors P_pvalue with thresholding (τ = 3.5). Thresholding has a complex effect on the distribution of the PD measure and makes its significance level difficult to derive. Right: The complete distribution, plotted on a log scale. The linear trend on the log scale suggests an exponential decay. Left: Zoom-in on the range [0,40]. The distribution is multi-modal because a different number of elements can contribute to the sum in Equation (4). The smallest nonzero PD measures start at approximately 4.426, which corresponds to a match of two feature vectors P_pvalue at one entry where the z-score is equal to 3.5, i.e. a pvalue of 0.0120 or -log pvalue = 4.426. The distribution decreases rapidly and increases again at PD ≈ 9, corresponding to a match of two feature vectors at two distinct entries with z-score ≈ 3.5. The pattern repeats periodically as the number of significant entries common to both feature vectors increases.
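
The periodic pattern described in the caption can be illustrated with a little arithmetic. The sketch below is not the authors' code and does not reproduce Equation (4); it assumes only what the caption states, namely that one matched entry at the threshold contributes about 4.426 to the PD measure, and it further assumes that contributions from distinct matched entries add. The function name `mode_onsets` is hypothetical.

```python
# Per-entry contribution at the threshold (z-score = 3.5), as quoted in the caption.
MIN_CONTRIBUTION = 4.426


def mode_onsets(max_pd, step=MIN_CONTRIBUTION):
    """Return (k, k * step) pairs for every k with k * step <= max_pd.

    Under the additivity assumption, k * step is the smallest PD value
    reachable when k entries are significant in both feature vectors,
    i.e. roughly where the k-th mode of the distribution begins.
    """
    onsets = []
    k = 1
    while k * step <= max_pd:
        onsets.append((k, round(k * step, 3)))
        k += 1
    return onsets


if __name__ == "__main__":
    # Onsets within the zoomed-in range [0, 40] of the left panel.
    for k, pd_value in mode_onsets(40.0):
        print(f"{k} matched entries: PD >= {pd_value}")
```

Under these assumptions the second onset falls at 2 × 4.426 = 8.852, consistent with the rise the caption reports at PD ≈ 9.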
