Fig. 1 | BMC Bioinformatics

Fig. 1

From: Base-pair ambiguity and the kinetics of RNA folding


Unbound or Bound? ROC performance of classifiers based on thresholding the T-S ambiguity index. Small values of d_{T-S}(p,s) are taken as evidence that a molecule belongs to the unbound group rather than the bound group. In the left panel, the classifier uses the comparative secondary structure for s to compute the ambiguity index. In the right panel, the classifier uses the MFE structure instead. AUC: Area Under Curve (see text for interpretation). Additionally, for each of the two experiments, a p-value was calculated based only on the signs of the individual ambiguity indexes, under the null hypothesis that positive indexes are distributed randomly among molecules in all seven RNA families. Under the alternative, positive indexes are more typically found among the unbound families than among the bound families. Under the null hypothesis the test statistic is hypergeometric (see Eq 14). Left panel: p = 1.2×10^−34. Right panel: p = 0.02. In considering these p-values, it is worth re-emphasizing the points made about the interpretation of p-values in the paragraph following Eq 14. The right panel illustrates the point: the ambiguity index based on the MFE secondary structure “significantly distinguishes the two categories (p = 0.02)” but clearly has no utility for classification. (These ROC curves and those in Fig. 2 were lightly smoothed by the method known as “Locally Weighted Scatterplot Smoothing,” for example with the Python command Y=lowess(Y, X, 0.1, return_sorted=False) from statsmodels.nonparametric.smoothers_lowess.)
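The two quantities reported in this caption, an AUC obtained by thresholding a score and a hypergeometric tail probability, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names (roc_points, auc, hypergeom_upper_tail), the toy scores and the toy counts are hypothetical, and the tail probability is the generic upper tail of a hypergeometric distribution, which may differ in detail from Eq 14 of the article. Only the Python standard library is used.

```python
from math import comb


def roc_points(scores, is_unbound):
    """ROC points for the rule 'call a molecule unbound when score <= t'.

    Returns (false positive rate, true positive rate) pairs, one per
    distinct threshold t, in order of increasing t, starting at (0, 0).
    """
    n_pos = sum(is_unbound)              # number of unbound molecules
    n_neg = len(is_unbound) - n_pos      # number of bound molecules
    points = [(0.0, 0.0)]
    for t in sorted(set(scores)):
        tp = sum(1 for s, u in zip(scores, is_unbound) if u and s <= t)
        fp = sum(1 for s, u in zip(scores, is_unbound) if not u and s <= t)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) when X counts marked items among n draws, made without
    replacement from a population of N items of which K are marked."""
    favorable = sum(comb(K, i) * comb(N - K, n - i)
                    for i in range(k, min(K, n) + 1))
    return favorable / comb(N, n)


# Hypothetical toy scores (not data from the article): six molecules.
toy_scores = [-0.3, -0.2, -0.1, 0.1, 0.2, 0.4]
toy_unbound = [True, True, False, True, False, False]
toy_auc = auc(roc_points(toy_scores, toy_unbound))

# Hypothetical toy counts: 10 molecules, 5 of them unbound, 4 positive
# indexes in total, 3 of which fall on unbound molecules.
toy_p = hypergeom_upper_tail(N=10, K=5, n=4, k=3)
```

In this toy example the AUC equals the fraction of (unbound, bound) pairs in which the unbound molecule has the smaller score, which is the usual reading of an AUC. The sign-based test ignores the magnitudes of the indexes entirely, so a small p-value from it does not by itself imply a useful classifier.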
