
Table 7 The effect of additional similar sequences in a family on the performance of AMPS applied to the Master Data Set. Clustering was performed on the Normalised Alignment Score (NAS) instead of SD for efficiency with large alignments. Acc SCR (Master Data Set): accuracy for AMPS clustered on NAS for the Master Data Set. Acc SCR (Extended Data Set): accuracy for alignments on the data set with additional sequences. p: significance from the Wilcoxon signed-rank paired test.

From: OXBench: A benchmark for evaluation of protein multiple sequence alignment accuracy

| PID Average (1) | Number of Families (2) | Acc SCR, Master Data Set (3) | Acc SCR, Extended Data Set (4) | Difference in Accuracy (4 − 3) | p |
|---|---|---|---|---|---|
| 0–10 | 21 | 21.7 | 35.3 | 13.6 | 0.00947 |
| 10–20 | 57 | 59.4 | 63.3 | 3.9 | 0.0719 |
| 20–30 | 64 | 81.2 | 82.6 | 1.4 | 0.283 |
| 30–50 | 175 | 92.2 | 92.2 | 0.0 | 0.899 |
| 50–100 | 355 | 98.9 | 98.8 | -0.1 | 0.00255 |
| Total | 672 | 89.7 | 90.5 | 0.8 | 0.238 |
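
As a quick consistency check on the table above, the following minimal Python sketch recomputes the difference column (Extended minus Master accuracy) and the family count from the transcribed rows. It also tests one possible reading of the Total row: that its accuracies are means of the per-bin accuracies weighted by the number of families. That reading is an assumption inferred from the numbers, not something the caption states. The names `ROWS`, `differences`, and `weighted_mean` are hypothetical helpers introduced only for this illustration.

```python
# Rows transcribed from Table 7:
# (PID bin, number of families, Acc SCR Master, Acc SCR Extended, reported difference)
ROWS = [
    ("0-10", 21, 21.7, 35.3, 13.6),
    ("10-20", 57, 59.4, 63.3, 3.9),
    ("20-30", 64, 81.2, 82.6, 1.4),
    ("30-50", 175, 92.2, 92.2, 0.0),
    ("50-100", 355, 98.9, 98.8, -0.1),
]
# Reported Total row: (families, Acc SCR Master, Acc SCR Extended)
TOTAL = (672, 89.7, 90.5)


def differences(rows):
    """Extended minus Master accuracy for each bin, rounded to one decimal."""
    return [round(ext - master, 1) for _, _, master, ext, _ in rows]


def weighted_mean(rows, column):
    """Mean of the given column weighted by families per bin.

    Assumption: this is one plausible way the Total row could have been
    derived; the source does not say how it was computed.
    """
    families = sum(row[1] for row in rows)
    return round(sum(row[1] * row[column] for row in rows) / families, 1)


if __name__ == "__main__":
    # Compare recomputed values against the values reported in the table.
    print("differences match:", differences(ROWS) == [row[4] for row in ROWS])
    print("family count matches:", sum(row[1] for row in ROWS) == TOTAL[0])
    print("weighted Master mean:", weighted_mean(ROWS, 2), "reported:", TOTAL[1])
    print("weighted Extended mean:", weighted_mean(ROWS, 3), "reported:", TOTAL[2])
```

The p values cannot be recomputed this way, because the Wilcoxon test needs the per-family accuracy pairs, which the table does not list.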