
Table 7 The effect of additional similar sequences in a family on the performance of AMPS applied to the Master Data Set. Clustering was performed on Normalised Alignment Score (NAS) instead of SD for efficiency with large alignments. Acc SCR (Master Data Set): accuracy for AMPS clustered on NAS for the Master Data Set. Acc SCR (Extended Data Set): accuracy for alignments on the data set with additional sequences. p: significance from the Wilcoxon signed-rank paired test.

From: OXBench: A benchmark for evaluation of protein multiple sequence alignment accuracy

| PID Average (1) | Number of Families (2) | Acc SCR, Master Data Set (3) | Acc SCR, Extended Data Set (4) | Difference in Accuracy (4–3) | p |
|---|---|---|---|---|---|
| 0–10 | 21 | 21.7 | 35.3 | 13.6 | 0.00947 |
| 10–20 | 57 | 59.4 | 63.3 | 3.9 | 0.0719 |
| 20–30 | 64 | 81.2 | 82.6 | 1.4 | 0.283 |
| 30–50 | 175 | 92.2 | 92.2 | 0.0 | 0.899 |
| 50–100 | 355 | 98.9 | 98.8 | -0.1 | 0.00255 |
| Total | 672 | 89.7 | 90.5 | 0.8 | 0.238 |
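
The arithmetic behind the Difference column and the Total row can be checked with a short sketch. The values below are transcribed from Table 7. The assumption that the Total accuracies are means weighted by the number of families is ours, not something the source states, and the helper name `weighted_mean` is hypothetical.

```python
# Rows transcribed from Table 7:
# (PID range, number of families, Acc SCR master, Acc SCR extended, reported difference)
rows = [
    ("0-10", 21, 21.7, 35.3, 13.6),
    ("10-20", 57, 59.4, 63.3, 3.9),
    ("20-30", 64, 81.2, 82.6, 1.4),
    ("30-50", 175, 92.2, 92.2, 0.0),
    ("50-100", 355, 98.9, 98.8, -0.1),
]


def weighted_mean(values, weights):
    """Mean of `values`, each weighted by the matching entry in `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


families = [r[1] for r in rows]

# Difference in accuracy: column 4 minus column 3, rounded to one decimal place.
diffs = [round(r[3] - r[2], 1) for r in rows]

# ASSUMPTION: the Total row is a family-weighted mean of the per-range accuracies.
total_master = round(weighted_mean([r[2] for r in rows], families), 1)
total_extended = round(weighted_mean([r[3] for r in rows], families), 1)

print("families:", sum(families))
print("differences:", diffs)
print("total (master, extended):", total_master, total_extended)
```

With the transcribed values, the recomputed differences match the reported column, and the family-weighted means agree with the Total row to one decimal place, which is consistent with the weighting assumption but does not prove it. The p values cannot be recomputed from this summary, since the Wilcoxon test requires the per-family accuracies.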