
Table 2 Similarities between sequences in the three training sets.

From: VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines

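| Model type | Number of clusters | Minimum cluster size | Maximum cluster size | Number of singletons | Average cluster size | Cluster distribution (a) |
| --- | --- | --- | --- | --- | --- | --- |
| Bacterial | 84 | 1 | 4 | 74 | 1.19 | 6, 2, 2 |
| Viral | 87 | 1 | 5 | 78 | 1.15 | 7, 1, 0, 1 |
| Tumour | 76 | 1 | 7 | 66 | 1.32 | 4, 2, 2, 1, 0, 1 |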

  1. For a given cut-off, a perfectly diverse set of sequences will have a number of clusters equal to the number of sequences, minimum and maximum cluster sizes of one, and an average cluster size of one.
  2. a: For non-singleton clusters (two or more members). The numbers of clusters are listed in order of ascending cluster size.
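
The columns of Table 2 are linked by simple arithmetic, so the cluster distribution can be checked against the other columns. The following is a minimal Python sketch of that check, not part of the original article. It assumes that the first entry of each cluster distribution counts clusters of two members, the second counts clusters of three, and so on. The names `summarise` and `TABLE_2` are hypothetical and introduced only for illustration.

```python
# Values transcribed from Table 2:
# (clusters, max cluster size, singletons, average cluster size, distribution)
TABLE_2 = {
    "Bacterial": (84, 4, 74, 1.19, [6, 2, 2]),
    "Viral": (87, 5, 78, 1.15, [7, 1, 0, 1]),
    "Tumour": (76, 7, 66, 1.32, [4, 2, 2, 1, 0, 1]),
}


def summarise(singletons, distribution):
    """Recompute the summary columns from singletons and the distribution.

    Assumption (not stated explicitly in the table): distribution[i] is the
    number of clusters containing i + 2 sequences.
    """
    # Map cluster size -> number of clusters of that size.
    counts = {1: singletons}
    for offset, n in enumerate(distribution):
        counts[offset + 2] = n

    n_clusters = sum(counts.values())
    n_sequences = sum(size * n for size, n in counts.items())
    max_size = max(size for size, n in counts.items() if n > 0)
    average = round(n_sequences / n_clusters, 2)
    return n_clusters, n_sequences, max_size, average


for model, (clusters, max_size, singletons, average, dist) in TABLE_2.items():
    n_clusters, n_sequences, biggest, mean = summarise(singletons, dist)
    # Each recomputed value should match the corresponding table column.
    assert (n_clusters, biggest, mean) == (clusters, max_size, average), model
    print(model, "sequences:", n_sequences, "clusters:", n_clusters)
```

Under that assumption the recomputed number of clusters, maximum cluster size and average cluster size agree with the table for all three models, and each training set works out to 100 sequences.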