Table 2. Similarities between sequences in the three training sets.

From: VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines

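| Model type | Number of clusters | Minimum cluster size | Maximum cluster size | Number of singletons | Average cluster size | Cluster distribution (a) |
|------------|--------------------|----------------------|----------------------|----------------------|----------------------|--------------------------|
| Bacterial  | 84                 | 1                    | 4                    | 74                   | 1.19                 | 6,2,2                    |
| Viral      | 87                 | 1                    | 5                    | 78                   | 1.15                 | 7,1,0,1                  |
| Tumour     | 76                 | 1                    | 7                    | 66                   | 1.32                 | 4,2,2,1,0,1              |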
  1. For a given cut-off, a perfectly diverse set of sequences will have a number of clusters equal to the number of sequences, a minimum and maximum cluster size of one, and an average cluster size of one.
  2. a: Counts of non-singleton clusters (2 or more members), listed in order of ascending cluster size. For example, 6,2,2 means six clusters of size 2, two of size 3, and two of size 4.
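
The table's columns are linked by simple arithmetic, which the following minimal sketch makes explicit. It assumes the cluster distribution lists counts for cluster sizes 2, 3, 4, and so on, in that order; the function name `summarize_clusters` and the dictionary layout are hypothetical and not part of the VaxiJen server.

```python
def summarize_clusters(singletons, distribution):
    """Rebuild the summary columns from the singleton count and the
    cluster distribution.

    Assumption: distribution[i] is the number of clusters that contain
    (i + 2) sequences, i.e. the list starts at cluster size 2.
    """
    # Map cluster size -> number of clusters of that size.
    counts_by_size = {1: singletons}
    for offset, count in enumerate(distribution):
        counts_by_size[offset + 2] = count

    n_clusters = sum(counts_by_size.values())
    n_sequences = sum(size * count for size, count in counts_by_size.items())
    # Only sizes that actually occur define the minimum and maximum.
    occupied = [size for size, count in counts_by_size.items() if count > 0]

    return {
        "clusters": n_clusters,
        "sequences": n_sequences,
        "min_size": min(occupied),
        "max_size": max(occupied),
        "average_size": round(n_sequences / n_clusters, 2),
    }


# Singleton counts and distributions copied from the table above.
table = {
    "Bacterial": (74, [6, 2, 2]),
    "Viral": (78, [7, 1, 0, 1]),
    "Tumour": (66, [4, 2, 2, 1, 0, 1]),
}

for model, (singletons, distribution) in table.items():
    print(model, summarize_clusters(singletons, distribution))
```

Under this reading, the recomputed cluster counts, maximum sizes, and average sizes agree with the tabulated values, and each training set works out to 100 sequences.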