Table 1 Proportion of 40 k var sequences of different length with identical sequence hits (> 99% identity) against databases of different size

From: Varia: a tool for prediction, analysis and visualisation of variable genes

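| Hit length | 41 k database (%) | 92 k database (%) | 162 k database (%) | 205 k database (%) |
|---|---|---|---|---|
| 150 bp | 67.1 | 77.0 | 87.6 | 89.3 |
| 150 bp (Africa) | 58.9 | 70.2 | 83.2 | 85.4 |
| 150 bp (Asia) | 84.8 | 91.7 | 96.6 | 97.4 |
| 1 kb | 58.5 | 68.1 | 79.9 | 81.8 |
| 2 kb | 55.7 | 64.1 | 75.3 | 77.0 |
| 3 kb | 46.5 | 53.8 | 63.3 | 64.7 |
| full hits (80%) | 30.6 | 41.8 | 53.4 | 55.7 |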

  1. All sequences start at the LARSFADIG motif found in the N-terminal DBL domain (DBLα) of most var genes. Query sequences from African and Asian genes are shown separately for the 150 bp sequences.
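
The percentages in Table 1 are proportions of query sequences that have at least one hit above 99% identity in a given database. The following is a minimal sketch of how such a proportion could be computed from a list of hits. It is not taken from Varia itself: the `Hit` record, the function name `proportion_with_identical_hit`, and the example identifiers are hypothetical, and the hits are assumed to have already been restricted to the hit length of interest.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one alignment of a query against a database."""
    query_id: str    # identifier of the query var sequence
    identity: float  # percent identity of the alignment


def proportion_with_identical_hit(
    query_ids: Iterable[str],
    hits: Iterable[Hit],
    min_identity: float = 99.0,
) -> float:
    """Return the percentage of queries with at least one hit above min_identity."""
    queries = set(query_ids)
    if not queries:
        return 0.0
    # A query counts once, no matter how many qualifying hits it has.
    matched = {
        h.query_id
        for h in hits
        if h.query_id in queries and h.identity > min_identity
    }
    return 100.0 * len(matched) / len(queries)


# Illustrative data only; these are not values from the study.
example_queries = ["q1", "q2", "q3", "q4"]
example_hits = [
    Hit("q1", 99.5),
    Hit("q1", 90.0),
    Hit("q2", 99.0),   # exactly 99% does not pass the "> 99%" threshold
    Hit("q3", 100.0),
]
print(proportion_with_identical_hit(example_queries, example_hits))
```

Running the same calculation once per database and per hit length would give one cell of the table each time.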