Table 9 Practical results for different alphabets – Memory requirements

From: Optimal neighborhood indexing for protein similarity search

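| alphabet | α | L  | S_neighborhood                 | r    |
|----------|---|----|--------------------------------|------|
| Σ20      | 5 | 11 | 1.70 × 10^9 bits = 212 MBytes  | 5.58 |
| Σ16      | 4 | 12 | 1.52 × 10^9 bits = 190 MBytes  | 5.00 |
| Σ8       | 3 | 14 | 1.37 × 10^9 bits = 171 MBytes  | 4.50 |
| Σ4       | 2 | 19 | 1.27 × 10^9 bits = 159 MBytes  | 4.17 |
| Σ2       | 1 | 32 | 1.12 × 10^9 bits = 140 MBytes  | 3.67 |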

  1. Database index size for neighborhood indexing on different alphabets. The first three columns are the same as in Table 2; the other two columns refer to the experiment described in the section "Practical results". The index size is N × (log2 N + 2αL), as explained at the beginning of the paper. Here N = 12 700 507 and log2 N = 24 (rounded up to whole bits). The ratio r is the neighborhood index size divided by the index size for offset indexing, which here is S_offset = N × log2 N = 0.30 × 10^9 bits = 38 MBytes.
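
The sizes and ratios in the table can be re-derived from the formula in the footnote. The following is a minimal sketch, not the authors' code: the function and variable names are hypothetical, and it assumes that log2 N is rounded up to the next integer (which gives the stated value of 24). It reports bit counts and ratios only; the MByte figures in the table appear to be derived from the rounded bit counts, so a direct conversion may differ by one in the last digit.

```python
import math

# Values quoted in the table footnote.
N = 12_700_507
# Assumption: log2 N is rounded up to a whole number of bits (gives 24).
POINTER_BITS = math.ceil(math.log2(N))

# (alpha, L) for each alphabet, copied from the table.
ALPHABETS = {
    "Σ20": (5, 11),
    "Σ16": (4, 12),
    "Σ8": (3, 14),
    "Σ4": (2, 19),
    "Σ2": (1, 32),
}


def offset_index_bits(n: int, pointer_bits: int) -> int:
    """Offset indexing: S_offset = N * log2 N."""
    return n * pointer_bits


def neighborhood_index_bits(n: int, pointer_bits: int, alpha: int, length: int) -> int:
    """Neighborhood indexing: S_neighborhood = N * (log2 N + 2 * alpha * L)."""
    return n * (pointer_bits + 2 * alpha * length)


def table_rows():
    """Return (alphabet, size in bits, ratio r) for every alphabet in the table."""
    s_offset = offset_index_bits(N, POINTER_BITS)
    rows = []
    for name, (alpha, length) in ALPHABETS.items():
        bits = neighborhood_index_bits(N, POINTER_BITS, alpha, length)
        rows.append((name, bits, bits / s_offset))
    return rows


if __name__ == "__main__":
    for name, bits, ratio in table_rows():
        print(f"{name}: {bits / 1e9:.2f} * 10^9 bits, r = {ratio:.2f}")
```

Because N cancels in the ratio, r reduces to (log2 N + 2αL) / log2 N, for example 134 / 24 for Σ20.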