Number of ℓ-intervals for various reduced alphabets. Counts of ℓ-intervals by interval length, for ℓ ∈ [1, 20] and several reduced alphabets. The enhanced suffix array was built from sequences of the RCSB Protein Data Bank (PDB) (total sequence length 4,264,239 bytes). The reduced amino acid alphabets used are given in Figure 8. Note that interval lengths in the figures are limited to 5,000 to prevent distortion of the plots.
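As a rough illustration of how such counts could be obtained, the following minimal sketch maps a sequence onto a reduced alphabet, builds a suffix array and LCP table, and tallies ℓ-intervals by (ℓ, interval length). It rests on several assumptions: the grouping `TOY_GROUPS` is hypothetical and is not one of the alphabets of Figure 8, the interval length is taken to be the number of suffixes in the interval, the suffix array is built naively (suitable for toy inputs only, not for the full PDB data), and all function names are invented for this example.

```python
from collections import Counter

# Hypothetical grouping for illustration only; NOT an alphabet from Figure 8.
TOY_GROUPS = ["AG", "ST"]

def reduce_sequence(seq, groups):
    """Replace each residue by the first letter of its group (others unchanged)."""
    table = {aa: g[0] for g in groups for aa in g}
    return "".join(table.get(aa, aa) for aa in seq)

def suffix_array(s):
    """Naive construction by sorting suffixes; for small examples only."""
    return sorted(range(len(s)), key=lambda i: s[i:])

def lcp_table(s, sa):
    """lcp[k] = longest common prefix length of suffixes sa[k-1] and sa[k]."""
    lcp = [0] * len(sa)
    for k in range(1, len(sa)):
        a, b, l = sa[k - 1], sa[k], 0
        while a + l < len(s) and b + l < len(s) and s[a + l] == s[b + l]:
            l += 1
        lcp[k] = l
    return lcp

def lcp_intervals(lcp):
    """Yield (ell, left, right) for every lcp-interval with ell >= 1,
    using a stack-based bottom-up scan of the LCP table."""
    stack = [(0, 0)]  # (ell, left boundary); the root interval stays on the stack
    for i in range(1, len(lcp) + 1):
        cur = lcp[i] if i < len(lcp) else 0  # sentinel closes all open intervals
        lb = i - 1
        while cur < stack[-1][0]:
            ell, lb = stack.pop()
            yield ell, lb, i - 1
        if cur > stack[-1][0]:
            stack.append((cur, lb))

def count_intervals(seq, max_ell=20, max_size=5000):
    """Count ell-intervals keyed by (ell, interval length)."""
    sa = suffix_array(seq)
    counts = Counter()
    for ell, lb, rb in lcp_intervals(lcp_table(seq, sa)):
        size = rb - lb + 1  # assumed definition of interval length
        if 1 <= ell <= max_ell and size <= max_size:
            counts[(ell, size)] += 1
    return counts
```

For example, `count_intervals(reduce_sequence(seq, TOY_GROUPS))` would tally the intervals of a single sequence `seq` under the toy grouping; handling many concatenated sequences with separator symbols is left out of this sketch.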