Length distribution for the Sargasso Sea resource. Figure 2a shows the percentage of sequences in each of 18 length bins. Blue bars show the distribution for the SSea-nr database, yellow bars the corresponding distribution for the Curr-nr database, and orange bars the length distribution for sequences from all completely sequenced prokaryotic genomes. The Sargasso Sea sequences are far more concentrated at lengths between 50 and 300 residues than the sequences in the other two sets. Figure 2b uses the same bins, but splits the Sargasso Sea sequences into their eleven constituent parts. Most of the fragments are found in sections "eaa" to "eah". The section "eaa" contained only 21,000 sequences, compared with the 100,000 sequences in the other sections.
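
The percentages plotted in such a histogram can be computed as in the minimal Python sketch below. The bin edges are hypothetical, since the caption does not list the 18 bin boundaries used in the figure, and the function name `length_percentages` is likewise an illustration and not taken from the original analysis.

```python
from bisect import bisect_right

# Hypothetical bin edges in residues; the figure's actual 18 bins are not
# given in the caption, so these values are placeholders.
BIN_EDGES = [50, 100, 150, 200, 250, 300, 400, 500]


def length_percentages(lengths, edges=BIN_EDGES):
    """Return the percentage of sequences falling in each length bin.

    Bin 0 holds lengths below edges[0]; bin i holds lengths with
    edges[i-1] <= length < edges[i]; the last bin holds lengths
    greater than or equal to edges[-1].
    """
    counts = [0] * (len(edges) + 1)
    for n in lengths:
        # bisect_right gives the index of the bin this length belongs to
        counts[bisect_right(edges, n)] += 1
    total = len(lengths)
    if total == 0:
        return [0.0] * len(counts)
    return [100.0 * c / total for c in counts]


if __name__ == "__main__":
    # Toy sequence lengths, not data from either database.
    example_lengths = [10, 60, 75, 120, 900]
    for i, pct in enumerate(length_percentages(example_lengths)):
        print(f"bin {i}: {pct:.1f}%")
```

Applying the same function to each database, or to each section of the Sargasso Sea data, with identical bin edges gives distributions that can be compared directly, because percentages remove the effect of differing database sizes.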