Figure 1

A simplified view of the internal representation of the M5nr. Sequences are stored in a single FASTA file using md5 sequence identifiers. In addition, several tables are stored in an SQL database management system to allow rapid queries. The tables link the md5 identifiers to the IDs, functions, and organisms provided by the various data sources.
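
The layout described in this caption can be illustrated with a minimal Python sketch. It is not the M5nr implementation: the table name `annotation`, its column names, the helper functions, and the use of an in-memory SQLite database are all assumptions made for illustration, and the sequence string is hashed as given, without any normalization.

```python
import hashlib
import io
import sqlite3


def md5_id(sequence: str) -> str:
    """Return the hex md5 digest of a sequence, used here as its identifier."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


def build_store(records):
    """Build FASTA text and an annotation table from input records.

    Each record is a (source, source_id, func, organism, sequence) tuple.
    The schema below is hypothetical, chosen only to mirror the caption.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE annotation ("
        "md5 TEXT, source TEXT, source_id TEXT, func TEXT, organism TEXT)"
    )
    # An index on the md5 column keeps lookups by identifier fast.
    conn.execute("CREATE INDEX idx_annotation_md5 ON annotation (md5)")

    fasta = io.StringIO()
    seen = set()
    for source, source_id, func, organism, sequence in records:
        digest = md5_id(sequence)
        # Identical sequences share one digest, so each is written once.
        if digest not in seen:
            seen.add(digest)
            fasta.write(f">{digest}\n{sequence}\n")
        # Every source-specific annotation is kept as its own row.
        conn.execute(
            "INSERT INTO annotation VALUES (?, ?, ?, ?, ?)",
            (digest, source, source_id, func, organism),
        )
    conn.commit()
    return fasta.getvalue(), conn


def lookup(conn, digest):
    """Return all (source, source_id, func, organism) rows for one md5."""
    cursor = conn.execute(
        "SELECT source, source_id, func, organism FROM annotation "
        "WHERE md5 = ? ORDER BY source",
        (digest,),
    )
    return cursor.fetchall()
```

In this sketch, two sources that annotate the same sequence produce a single FASTA entry but two rows in the table, so `lookup(conn, md5_id(sequence))` returns every source-specific ID, function, and organism recorded for that sequence.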