Figure 1
From: prot4EST: Translating Expressed Sequence Tags from neglected genomes
The training set deficit for EST projects. Around 85% of the species represented in dbEST (>100 ESTs) have fewer than 100 complete CDS entries in the EMBL database. These species account for ~45% of all ESTs. Sixty-six species, with 246,263 dbEST sequences between them, have no full-length CDS. Source: dbEST and EMBL database (July 2004).