
Table 2 A brief summary of the sequence data underlying the emerencia web service at http://emerencia.math.chalmers.se as of May 2005. The threshold BLAST E-values for "good" and "poor" matches were arbitrarily set to 0.0 and 1e-100, respectively. Graphical illustrations showing the population of the database over time and additional aspects of emerencia are generated automatically on a monthly basis and are available at the above address.
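The two thresholds named in the caption can be read as a simple classification rule on the best BLAST E-value of each sequence. The following is a minimal sketch of that rule, not the emerencia implementation: the function name `classify_match` and the label "intermediate" for E-values between the two thresholds are assumptions made for illustration.

```python
# Thresholds as stated in the caption of Table 2.
GOOD_E_VALUE = 0.0     # "good" match: E-value equal to 0.0
POOR_E_VALUE = 1e-100  # "poor" match: E-value greater than 1e-100


def classify_match(e_value: float) -> str:
    """Classify a best-hit BLAST E-value as good, poor, or in between.

    The "intermediate" label is hypothetical; the table only names
    the "good" and "poor" categories.
    """
    if e_value < 0:
        raise ValueError("E-values cannot be negative")
    if e_value == GOOD_E_VALUE:
        return "good"
    if e_value > POOR_E_VALUE:
        return "poor"
    return "intermediate"


if __name__ == "__main__":
    for e in (0.0, 1e-150, 1e-100, 1e-50):
        print(e, classify_match(e))
```

Under this reading, an E-value of exactly 1e-100 is not counted as poor, because the table rows use a strict "greater than" comparison.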

From: Approaching the taxonomic affiliation of unidentified sequences in public databases – an example from the mycorrhizal fungi

NUMBER OF INSUFFICIENTLY IDENTIFIED SEQUENCES: 7528 (21 % of total)
NUMBER OF IDENTIFIED SEQUENCES: 28959 (79 % of total)
NUMBER OF INSUFFICIENTLY IDENTIFIED SEQUENCES WITH GOOD MATCHES (E-VALUE = 0.0): 4791 (64 % of the insufficiently identified sequences)
NUMBER OF INSUFFICIENTLY IDENTIFIED SEQUENCES WITH POOR MATCHES (E-VALUE > 1E-100): 1135 (15 % of the insufficiently identified sequences)
TOTAL NUMBER OF SEQUENCES LAST UPDATED BEFORE 1995-01-01: 180 (0.5 %)
TOTAL NUMBER OF SEQUENCES LAST UPDATED BEFORE 2000-01-01: 3651 (10 %)
TOTAL NUMBER OF SEQUENCES LAST UPDATED BEFORE 2005-01-01: 31858 (87 %)
NUMBER OF INSUFFICIENTLY IDENTIFIED SEQUENCES LAST UPDATED BEFORE 2000-01-01: 264 (3.5 % of the insufficiently identified sequences)
NUMBER OF INSUFFICIENTLY IDENTIFIED SEQUENCES LAST UPDATED BEFORE 2000-01-01 AND WITH POOR MATCHES (E-VALUE > 1E-100): 17 (0.2 % of the insufficiently identified sequences)
NUMBER OF IDENTIFIED SEQUENCES HAVING AT LEAST ONE INSUFFICIENTLY IDENTIFIED COUNTERPART AS IDENTIFIED BY BLAST: 2981 (10 % of the identified sequences)
NUMBER OF IDENTIFIED SEQUENCES WITHOUT INSUFFICIENTLY IDENTIFIED COUNTERPARTS: 25978 (90 % of the identified sequences)
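
The percentages in the table can be re-derived from the raw counts. The sketch below does this under one assumption that the table does not state explicitly: that the total number of sequences is the sum of the insufficiently identified and the identified sequences (7528 + 28959 = 36487). The helper name `share` is hypothetical.

```python
def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts copied from Table 2.
insufficient = 7528
identified = 28959
# Assumption: the total is the sum of the two categories above.
total = insufficient + identified

rows = [
    ("insufficiently identified, of total", insufficient, total),
    ("identified, of total", identified, total),
    ("good matches, of insufficiently identified", 4791, insufficient),
    ("poor matches, of insufficiently identified", 1135, insufficient),
    ("updated before 1995-01-01, of total", 180, total),
    ("updated before 2000-01-01, of total", 3651, total),
    ("updated before 2005-01-01, of total", 31858, total),
    ("insufficiently identified, updated before 2000-01-01", 264, insufficient),
    ("same, and with poor matches", 17, insufficient),
    ("identified with an insufficiently identified counterpart", 2981, identified),
    ("identified without such a counterpart", 25978, identified),
]

if __name__ == "__main__":
    for label, part, whole in rows:
        print(f"{label}: {part} ({share(part, whole):.1f} %)")
```

Rounded to the precision used in the table, these values agree with the published percentages, and the last two counts sum to the number of identified sequences (2981 + 25978 = 28959).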