From: Efficient computation of absent words in genomic sequences
Organism | Genome size | ⌊log10 \|G\|⌋ | log4 \|G\| | #unwords | Unword length |
---|---|---|---|---|---|
H. sapiens | ≈ 3.1 Gb | 9 | 15.8 | 104 | 11 |
M. musculus | ≈ 2.7 Gb | 9 | 15.7 | 192 | 11 |
D. melanogaster | ≈ 132 Mb | 8 | 13.5 | 104 | 11 |
C. elegans | ≈ 100 Mb | 8 | 13.3 | 2 | 10 |
N. crassa | ≈ 34 Mb | 7 | 12.5 | 2262 | 11 |
S. cerevisiae | ≈ 12 Mb | 7 | 11.8 | 4 | 9 |
S. aureus | ≈ 2.79 Mb | 6 | 10.7 | 248 | 8 |
T. kodakarensis | ≈ 2.08 Mb | 6 | 10.5 | 1 | 8 |
M. jannaschii | ≈ 1.66 Mb | 6 | 10.3 | 3 | 6 |
M. genitalium | ≈ 0.58 Mb | 5 | 9.6 | 5 | 6 |
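
The two logarithm columns can be recomputed from the approximate genome sizes, and the notion of an unword can be illustrated on a toy sequence. The following is a minimal sketch, not the efficient method the paper describes. The names `log_columns` and `shortest_absent_words` are hypothetical, the brute-force search is a swapped-in illustration that scans only the given strand (how the source treats the reverse strand is not shown in this excerpt), and the base-4 logarithm is assumed to be rounded to one decimal, which matches the values shown.

```python
import math
from itertools import product

# Approximate genome sizes in base pairs, copied from three rows of the table.
GENOME_SIZES = {
    "H. sapiens": 3.1e9,
    "S. cerevisiae": 12e6,
    "M. genitalium": 0.58e6,
}


def log_columns(size_bp):
    """Return (floor(log10 |G|), log4 |G| rounded to one decimal)."""
    return math.floor(math.log10(size_bp)), round(math.log(size_bp, 4), 1)


def shortest_absent_words(seq, alphabet="ACGT"):
    """Brute force: all shortest words over `alphabet` that never occur in `seq`.

    Assumption: only the given strand is scanned. This is exponential in the
    word length and is meant for toy inputs only.
    """
    k = 1
    while True:
        # Every substring of length k that occurs in seq.
        present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        absent = []
        for letters in product(alphabet, repeat=k):
            word = "".join(letters)
            if word not in present:
                absent.append(word)
        if absent:
            return absent
        k += 1


if __name__ == "__main__":
    for organism, size in GENOME_SIZES.items():
        print(organism, log_columns(size))
    # Toy example: every single base occurs in "ACGT", so the shortest
    # absent words have length 2.
    print(len(shortest_absent_words("ACGT")))
```

Since a sequence of length |G| contains at most |G| distinct words of any fixed length, some word of length k must be absent once 4^k exceeds |G|. In every row of the table the observed unword length is shorter than log4 |G|.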