From: An improved alignment-free model for dna sequence similarity metric
Dataset | Number of families | Total number of DNA sequences in the dataset | Average length of a DNA sequence in the family | Size of dataset (KB) |
---|---|---|---|---|
DS2 | 6 | 285 | 1307 | 396 |
DS3 | 6 | 310 | 1536 | 501 |
DS4 | 6 | 251 | 1075 | 291 |
HOG20 | 20 | 1542 | 1492 | 2488 |
HOG50 | 50 | 3327 | 1466 | 5285 |
HOG80 | 80 | 7305 | 1413 | 11207 |
HOG100 | 100 | 9648 | 1484 | 15501 |