Fig. 2 | BMC Bioinformatics

Fig. 2

From: CDKAM: a taxonomic classification tool using discriminative k-mers and approximate matching strategies

Diagram of reference database construction. Step 1: from the downloaded reference genomes and taxonomy information, CDKAM creates the mapping between sequence IDs and taxonomy IDs. Step 2: CDKAM collects k-mers and resolves k-mer collisions to obtain discriminative k-mers, as described in Algorithm 1. Step 3: CDKAM compresses the database. In this final step, the whole set of k-mers is divided into smaller groups. Then, for each group, CDKAM stores the number of k-mers that share the same group ID (saved in the Size file), the suffixes of the k-mers (saved in the Suffix file), and their taxonomy IDs (saved in the Taxonomy_ID file).
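The grouping in Step 3 can be illustrated with a minimal sketch. It is not CDKAM's actual implementation: it assumes that the group ID is a fixed-length prefix of each k-mer and that the three files can be modelled as in-memory Python structures. The names `compress_kmers`, `lookup`, and `prefix_len` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def compress_kmers(kmer_to_taxid, prefix_len):
    """Split k-mers into groups and return (sizes, suffixes, tax_ids).

    Assumption: the group ID is the first `prefix_len` characters of the k-mer.
    `sizes` stands in for the Size file, `suffixes` for the Suffix file,
    and `tax_ids` for the Taxonomy_ID file.
    """
    groups = defaultdict(list)
    for kmer, taxid in kmer_to_taxid.items():
        groups[kmer[:prefix_len]].append((kmer[prefix_len:], taxid))

    sizes = {}     # group ID -> number of k-mers in that group
    suffixes = []  # suffixes, concatenated in sorted group order
    tax_ids = []   # taxonomy IDs, parallel to `suffixes`
    for prefix in sorted(groups):
        members = sorted(groups[prefix])
        sizes[prefix] = len(members)
        for suffix, taxid in members:
            suffixes.append(suffix)
            tax_ids.append(taxid)
    return sizes, suffixes, tax_ids


def lookup(kmer, sizes, suffixes, tax_ids, prefix_len):
    """Return the taxonomy ID stored for `kmer`, or None if it is absent."""
    prefix, suffix = kmer[:prefix_len], kmer[prefix_len:]
    offset = 0
    # Skip over the groups that precede this prefix in sorted order.
    for group_id in sorted(sizes):
        if group_id == prefix:
            break
        offset += sizes[group_id]
    else:
        return None  # no group with this ID exists
    # Scan only the suffixes that belong to the matching group.
    for i in range(offset, offset + sizes[prefix]):
        if suffixes[i] == suffix:
            return tax_ids[i]
    return None


if __name__ == "__main__":
    # Toy 6-mers with illustrative taxonomy IDs.
    kmers = {"ACGTAC": 562, "ACGTTT": 1280, "TTGACA": 562}
    sizes, suffixes, tax_ids = compress_kmers(kmers, prefix_len=3)
    print(sizes, suffixes, tax_ids)
    print(lookup("ACGTTT", sizes, suffixes, tax_ids, prefix_len=3))
```

Under these assumptions, the shared prefix is stored once per group rather than once per k-mer, which is where the space saving would come from; the linear scans here are kept for readability and would be replaced by indexed offsets in a real tool.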
