Fig. 5 | BMC Bioinformatics

Fig. 5

From: MetageNN: a memory-efficient neural network taxonomic classifier robust to sequencing errors and missing genomes

MetageNN framework for pre-training on long sequences from genomes and its direct application to long-read data. a MetageNN samples long sequences from the extensive collection of available reference genomes. To cope with the different distribution of noisy long reads, MetageNN computes short k-mer profiles (6-mers), which are more robust to sequencing errors, and uses them as input to the MetageNN architecture (gray rectangles represent layers). Parameters in blue are used for the small database and parameters in green for the main database (see “Databases”). Activation functions and dropout are noted in text between layers. b Once MetageNN training is complete, its learnt features are expected to be more robust to sequencing noise and can therefore be transferred directly to long-read data (no fine-tuning needed). Long-read data are classified by a majority voting strategy.
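
The two operations named in the caption, a 6-mer profile as network input and a majority vote over predictions, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `kmer_profile` and `majority_vote`, the normalization to frequencies, the skipping of windows that contain non-ACGT characters, and the first-seen tie-breaking rule are all assumptions made for illustration. The caption does not state which predictions are voted on, so the sketch simply takes a list of predicted labels. It also does not merge reverse-complement k-mers.

```python
from collections import Counter
from itertools import product

K = 6  # k-mer length named in the caption
ALPHABET = "ACGT"
# Fixed ordering of all 4**6 = 4096 possible 6-mers (illustrative choice).
KMER_INDEX = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=K))}


def kmer_profile(seq):
    """Return a normalized 6-mer frequency vector for one sequence.

    Hypothetical helper: windows containing characters outside ACGT
    (for example N) are skipped, which is an assumption of this sketch.
    """
    counts = [0] * len(KMER_INDEX)
    seq = seq.upper()
    for i in range(len(seq) - K + 1):
        idx = KMER_INDEX.get(seq[i:i + K])
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    # Sequences with no valid 6-mer yield an all-zero vector.
    return [c / total for c in counts] if total else counts


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first.

    Hypothetical helper: the tie-breaking rule is an assumption.
    """
    if not labels:
        raise ValueError("no labels to vote on")
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    profile = kmer_profile("ACGTACGTAC")
    print(len(profile))                    # length of the input vector
    print(profile[KMER_INDEX["ACGTAC"]])   # frequency of one 6-mer
    print(majority_vote(["taxon_1", "taxon_2", "taxon_1"]))
```

A single substitution error changes at most six overlapping 6-mers, so most of the profile of a long sequence is unaffected, which is consistent with the caption's point that short k-mer profiles are more robust to sequencing errors.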