Fig. 2 | BMC Bioinformatics

Fig. 2

From: Improvements in viral gene annotation using large language models and soft alignments

The similarity matrix is traversed to remove spurious mutual matches. The soft alignment approach assumes that matches lie near one another and on a diagonal close to the diagonal representing the alignment. A In the first step, the matrix is traversed to identify and discard top-left to bottom-right diagonals that have fewer than five mutual matches and are positioned five insertions or deletions away from a diagonal with more than five mutual matches (cells in red). B In the second step, single gaps along the main diagonal that correspond to the same amino acid are classified as reciprocal matches (green cells), since such a gap indicates a false negative.
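The two filtering steps in the caption can be illustrated with a minimal Python sketch. This is not the authors' implementation. It is one possible reading of the caption, and every name in it (`filter_sparse_diagonals`, `fill_single_gaps`, `min_matches`, `min_offset`) is hypothetical. It assumes mutual matches are given as a set of `(i, j)` index pairs, that "five insertions or deletions away" means an offset of at least five diagonals, and that the "main diagonal" is the diagonal carrying the most matches.

```python
from collections import Counter


def filter_sparse_diagonals(matches, min_matches=5, min_offset=5):
    """Step A (assumed reading): drop matches on sparse diagonals that lie
    far from every well-supported diagonal."""
    # A diagonal is identified by the offset j - i of its cells.
    counts = Counter(j - i for i, j in matches)
    dense = [d for d, n in counts.items() if n > min_matches]
    kept = set()
    for i, j in matches:
        d = j - i
        sparse = counts[d] < min_matches
        # Assumption: with no dense diagonal at all, nothing is discarded.
        far = bool(dense) and min(abs(d - e) for e in dense) >= min_offset
        if not (sparse and far):
            kept.add((i, j))
    return kept


def fill_single_gaps(matches, seq_a, seq_b):
    """Step B (assumed reading): a single missing cell on the main diagonal,
    flanked by matches on both sides and pairing identical residues, is
    reclassified as a match."""
    if not matches:
        return set()
    # Assumption: the main diagonal is the one with the most matches.
    main = Counter(j - i for i, j in matches).most_common(1)[0][0]
    filled = set(matches)
    for i in range(1, len(seq_a) - 1):
        j = i + main
        if not (1 <= j < len(seq_b) - 1) or (i, j) in matches:
            continue
        flanked = (i - 1, j - 1) in matches and (i + 1, j + 1) in matches
        if flanked and seq_a[i] == seq_b[j]:
            filled.add((i, j))
    return filled


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    seq = "MKTAYIAKQR"
    toy = {(i, i) for i in range(len(seq)) if i != 4} | {(0, 8), (1, 9)}
    cleaned = fill_single_gaps(filter_sparse_diagonals(toy), seq, seq)
    print(sorted(cleaned))
```

Working on a set of index pairs rather than the full similarity matrix keeps the sketch short. The thresholds are exposed as parameters because the caption leaves the boundary cases (exactly five matches, exactly five diagonals away) open.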
