Fig. 2
From: PEPMatch: a tool to identify short peptide sequence matches in large sets of proteins

Mismatch search protocol. Given a query peptide of length 9 and a specified maximum of 2 mismatches, Eq. (1) gives k = 3. Because 9 is divisible by 3, the peptide splits evenly into non-overlapping 3-mers. Each 3-mer is searched in the preprocessed proteome using hash table lookups. DLH is found at index 1,414,500,458, and the neighboring indexes are then checked by Hamming distance against the corresponding locations in the preprocessed proteome. The left neighbor has 0 mismatches and the right neighbor has 2 mismatches. The total number of mismatches is therefore 2, which equals the threshold, so a match is found here.
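
The protocol described in this caption can be sketched in Python as follows. This is a minimal illustration, not PEPMatch's actual implementation. It assumes that Eq. (1) has the pigeonhole form k = floor(L / (m + 1)), which agrees with the caption's numbers (L = 9, m = 2, k = 3) but is not stated in the caption itself. The function names, the toy proteome string, and the toy query peptide are all hypothetical. Only the 3-mer DLH comes from the figure. For simplicity, the sketch verifies each seed hit by taking the Hamming distance over the whole peptide window rather than checking the neighboring k-mers one at a time.

```python
from collections import defaultdict


def choose_k(peptide_length, max_mismatches):
    """Assumed form of Eq. (1): with at most m mismatches, at least one of
    m + 1 non-overlapping pieces must match exactly (pigeonhole principle)."""
    k = peptide_length // (max_mismatches + 1)
    if k < 1:
        raise ValueError("peptide too short for this many mismatches")
    return k


def build_kmer_index(sequence, k):
    """Preprocess a sequence into a hash table: k-mer -> list of start indexes."""
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def mismatch_search(peptide, sequence, max_mismatches):
    """Return sorted (start_index, mismatches) pairs for every window of
    `sequence` within `max_mismatches` of `peptide`."""
    k = choose_k(len(peptide), max_mismatches)
    index = build_kmer_index(sequence, k)
    hits = set()
    # Non-overlapping seed k-mers at peptide offsets 0, k, 2k, ...
    for offset in range(0, len(peptide) - k + 1, k):
        seed = peptide[offset:offset + k]
        for pos in index.get(seed, ()):
            start = pos - offset  # where the full peptide would begin
            if start < 0 or start + len(peptide) > len(sequence):
                continue  # window would run off the sequence
            window = sequence[start:start + len(peptide)]
            distance = hamming(peptide, window)
            if distance <= max_mismatches:
                hits.add((start, distance))
    return sorted(hits)


if __name__ == "__main__":
    toy_proteome = "MKTAYLLDLHSAVGQW"  # made-up sequence, not real data
    toy_peptide = "YLLDLHSYL"          # made-up 9-mer containing DLH
    print(mismatch_search(toy_peptide, toy_proteome, 2))  # [(4, 2)]
```

In this toy case the seeds YLL and DLH both hit exactly and point to the same window starting at index 4, while the right-hand 3-mer SYL differs from the proteome's SAV at two positions. The total of 2 mismatches equals the threshold, mirroring the situation in the figure.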