K-mers produced by keeSeek, which must lie at a minimum required distance from their respective reference sequences, were validated with an external tool. We chose glsearch (version 36.3.5b) (Pearson, 2000), a global-local aligner that is part of the Fasta3 package and is based on the Needleman and Wunsch algorithm (Needleman and Wunsch, 1970). The glsearch command line used was the following:
glsearch36 <query> <reference> -f -100 -g -100 -n -b 1 -d 1 -E 10000 -z -1
Note that, to make the results from keeSeek and glsearch comparable, we heavily penalize both gap opening and gap extension (-f and -g parameters, each set to -100), so that gaps do not appear in the final output.
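As an illustration of what a gap-free global-local comparison measures, the following minimal Python sketch counts the fewest mismatches between a k-mer and any window of the same length in a reference sequence. It is not keeSeek's or glsearch's implementation: the function name `min_gapless_distance` is hypothetical, the distance is assumed to be a plain mismatch count, and only the strand given as input is examined.

```python
def min_gapless_distance(kmer: str, reference: str) -> int:
    """Return the fewest mismatches between `kmer` and any same-length
    window of `reference`, with no gaps allowed (hypothetical helper)."""
    k = len(kmer)
    if k == 0 or k > len(reference):
        raise ValueError("k-mer must be non-empty and no longer than the reference")

    best = k  # worst case: every position mismatches
    # Slide the k-mer along the reference: the whole k-mer is aligned
    # (global in the query) against one window (local in the reference).
    for start in range(len(reference) - k + 1):
        window = reference[start:start + k]
        mismatches = sum(a != b for a, b in zip(kmer, window))
        best = min(best, mismatches)
    return best
```

Under these assumptions, a k-mer would pass validation when the value returned for each reference sequence is at least the required minimum distance.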