As explained before, we allow the user to input a defined nucleotide composition for k-mer generation. This option has been implemented for two reasons:
keeSeek starts from the selected nucleotide composition and permutes its symbols to generate all possible anagrams.
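As an illustration only, the idea of expanding a composition into its anagrams can be sketched in a few lines of Python. This is not keeSeek's implementation: the function name `distinct_anagrams` and the dictionary input format are assumptions made for this example, and the brute-force approach is practical only for short words.

```python
from itertools import permutations


def distinct_anagrams(composition):
    """Return the sorted distinct anagrams for a composition such as {"A": 2, "C": 1}.

    Hypothetical helper for illustration; not part of keeSeek.
    """
    # Build one seed word containing each symbol the requested number of times.
    seed = "".join(symbol * count for symbol, count in sorted(composition.items()))
    # permutations() treats repeated symbols as distinct positions,
    # so a set is used to drop the equivalent words.
    return sorted({"".join(p) for p in permutations(seed)})


print(distinct_anagrams({"A": 2, "C": 1}))  # ['AAC', 'ACA', 'CAA']
```

Enumerating every permutation and then discarding duplicates wastes work when symbols repeat, which is why a real generator would skip equivalent words directly.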
The number of anagrams of a sequence of length N made of N different symbols is the factorial of N (N!). DNA strings, however, contain only four symbols, so permutations that merely swap identical symbols produce equivalent words. If we skip these equivalent words, the number of distinct anagrams of a word of length N is N! / (S1! * S2! * S3! * S4!), where S1, S2, S3, and S4 are the numbers of occurrences of each of the four symbols in the string and S1 + S2 + S3 + S4 = N.
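A minimal sketch of this count, assuming the standard multinomial coefficient N! / (S1! * S2! * S3! * S4!) for a word with S1, S2, S3, and S4 occurrences of the four nucleotides. The function name `anagram_count` is hypothetical and chosen only for this example.

```python
from math import factorial


def anagram_count(s1, s2, s3, s4):
    """Number of distinct anagrams of a word with the given symbol counts.

    Hypothetical helper for illustration; not part of keeSeek.
    """
    n = s1 + s2 + s3 + s4  # total word length N
    # Dividing by each S_i! removes permutations that only swap identical symbols.
    return factorial(n) // (
        factorial(s1) * factorial(s2) * factorial(s3) * factorial(s4)
    )


print(anagram_count(1, 1, 1, 1))  # 24, i.e. 4! when all symbols differ
print(anagram_count(2, 2, 1, 1))  # 180
print(anagram_count(5, 5, 5, 5))  # 11732745024
```

For a fixed length, the count is largest when the four symbols are equally represented and drops to 1 when the word consists of a single repeated symbol.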
In the following graph we show the total number of anagrams resulting from different nucleotide compositions.