Explanation of the generation times

Reference Genome	Neverword	Min. num. of mismatches	Generation time (min : sec : msec)
Homo sapiens (GRCh37.64 ENSEMBL release)	TGATCGTAATCGTCGCGACA	4	4:36:241
	GCAACCGTTCACGTGTTAGA	3	1:08:592
	CTGAACGATTGATGCTCGAC	3	1:08:560
Arabidopsis thaliana (NC_003070.9, NC_003071.7, NC_003074.8, NC_003075.7, NC_003076.8)	TCGGTGTACGGTAATCACCA	4	0:01:987
	CTGGTCGAAGTACGCATATC	4	0:02:275
	CTTGAGTGCAACAGCGTATC	4	0:01:989
Mycobacterium tuberculosis (NC_000962.2)	GGATTGCCCCTTAGACTAGA	7	0:02:053
	CGTGTCTCCAATAAGTGAGC	5	0:00:084
	CAGCTTAGGCTATCATCGAG	5	0:00:175
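
To compare runs, the “min : sec : msec” values in the table can be converted to plain seconds. The following is a minimal Python sketch; the function name generation_time_seconds is introduced here for illustration only and is not part of keeSeek.

```python
def generation_time_seconds(stamp: str) -> float:
    """Convert a 'min:sec:msec' string such as '4:36:241' to seconds."""
    minutes, seconds, millis = (int(part) for part in stamp.split(":"))
    return minutes * 60 + seconds + millis / 1000.0


# Homo sapiens timings copied from the table above.
human_runs = ["4:36:241", "1:08:592", "1:08:560"]
human_seconds = [generation_time_seconds(run) for run in human_runs]
```

With this conversion, the first Homo sapiens run (4 mismatches) takes about 276 s, while the two runs with 3 mismatches take about 69 s each.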

Results were obtained by running keeSeek three times with the parameters “-a 5:5:5:5 -R 1 -K 3000 <genome.fasta>”. The first parameter specifies the nucleotide composition, namely 5 A, 5 C, 5 G, and 5 T, so that keeSeek produces ordered permutations of 20-mers. The second parameter tells keeSeek to create a random seed and use it to prepare a random mapping for reshuffling the permutations; in this way the order among permutations is preserved, but the variability of the generated codes increases. Finally, “-K 3000” limits the computation to the first 3000 results. Candidates that pass the filters are evaluated in blocks.
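
The candidate generation described above can be illustrated with a short Python sketch. It is not keeSeek's implementation. It assumes that “ordered” means lexicographic order and that the random mapping is a seeded permutation of the 20 positions applied identically to every candidate. The function names and the seed and limit arguments are hypothetical.

```python
import itertools
import math
import random


def ordered_permutations(counts):
    """Return a generator of all strings with the given letter counts,
    in lexicographic order."""
    remaining = dict(counts)
    length = sum(remaining.values())
    prefix = []

    def extend():
        if len(prefix) == length:
            yield "".join(prefix)
            return
        for letter in sorted(remaining):
            if remaining[letter] > 0:
                remaining[letter] -= 1
                prefix.append(letter)
                yield from extend()
                prefix.pop()
                remaining[letter] += 1

    return extend()


def reshuffled_candidates(counts, seed, limit):
    """Yield the first `limit` ordered permutations after applying one
    fixed, seeded position mapping (an assumed reading of the text)."""
    rng = random.Random(seed)
    mapping = list(range(sum(counts.values())))
    rng.shuffle(mapping)  # hypothetical mapping: a permutation of positions
    for word in itertools.islice(ordered_permutations(counts), limit):
        # The same mapping is applied to every word, so the generation
        # order is unchanged and distinct words stay distinct.
        yield "".join(word[i] for i in mapping)


composition = {"A": 5, "C": 5, "G": 5, "T": 5}  # corresponds to "-a 5:5:5:5"
candidates = list(reshuffled_candidates(composition, seed=1, limit=3000))

# Size of the full search space: 20! / (5!)^4 distinct 20-mers.
total_words = math.factorial(20) // math.factorial(5) ** 4
```

Under these assumptions every candidate keeps the 5:5:5:5 composition, and the limit of 3000 covers only a tiny part of the 11,732,745,024 possible 20-mers with that composition. Without reshuffling, the first 3000 candidates would all share the same long prefix starting with AAAAA.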