Computes a (lower) hash value for a shape applied to a sequence.
The hash value (a.k.a. code) of a q-gram is the lexicographical rank of this q-gram in the set of all possible q-grams, counted from 0. For example, the hash value of the Dna 3-gram AAG is 2, as exactly two 3-grams (AAA and AAC) have a smaller lexicographical rank. If hash is called with a gapped shape, the q-gram is the subsequence of text characters at the shape's non-gap positions, taken relative to the text iterator. For example, the shape 1101 at the beginning of the text ACGT corresponds to the 3-gram ACT.
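The rank definition above can be illustrated with a small standalone C++ sketch. It is not SeqAn's implementation and does not use the SeqAn API: the helpers `dnaRank` and `qgramRank` are hypothetical names, the shape is assumed to be given as a string of `1` (use the character) and `0` (gap), and the Dna order A < C < G < T is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Rank of a Dna character in the assumed alphabet order A < C < G < T.
inline int dnaRank(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return -1;  // not a Dna character
    }
}

// Hypothetical helper, not part of SeqAn: lexicographical rank of the q-gram
// selected by `shape` ('1' = use the character, '0' = gap), starting at text[pos].
inline std::uint64_t qgramRank(std::string const & text,
                               std::string const & shape,
                               std::size_t pos = 0)
{
    assert(pos + shape.size() <= text.size());  // the shape must fit into the text
    std::uint64_t h = 0;
    for (std::size_t i = 0; i < shape.size(); ++i)
    {
        if (shape[i] != '1')
            continue;                            // gap position: character is skipped
        int r = dnaRank(text[pos + i]);
        assert(r >= 0);                          // only A, C, G, T are supported
        h = h * 4 + static_cast<std::uint64_t>(r);  // base-4 Horner step
    }
    return h;
}
```

With these definitions, `qgramRank("AAG", "111")` yields 2, and `qgramRank("ACGT", "1101")` yields 7, which is the rank of the 3-gram ACT.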
Shape to be used for hashing.
Sequence iterator pointing to the first character of the shape.
The distance of
Hash value of the shape.
Code example that computes hash values of 4-grams with different shapes starting at the beginning of a text.
The resulting hexadecimal hash values of the three 4-mers GATT, GATC and GATA are:
0x8f, 0x8d, 0x8c
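The hash values of the three 4-mers listed above follow directly from the rank definition. The following standalone C++ sketch recomputes them. It is not the original SeqAn example: `dnaQGramRank` and `toHex` are hypothetical helper names, the q-grams are passed in already extracted (no shape is applied), and the Dna order A < C < G < T is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper, not part of SeqAn: base-4 rank of an ungapped Dna q-gram,
// assuming A = 0, C = 1, G = 2, T = 3.
inline std::uint64_t dnaQGramRank(std::string const & qgram)
{
    static std::string const alphabet = "ACGT";
    std::uint64_t h = 0;
    for (char c : qgram)
    {
        std::size_t r = alphabet.find(c);
        assert(r != std::string::npos);  // only A, C, G, T are supported
        h = (h << 2) | r;                // two bits per Dna character
    }
    return h;
}

// Formats a value as lowercase hexadecimal without a prefix.
inline std::string toHex(std::uint64_t v)
{
    std::ostringstream out;
    out << std::hex << v;
    return out.str();
}
```

For example, GATT has the base-4 digits 2, 0, 3, 3, so its rank is 2·64 + 0·16 + 3·4 + 3 = 143, and `toHex(dnaQGramRank("GATT"))` returns `"8f"`. GATC and GATA give `"8d"` and `"8c"` in the same way.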
SeqAn - Sequence Analysis Library - www.seqan.de