Interface Functions Detail
TValue hash(shape, it[, charsLeft]);
Computes a (lower) hash value for a shape applied to a sequence.
Parameters
shape
|
Shape to be used for hashing. Types: Shape |
it
|
Sequence iterator pointing to the first character of the shape. |
charsLeft
|
The distance from it to the end of the string. If charsLeft is smaller than the shape's span, the returned value is the smallest hash value of any shape that begins with these charsLeft characters. |
Returns
TValue |
Hash value of the shape (Metafunction: Value). |
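The charsLeft rule can be illustrated with a small stand-alone sketch. It is plain C++ and does not use SeqAn. The names dnaRank and lowerHash are hypothetical, and it assumes an ungapped shape over a 4-letter DNA alphabet ranked A=0, C=1, G=2, T=3, which is consistent with the example output shown for hashInit. Missing characters are padded with the smallest rank, which yields the smallest value among all q-grams sharing the available prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: rank of a DNA character (assumed order A < C < G < T).
inline unsigned dnaRank(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:
            assert(c == 'T');
            return 3;
    }
}

// Hypothetical stand-in for hash(shape, it, charsLeft) on an ungapped
// shape of span q; pos plays the role of the iterator.
inline std::uint64_t lowerHash(std::string const & text, std::size_t pos, std::size_t q)
{
    assert(pos <= text.size());
    std::size_t charsLeft = text.size() - pos;
    std::size_t used = charsLeft < q ? charsLeft : q;

    std::uint64_t h = 0;
    for (std::size_t i = 0; i < used; ++i)   // characters that exist
        h = h * 4 + dnaRank(text[pos + i]);
    for (std::size_t i = used; i < q; ++i)   // pad with rank 0 ('A')
        h *= 4;
    return h;
}
```

With q = 3, the two remaining characters "GA" at the end of "AAAACACAGTTTGA" give the same value as the full 3-gram "GAA".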
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
TValue hash2(shape, it, charsLeft);
Computes a unique hash value of a shape applied to a sequence, even if the sequence is shorter than the shape span.
Parameters
shape
|
Shape to be used for hashing. Types: Shape |
it
|
Sequence iterator pointing to the first character of the shape. |
charsLeft
|
The distance of it to the string end. |
Returns
TValue |
Hash value of the shape (Metafunction: Value). |
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
TValue hash2Next(shape, it);
Computes a unique hash value for the adjacent shape, even if it is shorter than q.
Parameters
shape
|
Shape to be used for hashing the q-gram. Types: Shape |
it
|
Sequence iterator pointing to the first character of the adjacent shape. |
Returns
TValue |
Hash value of the shape (Metafunction: Value). |
hash must be called beforehand with shape on the left-adjacent q-gram.
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
TValue hash2Upper(shape, it, charsLeft);
Computes an upper unique hash value of a shape applied to a sequence,
even if the sequence is shorter than the shape span.
Parameters
shape
|
Shape to be used for hashing. Types: Shape |
it
|
Sequence iterator pointing to the first character of the shape. |
charsLeft
|
The distance of it to the string end. |
Returns
TValue |
Upper hash value of the shape. It equals the maximal hash2 value of any shape beginning with the first min(charsLeft, length(shape)) characters, plus 1 (Metafunction: Value). |
This function in conjunction with hash2 is useful to search a
q-gram index for p-grams with p < q.
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
void hashInit(shape, it);
Preprocessing step of a pure hashNext loop.
Parameters
shape
|
Shape to be used for hashing. |
it
|
The iterator to use for initializing the shape. |
Overlapping q-grams can be hashed efficiently by calling hash on the first text position and hashNext on each succeeding, adjacent position.
One drawback of this pattern is that the for-loop cannot start at the first position, because that position must be handled separately before the loop, which makes the code more complicated.
As a remedy, hashInit initializes the Shape so that hashNext can be called directly on the first position.
Example
Two hash loop examples.
The first loop uses hash/hashNext, while the second uses hashInit/hashNext and can process all hashes within the loop.
#include <iostream>

#include <seqan/sequence.h>
#include <seqan/index.h>

using namespace seqan2;

int main()
{
    DnaString text = "AAAACACAGTTTGA";
    Shape<Dna, UngappedShape<3> > myShape;

    // loop using hash() and hashNext() starts at position 1
    std::cout << hash(myShape, begin(text)) << '\t';
    for (unsigned i = 1; i < length(text) - length(myShape) + 1; ++i)
        std::cout << hashNext(myShape, begin(text) + i) << '\t';
    std::cout << std::endl;

    // loop using hashInit() and hashNext() starts at position 0
    hashInit(myShape, begin(text));
    for (unsigned i = 0; i < length(text) - length(myShape) + 1; ++i)
        std::cout << hashNext(myShape, begin(text) + i) << '\t';
    std::cout << std::endl;

    return 0;
}
0 0 1 4 17 4 18 11 47 63 62 56
0 0 1 4 17 4 18 11 47 63 62 56
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
TValue hashNext(shape, it);
Computes the hash value for the adjacent shape.
Parameters
shape
|
Shape to be used for hashing. Types: Shape |
it
|
Sequence iterator pointing to the first character of the adjacent shape. |
Returns
TValue |
Hash value of the q-gram (Metafunction: Value). |
hash (or hashInit) must be called on the shape before the first call to hashNext.
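The idea of hashing an adjacent q-gram from the previous value can be sketched in plain C++ without SeqAn. This is one common rolling update for ungapped q-grams, not necessarily how the library implements hashNext. The names dnaRank and allQGramHashes are hypothetical, and the sketch assumes a DNA alphabet ranked A=0, C=1, G=2, T=3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: rank of a DNA character (assumed order A < C < G < T).
inline unsigned dnaRank(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:
            assert(c == 'T');
            return 3;
    }
}

// Hashes all overlapping ungapped q-grams of text.
// The first q-gram is hashed from scratch, every following one is derived
// from its left neighbour in constant time.
inline std::vector<std::uint64_t> allQGramHashes(std::string const & text, std::size_t q)
{
    std::vector<std::uint64_t> hashes;
    if (q == 0 || text.size() < q)
        return hashes;

    std::uint64_t leftFactor = 1;            // 4^(q-1), weight of the leftmost character
    for (std::size_t i = 1; i < q; ++i)
        leftFactor *= 4;

    std::uint64_t h = 0;                     // full hash of the first q-gram
    for (std::size_t i = 0; i < q; ++i)
        h = h * 4 + dnaRank(text[i]);
    hashes.push_back(h);

    for (std::size_t pos = 1; pos + q <= text.size(); ++pos)
    {
        // drop the character that left the window, shift, append the new one
        h = (h - dnaRank(text[pos - 1]) * leftFactor) * 4 + dnaRank(text[pos + q - 1]);
        hashes.push_back(h);
    }
    return hashes;
}
```

For the text "AAAACACAGTTTGA" and q = 3 this produces the same twelve values as the output listed in the hashInit example.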
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
TValue hashUpper(shape, it, charsLeft);
Computes an upper hash value for a shape applied to a sequence.
Parameters
shape
|
Shape to be used for hashing. Types: Shape |
it
|
Sequence iterator pointing to the first character of the shape. |
charsLeft
|
The distance of it to the string end. |
Returns
TValue |
Upper hash value of the shape. It equals the maximal hash value of any shape beginning with the first min(charsLeft, length(shape)) characters, plus 1 (Metafunction: Value). |
This function in conjunction with hash is useful to search a q-gram index for p-grams with p < q.
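The p-gram search described above amounts to a half-open range of q-gram hash values. The following stand-alone C++ sketch shows that arithmetic. It does not call SeqAn; dnaRank and qGramRange are hypothetical names, and an ungapped shape over a DNA alphabet ranked A=0, C=1, G=2, T=3 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical helper: rank of a DNA character (assumed order A < C < G < T).
inline unsigned dnaRank(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:
            assert(c == 'T');
            return 3;
    }
}

// Returns {lower, upper} for the p-gram starting at pos, where
// p = min(charsLeft, q). Every q-gram that begins with this p-gram
// has a hash value h with lower <= h < upper.
inline std::pair<std::uint64_t, std::uint64_t>
qGramRange(std::string const & text, std::size_t pos, std::size_t q)
{
    assert(pos <= text.size());
    std::size_t charsLeft = text.size() - pos;
    std::size_t p = charsLeft < q ? charsLeft : q;

    std::uint64_t prefix = 0;                // hash of the p available characters
    for (std::size_t i = 0; i < p; ++i)
        prefix = prefix * 4 + dnaRank(text[pos + i]);

    std::uint64_t lower = prefix;            // pad with the smallest rank
    std::uint64_t upper = prefix + 1;        // next prefix, then pad the same way
    for (std::size_t i = p; i < q; ++i)
    {
        lower *= 4;
        upper *= 4;
    }
    return std::make_pair(lower, upper);
}
```

For the 2-gram "GA" and q = 3 the range is [32, 36), which covers exactly GAA, GAC, GAG and GAT. A q-gram index sorted by hash value could then be searched for this interval.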
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
TSize length(shape);
Returns the number of elements of the shape (span).
Parameters
shape
|
Shape object for which the number of relevant positions is determined. |
Returns
TSize |
The number of elements of the shape (span) (Metafunction: Size). |
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
TSize resize(shape, length);
Resize a shape to a specified span.
Parameters
shape
|
Shape object to be resized. |
length
|
The new length (span) of the shape. |
Returns
TSize |
The new span (Metafunction: Size). |
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
void shapeToString(bitmap, shape);
Converts a given shape into a sequence of '1' (relevant position) and '0' (irrelevant position).
Parameters
bitmap
|
The resulting sequence object. Types: String |
shape
|
Shape object. Types: Shape |
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
bool stringToShape(shape, bitmap);
Takes a shape given as a string of '1' (relevant position) and '0'
(irrelevant position) and converts it into a Shape object.
Parameters
shape
|
Shape object that is manipulated. |
bitmap
|
A character string of '1' and '0' representing relevant and irrelevant positions (blanks),
respectively. This string must begin with a '1'. Trailing '0's are ignored. If shape
is a SimpleShape, at most one contiguous sequence of '1's is allowed. If
shape is a OneGappedShape, at most two contiguous sequences of '1's are
allowed (String of char). |
Returns
bool |
true if the conversion was successful. |
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
void unhash(result, hash, q);
Inverse of the hash function; for ungapped shapes.
Parameters
result
|
String to write the result to. Types: String. |
hash
|
The hash value previously computed with hash. |
q
|
The q-gram length. Types: unsigned |
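Inverting an ungapped q-gram hash is a base conversion. The sketch below is plain C++ and does not use SeqAn. The name unhashDna is hypothetical, it returns the string instead of writing to a result parameter, and it assumes a DNA alphabet ranked A=0, C=1, G=2, T=3.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for unhash(result, hash, q) on a DNA alphabet:
// reads the hash as a q-digit number in base 4, least significant digit last.
inline std::string unhashDna(std::uint64_t hash, std::size_t q)
{
    assert(q <= 32);                         // 4^32 values fit into 64 bits
    static char const alphabet[] = "ACGT";

    std::string result(q, 'A');
    for (std::size_t i = q; i > 0; --i)
    {
        result[i - 1] = alphabet[hash % 4];  // last digit is the rightmost character
        hash /= 4;
    }
    return result;
}
```

Applied to the values from the hashInit example, 17 maps back to "CAC" and 56 to "TGA".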
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
TValue value(shape);
Returns the current hash value of the Shape.
Parameters
shape
|
The Shape to query for its value. |
Returns
TValue |
The hash value of the shape. |
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
TSize weight(shape);
Number of relevant positions in a shape.
Parameters
shape
|
Shape object for which the number of relevant positions is determined. |
Returns
TSize |
Number of relevant positions (Metafunction: Size). |
For ungapped shapes the return value is the result of the length function. For gapped shapes this is the number of '1's.
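The relation between span, weight and the '1'/'0' bitmap used by stringToShape and shapeToString can be checked with a few lines of plain C++. This sketch does not use SeqAn and the function names are hypothetical. It only encodes the rules stated on this page: the bitmap must begin with '1', trailing '0's are ignored, and the weight is the number of '1's.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A bitmap is accepted if it starts with '1' and contains only '0' and '1'.
inline bool isValidBitmap(std::string const & bitmap)
{
    if (bitmap.empty() || bitmap[0] != '1')
        return false;
    for (char c : bitmap)
        if (c != '0' && c != '1')
            return false;
    return true;
}

// Weight: number of relevant positions, i.e. the number of '1's.
inline std::size_t bitmapWeight(std::string const & bitmap)
{
    assert(isValidBitmap(bitmap));
    std::size_t weight = 0;
    for (char c : bitmap)
        if (c == '1')
            ++weight;
    return weight;
}

// Span: length up to and including the last '1' (trailing '0's are ignored).
inline std::size_t bitmapSpan(std::string const & bitmap)
{
    assert(isValidBitmap(bitmap));
    return bitmap.find_last_of('1') + 1;
}
```

For the bitmap "1101100" the span is 5 and the weight is 4. For an ungapped bitmap such as "111" both values are 3, matching the statement that weight equals length for ungapped shapes.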
Data Races
If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.