Class
Shape
Stores hash value and shape for an ungapped or gapped q-gram.
Shape<TValue, TSpec>
Parameters
TValue
The Value type of the string the shape is applied to (e.g. Dna).
TSpec
The specializing type.
Default: SimpleShape, for ungapped q-grams.
Remarks
The ValueSize of Shape equals the ValueSize of TValue, i.e. the alphabet size.
To get the span or the weight of a shape, call length or weight, respectively.
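The difference between span and weight can be sketched without the library. The snippet below is a plain C++ illustration, not SeqAn code: it assumes a shape written as a bitmap of '1' (relevant) and '0' (irrelevant) positions, as produced by shapeToString, and the helper names shapeSpan and shapeWeight are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: the span is the total number of positions
// covered by the shape, gaps included.
std::size_t shapeSpan(const std::string& bitmap)
{
    assert(!bitmap.empty());  // assumption: a shape covers at least one position
    return bitmap.size();
}

// Hypothetical helper: the weight is the number of relevant ('1') positions.
std::size_t shapeWeight(const std::string& bitmap)
{
    return static_cast<std::size_t>(
        std::count(bitmap.begin(), bitmap.end(), '1'));
}
```

For the bitmap "1101", shapeSpan gives 4 and shapeWeight gives 3; for an ungapped shape both values coincide.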
Specializations
GappedShape: A fixed gapped shape.
GenericShape: A variable gapped shape.
OneGappedShape: A variable shape with one optional gap.
SimpleShape: A variable length ungapped shape (also called q-gram or k-mer).
UngappedShape: A fixed length ungapped shape (also called q-gram or k-mer).
Metafunctions
Host: Type of the object a given object depends on.
LENGTH: Number of elements in a fixed-size container.
Size: Type of an object that is suitable to hold size information.
Value: Type of the items in the container or behind an iterator.
ValueSize: Number of different values a value type object can have.
WEIGHT: Number of relevant positions in a shape.
Member Functions
Shape: Constructor
Functions
countOccurrences: Returns the number of occurrences of the representative substring or a q-gram in the index text.
countOccurrencesMultiple: Returns the number of occurrences of a q-gram for every sequence of a StringSet.
createCountArray: Builds an index on a StringSet storing how often a q-gram occurs in each sequence.
createQGramIndex: Builds a q-gram index on a sequence.
createQGramIndexDirOnly: Builds the directory of a q-gram index on a sequence.
createQGramIndexSAOnly: Builds the suffix array of a q-gram index on a sequence.
getOccurrence: Returns an occurrence of the representative substring or a q-gram in the index text.
getOccurrences: Returns all occurrences of the representative substring or a q-gram in the index text.
hash: Computes a (lower) hash value for a shape applied to a sequence.
hash2: Computes a unique hash value of a shape applied to a sequence, even if the sequence is shorter than the shape span.
hash2Next: Computes a unique hash value for the adjacent shape, even if it is shorter than q.
hash2Upper: Computes an upper unique hash value of a shape applied to a sequence, even if the sequence is shorter than the shape span.
hashInit: Preprocessing step of a pure hashNext loop.
hashNext: Computes the hash value for the adjacent shape.
hashUpper: Computes an upper hash value for a shape applied to a sequence.
indexShape: Shortcut for getFibre(.., QGramShape).
length: The number of items/characters.
range: Returns the suffix array interval borders of occurrences of the representative substring or a q-gram in the index text.
shapeToString: Converts a given shape into a sequence of '1' (relevant position) and '0' (irrelevant position).
value: Reference to the value.
weight: Number of relevant positions in a shape.
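The relationship between hash and hashNext in the list above can be pictured with a rolling hash over an ungapped q-gram. The following is a library-free C++ sketch under stated assumptions, not the SeqAn implementation: the q-gram is read as a base-4 number with the leftmost character as the most significant digit, and dnaRank, qgramHash and qgramHashNext are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: rank of a DNA character (A=0, C=1, G=2).
// Anything else is treated as T (3) in this sketch.
unsigned dnaRank(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3;
    }
}

// Hash of the ungapped q-gram starting at text[pos], computed from scratch.
std::uint64_t qgramHash(const std::string& text, std::size_t pos, std::size_t q)
{
    assert(pos + q <= text.size());
    std::uint64_t h = 0;
    for (std::size_t i = 0; i < q; ++i)
        h = h * 4 + dnaRank(text[pos + i]);
    return h;
}

// Hash of the q-gram at pos + 1, derived from the hash at pos:
// drop the leading digit, shift by one digit, append the new character.
std::uint64_t qgramHashNext(std::uint64_t prev, const std::string& text,
                            std::size_t pos, std::size_t q)
{
    assert(pos + q < text.size());
    std::uint64_t top = 1;  // 4^(q-1); recomputed here for brevity,
    for (std::size_t i = 1; i < q; ++i)  // a real implementation would cache it
        top *= 4;
    return (prev - dnaRank(text[pos]) * top) * 4 + dnaRank(text[pos + q]);
}
```

With text "ACGACGT" and q = 3, qgramHash at position 0 yields 6 ("ACG"), and qgramHashNext turns that into 24 ("CGA") without rescanning the window.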
Examples
The following code shows how to use a gapped shape to search for the pattern "ACxA" in a reference, where x stands for an arbitrary character. First we assign a form to the shape and then compute the corresponding hash value. For a given shape, the hash value uniquely identifies the characters at the relevant positions, so these characters can be recovered from the hash value.
File "shape.cpp"
#include <iostream>

#include <seqan/sequence.h>
#include <seqan/index.h>

using namespace seqan;

int main()
{
    DnaString genome = "ACGACGTGCAACGTACGACTAGCATCGGATCAGCAT";

    // shape with span 4 and weight 3: the third position is ignored
    Shape<Dna, OneGappedShape> myShape;
    stringToShape(myShape, "1101");

    // compute hash of a search pattern
    unsigned hashedPattern = hash(myShape, "ACGA");
    std::cout << "The hash is: " << hashedPattern << std::endl;

    // compute all overlapping hashes and compare with hash of pattern
    for (unsigned i = 0; i < length(genome) - length(myShape) + 1; ++i)
        if (hash(myShape, begin(genome) + i) == hashedPattern)
            std::cout << "Hit at position: " << i << std::endl;

    return 0;
}
The hash is: 4
Hit at position: 0
Hit at position: 14
Hit at position: 17
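The hash value 4 printed above can be reproduced by hand. The sketch below is plain C++ without SeqAn and rests on assumptions: only the relevant positions of the bitmap "1101" contribute, each character is a base-4 digit (A=0, C=1, G=2, T=3), and the leftmost relevant character is the most significant digit. The names dnaRank, gappedHash and gappedUnhash are hypothetical. The second function illustrates how the characters at the relevant positions can be recovered from a hash value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: rank of a DNA character (A=0, C=1, G=2).
// Anything else is treated as T (3) in this sketch.
unsigned dnaRank(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3;
    }
}

// Hash of the window text[pos .. pos + span) under a '1'/'0' bitmap;
// positions marked '0' are skipped.
std::uint64_t gappedHash(const std::string& bitmap, const std::string& text,
                         std::size_t pos = 0)
{
    assert(pos + bitmap.size() <= text.size());
    std::uint64_t h = 0;
    for (std::size_t i = 0; i < bitmap.size(); ++i)
        if (bitmap[i] == '1')
            h = h * 4 + dnaRank(text[pos + i]);
    return h;
}

// Inverse direction: rebuild the window from a hash value.
// Gap positions carry no information and are written as 'x'.
std::string gappedUnhash(const std::string& bitmap, std::uint64_t h)
{
    std::string out(bitmap.size(), 'x');
    for (std::size_t i = bitmap.size(); i-- > 0; )
    {
        if (bitmap[i] == '1')
        {
            out[i] = "ACGT"[h % 4];  // least significant digit is the last relevant position
            h /= 4;
        }
    }
    return out;
}
```

Under these assumptions, gappedHash("1101", "ACGA") evaluates to 0·16 + 1·4 + 0 = 4, and gappedUnhash("1101", 4) gives back "ACxA", the pattern searched for in the example.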
SeqAn - Sequence Analysis Library - www.seqan.de
 

Page built @2013/07/11 09:12:35