Tag
QGram Index Fibres
Tag to select a specific fibre (e.g. table, object, ...) of a q-gram Index.
Include Headers
seqan/index.h
Remarks
These tags can be used to get Fibres of a q-gram Index.
Tags
The original text the index should be based on.
The concatenation of all text sequences.
The suffix array. Remarks: Contains all occurrences of q-grams, such that the occurrences of a single q-gram are stored in a contiguous block (q-gram bucket).
q-grams exceeding the end of the text are ignored.
The beginning of each bucket can be determined by the q-gram directory (QGramDir). This fibre corresponds to a suffix array which is sorted by the first q-gram.
The directory/hash table. Remarks: For every possible q-gram hash value, the directory contains the start index of the corresponding q-gram bucket.
A q-gram bucket is a contiguous interval in the suffix array.
Maps q-gram hashes to buckets.
This fibre is used by the OpenAddressing index and stores all parameters of the open addressing hash function and hash value occupancy in the QGramDir fibre.
In contrast to OpenAddressing, IndexQGram uses a trivial 1-to-1 mapping from q-gram hash values to buckets.
For that index the fibre is of type Nothing.
The counts array. Remarks: Contains the number of occurrences of each q-gram per sequence, such that the numbers for the same q-gram are stored in a contiguous block (q-gram count bucket).
A bucket contains entries (seqNo,count) for the sequences with at least one occurrence of that q-gram. q-grams exceeding the end of the text are ignored.
The beginning of each count bucket can be determined by the q-gram counts directory.
The counts directory. Remarks: For every possible q-gram hash value, the counts directory contains the start index of the corresponding q-gram count bucket.
A q-gram count bucket is a contiguous interval in the counts array.
The shape the index is based on. Remarks: The q-gram index needs an underlying Shape. This shape can be gapped or ungapped.
The number of '1's (relevant positions) in the shape determines q. Dynamic shapes (SimpleShape, GenericShape, ...) must be initialized before the index can be used.
The union of suffix array and directory. Remarks: Most applications require a q-gram index consisting of both of these tables.
To create them efficiently in one pass, use this tag for indexRequire or indexCreate.
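
The relationship between the directory, the suffix array and the counts tables described above can be illustrated with a small standalone sketch. It does not use SeqAn and is not the library's implementation: it assumes a DNA alphabet, an ungapped shape of length q, and a plain base-4 hash. All names (ordValue, hashQGram, QGramTables, buildQGramTables, QGramCountTables, buildQGramCounts) are hypothetical and chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: maps A, C, G, T to 0..3 (any other character is treated as T).
static unsigned ordValue(char c) {
    switch (c) { case 'A': return 0; case 'C': return 1; case 'G': return 2; default: return 3; }
}

// Hash of the ungapped q-gram starting at text[pos], read as a base-4 number.
static std::size_t hashQGram(const std::string& text, std::size_t pos, unsigned q) {
    std::size_t h = 0;
    for (unsigned i = 0; i < q; ++i) h = h * 4 + ordValue(text[pos + i]);
    return h;
}

// Number of possible hash values, 4^q.
static std::size_t numHashes(unsigned q) {
    std::size_t n = 1;
    for (unsigned i = 0; i < q; ++i) n *= 4;
    return n;
}

struct QGramTables {
    std::vector<std::size_t> dir; // dir[h] = start of bucket h in sa; dir.back() == sa.size()
    std::vector<std::size_t> sa;  // text positions, grouped into one bucket per hash value
};

QGramTables buildQGramTables(const std::string& text, unsigned q) {
    assert(q > 0);
    QGramTables t;
    t.dir.assign(numHashes(q) + 1, 0);
    // 1. Count occurrences; q-grams exceeding the end of the text are skipped.
    for (std::size_t pos = 0; pos + q <= text.size(); ++pos)
        ++t.dir[hashQGram(text, pos, q) + 1];
    // 2. Prefix sums turn the counts into bucket start indices.
    for (std::size_t h = 1; h < t.dir.size(); ++h) t.dir[h] += t.dir[h - 1];
    // 3. Write each position into the next free slot of its bucket.
    std::vector<std::size_t> next(t.dir.begin(), t.dir.end() - 1);
    t.sa.resize(t.dir.back());
    for (std::size_t pos = 0; pos + q <= text.size(); ++pos)
        t.sa[next[hashQGram(text, pos, q)]++] = pos;
    return t;
}

struct QGramCountTables {
    std::vector<std::size_t> countsDir;                      // start of each count bucket
    std::vector<std::pair<std::size_t, std::size_t>> counts; // (seqNo, count) entries
};

QGramCountTables buildQGramCounts(const std::vector<std::string>& seqs, unsigned q) {
    assert(q > 0);
    std::vector<std::vector<std::pair<std::size_t, std::size_t>>> buckets(numHashes(q));
    for (std::size_t seqNo = 0; seqNo < seqs.size(); ++seqNo)
        for (std::size_t pos = 0; pos + q <= seqs[seqNo].size(); ++pos) {
            auto& b = buckets[hashQGram(seqs[seqNo], pos, q)];
            if (!b.empty() && b.back().first == seqNo) ++b.back().second;
            else b.push_back(std::make_pair(seqNo, std::size_t(1)));
        }
    // Flatten the per-hash buckets into one array plus a directory of start indices.
    QGramCountTables t;
    t.countsDir.assign(buckets.size() + 1, 0);
    for (std::size_t h = 0; h < buckets.size(); ++h) {
        t.countsDir[h] = t.counts.size();
        t.counts.insert(t.counts.end(), buckets[h].begin(), buckets[h].end());
    }
    t.countsDir[buckets.size()] = t.counts.size();
    return t;
}
```

In this sketch the occurrences of the q-gram with hash value h are the entries sa[dir[h]] up to, but not including, sa[dir[h + 1]]. The count bucket of h is the corresponding interval of counts, delimited by countsDir[h] and countsDir[h + 1]. Positions within a bucket appear in text order, which matches the remark that the suffix array is sorted by the first q-gram only.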
SeqAn - Sequence Analysis Library - www.seqan.de