Maximal Repeats
Given a sequence, a repeat is a substring that occurs at two or more different positions. A maximal repeat is a repeat that cannot be extended to the left or to the right to a longer repeat. The following example demonstrates how to iterate over all maximal repeats and output them.
#include <iostream>
#include <seqan/index.h>

using namespace seqan;
We begin with a @Class.String@ to store our sequence and initialize it with the text. Then we build an @Class.Index@ over this string.
int main()
{
    String<char> myString = "How many wood would a woodchuck chuck.";

    typedef Index<String<char> > TMyIndex;
    TMyIndex myIndex(myString);
To find maximal repeats, we use SeqAn's MaxRepeatsIterator and set the minimum repeat length to 3.
    typedef Iterator<TMyIndex, MaxRepeats>::Type TMaxRepeatIterator;
    TMaxRepeatIterator myRepeatIterator(myIndex, 3);

    while (!atEnd(myRepeatIterator))
    {
        Iterator<TMaxRepeatIterator>::Type myRepeatPair(myRepeatIterator);
        while (!atEnd(myRepeatPair))
        {
            std::cout << *myRepeatPair << ", ";
            ++myRepeatPair;
        }

        std::cout << repLength(myRepeatIterator) << " ";
        std::cout << "\t\"" << representative(myRepeatIterator) << '\"' << std::endl;

        ++myRepeatIterator;
    }

    return 0;
}
A repeat can be represented by its length and the positions at which it occurs. $myRepeatIterator$ iterates over all maximal repeat strings. Note that, in contrast to supermaximal repeats, not every pair of occurrences of a maximal repeat string is itself a maximal repeat. We therefore need a second iterator that visits only the maximal pairs of the current repeat string. The @Spec.MaxRepeats Iterator@ can itself be treated as a container and iterated over, which is what $myRepeatPair$ does.
weese@tanne:~/seqan$ cd ../build
weese@tanne:~/build$ make demo_index_maxrepeats
weese@tanne:~/build$ ./bin/demo_index_maxrepeats
< 8 , 21 >, 5   " wood"
< 21 , 13 >, < 8 , 13 >, 3      " wo"
< 26 , 32 >, 5  "chuck"
weese@tanne:~/seqan/demos$
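The difference between the occurrences of a repeat string and its maximal pairs can be checked without an index. The following is a minimal brute-force sketch in plain C++. It is not SeqAn's algorithm, and $MaximalPair$ and $bruteForceMaximalPairs$ are hypothetical names introduced here for illustration only. It assumes 0-based positions and treats the beginning and the end of the text as mismatches. A pair of positions is kept only if the preceding characters differ and the common match has been extended to the right as far as it goes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One maximal pair: two start positions and the length of the common match.
// (Hypothetical type for this sketch, not a SeqAn class.)
struct MaximalPair
{
    std::size_t first;
    std::size_t second;
    std::size_t length;
};

// Hypothetical helper: enumerate all maximal pairs with length >= minLength
// by comparing every pair of start positions. Intended for small inputs only;
// an index-based approach avoids this exhaustive comparison.
std::vector<MaximalPair> bruteForceMaximalPairs(const std::string & text,
                                                std::size_t minLength)
{
    assert(minLength > 0);

    std::vector<MaximalPair> pairs;
    const std::size_t n = text.size();

    for (std::size_t i = 0; i < n; ++i)
    {
        for (std::size_t j = i + 1; j < n; ++j)
        {
            // Left-maximal: the preceding characters must differ, or the
            // first occurrence must start at the beginning of the text.
            if (i > 0 && text[i - 1] == text[j - 1])
                continue;

            // Extend the match to the right as far as possible. Stopping at
            // the first mismatch (or the text end) makes it right-maximal.
            std::size_t len = 0;
            while (j + len < n && text[i + len] == text[j + len])
                ++len;

            if (len >= minLength)
                pairs.push_back({i, j, len});
        }
    }
    return pairs;
}
```

For the demo sentence and a minimum length of 3, this sketch yields four pairs: (8, 13) and (13, 21) with length 3, and (8, 21) and (26, 32) with length 5. The occurrences of " wo" at positions 8 and 21 do not form a pair of length 3 because they extend to " wood", which is why the output above lists only two pairs for " wo".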