Page Maximal Repeats
Given a sequence, a repeat is a substring that occurs at two or more different positions. A maximal repeat is a repeat that cannot be extended to the left or to the right to a longer repeat. The following example demonstrates how to iterate over all maximal repeats and output them.
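Before turning to the index-based demo, the definition of a repeat can be illustrated without SeqAn. The following is only a minimal sketch using the C++ standard library; the helper names $occurrences$ and $isRepeat$ are hypothetical and not part of SeqAn, and the linear scan is far slower than an index.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of SeqAn): collect every start position at
// which `pattern` occurs in `text`, including overlapping occurrences.
std::vector<std::size_t> occurrences(const std::string& text, const std::string& pattern)
{
    std::vector<std::size_t> positions;
    if (pattern.empty())
        return positions;
    for (std::size_t pos = text.find(pattern);
         pos != std::string::npos;
         pos = text.find(pattern, pos + 1))
    {
        positions.push_back(pos);
    }
    return positions;
}

// A substring is a repeat if it occurs at two or more different positions.
bool isRepeat(const std::string& text, const std::string& pattern)
{
    return occurrences(text, pattern).size() >= 2;
}
```

For the sequence "How many wood would a woodchuck chuck.", $occurrences$ reports positions 8, 13 and 21 for " wo", so " wo" is a repeat, whereas "many" occurs only once and is not.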
#include <iostream>
#include <seqan/index.h>
using namespace seqan;
We begin with a @Class.String@ to store our sequence.
Afterwards we initialize the string with the sequence and build an @Class.Index@ over it.
int main()
{
    String<char> myString = "How many wood would a woodchuck chuck.";
    typedef Index<String<char> > TMyIndex;
    TMyIndex myIndex(myString);
To find maximal repeats, we use SeqAn's MaxRepeatsIterator and set the minimum repeat length to 3.
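    typedef Iterator<TMyIndex, MaxRepeats>::Type TMaxRepeatIterator;
    // The second argument is the minimum repeat length.
    TMaxRepeatIterator myRepeatIterator(myIndex, 3);
    // Outer loop: one iteration per maximal repeat string.
    while (!atEnd(myRepeatIterator))
    {
        // Inner loop: all maximal pairs of occurrences of this repeat string.
        Iterator<TMaxRepeatIterator>::Type myRepeatPair(myRepeatIterator);
        while (!atEnd(myRepeatPair))
        {
            std::cout << *myRepeatPair << ", ";
            ++myRepeatPair;
        }
        // Print the repeat length followed by the repeat string itself.
        std::cout << repLength(myRepeatIterator) << " ";
        std::cout << "\t\"" << representative(myRepeatIterator) << '\"' << std::endl;
        ++myRepeatIterator;
    }
    return 0;
}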
A repeat can be represented by its length and the positions at which it occurs. $myRepeatIterator$ iterates over all repeat strings. Please note that, in contrast to supermaximal repeats, not all pairs of occurrences of a maximal repeat string are maximal repeats. We therefore need a second iterator to iterate over all maximal pairs of this repeat string. The @Spec.MaxRepeats Iterator@ can be seen as a container and can itself be iterated.
weese@tanne:~/seqan$ cd ../build
weese@tanne:~/build$ make demo_index_maxrepeats
weese@tanne:~/build$ ./bin/demo_index_maxrepeats
< 8 , 21 >, 5 " wood"
< 21 , 13 >, < 8 , 13 >, 3 " wo"
< 26 , 32 >, 5 "chuck"
weese@tanne:~/seqan/demos$
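The difference between a repeat string and its maximal pairs can be checked by hand against the demo output. The following is a minimal sketch that does not use SeqAn; the function $isMaximalPair$ is a hypothetical helper introduced here for illustration, and it assumes that a pair is maximal when the two occurrences are equal and differ (or hit a sequence boundary) in the characters immediately to their left and to their right.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of SeqAn): true if the substrings of length
// `len` starting at p1 and p2 are equal and the pair cannot be extended by
// one character to the left or to the right.
bool isMaximalPair(const std::string& text, std::size_t p1, std::size_t p2, std::size_t len)
{
    // Reject degenerate input and occurrences that run past the end.
    if (p1 == p2 || len == 0 || p1 + len > text.size() || p2 + len > text.size())
        return false;
    // Both occurrences must spell the same string.
    if (text.compare(p1, len, text, p2, len) != 0)
        return false;
    // Extendable to the left if both are preceded by the same character.
    bool leftExtendable = p1 > 0 && p2 > 0 && text[p1 - 1] == text[p2 - 1];
    // Extendable to the right if both are followed by the same character.
    bool rightExtendable = p1 + len < text.size() && p2 + len < text.size()
                           && text[p1 + len] == text[p2 + len];
    return !leftExtendable && !rightExtendable;
}
```

For the demo sequence, the repeat string " wo" occurs at positions 8, 13 and 21. Under the assumption above, $isMaximalPair$ accepts (21, 13) and (8, 13) with length 3, but rejects (8, 21), because both occurrences continue with "od" and the pair extends to the longer repeat " wood" of length 5. This matches the pairs listed in the demo output.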