Page Supermaximal Repeats

Example of how to use the SuperMaxRepeatsIterator.

Given a sequence, a repeat is a substring that occurs at two or more different positions. A supermaximal repeat is a repeat that is not part of any longer repeat. The following example demonstrates how to iterate over all supermaximal repeats and output them.
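To make the definition concrete, here is a minimal brute-force sketch that uses only the C++ standard library. It is not how SeqAn computes supermaximal repeats; the function name naiveSuperMaxRepeats is hypothetical and is introduced only for illustration. The sketch relies on one observation: if a repeat is part of a longer repeat, then extending it by one character to the left or to the right within that longer repeat also gives a repeat, so it suffices to discard every repeat that is a one-character-shorter prefix or suffix of another repeat.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, NOT part of SeqAn: enumerates all substrings of length
// >= minLength and keeps those that are supermaximal repeats.
// Returns a map from each repeat string to its sorted start positions.
// Intended for short inputs only, since it enumerates every substring.
std::map<std::string, std::vector<std::size_t>>
naiveSuperMaxRepeats(const std::string& text, std::size_t minLength)
{
    if (minLength == 0)
        minLength = 1;

    // Record the start positions of every substring of length >= minLength.
    std::map<std::string, std::vector<std::size_t>> occurrences;
    for (std::size_t pos = 0; pos < text.size(); ++pos)
        for (std::size_t len = minLength; pos + len <= text.size(); ++len)
            occurrences[text.substr(pos, len)].push_back(pos);

    // A repeat must occur at two or more positions.
    for (auto it = occurrences.begin(); it != occurrences.end();)
    {
        if (it->second.size() < 2)
            it = occurrences.erase(it);
        else
            ++it;
    }

    // Every repeat longer than minLength contains two shorter repeats:
    // itself without the first character and itself without the last one.
    // Those shorter repeats are part of a longer repeat, hence not supermaximal.
    std::set<std::string> covered;
    for (const auto& entry : occurrences)
    {
        const std::string& repeat = entry.first;
        if (repeat.size() > minLength)
        {
            covered.insert(repeat.substr(1));
            covered.insert(repeat.substr(0, repeat.size() - 1));
        }
    }
    for (const auto& shorter : covered)
        occurrences.erase(shorter);

    return occurrences;
}
```

Calling naiveSuperMaxRepeats("How many wood would a woodchuck chuck.", 3) leaves two entries: " wood" at positions 8 and 21, and "chuck" at positions 26 and 32. The SeqAn example that follows builds an index over the sequence instead of enumerating substrings.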

We start by including the required headers and using the namespace seqan2.

#include <iostream>
#include <seqan/index.h>

using namespace seqan2;

Afterwards, we initialize a string with the sequence and build an index over it.

int main()
{
    String<char> myString = "How many wood would a woodchuck chuck.";

    typedef Index<String<char> > TMyIndex;
    TMyIndex myIndex(myString);

To find supermaximal repeats, we use SeqAn's SuperMaxRepeats Iterator and set the minimum repeat length to 3.

    Iterator<TMyIndex, SuperMaxRepeats>::Type myRepeatIterator(myIndex, 3);

    while (!atEnd(myRepeatIterator))
    {
        // A repeat can be represented by its length and the positions at which it occurs.
        // Function getOccurrences returns an unordered sequence of these positions.
        // The length of this sequence, i.e. the repeat abundance, can be obtained
        // from countOccurrences.
        for (unsigned i = 0; i < countOccurrences(myRepeatIterator); ++i)
            std::cout << getOccurrences(myRepeatIterator)[i] << ", ";

        // Function repLength returns the length of the repeat string.
        std::cout << repLength(myRepeatIterator) << "   ";

        // The repeat string itself can be determined with function representative.
        std::cout << "\t\"" << representative(myRepeatIterator) << '\"' << std::endl;

        ++myRepeatIterator;
    }

    return 0;
}

The only supermaximal repeats of "How many wood would a woodchuck chuck." of length at least 3 are " wood" and "chuck". The repeat " wo" is maximal (see Maximal Repeats) but not supermaximal, as " wo" is part of the longer repeat " wood".

weese@tanne:~/seqan$ cd demos
weese@tanne:~/seqan/demos$ make index_supermaxrepeats
weese@tanne:~/seqan/demos$ ./index_supermaxrepeats
8, 21, 5        " wood"
26, 32, 5       "chuck"
weese@tanne:~/seqan/demos$
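The difference between maximal and supermaximal described for " wo" can be checked by listing occurrences directly. The following is a small sketch using only std::string::find; the helper findAll is hypothetical and is not a SeqAn function.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a SeqAn function): returns all start positions of
// `pattern` in `text`, including overlapping occurrences.
std::vector<std::size_t> findAll(const std::string& text, const std::string& pattern)
{
    std::vector<std::size_t> positions;
    if (pattern.empty())
        return positions;

    // Restart the search one character after each hit so overlaps are found too.
    for (std::size_t pos = text.find(pattern); pos != std::string::npos;
         pos = text.find(pattern, pos + 1))
        positions.push_back(pos);

    return positions;
}
```

For the example sentence, findAll(text, " wo") yields positions 8, 13 and 21, while findAll(text, " wood") yields 8 and 21. The occurrence of " wo" at position 13 (in " would") continues with a different character than the other two, so " wo" cannot be extended without losing an occurrence. Two of its occurrences nevertheless lie inside the longer repeat " wood", which is why it is not supermaximal.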