Example Program
Supermaximal Repeats
Example for using the SuperMaxRepeats Iterator.
Given a sequence, a repeat is a substring that occurs at at least two different positions.
A supermaximal repeat is a repeat that is not part of any longer repeat. The following
example demonstrates how to iterate over all supermaximal repeats and output them.
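The definition above can be checked directly with a brute-force sketch that uses only the C++ standard library. It is not the SeqAn demo and does not use SeqAn's index; the names countOccurrencesNaive, isRepeat and superMaximalRepeatsNaive are hypothetical helpers introduced here for illustration. The sketch collects every repeated substring of at least a given length and keeps those that are not contained in a longer one. It needs far more time than an index-based approach, so it is only suitable for short strings.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: count (possibly overlapping) occurrences of pattern in text.
std::size_t countOccurrencesNaive(const std::string& text, const std::string& pattern)
{
    assert(!pattern.empty());
    std::size_t count = 0;
    for (std::size_t pos = text.find(pattern); pos != std::string::npos;
         pos = text.find(pattern, pos + 1))
        ++count;
    return count;
}

// A substring is a repeat if it occurs at two or more different positions.
bool isRepeat(const std::string& text, const std::string& candidate)
{
    return countOccurrencesNaive(text, candidate) >= 2;
}

// Brute force: gather all repeats of length >= minLength, then keep
// only those that are not part of any longer repeat.
std::vector<std::string> superMaximalRepeatsNaive(const std::string& text,
                                                  std::size_t minLength)
{
    assert(minLength > 0);
    std::set<std::string> repeats;
    for (std::size_t begin = 0; begin < text.size(); ++begin)
    {
        for (std::size_t len = minLength; begin + len <= text.size(); ++len)
        {
            const std::string candidate = text.substr(begin, len);
            if (!isRepeat(text, candidate))
                break;  // a longer substring starting here cannot repeat either
            repeats.insert(candidate);
        }
    }

    std::vector<std::string> result;
    for (const std::string& rep : repeats)
    {
        bool partOfLongerRepeat = false;
        for (const std::string& other : repeats)
        {
            if (other.size() > rep.size() && other.find(rep) != std::string::npos)
            {
                partOfLongerRepeat = true;
                break;
            }
        }
        if (!partOfLongerRepeat)
            result.push_back(rep);
    }
    return result;  // lexicographic order, inherited from std::set
}
```

Calling superMaximalRepeatsNaive("How many wood would a woodchuck chuck.", 3) yields the two strings " wood" and "chuck".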
File "index_supermaxrepeats.cpp"
A tutorial about finding supermaximal repeats.
We begin with a String to store our sequence.
To find supermaximal repeats, we use SeqAn's SuperMaxRepeats Iterator
and set the minimum repeat length to 3.
A repeat can be represented by its length and the positions at which it occurs.
getOccurrences returns an unordered sequence of these positions.
The length of this sequence, i.e. the repeat abundance, can be obtained
from countOccurrences.
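To make the "length plus positions" representation concrete, here is a minimal standard-library sketch. It does not call SeqAn's getOccurrences or countOccurrences; occurrencePositions is a hypothetical helper that finds the positions by plain string search. Unlike the unordered sequence described above, it returns the positions in ascending order, and the size of the returned vector plays the role of the repeat abundance.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: all start positions (ascending, overlaps allowed)
// at which repeat occurs in text.
std::vector<std::size_t> occurrencePositions(const std::string& text,
                                             const std::string& repeat)
{
    assert(!repeat.empty());
    std::vector<std::size_t> positions;
    for (std::size_t pos = text.find(repeat); pos != std::string::npos;
         pos = text.find(repeat, pos + 1))
        positions.push_back(pos);
    return positions;
}
```

For the example sentence "How many wood would a woodchuck chuck.", occurrencePositions(text, " wood") returns the positions 8 and 21, so the abundance is 2 and the repeat length is 5.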
repLength returns the length of the repeat string.
The repeat string itself can be determined with representative.
Output
The only supermaximal repeats of "How many wood would a woodchuck chuck." of length at least 3
are " wood" and "chuck". There are repeats of " wo" which are maximal (see Maximal Repeats),
but not supermaximal, as " wo" is part of the longer repeat " wood".
Each output line lists the occurrence positions, then the repeat length, then the repeat string.
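The statement about " wo" can be verified for a single candidate without enumerating all repeats. The sketch below relies on one observation: if a repeat is part of a longer repeat, then extending one of its occurrences by a single character to the left or right already gives a repeat. The functions occurrenceCount and isSupermaximalRepeat are hypothetical helpers written with the standard library only; they are not part of SeqAn.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of (possibly overlapping) occurrences.
std::size_t occurrenceCount(const std::string& text, const std::string& pattern)
{
    assert(!pattern.empty());
    std::size_t count = 0;
    for (std::size_t pos = text.find(pattern); pos != std::string::npos;
         pos = text.find(pattern, pos + 1))
        ++count;
    return count;
}

// True if candidate occurs at least twice and no one-character extension
// of any of its occurrences occurs at least twice.
bool isSupermaximalRepeat(const std::string& text, const std::string& candidate)
{
    if (occurrenceCount(text, candidate) < 2)
        return false;  // not a repeat at all
    const std::size_t len = candidate.size();
    for (std::size_t pos = text.find(candidate); pos != std::string::npos;
         pos = text.find(candidate, pos + 1))
    {
        // Extend this occurrence by one character to the left.
        if (pos > 0 && occurrenceCount(text, text.substr(pos - 1, len + 1)) >= 2)
            return false;
        // Extend this occurrence by one character to the right.
        if (pos + len < text.size() &&
            occurrenceCount(text, text.substr(pos, len + 1)) >= 2)
            return false;
    }
    return true;
}
```

With the example sentence, isSupermaximalRepeat returns true for " wood" and "chuck", and false for " wo", because the right extension " woo" also occurs twice.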
weese@tanne:~/seqan/demos$ make index_supermaxrepeats
weese@tanne:~/seqan/demos$ ./index_supermaxrepeats
8, 21, 5 " wood"
26, 32, 5 "chuck"
weese@tanne:~/seqan/demos$
SeqAn - Sequence Analysis Library - www.seqan.de