Example Program
Maximal Unique Matches
An example of using the MUMs Iterator.
Given a set of sequences, a unique match is a match that occurs exactly once in each sequence. A maximal unique match (MUM) is a unique match that is not part of any longer unique match. The following example demonstrates how to iterate over all MUMs and output them.
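The definition above can be checked directly with a brute-force search. The following is a minimal sketch that uses only the C++ standard library, not SeqAn; the names `NaiveMum`, `countOccurrences`, `isUniqueMatch`, and `naiveMums` are hypothetical helpers introduced for illustration. It tests every substring of the first sequence, so it is only practical for very short inputs, whereas the index-based iterator shown below is the intended tool.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: the match text plus its start offset in each sequence.
struct NaiveMum {
    std::string text;
    std::vector<std::size_t> offsets;
};

// Counts (possibly overlapping) occurrences of needle in hay.
static std::size_t countOccurrences(const std::string& hay, const std::string& needle) {
    std::size_t count = 0;
    for (std::size_t pos = hay.find(needle); pos != std::string::npos;
         pos = hay.find(needle, pos + 1))
        ++count;
    return count;
}

// A unique match occurs exactly once in every sequence.
static bool isUniqueMatch(const std::vector<std::string>& seqs, const std::string& cand) {
    for (const std::string& s : seqs)
        if (countOccurrences(s, cand) != 1) return false;
    return true;
}

// Brute force: every MUM is a substring of the first sequence, so enumerate those.
std::vector<NaiveMum> naiveMums(const std::vector<std::string>& seqs, std::size_t minLen) {
    assert(minLen > 0);  // an empty candidate is not a meaningful match
    std::vector<NaiveMum> result;
    if (seqs.empty()) return result;
    const std::string& ref = seqs[0];
    for (std::size_t begin = 0; begin < ref.size(); ++begin) {
        for (std::size_t len = minLen; begin + len <= ref.size(); ++len) {
            const std::string cand = ref.substr(begin, len);
            if (!isUniqueMatch(seqs, cand)) continue;
            // Maximal: growing the match by one character on either side
            // must not yield another unique match.
            const bool growsLeft =
                begin > 0 && isUniqueMatch(seqs, ref.substr(begin - 1, len + 1));
            const bool growsRight =
                begin + len < ref.size() && isUniqueMatch(seqs, ref.substr(begin, len + 1));
            if (growsLeft || growsRight) continue;
            NaiveMum mum;
            mum.text = cand;
            for (const std::string& s : seqs) mum.offsets.push_back(s.find(cand));
            result.push_back(mum);
        }
    }
    return result;
}
```

Checking only one-character extensions is sufficient here: any longer unique match that contains a candidate also contains one of its one-character extensions, and that extension is itself a unique match.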
#include <iostream>
#include <seqan/index.h>

using namespace std;
using namespace seqan;

int main ()
{
We begin with a StringSet that stores multiple strings.
    StringSet< String<char> > mySet;
    resize(mySet, 3);
    mySet[0] = "SeqAn is a library for sequence analysis.";
    mySet[1] = "The String class is the fundamental sequence type in SeqAn.";
    mySet[2] = "Subsequences can be handled with SeqAn's Segment class.";
Then we create an Index of this StringSet.
    typedef Index< StringSet< String<char> > > TMyIndex;
    TMyIndex myIndex(mySet);
To find maximal unique matches (MUMs), we use the MUMs Iterator and set the minimum MUM length to 3.
    Iterator< TMyIndex, MUMs >::Type myMUMiterator(myIndex, 3);
    String< SAValue<TMyIndex>::Type > occs;

    while (!atEnd(myMUMiterator))
    {
A multiple match can be represented by its length and the position at which it occurs in each sequence. getOccurrences returns an unordered sequence of (seqNo, seqOfs) pairs, one for each position at which the match occurs.
        occs = getOccurrences(myMUMiterator);
To sort them in ascending order by seqNo, we use orderOccurrences.
        orderOccurrences(occs);

        for (unsigned i = 0; i < length(occs); ++i)
            cout << getValueI2(occs[i]) << ", ";
repLength returns the length of the match.
        cout << repLength(myMUMiterator) << "   ";
The match string itself can be determined with representative.
        cout << "\t\"" << representative(myMUMiterator) << '\"' << endl;

        ++myMUMiterator;
    }

    return 0;
}
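The ordering step in the loop can be pictured without SeqAn. The sketch below is a standard-library stand-in, not SeqAn's implementation: `Occurrence`, `orderBySeqNo`, and `offsetsBySequence` are hypothetical names, and a `std::pair` plays the role of the (seqNo, seqOfs) pair. It assumes that sorting the pairs lexicographically, by seqNo first, is an adequate model of what orderOccurrences achieves for a unique match.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for a (seqNo, seqOfs) pair.
using Occurrence = std::pair<std::size_t, std::size_t>;

// Sorts occurrences in ascending order; std::pair compares seqNo first, then seqOfs.
void orderBySeqNo(std::vector<Occurrence>& occs) {
    std::sort(occs.begin(), occs.end());
}

// Returns the offsets of a unique match, indexed by sequence number.
std::vector<std::size_t> offsetsBySequence(std::vector<Occurrence> occs) {
    orderBySeqNo(occs);
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < occs.size(); ++i) {
        // A unique match has exactly one occurrence per sequence,
        // so after sorting the i-th pair belongs to sequence i.
        assert(occs[i].first == i);
        offsets.push_back(occs[i].second);
    }
    return offsets;
}
```

For example, the unordered pairs (2, 33), (0, 0), (1, 53) become the offsets 0, 53, 33, which is the order in which the demo prints them.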
Output
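The only maximal matches of length at least 3 that occur in all three sequences are "SeqAn" and "sequence". Each occurs exactly once per sequence, so both are maximal unique matches. Each output line lists the offset of the match in sequences 0, 1, and 2, followed by the match length and the match string.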
weese@tanne:~/seqan$ cd demos
weese@tanne:~/seqan/demos$ make index_mums
weese@tanne:~/seqan/demos$ ./index_mums
0, 53, 33, 5    "SeqAn"
23, 36, 3, 8    "sequence"
weese@tanne:~/seqan/demos$
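As a sanity check on the output above, the reported offsets and lengths can be compared against the input strings. This is a small standard-library sketch; `occurrenceMatches` is a hypothetical helper, not a SeqAn function. It confirms that the substring starting at each reported offset, with the reported length, equals the reported match string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if, in every sequence, the len characters starting at the
// reported offset are exactly the expected match string.
bool occurrenceMatches(const std::vector<std::string>& seqs,
                       const std::vector<std::size_t>& offsets,
                       std::size_t len,
                       const std::string& expected) {
    assert(offsets.size() == seqs.size());  // one offset per sequence
    if (expected.size() != len) return false;
    for (std::size_t i = 0; i < seqs.size(); ++i) {
        // Guard against offsets that would run past the end of the sequence.
        if (offsets[i] > seqs[i].size() || seqs[i].size() - offsets[i] < len) return false;
        if (seqs[i].compare(offsets[i], len, expected) != 0) return false;
    }
    return true;
}
```

Called with the three demo strings, the offsets 0, 53, 33 with length 5 correspond to "SeqAn", and the offsets 23, 36, 3 with length 8 correspond to "sequence".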
SeqAn - Sequence Analysis Library - www.seqan.de