Example Program
Approximate Searching
Approximate string matching.
A tutorial about the use of approximate find algorithms.
#include <iostream>
#include <seqan/find.h>

using namespace seqan;

Example 1: This program finds all occurrences of CCT in AACTTAACCTAA with at most one error, using the MyersUkkonen approximate search algorithm.
int main()
{
    String<char> haystk("AACTTAACCTAA");
    String<char> ndl("CCT");

    Finder<String<char> > fnd(haystk);
    Pattern<String<char>, MyersUkkonen> pat(ndl);
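The function setScoreLimit sets the minimum score an occurrence must reach. Because the scoring scheme is a distance measure (edit distance), scores are never positive: an exact match scores 0 and every error lowers the score by 1. A score limit of -1 therefore admits all occurrences with an edit distance ≤ 1. Note that position returns the position of the most recently found occurrence; in the output, this is the index at which the occurrence ends (9 for the exact match CCT at indices 7 to 9).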
    setScoreLimit(pat, -1);
    while (find(fnd, pat)) {
        std::cout << position(fnd) << ": " << getScore(pat) << "\n";
    }

Example 2: Finding all begin and end positions
    String<char> t = "babybanana";
    String<char> p = "babana";
    Finder<String<char> > finder(t);
    Pattern<String<char>, Myers<FindInfix> > pattern(p);
Instead of using setScoreLimit, we pass the score limit -2 as a third argument to find.
    while (find(finder, pattern, -2)) {
        std::cout << "end: " << endPosition(finder) << std::endl;
In order to find the begin position, we have to call findBegin. Note that the third argument of findBegin is optional. The default is the score limit that was used during the last call of find (i.e. -2 in this example).
        while (findBegin(finder, pattern, getScore(pattern))) {
            std::cout << "begin: " << beginPosition(finder) << std::endl;
            std::cout << infix(finder) << " matches with score ";
            std::cout << getBeginScore(pattern) << std::endl;
        }
    }
    return 0;
}

Output
3: -1
4: -1
8: -1
9: 0
10: -1
end: 6
begin: 0
babyba matches with score -2
end: 7
begin: 2
byban matches with score -2
end: 8
begin: 2
bybana matches with score -1
end: 9
begin: 4
banan matches with score -2
begin: 2
bybanan matches with score -2
end: 10
begin: 4
banana matches with score -1
SeqAn - Sequence Analysis Library - www.seqan.de
 

Page built @2013/07/11 09:12:16