Example Program
Constraint Iterator
Example for using node predicates on a deferred suffix tree.
Given a sequence, we want to find all substrings s that fulfill certain constraints.
The relative probability of seeing s should be at least p_min, and s should be no longer than
replen_max.
The latter constraint is an anti-monotonic pattern predicate: once a string violates it, so does every extension of that string. It can therefore be used in conjunction with the
first constraint to cut off the trunk of a suffix tree. Only the top of the suffix tree contains candidates
that might fulfill both predicates, so we can use an Index based on a deferred suffix tree (see IndexWotd).
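To make the two constraints concrete, here is a minimal brute-force sketch in standard C++ that does not use SeqAn at all. The function name `frequentSubstrings` is hypothetical, and the definition of relative probability used here (occurrences divided by the number of possible start positions for that length) is an assumption, since the text above does not spell out the formula.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper, not part of SeqAn.
// Counts every substring of `text` with length 1..replenMax and keeps those
// whose relative frequency is at least pMin.
// ASSUMPTION: relative frequency = occurrences / (text length - length + 1).
std::map<std::string, std::size_t>
frequentSubstrings(const std::string& text, double pMin, std::size_t replenMax)
{
    assert(pMin >= 0.0 && pMin <= 1.0);  // pMin is a probability threshold

    // Count all substrings that respect the length limit.
    std::map<std::string, std::size_t> counts;
    for (std::size_t start = 0; start < text.size(); ++start)
        for (std::size_t len = 1;
             len <= replenMax && start + len <= text.size(); ++len)
            ++counts[text.substr(start, len)];

    // Keep only the substrings that are frequent enough.
    std::map<std::string, std::size_t> result;
    for (const auto& entry : counts) {
        const std::size_t positions = text.size() - entry.first.size() + 1;
        const double p = static_cast<double>(entry.second) /
                         static_cast<double>(positions);
        if (p >= pMin)
            result.insert(entry);
    }
    return result;
}
```

For example, `frequentSubstrings("abab", 0.5, 2)` keeps "a", "ab" and "b" (two occurrences each) and drops "ba". Unlike a suffix tree traversal, this sketch reports every qualifying substring, not only those that correspond to tree nodes, and it does no pruning.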
The following example demonstrates how to iterate over all suffix tree nodes fulfilling the constraints and output them.
File "index_node_predicate.cpp"
A tutorial showing how to extend an index with a node predicate.
constraint parameters
SeqAn extensions
We begin with a String to store our sequence.
Then we create our customized index, which is a specialization
of the deferred wotd-Index.
To find all strings that fulfill our constraints,
we simply do a DFS traversal via goBegin and goNext.
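The effect of such a constrained traversal can be sketched in standard C++ as a depth-first walk that stops descending as soon as an anti-monotonic bound fails. This is only an illustration under stated assumptions, not SeqAn's implementation: all names are hypothetical, relative frequency is assumed to be occurrences divided by the number of start positions for that length, and the pruning bound divides by the smallest number of start positions any admissible string can have, so that it can only shrink along a path.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// All names below are hypothetical and independent of SeqAn.
struct Constraints {
    double pMin;            // minimum relative frequency
    std::size_t replenMax;  // maximum substring length
};

using Hits = std::vector<std::pair<std::string, std::size_t>>;

// Counts (possibly overlapping) occurrences of a non-empty pattern.
std::size_t countNaive(const std::string& text, const std::string& pattern)
{
    std::size_t hits = 0;
    for (std::size_t pos = text.find(pattern); pos != std::string::npos;
         pos = text.find(pattern, pos + 1))
        ++hits;
    return hits;
}

// Depth-first extension of `prefix`, one character at a time.
void visit(const std::string& text, const std::set<char>& alphabet,
           const Constraints& c, std::size_t minPositions,
           std::string& prefix, Hits& out)
{
    for (char ch : alphabet) {
        prefix.push_back(ch);
        const std::size_t hits = countNaive(text, prefix);
        // Anti-monotonic bound: the length limit plus an upper bound on the
        // frequency that this string or any extension of it can reach.
        const bool hull = hits > 0 && prefix.size() <= c.replenMax &&
                          static_cast<double>(hits) / minPositions >= c.pMin;
        if (hull) {
            // Exact predicate for this particular string.
            const std::size_t positions = text.size() - prefix.size() + 1;
            if (static_cast<double>(hits) / positions >= c.pMin)
                out.emplace_back(prefix, hits);
            visit(text, alphabet, c, minPositions, prefix, out);
        }
        prefix.pop_back();  // backtrack; everything below a failed bound is skipped
    }
}

Hits constrainedSubstrings(const std::string& text, const Constraints& c)
{
    assert(c.pMin >= 0.0 && c.pMin <= 1.0);
    const std::set<char> alphabet(text.begin(), text.end());
    // Fewest start positions an admissible substring can have.
    const std::size_t minPositions =
        text.size() >= c.replenMax ? text.size() - c.replenMax + 1 : 1;
    Hits out;
    std::string prefix;
    visit(text, alphabet, c, minPositions, prefix, out);
    return out;
}
```

Calling `constrainedSubstrings("abab", Constraints{0.5, 2})` visits "a", "ab" and "b" in that order and never looks below "ba" or "aba", because the bound already fails there. A suffix tree makes the same walk far cheaper, since occurrence counts are available per node instead of being recomputed by scanning the text.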
countOccurrences returns the number of hits of the representative.
The representative string can be determined with representative.
Output
weese@tanne:~/seqan/demos$ make index_node_predicate
weese@tanne:~/seqan/demos$ ./index_node_predicate
38x ""
6x " "
3x " wo"
2x " wood"
2x "a"
4x "c"
2x "chuck"
2x "ck"
3x "d"
2x "d "
2x "huck"
2x "k"
6x "o"
2x "od"
2x "ood"
3x "u"
2x "uck"
4x "w"
3x "wo"
2x "wood"
weese@tanne:~/seqan/demos$
SeqAn - Sequence Analysis Library - www.seqan.de