Spec VSTree Iterator
Abstract iterator for string trees, where string trees are trees constructed from a string.

Extends Iter
All Extended Iter
All Subcl's BottomUpIterator, MaxRepeatsIterator, MultiMemsIterator, MumsIterator, SuperMaxRepeatsFastIterator, SuperMaxRepeatsIterator, TopDownHistoryIterator, TopDownIterator
All Impl'd IteratorAssociatedTypesConcept
Defined in <seqan/index.h>
Signature template <typename TContainer, typename TSpec> class Iter<TContainer, VSTree<TSpec> >;

Template Parameters

TSpec The specialization type.
TContainer Type of the container that can be iterated. Types: IndexDfi, IndexEsa, IndexWotd, FMIndex

Member Function Overview

Member Functions Inherited From Iter

Interface Function Overview

Interface Functions Inherited From IteratorAssociatedTypesConcept

Interface Metafunction Overview

Interface Metafunctions Inherited From Iter

Interface Metafunctions Inherited From IteratorAssociatedTypesConcept

Detailed Description

This iterator is a pointer to a node in the string tree of a given text. Depending on the index this can either be a suffix or prefix tree/trie. Every node can uniquely be mapped to an interval of suffices or prefixes.

Default virtual string tree iterators depending on the Index

IndexSa Virtual suffix trie iterator
IndexEsa Virtual suffix tree iterator
IndexWotd Virtual suffix tree iterator
IndexDfi Virtual suffix tree iterator
FMIndex Virtual prefix trie iterator

Examples

This code shows how an index can be used with iterators to achieve a pre-order tree like traversal in DFS of the text "tobeornottobe". In order to do so a Top-Down History iterator is used.

#include <seqan/index.h>

using namespace seqan;

int main()
{
    typedef Index<CharString> TIndex;

    TIndex index("tobeornottobe");
    Iterator<TIndex, TopDown<ParentLinks<> > >::Type it(index);

    do
    {
        // Print the letters from the root to the current node
        std::cout << representative(it) << std::endl;

        if (!goDown(it) && !goRight(it))
            while (goUp(it) && !goRight(it))
                ;
    }
    while (!isRoot(it));

    return 0;
}

Interface Functions Detail

TAlign alignment(iterator);

Note.

Internal function.

Returns an alignment of the occurrences of the representative substring in the index text.

Parameters

iterator An iterator of a string tree.

Returns

TAlign A local alignment corresponding to the seed of the iterator.

The function representative must uniquely occur in every sequence (e.g. in Mums), otherwise the seed returned is one many. The return type is a Align object.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

void assignProperty(pm, value(iterator), val)

Assigns a property to an item in the property map.

Parameters

pm An PropertyMapConcept.
iterator An iterator of a string tree. Types: VSTreeIterator
val The new value, where the type of the new value must match the value type of the property map.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool atEnd(iterator);

Determines whether an virtual string tree iterator is at the end position.

Parameters

iterator An iterator.

Returns

bool true if iterator points behind the last item of the container, otherwise false.

Examples

The following example shows the usage of the atEnd function.

#include <seqan/index.h>

using namespace seqan;

int main()
{
    typedef Index<CharString> TIndex;
    TIndex index("TATAA");

    Iterator<TIndex, TopDown<ParentLinks<> > >::Type itDefault;
    itDefault = begin(index, TopDown<ParentLinks<> >());

    while (!atEnd(itDefault))
    {
        std::cout << representative(itDefault) << std::endl;
        goNext(itDefault);
    }

    std::cout << "--------------------------------" << std::endl;

    Iterator<TIndex, TopDown<ParentLinks<Postorder> > >::Type itPostOrder;
    itPostOrder = begin(index, TopDown<ParentLinks<Postorder> >());

    while (!atEnd(itPostOrder))
    {
        std::cout << representative(itPostOrder) << std::endl;
        goNext(itPostOrder);
    }

    return 0;
}

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool childrenAreLeaves(iterator);

Test whether iterator points to a node with only leaf-children.

Parameters

iterator An iterator of a suffix tree.

Returns

bool true if iterator points to an inner node of the tree, whose children are leaves. Otherwise it is false.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

TSize countChildren(iterator);

Count the number of children of a tree node.

Parameters

iterator An iterator of a string tree.

Returns

TSize The number of children of a tree node. The type is the result of Size of the index type of the iterator.

Examples

// this code is in seqan/index/index_esa_stree.h

 typedef Index< String<char> > TMyIndex;
 TMyIndex myIndex(myString);

 Iterator< TMyIndex, TopDown<ParentLinks<PreorderEmptyEdges> > >::Type tdIterator(myIndex);
 Size<TMyIndex>::Type count;

 while (!atEnd(tdIterator)) {
 // We print out the representatives of all nodes that have more than 3 children and the number of occurrences.
 count = countChildren(tdIterator);
 if (count >= 3)
 {
     std::cout << "Representative " << representative(tdIterator) << " has " <<  count << " children  and " << countOccurrences(tdIterator) << " Occurrences " << std::endl;

     ++tdIterator;
 }

DemoIndexCountChildren

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

TSize countOccurrences(iterator);

Returns the number of occurrences of representative substring in the index text.

Parameters

iterator An iterator of a string tree.

Returns

TSize The number of positions where the representative of iterator occurs in the text. The return type is the result of the metafunction Size of the index type of the iterator.

The necessary index tables are built on-demand via indexRequire if index is not const.

Examples

///An example to demonstrate  the functions countChildren and countOccurrences
#include <iostream>
#include <seqan/index.h>

using namespace seqan;

int main()
{
    //We begin with a String to store our sequence.
    String<char> myString = "How many wood would a woodchuck chuck. A woodchuck chucks as much wood as a woodchuck could";

    //Then we create an Index of this StringSet.
    typedef Index<String<char> > TMyIndex;
    TMyIndex myIndex(myString);

    // We will use a TopDown Iterator that supports parent links, ommits
    // empty edges and traverses the index in preorder to print out the number of
    // children at each node (not the number of leafs in the subtree).
    Iterator<TMyIndex, TopDown<ParentLinks<PreorderEmptyEdges> > >::Type tdIterator(myIndex);
    Size<TMyIndex>::Type count;

    while (!atEnd(tdIterator))
    {
        //We print out the representatives of all nodes that have more than 3
        //children and the number of occurrences. Also, we print a message if a node
        //is a leaf.
        count = countChildren(tdIterator);
        if (count >= 3)
        {
            std::cout << "Representative " << representative(tdIterator) << " has " <<  count << " children and ";
            std::cout << countOccurrences(tdIterator) << " occurrences " << std::endl;
        }
        if (isLeaf(tdIterator))
            std::cout << "The node is a leaf " << std::endl;

        tdIterator++;
    }

    return 0;
}

The result is as follows:

Representative  has 17 children and 91 occurrences 
Representative   has 5 children and 16 occurrences 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
Representative a has 3 children and 5 occurrences 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
Representative c has 3 children and 12 occurrences 
The node is a leaf 
Representative chuck has 3 children and 5 occurrences 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
Representative ck has 3 children and 5 occurrences 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
Representative d has 3 children and 7 occurrences 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
Representative huck has 3 children and 5 occurrences 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
Representative k has 3 children and 5 occurrences 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
Representative o has 4 children and 13 occurrences 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
Representative uck has 3 children and 5 occurrences 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 
The node is a leaf 

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

int getFrequency(iterator);

Returns the number of sequences, which contain the representative as a substring.

Parameters

iterator An iterator of a suffix tree. Types: VSTreeIterator

Returns

int The number of different sequences containing the representative.

Example

The following code how getFrequency is used. Note that the result of alternative 1 and 2 is the same, however alternative one copies a string which requires more memory.

#include <seqan/index.h>

using namespace seqan;

int main()
{
    typedef StringSet<String<char> >    TText;
    typedef Index<TText>                TIndex;
    typedef SAValue<TIndex>::Type       TSAValue;

    TText text;
    appendValue(text, "MISSISSIPPI");
    appendValue(text, "MYMISSISAHAPPY");

    TIndex index(text);
    Iterator<TIndex, TopDown<> >::Type it(index);

    goDown(it, "SSI");

    std::cout << "SSI occurs in " << getFrequency(it) << " sequences." << std::endl;

    // 1 alternative to print the location of hits
    String<TSAValue> saPositions = getOccurrences(it);
    for (unsigned i = 0; i < length(saPositions); ++i)
        std::cout << "Hit in sequence " << saPositions[i].i1 << " at position " << saPositions[i].i2 << std::endl;

    std::cout << "----------------------------" << std::endl;

    // 2 alternative to print the location of hits
    Pair<unsigned> hitInterval = range(it);
    for (; hitInterval.i1 < hitInterval.i2; ++hitInterval.i1)
        std::cout << "Hit in sequence " << getFibre(index, FibreSA())[hitInterval.i1].i1 <<
        " at position " << getFibre(index, FibreSA())[hitInterval.i1].i2 << std::endl;

    return 0;
}
SSI occurs in 2 sequences.
Hit in sequence 0 at position 5
Hit in sequence 1 at position 4
Hit in sequence 0 at position 2
----------------------------
Hit in sequence 0 at position 5
Hit in sequence 1 at position 4
Hit in sequence 0 at position 2

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

See Also

TSAValue getOccurrence(iterator);

Returns an occurrence of the representative substring in the index text.

Parameters

iterator An iterator of a string tree.

Returns

TSAValue A position where the representative of iterator occurs in the text. The return type is the result of the metafunction SAValue of the index type of the iterator.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

TInfix getOccurrences(iterator);

Returns all occurrences of the representative substring in the index text.

Parameters

iterator An iterator of a string tree.

Returns

TInfix All positions where the representative of iterator occurs in the text. Type Infix<Fibre<TIndex, FibreSA>::Type>.

The necessary index tables are built on-demand via indexRequire if index is not const.

Example

DemoMummy DemoSupermaximalRepeats DemoMaximalUniqueMatches

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

See Also

TReturn getOccurrencesBwt(iterator);

Returns the characters left beside all occurrence of the representative substring in the index text.

Parameters

iterator An iterator of a suffix tree.

Returns

TReturn All positions where the representative of iterator occurs in the text.

If iterator's container type is TIndex the return type is Infix<Fibre<TIndex, EsaBwt>::Type const>::Type.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

TValue getProperty(pm, value(iterator))

Get method for an item's property.

Parameters

pm An PropertyMapConcept.
iterator An iterator of a string tree. Types: VSTreeIterator

Returns

TValue Reference to the item in the property map of type GetValue.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

void goBegin(iterator);

Iterates to the first position of a container.

Parameters

iterator Object that iterates through

This function is equivalent to iterator = begin(container).

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

See Also

void goNext(it);

Go to next tree node.

Parameters

it VSTreeIterator to advance.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

void goRoot(iterator);

Move iterator to the root node.

Parameters

iterator An iterator of a suffix tree.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool isLeaf(iterator);

Test whether a tree iterator points to a leaf.

Parameters

iterator An iterator of a tree.

Returns

bool true if iterator points to a leaf of the tree, otherwise false.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool isLeftMaximal(iterator);

Test whether the occurrences of an iterator's representative mutually differ in the character left of the hits.

Parameters

iterator An iterator of a suffix tree.

Returns

bool true if there are at least two different characters left of the occurrences, otherwise false.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

See Also

bool isPartiallyLeftExtensible(iterator);

Test whether the characters left of the two occurrences of representative are equal.

Parameters

iterator An iterator of a suffix tree.

Returns

bool true if there are at least two different characters left of the occurrences, otherwise false.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

See Also

bool isRightTerminal(iterator);

Test whether iterator points to a suffix.

Parameters

iterator An iterator of a suffix tree.

Returns

bool true if iterator points to the node representing a suffix, otherwise false.

Every leaf is also a right terminal (see isLeaf), but not vice versa.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool isRoot(iterator);

Test whether a tree iterator points to the root node.

Parameters

iterator An iterator of a tree.

Returns

bool true if iterator points to the root of the tree, otherwise false.

Example

The following example shows the usage of the @Function.isRoot@ function.

#include <seqan/index.h>

using namespace seqan;

int main()
{
    typedef Index<CharString> TIndex;
    TIndex index("TATAA");

    Iterator<TIndex, BottomUp<Postorder> >::Type itDefault;
    itDefault = begin(index, BottomUp<Postorder>());

    while (!isRoot(itDefault))
    {
        std::cout << representative(itDefault) << std::endl;
        goNext(itDefault);
    }

    return 0;
}

code{.output} output:AA ATAA A TAA TATAA TA @endcode

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool isUnique(iterator);

Test whether the representative occurs only once in every sequence.

Parameters

iterator An iterator of a suffix tree.

Returns

bool true if there are at least two different characters left of the occurrences, otherwise false.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

See Also

bool nodeHullPredicate(iterator);

If false this node and its subtree is concealed.

Parameters

iterator An iterator of a string tree.

Returns

bool Returns whether or not a subtree is concealed.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool nodePredicate(iterator);

If false this node will be skipped during the bottom-up traversal.

Parameters

iterator An iterator of a string tree.

Returns

bool Returns whether or not the node will be skipped.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

See Also

TRef property(pm, value(iterator))

Accesses the property of an item in the property map.

Parameters

pm An PropertyMapConcept.
iterator An iterator of a string tree. Types: VSTreeIterator

Returns

TRef Reference to the item in the property map of type Reference.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

TPair range(iterator);

Returns the suffix array interval borders of occurrences of representative substring in the index text.

Parameters

iterator An iterator of a string tree/trie.

Returns

TPair All positions where a substring occurs in the text are stored in a contiguous range of the suffix array. range returns begin and end position of this range for occurrences of representative. The type is Pair<Size< TIndex > > with TIndex being the index type of the iterator.

The necessary index tables are built on-demand via indexRequire if index is not const.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

TSize repLength(iterator);

Returns the length of the substring representing the path from root to iterator node.

Parameters

iterator An iterator of a string tree. Types: VSTreeIterator

Returns

TSize The length of the sequence returned by representative The return type is the result of the metafunction Size of the underlying index.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

TInfix representative(iterator);

Returns a substring representing the path from root to iterator node.

Parameters

iterator An iterator of a string tree.

Returns

TInfix Infix<TSting> An InfixSegment of the text of an index. The type is Infix<Fibre<TIndex, FibreText>::Type>::Type.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

TDesc value(iterator);

Returns a vertex descriptor of the current node.

Parameters

iterator An iterator of a string tree. Types: VSTreeIterator

Returns

TDesc An object of type VertexDescriptor that uniquely identifies the current node. The vertex descriptor can be used to store node specific values in an PropertyMapConcept.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

Interface Metafunctions Detail

EdgeLabel<TSpec>::Type;

Type of a string representing the characters of edges.

Template Parameters

TSpec Tag to specify the object of which the type of the edge labels is requested.

Returns

Type of a string representing the characters of edges.