SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
Loading...
Searching...
No Matches
seqan3::bi_fm_index_cursor< index_t > Class Template Reference

The SeqAn Bidirectional FM Index Cursor. More...

#include <seqan3/search/fm_index/bi_fm_index_cursor.hpp>

+ Inheritance diagram for seqan3::bi_fm_index_cursor< index_t >:

Public Types

using index_type = index_t
 Type of the index.
 
Text types
using size_type = typename index_type::size_type
 Type for representing positions in the indexed text.
 
Cursor types
using fwd_cursor = fm_index_cursor< fm_index< typename index_type::alphabet_type, index_type::text_layout_mode, typename index_type::sdsl_index_type > >
 Type for the unidirectional cursor on the original text.
 

Public Member Functions

Constructors, destructor and assignment
 bi_fm_index_cursor () noexcept=default
 Default constructor. Accessing member functions on a default constructed object is undefined behavior.
 
 bi_fm_index_cursor (bi_fm_index_cursor const &) noexcept=default
 Defaulted.
 
bi_fm_index_cursoroperator= (bi_fm_index_cursor const &) noexcept=default
 Defaulted.
 
 bi_fm_index_cursor (bi_fm_index_cursor &&) noexcept=default
 Defaulted.
 
bi_fm_index_cursoroperator= (bi_fm_index_cursor &&) noexcept=default
 Defaulted.
 
 ~bi_fm_index_cursor ()=default
 Defaulted.
 
 bi_fm_index_cursor (index_t const &_index) noexcept
 Construct from given index.
 
bool operator== (bi_fm_index_cursor const &rhs) const noexcept
 Compares two cursors.
 
bool operator!= (bi_fm_index_cursor const &rhs) const noexcept
 Compares two cursors.
 
bool extend_right () noexcept
 Tries to extend the query by the smallest possible character to the right such that the query is found in the text. .
 
bool extend_left () noexcept
 Tries to extend the query by the smallest possible character to the left such that the query is found in the text. .
 
template<typename char_t >
requires std::convertible_to<char_t, index_alphabet_type>
bool extend_right (char_t const c) noexcept
 Tries to extend the query by the character c to the right.
 
template<typename char_type >
requires seqan3::detail::is_char_adaptation_v<char_type>
bool extend_right (char_type const *cstring) noexcept
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 
template<typename char_t >
requires std::convertible_to<char_t, index_alphabet_type>
bool extend_left (char_t const c) noexcept
 Tries to extend the query by the character c to the left.
 
template<typename char_type >
requires seqan3::detail::is_char_adaptation_v<char_type>
bool extend_left (char_type const *cstring) noexcept
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 
template<std::ranges::range seq_t>
bool extend_right (seq_t &&seq) noexcept
 Tries to extend the query by seq to the right.
 
template<std::ranges::range seq_t>
bool extend_left (seq_t &&seq) noexcept
 Tries to extend the query by seq to the left.
 
bool cycle_back () noexcept
 Tries to replace the rightmost character of the query by the next lexicographically larger character such that the query is found in the text. .
 
bool cycle_front () noexcept
 Tries to replace the leftmost character of the query by the next lexicographically larger character such that the query is found in the text. .
 
size_type last_rank () noexcept
 Outputs the rightmost respectively leftmost rank depending on whether extend_right() or extend_left() has been called last.
 
size_type query_length () const noexcept
 Returns the depth of the cursor node in the implicit suffix tree, i.e. the length of the sequence searched.
 
fwd_cursor to_fwd_cursor () const noexcept
 Returns a unidirectional seqan3::fm_index_cursor on the original text. path_label() on the returned unidirectional index cursor will be equal to path_label() on the bidirectional index cursor. cycle_back() and last_char() will be undefined behavior if the last extension on the bidirectional FM index has been to the left. The behavior will be well-defined after the first extension to the right on the unidirectional index.
 
template<std::ranges::range text_t>
requires (index_t::text_layout_mode == text_layout::single)
auto path_label (text_t &&text) const noexcept
 Returns the searched query.
 
template<std::ranges::range text_t>
requires (index_t::text_layout_mode == text_layout::collection)
auto path_label (text_t &&text) const noexcept
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 
size_type count () const noexcept
 Counts the number of occurrences of the searched query in the text.
 
locate_result_type locate () const
 Locates the occurrences of the searched query in the text.
 
std::vector< std::pair< size_type, size_type > > locate () const
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 
auto lazy_locate () const
 Locates the occurrences of the searched query in the text on demand, i.e. a std::ranges::view is returned and every position is located once it is accessed.
 
auto lazy_locate () const
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 

Detailed Description

template<typename index_t>
class seqan3::bi_fm_index_cursor< index_t >

The SeqAn Bidirectional FM Index Cursor.

Template Parameters
index_tThe type of the underlying index; This is normally seqan3::bi_fm_index.

The cursor's interface provides searching a string both from left to right as well as from right to left in the indexed text. It extends the interface of the unidirectional seqan3::fm_index_cursor. All methods modifying the cursor (e.g. extending by a character with extend_right()) return a bool value whether the operation was successful or not. In case of an unsuccessful operation the cursor remains unmodified, i.e. an cursor can never be in an invalid state except default constructed cursors that are always invalid.

The asymptotic running times for using the cursor depend on the SDSL index configuration. To determine the exact running times, you have to additionally look up the running times of the used traits (configuration).

Constructor & Destructor Documentation

◆ bi_fm_index_cursor()

template<typename index_t >
seqan3::bi_fm_index_cursor< index_t >::bi_fm_index_cursor ( )
defaultnoexcept

Default constructor. Accessing member functions on a default constructed object is undefined behavior.

Defaulted.

Member Function Documentation

◆ count()

template<typename index_t >
size_type seqan3::bi_fm_index_cursor< index_t >::count ( ) const
inlinenoexcept

Counts the number of occurrences of the searched query in the text.

Returns
Number of occurrences of the searched query in the text.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ cycle_back()

template<typename index_t >
bool seqan3::bi_fm_index_cursor< index_t >::cycle_back ( )
inlinenoexcept

Tries to replace the rightmost character of the query by the next lexicographically larger character such that the query is found in the text. .

Returns
true if there exists a query in the text where the rightmost character of the query is lexicographically larger than the current rightmost character of the query.

Example:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::debug_stream << "Example cycle_back() and cycle_front()\n";
seqan3::dna4_vector genome{"GAATTAATGAAC"_dna4};
seqan3::bi_fm_index index{genome}; // build the bidirectional index
auto cur = index.cursor(); // create a cursor
// cur.cycle_back(); // cycle_back / cycle_front on begin() is undefined behaviour!
cur.extend_right("AAC"_dna4); // search the sequence "AAC"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "AAC"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 1
// cur.cycle_front(); // undefined behaviour! only cycle_back() is allowed after extend_right()
cur.cycle_back(); // search the sequence "AAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "AAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 3
cur.extend_left('G'_dna4); // search the sequence "GAAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "GAAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 2
// cur.cycle_back(); // undefined behaviour! only cycle_front() is allowed after extend_left()
cur.cycle_front(); // search the sequence "TAAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "TAAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 3
cur.cycle_front(); // search the sequence "TAAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "TAAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 3
}
Provides the bidirectional seqan3::bi_fm_index.
The SeqAn Bidirectional FM Index.
Definition bi_fm_index.hpp:58
Provides seqan3::debug_stream and related types.
Provides seqan3::dna4, container aliases and string literals.
debug_stream_type debug_stream
A global instance of seqan3::debug_stream_type.
Definition debug_stream.hpp:37
The SeqAn namespace for literals.

Complexity

\(O(\Sigma) * O(T_{BACKWARD\_SEARCH})\)

It scans linearly over the alphabet starting from the rightmost character until it finds the query with a larger rightmost character.

Exceptions

No-throw guarantee.

◆ cycle_front()

template<typename index_t >
bool seqan3::bi_fm_index_cursor< index_t >::cycle_front ( )
inlinenoexcept

Tries to replace the leftmost character of the query by the next lexicographically larger character such that the query is found in the text. .

Returns
true if there exists a query in the text where the leftmost character of the query is lexicographically larger than the current leftmost character of the query.

Example:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::debug_stream << "Example cycle_back() and cycle_front()\n";
seqan3::dna4_vector genome{"GAATTAATGAAC"_dna4};
seqan3::bi_fm_index index{genome}; // build the bidirectional index
auto cur = index.cursor(); // create a cursor
// cur.cycle_back(); // cycle_back / cycle_front on begin() is undefined behaviour!
cur.extend_right("AAC"_dna4); // search the sequence "AAC"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "AAC"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 1
// cur.cycle_front(); // undefined behaviour! only cycle_back() is allowed after extend_right()
cur.cycle_back(); // search the sequence "AAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "AAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 3
cur.extend_left('G'_dna4); // search the sequence "GAAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "GAAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 2
// cur.cycle_back(); // undefined behaviour! only cycle_front() is allowed after extend_left()
cur.cycle_front(); // search the sequence "TAAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "TAAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 3
cur.cycle_front(); // search the sequence "TAAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "TAAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 3
}

Complexity

\(O(\Sigma) * O(T_{BACKWARD\_SEARCH})\)

It scans linearly over the alphabet starting from the leftmost character until it finds the query with a larger leftmost character.

Exceptions

No-throw guarantee.

◆ extend_left() [1/3]

template<typename index_t >
bool seqan3::bi_fm_index_cursor< index_t >::extend_left ( )
inlinenoexcept

Tries to extend the query by the smallest possible character to the left such that the query is found in the text. .

Returns
true if the cursor could extend the query successfully.

Complexity

\(O(\Sigma) * O(T_{BACKWARD\_SEARCH})\)

It scans linearly over the alphabet until it finds the smallest character that is represented by an edge.

Exceptions

No-throw guarantee.

◆ extend_left() [2/3]

template<typename index_t >
template<typename char_t >
requires std::convertible_to<char_t, index_alphabet_type>
bool seqan3::bi_fm_index_cursor< index_t >::extend_left ( char_t const  c)
inlinenoexcept

Tries to extend the query by the character c to the left.

Template Parameters
char_tType of the character needs to be convertible to the character type char_type of the index.
Parameters
[in]cCharacter to extend the query with to the left.
Returns
true if the cursor could extend the query successfully.

Complexity

\(O(T_{BACKWARD\_SEARCH})\)

Exceptions

No-throw guarantee.

◆ extend_left() [3/3]

template<typename index_t >
template<std::ranges::range seq_t>
bool seqan3::bi_fm_index_cursor< index_t >::extend_left ( seq_t &&  seq)
inlinenoexcept

Tries to extend the query by seq to the left.

Template Parameters
seq_tThe type of range of the sequence to search; must model std::ranges::bidirectional_range.
Parameters
[in]seqSequence to extend the query with to the left (starting from right to left, see example).
Returns
true if the cursor could extend the query successfully.

If extending fails in the middle of the sequence, all previous computations are rewound to restore the cursor's state before calling this method.

Example:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::debug_stream << "Example extend_left(seq)\n";
seqan3::dna4_vector genome{"GAATTAATGAAC"_dna4};
seqan3::bi_fm_index index{genome}; // build the bidirectional index
auto cur = index.cursor(); // create a cursor
cur.extend_right("AAC"_dna4); // search the sequence "AAC"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "AAC"
cur.extend_left("ATG"_dna4); // extend the query to "ATGAAC"
// The rightmost character of "ATG" is extended to the left first.
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "ATGAAC"
}

Complexity

\(|seq| * O(T_{BACKWARD\_SEARCH})\)

Exceptions

No-throw guarantee.

◆ extend_right() [1/3]

template<typename index_t >
bool seqan3::bi_fm_index_cursor< index_t >::extend_right ( )
inlinenoexcept

Tries to extend the query by the smallest possible character to the right such that the query is found in the text. .

Returns
true if the cursor could extend the query successfully.

Complexity

\(O(\Sigma) * O(T_{BACKWARD\_SEARCH})\)

It scans linearly over the alphabet until it finds the smallest character that is represented by an edge.

Exceptions

No-throw guarantee.

◆ extend_right() [2/3]

template<typename index_t >
template<typename char_t >
requires std::convertible_to<char_t, index_alphabet_type>
bool seqan3::bi_fm_index_cursor< index_t >::extend_right ( char_t const  c)
inlinenoexcept

Tries to extend the query by the character c to the right.

Template Parameters
char_tType of the character; needs to be convertible to the character type char_type of the index.
Parameters
[in]cCharacter to extend the query with to the right.
Returns
true if the cursor could extend the query successfully.

Complexity

\(O(T_{BACKWARD\_SEARCH})\)

Exceptions

No-throw guarantee.

◆ extend_right() [3/3]

template<typename index_t >
template<std::ranges::range seq_t>
bool seqan3::bi_fm_index_cursor< index_t >::extend_right ( seq_t &&  seq)
inlinenoexcept

Tries to extend the query by seq to the right.

Template Parameters
seq_tThe type of range of the sequence to search; must model std::ranges::forward_range.
Parameters
[in]seqSequence to extend the query with to the right.
Returns
true if the cursor could extend the query successfully.

If extending fails in the middle of the sequence, all previous computations are rewound to restore the cursor's state before calling this method.

Complexity

\(|seq| * O(T_{BACKWARD\_SEARCH})\)

Exceptions

No-throw guarantee.

◆ last_rank()

template<typename index_t >
size_type seqan3::bi_fm_index_cursor< index_t >::last_rank ( )
inlinenoexcept

Outputs the rightmost respectively leftmost rank depending on whether extend_right() or extend_left() has been called last.

Returns
Rightmost or leftmost rank.

Example:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::debug_stream << "Example cycle_back() and cycle_front()\n";
seqan3::dna4_vector genome{"GAATTAATGAAC"_dna4};
seqan3::bi_fm_index index{genome}; // build the bidirectional index
auto cur = index.cursor(); // create a cursor
// cur.cycle_back(); // cycle_back / cycle_front on begin() is undefined behaviour!
cur.extend_right("AAC"_dna4); // search the sequence "AAC"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "AAC"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 1
// cur.cycle_front(); // undefined behaviour! only cycle_back() is allowed after extend_right()
cur.cycle_back(); // search the sequence "AAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "AAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 3
cur.extend_left('G'_dna4); // search the sequence "GAAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "GAAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 2
// cur.cycle_back(); // undefined behaviour! only cycle_front() is allowed after extend_left()
cur.cycle_front(); // search the sequence "TAAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "TAAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 3
cur.cycle_front(); // search the sequence "TAAT"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "TAAT"
seqan3::debug_stream << cur.last_rank() << '\n'; // outputs 3
}

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ lazy_locate()

template<typename index_t >
auto seqan3::bi_fm_index_cursor< index_t >::lazy_locate ( ) const
inline

Locates the occurrences of the searched query in the text on demand, i.e. a std::ranges::view is returned and every position is located once it is accessed.

Returns
Positions in the text.

Complexity

\(count() * O(T_{BACKWARD\_SEARCH} * SAMPLING\_RATE)\)

Exceptions

Strong exception guarantee (no data is modified in case an exception is thrown).

◆ locate()

template<typename index_t >
locate_result_type seqan3::bi_fm_index_cursor< index_t >::locate ( ) const
inline

Locates the occurrences of the searched query in the text.

Returns
Positions in the text.

Complexity

\(count() * O(T_{BACKWARD\_SEARCH} * SAMPLING\_RATE)\)

Exceptions

Strong exception guarantee (no data is modified in case an exception is thrown).

◆ operator!=()

template<typename index_t >
bool seqan3::bi_fm_index_cursor< index_t >::operator!= ( bi_fm_index_cursor< index_t > const &  rhs) const
inlinenoexcept

Compares two cursors.

Parameters
[in]rhsOther cursor to compare it to.
Returns
true if the cursors are not equal, false otherwise.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ operator==()

template<typename index_t >
bool seqan3::bi_fm_index_cursor< index_t >::operator== ( bi_fm_index_cursor< index_t > const &  rhs) const
inlinenoexcept

Compares two cursors.

Parameters
[in]rhsOther cursor to compare it to.
Returns
true if both cursors are equal, false otherwise.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ path_label()

template<typename index_t >
template<std::ranges::range text_t>
requires (index_t::text_layout_mode == text_layout::single)
auto seqan3::bi_fm_index_cursor< index_t >::path_label ( text_t &&  text) const
inlinenoexcept

Returns the searched query.

Template Parameters
text_tThe type of the text used to build the index; must model std::ranges::input_range.
Parameters
[in]textText that was used to build the index.

Complexity

\(O(SAMPLING\_RATE * T_{BACKWARD\_SEARCH}) + query\_length()\)

Exceptions

No-throw guarantee.

◆ query_length()

template<typename index_t >
size_type seqan3::bi_fm_index_cursor< index_t >::query_length ( ) const
inlinenoexcept

Returns the depth of the cursor node in the implicit suffix tree, i.e. the length of the sequence searched.

Returns
Length of searched sequence.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ to_fwd_cursor()

template<typename index_t >
fwd_cursor seqan3::bi_fm_index_cursor< index_t >::to_fwd_cursor ( ) const
inlinenoexcept

Returns a unidirectional seqan3::fm_index_cursor on the original text. path_label() on the returned unidirectional index cursor will be equal to path_label() on the bidirectional index cursor. cycle_back() and last_char() will be undefined behavior if the last extension on the bidirectional FM index has been to the left. The behavior will be well-defined after the first extension to the right on the unidirectional index.

Returns
Returns a unidirectional seqan3::fm_index_cursor on the index of the original text.

Example:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::debug_stream << "Example to_fwd_cursor()\n";
seqan3::dna4_vector genome{"GAATTAACGAAC"_dna4};
seqan3::bi_fm_index index{genome}; // build the bidirectional index
auto cur = index.cursor(); // create a cursor
cur.extend_left("AAC"_dna4); // search the sequence "AAC"
seqan3::debug_stream << cur.path_label(genome) << '\n'; // outputs "AAC"
auto uni_it = cur.to_fwd_cursor(); // unidirectional cursor on the text "GAATTAACGAAC"
seqan3::debug_stream << uni_it.path_label(genome) << '\n'; // outputs "AAC"
// Undefined behaviour! Cannot be called on the forward cursor if the last extension on the bidirectional
// cursor was to the left:
// cur.cycle_back();
// seqan3::debug_stream << cur.last_rank() << '\n';
uni_it.extend_right('G'_dna4); // search the sequence "AACG"
seqan3::debug_stream << uni_it.path_label(genome) << '\n'; // outputs "AACG"
seqan3::debug_stream << uni_it.last_rank() << '\n'; // outputs 2
uni_it.cycle_back(); // returns false since there is no sequence "AACT" in the text.
}

Complexity

Constant.

Exceptions

No-throw guarantee.


The documentation for this class was generated from the following file:
Hide me