SeqAn3
The Modern C++ library for sequence analysis.
seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ > Class Template Reference

The SeqAn Bidirectional FM Index. More...

#include <seqan3/search/fm_index/bi_fm_index.hpp>

+ Inheritance diagram for seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >:

Public Types

Text types
using alphabet_type = typename fm_index_type::alphabet_type
 The type of the underlying character of the indexed text.
 
using size_type = typename sdsl_index_type::size_type
 Type for representing positions in the indexed text.
 
Cursor types
using cursor_type = bi_fm_index_cursor< bi_fm_index >
 The type of the bidirectional cursor.
 
using fwd_cursor_type = fm_index_cursor< fm_index_type >
 The type of the unidirectional cursor on the original text.
 
using rev_cursor_type = fm_index_cursor< rev_fm_index_type >
 The type of the unidirectional cursor on the reversed text.
 

Public Member Functions

cursor_type begin () const noexcept
 Returns a seqan3::bi_fm_index_cursor on the index that can be used for searching. More...
 
bool empty () const noexcept
 Checks whether the index is empty. More...
 
fwd_cursor_type fwd_begin () const noexcept
 Returns a unidirectional seqan3::fm_index_cursor on the original text of the bidirectional index that can be used for searching. More...
 
bool operator!= (bi_fm_index const &rhs) const noexcept
 Compares two indices. More...
 
bool operator== (bi_fm_index const &rhs) const noexcept
 Compares two indices. More...
 
rev_cursor_type rev_begin () const noexcept
 Returns a unidirectional seqan3::fm_index_cursor on the reversed text of the bidirectional index that can be used for searching. Note that because of the text being reversed, extend_right() resp. cycle_back() correspond to extend_left() resp. cycle_front() on the bidirectional index cursor. More...
 
size_type size () const noexcept
 Returns the length of the indexed text including sentinel characters. More...
 
Constructors, destructor and assignment
 bi_fm_index ()=default
 Defaulted.
 
 bi_fm_index (bi_fm_index const &)=default
 Defaulted.
 
bi_fm_indexoperator= (bi_fm_index const &)=default
 Defaulted.
 
 bi_fm_index (bi_fm_index &&)=default
 Defaulted.
 
bi_fm_indexoperator= (bi_fm_index &&)=default
 Defaulted.
 
 ~bi_fm_index ()=default
 Defaulted.
 
template<std::ranges::range text_t>
 bi_fm_index (text_t &&text)
 Constructor that immediately constructs the index given a range. The range cannot be empty. More...
 

Static Public Attributes

static constexpr text_layout text_layout_mode = text_layout_mode_
 Indicates whether index is built over a collection.
 

Friends

template<typename bi_fm_index_t >
class bi_fm_index_cursor
 

Detailed Description

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
class seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >

The SeqAn Bidirectional FM Index.

Template Parameters
alphabet_tThe alphabet type; must model seqan3::semialphabet.
text_layout_mode_Indicates whether this index works on a text collection or a single text. See seqan3::text_layout.
sdsl_index_type_The type of the underlying SDSL index, must model seqan3::sdsl_index.

The seqan3::bi_fm_index is a fast and space-efficient bidirectional string index to search strings and collections of strings. In general, we recommend to favour the seqan3::bi_fm_index over the seqan3::fm_index if you want to allow multiple errors when searching.

General information

Here is a short example on how to build an index and search a pattern using an cursor. Please note that there is a very powerful search module with a high-level interface seqan3::search that encapsulates the use of cursors.

#include <vector>
int main()
{
using seqan3::operator""_dna4;
std::vector<seqan3::dna4> genome{"ATCGATCGAAGGCTAGCTAGCTAAGGGA"_dna4};
seqan3::bi_fm_index index{genome}; // build the index
auto cur = index.begin(); // create a cursor
cur.extend_right("GG"_dna4); // search the pattern "GG"
cur.extend_left("AA"_dna4); // search the pattern "AAGG"
seqan3::debug_stream << "Number of hits: " << cur.count() << '\n'; // outputs: 2
seqan3::debug_stream << "Positions in the genome: ";
for (auto const & pos : cur.locate()) // outputs: 8, 22
seqan3::debug_stream << pos << ' ';
return 0;
}
Attention
When building an index for a single text over any alphabet, the symbol with rank 255 is reserved and may not occur in the text.

Here is an example using a collection of strings (e.g. a genome with multiple chromosomes or a protein database):

#include <vector>
int main()
{
using seqan3::operator""_dna4;
std::vector<std::vector<seqan3::dna4>> genomes{"ATCTGACGAAGGCTAGCTAGCTAAGGGA"_dna4,
"TAGCTGAAGCCATTGGCATCTGATCGGACT"_dna4,
"ACTGAGCTCGTC"_dna4,
"TGCATGCACCCATCGACTGACTG"_dna4,
"GTACGTACGTTACG"_dna4};
seqan3::bi_fm_index index{genomes}; // build the index
auto cur = index.begin(); // create a cursor
cur.extend_right("GA"_dna4); // search the pattern "GA"
cur.extend_left("CT"_dna4); // search the pattern "CTGA"
seqan3::debug_stream << "Number of hits: " << cur.count() << '\n'; // outputs: 5
seqan3::debug_stream << "Positions in the genomes: ";
for (auto const & pos : cur.locate()) // outputs: (3,16) (2,1) (1,3) (0,2) (1,19)
seqan3::debug_stream << pos << ' ';
return 0;
}
Attention
When building an index for a text collection over any alphabet, the symbols with rank 254 and 255 are reserved and may not be used in the text.

Constructor & Destructor Documentation

◆ bi_fm_index()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
template<std::ranges::range text_t>
seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::bi_fm_index ( text_t &&  text)
inline

Constructor that immediately constructs the index given a range. The range cannot be empty.

Template Parameters
text_tThe type of range to construct from; must model std::ranges::bidirectional_range.
Parameters
[in]textThe text to construct from.

Complexity

At least linear.

Member Function Documentation

◆ begin()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
cursor_type seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::begin ( ) const
inlinenoexcept

Returns a seqan3::bi_fm_index_cursor on the index that can be used for searching.

Returns
Returns a bidirectional seqan3::bi_fm_index_cursor on the index.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ empty()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
bool seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::empty ( ) const
inlinenoexcept

Checks whether the index is empty.

Returns
true if the index is empty, false otherwise.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ fwd_begin()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
fwd_cursor_type seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::fwd_begin ( ) const
inlinenoexcept

Returns a unidirectional seqan3::fm_index_cursor on the original text of the bidirectional index that can be used for searching.

Returns
Returns a unidirectional seqan3::fm_index_cursor on the index of the original text.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ operator!=()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
bool seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::operator!= ( bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ > const &  rhs) const
inlinenoexcept

Compares two indices.

Returns
true if the indices are unequal, false otherwise.

Complexity

Linear.

Exceptions

No-throw guarantee.

◆ operator==()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
bool seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::operator== ( bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ > const &  rhs) const
inlinenoexcept

Compares two indices.

Returns
true if the indices are equal, false otherwise.

Complexity

Linear.

Exceptions

No-throw guarantee.

◆ rev_begin()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
rev_cursor_type seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::rev_begin ( ) const
inlinenoexcept

Returns a unidirectional seqan3::fm_index_cursor on the reversed text of the bidirectional index that can be used for searching. Note that because of the text being reversed, extend_right() resp. cycle_back() correspond to extend_left() resp. cycle_front() on the bidirectional index cursor.

Attention
For text collections the text IDs are also reversed.
Returns
Returns a unidirectional seqan3::fm_index_cursor on the index of the reversed text.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ size()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
size_type seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::size ( ) const
inlinenoexcept

Returns the length of the indexed text including sentinel characters.

Returns
Returns the length of the indexed text including sentinel characters.

Complexity

Constant.

Exceptions

No-throw guarantee.


The documentation for this class was generated from the following file: