SeqAn3  3.0.0
The Modern C++ library for sequence analysis.
seqan3::bi_fm_index< is_collection, sdsl_index_type_ > Class Template Reference

The SeqAn Bidirectional FM Index. More...

#include <seqan3/search/fm_index/bi_fm_index.hpp>

+ Inheritance diagram for seqan3::bi_fm_index< is_collection, sdsl_index_type_ >:

Public Types

Text types
using size_type = typename sdsl_index_type::size_type
 Type for representing positions in the indexed text.
 
Cursor types
using cursor_type = bi_fm_index_cursor< bi_fm_index< is_collection_, sdsl_index_type > >
 The type of the bidirectional cursor.
 
using fwd_cursor_type = fm_index_cursor< fm_index_type >
 The type of the unidirectional cursor on the original text.
 
using rev_cursor_type = fm_index_cursor< rev_fm_index_type >
 The type of the unidirectional cursor on the reversed text.
 

Public Member Functions

cursor_type begin () const noexcept
 Returns a seqan3::bi_fm_index_cursor on the index that can be used for searching. More...
 
template<std::ranges::Range text_t>
void construct (text_t &&text)
 Constructs the index given a range. The range cannot be an rvalue (i.e. a temporary object) and has to be non-empty. More...
 
template<std::ranges::Range text_t>
void construct (text_t &&text)
 
bool empty () const noexcept
 Checks whether the index is empty. More...
 
fwd_cursor_type fwd_begin () const noexcept
 Returns a unidirectional seqan3::fm_index_cursor on the original text of the bidirectional index that can be used for searching. More...
 
bool operator!= (bi_fm_index const &rhs) const noexcept
 Compares two indices. More...
 
bool operator== (bi_fm_index const &rhs) const noexcept
 Compares two indices. More...
 
rev_cursor_type rev_begin () const noexcept
 Returns a unidirectional seqan3::fm_index_cursor on the reversed text of the bidirectional index that can be used for searching. Note that because of the text being reversed, extend_right() resp. cycle_back() correspond to extend_left() resp. cycle_front() on the bidirectional index cursor. More...
 
size_type size () const noexcept
 Returns the length of the indexed text including sentinel characters. More...
 
Constructors, destructor and assignment
 bi_fm_index ()=default
 Default constructor.
 
 bi_fm_index (bi_fm_index const &)=default
 Copy constructor.
 
bi_fm_indexoperator= (bi_fm_index const &)=default
 Copy assignment.
 
 bi_fm_index (bi_fm_index &&)=default
 Move constructor.
 
bi_fm_indexoperator= (bi_fm_index &&)=default
 Move assignment.
 
 ~bi_fm_index ()=default
 Destructor.
 
template<std::ranges::Range text_t>
 bi_fm_index (text_t &&text)
 Constructor that immediately constructs the index given a range. The range cannot be empty. More...
 

Protected Types

Index types
using sdsl_index_type = sdsl_index_type_
 The type of the underlying SDSL index for the original text.
 
using rev_sdsl_index_type = sdsl_index_type_
 The type of the underlying SDSL index for the reversed text.
 
using sdsl_char_type = typename sdsl_index_type::alphabet_type::char_type
 The type of the reduced alphabet type. (The reduced alphabet might be smaller than the original alphabet in case not all possible characters occur in the indexed text.)
 
using sdsl_sigma_type = typename sdsl_index_type::alphabet_type::sigma_type
 The type of the alphabet size of the underlying SDSL index.
 
using fm_index_type = fm_index< is_collection_, sdsl_index_type >
 The type of the underlying FM index for the original text.
 
using rev_fm_index_type = fm_index< is_collection_, sdsl_index_type >
 The type of the underlying FM index for the reversed text.
 

Protected Attributes

fm_index_type fwd_fm
 Underlying FM index for the original text.
 
rev_fm_index_type rev_fm
 Underlying FM index for the reversed text.
 
size_t sigma {0}
 The alphabet size of the text.
 

Static Protected Attributes

static constexpr bool is_collection_ {is_collection}
 Indicates whether index is built over a collection.
 

Friends

template<typename fm_index_t >
class fm_index_cursor
 

Detailed Description

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
class seqan3::bi_fm_index< is_collection, sdsl_index_type_ >

The SeqAn Bidirectional FM Index.

Template Parameters
is_collectionIndicates whether this index works on a text collection (true) or a single text (false).
sdsl_index_type_The type of the underlying SDSL index, must model seqan3::SdslIndex.

The seqan3::bi_fm_index is a fast and space-efficient bidirectional string index to search strings and collections of strings. In general, we recommend to favour the seqan3::bi_fm_index over the seqan3::fm_index if you want to allow multiple errors when searching.

General information

Here is a short example on how to build an index and search a pattern using an cursor. Please note that there is a very powerful search module with a high-level interface seqan3::search that encapsulates the use of cursors.

#include <vector>
using namespace seqan3;
int main()
{
std::vector<dna4> genome{"ATCGATCGAAGGCTAGCTAGCTAAGGGA"_dna4};
bi_fm_index index{genome}; // build the index
auto cur = index.begin(); // create a cursor
cur.extend_right("GG"_dna4); // search the pattern "GG"
cur.extend_left("AA"_dna4); // search the pattern "AAGG"
debug_stream << "Number of hits: " << cur.count() << '\n'; // outputs: 2
debug_stream << "Positions in the genome: ";
for (auto const & pos : cur.locate()) // outputs: 8, 22
debug_stream << pos << ' ';
debug_stream << '\n';
return 0;
}
Attention
When building an index for a single text over any alphabet, the symbol with rank 255 is reserved and may not occur in the text.

Here is an example using a collection of strings (e.g. a genome with multiple chromosomes or a protein database):

#include <vector>
using namespace seqan3;
int main()
{
std::vector<std::vector<dna4>> genomes{"ATCTGACGAAGGCTAGCTAGCTAAGGGA"_dna4,
"TAGCTGAAGCCATTGGCATCTGATCGGACT"_dna4,
"ACTGAGCTCGTC"_dna4,
"TGCATGCACCCATCGACTGACTG"_dna4,
"GTACGTACGTTACG"_dna4};
bi_fm_index index{genomes}; // build the index
auto cur = index.begin(); // create a cursor
cur.extend_right("GA"_dna4); // search the pattern "GA"
cur.extend_left("CT"_dna4); // search the pattern "CTGA"
debug_stream << "Number of hits: " << cur.count() << '\n'; // outputs: 5
debug_stream << "Positions in the genomes: ";
for (auto const & pos : cur.locate()) // outputs: (3,16) (2,1) (1,3) (0,2) (1,19)
debug_stream << pos << ' ';
debug_stream << '\n';
return 0;
}
Attention
When building an index for a text collection over any alphabet, the symbols with rank 254 and 255 are reserved and may not be used in the text.

Constructor & Destructor Documentation

◆ bi_fm_index()

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
template<std::ranges::Range text_t>
seqan3::bi_fm_index< is_collection, sdsl_index_type_ >::bi_fm_index ( text_t &&  text)
inline

Constructor that immediately constructs the index given a range. The range cannot be empty.

Template Parameters
text_tThe type of range to construct from; must model std::ranges::BidirectionalRange.
Parameters
[in]textThe text to construct from.

Complexity

At least linear.

Member Function Documentation

◆ begin()

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
cursor_type seqan3::bi_fm_index< is_collection, sdsl_index_type_ >::begin ( ) const
inlinenoexcept

Returns a seqan3::bi_fm_index_cursor on the index that can be used for searching.

Returns
Returns a bidirectional seqan3::bi_fm_index_cursor on the index.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ construct() [1/2]

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
template<std::ranges::Range text_t>
void seqan3::bi_fm_index< is_collection, sdsl_index_type_ >::construct ( text_t &&  text)
inline

Constructs the index given a range. The range cannot be an rvalue (i.e. a temporary object) and has to be non-empty.

Template Parameters
text_tThe type of range to construct from; must model std::ranges::BidirectionalRange.
Parameters
[in]textThe text to construct from.

Complexity

At least linear.

Exceptions

No guarantee.

◆ construct() [2/2]

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
template<std::ranges::Range text_t>
void seqan3::bi_fm_index< is_collection, sdsl_index_type_ >::construct ( text_t &&  text)
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

◆ empty()

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
bool seqan3::bi_fm_index< is_collection, sdsl_index_type_ >::empty ( ) const
inlinenoexcept

Checks whether the index is empty.

Returns
true if the index is empty, false otherwise.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ fwd_begin()

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
fwd_cursor_type seqan3::bi_fm_index< is_collection, sdsl_index_type_ >::fwd_begin ( ) const
inlinenoexcept

Returns a unidirectional seqan3::fm_index_cursor on the original text of the bidirectional index that can be used for searching.

Returns
Returns a unidirectional seqan3::fm_index_cursor on the index of the original text.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ operator!=()

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
bool seqan3::bi_fm_index< is_collection, sdsl_index_type_ >::operator!= ( bi_fm_index< is_collection, sdsl_index_type_ > const &  rhs) const
inlinenoexcept

Compares two indices.

Returns
true if the indices are unequal, false otherwise.

Complexity

Linear.

Exceptions

No-throw guarantee.

◆ operator==()

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
bool seqan3::bi_fm_index< is_collection, sdsl_index_type_ >::operator== ( bi_fm_index< is_collection, sdsl_index_type_ > const &  rhs) const
inlinenoexcept

Compares two indices.

Returns
true if the indices are equal, false otherwise.

Complexity

Linear.

Exceptions

No-throw guarantee.

◆ rev_begin()

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
rev_cursor_type seqan3::bi_fm_index< is_collection, sdsl_index_type_ >::rev_begin ( ) const
inlinenoexcept

Returns a unidirectional seqan3::fm_index_cursor on the reversed text of the bidirectional index that can be used for searching. Note that because of the text being reversed, extend_right() resp. cycle_back() correspond to extend_left() resp. cycle_front() on the bidirectional index cursor.

Attention
For text collections the text IDs are also reversed.
Returns
Returns a unidirectional seqan3::fm_index_cursor on the index of the reversed text.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ size()

template<bool is_collection = false, detail::SdslIndex sdsl_index_type_ = default_sdsl_index_type>
size_type seqan3::bi_fm_index< is_collection, sdsl_index_type_ >::size ( ) const
inlinenoexcept

Returns the length of the indexed text including sentinel characters.

Returns
Returns the length of the indexed text including sentinel characters.

Complexity

Constant.

Exceptions

No-throw guarantee.


The documentation for this class was generated from the following file: