SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
Loading...
Searching...
No Matches
seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ > Class Template Reference

The SeqAn Bidirectional FM Index. More...

#include <seqan3/search/fm_index/bi_fm_index.hpp>

+ Inheritance diagram for seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >:

Public Types

Text types
using alphabet_type = typename fm_index_type::alphabet_type
 The type of the underlying character of the indexed text.
 
using size_type = typename sdsl_index_type::size_type
 Type for representing positions in the indexed text.
 
Cursor types
using cursor_type = bi_fm_index_cursor< bi_fm_index >
 The type of the bidirectional cursor.
 
using fwd_cursor_type = fm_index_cursor< fm_index_type >
 The type of the unidirectional cursor on the original text.
 

Public Member Functions

cursor_type cursor () const noexcept
 Returns a seqan3::bi_fm_index_cursor on the index that can be used for searching. .
 
bool empty () const noexcept
 Checks whether the index is empty.
 
fwd_cursor_type fwd_cursor () const noexcept
 Returns a unidirectional seqan3::fm_index_cursor on the original text of the bidirectional index that can be used for searching.
 
bool operator!= (bi_fm_index const &rhs) const noexcept
 Compares two indices.
 
bool operator== (bi_fm_index const &rhs) const noexcept
 Compares two indices.
 
size_type size () const noexcept
 Returns the length of the indexed text including sentinel characters.
 
Constructors, destructor and assignment
 bi_fm_index ()=default
 Defaulted.
 
 bi_fm_index (bi_fm_index const &)=default
 Defaulted.
 
bi_fm_indexoperator= (bi_fm_index const &)=default
 Defaulted.
 
 bi_fm_index (bi_fm_index &&)=default
 Defaulted.
 
bi_fm_indexoperator= (bi_fm_index &&)=default
 Defaulted.
 
 ~bi_fm_index ()=default
 Defaulted.
 
template<std::ranges::range text_t>
 bi_fm_index (text_t &&text)
 Constructor that immediately constructs the index given a range. The range cannot be empty.
 

Static Public Attributes

static constexpr text_layout text_layout_mode = text_layout_mode_
 Indicates whether index is built over a collection.
 

Friends

template<typename bi_fm_index_t >
class bi_fm_index_cursor
 

Detailed Description

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
class seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >

The SeqAn Bidirectional FM Index.

Template Parameters
alphabet_tThe alphabet type; must model seqan3::semialphabet.
text_layout_mode_Indicates whether this index works on a text collection or a single text. See seqan3::text_layout.
sdsl_index_type_The type of the underlying SDSL index, must model seqan3::detail::sdsl_index.

The seqan3::bi_fm_index is a fast and space-efficient bidirectional string index to search strings and collections of strings. In general, we recommend to favour the seqan3::bi_fm_index over the unidirectional seqan3::fm_index if you want to allow multiple errors when searching.

General information

Here is a short example on how to build an index and search a pattern using an cursor. Please note that there is a very powerful search module with a high-level interface seqan3::search that encapsulates the use of cursors.

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dna4> genome{"ATCGATCGAAGGCTAGCTAGCTAAGGGA"_dna4};
seqan3::bi_fm_index index{genome}; // build the index
auto cur = index.cursor(); // create a cursor
cur.extend_right("GG"_dna4); // search the pattern "GG"
cur.extend_left("AA"_dna4); // search the pattern "AAGG"
seqan3::debug_stream << "Number of hits: " << cur.count() << '\n'; // outputs: 2
seqan3::debug_stream << "Positions in the genome: ";
for (auto && pos : cur.locate()) // outputs: (0, 8), (0, 22)
seqan3::debug_stream << pos << ' ';
return 0;
}
bool extend_right() noexcept
Tries to extend the query by the smallest possible character to the right such that the query is foun...
Definition bi_fm_index_cursor.hpp:328
The SeqAn Bidirectional FM Index.
Definition bi_fm_index.hpp:58
cursor_type cursor() const noexcept
Returns a seqan3::bi_fm_index_cursor on the index that can be used for searching. .
Definition bi_fm_index.hpp:252
Provides seqan3::debug_stream and related types.
Provides seqan3::dna4, container aliases and string literals.
debug_stream_type debug_stream
A global instance of seqan3::debug_stream_type.
Definition debug_stream.hpp:37
The SeqAn namespace for literals.
The main SeqAn3 namespace.
Definition aligned_sequence_concept.hpp:26
Meta-header for the Search / FM Index submodule .
Attention
When building an index for a single text over any alphabet, the symbol with rank 255 is reserved and may not occur in the text.

Here is an example using a collection of strings (e.g. a genome with multiple chromosomes or a protein database):

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
std::vector<std::vector<seqan3::dna4>> genomes{"ATCTGACGAAGGCTAGCTAGCTAAGGGA"_dna4,
"TAGCTGAAGCCATTGGCATCTGATCGGACT"_dna4,
"ACTGAGCTCGTC"_dna4,
"TGCATGCACCCATCGACTGACTG"_dna4,
"GTACGTACGTTACG"_dna4};
seqan3::bi_fm_index index{genomes}; // build the index
auto cur = index.cursor(); // create a cursor
cur.extend_right("GA"_dna4); // search the pattern "GA"
cur.extend_left("CT"_dna4); // search the pattern "CTGA"
seqan3::debug_stream << "Number of hits: " << cur.count() << '\n'; // outputs: 5
seqan3::debug_stream << "Positions in the genomes: ";
for (auto && pos : cur.locate()) // outputs: (3,16) (2,1) (1,3) (0,2) (1,19)
seqan3::debug_stream << pos << ' ';
return 0;
}
Attention
When building an index for a text collection over any alphabet, the symbols with rank 254 and 255 are reserved and may not be used in the text.

Constructor & Destructor Documentation

◆ bi_fm_index()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
template<std::ranges::range text_t>
seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::bi_fm_index ( text_t &&  text)
inline

Constructor that immediately constructs the index given a range. The range cannot be empty.

Template Parameters
text_tThe type of range to construct from; must model std::ranges::bidirectional_range.
Parameters
[in]textThe text to construct from.

Complexity

At least linear.

Member Function Documentation

◆ cursor()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
cursor_type seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::cursor ( ) const
inlinenoexcept

Returns a seqan3::bi_fm_index_cursor on the index that can be used for searching. .

Returns
Returns a bidirectional seqan3::bi_fm_index_cursor on the index.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ empty()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
bool seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::empty ( ) const
inlinenoexcept

Checks whether the index is empty.

Returns
true if the index is empty, false otherwise.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ fwd_cursor()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
fwd_cursor_type seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::fwd_cursor ( ) const
inlinenoexcept

Returns a unidirectional seqan3::fm_index_cursor on the original text of the bidirectional index that can be used for searching.

Returns
Returns a unidirectional seqan3::fm_index_cursor on the index of the original text.

Complexity

Constant.

Exceptions

No-throw guarantee.

◆ operator!=()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
bool seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::operator!= ( bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ > const &  rhs) const
inlinenoexcept

Compares two indices.

Returns
true if the indices are unequal, false otherwise.

Complexity

Linear.

Exceptions

No-throw guarantee.

◆ operator==()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
bool seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::operator== ( bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ > const &  rhs) const
inlinenoexcept

Compares two indices.

Returns
true if the indices are equal, false otherwise.

Complexity

Linear.

Exceptions

No-throw guarantee.

◆ size()

template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
size_type seqan3::bi_fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >::size ( ) const
inlinenoexcept

Returns the length of the indexed text including sentinel characters.

Returns
Returns the length of the indexed text including sentinel characters.

Complexity

Constant.

Exceptions

No-throw guarantee.


The documentation for this class was generated from the following file:
Hide me