SeqAn3 3.4.1-rc.1
The Modern C++ library for sequence analysis.
The SeqAn FM Index.
#include <seqan3/search/fm_index/fm_index.hpp>
[Inheritance diagram for seqan3::fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >]
Public Member Functions | |
| cursor_type | cursor () const noexcept |
| Returns a seqan3::fm_index_cursor on the index that can be used for searching. The cursor points to the root node of the implicit suffix tree. | |
| bool | empty () const noexcept |
| Checks whether the index is empty. | |
| bool | operator!= (fm_index const &rhs) const noexcept |
| Compares two indices. | |
| bool | operator== (fm_index const &rhs) const noexcept |
| Compares two indices. | |
| template<cereal_archive archive_t> | |
| void | serialize (archive_t &archive) |
| Serialisation support function. | |
| size_type | size () const noexcept |
| Returns the length of the indexed text including sentinel characters. | |
Constructors, destructor and assignment | |
| fm_index ()=default | |
| Defaulted. | |
| fm_index (fm_index const &rhs) | |
| When copy constructing, also update internal data structures. | |
| fm_index (fm_index &&rhs) | |
| When move constructing, also update internal data structures. | |
| fm_index & | operator= (fm_index rhs) |
| When copy/move assigning, also update internal data structures. | |
| ~fm_index ()=default | |
| Defaulted. | |
| template<std::ranges::bidirectional_range text_t> | |
| fm_index (text_t &&text) | |
| Constructor that immediately constructs the index given a range. The range cannot be empty. | |
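The copy and move operations above note that internal data structures are updated as well. One plausible reason, sketched below using only the standard library, is that a support structure keeps a pointer to a bit vector owned by the index, so the pointer has to be re-seated after the data is copied or moved. All names here (`rank_support_sketch`, `index_sketch`, `points_to_own_data`) are hypothetical and are not part of SeqAn3; this is an illustration of the pattern, not the library's code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a rank support that only stores a pointer to the bit vector.
struct rank_support_sketch
{
    std::vector<bool> const * bits{nullptr};

    void set_vector(std::vector<bool> const * v) noexcept { bits = v; }

    // Number of set bits in [0, pos).
    std::size_t rank(std::size_t pos) const
    {
        assert(bits != nullptr && pos <= bits->size());
        std::size_t count{0};
        for (std::size_t i = 0; i < pos; ++i)
            count += (*bits)[i] ? 1 : 0;
        return count;
    }
};

// Hypothetical owner type: holds the bit vector and a support structure pointing at it.
class index_sketch
{
public:
    index_sketch() = default;

    explicit index_sketch(std::vector<bool> bits) : text_begin{std::move(bits)}
    {
        text_begin_rs.set_vector(&text_begin);
    }

    // Copy: duplicate the data, then point the support at our own copy.
    index_sketch(index_sketch const & rhs) : text_begin{rhs.text_begin}, text_begin_rs{rhs.text_begin_rs}
    {
        text_begin_rs.set_vector(&text_begin);
    }

    // Move: take over the data, then re-seat the pointer for the same reason.
    index_sketch(index_sketch && rhs) noexcept :
        text_begin{std::move(rhs.text_begin)},
        text_begin_rs{rhs.text_begin_rs}
    {
        text_begin_rs.set_vector(&text_begin);
    }

    // Copy-and-swap: the by-value parameter serves both copy and move assignment.
    index_sketch & operator=(index_sketch rhs) noexcept
    {
        text_begin.swap(rhs.text_begin);
        text_begin_rs.set_vector(&text_begin);
        return *this;
    }

    std::size_t rank(std::size_t pos) const { return text_begin_rs.rank(pos); }

    bool points_to_own_data() const noexcept { return text_begin_rs.bits == &text_begin; }

private:
    std::vector<bool> text_begin{};
    rank_support_sketch text_begin_rs{};
};
```

Taking the right-hand side of `operator=` by value mirrors the signature `operator= (fm_index rhs)` listed above: a single overload covers copy and move assignment, and the pointer fix-up happens in one place.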
Static Public Attributes | |
| static constexpr text_layout | text_layout_mode = text_layout_mode_ |
| Indicates whether the index is built over a collection. | |
Private Member Functions | |
| template<std::ranges::range text_t> requires (text_layout_mode_ == text_layout::single) | |
| void | construct (text_t &&text) |
| Constructs the index given a range. The range cannot be an rvalue (i.e. a temporary object) and has to be non-empty. | |
| template<std::ranges::range text_t> requires (text_layout_mode_ == text_layout::collection) | |
| void | construct (text_t &&text, bool reverse=false) |
| This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
Static Private Member Functions | |
| template<typename output_it_t , typename sequence_t > | |
| static output_it_t | copy_sequence_ranks_shifted_by_one (output_it_t output_it, sequence_t &&sequence) |
| Eagerly convert sequence into ranks, shift by one and copy them into output_it. | |
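As a rough illustration of what "convert into ranks, shift by one and copy" can look like, the sketch below maps each character of a DNA-like sequence to its rank and writes `rank + 1` to an output iterator. The shift presumably keeps rank 0 free for a sentinel; that motivation is an assumption here. The helper names `to_rank_sketch` and `copy_ranks_shifted_by_one_sketch`, as well as the four-letter alphabet, are hypothetical and stand in for SeqAn3's alphabet machinery.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string_view>
#include <vector>

// Hypothetical rank function for a DNA-like alphabet: A=0, C=1, G=2, T=3.
inline std::uint8_t to_rank_sketch(char c)
{
    switch (c)
    {
    case 'A':
        return 0;
    case 'C':
        return 1;
    case 'G':
        return 2;
    case 'T':
        return 3;
    default:
        assert(false && "character outside the sketch alphabet");
        return 0;
    }
}

// Eagerly writes rank + 1 for every character and returns the advanced output iterator.
template <typename output_it_t>
output_it_t copy_ranks_shifted_by_one_sketch(output_it_t output_it, std::string_view sequence)
{
    for (char c : sequence)
        *output_it++ = static_cast<std::uint8_t>(to_rank_sketch(c) + 1);
    return output_it;
}
```

Returning the output iterator lets a caller append several sequences back to back, for example when concatenating the texts of a collection.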
Private Attributes | |
| sdsl_index_type | index |
| Underlying index from the SDSL. | |
| seqan3::contrib::sdsl::sd_vector | text_begin |
| Bitvector storing begin positions for collections. | |
| seqan3::contrib::sdsl::rank_support_sd< 1 > | text_begin_rs |
| Rank support for text_begin. | |
| seqan3::contrib::sdsl::select_support_sd< 1 > | text_begin_ss |
| Select support for text_begin. | |
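To make the role of the three attributes above more concrete, here is a small standard-library sketch of how a bit vector of text begin positions, together with rank and select queries, can translate a position in the concatenated collection into a (text, offset) pair. The type `text_begin_sketch`, its member names, and the assumption of exactly one delimiter position after each text are all hypothetical; SDSL's `sd_vector` is a compressed sparse bit vector with dedicated support structures, whereas this sketch scans linearly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct text_begin_sketch
{
    // bits[i] is true if and only if a text starts at position i of the concatenation.
    std::vector<bool> bits{};

    // Assumes one delimiter position after each text of the collection.
    explicit text_begin_sketch(std::vector<std::string> const & texts)
    {
        std::size_t total{0};
        for (auto const & text : texts)
            total += text.size() + 1;
        bits.assign(total, false);

        std::size_t pos{0};
        for (auto const & text : texts)
        {
            bits[pos] = true;
            pos += text.size() + 1;
        }
    }

    // Number of set bits in [0, pos).
    std::size_t rank(std::size_t pos) const
    {
        assert(pos <= bits.size());
        std::size_t count{0};
        for (std::size_t i = 0; i < pos; ++i)
            count += bits[i] ? 1 : 0;
        return count;
    }

    // Position of the i-th set bit, counting from 1.
    std::size_t select(std::size_t i) const
    {
        assert(i >= 1);
        for (std::size_t pos = 0; pos < bits.size(); ++pos)
            if (bits[pos] && --i == 0)
                return pos;
        assert(false && "fewer set bits than requested");
        return bits.size();
    }

    // Maps a position in the concatenation to (text id, offset within that text).
    std::pair<std::size_t, std::size_t> to_text_position(std::size_t pos) const
    {
        assert(pos < bits.size());
        std::size_t const text_id = rank(pos + 1) - 1;
        return {text_id, pos - select(text_id + 1)};
    }
};
```

For the collection `{"ACG", "TT", "G"}` the begin positions are 0, 4 and 7, so position 5 in the concatenation maps to text 1, offset 1.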
Member types | |
| using | sdsl_index_type = sdsl_index_type_ |
| The type of the underlying SDSL index. | |
| using | sdsl_char_type = typename sdsl_index_type::alphabet_type::char_type |
| The type of the reduced alphabet type. (The reduced alphabet might be smaller than the original alphabet in case not all possible characters occur in the indexed text.) | |
| using | sdsl_sigma_type = typename sdsl_index_type::alphabet_type::sigma_type |
| The type of the alphabet size of the underlying SDSL index. | |
| using | alphabet_type = alphabet_t |
| The type of the underlying character of the indexed text. | |
| using | size_type = typename sdsl_index_type::size_type |
| Type for representing positions in the indexed text. | |
| using | cursor_type = fm_index_cursor< fm_index > |
| The type of the (unidirectional) cursor. | |
The SeqAn FM Index.
| alphabet_t | The alphabet type; must model seqan3::semialphabet. |
| text_layout_mode_ | Indicates whether this index works on a text collection or a single text. See seqan3::text_layout. |
| sdsl_index_type_ | The type of the underlying SDSL index, must model seqan3::sdsl_index. |
The seqan3::fm_index is a fast and space-efficient string index to search strings and collections of strings.
A typical workflow builds the index once from the text and then searches a pattern using a cursor. Note that the search module also provides a high-level interface, seqan3::search, that encapsulates the use of cursors.
The index can also be built over a collection of strings (e.g. a genome with multiple chromosomes or a protein database).
The underlying implementation of the FM Index (rank data structure, sampling rates, etc.) can be specified by passing a different SDSL index type as the third template parameter.
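For readers unfamiliar with the data structure, the following is a deliberately naive, standard-library-only sketch of the textbook FM index idea: build the Burrows-Wheeler transform of the text plus a sentinel, then count pattern occurrences with backward search. It is not SeqAn3 code and does not reflect how seqan3::fm_index or the SDSL implement the index; the class `naive_fm_index_sketch` and its members are hypothetical, the suffix array is built by plain sorting, and the occurrence query is a linear scan where a real index would use a rank data structure. The sketch assumes the text contains no `'\0'` character, because that value serves as the sentinel.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <string_view>
#include <vector>

class naive_fm_index_sketch
{
public:
    explicit naive_fm_index_sketch(std::string text)
    {
        assert(!text.empty() && "the text must not be empty");
        text.push_back('\0'); // sentinel, smaller than every other character
        std::size_t const n = text.size();

        // Naive suffix array: sort all suffix start positions lexicographically.
        std::vector<std::size_t> sa(n);
        std::iota(sa.begin(), sa.end(), std::size_t{0});
        std::sort(sa.begin(),
                  sa.end(),
                  [&text](std::size_t a, std::size_t b)
                  {
                      return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
                  });

        // Burrows-Wheeler transform: the character preceding each sorted suffix.
        bwt.resize(n);
        for (std::size_t i = 0; i < n; ++i)
            bwt[i] = text[(sa[i] + n - 1) % n];

        // less_than[c]: number of characters in the text that are smaller than c.
        std::array<std::size_t, 256> counts{};
        for (char c : text)
            ++counts[static_cast<unsigned char>(c)];
        std::size_t sum{0};
        for (std::size_t c = 0; c < 256; ++c)
        {
            less_than[c] = sum;
            sum += counts[c];
        }
    }

    // Length of the indexed text including the sentinel.
    std::size_t size() const noexcept { return bwt.size(); }

    // Backward search: narrow the suffix array interval one pattern character at a time.
    std::size_t count(std::string_view pattern) const
    {
        std::size_t lo{0};
        std::size_t hi{bwt.size()};
        for (auto it = pattern.rbegin(); it != pattern.rend() && lo < hi; ++it)
        {
            auto const c = static_cast<unsigned char>(*it);
            lo = less_than[c] + occ(*it, lo);
            hi = less_than[c] + occ(*it, hi);
        }
        return hi > lo ? hi - lo : 0;
    }

private:
    // Occurrences of c in bwt[0, end); a real index answers this with a rank structure.
    std::size_t occ(char c, std::size_t end) const
    {
        return static_cast<std::size_t>(std::count(bwt.begin(), bwt.begin() + static_cast<std::ptrdiff_t>(end), c));
    }

    std::string bwt{};
    std::array<std::size_t, 256> less_than{};
};
```

With the text `"banana"`, `size()` is 7 because of the sentinel, and `count("ana")` is 2. Each backward search step touches only the BWT and the `less_than` table, which is why the original text does not need to be kept once the index is built.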
| fm_index (text_t &&text) | inline explicit |
Constructor that immediately constructs the index given a range. The range cannot be empty.
| text_t | The type of range to construct from; must model std::ranges::bidirectional_range. |
| [in] | text | The text to construct from. |
| void construct (text_t &&text) | inline private |
Constructs the index given a range. The range cannot be an rvalue (i.e. a temporary object) and has to be non-empty.
| text_t | The type of range to construct from; must model std::ranges::bidirectional_range. |
| [in] | text | The text to construct from. |
Exceptions: No guarantee.
| cursor_type cursor () const noexcept | inline noexcept |
Returns a seqan3::fm_index_cursor on the index that can be used for searching. The cursor points to the root node of the implicit suffix tree.
Complexity: Constant.
Exceptions: No-throw guarantee.
| bool empty () const noexcept | inline noexcept |
Checks whether the index is empty.
Returns: true if the index is empty, false otherwise.
Complexity: Constant.
Exceptions: No-throw guarantee.
| bool operator!= (fm_index const &rhs) const noexcept | inline noexcept |
Compares two indices.
Returns: true if the indices are unequal, false otherwise.
Complexity: Linear.
Exceptions: No-throw guarantee.
| bool operator== (fm_index const &rhs) const noexcept | inline noexcept |
Compares two indices.
Returns: true if the indices are equal, false otherwise.
Complexity: Linear.
Exceptions: No-throw guarantee.
| void serialize (archive_t &archive) | inline |
Serialisation support function.
| archive_t | Type of archive; must satisfy seqan3::cereal_archive. |
| archive | The archive being serialised from/to. |
| size_type size () const noexcept | inline noexcept |
Returns the length of the indexed text including sentinel characters.
Complexity: Constant.
Exceptions: No-throw guarantee.