SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
Loading...
Searching...
No Matches
seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t > Class Template Reference

The alignment algorithm type to compute standard pairwise alignment using dynamic programming. More...

#include <seqan3/alignment/pairwise/alignment_algorithm.hpp>

+ Inheritance diagram for seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >:

Public Member Functions

Constructors, destructor and assignment
constexpr alignment_algorithm ()=default
 Defaulted.
 
constexpr alignment_algorithm (alignment_algorithm const &)=default
 Defaulted.
 
constexpr alignment_algorithm (alignment_algorithm &&)=default
 Defaulted.
 
constexpr alignment_algorithmoperator= (alignment_algorithm const &)=default
 Defaulted.
 
constexpr alignment_algorithmoperator= (alignment_algorithm &&)=default
 Defaulted.
 
 ~alignment_algorithm ()=default
 Defaulted.
 
constexpr alignment_algorithm (config_t const &cfg)
 Constructs the algorithm with the passed configuration.
 
Invocation
template<indexed_sequence_pair_range indexed_sequence_pairs_t, typename callback_t >
requires (!traits_t::is_vectorised) && std::invocable<callback_t, alignment_result_t>
void operator() (indexed_sequence_pairs_t &&indexed_sequence_pairs, callback_t &&callback)
 Computes the pairwise sequence alignment for the given range over indexed sequence pairs.
 
template<indexed_sequence_pair_range indexed_sequence_pairs_t, typename callback_t >
requires traits_t
::is_vectorised &&std::invocable< callback_t, alignment_result_t > void operator() (indexed_sequence_pairs_t &&indexed_sequence_pairs, callback_t &&callback)
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 

Private Types

using alignment_column_iterator_t = std::ranges::iterator_t< alignment_column_t >
 The iterator type over the alignment column.
 
using alignment_column_t = decltype(_alignment_column_t())
 The type of an alignment column as defined by the respective matrix policy.
 
using alignment_result_t = typename traits_t::alignment_result_type
 The alignment result type.
 
using score_debug_matrix_t = std::conditional_t< traits_t::is_debug, two_dimensional_matrix< std::optional< typename traits_t::original_score_type >, std::allocator< std::optional< typename traits_t::original_score_type > >, matrix_major_order::column >, empty_type >
 The type of the score debug matrix.
 
using trace_debug_matrix_t = std::conditional_t< traits_t::is_debug, two_dimensional_matrix< std::optional< trace_directions >, std::allocator< std::optional< trace_directions > >, matrix_major_order::column >, empty_type >
 The type of the trace debug matrix.
 
using traits_t = alignment_configuration_traits< config_t >
 The alignment configuration traits type with auxiliary information extracted from the configuration type.
 

Private Member Functions

template<typename sequence1_t , typename sequence2_t >
constexpr void check_valid_band_parameter (sequence1_t &&sequence1, sequence2_t &&sequence2, align_cfg::band_fixed_size const &band)
 Checks if the band parameters are valid for the given sequences.
 
template<bool initialise_first_cell, typename sequence1_value_t , typename sequence2_t >
void compute_alignment_column (sequence1_value_t const &seq1_value, sequence2_t &&sequence2)
 Computes a single alignment column.
 
template<typename sequence1_t , typename sequence2_t >
requires (!traits_t::is_banded)
void compute_matrix (sequence1_t &sequence1, sequence2_t &sequence2)
 Compute the alignment by iterating over the alignment matrix in a column wise manner.
 
template<typename sequence1_t , typename sequence2_t >
requires (traits_t::is_banded)
void compute_matrix (sequence1_t &sequence1, sequence2_t &sequence2, align_cfg::band_fixed_size const &band)
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 
template<std::ranges::forward_range sequence1_t, std::ranges::forward_range sequence2_t, typename callback_t >
constexpr void compute_single_pair (size_t const idx, sequence1_t &&sequence1, sequence2_t &&sequence2, callback_t &callback)
 Computes the pairwise sequence alignment for a single pair of sequences.
 
template<typename sequence_range_t >
constexpr auto convert_batch_of_sequences_to_simd_vector (sequence_range_t &sequences)
 Converts a batch of sequences to a sequence of simd vectors.
 
void dump_alignment_column ()
 Dumps the current alignment matrix in the debug score matrix and if requested debug trace matrix.
 
constexpr void finalise_alignment () noexcept
 Checks the last cell, respectively column for the alignment optimum.
 
constexpr void finalise_last_cell_in_column (bool const at_last_row) noexcept
 Finalises the last cell of the current alignment column.
 
template<typename sequence1_t , typename sequence2_t >
constexpr void initialise_debug_matrices (sequence1_t &sequence1, sequence2_t &sequence2)
 Initialises the debug matrices for the given sequences.
 
template<typename sequence2_t >
auto initialise_first_alignment_column (sequence2_t &&sequence2)
 Initialises the first column of the alignment matrix.
 
template<typename index_t , typename sequence1_t , typename sequence2_t , typename callback_t >
requires (!traits_t::is_vectorised)
constexpr void make_alignment_result (index_t const idx, sequence1_t &sequence1, sequence2_t &sequence2, callback_t &callback)
 Creates a new alignment result from the current alignment optimum and for the given pair of sequences.
 
template<typename indexed_sequence_pair_range_t , typename callback_t >
requires traits_t
::is_vectorised constexpr auto make_alignment_result (indexed_sequence_pair_range_t &&index_sequence_pairs, callback_t &callback)
 Creates a new alignment result from the current alignment optimum and for the given indexed sequence range.
 

Static Private Member Functions

template<typename alignment_algorithm_t = alignment_algorithm>
static auto _alignment_column_t () -> decltype(std::declval< alignment_algorithm_t >().current_alignment_column())
 Helper function to access ((some_policy *)this)->current_alignment_column().
 

Private Attributes

alignment_column_t alignment_column {}
 Stores the currently processed alignment column.
 
alignment_column_iterator_t alignment_column_it {}
 Stores the state of the currently processed alignment column.
 
std::shared_ptr< config_t > cfg_ptr {}
 The alignment configuration stored on the heap.
 
std::pair< size_t, size_t > max_size_in_collection {}
 The maximal size within the first and the second sequence collection.
 
score_debug_matrix_t score_debug_matrix {}
 The debug matrix for the scores.
 
trace_debug_matrix_t trace_debug_matrix {}
 The debug matrix for the traces.
 

Detailed Description

template<typename config_t, typename... algorithm_policies_t>
class seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >

The alignment algorithm type to compute standard pairwise alignment using dynamic programming.

Template Parameters
config_tThe configuration type; must be of type seqan3::configuration.
algorithm_policies_tTemplate parameter pack with the policies to determine the execution of the algorithm; must be wrapped as seqan3::detail::deferred_crtp_base.

Configuration

The first template argument is the type of the alignment configuration which was used to configure the alignment algorithm type. They must be the same, otherwise it is possible that the output is not coherent with the expected result given the different configurations. The correct alignment is configured within the seqan3::detail::alignment_configurator and returned as a std::function object, which can be passed around and copied to create, for example multiple instances of the algorithm that can be executed in parallel.

Policies

This type uses multiple inheritance to configure a specific alignment computation, e.g. global or local alignments, using SIMD operations or scalar operations, computing the traceback or only the score etc.. These configurations are inherited using so-called alignment policies. An alignment policy is a type that implements a specific functionality through a common interface that is used by the alignment algorithm. These policies are also the customisation points of the algorithm which will be used to implement a specific behaviour. You can read more about the policies in Alignment policies.

Since some of the policies are augmented with traits to further refine the policy execution during the configuration, it is necessary to defer the template instantiation of the policies, which are modelled as CRTP-base classes. Therefore every added policy must be wrapped in a seqan3::detail::deferred_crtp_base class wrapper. This wrapper then is expanded during the final template instantiation of the alignment algorithm type using the corresponding template function seqan3::detail::invoke_deferred_crtp_base, which instantiates the CRTP policy types with the correct alignment algorithm type.

Constructor & Destructor Documentation

◆ alignment_algorithm()

template<typename config_t , typename... algorithm_policies_t>
constexpr seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::alignment_algorithm ( config_t const &  cfg)
inlineexplicitconstexpr

Constructs the algorithm with the passed configuration.

Parameters
cfgThe configuration to be passed to the algorithm.

Maintains a copy of the configuration object on the heap using a std::shared_ptr. In addition, the alignment state is initialised.

Member Function Documentation

◆ _alignment_column_t()

template<typename config_t , typename... algorithm_policies_t>
template<typename alignment_algorithm_t = alignment_algorithm>
static auto seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::_alignment_column_t ( ) -> decltype(std::declval< alignment_algorithm_t >().current_alignment_column())
staticprivate

Helper function to access ((some_policy *)this)->current_alignment_column().

This works around an issue of accessing inherited members during declaration time by defering the access by using two-phase lookup.

During the declaration time this class is still an incomplete type and we are technically not allowed to access its inherited members. gcc will allow that anyway, but clang rightfully doesn't.

See also
http://blog.llvm.org/2009/12/dreaded-two-phase-name-lookup.html

◆ check_valid_band_parameter()

template<typename config_t , typename... algorithm_policies_t>
template<typename sequence1_t , typename sequence2_t >
constexpr void seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::check_valid_band_parameter ( sequence1_t &&  sequence1,
sequence2_t &&  sequence2,
align_cfg::band_fixed_size const &  band 
)
inlineconstexprprivate

Checks if the band parameters are valid for the given sequences.

Template Parameters
sequence1_tThe type of the first sequence.
sequence2_tThe type of the second sequence.
Parameters
[in]sequence1The first sequence.
[in]sequence2The second sequence.
[in]bandThe band to check.
Exceptions
seqan3::invalid_alignment_configurationif the band parameters would form an invalid alignment matrix.

Checks if the given band intersects with the alignment matrix formed by the first and second sequence. For example if the lower bound of the band is larger than the size of the first sequence the band would lie outside of the alignment matrix and thus is invalid.

◆ compute_alignment_column()

template<typename config_t , typename... algorithm_policies_t>
template<bool initialise_first_cell, typename sequence1_value_t , typename sequence2_t >
void seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::compute_alignment_column ( sequence1_value_t const &  seq1_value,
sequence2_t &&  sequence2 
)
inlineprivate

Computes a single alignment column.

Template Parameters
initialise_first_cellAn explicit bool template argument indicating whether the first cell of the current alignment column needs to be initialised.
seq1_value_tThe value type of the first sequence.
sequence2_tThe type of the second sequence.
Parameters
[in]seq1_valueThe current value of the first sequence for this alignment column.
[in]sequence2The current slice of the second sequence for this alignment column.

Computes a single column within the alignment matrix. The first cell of the column is either initialised according to the initialisation policy if initialise_first_cell is true, otherwise it assumes a banded column within the matrix and computes the value accordingly.

◆ compute_matrix()

template<typename config_t , typename... algorithm_policies_t>
template<typename sequence1_t , typename sequence2_t >
requires (!traits_t::is_banded)
void seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::compute_matrix ( sequence1_t &  sequence1,
sequence2_t &  sequence2 
)
inlineprivate

Compute the alignment by iterating over the alignment matrix in a column wise manner.

Template Parameters
sequence1_tThe type of the first sequence.
sequence2_tThe type of the second sequence.
Parameters
[in]sequence1The first sequence.
[in]sequence2The second sequence.

◆ compute_single_pair()

template<typename config_t , typename... algorithm_policies_t>
template<std::ranges::forward_range sequence1_t, std::ranges::forward_range sequence2_t, typename callback_t >
constexpr void seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::compute_single_pair ( size_t const  idx,
sequence1_t &&  sequence1,
sequence2_t &&  sequence2,
callback_t &  callback 
)
inlineconstexprprivate

Computes the pairwise sequence alignment for a single pair of sequences.

Template Parameters
sequence1_tThe type of the first sequence; must model std::ranges::forward_range.
sequence2_tThe type of the second sequence; must model std::ranges::forward_range.
callback_tThe type of the callback function.
Parameters
[in]idxThe index of the current processed sequence pair.
[in]sequence1The first sequence (or packed sequences).
[in]sequence2The second sequence (or packed sequences).
[in]callbackThe callback function to be invoked with the alignment result.
Exceptions
std::bad_allocduring allocation of the alignment matrices or seqan3::invalid_alignment_configuration if an invalid configuration for the given sequences is detected.

Uses the standard dynamic programming algorithm to compute the pairwise sequence alignment.

◆ convert_batch_of_sequences_to_simd_vector()

template<typename config_t , typename... algorithm_policies_t>
template<typename sequence_range_t >
constexpr auto seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::convert_batch_of_sequences_to_simd_vector ( sequence_range_t &  sequences)
inlineconstexprprivate

Converts a batch of sequences to a sequence of simd vectors.

Template Parameters
sequence_range_tThe type of the range over sequences; must model std::ranges::forward_range.
Parameters
[in]sequencesThe batch of sequences to transform.
Returns
a sequence over simd vectors.

Expects that the size of the batch is less or equal than the number of alignments that can be computed within one simd vector. Applies an Array-of-Structures (AoS) to Structure-of-Arrays (SoA) transformation by storing one column of the batch as a simd vector.

◆ dump_alignment_column()

template<typename config_t , typename... algorithm_policies_t>
void seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::dump_alignment_column ( )
inlineprivate

Dumps the current alignment matrix in the debug score matrix and if requested debug trace matrix.

Copies the current score and if configured the current trace column into the local debug score and trace matrix. In the banded case the full matrix will be allocated with std::optional values and only the respective parts of the matrix that are present in the band are filled.

◆ finalise_last_cell_in_column()

template<typename config_t , typename... algorithm_policies_t>
constexpr void seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::finalise_last_cell_in_column ( bool const  at_last_row)
inlineconstexprprivatenoexcept

Finalises the last cell of the current alignment column.

Parameters
[in]at_last_rowA bool indicating whether the column ends in the last row of the alignment matrix.

After computing the alignment column the alignment column iterator possibly points to the last computed cell. This value is used to check for a new optimum if searching in the last row was enabled. Otherwise it does nothing. In case the alignment algorithm is run in debug mode the computed column is dumped in the debug score and trace matrix.

◆ initialise_debug_matrices()

template<typename config_t , typename... algorithm_policies_t>
template<typename sequence1_t , typename sequence2_t >
constexpr void seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::initialise_debug_matrices ( sequence1_t &  sequence1,
sequence2_t &  sequence2 
)
inlineconstexprprivate

Initialises the debug matrices for the given sequences.

Template Parameters
sequence1_tThe type of the first sequence.
sequence2_tThe type of the second sequence.

param[in] sequence1 The first sequence. param[in] sequence2 The second sequence.

Initialises the debug matrices if the alignment algorithm is running in debug mode. See seqan3::align_cfg::detail::debug for more information.

◆ initialise_first_alignment_column()

template<typename config_t , typename... algorithm_policies_t>
template<typename sequence2_t >
auto seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::initialise_first_alignment_column ( sequence2_t &&  sequence2)
inlineprivate

Initialises the first column of the alignment matrix.

Template Parameters
sequence2_tThe type of the second sequence.
Parameters
[in]sequence2The second sequence to initialise the first column for.

Initialises the alignment matrix with special initialisation functions for the origin cell and the cells in the first alignment column. The second sequence is required to get the size of the first column which can vary between banded and unbanded alignments. The value of the second sequence is actually not used during the initialisation.

◆ make_alignment_result() [1/2]

template<typename config_t , typename... algorithm_policies_t>
template<typename index_t , typename sequence1_t , typename sequence2_t , typename callback_t >
requires (!traits_t::is_vectorised)
constexpr void seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::make_alignment_result ( index_t const  idx,
sequence1_t &  sequence1,
sequence2_t &  sequence2,
callback_t &  callback 
)
inlineconstexprprivate

Creates a new alignment result from the current alignment optimum and for the given pair of sequences.

Template Parameters
callback_tThe type of the callback function.
index_tThe type of the index.
sequence1_tThe type of the first sequence.
sequence2_tThe type of the second sequence.
Parameters
[in]idxThe internal index used for this pair of sequences.
[in]sequence1The first range to get the alignment for if requested.
[in]sequence2The second range to get the alignment for if requested.
[in]callbackThe callback function to be invoked with the alignment result.

Fills the alignment result with the requested values. Depending on the selected configuration the following is extracted and/or computed:

  1. The alignment score.
  2. The end positions of the aligned range for the first and second sequence.
  3. The begin positions of the aligned range for the first and second sequence.
  4. The alignment between both sequences in the respective aligned region.

If the alignment is run in debug mode (see seqan3::align_cfg::detail::debug) the debug score and optionally trace matrix are stored in the alignment result as well.

Finally, the callback is invoked with the computed alignment result.

◆ make_alignment_result() [2/2]

template<typename config_t , typename... algorithm_policies_t>
template<typename indexed_sequence_pair_range_t , typename callback_t >
requires traits_t
::is_vectorised constexpr auto seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::make_alignment_result ( indexed_sequence_pair_range_t &&  index_sequence_pairs,
callback_t &  callback 
)
inlineconstexprprivate

Creates a new alignment result from the current alignment optimum and for the given indexed sequence range.

Template Parameters
callback_tThe type of the callback function.
indexed_sequence_pair_range_tThe type of the indexed sequence pair range.
Parameters
[in]index_sequence_pairsThe range over indexed sequence pairs.
[in]callbackThe callback function to be invoked with the alignment result.

This function is called for the vectorised algorithm. In this case the alignment state stores the results for the entire chunk of sequence pairs processed within this alignment computation. Accordingly, the chunk of sequence pairs is processed iteratively and the alignment results are added to the returned vector. Depending on the selected configuration the following is extracted and/or computed:

  1. The alignment score.
  2. The end positions of the aligned range for the first and second sequence.
  3. The begin positions of the aligned range for the first and second sequence.
  4. The alignment between both sequences in the respective aligned region.

If the alignment is run in debug mode (see seqan3::align_cfg::detail::debug) the debug score and optionally trace matrix are stored in the alignment result as well.

Finally, the callback is invoked with each computed alignment result iteratively.

◆ operator()()

template<typename config_t , typename... algorithm_policies_t>
template<indexed_sequence_pair_range indexed_sequence_pairs_t, typename callback_t >
requires (!traits_t::is_vectorised) && std::invocable<callback_t, alignment_result_t>
void seqan3::detail::alignment_algorithm< config_t, algorithm_policies_t >::operator() ( indexed_sequence_pairs_t &&  indexed_sequence_pairs,
callback_t &&  callback 
)
inline

Computes the pairwise sequence alignment for the given range over indexed sequence pairs.

Template Parameters
indexed_sequence_pairs_tThe type of indexed_sequence_pairs; must model seqan3::detail::indexed_sequence_pair_range.
callback_tThe type of the callback function that is called with the alignment result; must model std::invocable accepting one argument of type seqan3::alignment_result.
Parameters
[in]indexed_sequence_pairsA range over indexed sequence pairs to be aligned.
[in]callbackThe callback function to be invoked with each computed alignment result.
Exceptions
std::bad_allocduring allocation of the alignment matrices or seqan3::invalid_alignment_configuration if an invalid configuration for the given sequences is detected.

Uses the standard dynamic programming algorithm to compute the pairwise sequence alignment for each sequence pair. The space and runtime complexities depend on the selected configurations (see below). For every computed alignment the given callback is invoked with the respective alignment result.

Exception

Strong exception guarantee. Might throw std::bad_alloc or seqan3::invalid_alignment_configuration.

Thread-safety

Calls to this functions in a concurrent environment are not thread safe. Instead use a copy of the alignment algorithm type.

Complexity

The following table lists the runtime and space complexities for the banded and unbanded algorithm dependent on the given seqan3::align_cfg::output_* per sequence pair. Let n be the length of the first sequence, m be the length of the second sequence and k be the size of the band.

unbanded banded
runtime \( O(n*m) \) \( O(n*k) \)
space (score only) \( O(m) \) \( O(k) \)
space (end positions) \( O(m) \) \( O(k) \)
space (begin positions) \( O(n*m) \) \( O(n*k) \)
space (alignment) \( O(n*m) \) \( O(n*k) \)

The documentation for this class was generated from the following file:
Hide me