SeqAn3  3.0.3
The Modern C++ library for sequence analysis.
Pairwise

Provides the algorithmic components for the computation of pairwise alignments. More...

+ Collaboration diagram for Pairwise:

Classes

class  seqan3::alignment_result< alignment_result_value_t >
 Stores the alignment results and gives access to score, alignment and the front and end positionss. More...
 

Functions

template<typename sequence_t , typename alignment_config_t >
constexpr auto seqan3::align_pairwise (sequence_t &&seq, alignment_config_t const &config)
 Computes the pairwise alignment for a pair of sequences or a range over sequence pairs. More...
 

Detailed Description

Provides the algorithmic components for the computation of pairwise alignments.

See also
Alignment

Function Documentation

◆ align_pairwise()

template<typename sequence_t , typename alignment_config_t >
constexpr auto seqan3::align_pairwise ( sequence_t &&  seq,
alignment_config_t const &  config 
)
constexpr

Computes the pairwise alignment for a pair of sequences or a range over sequence pairs.

Template Parameters
sequence_tThe type of sequence pairs (see details for more information on the type constraints).
alignment_config_tThe type of the alignment configuration; must be a seqan3::configuration.
Parameters
[in]seqA tuple with two sequences or a range over such tuples.
[in]configThe object storing the alignment configuration.
Returns
A seqan3::algorithm_result_generator_range.

This function computes the pairwise alignment for the given sequences. During the setup phase the most efficient implementation is selected depending on the configurations stored in the given seqan3::configuration object. The configuration also holds settings for parallel or vectorised execution.

Compute a single alignment

In cases where only a single alignment is to be computed, the two sequences can be passed as a pair. The pair can be any class template that models the seqan3::tuple_like concept. The tuple elements must model std::ranges::viewable_range and std::copy_constructible. The following example demonstrates how an alignment is computed from a std::pair of sequences.

std::pair p{"ACGTAGC"_dna4, "AGTACGACG"_dna4};
auto result = seqan3::align_pairwise(p, config);
constexpr auto align_pairwise(sequence_t &&seq, alignment_config_t const &config)
Computes the pairwise alignment for a pair of sequences or a range over sequence pairs.
Definition: align_pairwise.hpp:135

Alternatively, you can use std::tie as shown in the example below:

std::vector vec{"ACCA"_dna4, "ATTA"_dna4};
auto result = seqan3::align_pairwise(std::tie(vec[0], vec[1]), config);
T tie(T... args)

Compute multiple alignments

In many cases one needs to compute multiple pairwise alignments. Accordingly, the align_pairwise interface allows to pass a range over sequence pairs. The alignment algorithm will be configured only once for all submitted alignments and then computes the alignments sequentially or in parallel depending on the given configuration. Since there is always a certain amount of initial setup involving runtime checks required, it is advisable to pass many sequence-pairs to this algorithm instead of repeatedly calling it with a single pair.

Accessing the alignment results

For each sequence pair one or more seqan3::alignment_results can be computed. The seqan3::align_pairwise function returns an seqan3::algorithm_result_generator_range which can be used to iterate over the alignments. If the vectorised configurations are omitted the alignments are computed on-demand when iterating over the results. In case of a parallel execution all alignments are computed at once in parallel when calling begin on the associated seqan3::algorithm_result_generator_range.

The following snippets demonstrate the single element and the range based interface.

#include <vector>
int main()
{
using namespace seqan3::literals;
Provides seqan3::align_cfg::edit_scheme.
Provides pairwise alignment function.
Provides seqan3::debug_stream and related types.
Provides seqan3::dna4, container aliases and string literals.
The SeqAn namespace for literals.
std::vector vec{std::pair{"AGTGCTACG"_dna4, "ACGTGCGACTAG"_dna4},
std::pair{"AGTAGACTACG"_dna4, "ACGTACGACACG"_dna4},
std::pair{"AGTTACGAC"_dna4, "AGTAGCGATCG"_dna4}};
// Compute the alignment of a single pair.
for (auto const & res : seqan3::align_pairwise(std::tie(vec[0].first, vec[0].second), edit_config))
seqan3::debug_stream << "The score: " << res.score() << "\n";
// Compute the alignment over a range of pairs.
for (auto const & res : seqan3::align_pairwise(vec, edit_config))
seqan3::debug_stream << "The score: " << res.score() << "\n";
Sets the global alignment method.
Definition: align_config_method.hpp:107
constexpr configuration edit_scheme
Shortcut for edit distance configuration.
Definition: align_config_edit.hpp:51
debug_stream_type debug_stream
A global instance of seqan3::debug_stream_type.
Definition: debug_stream.hpp:42
}

Exception

Strong exception guarantee.

Might throw std::bad_alloc if it fails to allocate the alignment matrix or seqan3::invalid_alignment_configuration if the configuration is invalid. Throws std::runtime_error if seqan3::align_cfg::parallel has been specified without a thread_count value.

Complexity

The complexity depends on the configured algorithm. For the edit distance the following worst case over two input sequences of size $ N $ can be assumed:

Computing Runtime Space
score $ O(N^2/w) $ $ O(w) $
back coordinate $ O(N^2/w) $ $ O(w) $
front coordinate $ O(N^2/w) $ $ O(N^2/w) $
alignment $ O(N^2/w) $ $ O(N^2/w) $

$ w $ is the size of a machine word.

For all other algorithms that compute the standard dynamic programming algorithm the following worst case holds:

Computing Runtime Space
score $ O(N^2) $ $ O(N) $
back coordinate $ O(N^2) $ $ O(N) $
front coordinate $ O(N^2) $ $ O(N^2) $
alignment $ O(N^2) $ $ O(N^2) $

In the banded case the worst case is modified as follows:

Computing Runtime Space
score $ O(N*k) $ $ O(k) $
back coordinate $ O(N*k) $ $ O(k) $
front coordinate $ O(N*k) $ $ O(N*k) $
alignment $ O(N*k) $ $ O(N*k) $

$ k $ is the size of the band.

Thread safety

This function is re-entrant, i.e. it is always safe to call in parallel with different inputs. It is thread-safe, i.e. it is safe to call in parallel with the same input under the condition that the input sequences do not change when being iterated over.