fn() globalAlignmentScore
Computes the best global pairwise alignment score.

Defined in <seqan/align.h>, <seqan/align_parallel.h>
Signature
TScoreCollection globalAlignmentScore([exec,] seqHCollection, seqVCollection, scoringScheme[, alignConfig][, lowerDiag, upperDiag]);
TScoreCollection globalAlignmentScore([exec,] seqH, seqVCollection, scoringScheme[, alignConfig][, lowerDiag, upperDiag]);
TScoreVal globalAlignmentScore(strings, scoringScheme[, alignConfig][, lowerDiag, upperDiag][, algorithmTag]);
TScoreVal globalAlignmentScore(seqH, seqV, {MyersBitVector | MyersHirschberg});
TScoreVal globalAlignmentScore(strings, {MyersBitVector | MyersHirschberg});

Parameters

exec The ExecutionPolicy used for the alignment algorithm.
seqHCollection A collection of sequences aligned pairwise against the respective sequence in seqVCollection.
seqVCollection A collection of sequences aligned pairwise against the respective sequence in seqHCollection or with seqH.
strings A StringSet containing two sequences. Type: StringSet.
seqH A single sequence to be aligned against seqV or seqVCollection.
seqV A single sequence to be aligned against seqH.
alignConfig The AlignConfig to use for the alignment. Type: AlignConfig
scoringScheme The scoring scheme to use for the alignment. Note that the user is responsible for ensuring that the scoring scheme is compatible with algorithmTag. Type: Score.
lowerDiag Optional lower diagonal. Types: int
upperDiag Optional upper diagonal. Types: int
algorithmTag The Tag for picking the alignment algorithm. Types: PairwiseLocalAlignmentAlgorithms.
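
To make the effect of lowerDiag and upperDiag concrete, the following sketch counts how many cells of the dynamic programming matrix fall inside a band. It is an illustration only: the helper cellsInBand is hypothetical and not part of SeqAn, and it assumes the convention that the main diagonal is 0 and that a cell's diagonal is its column (position in the horizontal sequence) minus its row (position in the vertical sequence).

```cpp
#include <cstddef>

// Hypothetical helper, not part of SeqAn.
// Counts the cells of a (lenV + 1) x (lenH + 1) DP matrix that lie inside the band
// lowerDiag <= (column - row) <= upperDiag. The diagonal convention is an assumption.
std::size_t cellsInBand(std::size_t lenH, std::size_t lenV, int lowerDiag, int upperDiag)
{
    std::size_t count = 0;
    for (std::size_t row = 0; row <= lenV; ++row)
    {
        for (std::size_t col = 0; col <= lenH; ++col)
        {
            std::ptrdiff_t diag = static_cast<std::ptrdiff_t>(col) - static_cast<std::ptrdiff_t>(row);
            if (diag >= lowerDiag && diag <= upperDiag)
                ++count;
        }
    }
    return count;
}
```

Under this convention, a band from -1 to 1 on two sequences of length 4 covers 13 of the 25 matrix cells, which is where the saving of a banded computation comes from.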

Return Values

TScoreVal Score value of the resulting alignment (Metafunction: Value of the type of scoringScheme).
TScoreCollection A collection of computed scores, one for every aligned sequence pair. The collection is a String over the score type of the passed scoring scheme (Metafunction: Value of the type of scoringScheme).

Detailed Description

Warning:

There are currently some limitations in the use of the execution policy. The banded version is currently supported only for the following execution modes: sequential, parallel, vectorized and parallel+vectorized (that is, not for the wave-front modes). In addition, the banded version only works for collections if all sequences within one collection have the same size.

Note:

To get the performance advantage of vectorised execution, the code has to be compiled with the respective CPU flags. For most Intel-based CPUs, the compiler flag -msse4 can be used with gcc and clang to enable vectorisation for 128 bit wide registers. For CPUs that support wider registers, please read the respective documentation on how to select the correct compilation flags.

This function does not perform the (linear time) traceback step after the (mostly quadratic time) dynamic programming step; it only returns the score. Note that Myers' bit-vector algorithm on its own does not compute an alignment (an alignment is only available in the Myers-Hirschberg variant), but its score can be computed using globalAlignmentScore. globalAlignmentScore can be used either with two sequences or with two collections of sequences of equal size.
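
The difference between computing a score and computing an alignment can be sketched in plain C++. The function below is a minimal illustration, not SeqAn's implementation: globalScoreOnly is a hypothetical name, it uses a linear gap cost rather than the affine scheme of the examples further down, and it keeps only two matrix rows, which is exactly why no traceback is possible afterwards.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of SeqAn.
// Score-only global alignment (Needleman-Wunsch) with a linear gap cost.
// Only the previous and the current matrix row are stored.
int globalScoreOnly(std::string const & seqH, std::string const & seqV,
                    int match, int mismatch, int gap)
{
    std::vector<int> prev(seqH.size() + 1);
    std::vector<int> curr(seqH.size() + 1);

    // First row: a prefix of seqH aligned against gaps only.
    for (std::size_t j = 0; j <= seqH.size(); ++j)
        prev[j] = static_cast<int>(j) * gap;

    for (std::size_t i = 1; i <= seqV.size(); ++i)
    {
        curr[0] = static_cast<int>(i) * gap;  // first column: gaps only
        for (std::size_t j = 1; j <= seqH.size(); ++j)
        {
            int diag = prev[j - 1] + (seqH[j - 1] == seqV[i - 1] ? match : mismatch);
            curr[j] = std::max({diag, prev[j] + gap, curr[j - 1] + gap});
        }
        std::swap(prev, curr);  // the finished row becomes the previous row
    }
    return prev[seqH.size()];
}
```

With match 2, mismatch -2 and gap -1, the sequences CGATT and CGAAATT score 8 in this sketch: five matches and two gap positions.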

The same limitations to algorithms as in globalAlignment apply. Furthermore, the MyersBitVector and MyersHirschberg variants can only be used without any other parameter.

Execution policies

SeqAn supports parallel and vectorised execution of pairwise alignments. This additional interface takes not just a single pair of sequences but two collections of sequences, which must have the same size. The collection-based interface additionally allows an ExecutionPolicy to be specified. This execution policy selects the multi-threaded implementation, the vectorised implementation, or a combination of both for the alignment algorithm. SeqAn implements an inter-sequence vectorisation scheme, which means that x alignments are computed in parallel in one SIMD vector, where x is the number of elements a vector can hold. This depends on the SIMD vector width supported by the architecture (128 bit, 256 bit or 512 bit) and on the selected score type, e.g. int16_t. For example, on a CPU architecture that supports SSE4 and with a score type of int16_t, 128/16 = 8 alignments can be computed in parallel on a single core.
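
The lane arithmetic above can be checked with a few lines of C++. This is only a sketch of the calculation: alignmentsPerVector is a hypothetical helper and not a SeqAn function, and it assumes 8-bit bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of SeqAn.
// Number of alignments that fit into one SIMD vector of vectorBits bits
// when every alignment occupies one lane of type TScore (assumes 8-bit bytes).
template <typename TScore>
constexpr std::size_t alignmentsPerVector(std::size_t vectorBits)
{
    return vectorBits / (sizeof(TScore) * 8);
}

// 128 bit registers with int16_t scores: 128/16 = 8 alignments per vector.
static_assert(alignmentsPerVector<std::int16_t>(128) == 8, "8 lanes expected");
// Wider registers or narrower score types increase the count accordingly.
static_assert(alignmentsPerVector<std::int16_t>(256) == 16, "16 lanes expected");
static_assert(alignmentsPerVector<std::int32_t>(512) == 16, "16 lanes expected");
```

A smaller score type packs more alignments into a vector, but it also limits the range of scores that can be represented, so the score type should be chosen with the expected sequence lengths in mind.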

In addition, the execution policy can be configured for multi-threaded execution, such that either chunks of sequence pairs from the initial collection are spawned and executed on different threads or an intra-sequence parallelization is used to parallelize a single alignment. In total the following execution modes are possible: sequential, parallel, wave-front, vectorized, parallel+vectorized and wave-front+vectorized.
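
The chunking of sequence pairs mentioned above can be pictured as follows. This is a sketch of the general idea only: makeChunks is a hypothetical helper, and how SeqAn actually chooses chunk sizes and assigns chunks to threads is not described here, so the chunk size is simply a parameter.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of SeqAn.
// Splits the pair indices [0, numPairs) into contiguous half-open ranges
// [begin, end) of at most chunkSize pairs. Each range could then be handed
// to a different thread.
std::vector<std::pair<std::size_t, std::size_t>>
makeChunks(std::size_t numPairs, std::size_t chunkSize)
{
    std::vector<std::pair<std::size_t, std::size_t>> chunks;
    if (chunkSize == 0)
        return chunks;  // no valid partition for an empty chunk size

    for (std::size_t begin = 0; begin < numPairs; begin += chunkSize)
        chunks.emplace_back(begin, std::min(begin + chunkSize, numPairs));

    return chunks;
}
```

For 100 sequence pairs and a chunk size of 8, this yields 13 chunks, the last of which covers only the pairs 96 to 99.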

The wave-front execution can be selected via the WavefrontExecutionPolicy, which can also be combined with vectorized execution. The wave-front execution parallelizes a single pairwise alignment, while the standard Parallel specialization only parallelizes over the sequence set via chunking.
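
The general wave-front pattern can be sketched as follows: the alignment matrix is partitioned into blocks, and all blocks on the same anti-diagonal are independent of each other, because a block only needs its upper, left and upper-left neighbours. The code is an illustration of that pattern under these assumptions; wavefrontOrder and Block are hypothetical names and this is not SeqAn's scheduler.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Block = std::pair<std::size_t, std::size_t>;  // (block row, block column)

// Hypothetical helper, not part of SeqAn.
// Groups the blocks of a blockRows x blockCols grid into waves. Wave k holds
// all blocks with row + column == k; blocks within one wave can be computed
// concurrently once all earlier waves are finished.
std::vector<std::vector<Block>> wavefrontOrder(std::size_t blockRows, std::size_t blockCols)
{
    std::vector<std::vector<Block>> waves;
    if (blockRows == 0 || blockCols == 0)
        return waves;

    waves.resize(blockRows + blockCols - 1);
    for (std::size_t r = 0; r < blockRows; ++r)
        for (std::size_t c = 0; c < blockCols; ++c)
            waves[r + c].emplace_back(r, c);

    return waves;
}
```

For two sequences of length 10000 and a block size of 500, the grid has 20 x 20 blocks, which gives 39 waves with at most 20 independent blocks in the widest one.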

The following example shows a multi-threaded and vectorised execution of global alignments for two collections of sequences:

#include <iostream>

#include <seqan/align_parallel.h>
#include <seqan/stream.h>  // for printing strings

int main()
{
    using TSequence = seqan2::String<seqan2::Dna>;
    using TThreadModel = seqan2::Parallel;
    using TVectorSpec = seqan2::Vectorial;
    using TExecPolicy = seqan2::ExecutionPolicy<TThreadModel, TVectorSpec>;

    // dummy sequences
    TSequence seqH = "CGATT";
    TSequence seqV = "CGAAATT";

    seqan2::StringSet<TSequence> seqs1;
    seqan2::StringSet<TSequence> seqs2;

    for (size_t i = 0; i < 100; ++i)
    {
        seqan2::appendValue(seqs1, seqH);
        seqan2::appendValue(seqs2, seqV);
    }

    TExecPolicy execPolicy;
    seqan2::setNumThreads(execPolicy, 4); // Use four threads.

    seqan2::Score<int16_t, seqan2::Simple> scoreAffine(2, -2, -1, -4);

    seqan2::String<int16_t> scores = seqan2::globalAlignmentScore(execPolicy, seqs1, seqs2, scoreAffine);

    for (int16_t score : scores)
        std::cout << "Score: " << score << "\n";

    return EXIT_SUCCESS;
}

The following example shows a wave-front and vectorised execution of global alignments for two collections of large sequences:

#include <iostream>

#include <seqan/align_parallel.h>
#include <seqan/stream.h>  // for printing strings

int main()
{
    using TSequence = seqan2::String<seqan2::Dna>;
    using TThreadModel = seqan2::WavefrontAlignment<seqan2::BlockOffsetOptimization>;
    using TVectorSpec = seqan2::Vectorial;
    using TExecPolicy = seqan2::ExecutionPolicy<TThreadModel, TVectorSpec>;

    // dummy sequences
    TSequence seqH;
    TSequence seqV;

    for (size_t i = 0; i < 10000; ++i)
    {
        seqan2::appendValue(seqH, 'A');
        seqan2::appendValue(seqV, 'A');
    }

    seqan2::StringSet<TSequence> seqs1;
    seqan2::StringSet<TSequence> seqs2;

    for (size_t i = 0; i < 100; ++i)
    {
        seqan2::appendValue(seqs1, seqH);
        seqan2::appendValue(seqs2, seqV);
    }

    TExecPolicy execPolicy;
    seqan2::setBlockSize(execPolicy, 500); // Sets the size of blocks used internally to partition the alignment matrix.
    seqan2::setParallelAlignments(execPolicy, 10); // Compute ten alignments at the same time.
    seqan2::setNumThreads(execPolicy, 4); // Use four threads to compute the actual alignment.

    seqan2::Score<int32_t, seqan2::Simple> scoreAffine(2, -2, -1, -4);

    seqan2::String<int32_t> scores = seqan2::globalAlignmentScore(execPolicy, seqs1, seqs2, scoreAffine);

    for (int32_t score : scores)
        std::cout << "Score: " << score << "\n";

    return EXIT_SUCCESS;
}

Data Races

Thread-safe. No shared state is modified during execution, and concurrent invocations of this function on the same data do not cause any race conditions.

See Also