fn() bandedChainAlignment
Computes the best global pairwise alignment between two sequences given a non-empty seed chain.

Defined in <seqan/seeds.h>
Signature
TValue bandedChainAlignment(align, seedChain, scoringScheme1[, scoringScheme2] [, alignConfig] [, k]);
TValue bandedChainAlignment(gapsH, gapsV, seedChain, scoringScheme1[, scoringScheme2] [, alignConfig] [, k]);
TValue bandedChainAlignment(frags, strings, seedChain, scoringScheme1[, scoringScheme2] [, alignConfig] [, k]);
TValue bandedChainAlignment(alignmentGraph, seedChain, scoringScheme1[, scoringScheme2] [, alignConfig] [, k]);

Parameters

align An Align object that stores the alignment. The number of rows must be 2 and the sequences must have already been set. row(align, 0) is the horizontal sequence in the alignment matrix, row(align, 1) is the vertical sequence.
gapsH Horizontal gapped sequence in alignment matrix. Type: Gaps.
gapsV Vertical gapped sequence in alignment matrix. Type: Gaps.
frags String of Fragment objects. The sequence with id 0 is the horizontal one, the sequence with id 1 is the vertical one.
strings A StringSet containing two sequences.
alignmentGraph AlignmentGraph object to store the alignment in. The underlying StringSet must be a DependentStringSet. Types: AlignmentGraph.
seedChain The container holding the seeds. Note that the seeds have to be in monotonic non-decreasing order and the container has to implement a forward iterator. Type: SeedSet.
scoringScheme1 The scoring scheme used for the alignment. If scoringScheme2 is specified, then scoringScheme1 is used for the regions around the seeds and scoringScheme2 for the gap regions between two consecutive seeds. Types: Score
scoringScheme2 The optional scoring scheme for the gap regions between two anchors. Types: Score
k Optional extension of the band around the seeds. At the moment, only band extensions greater than or equal to 1 are allowed. Type: int. Default: 15.
alignConfig The AlignConfig to use for the alignment.
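
The ordering requirement on seedChain can be checked before the call. The following is a minimal sketch in plain C++, not part of SeqAn: SeedBox and isOrderedChain are hypothetical names, and it assumes that "monotonic non-decreasing order" means each seed begins at or after the end of its predecessor in both sequences, with the four values given in the order begin horizontal, begin vertical, end horizontal, end vertical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a seed (NOT SeqAn's Seed class): begin and end
// positions in the horizontal (H) and vertical (V) sequence.
struct SeedBox
{
    long beginH, beginV, endH, endV;
};

// Returns true if every seed begins at or after the end of its predecessor
// in both sequences. This is one assumed reading of the ordering requirement.
bool isOrderedChain(std::vector<SeedBox> const & chain)
{
    for (std::size_t i = 1; i < chain.size(); ++i)
    {
        if (chain[i].beginH < chain[i - 1].endH ||
            chain[i].beginV < chain[i - 1].endV)
            return false;
    }
    return true;  // empty and single-seed chains are trivially ordered
}
```

Under this reading, the three seeds used in the examples on this page, (2, 0, 6, 5), (9, 6, 12, 9) and (14, 11, 16, 17), form an ordered chain.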

Return Values

TValue An integer with the alignment score, as given by the Value metafunction of the Score type. If the seed chain is empty, no alignment is computed and the minimal value of the selected score type is returned.

Detailed Description

There exist multiple overloads for this function with four configuration dimensions.

First, you can select whether begin and end gaps are free in either sequence using alignConfig.

Second, you can select the type of the target storing the alignment. This can be either an Align object, two Gaps objects, an AlignmentGraph, or a string of Fragment objects. Align objects provide an interface to tabular alignments with the restriction of all rows having the same type. Using two Gaps objects has the advantage that you can align sequences with different types, for example DnaString and Dna5String. Alignment Graphs provide a graph-based representation of segment-based colinear alignments. Using Fragment strings is useful for collecting many pairwise alignments, for example in the construction of Alignment Graphs for multiple-sequence alignments (MSA).

Third, you can optionally give a second scoring scheme to fill the gaps between two consecutive seeds. Note that either an affine or a linear gap cost function is used, depending on the specified scores: the affine function is selected if the scores for gap opening and gap extension differ in at least one of the scoring schemes, otherwise the linear function is used. If only one scoring scheme is defined, the complete region is computed with the same scoring scheme.
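
The selection rule for the gap cost function can be sketched as follows. This is an illustration in plain C++ of the rule as described above, not SeqAn's implementation; GapScores, GapModel, isAffine and selectGapModel are hypothetical names introduced for this sketch.

```cpp
// Hypothetical minimal view of a scoring scheme (NOT SeqAn's Score class):
// only the two gap scores matter for choosing the gap cost function.
struct GapScores
{
    int gapExtend;
    int gapOpen;
};

enum class GapModel { Linear, Affine };

// A scheme calls for affine gap costs if opening a gap is scored differently
// from extending one.
constexpr bool isAffine(GapScores s)
{
    return s.gapOpen != s.gapExtend;
}

// Affine gap costs are selected as soon as one of the two schemes needs them;
// otherwise linear gap costs suffice.
constexpr GapModel selectGapModel(GapScores seedScheme, GapScores gapScheme)
{
    return (isAffine(seedScheme) || isAffine(gapScheme)) ? GapModel::Affine
                                                         : GapModel::Linear;
}
```

For instance, a seed scheme with a single gap score of -2 combined with a gap scheme that uses gap extension -1 and gap opening -5 would select the affine function under this rule.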

Fourth, you can optionally select a band extension for the bands around the seeds. At the moment, only band extensions of at least 1 are allowed. The default value is 15 and is based on the default values for the LAGAN algorithm described by Brudno et al., 2003.
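
To give an intuition for what the band extension k controls, the sketch below tests whether a cell of the alignment matrix lies within k diagonals of a seed. It is a simplified geometric picture in plain C++ and an assumption about the band shape; the band that SeqAn actually constructs may differ in detail. SeedBox, isValidBandExtension and inExtendedBand are hypothetical names.

```cpp
#include <algorithm>

// Hypothetical stand-in for a seed (NOT SeqAn's Seed class): begin and end
// positions in the horizontal (H) and vertical (V) sequence.
struct SeedBox
{
    long beginH, beginV, endH, endV;
};

// Mirrors the documented restriction that k must be at least 1.
bool isValidBandExtension(long k)
{
    return k >= 1;
}

// True if matrix cell (h, v) lies on a diagonal that is at most k away from
// the range of diagonals spanned by the seed's begin and end points.
// Diagonals are numbered as h - v.
bool inExtendedBand(SeedBox const & seed, long h, long v, long k)
{
    long const beginDiag = seed.beginH - seed.beginV;
    long const endDiag = seed.endH - seed.endV;
    long const lower = std::min(beginDiag, endDiag) - k;
    long const upper = std::max(beginDiag, endDiag) + k;
    long const diag = h - v;
    return diag >= lower && diag <= upper;
}
```

In this picture, a larger k widens the region explored around each seed, which costs more time and memory but tolerates larger deviations from the seed diagonals.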

The examples below show some common use cases.

Examples

Banded chain alignment of two sequences using an Align object and using only one scoring scheme and no free end-gaps.

Dna5String seqH = "CGAATCCATCCCACACA";
Dna5String seqV = "GGCGATNNNCATGGCACA";

String<Seed<Simple> > seedChain;
appendValue(seedChain, Seed<Simple>(2, 0, 6, 5));
appendValue(seedChain, Seed<Simple>(9, 6, 12, 9));
appendValue(seedChain, Seed<Simple>(14, 11, 16, 17));

Align<Dna5String, ArrayGaps> alignment;
resize(rows(alignment), 2);
assignSource(row(alignment, 0), seqH);
assignSource(row(alignment, 1), seqV);

Score<int, Simple> scoringScheme(2, -1, -2);

int result = bandedChainAlignment(alignment, seedChain, scoringScheme, 2);

Banded chain alignment of two sequences using two Gaps objects, an unordered seed set to hold the seeds, separate scoring schemes for the seeds and for the gaps between them, and free end-gaps.

DnaString seqH = "CGAATCCATCCCACACA";
Dna5String seqV = "GGCGATNNNCATGGCACA";

SeedSet<Simple, Unordered> seedChain;
addSeed(seedChain, Seed<Simple>(2, 0, 6, 5), Single());
addSeed(seedChain, Seed<Simple>(9, 6, 12, 9), Single());
addSeed(seedChain, Seed<Simple>(14, 11, 16, 17), Single());

Gaps<DnaString, ArrayGaps> gapsH(seqH);
Gaps<Dna5String, AnchorGaps<> > gapsV(seqV);

Score<int, Simple> scoringSchemeSeed(2, -1, -2);
Score<int, Simple> scoringSchemeGap(5, -3, -1, -5);
AlignConfig<true, true, true, true> alignConfig;

int result = bandedChainAlignment(gapsH, gapsV, seedChain, scoringSchemeSeed, scoringSchemeGap, alignConfig, 2);

Tutorial

Also see the Seed-and-Extend Tutorial

Reference

  • Brudno M, Do CB, Cooper GM, et al.: LAGAN and Multi-LAGAN: Efficient Tools for Large-Scale Multiple Alignment of Genomic DNA. Genome Research 2003, 13: 721-731.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.