Function
splitAlignment
Compute split alignments.
There are two variants of the split alignment problem.
In the first variant, we want to align two sequences where the first one (say, the reference) is shorter than the second (say, a read) and the read contains an insertion with respect to the reference.
We want to align the read against the reference such that the left part of the read aligns well against the left part of the reference and the right part of the read aligns well against the right part of the reference.
The center gap in the reference is free. For example:
|||||||||||||||||||| |||||||||||||||||||||||||
read AGCATGTTAGATAAGATAGCCCCCCCCCCCCTGTGCTAGTAGGCAGTCAGCGCCAT
The second variant is to align two sequences A and B against a reference such that the left part of A aligns well to the left part of the reference and the right part of B aligns well to the right part of the reference.
Together, both reads span the whole reference and overlap with an insertion in the reference.
|||||||||||||||||| | ||
AGCATGTTAGATAAGATATCCGTCC
read 1
||| |||||||||||||||||||||||
CCGCTATGCTAGTAGGCAGTCAGCGCCAT
read 2
The resulting alignment of the left/right parts is depicted below.
The square brackets indicate clipping positions.
|||||||||||||||||| [ | ||
AGCATGTTAGATAAGATA [TCCGTCC
read 1
reference AGCATGTTAGATAAGATA] GTGCTAGTAGGCAGTCAGCGCCAT
] |||||||||||||||||||||||
CCGCT] ATGCTAGTAGGCAGTCAGCGCCAT
read 2
In the first case, we want to find the single breakpoint in the reference, the two breakpoints in the read, and the alignments of the left and right well-aligning read parts.
In the second case, we want to find the one breakpoint in the reference and the breakpoint/clipping position in each read.
The splitAlignment() function takes two alignments as input.
The sequence in each alignment's first row is the reference and the sequence in the second row is the read.
The reference has to be the same sequence in both alignments, whereas the reads may differ.
If the reads are the same, this is the first case; if the reads differ, this is the second case.
The result is two alignments of the left and right contig path clipped appropriately.
The resulting score is the sum of the scores of both alignments.
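The idea of picking one reference breakpoint so that the sum of the left and right alignment scores is maximal can be sketched in plain C++. This is only an illustration under simplifying assumptions (linear gap costs, no band, scores only, no traceback or clipping); it is not SeqAn's implementation, and the names `bestPrefixScores`, `splitByBestSum` and `SplitResult` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: best[i] is the best score of aligning ref[0..i)
// against ANY prefix of read (both anchored at their left ends).
std::vector<int> bestPrefixScores(const std::string& ref, const std::string& read,
                                  int match, int mismatch, int gap)
{
    const std::size_t n = ref.size(), m = read.size();
    std::vector<int> prev(m + 1), cur(m + 1), best(n + 1);
    for (std::size_t j = 0; j <= m; ++j)
        prev[j] = static_cast<int>(j) * gap;          // empty reference prefix
    best[0] = *std::max_element(prev.begin(), prev.end());
    for (std::size_t i = 1; i <= n; ++i)
    {
        cur[0] = static_cast<int>(i) * gap;           // empty read prefix
        for (std::size_t j = 1; j <= m; ++j)
        {
            const int diag = prev[j - 1] + (ref[i - 1] == read[j - 1] ? match : mismatch);
            cur[j] = std::max({diag, prev[j] + gap, cur[j - 1] + gap});
        }
        best[i] = *std::max_element(cur.begin(), cur.end());
        std::swap(prev, cur);
    }
    return best;
}

struct SplitResult
{
    std::size_t refBreakpoint;  // reference position where left part ends
    int score;                  // sum of left and right scores
};

// Left read is scored against reference prefixes, right read against
// reference suffixes (computed on the reversed sequences).
SplitResult splitByBestSum(const std::string& ref, const std::string& readLeft,
                           const std::string& readRight,
                           int match, int mismatch, int gap)
{
    const std::string refRev(ref.rbegin(), ref.rend());
    const std::string readRev(readRight.rbegin(), readRight.rend());
    const std::vector<int> left = bestPrefixScores(ref, readLeft, match, mismatch, gap);
    const std::vector<int> right = bestPrefixScores(refRev, readRev, match, mismatch, gap);
    assert(left.size() == right.size());

    const std::size_t n = ref.size();
    SplitResult res{0, left[0] + right[n]};
    for (std::size_t i = 1; i <= n; ++i)
    {
        const int sum = left[i] + right[n - i];       // split reference at i
        if (sum > res.score)
            res = SplitResult{i, sum};
    }
    return res;
}
```

Passing the same read as `readLeft` and `readRight` corresponds to the first variant; passing two different reads corresponds to the second. The sketch does not check that the two read breakpoints of the first variant are ordered consistently.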
Include Headers
seqan/align_split.h
Parameters
Align object with two rows for the left alignment. Types: Align
Align object with two rows for the right alignment. Types: Align
Gaps object with the horizontal/contig row for the left alignment. Types: Gaps
Gaps object with the vertical/read row for the left alignment. Types: Gaps
Gaps object with the horizontal/contig row for the right alignment. Types: Gaps
Gaps object with the vertical/read row for the right alignment. Types: Gaps
The scoring scheme to use for the alignment. Types: Score
The lower diagonal. Remarks: You have to specify the upper and lower diagonals for the left alignment. For the right alignment, the corresponding diagonals are chosen for the lower right part of the DP matrix.
The upper diagonal. Also see the remark for the lower diagonal.
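To make the role of the lower and upper diagonals more concrete, the following sketch tests whether a DP cell lies inside a band. It assumes the common convention that a cell's diagonal is its column (reference position) minus its row (read position); this convention and the names `inBand` and `bandCellCount` are assumptions made for illustration and are not taken from this page.

```cpp
#include <cassert>
#include <cstddef>

// Assumed convention: diagonal = column - row, so diagonal 0 is the
// main diagonal and positive diagonals lie to its right.
bool inBand(std::ptrdiff_t row, std::ptrdiff_t col,
            std::ptrdiff_t lowerDiag, std::ptrdiff_t upperDiag)
{
    assert(lowerDiag <= upperDiag);
    const std::ptrdiff_t diag = col - row;
    return lowerDiag <= diag && diag <= upperDiag;
}

// Counts the cells of a rows x cols matrix that a banded DP would fill.
std::size_t bandCellCount(std::ptrdiff_t rows, std::ptrdiff_t cols,
                          std::ptrdiff_t lowerDiag, std::ptrdiff_t upperDiag)
{
    std::size_t count = 0;
    for (std::ptrdiff_t r = 0; r < rows; ++r)
        for (std::ptrdiff_t c = 0; c < cols; ++c)
            if (inBand(r, c, lowerDiag, upperDiag))
                ++count;
    return count;
}
```

A narrow band reduces the number of cells that have to be computed, at the risk of missing alignments that leave the band.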
Remarks
The DP algorithm is chosen automatically depending on whether the gap open and extension costs are equal.
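The reason equal gap open and extension costs allow a simpler algorithm can be seen from the score of a single gap. The sketch below assumes the convention that the first gap character scores the gap open value and every further character scores the gap extension value; under that assumption, equal values reduce to a linear gap model. The names `GapModel`, `chooseGapModel` and `gapScore` are hypothetical and only illustrate the selection.

```cpp
#include <cassert>

enum class GapModel { Linear, Affine };

// Equal open and extension scores mean every gap character costs the same.
GapModel chooseGapModel(int gapOpen, int gapExtend)
{
    return gapOpen == gapExtend ? GapModel::Linear : GapModel::Affine;
}

// Score of one gap of the given length under the assumed convention:
// first character scores gapOpen, each further one scores gapExtend.
int gapScore(int length, int gapOpen, int gapExtend)
{
    assert(length >= 1);
    return gapOpen + (length - 1) * gapExtend;
}
```

With a linear model a single DP matrix is enough, whereas affine gaps typically require tracking additional state for open gaps.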
Return Values
The sum of the alignment scores of both alignments.
TScoreValue is the value type of scoringScheme.
SeqAn - Sequence Analysis Library - www.seqan.de