SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
The alignment module contains concepts, algorithms and classes that are related to the computation of pairwise and multiple sequence alignments.
Modules
Aligned Sequence: Provides seqan3::aligned_sequence, as well as various ranges that model it.
CIGAR Conversion: Contains utility functions to convert a CIGAR to an alignment or vice versa.
Configuration: Provides configuration elements for the pairwise alignment configuration.
Decorator: Contains special SeqAn decorators.
Matrix: Provides data structures for representing alignment coordinates and alignments as a matrix.
Pairwise Alignments: Provides the algorithmic components for the computation of pairwise alignments.
Scoring: Provides the data structures used for scoring alphabets and sequences.
In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns. Sequence alignments are also used for non-biological sequences, such as calculating the distance cost between strings in a natural language or in financial data. [1]
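To make the idea of a distance cost between strings concrete, the following is a minimal sketch of the classic unit-cost edit distance, written against the C++ standard library only. It does not use or reproduce the SeqAn alignment interface; the function name edit_distance and the unit costs for substitutions and gaps are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper for illustration only; NOT part of SeqAn.
// Returns the minimal number of substitutions and gaps (unit cost each)
// needed to align the complete string a against the complete string b.
std::size_t edit_distance(std::string const & a, std::string const & b)
{
    // previous[j]: cost of aligning the first i-1 characters of a with the first j of b.
    // current[j]:  cost of aligning the first i   characters of a with the first j of b.
    std::vector<std::size_t> previous(b.size() + 1);
    std::vector<std::size_t> current(b.size() + 1);

    // Aligning an empty prefix of a with j characters of b requires j gaps.
    for (std::size_t j = 0; j <= b.size(); ++j)
        previous[j] = j;

    for (std::size_t i = 1; i <= a.size(); ++i)
    {
        // Aligning i characters of a with an empty prefix of b requires i gaps.
        current[0] = i;

        for (std::size_t j = 1; j <= b.size(); ++j)
        {
            // Column that pairs a[i-1] with b[j-1]: free on a match, cost 1 on a mismatch.
            std::size_t const diagonal = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            // Column that pairs a[i-1] with a gap in b.
            std::size_t const gap_in_b = previous[j] + 1;
            // Column that pairs b[j-1] with a gap in a.
            std::size_t const gap_in_a = current[j - 1] + 1;

            current[j] = std::min({diagonal, gap_in_b, gap_in_a});
        }

        std::swap(previous, current);
    }

    // After the final swap, previous holds the row for the complete string a.
    return previous[b.size()];
}
```

For instance, edit_distance("kitten", "sitting") evaluates to 3 (two substitutions and one gap). Keeping only two rows of the matrix is enough when just the score is needed; recovering the aligned rows themselves would require keeping the full matrix or a traceback.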
SeqAn offers a generic, multi-purpose alignment library comprising all widely known alignment algorithms as well as many specialized ones. All of these algorithms are accessible through an easy-to-use alignment interface, which is described in Pairwise Alignments.
The following code snippet demonstrates a simple use of the pairwise alignment interface.
The current version of SeqAn does not offer multiple sequence alignments (MSA). Please reach out to us with a specific use case we should consider in future versions.
A common file format for storing (semi-)alignments is the SAM/BAM format. In a SAM/BAM file, the alignment is represented as a CIGAR string. To convert between a CIGAR string and the alignment representation in SeqAn, we provide conversion functions in the CIGAR Conversion submodule.
For reading and writing SAM/BAM files, we provide seqan3::sam_file_input and seqan3::sam_file_output.
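To illustrate what a CIGAR string encodes, the following is a minimal sketch that expands a CIGAR string into two gapped rows using only the C++ standard library. It is not the SeqAn conversion API: the function name expand_cigar is hypothetical, '-' is assumed as the gap symbol, and only the operations M, =, X, I and D are handled (clipping, skipping and padding operations are deliberately left out).

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper for illustration only; NOT part of SeqAn.
// Expands a CIGAR string into a gapped reference row (first) and a gapped read row (second).
// Supported operations: M, =, X (aligned column), I (insertion in the read), D (deletion in the read).
std::pair<std::string, std::string>
expand_cigar(std::string const & cigar, std::string const & reference, std::string const & read)
{
    std::string reference_row;
    std::string read_row;
    std::size_t reference_pos = 0;
    std::size_t read_pos = 0;
    std::size_t count = 0;
    bool has_count = false;

    for (char const c : cigar)
    {
        // Accumulate the decimal length that precedes each operation character.
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            count = count * 10 + static_cast<std::size_t>(c - '0');
            has_count = true;
            continue;
        }

        if (!has_count)
            throw std::invalid_argument{"CIGAR operation without a length"};

        bool const consumes_reference = (c == 'M' || c == '=' || c == 'X' || c == 'D');
        bool const consumes_read = (c == 'M' || c == '=' || c == 'X' || c == 'I');

        if (!consumes_reference && !consumes_read)
            throw std::invalid_argument{"unsupported CIGAR operation"};

        if ((consumes_reference && reference_pos + count > reference.size())
            || (consumes_read && read_pos + count > read.size()))
            throw std::out_of_range{"CIGAR covers more characters than the sequences provide"};

        // A row that is not consumed by the operation receives gap symbols instead.
        if (consumes_reference)
        {
            reference_row.append(reference.substr(reference_pos, count));
            reference_pos += count;
        }
        else
        {
            reference_row.append(count, '-');
        }

        if (consumes_read)
        {
            read_row.append(read.substr(read_pos, count));
            read_pos += count;
        }
        else
        {
            read_row.append(count, '-');
        }

        count = 0;
        has_count = false;
    }

    if (has_count)
        throw std::invalid_argument{"CIGAR ends with a length but no operation"};

    return std::make_pair(reference_row, read_row);
}
```

For example, expand_cigar("2M1D3M1I", "ACGTAC", "ACTACG") yields the rows "ACGTAC-" and "AC-TACG": the deletion places a gap in the read row and the insertion places a gap in the reference row.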