SeqAn3 3.4.0-rc.4
The Modern C++ library for sequence analysis.
Classes
class | seqan3::mask | Implementation of a masked alphabet to be used for tuple composites.
struct | seqan3::mask_printer< mask_t > | The printer used for formatted output of the seqan3::mask alphabet.
class | seqan3::masked< sequence_alphabet_t > | Implementation of a masked composite, which extends a given alphabet with a mask.
Provides the mask alphabet and functionality for creating masked composites.
Masks are useful in cases where an alphabet needs to be augmented with additional information. A common use case is the introduction of don't-care positions (masking) in nucleotide or amino acid sequences without using 'N' or 'X', respectively. Instead, the original alphabet is combined with seqan3::mask to create a seqan3::masked composite. When printed via the seqan3::debug_stream, masked characters are displayed in lowercase.
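The idea of pairing a letter with a mask flag can be illustrated with a minimal, library-independent sketch. The names masked_letter and render are hypothetical and exist only for this illustration; this is not how seqan3::masked is implemented, it only mimics the behaviour described above (masked letters shown in lowercase, original letter preserved).

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for a masked composite: a letter plus a mask flag.
// NOT the SeqAn3 implementation, only an illustration of the concept.
struct masked_letter
{
    char letter;    // the original uppercase letter, e.g. 'A', 'C', 'G' or 'T'
    bool is_masked; // the additional information carried alongside the letter

    // Masked letters are rendered in lowercase, unmasked letters unchanged.
    char to_char() const
    {
        return is_masked ? static_cast<char>(std::tolower(static_cast<unsigned char>(letter))) : letter;
    }
};

// Renders a sequence of masked letters as a plain string.
std::string render(std::vector<masked_letter> const & sequence)
{
    std::string out;
    out.reserve(sequence.size());
    for (masked_letter const & l : sequence)
        out.push_back(l.to_char());
    return out;
}
```

Because the flag is stored next to the letter instead of replacing it, the original letter remains available even at masked positions.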
There are two types of masking:
- Hard-masking (replacing masked characters with 'N' or 'X')
- Soft-masking (converting masked characters to lowercase)
The use of soft-masking was popularised by RepeatMasker. Interspersed repeats (transposons, retrotransposons and processed pseudogenes) and low-complexity sequences are denoted by lowercase characters. Note that larger repeats, such as segmental duplications, large tandem repeats and whole-gene duplications, are generally not masked.
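The difference between replacing positions with 'N' or 'X' (commonly called hard-masking) and lowercasing them (soft-masking) can be sketched on a plain std::string. The functions hard_mask and soft_mask and the half-open interval [begin, end) are assumptions made for this illustration; they are not part of SeqAn3 or RepeatMasker.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hard-masking: overwrite [begin, end) with a placeholder.
// The original letters in that interval are lost.
std::string hard_mask(std::string sequence, std::size_t begin, std::size_t end, char placeholder = 'N')
{
    assert(begin <= end && end <= sequence.size());
    for (std::size_t i = begin; i < end; ++i)
        sequence[i] = placeholder;
    return sequence;
}

// Soft-masking: convert [begin, end) to lowercase.
// The original letters stay recoverable by uppercasing again.
std::string soft_mask(std::string sequence, std::size_t begin, std::size_t end)
{
    assert(begin <= end && end <= sequence.size());
    for (std::size_t i = begin; i < end; ++i)
        sequence[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(sequence[i])));
    return sequence;
}
```

For the input "ACGTACGT" and the interval [2, 5), hard_mask yields "ACNNNCGT" while soft_mask yields "ACgtaCGT"; only the soft-masked result still allows the masked letters to be read back.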