SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.

Provides the mask alphabet and functionality for creating masked composites.

Classes

class  seqan3::mask
 Implementation of a masked alphabet to be used for tuple composites.
 
class  seqan3::masked< sequence_alphabet_t >
 Implementation of a masked composite, which extends a given alphabet with a mask.
 

Detailed Description

Provides the mask alphabet and functionality for creating masked composites.

See also
Alphabet

Introduction

Masks are useful in cases where an alphabet needs to be augmented with additional information. A common use case is the introduction of don't-care positions (masking) in nucleotide or amino acid sequences without using 'N' or 'X', respectively. Instead, the original alphabet is combined with seqan3::mask to create a seqan3::masked composite. When printed via the seqan3::debug_stream, masked characters are displayed in lowercase.

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <seqan3/alphabet/mask/masked.hpp>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/core/debug_stream.hpp>

int main()
{
    using namespace seqan3::literals;

    // default-constructed masked dna4 letter; its value is assigned below
    seqan3::masked<seqan3::dna4> dna4_masked{};
    seqan3::masked<seqan3::dna4> dna4_another_masked{'A'_dna4, seqan3::mask::unmasked};
    // create a dna4 masked alphabet with an unmasked A

    dna4_masked.assign_char('a'); // assigns a masked 'A'_dna4

    if (dna4_masked.to_char() != dna4_another_masked.to_char())
    {
        seqan3::debug_stream << dna4_masked.to_char() << " is not the same as " << dna4_another_masked.to_char()
                             << "\n";
    }
}

Types of masking

There are two types of masking:

Repeat masking

The use of soft-masking was popularised by RepeatMasker. Interspersed repeats (transposons, retrotransposons and processed pseudogenes) and low-complexity sequences are denoted by lowercase characters. Note that larger repeats, such as segmental duplications, large tandem repeats and whole gene duplications, are generally not masked.
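
The lowercase convention described above can be illustrated with a minimal sketch in plain standard C++. It deliberately does not use the SeqAn3 API: the helper names soft_mask, mask_of and unmask are hypothetical and exist only for this illustration, and the sketch assumes the sequence is stored as an ordinary std::string of uppercase letters.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of SeqAn3): soft-mask the half-open
// interval [begin, end) by converting its characters to lowercase.
inline std::string soft_mask(std::string seq, std::size_t begin, std::size_t end)
{
    assert(begin <= end && end <= seq.size()); // interval must lie inside the sequence
    for (std::size_t i = begin; i < end; ++i)
        seq[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(seq[i])));
    return seq;
}

// Hypothetical helper: recover the mask as one flag per position
// (true = masked, i.e. the character is lowercase).
inline std::vector<bool> mask_of(std::string const & seq)
{
    std::vector<bool> mask;
    mask.reserve(seq.size());
    for (char c : seq)
        mask.push_back(std::islower(static_cast<unsigned char>(c)) != 0);
    return mask;
}

// Hypothetical helper: discard the mask by converting every character
// back to uppercase; the underlying letters are unchanged.
inline std::string unmask(std::string seq)
{
    for (char & c : seq)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return seq;
}
```

For example, soft_mask("ACGTACGT", 2, 5) yields "ACgtaCGT": the letters themselves are preserved, and only their case records which positions are masked. This is the same information that a seqan3::masked composite keeps as a separate mask value next to the letter.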
