SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
seqan3::dna3bs Class Reference

The three letter reduced DNA alphabet for bisulfite sequencing mode (A,G,T(=C)).

#include <seqan3/alphabet/nucleotide/dna3bs.hpp>


Public Member Functions

Constructors, destructor and assignment
constexpr dna3bs () noexcept=default
 Defaulted.
 
constexpr dna3bs (dna3bs const &) noexcept=default
 Defaulted.
 
constexpr dna3bs (dna3bs &&) noexcept=default
 Defaulted.
 
constexpr dna3bs & operator= (dna3bs const &) noexcept=default
 Defaulted.
 
constexpr dna3bs & operator= (dna3bs &&) noexcept=default
 Defaulted.
 
 ~dna3bs () noexcept=default
 Defaulted.
 
- Public Member Functions inherited from seqan3::nucleotide_base< dna3bs, 3 >
constexpr dna3bs complement () const noexcept
 Return the complement of the letter.
 
constexpr nucleotide_base (other_nucl_type const &other) noexcept
 Allow explicit construction from any other nucleotide type and convert via the character representation.
 
- Public Member Functions inherited from seqan3::alphabet_base< derived_type, size, char_t >
constexpr alphabet_base () noexcept=default
 Defaulted.
 
constexpr alphabet_base (alphabet_base const &) noexcept=default
 Defaulted.
 
constexpr alphabet_base (alphabet_base &&) noexcept=default
 Defaulted.
 
constexpr alphabet_base & operator= (alphabet_base const &) noexcept=default
 Defaulted.
 
constexpr alphabet_base & operator= (alphabet_base &&) noexcept=default
 Defaulted.
 
 ~alphabet_base () noexcept=default
 Defaulted.
 
constexpr char_type to_char () const noexcept
 Return the letter as a character of char_type.
 
constexpr rank_type to_rank () const noexcept
 Return the letter's numeric value (rank in the alphabet).
 
constexpr derived_type & assign_char (char_type const chr) noexcept
 Assign from a character, implicitly converts invalid characters.
 
constexpr derived_type & assign_rank (rank_type const c) noexcept
 Assign from a numeric value.
 

Private Types

using base_t = nucleotide_base< dna3bs, 3 >
 The base class.
 

Static Private Member Functions

static constexpr rank_type char_to_rank (char_type const chr)
 Returns the rank representation of character.
 
static constexpr rank_type rank_complement (rank_type const rank)
 Returns the complement by rank.
 
static constexpr char_type rank_to_char (rank_type const rank)
 Returns the character representation of rank.
 

Private Attributes

friend base_t
 Befriend seqan3::nucleotide_base.
 

Static Private Attributes

static constexpr std::array< rank_type, 256 > char_to_rank_table
 The lookup table used in char_to_rank.
 
static constexpr rank_type rank_complement_table [alphabet_size]
 The rank complement table.
 
static constexpr char_type rank_to_char_table [alphabet_size] {'A', 'G', 'T'}
 The lookup table used in rank_to_char.
 

Related Symbols

(Note that these are not member symbols.)

using dna3bs_vector = std::vector< dna3bs >
 Alias for a std::vector of seqan3::dna3bs.
 
Nucleotide literals
constexpr dna3bs operator""_dna3bs (char const c) noexcept
 The seqan3::dna3bs char literal.
 
constexpr dna3bs_vector operator""_dna3bs (char const *s, std::size_t n)
 The seqan3::dna3bs string literal.
 

Additional Inherited Members

- Static Public Member Functions inherited from seqan3::nucleotide_base< dna3bs, 3 >
static constexpr bool char_is_valid (char_type const c) noexcept
 Validate whether a character value has a one-to-one mapping to an alphabet value.
 
- Static Public Attributes inherited from seqan3::alphabet_base< derived_type, size, char_t >
static constexpr detail::min_viable_uint_t< size > alphabet_size = size
 The size of the alphabet, i.e. the number of different values it can take.
 
- Protected Types inherited from seqan3::alphabet_base< derived_type, size, char_t >
using char_type = std::conditional_t< std::same_as< char_t, void >, char, char_t >
 The char representation; conditional needed to make semi alphabet definitions legal.
 
using rank_type = detail::min_viable_uint_t< size - 1 >
 The type of the alphabet when represented as a number (e.g. via to_rank()).
 

Detailed Description

The three letter reduced DNA alphabet for bisulfite sequencing mode (A,G,T(=C)).

This alphabet is a reduced version of the DNA alphabet for bisulfite-converted data. Every 'C' is converted to 'T' so that normal sequences can be compared with bisulfite-converted sequences.

For completeness, this nucleotide alphabet has a complement table. However, using it on bisulfite data is not recommended, because the complement of 'T' is ambiguous in reads from bisulfite sequencing: a 'T' can represent a true thymidine or an unmethylated 'C' that was converted into a 'T'. Complementing a seqan3::dna3bs sequence therefore reduces the alphabet further to only 'T' and 'A', losing all information about 'G'.

When working with bisulfite data, we recommend creating the reverse complement of the seqan3::dna4 / seqan3::dna5 / seqan3::dna15 range first and converting to seqan3::dna3bs afterwards. This avoids simplifying the data by automatically setting 'A' as the complement of 'C'.

As an example, the sequence 'ACGTGC' in seqan3::dna4 would be 'ATGTGT' in seqan3::dna3bs. The complement of this seqan3::dna3bs sequence would be 'TATATA'. Complementing the seqan3::dna4 sequence first and transforming it into seqan3::dna3bs afterwards gives 'TGTATG', which preserves more information from the original sequence.
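The 'ACGTGC' example can be reproduced with a small standalone sketch. It deliberately does not use SeqAn3: plain characters stand in for seqan3::dna4 and seqan3::dna3bs letters, and every function name below is hypothetical. It only illustrates why the order of "complement" and "reduce" matters.

```cpp
#include <cassert>
#include <string>

// Standalone illustration only: plain chars stand in for alphabet letters,
// and all names in this block are hypothetical (not part of SeqAn3).

// Complement in the four-letter alphabet: A<->T, C<->G.
char complement_dna4(char const c)
{
    assert(c == 'A' || c == 'C' || c == 'G' || c == 'T');
    switch (c)
    {
    case 'A': return 'T';
    case 'C': return 'G';
    case 'G': return 'C';
    default:  return 'A'; // 'T'
    }
}

// Bisulfite reduction: every C is read as T.
char reduce_to_dna3bs(char const c)
{
    return c == 'C' ? 'T' : c;
}

// Complement inside the reduced alphabet: A->T, G->T, T->A.
char complement_dna3bs(char const c)
{
    return c == 'T' ? 'A' : 'T';
}

// Apply fn to every character of a copy of s.
template <typename fn_t>
std::string map_chars(std::string s, fn_t fn)
{
    for (char & c : s)
        c = fn(c);
    return s;
}

// Order discouraged in the text: reduce first, then complement.
std::string reduce_then_complement(std::string const & s)
{
    return map_chars(map_chars(s, reduce_to_dna3bs), complement_dna3bs);
}

// Order recommended in the text: complement first, then reduce.
std::string complement_then_reduce(std::string const & s)
{
    return map_chars(map_chars(s, complement_dna4), reduce_to_dna3bs);
}
```

Under these assumptions, reduce_then_complement("ACGTGC") yields "TATATA", while complement_then_reduce("ACGTGC") yields "TGTATG", which still contains 'G'.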

Like most alphabets, this alphabet cannot be initialised directly from its character representation. Instead, initialise or assign from a character literal such as 'A'_dna3bs, or use the function seqan3::dna3bs::assign_char().

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <seqan3/alphabet/nucleotide/dna3bs.hpp>
#include <seqan3/core/debug_stream.hpp>

int main()
{
    using namespace seqan3::literals;

    seqan3::dna3bs letter{'A'_dna3bs};

    letter.assign_char('C'); // All C will be converted to T.
    seqan3::debug_stream << letter << '\n'; // prints "T"

    letter.assign_char('F'); // Unknown characters are implicitly converted to A.
    seqan3::debug_stream << letter << '\n'; // prints "A"
}
See also
https://en.wikipedia.org/wiki/Bisulfite_sequencing

This entity is stable. Since version 3.1.

Member Function Documentation

◆ char_to_rank()

static constexpr rank_type seqan3::dna3bs::char_to_rank ( char_type const  chr)
inline static constexpr private

Returns the rank representation of character.

This function is required by seqan3::alphabet_base.

◆ rank_complement()

static constexpr rank_type seqan3::dna3bs::rank_complement ( rank_type const  rank)
inline static constexpr private

Returns the complement by rank.

This function is required by seqan3::nucleotide_base.

◆ rank_to_char()

static constexpr char_type seqan3::dna3bs::rank_to_char ( rank_type const  rank)
inline static constexpr private

Returns the character representation of rank.

This function is required by seqan3::alphabet_base.

Friends And Related Symbol Documentation

◆ dna3bs_vector

using dna3bs_vector = std::vector<dna3bs>
related

Alias for a std::vector of seqan3::dna3bs.

This entity is stable. Since version 3.1.

◆ operator""_dna3bs() [1/2]

constexpr dna3bs_vector operator""_dna3bs (char const * s, std::size_t n)
related

The seqan3::dna3bs string literal.

Returns
seqan3::dna3bs_vector

You can use this string literal to easily assign to dna3bs_vector:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in

#include <seqan3/alphabet/nucleotide/dna3bs.hpp>

int main()
{
    using namespace seqan3::literals;

    seqan3::dna3bs_vector sequence1{"ACGTTA"_dna3bs};
    seqan3::dna3bs_vector sequence2 = "ACGTTA"_dna3bs;
    auto sequence3 = "ACGTTA"_dna3bs;
}

This entity is stable. Since version 3.1.

◆ operator""_dna3bs() [2/2]

constexpr dna3bs operator""_dna3bs ( char const  c)
related

The seqan3::dna3bs char literal.

Returns
seqan3::dna3bs

You can use this char literal to assign a seqan3::dna3bs character:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in

#include <seqan3/alphabet/nucleotide/dna3bs.hpp>

int main()
{
    using namespace seqan3::literals;

    seqan3::dna3bs letter1{'A'_dna3bs};
    auto letter2 = 'A'_dna3bs;
}

This entity is stable. Since version 3.1.

Member Data Documentation

◆ char_to_rank_table

constexpr std::array<rank_type, 256> seqan3::dna3bs::char_to_rank_table
static constexpr private

The lookup table used in char_to_rank.

We would have defined these lookup tables directly within their respective constexpr functions, but at the time of writing this, gcc did not (clang >= 4 did!) auto-generate lookup tables.

static constexpr char_type rank_to_char(rank_type const rank)
{
    // not possible because of static not being allowed within a constexpr function
    static constexpr lookup_table = ...;
    return lookup_table[rank];
}

static constexpr char_type rank_to_char(rank_type const rank)
{
    // up to the compiler to optimise, no guarantee that a lookup table is used.
    constexpr lookup_table = ...;
    return lookup_table[rank];
}
See also
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99320 for the progress on gcc
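
The alternative the text describes, keeping the lookup tables as static constexpr data members and only indexing them inside the constexpr functions, might look like the following sketch. The struct name is hypothetical and this is not the SeqAn3 implementation: it only covers the mappings stated on this page ('A', 'G', 'T', 'C' read as 'T', unknown characters read as 'A'), whereas the real table may handle further characters.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the pattern described above; not SeqAn3 code.
struct reduced_alphabet_sketch
{
    using rank_type = std::uint8_t;
    using char_type = char;

    static constexpr std::size_t alphabet_size = 3;

    // rank -> char, stored as a static constexpr data member.
    static constexpr char_type rank_to_char_table[alphabet_size]{'A', 'G', 'T'};

    // char -> rank, built once at compile time by an immediately invoked lambda.
    static constexpr std::array<rank_type, 256> char_to_rank_table = []() constexpr
    {
        std::array<rank_type, 256> table{}; // zero-initialised: unknown characters get rank 0 ('A')

        for (std::size_t rank = 0; rank < alphabet_size; ++rank)
            table[static_cast<unsigned char>(rank_to_char_table[rank])] = static_cast<rank_type>(rank);

        // Bisulfite reduction: 'C' shares the rank of 'T'.
        table[static_cast<unsigned char>('C')] = table[static_cast<unsigned char>('T')];
        return table;
    }();

    // The functions themselves only index into the member tables.
    static constexpr rank_type char_to_rank(char_type const chr)
    {
        return char_to_rank_table[static_cast<unsigned char>(chr)];
    }

    static constexpr char_type rank_to_char(rank_type const rank)
    {
        assert(rank < alphabet_size);
        return rank_to_char_table[rank];
    }
};

// Compile-time checks of the mappings stated on this page.
static_assert(reduced_alphabet_sketch::char_to_rank('C') == reduced_alphabet_sketch::char_to_rank('T'));
static_assert(reduced_alphabet_sketch::rank_to_char(reduced_alphabet_sketch::char_to_rank('F')) == 'A');
```

Because the tables are data members, they exist as real arrays regardless of how the compiler optimises the function bodies. The sketch assumes C++17 or later (constexpr lambdas, implicitly inline static constexpr members).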

◆ rank_complement_table

constexpr rank_type seqan3::dna3bs::rank_complement_table[alphabet_size]
static constexpr private
Initial value:
{
2,
2,
0
}

The rank complement table.
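
To read the initial value {2, 2, 0}: with ranks 0, 1, 2 standing for 'A', 'G', 'T' (the order of rank_to_char_table), the table maps A to T, G to T, and T to A. The sketch below is a standalone copy with hypothetical names, not the private member itself. It checks these mappings and shows that complementing twice does not restore 'G'.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical standalone copy of the table; ranks 0, 1, 2 stand for 'A', 'G', 'T'.
constexpr std::array<std::uint8_t, 3> rank_complement_sketch{2, 2, 0};

// Look up the complement of a rank in the copied table.
constexpr std::uint8_t complement_rank(std::uint8_t const rank)
{
    assert(rank < rank_complement_sketch.size());
    return rank_complement_sketch[rank];
}

static_assert(complement_rank(0) == 2); // A -> T
static_assert(complement_rank(1) == 2); // G -> T
static_assert(complement_rank(2) == 0); // T -> A

// Complementing twice is not the identity: G -> T -> A.
static_assert(complement_rank(complement_rank(1)) == 0);
```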

◆ rank_to_char_table

constexpr char_type seqan3::dna3bs::rank_to_char_table[alphabet_size] {'A', 'G', 'T'}
static constexpr private

The lookup table used in rank_to_char.

We would have defined these lookup tables directly within their respective constexpr functions, but at the time of writing this, gcc did not (clang >= 4 did!) auto-generate lookup tables.

static constexpr char_type rank_to_char(rank_type const rank)
{
    // not possible because of static not being allowed within a constexpr function
    static constexpr lookup_table = ...;
    return lookup_table[rank];
}

static constexpr char_type rank_to_char(rank_type const rank)
{
    // up to the compiler to optimise, no guarantee that a lookup table is used.
    constexpr lookup_table = ...;
    return lookup_table[rank];
}
See also
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99320 for the progress on gcc

The documentation for this class was generated from the following file: