SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
Alphabet

Modules

 Adaptation
 Provides alphabet adaptations of some standard char and uint types.
 
 Aminoacid
 Provides the amino acid alphabets and functionality for translation from nucleotide.
 
 CIGAR
 Provides (semi-)alphabets for representing elements in CIGAR strings.
 
 Composite
 Provides templates for combining existing alphabets into new alphabet types.
 
 Container
 Alphabet related container.
 
 Gap
 Provides the gap alphabet and functionality to make an alphabet a gapped alphabet.
 
 Mask
 Provides the mask alphabet and functionality for creating masked composites.
 
 Nucleotide
 Provides the different DNA and RNA alphabet types.
 
 Quality
 Provides the various quality score types.
 
 Range
 Alphabet related ranges.
 
 Structure
 Provides types to represent single elements of RNA and protein structures.
 
 Views
 Alphabet related views.
 

Classes

struct  seqan3::custom::alphabet< t >
 A type that can be specialised to provide customisation point implementations so that third party types model alphabet concepts. More...
 
interface  alphabet
 The generic alphabet concept that covers most data types used in ranges. More...
 
class  seqan3::alphabet_base< derived_type, size, char_t >
 A CRTP-base that makes defining a custom alphabet easier. More...
 
struct  seqan3::alphabet_printer< alphabet_t >
 The printer used for formatted output of seqan3::alphabet types. More...
 
class  seqan3::alphabet_proxy< derived_type, alphabet_type >
 A CRTP-base that eases the definition of proxy types returned in place of regular alphabets. More...
 
struct  std::hash< alphabet_t >
 Struct for hashing a character. More...
 
struct  seqan3::invalid_char_assignment
 An exception typically thrown by seqan3::alphabet::assign_char_strict. More...
 
interface  semialphabet
 The basis for seqan3::alphabet, but requires only rank interface (not char). More...
 
interface  writable_alphabet
 Refines seqan3::alphabet and adds assignability. More...
 
interface  writable_semialphabet
 A refinement of seqan3::semialphabet that adds assignability. More...
 

Typedefs

template<typename alphabet_type >
using seqan3::alphabet_char_t = decltype(seqan3::to_char(std::declval< alphabet_type const >()))
 The char_type of the alphabet; defined as the return type of seqan3::to_char.
 
template<typename semi_alphabet_type >
using seqan3::alphabet_rank_t = decltype(seqan3::to_rank(std::declval< semi_alphabet_type >()))
 The rank_type of the semi-alphabet; defined as the return type of seqan3::to_rank.
 

Variables

template<typename alph_t >
constexpr auto seqan3::alphabet_size = detail::adl_only::alphabet_size_cpo<alph_t>{}()
 A type trait that holds the size of a (semi-)alphabet.
 

Function objects

constexpr auto seqan3::to_rank = detail::adl_only::to_rank_cpo{}
 Return the rank representation of a (semi-)alphabet object.
 
constexpr auto seqan3::assign_rank_to = detail::adl_only::assign_rank_to_cpo{}
 Assign a rank to an alphabet object.
 
constexpr auto seqan3::to_char = detail::adl_only::to_char_cpo{}
 Return the char representation of an alphabet object.
 
constexpr auto seqan3::assign_char_to = detail::adl_only::assign_char_to_cpo{}
 Assign a character to an alphabet object.
 
template<typename alph_t >
constexpr auto seqan3::char_is_valid_for = detail::adl_only::char_is_valid_for_cpo<alph_t>{}
 Returns whether a character is in the valid set of a seqan3::alphabet (usually implies a bijective mapping to an alphabet value).
 
constexpr auto seqan3::assign_char_strictly_to = detail::adl_only::assign_char_strictly_to_fn{}
 Assign a character to an alphabet object, throw if the character is not valid.
 

Detailed Description

Introduction

Alphabets are a core component in SeqAn. They enable us to represent the smallest unit of biological sequence data, e.g. a nucleotide or an amino acid.

In theory, these could just be represented as a char and this is how many people perceive them, but it makes sense to use a smaller, stricter and well-defined alphabet in almost all cases, because:

In SeqAn there are alphabet types for typical sequence alphabets like DNA and amino acid, but also for qualities, RNA structures and alignment gaps. In addition there are templates for combining alphabet types into new alphabets, and wrappers for existing data types like the canonical char.

In addition to concrete alphabet types, SeqAn provides multiple concepts that describe groups of alphabets by their properties and can be used to constrain templates so that they only work with certain alphabet types. See the Tutorial on Concepts for a gentle introduction to the topic.

The alphabet concepts

Alphabet size

All alphabets in SeqAn have a fixed size. It can be queried via the seqan3::alphabet_size type trait and optionally also the alphabet_size static member of the alphabet (see below for "members VS free/global functions").

In some areas we provide alphabet types with different sizes for the same purpose, e.g. seqan3::dna4 ('A', 'C', 'G', 'T'), seqan3::dna5 (plus 'N') and seqan3::dna15 (plus ambiguous characters defined by IUPAC). By convention most of our alphabets carry their size in their name (seqan3::dna4 has size 4, and so on).

A main reason for choosing a smaller alphabet over a bigger one is the possibility of optimising for space efficiency. Note, however, that a single letter by itself can never be smaller than a byte for architectural reasons. Actual space improvements are realised via secondary structures, e.g. when using a seqan3::bitpacked_sequence<seqan3::dna4> instead of std::vector<seqan3::dna4>. Also the single letter quality composite seqan3::qualified<seqan3::dna4, seqan3::phred42> fits into one byte, because the product of the alphabet sizes (4 * 42) is smaller than 256; whereas the same composite with seqan3::dna15 requires two bytes per letter (15 * 42 > 256).
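The byte counts quoted above follow directly from the product of the alphabet sizes. A minimal sketch of that arithmetic, using a hypothetical helper `bytes_per_letter` that is not part of SeqAn:

```cpp
#include <cstddef>

// Hypothetical helper, not part of SeqAn: smallest number of whole bytes that
// can hold `alphabet_size` distinct values.
constexpr std::size_t bytes_per_letter(std::size_t const alphabet_size)
{
    std::size_t bytes = 1;
    std::size_t capacity = 256; // distinct values representable in one byte
    while (capacity < alphabet_size && bytes < sizeof(std::size_t))
    {
        capacity *= 256;
        ++bytes;
    }
    return bytes;
}

// Sizes taken from the text: dna4 = 4, dna15 = 15, phred42 = 42.
static_assert(bytes_per_letter(4 * 42) == 1, "4 * 42 = 168 values fit into one byte");
static_assert(bytes_per_letter(15 * 42) == 2, "15 * 42 = 630 values need two bytes");
```

The helper only reasons about whole bytes per letter; bit-level packing as done by secondary structures is a separate matter.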

Assigning and retrieving values

As mentioned above, we typically think of alphabets in their character representation, but we also require them in "rank representation" as programmers. In C and C++ it is quite difficult to cleanly differentiate between these, because the char type is considered an integral type and can be used to index an array (e.g. my_array['A'] translates to my_array[65]). Moreover the sign of char is implementation defined and on many platforms the smallest integer types int8_t and uint8_t are literally the same types as signed char and unsigned char respectively.

This leads to ambiguity when assigning and retrieving values:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <iostream>

#include <seqan3/alphabet/nucleotide/dna4.hpp>

int main()
{
    // does not work:
    // seqan3::dna4 my_letter{0};   // we want to set the default, an A
    // seqan3::dna4 my_letter{'A'}; // we also want to set an A, but we are setting value 65
    // std::cout << my_letter;      // you expect 'A', but how would you access the number?
}
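The implicit conversions behind this ambiguity can be checked in isolation with the standard library alone. The following is a small sketch, assuming an ASCII-compatible character set (so that 'A' equals 65); the function and variable names are made up for illustration:

```cpp
#include <cstdint>
#include <type_traits>

// char is an integral type, so the compiler accepts it wherever a number is expected.
static_assert(std::is_integral<char>::value, "char is an integral type");

// Returns the array element that the character 'A' selects when used as an index.
constexpr int element_selected_by_A()
{
    int my_array[128] = {};
    my_array[65] = 7;
    int const index = 'A'; // implicit conversion from char, no cast required
    return my_array[index];
}

// Both properties below are platform-dependent, so they are recorded, not asserted.
constexpr bool uint8_is_unsigned_char = std::is_same<std::uint8_t, unsigned char>::value;
constexpr bool char_is_signed = std::is_signed<char>::value;
```

On an ASCII platform `element_selected_by_A()` yields 7, i.e. the character and the number 65 are indistinguishable to the array.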

To solve this problem, alphabets in SeqAn define two interfaces:

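  1. a rank based interface with seqan3::to_rank and seqan3::assign_rank_to
  2. a character based interface with seqan3::to_char, seqan3::assign_char_to, seqan3::assign_char_strictly_to and seqan3::char_is_valid_for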

To prevent the aforementioned ambiguity, you can neither assign from rank or char representation via operator=, nor cast the alphabet to either of its representation forms; you need to use the interfaces explicitly.

For efficiency, the representation saved internally is normally the rank representation, and the character representation is generated via conversion tables. This is, however, not required as long as both interfaces are provided and all functions operate in constant time.

The same applies to printing characters, although seqan3::debug_stream provides some convenience.

Here is an example of explicit assignment of a rank and char, and how it can be printed via std::cout and seqan3::debug_stream:

#include <iostream>

#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/core/debug_stream.hpp>

int main()
{
    seqan3::dna4 my_letter;
    seqan3::assign_rank_to(0, my_letter);   // assign an A via rank interface
    seqan3::assign_char_to('A', my_letter); // assign an A via char interface

    std::cout << seqan3::to_char(my_letter) << '\n';           // prints 'A'
    std::cout << (unsigned)seqan3::to_rank(my_letter) << '\n'; // prints 0
    // we have to add the cast here, because uint8_t is also treated as a char type by default :(

    // Using SeqAn's debug_stream:
    seqan3::debug_stream << seqan3::to_char(my_letter) << '\n'; // prints 'A'
    seqan3::debug_stream << my_letter << '\n';                  // prints 'A' (calls to_char() automatically!)
    seqan3::debug_stream << seqan3::to_rank(my_letter) << '\n'; // prints 0 (casts uint8_t to unsigned automatically!)
}
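The remark above about storing the rank internally and generating the character through conversion tables can be pictured with a toy type. This is only a sketch: `toy_dna4` is hypothetical and is not how seqan3::dna4 is implemented, a switch stands in for the character-to-rank table, and mapping unknown characters to 'A' is a choice made for this example.

```cpp
#include <cstdint>

// Hypothetical toy type for illustration; not SeqAn's implementation.
class toy_dna4
{
public:
    constexpr std::uint8_t to_rank() const noexcept { return rank; }

    constexpr char to_char() const noexcept
    {
        constexpr char rank_to_char[4] = {'A', 'C', 'G', 'T'}; // conversion table
        return rank_to_char[rank];
    }

    constexpr toy_dna4 & assign_rank(std::uint8_t const r) noexcept
    {
        rank = r; // precondition: r < 4
        return *this;
    }

    constexpr toy_dna4 & assign_char(char const c) noexcept
    {
        switch (c) // stands in for a 256-entry char-to-rank table
        {
            case 'C': rank = 1; break;
            case 'G': rank = 2; break;
            case 'T': rank = 3; break;
            default: rank = 0; break; // 'A' and, in this toy, everything unknown
        }
        return *this;
    }

private:
    std::uint8_t rank = 0; // the only data member: the rank representation
};

static_assert(sizeof(toy_dna4) == 1, "a single letter occupies one byte");
```

Both interfaces run in constant time, and neither an implicit conversion nor an `operator=` from `char` or integer exists, so every assignment has to name the interface it uses.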

To reduce the burden of calling seqan3::assign_char_to often, most alphabets in SeqAn provide custom literals for the alphabet and sequences over the alphabet:

#include <seqan3/alphabet/nucleotide/dna4.hpp>

int main()
{
    using namespace seqan3::literals;

    seqan3::dna4 letter = 'A'_dna4;             // identical to assign_char_to('A', letter);
    seqan3::dna4_vector sequence = "ACGT"_dna4; // identical to calling assign_char_to for each element
}

Note, however, that literals are not required by the concept.

Different concepts

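All types that have valid implementations of the functions/functors described above model the concept seqan3::writable_alphabet. This is the strongest (i.e. most refined) general case concept. There are more refined concepts for specific biological applications (like seqan3::nucleotide_alphabet), and there are less refined concepts that only model part of an alphabet:

  1. seqan3::alphabet: the concept that seqan3::writable_alphabet refines; it does not require assignability.
  2. seqan3::writable_semialphabet: requires only the rank interface (not char), including assignability.
  3. seqan3::semialphabet: requires only the rank interface (not char), without assignability.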

Typically you will use seqan3::alphabet in "read-only" situations (e.g. const parameters) and seqan3::writable_alphabet whenever the values might be changed. Semi-alphabets are less useful in application code.

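Entity                    semialphabet  writable_semialphabet  alphabet  writable_alphabet  Aux
alphabet_size             yes           yes                    yes       yes                -
to_rank                   yes           yes                    yes       yes                -
alphabet_rank_t           yes           yes                    yes       yes                🔗
assign_rank_to            -             yes                    -         yes                -
to_char                   -             -                      yes       yes                -
alphabet_char_t           -             -                      yes       yes                🔗
assign_char_to            -             -                      -         yes                -
char_is_valid_for         -             -                      -         yes                -
assign_char_strictly_to   -             -                      -         yes                🔗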

The above table shows all alphabet concepts and related functions and type traits. The entities marked as "auxiliary" provide shortcuts to the other "essential" entities. This difference is only relevant if you want to create your own alphabet (you do not need to provide an implementation for the "auxiliary" entities, they are provided automatically).

Members VS free/global functions

The alphabet concept (as most concepts in SeqAn) looks for free/global functions, i.e. you need to be able to call seqan3::to_rank(my_letter), however most alphabets also provide a member function, i.e. my_letter.to_rank(). The same is true for the type trait seqan3::alphabet_size vs the static data member alphabet_size.

Members are provided for convenience and if you are an application developer who works with a single concrete alphabet type you are fine with using the member functions. If you, however, implement a generic function that accepts different alphabet types, you need to use the free function / type trait interface, because it is the only interface guaranteed to exist (member functions are not required/enforced by the concept).
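Why generic code should rely on the free function interface can be shown without SeqAn. In the sketch below all names (`toy::bit`, `toy::strand`, `count_rank`) are hypothetical, and a plain unqualified call resolved by argument-dependent lookup stands in for the seqan3::to_rank wrapper:

```cpp
#include <cstddef>
#include <cstdint>

namespace toy
{
// Hypothetical alphabet that offers a member function and a free function.
struct bit
{
    std::uint8_t value = 0;
    constexpr std::uint8_t to_rank() const noexcept { return value; }
};
constexpr std::uint8_t to_rank(bit const b) noexcept { return b.to_rank(); }

// Hypothetical third party type: an enum cannot have member functions,
// so the free function is the only interface it can offer.
enum class strand : std::uint8_t { forward, reverse };
constexpr std::uint8_t to_rank(strand const s) noexcept { return static_cast<std::uint8_t>(s); }
} // namespace toy

// Generic code: uses only the free function, so it works for both types.
template <typename letter_t, std::size_t n>
constexpr std::size_t count_rank(letter_t const (&letters)[n], std::uint8_t const wanted)
{
    std::size_t hits = 0;
    for (letter_t const & letter : letters)
    {
        if (to_rank(letter) == wanted) // found via argument-dependent lookup
            ++hits;
    }
    return hits;
}

constexpr toy::bit demo_bits[] = {{0}, {1}, {1}};
constexpr toy::strand demo_strands[] = {toy::strand::reverse, toy::strand::forward};
static_assert(count_rank(demo_bits, 1) == 2, "two bits have rank 1");
static_assert(count_rank(demo_strands, 1) == 1, "one strand has rank 1");
```

Had `count_rank` called `letter.to_rank()` instead, it would compile for `toy::bit` but not for `toy::strand`.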

Containers over alphabets

In SeqAn it is recommended you use the STL container classes like std::vector for storing sequence data, but you can use other class templates if they satisfy the respective seqan3::container, e.g. std::deque or folly::fbvector or even Qt::QVector.

std::basic_string is also supported, however, we recommend against using it, because it is not safe (and not useful) to call certain members like .c_str() if our alphabets are used as value type.

We provide specialised containers with certain properties in the Alphabet Container module.

A container over a seqan3::alphabet automatically models the seqan3::sequence concept.

Typedef Documentation

◆ alphabet_char_t

template<typename alphabet_type >
using seqan3::alphabet_char_t = decltype(seqan3::to_char(std::declval<alphabet_type const>()))

The char_type of the alphabet; defined as the return type of seqan3::to_char.

This entity is stable. Since version 3.1.

◆ alphabet_rank_t

template<typename semi_alphabet_type >
using seqan3::alphabet_rank_t = decltype(seqan3::to_rank(std::declval<semi_alphabet_type>()))

The rank_type of the semi-alphabet; defined as the return type of seqan3::to_rank.

This entity is stable. Since version 3.1.

Variable Documentation

◆ alphabet_size

template<typename alph_t >
constexpr auto seqan3::alphabet_size = detail::adl_only::alphabet_size_cpo<alph_t>{}()
inline constexpr

A type trait that holds the size of a (semi-)alphabet.

Template Parameters
your_type  The (semi-)alphabet type being queried.

This type trait is implemented as a global variable template.

It is only defined for types that provide one of the following (checked in this order):

  1. A static constexpr data member of seqan3::custom::alphabet<your_type> called alphabet_size.
  2. A free function alphabet_size(your_type const &) in the namespace of your type (or as friend) that returns the size.
  3. A static constexpr data member of your_type called alphabet_size.

Functions are only considered for one of the above cases if they are marked noexcept and constexpr and if the returned type models std::integral. For 2. the value of the argument to the function shall be ignored, the argument is only used to select the function via argument-dependent lookup.

Every (semi-)alphabet type must provide one of the above.

Note that if the (semi-)alphabet type with cvref removed is not std::is_nothrow_default_constructible or not seqan3::is_constexpr_default_constructible, this object will instead look for alphabet_size(std::type_identity<your_type> const &) with the same semantics (in case 2.).

Example

#include <concepts>
#include <cstdint>
#include <iostream>

#include <seqan3/alphabet/adaptation/char.hpp>
#include <seqan3/alphabet/nucleotide/dna5.hpp>

int main()
{
    auto sigma_char = seqan3::alphabet_size<char>; // uses seqan3::custom::alphabet<char>::alphabet_size
    static_assert(std::same_as<decltype(sigma_char), uint16_t>);
    std::cout << sigma_char << '\n'; // 256

    auto sigma_dna5 = seqan3::alphabet_size<seqan3::dna5>; // returns dna5::alphabet_size
    static_assert(std::same_as<decltype(sigma_dna5), uint8_t>);
    std::cout << static_cast<uint16_t>(sigma_dna5) << '\n'; // 5
}

For an example of a full alphabet definition with free function implementations (solution 2. above), see seqan3::alphabet.

Customisation point

This is a customisation point (see Customisation). To specify the behaviour for your own alphabet type, simply provide one of the three functions specified above.

This entity is experimental and subject to change in the future. Implementation 2 (free function) is not stable.

This entity is stable. Since version 3.1. The name seqan3::alphabet_size, Implementation 1, and Implementation 3 are stable and will not change.

◆ assign_char_strictly_to

constexpr auto seqan3::assign_char_strictly_to = detail::adl_only::assign_char_strictly_to_fn{}
inline constexpr

Assign a character to an alphabet object, throw if the character is not valid.

Template Parameters
your_type  Type of the target object.
Parameters
chr   The character being assigned; must be of the seqan3::alphabet_char_t of the target object.
alph  The target object; its type must model seqan3::alphabet.
Returns
Reference to alph if alph was given as lvalue, otherwise a copy.
Exceptions
seqan3::invalid_char_assignment  If seqan3::char_is_valid_for<decltype(alph)>(chr) == false.

This is a function object. Invoke it with the parameters specified above.

Note that this is not a customisation point and it cannot be "overloaded". It simply invokes seqan3::char_is_valid_for and seqan3::assign_char_to.

Example

#include <seqan3/alphabet/adaptation/char.hpp>
#include <seqan3/alphabet/concept.hpp>
#include <seqan3/alphabet/nucleotide/dna5.hpp>

int main()
{
    char c = '!';
    seqan3::assign_char_strictly_to('?', c); // every character is valid for char; c == '?'

    seqan3::dna5 d{};
    seqan3::assign_char_strictly_to('A', d); // 'A' is valid for dna5; assigns via the .assign_char('A') member

    // also works for temporaries:
    seqan3::dna5 d2 = seqan3::assign_char_strictly_to('A', seqan3::dna5{});
}
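The statement that strict assignment simply combines a validity check with the regular assignment can be sketched as follows. Everything here is hypothetical and SeqAn-free: `toy_acgt` is a made-up alphabet, `toy_invalid_char` stands in for seqan3::invalid_char_assignment, and `assign_char_strictly` is a plain function template rather than a function object.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical toy alphabet over 'A', 'C', 'G', 'T'; not a SeqAn type.
struct toy_acgt
{
    char value = 'A';

    static constexpr bool char_is_valid(char const c) noexcept
    {
        return c == 'A' || c == 'C' || c == 'G' || c == 'T';
    }

    // Lenient assignment: unknown characters are silently converted to 'A'.
    toy_acgt & assign_char(char const c) noexcept
    {
        value = char_is_valid(c) ? c : 'A';
        return *this;
    }
};

// Stand-in for seqan3::invalid_char_assignment.
struct toy_invalid_char : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Strict assignment: validate first, then delegate to the lenient assignment.
template <typename alph_t>
alph_t & assign_char_strictly(char const chr, alph_t & alph)
{
    if (!alph_t::char_is_valid(chr))
        throw toy_invalid_char(std::string("invalid character: ") + chr);
    return alph.assign_char(chr);
}

// Demonstration helper: reports whether strict assignment rejects `chr`.
inline bool is_rejected(char const chr)
{
    toy_acgt letter{};
    try
    {
        assign_char_strictly(chr, letter);
    }
    catch (toy_invalid_char const &)
    {
        return true;
    }
    return false;
}
```

With these definitions `is_rejected('!')` is true, while `is_rejected('G')` is false and leaves the letter set to 'G'.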

This entity is experimental and subject to change in the future. Experimental since version 3.1.

◆ assign_char_to

constexpr auto seqan3::assign_char_to = detail::adl_only::assign_char_to_cpo{}
inline constexpr

Assign a character to an alphabet object.

Template Parameters
your_type  Type of the target object.
Parameters
chr   The character being assigned; must be of the seqan3::alphabet_char_t of the target object.
alph  The target object; its type must model seqan3::alphabet.
Returns
Reference to alph if alph was given as lvalue, otherwise a copy.

This is a function object. Invoke it with the parameter(s) specified above.

It acts as a wrapper and looks for three possible implementations (in this order):

  1. A static member function assign_char_to(char_type const chr, your_type & a) of the class seqan3::custom::alphabet<your_type>.
  2. A free function assign_char_to(char_type const chr, your_type & a) in the namespace of your type (or as friend).
  3. A member function called assign_char(char_type const chr) (not assign_char_to).

Functions are only considered for one of the above cases if they are marked noexcept (constexpr is not required, but recommended) and if the returned type is your_type &.

Every alphabet type must provide one of the above. Note that temporaries of your_type are handled by this function object and do not require an additional overload.

Example

#include <seqan3/alphabet/adaptation/char.hpp>
#include <seqan3/alphabet/concept.hpp>
#include <seqan3/alphabet/nucleotide/dna5.hpp>

int main()
{
    char c = '!';
    seqan3::assign_char_to('?', c); // calls seqan3::custom::alphabet<char>::assign_char_to('?', c)

    seqan3::dna5 d{};
    seqan3::assign_char_to('A', d); // calls .assign_char('A') member

    // also works for temporaries:
    seqan3::dna5 d2 = seqan3::assign_char_to('A', seqan3::dna5{});

    // invalid/unknown characters are converted:
    seqan3::dna5 d3 = seqan3::assign_char_to('!', seqan3::dna5{}); // == 'N'_dna5
}

For an example of a full alphabet definition with free function implementations (solution 2. above), see seqan3::alphabet.

Customisation point

This is a customisation point (see Customisation). To specify the behaviour for your own alphabet type, simply provide one of the three functions specified above.

This entity is experimental and subject to change in the future. Implementation 2 (free function) is not stable.

This entity is stable. Since version 3.1. The name seqan3::assign_char_to, Implementation 1, and Implementation 3 are stable and will not change.

◆ assign_rank_to

constexpr auto seqan3::assign_rank_to = detail::adl_only::assign_rank_to_cpo{}
inline constexpr

Assign a rank to an alphabet object.

Template Parameters
your_type  Type of the target object.
Parameters
chr   The rank being assigned; must be of the seqan3::alphabet_rank_t of the target object.
alph  The target object.
Returns
Reference to alph if alph was given as lvalue, otherwise a copy.

This is a function object. Invoke it with the parameter(s) specified above.

It acts as a wrapper and looks for three possible implementations (in this order):

  1. A static member function assign_rank_to(rank_type const chr, your_type & a) of the class seqan3::custom::alphabet<your_type>.
  2. A free function assign_rank_to(rank_type const chr, your_type & a) in the namespace of your type (or as friend).
  3. A member function called assign_rank(rank_type const chr) (not assign_rank_to).

Functions are only considered for one of the above cases if they are marked noexcept (constexpr is not required, but recommended) and if the returned type is your_type &.

Every (semi-)alphabet type must provide one of the above. Note that temporaries of your_type are handled by this function object and do not require an additional overload.

Example

#include <seqan3/alphabet/adaptation/char.hpp>
#include <seqan3/alphabet/concept.hpp>
#include <seqan3/alphabet/nucleotide/dna5.hpp>

int main()
{
    char c = '!';
    seqan3::assign_rank_to(66, c); // calls seqan3::custom::alphabet<char>::assign_rank_to(66, c); == 'B'

    seqan3::dna5 d{};
    seqan3::assign_rank_to(2, d); // calls .assign_rank(2) member; == 'G'_dna5

    // also works for temporaries:
    seqan3::dna5 d2 = seqan3::assign_rank_to(2, seqan3::dna5{});

    // too-large ranks are undefined behaviour:
    // seqan3::dna5 d3 = seqan3::assign_rank_to(50, seqan3::dna5{});
}

For an example of a full alphabet definition with free function implementations (solution 2. above), see seqan3::alphabet.

Customisation point

This is a customisation point (see Customisation). To specify the behaviour for your own alphabet type, simply provide one of the three functions specified above.

This entity is experimental and subject to change in the future. Implementation 2 (free function) is not stable.

This entity is stable. Since version 3.1. The name seqan3::assign_rank_to, Implementation 1, and Implementation 3 are stable and will not change.

◆ char_is_valid_for

template<typename alph_t >
constexpr auto seqan3::char_is_valid_for = detail::adl_only::char_is_valid_for_cpo<alph_t>{}
inline constexpr

Returns whether a character is in the valid set of a seqan3::alphabet (usually implies a bijective mapping to an alphabet value).

Template Parameters
your_type  The alphabet type being queried.
Parameters
chr   The character being checked; must be convertible to seqan3::alphabet_char_t<your_type>.
alph  The target object; its type must model seqan3::alphabet.
Returns
true or false.

This is a function object. Invoke it with the parameter(s) specified above.

It acts as a wrapper and looks for three possible implementations (in this order):

  1. A static member function char_is_valid(char_type const chr) of the class seqan3::custom::alphabet<your_type>.
  2. A free function char_is_valid_for(char_type const chr, your_type const &) in the namespace of your type (or as friend).
  3. A static member function called char_is_valid(char_type) (not char_is_valid_for).

Functions are only considered for one of the above cases if they are marked noexcept (constexpr is not required, but recommended) and if the returned type is convertible to bool. For 2. the value of the second argument to the function shall be ignored, it is only used to select the function via argument-dependent lookup.

An alphabet type may provide one of the above. If none is provided, this function will declare every character c as valid for which seqan3::to_char(seqan3::assign_char_to(c, alph_t{})) == c holds, i.e. converting back and forth results in the same value.

Note that if the alphabet type with cvref removed is not std::is_nothrow_default_constructible, this function object will instead look for char_is_valid_for(char_type const chr, std::type_identity<your_type> const &) in case 2. In that case the "fallback" above also does not work and you are required to provide such an implementation.

Example

#include <iostream>

#include <seqan3/alphabet/adaptation/char.hpp>
#include <seqan3/alphabet/nucleotide/dna5.hpp>

int main()
{
    // calls seqan3::custom::char_is_valid_for<char>('A')
    std::cout << std::boolalpha << seqan3::char_is_valid_for<char>('A') << '\n'; // always 'true'

    // calls dna5::char_is_valid('A') member
    std::cout << std::boolalpha << seqan3::char_is_valid_for<seqan3::dna5>('A') << '\n'; // true

    // for some alphabets, characters that are not uniquely mappable are still valid:
    std::cout << std::boolalpha << seqan3::char_is_valid_for<seqan3::dna5>('a') << '\n'; // true
}
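The round-trip fallback described above, and why an alphabet may want to provide its own validity check instead, can be illustrated with a toy type. This is a sketch only: `toy_dna5` and `char_is_valid_fallback` are hypothetical names, and the rank order A, C, G, N, T is an assumption of this example.

```cpp
#include <cstdint>

// Hypothetical toy alphabet over A, C, G, N, T that accepts lower case on assignment.
struct toy_dna5
{
    std::uint8_t rank = 0;

    constexpr toy_dna5 & assign_char(char const c) noexcept
    {
        switch (c)
        {
            case 'A': case 'a': rank = 0; break;
            case 'C': case 'c': rank = 1; break;
            case 'G': case 'g': rank = 2; break;
            case 'T': case 't': rank = 4; break;
            default: rank = 3; break; // unknown characters become 'N'
        }
        return *this;
    }

    constexpr char to_char() const noexcept
    {
        constexpr char rank_to_char[5] = {'A', 'C', 'G', 'N', 'T'};
        return rank_to_char[rank];
    }
};

// Fallback rule from the text: c is valid if assigning it and converting back yields c again.
template <typename alph_t>
constexpr bool char_is_valid_fallback(char const c) noexcept
{
    return alph_t{}.assign_char(c).to_char() == c;
}

static_assert(char_is_valid_fallback<toy_dna5>('A'), "'A' round-trips");
static_assert(char_is_valid_fallback<toy_dna5>('N'), "'N' round-trips");
static_assert(!char_is_valid_fallback<toy_dna5>('a'), "'a' is stored as A and comes back as 'A'");
static_assert(!char_is_valid_fallback<toy_dna5>('!'), "'!' is stored as N and comes back as 'N'");
```

Under the fallback alone, lower case letters would count as invalid for this toy type. An alphabet that wants 'a' to be reported as valid, as the seqan3::dna5 example above does, therefore has to supply one of the three implementations itself.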

Customisation point

This is a customisation point (see Customisation). To specify the behaviour for your own alphabet type, simply provide one of the three functions specified above.

This entity is experimental and subject to change in the future. Experimental since version 3.1.

◆ to_char

constexpr auto seqan3::to_char = detail::adl_only::to_char_cpo{}
inline constexpr

Return the char representation of an alphabet object.

Template Parameters
your_type  Type of the argument.
Parameters
alph  The alphabet object.
Returns
The char representation; usually char.

This is a function object. Invoke it with the parameter(s) specified above.

It acts as a wrapper and looks for three possible implementations (in this order):

  1. A static member function to_char(your_type const a) of the class seqan3::custom::alphabet<your_type>.
  2. A free function to_char(your_type const a) in the namespace of your type (or as friend).
  3. A member function called to_char().

Functions are only considered for one of the above cases if they are marked noexcept (constexpr is not required, but recommended) and if the returned type models seqan3::builtin_character.

Every alphabet type must provide one of the above.

Example

#include <iostream>

#include <seqan3/alphabet/adaptation/char.hpp>
#include <seqan3/alphabet/nucleotide/dna5.hpp>

int main()
{
    using namespace seqan3::literals;

    auto char_to_char = seqan3::to_char('A');      // calls seqan3::custom::to_char('A')
    auto dna5_to_char = seqan3::to_char('A'_dna5); // calls .to_char() member

    std::cout << char_to_char << '\n'; // A
    std::cout << dna5_to_char << '\n'; // A
}

For an example of a full alphabet definition with free function implementations (solution 2. above), see seqan3::alphabet.

Customisation point

This is a customisation point (see Customisation). To specify the behaviour for your own alphabet type, simply provide one of the three functions specified above.

This entity is experimental and subject to change in the future. Implementation 2 (free function) is not stable.

This entity is stable. Since version 3.1. The name seqan3::to_char, Implementation 1, and Implementation 3 are stable and will not change.

◆ to_rank

constexpr auto seqan3::to_rank = detail::adl_only::to_rank_cpo{}
inline constexpr

Return the rank representation of a (semi-)alphabet object.

Template Parameters
your_type  Type of the argument.
Parameters
alph  The (semi-)alphabet object.
Returns
The rank representation; an integral type.

This is a function object. Invoke it with the parameter(s) specified above.

It acts as a wrapper and looks for three possible implementations (in this order):

  1. A static member function to_rank(your_type const a) of the class seqan3::custom::alphabet<your_type>.
  2. A free function to_rank(your_type const a) in the namespace of your type (or as friend).
  3. A member function called to_rank().

Functions are only considered for one of the above cases if they are marked noexcept (constexpr is not required, but recommended) and if the returned type models std::integral.

Every (semi-)alphabet type must provide one of the above.

Example

#include <concepts>
#include <cstdint>
#include <iostream>

#include <seqan3/alphabet/adaptation/char.hpp>
#include <seqan3/alphabet/nucleotide/dna5.hpp>

int main()
{
    using namespace seqan3::literals;

    auto char_to_rank = seqan3::to_rank('A'); // calls seqan3::custom::to_rank('A')
    static_assert(std::same_as<decltype(char_to_rank), uint8_t>);
    std::cout << static_cast<uint16_t>(char_to_rank) << '\n'; // 65

    auto dna5_to_rank = seqan3::to_rank('A'_dna5); // calls .to_rank() member
    static_assert(std::same_as<decltype(dna5_to_rank), uint8_t>);
    std::cout << static_cast<uint16_t>(dna5_to_rank) << '\n'; // 0
}

For an example of a full alphabet definition with free function implementations (solution 2. above), see seqan3::alphabet.

Customisation point

This is a customisation point (see Customisation). To specify the behaviour for your own alphabet type, simply provide one of the three functions specified above.

This entity is experimental and subject to change in the future. Implementation 2 (free function) is not stable.

This entity is stable. Since version 3.1. The name seqan3::to_rank, Implementation 1, and Implementation 3 are stable and will not change.
