SeqAn3  3.0.2
The Modern C++ library for sequence analysis.
Aminoacid

Provides the amino acid alphabets and functionality for translation from nucleotides.


Classes

class  seqan3::aa10li
 The reduced Li amino acid alphabet.

class  seqan3::aa10murphy
 The reduced Murphy amino acid alphabet.

class  seqan3::aa20
 The canonical amino acid alphabet.

class  seqan3::aa27
 The twenty-seven letter amino acid alphabet.

interface  seqan3::aminoacid_alphabet
 A concept that indicates whether an alphabet represents amino acids. Since an amino acid alphabet has no specific characteristics (like the complement function for nucleotide alphabets), we distinguish an amino acid alphabet by the seqan3::is_aminoacid type trait.

class  seqan3::aminoacid_base< derived_type, size >
 A CRTP-base that refines seqan3::alphabet_base and is used by the amino acids.

struct  seqan3::aminoacid_empty_base
 This is an empty base class that can be inherited by types that shall model seqan3::aminoacid_alphabet.
 

Functions

template<genetic_code gc = genetic_code::CANONICAL, nucleotide_alphabet nucl_type>
constexpr aa27 seqan3::translate_triplet (nucl_type const &n1, nucl_type const &n2, nucl_type const &n3) noexcept
 Translate one nucleotide triplet into a single amino acid (single nucleotide interface).

template<genetic_code gc = genetic_code::CANONICAL, typename tuple_type >
constexpr aa27 seqan3::translate_triplet (tuple_type const &input_tuple) noexcept
 Translate one nucleotide triplet into a single amino acid (tuple interface).

template<genetic_code gc = genetic_code::CANONICAL, std::ranges::input_range range_type>
constexpr aa27 seqan3::translate_triplet (range_type &&input_range)
 Translate one nucleotide triplet into a single amino acid (range interface).

template<genetic_code gc = genetic_code::CANONICAL, std::ranges::random_access_range rng_t>
constexpr aa27 seqan3::translate_triplet (rng_t &&input_range)
 Translate one nucleotide triplet into a single amino acid (range interface, input range allows random access).
 

Variables

template<typename t >
constexpr bool seqan3::enable_aminoacid = detail::adl_only::enable_aminoacid_dispatcher::dispatch<remove_cvref_t<t>>()
 A trait that indicates whether a type shall model seqan3::aminoacid_alphabet.
 

Detailed Description

Provides the amino acid alphabets and functionality for translation from nucleotides.

Introduction

Amino acid sequences are an important part of bioinformatic data processing and are used by many applications. While it is possible to represent them in a regular std::string, it makes sense to have specialised data structures in most cases. This sub-module offers the 27-letter amino acid alphabet as well as three reduced versions that can be used with regular containers and ranges. The 27-letter amino acid alphabet contains the 20 canonical amino acids, 2 additional proteinogenic amino acids (Pyrrolysine and Selenocysteine) and a termination letter (*). Additionally, 4 wildcard letters are offered, which allow more generic usage, for example in the case of ambiguous amino acids (e.g. J, which means either Isoleucine or Leucine). See also https://en.wikipedia.org/wiki/Amino_acid for more information about the amino acid alphabet.

Conversions

Amino acid name | Three letter code | One letter code | Remapped in seqan3::aa20 | Remapped in seqan3::aa10murphy | Remapped in seqan3::aa10li
Alanine | Ala | A | A | A | A
Arginine | Arg | R | R | K | K
Asparagine | Asn | N | N | B | H
Aspartic acid | Asp | D | D | B | B
Cysteine | Cys | C | C | C | C
Tyrosine | Tyr | Y | Y | F | F
Glutamic acid | Glu | E | E | B | B
Glutamine | Gln | Q | Q | B | B
Glycine | Gly | G | G | G | G
Histidine | His | H | H | H | H
Isoleucine | Ile | I | I | I | I
Leucine | Leu | L | L | I | J
Lysine | Lys | K | K | K | K
Methionine | Met | M | M | I | J
Phenylalanine | Phe | F | F | F | F
Proline | Pro | P | P | P | P
Serine | Ser | S | S | S | A
Threonine | Thr | T | T | S | A
Tryptophan | Trp | W | W | F | F
Valine | Val | V | V | I | I
Selenocysteine | Sec | U | C | C | C
Pyrrolysine | Pyl | O | K | K | K
Asparagine or aspartic acid | Asx | B | D | B | B
Glutamine or glutamic acid | Glx | Z | E | B | B
Leucine or Isoleucine | Xle | J | L | I | J
Unknown | Xaa | X | S | S | A
Stop Codon | N/A | * | W | F | F

All amino acid alphabets provide static value members (like an enum) for all amino acids in the form of the one-letter representation. As shown above, alphabets smaller than 27 internally represent multiple amino acids as one.
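To make the remapping concrete, here is a minimal standalone sketch of the seqan3::aa10murphy column of the table above, written on plain char values. It does not use SeqAn3 types, and the names sketch::aa27_letters, sketch::murphy_letters and sketch::to_murphy10 are hypothetical. Mapping unrecognised input to the letter that X maps to is an assumption of this sketch, not documented library behaviour.

```cpp
#include <cassert>
#include <string_view>

namespace sketch
{
// One-letter codes of the 27-letter alphabet in alphabetical order, '*' last.
inline constexpr std::string_view aa27_letters   = "ABCDEFGHIJKLMNOPQRSTUVWXYZ*";
// Letter each of them is remapped to in the Murphy column of the table (same positions).
inline constexpr std::string_view murphy_letters = "ABCBBFGHIIKIIBKPBKSSCIFSFBF";

static_assert(aa27_letters.size() == murphy_letters.size());

// Hypothetical helper: reduce a one-letter code to its Murphy-10 representative.
// Assumption: characters outside the 27 letters are treated like 'X' (which maps to 'S').
constexpr char to_murphy10(char aa) noexcept
{
    auto const pos = aa27_letters.find(aa);
    return pos == std::string_view::npos ? 'S' : murphy_letters[pos];
}

// Several amino acids collapse onto one representative letter.
static_assert(to_murphy10('L') == to_murphy10('M'));
static_assert(to_murphy10('D') == to_murphy10('E'));
} // namespace sketch
```

The two strings are kept position-aligned so that the table can be checked row by row against the listing.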
For most cases it is highly recommended to use seqan3::aa27, as seqan3::aa20 provides no benefit in regard to space consumption (both need 5 bits). Use seqan3::aa20 only when you know you need to interface with other software or formats that only support the canonical set.
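The space argument can be checked with a few lines of arithmetic. The following standalone sketch computes the minimum number of bits needed to distinguish a given number of letters; sketch::bits_needed is a hypothetical helper, not part of SeqAn3, and it says nothing about how the library actually stores its alphabets.

```cpp
#include <cassert>

namespace sketch
{
// Hypothetical helper: smallest number of bits b such that 2^b >= alphabet_size.
constexpr unsigned bits_needed(unsigned alphabet_size) noexcept
{
    unsigned bits = 0;
    while ((1u << bits) < alphabet_size)
        ++bits;
    return bits;
}

// 27 and 20 letters both fall between 2^4 = 16 and 2^5 = 32.
static_assert(bits_needed(27) == bits_needed(20));
} // namespace sketch
```

Only the ten-letter reduced alphabets drop below that threshold, since ten letters fit into four bits.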

Function Documentation

◆ translate_triplet() [1/4]

template<genetic_code gc = genetic_code::CANONICAL, nucleotide_alphabet nucl_type>
constexpr aa27 seqan3::translate_triplet (nucl_type const &n1, nucl_type const &n2, nucl_type const &n3) noexcept

Translate one nucleotide triplet into a single amino acid (single nucleotide interface).

Template Parameters
nucl_type  The type of input nucleotides.
Parameters
[in] n1  First nucleotide in triplet.
[in] n2  Second nucleotide in triplet.
[in] n3  Third nucleotide in triplet.

Translates three single nucleotides into an amino acid according to the given genetic code.

Complexity

Constant.

Exceptions

No-throw guarantee.
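As a rough illustration of why such a translation can be constant-time and non-throwing, the following standalone sketch maps three bases to one amino acid letter with a single lookup in a 64-entry table of the standard genetic code. It works on plain char values rather than SeqAn3 alphabets; sketch::base_rank and sketch::translate_triplet_sketch are hypothetical names, and returning 'X' for anything other than A, C, G, T or U is an assumption of this sketch. SeqAn3's own implementation and its handling of ambiguous nucleotides may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace sketch
{
// Standard genetic code; codons are enumerated with bases in the order T, C, A, G.
inline constexpr std::string_view canonical_table =
    "FFLLSSSSYY**CC*W"   // first base T
    "LLLLPPPPHHQQRRRR"   // first base C
    "IIIMTTTTNNKKSSRR"   // first base A
    "VVVVAAAADDEEGGGG";  // first base G

static_assert(canonical_table.size() == 64);

// Position of a base in the T, C, A, G ordering, or -1 if it is not one of them.
constexpr int base_rank(char base) noexcept
{
    switch (base)
    {
        case 'T': case 'U': return 0;
        case 'C':           return 1;
        case 'A':           return 2;
        case 'G':           return 3;
        default:            return -1;
    }
}

// Hypothetical three-argument translation: one table lookup, no allocation, no exceptions.
constexpr char translate_triplet_sketch(char n1, char n2, char n3) noexcept
{
    int const r1 = base_rank(n1);
    int const r2 = base_rank(n2);
    int const r3 = base_rank(n3);
    if (r1 < 0 || r2 < 0 || r3 < 0)
        return 'X'; // assumption: unknown input yields the "unknown" letter
    return canonical_table[static_cast<std::size_t>(16 * r1 + 4 * r2 + r3)];
}
} // namespace sketch
```

Because the codon is turned into an index arithmetically, the cost does not depend on which triplet is passed.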

◆ translate_triplet() [2/4]

template<genetic_code gc = genetic_code::CANONICAL, typename tuple_type >
constexpr aa27 seqan3::translate_triplet (tuple_type const &input_tuple) noexcept

Translate one nucleotide triplet into a single amino acid (tuple interface).

Template Parameters
tuple_type  Type of input_tuple. Usually std::tuple, but similar types like std::array and seqan3::pod_tuple are also supported.
Parameters
[in] input_tuple  Triplet of nucleotides that should be converted to an amino acid.

Translates a std::tuple or std::array with 3 nucleotides into an amino acid according to the given genetic code.

Complexity

Constant.

Exceptions

No-throw guarantee.

Deprecated:
Use seqan3::translate_triplet(nucl_type const & n1, nucl_type const & n2, nucl_type const & n3) instead.

◆ translate_triplet() [3/4]

template<genetic_code gc = genetic_code::CANONICAL, std::ranges::input_range range_type>
constexpr aa27 seqan3::translate_triplet (range_type &&input_range)

Translate one nucleotide triplet into a single amino acid (range interface).

Template Parameters
range_type  Type of input_range; must satisfy std::ranges::input_range.
Parameters
[in] input_range  Range of three nucleotides that should be converted to an amino acid.

Translates a range with 3 nucleotides into an amino acid according to the given genetic code.

Complexity

Constant.

Exceptions

Strong exception guarantee (never modifies data).

Deprecated:
Use seqan3::translate_triplet(nucl_type const & n1, nucl_type const & n2, nucl_type const & n3) instead.

◆ translate_triplet() [4/4]

template<genetic_code gc = genetic_code::CANONICAL, std::ranges::random_access_range rng_t>
constexpr aa27 seqan3::translate_triplet (rng_t &&input_range)

Translate one nucleotide triplet into a single amino acid (range interface, input range allows random access).

Template Parameters
rng_t  Type of input_range; must satisfy std::ranges::random_access_range.
Parameters
[in] input_range  Range of three nucleotides that should be converted to an amino acid.

Translates a range with 3 nucleotides into an amino acid according to the given genetic code.

Complexity

Constant.

Exceptions

Strong exception guarantee (never modifies data).

Deprecated:
Use seqan3::translate_triplet(nucl_type const & n1, nucl_type const & n2, nucl_type const & n3) instead.

Variable Documentation

◆ enable_aminoacid

template<typename t >
constexpr bool seqan3::enable_aminoacid = detail::adl_only::enable_aminoacid_dispatcher::dispatch<remove_cvref_t<t>>()
inline

A trait that indicates whether a type shall model seqan3::aminoacid_alphabet.

Template Parameters
t  Type of the argument.

This is an auxiliary trait that is checked by seqan3::aminoacid_alphabet to verify that a type is an amino acid. This trait should never be read from, instead use seqan3::aminoacid_alphabet. However, user-defined alphabets that want to model seqan3::aminoacid_alphabet need to make sure that it evaluates to true for their type.

Specialisation

Do not specialise this trait directly. It acts as a wrapper and looks for two possible implementations (in this order):

  1. A static member variable enable_aminoacid of the class seqan3::custom::alphabet<t>.
  2. A free function constexpr bool enable_aminoacid(t) noexcept in the namespace of your type (or as friend).

If none of these is found, the default value is true if t is derived from seqan3::aminoacid_empty_base and false otherwise.

Implementations of 1. and 2. are required to be marked constexpr and the value / return value must be convertible to bool. Implementations of 2. are required to be marked noexcept. The value passed to functions implementing 2. shall be ignored, it is only used for finding the function via argument-dependent lookup. In case that your type is not seqan3::is_constexpr_default_constructible_v and you wish to provide an implementation for 2., instead overload for std::type_identity<t>.

To make a type model seqan3::aminoacid_alphabet, it is recommended that you derive from seqan3::aminoacid_base. If that is not possible, choose option 2., and only implement option 1. as a last resort.

Example

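#include <seqan3/alphabet/aminoacid/concept.hpp>

namespace your_namespace
{
// your own aminoacid definition, opting in by inheriting from the empty base class
struct your_aa : seqan3::aminoacid_empty_base
{
    //...
};
}

static_assert(seqan3::enable_aminoacid<your_namespace::your_aa> == true);

/***** OR *****/

namespace your_namespace2
{
// your own aminoacid definition
struct your_aa
{
    //...
};

// found via argument-dependent lookup; the argument itself is ignored
constexpr bool enable_aminoacid(your_aa) noexcept
{
    return true;
}
}

static_assert(seqan3::enable_aminoacid<your_namespace2::your_aa> == true);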
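For readers curious how a wrapper trait of this kind can be built, the following standalone sketch shows one possible dispatch: it first looks for a free function enable_aminoacid found via argument-dependent lookup and otherwise falls back to checking for an empty base class. Everything in namespace sketch is a hypothetical stand-in and not SeqAn3's implementation; in particular, option 1 (the seqan3::custom::alphabet<t> member) and the std::type_identity<t> overload are left out, and the sketch assumes the type is constexpr default-constructible.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch
{
// Hypothetical stand-in for an empty opt-in base class.
struct aminoacid_empty_base {};

namespace detail
{
// Detects whether an unqualified call enable_aminoacid(t) is well-formed (found via ADL).
template <typename t, typename = void>
struct has_adl_opt_in : std::false_type {};

template <typename t>
struct has_adl_opt_in<t, std::void_t<decltype(enable_aminoacid(std::declval<t>()))>> : std::true_type {};

template <typename t>
constexpr bool dispatch() noexcept
{
    if constexpr (has_adl_opt_in<t>::value)
        return enable_aminoacid(t{});                       // free function in the type's namespace
    else
        return std::is_base_of_v<aminoacid_empty_base, t>;  // fallback: inheritance from the base
}
} // namespace detail

// Hypothetical wrapper trait; cv-qualifiers and references are stripped before dispatching.
template <typename t>
inline constexpr bool is_enabled_aminoacid =
    detail::dispatch<std::remove_cv_t<std::remove_reference_t<t>>>();
} // namespace sketch

namespace user_a { struct aa : sketch::aminoacid_empty_base {}; }
namespace user_b { struct aa {}; constexpr bool enable_aminoacid(aa) noexcept { return true; } }
namespace user_c { struct not_aa {}; }

static_assert(sketch::is_enabled_aminoacid<user_a::aa>);
static_assert(sketch::is_enabled_aminoacid<user_b::aa const &>);
static_assert(!sketch::is_enabled_aminoacid<user_c::not_aa>);
```

The wrapper variable deliberately has a different name than the free function so that the unqualified call inside the detection trait can only resolve through argument-dependent lookup.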

Customisation point

This is a customisation point (see Customisation). To change the default behaviour for your own alphabet, follow the above instructions.