Provides the amino acid alphabets and functionality for translation from nucleotide.

Classes
class	seqan3::aa10li
	The reduced Li amino acid alphabet.

class	seqan3::aa10murphy
	The reduced Murphy amino acid alphabet.

class	seqan3::aa20
	The canonical amino acid alphabet.

class	seqan3::aa27
	The twenty-seven letter amino acid alphabet.

interface	aminoacid_alphabet
	A concept that indicates whether an alphabet represents amino acids.

class	seqan3::aminoacid_base< derived_type, size >
	A CRTP-base that refines seqan3::alphabet_base and is used by the amino acids.

struct	seqan3::aminoacid_empty_base
	This is an empty base class that can be inherited by types that shall model seqan3::aminoacid_alphabet.

Functions
template<genetic_code gc = genetic_code::canonical, nucleotide_alphabet nucl_type>
constexpr aa27	seqan3::translate_triplet (nucl_type const &n1, nucl_type const &n2, nucl_type const &n3) noexcept
	Translate one nucleotide triplet into a single amino acid (single nucleotide interface).

Variables
template<typename t >
constexpr bool	seqan3::enable_aminoacid
	A trait that indicates whether a type shall model seqan3::aminoacid_alphabet.

Detailed Description

Provides the amino acid alphabets and functionality for translation from nucleotide.

See also: Alphabet

Introduction

Amino acid sequences are an important part of bioinformatic data processing and are used by many applications. While it is possible to represent them in a regular std::string, it makes sense to use specialised data structures in most cases. This sub-module offers the 27-letter amino acid alphabet as well as three reduced versions that can be used with regular containers and ranges. The 27-letter amino acid alphabet contains the 20 canonical amino acids, 2 additional proteinogenic amino acids (Pyrrolysine and Selenocysteine) and a termination letter (*). Additionally, 4 wildcard letters are offered, which allow more generic usage, for example in the case of ambiguous amino acids (e.g. J, which means either Isoleucine or Leucine). See also https://en.wikipedia.org/wiki/Amino_acid for more information about the amino acid alphabet.

Conversions

Amino acid name	Three letter code	One letter code	Remapped in seqan3::aa20	Remapped in seqan3::aa10murphy	Remapped in seqan3::aa10li
Alanine	Ala	A	A	A	A
Arginine	Arg	R	R	K	K
Asparagine	Asn	N	N	B	H
Aspartic acid	Asp	D	D	B	B
Cysteine	Cys	C	C	C	C
Tyrosine	Tyr	Y	Y	F	F
Glutamic acid	Glu	E	E	B	B
Glutamine	Gln	Q	Q	B	B
Glycine	Gly	G	G	G	G
Histidine	His	H	H	H	H
Isoleucine	Ile	I	I	I	I
Leucine	Leu	L	L	I	J
Lysine	Lys	K	K	K	K
Methionine	Met	M	M	I	J
Phenylalanine	Phe	F	F	F	F
Proline	Pro	P	P	P	P
Serine	Ser	S	S	S	A
Threonine	Thr	T	T	S	A
Tryptophan	Trp	W	W	F	F
Valine	Val	V	V	I	I
Selenocysteine	Sec	U	C	C	C
Pyrrolysine	Pyl	O	K	K	K
Asparagine or aspartic acid	Asx	B	D	B	B
Glutamine or glutamic acid	Glx	Z	E	B	B
Leucine or Isoleucine	Xle	J	L	I	J
Unknown	Xaa	X	S	S	A
Stop Codon	N/A	*	W	F	F

As shown above, alphabets smaller than 27 letters internally represent multiple amino acids as one.
For most cases it is highly recommended to use seqan3::aa27, as seqan3::aa20 provides no benefit with regard to space consumption (both need 5 bits). Use seqan3::aa20 only when you know you need to interface with other software or formats that only support the canonical set.
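
The remapping in the table can be read as a plain character-to-character function. The following is a minimal standalone sketch of the seqan3::aa10murphy column only; it does not use SeqAn, and the function name to_murphy10_sketch is hypothetical. Mapping unrecognised characters like X (Unknown) is an assumption made for this sketch, not documented library behaviour.

```cpp
// Standalone illustration of the "Remapped in seqan3::aa10murphy" column.
// Not SeqAn code: the function name is hypothetical.
constexpr char to_murphy10_sketch(char aa) noexcept
{
    switch (aa)
    {
        case 'A': return 'A';
        case 'C': case 'U': return 'C';
        case 'G': return 'G';
        case 'H': return 'H';
        case 'P': return 'P';
        // Asn, Asp, Glu, Gln and the wildcards Asx, Glx
        case 'N': case 'D': case 'E': case 'Q': case 'B': case 'Z': return 'B';
        // Tyr, Phe, Trp and the stop codon
        case 'Y': case 'F': case 'W': case '*': return 'F';
        // Ile, Leu, Met, Val and the wildcard Xle
        case 'I': case 'L': case 'M': case 'V': case 'J': return 'I';
        // Arg, Lys, Pyl
        case 'R': case 'K': case 'O': return 'K';
        // Ser, Thr and Unknown
        case 'S': case 'T': case 'X': return 'S';
        // Assumption for this sketch: treat anything else like Unknown (X).
        default: return 'S';
    }
}

static_assert(to_murphy10_sketch('L') == 'I');
static_assert(to_murphy10_sketch('*') == 'F');
```

The 27 input letters collapse onto 10 output letters, which is what "internally represent multiple amino acids as one" means in practice.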

Function Documentation

◆ translate_triplet()

template<genetic_code gc = genetic_code::canonical, nucleotide_alphabet nucl_type>

constexpr aa27 seqan3::translate_triplet	(	nucl_type const &	n1,
		nucl_type const &	n2,
		nucl_type const &	n3
	)

constexpr noexcept

Translate one nucleotide triplet into a single amino acid (single nucleotide interface).

Template Parameters

nucl_type The type of input nucleotides; must model seqan3::nucleotide_alphabet.

Parameters

[in]	n1	First nucleotide in triplet.
[in]	n2	Second nucleotide in triplet.
[in]	n3	Third nucleotide in triplet.

Translates three single nucleotides (one triplet) into an amino acid according to the given genetic code.

Complexity

Constant.

Exceptions

No-throw guarantee.

This entity is experimental and subject to change in the future. Experimental since version 3.1.
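
To illustrate why a triplet translation can run in constant time, the following standalone sketch looks up the amino acid in a 64-entry table of the standard genetic code. It is not SeqAn's implementation and does not use the library; the names codon_rank and translate_codon_sketch are hypothetical, plain characters stand in for nucleotide alphabet types, and returning 'X' for anything other than A, C, G, T/U is an assumption of this sketch.

```cpp
#include <string_view>

// Hypothetical helper: rank of a nucleotide in the conventional
// T, C, A, G codon-table order, or -1 for any other character.
constexpr int codon_rank(char n) noexcept
{
    switch (n)
    {
        case 'T': case 'U': return 0;
        case 'C': return 1;
        case 'A': return 2;
        case 'G': return 3;
        default: return -1;
    }
}

// Hypothetical function: translates one triplet with a single table lookup.
constexpr char translate_codon_sketch(char n1, char n2, char n3) noexcept
{
    // Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
    constexpr std::string_view table =
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    int const r1 = codon_rank(n1);
    int const r2 = codon_rank(n2);
    int const r3 = codon_rank(n3);

    // Assumption for this sketch: unresolvable input maps to 'X' (Unknown).
    if (r1 < 0 || r2 < 0 || r3 < 0)
        return 'X';

    return table[16 * r1 + 4 * r2 + r3];
}

static_assert(translate_codon_sketch('A', 'T', 'G') == 'M'); // start codon
static_assert(translate_codon_sketch('T', 'A', 'A') == '*'); // stop codon
```

Because the result is a single indexed read, the cost does not depend on the input, which matches the constant complexity stated above.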

Variable Documentation

◆ enable_aminoacid

template<typename t >

constexpr bool seqan3::enable_aminoacid

inline constexpr

Initial value:

= detail::adl_only::enable_aminoacid_cpo<std::remove_cvref_t<t>>::cpo_overload(detail::priority_tag<2>{})

A trait that indicates whether a type shall model seqan3::aminoacid_alphabet.

Template Parameters

t	Type of the argument.

This is an auxiliary trait that is checked by seqan3::aminoacid_alphabet to verify that a type is an amino acid. This trait should never be read from directly; use seqan3::aminoacid_alphabet instead. However, user-defined alphabets that want to model seqan3::aminoacid_alphabet need to make sure that it evaluates to true for their type.

Specialisation

Do not specialise this trait directly. It acts as a wrapper and looks for three possible implementations (in this order):

1. A static member variable enable_aminoacid of the class seqan3::custom::alphabet<t>.
2. A free function constexpr bool enable_aminoacid(t) noexcept in the namespace of your type (or as friend).
3. If none of these is found, the default value is defined as:
   - true if the type inherits from seqan3::aminoacid_empty_base (or seqan3::aminoacid_base),
   - false otherwise.

Implementations of 1. and 2. are required to be marked constexpr, and the value / return value must be convertible to bool. Implementations of 2. are additionally required to be marked noexcept. The value passed to functions implementing 2. shall be ignored; it is only used for finding the function via argument-dependent lookup. If your type is not seqan3::is_constexpr_default_constructible_v and you wish to provide an implementation for 2., overload for std::type_identity<t> instead.

To make a type model seqan3::aminoacid_alphabet, it is recommended that you derive from seqan3::aminoacid_base. If that is not possible, choose option 2., and only implement option 1. as a last resort.

Example

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
 
#include <seqan3/alphabet/aminoacid/concept.hpp>
 
namespace your_namespace
{
 
// your own aminoacid definition
struct your_aa : seqan3::aminoacid_empty_base
{
    //...
};
 
} // namespace your_namespace
 
static_assert(seqan3::enable_aminoacid<your_namespace::your_aa> == true);
 
/***** OR *****/
 
namespace your_namespace2
{
 
// your own aminoacid definition
struct your_aa
{
    //...
};
 
constexpr bool enable_aminoacid(your_aa) noexcept
{
    return true;
}
 
} // namespace your_namespace2
 
static_assert(seqan3::enable_aminoacid<your_namespace2::your_aa> == true);
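
The lookup order described under Specialisation can be imitated in a few lines of standalone C++17. The sketch below is not SeqAn's implementation and does not use the library: every name in the namespaces sketch and user is hypothetical, option 1 (the seqan3::custom::alphabet member) is left out for brevity, and the type is assumed to be default constructible. It only shows the principle that a free function found via argument-dependent lookup takes precedence over the inheritance-based default.

```cpp
#include <type_traits>
#include <utility>

namespace sketch
{
// Stand-in for the empty base class; hypothetical.
struct aminoacid_empty_base
{};

namespace adl_only
{
// Detects whether a free function enable_aminoacid(T) is found via ADL.
template <typename T, typename = void>
struct has_free_function : std::false_type
{};

template <typename T>
struct has_free_function<T, std::void_t<decltype(enable_aminoacid(std::declval<T>()))>> : std::true_type
{};

template <typename T>
constexpr bool lookup() noexcept
{
    if constexpr (has_free_function<T>::value)
        return enable_aminoacid(T{}); // free function wins if present
    else
        return std::is_base_of_v<aminoacid_empty_base, T>; // default
}
} // namespace adl_only

// The wrapper trait; cv-qualifiers and references are stripped first.
template <typename T>
inline constexpr bool enable_aminoacid_v = adl_only::lookup<std::remove_cv_t<std::remove_reference_t<T>>>();
} // namespace sketch

namespace user
{
struct by_base : sketch::aminoacid_empty_base
{};

struct by_function
{};

constexpr bool enable_aminoacid(by_function) noexcept
{
    return true;
}

struct neither
{};
} // namespace user

static_assert(sketch::enable_aminoacid_v<user::by_base>);
static_assert(sketch::enable_aminoacid_v<user::by_function>);
static_assert(!sketch::enable_aminoacid_v<user::neither>);
```

Keeping the detection in a separate inner namespace means no other function named enable_aminoacid is visible there, so only functions in the namespace of the user's type are considered.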

Customisation point

This is a customisation point (see Customisation). To change the default behaviour for your own alphabet, follow the above instructions.

This entity is experimental and subject to change in the future. Implementation 2 (free function) is not stable.

This entity is stable. Since version 3.1. The name seqan3::enable_aminoacid, Implementation 1, and Implementation 3 are stable and will not change.
