SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
The reduced Li amino acid alphabet.
#include <seqan3/alphabet/aminoacid/aa10li.hpp>
Public Member Functions | |
Constructors, destructor and assignment | |
constexpr | aa10li () noexcept=default |
Defaulted. | |
constexpr | aa10li (aa10li const &) noexcept=default |
Defaulted. | |
constexpr | aa10li (aa10li &&) noexcept=default |
Defaulted. | |
constexpr aa10li & | operator= (aa10li const &) noexcept=default |
Defaulted. | |
constexpr aa10li & | operator= (aa10li &&) noexcept=default |
Defaulted. | |
~aa10li () noexcept=default | |
Defaulted. | |
Public Member Functions inherited from seqan3::aminoacid_base< aa10li, 10 > | |
constexpr | aminoacid_base (other_aa_type const other) noexcept |
Allow explicit construction from any other aminoacid type and convert via the character representation. | |
Public Member Functions inherited from seqan3::alphabet_base< derived_type, size, char_t > | |
constexpr | alphabet_base () noexcept=default |
Defaulted. | |
constexpr | alphabet_base (alphabet_base const &) noexcept=default |
Defaulted. | |
constexpr | alphabet_base (alphabet_base &&) noexcept=default |
Defaulted. | |
constexpr alphabet_base & | operator= (alphabet_base const &) noexcept=default |
Defaulted. | |
constexpr alphabet_base & | operator= (alphabet_base &&) noexcept=default |
Defaulted. | |
~alphabet_base () noexcept=default | |
Defaulted. | |
constexpr char_type | to_char () const noexcept |
Return the letter as a character of char_type. | |
constexpr rank_type | to_rank () const noexcept |
Return the letter's numeric value (rank in the alphabet). | |
constexpr derived_type & | assign_char (char_type const chr) noexcept |
Assign from a character; invalid characters are implicitly converted. | |
constexpr derived_type & | assign_rank (rank_type const c) noexcept |
Assign from a numeric value. | |
Private Types | |
using | base_t = aminoacid_base< aa10li, 10 > |
The base class. | |
Static Private Member Functions | |
static constexpr rank_type | char_to_rank (char_type const chr) |
Returns the rank representation of a character. | |
static constexpr char_type | rank_to_char (rank_type const rank) |
Returns the character representation of a rank. | |
Private Attributes | |
friend | base_t |
Befriend seqan3::aminoacid_base. | |
Static Private Attributes | |
static constexpr std::array< rank_type, 256 > | char_to_rank_table |
The lookup table used in char_to_rank. | |
static constexpr char_type | rank_to_char_table [alphabet_size] {'A', 'B', 'C', 'F', 'G', 'H', 'I', 'J', 'K', 'P'} |
The lookup table used in rank_to_char. | |
Related Symbols | |
(Note that these are not member symbols.) | |
using | aa10li_vector = std::vector< aa10li > |
Alias for a std::vector of seqan3::aa10li. | |
Literals | |
constexpr aa10li | operator""_aa10li (char const c) noexcept |
The seqan3::aa10li char literal. | |
constexpr aa10li_vector | operator""_aa10li (char const *const s, size_t const n) |
The seqan3::aa10li string literal. | |
Related Symbols inherited from semialphabet | |
template<cereal_output_archive archive_t, semialphabet alphabet_t> | |
alphabet_rank_t< alphabet_t > | save_minimal (archive_t const &, alphabet_t const &l) |
Save an alphabet letter to stream. | |
template<cereal_input_archive archive_t, typename wrapped_alphabet_t > | |
void | load_minimal (archive_t const &, wrapped_alphabet_t &&l, alphabet_rank_t< detail::strip_cereal_wrapper_t< wrapped_alphabet_t > > const &r) |
Restore an alphabet letter from a saved rank. | |
Additional Inherited Members | |
Static Public Member Functions inherited from seqan3::aminoacid_base< aa10li, 10 > | |
static constexpr bool | char_is_valid (char_type const c) noexcept |
Validate whether a character value has a one-to-one mapping to an alphabet value. | |
Static Public Attributes inherited from seqan3::alphabet_base< derived_type, size, char_t > | |
static constexpr detail::min_viable_uint_t< size > | alphabet_size = size |
The size of the alphabet, i.e. the number of different values it can take. | |
Protected Types inherited from seqan3::alphabet_base< derived_type, size, char_t > | |
using | char_type = std::conditional_t< std::same_as< char_t, void >, char, char_t > |
The char representation; conditional needed to make semi alphabet definitions legal. | |
using | rank_type = detail::min_viable_uint_t< size - 1 > |
The type of the alphabet when represented as a number (e.g. via to_rank()). | |
The reduced Li amino acid alphabet.
The alphabet consists of the letters A, B, C, F, G, H, I, J, K and P.
A represents hydrophilic and alcohol residues (A, S, T).
B represents charged/polar residues (B, D, E, Q, Z).
C represents cysteine and the species-specific amino acid selenocysteine.
F represents amino acids with aromatic residues (F, W, Y).
H represents a group of hydrophobic residues (H, N).
I represents a group of large hydrophobic residues (I, V).
J represents a group of large hydrophobic residues (J, L, M).
K represents long-chain positively charged residues (K, R) and the species-specific amino acid pyrrolysine.
G and P represent only themselves.
This alphabet reduces the amino acid alphabet to 10 letters while still being able to recognize and represent the folding of all proteins. Amino acids are grouped according to the properties of their residues.
Note: Letters that belong to the extended alphabet are converted automatically. Terminator characters are converted to F because the most commonly occurring stop codon in higher eukaryotes is UGA ². UGA is most similar to the tryptophan codon (UGG), and this alphabet converts tryptophan to phenylalanine (F). Anything unknown is converted to A.
Input Letter | Converts to |
---|---|
D | B ¹ |
E | B ¹ |
L | J ¹ |
M | J ¹ |
N | H ¹ |
O | K ¹ |
Q | B ¹ |
R | K ¹ |
S | A ¹ |
T | A ¹ |
U | C ¹ |
V | I ¹ |
W | F ¹ |
Y | F ¹ |
Z | B ¹ |
X (Unknown) | A ¹ |
* (Terminator) | F ¹ ² |
¹ T. Li, K. Fan, J. Wang, and W. Wang. Reduction of protein sequence complexity by residue grouping. Protein Eng., 16(5):323–330, May 2003.
² E. Trotta. Selective forces and mutational biases drive stop codon usage in the human genome: a comparison with sense codon usage. BMC Genomics, 17:366, 2016. https://doi.org/10.1186/s12864-016-2692-4
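The conversion table above can be sketched as a small standalone function. This is only an illustration under stated assumptions, not SeqAn3's implementation: the name reduce_to_aa10li is hypothetical, the code uses nothing from the library, and every character that is not listed in the table (including lowercase letters) falls through to A, which may differ from what the library does for such input.

```cpp
// Hypothetical standalone helper that mirrors the conversion table; not part of SeqAn3.
constexpr char reduce_to_aa10li(char const c) noexcept
{
    switch (c)
    {
        // The ten letters of the reduced alphabet map to themselves.
        case 'A': case 'B': case 'C': case 'F': case 'G':
        case 'H': case 'I': case 'J': case 'K': case 'P':
            return c;
        case 'S': case 'T':                     return 'A';
        case 'D': case 'E': case 'Q': case 'Z': return 'B';
        case 'U':                               return 'C';
        case 'W': case 'Y': case '*':           return 'F'; // '*' is the terminator
        case 'N':                               return 'H';
        case 'V':                               return 'I';
        case 'L': case 'M':                     return 'J';
        case 'O': case 'R':                     return 'K';
        default:                                return 'A'; // 'X' and anything unknown
    }
}

// Compile-time checks against rows of the table.
static_assert(reduce_to_aa10li('D') == 'B');
static_assert(reduce_to_aa10li('U') == 'C');
static_assert(reduce_to_aa10li('*') == 'F');
static_assert(reduce_to_aa10li('X') == 'A');
```

Because the function is constexpr, the static_assert lines verify the mapping during compilation, so a mistyped row would fail the build rather than surface at run time.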
static constexpr rank_type char_to_rank (char_type const chr) | inline static constexpr private |
Returns the rank representation of a character.
This function is required by seqan3::alphabet_base.
static constexpr char_type rank_to_char (rank_type const rank) | inline static constexpr private |
Returns the character representation of a rank.
This function is required by seqan3::alphabet_base.
constexpr aa10li_vector operator""_aa10li (char const *const s, size_t const n) | related |
The seqan3::aa10li string literal.
[in] | s | A pointer to the character string to assign. |
[in] | n | The size of the character string to assign. |
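You can use this string literal to easily assign to an aa10li_vector. For example, with using namespace seqan3::literals; in scope, write seqan3::aa10li_vector sequence = "ACGTTA"_aa10li;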
constexpr aa10li operator""_aa10li (char const c) noexcept | related |
The seqan3::aa10li char literal.
[in] | c | The character to assign. |
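You can use this char literal to assign a seqan3::aa10li character. For example, with using namespace seqan3::literals; in scope, write seqan3::aa10li letter = 'A'_aa10li;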
static constexpr std::array< rank_type, 256 > char_to_rank_table | static constexpr private |
The lookup table used in char_to_rank.
Ideally, these lookup tables would be defined directly inside their respective constexpr functions. At the time of writing, however, GCC did not auto-generate lookup tables from such functions (Clang >= 4 did).
static constexpr char_type rank_to_char_table [alphabet_size] | static constexpr private |
The lookup table used in rank_to_char.
Ideally, these lookup tables would be defined directly inside their respective constexpr functions. At the time of writing, however, GCC did not auto-generate lookup tables from such functions (Clang >= 4 did).
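One way to obtain explicit lookup tables like the two documented above is to fill a std::array once at compile time in an immediately invoked constexpr lambda. The sketch below is an illustration under assumptions, not the library's code: the names toy_rank_to_char, toy_char_to_rank_table and toy_char_to_rank are hypothetical, and only the ten letters of the alphabet are entered. Every other character keeps the default rank 0 (A), so the conversions of the extended letters are deliberately left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical names; the values mirror the documented rank_to_char_table.
inline constexpr std::array<char, 10> toy_rank_to_char{'A', 'B', 'C', 'F', 'G', 'H', 'I', 'J', 'K', 'P'};

// Build the inverse (character -> rank) table once, at compile time.
inline constexpr std::array<std::uint8_t, 256> toy_char_to_rank_table = []() constexpr
{
    std::array<std::uint8_t, 256> table{}; // value-initialised: every entry starts as rank 0 ('A')

    for (std::size_t rank = 0; rank < toy_rank_to_char.size(); ++rank)
        table[static_cast<unsigned char>(toy_rank_to_char[rank])] = static_cast<std::uint8_t>(rank);

    return table;
}();

// A lookup is a single array access; the cast avoids negative indices for signed char.
constexpr std::uint8_t toy_char_to_rank(char const c) noexcept
{
    return toy_char_to_rank_table[static_cast<unsigned char>(c)];
}

static_assert(toy_char_to_rank('A') == 0);
static_assert(toy_char_to_rank('P') == 9);
static_assert(toy_rank_to_char[toy_char_to_rank('H')] == 'H'); // round trip
```

Storing the table as a static constexpr object rather than recomputing it inside the function means the compiler does not need to deduce a table from branching code, which is the situation the note above describes.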