The reduced Murphy amino acid alphabet.

#include <seqan3/alphabet/aminoacid/aa10murphy.hpp>

Public Member Functions
Constructors, destructor and assignment
constexpr	aa10murphy () noexcept=default
	Defaulted.

constexpr	aa10murphy (aa10murphy const &) noexcept=default
	Defaulted.

constexpr	aa10murphy (aa10murphy &&) noexcept=default
	Defaulted.

constexpr aa10murphy &	operator= (aa10murphy const &) noexcept=default
	Defaulted.

constexpr aa10murphy &	operator= (aa10murphy &&) noexcept=default
	Defaulted.

	~aa10murphy () noexcept=default
	Defaulted.

Public Member Functions inherited from seqan3::aminoacid_base< aa10murphy, 10 >
constexpr	aminoacid_base (other_aa_type const other) noexcept
	Allow explicit construction from any other aminoacid type and convert via the character representation.

Public Member Functions inherited from seqan3::alphabet_base< derived_type, size, char_t >
constexpr	alphabet_base () noexcept=default
	Defaulted.

constexpr	alphabet_base (alphabet_base const &) noexcept=default
	Defaulted.

constexpr	alphabet_base (alphabet_base &&) noexcept=default
	Defaulted.

constexpr alphabet_base &	operator= (alphabet_base const &) noexcept=default
	Defaulted.

constexpr alphabet_base &	operator= (alphabet_base &&) noexcept=default
	Defaulted.

	~alphabet_base () noexcept=default
	Defaulted.

constexpr char_type	to_char () const noexcept
	Return the letter as a character of char_type.

constexpr rank_type	to_rank () const noexcept
	Return the letter's numeric value (rank in the alphabet).

constexpr derived_type &	assign_char (char_type const chr) noexcept
	Assign from a character; invalid characters are implicitly converted.

constexpr derived_type &	assign_rank (rank_type const c) noexcept
	Assign from a numeric value.

Related Symbols
(Note that these are not member symbols.)
using	aa10murphy_vector = std::vector< aa10murphy >
	Alias for a std::vector of seqan3::aa10murphy.

Additional Inherited Members
Static Public Member Functions inherited from seqan3::aminoacid_base< aa10murphy, 10 >
static constexpr bool	char_is_valid (char_type const c) noexcept
	Validate whether a character value has a one-to-one mapping to an alphabet value.

Static Public Attributes inherited from seqan3::alphabet_base< derived_type, size, char_t >
static constexpr detail::min_viable_uint_t< size >	alphabet_size = size
	The size of the alphabet, i.e. the number of different values it can take.

Protected Types inherited from seqan3::alphabet_base< derived_type, size, char_t >
using	char_type = std::conditional_t< std::same_as< char_t, void >, char, char_t >
	The char representation; conditional needed to make semi alphabet definitions legal.

using	rank_type = detail::min_viable_uint_t< size - 1 >
	The type of the alphabet when represented as a number (e.g. via to_rank()).
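
The two declarations above choose their integer types from different bounds: alphabet_size from size, rank_type from size - 1, because the largest rank is one less than the number of letters. The following is a minimal sketch of that idea only. smallest_uint_t is a hypothetical stand-in and not the library's detail::min_viable_uint_t, which may differ in detail (for example in how it treats very small values).

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in (not SeqAn3 code): pick the narrowest unsigned
// integer type that can store `value`.
template <std::uint64_t value>
using smallest_uint_t =
    std::conditional_t<(value <= 0xFFu), std::uint8_t,
    std::conditional_t<(value <= 0xFFFFu), std::uint16_t,
    std::conditional_t<(value <= 0xFFFFFFFFu), std::uint32_t, std::uint64_t>>>;

// For a 10-letter alphabet both the size (10) and the largest rank (9)
// fit into 8 bits.
static_assert(std::is_same_v<smallest_uint_t<10>, std::uint8_t>);
static_assert(std::is_same_v<smallest_uint_t<10 - 1>, std::uint8_t>);

// The "- 1" matters at a boundary: a 256-letter alphabet has ranks 0..255,
// which fit into 8 bits, while the size 256 itself does not.
static_assert(std::is_same_v<smallest_uint_t<256 - 1>, std::uint8_t>);
static_assert(std::is_same_v<smallest_uint_t<256>, std::uint16_t>);
```

All checks run at compile time, so the sketch compiles only if the type selection behaves as described.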

Detailed Description

The reduced Murphy amino acid alphabet.

The alphabet consists of the letters A, B, C, F, G, H, I, K, P and S.
B represents charged/polar residues (E, D, N, Q).
C represents cysteine and the species-specific amino acid selenocysteine.
F represents amino acids with large and mainly hydrophobic aromatic side chains (F, W, Y).
I represents large hydrophobes (L, V, I, M).
K represents long-chain positively charged residues (K, R) and the species-specific amino acid pyrrolysine.
S represents alcohols (S, T) and unknown residues.
A, G, H and P represent only themselves.

This alphabet reduces the amino acid alphabet to 10 letters while remaining able to recognize and represent protein folds. Amino acids are grouped by similar physical and chemical properties.

Note: Letters that belong to the extended alphabet are converted automatically. Terminator characters are converted to F because the most commonly occurring stop codon in higher eukaryotes is UGA². UGA is most similar to the codon for tryptophan, which in this alphabet is converted to F (phenylalanine). Anything unknown is converted to S.

Input Letter	Converts to
D	B¹
E	B¹
J	I¹
L	I¹
M	I¹
N	B¹
O	K¹
Q	B¹
R	K¹
T	S¹
U	C¹
V	I¹
W	F¹
Y	F¹
Z	B¹
X (Unknown)	S¹
* (Terminator)	F¹,²

¹L. R. Murphy, A. Wallqvist, and R. M. Levy. Simplified amino acid alphabets for protein fold recognition and implications for folding. Protein Eng., 13(3):149–152, Mar 2000.
²Trotta, E. (2016). Selective forces and mutational biases drive stop codon usage in the human genome: a comparison with sense codon usage. BMC Genomics, 17, 366. https://doi.org/10.1186/s12864-016-2692-4
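
The conversion rules in the table above can be sketched as a small stand-alone function. This is an illustration only and not SeqAn3's implementation: to_murphy10 is a hypothetical name, and folding lower-case input to upper case is an assumption of the sketch.

```cpp
// Hypothetical stand-alone helper (not part of SeqAn3): maps one amino acid
// character to its group letter according to the conversion table above.
constexpr char to_murphy10(char c) noexcept
{
    // Assumption of this sketch: ASCII lower case is folded to upper case.
    if (c >= 'a' && c <= 'z')
        c = static_cast<char>(c - 'a' + 'A');

    switch (c)
    {
    // Letters of the reduced alphabet map to themselves.
    case 'A': case 'B': case 'C': case 'F': case 'G':
    case 'H': case 'I': case 'K': case 'P': case 'S':
        return c;
    case 'D': case 'E': case 'N': case 'Q': case 'Z':
        return 'B'; // charged/polar residues
    case 'U':
        return 'C'; // selenocysteine
    case 'W': case 'Y': case '*':
        return 'F'; // aromatic residues and the terminator
    case 'J': case 'L': case 'M': case 'V':
        return 'I'; // large hydrophobes
    case 'O': case 'R':
        return 'K'; // long-chain positively charged residues
    case 'T': case 'X':
        return 'S'; // alcohols and the explicit unknown letter
    default:
        return 'S'; // anything else is treated as unknown
    }
}

// Compile-time spot checks against rows of the table.
static_assert(to_murphy10('E') == 'B');
static_assert(to_murphy10('*') == 'F');
static_assert(to_murphy10('?') == 'S');
```

With the library itself no such helper is needed: assign_char performs the conversion, as the example below shows.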

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
 
#include <seqan3/alphabet/aminoacid/aa10murphy.hpp>
#include <seqan3/core/debug_stream.hpp>
 
int main()
{
    using namespace seqan3::literals;
 
    seqan3::aa10murphy letter{'A'_aa10murphy};
 
    letter.assign_char('C');
    seqan3::debug_stream << letter << '\n'; // prints "C"
 
    letter.assign_char('?');                // Unknown characters are implicitly converted to S.
    seqan3::debug_stream << letter << '\n'; // prints "S"
}

This entity is stable. Since version 3.1.

Friends And Related Symbol Documentation

◆ aa10murphy_vector

using aa10murphy_vector = std::vector<aa10murphy>

This entity is stable. Since version 3.1.

The documentation for this class was generated from the following file:

aa10murphy.hpp