SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
seqan3::cigar Class Reference

The seqan3::cigar semialphabet pairs a counter with a seqan3::cigar::operation letter.

#include <seqan3/alphabet/cigar/cigar.hpp>

+ Inheritance diagram for seqan3::cigar:

Public Types

using operation = exposition_only::cigar_operation
 The (extended) cigar operation alphabet of M,D,I,H,N,P,S,X,=.
 

Public Member Functions

Constructors, destructor and assignment
constexpr cigar () noexcept=default
 Defaulted.
 
constexpr cigar (cigar const &) noexcept=default
 Defaulted.
 
constexpr cigar (cigar &&) noexcept=default
 Defaulted.
 
constexpr cigar & operator= (cigar const &) noexcept=default
 Defaulted.
 
constexpr cigar & operator= (cigar &&) noexcept=default
 Defaulted.
 
 ~cigar () noexcept=default
 Defaulted.
 
constexpr cigar (component_type const alph) noexcept
 Construction via a value of one of the components.
 
constexpr cigar & operator= (component_type const alph) noexcept
 Assignment via a value of one of the components.
 
Read functions
small_string< 11 > to_string () const noexcept
 Return the string representation.
 
Write functions
cigar & assign_string (std::string_view const input) noexcept
 Assign from a std::string_view.
 
- Public Member Functions inherited from seqan3::alphabet_tuple_base< cigar, uint32_t, exposition_only::cigar_operation >
constexpr alphabet_tuple_base (component_types... components) noexcept
 Construction from initialiser-list.
 
constexpr alphabet_tuple_base (component_type const alph) noexcept
 Construction via a value of one of the components.
 
constexpr alphabet_tuple_base (indirect_component_type const alph) noexcept
 Construction via a value of a subtype that is assignable to one of the components.
 
constexpr cigar & operator= (component_type const alph) noexcept
 Assignment via a value of one of the components.
 
constexpr cigar & operator= (indirect_component_type const alph) noexcept
 Assignment via a value of a subtype that is assignable to one of the components.
 
constexpr operator type () const noexcept
 Implicit cast to a single letter. Works only if the type is unique in the type list.
 
- Public Member Functions inherited from seqan3::alphabet_base< derived_type, size, char_t >
constexpr alphabet_base () noexcept=default
 Defaulted.
 
constexpr alphabet_base (alphabet_base const &) noexcept=default
 Defaulted.
 
constexpr alphabet_base (alphabet_base &&) noexcept=default
 Defaulted.
 
constexpr alphabet_base & operator= (alphabet_base const &) noexcept=default
 Defaulted.
 
constexpr alphabet_base & operator= (alphabet_base &&) noexcept=default
 Defaulted.
 
 ~alphabet_base () noexcept=default
 Defaulted.
 
constexpr char_type to_char () const noexcept
 Return the letter as a character of char_type.
 
constexpr rank_type to_rank () const noexcept
 Return the letter's numeric value (rank in the alphabet).
 
constexpr derived_type & assign_char (char_type const chr) noexcept
 Assign from a character, implicitly converts invalid characters.
 
constexpr derived_type & assign_rank (rank_type const c) noexcept
 Assign from a numeric value.
 

Friends

Get functions
template<size_t index>
constexpr auto get (cigar &l) noexcept
 Tuple-like access to the contained components.
 
template<typename type >
constexpr auto get (cigar &l) noexcept
 Tuple-like access to the contained components.
 
- Friends inherited from seqan3::alphabet_tuple_base< cigar, uint32_t, exposition_only::cigar_operation >
Comparison operators

Related Symbols

(Note that these are not member symbols.)

Other literals
constexpr cigar::operation operator""_cigar_operation (char const c) noexcept
 The seqan3::cigar::operation char literal.
 

Additional Inherited Members

- Static Public Attributes inherited from seqan3::alphabet_base< derived_type, size, char_t >
static constexpr detail::min_viable_uint_t< size > alphabet_size = size
 The size of the alphabet, i.e. the number of different values it can take.
 
- Protected Types inherited from seqan3::alphabet_base< derived_type, size, char_t >
using char_type = std::conditional_t< std::same_as< char_t, void >, char, char_t >
 The char representation; conditional needed to make semi alphabet definitions legal.
 
using rank_type = detail::min_viable_uint_t< size - 1 >
 The type of the alphabet when represented as a number (e.g. via to_rank()).
 

Detailed Description

The seqan3::cigar semialphabet pairs a counter with a seqan3::cigar::operation letter.

This semialphabet represents a unit in a CIGAR string, typically found in the SAM and BAM formats. It consists of a number and a seqan3::cigar::operation symbol.

It has a "visual representation", but since this is a string and not a char, the type only models seqan3::writable_semialphabet and not seqan3::writable_alphabet. Members for reading/writing the string are provided.

To avoid confusion between string and char literals, this alphabet has no user-defined literal operators. Always assign from a pair of uint32_t and seqan3::cigar::operation.

OP Description
M Alignment match (can be a sequence match or mismatch, used only in basic CIGAR representations)
I Insertion to the reference
D Deletion from the reference
N Skipped region from the reference
S Soft clipping (clipped sequences present in seqan3::sam_record::sequence)
H Hard clipping (clipped sequences NOT present in seqan3::sam_record::sequence)
P Padding (silent deletion from padded reference)
= Sequence match
X Sequence mismatch
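
The table can also be read as a statement about which operations advance along the read and which advance along the reference. The following is a minimal standalone sketch that uses only the standard library rather than SeqAn3 types; `cigar_unit`, `consumes_query`, `consumes_reference` and `spans` are hypothetical names, and the consumption rules are taken from the SAM specification, not from this page.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one CIGAR element: a count plus an operation character.
struct cigar_unit
{
    std::uint32_t count;
    char op;
};

// Operations that advance along the read (query) according to the SAM specification.
constexpr bool consumes_query(char const op)
{
    return op == 'M' || op == 'I' || op == 'S' || op == '=' || op == 'X';
}

// Operations that advance along the reference according to the SAM specification.
constexpr bool consumes_reference(char const op)
{
    return op == 'M' || op == 'D' || op == 'N' || op == '=' || op == 'X';
}

// Returns {query length, reference length} covered by a sequence of CIGAR elements.
// H and P consume neither sequence and therefore contribute to neither sum.
inline std::pair<std::uint64_t, std::uint64_t> spans(std::vector<cigar_unit> const & cigar)
{
    std::uint64_t query{};
    std::uint64_t reference{};

    for (cigar_unit const & unit : cigar)
    {
        if (consumes_query(unit.op))
            query += unit.count;
        if (consumes_reference(unit.op))
            reference += unit.count;
    }

    return {query, reference};
}
```

For the elements 4S 10M 2D 3I 5M, `spans` yields a query length of 22 and a reference length of 17.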

Example

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <seqan3/alphabet/cigar/cigar.hpp>
#include <seqan3/core/debug_stream.hpp>

int main()
{
    using namespace seqan3::literals;

    seqan3::cigar letter{12, 'M'_cigar_operation};

    letter.assign_string("10D");
    seqan3::debug_stream << letter << '\n'; // prints "10D"

    letter.assign_string("20Z"); // Unknown strings are implicitly converted to 0P.
    seqan3::debug_stream << letter << '\n'; // prints "0P"
}
See also
https://samtools.github.io/hts-specs/SAMv1.pdf#page=8
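
The fallback shown in the example above (an unparsable string yields 0P) can be imitated with the standard library alone. The following is only a sketch under stated assumptions: `cigar_unit`, `is_cigar_operation` and `parse_cigar_unit` are hypothetical names, and the validation rules (a decimal number followed by exactly one known operation character) are a guess at reasonable behaviour, not SeqAn3's implementation.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for one CIGAR element: a count plus an operation character.
struct cigar_unit
{
    std::uint32_t count;
    char op;
};

// True for the nine operation characters listed in the table above.
constexpr bool is_cigar_operation(char const c)
{
    return std::string_view{"MDIHNPSX="}.find(c) != std::string_view::npos;
}

// Parses input such as "10D"; anything else yields the fallback 0P.
inline cigar_unit parse_cigar_unit(std::string_view const input) noexcept
{
    cigar_unit const fallback{0, 'P'};

    std::uint32_t count{};
    char const * const end = input.data() + input.size();
    auto const [ptr, ec] = std::from_chars(input.data(), end, count);

    // Require a valid number that is followed by exactly one known operation character.
    if (ec != std::errc{} || ptr == end || ptr + 1 != end || !is_cigar_operation(*ptr))
        return fallback;

    return {count, *ptr};
}
```

With these assumptions, `parse_cigar_unit("10D")` gives count 10 with operation D, while `parse_cigar_unit("20Z")` and a multi-element string such as "4S134M" both give 0P.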

Member Typedef Documentation

◆ operation

The (extended) cigar operation alphabet of M,D,I,H,N,P,S,X,=.

The CIGAR string can be either basic or extended. The only difference in the extended cigar alphabet is that aligned bases are classified as an actual match ('=') or mismatch ('X'). In contrast, the basic cigar alphabet only indicates the aligned status with an 'M' and does not say whether the bases are actually equal.
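
The relationship between the two representations can be sketched as a one-way conversion: every '=' and 'X' becomes 'M', and neighbouring elements that end up with the same operation are merged. This is a standalone illustration using only the standard library; `cigar_unit` and `to_basic` are hypothetical names and not part of SeqAn3.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one CIGAR element: a count plus an operation character.
struct cigar_unit
{
    std::uint32_t count;
    char op;
};

// Converts an extended CIGAR (with '=' and 'X') into a basic one (with 'M' only).
// Adjacent elements that carry the same operation after conversion are merged.
inline std::vector<cigar_unit> to_basic(std::vector<cigar_unit> const & extended)
{
    std::vector<cigar_unit> basic;

    for (cigar_unit unit : extended) // copy on purpose: the operation may be rewritten
    {
        if (unit.op == '=' || unit.op == 'X')
            unit.op = 'M';

        if (!basic.empty() && basic.back().op == unit.op)
            basic.back().count += unit.count;
        else
            basic.push_back(unit);
    }

    return basic;
}
```

For example, 5= 1X 4= 2I 3= becomes 10M 2I 3M. The reverse direction is not possible from the CIGAR alone, because 'M' no longer records which bases were equal.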

The main purpose of the seqan3::cigar::operation alphabet is to be used in the seqan3::cigar composition, where a cigar operation is paired with a count value.

OP Description
M Alignment match (can be a sequence match or mismatch, used only in basic CIGAR representations)
I Insertion to the reference
D Deletion from the reference
N Skipped region from the reference
S Soft clipping (clipped sequences present in seqan3::sam_record::sequence)
H Hard clipping (clipped sequences NOT present in seqan3::sam_record::sequence)
P Padding (silent deletion from padded reference)
= Sequence match
X Sequence mismatch

Example usage:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <seqan3/alphabet/cigar/cigar.hpp>
#include <seqan3/core/debug_stream.hpp>

int main()
{
    using namespace seqan3::literals;

    seqan3::cigar::operation letter{'M'_cigar_operation};

    letter.assign_char('D');
    seqan3::debug_stream << letter << '\n'; // prints "D"

    letter.assign_char('Z'); // Unknown characters are implicitly converted to M.
    seqan3::debug_stream << letter << '\n'; // prints "M"
}
Note
Usually you do not want to manipulate cigar elements and vectors on your own but convert an alignment to a cigar and back. See seqan3::cigar_from_alignment for how to convert two aligned sequences into a cigar_vector.
See also
https://samtools.github.io/hts-specs/SAMv1.pdf#page=8

This entity is stable. Since version 3.1.

Constructor & Destructor Documentation

◆ cigar()

constexpr seqan3::cigar::cigar ( component_type const  alph)
inline constexpr noexcept

Construction via a value of one of the components.

Template Parameters
component_type  One of the component types; must be uniquely contained in the type list of the composite.
Parameters
[in] alph  The value of a component that should be assigned.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <seqan3/alphabet/cigar/cigar.hpp>
#include <seqan3/core/debug_stream.hpp>

int main()
{
    using namespace seqan3::literals;

    // creates 10M, as the cigar_op field is not provided.
    seqan3::cigar letter1{10};
    seqan3::debug_stream << "letter1: " << letter1 << '\n'; // 10M

    // creates 0I, as the integer field is not provided.
    seqan3::cigar letter2{'I'_cigar_operation};
    seqan3::debug_stream << "letter2: " << letter2 << '\n'; // 0I

    // creates 10I, as both fields are explicitly given.
    seqan3::cigar letter3{10, 'I'_cigar_operation};
    seqan3::debug_stream << "letter3: " << letter3 << '\n'; // 10I
}

This entity is stable. Since version 3.1.

Member Function Documentation

◆ assign_string()

cigar & seqan3::cigar::assign_string ( std::string_view const  input)
inline noexcept

Assign from a std::string_view.

To avoid unnecessary copies, you can assign to a seqan3::cigar from a std::string_view that contains the cigar string.

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <string>
#include <string_view>

#include <seqan3/alphabet/cigar/cigar.hpp>
#include <seqan3/core/debug_stream.hpp>

int main()
{
    std::string cigar_str{"4S134M"}; // input

    seqan3::cigar letter1{};
    seqan3::cigar letter2{};

    // Assign from std::string
    // Convenient, but each substr call creates an unnecessary string copy (e.g. "4S").
    letter1.assign_string(cigar_str.substr(0, 2));
    letter2.assign_string(cigar_str.substr(2, 4));
    seqan3::debug_stream << letter1 << '\n'; // prints 4S
    seqan3::debug_stream << letter2 << '\n'; // prints 134M

    // Assign from std::string_view (no extra string copies)
    // Version 1
    letter1.assign_string(std::string_view{cigar_str}.substr(0, 2));
    letter2.assign_string(std::string_view{cigar_str}.substr(2, 4));
    seqan3::debug_stream << letter1 << '\n'; // prints 4S
    seqan3::debug_stream << letter2 << '\n'; // prints 134M

    // Version 2
    letter1.assign_string(/*std::string_view*/ {cigar_str.data(), 2});
    letter2.assign_string(/*std::string_view*/ {cigar_str.data() + 2, 4});
    seqan3::debug_stream << letter1 << '\n'; // prints 4S
    seqan3::debug_stream << letter2 << '\n'; // prints 134M

    // Assign from char array
    letter2.assign_string("40S");
    seqan3::debug_stream << letter2 << '\n'; // prints 40S

    // Assign from seqan3::small_string
    letter2.assign_string(letter1.to_string());
    seqan3::debug_stream << letter2 << '\n'; // prints 4S
}

This entity is experimental and subject to change in the future. Experimental since version 3.2.
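
When a whole CIGAR string has to be processed, the element boundaries first need to be found. A minimal sketch of doing so without copying characters is given below; `split_cigar` is a hypothetical helper that uses only the standard library, and it assumes that every element is a run of digits followed by exactly one operation character.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

// Splits e.g. "4S134M" into the views "4S" and "134M".
// The returned views refer to the storage behind `input`; no characters are copied.
// Trailing digits that are not followed by an operation character are dropped.
inline std::vector<std::string_view> split_cigar(std::string_view const input)
{
    std::vector<std::string_view> elements;
    std::size_t begin{};

    for (std::size_t pos{}; pos < input.size(); ++pos)
    {
        // The first non-digit character is taken to be the operation and ends the element.
        if (!std::isdigit(static_cast<unsigned char>(input[pos])))
        {
            elements.push_back(input.substr(begin, pos - begin + 1));
            begin = pos + 1;
        }
    }

    return elements;
}
```

Each resulting view could then be passed to seqan3::cigar::assign_string, as in "Version 1" of the example above. The views stay valid only as long as the original string does.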

◆ operator=()

constexpr cigar & seqan3::cigar::operator= ( component_type const  alph)
inline constexpr noexcept

Assignment via a value of one of the components.

Template Parameters
component_type  One of the component types; must be uniquely contained in the type list of the composite.
Parameters
[in] alph  The value of a component that should be assigned.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <seqan3/alphabet/cigar/cigar.hpp>
#include <seqan3/core/debug_stream.hpp>

int main()
{
    using namespace seqan3::literals;

    seqan3::cigar letter{10, 'I'_cigar_operation};
    seqan3::debug_stream << "letter: " << letter << '\n'; // 10I

    letter = 'D'_cigar_operation; // only the operation changes
    seqan3::debug_stream << "letter: " << letter << '\n'; // 10D

    letter = 20; // only the count changes
    seqan3::debug_stream << "letter: " << letter << '\n'; // 20D
}

This entity is stable. Since version 3.1.

◆ to_string()

small_string< 11 > seqan3::cigar::to_string ( ) const
inline noexcept

Return the string representation.

This entity is experimental and subject to change in the future. Experimental since version 3.1.
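
The capacity of 11 in the return type is consistent with the largest possible representation: a uint32_t needs at most 10 decimal digits, plus one character for the operation. The sketch below illustrates that bound with the standard library only; `max_cigar_chars` and `cigar_to_string` are hypothetical names, and the code is not SeqAn3's implementation.

```cpp
#include <array>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>

// digits10 is 9 for a 32-bit unsigned integer, so the longest value has 10 digits.
// One further character holds the operation.
inline constexpr std::size_t max_cigar_chars = std::numeric_limits<std::uint32_t>::digits10 + 1 + 1;
static_assert(max_cigar_chars == 11);

// Formats a count and an operation character, e.g. (134, 'M') -> "134M".
inline std::string cigar_to_string(std::uint32_t const count, char const op)
{
    std::array<char, max_cigar_chars> buffer{};

    // Reserve the last slot for the operation; 10 slots always suffice for the digits.
    auto const result = std::to_chars(buffer.data(), buffer.data() + buffer.size() - 1, count);
    *result.ptr = op;

    return std::string(buffer.data(), result.ptr + 1);
}
```

Calling `cigar_to_string(4294967295u, '=')` produces a string of exactly 11 characters, the longest case.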

Friends And Related Symbol Documentation

◆ get [1/2]

template<size_t index>
constexpr auto get ( cigar & l)
friend

Tuple-like access to the contained components.

Template Parameters
index  Return the i-th element.
Returns
A proxy to the contained element that models the same alphabet concepts and supports assignment.

This entity is stable. Since version 3.1.

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <seqan3/alphabet/cigar/cigar.hpp>
#include <seqan3/core/debug_stream.hpp>

int main()
{
    using seqan3::get;
    using namespace seqan3::literals;

    seqan3::cigar letter{10, 'M'_cigar_operation};

    // Note that this is equivalent to get<uint32_t>(letter)
    uint32_t size{get<0>(letter)};

    // Note that this is equivalent to get<seqan3::cigar::operation>(letter)
    seqan3::cigar::operation cigar_char{get<1>(letter)};

    seqan3::debug_stream << "Size is " << size << '\n';
    seqan3::debug_stream << "Cigar char is " << cigar_char << '\n'; // seqan3::debug_stream converts to char on the fly.
}


◆ get [2/2]

template<typename type >
constexpr auto get ( cigar & l)
friend

Tuple-like access to the contained components.

Template Parameters
type  Return the element of specified type; only available if the type is unique in the set of components.
Returns
A proxy to the contained element that models the same alphabet concepts and supports assignment.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <seqan3/alphabet/cigar/cigar.hpp>
#include <seqan3/core/debug_stream.hpp>

int main()
{
    using seqan3::get;
    using namespace seqan3::literals;

    seqan3::cigar letter{10, 'M'_cigar_operation};

    // Note that this is equivalent to get<0>(letter)
    uint32_t size{get<uint32_t>(letter)};

    // Note that this is equivalent to get<1>(letter)
    seqan3::cigar::operation cigar_char{get<seqan3::cigar::operation>(letter)};

    seqan3::debug_stream << "Size is " << size << '\n';
    seqan3::debug_stream << "Cigar char is " << cigar_char << '\n'; // seqan3::debug_stream converts to char on the fly.
}

This entity is stable. Since version 3.1.

◆ operator""_cigar_operation()

constexpr cigar::operation operator""_cigar_operation ( char const  c)
related

The seqan3::cigar::operation char literal.

Returns
seqan3::cigar::operation

You can use this char literal to create a seqan3::cigar::operation letter:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0

#include <seqan3/alphabet/cigar/cigar.hpp>

int main()
{
    using namespace seqan3::literals;

    // Both variables have the type seqan3::cigar::operation.
    seqan3::cigar::operation letter1{'M'_cigar_operation};
    auto letter2 = 'M'_cigar_operation;
}

This entity is stable. Since version 3.1.
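
For readers unfamiliar with character literal operators, the mechanism can be sketched without SeqAn3. Everything in the block below is hypothetical (`op_letter`, `to_op_letter`, the `op_literals` namespace and the `_op` suffix). Mapping unknown characters to 'M' mirrors the assign_char example for seqan3::cigar::operation; whether the real literal behaves the same way is an assumption.

```cpp
#include <string_view>

// Hypothetical stand-in for an operation letter.
struct op_letter
{
    char value;
};

// Keeps one of the nine known operation characters; maps anything else to 'M'.
constexpr op_letter to_op_letter(char const c) noexcept
{
    constexpr std::string_view valid{"MDIHNPSX="};
    return {valid.find(c) != std::string_view::npos ? c : 'M'};
}

namespace op_literals
{

// A char literal operator: 'D'_op calls this function with c == 'D'.
constexpr op_letter operator""_op(char const c) noexcept
{
    return to_op_letter(c);
}

} // namespace op_literals

// The literal is usable in constant expressions.
static_assert(op_literals::operator""_op('D').value == 'D');
```

After `using namespace op_literals;`, the expression `'D'_op` yields an `op_letter` holding 'D', in the same way that `using namespace seqan3::literals;` enables `'M'_cigar_operation` in the example above.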


The documentation for this class was generated from the following file: