SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
The SeqAn Cookbook

This document provides example recipes on how to carry out particular tasks using the SeqAn functionalities in C++. Please note that these recipes are not ordered. You can use the links in the table of contents or the search function of your browser to navigate them.

We hope to expand this document over time to include many more examples. If you have suggestions for improving the Cookbook, or examples you would like to see included, please feel free to contact us.

Read sequence files

#include <filesystem>
#include <string>
#include <seqan3/core/debug_stream.hpp> // for debug_stream
#include <seqan3/io/sequence_file/input.hpp> // for sequence_file_input
int main()
{
    std::filesystem::path tmp_dir = std::filesystem::temp_directory_path(); // get the tmp directory
    // Initialise a file input object with a FASTA file.
    seqan3::sequence_file_input file_in{tmp_dir / "seq.fasta"};
    // Retrieve the sequences and ids.
    for (auto & [seq, id, qual] : file_in)
    {
        seqan3::debug_stream << "ID: " << id << '\n';
        seqan3::debug_stream << "SEQ: " << seq << '\n';
        seqan3::debug_stream << "Empty Qual." << qual << '\n'; // qual is empty for FASTA files
    }
    return 0;
}

Construction and assignment of alphabet symbols

#include <seqan3/alphabet/all.hpp> // for working with alphabets directly
int main()
{
    using namespace seqan3::literals;
    // Two objects of seqan3::dna4 alphabet constructed with a char literal.
    seqan3::dna4 ade = 'A'_dna4;
    seqan3::dna4 gua = 'G'_dna4;
    // Two additional objects assigned explicitly from char or rank.
    seqan3::dna4 cyt, thy;
    cyt.assign_char('C');
    thy.assign_rank(3);
    // Further code here...
    return 0;
}
// Get the rank type of the alphabet (here uint8_t).
using rank_type = seqan3::alphabet_rank_t<seqan3::dna4>;
// Retrieve the numerical representation (rank) of the objects.
rank_type rank_a = ade.to_rank(); // => 0
rank_type rank_g = gua.to_rank(); // => 2
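
To make the relationship between characters and ranks concrete, here is a minimal sketch of a four-letter alphabet that uses only the standard library. It is not SeqAn's implementation: the names toy_dna4_to_rank and toy_dna4_to_char are hypothetical, and mapping every unknown character to rank 0 is an assumption made for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: map a character to a rank in the order A, C, G, T.
// Unknown characters fall back to rank 0 ('A'); this is a choice made for the sketch.
constexpr std::uint8_t toy_dna4_to_rank(char const chr) noexcept
{
    switch (chr)
    {
    case 'C':
    case 'c':
        return 1;
    case 'G':
    case 'g':
        return 2;
    case 'T':
    case 't':
        return 3;
    default:
        return 0;
    }
}

// Hypothetical helper: map a rank back to its character (ranks are taken modulo 4).
constexpr char toy_dna4_to_char(std::uint8_t const rank) noexcept
{
    constexpr std::array<char, 4> letters{'A', 'C', 'G', 'T'};
    return letters[rank % 4];
}
```

With this mapping, toy_dna4_to_rank('A') is 0 and toy_dna4_to_rank('G') is 2, which mirrors the ranks shown in the recipe above.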

Reverse complement and the six-frame translation of a string using views

This recipe creates a small program that

  1. reads a string from the command line (first argument to the program)
  2. "converts" the string to a range of seqan3::dna5 (Bonus: throws an exception if loss of information occurs)
  3. prints the string and its reverse complement
  4. prints the six-frame translation of the string
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream> // std::cerr
#include <ranges> // include all of the standard library's views
#include <string>
#include <seqan3/alphabet/views/all.hpp> // include all of SeqAn's views
#include <seqan3/argument_parser/all.hpp> // optional: include the argument_parser
#include <seqan3/core/debug_stream.hpp> // for debug_stream
int main(int argc, char ** argv)
{
    // We use the seqan3::argument_parser which was introduced in the second chapter
    // of the tutorial: "Parsing command line arguments with SeqAn".
    seqan3::argument_parser myparser{"Assignment-3", argc, argv}; // initialize
    std::string s{}; // receives the DNA string given on the command line
    myparser.add_positional_option(s, "Please specify the DNA string.");
    try
    {
        myparser.parse();
    }
    catch (seqan3::argument_parser_error const & ext) // the user did something wrong
    {
        std::cerr << "[PARSER ERROR] " << ext.what() << '\n'; // you can customize your error message
        return 0;
    }
    auto s_as_dna = s | seqan3::views::char_to<seqan3::dna5>;
    // Bonus:
    //auto s_as_dna = s | std::views::transform([] (char const c)
    //{
    //    return seqan3::assign_char_strictly_to(c, seqan3::dna5{});
    //});
    seqan3::debug_stream << "Original: " << s_as_dna << '\n';
    seqan3::debug_stream << "RevComp: " << (s_as_dna | std::views::reverse | seqan3::views::complement) << '\n';
    seqan3::debug_stream << "Frames: " << (s_as_dna | seqan3::views::translate) << '\n';
}
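
The reverse-complement step of this recipe can also be sketched on plain characters with the standard library alone, which may help to see what the view pipeline computes. The helper names below are hypothetical, and mapping every character other than A, C, G and T to 'N' is an assumption of this sketch, not SeqAn behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: complement of a single nucleotide character.
// Characters other than A, C, G, T are mapped to 'N' in this sketch.
char toy_complement(char const c)
{
    switch (c)
    {
    case 'A':
        return 'T';
    case 'C':
        return 'G';
    case 'G':
        return 'C';
    case 'T':
        return 'A';
    default:
        return 'N';
    }
}

// Reverse the input, then complement every character.
std::string toy_reverse_complement(std::string_view const dna)
{
    std::string result(dna.rbegin(), dna.rend());
    std::transform(result.begin(), result.end(), result.begin(), toy_complement);
    return result;
}
```

For example, toy_reverse_complement("AACG") yields "CGTT". Unlike the views used in the recipe, this helper builds a new string eagerly instead of evaluating lazily.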

Reading records

After construction, you can read the sequence records. Our file object behaves like a range, so you can use a range-based for loop to conveniently iterate over the file:

#include <sstream>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/input.hpp>
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
    // Read the FASTA records from the in-memory string above.
    seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
    for (auto & record : fin)
    {
        seqan3::debug_stream << "ID: " << record.id() << '\n';
        seqan3::debug_stream << "SEQ: " << record.sequence() << '\n';
        // a quality field also exists, but is not printed, because we know it's empty for FASTA files.
    }
}
Attention
An input file is a single-pass input range, which means you can only iterate over it once!
Note
It is important to write auto & and not just auto, otherwise you will copy the record on every iteration.

You can also use structured bindings, i.e. for (auto & [seq, id, qual] : fin). But beware: with structured bindings you do need to get the order of elements correct!

You can also read a file in chunks:

Reading records in chunks

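#include <filesystem>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/all.hpp>
#include <seqan3/utility/views/chunk.hpp>
int main()
{
    std::filesystem::path current_path = std::filesystem::current_path();
    seqan3::sequence_file_input fin{current_path / "my.fastq"}; // example file name
    // `&&` is important because seqan3::views::chunk returns temporaries!
    for (auto && records : fin | seqan3::views::chunk(10))
    {
        // `records` contains 10 elements (or less at the end)
        seqan3::debug_stream << "Taking the next 10 sequences:\n";
        seqan3::debug_stream << "ID: " << (*records.begin()).id() << '\n'; // prints first ID in batch
    }
}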

The example above iterates over the file by reading 10 records at a time. If fewer than 10 records remain, the last chunk simply contains the remaining records.
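
The chunking behaviour, including the shorter last chunk, can be illustrated with a small standard-library sketch. The helper toy_chunk is hypothetical. Unlike seqan3::views::chunk it copies the elements eagerly and works on a std::vector rather than on a file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split `items` into consecutive chunks of at most `chunk_size` elements.
template <typename value_t>
std::vector<std::vector<value_t>> toy_chunk(std::vector<value_t> const & items, std::size_t const chunk_size)
{
    assert(chunk_size > 0);
    std::vector<std::vector<value_t>> chunks{};
    for (std::size_t begin = 0; begin < items.size(); begin += chunk_size)
    {
        // The last chunk ends at the end of the input and may therefore be shorter.
        std::size_t const end = std::min(begin + chunk_size, items.size());
        chunks.emplace_back(items.begin() + begin, items.begin() + end);
    }
    return chunks;
}
```

Chunking 25 elements with a chunk size of 10 gives three chunks of sizes 10, 10 and 5.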

Applying a filter to a file

On some occasions you are only interested in sequence records that fulfill a certain criterion, e.g. having a minimum sequence length or a minimum average quality.

This recipe can be used to filter the sequences in your file by a custom criterion.

#include <filesystem>
#include <numeric> // std::accumulate
#include <ranges>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/all.hpp>
int main()
{
    std::filesystem::path current_path = std::filesystem::current_path();
    seqan3::sequence_file_input fin{current_path / "my.fastq"}; // example file name; the filter needs qualities
    // std::views::filter takes a function object (a lambda in this case) as input that returns a boolean
    auto minimum_quality_filter = std::views::filter(
        [](auto const & rec)
        {
            auto qualities = rec.base_qualities()
                           | std::views::transform(
                                 [](auto quality)
                                 {
                                     return seqan3::to_phred(quality);
                                 });
            auto sum = std::accumulate(qualities.begin(), qualities.end(), 0);
            return sum / std::ranges::size(qualities) >= 40; // minimum average quality >= 40
        });
    for (auto & rec : fin | minimum_quality_filter)
    {
        seqan3::debug_stream << "ID: " << rec.id() << '\n';
    }
}
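
The criterion used by the filter, a minimum average quality, can be sketched on plain integers as follows. The helper toy_has_min_average_quality is hypothetical and takes already decoded Phred scores. Rejecting empty reads is an assumption of this sketch, made to avoid a division by zero.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: true if the mean of `phred_scores` is at least `minimum`.
// An empty read never passes; this is a choice made for the sketch.
bool toy_has_min_average_quality(std::vector<int> const & phred_scores, double const minimum)
{
    if (phred_scores.empty())
        return false;

    long const sum = std::accumulate(phred_scores.begin(), phred_scores.end(), 0L);
    return static_cast<double>(sum) / static_cast<double>(phred_scores.size()) >= minimum;
}
```

A read with the scores 40, 41 and 39 has a mean of exactly 40 and passes a threshold of 40, whereas 39, 40 and 40 does not.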

Reading paired-end reads

In modern Next Generation Sequencing experiments you often have paired-end read data which is split into two files. The read pairs are identified by their identical name/id and position in the two files.

This recipe can be used to handle one pair of reads at a time.

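#include <filesystem>
#include <stdexcept> // std::runtime_error
#include <seqan3/io/sequence_file/input.hpp>
#include <seqan3/utility/views/zip.hpp>
int main()
{
    std::filesystem::path current_path = std::filesystem::current_path();
    // for simplicity we take the same file
    seqan3::sequence_file_input fin1{current_path / "my.fastq"}; // example file name
    seqan3::sequence_file_input fin2{current_path / "my.fastq"};
    for (auto && [rec1, rec2] : seqan3::views::zip(fin1, fin2)) // && is important!
    {                                                           // because seqan3::views::zip returns temporaries
        if (rec1.id() != rec2.id())
            throw std::runtime_error("Your pairs don't match.");
    }
}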
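
The pairing check itself can be sketched without any files. The hypothetical helper below walks two lists of read IDs in lockstep and reports the first position at which the IDs differ. It stops at the end of the shorter list, as a zip does, and returns a position instead of throwing so that the result is easy to inspect.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: return the index of the first pair whose IDs differ,
// or std::nullopt if all compared pairs match.
std::optional<std::size_t> toy_first_mismatched_pair(std::vector<std::string> const & ids1,
                                                     std::vector<std::string> const & ids2)
{
    std::size_t const pairs = std::min(ids1.size(), ids2.size());
    for (std::size_t i = 0; i < pairs; ++i)
    {
        if (ids1[i] != ids2[i])
            return i;
    }
    return std::nullopt;
}
```

Because iteration stops at the shorter list, a caller that also wants to detect files of different length would have to compare the sizes separately.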

Storing records in a std::vector

This recipe creates a small program that reads in a FASTA file and stores all the records in a std::vector.

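#include <algorithm> // std::ranges::copy
#include <filesystem>
#include <iterator> // std::back_inserter
#include <vector>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/input.hpp>
int main()
{
    std::filesystem::path current_path = std::filesystem::current_path();
    seqan3::sequence_file_input fin{current_path / "my.fasta"};
    using record_type = decltype(fin)::record_type;
    std::vector<record_type> records{};
    // You can use a for loop:
    for (auto & record : fin)
    {
        records.push_back(std::move(record));
    }
    // But you can also do this (as an alternative to the loop, because the file can only be read once):
    std::ranges::copy(fin, std::back_inserter(records));
    seqan3::debug_stream << records << '\n';
}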

Note that you can move the record out of the file if you want to store it somewhere without copying.

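#include <filesystem>
#include <seqan3/io/sequence_file/input.hpp>
int main()
{
    std::filesystem::path current_path = std::filesystem::current_path();
    seqan3::sequence_file_input fin{current_path / "my.fasta"};
    using record_type = typename decltype(fin)::record_type;
    record_type rec = std::move(*fin.begin()); // avoid copying
}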

Writing records

The easiest way to write to a sequence file is to use the seqan3::sequence_file_output::push_back() or seqan3::sequence_file_output::emplace_back() member functions. These work similarly to how they work on a std::vector.

#include <sstream>
#include <string>
#include <vector>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/io/sequence_file/output.hpp>
#include <seqan3/io/sequence_file/record.hpp>
int main()
{
    using namespace seqan3::literals;
    using types = seqan3::type_list<std::vector<seqan3::dna5>, std::string>;
    using fields = seqan3::fields<seqan3::field::seq, seqan3::field::id>;
    using sequence_record_type = seqan3::sequence_record<types, fields>;
    seqan3::sequence_file_output fout{std::ostringstream{}, seqan3::format_fasta{}};
    for (int i = 0; i < 5; ++i) // ...
    {
        std::string id{"test_id"};
        seqan3::dna5_vector sequence{"ACGT"_dna5};
        sequence_record_type record{std::move(sequence), std::move(id)};
        fout.push_back(record);
    }
}

The class seqan3::sequence_file_output takes an additional parameter that lets you select the fields and their order.

#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/alphabet/quality/phred42.hpp>
#include <seqan3/alphabet/quality/qualified.hpp>
#include <seqan3/io/sequence_file/output.hpp>
#include <seqan3/utility/views/elements.hpp>
int main()
{
    using namespace seqan3::literals;
    seqan3::sequence_file_output fout{std::ostringstream{},
                                      seqan3::format_fastq{},
                                      seqan3::fields<seqan3::field::id, seqan3::field::seq, seqan3::field::qual>{}};
    for (int i = 0; i < 5; i++)
    {
        std::string id{"test_id"};
        // vector of combined data structure:
        std::vector<seqan3::qualified<seqan3::dna5, seqan3::phred42>> seq_qual{{'N'_dna5, '7'_phred42},
                                                                               {'A'_dna5, '1'_phred42},
                                                                               {'C'_dna5, '3'_phred42}};
        auto view_on_seq = seqan3::views::elements<0>(seq_qual);
        auto view_on_qual = seqan3::views::elements<1>(seq_qual);
        // ...
        // Note that the order of the arguments is different from the default `seq, id, qual`,
        // because you specified that ID should be first in the fields template argument.
        fout.emplace_back(id, view_on_seq, view_on_qual);
        // or:
        fout.push_back(std::tie(id, view_on_seq, view_on_qual));
    }
}

File conversion

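#include <filesystem>
#include <seqan3/io/sequence_file/input.hpp>
#include <seqan3/io/sequence_file/output.hpp>
int main()
{
    auto current_path = std::filesystem::current_path();
    // Assigning the input file to the output file converts all records.
    seqan3::sequence_file_output{current_path / "output.fasta"} =
        seqan3::sequence_file_input{current_path / "my.fastq"};
}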
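
To see what such a FASTQ to FASTA conversion does to each record, here is a rough sketch on plain strings that uses only the standard library. The helper toy_fastq_to_fasta is hypothetical and assumes that every record consists of exactly four lines, which is stricter than what a real FASTQ parser accepts.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: convert four-line FASTQ records to FASTA text.
// Assumes every record is exactly four lines: "@id", sequence, "+", qualities.
std::string toy_fastq_to_fasta(std::string const & fastq)
{
    std::istringstream in{fastq};
    std::ostringstream out{};
    std::string id, sequence, separator, qualities;

    while (std::getline(in, id) && std::getline(in, sequence) && std::getline(in, separator)
           && std::getline(in, qualities))
    {
        if (id.empty() || id.front() != '@')
            break; // not a FASTQ record; stop rather than guess

        out << '>' << id.substr(1) << '\n' << sequence << '\n'; // the qualities are dropped
    }
    return out.str();
}
```

The ID and the sequence are kept, while the separator line and the qualities are discarded because FASTA has no quality field.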

Define a custom scoring scheme

#include <seqan3/alignment/scoring/aminoacid_scoring_scheme.hpp>
#include <seqan3/alignment/scoring/hamming_scoring_scheme.hpp>
#include <seqan3/alignment/scoring/nucleotide_scoring_scheme.hpp>
#include <seqan3/alphabet/aminoacid/aa27.hpp>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
using namespace seqan3::literals;
// Define a simple scoring scheme with match and mismatch cost and get the score.
seqan3::nucleotide_scoring_scheme nc_scheme{seqan3::match_score{4}, seqan3::mismatch_score{-5}};
auto sc_nc = nc_scheme.score('A'_dna4, 'C'_dna4); // sc_nc == -5.
// Define an amino acid similarity matrix and get the score.
seqan3::aminoacid_scoring_scheme aa_scheme{};
aa_scheme.set_similarity_matrix(seqan3::aminoacid_similarity_matrix::blosum30);
auto sc_aa = aa_scheme.score('M'_aa27, 'K'_aa27); // sc_aa == 2.
Attention
SeqAn's alignment algorithm computes the maximal similarity score, thus the match score must be set to a positive value and the scores for mismatch and gap must be negative in order to maximize over the matching letters.
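
The effect of a positive match score and a negative mismatch score can be seen in a small sketch that scores an ungapped alignment column by column. The helper toy_ungapped_score is hypothetical and uses only the standard library. The mismatch score of -5 in the usage below is taken from the recipe above, while the match score of 4 is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: score an ungapped alignment of two equally long sequences.
int toy_ungapped_score(std::string_view const seq1,
                       std::string_view const seq2,
                       int const match,
                       int const mismatch)
{
    assert(seq1.size() == seq2.size());
    int score = 0;
    for (std::size_t i = 0; i < seq1.size(); ++i)
        score += (seq1[i] == seq2[i]) ? match : mismatch;
    return score;
}
```

With match 4 and mismatch -5, aligning "ACGT" against itself scores 16, and a single mismatch as in "ACGT" versus "ACCT" lowers the score to 7. Maximising the score therefore favours matching letters.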

Calculate edit distance for a set of sequences

This recipe can be used to calculate the edit distance for all six pairwise combinations of four sequences. The configuration allows at most 7 errors, and the filter then keeps only the alignments with 6 or fewer errors.

#include <ranges>
#include <utility>
#include <vector>
#include <seqan3/alignment/configuration/all.hpp>
#include <seqan3/alignment/pairwise/align_pairwise.hpp>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/utility/views/pairwise_combine.hpp>
int main()
{
    using namespace seqan3::literals;
    std::vector vec{"ACGTGACTGACT"_dna4, "ACGAAGACCGAT"_dna4, "ACGTGACTGACT"_dna4, "AGGTACGAGCGACACT"_dna4};
    // Configure the alignment kernel.
    auto config = seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme
                | seqan3::align_cfg::min_score{-7} | seqan3::align_cfg::output_score{};
    auto alignment_results = seqan3::align_pairwise(seqan3::views::pairwise_combine(vec), config);
    auto filter_v = std::views::filter(
        [](auto && res)
        {
            return res.score() >= -6;
        });
    for (auto const & result : alignment_results | filter_v)
    {
        seqan3::debug_stream << "Score: " << result.score() << '\n';
    }
}
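
For readers who want to see the quantity being computed, the following is a textbook dynamic-programming sketch of the edit (Levenshtein) distance on plain strings. It is not how SeqAn computes the alignment, and the names toy_edit_distance and toy_passes_filter are hypothetical. The second helper relies on the relation used by the filter in the recipe: a score of at least -6 corresponds to at most 6 errors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string_view>
#include <vector>

// Hypothetical helper: classic dynamic-programming edit distance,
// keeping only one row of the matrix in memory.
std::size_t toy_edit_distance(std::string_view const a, std::string_view const b)
{
    std::vector<std::size_t> row(b.size() + 1);
    std::iota(row.begin(), row.end(), std::size_t{0}); // distances from the empty prefix of `a`

    for (std::size_t i = 1; i <= a.size(); ++i)
    {
        std::size_t diagonal = row[0]; // value of cell (i-1, j-1)
        row[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j)
        {
            std::size_t const above = row[j]; // value of cell (i-1, j)
            std::size_t const substitution = diagonal + (a[i - 1] == b[j - 1] ? 0 : 1);
            row[j] = std::min({substitution, above + 1, row[j - 1] + 1});
            diagonal = above;
        }
    }
    return row[b.size()];
}

// Mirrors the filter of the recipe: keep a pair if it has at most 6 errors.
bool toy_passes_filter(std::string_view const a, std::string_view const b)
{
    return toy_edit_distance(a, b) <= 6;
}
```

Two identical sequences, such as the first and third entry of the vector in the recipe, have distance 0 and always pass the filter.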

Searching for matches

This recipe can be used to search for all occurrences of a substring and print the number of hits and the positions in ascending order.

#include <vector>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/search/fm_index/fm_index.hpp>
#include <seqan3/search/search.hpp>
using namespace seqan3::literals;
void run_text_single()
{
    seqan3::dna4_vector text{
        "CGCTGTCTGAAGGATGAGTGTCAGCCAGTGTAACCCGATGAGCTACCCAGTAGTCGAACTGGGCCAGACAACCCGGCGCTAATGCACTCA"_dna4};
    seqan3::fm_index index{text};
    seqan3::debug_stream << "===== Running on a single text =====\n"
                         << "The following hits were found:\n";
    for (auto && result : search("GCT"_dna4, index))
        seqan3::debug_stream << result << '\n';
}
void run_text_collection()
{
    std::vector<seqan3::dna4_vector> text{"CGCTGTCTGAAGGATGAGTGTCAGCCAGTGTA"_dna4,
                                          "ACCCGATGAGCTACCCAGTAGTCGAACTG"_dna4,
                                          "GGCCAGACAACCCGGCGCTAATGCACTCA"_dna4};
    seqan3::fm_index index{text};
    seqan3::debug_stream << "===== Running on a text collection =====\n"
                         << "The following hits were found:\n";
    for (auto && result : search("GCT"_dna4, index))
        seqan3::debug_stream << result << '\n';
}
int main()
{
    run_text_single();
    run_text_collection();
}
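
As a point of comparison, the same kind of result (all start positions of exact matches) can be produced by a linear scan with the standard library. An FM index answers such queries without scanning the whole text, so this sketch only illustrates what is reported, not how the index finds it. The helper toy_find_all is hypothetical, and treating an empty query as having no hits is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: start positions of all (possibly overlapping) exact occurrences
// of `query` in `text`, in ascending order.
std::vector<std::size_t> toy_find_all(std::string_view const text, std::string_view const query)
{
    std::vector<std::size_t> positions{};
    if (query.empty())
        return positions; // an empty query is treated as "no hits" in this sketch

    for (std::size_t pos = text.find(query); pos != std::string_view::npos; pos = text.find(query, pos + 1))
        positions.push_back(pos);

    return positions;
}
```

Restarting the search one position after each hit is what allows overlapping occurrences to be found, for example the three occurrences of "aa" in "aaaa".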

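If you want to allow errors in your query, you need to configure the approximate search with the search configuration objects seqan3::search_cfg::max_error_total, seqan3::search_cfg::max_error_substitution, seqan3::search_cfg::max_error_insertion and seqan3::search_cfg::max_error_deletion.

To search for either 1 insertion or 1 deletion, you can combine them with seqan3::search_cfg::error_count: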

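std::string text{"Garfield the fat cat without a hat."};
seqan3::fm_index index{text};
// At most one error in total, which may be an insertion or a deletion but not a substitution.
seqan3::configuration const cfg = seqan3::search_cfg::max_error_total{seqan3::search_cfg::error_count{1}}
                                | seqan3::search_cfg::max_error_substitution{seqan3::search_cfg::error_count{0}}
                                | seqan3::search_cfg::max_error_insertion{seqan3::search_cfg::error_count{1}}
                                | seqan3::search_cfg::max_error_deletion{seqan3::search_cfg::error_count{1}};
seqan3::debug_stream << search("cat"s, index, cfg) << '\n';
// prints: [<query_id:0, reference_id:0, reference_pos:14>,
// <query_id:0, reference_id:0, reference_pos:17>,
// <query_id:0, reference_id:0, reference_pos:18>,
// <query_id:0, reference_id:0, reference_pos:32>]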

Reading the CIGAR information from a SAM file and constructing an alignment

This recipe can be used to:

  1. Read in a FASTA file with the reference and a SAM file with the alignment
  2. Filter the alignment records and only take those with a mapping quality >= 30.
  3. For the resulting alignments, print which read was mapped against which reference ID, together with the number of seqan3::gap symbols in the alignment (in the aligned reference and in the read sequence).
#include <algorithm> // std::ranges::count
#include <filesystem>
#include <optional>
#include <ranges>
#include <string>
#include <vector>
#include <seqan3/alignment/cigar_conversion/alignment_from_cigar.hpp>
#include <seqan3/alphabet/gap/gap.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sam_file/input.hpp>
#include <seqan3/io/sequence_file/input.hpp>
int main()
{
    std::filesystem::path current_path = std::filesystem::current_path();
    // read in reference information
    seqan3::sequence_file_input reference_file{current_path / "reference.fasta"};
    std::vector<std::string> reference_ids{};
    std::vector<seqan3::dna5_vector> reference_sequences{};
    for (auto && record : reference_file)
    {
        reference_ids.push_back(std::move(record.id()));
        reference_sequences.push_back(std::move(record.sequence()));
    }
    // filter out alignments
    seqan3::sam_file_input mapping_file{current_path / "mapping.sam", reference_ids, reference_sequences};
    auto mapq_filter = std::views::filter(
        [](auto & record)
        {
            return record.mapping_quality() >= 30;
        });
    for (auto & record : mapping_file | mapq_filter)
    {
        // Build the pairwise alignment from the CIGAR string, the reference and the read.
        auto alignment = seqan3::alignment_from_cigar(record.cigar_sequence(),
                                                      reference_sequences[record.reference_id().value()],
                                                      record.reference_position().value(),
                                                      record.sequence());
        // as loop
        size_t sum_reference{};
        for (auto const & char_reference : std::get<0>(alignment))
            if (char_reference == seqan3::gap{})
                ++sum_reference;
        // or via std::ranges::count
        size_t sum_read = std::ranges::count(std::get<1>(alignment), seqan3::gap{});
        // The reference_id is ZERO based and an optional. -1 is represented by std::nullopt (= reference not known).
        std::optional reference_id = record.reference_id();
        seqan3::debug_stream << record.id() << " mapped against "
                             << (reference_id ? std::to_string(reference_id.value()) : "unknown reference") << " with "
                             << sum_read << " gaps in the read sequence and " << sum_reference
                             << " gaps in the reference sequence.\n";
    }
}
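
The gap counts printed by this recipe follow directly from the CIGAR string: an insertion (I) adds gaps to the aligned reference and a deletion (D) adds gaps to the aligned read. The sketch below counts them from a textual CIGAR string with the standard library only. The helper toy_count_gaps is hypothetical, and ignoring every operation other than I and D is a simplifying assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>
#include <utility>

// Hypothetical helper: count gap columns implied by a CIGAR string such as "3M1I2D".
// Returns {gaps in the reference row, gaps in the read row}.
// Only 'I' (gap in the reference) and 'D' (gap in the read) are counted in this sketch.
std::pair<std::size_t, std::size_t> toy_count_gaps(std::string_view const cigar)
{
    std::size_t gaps_in_reference = 0;
    std::size_t gaps_in_read = 0;
    std::size_t count = 0;

    for (char const c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            count = count * 10 + static_cast<std::size_t>(c - '0');
            continue;
        }
        if (c == 'I')
            gaps_in_reference += count;
        else if (c == 'D')
            gaps_in_read += count;
        count = 0; // the operation consumed the pending count
    }
    return {gaps_in_reference, gaps_in_read};
}
```

For the CIGAR string "3M1I2D" this gives one gap in the reference row and two gaps in the read row.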

Map reads and write output to SAM file

For a full recipe on creating your own read mapper, see the very end of the tutorial "Implementing your own read mapper with SeqAn".

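void map_reads(std::filesystem::path const & query_path,
               std::filesystem::path const & index_path,
               std::filesystem::path const & sam_path,
               reference_storage_t & storage,
               uint8_t const errors)
{
    // we need the alphabet and text layout before loading
    seqan3::bi_fm_index<seqan3::dna5, seqan3::text_layout::collection> index;
    {
        std::ifstream is{index_path, std::ios::binary};
        cereal::BinaryInputArchive iarchive{is};
        iarchive(index);
    }
    seqan3::sequence_file_input query_file_in{query_path};
    seqan3::sam_file_output sam_out{sam_path,
                                    seqan3::fields<seqan3::field::seq,
                                                   seqan3::field::id,
                                                   seqan3::field::ref_id,
                                                   seqan3::field::ref_offset,
                                                   seqan3::field::cigar,
                                                   seqan3::field::qual,
                                                   seqan3::field::mapq>{}};
    seqan3::configuration const search_config =
        seqan3::search_cfg::max_error_total{seqan3::search_cfg::error_count{errors}}
        | seqan3::search_cfg::hit_all_best{};
    seqan3::configuration const align_config =
        seqan3::align_cfg::method_global{seqan3::align_cfg::free_end_gaps_sequence1_leading{true},
                                         seqan3::align_cfg::free_end_gaps_sequence2_leading{false},
                                         seqan3::align_cfg::free_end_gaps_sequence1_trailing{true},
                                         seqan3::align_cfg::free_end_gaps_sequence2_trailing{false}}
        | seqan3::align_cfg::edit_scheme | seqan3::align_cfg::output_alignment{}
        | seqan3::align_cfg::output_begin_position{} | seqan3::align_cfg::output_score{};
    for (auto && record : query_file_in)
    {
        auto & query = record.sequence();
        for (auto && result : search(query, index, search_config))
        {
            size_t start = result.reference_begin_position() ? result.reference_begin_position() - 1 : 0;
            std::span text_view{std::data(storage.seqs[result.reference_id()]) + start, query.size() + 1};
            for (auto && alignment : seqan3::align_pairwise(std::tie(text_view, query), align_config))
            {
                auto cigar = seqan3::cigar_from_alignment(alignment.alignment());
                size_t ref_offset = alignment.sequence1_begin_position() + 2 + start;
                size_t map_qual = 60u + alignment.score();
                sam_out.emplace_back(query,
                                     record.id(),
                                     storage.ids[result.reference_id()],
                                     ref_offset,
                                     cigar,
                                     record.base_qualities(),
                                     map_qual);
            }
        }
    }
}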

Constructing a basic argument parser

#include <filesystem>
#include <iostream> // std::cerr
#include <seqan3/argument_parser/all.hpp>
#include <seqan3/core/debug_stream.hpp>
void run_program(std::filesystem::path const & reference_path, std::filesystem::path const & index_path)
{
    seqan3::debug_stream << "reference_file_path: " << reference_path << '\n';
    seqan3::debug_stream << "index_path " << index_path << '\n';
}
struct cmd_arguments
{
    std::filesystem::path reference_path{};
    std::filesystem::path index_path{"out.index"};
};
void initialise_argument_parser(seqan3::argument_parser & parser, cmd_arguments & args)
{
    parser.info.author = "E. coli";
    parser.info.short_description = "Creates an index over a reference.";
    parser.info.version = "1.0.0";
    // The option_spec argument comes before the validator.
    parser.add_option(args.reference_path,
                      'r',
                      "reference",
                      "The path to the reference.",
                      seqan3::option_spec::required,
                      seqan3::input_file_validator{{"fa", "fasta"}});
    parser.add_option(args.index_path,
                      'o',
                      "output",
                      "The output index file path.",
                      seqan3::option_spec::standard,
                      seqan3::output_file_validator{seqan3::output_file_open_options::create_new, {"index"}});
}
int main(int argc, char const ** argv)
{
    seqan3::argument_parser parser("Indexer", argc, argv);
    cmd_arguments args{};
    initialise_argument_parser(parser, args);
    try
    {
        parser.parse();
    }
    catch (seqan3::argument_parser_error const & ext)
    {
        std::cerr << "[PARSER ERROR] " << ext.what() << '\n';
        return -1;
    }
    run_program(args.reference_path, args.index_path);
    return 0;
}

Constructing a subcommand argument parser

#include <iostream> // std::cout
#include <string>
#include <vector>
#include <seqan3/argument_parser/all.hpp>
#include <seqan3/core/debug_stream.hpp>
// =====================================================================================================================
// pull
// =====================================================================================================================
struct pull_arguments
{
    std::string repository{};
    std::string branch{};
    bool progress{false};
};
int run_git_pull(seqan3::argument_parser & parser)
{
    pull_arguments args{};
    parser.add_positional_option(args.repository, "The repository name to pull from.");
    parser.add_positional_option(args.branch, "The branch name to pull from.");
    try
    {
        parser.parse();
    }
    catch (seqan3::argument_parser_error const & ext)
    {
        seqan3::debug_stream << "[Error git pull] " << ext.what() << "\n";
        return -1;
    }
    seqan3::debug_stream << "Git pull with repository " << args.repository << " and branch " << args.branch << '\n';
    return 0;
}
// =====================================================================================================================
// push
// =====================================================================================================================
struct push_arguments
{
    std::string repository{};
    std::vector<std::string> branches{};
    bool push_all{false};
};
int run_git_push(seqan3::argument_parser & parser)
{
    push_arguments args{};
    parser.add_positional_option(args.repository, "The repository name to push to.");
    parser.add_positional_option(args.branches, "The branch names to push (if none are given, push current).");
    try
    {
        parser.parse();
    }
    catch (seqan3::argument_parser_error const & ext)
    {
        seqan3::debug_stream << "[Error git push] " << ext.what() << "\n";
        return -1;
    }
    seqan3::debug_stream << "Git push with repository " << args.repository << " and branches " << args.branches << '\n';
    return 0;
}
// =====================================================================================================================
// main
// =====================================================================================================================
int main(int argc, char const ** argv)
{
    seqan3::argument_parser top_level_parser{"mygit", argc, argv, seqan3::update_notifications::on, {"push", "pull"}};
    // Add information and flags, but no (positional) options to your top-level parser.
    // Because of ambiguity, we do not allow any (positional) options for the top-level parser.
    top_level_parser.info.description.push_back("You can push or pull from a remote repository.");
    // A flag's default value must be false.
    bool flag{false};
    top_level_parser.add_flag(flag, 'f', "flag", "some flag");
    try
    {
        top_level_parser.parse(); // trigger command line parsing
    }
    catch (seqan3::argument_parser_error const & ext) // catch user errors
    {
        seqan3::debug_stream << "[Error] " << ext.what() << "\n"; // customise your error message
        return -1;
    }
    seqan3::argument_parser & sub_parser = top_level_parser.get_sub_parser(); // hold a reference to the sub_parser
    std::cout << "Proceed to sub parser.\n";
    if (sub_parser.info.app_name == std::string_view{"mygit-pull"})
        return run_git_pull(sub_parser);
    else if (sub_parser.info.app_name == std::string_view{"mygit-push"})
        return run_git_push(sub_parser);
    else
        std::cout << "Unhandled subparser named " << sub_parser.info.app_name << '\n';
    // Note: Arriving in this else branch means you did not handle all sub_parsers in the if branches above.
    return 0;
}

Serialise data structures with cereal

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
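#include <cstdint> // for int16_t
#include <filesystem> // for std::filesystem::path
#include <fstream>
#include <vector>
#include <seqan3/core/debug_stream.hpp> // for seqan3::debug_stream
#include <seqan3/test/tmp_directory.hpp>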
#include <cereal/archives/binary.hpp> // includes the cereal::BinaryInputArchive and cereal::BinaryOutputArchive
#include <cereal/types/vector.hpp> // includes cerealisation support for std::vector
// Written for std::vector, other types also work.
void load(std::vector<int16_t> & data, std::filesystem::path const & tmp_file)
{
std::ifstream is(tmp_file, std::ios::binary); // Where input can be found.
cereal::BinaryInputArchive archive(is); // Create an input archive from the input stream.
archive(data); // Load data.
}
// Written for std::vector, other types also work.
void store(std::vector<int16_t> const & data, std::filesystem::path const & tmp_file)
{
std::ofstream os(tmp_file, std::ios::binary); // Where output should be stored.
cereal::BinaryOutputArchive archive(os); // Create an output archive from the output stream.
archive(data); // Store data.
}
int main()
{
// The following example is for a std::vector but any seqan3 data structure that is documented as serialisable
// could be used, e.g. seqan3::fm_index.
seqan3::test::tmp_directory tmp{};
auto tmp_file = tmp.path() / "data.out"; // This is a temporary file name, use any other filename.
std::vector<int16_t> vec{1, 2, 3, 4};
store(vec, tmp_file); // Calls store on a std::vector.
// This vector is needed to load the information into it.
std::vector<int16_t> vec2{};
load(vec2, tmp_file); // Calls load on a std::vector.
seqan3::debug_stream << vec2 << '\n'; // Prints [1,2,3,4].
return 0;
}

Converting a range of an alphabet

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
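#include <ranges> // for std::views::take and std::views::transform
#include <utility> // for std::forward
#include <vector>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/alphabet/quality/aliases.hpp> // for seqan3::dna4q
#include <seqan3/alphabet/quality/phred42.hpp>
#include <seqan3/core/debug_stream.hpp>
using seqan3::operator""_dna4;
using seqan3::operator""_dna5;
using seqan3::operator""_phred42;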
int main()
{
// A vector of combined sequence and quality information.
std::vector<seqan3::dna4q> sequence1{{'A'_dna4, '!'_phred42},
{'C'_dna4, 'A'_phred42},
{'G'_dna4, '6'_phred42},
{'T'_dna4, '&'_phred42}};
// A vector of dna5.
std::vector<seqan3::dna5> sequence2{"AGNCGTNNCAN"_dna5};
// Convert dna4q to dna4.
// Since `sequence1` is an lvalue, we capture `in` via const &. When unsure, use the general case below.
auto view1 = sequence1
| std::views::transform(
[](auto const & in)
{
return static_cast<seqan3::dna4>(in);
});
seqan3::debug_stream << view1 << '\n'; // ACGT
// Convert dna5 to dna4.
// General case: Perfect forward.
auto view2 = sequence2 | std::views::take(8)
| std::views::transform(
[](auto && in)
{
return static_cast<seqan3::dna4>(std::forward<decltype(in)>(in));
});
seqan3::debug_stream << view2 << '\n'; // AGACGTAA
return 0;
}

A custom dna4 alphabet that converts all unknown characters to A

When assigning from char or converting from a larger nucleotide alphabet to a smaller one, information can be lost because some bases have no counterpart in the smaller alphabet. When converting to seqan3::dna5 or seqan3::rna5, non-canonical bases (letters other than A, C, G, T, U) are converted to 'N' to preserve ambiguity at that position. seqan3::dna4 and seqan3::rna4 have no letter 'N' to represent ambiguity, so the conversion from char for IUPAC characters chooses the best-fitting alternative (see seqan3::dna4 for more details).

If you would rather convert every unknown character to A, you can create your own alphabet with a matching char conversion table:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
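#include <array> // for std::array
#include <seqan3/alphabet/nucleotide/nucleotide_base.hpp> // for seqan3::nucleotide_base
#include <seqan3/core/debug_stream.hpp> // for seqan3::debug_stream
// clang-format off
// We inherit from seqan3::nucleotide_base so that we do not need to implement the full nucleotide interface;
// it is sufficient to define `rank_to_char`, `char_to_rank`, and `rank_complement`.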
class my_dna4 : public seqan3::nucleotide_base<my_dna4, 4 /*alphabet size is 4*/>
{
public:
using nucleotide_base<my_dna4, 4>::nucleotide_base; // Use constructors of the base class.
private:
// Returns the character representation of rank. This is where rank conversion for to_char() is handled!
static constexpr char_type rank_to_char(rank_type const rank)
{
return rank_to_char_table[rank];
}
// Returns the rank representation of character. This is where char conversion for assign_char() is handled!
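static constexpr rank_type char_to_rank(char_type const chr)
{
using index_t = std::make_unsigned_t<char_type>; // Index with the unsigned value so that every char stays in range.
return char_to_rank_table[static_cast<index_t>(chr)];
}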
// Returns the complement by rank. This is where complement is handled and with this, my_dna4 models
// seqan3::nucleotide_alphabet.
static constexpr rank_type rank_complement(rank_type const rank)
{
return rank_complement_table[rank];
}
private:
// === lookup-table implementation detail ===
// Value to char conversion table.
static constexpr char_type rank_to_char_table[alphabet_size]{'A', 'C', 'G', 'T'}; // rank 0,1,2,3
// Char-to-value conversion table.
static constexpr std::array<rank_type, 256> char_to_rank_table
{
[] () constexpr
{
// By default, everything has rank 0 which equals `A`.
std::array<rank_type, 256> conversion_table{};
conversion_table['C'] = conversion_table['c'] = 1;
conversion_table['G'] = conversion_table['g'] = 2;
conversion_table['T'] = conversion_table['t'] = 3;
conversion_table['U'] = conversion_table['T']; // set U equal to T
conversion_table['u'] = conversion_table['t']; // set u equal to t
return conversion_table;
}()
};
// The rank complement table.
static constexpr rank_type rank_complement_table[alphabet_size]
{
3, // T is complement of 'A'_dna4
2, // G is complement of 'C'_dna4
1, // C is complement of 'G'_dna4
0 // A is complement of 'T'_dna4
};
friend nucleotide_base<my_dna4, 4>; // Grant seqan3::nucleotide_base access to private/protected members.
friend nucleotide_base<my_dna4, 4>::base_t; // Grant seqan3::alphabet_base access to private/protected members.
};
// clang-format on
// Defines the `_my_dna4` *char literal* so you can write `'C'_my_dna4` instead of `my_dna4{}.assign_char('C')`.
constexpr my_dna4 operator""_my_dna4(char const c) noexcept
{
return my_dna4{}.assign_char(c);
}
int main()
{
my_dna4 my_letter{'C'_my_dna4};
my_letter.assign_char('S'); // Characters other than A,C,G,T are implicitly converted to `A`.
seqan3::debug_stream << my_letter << "\n"; // "A";
seqan3::debug_stream << seqan3::complement(my_letter) << "\n"; // "T";
}

If you are interested in custom alphabets, also take a look at our tutorial How to write your own alphabet.

Controlling threads of (de-)compression streams

When reading or writing compressed files, parallelisation is automatically applied when using BGZF-compressed files, e.g., BAM files. This will use 4 threads by default and can be adjusted by setting seqan3::contrib::bgzf_thread_count to the desired value:

#include <seqan3/io/all.hpp>
// The `bgzf_thread_count` is a variable that can only be changed during the runtime of a program.
// The following does not work, the value must be overwritten within a function:
// seqan3::contrib::bgzf_thread_count = 1u; // Does not work.
int main()
{
// Here, we change the number of threads to `1`.
// This is a global change and will affect every future bgzf (de-)compression.
// However, running (de-)compressions will not be affected.
// `bgzf_thread_count` may be overwritten multiple times during the runtime of a program, in which case
// the latest modification will determine the value.
seqan3::contrib::bgzf_thread_count = 1u;
// Read/Write compressed files.
// ...
return 0;
}

Auto vectorized dna4 complement

Our alphabet seqan3::dna4 cannot be easily auto-vectorized by the compiler.

See this discussion for more details.

You can add your own alphabet that is auto-vectorizable in some use cases. Here is an example for a dna4-like alphabet:

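#include <concepts> // for std::same_as
#include <cstdint> // for uint8_t
#include <seqan3/alphabet/nucleotide/nucleotide_base.hpp> // for seqan3::nucleotide_base
#include <seqan3/alphabet/nucleotide/rna4.hpp> // for seqan3::rna4
class simd_dna4 : public seqan3::nucleotide_base<simd_dna4, 256>
{
private:
using base_t = seqan3::nucleotide_base<simd_dna4, 256>; // Alias used by the friend and using declarations below.
friend base_t; // nucleotide_base
friend base_t::base_t; // alphabet_base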
friend seqan3::rna4;
public:
constexpr simd_dna4() noexcept = default;
constexpr simd_dna4(simd_dna4 const &) noexcept = default;
constexpr simd_dna4(simd_dna4 &&) noexcept = default;
constexpr simd_dna4 & operator=(simd_dna4 const &) noexcept = default;
constexpr simd_dna4 & operator=(simd_dna4 &&) noexcept = default;
~simd_dna4() noexcept = default;
template <std::same_as<seqan3::rna4> t> // template parameter t to accept incomplete type
constexpr simd_dna4(t const r) noexcept
{
assign_rank(r.to_rank());
}
using base_t::assign_rank;
using base_t::base_t;
using base_t::to_rank;
static constexpr uint8_t alphabet_size = 4;
constexpr simd_dna4 & assign_char(char_type const c) noexcept
{
char_type const upper_case_char = c & 0b0101'1111;
rank_type rank = (upper_case_char == 'T') * 3 + (upper_case_char == 'G') * 2 + (upper_case_char == 'C');
return assign_rank(rank);
}
constexpr char_type to_char() const noexcept
{
rank_type const rank = to_rank();
switch (rank)
{
case 0u:
return 'A';
case 1u:
return 'C';
case 2u:
return 'G';
default:
return 'T';
}
}
constexpr simd_dna4 complement() const noexcept
{
rank_type rank{to_rank()};
rank ^= 0b11;
simd_dna4 ret{};
return ret.assign_rank(rank);
}
static constexpr bool char_is_valid(char_type const c) noexcept
{
char_type const upper_case_char = c & 0b0101'1111;
return (upper_case_char == 'A') || (upper_case_char == 'C') || (upper_case_char == 'G')
|| (upper_case_char == 'T');
}
};
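Such an alphabet can be auto-vectorized because every operation is branch-free arithmetic on a single byte. The following is a minimal standalone sketch of the two bit tricks used above, assuming the ranks A=0, C=1, G=2, T=3 as in `simd_dna4`. The free functions `char_to_rank` and `complement_ranks` are hypothetical names chosen for illustration and are not part of SeqAn.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Branch-free char-to-rank conversion: clearing bit 5 upper-cases ASCII letters,
// then the comparison results (0 or 1) are summed up.
// Every character that is not C, G or T (in either case) ends up as rank 0 ('A').
constexpr std::uint8_t char_to_rank(char const c) noexcept
{
    char const upper = static_cast<char>(c & 0b0101'1111);
    return static_cast<std::uint8_t>((upper == 'T') * 3 + (upper == 'G') * 2 + (upper == 'C'));
}

// Complements a whole sequence of ranks. XOR with 0b11 swaps 0 <-> 3 (A <-> T) and 1 <-> 2 (C <-> G).
// The loop body contains no branch and no table lookup, which typically allows compilers to vectorise it.
inline std::vector<std::uint8_t> complement_ranks(std::vector<std::uint8_t> ranks)
{
    for (std::size_t i = 0; i < ranks.size(); ++i)
        ranks[i] ^= 0b11;
    return ranks;
}
```

For example, `complement_ranks({0, 1, 2, 3})` yields the ranks `{3, 2, 1, 0}`, i.e. ACGT becomes TGCA.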

All SeqAn documentation snippets

The following lists all snippets that appear in our documentation. Search for keywords with Ctrl + F.

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
using namespace seqan3::literals;
int main()
{
// CIGAR string = 2M1D2M
std::vector<seqan3::cigar> cigar_vector{{2, 'M'_cigar_operation},
{1, 'D'_cigar_operation},
{2, 'M'_cigar_operation}};
uint32_t reference_start_position{0}; // The read is aligned at the start of the reference.
seqan3::dna5_vector reference = "ACTGATCGAGAGGATCTAGAGGAGATCGTAGGAC"_dna5;
seqan3::dna5_vector query = "ACGA"_dna5;
auto alignment = alignment_from_cigar(cigar_vector, reference, reference_start_position, query);
seqan3::debug_stream << alignment << '\n'; // prints (ACTGA,AC-GA)
}
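To see what the conversion from a CIGAR string to an alignment does, here is a minimal standalone sketch on plain strings. It assumes that only the operations M, =, X, I, D, S and H occur and that the CIGAR string is consistent with both sequences. The names `cigar_op` and `expand_cigar` are hypothetical and not part of SeqAn.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical plain representation of one CIGAR element, e.g. {2, 'M'}.
struct cigar_op
{
    std::uint32_t count;
    char op;
};

// Returns the pair (gapped reference, gapped query) described by the CIGAR elements.
inline std::pair<std::string, std::string> expand_cigar(std::vector<cigar_op> const & cigar,
                                                        std::string_view const reference,
                                                        std::size_t ref_pos,
                                                        std::string_view const query)
{
    std::string ref_row{};
    std::string query_row{};
    std::size_t query_pos = 0;

    for (auto const [count, op] : cigar)
    {
        for (std::uint32_t i = 0; i < count; ++i)
        {
            switch (op)
            {
            case 'M':
            case '=':
            case 'X': // (mis)match: consumes one letter of both sequences
                ref_row += reference[ref_pos++];
                query_row += query[query_pos++];
                break;
            case 'D': // deletion: reference letter against a gap in the query
                ref_row += reference[ref_pos++];
                query_row += '-';
                break;
            case 'I': // insertion: query letter against a gap in the reference
                ref_row += '-';
                query_row += query[query_pos++];
                break;
            case 'S': // soft clipping: the base is part of the query but not of the alignment
                ++query_pos;
                break;
            default: // hard clipping (H) consumes nothing; other operations are not handled in this sketch
                break;
            }
        }
    }
    return {ref_row, query_row};
}
```

With the CIGAR string 2M1D2M, the reference from the snippet above, start position 0 and the query ACGA, `expand_cigar` returns the pair ("ACTGA", "AC-GA").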
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
using namespace seqan3::literals;
auto sam_file_raw = R"(@HD VN:1.6
@SQ SN:ref LN:34
read1 41 ref 1 61 1S1M1D1M1I ref 10 300 ACGT !##$ AS:i:2 NM:i:7
read2 42 ref 2 62 1H7M1D1M1S2H ref 10 300 AGGCTGNAG !##$&'()* xy:B:S,3,4,5
read3 43 ref 3 63 1S1M1P1M1I1M1I1D1M1S ref 10 300 GGAGTATA !!*+,-./
)";
int main()
{
// The reference sequence might be read from a different file.
seqan3::dna5_vector reference = "ACTGATCGAGAGGATCTAGAGGAGATCGTAGGAC"_dna5;
// You will probably read it from a file, e.g., like this:
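// seqan3::sam_file_input fin{"test.sam"};
// Here, the records are read from the raw string above instead (requires <sstream>).
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
for (auto && rec : fin)
{
auto alignment =
alignment_from_cigar(rec.cigar_sequence(), reference, rec.reference_position().value(), rec.sequence());
seqan3::debug_stream << alignment << '\n';
}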
// prints:
// (ACT-,C-GT)
// (CTGATCGAG,AGGCTGN-A)
// (T-G-A-TC,G-AGTA-T)
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector reference = "ATGGCGTAGAGCTTCCCCCCCCCCCCCCCCC"_dna5;
seqan3::dna5_vector read = "ATGCCCCGTTGCTT"_dna5; // length 14
// Align the full query against the first 14 bases of the reference.
seqan3::gap_decorator aligned_reference{reference | seqan3::views::slice(0, 14)};
seqan3::gap_decorator aligned_read{read};
// Insert gaps to represent the alignment:
seqan3::insert_gap(aligned_read, aligned_read.begin() + 11, 2);
seqan3::insert_gap(aligned_reference, aligned_reference.begin() + 4, 2);
seqan3::debug_stream << aligned_reference << '\n' << aligned_read << '\n';
// prints:
// ATGG--CGTAGAGCTT
// ATGCCCCGTTG--CTT
auto cigar_sequence = seqan3::cigar_from_alignment(std::tie(aligned_reference, aligned_read));
seqan3::debug_stream << cigar_sequence << '\n'; // prints [4M,2I,5M,2D,3M]
}
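The reverse direction can be sketched on plain strings as well. The following minimal example assumes that '-' denotes a gap, that a gap in the reference row is an insertion (I) and a gap in the query row is a deletion (D), and that clipping is not considered. The function `cigar_from_rows` is a hypothetical helper and not part of SeqAn.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Builds a CIGAR string such as "4M2I5M2D3M" from two gapped rows of equal length.
inline std::string cigar_from_rows(std::string_view const ref_row, std::string_view const query_row)
{
    if (ref_row.size() != query_row.size())
        throw std::invalid_argument{"Both aligned rows must have the same length."};

    std::string cigar{};
    char current_op = '\0';
    std::size_t run_length = 0;

    // Appends the current run (if any) to the result.
    auto flush = [&]()
    {
        if (run_length > 0)
            cigar += std::to_string(run_length) + current_op;
    };

    for (std::size_t i = 0; i < ref_row.size(); ++i)
    {
        bool const ref_gap = ref_row[i] == '-';
        bool const query_gap = query_row[i] == '-';
        if (ref_gap && query_gap)
            continue; // A column consisting of two gaps is skipped in this sketch.

        char const op = ref_gap ? 'I' : (query_gap ? 'D' : 'M');
        if (op != current_op) // A new operation starts: emit the previous run.
        {
            flush();
            current_op = op;
            run_length = 0;
        }
        ++run_length;
    }
    flush();
    return cigar;
}
```

For the two rows printed in the snippet above, `cigar_from_rows("ATGG--CGTAGAGCTT", "ATGCCCCGTTG--CTT")` returns "4M2I5M2D3M".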
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector reference = "ATGGCGTAGAGCTTCCCCCCCCCCCCCCCCC"_dna5;
seqan3::dna5_vector read = "ATGCCCCGTTGCTT"_dna5; // length 14
// Let's say, we want to ignore the last 2 bases of the query because the quality is low.
// We thus only align the first 12 bases, the last two will be soft-clipped bases in the CIGAR string.
seqan3::gap_decorator aligned_reference{reference | seqan3::views::slice(0, 12)};
seqan3::gap_decorator aligned_query{read | seqan3::views::slice(0, 12)};
// insert gaps
seqan3::insert_gap(aligned_reference, aligned_reference.begin() + 4, 2);
seqan3::insert_gap(aligned_query, aligned_query.begin() + 11, 2);
auto cigar_sequence =
seqan3::cigar_from_alignment(std::tie(aligned_reference, aligned_query),
{.hard_front = 1, .hard_back = 0, .soft_front = 0, .soft_back = 2});
seqan3::debug_stream << cigar_sequence << '\n'; // prints [1H,4M,2I,5M,2D,1M,2S]
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
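// The concrete diagonal values below are example values.
// A symmetric band around the main diagonal.
seqan3::align_cfg::band_fixed_size band_cfg{seqan3::align_cfg::lower_diagonal{-4},
seqan3::align_cfg::upper_diagonal{4}};
// A band starting with the main diagonal shifted by 3 cells to the right.
band_cfg.lower_diagonal = 3;
band_cfg.upper_diagonal = 7;
// A band starting with the main diagonal shifted by 3 cells down.
band_cfg.lower_diagonal = -7;
band_cfg.upper_diagonal = -3;
// An invalid band configuration.
band_cfg.lower_diagonal = 5;
band_cfg.upper_diagonal = 3;
// Using this band as a configuration in seqan3::align_pairwise would cause the algorithm to throw an exception.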
}
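The band descriptions above can be made concrete with a small standalone sketch. It assumes the convention that a cell in row i and column j lies on diagonal j - i, so that diagonal 0 is the main diagonal, positive diagonals are shifted to the right and negative diagonals are shifted down. The type `band` and both functions are hypothetical, and only the basic consistency check (lower diagonal not above upper diagonal) is shown.

```cpp
#include <cstddef>

// Hypothetical plain band description; not a SeqAn type.
struct band
{
    std::ptrdiff_t lower_diagonal;
    std::ptrdiff_t upper_diagonal;
};

// A band is consistent only if its lower diagonal does not lie above its upper diagonal.
constexpr bool is_valid(band const b) noexcept
{
    return b.lower_diagonal <= b.upper_diagonal;
}

// Checks whether the matrix cell (row, column) lies inside the band.
constexpr bool contains(band const b, std::ptrdiff_t const row, std::ptrdiff_t const column) noexcept
{
    std::ptrdiff_t const diagonal = column - row;
    return b.lower_diagonal <= diagonal && diagonal <= b.upper_diagonal;
}
```

Under these assumptions, a band from diagonal 3 to 7 contains the cell (0, 3) but not the origin (0, 0), and a band with lower diagonal 5 and upper diagonal 3 is rejected by `is_valid`.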
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// Computes semi global edit distance using fast-bit vector algorithm.
// Computes semi global edit distance using slower standard pairwise algorithm.
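// Computes global distance allowing a minimal score of -3 (Default: edit distance).
// Assumed completion of this configuration: global alignment, edit scheme and a minimal score.
auto cfg_errors =
seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme | seqan3::align_cfg::min_score{-3};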
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
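// Configuration with linear gap costs (no gap open score; the extension score is an example value).
seqan3::align_cfg::gap_cost_affine linear_cfg{seqan3::align_cfg::open_score{0},
seqan3::align_cfg::extension_score{-1}};
// Configuration with affine gap costs. Score for opening a gap during the alignment algorithm will be -11.
seqan3::align_cfg::gap_cost_affine affine_cfg{seqan3::align_cfg::open_score{-1},
seqan3::align_cfg::extension_score{-10}};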
// Accessing the members of the gap scheme
int open = affine_cfg.open_score;
int extension = affine_cfg.extension_score;
std::cout << open << '\n'; // -1
std::cout << extension << '\n'; // -10
}
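The value -11 mentioned in the comment above follows from how affine gap costs add up: the open score is paid once per gap and the extension score once per gap character. A minimal sketch of this arithmetic, where `affine_gap` and `gap_score` are hypothetical names and not part of SeqAn:

```cpp
#include <cstdint>

// Hypothetical plain gap scheme; not a SeqAn type.
struct affine_gap
{
    std::int32_t open_score;
    std::int32_t extension_score;
};

// Score of one gap of the given length: open score once plus extension score per gap character.
constexpr std::int32_t gap_score(affine_gap const scheme, std::int32_t const length) noexcept
{
    return length == 0 ? 0 : scheme.open_score + length * scheme.extension_score;
}
```

With an open score of -1 and an extension score of -10, a gap of length 1 scores -11 and a gap of length 3 scores -31. With an open score of 0 the scheme is linear, so every gap character costs the same.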
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
using namespace seqan3::literals;
int main()
{
// configure a global alignment for DNA sequences
auto seq1 = "TCGT"_dna4;
auto seq2 = "ACGA"_dna4;
for (auto res : seqan3::align_pairwise(std::tie(seq1, seq2), min_cfg))
seqan3::debug_stream << res.score() << '\n'; // print out the alignment score
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
using namespace seqan3::literals;
int main()
{
// configure a local alignment for DNA sequences
auto seq1 = "TCGT"_dna4;
auto seq2 = "ACGA"_dna4;
for (auto res : seqan3::align_pairwise(std::tie(seq1, seq2), min_cfg))
seqan3::debug_stream << res.score() << '\n'; // print out the alignment score
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
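// Allow a minimal score of -5, i.e. at most 5 edit operations.
// Assumed configuration: global alignment with edit scheme.
seqan3::configuration config = seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme
| seqan3::align_cfg::min_score{-5};
auto min_score = std::get<seqan3::align_cfg::min_score>(config);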
min_score.score = -5;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
seqan3::align_cfg::on_result cfg{[](auto && result)
{
seqan3::debug_stream << result << '\n';
}};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// Compute only the alignment.
seqan3::configuration cfg = seqan3::align_cfg::output_alignment{};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// Compute only the begin position of the aligned sequences.
seqan3::configuration cfg = seqan3::align_cfg::output_begin_position{};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// Compute only the end position of the aligned sequences.
seqan3::configuration cfg = seqan3::align_cfg::output_end_position{};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
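// Basic alignment algorithm configuration.
// Assumed here: a global alignment with the edit scheme, which is consistent with the scores shown below.
auto config = seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme;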
std::pair p{"ACGTAGC"_dna4, "AGTACGACG"_dna4};
// Compute only the score:
for (auto res : seqan3::align_pairwise(p, config | seqan3::align_cfg::output_score{}))
seqan3::debug_stream << res << "\n"; // prints: {score: -4}
// Compute only the alignment:
for (auto res : seqan3::align_pairwise(p, config | seqan3::align_cfg::output_alignment{}))
seqan3::debug_stream << res << "\n"; // prints: {alignment: (ACGTA-G-C-,A-GTACGACG)}
// Compute the score and the alignment:
for (auto res :
seqan3::align_pairwise(p, config | seqan3::align_cfg::output_score{} | seqan3::align_cfg::output_alignment{}))
seqan3::debug_stream << res << "\n"; // prints: {score: -4, alignment: (ACGTA-G-C-,A-GTACGACG)}
// By default compute everything:
for (auto res : seqan3::align_pairwise(p, config))
seqan3::debug_stream
<< res << "\n"; // prints {id: 0, score: -4, begin: (0,0), end: (7,9) alignment: (ACGTA-G-C-,A-GTACGACG)}
}
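The score of -4 shown above can be cross-checked without SeqAn. The following minimal sketch assumes that the configuration describes a global alignment with the edit scheme (match 0, mismatch -1, gap -1), so that the score is the negated edit distance. The function `edit_score` is a hypothetical helper and not part of SeqAn.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string_view>
#include <vector>

// Computes the global alignment score under unit costs, i.e. the negated edit distance.
// Only one column of the dynamic programming matrix is kept in memory.
inline int edit_score(std::string_view const s1, std::string_view const s2)
{
    std::vector<int> column(s1.size() + 1);
    std::iota(column.begin(), column.end(), 0); // First column: aligning a prefix of s1 against nothing.

    for (std::size_t j = 1; j <= s2.size(); ++j)
    {
        int diagonal = column[0]; // Value of the cell diagonally above the first inner cell.
        column[0] = static_cast<int>(j);
        for (std::size_t i = 1; i <= s1.size(); ++i)
        {
            int const up_left = diagonal;
            diagonal = column[i]; // Remember the old value before it is overwritten.
            int const substitution = up_left + (s1[i - 1] == s2[j - 1] ? 0 : 1);
            // column[i] still holds the left neighbour, column[i - 1] already holds the upper neighbour.
            column[i] = std::min({substitution, column[i] + 1, column[i - 1] + 1});
        }
    }
    return -column.back();
}
```

`edit_score("ACGTAGC", "AGTACGACG")` evaluates to -4, which matches the score in the comments above.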
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// Compute only the score.
seqan3::configuration cfg = seqan3::align_cfg::output_score{};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// Output only the id of the first sequence.
seqan3::configuration cfg = seqan3::align_cfg::output_sequence1_id{};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// Output only the id of the second sequence.
seqan3::configuration cfg = seqan3::align_cfg::output_sequence2_id{};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <thread>
int main()
{
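// Enables parallel computation with two threads.
seqan3::align_cfg::parallel cfg_2{2};
// Enables parallel computation with the number of concurrent threads supported by the current architecture.
seqan3::align_cfg::parallel cfg_n{std::thread::hardware_concurrency()};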
}
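One detail to keep in mind when deriving the thread count from the hardware: `std::thread::hardware_concurrency()` is only a hint and is allowed to return 0 if the value cannot be determined. A small sketch of a guard for this case, where `usable_thread_count` is a hypothetical helper name:

```cpp
#include <algorithm>
#include <thread>

// Returns the number of hardware threads, but at least 1,
// because std::thread::hardware_concurrency() may return 0 when the value is not computable.
inline unsigned usable_thread_count() noexcept
{
    return std::max(1u, std::thread::hardware_concurrency());
}
```

The returned value could then be passed on wherever a thread count is expected.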
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// Select the score type used by the alignment algorithm.
seqan3::configuration cfg1 = seqan3::align_cfg::score_type<int16_t>{}; // Now the alignment computes 16 bit integers.
seqan3::configuration cfg2 = seqan3::align_cfg::score_type<float>{}; // Now the alignment computes float scores.
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// Enable SIMD vectorised alignment computation.
seqan3::configuration cfg = seqan3::align_cfg::vectorised{};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
using namespace seqan3::literals;
int main()
{
auto seq1 = "ACGT"_dna4;
auto seq2 = "ACCT"_dna4;
for (auto res : align_pairwise(std::tie(seq1, seq2), min_cfg))
seqan3::debug_stream << res.score() << '\n';
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
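#include <span>
#include <vector>
class my_matrix : public seqan3::detail::alignment_matrix_column_major_range_base<my_matrix>
{
public:
// Alias the base class
using base_t = seqan3::detail::alignment_matrix_column_major_range_base<my_matrix>;
friend base_t;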
// Inherit the alignment column type defined in the base class. This type is returned in initialise_column.
using typename base_t::alignment_column_type;
// The following types are required by the base type since they cannot be inferred within the base.
using column_data_view_type = std::span<int>; //This type is the underlying view over the actual memory location.
using value_type = int; // The actual value type.
using reference = int &; // The actual reference type.
my_matrix() = default;
my_matrix(my_matrix const &) = default;
my_matrix(my_matrix &&) = default;
my_matrix & operator=(my_matrix const &) = default;
my_matrix & operator=(my_matrix &&) = default;
~my_matrix() = default;
my_matrix(size_t const num_rows, size_t const num_cols) : num_rows{num_rows}, num_cols{num_cols}
{
data.resize(num_rows * num_cols);
}
protected:
size_t num_rows{};
size_t num_cols{};
std::vector<int> data{}; // The linear storage; each column occupies num_rows consecutive cells.
//Required for the base class. Initialises the current column given the column index.
alignment_column_type initialise_column(size_t const column_index) noexcept
{
return alignment_column_type{*this,
column_data_view_type{std::addressof(data[num_rows * column_index]), num_rows}};
}
//Required for the base class. Initialises the proxy for the current iterator over the current column.
template <std::random_access_iterator iter_t>
constexpr reference make_proxy(iter_t iter) noexcept
{
return *iter;
}
};
int main()
{
my_matrix matrix{3, 5};
// Fill the matrix with consecutive values.
int val = 0;
for (auto col : matrix) // Iterate over the columns
for (auto & cell : col) // Iterate over the cells in one column.
cell = val++;
// Print the matrix column by column
for (auto col : matrix)
seqan3::debug_stream << col << '\n';
}
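The index computation `num_rows * column_index` in the example above reflects a column-major layout: all cells of one column are stored next to each other. A minimal standalone sketch of this layout without the SeqAn base class, where `column_major_matrix` and `make_counting_matrix` are hypothetical names:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column-major matrix; not a SeqAn type.
class column_major_matrix
{
public:
    column_major_matrix(std::size_t const num_rows, std::size_t const num_cols) :
        num_rows_{num_rows},
        num_cols_{num_cols},
        data_(num_rows * num_cols)
    {}

    std::size_t rows() const noexcept { return num_rows_; }
    std::size_t cols() const noexcept { return num_cols_; }

    // The cells of one column are contiguous, so the column index is scaled by the number of rows.
    int & cell(std::size_t const row, std::size_t const col) noexcept
    {
        return data_[num_rows_ * col + row];
    }

private:
    std::size_t num_rows_{};
    std::size_t num_cols_{};
    std::vector<int> data_{};
};

// Fills a matrix column by column with 0, 1, 2, ...
inline column_major_matrix make_counting_matrix(std::size_t const num_rows, std::size_t const num_cols)
{
    column_major_matrix matrix{num_rows, num_cols};
    int value = 0;
    for (std::size_t col = 0; col < matrix.cols(); ++col)
        for (std::size_t row = 0; row < matrix.rows(); ++row)
            matrix.cell(row, col) = value++;
    return matrix;
}
```

For a matrix with 3 rows and 5 columns, the first column then holds 0, 1, 2 and the last column holds 12, 13, 14.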
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dna4> database = "AACCGGTT"_dna4;
std::vector<seqan3::dna4> query = "ACGT"_dna4;
std::vector{-0, -1, -2, -3, -4, -5, -6, -7, -8, -1, -0, -1, -2, -3, -4, -5, -6, -7, -2, -1, -1, -1, -2,
-3, -4, -5, -6, -3, -2, -2, -2, -2, -2, -3, -4, -5, -4, -3, -3, -3, -3, -3, -3, -3, -4}};
seqan3::debug_stream << "database:\t" << database << '\n';
seqan3::debug_stream << "query:\t\t" << query << '\n';
seqan3::debug_stream << "score_matrix: " << score_matrix.cols() << " columns and " << score_matrix.rows()
<< " rows\n";
// Prints out the matrix in a convenient way
seqan3::debug_stream << score_matrix << '\n'; // without sequences
seqan3::debug_stream << debug_matrix{score_matrix, database, query} << '\n'; // with sequences
seqan3::debug_stream << seqan3::fmtflags2::utf8 << debug_matrix{score_matrix, database, query}; // as utf8
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
using namespace seqan3::literals;
using seqan3::operator|;
std::vector<seqan3::dna4> database = "AACCGGTT"_dna4;
std::vector<seqan3::dna4> query = "ACGT"_dna4;
std::vector{N, L, L, L, L, L, L, L, L, U, D, D | L, L, L, L,
L, L, L, U, U, D, D, D | L, L, L, L, L, U, U, D | U,
D | U, D, D, D | L, L, L, U, U, D | U, D | U, D | U, D | U, D, D, D | L}};
seqan3::debug_stream << "database:\t" << database << '\n';
seqan3::debug_stream << "query:\t\t" << query << '\n';
seqan3::debug_stream << "trace_matrix: " << trace_matrix.cols() << " columns and " << trace_matrix.rows()
<< " rows\n";
// Prints out the matrix in a convenient way
seqan3::debug_stream << trace_matrix << '\n'; // without sequences
seqan3::debug_stream << debug_matrix{trace_matrix, database, query} << '\n'; // with sequences
seqan3::debug_stream << seqan3::fmtflags2::utf8 << debug_matrix{trace_matrix, database, query}; // as utf8
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
// Configure the alignment kernel.
{
std::pair p{"ACGTAGC"_dna4, "AGTACGACG"_dna4};
auto result = seqan3::align_pairwise(p, config);
}
{
std::vector vec{"ACCA"_dna4, "ATTA"_dna4};
auto result = seqan3::align_pairwise(std::tie(vec[0], vec[1]), config);
}
std::vector vec{std::pair{"AGTGCTACG"_dna4, "ACGTGCGACTAG"_dna4},
std::pair{"AGTAGACTACG"_dna4, "ACGTACGACACG"_dna4},
std::pair{"AGTTACGAC"_dna4, "AGTAGCGATCG"_dna4}};
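// Configure edit distance (assumed definition of edit_config).
auto edit_config = seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme;
// Compute the alignment of a single pair.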
for (auto const & res : seqan3::align_pairwise(std::tie(vec[0].first, vec[0].second), edit_config))
seqan3::debug_stream << "The score: " << res.score() << "\n";
// Compute the alignment over a range of pairs.
for (auto const & res : seqan3::align_pairwise(vec, edit_config))
seqan3::debug_stream << "The score: " << res.score() << "\n";
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <ranges>
#include <vector>
int main()
{
using namespace seqan3::literals;
std::vector data1{"AGTGCTACG"_dna4, "AGTAGACTACG"_dna4, "AGTTACGAC"_dna4};
std::vector data2{"ACGTGCGACTAG"_dna4, "ACGTACGACACG"_dna4, "AGTAGCGATCG"_dna4};
// Configure the alignment kernel.
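auto config = // assumed: global alignment with the Hamming scoring scheme
seqan3::align_cfg::method_global{} | seqan3::align_cfg::scoring_scheme{seqan3::hamming_scoring_scheme{}};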
// Compute the alignment over a range of pairs.
for (auto const & res : seqan3::align_pairwise(seqan3::views::zip(data1, data2), config))
seqan3::debug_stream << "The score: " << res.score() << "\n";
}
Meta-header for the Alignment / Configuration submodule.
A scoring scheme that assigns a score of 0 to matching letters and -1 to mismatching letters.
Definition hamming_scoring_scheme.hpp:33
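The edit-distance alignments above report scores such as -4, which is the negated number of single-letter edits. The following is a minimal sketch in plain standard C++, independent of SeqAn, of the classic dynamic programming recurrence behind such a score. The names `edit_distance` and `edit_score` are hypothetical helpers for illustration and not part of the library.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of SeqAn): unit-cost edit distance between two strings.
// Uses the classic dynamic programming recurrence with a single rolling row.
inline int edit_distance(std::string_view const a, std::string_view const b)
{
    std::vector<int> row(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j)
        row[j] = static_cast<int>(j); // the empty prefix of a becomes b[0..j) with j insertions

    for (std::size_t i = 1; i <= a.size(); ++i)
    {
        int diagonal = row[0]; // value of cell (i-1, j-1)
        row[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= b.size(); ++j)
        {
            int const up = row[j]; // value of cell (i-1, j)
            int const substitution = diagonal + (a[i - 1] == b[j - 1] ? 0 : 1);
            row[j] = std::min({substitution, up + 1, row[j - 1] + 1});
            diagonal = up;
        }
    }
    return row[b.size()];
}

// An edit-distance alignment score is the negated distance.
inline int edit_score(std::string_view const a, std::string_view const b)
{
    return -edit_distance(a, b);
}
```

Calling `edit_score("AGTGCTACG", "ACGTGCGACTAG")` yields -4 in this sketch, the same value the parallel alignment example prints for that pair.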
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using first_seq_t = std::tuple_element_t<0, std::ranges::range_value_t<sequences_t>>;
using second_seq_t = std::tuple_element_t<1, std::ranges::range_value_t<sequences_t>>;
// Select the result type based on the sequences and the configuration.
using result_t =
config_t>::type>;
// Define the function wrapper type.
using function_wrapper_t = std::function<result_t(first_seq_t &, second_seq_t &)>;
static_assert(seqan3::detail::is_type_specialisation_of_v<function_wrapper_t, std::function>);
}
Provides seqan3::detail::alignment_selector.
Stores the alignment results and gives access to score, alignment and the front and end positions.
Definition alignment_result.hpp:145
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <mutex>
#include <vector>
int main()
{
// Generate some sequences.
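using namespace seqan3::literals;
using sequence_pair_t = std::pair<seqan3::dna4_vector, seqan3::dna4_vector>;
std::vector<sequence_pair_t> sequences{100, {"AGTGCTACG"_dna4, "ACGTGCGACTAG"_dna4}};
// Use edit distance with 4 threads.
auto const alignment_config =
seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme | seqan3::align_cfg::parallel{4};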
// Compute the alignments in parallel and output them in order based on the input.
for (auto && result : seqan3::align_pairwise(sequences, alignment_config))
seqan3::debug_stream << result << '\n';
// prints:
// [id: 0 score: -4]
// [id: 1 score: -4]
// [id: 2 score: -4]
// [id: 3 score: -4]
// [id: 4 score: -4]
// [id: 5 score: -4]
// ...
// [id: 98 score: -4]
// [id: 99 score: -4]
// Compute the alignments in parallel and output them unordered using the callback (order is not deterministic).
std::mutex write_to_debug_stream{}; // Need mutex to synchronise the output.
auto const alignment_config_with_callback =
alignment_config
| seqan3::align_cfg::on_result{[&](auto && result)
{
std::lock_guard sync{write_to_debug_stream}; // critical section
seqan3::debug_stream << result << '\n';
}};
seqan3::align_pairwise(sequences, alignment_config_with_callback); // seqan3::align_pairwise is now declared void.
// might print:
// [id: 0 score: -4]
// [id: 1 score: -4]
// [id: 2 score: -4]
// [id: 6 score: -4]
// [id: 7 score: -4]
// [id: 3 score: -4]
// ...
// [id: 99 score: -4]
// [id: 92 score: -4]
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
// How to score two letters:
seqan3::aminoacid_scoring_scheme scheme{seqan3::aminoacid_similarity_matrix::blosum62};
seqan3::debug_stream << "blosum62 score for T and S: " << (int)scheme.score('T'_aa27, 'S'_aa27) << "\n"; // == 1
scheme.set_similarity_matrix(seqan3::aminoacid_similarity_matrix::blosum80);
// You can also score aa20 against aa27:
seqan3::debug_stream << "blosum80 score for 'T'_aa27 and 'S'_aa20: " << (int)scheme.score('T'_aa27, 'S'_aa20)
<< "\n"; // == 2
scheme.set_hamming_distance();
seqan3::debug_stream << "Hamming distance between T and S: " << (int)scheme.score('T'_aa27, 'S'_aa20)
<< "\n"; // == -1
seqan3::debug_stream << "Hamming distance between T and T: " << (int)scheme.score('T'_aa27, 'T'_aa20)
<< "\n"; // == 0
// You can "edit" a given matrix directly:
seqan3::aminoacid_scoring_scheme scheme2{seqan3::aminoacid_similarity_matrix::blosum80};
seqan3::debug_stream << "blosum80 score between T and S: " << (int)scheme2.score('T'_aa27, 'S'_aa27)
<< "\n"; // == 2
auto & cell = scheme2.score('T'_aa27, 'S'_aa27);
cell = 3;
seqan3::debug_stream << "New score after editing entry: " << (int)scheme2.score('T'_aa27, 'S'_aa27) << "\n"; // == 3
std::vector<seqan3::aa27> one = "ALIGATOR"_aa27;
std::vector<seqan3::aa27> two = "ANIMATOR"_aa27;
// You can also score two sequences:
seqan3::aminoacid_scoring_scheme scheme3{seqan3::aminoacid_similarity_matrix::blosum62};
int score = 0;
for (auto pair : seqan3::views::zip(one, two))
score += scheme3.score(std::get<0>(pair), std::get<1>(pair));
seqan3::debug_stream << "Score: " << score << "\n"; // 4 + -3 + 4 + -3 + 4 + 5 + -1 + 5 = 15
}
Provides seqan3::aa27, container aliases and string literals.
Meta-header for the Alphabet / Aminoacid submodule.
@ blosum80
The blosum80 matrix for closely related proteins.
@ blosum62
The blosum62 matrix recommended for most use-cases.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
// You can score two letters:
seqan3::nucleotide_scoring_scheme scheme; // hamming is default
seqan3::debug_stream << "Score between DNA5 A and G: " << (int)scheme.score('A'_dna5, 'G'_dna5) << "\n"; // == -1
seqan3::debug_stream << "Score between DNA5 A and A: " << (int)scheme.score('A'_dna5, 'A'_dna5) << "\n"; // == 0
// You can also score different nucleotides:
scheme.set_simple_scheme(seqan3::match_score{3}, seqan3::mismatch_score{-2});
seqan3::debug_stream << "Score between DNA5 A and RNA15 G: " << (int)scheme.score('A'_dna5, 'G'_rna15)
<< "\n"; // == -2
seqan3::debug_stream << "Score between DNA5 A and RNA15 A: " << (int)scheme.score('A'_dna5, 'A'_rna15)
<< "\n"; // == 3
// You can "edit" a given matrix directly:
seqan3::nucleotide_scoring_scheme scheme2; // hamming distance is default
seqan3::debug_stream << "Score between DNA A and G before edit: " << (int)scheme2.score('A'_dna15, 'G'_dna15)
<< "\n"; // == -1
scheme2.score('A'_dna15, 'G'_dna15) = 3;
seqan3::debug_stream << "Score after editing: " << (int)scheme2.score('A'_dna15, 'G'_dna15) << "\n"; // == 3
// You can score two sequences:
std::vector<seqan3::dna15> one = "AGAATA"_dna15;
std::vector<seqan3::dna15> two = "ATACTA"_dna15;
seqan3::nucleotide_scoring_scheme scheme3; // hamming distance is default
int score = 0;
for (auto pair : seqan3::views::zip(one, two))
score += scheme3.score(std::get<0>(pair), std::get<1>(pair));
seqan3::debug_stream << "Score: " << score << "\n"; // == 0 - 1 + 0 - 1 + 0 + 0 = -2
}
constexpr void set_simple_scheme(match_score< score_arg_t > const ms, mismatch_score< score_arg_t > const mms)
Set the simple scheme (everything is either match or mismatch).
Definition scoring_scheme_base.hpp:175
constexpr score_t & score(alph1_t const alph1, alph2_t const alph2) noexcept
Score two letters (either two nucleotides or two amino acids).
Definition scoring_scheme_base.hpp:214
Provides seqan3::rna15, container aliases and string literals.
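The sequence-scoring loop above adds up one score per aligned position. As a rough illustration in plain standard C++, the sketch below mimics a match/mismatch scheme on ordinary characters. `simple_scheme` and `score_sequences` are hypothetical names, and the sketch compares characters literally, so it does not model ambiguity codes of alphabets such as dna15.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a simple scoring scheme: one score for matches, one for mismatches.
struct simple_scheme
{
    int match{0};     // Hamming-style default
    int mismatch{-1}; // Hamming-style default

    constexpr int score(char const a, char const b) const noexcept
    {
        return a == b ? match : mismatch;
    }
};

// Sums the per-position scores over the common length of both sequences.
constexpr int score_sequences(simple_scheme const scheme, std::string_view const one, std::string_view const two)
{
    int total = 0;
    for (std::size_t i = 0; i < one.size() && i < two.size(); ++i)
        total += scheme.score(one[i], two[i]);
    return total;
}
```

With the default values, `score_sequences(simple_scheme{}, "AGAATA", "ATACTA")` gives -2, matching the sum in the example above. Passing `simple_scheme{3, -2}` corresponds to a match score of 3 and a mismatch score of -2.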
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// does not work:
// seqan3::dna4 my_letter{0}; // we want to set the default, an A
// seqan3::dna4 my_letter{'A'}; // we also want to set an A, but we are setting value 65
// std::cout << my_letter; // you expect 'A', but how would you access the number?
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna4 letter = 'A'_dna4; // identical to assign_char_to('A', letter);
seqan3::dna4_vector sequence = "ACGT"_dna4; // identical to calling assign_char for each element
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
seqan3::dna4 my_letter;
seqan3::assign_rank_to(0, my_letter); // assign an A via rank interface
seqan3::assign_char_to('A', my_letter); // assign an A via char interface
std::cout << seqan3::to_char(my_letter) << '\n'; // prints 'A'
std::cout << (unsigned)seqan3::to_rank(my_letter) << '\n'; // prints 0
// we have to add the cast here, because uint8_t is also treated as a char type by default :(
// Using SeqAn's debug_stream:
seqan3::debug_stream << seqan3::to_char(my_letter) << '\n'; // prints 'A'
seqan3::debug_stream << my_letter << '\n'; // prints 'A' (calls to_char() automatically!)
seqan3::debug_stream << seqan3::to_rank(my_letter) << '\n'; // prints 0 (casts uint8_t to unsigned automatically!)
}
constexpr auto assign_char_to
Assign a character to an alphabet object.
Definition alphabet/concept.hpp:521
constexpr auto to_char
Return the char representation of an alphabet object.
Definition alphabet/concept.hpp:383
constexpr auto assign_rank_to
Assign a rank to an alphabet object.
Definition alphabet/concept.hpp:290
constexpr auto to_rank
Return the rank representation of a (semi-)alphabet object.
Definition alphabet/concept.hpp:152
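The cast in the example above is needed because standard streams print `uint8_t` values as characters. The following sketch in plain standard C++ makes the difference observable. `print_raw` and `print_as_number` are hypothetical helper names, and the expected character assumes an ASCII-compatible platform on which `std::uint8_t` is an alias for `unsigned char`.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Streams the rank as-is; the character overload of operator<< is selected.
inline std::string print_raw(std::uint8_t const rank)
{
    std::ostringstream out;
    out << rank;
    return out.str();
}

// Streams the rank after conversion to unsigned; the integer overload is selected.
inline std::string print_as_number(std::uint8_t const rank)
{
    std::ostringstream out;
    out << static_cast<unsigned>(rank);
    return out.str();
}
```

Under these assumptions, `print_raw(65)` produces the text `A`, while `print_as_number(65)` produces `65`.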
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
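#include <array> // std::array
#include <type_traits> // std::make_unsigned_t
#include <seqan3/alphabet/alphabet_base.hpp> // seqan3::alphabet_base
#include <seqan3/alphabet/concept.hpp> // seqan3::alphabet
#include <seqan3/utility/char_operations/transform.hpp> // seqan3::to_lower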
class ab : public seqan3::alphabet_base<ab, 2>
{
private:
// make the base class a friend so it can access the tables:
friend alphabet_base<ab, 2>;
// This function is expected by seqan3::alphabet_base
static constexpr char_type rank_to_char(rank_type const rank)
{
// via a lookup table
return rank_to_char_table[rank];
// or via an arithmetic expression
return rank == 1 ? 'B' : 'A';
}
// This function is expected by seqan3::alphabet_base
static constexpr rank_type char_to_rank(char_type const chr)
{
// via a lookup table
using index_t = std::make_unsigned_t<char_type>;
return char_to_rank_table[static_cast<index_t>(chr)];
// or via an arithmetic expression
return seqan3::to_lower(chr) == 'b' ? 1 : 0;
}
private:
// === lookup-table implementation detail ===
// map 0 -> A and 1 -> B
static constexpr std::array<char_type, alphabet_size> rank_to_char_table{'A', 'B'};
// map every letter to rank zero, except Bs
static constexpr std::array<rank_type, 256> char_to_rank_table{
// initialise with an immediately evaluated lambda expression:
[]()
{
std::array<rank_type, 256> ret{}; // initialise all values with 0 / 'A'
// only 'b' and 'B' result in rank 1
ret['b'] = 1;
ret['B'] = 1;
return ret;
}()};
};
// The class ab satisfies the alphabet concept.
static_assert(seqan3::alphabet<ab>);
Core alphabet concept and free function/type trait wrappers.
Provides seqan3::alphabet_base.
rank_type rank
The value of the alphabet letter is stored as the rank.
Definition alphabet_base.hpp:258
The generic alphabet concept that covers most data types used in ranges.
Refines seqan3::alphabet and adds assignability.
constexpr char_type to_lower(char_type const c) noexcept
Converts 'A'-'Z' to 'a'-'z' respectively; other characters are returned as is.
Definition transform.hpp:80
Provides utilities for modifying characters.
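The lookup-table technique used by the custom alphabet can be tried out without the library. Below is a minimal sketch in plain standard C++ that builds the same two tables with an immediately invoked lambda. The free functions `char_to_rank` and `rank_to_char` are hypothetical stand-ins for the static member functions shown above.

```cpp
#include <array>
#include <cstdint>

// Hypothetical two-letter alphabet tables, independent of seqan3::alphabet_base.
inline constexpr std::array<char, 2> rank_to_char_table{'A', 'B'};

inline constexpr std::array<std::uint8_t, 256> char_to_rank_table = []() constexpr
{
    std::array<std::uint8_t, 256> ret{}; // every character maps to rank 0 ('A') by default
    ret['b'] = 1;                        // only 'b' and 'B' map to rank 1
    ret['B'] = 1;
    return ret;
}();

constexpr std::uint8_t char_to_rank(char const chr) noexcept
{
    // Convert to unsigned char first so that negative char values cannot index out of bounds.
    return char_to_rank_table[static_cast<unsigned char>(chr)];
}

constexpr char rank_to_char(std::uint8_t const rank) noexcept
{
    return rank_to_char_table[rank];
}

// Round trips: 'b' normalises to 'B', any other character to 'A'.
static_assert(rank_to_char(char_to_rank('b')) == 'B');
static_assert(rank_to_char(char_to_rank('x')) == 'A');
```

Because the table is computed at compile time, the conversion itself is a single array access at run time.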
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
auto sigma_char = seqan3::alphabet_size<char>; // calls seqan3::custom::alphabet_size(char{})
static_assert(std::same_as<decltype(sigma_char), uint16_t>);
std::cout << sigma_char << '\n'; // 256
auto sigma_dna5 = seqan3::alphabet_size<seqan3::dna5>; // returns dna5::alphabet_size
static_assert(std::same_as<decltype(sigma_dna5), uint8_t>);
std::cout << static_cast<uint16_t>(sigma_dna5) << '\n'; // 5
}
Provides alphabet adaptations for standard char types.
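The example above shows that the size of the `char` alphabet is reported as a `uint16_t`, whereas the size of dna5 fits into a `uint8_t`. A plausible reading is that the smallest unsigned type able to hold the value is chosen, and 256 does not fit into 8 bits. The sketch below illustrates that selection in plain standard C++. `min_uint_t` is a hypothetical alias written for this illustration and is not claimed to be the library's implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: the smallest unsigned integer type that can represent `value`.
template <std::size_t value>
using min_uint_t = std::conditional_t<(value <= 0xFFu), std::uint8_t,
                   std::conditional_t<(value <= 0xFFFFu), std::uint16_t,
                   std::conditional_t<(value <= 0xFFFFFFFFu), std::uint32_t, std::uint64_t>>>;

static_assert(std::is_same_v<min_uint_t<5>, std::uint8_t>);    // e.g. a five-letter alphabet
static_assert(std::is_same_v<min_uint_t<256>, std::uint16_t>); // 256 itself does not fit into 8 bits
```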
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa10li letter{'A'_aa10li};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('?'); // Unknown characters are implicitly converted to A.
seqan3::debug_stream << letter << '\n'; // prints "A"
}
Provides seqan3::aa10li, container aliases and string literals.
The reduced Li amino acid alphabet.
Definition aa10li.hpp:80
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa10li letter1{'A'_aa10li};
auto letter2 = 'A'_aa10li;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa10li_vector sequence1{"ACGTTA"_aa10li};
seqan3::aa10li_vector sequence2 = "ACGTTA"_aa10li;
auto sequence3 = "ACGTTA"_aa10li;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa10murphy letter{'A'_aa10murphy};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('?'); // Unknown characters are implicitly converted to S.
seqan3::debug_stream << letter << '\n'; // prints "S"
}
Provides seqan3::aa10murphy, container aliases and string literals.
The reduced Murphy amino acid alphabet.
Definition aa10murphy.hpp:79
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa10murphy letter1{'A'_aa10murphy};
auto letter2 = 'A'_aa10murphy;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa10murphy_vector sequence1{"ACGTTA"_aa10murphy};
seqan3::aa10murphy_vector sequence2 = "ACGTTA"_aa10murphy;
auto sequence3 = "ACGTTA"_aa10murphy;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa20 letter{'A'_aa20};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('?'); // Unknown characters are implicitly converted to S.
seqan3::debug_stream << letter << '\n'; // prints "S"
}
Provides seqan3::aa20, container aliases and string literals.
The canonical amino acid alphabet.
Definition aa20.hpp:61
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa20 letter1{'A'_aa20};
auto letter2 = 'A'_aa20;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa20_vector sequence1{"ACGTTA"_aa20};
seqan3::aa20_vector sequence2 = "ACGTTA"_aa20;
auto sequence3 = "ACGTTA"_aa20;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa27 letter{'A'_aa27};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('?'); // Unknown characters are implicitly converted to X.
seqan3::debug_stream << letter << '\n'; // prints "X"
}
The twenty-seven letter amino acid alphabet.
Definition aa27.hpp:43
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa27 letter1{'A'_aa27};
auto letter2 = 'A'_aa27;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::aa27_vector sequence1{"ACGTTA"_aa27};
seqan3::aa27_vector sequence2 = "ACGTTA"_aa27;
auto sequence3 = "ACGTTA"_aa27;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
namespace your_namespace
{
// your own aminoacid definition
struct your_aa : seqan3::aminoacid_empty_base
{
//...
};
} // namespace your_namespace
static_assert(seqan3::enable_aminoacid<your_namespace::your_aa> == true);
/***** OR *****/
namespace your_namespace2
{
// your own aminoacid definition
struct your_aa
{
//...
};
constexpr bool enable_aminoacid(your_aa) noexcept
{
return true;
}
} // namespace your_namespace2
static_assert(seqan3::enable_aminoacid<your_namespace2::your_aa> == true);
Provides seqan3::aminoacid_alphabet.
constexpr bool enable_aminoacid
A trait that indicates whether a type shall model seqan3::aminoacid_alphabet.
Definition alphabet/aminoacid/concept.hpp:143
This is an empty base class that can be inherited by types that shall model seqan3::aminoacid_alphabe...
Definition alphabet/aminoacid/concept.hpp:32
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
char c = '!';
seqan3::assign_char_strictly_to('?', c); // calls seqan3::custom::assign_char_strictly_to('?', c)
seqan3::dna5 d{};
seqan3::assign_char_strictly_to('A', d); // calls .assign_char('A') member
// also works for temporaries:
seqan3::dna5 d2 = seqan3::assign_char_strictly_to('A', seqan3::dna5{});
}
The five letter DNA alphabet of A,C,G,T and the unknown character N.
Definition dna5.hpp:48
constexpr auto assign_char_strictly_to
Assign a character to an alphabet object, throw if the character is not valid.
Definition alphabet/concept.hpp:731
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
char c = '!';
seqan3::assign_char_to('?', c); // calls seqan3::custom::assign_char_to('?', c)
seqan3::dna5 d{};
seqan3::assign_char_to('A', d); // calls .assign_char('A') member
// also works for temporaries:
seqan3::dna5 d2 = seqan3::assign_char_to('A', seqan3::dna5{});
// invalid/unknown characters are converted:
seqan3::dna5 d3 = seqan3::assign_char_to('!', seqan3::dna5{}); // == 'N'_dna5
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
char c = '!';
seqan3::assign_rank_to(66, c); // calls seqan3::custom::assign_rank_to(66, c); == 'B'
seqan3::dna5 d{};
seqan3::assign_rank_to(2, d); // calls .assign_rank(2) member; == 'G'_dna5
// also works for temporaries:
seqan3::dna5 d2 = seqan3::assign_rank_to(2, seqan3::dna5{});
// too-large ranks are undefined behaviour:
// seqan3::dna5 d3 = seqan3::assign_rank_to(50, seqan3::dna5{});
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
// calls seqan3::custom::char_is_valid_for<char>('A')
std::cout << std::boolalpha << seqan3::char_is_valid_for<char>('A') << '\n'; // always 'true'
// calls dna5::char_is_valid('A') member
std::cout << std::boolalpha << seqan3::char_is_valid_for<seqan3::dna5>('A') << '\n'; // true
// for some alphabets, characters that are not uniquely mappable are still valid:
std::cout << std::boolalpha << seqan3::char_is_valid_for<seqan3::dna5>('a') << '\n'; // true
}
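The three preceding examples distinguish lenient assignment (unknown characters are converted), strict assignment (unknown characters are rejected) and a validity check. The sketch below reproduces that distinction in plain standard C++ for a simplified five-letter alphabet. All function names are hypothetical, `std::invalid_argument` is used as a stand-in exception type, and the rule set is deliberately reduced, so SeqAn's actual tables may accept further characters.

```cpp
#include <stdexcept>

// Hypothetical, simplified conversion rules for a five-letter DNA alphabet (A, C, G, T, N).
constexpr char to_dna5_char(char const c) noexcept
{
    switch (c)
    {
    case 'A': case 'a': return 'A';
    case 'C': case 'c': return 'C';
    case 'G': case 'g': return 'G';
    case 'T': case 't': return 'T';
    default: return 'N'; // 'N', 'n' and every unknown character
    }
}

// Valid means: the character is one of the five letters, in upper or lower case.
constexpr bool is_valid_dna5_char(char const c) noexcept
{
    return to_dna5_char(c) != 'N' || c == 'N' || c == 'n';
}

// Strict assignment rejects what the lenient conversion would silently replace.
inline char to_dna5_char_strictly(char const c)
{
    if (!is_valid_dna5_char(c))
        throw std::invalid_argument{"character is not valid for this alphabet"};
    return to_dna5_char(c);
}
```

In this sketch `to_dna5_char('!')` returns `'N'`, `is_valid_dna5_char('a')` is true even though `'a'` is stored as `'A'`, and `to_dna5_char_strictly('!')` throws.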
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::cigar letter{12, 'M'_cigar_operation};
letter.assign_string("10D");
seqan3::debug_stream << letter << '\n'; // prints "10D"
letter.assign_string("20Z"); // Unknown strings are implicitly converted to 0P.
seqan3::debug_stream << letter << '\n'; // prints "0P"
}
cigar & assign_string(std::string_view const input) noexcept
Assign from a std::string_view.
Definition alphabet/cigar/cigar.hpp:167
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
std::string cigar_str{"4S134M"}; // input
seqan3::cigar letter1{};
seqan3::cigar letter2{};
// Assign from string
// convenient but creates an unnecessary string copy "4S"
letter1.assign_string(cigar_str.substr(0, 2));
letter2.assign_string(cigar_str.substr(2, 4));
seqan3::debug_stream << letter1 << '\n'; // prints 4S
seqan3::debug_stream << letter2 << '\n'; // prints 134M
// Assign from std::string_view (No extra string copies)
// Version 1
letter1.assign_string(std::string_view{cigar_str}.substr(0, 2));
letter2.assign_string(std::string_view{cigar_str}.substr(2, 4));
seqan3::debug_stream << letter1 << '\n'; // prints 4S
seqan3::debug_stream << letter2 << '\n'; // prints 134M
// Version 2
letter1.assign_string(/*std::string_view*/ {cigar_str.data(), 2});
letter2.assign_string(/*std::string_view*/ {cigar_str.data() + 2, 4});
seqan3::debug_stream << letter1 << '\n'; // prints 4S
seqan3::debug_stream << letter2 << '\n'; // prints 134M
// Assign from char array
letter2.assign_string("40S");
seqan3::debug_stream << letter2 << '\n'; // prints 40S
// Assign from seqan3::small_string
letter2.assign_string(letter1.to_string());
seqan3::debug_stream << letter2 << '\n'; // prints 4S
}
T substr(T... args)
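The example above extracts the elements of "4S134M" with hard-coded offsets. When the offsets are not known in advance, the string can be split on the fly. The following is a minimal sketch in plain standard C++ that uses `std::string_view` and `std::from_chars` to avoid substring copies. `cigar_element` and `parse_cigar` are hypothetical names, and the sketch does not check whether the operation character is a valid CIGAR operation.

```cpp
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

// Hypothetical element type: (count, operation character), e.g. {4, 'S'}.
using cigar_element = std::pair<std::uint32_t, char>;

// Splits a CIGAR string such as "4S134M" into its elements without copying any substrings.
// Returns std::nullopt if a count or an operation character is missing.
inline std::optional<std::vector<cigar_element>> parse_cigar(std::string_view input)
{
    std::vector<cigar_element> result;
    while (!input.empty())
    {
        std::uint32_t count{};
        char const * const first = input.data();
        char const * const last = first + input.size();
        auto const [ptr, ec] = std::from_chars(first, last, count);
        if (ec != std::errc{} || ptr == last) // no digits, overflow, or missing operation
            return std::nullopt;
        result.emplace_back(count, *ptr);
        input.remove_prefix(static_cast<std::size_t>(ptr - first) + 1); // skip the digits and the operation
    }
    return result;
}
```

Each resulting pair could then be handed to a cigar letter, for instance through the two-argument constructor shown in the later examples.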
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using seqan3::get;
using namespace seqan3::literals;
seqan3::cigar letter{10, 'M'_cigar_operation};
// Note that this is equivalent to get<uint32_t>(letter)
uint32_t size{get<0>(letter)};
// Note that this is equivalent to get<seqan3::cigar::operation>(letter)
seqan3::cigar::operation cigar_char{get<1>(letter)};
seqan3::debug_stream << "Size is " << size << '\n';
seqan3::debug_stream << "Cigar char is " << cigar_char << '\n'; // seqan3::debug_stream converts to char on the fly.
}
The actual implementation of seqan3::cigar::operation for documentation purposes only.
Definition cigar_operation.hpp:45
constexpr size_t size
The size of a type pack.
Definition type_pack/traits.hpp:143
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using seqan3::get;
using namespace seqan3::literals;
seqan3::cigar letter{10, 'M'_cigar_operation};
// Note that this is equivalent to get<0>(letter)
uint32_t size{get<uint32_t>(letter)};
// Note that this is equivalent to get<1>(letter)
seqan3::cigar::operation cigar_char{get<seqan3::cigar::operation>(letter)};
seqan3::debug_stream << "Size is " << size << '\n';
seqan3::debug_stream << "Cigar char is " << cigar_char << '\n'; // seqan3::debug_stream converts to char on the fly.
}
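The two examples above access the components of a cigar letter once by position and once by type. The same equivalence holds for `std::tuple`, which makes it easy to demonstrate in plain standard C++. In the sketch below, `count_and_operation` is a hypothetical stand-in for the composite letter, with a plain `char` in place of the operation type.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical stand-in for a composite letter: a count combined with an operation character.
using count_and_operation = std::tuple<std::uint32_t, char>;

constexpr count_and_operation element{10u, 'M'};

// Access by position and access by type refer to the same component.
// Access by type only compiles because each component type occurs exactly once.
static_assert(std::get<0>(element) == std::get<std::uint32_t>(element));
static_assert(std::get<1>(element) == std::get<char>(element));
```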
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::cigar::operation letter{'M'_cigar_operation};
letter.assign_char('D');
seqan3::debug_stream << letter << '\n'; // prints "D"
letter.assign_char('Z'); // Unknown characters are implicitly converted to M.
seqan3::debug_stream << letter << '\n'; // prints "M"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::cigar::operation letter1{'M'_cigar_operation};
auto letter2 = 'M'_cigar_operation;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::cigar letter{10, 'I'_cigar_operation};
seqan3::debug_stream << "letter: " << letter << '\n'; // 10I
letter = 'D'_cigar_operation;
seqan3::debug_stream << "letter: " << letter << '\n'; // 10D
letter = 20;
seqan3::debug_stream << "letter: " << letter << '\n'; // 20D
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
// creates 10M, as the cigar_op field is not provided.
seqan3::cigar letter1{10};
seqan3::debug_stream << "letter1: " << letter1 << '\n'; // 10M
// creates 0I, as the integer field is not provided.
seqan3::cigar letter2{'I'_cigar_operation};
seqan3::debug_stream << "letter2: " << letter2 << '\n'; // 0I
// creates 10I, as both fields are explicitly given.
seqan3::cigar letter3{10, 'I'_cigar_operation};
seqan3::debug_stream << "letter3: " << letter3 << '\n'; // 10I
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter1{'T'_dna4, '"'_phred42};
letter1 = 'C'_rna4; // yields {'C'_dna4, '"'_phred42}
}
Meta-header for the Alphabet / Nucleotide submodule.
Joins an arbitrary alphabet with a quality alphabet.
Definition qualified.hpp:59
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
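// The following creates {'C'_dna4, '!'_phred42}
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter1{'C'_dna4};
// The following also creates {'C'_dna4, '!'_phred42}, since rna4 is assignable to dna4
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter2{'C'_rna4};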
if (letter1 == letter2)
seqan3::debug_stream << "yeah\n"; // yeah
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter1{'T'_dna4, '"'_phred42};
letter1 = 'C'_dna4; // yields {'C'_dna4, '"'_phred42}
letter1 = '#'_phred42; // yields {'C'_dna4, '#'_phred42}
}
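Assigning a single component to a composite letter, as in the example above, replaces only that component and keeps the other one. One way to obtain this behaviour is an assignment operator overload per component type. The sketch below shows the idea in plain standard C++; all type names are hypothetical, and it is not claimed that the library implements component assignment this way.

```cpp
// Hypothetical component types; distinct types let overload resolution pick the component.
struct base_letter
{
    char value;
};

struct quality_letter
{
    char value;
};

struct qualified_letter
{
    base_letter base{'A'};
    quality_letter quality{'!'};

    // Assigning a base replaces only the base.
    constexpr qualified_letter & operator=(base_letter const b) noexcept
    {
        base = b;
        return *this;
    }

    // Assigning a quality replaces only the quality.
    constexpr qualified_letter & operator=(quality_letter const q) noexcept
    {
        quality = q;
        return *this;
    }
};
```

Starting from `qualified_letter letter{base_letter{'T'}, quality_letter{'"'}}`, the assignment `letter = base_letter{'C'}` leaves the quality untouched, and a following `letter = quality_letter{'#'}` leaves the base at `'C'`.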
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter1{'C'_dna4}; // creates {'C'_dna4, '!'_phred42}
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter2{'"'_phred42}; // creates {'A'_dna4, '"'_phred42}
if (letter1 == letter2)
seqan3::debug_stream << "yeah\n"; // not printed: the two letters differ in both components
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::alphabet_variant<seqan3::dna5, seqan3::gap> letter{}; // implicitly 'A'_dna5
seqan3::alphabet_variant<seqan3::dna5, seqan3::gap> letter2{'C'_dna5}; // constructed from alternative (== 'C'_dna5)
seqan3::alphabet_variant<seqan3::dna5, seqan3::gap> letter3{'U'_rna5}; // constructed from type that alternative is constructible from (== 'T'_dna5)
letter2.assign_char('T'); // == 'T'_dna5
letter2.assign_char('-'); // == gap{}
letter2.assign_char('K'); // unknown characters map to the default/unknown
// character of the first alternative type (== 'N'_dna5)
letter2 = seqan3::gap{}; // assigned from alternative (== gap{})
letter2 = 'U'_rna5; // assigned from type that alternative is assignable from (== 'T'_dna5)
seqan3::dna5 letter4 = letter2.convert_to<seqan3::dna5>();
}
Provides seqan3::alphabet_variant.
A combined alphabet that can hold values of either of its alternatives.
Definition alphabet_variant.hpp:129
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::alphabet_variant<seqan3::dna4, seqan3::dna5> var; // assumed declaration
var.assign_char('A'); // will be in the "dna4-state"
var = 'A'_dna5; // will be in the "dna5-state"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
// possible:
// not possible:
// seqan3::alphabet_variant<seqan3::dna4, seqan3::gap> letter2 = 'C'_dna5;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <gtest/gtest.h>
int main()
{
using variant_t = seqan3::alphabet_variant<seqan3::dna5, seqan3::gap>;
static_assert(variant_t::is_alternative<seqan3::dna5>(), "dna5 is an alternative of variant_t");
static_assert(!variant_t::is_alternative<seqan3::dna4>(), "dna4 is not an alternative of variant_t");
static_assert(variant_t::is_alternative<seqan3::gap>(), "gap is an alternative of variant_t");
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
letter1 = 'C'_rna4;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// This example illustrates how we can reduce the usage of templates (or the amount of different instantiations) via
// type erasure. Having only one function generated for `algorithm()` is the only benefit of using `semialphabet_any`
// here. Of course this only makes sense for your application if the part of the program that is agnostic of the
// character representation (your equivalent of `algorithm()`) is substantially larger than the specific parts – and
// if compile-time and/or size of the executable are a concern.
#include <iostream>
using namespace seqan3::literals;
// Print is a template and gets instantiated two times because the behaviour is different for both types
template <typename rng_t>
void print(rng_t && r)
{
seqan3::debug_stream << r << '\n';
}
// Algorithm is not a template, only one instance is generated by the compiler
// Type information is encoded via a run-time parameter
void algorithm(std::vector<seqan3::semialphabet_any<10>> & r, bool is_murphy)
{
// Algorithm example that replaces rank 0 with rank 1
for (auto & v : r)
if (seqan3::to_rank(v) == 0)
seqan3::assign_rank_to(1, v);
// Here we verify the type for printing
if (is_murphy)
print(r
| std::views::transform(
[](auto const & in)
{
return static_cast<seqan3::aa10murphy>(in);
}));
else
print(r
| std::views::transform(
[](auto const & in)
{
return static_cast<seqan3::aa10li>(in);
}));
}
// Two instances of algo_pre exist
// They type erase the different arguments to the same type and encode the type information as a run-time parameter
void algo_pre(seqan3::aa10li_vector const & v)
{
std::vector<seqan3::semialphabet_any<10>> tmp = v
| std::views::transform(
[](auto const & in)
{
return static_cast<seqan3::semialphabet_any<10>>(in);
})
| seqan3::ranges::to<std::vector>();
algorithm(tmp, false);
}
void algo_pre(seqan3::aa10murphy_vector const & v)
{
std::vector<seqan3::semialphabet_any<10>> tmp = v
| std::views::transform(
[](auto const & in)
{
return static_cast<seqan3::semialphabet_any<10>>(in);
})
| seqan3::ranges::to<std::vector>();
algorithm(tmp, true);
}
int main()
{
seqan3::aa10li_vector v1{"AVRSTXOUB"_aa10li};
algo_pre(v1); // BIKBBBKCB
seqan3::aa10murphy_vector v2{"AVRSTXOUB"_aa10murphy};
algo_pre(v2); // BIKSSSKCB
}
A semi-alphabet that type erases all other semi-alphabets of the same size.
Definition semialphabet_any.hpp:45
seqan::stl::ranges::to to
Converts a range to a container. This entity is not part of the SeqAn API....
Definition to.hpp:23
Provides seqan3::semialphabet_any.
Provides seqan3::ranges::to.
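The idea behind the example above can be sketched without SeqAn: letters of two alphabets of the same size are reduced to their ranks, a single non-template function works on the ranks, and the caller maps the ranks back to characters. This is only an illustration in standard C++. The names `to_ranks`, `replace_rank_zero` and `to_chars`, as well as the four-letter tables passed to them, are hypothetical and not part of the SeqAn API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Type erasure step: forget the alphabet and keep only the rank of each letter.
// `table` lists the characters of the (hypothetical) alphabet in rank order.
std::vector<std::uint8_t> to_ranks(std::string const & letters, std::string const & table)
{
    std::vector<std::uint8_t> ranks;
    ranks.reserve(letters.size());
    for (char const c : letters)
    {
        std::size_t const pos = table.find(c);
        assert(pos != std::string::npos); // the sketch only accepts letters of the table
        ranks.push_back(static_cast<std::uint8_t>(pos));
    }
    return ranks;
}

// Not a template: one instance serves every alphabet of this size.
// Mirrors the example above by replacing rank 0 with rank 1.
void replace_rank_zero(std::vector<std::uint8_t> & ranks)
{
    for (auto & r : ranks)
        if (r == 0)
            r = 1;
}

// Restore characters with the table of the original alphabet.
std::string to_chars(std::vector<std::uint8_t> const & ranks, std::string const & table)
{
    std::string letters;
    letters.reserve(ranks.size());
    for (auto const r : ranks)
        letters.push_back(table.at(r));
    return letters;
}
```

Calling `to_ranks("AACG", "ACGT")`, then `replace_rank_zero`, then `to_chars` with the same table yields `"CCCG"`. The same `replace_rank_zero` also serves a second table such as `"ACGU"`.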
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dna4> v0{"ACGT"_dna4}; // data occupies 4 bytes in memory
seqan3::bitpacked_sequence<seqan3::dna4> v1{"ACGT"_dna4}; // data occupies 1 byte in memory
}
Provides seqan3::bitpacked_sequence.
A space-optimised version of std::vector that compresses multiple letters into a single byte.
Definition bitpacked_sequence.hpp:63
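Why four dna4 letters can fit into one byte is easiest to see in a small sketch: each letter has a rank between 0 and 3, which needs two bits, so four ranks share one byte. The functions `pack_2bit` and `get_2bit` below are hypothetical helpers written in standard C++ for illustration. They are not how seqan3::bitpacked_sequence is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack ranks in the range [0, 3] (two bits each) into bytes, four ranks per byte.
std::vector<std::uint8_t> pack_2bit(std::vector<std::uint8_t> const & ranks)
{
    std::vector<std::uint8_t> bytes((ranks.size() + 3) / 4, 0);
    for (std::size_t i = 0; i < ranks.size(); ++i)
    {
        assert(ranks[i] < 4); // only two bits are available per letter
        bytes[i / 4] = static_cast<std::uint8_t>(bytes[i / 4] | (ranks[i] << (2 * (i % 4))));
    }
    return bytes;
}

// Read the rank stored at position i of the packed data.
std::uint8_t get_2bit(std::vector<std::uint8_t> const & bytes, std::size_t const i)
{
    assert(i / 4 < bytes.size());
    return static_cast<std::uint8_t>((bytes[i / 4] >> (2 * (i % 4))) & 0x3);
}
```

With this layout the ranks `{0, 1, 2, 3}` (for "ACGT") occupy one byte, and a fifth letter starts a second byte.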
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::concatenated_sequences<seqan3::dna4_vector> concat1{"ACGT"_dna4, "GAGGA"_dna4};
seqan3::debug_stream << concat1[0] << '\n'; // "ACGT"
std::vector<seqan3::dna4_vector> concat2{"ACTA"_dna4, "AGGA"_dna4};
concat1 = concat2; // you can assign from other ranges
concat2[0] = "ATTA"_dna4; // this works for vector of vector
concat1[0][1] = 'T'_dna4; // and this works for concatenated_sequences
seqan3::debug_stream << concat1[0] << '\n'; // "ATTA"
// if you know that you will be adding ten vectors of length ten:
std::vector<seqan3::dna4> vector_of_length10{"ACGTACGTAC"_dna4};
concat1.reserve(10);
concat1.concat_reserve(10 * vector_of_length10.size());
while (concat1.size() < 10)
{
// ...
concat1.push_back(vector_of_length10);
}
}
Container that stores sequences concatenated internally.
Definition concatenated_sequences.hpp:86
Provides seqan3::concatenated_sequences.
T reserve(T... args)
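The storage idea of a concatenated container can be sketched in standard C++: all letters live in one flat buffer, and a second vector remembers where each sequence starts. The class `concatenated_strings` and its members are hypothetical and heavily simplified. In particular, `operator[]` returns a copy here, whereas the example above modifies an element in place (`concat1[0][1] = 'T'_dna4`), which a copy would not allow.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of a container that stores its sequences back to back.
class concatenated_strings
{
public:
    // Append one sequence to the end of the flat buffer.
    void push_back(std::string const & seq)
    {
        data_.append(seq);
        delimiters_.push_back(data_.size());
    }

    // Number of stored sequences.
    std::size_t size() const
    {
        return delimiters_.size() - 1;
    }

    // Copy of the i-th sequence, cut out of the flat buffer.
    std::string operator[](std::size_t const i) const
    {
        assert(i < size());
        return data_.substr(delimiters_[i], delimiters_[i + 1] - delimiters_[i]);
    }

    // All sequences back to back.
    std::string const & concat() const
    {
        return data_;
    }

private:
    std::string data_;                       // all letters in one allocation
    std::vector<std::size_t> delimiters_{0}; // start offset of each sequence, plus the end offset
};
```

After pushing "ACGT" and "GAGGA", the flat buffer holds "ACGTGAGGA" and the offsets are 0, 4 and 9. Reserving space for the buffer and for the offsets separately corresponds to the two reserve calls in the example above.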
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::concatenated_sequences<seqan3::dna4_vector> foobar;
foobar.insert(foobar.end(), "ACGT"_dna4);
seqan3::debug_stream << foobar[0] << '\n'; // "ACGT"
}
iterator end() noexcept
Returns an iterator to the element following the last element of the container.
Definition concatenated_sequences.hpp:487
iterator insert(const_iterator pos, rng_type &&value)
Inserts value before position in the container.
Definition concatenated_sequences.hpp:919
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::concatenated_sequences<seqan3::dna4_vector> foobar;
foobar.insert(foobar.end(), 2, "ACGT"_dna4);
seqan3::debug_stream << foobar[0] << '\n'; // "ACGT"
seqan3::debug_stream << foobar[1] << '\n'; // "ACGT"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
seqan3::gap my_gap{};
seqan3::gap another_gap{};
another_gap.assign_char('A'); // this does not change anything
seqan3::debug_stream << my_gap.to_char(); // outputs '-'
if (my_gap.to_char() == another_gap.to_char())
seqan3::debug_stream << "Both gaps are the same!\n";
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::gapped<seqan3::dna4> converted_letter{'C'_dna4};
seqan3::gapped<seqan3::dna4>{}.assign_char('-'); // gap character
seqan3::gapped<seqan3::dna4>{}.assign_char('K'); // unknown characters map to the default/unknown
// character of the given alphabet type (i.e. A of dna4)
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
seqan3::mask my_mask = seqan3::mask::masked;
seqan3::mask another_mask{};
my_mask.assign_rank(false); // will assign my_mask the value mask::unmasked
another_mask.assign_rank(0); // will also assign another_mask the value mask::unmasked
if (my_mask.to_rank() == another_mask.to_rank())
seqan3::debug_stream << "Both are UNMASKED!\n";
}
Implementation of a masked alphabet to be used for tuple composites.
Definition mask.hpp:35
static const mask masked
Member for masked.
Definition mask.hpp:71
Create a mask composite which can be applied with another alphabet.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::masked<seqan3::dna4> dna4_masked{};
seqan3::masked<seqan3::dna4> dna4_another_masked{'A'_dna4, seqan3::mask::unmasked};
// create a dna4 masked alphabet with an unmasked A
dna4_masked.assign_char('a'); // assigns a masked 'A'_dna4
if (dna4_masked.to_char() != dna4_another_masked.to_char())
{
seqan3::debug_stream << dna4_masked.to_char() << " is not the same as " << dna4_another_masked.to_char()
<< "\n";
}
}
static const mask unmasked
Member for unmasked.
Definition mask.hpp:65
Implementation of a masked composite, which extends a given alphabet with a mask.
Definition masked.hpp:42
Extends a given alphabet with the mask alphabet.
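The behaviour shown above, where assigning a lower-case 'a' yields a masked 'A', can be sketched as a letter paired with a flag. The struct `masked_letter` is a hypothetical stand-in in standard C++ and not the library's implementation. It assumes that case is the only carrier of the mask information.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical pairing of an upper-case letter with a mask flag.
struct masked_letter
{
    char base{'A'};     // upper-case letter of the underlying alphabet
    bool masked{false}; // true if the letter is masked

    // Lower-case input is interpreted as "masked", upper-case input as "unmasked".
    void assign_char(char const c)
    {
        assert(std::isalpha(static_cast<unsigned char>(c)) != 0); // the sketch only handles letters
        masked = std::islower(static_cast<unsigned char>(c)) != 0;
        base = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }

    // A masked letter is printed in lower case.
    char to_char() const
    {
        return masked ? static_cast<char>(std::tolower(static_cast<unsigned char>(base))) : base;
    }
};
```

A letter assigned from 'a' and one assigned from 'A' share the same `base` but differ in `to_char()`, which is the comparison the example above prints.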
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
using namespace seqan3::literals;
int main()
{
auto r1 = 'A'_rna5.complement(); // calls member function rna5::complement(); r1 == 'U'_rna5
auto r2 = seqan3::complement('A'_rna5); // calls global complement() function on the rna5 object; r2 == 'U'_rna5
}
Provides seqan3::nucleotide_alphabet.
Provides seqan3::rna5, container aliases and string literals.
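What the complement of a five-letter RNA alphabet amounts to can be written as a small character mapping. The function `complement_rna5` is a hypothetical sketch in standard C++ that works on upper-case characters instead of alphabet objects. It assumes that 'N' is its own complement and that every other character is treated like 'N'.

```cpp
#include <cassert> // for trying out the function with assert

// Complement of an RNA letter given as an upper-case character.
constexpr char complement_rna5(char const c)
{
    switch (c)
    {
    case 'A':
        return 'U';
    case 'C':
        return 'G';
    case 'G':
        return 'C';
    case 'U':
        return 'A';
    default:
        return 'N'; // 'N' and, in this sketch, every unknown character
    }
}

// Same relation as in the example above: the complement of A is U.
static_assert(complement_rna5('A') == 'U', "complement of A is U");
```

Applying the function twice returns the original letter for A, C, G, U and N.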
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna15 letter{'A'_dna15};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('F'); // Unknown characters are implicitly converted to N.
seqan3::debug_stream << letter << '\n'; // prints "N"
}
The 15 letter DNA alphabet, containing all IUPAC symbols minus the gap.
Definition dna15.hpp:48
Provides seqan3::dna15, container aliases and string literals.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna15 letter1{'A'_dna15};
auto letter2 = 'A'_dna15;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna15 letter1 = 'C'_rna15; // implicitly converted
seqan3::dna15 letter2{};
letter2 = 'C'_rna15; // implicitly converted
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_dna15 : public seqan3::dna15
{
// using seqan3::dna15::dna15; // uncomment to import implicit conversion shown by letter1
};
struct my_rna15 : public seqan3::rna15
{};
int main()
{
using namespace seqan3::literals;
// my_dna15 letter1 = 'C'_rna15; // NO automatic implicit conversion!
// seqan3::dna15 letter2 = my_rna15{}; // seqan3::dna15 only allows implicit conversion from seqan3::rna15!
}
The 15 letter RNA alphabet, containing all IUPAC symbols minus the gap.
Definition rna15.hpp:48
Checks whether from can be implicitly converted to to.
Provides concepts that do not have equivalents in C++20.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna15_vector vector{'A'_rna15, 'C'_rna15, 'G'_rna15}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::dna15_vector dna15_vector{"ACGT"_rna15};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::rna15_vector rna15_vector = "ACGT"_rna15;
seqan3::dna15_vector dna15_vector{rna15_vector.begin(), rna15_vector.end()};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna15_vector vector = "ACG"_dna15;
auto rna15_view = vector | seqan3::views::convert<seqan3::rna15>;
for (auto && chr : rna15_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::rna15 &&>);
}
}
Provides seqan3::views::convert.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna15_vector sequence1{"ACGTTA"_dna15};
seqan3::dna15_vector sequence2 = "ACGTTA"_dna15;
auto sequence3 = "ACGTTA"_dna15;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna16sam letter{'A'_dna16sam};
letter.assign_char('=');
seqan3::debug_stream << letter << '\n'; // prints "="
letter.assign_char('F'); // Unknown characters are implicitly converted to N.
seqan3::debug_stream << letter << '\n'; // prints "N"
}
A 16 letter DNA alphabet, containing all IUPAC symbols minus the gap and plus an equality sign ('=').
Definition dna16sam.hpp:45
Provides seqan3::dna16sam.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna16sam letter1{'A'_dna16sam};
auto letter2 = 'A'_dna16sam;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna16sam_vector sequence1{"ACGTTA"_dna16sam};
seqan3::dna16sam_vector sequence2 = "ACGTTA"_dna16sam;
auto sequence3 = "ACGTTA"_dna16sam;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna3bs letter{'A'_dna3bs};
letter.assign_char('C'); // All C will be converted to T.
seqan3::debug_stream << letter << '\n'; // prints "T"
letter.assign_char('F'); // Unknown characters are implicitly converted to A.
seqan3::debug_stream << letter << '\n'; // prints "A"
}
The three letter reduced DNA alphabet for bisulfite sequencing mode (A,G,T(=C)).
Definition dna3bs.hpp:58
Provides seqan3::dna3bs, container aliases and string literals.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna3bs letter1{'A'_dna3bs};
auto letter2 = 'A'_dna3bs;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna3bs_vector sequence1{"ACGTTA"_dna3bs};
seqan3::dna3bs_vector sequence2 = "ACGTTA"_dna3bs;
auto sequence3 = "ACGTTA"_dna3bs;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna4 letter{'C'_dna4};
letter.assign_char('F'); // Characters other than IUPAC characters are implicitly converted to A.
seqan3::debug_stream << letter << '\n'; // prints "A"
// IUPAC characters are implicitly converted to their best fitting representative
seqan3::debug_stream << letter.assign_char('R') << '\n'; // prints "A"
seqan3::debug_stream << letter.assign_char('Y') << '\n'; // prints "C"
seqan3::debug_stream << letter.assign_char('S') << '\n'; // prints "C"
seqan3::debug_stream << letter.assign_char('W') << '\n'; // prints "A"
seqan3::debug_stream << letter.assign_char('K') << '\n'; // prints "G"
seqan3::debug_stream << letter.assign_char('M') << '\n'; // prints "A"
seqan3::debug_stream << letter.assign_char('B') << '\n'; // prints "C"
seqan3::debug_stream << letter.assign_char('D') << '\n'; // prints "A"
seqan3::debug_stream << letter.assign_char('H') << '\n'; // prints "A"
seqan3::debug_stream << letter.assign_char('V') << '\n'; // prints "A"
letter.assign_char('a'); // Lower case letters are the same as their upper case equivalent.
seqan3::debug_stream << letter << '\n'; // prints "A"
}
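The conversions listed in the comments above amount to a lookup from input character to one of the four letters. The function `to_dna4_char` is a hypothetical sketch in standard C++ that reproduces only the cases shown in that snippet. Characters that the snippet does not list (for example 'U' or 'N') simply fall back to 'A' here, which may differ from what the library does.

```cpp
#include <cassert> // for trying out the function with assert
#include <cctype>

// Map a character to a four-letter DNA representative, following the comments of the snippet above.
char to_dna4_char(char const c)
{
    // Lower-case letters are treated like their upper-case equivalent.
    switch (std::toupper(static_cast<unsigned char>(c)))
    {
    case 'C':
    case 'Y':
    case 'S':
    case 'B':
        return 'C';
    case 'G':
    case 'K':
        return 'G';
    case 'T':
        return 'T';
    default:
        return 'A'; // 'A', 'R', 'W', 'M', 'D', 'H', 'V' and, in this sketch, everything else
    }
}
```

For instance, `to_dna4_char('K')` gives 'G' and `to_dna4_char('F')` gives 'A', matching the printed values in the snippet.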
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna4 letter1{'A'_dna4};
auto letter2 = 'A'_dna4;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna4 letter1 = 'C'_rna4; // implicitly converted
seqan3::dna4 letter2{};
letter2 = 'C'_rna4; // implicitly converted
}
Provides seqan3::rna4, container aliases and string literals.
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_dna4 : public seqan3::dna4
{
// using seqan3::dna4::dna4; // uncomment to import implicit conversion shown by letter1
};
struct my_rna4 : public seqan3::rna4
{};
int main()
{
using namespace seqan3::literals;
// my_dna4 letter1 = 'C'_rna4; // NO automatic implicit conversion!
// seqan3::dna4 letter2 = my_rna4{}; // seqan3::dna4 only allows implicit conversion from seqan3::rna4!
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna4_vector vector{'A'_rna4, 'C'_rna4, 'G'_rna4}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::dna4_vector dna4_vector{"ACGT"_rna4};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::rna4_vector rna4_vector = "ACGT"_rna4;
seqan3::dna4_vector dna4_vector{rna4_vector.begin(), rna4_vector.end()};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna4_vector vector = "ACG"_dna4;
auto rna4_view = vector | seqan3::views::convert<seqan3::rna4>;
for (auto && chr : rna4_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::rna4 &&>);
}
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna4_vector sequence1{"ACGTTA"_dna4};
seqan3::dna4_vector sequence2 = "ACGTTA"_dna4;
auto sequence3 = "ACGTTA"_dna4;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna5 letter{'A'_dna5};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('F'); // Unknown characters are implicitly converted to N.
seqan3::debug_stream << letter << '\n'; // prints "N"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna5 letter1{'A'_dna5};
auto letter2 = 'A'_dna5;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna5 letter1 = 'C'_rna5; // implicitly converted
seqan3::dna5 letter2{};
letter2 = 'C'_rna5; // implicitly converted
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_dna5 : public seqan3::dna5
{
// using seqan3::dna5::dna5; // uncomment to import implicit conversion shown by letter1
};
struct my_rna5 : public seqan3::rna5
{};
int main()
{
using namespace seqan3::literals;
// my_dna5 letter1 = 'C'_rna5; // NO automatic implicit conversion!
// seqan3::dna5 letter2 = my_rna5{}; // seqan3::dna5 only allows implicit conversion from seqan3::rna5!
}
The five letter RNA alphabet of A,C,G,U and the unknown character N.
Definition rna5.hpp:46
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector vector{'A'_rna5, 'C'_rna5, 'G'_rna5}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::dna5_vector dna5_vector{"ACGT"_rna5};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::rna5_vector rna5_vector = "ACGT"_rna5;
seqan3::dna5_vector dna5_vector{rna5_vector.begin(), rna5_vector.end()};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector vector = "ACG"_dna5;
auto rna5_view = vector | seqan3::views::convert<seqan3::rna5>;
for (auto && chr : rna5_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::rna5 &&>);
}
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector sequence1{"ACGTTA"_dna5};
seqan3::dna5_vector sequence2 = "ACGTTA"_dna5;
auto sequence3 = "ACGTTA"_dna5;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::rna15 letter{'A'_rna15};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('F'); // Unknown characters are implicitly converted to N.
seqan3::debug_stream << letter << '\n'; // prints "N"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna15 letter1{'A'_rna15};
auto letter2 = 'A'_rna15;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna15 letter1 = 'C'_dna15; // implicitly converted
seqan3::rna15 letter2{};
letter2 = 'C'_dna15; // implicitly converted
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_rna15 : public seqan3::rna15
{
// using seqan3::rna15::rna15; // uncomment to import implicit conversion shown by letter1
};
struct my_dna15 : public seqan3::dna15
{};
int main()
{
using namespace seqan3::literals;
// my_rna15 letter1 = 'C'_dna15; // NO automatic implicit conversion!
// seqan3::rna15 letter2 = my_dna15{}; // seqan3::rna15 only allows implicit conversion from seqan3::dna15!
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna15_vector vector{'A'_dna15, 'C'_dna15, 'G'_dna15}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::rna15_vector rna15_vector{"ACGT"_dna15};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::dna15_vector dna15_vector = "ACGT"_dna15;
seqan3::rna15_vector rna15_vector{dna15_vector.begin(), dna15_vector.end()};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna15_vector vector = "ACG"_rna15;
auto dna15_view = vector | seqan3::views::convert<seqan3::dna15>;
for (auto && chr : dna15_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::dna15 &&>);
}
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna15_vector sequence1{"ACGTTA"_rna15};
seqan3::rna15_vector sequence2 = "ACGTTA"_rna15;
auto sequence3 = "ACGTTA"_rna15;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::rna4 letter{'A'_rna4};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('F'); // Unknown characters are implicitly converted to A.
seqan3::debug_stream << letter << '\n'; // prints "A"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna4 letter1{'A'_rna4};
auto letter2 = 'A'_rna4;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna4 letter1 = 'C'_dna4; // implicitly converted
seqan3::rna4 letter2{};
letter2 = 'C'_dna4; // implicitly converted
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_rna4 : public seqan3::rna4
{
// using seqan3::rna4::rna4; // uncomment to import implicit conversion shown by letter1
};
struct my_dna4 : public seqan3::dna4
{};
int main()
{
using namespace seqan3::literals;
// my_rna4 letter1 = 'C'_dna4; // NO automatic implicit conversion!
// seqan3::rna4 letter2 = my_dna4{}; // seqan3::rna4 only allows implicit conversion from seqan3::dna4!
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna4_vector vector{'A'_dna4, 'C'_dna4, 'G'_dna4}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::rna4_vector rna4_vector{"ACGT"_dna4};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::dna4_vector dna4_vector = "ACGT"_dna4;
seqan3::rna4_vector rna4_vector{dna4_vector.begin(), dna4_vector.end()};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna4_vector vector = "ACG"_rna4;
auto dna4_view = vector | seqan3::views::convert<seqan3::dna4>;
for (auto && chr : dna4_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::dna4 &&>);
}
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna4_vector sequence1{"ACGTTA"_rna4};
seqan3::rna4_vector sequence2 = "ACGTTA"_rna4;
auto sequence3 = "ACGTTA"_rna4;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::rna5 letter{'A'_rna5};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('F'); // Unknown characters are implicitly converted to N.
seqan3::debug_stream << letter << '\n'; // prints "N"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna5 letter1{'A'_rna5};
auto letter2 = 'A'_rna5;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna5 letter1 = 'C'_dna5; // implicitly converted
seqan3::rna5 letter2{};
letter2 = 'C'_dna5; // implicitly converted
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_rna5 : public seqan3::rna5
{
// using seqan3::rna5::rna5; // uncomment to import implicit conversion shown by letter1
};
struct my_dna5 : public seqan3::dna5
{};
int main()
{
using namespace seqan3::literals;
// my_rna5 letter1 = 'C'_dna5; // NO automatic implicit conversion!
// seqan3::rna5 letter2 = my_dna5{}; // seqan3::rna5 only allows implicit conversion from seqan3::dna5!
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna5_vector vector{'A'_dna5, 'C'_dna5, 'G'_dna5}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::rna5_vector rna5_vector{"ACGT"_dna5};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::dna5_vector dna5_vector = "ACGT"_dna5;
seqan3::rna5_vector rna5_vector{dna5_vector.begin(), dna5_vector.end()};
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna5_vector vector = "ACG"_rna5;
auto dna5_view = vector | seqan3::views::convert<seqan3::dna5>;
for (auto && chr : dna5_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::dna5 &&>);
}
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna5_vector sequence1{"ACGTTA"_rna5};
seqan3::rna5_vector sequence2 = "ACGTTA"_rna5;
auto sequence3 = "ACGTTA"_rna5;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::phred42 letter{'@'_phred42};
letter.assign_char('!');
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "0"
seqan3::debug_stream << letter.to_char() << '\n'; // prints "!"
letter.assign_phred(49); // Values exceeding the maximum are implicitly limited to the maximum phred value.
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "41"
}
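The clamping and character mapping shown in this snippet can be pictured with plain standard C++. The following is only a minimal sketch, assuming the common Phred+33 ASCII offset and the 0..41 score range implied by the `// prints "41"` comment. The names `clamp_phred`, `phred_to_char` and `char_to_phred` are hypothetical helpers, not SeqAn3 API.

```cpp
#include <algorithm>

// Hypothetical helpers for illustration; not part of SeqAn3.
// Assumption: Phred+33 encoding, i.e. '!' (ASCII 33) encodes score 0.
constexpr int phred_offset = 33;

// Limit a score to [min_score, max_score], mirroring how assign_phred(49)
// ends up as 41 for an alphabet covering the scores 0..41.
constexpr int clamp_phred(int score, int min_score, int max_score)
{
    return std::clamp(score, min_score, max_score);
}

// Score -> printable character.
constexpr char phred_to_char(int score)
{
    return static_cast<char>(score + phred_offset);
}

// Printable character -> score.
constexpr int char_to_phred(char c)
{
    return static_cast<int>(c) - phred_offset;
}
```

With these helpers, `clamp_phred(49, 0, 41)` yields 41 and `char_to_phred('!')` yields 0, matching the comments in the snippet. Other quality alphabets would differ only in range and offset.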
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::phred42 letter1{'!'_phred42};
auto letter2 = '!'_phred42;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::phred42> sequence1{"##!!##"_phred42};
std::vector<seqan3::phred42> sequence2 = "##!!##"_phred42;
auto sequence3 = "##!!##"_phred42;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::phred63 letter{'@'_phred63};
letter.assign_char('!');
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "0"
seqan3::debug_stream << letter.to_char() << '\n'; // prints "!"
letter.assign_phred(72); // Values exceeding the maximum are implicitly limited to the maximum phred value.
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "62"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::phred63 letter1{'!'_phred63};
auto letter2 = '!'_phred63;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::phred63> sequence1{"##!!##"_phred63};
std::vector<seqan3::phred63> sequence2 = "##!!##"_phred63;
auto sequence3 = "##!!##"_phred63;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::phred68solexa letter{'@'_phred68solexa};
letter.assign_char(';');
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "-5"
seqan3::debug_stream << letter.to_char() << '\n'; // prints ";"
letter.assign_phred(72); // Values exceeding the maximum are implicitly limited to the maximum phred value.
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "62"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::phred68solexa letter1{'!'_phred68solexa};
auto letter2 = '!'_phred68solexa;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::phred68solexa> sequence1{"##!!##"_phred68solexa};
std::vector<seqan3::phred68solexa> sequence2 = "##!!##"_phred68solexa;
auto sequence3 = "##!!##"_phred68solexa;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::phred94 letter{'@'_phred94};
letter.assign_char('!');
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "0"
seqan3::debug_stream << letter.to_char() << '\n'; // prints "!"
letter.assign_phred(99); // Values exceeding the maximum are implicitly limited to the maximum phred value.
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "93"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::phred94 letter1{'!'_phred94};
auto letter2 = '!'_phred94;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::phred94> sequence1{"##!!##"_phred94};
std::vector<seqan3::phred94> sequence2 = "##!!##"_phred94;
auto sequence3 = "##!!##"_phred94;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
using seqan3::get;
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter{'A'_dna4, '('_phred42};
seqan3::debug_stream << seqan3::to_rank(letter) << ' ' // 7
<< seqan3::to_rank(get<0>(letter)) << ' ' // 0
<< seqan3::to_rank(get<1>(letter)) << '\n'; // 7
seqan3::debug_stream << seqan3::to_char(letter) << ' ' // A
<< seqan3::to_char(get<0>(letter)) << ' ' // A
<< seqan3::to_char(get<1>(letter)) << '\n'; // (
seqan3::debug_stream << seqan3::to_phred(letter) << ' ' // 7
<< seqan3::to_phred(get<1>(letter)) << '\n'; // 7
// Modify:
get<0>(letter) = 'G'_dna4;
seqan3::debug_stream << seqan3::to_char(letter) << '\n'; // G
}
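The combined rank 7 printed for the qualified letter is consistent with a mixed-radix encoding of the two component ranks. The sketch below assumes `combined = sequence_rank * quality_alphabet_size + quality_rank`, with 42 quality values for `seqan3::phred42`. This formula is inferred from the printed values and is not taken from the library source. `combined_rank` and `split_rank` are hypothetical names.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helpers illustrating a mixed-radix rank; not SeqAn3 API.
// Assumption (inferred from printed values, not from the library source):
// combined = sequence_rank * quality_alphabet_size + quality_rank.
constexpr std::size_t combined_rank(std::size_t sequence_rank,
                                    std::size_t quality_rank,
                                    std::size_t quality_alphabet_size)
{
    return sequence_rank * quality_alphabet_size + quality_rank;
}

// Inverse: recover {sequence_rank, quality_rank} from a combined rank.
constexpr std::pair<std::size_t, std::size_t> split_rank(std::size_t combined,
                                                         std::size_t quality_alphabet_size)
{
    return {combined / quality_alphabet_size, combined % quality_alphabet_size};
}
```

For 'A' (rank 0) with '(' (rank 7) this gives 0 * 42 + 7 = 7, the value shown in the snippet.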
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dot_bracket3 letter{'.'_db3};
letter.assign_char('(');
seqan3::debug_stream << letter << '\n'; // prints "("
letter.assign_char('F'); // Unknown characters are implicitly converted to '.'.
seqan3::debug_stream << letter << '\n'; // prints "."
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dot_bracket3 letter1{'('_db3};
auto letter2 = '('_db3;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dot_bracket3> sequence1{".(..)."_db3};
std::vector<seqan3::dot_bracket3> sequence2 = ".(..)."_db3;
auto sequence3 = ".(..)."_db3;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dssp9 letter{'H'_dssp9};
letter.assign_char('B');
seqan3::debug_stream << letter << '\n'; // prints "B"
letter.assign_char('F'); // Unknown characters are implicitly converted to 'X'.
seqan3::debug_stream << letter << '\n'; // prints "X"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dssp9 letter1{'H'_dssp9};
auto letter2 = 'H'_dssp9;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dssp9> sequence1{"EHHHHT"_dssp9};
std::vector<seqan3::dssp9> sequence2 = "EHHHHT"_dssp9;
auto sequence3 = "EHHHHT"_dssp9;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
using seqan3::get;
seqan3::structured_aa<seqan3::aa27, seqan3::dssp9> letter{'W'_aa27, 'B'_dssp9};
seqan3::debug_stream << seqan3::to_rank(letter) << ' ' // 199
<< seqan3::to_rank(get<0>(letter)) << ' ' // 22
<< seqan3::to_rank(get<1>(letter)) << '\n'; // 1
seqan3::debug_stream << seqan3::to_char(letter) << ' ' // W
<< seqan3::to_char(get<0>(letter)) << ' ' // W
<< seqan3::to_char(get<1>(letter)) << '\n'; // B
// Modify:
get<0>(letter) = 'V'_aa27;
seqan3::debug_stream << seqan3::to_char(letter) << '\n'; // V
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
using seqan3::get;
seqan3::structured_rna<seqan3::rna4, seqan3::dot_bracket3> letter{'G'_rna4, '('_db3};
seqan3::debug_stream << seqan3::to_rank(letter) << ' ' // 7
<< seqan3::to_rank(get<0>(letter)) << ' ' // 2
<< seqan3::to_rank(get<1>(letter)) << '\n'; // 1
seqan3::debug_stream << seqan3::to_char(letter) << ' ' // G
<< seqan3::to_char(get<0>(letter)) << ' ' // G
<< seqan3::to_char(get<1>(letter)) << '\n'; // (
// Modify:
get<0>(letter) = 'U'_rna4;
seqan3::debug_stream << seqan3::to_char(letter) << '\n'; // U
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::wuss51 letter{':'_wuss51};
letter.assign_char('~');
seqan3::debug_stream << letter << '\n'; // prints "~"
letter.assign_char('#'); // Unknown characters are implicitly converted to ';'.
seqan3::debug_stream << letter << '\n'; // prints ";"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::wuss51 letter1{'('_wuss51};
auto letter2 = '('_wuss51;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
using namespace seqan3::literals;
bool is_closing_char_member = '}'_wuss51.is_pair_close();
bool is_closing_char_free = seqan3::is_pair_close('.'_wuss51);
std::cout << std::boolalpha << is_closing_char_member << '\n'; // true
std::cout << std::boolalpha << is_closing_char_free << '\n'; // false
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
using namespace seqan3::literals;
bool is_opening_char_member = '{'_wuss51.is_pair_open();
bool is_opening_char_free = seqan3::is_pair_open('.'_wuss51);
std::cout << std::boolalpha << is_opening_char_member << '\n'; // true
std::cout << std::boolalpha << is_opening_char_free << '\n'; // false
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
using namespace seqan3::literals;
bool is_unpaired_char_member = '.'_wuss51.is_unpaired();
bool is_unpaired_char_free = seqan3::is_unpaired('{'_wuss51);
std::cout << std::boolalpha << is_unpaired_char_member << '\n'; // true
std::cout << std::boolalpha << is_unpaired_char_free << '\n'; // false
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::wuss51> sequence1{".<..>."_wuss51};
std::vector<seqan3::wuss51> sequence2 = ".<..>."_wuss51;
auto sequence3 = ".<..>."_wuss51;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
uint8_t max_depth_member = seqan3::wuss51::max_pseudoknot_depth;
uint8_t max_depth_meta = seqan3::max_pseudoknot_depth<seqan3::wuss51>;
std::cout << static_cast<uint16_t>(max_depth_member) << '\n'; // 22
std::cout << static_cast<uint16_t>(max_depth_meta) << '\n'; // 22
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
auto pk_opt = '.'_wuss51.pseudoknot_id(); // std::optional -> false
pk_opt = seqan3::pseudoknot_id('{'_wuss51); // std::optional -> true: 3
if (pk_opt)
seqan3::debug_stream << *pk_opt << '\n'; // 3
}
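One way to picture the pseudoknot id is as the position of a bracket symbol within the list of opening (or closing) symbols. The sketch below is a plain C++ illustration and not the SeqAn3 implementation. It assumes the opening symbols are ordered `<([{` followed by `A`..`R`, and the closing symbols `>)]}` followed by `a`..`r`, which gives 22 levels and maps `{` to 3 as in the snippet. `pseudoknot_level` is a hypothetical name.

```cpp
#include <optional>
#include <string_view>

// Hypothetical re-implementation for illustration; not SeqAn3 code.
// Assumption: opening symbols are ordered "<([{" followed by 'A'..'R' and
// closing symbols are ">)]}" followed by 'a'..'r' (22 levels in total).
inline std::optional<int> pseudoknot_level(char c)
{
    constexpr std::string_view opening{"<([{ABCDEFGHIJKLMNOPQR"};
    constexpr std::string_view closing{">)]}abcdefghijklmnopqr"};

    if (auto pos = opening.find(c); pos != std::string_view::npos)
        return static_cast<int>(pos);
    if (auto pos = closing.find(c); pos != std::string_view::npos)
        return static_cast<int>(pos);
    return std::nullopt; // unpaired symbols such as '.' carry no level
}
```

Returning `std::optional` mirrors the snippet: unpaired symbols yield an empty optional, so callers have to check before dereferencing.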
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
std::string_view str{"ACTTTGATAN"};
try
{
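seqan3::debug_stream << (str | seqan3::views::char_strictly_to<seqan3::dna4>); // prints "ACTTTGATA", then throws on 'N'
}
catch (seqan3::invalid_char_assignment const &)
{
seqan3::debug_stream << "\n[ERROR] Invalid char!\n"; // reached because 'N' is not a valid seqan3::dna4 character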
}
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
std::string str{"ACTTTGATAN"};
seqan3::debug_stream << (str | seqan3::views::char_to<seqan3::dna4>) << '\n'; // ACTTTGATAA
seqan3::debug_stream << (str | seqan3::views::char_to<seqan3::dna5>) << '\n'; // ACTTTGATAN
}
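The difference between a lenient conversion (as with `char_to`) and a strict one (as with `char_strictly_to`) can be sketched without the library. This is a simplified illustration only: it handles uppercase `A`, `C`, `G`, `T`, maps every other character to `A` in the lenient case (matching the `ACTTTGATAA` output above), and throws `std::invalid_argument` in the strict case. The function names are hypothetical and the exception type is a stand-in for the library's own.

```cpp
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helpers; not SeqAn3 API. Uppercase input only (simplification).
constexpr bool is_dna4_char(char c)
{
    return c == 'A' || c == 'C' || c == 'G' || c == 'T';
}

// Lenient: every unknown character becomes 'A'.
inline std::string to_dna4_lenient(std::string_view in)
{
    std::string out;
    out.reserve(in.size());
    for (char c : in)
        out.push_back(is_dna4_char(c) ? c : 'A');
    return out;
}

// Strict: the first unknown character aborts the conversion.
inline std::string to_dna4_strict(std::string_view in)
{
    std::string out;
    out.reserve(in.size());
    for (char c : in)
    {
        if (!is_dna4_char(c))
            throw std::invalid_argument{"invalid dna4 character"};
        out.push_back(c);
    }
    return out;
}
```

The lenient variant suits input where unknown characters may be folded away silently, while the strict variant suits validation, where bad input should surface as an error.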
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector foo{"ACGTA"_dna5};
// pipe notation
auto v = foo | seqan3::views::complement;
seqan3::debug_stream << v << '\n'; // TGCAT
// function notation
auto v2 = seqan3::views::complement(foo);
seqan3::debug_stream << v2 << '\n'; // TGCAT
// generate the reverse complement:
auto v3 = foo | seqan3::views::complement | std::views::reverse;
seqan3::debug_stream << v3 << '\n'; // TACGT
}
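For comparison, the same complement and reverse-complement results can be computed eagerly on a `std::string`. This is a minimal sketch with hypothetical helper names. Unlike the views above, it copies the data instead of converting lazily, and it treats any character other than `A`, `C`, `G`, `T` as `N` (an assumption made here for simplicity).

```cpp
#include <algorithm>
#include <string>
#include <utility>

// Hypothetical helpers; not SeqAn3 API. Handles uppercase A, C, G, T and N only.
constexpr char complement_char(char c)
{
    switch (c)
    {
    case 'A': return 'T';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'T': return 'A';
    default:  return 'N'; // assumption: anything else is treated as N
    }
}

// Complement every character of the sequence.
inline std::string complement(std::string seq)
{
    std::transform(seq.begin(), seq.end(), seq.begin(), complement_char);
    return seq;
}

// Complement first, then reverse, as in the view pipeline above.
inline std::string reverse_complement(std::string seq)
{
    seq = complement(std::move(seq));
    std::reverse(seq.begin(), seq.end());
    return seq;
}
```

Applied to `"ACGTA"`, `complement` gives `"TGCAT"` and `reverse_complement` gives `"TACGT"`, the same values as in the snippet's comments.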
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
#include <seqan3/alphabet/quality/aliases.hpp> // includes seqan3::dna4q
int main()
{
using namespace seqan3::literals;
seqan3::dna4_vector vec = "ACTTTGATA"_dna4;
auto v = vec | seqan3::views::to_char;
seqan3::debug_stream << v << '\n'; // [A,C,T,T,T,G,A,T,A]
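std::vector<seqan3::phred42> qvec = "!(&$(%?18"_phred42; // Phred scores 0, 7, 5, 3, 7, 4, 30, 16, 23
auto v3 = qvec | seqan3::views::to_char;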
seqan3::debug_stream << v3 << '\n'; // [!,(,&,$,(,%,?,1,8]
std::vector<seqan3::dna4q> qcvec{{'C'_dna4, '!'_phred42},
{'A'_dna4, '('_phred42},
{'G'_dna4, '&'_phred42},
{'T'_dna4, '$'_phred42},
{'G'_dna4, '('_phred42},
{'A'_dna4, '%'_phred42},
{'C'_dna4, '?'_phred42},
{'T'_dna4, '1'_phred42},
{'A'_dna4, '8'_phred42}};
auto v4 = qcvec | seqan3::views::to_char;
seqan3::debug_stream << v4 << '\n'; // [C,A,G,T,G,A,C,T,A]
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna4_vector vec = "ACTTTGATA"_dna4;
auto v = vec | seqan3::views::to_rank;
seqan3::debug_stream << v << '\n'; // [0,1,3,3,3,2,0,3,0]
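std::vector<seqan3::phred42> qvec = "!(&$(%?18"_phred42; // Phred scores 0, 7, 5, 3, 7, 4, 30, 16, 23
auto v3 = qvec | seqan3::views::to_rank;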
seqan3::debug_stream << v3 << '\n'; // [0,7,5,3,7,4,30,16,23]
std::vector<seqan3::dna4q> qcvec{{'C'_dna4, '!'_phred42},
{'A'_dna4, '('_phred42},
{'G'_dna4, '&'_phred42},
{'T'_dna4, '$'_phred42},
{'G'_dna4, '('_phred42},
{'A'_dna4, '%'_phred42},
{'C'_dna4, '?'_phred42},
{'T'_dna4, '1'_phred42},
{'A'_dna4, '8'_phred42}};
auto v4 = qcvec | seqan3::views::to_rank;
seqan3::debug_stream << v4 << '\n'; // [42,7,89,129,91,4,72,142,23]
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <vector>
int main()
{
std::vector<int> vec{0, 1, 3, 3, 3, 2, 0, 3, 0};
seqan3::debug_stream << (vec | seqan3::views::rank_to<seqan3::dna4>) << '\n'; // ACTTTGATA
seqan3::debug_stream << (vec | seqan3::views::rank_to<seqan3::dna5>) << '\n'; // ACNNNGANA
}
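The two outputs differ because the same rank refers to different characters in each alphabet. A small sketch, assuming ranks index the alphabet characters in sorted order (`ACGT` for dna4, `ACGNT` for dna5), reproduces both results. The function `ranks_to_chars` is hypothetical and not SeqAn3 API.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper; not SeqAn3 API.
// Assumption: ranks index into the alphabet's characters in sorted order,
// "ACGT" for dna4 and "ACGNT" for dna5.
inline std::string ranks_to_chars(std::vector<int> const & ranks, std::string_view alphabet)
{
    std::string out;
    out.reserve(ranks.size());
    for (int r : ranks)
        out.push_back(alphabet.at(static_cast<std::size_t>(r))); // throws std::out_of_range on invalid ranks
    return out;
}
```

Rank 3 is `T` in the four-letter table but `N` in the five-letter table, which is why the second output contains `N` where the first contains `T`.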
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
using namespace seqan3::literals;
auto char_to_char = seqan3::to_char('A'); // calls seqan3::custom::to_char('A')
auto dna5_to_char = seqan3::to_char('A'_dna5); // calls .to_char() member
std::cout << char_to_char << '\n'; // A
std::cout << dna5_to_char << '\n'; // A
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
using namespace seqan3::literals;
auto char_to_rank = seqan3::to_rank('A'); // calls seqan3::custom::to_rank('A')
static_assert(std::same_as<decltype(char_to_rank), uint8_t>);
std::cout << static_cast<uint16_t>(char_to_rank) << '\n'; // 65
auto dna5_to_rank = seqan3::to_rank('A'_dna5); // calls .to_rank() member
static_assert(std::same_as<decltype(dna5_to_rank), uint8_t>);
std::cout << static_cast<uint16_t>(dna5_to_rank) << '\n'; // 0
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector vec{"ACGTACGTACGTA"_dna5};
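// Default (first forward frame)
auto v1 = vec | seqan3::views::translate_single;
// == [T,Y,V,R]
seqan3::debug_stream << v1[1] << '\n';
// First forward frame
auto v2 = vec | seqan3::views::translate_single(seqan3::translation_frames::forward_frame0);
// == [T,Y,V,R]
// First reverse frame
auto v3 = vec | seqan3::views::translate_single(seqan3::translation_frames::reverse_frame0);
// == [Y,V,R,T]
// Second forward frame
auto v4 = vec | seqan3::views::translate_single(seqan3::translation_frames::forward_frame1);
// == [R,T,Y,V]
// Second reverse frame
auto v5 = vec | seqan3::views::translate_single(seqan3::translation_frames::reverse_frame1);
// == [T,Y,V,R]
// Third forward frame
auto v6 = vec | seqan3::views::translate_single(seqan3::translation_frames::forward_frame2);
// == [V,R,T]
// Third reverse frame
auto v7 = vec | seqan3::views::translate_single(seqan3::translation_frames::reverse_frame2);
// == [R,T,Y]
// function syntax
auto v8 = seqan3::views::translate_single(vec, seqan3::translation_frames::forward_frame0);
// == [T,Y,V,R]
// combinability
auto v9 =
vec | seqan3::views::complement | seqan3::views::translate_single(seqan3::translation_frames::reverse_frame0);
// == [M,H,A,C]
// combinability with default parameter
auto v10 = vec | seqan3::views::complement | seqan3::views::translate_single;
// == [C,M,H,A]
// combinability with default parameter
auto v11 = vec | seqan3::views::complement | seqan3::views::translate_single();
// == [C,M,H,A]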
}
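The frame comments above follow from how a sequence is cut into codons. As a rough sketch in plain C++ (the helper `codons` is hypothetical, not SeqAn3 API), forward frame k starts at offset k and drops any incomplete trailing codon. Reverse frames would apply the same cut to the reverse complement.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper; not SeqAn3 API.
// Splits `seq` into complete codons, starting at `offset` (0, 1 or 2).
// A trailing partial codon is dropped.
inline std::vector<std::string> codons(std::string_view seq, std::size_t offset)
{
    std::vector<std::string> result;
    for (std::size_t pos = offset; pos + 3 <= seq.size(); pos += 3)
        result.emplace_back(seq.substr(pos, 3));
    return result;
}
```

For `"ACGTACGTACGTA"`, offset 0 yields `ACG TAC GTA CGT`, which the standard genetic code translates to T, Y, V, R. Offset 2 yields only three codons (`GTA CGT ACG`), which is why the third forward frame has three amino acids.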
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
using namespace seqan3::literals;
int main()
{
// Input range needs to be two-dimensional
std::vector<std::vector<seqan3::dna4>> vec{"ACGTACGTACGTA"_dna4, "TCGAGAGCTTTAGC"_dna4};
// Translation with default parameters
auto v1 = vec | seqan3::views::translate_join;
seqan3::debug_stream << v1 << "\n"; // [TYVR,RTYV,VRT,YVRT,TYVR,RTY,SRAL,REL*,ESFS,AKAL,LKLS,*SSR]
// Access the third forward frame (index_frame 2) of the second input sequence (index_seq 1)
// Required frames per sequence s = 6
// n = (index_seq * s) + index_frame
// = 1 * 6 + 2
// = 8
auto third_frame_second_seq = v1[1 * 6 + 2];
seqan3::debug_stream << third_frame_second_seq << "\n"; // ESFS
// Translation with custom translation frame
auto v2 = vec | seqan3::views::translate_join(seqan3::translation_frames::forward_frame0);
seqan3::debug_stream << v2 << "\n"; // [TYVR,SRAL]
return 0;
}
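The index arithmetic in the comments above can be written as a one-line function. This is just a sketch of that formula with a hypothetical name, `joined_index`. It assumes every input sequence contributes the same number of frames, stored contiguously.

```cpp
#include <cstddef>

// Hypothetical helper; not SeqAn3 API.
// Position of frame `index_frame` of sequence `index_seq` in the flattened result,
// given `frames_per_seq` translated frames per input sequence.
constexpr std::size_t joined_index(std::size_t index_seq,
                                   std::size_t index_frame,
                                   std::size_t frames_per_seq)
{
    return index_seq * frames_per_seq + index_frame;
}

// Third forward frame (index 2) of the second sequence (index 1) with six frames.
static_assert(joined_index(1, 2, 6) == 8);
```

With a single requested frame per sequence (`frames_per_seq == 1`), the flattened index equals the sequence index.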
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector vec{"ACGTACGTACGTA"_dna5};
// default frame translation
auto v1 = vec | seqan3::views::translate;
seqan3::debug_stream << v1 << '\n'; // [TYVR,RTYV,VRT,YVRT,TYVR,RTY]
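// single frame translation
auto v2 = vec | seqan3::views::translate(seqan3::translation_frames::forward_frame0);
seqan3::debug_stream << v2 << '\n'; // [TYVR]
// first forward and first reverse frame
auto v3 = vec | seqan3::views::translate(seqan3::translation_frames::forward_reverse0);
seqan3::debug_stream << v3 << '\n'; // [TYVR,YVRT]
// forward frames translation
auto v4 = vec | seqan3::views::translate(seqan3::translation_frames::forward_frames);
seqan3::debug_stream << v4 << '\n'; // [TYVR,RTYV,VRT]
// six frame translation
auto v5 = vec | seqan3::views::translate(seqan3::translation_frames::six_frames);
seqan3::debug_stream << v5 << '\n'; // [TYVR,RTYV,VRT,YVRT,TYVR,RTY]
// function syntax
auto v6 = seqan3::views::translate(vec, seqan3::translation_frames::forward_reverse0);
seqan3::debug_stream << v6 << '\n'; // [TYVR,YVRT]
// combinability
auto v7 = vec | seqan3::views::complement | seqan3::views::translate(seqan3::translation_frames::forward_reverse0);
seqan3::debug_stream << v7 << '\n'; // [CMHA,MHAC]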
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <string>
#include <vector>
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dna5q> vec{{'A'_dna5, 'I'_phred42},
{'G'_dna5, 'I'_phred42},
{'G'_dna5, '?'_phred42},
{'A'_dna5, '5'_phred42},
{'T'_dna5, '+'_phred42}};
// trim by phred_value
auto v1 = vec | seqan3::views::trim_quality(20u);
seqan3::debug_stream << v1 << '\n'; // AGGA
// trim by quality character; in this case the nucleotide part of the character is irrelevant
auto v2 = vec | seqan3::views::trim_quality(seqan3::dna5q{'C'_dna5, '5'_phred42});
seqan3::debug_stream << v2 << '\n'; // AGGA
// combinability
auto v3 = vec | seqan3::views::trim_quality(20u) | seqan3::views::to_char;
seqan3::debug_stream << v3 << '\n'; // AGGA
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <string>
#include <vector>
using namespace seqan3::literals;
int main()
{
std::vector<seqan3::phred42> vec{"II?5+"_phred42};
// trim by phred_value
auto v1 = vec | seqan3::views::trim_quality(20u);
seqan3::debug_stream << v1 << '\n'; // II?5
// trim by quality character
auto v2 = vec | seqan3::views::trim_quality('I'_phred42);
seqan3::debug_stream << v2 << '\n'; // II
// function syntax
auto v3 = seqan3::views::trim_quality(vec, '5'_phred42);
seqan3::debug_stream << v3 << '\n'; // II?5
// combinability
auto v4 = vec | seqan3::views::trim_quality(20u) | seqan3::views::to_char;
seqan3::debug_stream << v4 << '\n'; // II?5
}
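The trimming behaviour in these examples amounts to keeping the longest prefix whose scores stay at or above the threshold. Below is a minimal standard-library sketch of that idea, assuming Phred+33 encoded quality characters. `trim_by_quality` is a hypothetical name and this is not how SeqAn3 implements the view.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper; not SeqAn3 API.
// Assumption: qualities are Phred+33 encoded characters.
// Returns the longest prefix whose scores are all >= threshold.
constexpr std::string_view trim_by_quality(std::string_view qualities, int threshold)
{
    std::size_t length = 0;
    while (length < qualities.size() && (qualities[length] - 33) >= threshold)
        ++length;
    return qualities.substr(0, length);
}
```

For `"II?5+"` the scores are 40, 40, 30, 20, 10, so a threshold of 20 keeps `"II?5"` and a threshold of 40 (the score of `'I'`) keeps `"II"`, in line with the comments above.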
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
std::string_view str{"ACTTTGATAN"};
try
{
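seqan3::debug_stream << (str | seqan3::views::validate_char_for<seqan3::dna4>); // prints "ACTTTGATA", then throws on 'N'
}
catch (seqan3::invalid_char_assignment const &)
{
seqan3::debug_stream << "\n[ERROR] Invalid char!\n"; // reached because 'N' is not a valid seqan3::dna4 character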
}
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
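#include <iostream>
#include <string>
#include <vector>
#include <seqan3/argument_parser/all.hpp>
#include <seqan3/core/debug_stream.hpp>
int main(int argc, char ** argv)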
{
seqan3::argument_parser myparser{"Grade-Average", argc, argv}; // initialize
std::string name{"Max Muster"}; // define default values directly in the variable.
bool bonus{false};
std::vector<double> grades{}; // you can also specify a vector that is treated as a list option.
myparser.add_option(name, 'n', "name", "Please specify your name.");
myparser.add_flag(bonus, 'b', "bonus", "Please specify if you got the bonus.");
myparser.add_positional_option(grades, "Please specify your grades.");
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << '\n'; // customize your error message
return -1;
}
if (bonus)
grades.push_back(1.0); // extra good grade
double avg{0};
for (auto g : grades)
avg += g;
avg = avg / grades.size();
seqan3::debug_stream << name << " has an average grade of " << avg << '\n';
return 0;
}
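The averaging step above divides by `grades.size()`. Whether the parser already rejects an empty positional list is not shown here, so the following is a defensive sketch that assumes the vector could be empty. The helper name `average_grade` is hypothetical and not part of SeqAn3.

```cpp
#include <numeric>
#include <optional>
#include <vector>

namespace sketch
{
// Returns the mean grade, or std::nullopt if there is nothing to average.
inline std::optional<double> average_grade(std::vector<double> grades, bool bonus)
{
    if (bonus)
        grades.push_back(1.0); // extra good grade, as in the snippet above

    if (grades.empty())
        return std::nullopt; // avoids dividing by zero

    double const sum = std::accumulate(grades.begin(), grades.end(), 0.0);
    return sum / static_cast<double>(grades.size());
}
} // namespace sketch
```

A caller can then print a message of its own when no value is returned instead of printing NaN.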
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main(int argc, char ** argv)
{
seqan3::argument_parser myparser{"The-Age-App", argc, argv}; // initialize
int age{30}; // define default values directly in the variable
myparser.add_option(age, 'a', "user-age", "Please specify your age.");
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "The-Age-App - [PARSER ERROR] " << ext.what() << '\n'; // customize your error message
return -1;
}
seqan3::debug_stream << "integer given by user: " << age << '\n';
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main(int argc, char ** argv)
{
seqan3::argument_parser myparser{"Penguin_Parade", argc, argv}; // initialize
myparser.info.version = "2.0.0";
myparser.info.date = "12.01.2017";
myparser.info.short_description = "Organize your penguin parade";
myparser.info.description.push_back("First Paragraph.");
myparser.info.description.push_back("Second Paragraph.");
myparser.info.examples.push_back("./penguin_parade Skipper Kowalski Rico Private -d 10 -m 02 -y 2017");
int d{1}; // day
int m{1}; // month
int y{2050}; // year
myparser.add_option(d, 'd', "day", "Please specify your preferred day.");
myparser.add_option(m, 'm', "month", "Please specify your preferred month.");
myparser.add_option(y, 'y', "year", "Please specify your preferred year.");
std::vector<std::string> penguin_names;
myparser.add_positional_option(penguin_names, "Specify the names of the penguins.");
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << ext.what() << "\n";
return -1;
}
// organize ...
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv};
std::string myvar{"Example"};
myparser.add_option(myvar, 's', "special-op", "You know what you doin'?", seqan3::option_spec::advanced);
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <ranges>
#include <string_view>
#include <system_error>
#include <unordered_map>
#include <seqan3/argument_parser/all.hpp>
namespace seqan3::custom
{
// Specialise the seqan3::custom::argument_parsing data structure to enable parsing of std::errc.
template <>
struct argument_parsing<std::errc>
{
// Specialise a mapping from an identifying string to the respective value of std::errc.
static inline std::unordered_map<std::string_view, std::errc> const enumeration_names{
{"no_error", std::errc{}},
{"timed_out", std::errc::timed_out},
{"invalid_argument", std::errc::invalid_argument},
{"io_error", std::errc::io_error}};
};
} // namespace seqan3::custom
int main(int argc, char const * argv[])
{
std::errc value{};
seqan3::argument_parser parser{"my_program", argc, argv};
// Because of the argument_parsing struct and
// its static member enumeration_names
// you can now add an option that takes a value of type std::errc:
parser.add_option(value,
'e',
"errc",
"Give me a std::errc value.",
seqan3::option_spec::standard,
seqan3::value_list_validator{(seqan3::enumeration_names<std::errc> | std::views::values)});
try
{
parser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
namespace foo
{
enum class bar
{
one,
two,
three
};
// Specialise a mapping from an identifying string to the respective value of your type bar.
auto enumeration_names(bar)
{
return std::unordered_map<std::string_view, bar>{{"one", bar::one}, {"two", bar::two}, {"three", bar::three}};
}
} // namespace foo
int main(int argc, char const * argv[])
{
foo::bar value{};
seqan3::argument_parser parser{"my_program", argc, argv};
// Because of the enumeration_names function
// you can now add an option that takes a value of type bar:
parser.add_option(value,
'f',
"foo",
"Give me a foo value.",
seqan3::option_spec::standard,
seqan3::value_list_validator{(seqan3::enumeration_names<foo::bar> | std::views::values)});
try
{
parser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main(int argc, char ** argv)
{
seqan3::argument_parser myparser{"awesome-app", argc, argv}; // initialize
int a{3};
myparser.add_option(a, 'a', "awesome-parameter", "Please specify an integer.");
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << '\n'; // customize your error message
return -1;
}
if (myparser.is_option_set('a'))
seqan3::debug_stream << "The user set option -a on the command line.\n";
if (myparser.is_option_set("awesome-parameter"))
seqan3::debug_stream << "The user set option --awesome-parameter on the command line.\n";
// Asking for an option identifier that was not used before throws an error:
// myparser.is_option_set("foo"); // throws seqan3::design_error
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
int myint;
seqan3::arithmetic_range_validator my_validator{2, 10};
myparser.add_option(myint, 'i', "integer", "Give me a number.", seqan3::option_spec::standard, my_validator);
// an exception will be thrown if the user specifies an integer
// that is not in range [2,10] (e.g. "./test_app -i 15")
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "integer given by user passed validation: " << myint << "\n";
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
int myint;
seqan3::value_list_validator my_validator{2, 4, 6, 8, 10};
myparser.add_option(myint, 'i', "integer", "Give me a number.", seqan3::option_spec::standard, my_validator);
// an exception will be thrown if the user specifies an integer
// that is not one of [2,4,6,8,10] (e.g. "./test_app -i 3")
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "integer given by user passed validation: " << myint << "\n";
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::filesystem::path myfile{};
myparser.add_option(myfile,
'f',
"file",
"Give me a filename.",
seqan3::option_spec::standard,
seqan3::input_file_validator{{"fa", "fasta"}});
// an exception will be thrown if the user specifies a filename
// that does not have one of the extensions ["fa","fasta"],
// does not exist, or is not readable.
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "filename given by user passed validation: " << myfile << "\n";
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::string my_string;
seqan3::regex_validator my_validator{"[a-zA-Z]+@[a-zA-Z]+\\.com"};
myparser.add_option(my_string, 's', "str", "Give me a string.", seqan3::option_spec::standard, my_validator);
// an exception will be thrown if the user specifies a string
// that is not an email address ending in .com
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "email address given by user passed validation: " << my_string << "\n";
return 0;
}
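To see which strings the pattern above accepts, it can be tried with `std::regex` directly. This is a minimal sketch that assumes the validator requires the whole string to match (as `std::regex_match` does). The function name `is_dot_com_address` is hypothetical.

```cpp
#include <regex>
#include <string>

namespace sketch
{
// True if the complete string has the form letters@letters.com.
inline bool is_dot_com_address(std::string const & text)
{
    static std::regex const pattern{"[a-zA-Z]+@[a-zA-Z]+\\.com"};
    return std::regex_match(text, pattern);
}
} // namespace sketch
```

Note that the pattern only allows ASCII letters, so an address such as `alice99@example.com` is rejected even though it ends in `.com`.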
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::string file_name;
seqan3::regex_validator absolute_path_validator{"(/[^/]+)+/.*\\.[^/\\.]+$"};
seqan3::input_file_validator my_file_ext_validator{{"sa", "so"}};
myparser.add_option(file_name,
'f',
"file",
"Give me a file name with an absolute path.",
seqan3::option_spec::standard,
absolute_path_validator | my_file_ext_validator);
// an exception will be thrown if the user specifies a file name
// that is not an absolute path or does not have one of the file extensions [sa,so]
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
std::cout << "filename given by user passed validation: " << file_name << "\n";
return 0;
}
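Chaining two validators with `|`, as in the snippet above, means that a value has to pass both checks. The idea can be sketched with plain callables that throw on rejection. Everything in namespace `sketch` is hypothetical and only illustrates the composition; in particular, the extension check here looks at the text after the last `.` and does not test whether the file exists or is readable.

```cpp
#include <algorithm>
#include <functional>
#include <regex>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

namespace sketch
{
// Hypothetical validator type: a callable that throws if the value is rejected.
using validator = std::function<void(std::string const &)>;

// Accepts only values that match the whole regular expression.
inline validator matches(std::string const & pattern)
{
    return [rgx = std::regex{pattern}](std::string const & value)
    {
        if (!std::regex_match(value, rgx))
            throw std::invalid_argument{"value does not match the pattern"};
    };
}

// Accepts only values whose text after the last '.' is in the list.
inline validator has_extension(std::vector<std::string> extensions)
{
    return [extensions = std::move(extensions)](std::string const & value)
    {
        auto const dot = value.rfind('.');
        std::string const ext = (dot == std::string::npos) ? std::string{} : value.substr(dot + 1);
        if (std::find(extensions.begin(), extensions.end(), ext) == extensions.end())
            throw std::invalid_argument{"unexpected file extension"};
    };
}

// Chaining: run the first check, then the second; either may throw.
inline validator chain(validator first, validator second)
{
    return [first = std::move(first), second = std::move(second)](std::string const & value)
    {
        first(value);
        second(value);
    };
}

// Convenience for experiments: true if the validator does not throw.
inline bool accepts(validator const & check, std::string const & value)
{
    try
    {
        check(value);
        return true;
    }
    catch (std::invalid_argument const &)
    {
        return false;
    }
}

// The combination from the snippet above, minus the file-system checks.
inline validator absolute_library_path()
{
    return chain(matches("(/[^/]+)+/.*\\.[^/\\.]+$"), has_extension({"sa", "so"}));
}
} // namespace sketch
```

With this sketch, `/usr/lib/libdemo.so` is accepted, `libdemo.so` fails the absolute-path pattern, and `/usr/lib/notes.txt` fails the extension check.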
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::filesystem::path mydir{};
myparser.add_option(mydir,
'd',
"dir",
"The directory containing the input files.",
seqan3::option_spec::standard,
seqan3::input_directory_validator{});
// an exception will be thrown if the user specifies a directory that does not exist or has insufficient
// read permissions.
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "directory given by user passed validation: " << mydir << "\n";
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::filesystem::path myfile{};
myparser.add_option(myfile,
'f',
"file",
"The input file containing the sequences.",
seqan3::option_spec::standard,
seqan3::input_file_validator{{"fa", "fasta"}});
// an exception will be thrown if the user specifies a filename
// that does not have one of the extensions ["fa","fasta"] or if the file does not exist/is not readable.
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "filename given by user passed validation: " << myfile << "\n";
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
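// Default constructed validator has an empty extension list.
seqan3::input_file_validator validator1{};
seqan3::debug_stream << validator1.get_help_page_message() << '\n';
// Specify your own extensions for the input file.
seqan3::input_file_validator validator2{std::vector{std::string{"exe"}, std::string{"fasta"}}};
seqan3::debug_stream << validator2.get_help_page_message() << '\n';
// Give the seqan3 file type as a template argument to get all valid extensions for this file.
seqan3::input_file_validator<seqan3::sequence_file_input<>> validator3{};
seqan3::debug_stream << validator3.get_help_page_message() << '\n';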
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::filesystem::path mydir{};
myparser.add_option(mydir,
'd',
"dir",
"The output directory for storing the files.",
seqan3::option_spec::standard,
seqan3::output_directory_validator{});
// an exception will be thrown if the user specifies a directory that cannot be created by the filesystem either
// because the parent path does not exist or the path has insufficient write permissions.
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "directory given by user passed validation: " << mydir << "\n";
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::filesystem::path myfile{};
// Use the seqan3::output_file_open_options to indicate that you allow overwriting existing output files, ...
myparser.add_option(
myfile,
'f',
"file",
"Output file containing the processed sequences.",
seqan3::option_spec::standard,
seqan3::output_file_validator{seqan3::output_file_open_options::open_or_create, {"fa", "fasta"}});
// ... or that you will throw a seqan3::validation_error if the user specified output file already exists
myparser.add_option(myfile,
'g',
"file2",
"Output file containing the processed sequences.",
seqan3::option_spec::standard,
seqan3::output_file_validator{seqan3::output_file_open_options::create_new, {"fa", "fasta"}});
// an exception will be thrown if the user specifies a filename
// that does not have one of the extensions ["fa","fasta"],
// if the file already exists, or if the file is not writable.
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "filename given by user passed validation: " << myfile << "\n";
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
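// Default constructed validator has an empty extension list.
seqan3::output_file_validator validator1{seqan3::output_file_open_options::create_new};
seqan3::debug_stream << validator1.get_help_page_message() << '\n';
// Specify your own extensions for the output file.
seqan3::output_file_validator validator2{seqan3::output_file_open_options::create_new,
std::vector{std::string{"exe"}, std::string{"fasta"}}};
seqan3::debug_stream << validator2.get_help_page_message() << '\n';
// Give the seqan3 file type as a template argument to get all valid extensions for this file.
seqan3::output_file_validator<seqan3::sequence_file_output<>> validator3{
seqan3::output_file_open_options::create_new};
seqan3::debug_stream << validator3.get_help_page_message() << '\n';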
return 0;
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
enum class my_enum
{
VAL1 = 1,
VAL2 = 2,
COMB = 3
};
template <>
constexpr bool seqan3::add_enum_bitwise_operators<my_enum> = true;
int main()
{
using seqan3::operator|;
my_enum e = my_enum::VAL1;
my_enum e2 = e | my_enum::VAL2;
std::cout << std::boolalpha << (e2 == my_enum::COMB) << '\n'; // true
}
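How such an opt-in can work is sketched below using only the standard library. This is not SeqAn3's implementation: the names `enable_bitwise_operators` and `permission` are hypothetical. A variable template defaults to `false`, and the operator only takes part in overload resolution for enums that specialise it to `true`.

```cpp
#include <type_traits>

namespace sketch
{
// Opt-in switch: false for every type unless explicitly specialised.
template <typename enum_t>
inline constexpr bool enable_bitwise_operators = false;

// Bitwise OR for enums that opted in; removed from overload resolution otherwise.
template <typename enum_t,
          std::enable_if_t<std::is_enum_v<enum_t> && enable_bitwise_operators<enum_t>, int> = 0>
constexpr enum_t operator|(enum_t lhs, enum_t rhs) noexcept
{
    using underlying_t = std::underlying_type_t<enum_t>;
    return static_cast<enum_t>(static_cast<underlying_t>(lhs) | static_cast<underlying_t>(rhs));
}

enum class permission
{
    read = 1,
    write = 2,
    read_write = 3
};

// Opt in for this enum only.
template <>
inline constexpr bool enable_bitwise_operators<permission> = true;
} // namespace sketch
```

Because `permission` lives in the same namespace as the operator, argument-dependent lookup finds it. The snippet above needs `using seqan3::operator|;` because `my_enum` is declared in the global namespace.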
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#if SEQAN3_WITH_CEREAL
# include <fstream>
# include <vector>
# include <seqan3/test/tmp_directory.hpp>
# include <cereal/archives/binary.hpp> // includes the cereal::BinaryInputArchive and cereal::BinaryOutputArchive
# include <cereal/types/vector.hpp> // includes cerealisation support for std::vector
// Written for std::vector, other types also work.
void load(std::vector<int16_t> & data, std::filesystem::path const & tmp_file)
{
std::ifstream is(tmp_file, std::ios::binary); // Where input can be found.
cereal::BinaryInputArchive archive(is); // Create an input archive from the input stream.
archive(data); // Load data.
}
// Written for std::vector, other types also work.
void store(std::vector<int16_t> const & data, std::filesystem::path const & tmp_file)
{
std::ofstream os(tmp_file, std::ios::binary); // Where output should be stored.
cereal::BinaryOutputArchive archive(os); // Create an output archive from the output stream.
archive(data); // Store data.
}
int main()
{
// The following example is for a std::vector but any seqan3 data structure that is documented as serialisable
// could be used, e.g. fm_index.
seqan3::test::tmp_directory tmp{};
auto tmp_file = tmp.path() / "data.out"; // this is a temporary file path, use any other filename.
std::vector<int16_t> vec{1, 2, 3, 4};
store(vec, tmp_file); // Calls store on a std::vector.
// This vector is needed to load the information into it.
std::vector<int16_t> vec2;
load(vec2, tmp_file); // Calls load on a std::vector.
seqan3::debug_stream << vec2 << '\n'; // Prints [1,2,3,4].
return 0;
}
#endif
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
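// Here we use the global and banded alignment configurations to show how they can be combined.
seqan3::configuration config = seqan3::align_cfg::method_global{} | seqan3::align_cfg::band_fixed_size{};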
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
enum struct my_id : int
{
bar_id,
foo_id
};
namespace seqan3::detail
{
template <>
inline constexpr std::array<std::array<int, 2>, 2> compatibility_table<my_id>{{{0, 1}, {1, 0}}};
} // namespace seqan3::detail
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
using seqan3::get;
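seqan3::configuration my_cfg =
seqan3::align_cfg::gap_cost_affine{seqan3::align_cfg::open_score{0}, seqan3::align_cfg::extension_score{-1}}
| seqan3::align_cfg::band_fixed_size{seqan3::align_cfg::lower_diagonal{-4},
seqan3::align_cfg::upper_diagonal{4}};
// my_cfg is now of type configuration<gap_cost_affine, band_fixed_size>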
seqan3::debug_stream << get<1>(my_cfg).lower_diagonal << '\n'; // prints -4
seqan3::debug_stream << get<seqan3::align_cfg::band_fixed_size>(my_cfg).upper_diagonal << '\n'; // prints 4
seqan3::debug_stream << get<seqan3::align_cfg::gap_cost_affine>(my_cfg).extension_score << '\n'; // prints -1
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
// Initial setup used in the actual example:
enum struct my_id : int
{
bar_id,
foo_id
};
struct bar : private seqan3::pipeable_config_element
{
public:
float value{};
bar() = default;
bar(bar const &) = default;
bar(bar &&) = default;
bar & operator=(bar const &) = default;
bar & operator=(bar &&) = default;
~bar() = default;
bar(float v) : value{v}
{}
static constexpr my_id id{my_id::bar_id};
};
template <typename t>
struct foo : private seqan3::pipeable_config_element
{
public:
t value{};
foo() = default;
foo(foo const &) = default;
foo(foo &&) = default;
foo & operator=(foo const &) = default;
foo & operator=(foo &&) = default;
~foo() = default;
foo(t v) : value{std::move(v)}
{}
static constexpr my_id id{my_id::foo_id};
};
template <typename t>
foo(t) -> foo<t>;
int main()
{
seqan3::configuration my_cfg{foo{1}}; // Only foo<int> is present.
seqan3::debug_stream << my_cfg.get_or(foo{std::string{"hello"}}).value << '\n'; // finds foo<int> -> prints: 1
seqan3::debug_stream << my_cfg.get_or(bar{2.4}).value << '\n'; // bar not present -> prints: 2.4
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
enum struct my_id : int
{
bar_id,
foo_id
};
struct bar : private seqan3::pipeable_config_element
{
public:
bar() = default;
bar(bar const &) = default;
bar(bar &&) = default;
bar & operator=(bar const &) = default;
bar & operator=(bar &&) = default;
~bar() = default;
static constexpr my_id id{my_id::bar_id};
};
template <typename t>
struct foo : private seqan3::pipeable_config_element
{
public:
foo() = default;
foo(foo const &) = default;
foo(foo &&) = default;
foo & operator=(foo const &) = default;
foo & operator=(foo &&) = default;
~foo() = default;
static constexpr my_id id{my_id::foo_id};
};
template <typename t>
foo(t) -> foo<t>;
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
uint8_t i = 71;
seqan3::debug_stream << '\'' << i << "'\n"; // prints '71' (because flag is set by default)
seqan3::debug_stream.unsetf(seqan3::fmtflags2::small_int_as_number); // unsets the flag
seqan3::debug_stream << '\'' << i << "'\n"; // prints 'G'
seqan3::debug_stream << seqan3::fmtflags2::small_int_as_number << '\'' << i << "'\n"; // prints '71' again
// instead of formatting the stream "inline", one can also call .setf()
}
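The flag exists because a standard output stream treats `uint8_t` as a character type. A small standard-library sketch of both behaviours follows. It assumes the usual case where `std::uint8_t` is an alias for `unsigned char`, and the helper names are hypothetical.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

namespace sketch
{
// Streams the value unchanged: a standard stream writes it as a character.
inline std::string print_raw(std::uint8_t value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}

// Converts to a wider integer first: the stream writes the number.
inline std::string print_as_number(std::uint8_t value)
{
    std::ostringstream out;
    out << static_cast<unsigned>(value);
    return out.str();
}
} // namespace sketch
```

For the value 71, `print_raw` gives `G` and `print_as_number` gives `71`, which mirrors the two outputs in the snippet above.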
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
#include <sstream>
int main()
{
using namespace seqan3::literals;
std::ostringstream o;
seqan3::debug_stream.set_underlying_stream(o);
seqan3::debug_stream << "ACGT"_dna5;
o.flush();
std::cout << o.str(); // prints the string stream's buffer: "ACGT"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <sstream>
int main()
{
using namespace seqan3::literals;
std::ostringstream o;
seqan3::debug_stream_type my_stream{o};
my_stream << "ACGT"_dna5;
o.flush();
seqan3::debug_stream << o.str() << '\n'; // prints the string stream's buffer: "ACGT"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>
int main()
{
using namespace seqan3::literals;
// The alphabet normally needs to be converted to char explicitly:
std::cout << seqan3::to_char('C'_dna5) << '\n'; // prints 'C'
// The debug_stream, on the other hand, does this automatically:
seqan3::debug_stream << 'C'_dna5 << '\n'; // prints 'C'
// The debug_stream can also print all types that model std::ranges::input_range:
std::vector<seqan3::dna5> vec{"ACGT"_dna5};
seqan3::debug_stream << vec << '\n'; // prints "ACGT"
// ranges of non-alphabets are printed comma-separated:
seqan3::debug_stream << (vec | seqan3::views::to_rank) << '\n'; // prints "[0,1,2,3]"
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
int main()
{
int outer{};
// Might be used for non-copyable lambdas. In this example, the lambda would be copyable even without the wrapper.
seqan3::detail::copyable_wrapper wrapper{[&outer](int const x)
{
outer += x;
return outer;
}};
auto wrapper_2 = wrapper; // Would not work with non-copyable lambda.
seqan3::debug_stream << wrapper(2) << '\n'; // 2
seqan3::debug_stream << wrapper_2(4) << '\n'; // 6
}
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <concepts>
#include <vector>
namespace seqan3::detail::adl_only
{
// Poison-pill overload to prevent non-ADL forms of unqualified lookup.
template <typename... args_t>
void begin(args_t...) = delete;
struct begin_cpo : public detail::customisation_point_object<begin_cpo, 1>
{
// Only this class is allowed to import the constructors from base_t. (CRTP safety idiom)
using base_t::base_t;
// range.begin(), member access
template <typename range_t>
requires true // further constraints
static constexpr auto SEQAN3_CPO_OVERLOAD(seqan3::detail::priority_tag<1>, range_t && range)(
/*return*/ std::forward<range_t>(range).begin() /*;*/
);
// begin(range), ADL access
template <typename range_t>
static constexpr auto SEQAN3_CPO_OVERLOAD(seqan3::detail::priority_tag<0>, range_t && range)(
/*return*/ begin(std::forward<range_t>(range)) /*;*/
);
};
} // namespace seqan3::detail::adl_only
namespace seqan3
{
// CPO is a normal function object that can be called via seqan3::begin(...)
inline constexpr auto begin = detail::adl_only::begin_cpo{};
} // namespace seqan3
namespace other_library
{
struct foo
{
friend int begin(foo const &) // ADL begin, as friend
{
return 0;
}
};
} // namespace other_library
// seqan3::begin CPO that will call the "begin" member function
std::vector<int> vec{};
static_assert(std::same_as<decltype(seqan3::begin(vec)), decltype(vec.begin())>); // same iterator type
static_assert(noexcept(vec.begin())); // is noexcept
static_assert(noexcept(seqan3::begin(vec)) == noexcept(vec.begin())); // perfect noexcept-forwarding
// seqan3::begin CPO that will call the "begin" function per ADL
other_library::foo foo{};
static_assert(std::same_as<decltype(seqan3::begin(foo)), decltype(begin(foo))>); // same value type
static_assert(!noexcept(begin(foo))); // isn't noexcept
static_assert(noexcept(seqan3::begin(foo)) == noexcept(begin(foo))); // perfect noexcept-forwarding
auto cpo_is_sfinae_friendly(...) -> void;
template <typename range_t>
auto cpo_is_sfinae_friendly(range_t && range) -> decltype(seqan3::begin(range));
// seqan3::begin itself is SFINAE friendly, i.e. no-hard compiler errors, if no cpo overload matches
static_assert(std::same_as<decltype(cpo_is_sfinae_friendly(0)), void>);
static_assert(std::same_as<decltype(cpo_is_sfinae_friendly(vec)), decltype(vec.begin())>);
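The priority mechanism behind such a customisation point object can be sketched without the SeqAn3 helper macro. The following is a simplified, hypothetical analogue for `size` rather than `begin`. All names in namespace `sketch` are assumptions, and unlike the snippet above it does not forward `noexcept`. A higher `priority_tag` wins when several overloads are viable, and SFINAE removes overloads whose expression is ill-formed.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch
{
// priority_tag<1> converts to priority_tag<0>, so overloads taking the higher tag are preferred.
template <std::size_t level>
struct priority_tag : priority_tag<level - 1>
{};

template <>
struct priority_tag<0>
{};

namespace adl_only
{
// Poison pill: keeps unqualified lookup from picking up unrelated non-ADL size functions.
template <typename... args_t>
void size(args_t...) = delete;

struct size_cpo
{
    // Preferred: container.size(), member access.
    template <typename container_t>
    static constexpr auto impl(priority_tag<1>, container_t && container)
        -> decltype(std::forward<container_t>(container).size())
    {
        return std::forward<container_t>(container).size();
    }

    // Fallback: size(container), found via ADL.
    template <typename container_t>
    static constexpr auto impl(priority_tag<0>, container_t && container)
        -> decltype(size(std::forward<container_t>(container)))
    {
        return size(std::forward<container_t>(container));
    }

    // Entry point: start at the highest priority and let SFINAE discard what does not apply.
    template <typename container_t>
    constexpr auto operator()(container_t && container) const
        -> decltype(impl(priority_tag<1>{}, std::forward<container_t>(container)))
    {
        return impl(priority_tag<1>{}, std::forward<container_t>(container));
    }
};
} // namespace adl_only

// The CPO is an ordinary function object.
inline constexpr adl_only::size_cpo size{};
} // namespace sketch

namespace other_library
{
struct box
{
    friend int size(box const &) // only reachable through ADL
    {
        return 7;
    }
};
} // namespace other_library
```

Calling `sketch::size` on a `std::vector` uses the member function, while calling it on `other_library::box` falls back to the friend function found by ADL.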
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <string>
#include <type_traits>
// Defines a crtp_base class with an additional non-type template parameter.
template <typename derived_t, int value>
class base1
{
public:
int func1() const
{
return value;
}
};
// Defines a crtp_base class with an additional value type and a parameter type.
template <typename derived_t, typename value_t, typename parameter_t>
class base2
{
public:
value_t func2(parameter_t const p) const
{
return static_cast<value_t>(p);
}
};
// The derived class that inherits from a variadic crtp pattern, which are augmented with additional trait types.
// These types must be wrapped in a deferred layer, otherwise the compilation fails as incomplete types are not allowed.
// But during the definition of the base classes, the derived class cannot be known.
// In addition the deferred type must be invoked with the derived class using the `invoke_deferred_crtp_base` helper
// template to instantiate the correct crtp base type.
// Note that it is possible to define base classes with type template parameters (see base2) or
// non-type template parameters (see base1), but non-type and type template parameters cannot be mixed in one
// base class definition.
template <typename... deferred_bases_t>
class derived : public seqan3::detail::invoke_deferred_crtp_base<deferred_bases_t, derived<deferred_bases_t...>>...
{};
int main()
{
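// Define deferred base with non-type template parameter
using deferred_base1 = seqan3::detail::deferred_crtp_base_vargs<base1, 10>;
// Define deferred base with type template parameter.
using deferred_base2 = seqan3::detail::deferred_crtp_base<base2, uint8_t, uint32_t>;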
// Instantiate the derived class with the deferred crtp base classes.
derived<deferred_base1, deferred_base2> d{};
// Check the inherited interfaces.
static_assert(std::is_same_v<decltype(d.func1()), int>, "Return type must be int");
static_assert(std::is_same_v<decltype(d.func2(10u)), uint8_t>, "Return type must be uint8_t");
}
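The deferral described in the comments above can be sketched with a small wrapper that stores the base template and its extra arguments and only forms the real base class once the derived type is supplied. This is a simplified illustration for type template parameters only. The names `deferred_base`, `scaling`, `widget` and `demo_widget` are hypothetical and not part of SeqAn3.

```cpp
namespace sketch
{
// Holds a CRTP base template plus its extra arguments without instantiating it yet.
template <template <typename...> typename base_t, typename... args_t>
struct deferred_base
{
    // Forms the actual base class once the derived type is known.
    template <typename derived_t>
    using invoke = base_t<derived_t, args_t...>;
};

// A CRTP base with one additional type parameter; it calls into the derived class.
template <typename derived_t, typename value_t>
struct scaling
{
    value_t scaled(value_t factor) const
    {
        return static_cast<derived_t const &>(*this).base_value() * factor;
    }
};

// The derived class receives deferred bases and instantiates each one with itself.
template <typename... deferred_bases_t>
struct widget : deferred_bases_t::template invoke<widget<deferred_bases_t...>>...
{
    int base_value() const
    {
        return 21;
    }
};

using demo_widget = widget<deferred_base<scaling, int>>;
} // namespace sketch
```

Here `sketch::demo_widget{}.scaled(2)` calls `base_value()` on the derived class through the CRTP base and returns 42.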
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <type_traits>
template <typename t>
requires std::is_integral_v<t>
struct foo
{
t value;
};
// foo is declarable with int, i.e. foo<int> is a valid expression
static_assert(seqan3::detail::is_class_template_declarable_with_v<foo, int>);
// foo is not declarable with double, because it does not fulfil the requires clause of foo.
static_assert(!seqan3::detail::is_class_template_declarable_with_v<foo, double>);
// This also works with std::enable_if and producing a substitution failure.
template <typename t, typename = std::enable_if_t<std::is_same_v<t, int>>>
struct bar
{
t value;
};
// bar is declarable with int, i.e. bar<int> is a valid expression
static_assert(seqan3::detail::is_class_template_declarable_with_v<bar, int>);
// bar is not declarable with double, because it produces a substitution failure (SFINAE).
static_assert(!seqan3::detail::is_class_template_declarable_with_v<bar, double>);
// is_class_template_declarable_with_v works well with lazy_conditional_t
template <typename t>
using maybe_foo_t = seqan3::detail::
lazy_conditional_t<seqan3::detail::is_class_template_declarable_with_v<foo, t>, seqan3::detail::lazy<foo, t>, t>;
int main()
{
foo<int> a = maybe_foo_t<int>{10}; // foo is instantiable with int, thus use foo<int>
seqan3::debug_stream << "a: " << a.value << '\n'; // prints 10
float b = maybe_foo_t<float>{0.4f}; // foo is not instantiable with float, thus use float directly
seqan3::debug_stream << "b: " << b << '\n'; // prints 0.4
return 0;
}
using seqan3::operator|;
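// error wraps an unsigned value (underlying type assumed here) and enables the decrement and increment skills.
struct error :
seqan3::detail::strong_type<unsigned,
error,
seqan3::detail::strong_type_skill::decrement
| seqan3::detail::strong_type_skill::increment>
{
using seqan3::detail::strong_type<unsigned,
error,
seqan3::detail::strong_type_skill::decrement
| seqan3::detail::strong_type_skill::increment>::strong_type;
};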
int main()
{
error e{4u};
--e;
++e;
}
struct error : seqan3::detail::strong_type<uint8_t, error>
{
};
struct window_size : seqan3::detail::strong_type<uint8_t, window_size>
{
};
#include <ranges>
#include <vector>
struct error : seqan3::detail::strong_type<unsigned, error>
{
};
struct window_size : seqan3::detail::strong_type<unsigned, window_size>
{
};
namespace detail
{
template <std::ranges::forward_range fwd_rng_type>
bool do_find(fwd_rng_type const &, uint8_t const, uint8_t const)
{
return true;
}
} // namespace detail
template <std::ranges::forward_range fwd_rng_type>
bool search(fwd_rng_type const & rng, window_size const window_size, error const error)
{
return detail::do_find(rng, window_size.get(), error.get());
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dna4> range = "ACGTT"_dna4;
search(range, window_size{4u}, error{2u});
return 0;
}
#include <ranges>
#include <vector>
namespace detail
{
template <std::ranges::forward_range fwd_rng_type>
bool do_find(fwd_rng_type const &, uint8_t const, uint8_t const)
{
return true;
}
} // namespace detail
template <std::ranges::forward_range fwd_rng_type>
bool search(fwd_rng_type const & rng, uint8_t const window_size, uint8_t const error)
{
return detail::do_find(rng, window_size, error);
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dna4> range = "ACGTT"_dna4;
search(range, 4u, 2u);
return 0;
}
int main()
{
using list_to_transfer = seqan3::type_list<int, char, double>;
using resulting_t = seqan3::detail::transfer_template_args_onto_t<list_to_transfer, std::tuple>;
static_assert(std::same_as<resulting_t, std::tuple<int, char, double>>);
}
#include <vector>
int main()
{
using my_type = std::vector<int>;
if constexpr (seqan3::detail::is_type_specialisation_of_v<my_type, std::vector>) // Note: std::vector has no <> !
{
// ...
}
}
#include <vector>
int main()
{
using my_type = std::vector<int>;
if constexpr (seqan3::detail::template_specialisation_of<my_type, std::vector>) // Note: std::vector has no <> !
{
// ...
}
}
#include <string>
#include <vector>
int main()
{
using selected_types = seqan3::detail::select_types_with_ids_t<types, types_as_ids, selected_ids>;
// resolves to type_list<std::vector<phred42>, std::string>
static_assert(std::same_as<selected_types, seqan3::type_list<std::vector<seqan3::phred42>, std::string>>);
}
#include <sstream>
int main()
{
std::string id{"seq1"};
std::string sequence{"ACTGACTGACTGACTAGCATGACTAGCATGC"};
// construct iterator from stream buffer
std::ostringstream ostr; // output stream backing the iterator (declaration assumed)
auto stream_it = seqan3::detail::fast_ostreambuf_iterator{*ostr.rdbuf()};
// You can do anything you could do with a regular std::ostreambuf_iterator
stream_it = '>'; // writes '>' to stream
*stream_it = ' '; // writes ' ' to stream
// Additionally, there is an efficient write_range member function
// Example 1: Write a range completely
stream_it.write_range(id); // return value can be ignored
// Example 2: Write a range in chunks of 10
auto it = std::ranges::begin(sequence);
while (it != std::ranges::end(sequence))
{
/* Note that you cannot use stream_it.write_range(rng | std::views::take(10)) here
* because the returned iterator is not of the correct type.
*/
auto current_end = it;
size_t steps = std::ranges::advance(current_end, 10u, std::ranges::end(sequence));
using subrange_t =
std::ranges::subrange<decltype(it), decltype(current_end), std::ranges::subrange_kind::sized>;
// Be aware that your range_type must model std::ranges::borrowed_range in order to use the return value!
it = stream_it.write_range(subrange_t{it, current_end, 10u - steps});
stream_it = ' ';
}
}
#include <filesystem>
#include <fstream>
int main()
{
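std::filesystem::path my_file = std::filesystem::temp_directory_path() / "dummy.txt"; // hypothetical path for illustration
std::ofstream file{my_file}; // Create the file (a std::ifstream would not create it).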
seqan3::detail::safe_filesystem_entry file_guard{my_file}; // Safe cleanup in case of errors.
// Do something on the file, that can possibly throw.
// If an unhandled exception is thrown, the file guard destructor safely removes the file from the filesystem.
file_guard.remove(); // Explicitly remove the file.
}
#include <sstream>
auto input = R"(> TEST1
ACGT
> Test2
AGGCTGA
> Test3
GGAGTATAATATATATATATATAT)";
int main()
{
// specify custom field combination/order to file:
auto record = fin.front(); // get current record, in this case the first
auto & id = record.id();
seqan3::debug_stream << id << '\n'; // TEST1
auto & seq = record.sequence();
seqan3::debug_stream << seq << '\n'; // ACGT
}
#include <string>
#include <vector>
int main()
{
using namespace seqan3::literals;
// The order of the types below represents a mapping between the type and the key.
// record_type now mimics std::tuple<std::string, dna4_vector, std::vector<phred42>>,
// the order also depends on selected_ids
record_type my_record{};
std::get<1>(my_record) = "the most important sequence in the database"; // access via index
std::get<std::string>(my_record) = "the least important sequence in the database"; // access via type
}
#include <sstream>
#include <string>
#include <tuple>
int main()
{
auto stream_it = fout.begin();
seqan3::dna5_vector seq;
// ...
// assign to file iterator
*stream_it = std::tie(seq, id);
// is the same as:
fout.push_back(std::tie(seq, id));
}
#include <sstream>
int main()
{
// I only want to print the mapping position (field::ref_offset) and flag:
unsigned mapping_pos{1300};
// ...
fout.emplace_back(mapping_pos, flag); // note that the order of the arguments is now different, because
// you specified that REF_OFFSET should be first
// or:
fout.push_back(std::tie(mapping_pos, flag));
}
#include <sstream>
#include <string>
int main()
{
// ...
fout.push_back(r);
}
#include <sstream>
#include <string>
#include <tuple>
int main()
{
seqan3::dna5_vector seq;
// ...
fout.push_back(std::tie(seq, id));
}
#include <sstream>
#include <string>
#include <vector>
int main()
{
std::string read_id;
// ... e.g. compute an alignment
using alignment_type =
alignment_type dummy_alignment{}; // an empty dummy alignment
// the record type specifies the fields we want to write
// initialize record
record_type rec{read, ref_id, dummy_alignment};
// Write the record
fout.push_back(rec);
// same as
fout.push_back(record_type{read, ref_id, dummy_alignment});
// as all our fields are empty so this would print an
}
#include <filesystem>
#include <sstream>
#include <tuple>
int main()
{
// I only want to print the mapping position (field::ref_offset) and flag:
unsigned mapping_pos{1300};
// ...
fout.emplace_back(mapping_pos, flag); // note that the order of the arguments is now different, because
// you specified that REF_OFFSET should be first
// or:
fout.push_back(std::tie(mapping_pos, flag));
}
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
auto it = fin.begin();
// the following are equivalent:
auto & rec0 = *it;
auto & rec1 = fin.front();
std::cout << std::boolalpha << (rec0.id() == rec1.id()) << '\n'; // true
// Note: both become invalid after incrementing "it"!
}
#include <filesystem>
#include <fstream> // for std::ofstream
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
// Create the temporary file.
auto tmp_file = std::filesystem::temp_directory_path() / "my.sam";
std::ofstream tmp_stream{tmp_file};
tmp_stream << sam_file_raw;
tmp_stream.close();
seqan3::sam_file_input fin{tmp_file}; // SAM format assumed, regular std::ifstream taken as stream
}
#include <sstream>
auto input = R"(@HD VN:1.6 SO:coordinate
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *)";
int main()
{
// ^ no need to specify the template arguments
}
#include <sstream>
auto input = R"(@HD VN:1.6 SO:coordinate
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *)";
int main()
{
// The default types; you can adjust this list if you don't want to read all this data.
using default_fields = seqan3::fields<seqan3::field::seq,
// The expected format:
default_fields,
// Which formats are allowed:
sam_file_input_t fin{std::istringstream{input}, seqan3::format_sam{}};
}
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
auto rec = std::move(fin.front()); // rec now stores the data permanently
}
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
// access the header information
seqan3::debug_stream << fin.header().format_version << '\n'; // 1.6
seqan3::debug_stream << fin.header().ref_dict << '\n'; // [(ref,(45,))] (this only works with seqan3::debug_stream!)
}
#include <sstream>
{
using sequence_alphabet = seqan3::dna4; // instead of dna5
template <typename alph>
using sequence_container = seqan3::bitpacked_sequence<alph>; // must be defined as a template!
};
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
// ... within main you can then use:
int main()
{
}
#include <sstream>
// A helper struct to create a temporary file and remove it when it goes out of scope.
struct temporary_file
{
temporary_file()
{
std::ofstream file{path}; // Create file
}
temporary_file(temporary_file const &) = delete;
temporary_file & operator=(temporary_file const &) = delete;
temporary_file(temporary_file &&) = delete;
temporary_file & operator=(temporary_file &&) = delete;
~temporary_file()
{
}
std::string read_content() const
{
std::ifstream file{path};
}
};
static constexpr auto sam_file_raw = R"(@HD VN:1.6 pb:5.0.0 ot:ter
@SQ SN:ref LN:34
)";
static auto get_sam_file_input()
{
}
void defaults_to_cerr()
{
auto fin = get_sam_file_input();
auto it = fin.begin();
}
void redirect_to_cout()
{
auto fin = get_sam_file_input();
fin.options.stream_warnings_to = std::addressof(std::cout); // Equivalent to `= &std::cout;`
auto it = fin.begin();
}
void redirect_to_file()
{
temporary_file tmp_file{};
auto fin = get_sam_file_input();
{ // Inner scope to close file before reading
std::ofstream warning_file{tmp_file.path};
fin.options.stream_warnings_to = std::addressof(warning_file); // Equivalent to `= &warning_file;`
auto it = fin.begin();
}
std::cout << "File content:\n" << tmp_file.read_content();
}
void silence_warnings()
{
auto fin = get_sam_file_input();
fin.options.stream_warnings_to = nullptr;
auto it = fin.begin();
}
void filter()
{
auto fin = get_sam_file_input();
std::stringstream stream{}; // collects the warnings so that they can be filtered below
fin.options.stream_warnings_to = std::addressof(stream); // Equivalent to `= &stream;`
auto it = fin.begin();
for (std::string line{}; std::getline(stream, line);)
{
// If "pb" is not found in the warning, print it to cerr.
if (line.find("pb") == std::string::npos) // C++23: `!line.contains("pb")`
std::cerr << line << '\n';
}
}
void print_section(std::string_view const section)
{
std::cout << "### " << section << " ###\n";
std::cerr << "### " << section << " ###\n";
}
int main()
{
print_section("defaults_to_cerr");
defaults_to_cerr();
print_section("redirect_to_cout");
redirect_to_cout();
print_section("redirect_to_file");
redirect_to_file();
print_section("silence_warnings");
silence_warnings();
print_section("filter");
filter();
}
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
for (auto & rec : fin)
{
seqan3::debug_stream << "flag: " << rec.flag() << '\n';
seqan3::debug_stream << "mapping quality: " << rec.mapping_quality() << '\n';
}
}
#include <ranges>
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
auto minimum_length10_filter = std::views::filter(
[](auto const & rec)
{
return std::ranges::size(rec.sequence()) >= 10;
});
for (auto & rec : fin | minimum_length10_filter) // only records with sequence length >= 10 will "appear"
seqan3::debug_stream << rec.id() << '\n';
}
#include <sstream>
#include <utility>
#include <vector>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
using record_type = typename decltype(fin)::record_type;
std::vector<record_type> records{}; // store all my records in a vector
for (auto & rec : fin)
records.push_back(std::move(rec));
}
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
for (auto & rec : fin)
{
seqan3::debug_stream << "id: " << rec.id() << '\n';
seqan3::debug_stream << "read sequence: " << rec.sequence() << '\n';
seqan3::debug_stream << "mapping position: " << rec.reference_position() << '\n';
seqan3::debug_stream << "mapping quality: " << rec.mapping_quality() << '\n';
// more fields are read by default
}
}
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
for (auto & [flag, mapq] : fin) // the order is the same as specified in fields!
{
seqan3::debug_stream << "flag: " << flag << '\n';
seqan3::debug_stream << "mapping quality: " << mapq << '\n';
}
}
#include <iostream>
int main()
{
}
#include <filesystem>
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
// fin uses custom fields, fout uses the default fields.
// output doesn't have to match the configuration of the input
for (auto & r : fin)
fout.push_back(r); // copy all the records.
}
#include <filesystem>
int main()
{
auto tmp_file = std::filesystem::temp_directory_path() / "my.sam";
seqan3::sam_file_output fout{tmp_file}; // SAM format detected, std::ofstream opened for file
}
#include <filesystem>
#include <string>
#include <vector>
int main()
{
auto tmp_file = std::filesystem::temp_directory_path() / "my.sam";
std::vector<std::string> ref_ids{"ref1", "ref2"};
std::vector<size_t> ref_lengths{1234, 5678};
seqan3::sam_file_output fout{tmp_file, ref_ids, ref_lengths};
}
#include <sstream>
int main()
{
// no need to specify the template arguments <...> for format specialization:
}
#include <sstream>
auto sam_file_raw = R"(First 0 * 0 0 * * 0 0 ACGT *
2nd 0 * 0 0 * * 0 0 NATA *
Third 0 * 0 0 * * 0 0 GATA *
)";
int main()
{
// copying a file in one line:
// with seqan3::sam_file_output as a variable:
fout = fin;
// or in pipe notation:
}
#include <ranges>
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 * = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 * * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 * * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 * = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
auto input_file = seqan3::sam_file_input{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
input_file | std::views::take(3) // take only the first 3 records
}
#include <sstream>
#include <string>
#include <vector>
int main()
{
std::vector<std::string> ref_ids{"ref1", "ref2"};
std::vector<size_t> ref_lengths{1234, 5678};
// always give reference information if you want to have your header properly initialised
// add information to the header of the file.
fout.header().comments.push_back("This is a comment");
}
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
int main()
{
using namespace seqan3::literals;
{"NATA"_dna5, "2nd"},
{"GATA"_dna5, "Third"}}; // a range of "records"
fout = range; // will iterate over the records and write them
// equivalent to:
range | fout;
}
#include <iostream>
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG !!!!!!!!!!!!!!!!!
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA !!!!!!!!!!! SA:Z:ref,29,-,6H5M,17,0;
r003 4 * 29 17 * * 0 0 TAGGC @@@@@ SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT !!!!!!!!! NM:i:1
)";
int main()
{
for (auto & rec : fin)
{
// Check if a certain flag value (bit) is set:
if (static_cast<bool>(rec.flag() & seqan3::sam_flag::unmapped))
std::cout << "Read " << rec.id() << " is unmapped\n";
if (rec.base_qualities()[0] < seqan3::assign_char_to('@', seqan3::phred42{})) // low quality
{
// Set a flag value (bit):
rec.flag() |= seqan3::sam_flag::failed_filter;
// Note that this does not affect other flag values (bits),
// e.g. `rec.flag() & seqan3::sam_flag::unmapped` may still be true
}
// Unset a flag value (bit):
rec.flag() &= ~seqan3::sam_flag::duplicate; // not marked as a duplicate anymore
}
}
int main()
{
using namespace seqan3::literals;
seqan3::sam_tag_dictionary dict{}; // initialise empty dictionary
dict.get<"NM"_tag>() = 3; // set SAM tag 'NM' to 3 (integer type)
dict.get<"CO"_tag>() = "comment"; // set SAM tag 'CO' to "comment" (string type)
auto nm = dict.get<"NM"_tag>(); // get SAM tag 'NM' (note: type is int32_t)
auto co = dict.get<"CO"_tag>(); // get SAM tag 'CO' (note: type is std::string)
seqan3::debug_stream << nm << '\n'; // will print '3'
seqan3::debug_stream << co << '\n'; // will print "comment"
}
using namespace seqan3::literals;
template <> // no template parameter since the tag is known
struct seqan3::sam_tag_type<"XX"_tag> // here comes your tag
{
using type = int32_t; // specify the type of your tag
};
using namespace seqan3::literals;
// ...
uint16_t tag_id = "NM"_tag; // tag_id = 10061
using namespace seqan3::literals;
// ...
using nm_tag_type = seqan3::sam_tag_type_t<"NM"_tag>;
using namespace seqan3::literals;
// ...
using nm_tag_type2 = seqan3::sam_tag_type<"NM"_tag>::type;
The generic base class.
Definition sam_tag_dictionary.hpp:165
detail::sam_tag_variant type
The type for all unknown tags with no extra overload defaults to a std::variant.
Definition sam_tag_dictionary.hpp:167
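The pattern behind these snippets (a primary template that falls back to a std::variant, plus full specialisations for known tags) can be sketched with the standard library alone. Everything below is hypothetical: make_tag, tag_type, tag_type_t and tag_variant are illustration names, and the way make_tag packs two characters into 16 bits is an assumption, not necessarily what the _tag literal does.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical stand-in for the variant of allowed tag value types.
using tag_variant = std::variant<char, std::int32_t, float, std::string, std::vector<std::int32_t>>;

// Hypothetical tag id: packs two characters into one 16-bit value (the packing is an assumption).
constexpr std::uint16_t make_tag(char c0, char c1)
{
    return static_cast<std::uint16_t>((static_cast<unsigned char>(c0) << 8) | static_cast<unsigned char>(c1));
}

// Primary template: every tag without a specialisation maps to the variant.
template <std::uint16_t tag>
struct tag_type
{
    using type = tag_variant;
};

// Shortcut alias for the nested type.
template <std::uint16_t tag>
using tag_type_t = typename tag_type<tag>::type;

// Full specialisation: the tag "XX" is declared to hold a 32-bit integer.
template <>
struct tag_type<make_tag('X', 'X')>
{
    using type = std::int32_t;
};

// The specialised tag yields int32_t, any other tag falls back to the variant.
static_assert(std::is_same_v<tag_type_t<make_tag('X', 'X')>, std::int32_t>);
static_assert(std::is_same_v<tag_type_t<make_tag('X', 'Y')>, tag_variant>);
```

The specialisation has to be visible before the tag's type is first requested, which is why it is declared at namespace scope ahead of any use.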
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <variant>                                   // for std::visit
#include <seqan3/core/debug_stream.hpp>              // for seqan3::debug_stream
#include <seqan3/io/sam_file/sam_tag_dictionary.hpp> // for seqan3::sam_tag_dictionary
#include <seqan3/utility/container/concept.hpp>      // for the seqan3::container concept
// a lambda helper function that prints a value of any type the std::variant<...allowed SAM tag types...> can hold
auto print_fn = [](auto && arg)
{
    using T = std::remove_cvref_t<decltype(arg)>; // the type T of arg.
    if constexpr (!seqan3::container<T>)          // If T is not a container,
    {
        seqan3::debug_stream << arg << '\n'; // just print arg directly.
    }
    else // If T is a container,
    {
        for (auto const & arg_v : arg) // print every value in arg.
            seqan3::debug_stream << arg_v << ",";
    }
};
int main()
{
    using namespace seqan3::literals;
    seqan3::sam_tag_dictionary dict{}; // initialise empty dictionary
    // ! there is no get function for unknown tags !
    // dict.get<"XZ"_tag>() = 3;
    // but you can use the operator[]
    dict["XZ"_tag] = 3; // set unknown SAM tag 'XZ' to 3 (type int32_t)
    // ! there is no get function for unknown tags !
    // auto xz = dict.get<"XZ"_tag>();
    // but you can use the operator[] again
    auto xz = dict["XZ"_tag]; // get SAM tag 'XZ' (type std::variant<...allowed SAM tag types...>)
    // ! you cannot print a std::variant directly !
    // seqan3::debug_stream << xz << '\n';
    // but you can use visit:
    std::visit(print_fn, xz); // prints 3
}
The (most general) container concept as defined by the standard library.
Adaptations of concepts from the standard library.
T visit(T... args)
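The visitor idea used above, one generic lambda that branches at compile time on whether the held type is a container, can be tried out without SeqAn. The sketch below is only an approximation: tag_variant is a made-up variant, is_iterable is a hypothetical trait standing in for the seqan3::container concept, and the result is collected in a std::string rather than written to seqan3::debug_stream.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for the variant of allowed tag value types.
using tag_variant = std::variant<char, std::int32_t, float, std::vector<std::int32_t>>;

// Hypothetical trait: true for types that offer begin() and end().
template <typename T, typename = void>
struct is_iterable : std::false_type
{};

template <typename T>
struct is_iterable<T,
                   std::void_t<decltype(std::declval<T const &>().begin()),
                               decltype(std::declval<T const &>().end())>> : std::true_type
{};

// Formats whichever alternative the variant currently holds.
std::string to_text(tag_variant const & value)
{
    std::ostringstream out;
    std::visit(
        [&out](auto const & arg)
        {
            using T = std::decay_t<decltype(arg)>; // the alternative's type
            if constexpr (!is_iterable<T>::value)  // scalar: print it directly
            {
                out << arg;
            }
            else // container: print every element followed by a comma
            {
                for (auto const & element : arg)
                    out << element << ',';
            }
        },
        value);
    return out.str();
}
```

Because the branch is an `if constexpr`, only the matching arm is instantiated for each alternative, so the loop is never compiled for a scalar such as `std::int32_t`.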
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
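#include <sstream>
#include <seqan3/io/sequence_file/input.hpp> // for seqan3::sequence_file_input
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
    // Construct the file object from a stream; the format has to be given explicitly in this case.
    seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
}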
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
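#include <sstream>
#include <utility>
#include <vector>
#include <seqan3/io/sequence_file/input.hpp> // for seqan3::sequence_file_input
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
    seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
    using record_type = typename decltype(fin)::record_type;
    std::vector<record_type> records{}; // keeps the records after the loop has finished
    for (auto & rec : fin)
        records.push_back(std::move(rec)); // move the record out of the file instead of copying it
}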
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
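#include <sstream>
#include <vector>
#include <seqan3/core/debug_stream.hpp>      // for seqan3::debug_stream
#include <seqan3/io/sequence_file/input.hpp> // for seqan3::sequence_file_input
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
    using seqan3::get;
    seqan3::sequence_file_input fin{std::istringstream{input},
                                    seqan3::format_fasta{},
                                    seqan3::fields<seqan3::field::id, seqan3::field::seq, seqan3::field::qual>{}};
    for (auto & [id, seq, qual] : fin) // the order is now different, "id" comes first, because it was specified first
    {
        seqan3::debug_stream << "ID: " << id << '\n';
        seqan3::debug_stream << "SEQ: " << seq << '\n';
        seqan3::debug_stream << "QUAL: " << qual << '\n';
    }
}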
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
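#include <sstream>
#include <seqan3/core/debug_stream.hpp>      // for seqan3::debug_stream
#include <seqan3/io/sequence_file/input.hpp> // for seqan3::sequence_file_input
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
    seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
    for (auto & [sequence, id, quality] : fin)
    {
        seqan3::debug_stream << "ID: " << id << '\n';
        seqan3::debug_stream << "SEQ: " << sequence << '\n';
        seqan3::debug_stream << "EMPTY QUAL." << quality << '\n'; // quality is empty for FASTA files
    }
}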
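The two loops above decompose a record with structured bindings, and such bindings follow the position of the fields, not the names written in the brackets. A minimal standard-library sketch of that rule follows; the tuple layout (sequence first, id second) and the function label are assumptions made up for illustration.

```cpp
#include <string>
#include <tuple>

// Hypothetical record laid out as (sequence, id).
using toy_record = std::tuple<std::string, std::string>;

// Builds a printable label; the bindings are matched to the tuple elements by position.
std::string label(toy_record const & rec)
{
    auto const & [sequence, id] = rec; // position 0 -> sequence, position 1 -> id
    return "ID: " + id + " SEQ: " + sequence;
}
```

Writing `[id, sequence]` here would still compile but silently swap the two values, which is why the binding order has to mirror the order in which the fields were selected.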
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
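#include <ranges>
#include <sstream>
#include <seqan3/core/debug_stream.hpp>      // for seqan3::debug_stream
#include <seqan3/io/sequence_file/input.hpp> // for seqan3::sequence_file_input
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
    seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
    auto minimum_length5_filter = std::views::filter(
        [](auto const & rec)
        {
            return std::ranges::size(rec.sequence()) >= 5;
        });
    for (auto & rec : fin | minimum_length5_filter) // only records with sequence length >= 5 will "appear"
    {
        seqan3::debug_stream << "IDs of seq_length >= 5: " << rec.id() << '\n';
        // ...
    }
}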
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
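#include <sstream>
#include <utility>
#include <seqan3/io/sequence_file/input.hpp> // for seqan3::sequence_file_input
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
    seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
    //                         ^ no need to specify the template arguments
}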
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
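#include <sstream>
#include <seqan3/core/debug_stream.hpp>      // for seqan3::debug_stream
#include <seqan3/io/sequence_file/input.hpp> // for seqan3::sequence_file_input
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
    seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
    for (auto & record : fin)
    {
        seqan3::debug_stream << "ID: " << record.id() << '\n';
        seqan3::debug_stream << "SEQ: " << record.sequence() << '\n';
        // a quality field also exists, but is not printed, because we know it's empty for FASTA files.
    }
}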
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
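#include <sstream>
#include <utility>
#include <seqan3/io/sequence_file/input.hpp> // for seqan3::sequence_file_input
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
    seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
    auto rec0 = std::move(fin.front()); // take ownership of the first record
}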
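Why moving the front record out can be useful is easier to see with a toy reader. The sketch below assumes a single-pass reader that keeps only the current element in an internal buffer; line_reader and take_first are hypothetical names, and whether the sequence file behaves exactly like this is an assumption made for the sake of the illustration.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical single-pass reader that keeps only the current line in an internal buffer.
class line_reader
{
public:
    explicit line_reader(std::string text) : stream{std::move(text)}
    {
        next(); // load the first line so that front() is valid
    }

    // Reference to the internal buffer; its content changes with every call to next().
    std::string & front()
    {
        return buffer;
    }

    // Overwrites the buffer with the next line; returns false at the end of the input.
    bool next()
    {
        return static_cast<bool>(std::getline(stream, buffer));
    }

private:
    std::istringstream stream;
    std::string buffer;
};

// Moves the first line out of the reader, then advances; the returned value stays intact.
std::string take_first(line_reader & reader)
{
    std::string first = std::move(reader.front());
    reader.next();
    return first;
}
```

A reference obtained from front() would observe the new line after next(), whereas the moved-out string keeps the old content and avoids a copy.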
// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
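#include <iostream>
#include <ranges>
#include <sstream>
#include <seqan3/io/sequence_file/input.hpp> // for seqan3::sequence_file_input
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
    seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
    auto it = std::ranges::begin(fin);
    // the following are equivalent:
    auto & rec0 = *it;
    auto & rec1 = fin.front();
    std::cout << std::boolalpha << (rec0.id() == rec1.id()) << '\n'; // true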
// Note: rec0 and rec1 are references and become invalid after incrementing &q