SeqAn3 3.3.0
The Modern C++ library for sequence analysis.
The SeqAn Cookbook

This document provides example recipes on how to carry out particular tasks using the SeqAn functionalities in C++. Please note that these recipes are not ordered. You can use the links in the table of contents or the search function of your browser to navigate them.

It will take some time, but we hope to expand this document to contain numerous examples. If you have suggestions for how to improve the Cookbook and/or examples you would like included, please feel free to contact us.

Read sequence files

#include <filesystem>
#include <string>

#include <seqan3/core/debug_stream.hpp>      // for debug_stream
#include <seqan3/io/sequence_file/input.hpp> // for sequence_file_input

int main()
{
    std::filesystem::path tmp_dir = std::filesystem::temp_directory_path(); // get the tmp directory

    // Initialise a file input object with a FASTA file.
    // This assumes that a file named "seq.fasta" exists in the temporary directory.
    seqan3::sequence_file_input file_in{tmp_dir / "seq.fasta"};

    // Retrieve the sequences and ids.
    for (auto & [seq, id, qual] : file_in)
    {
        seqan3::debug_stream << "ID: " << id << '\n';
        seqan3::debug_stream << "SEQ: " << seq << '\n';
        seqan3::debug_stream << "Empty Qual." << qual << '\n'; // qual is empty for FASTA files
    }

    return 0;
}

Construction and assignment of alphabet symbols

#include <seqan3/alphabet/all.hpp> // for working with alphabets directly

int main()
{
    using namespace seqan3::literals;

    // Two objects of seqan3::dna4 alphabet constructed with a char literal.
    seqan3::dna4 ade = 'A'_dna4;
    seqan3::dna4 gua = 'G'_dna4;

    // Two additional objects assigned explicitly from char or rank.
    seqan3::dna4 cyt, thy;
    cyt.assign_char('C');
    thy.assign_rank(3);

    // Further code here...

    return 0;
}

    // Continuing inside main() from above:
    // Get the rank type of the alphabet (here uint8_t).
    using rank_type = seqan3::alphabet_rank_t<seqan3::dna4>;

    // Retrieve the numerical representation (rank) of the objects.
    rank_type rank_a = ade.to_rank(); // => 0
    rank_type rank_g = gua.to_rank(); // => 2

Reverse complement and the six-frame translation of a string using views

This recipe creates a small program that

  1. reads a string from the command line (first argument to the program)
  2. "converts" the string to a range of seqan3::dna5 (Bonus: throws an exception if loss of information occurs)
  3. prints the string and its reverse complement
  4. prints the six-frame translation of the string

#include <iostream> // std::cerr
#include <ranges>   // include all of the standard library's views
#include <string>

#include <seqan3/alphabet/views/all.hpp>  // include all of SeqAn's views
#include <seqan3/argument_parser/all.hpp> // optional: include the argument_parser
#include <seqan3/core/debug_stream.hpp>   // for debug_stream

int main(int argc, char ** argv)
{
    // We use the seqan3::argument_parser which was introduced in the second chapter
    // of the tutorial: "Parsing command line arguments with SeqAn".
    seqan3::argument_parser myparser{"Assignment-3", argc, argv}; // initialize
    std::string s{};                                              // receives the DNA string

    myparser.add_positional_option(s, "Please specify the DNA string.");

    try
    {
        myparser.parse();
    }
    catch (seqan3::argument_parser_error const & ext) // the user did something wrong
    {
        std::cerr << "[PARSER ERROR]" << ext.what() << '\n'; // you can customize your error message
        return -1;
    }

    auto s_as_dna = s | seqan3::views::char_to<seqan3::dna5>;

    // Bonus:
    //auto s_as_dna = s | std::views::transform([] (char const c)
    //{
    //    return seqan3::assign_char_strictly_to(c, seqan3::dna5{});
    //});

    seqan3::debug_stream << "Original: " << s_as_dna << '\n';
    seqan3::debug_stream << "RevComp: " << (s_as_dna | std::views::reverse | seqan3::views::complement) << '\n';
    seqan3::debug_stream << "Frames: " << (s_as_dna | seqan3::views::translate) << '\n';
}
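
To make the reverse-complement step of this recipe more tangible, here is a minimal sketch that uses only the standard library. It is an illustration of the idea, not how SeqAn implements its views. The names complement, reverse_complement and example_usage are hypothetical, and mapping every character other than A, C, G, T to 'N' is an assumption that mirrors the conversion to seqan3::dna5.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Complement of a single nucleotide character. Anything that is not
// A, C, G or T (in either case) is treated as the unknown base 'N'.
char complement(char const c)
{
    switch (std::toupper(static_cast<unsigned char>(c)))
    {
    case 'A': return 'T';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'T': return 'A';
    default:  return 'N';
    }
}

// Hypothetical helper: reverse the string, then complement every base.
std::string reverse_complement(std::string const & s)
{
    std::string result(s.rbegin(), s.rend());
    std::transform(result.begin(), result.end(), result.begin(), complement);
    return result;
}

// Usage example.
void example_usage()
{
    assert(reverse_complement("AAGC") == "GCTT");
    assert(reverse_complement("ACGT") == "ACGT"); // its own reverse complement
}
```

Unlike the views in the recipe, this sketch copies the string eagerly; the views compute the same result lazily, element by element.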

Reading records

After construction, you can read the sequence records. Our file object behaves like a range, so you can use a range-based for loop to conveniently iterate over the file:

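#include <sstream>

#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/input.hpp>

auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";

int main()
{
    // Read the FASTA data from an in-memory stream.
    seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};

    for (auto & record : fin)
    {
        seqan3::debug_stream << "ID: " << record.id() << '\n';
        seqan3::debug_stream << "SEQ: " << record.sequence() << '\n';
        // a quality field also exists, but is not printed, because we know it's empty for FASTA files.
    }
}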
Attention
An input file is a single-pass input range, which means you can only iterate over it once!
Note
It is important to write auto & and not just auto, otherwise you will copy the record on every iteration.

You can also use structured bindings, i.e. for (auto & [seq, id, qual] : fin). But beware: with structured bindings you need to get the order of the elements right!

You can also read a file in chunks:

Reading records in chunks

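#include <filesystem>

#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/all.hpp>
#include <seqan3/utility/views/chunk.hpp>

int main()
{
    // This assumes that a file named "my.fastq" exists in the current directory.
    seqan3::sequence_file_input fin{std::filesystem::current_path() / "my.fastq"};

    // `&&` is important because seqan3::views::chunk returns temporaries!
    for (auto && records : fin | seqan3::views::chunk(10))
    {
        // `records` contains 10 elements (or less at the end)
        seqan3::debug_stream << "Taking the next 10 sequences:\n";
        seqan3::debug_stream << "ID: " << (*records.begin()).id() << '\n'; // prints first ID in batch
    }
}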

The example above iterates over the file by reading 10 records at a time. If fewer than 10 records remain, the last chunk simply contains the remaining records.
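
The chunking behaviour described above can be sketched on a plain std::vector. This is a simplified, eager illustration under the assumption that all ids are already in memory; the function name chunk is hypothetical and unrelated to the lazy seqan3::views::chunk adaptor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `items` into consecutive batches of at most
// `chunk_size` elements. Only the last batch may be shorter.
std::vector<std::vector<std::string>> chunk(std::vector<std::string> const & items, std::size_t const chunk_size)
{
    assert(chunk_size > 0); // a chunk size of zero would never make progress

    std::vector<std::vector<std::string>> batches;
    for (std::size_t begin = 0; begin < items.size(); begin += chunk_size)
    {
        std::size_t const end = std::min(begin + chunk_size, items.size());
        batches.emplace_back(items.begin() + begin, items.begin() + end);
    }
    return batches;
}
```

For example, 25 ids with a chunk size of 10 yield three batches holding 10, 10 and 5 ids.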

Applying a filter to a file

On some occasions you are only interested in sequence records that fulfill a certain criterion, e.g. having a minimum sequence length or a minimum average quality.

This recipe can be used to filter the sequences in your file by a custom criterion.

#include <filesystem>
#include <numeric> // std::accumulate
#include <ranges>

#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/input.hpp>

int main()
{
    // This assumes that a file named "my.fastq" exists in the current directory.
    seqan3::sequence_file_input fin{std::filesystem::current_path() / "my.fastq"};

    // std::views::filter takes a function object (a lambda in this case) as input that returns a boolean
    auto minimum_quality_filter = std::views::filter(
        [](auto const & rec)
        {
            auto qualities = rec.base_qualities()
                           | std::views::transform(
                                 [](auto quality)
                                 {
                                     return seqan3::to_phred(quality);
                                 });

            auto sum = std::accumulate(qualities.begin(), qualities.end(), 0);
            return sum / std::ranges::size(qualities) >= 40; // minimum average quality >= 40
        });

    for (auto & rec : fin | minimum_quality_filter)
    {
        seqan3::debug_stream << "ID: " << rec.id() << '\n';
    }
}
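
The criterion used by the filter can also be looked at in isolation. The following sketch computes the mean Phred score of a raw FASTQ quality string, assuming the common Phred+33 encoding; the names mean_phred and passes_quality are hypothetical and not part of SeqAn.

```cpp
#include <cassert>
#include <numeric>
#include <string>

// Mean Phred score of a FASTQ quality string, assuming the Phred+33 encoding.
double mean_phred(std::string const & qualities)
{
    assert(!qualities.empty()); // the mean of zero scores is undefined

    long const sum = std::accumulate(qualities.begin(),
                                     qualities.end(),
                                     0L,
                                     [](long const acc, char const c)
                                     {
                                         return acc + (c - 33);
                                     });
    return static_cast<double>(sum) / static_cast<double>(qualities.size());
}

// The predicate one would hand to a filter: keep reads whose mean quality is at least `minimum`.
bool passes_quality(std::string const & qualities, double const minimum)
{
    return mean_phred(qualities) >= minimum;
}
```

Note that the recipe divides two integers, so its average is rounded down before the comparison, whereas this sketch compares the exact mean as a double.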

Reading paired-end reads

In modern Next Generation Sequencing experiments you often have paired-end read data which is split into two files. The read pairs are identified by their identical name/id and position in the two files.

This recipe can be used to handle one pair of reads at a time.

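#include <filesystem>
#include <stdexcept> // std::runtime_error

#include <seqan3/io/sequence_file/input.hpp>
#include <seqan3/utility/views/zip.hpp>

int main()
{
    // for simplicity we take the same file
    // This assumes that a file named "my.fastq" exists in the current directory.
    seqan3::sequence_file_input fin1{std::filesystem::current_path() / "my.fastq"};
    seqan3::sequence_file_input fin2{std::filesystem::current_path() / "my.fastq"};

    for (auto && [rec1, rec2] : seqan3::views::zip(fin1, fin2)) // && is important!
    {                                                           // because seqan3::views::zip returns temporaries
        if (rec1.id() != rec2.id())
            throw std::runtime_error("Your pairs don't match.");
    }
}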
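
A zip over two ranges ends as soon as the shorter range is exhausted, so two files with a different number of reads would pass the id check above unnoticed. The sketch below shows the same lockstep comparison on two plain lists of ids, with an additional length check. It uses only the standard library, and the names count_matching_pairs and example_usage are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: compare two id lists position by position and
// return the number of pairs. Throws if the lists differ in length or
// if two ids at the same position differ.
std::size_t count_matching_pairs(std::vector<std::string> const & ids1, std::vector<std::string> const & ids2)
{
    if (ids1.size() != ids2.size())
        throw std::runtime_error("The two files contain a different number of reads.");

    for (std::size_t i = 0; i < ids1.size(); ++i)
        if (ids1[i] != ids2[i])
            throw std::runtime_error("Your pairs don't match.");

    return ids1.size();
}

// Usage example.
void example_usage()
{
    std::vector<std::string> const forward{"read1", "read2"};
    std::vector<std::string> const reverse{"read1", "read2"};
    assert(count_matching_pairs(forward, reverse) == 2);
}
```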

Storing records in a std::vector

This recipe creates a small program that reads in a FASTA file and stores all the records in a std::vector.

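#include <filesystem>
#include <iterator> // std::back_inserter
#include <ranges>   // std::ranges::copy
#include <vector>

#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/input.hpp>

int main()
{
    // This assumes that a file named "my.fasta" exists in the current directory.
    auto current_path = std::filesystem::current_path();
    seqan3::sequence_file_input fin{current_path / "my.fasta"};

    using record_type = decltype(fin)::record_type;
    std::vector<record_type> records{};

    // You can use a for loop:
    for (auto & record : fin)
    {
        records.push_back(std::move(record));
    }

    // But you can also do this:
    // (use one of the two approaches; the file can only be iterated over once)
    std::ranges::copy(fin, std::back_inserter(records));

    seqan3::debug_stream << records << '\n';
}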

Note that you can move the record out of the file if you want to store it somewhere without copying.

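#include <filesystem>
#include <utility> // std::move

#include <seqan3/io/sequence_file/input.hpp>

int main()
{
    // This assumes that a file named "my.fasta" exists in the current directory.
    seqan3::sequence_file_input fin{std::filesystem::current_path() / "my.fasta"};

    using record_type = typename decltype(fin)::record_type;
    record_type rec = std::move(*fin.begin()); // avoid copying
}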

Writing records

The easiest way to write to a sequence file is to use the seqan3::sequence_file_output::push_back() or seqan3::sequence_file_output::emplace_back() member functions. These work similarly to how they work on a std::vector.

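#include <sstream>
#include <string>
#include <vector>

#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/io/sequence_file/output.hpp>
#include <seqan3/io/sequence_file/record.hpp>

int main()
{
    using namespace seqan3::literals;

    // The record stores a sequence and an id, in that order.
    using types = seqan3::type_list<std::vector<seqan3::dna5>, std::string>;
    using fields = seqan3::fields<seqan3::field::seq, seqan3::field::id>;
    using sequence_record_type = seqan3::sequence_record<types, fields>;

    // Write FASTA to an in-memory stream.
    seqan3::sequence_file_output fout{std::ostringstream{}, seqan3::format_fasta{}};

    for (int i = 0; i < 5; ++i) // ...
    {
        std::string id{"test_id"};
        seqan3::dna5_vector sequence{"ACGT"_dna5};

        sequence_record_type record{std::move(sequence), std::move(id)};

        fout.push_back(record);
    }
}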

File conversion

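#include <filesystem>

#include <seqan3/io/sequence_file/input.hpp>
#include <seqan3/io/sequence_file/output.hpp>

int main()
{
    // This assumes that a file named "my.fastq" exists in the current directory.
    auto current_path = std::filesystem::current_path();

    // Assigning the input file to the output file converts FASTQ to FASTA.
    seqan3::sequence_file_output{current_path / "output.fasta"} =
        seqan3::sequence_file_input{current_path / "my.fastq"};
}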

Define a custom scoring scheme

#include <seqan3/alignment/scoring/aminoacid_scoring_scheme.hpp>
#include <seqan3/alignment/scoring/nucleotide_scoring_scheme.hpp>
#include <seqan3/alphabet/aminoacid/aa27.hpp>
#include <seqan3/alphabet/nucleotide/dna4.hpp>

int main()
{
    using namespace seqan3::literals;

    // Define a simple scoring scheme with match and mismatch cost and get the score.
    seqan3::nucleotide_scoring_scheme nc_scheme{seqan3::match_score{4}, seqan3::mismatch_score{-5}};
    auto sc_nc = nc_scheme.score('A'_dna4, 'C'_dna4); // sc_nc == -5.

    // Define an amino acid similarity matrix and get the score.
    seqan3::aminoacid_scoring_scheme aa_scheme{};
    aa_scheme.set_similarity_matrix(seqan3::aminoacid_similarity_matrix::blosum30);
    auto sc_aa = aa_scheme.score('M'_aa27, 'K'_aa27); // sc_aa == 2.
}
Attention
SeqAn's alignment algorithm computes the maximal similarity score, thus the match score must be set to a positive value and the scores for mismatch and gap must be negative in order to maximize over the matching letters.

Calculate edit distance for a set of sequences

This recipe can be used to calculate the edit distance for all six pairwise combinations of four sequences. The configuration allows at most 7 errors, and the filter keeps only the alignments with 6 or fewer errors.

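#include <ranges>
#include <utility>
#include <vector>

#include <seqan3/alignment/configuration/align_config_edit.hpp>
#include <seqan3/alignment/configuration/align_config_method.hpp>
#include <seqan3/alignment/configuration/align_config_min_score.hpp>
#include <seqan3/alignment/configuration/align_config_output.hpp>
#include <seqan3/alignment/pairwise/align_pairwise.hpp>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/utility/views/pairwise_combine.hpp>

int main()
{
    using namespace seqan3::literals;

    std::vector vec{"ACGTGACTGACT"_dna4, "ACGAAGACCGAT"_dna4, "ACGTGACTGACT"_dna4, "AGGTACGAGCGACACT"_dna4};

    // Configure the alignment kernel.
    // A minimal score of -7 corresponds to at most 7 errors.
    auto config = seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme
                | seqan3::align_cfg::min_score{-7} | seqan3::align_cfg::output_score{};

    auto alignment_results = seqan3::align_pairwise(seqan3::views::pairwise_combine(vec), config);

    // Keep only the alignments with 6 or fewer errors.
    auto filter_v = std::views::filter(
        [](auto && res)
        {
            return res.score() >= -6;
        });

    for (auto const & result : alignment_results | filter_v)
    {
        seqan3::debug_stream << "Score: " << result.score() << '\n';
    }
}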
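
To see where the negative scores in this recipe come from, the following sketch computes the edit (Levenshtein) distance with the textbook dynamic-programming recurrence and negates it. It is a plain standard-library illustration, not SeqAn's (much faster) implementation; the names edit_distance, edit_score and example_usage are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Classic dynamic-programming edit distance with unit costs, using two rows.
std::size_t edit_distance(std::string const & a, std::string const & b)
{
    std::vector<std::size_t> previous(b.size() + 1);
    std::vector<std::size_t> current(b.size() + 1);
    std::iota(previous.begin(), previous.end(), std::size_t{0}); // row for the empty prefix of `a`

    for (std::size_t i = 1; i <= a.size(); ++i)
    {
        current[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j)
        {
            std::size_t const substitution = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            std::size_t const deletion = previous[j] + 1;
            std::size_t const insertion = current[j - 1] + 1;
            current[j] = std::min({substitution, deletion, insertion});
        }
        std::swap(previous, current);
    }
    return previous[b.size()];
}

// With an edit scheme, the alignment score is the negated number of errors.
int edit_score(std::string const & a, std::string const & b)
{
    return -static_cast<int>(edit_distance(a, b));
}

// Usage example.
void example_usage()
{
    assert(edit_distance("kitten", "sitting") == 3);
    assert(edit_score("kitten", "sitting") >= -6);         // would pass the filter of the recipe
    assert(edit_score("ACGTGACTGACT", "ACGTGACTGACT") == 0); // identical sequences have no errors
}
```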

Searching for matches

This recipe can be used to search for all occurrences of a substring and print the number of hits and their positions in ascending order.

#include <vector>

#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/search/fm_index/fm_index.hpp>
#include <seqan3/search/search.hpp>

using namespace seqan3::literals;

void run_text_single()
{
    seqan3::dna4_vector text{
        "CGCTGTCTGAAGGATGAGTGTCAGCCAGTGTAACCCGATGAGCTACCCAGTAGTCGAACTGGGCCAGACAACCCGGCGCTAATGCACTCA"_dna4};
    seqan3::fm_index index{text};

    seqan3::debug_stream << "===== Running on a single text =====\n"
                         << "The following hits were found:\n";

    for (auto && result : search("GCT"_dna4, index))
        seqan3::debug_stream << result << '\n';
}

void run_text_collection()
{
    std::vector<seqan3::dna4_vector> text{"CGCTGTCTGAAGGATGAGTGTCAGCCAGTGTA"_dna4,
                                          "ACCCGATGAGCTACCCAGTAGTCGAACTG"_dna4,
                                          "GGCCAGACAACCCGGCGCTAATGCACTCA"_dna4};
    seqan3::fm_index index{text};

    seqan3::debug_stream << "===== Running on a text collection =====\n"
                         << "The following hits were found:\n";

    for (auto && result : search("GCT"_dna4, index))
        seqan3::debug_stream << result << '\n';
}

int main()
{
    run_text_single();
    run_text_collection();
}

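If you want to allow errors in your query, you need to configure the approximate search with the following search configuration objects: seqan3::search_cfg::max_error_total, seqan3::search_cfg::max_error_substitution, seqan3::search_cfg::max_error_insertion and seqan3::search_cfg::max_error_deletion.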

To search for either 1 insertion or 1 deletion you can use the seqan3::search_cfg::error_count:

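#include <string>

#include <seqan3/core/debug_stream.hpp>
#include <seqan3/search/configuration/max_error.hpp>
#include <seqan3/search/fm_index/fm_index.hpp>
#include <seqan3/search/search.hpp>

using namespace std::string_literals;

int main()
{
    std::string text{"Garfield the fat cat without a hat."};
    seqan3::fm_index index{text};

    // Allow one error in total, which may be an insertion or a deletion but not a substitution.
    seqan3::configuration const cfg = seqan3::search_cfg::max_error_total{seqan3::search_cfg::error_count{1}}
                                    | seqan3::search_cfg::max_error_substitution{seqan3::search_cfg::error_count{0}}
                                    | seqan3::search_cfg::max_error_insertion{seqan3::search_cfg::error_count{1}}
                                    | seqan3::search_cfg::max_error_deletion{seqan3::search_cfg::error_count{1}};

    seqan3::debug_stream << search("cat"s, index, cfg) << '\n';
    // prints: [<query_id:0, reference_id:0, reference_pos:14>,
    //          <query_id:0, reference_id:0, reference_pos:17>,
    //          <query_id:0, reference_id:0, reference_pos:18>,
    //          <query_id:0, reference_id:0, reference_pos:32>]
}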

Reading the CIGAR information from a SAM file and constructing an alignment

This recipe can be used to:

  1. Read in a FASTA file with the reference and a SAM file with the alignment
  2. Filter the alignment records and only take those with a mapping quality >= 30.
  3. For the resulting alignments, print which read was mapped against which reference id and the number of seqan3::gap characters involved in the alignment (either in the aligned reference or in the read sequence).

#include <algorithm> // std::ranges::count
#include <filesystem>
#include <optional>
#include <ranges>
#include <string>
#include <vector>

#include <seqan3/alignment/cigar_conversion/alignment_from_cigar.hpp>
#include <seqan3/alphabet/gap/gap.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sam_file/input.hpp>
#include <seqan3/io/sequence_file/input.hpp>

int main()
{
    // This assumes that "reference.fasta" and "mapping.sam" exist in the current directory.
    auto current_path = std::filesystem::current_path();

    // read in reference information
    seqan3::sequence_file_input reference_file{current_path / "reference.fasta"};

    std::vector<std::string> reference_ids{};
    std::vector<seqan3::dna5_vector> reference_sequences{};

    for (auto && record : reference_file)
    {
        reference_ids.push_back(std::move(record.id()));
        reference_sequences.push_back(std::move(record.sequence()));
    }

    // filter out alignments
    seqan3::sam_file_input mapping_file{current_path / "mapping.sam", reference_ids, reference_sequences};

    auto mapq_filter = std::views::filter(
        [](auto & record)
        {
            return record.mapping_quality() >= 30;
        });

    for (auto & record : mapping_file | mapq_filter)
    {
        // Construct the alignment from the CIGAR information of the record.
        auto alignment = seqan3::alignment_from_cigar(record.cigar_sequence(),
                                                      reference_sequences[record.reference_id().value()],
                                                      record.reference_position().value(),
                                                      record.sequence());

        // as loop
        size_t sum_reference{};
        for (auto const & char_reference : std::get<0>(alignment))
            if (char_reference == seqan3::gap{})
                ++sum_reference;

        // or via std::ranges::count
        size_t sum_read = std::ranges::count(std::get<1>(alignment), seqan3::gap{});

        // The reference_id is ZERO based and an optional. -1 is represented by std::nullopt (= reference not known).
        std::optional reference_id = record.reference_id();

        seqan3::debug_stream << record.id() << " mapped against "
                             << (reference_id ? std::to_string(reference_id.value()) : "unknown reference") << " with "
                             << sum_read << " gaps in the read sequence and " << sum_reference
                             << " gaps in the reference sequence.\n";
    }
}
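
The idea behind turning a CIGAR string into an alignment can be sketched on plain strings. The code below is a simplified illustration and not how seqan3::alignment_from_cigar is implemented. The names expand_cigar and gap_count are hypothetical, the CIGAR is given as text instead of a std::vector<seqan3::cigar>, and only the operations M, =, X, I, D, S and H are handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: expand a CIGAR string into two gapped rows (reference row, read row).
std::pair<std::string, std::string> expand_cigar(std::string const & cigar,
                                                 std::string const & reference,
                                                 std::size_t ref_pos, // zero-based start on the reference
                                                 std::string const & read)
{
    std::string ref_row, read_row;
    std::size_t read_pos = 0;
    std::size_t count = 0;

    for (char const c : cigar)
    {
        if (std::isdigit(static_cast<unsigned char>(c))) // accumulate the operation length
        {
            count = count * 10 + static_cast<std::size_t>(c - '0');
            continue;
        }

        switch (c)
        {
        case 'M': case '=': case 'X': // (mis)match: consumes reference and read
            assert(ref_pos + count <= reference.size() && read_pos + count <= read.size());
            ref_row += reference.substr(ref_pos, count);
            read_row += read.substr(read_pos, count);
            ref_pos += count;
            read_pos += count;
            break;
        case 'I': // insertion: gaps in the reference row
            assert(read_pos + count <= read.size());
            ref_row.append(count, '-');
            read_row += read.substr(read_pos, count);
            read_pos += count;
            break;
        case 'D': // deletion: gaps in the read row
            assert(ref_pos + count <= reference.size());
            ref_row += reference.substr(ref_pos, count);
            read_row.append(count, '-');
            ref_pos += count;
            break;
        case 'S': // soft clip: read bases that are not part of the alignment
            read_pos += count;
            break;
        case 'H': // hard clip: the bases are not present in the read at all
            break;
        default:
            throw std::invalid_argument("unsupported CIGAR operation");
        }
        count = 0;
    }
    return {ref_row, read_row};
}

// Number of gap characters in one row of the alignment.
std::size_t gap_count(std::string const & row)
{
    return static_cast<std::size_t>(std::count(row.begin(), row.end(), '-'));
}
```

For the reference "ACGTACGTAC", start position 2, read "GTTTACT" and CIGAR "2M2I2M1D1M", the rows are "GT--ACGT" and "GTTTAC-T", so there are two gaps in the reference row and one gap in the read row.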

Map reads and write output to SAM file

For a full recipe on creating your own read mapper, see the very end of the tutorial "Implementing your own read mapper with SeqAn".

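// Fragment: reference_storage_t and the required includes are defined in the read mapper tutorial.
void map_reads(std::filesystem::path const & query_path,
               std::filesystem::path const & index_path,
               std::filesystem::path const & sam_path,
               reference_storage_t & storage,
               uint8_t const errors)
{
    // we need the alphabet and text layout before loading
    seqan3::bi_fm_index<seqan3::dna5, seqan3::text_layout::collection> index;
    {
        std::ifstream is{index_path, std::ios::binary};
        cereal::BinaryInputArchive iarchive{is};
        iarchive(index);
    }

    seqan3::sequence_file_input query_file_in{query_path};

    seqan3::sam_file_output sam_out{sam_path,
                                    seqan3::fields<seqan3::field::seq,
                                                   seqan3::field::id,
                                                   seqan3::field::ref_id,
                                                   seqan3::field::ref_offset,
                                                   seqan3::field::cigar,
                                                   seqan3::field::qual,
                                                   seqan3::field::mapq>{}};

    seqan3::configuration const search_config =
        seqan3::search_cfg::max_error_total{seqan3::search_cfg::error_count{errors}}
        | seqan3::search_cfg::hit_all_best{};

    seqan3::configuration const align_config =
        seqan3::align_cfg::method_global{seqan3::align_cfg::free_end_gaps_sequence1_leading{true},
                                         seqan3::align_cfg::free_end_gaps_sequence2_leading{false},
                                         seqan3::align_cfg::free_end_gaps_sequence1_trailing{true},
                                         seqan3::align_cfg::free_end_gaps_sequence2_trailing{false}}
        | seqan3::align_cfg::edit_scheme | seqan3::align_cfg::output_alignment{}
        | seqan3::align_cfg::output_begin_position{} | seqan3::align_cfg::output_score{};

    for (auto && record : query_file_in)
    {
        auto & query = record.sequence();
        for (auto && result : search(query, index, search_config))
        {
            size_t start = result.reference_begin_position() ? result.reference_begin_position() - 1 : 0;
            std::span text_view{std::data(storage.seqs[result.reference_id()]) + start, query.size() + 1};

            for (auto && alignment : seqan3::align_pairwise(std::tie(text_view, query), align_config))
            {
                auto cigar = seqan3::cigar_from_alignment(alignment.alignment());
                size_t ref_offset = alignment.sequence1_begin_position() + 2 + start;
                size_t map_qual = 60u + alignment.score();

                // The arguments follow the order of the fields given to sam_out.
                sam_out.emplace_back(query,
                                     record.id(),
                                     storage.ids[result.reference_id()],
                                     ref_offset,
                                     cigar,
                                     record.base_qualities(),
                                     map_qual);
            }
        }
    }
}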

Constructing a basic argument parser

#include <filesystem>
#include <iostream> // std::cerr

#include <seqan3/argument_parser/all.hpp>
#include <seqan3/core/debug_stream.hpp>

void run_program(std::filesystem::path const & reference_path, std::filesystem::path const & index_path)
{
    seqan3::debug_stream << "reference_file_path: " << reference_path << '\n';
    seqan3::debug_stream << "index_path " << index_path << '\n';
}

struct cmd_arguments
{
    std::filesystem::path reference_path{};
    std::filesystem::path index_path{"out.index"};
};

void initialise_argument_parser(seqan3::argument_parser & parser, cmd_arguments & args)
{
    parser.info.author = "E. coli";
    parser.info.short_description = "Creates an index over a reference.";
    parser.info.version = "1.0.0";

    // The fifth argument is the option specification, the sixth the validator.
    parser.add_option(args.reference_path,
                      'r',
                      "reference",
                      "The path to the reference.",
                      seqan3::option_spec::required,
                      seqan3::input_file_validator{{"fa", "fasta"}});
    parser.add_option(args.index_path,
                      'o',
                      "output",
                      "The output index file path.",
                      seqan3::option_spec::standard,
                      seqan3::output_file_validator{seqan3::output_file_open_options::create_new, {"index"}});
}

int main(int argc, char const ** argv)
{
    seqan3::argument_parser parser("Indexer", argc, argv);
    cmd_arguments args{};

    initialise_argument_parser(parser, args);

    try
    {
        parser.parse();
    }
    catch (seqan3::argument_parser_error const & ext)
    {
        std::cerr << "[PARSER ERROR] " << ext.what() << '\n';
        return -1;
    }

    run_program(args.reference_path, args.index_path);

    return 0;
}

Constructing a subcommand argument parser

#include <iostream> // std::cout
#include <string>
#include <string_view>
#include <vector>

#include <seqan3/argument_parser/all.hpp>
#include <seqan3/core/debug_stream.hpp>

// =====================================================================================================================
// pull
// =====================================================================================================================

struct pull_arguments
{
    std::string repository{};
    std::string branch{};
    bool progress{false};
};

int run_git_pull(seqan3::argument_parser & parser)
{
    pull_arguments args{};

    parser.add_positional_option(args.repository, "The repository name to pull from.");
    parser.add_positional_option(args.branch, "The branch name to pull from.");

    try
    {
        parser.parse();
    }
    catch (seqan3::argument_parser_error const & ext)
    {
        seqan3::debug_stream << "[Error git pull] " << ext.what() << "\n";
        return -1;
    }

    seqan3::debug_stream << "Git pull with repository " << args.repository << " and branch " << args.branch << '\n';

    return 0;
}

// =====================================================================================================================
// push
// =====================================================================================================================

struct push_arguments
{
    std::string repository{};
    std::vector<std::string> branches{};
    bool push_all{false};
};

int run_git_push(seqan3::argument_parser & parser)
{
    push_arguments args{};

    parser.add_positional_option(args.repository, "The repository name to push to.");
    parser.add_positional_option(args.branches, "The branch names to push (if none are given, push current).");

    try
    {
        parser.parse();
    }
    catch (seqan3::argument_parser_error const & ext)
    {
        seqan3::debug_stream << "[Error git push] " << ext.what() << "\n";
        return -1;
    }

    seqan3::debug_stream << "Git push with repository " << args.repository << " and branches " << args.branches << '\n';

    return 0;
}

// =====================================================================================================================
// main
// =====================================================================================================================

int main(int argc, char const ** argv)
{
    seqan3::argument_parser top_level_parser{"mygit", argc, argv, seqan3::update_notifications::on, {"push", "pull"}};

    // Add information and flags, but no (positional) options to your top-level parser.
    // Because of ambiguity, we do not allow any (positional) options for the top-level parser.
    top_level_parser.info.description.push_back("You can push or pull from a remote repository.");
    // A flag's default value must be false.
    bool flag{false};
    top_level_parser.add_flag(flag, 'f', "flag", "some flag");

    try
    {
        top_level_parser.parse(); // trigger command line parsing
    }
    catch (seqan3::argument_parser_error const & ext) // catch user errors
    {
        seqan3::debug_stream << "[Error] " << ext.what() << "\n"; // customise your error message
        return -1;
    }

    seqan3::argument_parser & sub_parser = top_level_parser.get_sub_parser(); // hold a reference to the sub_parser

    std::cout << "Proceed to sub parser.\n";

    if (sub_parser.info.app_name == std::string_view{"mygit-pull"})
        return run_git_pull(sub_parser);
    else if (sub_parser.info.app_name == std::string_view{"mygit-push"})
        return run_git_push(sub_parser);
    else
        std::cout << "Unhandled subparser named " << sub_parser.info.app_name << '\n';
    // Note: Arriving in this else branch means you did not handle all sub_parsers in the if branches above.

    return 0;
}

Serialise data structures with cereal

#include <filesystem>
#include <fstream>
#include <vector>

#include <seqan3/core/debug_stream.hpp>
#include <seqan3/test/tmp_directory.hpp>

#include <cereal/archives/binary.hpp> // includes the cereal::BinaryInputArchive and cereal::BinaryOutputArchive
#include <cereal/types/vector.hpp>    // includes cerealisation support for std::vector

// Written for std::vector, other types also work.
void load(std::vector<int16_t> & data, std::filesystem::path const & tmp_file)
{
    std::ifstream is(tmp_file, std::ios::binary); // Where input can be found.
    cereal::BinaryInputArchive archive(is);       // Create an input archive from the input stream.
    archive(data);                                // Load data.
}

// Written for std::vector, other types also work.
void store(std::vector<int16_t> const & data, std::filesystem::path const & tmp_file)
{
    std::ofstream os(tmp_file, std::ios::binary); // Where output should be stored.
    cereal::BinaryOutputArchive archive(os);      // Create an output archive from the output stream.
    archive(data);                                // Store data.
}

int main()
{
    // The following example is for a std::vector but any seqan3 data structure that is documented as serialisable
    // could be used, e.g. seqan3::fm_index.
    seqan3::test::tmp_directory tmp{};
    auto tmp_file = tmp.path() / "data.out"; // This is a temporary file name, use any other filename.

    std::vector<int16_t> vec{1, 2, 3, 4};
    store(vec, tmp_file); // Calls store on a std::vector.

    // This vector is needed to load the information into it.
    std::vector<int16_t> vec2;
    load(vec2, tmp_file); // Calls load on a std::vector.

    seqan3::debug_stream << vec2 << '\n'; // Prints [1,2,3,4].

    return 0;
}

Converting a range of an alphabet

#include <ranges>
#include <utility> // std::forward
#include <vector>

#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/alphabet/quality/aliases.hpp>
#include <seqan3/alphabet/quality/phred42.hpp>
#include <seqan3/alphabet/quality/qualified.hpp>
#include <seqan3/core/debug_stream.hpp>

using seqan3::operator""_dna4;
using seqan3::operator""_dna5;
using seqan3::operator""_phred42;

int main()
{
    // A vector of combined sequence and quality information.
    std::vector<seqan3::dna4q> sequence1{{'A'_dna4, '!'_phred42},
                                         {'C'_dna4, 'A'_phred42},
                                         {'G'_dna4, '6'_phred42},
                                         {'T'_dna4, '&'_phred42}};

    // A vector of dna5.
    std::vector<seqan3::dna5> sequence2{"AGNCGTNNCAN"_dna5};

    // Convert dna4q to dna4.
    // Since `sequence1` is an lvalue, we capture `in` via const &. When unsure, use the general case below.
    auto view1 = sequence1
               | std::views::transform(
                     [](auto const & in)
                     {
                         return static_cast<seqan3::dna4>(in);
                     });
    seqan3::debug_stream << view1 << '\n'; // ACGT

    // Convert dna5 to dna4.
    // General case: Perfect forward.
    auto view2 = sequence2 | std::views::take(8)
               | std::views::transform(
                     [](auto && in)
                     {
                         return static_cast<seqan3::dna4>(std::forward<decltype(in)>(in));
                     });
    seqan3::debug_stream << view2 << '\n'; // AGACGTAA

    return 0;
}

A custom dna4 alphabet that converts all unknown characters to A

When assigning from char or converting from a larger nucleotide alphabet to a smaller one, information can be lost because the smaller alphabet cannot represent every base. When converting to seqan3::dna5 or seqan3::rna5, non-canonical bases (letters other than A, C, G, T, U) are converted to 'N' to preserve ambiguity at that position. For seqan3::dna4 and seqan3::rna4 there is no letter 'N' to represent ambiguity, so the conversion from char for IUPAC characters tries to choose the best fitting alternative (see seqan3::dna4 for more details).

If you would rather convert every unknown character to A, you can create your own alphabet with its own char conversion table:

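#include <array>
#include <seqan3/alphabet/nucleotide/nucleotide_base.hpp>
#include <seqan3/core/debug_stream.hpp>
// clang-format off
// We inherit from seqan3::nucleotide_base s.t. we do not need to implement the full nucleotide interface
// but it is sufficient to define `rank_to_char`, `char_to_rank`, and `rank_complement`.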
class my_dna4 : public seqan3::nucleotide_base<my_dna4, 4 /*alphabet size is 4*/>
{
public:
using nucleotide_base<my_dna4, 4>::nucleotide_base; // Use constructors of the base class.
private:
// Returns the character representation of rank. This is where rank conversion for to_char() is handled!
static constexpr char_type rank_to_char(rank_type const rank)
{
return rank_to_char_table[rank];
}
// Returns the rank representation of character. This is where char conversion for assign_char() is handled!
static constexpr rank_type char_to_rank(char_type const chr)
{
using index_t = std::make_unsigned_t<char_type>; // avoid negative indices for signed char
return char_to_rank_table[static_cast<index_t>(chr)];
}
// Returns the complement by rank. This is where complement is handled and with this, my_dna4 models
// seqan3::nucleotide_alphabet.
static constexpr rank_type rank_complement(rank_type const rank)
{
return rank_complement_table[rank];
}
private:
// === lookup-table implementation detail ===
// Value to char conversion table.
static constexpr char_type rank_to_char_table[alphabet_size]{'A', 'C', 'G', 'T'}; // rank 0,1,2,3
// Char-to-value conversion table.
static constexpr std::array<rank_type, 256> char_to_rank_table
{
[] () constexpr
{
// By default, everything has rank 0 which equals `A`.
std::array<rank_type, 256> conversion_table{};
conversion_table['C'] = conversion_table['c'] = 1;
conversion_table['G'] = conversion_table['g'] = 2;
conversion_table['T'] = conversion_table['t'] = 3;
conversion_table['U'] = conversion_table['T']; // set U equal to T
conversion_table['u'] = conversion_table['t']; // set u equal to t
return conversion_table;
}()
};
// The rank complement table.
static constexpr rank_type rank_complement_table[alphabet_size]
{
3, // T is complement of 'A'_dna4
2, // G is complement of 'C'_dna4
1, // C is complement of 'G'_dna4
0 // A is complement of 'T'_dna4
};
friend nucleotide_base<my_dna4, 4>; // Grant seqan3::nucleotide_base access to private/protected members.
friend nucleotide_base<my_dna4, 4>::base_t; // Grant seqan3::alphabet_base access to private/protected members.
};
// clang-format on
// Defines the `_my_dna4` *char literal* so you can write `'C'_my_dna4` instead of `my_dna4{}.assign_char('C')`.
constexpr my_dna4 operator""_my_dna4(char const c) noexcept
{
return my_dna4{}.assign_char(c);
}
int main()
{
my_dna4 my_letter{'C'_my_dna4};
my_letter.assign_char('S'); // Characters other than A,C,G,T are implicitly converted to `A`.
seqan3::debug_stream << my_letter << "\n"; // "A";
seqan3::debug_stream << seqan3::complement(my_letter) << "\n"; // "T";
}

If you are interested in custom alphabets, also take a look at our tutorial How to write your own alphabet.
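
The two fallback policies described at the start of this recipe (unknown characters become 'N' in a five-letter alphabet, or 'A' in the custom four-letter alphabet) can be illustrated without SeqAn. This is a minimal sketch using only the standard library. `make_table`, `to_rank4` and `to_rank5` are hypothetical names, and unlike the recipe, 'U' is not mapped to 'T' here, to keep the example short.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: builds a 256-entry char-to-rank table in which every
// character that is not listed in `letters` falls back to `fallback_rank`.
constexpr std::array<std::uint8_t, 256> make_table(char const * letters, std::uint8_t const fallback_rank)
{
    std::array<std::uint8_t, 256> table{};
    for (std::size_t i = 0; i < table.size(); ++i)
        table[i] = fallback_rank;

    for (std::uint8_t rank = 0; letters[rank] != '\0'; ++rank)
    {
        unsigned char const upper = static_cast<unsigned char>(letters[rank]);
        table[upper] = rank;
        table[upper | 0x20] = rank; // lower-case variant of the same letter
    }
    return table;
}

// Four-letter policy: unknown characters become rank 0 ('A').
inline constexpr auto four_letter_table = make_table("ACGT", 0);
// Five-letter policy: unknown characters become rank 3 ('N' in "ACGNT").
inline constexpr auto five_letter_table = make_table("ACGNT", 3);

constexpr std::uint8_t to_rank4(char const c)
{
    return four_letter_table[static_cast<unsigned char>(c)];
}

constexpr std::uint8_t to_rank5(char const c)
{
    return five_letter_table[static_cast<unsigned char>(c)];
}
```

With these tables, `to_rank4('S')` yields the rank of 'A', whereas `to_rank5('S')` yields the rank of 'N', so the ambiguity is only preserved in the larger alphabet.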

Controlling threads of (de-)compression streams

When reading or writing BGZF-compressed files (e.g. BAM files), (de-)compression is parallelised automatically. By default, 4 threads are used; set seqan3::contrib::bgzf_thread_count to change this:

#include <seqan3/io/all.hpp>
// The `bgzf_thread_count` is a variable that can only be changed during the runtime of a program.
// The following does not work, the value must be overwritten within a function:
// seqan3::contrib::bgzf_thread_count = 1u; // Does not work.
int main()
{
// Here, we change the number of threads to `1`.
// This is a global change and will affect every future bgzf (de-)compression.
// However, running (de-)compressions will not be affected.
// `bgzf_thread_count` may be overwritten multiple times during the runtime of a program, in which case
// the latest modification will determine the value.
seqan3::contrib::bgzf_thread_count = 1u;
// Read/Write compressed files.
// ...
return 0;
}

Auto vectorized dna4 complement

Our alphabet seqan3::dna4 cannot be easily auto-vectorized by the compiler.

See this discussion for more details.

You can add your own alphabet that is auto-vectorizable in some use cases. Here is an example for a dna4-like alphabet:

class simd_dna4 : public seqan3::nucleotide_base<simd_dna4, 256>
{
private:
using base_t = seqan3::nucleotide_base<simd_dna4, 256>;
friend base_t; // nucleotide_base
friend base_t::base_t; // alphabet_base
friend seqan3::rna4;
public:
constexpr simd_dna4() noexcept = default;
constexpr simd_dna4(simd_dna4 const &) noexcept = default;
constexpr simd_dna4(simd_dna4 &&) noexcept = default;
constexpr simd_dna4 & operator=(simd_dna4 const &) noexcept = default;
constexpr simd_dna4 & operator=(simd_dna4 &&) noexcept = default;
~simd_dna4() noexcept = default;
template <std::same_as<seqan3::rna4> t> // template parameter t to accept incomplete type
constexpr simd_dna4(t const r) noexcept
{
assign_rank(r.to_rank());
}
using base_t::assign_rank;
using base_t::base_t;
using base_t::to_rank;
static constexpr uint8_t alphabet_size = 4;
constexpr simd_dna4 & assign_char(char_type const c) noexcept
{
char_type const upper_case_char = c & 0b0101'1111;
rank_type rank = (upper_case_char == 'T') * 3 + (upper_case_char == 'G') * 2 + (upper_case_char == 'C');
return assign_rank(rank);
}
constexpr char_type to_char() const noexcept
{
rank_type const rank = to_rank();
switch (rank)
{
case 0u:
return 'A';
case 1u:
return 'C';
case 2u:
return 'G';
default:
return 'T';
}
}
constexpr simd_dna4 complement() const noexcept
{
rank_type rank{to_rank()};
rank ^= 0b11;
simd_dna4 ret{};
return ret.assign_rank(rank);
}
static constexpr bool char_is_valid(char_type const c) noexcept
{
char_type const upper_case_char = c & 0b0101'1111;
return (upper_case_char == 'A') || (upper_case_char == 'C') || (upper_case_char == 'G')
|| (upper_case_char == 'T');
}
};
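
The class above relies on two branch-free tricks: the rank is computed from comparisons instead of a table lookup, and the complement is a single XOR. The following standalone sketch isolates both using only the standard library. `encode`, `complement` and `complement_all` are hypothetical free functions, and whether a given compiler actually vectorises the loop depends on the compiler and its flags, which is not verified here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Branch-free char -> rank: A = 0, C = 1, G = 2, T = 3; anything else maps to 0.
constexpr std::uint8_t encode(char const c) noexcept
{
    char const upper = c & 0b0101'1111; // clears the lower-case bit of ASCII letters
    return static_cast<std::uint8_t>((upper == 'T') * 3 + (upper == 'G') * 2 + (upper == 'C'));
}

// The complement is a single XOR: 0 <-> 3 (A <-> T) and 1 <-> 2 (C <-> G).
constexpr std::uint8_t complement(std::uint8_t const rank) noexcept
{
    return static_cast<std::uint8_t>(rank ^ 0b11);
}

// A plain loop over contiguous memory without branches or table lookups in its body.
std::vector<std::uint8_t> complement_all(std::string const & sequence)
{
    std::vector<std::uint8_t> ranks(sequence.size());
    for (std::size_t i = 0; i < sequence.size(); ++i)
        ranks[i] = complement(encode(sequence[i]));
    return ranks;
}
```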

All SeqAn documentation snippets

The following lists all snippets that appear in our documentation. Search for keywords with Ctrl + F.

using namespace seqan3::literals;
int main()
{
// CIGAR string = 2M1D2M
std::vector<seqan3::cigar> cigar_vector{{2, 'M'_cigar_operation},
{1, 'D'_cigar_operation},
{2, 'M'_cigar_operation}};
uint32_t reference_start_position{0}; // The read is aligned at the start of the reference.
seqan3::dna5_vector reference = "ACTGATCGAGAGGATCTAGAGGAGATCGTAGGAC"_dna5;
seqan3::dna5_vector query = "ACGA"_dna5;
auto alignment = alignment_from_cigar(cigar_vector, reference, reference_start_position, query);
seqan3::debug_stream << alignment << '\n'; // prints (ACTGA,AC-GA)
}
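
To make the relation between a CIGAR string and the printed alignment explicit, here is a minimal standard-library sketch that expands a CIGAR into two gapped strings. `cigar_element` and `expand_cigar` are hypothetical names and not part of SeqAn; only the operations M, =, X, D, I and S are handled, everything else is skipped.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CIGAR element: a count and an operation character.
struct cigar_element
{
    std::size_t count;
    char operation;
};

// Builds the gapped (reference, query) pair described by `cigar`,
// starting at `reference_pos` in the reference.
std::pair<std::string, std::string> expand_cigar(std::vector<cigar_element> const & cigar,
                                                 std::string const & reference,
                                                 std::size_t reference_pos,
                                                 std::string const & query)
{
    std::string gapped_reference;
    std::string gapped_query;
    std::size_t query_pos = 0;

    for (auto const [count, operation] : cigar)
    {
        switch (operation)
        {
        case 'M':
        case '=':
        case 'X': // both sequences advance
            gapped_reference += reference.substr(reference_pos, count);
            gapped_query += query.substr(query_pos, count);
            reference_pos += count;
            query_pos += count;
            break;
        case 'D': // deletion: gap in the query
            gapped_reference += reference.substr(reference_pos, count);
            gapped_query.append(count, '-');
            reference_pos += count;
            break;
        case 'I': // insertion: gap in the reference
            gapped_reference.append(count, '-');
            gapped_query += query.substr(query_pos, count);
            query_pos += count;
            break;
        case 'S': // soft clipping: skip query bases, nothing is aligned
            query_pos += count;
            break;
        default: // H, P and N are ignored in this sketch
            break;
        }
    }
    return {gapped_reference, gapped_query};
}
```

For the CIGAR 2M1D2M, reference start 0 and query ACGA from the snippet above, this produces the pair ACTGA / AC-GA.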
using namespace seqan3::literals;
// Note: SAM fields must be separated by tabs, not spaces.
auto sam_file_raw = R"(@HD	VN:1.6
@SQ	SN:ref	LN:34
read1	41	ref	1	61	1S1M1D1M1I	ref	10	300	ACGT	!##$	AS:i:2	NM:i:7
read2	42	ref	2	62	1H7M1D1M1S2H	ref	10	300	AGGCTGNAG	!##$&'()*	xy:B:S,3,4,5
read3	43	ref	3	63	1S1M1P1M1I1M1I1D1M1S	ref	10	300	GGAGTATA	!!*+,-./
)";
int main()
{
// The reference sequence might be read from a different file.
seqan3::dna5_vector reference = "ACTGATCGAGAGGATCTAGAGGAGATCGTAGGAC"_dna5;
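// You will probably read the alignments from a file, e.g., like this:
// seqan3::sam_file_input fin{"test.sam"};
// Here, we read them from the raw string above (requires <sstream>).
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
for (auto && rec : fin)
{
auto alignment =
alignment_from_cigar(rec.cigar_sequence(), reference, rec.reference_position().value(), rec.sequence());
seqan3::debug_stream << alignment << '\n';
}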
// prints:
// (ACT-,C-GT)
// (CTGATCGAG,AGGCTGN-A)
// (T-G-A-TC,G-AGTA-T)
}
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector reference = "ATGGCGTAGAGCTTCCCCCCCCCCCCCCCCC"_dna5;
seqan3::dna5_vector read = "ATGCCCCGTTGCTT"_dna5; // length 14
// Align the full query against the first 14 bases of the reference.
seqan3::gap_decorator aligned_reference{reference | seqan3::views::slice(0, 14)};
seqan3::gap_decorator aligned_read{read};
// Insert gaps to represent the alignment:
seqan3::insert_gap(aligned_read, aligned_read.begin() + 11, 2);
seqan3::insert_gap(aligned_reference, aligned_reference.begin() + 4, 2);
seqan3::debug_stream << aligned_reference << '\n' << aligned_read << '\n';
// prints:
// ATGG--CGTAGAGCTT
// ATGCCCCGTTG--CTT
auto cigar_sequence = seqan3::cigar_from_alignment(std::tie(aligned_reference, aligned_read));
seqan3::debug_stream << cigar_sequence << '\n'; // prints [4M,2I,5M,2D,3M]
}
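
The reverse direction, deriving a CIGAR from two gapped rows, can be sketched with the standard library alone. `cigar_from_rows` is a hypothetical helper that returns the CIGAR as a plain string; a column with a gap in the reference is counted as an insertion, a column with a gap in the query as a deletion, and everything else as M. Clipping is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Derives a CIGAR string from two gapped rows of equal length ('-' marks a gap).
std::string cigar_from_rows(std::string const & gapped_reference, std::string const & gapped_query)
{
    assert(gapped_reference.size() == gapped_query.size());

    std::string cigar;
    char current = '\0'; // operation of the run that is currently being counted
    std::size_t run = 0; // length of that run

    auto flush = [&]()
    {
        if (run > 0)
            cigar += std::to_string(run) + current;
    };

    for (std::size_t i = 0; i < gapped_reference.size(); ++i)
    {
        char const operation = gapped_reference[i] == '-' ? 'I' : (gapped_query[i] == '-' ? 'D' : 'M');
        if (operation != current)
        {
            flush(); // emit the finished run before starting a new one
            current = operation;
            run = 0;
        }
        ++run;
    }
    flush();
    return cigar;
}
```

Applied to the two rows printed in the snippet above (ATGG--CGTAGAGCTT and ATGCCCCGTTG--CTT), the sketch returns 4M2I5M2D3M.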
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector reference = "ATGGCGTAGAGCTTCCCCCCCCCCCCCCCCC"_dna5;
seqan3::dna5_vector read = "ATGCCCCGTTGCTT"_dna5; // length 14
// Let's say, we want to ignore the last 2 bases of the query because the quality is low.
// We thus only align the first 12 bases, the last two will be soft-clipped bases in the CIGAR string.
seqan3::gap_decorator aligned_reference{reference | seqan3::views::slice(0, 12)};
seqan3::gap_decorator aligned_query{read | seqan3::views::slice(0, 12)};
// insert gaps
seqan3::insert_gap(aligned_reference, aligned_reference.begin() + 4, 2);
seqan3::insert_gap(aligned_query, aligned_query.begin() + 11, 2);
auto cigar_sequence =
seqan3::cigar_from_alignment(std::tie(aligned_reference, aligned_query),
{.hard_front = 1, .hard_back = 0, .soft_front = 0, .soft_back = 2});
seqan3::debug_stream << cigar_sequence << '\n'; // prints [1H,4M,2I,5M,2D,1M,2S]
}
int main()
{
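// A symmetric band around the main diagonal (all diagonal values below are illustrative).
seqan3::align_cfg::band_fixed_size band_cfg{seqan3::align_cfg::lower_diagonal{-4},
seqan3::align_cfg::upper_diagonal{4}};
// A band starting with the main diagonal shifted by 3 cells to the right.
seqan3::align_cfg::band_fixed_size band_cfg_right{seqan3::align_cfg::lower_diagonal{3},
seqan3::align_cfg::upper_diagonal{7}};
// A band starting with the main diagonal shifted by 3 cells down.
seqan3::align_cfg::band_fixed_size band_cfg_down{seqan3::align_cfg::lower_diagonal{-7},
seqan3::align_cfg::upper_diagonal{-3}};
// An invalid band configuration: the lower diagonal lies above the upper diagonal.
seqan3::align_cfg::band_fixed_size band_cfg_invalid{seqan3::align_cfg::lower_diagonal{3},
seqan3::align_cfg::upper_diagonal{-3}};
// Using this band as a configuration in seqan3::align_pairwise would cause the algorithm to throw an exception.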
}
int main()
{
// Computes semi global edit distance using fast-bit vector algorithm.
// Computes semi global edit distance using slower standard pairwise algorithm.
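// Computes global edit distance, accepting only a score of at least -3, i.e. at most 3 errors.
auto cfg_errors =
seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme | seqan3::align_cfg::min_score{-3};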
}
#include <iostream>
int main()
{
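// Configuration with linear gap costs, i.e. an open score of 0 (the extension score is illustrative).
seqan3::align_cfg::gap_cost_affine linear_cfg{seqan3::align_cfg::open_score{0},
seqan3::align_cfg::extension_score{-10}};
// Configuration with affine gap costs. Score for opening a gap during the alignment algorithm will be -11.
seqan3::align_cfg::gap_cost_affine affine_cfg{seqan3::align_cfg::open_score{-1},
seqan3::align_cfg::extension_score{-10}};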
// Accessing the members of the gap scheme
int open = affine_cfg.open_score;
int extension = affine_cfg.extension_score;
std::cout << open << '\n'; // -1
std::cout << extension << '\n'; // -10
}
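
The comment in the snippet above implies that the first gap symbol costs the open score plus one extension score (-1 + -10 = -11). Under that convention, the total score of a gap can be written as a one-line function. This is a small sketch with the hypothetical name `gap_score`, not a SeqAn function.

```cpp
#include <cstdint>

// Score of a single gap of `length` symbols: the open score is paid once,
// the extension score once per gap symbol (including the first one).
constexpr std::int32_t gap_score(std::int32_t const open,
                                 std::int32_t const extension,
                                 std::int32_t const length) noexcept
{
    return length == 0 ? 0 : open + length * extension;
}

static_assert(gap_score(-1, -10, 1) == -11); // affine: opening a gap
static_assert(gap_score(-1, -10, 3) == -31); // affine: gap of length 3
static_assert(gap_score(0, -10, 3) == -30);  // linear: open score of 0
```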
using namespace seqan3::literals;
int main()
{
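// configure a global alignment for DNA sequences (the match and mismatch scores are illustrative)
auto min_cfg = seqan3::align_cfg::method_global{}
| seqan3::align_cfg::scoring_scheme{
seqan3::nucleotide_scoring_scheme{seqan3::match_score{4}, seqan3::mismatch_score{-5}}};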
auto seq1 = "TCGT"_dna4;
auto seq2 = "ACGA"_dna4;
for (auto res : seqan3::align_pairwise(std::tie(seq1, seq2), min_cfg))
seqan3::debug_stream << res.score() << '\n'; // print out the alignment score
}
using namespace seqan3::literals;
int main()
{
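// configure a local alignment for DNA sequences (the match and mismatch scores are illustrative)
auto min_cfg = seqan3::align_cfg::method_local{}
| seqan3::align_cfg::scoring_scheme{
seqan3::nucleotide_scoring_scheme{seqan3::match_score{4}, seqan3::mismatch_score{-5}}};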
auto seq1 = "TCGT"_dna4;
auto seq2 = "ACGA"_dna4;
for (auto res : seqan3::align_pairwise(std::tie(seq1, seq2), min_cfg))
seqan3::debug_stream << res.score() << '\n'; // print out the alignment score
}
int main()
{
// Allow a minimal score of -5, i.e. at most 5 edit operations.
seqan3::configuration config = seqan3::align_cfg::min_score{-5};
auto min_score = std::get<seqan3::align_cfg::min_score>(config);
min_score.score = -5;
}
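
The relation between a minimal score and the number of edit operations can be made concrete with a plain edit distance computation. This is a standard-library sketch under stated assumptions: `edit_score` is a hypothetical function, the score is the negated edit distance, and the threshold is applied only after the full matrix has been computed, whereas a library implementation may stop early.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Returns the negated edit distance of `first` and `second` (0 is a perfect match),
// or std::nullopt if that score is below `min_score`.
std::optional<int> edit_score(std::string const & first, std::string const & second, int const min_score)
{
    // Only two rows of the dynamic programming matrix are kept.
    std::vector<int> previous(second.size() + 1);
    std::vector<int> current(second.size() + 1);

    for (std::size_t j = 0; j <= second.size(); ++j)
        previous[j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= first.size(); ++i)
    {
        current[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= second.size(); ++j)
        {
            int const substitution = previous[j - 1] + (first[i - 1] != second[j - 1]);
            current[j] = std::min({substitution, previous[j] + 1, current[j - 1] + 1});
        }
        std::swap(previous, current);
    }

    int const score = -previous[second.size()];
    if (score < min_score)
        return std::nullopt;
    return score;
}
```

With a minimal score of -5, the pair TCGT / ACGA (two substitutions) is reported with score -2; with a minimal score of -1 it is rejected.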
int main()
{
seqan3::align_cfg::on_result cfg{[](auto && result)
{
seqan3::debug_stream << result << '\n';
}};
}
int main()
{
// Compute only the alignment.
seqan3::configuration cfg = seqan3::align_cfg::output_alignment{};
}
int main()
{
// Compute only the begin position of the aligned sequences.
seqan3::configuration cfg = seqan3::align_cfg::output_begin_position{};
}
int main()
{
// Compute only the end position of the aligned sequences.
seqan3::configuration cfg = seqan3::align_cfg::output_end_position{};
}
#include <vector>
int main()
{
using namespace seqan3::literals;
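// Basic alignment algorithm configuration.
auto config = seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme;
std::pair p{"ACGTAGC"_dna4, "AGTACGACG"_dna4};
// Compute only the score:
for (auto res : seqan3::align_pairwise(p, config | seqan3::align_cfg::output_score{}))
seqan3::debug_stream << res << "\n"; // prints: {score: -4}
// Compute only the alignment:
for (auto res : seqan3::align_pairwise(p, config | seqan3::align_cfg::output_alignment{}))
seqan3::debug_stream << res << "\n"; // prints: {alignment: (ACGTA-G-C-,A-GTACGACG)}
// Compute the score and the alignment:
for (auto res :
seqan3::align_pairwise(p, config | seqan3::align_cfg::output_score{} | seqan3::align_cfg::output_alignment{}))
seqan3::debug_stream << res << "\n"; // prints: {score: -4, alignment: (ACGTA-G-C-,A-GTACGACG)}
// By default compute everything:
for (auto res : seqan3::align_pairwise(p, config))
seqan3::debug_stream
<< res << "\n"; // prints {id: 0, score: -4, begin: (0,0), end: (7,9) alignment: (ACGTA-G-C-,A-GTACGACG)}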
}
int main()
{
// Output only the id of the first sequence.
seqan3::configuration cfg = seqan3::align_cfg::output_sequence1_id{};
}
int main()
{
// Output only the id of the second sequence.
seqan3::configuration cfg = seqan3::align_cfg::output_sequence2_id{};
}
#include <thread>
int main()
{
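// Enables parallel computation with two threads.
seqan3::align_cfg::parallel cfg_2{2};
// Enables parallel computation with the number of concurrent threads supported by the current architecture.
seqan3::align_cfg::parallel cfg_n{std::thread::hardware_concurrency()};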
}
int main()
{
// Select the score type used by the alignment.
seqan3::configuration cfg1 =
seqan3::align_cfg::score_type<int16_t>{}; // Now the alignment computes 16 bit integers.
seqan3::configuration cfg2 = seqan3::align_cfg::score_type<float>{}; // Now the alignment computes float scores.
}
int main()
{
// Enable SIMD vectorised alignment computation.
seqan3::configuration cfg = seqan3::align_cfg::vectorised{};
}
#include <span>
#include <vector>
class my_matrix : public seqan3::detail::alignment_matrix_column_major_range_base<my_matrix>
{
public:
// Alias the base class
using base_t = seqan3::detail::alignment_matrix_column_major_range_base<my_matrix>;
friend base_t;
// Inherit the alignment column type defined in the base class. This type is returned in initialise_column.
using typename base_t::alignment_column_type;
// The following types are required by the base type since they cannot be inferred within the base.
using column_data_view_type = std::span<int>; //This type is the underlying view over the actual memory location.
using value_type = int; // The actual value type.
using reference = int &; // The actual reference type.
my_matrix() = default;
my_matrix(my_matrix const &) = default;
my_matrix(my_matrix &&) = default;
my_matrix & operator=(my_matrix const &) = default;
my_matrix & operator=(my_matrix &&) = default;
~my_matrix() = default;
my_matrix(size_t const num_rows, size_t const num_cols) : num_rows{num_rows}, num_cols{num_cols}
{
data.resize(num_rows * num_cols);
}
protected:
std::vector<int> data{}; // Linear storage of all cells, one column after the other.
size_t num_rows{};
size_t num_cols{};
//Required for the base class. Initialises the current column given the column index.
alignment_column_type initialise_column(size_t const column_index) noexcept
{
return alignment_column_type{*this,
column_data_view_type{std::addressof(data[num_rows * column_index]), num_rows}};
}
//Required for the base class. Initialises the proxy for the current iterator over the current column.
template <std::random_access_iterator iter_t>
constexpr reference make_proxy(iter_t iter) noexcept
{
return *iter;
}
};
int main()
{
my_matrix matrix{3, 5};
// Fill the matrix with consecutive values, column by column.
int val = 0;
for (auto col : matrix) // Iterate over the columns
for (auto & cell : col) // Iterate over the cells in one column.
cell = val++;
// Print the matrix column by column
for (auto col : matrix)
seqan3::debug_stream << col << '\n';
}
#include <iostream>
int main()
{
using seqan3::detail::debug_matrix;
using namespace seqan3::literals;
std::vector<seqan3::dna4> database = "AACCGGTT"_dna4;
std::vector<seqan3::dna4> query = "ACGT"_dna4;
seqan3::detail::row_wise_matrix<int> score_matrix{
seqan3::detail::number_rows{5u},
seqan3::detail::number_cols{9u},
std::vector{-0, -1, -2, -3, -4, -5, -6, -7, -8, -1, -0, -1, -2, -3, -4, -5, -6, -7, -2, -1, -1, -1, -2,
-3, -4, -5, -6, -3, -2, -2, -2, -2, -2, -3, -4, -5, -4, -3, -3, -3, -3, -3, -3, -3, -4}};
seqan3::debug_stream << "database:\t" << database << '\n';
seqan3::debug_stream << "query:\t\t" << query << '\n';
seqan3::debug_stream << "score_matrix: " << score_matrix.cols() << " columns and " << score_matrix.rows()
<< " rows\n";
// Prints out the matrix in a convenient way
seqan3::debug_stream << score_matrix << '\n'; // without sequences
seqan3::debug_stream << debug_matrix{score_matrix, database, query} << '\n'; // with sequences
seqan3::debug_stream << seqan3::fmtflags2::utf8 << debug_matrix{score_matrix, database, query}; // as utf8
return 0;
}
#include <iostream>
int main()
{
using seqan3::detail::debug_matrix;
using namespace seqan3::literals;
using seqan3::operator|;
std::vector<seqan3::dna4> database = "AACCGGTT"_dna4;
std::vector<seqan3::dna4> query = "ACGT"_dna4;
auto N = seqan3::detail::trace_directions::none;
auto D = seqan3::detail::trace_directions::diagonal;
auto U = seqan3::detail::trace_directions::up;
auto L = seqan3::detail::trace_directions::left;
seqan3::detail::row_wise_matrix<seqan3::detail::trace_directions> trace_matrix{
seqan3::detail::number_rows{5u},
seqan3::detail::number_cols{9u},
std::vector{N, L, L, L, L, L, L, L, L, U, D, D | L, L, L, L,
L, L, L, U, U, D, D, D | L, L, L, L, L, U, U, D | U,
D | U, D, D, D | L, L, L, U, U, D | U, D | U, D | U, D | U, D, D, D | L}};
seqan3::debug_stream << "database:\t" << database << '\n';
seqan3::debug_stream << "query:\t\t" << query << '\n';
seqan3::debug_stream << "trace_matrix: " << trace_matrix.cols() << " columns and " << trace_matrix.rows()
<< " rows\n";
// Prints out the matrix in a convenient way
seqan3::debug_stream << trace_matrix << '\n'; // without sequences
seqan3::debug_stream << debug_matrix{trace_matrix, database, query} << '\n'; // with sequences
seqan3::debug_stream << seqan3::fmtflags2::utf8 << debug_matrix{trace_matrix, database, query}; // as utf8
return 0;
}
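
Entries such as `D | L` in the trace matrix above combine several directions in one cell. The underlying bit-flag technique can be sketched as follows. The enum `direction` and the helper `has` are hypothetical, and the numeric values are chosen for illustration only; the values used inside SeqAn may differ.

```cpp
#include <cstdint>

// Hypothetical flag enum: every direction occupies its own bit.
enum class direction : std::uint8_t
{
    none = 0,
    diagonal = 1,
    up = 2,
    left = 4
};

// Combines two sets of directions into one cell value.
constexpr direction operator|(direction const lhs, direction const rhs) noexcept
{
    return static_cast<direction>(static_cast<std::uint8_t>(lhs) | static_cast<std::uint8_t>(rhs));
}

// Checks whether `cell` contains the direction `flag`.
constexpr bool has(direction const cell, direction const flag) noexcept
{
    return (static_cast<std::uint8_t>(cell) & static_cast<std::uint8_t>(flag)) != 0;
}

static_assert(has(direction::diagonal | direction::left, direction::left));
static_assert(!has(direction::diagonal | direction::left, direction::up));
```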
#include <vector>
int main()
{
using namespace seqan3::literals;
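// Configure the alignment kernel (assumed here: global alignment with the default nucleotide scoring scheme).
auto config =
seqan3::align_cfg::method_global{} | seqan3::align_cfg::scoring_scheme{seqan3::nucleotide_scoring_scheme{}};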
{
std::pair p{"ACGTAGC"_dna4, "AGTACGACG"_dna4};
auto result = seqan3::align_pairwise(p, config);
}
{
std::vector vec{"ACCA"_dna4, "ATTA"_dna4};
auto result = seqan3::align_pairwise(std::tie(vec[0], vec[1]), config);
}
std::vector vec{std::pair{"AGTGCTACG"_dna4, "ACGTGCGACTAG"_dna4},
std::pair{"AGTAGACTACG"_dna4, "ACGTACGACACG"_dna4},
std::pair{"AGTTACGAC"_dna4, "AGTAGCGATCG"_dna4}};
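// Configure a global alignment with edit distance.
auto edit_config = seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme;
// Compute the alignment of a single pair.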
for (auto const & res : seqan3::align_pairwise(std::tie(vec[0].first, vec[0].second), edit_config))
seqan3::debug_stream << "The score: " << res.score() << "\n";
// Compute the alignment over a range of pairs.
for (auto const & res : seqan3::align_pairwise(vec, edit_config))
seqan3::debug_stream << "The score: " << res.score() << "\n";
}
#include <ranges>
#include <vector>
int main()
{
using namespace seqan3::literals;
std::vector data1{"AGTGCTACG"_dna4, "AGTAGACTACG"_dna4, "AGTTACGAC"_dna4};
std::vector data2{"ACGTGCGACTAG"_dna4, "ACGTACGACACG"_dna4, "AGTAGCGATCG"_dna4};
// Configure the alignment kernel.
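auto config =
seqan3::align_cfg::method_global{} | seqan3::align_cfg::scoring_scheme{seqan3::nucleotide_scoring_scheme{}};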
// Compute the alignment over a range of pairs.
for (auto const & res : seqan3::align_pairwise(seqan3::views::zip(data1, data2), config))
seqan3::debug_stream << "The score: " << res.score() << "\n";
}
int main()
{
using first_seq_t = std::tuple_element_t<0, std::ranges::range_value_t<sequences_t>>;
using second_seq_t = std::tuple_element_t<1, std::ranges::range_value_t<sequences_t>>;
// Select the result type based on the sequences and the configuration.
using result_t =
config_t>::type>;
// Define the function wrapper type.
using function_wrapper_t = std::function<result_t(first_seq_t &, second_seq_t &)>;
static_assert(seqan3::detail::is_type_specialisation_of_v<function_wrapper_t, std::function>);
}
#include <mutex>
#include <vector>
int main()
{
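// Generate some sequences.
using namespace seqan3::literals;
using sequence_pair_t = std::pair<seqan3::dna4_vector, seqan3::dna4_vector>;
std::vector<sequence_pair_t> sequences{100, {"AGTGCTACG"_dna4, "ACGTGCGACTAG"_dna4}};
// Use edit distance with 4 threads.
auto const alignment_config =
seqan3::align_cfg::method_global{} | seqan3::align_cfg::edit_scheme | seqan3::align_cfg::parallel{4};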
// Compute the alignments in parallel and output them in order based on the input.
for (auto && result : seqan3::align_pairwise(sequences, alignment_config))
seqan3::debug_stream << result << '\n';
// prints:
// [id: 0 score: -4]
// [id: 1 score: -4]
// [id: 2 score: -4]
// [id: 3 score: -4]
// [id: 4 score: -4]
// [id: 5 score: -4]
// ...
// [id: 98 score: -4]
// [id: 99 score: -4]
// Compute the alignments in parallel and output them unordered using the callback (order is not deterministic).
std::mutex write_to_debug_stream{}; // Need mutex to synchronise the output.
auto const alignment_config_with_callback =
alignment_config
| seqan3::align_cfg::on_result{[&](auto && result)
{
std::lock_guard sync{write_to_debug_stream}; // critical section
seqan3::debug_stream << result << '\n';
}};
seqan3::align_pairwise(sequences, alignment_config_with_callback); // seqan3::align_pairwise is now declared void.
// might print:
// [id: 0 score: -4]
// [id: 1 score: -4]
// [id: 2 score: -4]
// [id: 6 score: -4]
// [id: 7 score: -4]
// [id: 3 score: -4]
// ...
// [id: 99 score: -4]
// [id: 92 score: -4]
}
int main()
{
using namespace seqan3::literals;
// How to score two letters:
seqan3::aminoacid_scoring_scheme scheme{seqan3::aminoacid_similarity_matrix::blosum62};
seqan3::debug_stream << "blosum62 score for T and S: " << (int)scheme.score('T'_aa27, 'S'_aa27) << "\n"; // == 1
scheme.set_similarity_matrix(seqan3::aminoacid_similarity_matrix::blosum80);
// You can also score aa20 against aa27:
seqan3::debug_stream << "blosum80 score for 'T'_aa27 and 'S'_aa20: " << (int)scheme.score('T'_aa27, 'S'_aa20)
<< "\n"; // == 2
scheme.set_hamming_distance();
seqan3::debug_stream << "Hamming distance between T and S: " << (int)scheme.score('T'_aa27, 'S'_aa20)
<< "\n"; // == -1
seqan3::debug_stream << "Hamming distance between T and T: " << (int)scheme.score('T'_aa27, 'T'_aa20)
<< "\n"; // == 0
// You can "edit" a given matrix directly:
seqan3::aminoacid_scoring_scheme scheme2{seqan3::aminoacid_similarity_matrix::blosum80};
seqan3::debug_stream << "blosum80 score between T and S: " << (int)scheme2.score('T'_aa27, 'S'_aa27)
<< "\n"; // == 2
auto & cell = scheme2.score('T'_aa27, 'S'_aa27);
cell = 3;
seqan3::debug_stream << "New score after editing entry: " << (int)scheme2.score('T'_aa27, 'S'_aa27) << "\n"; // == 3
std::vector<seqan3::aa27> one = "ALIGATOR"_aa27;
std::vector<seqan3::aa27> two = "ANIMATOR"_aa27;
// You can also score two sequences:
seqan3::aminoacid_scoring_scheme scheme3{seqan3::aminoacid_similarity_matrix::blosum62};
int score = 0;
for (auto pair : seqan3::views::zip(one, two))
score += scheme3.score(std::get<0>(pair), std::get<1>(pair));
seqan3::debug_stream << "Score: " << score << "\n"; // 4 + -3 + 4 + -3 + 4 + 5 + -1 + 5 = 15
}
int main()
{
using namespace seqan3::literals;
// You can score two letters:
seqan3::nucleotide_scoring_scheme scheme; // hamming is default
seqan3::debug_stream << "Score between DNA5 A and G: " << (int)scheme.score('A'_dna5, 'G'_dna5) << "\n"; // == -1
seqan3::debug_stream << "Score between DNA5 A and A: " << (int)scheme.score('A'_dna5, 'A'_dna5) << "\n"; // == 0
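// You can also score letters of different nucleotide alphabets against each other:
scheme.set_simple_scheme(seqan3::match_score{3}, seqan3::mismatch_score{-2});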
seqan3::debug_stream << "Score between DNA5 A and RNA15 G: " << (int)scheme.score('A'_dna5, 'G'_rna15)
<< "\n"; // == -2
seqan3::debug_stream << "Score between DNA5 A and RNA15 A: " << (int)scheme.score('A'_dna5, 'A'_rna15)
<< "\n"; // == 3
// You can "edit" a given matrix directly:
seqan3::nucleotide_scoring_scheme scheme2; // hamming distance is default
seqan3::debug_stream << "Score between DNA A and G before edit: " << (int)scheme2.score('A'_dna15, 'G'_dna15)
<< "\n"; // == -1
scheme2.score('A'_dna15, 'G'_dna15) = 3;
seqan3::debug_stream << "Score after editing: " << (int)scheme2.score('A'_dna15, 'G'_dna15) << "\n"; // == 3
// You can score two sequences:
std::vector<seqan3::dna15> one = "AGAATA"_dna15;
std::vector<seqan3::dna15> two = "ATACTA"_dna15;
seqan3::nucleotide_scoring_scheme scheme3; // hamming distance is default
int score = 0;
for (auto pair : seqan3::views::zip(one, two))
score += scheme3.score(std::get<0>(pair), std::get<1>(pair));
seqan3::debug_stream << "Score: " << score << "\n"; // == 0 - 1 + 0 - 1 + 0 + 0 = -2
}
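
The arithmetic behind the scores in the snippet above can be reproduced with a few lines of standard C++. This is a sketch only: `simple_scheme` and `score_sequences` are hypothetical stand-ins for a match/mismatch scheme and operate on plain characters instead of alphabet types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a simple match/mismatch scheme.
// The defaults (0 / -1) correspond to the Hamming distance.
struct simple_scheme
{
    int match{0};
    int mismatch{-1};

    constexpr int score(char const a, char const b) const noexcept
    {
        return a == b ? match : mismatch;
    }
};

// Sums the per-position scores of two sequences of equal length.
int score_sequences(simple_scheme const & scheme, std::string const & one, std::string const & two)
{
    assert(one.size() == two.size());

    int total = 0;
    for (std::size_t i = 0; i < one.size(); ++i)
        total += scheme.score(one[i], two[i]);
    return total;
}
```

For AGAATA and ATACTA, the default scheme sums 0 - 1 + 0 - 1 + 0 + 0 = -2, which matches the comment in the snippet. With a match score of 3 and a mismatch score of -2, a single A/G pair scores -2 and an A/A pair scores 3.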
int main()
{
// does not work:
// seqan3::dna4 my_letter{0}; // we want to set the default, an A
// seqan3::dna4 my_letter{'A'}; // we also want to set an A, but we are setting value 65
// std::cout << my_letter; // you expect 'A', but how would you access the number?
}
int main()
{
using namespace seqan3::literals;
seqan3::dna4 letter = 'A'_dna4; // identical to assign_char_to('A', letter);
seqan3::dna4_vector sequence = "ACGT"_dna4; // identical to calling assign_char for each element
}
int main()
{
seqan3::dna4 my_letter;
seqan3::assign_rank_to(0, my_letter); // assign an A via rank interface
seqan3::assign_char_to('A', my_letter); // assign an A via char interface
std::cout << seqan3::to_char(my_letter) << '\n'; // prints 'A'
std::cout << (unsigned)seqan3::to_rank(my_letter) << '\n'; // prints 0
// we have to add the cast here, because uint8_t is also treated as a char type by default :(
// Using SeqAn's debug_stream:
seqan3::debug_stream << seqan3::to_char(my_letter) << '\n'; // prints 'A'
seqan3::debug_stream << my_letter << '\n'; // prints 'A' (calls to_char() automatically!)
seqan3::debug_stream << seqan3::to_rank(my_letter) << '\n'; // prints 0 (casts uint8_t to unsigned automatically!)
}
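
The cast mentioned in the snippet above is needed because standard streams treat `uint8_t` as a character type. The following sketch shows the difference using a string stream; `print_raw` and `print_as_number` are hypothetical helper names.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Streams the value as is: uint8_t is a character type, so a character is written.
std::string print_raw(std::uint8_t const rank)
{
    std::ostringstream out;
    out << rank;
    return out.str();
}

// Casts to unsigned first, so the numeric value is written.
std::string print_as_number(std::uint8_t const rank)
{
    std::ostringstream out;
    out << static_cast<unsigned>(rank);
    return out.str();
}
```

For the value 65, `print_raw` yields the letter A, while `print_as_number` yields the text 65.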
#include <seqan3/utility/char_operations/transform.hpp> // seqan3::to_lower
class ab : public seqan3::alphabet_base<ab, 2>
{
private:
// make the base class a friend so it can access the tables:
friend seqan3::alphabet_base<ab, 2>;
// This function is expected by seqan3::alphabet_base
static constexpr char_type rank_to_char(rank_type const rank)
{
// via a lookup table
return rank_to_char_table[rank];
// or via an arithmetic expression
return rank == 1 ? 'B' : 'A';
}
// This function is expected by seqan3::alphabet_base
static constexpr rank_type char_to_rank(char_type const chr)
{
// via a lookup table
using index_t = std::make_unsigned_t<char_type>; // avoid negative indices for signed char
return char_to_rank_table[static_cast<index_t>(chr)];
// or via an arithmetic expression
return seqan3::to_lower(chr) == 'b' ? 1 : 0;
}
private:
// === lookup-table implementation detail ===
// map 0 -> A and 1 -> B
static constexpr std::array<char_type, alphabet_size> rank_to_char_table{'A', 'B'};
// map every letter to rank zero, except Bs
static constexpr std::array<rank_type, 256> char_to_rank_table{
// initialise with an immediately evaluated lambda expression:
[]()
{
std::array<rank_type, 256> ret{}; // initialise all values with 0 / 'A'
// only 'b' and 'B' result in rank 1
ret['b'] = 1;
ret['B'] = 1;
return ret;
}()};
};
// The class ab satisfies the alphabet concept.
static_assert(seqan3::alphabet<ab>);
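The lookup-table technique used by the class above can also be tried without SeqAn. The following is a minimal sketch using only the standard library: the namespace `sketch` and the free functions are hypothetical names chosen for illustration, and `std::uint8_t` stands in for the library's `rank_type`.

```cpp
#include <array>
#include <cstdint>

namespace sketch // hypothetical namespace, not part of SeqAn
{
// map 0 -> 'A' and 1 -> 'B'
inline constexpr std::array<char, 2> rank_to_char_table{'A', 'B'};

// map every character to rank 0, except 'b' and 'B'
inline constexpr std::array<std::uint8_t, 256> char_to_rank_table = []()
{
    std::array<std::uint8_t, 256> ret{}; // all entries start as 0
    ret['b'] = 1;
    ret['B'] = 1;
    return ret;
}(); // the lambda is evaluated immediately, at compile time

constexpr std::uint8_t char_to_rank(char const chr)
{
    // convert to unsigned char first so that negative char values cannot index out of bounds
    return char_to_rank_table[static_cast<unsigned char>(chr)];
}

constexpr char rank_to_char(std::uint8_t const rank)
{
    return rank_to_char_table[rank];
}

// Both spellings of B share rank 1; everything else falls back to 'A'.
static_assert(char_to_rank('B') == 1);
static_assert(char_to_rank('b') == 1);
static_assert(rank_to_char(char_to_rank('x')) == 'A');
} // namespace sketch
```

The unsigned conversion plays the same role as the `index_t` cast in the class above: a plain `char` may be signed, and a negative index would read outside the table.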
#include <iostream>
int main()
{
auto sigma_char = seqan3::alphabet_size<char>; // calls seqan3::custom::alphabet_size(char{})
static_assert(std::same_as<decltype(sigma_char), uint16_t>);
std::cout << sigma_char << '\n'; // 256
auto sigma_dna5 = seqan3::alphabet_size<seqan3::dna5>; // returns dna5::alphabet_size
static_assert(std::same_as<decltype(sigma_dna5), uint8_t>);
std::cout << static_cast<uint16_t>(sigma_dna5) << '\n'; // 5
}
int main()
{
using namespace seqan3::literals;
seqan3::aa10li letter{'A'_aa10li};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('?'); // Unknown characters are implicitly converted to A.
seqan3::debug_stream << letter << '\n'; // prints "A"
}
int main()
{
using namespace seqan3::literals;
seqan3::aa10li letter1{'A'_aa10li};
auto letter2 = 'A'_aa10li;
}
int main()
{
using namespace seqan3::literals;
seqan3::aa10li_vector sequence1{"ACGTTA"_aa10li};
seqan3::aa10li_vector sequence2 = "ACGTTA"_aa10li;
auto sequence3 = "ACGTTA"_aa10li;
}
int main()
{
using namespace seqan3::literals;
seqan3::aa10murphy letter{'A'_aa10murphy};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('?'); // Unknown characters are implicitly converted to S.
seqan3::debug_stream << letter << '\n'; // prints "S"
}
int main()
{
using namespace seqan3::literals;
seqan3::aa10murphy letter1{'A'_aa10murphy};
auto letter2 = 'A'_aa10murphy;
}
int main()
{
using namespace seqan3::literals;
seqan3::aa10murphy_vector sequence1{"ACGTTA"_aa10murphy};
seqan3::aa10murphy_vector sequence2 = "ACGTTA"_aa10murphy;
auto sequence3 = "ACGTTA"_aa10murphy;
}
int main()
{
using namespace seqan3::literals;
seqan3::aa20 letter{'A'_aa20};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('?'); // Unknown characters are implicitly converted to S.
seqan3::debug_stream << letter << '\n'; // prints "S"
}
int main()
{
using namespace seqan3::literals;
seqan3::aa20 letter1{'A'_aa20};
auto letter2 = 'A'_aa20;
}
int main()
{
using namespace seqan3::literals;
seqan3::aa20_vector sequence1{"ACGTTA"_aa20};
seqan3::aa20_vector sequence2 = "ACGTTA"_aa20;
auto sequence3 = "ACGTTA"_aa20;
}
int main()
{
using namespace seqan3::literals;
seqan3::aa27 letter{'A'_aa27};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('?'); // Unknown characters are implicitly converted to X.
seqan3::debug_stream << letter << '\n'; // prints "X"
}
int main()
{
using namespace seqan3::literals;
seqan3::aa27 letter1{'A'_aa27};
auto letter2 = 'A'_aa27;
}
int main()
{
using namespace seqan3::literals;
seqan3::aa27_vector sequence1{"ACGTTA"_aa27};
seqan3::aa27_vector sequence2 = "ACGTTA"_aa27;
auto sequence3 = "ACGTTA"_aa27;
}
namespace your_namespace
{
// your own aminoacid definition
struct your_aa : seqan3::aminoacid_empty_base
{
//...
};
} // namespace your_namespace
static_assert(seqan3::enable_aminoacid<your_namespace::your_aa> == true);
/***** OR *****/
namespace your_namespace2
{
// your own aminoacid definition
struct your_aa
{
//...
};
constexpr bool enable_aminoacid(your_aa) noexcept
{
return true;
}
} // namespace your_namespace2
static_assert(seqan3::enable_aminoacid<your_namespace2::your_aa> == true);
int main()
{
char c = '!';
seqan3::assign_char_strictly_to('?', c); // calls seqan3::custom::assign_char_strictly_to('?', c)
seqan3::dna5 d{};
seqan3::assign_char_strictly_to('A', d); // calls .assign_char('A') member
// also works for temporaries:
seqan3::dna5 d2 = seqan3::assign_char_strictly_to('A', seqan3::dna5{});
}
int main()
{
char c = '!';
seqan3::assign_char_to('?', c); // calls seqan3::custom::assign_char_to('?', c)
seqan3::dna5 d{};
seqan3::assign_char_to('A', d); // calls .assign_char('A') member
// also works for temporaries:
seqan3::dna5 d2 = seqan3::assign_char_to('A', seqan3::dna5{});
// invalid/unknown characters are converted:
seqan3::dna5 d3 = seqan3::assign_char_to('!', seqan3::dna5{}); // == 'N'_dna5
}
int main()
{
char c = '!';
seqan3::assign_rank_to(66, c); // calls seqan3::custom::assign_rank_to(66, c); == 'B'
seqan3::dna5 d{};
seqan3::assign_rank_to(2, d); // calls .assign_rank(2) member; == 'G'_dna5
// also works for temporaries:
seqan3::dna5 d2 = seqan3::assign_rank_to(2, seqan3::dna5{});
// too-large ranks are undefined behaviour:
// seqan3::dna5 d3 = seqan3::assign_rank_to(50, seqan3::dna5{});
}
int main()
{
// calls seqan3::custom::char_is_valid_for<char>('A')
std::cout << std::boolalpha << seqan3::char_is_valid_for<char>('A') << '\n'; // always 'true'
// calls dna5::char_is_valid('A') member
std::cout << std::boolalpha << seqan3::char_is_valid_for<seqan3::dna5>('A') << '\n'; // true
// for some alphabets, characters that are not uniquely mappable are still valid:
std::cout << std::boolalpha << seqan3::char_is_valid_for<seqan3::dna5>('a') << '\n'; // true
}
int main()
{
using namespace seqan3::literals;
seqan3::cigar letter{12, 'M'_cigar_operation};
letter.assign_string("10D");
seqan3::debug_stream << letter << '\n'; // prints "10D"
letter.assign_string("20Z"); // Unknown strings are implicitly converted to 0P.
seqan3::debug_stream << letter << '\n'; // prints "0P"
}
int main()
{
std::string cigar_str{"4S134M"}; // input
seqan3::cigar letter1{};
seqan3::cigar letter2{};
// Assign from string
// convenient but creates an unnecessary string copy "4S"
letter1.assign_string(cigar_str.substr(0, 2));
letter2.assign_string(cigar_str.substr(2, 4));
seqan3::debug_stream << letter1 << '\n'; // prints 4S
seqan3::debug_stream << letter2 << '\n'; // prints 134M
// Assign from std::string_view (No extra string copies)
// Version 1
letter1.assign_string(std::string_view{cigar_str}.substr(0, 2));
letter2.assign_string(std::string_view{cigar_str}.substr(2, 4));
seqan3::debug_stream << letter1 << '\n'; // prints 4S
seqan3::debug_stream << letter2 << '\n'; // prints 134M
// Version 2 (also no extra string copies)
letter1.assign_string(/*std::string_view*/ {cigar_str.data(), 2});
letter2.assign_string(/*std::string_view*/ {cigar_str.data() + 2, 4});
seqan3::debug_stream << letter1 << '\n'; // prints 4S
seqan3::debug_stream << letter2 << '\n'; // prints 134M
// Assign from char array
letter2.assign_string("40S");
seqan3::debug_stream << letter2 << '\n'; // prints 40S
// Assign from seqan3::small_string
letter2.assign_string(letter1.to_string());
seqan3::debug_stream << letter2 << '\n'; // prints 4S
}
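The snippet above cuts the CIGAR string at hand-picked positions. As a rough illustration of how the element boundaries could be found automatically, here is a minimal sketch that uses only the standard library. `sketch::split_cigar` is a hypothetical helper, not a SeqAn function, and it does not check whether the operation characters are valid.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>
#include <utility>
#include <vector>

namespace sketch // hypothetical namespace, not part of SeqAn
{
// Splits a CIGAR string such as "4S134M" into (count, operation) pairs.
// Parsing stops at the first element that has no count or no operation character.
inline std::vector<std::pair<std::uint32_t, char>> split_cigar(std::string_view const cigar)
{
    std::vector<std::pair<std::uint32_t, char>> result;
    char const * pos = cigar.data();
    char const * const end = cigar.data() + cigar.size();

    while (pos != end)
    {
        std::uint32_t count{};
        auto const [ptr, ec] = std::from_chars(pos, end, count); // reads the digits, no copy
        if (ec != std::errc{} || ptr == end)
            break; // malformed: missing number or missing operation character

        assert(ptr > pos); // at least one digit was consumed, so the loop makes progress
        result.emplace_back(count, *ptr);
        pos = ptr + 1; // continue behind the operation character
    }
    return result;
}
} // namespace sketch
```

Each pair could then be passed to a `seqan3::cigar` as in the examples above; the sketch itself works on the `std::string_view` directly and creates no substrings.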
int main()
{
using seqan3::get;
using namespace seqan3::literals;
seqan3::cigar letter{10, 'M'_cigar_operation};
// Note that this is equivalent to get<uint32_t>(letter)
uint32_t size{get<0>(letter)};
// Note that this is equivalent to get<seqan3::cigar::operation>(letter)
seqan3::cigar::operation cigar_char{get<1>(letter)};
seqan3::debug_stream << "Size is " << size << '\n';
seqan3::debug_stream << "Cigar char is " << cigar_char << '\n'; // seqan3::debug_stream converts to char on the fly.
}
int main()
{
using seqan3::get;
using namespace seqan3::literals;
seqan3::cigar letter{10, 'M'_cigar_operation};
// Note that this is equivalent to get<0>(letter)
uint32_t size{get<uint32_t>(letter)};
// Note that this is equivalent to get<1>(letter)
seqan3::cigar::operation cigar_char{get<seqan3::cigar::operation>(letter)};
seqan3::debug_stream << "Size is " << size << '\n';
seqan3::debug_stream << "Cigar char is " << cigar_char << '\n'; // seqan3::debug_stream converts to char on the fly.
}
int main()
{
using namespace seqan3::literals;
seqan3::cigar::operation letter{'M'_cigar_operation};
letter.assign_char('D');
seqan3::debug_stream << letter << '\n'; // prints "D"
letter.assign_char('Z'); // Unknown characters are implicitly converted to M.
seqan3::debug_stream << letter << '\n'; // prints "M"
}
int main()
{
using namespace seqan3::literals;
seqan3::cigar::operation letter1{'M'_cigar_operation};
auto letter2 = 'M'_cigar_operation;
}
int main()
{
using namespace seqan3::literals;
seqan3::cigar letter{10, 'I'_cigar_operation};
seqan3::debug_stream << "letter: " << letter << '\n'; // 10I
letter = 'D'_cigar_operation;
seqan3::debug_stream << "letter: " << letter << '\n'; // 10D
letter = 20;
seqan3::debug_stream << "letter: " << letter << '\n'; // 20D
}
int main()
{
using namespace seqan3::literals;
// creates 10M, as the cigar_op field is not provided.
seqan3::cigar letter1{10};
seqan3::debug_stream << "letter1: " << letter1 << '\n'; // 10M
// creates 0I, as the integer field is not provided.
seqan3::cigar letter2{'I'_cigar_operation};
seqan3::debug_stream << "letter2: " << letter2 << '\n'; // 0I
// creates 10I, as both fields are explicitly given.
seqan3::cigar letter3{10, 'I'_cigar_operation};
seqan3::debug_stream << "letter3: " << letter3 << '\n'; // 10I
}
int main()
{
using namespace seqan3::literals;
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter1{'T'_dna4, '"'_phred42};
letter1 = 'C'_rna4; // yields {'C'_dna4, '"'_phred42}
}
int main()
{
using namespace seqan3::literals;
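// The following creates {'C'_dna4, '!'_phred42}
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter1{'C'_dna4};
// The following also creates {'C'_dna4, '!'_phred42}, since rna4 is assignable to dna4
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter2{'C'_rna4};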
if (letter1 == letter2)
seqan3::debug_stream << "yeah\n"; // yeah
}
int main()
{
using namespace seqan3::literals;
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter1{'T'_dna4, '"'_phred42};
letter1 = 'C'_dna4; // yields {'C'_dna4, '"'_phred42}
letter1 = '#'_phred42; // yields {'C'_dna4, '#'_phred42}
}
int main()
{
using namespace seqan3::literals;
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter1{'C'_dna4}; // creates {'C'_dna4, '!'_phred42}
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter2{'"'_phred42}; // creates {'A'_dna4, '"'_phred42}
if (letter1 == letter2)
seqan3::debug_stream << "yeah\n"; // not printed here: letter1 and letter2 differ
}
int main()
{
using namespace seqan3::literals;
seqan3::alphabet_variant<seqan3::dna5, seqan3::gap> letter{}; // implicitly 'A'_dna5
seqan3::alphabet_variant<seqan3::dna5, seqan3::gap> letter2{'C'_dna5}; // constructed from alternative (== 'C'_dna5)
seqan3::alphabet_variant<seqan3::dna5, seqan3::gap> letter3{
'U'_rna5}; // constructed from type that alternative is constructible from (== 'T'_dna5)
letter2.assign_char('T'); // == 'T'_dna5
letter2.assign_char('-'); // == gap{}
letter2.assign_char('K'); // unknown characters map to the default/unknown
// character of the first alternative type (== 'N'_dna5)
letter2 = seqan3::gap{}; // assigned from alternative (== gap{})
letter2 = 'U'_rna5; // assigned from type that alternative is assignable from (== 'T'_dna5)
seqan3::dna5 letter4 = letter2.convert_to<seqan3::dna5>();
}
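One way to picture the character handling of such a combined alphabet is a single rank range that covers all alternatives in order. The sketch below models this with the standard library only. It assumes that the dna5 characters occupy ranks 0 to 4 in the order A, C, G, N, T and that the gap follows at rank 5; this layout and the names in `sketch` are illustrative assumptions, not the documented internals of `seqan3::alphabet_variant`.

```cpp
#include <cstdint>
#include <string_view>

namespace sketch // hypothetical namespace, not part of SeqAn
{
// Assumed layout: ranks 0-4 belong to the first alternative (dna5), rank 5 to the gap.
inline constexpr std::string_view combined_chars{"ACGNT-"};

// Rank of the unknown character 'N' of the first alternative.
inline constexpr std::uint8_t unknown_rank = 3;

constexpr std::uint8_t char_to_rank(char const chr)
{
    for (std::uint8_t rank = 0; rank < combined_chars.size(); ++rank)
        if (combined_chars[rank] == chr)
            return rank;
    return unknown_rank; // unknown characters map to 'N'
}

constexpr char rank_to_char(std::uint8_t const rank)
{
    return combined_chars[rank];
}

static_assert(rank_to_char(char_to_rank('T')) == 'T');
static_assert(rank_to_char(char_to_rank('-')) == '-');
static_assert(rank_to_char(char_to_rank('K')) == 'N');
} // namespace sketch
```

The three assertions correspond to the three `assign_char` calls in the example above: a regular letter, the gap character, and an unknown character.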
int main()
{
using namespace seqan3::literals;
seqan3::alphabet_variant<seqan3::dna4, seqan3::dna5> var;
var.assign_char('A'); // will be in the "dna4-state"
var = 'A'_dna5; // will be in the "dna5-state"
}
int main()
{
using namespace seqan3::literals;
// possible:
seqan3::alphabet_variant<seqan3::dna4, seqan3::gap> letter1{'C'_rna4};
// not possible:
// seqan3::alphabet_variant<seqan3::dna4, seqan3::gap> letter2 = 'C'_dna5;
}
#include <gtest/gtest.h>
int main()
{
using variant_t = seqan3::alphabet_variant<seqan3::dna5, seqan3::gap>;
static_assert(variant_t::is_alternative<seqan3::dna5>(), "dna5 is an alternative of variant_t");
static_assert(!variant_t::is_alternative<seqan3::dna4>(), "dna4 is not an alternative of variant_t");
static_assert(variant_t::is_alternative<seqan3::gap>(), "gap is an alternative of variant_t");
}
// This example illustrates how we can reduce the usage of templates (or the amount of different instantiations) via
// type erasure. Having only one function generated for `algorithm()` is the only benefit of using `semialphabet_any`
// here. Of course this only makes sense for your application if the part of the program that is agnostic of the
// character representation (your equivalent of `algorithm()`) is substantially larger than the specific parts – and
// if compile-time and/or size of the executable are a concern.
#include <iostream>
using namespace seqan3::literals;
// Print is a template and gets instantiated two times because the behaviour is different for both types
template <typename rng_t>
void print(rng_t && r)
{
seqan3::debug_stream << r << '\n';
}
// Algorithm is not a template, only one instance is generated by the compiler
// Type information is encoded via a run-time parameter
void algorithm(std::vector<seqan3::semialphabet_any<10>> & r, bool is_murphy)
{
// Algorithm example that replaces rank 0 with rank 1
for (auto & v : r)
if (seqan3::to_rank(v) == 0)
seqan3::assign_rank_to(1, v);
// Here we reify the type for printing
if (is_murphy)
print(r
| std::views::transform(
[](auto const & in)
{
return static_cast<seqan3::aa10murphy>(in);
}));
else
print(r
| std::views::transform(
[](auto const & in)
{
return static_cast<seqan3::aa10li>(in);
}));
}
// Two instances of algo_pre exist
// They type erase the different arguments to the same type and encode the type information as a run-time parameter
void algo_pre(seqan3::aa10li_vector const & v)
{
std::vector<seqan3::semialphabet_any<10>> tmp = v
| std::views::transform(
[](auto const & in)
{
return static_cast<seqan3::semialphabet_any<10>>(in);
})
| seqan3::ranges::to<std::vector>();
algorithm(tmp, false);
}
void algo_pre(seqan3::aa10murphy_vector const & v)
{
std::vector<seqan3::semialphabet_any<10>> tmp = v
| std::views::transform(
[](auto const & in)
{
return static_cast<seqan3::semialphabet_any<10>>(in);
})
| seqan3::ranges::to<std::vector>();
algorithm(tmp, true);
}
int main()
{
seqan3::aa10li_vector v1{"AVRSTXOUB"_aa10li};
algo_pre(v1); // BIKBBBKCB
seqan3::aa10murphy_vector v2{"AVRSTXOUB"_aa10murphy};
algo_pre(v2); // BIKSSSKCB
}
#include <vector>
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dna4> v0{"ACGT"_dna4}; // data occupies 4 bytes in memory
seqan3::bitpacked_sequence<seqan3::dna4> v1{"ACGT"_dna4}; // data occupies 1 byte in memory
}
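To see why four dna4 letters can fit into one byte, consider that each letter needs only two bits. The following sketch packs four ranks into a `std::uint8_t` with the standard library only. The slot order (slot 0 in the lowest bits) and the function names are assumptions made for illustration; the actual memory layout of `seqan3::bitpacked_sequence` may differ.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch // hypothetical namespace, not part of SeqAn
{
// Writes the 2-bit rank (0..3) into slot `slot` (0..3) of one byte.
constexpr std::uint8_t set_slot(std::uint8_t const packed, std::size_t const slot, std::uint8_t const rank)
{
    unsigned const shift = 2u * static_cast<unsigned>(slot);
    unsigned const cleared = packed & ~(0b11u << shift); // erase the old value of the slot
    return static_cast<std::uint8_t>(cleared | ((rank & 0b11u) << shift));
}

// Reads the 2-bit rank stored in slot `slot` (0..3).
constexpr std::uint8_t get_slot(std::uint8_t const packed, std::size_t const slot)
{
    return static_cast<std::uint8_t>((packed >> (2u * static_cast<unsigned>(slot))) & 0b11u);
}

// Packs four ranks into a single byte.
constexpr std::uint8_t pack4(std::uint8_t const r0, std::uint8_t const r1, std::uint8_t const r2, std::uint8_t const r3)
{
    std::uint8_t packed = 0;
    packed = set_slot(packed, 0, r0);
    packed = set_slot(packed, 1, r1);
    packed = set_slot(packed, 2, r2);
    packed = set_slot(packed, 3, r3);
    return packed;
}

// "ACGT" with the assumed ranks A=0, C=1, G=2, T=3 occupies exactly one byte.
static_assert(pack4(0, 1, 2, 3) == 0b11'10'01'00);
static_assert(get_slot(pack4(0, 1, 2, 3), 2) == 2);
} // namespace sketch
```

The price for the smaller footprint is the extra shifting and masking on every access, which is the usual trade-off of bit-packed containers.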
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::concatenated_sequences<seqan3::dna4_vector> concat1{"ACGT"_dna4, "GAGGA"_dna4};
seqan3::debug_stream << concat1[0] << '\n'; // "ACGT"
std::vector<seqan3::dna4_vector> concat2{"ACTA"_dna4, "AGGA"_dna4};
concat1 = concat2; // you can assign from other ranges
concat2[0] = "ATTA"_dna4; // this works for vector of vector
concat1[0][1] = 'T'_dna4; // and this works for concatenated_sequences
seqan3::debug_stream << concat1[0] << '\n'; // "ATTA"
// if you know that you will be adding ten vectors of length ten:
std::vector<seqan3::dna4> vector_of_length10{"ACGTACGTAC"_dna4};
concat1.reserve(10);
concat1.concat_reserve(10 * vector_of_length10.size());
while (concat1.size() < 10)
{
// ...
concat1.push_back(vector_of_length10);
}
}
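The two reserve calls in the example above make more sense with a picture of the storage in mind: one buffer for all letters and one list of boundaries. Below is a minimal sketch of that idea using only the standard library. The class `sketch::concatenated_strings` and its members are hypothetical and far simpler than `seqan3::concatenated_sequences`; in particular, element access is read-only here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

namespace sketch // hypothetical namespace, not part of SeqAn
{
// Stores many strings back to back in one buffer plus a vector of boundaries.
class concatenated_strings
{
public:
    // Reserve room for `count` sequences (one boundary each, plus the leading 0).
    void reserve(std::size_t const count) { offsets.reserve(count + 1); }

    // Reserve room for `total_length` characters summed over all sequences.
    void concat_reserve(std::size_t const total_length) { buffer.reserve(total_length); }

    void push_back(std::string_view const sequence)
    {
        buffer.append(sequence);
        offsets.push_back(buffer.size());
    }

    std::size_t size() const { return offsets.size() - 1; }

    // Returns a view of the i-th sequence; later push_back calls may invalidate it.
    std::string_view operator[](std::size_t const i) const
    {
        assert(i < size());
        return std::string_view{buffer}.substr(offsets[i], offsets[i + 1] - offsets[i]);
    }

private:
    std::string buffer;                  // all sequences, concatenated
    std::vector<std::size_t> offsets{0}; // offsets[i] marks where sequence i begins
};
} // namespace sketch
```

With this layout, `reserve` only affects the boundary list while `concat_reserve` affects the letter buffer, which is why the example above calls both before the loop.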
int main()
{
using namespace seqan3::literals;
seqan3::concatenated_sequences<seqan3::dna4_vector> foobar;
foobar.insert(foobar.end(), "ACGT"_dna4);
seqan3::debug_stream << foobar[0] << '\n'; // "ACGT"
}
int main()
{
using namespace seqan3::literals;
seqan3::concatenated_sequences<seqan3::dna4_vector> foobar;
foobar.insert(foobar.end(), 2, "ACGT"_dna4);
seqan3::debug_stream << foobar[0] << '\n'; // "ACGT"
seqan3::debug_stream << foobar[1] << '\n'; // "ACGT"
}
#include <iostream>
int main()
{
seqan3::gap my_gap = seqan3::gap{};
seqan3::gap another_gap{};
another_gap.assign_char('A'); // this does not change anything
seqan3::debug_stream << my_gap.to_char(); // outputs '-'
if (my_gap.to_char() == another_gap.to_char())
seqan3::debug_stream << "Both gaps are the same!\n";
}
int main()
{
using namespace seqan3::literals;
seqan3::gapped<seqan3::dna4> converted_letter{'C'_dna4};
seqan3::gapped<seqan3::dna4>{}.assign_char('-'); // gap character
seqan3::gapped<seqan3::dna4>{}.assign_char('K'); // unknown characters map to the default/unknown
// character of the given alphabet type (i.e. A of dna4)
}
int main()
{
seqan3::mask my_mask = seqan3::mask::masked;
seqan3::mask another_mask{};
my_mask.assign_rank(false); // will assign my_mask the value mask::unmasked
another_mask.assign_rank(0); // will also assign another_mask the value mask::unmasked
if (my_mask.to_rank() == another_mask.to_rank())
seqan3::debug_stream << "Both are UNMASKED!\n";
}
int main()
{
using namespace seqan3::literals;
seqan3::masked<seqan3::dna4> dna4_masked{};
seqan3::masked<seqan3::dna4> dna4_another_masked{'A'_dna4, seqan3::mask::unmasked};
// create a dna4 masked alphabet with an unmasked A
dna4_masked.assign_char('a'); // assigns a masked 'A'_dna4
if (dna4_masked.to_char() != dna4_another_masked.to_char())
{
seqan3::debug_stream << dna4_masked.to_char() << " is not the same as " << dna4_another_masked.to_char()
<< "\n";
}
}
using namespace seqan3::literals;
int main()
{
auto r1 = 'A'_rna5.complement(); // calls member function rna5::complement(); r1 == 'U'_rna5
auto r2 = seqan3::complement('A'_rna5); // calls global complement() function on the rna5 object; r2 == 'U'_rna5
}
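For readers who want to check the complement rule outside the library, the following sketch expresses it on plain characters with the standard language only. `sketch::rna5_complement` is a hypothetical function; it applies Watson-Crick pairing (A with U, C with G) and, as an assumption of this sketch, returns 'N' for every other character.

```cpp
namespace sketch // hypothetical namespace, not part of SeqAn
{
// Complement of an RNA character: A <-> U, C <-> G, everything else becomes 'N'.
constexpr char rna5_complement(char const chr)
{
    switch (chr)
    {
    case 'A':
        return 'U';
    case 'C':
        return 'G';
    case 'G':
        return 'C';
    case 'U':
        return 'A';
    default:
        return 'N';
    }
}

// Complementing twice yields the original letter again.
static_assert(rna5_complement('A') == 'U');
static_assert(rna5_complement(rna5_complement('G')) == 'G');
} // namespace sketch
```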
int main()
{
using namespace seqan3::literals;
seqan3::dna15 letter{'A'_dna15};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('F'); // Unknown characters are implicitly converted to N.
seqan3::debug_stream << letter << '\n'; // prints "N"
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna15 letter1{'A'_dna15};
auto letter2 = 'A'_dna15;
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna15 letter1 = 'C'_rna15; // implicitly converted
seqan3::dna15 letter2{};
letter2 = 'C'_rna15; // implicitly converted
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_dna15 : public seqan3::dna15
{
// using seqan3::dna15::dna15; // uncomment to import implicit conversion shown by letter1
};
struct my_rna15 : public seqan3::rna15
{};
int main()
{
using namespace seqan3::literals;
// my_dna15 letter1 = 'C'_rna15; // NO automatic implicit conversion!
// seqan3::dna15 letter2 = my_rna15{}; // seqan3::dna15 only allows implicit conversion from seqan3::rna15!
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna15_vector vector{'A'_rna15, 'C'_rna15, 'G'_rna15}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::dna15_vector dna15_vector{"ACGT"_rna15};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::rna15_vector rna15_vector = "ACGT"_rna15;
seqan3::dna15_vector dna15_vector{rna15_vector.begin(), rna15_vector.end()};
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna15_vector vector = "ACG"_dna15;
auto rna15_view = vector | seqan3::views::convert<seqan3::rna15>;
for (auto && chr : rna15_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::rna15 &&>);
}
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna15_vector sequence1{"ACGTTA"_dna15};
seqan3::dna15_vector sequence2 = "ACGTTA"_dna15;
auto sequence3 = "ACGTTA"_dna15;
}
int main()
{
using namespace seqan3::literals;
seqan3::dna16sam letter{'A'_dna16sam};
letter.assign_char('=');
seqan3::debug_stream << letter << '\n'; // prints "="
letter.assign_char('F'); // Unknown characters are implicitly converted to N.
seqan3::debug_stream << letter << '\n'; // prints "N"
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna16sam letter1{'A'_dna16sam};
auto letter2 = 'A'_dna16sam;
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna16sam_vector sequence1{"ACGTTA"_dna16sam};
seqan3::dna16sam_vector sequence2 = "ACGTTA"_dna16sam;
auto sequence3 = "ACGTTA"_dna16sam;
}
int main()
{
using namespace seqan3::literals;
seqan3::dna3bs letter{'A'_dna3bs};
letter.assign_char('C'); // All C will be converted to T.
seqan3::debug_stream << letter << '\n'; // prints "T"
letter.assign_char('F'); // Unknown characters are implicitly converted to A.
seqan3::debug_stream << letter << '\n'; // prints "A"
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna3bs letter1{'A'_dna3bs};
auto letter2 = 'A'_dna3bs;
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna3bs_vector sequence1{"ACGTTA"_dna3bs};
seqan3::dna3bs_vector sequence2 = "ACGTTA"_dna3bs;
auto sequence3 = "ACGTTA"_dna3bs;
}
int main()
{
using namespace seqan3::literals;
seqan3::dna4 letter{'C'_dna4};
letter.assign_char('F'); // Characters other than IUPAC characters are implicitly converted to A.
seqan3::debug_stream << letter << '\n'; // prints "A"
// IUPAC characters are implicitly converted to their best fitting representative
seqan3::debug_stream << letter.assign_char('R') << '\n'; // prints "A"
seqan3::debug_stream << letter.assign_char('Y') << '\n'; // prints "C"
seqan3::debug_stream << letter.assign_char('S') << '\n'; // prints "C"
seqan3::debug_stream << letter.assign_char('W') << '\n'; // prints "A"
seqan3::debug_stream << letter.assign_char('K') << '\n'; // prints "G"
seqan3::debug_stream << letter.assign_char('M') << '\n'; // prints "A"
seqan3::debug_stream << letter.assign_char('B') << '\n'; // prints "C"
seqan3::debug_stream << letter.assign_char('D') << '\n'; // prints "A"
seqan3::debug_stream << letter.assign_char('H') << '\n'; // prints "A"
seqan3::debug_stream << letter.assign_char('V') << '\n'; // prints "A"
letter.assign_char('a'); // Lower case letters are the same as their upper case equivalent.
seqan3::debug_stream << letter << '\n'; // prints "A"
}
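The conversions printed above can be collected into a single lookup table. The sketch below builds such a table at compile time with the standard library only. The mapping is copied from the comments in the example; `sketch::to_dna4_char` is a hypothetical name, and any character that the example does not mention (for instance 'U' or 'N') simply falls back to 'A' in this sketch, which may differ from what the library does.

```cpp
#include <array>
#include <cstddef>

namespace sketch // hypothetical namespace, not part of SeqAn
{
// Table from any character to its dna4 representative, built at compile time.
inline constexpr std::array<char, 256> dna4_table = []()
{
    std::array<char, 256> table{};
    for (std::size_t i = 0; i < table.size(); ++i)
        table[i] = 'A'; // default for all characters not listed below

    // IUPAC codes and their representatives as listed in the example above.
    constexpr char from[] = "ACGTRYSWKMBDHV";
    constexpr char to[]   = "ACGTACCAGACAAA";
    for (std::size_t i = 0; from[i] != '\0'; ++i)
    {
        table[static_cast<unsigned char>(from[i])] = to[i];
        // lower case letters behave like their upper case equivalent (ASCII assumed)
        table[static_cast<unsigned char>(from[i] + ('a' - 'A'))] = to[i];
    }
    return table;
}();

constexpr char to_dna4_char(char const chr)
{
    return dna4_table[static_cast<unsigned char>(chr)];
}

static_assert(to_dna4_char('K') == 'G'); // IUPAC code
static_assert(to_dna4_char('y') == 'C'); // lower case IUPAC code
static_assert(to_dna4_char('F') == 'A'); // not an IUPAC character
} // namespace sketch
```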
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna4 letter1{'A'_dna4};
auto letter2 = 'A'_dna4;
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna4 letter1 = 'C'_rna4; // implicitly converted
seqan3::dna4 letter2{};
letter2 = 'C'_rna4; // implicitly converted
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_dna4 : public seqan3::dna4
{
// using seqan3::dna4::dna4; // uncomment to import implicit conversion shown by letter1
};
struct my_rna4 : public seqan3::rna4
{};
int main()
{
using namespace seqan3::literals;
// my_dna4 letter1 = 'C'_rna4; // NO automatic implicit conversion!
// seqan3::dna4 letter2 = my_rna4{}; // seqan3::dna4 only allows implicit conversion from seqan3::rna4!
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna4_vector vector{'A'_rna4, 'C'_rna4, 'G'_rna4}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::dna4_vector dna4_vector{"ACGT"_rna4};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::rna4_vector rna4_vector = "ACGT"_rna4;
seqan3::dna4_vector dna4_vector{rna4_vector.begin(), rna4_vector.end()};
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna4_vector vector = "ACG"_dna4;
auto rna4_view = vector | seqan3::views::convert<seqan3::rna4>;
for (auto && chr : rna4_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::rna4 &&>);
}
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna4_vector sequence1{"ACGTTA"_dna4};
seqan3::dna4_vector sequence2 = "ACGTTA"_dna4;
auto sequence3 = "ACGTTA"_dna4;
}
int main()
{
using namespace seqan3::literals;
seqan3::dna5 letter{'A'_dna5};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('F'); // Unknown characters are implicitly converted to N.
seqan3::debug_stream << letter << '\n'; // prints "N"
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna5 letter1{'A'_dna5};
auto letter2 = 'A'_dna5;
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna5 letter1 = 'C'_rna5; // implicitly converted
seqan3::dna5 letter2{};
letter2 = 'C'_rna5; // implicitly converted
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_dna5 : public seqan3::dna5
{
// using seqan3::dna5::dna5; // uncomment to import implicit conversion shown by letter1
};
struct my_rna5 : public seqan3::rna5
{};
int main()
{
using namespace seqan3::literals;
// my_dna5 letter1 = 'C'_rna5; // NO automatic implicit conversion!
// seqan3::dna5 letter2 = my_rna5{}; // seqan3::dna5 only allows implicit conversion from seqan3::rna5!
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector vector{'A'_rna5, 'C'_rna5, 'G'_rna5}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::dna5_vector dna5_vector{"ACGT"_rna5};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::rna5_vector rna5_vector = "ACGT"_rna5;
seqan3::dna5_vector dna5_vector{rna5_vector.begin(), rna5_vector.end()};
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector vector = "ACG"_dna5;
auto rna5_view = vector | seqan3::views::convert<seqan3::rna5>;
for (auto && chr : rna5_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::rna5 &&>);
}
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector sequence1{"ACGTTA"_dna5};
seqan3::dna5_vector sequence2 = "ACGTTA"_dna5;
auto sequence3 = "ACGTTA"_dna5;
}
int main()
{
using namespace seqan3::literals;
seqan3::rna15 letter{'A'_rna15};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('F'); // Unknown characters are implicitly converted to N.
seqan3::debug_stream << letter << '\n'; // prints "N"
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna15 letter1{'A'_rna15};
auto letter2 = 'A'_rna15;
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna15 letter1 = 'C'_dna15; // implicitly converted
seqan3::rna15 letter2{};
letter2 = 'C'_dna15; // implicitly converted
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_rna15 : public seqan3::rna15
{
// using seqan3::rna15::rna15; // uncomment to import implicit conversion shown by letter1
};
struct my_dna15 : public seqan3::dna15
{};
int main()
{
using namespace seqan3::literals;
// my_rna15 letter1 = 'C'_dna15; // NO automatic implicit conversion!
// seqan3::rna15 letter2 = my_dna15{}; // seqan3::rna15 only allows implicit conversion from seqan3::dna15!
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna15_vector vector{'A'_dna15, 'C'_dna15, 'G'_dna15}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::rna15_vector rna15_vector{"ACGT"_dna15};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::dna15_vector dna15_vector = "ACGT"_dna15;
seqan3::rna15_vector rna15_vector{dna15_vector.begin(), dna15_vector.end()};
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna15_vector vector = "ACG"_rna15;
auto dna15_view = vector | seqan3::views::convert<seqan3::dna15>;
for (auto && chr : dna15_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::dna15 &&>);
}
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna15_vector sequence1{"ACGTTA"_rna15};
seqan3::rna15_vector sequence2 = "ACGTTA"_rna15;
auto sequence3 = "ACGTTA"_rna15;
}
int main()
{
using namespace seqan3::literals;
seqan3::rna4 letter{'A'_rna4};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('F'); // Unknown characters are implicitly converted to A.
seqan3::debug_stream << letter << '\n'; // prints "A"
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna4 letter1{'A'_rna4};
auto letter2 = 'A'_rna4;
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna4 letter1 = 'C'_dna4; // implicitly converted
seqan3::rna4 letter2{};
letter2 = 'C'_dna4; // implicitly converted
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_rna4 : public seqan3::rna4
{
// using seqan3::rna4::rna4; // uncomment to import implicit conversion shown by letter1
};
struct my_dna4 : public seqan3::dna4
{};
int main()
{
using namespace seqan3::literals;
// my_rna4 letter1 = 'C'_dna4; // NO automatic implicit conversion!
// seqan3::rna4 letter2 = my_dna4{}; // seqan3::rna4 only allows implicit conversion from seqan3::dna4!
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna4_vector vector{'A'_dna4, 'C'_dna4, 'G'_dna4}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::rna4_vector rna4_vector{"ACGT"_dna4};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::dna4_vector dna4_vector = "ACGT"_dna4;
seqan3::rna4_vector rna4_vector{dna4_vector.begin(), dna4_vector.end()};
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna4_vector vector = "ACG"_rna4;
auto dna4_view = vector | seqan3::views::convert<seqan3::dna4>;
for (auto && chr : dna4_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::dna4 &&>);
}
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna4_vector sequence1{"ACGTTA"_rna4};
seqan3::rna4_vector sequence2 = "ACGTTA"_rna4;
auto sequence3 = "ACGTTA"_rna4;
}
int main()
{
using namespace seqan3::literals;
seqan3::rna5 letter{'A'_rna5};
letter.assign_char('C');
seqan3::debug_stream << letter << '\n'; // prints "C"
letter.assign_char('F'); // Unknown characters are implicitly converted to N.
seqan3::debug_stream << letter << '\n'; // prints "N"
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_char_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna5 letter1{'A'_rna5};
auto letter2 = 'A'_rna5;
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna5 letter1 = 'C'_dna5; // implicitly converted
seqan3::rna5 letter2{};
letter2 = 'C'_dna5; // implicitly converted
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_inherit.cpp.in
struct my_rna5 : public seqan3::rna5
{
// using seqan3::rna5::rna5; // uncomment to import implicit conversion shown by letter1
};
struct my_dna5 : public seqan3::dna5
{};
int main()
{
using namespace seqan3::literals;
// my_rna5 letter1 = 'C'_dna5; // NO automatic implicit conversion!
// seqan3::rna5 letter2 = my_dna5{}; // seqan3::rna5 only allows implicit conversion from seqan3::dna5!
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_vector.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna5_vector vector{'A'_dna5, 'C'_dna5, 'G'_dna5}; // (element-wise) implicit conversion
// but this won't work:
// seqan3::rna5_vector rna5_vector{"ACGT"_dna5};
// as a workaround you can use:
// side note: this would also work without the implicit conversion.
seqan3::dna5_vector dna5_vector = "ACGT"_dna5;
seqan3::rna5_vector rna5_vector{dna5_vector.begin(), dna5_vector.end()};
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_implicit_conversion_from_@source_alphabet@_views.cpp.in
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::rna5_vector vector = "ACG"_rna5;
auto dna5_view = vector | seqan3::views::convert<seqan3::dna5>;
for (auto && chr : dna5_view) // converts lazily on-the-fly
{
static_assert(std::same_as<decltype(chr), seqan3::dna5 &&>);
}
}
// generated from test/snippet/alphabet/nucleotide/@target_alphabet@_literal.cpp.in
int main()
{
using namespace seqan3::literals;
seqan3::rna5_vector sequence1{"ACGTTA"_rna5};
seqan3::rna5_vector sequence2 = "ACGTTA"_rna5;
auto sequence3 = "ACGTTA"_rna5;
}
int main()
{
using namespace seqan3::literals;
seqan3::phred42 letter{'@'_phred42};
letter.assign_char('!');
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "0"
seqan3::debug_stream << letter.to_char() << '\n'; // prints "!"
letter.assign_phred(49); // Values exceeding the maximum are implicitly limited to the maximum phred value.
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "41"
}
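The saturating behaviour shown above can be sketched without SeqAn. This is a minimal illustration, assuming the common Phred+33 ASCII encoding; the helper names `clamp_phred`, `phred_to_char` and `char_to_phred` are hypothetical and not part of the library, and clamping at the lower bound is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helpers (not SeqAn API): Phred+33 encoding, where '!' (ASCII 33) encodes score 0.
constexpr int phred_offset = 33;

// Limit a score to the closed interval [0, max_score].
inline int clamp_phred(int score, int max_score)
{
    return std::max(0, std::min(score, max_score));
}

// Encode a (clamped) Phred score as its printable character.
inline char phred_to_char(int score, int max_score)
{
    return static_cast<char>(clamp_phred(score, max_score) + phred_offset);
}

// Decode a printable quality character back into its Phred score.
inline int char_to_phred(char c)
{
    assert(c >= '!'); // characters below '!' are not valid in this encoding
    return c - phred_offset;
}
```

With a maximum of 41, `clamp_phred(49, 41)` yields 41, which matches the value printed by the snippet above after `assign_phred(49)`.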
int main()
{
using namespace seqan3::literals;
seqan3::phred42 letter1{'!'_phred42};
auto letter2 = '!'_phred42;
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::phred42> sequence1{"##!!##"_phred42};
std::vector<seqan3::phred42> sequence2 = "##!!##"_phred42;
auto sequence3 = "##!!##"_phred42;
}
int main()
{
using namespace seqan3::literals;
seqan3::phred63 letter{'@'_phred63};
letter.assign_char('!');
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "0"
seqan3::debug_stream << letter.to_char() << '\n'; // prints "!"
letter.assign_phred(72); // Values exceeding the maximum are implicitly limited to the maximum phred value.
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "62"
}
int main()
{
using namespace seqan3::literals;
seqan3::phred63 letter1{'!'_phred63};
auto letter2 = '!'_phred63;
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::phred63> sequence1{"##!!##"_phred63};
std::vector<seqan3::phred63> sequence2 = "##!!##"_phred63;
auto sequence3 = "##!!##"_phred63;
}
int main()
{
using namespace seqan3::literals;
seqan3::phred68solexa letter{'@'_phred68solexa};
letter.assign_char(';');
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "-5"
seqan3::debug_stream << letter.to_char() << '\n'; // prints ";"
letter.assign_phred(72); // Values exceeding the maximum are implicitly limited to the maximum phred value.
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "62"
}
int main()
{
using namespace seqan3::literals;
seqan3::phred68solexa letter1{'!'_phred68solexa};
auto letter2 = '!'_phred68solexa;
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::phred68solexa> sequence1{"##!!##"_phred68solexa};
std::vector<seqan3::phred68solexa> sequence2 = "##!!##"_phred68solexa;
auto sequence3 = "##!!##"_phred68solexa;
}
int main()
{
using namespace seqan3::literals;
seqan3::phred94 letter{'@'_phred94};
letter.assign_char('!');
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "0"
seqan3::debug_stream << letter.to_char() << '\n'; // prints "!"
letter.assign_phred(99); // Values exceeding the maximum are implicitly limited to the maximum phred value.
seqan3::debug_stream << letter.to_phred() << '\n'; // prints "93"
}
int main()
{
using namespace seqan3::literals;
seqan3::phred94 letter1{'!'_phred94};
auto letter2 = '!'_phred94;
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::phred94> sequence1{"##!!##"_phred94};
std::vector<seqan3::phred94> sequence2 = "##!!##"_phred94;
auto sequence3 = "##!!##"_phred94;
}
int main()
{
using namespace seqan3::literals;
using seqan3::get;
seqan3::qualified<seqan3::dna4, seqan3::phred42> letter{'A'_dna4, '('_phred42};
seqan3::debug_stream << seqan3::to_rank(letter) << ' ' // 7
<< seqan3::to_rank(get<0>(letter)) << ' ' // 0
<< seqan3::to_rank(get<1>(letter)) << '\n'; // 7
seqan3::debug_stream << seqan3::to_char(letter) << ' ' // A
<< seqan3::to_char(get<0>(letter)) << ' ' // A
<< seqan3::to_char(get<1>(letter)) << '\n'; // (
seqan3::debug_stream << seqan3::to_phred(letter) << ' ' // 7
<< seqan3::to_phred(get<1>(letter)) << '\n'; // 7
// Modify:
get<0>(letter) = 'G'_dna4;
seqan3::debug_stream << seqan3::to_char(letter) << '\n'; // G
}
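The composite rank printed above (7 for 'A' combined with a quality of rank 7) is consistent with the formula first_rank * size_of_second_alphabet + second_rank. The sketch below illustrates that arithmetic in plain C++. It is inferred from the printed values rather than taken from the library's implementation, and `composite_rank` and `split_rank` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not SeqAn API): combine two component ranks into one composite rank.
inline std::size_t composite_rank(std::size_t first_rank, std::size_t second_rank, std::size_t second_size)
{
    assert(second_rank < second_size); // the second rank must fit into its alphabet
    return first_rank * second_size + second_rank;
}

// Inverse operation: recover {first_rank, second_rank} from a composite rank.
inline std::pair<std::size_t, std::size_t> split_rank(std::size_t rank, std::size_t second_size)
{
    assert(second_size > 0);
    return {rank / second_size, rank % second_size};
}
```

For the example above, the nucleotide rank is 0, the quality rank is 7 and the quality alphabet has 42 values, so `composite_rank(0, 7, 42)` gives 7.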
int main()
{
using namespace seqan3::literals;
seqan3::dot_bracket3 letter{'.'_db3};
letter.assign_char('(');
seqan3::debug_stream << letter << '\n'; // prints "("
letter.assign_char('F'); // Unknown characters are implicitly converted to '.'.
seqan3::debug_stream << letter << '\n'; // prints "."
}
int main()
{
using namespace seqan3::literals;
seqan3::dot_bracket3 letter1{'('_db3};
auto letter2 = '('_db3;
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dot_bracket3> sequence1{".(..)."_db3};
std::vector<seqan3::dot_bracket3> sequence2 = ".(..)."_db3;
auto sequence3 = ".(..)."_db3;
}
int main()
{
using namespace seqan3::literals;
seqan3::dssp9 letter{'H'_dssp9};
letter.assign_char('B');
seqan3::debug_stream << letter << '\n'; // prints "B"
letter.assign_char('F'); // Unknown characters are implicitly converted to 'X'.
seqan3::debug_stream << letter << '\n'; // prints "X"
}
int main()
{
using namespace seqan3::literals;
seqan3::dssp9 letter1{'H'_dssp9};
auto letter2 = 'H'_dssp9;
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dssp9> sequence1{"EHHHHT"_dssp9};
std::vector<seqan3::dssp9> sequence2 = "EHHHHT"_dssp9;
auto sequence3 = "EHHHHT"_dssp9;
}
int main()
{
using namespace seqan3::literals;
using seqan3::get;
seqan3::structured_aa<seqan3::aa27, seqan3::dssp9> letter{'W'_aa27, 'B'_dssp9};
seqan3::debug_stream << seqan3::to_rank(letter) << ' ' // 199
<< seqan3::to_rank(get<0>(letter)) << ' ' // 22
<< seqan3::to_rank(get<1>(letter)) << '\n'; // 1
seqan3::debug_stream << seqan3::to_char(letter) << ' ' // W
<< seqan3::to_char(get<0>(letter)) << ' ' // W
<< seqan3::to_char(get<1>(letter)) << '\n'; // B
// Modify:
get<0>(letter) = 'V'_aa27;
seqan3::debug_stream << seqan3::to_char(letter) << '\n'; // V
}
int main()
{
using namespace seqan3::literals;
using seqan3::get;
seqan3::structured_rna<seqan3::rna4, seqan3::dot_bracket3> letter{'G'_rna4, '('_db3};
seqan3::debug_stream << seqan3::to_rank(letter) << ' ' // 7
<< seqan3::to_rank(get<0>(letter)) << ' ' // 2
<< seqan3::to_rank(get<1>(letter)) << '\n'; // 1
seqan3::debug_stream << seqan3::to_char(letter) << ' ' // G
<< seqan3::to_char(get<0>(letter)) << ' ' // G
<< seqan3::to_char(get<1>(letter)) << '\n'; // (
// Modify:
get<0>(letter) = 'U'_rna4;
seqan3::debug_stream << seqan3::to_char(letter) << '\n'; // U
}
int main()
{
using namespace seqan3::literals;
seqan3::wuss51 letter{':'_wuss51};
letter.assign_char('~');
seqan3::debug_stream << letter << '\n'; // prints "~"
letter.assign_char('#'); // Unknown characters are implicitly converted to ';'.
seqan3::debug_stream << letter << '\n'; // prints ";"
}
int main()
{
using namespace seqan3::literals;
seqan3::wuss51 letter1{'('_wuss51};
auto letter2 = '('_wuss51;
}
#include <iostream>
int main()
{
using namespace seqan3::literals;
bool is_closing_char_member = '}'_wuss51.is_pair_close();
bool is_closing_char_free = seqan3::is_pair_close('.'_wuss51);
std::cout << std::boolalpha << is_closing_char_member << '\n'; // true
std::cout << std::boolalpha << is_closing_char_free << '\n'; // false
}
#include <iostream>
int main()
{
using namespace seqan3::literals;
bool is_opening_char_member = '{'_wuss51.is_pair_open();
bool is_opening_char_free = seqan3::is_pair_open('.'_wuss51);
std::cout << std::boolalpha << is_opening_char_member << '\n'; // true
std::cout << std::boolalpha << is_opening_char_free << '\n'; // false
}
#include <iostream>
int main()
{
using namespace seqan3::literals;
bool is_unpaired_char_member = '.'_wuss51.is_unpaired();
bool is_unpaired_char_free = seqan3::is_unpaired('{'_wuss51);
std::cout << std::boolalpha << is_unpaired_char_member << '\n'; // true
std::cout << std::boolalpha << is_unpaired_char_free << '\n'; // false
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::wuss51> sequence1{".<..>."_wuss51};
std::vector<seqan3::wuss51> sequence2 = ".<..>."_wuss51;
auto sequence3 = ".<..>."_wuss51;
}
#include <iostream>
int main()
{
uint8_t max_depth_member = seqan3::wuss51::max_pseudoknot_depth;
uint8_t max_depth_meta = seqan3::max_pseudoknot_depth<seqan3::wuss51>;
std::cout << static_cast<uint16_t>(max_depth_member) << '\n'; // 22
std::cout << static_cast<uint16_t>(max_depth_meta) << '\n'; // 22
}
int main()
{
using namespace seqan3::literals;
auto pk_opt = '.'_wuss51.pseudoknot_id(); // std::optional -> false
pk_opt = seqan3::pseudoknot_id('{'_wuss51); // std::optional -> true: 3
if (pk_opt)
seqan3::debug_stream << *pk_opt << '\n'; // 3
}
int main()
{
std::string_view str{"ACTTTGATAN"};
try
{
seqan3::debug_stream << (str | seqan3::views::char_strictly_to<seqan3::dna4>); // ACTTTGATA
}
catch (seqan3::invalid_char_assignment const &)
{
seqan3::debug_stream << "\n[ERROR] Invalid char!\n"; // Will throw on parsing 'N'
}
}
int main()
{
std::string str{"ACTTTGATAN"};
seqan3::debug_stream << (str | seqan3::views::char_to<seqan3::dna4>) << '\n'; // ACTTTGATAA
seqan3::debug_stream << (str | seqan3::views::char_to<seqan3::dna5>) << '\n'; // ACTTTGATAN
}
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector foo{"ACGTA"_dna5};
// pipe notation
auto v = foo | seqan3::views::complement;
seqan3::debug_stream << v << '\n'; // TGCAT
// function notation
auto v2 = seqan3::views::complement(foo);
seqan3::debug_stream << v2 << '\n'; // TGCAT
// generate the reverse complement:
auto v3 = foo | seqan3::views::complement | std::views::reverse;
seqan3::debug_stream << v3 << '\n'; // TACGT
}
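For comparison, the same complement and reverse-complement logic can be sketched on a plain `std::string`. This is an eager, copying illustration rather than a lazy view; the function names are hypothetical, and mapping every unrecognised character to 'N' is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper (not SeqAn API): complement of a single upper-case nucleotide character.
inline char complement_base(char base)
{
    switch (base)
    {
    case 'A': return 'T';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'T': return 'A';
    default:  return 'N'; // assumption: anything else is treated as unknown
    }
}

// Complement every character of the sequence (works on a copy).
inline std::string complement_sequence(std::string sequence)
{
    std::transform(sequence.begin(), sequence.end(), sequence.begin(), complement_base);
    return sequence;
}

// Complement first, then reverse, mirroring `complement | reverse` above.
inline std::string reverse_complement(std::string sequence)
{
    sequence = complement_sequence(std::move(sequence));
    std::reverse(sequence.begin(), sequence.end());
    return sequence;
}
```

Applied to "ACGTA", `complement_sequence` returns "TGCAT" and `reverse_complement` returns "TACGT", the same strings as in the comments of the snippet above.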
#include <vector>
#include <seqan3/alphabet/quality/aliases.hpp> // includes seqan3::dna4q
int main()
{
using namespace seqan3::literals;
seqan3::dna4_vector vec = "ACTTTGATA"_dna4;
auto v = vec | seqan3::views::to_char;
seqan3::debug_stream << v << '\n'; // [A,C,T,T,T,G,A,T,A]
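// Quality values with Phred scores 0, 7, 5, 3, 7, 4, 30, 16, 23.
std::vector<seqan3::phred42> qvec = "!(&$(%?18"_phred42;
auto v3 = qvec | seqan3::views::to_char;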
seqan3::debug_stream << v3 << '\n'; // [!,(,&,$,(,%,?,1,8]
std::vector<seqan3::dna4q> qcvec{{'C'_dna4, '!'_phred42},
{'A'_dna4, '('_phred42},
{'G'_dna4, '&'_phred42},
{'T'_dna4, '$'_phred42},
{'G'_dna4, '('_phred42},
{'A'_dna4, '%'_phred42},
{'C'_dna4, '?'_phred42},
{'T'_dna4, '1'_phred42},
{'A'_dna4, '8'_phred42}};
auto v4 = qcvec | seqan3::views::to_char;
seqan3::debug_stream << v4 << '\n'; // [C,A,G,T,G,A,C,T,A]
}
#include <vector>
int main()
{
using namespace seqan3::literals;
seqan3::dna4_vector vec = "ACTTTGATA"_dna4;
auto v = vec | seqan3::views::to_rank;
seqan3::debug_stream << v << '\n'; // [0,1,3,3,3,2,0,3,0]
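// Quality values with Phred scores 0, 7, 5, 3, 7, 4, 30, 16, 23.
std::vector<seqan3::phred42> qvec = "!(&$(%?18"_phred42;
auto v3 = qvec | seqan3::views::to_rank;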
seqan3::debug_stream << v3 << '\n'; // [0,7,5,3,7,4,30,16,23]
std::vector<seqan3::dna4q> qcvec{{'C'_dna4, '!'_phred42},
{'A'_dna4, '('_phred42},
{'G'_dna4, '&'_phred42},
{'T'_dna4, '$'_phred42},
{'G'_dna4, '('_phred42},
{'A'_dna4, '%'_phred42},
{'C'_dna4, '?'_phred42},
{'T'_dna4, '1'_phred42},
{'A'_dna4, '8'_phred42}};
auto v4 = qcvec | seqan3::views::to_rank;
seqan3::debug_stream << v4 << '\n'; // [42,7,89,129,91,4,72,142,23]
}
#include <vector>
int main()
{
std::vector<int> vec{0, 1, 3, 3, 3, 2, 0, 3, 0};
seqan3::debug_stream << (vec | seqan3::views::rank_to<seqan3::dna4>) << '\n'; // ACTTTGATA
seqan3::debug_stream << (vec | seqan3::views::rank_to<seqan3::dna5>) << '\n'; // ACNNNGANA
}
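The different results for the two alphabets follow from their rank orders: rank 3 is 'T' in a four-letter DNA alphabet but 'N' in the five-letter one. The sketch below reproduces the mapping with plain lookup tables. The tables and the function `ranks_to_string` are hypothetical and only mirror the outputs shown above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical lookup tables (not SeqAn API); index == rank, value == character.
constexpr std::array<char, 4> dna4_chars{'A', 'C', 'G', 'T'};
constexpr std::array<char, 5> dna5_chars{'A', 'C', 'G', 'N', 'T'};

// Translate a list of ranks into characters using the given table.
template <std::size_t size>
std::string ranks_to_string(std::vector<int> const & ranks, std::array<char, size> const & table)
{
    std::string result;
    result.reserve(ranks.size());
    for (int rank : ranks)
    {
        assert(rank >= 0 && static_cast<std::size_t>(rank) < size); // rank must be valid for the table
        result.push_back(table[static_cast<std::size_t>(rank)]);
    }
    return result;
}
```

Calling `ranks_to_string` with the ranks 0, 1, 3, 3, 3, 2, 0, 3, 0 gives "ACTTTGATA" for `dna4_chars` and "ACNNNGANA" for `dna5_chars`.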
#include <iostream>
int main()
{
using namespace seqan3::literals;
auto char_to_char = seqan3::to_char('A'); // calls seqan3::custom::to_char('A')
auto dna5_to_char = seqan3::to_char('A'_dna5); // calls .to_char() member
std::cout << char_to_char << '\n'; // A
std::cout << dna5_to_char << '\n'; // A
}
#include <iostream>
int main()
{
using namespace seqan3::literals;
auto char_to_rank = seqan3::to_rank('A'); // calls seqan3::custom::to_rank('A')
static_assert(std::same_as<decltype(char_to_rank), uint8_t>);
std::cout << static_cast<uint16_t>(char_to_rank) << '\n'; // 65
auto dna5_to_rank = seqan3::to_rank('A'_dna5); // calls .to_rank() member
static_assert(std::same_as<decltype(dna5_to_rank), uint8_t>);
std::cout << static_cast<uint16_t>(dna5_to_rank) << '\n'; // 0
}
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector vec{"ACGTACGTACGTA"_dna5};
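// Default (first forward frame)
auto v1 = vec | seqan3::views::translate_single;
// == [T,Y,V,R]
seqan3::debug_stream << v1[1] << '\n';
// First forward frame
auto v2 = vec | seqan3::views::translate_single(seqan3::translation_frames::forward_frame0);
// == [T,Y,V,R]
// First reverse frame
auto v3 = vec | seqan3::views::translate_single(seqan3::translation_frames::reverse_frame0);
// == [Y,V,R,T]
// Second forward frame
auto v4 = vec | seqan3::views::translate_single(seqan3::translation_frames::forward_frame1);
// == [R,T,Y,V]
// Second reverse frame
auto v5 = vec | seqan3::views::translate_single(seqan3::translation_frames::reverse_frame1);
// == [T,Y,V,R]
// Third forward frame
auto v6 = vec | seqan3::views::translate_single(seqan3::translation_frames::forward_frame2);
// == [V,R,T]
// Third reverse frame
auto v7 = vec | seqan3::views::translate_single(seqan3::translation_frames::reverse_frame2);
// == [R,T,Y]
// function syntax
auto v8 = seqan3::views::translate_single(vec, seqan3::translation_frames::forward_frame0);
// == [T,Y,V,R]
// combinability
auto v9 =
vec | seqan3::views::complement | seqan3::views::translate_single(seqan3::translation_frames::reverse_frame0);
// == [M,H,A,C]
// combinability with default parameter
auto v10 = vec | seqan3::views::complement | seqan3::views::translate_single;
// == [C,M,H,A]
// combinability with default parameter (explicit call syntax)
auto v11 = vec | seqan3::views::complement | seqan3::views::translate_single();
// == [C,M,H,A]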
}
#include <iostream>
using namespace seqan3::literals;
int main()
{
// Input range needs to be two-dimensional
std::vector<std::vector<seqan3::dna4>> vec{"ACGTACGTACGTA"_dna4, "TCGAGAGCTTTAGC"_dna4};
// Translation with default parameters
auto v1 = vec | seqan3::views::translate_join;
seqan3::debug_stream << v1 << "\n"; // [TYVR,RTYV,VRT,YVRT,TYVR,RTY,SRAL,REL*,ESFS,AKAL,LKLS,*SSR]
// Access the third forward frame (index_frame 2) of the second input sequence (index_seq 1)
// Required frames per sequence s = 6
// n = (index_seq * s) + index_frame
// = 1 * 6 + 2
// = 8
auto third_frame_second_seq = v1[1 * 6 + 2];
seqan3::debug_stream << third_frame_second_seq << "\n"; // ESFS
// Translation with custom translation frame
auto v2 = vec | seqan3::views::translate_join(seqan3::translation_frames::forward_frame0);
seqan3::debug_stream << v2 << "\n"; // [TYVR,SRAL]
return 0;
}
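The index arithmetic in the comments above can be wrapped in a small helper. This is a minimal sketch in plain C++; `flat_frame_index` is a hypothetical name and not part of SeqAn.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not SeqAn API): position of a translated frame in the flattened output,
// given the index of the input sequence and the index of the frame within that sequence.
inline std::size_t flat_frame_index(std::size_t sequence_index,
                                    std::size_t frame_index,
                                    std::size_t frames_per_sequence)
{
    assert(frame_index < frames_per_sequence); // the frame must exist for every sequence
    return sequence_index * frames_per_sequence + frame_index;
}
```

For the third forward frame of the second sequence with six frames per sequence, `flat_frame_index(1, 2, 6)` gives 8, which is the position accessed above.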
int main()
{
using namespace seqan3::literals;
seqan3::dna5_vector vec{"ACGTACGTACGTA"_dna5};
// default frame translation
auto v1 = vec | seqan3::views::translate;
seqan3::debug_stream << v1 << '\n'; // [TYVR,RTYV,VRT,YVRT,TYVR,RTY]
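// single frame translation
auto v2 = vec | seqan3::views::translate(seqan3::translation_frames::forward_frame0);
seqan3::debug_stream << v2 << '\n'; // [TYVR]
// first forward and first reverse frame
auto v3 = vec | seqan3::views::translate(seqan3::translation_frames::forward_reverse0);
seqan3::debug_stream << v3 << '\n'; // [TYVR,YVRT]
// forward frames translation
auto v4 = vec | seqan3::views::translate(seqan3::translation_frames::forward_frames);
seqan3::debug_stream << v4 << '\n'; // [TYVR,RTYV,VRT]
// six frame translation
auto v5 = vec | seqan3::views::translate(seqan3::translation_frames::six_frames);
seqan3::debug_stream << v5 << '\n'; // [TYVR,RTYV,VRT,YVRT,TYVR,RTY]
// function syntax
auto v6 = seqan3::views::translate(vec, seqan3::translation_frames::forward_reverse0);
seqan3::debug_stream << v6 << '\n'; // [TYVR,YVRT]
// combinability
auto v7 = vec | seqan3::views::complement | seqan3::views::translate(seqan3::translation_frames::forward_reverse0);
seqan3::debug_stream << v7 << '\n'; // [CMHA,MHAC]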
}
#include <string>
#include <vector>
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dna5q> vec{{'A'_dna5, 'I'_phred42},
{'G'_dna5, 'I'_phred42},
{'G'_dna5, '?'_phred42},
{'A'_dna5, '5'_phred42},
{'T'_dna5, '+'_phred42}};
// trim by phred_value
auto v1 = vec | seqan3::views::trim_quality(20u);
seqan3::debug_stream << v1 << '\n'; // AGGA
// trim by quality character; in this case the nucleotide part of the character is irrelevant
auto v2 = vec | seqan3::views::trim_quality(seqan3::dna5q{'C'_dna5, '5'_phred42});
seqan3::debug_stream << v2 << '\n'; // AGGA
// combinability
auto v3 = seqan3::views::trim_quality(vec, 20u) | seqan3::views::to_char;
seqan3::debug_stream << v3 << '\n'; // AGGA
}
#include <string>
#include <vector>
using namespace seqan3::literals;
int main()
{
std::vector<seqan3::phred42> vec{"II?5+"_phred42};
// trim by phred_value
auto v1 = vec | seqan3::views::trim_quality(20u);
seqan3::debug_stream << v1 << '\n'; // II?5
// trim by quality character
auto v2 = vec | seqan3::views::trim_quality('I'_phred42);
seqan3::debug_stream << v2 << '\n'; // II
// function syntax
auto v3 = seqan3::views::trim_quality(vec, '5'_phred42);
seqan3::debug_stream << v3 << '\n'; // II?5
// combinability
auto v4 = seqan3::views::trim_quality(vec, 20u) | seqan3::views::to_char;
seqan3::debug_stream << v4 << '\n'; // II?5
}
int main()
{
std::string_view str{"ACTTTGATAN"};
try
{
seqan3::debug_stream << (str | seqan3::views::validate_char_for<seqan3::dna4>); // ACTTTGATA
}
catch (seqan3::invalid_char_assignment const &)
{
seqan3::debug_stream << "\n[ERROR] Invalid char!\n"; // Will throw on parsing 'N'
}
}
int main(int argc, char ** argv)
{
seqan3::argument_parser myparser{"Grade-Average", argc, argv}; // initialize
std::string name{"Max Muster"}; // define default values directly in the variable.
bool bonus{false};
std::vector<double> grades{}; // you can also specify a vector that is treated as a list option.
myparser.add_option(name, 'n', "name", "Please specify your name.");
myparser.add_flag(bonus, 'b', "bonus", "Please specify if you got the bonus.");
myparser.add_positional_option(grades, "Please specify your grades.");
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << '\n'; // customize your error message
return -1;
}
if (bonus)
grades.push_back(1.0); // extra good grade
double avg{0};
for (auto g : grades)
avg += g;
avg = avg / grades.size();
seqan3::debug_stream << name << " has an average grade of " << avg << '\n';
return 0;
}
int main(int argc, char ** argv)
{
seqan3::argument_parser myparser{"The-Age-App", argc, argv}; // initialize
int age{30}; // define default values directly in the variable
myparser.add_option(age, 'a', "user-age", "Please specify your age.");
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "The-Age-App - [PARSER ERROR] " << ext.what() << '\n'; // customize your error message
return -1;
}
seqan3::debug_stream << "integer given by user: " << age << '\n';
return 0;
}
int main(int argc, char ** argv)
{
seqan3::argument_parser myparser{"Penguin_Parade", argc, argv}; // initialize
myparser.info.version = "2.0.0";
myparser.info.date = "12.01.2017";
myparser.info.short_description = "Organize your penguin parade";
myparser.info.description.push_back("First Paragraph.");
myparser.info.description.push_back("Second Paragraph.");
myparser.info.examples.push_back("./penguin_parade Skipper Kowalski Rico Private -d 10 -m 02 -y 2017");
int d{01}; // day
int m{01}; // month
int y{2050}; // year
myparser.add_option(d, 'd', "day", "Please specify your preferred day.");
myparser.add_option(m, 'm', "month", "Please specify your preferred month.");
myparser.add_option(y, 'y', "year", "Please specify your preferred year.");
std::vector<std::string> penguin_names;
myparser.add_positional_option(penguin_names, "Specify the names of the penguins.");
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << ext.what() << "\n";
return -1;
}
// organize ...
return 0;
}
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv};
std::string myvar{"Example"};
myparser.add_option(myvar, 's', "special-op", "You know what you doin'?", seqan3::option_spec::advanced);
}
#include <ranges>
#include <system_error>
namespace seqan3::custom
{
// Specialise the seqan3::custom::argument_parsing data structure to enable parsing of std::errc.
template <>
struct argument_parsing<std::errc>
{
// Specialise a mapping from an identifying string to the respective value of std::errc.
static inline std::unordered_map<std::string_view, std::errc> const enumeration_names{
{"no_error", std::errc{}},
{"timed_out", std::errc::timed_out},
{"invalid_argument", std::errc::invalid_argument},
{"io_error", std::errc::io_error}};
};
} // namespace seqan3::custom
int main(int argc, char const * argv[])
{
std::errc value{};
seqan3::argument_parser parser{"my_program", argc, argv};
// Because of the argument_parsing struct and
// its static data member enumeration_names
// you can now add an option that takes a value of type std::errc:
parser.add_option(value,
'e',
"errc",
"Give me a std::errc value.",
seqan3::option_spec::standard,
seqan3::value_list_validator{(seqan3::enumeration_names<std::errc> | std::views::values)});
try
{
parser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
return 0;
}
namespace foo
{
enum class bar
{
one,
two,
three
};
// Specialise a mapping from an identifying string to the respective value of your type bar.
auto enumeration_names(bar)
{
return std::unordered_map<std::string_view, bar>{{"one", bar::one}, {"two", bar::two}, {"three", bar::three}};
}
} // namespace foo
int main(int argc, char const * argv[])
{
foo::bar value{};
seqan3::argument_parser parser{"my_program", argc, argv};
// Because of the enumeration_names function
// you can now add an option that takes a value of type bar:
parser.add_option(value,
'f',
"foo",
"Give me a foo value.",
seqan3::option_spec::standard,
seqan3::value_list_validator{(seqan3::enumeration_names<foo::bar> | std::views::values)});
try
{
parser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
}
int main(int argc, char ** argv)
{
seqan3::argument_parser myparser{"awesome-app", argc, argv}; // initialize
int a{3};
myparser.add_option(a, 'a', "awesome-parameter", "Please specify an integer.");
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << '\n'; // customize your error message
return -1;
}
if (myparser.is_option_set('a'))
seqan3::debug_stream << "The user set option -a on the command line.\n";
if (myparser.is_option_set("awesome-parameter"))
seqan3::debug_stream << "The user set option --awesome-parameter on the command line.\n";
// Asking for an option identifier that was not used before throws an error:
// myparser.is_option_set("foo"); // throws seqan3::design_error
return 0;
}
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
int myint;
seqan3::arithmetic_range_validator my_validator{2, 10};
myparser.add_option(myint, 'i', "integer", "Give me a number.", seqan3::option_spec::standard, my_validator);
// an exception will be thrown if the user specifies an integer
// that is not in range [2,10] (e.g. "./test_app -i 15")
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "integer given by user passed validation: " << myint << "\n";
return 0;
}
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
int myint;
seqan3::value_list_validator my_validator{2, 4, 6, 8, 10};
myparser.add_option(myint, 'i', "integer", "Give me a number.", seqan3::option_spec::standard, my_validator);
// an exception will be thrown if the user specifies an integer
// that is not one of [2,4,6,8,10] (e.g. "./test_app -i 3")
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "integer given by user passed validation: " << myint << "\n";
return 0;
}
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::filesystem::path myfile{};
myparser.add_option(myfile,
'f',
"file",
"Give me a filename.",
seqan3::option_spec::standard,
seqan3::input_file_validator{{"fa", "fasta"}});
// an exception will be thrown if the user specifies a filename
// that does not have one of the extensions ["fa","fasta"],
// does not exist, or is not readable.
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "filename given by user passed validation: " << myfile << "\n";
return 0;
}
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::string my_string;
seqan3::regex_validator my_validator{"[a-zA-Z]+@[a-zA-Z]+\\.com"};
myparser.add_option(my_string, 's', "str", "Give me a string.", seqan3::option_spec::standard, my_validator);
// an exception will be thrown if the user specifies a string
// that is not an email address ending in .com
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "email address given by user passed validation: " << my_string << "\n";
return 0;
}
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::string file_name;
seqan3::regex_validator absolute_path_validator{"(/[^/]+)+/.*\\.[^/\\.]+$"};
seqan3::input_file_validator my_file_ext_validator{{"sa", "so"}};
myparser.add_option(file_name,
'f',
"file",
"Give me a file name with an absolute path.",
seqan3::option_spec::standard,
absolute_path_validator | my_file_ext_validator);
// an exception will be thrown if the user specifies a file name
// that is not an absolute path or does not have one of the file extensions [sa,so]
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
std::cout << "filename given by user passed validation: " << file_name << "\n";
return 0;
}
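The example above chains two validators with operator|, so a value has to pass both checks. As a rough, library-independent sketch of that idea (this is not SeqAn's implementation), a validator can be modelled as a callable that throws on invalid input, and chaining as a wrapper that calls both in order. All names in namespace sketch are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

namespace sketch // hypothetical names, not part of SeqAn
{
// Accepts integers in the closed interval [min, max].
struct range_validator
{
    int min;
    int max;

    void operator()(int const value) const
    {
        if (value < min || value > max)
            throw std::invalid_argument{"Value " + std::to_string(value) + " is not in range ["
                                        + std::to_string(min) + "," + std::to_string(max) + "]."};
    }
};

// Accepts even integers only.
struct even_validator
{
    void operator()(int const value) const
    {
        if (value % 2 != 0)
            throw std::invalid_argument{"Value " + std::to_string(value) + " is not even."};
    }
};

// Calls both wrapped validators in order; the first failure propagates as an exception.
template <typename first_t, typename second_t>
struct chained_validator
{
    first_t first;
    second_t second;

    void operator()(int const value) const
    {
        first(value);
        second(value);
    }
};

// Chaining operator; found via argument-dependent lookup for the validators in this namespace.
template <typename first_t, typename second_t>
chained_validator<first_t, second_t> operator|(first_t first, second_t second)
{
    return {first, second};
}

// Helper for this sketch: turns the exception-based interface into a bool.
template <typename validator_t>
bool accepts(validator_t const & validator, int const value)
{
    try
    {
        validator(value);
        return true;
    }
    catch (std::invalid_argument const &)
    {
        return false;
    }
}
} // namespace sketch
```

With `sketch::range_validator range{2, 10}` and `sketch::even_validator even{}`, the expression `range | even` accepts 4 and rejects both 3 (odd) and 12 (out of range).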
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::filesystem::path mydir{};
myparser.add_option(mydir,
'd',
"dir",
"The directory containing the input files.",
seqan3::option_spec::standard,
seqan3::input_directory_validator{});
// an exception will be thrown if the user specifies a directory that does not exist or has insufficient
// read permissions.
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "directory given by user passed validation: " << mydir << "\n";
return 0;
}
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::filesystem::path myfile{};
myparser.add_option(myfile,
'f',
"file",
"The input file containing the sequences.",
seqan3::option_spec::standard,
seqan3::input_file_validator{{"fa", "fasta"}});
// an exception will be thrown if the user specifies a filename
// that does not have one of the extensions ["fa","fasta"] or if the file does not exist/is not readable.
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "filename given by user passed validation: " << myfile << "\n";
return 0;
}
int main()
{
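// Default constructed validator has an empty extension list.
seqan3::input_file_validator validator1{};
seqan3::debug_stream << validator1.get_help_page_message() << '\n';
// Specify your own extensions for the input file.
seqan3::input_file_validator validator2{std::vector{std::string{"exe"}, std::string{"fasta"}}};
seqan3::debug_stream << validator2.get_help_page_message() << '\n';
// Give the seqan3 file type as a template argument to get all valid extensions for this file.
seqan3::input_file_validator<seqan3::sequence_file_input<>> validator3{};
seqan3::debug_stream << validator3.get_help_page_message() << '\n';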
return 0;
}
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::filesystem::path mydir{};
myparser.add_option(mydir,
'd',
"dir",
"The output directory for storing the files.",
seqan3::option_spec::standard,
seqan3::output_directory_validator{});
// an exception will be thrown if the user specifies a directory that cannot be created by the filesystem either
// because the parent path does not exist or the path has insufficient write permissions.
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "directory given by user passed validation: " << mydir << "\n";
return 0;
}
#include <filesystem>
int main(int argc, char const ** argv)
{
seqan3::argument_parser myparser{"Test", argc, argv}; // initialize
std::filesystem::path myfile{};
// Use the seqan3::output_file_open_options to indicate that you allow overwriting existing output files, ...
myparser.add_option(
myfile,
'f',
"file",
"Output file containing the processed sequences.",
seqan3::option_spec::standard,
seqan3::output_file_validator{seqan3::output_file_open_options::open_or_create, {"fa", "fasta"}});
// ... or that you will throw a seqan3::validation_error if the user specified output file already exists
myparser.add_option(myfile,
'g',
"file2",
"Output file containing the processed sequences.",
seqan3::option_spec::standard,
seqan3::output_file_validator{seqan3::output_file_open_options::create_new, {"fa", "fasta"}});
// an exception will be thrown if the user specifies a filename
// that does not have one of the extensions ["fa","fasta"],
// if the file already exists, or if the file is not writable.
try
{
myparser.parse();
}
catch (seqan3::argument_parser_error const & ext) // the user did something wrong
{
std::cerr << "[PARSER ERROR] " << ext.what() << "\n"; // customize your error message
return -1;
}
seqan3::debug_stream << "filename given by user passed validation: " << myfile << "\n";
return 0;
}
int main()
{
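// Default constructed validator has an empty extension list.
seqan3::output_file_validator validator1{};
seqan3::debug_stream << validator1.get_help_page_message() << '\n';
// Specify your own extensions for the output file.
seqan3::output_file_validator validator2{seqan3::output_file_open_options::create_new,
std::vector{std::string{"exe"}, std::string{"fasta"}}};
seqan3::debug_stream << validator2.get_help_page_message() << '\n';
// Give the seqan3 file type as a template argument to get all valid extensions for this file.
seqan3::output_file_validator<seqan3::sequence_file_output<>> validator3{seqan3::output_file_open_options::create_new};
seqan3::debug_stream << validator3.get_help_page_message() << '\n';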
return 0;
}
#include <iostream>
enum class my_enum
{
VAL1 = 1,
VAL2 = 2,
COMB = 3
};
template <>
constexpr bool seqan3::add_enum_bitwise_operators<my_enum> = true;
int main()
{
using seqan3::operator|;
my_enum e = my_enum::VAL1;
my_enum e2 = e | my_enum::VAL2;
std::cout << std::boolalpha << (e2 == my_enum::COMB) << '\n'; // true
}
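The snippet above opts an enum into bitwise operators by specialising a variable template. The following is a minimal, library-independent sketch of how such an opt-in mechanism could work. It is an assumption for illustration, not SeqAn's actual code; `sketch::enable_bitwise_operators`, `permission` and `combine` are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch // hypothetical names, not part of SeqAn
{
// Opt-in switch: specialise to true for enums that should get bitwise operators.
template <typename enum_t>
inline constexpr bool enable_bitwise_operators = false;

// operator| only participates in overload resolution for enums that opted in.
template <typename enum_t,
          typename = std::enable_if_t<std::is_enum_v<enum_t> && enable_bitwise_operators<enum_t>>>
constexpr enum_t operator|(enum_t const lhs, enum_t const rhs) noexcept
{
    using underlying_t = std::underlying_type_t<enum_t>;
    return static_cast<enum_t>(static_cast<underlying_t>(lhs) | static_cast<underlying_t>(rhs));
}
} // namespace sketch

enum class permission
{
    read = 1,
    write = 2,
    read_write = 3
};

// Opt in for this particular enum.
template <>
inline constexpr bool sketch::enable_bitwise_operators<permission> = true;

constexpr permission combine(permission const lhs, permission const rhs)
{
    using sketch::operator|; // the enum lives outside namespace sketch, so import the operator
    return lhs | rhs;
}

static_assert(combine(permission::read, permission::write) == permission::read_write);
```

Because the operator is constrained on the opt-in variable, enums that were not specialised keep the default behaviour of scoped enums and do not gain any operators by accident.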
#if SEQAN3_WITH_CEREAL
# include <fstream>
# include <vector>
# include <seqan3/test/tmp_directory.hpp>
# include <cereal/archives/binary.hpp> // includes the cereal::BinaryInputArchive and cereal::BinaryOutputArchive
# include <cereal/types/vector.hpp> // includes cerealisation support for std::vector
// Written for std::vector, other types also work.
void load(std::vector<int16_t> & data, std::filesystem::path const & tmp_file)
{
std::ifstream is(tmp_file, std::ios::binary); // Where input can be found.
cereal::BinaryInputArchive archive(is); // Create an input archive from the input stream.
archive(data); // Load data.
}
// Written for std::vector, other types also work.
void store(std::vector<int16_t> const & data, std::filesystem::path const & tmp_file)
{
std::ofstream os(tmp_file, std::ios::binary); // Where output should be stored.
cereal::BinaryOutputArchive archive(os); // Create an output archive from the output stream.
archive(data); // Store data.
}
int main()
{
// The following example is for a std::vector but any seqan3 data structure that is documented as serialisable
// could be used, e.g. fm_index.
seqan3::test::tmp_directory tmp{};
auto tmp_file = tmp.path() / "data.out"; // this is a temporary file path, use any other filename.
std::vector<int16_t> vec{1, 2, 3, 4};
store(vec, tmp_file); // Calls store on a std::vector.
// This vector is needed to load the information into it.
std::vector<int16_t> vec2;
load(vec2, tmp_file); // Calls load on a std::vector.
seqan3::debug_stream << vec2 << '\n'; // Prints [1,2,3,4].
return 0;
}
#endif
enum struct my_id : int
{
bar_id,
foo_id
};
namespace seqan3::detail
{
template <>
inline constexpr std::array<std::array<int, 2>, 2> compatibility_table<my_id>{{{0, 1}, {1, 0}}};
} // namespace seqan3::detail
int main()
{
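using seqan3::get;
seqan3::configuration my_cfg =
seqan3::align_cfg::gap_cost_affine{seqan3::align_cfg::open_score{0}, seqan3::align_cfg::extension_score{-1}}
| seqan3::align_cfg::band_fixed_size{seqan3::align_cfg::lower_diagonal{-4},
seqan3::align_cfg::upper_diagonal{4}};
// my_cfg is now of type configuration<gap_cost_affine, band_fixed_size>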
seqan3::debug_stream << get<1>(my_cfg).lower_diagonal << '\n'; // prints -4
seqan3::debug_stream << get<seqan3::align_cfg::band_fixed_size>(my_cfg).upper_diagonal << '\n'; // prints 4
seqan3::debug_stream << get<seqan3::align_cfg::gap_cost_affine>(my_cfg).extension_score << '\n'; // prints -1
}
// Initial setup used in the actual example:
enum struct my_id : int
{
bar_id,
foo_id
};
struct bar : private seqan3::pipeable_config_element
{
public:
float value{};
bar() = default;
bar(bar const &) = default;
bar(bar &&) = default;
bar & operator=(bar const &) = default;
bar & operator=(bar &&) = default;
~bar() = default;
bar(float v) : value{v}
{}
static constexpr my_id id{my_id::bar_id};
};
template <typename t>
struct foo : private seqan3::pipeable_config_element
{
public:
t value{};
foo() = default;
foo(foo const &) = default;
foo(foo &&) = default;
foo & operator=(foo const &) = default;
foo & operator=(foo &&) = default;
~foo() = default;
foo(t v) : value{std::move(v)}
{}
static constexpr my_id id{my_id::foo_id};
};
template <typename t>
foo(t) -> foo<t>;
int main()
{
seqan3::configuration my_cfg{foo{1}}; // Only foo<int> is present.
seqan3::debug_stream << my_cfg.get_or(foo{std::string{"hello"}}).value << '\n'; // finds foo<int> -> prints: 1
seqan3::debug_stream << my_cfg.get_or(bar{2.4}).value << '\n'; // bar not present -> prints: 2.4
}
enum struct my_id : int
{
bar_id,
foo_id
};
struct bar : private seqan3::pipeable_config_element
{
public:
bar() = default;
bar(bar const &) = default;
bar(bar &&) = default;
bar & operator=(bar const &) = default;
bar & operator=(bar &&) = default;
~bar() = default;
static constexpr my_id id{my_id::bar_id};
};
template <typename t>
struct foo : private seqan3::pipeable_config_element
{
public:
foo() = default;
foo(foo const &) = default;
foo(foo &&) = default;
foo & operator=(foo const &) = default;
foo & operator=(foo &&) = default;
~foo() = default;
static constexpr my_id id{my_id::foo_id};
};
template <typename t>
foo(t) -> foo<t>;
int main()
{
uint8_t i = 71;
seqan3::debug_stream << '\'' << i << "'\n"; // prints '71' (because flag is set by default)
seqan3::debug_stream.unsetf(seqan3::fmtflags2::small_int_as_number); // unsets the flag
seqan3::debug_stream << '\'' << i << "'\n"; // prints 'G'
seqan3::debug_stream << seqan3::fmtflags2::small_int_as_number << '\'' << i << "'\n"; // prints '71' again
// instead of formatting the stream "inline", one can also call .setf()
}
#include <sstream>
int main()
{
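using namespace seqan3::literals;
std::ostringstream o;
seqan3::debug_stream.set_underlying_stream(o); // redirect the debug stream into the string stream
seqan3::debug_stream << "ACGT"_dna5;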
o.flush();
seqan3::debug_stream << o.str(); // prints the string stream's buffer: "ACGT"
}
#include <sstream>
int main()
{
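using namespace seqan3::literals;
std::ostringstream o;
seqan3::debug_stream_type my_stream{o}; // a separate debug stream that writes into the string stream
my_stream << "ACGT"_dna5;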
o.flush();
seqan3::debug_stream << o.str() << '\n'; // prints the string stream's buffer: "ACGT"
}
#include <iostream>
int main()
{
using namespace seqan3::literals;
// The alphabet normally needs to be converted to char explicitly:
std::cout << seqan3::to_char('C'_dna5) << '\n'; // prints 'C'
// The debug_stream, on the other hand, does this automatically:
seqan3::debug_stream << 'C'_dna5 << '\n'; // prints 'C'
// The debug_stream can also print all types that model std::ranges::input_range:
std::vector<seqan3::dna5> vec{"ACGT"_dna5};
seqan3::debug_stream << vec << '\n'; // prints "ACGT"
// ranges of non-alphabets are printed comma-separated:
seqan3::debug_stream << (vec | seqan3::views::to_rank) << '\n'; // prints "[0,1,2,3]"
}
int main()
{
int outer{};
// Might be used for non-copyable lambdas. In this example, the lambda would be copyable even without the wrapper.
seqan3::detail::copyable_wrapper wrapper{[&outer](int const x)
{
outer += x;
return outer;
}};
auto wrapper_2 = wrapper; // Would not work with non-copyable lambda.
seqan3::debug_stream << wrapper(2) << '\n'; // 2
seqan3::debug_stream << wrapper_2(4) << '\n'; // 6
}
#include <concepts>
#include <vector>
namespace seqan3::detail::adl_only
{
// Poison-pill overload to prevent non-ADL forms of unqualified lookup.
template <typename... args_t>
void begin(args_t...) = delete;
struct begin_cpo : public detail::customisation_point_object<begin_cpo, 1>
{
using base_t = detail::customisation_point_object<begin_cpo, 1>;
// Only this class is allowed to import the constructors from base_t. (CRTP safety idiom)
using base_t::base_t;
// range.begin(), member access
template <typename range_t>
requires true // further constraints
static constexpr auto SEQAN3_CPO_OVERLOAD(seqan3::detail::priority_tag<1>, range_t && range)(
/*return*/ std::forward<range_t>(range).begin() /*;*/
);
// begin(range), ADL access
template <typename range_t>
static constexpr auto SEQAN3_CPO_OVERLOAD(seqan3::detail::priority_tag<0>, range_t && range)(
/*return*/ begin(std::forward<range_t>(range)) /*;*/
);
};
} // namespace seqan3::detail::adl_only
namespace seqan3
{
// CPO is a normal function object that can be called via seqan3::begin(...)
inline constexpr auto begin = detail::adl_only::begin_cpo{};
} // namespace seqan3
namespace other_library
{
struct foo
{
friend int begin(foo const &) // ADL begin, as friend
{
return 0;
}
};
} // namespace other_library
std::vector<int> vec{};
// seqan3::begin CPO that will call the "begin" member function
static_assert(std::same_as<decltype(seqan3::begin(vec)), decltype(vec.begin())>); // same iterator type
static_assert(noexcept(vec.begin())); // is noexcept
static_assert(noexcept(seqan3::begin(vec)) == noexcept(vec.begin())); // perfect noexcept-forwarding
// seqan3::begin CPO that will call the "begin" function per ADL
other_library::foo foo{};
static_assert(std::same_as<decltype(seqan3::begin(foo)), decltype(begin(foo))>); // same value type
static_assert(!noexcept(begin(foo))); // isn't noexcept
static_assert(noexcept(seqan3::begin(foo)) == noexcept(begin(foo))); // perfect noexcept-forwarding
auto cpo_is_sfinae_friendly(...) -> void;
template <typename range_t>
auto cpo_is_sfinae_friendly(range_t && range) -> decltype(seqan3::begin(range));
// seqan3::begin itself is SFINAE friendly, i.e. no-hard compiler errors, if no cpo overload matches
static_assert(std::same_as<decltype(cpo_is_sfinae_friendly(0)), void>);
static_assert(std::same_as<decltype(cpo_is_sfinae_friendly(vec)), decltype(vec.begin())>);
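The customisation point above prefers the member function and falls back to a free function found via ADL, using priority tags to order the overloads. The following standalone sketch shows that dispatch technique with plain SFINAE and without the SEQAN3_CPO_OVERLOAD macro. It is a simplified assumption of how such a mechanism can be built; `sketch::priority_tag`, `sketch::cpo_begin` and `other_library::widget` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch // hypothetical names, not SeqAn's implementation
{
// A higher level derives from all lower levels, so an exact match on the tag
// beats an overload that needs a derived-to-base conversion.
template <std::size_t level>
struct priority_tag : priority_tag<level - 1>
{};

template <>
struct priority_tag<0>
{};

// Preferred overload: member access, range.begin().
template <typename range_t>
constexpr auto dispatch_begin(priority_tag<1>, range_t && range)
    -> decltype(std::forward<range_t>(range).begin())
{
    return std::forward<range_t>(range).begin();
}

// Fallback overload: free function begin(range), found via ADL.
template <typename range_t>
constexpr auto dispatch_begin(priority_tag<0>, range_t && range)
    -> decltype(begin(std::forward<range_t>(range)))
{
    return begin(std::forward<range_t>(range));
}

// The function object users call; it always starts dispatching at the highest priority.
struct cpo_begin_fn
{
    template <typename range_t>
    constexpr auto operator()(range_t && range) const
        -> decltype(dispatch_begin(priority_tag<1>{}, std::forward<range_t>(range)))
    {
        return dispatch_begin(priority_tag<1>{}, std::forward<range_t>(range));
    }
};

inline constexpr cpo_begin_fn cpo_begin{};
} // namespace sketch

namespace other_library
{
struct widget
{
    friend int begin(widget const &) // only reachable via ADL
    {
        return 42;
    }
};
} // namespace other_library
```

Calling `sketch::cpo_begin` on a `std::vector<int>` selects the member overload and returns its iterator, while calling it on `other_library::widget{}` falls back to the ADL overload and returns 42. Since both overloads use a trailing `decltype`, a type that offers neither simply removes `cpo_begin` from overload resolution instead of causing a hard error.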
#include <string>
#include <type_traits>
// Defines a crtp_base class with an additional value type.
template <typename derived_t, int value>
class base1
{
public:
int func1() const
{
return value;
}
};
// Defines a crtp_base class with an additional value type and a parameter type.
template <typename derived_t, typename value_t, typename parameter_t>
class base2
{
public:
value_t func2(parameter_t const p) const
{
return static_cast<value_t>(p);
}
};
// The derived class that inherits from a variadic crtp pattern, which are augmented with additional trait types.
// These types must be wrapped in a deferred layer, otherwise the compilation fails as incomplete types are not allowed.
// But during the definition of the base classes, the derived class cannot be known.
// In addition the deferred type must be invoked with the derived class using the `invoke_deferred_crtp_base` helper
// template to instantiate the correct crtp base type.
// Note that it is possible to define base classes with type template parameters (see base2) or
// non-type template parameters (see base1), but non-type and type template parameters cannot be mixed in one
// base class definition.
template <typename... deferred_bases_t>
class derived : public seqan3::detail::invoke_deferred_crtp_base<deferred_bases_t, derived<deferred_bases_t...>>...
{};
int main()
{
// Define deferred base with non-type template parameter
using deferred_base1 = seqan3::detail::deferred_crtp_base_vargs<base1, 10>;
// Define deferred base with type template parameter.
using deferred_base2 = seqan3::detail::deferred_crtp_base<base2, uint8_t, uint32_t>;
// Instantiate the derived class with the deferred crtp base classes.
derived<deferred_base1, deferred_base2> d{};
// Check the inherited interfaces.
static_assert(std::is_same_v<decltype(d.func1()), int>, "Return type must be int");
static_assert(std::is_same_v<decltype(d.func2(10u)), uint8_t>, "Return type must be uint8_t");
}
#include <type_traits>
template <typename t>
requires std::is_integral_v<t>
struct foo
{
t value;
};
// foo is declarable with int, i.e. foo<int> is a valid expression
static_assert(seqan3::detail::is_class_template_declarable_with_v<foo, int>);
// foo is not declarable with double, because it does not fulfil the requires clause of foo.
static_assert(!seqan3::detail::is_class_template_declarable_with_v<foo, double>);
// This also works with std::enable_if and producing a substitution failure.
template <typename t, typename = std::enable_if_t<std::is_same_v<t, int>>>
struct bar
{
t value;
};
// bar is declarable with int, i.e. bar<int> is a valid expression
static_assert(seqan3::detail::is_class_template_declarable_with_v<bar, int>);
// bar is not declarable with double, because it produces a substitution failure (SFINAE).
static_assert(!seqan3::detail::is_class_template_declarable_with_v<bar, double>);
// is_class_template_declarable_with_v works well with lazy_conditional_t
template <typename t>
using maybe_foo_t = seqan3::detail::
lazy_conditional_t<seqan3::detail::is_class_template_declarable_with_v<foo, t>, seqan3::detail::lazy<foo, t>, t>;
int main()
{
foo<int> a = maybe_foo_t<int>{10}; // foo is instantiable with int, thus use foo<int>
seqan3::debug_stream << "a: " << a.value << '\n'; // prints 10
float b = maybe_foo_t<float>{0.4f}; // foo is not instantiable with float, thus use float directly
seqan3::debug_stream << "b: " << b << '\n'; // prints 0.4
return 0;
}
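The trait used above asks whether a template-id such as foo<int> can be formed at all. As a rough sketch under stated assumptions (C++17, detection via std::void_t, restricted to templates constrained with std::enable_if), a comparable check could look like this. `sketch::is_declarable_with_v` and `only_integral` are hypothetical names and not the SeqAn implementation.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch // hypothetical names, not part of SeqAn
{
// Primary template: chosen when template_t<args_t...> cannot be formed.
template <typename enable_t, template <typename...> class template_t, typename... args_t>
struct is_declarable_with_impl : std::false_type
{};

// Specialisation: only viable if naming template_t<args_t...> is well-formed.
template <template <typename...> class template_t, typename... args_t>
struct is_declarable_with_impl<std::void_t<template_t<args_t...>>, template_t, args_t...> : std::true_type
{};

template <template <typename...> class template_t, typename... args_t>
inline constexpr bool is_declarable_with_v = is_declarable_with_impl<void, template_t, args_t...>::value;
} // namespace sketch

// A class template that can only be named with integral types.
template <typename t, typename = std::enable_if_t<std::is_integral_v<t>>>
struct only_integral
{
    t value;
};

static_assert(sketch::is_declarable_with_v<only_integral, int>);
static_assert(!sketch::is_declarable_with_v<only_integral, double>);
```

The check only names the type and never instantiates its definition, so an error that occurs inside the class body would not be detected by this sketch.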
using seqan3::operator|;
struct error :
seqan3::detail::strong_type<uint8_t,
error,
seqan3::detail::strong_type_skill::decrement
| seqan3::detail::strong_type_skill::increment>
{
using seqan3::detail::strong_type<uint8_t,
error,
seqan3::detail::strong_type_skill::decrement
| seqan3::detail::strong_type_skill::increment>::strong_type;
};
int main()
{
error e{4u};
--e;
++e;
}
struct error : seqan3::detail::strong_type<uint8_t, error>
{
using seqan3::detail::strong_type<uint8_t, error>::strong_type;
};
struct window_size : seqan3::detail::strong_type<uint8_t, window_size>
{
using seqan3::detail::strong_type<uint8_t, window_size>::strong_type;
};
#include <ranges>
#include <vector>
struct error : seqan3::detail::strong_type<unsigned, error>
{
using seqan3::detail::strong_type<unsigned, error>::strong_type;
};
struct window_size : seqan3::detail::strong_type<unsigned, window_size>
{
using seqan3::detail::strong_type<unsigned, window_size>::strong_type;
};
namespace detail
{
template <std::ranges::forward_range fwd_rng_type>
bool do_find(fwd_rng_type const &, uint8_t const, uint8_t const)
{
return true;
}
} // namespace detail
template <std::ranges::forward_range fwd_rng_type>
bool search(fwd_rng_type const & rng, window_size const window_size, error const error)
{
return detail::do_find(rng, window_size.get(), error.get());
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dna4> range = "ACGTT"_dna4;
search(range, window_size{4u}, error{2u});
return 0;
}
#include <ranges>
#include <vector>
namespace detail
{
template <std::ranges::forward_range fwd_rng_type>
bool do_find(fwd_rng_type const &, uint8_t const, uint8_t const)
{
return true;
}
} // namespace detail
template <std::ranges::forward_range fwd_rng_type>
bool search(fwd_rng_type const & rng, uint8_t const window_size, uint8_t const error)
{
return detail::do_find(rng, window_size, error);
}
int main()
{
using namespace seqan3::literals;
std::vector<seqan3::dna4> range = "ACGTT"_dna4;
search(range, 4u, 2u);
return 0;
}
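The two search variants above differ only in their parameter types: with plain uint8_t parameters, the window size and the error count can be swapped silently, whereas the strong-type version rejects such a call at compile time. A minimal, library-independent sketch of a strong type makes this visible. `sketch::strong_type`, `error_count` and `budget` are hypothetical names, and the wrapper is far simpler than seqan3::detail::strong_type.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

namespace sketch // hypothetical names, not part of SeqAn
{
// Wraps a value and tags it with the derived type, so two wrappers around the
// same underlying type are still distinct, non-interchangeable types.
template <typename value_t, typename derived_t>
class strong_type
{
public:
    constexpr explicit strong_type(value_t v) : value{std::move(v)}
    {}

    constexpr value_t const & get() const noexcept
    {
        return value;
    }

private:
    value_t value;
};
} // namespace sketch

struct window_size : sketch::strong_type<std::uint8_t, window_size>
{
    using sketch::strong_type<std::uint8_t, window_size>::strong_type;
};

struct error_count : sketch::strong_type<std::uint8_t, error_count>
{
    using sketch::strong_type<std::uint8_t, error_count>::strong_type;
};

// Toy function for the sketch: the parameter order is enforced by the type system.
constexpr int budget(window_size const window, error_count const errors)
{
    return window.get() - errors.get();
}

// Correct order compiles, swapped or untyped arguments do not.
static_assert(std::is_invocable_v<decltype(&budget), window_size, error_count>);
static_assert(!std::is_invocable_v<decltype(&budget), error_count, window_size>);
static_assert(!std::is_invocable_v<decltype(&budget), int, int>);
```

A call such as `budget(window_size{4}, error_count{2})` reads unambiguously at the call site, which is the main benefit over `search(range, 4u, 2u)` in the plain version.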
int main()
{
using list_to_transfer = seqan3::type_list<int, char, double>;
using resulting_t = seqan3::detail::transfer_template_args_onto_t<list_to_transfer, std::tuple>;
static_assert(std::same_as<resulting_t, std::tuple<int, char, double>>);
}
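The trait above moves the template arguments of one class template instance onto another template. A minimal sketch of how this can be done with a partial specialisation is shown below. It is an assumption for illustration with hypothetical names (`sketch::type_list`, `sketch::transfer_template_args_onto_t`), not the SeqAn implementation.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <variant>

namespace sketch // hypothetical names, not part of SeqAn
{
template <typename... types>
struct type_list
{};

// Primary template: declared only, so a source type that is not a template instance fails to compile.
template <typename source_t, template <typename...> class target_template>
struct transfer_template_args_onto;

// Partial specialisation: deduces the argument pack of the source and re-applies it to the target.
template <template <typename...> class source_template,
          template <typename...> class target_template,
          typename... args_t>
struct transfer_template_args_onto<source_template<args_t...>, target_template>
{
    using type = target_template<args_t...>;
};

template <typename source_t, template <typename...> class target_template>
using transfer_template_args_onto_t = typename transfer_template_args_onto<source_t, target_template>::type;
} // namespace sketch

static_assert(std::is_same_v<sketch::transfer_template_args_onto_t<sketch::type_list<int, char, double>, std::tuple>,
                             std::tuple<int, char, double>>);
```

The same sketch also works for other source templates that take only type parameters, for example moving the arguments of a `std::tuple` onto a `std::variant`.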
#include <vector>
int main()
{
using my_type = std::vector<int>;
if constexpr (seqan3::detail::is_type_specialisation_of_v<my_type, std::vector>) // Note: std::vector has no <> !
{
// ...
}
}
#include <vector>
int main()
{
using my_type = std::vector<int>;
if constexpr (seqan3::detail::template_specialisation_of<my_type, std::vector>) // Note: std::vector has no <> !
{
// ...
}
}
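Both snippets above ask whether a type is an instance of a given class template. A minimal sketch of such a check, assuming templates that take type parameters only, uses a partial specialisation that matches when the queried type has the form template_t<args...>. `sketch::is_specialisation_of_v` is a hypothetical name and not the SeqAn trait.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

namespace sketch // hypothetical names, not part of SeqAn
{
// Primary template: the queried type is not an instance of template_t.
template <typename query_t, template <typename...> class template_t>
struct is_specialisation_of : std::false_type
{};

// Partial specialisation: matches exactly the types of the form template_t<args_t...>.
template <template <typename...> class template_t, typename... args_t>
struct is_specialisation_of<template_t<args_t...>, template_t> : std::true_type
{};

template <typename query_t, template <typename...> class template_t>
inline constexpr bool is_specialisation_of_v = is_specialisation_of<query_t, template_t>::value;
} // namespace sketch

static_assert(sketch::is_specialisation_of_v<std::vector<int>, std::vector>);
static_assert(!sketch::is_specialisation_of_v<int, std::vector>);
```

Templates with non-type parameters, such as std::array, cannot be passed to this sketch because the template template parameter only accepts type parameters.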
#include <string>
#include <vector>
#include <seqan3/io/detail/record.hpp>
int main()
{
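using types = seqan3::type_list<std::vector<seqan3::dna4>, std::string, std::vector<seqan3::phred42>>;
using types_as_ids = seqan3::fields<seqan3::field::seq, seqan3::field::id, seqan3::field::qual>;
using selected_ids = seqan3::fields<seqan3::field::qual, seqan3::field::id>;
using selected_types = seqan3::detail::select_types_with_ids_t<types, types_as_ids, selected_ids>;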
// resolves to type_list<std::vector<phred42>, std::string>
static_assert(std::same_as<selected_types, seqan3::type_list<std::vector<seqan3::phred42>, std::string>>);
}
#include <ranges>
#include <sstream>
#include <string>
int main()
{
std::string id{"seq1"};
std::string sequence{"ACTGACTGACTGACTAGCATGACTAGCATGC"};
std::ostringstream ostr;
// construct iterator from stream buffer
auto stream_it = seqan3::detail::fast_ostreambuf_iterator{*ostr.rdbuf()};
// You can do anything you could do with a regular std::ostreambuf_iterator
stream_it = '>'; // writes '>' to stream
*stream_it = ' '; // writes ' ' to stream
// Additionally, there is an efficient write_range member function
// Example 1: Write a range completely
stream_it.write_range(id); // return value can be ignored
// Example 2: Write a range in chunks of 10
auto it = std::ranges::begin(sequence);
while (it != std::ranges::end(sequence))
{
/* Note that you cannot use stream_it.write_range(rng | std::views::take(10)) here
* because the returned iterator is not of the correct type.
*/
auto current_end = it;
size_t steps = std::ranges::advance(current_end, 10u, std::ranges::end(sequence));
using subrange_t =
std::ranges::subrange<decltype(it), decltype(current_end), std::ranges::subrange_kind::sized>;
// Be aware that your range_type must model std::ranges::borrowed_range in order to use the return value!
it = stream_it.write_range(subrange_t{it, current_end, 10u - steps});
stream_it = ' ';
}
}
#include <filesystem>
#include <fstream>
int main()
{
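std::filesystem::path my_file = std::filesystem::temp_directory_path() / "dummy.txt"; // example path, any writable location works
std::ofstream file{my_file}; // Create the file.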
seqan3::detail::safe_filesystem_entry file_guard{my_file}; // Safe cleanup in case of errors.
// Do something on the file, that can possibly throw.
// If an unhandled exception is thrown, the file guard destructor safely removes the file from the filesystem.
file_guard.remove(); // Explicitly remove the file.
}
#include <sstream>
auto input = R"(> TEST1
ACGT
> Test2
AGGCTGA
> Test3
GGAGTATAATATATATATATATAT)";
int main()
{
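// specify custom field combination/order to file:
seqan3::sequence_file_input fin{std::istringstream{input},
seqan3::format_fasta{},
seqan3::fields<seqan3::field::id, seqan3::field::seq>{}};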
auto record = fin.front(); // get current record, in this case the first
auto & id = record.id();
seqan3::debug_stream << id << '\n'; // TEST1
auto & seq = record.sequence();
seqan3::debug_stream << seq << '\n'; // ACGT
}
#include <string>
#include <vector>
int main()
{
using namespace seqan3::literals;
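// The order of the types below represents a mapping between the type and the key.
using types = seqan3::type_list<seqan3::dna4_vector, std::string, std::vector<seqan3::phred42>>;
using types_as_ids = seqan3::fields<seqan3::field::seq, seqan3::field::id, seqan3::field::qual>;
// record_type now mimics std::tuple<dna4_vector, std::string, std::vector<phred42>>,
// the order also depends on selected_ids
using record_type = seqan3::record<types, types_as_ids>;
record_type my_record{};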
std::get<1>(my_record) = "the most important sequence in the database"; // access via index
std::get<std::string>(my_record) = "the least important sequence in the database"; // access via type
}
#include <sstream>
#include <string>
#include <tuple>
int main()
{
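seqan3::sam_file_output fout{std::ostringstream{},
seqan3::format_sam{},
seqan3::fields<seqan3::field::seq, seqan3::field::id>{}};
auto stream_it = fout.begin();
seqan3::dna5_vector seq;
std::string id;
// ...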
// assign to file iterator
*stream_it = std::tie(seq, id);
// is the same as:
fout.push_back(std::tie(seq, id));
}
#include <sstream>
int main()
{
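// I only want to print the mapping position (field::ref_offset) and flag:
seqan3::sam_file_output fout{std::ostringstream{},
seqan3::format_sam{},
seqan3::fields<seqan3::field::ref_offset, seqan3::field::flag>{}};
unsigned mapping_pos{1300};
seqan3::sam_flag flag{seqan3::sam_flag::none};
// ...
// Note that the order of the arguments is now different, because you specified that REF_OFFSET should be first.
fout.emplace_back(mapping_pos, flag);
// or:
fout.push_back(std::tie(mapping_pos, flag));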
}
#include <sstream>
#include <string>
#include <tuple>
int main()
{
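seqan3::sam_file_output fout{std::ostringstream{},
seqan3::format_sam{},
seqan3::fields<seqan3::field::seq, seqan3::field::id>{}};
seqan3::dna5_vector seq;
std::string id;
// ...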
fout.push_back(std::tie(seq, id));
}
#include <sstream>
#include <string>
#include <vector>
int main()
{
std::string read_id;
// ... e.g. compute and alignment
using alignment_type =
alignment_type dummy_alignment{}; // an empty dummy alignment
// the record type specifies the fields we want to write
// initialize record
record_type rec{read, ref_id, dummy_alignment};
// Write the record
fout.push_back(rec);
// same as
fout.push_back(record_type{read, ref_id, dummy_alignment});
// as all our fields are empty so this would print an
}
#include <filesystem>
#include <sstream>
#include <tuple>
int main()
{
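// I only want to print the mapping position (field::ref_offset) and flag:
seqan3::sam_file_output fout{std::filesystem::temp_directory_path() / "my.sam",
seqan3::fields<seqan3::field::ref_offset, seqan3::field::flag>{}};
unsigned mapping_pos{1300};
seqan3::sam_flag flag{seqan3::sam_flag::none};
// ...
// Note that the order of the arguments is now different, because you specified that REF_OFFSET should be first.
fout.emplace_back(mapping_pos, flag);
// or:
fout.push_back(std::tie(mapping_pos, flag));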
}
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
auto it = fin.begin();
// the following are equivalent:
auto & rec0 = *it;
auto & rec1 = fin.front();
std::cout << std::boolalpha << (rec0.id() == rec1.id()) << '\n'; // true
// Note: both become invalid after incrementing "it"!
}
#include <filesystem>
#include <fstream>
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
// Create the temporary file.
auto tmp_file = std::filesystem::temp_directory_path() / "my.sam";
std::ofstream tmp_stream{tmp_file};
tmp_stream << sam_file_raw;
tmp_stream.close();
seqan3::sam_file_input fin{tmp_file}; // SAM format assumed, regular std::ifstream taken as stream
}
#include <sstream>
auto input = R"(@HD VN:1.6 SO:coordinate
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *)";
int main()
{
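seqan3::sam_file_input fin{std::istringstream{input}, seqan3::format_sam{}};
// No need to specify the template arguments of seqan3::sam_file_input here, they are deduced.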
}
#include <sstream>
auto input = R"(@HD VN:1.6 SO:coordinate
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *)";
int main()
{
// The default types; you can adjust this list if you don't want to read all this data.
using default_fields = seqan3::fields<seqan3::field::seq,
// The expected format:
default_fields,
// Which formats are allowed:
sam_file_input_t fin{std::istringstream{input}, seqan3::format_sam{}};
}
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
auto rec = std::move(fin.front()); // rec now stores the data permanently
}
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
// access the header information
seqan3::debug_stream << fin.header().format_version << '\n'; // 1.6
seqan3::debug_stream << fin.header().ref_dict << '\n'; // [(ref,(45,))] (this only works with seqan3::debug_stream!)
}
#include <sstream>
struct my_traits : seqan3::sam_file_input_default_traits<>
{
using sequence_alphabet = seqan3::dna4; // instead of dna5
template <typename alph>
using sequence_container = seqan3::bitpacked_sequence<alph>; // must be defined as a template!
};
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
// ... within main you can then use:
int main()
{
seqan3::sam_file_input<my_traits> fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
}
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
for (auto & rec : fin)
{
seqan3::debug_stream << "flag: " << rec.flag() << '\n';
seqan3::debug_stream << "mapping quality: " << rec.mapping_quality() << '\n';
}
}
#include <ranges>
#include <sstream>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sam_file/input.hpp>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
auto minimum_length10_filter = std::views::filter(
[](auto const & rec)
{
return std::ranges::size(rec.sequence()) >= 10;
});
for (auto & rec : fin | minimum_length10_filter) // only records with sequence length >= 10 will "appear"
seqan3::debug_stream << rec.id() << '\n';
}
#include <sstream>
#include <utility>
#include <vector>
#include <seqan3/io/sam_file/input.hpp>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
using record_type = typename decltype(fin)::record_type;
std::vector<record_type> records{}; // store all my records in a vector
for (auto & rec : fin)
records.push_back(std::move(rec));
}
#include <sstream>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sam_file/input.hpp>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
for (auto & rec : fin)
{
seqan3::debug_stream << "id: " << rec.id() << '\n';
seqan3::debug_stream << "read sequence: " << rec.sequence() << '\n';
seqan3::debug_stream << "mapping position: " << rec.reference_position() << '\n';
seqan3::debug_stream << "mapping quality: " << rec.mapping_quality() << '\n';
// there are more fields read on default
}
}
#include <sstream>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sam_file/input.hpp>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
seqan3::sam_file_input fin{std::istringstream{sam_file_raw},
seqan3::format_sam{},
seqan3::fields<seqan3::field::flag, seqan3::field::mapq>{}};
for (auto & [flag, mapq] : fin) // the order is the same as specified in fields!
{
seqan3::debug_stream << "flag: " << flag << '\n';
seqan3::debug_stream << "mapping quality: " << mapq << '\n';
}
}
#include <filesystem>
#include <sstream>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 6H5M * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
// fin uses custom fields, fout uses the default fields.
// output doesn't have to match the configuration of the input
for (auto & r : fin)
fout.push_back(r); // copy all the records.
}
#include <filesystem>
#include <seqan3/io/sam_file/output.hpp>
int main()
{
auto tmp_file = std::filesystem::temp_directory_path() / "my.sam";
seqan3::sam_file_output fout{tmp_file}; // SAM format detected, std::ofstream opened for file
}
#include <filesystem>
#include <string>
#include <vector>
#include <seqan3/io/sam_file/output.hpp>
int main()
{
auto tmp_file = std::filesystem::temp_directory_path() / "my.sam";
std::vector<std::string> ref_ids{"ref1", "ref2"};
std::vector<size_t> ref_lengths{1234, 5678};
seqan3::sam_file_output fout{tmp_file, ref_ids, ref_lengths};
}
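#include <sstream>
#include <seqan3/io/sam_file/output.hpp>
int main()
{
// no need to specify the template arguments <...> for format specialization:
seqan3::sam_file_output fout{std::ostringstream{}, seqan3::format_sam{}};
}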
#include <sstream>
#include <seqan3/io/sam_file/input.hpp>
#include <seqan3/io/sam_file/output.hpp>
auto sam_file_raw = R"(First 0 * 0 0 * * 0 0 ACGT *
2nd 0 * 0 0 * * 0 0 NATA *
Third 0 * 0 0 * * 0 0 GATA *
)";
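int main()
{
// copying a file in one line:
seqan3::sam_file_output{std::ostringstream{}, seqan3::format_sam{}} =
seqan3::sam_file_input{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
// with seqan3::sam_file_output as a variable:
seqan3::sam_file_output fout{std::ostringstream{}, seqan3::format_sam{}};
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
fout = fin;
// or in pipe notation:
seqan3::sam_file_input{std::istringstream{sam_file_raw}, seqan3::format_sam{}}
| seqan3::sam_file_output{std::ostringstream{}, seqan3::format_sam{}};
}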
#include <ranges>
#include <sstream>
#include <seqan3/io/sam_file/input.hpp>
#include <seqan3/io/sam_file/output.hpp>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 * = 37 39 TTAGATAAAGGATACTG *
r003 0 ref 29 30 * * 0 0 GCCTAAGCTAA * SA:Z:ref,29,-,6H5M,17,0;
r003 2064 ref 29 17 * * 0 0 TAGGC * SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 * = 7 -39 CAGCGGCAT * NM:i:1
)";
int main()
{
auto input_file = seqan3::sam_file_input{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
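input_file | std::views::take(3) // take only the first 3 records
| seqan3::sam_file_output{std::ostringstream{}, seqan3::format_sam{}}; // and write them to the output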
}
#include <sstream>
#include <string>
#include <vector>
#include <seqan3/io/sam_file/output.hpp>
int main()
{
std::vector<std::string> ref_ids{"ref1", "ref2"};
std::vector<size_t> ref_lengths{1234, 5678};
// always give reference information if you want to have your header properly initialised
seqan3::sam_file_output fout{std::ostringstream{}, seqan3::format_sam{}, ref_ids, ref_lengths};
// add information to the header of the file.
fout.header().comments.push_back("This is a comment");
}
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/io/sam_file/output.hpp>
int main()
{
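using namespace seqan3::literals;
seqan3::sam_file_output fout{std::ostringstream{},
seqan3::format_sam{},
seqan3::fields<seqan3::field::seq, seqan3::field::id>{}};
std::vector<std::tuple<seqan3::dna5_vector, std::string>> range{{"ACGT"_dna5, "First"},
{"NATA"_dna5, "2nd"},
{"GATA"_dna5, "Third"}}; // a range of "records"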
fout = range; // will iterate over the records and write them
// equivalent to:
range | fout;
}
#include <iostream>
#include <sstream>
#include <seqan3/alphabet/quality/phred42.hpp>
#include <seqan3/io/sam_file/input.hpp>
auto sam_file_raw = R"(@HD VN:1.6 SO:coordinate GO:none
@SQ SN:ref LN:45
r001 99 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG !!!!!!!!!!!!!!!!!
r003 0 ref 29 30 5S6M * 0 0 GCCTAAGCTAA !!!!!!!!!!! SA:Z:ref,29,-,6H5M,17,0;
r003 4 * 29 17 * * 0 0 TAGGC @@@@@ SA:Z:ref,9,+,5S6M,30,1;
r001 147 ref 237 30 9M = 7 -39 CAGCGGCAT !!!!!!!!! NM:i:1
)";
int main()
{
seqan3::sam_file_input fin{std::istringstream{sam_file_raw}, seqan3::format_sam{}};
for (auto & rec : fin)
{
// Check if a certain flag value (bit) is set:
if (static_cast<bool>(rec.flag() & seqan3::sam_flag::unmapped))
std::cout << "Read " << rec.id() << " is unmapped\n";
if (rec.base_qualities()[0] < seqan3::assign_char_to('@', seqan3::phred42{})) // low quality
{
// Set a flag value (bit):
rec.flag() |= seqan3::sam_flag::failed_filter; // the read is now flagged as failed_filter
// Note that this does not affect other flag values (bits),
// e.g. `rec.flag() & seqan3::sam_flag::unmapped` may still be true
}
// Unset a flag value (bit):
rec.flag() &= ~seqan3::sam_flag::duplicate; // not marked as a duplicate anymore
}
}
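The bit operations applied to `rec.flag()` above can be tried out without SeqAn3. The following is a minimal sketch, assuming a hypothetical `flag` enum as a stand-in for `seqan3::sam_flag`; the helper names `is_set`, `set_bit` and `clear_bit` are made up for this illustration, and the three bit values are the ones given in the SAM specification.

```cpp
#include <cstdint>

// Hypothetical stand-in for seqan3::sam_flag (only three of the SAM flag bits).
enum class flag : std::uint16_t
{
    none = 0,
    unmapped = 0x4,        // the read is not mapped to a reference
    failed_filter = 0x200, // the read failed a filter, e.g. quality controls
    duplicate = 0x400      // the read is marked as a duplicate
};

constexpr flag operator|(flag lhs, flag rhs)
{
    return static_cast<flag>(static_cast<std::uint16_t>(lhs) | static_cast<std::uint16_t>(rhs));
}

constexpr flag operator&(flag lhs, flag rhs)
{
    return static_cast<flag>(static_cast<std::uint16_t>(lhs) & static_cast<std::uint16_t>(rhs));
}

constexpr flag operator~(flag value)
{
    return static_cast<flag>(static_cast<std::uint16_t>(~static_cast<std::uint16_t>(value)));
}

// Check whether a bit is set.
constexpr bool is_set(flag value, flag bit)
{
    return (value & bit) != flag::none;
}

// Set a bit; all other bits keep their state.
constexpr flag set_bit(flag value, flag bit)
{
    return value | bit;
}

// Unset a bit; all other bits keep their state.
constexpr flag clear_bit(flag value, flag bit)
{
    return value & ~bit;
}
```

For example, `clear_bit(flag::unmapped | flag::duplicate, flag::duplicate)` leaves only the `unmapped` bit set, which mirrors the `&= ~` line in the snippet above.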
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sam_file/sam_tag_dictionary.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::sam_tag_dictionary dict{}; // initialise empty dictionary
dict.get<"NM"_tag>() = 3; // set SAM tag 'NM' to 3 (integer type)
dict.get<"CO"_tag>() = "comment"; // set SAM tag 'CO' to "comment" (string type)
auto nm = dict.get<"NM"_tag>(); // get SAM tag 'NM' (note: type is int32_t)
auto co = dict.get<"CO"_tag>(); // get SAM tag 'CO' (note: type is std::string)
seqan3::debug_stream << nm << '\n'; // will print '3'
seqan3::debug_stream << co << '\n'; // will print "comment"
}
using namespace seqan3::literals;
template <> // no template parameter since the tag is known
struct seqan3::sam_tag_type<"XX"_tag> // here comes your tag
{
using type = int32_t; // specify the type of your tag
};
using namespace seqan3::literals;
// ...
uint16_t tag_id = "NM"_tag; // tag_id = 10061
using namespace seqan3::literals;
// ...
using nm_tag_type = seqan3::sam_tag_type_t<"NM"_tag>;
using namespace seqan3::literals;
// ...
using nm_tag_type2 = seqan3::sam_tag_type<"NM"_tag>::type;
#include <variant> // for std::visit
#include <seqan3/utility/container/concept.hpp> // for the seqan3::container
#include <seqan3/core/debug_stream.hpp> // for seqan3::debug_stream
#include <seqan3/io/sam_file/sam_tag_dictionary.hpp> // for seqan3::sam_tag_dictionary
// a lambda helper function that prints every type in the std::variant<...allowed SAM tag types...>
auto print_fn = [](auto && arg)
{
using T = std::remove_cvref_t<decltype(arg)>; // the type T of arg.
if constexpr (!seqan3::container<T>) // If T is not a container,
{
seqan3::debug_stream << arg << '\n'; // just print arg directly.
}
else // If T is a container,
{
for (auto const & arg_v : arg) // print every value in arg.
seqan3::debug_stream << arg_v << ",";
}
};
int main()
{
using namespace seqan3::literals;
seqan3::sam_tag_dictionary dict{}; // initialise empty dictionary
// ! there is no get function for unknown tags !
// dict.get<"XZ"_tag>() = 3;
// but you can use the operator[]
dict["XZ"_tag] = 3; // set unknown SAM tag 'XZ' to 3 (type int32_t)
// ! there is no get function for unknown tags !
// auto xz = dict.get<"XZ"_tag>();
// but you can use the operator[] again
auto xz = dict["XZ"_tag]; // get SAM tag 'XZ' (type std::variant<...allowed SAM tag types...>)
// ! you cannot print a std::variant directly !
// seqan3::debug_stream << xz << '\n';
// but you can use visit:
std::visit(print_fn, xz); // prints 3
}
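The visitor pattern used by `print_fn` above is plain standard C++ and can be explored on its own. The following is a minimal sketch, assuming a hypothetical `tag_value` variant with only three alternatives (the real SAM tag variant has more) and a hypothetical `is_vector` trait in place of the `seqan3::container` concept. It returns a string instead of printing.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical, reduced set of value types; not the SeqAn3 variant.
using tag_value = std::variant<std::int32_t, std::string, std::vector<std::int32_t>>;

// Hypothetical trait: true only for std::vector specialisations.
template <typename T>
struct is_vector : std::false_type
{};

template <typename T, typename A>
struct is_vector<std::vector<T, A>> : std::true_type
{};

// Turns whichever alternative is currently stored into text.
std::string to_text(tag_value const & value)
{
    return std::visit(
        [](auto const & arg) -> std::string
        {
            using T = std::decay_t<decltype(arg)>; // the type currently stored
            if constexpr (is_vector<T>::value)     // container: join the elements with commas
            {
                std::string out;
                for (auto const & element : arg)
                {
                    if (!out.empty())
                        out += ',';
                    out += std::to_string(element);
                }
                return out;
            }
            else if constexpr (std::is_same_v<T, std::string>) // string: return as is
            {
                return arg;
            }
            else // arithmetic value: convert directly
            {
                return std::to_string(arg);
            }
        },
        value);
}
```

`std::visit` selects the branch at run time from the stored alternative, while `if constexpr` makes sure that only the branch matching each alternative is compiled for it.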
#include <sstream>
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
}
#include <sstream>
#include <utility>
#include <vector>
#include <seqan3/io/sequence_file/input.hpp>
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
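{
seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
using record_type = typename decltype(fin)::record_type;
std::vector<record_type> records{}; // store all records in a vector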
for (auto & rec : fin)
records.push_back(std::move(rec));
}
#include <sstream>
#include <vector>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/input.hpp>
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
using seqan3::get;
seqan3::sequence_file_input fin{std::istringstream{input},
seqan3::format_fasta{},
seqan3::fields<seqan3::field::id, seqan3::field::seq, seqan3::field::qual>{}};
for (auto & [id, seq, qual] : fin) // the order is now different, "id" comes first, because it was specified first
{
seqan3::debug_stream << "ID: " << id << '\n';
seqan3::debug_stream << "SEQ: " << seq << '\n';
seqan3::debug_stream << "QUAL: " << qual << '\n';
}
}
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
for (auto & [sequence, id, quality] : fin)
{
seqan3::debug_stream << "ID: " << id << '\n';
seqan3::debug_stream << "SEQ: " << sequence << '\n';
seqan3::debug_stream << "EMPTY QUAL." << quality << '\n'; // quality is empty for FASTA files
}
}
#include <ranges>
#include <sstream>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/input.hpp>
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
auto minimum_length5_filter = std::views::filter(
[](auto const & rec)
{
return std::ranges::size(rec.sequence()) >= 5;
});
for (auto & rec : fin | minimum_length5_filter) // only record with sequence length >= 5 will "appear"
{
seqan3::debug_stream << "IDs of seq_length >= 5: " << rec.id() << '\n';
// ...
}
}
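The length predicate used in the filter above does not depend on SeqAn3. The following is a minimal sketch of the same selection with the standard library only, assuming a hypothetical `simple_record` type and a hypothetical helper `ids_with_min_length`. Unlike `std::views::filter`, which is evaluated lazily while iterating, this helper eagerly collects the matching ids.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal record type, standing in for a SeqAn3 sequence record.
struct simple_record
{
    std::string id;
    std::string sequence;
};

// Returns the ids of all records whose sequence has at least min_length characters.
std::vector<std::string> ids_with_min_length(std::vector<simple_record> const & records, std::size_t min_length)
{
    std::vector<std::string> ids;
    for (simple_record const & rec : records)
    {
        if (rec.sequence.size() >= min_length)
            ids.push_back(rec.id);
    }
    return ids;
}
```

With the three FASTA records from the snippet above and a minimum length of 5, only `Test2` and `Test3` are selected, because `TEST1` has just four bases.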
#include <sstream>
#include <utility>
#include <seqan3/io/sequence_file/input.hpp>
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
// ^ no need to specify the template arguments
}
#include <sstream>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/input.hpp>
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
for (auto & record : fin)
{
seqan3::debug_stream << "ID: " << record.id() << '\n';
seqan3::debug_stream << "SEQ: " << record.sequence() << '\n';
// a quality field also exists, but is not printed, because we know it's empty for FASTA files.
}
}
#include <sstream>
#include <utility>
#include <seqan3/io/sequence_file/input.hpp>
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
auto rec0 = std::move(fin.front());
}
#include <sstream>
#include <iostream>
#include <seqan3/io/sequence_file/input.hpp>
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
int main()
{
seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fasta{}};
auto it = std::ranges::begin(fin);
// the following are equivalent:
auto & rec0 = *it;
auto & rec1 = fin.front();
std::cout << std::boolalpha << (rec0.id() == rec1.id()) << '\n'; // true
// Note: rec0 and rec1 are references and become invalid after incrementing "it"!
}
#include <seqan3/test/snippet/create_temporary_snippet_file.hpp>
// std::filesystem::current_path() / "my.fasta" will be deleted after the execution
seqan3::test::create_temporary_snippet_file my_fasta{"my.fasta", ""};
#include <filesystem>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/io/sequence_file/input.hpp>
#include <seqan3/io/sequence_file/output.hpp>
int main()
{
using namespace seqan3::literals;
auto fasta_file = std::filesystem::current_path() / "my.fasta";
{
// Create a ./my.fasta file.
seqan3::sequence_file_output fout{fasta_file};
fout.emplace_back("ACGT"_dna4, "TEST1");
fout.emplace_back("AGGCTGA"_dna4, "Test2");
fout.emplace_back("GGAGTATAATATATATATATATAT"_dna4, "Test3");
}
// FASTA with DNA sequences assumed, regular std::ifstream taken as stream
seqan3::sequence_file_input fin{fasta_file};
}
#include <sstream>
#include <seqan3/io/sequence_file/input.hpp>
#include <seqan3/utility/type_list/type_list.hpp>
// ... input had amino acid sequences
auto input = R"(>TEST1
FQTWE
>Test2
KYRTW
>Test3
EEYQTWEEFARAAEKLYLTDPMKV)";
int main()
{ // Use amino acid traits below
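using sequence_file_input_type =
seqan3::sequence_file_input<seqan3::sequence_file_input_default_traits_aa,
seqan3::fields<seqan3::field::seq, seqan3::field::id, seqan3::field::qual>,
seqan3::type_list<seqan3::format_fasta>>;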
sequence_file_input_type fin{std::istringstream{input}, seqan3::format_fasta{}};
}
#include <sstream>
#include <seqan3/alphabet/container/bitpacked_sequence.hpp>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/io/sequence_file/input.hpp>
auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";
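struct my_traits : seqan3::sequence_file_input_default_traits_dna
{
using sequence_alphabet = seqan3::dna4; // instead of dna5
template <typename alph>
using sequence_container = seqan3::bitpacked_sequence<alph>; // must be defined as a template!
};
int main()
{
seqan3::sequence_file_input<my_traits> fin{std::istringstream{input}, seqan3::format_fasta{}};
}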
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/io/sequence_file/output.hpp>
int main()
{
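using namespace seqan3::literals;
seqan3::sequence_file_output fout{std::ostringstream{}, seqan3::format_fasta{}};
std::vector<std::tuple<seqan3::dna5_vector, std::string>> range{{"ACGT"_dna5, "First"},
{"NATA"_dna5, "2nd"},
{"GATA"_dna5, "Third"}}; // a range of "records"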
fout = range;
// the same as:
range | fout;
}
#include <sstream>
#include <string>
#include <seqan3/alphabet/container/concatenated_sequences.hpp>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/io/sequence_file/output.hpp>
#include <seqan3/utility/views/zip.hpp>
using namespace seqan3::literals;
struct data_storage_t
{
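seqan3::concatenated_sequences<seqan3::dna4_vector> sequences{"ACGT"_dna4, "AAA"_dna4};
seqan3::concatenated_sequences<std::string> ids{std::string{"ID1"}, std::string{"ID2"}}; // example ids, one per sequence
};
int main()
{
seqan3::sequence_file_output fout{std::ostringstream{}, seqan3::format_fasta{}};
data_storage_t data_storage{};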
// ... in your file writing function:
fout = seqan3::views::zip(data_storage.sequences, data_storage.ids);
}
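#include <sstream>
#include <seqan3/io/sequence_file/output.hpp>
int main()
{
seqan3::sequence_file_output fout{std::ostringstream{}, seqan3::format_fasta{}};
// ^ no need to specify the template arguments
}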
#include <filesystem>
#include <sstream>
#include <seqan3/io/sequence_file/input.hpp>
#include <seqan3/io/sequence_file/output.hpp>
auto input = R"(@TEST1
ACGT
+
##!#
@Test2
AGGCTGA
+
##!#!!!
@Test3
GGAGTATAATATATATATATATAT
+
##!###!###!###!###!###!#)";
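int main()
{
// file format conversion in one line:
seqan3::sequence_file_output{std::ostringstream{}, seqan3::format_fasta{}} =
seqan3::sequence_file_input{std::istringstream{input}, seqan3::format_fastq{}};
// with seqan3::sequence_file_output as a variable:
seqan3::sequence_file_output fout{std::ostringstream{}, seqan3::format_fasta{}};
seqan3::sequence_file_input fin{std::istringstream{input}, seqan3::format_fastq{}};
fout = fin;
// or in pipe notation:
seqan3::sequence_file_input{std::istringstream{input}, seqan3::format_fastq{}}
| seqan3::sequence_file_output{std::ostringstream{}, seqan3::format_fasta{}};
}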
#include <sstream>
#include <string>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/io/sequence_file/output.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::sequence_file_output fout{std::ostringstream{}, seqan3::format_fasta{}};
for (int i = 0; i < 5; ++i) // some criteria
{
std::string id{"test_id"};
seqan3::dna5_vector seq{"ACGT"_dna5};
// ...
fout.emplace_back(seq, id);
}
}
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/alphabet/quality/phred42.hpp>
#include <seqan3/alphabet/quality/qualified.hpp>
#include <seqan3/io/sequence_file/output.hpp>
#include <seqan3/utility/views/elements.hpp>
int main()
{
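using namespace seqan3::literals;
seqan3::sequence_file_output fout{std::ostringstream{},
seqan3::format_fastq{},
seqan3::fields<seqan3::field::id, seqan3::field::seq, seqan3::field::qual>{}};
for (int i = 0; i < 5; i++)
{
std::string id{"test_id"};
// vector of combined data structure:
std::vector<seqan3::qualified<seqan3::dna5, seqan3::phred42>> seq_qual{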
{'A'_dna5, '1'_phred42},
{'C'_dna5, '3'_phred42}};
auto view_on_seq = seqan3::views::elements<0>(seq_qual);
auto view_on_qual = seqan3::views::elements<1>(seq_qual);
// ...
// Note that the order of the arguments is different from the default `seq, id, qual`,
// because you specified that ID should be first in the fields template argument.
fout.emplace_back(id, view_on_seq, view_on_qual);
// or:
fout.push_back(std::tie(id, view_on_seq, view_on_qual));
}
}
#include <sstream>
#include <seqan3/io/sequence_file/input.hpp>
#include <seqan3/io/sequence_file/output.hpp>
auto input = R"(@TEST1
ACGT
+
##!#
@Test2
AGGCTGA
+
##!#!!!
@Test3
GGAGTATAATATATATATATATAT
+
##!###!###!###!###!###!#)";
int main()
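{
seqan3::sequence_file_input fin{std::istringstream{input},
seqan3::format_fastq{},
seqan3::fields<seqan3::field::id, seqan3::field::seq, seqan3::field::qual>{}};
seqan3::sequence_file_output fout{std::ostringstream{},
seqan3::format_fastq{}}; // doesn't have to match the configuration
for (auto & r : fin)
{
if (true) // r fulfills some criterion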
fout.push_back(r);
}
}
#include <sstream>
#include <string>
#include <tuple>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/io/sequence_file/output.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::sequence_file_output fout{std::ostringstream{}, seqan3::format_fasta{}};
for (int i = 0; i < 5; ++i) // some criteria
{
std::string id{"test_id"};
seqan3::dna5_vector seq{"ACGT"_dna5};
// ...
fout.push_back(std::tie(seq, id));
}
}
#include <sstream>
#include <string>
#include <tuple>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/io/sequence_file/output.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::sequence_file_output fout{std::ostringstream{}, seqan3::format_fasta{}};
auto it = fout.begin();
for (int i = 0; i < 5; ++i) // some criteria
{
std::string id{"test_id"};
seqan3::dna5_vector seq{"ACGT"_dna5};
// ...
// assign to iterator
*it = std::tie(seq, id);
// is the same as:
fout.push_back(std::tie(seq, id));
}
}
#include <sstream>
#include <string>
#include <tuple>
#include <seqan3/alphabet/nucleotide/dna5.hpp>
#include <seqan3/io/sequence_file/output.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::sequence_file_output fout{std::ostringstream{}, seqan3::format_fasta{}};
for (int i = 0; i < 5; ++i) // ...
{
std::string id{"test_id"};
seqan3::dna5_vector seq{"ACGT"_dna5};
// ...
fout.emplace_back(seq, id); // as individual variables
// or:
fout.push_back(std::tie(seq, id)); // as a tuple
}
}
#include <seqan3/test/snippet/create_temporary_snippet_file.hpp>
// std::filesystem::current_path() / "my.fasta" will be deleted after the execution
seqan3::test::create_temporary_snippet_file my_fasta{"my.fasta", ""};
#include <filesystem>
#include <seqan3/io/sequence_file/output.hpp>
int main()
{
auto fasta_file = std::filesystem::current_path() / "my.fasta";
// FASTA format detected, std::ofstream opened for file
seqan3::sequence_file_output fout{fasta_file};
}
#if !SEQAN3_WORKAROUND_GCC_96070
# include <iterator>
# include <ranges>
# include <sstream>
# include <seqan3/io/sequence_file/input.hpp>
# include <seqan3/io/sequence_file/output.hpp>
auto input = R"(@TEST1
ACGT
+
##!#
@Test2
AGGCTGA
+
##!#!!!
@Test3
GGAGTATAATATATATATATATAT
+
##!###!###!###!###!###!#)";
int main()
{
// minimum_average_quality_filter and minimum_sequence_length_filter need to be implemented first
auto minimum_sequence_length_filter = std::views::filter(
[](auto rec)
{
return std::ranges::distance(rec.sequence()) >= 50;
});
auto minimum_average_quality_filter = std::views::filter(
[](auto const & record)
{
double qual_sum{0}; // summation of the qualities
for (auto chr : record.base_qualities())
qual_sum += seqan3::to_phred(chr);
// check if average quality is greater than 20.
return qual_sum / (std::ranges::distance(record.base_qualities())) >= 20;
});
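auto input_file = seqan3::sequence_file_input{std::istringstream{input}, seqan3::format_fastq{}};
input_file | minimum_average_quality_filter | minimum_sequence_length_filter | std::views::take(3)
| seqan3::sequence_file_output{std::ostringstream{}, seqan3::format_fasta{}};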
}
#endif // !SEQAN3_WORKAROUND_GCC_96070
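The average-quality predicate above boils down to simple arithmetic on the quality characters. The following is a minimal sketch with the standard library only, assuming Phred+33 encoded quality strings (as in the FASTQ input above, where `!` stands for a score of 0); the function names `mean_phred33` and `passes_quality` are hypothetical.

```cpp
#include <string_view>

// Mean Phred score of a Phred+33 encoded quality string; 0.0 for an empty string.
double mean_phred33(std::string_view qualities)
{
    if (qualities.empty())
        return 0.0; // avoid dividing by zero

    double sum{0.0};
    for (char c : qualities)
        sum += static_cast<double>(c - '!'); // '!' (ASCII 33) encodes a Phred score of 0

    return sum / static_cast<double>(qualities.size());
}

// True if the mean Phred score reaches the given minimum.
bool passes_quality(std::string_view qualities, double minimum)
{
    return mean_phred33(qualities) >= minimum;
}
```

For the first record above, the quality string `##!#` has the scores 2, 2, 0 and 2, so its mean is 1.5 and it does not reach a minimum of 20.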
#include <sstream>
#include <seqan3/io/structure_file/input.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA
(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))). (-17.50)
> example
UUGGAGUACACAACCUGUACACUCUUUC
..(((((..(((...)))..)))))... (-3.71))";
int main()
{
seqan3::structure_file_input fin{std::istringstream{input}, seqan3::format_vienna{}};
}
#include <filesystem>
#include <sstream>
#include <seqan3/alphabet/nucleotide/rna4.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/input.hpp>
#include <seqan3/io/structure_file/output.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA
(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))). (-17.50)
> example
UUGGAGUACACAACCUGUACACUCUUUC
..(((((..(((...)))..)))))... (-3.71))";
int main()
{
auto tmp_file = std::filesystem::temp_directory_path() / "my.dbn";
using namespace seqan3::literals;
// First, create my.dbn in the temporary directory
{
seqan3::structure_file_output fout{tmp_file};
fout.emplace_back("GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA"_rna4,
"S.cerevisiae_tRNA-PHE M10740/1-73",
"(((((((..((((........)))).((((.........)))).....(((((.......))))))))))))."_wuss51);
fout.emplace_back("UUGGAGUACACAACCUGUACACUCUUUC"_rna4, "example", "..(((((..(((...)))..)))))..."_wuss51);
}
seqan3::structure_file_input sf{tmp_file}; // Vienna with RNA sequences assumed, use regular std::ifstream as stream
// ^ no need to specify the template arguments
}
#include <sstream>
#include <utility>
#include <vector>
#include <seqan3/io/structure_file/input.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA
(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))). (-17.50)
> example
UUGGAGUACACAACCUGUACACUCUUUC
..(((((..(((...)))..)))))... (-3.71))";
int main()
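{
seqan3::structure_file_input fin{std::istringstream{input}, seqan3::format_vienna{}};
using record_type = typename decltype(fin)::record_type;
std::vector<record_type> records{}; // store all records in a vector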
for (auto & rec : fin)
records.push_back(std::move(rec));
}
#include <ranges>
#include <sstream>
#include <seqan3/alphabet/views/to_char.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/structure_file/input.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA
(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))). (-17.50)
> example
UUGGAGUACACAACCUGUACACUCUUUC
..(((((..(((...)))..)))))... (-3.71))";
int main()
{
seqan3::structure_file_input fin{std::istringstream{input}, seqan3::format_vienna{}};
auto minimum_length5_filter = std::views::filter(
[](auto const & rec)
{
return std::ranges::size(rec.sequence()) >= 5;
});
for (auto & rec : fin | minimum_length5_filter) // only record with sequence length >= 5 will "appear"
seqan3::debug_stream << (rec.sequence() | seqan3::views::to_char) << '\n';
}
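#include <sstream>
#include <seqan3/alphabet/nucleotide/rna4.hpp>
#include <seqan3/io/structure_file/input.hpp>
// Define custom traits
struct my_traits : seqan3::structure_file_input_default_traits_rna
{
using seq_alphabet = seqan3::rna4; // instead of rna5
};
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA
(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))). (-17.50)
> example
UUGGAGUACACAACCUGUACACUCUUUC
..(((((..(((...)))..)))))... (-3.71))";
int main()
{
seqan3::structure_file_input<my_traits> fin{std::istringstream{input}, seqan3::format_vienna{}};
}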
#include <sstream>
#include <utility>
#include <seqan3/io/structure_file/input.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA
(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))). (-17.50)
> example
UUGGAGUACACAACCUGUACACUCUUUC
..(((((..(((...)))..)))))... (-3.71))";
int main()
{
seqan3::structure_file_input fin{std::istringstream{input}, seqan3::format_vienna{}};
auto rec0 = std::move(fin.front());
}
#include <sstream>
#include <seqan3/alphabet/views/to_char.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/structure_file/input.hpp>
#include <seqan3/utility/type_list/type_list.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
ACEWACEW
HGEBHHHH
> example
ACEWACEWACEWACEW
HGEBHHHHHGEBHHHH)";
int main()
{
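using structure_file_input_t =
seqan3::structure_file_input<seqan3::structure_file_input_default_traits_aa,
seqan3::fields<seqan3::field::seq, seqan3::field::id, seqan3::field::structure>,
seqan3::type_list<seqan3::format_vienna>>;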
structure_file_input_t fin{std::istringstream{input}, seqan3::format_vienna{}};
for (auto & rec : fin)
{
seqan3::debug_stream << "ID: " << rec.id() << '\n';
// sequence and structure are converted to char on-the-fly
seqan3::debug_stream << "SEQ: " << (rec.sequence() | seqan3::views::to_char) << '\n';
seqan3::debug_stream << "STRUCTURE: " << (rec.sequence_structure() | seqan3::views::to_char) << '\n';
}
}
#include <sstream>
#include <iostream>
#include <seqan3/io/structure_file/input.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA
(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))). (-17.50)
> example
UUGGAGUACACAACCUGUACACUCUUUC
..(((((..(((...)))..)))))... (-3.71))";
int main()
{
seqan3::structure_file_input fin{std::istringstream{input}, seqan3::format_vienna{}};
auto it = std::ranges::begin(fin);
// the following are equivalent:
auto & rec0 = *it;
auto & rec1 = fin.front();
std::cout << std::boolalpha << (rec0.id() == rec1.id()) << '\n'; // true
// both become invalid after incrementing "it"!
}
#include <sstream>
#include <seqan3/alphabet/views/to_char.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/structure_file/input.hpp>
#include <seqan3/utility/views/elements.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA
(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))). (-17.50)
> example
UUGGAGUACACAACCUGUACACUCUUUC
..(((((..(((...)))..)))))... (-3.71))";
int main()
{
using seqan3::get;
seqan3::structure_file_input fin{std::istringstream{input},
seqan3::format_vienna{},
seqan3::fields<seqan3::field::id, seqan3::field::structured_seq>{}};
// note that the order is now different, "id" comes first, because it was specified first
for (auto & [id, struc_seq] : fin)
{
seqan3::debug_stream << "ID: " << id << '\n';
// sequence and structure are part of the same vector, of type std::vector<structured_rna<rna5, wuss51>>
// sequence and structure strings are extracted and converted to char on-the-fly
seqan3::debug_stream << "SEQ: " << (struc_seq | seqan3::views::elements<0> | seqan3::views::to_char) << '\n';
seqan3::debug_stream << "STRUCTURE: " << (struc_seq | seqan3::views::elements<1> | seqan3::views::to_char)
<< '\n';
}
}
#include <sstream>
#include <seqan3/alphabet/views/to_char.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/structure_file/input.hpp>
#include <seqan3/utility/type_list/type_list.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
ACEWACEW
HGEBHHHH
> example
ACEWACEWACEWACEW
HGEBHHHHHGEBHHHH)";
int main()
{
using seqan3::get;
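using structure_file_input_t =
seqan3::structure_file_input<seqan3::structure_file_input_default_traits_aa,
seqan3::fields<seqan3::field::seq, seqan3::field::id, seqan3::field::structure>,
seqan3::type_list<seqan3::format_vienna>>;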
structure_file_input_t fin{std::istringstream{input}, seqan3::format_vienna{}};
for (auto & [seq, id, structure] : fin)
{
seqan3::debug_stream << "ID: " << id << '\n';
// sequence and structure are converted to char on-the-fly
seqan3::debug_stream << "SEQ: " << (seq | seqan3::views::to_char) << '\n';
seqan3::debug_stream << "STRUCTURE: " << (structure | seqan3::views::to_char) << '\n';
}
}
#include <sstream>
#include <seqan3/io/structure_file/input.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
ACEWACEW
HGEBHHHH
> example
ACEWACEWACEWACEW
HGEBHHHHHGEBHHHH)";
int main()
{
// ... input had amino acid sequences
seqan3::structure_file_input<seqan3::structure_file_input_default_traits_aa> fin{std::istringstream{input},
seqan3::format_vienna{}};
}
#include <sstream>
#include <string>
#include <vector>
#include <seqan3/alphabet/nucleotide/rna5.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
#include <seqan3/utility/views/zip.hpp>
using namespace seqan3::literals;
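struct data_storage_t
{
// fill these containers with your data (same number of entries in each)
std::vector<seqan3::rna5_vector> sequences{};
std::vector<std::string> ids{};
std::vector<std::vector<seqan3::wuss51>> structures{};
};
int main()
{
seqan3::structure_file_output fout{std::ostringstream{}, seqan3::format_vienna{}};
data_storage_t data_storage{}; // a global or globally used variable in your program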
// ... in your file writing function:
fout = seqan3::views::zip(data_storage.sequences, data_storage.ids, data_storage.structures);
}
#include <sstream>
#include <string>
#include <vector>
#include <seqan3/alphabet/nucleotide/rna5.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
int main()
{
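using namespace seqan3::literals;
seqan3::structure_file_output fout{std::ostringstream{}, seqan3::format_vienna{}};
for (int i = 0; i < 10; i++) // ...
{
std::string id{"test_id"};
seqan3::rna5_vector seq{"AGGGUU"_rna5};
std::vector<seqan3::wuss51> structure{".(())."_wuss51}; // example structure with the same length as seq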
// ...
fout.emplace_back(seq, id, structure);
}
}
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/rna5.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::structure_file_output fout{std::ostringstream{}, seqan3::format_vienna{}};
std::vector<std::tuple<seqan3::rna5_vector, std::string, std::vector<seqan3::wuss51>>> range{
{"ACGT"_rna5, "First", "...."_wuss51},
{"NATA"_rna5, "2nd", "...."_wuss51},
{"GATA"_rna5, "Third", "...."_wuss51}}; // a range of "records"
fout = range; // will iterate over the records and write them
}
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/rna5.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
int main()
{
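using namespace seqan3::literals;
seqan3::structure_file_output fout{std::ostringstream{}, seqan3::format_vienna{}};
for (int i = 0; i < 10; i++) // ...
{
std::string id{"test_id"};
seqan3::rna5_vector seq{"ACGU"_rna5};
std::vector<seqan3::wuss51> structure{"...."_wuss51}; // example structure with the same length as seq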
// ...
fout.emplace_back(seq, id, structure); // as individual variables
// or:
fout.push_back(std::tie(seq, id, structure)); // as a tuple
}
}
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/rna5.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::structure_file_output fout{std::ostringstream{}, seqan3::format_vienna{}};
std::vector<std::tuple<seqan3::rna5_vector, std::string, std::vector<seqan3::wuss51>>> range{
{"ACGT"_rna5, "First", "...."_wuss51},
{"NATA"_rna5, "2nd", "...."_wuss51},
{"GATA"_rna5, "Third", "...."_wuss51}}; // a range of "records"
fout = range; // will iterate over the records and write them
}
#include <sstream>
#include <seqan3/io/structure_file/input.hpp>
#include <seqan3/io/structure_file/output.hpp>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA
(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))). (-17.50)
> example
UUGGAGUACACAACCUGUACACUCUUUC
..(((((..(((...)))..)))))... (-3.71))";
int main()
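{
seqan3::structure_file_input fin{std::istringstream{input}, seqan3::format_vienna{}};
seqan3::structure_file_output fout{std::ostringstream{}, seqan3::format_vienna{}};
bool criteria = true;
// the output doesn't have to match the configuration of the input
for (auto & r : fin)
{
if (criteria) // r fulfills some filter criterion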
fout.push_back(r);
}
}
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/rna5.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::structure_file_output fout{std::ostringstream{}, seqan3::format_vienna{}};
std::vector<std::tuple<seqan3::rna5_vector, std::string, std::vector<seqan3::wuss51>>> range{
{"ACGT"_rna5, "First", "...."_wuss51},
{"NATA"_rna5, "2nd", "...."_wuss51},
{"GATA"_rna5, "Third", "...."_wuss51}}; // a range of "records"
range | fout;
// the same as:
fout = range;
}
#include <sstream>
auto input = R"(> S.cerevisiae_tRNA-PHE M10740/1-73
GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUUUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA
(((((((..((((........)))).((((.........)))).....(((((.......)))))))))))). (-17.50)
> example
UUGGAGUACACAACCUGUACACUCUUUC
..(((((..(((...)))..)))))... (-3.71))";
int main()
{
}
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/rna5.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::structure_file_output fout{std::ostringstream{}, seqan3::format_vienna{}};
auto it = fout.begin();
for (int i = 0; i < 10; i++) // ...
{
std::string id{"test_id"};
seqan3::rna5_vector seq{"AGGGUU"_rna5};
std::vector<seqan3::wuss51> structure{".(())."_wuss51}; // example structure with the same length as seq
// ...
// assign to iterator
*it = std::tie(seq, id, structure);
// is the same as:
fout.push_back(std::tie(seq, id, structure));
}
}
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/rna5.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::structure_file_output fout{std::ostringstream{}, seqan3::format_vienna{}};
for (int i = 0; i < 10; i++) // ...
{
std::string id{"test_id"};
seqan3::rna5_vector seq{"AGGGUU"_rna5};
std::vector<seqan3::wuss51> structure{".(())."_wuss51}; // example structure with the same length as seq
// ...
fout.push_back(std::tie(seq, id, structure));
}
}
#include <filesystem>
#include <seqan3/io/structure_file/output.hpp>
int main()
{
auto tmp_file = std::filesystem::temp_directory_path() / "my.dbn";
seqan3::structure_file_output fout{tmp_file}; // Vienna format detected, std::ofstream opened for file
}
#include <sstream>
#include <string>
#include <tuple>
#include <vector>
#include <seqan3/alphabet/nucleotide/rna5.hpp>
#include <seqan3/alphabet/structure/structured_rna.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
int main()
{
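using namespace seqan3::literals;
seqan3::structure_file_output fout{std::ostringstream{},
seqan3::format_vienna{},
seqan3::fields<seqan3::field::id, seqan3::field::structured_seq>{}};
for (int i = 0; i < 10; i++) // ...
{
std::string id{"test_id"};
// vector of combined data structure (example values)
std::vector<seqan3::structured_rna<seqan3::rna5, seqan3::wuss51>> structured_sequence{{'N'_rna5, '.'_wuss51},
{'A'_rna5, '('_wuss51},
{'C'_rna5, ')'_wuss51}};
// ...
// note also that the order of the arguments is now different, because
// you specified that `seqan3::field::id` should be first in the fields template argument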
fout.emplace_back(id, structured_sequence);
// or:
fout.push_back(std::tie(id, structured_sequence));
}
}
#include <iostream>
#include <seqan3/alphabet/nucleotide/rna4.hpp>
#include <seqan3/alphabet/structure/wuss.hpp>
#include <seqan3/io/structure_file/output.hpp>
int main()
{
using namespace seqan3::literals;
seqan3::structure_file_output fout{std::cout, seqan3::format_vienna{}};
// ^ no need to specify the template arguments
fout.emplace_back("AACGUU"_rna4, "example_id", ".(())."_wuss51); // default order for vienna: SEQ, ID, STRUCTURE
}
#include <cstdlib> // std::rand
#include <future> // std::async
#include <string> // std::string
#include <chrono> // std::chrono::milliseconds
#include <ctime> // std::time
#include <sstream> // std::istringstream
#include <thread> // std::this_thread
#include <seqan3/core/debug_stream.hpp> // seqan3::debug_stream
#include <seqan3/io/sequence_file/input.hpp> // seqan3::sequence_file_input
#include <seqan3/io/views/async_input_buffer.hpp> // seqan3::views::async_input_buffer
std::string fasta_file =
R"(> seq1
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq2
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq3
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq4
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq5
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq6
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq7
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq8
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq9
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq10
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq11
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq12
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
)";
int main()
{
// initialise random number generator, only needed for demonstration purposes
std::srand(std::time(nullptr));
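// create an input file from the string above
seqan3::sequence_file_input fin{std::istringstream{fasta_file}, seqan3::format_fasta{}};
// create the async buffer around the input file
// spawns a background thread that tries to keep four records in the buffer
auto v = fin | seqan3::views::async_input_buffer(4);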
// create a lambda function that iterates over the async buffer when called
// (the buffer gets dynamically refilled as soon as possible)
auto worker = [&v]()
{
for (auto & record : v)
{
// pretend we are doing some work
std::this_thread::sleep_for(std::chrono::milliseconds(std::rand() % 1000));
// print current thread and sequence ID
seqan3::debug_stream << "Thread: " << std::this_thread::get_id() << '\t' << "Seq: " << record.id()
<< '\n';
}
};
// launch two threads and pass the lambda function to both
auto f0 = std::async(std::launch::async, worker);
auto f1 = std::async(std::launch::async, worker);
}
#include <string>
#include <ranges>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/detail/take_exactly_view.hpp>
int main()
{
std::string vec{"foobar"};
auto v = vec | seqan3::detail::take_exactly(3); // or seqan3::detail::take_exactly_or_throw
seqan3::debug_stream << v << '\n'; // "foo"
seqan3::debug_stream << std::ranges::size(v) << '\n'; // 3
auto v2 = vec | seqan3::detail::take_exactly(9);
seqan3::debug_stream << std::ranges::size(v2) << '\n'; // 9 <- here be dragons! (undefined behaviour)
}
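The last line above shows why an unchecked "take exactly n" is dangerous: the view reports a size of 9 although the string holds only 6 characters. The following is a minimal sketch of the checked alternative, written with the standard library only. `take_exactly_checked` is a hypothetical helper for illustration and not the SeqAn3 implementation; it refuses to return a view that claims more elements than exist.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string_view>

// Hypothetical helper (not part of SeqAn3): returns exactly the first n characters
// and throws if the input is shorter, instead of reporting a size it cannot back up.
std::string_view take_exactly_checked(std::string_view text, std::size_t n)
{
    if (text.size() < n)
        throw std::out_of_range{"take_exactly_checked: input has fewer than n elements"};

    return text.substr(0, n);
}
```

Calling `take_exactly_checked("foobar", 3)` yields `foo`, while asking for 9 characters throws `std::out_of_range` instead of running into undefined behaviour.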
#include <ranges>
#include <string>
int main()
{
std::views::take_while(
[](auto const & l)
{
return (l != '\r') && (l != '\n');
});
}
#include <string>
int main()
{
std::string vec{"foo\nbar"};