SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
seqan3::format_fasta Class Reference

The FASTA format. More...

#include <seqan3/io/sequence_file/format_fasta.hpp>


Public Member Functions

Constructors, destructor and assignment
 format_fasta () noexcept=default
 Defaulted.
 
 format_fasta (format_fasta const &) noexcept=default
 Defaulted.
 
format_fasta & operator= (format_fasta const &) noexcept=default
 Defaulted.
 
 format_fasta (format_fasta &&) noexcept=default
 Defaulted.
 
format_fasta & operator= (format_fasta &&) noexcept=default
 Defaulted.
 
 ~format_fasta () noexcept=default
 Defaulted.
 

Static Public Attributes

static std::vector< std::string > file_extensions
 The valid file extensions for this format; note that you can modify this value.
 

Protected Member Functions

template<typename stream_type , typename legal_alph_type , typename stream_pos_type , typename seq_type , typename id_type , typename qual_type >
void read_sequence_record (stream_type &stream, sequence_file_input_options< legal_alph_type > const &options, stream_pos_type &position_buffer, seq_type &sequence, id_type &id, qual_type &qualities)
 Read from the specified stream and back-insert into the given field buffers.
 
template<typename stream_type , typename seq_type , typename id_type , typename qual_type >
void write_sequence_record (stream_type &stream, sequence_file_output_options const &options, seq_type &&sequence, id_type &&id, qual_type &&qualities)
 Write the given fields to the specified stream.
 

Private Member Functions

template<typename stream_view_t , typename seq_legal_alph_type , typename id_type >
void read_id (stream_view_t &stream_view, sequence_file_input_options< seq_legal_alph_type > const &options, id_type &id)
 Implementation of reading the ID.
 
template<typename stream_view_t , typename seq_legal_alph_type , typename seq_type >
void read_seq (stream_view_t &stream_view, sequence_file_input_options< seq_legal_alph_type > const &, seq_type &seq)
 Implementation of reading the sequence.
 
template<typename stream_it_t , typename id_type >
void write_id (stream_it_t &stream_it, sequence_file_output_options const &options, id_type &&id)
 Implementation of writing the ID.
 
template<typename stream_it_t , typename seq_type >
void write_seq (stream_it_t &stream_it, sequence_file_output_options const &options, seq_type &&seq)
 Implementation of writing the sequence.
 

Additional Inherited Members

Detailed Description

The FASTA format.

Introduction

FASTA is the de facto standard for sequence storage in bioinformatics. See the Wikipedia article for an in-depth description of the format.

Field specialisation

The FASTA format provides the fields seqan3::field::seq and seqan3::field::id. Both fields are required when writing.

Implementation notes

When reading the ID line, the leading marker character (either ; or >) and any blank characters before the actual ID are stripped.

This implementation supports the following lesser-known, optional features of the format:

  • ID lines beginning with ; instead of >
  • line breaks and other whitespace characters in any part of the sequence
  • character counts within the sequence (they are simply ignored)

The following optional features are currently not supported:

  • Multiple comment lines (starting with either ; or >); only one ID line before the sequence line is accepted.
Remarks
For a complete overview, take a look at Sequence File.
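
The reading rules from the implementation notes can be illustrated with a small standalone sketch. It does not use SeqAn3 and is not the library's implementation; the helper names strip_id_line, append_sequence_line and read_fasta_sketch are hypothetical, and the sketch assumes plain std::string buffers instead of alphabet ranges.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper (not SeqAn3 API): strips the marker ('>' or ';')
// and any blank characters that precede the actual ID.
inline std::string strip_id_line(std::string_view line)
{
    if (!line.empty() && (line.front() == '>' || line.front() == ';'))
        line.remove_prefix(1);
    while (!line.empty() && (line.front() == ' ' || line.front() == '\t'))
        line.remove_prefix(1);
    return std::string{line};
}

// Hypothetical helper: appends the letters of one sequence line and skips
// whitespace and digits (character counts), which are simply ignored.
inline void append_sequence_line(std::string_view line, std::string & sequence)
{
    for (char const c : line)
    {
        auto const uc = static_cast<unsigned char>(c);
        if (std::isspace(uc) || std::isdigit(uc))
            continue;
        sequence.push_back(c);
    }
}

// Reads (id, sequence) pairs; every line starting with '>' or ';' opens a new record.
inline std::vector<std::pair<std::string, std::string>> read_fasta_sketch(std::istream & stream)
{
    std::vector<std::pair<std::string, std::string>> records;
    std::string line;
    while (std::getline(stream, line))
    {
        if (line.empty())
            continue;
        if (line.front() == '>' || line.front() == ';')
            records.emplace_back(strip_id_line(line), std::string{});
        else if (!records.empty())
            append_sequence_line(line, records.back().second);
    }
    return records;
}

// Usage: one record with '>' and one with ';', with blanks and a character count.
inline void usage_example()
{
    std::istringstream input{"> id1\nACGT\n  5 TTGA\n;id2\nGG CC\n"};
    auto const records = read_fasta_sketch(input);
    assert(records.size() == 2);
    assert(records[0].first == "id1" && records[0].second == "ACGTTTGA");
    assert(records[1].first == "id2" && records[1].second == "GGCC");
}
```

Because every marker line opens a new record in this sketch, a second comment line directly after an ID line would produce a record with an empty sequence, which mirrors the restriction to a single ID line per record.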

Member Function Documentation

◆ read_sequence_record()

template<typename stream_type , typename legal_alph_type , typename stream_pos_type , typename seq_type , typename id_type , typename qual_type >
void seqan3::format_fasta::read_sequence_record (stream_type &stream, sequence_file_input_options< legal_alph_type > const &options, stream_pos_type &position_buffer, seq_type &sequence, id_type &id, qual_type &qualities)
inline protected

Read from the specified stream and back-insert into the given field buffers.

Template Parameters
  stream_type      Input stream, must satisfy seqan3::input_stream_over with char.
  stream_pos_type  Buffer for storing the current record's file position.
  seq_type         Type of the seqan3::field::seq input; must satisfy std::ranges::output_range over a seqan3::alphabet.
  id_type          Type of the seqan3::field::id input; must satisfy std::ranges::output_range over a seqan3::alphabet.
  qual_type        Type of the seqan3::field::qual input; must satisfy std::ranges::output_range over a seqan3::writable_quality_alphabet.
Parameters
  [in,out]  stream           The input stream to read from.
  [in,out]  position_buffer  The buffer to store the current record's file position.
  [in]      options          File specific options passed to the format.
  [out]     sequence         The buffer for seqan3::field::seq input, i.e. the "sequence".
  [out]     id               The buffer for seqan3::field::id input, e.g. the header line in FASTA.
  [out]     qualities        The buffer for seqan3::field::qual input.

Additional requirements

  • The function must also accept std::ignore as parameter for any of the fields. [This is enforced by the concept checker!]
  • In this case the data read for that field shall be discarded by the format.
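
A minimal standalone sketch of how a reader can accept std::ignore for a field and discard the data. This is an illustration under assumptions, not the SeqAn3 implementation: the trait is_ignore_v and the function read_field_sketch are hypothetical names, and the input is modelled as a std::string_view rather than a stream.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <tuple>
#include <type_traits>

// Hypothetical trait for this sketch: true if T (ignoring const and references)
// is the type of std::ignore.
template <typename T>
inline constexpr bool is_ignore_v =
    std::is_same_v<std::remove_cv_t<std::remove_reference_t<T>>,
                   std::remove_cv_t<std::remove_reference_t<decltype(std::ignore)>>>;

// Hypothetical reader: back-inserts the characters into `buffer`, or discards them
// when the caller passed std::ignore. The input is consumed in both cases, so the
// number of consumed characters is returned either way.
template <typename field_type>
std::size_t read_field_sketch(std::string_view input, [[maybe_unused]] field_type & buffer)
{
    if constexpr (!is_ignore_v<field_type>)
    {
        for (char const c : input)
            buffer.push_back(c);
    }
    return input.size();
}

// Usage: the same call works with a real buffer and with std::ignore.
inline void usage_example()
{
    std::string id;
    assert(read_field_sketch("seq1", id) == 4);
    assert(id == "seq1");
    assert(read_field_sketch("seq1", std::ignore) == 4);
}
```

The decision is made at compile time with if constexpr, so the discarded branch never requires std::ignore to offer push_back.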

◆ write_sequence_record()

template<typename stream_type , typename seq_type , typename id_type , typename qual_type >
void seqan3::format_fasta::write_sequence_record (stream_type &stream, sequence_file_output_options const &options, seq_type &&sequence, id_type &&id, qual_type &&qualities)
inline protected

Write the given fields to the specified stream.

Template Parameters
  stream_type  Output stream, must satisfy seqan3::output_stream_over with char.
  seq_type     Type of the seqan3::field::seq output; must satisfy std::ranges::output_range over a seqan3::alphabet.
  id_type      Type of the seqan3::field::id output; must satisfy std::ranges::output_range over a seqan3::alphabet.
  qual_type    Type of the seqan3::field::qual output; must satisfy std::ranges::output_range over a seqan3::quality_alphabet.
Parameters
  [in,out]  stream     The output stream to write into.
  [in]      options    File specific options passed to the format.
  [in]      sequence   The data for seqan3::field::seq, i.e. the "sequence".
  [in]      id         The data for seqan3::field::id, e.g. the header line in FASTA.
  [in]      qualities  The data for seqan3::field::qual.

Additional requirements

  • The format must also accept std::ignore as parameter for any of the fields; however, it shall throw an exception if one of the fields required for writing the format is marked as such. [This shall be checked inside the function.]
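
The requirement above can be sketched as a standalone writer that compiles when std::ignore is passed but throws if a required field (sequence or ID for FASTA) is ignored. This is an assumption-laden illustration, not SeqAn3 code: is_ignore_v and write_record_sketch are hypothetical names, and the choice of std::logic_error and the message text are assumptions of the sketch.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <tuple>
#include <type_traits>

// Hypothetical trait for this sketch: true if T (ignoring const and references)
// is the type of std::ignore.
template <typename T>
inline constexpr bool is_ignore_v =
    std::is_same_v<std::remove_cv_t<std::remove_reference_t<T>>,
                   std::remove_cv_t<std::remove_reference_t<decltype(std::ignore)>>>;

// Hypothetical writer: accepts std::ignore for any field at compile time, but
// throws at run time if a field required for writing is ignored.
template <typename seq_type, typename id_type>
void write_record_sketch(std::ostream & stream,
                         [[maybe_unused]] seq_type && sequence,
                         [[maybe_unused]] id_type && id)
{
    if constexpr (is_ignore_v<seq_type> || is_ignore_v<id_type>)
    {
        // Exception type and message are assumptions of this sketch.
        throw std::logic_error{"The sequence and ID fields are required when writing."};
    }
    else
    {
        stream << '>' << id << '\n' << sequence << '\n';
    }
}

// Usage: a complete record is written; an ignored required field throws.
inline void usage_example()
{
    std::ostringstream out;
    write_record_sketch(out, std::string{"ACGT"}, std::string{"seq1"});
    assert(out.str() == ">seq1\nACGT\n");

    bool threw = false;
    try
    {
        write_record_sketch(out, std::ignore, std::string{"seq1"});
    }
    catch (std::logic_error const &)
    {
        threw = true;
    }
    assert(threw);
}
```

Checking inside the function body, rather than constraining the template, keeps the call well-formed for every field combination and defers the error to the point of use.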

Implements seqan3::sequence_file_output_format< t >.

Member Data Documentation

◆ file_extensions

std::vector<std::string> seqan3::format_fasta::file_extensions
inline static
Initial value:
{
    {"fasta"},
    {"fa"},
    {"fna"},
    {"ffn"},
    {"faa"},
    {"frn"},
    {"fas"},
}

The valid file extensions for this format; note that you can modify this value.


The documentation for this class was generated from the following file: