IO related views.

Variables
constexpr auto	seqan3::views::async_input_buffer
	A view adapter that returns a concurrent-queue-like view over the underlying range.

Detailed Description

IO related views.

See also: IO

Variable Documentation

◆ async_input_buffer [experimental-api]

constexpr auto seqan3::views::async_input_buffer

experimental-api inline constexpr

A view adapter that returns a concurrent-queue-like view over the underlying range.

Template Parameters

urng_t	The type of the range being processed. See below for requirements.

Parameters

[in,out]	urange	The range being processed.
[in]	buffer_size	Size of the buffer. Choose the size (> 0) depending on the expected work per element.

Returns: A view that pre-fetches elements from the underlying range and provides a thread-safe interface. See below for the properties of the returned range.

Header File

#include <seqan3/io/views/async_input_buffer.hpp>

Summary

This view spawns a background thread that pre-fetches elements from the underlying range and stores them in a concurrent queue. Iterating over this view then pops elements out of the queue and returns them. This is primarily useful if dereferencing or incrementing the iterator of the underlying range is expensive, e.g. with SeqAn files, which perform I/O lazily.

Another advantage of this view is that multiple iterators can be created that are safe to iterate individually, even from different threads, i.e. you can use multiple threads to iterate safely over a single-pass input view with the added benefit of background pre-fetching.

In technical terms: this view facilitates a single-producer, multi-consumer design; it's a range interface over a concurrent queue.
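The single-producer, multi-consumer design can be sketched with the standard library alone. The block below is an illustration under stated assumptions, not SeqAn's implementation: the names bounded_queue and consume_concurrently are hypothetical, and a plain std::vector<int> stands in for the underlying range. One producer thread pre-fetches elements into a bounded queue while several consumer threads pop from it.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded queue; NOT the queue used inside SeqAn.
template <typename value_t>
class bounded_queue
{
public:
    explicit bounded_queue(std::size_t const capacity) : capacity_{capacity}
    {
        assert(capacity > 0);
    }

    // Called by the single producer; blocks while the queue is full.
    void push(value_t value)
    {
        std::unique_lock<std::mutex> lock{mutex_};
        not_full_.wait(lock, [this] { return queue_.size() < capacity_; });
        queue_.push(std::move(value));
        not_empty_.notify_one();
    }

    // Called by the producer once the underlying range is exhausted.
    void close()
    {
        {
            std::lock_guard<std::mutex> lock{mutex_};
            closed_ = true;
        }
        not_empty_.notify_all();
    }

    // Called by any consumer; returns std::nullopt once the queue is closed and drained.
    std::optional<value_t> pop()
    {
        std::unique_lock<std::mutex> lock{mutex_};
        not_empty_.wait(lock, [this] { return !queue_.empty() || closed_; });
        if (queue_.empty())
            return std::nullopt;
        value_t value = std::move(queue_.front());
        queue_.pop();
        not_full_.notify_one();
        return value;
    }

private:
    std::size_t capacity_;
    std::queue<value_t> queue_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    bool closed_{false};
};

// One producer moves the elements of `input` into the queue; `consumer_count`
// consumers drain it. Returns how many elements each consumer processed.
inline std::vector<std::size_t> consume_concurrently(std::vector<int> input,
                                                     std::size_t const buffer_size,
                                                     std::size_t const consumer_count)
{
    assert(consumer_count > 0); // without consumers the producer would block forever

    bounded_queue<int> queue{buffer_size};
    std::vector<std::size_t> processed(consumer_count, 0);

    std::thread producer{[&queue, &input]
                         {
                             for (int & element : input)
                                 queue.push(std::move(element));
                             queue.close();
                         }};

    std::vector<std::thread> consumers;
    for (std::size_t i = 0; i < consumer_count; ++i)
        consumers.emplace_back([&queue, &processed, i]
                               {
                                   while (queue.pop().has_value())
                                       ++processed[i]; // each consumer writes only its own slot
                               });

    producer.join();
    for (std::thread & consumer : consumers)
        consumer.join();

    return processed;
}
```

Calling consume_concurrently(std::vector<int>(100, 1), 4, 3) returns three counts that sum to 100. How the elements are split between the consumers depends on scheduling.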

Size of the buffer

The buffer_size parameter should be chosen depending on the expected work per element. For example, if the underlying range is an input file over short reads, a buffer size of 100 or 1000 could be beneficial. If, on the other hand, the file contains genome-sized sequences, it would be better to buffer only a single sequence (buffering 100 sequences would result in the entire file being preloaded and likely consuming significant memory).
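One possible way to turn this guidance into a number is to derive the buffer size from a memory budget. The helper below is only a heuristic sketch: suggest_buffer_size and its parameters memory_budget_bytes, expected_record_bytes and max_records are hypothetical and not part of SeqAn. The result is clamped so that at least one element is always buffered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of SeqAn: pick a buffer size so that the buffered
// records roughly fit into a memory budget, but never fewer than one record and
// never more than `max_records`.
inline std::size_t suggest_buffer_size(std::size_t const memory_budget_bytes,
                                       std::size_t const expected_record_bytes,
                                       std::size_t const max_records = 1000)
{
    assert(expected_record_bytes > 0);
    assert(max_records > 0);

    // How many records of the expected size fit into the budget (rounded down).
    std::size_t const fitting = memory_budget_bytes / expected_record_bytes;

    return std::clamp<std::size_t>(fitting, 1, max_records);
}
```

With a 64 MiB budget, short reads of about 300 bytes hit the cap of 1000, whereas records of about 200 MiB yield a buffer size of 1. The result could then be passed as buffer_size to the adaptor.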

Range consumption

This view always moves elements from the underlying range into its buffer which means that the elements in the underlying range will be invalidated! For underlying ranges that are single-pass, this makes no difference, but it might be unexpected for multi-pass ranges (std::ranges::forward_range).
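The effect on a multi-pass range can be illustrated without SeqAn. In this sketch the function drain is hypothetical and a std::vector<std::string> stands in for the underlying range: every element is moved into a buffer, so the source keeps its size but its elements are left in a valid but unspecified state.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for what the view does to its underlying range:
// every element is moved (not copied) into a buffer.
inline std::vector<std::string> drain(std::vector<std::string> & source)
{
    std::vector<std::string> buffer;
    buffer.reserve(source.size());

    for (std::string & element : source)
        buffer.push_back(std::move(element)); // `element` is now valid but unspecified

    // `source` still has the same number of elements, but their values must not be relied upon.
    return buffer;
}
```

After calling drain(source), the returned buffer holds the original values and source.size() is unchanged. Reading the old elements of source afterwards is legal but yields unspecified contents.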

Typically this adaptor is used when you want to consume the entire underlying range. Destructing this view before all elements have been read will also stop the thread that moves objects from the underlying range. In general, it is not safe to access the underlying range in other contexts once it has been passed to seqan3::views::async_input_buffer.

Note that in addition to the buffer of the view, every iterator has its own one-element buffer. Dereferencing the iterator returns a reference to the element in that buffer; usually you will want to move this element out of the buffer with std::move or std::ranges::iter_move. Incrementing the iterator refills the buffer from the queue inside the view (which in turn is then refilled from the underlying range).
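The two buffer levels can be modelled in a few lines. The following is a simplified, single-threaded sketch and not the view's real iterator: slot_cursor and collect are hypothetical names, and a std::deque stands in for the queue inside the view. Dereferencing yields a reference to the cursor's own slot, incrementing refills the slot from the queue, and the caller moves the element out instead of copying it.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of an iterator that owns a one-element buffer ("slot").
class slot_cursor
{
public:
    explicit slot_cursor(std::deque<std::string> & queue) : queue_{queue}
    {
        refill(); // fetch the first element so that dereferencing is valid
    }

    // Dereferencing returns a reference into the cursor's own slot.
    std::string & operator*()
    {
        return slot_;
    }

    // Incrementing refills the slot from the queue.
    slot_cursor & operator++()
    {
        refill();
        return *this;
    }

    bool at_end() const
    {
        return at_end_;
    }

private:
    void refill()
    {
        if (queue_.empty())
        {
            at_end_ = true;
            return;
        }
        slot_ = std::move(queue_.front());
        queue_.pop_front();
    }

    std::deque<std::string> & queue_; // stands in for the queue inside the view
    std::string slot_;                // the cursor's one-element buffer
    bool at_end_{false};
};

// Consumes the queue through the cursor, moving each element out of the slot.
inline std::vector<std::string> collect(std::deque<std::string> queue)
{
    std::vector<std::string> result;
    for (slot_cursor cursor{queue}; !cursor.at_end(); ++cursor)
        result.push_back(std::move(*cursor)); // move out of the slot rather than copy
    return result;
}
```

Moving from the slot is safe in this model because the next increment overwrites it before it is read again.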

View properties

concepts and reference type	`urng_t` (underlying range type)	`rrng_t` (returned range type)
std::ranges::input_range	required	preserved
std::ranges::forward_range		lost
std::ranges::bidirectional_range		lost
std::ranges::random_access_range		lost
std::ranges::contiguous_range		lost

std::ranges::viewable_range	required	guaranteed
std::ranges::view		guaranteed
std::ranges::sized_range		lost
std::ranges::common_range		lost
std::ranges::output_range		lost
seqan3::const_iterable_range		lost

std::ranges::range_reference_t		`std::ranges::range_value_t<urng_t> &`

std::iterator_traits ::iterator_category		none

See the views submodule documentation for detailed descriptions of the view properties.

Thread safety

The following operations are thread-safe:

calling .begin() and .end() on the view returned by this adaptor;
calling operators on the different iterator objects.

Calling operators on the same iterator object from different threads is not safe. In other words, you can pass the view to different threads by reference and have each of those threads call begin() on the view and then perform operations (dereference, increment, ...) on that iterator from the respective thread. You cannot, however, call begin() in a parent thread, pass the resulting iterator to different threads and operate on it concurrently.

Example

// SPDX-FileCopyrightText: 2006-2025 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2025 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
 
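#include <chrono>  // std::chrono::milliseconds
#include <cstdlib> // std::rand, std::srand
#include <ctime>   // std::time
#include <future>  // std::async
#include <sstream> // std::istringstream
#include <string>  // std::string
#include <thread>  // std::this_thread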
 
#include <seqan3/core/debug_stream.hpp>           // seqan3::debug_stream
#include <seqan3/io/sequence_file/input.hpp>      // seqan3::sequence_file_input
#include <seqan3/io/views/async_input_buffer.hpp> // seqan3::views::async_input_buffer
 
std::string fasta_file =
    R"(> seq1
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq2
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq3
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq4
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq5
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq6
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq7
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq8
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq9
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq10
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq11
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
> seq12
ACGACTACGACGATCATCGATCGATCGATCGATCGATCGATCGATCGTACTACGATCGATCG
)";
 
int main()
{
    // initialise random number generator, only needed for demonstration purposes
    std::srand(std::time(nullptr));
 
    // create an input file from the string above
    seqan3::sequence_file_input fin{std::istringstream{fasta_file}, seqan3::format_fasta{}};
 
    // create the async buffer around the input file
    // spawns a background thread that tries to keep four records in the buffer
    auto v = fin | seqan3::views::async_input_buffer(4);
 
    // create a lambda function that iterates over the async buffer when called
    // (the buffer gets dynamically refilled as soon as possible)
    auto worker = [&v]()
    {
        for (auto & record : v)
        {
            // pretend we are doing some work
            std::this_thread::sleep_for(std::chrono::milliseconds(std::rand() % 1000));
            // print current thread and sequence ID
            seqan3::debug_stream << "Thread: " << std::this_thread::get_id() << '\t' << "Seq:    " << record.id()
                                 << '\n';
        }
    };
 
    // launch two threads and pass the lambda function to both
    auto f0 = std::async(std::launch::async, worker);
    auto f1 = std::async(std::launch::async, worker);
}

Running the snippet could yield the following output:

Thread: 0x80116bf00     Seq:    seq2
Thread: 0x80116bf00     Seq:    seq3
Thread: 0x80116ba00     Seq:    seq1
Thread: 0x80116bf00     Seq:    seq4
Thread: 0x80116bf00     Seq:    seq6
Thread: 0x80116ba00     Seq:    seq5
Thread: 0x80116bf00     Seq:    seq7
Thread: 0x80116ba00     Seq:    seq8
Thread: 0x80116bf00     Seq:    seq9
Thread: 0x80116bf00     Seq:    seq11
Thread: 0x80116bf00     Seq:    seq12
Thread: 0x80116ba00     Seq:    seq10

This shows that elements from the underlying range are indeed processed non-sequentially, that there are two threads, and that work is "balanced" between them (one thread processed more elements than the other because its "work" per item happened to be smaller).

Note that you might encounter jumbled output if by chance two threads write to the stream at the exact same time.

If you remove the line starting with auto f1 = ... you will get sequential processing:

Thread: 0x80116aa00     Seq:    seq1
Thread: 0x80116aa00     Seq:    seq2
Thread: 0x80116aa00     Seq:    seq3
Thread: 0x80116aa00     Seq:    seq4
Thread: 0x80116aa00     Seq:    seq5
Thread: 0x80116aa00     Seq:    seq6
Thread: 0x80116aa00     Seq:    seq7
Thread: 0x80116aa00     Seq:    seq8
Thread: 0x80116aa00     Seq:    seq9
Thread: 0x80116aa00     Seq:    seq10
Thread: 0x80116aa00     Seq:    seq11
Thread: 0x80116aa00     Seq:    seq12

Note that even if you have a single processing thread, using this view can still improve performance measurably, because loading of the elements into the buffer (which reads input from disk) happens in a background thread.

This entity is experimental and subject to change in the future. Experimental since version 3.1.
