SeqAn3 3.4.0-rc.4
The Modern C++ library for sequence analysis.
IO related views.
Variables
constexpr auto seqan3::views::async_input_buffer
A view adapter that returns a concurrent-queue-like view over the underlying range.
experimental-api, inline, constexpr
Template Parameters
urng_t | The type of the range being processed. See below for requirements. |
Parameters
[in,out] | urange | The range being processed. |
[in] | buffer_size | Size of the buffer. Choose the size (> 0) depending on the expected work per element. |
Header File
#include <seqan3/io/views/async_input_buffer.hpp>
This view spawns a background thread that pre-fetches elements from the underlying range and stores them in a concurrent queue. Iterating over this view then pops elements out of the queue and returns them. This is primarily useful if dereferencing/incrementing the iterator of the underlying range is expensive, e.g. with SeqAn files which lazily perform I/O.
Another advantage of this view is that multiple iterators can be created that are safe to iterate individually, even from different threads, i.e. you can use multiple threads to iterate safely over a single-pass input view with the added benefit of background pre-fetching.
In technical terms: this view facilitates a single-producer, multi-consumer design; it's a range interface over a concurrent queue.
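The single-producer, multi-consumer design described above can be illustrated with a standard-library-only sketch. This is not SeqAn3's implementation: `bounded_queue` and `consume_concurrently` are hypothetical names introduced here, and a mutex-guarded `std::queue` stands in for whatever concurrent queue the library uses internally. One producer thread moves elements out of the source into a buffer of fixed capacity, and several consumer threads pop from it until the producer signals the end.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper, NOT part of SeqAn3: a bounded queue with one producer and many consumers.
template <typename value_t>
class bounded_queue
{
public:
    explicit bounded_queue(std::size_t const cap) : capacity{cap} {}

    // Producer side: blocks while the buffer is full.
    void push(value_t value)
    {
        std::unique_lock<std::mutex> lock{mtx};
        not_full.wait(lock, [this] { return items.size() < capacity; });
        items.push(std::move(value));
        not_empty.notify_one();
    }

    // Producer side: signals that no further elements will arrive.
    void close()
    {
        std::lock_guard<std::mutex> lock{mtx};
        closed = true;
        not_empty.notify_all();
    }

    // Consumer side: blocks until an element is available.
    // An empty optional means the queue is closed and fully drained.
    std::optional<value_t> pop()
    {
        std::unique_lock<std::mutex> lock{mtx};
        not_empty.wait(lock, [this] { return !items.empty() || closed; });
        if (items.empty())
            return std::nullopt;
        value_t value = std::move(items.front());
        items.pop();
        not_full.notify_one();
        return value;
    }

private:
    std::size_t capacity;
    std::mutex mtx;
    std::condition_variable not_empty;
    std::condition_variable not_full;
    std::queue<value_t> items;
    bool closed{false};
};

// Hypothetical driver: one producer moves all elements of `source` into the queue,
// `n_consumers` threads drain it. Returns how many elements each consumer processed.
std::vector<std::size_t> consume_concurrently(std::vector<std::string> source,
                                              std::size_t const buffer_size,
                                              std::size_t const n_consumers)
{
    bounded_queue<std::string> queue{buffer_size};
    std::vector<std::size_t> processed(n_consumers, 0);

    std::thread producer{[&source, &queue] {
        for (std::string & element : source)
            queue.push(std::move(element)); // elements are moved, as the view does
        queue.close();
    }};

    std::vector<std::thread> consumers;
    for (std::size_t i = 0; i < n_consumers; ++i)
        consumers.emplace_back([&queue, &processed, i] {
            while (std::optional<std::string> element = queue.pop())
                ++processed[i]; // each thread writes only to its own slot
        });

    producer.join();
    for (std::thread & consumer : consumers)
        consumer.join();
    return processed;
}
```

How the elements are split between the consumers depends on scheduling and is not deterministic; only the total is fixed. For example, `consume_concurrently({"r1", "r2", "r3"}, 2, 2)` returns two counts that sum to 3.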
The buffer_size parameter should be chosen depending on the expected work per element. For example, if the underlying range is an input file over short reads, a buffer size of 100 or 1000 could be beneficial. If, on the other hand, the file contains genome-sized sequences, it would be better to buffer only a single sequence (buffering 100 sequences would result in the entire file being preloaded and likely consuming significant memory).
This view always moves elements from the underlying range into its buffer which means that the elements in the underlying range will be invalidated! For underlying ranges that are single-pass, this makes no difference, but it might be unexpected for multi-pass ranges (std::ranges::forward_range).
Typically this adaptor is used when you want to consume the entire underlying range. Destructing this view before all elements have been read will also stop the thread that moves objects from the underlying range. In general, it is not safe to access the underlying range in other contexts once it has been passed to seqan3::views::async_input_buffer.
Note that in addition to the buffer of the view, every iterator has its own one-element buffer. Dereferencing the iterator returns a reference to the element in that buffer; usually you will want to move this element out of the buffer with std::move or std::ranges::iter_move. Incrementing the iterator refills the buffer from the queue inside the view (which in turn is then refilled from the underlying range).
Concepts and reference type | urng_t (underlying range type) | rrng_t (returned range type) |
---|---|---|
std::ranges::input_range | required | preserved |
std::ranges::forward_range | | lost |
std::ranges::bidirectional_range | | lost |
std::ranges::random_access_range | | lost |
std::ranges::contiguous_range | | lost |
std::ranges::viewable_range | required | guaranteed |
std::ranges::view | | guaranteed |
std::ranges::sized_range | | lost |
std::ranges::common_range | | lost |
std::ranges::output_range | | lost |
seqan3::const_iterable_range | | lost |
std::ranges::range_reference_t | | std::ranges::range_value_t<urng_t> & |
std::iterator_traits::iterator_category | | none |
See the views submodule documentation for detailed descriptions of the view properties.
The following operations are thread-safe: calling .begin() and .end() on the view returned by this adaptor.
Calling operators on the same iterator object from different threads is not safe, i.e. you can pass the view to different threads by reference, and have each of those threads call begin() on the view and then perform operations (dereference, increment...) on that iterator from the respective thread; but you cannot call begin() in a parent thread, pass the iterator to different threads and operate on that concurrently.
Running the snippet could yield the following output:
This shows that elements from the underlying range are indeed processed non-sequentially, that there are two threads, and that work is "balanced" between them (one thread processed more elements than the other, because its "work" per item happened to be smaller).
Note that you might encounter jumbled output if by chance two threads write to the stream at the exact same time.
If you remove the line starting with auto f1 = ... you will get sequential processing:
Note that even if you have a single processing thread, using this view can still improve performance measurably, because loading of the elements into the buffer (which reads input from disk) happens in a background thread.