SeqAn3 3.4.0-rc.4
The Modern C++ library for sequence analysis.
This HowTo documents how to write a view using the standard library and some helpers from SeqAn.
Difficulty | Difficult |
---|---|
Duration | 120 min |
Prerequisite tutorials | C++ Concepts, Ranges |
Recommended reading |
We have introduced "views" in Ranges. You can do many things with the views provided by the standard library and those shipped with SeqAn, but in certain situations you will want to define your own view. This page will teach you the basics of defining your own view.
A view is a type of std::ranges::range that also models std::ranges::view. The additional requirements of std::ranges::view can be vaguely summarised as "not holding any own data", or at least not holding data whose size grows with the number of elements in the view (e.g. a vector cannot be a view, because its size in memory depends on the number of elements it represents).
A simple example of a view is std::ranges::subrange. It can be constructed from a pair of iterators or, more precisely, an iterator and a sentinel. begin() always returns an iterator, but the type returned by end() (the "sentinel") does not need to be of the same type as the iterator, as long as they are comparable. This view then holds exactly the iterator-sentinel pair as its state and nothing else.
But std::ranges::subrange does not yet facilitate any of the "composing behaviour" that you have seen in the Ranges tutorial. std::ranges::subrange is simply a type that can be constructed from an iterator-sentinel pair; you cannot "pipe" anything into it. Most views are adaptors on other views, e.g. std::ranges::transform_view wraps an existing view and applies an element-wise transformation on demand. You can directly construct std::ranges::transform_view from another view or from non-view ranges that can be wrapped into a view (e.g. references to containers):
But this syntax gets difficult to read when you create "a view(from a view(from a view()))". That's why for every view that adapts an existing view we have an additional adaptor object, usually available in a views:: sub-namespace:
This adaptor object (std::views::transform) provides the pipe operator and returns an object of the actual view type (std::ranges::transform_view). The pipe operator allows us to chain multiple adaptors similar to the unix command line. We will discuss the details of these adaptor objects in the following sections.
Read sections 24.7 and 24.7.1 of the C++ standard.
The wording of the standard needs some getting used to, but some important notes for us are:
Terminology | Description | Example |
---|---|---|
"view" | A type that models std::ranges::view. | std::ranges::filter_view<std::ranges::subrange<int const *, int const *>> |
"view adaptor object" | Creates a view; can be combined with other adaptors. | std::views::reverse, std::views::filter and std::views::filter([] (int) { return true; }) |
"view adaptor closure object" | A "view adaptor object" that requires no parameters other than the range. | std::views::reverse and std::views::filter([] (int) { return true; }), but not std::views::filter on its own |
In many cases where you are planning on creating "a new view", it will be sufficient to use the previously mentioned techniques to just create "a new adaptor object", without having to specify the actual view type yourself.
Let's look at some examples!
SeqAn alphabets such as seqan3::dna5 are distinct types and not char, so you cannot print a std::vector<seqan3::dna5> via std::cout¹. You also know that you can call seqan3::to_char on every object that models seqan3::alphabet, which will convert it to char or a similar type.
We want to do the following:
Define a range adaptor object using an existing adaptor which applies a concrete transformation (calling seqan3::to_char) on every element.
The existing adaptor to use is std::views::transform, and you need to set a fixed transformation function. std::views::transform takes an object that models std::regular_invocable, e.g. a lambda function with empty capture [].
¹ You can print via seqan3::debug_stream, but let's ignore that for now.
You simply define your adaptor type as auto and make it behave like std::views::transform, except that you "hard-code" the lambda function that is applied to each element. Since your adaptor object now takes a range as the only parameter, it is an adaptor closure object.
The object is marked as constexpr because the adaptor object itself never changes; it only provides operator() and operator|, each of which returns a specialisation of std::ranges::transform_view.
Study the seqan3::nucleotide_alphabet concept. It states that you can call seqan3::complement on all nucleotides, which will give you 'A'_dna5 for 'T'_dna5 and so on. Think about how you can adapt the previous solution to write a view that transforms ranges of nucleotides into their complement.
But we are also interested in reversing the range, which is possible with std::views::reverse:
Define a range adaptor object that presents a view of the reverse complement of whatever you pipe into it.
The adaptor consists of std::views::reverse combined with std::views::transform. This time the lambda just performs the call to seqan3::complement.
Using existing adaptors only works to a certain degree; sometimes you will need to implement a full view. As we have seen above, it is easy to implement a view that presents the complement of a nucleotide range just using existing adaptors. For simplicity, we will still use this as an example and implement a non-generic transform view in the next steps.
A full view implementation typically consists of the following components:
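- an iterator type, and possibly a sentinel type, that implement the behaviour specific to the view;
- the view class itself, typically derived from std::ranges::view_interface, which will take care of a lot of boilerplate for us;
- an adaptor type and an adaptor object that provide the function-call and pipe syntax.
A good thing to start with is to think about the iterator and the sentinel of your view. The iterator and/or its relation to the sentinel is the way we implement the behaviour that is specific to our view. We will start with implementing the iterator separately and later integrate it into the view.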
Since we know that in our current use case we have exactly one element in our view for every element in the underlying range, we don't need to change the relation between iterator and sentinel, i.e. "begin == end" on our view iff "begin == end" on the underlying range. This indicates that we can use the sentinel of the underlying range as-is.
Have a peek at the C++ Concepts tutorial again and study the std::forward_iterator concept thoroughly. You will now have to implement your own forward iterator.
In order to re-use functionality of the underlying range's iterator type you can inherit from it (std::ranges::iterator_t returns the iterator type).¹ In the end, you should be able to iterate over the underlying range and print elements like you would with the original iterator, i.e. your iterator shall behave exactly as the original in this regard (no transformation, yet).
Some things to keep in mind for the implementation:
- Make sure your type provides everything that is required of a forward iterator (see the std::forward_iterator concept).
- A static_assert on the concept only tells you whether it is satisfied (true or false). To get more detailed information on why your type does not (yet) model the concept, implement a constrained function template and pass an object of your type to that template.
- The sentinel type of the underlying range is std::ranges::sentinel_t<urng_t>.
¹ In some situations it might be better to wrap the underlying iterator instead of inheriting from it, i.e. save a copy of the underlying iterator as a data member of your iterator. A reason could be that you don't want to inherit some members or want to prevent implicit convertibility.
The program prints "G A T T A C A ".
In the previous assignment you have created a working, but pointless, iterator. It does not do anything differently from the original.
Your task now is to implement the "complementing" behaviour, i.e. the iterator should return the complement of each element, and it should only accept underlying ranges whose elements model seqan3::nucleotide_alphabet.
Think about which operator is responsible for returning the element and be careful with the return type of that operator as your transformation might make a change necessary.
You can enforce the requirement on the element type with a static_assert: static_assert gives you the opportunity to give a readable message in case of an error.¹
The operator that needs to call the seqan3::complement function is operator*:
Here is the full solution:
The program prints "C T A A T G T "
¹ This is only recommended when you do not want to allow a different specialisation of the template to cover the excluded case.
You now have a working iterator, although it still lacks the capabilities of std::bidirectional_iterator, std::random_access_iterator and std::contiguous_iterator. When designing views, you should always strive to preserve as much of the capabilities of the underlying range as possible.
Which of the mentioned concepts do you think your iterator could be made to implement? Have a look at the respective documentation.
It can model std::random_access_iterator (and thus also std::bidirectional_iterator) when the iterator of the underlying range does, because jumping on your iterator/view can be done in constant time iff it can be done in constant time on the underlying range (you just jump to the n-th element of the underlying range and perform your transformation on that).
However, it can never model std::contiguous_iterator because that would imply that the elements are adjacent to each other in memory (the elements of our view are created on demand and are not stored in memory).
If you have looked at the std::random_access_iterator concept, you will have seen that it is quite a bit of work to implement all the operators, many of which just need to be overloaded to fix the return type. To make this a little bit easier SeqAn provides seqan3::detail::inherited_iterator_base, which fixes the issue with the return type via CRTP. A solution to the previous exercise looks like this:
We now implement the view in several steps:
Like the iterator, the view is derived from a CRTP base class that takes care of defining many members for us, e.g. .size(), .operator[] and a few others.
The only data member the class holds is a copy of the underlying range. As you may have noted above, our class only takes underlying ranges that model std::ranges::view. This might seem strange; after all we want to apply the view to a vector of which we know that it is not a view, but we will clear this up later.
The only member types that we define here are the definitions of the iterators, which are just the iterator we have defined before. Note that in addition to iterator we have const_iterator, which is what the const-qualified member functions return: in a const context the urange data member is const, so we cannot obtain the mutable iterator from it.
Many ranges like the standard library containers also present the member types of the iterator, i.e. value_type, reference and so on, but this is not required to model any of the range concepts.
These functions are the same member functions you know from std::vector. They return objects of the previously defined iterator types that are initialised with the begin iterator from the underlying range.
The implementation for end() is similar, except that for our range the sentinel type (the return type of end()) is the same as that of the underlying range; we just pass it through.
For many more complex views you will have to define the sentinel type yourself or derive it from the underlying type in a similar manner to how we derived the iterator type. Often you can use std::default_sentinel_t as the type for your sentinel and implement the "end-condition" in the iterator's equality comparison operator against that type.
We have two constructors. The first takes the underlying type by copy and moves it into the data member (remember that since it is a view, it will not be expensive to copy, if it is copied at all).
The second constructor is more interesting. It takes a std::ranges::viewable_range, which is defined as being either a std::ranges::view or a reference to a std::ranges::range that is not a view (e.g. std::vector<char> &). Since we have a constructor for std::ranges::view already, this one explicitly handles the second case and goes through std::views::all, which wraps the reference in a thin view layer. Storing only a view member guarantees, among other things, that our type itself is also cheap to copy.
Note that both of these constructors seem like generic functions, but they just handle the underlying type or a type that turns into the underlying type when wrapped in std::views::all.
To easily use the second constructor we need to provide a type deduction guide.
Here is the full solution:
The program prints
The adaptor object is a function object, also called a functor. This means we define a type with the respective operators and then create a global instance of that type which can be used to invoke the actual functionality.
The adaptor has the primary purpose of facilitating the piping behaviour, but it shall also allow for function/constructor-style creation of view objects, therefore it defines two operators:
The first operator is very straightforward: it simply delegates to the constructor of our view so that views::my(FOO) is identical to my_view{FOO}.
The second operator is declared as friend, because the left-hand side of operator| is generic. It is how the range is handled in snippets like auto v = vec | views::my.
Note that this adaptor type only handles ranges as the left-hand side of operator|.
Our example adaptor type definition is rather simple, but for views/adaptors that take more parameters it gets quite complicated quickly. Therefore SeqAn provides some convenience templates for you:
See seqan3::detail::adaptor_base, seqan3::detail::adaptor_for_view_without_args and seqan3::detail::adaptor_from_functor for more details.
The adaptor object is simply an instance of the previously defined type:
As noted above, we place this object in a views:: sub-namespace by convention. Since the object holds no state, we mark it as constexpr, and since it's a global variable we also mark it as inline to prevent linkage issues.
Finally we can use our view with pipes and combine it with others:
Here is the full, final solution:
The program prints: