Constraints

Motivation

One central design goal of SeqAn is to provide generic algorithms and data structures that can be used for different types without reimplementing the same algorithms over and over again for particular types. This has multiple benefits: improved maintainability due to an additional level of abstraction and, more importantly, the ability to reuse the code with user-provided types. A familiar example of generic code is std::vector and the algorithms in the standard library. They are templates, which means that they can be instantiated with other types. Most often the type cannot be arbitrary, because the template expects a particular interface from the type.

A SeqAn example is the local alignment algorithm. It computes the best local match between two sequences over a finite alphabet. The algorithm is generic insofar as it allows any alphabet that offers the minimal interface used inside the algorithm (e.g. objects of the alphabet type must be equality comparable). Before C++20, this could not be checked easily, and using the interface with non-conforming types would result in compiler errors that are very hard to read, and consequently in frustrated users. In the following part of the tutorial, you will learn how to constrain such template arguments of generic functions and data structures and how this can have a huge impact on your code.

Here's a shorter example:

template <typename t>
t add(t const v1, t const v2)
{
    return v1 + v2;
}
 
int main()
{
    return add(1, 3); // instantiates add<int>()
}

The template parameter t is said to be unconstrained: in theory, it can be instantiated with any type. But of course, it won't actually compile for all types, because the function template implicitly requires that types provide a + operator. If a type without a + operator is used, this implicitness causes the compiler to fail at the place where the operator is used, and not at the place where the template is instantiated. In deeply nested code, this leads to very complex error messages.

Constraints are a way of making the requirements on template arguments explicit. Constraints can be formulated ad-hoc, but this tutorial only covers concepts; the interested reader can check the documentation to learn about ad-hoc definitions. A concept is a named set of constraints. Let's assume there is a concept called Addable that requires the existence of a + operator (the syntax for defining concepts is not covered here). The following snippet demonstrates how we can constrain our function template, i.e. make the template immediately reject any types that don't satisfy the requirement:

template <Addable t>
t add(t const v1, t const v2)
{
    return v1 + v2;
}
 
int main()
{
    return add(1, 3); // instantiates add<int>()
}

The only difference is that we have replaced typename with Addable. If you plug in a type that does not model Addable, you will get a message stating exactly that and not a cryptic template backtrace.

The standard library provides a set of predefined concepts. For our example above, the std::integral concept could have been used.

Syntax variants

Depending on the complexity of your constraint statements, three different syntaxes are available to enforce constraints; all of the following are equivalent.

(1) The "verbose syntax", especially useful when enforcing multiple constraints:

template <typename t1, typename t2>
    requires std::integral<t1> && std::integral<t2> // && MyOtherConcept<t1>
auto add(t1 const v1, t2 const v2)
{
    return v1 + v2;
}

(2) The "intermediate syntax":

template <std::integral t1, std::integral t2>                       // one constraint per type
auto add(t1 const v1, t2 const v2)
{
    return v1 + v2;
}

(3) The "terse syntax":

auto add(std::integral auto const v1, std::integral auto const v2)  // one constraint per type
{
    return v1 + v2;
}

Different constraints can be applied to different template parameters and a single template parameter can be constrained by multiple concepts. Syntaxes can also be combined:

template <std::integral t1, std::integral t2>
    // requires MyOtherConcept<t1>
auto add(t1 const v1, t2 const v2)
{
    return v1 + v2;
}

Terminology

Template arguments can be constrained.
A named set of constraints is a concept.
A type that satisfies all requirements of a concept is said to model said concept.
A concept that is composed of another concept and additional constraints is said to refine said concept(s).

Some people confuse concepts with interfaces. Both can be used as an abstraction of concrete types, but interfaces have to be inherited from, so the abstraction is explicit in the definition of the type. Concepts, on the other hand, "describe properties from the outside": types don't need to be related and don't need to "know about the concept" to model it.

Furthermore, the polymorphism possible with concepts (see below) is faster, because it is resolved at compile-time while interface inheritance is resolved at run-time.

Overloading and specialisation

In generic programming, "function overloading" and "template specialisation" play an important role. They allow providing generic interfaces and (gradually) more specialised implementations for specific types or groups of types.

Function (template) overloading

When a function is overloaded and multiple overloads are valid for a given/deduced template argument, the most-refined overload is chosen. With only a single constrained overload, every type that models the concept uses that overload:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
 
#include <concepts>
#include <iostream> // for std::cout
 
template <std::integral t>
void print(t const v)
{
    std::cout << "integral value: " << v << '\n';
}
 
int main()
{
    int i{4};
    unsigned u{3};
 
    print(i); // prints "integral value: 4"
    print(u); // prints "integral value: 3"
}

But as soon as we introduce another overload, the compiler will pick the "best" match:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
 
#include <concepts>
#include <iostream> // for std::cout
 
template <std::integral t>
void print(t const v)
{
    std::cout << "integral value: " << v << '\n';
}
 
template <std::unsigned_integral t>
void print(t const v)
{
    std::cout << "Unsigned value: " << v << '\n';
}
 
int main()
{
    int i{4};
    unsigned u{3};
 
    print(i); // prints "integral value: 4"
    print(u); // prints "Unsigned value: 3"
}

Assignment 1: Static polymorphism with alphabets I

Write a small program, similar to the one above with the following "skeleton":

// which includes?
 
// Add one or more `void print` function template(s) here //
 
int main()
{
    using namespace seqan3::literals;
 
    auto d = 'A'_dna5;
    auto a = 'L'_aa27;
    auto g = seqan3::gap{};
 
    print(d);
    print(a);
    print(g);
}

The print function (template) should print the result of to_char(v) for every object v passed to it, and it should be constrained to only accept types that model seqan3::alphabet. Try calling print with a different type, e.g. int, to make sure that the call is rejected.

Solution

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
 
#include <iostream> // for std::cout
 
#include <seqan3/alphabet/all.hpp> // include all alphabet headers
 
template <seqan3::alphabet t>
void print(t const v)
{
    std::cout << "I am an alphabet and my value as char is: " << seqan3::to_char(v) << '\n';
}
 
int main()
{
    using namespace seqan3::literals;
 
    auto d = 'A'_dna5;
    auto a = 'L'_aa27;
    auto g = seqan3::gap{};
 
    print(d);
    print(a);
    print(g);
}

Assignment 2: Static polymorphism with alphabets II

Adapt your previous solution to handle nucleotides differently from the rest. For nucleotides, it should print both the value and its complement.

Solution

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
 
#include <iostream> // for std::cout
 
#include <seqan3/alphabet/all.hpp> // include all alphabet headers
 
template <seqan3::alphabet t>
void print(t const v)
{
    std::cout << "I am an alphabet and my value as char is: " << seqan3::to_char(v) << '\n';
}
 
template <seqan3::nucleotide_alphabet t>
void print(t const v)
{
    std::cout << "I am a nucleotide, my value as char is: " << seqan3::to_char(v)
              << " and my complement is: " << seqan3::to_char(seqan3::complement(v)) << '\n';
}
 
int main()
{
    using namespace seqan3::literals;
 
    auto d = 'A'_dna5;
    auto a = 'L'_aa27;
    auto g = seqan3::gap{};
 
    print(d);
    print(a);
    print(g);
}

Partial template specialisation

Similar to function template overloading, it is possible to use concepts for partially specialising class and variable templates:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
 
#include <concepts>
#include <utility> // for std::pair
 
template <typename t>
struct square_root_type;
 
template <std::integral t>
struct square_root_type<t>
{
    using type = std::pair<float, float>; // real and imaginary part
};
 
template <std::unsigned_integral t>
struct square_root_type<t>
{
    using type = float; // doesn't need imaginary part
};
 
// `int` models std::integral but not std::unsigned_integral:
static_assert(std::same_as<typename square_root_type<int>::type, std::pair<float, float>>);
 
// `unsigned` models std::integral and std::unsigned_integral, but the latter is more refined:
static_assert(std::same_as<typename square_root_type<unsigned>::type, float>);

This is a typical example of a "type transformation trait". It maps one type to another type; in this case, it returns a type that is able to represent the square root of the "input type". This can be used in generic algorithms to hold data in different types depending on the type of the input. In this case, we could halve the space consumption for unsigned integral types compared to signed integral types.

Note: The std::same_as used above is a concept with two template parameters. It requires that both parameters are the same. The static_assert checks conditions at compile-time; it can be used to verify whether a type or a combination of types model a concept. In the above case, we can use the combination to check the "return type" of the transformation trait.

Concepts in SeqAn and this documentation

SeqAn uses concepts extensively for template specialisation/overloading, to avoid misuse and to improve error messages. Unfortunately, doxygen, the system used to generate this documentation, does not yet handle C++ concepts very well. That's why it's important to read the detailed documentation section of the constrained type, where we try to document the requirements manually. In some parts of the documentation, concepts are called "interfaces"; please don't let this confuse you.

Example: seqan3::bitpacked_sequence

The class seqan3::bitpacked_sequence<alphabet_type> behaves just like std::vector<alphabet_type> but has an internal representation where multiple values are packed into a single byte/word to save space. Also analogous to std::vector, not every alphabet_type can be used. To avoid misuse and weird error messages, the type is constrained.

Have a look at the documentation of seqan3::bitpacked_sequence. It has one constrained template parameter. Do you understand the requirements imposed on alphabet_type when using the seqan3::bitpacked_sequence?

Hint

In order to use the seqan3::bitpacked_sequence, the alphabet_type must model the following:

- It needs to model std::regular, a standard library concept. This only enforces two other concepts: std::semiregular<T> && std::equality_comparable<T>.
  - std::semiregular<T> makes sure that your type is copyable and default initialisable (e.g. int i{};).
  - std::equality_comparable<T> makes sure you can compare your type with == (e.g. i == j).
  It makes sense that in order to save a range of letters (of type alphabet_type), you need them to be default initialisable, for example so that you can easily resize your container. Additionally, seqan3::bitpacked_sequence needs the alphabet_type to be comparable in order to be equality comparable itself (e.g. you can do bit_seq_1 == bit_seq_2).
- It needs to model seqan3::writable_semialphabet, a SeqAn concept. This again enforces two things:
  - seqan3::assign_rank_to needs to be defined for objects of this type.
  - The type shall model seqan3::semialphabet, which in summary enforces that your type is ordered (comparable via <), shall be efficiently copyable and that you can call seqan3::alphabet_size(c) and seqan3::to_rank(c) (assuming c is of type alphabet_type).

Of course, all seqan3 alphabets model the requirements and can be used with the seqan3::bitpacked_sequence.

But what happens if a type you would like to use does not model seqan3::writable_semialphabet (because obviously this concept is very SeqAn specific)?

You can learn how to make your own alphabet model the SeqAn requirements in "How to write your own alphabet".

In order to understand what "make a type model a concept" means in practical terms, let's look at an easier example in the next section.

Satisfying a concept

Let's say you have the following concept called fooger:

// helper concept has_foo:
template <typename T>
concept has_foo = requires (T val) {
                      typename T::FOO; // requirement 1
                      val.foo;         // requirement 2
                  };
 
// concept fooger:
template <typename T>
concept fooger = has_foo<T> && std::same_as<typename T::FOO, int>;

Do you understand the requirements?

Hint

The type T needs to model has_foo<T>, which in turn has two requirements:
- requirement 1: The type T has to have a type member called FOO.
- requirement 2: The type T has to have a member variable called foo.
std::same_as is a concept that checks whether two types are exactly the same. Thus, fooger requires that the type member T::FOO is int.

Assignment 4: Make a type model a concept

Copy over the concept into a new .cpp file.

Add a type my_type that models the requirements, s.t.

int main()
{
    seqan3::debug_stream << fooger<my_type> << std::endl; // should print 1
}

prints 1.

Hint: Don't forget to include the seqan3::debug_stream via #include <seqan3/core/debug_stream.hpp>.

Solution

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
 
#include <concepts> // for std::same_as
 
#include <seqan3/core/debug_stream.hpp>
 
// helper concept has_foo:
template <typename T>
concept has_foo = requires (T val) {
                      typename T::FOO; // requirement 1
                      val.foo;         // requirement 2
                  };
 
// concept fooger:
template <typename T>
concept fooger = has_foo<T> && std::same_as<typename T::FOO, int>;
 
struct my_type
{
    using FOO = int;
    char foo{}; // foo can be of any type, here it is of type `char`
};
 
int main()
{
    seqan3::debug_stream << fooger<my_type> << std::endl; // should print 1
}

Difficulty	Moderate
Duration	60 min
Prerequisite tutorials	Quick Setup (using CMake), Parsing command line arguments with Sharg
Recommended reading	Concepts (cppreference)
