SeqAn3  3.0.0
The Modern C++ library for sequence analysis.
predicate.hpp File Reference

Provides character predicates for tokenisation.


Namespaces

 seqan3
 The main SeqAn3 namespace.
 

Variables

Character predicates

Character predicates are function-like objects that can be used to check whether a character c fulfills certain constraints. SeqAn3 implements every predicate available in the standard library, plus some additional ones.

Disjunction and Negation

In contrast to the standard library (where the checks are implemented as functions), the functors in SeqAn3 can be joined efficiently, maintaining constant-time evaluation independent of the number of checks. Functors can be combined with the ||-operator or negated via the !-operator:

char chr{'1'};
auto constexpr my_cond = is_char<'%'> || is_digit; // true for '%' or any decimal digit
bool matches = my_cond(chr);                       // matches == true, because '1' is a digit

Defining complex combinations and using them in e.g. input/output can increase speed significantly over checking multiple functions: we measured speed-ups of 10x for a single check and speed-ups of over 20x for complex combinations.
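One way to get constant-time evaluation regardless of the number of combined checks is to back each predicate with a 256-entry lookup table and to merge the tables whenever predicates are combined. The following is a minimal, self-contained sketch of that idea (C++17). `table_predicate` and `make_interval` are hypothetical names chosen for illustration, and nothing here is taken from SeqAn3's actual implementation.

```cpp
#include <array>
#include <cstddef>

// Hypothetical illustration type, not part of SeqAn3: a predicate backed by
// a 256-entry truth table, one entry per possible unsigned char value.
struct table_predicate
{
    std::array<bool, 256> table{};

    // Evaluation is a single array lookup, however the table was built.
    constexpr bool operator()(char c) const
    {
        return table[static_cast<unsigned char>(c)];
    }
};

// Disjunction: merge both tables once, entry by entry.
constexpr table_predicate operator||(table_predicate const & lhs, table_predicate const & rhs)
{
    table_predicate result{};
    for (std::size_t i = 0; i < 256; ++i)
        result.table[i] = lhs.table[i] || rhs.table[i];
    return result;
}

// Negation: flip every entry once.
constexpr table_predicate operator!(table_predicate const & p)
{
    table_predicate result{};
    for (std::size_t i = 0; i < 256; ++i)
        result.table[i] = !p.table[i];
    return result;
}

// Hypothetical helper: accepts every character in the closed range [first, last].
constexpr table_predicate make_interval(unsigned char first, unsigned char last)
{
    table_predicate result{};
    for (std::size_t i = first; i <= last; ++i)
        result.table[i] = true;
    return result;
}

inline constexpr table_predicate digit            = make_interval('0', '9');
inline constexpr table_predicate percent          = make_interval('%', '%');
inline constexpr table_predicate percent_or_digit = percent || digit;
inline constexpr table_predicate not_digit        = !digit;

static_assert(percent_or_digit('1'));
static_assert(percent_or_digit('%'));
static_assert(!percent_or_digit('a'));
static_assert(not_digit('a') && !not_digit('7'));
```

Because the merge happens when `percent_or_digit` is initialised (here at compile time), each call performs one array lookup whether the condition was built from one check or from many.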

Custom predicates

Standard library predicates

SeqAn3 offers the 12 predicates exactly as defined in the standard library, except that each name contains an underscore (e.g. is_digit instead of isdigit) to be consistent with the naming used elsewhere in the library.

The following table lists the predefined character predicates and which constraints are associated with them.

ASCII values
decimal  hexadecimal  octal      characters                    is_cntrl  is_print  is_space  is_blank  is_graph  is_punct  is_alnum  is_alpha  is_upper  is_lower  is_digit  is_xdigit
0–8      \x0–\x8      \0–\10     control codes (NUL, etc.)     ≠0        0         0         0         0         0         0         0         0         0         0         0
9        \x9          \11        tab (\t)                      ≠0        0         ≠0        ≠0        0         0         0         0         0         0         0         0
10–13    \xA–\xD      \12–\15    whitespaces (\n, \v, \f, \r)  ≠0        0         ≠0        0         0         0         0         0         0         0         0         0
14–31    \xE–\x1F     \16–\37    control codes                 ≠0        0         0         0         0         0         0         0         0         0         0         0
32       \x20         \40        space                         0         ≠0        ≠0        ≠0        0         0         0         0         0         0         0         0
33–47    \x21–\x2F    \41–\57    !"#$%&'()*+,-./               0         ≠0        0         0         ≠0        ≠0        0         0         0         0         0         0
48–57    \x30–\x39    \60–\71    0123456789                    0         ≠0        0         0         ≠0        0         ≠0        0         0         0         ≠0        ≠0
58–64    \x3A–\x40    \72–\100   :;<=>?@                       0         ≠0        0         0         ≠0        ≠0        0         0         0         0         0         0
65–70    \x41–\x46    \101–\106  ABCDEF                        0         ≠0        0         0         ≠0        0         ≠0        ≠0        ≠0        0         0         ≠0
71–90    \x47–\x5A    \107–\132  GHIJKLMNOPQRSTUVWXYZ          0         ≠0        0         0         ≠0        0         ≠0        ≠0        ≠0        0         0         0
91–96    \x5B–\x60    \133–\140  [\]^_`                        0         ≠0        0         0         ≠0        ≠0        0         0         0         0         0         0
97–102   \x61–\x66    \141–\146  abcdef                        0         ≠0        0         0         ≠0        0         ≠0        ≠0        0         ≠0        0         ≠0
103–122  \x67–\x7A    \147–\172  ghijklmnopqrstuvwxyz          0         ≠0        0         0         ≠0        0         ≠0        ≠0        0         ≠0        0         0
123–126  \x7B–\x7E    \173–\176  {|}~                          0         ≠0        0         0         ≠0        ≠0        0         0         0         0         0         0
127      \x7F         \177       delete character (DEL)        ≠0        0         0         0         0         0         0         0         0         0         0         0
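The rows of the table can be cross-checked against the standard library's `<cctype>` functions, which the predicates mirror. The following sketch builds one table row for a given character; `classify` is a hypothetical helper introduced only for this illustration, and the result assumes the default "C" locale that is active at program start.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns the 12 table entries for c as a string of
// '1' (non-zero) and '0' (zero), in the column order of the table:
// cntrl, print, space, blank, graph, punct, alnum, alpha, upper, lower,
// digit, xdigit.
std::string classify(unsigned char c)
{
    int const results[12] = {
        std::iscntrl(c), std::isprint(c), std::isspace(c), std::isblank(c),
        std::isgraph(c), std::ispunct(c), std::isalnum(c), std::isalpha(c),
        std::isupper(c), std::islower(c), std::isdigit(c), std::isxdigit(c)};

    std::string row;
    for (int r : results)
        row += (r != 0) ? '1' : '0';
    return row;
}
```

For example, `classify('7')` yields `"010010100011"` in the "C" locale, matching the row for 48–57: only print, graph, alnum, digit and xdigit are non-zero.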


template<uint8_t interval_first, uint8_t interval_last>
constexpr auto seqan3::is_in_interval
 Checks whether a given letter is in the specified interval.
 
template<Alphabet alphabet_t>
constexpr auto seqan3::is_in_alphabet
 Checks whether a given letter is valid for the specified seqan3::Alphabet.
 
template<int char_v>
constexpr auto seqan3::is_char
 Checks whether a given letter is the same as the template non-type argument.
 
auto constexpr seqan3::is_eof = is_char<EOF>
 Checks whether a given letter is equal to the EOF constant defined in <cstdio>.
 
auto constexpr seqan3::is_cntrl
 Checks whether c is a control character.
 
auto constexpr seqan3::is_print = is_in_interval<' ', '~'>
 Checks whether c is a printable character.
 
auto constexpr seqan3::is_space = is_in_interval<'\t', '\r'> || is_char<' '>
 Checks whether c is a space character.
 
auto constexpr seqan3::is_blank = is_char<'\t'> || is_char<' '>
 Checks whether c is a blank character.
 
auto constexpr seqan3::is_graph = is_in_interval<'!', '~'>
 Checks whether c is a graphic character.
 
auto constexpr seqan3::is_punct
 Checks whether c is a punctuation character.
 
auto constexpr seqan3::is_alnum
 Checks whether c is an alphanumeric character.
 
auto constexpr seqan3::is_alpha = is_in_interval<'A', 'Z'> || is_in_interval<'a', 'z'>
 Checks whether c is an alphabetical character.
 
auto constexpr seqan3::is_upper = is_in_interval<'A', 'Z'>
 Checks whether c is an upper-case character.
 
auto constexpr seqan3::is_lower = is_in_interval<'a', 'z'>
 Checks whether c is a lower-case character.
 
auto constexpr seqan3::is_digit = is_in_interval<'0', '9'>
 Checks whether c is a decimal digit.
 
auto constexpr seqan3::is_xdigit
 Checks whether c is a hexadecimal digit.
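The listing gives no initialiser for is_cntrl, is_punct, is_alnum and is_xdigit. As an illustration only, the sketch below shows one way these four could be composed from intervals so that they agree with the ASCII table in this document. Everything in namespace `sketch` is hypothetical: these are plain constexpr functions rather than SeqAn3's functors, and the library's real definitions may differ.

```cpp
#include <cstdint>

namespace sketch // hypothetical stand-ins, not SeqAn3 code
{

// True if c lies in the closed interval [first, last].
template <std::uint8_t first, std::uint8_t last>
constexpr bool in_interval(char c)
{
    auto const u = static_cast<std::uint8_t>(c);
    return first <= u && u <= last;
}

// Control codes 0–31 plus DEL (127).
constexpr bool is_cntrl(char c)
{
    return in_interval<0, 31>(c) || c == 127;
}

// The four punctuation ranges: 33–47, 58–64, 91–96 and 123–126.
constexpr bool is_punct(char c)
{
    return in_interval<'!', '/'>(c) || in_interval<':', '@'>(c)
        || in_interval<'[', '`'>(c) || in_interval<'{', '~'>(c);
}

// Decimal digits or letters of either case.
constexpr bool is_alnum(char c)
{
    return in_interval<'0', '9'>(c) || in_interval<'A', 'Z'>(c) || in_interval<'a', 'z'>(c);
}

// Decimal digits or the letters A–F / a–f.
constexpr bool is_xdigit(char c)
{
    return in_interval<'0', '9'>(c) || in_interval<'A', 'F'>(c) || in_interval<'a', 'f'>(c);
}

} // namespace sketch

static_assert(sketch::is_cntrl('\n') && !sketch::is_cntrl(' '));
static_assert(sketch::is_punct('%') && !sketch::is_punct('a'));
static_assert(sketch::is_alnum('Z') && !sketch::is_alnum('_'));
static_assert(sketch::is_xdigit('f') && !sketch::is_xdigit('g'));
```

Each function is a direct transcription of the non-zero ranges in the corresponding table column, so the table itself serves as the specification for checking it.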
 

Detailed Description

Provides character predicates for tokenisation.

Author
Rene Rahn <rene.rahn AT fu-berlin.de>
Hannes Hauswedell <hannes.hauswedell AT fu-berlin.de>