SeqAn3 3.4.0-rc.1
The Modern C++ library for sequence analysis.
Builtin Character Operations

Provides various operations on character types.


Char predicates

Char predicates are function-like objects that check whether a character c fulfills certain constraints. SeqAn implements all predicates that are available in the standard library, plus some additional ones.

Disjunction and Negation

In contrast to the standard library (where the checks are implemented as functions), the functors in SeqAn can be joined efficiently, maintaining constant-time evaluation independent of the number of checks. Functors can be combined with the ||-operator or negated via the !-operator:

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <iostream>

#include <seqan3/utility/char_operations/predicate.hpp>

int main()
{
    char chr{'1'};
    // Matches '%' or any digit.
    constexpr auto my_cond = seqan3::is_char<'%'> || seqan3::is_digit;
    bool is_percent = my_cond(chr);
    std::cout << std::boolalpha << is_percent << '\n'; // true, because '1' is a digit
}

Defining complex combinations and using them, for example in input/output, can be significantly faster than calling multiple check functions: we measured speed-ups of 10x for a single check and speed-ups of over 20x for complex combinations.

Custom predicates

Standard library predicates

SeqAn offers the 12 predicates exactly as defined in the standard library, except that we have introduced an underscore in the name to be consistent with our other naming (for example, std::isalnum corresponds to seqan3::is_alnum).

The following table lists the predefined character predicates and which constraints are associated with them.

ASCII values
decimal  hexadecimal  octal      characters                    is_cntrl  is_print  is_space  is_blank  is_graph  is_punct  is_alnum  is_alpha  is_upper  is_lower  is_digit  is_xdigit
0–8      \x0–\x8      \0–\10     control codes (NUL, etc.)     ≠0        0         0         0         0         0         0         0         0         0         0         0
9        \x9          \11        tab (\t)                      ≠0        0         ≠0        ≠0        0         0         0         0         0         0         0         0
10–13    \xA–\xD      \12–\15    whitespaces (\n, \v, \f, \r)  ≠0        0         ≠0        0         0         0         0         0         0         0         0         0
14–31    \xE–\x1F     \16–\37    control codes                 ≠0        0         0         0         0         0         0         0         0         0         0         0
32       \x20         \40        space                         0         ≠0        ≠0        ≠0        0         0         0         0         0         0         0         0
33–47    \x21–\x2F    \41–\57    !"#$%&'()*+,-./               0         ≠0        0         0         ≠0        ≠0        0         0         0         0         0         0
48–57    \x30–\x39    \60–\71    0123456789                    0         ≠0        0         0         ≠0        0         ≠0        0         0         0         ≠0        ≠0
58–64    \x3A–\x40    \72–\100   :;<=>?@                       0         ≠0        0         0         ≠0        ≠0        0         0         0         0         0         0
65–70    \x41–\x46    \101–\106  ABCDEF                        0         ≠0        0         0         ≠0        0         ≠0        ≠0        ≠0        0         0         ≠0
71–90    \x47–\x5A    \107–\132  GHIJKLMNOPQRSTUVWXYZ          0         ≠0        0         0         ≠0        0         ≠0        ≠0        ≠0        0         0         0
91–96    \x5B–\x60    \133–\140  [\]^_`                        0         ≠0        0         0         ≠0        ≠0        0         0         0         0         0         0
97–102   \x61–\x66    \141–\146  abcdef                        0         ≠0        0         0         ≠0        0         ≠0        ≠0        0         ≠0        0         ≠0
103–122  \x67–\x7A    \147–\172  ghijklmnopqrstuvwxyz          0         ≠0        0         0         ≠0        0         ≠0        ≠0        0         ≠0        0         0
123–126  \x7B–\x7E    \173–\176  {|}~                          0         ≠0        0         0         ≠0        ≠0        0         0         0         0         0         0
127      \x7F         \177       delete character (DEL)        ≠0        0         0         0         0         0         0         0         0         0         0         0


template<uint8_t interval_first, uint8_t interval_last>
constexpr auto seqan3::is_in_interval
 Checks whether a given letter is in the specified interval.
 
template<int char_v>
constexpr auto seqan3::is_char
 Checks whether a given letter is the same as the template non-type argument.
 
constexpr auto seqan3::is_cntrl = is_in_interval<'\0', static_cast<char>(31)> || is_char<static_cast<char>(127)>
 Checks whether c is a control character.
 
constexpr auto seqan3::is_print = is_in_interval<' ', '~'>
 Checks whether c is a printable character.
 
constexpr auto seqan3::is_space = is_in_interval<'\t', '\r'> || is_char<' '>
 Checks whether c is a space character.
 
constexpr auto seqan3::is_blank = is_char<'\t'> || is_char<' '>
 Checks whether c is a blank character.
 
constexpr auto seqan3::is_graph = is_in_interval<'!', '~'>
 Checks whether c is a graphic character.
 
constexpr auto seqan3::is_punct
 Checks whether c is a punctuation character.
 
constexpr auto seqan3::is_alnum = is_in_interval<'0', '9'> || is_in_interval<'A', 'Z'> || is_in_interval<'a', 'z'>
 Checks whether c is an alphanumeric character.
 
constexpr auto seqan3::is_alpha = is_in_interval<'A', 'Z'> || is_in_interval<'a', 'z'>
 Checks whether c is an alphabetical character.
 
constexpr auto seqan3::is_upper = is_in_interval<'A', 'Z'>
 Checks whether c is an upper case character.
 
constexpr auto seqan3::is_lower = is_in_interval<'a', 'z'>
 Checks whether c is a lower case character.
 
constexpr auto seqan3::is_digit = is_in_interval<'0', '9'>
 Checks whether c is a decimal digit.
 
constexpr auto seqan3::is_xdigit = is_in_interval<'0', '9'> || is_in_interval<'A', 'F'> || is_in_interval<'a', 'f'>
 Checks whether c is a hexadecimal character.
 
constexpr auto seqan3::is_eof = is_char<EOF>
 Checks whether a given letter is equal to the EOF constant defined in <cstdio>.
 

Detailed Description

Provides various operations on character types.

See also
Utility

Variable Documentation

◆ is_alnum

constexpr auto seqan3::is_alnum = is_in_interval<'0', '9'> || is_in_interval<'A', 'Z'> || is_in_interval<'a', 'z'>
inline constexpr

Checks whether c is an alphanumeric character.

This function-like object can be used to check if a character c is an alphanumeric character. For the standard ASCII character set, the following characters are alphanumeric characters:

  • digits (0123456789)
  • uppercase letters (ABCDEFGHIJKLMNOPQRSTUVWXYZ)
  • lowercase letters (abcdefghijklmnopqrstuvwxyz)

Example

static_assert(seqan3::is_alnum('9'));

◆ is_alpha

constexpr auto seqan3::is_alpha = is_in_interval<'A', 'Z'> || is_in_interval<'a', 'z'>
inline constexpr

Checks whether c is an alphabetical character.

This function-like object can be used to check if a character c is an alphabetical character. For the standard ASCII character set, the following characters are alphabetical characters:

  • uppercase letters (ABCDEFGHIJKLMNOPQRSTUVWXYZ)
  • lowercase letters (abcdefghijklmnopqrstuvwxyz)

Example

static_assert(seqan3::is_alpha('z'));

◆ is_blank

constexpr auto seqan3::is_blank = is_char<'\t'> || is_char<' '>
inline constexpr

Checks whether c is a blank character.

This function-like object can be used to check if a character c is a blank character. For the standard ASCII character set, the following characters are blank characters:

  • horizontal tab ('\t')
  • space (' ')

Example

static_assert(seqan3::is_blank('\t'));

◆ is_char

template<int char_v>
constexpr auto seqan3::is_char
inline constexpr

Checks whether a given letter is the same as the template non-type argument.

Template Parameters
char_v  The letter to compare against.

This function-like object returns true if the argument is the same as the template argument, false otherwise.

Example

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <seqan3/utility/char_operations/predicate.hpp>

int main()
{
    seqan3::is_char<'C'>('C'); // returns true

    constexpr auto my_check = seqan3::is_char<'C'>;
    my_check('c'); // returns false, because case is different
}

◆ is_cntrl

constexpr auto seqan3::is_cntrl = is_in_interval<'\0', static_cast<char>(31)> || is_char<static_cast<char>(127)>
inline constexpr

Checks whether c is a control character.

This function-like object can be used to check if a character c is a control character. For the standard ASCII character set, control characters are those with ASCII codes 0x00 (NUL) through 0x1F (US), plus 0x7F (DEL).

Example

static_assert(seqan3::is_cntrl('\0'));

◆ is_digit

constexpr auto seqan3::is_digit = is_in_interval<'0', '9'>
inline constexpr

Checks whether c is a decimal digit.

This function-like object can be used to check if a character c is a decimal digit. For the standard ASCII character set, the following characters are decimal digits:

  • digits (0123456789)

Example

static_assert(seqan3::is_digit('1'));

◆ is_eof

constexpr auto seqan3::is_eof = is_char<EOF>
inline constexpr

Checks whether a given letter is equal to the EOF constant defined in <cstdio>.

This function-like object returns true if the argument is equal to EOF, false otherwise.

Example

static_assert(seqan3::is_eof(EOF));
static_assert(!seqan3::is_eof('C'));

◆ is_graph

constexpr auto seqan3::is_graph = is_in_interval<'!', '~'>
inline constexpr

Checks whether c is a graphic character.

This function-like object can be used to check if a character c is a graphic character, that is, one with a graphical representation. For the standard ASCII character set, the following characters are graphic characters:

  • digits (0123456789)
  • uppercase letters (ABCDEFGHIJKLMNOPQRSTUVWXYZ)
  • lowercase letters (abcdefghijklmnopqrstuvwxyz)
  • punctuation characters (!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~)

Example

static_assert(seqan3::is_graph('%'));

◆ is_in_interval

template<uint8_t interval_first, uint8_t interval_last>
constexpr auto seqan3::is_in_interval
inline constexpr

Checks whether a given letter is in the specified interval.

Template Parameters
interval_first  The first character for which to return true.
interval_last   The last character (inclusive) for which to return true.

This function-like object returns true for all characters in the given range, false otherwise.

Example

// SPDX-FileCopyrightText: 2006-2024 Knut Reinert & Freie Universität Berlin
// SPDX-FileCopyrightText: 2016-2024 Knut Reinert & MPI für molekulare Genetik
// SPDX-License-Identifier: CC0-1.0
#include <seqan3/utility/char_operations/predicate.hpp>

int main()
{
    seqan3::is_in_interval<'A', 'G'>('C'); // returns true

    constexpr auto my_check = seqan3::is_in_interval<'A', 'G'>;
    my_check('H'); // returns false
}

◆ is_lower

constexpr auto seqan3::is_lower = is_in_interval<'a', 'z'>
inline constexpr

Checks whether c is a lower case character.

This function-like object can be used to check if a character c is a lower case character. For the standard ASCII character set, the following characters are lower case characters:

  • lowercase letters (abcdefghijklmnopqrstuvwxyz)

Example

static_assert(seqan3::is_lower('a'));

◆ is_print

constexpr auto seqan3::is_print = is_in_interval<' ', '~'>
inline constexpr

Checks whether c is a printable character.

This function-like object can be used to check if a character c is a printable character. For the standard ASCII character set, printable characters are those with ASCII codes 0x20 (space) through 0x7E (~).

Example

static_assert(seqan3::is_print(' '));

◆ is_punct

constexpr auto seqan3::is_punct
inline constexpr
Initial value:
= is_in_interval<'!', '/'> || is_in_interval<':', '@'> || is_in_interval<'[', '`'> || is_in_interval<'{', '~'>

Checks whether c is a punctuation character.

This function-like object can be used to check if a character c is a punctuation character. For the standard ASCII character set, the following characters are punctuation characters:

  • punctuation characters (!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~)

Example

static_assert(seqan3::is_punct(':'));

◆ is_space

constexpr auto seqan3::is_space = is_in_interval<'\t', '\r'> || is_char<' '>
inline constexpr

Checks whether c is a space character.

This function-like object can be used to check if a character c is a space character. For the standard ASCII character set, the following characters are space characters:

  • horizontal tab ('\t')
  • line feed ('\n')
  • vertical tab ('\v')
  • form feed ('\f')
  • carriage return ('\r')
  • space (' ')

Example

static_assert(seqan3::is_space('\n'));

◆ is_upper

constexpr auto seqan3::is_upper = is_in_interval<'A', 'Z'>
inline constexpr

Checks whether c is an upper case character.

This function-like object can be used to check if a character c is an upper case character. For the standard ASCII character set, the following characters are upper case characters:

  • uppercase letters (ABCDEFGHIJKLMNOPQRSTUVWXYZ)

Example

static_assert(seqan3::is_upper('K'));

◆ is_xdigit

constexpr auto seqan3::is_xdigit = is_in_interval<'0', '9'> || is_in_interval<'A', 'F'> || is_in_interval<'a', 'f'>
inline constexpr

Checks whether c is a hexadecimal character.

This function-like object can be used to check if a character c is a hexadecimal character. For the standard ASCII character set, the following characters are hexadecimal characters:

  • digits (0123456789)
  • uppercase letters (ABCDEF)
  • lowercase letters (abcdef)

Example

static_assert(seqan3::is_xdigit('e'));