Sharg 1.1.2-rc.1
The argument parser for bio-c++ tools.
Learning Objective:
You will learn how to use the sharg::parser class to parse command line arguments. This tutorial is a walkthrough with links to the API documentation and is also meant as a source for copy-and-paste code.
Difficulty | Easy |
---|---|
Duration | 30-60 min |
Prerequisite tutorials | Quick Setup (using CMake) |
Recommended reading | POSIX conventions |
An easy and very flexible interface to a program is through the command line. This tutorial explains how to parse the command line using the Sharg library’s sharg::parser class.
This class will give you the following functionality:
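- A help page that is printed when the program is called with --help. You can also export this help to HTML and man pages.
Before we start, let's agree on some terminology. Consider the following command line call:
program1 -f -i 4 --long-id 6 file1.txt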
The binary program1 is called with several command line arguments. We call every single input an argument but differentiate their purpose into options, positional_options, flags or simply a value corresponding to one of the former. In our example above, -f is a flag which is never followed by a value, -i is an option with a short identifier (id) followed by its value 4, --long-id is also an option with a long identifier followed by its value 6, and file1.txt is a positional_option, because it is an option identified by its position instead of an identifier.
Name | Purpose | Example |
---|---|---|
option | identify an argument by name (id-value pair) | -i 5 or --long-id 5 |
flag | boolean on/off flag (id) | -f |
positional option | identify an argument by position (value) | file1.txt |
Have a look at the POSIX conventions for command line arguments if you want a detailed description on the requirements for the above. (Note: in the linked article the following holds: value="argument", option="option", flag = "option that does not require arguments", positional option ="non-option").
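The terminology can be made concrete with a small classifier. The following is a minimal sketch that is independent of Sharg: the enum arg_kind and the function classify are hypothetical names introduced only for illustration, and the sets of known flag and option identifiers are assumed to be supplied by the caller.

```cpp
#include <set>
#include <string>
#include <vector>

// The four roles an argument can play, following the terminology above.
enum class arg_kind { flag, option, value, positional_option };

// Hypothetical helper, not part of Sharg: labels each token of a call.
// `flags` and `options` list the identifiers the program knows about.
std::vector<arg_kind> classify(std::vector<std::string> const & tokens,
                               std::set<std::string> const & flags,
                               std::set<std::string> const & options)
{
    std::vector<arg_kind> kinds;
    bool expect_value = false; // true right after an option identifier

    for (std::string const & token : tokens)
    {
        if (expect_value)
        {
            kinds.push_back(arg_kind::value); // belongs to the preceding option
            expect_value = false;
        }
        else if (flags.count(token) != 0)
        {
            kinds.push_back(arg_kind::flag); // never followed by a value
        }
        else if (options.count(token) != 0)
        {
            kinds.push_back(arg_kind::option);
            expect_value = true;
        }
        else
        {
            kinds.push_back(arg_kind::positional_option); // identified by position
        }
    }
    return kinds;
}
```

Called with the tokens -f, -i, 4, --long-id, 6, file1.txt, the flag set {-f} and the option set {-i, --long-id}, the function labels the tokens as flag, option, value, option, value and positional option, in that order.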
We will get to know the wide functionality of the parser by writing a little application and extending it step by step. Let's say we have a tab separated file data.tsv with information on the Game of Thrones Seasons (by Wikipedia):
We want to build an application that is able to read the file with or without a header line, select certain seasons and compute the average or median from the "Avg. U.S. viewers (millions)" of the selected seasons.
Before we add any of the options, flags, and positional options, we will take a look at the sharg::parser class itself. It is constructed by giving a program's name and passing the parameters argc and argv from main. Note that no command line arguments have been parsed so far, but we can now add more information to the parser. After adding all desired information, the parsing is triggered by calling the sharg::parser::parse member function. Since the function throws in case any errors occur, we need to wrap it into a try-catch block. Here is a first working example:
There are two types of exceptions: the sharg::design_error, which indicates that the parser setup was wrong (directed to the developer of the program, not the user!), and any other exception derived from sharg::parser_error, which indicates invalid user input. Additionally, there are special user requests that are handled by the parser by exiting the program via std::exit, e.g. calling --help, which prints the help page.
The parser checks the following restrictions and throws a sharg::design_error if they are not satisfied:
- Long identifiers must be unique, more than one character long, and may only contain alphanumeric characters as well as _, -, or @, but never start with -.
- Short identifiers must be unique and consist of a single character that is alphanumeric, _ or @.
- Either the short or long id may be empty but not both at the same time.
- The identifiers -h, --help, -hh, --advanced-help, --export-help, --version, --copyright are predefined and cannot be specified manually or used otherwise.
When calling the sharg::parser::parse function, the following potential user errors are caught (and handled by throwing a corresponding exception):
Exception | Thrown when |
---|---|
sharg::unknown_option | The option/flag identifier is not known to the parser. |
sharg::too_many_arguments | More command line arguments than expected are given. |
sharg::too_few_arguments | Fewer command line arguments than expected are given. |
sharg::required_option_missing | A required option is not given (see Required options). |
sharg::user_input_error | The given (positional) option value was invalid. |
sharg::validation_error | (Positional-)Option validation failed (see Validators). |
We define "special requests" as command line inputs that do not aim to execute your program but rather display information about your program. Because we do not expect the program to be executed in the case of a special request, we exit the program at the end of the sharg::parser::parse call via std::exit.
Currently we support the following special requests:
Special request | Effect |
---|---|
-h/--help | Prints the help page to the command line (std::cout) |
-hh/--advanced-help | Prints the advanced help page to the command line (std::cout) |
--export-help | Exports the help page in a different format (std::cout) |
--version | Prints the version information to the command line (std::cout) |
--copyright | Prints the copyright information to the command line (std::cout) |
Of course there is not much information to display yet, since we did not provide any. Let's improve this by modifying the sharg::parser::info member of our parser. The sharg::parser::info member is a struct of type sharg::parser_meta_data whose members can be customised.
Write a function void initialise_parser(sharg::parser & parser) that sets these members, then call your program with --help again and see the results.
Now that we're done with the meta information, we will learn how to add the actual functionality of options, flags and positional options. For each of these three there is a respective member function: sharg::parser::add_option, sharg::parser::add_flag and sharg::parser::add_positional_option.
Each of the functions above takes a variable by reference as the first parameter, which will directly store the corresponding parsed value from the command line. This has two advantages compared to other command line parsers: (1) There is no need for a getter function after parsing and (2) the type is automatically deduced (e.g. with boost::program_options you would need to access parser["file_path"].as<std::filesystem::path>() afterwards).
The sharg::parser::add_flag only allows boolean variables while sharg::parser::add_option and sharg::parser::add_positional_option allow any type that is convertible from a std::string via std::from_chars or a container of the former (see List options). Besides accepting generic types, the parser will automatically check if the given command line argument can be converted into the desired type and otherwise throw a sharg::type_conversion_error exception.
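To illustrate what a checked conversion with std::from_chars involves, here is a minimal sketch for the int case. It is not Sharg's implementation; the helper to_int is a hypothetical name, and it reports failure through an empty std::optional where the parser would throw an exception.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper, not part of Sharg: converts the whole string to an int.
// Returns std::nullopt if the text is not a number, has trailing characters,
// or does not fit into an int.
std::optional<int> to_int(std::string_view const text)
{
    int value{};
    char const * const first = text.data();
    char const * const last = text.data() + text.size();

    auto const [ptr, ec] = std::from_chars(first, last, value);

    // ec signals "not a number" or "out of range";
    // ptr != last means that only a prefix of the text was converted.
    if (ec != std::errc{} || ptr != last)
        return std::nullopt;

    return value;
}
```

With this sketch, to_int("42") yields 42, while to_int("4x") and to_int("abc") yield an empty optional. These are the kinds of input for which a conversion error would be reported to the user.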
So what does this look like? The following code snippet adds a positional option to parser.
In addition to the variable that will store the value, you need to pass a description. This description will help users of your application to understand how the option is affecting your program.
The first call to add_positional_option() will be linked to the first command line argument that is neither an option-value pair nor a flag. So the order of initialising your parser determines the order of assigning command line arguments to the respective variables. We recommend always using regular options (id-value pairs) because they are more expressive and make it easier to spot errors.
You can add an option like this:
In addition to the variable that will store the value and the description, you need to specify a short and long identifier. The example above will recognize an option -n or --my-number given on the command line and expect it to be followed by a value separated only by =, a space, or nothing at all.
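The three ways of attaching a value to a short identifier can be sketched as follows. This is a simplified illustration and not how Sharg is implemented; the function short_option_value is a hypothetical name, and the sketch ignores long identifiers, flags and positional options.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper, not part of Sharg: looks up the value of the short
// option -<id> in `tokens`. Accepts the forms "-n 5", "-n=5" and "-n5".
std::optional<std::string> short_option_value(std::vector<std::string> const & tokens, char const id)
{
    std::string const prefix{'-', id};

    for (std::size_t i = 0; i < tokens.size(); ++i)
    {
        std::string const & token = tokens[i];

        if (token == prefix) // "-n 5": the value is the next token
        {
            if (i + 1 < tokens.size())
                return tokens[i + 1];
            return std::nullopt; // identifier without a value
        }

        if (token.compare(0, prefix.size(), prefix) == 0) // "-n=5" or "-n5"
        {
            std::string value = token.substr(prefix.size());
            if (!value.empty() && value.front() == '=')
                value.erase(0, 1); // drop the separator
            return value;
        }
    }
    return std::nullopt; // option not given
}
```

All three spellings yield the same value "5" for the identifier n, and an empty optional is returned if the option does not occur at all.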
Finally, you can add a flag with the following call:
Note that you can omit either the short identifier by passing '\0' or the long identifier by passing "" but you can never omit both at the same time.
With the current design, every option/flag/positional automatically has a default value, which is simply the value with which you initialise the variable that is passed as the first parameter. It is that easy; just make sure to always initialise your variables properly.
As a best practice recommendation for handling multiple options/flags/positionals, you should store the variables in a struct and pass this struct to your parser initialisation function. You can use the following code that introduces a cmd_arguments struct storing all relevant command line arguments. Furthermore, it provides you with a small function run_program() that reads in the data file and aggregates the data by the given information. You don't need to look at the code of run_program(); it is only there so that we have a working program.
Copy and paste this code into the beginning of your application:
Your task is now to extend the initialisation function by the following:
- Extend the initialise_parser function by a parameter that takes a cmd_arguments object and adapt the call so that it passes args;
- Set the default value of aggregate_by to "mean".
You can now use the variables from args to add the following inside of the initialise_parser function:
- A positional option that sets the variable file_path so our program knows the location of the data file to read in.
- An option -y/--year that sets the variable year, which will enable our program to filter the data by only including a season if it got released after the value year.
- An option -a/--aggregate-by that sets the variable aggregate_by, which will enable our program to choose between aggregating by mean or median.
- A flag -H/--header-is-set that sets the variable header_is_set, which lets the program know whether it should ignore the first line in the file.
Take a look at the help page again after you've done all of the above. You will notice that your options have been automatically included. Copy and paste the example data file from the introduction and check if your options are set correctly by trying the following few calls:
In some use cases you may want to allow the user to specify an option multiple times and store the values in a list. With the sharg::parser this behaviour can be achieved simply by choosing your input variable to be of a container type (e.g. std::vector). The parser registers the container type through sharg::container and will adapt the parsing of command line arguments accordingly.
Example:
Adding this option to a parser will allow you to call the program like this:
The vector list_variable will then contain all three names ["Jon", "Arya", "Ned"].
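The idea behind a list option, one identifier given several times with all values collected in order, can be sketched without Sharg as follows. The function collect_values is a hypothetical helper, and the identifier -n used in the usage note is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Sharg: gathers the value that follows
// every occurrence of `id`, in the order given on the command line.
std::vector<std::string> collect_values(std::vector<std::string> const & tokens, std::string const & id)
{
    std::vector<std::string> values;

    for (std::size_t i = 0; i + 1 < tokens.size(); ++i)
    {
        if (tokens[i] == id)
        {
            values.push_back(tokens[i + 1]);
            ++i; // skip the value that was just consumed
        }
    }
    return values;
}
```

For the tokens -n Jon -n Arya -n Ned and the identifier -n, the returned vector holds Jon, Arya and Ned. If the identifier never occurs, the vector stays empty.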
An arbitrary positional option cannot be a list because of the ambiguity of which value belongs to which positional option. We do allow the very last positional option to be a list for convenience though. Note that if you try to add a positional list option which is not the last positional option, a sharg::design_error will be thrown.
Example:
Adding these positional options to a parser will allow you to call the program like this:
The first variable will be filled with the value Stark while the vector list_variable will then contain the three names ["Jon", "Arya", "Ned"].
We extend the solution from assignment 3:
- Remove the option -y/--year, since we want to keep it simple and only aggregate by season now.
- Add a variable seasons of type std::vector<uint8_t> to the struct cmd_arguments.
- Add a list option -s/--season that will fill the variable seasons, which lets the user select seasons.
Take a look at the help page again after you've done all of the above. You will notice that your option -s/--season even tells you that it is of type "List of unsigned 8 bit integer's". Check if your options are set correctly by trying the following few calls:
There is a flaw in the example application we have programmed in assignment 4. Did you notice? You can make it misbehave by not giving it any option -s (which is technically correct for the sharg::parser because a list may be empty). You could of course handle this in the program itself by checking whether the vector seasons is empty, but since supplying no season is not expected, we can force the user to supply the option at least once by declaring it as required.
For this purpose we need to use the sharg::option_spec enum interface that is accepted as an additional argument by all of the add_[positional_option/option/flag] calls:
If the user does not supply the required option via the command line, they will now get the following error:
In addition to the required tag, there is also the possibility of declaring an option as advanced or hidden.
Set an option/flag to advanced if you do not want the option to be displayed in the normal help page (-h/--help). Instead, the advanced options are only displayed when calling -hh/--advanced-help. This can be helpful if you want to avoid bloating your help page with too much information for inexperienced users of your application, but still provide thorough information on demand.
Set an option/flag to hidden, if you want to completely hide it from the user. It will neither appear on the help page nor in any export format. For example, this might be useful for debugging reasons.
Summary:
Tag | Description |
---|---|
standard | The default tag (non-required and always visible). |
required | Required options will cause an error if not provided (required and always visible). |
advanced | Advanced options are only displayed with -hh/--advanced-help (non-required and visible on request). |
hidden | Hidden options are never displayed when exported (non-required and non-visible). |
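The rules in the table can be written down as two small functions. This is a sketch with a hypothetical enum option_tag that mirrors the four tags; it does not use sharg::option_spec itself and makes no claim about how the parser implements these rules.

```cpp
// Hypothetical stand-in for the four tags from the table above.
enum class option_tag { standard, required, advanced, hidden };

// True if an option with this tag appears on the help page.
// `advanced_help_requested` is true for -hh/--advanced-help.
bool shown_on_help_page(option_tag const tag, bool const advanced_help_requested)
{
    switch (tag)
    {
    case option_tag::standard:
    case option_tag::required:
        return true; // always visible
    case option_tag::advanced:
        return advanced_help_requested; // visible on request only
    case option_tag::hidden:
        return false; // never visible
    }
    return false; // not reached for valid tags
}

// True if the parser must report an error when the option is missing.
bool must_be_given(option_tag const tag)
{
    return tag == option_tag::required;
}
```

Only the required tag changes what the user must supply. The other three tags differ only in visibility.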
Extend the solution from assignment 4 by declaring the -s/--season option as required.
Check if your options are set correctly by trying the following call:
Our applications often do not allow just any value to be passed as input arguments and if we do not check for them, the program may exhibit undefined behaviour. The best way to carefully restrict user input is to directly check the input when parsing the command line. The sharg::parser provides validators for a given (positional) option.
A validator is a functor that is called within the parser after retrieving and converting a command line argument. We provide several validators, which we hope cover most of the use cases, but you can always create your own validator (see section Create your own validator).
The following validators are provided in the Sharg library and can be included with the following header:
All the validators below work on single values or a container of values. In case the variable is a container, the validator is called on each element separately.
On construction, this validator receives a minimum and a maximum number. The validator throws a sharg::validation_error exception whenever a given value does not lie inside the given min/max range.
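The functor idea behind such a range check can be sketched in a few lines. This is a hand-rolled stand-in and not Sharg's validator: the names range_check and check_all are hypothetical, the interval is assumed to be closed on both ends, and std::invalid_argument takes the place of sharg::validation_error.

```cpp
#include <stdexcept>
#include <string>

// Hand-rolled stand-in for a range validator; not Sharg's implementation.
class range_check
{
public:
    range_check(double const min, double const max) : min_{min}, max_{max}
    {}

    // Throws if `value` lies outside the closed interval [min, max].
    void operator()(double const value) const
    {
        if (value < min_ || value > max_)
            throw std::invalid_argument{"Value " + std::to_string(value) + " is not in range ["
                                        + std::to_string(min_) + "," + std::to_string(max_) + "]."};
    }

private:
    double min_;
    double max_;
};

// For a container, the check is applied to each element separately.
template <typename container_t>
void check_all(range_check const & check, container_t const & values)
{
    for (auto const & value : values)
        check(value);
}
```

A range_check{1.0, 8.0} accepts 1, 4.5 and 8 and throws for 9. Applied to a list of values, the first element outside the range aborts the check.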
Our application has another flaw that you might have noticed by now: If you supply a season that is not in the data file, the program will again misbehave. Instead of fixing the program, let's restrict the user input accordingly.
On construction, the validator receives a list (vector) of valid values. The validator throws a sharg::validation_error exception whenever a given value is not in the given list.
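A membership check of this kind can be sketched as a small class template. The name value_list_check is hypothetical and the code is not taken from Sharg; std::invalid_argument again stands in for sharg::validation_error.

```cpp
#include <algorithm>
#include <stdexcept>
#include <utility>
#include <vector>

// Hand-rolled stand-in for a value list validator; not Sharg's implementation.
template <typename value_t>
class value_list_check
{
public:
    explicit value_list_check(std::vector<value_t> values) : values_(std::move(values))
    {}

    // Throws if `value` is not one of the values given on construction.
    void operator()(value_t const & value) const
    {
        if (std::find(values_.begin(), values_.end(), value) == values_.end())
            throw std::invalid_argument{"Value is not one of the allowed values."};
    }

private:
    std::vector<value_t> values_;
};
```

Constructed with the values "median" and "mean", the check accepts exactly these two strings and throws for anything else.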
Add a validator to the -a/--aggregate-by option that sets the list of valid values to ["median", "mean"].
Sharg offers two file validator types: the sharg::input_file_validator and the sharg::output_file_validator. On construction, the validator receives a list (vector) of valid file extensions that are tested against the extension of the parsed option value. The validator throws a sharg::validation_error exception whenever a given filename's extension is not in the given list of valid extensions. In addition, the sharg::input_file_validator checks if the file exists, is a regular file and is readable. Moreover, you have to add an additional flag sharg::output_file_open_options to the sharg::output_file_validator, which you can use to indicate whether you want to allow the output files to be overwritten.
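The extension test described above can be approximated with std::filesystem alone. The following sketch uses a hypothetical helper has_valid_extension and assumes that extensions are listed without the leading dot. It covers only the extension comparison, not the checks for existence, readability or overwriting.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>
#include <vector>

// Hypothetical helper, not part of Sharg: true if the extension of `path`
// (without the leading dot) is contained in `valid_extensions`.
bool has_valid_extension(std::filesystem::path const & path,
                         std::vector<std::string> const & valid_extensions)
{
    std::string extension = path.extension().string(); // e.g. ".tsv"

    if (!extension.empty() && extension.front() == '.')
        extension.erase(0, 1); // strip the leading dot

    return std::find(valid_extensions.begin(), valid_extensions.end(), extension)
        != valid_extensions.end();
}
```

With the valid extensions tsv and csv, the path data.tsv passes, whereas data.txt and a file name without any extension are rejected.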
Using the sharg::input_file_validator:
Using the sharg::output_file_validator:
In addition to the file validator types, Sharg offers directory validator types. These are useful if one needs to provide an input directory (using the sharg::input_directory_validator) or output directory (using the sharg::output_directory_validator) where multiple files need to be read from or written to. The sharg::input_directory_validator checks whether the specified path is a directory and is readable. Similarly, the sharg::output_directory_validator checks whether the specified directory is writable and can be created, if it does not already exist. If the tests fail, a sharg::validation_error exception will be thrown. Also, if something unexpected with the filesystem happens, a std::filesystem::filesystem_error will be thrown.
Using the sharg::input_directory_validator:
Using the sharg::output_directory_validator:
Add a sharg::input_file_validator to the option that sets file_path.
On construction, the validator receives a pattern for a regular expression. The pattern variable will be used for constructing a std::regex and the validator will call std::regex_match on the command line argument.
Note that a regex_match will only return true if the string matches the pattern completely (in contrast to regex_search which also matches substrings). The validator throws a sharg::validation_error exception whenever a given parameter does not match the given regular expression.
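The difference between the two standard library functions is easy to see in isolation. The wrappers matches_completely and matches_somewhere below are hypothetical names used only for this sketch; they call std::regex_match and std::regex_search directly and do not involve Sharg.

```cpp
#include <regex>
#include <string>

// True only if the whole of `text` matches `pattern`.
bool matches_completely(std::string const & text, std::string const & pattern)
{
    return std::regex_match(text, std::regex{pattern});
}

// True if any substring of `text` matches `pattern`.
bool matches_somewhere(std::string const & text, std::string const & pattern)
{
    return std::regex_search(text, std::regex{pattern});
}
```

For the pattern [0-9]+, the text 2011 matches completely, while year2011 only matches somewhere. A validator built on regex_match would therefore reject year2011.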
You can also chain validators using the pipe operator (|). The pipe operator is the AND operation for two validators, which means that a value must pass both validators in order to be accepted by the combined validator.
For example, you may want a file name that only accepts absolute paths, but also must have one out of a list of given file extensions. For this purpose you can chain a sharg::regex_validator to a sharg::input_file_validator:
You can chain as many validators as you want; they are evaluated one after the other from left to right (first to last).
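How a pipe operator can combine two checks into one is sketched below. The type check and the factories starts_with and ends_with are hypothetical and far simpler than Sharg's validators. The sketch only demonstrates the AND semantics and the left-to-right evaluation order.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical minimal validator type; Sharg's validators are richer.
struct check
{
    std::function<void(std::string const &)> run;

    void operator()(std::string const & value) const
    {
        run(value);
    }
};

// Combines two checks: the result passes only if both pass (logical AND).
check operator|(check lhs, check rhs)
{
    return check{[lhs = std::move(lhs), rhs = std::move(rhs)](std::string const & value)
                 {
                     lhs(value); // evaluated first
                     rhs(value); // only reached if lhs did not throw
                 }};
}

check starts_with(std::string prefix)
{
    return check{[prefix = std::move(prefix)](std::string const & value)
                 {
                     if (value.compare(0, prefix.size(), prefix) != 0)
                         throw std::invalid_argument{"value must start with " + prefix};
                 }};
}

check ends_with(std::string suffix)
{
    return check{[suffix = std::move(suffix)](std::string const & value)
                 {
                     if (value.size() < suffix.size()
                         || value.compare(value.size() - suffix.size(), suffix.size(), suffix) != 0)
                         throw std::invalid_argument{"value must end with " + suffix};
                 }};
}
```

The combined check starts_with("/") | ends_with(".tsv") accepts /data/seasons.tsv. For the input seasons.txt it fails in the left check, so the right check is never evaluated.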
Add a sharg::regex_validator to file_path by chaining it to the already present sharg::input_file_validator. The parsed file name should have a suffix called seasons.
The following solution shows the complete code including all the little assignments of this tutorial that can serve as a copy'n'paste source for your application.
Many applications provide several sub programs, e.g. git comes with many functionalities like git push, git pull, git checkout, etc., each having their own help page. If you are interested in how this subcommand parsing can be done with the sharg::parser, take a look at our HowTo.
When you run a Sharg-based application for the first time, you will likely be asked about "update notifications". This is a feature that helps inform users about updates and helps the Sharg project get a rough estimate on which Sharg-based apps are popular.
See the API documentation of sharg::parser for information on how to configure (or turn off) this feature. See our wiki entry for more information on how it works and our privacy policy.