Class BamTagsDict
Indexes start positions of BAM tags in a CharString and provides a dict-like API.

Defined in <seqan/bam_io.h>
Signature class BamTagsDict;

Detailed Description

Example

#include <iostream>
#include <seqan/stream.h>
#include <seqan/bam_io.h>

using namespace seqan;

int main()
{
    CharString bamStr, samStr = "AA:Z:value1\tAB:Z:value2\tAC:i:30";
    assignTagsSamToBam(bamStr, samStr);
    BamTagsDict tags(bamStr);
    std::cout << length(tags) << std::endl;  // #=> "3"
    for (unsigned id = 0; id < length(tags); ++id)
    {
        std::cout << getTagKey(tags, id) << " -> ";

        if (getTagType(tags, id) == 'i')  // is 32 bit integer
        {
            int32_t x = 0;
            if (!extractTagValue(x, tags, id))
                SEQAN_ASSERT_FAIL("Not a valid integer at pos %u!", id);
            std::cout << x;
        }
        if (getTagType(tags, id) == 'Z')  // is string
        {
            CharString str;
            if (!extractTagValue(str, tags, id))
                SEQAN_ASSERT_FAIL("Not a valid string at pos %u!", id);
            std::cout << '"' << str << '"';
        }

        std::cout << std::endl;
    }

    return 0;
}

Output is:

3
AA -> "value1"
AB -> "value2"
AC -> 30

Member Functions Detail

BamTagsDict::BamTagsDict(); BamTagsDict::BamTagsDict(tags);

Constructor

Parameters

tags The tags string of a BamAlignmentRecord to be indexed and/or modified.

Note that the second constructor stores a reference to tags using a Holder. If you modify the BamTagsDict by adding or removing tags, those changes are transparently applied to the original tags string.

Data Races

Thread safety unknown!

Interface Functions Detail

bool appendTagValue(tagsDict, key, val[, typeC]);

Append a tag/value pair to a BamTagsDict.

Parameters

tagsDict The BamTagsDict to modify.
key The key of the tag. Must be a sequence of length 2.
val The value to set the tag to.
typeC BAM type char to use. For portability (so the generated files are the same on all platforms), use a type with explicit width and signedness for val, or give typeC explicitly. Also see the remarks for getBamTypeChar.

Returns

bool true on success, false on failure. This function can fail if the key is not a valid tag id (e.g. does not have length 2) or if the type of val is not an atomic value or a string (anything but char *, char const *, a character, integer or float type is invalid).

Remarks

setTagValue behaves like appendTagValue if key was not part of tags before. However, in this case appendTagValue is faster.

Data Races

Thread safety unknown!

void buildIndex(tagsDict);

Build index for a BamTagsDict object.

Parameters

tagsDict The BamTagsDict object to build the index for.
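
To illustrate what building such an index involves, the following is a minimal sketch that scans a BAM-encoded tags string and records the start offset of every tag. It is not SeqAn's implementation: the names fixedWidth and indexTagStarts are hypothetical, std::string stands in for CharString, and the array count is read assuming a little-endian host. The layout used (2-byte key, 1-byte type char, then the value) follows the BAM format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Byte width of a fixed-width BAM value type; 0 for anything else.
inline std::size_t fixedWidth(char type)
{
    switch (type)
    {
    case 'A': case 'c': case 'C': return 1;
    case 's': case 'S':           return 2;
    case 'i': case 'I': case 'f': return 4;
    default:                      return 0;
    }
}

// Hypothetical helper (not a SeqAn function): start offset of each tag.
inline std::vector<std::size_t> indexTagStarts(std::string const & bam)
{
    std::vector<std::size_t> starts;
    std::size_t pos = 0;
    while (pos < bam.size())
    {
        if (bam.size() - pos < 3)
            throw std::runtime_error("truncated tag header");
        starts.push_back(pos);
        char const type = bam[pos + 2];
        pos += 3;  // skip the 2-byte key and the 1-byte type char

        std::size_t const width = fixedWidth(type);
        if (width != 0)
        {
            pos += width;
        }
        else if (type == 'Z' || type == 'H')
        {
            // NUL-terminated string: skip up to and including the terminator.
            std::size_t const nul = bam.find('\0', pos);
            if (nul == std::string::npos)
                throw std::runtime_error("unterminated string tag");
            pos = nul + 1;
        }
        else if (type == 'B')
        {
            // Array: 1-byte element type, 32-bit element count, then the elements.
            if (bam.size() - pos < 5)
                throw std::runtime_error("truncated array header");
            std::size_t const elemWidth = fixedWidth(bam[pos]);
            if (elemWidth == 0)
                throw std::runtime_error("invalid array element type");
            std::uint32_t count = 0;
            std::memcpy(&count, bam.data() + pos + 1, 4);  // little-endian host assumed
            pos += 5 + elemWidth * static_cast<std::size_t>(count);
        }
        else
        {
            throw std::runtime_error("unknown tag type");
        }
        if (pos > bam.size())
            throw std::runtime_error("truncated tag value");
    }
    return starts;
}
```

With such a list of offsets, the number of tags is simply the size of the list, and the key and type char of entry id sit at offsets starts[id] and starts[id] + 2.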

Data Races

Thread safety unknown!

bool eraseTag(tagsDict, key);

Erase a tag from BamTagsDict.

Parameters

tagsDict The BamTagsDict to erase the tag from.
key The key of the tag to erase.

Returns

bool true if the tag could be erased, false if the key wasn't present.
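
As a rough illustration of what erasing involves once the key has been resolved to an entry index, the sketch below removes the bytes of one tag from the underlying string and shifts the start offsets that follow. It assumes an index of tag start offsets is kept next to the string; eraseAt is a hypothetical name, std::string stands in for CharString, and this is not SeqAn's implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: erase entry `id` from a BAM tags string, given the
// start offset of every tag. Returns false if `id` is out of range.
inline bool eraseAt(std::string & bam, std::vector<std::size_t> & starts, std::size_t id)
{
    if (id >= starts.size())
        return false;

    // The tag occupies the bytes up to the next tag's start (or the end of the string).
    std::size_t const begin = starts[id];
    std::size_t const end = (id + 1 < starts.size()) ? starts[id + 1] : bam.size();
    std::size_t const removed = end - begin;

    bam.erase(begin, removed);
    starts.erase(starts.begin() + static_cast<std::ptrdiff_t>(id));

    // Every tag after the erased one moves forward by the number of removed bytes.
    for (std::size_t i = id; i < starts.size(); ++i)
        starts[i] -= removed;
    return true;
}
```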

Data Races

Thread safety unknown!

bool extractTagValue(dest, tagsDict, id);

Extract and cast "atomic" value from tags string with index id.

Parameters

dest The variable to write the value to. The value is first copied into a variable of the type indicated in the BAM file, then cast to the type of dest.
tagsDict The BamTagsDict object to query.
id The id of the tag to extract the value from. See findTagKey.

Returns

bool true if the value could be extracted.

Remarks

The function only works for atomic types such as int, not for char* or arrays.

See BamTagsDict for an example.
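
The "copy, then cast" behaviour described for dest can be sketched as follows. This is an illustration under stated assumptions, not the library's code: readAs and extractAtomic are hypothetical names, std::string stands in for the tags string, tagPos is the byte offset of the tag (not its index), and multi-byte values are read assuming a little-endian host.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Copy sizeof(TStored) bytes into a TStored, then cast the result to TDest.
template <typename TStored, typename TDest>
bool readAs(TDest & dest, std::string const & bam, std::size_t valuePos)
{
    if (valuePos > bam.size() || bam.size() - valuePos < sizeof(TStored))
        return false;
    TStored stored;
    std::memcpy(&stored, bam.data() + valuePos, sizeof(TStored));
    dest = static_cast<TDest>(stored);
    return true;
}

// Hypothetical helper: extract the atomic value of the tag starting at tagPos.
// Returns false for strings, arrays, unknown types, and truncated input.
template <typename TDest>
bool extractAtomic(TDest & dest, std::string const & bam, std::size_t tagPos)
{
    if (bam.size() < 3 || tagPos > bam.size() - 3)
        return false;
    std::size_t const valuePos = tagPos + 3;  // skip key (2 bytes) and type char
    switch (bam[tagPos + 2])
    {
    case 'A': return readAs<char>(dest, bam, valuePos);
    case 'c': return readAs<std::int8_t>(dest, bam, valuePos);
    case 'C': return readAs<std::uint8_t>(dest, bam, valuePos);
    case 's': return readAs<std::int16_t>(dest, bam, valuePos);
    case 'S': return readAs<std::uint16_t>(dest, bam, valuePos);
    case 'i': return readAs<std::int32_t>(dest, bam, valuePos);
    case 'I': return readAs<std::uint32_t>(dest, bam, valuePos);
    case 'f': return readAs<float>(dest, bam, valuePos);
    default:  return false;
    }
}
```

Because the stored type is chosen from the type char and the cast happens afterwards, a tag stored as 'i' can be extracted into, say, a double without the caller knowing the stored width.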

Data Races

Thread safety unknown!

bool findTagKey(id, tagsDict, key);

Find a tag by its key for a BamTagsDict object.

Parameters

id The id of the found tag.
tagsDict The BamTagsDict to query.
key The key to query for (a CharString).

Returns

bool true if the key could be found and false otherwise.
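
A lookup of this shape can be sketched as a scan over the indexed tag starts, comparing the two key bytes of each entry. The sketch assumes the index is a list of byte offsets into the tags string; findKey is a hypothetical name and the linear scan is an assumption for illustration, not a statement about how SeqAn searches.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: on success, writes the entry index of `key` to `id`.
// `starts` holds the byte offset at which each tag begins in `bam`.
inline bool findKey(std::size_t & id,
                    std::string const & bam,
                    std::vector<std::size_t> const & starts,
                    std::string const & key)
{
    if (key.size() != 2)
        return false;  // BAM tag keys are exactly two characters long
    for (std::size_t i = 0; i < starts.size(); ++i)
    {
        if (starts[i] + 2 <= bam.size() && bam.compare(starts[i], 2, key) == 0)
        {
            id = i;
            return true;
        }
    }
    return false;
}
```

As in the documented interface, the result is reported through the first parameter and the return value only signals whether the key was found, so id is left untouched on failure.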

Data Races

Thread safety unknown!

TKey getTagKey(tagsDict, id);

Return key of a tag by index.

Parameters

tagsDict The BamTagsDict to query.
id The index of the dict entry.

Returns

TKey An infix of a CharString. Will be a two-character char sequence.

Data Races

Thread safety unknown!

char getTagType(tagsDict, id);

Returns the tag type char for an entry of a BamTagsDict.

Parameters

tagsDict The BamTagsDict to query.
id The id of the tag for which to determine the type. See findTagKey.

Returns

char A char that identifies the tag type.

Data Races

Thread safety unknown!

bool hasIndex(tagsDict);

Returns whether the BamTagsDict has an index.

Parameters

tagsDict The BamTagsDict to query.

Returns

bool true if dict has an index and false otherwise.

Data Races

Thread safety unknown!

TSize length(tagsDict);

Returns the number of entries in a BamTagsDict.

Parameters

tagsDict The BamTagsDict object to query for its length.

Returns

TSize The number of entries in the BamTagsDict. TSize is the result of Size<BamTagsDict>::Type.

Data Races

Thread safety unknown!

bool setTagValue(tagsDict, key, val[, typeC]);

Set the value of a tag through a BamTagsDict.

Parameters

tagsDict The BamTagsDict to modify.
key The key of the tag. Must be a sequence of length 2.
val The value to set the tag to.
typeC BAM type char to use. For portability (so the generated files are the same on all platforms), use a type with explicit width and signedness for val, or give typeC explicitly. Also see the remarks for getBamTypeChar.

Returns

bool true on success, false on failure. This function can fail if the key is not a valid tag id (e.g. does not have length 2) or if the type of val is not an atomic value or a string (anything but char *, char const *, a character, integer or float type is invalid).

Remarks

Note that setTagValue does not cast the type, so typeC only influences the type character written out but val is written out in binary without modification.
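
The consequence of writing the value "in binary without modification" can be shown with a small standalone sketch. It does not call SeqAn: appendRaw and readFirstAsFloat are hypothetical helpers that mimic the described behaviour on a std::string, and a 4-byte float is assumed.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");

// Hypothetical helper: append key, type char, and the raw bytes of val.
// No conversion takes place; typeC is only a label.
template <typename T>
void appendRaw(std::string & bam, char const (&key)[3], char typeC, T val)
{
    bam.append(key, 2);
    bam.push_back(typeC);
    char bytes[sizeof(T)];
    std::memcpy(bytes, &val, sizeof(T));
    bam.append(bytes, sizeof(T));
}

// Interpret the four value bytes of the first tag as a float, the way a
// reader that trusts the type char 'f' would. Returns 0 if the input is too short.
inline float readFirstAsFloat(std::string const & bam)
{
    float f = 0.0f;
    if (bam.size() >= 7)
        std::memcpy(&f, bam.data() + 3, sizeof(f));
    return f;
}
```

Calling appendRaw(bam, "XA", 'f', std::int32_t(9)) stores the integer bit pattern of 9 under the label 'f', so readFirstAsFloat(bam) does not yield 9.0f. Passing 9.0f instead stores a value that round-trips as expected.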

Examples

An example setting some atomic tag values.

CharString rawTagsText;
BamTagsDict tags(rawTagsText);
setTagValue(tags, "XA", 9);    // int
setTagValue(tags, "XB", 9u);   // unsigned int
setTagValue(tags, "XC", 'X');  // char

If char is equal to int8_t or uint8_t, the last line produces an entry with type 'c' or 'C'. To make sure that the type char 'A' (for "printable character") is written to the file, give it explicitly:

setTagValue(tags, "XC", 'X', 'A');  // Overwrite XC, enforce type 'printable character'.

Note that on most systems an int is 32 bits wide, but the C++ standard leaves this open. For all types except characters, you should not give an explicit type char; instead, use one of the types with explicit width and signedness, such as int32_t or uint32_t.

// The following is not recommended since the type of x is not "unsigned 32 bit int."
int32_t x = -1;
setTagValue(tags, "XB", x, 'I');
// Instead, explicitly use an unsigned type if you need one.  Note that your compiler
// might warn you about assigning -1 to an unsigned variable so you know that you are
// probably doing something unintended.
uint32_t y = -1;
setTagValue(tags, "XB", y);

// Do not do this!
setTagValue(tags, "XA", 9, 'f');    // BOGUS since 9 is not a floating point number.
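
Why fixed-width types give portable type chars can be seen from a sketch of a compile-time mapping from C++ type to BAM type char. The function typeCharFor is hypothetical and only approximates what the remarks above describe (char mapping to 'c' or 'C' depending on its signedness); it is not the definition of getBamTypeChar.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical mapping from an arithmetic C++ type to a BAM type char,
// decided purely by width and signedness. Returns '\0' if nothing fits.
template <typename T>
constexpr char typeCharFor()
{
    static_assert(std::is_arithmetic<T>::value,
                  "only arithmetic types map to atomic BAM types");
    return std::is_floating_point<T>::value
               ? (sizeof(T) == 4 ? 'f' : '\0')
           : sizeof(T) == 1 ? (std::is_signed<T>::value ? 'c' : 'C')
           : sizeof(T) == 2 ? (std::is_signed<T>::value ? 's' : 'S')
           : sizeof(T) == 4 ? (std::is_signed<T>::value ? 'i' : 'I')
                            : '\0';
}

// Fixed-width types give the same answer on every platform.
static_assert(typeCharFor<std::int32_t>() == 'i', "int32_t maps to 'i'");
static_assert(typeCharFor<std::uint32_t>() == 'I', "uint32_t maps to 'I'");
```

Under such a mapping, typeCharFor<int>() depends on the platform's int width and typeCharFor<char>() on whether char is signed, which is the portability concern raised above.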

Data Races

Thread safety unknown!

void tagsToBamRecord(record, tagsDict);

Writes bam tags to the tags field of the given BamAlignmentRecord.

Parameters

record The BamAlignmentRecord whose tags field is overwritten.
tagsDict The BamTagsDict to get the tags from.

This is semantically the same as:

record.tags = host(tagsDict);

See BamTagsDict for an example.

Data Races

Thread safety unknown!