Class String
Sequence container class.

Implements SegmentableConcept, SequenceConcept, TextConcept
All Subclasses AllocString, ArrayString, BlockString, CStyleString, ExternalString, JournaledString, MMapString, PackedString, PizzaChiliString
All Implemented Concepts AssignableConcept, ContainerConcept, DestructibleConcept, ForwardContainerConcept, RandomAccessContainerConcept, ReversibleContainerConcept, SegmentableConcept, SequenceConcept, TextConcept
Defined in <seqan/sequence.h>
Signature template <typename TValue, typename TSpec> class String<TValue, TSpec>;

Template Parameters

TValue The element type of the string.
TSpec The tag for selecting the string specialization.

Detailed Description

The String class stores sequences and is thus at the core of the sequence analysis library SeqAn. Strings are models of the sequence concept but extend it by allowing implicit conversion of other sequences into strings, as long as the element conversion works:

    seqan::String<char> strA;        // default construction
    seqan::String<char> strB(strA);  // copy construction
    seqan::String<char> strC("copy from other sequence");

Aside from that, the usual operations (appending, insertion, removal, element access) are available as well.

    // Assignment of sequence with the same alphabet and of another string.
    strA = "Hello World!";
    strB = strA;
    std::cout << strA << "\n"   // => "Hello World!"
              << strB << "\n";  // => "Hello World!"

    // Appending of values (characters) and whole strings.
    appendValue(strA, ' ');
    append(strA, strB);
    std::cout << strA << "\n";  // => "Hello World! Hello World!"

    // Element-wise access and replacing.
    std::cout << strB[3] << "\n";  // => "l"
    strB[3] = 'g';
    std::cout << strB[3] << "\n";  // => "g"

    replace(strB, 5, 12, "land");
    std::cout << strB << "\n";  // => "Helgoland"

    // Removal of elements and strings.
    erase(strA, 5, 18);
    erase(strA, length(strA) - 1);
    std::cout << strA << "\n";  // => "Hello World"

Strings have a size (the actual number of elements) and a capacity (the number of elements that memory has been allocated for). Note that clearing a string does not free the memory (like the STL, SeqAn assumes that strings will later require a similar amount of memory as before). Using shrinkToFit, the user can force a re-allocation of the memory such that the string afterward uses the minimal amount of memory needed to accommodate all of its elements.

    std::cout << "length(strA) = " << length(strA) << "\n"       // "length(strA) == 11"
              << "capacity(strA) = " << capacity(strA) << "\n";  // "capacity(strA) == 32"
    clear(strA);
    std::cout << "length(strA) = " << length(strA) << "\n"       // "length(strA) == 0"
              << "capacity(strA) = " << capacity(strA) << "\n";  // "capacity(strA) == 32"
    shrinkToFit(strA);
    std::cout << "length(strA) = " << length(strA) << "\n"       // "length(strA) == 0"
              << "capacity(strA) = " << capacity(strA) << "\n";  // "capacity(strA) == 0"

Examples

This example shows a brute-force pattern matching scheme for two character Strings. The creation of the String "text" demonstrates some of the available String functions. See class StringSet for an example of a String container whose values are not simple types. See the class Index example for finding the same pattern matches efficiently using an index.

#include <iostream>

#include <seqan/file.h>
#include <seqan/sequence.h>

using namespace seqan;
int main()
{
    // Creating text
    String<char> text = "to be";
    std::cout << text << std::endl;
    appendValue(text, ' ');
    std::cout << "Last sign is whitespace? " << endsWith(text, ' ') << std::endl;
    // Erasing whitespaces in text
    eraseBack(text);
    erase(text, 2);
    // Appending another string
    append(text, "ornottobe");
    std::cout << text << std::endl;

    // Pattern
    String<char> pattern = "be";
    // Number of consecutive matching characters per position
    String<int> score;
    resize(score, length(text) - length(pattern) + 1);

    // Brute force pattern matching for every position
    for (unsigned i = 0; i < length(text) - length(pattern) + 1; ++i)
    {
        int localScore = 0;
        for (unsigned j = 0; j < length(pattern); ++j)
            if (text[i + j] == pattern[j])
                ++localScore;
        score[i] = localScore;
    }

    std::cout << "hit at ";
    for (unsigned i = 0; i < length(score); ++i)
        if (score[i] == (int)length(pattern))
            std::cout << i << " ";
    std::cout << std::endl;

    return 0;
}

The output is as follows:

to be
Last sign is whitespace? 1
tobeornottobe
hit at 2 11 

Member Functions Detail

TString String::operator=(other)

The String assignment operator allows assignment of convertible sequences.

Parameters

other The other string. Must be a sequence whose elements are convertible into this String's value type.

Returns

TString Reference to the String object after assignment.

String::String()
String::String(other)

Constructor.

Parameters

other The source for the copy constructor. Can be of any sequence type as long as other's elements are convertible to the value type of this string.

Default and copy constructors are implemented.

Interface Functions Detail

TPos beginPosition(str);

Returns 0 for compatibility with Segment.

Parameters

str The String to use.

Returns

TPos Always 0.

TPos endPosition(str);

Returns the length of the string for compatibility with Segment.

Parameters

str The String to use.

Returns

TPos Length of the string.

TSize reserve(str, newCapacity[, tag]);

Increases the capacity.

Parameters

str The String to reserve space in.
newCapacity The new capacity str will get.
tag Specifies the strategy that is applied for changing the capacity.

Returns

TSize The amount of the requested capacity that was available. That is, the function returns the minimum of newCapacity and capacity(str).

This function increases the capacity of a container but not its length.

Use resize if you want to change the size of a container.

Remarks

At the end of the operation, capacity(str) can be larger than newCapacity. If newCapacity is smaller than capacity(str) at the beginning of the operation, the operation need not change the capacity at all.

This operation does not change the content of str.

This operation may invalidate iterators of str.
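
The contract described above can be modeled in standard C++. The following is a minimal sketch on top of std::vector, not SeqAn code: reserveModel is a hypothetical helper introduced only for illustration, and it assumes that the capacity of a std::vector behaves comparably to that of a String.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SeqAn): models the documented contract
// of reserve() using std::vector.
// The capacity may grow, but the size and the content stay unchanged.
// The return value is the part of the request that is available afterward,
// i.e. the minimum of the requested and the resulting capacity.
template <typename T>
std::size_t reserveModel(std::vector<T>& v, std::size_t newCapacity)
{
    v.reserve(newCapacity);  // never shrinks; may invalidate iterators
    return std::min(newCapacity, v.capacity());
}
```

For a vector v holding three elements, reserveModel(v, 100) returns 100 and leaves v.size() at 3. A later call reserveModel(v, 10) returns 10 and does not reduce the capacity.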

TSize resizeSpace(str, size, posBegin, posEnd [, limit][, resizeTag]);

Makes free space in a container.

Parameters

str The String to modify.
size Number of characters that should be freed.
posBegin Position of the first item in str that is to be destroyed.
posEnd Position behind the last item in str that is to be destroyed. If posEnd == posBegin, no item in str will be destroyed.
limit Maximal length str can get after this operation. (optional)
resizeTag Strategy that is applied if str does not have enough capacity to store the complete content. (optional)

Returns

TSize The number of free characters. Depending on resizeTag, this could be size or less than size if str does not have enough capacity.
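
One reading of the parameter descriptions above is that the items in [posBegin, posEnd) are destroyed and a gap of size free positions is left in their place. The sketch below models that reading on std::string. It is not SeqAn code: resizeSpaceModel is a hypothetical helper, the optional limit and resizeTag parameters are left out, and the free positions are filled with '_' only to make them visible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of resizeSpace() on std::string (not part of SeqAn).
// Destroys the items in [posBegin, posEnd) and leaves `size` free positions
// at posBegin. The free positions are filled with '_' for visibility.
std::size_t resizeSpaceModel(std::string& s, std::size_t size,
                             std::size_t posBegin, std::size_t posEnd)
{
    assert(posBegin <= posEnd && posEnd <= s.size());
    s.replace(posBegin, posEnd - posBegin, size, '_');
    return size;  // no limit in this model, so the full amount is available
}
```

Applied to the string "Hello World" with size 3, posBegin 5 and posEnd 6, the model destroys the blank and leaves "Hello___World". With posBegin equal to posEnd, nothing is destroyed and the gap is inserted at that position.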

TValue* toCString(seq)

Accesses the sequence as a C-style string.

Parameters

seq The sequence to be accessed. Type: String

Returns

TValue* For strings that store their elements in a contiguous block (see IsContiguous), a pointer to the first element of seq is returned.

Remarks

If the alphabet of seq is char or wchar_t, the return value is a C-style string representing the contents of seq.

Calling this function for non-contiguous containers will raise a compilation error. To create C-style strings for non-contiguous strings or strings with different alphabets, use a CStyleString as an intermediate.
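
The idea of going through a contiguous intermediate can be illustrated in standard C++. This is only an analogy and not SeqAn code: std::deque stands in for a non-contiguous string, std::string plays the role of the intermediate, and makeContiguous is a hypothetical helper.

```cpp
#include <deque>
#include <string>

// Hypothetical helper (not part of SeqAn): copies the characters of a
// non-contiguous container into contiguous, null-terminated storage.
// The returned std::string plays the role of the intermediate; its c_str()
// is valid only as long as that std::string object is alive.
std::string makeContiguous(const std::deque<char>& seq)
{
    return std::string(seq.begin(), seq.end());
}
```

A caller keeps the intermediate alive while using the pointer, for example by storing the result of makeContiguous(d) in a local std::string and then passing its c_str() to a C API.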