Class StringEnumerator
Class to enumerate all strings within a given edit/Hamming distance.
All Subclasses | HammingStringEnumerator, LevenshteinStringEnumerator |
---|---|
Defined in | <seqan/misc/edit_environment.h> |
Signature |
template <typename TString, typename TSpec>
class StringEnumerator<TString, TSpec>;
|
Template Parameters
TString | Type of the string to enumerate the environment of. |
---|---|
TSpec | Specialization tag. |
Member Function Overview
- StringEnumerator::StringEnumerator(string[, minDist]);
  Constructor
Interface Function Overview
- TIter begin(stringEnum[, tag]);
  Return begin iterator.
- TIter end(stringEnum[, tag]);
  Return end iterator.
- TSize length(stringEnum);
  Return number of strings that will be enumerated.
Interface Metafunction Overview
- Difference<TStringEnumerator>::Type;
  Returns difference type.
- Host<TStringEnumerator>::Type;
  Returns host type.
- Iterator<TStringEnumerator, TSpec>::Type;
  Returns iterator type.
- Position<TStringEnumerator>::Type;
  Returns position type.
- Reference<TStringEnumerator>::Type;
  Returns reference type of the enumerated strings.
- Size<TStringEnumerator>::Type;
  Returns size type.
- Value<TStringEnumerator>::Type;
  Returns value type of the string to enumerate.
Member Variable Overview
- bool StringEnumerator::trim
  Indicates whether to ignore substitutions in the first or last character of the string in Levenshtein mode (an optimization for approximate search).
Member Functions Detail
StringEnumerator::StringEnumerator(string[, minDist]);
Constructor
Parameters
string | The string to use as the center. Type: TString. |
---|---|
minDist | The smallest distance to generate strings with. Type: unsigned. Default: 0. |
Data Races
Thread safety unknown!
Interface Functions Detail
TIter begin(stringEnum[, tag]);
Return begin iterator.
Parameters
stringEnum | StringEnumerator to query. |
---|---|
tag | Iterator tag to use. |
Returns
TIter | Iterator to the first string in the enumerator. |
---|---|
Data Races
Thread safety unknown!
TIter end(stringEnum[, tag]);
Return end iterator.
Parameters
stringEnum | StringEnumerator to query. |
---|---|
tag | Iterator tag to use. |
Returns
TIter | End iterator for the string enumerator. |
---|---|
Data Races
Thread safety unknown!
TSize length(stringEnum);
Return number of strings that will be enumerated.
Parameters
stringEnum | StringEnumerator to query. |
---|---|
Returns
TSize | The number of elements in the enumerator (Metafunction: Size). |
---|---|
Data Races
Thread safety unknown!
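For the Hamming case, the number of strings can be worked out in closed form: choosing k mismatch positions out of n and one of the (sigma - 1) other characters at each gives C(n, k) * (sigma - 1)^k strings at distance exactly k. The sketch below sums this over a distance range. It is a plain C++ illustration, not SeqAn code; binomial and hammingEnvironmentSize are hypothetical names, and whether length() computes its result this way is not stated in this documentation.

```cpp
#include <cstdint>

// Hypothetical helper: binomial coefficient C(n, k), computed iteratively.
std::uint64_t binomial(unsigned n, unsigned k)
{
    if (k > n)
        return 0;
    std::uint64_t result = 1;
    for (unsigned i = 1; i <= k; ++i)
        result = result * (n - k + i) / i;  // Exact: equals C(n - k + i, i) after each step.
    return result;
}

// Hypothetical helper: number of strings of length n over an alphabet of
// size sigma (sigma >= 1) whose Hamming distance to a fixed center string
// lies in [minDist, maxDist].
std::uint64_t hammingEnvironmentSize(unsigned n, unsigned sigma,
                                     unsigned minDist, unsigned maxDist)
{
    std::uint64_t total = 0;
    std::uint64_t power = 1;  // (sigma - 1)^k for the current k.
    for (unsigned k = 0; k <= maxDist && k <= n; ++k)
    {
        if (k >= minDist)
            total += binomial(n, k) * power;
        power *= (sigma - 1);
    }
    return total;
}
```

For the example on this page (length 4, the five-letter Dna5 alphabet, distances 0 to 2) this gives 1 + 16 + 96 = 113. No comparable shortcut is sketched for the Levenshtein case, where the example output lists some strings more than once.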
Member Variables Detail
bool StringEnumerator::trim
Indicates whether to ignore substitutions in the first or last character of the string in Levenshtein mode (an optimization for approximate search).
This is useful when searching for the enumerated strings in large texts, since patterns with a substitution in the first base would also be found.
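One plausible reading of this remark, offered as an assumption rather than as SeqAn's documented rationale: a string such as AGAT (CGAT with its first base substituted) always contains GAT (CGAT with its first base deleted), and GAT is itself part of the edit distance environment. Every text occurrence of AGAT is therefore already reported when searching for GAT. The plain C++ sketch below checks this containment on a text; findAll and everyHitCovered are hypothetical helpers, not SeqAn functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (plain C++, not SeqAn): start positions of all exact
// occurrences of `pattern` in `text`, overlapping ones included, ascending.
std::vector<std::size_t> findAll(const std::string& text, const std::string& pattern)
{
    std::vector<std::size_t> hits;
    if (pattern.empty())
        return hits;
    for (std::size_t pos = text.find(pattern); pos != std::string::npos;
         pos = text.find(pattern, pos + 1))
        hits.push_back(pos);
    return hits;
}

// Hypothetical helper: true if every occurrence of `longer` at position p
// comes with an occurrence of `shorter` at position p + offset.
bool everyHitCovered(const std::string& text, const std::string& longer,
                     const std::string& shorter, std::size_t offset)
{
    const std::vector<std::size_t> shortHits = findAll(text, shorter);
    for (std::size_t p : findAll(text, longer))
        if (!std::binary_search(shortHits.begin(), shortHits.end(), p + offset))
            return false;
    return true;
}
```

With the text "TTAGATCCAGATG", findAll reports AGAT at positions 2 and 8 and GAT at positions 3 and 9, so everyHitCovered(text, "AGAT", "GAT", 1) holds.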
Examples
#include <iostream>

#include <seqan/basic.h>
#include <seqan/sequence.h>
#include <seqan/stream.h>  // For printing SeqAn Strings.
#include <seqan/misc/edit_environment.h>

using namespace seqan;

int main()
{
    Dna5String original = "CGAT";

    // Enumerate neighbourhood using Hamming distance.
    typedef StringEnumerator<Dna5String, EditEnvironment<HammingDistance, 2> > THammingEnumerator;
    typedef Iterator<THammingEnumerator>::Type THammingIterator;

    std::cout << "Enumerating Hamming distance environment of " << original << " of distance 2\n";
    THammingEnumerator hammingEnumerator(original);
    for (THammingIterator itH = begin(hammingEnumerator); !atEnd(itH); goNext(itH))
        std::cout << *itH << '\n';

    // Enumerate neighbourhood using edit distance.
    typedef StringEnumerator<Dna5String, EditEnvironment<LevenshteinDistance, 2> > TEditEnumerator;
    typedef Iterator<TEditEnumerator>::Type TEditIterator;

    std::cout << "\nEnumerating edit distance environment of " << original << " of distance 1-2\n";
    TEditEnumerator editEnumerator(original);
    for (TEditIterator itE = begin(editEnumerator); !atEnd(itE); goNext(itE))
        std::cout << *itE << '\n';

    return 0;
}
Enumerating Hamming distance environment of CGAT of distance 2
AGAT CGAT GGAT TGAT NGAT CAAT CCAT CTAT CNAT CGCT CGGT CGTT CGNT CGAA CGAC CGAG CGAN
AAAT ACAT ATAT ANAT GAAT GCAT GTAT GNAT TAAT TCAT TTAT TNAT NAAT NCAT NTAT NNAT
AGCT AGGT AGTT AGNT GGCT GGGT GGTT GGNT TGCT TGGT TGTT TGNT NGCT NGGT NGTT NGNT
AGAA AGAC AGAG AGAN GGAA GGAC GGAG GGAN TGAA TGAC TGAG TGAN NGAA NGAC NGAG NGAN
CACT CAGT CATT CANT CCCT CCGT CCTT CCNT CTCT CTGT CTTT CTNT CNCT CNGT CNTT CNNT
CAAA CAAC CAAG CAAN CCAA CCAC CCAG CCAN CTAA CTAC CTAG CTAN CNAA CNAC CNAG CNAN
CGCA CGCC CGCG CGCN CGGA CGGC CGGG CGGN CGTA CGTC CGTG CGTN CGNA CGNC CGNG CGNN

Enumerating edit distance environment of CGAT of distance 1-2
CGAT CAAT CCAT CTAT CNAT CGCT CGGT CGTT CGNT GAT CAT CGT CGA
CAGAT CCGAT CGGAT CTGAT CNGAT CGAAT CGCAT CGGAT CGTAT CGNAT CGAAT CGACT CGAGT CGATT CGANT
CACT CAGT CATT CANT CCCT CCGT CCTT CCNT CTCT CTGT CTTT CTNT CNCT CNGT CNTT CNNT
CAT CCT CTT CNT CAA CCA CTA CNA CGC CGG CGT CGN
CAAAT CACAT CAGAT CATAT CANAT CCAAT CCCAT CCGAT CCTAT CCNAT CTAAT CTCAT CTGAT CTTAT CTNAT CNAAT CNCAT CNGAT CNTAT CNNAT
CAAAT CAACT CAAGT CAATT CAANT CCAAT CCACT CCAGT CCATT CCANT CTAAT CTACT CTAGT CTATT CTANT CNAAT CNACT CNAGT CNATT CNANT
CGCAT CGCCT CGCGT CGCTT CGCNT CGGAT CGGCT CGGGT CGGTT CGGNT CGTAT CGTCT CGTGT CGTTT CGTNT CGNAT CGNCT CGNGT CGNTT CGNNT
GCT GGT GTT GNT AT GT GA CT CA CG
GAAT GCAT GGAT GTAT GNAT GAAT GACT GAGT GATT GANT CAAT CACT CAGT CATT CANT
CAGCT CAGGT CAGTT CAGNT CCGCT CCGGT CCGTT CCGNT CGGCT CGGGT CGGTT CGGNT CTGCT CTGGT CTGTT CTGNT CNGCT CNGGT CNGTT CNGNT
CAGT CCGT CGGT CTGT CNGT CAGA CCGA CGGA CTGA CNGA CGAA CGCA CGGA CGTA CGNA
CAAGAT CACGAT CAGGAT CATGAT CANGAT CCAGAT CCCGAT CCGGAT CCTGAT CCNGAT CGAGAT CGCGAT CGGGAT CGTGAT CGNGAT CTAGAT CTCGAT CTGGAT CTTGAT CTNGAT CNAGAT CNCGAT CNGGAT CNTGAT CNNGAT
CAGAAT CAGCAT CAGGAT CAGTAT CAGNAT CCGAAT CCGCAT CCGGAT CCGTAT CCGNAT CGGAAT CGGCAT CGGGAT CGGTAT CGGNAT CTGAAT CTGCAT CTGGAT CTGTAT CTGNAT CNGAAT CNGCAT CNGGAT CNGTAT CNGNAT
CAGAAT CAGACT CAGAGT CAGATT CAGANT CCGAAT CCGACT CCGAGT CCGATT CCGANT CGGAAT CGGACT CGGAGT CGGATT CGGANT CTGAAT CTGACT CTGAGT CTGATT CTGANT CNGAAT CNGACT CNGAGT CNGATT CNGANT
CGAAAT CGACAT CGAGAT CGATAT CGANAT CGCAAT CGCCAT CGCGAT CGCTAT CGCNAT CGGAAT CGGCAT CGGGAT CGGTAT CGGNAT CGTAAT CGTCAT CGTGAT CGTTAT CGTNAT CGNAAT CGNCAT CGNGAT CGNTAT CGNNAT
CGAAAT CGAACT CGAAGT CGAATT CGAANT CGCAAT CGCACT CGCAGT CGCATT CGCANT CGGAAT CGGACT CGGAGT CGGATT CGGANT CGTAAT CGTACT CGTAGT CGTATT CGTANT CGNAAT CGNACT CGNAGT CGNATT CGNANT
CGAAAT CGAACT CGAAGT CGAATT CGAANT CGACAT CGACCT CGACGT CGACTT CGACNT CGAGAT CGAGCT CGAGGT CGAGTT CGAGNT CGATAT CGATCT CGATGT CGATTT CGATNT CGANAT CGANCT CGANGT CGANTT CGANNT
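The edit distance output above lists some strings more than once (CGGAT, for instance, occurs twice among the first length-5 strings), because different edit operations can produce the same result. If each distinct string should be handled only once, the enumerated strings can be filtered through a set. The following is a minimal sketch in plain C++ over std::string values; distinctInOrder is a hypothetical helper, and converting SeqAn strings to std::string beforehand is left out as an assumption of the sketch.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (plain C++, not SeqAn): drop repeated strings while
// keeping the order in which each string was first seen.
std::vector<std::string> distinctInOrder(const std::vector<std::string>& enumerated)
{
    std::unordered_set<std::string> seen;
    std::vector<std::string> result;
    for (const std::string& s : enumerated)
        if (seen.insert(s).second)  // insert() reports whether s was new.
            result.push_back(s);
    return result;
}
```

Applied to the six strings CAGAT, CCGAT, CGGAT, CGAAT, CGCAT, CGGAT, this keeps the first five and drops the repeated CGGAT.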