Class
MarkovModel
Gives a suitable representation of a Markov chain.
Defined in | <seqan/statistics.h> |
---|---|
Signature | template <typename TAlphabet[, typename TFloat[, typename TSpec]]> class MarkovModel; |
Template Parameters
TAlphabet | The type of the underlying alphabet. |
---|---|
TFloat | The type for storing counts, default is double. |
TSpec | Tag for specialization. |
Member Function Overview
- void MarkovModel::build(stringSet);
  Compute the transition matrix from a training set.
- TFloat MarkovModel::emittedProbability(s); TFloat MarkovModel::emittedProbability(ss);
  Compute the probability that a string or a set of strings is emitted by the MarkovModel.
- MarkovModel::MarkovModel(order);
  Constructor.
- void MarkovModel::read(file);
  Load an instance of MarkovModel from a file.
- void MarkovModel::set(transition[, stationaryDistribution]);
  Set the transition matrix.
- void MarkovModel::write(file);
  Store an instance of MarkovModel in a file.
Member Variable Overview
- unsigned MarkovModel::order
  The order of the MarkovModel.
- TVector MarkovModel::stationaryDistribution
  The vector of character distributions (String of TFloat).
- TMatrix MarkovModel::transition
  The transition matrix.
Detailed Description
Examples
Build a MarkovModel from Background
The following example shows how to build a MarkovModel over a Dna alphabet from a set of background sequences. After building the model, we compute the z-score.

    #include <iostream>
    #include <fstream>

    #include <seqan/index.h>
    #include <seqan/statistics.h>
    #include <seqan/seq_io.h>

    using namespace std;
    using namespace seqan;

    int main()
    {
        // Build path to background FASTA file.
        CharString bgPath = SEQAN_PATH_TO_ROOT();
        append(bgPath, "/extras/demos/statistics/background.fa");

        // Read the background from a file into X.
        StringSet<DnaString> X;
        SequenceStream seqStream(toCString(bgPath));
        if (!isGood(seqStream))
        {
            std::cerr << "ERROR: Could not open " << bgPath << "\n";
            return 1;
        }
        StringSet<CharString> ids;  // will be ignored
        if (readAll(ids, X, seqStream) != 0)
        {
            std::cerr << "ERROR: Problem reading from " << bgPath << "\n";
            return 1;
        }

        // Create MarkovModel of order 3 from the background.
        MarkovModel<Dna> mm(3);
        buildMarkovModel(mm, X);

        // Build set of words that we want to compute the zscore of.
        StringSet<DnaString> W;
        appendValue(W, "CCCAAAGC");
        appendValue(W, "CCCAAAGTAAATT");

        // Compute and print zscore.
        std::cout << "zscore: " << zscore(W, X, mm, AhoCorasick()) << "\n";

        // TODO: This path has to be set explicitly when calling the demo.
        // FILE *fd = fopen("projects/library/demos/zscore_human_mm.3","r");
        // read(fd, mm);
        // fclose(fd);
        // std::cout << zscore(W, X, mm, WuManber()) << std::endl;

        return 0;
    }

zscore: 11.8323
Load a MarkovModel from File
We can also load the MarkovModel from a file (previously saved using write). Since we do not have the background word set here but only the model, we compute the variance of a word using the function calculateVariance from the alignment_free module.

    #include <iostream>
    #include <fstream>

    #include <seqan/index.h>
    #include <seqan/alignment_free.h>
    #include <seqan/statistics.h>
    #include <seqan/seq_io.h>

    using namespace std;
    using namespace seqan;

    int main()
    {
        // Build path to serialized MarkovModel.
        CharString mmPath = SEQAN_PATH_TO_ROOT();
        append(mmPath, "/extras/demos/statistics/zscore_example_mm.3");

        // Open the file.
        FILE * mmFile = fopen(toCString(mmPath), "rb");
        if (!mmFile)
        {
            std::cerr << "ERROR: Could not open " << mmPath << "\n";
            return 1;
        }

        // Create MarkovModel of order 3 and load it from the file.
        MarkovModel<Dna> mm(3);
        read(mmFile, mm);
        fclose(mmFile);  // close file again

        // The word that we want to compute the variance of.
        DnaString word = "CCCAAAGC";

        // Compute variance.
        double variance = 0;
        int n = 10000;  // assumed text length
        calculateVariance(variance, word, mm, n);

        std::cout << "variance: " << variance << "\n";

        return 0;
    }

variance: 0.267919
Member Functions Detail
void MarkovModel::build(stringSet);
Compute the transition matrix from a training set.
Parameters
stringSet | The StringSet to build the model for. |
---|---|
The stationary character distribution and the auxiliary information that give rise to an instance of MarkovModel are also computed.
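To make the idea of computing a transition matrix from a training set concrete, here is a minimal, library-independent sketch in plain C++. It is not SeqAn's implementation: the names `dnaRank` and `estimateTransitions` are hypothetical, the alphabet is assumed to be the four characters A, C, G, T, and the matrix is stored as one row per context with one column per next symbol, which may differ from the layout of `MarkovModel::transition`.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Map A, C, G, T to 0..3; returns -1 for any other character.
inline int dnaRank(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return -1;
    }
}

// Estimate P(next symbol | previous `order` symbols) by counting occurrences.
// Row index: the context read as a base-4 number; column index: the next symbol.
// Contexts that never occur in the training set keep an all-zero row.
std::vector<std::array<double, 4>>
estimateTransitions(std::vector<std::string> const & training, unsigned order)
{
    std::size_t numContexts = 1;
    for (unsigned i = 0; i < order; ++i)
        numContexts *= 4;

    std::vector<std::array<double, 4>> counts(numContexts, std::array<double, 4>{});

    for (std::string const & seq : training)
    {
        for (std::size_t pos = order; pos < seq.size(); ++pos)
        {
            int next = dnaRank(seq[pos]);
            bool valid = next >= 0;
            std::size_t context = 0;
            for (std::size_t j = pos - order; valid && j < pos; ++j)
            {
                int r = dnaRank(seq[j]);
                if (r < 0)
                    valid = false;  // skip windows containing unknown characters
                else
                    context = context * 4 + static_cast<std::size_t>(r);
            }
            if (valid)
                counts[context][static_cast<std::size_t>(next)] += 1.0;
        }
    }

    // Turn the counts of each context into relative frequencies.
    for (auto & row : counts)
    {
        double total = row[0] + row[1] + row[2] + row[3];
        if (total > 0.0)
            for (double & value : row)
                value /= total;
    }
    return counts;
}
```

For an order of 3 over DNA, this sketch yields 64 context rows. Windows that contain characters outside A, C, G, T are skipped here, which is a design choice of the sketch and not a statement about how SeqAn treats such input.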
TFloat MarkovModel::emittedProbability(s);
TFloat MarkovModel::emittedProbability(ss);
Computes the probability that a string or a set of strings is emitted by the MarkovModel.
Parameters
s | The String to compute the emission probability for. |
---|---|
ss | The StringSet to compute the emission probability for. |
Returns
TFloat | The emission probability; TFloat is the TFloat template parameter of the MarkovModel. |
---|---|
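As an illustration of what an emission probability means for a single string, the following plain C++ sketch evaluates it for an order-1 chain over A, C, G, T: the stationary probability of the first character multiplied by the transition probabilities of each subsequent step. The names `Vec4`, `Mat4`, `dnaRank`, and `emissionProbability` are hypothetical, and the restriction to order 1 is an assumption made to keep the sketch short; it does not describe SeqAn's internals.

```cpp
#include <array>
#include <cstddef>
#include <string>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<Vec4, 4>;

// Map A, C, G, T to 0..3; returns -1 for any other character.
inline int dnaRank(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return -1;
    }
}

// Probability that an order-1 chain emits `s`:
//   P(s) = stationary[s[0]] * product over i >= 1 of transition[s[i-1]][s[i]]
// Returns 1.0 for the empty string and 0.0 if `s` contains a non-ACGT character.
double emissionProbability(std::string const & s,
                           Vec4 const & stationary,
                           Mat4 const & transition)
{
    if (s.empty())
        return 1.0;

    int prev = dnaRank(s[0]);
    if (prev < 0)
        return 0.0;

    double p = stationary[static_cast<std::size_t>(prev)];
    for (std::size_t i = 1; i < s.size(); ++i)
    {
        int cur = dnaRank(s[i]);
        if (cur < 0)
            return 0.0;
        p *= transition[static_cast<std::size_t>(prev)][static_cast<std::size_t>(cur)];
        prev = cur;
    }
    return p;
}
```

For long strings the product underflows quickly, so a practical variant would typically sum logarithms instead of multiplying probabilities.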
MarkovModel::MarkovModel(order);
Constructor
Parameters
order | The order of the model (unsigned). |
---|---|
void MarkovModel::read(file);
Load an instance of MarkovModel from a file.
Parameters
file | The file to read the model from (type FILE *). |
---|---|
void MarkovModel::set(transition[, stationaryDistribution]);
Set transition matrix.
Parameters
transition | The transition matrix. |
---|---|
stationaryDistribution | The vector of character distributions. |
Given a transition matrix, sets it as the transition matrix of the MarkovModel and computes the vector of character distributions (if it is not provided) and the auxiliary information.
void MarkovModel::write(file);
Store an instance of MarkovModel in a file.
Parameters
file | The file to write the model to (type FILE *). |
---|---|
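Both write and read take a plain `FILE *`. The on-disk format used by MarkovModel is not described in this page, so the following plain C++ sketch uses a hypothetical text format (a "rows cols" header followed by the values) purely to illustrate the write-then-read round trip through a `FILE *`. The names `Matrix`, `writeMatrix`, and `readMatrix` are assumptions and not part of SeqAn.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Write "rows cols" followed by one matrix row per line.
// Returns false on an I/O error or if the matrix is not rectangular.
bool writeMatrix(std::FILE * file, Matrix const & m)
{
    std::size_t rows = m.size();
    std::size_t cols = rows ? m[0].size() : 0;
    if (std::fprintf(file, "%zu %zu\n", rows, cols) < 0)
        return false;
    for (auto const & row : m)
    {
        if (row.size() != cols)
            return false;
        for (double value : row)
            if (std::fprintf(file, "%.17g ", value) < 0)  // 17 digits round-trip a double
                return false;
        if (std::fprintf(file, "\n") < 0)
            return false;
    }
    return true;
}

// Read a matrix previously stored by writeMatrix.
// Leaves `m` untouched and returns false if the input is malformed.
bool readMatrix(std::FILE * file, Matrix & m)
{
    std::size_t rows = 0, cols = 0;
    if (std::fscanf(file, "%zu %zu", &rows, &cols) != 2)
        return false;
    Matrix result(rows, std::vector<double>(cols));
    for (auto & row : result)
        for (double & value : row)
            if (std::fscanf(file, "%lf", &value) != 1)
                return false;
    m.swap(result);
    return true;
}
```

As in the example that loads a model from a file, the caller opens and closes the `FILE *`; the helpers only read from or write to it.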
Member Variables Detail
unsigned MarkovModel::order
The order of the MarkovModel.
TVector MarkovModel::stationaryDistribution
The vector of character distributions (String of TFloat).
TMatrix MarkovModel::transition
The transition matrix.