Class
MarkovModel
Gives a suitable representation of a Markov chain.
Defined in | <seqan/statistics.h> |
---|---|
Signature | template <typename TAlphabet[, typename TFloat[, typename TSpec]]> class MarkovModel; |
Template Parameters
TAlphabet | The type of the underlying alphabet. |
---|---|
TFloat | The type for storing counts; defaults to double. |
TSpec | Tag for specialization. |
Member Function Overview
- void MarkovModel::build(stringSet);
  Compute the transition matrix from a training set.
- TFloat MarkovModel::emittedProbability(s);, TFloat MarkovModel::emittedProbability(ss);
  Compute the probability that a string or a set of strings is emitted by the MarkovModel.
- MarkovModel::MarkovModel(order);
  Constructor.
- void MarkovModel::read(file);
  Load an instance of MarkovModel from a file.
- void MarkovModel::set(transition[, stationaryDistribution]);
  Set the transition matrix.
- void MarkovModel::write(file);
  Store an instance of MarkovModel in a file.
Member Variable Overview
- unsigned MarkovModel::order
  The order of the MarkovModel.
- TVector MarkovModel::stationaryDistribution
  The vector of character distributions (String of TFloat).
- TMatrix MarkovModel::transition
  The transition matrix.
Detailed Description
Examples
Build a MarkovModel from Background
#include <iostream>
#include <fstream>
#include <seqan/index.h>
#include <seqan/statistics.h>
#include <seqan/seq_io.h>
using namespace seqan;
int main()
{
    // Build path to background FASTA file.
    CharString bgPath = SEQAN_PATH_TO_ROOT();
    append(bgPath, "/demos/statistics/background.fa");
    // Read the background from a file into X.
    StringSet<DnaString> X;
    SeqFileIn seqFile;
    if (!open(seqFile, toCString(bgPath)))
    {
        std::cerr << "ERROR: Could not open " << bgPath << "\n";
        return 1;
    }
    StringSet<CharString> ids; // will be ignored
    readRecords(ids, X, seqFile);
    // Create MarkovModel of order 3 from the background.
    MarkovModel<Dna> mm(3);
    buildMarkovModel(mm, X);
    // Build set of words that we want to compute the zscore of.
    StringSet<DnaString> W;
    appendValue(W, "CCCAAAGC");
    appendValue(W, "CCCAAAGTAAATT");
    // Compute and print zscore.
    std::cout << "zscore: " << zscore(W, X, mm, AhoCorasick()) << "\n";
    // // TODO: this path has to be set explicitly when calling the demo
    // FILE *fd = fopen("projects/library/demos/zscore_human_mm.3","r");
    // read(fd, mm);
    // fclose(fd);
    //std::cout << zscore(W, X, mm, WuManber()) << std::endl;
    return 0;
}
The example above builds a MarkovModel over the Dna alphabet from a set of background sequences. After building the model, it computes the z-score of a set of words. The program prints:
zscore: 11.8323
Load a MarkovModel from File
We can also load the MarkovModel from a file (previously saved using write). Since we do not have the background word set here but only the model, we compute the variance of a word using the function calculateVariance from the alignment_free module.
#include <iostream>
#include <fstream>
#include <seqan/index.h>
#include <seqan/alignment_free.h>
#include <seqan/statistics.h>
#include <seqan/seq_io.h>
using namespace seqan;
int main()
{
    // Build path to serialized MarkovModel.
    CharString mmPath = SEQAN_PATH_TO_ROOT();
    append(mmPath, "/demos/statistics/zscore_example_mm.3");
    // Open the file.
    FILE * mmFile = fopen(toCString(mmPath), "rb");
    if (!mmFile)
    {
        std::cerr << "ERROR: Could not open " << mmPath << "\n";
        return 1;
    }
    // Create MarkovModel of order 3 and load it from the file.
    MarkovModel<Dna> mm(3);
    read(mmFile, mm);
    fclose(mmFile); // close file again
    // The word whose variance we want to compute.
    DnaString word = "CCCAAAGC";
    // Compute variance.
    double variance = 0;
    int n = 10000; // assumed text length
    calculateVariance(variance, word, mm, n);
    std::cout << "variance: " << variance << "\n";
    return 0;
}
variance: 0.267919
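The variance printed above is one ingredient of a z-score. As a minimal sketch, assuming the usual definition z = (observed - expected) / sqrt(variance), the helper below combines the three quantities. The name zScoreFromMoments and its parameters are hypothetical and not part of SeqAn; the library's zscore function may differ in detail.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper (not SeqAn API): standardize an observed word count
// given its expected value and variance under the model.
double zScoreFromMoments(double observed, double expected, double variance)
{
    // A z-score is undefined without a positive variance.
    if (!(variance > 0.0))
        return std::numeric_limits<double>::quiet_NaN();
    return (observed - expected) / std::sqrt(variance);
}
```

The observed and expected counts have to come from elsewhere, for example from counting the word in a text and from the model's emission probability times the number of positions; both are assumptions of this sketch.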
Member Functions Detail
void MarkovModel::build(stringSet);
Compute the transition matrix from a training set.
Parameters
stringSet | The StringSet to build the model from. |
---|---|
The stationary character distribution and the auxiliary information that make up a MarkovModel instance are also computed.
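To illustrate what computing a transition matrix from a training set involves, here is a minimal standalone sketch using only the standard library. It counts how often each DNA character follows each context of length order and normalizes the counts per row. The function estimateTransitions, the row/column layout, and the handling of unseen contexts are assumptions made for illustration; they do not describe SeqAn's internal representation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Map a DNA character to 0..3; any other character maps to 4 (invalid).
inline std::size_t dnaIndex(char c)
{
    switch (c)
    {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return 4;
    }
}

// Hypothetical helper (not SeqAn API): estimate order-k transition
// probabilities. Row = preceding k-mer read as a base-4 number,
// column = index of the next character.
std::vector<std::vector<double>> estimateTransitions(
    std::vector<std::string> const & training, unsigned order)
{
    std::size_t numContexts = 1;
    for (unsigned i = 0; i < order; ++i)
        numContexts *= 4;

    std::vector<std::vector<double>> matrix(numContexts, std::vector<double>(4, 0.0));

    // Count every (context, next character) pair in the training set.
    for (std::string const & seq : training)
    {
        for (std::size_t pos = 0; pos + order < seq.size(); ++pos)
        {
            std::size_t context = 0;
            bool valid = true;
            for (unsigned j = 0; j < order && valid; ++j)
            {
                std::size_t idx = dnaIndex(seq[pos + j]);
                valid = idx < 4;
                if (valid)
                    context = context * 4 + idx;
            }
            std::size_t next = dnaIndex(seq[pos + order]);
            if (valid && next < 4)
                matrix[context][next] += 1.0;
        }
    }

    // Normalize each row to sum to 1; contexts never seen stay all-zero.
    for (std::vector<double> & row : matrix)
    {
        double total = 0.0;
        for (double count : row)
            total += count;
        if (total > 0.0)
            for (double & count : row)
                count /= total;
    }
    return matrix;
}
```

For the training set {"ACGT", "ACGA"} and order 1, the row for G assigns probability 0.5 each to A and T, because G is followed once by each. A production implementation would likely smooth rows with zero counts; this sketch leaves them empty.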
Data Races
Thread safety unknown!
TFloat MarkovModel::emittedProbability(s);
TFloat MarkovModel::emittedProbability(ss);
Computes the probability that a string or a set of strings is emitted by the MarkovModel.
Parameters
s | The String to compute the emission probability for. |
---|---|
ss | The StringSet to compute the emission probability for. |
Returns
TFloat | The emission probability; TFloat is the TFloat parameter of the MarkovModel. |
---|---|
Data Races
Thread safety unknown!
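The idea behind an emission probability can be sketched for the simplest case, a first-order chain over DNA: multiply the probability of the first character by the transition probability of each following step. The code below is a standalone illustration under that assumption. The names emissionProbability and emissionProbabilityOfSet, the 4x4 matrix layout, and the treatment of a string set as independent emissions are hypothetical, and SeqAn's handling of higher orders may differ.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

using Vector4 = std::array<double, 4>;
using Matrix4 = std::array<Vector4, 4>;

// Map a DNA character to 0..3; any other character maps to 4 (invalid).
inline std::size_t dnaIndex(char c)
{
    switch (c)
    {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return 4;
    }
}

// Hypothetical helper (not SeqAn API): probability that a first-order
// chain with start distribution `start` emits the string `s`.
double emissionProbability(std::string const & s,
                           Vector4 const & start,
                           Matrix4 const & transition)
{
    if (s.empty())
        return 1.0;  // nothing to emit

    std::size_t prev = dnaIndex(s[0]);
    if (prev > 3)
        return 0.0;  // character outside the alphabet
    double p = start[prev];

    for (std::size_t i = 1; i < s.size(); ++i)
    {
        std::size_t cur = dnaIndex(s[i]);
        if (cur > 3)
            return 0.0;
        p *= transition[prev][cur];
        prev = cur;
    }
    return p;
}

// Set variant: assumes the strings are emitted independently,
// so their probabilities multiply.
double emissionProbabilityOfSet(std::vector<std::string> const & strings,
                                Vector4 const & start,
                                Matrix4 const & transition)
{
    double p = 1.0;
    for (std::string const & s : strings)
        p *= emissionProbability(s, start, transition);
    return p;
}
```

With a uniform start distribution and uniform transitions (every entry 0.25), a string of length 4 has probability 0.25^4 = 0.00390625. For long inputs the product underflows quickly, so summing logarithms is the usual alternative.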
MarkovModel::MarkovModel(order);
Constructor
Parameters
order | The order of the model (unsigned). |
---|---|
Data Races
Thread safety unknown!
void MarkovModel::read(file);
Load an instance of MarkovModel from a file.
Parameters
file | The file to read the model from (type FILE *). |
---|---|
Data Races
Thread safety unknown!
void MarkovModel::set(transition[, stationaryDistribution]);
Set the transition matrix.
Parameters
transition | The transition matrix. |
---|---|
stationaryDistribution | The vector of character distributions. |
Given a transition matrix, sets it as the transition matrix of the MarkovModel and, if the vector of character distributions is not supplied, computes it together with the auxiliary information.
Data Races
Thread safety unknown!
void MarkovModel::write(file);
Stores an instance of MarkovModel in a file.
Parameters
file | The file to write the model to (type FILE *). |
---|---|
Data Races
Thread safety unknown!
Member Variables Detail
unsigned MarkovModel::order
The order of the MarkovModel.
TVector MarkovModel::stationaryDistribution
The vector of character distributions (String of TFloat).
TMatrix MarkovModel::transition
The transition matrix.