Scoring

Scoring schemes used for alignments and approximate search.


Scoring schemes define the score for aligning two characters of a given alphabet and the score for gaps within alignments. Given an alignment between two sequences and a scoring scheme, the score of the alignment can be computed as the sum of the scores for aligned character pairs plus the sum of the scores for all gaps.

An example of a scoring scheme is the Levenshtein distance, in which each mismatch between two aligned characters costs 1 and each character that is aligned with a gap costs 1. Translated into scores instead of costs, matches get a score of 0, mismatches a score of -1, and gaps a score of -1 per character. This scoring scheme is the default for Simple Score.
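The computation described above can be sketched in a few lines of plain C++. This is an illustration only and does not use SeqAn: the function name alignmentScore, the representation of an alignment as two equally long strings, and the use of '-' as the gap character are all assumptions made for this example. The default arguments correspond to the Levenshtein-style scores described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of SeqAn: scores an alignment given as two
// equally long rows in which '-' marks a gap position.
int alignmentScore(const std::string& rowA, const std::string& rowB,
                   int match = 0, int mismatch = -1, int gap = -1)
{
    assert(rowA.size() == rowB.size());
    int total = 0;
    for (std::size_t i = 0; i < rowA.size(); ++i)
    {
        if (rowA[i] == '-' || rowB[i] == '-')
            total += gap;        // a character aligned with a gap
        else if (rowA[i] == rowB[i])
            total += match;      // identical characters
        else
            total += mismatch;   // different characters
    }
    return total;
}
```

For the rows "kitten-" and "sitting" the function adds two mismatches and one gap character, so the score is -3, which is the negated Levenshtein distance of the two words.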

1 Character Pair Scores

SeqAn offers two kinds of scoring scheme:

Simple Score	This scoring scheme differentiates only between "match" (the two aligned characters are the same) and "mismatch" (the two aligned characters are different). The functions scoreMatch and scoreMismatch access these two values.
Scoring matrices	These scoring schemes store a score value for each pair of characters. This value can be accessed using score. Examples of this kind of scoring scheme are Pam and Blosum62. The class Score Matrix can be used to store arbitrary scoring matrices.
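The difference between the two kinds can be sketched with two small stand-in types. Neither type is a SeqAn class: SimpleScheme, MatrixScheme, rank, and exampleDnaMatrix are hypothetical names, and the matrix values are made up for illustration and are not taken from any published matrix.

```cpp
#include <cstddef>

// Hypothetical stand-in for a simple scheme: only two distinct values.
struct SimpleScheme
{
    int match;
    int mismatch;

    int score(char a, char b) const { return a == b ? match : mismatch; }
};

// Hypothetical stand-in for a matrix scheme over the alphabet A, C, G, T:
// one stored value per ordered pair of characters.
struct MatrixScheme
{
    int table[4][4];

    // Maps a character to its row/column index; anything else is treated as T.
    static std::size_t rank(char c)
    {
        switch (c)
        {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            default:  return 3;
        }
    }

    int score(char a, char b) const { return table[rank(a)][rank(b)]; }
};

// Illustrative values only: A<->G and C<->T substitutions are penalized
// less than the other substitutions.
inline MatrixScheme exampleDnaMatrix()
{
    return MatrixScheme{{
        //  A   C   G   T
        {  1, -2, -1, -2 },  // A
        { -2,  1, -2, -1 },  // C
        { -1, -2,  1, -2 },  // G
        { -2, -1, -2,  1 },  // T
    }};
}
```

A simple scheme cannot tell different kinds of mismatch apart, whereas the matrix in this sketch scores the pair A/G differently from the pair A/C.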

2 Gap Scores

SeqAn scoring schemes support affine gap costs: the score for a gap of length n is gap_open + (n-1)*gap_extend, where gap_open and gap_extend are typically negative values that represent the costs of "opening a gap" and "extending the gap". If gap_open == gap_extend (which is the default), the scoring scheme uses linear gap costs.
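The gap formula can be written out as a small function. This is a sketch rather than SeqAn code: the name gapScore and its parameter names are assumptions, and returning 0 for a non-positive length is a choice made here for completeness.

```cpp
// Hypothetical helper, not part of SeqAn: score of a single gap of length n
// under affine gap costs, gap_open + (n-1)*gap_extend.
int gapScore(int n, int gapOpen, int gapExtend)
{
    if (n <= 0)
        return 0;  // no gap, no score contribution (assumption of this sketch)
    return gapOpen + (n - 1) * gapExtend;
}
```

With gap_open = -5 and gap_extend = -1, a gap of length 3 scores -5 + 2*(-1) = -7. With gap_open = gap_extend = -1 the same gap scores -3, that is, -1 per gap character, which is the linear case.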

See: Alignments, Searching

SeqAn - Sequence Analysis Library - www.seqan.de