Class FaiIndex
Data structure for access to FAI indices.

Defined in <seqan/seq_io.h>
Signature class FaiIndex;

Detailed Description

FAI indices allow fast random access to sequences or parts of sequences in a FASTA file. They were originally introduced in the samtools program.

Also see the Indexed FASTA I/O Tutorial.
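
To illustrate why an FAI index makes random access cheap, here is a minimal standalone sketch of the offset arithmetic. It is not SeqAn's implementation. It assumes the five-column .fai layout written by samtools (name, length, offset, bases per line, bytes per line); the names FaiEntry and byteOffset are hypothetical and do not exist in SeqAn.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical mirror of one line of a .fai file (samtools layout).
struct FaiEntry
{
    std::string name;         // sequence name (first word of the FASTA header)
    std::uint64_t length;     // number of bases in the sequence
    std::uint64_t offset;     // byte offset of the first base in the FASTA file
    std::uint64_t lineBases;  // bases per line
    std::uint64_t lineWidth;  // bytes per line, including the line terminator
};

// Byte offset in the FASTA file of the 0-based position pos of the sequence.
inline std::uint64_t byteOffset(FaiEntry const & e, std::uint64_t pos)
{
    assert(e.lineBases > 0 && pos < e.length);
    // Skip the full lines before pos, then step into the current line.
    return e.offset + (pos / e.lineBases) * e.lineWidth + pos % e.lineBases;
}
```

With an offset of 5, 10 bases per line and 11 bytes per line, position 10 is the first base of the second line and maps to byte 16. A reader can therefore seek straight to the requested base instead of scanning the file.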

Example

The following example demonstrates the usage of the FaiIndex class.

#include <iostream>

#include <seqan/basic.h>
#include <seqan/seq_io.h>
#include <seqan/sequence.h>

using namespace seqan2;

int main()
{
    CharString path = getAbsolutePath("demos/dox/seq_io/example.fa");

    FaiIndex faiIndex;

    // Try to read the FAI index.
    bool readSuccess = open(faiIndex, toCString(path));
    if (!readSuccess)
        std::cerr << "Could not read the FAI index.  Not fatal, we can just build it.\n";

    // Try to build the FAI index (in memory) if reading was unsuccessful.  If
    // building into memory succeeded, we try to write it out.
    if (!readSuccess)
    {
        if (!build(faiIndex, toCString(path)))
        {
            std::cerr << "FATAL: Could not build FAI index.\n";
            return 1;
        }

        if (!save(faiIndex))
        {
            std::cerr << "FATAL: Could not write out FAI index after building.\n";
            return 1;
        }
    }

    // Now, read the first 100 characters of chr.
    unsigned idx = 0;
    if (!getIdByName(idx, faiIndex, "chr"))
    {
        std::cerr << "FATAL: chr not found in FAI index.\n";
        return 1;
    }
    CharString seq;
    readRegion(seq, faiIndex, idx, 0, 100);

    // Now print the first 100 characters we just read.
    std::cout << "chr:1-100 = " << seq << "\n";

    return 0;
}

The output is as follows:

chr:1-100 = CCTATCTAATAATATACCTTATACTGGACTAGTGCCAATATTAAAATGAAGTGGGCGTAGTGTGTAATTTGATTGGGTGGAGGTGTGGCTTTGGCGTGCT

Member Functions Detail

FaiIndex::FaiIndex();

Constructor.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

Interface Functions Detail

bool build(faiIndex, fastaFilename[, faiFileName]);

Create a FaiIndex from a FASTA file.

Parameters

faiIndex The FaiIndex to build into.
fastaFilename Path to the FASTA file to build an index for. Type: char const *.
faiFileName Path to the FAI file to use as the index file. Type: char const *. Default: "${fastaFilename}.fai".

Returns

bool true on success, false otherwise.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
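
As a rough illustration of what building an index involves, the following standalone sketch scans FASTA text once and records one entry per sequence. It is not the code behind build. It assumes '\n' line endings, a terminating newline after every line, and uniform line lengths within each sequence; FaiEntry and buildFaiEntries are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical mirror of one line of a .fai file (samtools layout).
struct FaiEntry
{
    std::string name;         // sequence name (header up to the first whitespace)
    std::uint64_t length;     // number of bases in the sequence
    std::uint64_t offset;     // byte offset of the first base in the FASTA data
    std::uint64_t lineBases;  // bases per line
    std::uint64_t lineWidth;  // bytes per line, including '\n'
};

// Scan FASTA text once and collect one entry per sequence.
inline std::vector<FaiEntry> buildFaiEntries(std::istream & in)
{
    std::vector<FaiEntry> entries;
    std::string line;
    std::uint64_t pos = 0;  // byte offset of the start of the current line
    while (std::getline(in, line))
    {
        std::uint64_t const lineBytes = line.size() + 1;  // +1 for '\n'
        if (!line.empty() && line[0] == '>')
        {
            // Header line: the sequence data starts right after it.
            std::size_t const ws = line.find_first_of(" \t");
            std::size_t const len = (ws == std::string::npos) ? std::string::npos : ws - 1;
            FaiEntry e{line.substr(1, len), 0, pos + lineBytes, 0, 0};
            entries.push_back(e);
        }
        else if (!entries.empty() && !line.empty())
        {
            // Sequence line: the first one fixes the line geometry.
            FaiEntry & e = entries.back();
            if (e.lineBases == 0)
            {
                e.lineBases = line.size();
                e.lineWidth = lineBytes;
            }
            e.length += line.size();
        }
        pos += lineBytes;
    }
    return entries;
}
```

The function takes any std::istream, so it can be tried on an std::istringstream holding a few FASTA records before pointing it at a file stream.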

void clear(faiIndex);

Reset a FaiIndex object to the state after default construction.

Parameters

faiIndex The FaiIndex to clear.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool empty(faiIndex);

Returns whether the FaiIndex is empty.

Parameters

faiIndex The FaiIndex to check.

Returns

bool true if the index is empty and needs to be built via build.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool getIdByName(rID, faiIndex, name);

Return reference ID (numeric index in the file) of a sequence in a FAI file.

Parameters

faiIndex The FaiIndex to query.
name The name of the sequence whose id to look up. Type: ContainerConcept.
rID The id of the sequence is written here.

Returns

bool true if a sequence with the given name is known in the index, false otherwise.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
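
The calling convention (result written through the first argument, success reported through the return value) can be sketched with a standalone name table. This is an illustration only and makes no claim about how FaiIndex stores names internally; NameTable and lookupId are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical name table: sequence names in file order plus a lookup map.
struct NameTable
{
    std::vector<std::string> names;
    std::unordered_map<std::string, unsigned> idByName;

    // Register a name; its id is its position in file order.
    void add(std::string const & name)
    {
        idByName.emplace(name, static_cast<unsigned>(names.size()));
        names.push_back(name);
    }
};

// Writes the id to rID and returns true if name is known.
// Leaves rID untouched and returns false otherwise.
inline bool lookupId(unsigned & rID, NameTable const & table, std::string const & name)
{
    auto it = table.idByName.find(name);
    if (it == table.idByName.end())
        return false;
    rID = it->second;
    return true;
}
```

Because the id is only meaningful when the function returns true, callers should check the return value before using rID, as the example at the top of this page does.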

uint64_t numSeqs(faiIndex);

Return the number of sequences known to a FaiIndex.

Parameters

faiIndex The FaiIndex to query.

Returns

uint64_t The number of sequences in the index.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool open(faiIndex, fastaFilename [, faiFileName]);

Open a FaiIndex object.

Parameters

faiIndex The FaiIndex to read into.
fastaFilename Path to the FASTA file that the index belongs to. Type: char const *.
faiFileName The name of the FAI file to open. This parameter is optional. By default, the FAI file name is derived from the FASTA file name. Type: char const *.

Returns

bool true on success, false otherwise.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

void readRecord(faiIndex, fastaFileName[, faiFileName]);

Read a FAI index from file.

Parameters

faiIndex The FaiIndex to read into.
fastaFileName Path to the FASTA file to read. Type: char const *.
faiFileName Path to the FAI file to read. Type: char const *. Defaults to "${fastaFileName}.fai".

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

void readRegion(str, faiIndex, rID, beginPos, endPos);
void readRegion(str, faiIndex, region);

Read a region through a FaiIndex.

Parameters

str The String to read the sequence into.
faiIndex The FaiIndex to read from.
rID The id of the sequence to read (Type: unsigned).
beginPos The 0-based begin position of the region to read (Type: unsigned).
endPos The end position of the region to read, exclusive (Type: unsigned).
region The GenomicRegion to read.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.
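
The half-open interval convention seen in the example at the top of this page (begin 0, end 100, 100 characters returned) can be sketched without SeqAn as follows. This is a standalone illustration, not the library's implementation. It assumes the samtools .fai layout and a seekable stream; FaiEntry and readRegionSketch are hypothetical names. Clamping out-of-range positions to the sequence length is a choice made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical mirror of one line of a .fai file (samtools layout).
struct FaiEntry
{
    std::string name;         // sequence name
    std::uint64_t length;     // number of bases in the sequence
    std::uint64_t offset;     // byte offset of the first base in the FASTA data
    std::uint64_t lineBases;  // bases per line
    std::uint64_t lineWidth;  // bytes per line, including the line terminator
};

// Read the half-open region [beginPos, endPos) of the sequence described by e.
inline std::string readRegionSketch(std::istream & fasta, FaiEntry const & e,
                                    std::uint64_t beginPos, std::uint64_t endPos)
{
    assert(e.lineBases > 0);
    endPos = std::min(endPos, e.length);    // clamp to the end of the sequence
    beginPos = std::min(beginPos, endPos);  // an inverted range becomes empty

    std::string out;
    if (beginPos == endPos)
        return out;
    out.reserve(endPos - beginPos);

    // Jump directly to the first requested base.
    std::uint64_t const start =
        e.offset + (beginPos / e.lineBases) * e.lineWidth + beginPos % e.lineBases;
    fasta.clear();
    fasta.seekg(static_cast<std::streamoff>(start));

    // Copy bases and skip line terminators until the region is complete.
    char c;
    while (out.size() < endPos - beginPos && fasta.get(c))
        if (c != '\n' && c != '\r')
            out.push_back(c);
    return out;
}
```

Only the bytes between the first and last requested base are read, so the cost depends on the region size rather than on the size of the FASTA file.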

void readSequence(str, faiIndex, rID);

Load a whole sequence from a FaiIndex.

Parameters

str The String to read into.
faiIndex The FaiIndex to read from.
rID The index of the sequence in the file.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

bool save(faiIndex[, faiFileName]);

Save a FaiIndex object.

Parameters

faiIndex The FaiIndex to write out.
faiFileName The name of the FAI file to write to. This parameter is optional only if the FAI index knows the FAI file name from a previous build call. By default, the FAI file name from the previous call to build is used. Type: char const *.

Returns

bool true on success, false otherwise.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

uint64_t sequenceLength(faiIndex, rID);

Return length of the sequence with the given id in the FaiIndex.

Parameters

faiIndex The FaiIndex to query.
rID The id of the sequence to get the length of.

Returns

uint64_t The length of the sequence with index rID in faiIndex.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.

CharString sequenceName(faiIndex, rID);

Return the name of the sequence with the given id in the FaiIndex.

Parameters

faiIndex The FaiIndex to query.
rID The index of the sequence.

Returns

CharString The name of the sequence with the given id.

Data Races

If not stated otherwise, concurrent invocation is not guaranteed to be thread-safe.