Class
FaiIndex
Data structure for access to FAI indices.
| Defined in | <seqan/seq_io.h> |
|---|---|
| Signature | class FaiIndex; |

Member Function Overview
- FaiIndex::FaiIndex();
  Constructor.
Interface Function Overview
- bool build(faiIndex, fastaFilename[, faiFileName]);
  Create a FaiIndex from a FASTA file.
- void clear(faiIndex);
  Reset a FaiIndex object to the state after default construction.
- bool empty(faiIndex);
  Returns whether the FaiIndex is empty.
- bool getIdByName(rID, faiIndex, name);
  Return reference ID (numeric index in the file) of a sequence in a FAI file.
- uint64_t numSeqs(faiIndex);
  Return the number of sequences known to a FaiIndex.
- bool open(faiIndex, fastaFilename[, faiFileName]);
  Open a FaiIndex object.
- void readRecord(faiIndex, fastaFileName[, faiFileName]);
  Read a FAI index from file.
- void readRegion(str, faiIndex, rID, beginPos, endPos);
  void readRegion(str, faiIndex, region);
  Read a region through a FaiIndex.
- void readSequence(str, faiIndex, rID);
  Load a whole sequence from a FaiIndex.
- bool save(faiIndex[, faiFileName]);
  Save a FaiIndex object.
- uint64_t sequenceLength(faiIndex, rID);
  Return length of the sequence with the given id in the FaiIndex.
- CharString sequenceName(faiIndex, rID);
  Return the name of the sequence with the given id in the FaiIndex.
Detailed Description
FAI indices allow fast random access to sequences or parts of sequences in a FASTA file. They were originally introduced in the samtools program.
Also see the Indexed FASTA I/O Tutorial.
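To illustrate why an index makes random access cheap, here is a minimal sketch in plain standard C++ that does not use SeqAn. It assumes the five-column .fai layout written by samtools faidx (name, length, offset, bases per line, bytes per line); the names FaiRecord, parseFaiLine and byteOffset are hypothetical and are not part of the FaiIndex interface.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// One line of a .fai file (hypothetical helper type, not a SeqAn class).
struct FaiRecord
{
    std::string name;         // sequence name
    std::uint64_t length;     // number of bases in the sequence
    std::uint64_t offset;     // byte offset of the first base in the FASTA file
    std::uint64_t lineBases;  // bases per sequence line
    std::uint64_t lineWidth;  // bytes per sequence line, including the line terminator
};

// Parse one whitespace-separated .fai line; returns false if a column is missing or malformed.
bool parseFaiLine(FaiRecord & rec, std::string const & line)
{
    std::istringstream in(line);
    return static_cast<bool>(in >> rec.name >> rec.length >> rec.offset
                                >> rec.lineBases >> rec.lineWidth);
}

// Byte offset in the FASTA file of the base at 0-based position pos.
std::uint64_t byteOffset(FaiRecord const & rec, std::uint64_t pos)
{
    // Skip the full lines before pos, then step into the line that contains it.
    return rec.offset + (pos / rec.lineBases) * rec.lineWidth + pos % rec.lineBases;
}
```

With such an offset, a reader can seek directly to the requested base instead of scanning the file from the start.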
Example
The following example demonstrates the usage of the FaiIndex class.
#include <iostream>

#include <seqan/basic.h>
#include <seqan/seq_io.h>
#include <seqan/sequence.h>

using namespace seqan;

int main()
{
    CharString path = getAbsolutePath("demos/dox/seq_io/example.fa");
    FaiIndex faiIndex;

    // Try to read the FAI index.
    if (!open(faiIndex, toCString(path)))
    {
        std::cerr << "Could not read the FAI index. Not fatal, we can just build it.\n";

        // Reading was unsuccessful, so build the FAI index (in memory). If
        // building into memory succeeded, we try to write it out.
        if (!build(faiIndex, toCString(path)))
        {
            std::cerr << "FATAL: Could not build FAI index.\n";
            return 1;
        }
        if (!save(faiIndex))
        {
            std::cerr << "FATAL: Could not write out FAI index after building.\n";
            return 1;
        }
    }

    // Now, read the first 100 characters of chr.
    unsigned idx = 0;
    if (!getIdByName(idx, faiIndex, "chr"))
    {
        std::cerr << "FATAL: chr not found in FAI index.\n";
        return 1;
    }
    CharString seq;
    readRegion(seq, faiIndex, idx, 0, 100);

    // Now print the first 100 characters we just read.
    std::cout << "chr:1-100 = " << seq << "\n";
    return 0;
}
The output is as follows:
chr:1-100 = CCTATCTAATAATATACCTTATACTGGACTAGTGCCAATATTAAAATGAAGTGGGCGTAGTGTGTAATTTGATTGGGTGGAGGTGTGGCTTTGGCGTGCT
Member Functions Detail
FaiIndex::FaiIndex();
Constructor.
Data Races
Thread safety unknown!
Interface Functions Detail
bool build(faiIndex, fastaFilename[, faiFileName]);
Create a FaiIndex from a FASTA file.
Parameters

| faiIndex | The FaiIndex to build into. |
|---|---|
| fastaFilename | Path to the FASTA file to build an index for. Type: char const *. |
| faiFileName | Path to the FAI file to use as the index file. Type: char const *. Default: "${fastaFilename}.fai". |

Returns

| bool | true on success, false otherwise. |
|---|---|

Data Races
Thread safety unknown!
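As a rough illustration of what building such an index involves, the following sketch scans FASTA data once and records one entry per sequence. It is plain standard C++ and not the SeqAn implementation; FaiEntry and buildEntries are hypothetical names, and the sketch assumes that every line ends with '\n' and that all lines of a sequence except the last have the same length.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory index entry (not a SeqAn type).
struct FaiEntry
{
    std::string name;
    std::uint64_t length = 0;     // total number of bases
    std::uint64_t offset = 0;     // byte offset of the first base
    std::uint64_t lineBases = 0;  // bases on the first sequence line
    std::uint64_t lineWidth = 0;  // bytes on the first sequence line, including '\n'
};

// Scan FASTA data once and collect one entry per sequence.
std::vector<FaiEntry> buildEntries(std::istream & fasta)
{
    std::vector<FaiEntry> entries;
    std::string line;
    std::uint64_t pos = 0;  // byte offset of the start of the current line
    while (std::getline(fasta, line))
    {
        std::uint64_t lineBytes = line.size() + 1;  // +1 for the assumed '\n'
        if (!line.empty() && line[0] == '>')
        {
            FaiEntry e;
            // The name is the header text up to the first blank or tab.
            e.name = line.substr(1, line.find_first_of(" \t") - 1);
            e.offset = pos + lineBytes;  // the sequence starts on the next line
            entries.push_back(e);
        }
        else if (!entries.empty() && !line.empty())
        {
            FaiEntry & e = entries.back();
            if (e.lineBases == 0)  // the first sequence line defines the layout
            {
                e.lineBases = line.size();
                e.lineWidth = lineBytes;
            }
            e.length += line.size();
        }
        pos += lineBytes;
    }
    return entries;
}
```

Feeding the function a std::istringstream or std::ifstream yields the entries in file order, so the position of an entry in the vector plays the role of the numeric reference id.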
void clear(faiIndex);
Reset a FaiIndex object to the state after default construction.
Parameters

| faiIndex | The FaiIndex to clear. |
|---|---|

Data Races
Thread safety unknown!
bool empty(faiIndex);
Returns whether the FaiIndex is empty.
Parameters

| faiIndex | The FaiIndex to check. |
|---|---|

Returns

| bool | true if the index is empty and needs to be built via build, false otherwise. |
|---|---|

Data Races
Thread safety unknown!
bool getIdByName(rID, faiIndex, name);
Return reference ID (numeric index in the file) of a sequence in a FAI file.
Parameters

| faiIndex | The FaiIndex to query. |
|---|---|
| name | The name of the sequence whose id to look up. Type: ContainerConcept. |
| rID | The id of the sequence is written here. |

Returns

| bool | true if a sequence with the given name is known in the index, false otherwise. |
|---|---|

Data Races
Thread safety unknown!
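The calling convention (result through an output parameter, success through the return value) can be sketched without SeqAn as follows. The function findIdByName and the plain vector of names are hypothetical stand-ins for the index; the sketch assumes that a name must match exactly, and it leaves the output parameter untouched on failure, which is a choice of this sketch and not a documented guarantee of getIdByName.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Look up the position of name in a list of sequence names kept in file order.
// Hypothetical stand-in for an index lookup; not part of SeqAn.
bool findIdByName(unsigned & rID,
                  std::vector<std::string> const & names,
                  std::string const & name)
{
    for (std::size_t i = 0; i < names.size(); ++i)
    {
        if (names[i] == name)  // exact match only, no prefix matching
        {
            rID = static_cast<unsigned>(i);
            return true;
        }
    }
    return false;  // rID is left unchanged
}
```

Checking the return value before using the id, as the example above does, avoids reading from the wrong sequence when the name is misspelled.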
uint64_t numSeqs(faiIndex);
Return the number of sequences known to a FaiIndex.
Parameters

| faiIndex | The FaiIndex to query. |
|---|---|

Returns

| uint64_t | The number of sequences in the index. |
|---|---|

Data Races
Thread safety unknown!
bool open(faiIndex, fastaFilename [, faiFileName]);
Open a FaiIndex object.
Parameters

| faiIndex | The FaiIndex to load the index into. |
|---|---|
| fastaFilename | Path to the FASTA file that the index describes. Type: char const *. |
| faiFileName | The name of the FAI file to open. This parameter is optional. By default, the FAI file name is derived from the FASTA file name. Type: char const *. |

Returns

| bool | true on success, false otherwise. |
|---|---|

Data Races
Thread safety unknown!
void readRecord(faiIndex, fastaFileName[, faiFileName]);
Read a FAI index from file.
Parameters

| faiIndex | The FaiIndex to read into. |
|---|---|
| fastaFileName | Path to the FASTA file to read. Type: char const *. |
| faiFileName | Path to the FAI file to read. Type: char const *. Defaults to "${fastaFileName}.fai". |

Data Races
Thread safety unknown!
void readRegion(str, faiIndex, rID, beginPos, endPos);
void readRegion(str, faiIndex, region);
Read a region through an FaiIndex.
Parameters

| str | The String to read the sequence into. |
|---|---|
| faiIndex | The FaiIndex to read from. |
| rID | The id of the sequence to read (Type: unsigned). |
| beginPos | The begin position of the region to read (Type: unsigned). |
| endPos | The end position of the region to read (Type: unsigned). |
| region | The GenomicRegion to read. |

Data Races
Thread safety unknown!
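The example above passes 0 and 100 and labels the result "chr:1-100", which suggests that beginPos and endPos are 0-based with an exclusive end. Under that assumption, the following sketch converts a 1-based inclusive region string into such a pair. It is plain standard C++; Region and parseRegion are hypothetical names and are not the seqan GenomicRegion type or its parser.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <string>

// Hypothetical region type with 0-based, half-open coordinates.
struct Region
{
    std::string name;
    std::uint64_t beginPos = 0;  // first position, 0-based
    std::uint64_t endPos = 0;    // one past the last position
};

// Parse "NAME:FIRST-LAST" where FIRST and LAST are 1-based and inclusive.
// Returns false and leaves region unchanged if the text does not fit that shape.
bool parseRegion(Region & region, std::string const & text)
{
    std::size_t colon = text.rfind(':');
    if (colon == std::string::npos || colon == 0)
        return false;
    std::size_t dash = text.find('-', colon + 1);
    if (dash == std::string::npos)
        return false;
    try
    {
        std::uint64_t first = std::stoull(text.substr(colon + 1, dash - colon - 1));
        std::uint64_t last = std::stoull(text.substr(dash + 1));
        if (first == 0 || last < first)
            return false;
        region.name = text.substr(0, colon);
        region.beginPos = first - 1;  // 1-based inclusive start becomes 0-based
        region.endPos = last;         // inclusive end equals the exclusive 0-based end
        return true;
    }
    catch (std::exception const &)
    {
        return false;  // a coordinate was not a number or was out of range
    }
}
```

The length of the region is then simply endPos minus beginPos, which is one reason half-open intervals are convenient for this kind of interface.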
void readSequence(str, faiIndex, rID);
Load a whole sequence from a FaiIndex.
Parameters

| str | The String to read into. |
|---|---|
| faiIndex | The FaiIndex to read from. |
| rID | The index of the sequence in the file. |

Data Races
Thread safety unknown!
bool save(faiIndex[, faiFileName]);
Save a FaiIndex object.
Parameters

| faiIndex | The FaiIndex to write out. |
|---|---|
| faiFileName | The name of the FAI file to write to. This parameter may be omitted only if the index already knows the FAI file name from a previous call to build, in which case that name is used. Type: char const *. |

Returns

| bool | true on success, false otherwise. |
|---|---|

Data Races
Thread safety unknown!
uint64_t sequenceLength(faiIndex, rID);
Return length of the sequence with the given id in the FaiIndex.
Parameters

| faiIndex | The FaiIndex to query. |
|---|---|
| rID | The id of the sequence to get the length of. |

Returns

| uint64_t | The length of the sequence with index rID in faiIndex. |
|---|---|

Data Races
Thread safety unknown!
CharString sequenceName(faiIndex, rID);
Return the name of the sequence with the given id in the FaiIndex.
Parameters

| faiIndex | The FaiIndex to query. |
|---|---|
| rID | The index of the sequence. |

Returns

| CharString | The name of the sequence with the given id. |
|---|---|

Data Races
Thread safety unknown!