Class
SequenceStream
High-level reading and writing of sequences.
Building upon the lower-level sequence I/O functionality of SeqAn, this class provides easier-to-use I/O facilities. In particular, the underlying stream layer and the use of RecordReaders are hidden from the user. This is achieved through dynamic polymorphism, which comes at some performance cost.
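To illustrate the trade-off mentioned above, here is a minimal sketch of how dynamic polymorphism can hide format-specific parsing behind one interface. It uses only the C++ standard library and is not SeqAn's implementation: RecordParser, FastaParser, FastqParser, SimpleSequenceReader, Format, and demoPolymorphicReader are hypothetical names, and the parsers assume single-line sequences.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical interface: one virtual call per record hides the format-specific parser.
struct RecordParser
{
    virtual ~RecordParser() = default;
    // Returns true if a record was read into id and seq.
    virtual bool read(std::istream & in, std::string & id, std::string & seq) = 0;
};

// Parses simplified FASTA records: ">id" followed by one sequence line.
struct FastaParser : RecordParser
{
    bool read(std::istream & in, std::string & id, std::string & seq) override
    {
        std::string header;
        if (!std::getline(in, header) || header.empty() || header[0] != '>')
            return false;
        id = header.substr(1);
        return static_cast<bool>(std::getline(in, seq));
    }
};

// Parses four-line FASTQ records: "@id", sequence, "+", qualities (discarded here).
struct FastqParser : RecordParser
{
    bool read(std::istream & in, std::string & id, std::string & seq) override
    {
        std::string header, plus, quals;
        if (!std::getline(in, header) || header.empty() || header[0] != '@')
            return false;
        id = header.substr(1);
        return std::getline(in, seq) && std::getline(in, plus) && std::getline(in, quals);
    }
};

enum class Format { Fasta, Fastq };

// Facade: the caller picks a format once and never touches the parser classes.
class SimpleSequenceReader
{
public:
    SimpleSequenceReader(std::istream & in, Format format) : in_(in)
    {
        if (format == Format::Fasta)
            parser_.reset(new FastaParser());
        else
            parser_.reset(new FastqParser());
    }

    bool readRecord(std::string & id, std::string & seq)
    {
        return parser_->read(in_, id, seq);  // resolved at run time via the vtable
    }

private:
    std::istream & in_;
    std::unique_ptr<RecordParser> parser_;
};

// Small self-check showing typical use of the facade.
inline void demoPolymorphicReader()
{
    std::istringstream in(">r1\nACGT\n");
    SimpleSequenceReader reader(in, Format::Fasta);
    std::string id, seq;
    bool ok = reader.readRecord(id, seq);
    assert(ok && id == "r1" && seq == "ACGT");
    (void)ok;
}
```

In this sketch, every readRecord call goes through a virtual function, which is the kind of per-call overhead the description refers to; in exchange, the calling code stays independent of the concrete format.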
Include Headers
seqan/seq_io.h
Remarks
Operation Mode, File Format, and File Type
When reading, there are two operation modes: normal reading and reading of "persistent" records. In "persistent" mode, SequenceStream scans each record twice: once to determine its size and once to actually read the sequence. After the first pass, a buffer of exactly the required size can be allocated. This can reduce memory usage by up to a factor of two, at the cost of scanning each record twice. Note that this mode is only possible when reading uncompressed files.
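The two-pass idea behind "persistent" reading can be sketched with the C++ standard library alone. The following is an illustration of the technique under simplifying assumptions (FASTA input, a seekable std::istream), not SeqAn's code; measureRecord and readRecordTwoPass are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Pass 1: count the sequence characters of the record at the current position.
// Only the current line is buffered; stops at the next '>' header or at EOF.
inline std::size_t measureRecord(std::istream & in)
{
    std::string line;
    std::size_t length = 0;
    std::getline(in, line);  // skip the ">id" header line
    while (in.peek() != '>' && std::getline(in, line))
        length += line.size();
    return length;
}

// Two-pass read: measure the record, rewind, then fill a buffer reserved up front.
// Returns false if no record starts at the current position.
inline bool readRecordTwoPass(std::istream & in, std::string & id, std::string & seq)
{
    std::istream::pos_type start = in.tellg();
    if (in.peek() != '>')
        return false;

    std::size_t length = measureRecord(in);  // first pass

    in.clear();       // reset EOF/fail flags that the first pass may have set
    in.seekg(start);  // rewind to the start of the record

    std::string line;
    std::getline(in, line);
    id = line.substr(1);

    seq.clear();
    seq.reserve(length);  // reserve the measured size so appending does not reallocate
    while (in.peek() != '>' && std::getline(in, line))
        seq += line;      // second pass
    assert(seq.size() == length);
    return true;
}
```

The sketch has to rewind the stream between the two passes, so it only works on seekable input; a stream that is being decompressed on the fly generally cannot be rewound this cheaply.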
The file type determines whether a file is stored as raw text or is compressed. Examples of file types are text files and gzip-compressed files (FILE.gz). The file format describes the contents of the raw/decompressed file. Examples of file formats are FASTA, FASTQ, and EMBL.
When reading, the file type and format are guessed from the file itself. You do not have to specify either, but you can force the SequenceStream to use the ones you provide. When writing, you should specify a file type and format when constructing the SequenceStream object. Otherwise, it defaults to writing raw-text FASTA files.
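As a rough illustration of the distinction between file type and file format, the sketch below guesses both from the first bytes of the data. It shows one plausible heuristic and does not claim to match the detection logic SeqAn actually uses; FileType, FileFormat, guessFileType, guessFileFormat, and demoGuess are hypothetical names. The heuristic relies on gzip data starting with the bytes 0x1f 0x8b, FASTA records starting with '>', FASTQ records with '@', and EMBL entries with an "ID" line.

```cpp
#include <cassert>
#include <string>

enum class FileType { Text, Gzip };
enum class FileFormat { Fasta, Fastq, Embl, Unknown };

// File type: gzip data starts with the magic bytes 0x1f 0x8b.
inline FileType guessFileType(std::string const & head)
{
    bool gz = head.size() >= 2 &&
              static_cast<unsigned char>(head[0]) == 0x1f &&
              static_cast<unsigned char>(head[1]) == 0x8b;
    return gz ? FileType::Gzip : FileType::Text;
}

// File format: inspect the start of the raw (already decompressed) content.
inline FileFormat guessFileFormat(std::string const & text)
{
    if (text.empty())
        return FileFormat::Unknown;
    if (text[0] == '>')
        return FileFormat::Fasta;
    if (text[0] == '@')
        return FileFormat::Fastq;
    if (text.compare(0, 2, "ID") == 0)
        return FileFormat::Embl;
    return FileFormat::Unknown;
}

// Small self-check showing that type and format are decided independently.
inline void demoGuess()
{
    assert(guessFileType("\x1f\x8b\x08") == FileType::Gzip);
    assert(guessFileType(">seq1\nACGT\n") == FileType::Text);
    assert(guessFileFormat(">seq1\nACGT\n") == FileFormat::Fasta);
}
```

The two checks are deliberately separate: the type is decided on the bytes as stored on disk, while the format can only be decided once the raw or decompressed text is available.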
Member Functions
SequenceStream  Constructor
Functions
atEnd        Check whether a SequenceStream is at the end of the file.
close        Close the SequenceStream.
flush        Write all data from SequenceStream to disk.
isGood       Check whether a SequenceStream object is ready for reading.
open         Open or re-open a file using a SequenceStream.
readAll      Read all sequence records from a SequenceStream object.
readRecord   Read the next sequence record from SequenceStream.
writeAll     Write sequence records to a SequenceStream object.
writeRecord  Write one sequence record to a SequenceStream object.
Examples
Read a sequence file "example.fa" record by record. See the documentation of readRecord, readBatch, and readAll for more examples, including record-wise reading, reading in batches, and reading all records in a file.
#include <iostream>
#include <seqan/seq_io.h>

int main()
{
    // Create SequenceStream object for reading, optimized for reading single records.
    seqan::SequenceStream seqIO("example.fa");
    if (!isGood(seqIO))
    {
        std::cerr << "Could not open example.fa!\n";
        return 1;
    }

    // Buffers for the sequence ids and characters.
    seqan::CharString id;
    seqan::Dna5QString seq;

    while (!atEnd(seqIO))
    {
        // Read the next record from the file.  For sequences with qualities,
        // the qualities are stored directly in the Dna5Q characters.
        int res = readRecord(id, seq, seqIO);
        if (res != 0)
        {
            // Stop here instead of printing a partially read record.
            std::cerr << "Error reading file!\n";
            return 1;
        }

        // Extract qualities for printing, then print id, sequence, and qualities.
        seqan::CharString quals;
        assignQualityValues(quals, seq);
        std::cout << id << '\t' << seq << '\t' << quals << '\n';
    }

    return 0;
}
SeqAn - Sequence Analysis Library - www.seqan.de
 

Page built @2013/07/11 09:12:16