Class
SequenceStream
High-level reading and writing of sequences.
Building upon the lower-level sequence I/O functionality of SeqAn, this class provides easier-to-use I/O facilities. In particular, the underlying stream layer and the use of RecordReaders are hidden from the user. This is achieved through dynamic polymorphism, which comes at some performance cost.
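The following is a minimal sketch of how dynamic polymorphism can hide format-specific readers behind one interface. It is not SeqAn's implementation: FormatReader, FastaReader, FastqReader, and SimpleSequenceStream are hypothetical names introduced for illustration, and the parsing is deliberately simplified (FASTQ records are assumed to span exactly four lines).

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical interface (not part of SeqAn): one virtual call per record.
struct FormatReader
{
    virtual ~FormatReader() {}
    // Returns true if a record was read into id and seq.
    virtual bool readRecord(std::string & id, std::string & seq, std::istream & in) = 0;
};

// Simplified FASTA reader: header line starting with '>', then sequence lines.
struct FastaReader : FormatReader
{
    bool readRecord(std::string & id, std::string & seq, std::istream & in)
    {
        std::string line;
        if (!std::getline(in, line) || line.empty() || line[0] != '>')
            return false;
        id = line.substr(1);
        seq.clear();
        // Concatenate lines until the next header or the end of the input.
        while (in.peek() != '>' && std::getline(in, line))
            seq += line;
        return true;
    }
};

// Simplified FASTQ reader: assumes exactly four lines per record.
struct FastqReader : FormatReader
{
    bool readRecord(std::string & id, std::string & seq, std::istream & in)
    {
        std::string line, plus, qual;
        if (!std::getline(in, line) || line.empty() || line[0] != '@')
            return false;
        id = line.substr(1);
        return std::getline(in, seq) && std::getline(in, plus) && std::getline(in, qual);
    }
};

// Facade: picks a reader once, then forwards every call through the
// virtual interface. The caller never sees the concrete reader type.
class SimpleSequenceStream
{
public:
    explicit SimpleSequenceStream(std::istream & in) : in_(in)
    {
        if (in.peek() == '@')
            reader_.reset(new FastqReader());
        else
            reader_.reset(new FastaReader());
    }

    bool readRecord(std::string & id, std::string & seq)
    {
        assert(reader_.get() != 0);
        return reader_->readRecord(id, seq, in_);
    }

private:
    std::istream & in_;
    std::unique_ptr<FormatReader> reader_;
};
```

The performance cost mentioned above corresponds to the indirect call in SimpleSequenceStream::readRecord: the format is resolved at run time rather than at compile time, so the compiler cannot inline the format-specific parsing code into the caller.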
Include Headers
seqan/seq_io.h
Remarks
Operation Mode
When reading, there are two operation modes: normal reading and reading of "persistent" records. In "persistent" mode, SequenceStream scans each record twice: once to determine its size and once to actually read the sequence. After the first pass, a buffer of exactly the required size can be allocated. This can reduce memory usage by up to a factor of two, at the cost of scanning each record twice. Note that this mode is only available when reading uncompressed files.
File Format and File Type
The file type determines whether a file is stored as raw text or whether it is compressed. Examples of file types are text files and gzip-compressed files (FILE.gz). The file format describes the contents of the raw/decompressed file. Examples of file formats are FASTA, FASTQ, and EMBL.
When reading, the file type and format are guessed from the file itself. You do not have to specify either, but you can force the SequenceStream to use the ones you provide. When writing, you should specify a file type and format when constructing the SequenceStream object. Otherwise, it will default to writing raw-text FASTA files.
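To illustrate the distinction between file type and file format, here is a rough sketch of how both could be guessed from the first bytes of a file. This is an assumption about one possible approach, not SeqAn's actual detection code: the enums and the functions guessFileType and guessFileFormat are hypothetical. It relies on two well-known conventions: gzip data starts with the bytes 0x1f 0x8b, and FASTA and FASTQ records start with '>' and '@', respectively.

```cpp
#include <cassert>
#include <string>

// Hypothetical enums for this sketch (not SeqAn's types).
enum FileType { TYPE_TEXT, TYPE_GZ };
enum FileFormat { FORMAT_UNKNOWN, FORMAT_FASTA, FORMAT_FASTQ };

// Guess the file type from the first raw bytes of the file.
// gzip streams begin with the magic bytes 0x1f 0x8b.
FileType guessFileType(std::string const & head)
{
    if (head.size() >= 2 &&
        static_cast<unsigned char>(head[0]) == 0x1f &&
        static_cast<unsigned char>(head[1]) == 0x8b)
        return TYPE_GZ;
    return TYPE_TEXT;
}

// Guess the file format from the first bytes of the raw/decompressed
// contents. The format is a property of the decompressed data, so the
// caller must decompress first; this precondition is checked.
FileFormat guessFileFormat(std::string const & head)
{
    assert(guessFileType(head) == TYPE_TEXT);
    if (head.empty())
        return FORMAT_UNKNOWN;
    if (head[0] == '>')
        return FORMAT_FASTA;
    if (head[0] == '@')
        return FORMAT_FASTQ;
    return FORMAT_UNKNOWN;
}
```

A reader built this way would first call guessFileType on the raw bytes, set up decompression if needed, and only then call guessFileFormat on the decompressed data. Formats such as EMBL would need additional checks that are omitted here.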
Member Functions
SequenceStream  Constructor
Functions
atEnd        Check whether a SequenceStream is at the end of the file.
close        Close the SequenceStream.
flush        Write all data from SequenceStream to disk.
isGood       Check whether a SequenceStream object is ready for reading.
open         Open or re-open a file using a SequenceStream.
readAll      Read all sequence records from a SequenceStream object.
readRecord   Read the next sequence record from SequenceStream.
writeAll     Write sequence records to a SequenceStream object.
writeRecord  Write one sequence record to a SequenceStream object.
Examples
Read the sequence file (FASTA or FASTQ) from argv[1] record by record. The identifiers and sequences of the stream are printed to stdout. See the documentation of readRecord, readBatch, and readAll for more examples, including record-wise reading, reading in batches, and reading all records in a file.
#include <iostream>

#include <seqan/basic.h>
#include <seqan/seq_io.h>
#include <seqan/sequence.h>

using namespace seqan;

// USAGE: sequence_read_stream_read FILE
//
// Print the contents of sequence FILE to stdout in tabular format.

int main(int argc, char ** argv)
{
    if (argc != 2)
    {
        std::cerr << "USAGE: " << argv[0] << " SEQUENCE.{fa,fq}\n";
        return 1;
    }

    // Open the file; file type and format are guessed from its contents.
    SequenceStream seqStream(argv[1]);
    if (!isGood(seqStream))
    {
        std::cerr << "ERROR: Could not open " << argv[1] << " for reading.\n";
        return 1;
    }

    // Read record by record; readRecord returns 0 on success.
    seqan::CharString id, seq;
    while (!atEnd(seqStream))
    {
        if (readRecord(id, seq, seqStream) != 0)
        {
            std::cerr << "Problem reading from " << argv[1] << "\n";
            return 1;
        }
        std::cout << id << "\t" << seq << "\n";
    }

    return 0;
}
Open a SequenceStream for writing and write two sequences to it.
#include <iostream>

#include <seqan/basic.h>
#include <seqan/seq_io.h>
#include <seqan/sequence.h>

using namespace seqan;

// USAGE: sequence_read_stream_write FILE
//
// Write some sequences to the file FILE.

int main(int argc, char ** argv)
{
    if (argc != 2)
    {
        std::cerr << "USAGE: " << argv[0] << " SEQUENCE.{fa,fq}\n";
        return 1;
    }

    // Open the file in write mode.
    SequenceStream seqStream(argv[1], SequenceStream::WRITE);
    if (!isGood(seqStream))
    {
        std::cerr << "ERROR: Could not open " << argv[1] << " for writing.\n";
        return 1;
    }

    // Write two records; writeRecord returns 0 on success.
    if (writeRecord(seqStream, "one", "CGAT") != 0 ||
        writeRecord(seqStream, "two", "ASDF") != 0)
    {
        std::cerr << "ERROR: Problem writing to " << argv[1] << "\n";
        return 1;
    }

    return 0;
}
SeqAn - Sequence Analysis Library - www.seqan.de
 

Page built @2013/07/11 09:12:35