Chopper
chopper::layout::hibf_statistics Class Reference

#include <chopper/layout/hibf_statistics.hpp>

Classes

class  bin
 
struct  level
 A representation of an IBF level that gathers information about bins in an IBF.
 
struct  level_summary
 

Public Types

enum class  bin_kind { split , merged }
 The kind of bin that is stored.
 

Public Member Functions

 hibf_statistics ()=delete
 Deleted. Holds reference members.
 
 hibf_statistics (hibf_statistics const &b)=delete
 Deleted. Holds const member.
 
hibf_statistics & operator= (hibf_statistics const &)=delete
 Deleted. Holds const member.
 
 hibf_statistics (hibf_statistics &&b)=delete
 Deleted. Holds const member.
 
hibf_statistics & operator= (hibf_statistics &&)=delete
 Deleted. Holds const member.
 
 ~hibf_statistics ()=default
 Defaulted.
 
 hibf_statistics (configuration const &config_, std::vector< seqan::hibf::sketch::hyperloglog > const &sketches_, std::vector< size_t > const &kmer_counts)
 Construct an empty HIBF with an empty top level IBF.
 
void finalize ()
 Gather all statistics to have all members ready.
 
void print_summary_to (size_t &t_max_64_memory, std::ostream &stream, bool const verbose=true)
 Prints a tab-separated summary of the statistics of this HIBF to the command line.
 
size_t total_hibf_size_in_byte ()
 Return the total corrected size of the HIBF in bytes.
 

Static Public Member Functions

static void print_header_to (std::ostream &stream, bool const verbose=true)
 Prints the column names of the summary to the command line.
 
static std::string byte_size_to_formatted_str (size_t const bytes)
 Round bytes to the appropriate unit and convert to string with unit.
 

Public Attributes

level top_level_ibf
 The top level IBF of this HIBF, often starting point for recursions.
 
double total_query_cost {0.0}
 The estimated query cost of every single kmer in this HIBF.
 
double expected_HIBF_query_cost {0.0}
 The estimated query cost relative to the total k-mer count in the data set.
 
seqan::hibf::layout::layout hibf_layout
 The layout of this HIBF, stored by value.
 

Private Member Functions

std::string to_formatted_BF_size (size_t const number_of_kmers_to_be_stored) const
 Compute the Bloom Filter size from number_of_kmers_to_be_stored and return it as a formatted string with the appropriate unit.
 
void collect_bins ()
 
void compute_cardinalities (level &curr_level)
 
void compute_total_query_cost (level &curr_level)
 Computes the estimated query cost.
 
void gather_statistics (level const &curr_level, size_t const level_summary_index)
 Recursively gather all the statistics from the bins.
 

Private Attributes

configuration const config {}
 Copy of the user configuration for this HIBF.
 
std::vector< double > const fp_correction {}
 The split bin false positive correction factors to use for the statistics.
 
double const merged_fpr_correction_factor {}
 The merged bin false positive correction factor to use for the statistics.
 
std::vector< seqan::hibf::sketch::hyperloglog > const & sketches
 A reference to the input sketches.
 
std::vector< size_t > const & counts
 A reference to the input counts.
 
size_t const total_kmer_count {}
 The original kmer count of all user bins.
 
std::map< size_t, level_summary > summaries
 The gathered summary of statistics for each level of this HIBF.
 

Member Enumeration Documentation

◆ bin_kind

The kind of bin that is stored.

Enumerator
split 

A single user bin, split into 1 or more bins (even though 1 is not technically split).

merged 

Multiple user bins are merged into a single technical bin.
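
The distinction between the two enumerators can be sketched as a small classifier. This is a minimal illustration, assuming that the number of user bins stored in a technical bin is known; the helper classify is hypothetical and not part of the class.

```cpp
#include <cstddef>

// Mirrors the documented enumeration; the helper below is hypothetical.
enum class bin_kind
{
    split,
    merged
};

// Classify a technical bin by the number of user bins it stores.
// Exactly one user bin: split (even if it occupies only 1 technical bin).
// More than one user bin: merged.
constexpr bin_kind classify(std::size_t const user_bins_in_technical_bin)
{
    return user_bins_in_technical_bin > 1 ? bin_kind::merged : bin_kind::split;
}
```

A user bin that occupies exactly one technical bin therefore still counts as split, matching the remark in the enumerator description.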

Constructor & Destructor Documentation

◆ hibf_statistics() [1/4]

chopper::layout::hibf_statistics::hibf_statistics ( )
delete

Deleted. Holds reference members.

◆ hibf_statistics() [2/4]

chopper::layout::hibf_statistics::hibf_statistics ( hibf_statistics const &  b)
delete

Deleted. Holds const member.

◆ hibf_statistics() [3/4]

chopper::layout::hibf_statistics::hibf_statistics ( hibf_statistics &&  b)
delete

Deleted. Holds const member.
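
The effect of const and reference members on the special member functions can be checked with type traits. The sketch below uses a hypothetical stand-in type, stats_like, that only shares the kinds of members (one const value, one reference) with hibf_statistics. It shows what the language itself rules out; the documented class additionally deletes its copy and move constructors explicitly.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in: one const value member and one reference member.
struct stats_like
{
    stats_like(std::size_t const total_, std::vector<std::size_t> const & counts_) :
        total{total_},
        counts{counts_}
    {}

    std::size_t const total;                 // const member: cannot be reassigned
    std::vector<std::size_t> const & counts; // reference member: cannot be reseated
};

// No default constructor: the reference would have nothing to bind to.
static_assert(!std::is_default_constructible<stats_like>::value, "not default constructible");
// No copy or move assignment: neither member can be assigned to.
static_assert(!std::is_copy_assignable<stats_like>::value, "not copy assignable");
static_assert(!std::is_move_assignable<stats_like>::value, "not move assignable");
// Copy construction would still be generated implicitly for this stand-in.
static_assert(std::is_copy_constructible<stats_like>::value, "copy constructible");
```

Because the referenced vector is not owned, the object passed to the constructor has to outlive the instance.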

◆ ~hibf_statistics()

chopper::layout::hibf_statistics::~hibf_statistics ( )
default

Defaulted.

◆ hibf_statistics() [4/4]

chopper::layout::hibf_statistics::hibf_statistics ( configuration const &  config_,
std::vector< seqan::hibf::sketch::hyperloglog > const &  sketches_,
std::vector< size_t > const &  kmer_counts 
)

Construct an empty HIBF with an empty top level IBF.

Parameters
[in]  config_      User configuration for the HIBF.
[in]  sketches_    The sketches of the input.
[in]  kmer_counts  The original user bin weights (kmer counts).

Member Function Documentation

◆ byte_size_to_formatted_str()

std::string chopper::layout::hibf_statistics::byte_size_to_formatted_str ( size_t const  bytes)
static

Round bytes to the appropriate unit and convert to string with unit.
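
A minimal sketch of such a conversion follows. It is not the library's implementation: the function name format_byte_size, the binary units (factor 1024), the unit labels, and the single decimal place are all assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, not the library's implementation.
// Assumptions: binary units (factor 1024) and one decimal place.
std::string format_byte_size(std::size_t const bytes)
{
    static constexpr std::array<char const *, 5> units{{"Bytes", "KiB", "MiB", "GiB", "TiB"}};

    // Divide by 1024 until the value fits the current unit or the largest unit is reached.
    double value = static_cast<double>(bytes);
    std::size_t unit = 0;
    while (value >= 1024.0 && unit + 1 < units.size())
    {
        value /= 1024.0;
        ++unit;
    }

    char buffer[64];
    if (unit == 0)
        std::snprintf(buffer, sizeof(buffer), "%zu %s", bytes, units[0]); // plain byte count
    else
        std::snprintf(buffer, sizeof(buffer), "%.1f %s", value, units[unit]);
    return buffer;
}
```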

◆ collect_bins()

void chopper::layout::hibf_statistics::collect_bins ( )
private

◆ compute_cardinalities()

void chopper::layout::hibf_statistics::compute_cardinalities ( level &  curr_level)
private

◆ compute_total_query_cost()

void chopper::layout::hibf_statistics::compute_total_query_cost ( level &  curr_level)
private

Computes the estimated query cost.

◆ finalize()

void chopper::layout::hibf_statistics::finalize ( )

Gather all statistics to have all members ready.

◆ gather_statistics()

void chopper::layout::hibf_statistics::gather_statistics ( level const &  curr_level,
size_t const  level_summary_index 
)
private

Recursively gather all the statistics from the bins.

Parameters
[in]  curr_level           The current IBF from which the statistics will be extracted.
[in]  level_summary_index  The index of curr_level in summaries.
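
The recursion pattern described here can be sketched with simplified types. The structs level_sketch and summary_sketch and the function gather are hypothetical stand-ins; the real level and level_summary types hold more information, and the assumption that child IBFs sit one index deeper than their parent is made only for illustration.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical, simplified stand-in for an IBF level.
struct level_sketch
{
    std::size_t bin_count;                  // bins stored directly in this IBF
    std::vector<level_sketch> child_levels; // lower-level IBFs below this one
};

// Hypothetical, simplified stand-in for a per-level summary.
struct summary_sketch
{
    std::size_t num_ibfs{};
    std::size_t num_bins{};
};

// Add the current IBF to the summary of its level, then recurse one level deeper.
void gather(level_sketch const & curr_level,
            std::size_t const level_summary_index,
            std::map<std::size_t, summary_sketch> & summaries)
{
    summary_sketch & summary = summaries[level_summary_index]; // created on first visit
    ++summary.num_ibfs;
    summary.num_bins += curr_level.bin_count;

    for (level_sketch const & child : curr_level.child_levels)
        gather(child, level_summary_index + 1, summaries);
}
```

Keying the map by level index means that all IBFs on the same depth accumulate into one summary entry, regardless of which branch of the recursion reaches them.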

◆ operator=() [1/2]

hibf_statistics & chopper::layout::hibf_statistics::operator= ( hibf_statistics &&  )
delete

Deleted. Holds const member.

◆ operator=() [2/2]

hibf_statistics & chopper::layout::hibf_statistics::operator= ( hibf_statistics const &  )
delete

Deleted. Holds const member.

◆ print_header_to()

void chopper::layout::hibf_statistics::print_header_to ( std::ostream &  stream,
bool const  verbose = true 
)
static

Prints the column names of the summary to the command line.

◆ print_summary_to()

void chopper::layout::hibf_statistics::print_summary_to ( size_t &  t_max_64_memory,
std::ostream &  stream,
bool const  verbose = true 
)

Prints a tab-separated summary of the statistics of this HIBF to the command line.
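
Since both printing functions take a std::ostream &, the same code can write to std::cout, a file stream, or a string stream. The sketch below illustrates the header-plus-rows pattern for tab-separated output. The row type, the column names, and the function names are hypothetical; the actual columns of the summary are not listed on this page.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical row type and column names, chosen only for illustration.
struct level_row
{
    std::size_t level;
    std::size_t num_ibfs;
    std::size_t num_bins;
};

// Writes only the column names, one tab between fields.
void print_header_sketch(std::ostream & stream)
{
    stream << "level\tnum_ibfs\tnum_bins\n";
}

// Writes one tab-separated line whose fields line up with the header.
void print_row_sketch(level_row const & row, std::ostream & stream)
{
    stream << row.level << '\t' << row.num_ibfs << '\t' << row.num_bins << '\n';
}

// Renders header and row into a string, showing that any std::ostream works.
std::string render(level_row const & row)
{
    std::ostringstream out;
    print_header_sketch(out);
    print_row_sketch(row, out);
    return out.str();
}
```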

◆ to_formatted_BF_size()

std::string chopper::layout::hibf_statistics::to_formatted_BF_size ( size_t const  number_of_kmers_to_be_stored) const
private

Compute the Bloom Filter size from number_of_kmers_to_be_stored and return it as a formatted string with the appropriate unit.

Parameters
[in]number_of_kmers_to_be_stored
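
For orientation, the textbook relation between the number of stored elements and the size of a Bloom filter can be sketched as below. This is not necessarily the computation used by the library: the function name bloom_filter_bits and the explicit parameters fpr (target false positive rate) and num_hash (number of hash functions) are assumptions, whereas the documented member takes only the k-mer count and presumably reads the remaining values from the configuration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Textbook Bloom filter sizing, not necessarily the library's exact computation.
// n:        number of k-mers to store
// fpr:      target false positive rate (assumed parameter)
// num_hash: number of hash functions (assumed parameter)
// Returns the number of bits m = ceil(-h * n / ln(1 - fpr^(1/h))).
std::size_t bloom_filter_bits(std::size_t const n, double const fpr, std::size_t const num_hash)
{
    assert(fpr > 0.0 && fpr < 1.0 && num_hash > 0);
    double const h = static_cast<double>(num_hash);
    double const denominator = std::log(1.0 - std::pow(fpr, 1.0 / h));
    return static_cast<std::size_t>(std::ceil(-h * static_cast<double>(n) / denominator));
}
```

Dividing the result by 8 gives a size in bytes, which could then be passed to a formatting routine to obtain a string with a unit.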

◆ total_hibf_size_in_byte()

size_t chopper::layout::hibf_statistics::total_hibf_size_in_byte ( )

Return the total corrected size of the HIBF in bytes.

Member Data Documentation

◆ config

configuration const chopper::layout::hibf_statistics::config {}
private

Copy of the user configuration for this HIBF.

◆ counts

std::vector<size_t> const& chopper::layout::hibf_statistics::counts
private

A reference to the input counts.

◆ expected_HIBF_query_cost

double chopper::layout::hibf_statistics::expected_HIBF_query_cost {0.0}

The estimated query cost relative to the total k-mer count in the data set.
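
One plausible reading of "relative to the total k-mer count" is a plain ratio of total_query_cost to total_kmer_count. The sketch below encodes that reading; it is an assumption, not confirmed by this page, and the free function and its guard against a zero count are hypothetical.

```cpp
#include <cstddef>

// Assumed reading of "relative to": total cost divided by the total k-mer count.
// The zero guard is a choice made for this sketch only.
constexpr double expected_query_cost(double const total_query_cost, std::size_t const total_kmer_count)
{
    return total_kmer_count == 0 ? 0.0 : total_query_cost / static_cast<double>(total_kmer_count);
}
```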

◆ fp_correction

std::vector<double> const chopper::layout::hibf_statistics::fp_correction {}
private

The split bin false positive correction factors to use for the statistics.

◆ hibf_layout

seqan::hibf::layout::layout chopper::layout::hibf_statistics::hibf_layout

The layout of this HIBF, stored by value.

◆ merged_fpr_correction_factor

double const chopper::layout::hibf_statistics::merged_fpr_correction_factor {}
private

The merged bin false positive correction factor to use for the statistics.

◆ sketches

std::vector<seqan::hibf::sketch::hyperloglog> const& chopper::layout::hibf_statistics::sketches
private

A reference to the input sketches.

◆ summaries

std::map<size_t, level_summary> chopper::layout::hibf_statistics::summaries
private

The gathered summary of statistics for each level of this HIBF.

◆ top_level_ibf

level chopper::layout::hibf_statistics::top_level_ibf

The top level IBF of this HIBF, often starting point for recursions.

◆ total_kmer_count

size_t const chopper::layout::hibf_statistics::total_kmer_count {}
private

The original kmer count of all user bins.

◆ total_query_cost

double chopper::layout::hibf_statistics::total_query_cost {0.0}

The estimated query cost of every single kmer in this HIBF.


The documentation for this class was generated from the following files: