cudf::io::parquet_writer_options_base Class Reference

Base settings for write_parquet() and parquet_chunked_writer.

#include <parquet.hpp>

Inherited by cudf::io::chunked_parquet_writer_options and cudf::io::parquet_writer_options.

Public Member Functions

 parquet_writer_options_base ()=default
 Default constructor.

sink_info const & get_sink () const
 Returns sink info.

compression_type get_compression () const
 Returns compression format used.

statistics_freq get_stats_level () const
 Returns level of statistics requested in output file.

auto const & get_metadata () const
 Returns associated metadata.

std::vector< std::map< std::string, std::string > > const & get_key_value_metadata () const
 Returns Key-Value footer metadata information.

bool is_enabled_int96_timestamps () const
 Returns true if timestamps will be written as INT96.

auto is_enabled_utc_timestamps () const
 Returns true if timestamps will be written as UTC.

auto is_enabled_write_arrow_schema () const
 Returns true if arrow schema will be written.

auto get_row_group_size_bytes () const
 Returns maximum row group size, in bytes.

auto get_row_group_size_rows () const
 Returns maximum row group size, in rows.

auto get_max_page_size_bytes () const
 Returns the maximum uncompressed page size, in bytes.

auto get_max_page_size_rows () const
 Returns maximum page size, in rows.

auto get_column_index_truncate_length () const
 Returns maximum length of min or max values in column index, in bytes.

dictionary_policy get_dictionary_policy () const
 Returns policy for dictionary use.

auto get_max_dictionary_size () const
 Returns maximum dictionary size, in bytes.

auto get_max_page_fragment_size () const
 Returns maximum page fragment size, in rows.

std::shared_ptr< writer_compression_statistics > get_compression_statistics () const
 Returns a shared pointer to the user-provided compression statistics.

auto is_enabled_write_v2_headers () const
 Returns true if V2 page headers should be written.

auto const & get_sorting_columns () const
 Returns the sorting_columns.

void set_metadata (table_input_metadata metadata)
 Sets metadata.

void set_key_value_metadata (std::vector< std::map< std::string, std::string >> metadata)
 Sets Key-Value footer metadata.

void set_stats_level (statistics_freq sf)
 Sets the level of statistics.

void set_compression (compression_type compression)
 Sets compression type.

void enable_int96_timestamps (bool req)
 Sets timestamp writing preferences. INT96 timestamps will be written if true and TIMESTAMP_MICROS will be written if false.

void enable_utc_timestamps (bool val)
 Sets preference for writing timestamps as UTC. Write timestamps as UTC if set to true.

void enable_write_arrow_schema (bool val)
 Sets preference for writing arrow schema. Write arrow schema if set to true.

void set_row_group_size_bytes (size_t size_bytes)
 Sets the maximum row group size, in bytes.

void set_row_group_size_rows (size_type size_rows)
 Sets the maximum row group size, in rows.

void set_max_page_size_bytes (size_t size_bytes)
 Sets the maximum uncompressed page size, in bytes.

void set_max_page_size_rows (size_type size_rows)
 Sets the maximum page size, in rows.

void set_column_index_truncate_length (int32_t size_bytes)
 Sets the maximum length of min or max values in column index, in bytes.

void set_dictionary_policy (dictionary_policy policy)
 Sets the policy for dictionary use.

void set_max_dictionary_size (size_t size_bytes)
 Sets the maximum dictionary size, in bytes.

void set_max_page_fragment_size (size_type size_rows)
 Sets the maximum page fragment size, in rows.

void set_compression_statistics (std::shared_ptr< writer_compression_statistics > comp_stats)
 Sets the pointer to the output compression statistics.

void enable_write_v2_headers (bool val)
 Sets preference for V2 page headers. Write V2 page headers if set to true.

void set_sorting_columns (std::vector< sorting_column > sorting_columns)
 Sets sorting columns.


Protected Member Functions

 parquet_writer_options_base (sink_info sink)
 Constructor from sink.
 

Detailed Description

Base settings for write_parquet() and parquet_chunked_writer.

Definition at line 623 of file parquet.hpp.

Constructor & Destructor Documentation

◆ parquet_writer_options_base() [1/2]

cudf::io::parquet_writer_options_base::parquet_writer_options_base ( sink_info  sink)
inline explicit protected

Constructor from sink.

Parameters
sink  The sink used for writer output

Definition at line 671 of file parquet.hpp.
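
The split between a public default constructor and a protected constructor from a sink can be illustrated with a small stand-alone sketch. Every name below (`sink_sketch`, `base_options_sketch`, `derived_options_sketch`, `forwarded_sink_path`) is hypothetical and stands in for the real cudf types; in particular, the shape of the derived class's constructor is invented for illustration and is not taken from parquet.hpp.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for sink_info: just a file path.
struct sink_sketch {
  std::string path;
};

// Hypothetical stand-in for the base options class.
class base_options_sketch {
 public:
  base_options_sketch() = default;  // public default constructor
  sink_sketch const& get_sink() const { return sink_; }

 protected:
  // Only derived classes may construct the base from a sink.
  explicit base_options_sketch(sink_sketch sink) : sink_(std::move(sink)) {}

 private:
  sink_sketch sink_;
};

// Hypothetical derived class that forwards a sink to the protected constructor.
class derived_options_sketch : public base_options_sketch {
 public:
  explicit derived_options_sketch(sink_sketch sink) : base_options_sketch(std::move(sink)) {}
};

// The base is default-constructible, but outside code cannot pass it a sink.
static_assert(std::is_default_constructible<base_options_sketch>::value,
              "base must be default-constructible");
static_assert(!std::is_constructible<base_options_sketch, sink_sketch>::value,
              "sink constructor must not be publicly accessible");
static_assert(std::is_constructible<derived_options_sketch, sink_sketch>::value,
              "derived class can be built from a sink");

// Small usage helper: returns the path stored through the derived class.
inline std::string forwarded_sink_path() {
  derived_options_sketch opts{sink_sketch{"out.parquet"}};
  assert(!opts.get_sink().path.empty());
  return opts.get_sink().path;
}
```

The `static_assert` lines encode the access rules at compile time, so the sketch fails to build if the sink constructor is accidentally made public.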

◆ parquet_writer_options_base() [2/2]

cudf::io::parquet_writer_options_base::parquet_writer_options_base ( )
default

Default constructor.

Provided because Cython requires a default constructor to create objects on the stack.

Member Function Documentation

◆ enable_int96_timestamps()

void cudf::io::parquet_writer_options_base::enable_int96_timestamps ( bool  req)

Sets timestamp writing preferences. INT96 timestamps will be written if true and TIMESTAMP_MICROS will be written if false.

Parameters
req  Boolean value to enable/disable writing of INT96 timestamps

◆ enable_utc_timestamps()

void cudf::io::parquet_writer_options_base::enable_utc_timestamps ( bool  val)

Sets preference for writing timestamps as UTC. Write timestamps as UTC if set to true.

Parameters
val  Boolean value to enable/disable writing of timestamps as UTC.

◆ enable_write_arrow_schema()

void cudf::io::parquet_writer_options_base::enable_write_arrow_schema ( bool  val)

Sets preference for writing arrow schema. Write arrow schema if set to true.

Parameters
val  Boolean value to enable/disable writing of arrow schema.

◆ enable_write_v2_headers()

void cudf::io::parquet_writer_options_base::enable_write_v2_headers ( bool  val)

Sets preference for V2 page headers. Write V2 page headers if set to true.

Parameters
val  Boolean value to enable/disable writing of V2 page headers.

◆ get_column_index_truncate_length()

auto cudf::io::parquet_writer_options_base::get_column_index_truncate_length ( ) const
inline

Returns maximum length of min or max values in column index, in bytes.

Returns
Length to which min/max values will be truncated

Definition at line 784 of file parquet.hpp.

◆ get_compression()

compression_type cudf::io::parquet_writer_options_base::get_compression ( ) const
inline

Returns compression format used.

Returns
Compression format

Definition at line 693 of file parquet.hpp.

◆ get_compression_statistics()

std::shared_ptr<writer_compression_statistics> cudf::io::parquet_writer_options_base::get_compression_statistics ( ) const
inline

Returns a shared pointer to the user-provided compression statistics.

Returns
Compression statistics

Definition at line 815 of file parquet.hpp.

◆ get_dictionary_policy()

dictionary_policy cudf::io::parquet_writer_options_base::get_dictionary_policy ( ) const
inline

Returns policy for dictionary use.

Returns
policy for dictionary use

Definition at line 794 of file parquet.hpp.

◆ get_key_value_metadata()

std::vector<std::map<std::string, std::string> > const& cudf::io::parquet_writer_options_base::get_key_value_metadata ( ) const
inline

Returns Key-Value footer metadata information.

Returns
Key-Value footer metadata information

Definition at line 714 of file parquet.hpp.

◆ get_max_dictionary_size()

auto cudf::io::parquet_writer_options_base::get_max_dictionary_size ( ) const
inline

Returns maximum dictionary size, in bytes.

Returns
Maximum dictionary size, in bytes.

Definition at line 801 of file parquet.hpp.

◆ get_max_page_fragment_size()

auto cudf::io::parquet_writer_options_base::get_max_page_fragment_size ( ) const
inline

Returns maximum page fragment size, in rows.

Returns
Maximum page fragment size, in rows.

Definition at line 808 of file parquet.hpp.

◆ get_max_page_size_bytes()

auto cudf::io::parquet_writer_options_base::get_max_page_size_bytes ( ) const
inline

Returns the maximum uncompressed page size, in bytes.

If set larger than the row group size, then this will return the row group size.

Returns
Maximum uncompressed page size, in bytes

Definition at line 762 of file parquet.hpp.

◆ get_max_page_size_rows()

auto cudf::io::parquet_writer_options_base::get_max_page_size_rows ( ) const
inline

Returns maximum page size, in rows.

If set larger than the row group size, then this will return the row group size.

Returns
Maximum page size, in rows

Definition at line 774 of file parquet.hpp.
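
The clamping described for get_max_page_size_bytes() and get_max_page_size_rows() can be sketched with a stand-alone class. This is a minimal illustration under stated assumptions, not the cudf implementation: the class name `page_limits_sketch`, the helper `clamped_page_bytes`, the default values, and the use of `std::int32_t` in place of `size_type` are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the options class; not the cudf implementation.
class page_limits_sketch {
 public:
  void set_row_group_size_bytes(std::size_t v) { row_group_size_bytes_ = v; }
  void set_row_group_size_rows(std::int32_t v) { row_group_size_rows_ = v; }
  void set_max_page_size_bytes(std::size_t v) { max_page_size_bytes_ = v; }
  void set_max_page_size_rows(std::int32_t v) { max_page_size_rows_ = v; }

  std::size_t get_row_group_size_bytes() const { return row_group_size_bytes_; }
  std::int32_t get_row_group_size_rows() const { return row_group_size_rows_; }

  // The getters clamp: a page limit larger than the row group limit
  // is reported as the row group limit.
  std::size_t get_max_page_size_bytes() const
  {
    return std::min(max_page_size_bytes_, row_group_size_bytes_);
  }
  std::int32_t get_max_page_size_rows() const
  {
    return std::min(max_page_size_rows_, row_group_size_rows_);
  }

 private:
  // Placeholder defaults chosen for this sketch only.
  std::size_t row_group_size_bytes_ = 1024 * 1024;
  std::int32_t row_group_size_rows_ = 1000;
  std::size_t max_page_size_bytes_  = 64 * 1024;
  std::int32_t max_page_size_rows_  = 100;
};

// Small usage helper: reports the effective page size in bytes.
inline std::size_t clamped_page_bytes(std::size_t row_group_bytes, std::size_t page_bytes)
{
  page_limits_sketch opts;
  opts.set_row_group_size_bytes(row_group_bytes);
  opts.set_max_page_size_bytes(page_bytes);
  assert(opts.get_max_page_size_bytes() <= opts.get_row_group_size_bytes());
  return opts.get_max_page_size_bytes();
}
```

In this sketch the clamp is applied when the value is read rather than when it is stored, so the result does not depend on the order in which the two setters are called.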

◆ get_metadata()

auto const& cudf::io::parquet_writer_options_base::get_metadata ( ) const
inline

Returns associated metadata.

Returns
Associated metadata

Definition at line 707 of file parquet.hpp.

◆ get_row_group_size_bytes()

auto cudf::io::parquet_writer_options_base::get_row_group_size_bytes ( ) const
inline

Returns maximum row group size, in bytes.

Returns
Maximum row group size, in bytes

Definition at line 746 of file parquet.hpp.

◆ get_row_group_size_rows()

auto cudf::io::parquet_writer_options_base::get_row_group_size_rows ( ) const
inline

Returns maximum row group size, in rows.

Returns
Maximum row group size, in rows

Definition at line 753 of file parquet.hpp.

◆ get_sink()

sink_info const& cudf::io::parquet_writer_options_base::get_sink ( ) const
inline

Returns sink info.

Returns
Sink info

Definition at line 686 of file parquet.hpp.

◆ get_sorting_columns()

auto const& cudf::io::parquet_writer_options_base::get_sorting_columns ( ) const
inline

Returns the sorting_columns.

Returns
Column sort order metadata

Definition at line 832 of file parquet.hpp.

◆ get_stats_level()

statistics_freq cudf::io::parquet_writer_options_base::get_stats_level ( ) const
inline

Returns level of statistics requested in output file.

Returns
level of statistics requested in output file

Definition at line 700 of file parquet.hpp.

◆ is_enabled_int96_timestamps()

bool cudf::io::parquet_writer_options_base::is_enabled_int96_timestamps ( ) const
inline

Returns true if timestamps will be written as INT96.

Returns
true if timestamps will be written as INT96

Definition at line 725 of file parquet.hpp.

◆ is_enabled_utc_timestamps()

auto cudf::io::parquet_writer_options_base::is_enabled_utc_timestamps ( ) const
inline

Returns true if timestamps will be written as UTC.

Returns
true if timestamps will be written as UTC

Definition at line 732 of file parquet.hpp.

◆ is_enabled_write_arrow_schema()

auto cudf::io::parquet_writer_options_base::is_enabled_write_arrow_schema ( ) const
inline

Returns true if arrow schema will be written.

Returns
true if arrow schema will be written

Definition at line 739 of file parquet.hpp.

◆ is_enabled_write_v2_headers()

auto cudf::io::parquet_writer_options_base::is_enabled_write_v2_headers ( ) const
inline

Returns true if V2 page headers should be written.

Returns
true if V2 page headers should be written.

Definition at line 825 of file parquet.hpp.

◆ set_column_index_truncate_length()

void cudf::io::parquet_writer_options_base::set_column_index_truncate_length ( int32_t  size_bytes)

Sets the maximum length of min or max values in column index, in bytes.

Parameters
size_bytes  Length to which min/max values will be truncated

◆ set_compression()

void cudf::io::parquet_writer_options_base::set_compression ( compression_type  compression)

Sets compression type.

Parameters
compression  The compression type to use

◆ set_compression_statistics()

void cudf::io::parquet_writer_options_base::set_compression_statistics ( std::shared_ptr< writer_compression_statistics >  comp_stats)

Sets the pointer to the output compression statistics.

Parameters
comp_stats  Pointer to compression statistics to be updated after writing
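
Because the statistics object is passed as a `std::shared_ptr` and is "updated after writing", the caller can keep its own copy of the pointer and read the results once the write has finished. The sketch below models that ownership pattern with stand-in types only. `stats_sketch`, its `bytes_compressed` field, `stats_options_sketch`, `write_sketch` and `run_stats_example` are hypothetical; the members of the real writer_compression_statistics class are not described on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stand-in for writer_compression_statistics.
struct stats_sketch {
  std::size_t bytes_compressed = 0;
};

// Hypothetical stand-in for the options object: it only stores the pointer.
class stats_options_sketch {
 public:
  void set_compression_statistics(std::shared_ptr<stats_sketch> comp_stats)
  {
    comp_stats_ = std::move(comp_stats);
  }
  std::shared_ptr<stats_sketch> get_compression_statistics() const { return comp_stats_; }

 private:
  std::shared_ptr<stats_sketch> comp_stats_;
};

// Hypothetical writer: updates the shared statistics object if one was provided.
inline void write_sketch(stats_options_sketch const& opts, std::size_t payload_bytes)
{
  if (auto stats = opts.get_compression_statistics()) { stats->bytes_compressed += payload_bytes; }
}

// Usage: the caller keeps `stats` and reads it after the writes complete.
inline std::size_t run_stats_example()
{
  auto stats = std::make_shared<stats_sketch>();
  stats_options_sketch opts;
  opts.set_compression_statistics(stats);
  write_sketch(opts, 4096);
  write_sketch(opts, 1024);
  assert(stats.use_count() == 2);  // one owner here, one inside opts
  return stats->bytes_compressed;
}
```

If no statistics pointer is set, the stand-in writer simply skips the update, which is why the getter's result is checked before use.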

◆ set_dictionary_policy()

void cudf::io::parquet_writer_options_base::set_dictionary_policy ( dictionary_policy  policy)

Sets the policy for dictionary use.

Parameters
policy  Policy for dictionary use

◆ set_key_value_metadata()

void cudf::io::parquet_writer_options_base::set_key_value_metadata ( std::vector< std::map< std::string, std::string >>  metadata)

Sets Key-Value footer metadata.

Parameters
metadata  Key-Value footer metadata
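
The argument type is a vector of string-to-string maps, which can be built with the standard library alone. The sketch below constructs such a value; the alias `kv_metadata`, the helpers `make_footer_metadata` and `footer_value`, and the keys and values are made up for illustration. How the writer interprets each element of the vector is not stated on this page, so the example uses a single map.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Same shape as the parameter of set_key_value_metadata().
using kv_metadata = std::vector<std::map<std::string, std::string>>;

// Builds a one-element vector; the keys and values are invented examples.
inline kv_metadata make_footer_metadata()
{
  std::map<std::string, std::string> entries;
  entries["created_by"]     = "example-pipeline";
  entries["schema_version"] = "3";
  return kv_metadata{entries};
}

// Looks up a key in the first map; returns an empty string when absent.
inline std::string footer_value(kv_metadata const& md, std::string const& key)
{
  assert(!md.empty());
  auto const it = md.front().find(key);
  return it == md.front().end() ? std::string{} : it->second;
}
```

A value built this way has the type expected by set_key_value_metadata() and returned by get_key_value_metadata().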

◆ set_max_dictionary_size()

void cudf::io::parquet_writer_options_base::set_max_dictionary_size ( size_t  size_bytes)

Sets the maximum dictionary size, in bytes.

Parameters
size_bytes  Maximum dictionary size, in bytes

◆ set_max_page_fragment_size()

void cudf::io::parquet_writer_options_base::set_max_page_fragment_size ( size_type  size_rows)

Sets the maximum page fragment size, in rows.

Parameters
size_rows  Maximum page fragment size, in rows.

◆ set_max_page_size_bytes()

void cudf::io::parquet_writer_options_base::set_max_page_size_bytes ( size_t  size_bytes)

Sets the maximum uncompressed page size, in bytes.

Parameters
size_bytes  Maximum uncompressed page size to set, in bytes

◆ set_max_page_size_rows()

void cudf::io::parquet_writer_options_base::set_max_page_size_rows ( size_type  size_rows)

Sets the maximum page size, in rows.

Parameters
size_rows  Maximum page size to set, in rows

◆ set_metadata()

void cudf::io::parquet_writer_options_base::set_metadata ( table_input_metadata  metadata)

Sets metadata.

Parameters
metadata  Associated metadata

◆ set_row_group_size_bytes()

void cudf::io::parquet_writer_options_base::set_row_group_size_bytes ( size_t  size_bytes)

Sets the maximum row group size, in bytes.

Parameters
size_bytes  Maximum row group size to set, in bytes

◆ set_row_group_size_rows()

void cudf::io::parquet_writer_options_base::set_row_group_size_rows ( size_type  size_rows)

Sets the maximum row group size, in rows.

Parameters
size_rows  Maximum row group size to set, in rows

◆ set_sorting_columns()

void cudf::io::parquet_writer_options_base::set_sorting_columns ( std::vector< sorting_column >  sorting_columns)

Sets sorting columns.

Parameters
sorting_columns  Column sort order metadata

◆ set_stats_level()

void cudf::io::parquet_writer_options_base::set_stats_level ( statistics_freq  sf)

Sets the level of statistics.

Parameters
sf  Level of statistics requested in the output file

The documentation for this class was generated from the following file: