cudf::io::parquet_writer_options Class Reference

Settings for write_parquet(). More...

#include <parquet.hpp>

Inheritance diagram for cudf::io::parquet_writer_options:
cudf::io::parquet_writer_options_base

Public Member Functions

 parquet_writer_options ()=default
 Default constructor. More...
 
table_view get_table () const
 Returns table_view. More...
 
std::vector< partition_info > const & get_partitions () const
 Returns partitions. More...
 
std::vector< std::string > const & get_column_chunks_file_paths () const
 Returns the column chunks file paths to be set in the raw output metadata. More...
 
void set_partitions (std::vector< partition_info > partitions)
 Sets partitions. More...
 
void set_column_chunks_file_paths (std::vector< std::string > file_paths)
 Sets the column chunks file paths to be set in the raw output metadata. More...
 
- Public Member Functions inherited from cudf::io::parquet_writer_options_base
 parquet_writer_options_base ()=default
 Default constructor. More...
 
sink_info const & get_sink () const
 Returns sink info. More...
 
compression_type get_compression () const
 Returns compression format used. More...
 
statistics_freq get_stats_level () const
 Returns level of statistics requested in output file. More...
 
auto const & get_metadata () const
 Returns associated metadata. More...
 
std::vector< std::map< std::string, std::string > > const & get_key_value_metadata () const
 Returns Key-Value footer metadata information. More...
 
bool is_enabled_int96_timestamps () const
 Returns true if timestamps will be written as INT96. More...
 
auto is_enabled_utc_timestamps () const
 Returns true if timestamps will be written as UTC. More...
 
auto is_enabled_write_arrow_schema () const
 Returns true if arrow schema will be written. More...
 
auto get_row_group_size_bytes () const
 Returns maximum row group size, in bytes. More...
 
auto get_row_group_size_rows () const
 Returns maximum row group size, in rows. More...
 
auto get_max_page_size_bytes () const
 Returns the maximum uncompressed page size, in bytes. More...
 
auto get_max_page_size_rows () const
 Returns maximum page size, in rows. More...
 
auto get_column_index_truncate_length () const
 Returns maximum length of min or max values in column index, in bytes. More...
 
dictionary_policy get_dictionary_policy () const
 Returns policy for dictionary use. More...
 
auto get_max_dictionary_size () const
 Returns maximum dictionary size, in bytes. More...
 
auto get_max_page_fragment_size () const
 Returns maximum page fragment size, in rows. More...
 
std::shared_ptr< writer_compression_statistics > get_compression_statistics () const
 Returns a shared pointer to the user-provided compression statistics. More...
 
auto is_enabled_write_v2_headers () const
 Returns true if V2 page headers should be written. More...
 
auto const & get_sorting_columns () const
 Returns the sorting_columns. More...
 
void set_metadata (table_input_metadata metadata)
 Sets metadata. More...
 
void set_key_value_metadata (std::vector< std::map< std::string, std::string >> metadata)
 Sets key-value footer metadata. More...
 
void set_stats_level (statistics_freq sf)
 Sets the level of statistics. More...
 
void set_compression (compression_type compression)
 Sets compression type. More...
 
void enable_int96_timestamps (bool req)
 Sets timestamp writing preferences. INT96 timestamps will be written if true and TIMESTAMP_MICROS will be written if false. More...
 
void enable_utc_timestamps (bool val)
 Sets preference for writing timestamps as UTC. Write timestamps as UTC if set to true. More...
 
void enable_write_arrow_schema (bool val)
 Sets preference for writing arrow schema. Write arrow schema if set to true. More...
 
void set_row_group_size_bytes (size_t size_bytes)
 Sets the maximum row group size, in bytes. More...
 
void set_row_group_size_rows (size_type size_rows)
 Sets the maximum row group size, in rows. More...
 
void set_max_page_size_bytes (size_t size_bytes)
 Sets the maximum uncompressed page size, in bytes. More...
 
void set_max_page_size_rows (size_type size_rows)
 Sets the maximum page size, in rows. More...
 
void set_column_index_truncate_length (int32_t size_bytes)
 Sets the maximum length of min or max values in column index, in bytes. More...
 
void set_dictionary_policy (dictionary_policy policy)
 Sets the policy for dictionary use. More...
 
void set_max_dictionary_size (size_t size_bytes)
 Sets the maximum dictionary size, in bytes. More...
 
void set_max_page_fragment_size (size_type size_rows)
 Sets the maximum page fragment size, in rows. More...
 
void set_compression_statistics (std::shared_ptr< writer_compression_statistics > comp_stats)
 Sets the pointer to the output compression statistics. More...
 
void enable_write_v2_headers (bool val)
 Sets preference for V2 page headers. Write V2 page headers if set to true. More...
 
void set_sorting_columns (std::vector< sorting_column > sorting_columns)
 Sets sorting columns. More...
 

Static Public Member Functions

static parquet_writer_options_builder builder (sink_info const &sink, table_view const &table)
 Create builder to create parquet_writer_options. More...
 
static parquet_writer_options_builder builder ()
 Create builder to create parquet_writer_options. More...
 

Additional Inherited Members

- Protected Member Functions inherited from cudf::io::parquet_writer_options_base
 parquet_writer_options_base (sink_info sink)
 Constructor from sink. More...
 

Detailed Description

Settings for write_parquet().

Definition at line 1187 of file parquet.hpp.

Constructor & Destructor Documentation

◆ parquet_writer_options()

cudf::io::parquet_writer_options::parquet_writer_options ( )
default

Default constructor.

This constructor exists because Cython requires a default constructor to create objects on the stack.

Member Function Documentation

◆ builder() [1/2]

static parquet_writer_options_builder cudf::io::parquet_writer_options::builder ( )
static

Create builder to create parquet_writer_options.

Returns
parquet_writer_options_builder

◆ builder() [2/2]

static parquet_writer_options_builder cudf::io::parquet_writer_options::builder ( sink_info const &  sink,
table_view const &  table 
)
static

Create builder to create parquet_writer_options.

Parameters
sink    The sink used for writer output
table   Table to be written to output
Returns
Builder to build parquet_writer_options

◆ get_column_chunks_file_paths()

std::vector<std::string> const& cudf::io::parquet_writer_options::get_column_chunks_file_paths ( ) const
inline

Returns the column chunks file paths to be set in the raw output metadata.

Returns
Column chunks file paths to be set in the raw output metadata

Definition at line 1249 of file parquet.hpp.

◆ get_partitions()

std::vector<partition_info> const& cudf::io::parquet_writer_options::get_partitions ( ) const
inline

Returns partitions.

Returns
Partitions

Definition at line 1242 of file parquet.hpp.

◆ get_table()

table_view cudf::io::parquet_writer_options::get_table ( ) const
inline

Returns table_view.

Returns
Table view

Definition at line 1235 of file parquet.hpp.

◆ set_column_chunks_file_paths()

void cudf::io::parquet_writer_options::set_column_chunks_file_paths ( std::vector< std::string >  file_paths)

Sets the column chunks file paths to be set in the raw output metadata.

Parameters
file_paths   Vector of strings indicating the file paths. Must be the same size as the number of data sinks in sink info
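
The size requirement above can be checked before calling the setter. The following is a minimal sketch that uses only the standard library; `checked_file_paths` is a hypothetical helper, not part of cudf, and the caller is assumed to already know the number of data sinks.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pre-check (not a cudf function): verifies that there is exactly
// one file path per data sink and returns the paths unchanged if so.
std::vector<std::string> checked_file_paths(std::vector<std::string> file_paths,
                                            std::size_t num_sinks)
{
  if (file_paths.size() != num_sinks) {
    throw std::invalid_argument("expected one file path per data sink");
  }
  return std::move(file_paths);
}
```

Under these assumptions, the returned vector could then be passed to set_column_chunks_file_paths().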

◆ set_partitions()

void cudf::io::parquet_writer_options::set_partitions ( std::vector< partition_info >  partitions)

Sets partitions.

Parameters
partitions   Partitions of input table in {start_row, num_rows} pairs. If specified, must be same size as number of sinks in sink_info
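
To illustrate the {start_row, num_rows} layout, here is a minimal sketch that splits a row count into one contiguous partition per sink. It is standard C++ only: `row_partition` is a hypothetical stand-in for partition_info, and `make_even_partitions` is an illustrative helper, not a cudf function. The even split is one possible policy; any partitioning with one entry per sink would satisfy the size requirement.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a {start_row, num_rows} pair.
struct row_partition {
  int start_row;
  int num_rows;
};

// Splits `total_rows` rows into `num_sinks` contiguous partitions, one per sink.
// The first (total_rows % num_sinks) partitions get one extra row, so partition
// sizes differ by at most one.
std::vector<row_partition> make_even_partitions(int total_rows, int num_sinks)
{
  if (total_rows < 0 || num_sinks <= 0) {
    throw std::invalid_argument("total_rows must be >= 0 and num_sinks must be > 0");
  }
  std::vector<row_partition> parts;
  parts.reserve(static_cast<std::size_t>(num_sinks));
  int const base  = total_rows / num_sinks;
  int const extra = total_rows % num_sinks;
  int start       = 0;
  for (int i = 0; i < num_sinks; ++i) {
    int const n = base + (i < extra ? 1 : 0);
    parts.push_back({start, n});
    start += n;
  }
  return parts;
}
```

The result always has exactly `num_sinks` entries, which matches the requirement that the partition count equal the number of sinks.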

The documentation for this class was generated from the following file: