Settings for read_parquet().

#include <parquet.hpp>

Public Member Functions
	parquet_reader_options ()=default
	Default constructor.

source_info const &	get_source () const
	Returns source info.

bool	is_enabled_convert_strings_to_categories () const
	Returns true/false depending on whether strings should be converted to categories or not.

bool	is_enabled_use_pandas_metadata () const
	Returns true/false depending on whether to use pandas metadata or not while reading.

std::optional< std::vector< reader_column_schema > >	get_column_schema () const
	Returns optional tree of metadata.

int64_t	get_skip_rows () const
	Returns number of rows to skip from the start.

std::optional< size_type > const &	get_num_rows () const
	Returns number of rows to read.

auto const &	get_columns () const
	Returns names of columns to be read, if set.

auto const &	get_row_groups () const
	Returns list of individual row groups to be read.

auto const &	get_filter () const
	Returns AST based filter for predicate pushdown.

data_type	get_timestamp_type () const
	Returns timestamp type used to cast timestamp columns.

void	set_columns (std::vector< std::string > col_names)
	Sets names of the columns to be read.

void	set_row_groups (std::vector< std::vector< size_type >> row_groups)
	Sets vector of individual row groups to read.

void	set_filter (ast::expression const &filter)
	Sets AST based filter for predicate pushdown.

void	enable_convert_strings_to_categories (bool val)
	Sets to enable/disable conversion of strings to categories.

void	enable_use_pandas_metadata (bool val)
	Sets to enable/disable use of pandas metadata to read.

void	set_column_schema (std::vector< reader_column_schema > val)
	Sets reader column schema.

void	set_skip_rows (int64_t val)
	Sets number of rows to skip.

void	set_num_rows (size_type val)
	Sets number of rows to read.

void	set_timestamp_type (data_type type)
	Sets timestamp_type used to cast timestamp columns.

Static Public Member Functions
static parquet_reader_options_builder	builder (source_info src)
	Creates a parquet_reader_options_builder which will build parquet_reader_options.

Detailed Description

Settings for read_parquet().

Definition at line 54 of file parquet.hpp.

Constructor & Destructor Documentation

◆ parquet_reader_options()

cudf::io::parquet_reader_options::parquet_reader_options ( )

explicit default

Default constructor.

This constructor exists because Cython requires a default constructor to create objects on the stack.

Member Function Documentation

◆ builder()

static parquet_reader_options_builder cudf::io::parquet_reader_options::builder ( source_info src )

static

Creates a parquet_reader_options_builder which will build parquet_reader_options.

Parameters

src	Source information for the Parquet file to read

Returns: Builder to build reader options
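
The builder follows the usual fluent pattern: each setter returns the builder so that calls can be chained, and a final step yields the finished options object. The sketch below is a host-only stand-in written against the standard library, not cudf's code. The names mock_reader_options and mock_reader_options_builder, the builder methods columns(), skip_rows() and build(), and the use of a plain string in place of source_info are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

class mock_reader_options_builder;  // hypothetical, defined below

// Hypothetical stand-in for an options class; not the cudf implementation.
class mock_reader_options {
  std::string _source;                               // stand-in for source_info
  std::optional<std::vector<std::string>> _columns;  // nullopt: option not set
  std::int64_t _skip_rows = 0;

 public:
  mock_reader_options() = default;
  explicit mock_reader_options(std::string src) : _source(std::move(src)) {}

  // Entry point mirroring the static builder() described above.
  static mock_reader_options_builder builder(std::string src);

  std::string const& get_source() const { return _source; }
  auto const& get_columns() const { return _columns; }
  std::int64_t get_skip_rows() const { return _skip_rows; }

  void set_columns(std::vector<std::string> col_names) { _columns = std::move(col_names); }
  void set_skip_rows(std::int64_t val) { _skip_rows = val; }
};

// Hypothetical builder: every setter returns *this so calls can be chained.
class mock_reader_options_builder {
  mock_reader_options _options;

 public:
  explicit mock_reader_options_builder(std::string src) : _options(std::move(src)) {}

  mock_reader_options_builder& columns(std::vector<std::string> col_names)
  {
    _options.set_columns(std::move(col_names));
    return *this;
  }

  mock_reader_options_builder& skip_rows(std::int64_t val)
  {
    _options.set_skip_rows(val);
    return *this;
  }

  // Hands back a copy of the accumulated options.
  mock_reader_options build() const { return _options; }
};

inline mock_reader_options_builder mock_reader_options::builder(std::string src)
{
  return mock_reader_options_builder{std::move(src)};
}
```

With these stand-ins, mock_reader_options::builder("example.parquet").columns({"a", "b"}).skip_rows(10).build() produces an options object whose getters report the chained values; options left untouched keep their defaults.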

◆ enable_convert_strings_to_categories()

void cudf::io::parquet_reader_options::enable_convert_strings_to_categories ( bool val )

inline

Sets to enable/disable conversion of strings to categories.

Parameters

val	Boolean value to enable/disable conversion of string columns to categories

Definition at line 208 of file parquet.hpp.

◆ enable_use_pandas_metadata()

void cudf::io::parquet_reader_options::enable_use_pandas_metadata ( bool val )

inline

Sets to enable/disable use of pandas metadata to read.

Parameters

val	Boolean value whether to use pandas metadata

Definition at line 215 of file parquet.hpp.

◆ get_column_schema()

std::optional<std::vector<reader_column_schema> > cudf::io::parquet_reader_options::get_column_schema ( ) const

inline

Returns optional tree of metadata.

Returns: Vector of reader_column_schema objects; nullopt if the option is not set

Definition at line 134 of file parquet.hpp.

◆ get_columns()

auto const& cudf::io::parquet_reader_options::get_columns ( ) const

inline

Returns names of columns to be read, if set.

Returns: Names of columns to be read; nullopt if the option is not set

Definition at line 159 of file parquet.hpp.

◆ get_filter()

auto const& cudf::io::parquet_reader_options::get_filter ( ) const

inline

Returns AST based filter for predicate pushdown.

Returns: AST expression to use as filter

Definition at line 173 of file parquet.hpp.

◆ get_num_rows()

std::optional<size_type> const& cudf::io::parquet_reader_options::get_num_rows ( ) const

inline

Returns number of rows to read.

Returns: Number of rows to read; nullopt if the option hasn't been set (in which case the file is read until the end)

Definition at line 152 of file parquet.hpp.
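
The skip count and the optional row count together describe a window of rows, with nullopt meaning "read until the end". The following sketch shows one way to turn the two values into a half-open row range. The helper selected_row_range is hypothetical, int32_t is assumed as a stand-in for size_type, and clamping out-of-range requests to the rows available is an assumption of this sketch rather than documented reader behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical helper: computes the half-open row range [first, last) selected by
// a skip count and an optional row count, for a source holding total_rows rows.
inline std::pair<std::int64_t, std::int64_t> selected_row_range(
  std::int64_t total_rows, std::int64_t skip_rows, std::optional<std::int32_t> const& num_rows)
{
  assert(total_rows >= 0 && skip_rows >= 0);
  assert(!num_rows.has_value() || *num_rows >= 0);

  // Skipping past the end selects nothing (assumption: clamp instead of failing).
  std::int64_t const first = std::min(skip_rows, total_rows);

  // nullopt means "read until the end"; otherwise clamp to the rows that remain.
  std::int64_t const remaining = total_rows - first;
  std::int64_t const count =
    num_rows.has_value() ? std::min<std::int64_t>(*num_rows, remaining) : remaining;

  return {first, first + count};
}
```

For a 100-row source, a skip of 10 with no row count selects rows [10, 100), while a skip of 10 with a row count of 25 selects rows [10, 35).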

◆ get_row_groups()

auto const& cudf::io::parquet_reader_options::get_row_groups ( ) const

inline

Returns list of individual row groups to be read.

Returns: List of individual row groups to be read

Definition at line 166 of file parquet.hpp.

◆ get_skip_rows()

int64_t cudf::io::parquet_reader_options::get_skip_rows ( ) const

inline

Returns number of rows to skip from the start.

Returns: Number of rows to skip from the start

Definition at line 144 of file parquet.hpp.

◆ get_source()

source_info const& cudf::io::parquet_reader_options::get_source ( ) const

inline

Returns source info.

Returns: Source info

Definition at line 109 of file parquet.hpp.

◆ get_timestamp_type()

data_type cudf::io::parquet_reader_options::get_timestamp_type ( ) const

inline

Returns timestamp type used to cast timestamp columns.

Returns: Timestamp type used to cast timestamp columns

Definition at line 180 of file parquet.hpp.

◆ is_enabled_convert_strings_to_categories()

bool cudf::io::parquet_reader_options::is_enabled_convert_strings_to_categories ( ) const

inline

Returns true/false depending on whether strings should be converted to categories or not.

Returns: true if strings should be converted to categories

Definition at line 117 of file parquet.hpp.

◆ is_enabled_use_pandas_metadata()

bool cudf::io::parquet_reader_options::is_enabled_use_pandas_metadata ( ) const

inline

Returns true/false depending on whether to use pandas metadata or not while reading.

Returns: true if pandas metadata is used while reading

Definition at line 127 of file parquet.hpp.

◆ set_column_schema()

void cudf::io::parquet_reader_options::set_column_schema ( std::vector< reader_column_schema > val )

inline

Sets reader column schema.

Parameters

val	Tree of schema nodes to enable/disable conversion of binary to string columns. Note that the default is to convert to string columns.

Definition at line 223 of file parquet.hpp.

◆ set_columns()

void cudf::io::parquet_reader_options::set_columns ( std::vector< std::string > col_names )

inline

Sets names of the columns to be read.

Parameters

col_names	Vector of column names

Definition at line 187 of file parquet.hpp.

◆ set_filter()

void cudf::io::parquet_reader_options::set_filter ( ast::expression const & filter )

inline

Sets AST based filter for predicate pushdown.

Parameters

filter	AST expression to use as filter

Definition at line 201 of file parquet.hpp.
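
Predicate pushdown generally means applying the filter as early as possible so that data which cannot match is never decoded. One common form is to skip whole row groups whose min/max statistics rule out the predicate. The sketch below illustrates that idea for a fixed predicate "value > threshold" on one integer column. It does not use ast::expression and makes no claim about how the reader evaluates filters; row_group_stats and prune_row_groups_greater_than are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-row-group statistics for a single integer column.
struct row_group_stats {
  std::int64_t min_value;
  std::int64_t max_value;
};

// Returns the indices of row groups that may contain a row with value > threshold.
// A group whose maximum is <= threshold cannot match, so it is skipped undecoded.
inline std::vector<std::size_t> prune_row_groups_greater_than(
  std::vector<row_group_stats> const& stats, std::int64_t threshold)
{
  std::vector<std::size_t> kept;
  for (std::size_t i = 0; i < stats.size(); ++i) {
    assert(stats[i].min_value <= stats[i].max_value);
    if (stats[i].max_value > threshold) { kept.push_back(i); }
  }
  return kept;
}
```

Rows inside the kept groups still have to be checked individually, since the statistics only show that a match is possible.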

◆ set_num_rows()

void cudf::io::parquet_reader_options::set_num_rows ( size_type val )

Sets number of rows to read.

Parameters

val	Number of rows to read after skip

◆ set_row_groups()

void cudf::io::parquet_reader_options::set_row_groups ( std::vector< std::vector< size_type >> row_groups )

Sets vector of individual row groups to read.

Parameters

row_groups	Vector of row groups to read
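
The parameter is a vector of vectors. The sketch below assumes, which this page does not state, that the outer vector holds one entry per input source and that each inner vector lists the row group indices wanted from that source. Under that assumption it flattens the selection into (source index, row group index) pairs. The helper flatten_row_groups is hypothetical, and size_type is aliased to int32_t as a stand-in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using size_type = std::int32_t;  // assumption: stand-in for the library's size_type

// Flattens a nested selection into (source index, row group index) pairs,
// assuming row_groups[i] lists the row groups wanted from the i-th source.
inline std::vector<std::pair<std::size_t, size_type>> flatten_row_groups(
  std::vector<std::vector<size_type>> const& row_groups)
{
  std::vector<std::pair<std::size_t, size_type>> out;
  for (std::size_t src = 0; src < row_groups.size(); ++src) {
    for (size_type rg : row_groups[src]) {
      assert(rg >= 0);  // row group indices are non-negative
      out.emplace_back(src, rg);
    }
  }
  return out;
}
```

An empty inner vector contributes no pairs, so under this reading a source can be listed without selecting any of its row groups.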

◆ set_skip_rows()

void cudf::io::parquet_reader_options::set_skip_rows ( int64_t val )

Sets number of rows to skip.

Parameters

val	Number of rows to skip from start

◆ set_timestamp_type()

void cudf::io::parquet_reader_options::set_timestamp_type ( data_type type )

inline

Sets timestamp_type used to cast timestamp columns.

Parameters

type	The timestamp data_type to which all timestamp columns need to be cast

Definition at line 247 of file parquet.hpp.
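
Casting every timestamp column to one type amounts to converting each stored value to a common time unit. As a host-side analogy only, the sketch below converts raw microsecond counts to milliseconds with std::chrono. The helper cast_us_to_ms is hypothetical, and nothing here reflects how the reader performs the cast.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Converts raw microsecond timestamps to millisecond timestamps.
// duration_cast truncates toward zero, so sub-millisecond precision is dropped.
inline std::vector<std::int64_t> cast_us_to_ms(std::vector<std::int64_t> const& us_values)
{
  std::vector<std::int64_t> out;
  out.reserve(us_values.size());
  for (auto v : us_values) {
    auto const ms =
      std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::microseconds{v});
    out.push_back(ms.count());
  }
  return out;
}
```

Casting to a coarser unit loses precision, as the truncation here shows, so the target type is best chosen with the finest unit present in the data in mind.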

The documentation for this class was generated from the following file:

parquet.hpp
