cudf::io::json_reader_options Class Reference

Input arguments to the read_json interface.

#include <json.hpp>

Public Member Functions

 json_reader_options ()=default
 Default constructor.

source_info const & get_source () const
 Returns source info.

std::variant< std::vector< data_type >, std::map< std::string, data_type >, std::map< std::string, schema_element > > const & get_dtypes () const
 Returns data types of the columns.

compression_type get_compression () const
 Returns compression format of the source.

size_t get_byte_range_offset () const
 Returns number of bytes to skip from source start.

size_t get_byte_range_size () const
 Returns number of bytes to read.

size_t get_byte_range_size_with_padding () const
 Returns number of bytes to read with padding.

size_t get_byte_range_padding () const
 Returns number of bytes to pad when reading.

char get_delimiter () const
 Returns delimiter separating records in JSON lines.

bool is_enabled_lines () const
 Whether to read the file as one JSON object per line.

bool is_enabled_mixed_types_as_string () const
 Whether to parse mixed types as a string column.

bool is_enabled_prune_columns () const
 Whether to prune columns on read, selected based on the set_dtypes option.

bool is_enabled_experimental () const
 Whether to enable experimental features.

bool is_enabled_dayfirst () const
 Whether to parse dates as DD/MM versus MM/DD.

bool is_enabled_keep_quotes () const
 Whether the reader should keep quotes of string values.

bool is_enabled_normalize_single_quotes () const
 Whether the reader should normalize single quotes around strings.

bool is_enabled_normalize_whitespace () const
 Whether the reader should normalize unquoted whitespace characters.

json_recovery_mode_t recovery_mode () const
 Queries the JSON reader's behavior on invalid JSON lines.

bool is_strict_validation () const
 Whether JSON validation should be enforced strictly or not.

bool is_allowed_numeric_leading_zeros () const
 Whether leading zeros are allowed in numeric values.

bool is_allowed_nonnumeric_numbers () const
 Whether the unquoted values NaN, +INF, -INF, +Infinity, Infinity, and -Infinity are allowed as numbers.

bool is_allowed_unquoted_control_chars () const
 Whether characters with values from 0 through 31 are allowed inside a quoted string without some form of escaping.

std::vector< std::string > const & get_na_values () const
 Returns additional values to recognize as null values.

void set_dtypes (std::vector< data_type > types)
 Set data types for columns to be read.

void set_dtypes (std::map< std::string, data_type > types)
 Set data types for columns to be read.

void set_dtypes (std::map< std::string, schema_element > types)
 Set data types for a potentially nested column hierarchy.

void set_compression (compression_type comp_type)
 Set the compression type.

void set_byte_range_offset (size_t offset)
 Set number of bytes to skip from source start.

void set_byte_range_size (size_t size)
 Set number of bytes to read.

void set_delimiter (char delimiter)
 Set delimiter separating records in JSON lines.

void enable_lines (bool val)
 Set whether to read the file as one JSON object per line.

void enable_mixed_types_as_string (bool val)
 Set whether to parse mixed types as a string column. Also allows a struct to be read as a string column via the schema.

void enable_prune_columns (bool val)
 Set whether to prune columns on read, selected based on the set_dtypes option.

void enable_experimental (bool val)
 Set whether to enable experimental features.

void enable_dayfirst (bool val)
 Set whether to parse dates as DD/MM versus MM/DD.

void enable_keep_quotes (bool val)
 Set whether the reader should keep quotes of string values.

void enable_normalize_single_quotes (bool val)
 Set whether the reader should enable normalization of single quotes around strings.

void enable_normalize_whitespace (bool val)
 Set whether the reader should enable normalization of unquoted whitespace.

void set_recovery_mode (json_recovery_mode_t val)
 Specifies the JSON reader's behavior on invalid JSON lines.

void set_strict_validation (bool val)
 Set whether strict validation is enabled or not.

void allow_numeric_leading_zeros (bool val)
 Set whether leading zeros are allowed in numeric values. Strict validation must be enabled for this to work.

void allow_nonnumeric_numbers (bool val)
 Set whether the unquoted values NaN, +INF, -INF, +Infinity, Infinity, and -Infinity are allowed as numbers. Strict validation must be enabled for this to work.

void allow_unquoted_control_chars (bool val)
 Set whether characters with values from 0 through 31 are allowed inside a quoted string without some form of escaping. Strict validation must be enabled for this to work.

void set_na_values (std::vector< std::string > vals)
 Sets additional values to recognize as null values.
 

Static Public Member Functions

static json_reader_options_builder builder (source_info src)
 Creates a json_reader_options_builder, which will build json_reader_options.
 

Detailed Description

Input arguments to the read_json interface.

Available parameters are closely patterned after the pandas read_json API. Not all parameters are supported. If the matching pandas parameter has a default value of None, then a default value of -1 or 0 may be used as the equivalent.

pandas parameters that are unavailable or behave differently in cudf:

Name                Description
orient              currently fixed-format
typ                 data is always returned as a cudf::table
convert_axes        use column functions for axes operations instead
convert_dates       dates are detected automatically
keep_default_dates  dates are detected automatically
numpy               data is always returned as a cudf::table
precise_float       there is only one converter
date_unit           only millisecond units are supported
encoding            only ASCII-encoded data is supported
chunksize           use byte_range_xxx for chunking instead

Definition at line 89 of file io/json.hpp.
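
The chunksize entry above points to byte ranges as the replacement for chunked reads. A minimal sketch of how a caller might split a source of known size into (offset, size) pairs follows. The names byte_range and make_byte_ranges are hypothetical helpers for illustration and are not part of cudf; how the reader treats records that straddle a range boundary is not described in this section.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of cudf): one pair per read_json call.
struct byte_range {
  std::size_t offset;  // candidate argument for set_byte_range_offset
  std::size_t size;    // candidate argument for set_byte_range_size
};

// Splits [0, total_bytes) into consecutive ranges of at most chunk_bytes each.
inline std::vector<byte_range> make_byte_ranges(std::size_t total_bytes, std::size_t chunk_bytes)
{
  std::vector<byte_range> ranges;
  if (chunk_bytes == 0) { return ranges; }  // nothing sensible to produce
  for (std::size_t offset = 0; offset < total_bytes; offset += chunk_bytes) {
    // The last range is shortened so it does not run past the end of the source.
    ranges.push_back({offset, std::min(chunk_bytes, total_bytes - offset)});
  }
  return ranges;
}
```

Each resulting pair could then be passed to set_byte_range_offset and set_byte_range_size on its own options object.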

Constructor & Destructor Documentation

◆ json_reader_options()

cudf::io::json_reader_options::json_reader_options ( )
default

Default constructor.

This has been added because Cython requires a default constructor to create objects on the stack.

Member Function Documentation

◆ allow_nonnumeric_numbers()

void cudf::io::json_reader_options::allow_nonnumeric_numbers ( bool  val)
inline

Set whether the unquoted values NaN, +INF, -INF, +Infinity, Infinity, and -Infinity are allowed as numbers. Strict validation must be enabled for this to work.

Exceptions
cudf::logic_error  if strict_validation is not enabled before setting this option.
Parameters
val  Boolean value to indicate whether the unquoted non-numeric values listed above are allowed

Definition at line 544 of file io/json.hpp.
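
To make the affected tokens concrete, here is a small stand-alone sketch that recognizes exactly the six spellings listed above. The function name is hypothetical, and the exact, case-sensitive comparison is an assumption of this sketch rather than documented cudf behavior.

```cpp
#include <array>
#include <string_view>

// Illustrative only (not cudf code): true when the token is one of the six
// non-numeric spellings named in the documentation above.
inline bool is_nonnumeric_number_token(std::string_view token)
{
  constexpr std::array<std::string_view, 6> spellings{
    "NaN", "+INF", "-INF", "+Infinity", "Infinity", "-Infinity"};
  for (auto const spelling : spellings) {
    if (spelling == token) { return true; }  // assumption: exact, case-sensitive match
  }
  return false;
}
```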

◆ allow_numeric_leading_zeros()

void cudf::io::json_reader_options::allow_numeric_leading_zeros ( bool  val)
inline

Set whether leading zeros are allowed in numeric values. Strict validation must be enabled for this to work.

Exceptions
cudf::logic_error  if strict_validation is not enabled before setting this option.
Parameters
val  Boolean value to indicate whether leading zeros are allowed in numeric values

Definition at line 530 of file io/json.hpp.
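
The ordering requirement (enable strict validation first, then set the allow_* option) can be modeled with a small stand-in class. This is not cudf code: validation_options_sketch is a hypothetical type, std::logic_error stands in for cudf::logic_error, and the default member values are placeholders rather than the library's documented defaults.

```cpp
#include <stdexcept>

// Stand-in that mirrors only the documented ordering rule.
class validation_options_sketch {
 public:
  void set_strict_validation(bool val) { strict_ = val; }

  void allow_numeric_leading_zeros(bool val)
  {
    // The documentation states that an exception is thrown when strict
    // validation has not been enabled before this call.
    if (!strict_) { throw std::logic_error("strict validation must be enabled first"); }
    leading_zeros_ = val;
  }

  bool is_strict_validation() const { return strict_; }
  bool is_allowed_numeric_leading_zeros() const { return leading_zeros_; }

 private:
  bool strict_{false};         // placeholder default (assumption)
  bool leading_zeros_{false};  // placeholder default (assumption)
};
```

With this model, calling allow_numeric_leading_zeros on a fresh object throws, while calling set_strict_validation(true) first lets the same call succeed.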

◆ allow_unquoted_control_chars()

void cudf::io::json_reader_options::allow_unquoted_control_chars ( bool  val)
inline

Set whether characters with values from 0 through 31 are allowed inside a quoted string without some form of escaping. Strict validation must be enabled for this to work.

Exceptions
cudf::logic_error  if strict_validation is not enabled before setting this option.
Parameters
val  true to allow unquoted control characters, false to disallow them.

Definition at line 559 of file io/json.hpp.
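
As an illustration of which inputs this option concerns, the following sketch reports whether the contents of a string contain a raw character in the range 0 through 31. The function name is hypothetical and the check is a stand-alone approximation, not the reader's actual validation code.

```cpp
#include <string_view>

// Illustrative only: true when the text contains a raw character with a
// value below 32, such as a literal tab or newline.
inline bool has_raw_control_char(std::string_view contents)
{
  for (unsigned char const c : contents) {
    if (c < 32) { return true; }
  }
  return false;
}
```

A string holding a literal tab byte is flagged, while the two-character escape sequence backslash followed by t is not.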

◆ builder()

static json_reader_options_builder cudf::io::json_reader_options::builder ( source_info  src)
static

Creates a json_reader_options_builder, which will build json_reader_options.

Parameters
src  Source information used to read the JSON file
Returns
Builder to build the options

◆ enable_dayfirst()

void cudf::io::json_reader_options::enable_dayfirst ( bool  val)
inline

Set whether to parse dates as DD/MM versus MM/DD.

Parameters
val  Boolean value to enable/disable day first parsing format

Definition at line 482 of file io/json.hpp.
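
The difference between the two settings comes down to which of the first two date fields is read as the day. A minimal sketch with hypothetical names (day_month, interpret_date_fields), not taken from cudf:

```cpp
// Illustrative only: the two leading numeric fields of a date such as "03/04".
struct day_month {
  int day;
  int month;
};

// dayfirst == true reads the fields as DD/MM, false reads them as MM/DD.
inline day_month interpret_date_fields(int first, int second, bool dayfirst)
{
  return dayfirst ? day_month{first, second} : day_month{second, first};
}
```

For the fields 3 and 4, day-first parsing gives the 3rd of April, while month-first parsing gives the 4th of March.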

◆ enable_experimental()

void cudf::io::json_reader_options::enable_experimental ( bool  val)
inline

Set whether to enable experimental features.

When set to true, experimental features, such as the new column tree construction and UTF-8 matching of field names, will be enabled.

Parameters
val  Boolean value to enable/disable experimental features

Definition at line 475 of file io/json.hpp.

◆ enable_keep_quotes()

void cudf::io::json_reader_options::enable_keep_quotes ( bool  val)
inline

Set whether the reader should keep quotes of string values.

Parameters
val  Boolean value to indicate whether the reader should keep quotes of string values

Definition at line 490 of file io/json.hpp.
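
A small sketch of the two outcomes for a single quoted token may help. The function name is hypothetical, and the sketch only models the visible result for one value; it does not reflect how the reader is implemented.

```cpp
#include <string>
#include <string_view>

// Illustrative only: returns the token unchanged when quotes are kept,
// otherwise strips one pair of surrounding double quotes if present.
inline std::string string_column_value(std::string_view token, bool keep_quotes)
{
  bool const quoted = token.size() >= 2 && token.front() == '"' && token.back() == '"';
  if (keep_quotes || !quoted) { return std::string{token}; }
  return std::string{token.substr(1, token.size() - 2)};
}
```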

◆ enable_lines()

void cudf::io::json_reader_options::enable_lines ( bool  val)
inline

Set whether to read the file as a json object per line.

Parameters
val  Boolean value to enable/disable the option to read each line as a JSON object

Definition at line 446 of file io/json.hpp.

◆ enable_mixed_types_as_string()

void cudf::io::json_reader_options::enable_mixed_types_as_string ( bool  val)
inline

Set whether to parse mixed types as a string column. Also allows a struct to be read as a string column via the schema.

Parameters
val  Boolean value to enable/disable parsing mixed types as a string column

Definition at line 454 of file io/json.hpp.

◆ enable_normalize_single_quotes()

void cudf::io::json_reader_options::enable_normalize_single_quotes ( bool  val)
inline

Set whether the reader should enable normalization of single quotes around strings.

Parameters
val  Boolean value to indicate whether the reader should normalize single quotes around strings

Definition at line 498 of file io/json.hpp.

◆ enable_normalize_whitespace()

void cudf::io::json_reader_options::enable_normalize_whitespace ( bool  val)
inline

Set whether the reader should enable normalization of unquoted whitespace.

Parameters
val  Boolean value to indicate whether the reader should normalize unquoted whitespace characters, i.e. tabs and spaces

Definition at line 506 of file io/json.hpp.

◆ enable_prune_columns()

void cudf::io::json_reader_options::enable_prune_columns ( bool  val)
inline

Set whether to prune columns on read, selected based on the set_dtypes option.

When set as true, if the reader options include set_dtypes, then the reader will only return those columns which are mentioned in set_dtypes. If false, then all columns are returned, independent of the set_dtypes setting.

Parameters
val  Boolean value to enable/disable column pruning

Definition at line 465 of file io/json.hpp.
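
The pruning rule described above can be modeled on column names alone. In this sketch, columns_returned is a hypothetical helper, the int values stand in for data types, and treating an empty map as "set_dtypes was not called" is an assumption of the sketch.

```cpp
#include <map>
#include <string>
#include <vector>

// Illustrative only: which input columns survive, given the dtypes map and
// the prune setting.
inline std::vector<std::string> columns_returned(std::vector<std::string> const& input_columns,
                                                 std::map<std::string, int> const& dtypes,
                                                 bool prune_columns)
{
  // Without pruning (or without any dtypes) every column is returned.
  if (!prune_columns || dtypes.empty()) { return input_columns; }
  std::vector<std::string> kept;
  for (auto const& name : input_columns) {
    if (dtypes.count(name) != 0) { kept.push_back(name); }  // keep only mentioned columns
  }
  return kept;
}
```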

◆ get_byte_range_offset()

size_t cudf::io::json_reader_options::get_byte_range_offset ( ) const
inline

Returns number of bytes to skip from source start.

Returns
Number of bytes to skip from source start

Definition at line 204 of file io/json.hpp.

◆ get_byte_range_padding()

size_t cudf::io::json_reader_options::get_byte_range_padding ( ) const
inline

Returns number of bytes to pad when reading.

Returns
Number of bytes to pad

Definition at line 232 of file io/json.hpp.

◆ get_byte_range_size()

size_t cudf::io::json_reader_options::get_byte_range_size ( ) const
inline

Returns number of bytes to read.

Returns
Number of bytes to read

Definition at line 211 of file io/json.hpp.

◆ get_byte_range_size_with_padding()

size_t cudf::io::json_reader_options::get_byte_range_size_with_padding ( ) const
inline

Returns number of bytes to read with padding.

Returns
Number of bytes to read with padding

Definition at line 218 of file io/json.hpp.

◆ get_compression()

compression_type cudf::io::json_reader_options::get_compression ( ) const
inline

Returns compression format of the source.

Returns
Compression format of the source

Definition at line 197 of file io/json.hpp.

◆ get_delimiter()

char cudf::io::json_reader_options::get_delimiter ( ) const
inline

Returns delimiter separating records in JSON lines.

Returns
Delimiter separating records in JSON lines

Definition at line 254 of file io/json.hpp.

◆ get_dtypes()

std::variant<std::vector<data_type>, std::map<std::string, data_type>, std::map<std::string, schema_element> > const& cudf::io::json_reader_options::get_dtypes ( ) const
inline

Returns data types of the columns.

Returns
Data types of the columns

Definition at line 187 of file io/json.hpp.
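
Because the return type is a std::variant with three alternatives, callers typically need std::visit or std::holds_alternative to inspect it. The sketch below uses simplified stand-ins for data_type and schema_element so that it compiles without cudf; the stand-in definitions and the helper names are assumptions for illustration only.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Simplified stand-ins (not the cudf definitions).
struct data_type {
  int id;
};
struct schema_element {
  data_type type;
};

using dtypes_variant = std::variant<std::vector<data_type>,
                                    std::map<std::string, data_type>,
                                    std::map<std::string, schema_element>>;

// Number of top-level entries, whichever alternative is held.
inline std::size_t num_dtype_entries(dtypes_variant const& dtypes)
{
  return std::visit([](auto const& container) { return container.size(); }, dtypes);
}

// Names the alternative currently held, matching the three set_dtypes overloads.
inline std::string dtypes_kind(dtypes_variant const& dtypes)
{
  if (std::holds_alternative<std::vector<data_type>>(dtypes)) { return "vector"; }
  if (std::holds_alternative<std::map<std::string, data_type>>(dtypes)) { return "name map"; }
  return "nested schema map";
}
```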

◆ get_na_values()

std::vector<std::string> const& cudf::io::json_reader_options::get_na_values ( ) const
inline

Returns additional values to recognize as null values.

Returns
Additional values to recognize as null values

Definition at line 373 of file io/json.hpp.

◆ get_source()

source_info const& cudf::io::json_reader_options::get_source ( ) const
inline

Returns source info.

Returns
Source info

Definition at line 177 of file io/json.hpp.

◆ is_allowed_nonnumeric_numbers()

bool cudf::io::json_reader_options::is_allowed_nonnumeric_numbers ( ) const
inline

Whether the unquoted values NaN, +INF, -INF, +Infinity, Infinity, and -Infinity are allowed as numbers.

Note
This validation is enforced only if strict validation is enabled.
Returns
true if the unquoted non-numeric values listed above are allowed

Definition at line 353 of file io/json.hpp.

◆ is_allowed_numeric_leading_zeros()

bool cudf::io::json_reader_options::is_allowed_numeric_leading_zeros ( ) const
inline

Whether leading zeros are allowed in numeric values.

Note
This validation is enforced only if strict validation is enabled.
Returns
true if leading zeros are allowed in numeric values

Definition at line 340 of file io/json.hpp.

◆ is_allowed_unquoted_control_chars()

bool cudf::io::json_reader_options::is_allowed_unquoted_control_chars ( ) const
inline

Whether characters with values from 0 through 31 are allowed inside a quoted string without some form of escaping.

Note
This validation is enforced only if strict validation is enabled.
Returns
true if unquoted control chars are allowed.

Definition at line 363 of file io/json.hpp.

◆ is_enabled_dayfirst()

bool cudf::io::json_reader_options::is_enabled_dayfirst ( ) const
inline

Whether to parse dates as DD/MM versus MM/DD.

Returns
true if dates are parsed as DD/MM, false if MM/DD

Definition at line 296 of file io/json.hpp.

◆ is_enabled_experimental()

bool cudf::io::json_reader_options::is_enabled_experimental ( ) const
inline

Whether to enable experimental features.

When set to true, experimental features, such as the new column tree construction and UTF-8 matching of field names, will be enabled.

Returns
true if experimental features are enabled

Definition at line 289 of file io/json.hpp.

◆ is_enabled_keep_quotes()

bool cudf::io::json_reader_options::is_enabled_keep_quotes ( ) const
inline

Whether the reader should keep quotes of string values.

Returns
true if the reader should keep quotes, false otherwise

Definition at line 303 of file io/json.hpp.

◆ is_enabled_lines()

bool cudf::io::json_reader_options::is_enabled_lines ( ) const
inline

Whether to read the file as one JSON object per line.

Returns
true if reading the file as one JSON object per line

Definition at line 261 of file io/json.hpp.

◆ is_enabled_mixed_types_as_string()

bool cudf::io::json_reader_options::is_enabled_mixed_types_as_string ( ) const
inline

Whether to parse mixed types as a string column.

Returns
true if mixed types are parsed as a string column

Definition at line 268 of file io/json.hpp.

◆ is_enabled_normalize_single_quotes()

bool cudf::io::json_reader_options::is_enabled_normalize_single_quotes ( ) const
inline

Whether the reader should normalize single quotes around strings.

Returns
true if the reader should normalize single quotes, false otherwise

Definition at line 310 of file io/json.hpp.

◆ is_enabled_normalize_whitespace()

bool cudf::io::json_reader_options::is_enabled_normalize_whitespace ( ) const
inline

Whether the reader should normalize unquoted whitespace characters.

Returns
true if the reader should normalize whitespace, false otherwise

Definition at line 317 of file io/json.hpp.

◆ is_enabled_prune_columns()

bool cudf::io::json_reader_options::is_enabled_prune_columns ( ) const
inline

Whether to prune columns on read, selected based on the set_dtypes option.

When set as true, if the reader options include set_dtypes, then the reader will only return those columns which are mentioned in set_dtypes. If false, then all columns are returned, independent of the set_dtypes setting.

Returns
true if column pruning is enabled

Definition at line 280 of file io/json.hpp.

◆ is_strict_validation()

bool cudf::io::json_reader_options::is_strict_validation ( ) const
inline

Whether JSON validation should be enforced strictly or not.

Returns
true if JSON validation is enforced strictly, false otherwise.

Definition at line 331 of file io/json.hpp.

◆ recovery_mode()

json_recovery_mode_t cudf::io::json_reader_options::recovery_mode ( ) const
inline

Queries the JSON reader's behavior on invalid JSON lines.

Returns
An enum that specifies the JSON reader's behavior on invalid JSON lines.

Definition at line 324 of file io/json.hpp.

◆ set_byte_range_offset()

void cudf::io::json_reader_options::set_byte_range_offset ( size_t  offset)
inline

Set number of bytes to skip from source start.

Parameters
offset  Number of bytes of offset

Definition at line 408 of file io/json.hpp.

◆ set_byte_range_size()

void cudf::io::json_reader_options::set_byte_range_size ( size_t  size)
inline

Set number of bytes to read.

Parameters
size  Number of bytes to read

Definition at line 415 of file io/json.hpp.

◆ set_compression()

void cudf::io::json_reader_options::set_compression ( compression_type  comp_type)
inline

Set the compression type.

Parameters
comp_type  The compression type used

Definition at line 401 of file io/json.hpp.

◆ set_delimiter()

void cudf::io::json_reader_options::set_delimiter ( char  delimiter)
inline

Set delimiter separating records in JSON lines.

Parameters
delimiter  Delimiter separating records in JSON lines

Definition at line 422 of file io/json.hpp.
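
To show what a record delimiter means for JSON lines input, the following naive sketch splits a buffer into records at each occurrence of the delimiter. The function name is hypothetical, empty records are skipped, and delimiters that appear inside quoted strings are not handled, so this is an approximation and not the reader's algorithm.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Illustrative only: splits the input at every delimiter and drops empty pieces.
inline std::vector<std::string_view> split_records(std::string_view input, char delimiter)
{
  std::vector<std::string_view> records;
  std::size_t start = 0;
  while (start < input.size()) {
    std::size_t end = input.find(delimiter, start);
    if (end == std::string_view::npos) { end = input.size(); }  // last record, no trailing delimiter
    if (end > start) { records.push_back(input.substr(start, end - start)); }
    start = end + 1;
  }
  return records;
}
```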

◆ set_dtypes() [1/3]

void cudf::io::json_reader_options::set_dtypes ( std::map< std::string, data_type >  types)
inline

Set data types for columns to be read.

Parameters
types  Map of column names to dtypes

Definition at line 387 of file io/json.hpp.

◆ set_dtypes() [2/3]

void cudf::io::json_reader_options::set_dtypes ( std::map< std::string, schema_element >  types)
inline

Set data types for a potentially nested column hierarchy.

Parameters
types  Map of column names to schema_element to support arbitrary nesting of data types

Definition at line 394 of file io/json.hpp.

◆ set_dtypes() [3/3]

void cudf::io::json_reader_options::set_dtypes ( std::vector< data_type >  types)
inline

Set data types for columns to be read.

Parameters
types  Vector of dtypes

Definition at line 380 of file io/json.hpp.

◆ set_na_values()

void cudf::io::json_reader_options::set_na_values ( std::vector< std::string >  vals)
inline

Sets additional values to recognize as null values.

Parameters
vals  Vector of values to be considered to be null

Definition at line 570 of file io/json.hpp.
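
A minimal model of the additional null values: a value counts as null when it appears in the user-supplied list. The helper name is hypothetical, the exact-match comparison is an assumption, and any values the reader already treats as null by default are outside this sketch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative only: true when the value matches one of the extra null strings.
inline bool is_additional_null(std::string const& value, std::vector<std::string> const& na_values)
{
  return std::find(na_values.begin(), na_values.end(), value) != na_values.end();
}
```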

◆ set_recovery_mode()

void cudf::io::json_reader_options::set_recovery_mode ( json_recovery_mode_t  val)
inline

Specifies the JSON reader's behavior on invalid JSON lines.

Parameters
val  An enum value to indicate the JSON reader's behavior on invalid JSON lines.

Definition at line 513 of file io/json.hpp.

◆ set_strict_validation()

void cudf::io::json_reader_options::set_strict_validation ( bool  val)
inline

Set whether strict validation is enabled or not.

Parameters
val  Boolean value to indicate whether strict validation is enabled.

Definition at line 520 of file io/json.hpp.


The documentation for this class was generated from the following file: