cudf::io::json_reader_options Class Reference

Input arguments to the read_json interface. More...

#include <json.hpp>

Public Types

using dtype_variant = std::variant< std::vector< data_type >, std::map< std::string, data_type >, std::map< std::string, schema_element >, schema_element >
 Variant type holding dtypes information for the columns.
 

Public Member Functions

 json_reader_options ()=default
 Default constructor. More...
 
source_info const & get_source () const
 Returns source info. More...
 
dtype_variant const & get_dtypes () const
 Returns data types of the columns. More...
 
compression_type get_compression () const
 Returns compression format of the source. More...
 
size_t get_byte_range_offset () const
 Returns number of bytes to skip from source start. More...
 
size_t get_byte_range_size () const
 Returns number of bytes to read. More...
 
size_t get_byte_range_size_with_padding () const
 Returns number of bytes to read with padding. More...
 
size_t get_byte_range_padding () const
 Returns number of bytes to pad when reading. More...
 
char get_delimiter () const
 Returns delimiter separating records in JSON lines. More...
 
bool is_enabled_lines () const
 Whether to read the file as a json object per line. More...
 
bool is_enabled_mixed_types_as_string () const
 Whether to parse mixed types as a string column. More...
 
bool is_enabled_prune_columns () const
 Whether to prune columns on read, selected based on the set_dtypes option. More...
 
bool is_enabled_experimental () const
 Whether to enable experimental features. More...
 
bool is_enabled_dayfirst () const
 Whether to parse dates as DD/MM versus MM/DD. More...
 
bool is_enabled_keep_quotes () const
 Whether the reader should keep quotes of string values. More...
 
bool is_enabled_normalize_single_quotes () const
 Whether the reader should normalize single quotes around strings. More...
 
bool is_enabled_normalize_whitespace () const
 Whether the reader should normalize unquoted whitespace characters. More...
 
json_recovery_mode_t recovery_mode () const
 Queries the JSON reader's behavior on invalid JSON lines. More...
 
bool is_strict_validation () const
 Whether json validation should be enforced strictly or not. More...
 
bool is_allowed_numeric_leading_zeros () const
 Whether leading zeros are allowed in numeric values. More...
 
bool is_allowed_nonnumeric_numbers () const
 Whether the unquoted nonnumeric values NaN, +INF, -INF, +Infinity, Infinity, and -Infinity are allowed as numbers. More...
 
bool is_allowed_unquoted_control_chars () const
 Whether characters greater than or equal to 0 and less than 32 are allowed inside a quoted string without some form of escaping. More...
 
std::vector< std::string > const & get_na_values () const
 Returns additional values to recognize as null values. More...
 
void set_dtypes (std::vector< data_type > types)
 Set data types for columns to be read. More...
 
void set_dtypes (std::map< std::string, data_type > types)
 Set data types for columns to be read. More...
 
void set_dtypes (std::map< std::string, schema_element > types)
 Set data types for a potentially nested column hierarchy. More...
 
void set_dtypes (schema_element types)
 Set data types for a potentially nested column hierarchy. More...
 
void set_compression (compression_type comp_type)
 Set the compression type. More...
 
void set_byte_range_offset (size_t offset)
 Set number of bytes to skip from source start. More...
 
void set_byte_range_size (size_t size)
 Set number of bytes to read. More...
 
void set_delimiter (char delimiter)
 Set delimiter separating records in JSON lines. More...
 
void enable_lines (bool val)
 Set whether to read the file as a json object per line. More...
 
void enable_mixed_types_as_string (bool val)
 Set whether to parse mixed types as a string column. Also enables forcing to read a struct as string column using schema. More...
 
void enable_prune_columns (bool val)
 Set whether to prune columns on read, selected based on the set_dtypes option. More...
 
void enable_experimental (bool val)
 Set whether to enable experimental features. More...
 
void enable_dayfirst (bool val)
 Set whether to parse dates as DD/MM versus MM/DD. More...
 
void enable_keep_quotes (bool val)
 Set whether the reader should keep quotes of string values. More...
 
void enable_normalize_single_quotes (bool val)
 Set whether the reader should enable normalization of single quotes around strings. More...
 
void enable_normalize_whitespace (bool val)
 Set whether the reader should enable normalization of unquoted whitespace. More...
 
void set_recovery_mode (json_recovery_mode_t val)
 Specifies the JSON reader's behavior on invalid JSON lines. More...
 
void set_strict_validation (bool val)
 Set whether strict validation is enabled or not. More...
 
void allow_numeric_leading_zeros (bool val)
 Set whether leading zeros are allowed in numeric values. Strict validation must be enabled for this to work. More...
 
void allow_nonnumeric_numbers (bool val)
 Set whether the unquoted nonnumeric values NaN, +INF, -INF, +Infinity, Infinity, and -Infinity are allowed as numbers. Strict validation must be enabled for this to work. More...
 
void allow_unquoted_control_chars (bool val)
 Set whether characters greater than or equal to 0 and less than 32 are allowed inside a quoted string without some form of escaping. Strict validation must be enabled for this to work. More...
 
void set_na_values (std::vector< std::string > vals)
 Sets additional values to recognize as null values. More...
 

Static Public Member Functions

static json_reader_options_builder builder (source_info src)
 Creates a json_reader_options_builder, which will build json_reader_options. More...
 

Detailed Description

Input arguments to the read_json interface.

Available parameters are closely patterned after the PANDAS read_json API. Not all parameters are supported. If the matching PANDAS parameter has a default value of None, then a default value of -1 or 0 may be used as the equivalent.

Parameters in PANDAS that are unavailable or handled differently in cudf:

Name                Description
orient              currently fixed-format
typ                 data is always returned as a cudf::table
convert_axes        use column functions for axes operations instead
convert_dates       dates are detected automatically
keep_default_dates  dates are detected automatically
numpy               data is always returned as a cudf::table
precise_float       there is only one converter
date_unit           only millisecond units are supported
encoding            only ASCII-encoded data is supported
chunksize           use byte_range_xxx for chunking instead

Definition at line 95 of file io/json.hpp.
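
The table above points to the byte_range_xxx options as the replacement for the PANDAS chunksize parameter. As a minimal sketch of that idea, the helper below splits a source of known size into consecutive (offset, size) pairs. The name make_byte_ranges is hypothetical and not part of cudf; each pair is only meant to be handed to set_byte_range_offset() and set_byte_range_size() before one read. How the reader treats a record that straddles a range boundary is not described on this page, so the sketch makes no assumption about it.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of cudf): split `total_bytes` into consecutive
// (offset, size) ranges of at most `chunk_bytes` each.
std::vector<std::pair<std::size_t, std::size_t>> make_byte_ranges(std::size_t total_bytes,
                                                                  std::size_t chunk_bytes)
{
  std::vector<std::pair<std::size_t, std::size_t>> ranges;
  if (chunk_bytes == 0) { return ranges; }  // a zero chunk size would never advance
  for (std::size_t offset = 0; offset < total_bytes; offset += chunk_bytes) {
    // The last range is shortened so it does not run past the end of the source.
    ranges.emplace_back(offset, std::min(chunk_bytes, total_bytes - offset));
  }
  return ranges;
}
```

For a 10-byte source and 4-byte chunks, this yields the ranges (0, 4), (4, 4), and (8, 2); each would be applied to a json_reader_options object before a separate read.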

Constructor & Destructor Documentation

◆ json_reader_options()

cudf::io::json_reader_options::json_reader_options ( )
default

Default constructor.

This has been added since Cython requires a default constructor to create objects on the stack.

Member Function Documentation

◆ allow_nonnumeric_numbers()

void cudf::io::json_reader_options::allow_nonnumeric_numbers ( bool  val)
inline

Set whether the unquoted nonnumeric values NaN, +INF, -INF, +Infinity, Infinity, and -Infinity are allowed as numbers. Strict validation must be enabled for this to work.

Exceptions
cudf::logic_error if strict_validation is not enabled before setting this option.
Parameters
val Boolean value to indicate whether unquoted nonnumeric number values are allowed

Definition at line 558 of file io/json.hpp.

◆ allow_numeric_leading_zeros()

void cudf::io::json_reader_options::allow_numeric_leading_zeros ( bool  val)
inline

Set whether leading zeros are allowed in numeric values. Strict validation must be enabled for this to work.

Exceptions
cudf::logic_error if strict_validation is not enabled before setting this option.
Parameters
val Boolean value to indicate whether leading zeros are allowed in numeric values

Definition at line 544 of file io/json.hpp.

◆ allow_unquoted_control_chars()

void cudf::io::json_reader_options::allow_unquoted_control_chars ( bool  val)
inline

Set whether characters greater than or equal to 0 and less than 32 are allowed inside a quoted string without some form of escaping. Strict validation must be enabled for this to work.

Exceptions
cudf::logic_error if strict_validation is not enabled before setting this option.
Parameters
val Boolean value to indicate whether unquoted control characters are allowed

Definition at line 573 of file io/json.hpp.
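
The three allow_* setters above share one precondition: strict validation must already be enabled, otherwise a cudf::logic_error is thrown. The ordering can be sketched with a simplified stand-in class. Everything here is an assumption made for illustration: validation_options_sketch is not a cudf type, it throws std::logic_error rather than cudf::logic_error, and its default values are arbitrary rather than cudf's.

```cpp
#include <stdexcept>

// Simplified stand-in, NOT cudf's implementation: it only models the rule
// "enable strict validation before calling an allow_* setter".
class validation_options_sketch {
 public:
  void set_strict_validation(bool val) { strict_ = val; }

  void allow_numeric_leading_zeros(bool val)
  {
    require_strict();
    leading_zeros_ = val;
  }

  void allow_nonnumeric_numbers(bool val)
  {
    require_strict();
    nonnumeric_numbers_ = val;
  }

  bool is_allowed_numeric_leading_zeros() const { return leading_zeros_; }
  bool is_allowed_nonnumeric_numbers() const { return nonnumeric_numbers_; }

 private:
  // Mirrors the documented precondition; the real class reports cudf::logic_error.
  void require_strict() const
  {
    if (!strict_) { throw std::logic_error("strict validation must be enabled first"); }
  }

  bool strict_             = false;  // arbitrary default for this sketch
  bool leading_zeros_      = false;  // arbitrary default for this sketch
  bool nonnumeric_numbers_ = false;  // arbitrary default for this sketch
};
```

With this model, calling allow_numeric_leading_zeros(true) on a fresh object throws, while calling set_strict_validation(true) first makes the same call succeed.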

◆ builder()

static json_reader_options_builder cudf::io::json_reader_options::builder ( source_info  src)
static

Creates a json_reader_options_builder, which will build json_reader_options.

Parameters
src Source information used to read the JSON file
Returns
builder to build the options

◆ enable_dayfirst()

void cudf::io::json_reader_options::enable_dayfirst ( bool  val)
inline

Set whether to parse dates as DD/MM versus MM/DD.

Parameters
val Boolean value to enable/disable day first parsing format

Definition at line 496 of file io/json.hpp.

◆ enable_experimental()

void cudf::io::json_reader_options::enable_experimental ( bool  val)
inline

Set whether to enable experimental features.

When set to true, experimental features, such as the new column tree construction and UTF-8 matching of field names, will be enabled.

Parameters
val Boolean value to enable/disable experimental features

Definition at line 489 of file io/json.hpp.

◆ enable_keep_quotes()

void cudf::io::json_reader_options::enable_keep_quotes ( bool  val)
inline

Set whether the reader should keep quotes of string values.

Parameters
val Boolean value to indicate whether the reader should keep quotes of string values

Definition at line 504 of file io/json.hpp.

◆ enable_lines()

void cudf::io::json_reader_options::enable_lines ( bool  val)
inline

Set whether to read the file as a json object per line.

Parameters
val Boolean value to enable/disable the option to read each line as a json object

Definition at line 460 of file io/json.hpp.

◆ enable_mixed_types_as_string()

void cudf::io::json_reader_options::enable_mixed_types_as_string ( bool  val)
inline

Set whether to parse mixed types as a string column. Also enables forcing to read a struct as string column using schema.

Parameters
val Boolean value to enable/disable parsing mixed types as a string column

Definition at line 468 of file io/json.hpp.

◆ enable_normalize_single_quotes()

void cudf::io::json_reader_options::enable_normalize_single_quotes ( bool  val)
inline

Set whether the reader should enable normalization of single quotes around strings.

Parameters
val Boolean value to indicate whether the reader should normalize single quotes around strings

Definition at line 512 of file io/json.hpp.

◆ enable_normalize_whitespace()

void cudf::io::json_reader_options::enable_normalize_whitespace ( bool  val)
inline

Set whether the reader should enable normalization of unquoted whitespace.

Parameters
val Boolean value to indicate whether the reader should normalize unquoted whitespace characters, i.e. tabs and spaces

Definition at line 520 of file io/json.hpp.

◆ enable_prune_columns()

void cudf::io::json_reader_options::enable_prune_columns ( bool  val)
inline

Set whether to prune columns on read, selected based on the set_dtypes option.

When set as true, if the reader options include set_dtypes, then the reader will only return those columns which are mentioned in set_dtypes. If false, then all columns are returned, independent of the set_dtypes setting.

Parameters
val Boolean value to enable/disable column pruning

Definition at line 479 of file io/json.hpp.
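
The pruning rule described above can be modelled in a few lines. This is only a sketch of the documented behaviour, not cudf code: returned_columns is a hypothetical function, dtypes are represented as plain strings, an empty map stands for "set_dtypes was not used", and keeping the input column order is an assumption.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model of column pruning (not part of cudf).
// `dtypes` maps column names to a type name; an empty map is taken to mean
// that no dtypes were set.
std::vector<std::string> returned_columns(std::vector<std::string> const& input_columns,
                                          std::map<std::string, std::string> const& dtypes,
                                          bool prune_columns)
{
  // Without pruning, or without any dtypes, every column is returned.
  if (!prune_columns || dtypes.empty()) { return input_columns; }

  std::vector<std::string> kept;
  for (auto const& name : input_columns) {
    // Keep only the columns that are mentioned in the dtypes map.
    if (dtypes.count(name) > 0) { kept.push_back(name); }
  }
  return kept;
}
```

Given input columns a, b, and c with dtypes set for a and c, the model returns a and c when pruning is enabled and all three columns when it is disabled.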

◆ get_byte_range_offset()

size_t cudf::io::json_reader_options::get_byte_range_offset ( ) const
inline

Returns number of bytes to skip from source start.

Returns
Number of bytes to skip from source start

Definition at line 206 of file io/json.hpp.

◆ get_byte_range_padding()

size_t cudf::io::json_reader_options::get_byte_range_padding ( ) const
inline

Returns number of bytes to pad when reading.

Returns
Number of bytes to pad

Definition at line 234 of file io/json.hpp.

◆ get_byte_range_size()

size_t cudf::io::json_reader_options::get_byte_range_size ( ) const
inline

Returns number of bytes to read.

Returns
Number of bytes to read

Definition at line 213 of file io/json.hpp.

◆ get_byte_range_size_with_padding()

size_t cudf::io::json_reader_options::get_byte_range_size_with_padding ( ) const
inline

Returns number of bytes to read with padding.

Returns
Number of bytes to read with padding

Definition at line 220 of file io/json.hpp.

◆ get_compression()

compression_type cudf::io::json_reader_options::get_compression ( ) const
inline

Returns compression format of the source.

Returns
Compression format of the source

Definition at line 199 of file io/json.hpp.

◆ get_delimiter()

char cudf::io::json_reader_options::get_delimiter ( ) const
inline

Returns delimiter separating records in JSON lines.

Returns
Delimiter separating records in JSON lines

Definition at line 260 of file io/json.hpp.

◆ get_dtypes()

dtype_variant const& cudf::io::json_reader_options::get_dtypes ( ) const
inline

Returns data types of the columns.

Returns
Data types of the columns

Definition at line 192 of file io/json.hpp.

◆ get_na_values()

std::vector<std::string> const& cudf::io::json_reader_options::get_na_values ( ) const
inline

Returns additional values to recognize as null values.

Returns
Additional values to recognize as null values

Definition at line 379 of file io/json.hpp.

◆ get_source()

source_info const& cudf::io::json_reader_options::get_source ( ) const
inline

Returns source info.

Returns
Source info

Definition at line 185 of file io/json.hpp.

◆ is_allowed_nonnumeric_numbers()

bool cudf::io::json_reader_options::is_allowed_nonnumeric_numbers ( ) const
inline

Whether the unquoted nonnumeric values NaN, +INF, -INF, +Infinity, Infinity, and -Infinity are allowed as numbers.

Note
This validation is enforced only if strict validation is enabled.
Returns
true if unquoted nonnumeric number values are allowed

Definition at line 359 of file io/json.hpp.

◆ is_allowed_numeric_leading_zeros()

bool cudf::io::json_reader_options::is_allowed_numeric_leading_zeros ( ) const
inline

Whether leading zeros are allowed in numeric values.

Note
This validation is enforced only if strict validation is enabled.
Returns
true if leading zeros are allowed in numeric values

Definition at line 346 of file io/json.hpp.

◆ is_allowed_unquoted_control_chars()

bool cudf::io::json_reader_options::is_allowed_unquoted_control_chars ( ) const
inline

Whether characters greater than or equal to 0 and less than 32 are allowed inside a quoted string without some form of escaping.

Note
This validation is enforced only if strict validation is enabled.
Returns
true if unquoted control chars are allowed.

Definition at line 369 of file io/json.hpp.

◆ is_enabled_dayfirst()

bool cudf::io::json_reader_options::is_enabled_dayfirst ( ) const
inline

Whether to parse dates as DD/MM versus MM/DD.

Returns
true if dates are parsed as DD/MM, false if MM/DD

Definition at line 302 of file io/json.hpp.

◆ is_enabled_experimental()

bool cudf::io::json_reader_options::is_enabled_experimental ( ) const
inline

Whether to enable experimental features.

When set to true, experimental features, such as the new column tree construction and UTF-8 matching of field names, will be enabled.

Returns
true if experimental features are enabled

Definition at line 295 of file io/json.hpp.

◆ is_enabled_keep_quotes()

bool cudf::io::json_reader_options::is_enabled_keep_quotes ( ) const
inline

Whether the reader should keep quotes of string values.

Returns
true if the reader should keep quotes, false otherwise

Definition at line 309 of file io/json.hpp.

◆ is_enabled_lines()

bool cudf::io::json_reader_options::is_enabled_lines ( ) const
inline

Whether to read the file as a json object per line.

Returns
true if reading the file as a json object per line

Definition at line 267 of file io/json.hpp.

◆ is_enabled_mixed_types_as_string()

bool cudf::io::json_reader_options::is_enabled_mixed_types_as_string ( ) const
inline

Whether to parse mixed types as a string column.

Returns
true if mixed types are parsed as a string column

Definition at line 274 of file io/json.hpp.

◆ is_enabled_normalize_single_quotes()

bool cudf::io::json_reader_options::is_enabled_normalize_single_quotes ( ) const
inline

Whether the reader should normalize single quotes around strings.

Returns
true if the reader should normalize single quotes, false otherwise

Definition at line 316 of file io/json.hpp.

◆ is_enabled_normalize_whitespace()

bool cudf::io::json_reader_options::is_enabled_normalize_whitespace ( ) const
inline

Whether the reader should normalize unquoted whitespace characters.

Returns
true if the reader should normalize whitespace, false otherwise

Definition at line 323 of file io/json.hpp.

◆ is_enabled_prune_columns()

bool cudf::io::json_reader_options::is_enabled_prune_columns ( ) const
inline

Whether to prune columns on read, selected based on the set_dtypes option.

When set as true, if the reader options include set_dtypes, then the reader will only return those columns which are mentioned in set_dtypes. If false, then all columns are returned, independent of the set_dtypes setting.

Returns
True if column pruning is enabled

Definition at line 286 of file io/json.hpp.

◆ is_strict_validation()

bool cudf::io::json_reader_options::is_strict_validation ( ) const
inline

Whether json validation should be enforced strictly or not.

Returns
true if JSON validation is enforced strictly

Definition at line 337 of file io/json.hpp.

◆ recovery_mode()

json_recovery_mode_t cudf::io::json_reader_options::recovery_mode ( ) const
inline

Queries the JSON reader's behavior on invalid JSON lines.

Returns
An enum that specifies the JSON reader's behavior on invalid JSON lines.

Definition at line 330 of file io/json.hpp.

◆ set_byte_range_offset()

void cudf::io::json_reader_options::set_byte_range_offset ( size_t  offset)
inline

Set number of bytes to skip from source start.

Parameters
offset Number of bytes of offset

Definition at line 422 of file io/json.hpp.

◆ set_byte_range_size()

void cudf::io::json_reader_options::set_byte_range_size ( size_t  size)
inline

Set number of bytes to read.

Parameters
size Number of bytes to read

Definition at line 429 of file io/json.hpp.

◆ set_compression()

void cudf::io::json_reader_options::set_compression ( compression_type  comp_type)
inline

Set the compression type.

Parameters
comp_type The compression type used

Definition at line 415 of file io/json.hpp.

◆ set_delimiter()

void cudf::io::json_reader_options::set_delimiter ( char  delimiter)
inline

Set delimiter separating records in JSON lines.

Parameters
delimiter Delimiter separating records in JSON lines

Definition at line 436 of file io/json.hpp.
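
To illustrate what a record delimiter means for JSON lines input, the sketch below splits a buffer into records on a chosen character. It is a plain string split written for illustration, not the reader's algorithm: split_records is a hypothetical name, the delimiter is assumed never to occur inside a record, and empty records are skipped by assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical illustration (not part of cudf): split `buffer` into records
// separated by `delimiter`. Assumes the delimiter never appears inside a record.
std::vector<std::string> split_records(std::string const& buffer, char delimiter)
{
  std::vector<std::string> records;
  std::string current;
  for (char c : buffer) {
    if (c == delimiter) {
      // A delimiter closes the current record; empty records are skipped.
      if (!current.empty()) { records.push_back(current); }
      current.clear();
    } else {
      current.push_back(c);
    }
  }
  // The last record may not be followed by a delimiter.
  if (!current.empty()) { records.push_back(current); }
  return records;
}
```

With a newline delimiter, a buffer holding two objects on separate lines splits into two records; setting the delimiter to another character, such as '|', splits on that character instead.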

◆ set_dtypes() [1/4]

void cudf::io::json_reader_options::set_dtypes ( schema_element  types)

Set data types for a potentially nested column hierarchy.

Parameters
types Schema element with column names and column order to support arbitrary nesting of data types

◆ set_dtypes() [2/4]

void cudf::io::json_reader_options::set_dtypes ( std::map< std::string, data_type >  types)
inline

Set data types for columns to be read.

Parameters
types Map of column names to dtypes

Definition at line 393 of file io/json.hpp.

◆ set_dtypes() [3/4]

void cudf::io::json_reader_options::set_dtypes ( std::map< std::string, schema_element >  types)
inline

Set data types for a potentially nested column hierarchy.

Parameters
types Map of column names to schema_element to support arbitrary nesting of data types

Definition at line 400 of file io/json.hpp.

◆ set_dtypes() [4/4]

void cudf::io::json_reader_options::set_dtypes ( std::vector< data_type >  types)
inline

Set data types for columns to be read.

Parameters
types Vector of dtypes

Definition at line 386 of file io/json.hpp.
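
The four set_dtypes overloads correspond to the four alternatives of dtype_variant, which get_dtypes() returns. The sketch below shows how such a variant can be inspected with std::visit. It needs C++17 and uses simplified stand-in types: type_id_sketch, schema_element_sketch, dtype_variant_sketch, and describe are all hypothetical names, and the stand-in schema element omits the nested children that a real schema_element would carry.

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Simplified stand-ins for cudf's data_type and schema_element (assumptions).
enum class type_id_sketch { INT64, FLOAT64, STRING };
struct schema_element_sketch {
  type_id_sketch type;
};

// Same shape as dtype_variant, built from the stand-in types.
using dtype_variant_sketch = std::variant<std::vector<type_id_sketch>,
                                          std::map<std::string, type_id_sketch>,
                                          std::map<std::string, schema_element_sketch>,
                                          schema_element_sketch>;

// Report which alternative the variant currently holds.
std::string describe(dtype_variant_sketch const& dtypes)
{
  return std::visit(
    [](auto const& held) -> std::string {
      using T = std::decay_t<decltype(held)>;
      if constexpr (std::is_same_v<T, std::vector<type_id_sketch>>) {
        return "vector of dtypes";
      } else if constexpr (std::is_same_v<T, std::map<std::string, type_id_sketch>>) {
        return "map of column names to dtypes";
      } else if constexpr (std::is_same_v<T, std::map<std::string, schema_element_sketch>>) {
        return "map of column names to schema elements";
      } else {
        return "single schema element";
      }
    },
    dtypes);
}
```

Code that consumes get_dtypes() could use the same pattern to branch on which overload of set_dtypes was called.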

◆ set_na_values()

void cudf::io::json_reader_options::set_na_values ( std::vector< std::string >  vals)
inline

Sets additional values to recognize as null values.

Parameters
vals Vector of values to be considered to be null

Definition at line 584 of file io/json.hpp.
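
As a rough illustration of what "additional values to recognize as null" means, the sketch below checks a token against a user-supplied list. The function is_null_token is hypothetical and not part of cudf; it assumes exact, case-sensitive matching and looks only at the additional values, leaving aside whatever the reader treats as null by default.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical check (not part of cudf): is `token` one of the additional
// null values? Matching is exact and case-sensitive by assumption.
bool is_null_token(std::string const& token, std::vector<std::string> const& na_values)
{
  return std::find(na_values.begin(), na_values.end(), token) != na_values.end();
}
```

With na_values set to "NA" and "missing", the token "NA" is treated as null by this model, while "na" and "0" are not.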

◆ set_recovery_mode()

void cudf::io::json_reader_options::set_recovery_mode ( json_recovery_mode_t  val)
inline

Specifies the JSON reader's behavior on invalid JSON lines.

Parameters
val An enum value to indicate the JSON reader's behavior on invalid JSON lines.

Definition at line 527 of file io/json.hpp.

◆ set_strict_validation()

void cudf::io::json_reader_options::set_strict_validation ( bool  val)
inline

Set whether strict validation is enabled or not.

Parameters
val Boolean value to indicate whether strict validation is enabled.

Definition at line 534 of file io/json.hpp.


The documentation for this class was generated from the following file: