cudf::io::json_reader_options Class Reference

Input arguments to the read_json interface. More...

#include <json.hpp>

Public Member Functions

 json_reader_options ()=default
 Default constructor. More...
 
source_info const & get_source () const
 Returns source info. More...
 
std::variant< std::vector< data_type >, std::map< std::string, data_type >, std::map< std::string, schema_element > > const & get_dtypes () const
 Returns data types of the columns. More...
 
compression_type get_compression () const
 Returns compression format of the source. More...
 
size_t get_byte_range_offset () const
 Returns number of bytes to skip from source start. More...
 
size_t get_byte_range_size () const
 Returns number of bytes to read. More...
 
size_t get_byte_range_size_with_padding () const
 Returns number of bytes to read with padding. More...
 
size_t get_byte_range_padding () const
 Returns number of bytes to pad when reading. More...
 
char get_delimiter () const
 Returns delimiter separating records in JSON lines. More...
 
bool is_enabled_lines () const
 Whether to read the file as a json object per line. More...
 
bool is_enabled_mixed_types_as_string () const
 Whether to parse mixed types as a string column. More...
 
bool is_enabled_prune_columns () const
 Whether to prune columns on read, selected based on the set_dtypes option. More...
 
bool is_enabled_dayfirst () const
 Whether to parse dates as DD/MM versus MM/DD. More...
 
bool is_enabled_keep_quotes () const
 Whether the reader should keep quotes of string values. More...
 
bool is_enabled_normalize_single_quotes () const
 Whether the reader should normalize single quotes around strings. More...
 
bool is_enabled_normalize_whitespace () const
 Whether the reader should normalize unquoted whitespace characters. More...
 
json_recovery_mode_t recovery_mode () const
 Queries the JSON reader's behavior on invalid JSON lines. More...
 
void set_dtypes (std::vector< data_type > types)
 Set data types for columns to be read. More...
 
void set_dtypes (std::map< std::string, data_type > types)
 Set data types for columns to be read. More...
 
void set_dtypes (std::map< std::string, schema_element > types)
 Set data types for a potentially nested column hierarchy. More...
 
void set_compression (compression_type comp_type)
 Set the compression type. More...
 
void set_byte_range_offset (size_t offset)
 Set number of bytes to skip from source start. More...
 
void set_byte_range_size (size_t size)
 Set number of bytes to read. More...
 
void set_delimiter (char delimiter)
 Set delimiter separating records in JSON lines. More...
 
void enable_lines (bool val)
 Set whether to read the file as a json object per line. More...
 
void enable_mixed_types_as_string (bool val)
 Set whether to parse mixed types as a string column. Also enables forcing to read a struct as string column using schema. More...
 
void enable_prune_columns (bool val)
 Set whether to prune columns on read, selected based on the set_dtypes option. More...
 
void enable_dayfirst (bool val)
 Set whether to parse dates as DD/MM versus MM/DD. More...
 
void enable_keep_quotes (bool val)
 Set whether the reader should keep quotes of string values. More...
 
void enable_normalize_single_quotes (bool val)
 Set whether the reader should enable normalization of single quotes around strings. More...
 
void enable_normalize_whitespace (bool val)
 Set whether the reader should enable normalization of unquoted whitespace. More...
 
void set_recovery_mode (json_recovery_mode_t val)
 Specifies the JSON reader's behavior on invalid JSON lines. More...
 

Static Public Member Functions

static json_reader_options_builder builder (source_info src)
 Creates a json_reader_options_builder that builds json_reader_options. More...
 

Detailed Description

Input arguments to the read_json interface.

Available parameters are closely patterned after the PANDAS read_json API. Not all parameters are supported. If the matching PANDAS parameter has a default value of None, then a default value of -1 or 0 may be used as the equivalent.

Parameters in PANDAS that are unavailable or behave differently in cudf:

Name                Description
orient              currently fixed-format
typ                 data is always returned as a cudf::table
convert_axes        use column functions for axes operations instead
convert_dates       dates are detected automatically
keep_default_dates  dates are detected automatically
numpy               data is always returned as a cudf::table
precise_float       there is only one converter
date_unit           only millisecond units are supported
encoding            only ASCII-encoded data is supported
chunksize           use byte_range_xxx for chunking instead
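
The chunksize row above points to byte ranges as the replacement for chunked reading. As a rough illustration, the sketch below splits an input of known size into consecutive (offset, size) pairs. The `byte_range` struct and `make_byte_ranges` helper are hypothetical names introduced here and are not part of cudf; the sketch only shows the arithmetic.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One chunk of the input: where it starts and how many bytes it covers.
// Hypothetical type for illustration; not part of cudf.
struct byte_range {
  std::size_t offset;
  std::size_t size;
};

// Split `total_size` bytes into consecutive ranges of at most `chunk_size` bytes.
// Hypothetical helper for illustration; not part of cudf.
std::vector<byte_range> make_byte_ranges(std::size_t total_size, std::size_t chunk_size)
{
  std::vector<byte_range> ranges;
  if (chunk_size == 0) { return ranges; }  // a zero chunk size would never advance
  for (std::size_t offset = 0; offset < total_size; offset += chunk_size) {
    // The last range may be shorter than chunk_size.
    ranges.push_back({offset, std::min(chunk_size, total_size - offset)});
  }
  return ranges;
}
```

Each pair would then presumably be passed to set_byte_range_offset and set_byte_range_size before a read. How records that straddle a range boundary are handled is not described in this section, so the sketch makes no claim about it.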

Definition at line 90 of file io/json.hpp.

Constructor & Destructor Documentation

◆ json_reader_options()

cudf::io::json_reader_options::json_reader_options ( )
default

Default constructor.

This has been added because Cython requires a default constructor to create objects on the stack.

Member Function Documentation

◆ builder()

static json_reader_options_builder cudf::io::json_reader_options::builder ( source_info  src)
static

Creates a json_reader_options_builder that builds json_reader_options.

Parameters
src  source information used to read json file
Returns
builder to build the options

◆ enable_dayfirst()

void cudf::io::json_reader_options::enable_dayfirst ( bool  val)
inline

Set whether to parse dates as DD/MM versus MM/DD.

Parameters
val  Boolean value to enable/disable day first parsing format

Definition at line 400 of file io/json.hpp.
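
To make the DD/MM versus MM/DD distinction concrete, the sketch below interprets the first two slash-separated fields of a date string under either convention. It is a stand-alone model, not the reader's actual date parser; `day_month` and `interpret_day_month` are hypothetical names, and the `/` separator is an assumption taken from the notation used above.

```cpp
#include <stdexcept>
#include <string>

// Result of interpreting the first two numeric fields of a date string.
// Hypothetical type for illustration; not part of cudf.
struct day_month {
  int day;
  int month;
};

// Interpret "A/B..." as day/month when `dayfirst` is true, month/day otherwise.
// Hypothetical helper for illustration; not the reader's date parser.
day_month interpret_day_month(std::string const& text, bool dayfirst)
{
  auto const slash = text.find('/');
  if (slash == std::string::npos) { throw std::invalid_argument("expected A/B"); }
  // std::stoi stops at the first non-digit, so "04/2024" still yields 4.
  int const first  = std::stoi(text.substr(0, slash));
  int const second = std::stoi(text.substr(slash + 1));
  return dayfirst ? day_month{first, second} : day_month{second, first};
}
```

With this model, the text "03/04" reads as 3 April when the flag is true and as 4 March when it is false.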

◆ enable_keep_quotes()

void cudf::io::json_reader_options::enable_keep_quotes ( bool  val)
inline

Set whether the reader should keep quotes of string values.

Parameters
val  Boolean value to indicate whether the reader should keep quotes of string values

Definition at line 408 of file io/json.hpp.

◆ enable_lines()

void cudf::io::json_reader_options::enable_lines ( bool  val)
inline

Set whether to read the file as a json object per line.

Parameters
val  Boolean value to enable/disable the option to read each line as a json object

Definition at line 374 of file io/json.hpp.

◆ enable_mixed_types_as_string()

void cudf::io::json_reader_options::enable_mixed_types_as_string ( bool  val)
inline

Set whether to parse mixed types as a string column. Also enables forcing to read a struct as string column using schema.

Parameters
val  Boolean value to enable/disable parsing mixed types as a string column

Definition at line 382 of file io/json.hpp.

◆ enable_normalize_single_quotes()

void cudf::io::json_reader_options::enable_normalize_single_quotes ( bool  val)
inline

Set whether the reader should enable normalization of single quotes around strings.

Parameters
val  Boolean value to indicate whether the reader should normalize single quotes around strings

Definition at line 416 of file io/json.hpp.

◆ enable_normalize_whitespace()

void cudf::io::json_reader_options::enable_normalize_whitespace ( bool  val)
inline

Set whether the reader should enable normalization of unquoted whitespace.

Parameters
val  Boolean value to indicate whether the reader should normalize unquoted whitespace characters i.e. tabs and spaces

Definition at line 424 of file io/json.hpp.

◆ enable_prune_columns()

void cudf::io::json_reader_options::enable_prune_columns ( bool  val)
inline

Set whether to prune columns on read, selected based on the set_dtypes option.

When set to true, if the reader options include set_dtypes, then the reader will only return those columns which are mentioned in set_dtypes. If false, then all columns are returned, independent of the set_dtypes setting.

Parameters
val  Boolean value to enable/disable column pruning

Definition at line 393 of file io/json.hpp.
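
The pruning rule described above can be modeled with a few lines of standard C++. The sketch below is not cudf code: `returned_columns` is a hypothetical helper, and `int` stands in for the library's data_type. It only covers the case where dtypes were supplied as a map of column names; what happens when pruning is enabled without any dtypes is not stated here, so the sketch does not try to model it.

```cpp
#include <map>
#include <string>
#include <vector>

// Model of the documented pruning rule (hypothetical helper, not part of cudf).
// `columns_in_input` are the column names found in the JSON data,
// `dtypes` maps requested column names to a type (int is a stand-in).
std::vector<std::string> returned_columns(std::vector<std::string> const& columns_in_input,
                                          std::map<std::string, int> const& dtypes,
                                          bool prune_columns)
{
  // Pruning disabled: every column is returned, whatever dtypes contains.
  if (!prune_columns) { return columns_in_input; }

  // Pruning enabled: keep only the columns mentioned in dtypes.
  std::vector<std::string> kept;
  for (auto const& name : columns_in_input) {
    if (dtypes.count(name) > 0) { kept.push_back(name); }
  }
  return kept;
}
```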

◆ get_byte_range_offset()

size_t cudf::io::json_reader_options::get_byte_range_offset ( ) const
inline

Returns number of bytes to skip from source start.

Returns
Number of bytes to skip from source start

Definition at line 190 of file io/json.hpp.

◆ get_byte_range_padding()

size_t cudf::io::json_reader_options::get_byte_range_padding ( ) const
inline

Returns number of bytes to pad when reading.

Returns
Number of bytes to pad

Definition at line 218 of file io/json.hpp.

◆ get_byte_range_size()

size_t cudf::io::json_reader_options::get_byte_range_size ( ) const
inline

Returns number of bytes to read.

Returns
Number of bytes to read

Definition at line 197 of file io/json.hpp.

◆ get_byte_range_size_with_padding()

size_t cudf::io::json_reader_options::get_byte_range_size_with_padding ( ) const
inline

Returns number of bytes to read with padding.

Returns
Number of bytes to read with padding

Definition at line 204 of file io/json.hpp.

◆ get_compression()

compression_type cudf::io::json_reader_options::get_compression ( ) const
inline

Returns compression format of the source.

Returns
Compression format of the source

Definition at line 183 of file io/json.hpp.

◆ get_delimiter()

char cudf::io::json_reader_options::get_delimiter ( ) const
inline

Returns delimiter separating records in JSON lines.

Returns
Delimiter separating records in JSON lines

Definition at line 240 of file io/json.hpp.

◆ get_dtypes()

std::variant<std::vector<data_type>, std::map<std::string, data_type>, std::map<std::string, schema_element> > const& cudf::io::json_reader_options::get_dtypes ( ) const
inline

Returns data types of the columns.

Returns
Data types of the columns

Definition at line 173 of file io/json.hpp.
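
Because get_dtypes returns a std::variant with three alternatives, callers need to dispatch on whichever alternative is active. The sketch below shows one way to do that with std::visit (C++17). The `data_type` and `schema_element` structs here are simplified stand-ins defined only so the example compiles on its own; they are not the library's definitions, and `describe_dtypes` is a hypothetical helper.

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Simplified stand-ins so the sketch is self-contained; NOT the cudf types.
struct data_type { int id; };
struct schema_element { data_type type; };

// Same shape as the variant returned by get_dtypes().
using dtypes_variant = std::variant<std::vector<data_type>,
                                    std::map<std::string, data_type>,
                                    std::map<std::string, schema_element>>;

// Report which alternative is active and how many entries it holds.
// Hypothetical helper for illustration.
std::string describe_dtypes(dtypes_variant const& dtypes)
{
  return std::visit(
    [](auto const& value) -> std::string {
      using T          = std::decay_t<decltype(value)>;
      auto const count = std::to_string(value.size());
      if constexpr (std::is_same_v<T, std::vector<data_type>>) {
        return "vector of " + count + " types";
      } else if constexpr (std::is_same_v<T, std::map<std::string, data_type>>) {
        return "map of " + count + " names to types";
      } else {
        return "map of " + count + " names to schema elements";
      }
    },
    dtypes);
}
```

The `if constexpr` chain keeps all three cases in one lambda; an overload set with one callable per alternative would work equally well.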

◆ get_source()

source_info const& cudf::io::json_reader_options::get_source ( ) const
inline

Returns source info.

Returns
Source info

Definition at line 163 of file io/json.hpp.

◆ is_enabled_dayfirst()

bool cudf::io::json_reader_options::is_enabled_dayfirst ( ) const
inline

Whether to parse dates as DD/MM versus MM/DD.

Returns
true if dates are parsed as DD/MM, false if MM/DD

Definition at line 273 of file io/json.hpp.

◆ is_enabled_keep_quotes()

bool cudf::io::json_reader_options::is_enabled_keep_quotes ( ) const
inline

Whether the reader should keep quotes of string values.

Returns
true if the reader should keep quotes, false otherwise

Definition at line 280 of file io/json.hpp.

◆ is_enabled_lines()

bool cudf::io::json_reader_options::is_enabled_lines ( ) const
inline

Whether to read the file as a json object per line.

Returns
true if reading the file as a json object per line

Definition at line 247 of file io/json.hpp.

◆ is_enabled_mixed_types_as_string()

bool cudf::io::json_reader_options::is_enabled_mixed_types_as_string ( ) const
inline

Whether to parse mixed types as a string column.

Returns
true if mixed types are parsed as a string column

Definition at line 254 of file io/json.hpp.

◆ is_enabled_normalize_single_quotes()

bool cudf::io::json_reader_options::is_enabled_normalize_single_quotes ( ) const
inline

Whether the reader should normalize single quotes around strings.

Returns
true if the reader should normalize single quotes, false otherwise

Definition at line 287 of file io/json.hpp.

◆ is_enabled_normalize_whitespace()

bool cudf::io::json_reader_options::is_enabled_normalize_whitespace ( ) const
inline

Whether the reader should normalize unquoted whitespace characters.

Returns
true if the reader should normalize whitespace, false otherwise

Definition at line 294 of file io/json.hpp.

◆ is_enabled_prune_columns()

bool cudf::io::json_reader_options::is_enabled_prune_columns ( ) const
inline

Whether to prune columns on read, selected based on the set_dtypes option.

When set to true, if the reader options include set_dtypes, then the reader will only return those columns which are mentioned in set_dtypes. If false, then all columns are returned, independent of the set_dtypes setting.

Returns
True if column pruning is enabled

Definition at line 266 of file io/json.hpp.

◆ recovery_mode()

json_recovery_mode_t cudf::io::json_reader_options::recovery_mode ( ) const
inline

Queries the JSON reader's behavior on invalid JSON lines.

Returns
An enum that specifies the JSON reader's behavior on invalid JSON lines.

Definition at line 301 of file io/json.hpp.

◆ set_byte_range_offset()

void cudf::io::json_reader_options::set_byte_range_offset ( size_t  offset)
inline

Set number of bytes to skip from source start.

Parameters
offset  Number of bytes of offset

Definition at line 336 of file io/json.hpp.

◆ set_byte_range_size()

void cudf::io::json_reader_options::set_byte_range_size ( size_t  size)
inline

Set number of bytes to read.

Parameters
size  Number of bytes to read

Definition at line 343 of file io/json.hpp.

◆ set_compression()

void cudf::io::json_reader_options::set_compression ( compression_type  comp_type)
inline

Set the compression type.

Parameters
comp_type  The compression type used

Definition at line 329 of file io/json.hpp.

◆ set_delimiter()

void cudf::io::json_reader_options::set_delimiter ( char  delimiter)
inline

Set delimiter separating records in JSON lines.

Parameters
delimiter  Delimiter separating records in JSON lines

Definition at line 350 of file io/json.hpp.
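
As an illustration of what a record delimiter means for JSON lines input, the sketch below splits a buffer into records on a single delimiter character. It is a deliberately naive model and not the reader's implementation: `split_records` is a hypothetical helper, it does not account for a delimiter that appears inside a quoted string, and skipping empty records is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split `buffer` into records separated by `delimiter`.
// Naive model for illustration (hypothetical helper, not part of cudf):
// it ignores quoting and skips empty records.
std::vector<std::string> split_records(std::string const& buffer, char delimiter)
{
  std::vector<std::string> records;
  std::size_t start = 0;
  while (start < buffer.size()) {
    auto end = buffer.find(delimiter, start);
    if (end == std::string::npos) { end = buffer.size(); }  // last record, no trailing delimiter
    if (end > start) { records.push_back(buffer.substr(start, end - start)); }
    start = end + 1;
  }
  return records;
}
```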

◆ set_dtypes() [1/3]

void cudf::io::json_reader_options::set_dtypes ( std::map< std::string, data_type >  types)
inline

Set data types for columns to be read.

Parameters
types  Map of column names to dtypes

Definition at line 315 of file io/json.hpp.

◆ set_dtypes() [2/3]

void cudf::io::json_reader_options::set_dtypes ( std::map< std::string, schema_element >  types)
inline

Set data types for a potentially nested column hierarchy.

Parameters
types  Map of column names to schema_element to support arbitrary nesting of data types

Definition at line 322 of file io/json.hpp.

◆ set_dtypes() [3/3]

void cudf::io::json_reader_options::set_dtypes ( std::vector< data_type >  types)
inline

Set data types for columns to be read.

Parameters
types  Vector of dtypes

Definition at line 308 of file io/json.hpp.

◆ set_recovery_mode()

void cudf::io::json_reader_options::set_recovery_mode ( json_recovery_mode_t  val)
inline

Specifies the JSON reader's behavior on invalid JSON lines.

Parameters
val  An enum value to indicate the JSON reader's behavior on invalid JSON lines.

Definition at line 431 of file io/json.hpp.


The documentation for this class was generated from the following file: