CSV#

class pylibcudf.io.csv.CsvReaderOptions#

The settings to use for read_csv

For details, see cudf::io::csv_reader_options

Methods

builder(SourceInfo source)

Create a CsvReaderOptionsBuilder object

set_comment(self, unicode comment)

Sets comment line start character.

set_delimiter(self, unicode delimiter)

Sets field delimiter.

set_dtypes(self, types)

Sets per-column types.

set_false_values(self, list false_values)

Sets additional values to recognize as boolean false values.

set_header(self, size_type header)

Sets header row index.

set_na_values(self, list na_values)

Sets additional values to recognize as null values.

set_names(self, list col_names)

Sets names of the columns.

set_parse_dates(self, list val)

Sets indexes or names of columns to read as datetime.

set_parse_hex(self, list val)

Sets indexes or names of columns to parse as hexadecimal.

set_prefix(self, unicode prefix)

Sets prefix to be used for column ID.

set_thousands(self, unicode thousands)

Sets numeric data thousands separator.

set_true_values(self, list true_values)

Sets additional values to recognize as boolean true values.

set_use_cols_indexes(self, list col_indices)

Sets indexes of columns to read.

set_use_cols_names(self, list col_names)

Sets names of the columns to be read.

static builder(SourceInfo source)#

Create a CsvReaderOptionsBuilder object

For details, see cudf::io::csv_reader_options::builder()

Parameters:
source: SourceInfo

The source to read the CSV file from.

Returns:
CsvReaderOptionsBuilder

Builder to build CsvReaderOptions

set_comment(self, unicode comment) void#

Sets comment line start character.

Parameters:
comment: str

A character that indicates comment

Returns:
None
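As a rough illustration of what a comment character means for parsing, the sketch below drops lines that begin with that character before handing the rest to Python's standard csv module. It is a plain-Python stand-in, not pylibcudf's implementation, and the helper name `drop_comment_lines` is hypothetical.

```python
import csv

def drop_comment_lines(lines, comment="#"):
    """Return only the lines that do not start with the comment character."""
    return [line for line in lines if not line.startswith(comment)]

raw = ["# generated by a sensor", "id,value", "1,10", "# checkpoint", "2,20"]

# Only the header and the two data rows survive the filter.
rows = list(csv.reader(drop_comment_lines(raw, comment="#")))
print(rows)
```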
set_delimiter(self, unicode delimiter) void#

Sets field delimiter.

Parameters:
delimiter: str

A character to indicate delimiter

Returns:
None
set_dtypes(self, types) void#

Sets per-column types.

Parameters:
types: dict[str, data_type] | list[data_type]

Either a map from column name to data type, or a list of data types, specifying the columns’ target data types.

Returns:
None
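The two accepted shapes of `types` can be pictured as follows. This is a minimal sketch in plain Python: the built-in types `int`, `float`, and `str` stand in for pylibcudf data types, `resolve_types` is a hypothetical helper, and the requirement that a list covers every column is an assumption of the sketch rather than documented behavior.

```python
def resolve_types(column_names, types):
    """Return one target type per column, in column order.

    `types` is either a dict keyed by column name or a list ordered by position.
    """
    if isinstance(types, dict):
        return [types[name] for name in column_names]
    # Assumption of this sketch: a list must supply one entry per column.
    if len(types) != len(column_names):
        raise ValueError("a list of types needs one entry per column")
    return list(types)

columns = ["id", "price", "label"]
by_name = resolve_types(columns, {"label": str, "id": int, "price": float})
by_position = resolve_types(columns, [int, float, str])
print(by_name == by_position)
```

The dict form is order-independent, which helps when the column order of the file is not known in advance.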
set_false_values(self, list false_values) void#

Sets additional values to recognize as boolean false values.

Parameters:
false_values: list[str]

List of values to be considered false

Returns:
None
set_header(self, size_type header) void#

Sets header row index.

Parameters:
header: size_type

Index where header row is located

Returns:
None
set_na_values(self, list na_values) void#

Sets additional values to recognize as null values.

Parameters:
na_values: list[str]

List of values to be considered null

Returns:
None
set_names(self, list col_names) void#

Sets names of the columns.

Parameters:
col_names: list[str]

List of column names

Returns:
None
set_parse_dates(self, list val) void#

Sets indexes or names of columns to read as datetime.

Parameters:
val: list[int | str]

List of column indices or names to infer as datetime.

Returns:
None
set_parse_hex(self, list val) void#

Sets indexes or names of columns to parse as hexadecimal.

Parameters:
val: list[int | str]

List of column indices or names to parse as hexadecimal

Returns:
None
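To make "parse as hexadecimal" concrete, the sketch below converts hexadecimal text fields to integers with Python's built-in `int`. It only illustrates the intended result for such columns; which spellings pylibcudf itself accepts is not specified here, and `parse_hex` is a hypothetical helper.

```python
def parse_hex(text):
    """Parse a hexadecimal field such as "0x1F" or "ff" into an integer."""
    return int(text, 16)

values = [parse_hex(field) for field in ["0x1F", "ff", "0X10"]]
print(values)
```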
set_prefix(self, unicode prefix) void#

Sets prefix to be used for column ID.

Parameters:
prefix: str

String used as prefix for each column name

Returns:
None
set_thousands(self, unicode thousands) void#

Sets numeric data thousands separator.

Parameters:
thousands: str

A character that separates thousands

Returns:
None
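A thousands separator only makes sense when it differs from the field delimiter. The sketch below uses Python's standard csv module with `;` as the delimiter so that `,` can act as the thousands separator; `parse_int` is a hypothetical helper and the data is made up for illustration.

```python
import csv

def parse_int(text, thousands=","):
    """Parse an integer after stripping the thousands separator."""
    return int(text.replace(thousands, ""))

# Fields are separated by ";" so that "," is free to act as the thousands separator.
raw = ["city;population", "Springfield;1,234,567", "Shelbyville;98,765"]
reader = csv.reader(raw, delimiter=";")
header = next(reader)
populations = {name: parse_int(count) for name, count in reader}
print(populations)
```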
set_true_values(self, list true_values) void#

Sets additional values to recognize as boolean true values.

Parameters:
true_values: list[str]

List of values to be considered true

Returns:
None
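The additional true, false, and null spellings can be thought of as lookup sets consulted while converting a field. The sketch below is a plain-Python illustration with a hypothetical `parse_bool` helper; it does not model any spellings that the reader recognizes by default.

```python
def parse_bool(text, true_values, false_values, na_values):
    """Map a CSV field to True, False, or None using caller-supplied spellings."""
    if text in na_values:
        return None
    if text in true_values:
        return True
    if text in false_values:
        return False
    raise ValueError(f"unrecognized boolean field: {text!r}")

fields = ["yes", "no", "n/a", "yes"]
parsed = [
    parse_bool(f, true_values={"yes"}, false_values={"no"}, na_values={"n/a"})
    for f in fields
]
print(parsed)
```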
set_use_cols_indexes(self, list col_indices) void#

Sets indexes of columns to read.

Parameters:
col_indices: list[int]

List of column indices that are needed

Returns:
None
set_use_cols_names(self, list col_names) void#

Sets names of the columns to be read.

Parameters:
col_names: list[str]

List of column names that are needed

Returns:
None
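Selecting columns by index and selecting them by name are two routes to the same subset. The sketch below shows the equivalence with Python's standard csv module; `select_by_index` and `select_by_name` are hypothetical helpers, not pylibcudf functions.

```python
import csv

raw = ["id,name,score", "1,ada,90", "2,bob,85"]
header, *data = csv.reader(raw)

def select_by_index(data, col_indices):
    """Keep only the fields at the given positions."""
    return [[row[i] for i in col_indices] for row in data]

def select_by_name(header, data, col_names):
    """Translate names to positions, then select by position."""
    return select_by_index(data, [header.index(name) for name in col_names])

by_index = select_by_index(data, [0, 2])
by_name = select_by_name(header, data, ["id", "score"])
print(by_index == by_name)
```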
class pylibcudf.io.csv.CsvReaderOptionsBuilder#

Builder to build options for read_csv

For details, see cudf::io::csv_reader_options_builder

Methods

build(self)

Create a CsvReaderOptions object

byte_range_offset(self, size_t byte_range_offset)

Sets number of bytes to skip from source start.

byte_range_size(self, size_t byte_range_size)

Sets number of bytes to read.

compression(self, compression_type compression)

Sets compression format of the source.

dayfirst(self, bool dayfirst)

Sets whether to parse dates as DD/MM versus MM/DD.

decimal(self, unicode decimal)

Sets decimal point character.

delim_whitespace(self, bool delim_whitespace)

Sets whether to treat whitespace as field delimiter.

doublequote(self, bool doublequote)

Sets whether a quote inside a value is double-quoted.

keep_default_na(self, bool keep_default_na)

Sets whether to keep the built-in default NA values.

lineterminator(self, unicode lineterminator)

Sets line terminator.

mangle_dupe_cols(self, bool mangle_dupe_cols)

Sets whether to rename duplicate column names.

na_filter(self, bool na_filter)

Sets whether to enable the null filter.

nrows(self, size_type nrows)

Sets number of rows to read.

quotechar(self, unicode quotechar)

Sets quoting character.

quoting(self, quote_style quoting)

Sets quoting style.

skip_blank_lines(self, bool skip_blank_lines)

Sets whether to ignore empty lines or parse line values as invalid.

skipfooter(self, size_type skipfooter)

Sets number of rows to skip from end.

skipinitialspace(self, bool skipinitialspace)

Sets whether to skip whitespace after the delimiter.

skiprows(self, size_type skiprows)

Sets number of rows to skip from start.

build(self) CsvReaderOptions#

Create a CsvReaderOptions object

byte_range_offset(self, size_t byte_range_offset) CsvReaderOptionsBuilder#

Sets number of bytes to skip from source start.

Parameters:
byte_range_offset: size_t

Number of bytes of offset

Returns:
CsvReaderOptionsBuilder
byte_range_size(self, size_t byte_range_size) CsvReaderOptionsBuilder#

Sets number of bytes to read.

Parameters:
byte_range_size: size_t

Number of bytes to read

Returns:
CsvReaderOptionsBuilder
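Taken together, the offset and the size describe a window of bytes within the source. The sketch below shows that window on an in-memory buffer; it is deliberately simplistic and does not model how a real reader treats rows that straddle the window boundary. `byte_window` is a hypothetical helper.

```python
data = b"id,value\n1,10\n2,20\n3,30\n"

def byte_window(buffer, byte_range_offset, byte_range_size):
    """Return the bytes that start at the offset and span `byte_range_size` bytes."""
    return buffer[byte_range_offset:byte_range_offset + byte_range_size]

# The header line occupies 9 bytes and each data row occupies 5 bytes,
# so this window covers exactly the first two data rows.
window = byte_window(data, byte_range_offset=9, byte_range_size=10)
print(window)
```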
compression(self, compression_type compression) CsvReaderOptionsBuilder#

Sets compression format of the source.

Parameters:
compression: compression_type

Compression type

Returns:
CsvReaderOptionsBuilder
dayfirst(self, bool dayfirst) CsvReaderOptionsBuilder#

Sets whether to parse dates as DD/MM versus MM/DD.

Parameters:
dayfirst: bool

Boolean value to enable/disable

Returns:
CsvReaderOptionsBuilder
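The difference between DD/MM and MM/DD matters for any date whose day is 12 or lower. The sketch below uses Python's `datetime.strptime` to show how the same text yields two different dates; `parse_date` is a hypothetical helper that only illustrates the idea behind the flag.

```python
from datetime import date, datetime

def parse_date(text, dayfirst):
    """Parse "03/04/2021" as day/month or month/day depending on `dayfirst`."""
    fmt = "%d/%m/%Y" if dayfirst else "%m/%d/%Y"
    return datetime.strptime(text, fmt).date()

as_day_first = parse_date("03/04/2021", dayfirst=True)     # 3 April 2021
as_month_first = parse_date("03/04/2021", dayfirst=False)  # 4 March 2021
print(as_day_first, as_month_first)
```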
decimal(self, unicode decimal) CsvReaderOptionsBuilder#

Sets decimal point character.

Parameters:
decimal: str

A character that indicates decimal values

Returns:
CsvReaderOptionsBuilder
delim_whitespace(self, bool delim_whitespace) CsvReaderOptionsBuilder#

Sets whether to treat whitespace as field delimiter.

Parameters:
delim_whitespace: bool

Boolean value to enable/disable

Returns:
CsvReaderOptionsBuilder
doublequote(self, bool doublequote) CsvReaderOptionsBuilder#

Sets whether a quote inside a value is double-quoted.

Parameters:
doublequote: bool

Boolean value to enable/disable

Returns:
CsvReaderOptionsBuilder
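Python's standard csv module has a `doublequote` parameter with the same name and a closely related meaning, so it serves as a convenient illustration: inside a quoted field, two consecutive quote characters stand for one literal quote. This shows the convention only and assumes, without confirmation from this page, that pylibcudf follows it.

```python
import csv

# The second field contains a literal quote written as two quote characters.
line = 'id,"she said ""hi"""'
row = next(csv.reader([line], quotechar='"', doublequote=True))
print(row)
```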
keep_default_na(self, bool keep_default_na) CsvReaderOptionsBuilder#

Sets whether to keep the built-in default NA values.

Parameters:
keep_default_na: bool

Boolean value to enable/disable

Returns:
CsvReaderOptionsBuilder
lineterminator(self, unicode lineterminator) CsvReaderOptionsBuilder#

Sets line terminator.

Parameters:
lineterminator: str

A character to indicate line termination

Returns:
CsvReaderOptionsBuilder
mangle_dupe_cols(self, bool mangle_dupe_cols) CsvReaderOptionsBuilder#

Sets whether to rename duplicate column names.

Parameters:
mangle_dupe_cols: bool

Boolean value to enable/disable

Returns:
CsvReaderOptionsBuilder
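Renaming duplicates keeps every column addressable by a unique name. The sketch below appends a numeric suffix to repeated names; the `.1`, `.2` suffix scheme is an assumption borrowed from the pandas convention and is not confirmed by this page, and `mangle_duplicates` is a hypothetical helper.

```python
def mangle_duplicates(names):
    """Rename repeated column names by appending ".1", ".2", and so on."""
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)
        # The first occurrence keeps its name; later ones get a numeric suffix.
        result.append(name if count == 0 else f"{name}.{count}")
        seen[name] = count + 1
    return result

mangled = mangle_duplicates(["a", "b", "a", "a"])
print(mangled)
```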
na_filter(self, bool na_filter) CsvReaderOptionsBuilder#

Sets whether to enable the null filter.

Parameters:
na_filter: bool

Boolean value to enable/disable

Returns:
CsvReaderOptionsBuilder
nrows(self, size_type nrows) CsvReaderOptionsBuilder#

Sets number of rows to read.

Parameters:
nrows: size_type

Number of rows to read

Returns:
CsvReaderOptionsBuilder
quotechar(self, unicode quotechar) CsvReaderOptionsBuilder#

Sets quoting character.

Parameters:
quotechar: str

A character to indicate quoting

Returns:
CsvReaderOptionsBuilder
quoting(self, quote_style quoting) CsvReaderOptionsBuilder#

Sets quoting style.

Parameters:
quoting: quote_style

Quoting style used

Returns:
CsvReaderOptionsBuilder
skip_blank_lines(self, bool skip_blank_lines) CsvReaderOptionsBuilder#

Sets whether to ignore empty lines or parse line values as invalid.

Parameters:
skip_blank_lines: bool

Boolean value to enable/disable

Returns:
CsvReaderOptionsBuilder
skipfooter(self, size_type skipfooter) CsvReaderOptionsBuilder#

Sets number of rows to skip from end.

Parameters:
skipfooter: size_type

Number of rows to skip

Returns:
CsvReaderOptionsBuilder
skipinitialspace(self, bool skipinitialspace) CsvReaderOptionsBuilder#

Sets whether to skip whitespace after the delimiter.

Parameters:
skipinitialspace: bool

Boolean value to enable/disable

Returns:
CsvReaderOptionsBuilder
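Python's standard csv module exposes a parameter with the same name, which makes the effect easy to see: whitespace that directly follows a delimiter is dropped instead of becoming part of the field. The sketch illustrates the convention only; it is not pylibcudf code.

```python
import csv

line = "a, b,  c"

kept = next(csv.reader([line], skipinitialspace=False))
skipped = next(csv.reader([line], skipinitialspace=True))
print(kept, skipped)
```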
skiprows(self, size_type skiprows) CsvReaderOptionsBuilder#

Sets number of rows to skip from start.

Parameters:
skiprows: size_type

Number of rows to skip

Returns:
CsvReaderOptionsBuilder
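The row-limiting options (`skiprows`, `nrows`, `skipfooter`) can be pictured as slices over the sequence of rows. The sketch below uses list slicing on made-up row labels; both helpers are hypothetical, and how the real reader counts header or blank lines is not modelled.

```python
rows = ["r0", "r1", "r2", "r3", "r4", "r5"]

def head_window(rows, skiprows, nrows):
    """Skip `skiprows` rows from the start, then keep at most `nrows` rows."""
    return rows[skiprows:skiprows + nrows]

def trim_both_ends(rows, skiprows, skipfooter):
    """Skip `skiprows` rows from the start and `skipfooter` rows from the end."""
    return rows[skiprows:len(rows) - skipfooter]

first_three_after_one = head_window(rows, skiprows=1, nrows=3)
without_edges = trim_both_ends(rows, skiprows=1, skipfooter=2)
print(first_three_after_one, without_edges)
```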
class pylibcudf.io.csv.CsvWriterOptions#

The settings to use for write_csv

For details, see cudf::io::csv_writer_options

Methods

builder(SinkInfo sink, Table table)

Create a CsvWriterOptionsBuilder object

static builder(SinkInfo sink, Table table)#

Create a CsvWriterOptionsBuilder object

For details, see cudf::io::csv_writer_options::builder()

Parameters:
sink: SinkInfo

The sink used for writer output

table: Table

Table to be written to output

Returns:
CsvWriterOptionsBuilder

Builder to build CsvWriterOptions

class pylibcudf.io.csv.CsvWriterOptionsBuilder#

Builder to build options for write_csv

For details, see cudf::io::csv_writer_options_builder

Methods

build(self)

Create a CsvWriterOptions object

false_value(self, unicode val)

Sets string used for values == 0

include_header(self, bool val)

Enables/disables headers being written to CSV.

inter_column_delimiter(self, unicode delim)

Sets character used for separating column values.

line_terminator(self, unicode term)

Sets character used for separating lines.

na_rep(self, unicode val)

Sets string to use for null entries.

names(self, list names)

Sets optional column names.

rows_per_chunk(self, int val)

Sets maximum number of rows to process for each file write.

true_value(self, unicode val)

Sets string used for values != 0

build(self) CsvWriterOptions#

Create a CsvWriterOptions object

false_value(self, unicode val) CsvWriterOptionsBuilder#

Sets string used for values == 0

Parameters:
val: str

String to represent values == 0

Returns:
CsvWriterOptionsBuilder

Builder to build CsvWriterOptions

include_header(self, bool val) CsvWriterOptionsBuilder#

Enables/disables headers being written to CSV.

Parameters:
val: bool

Boolean value to enable/disable

Returns:
CsvWriterOptionsBuilder

Builder to build CsvWriterOptions

inter_column_delimiter(self, unicode delim) CsvWriterOptionsBuilder#

Sets character used for separating column values.

Parameters:
delim: str

Character to delimit column values

Returns:
CsvWriterOptionsBuilder

Builder to build CsvWriterOptions

line_terminator(self, unicode term) CsvWriterOptionsBuilder#

Sets character used for separating lines.

Parameters:
term: str

Character to represent line termination

Returns:
CsvWriterOptionsBuilder

Builder to build CsvWriterOptions
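The column delimiter and the line terminator together determine the raw layout of the output. The sketch below uses Python's standard csv writer, whose `delimiter` and `lineterminator` parameters play the analogous roles, to show the resulting text. It illustrates the layout only and does not call pylibcudf.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
writer.writerow(["id", "name"])
writer.writerow([1, "ada"])

output = buffer.getvalue()
print(repr(output))
```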

na_rep(self, unicode val) CsvWriterOptionsBuilder#

Sets string to use for null entries.

Parameters:
val: str

String to represent null value

Returns:
CsvWriterOptionsBuilder

Builder to build CsvWriterOptions

names(self, list names) CsvWriterOptionsBuilder#

Sets optional column names.

Parameters:
names: list[str]

Column names

Returns:
CsvWriterOptionsBuilder

Builder to build CsvWriterOptions

rows_per_chunk(self, int val) CsvWriterOptionsBuilder#

Sets maximum number of rows to process for each file write.

Parameters:
val: int

Number of rows per chunk

Returns:
CsvWriterOptionsBuilder

Builder to build CsvWriterOptions
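Writing in chunks bounds how many rows are processed per write. The sketch below splits a list of rows into consecutive chunks of at most `rows_per_chunk` rows; `chunk_rows` is a hypothetical helper that shows the partitioning idea, not the writer's internals.

```python
def chunk_rows(rows, rows_per_chunk):
    """Yield consecutive slices holding at most `rows_per_chunk` rows each."""
    for start in range(0, len(rows), rows_per_chunk):
        yield rows[start:start + rows_per_chunk]

chunks = list(chunk_rows(list(range(7)), rows_per_chunk=3))
print(chunks)
```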

true_value(self, unicode val) CsvWriterOptionsBuilder#

Sets string used for values != 0

Parameters:
val: str

String to represent values != 0

Returns:
CsvWriterOptionsBuilder

Builder to build CsvWriterOptions
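The `true_value`, `false_value`, and `na_rep` strings decide how boolean and null entries appear in the output text. The sketch below formats one row in plain Python; `format_field` and its default arguments are hypothetical and are not the library's defaults.

```python
def format_field(value, na_rep="", true_value="true", false_value="false"):
    """Render one value as CSV text using the configured replacement strings."""
    if value is None:
        return na_rep
    if isinstance(value, bool):
        return true_value if value else false_value
    return str(value)

row = [1, True, None, False]
line = ",".join(
    format_field(v, na_rep="NULL", true_value="Y", false_value="N") for v in row
)
print(line)
```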

pylibcudf.io.csv.read_csv(CsvReaderOptions options) TableWithMetadata#

Read from CSV format.

The source to read from and options are encapsulated by the options object.

For details, see read_csv().

Parameters:
options: CsvReaderOptions

Settings for controlling reading behavior
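A minimal sketch of how the reader pieces might fit together: build options from a source, adjust a setter, and pass the result to read_csv. It assumes pylibcudf is installed on a machine with a supported GPU, and that `pylibcudf.io.SourceInfo` accepts a list of file paths (that constructor is not described in this section). The function name `read_csv_head` is hypothetical.

```python
def read_csv_head(path, nrows=5, delimiter=","):
    """Read the first `nrows` rows of a CSV file with pylibcudf."""
    # Imported lazily so this module can be loaded without the GPU stack.
    import pylibcudf as plc

    # Assumption: SourceInfo is constructed from a list of file paths.
    source = plc.io.SourceInfo([path])
    options = (
        plc.io.csv.CsvReaderOptions.builder(source)
        .nrows(nrows)
        .skipinitialspace(True)
        .build()
    )
    # Setters on the built options object return None, so they are not chained.
    options.set_delimiter(delimiter)
    return plc.io.csv.read_csv(options)
```

Builder methods return the builder and can be chained, whereas the `set_*` methods on CsvReaderOptions return None and are called one at a time after `build()`.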

pylibcudf.io.csv.write_csv(CsvWriterOptions options) void#

Write to CSV format.

The table to write, output paths, and options are encapsulated by the options object.

For details, see write_csv().

Parameters:
options: CsvWriterOptions

Settings for controlling writing behavior
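A minimal sketch of the writer side, mirroring the reader: build options from a sink and a table, then pass them to write_csv. It assumes pylibcudf is installed on a machine with a supported GPU, that `table` is an existing pylibcudf Table, and that `pylibcudf.io.SinkInfo` accepts a list of output paths (that constructor is not described in this section). The function name `write_csv_file` is hypothetical.

```python
def write_csv_file(table, path, column_names):
    """Write a pylibcudf Table to `path` as CSV with a header row."""
    # Imported lazily so this module can be loaded without the GPU stack.
    import pylibcudf as plc

    # Assumption: SinkInfo is constructed from a list of output paths.
    sink = plc.io.SinkInfo([path])
    options = (
        plc.io.csv.CsvWriterOptions.builder(sink, table)
        .names(column_names)
        .include_header(True)
        .na_rep("NULL")
        .build()
    )
    plc.io.csv.write_csv(options)
```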