CSV
- class pylibcudf.io.csv.CsvReaderOptions
The settings to use for read_csv.
For details, see cudf::io::csv_reader_options.
Methods
builder(SourceInfo source): Create a CsvReaderOptionsBuilder object
set_comment(self, str comment): Sets comment line start character.
set_delimiter(self, str delimiter): Sets field delimiter.
set_dtypes(self, types): Sets per-column types.
set_false_values(self, list false_values): Sets additional values to recognize as boolean false values.
set_header(self, size_type header): Sets header row index.
set_na_values(self, list na_values): Sets additional values to recognize as null values.
set_names(self, list col_names): Sets names of the columns.
set_parse_dates(self, list val): Sets indexes or names of columns to read as datetime.
set_parse_hex(self, list val): Sets indexes or names of columns to parse as hexadecimal.
set_prefix(self, str prefix): Sets prefix to be used for column ID.
set_thousands(self, str thousands): Sets numeric data thousands separator.
set_true_values(self, list true_values): Sets additional values to recognize as boolean true values.
set_use_cols_indexes(self, list col_indices): Sets indexes of columns to read.
set_use_cols_names(self, list col_names): Sets names of the columns to be read.
- static builder(SourceInfo source)
Create a CsvReaderOptionsBuilder object
For details, see cudf::io::csv_reader_options::builder()
- Parameters:
- source: SourceInfo
The source to read the CSV file from.
- Returns:
- CsvReaderOptionsBuilder
Builder to build CsvReaderOptions
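As a minimal sketch of how the static builder might be used, the helper below wraps a file path in a SourceInfo and builds reader options with every setting left at its default. The helper name `default_reader_options` and the path are illustrative, and passing a list of paths to `SourceInfo` is an assumption not stated on this page. `pylibcudf` is imported inside the function so the module itself loads on machines without a GPU.

```python
def default_reader_options(path):
    """Build CsvReaderOptions for `path` without changing any setting."""
    # Deferred import: pylibcudf needs a CUDA-capable environment.
    import pylibcudf as plc

    # Assumption: SourceInfo accepts a list of file paths.
    source = plc.io.SourceInfo([path])
    builder = plc.io.csv.CsvReaderOptions.builder(source)
    return builder.build()
```

The returned object can then be adjusted with the `set_*` methods documented here before it is passed to `read_csv`.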
- set_comment(self, str comment) -> void
Sets comment line start character.
- Parameters:
- comment: str
A character that indicates comment
- Returns:
- None
- set_delimiter(self, str delimiter) -> void
Sets field delimiter.
- Parameters:
- delimiter: str
A character to indicate delimiter
- Returns:
- None
- set_dtypes(self, types) -> void
Sets per-column types.
- Parameters:
- types: dict[str, data_type] | list[data_type]
A map from column name to data type, or a list of data types, specifying the columns’ target data types.
- Returns:
- None
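The two accepted forms of `types` can be sketched as follows. The column names `id` and `price` and the helper name `set_column_types` are hypothetical; `plc.DataType` and `plc.TypeId` are assumed to be the Python-side spelling of `data_type`, which this page does not spell out.

```python
def set_column_types(options, by_name=True):
    """Apply target dtypes to an existing CsvReaderOptions object."""
    # Deferred import: pylibcudf needs a CUDA-capable environment.
    import pylibcudf as plc

    int64 = plc.DataType(plc.TypeId.INT64)
    float64 = plc.DataType(plc.TypeId.FLOAT64)

    if by_name:
        # dict form: column name -> target type (names are hypothetical)
        options.set_dtypes({"id": int64, "price": float64})
    else:
        # list form: target types without column names
        options.set_dtypes([int64, float64])
```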
- set_false_values(self, list false_values) -> void
Sets additional values to recognize as boolean false values.
- Parameters:
- false_values: list[str]
List of values to be considered to be false
- Returns:
- None
- set_header(self, size_type header) -> void
Sets header row index.
- Parameters:
- header: size_type
Index where header row is located
- Returns:
- None
- set_na_values(self, list na_values) -> void
Sets additional values to recognize as null values.
- Parameters:
- na_values: list[str]
List of values to be considered to be null
- Returns:
- None
- set_names(self, list col_names) -> void
Sets names of the columns.
- Parameters:
- col_names: list[str]
List of column names
- Returns:
- None
- set_parse_dates(self, list val) -> void
Sets indexes or names of columns to read as datetime.
- Parameters:
- val: list[int | str]
List of column indices or names to infer as datetime.
- Returns:
- None
- set_parse_hex(self, list val) -> void
Sets indexes or names of columns to parse as hexadecimal.
- Parameters:
- val: list[int | str]
List of column indices or names to parse as hexadecimal
- Returns:
- None
- set_prefix(self, str prefix) -> void
Sets prefix to be used for column ID.
- Parameters:
- prefix: str
String used as prefix for each column name
- Returns:
- None
- set_thousands(self, str thousands) -> void
Sets numeric data thousands separator.
- Parameters:
- thousands: str
A character that separates thousands
- Returns:
- None
- set_true_values(self, list true_values) -> void
Sets additional values to recognize as boolean true values.
- Parameters:
- true_values: list[str]
List of values to be considered to be true
- Returns:
- None
- set_use_cols_indexes(self, list col_indices) -> void
Sets indexes of columns to read.
- Parameters:
- col_indices: list[int]
List of column indices that are needed
- Returns:
- None
- set_use_cols_names(self, list col_names) -> void
Sets names of the columns to be read.
- Parameters:
- col_names: list[str]
List of column names that are needed
- Returns:
- None
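Because each `set_*` method returns None, the setters are called one after another rather than chained. A short sketch, with hypothetical column names and sentinel strings, of selecting columns and registering extra null and boolean markers on an existing options object:

```python
def configure_reader(options):
    """Apply column selection and sentinel values to CsvReaderOptions."""
    # Read only these columns (names are hypothetical).
    options.set_use_cols_names(["id", "created", "in_stock"])
    # Treat these strings as null, in addition to whatever is already recognized.
    options.set_na_values(["n/a", "missing"])
    # Extra spellings of boolean true and false.
    options.set_true_values(["yes"])
    options.set_false_values(["no"])
    # Ask the reader to parse this column as datetime.
    options.set_parse_dates(["created"])
    return options
```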
- class pylibcudf.io.csv.CsvReaderOptionsBuilder
Builder to build options for read_csv.
For details, see cudf::io::csv_reader_options_builder.
Methods
build(self): Create a CsvReaderOptions object
byte_range_offset(self, size_t byte_range_offset): Sets number of bytes to skip from source start.
byte_range_size(self, size_t byte_range_size): Sets number of bytes to read.
compression(self, compression_type compression): Sets compression format of the source.
dayfirst(self, bool dayfirst): Sets whether to parse dates as DD/MM versus MM/DD.
decimal(self, str decimal): Sets decimal point character.
delim_whitespace(self, bool delim_whitespace): Sets whether to treat whitespace as field delimiter.
delimiter(self, str delimiter): Sets field delimiter.
doublequote(self, bool doublequote): Sets whether a quote inside a value is double-quoted.
keep_default_na(self, bool keep_default_na): Sets whether to keep the built-in default NA values.
lineterminator(self, str lineterminator): Sets line terminator.
mangle_dupe_cols(self, bool mangle_dupe_cols): Sets whether to rename duplicate column names.
na_filter(self, bool na_filter): Sets whether to enable the null filter.
nrows(self, size_type nrows): Sets number of rows to read.
quotechar(self, str quotechar): Sets quoting character.
quoting(self, quote_style quoting): Sets quoting style.
skip_blank_lines(self, bool skip_blank_lines): Sets whether to ignore empty lines or parse line values as invalid.
skipfooter(self, size_type skipfooter): Sets number of rows to skip from end.
skipinitialspace(self, bool skipinitialspace): Sets whether to skip whitespace after the delimiter.
skiprows(self, size_type skiprows): Sets number of rows to skip from start.
- build(self) -> CsvReaderOptions
Create a CsvReaderOptions object
- byte_range_offset(self, size_t byte_range_offset) -> CsvReaderOptionsBuilder
Sets number of bytes to skip from source start.
- Parameters:
- byte_range_offset: size_t
Number of bytes of offset
- Returns:
- CsvReaderOptionsBuilder
- byte_range_size(self, size_t byte_range_size) -> CsvReaderOptionsBuilder
Sets number of bytes to read.
- Parameters:
- byte_range_size: size_t
Number of bytes to read
- Returns:
- CsvReaderOptionsBuilder
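One plausible use of the two byte-range settings is to read a large file in pieces. The sketch below only computes non-overlapping (offset, size) pairs and feeds one pair to the builder. Both helper names are hypothetical, and how the reader treats a row that straddles a range boundary, or the header row in later ranges, is not described on this page and is left as an open assumption.

```python
def byte_ranges(total_bytes, chunk_bytes):
    """Yield (offset, size) pairs that cover total_bytes without overlap."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    for offset in range(0, total_bytes, chunk_bytes):
        # The last range is shortened so it does not run past the end.
        yield offset, min(chunk_bytes, total_bytes - offset)


def chunk_options(path, offset, size):
    """Build reader options restricted to one byte range of `path`."""
    # Deferred import: pylibcudf needs a CUDA-capable environment.
    import pylibcudf as plc

    source = plc.io.SourceInfo([path])  # assumption: list of paths
    return (
        plc.io.csv.CsvReaderOptions.builder(source)
        .byte_range_offset(offset)
        .byte_range_size(size)
        .build()
    )
```

For a 10-byte source split into 4-byte ranges, `list(byte_ranges(10, 4))` gives `[(0, 4), (4, 4), (8, 2)]`.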
- compression(self, compression_type compression) -> CsvReaderOptionsBuilder
Sets compression format of the source.
- Parameters:
- compression: compression_type
Compression type
- Returns:
- CsvReaderOptionsBuilder
- dayfirst(self, bool dayfirst) -> CsvReaderOptionsBuilder
Sets whether to parse dates as DD/MM versus MM/DD.
- Parameters:
- dayfirst: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
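The ambiguity that `dayfirst` resolves can be illustrated without pylibcudf. The sketch below uses Python's standard `datetime` module purely as a stand-in to show how the same text yields two different dates; it does not call the reader, and the helper name is hypothetical.

```python
from datetime import datetime


def parse_ambiguous(text, dayfirst):
    """Parse 'NN/NN/YYYY' as DD/MM/YYYY when dayfirst is True, else MM/DD/YYYY."""
    fmt = "%d/%m/%Y" if dayfirst else "%m/%d/%Y"
    return datetime.strptime(text, fmt).date()
```

With this stand-in, `parse_ambiguous("01/02/2024", True)` is 1 February 2024, while `parse_ambiguous("01/02/2024", False)` is 2 January 2024.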
- decimal(self, str decimal) -> CsvReaderOptionsBuilder
Sets decimal point character.
- Parameters:
- decimal: str
A character that indicates decimal values
- Returns:
- CsvReaderOptionsBuilder
- delim_whitespace(self, bool delim_whitespace) -> CsvReaderOptionsBuilder
Sets whether to treat whitespace as field delimiter.
- Parameters:
- delim_whitespace: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- delimiter(self, str delimiter) -> CsvReaderOptionsBuilder
Sets field delimiter.
- Parameters:
- delimiter: str
A character to indicate delimiter
- Returns:
- CsvReaderOptionsBuilder
- doublequote(self, bool doublequote) -> CsvReaderOptionsBuilder
Sets whether a quote inside a value is double-quoted.
- Parameters:
- doublequote: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- keep_default_na(self, bool keep_default_na) -> CsvReaderOptionsBuilder
Sets whether to keep the built-in default NA values.
- Parameters:
- keep_default_na: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- lineterminator(self, str lineterminator) -> CsvReaderOptionsBuilder
Sets line terminator.
- Parameters:
- lineterminator: str
A character to indicate line termination
- Returns:
- CsvReaderOptionsBuilder
- mangle_dupe_cols(self, bool mangle_dupe_cols) -> CsvReaderOptionsBuilder
Sets whether to rename duplicate column names.
- Parameters:
- mangle_dupe_cols: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- na_filter(self, bool na_filter) -> CsvReaderOptionsBuilder
Sets whether to enable the null filter.
- Parameters:
- na_filter: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- nrows(self, size_type nrows) -> CsvReaderOptionsBuilder
Sets number of rows to read.
- Parameters:
- nrows: size_type
Number of rows to read
- Returns:
- CsvReaderOptionsBuilder
- quotechar(self, str quotechar) -> CsvReaderOptionsBuilder
Sets quoting character.
- Parameters:
- quotechar: str
A character to indicate quoting
- Returns:
- CsvReaderOptionsBuilder
- quoting(self, quote_style quoting) -> CsvReaderOptionsBuilder
Sets quoting style.
- Parameters:
- quoting: quote_style
Quoting style used
- Returns:
- CsvReaderOptionsBuilder
- skip_blank_lines(self, bool skip_blank_lines) -> CsvReaderOptionsBuilder
Sets whether to ignore empty lines or parse line values as invalid.
- Parameters:
- skip_blank_lines: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- skipfooter(self, size_type skipfooter) -> CsvReaderOptionsBuilder
Sets number of rows to skip from end.
- Parameters:
- skipfooter: size_type
Number of rows to skip
- Returns:
- CsvReaderOptionsBuilder
- skipinitialspace(self, bool skipinitialspace) -> CsvReaderOptionsBuilder
Sets whether to skip whitespace after the delimiter.
- Parameters:
- skipinitialspace: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- skiprows(self, size_type skiprows) -> CsvReaderOptionsBuilder
Sets number of rows to skip from start.
- Parameters:
- skiprows: size_type
Number of rows to skip
- Returns:
- CsvReaderOptionsBuilder
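Since every builder method except `build` returns the builder itself, the settings can be chained. The following sketch configures a semicolon-delimited file that uses a decimal comma. The helper name and the chosen values are illustrative, and `plc.io.types.CompressionType` is assumed to be the Python spelling of `compression_type`.

```python
def european_csv_options(path):
    """Build reader options for a ';'-delimited file with ',' as decimal point."""
    # Deferred import: pylibcudf needs a CUDA-capable environment.
    import pylibcudf as plc

    source = plc.io.SourceInfo([path])  # assumption: list of paths
    return (
        plc.io.csv.CsvReaderOptions.builder(source)
        .delimiter(";")
        .decimal(",")
        .skiprows(2)            # skip two rows from the start of the source
        .nrows(1000)            # then read at most 1000 rows
        .skip_blank_lines(True)
        # Assumption: the enum lives in pylibcudf.io.types.
        .compression(plc.io.types.CompressionType.AUTO)
        .build()
    )
```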
- class pylibcudf.io.csv.CsvWriterOptions
The settings to use for write_csv.
For details, see cudf::io::csv_writer_options.
Methods
builder(SinkInfo sink, Table table): Create a CsvWriterOptionsBuilder object
- static builder(SinkInfo sink, Table table)
Create a CsvWriterOptionsBuilder object
For details, see cudf::io::csv_writer_options::builder()
- Parameters:
- sink: SinkInfo
The sink used for writer output
- table: Table
Table to be written to output
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- class pylibcudf.io.csv.CsvWriterOptionsBuilder
Builder to build options for write_csv.
For details, see cudf::io::csv_writer_options_builder.
Methods
build(self): Create a CsvWriterOptions object
false_value(self, str val): Sets string used for values == 0
include_header(self, bool val): Enables/Disables headers being written to csv.
inter_column_delimiter(self, str delim): Sets character used for separating column values.
line_terminator(self, str term): Sets character used for separating lines.
na_rep(self, str val): Sets string to use for null entries.
names(self, list names): Sets optional column names.
rows_per_chunk(self, int val): Sets maximum number of rows to process for each file write.
true_value(self, str val): Sets string used for values != 0
- build(self) -> CsvWriterOptions
Create a CsvWriterOptions object
- false_value(self, str val) -> CsvWriterOptionsBuilder
Sets string used for values == 0
- Parameters:
- val: str
String to represent values == 0
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- include_header(self, bool val) -> CsvWriterOptionsBuilder
Enables/Disables headers being written to csv.
- Parameters:
- val: bool
Boolean value to enable/disable
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- inter_column_delimiter(self, str delim) -> CsvWriterOptionsBuilder
Sets character used for separating column values.
- Parameters:
- delim: str
Character to delimit column values
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- line_terminator(self, str term) -> CsvWriterOptionsBuilder
Sets character used for separating lines.
- Parameters:
- term: str
Character to represent line termination
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- na_rep(self, str val) -> CsvWriterOptionsBuilder
Sets string to use for null entries.
- Parameters:
- val: str
String to represent null value
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- names(self, list names) -> CsvWriterOptionsBuilder
Sets optional column names.
- Parameters:
- names: list[str]
Column names
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- rows_per_chunk(self, int val) -> CsvWriterOptionsBuilder
Sets maximum number of rows to process for each file write.
- Parameters:
- val: int
Number of rows per chunk
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- true_value(self, str val) -> CsvWriterOptionsBuilder
Sets string used for values != 0
- Parameters:
- val: str
String to represent values != 0
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
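The writer builder chains in the same way as the reader builder. A sketch, assuming `SinkInfo` accepts a list of output paths and using arbitrary example values for the delimiter and the null and boolean strings (the helper name is hypothetical):

```python
def tsv_writer_options(table, path, column_names):
    """Build writer options that emit tab-separated output with a header row."""
    # Deferred import: pylibcudf needs a CUDA-capable environment.
    import pylibcudf as plc

    sink = plc.io.SinkInfo([path])  # assumption: list of output paths
    return (
        plc.io.csv.CsvWriterOptions.builder(sink, table)
        .names(column_names)
        .include_header(True)
        .inter_column_delimiter("\t")
        .line_terminator("\n")
        .na_rep("NULL")          # text written for null entries
        .true_value("yes")       # text written for values != 0
        .false_value("no")       # text written for values == 0
        .rows_per_chunk(100_000)
        .build()
    )
```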
- pylibcudf.io.csv.read_csv(CsvReaderOptions options, Stream stream=None) -> TableWithMetadata
Read from CSV format.
The source to read from and options are encapsulated by the options object.
For details, see read_csv().
- Parameters:
- options: CsvReaderOptions
Settings for controlling reading behavior
- stream: Stream
CUDA stream used for device memory operations and kernel launches
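Putting the pieces together, a read might look like the sketch below. The helper name `read_table` is hypothetical, and the `tbl` attribute and `column_names()` method on the returned TableWithMetadata are assumptions taken from pylibcudf's I/O types rather than from this page. The `stream` argument is left at its default.

```python
def read_table(path, nrows=None):
    """Read a CSV file and return (table, column_names)."""
    # Deferred import: pylibcudf needs a CUDA-capable environment.
    import pylibcudf as plc

    source = plc.io.SourceInfo([path])  # assumption: list of paths
    builder = plc.io.csv.CsvReaderOptions.builder(source)
    if nrows is not None:
        builder = builder.nrows(nrows)  # optional row limit

    result = plc.io.csv.read_csv(builder.build())
    # Assumption: TableWithMetadata exposes .tbl and .column_names().
    return result.tbl, result.column_names()
```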
- pylibcudf.io.csv.write_csv(CsvWriterOptions options, Stream stream=None) -> void
Write to CSV format.
The table to write, output paths, and options are encapsulated by the options object.
For details, see write_csv().
- Parameters:
- options: CsvWriterOptions
Settings for controlling writing behavior
- stream: Stream
CUDA stream used for device memory operations and kernel launches
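A corresponding write, again as a sketch with a hypothetical helper name: the table, sink and column names are bundled into CsvWriterOptions, and `write_csv` is called with the default stream.

```python
def write_table(table, path, column_names):
    """Write `table` to `path` as CSV with the given column names."""
    # Deferred import: pylibcudf needs a CUDA-capable environment.
    import pylibcudf as plc

    sink = plc.io.SinkInfo([path])  # assumption: list of output paths
    options = (
        plc.io.csv.CsvWriterOptions.builder(sink, table)
        .names(column_names)
        .build()
    )
    plc.io.csv.write_csv(options)  # returns nothing
```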