CSV#
- class pylibcudf.io.csv.CsvReaderOptions#
The settings to use for
read_csv
For details, see cudf::io::csv_reader_options
Methods
builder(SourceInfo source): Create a CsvReaderOptionsBuilder object
set_comment(self, unicode comment): Sets comment line start character.
set_delimiter(self, unicode delimiter): Sets field delimiter.
set_dtypes(self, types): Sets per-column types.
set_false_values(self, list false_values): Sets additional values to recognize as boolean false values.
set_header(self, size_type header): Sets header row index.
set_na_values(self, list na_values): Sets additional values to recognize as null values.
set_names(self, list col_names): Sets names of the columns.
set_parse_dates(self, list val): Sets indexes or names of columns to read as datetime.
set_parse_hex(self, list val): Sets indexes or names of columns to parse as hexadecimal.
set_prefix(self, unicode prefix): Sets prefix to be used for column ID.
set_thousands(self, unicode thousands): Sets numeric data thousands separator.
set_true_values(self, list true_values): Sets additional values to recognize as boolean true values.
set_use_cols_indexes(self, list col_indices): Sets indexes of columns to read.
set_use_cols_names(self, list col_names): Sets names of the columns to be read.
- static builder(SourceInfo source)#
Create a CsvReaderOptionsBuilder object
For details, see
cudf::io::csv_reader_options::builder()
- Parameters:
- source: SourceInfo
The source to read the CSV file from.
- Returns:
- CsvReaderOptionsBuilder
Builder to build CsvReaderOptions
- set_comment(self, unicode comment) void #
Sets comment line start character.
- Parameters:
- comment: str
Character that marks the start of a comment line
- Returns:
- None
- set_delimiter(self, unicode delimiter) void #
Sets field delimiter.
- Parameters:
- delimiter: str
Character used as the field delimiter
- Returns:
- None
- set_dtypes(self, types) void #
Sets per-column types.
- Parameters:
- types: dict[str, data_type] | list[data_type]
Either a map from column name to data type specifying the columns’ target data types, or a list specifying the columns’ target data types.
- Returns:
- None
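The two accepted shapes of `types` can be sketched as follows. This is a minimal illustration, assuming pylibcudf is installed and that `pylibcudf.DataType` and `pylibcudf.TypeId` are used to build the `data_type` values; the helper name `apply_dtypes` and the column names `id` and `price` are hypothetical.

```python
def apply_dtypes(options, by_name=True):
    """Set per-column types on a CsvReaderOptions object (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without a GPU environment.
    import pylibcudf as plc

    int64 = plc.DataType(plc.TypeId.INT64)
    float64 = plc.DataType(plc.TypeId.FLOAT64)

    if by_name:
        # dict form: column name -> target data type
        options.set_dtypes({"id": int64, "price": float64})
    else:
        # list form: a list of target data types
        options.set_dtypes([int64, float64])
```

The dict form is likely the safer choice when column order in the file is not guaranteed.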
- set_false_values(self, list false_values) void #
Sets additional values to recognize as boolean false values.
- Parameters:
- false_values: list[str]
List of values to be considered false
- Returns:
- None
- set_header(self, size_type header) void #
Sets header row index.
- Parameters:
- header: size_type
Index where header row is located
- Returns:
- None
- set_na_values(self, list na_values) void #
Sets additional values to recognize as null values.
- Parameters:
- na_values: list[str]
List of values to be considered null
- Returns:
- None
- set_names(self, list col_names) void #
Sets names of the columns.
- Parameters:
- col_names: list[str]
List of column names
- Returns:
- None
- set_parse_dates(self, list val) void #
Sets indexes or names of columns to read as datetime.
- Parameters:
- val: list[int | str]
List of column indices or names to infer as datetime.
- Returns:
- None
- set_parse_hex(self, list val) void #
Sets indexes or names of columns to parse as hexadecimal.
- Parameters:
- val: list[int | str]
List of column indices or names to parse as hexadecimal
- Returns:
- None
- set_prefix(self, unicode prefix) void #
Sets prefix to be used for column ID.
- Parameters:
- prefix: str
String used as prefix for each column name
- Returns:
- None
- set_thousands(self, unicode thousands) void #
Sets numeric data thousands separator.
- Parameters:
- thousands: str
A character that separates thousands
- Returns:
- None
- set_true_values(self, list true_values) void #
Sets additional values to recognize as boolean true values.
- Parameters:
- true_values: list[str]
List of values to be considered true
- Returns:
- None
- set_use_cols_indexes(self, list col_indices) void #
Sets indexes of columns to read.
- Parameters:
- col_indices: list[int]
List of column indices that are needed
- Returns:
- None
- set_use_cols_names(self, list col_names) void #
Sets names of the columns to be read.
- Parameters:
- col_names: list[str]
List of column names that are needed
- Returns:
- None
- class pylibcudf.io.csv.CsvReaderOptionsBuilder#
Builder to build options for
read_csv
For details, see
cudf::io::csv_reader_options_builder
Methods
build(self): Create a CsvReaderOptions object
byte_range_offset(self, size_t byte_range_offset): Sets number of bytes to skip from source start.
byte_range_size(self, size_t byte_range_size): Sets number of bytes to read.
compression(self, compression_type compression): Sets compression format of the source.
dayfirst(self, bool dayfirst): Sets whether to parse dates as DD/MM versus MM/DD.
decimal(self, unicode decimal): Sets decimal point character.
delim_whitespace(self, bool delim_whitespace): Sets whether to treat whitespace as field delimiter.
doublequote(self, bool doublequote): Sets whether a quote inside a value is double-quoted.
keep_default_na(self, bool keep_default_na): Sets whether to keep the built-in default NA values.
lineterminator(self, unicode lineterminator): Sets line terminator.
mangle_dupe_cols(self, bool mangle_dupe_cols): Sets whether to rename duplicate column names.
na_filter(self, bool na_filter): Sets whether to disable null filter.
nrows(self, size_type nrows): Sets number of rows to read.
quotechar(self, unicode quotechar): Sets quoting character.
quoting(self, quote_style quoting): Sets quoting style.
skip_blank_lines(self, bool skip_blank_lines): Sets whether to ignore empty lines or parse line values as invalid.
skipfooter(self, size_type skipfooter): Sets number of rows to skip from end.
skipinitialspace(self, bool skipinitialspace): Sets whether to skip whitespace after the delimiter.
skiprows(self, size_type skiprows): Sets number of rows to skip from start.
- build(self) CsvReaderOptions #
Create a CsvReaderOptions object
- byte_range_offset(self, size_t byte_range_offset) CsvReaderOptionsBuilder #
Sets number of bytes to skip from source start.
- Parameters:
- byte_range_offset: size_t
Number of bytes of offset
- Returns:
- CsvReaderOptionsBuilder
- byte_range_size(self, size_t byte_range_size) CsvReaderOptionsBuilder #
Sets number of bytes to read.
- Parameters:
- byte_range_size: size_t
Number of bytes to read
- Returns:
- CsvReaderOptionsBuilder
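One common use of an offset and size pair is to read a large file in pieces. The sketch below only computes the `(offset, size)` pairs in plain Python; the helper name `byte_ranges` is hypothetical, and how the reader treats rows that straddle a range boundary is not covered here.

```python
def byte_ranges(total_size, chunk_size):
    """Split [0, total_size) into (offset, size) pairs of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, total_size - offset))
        for offset in range(0, total_size, chunk_size)
    ]


ranges = byte_ranges(total_size=250, chunk_size=100)
print(ranges)  # [(0, 100), (100, 100), (200, 50)]
```

Each pair could then be passed to a fresh builder as `byte_range_offset(offset)` followed by `byte_range_size(size)`.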
- compression(self, compression_type compression) CsvReaderOptionsBuilder #
Sets compression format of the source.
- Parameters:
- compression: compression_type
Compression type
- Returns:
- CsvReaderOptionsBuilder
- dayfirst(self, bool dayfirst) CsvReaderOptionsBuilder #
Sets whether to parse dates as DD/MM versus MM/DD.
- Parameters:
- dayfirst: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
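The ambiguity that `dayfirst` resolves can be illustrated with the Python standard library. This is only an illustration of DD/MM versus MM/DD interpretation, not pylibcudf's parser; the helper name `parse_date` is hypothetical.

```python
from datetime import datetime


def parse_date(text, dayfirst=False):
    """Parse a slash-separated date as DD/MM/YYYY or MM/DD/YYYY."""
    fmt = "%d/%m/%Y" if dayfirst else "%m/%d/%Y"
    return datetime.strptime(text, fmt).date()


print(parse_date("03/04/2021"))                 # 2021-03-04
print(parse_date("03/04/2021", dayfirst=True))  # 2021-04-03
```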
- decimal(self, unicode decimal) CsvReaderOptionsBuilder #
Sets decimal point character.
- Parameters:
- decimal: str
Character used as the decimal point
- Returns:
- CsvReaderOptionsBuilder
- delim_whitespace(self, bool delim_whitespace) CsvReaderOptionsBuilder #
Sets whether to treat whitespace as field delimiter.
- Parameters:
- delim_whitespace: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- doublequote(self, bool doublequote) CsvReaderOptionsBuilder #
Sets whether a quote inside a value is double-quoted.
- Parameters:
- doublequote: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
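Python's standard `csv` module has an option with the same name, which makes for a convenient illustration of the double-quote convention. The sketch assumes pylibcudf follows the same convention; that assumption should be checked against the libcudf documentation.

```python
import csv
import io

# Inside a quoted field, two consecutive quote characters stand for one literal quote.
line = '"say ""hi""",2\n'
rows = list(csv.reader(io.StringIO(line), doublequote=True, quotechar='"'))
print(rows)  # [['say "hi"', '2']]
```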
- keep_default_na(self, bool keep_default_na) CsvReaderOptionsBuilder #
Sets whether to keep the built-in default NA values.
- Parameters:
- keep_default_na: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- lineterminator(self, unicode lineterminator) CsvReaderOptionsBuilder #
Sets line terminator.
- Parameters:
- lineterminator: str
A character to indicate line termination
- Returns:
- CsvReaderOptionsBuilder
- mangle_dupe_cols(self, bool mangle_dupe_cols) CsvReaderOptionsBuilder #
Sets whether to rename duplicate column names.
- Parameters:
- mangle_dupe_cols: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- na_filter(self, bool na_filter) CsvReaderOptionsBuilder #
Sets whether to disable null filter.
- Parameters:
- na_filter: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- nrows(self, size_type nrows) CsvReaderOptionsBuilder #
Sets number of rows to read.
- Parameters:
- nrows: size_type
Number of rows to read
- Returns:
- CsvReaderOptionsBuilder
- quotechar(self, unicode quotechar) CsvReaderOptionsBuilder #
Sets quoting character.
- Parameters:
- quotechar: str
A character to indicate quoting
- Returns:
- CsvReaderOptionsBuilder
- quoting(self, quote_style quoting) CsvReaderOptionsBuilder #
Sets quoting style.
- Parameters:
- quoting: quote_style
Quoting style used
- Returns:
- CsvReaderOptionsBuilder
- skip_blank_lines(self, bool skip_blank_lines) CsvReaderOptionsBuilder #
Sets whether to ignore empty lines or parse line values as invalid.
- Parameters:
- skip_blank_lines: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- skipfooter(self, size_type skipfooter) CsvReaderOptionsBuilder #
Sets number of rows to skip from end.
- Parameters:
- skipfooter: size_type
Number of rows to skip
- Returns:
- CsvReaderOptionsBuilder
- skipinitialspace(self, bool skipinitialspace) CsvReaderOptionsBuilder #
Sets whether to skip whitespace after the delimiter.
- Parameters:
- skipinitialspace: bool
Boolean value to enable/disable
- Returns:
- CsvReaderOptionsBuilder
- skiprows(self, size_type skiprows) CsvReaderOptionsBuilder #
Sets number of rows to skip from start.
- Parameters:
- skiprows: size_type
Number of rows to skip
- Returns:
- CsvReaderOptionsBuilder
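Because every setter above returns the builder, calls can be chained and finished with `build()`. A minimal sketch, assuming `builder` is a CsvReaderOptionsBuilder obtained from `CsvReaderOptions.builder(...)`; the helper name `configure_reader` and the chosen settings are hypothetical.

```python
def configure_reader(builder, max_rows):
    """Apply a few reader settings to a CsvReaderOptionsBuilder and finalize it."""
    return (
        builder
        .skiprows(1)             # skip one row from the start
        .nrows(max_rows)         # read at most max_rows rows
        .skip_blank_lines(True)  # ignore empty lines
        .build()                 # produce the CsvReaderOptions object
    )
```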
- class pylibcudf.io.csv.CsvWriterOptions#
The settings to use for
write_csv
For details, see
cudf::io::csv_writer_options
Methods
builder(SinkInfo sink, Table table): Create a CsvWriterOptionsBuilder object
- static builder(SinkInfo sink, Table table)#
Create a CsvWriterOptionsBuilder object
For details, see
cudf::io::csv_writer_options::builder()
- Parameters:
- sink: SinkInfo
The sink used for writer output
- table: Table
Table to be written to output
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- class pylibcudf.io.csv.CsvWriterOptionsBuilder#
Builder to build options for
write_csv
For details, see
cudf::io::csv_writer_options_builder
Methods
build(self): Create a CsvWriterOptions object
false_value(self, unicode val): Sets string used for values == 0
include_header(self, bool val): Enables/Disables headers being written to csv.
inter_column_delimiter(self, unicode delim): Sets character used for separating column values.
line_terminator(self, unicode term): Sets character used for separating lines.
na_rep(self, unicode val): Sets string to use for null entries.
names(self, list names): Sets optional column names.
rows_per_chunk(self, int val): Sets maximum number of rows to process for each file write.
true_value(self, unicode val): Sets string used for values != 0
- build(self) CsvWriterOptions #
Create a CsvWriterOptions object
- false_value(self, unicode val) CsvWriterOptionsBuilder #
Sets string used for values == 0
- Parameters:
- val: str
String to represent values == 0
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- include_header(self, bool val) CsvWriterOptionsBuilder #
Enables/Disables headers being written to csv.
- Parameters:
- val: bool
Boolean value to enable/disable
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- inter_column_delimiter(self, unicode delim) CsvWriterOptionsBuilder #
Sets character used for separating column values.
- Parameters:
- delim: str
Character to delimit column values
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- line_terminator(self, unicode term) CsvWriterOptionsBuilder #
Sets character used for separating lines.
- Parameters:
- term: str
Character to represent line termination
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- na_rep(self, unicode val) CsvWriterOptionsBuilder #
Sets string to use for null entries.
- Parameters:
- val: str
String to represent null value
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- names(self, list names) CsvWriterOptionsBuilder #
Sets optional column names.
- Parameters:
- names: list[str]
Column names
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- rows_per_chunk(self, int val) CsvWriterOptionsBuilder #
Sets maximum number of rows to process for each file write.
- Parameters:
- val: int
Number of rows per chunk
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- true_value(self, unicode val) CsvWriterOptionsBuilder #
Sets string used for values != 0
- Parameters:
- val: str
String to represent values != 0
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- pylibcudf.io.csv.read_csv(CsvReaderOptions options) TableWithMetadata #
Read from CSV format.
The source to read from and options are encapsulated by the options object.
For details, see
read_csv().
- Parameters:
- options: CsvReaderOptions
Settings for controlling reading behavior
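Putting the reader pieces together, a read might look like the sketch below. It assumes pylibcudf is installed, that `pylibcudf.io.SourceInfo` accepts a list of file paths, and that the returned TableWithMetadata exposes `tbl` and `column_names()`; the helper name `read_csv_file` is hypothetical.

```python
def read_csv_file(path):
    """Read one CSV file and return (table, column names) (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without a GPU environment.
    import pylibcudf as plc

    # The source is supplied when the builder is created, not to read_csv itself.
    source = plc.io.SourceInfo([path])
    options = plc.io.csv.CsvReaderOptions.builder(source).build()

    result = plc.io.csv.read_csv(options)
    # Assumption: TableWithMetadata carries the table and its column names.
    return result.tbl, result.column_names()
```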
- pylibcudf.io.csv.write_csv(CsvWriterOptions options) void #
Write to CSV format.
The table to write, output paths, and options are encapsulated by the options object.
For details, see
write_csv().
- Parameters:
- options: CsvWriterOptions
Settings for controlling writing behavior
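A write follows the same builder pattern, with the sink and table supplied up front. This sketch assumes pylibcudf is installed and that `pylibcudf.io.SinkInfo` accepts a list of file paths; the helper name `write_csv_file` and the chosen settings are hypothetical.

```python
def write_csv_file(table, path, column_names):
    """Write a pylibcudf Table to a CSV file (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without a GPU environment.
    import pylibcudf as plc

    sink = plc.io.SinkInfo([path])
    options = (
        plc.io.csv.CsvWriterOptions.builder(sink, table)
        .names(column_names)    # optional column names
        .include_header(True)   # write the header row
        .na_rep("NA")           # string used for null entries
        .build()
    )
    plc.io.csv.write_csv(options)
```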