CSV
- class pylibcudf.io.csv.CsvWriterOptions
The settings to use for write_csv.
For details, see cudf::io::csv_writer_options.
Methods
builder(SinkInfo sink, Table table): Create a CsvWriterOptionsBuilder object
- static builder(SinkInfo sink, Table table)
Create a CsvWriterOptionsBuilder object.
For details, see cudf::io::csv_writer_options::builder().
- Parameters:
- sink: SinkInfo
The sink used for writer output
- table: Table
Table to be written to output
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- class pylibcudf.io.csv.CsvWriterOptionsBuilder
Builder to build options for write_csv.
For details, see cudf::io::csv_writer_options_builder.
Methods
build(self): Create a CsvWriterOptions object
false_value(self, unicode val): Sets string used for values == 0
include_header(self, bool val): Enables/Disables headers being written to csv.
inter_column_delimiter(self, unicode delim): Sets character used for separating column values.
line_terminator(self, unicode term): Sets character used for separating lines.
na_rep(self, unicode val): Sets string to use for null entries.
names(self, list names): Sets optional column names.
rows_per_chunk(self, int val): Sets maximum number of rows to process for each file write.
true_value(self, unicode val): Sets string used for values != 0
- build(self) → CsvWriterOptions
Create a CsvWriterOptions object
- false_value(self, unicode val) → CsvWriterOptionsBuilder
Sets string used for values == 0
- Parameters:
- val: str
String to represent values == 0
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- include_header(self, bool val) → CsvWriterOptionsBuilder
Enables/Disables headers being written to csv.
- Parameters:
- val: bool
Boolean value to enable/disable
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- inter_column_delimiter(self, unicode delim) → CsvWriterOptionsBuilder
Sets character used for separating column values.
- Parameters:
- delim: str
Character to delimit column values
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- line_terminator(self, unicode term) → CsvWriterOptionsBuilder
Sets character used for separating lines.
- Parameters:
- term: str
Character to represent line termination
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- na_rep(self, unicode val) → CsvWriterOptionsBuilder
Sets string to use for null entries.
- Parameters:
- val: str
String to represent null value
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- names(self, list names) → CsvWriterOptionsBuilder
Sets optional column names.
- Parameters:
- names: list[str]
Column names
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- rows_per_chunk(self, int val) → CsvWriterOptionsBuilder
Sets maximum number of rows to process for each file write.
- Parameters:
- val: int
Number of rows per chunk
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
- true_value(self, unicode val) → CsvWriterOptionsBuilder
Sets string used for values != 0
- Parameters:
- val: str
String to represent values != 0
- Returns:
- CsvWriterOptionsBuilder
Builder to build CsvWriterOptions
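Because every setter above returns a CsvWriterOptionsBuilder, settings can be applied one after another and finished with build(). The following is a minimal sketch of that pattern, not part of pylibcudf: `apply_writer_settings` is a hypothetical helper that takes an already created builder and a dict mapping setter names to values.

```python
def apply_writer_settings(builder, settings):
    """Hypothetical helper: call each named setter on the builder, then build().

    `builder` is assumed to behave like CsvWriterOptionsBuilder: every setter
    takes one value and returns a builder.
    """
    for method_name, value in settings.items():
        # Rebind to the returned builder so the calls form a chain.
        builder = getattr(builder, method_name)(value)
    return builder.build()
```

With a real builder this might be called as `apply_writer_settings(CsvWriterOptions.builder(sink, table), {"na_rep": "NA", "include_header": True})`, which should be equivalent to chaining `.na_rep("NA").include_header(True).build()` by hand.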
- pylibcudf.io.csv.read_csv(SourceInfo source_info, *, compression_type compression=compression_type.AUTO, size_t byte_range_offset=0, size_t byte_range_size=0, list col_names=None, unicode prefix=u'', bool mangle_dupe_cols=True, list usecols=None, size_type nrows=-1, size_type skiprows=0, size_type skipfooter=0, size_type header=0, unicode lineterminator=u'\n', unicode delimiter=None, unicode thousands=None, unicode decimal=u'.', unicode comment=None, bool delim_whitespace=False, bool skipinitialspace=False, bool skip_blank_lines=True, quote_style quoting=quote_style.MINIMAL, unicode quotechar=u'"', bool doublequote=True, list parse_dates=None, list parse_hex=None, dtypes=None, list true_values=None, list false_values=None, list na_values=None, bool keep_default_na=True, bool na_filter=True, bool dayfirst=False)
Reads a CSV file into a TableWithMetadata.
For details, see read_csv().
- Parameters:
- source_info: SourceInfo
The SourceInfo to read the CSV file from.
- compression: compression_type, default CompressionType.AUTO
The compression format of the CSV source.
- byte_range_offset: size_type, default 0
Number of bytes to skip from source start.
- byte_range_size: size_type, default 0
Number of bytes to read. By default, will read all bytes.
- col_names: list, default None
The column names to use.
- prefix: string, default ‘’
The prefix to apply to the column names.
- mangle_dupe_cols: bool, default True
If True, rename duplicate column names.
- usecols: list, default None
Specify the string column names/integer column indices of columns to be read.
- nrows: size_type, default -1
The number of rows to read.
- skiprows: size_type, default 0
The number of rows to skip from the start before reading.
- skipfooter: size_type, default 0
The number of rows to skip from the end.
- header: size_type, default 0
The index of the row that will be used for header names. Pass -1 to use default column names.
- lineterminator: str, default ‘\n’
The character used to determine the end of a line.
- delimiter: str, default “,”
The character used to separate fields in a row.
- thousands: str, default None
The character used as the thousands separator. Cannot match delimiter.
- decimal: str, default ‘.’
The character used as the decimal separator. Cannot match delimiter.
- comment: str, default None
The character that marks the start of a comment line. Comment lines are skipped by the reader.
- delim_whitespace: bool, default False
If True, treat whitespace as the field delimiter.
- skipinitialspace: bool, default False
If True, skip whitespace after the delimiter.
- skip_blank_lines: bool, default True
If True, ignore empty lines (otherwise line values are parsed as null).
- quoting: QuoteStyle, default QuoteStyle.MINIMAL
The quoting style used in the input CSV data. One of { QuoteStyle.MINIMAL, QuoteStyle.ALL, QuoteStyle.NONNUMERIC, QuoteStyle.NONE }
- quotechar: str, default ‘"’
The character used to indicate quoting.
- doublequote: bool, default True
If True, a quote inside a value is double-quoted.
- parse_dates: list, default None
A list of integer column indices/string column names of columns to read as datetime.
- parse_hex: list, default None
A list of integer column indices/string column names of columns to read as hexadecimal.
- dtypes: Union[Dict[str, DataType], List[DataType]], default None
A list of data types or a dictionary mapping column names to a DataType.
- true_values: List[str], default None
A list of additional values to recognize as True.
- false_values: List[str], default None
A list of additional values to recognize as False.
- na_values: List[str], default None
A list of additional values to recognize as null.
- keep_default_na: bool, default True
Whether to keep the built-in default N/A values.
- na_filter: bool, default True
Whether to detect missing values. If False, can improve performance.
- dayfirst: bool, default False
If True, interpret dates as being in the DD/MM format.
- Returns:
- TableWithMetadata
The Table and its corresponding metadata (column names) that were read in.
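A minimal sketch of a call that uses a few of the keyword arguments above. It assumes pylibcudf is installed on a machine with a supported GPU, that SourceInfo is reachable as `pylibcudf.io.SourceInfo` and accepts a list of file paths, and that the returned TableWithMetadata exposes `tbl` and `column_names()`; none of these is stated on this page. The function name, the column names, and the "missing" marker are hypothetical.

```python
def read_scores(path):
    """Read two columns from a semicolon-delimited CSV file (illustrative)."""
    # Imported lazily so this module can be loaded without a GPU stack.
    import pylibcudf as plc

    # Assumption: SourceInfo takes a list of sources such as file paths.
    source = plc.io.SourceInfo([path])
    result = plc.io.csv.read_csv(
        source,
        delimiter=";",            # fields are separated by semicolons
        header=0,                 # the first row holds the column names
        usecols=["id", "score"],  # hypothetical column names
        na_values=["missing"],    # extra marker to treat as null
    )
    # Assumption: TableWithMetadata exposes `tbl` and `column_names()`.
    return result.tbl, result.column_names()
```

All options after `source_info` are keyword-only, so they have to be passed by name as shown.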
- pylibcudf.io.csv.write_csv(CsvWriterOptions options) → void
Write to CSV format.
The table to write, output paths, and options are encapsulated by the options object.
For details, see write_csv().
- Parameters:
- options: CsvWriterOptions
Settings for controlling writing behavior
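Putting the pieces together, a write might look like the sketch below. It assumes pylibcudf is installed on a machine with a supported GPU, that SinkInfo is reachable as `pylibcudf.io.SinkInfo` and accepts a list of output paths, and that the caller already holds a pylibcudf Table; these are assumptions, not statements from this page. The function name and the "NA" null marker are hypothetical.

```python
def write_table(table, column_names, path):
    """Write `table` to a CSV file at `path` (illustrative)."""
    # Imported lazily so this module can be loaded without a GPU stack.
    import pylibcudf as plc

    # Assumption: SinkInfo takes a list of sinks such as file paths.
    sink = plc.io.SinkInfo([path])
    options = (
        plc.io.csv.CsvWriterOptions.builder(sink, table)
        .names(column_names)          # header labels, one per column
        .include_header(True)
        .inter_column_delimiter(",")
        .na_rep("NA")                 # hypothetical marker for null entries
        .build()
    )
    # The sink, the table, and every setting travel inside `options`.
    plc.io.csv.write_csv(options)
```

Passing the column names explicitly is a deliberate choice here: the Table argument carries only column data, so the header text is supplied through names().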