cudf.DataFrame.to_csv
- DataFrame.to_csv(path_or_buf=None, sep=',', na_rep='', columns=None, header=True, index=True, encoding=None, compression=None, lineterminator=None, chunksize=None, storage_options=None)
Write a dataframe to csv file format.
- Parameters:
- path_or_buf : str or file handle, default None
File path or object. If None is provided, the result is returned as a string.
- sep : char, default ‘,’
Delimiter to be used.
- na_rep : str, default ‘’
String to use for null entries.
- columns : list of str, optional
Columns to write.
- header : bool, default True
Write out the column names.
- index : bool, default True
Write out the index as a column.
- encoding : str, default ‘utf-8’
A string representing the encoding to use in the output file. Only ‘utf-8’ is currently supported.
- compression : str, default None
A string representing the compression scheme to use in the output file. Compression while writing csv is not currently supported.
- lineterminator : str, optional
The newline character or character sequence to use in the output file. Defaults to os.linesep.
- chunksize : int or None, default None
Rows to write at a time.
- storage_options : dict, optional, default None
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://” and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details.
- Returns:
- None or str
If path_or_buf is None, returns the resulting csv format as a string. Otherwise returns None.
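The parameters and return value described above can be combined as in the following minimal sketch. The helper name preview_csv and its default arguments are hypothetical, chosen only for illustration. It assumes df is a cudf.DataFrame and relies only on the to_csv keyword arguments listed in the signature above.

```python
def preview_csv(df, columns=None, sep="\t", na_rep="NA"):
    """Return the CSV text for `df` instead of writing it to a file.

    `preview_csv` is a hypothetical helper, not part of cudf. `df` is
    assumed to be a cudf.DataFrame; only its `to_csv` method is used.
    """
    # path_or_buf is left at its default (None), so to_csv returns a str
    # rather than None. index=False drops the index column from the output.
    return df.to_csv(sep=sep, na_rep=na_rep, columns=columns, index=False)
```

With a GPU and cudf available, a call such as preview_csv(cudf.DataFrame({'x': [0, 1], 'z': ['a', 'b']}), columns=['x']) would be expected to return tab-separated text containing only the x column.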
Notes
Follows the standard of Pandas csv.QUOTE_NONNUMERIC for all output.
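To illustrate what the csv.QUOTE_NONNUMERIC convention means, the sketch below uses Python's standard csv module rather than cudf. It shows only the quoting rule (non-numeric fields are quoted, numeric fields are not) and is not a claim about cudf's exact byte-for-byte output. The function name render_nonnumeric is hypothetical.

```python
import csv
import io


def render_nonnumeric(rows):
    """Render rows with the standard library's QUOTE_NONNUMERIC rule."""
    buf = io.StringIO()
    # Strings are wrapped in double quotes; ints and floats are left bare.
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


text = render_nonnumeric([["x", "y", "z"], [0, 1.0, "a"], [1, 3.3, "b"]])
print(text)
# "x","y","z"
# 0,1.0,"a"
# 1,3.3,"b"
```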
The default behaviour is to write all rows of the dataframe at once. This can lead to memory or overflow errors for large tables. If this happens, consider setting the chunksize argument to some reasonable fraction of the total rows in the dataframe.
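One way to pick a chunksize as a fraction of the total rows is sketched below. The helper names rows_per_chunk and write_csv_in_chunks, and the default of ten chunks, are assumptions made for illustration. A suitable value depends on the table and the available GPU memory.

```python
def rows_per_chunk(n_rows, n_chunks=10):
    """Return ceil(n_rows / n_chunks), but never less than 1."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    # -(-a // b) is integer ceiling division, avoiding float rounding.
    return max(1, -(-n_rows // n_chunks))


def write_csv_in_chunks(df, path, n_chunks=10):
    """Write `df` to `path`, passing a chunksize derived from len(df).

    Hypothetical helper: `df` is assumed to be a cudf.DataFrame; only
    len(df) and df.to_csv(path, chunksize=...) are used.
    """
    chunksize = rows_per_chunk(len(df), n_chunks)
    df.to_csv(path, chunksize=chunksize)
    return chunksize
```

For a dataframe of 1,000 rows, write_csv_in_chunks(df, 'foo.csv') would pass chunksize=100 to to_csv.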
Examples
Write a dataframe to csv.
>>> import cudf
>>> filename = 'foo.csv'
>>> df = cudf.DataFrame({'x': [0, 1, 2, 3],
...                      'y': [1.0, 3.3, 2.2, 4.4],
...                      'z': ['a', 'b', 'c', 'd']})
>>> df = df.set_index(cudf.Series([3, 2, 1, 0]))
>>> df.to_csv(filename)