cudf.DataFrame.to_csv

DataFrame.to_csv(path_or_buf=None, sep=',', na_rep='', columns=None, header=True, index=True, encoding=None, compression=None, lineterminator=None, chunksize=None, storage_options=None)

Write a DataFrame to CSV file format.

Parameters:
path_or_buf : str or file handle, default None

File path or object. If None is provided, the result is returned as a string.

sep : char, default ','

Delimiter to be used.

na_rep : str, default ''

String to use for null entries.

columns : list of str, optional

Columns to write.

header : bool, default True

Write out the column names.

index : bool, default True

Write out the index as a column.

encoding : str, default 'utf-8'

A string representing the encoding to use in the output file. Only 'utf-8' is currently supported.

compression : str, default None

A string representing the compression scheme to use in the output file. Compression while writing CSV is not currently supported.

lineterminator : str, optional

The newline character or character sequence to use in the output file. Defaults to os.linesep.

chunksize : int or None, default None

Number of rows to write at a time.

storage_options : dict, optional, default None

Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. those starting with "s3://" or "gcs://") the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details.
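
As a rough illustration of how the formatting parameters above (sep, na_rep, columns, index) combine, the sketch below writes a small frame to a string. The pandas fallback is an assumption added only so the snippet can be tried without a GPU; the column names and values are made up for the example.

```python
try:
    import cudf as xdf
except Exception:  # cudf not installed, or no usable GPU
    import pandas as xdf  # assumption: stand-in accepting the same arguments

df = xdf.DataFrame({
    "x": [0, 1, 2],
    "y": [1.0, None, 2.2],  # the missing value becomes a null entry
    "z": ["a", "b", "c"],
})

# Semicolon-delimited, nulls written as "NA", only x and y, no index column.
text = df.to_csv(sep=";", na_rep="NA", columns=["x", "y"], index=False)
lines = text.splitlines()
print(text)
```

The output should contain one header line followed by three data lines, with the null in column y rendered as NA.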

Returns:
None or str

If path_or_buf is None, returns the CSV output as a string. Otherwise returns None.
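
The two return modes can be sketched as follows. The temporary directory and the file name out.csv are hypothetical, and the pandas fallback is an assumption so the snippet also runs on a machine without a GPU.

```python
import os
import tempfile

try:
    import cudf as xdf
except Exception:  # cudf not installed, or no usable GPU
    import pandas as xdf  # assumption: stand-in with the same return behaviour

df = xdf.DataFrame({"x": [0, 1], "z": ["a", "b"]})

# No path given: the CSV text is returned.
as_text = df.to_csv()

# Path given: the file is written and nothing is returned.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.csv")  # hypothetical file name
    result = df.to_csv(path)
    with open(path) as f:
        on_disk = f.read()

print(type(as_text).__name__, result)
```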

See also

cudf.read_csv

Notes

  • Follows the standard of Pandas csv.QUOTE_NONNUMERIC for all output.

  • The default behaviour is to write all rows of the dataframe at once. This can lead to memory or overflow errors for large tables. If this happens, consider setting the chunksize argument to some reasonable fraction of the total rows in the dataframe.
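
A minimal sketch of the chunksize suggestion above, assuming a quarter of the rows is a "reasonable fraction" (the right value depends on the table and the available memory). The frame contents are made up, and the pandas fallback is an assumption for machines without a GPU.

```python
try:
    import cudf as xdf
except Exception:  # cudf not installed, or no usable GPU
    import pandas as xdf  # assumption: stand-in that also accepts chunksize

n = 1000
df = xdf.DataFrame({
    "x": list(range(n)),
    "y": [i * 0.5 for i in range(n)],
})

# Write everything at once, then again in chunks of a quarter of the rows.
whole = df.to_csv(index=False)
chunked = df.to_csv(index=False, chunksize=len(df) // 4)

print(len(chunked.splitlines()))
```

Chunking is expected to change only how much is written per step, not the resulting text.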

Examples

Write a dataframe to csv.

>>> import cudf
>>> filename = 'foo.csv'
>>> df = cudf.DataFrame({'x': [0, 1, 2, 3],
...                      'y': [1.0, 3.3, 2.2, 4.4],
...                      'z': ['a', 'b', 'c', 'd']})
>>> df = df.set_index(cudf.Series([3, 2, 1, 0]))
>>> df.to_csv(filename)
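
To check what was written, the file can be read back with cudf.read_csv. The sketch below is one way to do that, assuming a temporary directory instead of the working directory and index=False so that the columns read back match the originals; the pandas fallback is an assumption for machines without a GPU.

```python
import os
import tempfile

try:
    import cudf as xdf
except Exception:  # cudf not installed, or no usable GPU
    import pandas as xdf  # assumption: stand-in with matching to_csv/read_csv

df = xdf.DataFrame({
    "x": [0, 1, 2, 3],
    "y": [1.0, 3.3, 2.2, 4.4],
    "z": ["a", "b", "c", "d"],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo.csv")
    df.to_csv(path, index=False)  # skip the index so columns round-trip
    back = xdf.read_csv(path)

columns = list(back.columns)
n_rows = len(back)
print(columns, n_rows)
```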