cudf.DataFrame.to_orc
- DataFrame.to_orc(fname, compression='snappy', statistics='ROWGROUP', stripe_size_bytes=None, stripe_size_rows=None, row_index_stride=None, cols_as_map_type=None, storage_options=None, index=None)
Write a DataFrame to the ORC format.
- Parameters:
- fname: str
File path or object where the ORC dataset will be stored.
- compression: {‘snappy’, ‘ZSTD’, ‘ZLIB’, ‘LZ4’, None}, default ‘snappy’
Name of the compression to use; case insensitive. Use None for no compression.
- statistics: str {“ROWGROUP”, “STRIPE”, None}, default “ROWGROUP”
The granularity with which column statistics must be written to the file.
- stripe_size_bytes: integer or None, default None
Maximum size, in bytes, of each stripe of the output. If None, 67108864 (64 MiB) will be used.
- stripe_size_rows: integer or None, default None
Maximum number of rows of each stripe of the output. If None, 1000000 will be used.
- row_index_stride: integer or None, default None
Row index stride (maximum number of rows in each row group). If None, 10000 will be used.
- cols_as_map_type: list of column names or None, default None
A list of column names which should be written as map type in the ORC file. Note that this option only affects columns of ListDtype. Names of other column types will be ignored.
- storage_options: dict, optional, default None
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with “s3://” and “gcs://”) the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details.
- index: bool, default None
If True, include the dataframe’s index(es) in the file output. If False, they will not be written to the file. If None, similar to True, the dataframe’s index(es) will be saved; however, instead of being saved as values, any RangeIndex will be stored as a range in the metadata, so it doesn’t require much space and is faster. Other indexes will be included as columns in the file output.
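The parameters above can be combined as in the following minimal sketch. The helper names (write_orc, min_stripes, min_row_groups, demo), the output file name, and the column names are hypothetical and chosen only for illustration; the option values are arbitrary picks from the documented choices. The two arithmetic helpers derive lower bounds purely from the documented "maximum rows" limits. They do not model the byte limit or the writer's actual stripe layout, so a real file may contain more stripes or row groups.

```python
# Documented defaults, used when the corresponding argument is None.
DEFAULT_STRIPE_SIZE_BYTES = 64 * 1024 * 1024  # 67108864
DEFAULT_STRIPE_SIZE_ROWS = 1_000_000
DEFAULT_ROW_INDEX_STRIDE = 10_000


def min_stripes(num_rows, stripe_size_rows=DEFAULT_STRIPE_SIZE_ROWS):
    """Lower bound on the stripe count implied by the row limit alone."""
    # Ceiling division: each stripe holds at most stripe_size_rows rows.
    return (num_rows + stripe_size_rows - 1) // stripe_size_rows


def min_row_groups(num_rows, row_index_stride=DEFAULT_ROW_INDEX_STRIDE):
    """Lower bound on the row-group count implied by the stride alone."""
    return (num_rows + row_index_stride - 1) // row_index_stride


def write_orc(df, path):
    """Write df to path, overriding several of the defaults."""
    df.to_orc(
        path,
        compression="ZSTD",        # case insensitive; None disables compression
        statistics="STRIPE",       # coarser statistics than the default
        stripe_size_rows=500_000,  # at most this many rows per stripe
        row_index_stride=5_000,    # at most this many rows per row group
        index=False,               # do not write the index to the file
    )


def demo(path="example.orc"):
    """Round trip through ORC; requires cudf and a CUDA-capable GPU."""
    import cudf  # imported lazily so the helpers above work without a GPU

    df = cudf.DataFrame({"id": [1, 2, 3], "score": [0.5, 0.25, 0.75]})
    write_orc(df, path)
    return cudf.read_orc(path)
```

Calling demo() on a machine with cuDF installed writes the file and reads it back. With the default limits, a 2,500,000-row DataFrame needs at least min_stripes(2_500_000) stripes and min_row_groups(2_500_000) row groups.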
See also