# GPUEngine Configuration Options

The `polars.GPUEngine` object may be configured in several different ways.

## Executor

`cudf-polars` includes multiple *executors*, backends that take a Polars query and execute it to produce the result (either an in-memory `polars.DataFrame` from `.collect()` or one or more files with `.sink_`). The executor is selected with the `executor` option when you create the `GPUEngine`.

```python
import polars as pl

engine = pl.GPUEngine(executor="streaming")

query = ...

result = query.collect(engine=engine)
```

The `streaming` executor is the default executor as of RAPIDS 25.08, and is equivalent to passing `engine="gpu"` or `engine=pl.GPUEngine()` to `collect`. At a high level, the `streaming` executor works by breaking inputs (in-memory DataFrames or parquet files) into multiple pieces and streaming those pieces through the series of operations needed to produce the final result.

We also provide an `in-memory` executor. This executor is often faster when the underlying data fits comfortably in device memory, because splitting inputs and executing them in batches adds overhead that brings little benefit at that scale. However, this executor must rely on Unified Virtual Memory (UVM) if the input and intermediate data do not fit in device memory.

The `in-memory` executor can be used with

```python
engine = pl.GPUEngine(executor="in-memory")
```

In general, we recommend starting with the default `streaming` executor, because it scales significantly better than `in-memory`.

The `streaming` executor includes several configuration options, which can be provided with the `executor_options` key when constructing the `GPUEngine`:

```python
engine = pl.GPUEngine(
    executor="streaming",  # the default
    executor_options={
        "max_rows_per_partition": 500_000,
    }
)
```

You can configure the default value for configuration options through environment variables with the prefix `CUDF_POLARS__EXECUTOR__{option_name}`.
For example, the environment variable `CUDF_POLARS__EXECUTOR__MAX_ROWS_PER_PARTITION` will set the default `max_rows_per_partition` to use if it isn't overridden through `executor_options`. For boolean options, like `rapidsmpf_spill`, the values `{"1", "true", "yes", "y"}` are considered `True` and `{"0", "false", "no", "n"}` are considered `False`.

See [Configuration Reference](#cudf-polars-api) for a full list of options, and [Streaming Execution](#cudf-polars-streaming) for more on the streaming executor, including multi-GPU execution.

## Parquet Reader Options

Reading large parquet files can use a large amount of memory, especially when the files are compressed. This may lead to out-of-memory errors for some workflows. To mitigate this, the "chunked" parquet reader may be selected. When enabled, parquet files are read in chunks, limiting the peak memory usage at the cost of a small drop in performance.

To configure the parquet reader, we provide a dictionary of options to the `parquet_options` keyword of the `GPUEngine` object. Valid keys and values are:

- `chunked` indicates that chunked parquet reading is to be used. By default, chunked reading is turned on.
- [`chunk_read_limit`](https://docs.rapids.ai/api/libcudf/legacy/classcudf_1_1io_1_1chunked__parquet__reader#aad118178b7536b7966e3325ae1143a1a) controls the maximum size per chunk. By default, the maximum chunk size is unlimited.
- [`pass_read_limit`](https://docs.rapids.ai/api/libcudf/legacy/classcudf_1_1io_1_1chunked__parquet__reader#aad118178b7536b7966e3325ae1143a1a) controls the maximum memory used for decompression. The default pass read limit is 16 GiB.


For example, to select the chunked reader with custom values for `pass_read_limit` and `chunk_read_limit`:

```python
import polars as pl

engine = pl.GPUEngine(
    parquet_options={
        'chunked': True,
        'chunk_read_limit': int(1e9),  # cap on the size of each chunk
        'pass_read_limit': int(4e9)  # cap on memory used for decompression
    }
)
result = query.collect(engine=engine)
```

Note that passing `chunked: False` disables chunked reading entirely, and thus `chunk_read_limit` and `pass_read_limit` will have no effect.

You can configure the default value for configuration options through environment variables with the prefix `CUDF_POLARS__PARQUET_OPTIONS__{option_name}`. For example, the environment variable `CUDF_POLARS__PARQUET_OPTIONS__CHUNKED=0` will set the default `chunked` to `False`.

## Disabling CUDA Managed Memory

By default, the `in-memory` executor will use [CUDA managed memory](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#unified-memory-introduction) with RMM's pool allocator. On systems that don't support managed memory, a non-managed asynchronous pool allocator is used. Managed memory can be turned off by setting `POLARS_GPU_ENABLE_CUDA_MANAGED_MEMORY` to `0`. System requirements for managed memory are listed in the [CUDA programming guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#system-requirements-for-unified-memory).
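
The environment variables named on this page can also be set from Python rather than from the shell. The following is a minimal sketch using only the standard library. The helpers `env_name` and `parse_bool` are hypothetical and are not part of `cudf-polars`; `parse_bool` only mirrors the documented sets of true and false strings, and its whitespace stripping and case-insensitive matching are assumptions. This page does not state when the variables are read, so the sketch assumes they should be set before `polars` is imported and the engine is constructed.

```python
import os

# String values this page documents for boolean options.
TRUE_VALUES = {"1", "true", "yes", "y"}
FALSE_VALUES = {"0", "false", "no", "n"}


def env_name(section: str, option: str) -> str:
    """Hypothetical helper: build a name such as
    CUDF_POLARS__EXECUTOR__MAX_ROWS_PER_PARTITION."""
    return f"CUDF_POLARS__{section.upper()}__{option.upper()}"


def parse_bool(raw: str) -> bool:
    """Hypothetical helper mirroring the documented boolean convention.

    Stripping whitespace and lowercasing are assumptions of this sketch.
    """
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a recognized boolean value: {raw!r}")


# Set defaults for the current process (assumed: before importing polars).
os.environ[env_name("executor", "max_rows_per_partition")] = "500000"
os.environ[env_name("parquet_options", "chunked")] = "0"
os.environ["POLARS_GPU_ENABLE_CUDA_MANAGED_MEMORY"] = "0"

# Check how the chunked setting would be interpreted under the convention.
chunked_default = parse_bool(os.environ["CUDF_POLARS__PARQUET_OPTIONS__CHUNKED"])
```

Values passed explicitly through `executor_options` or `parquet_options` still take precedence over these defaults, as described above.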