Default `engine="gpu"`

.collect(engine="gpu") (and engine=pl.GPUEngine()) is what you call when you don’t construct a streaming engine explicitly. It runs the same streaming executor as the explicit engines (Ray, Dask, SPMD), which is conceptually similar to Polars’ own streaming engine but runs on the GPU. Under the hood it is backed by DefaultSingletonEngine, a process-wide singleton specialization of SPMDEngine. At most one live instance exists per process; it is created lazily on first use and torn down at interpreter exit. Ray is the showcased explicit engine (see Usage); this page documents what engine="gpu" does when you construct nothing yourself.

Important

engine="gpu" is meant for trivial setup: single-GPU execution with no configuration and no engine object to manage. For any non-trivial workflow, construct an engine explicitly. Because engine="gpu" accepts no options, settings such as spill_to_pinned_memory=True for spill-heavy workloads require an explicit engine, for example one built with RayEngine.from_options(...). See Usage and Configuration Options.

What you get without an explicit engine

When you just write:

import polars as pl

result = (
    pl.scan_parquet("/data/*.parquet")
      .group_by("customer_id")
      .agg(pl.col("amount").sum())
      .collect(engine="gpu")
)

cudf-polars uses DefaultSingletonEngine under the hood. No cluster is set up, the rapidsmpf Context is bootstrapped on first use, and subsequent .collect() calls in the same process reuse it.

Explicit handle

If you genuinely want a handle to the singleton (for example, in tests or scripts that need to call .shutdown() deterministically), you can obtain it via the get_or_create() factory:

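import polars as pl
from cudf_polars.engine.default_singleton_engine import (
    DefaultSingletonEngine,
)

# Any LazyFrame works here; this one reuses the scan from the example above.
query = pl.scan_parquet("/data/*.parquet")

engine = DefaultSingletonEngine.get_or_create()
result = query.collect(engine=engine)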

get_or_create() is idempotent: calling it again returns the same instance.

For anything beyond defaults, prefer an explicit engine. See Usage.
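For tests and scripts that need deterministic teardown, one option is to wrap the get/shutdown pair in a context manager. The following is a minimal sketch, not part of cudf-polars: `engine_scope`, `fake_get`, and `fake_shutdown` are hypothetical names, and stub callables are used so the example runs without a GPU.

```python
from contextlib import contextmanager


@contextmanager
def engine_scope(get_engine, shutdown_engine):
    """Yield an engine and always shut it down afterwards.

    Both arguments are zero-argument callables supplied by the caller.
    """
    engine = get_engine()
    try:
        yield engine
    finally:
        # Runs even if the body of the `with` block raises.
        shutdown_engine()


# Stub callables (hypothetical) standing in for a real engine.
events = []


def fake_get():
    events.append("created")
    return "engine"


def fake_shutdown():
    events.append("shut down")


with engine_scope(fake_get, fake_shutdown) as engine:
    events.append(f"using {engine}")
```

With cudf-polars, the two callables would presumably be DefaultSingletonEngine.get_or_create and DefaultSingletonEngine.shutdown, both described on this page.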

Lifecycle

The singleton is bootstrapped once per process. The rapidsmpf Context, RMM adaptor, and Python thread-pool executor are reused across every .collect() call.

Shutdown is automatic: the engine registers an atexit hook that tears it down at interpreter exit. To shut it down explicitly (for example to release resources before constructing a multi-GPU engine), call the static method:

from cudf_polars.engine.default_singleton_engine import (
    DefaultSingletonEngine,
)

DefaultSingletonEngine.shutdown()

shutdown() is idempotent (calling it twice is safe) and a no-op if no live engine exists.
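The lifecycle described above can be modeled in plain Python. This is a toy sketch and not the cudf-polars implementation: `ToySingletonEngine`, its `_instance` attribute, and its `alive` flag are hypothetical, and no GPU resources are involved. It only illustrates lazy creation, reuse across calls, the atexit hook, and idempotent shutdown.

```python
import atexit
import threading


class ToySingletonEngine:
    """Hypothetical stand-in that mimics the lifecycle described above."""

    _instance = None          # the single live instance, if any
    _lock = threading.Lock()  # guards creation and teardown

    def __init__(self):
        self.alive = True

    @classmethod
    def get_or_create(cls):
        # Create the instance lazily on first use; later calls reuse it.
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls()
                # Tear down automatically at interpreter exit.
                atexit.register(cls.shutdown)
            return cls._instance

    @classmethod
    def shutdown(cls):
        # Idempotent: does nothing when no live instance exists.
        with cls._lock:
            if cls._instance is not None:
                cls._instance.alive = False
                cls._instance = None


first = ToySingletonEngine.get_or_create()
second = ToySingletonEngine.get_or_create()
same_instance = first is second  # both names refer to one object

ToySingletonEngine.shutdown()
ToySingletonEngine.shutdown()  # second call is a harmless no-op
```

Because the atexit callback is the same idempotent `shutdown`, calling it explicitly beforehand does not cause a double teardown at interpreter exit in this toy model.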

Mutual exclusion with explicit engines

DefaultSingletonEngine, RayEngine, DaskEngine, and SPMDEngine cannot coexist in the same process. Concretely:

Constructing RayEngine / DaskEngine / SPMDEngine while the singleton is alive raises RuntimeError.
DefaultSingletonEngine.get_or_create() raises RuntimeError if any explicit streaming engine is alive.

Recommended pattern: pick one engine for the lifetime of the program. If you need to switch, shut down the active engine first:

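# Assumes DefaultSingletonEngine, SPMDEngine, and opts are already in scope.
DefaultSingletonEngine.shutdown()  # release the singleton first
explicit_engine = SPMDEngine.from_options(opts)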
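The mutual-exclusion rule and the switching pattern can be illustrated with a toy registry in plain Python. This is a simplified sketch under stated assumptions, not how cudf-polars tracks engines: `ToyRegistry` and the `"singleton"` / `"explicit"` labels are hypothetical, and the model only distinguishes the singleton from explicit engines as a group.

```python
class ToyRegistry:
    """Hypothetical process-wide record of which kind of engine is live."""

    def __init__(self):
        self.live = None  # None, "singleton", or "explicit"

    def start(self, kind):
        # Refuse to start one kind while the other kind is still alive.
        if self.live is not None and self.live != kind:
            raise RuntimeError(
                f"cannot start {kind} engine while {self.live} engine is alive"
            )
        self.live = kind

    def shutdown(self, kind):
        # No-op unless that kind is the one currently live.
        if self.live == kind:
            self.live = None


registry = ToyRegistry()
registry.start("singleton")

try:
    registry.start("explicit")  # conflicts with the live singleton
    conflict_raised = False
except RuntimeError:
    conflict_raised = True

# Switching works once the active engine is shut down first.
registry.shutdown("singleton")
registry.start("explicit")
```

The point of the sketch is the ordering: the conflicting start fails while the first engine is alive, and the same start succeeds after an explicit shutdown.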

No options

DefaultSingletonEngine.get_or_create() takes no arguments. To tune StreamingOptions such as spill_to_pinned_memory, fallback_mode, max_rows_per_partition, or any rapidsmpf runtime knob, construct an explicit RayEngine via RayEngine.from_options(...). See Configuration Options for the available fields.