Python
RapidsMPF provides a Python API for building high-performance multi-GPU data pipelines. The Python layer wraps the C++ core and integrates with popular distributed computing frameworks.
Quickstart
Quickstart: Dask-cuDF shuffle example and Streaming Engine example
API Reference
Python API Reference: full Python API reference (integrations, shuffler, communicator, memory, config)
Integrations
The Python API includes ready-to-use integrations with:
- Dask (rapidsmpf.integrations.dask): shuffle Dask DataFrames across a LocalCUDACluster or multi-node Dask deployment.
- Ray (rapidsmpf.integrations.ray): use RapidsMPF within Ray tasks and actors.
- Single-process (rapidsmpf.integrations.single): run multi-GPU workloads in a single Python process without a cluster manager.
- cuDF (rapidsmpf.integrations.cudf): partition and pack/unpack cuDF tables for use with the Shuffler.
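Which of these integrations can be used depends on what is installed alongside RapidsMPF (the Ray integration, for instance, presumably needs Ray itself). The following is a minimal sketch for checking which of the module paths listed above can be located in the current environment. It relies only on the standard library; the helper names `is_importable` and `available_integrations` are hypothetical and not part of RapidsMPF.

```python
import importlib.util

# Module paths taken from the integrations list above. Whether each one
# is present depends on the local installation.
INTEGRATIONS = {
    "Dask": "rapidsmpf.integrations.dask",
    "Ray": "rapidsmpf.integrations.ray",
    "Single-process": "rapidsmpf.integrations.single",
    "cuDF": "rapidsmpf.integrations.cudf",
}


def is_importable(module_name):
    """Return True if a module spec can be found for module_name.

    Hypothetical helper, not a RapidsMPF function. Looking up a dotted
    name imports its parent packages but not the submodule itself, so a
    missing or broken parent package is reported as "not importable".
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError covers ModuleNotFoundError for a missing parent;
        # ValueError is raised when a module has no usable __spec__.
        return False


def available_integrations():
    """Map each integration label to whether its module was found."""
    return {label: is_importable(path) for label, path in INTEGRATIONS.items()}


if __name__ == "__main__":
    for label, found in available_integrations().items():
        status = "available" if found else "not found"
        print(f"{label:15s} {status}")
```

A spec being found only means the module exists on the import path. It does not guarantee that the module's own dependencies (Dask, Ray, cuDF, a CUDA-capable GPU) are usable at runtime.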