Communication and scheduling overhead can be a major bottleneck in Dask/Distributed. Dask-CUDA addresses this by introducing an API for explicit communication in Dask tasks. The idea is that Dask/Distributed spawns workers and distribute data as usually while the user can submit tasks on the workers that communicate explicitly.

This makes it possible to bypass Distributed’s scheduler and write hand-tuned computation and communication patterns. Currently, Dask-CUDA includes an explicit-comms implementation of the Dataframe shuffle operation used for merging and sorting.


In order to use explicit-comms in Dask/Distributed automatically, simply define the environment variable DASK_EXPLICIT_COMMS=True or setting the "explicit-comms" key in the Dask configuration.

It is also possible to use explicit-comms in tasks manually, see the API and our implementation of shuffle for guidance.