Dask-CUDA is a library extending Dask.distributed’s single-machine LocalCluster and Worker for use in distributed GPU workloads.
You can use
LocalCUDACluster to create a cluster of one or more GPUs on your local machine. You can launch a Dask scheduler on LocalCUDACluster to parallelize and distribute your RAPIDS workflows across multiple GPUs on a single node.
In addition to enabling multi-GPU computation,
LocalCUDACluster also provides a simple interface for managing the cluster, such as starting and stopping the cluster, querying the status of the nodes, and monitoring the workload distribution.
Instantiate a LocalCUDACluster object#
LocalCUDACluster class autodetects the GPUs in your system, so if you create it on a machine with two GPUs it will create a cluster with two workers, each of which is responsible for executing tasks on a separate GPU.
cluster = LocalCUDACluster()
You can also restrict your cluster to use specific GPUs by setting the
CUDA_VISIBLE_DEVICES environment variable, or as a keyword argument.
cluster = LocalCUDACluster(CUDA_VISIBLE_DEVICES="0,1") # Creates one worker for GPUs 0 and 1
Connecting a Dask client#
Dask scheduler coordinates the execution of tasks, whereas Dask client is the user-facing interface that submits tasks to the scheduler and monitors their progress.
client = Client(cluster)
To test RAPIDS, create a distributed client for the cluster and query for the GPU model.
from dask_cuda import LocalCUDACluster from dask.distributed import Client cluster = LocalCUDACluster() client = Client(cluster) def get_gpu_model(): import pynvml pynvml.nvmlInit() return pynvml.nvmlDeviceGetName(pynvml.nvmlDeviceGetHandleByIndex(0)) client.submit(get_gpu_model).result()