Azure Virtual Machine

Create Virtual Machine

Create a new Azure Virtual Machine with GPUs, the NVIDIA Driver and the NVIDIA Container Runtime.

NVIDIA maintains a Virtual Machine Image (VMI) that pre-installs NVIDIA drivers and container runtimes. We recommend using this image as the starting point.

via Azure Portal

Select a resource group or create one if needed.
Select the latest NVIDIA GPU-Optimized VMI version from the drop-down list, then select Get It Now (if there are multiple Gen versions, select the latest).
If you are already logged in to Azure, continue by selecting Create.
In the Create a virtual machine interface, fill in the required information for the VM.
- Select a GPU-enabled VM size (see recommended VM types).
- In “Configure security features”, select Standard.
- Make sure you create SSH keys and download them.

Select Review + create to create and start the virtual machine.

via Azure CLI

Prepare the following environment variables.

Name	Description	Example
`AZ_VMNAME`	Name for VM	`RapidsAI-V100`
`AZ_RESOURCEGROUP`	Resource group of VM	`rapidsai-deployment`
`AZ_LOCATION`	Region of VM	`westus2`
`AZ_IMAGE`	URN of image	`nvidia:ngc_azure_17_11:ngc-base-version-22_06_0-gen2:22.06.0`
`AZ_SIZE`	VM Size	`Standard_NC6s_v3`
`AZ_USERNAME`	User name of VM	`rapidsai`
`AZ_SSH_KEY`	Path to public SSH key	`~/.ssh/id_rsa.pub`
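One way to prepare these variables is to export them in the shell session that will run the `az vm create` command. The sketch below uses the example values from the table, so replace them with your own. Writing the key path with `$HOME` rather than `~` is a choice made here so that the path still expands inside double quotes.

```shell
# Example values copied from the table above; replace them with your own.
export AZ_VMNAME="RapidsAI-V100"
export AZ_RESOURCEGROUP="rapidsai-deployment"
export AZ_LOCATION="westus2"
export AZ_IMAGE="nvidia:ngc_azure_17_11:ngc-base-version-22_06_0-gen2:22.06.0"
export AZ_SIZE="Standard_NC6s_v3"
export AZ_USERNAME="rapidsai"
# $HOME is used instead of ~ because ~ is not expanded inside double quotes.
export AZ_SSH_KEY="$HOME/.ssh/id_rsa.pub"
```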

az vm create \
        --name ${AZ_VMNAME} \
        --resource-group ${AZ_RESOURCEGROUP} \
        --image ${AZ_IMAGE} \
        --location ${AZ_LOCATION} \
        --size ${AZ_SIZE} \
        --admin-username ${AZ_USERNAME} \
        --ssh-key-value ${AZ_SSH_KEY}

Note

Use `az vm image list --publisher Nvidia --all --output table` to inspect URNs of official NVIDIA images on Azure.

Note

See the Azure documentation for the SSH key formats supported on Azure.

Create Network Security Group

Next we need to allow network traffic to the VM so we can access Jupyter and Dask.

via Azure Portal

After creating the VM, select Go to resource to open it.
Select Networking -> Networking Settings in the left panel.
Select +Create port rule -> Add inbound port rule.
Set Destination port ranges to 8888,8787.
Modify the “Name” so that it does not contain a comma or any other symbols.

Keep the rest unchanged and select Add.

via Azure CLI

Name	Description	Example
`AZ_NSGNAME`	NSG name for the VM	`${AZ_VMNAME}NSG`
`AZ_NSGRULENAME`	Name for NSG rule	`Allow-Dask-Jupyter-ports`
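These two variables can be exported in the same way as the VM variables. The sketch below assumes `AZ_VMNAME` is already set and otherwise falls back to the example value from the earlier table. The `${AZ_VMNAME}NSG` pattern is taken from the example column; if your NSG has a different name, use that name instead.

```shell
# Fall back to the example VM name if AZ_VMNAME is not already set.
export AZ_VMNAME="${AZ_VMNAME:-RapidsAI-V100}"
# Example values from the table above.
export AZ_NSGNAME="${AZ_VMNAME}NSG"
export AZ_NSGRULENAME="Allow-Dask-Jupyter-ports"
echo "NSG: ${AZ_NSGNAME}, rule: ${AZ_NSGRULENAME}"
```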

az network nsg rule create \
    -g ${AZ_RESOURCEGROUP} \
    --nsg-name ${AZ_NSGNAME} \
    -n ${AZ_NSGRULENAME} \
    --priority 1050 \
    --destination-port-ranges 8888 8787

Install RAPIDS

Next, we can SSH into our VM to install RAPIDS. SSH instructions can be found by selecting Connect in the left panel.
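If you created the VM with the Azure CLI, the public IP address can also be looked up from the command line. The sketch below wraps `az vm show` in a helper named `vm_public_ip`, a hypothetical name chosen for this example. It assumes the Azure CLI is installed and logged in, and that `AZ_RESOURCEGROUP` and `AZ_VMNAME` are set as in the CLI instructions above.

```shell
# Hypothetical helper: print the public IP address of the VM.
# Assumes AZ_RESOURCEGROUP and AZ_VMNAME are already exported.
vm_public_ip() {
    az vm show \
        --show-details \
        --resource-group "${AZ_RESOURCEGROUP}" \
        --name "${AZ_VMNAME}" \
        --query publicIps \
        --output tsv
}
```

Calling `vm_public_ip` prints the address, which can then be used in the `ssh` command in place of the VM IP placeholder.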

Tip

When connecting via SSH with a command such as

ssh -i <path-to-your-ssh-key-dir>/your-key-file.pem azureuser@<vm-ip-address>

(replace azureuser with the admin username you chose), you might see a WARNING: UNPROTECTED PRIVATE KEY FILE! message followed by a “Permission denied” error.

Restrict the permissions of your key file so that only you can read it by running chmod 600 your-key-file.pem, then connect again.

There are several ways to install RAPIDS, which you can browse in the RAPIDS release selector.

For this example, we are going to run the RAPIDS Docker container, so we need the name of the most recent container image. On the release selector, choose Docker in the Method column.

Then copy the commands shown:

docker pull nvcr.io/nvidia/rapidsai/notebooks:25.08-cuda12.8-py3.12
docker run --gpus all --rm -it \
    --shm-size=1g --ulimit memlock=-1 \
    -p 8888:8888 -p 8787:8787 -p 8786:8786 \
    nvcr.io/nvidia/rapidsai/notebooks:25.08-cuda12.8-py3.12

Note

If you see a “docker socket permission denied” error while running these commands, try closing your SSH session and reconnecting. This happens because your user was added to the docker group only after you signed in, and group changes take effect at the next login.
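To tell whether this is the cause, you can compare the groups active in your current session with the groups recorded for your user. This is a minimal sketch using the standard `id` command; the messages it prints are illustrative only.

```shell
# Groups active in the current login session.
session_groups="$(id -nG)"
# Groups recorded for the same user in the system database.
user_groups="$(id -nG "$(id -un)")"

case " ${session_groups} " in
    *" docker "*)
        echo "docker group is active in this session" ;;
    *)
        case " ${user_groups} " in
            *" docker "*)
                echo "docker group is assigned but not active: reconnect your SSH session" ;;
            *)
                echo "user is not in the docker group" ;;
        esac
        ;;
esac
```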

Test RAPIDS

To access Jupyter, navigate to <VM ip>:8888 in the browser.

In a Python notebook, check that you can import and use RAPIDS libraries like cudf.

In [1]: import cudf
In [2]: df = cudf.datasets.timeseries()
In [3]: df.head()
Out[3]:
                       id     name         x         y
timestamp
2000-01-01 00:00:00  1020    Kevin  0.091536  0.664482
2000-01-01 00:00:01   974    Frank  0.683788 -0.467281
2000-01-01 00:00:02  1000  Charlie  0.419740 -0.796866
2000-01-01 00:00:03  1019    Edith  0.488411  0.731661
2000-01-01 00:00:04   998    Quinn  0.651381 -0.525398

Open cudf/10min.ipynb and execute the cells to explore more of how cudf works.

When running a Dask cluster you can also visit <VM ip>:8787 to monitor the Dask cluster status.
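If either page fails to load, a quick reachability check from your local machine can help separate a network rule problem from a container problem. The sketch below defines a hypothetical helper, `port_open`, that relies on bash's `/dev/tcp` feature and the coreutils `timeout` command. `VM_IP` is a placeholder that defaults to the local machine and should be replaced with the public IP address of your VM.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port opens within 3 seconds.
# Relies on bash's /dev/tcp feature and the coreutils `timeout` command.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Placeholder address; replace with the public IP address of your VM.
VM_IP="${VM_IP:-127.0.0.1}"

for port in 8888 8787; do
    if port_open "${VM_IP}" "${port}"; then
        echo "port ${port} on ${VM_IP} is reachable"
    else
        echo "port ${port} on ${VM_IP} is not reachable"
    fi
done
```

Port 8787 only answers while a Dask cluster is running, so an unreachable result there does not by itself indicate a problem with the network security group rule.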

Useful Links

Using NGC with Azure

Related Examples

HPO with dask-ml and cuml


Measuring Performance with the One Billion Row Challenge
