Elastic Compute Cloud (EC2)#
Create Instance#
Create a new EC2 Instance with GPUs, the NVIDIA Driver and the NVIDIA Container Runtime.
NVIDIA maintains an Amazon Machine Image (AMI) that pre-installs NVIDIA drivers and container runtimes. We recommend using this image as the starting point.
Open the EC2 Dashboard.
Select Launch Instance.
In the AMI selection box search for “nvidia”, then switch to the AWS Marketplace AMIs tab.
Select NVIDIA GPU-Optimized AMI and click “Select”. Then, in the new popup, select Subscribe on Instance Launch.
In Key pair select your SSH keys (create these first if you haven’t already).
Under network settings create a security group (or choose an existing one) with inbound rules that allow SSH access on port 22 and also allow ports 8888, 8786, and 8787 to access Jupyter and Dask. For outbound rules, allow all traffic.
Select Launch.
To create the instance with the AWS CLI instead, set the following environment variables first. Edit any of them to match your preferred region, instance type, or naming convention.
REGION=us-east-1
INSTANCE_TYPE=g5.xlarge
KEY_NAME=rapids-ec2-key
SG_NAME=rapids-ec2-sg
VM_NAME=rapids-ec2
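Every later command depends on these variables and on a working AWS CLI, so it can help to check both before going further. The following is a minimal sketch, not part of the original guide: `preflight` is a hypothetical helper name, and it assumes the AWS CLI is installed and configured with credentials. It uses `aws sts get-caller-identity` only as an inexpensive authenticated call.

```shell
# Hypothetical helper: fail early if a variable is empty or the AWS CLI is unusable.
preflight() {
  for name in REGION INSTANCE_TYPE KEY_NAME SG_NAME VM_NAME; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "preflight: $name is not set" >&2
      return 1
    fi
  done
  if ! command -v aws >/dev/null 2>&1; then
    echo "preflight: aws CLI not found on PATH" >&2
    return 1
  fi
  # Any authenticated call would do; this one only reports who you are.
  if ! aws sts get-caller-identity --region "$REGION" --no-cli-pager >/dev/null; then
    echo "preflight: could not authenticate with AWS" >&2
    return 1
  fi
  echo "preflight: ok"
}

# Usage, after setting the variables above:
#   preflight || exit 1
```

Stopping here on a failure is cheaper than discovering a missing variable halfway through creating resources.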
Accept the NVIDIA Marketplace subscription before using the AMI: open the NVIDIA GPU-Optimized AMI listing, choose Continue to Subscribe, then select Accept Terms. Wait for the status to show as active.
Find the most recent NVIDIA Marketplace AMI ID in us-east-1.
AMI_ID=$(aws ec2 describe-images \
  --region "$REGION" \
  --filters "Name=name,Values=*NVIDIA*VMI*Base*" "Name=state,Values=available" \
  --query 'Images | sort_by(@, &CreationDate)[-1].ImageId' \
  --output text)
echo "$AMI_ID"
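The lookup above prints whatever the query returns. When the filter matches nothing, the AWS CLI's text output is typically the literal string `None` rather than an empty value, and that string would be passed on to later commands. The sketch below guards against this. `have_id` is a hypothetical helper name, and treating `None` and `null` as "not found" is an assumption about the CLI's output, not something the original guide states.

```shell
# Hypothetical helper: succeed only if a looked-up ID is non-empty
# and is not one of the placeholders printed for an empty result.
have_id() {
  case $1 in
    ''|None|null) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage, after the lookup above:
#   have_id "$AMI_ID" || { echo "no matching AMI found in $REGION" >&2; exit 1; }
```

The same check fits the VPC and subnet lookups later in this guide, for example in an account that has no default VPC.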
Create an SSH key pair and secure it locally (if you already have a key, update KEY_NAME and skip this step).
aws ec2 create-key-pair --region "$REGION" --key-name "$KEY_NAME" \
  --query 'KeyMaterial' --output text > "${KEY_NAME}.pem"
chmod 400 "${KEY_NAME}.pem"
Create a security group that allows SSH on port 22 plus the Jupyter (8888) and Dask (8786, 8787) ports, and keep outbound traffic open. Replace ALLOWED_CIDR with something more restrictive if you want to limit inbound access. Use ALLOWED_CIDR="$(curl ifconfig.co)/32" to restrict access to your current IP address.
ALLOWED_CIDR=0.0.0.0/0
VPC_ID=$(aws ec2 describe-vpcs \
  --region "$REGION" \
  --filters Name=isDefault,Values=true \
  --query 'Vpcs[0].VpcId' \
  --output text)
echo "$VPC_ID"
SG_ID=$(aws ec2 create-security-group \
  --region "$REGION" \
  --group-name "$SG_NAME" \
  --description "RAPIDS EC2 security group" \
  --vpc-id "$VPC_ID" \
  --query 'GroupId' \
  --output text)
echo "$SG_ID"
SUBNET_ID=$(aws ec2 describe-subnets \
  --region "$REGION" \
  --filters "Name=vpc-id,Values=$VPC_ID" \
  --query 'Subnets[0].SubnetId' \
  --output text)
echo "$SUBNET_ID"
aws ec2 authorize-security-group-ingress --region "$REGION" --group-id "$SG_ID" \
  --protocol tcp --port 22 --no-cli-pager --cidr "$ALLOWED_CIDR"
aws ec2 authorize-security-group-ingress --region "$REGION" --group-id "$SG_ID" \
  --protocol tcp --port 8888 --no-cli-pager --cidr "$ALLOWED_CIDR"
aws ec2 authorize-security-group-ingress --region "$REGION" --group-id "$SG_ID" \
  --protocol tcp --port 8786 --no-cli-pager --cidr "$ALLOWED_CIDR"
aws ec2 authorize-security-group-ingress --region "$REGION" --group-id "$SG_ID" \
  --protocol tcp --port 8787 --no-cli-pager --cidr "$ALLOWED_CIDR"
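The four ingress commands differ only in the port number, and a single IPv4 address is expressed as a CIDR block by appending `/32`. The sketch below shows one way to cover both points. `to_host_cidr` and `open_ports` are hypothetical helper names that do not appear in the original guide, and 203.0.113.7 is a reserved documentation address used as a placeholder.

```shell
# Hypothetical helper: print ADDRESS/32 if ADDRESS is a dotted-quad IPv4 address.
to_host_cidr() {
  case $1 in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;   # empty, non-numeric, or an empty octet
  esac
  old_ifs=$IFS
  IFS=.
  set -- $1                                # split on dots into $1..$4
  IFS=$old_ifs
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  echo "$1.$2.$3.$4/32"
}

# Hypothetical helper: open each listed TCP port to one CIDR block.
# Arguments: security group ID, CIDR block, then one or more ports.
open_ports() {
  sg_id=$1
  cidr=$2
  shift 2
  for port in "$@"; do
    aws ec2 authorize-security-group-ingress --region "$REGION" --group-id "$sg_id" \
      --protocol tcp --port "$port" --no-cli-pager --cidr "$cidr" || return 1
  done
}

# Usage (replace the placeholder address with your own):
#   ALLOWED_CIDR=$(to_host_cidr 203.0.113.7)    # 203.0.113.7/32
#   open_ports "$SG_ID" "$ALLOWED_CIDR" 22 8888 8786 8787
```

Validating the address first means a malformed value is rejected locally instead of producing an API error partway through the loop.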
Launch an EC2 instance with the NVIDIA AMI.
INSTANCE_ID=$(aws ec2 run-instances \
  --region "$REGION" \
  --image-id "$AMI_ID" \
  --count 1 \
  --instance-type "$INSTANCE_TYPE" \
  --key-name "$KEY_NAME" \
  --security-group-ids "$SG_ID" \
  --subnet-id "$SUBNET_ID" \
  --associate-public-ip-address \
  --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=$VM_NAME}]" \
  --query 'Instances[0].InstanceId' \
  --output text)
echo "$INSTANCE_ID"
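GPU instances are billed while they run, so you may want to confirm that the launch request is permitted before sending it for real. The sketch below uses the `--dry-run` flag of `aws ec2 run-instances`. `launch_dry_run` is a hypothetical helper name. As far as we understand, the CLI reports a dry run as an error either way, with the code `DryRunOperation` meaning the request would have been accepted. A dry run checks permissions rather than capacity or quota, so a real launch can still fail afterwards.

```shell
# Hypothetical helper: ask EC2 whether the launch would be allowed, without starting anything.
launch_dry_run() {
  # Capture stdout and stderr together; the outcome is reported in the error message.
  result=$(aws ec2 run-instances \
    --region "$REGION" \
    --image-id "$AMI_ID" \
    --count 1 \
    --instance-type "$INSTANCE_TYPE" \
    --key-name "$KEY_NAME" \
    --security-group-ids "$SG_ID" \
    --subnet-id "$SUBNET_ID" \
    --dry-run 2>&1) || true
  case $result in
    *DryRunOperation*) echo "launch would succeed" ;;
    *) echo "launch would fail: $result" >&2; return 1 ;;
  esac
}

# Usage, before the real run-instances call:
#   launch_dry_run || exit 1
```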
Connect to the instance#
Next we need to connect to the instance.
Open the EC2 Dashboard.
Locate your VM and note the Public IP Address.
In your terminal run ssh ubuntu@<ip address>.
Note
If you use the AWS Console, please use the default ubuntu user to ensure the NVIDIA driver installs on the first boot.
If you launched the instance with the AWS CLI, wait for it to pass health checks.
aws ec2 wait instance-status-ok --region "$REGION" --instance-ids "$INSTANCE_ID"
Retrieve the public IP address and use it to connect via SSH.
PUBLIC_IP=$(aws ec2 describe-instances \
  --region "$REGION" \
  --instance-ids "$INSTANCE_ID" \
  --query 'Reservations[0].Instances[0].PublicIpAddress' \
  --output text)
echo "$PUBLIC_IP"
Connect over SSH using the key created earlier.
ssh -i "${KEY_NAME}.pem" ubuntu@"$PUBLIC_IP"
Note
If you see WARNING: UNPROTECTED PRIVATE KEY FILE!, run chmod 400 rapids-ec2-key.pem before retrying.
Install RAPIDS#
There is a selection of methods you can use to install RAPIDS, which you can see via the RAPIDS release selector.
For this example we are going to run the RAPIDS Docker container so we need to know the name of the most recent container. On the release selector choose Docker in the Method column.
Then copy the commands shown:
docker pull rapidsai/notebooks:25.12a-cuda12-py3.13
docker run --gpus all --rm -it \
--shm-size=1g --ulimit memlock=-1 \
-p 8888:8888 -p 8787:8787 -p 8786:8786 \
rapidsai/notebooks:25.12a-cuda12-py3.13
Note
If you see a “docker socket permission denied” error while running these commands, try closing and reconnecting your SSH window. This happens because your user was added to the docker group only after you signed in.
Note
If you see a “modprobe: FATAL: Module nvidia not found in directory /lib/modules/6.2.0-1011-aws” error when first connecting to the EC2 instance, try logging out and reconnecting.
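Both notes describe conditions you can check on the instance before pulling the container. The sketch below is one way to do that and is not part of the original guide. `check_host` is a hypothetical helper name. It relies on `nvidia-smi` failing while the driver is not loaded and on `docker info` failing while your session cannot reach the Docker socket.

```shell
# Hypothetical helper: run on the instance before starting the RAPIDS container.
check_host() {
  if ! nvidia-smi >/dev/null 2>&1; then
    echo "NVIDIA driver not ready; try logging out and reconnecting" >&2
    return 1
  fi
  if ! docker info >/dev/null 2>&1; then
    echo "cannot reach Docker; reconnect so the docker group membership applies" >&2
    return 1
  fi
  echo "GPU driver and Docker are ready"
}

# Usage:
#   check_host && docker pull rapidsai/notebooks:25.12a-cuda12-py3.13
```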
Test RAPIDS#
To access Jupyter, navigate to <VM ip>:8888 in the browser.
In a Python notebook, check that you can import and use RAPIDS libraries like cudf.
In [1]: import cudf
In [2]: df = cudf.datasets.timeseries()
In [3]: df.head()
Out[3]:
                       id     name         x         y
timestamp
2000-01-01 00:00:00  1020    Kevin  0.091536  0.664482
2000-01-01 00:00:01   974    Frank  0.683788 -0.467281
2000-01-01 00:00:02  1000  Charlie  0.419740 -0.796866
2000-01-01 00:00:03  1019    Edith  0.488411  0.731661
2000-01-01 00:00:04   998    Quinn  0.651381 -0.525398
Open cudf/10min.ipynb and execute the cells to explore more of how cudf works.
When running a Dask cluster you can also visit <VM ip>:8787 to monitor the Dask cluster status.
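If a page does not load in the browser, a quick request from your local terminal can show whether the port is reachable at all, which helps separate security group problems from container problems. The sketch below is an illustration only. `http_status` is a hypothetical helper name, and it assumes the services are served over plain HTTP, as the addresses above suggest. curl prints `000` as the status code when it receives no HTTP response.

```shell
# Hypothetical helper: print the HTTP status code returned by HOST:PORT.
# Arguments: host or IP address, then port.
http_status() {
  # -s silences progress output, -o discards the body, -w prints only the status code.
  curl -s -o /dev/null --max-time 5 -w '%{http_code}' "http://$1:$2" || true
}

# Usage, from your local machine:
#   http_status "$PUBLIC_IP" 8888   # Jupyter
#   http_status "$PUBLIC_IP" 8787   # Dask dashboard, only while a cluster is running
```

A result of `000` on port 8888 while the container is running usually points at the inbound rules or the allowed CIDR rather than at Jupyter itself.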
Clean up#
In the EC2 Dashboard, select your instance, choose Instance state → Terminate, and confirm.
Under Key Pairs, delete the key pair if you generated one and you no longer need it.
Under Security Groups, find the group you created (for example rapids-ec2-sg), then choose Actions → Delete security group.
If you used the AWS CLI, terminate the instance and wait until it is fully shut down.
aws ec2 terminate-instances --region "$REGION" --instance-ids "$INSTANCE_ID" --no-cli-pager
aws ec2 wait instance-terminated --region "$REGION" --instance-ids "$INSTANCE_ID"
Delete the key pair and remove the local .pem file if you created it just for this guide.
aws ec2 delete-key-pair --region "$REGION" --key-name "$KEY_NAME"
rm -f "${KEY_NAME}.pem"
Delete the security group.
aws ec2 delete-security-group --region "$REGION" --group-id "$SG_ID"
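The order of these steps matters: EC2 refuses to delete a security group that is still attached to an instance, which is why the instance is terminated and waited on first. The sketch below chains the same commands so that each step runs only if the previous one succeeded. `cleanup` is a hypothetical helper name, and the function assumes the key pair was created just for this guide.

```shell
# Hypothetical helper: run the cleanup steps in dependency order, stopping at the first failure.
cleanup() {
  aws ec2 terminate-instances --region "$REGION" --instance-ids "$INSTANCE_ID" --no-cli-pager &&
  aws ec2 wait instance-terminated --region "$REGION" --instance-ids "$INSTANCE_ID" &&
  aws ec2 delete-key-pair --region "$REGION" --key-name "$KEY_NAME" &&
  rm -f "${KEY_NAME}.pem" &&
  aws ec2 delete-security-group --region "$REGION" --group-id "$SG_ID"
}

# Usage, in the shell where the variables from this guide are still set:
#   cleanup
```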