How Do I Create Deep Learning VMs on GCP?
Introduction
In this tutorial, we will create a Deep Learning VM on Google Cloud Platform (GCP) with a Pulumi program written in TypeScript. This involves setting up a VPC network, a subnetwork, a firewall rule, and a Google Compute Engine instance whose boot disk uses a Deep Learning VM image. Deep Learning VM images come pre-installed with popular deep learning frameworks such as TensorFlow and PyTorch; the program below uses the tf-latest-gpu image family.
Step-by-Step Guide
Follow these steps to create a Deep Learning VM on GCP:
Define the Project and Region: Start by specifying the project and region where you want to deploy your resources. The program below hard-codes the us-central1 region and the us-central1-a zone, and takes the project from the GCP provider configuration.
Create a VPC Network and Subnetwork:
- Use google_compute_network (gcp.compute.Network in the Pulumi TypeScript SDK) to define a Virtual Private Cloud (VPC) network.
- Set up a subnetwork within this VPC using google_compute_subnetwork (gcp.compute.Subnetwork).
Define the Google Compute Engine Instance:
- Use the google_compute_instance resource (gcp.compute.Instance) to create a Compute Engine instance.
- Reference the Deep Learning VM image in the boot_disk (bootDisk) so the instance comes with the deep learning frameworks pre-installed.
Set Up Firewall Rules:
- Add a firewall rule that allows inbound TCP traffic on port 22 so you can reach the instance over SSH.
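The instance step depends on two string conventions that are easy to mistype: the image family path, and the zone name, which has to sit inside the subnetwork's region. The following is a minimal sketch of helpers for both. The names dlvmImagePath and regionOfZone are hypothetical (they are not part of Pulumi or any GCP SDK), and the image project is simply the one that appears in this tutorial's image path.

```typescript
// Image project taken from the image path used in this tutorial.
const DLVM_IMAGE_PROJECT = "deeplearning-platform-release";

// Hypothetical helper: builds the boot-disk image reference for an image family.
function dlvmImagePath(family: string): string {
    if (!/^[a-z][a-z0-9-]*$/.test(family)) {
        throw new Error(`Unexpected image family name: ${family}`);
    }
    return `projects/${DLVM_IMAGE_PROJECT}/global/images/family/${family}`;
}

// Hypothetical helper: derives the region from a zone name by dropping the
// trailing "-<letter>" suffix, e.g. "us-central1-a" -> "us-central1".
function regionOfZone(zone: string): string {
    const cut = zone.lastIndexOf("-");
    if (cut <= 0 || zone.length - cut !== 2) {
        throw new Error(`Not a zone name: ${zone}`);
    }
    return zone.slice(0, cut);
}
```

A check such as regionOfZone(zone) === region before creating the instance would catch a zone that lies outside the subnetwork's region before the deployment fails.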
Key Points
- The google_compute_network resource is used to create a VPC network.
- A subnetwork is established using google_compute_subnetwork.
- The Compute Engine instance is defined with a Deep Learning VM image in the boot_disk.
- Firewall rules are crucial for granting access to the instance.
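Because the firewall rule decides who can reach port 22, it can help to derive its source range from a known address rather than opening it to every host. The sketch below assumes you know the single IPv4 address you will connect from. toHostCidr is a hypothetical helper name, and 203.0.113.7 is a documentation-only example address.

```typescript
// Hypothetical helper: turns one IPv4 address into a /32 CIDR range that can
// be used as an entry in a firewall rule's source ranges.
function toHostCidr(ipv4: string): string {
    const octets = ipv4.trim().split(".");
    const valid =
        octets.length === 4 &&
        octets.every(o => /^\d{1,3}$/.test(o) && Number(o) <= 255);
    if (!valid) {
        throw new Error(`Not an IPv4 address: ${ipv4}`);
    }
    return `${octets.join(".")}/32`;
}

// Example: a source range limited to a single (example) address.
const sshSourceRanges: string[] = [toHostCidr("203.0.113.7")];
```

The resulting array has the same shape as the list of CIDR strings a firewall rule expects for its source ranges.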
import * as pulumi from "@pulumi/pulumi";
import * as gcp from "@pulumi/gcp";

// Create a VPC network. Auto-created subnetworks are disabled so that the
// subnetwork below is the only one in the network.
const vpcNetwork = new gcp.compute.Network("vpc_network", {
    name: "dl-vpc-network",
    autoCreateSubnetworks: false,
});

// Create a subnetwork in the region that contains the instance's zone
const subnetwork = new gcp.compute.Subnetwork("subnetwork", {
    name: "dl-subnetwork",
    network: vpcNetwork.id,
    ipCidrRange: "10.0.0.0/16",
    region: "us-central1",
});

// Create a firewall rule that allows SSH (TCP port 22).
// Note: 0.0.0.0/0 admits traffic from any address; narrow this range if you
// know where you will connect from.
const firewallRule = new gcp.compute.Firewall("firewall_rule", {
    name: "allow-ssh",
    network: vpcNetwork.id,
    allows: [{
        protocol: "tcp",
        ports: ["22"],
    }],
    sourceRanges: ["0.0.0.0/0"],
});

// Define the Google Compute Engine instance
const dlVmInstance = new gcp.compute.Instance("dl_vm_instance", {
    networkInterfaces: [{
        // An empty access config requests an ephemeral external IP address
        accessConfigs: [{}],
        network: vpcNetwork.id,
        subnetwork: subnetwork.id,
    }],
    name: "dl-vm-instance",
    machineType: "n1-standard-4",
    zone: "us-central1-a",
    bootDisk: {
        initializeParams: {
            // Deep Learning VM image family. Note: this is a GPU image family,
            // but no GPU is attached to the instance in this program.
            image: "projects/deeplearning-platform-release/global/images/family/tf-latest-gpu",
        },
    },
    metadata: {
        // Placeholder: replace with your own key entries
        "ssh-keys": "your-ssh-keys-content",
    },
});

// Stack outputs
export const instanceName = dlVmInstance.name;
export const instanceZone = dlVmInstance.zone;
export const instancePublicIp = dlVmInstance.networkInterfaces.apply(networkInterfaces => networkInterfaces[0].accessConfigs?.[0]?.natIp);
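The ssh-keys metadata value in the program above is a placeholder. Compute Engine reads that value as one entry per line, each in the form USERNAME:PUBLIC_KEY. The sketch below shows one way to build such a string. The SshKey interface and the formatSshKeysMetadata function are hypothetical names, and the key material in the example is a dummy value, not a usable key.

```typescript
// Hypothetical shape for one login: a Linux username and its public key line
// (for example, the contents of an id_ed25519.pub file).
interface SshKey {
    username: string;
    publicKey: string;
}

// Hypothetical helper: joins keys into the "USERNAME:PUBLIC_KEY" lines
// expected in the ssh-keys metadata value, one entry per line.
function formatSshKeysMetadata(keys: SshKey[]): string {
    return keys
        .map(({ username, publicKey }) => `${username}:${publicKey.trim()}`)
        .join("\n");
}

// Example with dummy key material (not a real key).
const sshKeysValue = formatSshKeysMetadata([
    { username: "alice", publicKey: "ssh-ed25519 AAAAC3example alice\n" },
    { username: "bob", publicKey: "ssh-ed25519 AAAAC3another bob" },
]);
```

The resulting string could then be passed as the "ssh-keys" entry of the instance metadata in place of the placeholder.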
Conclusion
In this example, we created a Deep Learning VM on Google Cloud Platform using Pulumi. The program sets up a VPC network and subnetwork, adds a firewall rule for SSH access, and creates a Compute Engine instance that boots from a Deep Learning VM image and has an external IP address. The result is an instance with the deep learning frameworks already installed, which you can connect to over SSH and use for deep learning workloads.