Deploy and Manage LLMs on Google Cloud Run GPUs with Pulumi

workshop · October 2024

Date: Oct 17, 2024
Duration: 90 minutes
If you’ve ever wanted to run LLM inference in under a minute while paying only for what you consume, Google Cloud Run GPUs are for you! In this hands-on workshop, we will demonstrate how Pulumi can seamlessly stand up an environment for deploying your LLMs and custom models on Google Cloud Run GPUs. You will learn how to create scalable, cost-efficient infrastructure for rapid LLM inference, using Pulumi to automate and manage your deployments. Whether you’re deploying pre-trained LLMs or custom models, this workshop will give you the tools and knowledge you need to optimize your AI workloads in the cloud.

What you'll learn

  • How to efficiently deploy and manage LLMs and custom models on Google Cloud Run GPUs.
  • Best practices for setting up scalable and cost-effective infrastructure for fast LLM inference.
  • How to automate cloud infrastructure and streamline AI workload management using Pulumi.
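To give a flavor of the approach, a GPU-backed Cloud Run service can be sketched with Pulumi's Python SDK. This is a minimal sketch, not the workshop's exact code: the resource names, the Ollama container image, the region, and the CPU/memory/GPU settings are illustrative assumptions.

```python
import pulumi
import pulumi_gcp as gcp

# Cloud Run v2 service with an attached NVIDIA L4 GPU.
# Assumptions: us-central1 offers Cloud Run GPUs, and the Ollama
# image serves an LLM over HTTP on port 11434.
llm_service = gcp.cloudrunv2.Service(
    "llm-service",
    location="us-central1",
    template={
        "containers": [{
            "image": "ollama/ollama:latest",  # illustrative model server
            "ports": {"container_port": 11434},
            "resources": {
                "limits": {
                    "cpu": "8",
                    "memory": "32Gi",
                    "nvidia.com/gpu": "1",  # one GPU per instance
                },
            },
        }],
        # Pick the GPU type offered by Cloud Run.
        "node_selector": {"accelerator": "nvidia-l4"},
        # Scale to zero when idle so you pay only for what you consume.
        "scaling": {"min_instance_count": 0, "max_instance_count": 1},
    },
)

# Export the service URL so you can send inference requests to it.
pulumi.export("url", llm_service.uri)
```

Running `pulumi up` against a program like this provisions the service; scale-to-zero plus per-request billing is what makes the cost-efficiency story in the workshop possible.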
This workshop uses Pulumi Cloud. Sign up to follow along.

This event has passed. Check out our on-demand recordings for more content.

Speakers
Jay Smith
Sr. Cloud Customer Engineer, Google
Mitch Gerdisch
Solutions Architect, Pulumi