Deploy and Manage LLMs on Google Cloud Run GPUs with Pulumi

workshop · October 2024

Date: Oct 17, 2024
Duration: 90 minutes
If you’ve ever wanted to run LLM inference in under a minute while paying only for what you consume, Google Cloud Run GPUs are for you! In this hands-on workshop, we will demonstrate how Pulumi can seamlessly stand up an environment for deploying your LLMs and custom models on Google Cloud Run GPUs. You will learn how to create scalable, cost-efficient infrastructure for rapid LLM inference, using Pulumi to automate and manage your deployments. Whether you’re deploying pre-trained LLMs or custom models, this workshop will give you the tools and knowledge you need to optimize your AI workloads in the cloud.

What you'll learn

  • How to efficiently deploy and manage LLMs and custom models on Google Cloud Run GPUs.
  • Best practices for setting up scalable and cost-effective infrastructure for fast LLM inference.
  • How to automate cloud infrastructure and streamline AI workload management using Pulumi.
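To give a flavor of the approach, a GPU-backed Cloud Run service can be sketched with Pulumi's Python SDK. This is a minimal sketch, not the workshop's exact code: the resource names, the Ollama container image, the region, and the CPU/memory/GPU settings are illustrative assumptions.

```python
import pulumi
import pulumi_gcp as gcp

# Cloud Run v2 service with an attached NVIDIA L4 GPU.
# Assumptions: us-central1 offers Cloud Run GPUs, and the Ollama
# image serves an LLM over HTTP on port 11434.
llm_service = gcp.cloudrunv2.Service(
    "llm-service",
    location="us-central1",
    template={
        "containers": [{
            "image": "ollama/ollama:latest",  # illustrative model server
            "ports": {"container_port": 11434},
            "resources": {
                "limits": {
                    "cpu": "8",
                    "memory": "32Gi",
                    "nvidia.com/gpu": "1",  # one GPU per instance
                },
            },
        }],
        # Pick the GPU type offered by Cloud Run.
        "node_selector": {"accelerator": "nvidia-l4"},
        # Scale to zero when idle so you pay only for what you consume.
        "scaling": {"min_instance_count": 0, "max_instance_count": 1},
    },
)

# Export the service URL so you can send inference requests to it.
pulumi.export("url", llm_service.uri)
```

Running `pulumi up` against a program like this provisions the service; scale-to-zero plus per-request billing is what makes the cost-efficiency story in the workshop possible.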
This workshop uses Pulumi Cloud. Sign up to follow along.

This event has passed. Check out our on-demand recordings for more content.

Speakers
Jay Smith
Sr. Cloud Customer Engineer, Google
Mitch Gerdisch
Solutions Architect, Pulumi