WORKSHOP • VIRTUAL
Deploying LLMs on GKE with Pulumi
DATE
Jul 31, 2024
TIME
DURATION
90 minutes
This workshop is taught with Pulumi Cloud. Sign up for free to follow along.
This hands-on workshop guides participants through deploying the Mixtral 8x7B large language model on Google Kubernetes Engine (GKE), using Pulumi for infrastructure as code. Attendees will learn to use NVIDIA L4 GPUs on GKE to serve advanced AI models efficiently. The session covers setting up the Google Cloud environment, provisioning a GKE cluster with Pulumi, and running the model in a container with Hugging Face's Text Generation Inference (TGI).
You'll Learn:
Configuring GCP for AI workloads
Using Pulumi with Python for infrastructure deployment
Managing GPU-enabled Kubernetes clusters
Serving and testing large language models on GKE
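As a rough illustration of the last item, a model served with Text Generation Inference can be queried over HTTP through its /generate endpoint. The sketch below is not taken from the workshop material. It assumes the TGI service is reachable at a base URL of your choosing, and the helper names build_generate_request and generate are hypothetical.

```python
import json
import urllib.request


def build_generate_request(base_url: str, prompt: str,
                           max_new_tokens: int = 64) -> urllib.request.Request:
    """Build a POST request for TGI's /generate endpoint (nothing is sent yet)."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, prompt: str,
             max_new_tokens: int = 64, timeout: float = 60.0) -> str:
    """Send the prompt to a running TGI server and return the generated text."""
    request = build_generate_request(base_url, prompt, max_new_tokens)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # TGI's /generate response is a JSON object with a "generated_text" field.
    return body["generated_text"]
```

For example, if the service were forwarded to a local port with kubectl port-forward (the address here is an assumption), calling generate("http://localhost:8080", "What is Pulumi?") would return the model's completion as a string.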
Event Speakers
Engin Diri
Customer Experience Architect, Pulumi
Jason Smith
Sr. Cloud Customer Engineer, Google