1. Serverless Deployment of AI Models with GCP Cloud Build.


    To deploy AI models in a serverless fashion on Google Cloud Platform (GCP), you could use a combination of Google Cloud Build for automating the building and deployment process, and Google Cloud AI Platform for hosting the model.

    Here's an overview of the steps you would take:

    1. Google Cloud Build (Cloud Build) for Continuous Integration and Deployment (CI/CD): Cloud Build lets you define a series of steps, called a build, in a YAML or JSON configuration file. Each build can include tasks such as building a Docker container, pushing that container to Google Container Registry, or deploying it to Google Cloud Run or AI Platform.

    2. AI Platform Prediction for serving models: AI Platform Prediction provides a serverless environment to deploy machine learning models. Once a model is deployed, it can handle incoming prediction requests via an API. For this, you'll need a trained machine learning model saved in a format supported by AI Platform Prediction.

    3. Google Cloud Storage (GCS) to store your trained model: GCS will be used to store the AI model artifacts required during deployment. Once the model training is complete, you can save your trained model artifacts here.
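    As a sketch of step 3, the snippet below uploads a trained model's artifacts to a GCS bucket using the `google-cloud-storage` client. The bucket, model, and directory names are placeholders, and the `models/<model>/<version>/` layout is just one reasonable convention, not anything AI Platform mandates.

```python
import os

def artifact_blob_path(model_name, version, local_path, base_dir):
    """Return the bucket-relative blob path for one model artifact,
    under a models/<model>/<version>/ prefix (an illustrative convention)."""
    rel = os.path.relpath(local_path, base_dir)
    return f"models/{model_name}/{version}/{rel}"

def upload_model_artifacts(bucket_name, model_name, version, export_dir):
    """Upload every file under export_dir to the models/ prefix in GCS.
    Requires google-cloud-storage and application-default credentials."""
    from google.cloud import storage  # pip install google-cloud-storage
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    for root, _dirs, files in os.walk(export_dir):
        for fname in files:
            local = os.path.join(root, fname)
            blob = bucket.blob(
                artifact_blob_path(model_name, version, local, export_dir))
            blob.upload_from_filename(local)

# Example of the resulting layout:
print(artifact_blob_path("my-model", "v1",
                         "/tmp/export/saved_model.pb", "/tmp/export"))
# → models/my-model/v1/saved_model.pb
```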
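    Once a version is deployed (step 2), clients can request predictions through the AI Platform `projects.predict` API. Here's a rough sketch using `google-api-python-client`; the project and model names are placeholders, and the `{"instances": [...]}` body is the request shape AI Platform Prediction expects.

```python
def predict_request(project_id, model_name, instances, version=None):
    """Build the resource name and JSON body for a predict call."""
    name = f"projects/{project_id}/models/{model_name}"
    if version:
        name = f"{name}/versions/{version}"
    return name, {"instances": instances}

def predict(project_id, model_name, instances, version=None):
    """Send the request via the Google API client (needs credentials).
    Requires: pip install google-api-python-client"""
    from googleapiclient import discovery
    name, body = predict_request(project_id, model_name, instances, version)
    service = discovery.build("ml", "v1")
    response = service.projects().predict(name=name, body=body).execute()
    if "error" in response:
        raise RuntimeError(response["error"])
    return response["predictions"]
```

    If no version is named, AI Platform routes the request to the model's default version.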

    Let's code these steps using Pulumi in Python.

```python
import pulumi
import pulumi_gcp as gcp

# Replace these variables with actual values or configuration references
project_id = 'your-gcp-project-id'
gcs_bucket_name = 'your-ai-models-bucket'
model_name = 'your-model-name'

# Create a Google Cloud Storage bucket to store AI models
ai_models_bucket = gcp.storage.Bucket('ai-models-bucket',
                                      name=gcs_bucket_name)

# Deploy the trained model to AI Platform
ai_model = gcp.ml.EngineModel('ai-model',
                              name=model_name,
                              project=project_id,
                              regions=['us-central1'])  # Change to your preferred region

# AI Platform requires a model version, which is associated
# with the artifact produced by training
model_version = gcp.ml.EngineModelVersion('model-version',
                                          model=ai_model.name,
                                          project=project_id,
                                          region='us-central1',    # Change to your preferred region
                                          framework='TENSORFLOW',  # Change according to your model type
                                          runtime_version='2.1',   # Change according to your requirements
                                          python_version='3.7',    # Change according to your requirements
                                          deployment_uri=ai_models_bucket.url)  # URI of the model in GCS

# The Cloud Build trigger automatically deploys new model versions when changes are made.
build_trigger = gcp.cloudbuild.Trigger('build-trigger',
                                       project=project_id,
                                       description='Deploy AI model when changes are made',
                                       filename='cloudbuild.yaml',  # Defines the build steps; resides in your repository
                                       included_files=[f'gs://{gcs_bucket_name}/**'])

# Export the model version ID and build trigger ID
pulumi.export('model_version_id', model_version.id)
pulumi.export('build_trigger_id', build_trigger.id)
```

    Explanation of the code:

    • We first import the Pulumi SDK and the Pulumi GCP provider module.
    • We set placeholder variables for the project ID, the Cloud Storage bucket name, and the model name. Replace these with actual values or Pulumi configuration references.
    • Using gcp.storage.Bucket, we create a Cloud Storage bucket to store the AI model artifacts.
    • gcp.ml.EngineModel represents the AI model in AI Platform and takes attributes like the project ID and region.
    • Each AI model version requires a deployment URI, which points to a Cloud Storage location where the model's artifacts are stored. We specify this in gcp.ml.EngineModelVersion along with the framework, runtime, and Python versions.
    • We create a Cloud Build trigger using gcp.cloudbuild.Trigger that listens for changes to our model files in the specified Cloud Storage bucket. This trigger is configured to use a cloudbuild.yaml file (not shown) which details the automated steps for deploying the model to AI Platform Prediction.
    • Finally, we export the model version ID and the build trigger ID using Pulumi's export function, which will display these values as outputs after the Pulumi program is executed.
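    The cloudbuild.yaml that the trigger references is not part of the Pulumi program; a minimal sketch might look like the following. The model, bucket, and version names are placeholders, and the exact `gcloud ai-platform versions create` flags should be checked against your runtime and framework.

```yaml
# Hypothetical cloudbuild.yaml: creates a new AI Platform model version
# from the artifacts in the GCS bucket. All names below are placeholders.
steps:
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: 'gcloud'
    args:
      - 'ai-platform'
      - 'versions'
      - 'create'
      - 'v_${SHORT_SHA}'   # version names may only contain [A-Za-z0-9_]
      - '--model=your-model-name'
      - '--origin=gs://your-ai-models-bucket/models/your-model-name'
      - '--runtime-version=2.1'
      - '--framework=tensorflow'
      - '--python-version=3.7'
```

    Each new successful build then publishes a fresh version of the model without any manual deployment step.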

    Now, you can extend this basic program with additional automation, monitoring, and scaling to suit your specific needs. Keep in mind that the above code is a simple deployment plan; a real-world scenario may require additional steps like training the model, configuring IAM roles, or setting up endpoint security.