1. Serving Model Inference Outputs from GCP Storage Buckets


    To serve model inference outputs from Google Cloud Storage (GCS) buckets with Pulumi, you would typically follow these steps:

    1. Create a GCS bucket that will hold the inference outputs.
    2. Configure the bucket for public access if the outputs should be publicly available.
    3. Upload the inference outputs to the bucket as objects.
    4. Set up proper IAM policies and permissions to control access to the objects if they are not public.

    Below is a Pulumi Python program that:

    • Creates a new GCS bucket.
    • Configures the bucket to serve objects.
    • Sets an IAM binding for public read access, if you want the inference results to be public.
    • Optionally, uploads objects representing the inference outputs to the bucket (shown as commented-out pseudo-code).
```python
import pulumi
import pulumi_gcp as gcp

# Create a new GCS bucket to store model inference outputs.
# Assuming here that the inference outputs should be public.
inference_output_bucket = gcp.storage.Bucket(
    "inference-output-bucket",
    location="US",  # Choose the appropriate location for your use case.
    # Set uniform bucket-level access for easy permissions management.
    uniform_bucket_level_access=True,
)

# Provide public read access to the objects in the bucket so the outputs
# can be served to users.
# WARNING: This IAM binding allows public read access to the bucket's
# objects. Make sure that this is your intended permission setting.
public_read_policy = gcp.storage.BucketIAMBinding(
    "inference-bucket-public-read",
    bucket=inference_output_bucket.name,
    role="roles/storage.objectViewer",
    members=["allUsers"],
)

# Export the bucket name and a URL for accessing it.
pulumi.export("bucket_name", inference_output_bucket.name)
pulumi.export(
    "bucket_url",
    pulumi.Output.concat("https://storage.googleapis.com/", inference_output_bucket.name),
)

# Note: Uploading model inference outputs as objects to the bucket is not
# included in this example, but can be done with the `BucketObject`
# resource, similar to the pseudo-code below:
#
# inference_output_object = gcp.storage.BucketObject("inference-output-object",
#     bucket=inference_output_bucket.name,
#     name="example-inference-output.json",
#     source=pulumi.FileAsset("./path/to/local/inference/output.json")
# )
#
# Ensure the output files you wish to upload exist at the specified local path.
```

    In this program, we use the `gcp.storage.Bucket` resource to create a new GCS bucket in which the inference outputs will be stored. The location is specified as "US", but you should choose a location that is closest to your users or that meets your regulatory requirements.

    The `gcp.storage.BucketIAMBinding` resource adds an IAM binding on this bucket, granting public read access to its objects by assigning the role `roles/storage.objectViewer` to `allUsers`. This means anyone on the internet can read the objects stored in this bucket. Only apply this binding if you are sure that the bucket should be publicly accessible.

    The bucket name (`inference_output_bucket.name`) and a constructed URL for the bucket are then exported. These outputs can be consumed by other Pulumi stacks (via stack references) or by applications that need to access the bucket and its contents.
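    Note that the exported `bucket_url` points at the bucket root; an individual object is served at `https://storage.googleapis.com/<bucket>/<object>`. As an illustration (plain Python, independent of Pulumi), such object URLs can be formed like this, percent-encoding special characters in the object name:

```python
from urllib.parse import quote

def gcs_object_url(bucket_name: str, object_name: str) -> str:
    """Build the public HTTPS URL for an object in a GCS bucket."""
    # Object names may contain '/' path separators, which must be kept,
    # while other special characters need percent-encoding.
    return f"https://storage.googleapis.com/{bucket_name}/{quote(object_name, safe='/')}"
```

    A client can fetch such a URL directly, provided the bucket (or object) grants it read access.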

    When you enable the upload functionality shown in the commented-out pseudo-code, remember to replace "./path/to/local/inference/output.json" with the path to the actual model inference output file you intend to upload. This upload step is often performed as part of a CI/CD pipeline, or by a separate Pulumi program that creates objects dynamically based on model inference activity.
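    If you later wire up the upload step for many output files, one approach (a sketch, assuming the outputs are files with a known suffix under a local directory) is to enumerate the files first and then create one `BucketObject` per file. The enumeration itself is plain Python:

```python
import os

def collect_inference_outputs(output_dir: str, suffix: str = ".json"):
    """Return (object_name, local_path) pairs for each matching output file.

    Each pair can back one gcp.storage.BucketObject, e.g.
    gcp.storage.BucketObject(object_name, bucket=bucket.name,
                             name=object_name,
                             source=pulumi.FileAsset(local_path)).
    """
    pairs = []
    for entry in sorted(os.listdir(output_dir)):
        path = os.path.join(output_dir, entry)
        # Skip subdirectories and files that are not inference outputs.
        if os.path.isfile(path) and entry.endswith(suffix):
            pairs.append((entry, path))
    return pairs
```

    Keeping the file enumeration separate from the resource declarations makes the upload set easy to inspect and test before any cloud resources are created.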