1. Deploying Inference APIs on DigitalOcean Droplets


    To deploy an inference API on a DigitalOcean Droplet, we will follow these steps:

    1. Create an SSH key: For securely accessing the Droplet later on.
    2. Set up a DigitalOcean Droplet: Where the API service will run.
    3. Assign a Floating IP (Optional): To have a static IP address for our service.
    4. Configure Firewall Rules: To restrict traffic to specific ports.
    5. Deploy the Inference API: This is usually done by provisioning the Droplet with an initial setup script or by deploying a Docker container that runs the API.

    We will be using these resources from the DigitalOcean Pulumi provider:

    • SshKey: To manage our SSH public keys on DigitalOcean, enabling us to access droplets.
    • Droplet: The virtual private server (VPS) where our API will run.
    • Firewall: To configure which traffic can reach our Droplet.

    Assuming you have already set up Pulumi and authenticated with DigitalOcean, here is how you could write your Pulumi program in Python. Your API deployment steps (setting up the environment, installing dependencies, and running your inference server) would typically go either in a user data script that runs when the Droplet first boots, or in a Docker container, in which case you only need to run the container with the proper ports exposed.

    Let's start by writing the program:

    import pulumi
    import pulumi_digitalocean as digitalocean

    # Create a new SSH key to access the Droplet
    ssh_key = digitalocean.SshKey("my-ssh-key",
        public_key="ssh-rsa AAAAB3Nza... user@example.com")

    # Provision a DigitalOcean Droplet for the inference API
    droplet = digitalocean.Droplet("inference-api-droplet",
        image="ubuntu-20-04-x64",  # Ubuntu 20.04; choose an image that suits your API's requirements
        region="nyc3",             # New York datacenter; pick the region closest to your users
        size="s-1vcpu-1gb",        # The smallest Droplet; scale up to match your performance needs
        ssh_keys=[ssh_key.id],     # The SSH key we just created
        # Assuming you have a script 'setup-inference-api.sh' that installs the API and its dependencies
        user_data="""#!/bin/bash
    # Your commands to install the API and its dependencies go here
    # For example:
    sudo apt-get update
    sudo apt-get install -y python3 python3-pip python3-venv
    # ... and so on.
    """)

    # Configure a firewall for the Droplet that only allows HTTP(S) and SSH traffic
    firewall = digitalocean.Firewall("inference-api-firewall",
        droplet_ids=[droplet.id],
        inbound_rules=[
            # Allow HTTP traffic from anywhere
            digitalocean.FirewallInboundRuleArgs(
                protocol="tcp",
                port_range="80",
                source_addresses=["0.0.0.0/0", "::/0"],
            ),
            # Allow HTTPS traffic from anywhere
            digitalocean.FirewallInboundRuleArgs(
                protocol="tcp",
                port_range="443",
                source_addresses=["0.0.0.0/0", "::/0"],
            ),
            # Allow SSH traffic from your management IP only
            digitalocean.FirewallInboundRuleArgs(
                protocol="tcp",
                port_range="22",
                source_addresses=["Your.IP.Address.Here/32"],
            ),
        ],
        outbound_rules=[
            # Allow the server to reach the outside world for updates, packages, etc.
            digitalocean.FirewallOutboundRuleArgs(
                protocol="tcp",
                port_range="1-65535",
                destination_addresses=["0.0.0.0/0", "::/0"],
            ),
            # Allow DNS lookups so apt-get and pip can resolve hostnames
            digitalocean.FirewallOutboundRuleArgs(
                protocol="udp",
                port_range="53",
                destination_addresses=["0.0.0.0/0", "::/0"],
            ),
        ])

    # Optionally, if you want a static IP address, uncomment the following lines
    # floating_ip = digitalocean.FloatingIp("inference-api-ip",
    #     droplet_id=droplet.id,
    #     region=droplet.region)

    # Export the Droplet's addresses so we can reach the API later
    pulumi.export('droplet_ip', droplet.ipv4_address)
    pulumi.export('droplet_ipv6', droplet.ipv6_address)  # Only populated if you set ipv6=True on the Droplet

    # To export the static IP (if you've provisioned one), use the line below
    # pulumi.export('static_ip', floating_ip.ip_address)

    Explanation:

    • We create a new SSH key to access our Droplet.
    • Then, we provision a Droplet configured with Ubuntu 20.04. You must choose the right size according to your inference API requirements and traffic expectations.
    • Next, we configure the user data with a bash script. This example assumes you have a script ready to set up the API; a fuller sketch of what such a script might contain follows this list.
    • We set up a firewall rule to restrict access and only allow necessary traffic via HTTP(S) and SSH.
    • We export the Droplet's IP so we can access it later. Optionally, we could assign a Floating IP if we wanted a static address.
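
    To make the user data more concrete, here is a rough sketch of what such a first-boot script could look like for a hypothetical FastAPI-based inference server. The repository URL, the app module (app.main:app), and the port are placeholders you would replace with your own:

    # Hypothetical first-boot script for a FastAPI-based inference server.
    # The repository URL and app module below are placeholders.
    inference_user_data = """#!/bin/bash
    set -e
    apt-get update
    apt-get install -y python3 python3-pip python3-venv git

    # Fetch the (hypothetical) API code and install its dependencies
    git clone https://example.com/your-org/inference-api.git /opt/inference-api
    python3 -m venv /opt/inference-api/venv
    /opt/inference-api/venv/bin/pip install -r /opt/inference-api/requirements.txt

    # Start the inference server on port 80; for production you would
    # typically wrap this in a systemd unit instead of nohup
    nohup /opt/inference-api/venv/bin/uvicorn app.main:app --host 0.0.0.0 --port 80 &
    """

    # Pass it to the Droplet in place of the stub shown earlier:
    # droplet = digitalocean.Droplet("inference-api-droplet", ..., user_data=inference_user_data)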

    Remember to replace "ssh-rsa AAAAB3Nza... user@example.com" with your actual public SSH key and "Your.IP.Address.Here/32" with the IP address from which you'll be managing the services on your Droplet.
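
    If you would rather not hardcode those values, you can read the public key from a local file and the management IP from Pulumi configuration. A minimal sketch, assuming your key lives at ~/.ssh/id_rsa.pub and you have run pulumi config set adminCidr <your-ip>/32:

    import os
    import pulumi
    import pulumi_digitalocean as digitalocean

    config = pulumi.Config()
    # Assumes you have run: pulumi config set adminCidr <your-ip>/32
    admin_cidr = config.require("adminCidr")

    # Read the SSH public key from a local file instead of pasting it inline
    with open(os.path.expanduser("~/.ssh/id_rsa.pub")) as f:
        ssh_key = digitalocean.SshKey("my-ssh-key", public_key=f.read().strip())

    # ...and use admin_cidr in the SSH inbound rule of the firewall:
    # source_addresses=[admin_cidr]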

    This is a basic setup that you can expand upon based on the specifics of your inference API and traffic needs. You might need to configure your inference service to start when the Droplet starts, or use a configuration management tool to manage your service's deployment and updates.
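
    If you go the Docker route mentioned earlier, the same pattern applies: the user data installs Docker and starts your container on boot. A rough sketch, assuming a hypothetical image named your-registry/inference-api that serves on port 80:

    # Hypothetical Docker-based user data; the image name is a placeholder.
    docker_user_data = """#!/bin/bash
    set -e
    apt-get update
    apt-get install -y docker.io
    systemctl enable --now docker

    # --restart unless-stopped keeps the API running across reboots
    docker run -d --restart unless-stopped -p 80:80 your-registry/inference-api:latest
    """

    # Swap it into the Droplet definition:
    # droplet = digitalocean.Droplet("inference-api-droplet", ..., user_data=docker_user_data)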