High-Performance Inference Services on OCI OKE

Pulumi · Accepted Answer

Oracle Cloud Infrastructure (OCI) Container Engine for Kubernetes (OKE) is a fully managed, scalable, and highly available Kubernetes service for deploying containerized applications, including machine learning inference services, to the cloud.

In this program, we'll create the infrastructure for hosting high-performance inference services on OCI using Pulumi. The main components are a Kubernetes cluster provisioned through OKE, a Virtual Cloud Network (VCN) with a subnet for the cluster's networking, and a Node Pool that manages the cluster's worker nodes.

Here's how you'll set up your high-performance inference services on OCI OKE using Pulumi:

1. **Virtual Cloud Network (VCN):** Define a VCN to provide a customizable private network in OCI.
2. **Kubernetes Cluster (OKE):** Provision an OKE cluster to run your inference services inside Kubernetes.
3. **Node Pool:** Configure a pool of worker nodes that will join the OKE cluster, where your inference services will be deployed.
4. **Network Security Group:** Create security rules to define allowable network communication to and from your Kubernetes nodes.
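
Before provisioning, it can help to sanity-check the address plan. The sketch below uses only the Python standard library to confirm that a subnet range sits inside the VCN range and to estimate how many addresses are left for worker nodes and load balancers. It assumes the CIDR blocks used in this answer and that OCI reserves the first two addresses and the last address of every subnet. `usable_addresses` is a hypothetical helper, not part of Pulumi or the OCI SDK.

```python
import ipaddress

# OCI reserves the first two addresses and the last address of each subnet.
OCI_RESERVED_PER_SUBNET = 3


def usable_addresses(vcn_cidr: str, subnet_cidr: str) -> int:
    """Return the number of assignable addresses in the subnet.

    Raises ValueError if the subnet is not contained in the VCN range.
    """
    vcn = ipaddress.ip_network(vcn_cidr)
    subnet = ipaddress.ip_network(subnet_cidr)
    if not subnet.subnet_of(vcn):
        raise ValueError(f"{subnet_cidr} is not inside {vcn_cidr}")
    return subnet.num_addresses - OCI_RESERVED_PER_SUBNET


if __name__ == "__main__":
    # CIDR blocks taken from the program in this answer.
    print(usable_addresses("10.0.0.0/16", "10.0.1.0/24"))
```

A `/24` subnet is comfortable for a two-node pool, but a larger pool or pod networking that consumes VCN addresses may call for a wider range.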

Below is the complete Pulumi program in Python that sets up these resources. After setting up the infrastructure, you would need to deploy your inference service application to the Kubernetes cluster, which might involve additional steps like creating Kubernetes Deployments, Services, and possibly Ingress Controllers for external access if needed.

```python
import pulumi
import pulumi_oci as oci

# Assumption: the target compartment OCID is provided as stack configuration, e.g.
#   pulumi config set compartmentId <your-compartment-ocid>
config = pulumi.Config()
compartment_id = config.require("compartmentId")

# Look up a real availability domain name; a bare index such as "1" is not valid.
ads = oci.identity.get_availability_domains(compartment_id=compartment_id)
availability_domain = ads.availability_domains[0].name

# Create a new VCN for the Kubernetes cluster
vcn = oci.core.Vcn("my_vcn",
                   cidr_block="10.0.0.0/16",
                   compartment_id=compartment_id,
                   display_name="MyVCN")

# Create a subnet for the Node Pool (also used for service load balancers below)
subnet = oci.core.Subnet("my_subnet",
                         cidr_block="10.0.1.0/24",
                         vcn_id=vcn.id,
                         compartment_id=compartment_id,
                         display_name="MySubnet")

# Create a Network Security Group for the Node Pool
nsg = oci.core.NetworkSecurityGroup("my_nsg",
                                    compartment_id=compartment_id,
                                    vcn_id=vcn.id,
                                    display_name="MyNSG")

# Allow inbound SSH (protocol "6" is TCP). The source 0.0.0.0/0 admits any
# address; narrow it to your management CIDR outside of test environments.
nsg_rule = oci.core.NetworkSecurityGroupSecurityRule("nsg_rule",
                                                     network_security_group_id=nsg.id,
                                                     direction="INGRESS",
                                                     protocol="6",
                                                     source="0.0.0.0/0",
                                                     source_type="CIDR_BLOCK",
                                                     tcp_options=oci.core.NetworkSecurityGroupSecurityRuleTcpOptionsArgs(
                                                         destination_port_range=oci.core.NetworkSecurityGroupSecurityRuleTcpOptionsDestinationPortRangeArgs(
                                                             min=22,
                                                             max=22
                                                         )))

# Provision an OKE cluster. kubernetes_version must be a version OKE currently
# offers in your region; v1.20.8 is old and may no longer be available.
cluster = oci.containerengine.Cluster("my_cluster",
                                      compartment_id=compartment_id,
                                      vcn_id=vcn.id,
                                      kubernetes_version="v1.20.8",
                                      options=oci.containerengine.ClusterOptionsArgs(
                                          service_lb_subnet_ids=[subnet.id]
                                      ))

# Create a Node Pool for the OKE cluster and attach the NSG to its nodes
node_pool = oci.containerengine.NodePool("my_node_pool",
                                         cluster_id=cluster.id,
                                         compartment_id=compartment_id,
                                         node_shape="VM.Standard2.4",
                                         node_config_details=oci.containerengine.NodePoolNodeConfigDetailsArgs(
                                             size=2,
                                             nsg_ids=[nsg.id],
                                             placement_configs=[
                                                 oci.containerengine.NodePoolNodeConfigDetailsPlacementConfigArgs(
                                                     availability_domain=availability_domain,
                                                     subnet_id=subnet.id
                                                 )
                                             ]
                                         ),
                                         node_source_details=oci.containerengine.NodePoolNodeSourceDetailsArgs(
                                             source_type="IMAGE",
                                             image_id="ocid1.image.oc1..exampleuniqueID"
                                         ),
                                         ssh_public_key="ssh-rsa AAAAB3Nza...yourkeyhere... user@example.com")

# Fetch the kubeconfig through the kube config data source and export it as a
# secret, together with the OCIDs of the nodes in the Node Pool.
kubeconfig = oci.containerengine.get_cluster_kube_config_output(cluster_id=cluster.id)
pulumi.export('kubeconfig', pulumi.Output.secret(kubeconfig.content))
pulumi.export('node_pool_ocids', node_pool.nodes.apply(lambda nodes: [node.id for node in nodes]))
```

This program sets up the infrastructure needed to host high-performance inference services on Oracle Cloud Infrastructure using Pulumi:

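- It creates a new Virtual Cloud Network with a single subnet, which serves as the network foundation for your OKE cluster. The program does not create an internet, NAT, or service gateway, so add gateways and route rules if your nodes need to reach endpoints outside the VCN, for example to pull container images.
- It establishes a Network Security Group with a rule that allows inbound SSH (TCP port 22) from any address (`0.0.0.0/0`), so your operations team can reach the worker nodes. Restrict the source CIDR to your management network before using this outside a test environment.
- It provisions a Kubernetes cluster managed by OKE, which is where you will deploy your inference services. The cluster is pinned to the version given in `kubernetes_version`; check that OKE still supports that version in your region and update it if it does not.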
- It configures a Node Pool with worker nodes of a specified shape (compute capacity). These nodes will join the Kubernetes cluster and run the pods that make up your services.
- It exports the kubeconfig needed to access your cluster with tools like `kubectl`, along with the OCIDs of the nodes in the Node Pool, so you can track the nodes running in your cluster.
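
To make the security rule more concrete, here is a simplified model of how an ingress rule with a protocol, a source CIDR, and a destination port range decides whether traffic is allowed. It is an illustration only, not how OCI implements rule evaluation, and `IngressRule` is a hypothetical class. The protocol strings follow IANA protocol numbers, which is why the program uses `"6"` for TCP.

```python
import ipaddress
from dataclasses import dataclass

# IANA protocol numbers as they appear in the `protocol` field of a rule.
PROTOCOLS = {"1": "ICMP", "6": "TCP", "17": "UDP"}


@dataclass(frozen=True)
class IngressRule:
    """Illustrative stand-in for one NSG ingress rule."""
    protocol: str
    source: str
    port_min: int
    port_max: int

    def allows(self, protocol: str, source_ip: str, dest_port: int) -> bool:
        """True if the protocol, source address, and port all match the rule."""
        return (
            protocol == self.protocol
            and ipaddress.ip_address(source_ip) in ipaddress.ip_network(self.source)
            and self.port_min <= dest_port <= self.port_max
        )


# The rule from the program: SSH from anywhere.
ssh_from_anywhere = IngressRule(protocol="6", source="0.0.0.0/0", port_min=22, port_max=22)
# A tighter variant that only admits sources inside the VCN range.
ssh_from_vcn = IngressRule(protocol="6", source="10.0.0.0/16", port_min=22, port_max=22)
```

The second rule shows the effect of narrowing `source`: the same port stays open, but only to addresses inside `10.0.0.0/16`.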

Remember to replace placeholder values (such as `ocid1.image.oc1..exampleuniqueID` and the SSH public key) with actual values from your OCI environment. To use this program, save it as `__main__.py` in a Pulumi Python project, configure the OCI provider with your credentials, and run `pulumi up` to provision the resources. You can then deploy your containerized inference services to the cluster.
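
As a starting point for that deployment step, the sketch below builds a Kubernetes Deployment and a LoadBalancer Service as plain dictionaries and prints them as JSON, which `kubectl` accepts. Everything specific here is an assumption for illustration: the helper name `inference_manifests`, the image reference, the container port `8080`, and the replica count of two (chosen to match the two-node pool).

```python
import json


def inference_manifests(name: str, image: str, port: int = 8080, replicas: int = 2) -> dict:
    """Return a v1 List holding a Deployment and a Service for one container."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {"name": name, "image": image, "ports": [{"containerPort": port}]}
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # On OKE, a LoadBalancer Service is backed by an OCI load balancer.
            "type": "LoadBalancer",
            "selector": labels,
            "ports": [{"port": 80, "targetPort": port}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    # Hypothetical image reference; replace it with your own inference image.
    manifests = inference_manifests("inference", "example.com/your-org/inference:latest")
    print(json.dumps(manifests, indent=2))
```

With `kubectl` pointed at the exported kubeconfig, the output can be piped straight in, for example `python manifests.py | kubectl apply -f -`, where `manifests.py` is whatever file name you save the sketch under.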