Serverless Deep Learning with AWS Fargate on EKS Clusters
To create a serverless deep learning environment using AWS Fargate on Amazon Elastic Kubernetes Service (EKS), we'll define a combination of AWS services with Pulumi's Infrastructure as Code (IaC) approach in Python. Here's an overview of what we need to do:
- Set up an EKS cluster: Amazon Elastic Kubernetes Service (EKS) is Amazon's managed Kubernetes service, which lets you run Kubernetes on AWS without managing the underlying infrastructure.
- Configure a Fargate profile for EKS: AWS Fargate is a serverless compute engine for containers that works with both Amazon Elastic Container Service (ECS) and EKS. We'll use it to run our containers without having to manage or scale a fleet of virtual machines.
- Define an ECR repository: Amazon Elastic Container Registry (ECR) is a container image registry service. We'll use it to store our deep learning container images.
- Deploy a Kubernetes application: We'll create a Kubernetes Deployment and Service to run our deep learning model in a serverless manner on EKS with Fargate.
Now, let's build this infrastructure with Pulumi in Python:
```python
import json

import pulumi
import pulumi_aws as aws
import pulumi_awsx as awsx
import pulumi_eks as eks
import pulumi_kubernetes as k8s

# Step 1: Create an EKS Cluster.
# A dedicated VPC gives the Fargate profile the private subnets it requires.
vpc = awsx.ec2.Vpc('deep-learning-vpc')

cluster = eks.Cluster('deep-learning-eks',
                      vpc_id=vpc.vpc_id,
                      private_subnet_ids=vpc.private_subnet_ids,
                      public_subnet_ids=vpc.public_subnet_ids,
                      create_oidc_provider=True,
                      # No EC2 node group: all workloads will run on Fargate.
                      skip_default_node_group=True)

# Step 2: Configure a Fargate Profile for EKS.
# Fargate pods require an execution role trusted by eks-fargate-pods.amazonaws.com.
fargate_role = aws.iam.Role('fargate-pod-execution-role',
    assume_role_policy=json.dumps({
        'Version': '2012-10-17',
        'Statement': [{
            'Effect': 'Allow',
            'Principal': {'Service': 'eks-fargate-pods.amazonaws.com'},
            'Action': 'sts:AssumeRole',
        }],
    }))

aws.iam.RolePolicyAttachment('fargate-pod-execution-policy',
    role=fargate_role.name,
    policy_arn='arn:aws:iam::aws:policy/AmazonEKSFargatePodExecutionRolePolicy')

# This profile schedules every pod in the 'default' namespace that carries the
# label 'runtime: fargate' onto Fargate.
fargate_profile = aws.eks.FargateProfile('deep-learning-fargate-profile',
    cluster_name=cluster.eks_cluster.name,
    pod_execution_role_arn=fargate_role.arn,
    subnet_ids=vpc.private_subnet_ids,
    selectors=[aws.eks.FargateProfileSelectorArgs(
        namespace='default',
        labels={'runtime': 'fargate'},
    )])

# Step 3: Define an ECR Repository.
# The deep learning application deployed below needs a Docker image stored in ECR.
ecr_repo = aws.ecr.Repository('deep-learning-repo')

# For simplicity, we define placeholder values for the deep learning application.
# In a real-world scenario, you would build and push a Docker image containing
# the code for your deep learning model to this ECR repository.
app_name = 'dl-app'
app_tag = 'v1'  # Placeholder image tag

# Step 4: Deploy a Kubernetes Application.
# The pod labels must match the Fargate profile's selector so the deep learning
# pods run serverlessly on Fargate.
app_labels = {'app': app_name, 'runtime': 'fargate'}

k8s_provider = k8s.Provider('deep-learning-k8s',
    # The kubeconfig output is a dict; the provider expects a string.
    kubeconfig=cluster.kubeconfig.apply(json.dumps))

app_deployment = k8s.apps.v1.Deployment('deep-learning-app-deployment',
    spec=k8s.apps.v1.DeploymentSpecArgs(
        replicas=1,
        selector=k8s.meta.v1.LabelSelectorArgs(match_labels=app_labels),
        template=k8s.core.v1.PodTemplateSpecArgs(
            metadata=k8s.meta.v1.ObjectMetaArgs(labels=app_labels),
            spec=k8s.core.v1.PodSpecArgs(containers=[
                k8s.core.v1.ContainerArgs(
                    name=app_name,
                    # Normally you would reference your own deep learning image here.
                    image=ecr_repo.repository_url.apply(lambda url: f'{url}:{app_tag}'),
                    resources=k8s.core.v1.ResourceRequirementsArgs(
                        requests={'cpu': '2', 'memory': '8Gi'},
                        limits={'cpu': '4', 'memory': '16Gi'}),
                    ports=[k8s.core.v1.ContainerPortArgs(container_port=80)],
                )]))),
    opts=pulumi.ResourceOptions(provider=k8s_provider, depends_on=[fargate_profile]))

# Expose an endpoint for the application through an AWS load balancer.
app_service = k8s.core.v1.Service('deep-learning-app-service',
    spec=k8s.core.v1.ServiceSpecArgs(
        type='LoadBalancer',
        selector=app_labels,
        ports=[k8s.core.v1.ServicePortArgs(port=80, target_port=80)]),
    opts=pulumi.ResourceOptions(provider=k8s_provider))

# Export the cluster kubeconfig and the application endpoint for easy access.
pulumi.export('kubeconfig', cluster.kubeconfig)
pulumi.export('app_endpoint', app_service.status.apply(
    lambda s: s.load_balancer.ingress[0].hostname
    if s.load_balancer and s.load_balancer.ingress else None))
```
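Once the stack is up, the exported kubeconfig can be used to confirm the pods landed on Fargate. A hedged sketch (the output name `kubeconfig` matches the export above; the file name is illustrative):

```shell
# Write the exported kubeconfig to a local file.
pulumi stack output kubeconfig > kubeconfig.json

# Fargate-scheduled pods run on dedicated "fargate-*" nodes, so listing nodes
# and pods confirms that the profile's selector matched.
kubectl --kubeconfig kubeconfig.json get nodes
kubectl --kubeconfig kubeconfig.json get pods -n default -o wide
```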
In this Pulumi program:
- We create an EKS cluster to run our workloads using the `eks.Cluster` class.
- We add a Fargate profile targeting a specific namespace and pod labels using `aws.eks.FargateProfile`, together with the IAM pod execution role that Fargate requires.
- We declare an ECR repository to store our container images using `aws.ecr.Repository`.
- We simulate deploying a deep learning application as a Kubernetes `Deployment` exposed through a `LoadBalancer` `Service`, though we've used placeholders for the container image, as you'd typically build and push your image to ECR.
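Building and pushing that image to the ECR repository typically follows the standard Docker/ECR flow. A hedged sketch, where the account ID, region, and repository name are placeholders — the real repository URL comes from the `repository_url` output of the `aws.ecr.Repository` resource (Pulumi also appends a random suffix to resource names):

```shell
# Authenticate Docker to ECR (account ID and region are placeholders).
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Build the deep learning image, then tag and push it against the repository
# URL created by Pulumi, using the same tag the program references ('v1').
docker build -t deep-learning-image .
docker tag deep-learning-image 123456789012.dkr.ecr.us-east-1.amazonaws.com/deep-learning-repo:v1
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/deep-learning-repo:v1
```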
Run the Pulumi program with `pulumi up`. After completing the deployment, you'll get the Kubernetes kubeconfig and the application endpoint, which you can use to interact with the deployed application.
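For example, once the `app_endpoint` output resolves, a small client can send inference requests to the service. This is a sketch that assumes the container serves a JSON `/predict` route on port 80 — the route name and payload shape are illustrative, not something the Pulumi program defines:

```python
import json
import urllib.request


def predict(endpoint: str, payload: dict, timeout: float = 10.0) -> dict:
    """POST a JSON payload to the model endpoint and return the JSON response.

    `endpoint` is the load balancer hostname exported as `app_endpoint`;
    the `/predict` route is an assumption about the container's API.
    """
    req = urllib.request.Request(
        f"http://{endpoint}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would look like `predict("<app_endpoint hostname>", {"inputs": [1.0, 2.0]})`, with the payload shaped to whatever your model container expects.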