1. Scheduled Model Inference Jobs on Kubernetes


    To create scheduled model inference jobs on Kubernetes, we will use a CronJob resource. A CronJob creates Jobs on a time-based schedule, which is perfect for running recurring tasks such as model inference.

    Here's how you can achieve this with Pulumi and Kubernetes:

    1. Define the cron schedule: The schedule follows the standard cron format. For example, "*/5 * * * *" would represent every 5 minutes.
    2. Create a Job Template: This template specifies the Pods that will be created. It should contain the details about the container, including the image to use and the command to run.
    3. Configure the CronJob Spec: This includes the job template, the schedule, and other settings such as concurrencyPolicy, which determines how overlapping runs are handled, and successfulJobsHistoryLimit, which specifies how many successfully completed Jobs are kept.
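
    To make the cron format from step 1 concrete, here is a minimal sketch in plain Python that expands each field of a five-field schedule into the values it matches. The helper names (expand_cron_field, expand_schedule) are hypothetical and not part of Pulumi or Kubernetes, and the sketch only handles numeric fields with "*", ranges, lists, and steps. It ignores names such as MON or JAN and macros such as @hourly.

```python
from typing import Dict, List

# (field name, lowest value, highest value) for the five cron fields, in order.
CRON_FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 6),
]


def expand_cron_field(field: str, low: int, high: int) -> List[int]:
    """Expand one numeric cron field (e.g. "*/5", "0", "1-5", "0,30") into sorted values."""
    values = set()
    for part in field.split(","):
        # Split off an optional step, e.g. "*/5" -> base "*", step 5.
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if step < 1:
            raise ValueError(f"step must be positive in {part!r}")
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = int(base)
            # "N/step" runs from N to the field maximum; a bare "N" is just N.
            end = high if step_text else start
        if not (low <= start <= end <= high):
            raise ValueError(f"{part!r} is outside the range {low}-{high}")
        values.update(range(start, end + 1, step))
    return sorted(values)


def expand_schedule(schedule: str) -> Dict[str, List[int]]:
    """Expand a five-field cron schedule into the values matched by each field."""
    parts = schedule.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("expected five space-separated fields")
    return {
        name: expand_cron_field(part, low, high)
        for part, (name, low, high) in zip(parts, CRON_FIELDS)
    }
```

    For "*/5 * * * *" the minute field expands to 0, 5, 10, and so on up to 55, with every hour matched. For "0 */1 * * *" the minute field is just 0 and every hour is matched, which is why that schedule fires at the top of each hour.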

    Now, let's write a Pulumi program that defines a CronJob for model inference using a container image:

        import pulumi
        import pulumi_kubernetes as k8s

        # Define the container to use for the job.
        # Replace 'your-container-image' with the actual image you intend to use and
        # provide the necessary commands to carry out the model inference.
        container = k8s.core.v1.ContainerArgs(
            name="model-inference",
            image="your-container-image",
            command=["/bin/sh", "-c", "command-to-run-model-inference"],
        )

        # Define the job spec that will be used by the CronJob.
        job_spec = k8s.batch.v1.JobSpecArgs(
            template=k8s.core.v1.PodTemplateSpecArgs(
                spec=k8s.core.v1.PodSpecArgs(
                    containers=[container],
                    restart_policy="OnFailure",  # Restart policy for all containers within the pod
                )
            )
        )

        # Define the CronJob resource, establishing the schedule and the job template.
        model_inference_cronjob = k8s.batch.v1.CronJob(
            "model-inference-cronjob",
            metadata=k8s.meta.v1.ObjectMetaArgs(name="model-inference"),
            spec=k8s.batch.v1.CronJobSpecArgs(
                schedule="0 */1 * * *",  # This will run the job at the top of every hour.
                job_template=k8s.batch.v1.JobTemplateSpecArgs(
                    spec=job_spec
                ),
                # Additional optional configuration:
                # starting_deadline_seconds specifies the deadline in seconds for starting the job
                #   if it misses its scheduled time for any reason.
                # concurrency_policy specifies how to treat concurrent executions of a Job.
                # successful_jobs_history_limit specifies how many successfully completed jobs should be kept.
            ),
        )

        # Export the name of the cronjob
        pulumi.export('cronjob_name', model_inference_cronjob.metadata["name"])

    This program creates a Kubernetes CronJob resource using Pulumi. It runs the specified container at the top of every hour, so the container image must include your model inference code and everything it needs to run.

    Adjust the schedule to fit the frequency you need. The restart_policy is set to "OnFailure" so that a container is restarted only when it exits with an error. You might also want to set concurrency_policy (the Kubernetes concurrencyPolicy field) and successful_jobs_history_limit (successfulJobsHistoryLimit) to control how overlapping runs are handled and how many successfully completed Jobs are retained, respectively. starting_deadline_seconds (startingDeadlineSeconds) can also be set to limit how many seconds after a missed scheduled time the Job may still be started.
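
    As a rough sketch of how these optional settings could be collected and checked before they reach the CronJob spec, the helper below builds a dictionary of keyword arguments using Pulumi's Python (snake_case) names. The function name optional_cronjob_settings and its default of "Forbid" are assumptions made for illustration. Kubernetes itself defaults to "Allow" and keeps 3 successful Jobs.

```python
from typing import Any, Dict, Optional

# The three values Kubernetes accepts for concurrencyPolicy.
VALID_CONCURRENCY_POLICIES = ("Allow", "Forbid", "Replace")


def optional_cronjob_settings(
    concurrency_policy: str = "Forbid",  # assumption: skip a run while the previous one is still going
    successful_jobs_history_limit: int = 3,
    starting_deadline_seconds: Optional[int] = None,
) -> Dict[str, Any]:
    """Return validated optional CronJob settings as keyword arguments (hypothetical helper)."""
    if concurrency_policy not in VALID_CONCURRENCY_POLICIES:
        raise ValueError(
            f"concurrency_policy must be one of {VALID_CONCURRENCY_POLICIES}, "
            f"got {concurrency_policy!r}"
        )
    if successful_jobs_history_limit < 0:
        raise ValueError("successful_jobs_history_limit must not be negative")

    settings: Dict[str, Any] = {
        "concurrency_policy": concurrency_policy,
        "successful_jobs_history_limit": successful_jobs_history_limit,
    }
    # Leave the deadline out entirely when it is not set, so no deadline applies.
    if starting_deadline_seconds is not None:
        if starting_deadline_seconds < 0:
            raise ValueError("starting_deadline_seconds must not be negative")
        settings["starting_deadline_seconds"] = starting_deadline_seconds
    return settings


# Intended use inside the Pulumi program (not executed here):
#   spec=k8s.batch.v1.CronJobSpecArgs(
#       schedule="0 */1 * * *",
#       job_template=k8s.batch.v1.JobTemplateSpecArgs(spec=job_spec),
#       **optional_cronjob_settings(starting_deadline_seconds=300),
#   )
```

    "Forbid" is a common choice for inference jobs that should not overlap, while "Replace" cancels the running Job in favor of the new one. Which one fits depends on whether a late result is still useful to you.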

    Remember to set image to your own container image and to replace "command-to-run-model-inference" with the actual command that triggers the inference in your container.

    After you deploy this Pulumi program with the Pulumi CLI, it will create the CronJob in your Kubernetes cluster, and the inference job will run at the scheduled times.