Using AWS EFS with AWS Batch
To set up an AWS environment with Elastic File System (EFS) and integrate it with AWS Batch, we need to accomplish a few steps:
- Create an EFS file system: This will be our shared file system that can be accessed by the AWS Batch instances.
- Set up AWS Batch compute environment: Compute environments are the AWS resources that are used to run the batch jobs. We need to define one that can access our EFS.
- Create a job queue: A job queue is where jobs are sent before they are assigned to run on compute resources.
- Define a job definition: This is a template that describes how batch jobs should be run.
The following Pulumi program, written in TypeScript, creates these resources. Make sure you have AWS credentials configured for Pulumi before running it:
```typescript
import * as aws from "@pulumi/aws";
import * as pulumi from "@pulumi/pulumi";

// Create an AWS EFS file system shared by the Batch jobs.
const fileSystem = new aws.efs.FileSystem("my-filesystem", {
    tags: {
        Name: "MyEFSFileSystem",
    },
});

// A Batch compute environment requires a service role to operate.
// This IAM role can be assumed by the AWS Batch service.
const serviceRole = new aws.iam.Role("batch-service-role", {
    assumeRolePolicy: aws.iam.assumeRolePolicyForPrincipal({
        Service: "batch.amazonaws.com",
    }),
});

// Grant the service role the managed AWS Batch permissions.
const serviceRoleAttachment = new aws.iam.RolePolicyAttachment("batch-service-role-attachment", {
    role: serviceRole,
    policyArn: aws.iam.ManagedPolicy.AWSBatchServiceRole,
});

// The EC2 instances that run the jobs need a separate role that EC2
// (not the Batch service) can assume, wrapped in an instance profile.
const instanceRole = new aws.iam.Role("batch-instance-role", {
    assumeRolePolicy: aws.iam.assumeRolePolicyForPrincipal({
        Service: "ec2.amazonaws.com",
    }),
});

const instanceRoleAttachment = new aws.iam.RolePolicyAttachment("batch-instance-role-attachment", {
    role: instanceRole,
    policyArn: aws.iam.ManagedPolicy.AmazonEC2ContainerServiceforEC2Role,
});

const instanceProfile = new aws.iam.InstanceProfile("batch-instance-profile", {
    role: instanceRole,
});

// The managed compute environment where jobs are run.
const computeEnvironment = new aws.batch.ComputeEnvironment("my-compute-environment", {
    type: "MANAGED",
    serviceRole: serviceRole.arn,
    computeResources: {
        type: "EC2",
        instanceRole: instanceProfile.arn,
        // Pick the desired instance types.
        instanceTypes: ["m4.large"],
        // The subnets must contain EFS mount targets for the file system.
        subnets: ["<subnet-id>"], // replace <subnet-id> with an actual subnet ID
        securityGroupIds: ["<security-group-id>"], // replace <security-group-id> with an actual security group ID
        minVcpus: 0,
        maxVcpus: 16,
    },
}, { dependsOn: [serviceRoleAttachment, instanceRoleAttachment] });

// The job queue that holds jobs until compute capacity is available.
const jobQueue = new aws.batch.JobQueue("my-job-queue", {
    state: "ENABLED",
    priority: 1,
    computeEnvironmentOrders: [{
        order: 1,
        computeEnvironment: computeEnvironment.arn,
    }],
});

// The job definition that describes how jobs should be run. Note that
// containerProperties must be a JSON string, and the EFS volume is
// declared via efsVolumeConfiguration with the file system ID (not its ARN).
const jobDefinition = new aws.batch.JobDefinition("my-job-definition", {
    type: "container",
    containerProperties: pulumi.jsonStringify({
        image: "my-docker-image", // replace this with your Docker image
        memory: 1024,
        vcpus: 1,
        volumes: [
            {
                name: "efsVolume",
                efsVolumeConfiguration: {
                    fileSystemId: fileSystem.id,
                },
            },
        ],
        mountPoints: [
            {
                containerPath: "/mnt/efs", // mount point inside the container
                sourceVolume: "efsVolume",
            },
        ],
    }),
});

// Export the file system ID and ARN as stack outputs.
export const fileSystemId = fileSystem.id;
export const fileSystemArn = fileSystem.arn;

// Export the job queue name and ARN as stack outputs.
export const jobQueueName = jobQueue.name;
export const jobQueueArn = jobQueue.arn;
```
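Note that EC2 instances can only reach the file system through an EFS mount target in their subnet, which the program above assumes already exists. If yours does not, here is a minimal sketch of one, plus a security group that admits NFS traffic; `<vpc-id>`, `<subnet-id>`, and the 10.0.0.0/16 CIDR are placeholders for your own network values:

```typescript
// A security group that allows inbound NFS traffic (port 2049) from the VPC.
const efsSecurityGroup = new aws.ec2.SecurityGroup("efs-sg", {
    vpcId: "<vpc-id>", // replace with the VPC that hosts your Batch subnets
    ingress: [{
        protocol: "tcp",
        fromPort: 2049, // NFS
        toPort: 2049,
        cidrBlocks: ["10.0.0.0/16"], // restrict to your VPC's CIDR block
    }],
    egress: [{
        protocol: "-1",
        fromPort: 0,
        toPort: 0,
        cidrBlocks: ["0.0.0.0/0"],
    }],
});

// One mount target per subnet used by the compute environment.
const mountTarget = new aws.efs.MountTarget("efs-mount-target", {
    fileSystemId: fileSystem.id,
    subnetId: "<subnet-id>", // the same subnet as the compute environment
    securityGroups: [efsSecurityGroup.id],
});
```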
The compute environment is the AWS Batch resource that defines the type of instances jobs run on, their networking configuration, and the minimum and maximum number of vCPUs Batch may provision for jobs.
The job queue is another AWS Batch resource; it receives submitted jobs and dispatches them to its compute environments in the order specified. The queue's priority determines precedence when several queues share a compute environment.
The job definition is a template for individual jobs: it names the Docker container image the jobs will use, sets the CPU and memory they require, and declares the EFS volume together with its mount point inside the container.
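Once the stack is deployed, jobs can be submitted against this queue and definition. As an illustration (separate from the Pulumi program above), here is a minimal sketch using the AWS SDK for JavaScript v3; the queue and definition names are placeholders that you would read from the stack outputs, since Pulumi appends a random suffix to physical resource names:

```typescript
import { BatchClient, SubmitJobCommand } from "@aws-sdk/client-batch";

// Submit a test job that writes a file to the shared EFS mount.
async function submitTestJob(): Promise<void> {
    const client = new BatchClient({});
    const response = await client.send(new SubmitJobCommand({
        jobName: "efs-smoke-test",
        jobQueue: "<job-queue-name>",           // e.g. the jobQueueName stack output
        jobDefinition: "<job-definition-name>", // name or ARN of the job definition
        containerOverrides: {
            // Override the image's default command to prove the mount works.
            command: ["sh", "-c", "echo hello > /mnt/efs/hello.txt"],
        },
    }));
    console.log(`Submitted job ${response.jobId}`);
}

submitTestJob().catch(console.error);
```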
Remember to replace placeholders like `<subnet-id>`, `<security-group-id>`, and `my-docker-image` with actual values that apply to your AWS environment and Docker image repository. After deploying this Pulumi program, you will have a full environment set up and ready to queue and run batch jobs with access to the same shared EFS file system.
To deploy the stack, navigate to the directory containing your `index.ts` file (which contains the above code) and run these Pulumi CLI commands:

```bash
pulumi stack init dev
pulumi up
```
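Once `pulumi up` completes, you can read the exported values back with the CLI, for example:

```bash
pulumi stack output fileSystemId
pulumi stack output jobQueueName
```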
Remember to review the plan that Pulumi presents and confirm the changes before proceeding with the actual deployment. When you're done, you can destroy all the resources to avoid incurring costs:
```bash
pulumi destroy
```