Containerized Jupyter Notebooks for Data Science on AWS ECS.
To create containerized Jupyter Notebooks for data science on AWS ECS (Elastic Container Service), we will need to set up the following components:
- Amazon ECR (Elastic Container Registry): Stores the Docker image of the Jupyter Notebook. We will create a repository in ECR and push our Jupyter Notebook image to it.
- AWS ECS Task Definition: A blueprint for our application; it describes which Docker container to run, its memory and CPU requirements, and more.
- AWS ECS Cluster: A grouping of EC2 instances or Fargate tasks where our Jupyter service will run.
- AWS ECS Service: Maintains a specified number of running instances of the task definition in the ECS cluster.
- Amazon VPC (Virtual Private Cloud): Provides networking for our ECS service; a Security Group in the VPC defines which network traffic is allowed.
- AWS IAM Roles: Give ECS tasks permissions to access AWS resources.
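To make the task definition "blueprint" from the list above more concrete, here is a minimal sketch of the container definition JSON that ECS expects. The helper name `jupyter_container_definitions`, the default sizes, and the example image name are illustrative assumptions, not part of the original program.

```python
import json


def jupyter_container_definitions(image, port=8888, cpu=256, memory=512):
    """Return a JSON string describing one Jupyter container for an ECS task.

    Hypothetical helper: the name and default values are assumptions.
    """
    definitions = [{
        "name": "jupyter",
        "image": image,
        "essential": True,   # the task stops if this container stops
        "cpu": cpu,          # CPU units; 1024 units correspond to 1 vCPU
        "memory": memory,    # hard memory limit in MiB
        "portMappings": [{
            "containerPort": port,
            "hostPort": port,
            "protocol": "tcp",
        }],
    }]
    return json.dumps(definitions)


if __name__ == "__main__":
    # The image name is a placeholder for wherever your image is stored.
    print(jupyter_container_definitions("example/jupyter:latest"))
```

A string of this shape can be passed wherever ECS asks for container definitions; the values for CPU and memory should be sized for your notebooks.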
The program below creates the resources you need to run Jupyter Notebooks on AWS ECS.
Please replace the placeholders (like <ACCOUNT_ID>, <REGION>, etc.) with your actual AWS account, region, and preferred names as appropriate.

    import json

    import pulumi
    import pulumi_aws as aws

    # Create an ECR repository to store our Jupyter container image.
    jupyter_ecr_repo = aws.ecr.Repository("jupyter-notebook-repo")

    # Create an ECS cluster to host our services. With the EC2 launch type,
    # EC2 container instances must be registered with this cluster separately.
    ecs_cluster = aws.ecs.Cluster("jupyter-cluster")

    # Trust policy that lets ECS tasks assume the roles below.
    assume_role_policy = json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "ecs-tasks.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    })

    # Define execution and task roles for the ECS task.
    ecs_execution_role = aws.iam.Role("ecs-execution-role", assume_role_policy=assume_role_policy)
    task_role = aws.iam.Role("ecs-task-role", assume_role_policy=assume_role_policy)

    # Attach necessary policies to the execution role.
    ecs_execution_role_policy_attachment = aws.iam.RolePolicyAttachment(
        "ecs-execution-role-policy-attachment",
        role=ecs_execution_role.name,
        policy_arn="arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy")

    # Define a Security Group to allow traffic on port 8888 for Jupyter.
    # In bridge network mode it has to be attached to the EC2 container instances.
    security_group = aws.ec2.SecurityGroup(
        "jupyter-sg",
        description="Allow traffic for Jupyter",
        egress=[{"protocol": "-1", "from_port": 0, "to_port": 0, "cidr_blocks": ["0.0.0.0/0"]}],
        ingress=[{"protocol": "tcp", "from_port": 8888, "to_port": 8888, "cidr_blocks": ["0.0.0.0/0"]}])

    # Define the ECS task definition with a single container running Jupyter.
    # repository_url already includes the registry host, so only the tag is appended.
    container_definitions = jupyter_ecr_repo.repository_url.apply(
        lambda url: json.dumps([{
            "name": "jupyter",
            "image": f"{url}:latest",
            "essential": True,
            "memory": 512,
            "cpu": 1,
            "portMappings": [{"containerPort": 8888, "hostPort": 8888, "protocol": "tcp"}],
        }]))
    task_definition = aws.ecs.TaskDefinition(
        "jupyter-task-def",
        family="jupyter",
        container_definitions=container_definitions,
        network_mode="bridge",
        requires_compatibilities=["EC2"],
        execution_role_arn=ecs_execution_role.arn,
        task_role_arn=task_role.arn)

    # Create the ECS service to run and manage the Jupyter tasks.
    ecs_service = aws.ecs.Service(
        "jupyter-service",
        cluster=ecs_cluster.arn,
        task_definition=task_definition.arn,
        desired_count=1,
        launch_type="EC2")

    # Export the repository URL and the URL where Jupyter will be accessible.
    # The host is the public IP of the container instance running the task, which
    # this program does not create, so it is left as a placeholder to replace.
    pulumi.export("ecr_repository_url", jupyter_ecr_repo.repository_url)
    pulumi.export("jupyter_notebook_url", "http://<EC2_INSTANCE_PUBLIC_IP>:8888")

This program creates each of the AWS resources necessary to run a Jupyter Notebook server containerized in Docker on ECS:
- An ECR repository to store the Docker image for your Jupyter Notebook server.
- An ECS cluster where your Jupyter service will run.
- An IAM execution role and an IAM task role, with the AmazonECSTaskExecutionRolePolicy managed policy attached to the execution role.
- A Security Group to allow traffic on port 8888, the default port for Jupyter Notebook.
- An ECS task definition that describes your container, including its image, CPU and memory requirements, and port mappings so that it can be reached over the network on the specified port.
- An ECS service that keeps the desired number of tasks from your task definition running in the cluster.
Finally, it exports a jupyter_notebook_url output: the address at which you can reach your Jupyter Notebook on port 8888 once the service has been deployed and the tasks are running.
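The security group in the program opens port 8888 to 0.0.0.0/0, which exposes the notebook server to the whole internet. As a hedged sketch, the ingress rule could instead be limited to a single known address range. The helper name `jupyter_ingress_rule` and the example CIDR (taken from a documentation-only address range) are assumptions for illustration.

```python
import ipaddress


def jupyter_ingress_rule(allowed_cidr, port=8888):
    """Build a security-group ingress rule dict limited to one IPv4 CIDR block.

    Hypothetical helper; raises ValueError if the CIDR is malformed,
    for example when host bits are set ("203.0.113.7/24").
    """
    network = ipaddress.IPv4Network(allowed_cidr)
    return {
        "protocol": "tcp",
        "from_port": port,
        "to_port": port,
        "cidr_blocks": [str(network)],
    }


if __name__ == "__main__":
    # Example: allow only one workstation address.
    print(jupyter_ingress_rule("203.0.113.7/32"))
```

The returned dict has the same keys as the ingress entry in the program, so it could be used in its place in the `ingress` list.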
Please remember that the ECS service can only start containers once the Jupyter Docker image has been built and pushed to the ECR repository. Because that repository is created by this code, deploy the Pulumi stack first, then build and push the image; the ECS service will then be able to pull the image and run containers from it.
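As a rough sketch of the build-and-push step, the snippet below assembles the image URI and the usual AWS CLI and Docker commands from the account, region, and repository name. The helper names and the example values are assumptions; the repository name should be the one ECR actually reports, since Pulumi appends a random suffix to resource names by default.

```python
def ecr_registry(account_id, region):
    """Return the ECR registry host for an account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Return the full image URI used for docker build and push."""
    return f"{ecr_registry(account_id, region)}/{repository}:{tag}"


def push_commands(account_id, region, repository, tag="latest"):
    """Return the shell commands to log in, build, and push (hypothetical helper)."""
    registry = ecr_registry(account_id, region)
    uri = ecr_image_uri(account_id, region, repository, tag)
    return [
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        f"docker build -t {uri} .",
        f"docker push {uri}",
    ]


if __name__ == "__main__":
    # Example values only; substitute your own <ACCOUNT_ID>, <REGION>, and repository name.
    for command in push_commands("123456789012", "us-east-1", "jupyter-notebook-repo"):
        print(command)
```

Run the printed commands from the directory that contains your Dockerfile.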