Run DeepSeek-R1 on AWS EC2 Using Ollama

Posted on

This weekend, my “for you” page on all of my social media accounts was filled with only one thing: DeepSeek. DeepSeek really managed to shake up the AI community with a series of very strong language models like DeepSeek R1.

But why? The answer is simple: DeepSeek entered the market as an open-source (MIT license) project with excellent performance and reasoning capabilities.

The company behind DeepSeek

DeepSeek is a Chinese AI startup founded in 2023 by Lian Wenfeng. One interesting fact about DeepSeek is that the cost of training and developing DeepSeek’s models was only a fraction of what OpenAI or Meta spent on their models.

This on its own sparked a lot of interest and curiosity in the AI community. DeepSeek R1 is near or even better than its rival models on some of the important benchmarks like AIME 2024 for mathematics, Codeforces for coding, and MMUL for general knowledge.

img_1.png

Mathematics: AIME 2024 & MATH-500

DeepSeek-R1 shows robust multi-step reasoning, scoring 79.8% on AIME 2024, edging out OpenAI o1-1217 at 79.2%. On MATH-500—which tests a wide range of high-school-level problems—DeepSeek-R1 again leads with 97.3%, slightly above OpenAI o1-1217’s 96.4%.

Coding: Codeforces & SWE-bench verified

In algorithmic reasoning (Codeforces), OpenAI o1-1217 stands at 96.6%, marginally ahead of DeepSeek-R1’s 96.3%. Yet on SWE-bench Verified, which focuses on software engineering reasoning, DeepSeek-R1 scores 49.2%, surpassing OpenAI o1-1217’s 48.9% and showcasing strong software verification capabilities.

General knowledge: GPQA Diamond & MMLU

OpenAI o1-1217 excels in factual queries (GPQA Diamond) with 75.7%, outperforming DeepSeek-R1 at 71.5%. For broader academic coverage (MMLU), the margin is still tight: 91.8% (OpenAI o1-1217) vs. 90.8% (DeepSeek-R1), indicating near-parity in multitask language understanding.

DeepSeek R1 model

DeepSeek R1 is a large language model developed with a strong focus on reasoning tasks. It excels at problems requiring multi-step analysis and logical thinking. Unlike typical models that rely heavily on Supervised Fine-Tuning (SFT), DeepSeek R1 uses Reinforcement Learning (RL) as its primary training strategy. This emphasis on RL empowers it to figure out solutions with greater independence.

What Are Distilled models?

Besides the main model, DeepSeek AI has introduced distilled versions in various parameter sizes—1.5B, 7B, 8B, 14B, 32B, and 70B. These distilled models draw on Qwen and Llama architectures, preserving much of the original model’s reasoning capabilities while being more accessible for personal computer use.

Notably, the 8B and smaller models can operate on standard CPUs, GPUs, or Apple Silicon machines, making them convenient for anyone interested in experimenting at home.

That’s why I decided to run DeepSeek on an AWS EC2 instance using Pulumi. I wanted to see how easy it is to set up and run DeepSeek on the cloud using Infrastructure as Code (IaC). So, let’s get started!

Setting up the environment

Prerequisites

Before we start, make sure you have the following prerequisites:

What Is Ollama?

img_2.png

Ollama allows you to run and manage large language models (LLMs) on your own computer. By simplifying the process of downloading, running, and using these models. It supports macOS, Linux, and Windows, making it accessible across different operating systems. Ollama is easy to use. It has simple commands to pull, run, and manage models.

In addition to local usage, Ollama provides an API for integrating LLMs into other applications. An experimental compatibility layer with the OpenAI API means many existing OpenAI-compatible tools can now work with a local Ollama server. It can leverage GPUs for faster processing and includes features like custom model creation and sharing.

Ollama provides strong support for many large language models such as Llama 2, Code Llama, or in our case DeepSeek R1, granting users secure, private, and local access. It offers GPU acceleration on macOS and Linux and provides libraries for Python and JavaScript.

Running DeepSeek on AWS EC2

img_4.png

First, we need to create a new Pulumi project. You can do this by running the following command:

# Select your preferred language (e.g., typescript, python, go, etc.)
pulumi new aws-<language>

Please choose the language you are most comfortable with.

This will create a new Pulumi project with the necessary files and configurations and a sample code. In our example code, it will also install the AWS provider for you.

Since you will not be using the sample code, feel free to delete it. After that, you can copy and paste the following code snippets into your Pulumi project.

Create an instance role with S3 access

To download the NVIDIA drivers needed to create an instance role with S3 access. Copy the following code to your Pulumi project:

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";
import * as fs from "fs";

const role = new aws.iam.Role("deepSeekRole", {
    name: "deepseek-role",
    assumeRolePolicy: JSON.stringify({
        Version: "2012-10-17",
        Statement: [
            {
                Action: "sts:AssumeRole",
                Effect: "Allow",
                Principal: {
                    Service: "ec2.amazonaws.com",
                },
            },
        ],
    }),
});

new aws.iam.RolePolicyAttachment("deepSeekS3Policy", {
    policyArn: "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
    role: role.name,
});

const instanceProfile = new aws.iam.InstanceProfile("deepSeekProfile", {
    name: "deepseek-profile",
    role: role.name,
});
import pulumi
import pulumi_aws as aws
import json
import os

# IAM Role for EC2 instances
role = aws.iam.Role(
    "deepSeekRole",
    name="deepseek-role",
    assume_role_policy=json.dumps(
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "sts:AssumeRole",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "ec2.amazonaws.com",
                    },
                }
            ],
        }
    ),
)

# Attach S3 read-only policy to the IAM Role
iam_policy_attachment = aws.iam.RolePolicyAttachment(
    "deepSeekS3Policy",
    policy_arn="arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
    role=role.name,
)

# Instance Profile containing the IAM Role
instance_profile = aws.iam.InstanceProfile(
    "deepSeekProfile", name="deepseek-profile", role=role.name
)
package main

import (
	"encoding/json"
	"os"

	"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/ec2"
	"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/iam"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		rolePolicy, err := json.Marshal(map[string]interface{}{
			"Version": "2012-10-17",
			"Statement": []map[string]interface{}{
				{
					"Action":    "sts:AssumeRole",
					"Effect":    "Allow",
					"Principal": map[string]interface{}{"Service": "ec2.amazonaws.com"},
				},
			},
		})
		if err != nil {
			return err
		}

		role, err := iam.NewRole(ctx, "deepSeekRole", &iam.RoleArgs{
			Name:             pulumi.String("deepseek-role"),
			AssumeRolePolicy: pulumi.String(rolePolicy),
		})
		if err != nil {
			return err
		}

		// Attach S3 read-only policy to the IAM Role
		_, err = iam.NewRolePolicyAttachment(ctx, "deepSeekS3Policy", &iam.RolePolicyAttachmentArgs{
			PolicyArn: pulumi.String("arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"),
			Role:      role.Name,
		})
		if err != nil {
			return err
		}

		// Instance Profile containing the IAM Role
		instanceProfile, err := iam.NewInstanceProfile(ctx, "deepSeekProfile", &iam.InstanceProfileArgs{
			Name: pulumi.String("deepseek-profile"),
			Role: role.Name,
		})
		if err != nil {
			return err
		}
		return nil
	})
}
using Pulumi;
using Pulumi.Aws.Ec2;
using Pulumi.Aws.Iam;
using System.Collections.Generic;
using System.IO;
using Pulumi.Aws.Ec2.Inputs;
using System.Threading.Tasks;
using System.Text.Json;
class MyStack: Stack
{
    public MyStack()
        {
        {
            // IAM Role for EC2 instances
            var rolePolicy = new Dictionary < string,
                object >
                {
                    {
                        "Version",
                        "2012-10-17"
                    },
                    {
                        "Statement",
                        new []
                        {
                            new Dictionary < string, object >
                            {
                                {
                                    "Action",
                                    "sts:AssumeRole"
                                },
                                {
                                    "Effect",
                                    "Allow"
                                },
                                {
                                    "Principal",
                                    new Dictionary < string,
                                    string >
                                    {
                                        {
                                            "Service",
                                            "ec2.amazonaws.com"
                                        }
                                    }
                                }
                            }
                        }
                    }
                };
            var role = new Role("deepSeekRole", new RoleArgs
            {
                Name = "deepseek-role",
                    AssumeRolePolicy = JsonSerializer.Serialize(rolePolicy)
            });
            // Attach S3 read-only policy to the IAM Role
            var rolePolicyAttachment = new RolePolicyAttachment("deepSeekS3Policy", new RolePolicyAttachmentArgs
            {
                PolicyArn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
                    Role = role.Name
            });
            // Instance Profile containing the IAM Role
            var instanceProfile = new InstanceProfile("deepSeekProfile", new InstanceProfileArgs
            {
                Name = "deepseek-profile",
                    Role = role.Name
            });
}
class Program
{
    static Task < int > Main() => Deployment.RunAsync < MyStack > ();
}
name: deepseek-ollama-yaml
description: DeepSeek Ollama AWS example
runtime: yaml

variables:
  publicKey:
    fn::readFile: ./deepseek.rsa
  userData:
    fn::readFile: ./cloud-init.yaml
  amiFilter: "amzn2-ami-hvm-*-x86_64-gp2"
  amiOwner: "137112412989"
  amiId:
    fn::invoke:
      function: aws:ec2:getAmi
      arguments:
        filters:
          - name: name
            values: ["${amiFilter}"]
        owners: ["${amiOwner}"]
        mostRecent: true
      return: id

resources:
  deepSeekRole:
    type: aws:iam:Role
    properties:
      name: deepseek-role
      assumeRolePolicy: |
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "sts:AssumeRole",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "ec2.amazonaws.com"
                    }
                }
            ]
        }        

  deepSeekS3Policy:
    type: aws:iam:RolePolicyAttachment
    properties:
      role: ${deepSeekRole.name}
      policyArn: arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess

  deepSeekProfile:
    type: aws:iam:InstanceProfile
    properties:
      name: deepseek-profile
      role: ${deepSeekRole.name}

Create the network

Next, we need to create a VPC, subnet, Internet Gateway, and route table. Copy the following code to your Pulumi project:

const vpc = new aws.ec2.Vpc("deepSeekVpc", {
    cidrBlock: "10.0.0.0/16",
    enableDnsHostnames: true,
    enableDnsSupport: true,
});

const subnet = new aws.ec2.Subnet("deepSeekSubnet", {
    vpcId: vpc.id,
    cidrBlock: "10.0.48.0/20",
    availabilityZone: pulumi.interpolate`${aws.getAvailabilityZones().then(it => it.names[0])}`,
    mapPublicIpOnLaunch: true,
});

const internetGateway = new aws.ec2.InternetGateway("deepSeekInternetGateway", {
    vpcId: vpc.id,
});

const routeTable = new aws.ec2.RouteTable("deepSeekRouteTable", {
    vpcId: vpc.id,
    routes: [
        {
            cidrBlock: "0.0.0.0/0",
            gatewayId: internetGateway.id,
        },
    ],
});

const routeTableAssociation = new aws.ec2.RouteTableAssociation("deepSeekRouteTableAssociation", {
    subnetId: subnet.id,
    routeTableId: routeTable.id,
});

const securityGroup = new aws.ec2.SecurityGroup("deepSeekSecurityGroup", {
    vpcId: vpc.id,
    egress: [
        {
            fromPort: 0,
            toPort: 0,
            protocol: "-1",
            cidrBlocks: ["0.0.0.0/0"],
        },
    ],
    ingress: [
        {
            fromPort: 22,
            toPort: 22,
            protocol: "tcp",
            cidrBlocks: ["0.0.0.0/0"],
        },
        {
            fromPort: 3000,
            toPort: 3000,
            protocol: "tcp",
            cidrBlocks: ["0.0.0.0/0"],
        },
        {
            fromPort: 11434,
            toPort: 11434,
            protocol: "tcp",
            cidrBlocks: ["0.0.0.0/0"],
        },
    ],
});

# Create a VPC
vpc = aws.ec2.Vpc(
    "deepSeekVpc",
    cidr_block="10.0.0.0/16",
    enable_dns_hostnames=True,
    enable_dns_support=True,
)

# Create a subnet
subnet = aws.ec2.Subnet(
    "deepSeekSubnet",
    vpc_id=vpc.id,
    cidr_block="10.0.48.0/20",
    availability_zone="eu-central-1a",
    map_public_ip_on_launch=True,
)

# Create an internet gateway
internet_gateway = aws.ec2.InternetGateway("deepSeekInternetGateway", vpc_id=vpc.id)

# Create a route table and route table association
route_table = aws.ec2.RouteTable(
    "deepSeekRouteTable",
    vpc_id=vpc.id,
    routes=[
        aws.ec2.RouteTableRouteArgs(
            cidr_block="0.0.0.0/0", gateway_id=internet_gateway.id
        )
    ],
)

route_table_association = aws.ec2.RouteTableAssociation(
    "deepSeekRouteTableAssociation", subnet_id=subnet.id, route_table_id=route_table.id
)

# Create a security group
security_group = aws.ec2.SecurityGroup(
    "deepSeekSecurityGroup",
    vpc_id=vpc.id,
    egress=[
        {
            "from_port": 0,
            "to_port": 0,
            "protocol": "-1",
            "cidr_blocks": ["0.0.0.0/0"],
        }
    ],
    ingress=[
        {
            "from_port": 22,
            "to_port": 22,
            "protocol": "tcp",
            "cidr_blocks": ["0.0.0.0/0"],
        },
        {
            "from_port": 3000,
            "to_port": 3000,
            "protocol": "tcp",
            "cidr_blocks": ["0.0.0.0/0"],
        },
        {
            "from_port": 11434,
            "to_port": 11434,
            "protocol": "tcp",
            "cidr_blocks": ["0.0.0.0/0"],
        },
    ],
)
package main

import (
	"encoding/json"
	"os"

	"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/ec2"
	"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/iam"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
                // omitted for brevity
		}

		// Create a VPC
		vpc, err := ec2.NewVpc(ctx, "deepSeekVpc", &ec2.VpcArgs{
			CidrBlock:          pulumi.String("10.0.0.0/16"),
			EnableDnsHostnames: pulumi.Bool(true),
			EnableDnsSupport:   pulumi.Bool(true),
		})
		if err != nil {
			return err
		}

		// Create a subnet
		subnet, err := ec2.NewSubnet(ctx, "deepSeekSubnet", &ec2.SubnetArgs{
			VpcId:               vpc.ID(),
			CidrBlock:           pulumi.String("10.0.48.0/20"),
			AvailabilityZone:    pulumi.String("eu-central-1a"),
			MapPublicIpOnLaunch: pulumi.Bool(true),
		})
		if err != nil {
			return err
		}

		// Create an internet gateway
		internetGateway, err := ec2.NewInternetGateway(ctx, "deepSeekInternetGateway", &ec2.InternetGatewayArgs{
			VpcId: vpc.ID(),
		})
		if err != nil {
			return err
		}

		// Create a route table and route table association
		routeTable, err := ec2.NewRouteTable(ctx, "deepSeekRouteTable", &ec2.RouteTableArgs{
			VpcId: vpc.ID(),
			Routes: ec2.RouteTableRouteArray{
				&ec2.RouteTableRouteArgs{
					CidrBlock: pulumi.String("0.0.0.0/0"),
					GatewayId: internetGateway.ID(),
				},
			},
		})
		if err != nil {
			return err
		}

		_, err = ec2.NewRouteTableAssociation(ctx, "deepSeekRouteTableAssociation", &ec2.RouteTableAssociationArgs{
			SubnetId:     subnet.ID(),
			RouteTableId: routeTable.ID(),
		})
		if err != nil {
			return err
		}

		// Create a security group
		securityGroup, err := ec2.NewSecurityGroup(ctx, "deepSeekSecurityGroup", &ec2.SecurityGroupArgs{
			VpcId: vpc.ID(),
			Egress: ec2.SecurityGroupEgressArray{
				&ec2.SecurityGroupEgressArgs{
					FromPort:   pulumi.Int(0),
					ToPort:     pulumi.Int(0),
					Protocol:   pulumi.String("-1"),
					CidrBlocks: pulumi.StringArray{pulumi.String("0.0.0.0/0")},
				},
			},
			Ingress: ec2.SecurityGroupIngressArray{
				&ec2.SecurityGroupIngressArgs{
					FromPort:   pulumi.Int(22),
					ToPort:     pulumi.Int(22),
					Protocol:   pulumi.String("tcp"),
					CidrBlocks: pulumi.StringArray{pulumi.String("0.0.0.0/0")},
				},
				&ec2.SecurityGroupIngressArgs{
					FromPort:   pulumi.Int(3000),
					ToPort:     pulumi.Int(3000),
					Protocol:   pulumi.String("tcp"),
					CidrBlocks: pulumi.StringArray{pulumi.String("0.0.0.0/0")},
				},
				&ec2.SecurityGroupIngressArgs{
					FromPort:   pulumi.Int(11434),
					ToPort:     pulumi.Int(11434),
					Protocol:   pulumi.String("tcp"),
					CidrBlocks: pulumi.StringArray{pulumi.String("0.0.0.0/0")},
				},
			},
		})
		if err != nil {
			return err
		}

		return nil
	})
}
using Pulumi;
using Pulumi.Aws.Ec2;
using Pulumi.Aws.Iam;
using System.Collections.Generic;
using System.IO;
using Pulumi.Aws.Ec2.Inputs;
using System.Threading.Tasks;
using System.Text.Json;
class MyStack: Stack
{
    public MyStack()
        {
            // omitted for brevity
            // Create a VPC
            var vpc = new Vpc("deepSeekVpc", new VpcArgs
            {
                CidrBlock = "10.0.0.0/16",
                    EnableDnsHostnames = true,
                    EnableDnsSupport = true
            });
            // Create a subnet
            var subnet = new Subnet("deepSeekSubnet", new SubnetArgs
            {
                VpcId = vpc.Id, CidrBlock = "10.0.48.0/20",
                    AvailabilityZone = "eu-central-1a", MapPublicIpOnLaunch = true
            });
            // Create an internet gateway
            var internetGateway = new InternetGateway("deepSeekInternetGateway", new InternetGatewayArgs
            {
                VpcId = vpc.Id
            });
            // Create a route table and route table association
            var routeTable = new RouteTable("deepSeekRouteTable", new RouteTableArgs
            {
                VpcId = vpc.Id,
                    Routes = {
                        new RouteTableRouteArgs
                        {
                            CidrBlock = "0.0.0.0/0",
                                GatewayId = internetGateway.Id
                        }
                    }
            });
            var routeTableAssociation = new RouteTableAssociation("deepSeekRouteTableAssociation", new RouteTableAssociationArgs
            {
                SubnetId = subnet.Id,
                    RouteTableId = routeTable.Id
            });
            // Create a security group
            var securityGroup = new SecurityGroup("deepSeekSecurityGroup", new SecurityGroupArgs
            {
                VpcId = vpc.Id,
                    Egress = {
                        new SecurityGroupEgressArgs
                        {
                            FromPort = 0, ToPort = 0, Protocol = "-1",
                                CidrBlocks = {
                                    "0.0.0.0/0"
                                }
                        }
                    },
                    Ingress = {
                        new SecurityGroupIngressArgs
                        {
                            FromPort = 22, ToPort = 22,
                                Protocol = "tcp",
                                CidrBlocks = {
                                    "0.0.0.0/0"
                                }
                        },
                        new SecurityGroupIngressArgs
                        {
                            FromPort = 3000, ToPort = 3000,
                                Protocol = "tcp",
                                CidrBlocks = {
                                    "0.0.0.0/0"
                                }
                        },
                        new SecurityGroupIngressArgs
                        {
                            FromPort = 11434, ToPort = 11434,
                                Protocol = "tcp",
                                CidrBlocks = {
                                    "0.0.0.0/0"
                                }
                        }
                    }
            });
            // Key pair for SSH access
}
class Program
{
    static Task < int > Main() => Deployment.RunAsync < MyStack > ();
}

  deepSeekVpc:
    type: aws:ec2:Vpc
    properties:
      cidrBlock: 10.0.0.0/16
      enableDnsHostnames: true
      enableDnsSupport: true

  deepSeekSubnet:
    type: aws:ec2:Subnet
    properties:
      vpcId: ${deepSeekVpc.id}
      cidrBlock: 10.0.48.0/20
      availabilityZone: eu-central-1a
      mapPublicIpOnLaunch: true

  deepSeekInternetGateway:
    type: aws:ec2:InternetGateway
    properties:
      vpcId: ${deepSeekVpc.id}

  deepSeekRouteTable:
    type: aws:ec2:RouteTable
    properties:
      vpcId: ${deepSeekVpc.id}
      routes:
        - cidrBlock: 0.0.0.0/0
          gatewayId: ${deepSeekInternetGateway.id}

  deepSeekRouteTableAssociation:
    type: aws:ec2:RouteTableAssociation
    properties:
      subnetId: ${deepSeekSubnet.id}
      routeTableId: ${deepSeekRouteTable.id}

  deepSeekSecurityGroup:
    type: aws:ec2:SecurityGroup
    properties:
      vpcId: ${deepSeekVpc.id}
      ingress:
        - fromPort: 22
          toPort: 22
          protocol: tcp
          cidrBlocks:
            - 0.0.0.0/0
        - fromPort: 3000
          toPort: 3000
          protocol: tcp
          cidrBlocks:
            - 0.0.0.0/0
        - fromPort: 11434
          toPort: 11434
          protocol: tcp
          cidrBlocks:
            - 0.0.0.0/0
      egress:
        - fromPort: 0
          toPort: 0
          protocol: -1
          cidrBlocks:
            - 0.0.0.0/0

Create the EC2 instance

Finally, we need to create the EC2 instance. For this, we need to create our SSH key pair and retrieve the Amazon Machine Images to use in our instances. We are going to use Amazon Linux, as it is the most common and has all the necessary packages installed for us.

I also use a g4dn.xlarge, but you can change the instance type to any other instance type that supports GPU. You can find more information about the instance types.

If you need to create the key pair, run the following command:

openssl genrsa -out deepseek.pem 2048
openssl rsa -in deepseek.pem -pubout > deepseek.pub
ssh-keygen -f mykey.pub -i -mPKCS8 > deepseek.pem
const keyPair = new aws.ec2.KeyPair("deepSeekKey", {
    publicKey: pulumi.output(fs.readFileSync("deepseek.rsa", "utf-8")),
});

const deepSeekAmi = aws.ec2
    .getAmi({
        filters: [
            {
                name: "name",
                values: ["amzn2-ami-hvm-2.0.*-x86_64-gp2"],
            },
            {
                name: "architecture",
                values: ["x86_64"],
            },
        ],
        owners: ["137112412989"], // Amazon
        mostRecent: true,
    })
    .then(ami => ami.id);

const deepSeekInstance = new aws.ec2.Instance("deepSeekInstance", {
    ami: deepSeekAmi,
    instanceType: "g4dn.xlarge",
    keyName: keyPair.keyName,
    rootBlockDevice: {
        volumeSize: 100,
        volumeType: "gp3",
    },
    subnetId: subnet.id,
    vpcSecurityGroupIds: [securityGroup.id],
    iamInstanceProfile: instanceProfile.name,
    userData: fs.readFileSync("cloud-init.yaml", "utf-8"),
    tags: {
        Name: "deepSeek-server",
    },
});

export const amiId = deepSeekAmi;
export const instanceId = deepSeekInstance.id;
export const instancePublicDns = deepSeekInstance.publicIp;

# Key pair for SSH access
public_key = open("deepseek.rsa", "r").read()
key_pair = aws.ec2.KeyPair("deepSeekKey", public_key=public_key)


# Get the latest Amazon Linux 2 AMI
ami = aws.ec2.get_ami(
    filters=[
        {"name": "name", "values": ["amzn2-ami-hvm-2.0.*-x86_64-gp2"]},
        {"name": "architecture", "values": ["x86_64"]},
    ],
    owners=["137112412989"],  # Amazon
    most_recent=True,
).id

# Create an EC2 instance
user_data = open("cloud-init.yaml", "r").read()
instance = aws.ec2.Instance(
    "deepSeekInstance",
    ami=ami,
    instance_type="g4dn.xlarge",
    key_name=key_pair.key_name,
    root_block_device=aws.ec2.InstanceRootBlockDeviceArgs(
        volume_size=100, volume_type="gp3"
    ),
    subnet_id=subnet.id,
    vpc_security_group_ids=[security_group.id],
    iam_instance_profile=instance_profile.name,
    user_data=user_data,
    tags={"Name": "deepSeek-server"},
)

pulumi.export("amiId", ami)
pulumi.export("instanceId", instance.id)
pulumi.export("instancePublicDns", instance.public_ip)
package main

import (
	"encoding/json"
	"os"

	"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/ec2"
	"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/iam"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
                // omitted for brevity

		// Key pair for SSH access
		publicKey, err := os.ReadFile("deepseek.rsa")
		if err != nil {
			return err
		}

		keyPair, err := ec2.NewKeyPair(ctx, "deepSeekKey", &ec2.KeyPairArgs{
			PublicKey: pulumi.String(string(publicKey)),
		})
		if err != nil {
			return err
		}

		// Get the latest Amazon Linux 2 AMI
		mostRecent := true
		ami, err := ec2.LookupAmi(ctx, &ec2.LookupAmiArgs{
			Filters: []ec2.GetAmiFilter{
				{
					Name:   "name",
					Values: []string{"amzn2-ami-hvm-2.0.*-x86_64-gp2"},
				},
				{
					Name:   "architecture",
					Values: []string{"x86_64"},
				},
			},
			Owners:     []string{"137112412989"},
			MostRecent: &mostRecent,
		})
		if err != nil {
			return err
		}

		// Create an EC2 instance
		userData, err := os.ReadFile("cloud-init.yaml")
		if err != nil {
			return err
		}

		instance, err := ec2.NewInstance(ctx, "deepSeekInstance", &ec2.InstanceArgs{
			Ami:          pulumi.String(ami.Id),
			InstanceType: pulumi.String("g4dn.xlarge"),
			KeyName:      keyPair.KeyName,
			RootBlockDevice: &ec2.InstanceRootBlockDeviceArgs{
				VolumeSize: pulumi.Int(100),
				VolumeType: pulumi.String("gp3"),
			},
			SubnetId:            subnet.ID(),
			VpcSecurityGroupIds: pulumi.StringArray{securityGroup.ID()},
			IamInstanceProfile:  instanceProfile.Name,
			UserData:            pulumi.String(string(userData)),
			Tags: pulumi.StringMap{
				"Name": pulumi.String("deepSeek-server"),
			},
		})
		if err != nil {
			return err
		}

		ctx.Export("amiId", pulumi.String(ami.Id))
		ctx.Export("instanceId", instance.ID())
		ctx.Export("instancePublicDns", instance.PublicIp)

		return nil
	})
}
using Pulumi;
using Pulumi.Aws.Ec2;
using Pulumi.Aws.Iam;
using System.Collections.Generic;
using System.IO;
using Pulumi.Aws.Ec2.Inputs;
using System.Threading.Tasks;
using System.Text.Json;
class MyStack: Stack
{
    public MyStack()
        {
            // omitted for brevity
            // Key pair for SSH access
            var publicKey = File.ReadAllText("deepseek.rsa");
            var keyPair = new KeyPair("deepSeekKey", new KeyPairArgs
            {
                PublicKey = publicKey
            });
            // Get the latest Amazon Linux 2 AMI
            var amazonLinux = GetAmi.Invoke(new()
            {
                MostRecent = true,
                    Filters = new []
                    {
                        new GetAmiFilterInputArgs
                        {
                            Name = "name",
                                Values = new []
                                {
                                    "amzn2-ami-hvm-*-x86_64-gp2",
                                },
                        },
                        new GetAmiFilterInputArgs
                        {
                            Name = "architecture",
                                Values = new []
                                {
                                    "x86_64",
                                },
                        },
                    },
                    Owners = new []
                    {
                        "137112412989",
                    },
            });
            // Create an EC2 instance
            var userData = File.ReadAllText("cloud-init.yaml");
            var instance = new Instance("deepSeekInstance", new InstanceArgs
            {
                Ami = amazonLinux.Apply(GetAmiResult => GetAmiResult.Id),
                    InstanceType = "g4dn.xlarge", KeyName = keyPair.KeyName,
                    RootBlockDevice = new InstanceRootBlockDeviceArgs
                    {
                        VolumeSize = 100,
                            VolumeType = "gp3"
                    },
                    SubnetId = subnet.Id, VpcSecurityGroupIds = {
                        securityGroup.Id
                    },
                    IamInstanceProfile = instanceProfile.Name, UserData = userData,
                    Tags = {
                        {
                            "Name",
                            "deepSeek-server"
                        }
                    }
            });
            this.AmiId = amazonLinux.Apply(GetAmiResult => GetAmiResult.Id);
            this.InstanceId = instance.Id;
            this.InstancePublicDns = instance.PublicIp;
        }
        [Output]
    public Output < string > AmiId
        {
            get;
            set;
        }
        [Output]
    public Output < string > InstanceId
        {
            get;
            set;
        }
        [Output]
    public Output < string > InstancePublicDns
    {
        get;
        set;
    }
}
class Program
{
    static Task < int > Main() => Deployment.RunAsync < MyStack > ();
}

  deepSeekKey:
    type: aws:ec2:KeyPair
    properties:
      publicKey: ${publicKey}

  deepSeekInstance:
    type: aws:ec2:Instance
    properties:
      ami: ${amiId}
      instanceType: "g4dn.xlarge"
      keyName: ${deepSeekKey.keyName}
      rootBlockDevice:
        volumeSize: 100
        volumeType: gp3
      subnetId: ${deepSeekSubnet.id}
      vpcSecurityGroupIds:
        - ${deepSeekSecurityGroup.id}
      iamInstanceProfile: ${deepSeekProfile.name}
      userData: ${userData}
      tags:
        Name: deepSeek-server

outputs:
  AmiId: ${amiId}
  InstanceId: ${deepSeekInstance.id}
  InstancePublicDns: ${deepSeekInstance.publicIp}

Install Ollama and run DeepSeek

After we set up all the infrastructure needed for our GPU-powered EC2 instance, we can install Ollama and run DeepSeek. This will all be done as part of the user data script we pass to the EC2 instance.

In the runcmd section of the user data script, we will install the necessary packages, download the NVIDIA GRID drivers from S3, install Docker, and run the Ollama and Open WebUI containers.

#cloud-config
users:
- default

package_update: true

packages:
- apt-transport-https
- ca-certificates
- curl
- openjdk-17-jre-headless
- gcc

runcmd:
- yum install -y gcc kernel-devel-$(uname -r)
- aws s3 cp --recursive s3://ec2-linux-nvidia-drivers/latest/ .
- chmod +x NVIDIA-Linux-x86_64*.run
- /bin/sh ./NVIDIA-Linux-x86_64*.run --tmpdir . --silent
- touch /etc/modprobe.d/nvidia.conf
- echo "options nvidia NVreg_EnableGpuFirmware=0" | sudo tee --append /etc/modprobe.d/nvidia.conf
- yum install -y docker
- usermod -a -G docker ec2-user
- systemctl enable docker.service
- systemctl start docker.service
- curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
- yum install -y nvidia-container-toolkit
- nvidia-ctk runtime configure --runtime=docker
- systemctl restart docker
- docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama --restart always ollama/ollama
- sleep 120
- docker exec ollama ollama run deepseek-r1:7b
- docker exec ollama ollama run deepseek-r1:14b
- docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Using DeepSeek models via Ollama

DeepSeek provides a diverse range of models in the Ollama library, each tailored to different resource requirements and use cases. Below is a concise overview:

Model sizes

The library offers models in sizes like 1.5B, 7B, 8B, 14B, 32B, 70B, and even 671B parameters (where “B” indicates billions). While larger models tend to deliver stronger performance, they also demand more computational power.

Quantized models

Certain DeepSeek models come in quantized variants (for example, q4_K_M or q8_0). These are optimized to use less memory and may run faster, though there can be a minor trade-off in quality.

Distilled versions

DeepSeek also releases distilled models (e.g., qwen-distill, llama-distill). These versions are lighter, having been trained to mimic the behavior of larger models and offering a more balanced mix of performance and resource efficiency.

Tags

Each model has both a “latest” tag and specialized tags indicating its size, quantization level, or distillation approach. For example: latest, 1.5b, 7b,8b,14b, 32b, 70b, 671b and more.

To pull a model, use the following command:

# Replace <tag> with the desired model tag
ollama pull deepseek-r1:<tag>

In our case, we will pull the 7B model:

ollama pull deepseek-r1:7b

Deploy the infrastructure

Before deploying the infrastructure, make sure you have the necessary AWS credentials set up. You can do this by running the following command:

aws configure

Pulumi supports a wide range of configuration options, including environment variables, configuration files, and more. You can find more information in the Pulumi documentation.

After setting up the credentials, you can deploy the infrastructure by running the following command:

pulumi up

This command will give you first a handy preview of the actions Pulumi will take. If you are happy with the changes, you can confirm the deployment by typing yes.

pulumi up
Please choose a stack, or create a new one: <create a new stack>
Please enter your desired stack name.
To create a stack in an organization, use the format <org-name>/<stack-name> (e.g. `acmecorp/dev`): dev
Please enter your desired stack name.
Created stack 'dev'
Previewing update (dev)

View in Browser (Ctrl+O): https://app.pulumi.com/dirien/deepseek-ollama-typescript/dev/previews/1dbb18ea-ba31-4d5b-9510-5dce19eb8ee8

     Type                              Name                            Plan
 +   pulumi:pulumi:Stack               deepseek-ollama-typescript-dev  create
 +   ├─ aws:ec2:KeyPair                deepSeekKey                     create
 +   ├─ aws:ec2:Vpc                    deepSeekVpc                     create
 +   ├─ aws:iam:Role                   deepSeekRole                    create
 +   ├─ aws:iam:InstanceProfile        deepSeekProfile                 create
 +   ├─ aws:ec2:SecurityGroup          deepSeekSecurityGroup           create
 +   ├─ aws:ec2:RouteTable             deepSeekRouteTable              create
 +   ├─ aws:ec2:InternetGateway        deepSeekInternetGateway         create
 +   ├─ aws:ec2:Subnet                 deepSeekSubnet                  create
 +   ├─ aws:iam:RolePolicyAttachment   deepSeekS3Policy                create
 +   ├─ aws:ec2:RouteTableAssociation  deepSeekRouteTableAssociation   create
 +   └─ aws:ec2:Instance               deepSeekInstance                create

Outputs:
    amiId            : "ami-085131ff43045c877"
    instanceId       : output<string>
    instancePublicDns: output<string>

Resources:
    + 12 to create

Do you want to perform this update? yes
Updating (dev)

View in Browser (Ctrl+O): https://app.pulumi.com/dirien/deepseek-ollama-typescript/dev/updates/1

     Type                              Name                            Status
 +   pulumi:pulumi:Stack               deepseek-ollama-typescript-dev  created (40s)
 +   ├─ aws:ec2:KeyPair                deepSeekKey                     created (0.47s)
 +   ├─ aws:iam:Role                   deepSeekRole                    created (1s)
 +   ├─ aws:ec2:Vpc                    deepSeekVpc                     created (12s)
 +   ├─ aws:iam:InstanceProfile        deepSeekProfile                 created (6s)
 +   ├─ aws:iam:RolePolicyAttachment   deepSeekS3Policy                created (0.90s)
 +   ├─ aws:ec2:InternetGateway        deepSeekInternetGateway         created (0.69s)
 +   ├─ aws:ec2:Subnet                 deepSeekSubnet                  created (11s)
 +   ├─ aws:ec2:SecurityGroup          deepSeekSecurityGroup           created (2s)
 +   ├─ aws:ec2:RouteTable             deepSeekRouteTable              created (1s)
 +   ├─ aws:ec2:RouteTableAssociation  deepSeekRouteTableAssociation   created (0.92s)
 +   └─ aws:ec2:Instance               deepSeekInstance                created (12s)

Outputs:
    amiId            : "ami-085131ff43045c877"
    instanceId       : "i-0ae7495781ace3e81"
    instancePublicDns: "18.159.211.136"

Resources:
    + 12 created

Duration: 42s

While the infrastructure is relatively quickly deployed, the user data script will take some time to download the necessary packages and run the containers.

You can check that everything is up and running by either connecting via ssh to the instance or navigating to the public IP address of the instance in your browser.

ssh -i deepseek.pem ec2-user@<instance-public-ip>

And then run the following command to check the status of the containers:

sudo docker ps
CONTAINER ID   IMAGE                                COMMAND               CREATED         STATUS                   PORTS                                           NAMES
c8714335e205   ghcr.io/open-webui/open-webui:main   "bash start.sh"       6 minutes ago   Up 6 minutes (healthy)   0.0.0.0:3000->8080/tcp, :::3000->8080/tcp       open-webui
bf4bb3b7ede1   ollama/ollama                        "/bin/ollama serve"   8 minutes ago   Up 7 minutes             0.0.0.0:11434->11434/tcp, :::11434->11434/tcp   ollama
[ec2-user@ip-10-0-58-122 ~]$

Accessing the web UI

When the EC2 instance is up and running and the containers are started, you can access the Ollama Web UI by navigating to http://<ec2-public-ip>:3000.

Keep in mind that the Ollama Web UI is not secure by default. Make sure to secure it before exposing it to the public.

We can give it a spin by running a few queries. For example, we can ask DeepSeek to solve a math problem:

img_6.png

What is nice about DeepSeek is that we can also see the reasoning behind the answer. This is very helpful to understand how the model came to a conclusion.

Accessing DeepSeek with Ollama OpenAI-compatible API

Ollama provides an OpenAI-compatible API that allows you to interact with DeepSeek models programmatically. This allows you to use existing OpenAI-compatible tools and applications with your local Ollama server.

I am not going to cover how to use the API in this post, but you can find more information in the Ollama documentation.

Cleaning up

After you are done experimenting with DeepSeek, you can clean up the resources by running the following command:

pulumi destroy

Conclusion

This post demonstrated how easy it is to set up and run DeepSeek on an AWS EC2 instance using Pulumi. By leveraging IaC, we were able to create the necessary infrastructure with a few lines of code. From here, we can easily configure the code to run any other AI model on the cloud, change the instance type, or even set additional infrastructure for the application connection to the model.

If you have any questions or need help with the code, feel free to reach out to me and if you want to give DeepSeek with Pulumi a try, head over to the Pulumi documentation.

Try Pulumi for Free

If you want to learn more about what we learned from using GenAI in production, head to this blog post