Run DeepSeek-R1 on AWS EC2 Using Ollama
This weekend, my “for you” page on all of my social media accounts was filled with only one thing: DeepSeek. DeepSeek really managed to shake up the AI community with a series of very strong language models like DeepSeek R1.
But why? The answer is simple: DeepSeek entered the market as an open-source (MIT license) project with excellent performance and reasoning capabilities.
The company behind DeepSeek
DeepSeek is a Chinese AI startup founded in 2023 by Liang Wenfeng. One interesting fact about DeepSeek is that training and developing its models cost only a fraction of what OpenAI or Meta spent on theirs.
This on its own sparked a lot of interest and curiosity in the AI community. On some of the important benchmarks, such as AIME 2024 for mathematics, Codeforces for coding, and MMLU for general knowledge, DeepSeek R1 matches or even beats its rival models.
Mathematics: AIME 2024 & MATH-500
DeepSeek-R1 shows robust multi-step reasoning, scoring 79.8% on AIME 2024, edging out OpenAI o1-1217 at 79.2%. On MATH-500—which tests a wide range of high-school-level problems—DeepSeek-R1 again leads with 97.3%, slightly above OpenAI o1-1217’s 96.4%.
Coding: Codeforces & SWE-bench Verified
In algorithmic reasoning (Codeforces), OpenAI o1-1217 stands at 96.6%, marginally ahead of DeepSeek-R1’s 96.3%. Yet on SWE-bench Verified, which focuses on software engineering reasoning, DeepSeek-R1 scores 49.2%, surpassing OpenAI o1-1217’s 48.9% and showcasing strong software verification capabilities.
General knowledge: GPQA Diamond & MMLU
OpenAI o1-1217 excels in factual queries (GPQA Diamond) with 75.7%, outperforming DeepSeek-R1 at 71.5%. For broader academic coverage (MMLU), the margin is still tight: 91.8% (OpenAI o1-1217) vs. 90.8% (DeepSeek-R1), indicating near-parity in multitask language understanding.
DeepSeek R1 model
DeepSeek R1 is a large language model developed with a strong focus on reasoning tasks. It excels at problems requiring multi-step analysis and logical thinking. Unlike typical models that rely heavily on Supervised Fine-Tuning (SFT), DeepSeek R1 uses Reinforcement Learning (RL) as its primary training strategy. This emphasis on RL empowers it to figure out solutions with greater independence.
What Are Distilled Models?
Besides the main model, DeepSeek AI has introduced distilled versions in various parameter sizes—1.5B, 7B, 8B, 14B, 32B, and 70B. These distilled models draw on Qwen and Llama architectures, preserving much of the original model’s reasoning capabilities while being more accessible for personal computer use.
Notably, the 8B and smaller models can operate on standard CPUs, GPUs, or Apple Silicon machines, making them convenient for anyone interested in experimenting at home.
That’s why I decided to run DeepSeek on an AWS EC2 instance using Pulumi. I wanted to see how easy it is to set up and run DeepSeek on the cloud using Infrastructure as Code (IaC). So, let’s get started!
Setting up the environment
Prerequisites
Before we start, make sure you have the following prerequisites:
- An AWS account
- Pulumi CLI installed
- AWS CLI installed
- A basic understanding of Ollama
What Is Ollama?
Ollama allows you to run and manage large language models (LLMs) on your own computer by simplifying the process of downloading, running, and using these models. It supports macOS, Linux, and Windows, making it accessible across different operating systems. Ollama is easy to use, with simple commands to pull, run, and manage models.
In addition to local usage, Ollama provides an API for integrating LLMs into other applications. An experimental compatibility layer with the OpenAI API means many existing OpenAI-compatible tools can now work with a local Ollama server. It can leverage GPUs for faster processing and includes features like custom model creation and sharing.
Ollama provides strong support for many large language models such as Llama 2, Code Llama, or in our case DeepSeek R1, granting users secure, private, and local access. It offers GPU acceleration on macOS and Linux and provides libraries for Python and JavaScript.
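To illustrate that API, here is a minimal Python sketch that calls Ollama's native /api/generate endpoint. It assumes an Ollama server is listening on its default port 11434 and that the model has already been pulled; the model name is just an example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port

def build_generate_request(model: str, prompt: str) -> dict:
    # Payload for Ollama's /api/generate endpoint; "stream": False
    # asks for a single JSON object instead of a streamed response.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(build_generate_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server):
# print(generate("deepseek-r1:7b", "Why is the sky blue?"))
```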
Running DeepSeek on AWS EC2
First, we need to create a new Pulumi project. You can do this by running the following command:
# Select your preferred language (e.g., typescript, python, go, etc.)
pulumi new aws-<language>
Please choose the language you are most comfortable with.
This will create a new Pulumi project with the necessary files, configuration, and sample code. It will also install the AWS provider for you.
Since you will not be using the sample code, feel free to delete it. After that, copy the following code snippets into your Pulumi project.
Create an instance role with S3 access
To download the NVIDIA drivers from S3 later on, we need to create an instance role with S3 read access. Copy the following code to your Pulumi project:
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";
import * as fs from "fs";
const role = new aws.iam.Role("deepSeekRole", {
name: "deepseek-role",
assumeRolePolicy: JSON.stringify({
Version: "2012-10-17",
Statement: [
{
Action: "sts:AssumeRole",
Effect: "Allow",
Principal: {
Service: "ec2.amazonaws.com",
},
},
],
}),
});
new aws.iam.RolePolicyAttachment("deepSeekS3Policy", {
policyArn: "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
role: role.name,
});
const instanceProfile = new aws.iam.InstanceProfile("deepSeekProfile", {
name: "deepseek-profile",
role: role.name,
});
import pulumi
import pulumi_aws as aws
import json
# IAM Role for EC2 instances
role = aws.iam.Role(
"deepSeekRole",
name="deepseek-role",
assume_role_policy=json.dumps(
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"Service": "ec2.amazonaws.com",
},
}
],
}
),
)
# Attach S3 read-only policy to the IAM Role
iam_policy_attachment = aws.iam.RolePolicyAttachment(
"deepSeekS3Policy",
policy_arn="arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
role=role.name,
)
# Instance Profile containing the IAM Role
instance_profile = aws.iam.InstanceProfile(
"deepSeekProfile", name="deepseek-profile", role=role.name
)
package main
import (
"encoding/json"
"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/iam"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
rolePolicy, err := json.Marshal(map[string]interface{}{
"Version": "2012-10-17",
"Statement": []map[string]interface{}{
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": map[string]interface{}{"Service": "ec2.amazonaws.com"},
},
},
})
if err != nil {
return err
}
role, err := iam.NewRole(ctx, "deepSeekRole", &iam.RoleArgs{
Name: pulumi.String("deepseek-role"),
AssumeRolePolicy: pulumi.String(string(rolePolicy)),
})
if err != nil {
return err
}
// Attach S3 read-only policy to the IAM Role
_, err = iam.NewRolePolicyAttachment(ctx, "deepSeekS3Policy", &iam.RolePolicyAttachmentArgs{
PolicyArn: pulumi.String("arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"),
Role: role.Name,
})
if err != nil {
return err
}
// Instance Profile containing the IAM Role
instanceProfile, err := iam.NewInstanceProfile(ctx, "deepSeekProfile", &iam.InstanceProfileArgs{
Name: pulumi.String("deepseek-profile"),
Role: role.Name,
})
if err != nil {
return err
}
_ = instanceProfile // referenced later when creating the EC2 instance
return nil
})
}
using Pulumi;
using Pulumi.Aws.Ec2;
using Pulumi.Aws.Iam;
using System.Collections.Generic;
using System.IO;
using Pulumi.Aws.Ec2.Inputs;
using System.Threading.Tasks;
using System.Text.Json;

class MyStack : Stack
{
    public MyStack()
    {
        // IAM Role for EC2 instances
        var rolePolicy = new Dictionary<string, object>
        {
            { "Version", "2012-10-17" },
            {
                "Statement", new[]
                {
                    new Dictionary<string, object>
                    {
                        { "Action", "sts:AssumeRole" },
                        { "Effect", "Allow" },
                        {
                            "Principal", new Dictionary<string, string>
                            {
                                { "Service", "ec2.amazonaws.com" }
                            }
                        }
                    }
                }
            }
        };
        var role = new Role("deepSeekRole", new RoleArgs
        {
            Name = "deepseek-role",
            AssumeRolePolicy = JsonSerializer.Serialize(rolePolicy)
        });

        // Attach S3 read-only policy to the IAM Role
        var rolePolicyAttachment = new RolePolicyAttachment("deepSeekS3Policy", new RolePolicyAttachmentArgs
        {
            PolicyArn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
            Role = role.Name
        });

        // Instance Profile containing the IAM Role
        var instanceProfile = new InstanceProfile("deepSeekProfile", new InstanceProfileArgs
        {
            Name = "deepseek-profile",
            Role = role.Name
        });
    }
}

class Program
{
    static Task<int> Main() => Deployment.RunAsync<MyStack>();
}
name: deepseek-ollama-yaml
description: DeepSeek Ollama AWS example
runtime: yaml
variables:
publicKey:
fn::readFile: ./deepseek.rsa
userData:
fn::readFile: ./cloud-init.yaml
amiFilter: "amzn2-ami-hvm-*-x86_64-gp2"
amiOwner: "137112412989"
amiId:
fn::invoke:
function: aws:ec2:getAmi
arguments:
filters:
- name: name
values: ["${amiFilter}"]
owners: ["${amiOwner}"]
mostRecent: true
return: id
resources:
deepSeekRole:
type: aws:iam:Role
properties:
name: deepseek-role
assumeRolePolicy: |
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"Service": "ec2.amazonaws.com"
}
}
]
}
deepSeekS3Policy:
type: aws:iam:RolePolicyAttachment
properties:
role: ${deepSeekRole.name}
policyArn: arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
deepSeekProfile:
type: aws:iam:InstanceProfile
properties:
name: deepseek-profile
role: ${deepSeekRole.name}
Create the network
Next, we need to create a VPC, subnet, Internet Gateway, and route table. Copy the following code to your Pulumi project:
const vpc = new aws.ec2.Vpc("deepSeekVpc", {
cidrBlock: "10.0.0.0/16",
enableDnsHostnames: true,
enableDnsSupport: true,
});
const subnet = new aws.ec2.Subnet("deepSeekSubnet", {
vpcId: vpc.id,
cidrBlock: "10.0.48.0/20",
availabilityZone: pulumi.interpolate`${aws.getAvailabilityZones().then(it => it.names[0])}`,
mapPublicIpOnLaunch: true,
});
const internetGateway = new aws.ec2.InternetGateway("deepSeekInternetGateway", {
vpcId: vpc.id,
});
const routeTable = new aws.ec2.RouteTable("deepSeekRouteTable", {
vpcId: vpc.id,
routes: [
{
cidrBlock: "0.0.0.0/0",
gatewayId: internetGateway.id,
},
],
});
const routeTableAssociation = new aws.ec2.RouteTableAssociation("deepSeekRouteTableAssociation", {
subnetId: subnet.id,
routeTableId: routeTable.id,
});
const securityGroup = new aws.ec2.SecurityGroup("deepSeekSecurityGroup", {
vpcId: vpc.id,
egress: [
{
fromPort: 0,
toPort: 0,
protocol: "-1",
cidrBlocks: ["0.0.0.0/0"],
},
],
ingress: [
{
fromPort: 22,
toPort: 22,
protocol: "tcp",
cidrBlocks: ["0.0.0.0/0"],
},
{
fromPort: 3000,
toPort: 3000,
protocol: "tcp",
cidrBlocks: ["0.0.0.0/0"],
},
{
fromPort: 11434,
toPort: 11434,
protocol: "tcp",
cidrBlocks: ["0.0.0.0/0"],
},
],
});
# Create a VPC
vpc = aws.ec2.Vpc(
"deepSeekVpc",
cidr_block="10.0.0.0/16",
enable_dns_hostnames=True,
enable_dns_support=True,
)
# Create a subnet
subnet = aws.ec2.Subnet(
"deepSeekSubnet",
vpc_id=vpc.id,
cidr_block="10.0.48.0/20",
availability_zone="eu-central-1a",
map_public_ip_on_launch=True,
)
# Create an internet gateway
internet_gateway = aws.ec2.InternetGateway("deepSeekInternetGateway", vpc_id=vpc.id)
# Create a route table and route table association
route_table = aws.ec2.RouteTable(
"deepSeekRouteTable",
vpc_id=vpc.id,
routes=[
aws.ec2.RouteTableRouteArgs(
cidr_block="0.0.0.0/0", gateway_id=internet_gateway.id
)
],
)
route_table_association = aws.ec2.RouteTableAssociation(
"deepSeekRouteTableAssociation", subnet_id=subnet.id, route_table_id=route_table.id
)
# Create a security group
security_group = aws.ec2.SecurityGroup(
"deepSeekSecurityGroup",
vpc_id=vpc.id,
egress=[
{
"from_port": 0,
"to_port": 0,
"protocol": "-1",
"cidr_blocks": ["0.0.0.0/0"],
}
],
ingress=[
{
"from_port": 22,
"to_port": 22,
"protocol": "tcp",
"cidr_blocks": ["0.0.0.0/0"],
},
{
"from_port": 3000,
"to_port": 3000,
"protocol": "tcp",
"cidr_blocks": ["0.0.0.0/0"],
},
{
"from_port": 11434,
"to_port": 11434,
"protocol": "tcp",
"cidr_blocks": ["0.0.0.0/0"],
},
],
)
package main
import (
"encoding/json"
"os"
"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/ec2"
"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/iam"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
// omitted for brevity
// Create a VPC
vpc, err := ec2.NewVpc(ctx, "deepSeekVpc", &ec2.VpcArgs{
CidrBlock: pulumi.String("10.0.0.0/16"),
EnableDnsHostnames: pulumi.Bool(true),
EnableDnsSupport: pulumi.Bool(true),
})
if err != nil {
return err
}
// Create a subnet
subnet, err := ec2.NewSubnet(ctx, "deepSeekSubnet", &ec2.SubnetArgs{
VpcId: vpc.ID(),
CidrBlock: pulumi.String("10.0.48.0/20"),
AvailabilityZone: pulumi.String("eu-central-1a"),
MapPublicIpOnLaunch: pulumi.Bool(true),
})
if err != nil {
return err
}
// Create an internet gateway
internetGateway, err := ec2.NewInternetGateway(ctx, "deepSeekInternetGateway", &ec2.InternetGatewayArgs{
VpcId: vpc.ID(),
})
if err != nil {
return err
}
// Create a route table and route table association
routeTable, err := ec2.NewRouteTable(ctx, "deepSeekRouteTable", &ec2.RouteTableArgs{
VpcId: vpc.ID(),
Routes: ec2.RouteTableRouteArray{
&ec2.RouteTableRouteArgs{
CidrBlock: pulumi.String("0.0.0.0/0"),
GatewayId: internetGateway.ID(),
},
},
})
if err != nil {
return err
}
_, err = ec2.NewRouteTableAssociation(ctx, "deepSeekRouteTableAssociation", &ec2.RouteTableAssociationArgs{
SubnetId: subnet.ID(),
RouteTableId: routeTable.ID(),
})
if err != nil {
return err
}
// Create a security group
securityGroup, err := ec2.NewSecurityGroup(ctx, "deepSeekSecurityGroup", &ec2.SecurityGroupArgs{
VpcId: vpc.ID(),
Egress: ec2.SecurityGroupEgressArray{
&ec2.SecurityGroupEgressArgs{
FromPort: pulumi.Int(0),
ToPort: pulumi.Int(0),
Protocol: pulumi.String("-1"),
CidrBlocks: pulumi.StringArray{pulumi.String("0.0.0.0/0")},
},
},
Ingress: ec2.SecurityGroupIngressArray{
&ec2.SecurityGroupIngressArgs{
FromPort: pulumi.Int(22),
ToPort: pulumi.Int(22),
Protocol: pulumi.String("tcp"),
CidrBlocks: pulumi.StringArray{pulumi.String("0.0.0.0/0")},
},
&ec2.SecurityGroupIngressArgs{
FromPort: pulumi.Int(3000),
ToPort: pulumi.Int(3000),
Protocol: pulumi.String("tcp"),
CidrBlocks: pulumi.StringArray{pulumi.String("0.0.0.0/0")},
},
&ec2.SecurityGroupIngressArgs{
FromPort: pulumi.Int(11434),
ToPort: pulumi.Int(11434),
Protocol: pulumi.String("tcp"),
CidrBlocks: pulumi.StringArray{pulumi.String("0.0.0.0/0")},
},
},
})
if err != nil {
return err
}
_ = securityGroup // referenced later when creating the EC2 instance
return nil
})
}
using Pulumi;
using Pulumi.Aws.Ec2;
using Pulumi.Aws.Iam;
using System.Collections.Generic;
using System.IO;
using Pulumi.Aws.Ec2.Inputs;
using System.Threading.Tasks;
using System.Text.Json;

class MyStack : Stack
{
    public MyStack()
    {
        // omitted for brevity

        // Create a VPC
        var vpc = new Vpc("deepSeekVpc", new VpcArgs
        {
            CidrBlock = "10.0.0.0/16",
            EnableDnsHostnames = true,
            EnableDnsSupport = true
        });

        // Create a subnet
        var subnet = new Subnet("deepSeekSubnet", new SubnetArgs
        {
            VpcId = vpc.Id,
            CidrBlock = "10.0.48.0/20",
            AvailabilityZone = "eu-central-1a",
            MapPublicIpOnLaunch = true
        });

        // Create an internet gateway
        var internetGateway = new InternetGateway("deepSeekInternetGateway", new InternetGatewayArgs
        {
            VpcId = vpc.Id
        });

        // Create a route table and route table association
        var routeTable = new RouteTable("deepSeekRouteTable", new RouteTableArgs
        {
            VpcId = vpc.Id,
            Routes =
            {
                new RouteTableRouteArgs
                {
                    CidrBlock = "0.0.0.0/0",
                    GatewayId = internetGateway.Id
                }
            }
        });
        var routeTableAssociation = new RouteTableAssociation("deepSeekRouteTableAssociation", new RouteTableAssociationArgs
        {
            SubnetId = subnet.Id,
            RouteTableId = routeTable.Id
        });

        // Create a security group for SSH, Open WebUI, and the Ollama API
        var securityGroup = new SecurityGroup("deepSeekSecurityGroup", new SecurityGroupArgs
        {
            VpcId = vpc.Id,
            Egress =
            {
                new SecurityGroupEgressArgs
                {
                    FromPort = 0, ToPort = 0, Protocol = "-1",
                    CidrBlocks = { "0.0.0.0/0" }
                }
            },
            Ingress =
            {
                new SecurityGroupIngressArgs
                {
                    FromPort = 22, ToPort = 22, Protocol = "tcp",
                    CidrBlocks = { "0.0.0.0/0" }
                },
                new SecurityGroupIngressArgs
                {
                    FromPort = 3000, ToPort = 3000, Protocol = "tcp",
                    CidrBlocks = { "0.0.0.0/0" }
                },
                new SecurityGroupIngressArgs
                {
                    FromPort = 11434, ToPort = 11434, Protocol = "tcp",
                    CidrBlocks = { "0.0.0.0/0" }
                }
            }
        });
    }
}

class Program
{
    static Task<int> Main() => Deployment.RunAsync<MyStack>();
}
deepSeekVpc:
type: aws:ec2:Vpc
properties:
cidrBlock: 10.0.0.0/16
enableDnsHostnames: true
enableDnsSupport: true
deepSeekSubnet:
type: aws:ec2:Subnet
properties:
vpcId: ${deepSeekVpc.id}
cidrBlock: 10.0.48.0/20
availabilityZone: eu-central-1a
mapPublicIpOnLaunch: true
deepSeekInternetGateway:
type: aws:ec2:InternetGateway
properties:
vpcId: ${deepSeekVpc.id}
deepSeekRouteTable:
type: aws:ec2:RouteTable
properties:
vpcId: ${deepSeekVpc.id}
routes:
- cidrBlock: 0.0.0.0/0
gatewayId: ${deepSeekInternetGateway.id}
deepSeekRouteTableAssociation:
type: aws:ec2:RouteTableAssociation
properties:
subnetId: ${deepSeekSubnet.id}
routeTableId: ${deepSeekRouteTable.id}
deepSeekSecurityGroup:
type: aws:ec2:SecurityGroup
properties:
vpcId: ${deepSeekVpc.id}
ingress:
- fromPort: 22
toPort: 22
protocol: tcp
cidrBlocks:
- 0.0.0.0/0
- fromPort: 3000
toPort: 3000
protocol: tcp
cidrBlocks:
- 0.0.0.0/0
- fromPort: 11434
toPort: 11434
protocol: tcp
cidrBlocks:
- 0.0.0.0/0
egress:
- fromPort: 0
toPort: 0
protocol: -1
cidrBlocks:
- 0.0.0.0/0
Create the EC2 instance
Finally, we need to create the EC2 instance. For this, we create our SSH key pair and look up the Amazon Machine Image (AMI) to use for the instance. We are going to use Amazon Linux 2, as it is widely used and ships with most of the packages we need.
I use a g4dn.xlarge instance, but you can change the instance type to any other type that supports a GPU; see the AWS documentation for more information about the available instance types.
If you need to create the key pair, run the following command:
openssl genrsa -out deepseek.pem 2048
openssl rsa -in deepseek.pem -pubout > deepseek.pub
ssh-keygen -f deepseek.pub -i -mPKCS8 > deepseek.rsa
const keyPair = new aws.ec2.KeyPair("deepSeekKey", {
publicKey: pulumi.output(fs.readFileSync("deepseek.rsa", "utf-8")),
});
const deepSeekAmi = aws.ec2
.getAmi({
filters: [
{
name: "name",
values: ["amzn2-ami-hvm-2.0.*-x86_64-gp2"],
},
{
name: "architecture",
values: ["x86_64"],
},
],
owners: ["137112412989"], // Amazon
mostRecent: true,
})
.then(ami => ami.id);
const deepSeekInstance = new aws.ec2.Instance("deepSeekInstance", {
ami: deepSeekAmi,
instanceType: "g4dn.xlarge",
keyName: keyPair.keyName,
rootBlockDevice: {
volumeSize: 100,
volumeType: "gp3",
},
subnetId: subnet.id,
vpcSecurityGroupIds: [securityGroup.id],
iamInstanceProfile: instanceProfile.name,
userData: fs.readFileSync("cloud-init.yaml", "utf-8"),
tags: {
Name: "deepSeek-server",
},
});
export const amiId = deepSeekAmi;
export const instanceId = deepSeekInstance.id;
export const instancePublicDns = deepSeekInstance.publicIp;
# Key pair for SSH access
public_key = open("deepseek.rsa", "r").read()
key_pair = aws.ec2.KeyPair("deepSeekKey", public_key=public_key)
# Get the latest Amazon Linux 2 AMI
ami = aws.ec2.get_ami(
filters=[
{"name": "name", "values": ["amzn2-ami-hvm-2.0.*-x86_64-gp2"]},
{"name": "architecture", "values": ["x86_64"]},
],
owners=["137112412989"], # Amazon
most_recent=True,
).id
# Create an EC2 instance
user_data = open("cloud-init.yaml", "r").read()
instance = aws.ec2.Instance(
"deepSeekInstance",
ami=ami,
instance_type="g4dn.xlarge",
key_name=key_pair.key_name,
root_block_device=aws.ec2.InstanceRootBlockDeviceArgs(
volume_size=100, volume_type="gp3"
),
subnet_id=subnet.id,
vpc_security_group_ids=[security_group.id],
iam_instance_profile=instance_profile.name,
user_data=user_data,
tags={"Name": "deepSeek-server"},
)
pulumi.export("amiId", ami)
pulumi.export("instanceId", instance.id)
pulumi.export("instancePublicDns", instance.public_ip)
package main
import (
"encoding/json"
"os"
"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/ec2"
"github.com/pulumi/pulumi-aws/sdk/v6/go/aws/iam"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
// omitted for brevity
// Key pair for SSH access
publicKey, err := os.ReadFile("deepseek.rsa")
if err != nil {
return err
}
keyPair, err := ec2.NewKeyPair(ctx, "deepSeekKey", &ec2.KeyPairArgs{
PublicKey: pulumi.String(string(publicKey)),
})
if err != nil {
return err
}
// Get the latest Amazon Linux 2 AMI
mostRecent := true
ami, err := ec2.LookupAmi(ctx, &ec2.LookupAmiArgs{
Filters: []ec2.GetAmiFilter{
{
Name: "name",
Values: []string{"amzn2-ami-hvm-2.0.*-x86_64-gp2"},
},
{
Name: "architecture",
Values: []string{"x86_64"},
},
},
Owners: []string{"137112412989"},
MostRecent: &mostRecent,
})
if err != nil {
return err
}
// Create an EC2 instance
userData, err := os.ReadFile("cloud-init.yaml")
if err != nil {
return err
}
instance, err := ec2.NewInstance(ctx, "deepSeekInstance", &ec2.InstanceArgs{
Ami: pulumi.String(ami.Id),
InstanceType: pulumi.String("g4dn.xlarge"),
KeyName: keyPair.KeyName,
RootBlockDevice: &ec2.InstanceRootBlockDeviceArgs{
VolumeSize: pulumi.Int(100),
VolumeType: pulumi.String("gp3"),
},
SubnetId: subnet.ID(),
VpcSecurityGroupIds: pulumi.StringArray{securityGroup.ID()},
IamInstanceProfile: instanceProfile.Name,
UserData: pulumi.String(string(userData)),
Tags: pulumi.StringMap{
"Name": pulumi.String("deepSeek-server"),
},
})
if err != nil {
return err
}
ctx.Export("amiId", pulumi.String(ami.Id))
ctx.Export("instanceId", instance.ID())
ctx.Export("instancePublicDns", instance.PublicIp)
return nil
})
}
using Pulumi;
using Pulumi.Aws.Ec2;
using Pulumi.Aws.Iam;
using System.Collections.Generic;
using System.IO;
using Pulumi.Aws.Ec2.Inputs;
using System.Threading.Tasks;
using System.Text.Json;

class MyStack : Stack
{
    public MyStack()
    {
        // omitted for brevity

        // Key pair for SSH access
        var publicKey = File.ReadAllText("deepseek.rsa");
        var keyPair = new KeyPair("deepSeekKey", new KeyPairArgs
        {
            PublicKey = publicKey
        });

        // Get the latest Amazon Linux 2 AMI
        var amazonLinux = GetAmi.Invoke(new()
        {
            MostRecent = true,
            Filters = new[]
            {
                new GetAmiFilterInputArgs
                {
                    Name = "name",
                    Values = new[] { "amzn2-ami-hvm-*-x86_64-gp2" },
                },
                new GetAmiFilterInputArgs
                {
                    Name = "architecture",
                    Values = new[] { "x86_64" },
                },
            },
            Owners = new[] { "137112412989" },
        });

        // Create an EC2 instance
        var userData = File.ReadAllText("cloud-init.yaml");
        var instance = new Instance("deepSeekInstance", new InstanceArgs
        {
            Ami = amazonLinux.Apply(ami => ami.Id),
            InstanceType = "g4dn.xlarge",
            KeyName = keyPair.KeyName,
            RootBlockDevice = new InstanceRootBlockDeviceArgs
            {
                VolumeSize = 100,
                VolumeType = "gp3"
            },
            SubnetId = subnet.Id,
            VpcSecurityGroupIds = { securityGroup.Id },
            IamInstanceProfile = instanceProfile.Name,
            UserData = userData,
            Tags = { { "Name", "deepSeek-server" } }
        });

        this.AmiId = amazonLinux.Apply(ami => ami.Id);
        this.InstanceId = instance.Id;
        this.InstancePublicDns = instance.PublicIp;
    }

    [Output]
    public Output<string> AmiId { get; set; }

    [Output]
    public Output<string> InstanceId { get; set; }

    [Output]
    public Output<string> InstancePublicDns { get; set; }
}

class Program
{
    static Task<int> Main() => Deployment.RunAsync<MyStack>();
}
deepSeekKey:
type: aws:ec2:KeyPair
properties:
publicKey: ${publicKey}
deepSeekInstance:
type: aws:ec2:Instance
properties:
ami: ${amiId}
instanceType: "g4dn.xlarge"
keyName: ${deepSeekKey.keyName}
rootBlockDevice:
volumeSize: 100
volumeType: gp3
subnetId: ${deepSeekSubnet.id}
vpcSecurityGroupIds:
- ${deepSeekSecurityGroup.id}
iamInstanceProfile: ${deepSeekProfile.name}
userData: ${userData}
tags:
Name: deepSeek-server
outputs:
AmiId: ${amiId}
InstanceId: ${deepSeekInstance.id}
InstancePublicDns: ${deepSeekInstance.publicIp}
Install Ollama and run DeepSeek
After we set up all the infrastructure needed for our GPU-powered EC2 instance, we can install Ollama and run DeepSeek. This will all be done as part of the user data script we pass to the EC2 instance.
In the runcmd section of the user data script, we will install the necessary packages, download the NVIDIA GRID drivers from S3, install Docker, and run the Ollama and Open WebUI containers.
#cloud-config
users:
- default
package_update: true
packages:
- ca-certificates
- curl
- gcc
runcmd:
- yum install -y gcc kernel-devel-$(uname -r)
- aws s3 cp --recursive s3://ec2-linux-nvidia-drivers/latest/ .
- chmod +x NVIDIA-Linux-x86_64*.run
- /bin/sh ./NVIDIA-Linux-x86_64*.run --tmpdir . --silent
- touch /etc/modprobe.d/nvidia.conf
- echo "options nvidia NVreg_EnableGpuFirmware=0" | sudo tee --append /etc/modprobe.d/nvidia.conf
- yum install -y docker
- usermod -a -G docker ec2-user
- systemctl enable docker.service
- systemctl start docker.service
- curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
- yum install -y nvidia-container-toolkit
- nvidia-ctk runtime configure --runtime=docker
- systemctl restart docker
- docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama --restart always ollama/ollama
- sleep 120
- docker exec ollama ollama pull deepseek-r1:7b
- docker exec ollama ollama pull deepseek-r1:14b
- docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
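One caveat: the sleep 120 above is a crude way of waiting for the Ollama container to become responsive before pulling models. As a sketch of a more robust readiness probe, you could poll the API from your workstation instead; the host value below is a placeholder.

```python
import json
import time
import urllib.request
from urllib.error import URLError

def model_names(tags_response: dict) -> list:
    # /api/tags lists the models already pulled into the Ollama store.
    return [m["name"] for m in tags_response.get("models", [])]

def list_local_models(base_url: str) -> list:
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return model_names(json.loads(resp.read()))

def wait_for_ollama(base_url: str, timeout_s: int = 300) -> bool:
    # Poll the API until it answers instead of sleeping a fixed interval.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            list_local_models(base_url)
            return True
        except (URLError, OSError):
            time.sleep(5)
    return False

# Example: wait_for_ollama("http://<instance-public-ip>:11434")
```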
Using DeepSeek models via Ollama
DeepSeek provides a diverse range of models in the Ollama library, each tailored to different resource requirements and use cases. Below is a concise overview:
Model sizes
The library offers models in sizes like 1.5B, 7B, 8B, 14B, 32B, 70B, and even 671B parameters (where “B” indicates billions). While larger models tend to deliver stronger performance, they also demand more computational power.
Quantized models
Certain DeepSeek models come in quantized variants (for example, q4_K_M or q8_0). These are optimized to use less memory and may run faster, though there can be a minor trade-off in quality.
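As a rough rule of thumb, the weight footprint is the parameter count times the bytes per parameter, so quantization shrinks memory roughly in proportion to bit width. The sketch below is an approximation that ignores the KV cache and runtime overhead, and treats q4_K_M as ~0.5 bytes per parameter (it is really a mix of 4- and 6-bit blocks).

```python
# Approximate bytes per parameter for common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "q8_0": 1.0, "q4_K_M": 0.5}

def approx_weight_gb(params_billions: float, quant: str) -> float:
    # Parameter count (in billions) times bytes per parameter gives GB.
    return params_billions * BYTES_PER_PARAM[quant]

for quant in ("fp16", "q8_0", "q4_K_M"):
    print(f"7B model @ {quant}: ~{approx_weight_gb(7, quant):.1f} GB")
```

This is why a 7B model at 4-bit quantization fits comfortably on a 16 GB GPU like the T4 in a g4dn.xlarge, while the same model at fp16 is already close to the limit.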
Distilled versions
DeepSeek also releases distilled models (e.g., qwen-distill, llama-distill). These versions are lighter, having been trained to mimic the behavior of larger models and offering a more balanced mix of performance and resource efficiency.
Tags
Each model has both a "latest" tag and specialized tags indicating its size, quantization level, or distillation approach. For example: latest, 1.5b, 7b, 8b, 14b, 32b, 70b, 671b, and more.
To pull a model, use the following command:
# Replace <tag> with the desired model tag
ollama pull deepseek-r1:<tag>
In our case, we will pull the 7B model:
ollama pull deepseek-r1:7b
Deploy the infrastructure
Before deploying the infrastructure, make sure you have the necessary AWS credentials set up. You can do this by running the following command:
aws configure
Pulumi supports a wide range of configuration options, including environment variables, configuration files, and more. You can find more information in the Pulumi documentation.
After setting up the credentials, you can deploy the infrastructure by running the following command:
pulumi up
This command will first give you a handy preview of the actions Pulumi will take. If you are happy with the changes, you can confirm the deployment by typing yes.
pulumi up
Please choose a stack, or create a new one: <create a new stack>
Please enter your desired stack name.
To create a stack in an organization, use the format <org-name>/<stack-name> (e.g. `acmecorp/dev`): dev
Created stack 'dev'
Previewing update (dev)
View in Browser (Ctrl+O): https://app.pulumi.com/dirien/deepseek-ollama-typescript/dev/previews/1dbb18ea-ba31-4d5b-9510-5dce19eb8ee8
Type Name Plan
+ pulumi:pulumi:Stack deepseek-ollama-typescript-dev create
+ ├─ aws:ec2:KeyPair deepSeekKey create
+ ├─ aws:ec2:Vpc deepSeekVpc create
+ ├─ aws:iam:Role deepSeekRole create
+ ├─ aws:iam:InstanceProfile deepSeekProfile create
+ ├─ aws:ec2:SecurityGroup deepSeekSecurityGroup create
+ ├─ aws:ec2:RouteTable deepSeekRouteTable create
+ ├─ aws:ec2:InternetGateway deepSeekInternetGateway create
+ ├─ aws:ec2:Subnet deepSeekSubnet create
+ ├─ aws:iam:RolePolicyAttachment deepSeekS3Policy create
+ ├─ aws:ec2:RouteTableAssociation deepSeekRouteTableAssociation create
+ └─ aws:ec2:Instance deepSeekInstance create
Outputs:
amiId : "ami-085131ff43045c877"
instanceId : output<string>
instancePublicDns: output<string>
Resources:
+ 12 to create
Do you want to perform this update? yes
Updating (dev)
View in Browser (Ctrl+O): https://app.pulumi.com/dirien/deepseek-ollama-typescript/dev/updates/1
Type Name Status
+ pulumi:pulumi:Stack deepseek-ollama-typescript-dev created (40s)
+ ├─ aws:ec2:KeyPair deepSeekKey created (0.47s)
+ ├─ aws:iam:Role deepSeekRole created (1s)
+ ├─ aws:ec2:Vpc deepSeekVpc created (12s)
+ ├─ aws:iam:InstanceProfile deepSeekProfile created (6s)
+ ├─ aws:iam:RolePolicyAttachment deepSeekS3Policy created (0.90s)
+ ├─ aws:ec2:InternetGateway deepSeekInternetGateway created (0.69s)
+ ├─ aws:ec2:Subnet deepSeekSubnet created (11s)
+ ├─ aws:ec2:SecurityGroup deepSeekSecurityGroup created (2s)
+ ├─ aws:ec2:RouteTable deepSeekRouteTable created (1s)
+ ├─ aws:ec2:RouteTableAssociation deepSeekRouteTableAssociation created (0.92s)
+ └─ aws:ec2:Instance deepSeekInstance created (12s)
Outputs:
amiId : "ami-085131ff43045c877"
instanceId : "i-0ae7495781ace3e81"
instancePublicDns: "18.159.211.136"
Resources:
+ 12 created
Duration: 42s
While the infrastructure deploys relatively quickly, the user data script will take some time to download the necessary packages and start the containers.
You can check that everything is up and running by either connecting to the instance via ssh or navigating to the public IP address of the instance in your browser.
ssh -i deepseek.pem ec2-user@<instance-public-ip>
And then run the following command to check the status of the containers:
sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c8714335e205 ghcr.io/open-webui/open-webui:main "bash start.sh" 6 minutes ago Up 6 minutes (healthy) 0.0.0.0:3000->8080/tcp, :::3000->8080/tcp open-webui
bf4bb3b7ede1 ollama/ollama "/bin/ollama serve" 8 minutes ago Up 7 minutes 0.0.0.0:11434->11434/tcp, :::11434->11434/tcp ollama
[ec2-user@ip-10-0-58-122 ~]$
Accessing the web UI
When the EC2 instance is up and running and the containers are started, you can access Open WebUI by navigating to http://<ec2-public-ip>:3000.
We can give it a spin by running a few queries, for example, asking DeepSeek to solve a math problem.
What is nice about DeepSeek is that we can also see the reasoning behind the answer. This is very helpful for understanding how the model reached its conclusion.
Accessing DeepSeek with Ollama OpenAI-compatible API
Ollama provides an OpenAI-compatible API that lets you interact with DeepSeek models programmatically, so existing OpenAI-compatible tools and applications can work with your local Ollama server.
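As a quick illustration, a request against the compatibility endpoint could look like the sketch below. It assumes the server's port 11434 is reachable and the 7b model has been pulled; the request and response shapes follow the OpenAI chat completions format.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint

def chat_body(model: str, user_message: str) -> dict:
    # Same request shape as an OpenAI chat completion.
    return {"model": model, "messages": [{"role": "user", "content": user_message}]}

def chat(model: str, user_message: str) -> str:
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(chat_body(model, user_message)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]

# Example (requires a running Ollama server):
# print(chat("deepseek-r1:7b", "What is 12 * 13?"))
```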
I am not going to cover how to use the API in this post, but you can find more information in the Ollama documentation.
Cleaning up
After you are done experimenting with DeepSeek, you can clean up the resources by running the following command:
pulumi destroy
Conclusion
This post demonstrated how easy it is to set up and run DeepSeek on an AWS EC2 instance using Pulumi. By leveraging IaC, we were able to create the necessary infrastructure with a few lines of code. From here, we can easily adapt the code to run any other AI model in the cloud, change the instance type, or set up additional infrastructure for applications that connect to the model.
If you have any questions or need help with the code, feel free to reach out to me. And if you want to give DeepSeek with Pulumi a try, head over to the Pulumi documentation.