Skip to main content
Pulumi logo

Posts Tagged ec2

Run Open-Source LLMs on AWS EC2 with Ollama and Pulumi

TL;DR. Want to self-host an open-source LLM on AWS? Use a g4dn.xlarge ($0.526/hr on-demand, 16 GB GPU memory) for 7B/8B models, a g5.xlarge ($1.006/hr, 24 GB) for 13B–14B models, a g5.2xlarge ($1.212/hr, 24 GB) for 32B models, or a g6e.2xlarge ($2.242/hr, 48 GB) for 70B models. Deploy with the Pulumi program below and Ollama will run any model from its library: DeepSeek-R1, Llama 3, Qwen, or Mistral, with a one-line change.

Read more →

Fargate vs EC2

Building an EKS cluster requires choosing how your containers will actually run - either on EC2 instances you manage or through AWS Fargate’s pod-by-pod approach. The differences can be pretty dramatic in practice. I’m setting up a demo cluster right now using Pulumi, so let me show you what I mean.

  1. Bin Packing
  2. Pros and Cons
  3. Workload Example: Static Analysis
  4. Example: Go Services for E-commerce
  5. Fargate vs EC2 Pricing
  6. Misconceptions About Fargate
  7. Managing Container Orchestration with Pulumi
  8. Why Not Both

Here is my Fargate cluster:

Read more →

Reduce Cloud Costs with EC2 ARM Instances

Whether you’re migrating to the cloud or have existing infrastructure, cloud spend can be a significant barrier to your success. Too small of a budget could prevent your organization from meeting your performance metrics. You can use different strategies to reduce cloud spend, such as using Spot Instances, which cost less than On-Demand Instances or scaling your infrastructure based on peak usage times.

With the addition of Graviton2 based EC2 Instances, AWS offers an on-demand alternative for decreasing cloud spend. Both Amazon and independent testing demonstrated that the general-purpose M6g instance delivered up to a 40% gain of price/performance compared to Intel m5.large instances. In addition to the M6g general-purpose instance, AWS offers instances general-purpose burstable (T4g), compute-optimized (C6g), and memory-optimized (R6g) EC2 instances.

Read more →

The infrastructure as code platform for any cloud.