1. Docs
  2. Administration
  3. Self-Hosting
  4. Operations
  5. Networking

Networking and Load Balancing

    Self-hosting is only available with Pulumi Business Critical. If you would like to evaluate the self-hosted Pulumi Cloud, sign up for the 30-day trial or contact us.

    This page covers network architecture, multi-AZ deployment, auto-scaling, load balancing, and DNS configuration for production self-hosted deployments.

    For ingress/egress port requirements, see Network requirements.

    Multi-AZ deployment

    Deploy compute resources across at least two availability zones:

    • Kubernetes: Spread pods across AZs using pod anti-affinity or topology spread constraints.
    • ECS/Fargate: Configure services to launch tasks across multiple subnets in different AZs.
    • VMs: Use auto-scaling groups spanning multiple AZs.

    Network architecture

    Deploy a VPC/VNet with subnets in 2+ AZs, separated by tier:

    • Public subnets: Load balancers, NAT gateways
    • Private subnets: Application containers
    • Database subnets: Database instances (no public access)

    Additional recommendations:

    • Deploy one NAT gateway per AZ to avoid cross-AZ NAT bottlenecks.
    • Use VPC endpoints or private endpoints for object storage access to reduce NAT traffic and improve performance.

    Auto-scaling

    Configure auto-scaling for the API service based on CPU utilization:

    • Target: 50–60% average CPU utilization
    • Minimum instances: 2 (for HA across AZs)
    • Maximum instances: 4x desired count (to handle burst traffic)
    • Scale-in protection: Use graceful draining to allow in-flight requests to complete before terminating instances

    Graceful shutdown

    When scaling in or deploying updates, ensure containers have time to finish in-flight requests:

    • Set container stop timeout to at least 120 seconds.
    • Kubernetes: Set terminationGracePeriodSeconds: 130 on the API pod spec (slightly above the 120-second stop timeout to allow clean shutdown before Kubernetes force-kills the pod).
    • ECS: Configure deregistration delay on the target group and set stopTimeout on the container definition.
    • Use lifecycle hooks (ECS) or preStop hooks (Kubernetes) to drain connections before shutdown.

    Load balancer configuration

    • Deploy an Application Load Balancer (or equivalent).
    • Enable deletion protection on the load balancer in production.
    • Configure health checks for the API service (/api/status endpoint).
    • Set health check grace period to 120 seconds to allow containers to start.

    DNS

    Configure two DNS records pointing to your load balancer:

    • api.{domain} - for CLI and API access
    • app.{domain} - for web console access