This guide shows how to land LaunchDarkly events in Snowflake on Azure with batch COPY INTO.
It covers the storage, eventing, and Snowflake objects for this setup so you can get raw events flowing first and shape downstream tables later.
On Azure, this guide uses Azure Functions for the public handler and Azure Blob Storage for the raw landing zone. The shipped Azure path keeps payloads in Blob Storage, then loads them into Snowflake with batch COPY INTO.
The batch COPY INTO path stages files first and leaves loading under your control. Choose it when you want to run COPY INTO on a predictable cadence and keep the raw payloads in object storage before Snowflake loads them. This blueprint provisions the public Azure Function endpoint, the Blob landing container, a SAS-backed external Azure stage, an X-Small warehouse, and a Snowflake task that runs COPY INTO on a configurable cadence (hourly by default).
In the blueprint, the top-level program creates the Snowflake database, schema, and warehouse, then passes those names into the reusable batch component. That means you can keep the blueprint-created objects, rename them, or swap them for your own existing database, schema, and warehouse without rewriting the loading logic. The same entrypoint also sets taskIntervalMinutes, which controls how often the Snowflake task runs COPY INTO.
If you want to change those defaults before deploying, set stack config like this:
pulumi config set database LANDING_ZONE_WEBHOOKS
pulumi config set schema RAW
pulumi config set warehouse WEBHOOK_BATCH_LOADER
pulumi config set taskIntervalMinutes 60
This guide provisions a LaunchDarkly webhook for you and points it at the deployed public endpoint. The handler validates the X-LD-Signature HMAC before forwarding each payload into the selected ingestion path.
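LaunchDarkly signs each delivery by computing an HMAC SHA-256 hex digest of the raw request body with the shared webhook secret and sending it in the X-LD-Signature header. A minimal sketch of the check the handler performs (the function and variable names here are illustrative, not the shipped handler code):

```python
import hashlib
import hmac


def signature_is_valid(secret: str, raw_body: bytes, received_signature: str) -> bool:
    """Compare the X-LD-Signature header against a locally computed digest."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information during the comparison.
    return hmac.compare_digest(expected, received_signature)
```

On a mismatch the handler should return 401 and write nothing to the landing container, so only authenticated payloads reach Snowflake.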
Quickstart
- Download the blueprint zip for your language below, or create a new Pulumi project with the same file layout shown in the Download section.
- Install dependencies for your selected language and configure Snowflake plus Azure.
- For batch setups, decide whether you want to keep the blueprint database, schema, and warehouse names or point the program at names you already use. If needed, also change `taskIntervalMinutes` to the cadence you want.
- Deploy the stack to create the public Azure Function endpoint, the Blob landing container, and the Snowflake loading objects.
- Register the webhook source and send a test event.
- Query the landing table in Snowflake to confirm the event arrived.
After the first test event, new rows usually appear within one task interval; with the default `taskIntervalMinutes` of 60, that means within about an hour.
Prerequisites
- a Pulumi account and the Pulumi CLI
- an Azure subscription where you can create resource groups, storage accounts, function apps, and RBAC assignments
- a Snowflake account where you can create databases, schemas, and loading roles
- a LaunchDarkly access token that can create webhooks and read the target resources you want to emit
- no extra Pulumi config is required unless you want to narrow which LaunchDarkly events the webhook emits by editing the `statements` array on the webhook resource later
For the Pulumi language you selected, initialize your stack for Azure with:
pulumi stack init dev
pulumi config set azure-native:location eastus
Set up credentials with Pulumi ESC
This guide needs cloud credentials, Snowflake credentials, and any source-specific token required to provision the webhook. A single ESC environment is usually the smallest setup that still keeps secrets out of local files.
values:
  azure:
    login:
      fn::open::azure-login:
        oidc:
          tenantId: 00000000-0000-0000-0000-000000000000
          subscriptionId: 00000000-0000-0000-0000-000000000000
  snowflake:
    login:
      fn::open::snowflake-login:
        oidc:
          account: <your-snowflake-account>
          user: ESC_SERVICE_USER
    organizationName: <your-org-name>
    accountName: <your-account-name>
  launchdarkly:
    accessToken:
      fn::secret: <your-launchdarkly-access-token>
  environmentVariables:
    SNOWFLAKE_USER: ${snowflake.login.user}
    SNOWFLAKE_TOKEN: ${snowflake.login.token}
  pulumiConfig:
    snowflake:organizationName: ${snowflake.organizationName}
    snowflake:accountName: ${snowflake.accountName}
    snowflake:authenticator: OAUTH
    snowflake:role: PULUMI_DEPLOYER
    launchdarkly:accessToken: ${launchdarkly.accessToken}
Then reference it from your stack config:
environment:
  - <your-org>/<your-environment>
config:
  webhook-to-snowflake:database: LANDING_ZONE_WEBHOOKS
What you get in the download
The downloadable example zip includes:
- `Pulumi.yaml`, the Pulumi program, dependency files, cloud runtime support files, and reusable components for the language you pick below
- a README with a shorter quick start for this exact setup
For batch setups, the top-level program is where you choose the Snowflake database, schema, warehouse, and task cadence. The reusable batch component then builds the stage, destination table, and scheduled load inside those objects.
In Python, that layout is:
- `__main__.py` as the Pulumi entrypoint
- `components/webhook_ingestion.py` for the public webhook endpoint
- `components/batch_pipeline.py` for the Snowflake loading path
- `functionapp/host.json` for the Azure Functions host
- `functionapp/requirements.txt` for the Azure Functions Python dependencies
- `functionapp/Webhook/__init__.py` for request validation and blob writes
- `functionapp/Webhook/function.json` for the public HTTP trigger route
- `requirements.txt` for the root Pulumi project
In TypeScript:
- `index.ts` as the Pulumi entrypoint
- `components/webhook_ingestion.ts` for the public webhook endpoint
- `components/batch_pipeline.ts` for the Snowflake loading path
- `functionapp/host.json` for the Azure Functions host
- `functionapp/requirements.txt` for the Azure Functions Python dependencies
- `functionapp/Webhook/__init__.py` for request validation and blob writes
- `functionapp/Webhook/function.json` for the public HTTP trigger route
- `package.json` and `tsconfig.json` for the root Pulumi project
The next sections show the same entrypoint and component files that ship in the download.
Blueprint Pulumi program
This blueprint shows the full resource wiring for the Azure batch COPY INTO path with a LaunchDarkly source. The downloadable repo uses the same entrypoint and component files shown below.
import pulumi
import pulumi_azure_native as azure_native
import pulumi_random as random
import pulumi_snowflake as snowflake
import lbrlabs_pulumi_launchdarkly as launchdarkly
from components.batch_pipeline import BatchPipeline
from components.webhook_ingestion import WebhookIngestion
config = pulumi.Config()
database_name = config.get("database") or "LANDING_ZONE_WEBHOOKS"
schema_name = config.get("schema") or "RAW"
warehouse_name = config.get("warehouse") or "WEBHOOK_BATCH_LOADER"
task_interval_minutes = config.get_int("taskIntervalMinutes") or 60
location = pulumi.Config("azure-native").require("location")
resource_group = azure_native.resources.ResourceGroup(
"webhook-rg",
resource_group_name=f"webhook-to-snowflake-{pulumi.get_stack()}",
location=location,
)
storage_suffix = random.RandomString(
"storage-suffix",
length=8,
special=False,
upper=False,
)
storage_account = azure_native.storage.StorageAccount(
"landingstorage",
account_name=storage_suffix.result.apply(lambda value: f"w2sf{value}"),
allow_blob_public_access=False,
allow_shared_key_access=True,
kind=azure_native.storage.Kind.STORAGE_V2,
location=location,
resource_group_name=resource_group.name,
sku=azure_native.storage.SkuArgs(name=azure_native.storage.SkuName.STANDARD_LRS),
)
landing_container = azure_native.storage.BlobContainer(
"landing-container",
account_name=storage_account.name,
container_name="incoming",
resource_group_name=resource_group.name,
)
storage_keys = azure_native.storage.list_storage_account_keys_output(
account_name=storage_account.name,
resource_group_name=resource_group.name,
)
storage_connection_string = pulumi.Output.all(storage_account.name, storage_keys.keys).apply(
lambda args: f"DefaultEndpointsProtocol=https;AccountName={args[0]};AccountKey={args[1][0].value};EndpointSuffix=core.windows.net"
)
database = snowflake.Database("landing-db", name=database_name)
schema = snowflake.Schema("raw-schema", name=schema_name, database=database.name)
warehouse = snowflake.Warehouse(
"batch-loader-warehouse",
name=warehouse_name,
warehouse_size="XSMALL",
auto_resume="true",
auto_suspend=60,
initially_suspended=False,
)
webhook_secret = random.RandomPassword("webhook-secret", length=32, special=False)
ingestion = WebhookIngestion(
"source-webhooks",
location=resource_group.location,
resource_group_name=resource_group.name,
storage_account_name=storage_account.name,
blob_endpoint=storage_account.primary_endpoints.blob,
landing_container_name=landing_container.name,
storage_connection_string=storage_connection_string,
webhook_secret=webhook_secret.result,
)
pipeline = BatchPipeline(
"source-events",
database=database.name,
schema_name=schema.name,
warehouse_name=warehouse.name,
task_interval_minutes=task_interval_minutes,
resource_group_name=resource_group.name,
storage_account_name=storage_account.name,
landing_container_name=landing_container.name,
blob_endpoint=storage_account.primary_endpoints.blob,
)
endpoint_url = ingestion.endpoint_url
launchdarkly.Webhook(
"source-webhook",
url=endpoint_url,
name="snowflake-webhook",
on=True,
secret=webhook_secret.result,
statements=[
launchdarkly.WebhookStatementArgs(
effect="allow",
actions=["*"],
resources=["proj/*:env/*:flag/*"],
)
],
tags=["pulumi", "snowflake"],
)
pulumi.export("publicEndpointUrl", ingestion.endpoint_url)
pulumi.export("landingContainerName", landing_container.name)
pulumi.export("stageName", pipeline.stage_name)
pulumi.export("tableName", pipeline.table_name)
pulumi.export("warehouseName", warehouse.name)
pulumi.export("taskName", pipeline.task_name)
import * as azure_native from "@pulumi/azure-native";
import * as pulumi from "@pulumi/pulumi";
import * as random from "@pulumi/random";
import * as snowflake from "@pulumi/snowflake";
import * as launchdarkly from "@lbrlabs/pulumi-launchdarkly";
import { BatchPipeline } from "./components/batch_pipeline";
import { WebhookIngestion } from "./components/webhook_ingestion";
const config = new pulumi.Config();
const databaseName = config.get("database") ?? "LANDING_ZONE_WEBHOOKS";
const taskIntervalMinutes = config.getNumber("taskIntervalMinutes") ?? 60;
const location = new pulumi.Config("azure-native").require("location");
const resourceGroup = new azure_native.resources.ResourceGroup("webhook-rg", {
resourceGroupName: `webhook-to-snowflake-${pulumi.getStack()}`,
location,
});
const storageSuffix = new random.RandomString("storage-suffix", {
length: 8,
special: false,
upper: false,
});
const storageAccount = new azure_native.storage.StorageAccount("landingstorage", {
accountName: storageSuffix.result.apply((value) => `w2sf${value}`),
allowBlobPublicAccess: false,
allowSharedKeyAccess: true,
kind: azure_native.storage.Kind.StorageV2,
location,
resourceGroupName: resourceGroup.name,
sku: {
name: azure_native.storage.SkuName.Standard_LRS,
},
});
const landingContainer = new azure_native.storage.BlobContainer("landing-container", {
accountName: storageAccount.name,
containerName: "incoming",
resourceGroupName: resourceGroup.name,
});
const storageKeys = azure_native.storage.listStorageAccountKeysOutput({
accountName: storageAccount.name,
resourceGroupName: resourceGroup.name,
});
const storageConnectionString = pulumi.all([storageAccount.name, storageKeys.keys]).apply(([accountName, keys]) =>
`DefaultEndpointsProtocol=https;AccountName=${accountName};AccountKey=${keys[0].value};EndpointSuffix=core.windows.net`,
);
const database = new snowflake.Database("landing-db", { name: databaseName });
const schema = new snowflake.Schema("raw-schema", {
name: config.get("schema") ?? "RAW",
database: database.name,
});
const warehouse = new snowflake.Warehouse("batch-loader-warehouse", {
name: config.get("warehouse") ?? "WEBHOOK_BATCH_LOADER",
warehouseSize: "XSMALL",
autoResume: "true",
autoSuspend: 60,
initiallySuspended: false,
});
const sharedSecret = new random.RandomPassword("webhook-secret", {
length: 32,
special: false,
});
const ingestion = new WebhookIngestion("source-webhooks", {
location: resourceGroup.location,
resourceGroupName: resourceGroup.name,
storageAccountName: storageAccount.name,
blobEndpoint: storageAccount.primaryEndpoints.blob,
landingContainerName: landingContainer.name,
storageConnectionString,
webhookSecret: sharedSecret.result,
});
const pipeline = new BatchPipeline("source-events", {
database: database.name,
schemaName: schema.name,
warehouseName: warehouse.name,
taskIntervalMinutes,
resourceGroupName: resourceGroup.name,
storageAccountName: storageAccount.name,
landingContainerName: landingContainer.name,
blobEndpoint: storageAccount.primaryEndpoints.blob,
});
const endpointUrl = ingestion.endpointUrl;
new launchdarkly.Webhook("source-webhook", {
url: endpointUrl,
name: "snowflake-webhook",
on: true,
secret: sharedSecret.result,
statements: [
{
effect: "allow",
actions: ["*"],
resources: ["proj/*:env/*:flag/*"],
},
],
tags: ["pulumi", "snowflake"],
});
export const publicEndpointUrl = ingestion.endpointUrl;
export const landingContainerName = landingContainer.name;
export const stageName = pipeline.stageName;
export const tableName = pipeline.tableName;
export const warehouseName = warehouse.name;
export const taskName = pipeline.taskName;
Reusable components
The entrypoint stays small because the real ingestion work lives in reusable modules. These are the same component files packaged in the downloadable blueprint for this setup.
components/webhook_ingestion.py
Accepts the public webhook request, validates the signature, normalizes the payload, and writes the raw event into the landing path for this setup.
from __future__ import annotations
from dataclasses import dataclass
import pulumi
import pulumi_azure_native as azure_native
@dataclass
class WebhookIngestion:
    endpoint_url: pulumi.Output[str]
    app_name: pulumi.Output[str]

    def __init__(
        self,
        name: str,
        *,
        location: pulumi.Input[str],
        resource_group_name: pulumi.Input[str],
        storage_account_name: pulumi.Input[str],
        blob_endpoint: pulumi.Input[str],
        landing_container_name: pulumi.Input[str],
        storage_connection_string: pulumi.Input[str],
        webhook_secret: pulumi.Input[str],
    ) -> None:
        deployment_container = azure_native.storage.BlobContainer(
            f"{name}-deployments",
            account_name=storage_account_name,
            container_name="deploymentpackage",
            resource_group_name=resource_group_name,
        )
        azure_native.storage.Blob(
            f"{name}-package",
            account_name=storage_account_name,
            blob_name="functionapp.zip",
            container_name=deployment_container.name,
            content_type="application/zip",
            resource_group_name=resource_group_name,
            source=pulumi.AssetArchive(
                {
                    "host.json": pulumi.FileAsset("functionapp/host.json"),
                    "requirements.txt": pulumi.FileAsset("functionapp/requirements.txt"),
                    "Webhook/function.json": pulumi.FileAsset(
                        "functionapp/Webhook/function.json"
                    ),
                    "Webhook/__init__.py": pulumi.FileAsset(
                        "functionapp/Webhook/__init__.py"
                    ),
                }
            ),
            type=azure_native.storage.BlobType.BLOCK,
        )
        plan = azure_native.web.AppServicePlan(
            f"{name}-plan",
            kind="functionapp",
            location=location,
            resource_group_name=resource_group_name,
            reserved=True,
            sku=azure_native.web.SkuDescriptionArgs(
                name="FC1",
                tier="FlexConsumption",
            ),
        )
        function_app = azure_native.web.WebApp(
            f"{name}-app",
            kind="functionapp,linux",
            location=location,
            https_only=True,
            public_network_access="Enabled",
            resource_group_name=resource_group_name,
            server_farm_id=plan.id,
            function_app_config=azure_native.web.FunctionAppConfigArgs(
                deployment=azure_native.web.FunctionsDeploymentArgs(
                    storage=azure_native.web.FunctionsDeploymentStorageArgs(
                        authentication=azure_native.web.FunctionsDeploymentAuthenticationArgs(
                            storage_account_connection_string_name="DEPLOYMENT_STORAGE_CONNECTION_STRING",
                            type=azure_native.web.AuthenticationType.STORAGE_ACCOUNT_CONNECTION_STRING,
                        ),
                        type=azure_native.web.FunctionsDeploymentStorageType.BLOB_CONTAINER,
                        value=pulumi.Output.format("{0}{1}", blob_endpoint, deployment_container.name),
                    )
                ),
                runtime=azure_native.web.FunctionsRuntimeArgs(
                    name=azure_native.web.RuntimeName.PYTHON,
                    version="3.11",
                ),
                scale_and_concurrency=azure_native.web.FunctionsScaleAndConcurrencyArgs(
                    instance_memory_mb=2048,
                    maximum_instance_count=40,
                ),
            ),
            site_config=azure_native.web.SiteConfigArgs(
                app_settings=[
                    azure_native.web.NameValuePairArgs(
                        name="AzureWebJobsStorage",
                        value=storage_connection_string,
                    ),
                    azure_native.web.NameValuePairArgs(
                        name="DEPLOYMENT_STORAGE_CONNECTION_STRING",
                        value=storage_connection_string,
                    ),
                    azure_native.web.NameValuePairArgs(
                        name="LANDING_CONNECTION_STRING",
                        value=storage_connection_string,
                    ),
                    azure_native.web.NameValuePairArgs(
                        name="LANDING_CONTAINER_NAME",
                        value=landing_container_name,
                    ),
                    azure_native.web.NameValuePairArgs(
                        name="WEBHOOK_SECRET",
                        value=webhook_secret,
                    ),
                    azure_native.web.NameValuePairArgs(
                        name="FUNCTIONS_EXTENSION_VERSION",
                        value="~4",
                    ),
                ]
            ),
        )
        self.endpoint_url = function_app.default_host_name.apply(
            lambda host: f"https://{host}/api/webhook"
        )
        self.app_name = function_app.name
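The request handling itself lives in `functionapp/Webhook/__init__.py`, which ships in the download rather than being reproduced here. One detail worth understanding is that each accepted payload must land as a uniquely named blob so concurrent deliveries never overwrite each other. A sketch of one such naming scheme, assuming a time-plus-UUID layout (the exact shipped format may differ):

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def landing_blob_name(received_at: Optional[datetime] = None) -> str:
    """Build a collision-free, time-prefixed blob name for a raw payload."""
    received_at = received_at or datetime.now(timezone.utc)
    # Date-based prefixes keep the container browsable and make day-scoped
    # backfills with COPY INTO straightforward.
    return f"{received_at:%Y/%m/%d}/{received_at:%H%M%S}-{uuid.uuid4().hex}.json"
```

The UUID suffix guarantees uniqueness even when two deliveries arrive in the same second.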
components/batch_pipeline.py
Creates the Snowflake-side loading resources for this setup: the landing stage, the destination table, and the batch COPY INTO loading path.
from __future__ import annotations
import pulumi
import pulumi_azure_native as azure_native
import pulumi_snowflake as snowflake
def _copy_into_statement(database: pulumi.Input[str], schema_name: pulumi.Input[str]) -> pulumi.Output[str]:
    return pulumi.Output.all(database, schema_name).apply(
        lambda args: (
            f'COPY INTO "{args[0]}"."{args[1]}"."WEBHOOK_EVENTS" '
            f'FROM (SELECT metadata$filename, metadata$file_last_modified, $1, sysdate() '
            f'FROM @"{args[0]}"."{args[1]}"."WEBHOOK_EVENTS_STAGE") '
            "FILE_FORMAT = (TYPE = JSON)"
        )
    )


class BatchPipeline:
    def __init__(
        self,
        name: str,
        *,
        database: pulumi.Input[str],
        schema_name: pulumi.Input[str],
        warehouse_name: pulumi.Input[str],
        task_interval_minutes: int,
        resource_group_name: pulumi.Input[str],
        storage_account_name: pulumi.Input[str],
        landing_container_name: pulumi.Input[str],
        blob_endpoint: pulumi.Input[str],
    ) -> None:
        preview_provider = snowflake.Provider(
            f"{name}-preview-provider",
            preview_features_enabled=[
                "snowflakeStageExternalAzureResource",
            ],
        )
        preview_opts = pulumi.ResourceOptions(provider=preview_provider)
        stage_url = pulumi.Output.all(blob_endpoint, landing_container_name).apply(
            lambda args: f"azure://{args[0].replace('https://', '').rstrip('/')}/{args[1]}/incoming/"
        )
        storage_keys = azure_native.storage.list_storage_account_keys_output(
            account_name=storage_account_name,
            resource_group_name=resource_group_name,
        )
        stage_sas = pulumi.Output.all(
            storage_account_name,
            landing_container_name,
            resource_group_name,
            storage_keys.keys,
        ).apply(
            lambda args: azure_native.storage.list_storage_account_service_sas(
                account_name=args[0],
                canonicalized_resource=f"/blob/{args[0]}/{args[1]}",
                key_to_sign=args[3][0].value,
                permissions="rl",
                protocols="https",
                resource="c",
                resource_group_name=args[2],
                shared_access_expiry_time="2035-01-01T00:00:00Z",
            ).service_sas_token
        )
        table = snowflake.Table(
            f"{name}-table",
            database=database,
            schema=schema_name,
            name="WEBHOOK_EVENTS",
            columns=[
                snowflake.TableColumnArgs(name="FILENAME", type="STRING", nullable=False),
                snowflake.TableColumnArgs(name="LAST_MODIFIED_AT", type="TIMESTAMP_NTZ", nullable=False),
                snowflake.TableColumnArgs(name="CONTENT", type="VARIANT"),
                snowflake.TableColumnArgs(name="LOADED_AT", type="TIMESTAMP_NTZ"),
            ],
        )
        stage = snowflake.StageExternalAzure(
            f"{name}-stage",
            name="WEBHOOK_EVENTS_STAGE",
            database=database,
            schema=schema_name,
            url=stage_url,
            credentials=snowflake.StageExternalAzureCredentialsArgs(
                azure_sas_token=stage_sas,
            ),
            opts=preview_opts,
        )
        task = snowflake.Task(
            f"{name}-task",
            database=database,
            schema=schema_name,
            name="WEBHOOK_EVENTS_TASK",
            warehouse=warehouse_name,
            started=True,
            schedule={"minutes": task_interval_minutes},
            sql_statement=_copy_into_statement(database, schema_name),
        )
        self.stage_name = stage.fully_qualified_name
        self.table_name = table.name
        self.warehouse_name = pulumi.Output.from_input(warehouse_name)
        self.task_name = task.name
        self.stage_url = stage_url
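With the default object names, the statement builder above renders SQL equivalent to the following. This sketch mirrors `_copy_into_statement` with plain strings instead of Pulumi Outputs so you can see exactly what the task executes:

```python
def render_copy_into(database: str, schema: str) -> str:
    """Mirror the component's COPY INTO builder for concrete (non-Output) names."""
    return (
        f'COPY INTO "{database}"."{schema}"."WEBHOOK_EVENTS" '
        f"FROM (SELECT metadata$filename, metadata$file_last_modified, $1, sysdate() "
        f'FROM @"{database}"."{schema}"."WEBHOOK_EVENTS_STAGE") '
        "FILE_FORMAT = (TYPE = JSON)"
    )


# Render the statement the task runs with the blueprint defaults.
print(render_copy_into("LANDING_ZONE_WEBHOOKS", "RAW"))
```

Because COPY INTO records per-file load history, rerunning the task on a schedule does not duplicate blobs that were already loaded within Snowflake's load-metadata retention window.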
components/webhook_ingestion.ts
Accepts the public webhook request, validates the signature, normalizes the payload, and writes the raw event into the landing path for this setup.
import * as azure_native from "@pulumi/azure-native";
import * as pulumi from "@pulumi/pulumi";
export interface WebhookIngestionArgs {
location: pulumi.Input<string>;
resourceGroupName: pulumi.Input<string>;
storageAccountName: pulumi.Input<string>;
blobEndpoint: pulumi.Input<string>;
landingContainerName: pulumi.Input<string>;
storageConnectionString: pulumi.Input<string>;
webhookSecret: pulumi.Input<string>;
}
export class WebhookIngestion {
public readonly endpointUrl: pulumi.Output<string>;
public readonly appName: pulumi.Output<string>;
constructor(name: string, args: WebhookIngestionArgs) {
const deploymentContainer = new azure_native.storage.BlobContainer(`${name}-deployments`, {
accountName: args.storageAccountName,
containerName: "deploymentpackage",
resourceGroupName: args.resourceGroupName,
});
new azure_native.storage.Blob(`${name}-package`, {
accountName: args.storageAccountName,
blobName: "functionapp.zip",
containerName: deploymentContainer.name,
contentType: "application/zip",
resourceGroupName: args.resourceGroupName,
source: new pulumi.asset.AssetArchive({
"host.json": new pulumi.asset.FileAsset("functionapp/host.json"),
"requirements.txt": new pulumi.asset.FileAsset("functionapp/requirements.txt"),
"Webhook/function.json": new pulumi.asset.FileAsset("functionapp/Webhook/function.json"),
"Webhook/__init__.py": new pulumi.asset.FileAsset("functionapp/Webhook/__init__.py"),
}),
type: azure_native.storage.BlobType.Block,
});
const plan = new azure_native.web.AppServicePlan(`${name}-plan`, {
kind: "functionapp",
location: args.location,
resourceGroupName: args.resourceGroupName,
reserved: true,
sku: {
name: "FC1",
tier: "FlexConsumption",
},
});
const app = new azure_native.web.WebApp(`${name}-app`, {
kind: "functionapp,linux",
location: args.location,
httpsOnly: true,
publicNetworkAccess: "Enabled",
resourceGroupName: args.resourceGroupName,
serverFarmId: plan.id,
functionAppConfig: {
deployment: {
storage: {
authentication: {
storageAccountConnectionStringName: "DEPLOYMENT_STORAGE_CONNECTION_STRING",
type: azure_native.web.AuthenticationType.StorageAccountConnectionString,
},
type: azure_native.web.FunctionsDeploymentStorageType.BlobContainer,
value: pulumi.interpolate`${args.blobEndpoint}${deploymentContainer.name}`,
},
},
runtime: {
name: azure_native.web.RuntimeName.Python,
version: "3.11",
},
scaleAndConcurrency: {
instanceMemoryMB: 2048,
maximumInstanceCount: 40,
},
},
siteConfig: {
appSettings: [
{ name: "AzureWebJobsStorage", value: args.storageConnectionString },
{ name: "DEPLOYMENT_STORAGE_CONNECTION_STRING", value: args.storageConnectionString },
{ name: "LANDING_CONNECTION_STRING", value: args.storageConnectionString },
{ name: "LANDING_CONTAINER_NAME", value: args.landingContainerName },
{ name: "WEBHOOK_SECRET", value: args.webhookSecret },
{ name: "FUNCTIONS_EXTENSION_VERSION", value: "~4" },
],
},
});
this.endpointUrl = app.defaultHostName.apply((host) => `https://${host}/api/webhook`);
this.appName = app.name;
}
}
components/batch_pipeline.ts
Creates the Snowflake-side loading resources for this setup: the landing stage, the destination table, and the batch COPY INTO loading path.
import * as azure_native from "@pulumi/azure-native";
import * as pulumi from "@pulumi/pulumi";
import * as snowflake from "@pulumi/snowflake";
export interface BatchPipelineArgs {
database: pulumi.Input<string>;
schemaName: pulumi.Input<string>;
warehouseName: pulumi.Input<string>;
taskIntervalMinutes: number;
resourceGroupName: pulumi.Input<string>;
storageAccountName: pulumi.Input<string>;
landingContainerName: pulumi.Input<string>;
blobEndpoint: pulumi.Input<string>;
}
function copyIntoStatement(database: pulumi.Input<string>, schemaName: pulumi.Input<string>) {
return pulumi.all([database, schemaName]).apply(([currentDatabase, currentSchema]) =>
`COPY INTO "${currentDatabase}"."${currentSchema}"."WEBHOOK_EVENTS" ` +
`FROM (SELECT metadata$filename, metadata$file_last_modified, $1, sysdate() ` +
`FROM @"${currentDatabase}"."${currentSchema}"."WEBHOOK_EVENTS_STAGE") ` +
"FILE_FORMAT = (TYPE = JSON)",
);
}
export class BatchPipeline {
public readonly stageName: pulumi.Output<string>;
public readonly tableName: pulumi.Output<string>;
public readonly warehouseName: pulumi.Output<string>;
public readonly taskName: pulumi.Output<string>;
public readonly stageUrl: pulumi.Output<string>;
constructor(name: string, args: BatchPipelineArgs) {
const previewProvider = new snowflake.Provider(`${name}-preview-provider`, {
previewFeaturesEnabled: [
"snowflakeStageExternalAzureResource",
],
});
const previewOpts = { provider: previewProvider };
const stageUrl = pulumi
.all([args.blobEndpoint, args.landingContainerName])
.apply(([endpoint, container]) => `azure://${endpoint.replace("https://", "").replace(/\/$/, "")}/${container}/incoming/`);
const storageKeys = azure_native.storage.listStorageAccountKeysOutput({
accountName: args.storageAccountName,
resourceGroupName: args.resourceGroupName,
});
const stageSas = pulumi
.all([args.storageAccountName, args.landingContainerName, args.resourceGroupName, storageKeys.keys])
.apply(([accountName, containerName, resourceGroupName, keys]) =>
azure_native.storage.listStorageAccountServiceSAS({
accountName,
canonicalizedResource: `/blob/${accountName}/${containerName}`,
keyToSign: keys[0].value,
permissions: "rl",
protocols: "https",
resource: "c",
resourceGroupName,
sharedAccessExpiryTime: "2035-01-01T00:00:00Z",
}).then((result) => result.serviceSasToken),
);
const table = new snowflake.Table(`${name}-table`, {
database: args.database,
schema: args.schemaName,
name: "WEBHOOK_EVENTS",
columns: [
{ name: "FILENAME", type: "STRING", nullable: false },
{ name: "LAST_MODIFIED_AT", type: "TIMESTAMP_NTZ", nullable: false },
{ name: "CONTENT", type: "VARIANT" },
{ name: "LOADED_AT", type: "TIMESTAMP_NTZ" },
],
});
const stage = new snowflake.StageExternalAzure(`${name}-stage`, {
name: "WEBHOOK_EVENTS_STAGE",
database: args.database,
schema: args.schemaName,
url: stageUrl,
credentials: {
azureSasToken: stageSas,
},
}, previewOpts);
const task = new snowflake.Task(`${name}-task`, {
database: args.database,
schema: args.schemaName,
name: "WEBHOOK_EVENTS_TASK",
warehouse: args.warehouseName,
started: true,
schedule: { minutes: args.taskIntervalMinutes },
sqlStatement: copyIntoStatement(args.database, args.schemaName),
});
this.stageName = stage.fullyQualifiedName;
this.tableName = table.name;
this.warehouseName = pulumi.output(args.warehouseName);
this.taskName = task.name;
this.stageUrl = stageUrl;
}
}
Verify the data landed
After you send a test event, query Snowflake to confirm the records are visible:
SELECT FILENAME,
LAST_MODIFIED_AT,
CONTENT,
LOADED_AT
FROM LANDING_ZONE_WEBHOOKS.RAW.WEBHOOK_EVENTS
ORDER BY LOADED_AT DESC;
For this path, payloads stay in Azure Blob Storage until the Snowflake task runs COPY INTO against the external stage.
Operating notes
- Keep the first table as a raw landing zone. Flatten and model into downstream tables later.
- Rotate the shared webhook secret when you roll senders or suspect exposure.
- Watch the landing storage path and Snowflake task history so failed loads and malformed payloads do not go unnoticed.
- Use a least-privilege Snowflake reader role for analysts instead of querying with the loading role.
- When you choose batch loading, tune `taskIntervalMinutes` to match how quickly you want new files copied into Snowflake and how much warehouse activity you want between loads.
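The "flatten later" advice above can start as simply as pulling a few stable top-level fields out of each raw payload before heavier modeling. A stdlib sketch of that first pass, where the field names are hypothetical examples rather than a guaranteed LaunchDarkly payload schema:

```python
import json


def flatten_event(raw: str) -> dict:
    """Extract a few commonly useful top-level fields from a raw webhook payload."""
    event = json.loads(raw)
    # .get() keeps the pass tolerant of payload shapes that omit a field.
    return {
        "kind": event.get("kind"),
        "date": event.get("date"),
        "title": event.get("titleVerb"),
        "member_email": (event.get("member") or {}).get("email"),
    }


sample = '{"kind": "flag", "date": 1714000000000, "member": {"email": "dev@example.com"}}'
print(flatten_event(sample))
```

In practice you would express the same projection as a SQL view over the `CONTENT` VARIANT column, keeping `WEBHOOK_EVENTS` itself untouched as the raw landing zone.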