Manage AWS Lake Formation Data Lake Settings

The aws:lakeformation/dataLakeSettings:DataLakeSettings resource, part of the Pulumi AWS provider, configures account-level Lake Formation settings: who can administer the data lake, what default permissions apply to new catalog resources, and how cross-account sharing works. This guide focuses on four capabilities: administrator designation, default catalog permissions, EMR integration, and cross-account sharing protocols.

Lake Formation settings reference IAM users and roles that must exist separately. The examples are intentionally small. Combine them with your own IAM principals and catalog resources.

Designate IAM principals as data lake administrators

Lake Formation deployments begin by designating IAM users or roles as administrators who can grant permissions and manage catalog resources.

TypeScript:

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const example = new aws.lakeformation.DataLakeSettings("example", {admins: [
    test.arn,
    testAwsIamRole.arn,
]});

Python:

import pulumi
import pulumi_aws as aws

example = aws.lakeformation.DataLakeSettings("example", admins=[
    test["arn"],
    test_aws_iam_role["arn"],
])

Go:

package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/lakeformation"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		_, err := lakeformation.NewDataLakeSettings(ctx, "example", &lakeformation.DataLakeSettingsArgs{
			Admins: pulumi.StringArray{
				test.Arn,
				testAwsIamRole.Arn,
			},
		})
		if err != nil {
			return err
		}
		return nil
	})
}

C#:

using System.Collections.Generic;
using System.Linq;
using Pulumi;
using Aws = Pulumi.Aws;

return await Deployment.RunAsync(() => 
{
    var example = new Aws.LakeFormation.DataLakeSettings("example", new()
    {
        Admins = new[]
        {
            test.Arn,
            testAwsIamRole.Arn,
        },
    });

});

Java:

package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.aws.lakeformation.DataLakeSettings;
import com.pulumi.aws.lakeformation.DataLakeSettingsArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var example = new DataLakeSettings("example", DataLakeSettingsArgs.builder()
            .admins(            
                test.arn(),
                testAwsIamRole.arn())
            .build());

    }
}

YAML:

resources:
  example:
    type: aws:lakeformation:DataLakeSettings
    properties:
      admins:
        - ${test.arn}
        - ${testAwsIamRole.arn}

The admins property lists IAM principal ARNs. These administrators can create databases, grant permissions, and manage Lake Formation resources. Without explicit administrators, Lake Formation uses IAM-based permissions, which may not provide the fine-grained control you need.
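
Because admins expects IAM principal ARNs, a small pre-flight check can catch a mistyped value before deployment. The following is a minimal sketch, not part of the Pulumi API: the helper name check_admin_arns, the regular expression, and the placeholder account ID are all assumptions made for illustration, and the pattern is a simplification of the IAM ARN format. It only applies to ARNs known as plain strings (for example, from stack configuration), not to Pulumi outputs resolved at deploy time.

```python
import re

# Assumed, simplified shape of an IAM user or role ARN. The partition list
# and the allowed name/path characters are approximations for this sketch.
_IAM_PRINCIPAL_ARN = re.compile(
    r"arn:(aws|aws-cn|aws-us-gov):iam::\d{12}:(user|role)/[\w+=,.@/-]+"
)


def check_admin_arns(arns):
    """Return the ARNs as a list, or raise ValueError naming the bad ones."""
    bad = [arn for arn in arns if not _IAM_PRINCIPAL_ARN.fullmatch(arn)]
    if bad:
        raise ValueError(f"not IAM user or role ARNs: {bad}")
    return list(arns)


# Placeholder ARN for illustration only.
admins = check_admin_arns(["arn:aws:iam::123456789012:role/lake-admin"])
```

The checked list could then be passed as the admins argument of the resource.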

Set default permissions for new databases and tables

Organizations often establish baseline permissions that apply automatically when users create new catalog resources.

TypeScript:

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const example = new aws.lakeformation.DataLakeSettings("example", {
    admins: [
        test.arn,
        testAwsIamRole.arn,
    ],
    createDatabaseDefaultPermissions: [{
        permissions: [
            "SELECT",
            "ALTER",
            "DROP",
        ],
        principal: test.arn,
    }],
    createTableDefaultPermissions: [{
        permissions: ["ALL"],
        principal: testAwsIamRole.arn,
    }],
});

Python:

import pulumi
import pulumi_aws as aws

example = aws.lakeformation.DataLakeSettings("example",
    admins=[
        test["arn"],
        test_aws_iam_role["arn"],
    ],
    create_database_default_permissions=[{
        "permissions": [
            "SELECT",
            "ALTER",
            "DROP",
        ],
        "principal": test["arn"],
    }],
    create_table_default_permissions=[{
        "permissions": ["ALL"],
        "principal": test_aws_iam_role["arn"],
    }])

Go:

package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/lakeformation"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		_, err := lakeformation.NewDataLakeSettings(ctx, "example", &lakeformation.DataLakeSettingsArgs{
			Admins: pulumi.StringArray{
				test.Arn,
				testAwsIamRole.Arn,
			},
			CreateDatabaseDefaultPermissions: lakeformation.DataLakeSettingsCreateDatabaseDefaultPermissionArray{
				&lakeformation.DataLakeSettingsCreateDatabaseDefaultPermissionArgs{
					Permissions: pulumi.StringArray{
						pulumi.String("SELECT"),
						pulumi.String("ALTER"),
						pulumi.String("DROP"),
					},
					Principal: pulumi.Any(test.Arn),
				},
			},
			CreateTableDefaultPermissions: lakeformation.DataLakeSettingsCreateTableDefaultPermissionArray{
				&lakeformation.DataLakeSettingsCreateTableDefaultPermissionArgs{
					Permissions: pulumi.StringArray{
						pulumi.String("ALL"),
					},
					Principal: pulumi.Any(testAwsIamRole.Arn),
				},
			},
		})
		if err != nil {
			return err
		}
		return nil
	})
}

C#:

using System.Collections.Generic;
using System.Linq;
using Pulumi;
using Aws = Pulumi.Aws;

return await Deployment.RunAsync(() => 
{
    var example = new Aws.LakeFormation.DataLakeSettings("example", new()
    {
        Admins = new[]
        {
            test.Arn,
            testAwsIamRole.Arn,
        },
        CreateDatabaseDefaultPermissions = new[]
        {
            new Aws.LakeFormation.Inputs.DataLakeSettingsCreateDatabaseDefaultPermissionArgs
            {
                Permissions = new[]
                {
                    "SELECT",
                    "ALTER",
                    "DROP",
                },
                Principal = test.Arn,
            },
        },
        CreateTableDefaultPermissions = new[]
        {
            new Aws.LakeFormation.Inputs.DataLakeSettingsCreateTableDefaultPermissionArgs
            {
                Permissions = new[]
                {
                    "ALL",
                },
                Principal = testAwsIamRole.Arn,
            },
        },
    });

});

Java:

package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.aws.lakeformation.DataLakeSettings;
import com.pulumi.aws.lakeformation.DataLakeSettingsArgs;
import com.pulumi.aws.lakeformation.inputs.DataLakeSettingsCreateDatabaseDefaultPermissionArgs;
import com.pulumi.aws.lakeformation.inputs.DataLakeSettingsCreateTableDefaultPermissionArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var example = new DataLakeSettings("example", DataLakeSettingsArgs.builder()
            .admins(            
                test.arn(),
                testAwsIamRole.arn())
            .createDatabaseDefaultPermissions(DataLakeSettingsCreateDatabaseDefaultPermissionArgs.builder()
                .permissions(                
                    "SELECT",
                    "ALTER",
                    "DROP")
                .principal(test.arn())
                .build())
            .createTableDefaultPermissions(DataLakeSettingsCreateTableDefaultPermissionArgs.builder()
                .permissions("ALL")
                .principal(testAwsIamRole.arn())
                .build())
            .build());

    }
}

YAML:

resources:
  example:
    type: aws:lakeformation:DataLakeSettings
    properties:
      admins:
        - ${test.arn}
        - ${testAwsIamRole.arn}
      createDatabaseDefaultPermissions:
        - permissions:
            - SELECT
            - ALTER
            - DROP
          principal: ${test.arn}
      createTableDefaultPermissions:
        - permissions:
            - ALL
          principal: ${testAwsIamRole.arn}

When a principal creates a database or table, Lake Formation applies the permissions defined in createDatabaseDefaultPermissions and createTableDefaultPermissions. Each entry specifies a principal ARN and a list of permissions (SELECT, ALTER, DROP, ALL). This ensures consistent access control without manual permission grants for every new resource.
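
When several stacks share the same baseline, it can help to build these entries with a small helper instead of repeating the dict literals. The sketch below is one possible approach, not provider code: default_permission and default_permission_blocks are hypothetical names, the ARN is a placeholder, and the three-block limit is taken from this guide's FAQ.

```python
# Limit on default-permission blocks, as stated in this guide's FAQ.
MAX_DEFAULT_PERMISSION_BLOCKS = 3


def default_permission(principal, permissions):
    """Build one entry in the dict shape used by the Python example above."""
    return {"permissions": list(permissions), "principal": principal}


def default_permission_blocks(*entries):
    """Collect entries into a list, rejecting more than three blocks."""
    if len(entries) > MAX_DEFAULT_PERMISSION_BLOCKS:
        raise ValueError(
            f"at most {MAX_DEFAULT_PERMISSION_BLOCKS} blocks allowed, "
            f"got {len(entries)}"
        )
    return list(entries)


# Placeholder ARN for illustration only.
analyst_arn = "arn:aws:iam::123456789012:role/analyst"
database_defaults = default_permission_blocks(
    default_permission(analyst_arn, ["SELECT", "ALTER", "DROP"]),
)
```

The resulting list has the same shape as the create_database_default_permissions argument in the Python example, so it could be passed there directly.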

Enable EMR clusters to query Lake Formation data

EMR clusters need special configuration to access Lake Formation-managed data, including external data filtering and session tag authorization.

TypeScript:

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const example = new aws.lakeformation.DataLakeSettings("example", {
    admins: [
        test.arn,
        testAwsIamRole.arn,
    ],
    createDatabaseDefaultPermissions: [{
        permissions: [
            "SELECT",
            "ALTER",
            "DROP",
        ],
        principal: test.arn,
    }],
    createTableDefaultPermissions: [{
        permissions: ["ALL"],
        principal: testAwsIamRole.arn,
    }],
    allowExternalDataFiltering: true,
    externalDataFilteringAllowLists: [
        current.accountId,
        thirdParty.accountId,
    ],
    authorizedSessionTagValueLists: ["Amazon EMR"],
    allowFullTableExternalDataAccess: true,
});

Python:

import pulumi
import pulumi_aws as aws

example = aws.lakeformation.DataLakeSettings("example",
    admins=[
        test["arn"],
        test_aws_iam_role["arn"],
    ],
    create_database_default_permissions=[{
        "permissions": [
            "SELECT",
            "ALTER",
            "DROP",
        ],
        "principal": test["arn"],
    }],
    create_table_default_permissions=[{
        "permissions": ["ALL"],
        "principal": test_aws_iam_role["arn"],
    }],
    allow_external_data_filtering=True,
    external_data_filtering_allow_lists=[
        current["accountId"],
        third_party["accountId"],
    ],
    authorized_session_tag_value_lists=["Amazon EMR"],
    allow_full_table_external_data_access=True)

Go:

package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/lakeformation"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		_, err := lakeformation.NewDataLakeSettings(ctx, "example", &lakeformation.DataLakeSettingsArgs{
			Admins: pulumi.StringArray{
				test.Arn,
				testAwsIamRole.Arn,
			},
			CreateDatabaseDefaultPermissions: lakeformation.DataLakeSettingsCreateDatabaseDefaultPermissionArray{
				&lakeformation.DataLakeSettingsCreateDatabaseDefaultPermissionArgs{
					Permissions: pulumi.StringArray{
						pulumi.String("SELECT"),
						pulumi.String("ALTER"),
						pulumi.String("DROP"),
					},
					Principal: pulumi.Any(test.Arn),
				},
			},
			CreateTableDefaultPermissions: lakeformation.DataLakeSettingsCreateTableDefaultPermissionArray{
				&lakeformation.DataLakeSettingsCreateTableDefaultPermissionArgs{
					Permissions: pulumi.StringArray{
						pulumi.String("ALL"),
					},
					Principal: pulumi.Any(testAwsIamRole.Arn),
				},
			},
			AllowExternalDataFiltering: pulumi.Bool(true),
			ExternalDataFilteringAllowLists: pulumi.StringArray{
				current.AccountId,
				thirdParty.AccountId,
			},
			AuthorizedSessionTagValueLists: pulumi.StringArray{
				pulumi.String("Amazon EMR"),
			},
			AllowFullTableExternalDataAccess: pulumi.Bool(true),
		})
		if err != nil {
			return err
		}
		return nil
	})
}

C#:

using System.Collections.Generic;
using System.Linq;
using Pulumi;
using Aws = Pulumi.Aws;

return await Deployment.RunAsync(() => 
{
    var example = new Aws.LakeFormation.DataLakeSettings("example", new()
    {
        Admins = new[]
        {
            test.Arn,
            testAwsIamRole.Arn,
        },
        CreateDatabaseDefaultPermissions = new[]
        {
            new Aws.LakeFormation.Inputs.DataLakeSettingsCreateDatabaseDefaultPermissionArgs
            {
                Permissions = new[]
                {
                    "SELECT",
                    "ALTER",
                    "DROP",
                },
                Principal = test.Arn,
            },
        },
        CreateTableDefaultPermissions = new[]
        {
            new Aws.LakeFormation.Inputs.DataLakeSettingsCreateTableDefaultPermissionArgs
            {
                Permissions = new[]
                {
                    "ALL",
                },
                Principal = testAwsIamRole.Arn,
            },
        },
        AllowExternalDataFiltering = true,
        ExternalDataFilteringAllowLists = new[]
        {
            current.AccountId,
            thirdParty.AccountId,
        },
        AuthorizedSessionTagValueLists = new[]
        {
            "Amazon EMR",
        },
        AllowFullTableExternalDataAccess = true,
    });

});

Java:

package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.aws.lakeformation.DataLakeSettings;
import com.pulumi.aws.lakeformation.DataLakeSettingsArgs;
import com.pulumi.aws.lakeformation.inputs.DataLakeSettingsCreateDatabaseDefaultPermissionArgs;
import com.pulumi.aws.lakeformation.inputs.DataLakeSettingsCreateTableDefaultPermissionArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var example = new DataLakeSettings("example", DataLakeSettingsArgs.builder()
            .admins(            
                test.arn(),
                testAwsIamRole.arn())
            .createDatabaseDefaultPermissions(DataLakeSettingsCreateDatabaseDefaultPermissionArgs.builder()
                .permissions(                
                    "SELECT",
                    "ALTER",
                    "DROP")
                .principal(test.arn())
                .build())
            .createTableDefaultPermissions(DataLakeSettingsCreateTableDefaultPermissionArgs.builder()
                .permissions("ALL")
                .principal(testAwsIamRole.arn())
                .build())
            .allowExternalDataFiltering(true)
            .externalDataFilteringAllowLists(            
                current.accountId(),
                thirdParty.accountId())
            .authorizedSessionTagValueLists("Amazon EMR")
            .allowFullTableExternalDataAccess(true)
            .build());

    }
}

YAML:

resources:
  example:
    type: aws:lakeformation:DataLakeSettings
    properties:
      admins:
        - ${test.arn}
        - ${testAwsIamRole.arn}
      createDatabaseDefaultPermissions:
        - permissions:
            - SELECT
            - ALTER
            - DROP
          principal: ${test.arn}
      createTableDefaultPermissions:
        - permissions:
            - ALL
          principal: ${testAwsIamRole.arn}
      allowExternalDataFiltering: true
      externalDataFilteringAllowLists:
        - ${current.accountId}
        - ${thirdParty.accountId}
      authorizedSessionTagValueLists:
        - Amazon EMR
      allowFullTableExternalDataAccess: true

The allowExternalDataFiltering property enables EMR and third-party engines to query Lake Formation catalogs. The externalDataFilteringAllowLists property specifies which AWS accounts can run these queries. The authorizedSessionTagValueLists property defines session tag values (like “Amazon EMR”) that Lake Formation recognizes for authorization. The allowFullTableExternalDataAccess property controls whether query engines can access full tables without session tags when the caller has full permissions.
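
Since these four properties are usually set together, they can be grouped behind one helper. This is a minimal sketch under stated assumptions: emr_settings is a hypothetical name, the account IDs are placeholders, and defaulting full-table access to off is a choice made here rather than provider behavior. The keyword names match the Python example above, and the 12-digit check only applies to account IDs known as plain strings.

```python
import re


def emr_settings(account_ids, session_tag_values=("Amazon EMR",),
                 full_table_access=False):
    """Return keyword arguments for the external-filtering settings."""
    # AWS account IDs are 12-digit strings; reject anything else early.
    bad = [a for a in account_ids if not re.fullmatch(r"\d{12}", a)]
    if bad:
        raise ValueError(f"not 12-digit AWS account IDs: {bad}")
    return {
        "allow_external_data_filtering": True,
        "external_data_filtering_allow_lists": list(account_ids),
        "authorized_session_tag_value_lists": list(session_tag_values),
        "allow_full_table_external_data_access": full_table_access,
    }


# Placeholder account IDs for illustration only.
settings = emr_settings(["123456789012", "111122223333"])
```

The returned dict could be unpacked into the resource call with ** alongside admins and the other arguments.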

Configure cross-account sharing version

Lake Formation supports multiple cross-account sharing protocols. The CROSS_ACCOUNT_VERSION parameter controls which protocol version is active.

TypeScript:

import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const example = new aws.lakeformation.DataLakeSettings("example", {parameters: {
    CROSS_ACCOUNT_VERSION: "3",
}});

Python:

import pulumi
import pulumi_aws as aws

example = aws.lakeformation.DataLakeSettings("example", parameters={
    "CROSS_ACCOUNT_VERSION": "3",
})

Go:

package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v7/go/aws/lakeformation"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		_, err := lakeformation.NewDataLakeSettings(ctx, "example", &lakeformation.DataLakeSettingsArgs{
			Parameters: pulumi.StringMap{
				"CROSS_ACCOUNT_VERSION": pulumi.String("3"),
			},
		})
		if err != nil {
			return err
		}
		return nil
	})
}

C#:

using System.Collections.Generic;
using System.Linq;
using Pulumi;
using Aws = Pulumi.Aws;

return await Deployment.RunAsync(() => 
{
    var example = new Aws.LakeFormation.DataLakeSettings("example", new()
    {
        Parameters = 
        {
            { "CROSS_ACCOUNT_VERSION", "3" },
        },
    });

});

Java:

package generated_program;

import com.pulumi.Context;
import com.pulumi.Pulumi;
import com.pulumi.core.Output;
import com.pulumi.aws.lakeformation.DataLakeSettings;
import com.pulumi.aws.lakeformation.DataLakeSettingsArgs;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

public class App {
    public static void main(String[] args) {
        Pulumi.run(App::stack);
    }

    public static void stack(Context ctx) {
        var example = new DataLakeSettings("example", DataLakeSettingsArgs.builder()
            .parameters(Map.of("CROSS_ACCOUNT_VERSION", "3"))
            .build());

    }
}

YAML:

resources:
  example:
    type: aws:lakeformation:DataLakeSettings
    properties:
      parameters:
        CROSS_ACCOUNT_VERSION: '3'

The parameters property accepts a CROSS_ACCOUNT_VERSION key with values “1”, “2”, “3”, or “4”. Each version enables different cross-account sharing features. Fresh accounts default to version “1”. Destroying this resource resets the version to “1”.
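
The version is passed as a string, so an integer or an out-of-range value is an easy mistake. A small guard like the following can catch it early; this is a sketch only, and cross_account_parameters is a hypothetical helper name, with the valid values taken from the paragraph above.

```python
# Valid values as listed in this guide.
VALID_CROSS_ACCOUNT_VERSIONS = ("1", "2", "3", "4")


def cross_account_parameters(version):
    """Return a parameters dict for the given version, as a string value."""
    version = str(version)  # accept 3 as well as "3"
    if version not in VALID_CROSS_ACCOUNT_VERSIONS:
        raise ValueError(
            f"CROSS_ACCOUNT_VERSION must be one of "
            f"{VALID_CROSS_ACCOUNT_VERSIONS}, got {version!r}"
        )
    return {"CROSS_ACCOUNT_VERSION": version}


parameters = cross_account_parameters(3)
```

The result has the same shape as the parameters argument in the Python example.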

Beyond these examples

These snippets focus on specific Lake Formation settings: administrator designation, default database and table permissions, EMR integration and external data filtering, and the cross-account sharing version. They’re intentionally minimal rather than complete data lake configurations.

The examples reference pre-existing infrastructure such as IAM users and roles for administrators and principals, and AWS account IDs for the external data filtering allow list. They focus on Lake Formation settings rather than provisioning IAM or catalog resources.

To keep things focused, common settings patterns are omitted, including:

  • Read-only administrators (readOnlyAdmins)
  • Trusted resource owners for cross-account sharing (trustedResourceOwners)
  • Region-specific configuration (region property)
  • Catalog ID specification (catalogId)

These omissions are intentional: the goal is to illustrate how each Lake Formation setting is wired, not provide drop-in data lake modules. See the Lake Formation DataLakeSettings resource reference for all available configuration options.


Frequently Asked Questions

Configuration Gotchas

What happens if I don't specify admins or default permissions in my configuration?
Omitting admins, createDatabaseDefaultPermissions, createTableDefaultPermissions, parameters, or trustedResourceOwners clears those settings. Always set these fields explicitly to avoid unintentionally clearing existing settings.

Can I change the catalog ID after creation?
No, catalogId is immutable. It defaults to your account ID and cannot be changed after the resource is created.

Cross-Account & Compatibility

What are the CROSS_ACCOUNT_VERSION options and their behavior?
Valid values are "1", "2", "3", or "4". Fresh accounts default to "1". Destroying the resource resets CROSS_ACCOUNT_VERSION back to "1".

What's IAMAllowedPrincipals and why does it exist?
IAMAllowedPrincipals is a principal that makes Lake Formation backwards compatible with existing IAM and Glue permissions when introducing fine-grained access control.

Administrators & Permissions

What's the difference between admins and readOnlyAdmins?
admins are full data lake administrators with complete access, while readOnlyAdmins have only view access to Lake Formation resources.

How do I configure default permissions for databases and tables?
Use createDatabaseDefaultPermissions and createTableDefaultPermissions with permissions arrays (like SELECT, ALTER, DROP, or ALL) and principal ARNs. You can configure up to three blocks for each.

What permissions can I grant in default permission blocks?
The examples in this guide grant SELECT, ALTER, and DROP on databases and ALL on tables; see the resource reference for the full list of valid values. Each permission block takes a permissions array and a principal ARN.

EMR & External Access

How do I enable EMR access to Lake Formation resources?
Set allowExternalDataFiltering to true, configure externalDataFilteringAllowLists with account IDs, add authorizedSessionTagValueLists (e.g., ["Amazon EMR"]), and optionally enable allowFullTableExternalDataAccess.

What's the purpose of authorizedSessionTagValueLists?
Lake Formation relies on a privileged process secured by Amazon EMR or third-party integrators to tag the user’s role while assuming it. This property specifies which session tag values are authorized.

When should I enable allowFullTableExternalDataAccess?
Enable this when you want third-party query engines to get data access credentials without session tags when a caller has full data access permissions.
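
The first configuration gotcha in this FAQ (omitted fields clear existing settings) can be guarded against with a check that runs before the resource is declared. The following is a rough sketch, assuming the settings are assembled as a dict of keyword arguments first; fields_that_would_be_cleared is a hypothetical helper, and the names are the snake_case forms used by the Python examples in this guide.

```python
# Arguments that, per the FAQ above, are cleared when omitted.
CLEARED_WHEN_OMITTED = (
    "admins",
    "create_database_default_permissions",
    "create_table_default_permissions",
    "parameters",
    "trusted_resource_owners",
)


def fields_that_would_be_cleared(settings_kwargs):
    """Return the clear-on-omit argument names missing from settings_kwargs."""
    return [name for name in CLEARED_WHEN_OMITTED if name not in settings_kwargs]


# Only admins and parameters are present, so the other three are reported.
missing = fields_that_would_be_cleared({"admins": [], "parameters": {}})
print(missing)
```

A program could log the returned names as a warning, or refuse to continue when the list is not empty.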
