Serverless query with AWS Lake Formation
Description
The program will create an AWS Lake Formation Resource and set permissions for a specified AWS Principal (IAM Role or IAM User), thereby facilitating serverless querying.
The main resources used are aws.s3.Bucket, aws.lakeformation.Resource, aws.iam.Role, and aws.lakeformation.Permissions.
This is the simplified Pulumi program.
```typescript
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// Create the S3 bucket that will be registered with
// Lake Formation and act as the data lake.
const bucket = new aws.s3.Bucket("myBucket");

// Create a Lake Formation resource, which registers
// the S3 bucket as a data lake location.
const resource = new aws.lakeformation.Resource("myResource", {
    arn: bucket.arn,
});

// IAM role that will be granted permissions.
const role = new aws.iam.Role("role", {
    assumeRolePolicy: JSON.stringify({
        Version: "2012-10-17",
        Statement: [
            {
                Action: "sts:AssumeRole",
                Principal: { Service: "lakeformation.amazonaws.com" },
                Effect: "Allow",
            },
        ],
    }),
});

// Grant the IAM role access to the registered data location.
// Data locations accept DATA_LOCATION_ACCESS rather than "ALL".
const permissions = new aws.lakeformation.Permissions("perm", {
    principal: role.arn,
    permissions: ["DATA_LOCATION_ACCESS"],
    dataLocation: {
        arn: resource.arn,
    },
});

export const bucketName = bucket.bucket;
export const resourceName = resource.id;
export const roleName = role.name;
export const permissionsId = permissions.id;
```
This program is basic and will likely need to be customized for your requirements. Because Lake Formation is highly configurable, the program does not cover the configuration of the data lake itself, such as setting up a database and the tables within it. It needs to be extended before it can support serverless querying.
The data in S3 needs to be structured according to a schema that matches the Lake Formation database and table definitions. This setup provides a foundation for a data lake, but running serverless queries typically involves a service such as Amazon Athena or Amazon Redshift, which queries the data governed by Lake Formation.
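To make the schema requirement concrete, here is a minimal sketch of a database and table definition. Lake Formation tables live in the AWS Glue Data Catalog, so the sketch uses aws.glue.CatalogDatabase and aws.glue.CatalogTable. The bucket, the names sales_db and orders, the columns, the orders/ prefix, and the CSV format are all hypothetical and only illustrate the shape of the configuration.

```typescript
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// Hypothetical bucket standing in for the data lake bucket.
const dataBucket = new aws.s3.Bucket("dataLakeBucket");

// Glue Data Catalog database (name is an assumption).
const database = new aws.glue.CatalogDatabase("salesDb", {
    name: "sales_db",
});

// External table describing CSV files under the orders/ prefix.
// The columns are illustrative; they must match the real files.
const table = new aws.glue.CatalogTable("ordersTable", {
    name: "orders",
    databaseName: database.name,
    tableType: "EXTERNAL_TABLE",
    parameters: { classification: "csv" },
    storageDescriptor: {
        location: pulumi.interpolate`s3://${dataBucket.bucket}/orders/`,
        inputFormat: "org.apache.hadoop.mapred.TextInputFormat",
        outputFormat: "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
        serDeInfo: {
            serializationLibrary: "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
            parameters: { "field.delim": "," },
        },
        columns: [
            { name: "order_id", type: "string" },
            { name: "amount", type: "double" },
            { name: "order_date", type: "date" },
        ],
    },
});

export const databaseName = database.name;
export const tableName = table.name;
```

Files uploaded under the orders/ prefix would need to be comma-delimited with these three fields in this order for queries against the table to return sensible rows.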
For Amazon Athena, you may want to use aws.athena.Workgroup to define a workgroup. The Pulumi AWS provider also offers aws.athena.NamedQuery for creating saved queries.
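As a rough illustration, an Athena workgroup could be declared along the following lines. In the provider the class is spelled aws.athena.Workgroup. The results bucket, the workgroup name analytics, the results/ prefix, and the 1 GiB scan limit are assumptions chosen for the sketch, not values from the program above.

```typescript
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// Hypothetical bucket where Athena writes query results.
const resultsBucket = new aws.s3.Bucket("athenaResults");

// Workgroup that pins the result location and caps the data
// scanned per query (limit chosen arbitrarily for illustration).
const workgroup = new aws.athena.Workgroup("analytics", {
    name: "analytics",
    configuration: {
        enforceWorkgroupConfiguration: true,
        bytesScannedCutoffPerQuery: 1024 * 1024 * 1024, // 1 GiB
        resultConfiguration: {
            outputLocation: pulumi.interpolate`s3://${resultsBucket.bucket}/results/`,
        },
    },
});

export const workgroupName = workgroup.name;
```

Queries run in this workgroup write their results to the configured bucket. The principal running them would still need the relevant Lake Formation and IAM permissions, which this sketch does not set up.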
Please check the relevant AWS documentation for more detailed steps regarding AWS Lake Formation and related services to perform serverless querying. Also, keep in mind that IAM roles and policies might need further hardening based on your security requirements.