Building a document analytics system with AWS Textract, processing content using Lambda, and storing insights in Amazon RDS
To accomplish this goal, this Pulumi program will do the following:
- Create an S3 bucket for storing your documents.
- Create a Lambda function that will be invoked every time a new document is uploaded to the S3 bucket. This function will call AWS Textract to extract the text from the document.
- Create an RDS instance and write the insights gained from Textract's analysis to it.
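The Lambda side of the steps above can be sketched as follows. This is a minimal illustration, not the original function: the helper names iter_s3_objects and extract_lines are hypothetical, and it assumes the synchronous Textract call detect_document_text, which suits images and single-page documents (multi-page PDFs generally need Textract's asynchronous API instead). Writing the results to RDS is left out.

```python
from urllib.parse import unquote_plus


def iter_s3_objects(event):
    """Yield (bucket, key) pairs from an S3 event notification payload."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def extract_lines(textract, bucket, key):
    """Return the text of every LINE block Textract detects in the object."""
    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return [b["Text"] for b in response["Blocks"] if b["BlockType"] == "LINE"]


def handler(event, context):
    # boto3 ships with the Lambda Python runtime; it is imported lazily so the
    # helpers above can be exercised locally without it.
    import boto3

    textract = boto3.client("textract")
    results = {}
    for bucket, key in iter_s3_objects(event):
        results[key] = extract_lines(textract, bucket, key)
        # Persisting `results` to the RDS instance would go here.
    return results
```

Passing the Textract client into extract_lines as an argument keeps the extraction logic easy to test with a stand-in client.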
This is a simplified demonstration of a Pulumi program. Permissions for the resources (e.g., allowing Lambda to call Textract and talk to S3 and RDS) are not included and must be managed separately.
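As a rough idea of what those permissions might look like, the sketch below builds an IAM policy document for the Lambda role. The function name lambda_policy_json and the choice of actions are assumptions for illustration: read access to objects in the bucket, the synchronous Textract text-detection action, and CloudWatch Logs. Access to RDS is typically governed by networking (VPC and security groups) and database credentials rather than by this policy.

```python
import json


def lambda_policy_json(bucket_arn: str) -> str:
    """Build a least-privilege-style policy document for the Lambda role."""
    statements = [
        {
            # Read uploaded documents; Textract fetches the object using the
            # caller's permissions.
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": f"{bucket_arn}/*",
        },
        {
            # Synchronous text detection; not scoped to a specific resource here.
            "Effect": "Allow",
            "Action": ["textract:DetectDocumentText"],
            "Resource": "*",
        },
        {
            # Let the function write its logs to CloudWatch Logs.
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "arn:aws:logs:*:*:*",
        },
    ]
    return json.dumps({"Version": "2012-10-17", "Statement": statements})
```

The resulting JSON string could be attached to the function's role as an inline policy; tighten the resources further to match your own account and region.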
The program leverages several pulumi_aws modules, including pulumi_aws.rds. Please replace the Lambda code section with the actual AWS Lambda function that uses Textract and connects to an RDS instance.
(Note: It's crucial to manage connections carefully when Lambda talks to RDS. If many Lambda invocations will connect to your RDS instance in parallel, you need to think about connection pooling.)
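One common first step, sketched below, is to reuse a single connection per Lambda execution environment instead of opening a new one on every invocation. This is connection reuse, not a true pool: each concurrently running environment still holds its own connection, and the sketch does not detect stale connections. The connect argument is a hypothetical zero-argument factory that would wrap your database driver's connect call.

```python
_connection = None  # module-level state survives while the environment stays warm


def get_connection(connect):
    """Return a cached connection, calling connect() only on first use.

    `connect` is a hypothetical zero-argument factory, for example a function
    that calls your database driver with the RDS endpoint and credentials.
    """
    global _connection
    if _connection is None:
        _connection = connect()
    return _connection
```

Inside the handler you would call get_connection(...) on every invocation; only the first call in a given environment actually opens a connection.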
Remember to substitute your own values for the RDS instance class, allocated storage, username, and password.
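A minimal sketch of what such a program might look like is shown below. It is an illustration under stated assumptions, not the original snippet: the resource names, the config keys dbUsername and dbPassword, the ./lambda directory with a handler.handler entry point, the MySQL engine, and the db.t3.micro / 20 GiB sizing are all placeholders. It attaches only the basic logging policy to the role and omits VPC and security group wiring, so the function would not yet be able to reach a private RDS instance.

```python
import json

import pulumi
import pulumi_aws as aws

config = pulumi.Config()
# Hypothetical config keys; set them with `pulumi config set [--secret] <key> <value>`.
db_username = config.require("dbUsername")
db_password = config.require_secret("dbPassword")

# Bucket that receives the uploaded documents.
bucket = aws.s3.Bucket("documents")

# Role the Lambda function assumes; Textract and S3 permissions are not attached here.
role = aws.iam.Role(
    "textract-lambda-role",
    assume_role_policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }),
)
aws.iam.RolePolicyAttachment(
    "lambda-logs",
    role=role.name,
    policy_arn="arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole",
)

# Placeholder sizing; substitute your own instance class and storage.
db = aws.rds.Instance(
    "insights-db",
    engine="mysql",
    instance_class="db.t3.micro",
    allocated_storage=20,
    username=db_username,
    password=db_password,
    skip_final_snapshot=True,
)

# The function code is assumed to live in ./lambda/handler.py.
fn = aws.lambda_.Function(
    "textract-handler",
    runtime="python3.12",
    handler="handler.handler",
    role=role.arn,
    timeout=60,
    code=pulumi.AssetArchive({".": pulumi.FileArchive("./lambda")}),
    environment=aws.lambda_.FunctionEnvironmentArgs(
        variables={"DB_HOST": db.address},
    ),
)

# Allow S3 to invoke the function, then subscribe it to object-created events.
allow_s3 = aws.lambda_.Permission(
    "allow-s3-invoke",
    action="lambda:InvokeFunction",
    function=fn.name,
    principal="s3.amazonaws.com",
    source_arn=bucket.arn,
)
aws.s3.BucketNotification(
    "on-upload",
    bucket=bucket.id,
    lambda_functions=[aws.s3.BucketNotificationLambdaFunctionArgs(
        lambda_function_arn=fn.arn,
        events=["s3:ObjectCreated:*"],
    )],
    opts=pulumi.ResourceOptions(depends_on=[allow_s3]),
)

pulumi.export("bucket_name", bucket.id)
pulumi.export("db_endpoint", db.endpoint)
```

Reading the password with require_secret keeps it encrypted in the Pulumi state rather than hard-coding it in the program.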
The resources generated by this program include:
- Amazon S3 bucket
- AWS IAM roles and policies
- AWS Lambda function
- AWS RDS instance
This is a starting point, and you can enhance and tweak this program based on your other requirements such as VPC setup, security group settings, RDS instance details, etc. Please ensure you have a good understanding of security best practices while configuring your resources.