This repository supports the Dev.to blog series on building a document parsing pipeline with AWS Textract, starting from local testing and scaling to serverless automation.
📍 Part 1 – Local Testing using Python
Simple script-based text extraction and validation using AWS Textract, run locally.
🔗 Explore → /local/README.md
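As a rough idea of what a local test can look like, here is a minimal sketch that sends one file to Textract's synchronous `detect_document_text` call and keeps the detected lines. The function names `extract_lines` and `detect_text` are illustrative assumptions, not the scripts in `/local`, and the sketch assumes AWS credentials and a region are already configured.

```python
def extract_lines(response):
    """Return the text of every LINE block in a Textract response dict."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def detect_text(path, client=None):
    """Send a local file to Textract and return its detected lines.

    `client` is injectable so the function can be exercised without AWS;
    by default a real boto3 Textract client is created.
    """
    if client is None:
        import boto3  # imported lazily so the helpers work without boto3 installed

        client = boto3.client("textract")

    with open(path, "rb") as f:
        # Synchronous call: the document bytes are sent inline in the request.
        response = client.detect_document_text(Document={"Bytes": f.read()})

    return extract_lines(response)
```

Calling `detect_text("sample.png")` (a hypothetical file name) should return a list of strings, one per detected line, which can then be compared against the expected text for validation.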
📍 Part 2 – Serverless Automation with Lambda, S3 & Textract
Event-driven pipeline triggered by PDF uploads to S3, with the extracted results stored in DynamoDB.
🔗 Explore → /automation/README-automation.md
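The event-driven flow can be sketched as a Lambda handler that reads the bucket and key from the S3 event, asks Textract for the text, and writes one item to DynamoDB. This is a simplified illustration under several assumptions, not the code in `/automation`: the table name `textract-results`, the `TABLE_NAME` environment variable, and the item attributes `document_id`, `bucket`, and `text` are all hypothetical. It also uses the synchronous `detect_document_text` call for brevity; multi-page PDFs would need Textract's asynchronous API instead.

```python
import os
import urllib.parse

# Hypothetical table name; in practice this would come from the deployment config.
TABLE_NAME = os.environ.get("TABLE_NAME", "textract-results")


def parse_s3_event(event):
    """Return (bucket, key) pairs from an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become '+').
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context, textract=None, table=None):
    """Lambda entry point; clients are injectable for local testing."""
    if textract is None or table is None:
        import boto3  # available by default in the Lambda Python runtime

        textract = textract or boto3.client("textract")
        table = table or boto3.resource("dynamodb").Table(TABLE_NAME)

    processed = []
    for bucket, key in parse_s3_event(event):
        # Textract reads the object directly from S3.
        response = textract.detect_document_text(
            Document={"S3Object": {"Bucket": bucket, "Name": key}}
        )
        lines = [
            block["Text"]
            for block in response.get("Blocks", [])
            if block.get("BlockType") == "LINE"
        ]
        # Hypothetical item shape: one item per uploaded document.
        table.put_item(
            Item={"document_id": key, "bucket": bucket, "text": "\n".join(lines)}
        )
        processed.append(key)

    return {"processed": processed}
```

Keeping the clients injectable means the handler can be run locally with stub objects before it is deployed behind an S3 trigger.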
AWS · AWS Textract · Lambda · S3 · DynamoDB · Python · Boto3