Example of how to 1) upload files to AWS S3, 2) process the PDF files with Amazon Textract, and 3) send a link to a form for validating the data extracted from the PDF. What you still need to decide is where the data from the form should go. But that is a different story and a different Blueprint :-)
Amazon Textract
Amazon Textract is a machine learning service that automatically extracts text, handwriting, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify, understand, and extract data from forms and tables. Read more: https://aws.amazon.com/textract/.
Here is an example of the PDF files that are in the "inbox".
Here is how Amazon Textract sees the PDF as a form.
Here is how the data ends up in the form in Onify.
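To make the step from Textract's form view to form data more concrete, here is a minimal sketch of how key-value pairs can be read out of a Textract response. Textract returns a list of blocks, where KEY_VALUE_SET blocks link a key to its value and to the WORD blocks holding the text. The function names (getText, extractFormFields) and the sample blocks are hypothetical and written for illustration; the scripts shipped with this Blueprint may do this differently.

```javascript
// Collect the text of a block by following its CHILD relationships.
function getText(block, blocksById) {
  const parts = [];
  for (const rel of block.Relationships || []) {
    if (rel.Type !== 'CHILD') continue;
    for (const id of rel.Ids) {
      const child = blocksById.get(id);
      if (!child) continue;
      if (child.BlockType === 'WORD') parts.push(child.Text);
      // Checkboxes are reported as SELECTION_ELEMENT blocks.
      if (child.BlockType === 'SELECTION_ELEMENT') parts.push(child.SelectionStatus);
    }
  }
  return parts.join(' ');
}

// Build a { key: value } object from the KEY_VALUE_SET blocks.
function extractFormFields(blocks) {
  const blocksById = new Map(blocks.map((b) => [b.Id, b]));
  const fields = {};
  for (const block of blocks) {
    const isKey =
      block.BlockType === 'KEY_VALUE_SET' &&
      (block.EntityTypes || []).includes('KEY');
    if (!isKey) continue;
    // A KEY block points to its VALUE block through a VALUE relationship.
    const valueRel = (block.Relationships || []).find((r) => r.Type === 'VALUE');
    const valueBlock = valueRel ? blocksById.get(valueRel.Ids[0]) : undefined;
    fields[getText(block, blocksById)] = valueBlock
      ? getText(valueBlock, blocksById)
      : '';
  }
  return fields;
}

// Hand-written sample data (not real Textract output).
const sampleBlocks = [
  {
    Id: 'k1',
    BlockType: 'KEY_VALUE_SET',
    EntityTypes: ['KEY'],
    Relationships: [
      { Type: 'VALUE', Ids: ['v1'] },
      { Type: 'CHILD', Ids: ['w1', 'w2'] },
    ],
  },
  {
    Id: 'v1',
    BlockType: 'KEY_VALUE_SET',
    EntityTypes: ['VALUE'],
    Relationships: [{ Type: 'CHILD', Ids: ['w3'] }],
  },
  { Id: 'w1', BlockType: 'WORD', Text: 'Invoice' },
  { Id: 'w2', BlockType: 'WORD', Text: 'number:' },
  { Id: 'w3', BlockType: 'WORD', Text: '12345' },
];

console.log(extractFormFields(sampleBlocks));
```

The resulting object can then be used to pre-fill the form that the user validates.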
- Onify Hub API 2.3.0 or later
- Mail configured in Onify Hub
- Onify Agent (tagged agent)
- Onify Flow license
- Node.js installed (on agent)
- Camunda Modeler 4.4 or later
- Amazon AWS services: S3 Bucket, SNS and SQS
- 1 x Flow
- 3 x Scripts (Node.js)
In order for this to work you need the following setup:
- Amazon S3 Bucket
- AWS user with permissions
- Access key ID (accessKeyId) and secret access key (secretAccessKey) for the AWS user
NOTE: For more information, please read "Configuring Amazon Textract for Asynchronous Operations" in the Amazon Textract documentation.
NOTE: Amazon Textract is not available in all regions. Also make sure the S3 bucket and Textract are in the same region.
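In the asynchronous setup, Textract publishes a completion message to the SNS topic, and the subscribed SQS queue delivers it wrapped in an SNS envelope. The sketch below shows one way to unwrap such a message. It assumes raw message delivery is disabled on the subscription (so the SQS body is the SNS envelope); parseTextractNotification and the sample message are hypothetical and not taken from the Blueprint scripts.

```javascript
// Unwrap a Textract completion notification from an SQS message.
// Assumption: the SQS body is an SNS envelope whose "Message" field
// contains the Textract notification as a JSON string.
function parseTextractNotification(sqsMessage) {
  const envelope = JSON.parse(sqsMessage.Body);
  const notification = JSON.parse(envelope.Message);
  return {
    jobId: notification.JobId,
    status: notification.Status,
    succeeded: notification.Status === 'SUCCEEDED',
  };
}

// Hand-written sample message (placeholder values).
const sampleSqsMessage = {
  Body: JSON.stringify({
    Type: 'Notification',
    Message: JSON.stringify({
      JobId: 'example-job-id',
      Status: 'SUCCEEDED',
      API: 'StartDocumentAnalysis',
    }),
  }),
};

console.log(parseTextractNotification(sampleSqsMessage));
```

Once the status is SUCCEEDED, the job ID can be used to fetch the analysis result.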
- Copy files from .\resources\agent\scripts to the .\scripts folder on Onify Agent.
- Run npm install from the .\scripts folder
- Update aws_config.json with AWS credentials and region.
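A quick sanity check of the configuration before running the flow could look like the sketch below. It assumes aws_config.json is a flat JSON object with the keys accessKeyId, secretAccessKey, and region; the exact layout of the file shipped with this Blueprint is not shown here, so treat the key names (in particular region) and the helper findMissingAwsConfigKeys as assumptions.

```javascript
// Return the names of required keys that are missing or empty.
// Assumption: the config is a flat object with these three string keys.
function findMissingAwsConfigKeys(config) {
  const required = ['accessKeyId', 'secretAccessKey', 'region'];
  return required.filter(
    (key) => typeof config[key] !== 'string' || config[key].trim() === ''
  );
}

// Placeholder values for illustration only.
const exampleConfig = JSON.parse(
  '{"accessKeyId":"AKIAEXAMPLE","secretAccessKey":"","region":"eu-west-1"}'
);

console.log(findMissingAwsConfigKeys(exampleConfig));
```

An empty array means all three values are present; it does not verify that the credentials are valid.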
Update flow (aws-textract-pdf-to-form.bpmn) with your own variables:
- inboxPath - Path to the PDF files
- bucket - S3 bucket to upload files
- mailTo - Where to send the link to the form
- onifyUrl - URL to Onify APP (default is http://localhost:3000)
- roleArn - The Amazon Resource Name (ARN) of an IAM role that gives Amazon Textract publishing permissions to the Amazon SNS topic
- snsTopicArn - The Amazon SNS topic that Amazon Textract posts the completion status to
- sqsQueueUrl - Amazon SQS URL that is subscribed to the SNS topic
- Open aws-textract-pdf-to-form.bpmn in Camunda Modeler
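To illustrate how some of these variables relate to Textract, the sketch below maps bucket, roleArn, and snsTopicArn onto the request parameters of Textract's StartDocumentAnalysis operation. The helper buildStartDocumentAnalysisParams, the objectKey argument, and the sample values are hypothetical; how the Blueprint scripts actually build the request is an assumption.

```javascript
// Build StartDocumentAnalysis parameters from the flow variables.
// objectKey is the name of the uploaded PDF in the S3 bucket.
function buildStartDocumentAnalysisParams(vars, objectKey) {
  for (const name of ['bucket', 'roleArn', 'snsTopicArn']) {
    if (!vars[name]) throw new Error(`Missing flow variable: ${name}`);
  }
  return {
    DocumentLocation: { S3Object: { Bucket: vars.bucket, Name: objectKey } },
    // FORMS asks Textract to return key-value pairs.
    FeatureTypes: ['FORMS'],
    // Textract uses the role to publish the completion status to the topic.
    NotificationChannel: {
      RoleArn: vars.roleArn,
      SNSTopicArn: vars.snsTopicArn,
    },
  };
}

// Placeholder values for illustration only.
const exampleVars = {
  bucket: 'my-example-bucket',
  roleArn: 'arn:aws:iam::123456789012:role/ExampleTextractRole',
  snsTopicArn: 'arn:aws:sns:eu-west-1:123456789012:ExampleTextractTopic',
};

console.log(
  JSON.stringify(buildStartDocumentAnalysisParams(exampleVars, 'invoice.pdf'), null, 2)
);
```

The sqsQueueUrl variable is not part of this request; it is used when polling the queue for the completion message.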
- Click Start current diagram
- Community/forum: https://support.onify.co/discuss
- Documentation: https://support.onify.co/docs
- Support and SLA: https://support.onify.co/docs/get-support
This project is licensed under the MIT License - see the LICENSE file for details.





