IntelligentDocumentAutomationPreParser
Group |
Function | This preparser is used to extract information from a PDF document.
Configuration file | PDF2Data.properties
Description
This preparser is used to extract information from a PDF document and create a JSON file.
The document is sent to the machine learning service of our partner contract.fit via HTTPS. Text is extracted by means of Optical Character Recognition (OCR).
Access to the contract.fit service is subject to a fee. After purchase, we will configure the access for you. If you are interested, please contact our support or sales staff.
The configuration is done in a properties file, in which the following parameters can be defined.
Parameter
Parameter | Description
Synchronous | Specifies whether the service is called synchronously (true) or asynchronously (false). Default: false. Important note: When using asynchronous calls, Lobster_data must be accessible via HTTPS from the outside.
ChannelID | Channel ID of an HTTPS channel with Basic Authentication (Preemptive Authentication).
useDMZ | Specifies whether the service is called via DMZ. Default: false.
URL | URL of the contract.fit system, including the Inbox ID (see example). Note: Each document type (e.g. invoice, order, etc.) is defined as a single inbox on the contract.fit platform. The structure (fields) of the response JSON file depends on the Inbox ID.
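The defaults described in the table can be illustrated with a small sketch. This is not how Lobster_data reads the file internally; it is a minimal Python illustration, assuming simple key=value lines, that ChannelID and URL must always be present, and that boolean values are written as true or false. The function names parse_properties and load_settings are hypothetical.

```python
def parse_properties(text):
    """Parse simple key=value lines; blank lines and # or ! comments are skipped."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # lines without '=' are ignored in this sketch
        props[key.strip()] = value.strip()
    return props


def load_settings(text):
    """Apply the documented defaults and check the parameters.

    Assumption: ChannelID and URL are treated as required here; the
    documentation itself only states defaults for Synchronous and useDMZ.
    """
    props = parse_properties(text)
    settings = {
        "Synchronous": props.get("Synchronous", "false").lower() == "true",
        "useDMZ": props.get("useDMZ", "false").lower() == "true",
        "ChannelID": props.get("ChannelID"),
        "URL": props.get("URL"),
    }
    missing = [key for key in ("ChannelID", "URL") if not settings[key]]
    if missing:
        raise ValueError("missing parameter(s): " + ", ".join(missing))
    if not settings["URL"].startswith("https://"):
        raise ValueError("URL is expected to use HTTPS")
    return settings
```

A file that sets only ChannelID and URL would then resolve to Synchronous=false and useDMZ=false, which matches the defaults in the table.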
Example File
Synchronous=true
ChannelID=1599728356339212
useDMZ=false
URL=https://lobster.contract-q.fit/admin/documents/5e7a390a3b08c6d23ab8b8c4
Note: The value 5e7a390a3b08c6d23ab8b8c4 is the Inbox ID.
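To check which inbox a configured URL points to, the Inbox ID can be read from the URL. The following Python sketch assumes, based on the example above, that the Inbox ID is always the last path segment of the URL; the function name inbox_id_from_url is hypothetical.

```python
from urllib.parse import urlparse


def inbox_id_from_url(url):
    """Return the last path segment of the URL, assumed to be the Inbox ID."""
    path = urlparse(url).path.rstrip("/")
    inbox_id = path.rsplit("/", 1)[-1]
    if not inbox_id:
        raise ValueError("URL has no path segment to use as Inbox ID")
    return inbox_id


url = "https://lobster.contract-q.fit/admin/documents/5e7a390a3b08c6d23ab8b8c4"
print(inbox_id_from_url(url))  # prints 5e7a390a3b08c6d23ab8b8c4
```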
Creating Source Structure
To create a matching source structure for the respective JSON file, the following procedure can be used.
Create profile with setting No mapping.
Configure preparser.
Set checkbox Result of preparser overrides backup file.
After a profile run, the backup file of the job contains the JSON result of the preparser. This file can then be used via the source structure menu entry Create structure from file analysis.
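Before running the file analysis, it can help to see which fields the JSON file actually contains. The following Python sketch lists the field paths of a JSON document. It is independent of Lobster_data, the function name field_paths is hypothetical, and the sample document is invented for illustration only; the real fields depend on the Inbox ID.

```python
import json


def field_paths(node, prefix=""):
    """Collect the distinct paths of all leaf fields in a parsed JSON value."""
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else key
            paths.extend(field_paths(value, child))
    elif isinstance(node, list):
        for item in node:
            # list elements share one path, marked with []
            paths.extend(field_paths(item, prefix + "[]"))
    else:
        paths.append(prefix)
    # keep first-seen order and drop duplicates
    return list(dict.fromkeys(paths))


# Hypothetical sample, not an actual contract.fit response
sample = json.loads('{"invoice_number": "A-1", "lines": [{"amount": 10}, {"amount": 20}]}')
print(field_paths(sample))  # prints ['invoice_number', 'lines[].amount']
```

To inspect a real backup file, load it with json.load on the opened file instead of the inline sample.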