ExtractFileFromPDF
Configuration file | None. Configuration is done directly with a string in field "Config file".
Class name | com.ebd.hub.datawizard.parser.ExtractFileFromPDF
Description
This preparser extracts a file embedded in a PDF/A file. Any embedded file can be extracted except ZUGFeRD-invoice.xml.
The field Configuration file specifies the list of possible file names, separated by semicolons. Only the first file found is returned. See the following example.
Example
Assume the following value of the parameter string.
MyInvoice.txt;Orders.csv
If both files are contained in the PDF/A, only one of them is extracted: MyInvoice.txt is extracted if it was added to the PDF first.
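The selection rule described above can be illustrated with a minimal sketch. This is not the preparser's actual implementation. The class name EmbeddedFileSelection and the methods parseConfig and selectFile are hypothetical, the list of embedded names stands in for the attachments of a real PDF/A, and exact (case-sensitive) name matching and trimming of whitespace around names are assumptions.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration only; not the code of ExtractFileFromPDF.
class EmbeddedFileSelection {

    // The one embedded file that, per the description, is never extracted.
    static final String EXCLUDED = "ZUGFeRD-invoice.xml";

    // Splits the semicolon-separated parameter string into file names.
    // Trimming whitespace and skipping empty entries are assumptions.
    static Set<String> parseConfig(String config) {
        Set<String> names = new LinkedHashSet<>();
        for (String part : config.split(";")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Walks the embedded files in the order they were added to the PDF and
    // returns the first one whose name is listed in the parameter string.
    // Exact, case-sensitive comparison is an assumption.
    static Optional<String> selectFile(List<String> embeddedInPdfOrder, String config) {
        Set<String> wanted = parseConfig(config);
        for (String embedded : embeddedInPdfOrder) {
            if (EXCLUDED.equals(embedded)) {
                continue;
            }
            if (wanted.contains(embedded)) {
                return Optional.of(embedded);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String config = "MyInvoice.txt;Orders.csv";
        // MyInvoice.txt was added to the PDF first, so it is selected.
        System.out.println(selectFile(List.of("MyInvoice.txt", "Orders.csv"), config));
        // Orders.csv was added first, so it is selected instead.
        System.out.println(selectFile(List.of("Orders.csv", "MyInvoice.txt"), config));
    }
}
```

In this sketch the order of names in the parameter string does not matter. Only the order in which the files were added to the PDF decides which one is returned.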
Important note: A PDF viewer does not always display the real file names of the files attached to a PDF/A file. Suppose the viewer shows the file name abadoc.xml and you specify that name in the parameter string. If the actual file name is different, you will receive an error message like the following.
[unknown] No valid embedded file found but these are included: 'AbaDoc', 'ZUGFeRD-invoice.xml'
[ExtractFileFromPDF] Exception in PreParser: java.lang.Exception: Invalid PDF/A format - unable to extract file
The error message lists the actual names of the embedded files. In this case, use the file name "AbaDoc" instead.