Invalid characters in input data
You can configure how a profile reacts when its input data contains invalid characters.
If the following system property is set, the data is simply accepted:
-Dhub.datawizard.handleIllegalData=true
However, if this system property is set to false, an IllegalDataException is thrown and the profile terminates with an error.
The following are all characters that are considered invalid and cause the exception:
Hex code 0x00 to 0x1F (control characters), with the exceptions 0x09, 0x0A, and 0x0D
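The range above can be expressed as a small check. The following is only a sketch of the rule as stated here, not the product's actual implementation. The class and method names (IllegalCharCheck, isIllegal, firstIllegalIndex) are hypothetical and chosen for illustration.

```java
// Hypothetical helper for illustration; not part of the product API.
final class IllegalCharCheck {

    /** True for 0x00 to 0x1F, except tab (0x09), line feed (0x0A) and carriage return (0x0D). */
    static boolean isIllegal(char c) {
        if (c == 0x09 || c == 0x0A || c == 0x0D) {
            return false; // the three documented exceptions
        }
        return c <= 0x1F;
    }

    /** Returns the index of the first invalid character, or -1 if there is none. */
    static int firstIllegalIndex(String data) {
        for (int i = 0; i < data.length(); i++) {
            if (isIllegal(data.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // 0x1A (SUB) sits at index 2; the trailing CR and LF are allowed.
        String sample = "AB" + (char) 0x1A + "CD\r\n";
        System.out.println(firstIllegalIndex(sample)); // prints 2
    }
}
```

A check like this can be useful for inspecting input files before they reach a profile, for example to find out which position triggers the exception.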
Catching/Correcting errors
The following system property can be used to specify a class that catches this exception and corrects the errors in the input data. This class must implement the interface com.ebd.hub.datawizard.parser.ICorrectValue. You can develop your own class in Java using the application programming interface (API) we provide. In-depth training is available for this; if you're interested, please contact our support or sales team. Alternatively, you can use the standard class:
-Dhub.datawizard.IllegalDataClass=com.ebd.hub.datawizard.parser.DefaultCorrectValue
If the correction does not resolve the problem, processing is permanently terminated with an error. This prevents an endless loop. No logs are written for the corrections themselves; only the error is logged, should it still occur.
Please note that data is only corrected when it is parsed into the source structure. The corrected data is visible in a mapping test or when you load the data into the source structure.
The original input file remains unchanged.
Standard class "DefaultCorrectValue"
Occasionally, poor data quality leaves control characters in the input data, which prevents the data from being parsed. The class DefaultCorrectValue replaces all ISO control characters (hex code 0x00 to 0x1F, exceptions: 0x09, 0x0A, 0x0D) with an underscore.
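The replacement described above can be sketched as follows. This is not the source of DefaultCorrectValue and does not implement ICorrectValue, whose method signatures are not shown here. It only illustrates the documented behaviour on a plain string, and the class and method names are hypothetical.

```java
// Hypothetical illustration of the documented replacement rule.
// This is NOT the product's DefaultCorrectValue class.
final class ControlCharReplacer {

    /** Replaces 0x00 to 0x1F (except 0x09, 0x0A, 0x0D) with an underscore. */
    static String replaceControlChars(String data) {
        StringBuilder sb = new StringBuilder(data.length());
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            boolean allowed = c == 0x09 || c == 0x0A || c == 0x0D;
            sb.append(c <= 0x1F && !allowed ? '_' : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "AB" + (char) 0x1A + "CD";
        System.out.println(replaceControlChars(input)); // prints AB_CD
    }
}
```

Because every invalid character is replaced by exactly one underscore, the length of the data and the positions of all other characters stay the same, which matters for fixed-length formats.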
All or certain profiles
By default, this class corrects the data of all profiles. If you want to restrict the correction to certain profiles, then the following properties file must be present. Note: The file is also considered to be present if it is empty!
./conf/invalid_data_settings.properties
The file has to contain the names of all profiles (one profile name per line) for which the correction is to be carried out. Note: If the profile name contains a space, comma, semicolon, colon, equals sign or backslash, this character must be escaped in the file by a preceding backslash.
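If you generate this file from a list of profile names, the escaping rule can be applied with a small helper. The following is a minimal sketch under the assumption that the six characters named above are the only ones that need escaping. The class and method names, as well as the profile name in the example, are hypothetical.

```java
// Hypothetical helper for writing entries to invalid_data_settings.properties.
final class ProfileNameEscaper {

    // Space, comma, semicolon, colon, equals sign and backslash.
    private static final String SPECIAL = " ,;:=\\";

    /** Puts a backslash in front of every character that must be escaped. */
    static String escape(String profileName) {
        StringBuilder sb = new StringBuilder(profileName.length() + 8);
        for (int i = 0; i < profileName.length(); i++) {
            char c = profileName.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example profile name invented for illustration.
        System.out.println(escape("Orders: EU, v2")); // prints Orders\:\ EU\,\ v2
    }
}
```

Each escaped name then goes on its own line of the file.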
Example file
The following simplified file contains the illegal Unicode control character 0x1A (SUB): file_with_illegal_data.txt