UnicodeToASCIIPreparser
Group | |
Class Name | com.ebd.hub.datawizard.parser.UnicodeToASCIIPreparser |
Function | This preparser converts Unicode data into ASCII data by replacing or removing non-ASCII characters. |
Configuration File | sample_UnicodeToASCIIPreparser.properties |
Description
This preparser converts Unicode data into ASCII data by replacing or removing non-ASCII characters. It expects the path to a properties file that contains the following two configuration parameters.
conversiontype | (replace or remove) If replace, non-ASCII characters are converted into their corresponding lower ASCII characters. Characters without a corresponding lower ASCII character are removed. If remove, all non-ASCII characters are removed. |
upperlimit | (optional) Decimal character value at which non-ASCII characters start. Characters below this value are left unchanged (lower ASCII characters range from 0 to 127). Default: 128 |
Example
conversiontype=replace
upperlimit=256
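As an illustration of how these two parameters and the default for upperlimit fit together, here is a minimal Python sketch that reads such a configuration. The function name `load_preparser_config` is hypothetical and not part of the product. The parsing is a simplified key=value reader rather than a full Java properties parser, and rejecting unknown conversiontype values is an assumption, not documented behaviour.

```python
def load_preparser_config(text):
    """Parse simple key=value lines (hypothetical helper, simplified format)."""
    # upperlimit is optional; the documented default is 128.
    config = {"upperlimit": "128"}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()

    conversiontype = config.get("conversiontype")
    # Assumption: only the two documented values are accepted.
    if conversiontype not in ("replace", "remove"):
        raise ValueError("conversiontype must be 'replace' or 'remove'")
    return conversiontype, int(config["upperlimit"])


sample = "conversiontype=replace\nupperlimit=256\n"
print(load_preparser_config(sample))  # ('replace', 256)
```

If upperlimit is left out of the file, the sketch falls back to 128, matching the documented default.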
Concrete results for the example configuration and for further configurations:
Input Data | conversiontype | upperlimit | Result |
Schönstraße costs 1 million € | replace | | Schonstrasse costs 1 million |
Schönstraße costs 1 million € | replace | 128 | Schonstrasse costs 1 million |
Schönstraße costs 1 million € | replace | 256 | Schönstraße costs 1 million |
Schönstraße costs 1 million € | replace | 65536 | Schönstraße costs 1 million € |
Schönstraße costs 1 million € | remove | 128 | Schnstrae costs 1 million |
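The results listed above can be approximated with a short Python sketch. It is not the preparser's actual implementation: the function name `to_ascii`, the use of Unicode NFKD decomposition to find a "corresponding" ASCII character, and the extra mapping for ß are all assumptions made for illustration. The mapping table used by the real preparser is not documented here.

```python
import unicodedata

# Assumed extra mappings for characters that have no Unicode decomposition.
EXTRA_REPLACEMENTS = {"ß": "ss"}


def to_ascii(text, conversiontype="replace", upperlimit=128):
    """Sketch of the documented behaviour (hypothetical, not the product code)."""
    out = []
    for ch in text:
        if ord(ch) < upperlimit:
            # Below the limit: the character is kept unchanged.
            out.append(ch)
        elif conversiontype == "replace":
            if ch in EXTRA_REPLACEMENTS:
                out.append(EXTRA_REPLACEMENTS[ch])
            else:
                # Decompose (e.g. "ö" -> "o" + combining diaeresis) and keep
                # only the plain ASCII parts; nothing is left for "€".
                decomposed = unicodedata.normalize("NFKD", ch)
                out.append("".join(c for c in decomposed if ord(c) < 128))
        # conversiontype == "remove": the character is dropped.
    return "".join(out)


data = "Schönstraße costs 1 million €"
print(repr(to_ascii(data, "replace", 128)))    # 'Schonstrasse costs 1 million '
print(repr(to_ascii(data, "replace", 256)))    # 'Schönstraße costs 1 million '
print(repr(to_ascii(data, "replace", 65536)))  # 'Schönstraße costs 1 million €'
print(repr(to_ascii(data, "remove", 128)))     # 'Schnstrae costs 1 million '
```

Removing the € sign leaves the space that preceded it, which is why the converted strings end with a trailing blank.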