TokenStreamSplitter
| Configuration file | ./conf/samples/sample_splitter.properties |
| Class name | com.ebd.hub.datawizard.parser.stream.TokenStreamSplitter |
Description
This preparser has the same functionality as the TokenFileSplitter. The difference is that it is a stream preparser, so it can process data of any size without holding all of the data in main memory.
A further difference: the parameter header may refer to a text file, using the syntax read:<URL>. The URL can be a local file path (file:C:///directory/file.txt), an HTTP URL, or an FTP URL. The whole content of the file (possibly several lines) is read and inserted into the file as the separating line (or rather, as a separating block).
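As a sketch of the read:<URL> syntax, the header parameter in the configuration file might be set as follows. The path is the illustrative one from the description above, and the value of rows is an arbitrary assumption.

```properties
# Hypothetical example: use the content of a text file as the separating block.
# rows is the number of lines after which the block is added (1000 is arbitrary).
rows=1000
# Sample path taken from the description; replace it with a real location.
header=read:file:C:///directory/file.txt
```

If the referenced file contains several lines, all of them are inserted together as one block.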
Parameters
| Parameter | Description |
| --- | --- |
| rows | Number of lines after which the separating line is added. |
| header | Separating line to be added. |
| expression | (optional) Regular expression that delays adding the separating line until the current line matches it. |
| eol | (optional) Number defining the end-of-line characters. 0 is interpreted as \n, 1 as \r, and all other values as \r\n. |
| filter | (optional) Regular expression that filters the input lines. Lines that do not match the expression are ignored (neither output nor counted). |
| check.BOM | (optional) If true, the BOM is observed and the file is recoded accordingly. Default: false. For details see EncodingByBomOrXmlPreParser. |
| check.XML | (optional) If true, the XML encoding is observed and the file is recoded accordingly. Default: false. For details see EncodingByBomOrXmlPreParser. |
Example file
rows = 10
header=new!
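
A fuller configuration that also sets the optional parameters listed under Parameters might look like the following sketch. The regular expressions and the chosen values are assumptions for illustration only. The patterns are written to cover the whole line, since the description does not state whether a partial match is sufficient.

```properties
# Add the separating line "new!" after 10 lines ...
rows=10
header=new!
# ... but delay it until the current line starts with "ORDER;" (hypothetical record prefix).
expression=^ORDER;.*
# Use \r\n as the end-of-line characters (any value other than 0 or 1).
eol=2
# Keep only non-empty lines that do not start with "#".
# All other lines are ignored: neither output nor counted.
filter=^[^#].*
# Recode the input according to its BOM; do not evaluate the XML encoding.
check.BOM=true
check.XML=false
```

The patterns deliberately avoid backslashes, which are escape characters in Java properties files and would have to be doubled.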