TokenStreamSplitter
Group              |
Class Name         | com.ebd.hub.datawizard.parser.stream.TokenStreamSplitter
Function           | This preparser is the stream version of the TokenFileSplitter.
Configuration File | sample_splitter.properties
Description
This preparser has the same functionality as the TokenFileSplitter, with the difference that it is a stream preparser, so it can process data of any size without holding all the data in main memory.
Another difference: the parameter header may also be the path to a text file if it conforms to the syntax read:<URL>. The URL can be a local file path (e.g. file:C:///directory/file.txt), an HTTP URL, or an FTP URL. The whole content of that file (possibly several lines) is read and inserted as the separating line (or rather a separating block) into the data.
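A minimal sketch of this variant is shown below (the row count is an assumption for illustration; the URL form is taken from the description above):

# hypothetical configuration: read the separating block from a local text file
rows = 50
header = read:file:C:///directory/file.txt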
Parameters
rows       | (mandatory) Number of lines after which the separating line is added.
header     | (mandatory) Separating line to be added.
expression | Regular expression that delays the adding of the separating line until it matches the current line.
eol        | Number defining the end-of-line characters. 0 is interpreted as \n, 1 as \r, and all other values as \r\n.
filter     | Regular expression that filters the input lines. Lines that do not match the expression are ignored (not output and not counted).
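For illustration, a configuration that combines the optional parameters might look like the following sketch (all values are assumptions; the semantics follow the parameter descriptions above):

# hypothetical configuration combining the optional parameters
rows = 10
header = new!
# delay the separating line until a line matching this expression has been read
expression = ^HDR.*
# input uses \n as end-of-line character
eol = 0
# ignore lines that do not match (they are neither output nor counted)
filter = .+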
Example File
#
# sample file for TokenFileSplitter
#
# Supported keys are: rows, header, expression, eol
#
# rows = amount of rows that are combined for one record
# header = line that will be pasted into to indicate a new record
# expression = empty or a reg. expression that must match on current read line to create a new record (beside rows)
# eol=end of line (0=\n, 1 = \r, all other settings will be used for \r\n)
#
rows = 10
header=new!
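As a rough illustration of the effect (the input lines are invented), this sample configuration inserts the line new! after every 10 input lines, so that each block of 10 lines forms one record:

input line 1
...
input line 10
new!
input line 11
...
input line 20
new!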