Lines Without Dependencies
A CSV file is a text file that contains tabular, structured data and is primarily used for data exchange. The data can be arranged line by line or column by column.
The first table shows an example of a line-by-line arrangement. Each row is a dataset, so the more datasets there are, the more rows the file has. The number of columns in the file remains the same.
| | Field1 | Field2 | Field3 | Field4 |
| Dataset1 | D1F1 | D1F2 | D1F3 | D1F4 |
| Dataset2 | D2F1 | D2F2 | D2F3 | D2F4 |
| Dataset3 | D3F1 | D3F2 | D3F3 | D3F4 |
| Dataset4 | D4F1 | D4F2 | D4F3 | D4F4 |
The second table shows an example of a column-by-column arrangement. Each column is a dataset, so the more datasets there are, the longer the rows become. The number of lines in the file remains the same.
| | Dataset1 | Dataset2 | Dataset3 | Dataset4 | ... |
| Field1 | D1F1 | D2F1 | D3F1 | D4F1 | ... |
| Field2 | D1F2 | D2F2 | D3F2 | D4F2 | ... |
| Field3 | D1F3 | D2F3 | D3F3 | D4F3 | ... |
| Field4 | D1F4 | D2F4 | D3F4 | D4F4 | ... |
The first listing shows the simplest case of a CSV file in a line-by-line arrangement. The file has only one dataset type, and the fields are separated by a semicolon (;).
#OrderHeader=OH
OH;Order1
OH;Order2
OH;Order3
The second listing shows the same CSV file in a column-by-column arrangement.
OH;OH;OH
Order1;Order2;Order3
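The relationship between the two listings can be sketched as a transposition. The following Python snippet is only an illustration and not part of Lobster_data; the function name `transpose` is hypothetical, and it assumes that every line has the same number of fields and that lines starting with # carry no data.

```python
def transpose(csv_text: str, delimiter: str = ";") -> str:
    """Turn a line-by-line arrangement into a column-by-column one (or back)."""
    # Assumption: lines starting with '#' are comments and are skipped.
    rows = [
        line.split(delimiter)
        for line in csv_text.splitlines()
        if line and not line.startswith("#")
    ]
    # zip(*rows) regroups the values: the n-th value of every row forms new row n.
    columns = zip(*rows)
    return "\n".join(delimiter.join(column) for column in columns)


line_by_line = "#OrderHeader=OH\nOH;Order1\nOH;Order2\nOH;Order3"
column_by_column = transpose(line_by_line)
print(column_by_column)
```

Applying `transpose` to the result again yields the three data lines of the first listing.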
During mapping, the source tree is mapped to the destination tree. The source tree is created by the parser. Users can create their own nodes and fields in the source structure for the input data.
The figure below shows the operation of the parser. The three data lines become three datasets with two fields each. Starting from the second line, the parser first breaks the input file down into individual datasets; the lines are then broken down further into single values. For the first step, the parser must be told to interpret an entire row as a column. This is achieved by entering the value New line as the delimiter in field (6) of the parser settings. For the second step, the attributes of a node specify how the single values in a row are separated.
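The two steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual parser: the function `parse_datasets` and its parameters `skip_lines` and `value_delimiter` are hypothetical names that stand in for the corresponding parser settings and node attributes.

```python
from typing import List


def parse_datasets(text: str, skip_lines: int = 1,
                   value_delimiter: str = ";") -> List[List[str]]:
    """Split a file into datasets (lines), then each dataset into values."""
    # Step 1: the new line acts as the delimiter between datasets.
    lines = text.split("\n")
    datasets = []
    # Assumption: the first line is not data, so parsing starts at line 2.
    for line in lines[skip_lines:]:
        if not line:
            continue  # ignore empty lines, e.g. after a trailing newline
        # Step 2: the node's delimiter separates the single values in a row.
        datasets.append(line.split(value_delimiter))
    return datasets


example = "#OrderHeader=OH\nOH;Order1\nOH;Order2\nOH;Order3\n"
datasets = parse_datasets(example)
for values in datasets:
    print(values)
```

For the example file, the sketch yields three datasets with two values each, which corresponds to the breakdown described above.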
The following screenshot shows the source structure for the example file, regardless of the arrangement of the data in the file.
The next screenshot shows the attributes for the source structure node OrderHeader of the previous screenshot. The value in (1) specifies the delimiter character that separates the values in a row. Since this value is entered in the nodes, different delimiters are possible for different dataset types.
Lobster_data only enters a node of the source structure if the conditions set for that node are met. These conditions, the so-called match codes, can be defined via the context menu of the source structure nodes. The following screenshot shows the corresponding dialogue.
Any number of conditions can be entered for a node. The conditions are ORed (logically linked with OR). In our example, the first value of each line is checked against the match codes. The match code mechanism allows you to mix different dataset types in an input file. If checkbox (14) is used, the same match code can be used in subnodes (see there for details).
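The OR-linked check of the first value can be illustrated with a small Python sketch. It is a rough model only: the function `find_node`, the node OrderItem, and the match codes OI and OP are hypothetical and merely show how several dataset types in one file could be told apart. The real match code dialogue may offer further options that are not modelled here.

```python
from typing import Dict, List, Optional, Set

# Hypothetical configuration: node name -> match codes of that node.
# Only OrderHeader/OH comes from the example file; the rest is invented.
NODE_MATCH_CODES: Dict[str, Set[str]] = {
    "OrderHeader": {"OH"},
    "OrderItem": {"OI", "OP"},
}


def find_node(values: List[str],
              node_match_codes: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first node whose match codes accept the dataset, if any."""
    if not values:
        return None
    for node, codes in node_match_codes.items():
        # The conditions are ORed: one matching code is enough to enter the node.
        if any(values[0] == code for code in codes):
            return node
    return None


mixed_file = "OH;Order1\nOI;Item1\nOP;Item2\nXX;Unknown"
assigned = [
    find_node(line.split(";"), NODE_MATCH_CODES)
    for line in mixed_file.splitlines()
]
print(assigned)
```

A dataset whose first value matches none of the codes is assigned to no node in this sketch, which mirrors the rule that a node is only entered if its conditions are met.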
When using the CSV parser, the top nodes must also have the match codes that allow the subnodes to be found. By contrast, when using the fixed-length parser, only the subnodes need to have the match codes.
For a hierarchy of nodes, the whitespace character (" ") or nothing ("") must be entered as the column delimiter in the parent node.
The figure below shows the operation of the parser when breaking down a file into records with different types of datasets.