Collecting Data from Multiple Instances of One Input Node

Let’s start with something difficult: You receive an EDIFACT file which contains NAD segments with information about the parties involved. Unfortunately, the output requires this information to be written to a single node that merely has a separate field for each party. The input data looks like this.

UNA:+.? '
UNB+UNOC:3+11111+22222+070622:1145+15'
UNH+1+DESADV:D:96A:UN:EAN005'
BGM+351+253044+9'
DTM+137:20070531:102'
RFF+DQ:253044'
DTM+171:20070531:102'
NAD+SU+12345::9'
NAD+BY+67890::9'
NAD+CZ+13579::9'
CPS+1'
LIN+1'
PIA+1+303230:IN'
PIA+5+303230:SA'
UNT+13+1'
UNZ+1+15'

As you can see, we’re dealing with a DESADV D96A. For this example, we are only interested in the party ID - in other words, 12345, 67890 and 13579. So the field in question is F3039 in the NAD segment, which always introduces SG2. The qualifier that tells the parties apart (SU, BY or CZ) is in field F3035 of the same segment.
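To make the input side concrete, here is a minimal Python sketch that pulls the qualifier and the party ID out of each NAD segment. It is purely illustrative and not part of Lobster_data: the function name extract_party_ids is made up, the message is shortened, and the simple splitting assumes the separators from the UNA line above and ignores the release character.

```python
# Illustrative only: plain Python, not Lobster_data functionality.
# Shortened copy of the sample message from above.
MESSAGE = (
    "UNH+1+DESADV:D:96A:UN:EAN005'"
    "BGM+351+253044+9'"
    "NAD+SU+12345::9'"
    "NAD+BY+67890::9'"
    "NAD+CZ+13579::9'"
    "CPS+1'"
)


def extract_party_ids(message):
    """Return (qualifier, party_id) pairs for every NAD segment.

    Assumption: segments end with ', elements are separated by + and
    components by :. The release character ? is not handled here.
    """
    pairs = []
    for segment in message.split("'"):
        elements = segment.split("+")
        if elements[0] != "NAD" or len(elements) < 3:
            continue
        qualifier = elements[1]                # field F3035
        party_id = elements[2].split(":")[0]   # field F3039
        pairs.append((qualifier, party_id))
    return pairs


for qualifier, party_id in extract_party_ids(MESSAGE):
    print(qualifier, party_id)
```

The three pairs produced here are exactly the values that have to end up side by side in one output record.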

images/download/thumbnails/36582325/Tipps_und_Tricks_zu_DataWizard_umgesetzt_html_798f5447-version-1-modificationdate-1566463082000-api-v2.png

Our output structure is simple and will be written as a CSV file. It has one node with three fields into which the partner IDs should be transferred.

images/download/thumbnails/36582325/Tipps_und_Tricks_zu_DataWizard_umgesetzt_html_m3caa24e6-version-1-modificationdate-1566463082000-api-v2.png

So, how do we get the values from the three NAD segments to the three fields? Simply defining a path from the Partner node to the NAD or SG2 node is not enough. First, we must somehow gather together and note the three values, and then insert them into the fields.

The 'Less than Ideal' Solution

First, we need to note the values. We could use variables, so let’s quickly define three of them: var__PARTY_ID_SU, var__PARTY_ID_BY and var__PARTY_ID_CZ. We can write their values into the relevant destination fields with the function copy(field/value/variable). Here is SU as an example.

images/download/thumbnails/36582325/partner_var_1_EN-version-1-modificationdate-1566463082000-api-v2.png

images/download/thumbnails/36582325/partner_var_2_EN-version-1-modificationdate-1566463082000-api-v2.png

So far, so good. Now, how do we populate the variables? You know the following: if we define a path between a destination structure node and a source structure node, this destination structure node will run for as long as the source structure node has data. We can also use functions that return true or false to control whether the node is actually accessed or not. So let’s add three nodes for our partners, each of which responds to a different NAD segment. Each of these nodes gets a calculation field that populates one of the variables. Because they are calculation fields, nothing of this will appear in the output. Again, SU serves as the example.

images/download/thumbnails/36582325/suboptimal_1_EN-version-1-modificationdate-1566463082000-api-v2.png

images/download/thumbnails/36582325/suboptimal_2_EN-version-1-modificationdate-1566463082000-api-v2.png

And here is the calculation field.

images/download/thumbnails/36582325/Tipps_und_Tricks_zu_DataWizard_umgesetzt_html_m2d90289a-version-1-modificationdate-1566463082000-api-v2.png

images/download/thumbnails/36582325/suboptimal_4_EN-version-1-modificationdate-1566463082000-api-v2.png

What have we forgotten? That’s right: the variables retain their values for the rest of the profile run. If the file contains multiple messages (UNH to UNT) and some do not have instances of all three NAD segments, old values from an earlier message will be used. So always delete them at the start! To that end, we will put a calculation field right at the start of the Partner node - let’s call it calc_ResetVars - which will populate all three variables with an empty character string again. Since this field comes first, we can be sure that the variables will be reset before each run and that we never have old values. The result is shown in the following screenshots.

images/download/thumbnails/36582325/Tipps_und_Tricks_zu_DataWizard_umgesetzt_html_m17bb1948-version-1-modificationdate-1566463082000-api-v2.png

images/download/thumbnails/36582325/suboptimal_6_EN-version-1-modificationdate-1566463082000-api-v2.png
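The effect of the missing reset can be illustrated with a small Python sketch. This is only a model of the behaviour described above, not Lobster_data code: the function collect, the flag reset_first and the second message with its IDs are invented for the illustration, and each message is represented simply as a list of (qualifier, party ID) pairs.

```python
# Illustrative model only: variables live for the whole run, like
# profile variables, and are optionally reset before each message.
def collect(messages, reset_first):
    variables = {"SU": "", "BY": "", "CZ": ""}
    rows = []
    for nad_pairs in messages:
        if reset_first:
            # Corresponds to the calc_ResetVars field.
            for name in variables:
                variables[name] = ""
        for qualifier, party_id in nad_pairs:
            if qualifier in variables:
                variables[qualifier] = party_id
        rows.append((variables["SU"], variables["BY"], variables["CZ"]))
    return rows


messages = [
    [("SU", "12345"), ("BY", "67890"), ("CZ", "13579")],
    [("SU", "11111"), ("BY", "22222")],  # hypothetical message without NAD+CZ
]

print(collect(messages, reset_first=False)[1])  # CZ still holds 13579
print(collect(messages, reset_first=True)[1])   # CZ is empty
```

Without the reset, the second record silently inherits the CZ value of the first message, which is exactly the kind of error that is hard to spot in production data.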

That works. But it’s not great. We have three variables, three nodes each with a calculation field, and then also the field to reset the variables. We can get rid of most of that. So, next attempt.

Better

Why have three nodes with three fields? It would be better if we had just one node with one field to check which NAD segment we are in. Once again, the node path is NAD.

images/download/attachments/36582325/Tipps_und_Tricks_zu_DataWizard_umgesetzt_html_m7bc6d988-version-1-modificationdate-1566463082000-api-v2.png

On this occasion, field F3039 has been mapped directly to the calculation field, which means we can simply work with the Link (=mapped field) function parameter. Now we need to use the calculation field to decide which of the three variables should be populated. This could be done with the following function chain (shown as text only from here on).

1) result = goto function-pos(a==b, c, d)
a field: F3035
b constant: SU
c constant: 2
d constant: 4
 
2) result = save variable a(b) type-safe
a constant: var__PARTY_ID_SU
b linked field:
 
3) result = break function execution
 
4) result = goto function-pos(a==b, c, d)
a field: F3035
b constant: BY
c constant: 5
d constant: 7
 
5) result = save variable a(b) type-safe
a constant: var__PARTY_ID_BY
b linked field:
 
6) result = break function execution
 
7) result = goto function-pos(a==b, c, d)
a field: F3035
b constant: CZ
c constant: 8
d constant: 10
 
8) result = save variable a(b) type-safe
a constant: var__PARTY_ID_CZ
b linked field:
 
9) result = break function execution
 
10) result = copy(field/value/variable)
a constant:

That’s right - it’s not great. We will come to that in a moment. But first a brief explanation of what is happening here:

Function 1 checks whether we are in NAD+SU. If so, execution continues at position 2, where the variable var__PARTY_ID_SU is populated with the mapped field. Function 3 then stops the function chain and we have what we want. If the qualifier is not SU, execution jumps to function 4.

Function 4 does the same for BY and jumps to position 7 for all other qualifiers.

Function 7 repeats the same check for CZ. If a completely different, irrelevant qualifier comes along, we simply skip to position 10, where an empty string is written to the calculation field. That actually has no effect on a calculation field, but we have to go somewhere. And we cannot jump directly to break function execution.
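For readers who think in code, the control flow of this chain corresponds roughly to a series of if statements with early returns. The following Python sketch is only an analogy: note_party_id is a made-up name, the variables are modelled as a dictionary, and the qualifier DP is just an arbitrary example of an irrelevant one.

```python
# Analogy only: the comments give the matching position in the chain.
def note_party_id(qualifier, party_id, variables):
    if qualifier == "SU":                          # position 1
        variables["var__PARTY_ID_SU"] = party_id   # position 2
        return                                     # position 3
    if qualifier == "BY":                          # position 4
        variables["var__PARTY_ID_BY"] = party_id   # position 5
        return                                     # position 6
    if qualifier == "CZ":                          # position 7
        variables["var__PARTY_ID_CZ"] = party_id   # position 8
        return                                     # position 9
    # position 10: irrelevant qualifier, nothing is stored


variables = {}
for qualifier, party_id in [("SU", "12345"), ("DP", "99999"), ("CZ", "13579")]:
    note_party_id(qualifier, party_id, variables)
print(variables)
```

Seen this way, the weakness is obvious: every additional qualifier means another three positions, and all jump targets after it have to be renumbered.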

Good

There are a few other possibilities that are at least better than a chain of ten functions, but we are not going to spend time on those now. Instead, we are going to reduce the whole thing to two functions. To do so, however, we need an additional variable to hold the irrelevant values. As it has no other use, we will call it var__DUMMY. It can be safely overwritten again and again in many other places since it is never read anyway. It does not even need to be reset. The structure remains the same, with only the functions in the field calc_note_PIDs being replaced.

1) result = replace value(a, list b, list c, default d)
a field: F3035
b constant: SU,BY,CZ
c constant: var__PARTY_ID_SU,var__PARTY_ID_BY,var__PARTY_ID_CZ
d constant: var__DUMMY
 
2) result = save variable a(b) type-safe
a result: 1
b linked field:

That looks far neater, doesn’t it? The first function returns the appropriate variable name depending on the qualifier in field F3035. For irrelevant qualifiers, the result is var__DUMMY. This ensures there is always a variable that we can populate with a value in the second function. Needless to say, executing just two functions is also much faster than a chain of ten functions.
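The idea behind these two functions, looking up a target name and then writing to it, can be sketched in Python as follows. The helper replace_value only imitates the behaviour described above (match against list b, return the entry at the same position in list c, otherwise the default); it is an assumption for illustration and not the actual implementation of the Lobster_data function.

```python
# Illustrative imitation of "replace value" followed by "save variable".
QUALIFIERS = ["SU", "BY", "CZ"]
VARIABLE_NAMES = ["var__PARTY_ID_SU", "var__PARTY_ID_BY", "var__PARTY_ID_CZ"]


def replace_value(value, keys, replacements, default):
    """Return the replacement at the position where value matches a key."""
    for key, replacement in zip(keys, replacements):
        if value == key:
            return replacement
    return default


def save_party_id(qualifier, party_id, variables):
    # Function 1: pick the variable name, var__DUMMY for anything else.
    target = replace_value(qualifier, QUALIFIERS, VARIABLE_NAMES, "var__DUMMY")
    # Function 2: store the mapped value under that name.
    variables[target] = party_id


variables = {}
for qualifier, party_id in [("SU", "12345"), ("DP", "99999"), ("BY", "67890")]:
    save_party_id(qualifier, party_id, variables)
print(variables)
```

The dummy variable plays the role of a sink: the write always happens, so no branching is needed, and the value stored there is simply never read.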

Great

In the previous solutions, we always used three (or even four) variables that were created specifically for each qualifier. First of all, this isn’t very neat - and secondly, it is possible that another qualifier might become relevant later on. This means that even with the 'good' solution you would need to create another variable and the lists in function 1 would need to be expanded to include the qualifier and variable. It is possible to avoid this, and also to save ourselves another function at the same time. To do this we will use a map. Maps are quite convenient; they always save pairs of keys and values. A map could, for example, consist of a list of postcodes and their associated towns.

Key      Value
82327    Tutzing
82362    Weilheim
82380    Peißenberg
82390    Eberfing
...      ...


If you need to know what town has the postcode 82327, you simply query the map for the value corresponding to key 82327, and it will promptly return Tutzing. The important fact here is that you always have exactly one value for each key in the map. Of course, in a static case like this you are more likely to create a CSV file or even a database table that you can refer to. But we are dealing with constantly changing values. Every EDIFACT message contains different party IDs under the same qualifiers. Here’s how it works.

Maps in a Lobster_data profile are addressed by name. Let’s call our map Party_Map. Unlike variables, maps do not need to be defined first. As soon as you use the name for the first time in a map function, the map is created automatically. The name itself does not matter at all; you could even call your map 'Hugo', as long as you always make sure to use the same map name everywhere. Incidentally, this is an excellent opportunity to store the name as a constant. Right at the start, in our field calc_ResetVars, we make sure that the map we want to use is empty. As before, when we used variables, there should be no old values left behind. The appropriate function for this:

1) result = clear map(name of map a)
a constant: Party_Map

This saves us even more functions. Instead of populating three variables individually with an empty value, we empty the whole map in one go. And if another qualifier is required, we do not need any more functions. Convenient, isn’t it? So, now for the field note_PIDs. The two functions become one.

1) result = add to map(key a, value b, name of map c)
a field: F3035
b linked field:
c constant: Party_Map

What does this thing do? For every qualifier in field F3035 (the key), it saves the mapped field value (the party ID) to the map named Party_Map. And it makes no difference what qualifiers come along; everything is saved. What we read from the map afterwards is up to us. In the field PartnerId_SU, for example, we get the value corresponding to the key SU.

1) result = get value from map(key a, name of map b)
a constant: SU
b constant: Party_Map

What do you think? It is a lot easier to live with than a mishmash of nodes and fields or a chain of ten whole functions for just three values. Now imagine that you need to collect half a dozen values instead of just the party IDs. Then, this lean solution will be even more beneficial in terms of both performance and clarity.
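As a rough mental model, the three map functions used above behave like operations on named dictionaries. The Python sketch below is an analogy under stated assumptions, not Lobster_data code: the helper names merely mirror the function names, the second message is invented, and returning an empty string for a missing key is an assumption made for the illustration.

```python
# Analogy only: named maps that are created on first use.
maps = {}


def clear_map(name):
    maps[name] = {}


def add_to_map(key, value, name):
    maps.setdefault(name, {})[key] = value


def get_value_from_map(key, name):
    # Assumption: a missing key yields an empty string.
    return maps.get(name, {}).get(key, "")


def process_message(nad_pairs):
    clear_map("Party_Map")                      # calc_ResetVars
    for qualifier, party_id in nad_pairs:       # note_PIDs, once per NAD
        add_to_map(qualifier, party_id, "Party_Map")
    return tuple(get_value_from_map(q, "Party_Map") for q in ("SU", "BY", "CZ"))


print(process_message([("SU", "12345"), ("BY", "67890"), ("CZ", "13579")]))
print(process_message([("SU", "11111"), ("DP", "99999")]))  # hypothetical message
```

Supporting a further qualifier only requires reading one more key; neither the reset nor the collecting step has to change.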