Bulk Database Queries - A Time Consuming Task

A simple scenario: You receive a large number of orders, e.g. in a CSV file. The items only show the product code. But you also need the name and description of each product. Let’s reuse the orders from our police inspectors in the previous section.

O,09.07.2009,12345,Derrick,Munich,Germany
I,4711,3
I,4712,1
I,4713,5
I,4714,2
O,09.07.2009,12346,Kottan,Vienna,Austria
I,4711,1
I,4713,3
I,4715,6
O,09.07.2009,12347,Kojak,New York,USA
I,4717,11
O,09.07.2009,12348,Maigret,Paris,France
I,4712,4


Here’s how the structures look now.

images/download/attachments/36582421/5_3_Strukturen_EN-version-1-modificationdate-1566523844000-api-v2.png

Slow

We can retrieve the name of a product from the database with the following function.

1) result = select-statement a [[param &1 = c], d,e, f,g,h,i,j], default k, dbAlias = b
a constant: select Name from Product where PCode = &1
b constant: ProductDB
c field: ProductCode

We do not need the rest of the parameters from d to k just now. So we add this function to the field ProductName, and another, which we will use to retrieve the description from the database, to the field ProductDescription. Fantastic. For these nine items (containing six different products) we fire off a total of 18 select statements to the database. Doing that with a bulk order of, say, a thousand items? No thanks.

Half as Slow

Let’s make a few tweaks. Actually, we could retrieve both pieces of information - the name and description - in one go, couldn’t we? So let’s change the ProductName statement a little, although we will need a different function.

1) result = set-vars(select a, dbalias b, [c,d,e,f,g,h,i,j for params &1...&8][,k,l,m,n])
a constant: select Name, Description from Product where PCode = &1
b constant: ProductDB
c field: ProductCode

This function is able to load multiple columns from a database and populate variables which have exactly the same name as the columns (only with a VAR_ in front). So, contrary to our recommendation to always start variables with var__, let’s create these two variables: VAR_Name and VAR_Description. Because the function result includes the value from the first column (Name) anyway, the product name will already be in the field. Then you only need to use the copy function to add the value for the VAR_Description variable to the ProductDescription field. So, now we have only one function per item, a total of nine. And that means the profile will only be half as slow when it comes to the database queries.

As Fast as It Gets

Why should we retrieve both sets of information from the database each time for every item, even if the same product has already shown up several times? We should remember what we have already had. And, as you have no doubt realised for yourself, we can do that with our maps. As before, we have the two variables VAR_Name and VAR_Description, but now we add the following function chain to the ProductName field.

1) result = key in map(key a, name of map b)
a field: ProductCode
b constant: map_ProductNames
 
2) result = goto function-pos(a==true, b, c)
a result: 1
b constant: 6
c constant: 3
 
3) result = set-vars(select a, dbalias b, [c,d,e,f,g,h,i,j for params &1...&8][,k,l,m,n])
a constant: select Name, Description from Product where
PCode = &1
b constant: ProductDB
c field: ProductCode
 
4) result = add to map(key a, value b, name of map c)
a field: ProductCode
b variable: VAR_Name
c constant: map_ProductNames
 
5) result = add to map(key a, value b, name of map c)
a field: ProductCode
b variable: VAR_Description
c constant: map_ProductDescriptions
 
6) result = get value from map(key a, name of map b)
a field: ProductCode
b constant: map_ProductNames

What happens here? First, we check whether there is already a value stored under the current product code in a map called map_ProductNames. If so, we do not need a database query. We jump directly to position 6, where we retrieve the name from the aforementioned map. If there has not been an instance of the product code, we transmit our select statement and then save the name in map_ProductNames and the description in map_ProductDescriptions with the product code as the key. Then we retrieve the value for the field from the map straight away. Now you also know which function to use for the product description field.

1) result = get value from map(key a, name of map b)
a field: ProductCode
b constant: map_ProductDescriptions

Using variables may seem cumbersome, but is much faster than transmitting two individual select statements or even setting up a corresponding function chain for every single field. Because there were only six different products ordered in our example, the database only needs to be accessed six times. But suppose you have a few hundred items with only two or three dozen products, and that there is a really large table to be scanned for every select, you can easily imagine the speed benefits.

One last little thing: in theory, you could access the database just once, which would save a lot of time again. To do this, use the select into map function. In one go, it can create a map of the name and description for all products in the database with their ProductCode keys. This function would be a very bad choice here because a stock list like this is likely to contain thousands of datasets. The map would exhaust your memory faster than any badly set up profile. However, if you have a small table with a few dozen records, this method is certainly a good alternative. Why not take a look at the documentation of the select into map function.