DefaultSQLCron (SQLCron)

Group

Time-driven Custom Classes

Class Name

com.ebd.hub.datawizard.util.DefaultSQLCron

Function

With this class, complex select statements can be created.

Configuration File

./conf/samples/sample_sql_cron.properties

Description


In practice, it is often required to read data from various tables of a database. The class DefaultSQLCron allows to easily handle this requirement. It is configurated by a properties file, in which the select statements to be executed, and dependencies between those statements are defined.

The select statements select.<n> must have an ascending index. The indexes have to be assigned without any gaps, starting with 0 and increased by 1. Dependencies between the SQL statements can be defined in the syntax dependencies.<n>=<s>:<m>, where n is the index of the dependency, s is the index of the SQL statement and m is the column index of the SQL statement. The dependency dependencies.1=0:2 defines, that the SQL statement select.1 is dependent on the second column of the results of select.0. It is possible to define several columns, for example, dependencies.1=0:2,0:3.

Important note: Since it is not easy to create the configuration file manually the first time, this can be done with the SQL Configuration Wizard (1).

images/download/attachments/58591303/SQL_2_EN-version-1-modificationdate-1597736839533-api-v2.png


Note: A source structure for the file created by the DefaultSQLCron can automatically be created in the menu of the source structure.

Description of Parameters


Parameter

Description

db_alias

This parameter defines the alias of the database to be accessed. The alias has to be configured in the file database.xml. Variables from previous profiles or parameters from HTTP triggers can be also be used. For example db_alias=@MSG_CALL_DB_ALIAS@

pre.statement

It is possible to define several pre-statements, separated by ;

post.statement

It is possible to define several post-statements, separated by ;

save_as_csv

By declaring a directory path, a CSV file containing the source data can be created.

result.split

(true, or false) If true, every select.0 creates a new job. Only works if at least two select statements (select.0 and select.1) and one dependency (dependencies.1) exist. Note: If you only have one select statement, you can use a little trick. Simply create a dummy statement select.1 with a condition that is never true (to avoid creating an additional result line) and an arbitrary dependencies.1. Example:

images/download/attachments/58591303/sql_split-version-1-modificationdate-1597736839493-api-v2.png

select.<n>

This parameter defines the select statements, starting with 0, increased by 1. Note: Line breaks can be specified with \. Example:

select.1=select 1 from dual\

where 1=2

dependencies.<n>

This parameter defines the dependency of the respective select.<n> statement.

limit.<n>

Sets a limit for the number of rows of the respective select.<n>. A limit of 0 means no limit. This is also achieved by leaving out the particular limit.

empty.<n>

Defines a dummy result for select.<n>, if select.<n> has no result.

MSG_CALL_<VARNAME>

See explanations below.

encode_blob

(true, or false) If true, BLOB fields will be Base64-encoded.

Placeholders are used to reference dependencies in the WHERE clause of the select.<n> statements. The placeholder @1:i@ will be replaced with the first column of the result row of the associated dependency, the placeholder @2:i@ with the second column. The i stands for the data type, in this case, Integer (see the following table). It is also possible to use variables from preceding profiles, or parameters from HTTP triggers, but the variables have to be defined in the profile. If for example, the properties file declares MSG_CALL_MYVAR=100, the value of that variable will fill the placeholder @100:i@. Important note: The placeholder number for variables must not be smaller than 100.


Abbreviation

Data Type

i

Integer

f

Float

r

Real

d

Date

t

Timestamp

s

String

l

Long

v

Boolean

b

Blob

x

Textstream

Hint: It is also possible to use the point in time of the previous successful profile start in the WHERE condition by using placeholders in the following way.


  • @0:t@ stands for the point in time of the last profile run as a timestamp.

  • @0:d@ stands for the point in time of the last profile run as a date.

  • @0:s#<template>@ stands for the point in time as a formatted string. Example: @0:s#yyyyMMdd@ returns 20150310 for 10.3.2015. Important note: @0:s#<template>@ is only resolved once. A further occurrence of @0:s#<template>@ with a different template has no effect! Instead, the cached value of the first occurrence is used for further occurrences.

Example


Important note: Suppose we have a string variable MSG_CALL_MYVAR with content 'USA', 'UK'. If you now use the definition MSG_CALL_MYVAR=100 in the configuration file shown below, the SQL query SELECT id, name FROM supplier WHERE country IN (@100:s@) would not work. The reason is that this is not resolved into SELECT id, name FROM supplier WHERE country IN ('USA','UK'), but into SELECT id, name FROM supplier WHERE country IN (''USA','UK'') since the placeholder represents a string. Unfortunately, such queries can only be solved via dependencies, as shown below. The SQL Configuration Wizard will help you do this.


./conf/samples/sample_sql_cron.properties
#
# Sample config file for DefaultSQLCron
#
#Which db connection to use
db_alias=hub
#You can define multiple pre- and post-statements here (separated by ";"), like:
#pre.statement=lock table table_a in exclusvie mode;lock table_b in exclusive mode
#post.statement=
#Create CSV file for debugging - uncomment if needed.
#save_as_csv=/tmp/sqldump.csv
#For each row of select.0 create a new request (job) within Lobster_data, if param is set to true.
result.split=false
#If BLOB columns are selected, you may want to encode the content with Base64.
#encode_blob=true
#
#Here are the statements... must start with 0, step size is 1!
#
select.0=select 'A',id,name from order where date_at > @0:d@
select.1=select 'B',id,item_number from item where order_id = @1:i@
select.2=select 'C',id,text from item_text where item_id = @1:i@
#
#select.1 needs value (column at position 2) from statement 0 (select.0) for execution (join); multiple columns are seperated by ','.
#select.2 needs value (column at position 2) from statement 1 (select.1) for execution (join); multiple columns are seperated by ','.
#Offset for counting column positions is always 1. First dependency references to @1:xxx@, etc.
#Note: To use the time of the last time a cron job was executed, use tag @0:d@ or @0:t@ for date or timestamp.
#
dependencies.1=0:2
dependencies.2=1:2
#
#(Passed) variables can be used as well, if defined in the profile.
MSG_CALL_MYVAR=100
# This will replace filler @100:i@ in any SQL select statement by the integer value of the named variable.
#
#
#Use limit.xxx to limit the result set, e.g.
#limit.0 = 500
#will limit select.0 to 500 rows. If not set, no limit is given.
#If an SQL statement returns no results, you can define a dummy result instead.
#Define a valid CSV line with delimeter ';'.
#An example for a 'dummy' result for select.0:
#empty.0=A;0;Nothing


Three tables (order, item, item_text) are to be read. The tables are linked by 1:n relations. One order can hold any number of items and every item can have any number of item texts.

The order table:


id

name

date_at

10

order_1

01.01.2014

11

order_2

01.01.2014

The item table:


id

item_number

order_id

20

item_1

10

21

item_1

11

22

item_2

11

The item_text table:


id

text

item_id

30

text_1

20

31

text_2

20

32

text_3

22

The first SQL statement select.0 is executed and placeholder @0:d@ is set to the date of the last profile run. Following the result of the query:


A

10

order_1

A

11

order_2

The first line


A

10

order_1

of the result set of select.0 is used to create the dependencies.1:


10

The dependencies.1 are used to perform select.1:


B

20

item_1

The result is used to create the dependencies.2:


20

The dependencies.2 are used to perform select.2:


C

30

text_1

C

31

text_2

The result lines are combined to the first result subset:


A

10

order_1

B

20

item_1

C

30

text_1

C

31

text_2

Similarly, the second line


A

11

order_2

of the result set of select.0 is used to create the dependencies.1:


11

The dependencies.1 are used to perform select.1:


B

21

item_1

B

22

item_2

The result is used to create the dependencies.2:


21

22

The select.2 statement for the first line of dependencies.2 returns no result. Performed with the second line of dependencies.2, select.2 returns:


C

32

text_3

The result lines are combined to the second result subset:


A

11

order_2

B

21

item_1

B

22

item_2

C

32

text_3

At the end, both result subsets are combined and we get the overall result set:


A

10

order_1

B

20

item_1

C

30

text_1

C

31

text_2

A

11

order2

B

21

item_1

B

22

item_2

C

32

text_3