DefaultSQLCron
Configuration |
Is configured in the Business Connector (Input Agent cron). |
Class name |
com.ebd.hub.datawizard.util.DefaultSQLCron |
Description
In practice, it is often required to read data from various tables of a database. The class DefaultSQLCron allows to easily handle this requirement. It is configured by a properties file, in which the select statements to be executed, and dependencies between those statements are defined.
The select statements select.<n> must have an ascending index. The indexes have to be assigned without any gaps, starting with 0 and increased by 1. Dependencies between the SQL statements can be defined in the syntax dependencies.<n>=<s>:<m>, where n is the index of the dependency, s is the index of the SQL statement and m is the column index of the SQL statement. The dependency dependencies.1=0:2 defines, that the SQL statement select.1 is dependent on the second column of the results of select.0. It is possible to define several columns, for example, dependencies.1=0:2,0:3.
Important note: Since it is not easy to create the configuration file manually the first time, this can be done with the SQL Wizard.
Note: A source structure for the file created by the DefaultSQLCron can automatically be created in the menu of the source structure.
Parameters
Parameter |
Description |
db_alias |
This parameter defines the alias of the database to be accessed. See section Databases/connectors (administration). Variables from previous profiles or parameters from HTTP triggers can be also be used. For example db_alias=@MSG_CALL_DB_ALIAS@ |
pre.statement |
It is possible to define several pre-statements, separated by ; |
post.statement |
It is possible to define several post-statements, separated by ; |
save_as_csv |
By declaring a directory path, a CSV file containing the source data can be created. |
result.split |
(true, or false) If true, every select.0 creates a new job. Only works if at least two select statements (select.0 and select.1) and one dependency (dependencies.1) exist. Note: If you only have one select statement, you can use a little trick. Simply create a dummy statement select.1 with a condition that is never true (to avoid creating an additional result line) and an arbitrary dependencies.1. Example: result.split=true |
select.<n> |
This parameter defines the select statements, starting with 0, increased by 1. Note: Line breaks can be specified with \. Example: select.1=select 1 from dual \ where 1=2 |
dependencies.<n> |
This parameter defines the dependency of the respective select.<n> statement. |
limit.<n> |
Sets a limit for the number of rows of the respective select.<n>. A limit of 0 means no limit. This is also achieved by leaving out the particular limit. |
empty.<n> |
Defines a dummy result for select.<n>, if select.<n> has no result. |
MSG_CALL_<VARNAME> |
See explanations below. |
encode_blob |
(true, or false) If true, BLOB/CLOB fields will be Base64-encoded. |
use_raw_content |
If false (default), the result set is returned as CSV rows. Let's assume we have 4 result columns and 3 result rows. T001;1253698848119000;0;ASM If true, the entire result set is returned in one line, without CSV delimiters. T00112536988481190000ASMT00112563101269060000LobsterT00113049476314730000CONRAD |
Placeholders are used to reference dependencies in the WHERE clause of the select.<n> statements. The placeholder @1:i@ will be replaced with the first column of the result row of the associated dependency, the placeholder @2:i@ with the second column. The i stands for the data type, in this case, Integer (see the following table). It is also possible to use variables from preceding profiles, or parameters from HTTP triggers, but the variables have to be defined in the profile. If for example, the properties file declares MSG_CALL_MYVAR=100, the value of that variable will fill the placeholder @100:i@. Important note: The placeholder number for variables must not be smaller than 100.
Abbreviation |
Data type |
i |
Integer |
f |
Float |
r |
Real |
d |
Date |
t |
Timestamp |
s |
String |
l |
Long |
v |
Boolean |
b |
Blob |
x |
Textstream |
Note: It is also possible to use the point in time of the previous successful profile start in the WHERE condition by using placeholders in the following way.
@0:t@ stands for the point in time of the last profile run as a timestamp.
@0:d@ stands for the point in time of the last profile run as a date.
@0:s#<template>@ stands for the point in time as a formatted string. Example: @0:s#yyyyMMdd@ returns 20150310 for 10.3.2015. Important note: @0:s#<template>@ is only resolved once. A further occurrence of @0:s#<template>@ with a different template has no effect! Instead, the cached value of the first occurrence is used for further occurrences.
Example
Important note: Suppose we have a string variable MSG_CALL_MYVAR with content 'USA', 'UK'. If you now use the definition MSG_CALL_MYVAR=100 in the configuration file shown below, the SQL query SELECT id, name FROM supplier WHERE country IN (@100:s@) would not work. The reason is that this is not resolved into SELECT id, name FROM supplier WHERE country IN ('USA','UK'), but into SELECT id, name FROM supplier WHERE country IN (''USA','UK'') since the placeholder represents a string. Unfortunately, such queries can only be solved via dependencies, as shown below. The SQL Wizard will help you do this.
#
# Sample config file for DefaultSQLCron
#
#Which db connection to use
db_alias=hub
#You can define multiple pre- and post-statements here (separated by ";"), like:
#pre.statement=lock table table_a in exclusvie mode;lock table_b in exclusive mode
#post.statement=
#Create CSV file for debugging - uncomment if needed.
#save_as_csv=/tmp/sqldump.csv
#For each row of select.0 create a new request (job) if param is set to true.
result.split=false
#If BLOB columns are selected, you may want to encode the content with Base64.
#encode_blob=true
#
#Here are the statements... must start with 0, step size is 1!
#
select.0=select 'A',id,name from order where date_at > @0:d@
select.1=select 'B',id,item_number from item where order_id = @1:i@
select.2=select 'C',id,text from item_text where item_id = @1:i@
#
#select.1 needs value (column at position 2) from statement 0 (select.0) for execution (join); multiple columns are seperated by ','.
#select.2 needs value (column at position 2) from statement 1 (select.1) for execution (join); multiple columns are seperated by ','.
#Offset for counting column positions is always 1. First dependency references to @1:xxx@, etc.
#Note: To use the time of the last time a cron job was executed, use tag @0:d@ or @0:t@ for date or timestamp.
#
dependencies.1=0:2
dependencies.2=1:2
#
#(Passed) variables can be used as well, if defined in the profile.
MSG_CALL_MYVAR=100
# This will replace filler @100:i@ in any SQL select statement by the integer value of the named variable.
#
#
#Use limit.xxx to limit the result set, e.g.
#limit.0 = 500
#will limit select.0 to 500 rows. If not set, no limit is given.
#If an SQL statement returns no results, you can define a dummy result instead.
#Define a valid CSV line with delimeter ';'.
#An example for a 'dummy' result for select.0:
#empty.0=A;0;Nothing
Three tables (order, item, item_text) are to be read. The tables are linked by 1:n relations. One order can hold any number of items and every item can have any number of item texts.
The order table:
id |
name |
date_at |
10 |
order_1 |
01.01.2014 |
11 |
order_2 |
01.01.2014 |
The item table:
id |
item_number |
order_id |
20 |
item_1 |
10 |
21 |
item_1 |
11 |
22 |
item_2 |
11 |
The item_text table:
id |
text |
item_id |
30 |
text_1 |
20 |
31 |
text_2 |
20 |
32 |
text_3 |
22 |
The first SQL statement select.0 is executed and placeholder @0:d@ is set to the date of the last profile run. Following the result of the query:
A |
10 |
order_1 |
A |
11 |
order_2 |
The first line
A |
10 |
order_1 |
of the result set of select.0 is used to create the dependencies.1:
10 |
The dependencies.1 are used to perform select.1:
B |
20 |
item_1 |
The result is used to create the dependencies.2:
20 |
The dependencies.2 are used to perform select.2:
C |
30 |
text_1 |
C |
31 |
text_2 |
The result lines are combined to the first result subset:
A |
10 |
order_1 |
B |
20 |
item_1 |
C |
30 |
text_1 |
C |
31 |
text_2 |
Similarly, the second line
A |
11 |
order_2 |
of the result set of select.0 is used to create the dependencies.1:
11 |
The dependencies.1 are used to perform select.1:
B |
21 |
item_1 |
B |
22 |
item_2 |
The result is used to create the dependencies.2:
21 |
22 |
The select.2 statement for the first line of dependencies.2 returns no result. Performed with the second line of dependencies.2, select.2 returns:
C |
32 |
text_3 |
The result lines are combined to the second result subset:
A |
11 |
order_2 |
B |
21 |
item_1 |
B |
22 |
item_2 |
C |
32 |
text_3 |
At the end, both result subsets are combined and we get the overall result set:
A |
10 |
order_1 |
B |
20 |
item_1 |
C |
30 |
text_1 |
C |
31 |
text_2 |
A |
11 |
order2 |
B |
21 |
item_1 |
B |
22 |
item_2 |
C |
32 |
text_3 |
Blanks in query result (MSSQL/Informix)
In MSSQL and Informix databases, the result values of SQL calls are trimmed of preceding and following spaces by default. The "skipTrimResultValues" parameter can be used in the database alias configuration file to change this behaviour.