Phase 1 (introduction)

Input Agents


The input data is received by so-called Input Agents in phase 1. If several Input Agents come into consideration for the data, the profile scoring decides which profile is used.

Backup and Unresolved


If the data can be assigned to a profile, a backup of the input files is created and a job is generated for that profile. Otherwise, the files end up in the Unresolved area.

Virus Scanner


It is possible to execute a virus scanner (Java class) at this point. The class is called whenever a backup file is generated or a file ends up in the Unresolved area.

To do so, a class derived from com.ebd.hub.datawizard.plugin.AbstractVirusCheck has to be created. This class only has to implement the following two methods.


/**
* Check file for virus
*
* @param backup backup file
* @throws Exception on any error
*/
public abstract void checkFile(File backup) throws Exception;
/**
* Check data for virus
*
* @param data received data, most likely by AS2 (already encrypted)
* @throws Exception on any error
*/
public abstract void checkData(byte[] data) throws Exception;


In these two methods, the virus scanner must be called with the data; an exception must be thrown if the data is contaminated. The class has to be registered in the configuration file ./etc/startup.xml (also on the DMZ server!) with the following entry.


<Set name="virusScanner">your_class_name_including_package</Set>
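A minimal sketch of such a check class follows. To keep the sketch compilable on its own, it does not extend com.ebd.hub.datawizard.plugin.AbstractVirusCheck (in a real deployment it would, overriding the two abstract methods shown above), and the scan merely looks for the EICAR test signature instead of calling a real scanner.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Standalone sketch of a virus check. In a real deployment this class
// would extend com.ebd.hub.datawizard.plugin.AbstractVirusCheck; the
// EICAR-marker scan below stands in for a call to an actual scanner.
public class EicarVirusCheck {

    private static final String EICAR_MARKER = "EICAR-STANDARD-ANTIVIRUS-TEST-FILE";

    // Mirrors AbstractVirusCheck.checkFile(File): check a backup file.
    public void checkFile(File backup) throws Exception {
        checkData(Files.readAllBytes(backup.toPath()));
    }

    // Mirrors AbstractVirusCheck.checkData(byte[]): check received data.
    public void checkData(byte[] data) throws Exception {
        String text = new String(data, StandardCharsets.ISO_8859_1);
        if (text.contains(EICAR_MARKER)) {
            // Throwing any exception marks the data as contaminated.
            throw new Exception("Virus signature detected");
        }
    }
}
```

Clean data passes silently; contaminated data aborts the backup/Unresolved handling via the exception.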

Important note: Lobster_data provides a programming interface (API) that allows you to develop your own components in Java. For this, we offer in-depth training. If you are interested, please contact our support or sales staff.

Thread Queues


A distinction is made between Input Agents that process jobs directly and those that place them in a Thread Queue. These two different methods exist to ensure the immediate execution of certain jobs.

Jobs That Are Processed Directly



  • Jobs of time-driven Input Agents are processed sequentially per profile.

  • Jobs of Input Agents of type HTTP(S), Message and AS2 can be processed in parallel for each profile.

Jobs That Are Stored in a Thread Queue


  • The jobs of all other Input Agents.


The entries (jobs) of the Thread Queues are processed by the profiles that originally created them. The maximum number of different profiles working at the same time can be configured. The following listing shows how to set the minimum and maximum values in the configuration file ./etc/startup.xml.


...
<Set name="minBackgroundThreads">4</Set>
<Set name="maxBackgroundThreads">10</Set>
...

How Thread Queues Work


Normally, Lobster_data processes jobs promptly, which means that the number of entries in the Thread Queues stays very small (usually they are even empty). If the entries cannot be processed fast enough, the number of entries grows. If it exceeds a certain threshold, Lobster_data starts to swap the surplus jobs to the hard disk. These jobs are read in again as soon as the number falls below a second threshold value. If there are still jobs in the Thread Queues when the system shuts down, these jobs are swapped to the hard disk as well. After a restart, they are swapped back into the Thread Queues.
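The two-threshold spill behaviour can be sketched as follows. This is a conceptual illustration only, not Lobster_data internals: entries over the high-water mark are swapped out, and swapped entries are read back in once the in-memory queue falls below the low-water mark. The sketch moves entries to a second in-memory store where the real system writes to the hard disk.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Conceptual sketch of a queue with two thresholds: entries beyond the
// high-water mark are swapped out, and swapped entries are read back in
// once the in-memory part falls below the low-water mark.
public class SpillingQueue<T> {
    private final Deque<T> memory = new ArrayDeque<>();
    private final Deque<T> spilled = new ArrayDeque<>(); // stands in for the hard disk
    private final int highWater;
    private final int lowWater;

    public SpillingQueue(int highWater, int lowWater) {
        this.highWater = highWater;
        this.lowWater = lowWater;
    }

    public void offer(T entry) {
        if (memory.size() >= highWater) {
            spilled.addLast(entry); // first threshold exceeded: swap out
        } else {
            memory.addLast(entry);
        }
    }

    public T poll() {
        T entry = memory.pollFirst();
        // Below the second threshold: read swapped entries back in.
        while (memory.size() < lowWater && !spilled.isEmpty()) {
            memory.addLast(spilled.pollFirst());
        }
        return entry;
    }

    public int inMemory() { return memory.size(); }
    public int onDisk()   { return spilled.size(); }
}
```

Note that the processing order of the jobs is preserved across the swap-out and swap-in.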

If a profile creates a job, a job number is assigned. This (ascending) job number is unique across all profiles. Subsequently, the received data is copied to the backup directory ./datawizard/backup, which contains a number of cryptic-looking directories with names ending in 7ff8. Each profile is associated with one of these directories, which contains backup files named Job_<job number>. These files are used when you restart a job; of course, you can also access them manually. For each job, a file named ENV_<job number> is also created, which stores the environment variables needed for a restart. Note: See also the function get path of backup-file(a,b,c) and the system variable VAR_SYS_BACKUP.
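Since the profile directories have cryptic names, locating the files for a given job number by hand means scanning all of them. A small sketch of that lookup, assuming the default backup root and the Job_<n>/ENV_<n> naming described above:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Sketch: find the backup (Job_<n>) and environment (ENV_<n>) files for a
// given job number by scanning the profile subdirectories of the backup
// root (by default ./datawizard/backup).
public class BackupLocator {
    public static List<File> findJobFiles(File backupRoot, long jobNumber) {
        List<File> hits = new ArrayList<>();
        File[] profileDirs = backupRoot.listFiles(File::isDirectory);
        if (profileDirs == null) {
            return hits; // root missing or not a directory
        }
        for (File dir : profileDirs) {
            File job = new File(dir, "Job_" + jobNumber);
            File env = new File(dir, "ENV_" + jobNumber);
            if (job.exists()) hits.add(job);
            if (env.exists()) hits.add(env);
        }
        return hits;
    }
}
```

Because the job number is unique across all profiles, at most one profile directory will contain matches.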


The queuing works in the following way.


  • A file is created in the directory ./datawizard/backup/queue/<node name>, where <node name> stands for MainIS or the name of the respective node. See section Load Balancing.

  • All Thread Queues write to the same directory; there is no distinction by priority. Each Thread Queue entry thus corresponds to a file.

  • A Thread Queue entry contains the name of the associated profile and a reference to the data to be processed. The data (payload) itself is not included.

  • The data (payload) for the Thread Queue entries can be found in ./datawizard/backup/queue/payload.

  • If a Thread Queue entry is a manual restart of a backed up job, it will refer to the payload of the original job.

  • For priority changes in the profile see section Thread Queue.

  • Manually restarted jobs do not get the priority of the profile, but priority Highest (+2) instead.

  • When a profile is deleted, associated Thread Queue entries will fail and be deleted.
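The points above can be summed up in a small sketch of what a Thread Queue entry carries. Field names and types are illustrative only; this is not Lobster_data's actual on-disk format.

```java
// Sketch of a Thread Queue entry: the profile name, a reference to the
// payload file (not the payload itself), and a priority. Illustrative
// names only, not Lobster_data's actual format.
public final class QueueEntrySketch {
    public static final int PRIORITY_HIGHEST = 2; // priority "Highest (+2)"

    private final String profileName;
    private final String payloadRef; // points into ./datawizard/backup/queue/payload
    private final int priority;

    public QueueEntrySketch(String profileName, String payloadRef, int priority) {
        this.profileName = profileName;
        this.payloadRef = payloadRef;
        this.priority = priority;
    }

    // Manually restarted jobs do not get the priority of the profile,
    // but always run with Highest (+2); they refer to the payload of
    // the original job.
    public static QueueEntrySketch manualRestart(String profileName, String payloadRef) {
        return new QueueEntrySketch(profileName, payloadRef, PRIORITY_HIGHEST);
    }

    public String profileName() { return profileName; }
    public String payloadRef()  { return payloadRef; }
    public int priority()       { return priority; }
}
```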


Standard Processing in Phase 1


[Diagram: Standard processing in phase 1]


The standard process for phase 1 is to receive data and create a job for the processing profile.

File Functions in Phase 1


[Diagram: File Functions in phase 1]


Optional File Functions allow you to define conditions for the input files of time-driven Input Agents. A job is only created, and the input file only processed, if the conditions are met. Examples can be found in the documentation for the respective file function classes.
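Conceptually, a File Function boils down to a condition over the input file, and a job is created only if the condition holds. The following sketch illustrates that idea only; the pattern and size check are made-up examples, and the real file function classes with their options are documented separately.

```java
import java.io.File;
import java.util.function.Predicate;

// Conceptual sketch of a File Function: a predicate over the input file.
// The concrete condition here (CSV extension, non-empty file) is just an
// illustrative example.
public class FileFunctionSketch {
    public static Predicate<File> csvWithContent() {
        return f -> f.getName().toLowerCase().endsWith(".csv") && f.length() > 0;
    }

    // Mirrors the phase 1 decision: only files matching the condition
    // produce a job.
    public static boolean shouldCreateJob(File input, Predicate<File> condition) {
        return condition.test(input);
    }
}
```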

GUI


The configuration of this phase in the GUI is described in section Phase 1 (GUI).