FTP (Input Agent)
Introduction: Phase 1.
URL: ftp(s)://<URL or IP of the Integration Server>
The Integration Server can act as an FTP server. See also section Lobster_data as FTP Server. The URL is then the URL of the Integration Server.
For each file uploaded via FTP, Lobster_data checks whether the file can be assigned to a profile so that a job can be started. A file is assigned if the user and directory match the respective profile information and the file name matches the file pattern.
If several profiles are ready to accept a specific file, the profile scoring will decide.
This Input Agent allows checking the maximum size of input files. See section Maximum Size of Input Files.
Note: The name of the FTP user will be stored in the system variable VAR_SENDER.
(1) File name pattern. You can enter multiple file patterns separated by the pipe (|) character (e.g. *.txt|*.csv).
(2) You can restrict the response to incoming files to a specific directory. Only files that match the file pattern and are stored in this directory will then be considered. You can use the placeholder $USER$, e.g. /data/$USER$/somefolder. The placeholder is replaced by the login ID.
(3) The processing of the data starts only after the logout of the respective user. Note: See also system variable VAR_SYS_LASTINBULK.
(4) If this checkbox is set, empty files (0 bytes) will be accepted. Important note: In order for such files to reach the Input Agent, system property hub.datawizard.acceptEmptyFtpFiles must have the value true (configuration see there).
(5) This list allows you to select (see arrows) all active FTP channels of the system.
(6) All selected and active FTP channels appear in this list. At least one must be selected.
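The checks described in (1) and (2) can be illustrated with a minimal sketch. It is not Lobster_data code: the function name matches_profile and its parameters are hypothetical, and the sketch assumes case-sensitive wildcard matching and an exact directory match (no subdirectories), neither of which is confirmed by this section.

```python
from fnmatch import fnmatchcase
import posixpath


def matches_profile(file_path, login, patterns, directory=None):
    """Illustrative check: does an uploaded file fit a profile's settings?

    file_path -- full path of the uploaded file on the FTP server
    login     -- login ID of the FTP user (replaces $USER$)
    patterns  -- one or more file patterns separated by '|'
    directory -- optional directory restriction, may contain $USER$
    """
    folder, name = posixpath.split(file_path)

    # (1) The file name must match at least one of the pipe-separated patterns.
    # Assumption: matching is case-sensitive.
    if not any(fnmatchcase(name, p) for p in patterns.split("|")):
        return False

    # (2) Without a directory restriction, the pattern alone decides.
    if directory is None:
        return True

    # Replace the placeholder with the login ID, then compare directories.
    # Assumption: the file must be stored in exactly this directory.
    expected = directory.replace("$USER$", login)
    return posixpath.normpath(folder) == posixpath.normpath(expected)
```

For example, with the pattern *.txt|*.csv and the directory /data/$USER$/somefolder, a file /data/alice/somefolder/orders.csv uploaded by the user alice would pass both checks in this sketch, while the same file uploaded into /data/bob/somefolder would not.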
The following figure shows how the FTP Input Agent works.
The FTP server provides a maximum number of FTP connections. These FTP connections allow FTP users to put files in the assigned directories. After a file has been successfully stored (or renamed), Lobster_data immediately checks whether a profile should process the file. If there is an appropriate profile, a job is created and placed in the job queue. Normally, the job queue is empty and the processing of the job starts immediately. As a result, the input file is immediately saved as the backup file of the job and the original file is deleted.
The following figure shows the temporal behaviour.
(1) Start of the transfer of the first file.
(2) The transfer was successful. A backup is created immediately and the analysis of the input data is started.
(3) After creating the backup, the original input file is deleted. The file can therefore only be seen in the directory for a very short time.
The processing of the different FTP connections happens simultaneously and very efficiently. Clients who want to check the existence of stored files often have problems with this speed. When the get command arrives, most files will already have been deleted from the FTP directory.
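Because the uploaded file may already be gone when the client looks for it, a client can rely on the server's reply to the upload itself instead of listing or fetching the file afterwards. The following is a minimal client-side sketch using Python's standard ftplib module. The function name upload_and_confirm is hypothetical, and the sketch assumes that a positive completion reply (2xx) to STOR is sufficient confirmation for the client.

```python
import io
from ftplib import all_errors


def upload_and_confirm(ftp, remote_name, payload):
    """Upload payload (bytes) and report success based on the STOR reply.

    ftp is a connected and logged-in ftplib.FTP (or FTP_TLS) instance.
    No follow-up LIST or RETR is issued, because the server may already
    have moved the file into a job backup and deleted the original.
    """
    try:
        # storbinary returns the final server reply, e.g. "226 ...".
        reply = ftp.storbinary(f"STOR {remote_name}", io.BytesIO(payload))
    except all_errors:
        # Error replies and connection problems count as a failed upload.
        return False
    return reply.startswith("2")


# Usage (host, user and password are placeholders):
#   from ftplib import FTP
#   with FTP("integration-server.example") as ftp:
#       ftp.login("someuser", "secret")
#       ok = upload_and_confirm(ftp, "orders.csv", b"a;b;c\n")
```

Treating the reply code as the confirmation avoids the race between the client's check and the immediate deletion of the original file.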
Note: If a DMZ server is used, its user management can be used.
Note: It is possible that FTP users are still logged on when Lobster_data is shut down. Normally these sessions are terminated by Lobster_data after a (settable) waiting time and the FTP service is shut down. However, if you do not want to stop FTP sessions automatically, you can configure a different behaviour with option stopServices. If this option is activated, the maximum waiting time for the FTP service is ignored. The following listing shows the option in the configuration file ./etc/startup.xml in the section com.ebd.hub.datawizard.app.DataWizard. By default, this entry is commented out. The option also applies to the SMTP service.
<!-- Enable this line if Lobster_data should stop FTP and SMTP service when halted -->
<Set name="stopServices">false</Set>