Availability of MessageAuthenticationService Through Caching
In a DMZ scenario, the MessageAuthenticationService (configuration file ./etc/auth_dmz.xml, or as entered in ./etc/factory_dmz.xml) responds to requests from communication services by forwarding them in a synchronous message to the AuthenticationService of the inner server. If the inner server is not accessible, the MessageAuthenticationService must still be able to process requests. To do so, it accesses a local copy of the database tables of the AuthenticationService. This local copy is called the cache database, or simply the cache. To minimise the risk of using outdated data from the cache, the following caching rules are implemented (see also the parameter table in section Cache Configuration - Summary).
Caching rules
1. If the inner server responds, the response is used (online mode).
2. If the inner server does not respond, the cache is used (offline mode).
3. The cache is filled when the DMZ server is restarted: all partners, partner channels and certificates are stored completely in the cache. This process is called the initial update. It corresponds to a full (complete) update, see (7).
4. If an initial update fails, it is repeated every 2 minutes until a configurable number of retries (maxInitAttempts, default: 12) is reached.
5. During operation, the cache is refreshed periodically (updatePeriod, default: 900000 ms = 15 minutes) by a partial update. In the process, only the records that have changed are updated. Records marked as 'deleted' are also marked as 'deleted' in the cache. However, records that have been physically deleted remain in the cache. For this reason, a full update is required after some time to 'clean' the database. Changes to records are immediately pushed by the internal Integration Server to all DMZ MessageAuthenticationServices.
6. Whenever the MessageAuthenticationService of the DMZ server successfully accesses the inner AuthenticationService by message, a 'syncRequired' flag is set in the response message if the inner AuthenticationService holds modified data that the DMZ server has not replicated yet. The DMZ server then starts an immediate replication, and the update period is reset afterwards. This feature can be switched off with the configuration parameter ignoreNotification.
7. In a full update (see below), a check is executed first to see if the inner server is reachable. If not, the cache data remains untouched. If the inner server is reachable, the cache is purged completely and all partner data, partner channels and certificates are stored again. If restricted caching by channel types (cachedChannelTypes) is configured, only the allowed types are cached. Partners that do not have channels of an allowed type are not replicated.
During the time between deleting the old data and completely storing the new data, the cache is invalid. Requests from the communication services that arrive during this phase and cannot immediately be handled online by the inner server are temporarily postponed. After a configurable timeout (cacheUpdateTimeout, default: 8000 ms), those requests fail with an exception if the update has not finished in the meantime. This situation can only occur if the network collapses during a full update and a request from the communication layer arrives at the same time.
Requests in online mode (inner server is reachable) are always handled by the inner server, even if a full update is executed.
The full update is not executed in offline mode, so requests can be responded to from the unchanged cache. Only the situation of a collapsed network during a full update is critical. However, failures shorter than the cacheUpdateTimeout will be handled.
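The request handling described in rules 1, 2 and 7 can be sketched as follows. This is only an illustrative Python model, not the product's implementation: the class RequestRouter, the exception CacheInvalidTimeout and the callables ask_inner_server and read_cache are hypothetical names. Only the parameter cacheUpdateTimeout and its default of 8000 ms are taken from the description above.

```python
import threading


class CacheInvalidTimeout(Exception):
    """Hypothetical exception: cache stayed invalid longer than the timeout."""


class RequestRouter:
    """Illustrative model of online mode, offline mode and the invalid-cache phase."""

    def __init__(self, ask_inner_server, read_cache, cache_update_timeout_ms=8000):
        # ask_inner_server: assumed callable that raises ConnectionError
        # when the inner server does not respond.
        self._ask_inner_server = ask_inner_server
        # read_cache: assumed callable that answers from the local cache.
        self._read_cache = read_cache
        self._timeout_s = cache_update_timeout_ms / 1000.0
        self._cache_valid = threading.Event()
        self._cache_valid.set()  # the cache is valid outside a full update

    def begin_full_update(self):
        # Old data is about to be purged: the cache is invalid from now on.
        self._cache_valid.clear()

    def end_full_update(self):
        # New data is stored completely: the cache is valid again.
        self._cache_valid.set()

    def handle(self, request):
        try:
            # Online mode: the inner server answers, even during a full update.
            return self._ask_inner_server(request)
        except ConnectionError:
            pass
        # Offline mode: postpone the request while the cache is invalid,
        # and fail once the timeout has elapsed.
        if not self._cache_valid.wait(self._timeout_s):
            raise CacheInvalidTimeout("cache update did not finish in time")
        return self._read_cache(request)
```

In this model, a request only fails when the inner server is unreachable and the cache stays invalid for longer than the timeout, which matches the single critical situation named above.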
8. The following cases will trigger a full update:
If the DMZ server was started (initial update, i.e. full update with retry).
If the inner AuthenticationService was restarted.
If the fullUpdatePeriod has passed.
The first update of a day is always a full update.
9. If the next update needs to be a full one (because of one of the rules in item 8), but the inner server is not reachable, no immediate retry is executed (unless it is an initial update). Instead, the update is cancelled. The next update is then scheduled as a full update.
10. If the inner server (or its AuthenticationService) is restarted in a DMZ cluster, the next update is scheduled as a full update.
11. If one of the DMZ servers in a DMZ cluster is restarted, only its own cache is completely updated. The other DMZ servers follow their own rules.
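The choice between a partial and a full update (rules 5, 8 and 9) can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the actual scheduler: UpdateState, needs_full_update and run_update are hypothetical names, the retry loop of the initial update (rule 4) is left out, and cancelling a partial update when the inner server is unreachable is an assumption that the rules above do not state explicitly.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class UpdateState:
    """Hypothetical bookkeeping of one DMZ server."""
    last_full_update: Optional[datetime] = None  # None: DMZ server just started
    last_update: Optional[datetime] = None
    inner_service_restarted: bool = False
    full_update_pending: bool = False  # set when a full update was cancelled


def needs_full_update(state, now, full_update_period):
    """Return True if one of the full-update triggers of rule 8 applies."""
    if state.last_full_update is None:
        return True  # DMZ server was started
    if state.inner_service_restarted or state.full_update_pending:
        return True  # inner service restarted, or an earlier full update was cancelled
    if now - state.last_full_update >= full_update_period:
        return True  # fullUpdatePeriod has passed
    if state.last_update is None or state.last_update.date() != now.date():
        return True  # first update of a day
    return False


def run_update(state, now, full_update_period, inner_reachable):
    """Decide what the periodic update does; returns 'full', 'partial' or 'cancelled'."""
    full = needs_full_update(state, now, full_update_period)
    if not inner_reachable:
        # Rule 9: no immediate retry; a required full update is carried over.
        # Assumption: a partial update is simply skipped as well.
        state.full_update_pending = state.full_update_pending or full
        return "cancelled"
    state.last_update = now
    if full:
        state.last_full_update = now
        state.inner_service_restarted = False
        state.full_update_pending = False
        return "full"
    return "partial"
```

Because each DMZ server would keep its own UpdateState in this model, a restart of one DMZ server only affects its own cache, as described in rule 11.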