Failover concept

The failover concept ensures the availability of the node controller. If a node's connection to the (failing) node controller is interrupted, the next available node becomes the node controller and all remaining nodes connect to it. When the original node controller becomes available again, it is changed back into a normal node.

Configuration


Each Lobster_data instance knows both the Internet address and the failover priority of all other Lobster_data instances. This ensures that, in the event of a failover, each instance can reorganise itself autonomously. This information is read from the configuration files ./etc/startup.xml and ./etc/admin/datawizard/lb_nodes.properties when a Lobster_data instance is started. The relevant content of these files is replicated by the node controller to the working nodes when they log into the load balancing network (i.e. you only have to maintain the files on the node controller).

File "startup.xml"

To activate the failover mechanism, the following entry has to exist in the configuration file ./etc/startup.xml.


<Call name="enableFailOver">
  <Arg>
    <New class="com.ebd.hub.datawizard.app.loadbalance.failover.Configuration">
      <Set name="port">2320</Set>
      <Set name="heartbeat">500</Set>
      <Set name="externalURL">https://www.google.de</Set>
    </New>
  </Arg>
</Call>


The parameters have the following meaning.


port

The port on which the 'ping' messages are exchanged.

heartbeat

The interval between the heartbeat ('ping') messages in milliseconds.

externalURL

Must be a reachable HTTP(S) URL. It allows a server that has lost contact with the node controller to verify whether it is disconnected from the network itself or only from the node controller.
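The role of externalURL can be illustrated with a small sketch. The function name and structure below are purely illustrative assumptions, not Lobster_data internals; only the idea (check an external URL to tell an own network outage apart from a failed node controller) comes from the parameter description above.

```python
import urllib.request

# Configured externalURL (the value from startup.xml above).
EXTERNAL_URL = "https://www.google.de"

def diagnose_outage(external_url: str = EXTERNAL_URL) -> str:
    """Hypothetical check a node might run when heartbeat pings to
    the node controller stop arriving: distinguish 'the controller
    is down' from 'this node has lost its own network connection'."""
    try:
        urllib.request.urlopen(external_url, timeout=5)
    except OSError:
        # The external URL is unreachable too: this node itself is
        # disconnected from the network and must not take over.
        return "own-network-down"
    # The network works, so only the node controller is unreachable:
    # a failover may be initiated.
    return "controller-down"
```

A node that diagnoses "own-network-down" should simply keep retrying, while "controller-down" is the case in which the failover election described below takes place.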

File "lb_nodes.properties"

The configuration file ./etc/admin/datawizard/lb_nodes.properties contains the host address and the message service port of each Lobster_data instance (nodes and node controllers) involved in the load balancing. The key is the name of the instance as specified in the id element of the respective configuration file ./etc/factory.xml. Note: See also section Structure of a Properties File (note the backslash before the colon in the following example file).


# define all working nodes by IP:Port that are licensed - must be the factory name as key
#
# e.g.
WorkNode2=192.168.132.56\:8020
WorkNode1=192.168.132.55\:8020
NodeContr2=192.168.132.54\:8020
NodeContr1=192.168.132.53\:8020
WorkNode3=192.168.132.57\:8020
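The entries follow the Java .properties convention of escaping the colon delimiter with a backslash. As a minimal sketch (the function name and return shape are assumptions for illustration), such entries can be split into host and port like this:

```python
# Illustrative parser for lb_nodes.properties-style entries.
# Java .properties syntax escapes ':' with a backslash, so the
# escape is undone before splitting host and port.
def parse_lb_nodes(text: str) -> dict[str, tuple[str, int]]:
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        name, _, address = line.partition("=")
        host, _, port = address.replace("\\:", ":").partition(":")
        nodes[name.strip()] = (host, int(port))
    return nodes

example = r"""
# factory name = host\:port
WorkNode1=192.168.132.55\:8020
NodeContr1=192.168.132.53\:8020
"""
print(parse_lb_nodes(example))
```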

Functional principle


  • There is always exactly one active node controller.

  • The number of working nodes is unlimited and can be changed at any time.

  • In principle, each working node (by license and configuration) can be in working mode as well as in controller mode. The modes can change during operation.

  • Node controllers (by license and configuration) can only be in controller mode.

  • The node controller that most recently came online is always the active node controller. Older node controllers shut down unless they were previously working nodes (by license and configuration); in that case, they change back into working nodes and do not shut down.

  • Changing the operation mode of a node can occur automatically due to a detected failover, or it can be explicitly initiated by the user (in the Control Center or via HTTP).

  • A valid operating state with a node controller and working node(s) must be reached at least once (at startup), because stand-alone working nodes receive their setup from the node controller at startup.
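The mode rules above can be summarised in a toy model. All names and the data structure are illustrative assumptions; the code only mirrors the stated rules that the most recently started controller-capable instance becomes the active node controller, that dual-licensed nodes fall back to working mode, and that pure node controllers shut down instead.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    can_control: bool   # licensed/configured for controller mode
    can_work: bool      # licensed/configured for working mode
    mode: str = "working"

def bring_online(cluster: list[Node], node: Node) -> None:
    """Toy rule: the last controller-capable node to come online
    becomes the active node controller; any previous controller
    either falls back to working mode or leaves the cluster."""
    cluster.append(node)
    if not node.can_control:
        return
    for other in cluster[:-1]:
        if other.mode == "controller":
            if other.can_work:
                other.mode = "working"   # dual-licensed: fall back
            else:
                cluster.remove(other)    # pure controller shuts down
    node.mode = "controller"

cluster: list[Node] = []
bring_online(cluster, Node("WorkNode1", can_control=True, can_work=True))
bring_online(cluster, Node("NodeContr1", can_control=True, can_work=False))
# WorkNode1 falls back to working mode; NodeContr1 is now the
# active node controller.
```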

Deactivating SAP RequestListener


See section SAP RequestListener in Load Balance Failover.

Failure of the primary DMZ server


See section DMZ Cluster.