Memory Guard Application
Objective
In rare cases, it might be necessary to monitor the main memory and its workload automatically to avoid performance problems of Lobster_data, which occur after an extended, uninterrupted runtime.
This can especially be an issue if driver software of third-party manufacturers, e.g. SAP JCo or database drivers, using native libraries (DLLs in Windows), is called by Lobster_data. Occasionally, the memory used by these libraries is not correctly released, which leads to a continuously decreasing available main memory (memory leak).
This problem can be avoided by restarting the Lobster Integration Server at regular intervals that are short enough to rule out a main memory shortage. In many cases, however, an extended runtime is wanted or necessary, or a regular restart is simply forgotten, which poses the risk of a system crash if the main memory is not monitored.
'Manual' monitoring of the available main memory only ever gives a momentary view, and a busy system can legitimately reach 80% or 90% memory usage. Such a peak may also persist for some time, so a systematic algorithm is needed to assess the main memory instead of relying on a momentary view or a gut feeling.
This necessity led to the development of the Memory Guard Application. This application is an additional tool for Lobster_data that is only needed if memory leaks occur after a prolonged runtime of Lobster_data.
Limitations
The Memory Guard Application cannot avoid or fix memory leaks. If configured correctly, it only indicates whether and when an administrator should restart the system. If a configured 'red line' is crossed, an email is sent and is resent periodically as long as the situation persists.
In addition to these error emails, an email is sent to the same recipient at each restart of the Memory Guard Application to announce the restart and to verify the configured email channel.
Functional Principle
Memory Management of the Java Virtual Machine (JVM)
Inside the Java Virtual Machine (JVM), in which Lobster_data runs, the memory management is done by a subsystem of the JVM. This subsystem is also responsible for releasing memory that was used by Java objects if these objects are not used anymore. This release is called garbage collection and the subsystem is called garbage collector. This usually works just fine, so where do the memory leaks come from? If a Java library uses a system library, the memory needed for the system library is allocated by the JVM, but the garbage collector is not able to look into the system libraries to find unused memory.
Unfortunately, most operating systems have the bad habit of never unloading a system library once it has been loaded, and even the JVM can only unload system libraries that have been used by a Java class when the JVM itself terminates, i.e. at a restart.
Startup Options of the JVM
The size of the initially reserved memory at the start of the JVM is set with option -Xms and the maximum size is set with option -Xmx. The additional allocation of memory is done automatically if the currently allocated memory is used up to about 90% by Java objects. If the operating system cannot deliver the requested memory, the JVM could crash.
Since the JVM does not pass back memory it once allocated, it is likely that the size of the allocated memory has reached the maximum value after a prolonged runtime. For that reason these two options should have the same value for servers that run for several weeks without interruptions, to make sure that the JVM already takes the maximum amount of memory at startup and has no chance of causing a crash later on when trying to get more memory.
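Whether a running JVM has already taken its maximum heap can be checked from within Java. The following is a minimal sketch, not part of Lobster_data: the class name HeapSizingCheck, the helper heapFullyCommitted and the 95% slack are illustrative assumptions. Only the Runtime methods totalMemory() and maxMemory() are standard Java.

```java
public class HeapSizingCheck {

    /**
     * Returns true if the heap currently committed to the JVM is (almost)
     * as large as the configured maximum.
     * The 5% slack is an assumption for illustration only.
     */
    static boolean heapFullyCommitted(long committedBytes, long maxBytes) {
        // Runtime.maxMemory() returns Long.MAX_VALUE if there is no inherent limit
        if (maxBytes == Long.MAX_VALUE) {
            return false;
        }
        return committedBytes >= maxBytes * 0.95;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory(); // heap currently reserved by the JVM
        long max = rt.maxMemory();         // upper limit, as set with -Xmx
        System.out.printf("committed=%d MB, max=%d MB%n", committed >> 20, max >> 20);
        System.out.println(heapFullyCommitted(committed, max)
                ? "Heap is already at its maximum size."
                : "Heap may still grow; consider setting -Xms equal to -Xmx.");
    }
}
```

When the JVM is started with identical values for both options (for example -Xms4g -Xmx4g, values chosen only for illustration), the check is expected to report that the heap is already at its maximum size.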
Measure Main Memory Utilisation
Fundamentals of the Measurement
Java offers methods to determine the amount of memory currently allocated to the JVM and how much of it is unused, which allows us to calculate the used memory. These values are only momentary and will be different after a few milliseconds. To get meaningful values, the measurements have to be repeated within a short period of time, which allows us to determine the minimum value, the maximum value, the mean value and the tendency of the used memory.
The garbage collector works independently, which makes it difficult to find memory leaks: it does not start cleaning up the memory at an average memory workload, and we cannot distinguish between memory that is occupied by dead objects and will be cleaned up later and memory that cannot be cleaned up at all. The sum of all memory leaks could only be determined if there were no dead or living Java objects in the memory. A living object is an object that is still used in the Java program, whereas a dead object is not.
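The repeated measurement described above can be sketched as follows. This is an illustration under assumptions, not the code of the Memory Guard Application: the class UsedMemorySampler, its method names, the sample count and the pause are hypothetical, and the 'tendency' is simply taken as the difference between the last and the first sample.

```java
import java.util.Arrays;
import java.util.LongSummaryStatistics;
import java.util.function.LongSupplier;
import java.util.stream.LongStream;

public class UsedMemorySampler {

    /** Used memory = memory allocated to the JVM minus the unused part of it. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Takes 'count' (at least 1) samples from 'source', pausing 'pauseMillis' between them. */
    static long[] sample(LongSupplier source, int count, long pauseMillis) {
        long[] samples = new long[count];
        for (int i = 0; i < count; i++) {
            samples[i] = source.getAsLong();
            if (i < count - 1) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // restore the interrupt flag and return what was collected so far
                    Thread.currentThread().interrupt();
                    return Arrays.copyOf(samples, i + 1);
                }
            }
        }
        return samples;
    }

    /** Simplistic tendency (assumption): last sample minus first sample. */
    static long tendency(long[] samples) {
        return samples[samples.length - 1] - samples[0];
    }

    public static void main(String[] args) {
        long[] samples = sample(UsedMemorySampler::usedBytes, 20, 50);
        LongSummaryStatistics stats = LongStream.of(samples).summaryStatistics();
        System.out.printf("min=%d max=%d mean=%.0f tendency=%d (bytes)%n",
                stats.getMin(), stats.getMax(), stats.getAverage(), tendency(samples));
    }
}
```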
For that reason, we use the minimum value of the used memory. This value is calculated every few seconds and is used to derive two values: the absolute minimum and the current, 'weighted' minimum. By regularly starting the garbage collector 'manually', we try to remove as many dead objects as possible before measuring. The Memory Guard Application delays such a forced start of the garbage collector at times of high system load to avoid degrading the performance of Lobster_data, which also means that measurements taken during high system load are less meaningful.
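The text does not specify how the 'weighted' minimum is calculated. Purely as an assumption for illustration, the sketch below uses exponential smoothing of the measured minima; the class MinimumTracker and the smoothing factor are hypothetical and may differ from what the Memory Guard Application actually does.

```java
public class MinimumTracker {

    private final double weight;                 // hypothetical smoothing factor, 0 < weight <= 1
    private long absoluteMin = Long.MAX_VALUE;   // lowest minimum seen so far
    private double weightedMin = Double.NaN;     // smoothed, 'current' minimum

    MinimumTracker(double weight) {
        if (weight <= 0.0 || weight > 1.0) {
            throw new IllegalArgumentException("weight must be in (0, 1]");
        }
        this.weight = weight;
    }

    /** Feeds the minimum of the used memory measured in the latest interval. */
    void accept(long intervalMin) {
        absoluteMin = Math.min(absoluteMin, intervalMin);
        weightedMin = Double.isNaN(weightedMin)
                ? intervalMin                                          // first value: take as is
                : weight * intervalMin + (1.0 - weight) * weightedMin; // assumed smoothing rule
    }

    long absoluteMin() {
        return absoluteMin;
    }

    double weightedMin() {
        return weightedMin;
    }

    /** Distance between the current and the absolute minimum. */
    double drift() {
        return weightedMin - absoluteMin;
    }

    public static void main(String[] args) {
        MinimumTracker tracker = new MinimumTracker(0.5);
        for (long min : new long[] {100, 200, 300}) {
            tracker.accept(min);
        }
        System.out.println("absolute=" + tracker.absoluteMin()
                + " weighted=" + tracker.weightedMin()
                + " drift=" + tracker.drift());
    }
}
```

A steadily growing drift is the pattern one would expect from a memory leak, whereas a drift that returns to a small value after a garbage collection points to dead objects only.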
Delay of the Measurement
Systems behave differently during startup, shutdown, and continuous operation. To avoid misleading measurements during startup, the Memory Guard Application only starts measuring after a delay, which is set in parameter delayMinutes. After that delay, a measurement is taken every periodeMillis milliseconds.
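A delayed, periodic measurement of this kind can be sketched with the standard ScheduledExecutorService. The class DelayedMeasurement and its methods are hypothetical; only the parameter names delayMinutes and periodeMillis are taken from the text, and the real application may schedule its work differently.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DelayedMeasurement {

    /** Runs 'measurement' every periodeMillis milliseconds, starting after delayMinutes minutes. */
    static ScheduledExecutorService start(Runnable measurement, long delayMinutes, long periodeMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(measurement,
                TimeUnit.MINUTES.toMillis(delayMinutes), periodeMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    /** Helper for the demo: waits until 'ticks' measurements have run or the timeout expires. */
    static boolean awaitTicks(long delayMinutes, long periodeMillis, int ticks, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(ticks);
        ScheduledExecutorService scheduler = start(latch::countDown, delayMinutes, periodeMillis);
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow(); // stop the periodic task
        }
    }

    public static void main(String[] args) {
        // No delay and a short period, so that the demo finishes quickly
        System.out.println("three measurements completed: " + awaitTicks(0, 50, 3, 2000));
    }
}
```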
Evaluation of the Measurement
The absolute minimum and the current minimum are compared every checkMinutes minutes. If the current minimum differs from the absolute minimum by more than the tolerance value toleranceMB, the 'yellow' status is reached. In that case, the Memory Guard Application tries to free memory that is occupied by dead objects. Usually, that is enough to get back to the 'green' status.
Normally, the status oscillates between 'green' and 'yellow'. If you set parameter verbose to true, detailed messages and the status are shown in the Control Center under Logs -> General messages.
During times of high system load (more than 1/3 of the main memory is used), the start of the garbage collector is delayed, but for no longer than 10 times the timespan set in parameter checkMinutes.
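The delay rule can be sketched as a small decision function. This is an interpretation, not the actual implementation: the class GcDelayPolicy is hypothetical, 'main memory' is assumed here to mean the JVM heap, and the limit of 10 times checkMinutes is modelled as a counter of postponed checks.

```java
public class GcDelayPolicy {

    /** From the text: the forced collection is postponed for at most 10 x checkMinutes. */
    static final int MAX_DELAYED_CHECKS = 10;

    /**
     * Decides whether a forced garbage collection should run at this check.
     *
     * @param usedBytes     currently used memory
     * @param maxBytes      available memory (assumption: the maximum heap size)
     * @param delayedChecks number of consecutive checks at which the collection was postponed
     */
    static boolean shouldCollectNow(long usedBytes, long maxBytes, int delayedChecks) {
        boolean highLoad = usedBytes > maxBytes / 3; // more than 1/3 in use
        return !highLoad || delayedChecks >= MAX_DELAYED_CHECKS;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        if (shouldCollectNow(used, rt.maxMemory(), 0)) {
            System.gc(); // only a request; the JVM is free to ignore it
            System.out.println("Garbage collection requested.");
        } else {
            System.out.println("High load: garbage collection postponed.");
        }
    }
}
```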
If the 'green' status is not reached for longer than the timespan set in parameter redDelayMinutes, the Memory Guard Application will switch into the 'red' status, which causes it to send an alarm message to the configured email recipient. This message will be resent if the 'red' status persists.
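The transitions between the three statuses can be summarised in a small state machine. The sketch below is a simplified reading of the rules above; the class StatusEvaluator is hypothetical, and it omits the forced garbage collection and the email dispatch.

```java
public class StatusEvaluator {

    enum Status { GREEN, YELLOW, RED }

    private final long toleranceMB;
    private final long redDelayMinutes;
    private long minutesWithoutGreen = 0;

    StatusEvaluator(long toleranceMB, long redDelayMinutes) {
        this.toleranceMB = toleranceMB;
        this.redDelayMinutes = redDelayMinutes;
    }

    /**
     * Called once per evaluation, i.e. every checkMinutes minutes.
     * All memory values are in MB.
     */
    Status evaluate(long absoluteMinMB, long currentMinMB, long checkMinutes) {
        if (currentMinMB - absoluteMinMB <= toleranceMB) {
            minutesWithoutGreen = 0;          // back within tolerance
            return Status.GREEN;
        }
        minutesWithoutGreen += checkMinutes;  // tolerance exceeded at this check
        return minutesWithoutGreen > redDelayMinutes ? Status.RED : Status.YELLOW;
    }

    public static void main(String[] args) {
        // Default values from the parameter table: toleranceMB=200, redDelayMinutes=240, checkMinutes=60
        StatusEvaluator evaluator = new StatusEvaluator(200, 240);
        long[] currentMinima = {1100, 1300, 1300, 1300, 1300, 1300, 1100};
        for (long current : currentMinima) {
            System.out.println(current + " MB -> " + evaluator.evaluate(1000, current, 60));
        }
    }
}
```

With the default values, this reading switches to 'red' at the fifth consecutive evaluation outside the tolerance, because only then have more than 240 minutes passed without a 'green' status.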
Installation and Configuration
Installation
<Call name="addApplication">
  <Arg>
    <New class="de.lobster.memory.MemoryGuardApplication">
      <Set name="verbose" type="boolean">true</Set>
      <Set name="delayMinutes">10</Set>
      <Set name="redDelayMinutes">240</Set>
      <Set name="checkMinutes">60</Set>
      <Set name="toleranceMB">200</Set>
      <Set name="periodeMillis">60000</Set>
      <Set name="mailServer">mail.xyz.tld</Set>
      <Set name="mailUser">user@domain</Set>
      <Set name="mailUserPassword">######</Set>
      <Set name="mailSender">memoryguard@xyz.tld</Set>
      <Set name="mailRecipient">operating@xyz.tld</Set>
      <Set name="additionalSystemId">PETER_PROD</Set>
    </New>
  </Arg>
</Call>
Copy the Java library X_MemoryGuardApplication.jar into the directory ./extlib.
Add the section shown above to the configuration file ./etc/startup.xml.
Parameter Description and Default Values
Name | Description | Default
verbose | Detailed log messages in the Control Center. | false
delayMinutes | Delay after system start until the measurement begins [minutes]. | 10
redDelayMinutes | Threshold for sending the alarm email [minutes]. | 240
checkMinutes | Interval of measurement evaluation [minutes]. | 60
toleranceMB | Threshold for the 'yellow' status [MB]. | 200
periodeMillis | Interval of measurement [milliseconds]. | 60000
mailServer, mailUser, mailUserPassword | Email server configuration. | From startup.xml: HubStartConfiguration
mailSender, mailRecipient | Email address configuration. | From startup.xml: DataWizardSetup
additionalSystemId | System name in email subject. | empty
Minimal Configuration (only with default values)
<Call name="addApplication">
  <Arg>
    <New class="de.lobster.memory.MemoryGuardApplication"/>
  </Arg>
</Call>