Troubleshooting Memory-related Java Problems in Integration Agents and xAgents

This article provides instructions for troubleshooting Java on all platforms (except AIX).

Introduction

If you're experiencing memory-related problems with a Java application such as the xMatters Integration Agent or xMatters Agent, this article will help you to collect information needed for troubleshooting.  It also describes how to package this information and send it to xMatters Support for analysis.

NOTE: This article applies to all platforms except AIX.

What to Expect When Troubleshooting

Memory-related problems in Java applications can be complex, difficult to resolve, and time-consuming.  Possible reasons for these problems include:

  • Operating system or JVM is improperly tuned
  • Java application is managing objects improperly
  • Large or nested objects are allocated improperly
  • Memory is fragmented
  • Memory is not being properly released by JNI/native code
  • Troubleshooter has inadequate understanding of JVM memory allocation and usage

It might be necessary to reproduce the problem several times before there is sufficient information to identify and resolve the issue.  If the problem cannot be reproduced in a test, lab, or development environment, data may need to be collected from a production system.

If data cannot be collected from a system demonstrating the issue, support teams might not be able to resolve the issue. We understand that collecting data in production environments can result in down time for the application, which can affect revenue, stability, and customer perception. xMatters support teams are very aware of these considerations, and will make every effort to minimize disruption and to resolve the issue as quickly as possible.

Defining the Problem

Memory-related problems are typically associated with one or more of the following:

  • Java (JVM, JIT)
  • Customer application or some component that it uses (JNI)
  • Native heap (system libraries, JVM, JIT, JNI)
  • Java heap (GC, objects not being released, use of finalizers)
  • Threads (stack overflows, thread management)
  • System paging (inadequate physical memory, excessive allocation from applications)
  • Consistent increases in process memory
  • Fragmentation
  • Application crashes and hangs (unresponsiveness)

Important Background Information

If you are working with xMatters Support to resolve this type of issue, it is important to provide the following information:

  • System or process effects (e.g., slower performance, hangs, crashes)
  • Error messages (e.g., OutOfMemory, StackOverflow)
  • Whether the process generates files (e.g., core files, Javacore files, heapdump files)
  • How memory usage is monitored (e.g., output provided from commands or options)
  • Whether this is a new issue, and, if so, details about what has recently changed (e.g., in the system, JRE, or application)
  • Whether the problem occurs on development, test, or production systems

Setting up the Environment

It's important to configure your environment properly before collecting data.

1. Generate heap dump.
Use the HeapDumpOnOutOfMemoryError option to instruct the HotSpot VM to generate a heap dump when the first thread throws a java.lang.OutOfMemoryError because the Java heap is full (or the permanent generation, on Java 7 and earlier, where it still exists). There is no overhead in running with this option, which makes it useful for production systems where an OutOfMemoryError can take a long time to arise.

Edit the <IAHOME>/conf/wrapper.conf file, and add the following line:

wrapper.java.additional.<#>=-XX:+HeapDumpOnOutOfMemoryError

NOTE: Replace <#> with the next available number for the wrapper.java.additional parameter for your installation. The numbers must be consecutive and unique (no repetition and no gaps).

2. Generate GC output.

Edit the <IAHOME>/conf/wrapper.conf file, and add the following line:

wrapper.java.additional.<#>=-verbose:gc

NOTE: Replace <#> with the next available number for the wrapper.java.additional parameter for your installation. The numbers must be consecutive and unique (no repetition and no gaps).
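
The numbering rule in the notes above can be checked with a small helper. The following is a minimal sketch, not part of the product: the function name next_additional_index is hypothetical, and it assumes that each parameter sits on its own uncommented line in wrapper.conf.

```shell
# Print the next free index for wrapper.java.additional.<#> in a wrapper.conf file.
# Usage: next_additional_index <path-to-wrapper.conf>
# (hypothetical helper name, used here for illustration only)
next_additional_index() {
    # Match uncommented lines such as "wrapper.java.additional.3=-Xmx512m",
    # remember the highest number seen, and print that number plus one.
    # An empty file, or one with no such lines, yields 1.
    awk -F'[.=]' '
        /^[ \t]*wrapper\.java\.additional\.[0-9]+[ \t]*=/ {
            n = $4 + 0
            if (n > max) max = n
        }
        END { print max + 1 }
    ' "$1"
}
```

For example, if next_additional_index <IAHOME>/conf/wrapper.conf prints 3, the two options described above would be added as wrapper.java.additional.3=-XX:+HeapDumpOnOutOfMemoryError and wrapper.java.additional.4=-verbose:gc. The helper only reports the highest number in use; it does not detect gaps that already exist in the file.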

Collecting Data

After you have prepared your environment, you are ready to start collecting the data that you will need to provide to xMatters Support. In general it is recommended to collect as much relevant data as possible when creating a bug report or submitting a support request. This section suggests the data to collect and, where applicable, it provides guidance regarding the commands or general procedure required to obtain the data.

The data that should be collected prior to submitting a bug report includes:

  • Detailed description of the problem including test case (where possible)
  • Hardware details
  • Operating system details
  • Fatal error log (in the case of a crash)
  • Heap Dump
  • Logs or trace information (where applicable)

The following sections discuss each of these data types in detail.

Detailed Description and Test Case

When creating a problem description the general rule is to include as much relevant information as possible. Describe the application, the environment, and most importantly the events leading up to when the problem was encountered. Additionally:

  • If the problem is reproducible, list the steps required to demonstrate the problem.
  • If the problem can be demonstrated with a simple test case, include the test case and the commands to compile and execute the test case.
  • If the test case or problem requires third-party code (e.g., a commercial or open source library or package), provide details regarding where and how to obtain the library.
  • If the problem can be reproduced only in a complex application environment, the description, combined with logs, core files, and other relevant information, may be the sole means to diagnose the issue. In these situations the description should indicate whether the person or organization submitting the issue is willing to run further diagnosis or run test binaries on the affected system.

Hardware Details
Some bugs arise or can be reproduced only on certain hardware configurations. As a result, it is important for the bug report to document the number and type of processors in the computer, their clock speed, and, where applicable, details on the features of those processors (e.g., for Intel processors, it may be relevant that hyper-threading is available).

Operating System Details
On Solaris, the showrev -a command prints the operating system version and patch information.

On Linux, it is important to know which distribution and version is used. Sometimes the /etc/*release file indicates the release information; however, because components and packages can be upgraded independently, this file is not always a reliable indication of the configuration. As a result, the following information should also be collected:

  • Kernel version: This can be obtained using the uname -a command.
  • glibc version: The rpm -q glibc command should indicate the patch level of glibc.
  • Thread library: Linux has three possible thread types, namely linuxthreads (fixed stack), linuxthreads (floating stack), and NPTL. They are normally installed in /lib, /lib/i686, and/or /lib/tls.
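
The Linux details listed above can be gathered into a single report. This is a rough sketch under a few assumptions: the function name collect_linux_details is hypothetical, rpm is present only on RPM-based distributions, and getconf GNU_LIBPTHREAD_VERSION (which is not mentioned above) is assumed to be available from glibc to report the thread library.

```shell
# Gather the Linux details listed above into one text report on stdout.
# (hypothetical helper name, used here for illustration only)
collect_linux_details() {
    echo "== Kernel version (uname -a) =="
    uname -a

    echo "== Release files (/etc/*release) =="
    # The pattern may match nothing on some systems; report that instead of failing.
    cat /etc/*release 2>/dev/null || echo "no readable /etc/*release file found"

    echo "== glibc version (rpm -q glibc) =="
    if command -v rpm >/dev/null 2>&1; then
        rpm -q glibc || echo "glibc is not registered in the RPM database"
    else
        echo "rpm not available (not an RPM-based distribution?)"
    fi

    echo "== Thread library (getconf GNU_LIBPTHREAD_VERSION) =="
    # Assumption: glibc's getconf reports the threading implementation, e.g. NPTL.
    getconf GNU_LIBPTHREAD_VERSION 2>/dev/null || echo "thread library not reported"
}
```

Running collect_linux_details > os-details.txt would produce a file that can be attached to the support request (os-details.txt is only an example name).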

On Windows, the General tab of the System applet in the Control Panel indicates the operating system and service pack installed.

Fatal Error Log
When a fatal error occurs, an error log is created in the file hs_err_pid<pid>.log (where <pid> is the process id). If possible, the file is created in the working directory of the process. When the file cannot be created in the working directory (due to insufficient space, a permission problem, or another issue), it is created in the temporary directory for the operating system. On Solaris and Linux, the temporary directory is /tmp. On Windows, the temporary directory is specified by the value of the TMP environment variable (or, if that is not defined, by the value of the TEMP environment variable). The error log includes detailed information obtained at the time of the fatal error. It may also include version and environment information, details on the threads that provoked the crash, and so on. If a fatal error log is generated, include it in the bug report or support request.
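
On Linux or Solaris, the two locations described above can be searched in one step. The sketch below assumes the log keeps its default hs_err_pid*.log name; the function name find_fatal_error_logs is hypothetical, and the working directory of the process must be supplied by the caller.

```shell
# List fatal error logs found in the process working directory and in the
# temporary directory (default: /tmp).
# Usage: find_fatal_error_logs <working-dir> [temp-dir]
# (hypothetical helper name, used here for illustration only)
find_fatal_error_logs() {
    workdir=$1
    tmpdir=${2:-/tmp}
    for f in "$workdir"/hs_err_pid*.log "$tmpdir"/hs_err_pid*.log; do
        # An unmatched pattern is left as literal text, so keep only real files.
        if [ -f "$f" ]; then
            echo "$f"
        fi
    done
    return 0
}
```

For example, find_fatal_error_logs <IAHOME> would print any matching files, assuming <IAHOME> is the directory the process was started from. No output means that no fatal error log was found in either location.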

Heap Dump
Heap dumps are very important when trying to diagnose an out of memory error. The term "heap dump" describes the Java Virtual Machine mechanism that generates a dump of all the reachable objects that are in the Java heap (i.e., those that are being used by the running Java application). This information is used to analyze the composition of the objects in the heap (e.g., to help find the objects that are controlling large amounts of memory in the Java heap and the reason why the Garbage Collector cannot collect them).
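
Before sending a large file, it can be worth confirming that it really is a heap dump. The sketch below rests on two assumptions that are not stated in this article: that the HotSpot VM writes the dump in HPROF binary format, which begins with the text "JAVA PROFILE", and that the file uses the default name java_pid<pid>.hprof. The function name is_hprof is hypothetical.

```shell
# Succeed if the given file starts with the HPROF header text "JAVA PROFILE".
# Usage: is_hprof <file>
# (hypothetical helper name; the header check is an assumption about the
# HPROF binary format, not something this article documents)
is_hprof() {
    # Read only the first 12 bytes; a missing or unreadable file fails the check.
    [ "$(head -c 12 "$1" 2>/dev/null)" = "JAVA PROFILE" ]
}
```

A possible use is is_hprof java_pid1234.hprof && echo "looks like a heap dump", where java_pid1234.hprof is only an example file name. The check looks at the header alone, so it cannot tell whether the dump is complete.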

Logs/Traces

Sometimes a problem can be solved quickly using log or trace output. In general, all relevant logs, traces and other output should be included in the bug report or support request. There are two types of trace files: Java core and snap trace. Java core files are an internally generated and formatted analysis of the JVM, providing information that includes the Java threads present, the classes loaded, and heap statistics. The snap trace file mainly includes trace data from the internal operations of the JVM, but may not be available on all platforms.

JVM Tools

Various tools are distributed with modern JVMs to simplify monitoring and data collection. "Java Control Panel" and "Java Mission Control" are two examples that ship with Oracle JVMs. Unfortunately, these tools don't have an equivalent in AdoptOpenJDK.

Packaging and Sending Data to xMatters Support

You should now be ready to create an information package that can be sent to xMatters Support for analysis.

NOTE: If you do not already have the information required to FTP the files, contact xMatters Support for details before proceeding with the following steps.

To package and send your data to xMatters Support:
1. Use an archive tool or command to archive the files:

On Linux or Solaris

tar -cf #.tar [data-files]
gzip -9 #.tar

Where:

  • [data-files] are the output files from all commands and any heapdump or Javacore files. Note the size of the file created.
  • # is a date, which ensures that each file you upload has a unique name.
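
The archive step above can be wrapped in a small function so that the date is filled in automatically. This is a sketch only: the function name package_support_data and the date format are illustrative choices, and maximum compression (gzip -9) is assumed to be wanted.

```shell
# Archive the given files into <date>.tar.gz in the current directory.
# Usage: package_support_data <file>...
# (hypothetical helper name, used here for illustration only)
package_support_data() {
    # A date and time stamp keeps each archive name unique.
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$stamp.tar"
    tar -cf "$archive" "$@" || return 1
    # gzip replaces <date>.tar with <date>.tar.gz.
    gzip -9 "$archive" || return 1
    # Show the name and size so that the size can be noted in the ticket.
    ls -l "$archive.gz"
}
```

For example, package_support_data gc-output.log hs_err_pid1234.log java_pid1234.hprof would create one compressed archive containing the three files (the file names here are examples, not files that are guaranteed to exist).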

On Windows
Use WinZip or WinRAR to archive the files.

2. Contact xMatters Support by creating a support ticket at https://support.xmatters.com/hc/en-us/requests/new

The support ticket will allow you to attach log files.  If your files are larger than allowed, then the support team will create an FTP account allowing you to upload the material.

 

xMatters reference:

JDN-1206; originally created by Don Clark
