[opennms-discuss] SOLVED: ver 19.0.1 Insufficient Memory errors
Hernandez, Paul
2017-06-05 16:50:07 UTC
Hi,

Reporting back on success. After upgrading 19.0.1->19.1.0 and installing the two patched jar files Jesse mentions below, the memory leak is in fact resolved and my OpenNMS instance has been running for almost a week.
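
One way to check that a thread leak like this has stopped is to watch the live thread count of the JVM process over several days. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/status exposes a "Threads:" field; the function names are made up for illustration and are not part of OpenNMS.

```python
from pathlib import Path


def parse_thread_count(status_text: str) -> int:
    """Return the value of the 'Threads:' field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid: int) -> int:
    """Read the current thread count of a running process (Linux only)."""
    return parse_thread_count(Path(f"/proc/{pid}/status").read_text())

# Hypothetical usage: call thread_count(<OpenNMS java pid>) from a periodic job
# and log the result; a count that keeps climbing suggests a leak.
```

A count that levels off after startup is what one would hope to see once the patched jars are in place; a count that grows without bound points at a leak.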

The upgrade did again break both Dashboard menu picks as well as both Maps->Topology and Maps->Geographical (each returns "Page Not Found" or "Unavailable"). This time I did not tee the output of the "bin/install -dis" step to a file; in the past I've resolved this using the warnings (not errors :() from that output.

Many thanks Jesse!
Paul



From: Hernandez, Paul
Sent: Tuesday, May 30, 2017 10:16 AM
To: Jesse White <***@opennms.org>; General OpenNMS Discussion <opennms-***@lists.sourceforge.net>
Cc: Kroonen, Kevin <***@mentor.com>; Figueroa, Willie <***@mentor.com>
Subject: Re: [opennms-discuss] ver 19.0.1 Insufficient Memory errors

Jesse,

So appreciative of the fast response! I will try the new jars right away and report back.

Have a great day,
Paul

From: Jesse White [mailto:***@opennms.org]
Sent: Tuesday, May 30, 2017 10:00 AM
To: General OpenNMS Discussion <opennms-***@lists.sourceforge.net>; Hernandez, Paul <***@mentor.com>
Cc: Kroonen, Kevin <***@mentor.com>; Figueroa, Willie <***@mentor.com>
Subject: Re: [opennms-discuss] ver 19.0.1 Insufficient Memory errors

Hi Paul,

Thanks for the detailed report. OpenNMS Horizon 19.0.0, 19.0.1 and 19.1.0 are affected by a thread leak in SnmpUtils:
https://issues.opennms.org/browse/NMS-9233

The fix will be included in the next release of OpenNMS Horizon.

If you're looking for a fix today, you can try replacing the affected .jar files with the patched files attached to the JIRA issue.

Best,
Jesse
On 05/30/2017 12:46 PM, Hernandez, Paul wrote:
Hi,

I am running 19.0.1 and am unable to keep OpenNMS running for more than a couple of days.

The Server information is as follows:
OS: CentOS Linux release 7.2.1511 (Core) on VMware VM
VCPU: 4
RAM: 16GB
SWAP: 16GB

Total Number of Servers monitored: 4584
Total Number of Services: 31202

While OpenNMS is running, the web interface is nicely responsive and server loads using "top" look reasonable. There is no excessive wait state, and early on (half an hour after start) I see:

[***@orw-onms-dev-vm opennms]# free -h
              total        used        free      shared  buff/cache   available
Mem:            15G        5.6G        362M        263M        9.8G        9.5G
Swap:           15G        377M         15G

I have read what I can find and have opennms.conf set to provide this:

ps -ef | grep open
root 572 1 99 09:04 ? 00:50:12 /usr/java/latest/bin/java -Djava.endorsed.dirs=/opt/opennms/lib/endorsed -Dopennms.home=/opt/opennms -Xmx4096m -XX:+HeapDumpOnOutOfMemoryError -d64 -XX:+UseStringDeduplication -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+UseG1GC -Xloggc:/var/log/opennms/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=4 -XX:GCLogFileSize=20M -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.login.config=opennms -Dcom.sun.management.jmxremote.access.file=/opt/opennms/etc/jmxremote.access -DisThreadContextMapInheritable=true -Dgroovy.use.classvalue=true -XX:MaxMetaspaceSize=256m -Djava.io.tmpdir=/opt/opennms/data/tmp -jar /opt/opennms/lib/opennms_bootstrap.jar start

The output.log file typically contains a very large number of these traces prior to running out of memory:

java.io.IOException: Only 32bit unsigned integers are supported at position 271
    at org.snmp4j.asn1.BER.decodeUnsignedInteger(BER.java:684)
    at org.snmp4j.smi.Counter32.decodeBER(Counter32.java:65)
    at org.snmp4j.smi.AbstractVariable.createFromBER(AbstractVariable.java:172)
    at org.snmp4j.smi.VariableBinding.decodeBER(VariableBinding.java:191)
    at org.snmp4j.PDU.decodeBER(PDU.java:584)
    at org.snmp4j.mp.MPv2c.prepareDataElements(MPv2c.java:201)
    at org.snmp4j.MessageDispatcherImpl.dispatchMessage(MessageDispatcherImpl.java:276)
    at org.snmp4j.MessageDispatcherImpl.processMessage(MessageDispatcherImpl.java:385)
    at org.snmp4j.MessageDispatcherImpl.processMessage(MessageDispatcherImpl.java:345)
    at org.snmp4j.transport.AbstractTransportMapping.fireProcessMessage(AbstractTransportMapping.java:76)
    at org.snmp4j.transport.DefaultUdpTransportMapping$ListenThread.run(DefaultUdpTransportMapping.java:423)
    at java.lang.Thread.run(Thread.java:745)
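
The exception message is consistent with an agent returning a Counter32 whose encoded value does not fit in 32 bits. As a rough illustration of that kind of check (this is not SNMP4J's code, and the function name is made up), a decoder for the content octets of an unsigned 32-bit BER integer might look like this:

```python
def decode_unsigned32(content: bytes) -> int:
    """Decode BER INTEGER content octets as an unsigned 32-bit value.

    Illustrative sketch only. Up to four value octets are accepted, plus an
    optional leading 0x00 octet used as sign padding (five octets in total).
    """
    if not content:
        raise ValueError("empty integer encoding")
    too_long = len(content) > 5 or (len(content) == 5 and content[0] != 0)
    if too_long:
        raise ValueError("only 32-bit unsigned integers are supported")
    # Big-endian, interpreted as unsigned.
    return int.from_bytes(content, "big")
```

Under this sketch, the five octets 00 ff ff ff ff decode to the largest 32-bit value, while 01 00 00 00 00 is rejected because it needs 33 bits.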

The top 50 lines of a typical error file, hs_err_pid73808.log, show:

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 12288 bytes for committing reserved memory.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux.cpp:2627), pid=73808, tid=0x00007f522eff1700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_121-b13) (build 1.8.0_121-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.121-b13 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#

--------------- T H R E A D ---------------

Current thread (0x00007f5251ab4800): JavaThread "DefaultUDPTransportMapping_0.0.0.0/0" daemon [_thread_new, id=129126, stack(0x00007f522eef1000,0x00007f522eff2000)]

Stack: [0x00007f522eef1000,0x00007f522eff2000], sp=0x00007f522eff09a0, free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xac703a] VMError::report_and_die()+0x2ba
V [libjvm.so+0x4fc7eb] report_vm_out_of_memory(char const*, int, unsigned long, VMErrorType, char const*)+0x8b
V [libjvm.so+0x923c43] os::Linux::commit_memory_impl(char*, unsigned long, bool)+0x103
V [libjvm.so+0x923d0c] os::pd_commit_memory(char*, unsigned long, bool)+0xc
V [libjvm.so+0x91d7ca] os::commit_memory(char*, unsigned long, bool)+0x2a
V [libjvm.so+0x9220ff] os::pd_create_stack_guard_pages(char*, unsigned long)+0x7f
V [libjvm.so+0xa6bdbe] JavaThread::create_stack_guard_pages()+0x5e
V [libjvm.so+0xa755a4] JavaThread::run()+0x34
V [libjvm.so+0x926268] java_start(Thread*)+0x108
C [libpthread.so.0+0x7dc5] start_thread+0xc5


--------------- P R O C E S S ---------------

Java Threads: ( => current thread )
0x00007f5251abb000 JavaThread "Timer-25740" daemon [_thread_blocked, id=129129, stack(0x00007f522ebee000,0x00007f522ecef000)]
.
.
.
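
The thread list in an hs_err file can be summarized to see which thread families dominate and how much address space their stacks reserve. Thread stacks live outside the Java heap, so they are not bounded by -Xmx. The sketch below assumes the line format seen in the two JavaThread lines quoted above; the regular expression and function name are assumptions made for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... JavaThread "Timer-25740" daemon [..., stack(0x...,0x...)]
THREAD_RE = re.compile(
    r'JavaThread "(?P<name>[^"]+)".*?'
    r"stack\((?P<lo>0x[0-9a-fA-F]+),(?P<hi>0x[0-9a-fA-F]+)\)"
)


def summarize_threads(log_text: str):
    """Return (Counter of thread families, total stack bytes reserved)."""
    families = Counter()
    stack_bytes = 0
    for m in THREAD_RE.finditer(log_text):
        # Drop a trailing "-<number>" so "Timer-25740" is counted as "Timer".
        family = re.sub(r"-\d+$", "", m.group("name"))
        families[family] += 1
        stack_bytes += int(m.group("hi"), 16) - int(m.group("lo"), 16)
    return families, stack_bytes

# Note: the "Current thread" line may repeat a thread that is also listed
# under "Java Threads", so feed only the thread list to avoid double counting.
```

For the two threads quoted above, each stack range spans 0x101000 bytes (1,052,672 bytes, or 1028 KiB), so every additional leaked thread reserves roughly 1 MiB on top of the 4096 MB heap.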

Would moving to 19.1.0 be advised?

Thanks,
Paul

From: Hernandez, Paul
Sent: Thursday, May 11, 2017 2:42 PM
To: opennms-***@lists.sourceforge.net
Subject: Search: "Not Providing service"

Has anyone ever found the need to search for the set of hosts which currently are *not* providing a service? In other words, say you have a requisition with a number of services defined (in this case net-snmp ones) and you populate a large number of hosts from your datacenter into this requisition. Furthermore, your hosts' operating systems span years of versions and flavors.

Now you'd like a way to easily find which hosts are not providing a service (for a number of reasons like buggy net-snmp, snmp was never installed, the net-snmp is too old to support the "extend" feature etc etc) via Search.

Search has "Providing service" but what you'd like is to discover which hosts *aren't* providing the service. Yes, you could pore through the large list of alerts (consider thousands of hosts) but would it not be nice to simply have search filter them out quickly?

I hope that I am missing something and there is an easy way to do this. And if not, and others find the idea of use, maybe make this into an enhancement request.
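
As a workaround sketch, the idea can be expressed as a set difference: take every host in the requisition and subtract the hosts that the "Providing service" search returns. The example assumes both lists have already been exported by some means; the function name and host names are hypothetical, and nothing here is an OpenNMS API.

```python
def hosts_missing_service(all_hosts, providers):
    """Return the hosts in all_hosts that do not appear in providers, sorted."""
    return sorted(set(all_hosts) - set(providers))


# Hypothetical data: every host in the requisition, and the subset that a
# "Providing service" search reports for the service of interest.
requisition_hosts = ["host-a", "host-b", "host-c", "host-d"]
providing_service = ["host-a", "host-c"]

missing = hosts_missing_service(requisition_hosts, providing_service)
```

Here missing holds host-b and host-d, the hosts that would need a closer look at their net-snmp installation.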

Thanks,
Paul





_______________________________________________
Please read the OpenNMS Mailing List FAQ:
http://www.opennms.org/index.php/Mailing_List_FAQ

opennms-discuss mailing list

To *unsubscribe* or change your subscription options, see the bottom of this page:
https://lists.sourceforge.net/lists/listinfo/opennms-discuss