Discussion:
[opennms-discuss] ver 19.0.1 Insufficient Memory errors
Hernandez, Paul
2017-05-30 16:46:14 UTC
Hi,

I am running 19.0.1 and am unable to keep OpenNMS running for more than a couple of days.

The Server information is as follows:
OS: CentOS Linux release 7.2.1511 (Core) on VMware VM
VCPU: 4
RAM: 16GB
SWAP: 16GB

Total Number of Servers monitored: 4584
Total Number of Services: 31202

While onms is running, the web interface is nicely responsive and server loads using "top" look reasonable. No excessive wait state and early on (1/2 hr after start) I see:

[***@orw-onms-dev-vm opennms]# free -h
              total        used        free      shared  buff/cache   available
Mem:            15G        5.6G        362M        263M        9.8G        9.5G
Swap:           15G        377M         15G

I have read what I can find and have opennms.conf set to provide this:

ps -ef | grep open
root 572 1 99 09:04 ? 00:50:12 /usr/java/latest/bin/java -Djava.endorsed.dirs=/opt/opennms/lib/endorsed -Dopennms.home=/opt/opennms -Xmx4096m -XX:+HeapDumpOnOutOfMemoryError -d64 -XX:+UseStringDeduplication -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+UseG1GC -Xloggc:/var/log/opennms/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=4 -XX:GCLogFileSize=20M -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.login.config=opennms -Dcom.sun.management.jmxremote.access.file=/opt/opennms/etc/jmxremote.access -DisThreadContextMapInheritable=true -Dgroovy.use.classvalue=true -XX:MaxMetaspaceSize=256m -Djava.io.tmpdir=/opt/opennms/data/tmp -jar /opt/opennms/lib/opennms_bootstrap.jar start

The output.log file typically contains a very large number of these traces prior to running out of memory:

java.io.IOException: Only 32bit unsigned integers are supported at position 271
at org.snmp4j.asn1.BER.decodeUnsignedInteger(BER.java:684)
at org.snmp4j.smi.Counter32.decodeBER(Counter32.java:65)
at org.snmp4j.smi.AbstractVariable.createFromBER(AbstractVariable.java:172)
at org.snmp4j.smi.VariableBinding.decodeBER(VariableBinding.java:191)
at org.snmp4j.PDU.decodeBER(PDU.java:584)
at org.snmp4j.mp.MPv2c.prepareDataElements(MPv2c.java:201)
at org.snmp4j.MessageDispatcherImpl.dispatchMessage(MessageDispatcherImpl.java:276)
at org.snmp4j.MessageDispatcherImpl.processMessage(MessageDispatcherImpl.java:385)
at org.snmp4j.MessageDispatcherImpl.processMessage(MessageDispatcherImpl.java:345)
at org.snmp4j.transport.AbstractTransportMapping.fireProcessMessage(AbstractTransportMapping.java:76)
at org.snmp4j.transport.DefaultUdpTransportMapping$ListenThread.run(DefaultUdpTransportMapping.java:423)
at java.lang.Thread.run(Thread.java:745)

The top 50 lines of a typical error file, hs_err_pid73808.log, show:

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 12288 bytes for committing reserved memory.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (os_linux.cpp:2627), pid=73808, tid=0x00007f522eff1700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_121-b13) (build 1.8.0_121-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.121-b13 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#

--------------- T H R E A D ---------------

Current thread (0x00007f5251ab4800): JavaThread "DefaultUDPTransportMapping_0.0.0.0/0" daemon [_thread_new, id=129126, stack(0x00007f522eef1000,0x00007f522eff2000)]

Stack: [0x00007f522eef1000,0x00007f522eff2000], sp=0x00007f522eff09a0, free space=1022k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xac703a] VMError::report_and_die()+0x2ba
V [libjvm.so+0x4fc7eb] report_vm_out_of_memory(char const*, int, unsigned long, VMErrorType, char const*)+0x8b
V [libjvm.so+0x923c43] os::Linux::commit_memory_impl(char*, unsigned long, bool)+0x103
V [libjvm.so+0x923d0c] os::pd_commit_memory(char*, unsigned long, bool)+0xc
V [libjvm.so+0x91d7ca] os::commit_memory(char*, unsigned long, bool)+0x2a
V [libjvm.so+0x9220ff] os::pd_create_stack_guard_pages(char*, unsigned long)+0x7f
V [libjvm.so+0xa6bdbe] JavaThread::create_stack_guard_pages()+0x5e
V [libjvm.so+0xa755a4] JavaThread::run()+0x34
V [libjvm.so+0x926268] java_start(Thread*)+0x108
C [libpthread.so.0+0x7dc5] start_thread+0xc5


--------------- P R O C E S S ---------------

Java Threads: ( => current thread )
0x00007f5251abb000 JavaThread "Timer-25740" daemon [_thread_blocked, id=129129, stack(0x00007f522ebee000,0x00007f522ecef000)]
.
.
.

Would moving to 19.1.0 be advised?

Thanks,
Paul

From: Hernandez, Paul
Sent: Thursday, May 11, 2017 2:42 PM
To: opennms-***@lists.sourceforge.net
Subject: Search: "Not Providing service"

Has anyone ever found the need to search for the set of hosts which currently are *not* providing a service? In other words, say you have a requisition with a number of services defined (in this case net-snmp ones) and you populate a large number of hosts into this requisition from your datacenter. Furthermore, your hosts' operating systems span years of versions and flavors.

Now you'd like a way to easily find which hosts are not providing a service (for a number of reasons like buggy net-snmp, snmp was never installed, the net-snmp is too old to support the "extend" feature etc etc) via Search.

Search has "Providing service" but what you'd like is to discover which hosts *aren't* providing the service. Yes, you could pore over the large list of alerts (consider thousands of hosts), but would it not be nice to simply have search filter them out quickly?

I hope that I am missing something and there is an easy way to do this. And if not, and others find the idea of use, maybe make this into an enhancement request.

Thanks,
Paul
Jesse White
2017-05-30 16:59:39 UTC
Hi Paul,

Thanks for the detailed report. OpenNMS Horizon 19.0.0, 19.0.1 and 19.1.0 are affected by a thread leak in SnmpUtils:
https://issues.opennms.org/browse/NMS-9233

The fix will be included in the next release of OpenNMS Horizon.

If you're looking for a fix today, you can try replacing the affected .jar files with the patched files attached to the JIRA issue.

Best,
Jesse
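
One way to check whether a crash matches this kind of thread leak is to tally the thread names listed under "Java Threads" in the hs_err file; a name such as "Timer-25740" in the report above hints at a large number of timer threads. The following is a minimal sketch, not an OpenNMS tool. The function name thread_name_counts is hypothetical, and the sketch assumes the thread lines look like the ones quoted in this thread (an address, then JavaThread "name").

```python
import re
from collections import Counter

# Matches entries from the "Java Threads" list of an hs_err file, e.g.:
#   0x00007f5251abb000 JavaThread "Timer-25740" daemon [_thread_blocked, ...]
# An optional "=>" marks the current thread in that list.
THREAD_LINE = re.compile(r'^\s*(?:=>)?\s*0x[0-9a-fA-F]+ JavaThread "([^"]+)"')


def thread_name_counts(hs_err_text):
    """Count listed Java threads by name, grouping names like 'Timer-123'."""
    counts = Counter()
    for line in hs_err_text.splitlines():
        match = THREAD_LINE.match(line)
        if match:
            # Strip a trailing "-<number>" so numbered siblings group together.
            counts[re.sub(r'-\d+$', '', match.group(1))] += 1
    return counts
```

Reading the file and printing thread_name_counts(text).most_common(10) gives the ten largest groups. A single group with thousands of entries would be consistent with a leak, though it does not by itself identify the cause.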
_______________________________________________
http://www.opennms.org/index.php/Mailing_List_FAQ
opennms-discuss mailing list
https://lists.sourceforge.net/lists/listinfo/opennms-discuss
Hernandez, Paul
2017-05-30 17:16:20 UTC
Jesse,

So appreciative of the fast response! I will try the new jars right away and report back.

Have a great day,
Paul
