Discussion:
[opennms-discuss] Thresholding disk space percent used
Brian
2008-04-22 16:59:22 UTC
This is probably a common use for OpenNMS: I need to alert on percent disk space used on Windows and Linux, bearing in mind that we sometimes want fixed disks, removable disks, memory, or swap thresholded and graphed.

I have read many previous posts and the wiki. I see there is also a ucd-snmp config option posted in the wiki, but it's super convoluted.

My closest attempt in 1.3.11 so far has been to add the following to datacollection-config.xml:
<group name="mib2-host-resources-storage" ifType="all">
<mibObj oid=".1.3.6.1.2.1.25.2.3.1.2" instance="hrStorageIndex" alias="hrStorageType" type="string" />
<mibObj oid=".1.3.6.1.2.1.25.2.3.1.3" instance="hrStorageIndex" alias="hrStorageDescr" type="string" />
<mibObj oid=".1.3.6.1.2.1.25.2.3.1.4" instance="hrStorageIndex" alias="hrStorageAllocUnits" type="gauge" />
<mibObj oid=".1.3.6.1.2.1.25.2.3.1.5" instance="hrStorageIndex" alias="hrStorageSize" type="gauge" />
<mibObj oid=".1.3.6.1.2.1.25.2.3.1.6" instance="hrStorageIndex" alias="hrStorageUsed" type="gauge" />
</group>

Then add the following to thresholds.xml:
<expression type="high" ds-type="hrStorageIndex" value="80"
rearm="70" trigger="1" ds-label="hrStorageDescr" expression="(hrStorageUsed/hrStorageSize)*100">
<resource-filter field="hrStorageType">\.1\.3\.6\.1\.2\.1\.25\.2\.1\.4</resource-filter>
</expression>

I end up with an endless loop of "high threshold"/"high threshold rearmed" messages: e.g. partition 1 on a server is found over threshold, then partition 2 is found under threshold, which rearms the partition 1 threshold.
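
For anyone following along, here is a rough Python sketch of the behaviour I would expect from a high threshold with a rearm level when each hrStorageIndex instance keeps its own state. This is only an illustration of the idea, not how OpenNMS implements it; the class and method names (HighThreshold, evaluate) are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HighThreshold:
    """High threshold with a rearm level, tracked separately per instance."""
    value: float  # trigger level in percent, e.g. 80
    rearm: float  # rearm level in percent, e.g. 70
    armed: dict = field(default_factory=dict)  # instance -> False once triggered

    def evaluate(self, instance, used, size):
        """Return 'exceeded', 'rearmed', or None for one sample of one instance."""
        if size <= 0:
            return None  # nothing sensible to compute for empty storage
        percent = (used / size) * 100
        is_armed = self.armed.get(instance, True)
        if is_armed and percent >= self.value:
            self.armed[instance] = False
            return "exceeded"
        if not is_armed and percent <= self.rearm:
            self.armed[instance] = True
            return "rearmed"
        return None


threshold = HighThreshold(value=80, rearm=70)
events = [
    threshold.evaluate("1", used=90, size=100),  # partition 1 goes over
    threshold.evaluate("2", used=10, size=100),  # partition 2 is fine, must not rearm partition 1
    threshold.evaluate("1", used=90, size=100),  # still over, no duplicate event
    threshold.evaluate("1", used=60, size=100),  # drops below the rearm level
]
print(events)
```

Because the state is keyed by instance, the low reading on partition 2 leaves partition 1 alone. The loop I am seeing looks like what you would get if all instances shared one state.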


Questions:
1. I thought the endless loop problem was fixed in a previous version (around December)?

2. How do you know what is valid in the resource-filter field? The above resource-filter example actually doesn't work for me, as it never matches. An older post mentioned \.1\.3\.6\.1\.2\.1\.25\.2\.1\.4 should match permanent disks. What are valid values?

3. How do you get hrStorageDescr into the notification text to know what partition/disk the alert is referencing? My (default) high threshold notifications look like this:

<text-message xmlns="">A Threshold has been exceeded on node: %nodelabel%, interface:%interface%. The parameter %parm[ds]% reached a value of %parm[value]% while the threshold is %parm[threshold]%. This alert will be rearmed when %parm[ds]% reaches %parm[rearm]%.</text-message>
<subject xmlns="">Notice #%noticeid%: High Threshold for %parm[ds]% on node %nodelabel%.</subject>

4. Can an exclude filter be used in snmp-graphs to, e.g., exclude all removable media graphs?


Brian



Jeff Gehlbach
2008-04-22 18:00:11 UTC
Post by Brian
I end up with an endless loop of "high threshold"/"high threshold
rearmed" messages indicating eg. partition 1 on a server is over
threshold, then partition 2 is found under threshold and it resets
the partition 1 threshold.
http://www.opennms.org/index.php/Thresholding#Merge_into_collectd

-jeff
Brian
2008-04-22 19:09:42 UTC
Post by Brian
I end up with an endless loop of "high threshold"/"high threshold
rearmed" messages indicating eg. partition 1 on a server is over
threshold, then partition 2 is found under threshold and it resets
the partition 1 threshold.
http://www.opennms.org/index.php/Thresholding#Merge_into_collectd
Does "You should disable the SnmpThresholder configuration" mean I should comment out or delete the following line from threshd-configuration.xml for 1.3.10? (The instructions are explicit for 1.5.90.)
<thresholder service="SNMP" class-name="org.opennms.netmgt.threshd.SnmpThresholder"/>

If this is a better way to go than default, should this go in as an enhancement request?


I have modified my files to the following based on the wiki link. I'm not sure if this is what was intended:

"threshd-configuration.xml"
<?xml version="1.0" encoding="UTF-8"?>
<threshd-configuration
xmlns="http://xmlns.opennms.org/xsd/config/threshd" threads="5">
<package name="example1">
<filter>IPADDR != '0.0.0.0'</filter>
<include-range begin="1.1.1.1" end="254.254.254.254"/>
<service name="SNMP" interval="300000" user-defined="false" status="on">
<!--
<parameter key="thresholding-group" value="default-snmp"/>
<parameter key="thresholding-group" value="windows"/>
-->
<parameter key="range" value="600000"/>
</service>
<!--
<service name="ICMP" interval="300000" user-defined="false" status="on">
<parameter key="thresholding-group" value="latency"/>
</service>
-->
<outage-calendar xmlns="">torpdu1</outage-calendar>
</package>
<!--
<thresholder service="SNMP" class-name="org.opennms.netmgt.threshd.SnmpThresholder"/>
<thresholder service="ICMP" class-name="org.opennms.netmgt.threshd.LatencyThresholder"/>
-->
</threshd-configuration>




"collectd-configuration.xml"
<?xml version="1.0" encoding="UTF-8"?>
<collectd-configuration
xmlns="http://xmlns.opennms.org/xsd/config/collectd" threads="50">
<package name="example1">
<filter>IPADDR != '0.0.0.0'</filter>
<include-range begin="1.1.1.1" end="254.254.254.254"/>
<service name="SNMP" interval="300000" user-defined="false" status="on">
<parameter key="collection" value="default"/>
<parameter key="thresholding-group" value="default-snmp"/>
<parameter key="thresholding-group" value="windows"/>
</service>
<outage-calendar xmlns="">torpdu1</outage-calendar>
</package>
<collector service="SNMP" class-name="org.opennms.netmgt.collectd.SnmpCollector"/>
</collectd-configuration>


ONMS restarted, but after altering one of the thresholds to artificially generate some notifications for testing, I see the following message in the threshd.log file and nothing afterwards:

2008-04-22 15:04:31,201 DEBUG [Threshd:BroadcastEventProcessor] Threshd: reinitializeThresholders: About to reinitialize thresholder ICMP



Brian



Brian
2008-04-22 20:01:43 UTC
ONMS won't start without "<thresholder service="SNMP" class-name="org.opennms.netmgt.threshd.SnmpThresholder"/>" in threshd-configuration.xml. I'm getting the following error in output.log:


Caused by: ValidationException: A minimum of 1 _thresholderList object(s) (whose xml name is 'thresholder') are required for class: org.opennms.netmgt.config.threshd.ThreshdConfiguration;
...
An error occurred while attempting to start the "OpenNMS:Name=Threshd" service (class org.opennms.netmgt.threshd.jmx.Threshd). Shutting down and exiting.



Then I add the line back to the config, and get the following error but at least ONMS now starts:


threshd.log:
2008-04-22 15:54:06,373 WARN [ThreshdScheduler-5 Pool-fiber0] Threshd: scheduleService: Unable to schedule 172.20.20.1 for service SNMP, reason: Thresholding group default does not exist.
java.lang.IllegalArgumentException: Thresholding group default does not exist.
at org.opennms.netmgt.config.ThresholdingConfigFactory.getGroup(ThresholdingConfigFactory.java:238)
at org.opennms.netmgt.config.ThresholdingConfigFactory.getRrdRepository(ThresholdingConfigFactory.java:232)
at org.opennms.netmgt.threshd.DefaultThresholdsDao.get(DefaultThresholdsDao.java:59)
at org.opennms.netmgt.threshd.SnmpThresholdConfiguration.<init>(SnmpThresholdConfiguration.java:70)
at org.opennms.netmgt.threshd.SnmpThresholdNetworkInterface.<init>(SnmpThresholdNetworkInterface.java:52)
at org.opennms.netmgt.threshd.SnmpThresholder.initialize(SnmpThresholder.java:182)
at org.opennms.netmgt.threshd.Threshd.scheduleService(Threshd.java:439)
at org.opennms.netmgt.threshd.Threshd$3.processRow(Threshd.java:345)
at org.opennms.netmgt.utils.Querier.executeStmt(Querier.java:66)
at org.opennms.netmgt.utils.JDBCTemplate.doExecute(JDBCTemplate.java:113)
at org.opennms.netmgt.utils.JDBCTemplate.execute(JDBCTemplate.java:84)
at org.opennms.netmgt.utils.JDBCTemplate.execute(JDBCTemplate.java:61)
at org.opennms.netmgt.threshd.Threshd.scheduleExistingInterfaces(Threshd.java:349)
at org.opennms.netmgt.threshd.Threshd.access$000(Threshd.java:71)
at org.opennms.netmgt.threshd.Threshd$2.run(Threshd.java:193)
at org.opennms.core.concurrent.RunnableConsumerThreadPool$FiberThreadImpl.run(RunnableConsumerThreadPool.java:422)
at java.lang.Thread.run(Thread.java:595)


Here's what my configs look like now:

<?xml version="1.0" encoding="UTF-8"?>
<threshd-configuration
xmlns="http://xmlns.opennms.org/xsd/config/threshd" threads="5">
<package name="example1">
<filter>IPADDR != '0.0.0.0'</filter>
<include-range begin="1.1.1.1" end="254.254.254.254"/>
<service name="SNMP" interval="300000" user-defined="false" status="on">
<parameter key="range" value="600000"/>
</service>
<outage-calendar xmlns="">torpdu1</outage-calendar>
</package>
<thresholder service="SNMP" class-name="org.opennms.netmgt.threshd.SnmpThresholder"/>
</threshd-configuration>


<?xml version="1.0" encoding="UTF-8"?>
<collectd-configuration
xmlns="http://xmlns.opennms.org/xsd/config/collectd" threads="50">
<package name="example1">
<filter>IPADDR != '0.0.0.0'</filter>
<include-range begin="1.1.1.1" end="254.254.254.254"/>
<service name="SNMP" interval="300000" user-defined="false" status="on">
<parameter key="collection" value="default"/>
<parameter key="thresholding-group" value="default-snmp"/>
<parameter key="thresholding-group" value="windows"/>
</service>
<outage-calendar xmlns="">torpdu1</outage-calendar>
</package>
<collector service="SNMP" class-name="org.opennms.netmgt.collectd.SnmpCollector"/>
</collectd-configuration>


What's happening here?


Brian




Jeff Gehlbach
2008-04-22 21:10:44 UTC
Post by Brian
ONMS wont start without "<thresholder service="SNMP" class-name="org.opennms.netmgt.threshd.SnmpThresholder"/>" throwing the following error:
Caused by: ValidationException: A minimum of 1 _thresholderList object(s) (whose xml name is 'thresholder') are required for class: org.opennms.netmgt.config.threshd.ThreshdConfiguration;
...
An error occurred while attempting to start the "OpenNMS:Name=Threshd" service (class org.opennms.netmgt.threshd.jmx.Threshd). Shutting down and exiting.
Then I add the line back to the config, and get the following error
If you're going to run threshd, you need to define at least one
thresholder in threshd-configuration.xml. You can turn off Threshd in
service-configuration.xml if you won't be needing it.
Post by Brian
<?xml version="1.0" encoding="UTF-8"?>
<collectd-configuration
xmlns="http://xmlns.opennms.org/xsd/config/collectd" threads="50">
<package name="example1">
<filter>IPADDR != '0.0.0.0'</filter>
<include-range begin="1.1.1.1" end="254.254.254.254"/>
<service name="SNMP" interval="300000" user-defined="false" status="on">
<parameter key="collection" value="default"/>
<parameter key="thresholding-group" value="default-snmp"/>
<parameter key="thresholding-group" value="windows"/>
Currently you can't do this -- the parameters for the <service>
element are implemented as a Map (think of a hash table) so there can
be only one thresholding-group defined per <service>. We're aware
that this is a functional regression and are working on a way to allow
multiple thresholds against a single collection again.
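
To make the Map point concrete, here is a small Python sketch using a plain dict as a stand-in for the Map. The load_parameters helper is hypothetical and only illustrates why a repeated key cannot carry two values. In this sketch the last value wins; which one actually survives in OpenNMS depends on its own parsing code, so treat that part as an assumption.

```python
def load_parameters(pairs):
    """Collect <parameter key=... value=.../> pairs into a key -> value map."""
    params = {}
    for key, value in pairs:
        params[key] = value  # a repeated key replaces the earlier value
    return params


service_parameters = load_parameters([
    ("collection", "default"),
    ("thresholding-group", "default-snmp"),
    ("thresholding-group", "windows"),
])
print(service_parameters)
```

Only one thresholding-group entry is left in the result, and no error is raised along the way.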

-jeff
Brian
2008-04-23 16:13:46 UTC
If you're going to run threshd, you need to define at least one
thresholder in threshd-configuration.xml. You can turn off Threshd in
service-configuration.xml if you won't be needing it.
Thanks, that pointed me in the right direction. Now all that's left is to set up the regular expression for the resource filter.

I have tried various combinations of searching for "Physical Memory", and also using \.1\.3\.6\.1\.2\.1\.25\.2\.1\.2 which I believe maps to hrStorageRam. I have tried variations of m/Physical Memory/, m//Physical Memory/, /Physical Memory/, etc.

Any pointers on how to get this working?

Here's what I'm seeing in the logs.
2008-04-23 11:47:30,280 INFO [CollectdScheduler-50 Pool-fiber4] SnmpAttribute: No data collected for attribute Node[39]/type[hrStorageIndex]/instance[4].hrStorageDescr[.1.3.6.1.2.1.25.2.3.1.3] = Physical Memory. Skipping
2008-04-23 11:47:30,283 DEBUG [CollectdScheduler-50 Pool-fiber4] ThresholdingVisitor: visitAttribute storing value Physical Memory for attribute named hrStorageDescr
2008-04-23 11:47:30,283 DEBUG [CollectdScheduler-50 Pool-fiber4] SnmpAttribute: Visiting attribute Node[39]/type[hrStorageIndex]/instance[4].hrStorageUsed[.1.3.6.1.2.1.25.2.3.1.6] = 15236
2008-04-23 11:47:30,283 DEBUG [CollectdScheduler-50 Pool-fiber4] ThresholdingVisitor: visitAttribute storing value 15236.0 for attribute named hrStorageUsed
2008-04-23 11:47:30,283 DEBUG [CollectdScheduler-50 Pool-fiber4] ThresholdingVisitor: Completing Resource hrStorageIndex/172.20.40.16/4:-1 (Node[39]/type[hrStorageIndex]/instance[4])
2008-04-23 11:47:30,283 DEBUG [CollectdScheduler-50 Pool-fiber4] GenericIndexResource: getResourceDir: /opt/opennms/share/rrd/snmp/39/hrStorageIndex/4
2008-04-23 11:47:30,284 DEBUG [CollectdScheduler-50 Pool-fiber4] ThresholdingVisitor: passedThresholdFilters: resource=4, group=default-snmp, type=hrStorageIndex, filters=1
2008-04-23 11:47:30,284 DEBUG [CollectdScheduler-50 Pool-fiber4] ThresholdingVisitor: passedThresholdFilters: filter #1: field=hrStorageType, regex='"Physical Memory"'
2008-04-23 11:47:30,284 DEBUG [CollectdScheduler-50 Pool-fiber4] ThresholdingVisitor: Getting Value for hrStorageIndex::hrStorageType
2008-04-23 11:47:30,284 DEBUG [CollectdScheduler-50 Pool-fiber4] ThresholdingVisitor: passedThresholdFilters: the value of hrStorageType is .1.3.6.1.2.1.25.2.1.2. Pass filter? false

Brian
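Post by Brian
<collectd-configuration xmlns="http://xmlns.opennms.org/xsd/config/collectd" threads="50">
<filter>IPADDR != '0.0.0.0'</filter>
<service name="SNMP" interval="300000" user-defined="false" status="on">
<parameter key="thresholding-group" value="default-snmp"/>
<parameter key="thresholding-group" value="windows"/>
Currently you can't do this -- the parameters for the <service>
element are implemented as a Map (think of a hash table) so there can
be only one thresholding-group defined per <service>. We're aware
that this is a functional regression and are working on a way to allow
multiple thresholds against a single collection again.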
I don't see any errors with this config, and it still works when I remove the windows line.


Brian



Brian
2008-04-23 17:30:14 UTC
Post by Brian
Thanks, that pointed me in the right direction. Now all thats left is to setup
the regular expression for the resource filter.
I have tried various combinations of searching for "Physical Memory", and also
using \.1\.3\.6\.1\.2\.1\.25\.2\.1\.2 which I believe maps to hrStorageRam. I
have tried variations of m/Physical Memory/, m//Physical Memory/, /Physical
Memory/, etc. etc.
Any pointers on how to get this working?
Figured it out. hrStorageType holds the numeric data types:

\.1\.3\.6\.1\.2\.1\.25\.2\.1\.2 = hrStorageRam
\.1\.3\.6\.1\.2\.1\.25\.2\.1\.3 = hrStorageVirtualMemory
\.1\.3\.6\.1\.2\.1\.25\.2\.1\.4 = hrStorageFixedDisk
\.1\.3\.6\.1\.2\.1\.25\.2\.1\.5 = hrStorageRemovableDisk
\.1\.3\.6\.1\.2\.1\.25\.2\.1\.7 = hrStorageCompactDisc

And hrStorageDescr holds the text labels, e.g. Physical Memory.
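
A quick Python sketch of why my earlier filter never passed: the regex '"Physical Memory"' was being tested against hrStorageType, which holds the OID, and the quotes were part of the pattern. Python's re.search is used here as a stand-in for whatever regex matching OpenNMS does internally (an assumption on my part), and passes_filter is a made-up helper.

```python
import re

# Type OIDs from the list above, unescaped
STORAGE_TYPES = {
    ".1.3.6.1.2.1.25.2.1.2": "hrStorageRam",
    ".1.3.6.1.2.1.25.2.1.3": "hrStorageVirtualMemory",
    ".1.3.6.1.2.1.25.2.1.4": "hrStorageFixedDisk",
    ".1.3.6.1.2.1.25.2.1.5": "hrStorageRemovableDisk",
    ".1.3.6.1.2.1.25.2.1.7": "hrStorageCompactDisc",
}


def passes_filter(resource, field_name, pattern):
    """True if the named field exists on the resource and the regex matches it."""
    value = resource.get(field_name)
    return value is not None and re.search(pattern, value) is not None


# The resource from the log: instance 4 on node 39
resource = {
    "hrStorageType": ".1.3.6.1.2.1.25.2.1.2",
    "hrStorageDescr": "Physical Memory",
}

print(STORAGE_TYPES[resource["hrStorageType"]])
# Wrong field, and the literal quotes are part of the pattern: no match
print(passes_filter(resource, "hrStorageType", '"Physical Memory"'))
# Escaped OID against the type field: match
print(passes_filter(resource, "hrStorageType", r"\.1\.3\.6\.1\.2\.1\.25\.2\.1\.2$"))
# Text label against the description field: match
print(passes_filter(resource, "hrStorageDescr", r"^(Physical Memory)$"))
```

So filter on hrStorageType with the escaped OID, or on hrStorageDescr with the text label, but not a mix of the two.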

Now I'm just working on the notifications, to get the values rounded to whole numbers and to find the right escaping to include a percent sign.


Brian




Brian
2008-04-23 18:56:05 UTC
I've found some very odd behaviour in the indexed threshold system. If I have a single expression for any given index, I can collect all the information as expected. This is shown in the first configuration below. However, this poses an obvious problem where I want different thresholds for each index value, say for physical memory, virtual memory, and fixed disk space:

This one works, but is obviously limited in use:
"thresholds.xml"
<expression type="high" ds-type="hrStorageIndex" value="60.0"
rearm="50.0" trigger="1" ds-label="hrStorageDescr" expression="(hrStorageUsed/hrStorageSize)*100.0">
<resource-filter field="hrStorageDescr">^(Physical Memory)$</resource-filter>
<resource-filter field="hrStorageDescr">^(Virtual Memory)$</resource-filter>
<resource-filter field="hrStorageType">\.1\.3\.6\.1\.2\.1\.25\.2\.1\.4</resource-filter>
</expression>



What I would like to do, and I assume this has been done before somehow, is have different thresholds based on index Description and/or Type. When I set this up with one expression per threshold, I end up with three alerts for the same index value: one for the 60% Physical Memory threshold, another for the 50% Virtual Memory threshold, and yet another for the 40% fixed disk space threshold.

This one does not work but it should, or I'm doing something wrong:
"thresholds.xml"
<expression type="high" ds-type="hrStorageIndex" value="60.0"
rearm="50.0" trigger="1" ds-label="hrStorageDescr" expression="(hrStorageUsed/hrStorageSize)*100.0">
<resource-filter field="hrStorageDescr">^(Physical Memory)$</resource-filter>
</expression>
<expression type="high" ds-type="hrStorageIndex" value="50.0"
rearm="40.0" trigger="1" ds-label="hrStorageDescr" expression="(hrStorageUsed/hrStorageSize)*100.0">
<resource-filter field="hrStorageDescr">^(Virtual Memory)$</resource-filter>
</expression>
<expression type="high" ds-type="hrStorageIndex" value="40.0"
rearm="30.0" trigger="1" ds-label="hrStorageDescr" expression="(hrStorageUsed/hrStorageSize)*100.0">
<resource-filter field="hrStorageType">\.1\.3\.6\.1\.2\.1\.25\.2\.1\.4</resource-filter>
</expression>
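
To spell out what I expect from the three expressions above, here is a Python sketch of the intended behaviour: each resource is checked only against the threshold whose filter it matches. This is my reading of how it should work, not a description of what 1.3.11 does. The exceeded_thresholds helper and the sample values are invented for the example, and re.search stands in for the real regex matching.

```python
import re

# (filter field, filter regex, trigger value, rearm value) from the three expressions
THRESHOLDS = [
    ("hrStorageDescr", r"^(Physical Memory)$", 60.0, 50.0),
    ("hrStorageDescr", r"^(Virtual Memory)$", 50.0, 40.0),
    ("hrStorageType", r"\.1\.3\.6\.1\.2\.1\.25\.2\.1\.4", 40.0, 30.0),
]


def exceeded_thresholds(resource):
    """Return the trigger values that this resource both matches and exceeds."""
    percent = (resource["hrStorageUsed"] / resource["hrStorageSize"]) * 100.0
    hits = []
    # The rearm value is carried along for completeness but not used here
    for field_name, pattern, value, _rearm in THRESHOLDS:
        if re.search(pattern, resource.get(field_name, "")) and percent >= value:
            hits.append(value)
    return hits


# Sample resources with invented usage numbers
memory = {
    "hrStorageDescr": "Physical Memory",
    "hrStorageType": ".1.3.6.1.2.1.25.2.1.2",
    "hrStorageUsed": 55,
    "hrStorageSize": 100,
}
disk = {
    "hrStorageDescr": "/",
    "hrStorageType": ".1.3.6.1.2.1.25.2.1.4",
    "hrStorageUsed": 45,
    "hrStorageSize": 100,
}

print(exceeded_thresholds(memory))  # 55% is under the 60% memory threshold
print(exceeded_thresholds(disk))    # 45% is over the 40% fixed disk threshold
```

With this logic, memory at 55% raises nothing and a fixed disk at 45% raises only the 40% alert, rather than every resource being checked against all three values.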


Anyone have something like this working, or is this a bug?


Brian



Brian
2008-04-24 18:16:50 UTC
Is there an 'only-one-threshold-per-(ds-name/expression)' limit in thresholds.xml that prevents using different thresholds for different index values?

Brian

