Are there any known issues getting groups to work with the alerting
rules?
Now that I understand what "purple" means, I'm trying to do the
following:
1) In general, servers should alert if they're in a degraded state for
more than 25 minutes
2) In general, if they're in a state that's worse than purple, they
should alert after 5 minutes
3) In general, if the issue is CPU, I should be paged after 10 minutes
4) For the BUILD group, we don't care about anything unless it persists
for 2 hours
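To spell out the intent, here is a rough Python sketch of the delays I am
after. It is not Big Sister syntax: the function name, its parameters, and
the precedence order (BUILD first, then CPU, then severity) are my own
assumptions for illustration only.

```python
# Hypothetical model of the intended policy; not Big Sister configuration.
# Delays are in minutes and come straight from points 1-4 above.
DEGRADED_DELAY = 25   # 1) any degraded state
WORSE_DELAY = 5       # 2) anything worse than purple
CPU_DELAY = 10        # 3) CPU problems
BUILD_DELAY = 120     # 4) anything in the BUILD group


def intended_delay(group, check, worse_than_purple):
    """Return how many minutes a problem should persist before alerting.

    Assumed precedence: the BUILD group overrides everything else,
    then the CPU check, then the severity of the state.
    """
    if group == "BUILD":
        return BUILD_DELAY
    if check == "cpu":
        return CPU_DELAY
    if worse_than_purple:
        return WORSE_DELAY
    return DEGRADED_DELAY
```

Under these assumptions, intended_delay("BUILD", "cpu", True) is 120, so a
CPU problem on a BUILD host should stay quiet for two hours.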
So these are the rules I added:
*.* prio=20 down=purple up=green delay=25 \
check="$host.conn and ((not '$router') or $router.conn)" \
norepeat=20 keep=1 suppressed repeatprio=10
msgmax=60 \
maxpermin=20
*.* prio=20 down=yellow up=green delay=5 suppressed
*.cpu prio=40 down=red up=green delay=10 suppressed
@BUILD.* prio=10 down=red up=green delay=120 suppressed
Now I am starting to think that the first two *.* rules may override
each other (meaning that we would never get an alert for a purple
status on any host).
But what's weird to me is that address_1 (which isn't my pager) got a
red CPU alert from one of the servers in the BUILD group after just 5
minutes, suggesting that the @BUILD.* rule was being ignored.
Does anyone know more about this?
Thanks
Christopher
_______________________________________________
Bigsister-general mailing list
suppressed
https://lists.sourceforge.net/lists/listinfo/bigsister-general