
Re: [Bigsister-general] bigsister performance


On Tue, 2005-10-18 at 17:34 +0200, Rob Verduijn wrote:
> This is a bit difficult to answer since the console stops responding, I
> will keep an eye on bs to see what memory and/or resources are used.

If you could live without alarms for a day or two (you said this
happens quite often :-)), would it be an option to switch off
alarming entirely for the ping tests in order to see if this is the
problem?

Actually, Big Sister *is* forking the alarm sending command (usually
sendmail) regardless of how many alarms are going to go off. So, if
we are talking about a really big number of alarms, this might well
get the system into trouble. And, I think, Big Sister will fork
faster than sendmail can deliver the mails.
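
A minimal sketch of capping how many senders run at once, so that a burst
of alarms queues up instead of forking without bound. This is not Big
Sister's code: `send_all`, `mail_alarm`, the `max_parallel` value and the
`sendmail -t` invocation are all assumptions made for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def mail_alarm(alarm, command=("sendmail", "-t")):
    """Hypothetical sender: pipe one alarm message into the mail command."""
    subprocess.run(command, input=alarm.encode(), check=True)
    return alarm


def send_all(alarms, send=mail_alarm, max_parallel=4):
    """Call send(alarm) for every alarm, never more than max_parallel at once."""
    # The pool holds at most max_parallel workers; further alarms wait
    # in its queue instead of each starting a child process right away.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(send, alarms))
```

With a cap like this, 1000 simultaneous alarms would still all be sent,
but only a handful of mail processes would exist at any moment.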

So, memo for Tom: put a limit on the number of alerts that are sent
out within a given time (who will read 1000 alert messages, anyway?)
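
Such a limit could look roughly like the sliding-window sketch below. The
class name `AlertLimiter` and its parameters are hypothetical, not part of
Big Sister; the injectable `clock` is only there to make it easy to test.

```python
import time
from collections import deque


class AlertLimiter:
    """Allow at most max_alerts sends per sliding window of window_seconds."""

    def __init__(self, max_alerts, window_seconds, clock=time.monotonic):
        self.max_alerts = max_alerts
        self.window_seconds = window_seconds
        self.clock = clock
        self.sent = deque()   # timestamps of alerts that were let through
        self.suppressed = 0   # count of alerts that were held back

    def allow(self):
        """Return True if one more alert may be sent right now."""
        now = self.clock()
        # Forget sends that have fallen out of the window.
        while self.sent and now - self.sent[0] >= self.window_seconds:
            self.sent.popleft()
        if len(self.sent) < self.max_alerts:
            self.sent.append(now)
            return True
        self.suppressed += 1
        return False
```

The caller would fork the mail command only when `allow()` returns True,
and could report `suppressed` in a single summary message afterwards.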

> It's installed on a machine with a 3ware sata raid controller.
> Running suse 9.2

Ok, that's kernel 2.6.8, I think. I'm pretty sure the I/O locking
problem was solved long before that.

Maybe you could give

	echo 2 > /proc/sys/vm/overcommit_memory

a try. Mode 2 switches the kernel to strict accounting, which prevents
it from committing more memory to processes than is available, so that
Big Sister rather than the underlying system gets into trouble. (Note
that "echo 1" does the opposite: it lets the kernel always overcommit.)
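
For reference, the kernel documents three values for this setting: 0
(heuristic, the default), 1 (always overcommit) and 2 (strict accounting).
The sketch below decodes the current value and, on kernels whose
/proc/meminfo exposes `CommitLimit` and `Committed_AS`, computes how much
commit headroom is left. The function names are made up for this example.

```python
OVERCOMMIT_MODES = {
    0: "heuristic: refuse only obviously excessive allocations (default)",
    1: "always: never refuse an allocation",
    2: "strict: refuse allocations beyond the commit limit",
}


def describe_overcommit(raw):
    """Decode the text read from /proc/sys/vm/overcommit_memory."""
    mode = int(raw.strip())
    return mode, OVERCOMMIT_MODES.get(mode, "unknown mode")


def commit_headroom_kb(meminfo_text):
    """Return CommitLimit - Committed_AS in kB from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields["CommitLimit"] - fields["Committed_AS"]
```

Feeding it `open("/proc/sys/vm/overcommit_memory").read()` and
`open("/proc/meminfo").read()` before and after a hang would show whether
the box is running close to its commit limit.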

> I've already configured check in bb_event_generator for everything that
> has a dependency, but if the main router stays up

I see.

> (I'm not exactly thrilled about the service our wan provider gives us,
> but this is a €€€ decision from above, and I got 0 to say about it)

Well, we are all committed to saving money ... :-(

After all, no matter how badly the monitored entities behave, the
monitoring application should stay up and keep monitoring.

> I'm afraid I have not found any hints in the syslog.

Not even the (in)famous OOM killer ... that's bad news.

> One thing that might help :
> I've got 8 uxmon-asroot files, I did this to spread the ping tests over
> a bigger time interval so that not all the 400 clients got pinged at
> the same time.

I see, are they all mainly executing the built-in ping test? I'm asking
because I'm searching for something that could consume resources ...
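
The staggering described in the quoted text (400 clients split over 8
files) can be pictured with the sketch below. It is not how uxmon
schedules tests; the function `stagger` and the 300-second interval in the
usage note are assumptions chosen for illustration.

```python
def stagger(hosts, groups, interval_seconds):
    """Split hosts round-robin into groups and give each group a start offset.

    Returns a list of (offset_seconds, hosts_in_group) tuples, with the
    offsets spread evenly across interval_seconds.
    """
    step = interval_seconds / groups
    schedule = []
    for g in range(groups):
        members = hosts[g::groups]   # every groups-th host, starting at g
        schedule.append((g * step, members))
    return schedule
```

For 400 hosts, 8 groups and an assumed 300-second check interval, each
group holds 50 hosts and a new group starts every 37.5 seconds, so at most
50 pings are launched at once instead of 400.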

> I'm also going to build a new bigsister server to make sure it's not a
> hardware thingy (was a to-do anyway).

Ok, this will certainly take some time and does not necessarily fix the
problem, I'm afraid.

Best regards,
Tom
----------------------------------------------------------------------------
Thomas Aeby, Kirchweg 52, 1735 Giffers, Switzerland, Tel: (+41)264180040
Internet: suppressed                       PGP public key available
----------------------------------------------------------------------------



