On Mon, Feb 19, 2007 at 01:36:14AM +0100, tomasz abramowicz wrote:
> add your uxmon file so we can have a look.

Hopefully it won't be needed, since I've kinda figured out what's going on.

> is the uxmon testing http from the bs server
> or is it running on the webserver?

From the bs server:

localhost(tf1) url=http://xxx.xxx.230.218:8888 realhttp
localhost(tf2) url=http://xxx.xxx.230.218:8888 realhttp

> try checking if the agents are getting their info through.
>
> telnet from the machine with the uxmon testing the webserver to the
> webserver. (obviously: use the address/port you have used in your uxmon;
> if you used the IP as opposed to the hostname of the webserver,
> telnet using the IP to avoid DNS-related problems.)
> did it work? if it did, then try telnetting from the uxmon testing the
> webserver to your bsserver, using the bs port (usually 1984), unless
> you are ssh tunneling your agents' data.

Yes, those worked.

> provide more info on your problem and system.

OK, so the 2 boxes above, tf1 and tf2, are running Red Hat cluster server,
and the Apache server is the clustered service, serving files off a
GFS-mounted partition (shared storage between the 2 boxes).

So here's what happened. tf1 disappeared (I still don't know what happened
to him). tf2 took over and tried to fence tf1, and was unable to because of
a mistake on my part. tf2 didn't have exclusive control over the GFS
partition, and so the "df" command was hanging. This is when my Big Sister
monitoring went haywire, reporting http on every host as down.

Anyway, I fixed the fence problem on tf2, let it fence tf1, then I was able
to mount the GFS partition on tf2, and now Big Sister on my server is
reporting all the hosts correctly. Weird, huh? ;)

jason

> suppressed wrote:
> >I just dumped Big Brother about a month ago for Big Sister 1.02 on all
> >hosts. I've got it running on one server and 5 clients. It was running
> >fine for about 3 weeks and now it's kind of lost its mind. It's reporting
> >things that aren't true, i.e. for EVERY host that I'm monitoring http, it
> >says they are all down (red) when in fact they are all fine. I've tried
> >restarting the processes but that doesn't seem to help. The weird error
> >message I see when I click on the http red ball icon is
> >
> >http://monsterjam.org/index2.html header
> >
> >Content-Type: text/plain
> >Client-Date: Sat, 17 Feb 2007 02:18:58 GMT
> >Client-Warning: Internal response
> >
> >Any ideas what's causing this? I tried looking at the logs but can't find
> >anything useful.
> >
> >Jason

_______________________________________________
Bigsister-general mailing list
suppressed
https://lists.sourceforge.net/lists/listinfo/bigsister-general
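For anyone who would rather script the two telnet checks suggested in the thread (monitor to webserver, then monitor to the Big Sister server), here is a minimal sketch in Python. It only tests whether a TCP connection opens, which is roughly what the telnet test tells you. The function names `can_connect` and `check_all` and the host names in the usage note are hypothetical; port 1984 is the "usual" bs port mentioned above, and 8888 is the port from the posted uxmon lines.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and connects; the `with` block
        # closes the socket again as soon as the connection has succeeded.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


def check_all(targets, timeout: float = 5.0) -> dict:
    """Check a list of (host, port) pairs and map each pair to True/False."""
    return {(host, port): can_connect(host, port, timeout) for host, port in targets}
```

Something like `check_all([("webserver.example", 8888), ("bsserver.example", 1984)])` would be run from the machine where uxmon does the http test. As the telnet advice says, use the IP rather than the hostname if that is what the uxmon config uses, so DNS problems don't muddy the result. Note that a successful connect only shows the port is reachable; it would not have caught the hanging "df" on the GFS partition described in this thread.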