On Wednesday, August 24, 2005 5:57 PM, suppressed wrote:
Quoting John1 (suppressed):
On Wednesday, August 24, 2005 2:45 PM, suppressed wrote:
Quoting John1 (suppressed):
On Wednesday, August 24, 2005 2:29 AM, suppressed wrote:
Consequently the addr_ctr/IP file will keep counting up unless there is a *gap* of greater than "limit robot_expire" before a new session id is requested by the same IP address.
Yes, this is correct.
i.e. So if you use "Limit robot_expire 0.05", provided there are at least 2 requests per hour for a new session id from the same IP address the addr_ctr/IP file will keep counting up forever.
Well, until it locks someone out for an hour.
Except it is highly likely to be a lot longer than an hour (possibly indefinitely) if the IP in question is a large ISP's proxy server (using NAT as do NTL and AOL in the UK - 2 of the biggest ISPs in the UK). Has anybody any idea why AOL operate these NAT proxies?
Should not happen. Since you don't assign a new session, and the counter gets incremented only at that time, after an hour of no new session you can get one.
But, what I am saying is that it appears that all UK AOL customers appear at our server on only a handful of IP addresses, i.e. the IP addresses of their proxy servers (and similarly for the major cable operator NTL). I don't know why they don't use a standard pass-through proxy server approach, but they don't seem to. Indeed, our web stats always list AOL and NTL proxy servers as the most popular visitors by IP address. So this does mean that our server is being asked to hand out maybe hundreds of session ids per hour to the same IP address (i.e. the proxy server's IP).
If RobotLimit is set to 500, then whilst it may take a little while for the 500 to be reached, once it has been reached the shutter comes down and the count_ip code operates like a latch as only *one* new session id per hour is required to *keep* the latch closed, not 500!
You can't get a new session after you are locked out -- if you can, there is an error in the code.
Ah, OK, I now realise that I had made one wrong assumption. I was thinking that the mtime would still be updated even when the request for a new sessionid was denied. But, I now understand that mtime will remain untouched during the lockout period. So I now see that the lockout should end after time robot_expire.
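The behaviour agreed on above can be sketched as a small model. This is only an illustration of the logic as described in this thread, not Interchange's actual count_ip code: the names `IpCounter` and `request_new_session` are hypothetical, the `>=` comparison against the limit is a guess, and robot_expire is assumed to be in days (inferred from 0.002 being described as about 3 minutes).

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class IpCounter:
    """Hypothetical stand-in for one addr_ctr/IP file."""
    count: int = 0
    mtime: float = 0.0  # time of the last *granted* new session


def request_new_session(ctr, now, robot_limit, robot_expire_days):
    """Return True if a new session id is granted at time `now` (seconds)."""
    expire_seconds = robot_expire_days * SECONDS_PER_DAY
    # A gap longer than robot_expire since the last grant resets the counter.
    if ctr.count > 0 and now - ctr.mtime > expire_seconds:
        ctr.count = 0
    # Locked out: deny, and leave mtime untouched so the lockout can expire.
    if ctr.count >= robot_limit:
        return False
    # Granted: only now is the counter incremented and mtime updated.
    ctr.count += 1
    ctr.mtime = now
    return True
```

Because a denied request does not touch `mtime` in this model, repeated requests during the lockout cannot extend it; the first request made more than robot_expire after the last grant succeeds and starts the count again from 1.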
And also note that RobotLimit 500 doesn't actually require traffic of 500 per hour for addr_ctr/IP to eventually reach 500. All that is needed is at least *one* new session id per hour provided that it never drops below *one* new session id per hour for the number of hours it takes to reach a count of 500.
Looking at it, it may indeed be less than ideal. Perhaps someone can suggest an algorithm -- nothing clean and correct comes to my mind (new file every day, counting down instead of up if time > Limit->robot_expire * .1, etc.). In the interim, I would think Limit robot_expire 0.002 would work in all but the most extreme cases, where again I suggest you need more than RobotLimit to defend you from the onslaught.
That's a fair point. I hadn't given any thought to the use of Limit robot_expire with very small values. A value of 0.002 would mean that addr_ctr/IP would be deleted if there were no accesses from the same IP for 3 minutes.
Not no accesses, no new sessions.
Oh yes, that's what I meant :-)
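The arithmetic behind these figures can be checked with a short sketch. It assumes robot_expire is expressed in days (which is what makes 0.002 come out at roughly 3 minutes), that new-session requests arrive evenly spaced, and that lockout starts once the count equals RobotLimit; the function names are hypothetical.

```python
SECONDS_PER_DAY = 86400


def expire_minutes(robot_expire_days):
    """Convert a robot_expire value (assumed to be in days) to minutes."""
    return robot_expire_days * SECONDS_PER_DAY / 60


def hours_until_lockout(interval_seconds, robot_limit, robot_expire_days):
    """Hours until the first denied request for evenly spaced requests.

    Returns None when the spacing exceeds robot_expire, because the
    counter is then reset before every request and never accumulates.
    """
    if interval_seconds > robot_expire_days * SECONDS_PER_DAY:
        return None
    return robot_limit * interval_seconds / 3600


print(expire_minutes(0.002))                  # a little under 3 minutes
print(hours_until_lockout(3600, 500, 0.05))   # one request per hour, RobotLimit 500
print(hours_until_lockout(3600, 500, 0.002))  # same traffic, short expiry
```

Under these assumptions, one new session per hour with robot_expire 0.05 still reaches a RobotLimit of 500 after about 500 hours, whereas with 0.002 the same traffic never accumulates at all.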
I guess that would work most of the time, as I suppose that in the middle of the night (if not during the day) requests for new session ids are likely to drop below this level at least once, and therefore the addr_ctr/IP file will be deleted at least once every 24 hours. At the same time I suppose a 3 minute expiry limit is long enough to provide protection against unrecognised and unruly robots causing lots of new sessions to be spawned in quick succession - I guess this would tend to happen over a timeframe of seconds rather than minutes, so the 3 minutes should be sufficient to mitigate this. Is this assumption correct? Do I understand the issue of runaway robots correctly?
That is why I think it is probably good enough. In fact, so good that I may just pick 0.003 as the new value to put in the foundation catalog.cfg.
Great - I am glad something useful has come out of our discussion :-)
Thanks for your help and suggestions Mike - you've persuaded me to put RobotLimit back to 100 from 0 but this time with a "Limit robot_expire 0.002"
I will let you know if I see any "Too many new ID assignments" reappear in the error log :-)