Re: [ic] HELP - Inktomisearch stuck on ord/basket.html page


> Banning spiders from the basket is your best bet.

While robots.txt is great, not all robots follow it. As far as I'm
concerned, I'd rather have more control by banning them directly than
hope the bots will do the work for me after they read robots.txt.

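To do it with rewrites you can use something like the following (in an
httpd.conf <VirtualHost ..> section or in .htaccess). The RewriteCond
must stay on a single line, even if your mail client wraps it. The /?
in the RewriteRule makes the leading slash optional, because the path
keeps it in <VirtualHost> context but has it stripped in .htaccess:

RewriteEngine  On
RewriteCond %{HTTP_USER_AGENT} ^.*(bot|pider|Googlebot|FAST|Sidewinder|T-Rex|Architext|Backrub|Gulliver|Slurp|ZyBorg|Yahoo|Scooter|msnbot|mozDex|psbot|ia_archiver|Wotbox|Teoma|Gigablast|Testingbot).*
RewriteRule ^/?ord/basket\.html -  [F,L]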

Beware that this is a broad net: any user agent with "bot" or "pider"
in its name will get a 403 Forbidden on the basket page.  To block just
Inktomi's spider, put only Slurp in the RewriteCond line, like
RewriteCond %{HTTP_USER_AGENT}      ^.*(Slurp).*
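
Put together, a Slurp-only variant might look like the sketch below.
It rests on two assumptions of mine: the bare pattern Slurp is used in
place of ^.*(Slurp).* (mod_rewrite patterns are unanchored, so both
match the same user agents), and the rule is written with /? so it can
sit in either <VirtualHost> or .htaccess. Adjust the path if your
basket page is served under a different URL.

```apache
RewriteEngine  On
# Match only user agents containing "Slurp" (Inktomi's crawler)
RewriteCond %{HTTP_USER_AGENT}  Slurp
# Answer 403 Forbidden for the basket page and stop processing rules
RewriteRule ^/?ord/basket\.html  -  [F,L]
```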

**Disclaimer**  Using rewrites is great, but they can break all sorts
of things, so test, re-test, and do your research before you implement.
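
One way to see what the rules are doing while you test is to turn on
mod_rewrite's own logging. This is only a sketch: which directive
applies depends on your Apache version, the log path is a made-up
example, and these directives belong in httpd.conf rather than in
.htaccess. Turn the logging back off afterwards, since it is verbose.

```apache
# Apache 2.4 and later: rewrite tracing goes to the error log
LogLevel alert rewrite:trace3

# Apache 2.2 and earlier use these two directives instead
# (the log path is a hypothetical example):
# RewriteLog      "/var/log/apache/rewrite.log"
# RewriteLogLevel 3
```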

Bryan