
Re: [ic] search engine indexing scan/ MM=0f73bb47ac44f4e422.....


On Fri, 16 Jun 2006, Jon wrote:

I just noticed that Google is reindexing our site after the upgrade to IC 5.4

Among the normal results are some of these:

www.mrlock.com/eshop/locks/scan/
MM=0f73bb47ac44f4e422fab7057f73d0c0:250:299:50.html?mv_more_ip=1&mv_n...
- 58k -
<http://64.233.187.104/search?q=cache:1wYAQPUy_g4J:www.mrlock.com/eshop/locks/scan/MM%3D0f73bb47ac44f4e422fab7057f73d0c0:250:299:50.html%3Fmv_more_ip%3D1%26mv_nextpage%3Dresults%26mv_arg%3D+cat+60+lock&hl=en&gl=us&ct=clnk&cd=3-
<http://www.google.com//search?hl=en&lr=&q=related:www.mrlock.com/eshop/locks/scan/MM%3D0f73bb47ac44f4e422fab7057f73d0c0:250:299:50.html%3Fmv_more_ip%3D1%26mv_nextpage%3Dresults%26mv_arg%3D>Similar
pages

If I click on the link on the Google site, it returns nothing, but
if I click on their cached page, it does show the result the spider
obtained originally.

Those appear to be the timed-build pages, I think. I see the same thing on my site wherever there is page forward/back navigation via the [more-list] tag; there is some magic under there somewhere. When Google crawls and picks up those pages they exist, but when you click the links later the page is gone because, I assume, it has expired and needs to be created again on the fly. How to circumvent this for Google in particular I do not know, but I wish I did, since I have the same problem. I think this was discussed and explained some time ago, but I have not been able to find it in the archives.

I haven't done this before, but it should work:

Set up your RobotUA etc. to detect Googlebot (detection is on by default). That sets the CGI variable mv_tmp_session whenever the visitor is a robot.

On the page where you're using [more-list], set the matchlimit to a very big number for robots, e.g. ml=10000, so that all the results fit on one page. Then when a search engine indexes the page, it gets all the content at once and never sees more-list links that will stop working later.
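
To make the idea concrete, here is a rough sketch of the logic in Python rather than Interchange tags. It is not Interchange code and not tested against a real catalog. The names (ROBOT_UA_PATTERNS, is_robot, match_limit_for, paginate) are made up for illustration, the user-agent list is only an example, and the normal page size of 50 is an assumption based on the ":50" in the URL quoted above.

```python
# Illustrative only: hypothetical names, not part of Interchange.

# Example user-agent substrings, standing in for a RobotUA-style list.
ROBOT_UA_PATTERNS = ["Googlebot", "Slurp", "msnbot"]

DEFAULT_MATCH_LIMIT = 50     # assumed normal page size
ROBOT_MATCH_LIMIT = 10000    # "very big number" so everything fits on one page


def is_robot(user_agent):
    """Return True if the user agent contains any known robot substring."""
    ua = user_agent.lower()
    return any(pattern.lower() in ua for pattern in ROBOT_UA_PATTERNS)


def match_limit_for(user_agent):
    """Pick the match limit: huge for robots, normal for everyone else."""
    return ROBOT_MATCH_LIMIT if is_robot(user_agent) else DEFAULT_MATCH_LIMIT


def paginate(results, limit):
    """Split a result list into pages of at most `limit` items."""
    return [results[i:i + limit] for i in range(0, len(results), limit)]


if __name__ == "__main__":
    results = list(range(299))
    for ua in ("Mozilla/5.0 (compatible; Googlebot/2.1)", "Mozilla/5.0 Firefox/1.5"):
        pages = paginate(results, match_limit_for(ua))
        print(ua, "->", len(pages), "page(s)")
```

With a limit that large, a robot gets a single page and never follows links to paged results that expire later, while ordinary visitors still get paged results.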

Jon


--
Jon Jensen
End Point Corporation
http://www.endpoint.com/
_______________________________________________
interchange-users mailing list
suppressed
http://www.icdevgroup.org/mailman/listinfo/interchange-users

