Hi List,

In recent testing of my search engine I realized that while it DOES return a very good result set, it sorts the results poorly. We have a content site that sells images, for example:

    sku      keywords
    sku123   ocean, island, sky, trees, water
    sku124   sky, clouds, blue, day

Let's say I have thousands of records similar to these. The problem arises when someone searches for the term 'sky'. The search pulls both results above, but if I sort by sku, the picture of the island with water and sky (or any number of pictures that merely have sky in them) WILL appear BEFORE a simple, brilliant SKY by itself. That is not good, and it is what happens when the ordering is left to sorting on a field value.

I have been thinking of ways to "weight" the result set. I am not an expert on efficiency or databases. I am using MySQL, but NOT an SQL query, because I am doing full text searches. A pseudo idea would be something like:

    sku      keywords
    sku123   ocean_7, island_9, sky_5, trees_4, water_5
    sku124   sky_10, clouds_10, blue_3, day_2

I have no idea if this is possible, but the above assumes that with substring matching turned on, 'sky' will still be a HIT for both records. Then maybe I could create some custom tf=? or other method of sorting based on the numeric TOTALs of the corresponding _n suffixes for the words matched by the user's search spec.

So with the above, a search for 'sky' will still return both, but the first one shown will be sku124 (because sky=10), followed by sku123 (sky=5). If someone searched for 'sky ocean', both would still be returned, but sku123 would come first (sky+ocean=12) and sku124 second (sky=10). I still want to return both, because a graphic artist can just take the sky from one and the ocean from another, so both are relevant.

I know, I know, this is starting to sound terribly inefficient :) but any normal tf=?,?,? will simply not work well at all for us.
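The weighting idea above can be sketched as a post-search sort. This is only an illustration in Python, not Interchange or Perl code: the function names (parse_keywords, score, rank) and the in-memory records dictionary are hypothetical, the weight of a keyword without a numeric suffix is assumed to be 1, and matching is on the exact word before the underscore rather than on substrings.

```python
# Hypothetical sketch: score each record by summing the weights of the
# keywords that match the user's search terms, then sort by that total.

def parse_keywords(field):
    """Turn 'sky_10, clouds_10' into {'sky': 10, 'clouds': 10}."""
    weights = {}
    for token in field.split(","):
        token = token.strip()
        if not token:
            continue
        word, _, weight = token.rpartition("_")
        if word and weight.isdigit():
            weights[word.lower()] = int(weight)
        else:
            # Assumption: a keyword without a numeric suffix gets weight 1.
            weights[token.lower()] = 1
    return weights

def score(weights, terms):
    """Sum the weights of the search terms found in one record."""
    return sum(weights.get(term.lower(), 0) for term in terms)

def rank(records, terms):
    """Return (sku, total) pairs for matching records, highest total first."""
    scored = []
    for sku, field in records.items():
        total = score(parse_keywords(field), terms)
        if total > 0:  # keep every record that matched at least one term
            scored.append((sku, total))
    # Sort by descending total; ties fall back to sku order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored

records = {
    "sku123": "ocean_7, island_9, sky_5, trees_4, water_5",
    "sku124": "sky_10, clouds_10, blue_3, day_2",
}

print(rank(records, ["sky"]))           # sku124 (10) before sku123 (5)
print(rank(records, ["sky", "ocean"]))  # sku123 (12) before sku124 (10)
```

Both records are still returned in each case; only their order changes. The same sum-then-sort step could run over whatever result set the full text search hands back, so the search itself would not need to know about the weights.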
As of now I have a general "collection" form to gather the user's search terms, then on my results page I separate out all the search terms and do a nice juicy in-page co=1 search. So basically I do have all the search terms separated out at one point, if that helps.

Any idea on how to go about this? If any consultants have an idea but feel it is way too complicated to share, or for me to handle, then please contact me off-list with the idea, and if it is suitable for what I am doing, we can work out arrangements.

If you have any advice (methods) for me to look into, I would be grateful. I like trying to do this on my own, but realize this may be a complicated one... or not :)

Other ideas may be to base it on how many times the word appears in the record (I don't like this one as it can get ugly). Obviously any method will require a competent user inputting the database info; I do not think there is any escaping that. I think I am pretty safe in assuming any solution will require some sort of post-search (Perl) sorting facility... correct?

Thanks in advance.

Paul

_______________________________________________
interchange-users mailing list
http://www.icdevgroup.org/mailman/listinfo/interchange-users