First of all, I don't know exactly how to explain this, or if this is the right forum, but I'll try. I was wondering if it would be possible to have two databases: one for submitted web pages, and one for the spidered versions of those pages. A user browsing your search engine would only see the non-spidered version, but when a user searches, both databases would be searched and displayed. E.g., when someone adds their web page to the directory, that page is then spidered (all the pages reachable from it are submitted into the second database), but only the page that was originally submitted is displayed normally. I don't know if that's clear enough, but it was just a thought I had, so that someone's site isn't listed 10 times, once for every page they have, because of the spidering.
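The two-database idea above can be sketched with two tables: one for submitted pages (shown when browsing) and one for spidered pages (searched but never listed in the directory). This is a minimal Python/SQLite sketch of the concept, not the poster's actual Perl code; the table and column names (`submitted`, `spidered`, `submitted_id`) are hypothetical.

```python
import sqlite3

# In-memory DB for illustration; a real site would use a file-backed database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Table 1: pages submitted by users -- the only ones shown when browsing.
cur.execute("CREATE TABLE submitted (id INTEGER PRIMARY KEY, url TEXT, title TEXT)")
# Table 2: pages found by the spider, linked back to the submitted page.
cur.execute("CREATE TABLE spidered (id INTEGER PRIMARY KEY, submitted_id INTEGER, "
            "url TEXT, title TEXT)")

cur.execute("INSERT INTO submitted VALUES (1, 'http://example.com/', 'Example Home')")
cur.execute("INSERT INTO spidered VALUES (1, 1, 'http://example.com/about.html', "
            "'About Example')")

def browse():
    """Directory view: submitted pages only, so a site is listed once."""
    return list(cur.execute("SELECT url, title FROM submitted"))

def search(term):
    """Search view: both tables, so deep pages are still findable."""
    like = f"%{term}%"
    return list(cur.execute(
        "SELECT url, title FROM submitted WHERE title LIKE ? "
        "UNION SELECT url, title FROM spidered WHERE title LIKE ?", (like, like)))
```

Browsing lists only `http://example.com/`, while searching for "About" still finds the spidered `about.html` page.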
User (381 posts)
Jul 22, 2000, 10:41 PM
Post #5 of 6
Views: 3503
I just used another few tables. The main one just had the LinkID, the URL, and the first 500 characters, for searching.
Then I added code in search.cgi to search that.
Then, of course, a Perl spider that runs as a server process, so it runs all the time.
The problem is, after about an hour you've got a 200 MB database.
Jerry Su
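The 200 MB problem comes from the spider running continuously; keeping only the first 500 characters of each page, as the reply describes, bounds the per-row cost of the search table. A minimal sketch of that truncation step, with a hypothetical helper name (`make_search_row`), assuming the row layout from the reply (LinkID, URL, first 500 characters):

```python
def make_search_row(link_id, url, page_text):
    """Keep only the LinkID, the URL, and the first 500 characters of the
    page text, so each spidered page adds a small, fixed-size search row."""
    return (link_id, url, page_text[:500])

# A 5000-character page is stored as only its first 500 characters.
row = make_search_row(1, "http://example.com/", "x" * 5000)
```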