I made some modifications to goFetch to get it to write to my validate.db file. It works on my site.
1. Everywhere the script says $db_spider_id_file_name, I replaced it with $db_links_id_file_name.
2. Everywhere the script says $db_spider_name, I replaced it with $db_valid_name.
Ok, so now when it spiders a page it reads how many links there are and updates the validate.db file.
But it isn't that simple. My database is pipe-delimited, so I had to change:

print SPIDER "$ID%%$mytitle%%$myurl%%$mydescrip%%$mykeywords%%$mysize%%$lastupd\n";

to:

print SPIDER "$ID|$mytitle|$myurl|$mydescrip|$mykeywords|$mysize|$lastupd\n";

You also have to match how many fields your database has. I have 17 fields, so I had to change it to:

print SPIDER "$ID|$mytitle|$myurl|$date||$mydescrip|Name Here|your\@email.com|||||||||$date\n";

Also note that $lastupd is changed to $date to produce the right date format. You also have to change:
use HTTP::Date;
$lastupd = time2str($res->last_modified);

to:

$date = &get_date;

(Note: I still get one problem with this, though. The first two links always have the infamous 1969 date!)
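I can only guess at the exact cause of the 1969 dates, but December 31, 1969 is usually the Unix epoch (a timestamp of 0 or undef) rendered in a US timezone, which is what you get when a server sends no usable Last-Modified header. A defensive guard like this, under that assumption, falls back to the current time (safe_timestamp is my own name, not part of the script):

```perl
use strict;
use warnings;

# Hypothetical guard: a zero or undefined timestamp formats as the
# Unix epoch -- Dec 31 1969 in US timezones -- which is the likely
# source of the "1969" dates. Fall back to the current time instead.
sub safe_timestamp {
    my ($t) = @_;
    return ($t && $t > 0) ? $t : time();
}

# In the spider you would guard the value before formatting it, e.g.:
#   my $ts = safe_timestamp($res->last_modified);
```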
C. Now you have to eliminate some characters from the title and description fields.
Beneath:
# Update the counter.
open (ID, ">$db_links_id_file_name") or &cgierr("error in get_defaults. unable to open id file: $db_links_id_file_name. Reason: $!");
flock(ID, 2) unless (!$db_use_flock);
print ID $ID; # update counter.
close ID; # automatically removes file lock
open (SPIDER, ">>$db_valid_name") or &cgierr ("Can't open for output counter file. Reason: $!");
if ($db_use_flock) { flock (SPIDER, 2) or &cgierr ("Can't get file lock. Reason: $!"); }

I added this:
$mydescrip =~ tr/|\n//d;
$mytitle =~ tr/|\n//d;

This removes the | character and line breaks from the title and description. Now it should enter everything into validate.db in the right slots, with no pipes or line breaks to screw up the fields.
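The two fixes above can be rolled into one small helper: strip the delimiter characters and line breaks from every field, pad the list out to the database's field count, and join with pipes. This is a sketch, not part of the script; make_record is my own name, and 17 is my field count from above, so adjust it to match your own .db file.

```perl
use strict;
use warnings;

# Hypothetical helper: sanitize each field, pad to the full field
# count, and join with the pipe delimiter. Set $NUM_FIELDS to match
# your own database layout.
my $NUM_FIELDS = 17;

sub make_record {
    my (@fields) = @_;
    push @fields, '' while @fields < $NUM_FIELDS;
    for (@fields) {
        $_ = '' unless defined $_;
        tr/|\n\r//d;    # delete delimiter chars and line breaks
    }
    return join('|', @fields) . "\n";
}

# Usage would then look something like:
#   print SPIDER make_record($ID, $mytitle, $myurl, $date, '', $mydescrip);
```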
Now I need a page to spider. The way I do it is with a shareware program called UrlSearch. I find a page with links I would like to add, save it to my hard drive, open it in UrlSearch, and eliminate the irrelevant links. Then I save it, upload it to my server, and spider it there. I usually only do 10 or 15 at a time because I still have to validate them to check the title and description. It is not often that both the title and description turn out OK: too many people do not have a description meta tag, and when the script builds a description from the page content it picks up JavaScript and other things it doesn't understand.
Sometimes it is easier to use the bookmarklet tool to add links because you can highlight your description on the page.
Too much rambling now. Anyway, I have it working and am now adding 20 or 30 relevant links to my site each week this way.
Mike
http://www.sweepstalk.com