Gossamer Forum

gofetch

gofetch
Can anyone help? I am using templates and have the goFetch mod installed. What I need is for the goFetch system to spider a site at the moment the user submits it for validation, i.e. when he clicks the Add Resource button on the add-site page.

Also, when a user runs a search, I need it to search both the links database and the goFetch database.

cheers
tron

Re: gofetch
A rather clumsy suggestion is to copy all of the goFetch code into add.cgi as a subroutine. Then, in the sub main routine of add.cgi, add the following code:

Code:

&gofetch;


AFTER the following code:

Code:

&process_form;


Basically, when someone submits their site by clicking the SUBMIT button, the process_form and gofetch subroutines will both be called and executed.
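
Put together, the relevant part of sub main in add.cgi would look roughly like this. This is only a sketch, assuming your add.cgi follows the stock Links 2.0 layout where main either shows the add form or processes the submission; your copy may differ:

Code:

# Sketch only -- adjust to match the actual logic in your add.cgi.
sub main {
    # ... existing setup code ...
    if (keys %in) {              # the add form was submitted
        &process_form;           # existing Links 2.0 handling of the submission
        &gofetch;                # NEW: spider the submitted site straight away
    }
    else {
        &site_html_add_form;     # nothing submitted yet, so show the add form
    }
}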

Regards,

Eliot Lee
Re: gofetch
And how about processing the search in the goFetch database too? Does anyone have any suggestions for that?

Thanks in advance,
Aymeric.

Re: gofetch
There is a Mod posted in the Resource Center that allows you to search across different links databases, including the goFetch database. The Mod was posted about a week and a half ago, and there is a Thread in this forum announcing the new modification.

Regards,

Eliot Lee
Re: gofetch
Cheers, I will give it a go.
Have you any thoughts on the search part?


Cheers
tron

Re: gofetch
Hi Eliot,
I have had a look at what you said below and must tell you I am completely new to CGI scripts.
Is it possible to put it into very easy terms, i.e. how do you set up a subroutine?

Cheers,
tron



-----------------------------------------------------------------------
A rather clumsy suggestion is to copy all of the goFetch code into add.cgi as a subroutine. Then, in the sub main routine of add.cgi, add the following code:

&gofetch;

AFTER the following code:

&process_form;

Basically, when someone submits their site by clicking the SUBMIT button, the process_form and gofetch subroutines will both be called and executed.

Regards,

Eliot Lee
---------------------------------------------------------------------------


Re: gofetch
Like the following:

Code:

sub get_a_perl_book {
#------------------------------------
# Code for the Get a Perl Book subroutine

    # ADD the goFetch code here
}
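
In other words, the general pattern (a minimal sketch, not goFetch-specific) is: define the subroutine once anywhere in the file, then call it by name wherever you need it:

Code:

# Define the subroutine (the name is up to you).
sub gofetch {
    # ... the goFetch code, pasted in here ...
}

# Call it -- this is the &gofetch; line added after &process_form;.
&gofetch;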


Regards,

Eliot Lee
Re: gofetch
Thanks, Eliot, for your time.
I will give it a go.

cheers
Tron


Re: gofetch
Sorry, it's me again.
I've done what you said, but it doesn't seem to pass the URL and email address from the add.cgi part to the goFetch part of the CGI file.

A friend, also very limited with CGI, suggested something like this:

system("/root/temp_site/cgi-bin/links/goFetch.pl URL=$in{'URL'} email=$in{'Contact Email'} submit=Fetch");

Is there any way that we can add this into add.cgi?

cheers
tron


Re: gofetch
In Reply To:
system("/root/temp_site/cgi-bin/links/goFetch.pl URL=$in{'URL'} email=$in{'Contact Email'} submit=Fetch");
It is VERY, VERY dangerous to put system calls in PUBLIC CGI scripts. This was discussed in the Perl/CGI forum a while back. I would recommend NOT using the above code!


Regards,

Eliot Lee
Re: gofetch
Eliot,

About the search: I am using goFetch, and would like users to be able to search the database that is created when sites are spidered.

The only mod I could find that looked similar to what you proposed was 'Multisearch'. But all that does is let users CHOOSE between different Links databases. I want the CGI to search either ONLY the goFetch database, or both at the SAME TIME.

Thanks in advance,
Aymeric.

Re: gofetch
Hi Tron:

To add to what Eliot said: NEVER (under any and all circumstances) use system when you are passing non-hard-coded, user-submitted values to the system function! If you cannot do it with a Perl (or module) function, then move on. And get the word to your friend before he suggests it to anyone else.
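
For what it is worth, if you ever do have to shell out, the LIST form of system() at least keeps the shell itself out of the picture, since the arguments go straight to the program rather than being parsed for metacharacters. This is only a sketch of that pattern; wrapping the goFetch code into a subroutine, as Eliot described, is still the better route here:

Code:

# Sketch only. The LIST form of system() runs the program directly,
# with no shell in between, so shell metacharacters in user input are
# never interpreted. It still hands untrusted data to another program,
# so prefer calling the code as a subroutine inside add.cgi instead.
my @args = (
    '/root/temp_site/cgi-bin/links/goFetch.pl',
    "URL=$in{'URL'}",
    "email=$in{'Contact Email'}",
    'submit=Fetch',
);
system(@args) == 0
    or &cgierr("goFetch.pl failed: $?");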


Dan Cool


Re: gofetch
Thanks to all of you for your help.
For the meantime, until I can learn Perl, I have put a button on add_success.html, so if the user wants to have his site spidered, he can click it.

cheers
tron

Re: gofetch
Alright, PLEASE, I know this is stupid, but HOW CAN ONE SEARCH THE GOFETCH DATABASE?

GoFetch works fine, I have my database created and everything, but I have no way to search through it. Where do I have to change the code in order to point the search at the goFetch database?

Any help would be greatly appreciated,
Thanxxx

M.

Again==> gofetch, PLEASE !!!
Does ANYONE have ANY ideas on how to do this?
(See my last message above this one: searching both databases.)

Thanks a lot, I am really desperate now...
Cheers,
M.

Re: Again==> gofetch, PLEASE !!!
I had a way to incorporate the links from the spider db into the links db, and was working on the admin program, but I became busy. Maybe when I am less busy in the future I'll release the rest of my planned goFetch mod.

Re: Again==> gofetch, PLEASE !!!
Sounds incredible. I guess I will have to wait, since I am so Perl illiterate...
Anyway, if anybody else knows how to search both databases, that would be great. Or even better, how to incorporate the spidering engine into the links database, so that the results are not doubled.

Thanks in advance,
M.

PS: I guess I might have to pay a specialist/installer to do this if I get no answers...

Re: Again==> gofetch, PLEASE !!!
I made some modifications to goFetch to get it to write to my validate.db file. It works on my site.

1. Everywhere in the script it says $db_spider_id_file_name, I replaced that with $db_links_id_file_name.
2. Everywhere in the script it says $db_spider_name, I replaced it with $db_valid_name.

Ok, so now when it spiders a page it reads how many links there are and updates the validate.db file.
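
(If it helps, those two global replacements can be made in one pass from the command line. This is just a convenience sketch, assuming you run it in the directory containing the goFetch script; adjust the filename if yours is goFetch.pl rather than goFetch.cgi. It keeps a backup of the original first.)

Code:

# One-off helper, run from the shell. It edits goFetch.cgi in place,
# keeping the original as goFetch.cgi.bak, and swaps the two variable
# names everywhere they appear.
perl -pi.bak -e '
    s/\$db_spider_id_file_name/\$db_links_id_file_name/g;
    s/\$db_spider_name/\$db_valid_name/g;
' goFetch.cgi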

But it isn't that simple. My database is pipe delimited so I had to change:
print SPIDER "$ID%%$mytitle%%$myurl%%$mydescrip%%$mykeywords%%$mysize%%$lastupd\n";
to:
print SPIDER "$ID|$mytitle|$myurl|$mydescrip|$mykeywords|$mysize|$lastupd\n";

You have to match how many fields you have in your database. I have 17 fields so I had to change it to:
print SPIDER "$ID|$mytitle|$myurl|$date||$mydescrip|Name Here|your\@email.com|||||||||$date\n";
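
(A way to keep the field count straight, if you prefer, is to build the record from a list and join on the delimiter. This is only a sketch using the same variable names as above; the order and number of fields must match your own links.cfg/validate.db layout, 17 fields in this case.)

Code:

# Sketch only: the field order and count must match your database
# layout exactly (17 fields here, same as the print line above).
my @record = (
    $ID, $mytitle, $myurl, $date, '',            # ID, title, URL, date, (empty field)
    $mydescrip, 'Name Here', 'your@email.com',   # description, contact name, contact email
    ('') x 8,                                    # eight more fields left empty for now
    $date,                                       # final date field
);
print SPIDER join('|', @record), "\n";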

Also note that $lastupd is changed to $date to produce the right date format. You also have to change:
use HTTP::Date;
$lastupd = time2str($res->last_modified);

to:
$date = &get_date;

(Note, I still get one problem with this, though: the first two links always have the infamous 1969 date, presumably the Unix epoch showing through when the date value is zero or undefined!)


3. Now you have to eliminate some characters from the title and description fields.
Beneath:
# Update the counter.
open (ID, ">$db_links_id_file_name") or &cgierr("error in get_defaults. unable to open id file: $db_links_id_file_name. Reason: $!");
flock(ID, 2) unless (!$db_use_flock);
print ID $ID; # update counter.
close ID; # automatically removes file lock
open (SPIDER, ">>$db_valid_name") or &cgierr ("Can't open for output counter file. Reason: $!");
if ($db_use_flock) { flock (SPIDER, 2) or &cgierr ("Can't get file lock. Reason: $!"); }


I added this:
$mydescrip =~ tr/|\n//d;
$mytitle =~ tr/|\n//d;

to remove the | character and line breaks from the title and description.

Now it should enter everything into validate.db in the right slots, with no pipes or line breaks to screw up the fields.

Now I need a page to spider. The way I do it is to use a shareware program called UrlSearch. I find a page with links I would like to add, save it to my hard drive, open it with UrlSearch, and eliminate the irrelevant links. Then I save it, upload it to my server, and spider it there. I usually only do 10 or 15 at a time because I still have to validate them to check the title and description. It is not often that both the title and description turn out OK: too many people do not have a meta tag for the description, and when it builds a description from the page content it pulls in JavaScript and other things it doesn't understand. Sometimes it is easier to use the bookmarklet tool to add links, because you can highlight your description on the page.

Too much rambling now. Anyway, I have it working on my site and am now adding 20 or 30 relevant links a week this way.

Mike
http://www.sweepstalk.com

Re: Again==> gofetch, PLEASE !!!
Sorry, I forgot about putting in values for certain fields. This line:

print SPIDER "$ID|$mytitle|$myurl|$date||$mydescrip|Name Here|your\@email.com|||||||||$date\n";

Should be changed to:
print SPIDER "$ID|$mytitle|$myurl|$date||$mydescrip|Name Here|your\@email.com|0|No|No|0|0|No||password|$date\n";

The only field I left blank was Expired. Add values for any other fields you may have, and add any extra fields you may have as well. If you have a field for keywords, you may want to put the $mykeywords variable in for that too.

Also change the length of any fields if you allow more characters.

I changed:
$Rules{'Max Characters: URL'} = 128;
$Rules{'Max Characters: Title'} = 96;
$Rules{'Max Characters: Description'} = 384;

To:
$Rules{'Max Characters: URL'} = 140;
$Rules{'Max Characters: Title'} = 120;
$Rules{'Max Characters: Description'} = 525;

Also I changed:
$Description = substr($Description,0,225);
if (length($Description) > 224) {
$Description =~ s/\s+\S*$/.../;
}

to:
$Description = substr($Description,0,525);
if (length($Description) > 524) {
$Description =~ s/\s+\S*$/.../;
}

I think that's it.

Mike

Re: Again==> gofetch, PLEASE !!!
Thanks a lot Mike, I will give it a try. I have been away for a while, and hadn't seen your reply before this morning.

Cheers,
Aymeric.

Re: Again==> gofetch, PLEASE !!!
After looking through your code, Mike, I realised that it doesn't do what I am looking for. What it does is add the spidered links to validate.db.

I do not care about having those links in my database. What I want to do is MUCH simpler than that:

I just want my search to look through SPIDER.DB rather than the normal database it searches.

That's all I need: HOW can I have the search form look through that database? If there is a way to make it look through both, that's even better.

Also, if anyone knows a way to automate the process of spidering when I validate a link, that would be a plus.

I have been looking through the forums a lot, and really can't find anything specific to what I want to do.

Thanks in advance,
Aymeric.

Re: Again==> gofetch, PLEASE !!!
The Modification you are looking for is linked in the following Thread:

http://www.gossamer-threads.com/...&vc=1#Post112073

Regards,

Eliot Lee
Re: Again==> gofetch, PLEASE !!!
Oops, I let that one pass by without catching it.

Thanks Eliot.
Aymeric.


ps: any ideas on how to execute the goFetch.cgi when I validate links? Would that be hard to do?


Re: Again==> gofetch, PLEASE !!!
Well, thanks Eliot, but again, that's not at all what I am trying to do. Why am I having such a hard time trying to explain this...

On one side I have my Links 2.0 database; on the other, I have spider.db, created by goFetch.cgi. I want the search forms on my pages to search through the goFetch database (spider.db) ONLY, NOT the Links 2.0 db. Or, if they can search through both AT THE SAME TIME, that's fine too.

Thanks a lot,
Aymeric.

Re: Again==> gofetch, PLEASE !!!
Uh... the Mod I linked searches through MULTIPLE databases! So you can search either spider.db or links.db (though NOT at the same time). WHY use goFetch.cgi when the search routines in search.cgi are already there??? All you have to do is apply the Mod I linked, and it will allow people to search either database.

NOW, if you want COMBINED results, then I would suggest adding the AltaVista Search Mod, using the 'both' parameter, which will print results from both databases!

GOT IT!?!?!?!?!?!

Regards,

Eliot Lee