Gossamer Forum

Duplication check on Title not URL?

Hi,

Is there any way of checking for duplicate records by searching titles instead of URLs? Please include a solution on this thread.

Thanks in advance.

JeffB

Re: Duplication check on Title not URL?
Putting this one back at the top, as I think it's a much-needed mod/hack.

JeffB

Re: Duplication check on Title not URL?
I provided Ian C., another Links user, some suggestions for doing this in a Thread in the Links 2.0 Discussion Forum about four months ago.

Regards,

Eliot Lee
Re: Duplication check on Title not URL?
Thanks but I can't find it. I did searches for Ian C and duplicate but nothing came up.

What was his username?

Re: Duplication check on Title not URL?
Guess that's the thread you were looking for: http://www.gossamer-threads.com/...ew=&sb=&vc=1

Thomas
http://www.japanreference.com
Re: Duplication check on Title not URL?
I found the code you suggested:

Code:
sub check_duplicates {
# --------------------------------------------------------
# This routine searches through the database and pulls up sets
# of links that have the same domain.
#
    my (@values, %seen, %doubles, $url, $count);

    open (DB, "<$db_links_name") or &cgierr("error in check_duplicates. unable to open db file: $db_links_name. Reason: $!");
    LINE: while (<DB>) {
        (/^#/) and next LINE;
        (/^\s*$/) and next LINE;
        chomp;
        @values = &split_decode($_);
        $values[$db_url] =~ s,/$,,;
        $values[$db_title] =~ s,/$,,;
        $seen{$values[$db_url]}++;
        $seen{$values[$db_title]}++;
        push (@{$doubles{$values[$db_url]}}, $values[$db_key_pos], $values[$db_title], $values[$db_category]);
    }
    close DB;
    while (($url, $count) = each %seen) {
        ($count < 2) and delete $doubles{$url};
    }
    &html_check_duplicates (%doubles);
}
The changes are in bold. I tried this, but it still came up with every record in the database. Basically, none of my links has a URL, so they are all set to "http://". I want to check duplication on link title only.

Any help would be great. I know a little Perl but not enough to fix this!
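
The code above counts titles in %seen but still keys %doubles by URL, so the single "http://" group is never pruned and ends up holding every record. Here is a minimal standalone sketch of that effect. The sample records and the sub name groups_keyed_by_url are made up for illustration and are not part of Links 2.0.

```perl
use strict;
use warnings;

# Hypothetical sample data: [ID, Title, URL]; every URL is the "http://" placeholder.
my @records = (
    [1, 'Alpha', 'http://'],
    [2, 'Beta',  'http://'],
    [3, 'Alpha', 'http://'],
);

# Mimics the attempt above: titles are counted, but groups are still keyed by URL.
sub groups_keyed_by_url {
    my @recs = @_;
    my (%seen, %doubles);
    for my $rec (@recs) {
        my ($id, $title, $url) = @$rec;
        $seen{$url}++;                    # URL count
        $seen{$title}++;                  # title count, in the same hash
        push @{ $doubles{$url} }, $id;    # grouping key is the URL only
    }
    # Same pruning idea as in check_duplicates: drop keys seen fewer than twice.
    for my $key (keys %seen) {
        delete $doubles{$key} if $seen{$key} < 2;
    }
    return %doubles;
}

my %doubles = groups_keyed_by_url(@records);
print "$_ => @{ $doubles{$_} }\n" for sort keys %doubles;
# prints: http:// => 1 2 3
```

Because "http://" occurs on every record, its count is always 2 or more, so the whole database is reported as one big set of duplicates.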

Re: Duplication check on Title not URL?
You will have to change the following line:

Code:

my (@values, %seen, %doubles, $url, $count);


to the following:

Code:

my (@values, %seen, %doubles, $url, $title, $count);


Then add the following line:

Code:

push (@{$doubles{$values[$db_title]}}, $values[$db_key_pos], $values[$db_title], $values[$db_category]);


AFTER the following line:

Code:

push (@{$doubles{$values[$db_url]}}, $values[$db_key_pos], $values[$db_title], $values[$db_category]);


Then add another loop:

Code:

while (($title, $count) = each %seen) {
($count < 2) and delete $doubles{$title};
}


AFTER the following loop:

Code:

while (($url, $count) = each %seen) {
($count < 2) and delete $doubles{$url};
}


I don't guarantee these modifications will work, just as I did not guarantee that the original code hacks would work.




Regards,

Eliot Lee
Re: Duplication check on Title not URL?
Thanks for your help. The solution doesn't work, but it helped me work it out! For anyone who's interested, here is the solution:

Code:

sub check_duplicates {
# --------------------------------------------------------
# This routine searches through the database and pulls up sets
# of links that have the same title.
#
    my (@values, %seen, %doubles, $title, $count);

    open (DB, "<$db_links_name") or &cgierr("error in check_duplicates. unable to open db file: $db_links_name. Reason: $!");
    LINE: while (<DB>) {
        (/^#/) and next LINE;       # skip comment lines
        (/^\s*$/) and next LINE;    # skip blank lines
        chomp;
        @values = &split_decode($_);
        # URL checks are disabled: every record's URL is just "http://".
        # $values[$db_url] =~ s,/$,,;
        # $seen{$values[$db_url]}++;
        $values[$db_title] =~ s,/$,,;
        $seen{$values[$db_title]}++;    # count how often each title occurs
        push (@{$doubles{$values[$db_title]}}, $values[$db_key_pos], $values[$db_title], $values[$db_category]);
    }
    close DB;
    # Keep only the titles that occur more than once.
    while (($title, $count) = each %seen) {
        ($count < 2) and delete $doubles{$title};
    }
    &html_check_duplicates (%doubles);
}
Thanks to all who helped.

JeffB
http://www.celebhoo.com
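
For anyone who wants to try the title-only idea outside Links 2.0, here is a minimal sketch that groups record IDs by title and keeps only titles used two or more times. The routine above compares titles exactly. The normalize_title helper here (trim whitespace, ignore case) is an optional assumption added for illustration, and the sub names and sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical normalizer, not part of Links 2.0: trims whitespace and ignores case.
sub normalize_title {
    my ($title) = @_;
    $title =~ s/^\s+|\s+$//g;
    return lc $title;
}

# Returns a hash of normalized title => [IDs] for titles used by two or more records.
sub duplicate_titles {
    my @recs = @_;
    my %ids_by_title;
    for my $rec (@recs) {
        my ($id, $title) = @$rec;
        push @{ $ids_by_title{ normalize_title($title) } }, $id;
    }
    # keys() returns a fresh list, so deleting inside this loop is safe.
    for my $title (keys %ids_by_title) {
        delete $ids_by_title{$title} if @{ $ids_by_title{$title} } < 2;
    }
    return %ids_by_title;
}

# Hypothetical sample data: [ID, Title].
my @records = ([1, 'Alpha'], [2, 'Beta'], [3, ' alpha '], [4, 'Gamma']);
my %dupes = duplicate_titles(@records);
print "$_ => @{ $dupes{$_} }\n" for sort keys %dupes;
# prints: alpha => 1 3
```

To keep the exact-match behaviour of the routine above, have normalize_title return its argument unchanged.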

Re: Duplication check on Title not URL?
Unfortunately, your code will check only the TITLE, NOT both URL and Title.

Good luck!

Regards,

Eliot Lee
Re: Duplication check on Title not URL?
That's exactly what I wanted it to do! Like I kept saying - we have no URLs in our database so it is pointless checking duplicate URLs.

Good Luck!

Jeffb

Re: [Stealth] Duplication check on Title not URL?
In Reply To:
Unfortunately, your code will check only the TITLE, NOT both URL and Title.

Good luck!

Regards,

Eliot Lee




OK, I figured out a way to check both TITLE and URL for dupes.

In db.pl, sub check_duplicates, Just after
$seen{$values[$db_url]}++;
you must add
$seen2{$values[$db_title]}++;
push (@{$doubles2{$values[$db_title]}}, $values[$db_key_pos],$values[$db_title], $values[$db_category]);

Then
After
while (($url, $count) = each %seen) {
($count < 2) and delete $doubles{$url};
}
You must add
while (($title, $count) = each %seen2) {
($count < 2) and delete $doubles2{$title};
}
@doubles2=%doubles2;   # global copy, read later in html_check_duplicates


In admin_html.pl, sub html_check_duplicates, Just after
my %duplicates = @_;
You must add
my %duplicates2=@doubles2;

The line:
if (!%duplicates) {
Should be changed to
if (!%duplicates && !%duplicates2) {

Then
After

print qq~<input type=checkbox name="$id" value="delete"> <$font>
(<a href="$db_script_url?db=links&view_records=1&$db_key=$id&ww=1" target="_blank">$id</a>) $title in <em>$cat</em><br>~;
}
print qq~</td></tr>~;
}

You must add

foreach (keys %duplicates2) {
    print qq~<tr><td colspan=2><$font><b>$_</b></font></td></tr>
    <tr><td>&nbsp; &nbsp;</td>
    <td>
    ~;
    for ($i = 0; $i <= $#{$duplicates2{$_}}; $i = $i + 3) {
        my $id = ${$duplicates2{$_}}[$i];
        my $title = ${$duplicates2{$_}}[$i + 1];
        my $cat = ${$duplicates2{$_}}[$i + 2];
        print qq~<input type=checkbox name="$id" value="delete"> <$font>
        (<a href="$db_script_url?db=links&view_records=1&$db_key=$id&ww=1" target="_blank">$id</a>) $title in <em>$cat</em><br>~;
    }
    print qq~</td></tr>~;
}


That's it. This is assuming you haven't tinkered with the default Links 2.0 installation.

Cheers

Chinthaka Weerasinghe 11599119
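
The two-hash approach above (%doubles for URLs, %doubles2 for titles) can also be written as one grouping routine that is called once per field. This is only a sketch, run against in-memory sample data rather than the Links 2.0 database file. The sub name duplicates_by_column, the column positions, and the records are all hypothetical.

```perl
use strict;
use warnings;

# Groups record IDs (column 0) by the value in column $col and keeps only
# values shared by two or more records. Returns a hash reference.
sub duplicates_by_column {
    my ($col, @recs) = @_;
    my %groups;
    push @{ $groups{ $_->[$col] } }, $_->[0] for @recs;
    for my $value (keys %groups) {
        delete $groups{$value} if @{ $groups{$value} } < 2;
    }
    return \%groups;
}

# Hypothetical sample data: [ID, Title, URL].
my @records = (
    [1, 'Alpha', 'http://a.example/'],
    [2, 'Beta',  'http://a.example/'],
    [3, 'Alpha', 'http://c.example/'],
);

# One independent report per field, so a duplicate in either field is listed.
my %report = (
    Title => duplicates_by_column(1, @records),
    URL   => duplicates_by_column(2, @records),
);
for my $field (sort keys %report) {
    for my $value (sort keys %{ $report{$field} }) {
        print "$field '$value': IDs @{ $report{$field}{$value} }\n";
    }
}
```

Keeping one hash per field is what stops a URL count from interfering with a title count. In principle the same call works for any other column by passing its position.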

Last edited by Chinthaka: Jun 28, 2002, 10:03 PM
Re: [Chinthaka] Duplication check on Title not URL?
How can I check for duplicates on both URL and email?