Gossamer Forum

Danish encoding :/

Hi,

I'm trying to do a Danish import from DMOZ, and converting it via Unicode::MapUTF8, using this code;

Code:
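use Unicode::MapUTF8 qw(from_utf8);

# Read the DMOZ dump and write a converted copy alongside it.
open (CONTENT, "<", "/usr/home/buhuu.dk/cgi-bin/admin/content.rdf.u8.2") || die $!;
open (WRITEIT, ">", "/usr/home/buhuu.dk/cgi-bin/admin/content.rdf.u8") || die $!;
while (<CONTENT>) {
    # Only touch lines that contain high-bit (non-ASCII) bytes.
    if (/[\200-\377]/) {
        # from_utf8 converts $1 from UTF-8 to the encoding named by -charset.
        s/([\200-\377]+)/from_utf8({ -string => $1, -charset => 'UTF-8'})/eg;
    }
    print WRITEIT $_;
}
close(WRITEIT);
close(CONTENT);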

However, I can't seem to find a valid character encoding to convert to :/

I've checked the w3.org website, and a ton more, but am still no closer to finding the answer :(

TIA for any suggestions.

Andy (mod)
andy@ultranerds.co.uk
Want to give me something back for my help? Please see my Amazon Wish List
GLinks ULTRA Package | GLinks ULTRA Package PRO
Links SQL Plugins | Website Design and SEO | UltraNerds | ULTRAGLobals Plugin | Pre-Made Template Sets | FREE GLinks Plugins!
Re: [Andy] Danish encoding :/ In reply to
*bump* :(

Andy (mod)
Re: [Andy] Danish encoding :/ In reply to
Andy,

Try ISO-8859-1 as the target charset. That said, UTF-8 is the officially recommended encoding.

Mind you, I'm struggling with character sets myself ...

John
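
For anyone landing here later, a minimal sketch of the ISO-8859-1 suggestion. It uses Perl's core Encode module rather than Unicode::MapUTF8 (a swap made only so the snippet runs without extra installs). The helper name utf8_to_latin1 and the sample word are made up for illustration. ISO-8859-1 does cover the Danish letters æ, ø and å.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: convert a string of UTF-8 bytes to ISO-8859-1 bytes.
# Characters that do not exist in ISO-8859-1 are replaced by Encode with a
# substitution character instead of raising an error.
sub utf8_to_latin1 {
    my ($utf8_bytes) = @_;
    my $chars = decode('UTF-8', $utf8_bytes);   # bytes -> Perl characters
    return encode('ISO-8859-1', $chars);        # characters -> Latin-1 bytes
}

# "Købmand" as raw UTF-8 bytes: the ø is the two bytes C3 B8.
my $utf8   = "K\xC3\xB8bmand";
my $latin1 = utf8_to_latin1($utf8);

# Print the converted bytes as hex so the result is visible in any terminal.
print join(' ', map { sprintf '%02X', ord } split //, $latin1), "\n";
```

In the loop from the first post, the same idea would presumably mean passing -charset => 'ISO-8859-1' to from_utf8 instead of 'UTF-8'.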
Re: [gotze] Danish encoding :/ In reply to
Hi,

Managed to get it working in GLinks 3.0.4. Must have been a bug in pre-glinks versions :|

Charsets are such a PITA :(

Cheers

Andy (mod)
Re: [Andy] Danish encoding :/ In reply to
Just noticed this post. I work a lot with European and other languages. Watch which character encoding you use, because search engines pick up some easily and others not at all.

The best way to check is to look the page up on Google and then ask for a translated version of it. If you get the translation, the encoding works. (This doesn't cover all languages, but you can also check via FreeTranslation.)