Gossamer Forum

Net::POP3 Plain Text

Hi

I've been working with Net::POP3 lately to retrieve e-mails and store them in a database. However, some of these emails are not plain text. Sigh. Some are HTML, which I just delete outright, and others contain some odd encoding characters. Here's an example:

Code:
1. Constitution

=A0

Please find attached the second draft of the constitution which=20
incorporates your amendments to the first draft. The deadline for=20
comments is 5pm Wednesday 21 January, unless there is a request to=20
extend this, by which date the draft will have been reviewed by=20
solicitors. Their comments will be included in (a) further draft(s) to=20=

be circulated before the next WHEILA meeting, when there will be a vote=20=

on disbanding WHEILA and adopting the new constitution for WIC.

=A0

How could I strip that back into plain text? Is there a module for doing this, or even an extension to Net::POP3?

Cheers

- wil
Re: [Wil] Net::POP3 Plain Text
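I got rid of those silly characters using MIME::QuotedPrint. They turned out to be quoted-printable escapes (=20 is a space, and a trailing = is a soft line break).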
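
For anyone following along, here is a minimal sketch of what that decoding step does, using two lines from the sample in the first post. It assumes the text is ISO-8859-1, which the sample does not actually state; the variable names are just for illustration.

```perl
use strict;
use warnings;
use MIME::QuotedPrint;   # exports decode_qp by default

# Two lines from the sample above, as they arrive from the server.
my $encoded = "comments will be included in (a) further draft(s) to=20=\n"
            . "be circulated before the next WHEILA meeting\n";

# "=20" becomes a space and the trailing "=" (a soft line break) is removed,
# so the two physical lines are joined back into one.
my $decoded = decode_qp($encoded);

# "=A0" decodes to the single byte 0xA0, a non-breaking space in ISO-8859-1
# (assumed charset). Map it to an ordinary space if plain ASCII is wanted.
(my $spacer = decode_qp("=A0\n")) =~ tr/\xA0/ /;

print $decoded;
```

Note that decode_qp only makes sense on a body that really is quoted-printable encoded; running it over other content can mangle any literal "=" followed by two hex digits.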

My next question: if a message is multipart (plain text plus HTML), how can I pick out only the plain-text part of the email? Here's my code so far.

Code:
my $self = shift;

my $dbh = $self->param('dbh');

use Net::POP3;
use Mail::Header;
use Date::Manip;
use MIME::QuotedPrint;

my $pop = Net::POP3->new('mail.localhost', Timeout => 60)
    or die "Can't connect to POP3 server";

my $username = "foo";
my $password = "bar";

my @messages;

# login returns the message count ("0E0" for an empty mailbox), undef on failure
if ($pop->login($username, $password) > 0) {

    my $msgnums = $pop->list; # hashref of msgnum => size

    foreach my $msgnum (keys %$msgnums) {

        my %row;

        my $msg = $pop->get($msgnum); # arrayref of lines

        my $parsedhead = Mail::Header->new($msg);
        chomp(my $subject = $parsedhead->get('Subject'));
        chomp(my $date = $parsedhead->get('Date'));

        $date = UnixDate($date, "%Y-%m-%d"); # %d zero-pads the day; %e pads with a space

        $row{id} = $msgnum;
        $row{email} = join("", @$msg);
        $row{email} = decode_qp($row{email}); # decodes the whole message, headers included

        $pop->delete($msgnum);

        push(@messages, \%row);
    }
}

$pop->quit;
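
The code above runs decode_qp over every message, whatever its encoding. A rough sketch of choosing the decoder from the Content-Transfer-Encoding header instead is below. It only handles single-part messages, decoded_body is a made-up helper name, and the header is matched with a plain regex to keep the example self-contained (Mail::Header's get would do the same job).

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: decode the body of a single-part message according to
# its Content-Transfer-Encoding header. $lines is an arrayref of raw lines,
# the shape that Net::POP3's get() returns.
sub decoded_body {
    my ($lines) = @_;
    my $raw = join '', @$lines;

    # Headers end at the first blank line; the rest is the body.
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    $body = '' unless defined $body;

    # Unfold continuation lines, then look for the encoding header.
    $head =~ s/\r?\n[ \t]+/ /g;
    my ($cte) = $head =~ /^Content-Transfer-Encoding:\s*(\S+)/mi;
    $cte = lc($cte // '7bit');   # no header means 7bit

    return decode_qp($body)     if $cte eq 'quoted-printable';
    return decode_base64($body) if $cte eq 'base64';
    return $body;                # 7bit, 8bit, binary: leave untouched
}

my $msg = [
    "Subject: test\n",
    "Content-Transfer-Encoding: quoted-printable\n",
    "\n",
    "draft(s) to=20=\n",
    "be circulated\n",
];
print decoded_body($msg);
```

This does nothing about multipart messages, where each part carries its own encoding header, so it is not an answer to the question above, only a safer replacement for the blanket decode_qp call.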

- wil
Re: [Wil] Net::POP3 Plain Text
In case anyone is interested, I ended up with the following (admittedly clumsy) solution:

Code:
sub get_newsletters {
# --------------------------------------------------------------------------
# Fetch every message from the POP3 mailbox, keep the text/plain body,
# store it in wic_newsletters and list the results in a template.
#
    my $self = shift;

    my $dbh = $self->param('dbh');

    use Net::POP3;
    use Mail::Audit;
    use Mail::Header;
    use Date::Manip;
    use MIME::QuotedPrint;

    my $pop = Net::POP3->new('foo', Timeout => 60)
        or die("Can't connect to POP3 server");

    my $username = "foo";
    my $password = "bar";

    my @messages;

    # Prepare the INSERT once rather than once per message.
    my $sth = $dbh->prepare("INSERT INTO wic_newsletters (date,subject,newsletter) VALUES(?,?,?)")
        or die("Can't prepare SQL query: [$DBI::err] $DBI::errstr");

    # login returns the message count ("0E0" for an empty mailbox), undef on failure
    if ($pop->login($username, $password) > 0) {

        my $msgnums = $pop->list; # hashref of msgnum => size

        foreach my $msgnum (keys %$msgnums) {

            my %row;

            my $msg = $pop->get($msgnum); # arrayref of lines

            my $parsedhead = Mail::Header->new($msg);

            chomp(my $subject = $parsedhead->get('Subject'));
            chomp(my $date = $parsedhead->get('Date'));

            $date = UnixDate($date, "%Y-%m-%d"); # %d zero-pads the day; %e pads with a space

            $row{id} = $msgnum;
            $row{date} = $date;
            $row{subject} = $subject;

            # Default: everything after the first blank line (the raw body).
            $row{email} = join("", @$msg);
            $row{email} =~ s/^.*?\n\n//s;

            my $audit = Mail::Audit->new(data => $msg);

            if ($audit->is_mime) {

                # For MIME messages, prefer the text/plain part.
                foreach my $part ($audit->parts) {

                    if ($part->mime_type eq 'text/plain') {
                        my $body = $part->body; # arrayref of still-encoded lines
                        $row{email} = join("", @$body);
                    }
                }
            }

            # Assumes the chosen body is quoted-printable encoded.
            $row{email} = decode_qp($row{email});

            $pop->delete($msgnum);

            push(@messages, \%row);

            $sth->execute($row{date}, $row{subject}, $row{email})
                or die("Can't execute SQL query: [$DBI::err] $DBI::errstr");
        }
    }

    $pop->quit;

    my $tmpl_path = "view_emails.shtml";
    my $tmpl = $self->load_tmpl($tmpl_path, die_on_bad_params => 0);

    $tmpl->param(
        messages => \@messages,
    );

    $self->_set_username_in_tmpl(\$tmpl);

    return $tmpl->output;
}
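
The part-picking step could probably be written more compactly with MIME::Parser from MIME-tools (the distribution that provides MIME::Entity). The sketch below is not what the code above does, just an alternative under that assumption: plain_text_body is a made-up helper name, and it returns the first text/plain part it finds, already decoded, so no separate decode_qp call is needed.

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical helper: return the decoded body of the first text/plain part
# of a raw message (one string), or undef if there is none.
sub plain_text_body {
    my ($raw) = @_;

    my $parser = MIME::Parser->new;
    $parser->output_to_core(1);   # keep decoded bodies in memory, not in files
    my $entity = $parser->parse_data($raw);

    # parts_DFS returns the entity itself and every nested part, depth first.
    for my $part ($entity->parts_DFS) {
        next unless $part->effective_type eq 'text/plain';
        my $bh = $part->bodyhandle or next;   # multipart containers have no body
        return $bh->as_string;                # transfer encoding already undone
    }
    return;
}

# A small multipart/alternative message to try it on.
my $raw = join "\n",
    'From: sender@example.com',
    'Subject: test',
    'MIME-Version: 1.0',
    'Content-Type: multipart/alternative; boundary="XYZ"',
    '',
    '--XYZ',
    'Content-Type: text/plain; charset=iso-8859-1',
    'Content-Transfer-Encoding: quoted-printable',
    '',
    'Hello=20=',
    'world',
    '--XYZ',
    'Content-Type: text/html',
    '',
    '<p>Hello world</p>',
    '--XYZ--',
    '';

my $text = plain_text_body($raw);
print defined $text ? $text : "(no text/plain part)", "\n";
```

With Net::POP3 the raw string would be join("", @$msg). A message that is not MIME at all is treated as a single text/plain entity, so the same call covers both cases.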

- wil