First, I had understood that Bayes can learn from previously tagged emails
without stripping the SpamAssassin tags. Has this changed?
Second, all of my users use a webmail client, though they can use OE if they
wish. It is probably best for them to use IMAP so that server-side scanning
can be set up more easily. I currently have 2 scripts that run nightly. The
first takes everything in the user's /home/user/mail/Spam folder, learns it as
spam, then empties the folder. The second does the same for Ham, but moves
that mail to a Cleaned folder. All the user has to do is move untagged spam
into Spam and false positives into Ham.
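
For what it's worth, the two nightly jobs boil down to roughly the following. This is only a sketch in Python, not the actual scripts. It assumes each folder under /home/user/mail is a single mbox file and that sa-learn is on the PATH. It does no mailbox locking, so it should not run while mail is being delivered into those folders. The function names and the `runner` parameter are made up for illustration; `--spam`, `--ham` and `--mbox` are regular sa-learn options.

```python
import shutil
import subprocess
from pathlib import Path


def learn_mbox(mbox_path, kind, runner=subprocess.run):
    """Feed one mbox file to sa-learn as 'spam' or 'ham' and return the command."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    cmd = ["sa-learn", f"--{kind}", "--mbox", str(mbox_path)]
    runner(cmd, check=True)  # raises if sa-learn exits non-zero
    return cmd


def nightly_train(mail_dir, runner=subprocess.run):
    """Learn Spam then Ham for one user's mail directory (assumed mbox files)."""
    mail_dir = Path(mail_dir)
    spam = mail_dir / "Spam"
    ham = mail_dir / "Ham"
    cleaned = mail_dir / "Cleaned"
    ran = []

    # Job 1: learn everything in Spam, then empty the folder.
    if spam.is_file() and spam.stat().st_size > 0:
        ran.append(learn_mbox(spam, "spam", runner))
        spam.write_bytes(b"")

    # Job 2: learn everything in Ham, then move it to Cleaned.
    if ham.is_file() and ham.stat().st_size > 0:
        ran.append(learn_mbox(ham, "ham", runner))
        with ham.open("rb") as src, cleaned.open("ab") as dst:
            shutil.copyfileobj(src, dst)  # append to any existing Cleaned mail
        ham.write_bytes(b"")

    return ran
```

From cron this would be called once per user, e.g. `nightly_train("/home/user/mail")`. Because a failed sa-learn run raises before the folder is touched, nothing is emptied unless learning succeeded.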
--
<<JAV>>
---------- Original Message -----------
From: "Sander Holthaus - Orange XL" <***@orangexl.com>
To: "'SpamAssassin Users'" <***@spamassassin.apache.org>
Cc: "'Stuart Johnston'" <***@ebby.com>, "'Peter Marshall'"
<***@caris.com>
Sent: Fri, 4 Feb 2005 19:47:40 +0100
Subject: RE: Manually training SpamAssassin by forwarding mail
> > -----Original Message-----
> > From: Stuart Johnston [mailto:***@ebby.com]
> > Sent: Friday, February 04, 2005 7:35 PM
> > To: Peter Marshall; SpamAssassin Users
> > Subject: Re: Manually training SpamAssassin by forwarding mail
> >
> > Peter Marshall wrote:
> > > Stuart Johnston wrote:
> > >
> > >> Peter Marshall wrote:
> > >>
> > >>> Kevin Sullivan wrote:
> > >>>
> > >>>> --On 02/03/05 01:59:21 +0100 Sander Holthaus - Orange XL wrote:
> > >>>>
> > >>>>> I've been interested in letting customers manually train
> > >>>>> the SpamAssassin Bayes filter for ham and spam (to reduce false
> > >>>>> positives and negatives). However, I can only find documentation
> > >>>>> for this for local mailboxes and IMAP. Most users, however, retrieve
> > >>>>> their mail through POP and use Outlook (Express) as their mail
> > >>>>> client. Is there a way to train SpamAssassin with such a setup
> > >>>>> (e.g. forwarding mail with Outlook (Express) using SMTP)?
> > >>>>
> > >>>> If you want to do a lot of programming, you could save all incoming
> > >>>> messages for a few days in a database somewhere. When a user
> > >>>> forwards a message to a special "ham" or "spam" mailbox, you pull
> > >>>> the message-id from the message and use it to recover the original
> > >>>> message from your database.
> > >>>>
> > >>>> -Kevin
> > >>>
> > >>> My question is the same as Henrik's. I have a bunch of email that is
> > >>> spam (either tagged by SpamAssassin or not tagged at all). I
> > >>> forwarded it as an attachment to a "spam" mailbox. What do I have
> > >>> to do now before I can get Bayes to learn the message? I read you
> > >>> have to remove the headers. Could anyone give me a little more
> > >>> detail?
> > >>
> > >>
> > >>
> > >> I use a modified version of the DMZS-sa-learn.pl from:
> > >> http://www.dmzs.com/tools/files/spam.phtml
> > >>
> > >> When someone forwards a spam to me, I move the message to a special
> > >> IMAP folder that gets processed by the script. My additions look
> > >> something like:
> > >>
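> > >> use Email::MIME;
> > >> ...
> > >> # Parse the forwarded message (raw text supplied by the rest of the script).
> > >> my $msg = Email::MIME->new($raw_message_body);
> > >>
> > >> # Each original mail forwarded as attachment is a message/rfc822 part.
> > >> foreach my $part ($msg->parts) {
> > >>     if ($part->content_type =~ m|message/rfc822|) {
> > >>         # sa_learn() is defined elsewhere in the script.
> > >>         sa_learn($part->body_raw);
> > >>     }
> > >> }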
> > >>
> > >>
> > >> I've tested this with messages forwarded as attachment from Outlook
> > >> and Thunderbird. I'm not sure how effective it is though. I'm sure
> > >> that it still loses something in the translation. All-IMAP is
> > >> really the way to go if you can.
> > >>
> > >>
> > >> Stuart Johnston
> > >>
> > >>
> > > But I have no IMAP, only POP. They would forward (as attachment)
> > > to a mailbox, and then I have to run sa-learn ... I assume as root?
> > >
> > > Will the stuff you posted work for this setup as well?
> > >
> > > Would there be big problems just running it after the forward as
> > > attachment?
> >
> > The code I posted only shows how you can extract the attached
> > spam from the email. You'll need to write your own code to
> > integrate it into your particular setup.
> >
> > BTW, in Outlook, you can easily attach multiple spams to one
> > message and this code should handle it.
>
> CTRL-a, right click, "Forward Items" will indeed do the trick.
>
> > >
> > > Can users also forward as attachment mail that was already marked
> > > as spam ... or is there any advantage to this?
> >
> > If you use Bayes auto learn, I suspect that this wouldn't do much.
> > Otherwise, it might help.
>
> I would check the headers of the forwarded messages to see if their
> spam score is above your auto-learning threshold. If it is,
> relearning is perhaps quite useless. You might wonder why they
> received the message anyway (I would think that something that is
> good enough to autolearn is good enough to refuse or discard).
>
> Kind Regards,
> Sander Holthaus
------- End of Original Message -------
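
For a POP-only setup like the one discussed in the quoted thread, the step of pulling the original mail out of a message forwarded as attachment can also be sketched with Python's standard email package. This is an illustration only, not a port of the DMZS script: `extract_forwarded` is a hypothetical helper. It re-serialises each attached message, so the bytes may differ slightly from what the server originally received.

```python
from email import message_from_bytes, policy


def extract_forwarded(raw_bytes):
    """Return the bytes of every message/rfc822 attachment in a forwarded mail."""
    outer = message_from_bytes(raw_bytes, policy=policy.default)
    originals = []
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-item list
            # holding the attached message object.
            inner = part.get_payload(0)
            originals.append(inner.as_bytes())
    return originals
```

Each returned item could then be written to a file and handed to sa-learn. Because `walk()` visits every part, several mails attached to one forward are all picked up.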