Note: This document describes how to set up automatic e-mail filtering using tags added to your incoming e-mail by SpamAssassin™, a software package installed on the CIS Unix mail handling machines. This is our primary suggested method for spam filtering. It's a reasonably effective method for decreasing the hassle of dealing with spam, and is very easy to set up.
General Filtering Guidelines
People often ask for a method to handle unwanted e-mail (aka ``spam'') sent to their CIS Unix accounts. Spam is a waste of time and computing resources, and people who send spam (aka ``spammers'') are often sleazy types peddling shoddy products and services, often of dubious legality.
Worse, high volumes of spam can make it difficult to deal with your non-spam mail. (Following convention, we'll call your non-spam mail ``ham'' in the following.)
As a matter of policy, CIS Unix users are considered to be responsible for making their own decisions on mail they do and don't want to read. CIS doesn't want to be in the business of reading your incoming messages and deciding whether you would find them uninteresting or offensive. But we do want to give you the tools that minimize the hassle of dealing with such messages.
The procmail program on the CIS Unix systems is used to deliver mail to CIS Unix users. Its normal operation is to put all incoming messages in your default INBOX. But you can also have a .procmailrc file in your home directory that will alter this operation based on the content of the incoming messages. This process is called ``filtering;'' the .procmailrc file contains one or more ``rules'' to automatically divert certain messages to other mailboxes, forward them to other addresses, or delete them entirely.
No automatic process can (yet) substitute for your personal judgment on whether any given message is spam or not. Inevitably, even carefully designed filters will have ``false positives'' (messages that the filter thinks look like spam, but aren't), and ``false negatives'' (messages that the filter thinks don't look like spam, but are).
So please note: CIS will not be responsible for incorrect classification of incoming messages, either false positives or false negatives.
For this reason, this method doesn't throw away suspicious mail without giving you a chance to read it (although an easy change will do that). Instead, suspicious mail is diverted to mailboxes other than your normal INBOX. The idea is that you zip through those other mailboxes at lower priority, with the expectation that the messages are almost certainly all junk.
About SpamAssassin
Incoming e-mail to CIS Unix accounts passes through the SpamAssassin™ mail scanning program. SpamAssassin uses a number of heuristic tests to score each message it sees: the higher the score, the more likely the mail is spam. Before each message is delivered to your account, special tags are added to it containing the score and SpamAssassin's guess as to whether the mail is spam or not.
While SpamAssassin does a very good job guessing whether incoming
messages are spam or not, you shouldn't forget that a very good guess is
still a guess. SpamAssassin will probably classify a relatively
small fraction of non-spam messages as spam ("false positives"),
which will be delivered to the IN.spam folder. (That's why the
method described here doesn't simply throw away such messages, although
a simple change, described below, will let you do that.)
Conversely, SpamAssassin will
probably fail to correctly detect a fraction of spam messages ("false
negatives"), which will wind up in your INBOX.
Setting Up Filtering (The Easy Way)
If you are a WebMail user, you can start SpamAssassin-based mail filtering by following the instructions here. (You can disable filtering similarly.)
Non-WebMail users can instead choose the appropriate link at this location: https://webmail.unh.edu/cisunix/spamfilter.html. (There's also a link on that page to stop mail filtering, should you decide it's not for you.)
Either method will put default mail-filtering rules in your
.procmailrc file as described above. Filtering starts immediately.
You only need to read on if (a) you want to incorporate spam filtering more intelligently into your existing procmail rules; (b) you want to adjust the procmail filtering rules to something other than the default; (c) you want to throw away probable spam, instead of directing it to a separate folder; or (d) you just want to know more about what's going on.
The Gory Details
SpamAssassin works by tagging mail messages with additional "mail headers"; these headers are typically not shown when you're reading your mail. For a probable-spam message, the added headers might look something like this:
X-MailScanner-SpamCheck: spam, SpamAssassin (score=6.7, required 5,
    DEAR_SOMETHING, FROM_ALL_NUMS, FROM_AND_TO_SAME, FROM_ENDS_IN_NUMS,
    NO_REAL_NAME, RESENT_TO)
X-MailScanner-SpamScore: ssssss
For a probable-nonspam message, the addition might look like this:
X-MailScanner-SpamCheck: not spam, SpamAssassin (score=-0.8, required 5,
    RESENT_TO)
SpamAssassin has assigned the first message a score
of 6.7, the second a score of -0.8. By default,
SpamAssassin considers a score of 5 or higher to reflect a probable
spam message. For messages with positive scores, the
X-MailScanner-SpamScore header is also added; its value is a run
of s characters, one for each whole point of the score (so a score
of 6.7 yields six).
The default .procmailrc rule provided by the
web-based setup looks like this:
:0:
* ^X-MailScanner-SpamScore: sssss
mail/IN.spam
This says: if the mail headers contain an
X-MailScanner-SpamScore header followed by five (or more)
s characters,
put the mail into the IN.spam file in your mail
directory. (This is the default location for mail folders in Pine
and WebMail.)
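For readers new to procmail, the same rule can be read line by line. The annotated copy below is only an explanatory sketch: the comments are ours, the recipe itself is unchanged from the default, and the remark about where mail/IN.spam lives assumes your MAILDIR setting has been left at procmail's default (your home directory).

```procmail
# Line 1, ":0:"
#   ":0" starts a new rule. The trailing ":" asks procmail to lock the
#   destination folder so two deliveries cannot write to it at once.
# Line 2, "* ^X-MailScanner-SpamScore: sssss"
#   "*" introduces a condition: a regular expression checked against the
#   message headers. "^" anchors it to the start of a header line.
#   Five s characters also match a header carrying six, seven, or more.
# Line 3, "mail/IN.spam"
#   The action: append the message to this folder. A relative path is
#   assumed here to be relative to your home directory.
:0:
* ^X-MailScanner-SpamScore: sssss
mail/IN.spam
```

All of the comments sit above the rule rather than on the rule's own lines, which keeps the three working lines exactly as the web-based setup writes them.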
If you already have a .procmailrc file, the web-based setup
places this rule at the end; it will be checked after the ones you
had set up previously.
You can also create or modify your .procmailrc
file by logging into your CIS Unix account and using an editor
(like pico or vi). For example:
% pico .procmailrc
If you do this, please note that punctuation and spacing are extremely important in this file; getting it wrong can cause lost mail. With that warning, here are some ways you can alter the default setup:
You can make the filtering more or less aggressive by decreasing or increasing the number of s characters demanded after the X-MailScanner-SpamScore header. If (for example) you wanted to consider anything with a score of 7 or greater to be spam, your procmail rule could look like:
:0:
* ^X-MailScanner-SpamScore: sssssss
mail/IN.spam
Increasing the number of s's will decrease the probability of false positives (and increase the probability of false negatives); conversely, decreasing the number of s's will decrease the probability of false negatives (and, guess what, increase the probability of false positives).
If you have more than one rule in your .procmailrc file, you might want others to take precedence over your spam filter. (For example, you might always want mail sent to you from a mailing list to go to a separate folder, regardless of whether SpamAssassin thinks it's spam.) Simply move the spam-rule lines below the lines you want to take precedence.
Replacing mail/IN.spam with /dev/null in your .procmailrc file will cause incoming mail that would normally be placed in your IN.spam folder to be simply deleted instead. We don't recommend you do that, because of the possibility of false positives. But we won't stop you. It's your mail.
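The precedence and discard options described above can be combined in one file. The following is a minimal sketch, not the file the web-based setup produces: the mailing-list name example-list.lists.example.org, the folder mail/IN.example-list, and the discard threshold of ten are all hypothetical choices made for illustration.

```procmail
# Hypothetical mailing-list rule, placed first so that list mail is
# filed here no matter what score SpamAssassin gave it.
:0:
* ^List-Id:.*example-list\.lists\.example\.org
mail/IN.example-list

# Assumed threshold: silently discard mail scoring 10 or higher.
# There is no trailing ":" on ":0" because /dev/null needs no lockfile.
:0
* ^X-MailScanner-SpamScore: ssssssssss
/dev/null

# The default rule: file remaining mail scoring 5 or higher.
:0:
* ^X-MailScanner-SpamScore: sssss
mail/IN.spam
```

Order matters here because procmail tries the rules from top to bottom and stops at the first one that delivers the message. The ten-s rule has to come before the five-s rule, since a header with ten s characters also matches the shorter pattern.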
Client Filtering
The method described in this document is server-side filtering: it filters your mail on our server as it arrives in your Unix account. Most modern mail client programs allow you to filter messages as well; this is client-side filtering. If you use Microsoft Outlook to read your mail, you can configure it to read SpamAssassin's header and perform an appropriate action. Instructions are here. A similar method for recent versions of Eudora is described here.
Our setup of SpamAssassin doesn't allow filtering by current versions of Microsoft Outlook Express, sorry.
Page Maintenance: Paul A. Sand <pas@unh.edu> Last modified: 2012-05-07 8:54 AM EDT

