Filtering Spam with Procmail

The CS Department uses SpamAssassin to give every email a "spam score" that indicates how likely the message is to be spam. We do not filter out or delete any email based on the spam score, although we do change the subject lines of messages that appear to be spam. This page describes how to filter your spam with Procmail on our systems.

You can also change SpamAssassin's configuration to affect things like how it calculates its spam scores or what score it uses to decide that a message is really spam. For more information on that, please see SpamAssassin Configuration.

Log in to a Linux Client

All of these changes must be made on one of the Department's Linux clients.

Create a .procmailrc File

We use procmail for mail filtering. It's controlled by a file in your home directory named .procmailrc (note the leading dot!).

Create that file. In most cases, that should be as easy as opening your preferred text editor, creating a new document, and saving it as .procmailrc in your home directory.

Put the following into your .procmailrc file:

MAILDIR=$HOME/Mail
LOGFILE=$MAILDIR/procmaillog
#LOGABSTRACT=all
#VERBOSE=yes

:0:
* ^X-Spam-Flag: YES
SPAM

This will take every message that SpamAssassin thinks is spam and put those messages into a mail folder named "SPAM" instead of your main inbox.

Save your .procmailrc file.

At a command prompt, run the following command to ensure that the file's permissions are appropriate:

chmod 600 ~/.procmailrc
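To confirm that the change took effect, you can look at the permission string that ls reports. The sketch below works on a scratch file created with mktemp rather than your real .procmailrc, and perm_string is a helper name made up for this example.

```shell
# Print the first ten characters of the ls -l listing for "$1",
# i.e. the file type and permission bits.
perm_string() {
    ls -ld "$1" | cut -c1-10
}

# Demonstrate on a scratch file so nothing in your home directory changes.
scratch=$(mktemp)
chmod 600 "$scratch"
perm_string "$scratch"
rm -f "$scratch"
```

For your real file, running perm_string ~/.procmailrc should show read and write permission for you and nothing for group or others.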

Explanation

The first line, MAILDIR, sets the mail directory to be the Unix directory on our Grad/Research Net systems that you use to store your mail folders.

LOGFILE tells procmail where to keep a log of its activity for your account. You should set this; if you don't and there are problems delivering your mail, procmail will send a bounce email back to the person who sent the mail that had the problem.

LOGABSTRACT and VERBOSE will increase the information sent to the log file. They're useful for debugging your configuration. To activate them, just remove the "#" character at the beginning of their lines.

Note: Your log file will grow over time and may eventually use a lot of space. You can clear it out by running this command on one of our Linux clients:

cat /dev/null >~/Mail/procmaillog
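If you would rather empty the log only once it has grown past some size, a small shell function can do the check for you. This is a minimal sketch: trim_log is a hypothetical helper name, not part of procmail, and the size limit is whatever you choose to pass in.

```shell
# Empty the log file "$1", but only if it is larger than "$2" bytes.
trim_log() {
    log_file=$1
    max_bytes=$2
    [ -f "$log_file" ] || return 0      # nothing to do if there is no log
    log_bytes=$(wc -c < "$log_file" | tr -d ' ')
    if [ "$log_bytes" -gt "$max_bytes" ]; then
        : > "$log_file"                 # truncate in place, keep the file
    fi
}

# Try it on a scratch file so the real log is left alone.
scratch_log=$(mktemp)
printf 'procmail: example log line\n' > "$scratch_log"
trim_log "$scratch_log" 10              # longer than 10 bytes, so it is emptied
wc -c < "$scratch_log"
rm -f "$scratch_log"
```

For the real log you could run, for example, trim_log ~/Mail/procmaillog 1048576 to empty it once it passes roughly one megabyte.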

The section beginning with ":0:" files mail flagged as spam to a folder called SPAM. That folder will be stored in the directory defined by the first line in the file, MAILDIR.

The ":0:" line tells procmail that you're starting a new recipe. (More precisely, ":0" starts a new recipe. The final colon (":") tells procmail that this recipe will be storing the message in a file, so it needs to lock the file while writing to make sure that no other program tries to write to the file at the same time.)

In "* ^X-Spam-Flag: YES" the "*" at the beginning of the line tells procmail that you're giving it something to test each mail against. The regular expression that follows tells procmail to match any message with an X-Spam-Flag: header whose value is YES.

The last line of the recipe, "SPAM", gives the name of the mail folder (within the directory given by MAILDIR) where matching messages should be placed.
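To get a feel for what the condition matches, you can approximate it with standard tools. The sketch below is not how procmail itself works internally; it assumes only that the condition is tested against the message headers and is case-insensitive, which is procmail's default behavior. The function name is_flagged_spam and the sample message are made up for illustration.

```shell
# Succeed if the header block of the message file "$1" contains a line
# matching the same pattern as the recipe's condition.
is_flagged_spam() {
    # sed stops at the first empty line, which marks the end of the headers.
    sed '/^$/q' "$1" | grep -i '^X-Spam-Flag: YES' > /dev/null
}

# Build an invented sample message to try it on.
sample=$(mktemp)
printf '%s\n' \
    'From: someone@example.com' \
    'Subject: **SPAM** Cheap watches' \
    'X-Spam-Flag: YES' \
    '' \
    'Body text.' > "$sample"

if is_flagged_spam "$sample"; then
    echo "would be filed in SPAM"
else
    echo "would stay in the inbox"
fi
rm -f "$sample"
```

A message that mentions "X-Spam-Flag: YES" only in its body is not matched, because the check stops at the end of the headers.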

Check Your Spam Folder

Now, the most important step here is to wait a bit and see if **SPAM**-tagged mail continues to show up in your inbox. If you haven't seen any of it in a bit, then go to your new spam folder. (How you find and view that folder depends on your mail program.) Do you see the **SPAM**-tagged messages there? If so, it's working. If not, contact CS IT Support, and we'll try to figure out why it isn't working for you.
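If LOGFILE is set, the log gives you another way to confirm that mail is being filed. The sketch below assumes the log abstract format in which each delivery is recorded on a line that starts with "Folder:" followed by the destination; look at your own log to confirm it has that shape. folder_counts is a made-up helper name and the sample log contents are invented.

```shell
# Count how many deliveries the log file "$1" records for each folder.
folder_counts() {
    awk '$1 == "Folder:" { seen[$2]++ }
         END { for (f in seen) print f, seen[f] }' "$1" | sort
}

# Build an invented sample log with two SPAM deliveries and one inbox delivery.
scratch_log=$(mktemp)
{
    printf 'From alice@example.com  Mon Jan  1 10:00:00 2024\n'
    printf ' Subject: **SPAM** Cheap watches\n'
    printf '  Folder: SPAM\t2048\n'
    printf 'From bob@example.com  Mon Jan  1 11:00:00 2024\n'
    printf ' Subject: Meeting notes\n'
    printf '  Folder: /var/mail/you\t1024\n'
    printf 'From carol@example.com  Mon Jan  1 12:00:00 2024\n'
    printf ' Subject: **SPAM** You have won\n'
    printf '  Folder: SPAM\t4096\n'
} > "$scratch_log"

folder_counts "$scratch_log"
rm -f "$scratch_log"
```

On your own account, folder_counts ~/Mail/procmaillog should show a growing count for SPAM once the recipe is active.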

VERY IMPORTANT: Check your spam folder often, for two reasons.

1) It may contain false positives: mail that was tagged as spam because it looked like spam but is actually legitimate. For example, a mailing list you subscribe to might send messages that look like spam.

2) You need to clear out your spam folder regularly to conserve space on our home directory server disks. Some **SPAM**-tagged mail can be very large, especially if it contains attachments, and could fill up our disks quickly. Please monitor your spam folder and clean it out regularly.
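A quick way to see how much has piled up is to count the messages in the folder and check its size from the command line. This sketch assumes the SPAM folder is a single mbox-style file, in which every message begins with a line starting "From "; mbox_summary is a hypothetical helper name and the sample messages are invented.

```shell
# Print the number of messages and the size in bytes of the mbox file "$1".
mbox_summary() {
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    msg_count=$(grep -c '^From ' "$1" || true)
    mbox_bytes=$(wc -c < "$1" | tr -d ' ')
    echo "$msg_count messages, $mbox_bytes bytes"
}

# Build a small invented mbox file holding two messages.
scratch_mbox=$(mktemp)
{
    printf 'From alice@example.com Mon Jan  1 10:00:00 2024\n'
    printf 'Subject: first\n\nbody one\n\n'
    printf 'From bob@example.com Mon Jan  1 11:00:00 2024\n'
    printf 'Subject: second\n\nbody two\n\n'
} > "$scratch_mbox"

mbox_summary "$scratch_mbox"
rm -f "$scratch_mbox"
```

With the configuration on this page, the real folder would be ~/Mail/SPAM, so mbox_summary ~/Mail/SPAM reports on it.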

Automatically Delete (Some) Spam

The score that SpamAssassin gives is an indication of how likely it thinks that a given email is spam. One common thing to do is to automatically delete messages with high scores (because they're extremely unlikely to be false positives) while keeping lower-scored (but still classified as spam) messages so you can evaluate them individually.

By default, we tag all messages with a score greater than 7 as spam. (This is very conservative. Many people use a threshold of 5 and some get good results with thresholds as low as 2.) If a message has a score of 15 or higher, it's almost certainly spam and can be discarded unread. Here's a procmail rule to accomplish that:

# Drop spam with a score of 15 or higher.
# MATCH holds only the integer part of the score (15.9 becomes 15).
:0
* ^X-Spam-Status: Yes, score=\/[0-9-]+
* ? test $MATCH -ge 15
/dev/null

Warning: This will delete messages. Make sure you're okay with that before you use this rule.
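Before enabling a rule like this, it helps to see what number the comparison actually receives. In the recipe, the "\/" marker tells procmail to store whatever the rest of the pattern matches in MATCH, and the character class [0-9-] does not include the decimal point, so only the integer part of the score is compared. The sketch below imitates that extraction with sed; score_int is a made-up helper name and the header lines are invented examples.

```shell
# Print the integer part of the score from an X-Spam-Status header line,
# imitating what the recipe's pattern would leave in MATCH.
score_int() {
    printf '%s\n' "$1" |
        sed -n 's/^X-Spam-Status: Yes, score=\([0-9-][0-9-]*\).*/\1/p'
}

for header in \
    'X-Spam-Status: Yes, score=15.9 required=7.0' \
    'X-Spam-Status: Yes, score=23.4 required=7.0' \
    'X-Spam-Status: No, score=-1.2 required=7.0'
do
    n=$(score_int "$header")
    echo "${n:-no match}"
done
```

The first header yields 15 rather than 15.9, and the last one yields nothing because the pattern requires "Yes", so messages not classified as spam never reach the score test.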

Learn More about procmail

There are a number of man pages about procmail on our Linux clients. You can run man procmailex from a command prompt to see a number of useful examples of procmail use. man procmailrc describes the ins and outs of the .procmailrc file. (man procmail will tell you how to run procmail, but for the most part you shouldn't need to worry about that, since we run procmail for you automatically.)

If you're looking for more procmail examples, you can read through the extensive list at Timo's procmail tips and recipes.