Difference between revisions of "SpamAssassin Configuration"

m (Phil.cs.jhu.edu moved page Spamassassin Spam-Tagging to SpamAssassin Configuration)
 
== Introduction ==
 
  
For an overview of how we use Spamassassin in our department, please, first visit our wiki article:  [[https://support.cs.jhu.edu/wiki/Spamassassin:_Overview_And_Use_In_The_Dept  Spamassassin: Overview And Use In The Dept]]
+
For an overview of how we use SpamAssassin in our department, please first read our [[SpamAssassin]] article.
  
We use [https://spamassassin.apache.org/ SpamAssassin] to tag e-mails' ''Subject'' lines with <tt>** SPAM **</tt> if they are found to be spam.
+
SpamAssassin uses a set of rules that control how it scores each message. It considers each rule in turn and if it determines that a rule applies to or matches the message, then that rule's score is added to the message's overall score.  Some rules have negative scores, indicating that messages with those features are probably not spam.  SpamAssassin's default rule scores have been chosen based on experience with large quantities of both spam and non-spam messages.
  
SpamAssassin uses several rule sets in determining what is spam and what is non-spam.  Points or ''"hits"'' are given to e-mails for various levels of "spammedness."  The more hits your email gets, the greater the chance it will be tagged as spam.  Once the email reaches the globally defined required_hits level, the mail is tagged as spam.  By default, CS requires a hit level of ''7.0'' or above for a mail message to be considered spam.
+
By default, the CS Department uses a threshold of 7 for its spam classification. Any message whose overall score is equal to or greater than 7 is flagged as spam.  The [[SpamAssassin]] page describes what is done by default to spam-flagged messages. The [[Filtering Spam with Procmail]] page describes how to automatically filter messages that have been flagged as spam.
  
If you receive mail tagged as <tt>** SPAM **</tt>, look for an <tt>X-Spam-Status:</tt> line in your mail headers to see which spam tests your e-mail matched.  Sometimes a message can match several of the spam tests and still not be tagged as spam, as many normal, legitimate e-mails have characteristics of spam within them.  The more tests a message matches, the higher its hit level and the greater the chance that it will be tagged as spam.
+
== Create a user_prefs File ==
  
To filter out spam email (using your favorite mail-filtering program), it's probably best to look for an <tt>X-Spam-Flag: YES</tt> header in the message, but you can also just look for <tt>** SPAM **</tt> in the subject line. In any case, we recommend you do not simply delete such mail, but move spam-tagged mail to a folder to review later, in case there are some legitimate mails that were tagged as spam.
+
SpamAssassin uses a configuration file in your home directory (on our [[:Category:Linux Clients|Linux clients]]) to supplement its Department-wide configuration. You can use that file to customize SpamAssassin's behavior for your messages.
  
== Customizing How SpamAssassin Works For You ==
+
To create an empty config file, from a command prompt on one of our Linux clients, run the following commands:
  
SpamAssassin uses both global and user parameters for custom tagging configurations. As a user, you can adjust the user settings.  To do so, you'll need to first create a <tt>.spamassassin</tt> directory (don't forget the dot at the beginning):
+
mkdir ~/.spamassassin
 +
  touch ~/.spamassassin/user_prefs
  
mkdir .spamassassin
+
== Change Your Spam Score Threshold ==
  
Then, create the configuration file called <tt>user_prefs</tt>:
+
The most common thing to customize is your spam score threshold.  The default value for the CS mailserver is 7, which is a very conservative setting--it's pretty unlikely to flag something as spam when it's not, but it also lets a fair amount of spam in untagged.
  
  touch user_prefs
+
The most common threshold for SpamAssassin is 5. Some people have good results with thresholds as low as 2.  Thresholds can be decimal numbers, so you can use, say, 5.3 if you want.
  
(The touch command merely creates an empty file.)
+
To set your threshold to 5.3, put the following in your <tt>~/.spamassassin/user_prefs</tt> file:
  
In your <tt>.spamassassin/user_prefs</tt> file, there are many parameters you can change.  The most popular one here is the <tt>required_score</tt> parameter. (In older versions of SpamAssassin, this was called <tt>required_hits</tt>.)  This will allow you to decide what score your incoming email message needs to reach before being tagged as <tt>** SPAM **</tt>.  The lower the <tt>required_score</tt> number, the easier it is for mail (legitimate or spam) to be tagged as spam (this could lead to false positives).  The higher the number, the fewer items tagged as spam.  (More spam will be untagged.)  As mentioned earlier, CS uses a <tt>required_score</tt> level of '''''7.0''''' by default.  You can change this to a different level by adding the <tt>required_score</tt> parameter to <tt>user_prefs</tt> (the numbers can be decimals).
+
<pre>
 +
required_score 5.3
 +
</pre>
  
required_score 5.4
+
== Prevent Certain Email Addresses from Being Considered Spam ==
  
(more messages will be tagged as spam, however, some may be legitimate e-mails.)
+
Sometimes SpamAssassin will flag a message as spam even if it's not.  If that happens frequently to the same sender, you can ''whitelist'' that sender's email address to tell SpamAssassin that email from that person or organization should never be considered spam.  You do this with the <tt>whitelist_from</tt> configuration directive.
  
or
+
If sally@example.com often gets flagged as spam, you can whitelist her with the following line in your <tt>~/.spamassassin/user_prefs</tt> file:
  
required_score 9.2
+
<pre>
 +
whitelist_from sally@example.com
 +
</pre>
  
(fewer messages will be tagged as spam.)
+
You can whitelist entire domains, if you want. Let's say that you receive several newsletters from a company whose email addresses all end with "@company.com".  You can whitelist all of those at once with the following directive:
  
Other <tt>user_prefs</tt> file options can be found in the <tt>Mail::SpamAssassin::Conf</tt> man page or at [http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Conf.html Mail::SpamAssassin::Conf] on the SpamAssassin website.
+
<pre>
 +
whitelist_from *@company.com
 +
</pre>
  
== Whitelists (or how to prevent some real mail from being tagged as spam.) ==
+
== Our Default Settings ==
  
When you receive <tt>** SPAM **</tt>-tagged mail from someone you know, you'll probably want to put that person's e-mail address on a ''whitelist'', so that that person's e-mail doesn't get tagged as <tt>** SPAM **</tt> again.
+
For reference, here are the default settings we use for the Department:
  
To do that, add a <tt>whitelist_from</tt> line to your <tt>user_prefs</tt> file.
+
<pre>
 +
required_score 7
  
For example, say that email from your colleague ''mike@yahoo.com'' gets tagged as <tt>** SPAM **</tt>.
+
rewrite_header Subject **SPAM**
 +
clear_headers
 +
add_header spam Flag _YESNOCAPS_
 +
add_header all Status _YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTS_ autolearn=_AUTOLEARN_ version=_VERSION_
  
Add:
+
fold_headers 0
 +
</pre>
  
whitelist_from mike@yahoo.com
+
== More Information ==
  
to your <tt>user_prefs</tt> file.  (It shouldn't matter where in the file you put it.)
+
Other <tt>user_prefs</tt> file options can be found in the <tt>Mail::SpamAssassin::Conf</tt> man page or at [http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Conf.html Mail::SpamAssassin::Conf] on the SpamAssassin website.
 
 
Now email from ''mike@yahoo.com'' will ''not'' be tagged as spam when you receive it.
 
 
 
(You'll also notice that now, the <tt>X-Spam-Status:</tt> line in the mail headers from mike@yahoo.com includes  "<tt>USER_IN_WHITELIST</tt>")
 
 
 
You can also whitelist an entire domain. For example, if you want all your incoming mail from blah.com to never be tagged as spam, add the following whitelist entry to user_prefs:
 
 
 
whitelist_from blah.com
 
 
 
If that doesn't work for you, try:
 
 
 
whitelist_from *@blah.com
 
  
For more information, please read [http://wiki.apache.org/spamassassin/ManualWhitelist ManualWhitelist] on the SpamAssassin website.
 
  
 
[[Category:Spam Filtering]]
 

Revision as of 19:35, 29 January 2015

Introduction

For an overview of how we use SpamAssassin in our department, please first read our SpamAssassin article.

SpamAssassin uses a set of rules that control how it scores each message. It considers each rule in turn and if it determines that a rule applies to or matches the message, then that rule's score is added to the message's overall score. Some rules have negative scores, indicating that messages with those features are probably not spam. SpamAssassin's default rule scores have been chosen based on experience with large quantities of both spam and non-spam messages.

By default, the CS Department uses a threshold of 7 for its spam classification. Any message whose overall score is equal to or greater than 7 is flagged as spam. The SpamAssassin page describes what is done by default to spam-flagged messages. The Filtering Spam with Procmail page describes how to automatically filter messages that have been flagged as spam.
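
The scoring described above can be illustrated with a minimal Python sketch. This is not SpamAssassin's implementation: the rule names and scores are hypothetical and chosen only to show how positive and negative rule scores add up and how the total is compared against the Department's default threshold of 7.

```python
from typing import Dict, Iterable

# Hypothetical rule names and scores, invented for illustration only.
# They are not SpamAssassin's real rules or default scores.
RULE_SCORES: Dict[str, float] = {
    "SUBJECT_ALL_CAPS": 1.5,
    "MENTIONS_FREE_MONEY": 3.0,
    "SUSPICIOUS_LINK": 2.75,
    "SENDER_IN_ADDRESS_BOOK": -2.0,  # negative score: evidence of non-spam
}


def total_score(matched_rules: Iterable[str]) -> float:
    """Add up the scores of every rule that matched the message."""
    return sum(RULE_SCORES[name] for name in matched_rules)


def is_spam(matched_rules: Iterable[str], threshold: float = 7.0) -> bool:
    """Flag the message when its score is equal to or greater than the threshold."""
    return total_score(matched_rules) >= threshold


matched = ["SUBJECT_ALL_CAPS", "MENTIONS_FREE_MONEY", "SUSPICIOUS_LINK"]
print(total_score(matched), is_spam(matched))  # 7.25 True

# The same message from a known sender drops below the threshold.
print(is_spam(matched + ["SENDER_IN_ADDRESS_BOOK"]))  # False
```

Passing a lower `threshold` (for example 5.0) to `is_spam` shows why a lower setting flags more messages.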

Create a user_prefs File

SpamAssassin uses a configuration file in your home directory (on our Linux clients) to supplement its Department-wide configuration. You can use that file to customize SpamAssassin's behavior for your messages.

To create an empty config file, from a command prompt on one of our Linux clients, run the following commands:

mkdir ~/.spamassassin
touch ~/.spamassassin/user_prefs

Change Your Spam Score Threshold

The most common thing to customize is your spam score threshold. The default value for the CS mailserver is 7, which is a very conservative setting--it's pretty unlikely to flag something as spam when it's not, but it also lets a fair amount of spam in untagged.

The most common threshold for SpamAssassin is 5. Some people have good results with thresholds as low as 2. Thresholds can be decimal numbers, so you can use, say, 5.3 if you want.

To set your threshold to 5.3, put the following in your ~/.spamassassin/user_prefs file:

required_score 5.3
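
If you would rather script this change than edit the file by hand, the following Python sketch shows one possible approach. The helper name `set_required_score` is hypothetical, and the demonstration writes to a temporary directory rather than to your real `~/.spamassassin/user_prefs`, so it is safe to run as-is.

```python
import tempfile
from pathlib import Path


def set_required_score(prefs_path: Path, score: float) -> None:
    """Write `required_score <score>` to prefs_path, replacing any existing setting."""
    # Create the .spamassassin directory if it does not exist yet.
    prefs_path.parent.mkdir(parents=True, exist_ok=True)
    lines = prefs_path.read_text().splitlines() if prefs_path.exists() else []
    # Drop earlier required_score lines; keep every other line untouched.
    kept = [ln for ln in lines if ln.split()[:1] != ["required_score"]]
    kept.append(f"required_score {score}")
    prefs_path.write_text("\n".join(kept) + "\n")


# Demonstration against a throwaway directory instead of the real home directory.
with tempfile.TemporaryDirectory() as tmp:
    prefs = Path(tmp) / ".spamassassin" / "user_prefs"
    set_required_score(prefs, 7)
    set_required_score(prefs, 5.3)  # replaces the earlier value
    print(prefs.read_text(), end="")  # required_score 5.3
```

To apply it to your own account, you would pass `Path.home() / ".spamassassin" / "user_prefs"` as the path.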

Prevent Certain Email Addresses from Being Considered Spam

Sometimes SpamAssassin will flag a message as spam even if it's not. If that happens frequently to the same sender, you can whitelist that sender's email address to tell SpamAssassin that email from that person or organization should never be considered spam. You do this with the whitelist_from configuration directive.

If sally@example.com often gets flagged as spam, you can whitelist her with the following line in your ~/.spamassassin/user_prefs file:

whitelist_from sally@example.com

You can whitelist entire domains, if you want. Let's say that you receive several newsletters from a company whose email addresses all end with "@company.com". You can whitelist all of those at once with the following directive:

whitelist_from *@company.com
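
To get a feel for which addresses a pattern covers, here is a rough Python sketch. It assumes, as the SpamAssassin configuration documentation describes, that `whitelist_from` patterns are file-glob style (only `*` and `?` act as wildcards) and are matched case-insensitively. The function names are hypothetical, and this is an approximation rather than SpamAssassin's own matching code.

```python
import re
from typing import Iterable


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a whitelist-style glob into a case-insensitive regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")  # any run of characters
        elif ch == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE)


def is_whitelisted(sender: str, patterns: Iterable[str]) -> bool:
    """Return True when the whole sender address matches any pattern."""
    return any(glob_to_regex(p).fullmatch(sender) for p in patterns)


patterns = ["sally@example.com", "*@company.com"]
print(is_whitelisted("sally@example.com", patterns))     # True
print(is_whitelisted("news@company.com", patterns))      # True
print(is_whitelisted("someone@elsewhere.org", patterns)) # False
```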

Our Default Settings

For reference, here are the default settings we use for the Department:

required_score 7

rewrite_header Subject **SPAM**
clear_headers
add_header spam Flag _YESNOCAPS_
add_header all Status _YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTS_ autolearn=_AUTOLEARN_ version=_VERSION_

fold_headers 0
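
With these settings, each message gets an X-Spam-Status header and spam-flagged messages also get X-Spam-Flag: YES. The Python sketch below shows one way a script could read those headers. The sample message, its rule names, and its version string are made up for illustration, and the regular expression assumes the header follows the Status template shown above on a single line.

```python
import re
from email import message_from_string

# Matches the start of a Status header built from the template above.
STATUS_RE = re.compile(
    r"^(?P<verdict>Yes|No), score=(?P<score>-?[\d.]+)"
    r" required=(?P<required>-?[\d.]+) tests=(?P<tests>\S*)"
)


def parse_spam_headers(raw_message: str) -> dict:
    """Extract the spam flag, score, threshold and matched tests, if present."""
    msg = message_from_string(raw_message)
    info = {"flagged": msg.get("X-Spam-Flag", "").strip().upper() == "YES"}
    match = STATUS_RE.match(msg.get("X-Spam-Status", "").strip())
    if match:
        info["score"] = float(match.group("score"))
        info["required"] = float(match.group("required"))
        tests = match.group("tests")
        # Assumption: an empty list of tests is reported as "none".
        info["tests"] = [] if tests in ("", "none") else tests.split(",")
    return info


# Made-up sample message; RULE_A and RULE_B are placeholder rule names.
SAMPLE = (
    "From: sender@example.com\n"
    "Subject: **SPAM** You have won\n"
    "X-Spam-Flag: YES\n"
    "X-Spam-Status: Yes, score=8.5 required=7.0 tests=RULE_A,RULE_B "
    "autolearn=no version=3.3.1\n"
    "\n"
    "Message body\n"
)

print(parse_spam_headers(SAMPLE))
```

A mail filter built on this could move messages with `flagged` set to a review folder instead of deleting them.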

More Information

Other user_prefs file options can be found in the Mail::SpamAssassin::Conf man page or at Mail::SpamAssassin::Conf on the SpamAssassin website.