This post is over 5 years old; it may be out of date.
Blocking and filtering spam in electronic mail
Blocking based on the address of the computer delivering the mail
Every user can adjust their block level via the Do It Yourself website. When an email arrives at the science mailserver, the server immediately checks whether the recipient’s mail address exists and whether the IP address of the computer that tries to deliver the mail is on one of the lists included in the recipient’s block level. If the address does not exist or the IP address is on such a list, the mail is not accepted. If the mail would only be blocked by a higher block level, it is accepted, but delivered with an extra mail header line: “X-Would-Be-Blocked-By:”.
There are four block levels: ‘none’, ‘light’, ‘medium’ and ‘heavy’, where ‘medium’ is the default and recommended value.
- none: no blocking takes place.
- light: blocking based on the following lists:
  - whitelist.science.ru.nl: C&CZ’s own white list: machines that are or have been listed on blacklists, but from which many of our users want to receive mail. At the moment the list (luckily) contains few entries: a list server of Surfnet and the spamprovider of the RadboudUMC.
  - blacklist.science.ru.nl: C&CZ’s own black list: machines that are not (yet) a problem for others, but are for C&CZ. At the moment the list (luckily) contains few entries, mainly machines that have in the past bombarded C&CZ mailservers with spam and/or viruses.
- medium: on top of the ‘light’ lists the following are added:
  - bl.spamcop.net: a database of known and/or reported spammers, per server IP address.
  - sbl.spamhaus.org: a database of verified spam sources, spam gangs and spam support services, maintained by the Spamhaus Project team.

  With a block level lower than the default ‘medium’ (that is, ‘light’ or ‘none’), mail that would be blocked at ‘medium’ is passed through with the warning header “X-Would-Be-Blocked-By: Medium”.
- heavy: on top of the ‘medium’ lists the following are added:
  - dnsbl.sorbs.net: SORBS, a database of several kinds of spam sources.
  - xbl.spamhaus.org: Spamhaus Exploits Block List: illegal exploits, including open proxies (HTTP, socks, AnalogX, wingate, etc.), worms/viruses with built-in spam engines and other types of ‘trojan horse’ abuse.

  With a ‘medium’ block level, mail that would be blocked at ‘heavy’ is passed through, but with the warning header “X-Would-Be-Blocked-By: Heavy”.
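The cumulative block levels can be illustrated with a small Python sketch. It is not the actual mailserver configuration. The function names are hypothetical, and three points are assumptions rather than statements from this page: that the lists are queried with the usual DNS blocklist convention (reversed IPv4 octets in front of the list name), that an entry on the white list overrides all black lists, and that a “Light” header value exists for users with level ‘none’.

```python
from __future__ import annotations

import ipaddress

LEVELS = ["none", "light", "medium", "heavy"]

# Black lists added at each level, as listed above.
LEVEL_LISTS = {
    "light": ["blacklist.science.ru.nl"],
    "medium": ["bl.spamcop.net", "sbl.spamhaus.org"],
    "heavy": ["dnsbl.sorbs.net", "xbl.spamhaus.org"],
}
WHITELIST = "whitelist.science.ru.nl"


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the conventional DNS blocklist query name for an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def first_blocking_level(listed_on: set[str]) -> str | None:
    """Return the lowest level that would block, or None if no level blocks."""
    if WHITELIST in listed_on:  # assumption: the white list wins
        return None
    for level in LEVELS[1:]:
        if any(zone in listed_on for zone in LEVEL_LISTS[level]):
            return level
    return None


def decide(user_level: str, listed_on: set[str]) -> tuple[str, str | None]:
    """Return ("reject", None) or ("accept", optional warning header)."""
    level = first_blocking_level(listed_on)
    if level is None:
        return ("accept", None)
    if LEVELS.index(level) <= LEVELS.index(user_level):
        return ("reject", None)
    # Blocked only by a higher level: accept, but add the warning header.
    return ("accept", f"X-Would-Be-Blocked-By: {level.capitalize()}")
```

Here `listed_on` stands for the set of lists on which the delivering IP address was found. How the server actually performs those lookups is not described on this page.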
New logins automatically get the ‘medium’ block level. This carries a small risk that wanted mail is blocked, but for most users it clearly blocks more spam than the ‘light’ block level. Mail that would be blocked by a heavier block level is let through, but with a warning header (X-Would-Be-Blocked-By:). From mail that arrives with this header, users can see for themselves how much wanted mail a heavier block level would block (“false positives”). The header can also be used to sort incoming mail into folders, e.g. with Sieve.
When one automatically forwards mail from other addresses to the science address, the blocking of spam-sending computers has to be done by the mailserver that originally receives the mail. Our mailserver doesn’t see the spammer; it only has a connection with the forwarding mailserver.
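As an illustration of sorting on the warning header, the following Python sketch reads the header from a raw message and picks a folder name. It only mirrors the kind of test a Sieve rule would perform and is not a Sieve script. The folder names and the function name `folder_for` are hypothetical.

```python
from email import message_from_string


def folder_for(raw_message: str) -> str:
    """Pick a (hypothetical) folder based on the X-Would-Be-Blocked-By header."""
    msg = message_from_string(raw_message)
    level = msg.get("X-Would-Be-Blocked-By")  # header lookup is case-insensitive
    if level is None:
        return "INBOX"
    return "WouldBeBlocked-" + level.strip()
```

Counting how much wanted mail ends up in such folders gives a rough idea of the false positives a heavier block level would cause.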
Filtering based on the content of the mail
Even with the blocking described above, one will probably still receive a lot of spam. That remaining spam can only be filtered based on the content of the mail. Doing this on the mailserver instead of on one’s own computer of course uses more server capacity than the spam blocking described above.
All C&CZ mailservers use
MIMEDefang to filter the content of mail
for spam and viruses. MIMEDefang in turn uses
SpamAssassin with a central Bayes filter.
If one has chosen to have mail that SpamAssassin recognizes as spam
delivered anyway, the final attachment contains a summary of the
reasons why the mail was recognized as spam. Many different things
count: words with a commercial or sexual meaning, properties of
addresses and header lines, formatting, use of capitals, etc. In
addition, C&CZ maintains a statistical (Bayes) list of words that
appear in normal mail and in spam. Each of these things contributes to
the total ‘spamscore’ of the mail. If the score is more than 5.0, the
mail is normally considered to be spam. For users of the IMAP mailserver
post.science.ru.nl, mail that is marked as spam is automatically moved
to the Spam folder on the mailserver. Mail older than 14 days is
automatically removed from this folder.
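The scoring and clean-up rules can be sketched in Python as follows. This is a simplified illustration, not how SpamAssassin or the mailserver is implemented. The function names are hypothetical, and two details are assumptions: the strict "more than 5.0" comparison is taken literally from the text above, and the age of a mail is measured from its time of receipt.

```python
from __future__ import annotations

from datetime import datetime, timedelta

SPAM_THRESHOLD = 5.0                 # from the text: more than 5.0 counts as spam
SPAM_RETENTION = timedelta(days=14)  # from the text: removed after 14 days


def total_score(rule_scores: dict[str, float]) -> float:
    """Add up the contributions of all rules that matched the mail."""
    return sum(rule_scores.values())


def is_spam(rule_scores: dict[str, float]) -> bool:
    """Mail counts as spam when its total score exceeds the threshold."""
    return total_score(rule_scores) > SPAM_THRESHOLD


def is_expired(received: datetime, now: datetime) -> bool:
    """True when a mail in the Spam folder is older than the retention period."""
    return now - received > SPAM_RETENTION
```

For example, three matching rules with hypothetical scores 2.5, 1.5 and 1.5 add up to 5.5, so that mail would be marked as spam.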
Using the Do It Yourself website it is possible to maintain a ‘whitelist’ of sender addresses and mail domains. Mail from these addresses and mail domains will never automatically be moved to the Spam folder, even if it is tagged by SpamAssassin as spam. With the help of Sieve, mail can be processed in other ways.
For complex filtering, such as redirecting mail or filing it into folders depending on the sender, subject or contents of the mail, or even switching off the spam filter, see the Sieve page.
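The effect of the whitelist can be illustrated with a short Python sketch. The function names are hypothetical, and the matching rules are assumptions: entries are stored in lower case and a domain entry matches only that exact domain. The real matching done by the mailserver is not described here.

```python
from __future__ import annotations


def is_whitelisted(sender: str, whitelist: set[str]) -> bool:
    """Check the sender address and its mail domain against the whitelist."""
    address = sender.strip().lower()
    domain = address.rpartition("@")[2]  # part after the last "@"
    return address in whitelist or domain in whitelist


def move_to_spam_folder(tagged_as_spam: bool, sender: str, whitelist: set[str]) -> bool:
    """Tagged mail is moved to the Spam folder unless the sender is whitelisted."""
    return tagged_as_spam and not is_whitelisted(sender, whitelist)
```

With a whitelist containing `example.org`, mail from `alice@example.org` stays out of the Spam folder even when it is tagged as spam.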