Filtering UCE with Bogofilter

How to install and use the Bogofilter spam filter.

This was written by David Cantrell and given on Thu May 14 2003.

1. Introduction

Bogofilter is a nice tool for filtering UCE (unsolicited commercial email, better known as spam). It uses Bayesian statistics to track the messages you train it on and learn what to detect. Bogofilter was originally written by Eric S. Raymond.

Some sites of interest:

The above sites include the Bogofilter home page and a couple of sites that discuss Bayesian statistics. Caution: the sites contain math, which may or may not be desirable in your case. If you prefer, just consider Bayesian statistical calculations "magic". That's what I do.

2. Why use Bogofilter?

There are now several products available that do what Bogofilter does. SpamAssassin and SpamBayes are two popular ones. So why choose Bogofilter? Bogofilter is written in C, which means it tends to run faster. The other two are written in Perl and Python respectively, which offer other advantages, but speed usually isn't one of them.

I like Bogofilter simply because it's small and doesn't rely on a lot of external support software.

3. Installation

Bogofilter is super easy to install. The project appears to be offering RPM packages now. If that floats your boat, download the package and you'll be up and running.

If you suffer from the Not Compiled Here problem like me, grab the source and compile and install it:

gzip -dc bogofilter-0.10.3.1.tar.gz | tar -xvf -
cd bogofilter-0.10.3.1
./configure --prefix=/usr/local
make
make install

If you like to compile things but find the above steps scary, grab the source RPM and let RPM compile it for you.

4. Seeding the Filter

Bogofilter scores messages and stores what it learns in two databases: the good list and the spam list. These are Berkeley DB files that grow as you use bogofilter. You must seed bogofilter for it to be useful, and there are several ways to do this. Seeding by hand can be painful. Getting a DB dump from another bogofilter user is handy. If you get your hands on someone else's DB files, they need to be dumped to text first and then loaded on your system:

# On the source machine: dump each database to a text file
bogoutil -d goodlist.db >goodlist.txt
bogoutil -d spamlist.db >spamlist.txt

# On your machine
cat goodlist.txt | bogoutil -l goodlist.db
cat spamlist.txt | bogoutil -l spamlist.db

Using another set of data for your seed may or may not be a good idea. Be sure to think about this before doing it. Ideally you should seed your particular bogofilter installation with UCE that you have received. To seed bogofilter by hand, take your mbox file (or collection of email files) and pipe them through bogofilter with the -s option if it is spam, -n if it is not spam. The formail(1) tool is handy for doing this.
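The hand-seeding step can be sketched as a small shell function. This is only a sketch under some assumptions: the messages are stored one per file in a directory, and the names `seed_dir` and `BOGOFILTER` as well as the example paths are made up for illustration.

```shell
#!/bin/sh
# Sketch: train bogofilter from a directory of single-message files.
# BOGOFILTER, seed_dir and the example paths are illustrative assumptions.
BOGOFILTER=${BOGOFILTER:-bogofilter}

seed_dir() {
    flag=$1   # -s to register as spam, -n to register as non-spam
    dir=$2    # directory holding one message per file
    for msg in "$dir"/*; do
        [ -f "$msg" ] || continue
        # Feed each message to bogofilter on standard input.
        "$BOGOFILTER" "$flag" < "$msg" || echo "failed: $msg" >&2
    done
}

# Example calls (hypothetical directories):
#   seed_dir -s "$HOME/Mail/spam-samples"
#   seed_dir -n "$HOME/Mail/good-samples"
#
# For an mbox file, formail can split it into messages and run
# bogofilter once per message, for example:
#   formail -s bogofilter -s < spam.mbox
```

Feeding one message per invocation keeps the word counts tied to individual messages, which is also why formail is used to split an mbox rather than piping the whole file in at once.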

5. Wrapper Script

I use a wrapper script to invoke Bogofilter. It currently just forces the database directory (the -d option); at one point it was forcing some other settings as well. I still use it, and it is simply:

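#!/bin/sh
# Force the database directory and pass every argument through unchanged.
# "$@" keeps arguments containing spaces intact; exec hands back bogofilter's
# own exit status.
exec /usr/local/bin/bogofilter -d /usr/local/etc/bogofilter/ "$@"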

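The script is installed as /usr/local/bin/spamfilter (the name used in the procmail recipe in the next section), is owned by root:root, and has mode 0755.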

6. Procmail Modifications

Bogofilter hooks into procmail with ease. The man page for bogofilter gives a good procmailrc example. Here's what I do:

VERBOSE=yes
LOGDIR=$HOME/.procmail
LOGFILE=$LOGDIR/log

# Scan for spam
:0fw
| /usr/local/bin/spamfilter -u -e -p
      
# Return mail to queue on bogofilter failure
:0e
{ EXITCODE=75 HOST }

# Place in SPAM mbox if it's spam
:0:
* ^X-Bogosity: Yes, tests=bogofilter
SPAM
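Before trusting the recipe with live mail, it can help to see which X-Bogosity header the filter adds to a saved message. The following is a rough sketch: `check_message` and `SPAMFILTER` are names invented here, and it assumes the wrapper from the previous section is installed at /usr/local/bin/spamfilter.

```shell
#!/bin/sh
# Sketch: print the X-Bogosity header added to a saved message.
# SPAMFILTER and check_message are illustrative names, not part of bogofilter.
SPAMFILTER=${SPAMFILTER:-/usr/local/bin/spamfilter}

check_message() {
    # -p passes the message through with the header added,
    # -e makes the exit status 0 for both spam and non-spam.
    # -u is left out on purpose so that testing does not update the databases.
    # sed stops at the first blank line, so only headers are inspected.
    "$SPAMFILTER" -p -e < "$1" | sed -n -e '/^$/q' -e '/^X-Bogosity:/p'
}

# Example call (hypothetical file):
#   check_message "$HOME/Mail/sample-message.txt"
```

If the header printed for a known spam message does not start with "X-Bogosity: Yes", the last recipe above will not file it into the SPAM mbox, which usually means the databases need more training.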

7. Mailer Modifications

The man page provides some macros for Mutt that let you handle UCE that bogofilter didn't catch. I have Mutt configured so that if I hit Esc-Del, the message is forced through bogofilter flagged as spam. Pressing just Del will delete the message.
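Whatever the mail client, the macro ends up piping the current message into a training command. A minimal sketch of such a command follows; `train_spam`, `SPAMFILTER` and `TRAIN_FLAGS` are assumed names, and the Mutt macros themselves are best copied from the man page rather than from here.

```shell
#!/bin/sh
# Sketch: helper that a mail client macro could pipe a missed spam into.
# SPAMFILTER, TRAIN_FLAGS and train_spam are illustrative assumptions.
SPAMFILTER=${SPAMFILTER:-/usr/local/bin/spamfilter}

# -s registers the message as spam. Because the procmail recipe runs with -u,
# a missed spam may already have been registered as non-spam; newer bogofilter
# releases document -Ns for that correction. Check bogofilter(1) for the
# flags your version supports before changing this.
TRAIN_FLAGS=${TRAIN_FLAGS:--s}

train_spam() {
    # Reads one complete message on standard input.
    "$SPAMFILTER" $TRAIN_FLAGS
}

# Example call (hypothetical file):
#   train_spam < missed-spam.txt
```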

8. Global Installation

Global installation can be done several ways. No special steps are required other than just installing hooks in the global procmailrc file. If you want users to be able to train bogofilter with spam that wasn't caught, you will need to make bogofilter setuid root or create a user and/or group that bogofilter runs as and change the database files to that user/group.

I recommend that you install bogofilter under your account only rather than globally.

9. Upgrading Bogofilter

From time to time you will want to download and install a new version of bogofilter. The authors make upgrades easy with the bogoupgrade command, which upgrades your data files. You still need to compile and install the new version yourself, but they always provide a tool to upgrade the Berkeley DB files. Be sure to check the man page for details.
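Since bogoupgrade rewrites the data files, copying them aside first is cheap insurance. The sketch below assumes the databases live in /usr/local/etc/bogofilter, as in the wrapper script; `backup_wordlists`, `BOGODIR` and the backup directory naming are made up for illustration.

```shell
#!/bin/sh
# Sketch: copy the bogofilter databases aside before upgrading them.
# BOGODIR, backup_wordlists and the backup naming are illustrative assumptions.
BOGODIR=${BOGODIR:-/usr/local/etc/bogofilter}

backup_wordlists() {
    dest="$BOGODIR/backup-$(date +%Y%m%d)"
    mkdir -p "$dest" || return 1
    for db in "$BOGODIR"/*.db; do
        [ -f "$db" ] || continue
        # -p preserves ownership, mode and timestamps where possible.
        cp -p "$db" "$dest/" || return 1
    done
    # Print the backup location so the caller can record it.
    echo "$dest"
}

# After a successful backup and installing the new release, the upgrade
# itself would look something like this (check bogoupgrade(1) for the
# options your version accepts):
#   bogoupgrade -d "$BOGODIR"
```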

10. Resources