SpamGAME should run on any platform that has a Python 2.3 interpreter and stores email in Unix mbox or maildir format.
Given these preconditions, the software can easily be used on Linux, *BSD, Mac OS X and other Unices. So far it has only been tested on Linux.
What you need to run SpamGAME is:
Mandatory:
SpamGAME is distributed as a Python distutils package, so if you have the Python distutils libraries (they should be installed by default with Python 2.3, or included in the python2.3-dev package) it is very easy to install the software on a Unix-like system:
Unpack the archive: tar xvzf spamgame-x.y.z.tar.gz
Enter the spamgame-x.y.z directory: cd spamgame-x.y.z
Run the installer: python2.3 setup.py install (you must have root privileges to do this)
The spamgame package should now be installed under your python2.3 site-packages directory (on my Debian Linux distribution it goes into /usr/lib/python2.3/site-packages/spamgame/) and all the scripts should be in your $PATH (on my Debian Linux they are stored in /usr/bin/).
If you can't install the spamgame package in the above way (maybe because you can't gain root privileges, or you don't have the distutils package included in python2.3-dev), you can follow this simple alternative:
Unpack the archive: tar xvzf spamgame-x.y.z.tar.gz
Enter the spamgame-x.y.z directory: cd spamgame-x.y.z
Copy the spamgame directory to a location listed in your $PYTHONPATH. This variable must be set in your shell preferences.
Installing the .deb package is the preferred method if you have a Debian GNU/Linux distribution, because a simple
dpkg -i spamgame-x.y.z.deb
does all that is needed. In the .deb I have included a version of the PMW libraries that works with Python 2.3, because there is currently no Debian package of that library that works with Python 2.3. I hope no one will mind; I will remove it from the spamgame .deb as soon as there is an official pmw package for Python 2.3.
SpamGAME is just an email classifier, so it does not fetch or store the email for you. It is meant to be used as a thin layer between the retrieval of an email message and its storage on a local mailbox. In classifying a message it uses various methods, but the fundamental one is the categorization made by the GA.M.E. algorithm. In order to get the best classification, you must first train the system.
To train the system, you must first manually separate your past emails into two corpora, spam and ham (not spam), each stored as an mbox or maildir mailbox. If your emails are stored in a single mbox (or maildir), create another mbox (maildir) and move the spam emails into it.
It is recommended that you use at least 100-150 emails for each corpus to get good classification performance.
If the spam mbox (maildir) is at the path $HOME/Mail/spam (either the mbox file or the maildir directory) and the ham mbox (maildir) is at the path $HOME/Mail/ham, you can type the following commands:
spamgametrain.py -s $HOME/Mail/spam -f FORMAT
spamgametrain.py -g $HOME/Mail/ham -f FORMAT
The spamgametrain.py script should be in your $PATH.
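Before training, it can be useful to check that each corpus holds enough messages. The following is a minimal sketch that uses the standard mailbox module of a current Python 3 interpreter (not the Python 2.3 that SpamGAME itself targets). The function names, the "mbox" / "maildir" labels and the default threshold of 100 are assumptions made for this illustration; they are not part of SpamGAME.

```python
import mailbox


def open_corpus(path, fmt):
    """Open an existing corpus; fmt is 'mbox' or 'maildir' (labels chosen for this sketch)."""
    if fmt == "mbox":
        return mailbox.mbox(path, create=False)
    if fmt == "maildir":
        return mailbox.Maildir(path, create=False)
    raise ValueError("unknown mailbox format: %r" % (fmt,))


def corpus_size(path, fmt):
    """Return the number of messages stored in the corpus at path."""
    box = open_corpus(path, fmt)
    try:
        return len(box)
    finally:
        box.close()


def is_large_enough(path, fmt, minimum=100):
    """True when the corpus holds at least `minimum` messages."""
    return corpus_size(path, fmt) >= minimum
```

For example, is_large_enough("/home/me/Mail/spam", "mbox") (a hypothetical path) reports whether the spam corpus reaches the lower bound recommended above.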
Now that you have trained the system, you can classify emails. For example, if you have one email in the file email.txt, on a Unix-like system you could simply run:
cat email.txt | spamgamefilter.py
and you will see on standard output the same email message, with the added header X-Spam set to yes or no depending on how SpamGAME classified the message.
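The same pipe can be driven from a script. The sketch below is written for a current Python 3 interpreter and assumes only what is described above: spamgamefilter.py reads a message on standard input and writes it back with an X-Spam header. The helper names spam_verdict and classify_file are hypothetical.

```python
import email
import subprocess


def spam_verdict(filtered_message):
    """Return the lower-cased X-Spam value of a filtered message (bytes), or None if absent."""
    value = email.message_from_bytes(filtered_message).get("X-Spam")
    return str(value).strip().lower() if value is not None else None


def classify_file(path, command=("spamgamefilter.py",)):
    """Pipe the message stored at path through the filter command and return its verdict."""
    with open(path, "rb") as handle:
        result = subprocess.run(
            command, stdin=handle, stdout=subprocess.PIPE, check=True
        )
    return spam_verdict(result.stdout)
```

Calling classify_file("email.txt") is then the scripted counterpart of the cat pipeline; the command argument can be changed if the script is not in your $PATH.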
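I currently use SpamGAME inside KMail, the mail client for KDE. It is very simple to configure; just follow these steps:
Create a filter that pipes each incoming message through spamgamefilter.py
Create a second filter that matches on the X-Spam header
When X-Spam is yes, move the message to a dedicated mbox folder (the folder must be previously created)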
It is even simpler to integrate SpamGAME into the mail delivery agent Procmail; just add these stanzas to your .procmailrc file:
:0 fw
| spamgamefilter.py

:0
* ^X-Spam: yes
yourSpamMbox
At this moment I can't provide help on using SpamGAME with other mail clients, but it should be easy to adapt the KMail and Procmail procedures above. Basically, you have to make the email client pipe each message through SpamGAME, which returns the same message with the special header X-Spam set to yes or no, depending on whether the email was classified as spam or ham (not spam). Then you can decide what to do with the classified message: I do nothing with the good messages (I let them arrive in my default mailbox), while I move the messages classified as spam to a special folder containing only garbage. Check this folder from time to time for wrong classifications: at this moment SpamGAME is quite accurate, but it is not perfect. Remember that performance depends only on how well you trained the system in the training phase.
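For a mail setup that can run a script on delivery, the routing decision can be sketched in a few lines. This is only an illustration for a current Python 3 interpreter, assuming the message has already been passed through spamgamefilter.py and that both destinations are mbox files; the function names and the two path parameters are hypothetical.

```python
import email
import mailbox


def is_spam(raw_message):
    """True when the filtered message (bytes) has an X-Spam header whose value is yes."""
    value = email.message_from_bytes(raw_message).get("X-Spam", "")
    return str(value).strip().lower() == "yes"


def deliver(raw_message, inbox_path, spam_path):
    """Append the message to the spam mbox or to the inbox mbox; return the path used."""
    target = spam_path if is_spam(raw_message) else inbox_path
    box = mailbox.mbox(target)  # created on first use if it does not exist
    try:
        box.lock()
        box.add(raw_message)
    finally:
        box.close()  # flushes pending changes and releases the lock
    return target
```

Messages without an X-Spam header are treated as ham here, so an unfiltered message never ends up in the spam folder by accident.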
SpamGAME also applies whitelist / blacklist filtering before classifying a message.
To personalize these lists, use an XML editor to edit the file config.xml in your personal SpamGAME user directory, which is created when you run the script spamgameinstall.py. Under Linux this directory is $HOME/.spamgame/.
When you open this file, look for the whitelist (or blacklist) tag and add as many address elements as you need. Every email address in the whitelist subtree makes the filter treat an incoming message from that address as ham (not spam); every email address in the blacklist subtree makes an incoming message from that address be classified as spam without looking at its content.
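If you prefer a script to an XML editor, the same change can be sketched with the standard xml.etree.ElementTree module of a current Python 3 interpreter. Only the tag names whitelist, blacklist and address come from the description above. The rest is an assumption to check against your own config.xml before use: in particular that each address is stored as the text of an address element, and that the file can be rewritten without losing anything you care about (ElementTree drops comments by default). The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_address(root, list_tag, address):
    """Append an <address> element to the first <list_tag> element found under root.

    Assumption: the address is stored as element text, e.g.
    <whitelist><address>friend@example.org</address></whitelist>.
    """
    target = root if root.tag == list_tag else root.find(".//" + list_tag)
    if target is None:
        raise ValueError("no <%s> element found" % list_tag)
    existing = [(el.text or "").strip() for el in target.findall("address")]
    if address not in existing:  # avoid duplicate entries
        ET.SubElement(target, "address").text = address
    return target


def update_config(path, list_tag, address):
    """Load the config file at path, add the address to the given list, and save it back."""
    tree = ET.parse(path)
    add_address(tree.getroot(), list_tag, address)
    tree.write(path, encoding="utf-8", xml_declaration=True)
```

Keeping a backup copy of config.xml before running update_config on it is a sensible precaution.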