SpamAssassin

From Wikislax

What is SpamAssassin?

SpamAssassin checks mail for spam using syntactic rules written in Perl and optional external modules. Integrating SpamAssassin with an MTA requires glue code, such as an external daemon (amavisd-new) or MIMEDefang with the sendmail Milter interface.

Installing Perl modules

Perl modules can be installed interactively from the CPAN shell, started with "# perl -MCPAN -e shell". On first use, CPAN asks a series of questions, most of which can be answered by pressing <enter>. Choose to install missing modules automatically when CPAN detects them. In theory, this ensures that all missing modules get installed. In practice, SpamAssassin did not work properly (no spam detected) after installing this way, so we recommend installing the Perl modules explicitly, in the order described below, which worked for us.

The modules are installed in sub-directories of /usr/local/lib64/perl5 and /usr/local/share/perl5. It is still unclear to us whether modules can be removed, and how. Any module that fails to install through CPAN can instead be downloaded from the CPAN site and installed manually:

# tar -C /usr/local -xvf module-x.y.z.tar.gz
# cd /usr/local
# chown -R root:root module-x.y.z
# cd module-x.y.z
# perl Makefile.PL
# make
# make install

Modules required by SpamAssassin

The modules below must be installed prior to installing SpamAssassin:

# perl -MCPAN -e shell
. . .
cpan> install CPAN
cpan> install YAML
cpan> install Digest::SHA1
cpan> install HTML::Parser
cpan> install Net::DNS
cpan> install LWP::UserAgent
cpan> install HTTP::Date
cpan> install IO::Zlib
cpan> install Archive::Tar
cpan> install MIME::Base64
cpan> install DB_File
cpan> install Net::SMTP
cpan> install Mail::SPF
cpan> install IP::Country::Fast
cpan> install Compress::Zlib
cpan> install Time::HiRes
cpan> install Mail::DKIM
cpan> install Mail::DomainKeys
cpan> install DBI
cpan> install DBD
cpan> install Encode::Detect
cpan> install Mail::SPF::Query
cpan> install Net::Ident
cpan> install IO::Socket::SSL
cpan> install Bundle::CPAN
cpan> install IO::Stringy
cpan> install Mail::Audit
cpan> install Unix::Syslog
cpan> quit
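
Once the installs finish, it can help to confirm that each module actually loads, since a silent CPAN failure is easy to miss. The helper below is a minimal sketch: check_perl_modules is a hypothetical name, not something shipped with SpamAssassin, and it relies only on "perl -MName -e 1", which exits non-zero when a module cannot be loaded.

```shell
# Print each module that fails to load; return non-zero if any is missing.
# check_perl_modules is an illustrative helper, not part of SpamAssassin.
check_perl_modules() {
    missing=0
    for module in "$@"; do
        # "perl -MName -e 1" loads the module, then runs an empty program.
        if ! perl -M"$module" -e 1 2>/dev/null; then
            echo "missing: $module"
            missing=1
        fi
    done
    return "$missing"
}
```

It could be called with any subset of the list above, for example "check_perl_modules Digest::SHA1 HTML::Parser Net::DNS Mail::SPF Mail::DKIM".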

Installing SpamAssassin

Download the tarball, then untar and install it as below. Then test with the spamassassin command and check the generated files to make sure everything went well. The -t flag runs in test mode, and the -D flag prints an extended debug trace on standard error:

# tar -tvf Mail-SpamAssassin-x.y.z.tar.gz
# tar -C /usr/local -xvf Mail-SpamAssassin-x.y.z.tar.gz
# cd /usr/local/Mail-SpamAssassin-x.y.z
# perl Makefile.PL
# make
# make install
# sa-update --updatedir /usr/local/share/spamassassin
# spamassassin -t -D < sample-nonspam.txt > nonspam.out
# spamassassin -t -D < sample-spam.txt > spam.out
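
To check the two generated files quickly, the verdict can be pulled out of the X-Spam-Status header. This is a sketch under assumptions: sa_verdict is a hypothetical helper name, and it expects the output to carry the usual header of the form "X-Spam-Status: Yes, score=..." or "X-Spam-Status: No, score=...".

```shell
# Print the verdict ("Yes" or "No") from the first X-Spam-Status header in a file.
# sa_verdict is an illustrative helper name, not a SpamAssassin command.
sa_verdict() {
    sed -n 's/^X-Spam-Status: \([A-Za-z]*\),.*/\1/p' "$1" | head -n 1
}
```

With a working installation, "sa_verdict spam.out" would be expected to print Yes and "sa_verdict nonspam.out" to print No. An empty result suggests the header was not added at all.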

In practice, the basic syntactic tests in SpamAssassin are not very effective on their own. External modules can extend them, at some cost in CPU or elapsed time. Below are a few modules that really work.

Installing SPF

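Sender Policy Framework (SPF) lets a domain publish in DNS which hosts are allowed to send mail on its behalf, so that a receiver can check that a message comes from a legitimate sender for the claimed domain. Install as below: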

# perl -MCPAN -e shell
. . .
cpan> install Mail::SPF
cpan> install Mail::SPF::Query

Installing DCC

The Distributed Checksum Clearinghouse (DCC) uses checksums of known spam to score incoming mail. Download, then install as below:

# tar -C /usr/local -xvzf dcc.tar.Z
# cd /usr/local/dcc-x.y.z
# ./configure
# make
# make install
# make clean
# cd /var
# mkdir dcc
# groupadd milter
# useradd -g milter -s /bin/bash milter
# chown -R milter:milter dcc

To use DCC, uncomment the DCC line in /etc/mail/spamassassin/v310.pre. Also allow UDP traffic to port 6277 from your client ports, and the replies back. Provided you already accept outgoing traffic, accept the UDP reply packets coming from port 6277 by modifying /etc/rc.d/rc.firewall as below:

# vi /etc/rc.d/rc.firewall
. . .
iptables -A INPUT -p udp -j ACCEPT --dport 1024:65535 --sport 6277
:x
# /etc/rc.d/rc.firewall restart
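
The uncommenting step in v310.pre can also be scripted rather than done by hand. The function below is a sketch only: enable_sa_plugin is a hypothetical name, and it assumes the plugin line is present in the file in the commented form "#loadplugin Mail::SpamAssassin::Plugin::DCC". Check your own v310.pre before relying on it.

```shell
# Uncomment the loadplugin line of one plugin in a SpamAssassin .pre file.
# enable_sa_plugin is an illustrative helper; arguments: plugin name, file path.
enable_sa_plugin() {
    plugin=$1
    pre_file=$2
    tmp=$(mktemp) || return 1
    # Strip the leading "#" only from the exact loadplugin line for this plugin.
    if sed "s/^#[[:space:]]*\(loadplugin[[:space:]][[:space:]]*Mail::SpamAssassin::Plugin::${plugin}[[:space:]]*\)\$/\1/" \
        "$pre_file" > "$tmp"; then
        # Overwrite the content in place so the file keeps its owner and mode.
        cat "$tmp" > "$pre_file"
    fi
    rm -f "$tmp"
}
```

A possible call is "enable_sa_plugin DCC /etc/mail/spamassassin/v310.pre". Lines for other plugins are left untouched.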

Installing Razor

Razor also uses checksums of known spam to score incoming mail. It requires outbound access to TCP ports 7 and 2703, and the Perl modules Time::HiRes and Getopt::Long. Download razor-agents-x.y.tar.gz and install as below:

# perl -MCPAN -e shell
. . .
cpan> install Time::HiRes
cpan> install Getopt::Long
cpan> quit
# tar -C /usr/local -xvf razor-agents-x.y.tar.gz
# cd /usr/local/razor-agents-x.y
# perl Makefile.PL && make && make install && make clean
# cd
# razor-admin -create
# razor-admin -discover
# razor-admin -register

Installing Pyzor

Pyzor is a free, collaborative system that shares message hashes to detect spam. It requires outbound access to UDP and TCP port 24441 (judging from the mailing lists on the Pyzor site, the Pyzor service is sometimes down). Download the tarball, then install as below:

# tar -C /usr/local -xvf pyzor-x.y.z.tar.bz2
# chown -R milter:milter /usr/local/pyzor-x.y.z
# wget https://bootstrap.pypa.io/ez_setup.py -O - | python
# su milter
$ cd /usr/local/pyzor-x.y.z
$ python setup.py build
$ python setup.py install
$ pyzor discover
$ <ctrl>d
# mv /usr/bin/pyzor /usr/local/bin
# mv /usr/bin/pyzord /usr/local/bin

Language and locale check

Uncomment the TextCat language guesser in v310.pre and add the following lines to local.cf:

# Mail written in the languages with these codes will not be marked
# as being possibly spam in a foreign language.
# - english french italian russian
ok_languages en fr it ru

# Mail using the character sets of these locales will not be marked
# as being possibly spam in a foreign language.
# - en covers Western character sets (english french italian), ru is Cyrillic
ok_locales en ru

Another option for creating local.cf is to use the SpamAssassin configuration generator.

The Bayes module

The SpamAssassin Bayes module compares incoming mail against databases built from previously learned hams and spams, and assigns a spam probability. The module is only used once it has learned at least 200 hams and 200 spams (enforced at run-time), and needs about 3000 of each to be fully effective. You must provide the training mail. sa-learn analyzes it and initializes the databases from mbox or mbx mail folders. However, neither format is available with cyrus-imap. An alternative is to use fetchmail to dump the messages from the existing cyrus-imap mailboxes. In the example below, LearnSpam and LearnHam are shared IMAP folders created using cyradm:

# cyradm --user postmaster --auth plain localhost
Password: 
localhost> cm LearnHam
localhost> sq LearnHam 307200
localhost> sam LearnHam myUser lrswipcda
localhost> cm LearnSpam
localhost> sq LearnSpam 307200
localhost> sam LearnSpam myUser lrswipcda
localhost> quit

Once the shared IMAP folders are created, ham and spam mails can be copied into them manually, for example with Thunderbird (not shown here). The Bayes database is created in the ~/.spamassassin directory using:

# sa-learn --clear
# sa-learn --sync
# sa-learn --dump magic
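
After some training, the same "sa-learn --dump magic" output shows whether the 200 ham / 200 spam minimum has been reached. The filter below is a sketch: bayes_counts is a hypothetical name, and it assumes the usual dump layout, where the count sits in the third column of the lines ending in "non-token data: nspam" and "non-token data: nham".

```shell
# Read "sa-learn --dump magic" output on stdin, print the spam and ham counts,
# and return 0 only when both reach the minimum (first argument, default 200).
# bayes_counts is an illustrative helper name.
bayes_counts() {
    awk -v min="${1:-200}" '
        /non-token data: nspam/ { nspam = $3 }
        /non-token data: nham/  { nham = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            # Exit status 0 means both counts are at or above the minimum.
            exit !((nspam + 0) >= (min + 0) && (nham + 0) >= (min + 0))
        }
    '
}
```

Typical use would be "sa-learn --dump magic | bayes_counts", or "sa-learn --dump magic | bayes_counts 3000" to check against the larger target.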

Before invoking fetchmail, create a .fetchmailrc configuration file that only its owner can access (permissions no wider than 700, otherwise fetchmail refuses to use it):

# vi ~/.fetchmailrc
. . .
poll inner.studioware.com proto imap service 993 user "myUser" pass "myPass" ssl keep
:x
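
Since the file holds a password in clear text, it can also be written non-interactively with restrictive permissions from the start. This is a sketch: write_fetchmailrc is a hypothetical helper, and the host, user and password passed to it are placeholders to replace with your own values.

```shell
# Write a one-line fetchmail configuration that only its owner can read.
# write_fetchmailrc is an illustrative helper; arguments: file host user password.
write_fetchmailrc() {
    (
        umask 077   # a file newly created in this subshell gets mode 600
        printf 'poll %s proto imap service 993 user "%s" pass "%s" ssl keep\n' \
            "$2" "$3" "$4" > "$1"
        chmod 600 "$1"   # also tighten a file that already existed
    )
}
```

For example, "write_fetchmailrc ~/.fetchmailrc mail.example.org myUser myPass", where mail.example.org stands for your IMAP server.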

fetchmail can now be invoked with sa-learn as the delivery agent (-m), so that each fetched message is passed to sa-learn:

# fetchmail --folder 'LearnSpam' -m 'sa-learn --spam'
# fetchmail --folder 'LearnHam' -m 'sa-learn --ham'

Running SpamAssassin

SpamAssassin is not run as a standalone program here. It is called from the Perl code in MIMEDefang, which is itself launched through the sendmail milter interface. So there is no daemon to put in place and nothing to launch!

