What is SpamAssassin?
SpamAssassin checks mail for spam using syntactic rules written in Perl and optional external modules. Integrating SpamAssassin with an MTA requires glue code, such as an external daemon (amavisd-new) or MIMEDefang with the sendmail Milter interface.
Installing Perl modules
Perl modules can be installed interactively with the CPAN shell ("# perl -MCPAN -e shell"). On first use, CPAN asks questions, most of which can be answered by pressing <enter>. Choose to automatically install missing modules when CPAN detects them. In theory, this ensures that all missing modules are installed. In practice, SpamAssassin did not work properly (no spam detected) after installing this way, so we recommend installing the Perl modules in the order described below, which worked for us.
The modules are distributed in sub-directories of /usr/local/lib64/perl5 and /usr/local/share/perl5. It is still unclear whether modules can be removed, and how. Any module that fails to install can instead be downloaded from the CPAN site and installed this way:
# tar -C /usr/local -xvf module-x.y.z
# cd /usr/local
# chown -R root:root module-x.y.z
# cd module-x.y.z
# perl Makefile.PL
# make
# make install
Modules required by SpamAssassin
The modules below must be installed before installing SpamAssassin:
# perl -MCPAN -e shell
. . .
cpan> install CPAN
cpan> install YAML
cpan> install Digest::SHA1
cpan> install HTML::Parser
cpan> install Net::DNS
cpan> install LWP::UserAgent
cpan> install HTTP::Date
cpan> install IO::Zlib
cpan> install Archive::Tar
cpan> install MIME::Base64
cpan> install DB_File
cpan> install Net::SMTP
cpan> install Mail::SPF
cpan> install IP::Country::Fast
cpan> install Compress::Zlib
cpan> install Time::HiRes
cpan> install Mail::DKIM
cpan> install Mail::DomainKeys
cpan> install DBI
cpan> install DBD
cpan> install Encode::Detect
cpan> install Mail::SPF::Query
cpan> install Net::Ident
cpan> install IO::Socket::SSL
cpan> install Bundle::CPAN
cpan> install IO::Stringy
cpan> install Mail::Audit
cpan> install Unix::Syslog
cpan> quit
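Since the automatic install did not always leave us with a working set of modules, it can help to check afterwards which ones actually load. The function below is a minimal sketch, not part of CPAN or SpamAssassin: missing_perl_modules is a hypothetical helper name, and PERL is an assumed optional variable for pointing at a different interpreter.

```shell
# Print "missing: <module>" for every Perl module that cannot be loaded.
# Returns 0 when all modules load, 1 otherwise.
# PERL is a hypothetical override for the interpreter (defaults to "perl").
missing_perl_modules() {
    status=0
    for module in "$@"; do
        # "perl -MSome::Module -e 1" exits non-zero when the module is absent.
        if ! "${PERL:-perl}" "-M$module" -e 1 >/dev/null 2>&1; then
            echo "missing: $module"
            status=1
        fi
    done
    return $status
}

# Example (a few modules from the list above):
#   missing_perl_modules Net::DNS HTML::Parser Mail::SPF DB_File
```

Any module reported as missing can then be installed again from the CPAN shell or from its tarball.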
Download the tarball, then untar and install it as below. Test with the spamassassin command and check the generated files to make sure everything is OK. The -D flag prints an extended debug trace:
# tar -tvf Mail-SpamAssassin-x.y.z.tar.gz
# tar -C /usr/local -xvf Mail-SpamAssassin-x.y.z.tar.gz
# cd /usr/local/Mail-SpamAssassin-x.y.z
# perl Makefile.PL
# make
# make install
# sa-update --updatedir /usr/local/share/spamassassin
# spamassassin -t -D < sample-nonspam.txt > nonspam.out
# spamassassin -t -D < sample-spam.txt > spam.out
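The two output files can be checked quickly by pulling the verdict out of the headers SpamAssassin adds. This is a minimal sketch, assuming the output contains an X-Spam-Status header of the usual form "Yes, score=N required=N ..."; spam_status is a hypothetical helper name, not a SpamAssassin command.

```shell
# Print "<verdict> <score>" from the first X-Spam-Status header in a file.
# Returns 1 when no such header (with a score= field) is found.
spam_status() {
    awk '
        /^X-Spam-Status:/ {
            verdict = $2
            sub(/,$/, "", verdict)          # "Yes," -> "Yes"
            for (i = 3; i <= NF; i++) {
                if ($i ~ /^score=/) {
                    sub(/^score=/, "", $i)  # "score=7.2" -> "7.2"
                    print verdict, $i
                    found = 1
                    exit
                }
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example:
#   spam_status spam.out
#   spam_status nonspam.out
```

If spam.out reports "No" for a message that is obviously spam, the installation is probably in the "no spam detected" state described earlier.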
In practice, the basic syntactic tests in SpamAssassin are not very effective on their own. External modules can extend them, but they cost some CPU or network time. Below are a few modules that really work.
Sender Policy Framework (SPF) is a protocol for checking that a sending host is authorized to send mail for the domain it claims. Install as below:
# perl -MCPAN -e shell
. . .
cpan> install Mail::SPF
cpan> install Mail::SPF::Query
# tar -C /usr/local -xvzf dcc.tar.Z
# cd /usr/local/dcc-x.y.z
# ./configure
# make
# make install
# make clean
# cd /var
# mkdir dcc
# groupadd milter
# useradd -g milter -s /bin/bash milter
# chown -R milter:milter dcc
To use DCC, uncomment the DCC line in /etc/mail/spamassassin/v310.pre. Also authorize UDP connections to port 6277 from your client ports and back. Provided you already accept outgoing traffic, accept the UDP replies coming from port 6277 by modifying /etc/rc.d/rc.firewall as below:
# vi /etc/rc.d/rc.firewall
. . .
iptables -A INPUT -p udp -j ACCEPT --dport 1024:65535 --sport 6277
:x
# /etc/rc.d/rc.firewall restart
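The uncommenting step for the DCC plugin can also be scripted instead of edited by hand. The sketch below assumes the line in v310.pre reads "#loadplugin Mail::SpamAssassin::Plugin::DCC"; enable_plugin is a hypothetical helper name. It rewrites the file through a temporary copy so that ownership and permissions of the original are kept.

```shell
# Uncomment "#loadplugin <plugin>" in a SpamAssassin .pre file.
# Lines for other plugins and ordinary comments are left untouched.
enable_plugin() {
    pre_file=$1
    plugin=$2
    tmp=$(mktemp) || return 1
    # Strip the leading "#" only on the loadplugin line for this plugin.
    sed "s/^#[[:space:]]*\(loadplugin[[:space:]][[:space:]]*${plugin}\)[[:space:]]*\$/\1/" \
        "$pre_file" > "$tmp" && cat "$tmp" > "$pre_file"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Example (path taken from the text above):
#   enable_plugin /etc/mail/spamassassin/v310.pre Mail::SpamAssassin::Plugin::DCC
```

Running it a second time is harmless, since an already uncommented line no longer matches.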
Razor uses checksums of known spam to score incoming mail. It requires outbound access to TCP ports 7 and 2703, and the Perl modules Time::HiRes and Getopt::Long. Download razor-agents-x.y.tar.gz and install as below:
# perl -MCPAN -e shell
. . .
cpan> install Time::HiRes
cpan> install Getopt::Long
cpan> quit
# tar -C /usr/local -xvf razor-agents-x.y.tar.gz
# cd /usr/local/razor-agents-x.y
# perl Makefile.PL && make && make install && make clean
# cd
# razor-admin -create
# razor-admin -discover
# razor-admin -register
Pyzor is a free hash-sharing system (database and software). It requires outbound access to UDP and TCP port 24441. Judging from the mailing lists on the Pyzor site, the Pyzor service is sometimes down. Download the tarball, then install as below:
# tar -C /usr/local -xvf pyzor-x.y.z.tar.bz2
# chown -R milter:milter /usr/local/pyzor-x.y.z
# wget https://bootstrap.pypa.io/ez_setup.py -O - | python
# su milter
$ cd /usr/local/pyzor-x.y.z
$ python setup.py build
$ python setup.py install
$ pyzor discover
$ <ctrl>d
# mv /usr/bin/pyzor /usr/local/bin
# mv /usr/bin/pyzord /usr/local/bin
Language and locale check
Uncomment the TextCat language guesser in v310.pre and add the following lines to local.cf:
# Mail using languages used in these country codes will not be marked
# as being possibly spam in a foreign language.
# - english french italian russian
ok_languages en fr it ru
# Mail using locales used in these country codes will not be marked
# as being possibly spam in a foreign language.
# - english french italian russian
ok_locales en fr it ru
Another option for creating local.cf is to use the SpamAssassin configuration generator.
The Bayes module
The SpamAssassin Bayes module compares incoming mails against databases of previously registered hams and spams and assigns probabilities. The module requires at least 200 hams and 200 spams before it can be used (enforced at run-time), and about 3000 hams and 3000 spams to be fully effective. You must provide the training mails. sa-learn analyzes them and initializes the databases from mbox or mbx mail folders. However, with cyrus-imap, neither of these formats is available. Instead, fetchmail can be used to dump the messages from the existing cyrus-imap mailboxes. In the example below, LearnSpam and LearnHam are shared imap folders that can be created using cyradm:
# cyradm --user postmaster --auth plain localhost
Password:
localhost> cm LearnHam
localhost> sq LearnHam 307200
localhost> sam LearnHam myUser lrswipcda
localhost> cm LearnSpam
localhost> sq LearnSpam 307200
localhost> sam LearnSpam myUser lrswipcda
localhost> quit
Once the shared imap folders are created, ham and spam mails can be copied into them manually, for example with Thunderbird (not shown here). The Bayes database is created in the ~/.spamassassin directory using:
# sa-learn --clear
# sa-learn --sync
# sa-learn --dump magic
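The 200 ham / 200 spam threshold can be checked from the output of sa-learn --dump magic once some mails have been learned. This is a sketch under an assumption: that the dump lists the counters on lines ending in "nspam" and "nham" with the count in the third column. bayes_counts is a hypothetical helper name.

```shell
# Read "sa-learn --dump magic" output on stdin and print the learned counts.
# Returns 0 when both counts reach the minimum (default 200), 1 otherwise.
bayes_counts() {
    awk -v min="${1:-200}" '
        $NF == "nspam" { spam = $3 }
        $NF == "nham"  { ham = $3 }
        END {
            printf "nspam=%d nham=%d\n", spam, ham
            exit ((spam + 0 >= min + 0 && ham + 0 >= min + 0) ? 0 : 1)
        }
    '
}

# Example:
#   sa-learn --dump magic | bayes_counts && echo "Bayes has enough training data"
```

Right after sa-learn --clear both counts are expected to be zero, so the check only becomes useful after the training folders have been fed to sa-learn.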
Before invoking fetchmail, create a .fetchmailrc configuration file with permissions 700:
# vi ~/.fetchmailrc
. . .
poll inner.studioware.com proto imap service 993 user "myUser" pass "myPass" ssl keep
<esc>
Fetchmail can now be invoked to pass the mails to sa-learn for learning:
# fetchmail --folder 'LearnSpam' -m 'sa-learn --spam'
# fetchmail --folder 'LearnHam' -m 'sa-learn --ham'
SpamAssassin is not run as a standalone program here; it is called from the Perl code in MIMEDefang, which is itself launched through the sendmail milter interface. So there is no daemon to set up and nothing to launch!