dataswamp.org

       Title: Filtering spam using Rspamd and OpenSMTPD on OpenBSD
       Author: Solène
       Date: 13 July 2021
       Tags: openbsd mail spam
       Description: 
       
       # Introduction
       
       I recently used SpamAssassin to get rid of the spam I had started to
       receive, but it proved to be quite useless against some kinds of spam,
       so I decided to give rspamd a try and write about it.
       
       rspamd can filter spam but can also sign outgoing messages with DKIM.
       I will only cover the anti-spam aspect here.
       
 (HTM) rspamd project website
       
       # Setup
       
       The rspamd setup for spam filtering was incredibly easy on OpenBSD
       (6.9 when I wrote this).  We need to install the rspamd service, the
       connector for OpenSMTPD, and redis, which rspamd requires in order to
       work.
       
       ```shell instructions
       pkg_add opensmtpd-filter-rspamd rspamd redis
       rcctl enable redis rspamd
       rcctl start redis rspamd
       ```
       
       Modify your /etc/mail/smtpd.conf file to add this new line:
       
       ```smtpd.conf file
       filter rspamd proc-exec "filter-rspamd"
       ```
       
       Then append the option filter "rspamd" to each of your "listen on ..."
       lines, like in this example:
       
       ```smtpd.conf file
       listen on em0 pki perso.pw tls auth-optional   filter "rspamd"
       listen on em0 pki perso.pw smtps auth-optional filter "rspamd"
       ```
       
       Restart smtpd with "rcctl restart smtpd" and you should have rspamd
       working!
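       
       A quick way to check that rspamd really scans messages is to feed it
       the GTUBE test string, a standard pattern that spam filters are
       expected to flag.  The sketch below only builds such a message:
       gtube_message is a made-up helper name, the example.org addresses are
       placeholders, and it assumes GTUBE detection has not been disabled in
       your rspamd configuration.
       
       ```shell
       # Print a minimal email whose body is the standard GTUBE test string.
       # gtube_message is a hypothetical helper; addresses are placeholders.
       gtube_message() {
           printf 'From: test@example.org\n'
           printf 'To: you@example.org\n'
           printf 'Subject: GTUBE test\n'
           printf '\n'
           printf '%s\n' 'XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X'
       }
       ```
       
       Running "gtube_message | rspamc" on the mail server should then report
       the message as spam, most likely with a reject action.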
       
       # Using rspamd
       
       Rspamd automatically checks multiple criteria to assign a score to
       each incoming email.  Above a high score the email is rejected, but
       between a low score and that rejection threshold it may be tagged with
       an "X-Spam" header set to "yes".
       
       If you want tagged emails to be moved automatically into your Junk
       directory, either use a sieve filter on the server side or use a local
       filter in your email client.  The sieve filter would look like this:
       
       ```sieve rule
       # the "fileinto" action must be declared before it can be used
       require ["fileinto"];
       
       if header :contains "X-Spam" "yes" {
               fileinto "Junk";
               stop;
       }
       ```
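       
       To make the score ranges described at the start of this section more
       concrete, here is a small sketch of how a score maps to an action.
       The thresholds (4 for greylist, 6 for add header, 15 for reject) are
       assumed to be the stock rspamd defaults, so check the actions
       configuration of your own install.  classify_score is a made-up
       helper, not something rspamd provides.
       
       ```shell
       # classify_score SCORE: print the action expected for a given score,
       # using ASSUMED default thresholds (hypothetical helper).
       classify_score() {
           awk -v s="$1" 'BEGIN {
               if      (s >= 15) print "reject"
               else if (s >= 6)  print "add header"
               else if (s >= 4)  print "greylist"
               else              print "no action"
           }'
       }
       ```
       
       With these assumed values, a message scoring 7.5 would only get the
       header, which is the case the sieve rule above takes care of.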
       
       # Feeding rspamd
       
       If you want better results, the filter needs to learn what is spam and
       what is not spam (called ham).  You need to regularly scan new emails
       to increase the effectiveness of the filter.  In my example, I have a
       single user with a Junk directory and an Archives directory within the
       maildir storage, and I use crontab to run the learning on mails newer
       than 24 hours.
       
       ```crontab
       0  1 * * * find /home/solene/maildir/.Archives/cur/ -mtime -1 -type f -exec rspamc learn_ham {} +
       10 1 * * * find /home/solene/maildir/.Junk/cur/     -mtime -1 -type f -exec rspamc learn_spam {} +
       ```
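       
       If you have more than one folder or user to learn from, the two cron
       lines can be folded into a small script.  This is only a sketch:
       learn_recent and the RSPAMC variable are names invented here, and the
       only rspamc commands used are the learn_ham and learn_spam ones shown
       above.
       
       ```shell
       # RSPAMC lets you point at another binary; defaults to rspamc in PATH.
       RSPAMC=${RSPAMC:-rspamc}
       
       # learn_recent KIND DIR: feed the files of DIR modified during the
       # last 24 hours to "rspamc learn_KIND" (KIND is "ham" or "spam").
       learn_recent() {
           kind=$1
           dir=$2
           case "$kind" in
               ham|spam) ;;
               *) echo "unknown kind: $kind" >&2; return 1 ;;
           esac
           [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
           find "$dir" -mtime -1 -type f -exec "$RSPAMC" "learn_$kind" {} +
       }
       ```
       
       Calling "learn_recent ham /home/solene/maildir/.Archives/cur/" and
       "learn_recent spam /home/solene/maildir/.Junk/cur/" from a script run
       by cron does the same job as the two crontab lines.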
       
       # Getting statistics
       
       rspamd comes with very nice reporting tools.  A web UI is available on
       port 11334, but it listens on localhost by default, so you either
       need to configure rspamd to listen on other addresses or use an SSH
       tunnel.
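       
       For the SSH tunnel option, a local port forward is enough.  In this
       sketch, webui_tunnel is a made-up wrapper name and its argument is
       whatever user and host you normally use to reach your mail server.  It
       assumes the web UI is still on its default 127.0.0.1:11334.
       
       ```shell
       # webui_tunnel HOST: forward local port 11334 to the rspamd web UI
       # listening on the server's localhost (hypothetical helper).
       webui_tunnel() {
           # -N: do not run a remote command, -L: local port forwarding
           ssh -N -L 11334:127.0.0.1:11334 "$1"
       }
       ```
       
       While "webui_tunnel user@mail.example.org" is running, the web UI
       should be reachable at http://localhost:11334/ from your own machine.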
       
       You can get the same statistics on the command line using the command
       "rspamc stat" which should have an output similar to this:
       
       ```command line output
       Results for command: stat (0.031 seconds)
       Messages scanned: 615
       Messages with action reject: 15, 2.43%
       Messages with action soft reject: 0, 0.00%
       Messages with action rewrite subject: 0, 0.00%
       Messages with action add header: 9, 1.46%
       Messages with action greylist: 6, 0.97%
       Messages with action no action: 585, 95.12%
       Messages treated as spam: 24, 3.90%
       Messages treated as ham: 591, 96.09%
       Messages learned: 4167
       Connections count: 611
       Control connections count: 5190
       Pools allocated: 5824
       Pools freed: 5801
       Bytes allocated: 31.17MiB
       Memory chunks allocated: 158
       Shared chunks allocated: 16
       Chunks freed: 0
       Oversized chunks: 575
       Fuzzy hashes in storage "rspamd.com": 2936336370
       Fuzzy hashes stored: 2936336370
       Statfile: BAYES_SPAM type: redis; length: 0; free blocks: 0; total blocks: 0; free: 0.00%; learned: 344; users: 1; languages: 0
       Statfile: BAYES_HAM type: redis; length: 0; free blocks: 0; total blocks: 0; free: 0.00%; learned: 3822; users: 1; languages: 0
       Total learns: 4166
       ```
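       
       These numbers are easy to extract if you want to log or graph them.
       As a sketch, the function below (spam_percent is an invented name)
       reads "rspamc stat" output on standard input and recomputes the share
       of messages treated as spam.  It assumes the line labels shown above
       stay the same across rspamd versions.
       
       ```shell
       # spam_percent: read "rspamc stat" output on stdin and print the
       # percentage of scanned messages treated as spam (hypothetical helper).
       spam_percent() {
           awk -F': ' '
               /^ *Messages scanned/         { total = $2 }
               /^ *Messages treated as spam/ { split($2, part, ","); spam = part[1] }
               END { if (total > 0) printf "%.2f\n", spam * 100 / total }
           '
       }
       ```
       
       With the output above, "rspamc stat | spam_percent" would print 3.90
       (24 messages out of 615).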
       
       # Conclusion
       
       For me, rspamd is a huge improvement in terms of efficiency: when I
       tag an email as spam, the next similar one goes straight into the Junk
       directory once the learning cron job has run.  It also uses less
       memory than SpamAssassin and reports nice statistics.  My SpamAssassin
       setup was directly rejecting emails, so I didn't have a good
       understanding of its effectiveness, but over several weeks I got too
       many identical messages that were never filtered.  So far rspamd has
       proved to be better here.
       
       I recommend looking at the configuration files.  The options in them
       are all disabled by default, but they come with many explanatory
       comments, which makes them a nice introduction to the features of
       rspamd.  I preferred to keep the defaults and see how it goes before
       tweaking more.