Random ramblings of a security nerd

Auto-tagging SPAM emails

Are you tired of publishing SPAM? Join me on a journey to set up simple blocklists to auto-filter based on origin and sender for Postfix mail servers.

If you're in academia, you likely know publishing SPAM. For those who are not (or who have missed out on the pleasure so far), publishing SPAM is unsolicited email from publishers that request articles, offer publication services, or sell auxiliary services such as proofreading or "help" with getting papers published. These publishers generally don't offer a way to unsubscribe and keep you on their list, especially if you reply.

While not disastrous, these emails are annoying. For example, I end up with 10-15 such unwanted emails in my inbox each day. My first instinct was to train local SPAM filters to remove these emails. But as I'm using at least 5 different computing systems (desktop home, desktop office, laptop home, laptop office, mobile client; I know, I should look for help), I was looking for a different solution that avoids over-training local SPAM filters. Given that these publishers generally stick to the same domains and sender addresses, a simple blocklist should be sufficient to filter them. As I'm running my own mail server, this should be a piece of cake, right?

On my mail server, the main pieces are Postfix for SMTP handling, SpamAssassin for SPAM filtering, maildrop for vmail delivery, and Dovecot for IMAP connections to the clients. Any of these components should be able to implement a simple blocklist based on the sender address. Or so I thought. Paging in all the configuration and customization across the different components was somewhat difficult, especially as my configuration has grown over the last couple of years.

After searching the web for a bit, I discovered the PREPEND feature for smtpd_sender_restrictions. This must be it, I thought, and tried to learn more. But the man page is rather dry and Stack Overflow was not of much help (anymore). I therefore turned to ChatGPT and asked it for options.

What ChatGPT got right was that it's not straightforward to move mail to alternate folders in Postfix, as maildrop/Dovecot takes care of local mail delivery. But I can tag messages. Unfortunately, ChatGPT also hallucinated quite a bit, offering options and half-truths about configurations that did not work reliably. While I initially assumed that Debian stable was just too outdated, some of the flags that ChatGPT suggested simply did not exist.

Another issue I ran into was that SpamAssassin removes any existing X-Spam-ABC header when filtering email. I initially tried to set X-Spam-Status: YES so that Dovecot would file the mail into the Junk folder, but SpamAssassin silently removed the tag during processing.

After quite some trial and error, I settled on

smtpd_sender_restrictions =
    check_sender_access hash:/etc/postfix/sender_access,
    ...

with the file sender_access being auto-generated from a simple text file where I list unwanted email addresses and domains. For each email address, I add a line foo@bar.com PREPEND X-Blocklist: YES (and run postmap sender_access afterwards).
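The generation step could look roughly like the following sketch. It is not my actual script: the input format (one address or domain per line, with # comments) and the function names build_sender_access and write_sender_access are assumptions made for illustration. Keys are lowercased because Postfix folds the lookup key to lowercase before querying the table.

```python
from pathlib import Path

# Action that Postfix applies to every matching sender (from the post above).
ACTION = "PREPEND X-Blocklist: YES"


def build_sender_access(blocklist_text: str) -> str:
    """Convert a plain blocklist into Postfix access-table lines.

    Assumed input format (hypothetical): one address or domain per line,
    blank lines ignored, everything after '#' treated as a comment.
    """
    lines = []
    seen = set()
    for raw in blocklist_text.splitlines():
        entry = raw.split("#", 1)[0].strip().lower()
        if not entry or entry in seen:
            continue  # skip blanks, comments, and duplicates
        seen.add(entry)
        lines.append(f"{entry}\t{ACTION}")
    return "\n".join(lines) + "\n" if lines else ""


def write_sender_access(source: Path, target: Path) -> None:
    """Read the blocklist at `source` and write the access table to `target`."""
    target.write_text(build_sender_access(source.read_text()))
```

After writing the file, postmap still has to be run on it so that Postfix sees the updated hash table.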

In my dovecot sieve for local delivery where I already move SPAM email into the Junk folder, I then simply do the same for any emails tagged with X-Blocklist: YES:

if header :contains "X-Blocklist" "YES" {
    fileinto "Junk";
}
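To check that a delivered message was actually tagged, a small helper can mimic the sieve test on a raw message. This is only a sketch for testing; the function name is_blocklisted is hypothetical and the sample messages are made up. Like sieve's default comparator for :contains, the match is case-insensitive.

```python
from email import message_from_string


def is_blocklisted(raw_message: str) -> bool:
    """Return True if any X-Blocklist header contains "YES" (case-insensitive).

    Only headers are inspected, so the same text in the body does not match.
    """
    msg = message_from_string(raw_message)
    values = msg.get_all("X-Blocklist", [])
    return any("yes" in str(value).lower() for value in values)


# Made-up sample messages: one tagged by Postfix, one untouched.
tagged = "X-Blocklist: YES\nFrom: foo@bar.com\nSubject: Call for papers\n\nbody\n"
clean = "From: colleague@example.org\nSubject: hi\n\nX-Blocklist: YES in body\n"
```

Running is_blocklisted on a message saved from the Junk folder is a quick way to tell whether the tag or some other rule moved it there.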

This exercise took me roughly 1.5 days including testing. I was a bit surprised by how much Stack Overflow has degraded. It's also an unfortunate fact that very few people still run their own mail servers, so not much information is out there (only a few outdated forum posts). Similarly, the hallucinations of ChatGPT were somewhat scary and led me down a few wrong paths. In the end, a combination of trial and error, configuration hunting, reading lots of forum posts, and using ChatGPT in a developer-in-the-loop mode somewhat helped solve this issue.

Do you think it was worth spending 1.5 days to delete unwanted email? Also, how long will it take me to recoup the cost of this over-engineering? ;)
