Message-id header rewrite in exim

Sometimes it is advisable to change or hide some message headers in outgoing mail. For example, you may not want the HELO string containing your internal domain to be shown in the Received: header, or to appear in the Message-Id generated by the MUA.

Fixing Received header

The Received: header string can be changed via the received_header_text option. This string is expanded each time it is used:

received_header_text = "Received: \
        ${if def:sender_rcvhost {from ${if match {$sender_rcvhost}{.*DOMAIN_PATTERN_HERE.*} {localhost}{$sender_rcvhost}}\n\t}\
        {${if def:sender_ident {from ${quote_local_part:$sender_ident} }}\
        ${if def:sender_helo_name {(helo=${if match {$sender_helo_name}{.*DOMAIN_PATTERN_HERE.*} {localhost}{$sender_helo_name}})\n\t}}}}\
        by $primary_hostname \
        ${if def:received_protocol {with $received_protocol}} \
        ${if def:tls_cipher {($tls_cipher)\n\t}}\
        id $message_id\
        ${if def:received_for {\n\tfor $received_for}}"
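
The effect of the two match conditions above can be checked outside Exim. The following is only a Python illustration of the logic, not Exim code: mask_host and the pattern internal\.example are hypothetical stand-ins for the expansion and for DOMAIN_PATTERN_HERE.

```python
import re

# Hypothetical stand-in for DOMAIN_PATTERN_HERE in the Exim option above.
DOMAIN_PATTERN = r"internal\.example"


def mask_host(value, pattern=DOMAIN_PATTERN):
    """Return 'localhost' when value matches the pattern, else value unchanged.

    Mirrors ${if match {value}{.*PATTERN.*} {localhost}{value}}. Exim's match
    condition is an unanchored regex search, which re.search reproduces.
    """
    if re.search(".*" + pattern + ".*", value):
        return "localhost"
    return value


if __name__ == "__main__":
    print(mask_host("ws12.internal.example ([10.0.0.5])"))
    print(mask_host("mail.example.org ([192.0.2.1])"))
```

Because the search is unanchored, the leading and trailing .* in the pattern do not change which strings match; they are kept here only to mirror the configuration.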

Fixing Message-id

To change the Message-Id, I added a system filter with the following contents:

#/etc/exim4/filter.conf

if not first_delivery then
 finish
endif

if error_message then
 finish
endif


if "${if def:h_Message-Id {yes}}" is yes and
    $h_Message-Id matches "@.*DOMAIN_PATTERN_HERE.*" then
        headers remove Message-Id
        headers add "Message-Id: <${message_id}@$primary_hostname>"
endif

This filter should be specified in exim's configuration with:

system_filter = /etc/exim4/filter.conf
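
The filter's condition and header rewrite can be prototyped outside Exim before touching a live server. This is a minimal Python sketch of the same logic, not something Exim runs: DOMAIN_PATTERN, PRIMARY_HOSTNAME and rewrite_message_id are hypothetical names, and the Exim message id is passed in by hand.

```python
import re
from email import message_from_string

DOMAIN_PATTERN = r"internal\.example"   # hypothetical DOMAIN_PATTERN_HERE
PRIMARY_HOSTNAME = "mx.example.org"     # hypothetical $primary_hostname


def rewrite_message_id(raw_message, exim_id):
    """Replace Message-Id when its domain part matches DOMAIN_PATTERN."""
    msg = message_from_string(raw_message)
    current = msg["Message-Id"]  # header lookup is case-insensitive
    if current is not None and re.search("@.*" + DOMAIN_PATTERN + ".*", current):
        del msg["Message-Id"]    # like "headers remove Message-Id"
        msg["Message-Id"] = "<%s@%s>" % (exim_id, PRIMARY_HOSTNAME)
    return msg


if __name__ == "__main__":
    raw = "Message-ID: <abc@pc1.internal.example>\nSubject: hi\n\nbody\n"
    print(rewrite_message_id(raw, "1GKabc-0001Xy-AB")["Message-Id"])
```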

Setup of FuzzyOcr plugin for spamassassin

Every day spammers invent new techniques to bypass spam filters. While modern spam filters cope well with text mail using regex rules and Bayesian classifiers, they are useless when spammers send messages with an attached image and random, non-spam text in the message body. But a solution to this problem is already available: the FuzzyOcr plugin for SpamAssassin.

This plugin checks for specific keywords in image/gif, image/jpeg or image/png attachments, using gocr (an optical character recognition program). It can be used to detect spam that puts all the real spam content in an attached image, while the mail itself contains only random text and random HTML, without any URLs or identifiable information. It also does approximate matching on words, so recognition errors or attempts to obfuscate the text inside the image will not cause the detection to fail. It can be easily extended, because all words reside in a simple plain text file.
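
To give a feel for what approximate matching means here, below is a small Python sketch. It is not the plugin's own algorithm (FuzzyOcr is written in Perl); it just uses a plain edit-distance search, and approx_distance, count_matching_lines and the max_errors threshold are hypothetical names chosen for the illustration.

```python
def approx_distance(word, text):
    """Smallest edit distance between word and any substring of text."""
    # Levenshtein table with an all-zero first row, so that a match may
    # start at any position in text.
    previous = [0] * (len(text) + 1)
    for i, wc in enumerate(word, start=1):
        current = [i]
        for j, tc in enumerate(text, start=1):
            cost = 0 if wc == tc else 1
            current.append(min(previous[j] + 1,          # drop a char of word
                               current[j - 1] + 1,       # extra char in text
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return min(previous)


def count_matching_lines(words, lines, max_errors=1):
    """Map each word to the number of lines where it occurs approximately."""
    counts = {}
    for word in words:
        hits = sum(1 for line in lines
                   if approx_distance(word.lower(), line.lower()) <= max_errors)
        if hits:
            counts[word] = hits
    return counts


if __name__ == "__main__":
    ocr_lines = ["Hot st0ck a1ert", "STOCK tip", "nothing here"]
    print(count_matching_lines(["stock", "alert"], ocr_lines))
```

With one allowed error, "st0ck" and "a1ert" still count as hits for "stock" and "alert", which is the kind of obfuscation the word list is meant to survive.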

Setting it up on Ubuntu takes a few simple steps:
  1. apt-get install gocr netpbm imagemagick libstring-approx-perl
  2. mkdir /tmp/fuzzyocr
  3. cd /tmp/fuzzyocr
  4. apt-get source libungif-bin
  5. wget http://users.own-hero.net/~decoder/fuzzyocr/{giftext-segfault.patch,fuzzyocr-latest.tar.gz}
  6. patch libungif4-4.1.4/util/giftext.c ./giftext-segfault.patch
  7. cd libungif4-4.1.4
  8. dpkg-buildpackage -rfakeroot -us -uc
  9. cd ..
  10. dpkg -i libungif4g_4.1.4-1_i386.deb libungif-bin_4.1.4-1_i386.deb
  11. tar xzf fuzzyocr-latest.tar.gz
  12. mkdir -p /usr/local/lib/site_perl
  13. cp FuzzyOcr-2.3b/FuzzyOcr.pm /usr/local/lib/site_perl
  14. cp FuzzyOcr-2.3b/FuzzyOcr.cf /etc/spamassassin
  15. cp FuzzyOcr-2.3b/FuzzyOcr.words.sample /etc/spamassassin/FuzzyOcr.words

Steps 4-10 are required only because a segfault was discovered in the giftext utility from that package. Now all you have to do is enable the FuzzyOcr plugin in SpamAssassin and tweak your word list.

Edit /etc/spamassassin/FuzzyOcr.cf and:
  • Remove the line loadplugin FuzzyOcr FuzzyOcr.pm
  • Set focr_pre314 to 1.
  • Set focr_logfile to /var/log/FuzzyOcr.log

Add the following line to /etc/spamassassin/v312.pre:

  • loadplugin FuzzyOcr /usr/local/lib/site_perl/FuzzyOcr.pm

To test the FuzzyOcr plugin, you can use the sample image spam messages in FuzzyOcr-2.3b/samples:

[denis@sun:test]$ spamassassin -t FuzzyOcr-2.3b/samples/animated-gif.eml
...
...
Content analysis details:   (24.4 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.8 EXTRA_MPART_TYPE       Header has extraneous Content-type:...type= entry
 0.7 DATE_IN_PAST_06_12     Date: is 6 to 12 hours before Received: date
 2.8 TVD_FW_GRAPHIC_ID1     BODY: TVD_FW_GRAPHIC_ID1
 0.0 HTML_MESSAGE           BODY: HTML included in message
  20 FUZZY_OCR              BODY: Mail contains an image with common spam text inside
                            Words found:
                            "alert" in 4 lines
                            "charts" in 1 lines
                            "symbol" in 1 lines
                            "alert" in 4 lines
                            "stock" in 2 lines
                            "company" in 3 lines
                            "trade" in 1 lines
                            "meridia" in 1 lines
                            "growth" in 1 lines
                            (18 word occurrences found)

Mailman and silently discarded messages

Several times people told me that a message sent to a Mailman list just disappeared and was never delivered to its subscribers. All of these messages had attachments, mostly Excel and .doc files.

When I tried sending a file attachment to the list myself, it went through fine. I reset all options that could have an effect to safe values:

max_message_size: 0
max_days_to_hold: 5  <- hold before automatic discarding
default_member_moderation: no
member_moderation_action: hold
generic_nonmember_action: hold
forward_auto_discards: yes
require_explicit_destination: no
max_num_recipients: 10

And after all this, I got a discarded message again:

Sep 04 23:33:45 2006 (99752) Message discarded, msgid:
 <01f701c6d061$67f01430$0dcc090a@foo.bar>

I found only one place where this event is logged: Mailman/Queue/IncomingRunner.py, class IncomingRunner, method _dopipeline. Here's the code fragment:

The line sys.modules[modname].process(mlist, msg, msgdata) calls the process() function of one of the handlers located in Mailman/Handlers. I grepped that directory and found that only the following handlers raise the Errors.DiscardMessage exception:

  • MimeDel.py
  • Moderate.py
  • Scrubber.py
  • SpamDetect.py
  • ToDigest.py
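
The same search can be repeated with a few lines of Python instead of grep. This is only a sketch: handlers_raising is a hypothetical helper, and the regex assumes the exception is raised with a plain raise statement at the start of a line.

```python
import os
import re


def handlers_raising(directory, exception_name="DiscardMessage"):
    """Return sorted names of .py files in directory that raise exception_name."""
    pattern = re.compile(
        r"^\s*raise\s+(\w+\.)*" + re.escape(exception_name) + r"\b",
        re.MULTILINE,
    )
    found = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".py"):
            continue
        path = os.path.join(directory, name)
        with open(path, encoding="utf-8", errors="replace") as handle:
            if pattern.search(handle.read()):
                found.append(name)
    return found


if __name__ == "__main__":
    # The path is an assumption; point it at your own Mailman/Handlers directory.
    handlers_dir = "Mailman/Handlers"
    if os.path.isdir(handlers_dir):
        for handler in handlers_raising(handlers_dir):
            print(handler)
```
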
It seems this problem has something to do with MimeDel.py:

In my case mlist.filter_action == 3, though. Anyway, I was surprised to find mlist.filter_content set to 1, although filter_mime_types was empty and xls/doc weren't in filter_filename_extensions.

I'm not sure if I found the real reason why the message was discarded, but I set filter_content to 0. In this case, the handler won't process the message at all:

And just to make sure I will know the name of the handler that discards the message next time (if this happens again), I added a bit more info to the logging string in the _dopipeline method:
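
As a rough illustration of that idea, here is a self-contained Python sketch of a handler pipeline that names the discarding handler in its log line. It is not Mailman's actual code: DiscardMessage, run_pipeline, the "vette" logger and the dict-based message are all stand-ins invented for the example.

```python
import logging

log = logging.getLogger("vette")  # hypothetical logger name


class DiscardMessage(Exception):
    """Stand-in for Mailman's Errors.DiscardMessage."""


def run_pipeline(handlers, msg):
    """Run (name, callable) handlers in order.

    Returns True if every handler passed the message on, False if one of
    them discarded it. The log line includes the handler's name.
    """
    for name, process in handlers:
        try:
            process(msg)
        except DiscardMessage:
            log.warning("Message discarded by handler %s, msgid: %s",
                        name, msg.get("message-id", "n/a"))
            return False
    return True


if __name__ == "__main__":
    logging.basicConfig(format="%(asctime)s %(message)s")

    def pass_through(msg):
        pass

    def always_discard(msg):
        raise DiscardMessage()

    run_pipeline([("SpamDetect", pass_through), ("MimeDel", always_discard)],
                 {"message-id": "<x@y>"})
```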

To be continued... :)