SOFTFAIL – Dr John's Tech Talk

Intro
I recently implemented strict enforcement of Sender Policy Framework (SPF) records on a mail server. The added benefit wasn’t worth the number of false positives.

The details
When I want to learn about SPF I turn to the Wikipedia article. Many secure mail gateways can filter out emails depending on whether the sending MTA is on the list allowed by that domain’s SPF record in DNS. I began to turn on enforcement domain-by-domain. First for bbb.org, since their domain was spoofed by an annoying spam campaign earlier this year. Then fedex.com (another perennial favorite), aexp.com, amazon.com, ups.com, etc.

I was stricter than the standard. FAIL -> reject; SOFTFAIL -> reject! And it seemed to work well. Last week I went whole hog and turned on SPF enforcement for all domains with a defined SPF record, but with slightly relaxed standards. FAIL -> REJECT; SOFTFAIL -> quarantine. While it’s true that it may have helped prevent some new spam campaigns, complaints started mounting. Stuff was going into quarantine that never used to!

I knew SOFTFAIL must be the issue. Most(?) domains seem to have a SOFTFAIL condition. Here is a typical example for the domain communica-usa.com:

$ dig txt communica-usa.com

; <<>> DiG 9.7.3-P3-RedHat-9.7.3-8.P3.el6_2.2 <<>> txt communica-usa.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15711
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
 
;; QUESTION SECTION:
;communica-usa.com.             IN      TXT
 
;; ANSWER SECTION:
communica-usa.com.      600     IN      TXT     "v=spf1 ip4:72.241.62.1/26 ~all"

So the ~all at the end means they have defined a SOFTFAIL that matches “everything else.” And if a domain does have a SOFTFAIL contingency in its SPF record, this is almost always how it is defined. Which, when you think about it, is the same, formally, as not having bothered to define an SPF record at all, because the proper action to take on a SOFTFAIL is to do nothing!
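To make the role of ~all concrete, here is a minimal Python sketch of how a record like the one above gets evaluated. It is not a real SPF implementation: it handles only the ip4: and all mechanisms and skips everything else (include:, mx, a, and so on). The function name evaluate_simple_spf and the test addresses are made up for illustration.

```python
import ipaddress

# Qualifier prefix -> SPF result produced when the mechanism matches.
# A mechanism with no prefix behaves as if it had "+".
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}

def evaluate_simple_spf(record, sender_ip):
    """Evaluate a record using only ip4: and all (hypothetical helper)."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        mechanism = term.lower()
        if mechanism == "all":
            # "all" matches everything, so evaluation always stops here
            return QUALIFIERS[qualifier]
        if mechanism.startswith("ip4:"):
            # strict=False tolerates host bits in the prefix (72.241.62.1/26)
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return QUALIFIERS[qualifier]
        # any other mechanism is simply skipped in this sketch
    return "neutral"  # nothing matched and there was no "all"

record = "v=spf1 ip4:72.241.62.1/26 ~all"
print(evaluate_simple_spf(record, "72.241.62.10"))  # inside the /26
print(evaluate_simple_spf(record, "203.0.113.5"))   # falls through to ~all
```

The first address lands inside the listed /26 and passes; the second matches nothing until it reaches ~all, which is exactly where the SOFTFAIL comes from.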

I reasoned as follows: why bother defining an SPF record unless you know what you’re doing and you feel you actually can identify the MTAs your mail will come from? Except when you’re first starting out and learning the ropes. That’s why I took stronger action than the standard suggested for SOFTFAIL.

A learning experience
Well, it wasn’t immediate, but after a week to ten days the complaints started rolling in. Looking at what had actually been released from quarantine was an eye-opener. It confirmed something I had previously seen, namely that only a minority speak up so the problems brought to my attention amounted to the tip of the iceberg. So the problem of my own creation was quite widespread in actual fact. I investigated a few by hand. Yup, they were sent from IPs not permitted by the narrowly defined part of their SPF record, and they were repeat offenders. Domains such as the aforementioned communica-usa.com, right.com, redphin.com, etc, etc. I haven’t gotten an explanation for a single infringement – finding someone with sufficient knowledge in these small companies to have a discussion with is a real challenge.

Waving the white flag
I had to back down on my quarantining SOFTFAILs in the global SPF setting. Now it’s SOFTFAIL -> PASS. I could have made domain-by-domain exceptions, but when the number of problem domains climbed over a hundred I decided against that reactive approach. I kept the original set of defined domains, ups.com, etc with their more-aggressive settings. These are larger companies, anyways, which either didn’t even have a SOFTFAIL defined, or never needed their SOFTFAIL.
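The arrangement I ended up with can be summarized as a small lookup, sketched here in Python. This is not the gateway's actual configuration syntax. The action names ("reject", "deliver") and the function action_for are assumptions for illustration, and the strict list contains only the domains named earlier in this article.

```python
# Global default after backing down: FAIL is still rejected, SOFTFAIL is let through.
GLOBAL_POLICY = {"fail": "reject", "softfail": "deliver"}

# The original hand-picked domains keep the more aggressive treatment.
STRICT_DOMAINS = {"bbb.org", "fedex.com", "aexp.com", "amazon.com", "ups.com"}
STRICT_POLICY = {"fail": "reject", "softfail": "reject"}

def action_for(sender_domain, spf_result):
    """Map an SPF result to a gateway action (illustrative names only)."""
    if sender_domain.lower() in STRICT_DOMAINS:
        policy = STRICT_POLICY
    else:
        policy = GLOBAL_POLICY
    # pass, neutral, none and anything else not listed is delivered
    return policy.get(spf_result, "deliver")

print(action_for("ups.com", "softfail"))            # strict list
print(action_for("communica-usa.com", "softfail"))  # global default
```

The point of the design is that the exceptions run in the strict direction: a short, stable list of domains that get tougher handling, rather than a growing list of domains that have to be excused.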

August 2013 update
Well, a year has passed since this experiment, long enough to establish a trend. I was hoping by writing this article to nudge the community in the direction of wider adoption of SPF and, more importantly, enforcement. Alas, I can now attest that there is no perceptible trend in either direction. How do I know? Because even with our more lax enforcement, we still ran into problems with lots of senders. I.e., they were too, how do I say this politely, confused, and apparently unable to make their strict SPF records actually match their sending hosts! Incredible, but true. If there were widespread or even spotty adoption, with even 5 to 10% of MTAs enforcing strict SPF records, those MTAs would also have refused messages from senders like this, and these senders would have been motivated to fix their self-inflicted problems. But in reality, what I have seen is a complete lack of understanding among the dozen or so companies with this issue and a complete inability to resolve it, forcing me to make an exception for their error.

SPF records can be bad for a number of reasons. Some companies provide two of them, and that is a source of problems. Here are a couple of examples I had this issue with:

scterm.com

scterm.com.             3600    IN      TXT     "v=spf1 ip4:65.122.37.36 include:spf.protection.outlook.com -all"
scterm.com.             3600    IN      TXT     "MS=ms56827535"

atlantichealth.org

atlantichealth.org.     3600    IN      TXT     "v=spf1 mx ip4:198.140.183.244/32 ip4:198.140.183.245/32 ip4:198.140.184.244/32 ip4:198.140.184.245/32 ip4:207.211.31.0/25 ip4:205.139.110.0/24 ip4:205.139.111.0/24 -all"
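On the point about companies publishing two SPF records: the SPF specification (RFC 7208) treats more than one v=spf1 TXT record on a domain as a permanent error, and a domain with none as having no SPF policy. The Python sketch below shows that selection step. The function name select_spf_record and the sample records are hypothetical, not taken from any real domain.

```python
def select_spf_record(txt_records):
    """Pick the SPF record out of a domain's TXT strings.

    Returns (status, record): "ok" with the single record, "none" when
    there is no SPF record, "permerror" when there is more than one.
    """
    spf = [
        r for r in txt_records
        # the version tag must be exactly "v=spf1", then a space or end of string
        if r.lower() == "v=spf1" or r.lower().startswith("v=spf1 ")
    ]
    if not spf:
        return ("none", None)
    if len(spf) > 1:
        return ("permerror", None)
    return ("ok", spf[0])

# Hypothetical TXT sets for illustration
single = ["v=spf1 ip4:192.0.2.10 -all", "MS=ms00000000"]
double = ["v=spf1 ip4:192.0.2.10 -all", "v=spf1 include:spf.example.net -all"]

print(select_spf_record(single))
print(select_spf_record(double))
```

Other TXT records, such as verification tokens, are ignored; only a second record that starts with v=spf1 triggers the error.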

So my guesstimate is < 1% adoption of even the most conservative (meaning least likely to reject legitimate emails) SPF enforcement. Too bad about that. Spammers really appreciate it, though.

Conclusion
Because I was forced to back off aggressive SPF settings, domains with SOFTFAIL defined as “~all” (which is most of them) are lost to any enforcement. It’s a shame. SPF sounded so promising to me at first. Spammers are really good at controlling botnets, but terrible at controlling corporate MTAs. But still I do advocate for wider adoption of SPF, without the SOFTFAIL condition, of course!

References
A nice SPF validator is found here. It has no obnoxious ads or spyware.