Categories
Admin Internet Mail Linux Perl

The IT Detective Agency: last letter of attachment name is missing!

Intro
Today we bring you an IT whodunit thriller. A user using Lotus Notes informs his local IT that a process that emails SQL reports to him and a few others has suddenly stopped working correctly. The reports either contain an HTML attachment where the attachment type has been chopped to “ht” instead of “htm,” or an MHTML attachment type which has also been chopped, down to “mh” instead of “mht.” They get emailed from the reporting server to a sendmail mail relay. Now the convenient ability to double-click on the attachment and launch it stopped working as a result of these chopped filenames. What’s going on? Fix it!

Let’s Reproduce the Problem
Fortunately this one was easier than most to reproduce. But first a digression. Let’s have some fun and challenge ourselves with it before we deep dive. What do you think the culprit is? What’s your hypothesis? Drawing on my many years of experience running enterprise-class sendmail servers, and never before having seen this problem despite the hundreds of millions of delivered emails, my best instincts told me to look elsewhere.

The origin server, let’s call it aspen, sends few messages, so I had the luxury to turn on tracing on my sendmail server with a filter limiting the traffic to its IP:

$ tcpdump -i eth0 -s 1540 -w /tmp/aspen.cap host aspen

Using wireshark to analyze asp.cap and following the tcp stream I see this:

...
Content-Type: multipart/mixed;
		 boundary="CSmtpMsgPart123X456_000_C800C42D"
 
This is a multipart message in MIME format
 
--CSmtpMsgPart123X456_000_C800C42D
Content-Type: text/plain;
		 charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
 
SQLplus automated report
--CSmtpMsgPart123X456_000_C800C42D
Content-Type: application/octet-stream;
		 name="tower status_2012_06_04--09.25.00.htm"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
		 filename="tower status_2012_06_04--09.25.00.htm
 
<html><head></head><body><h1>Content goes here...</h1></body>
</html>
 
--CSmtpMsgPart123X456_000_C800C42D--

Result of trace of original email as received by sendmail

But the source as viewed from within Lotus Notes is:

...
Content-Type: multipart/mixed;
		 boundary="CSmtpMsgPart123X456_000_C800C42D"
 
 
--CSmtpMsgPart123X456_000_C800C42D
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
		 charset="iso-8859-1"
 
SQLplus automated report
--CSmtpMsgPart123X456_000_C800C42D
Content-Type: application/octet-stream;
		 name="tower status_2012_06_04--09.25.00.htm"
Content-Disposition: attachment;
		 filename="tower status_2012_06_04--09.25.00.ht"
Content-Transfer-Encoding: base64
 
PGh0bWw+PGhlYWQ+PC9oZWFkPjxib2R5PjxoMT5Db250ZW50IGdvZXMgaGVyZS4uLjwvaDE+PC9i
b2R5Pg0KPC9odG1sPg==
 
--CSmtpMsgPart123X456_000_C800C42D--

Same email after being trasferred to Lotus Notes

I was in shock.

I fully expected the message source to go through unaltered all the way into Lotus Notes, but it didn’t. The trace taken before sendmail’s actions was not an exact match to the source of the message I received. So either sendmail or Lotus Notes (or both) were altering the source in significant ways.

At the same time, we got a big clue as to what is behind the missing letter in the file extension. To highlight it, compare this line from the trace:

filename=”tower status_2012_06_04–09.25.00.htm

to that same line as it appears in the Lotus Notes source:

filename=”tower status_2012_06_04–09.25.00.ht

So there is no final close quote (“) in the filename attribute as it comes from the aspen server! That can’t be good.

But it used to work. What do we make of that fact??

I had to dig farther. I was suddenly reminded of the final episode of House where it is apparent that the solving the puzzle of symptoms is the highest aspiration for Doctor House. Maybe I am similarly motivated? Because I was definitely willing to throw the full weight of my resources behind this mystery. At least for the half-day I had to spare on this.

First step was to reproduce the problem myself. For sending an email you would normally use sendmail or mailx or such, but I didn’t trust any of those programs – afraid they would mess with my headers in secret, undocumented ways.

So I wrote my own mail sending program using Perl/Expect. Now I’m not advocating this as a best practice. It’s just that for me, given my skillset and perceived difficulty in finding a proper program to do what I wanted (which I’m sure is out there), this was the path of least resistance, the best and most efficient use of my time. You see, I already had the core of the program written for another purpose, so I knew it wouldn’t be too difficult to finish for this purpose. And I admit I’m not the best at Expect and I’m not the best at Perl. I just know enough to get things done and pretty quickly at that.

OK. Enough apologies. Here’s that code:

#!/usr/bin/perl
# drjohnstechtalk.com - 6/2012
# Send mail by explicit use of the protocol
$DEBUG = 1;
use Expect;
use Getopt::Std;
getopts('m:r:s:');
$recip = $opt_r;
$sender = $opt_s;
$hostname = $ENV{HOSTNAME};
chop($hostname);
print "hostname,mailhost,sender,recip: $hostname,$opt_m,$sender,$recip\n" if $DEBUG;
$telnet = "telnet";
@hosts = ($opt_m);
$logf = "/var/tmp/smtpresults.log";
 
$timeout = 15;
 
$data = qq(Subject: test of strange MIME error
X-myHeader: my-value
From: $sender
To: $recip
Subject: SQLplus Report - tower status
Date: Mon, 4 Jun 2012 9:25:10 --0400
Importance: Normal
X-Mailer: ATL CSmtp Class
X-MSMail-Priority: Normal
X-Priority: 3 (Normal)
MIME-Version: 1.0
Content-Type: multipart/mixed;
        boundary="CSmtpMsgPart123X456_000_C800C42D"
 
This is a multipart message in MIME format
 
--CSmtpMsgPart123X456_000_C800C42D
Content-Type: text/plain;
        charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
 
SQLplus automated report
--CSmtpMsgPart123X456_000_C800C42D
Content-Type: application/octet-stream;
        name="tower status_2012_06_04--09.25.00.htm"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
        filename="tower status_2012_06_04--09.25.00.htm
 
<html><head></head><body><h1>Content goes here...</h1></body>
</html>
--CSmtpMsgPart123X456_000_C800C42D--
 
.
);
sub myInit {
# This structure is ugly (p.148 in the book) but it's clear how to fill it
@steps = (
        { Expect => "220 ",
          Command => "helo $hostname"},
# Envelope sender
        { Expect => "250 ",
          Command => "mail from: $sender"},
# Envelope recipient
        { Expect => "250 ",
          Command => "rcpt to: $recip"},
# data command
        { Expect => "250 ",
          Command => "data"},
# start mail message
        { Expect => "354 Enter ",
          Command => $data},
# end session nicely
        { Expect => "250 Message accepted ",
          Command => "quit"},
);
}       # end sub myInit
#
# Main program
open(LOGF,">$logf") || die "Cannot open log file!!\n";
foreach $host (@hosts) {
  login($host);
}
 
# create an Expect object by spawning another process
sub login {
($host) = @_;
myInit();
#@params = ($host," 25");
$init_command = "$telnet $host 25";
#$Expect::Debug = 3;
my $exp = Expect->spawn("$init_command")
         or die "Cannot spawn $command: $!\n";
#
# Now run all the other commands
foreach $step (@steps) {
  $i++;
  $expstr = %{$step}->{Expect};
  $cmd = %{$step}->{Command};
#  print "expstr,cmd: $expstr, $cmd\n";
# Logging
#$exp->debug(2);
#$exp->exp_internal(1);
  $exp->log_stdout(0);  # disable stdout for each command
  $exp->log_file($logf);
  @match_patterns = ($expstr);
  ($matched_pattern_position, $error, $successfully_matching_string, $before_match, $after_match) = $exp->expect($timeout,
@match_patterns);
  unless ($matched_pattern_position == 1) {
    $err = 1;
    last;
  }
  #die "No match: error was: $error\n" unless $matched_pattern_position == 1;
  # We got our match. Proceed.
  $exp->send("$cmd\n");
}       # end loop over all the steps
 
#
# hard close
$exp->hard_close();
close(LOGF);
#unlink($logf);
}       # end sub login

Code for sendmsg2.pl

Invoke it:

$ ./sendmsg2.pl -m sendmail_host -s [email protected] -r [email protected]

The nice thing with this program is that I can inject a message into sendmail, but also I can inject it directly into the Lotus Notes smtp gateway, bypassing sendmail, and thereby triangulate the problem. The sendmail and Lotus Notes servers have slightly different responses to the various protocol stages, hence I clipped the Expect strings down to the minimal common set of characters after some experimentation.

This program makes it easy to test several scenarios of interest. Leave the final quote and inject into either sendmail or Lotus Notes (LN). Tack on the final quote to see if that really fixes things. The results?

Missing final quote

with final quote added

inject to sendmail

ht” in final email to LN; extension chopped

htm” and all is good

inject to LN

htm in final email; but extension not chopped

htm” and all is good

I now had incontrovertible proof that sendmail, my sendmail was altering the original message. It is looking at the unbalanced quote mark situation and recovering as best as possible by replacing the terminating character “m” with the missing double quote “. I was beginning to suspect it. After that shock drained away, I tried to check the RFCs. I figured it must be some well-meaning attempt on its part to make things right. Well, the RFCs, 822 and 1806 are a little hard to read and apply to this situation.

Let’s be clear. There’s no question that the sender is wrong and ought to be closing out that quote. But I don’t think there’s some single, unambiguous statement from the RFCs that make that abundantly apparent. Nevertheless, of course that’s what I told them to do.

The other thing from reading the RFC is that the whole filename attribute looks optional. To satisfy my curiosity – and possibly provide more options for remediation to aspen – I sent a test where I entirely left out the offending filename=”tower… line. In that case the line above it should have its terminating semicolon shorn:

Content-Disposition: attachment

After all, there already is a name=”tower…” as a Content-type parameter, and the string following that was never in question: it has its terminating semicolon.

Yup, that worked just great too!

Then I thought of another approach. Shouldn’t the overriding definition of the what the filetype is be contained in the Content-type header? What if it were more correctly defined as

Content-type: text/html

?

Content-type appears in two places in this email. I changed them both for good measure, but left the unbalanced quotations problem. Nope. Lotus Notes did not know what to with the attachment it displays as tower status_2012_06_04–09.25.00.ht. So we can’t recommend that course of action.

What Sendmail’s Point-of-View might be
Looking at the book, I see sendmail does care about MIME headers, in particular it cares about the Content-Disposition header. It feels that it is unreliable and hence merely advisory in nature. Also, some years ago there was a sendmail vulnerability wherein malformed multipart MIME messages could cause sendmail to crash (see http://www.kb.cert.org/vuls/id/146718. So maybe sendmail is just a little sensitive to this situation and feels perfectly comfortable and justified in right-forming a malformed header. Just a guess on my part.

Case closed.

Conclusion
We battled a strange email attachment naming error which seemed to be an RFC violation of the MIME protocols. By carefully constructing a testing program we were easily able to reproduce the problem and isolate the fault and recommend corrective actions in the sending program. Now we have a convenient way to inject SMTP email whenever and wherever we want. We feel sendmail’s reputation remains unscathed, though its corrective actions could be characterized as overly solicitous.

Leave a Reply

Your email address will not be published. Required fields are marked *