Flash Update
I noticed yahoo.com, yahoo.ca and yahoo.com.br had stopped accepting emails today.
I wanted to provide insight into this problem that few others would have, namely, when did the problem first occur?
I looked at my stuck messages in queue. I realized that with sendmail, there is no easy way to answer the question: what is the oldest stuck message for domain XYZ in the queue? So I rigged up something, very crude, but typical for a Unix command line-type of guy.
Namely:
$ grep yahoo */qf*|grep for|cut -d\; -f2|sort|head
The answer? My first stuck message comes from 12:43:46 EST today (July 31st). As of this writing, 3:53 PM, the problem still exists, which already makes it a quite long outage even if it is fixed in the next few minutes. Just minutes later, around 4 PM, I noticed a lot of the messages being delivered, so the problem seems to have finally cleared up.
The expression above works if you are root and the current directory is, e.g., /mqueue, under which you have queue directories q0, q1, …, q9, etc corresponding to an MC statement:
define(`QUEUE_DIR',`/mqueue/q*')dnl |
The grep above might miss a few messages, but if you have lots of them as I do, it doesn’t really matter as it’s purpose is just to convey the general idea of when the problem started. In my case it can be safely assumed that I am continuously sending emails to yahoo.com, so it is not possible to have a window that isn’t covered of more than ten minutes or so during the day.
Conclusion
We helped document a complete three-hour outage of Yahoo mail. Along the way we learned of a deficiency in sendmail’s mailq command – how limited its reporting options are. We compensated for that by rolling our own series of commands to answer the question of what is the oldest mail of this type in our queues.