Intro
So one of my power users complains that his FTPs to a particular site fail frequently, but not always. I rolled up my sleeves and set to work. The thing I do best is find the essence of a problem – what is the bare minimum sequence of events that reproduces it. I’m still getting my head around this one and I haven’t cracked the case yet, but I’ve learned about a few obscure packet generation tools.
The details
I may flesh this out later. The essence of the thing is that a packet trace (using tcpdump) shows that randomly no SYN-ACK packet is returned for our SYN packet to the FTP server on port 21. The FTP server resides on the Amazon cloud, but on the West Coast. We are on the East Coast. Not that that matters.
So I learned to reproduce the problem myself with the built-in ftp client. But I wanted even more control.
Packet generation tools
My trajectory went kind of like this:
ftp -> ping -> nmap -> hping3 -> mausezahn (-> scapy) |
I had to compile mausezahn but I did manage to get it to work. I guess the developer has passed away. It doesn’t offer complete control over tcp packet generation, but nearly so.
I just discovered scapy. It appears to offer complete control over packet generation, including the tcp options such as mss, but the proof is in the pudding and I haven’t had time to check it out.
See the references fo links to further information about where to find these packages.
Preliminary findings
I began to see that with nmap and hping3 I was getting SYN-ACKs back consistently. What’s the difference between their SYN packets and ftp’s? They don’t use any options whereas my ftp client does.
And that is the essence of the problem. A tcp SYN packet which sets options like SACK, wscale and MSS is not being responded to around 30% of the time. No options set? SYN-ACKs come back 100% of the time. Pings are answered 99 – 100% of the time. mausezahn (mz) allows to set the window size. The window size is irrelevant.
Is it one particular tcp option that is the culprit, or just the fact of using any of them? Unfortunately that’s where you reach the limits of mz. mz only allows you to turn on or off all options. scapy promises to be more granular. So at least with mz by itself I can turn of/off the problem at will. That is getting to the essence of the problem.
Another wrinkle? Only certain source IPs have the problem! I have an identical system using a different ISP and it works all day long.
Conclusion
A lot of work and only modest progress to show for it. I need cooperation of the ftp administrator to do a simultaneous trace. Either the packet never gets to him, or his infrastructure discards it, or he responds but I never see the response. A two-sided trace will narrow down which of these three things is happening.
But I did learn that fine-control packet generation is a bit difficult to come by, which comes as a surprise in this day and age. You have to do some work to get full control over your packets.
I have nos stomach for writing my own C++ code to have total control.
It’s still an open case.
References
nmap.org talks about nmap. nmap is a pretty standard package available for major distributions. But it is not sufficiently configurable.
I’ve written about hping3 before, showing how to compile it.
I used this site for mausezahn source code and documentation.
scapy is well-documented here.