The IT Detective Agency: the case of the messages from Mars

Intro
Today we got a “funny” message on our SLES 11 server in the /var/log/warn file. You might think that Martians have landed!

The Details
Specifically this:

Nov 9 10:54:19 drjohn24 kernel: [72397.088297] martian source 10.120.2.24 from 10.0.0.3, on dev eth1
Nov 9 10:54:19 drjohn24 kernel: [72397.088300] ll header: 78:e7:d1:7b:25:32:00:a0:8e:a8:8e:b3:08:00
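
For anyone meeting these lines for the first time, here is how they break down. In the first line the kernel prints the destination address (10.120.2.24) before the source address (10.0.0.3). The second line is the 14-byte Ethernet header of the offending frame: destination MAC, source MAC, then the EtherType. The following is only a small sketch to pull those fields apart; decode_ll_header is a made-up helper name, not a standard tool.

```shell
#!/bin/sh
# decode_ll_header (hypothetical helper): split the colon-separated
# 14-byte Ethernet header from an "ll header:" log line into its fields.
decode_ll_header() {
    hdr=$1
    # Bytes 1-6: destination MAC, bytes 7-12: source MAC, bytes 13-14: EtherType
    dst=$(printf '%s' "$hdr" | cut -d: -f1-6)
    src=$(printf '%s' "$hdr" | cut -d: -f7-12)
    etype=$(printf '%s' "$hdr" | cut -d: -f13-14 | tr -d ':')
    printf 'dst_mac=%s src_mac=%s ethertype=0x%s\n' "$dst" "$src" "$etype"
}

# The header from the log line above
decode_ll_header "78:e7:d1:7b:25:32:00:a0:8e:a8:8e:b3:08:00"
```

EtherType 0x0800 means IPv4. The source MAC is whichever device put the frame onto the eth1 segment, which is presumably the gateway on that network rather than the pinging host itself.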

Every time I pinged 10.120.2.24 (drjohn24) from 10.0.0.3 it would produce those two lines in the warn and messages files. More worrisome, I could not ssh between the two hosts, although I could ssh to drjohn24 from a host on its local network. We observed this behaviour even with the firewall disabled. Strange, right?

One more thing to note: drjohn24 has two network interfaces and various routes defined.

The Solution
It didn’t take too long to get to the bottom of this. We set up the routes wrong. We meant to create a default route out of eth0, which was right, and a net-10 route for eth1, which we specified incorrectly. Do

netstat -rn

to show all routes. I had this:

Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.120.2.0      0.0.0.0         255.255.255.128 U         0 0          0 eth1
10.120.3.0      0.0.0.0         255.255.255.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
10.0.0.0        10.120.2.1      255.255.255.128 UG        0 0          0 eth1
127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
0.0.0.0         10.120.3.1      0.0.0.0         UG        0 0          0 eth0

Do you see the error? We gave the 10.0.0.0 route the same mask as the interface, 255.255.255.128, when we wanted it to cover all of net 10.

The corrected version looks like this:

Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
...
10.0.0.0        10.120.2.1      255.0.0.0       UG        0 0          0 eth1
...
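
To see what difference the mask makes, a route matches an address when the address ANDed with the mask equals the route's destination. Here is a rough sketch of that test in plain shell arithmetic. The helper names ip_to_int and route_matches are invented for this example, and 10.50.1.1 is just a hypothetical net-10 address picked for illustration.

```shell
#!/bin/sh
# ip_to_int (hypothetical helper): dotted quad -> 32-bit integer
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# route_matches ADDR DEST MASK: succeeds if ADDR falls inside DEST/MASK
route_matches() {
    addr=$(ip_to_int "$1"); net=$(ip_to_int "$2"); mask=$(ip_to_int "$3")
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Compare the mistaken mask with the corrected one
for m in 255.255.255.128 255.0.0.0; do
    if route_matches 10.50.1.1 10.0.0.0 "$m"; then
        echo "10.50.1.1 matches 10.0.0.0 mask $m: use eth1"
    else
        echo "10.50.1.1 does not match 10.0.0.0 mask $m: falls through to the default route on eth0"
    fi
done
```

With the mistaken mask the address falls through to the default route. With 255.0.0.0 it matches the net-10 route on eth1, which is what we wanted all along.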

Conclusion
So what was happening is that the inbound packet from 10.0.0.3 was arriving at eth1 as we intended. But SLES 11 is clever enough to realize, based on its routing table, that eth1 is not the interface where a packet with that source IP should arrive. By its reckoning the packet should have arrived at eth0 because of the default route: no static route was more specific for 10.0.0.3 due to our error. And apparently even with the firewall turned off, SLES gets very defensive at this point. I’m not sure if it was sending return packets out of eth0 or not, because I kept looking for them out of eth1!

Once we corrected the routes, the inbound packet arrived at eth1 and was answered with a packet sent out of eth1, and the martian messages went away.
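
This defensive behaviour matches the Linux kernel's reverse-path filter, which works independently of any firewall rules. Two per-interface sysctls are involved: rp_filter (0 is off, 1 is strict, 2 is loose) and log_martians (1 writes the martian lines to the log). What follows is a minimal sketch for checking them, assuming the standard procfs location. show_rp_settings is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at another directory.

```shell
#!/bin/sh
# show_rp_settings IFACE [CONF_DIR] (hypothetical helper): print the
# reverse-path filter and martian logging settings for "all" and IFACE.
show_rp_settings() {
    conf_dir=${2:-/proc/sys/net/ipv4/conf}
    for scope in all "$1"; do
        for key in rp_filter log_martians; do
            f="$conf_dir/$scope/$key"
            if [ -r "$f" ]; then
                printf '%s.%s=%s\n' "$scope" "$key" "$(cat "$f")"
            else
                # Interface or key not present on this system
                printf '%s.%s=unavailable\n' "$scope" "$key"
            fi
        done
    done
}

show_rp_settings eth1
```

The same values can be read with sysctl, for example sysctl net.ipv4.conf.eth1.rp_filter. Fixing the route is still the right cure. Loosening the filter would only hide the asymmetric routing.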

The martian message thing is a little obscure, and at the time more a distraction than anything else as we had to research what that meant. I guess for the future we’ll instantly know. It’s very similar to defining network topology on your firewalls in an anti-spoofing defense.

Case closed!