Categories
Admin IT Operational Excellence Network Technologies

The IT Detective Agency: ARP Entry OK, PING not Working

Intro
Yes, the It detective agency is back by popular demand. This time we’ve got ourselves a thriller involving a piece of equipment – a wireless LAN controller, WLAN – on a directly connected network. From the router we could see the arp entry for the WLAN, but we could not PING it. Why?

A trace, or more correctly the output of tcpdump run on the router interface connected to that network, showed this:

>
12:08:59.623509  I arp who-has rtr7687.drjohnhilgarts.com tell wlan.drjohnhilgarts.com
12:08:59.623530  O arp reply rtr7687.drjohnhilgarts.com is-at 01:a1:00:74:55:12 (oui Nokia Internet Communications)
12:09:01.272922  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:03.271765  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:05.271469  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:07.271885  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:09.271804  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:09.622902  I arp who-has rtr7687.drjohnhilgarts.com tell wlan.drjohnhilgarts.com
12:09:09.622922  O arp reply rtr7687.drjohnhilgarts.com is-at 01:a1:00:74:55:12 (oui Nokia Internet Communications)
12:09:11.271567  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:13.271716  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:15.271971  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:17.040748  I b8:c7:5d:19:b9:9e (oui Unknown) > Broadcast Null Unnumbered, xid, Flags [Command], length 46
12:09:17.271663  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:19.271832  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:19.392578  I b8:c7:5d:19:b9:9e (oui Unknown) > Broadcast Null Unnumbered, xid, Flags [Command], length 46
12:09:19.623515  I arp who-has rtr7687.drjohnhilgarts.com tell wlan.drjohnhilgarts.com
12:09:19.623535  O arp reply rtr7687.drjohnhilgarts.com is-at 01:a1:00:74:55:12 (oui Nokia Internet Communications)
12:09:20.478397  O arp reply rtr7687.drjohnhilgarts.com is-at 01:a1:00:74:55:12 (oui Nokia Internet Communications)
12:09:21.271714  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:23.271697  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:25.271664  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:27.272156  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:29.271730  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:29.621882  I arp who-has rtr7687.drjohnhilgarts.com tell wlan.drjohnhilgarts.com
12:09:29.621903  O arp reply rtr7687.drjohnhilgarts.com is-at 01:a1:00:74:55:12 (oui Nokia Internet Communications)
12:09:31.271765  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43
12:09:33.271858  I STP 802.1d, Config, Flags [none], bridge-id 2332.3c:df:1e:8f:2b:c0.8312, length 43

What’s interesting is what isn’t present. No PINGs. No unicast traffic whatsoever, yet we knew the WLAN was generating traffic. The frequent arp requests for the same IP strongly hinted that the WLAN was not getting the response. We were not able to check the arp table of the WLAN. And we knew the WLAN was supposed to respond to our PINGs, but it wasn’t. Yet the router’s arp table had the correct entry for the WLAN, so we knew it was plugged into the right switch port and on the right vlan. We also triple-checked that the network masks matched on both devices. Let’s go back. Was it really on the right vlan??

The Solution
What we eventually realized is that in the WLAN GUI, VLANs were assigned to the various interfaces. the switch port, on a Cisco switch, was a regular access port. We reasoned (documentation was scarce) that the interface was vlan tagging its traffic. So we tried to change the access port to a trunk port and enter the correct vlan. Here’s the show conf snippet:

interface GigabitEthernet1/17
 description 5508-wlan
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 887
 switchport mode trunk
 spanning-tree portfast edge trunk

Bingo! With that in place we could ping the WLAN and it could send us its traffic.

Case closed.

2018 update
I had totally forgotten my own posting. And I’ll be damned if in the heat of connecting a new firewall to a switch port we didn’t have this weird situation where we could see MAC entries of the firewall, and it could see MACs of other devices on that vlan, but nobody could ping the firewall and vica versa. A trace from tcpdump looked roughly similar to the above – a lot of arp who-has firewall, tell server. Sure enough, the firewall guy, new to the group, had configured all his ports to be tagged ports, even those with a single vlan. It had been our custom to make single vlans non-tagged ports. I didn’t start it, that’s just how it was. More than an hour was lost debugging…

And earlier in the year was yet another similar incident, where a router operated by a vendor joining one of our vlans assumed tagged ports where we did not. More than an hour was lost debugging… See a pattern there?

I had forgotten my own post from seven years ago to such an extent, I was just about to write a new one when I thought, Maybe I’ve covered that before. So old topics are new once again… Here’s to remember this for the next time!

Where to watch out for this
When you don’t run all the equipment. If you ran it all you’d have the presence of mind to make all the ports consistent.

Some terminology
A tagged port can also be known as using 802.1q, which is also known as dot1q, which in Cisco world is known as a trunk port. In the absence of that, you would have an access port (Cisco terminology) or untagged port (everyone else).

Conclusion
OK, there are probably many reasons and scenarios in which devices on the same network can see each other’s arp entries, but not send unicast traffic. But, the scenario we have laid out above definitely produces that effect, so keep it in mind as a possibility should you ever encounter this issue.