Categories
Admin DNS Proxy

The IT Detective Agency: Browsing Stopped Working on Internet-Connected Enterprise Laptops

Intro
Recall our elegant solution to DNS clobbering and use of a PAC file, which we documented here: The IT Detective Agency: How We Neutralized Nasty DNS Clobbering Before it Could Bite Us. Things went smoothly with that kludge in place for months. Then suddenly they didn’t. What went wrong, and how did we fix it? Read on.

The Details
We began hearing reports of a growing number of users unable to access Internet sites on their enterprise laptops when directly connected to the Internet. Internet Explorer gave its cryptic error, Internet Explorer cannot display the web page. This was a real problem, not a mere inconvenience, for any user on a hotel Internet system that required a web page sign-on: the sign-on page itself could not be displayed and gave that same error message. For simple use at home this error may not have been fatal.

But we had this working. What the…?

I struggled as Rop demoed the problem for me with a user’s laptop. I couldn’t deny it because I saw it with my own eyes and saw that the configuration was correct. Correct as in how we wanted it to be, not correct as in working. Basically all web pages were timing out after 15 seconds or so and displaying cannot display the web page. The chief setting is the PAC file, which is used by this enterprise on their Intranet.

On the Internet (see above-mentioned link) the PAC file was more of an annoyance and we had aliased the DNS value to 127.0.0.1 to get around DNS clobbering by ISPs.

While I was talking about the problem to a colleague I thought to look for a web server on the affected laptop. Yup, there it was:

C:\Users\drj>netstat -an|more

Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  ...

What the…? Why is there a web server running on port 80? How will it respond to a PAC file request? I quickly got some hints by hitting my own laptop with curl:

$ curl -i 192.168.3.4/proxy.pac

HTTP/1.1 404 Not Found
Content-Type: text/html; charset=us-ascii
Server: Microsoft-HTTPAPI/2.0
Date: Tue, 17 Apr 2012 18:39:30 GMT
Connection: close
Content-Length: 315


Not Found

Not Found


HTTP Error 404. The requested resource is not found.

So the server is Microsoft-HTTPAPI, which upon investigation seems to be a Microsoft Web Deployment Agent Service (MsDepSvc).
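To make the distinction concrete, here is a minimal Python sketch of the check I was effectively doing by eye with curl: does an HTTP response plausibly contain a PAC script at all? The function name looks_like_pac is my own invention for illustration. The only fact it relies on is that a PAC script has to define a FindProxyForURL function. How Internet Explorer itself reacts to a 404 is not something this sketch claims to know.

```python
def looks_like_pac(status: int, body: str) -> bool:
    """Return True only if an HTTP response plausibly carries a PAC script.

    Hypothetical helper for illustration: a real PAC script must define
    FindProxyForURL, so anything else (such as a 404 error page) is not one.
    """
    if status != 200:
        return False
    return "FindProxyForURL" in body


# The 404 page served by the local Microsoft-HTTPAPI listener:
print(looks_like_pac(404, "HTTP Error 404. The requested resource is not found."))

# A trivial but valid PAC script, for comparison:
print(looks_like_pac(200, 'function FindProxyForURL(url, host) { return "DIRECT"; }'))
```

The first call prints False and the second prints True.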

The main point is that I don’t remember that being there in the past. I felt its presence, probably a new “feature,” explained the current problem. What to do about it, however?

Since this is an enterprise, not a small shop with a couple PCs, turning off MsDepSvc is not a realistic option. It’s probably used for peer-to-peer software distribution.

Hmm. Let’s review why we think our original solution worked in the first place. It didn’t work so well when the DNS was clobbered by my ISP. Why? I think because the ISP put up a web server when it encountered a NXDOMAIN DNS response and that web server gave a 404 not found error when the browser searched for the PAC file. Turning the DNS entry to the loopback interface, 127.0.0.1, gave the browser a valid IP, one that it would connect to and quickly receive a TCP RST (reset). Then it would happily conclude there was no way to reach the PAC file, not use it, and try DIRECT connections to Internet sites. That’s my theory and I’m sticking to it!
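The theory hinges on what a TCP connection attempt to the PAC host does: is something listening, does the connection get refused right away (a TCP RST), or does it hang until a timeout? A rough Python sketch of that probe follows. The name probe_tcp and the three-second timeout are my own choices for illustration, and this only shows what the network does, not what the browser decides to do with it.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connect attempt: listening, refused, timeout or unreachable."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "listening"      # something accepted the connection
    except ConnectionRefusedError:
        return "refused"            # the peer answered with a TCP RST
    except socket.timeout:
        return "timeout"            # no answer at all within the timeout
    except OSError:
        return "unreachable"        # routing or other network error
```

On an affected laptop I would expect probe_tcp("127.0.0.1", 80) to report "listening" because of the local web server, whereas the original workaround counted on "refused".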

In light of all this information the following option emerged as most likely to succeed: set the Internet value of the DNS of the PAC file to a valid IP, and specifically, one that would send a TCP RST (essentially meaning that TCP port 80 is reachable but there is no listener on it).

We tried it and it seemed to work for me and my colleague. We no longer had the problem of not being able to connect to Internet sites with the PAC file configured and directly connected to the Internet.

I noticed however that once I got on the enterprise network via VPN I wasn’t able to connect to Internet sites right away. After about five minutes I could.

My theory is that the TTL on my PAC file DNS entry was too long. When you think about it, that DNS value gets cached by the PC and is retained even after the PC transitions to being Intranet-connected. The TTL was 30 minutes, so I shortened it to five minutes.
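The arithmetic behind that theory is simple enough to write down. This is only a sketch, and it assumes the PC honors the TTL exactly, which I have not verified (the browser may keep its own cache as well). The function name is made up for illustration.

```python
def seconds_until_fresh_lookup(ttl_seconds: int, cached_age_seconds: int) -> int:
    """How much longer a cached DNS answer may be served before a new query."""
    return max(ttl_seconds - cached_age_seconds, 0)


# Worst case: the Internet-facing answer was cached just before VPN came up.
print(seconds_until_fresh_lookup(30 * 60, 0))  # old 30-minute TTL -> 1800
print(seconds_until_fresh_lookup(5 * 60, 0))   # new 5-minute TTL  -> 300
```

So in the worst case the stale Internet-side answer could linger for up to half an hour with the old TTL, versus five minutes with the new one.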

I haven’t retested, but I think that adjustment will help.

Conclusion
I also haven’t gotten a lot of feedback from this latest fix, but I’m feeling pretty good about it.

Case: again mostly solved.

Hey, don’t blame me about the ambiguity. I never hear back from the users when things are working 🙂

References
A closely related case involving Verizon “clobbering” TCP RST packets also bit us.

Categories
Admin DNS IT Operational Excellence

The IT Detective Agency: How We Neutralized Nasty DNS Clobbering Before it Could Bite Us

This gets a little involved. But if you’re the IT expert called on to fix something, you better be able to roll up your sleeves and figure it out!

In this article, I described how some, but not all ISPs change the results of DNS queries in violation of Internet standards.

A Proxy PAC for All
This work was done for an enterprise. They want everyone to use a proxy PAC file whose location was to be (obfuscating the domain name just a little here) http://webproxy.intranet.drjohnstechtalk.com/proxy.pac. Centralized large enterprises like this sort of thing because the proxy settings are controlled in the one file, proxy.pac, by the central IT department.

So two IT guys try this PAC file setting on their work PC at their home networks. The guy with Comcast as his ISP reports that he can surf the Internet just fine at home. I, with Centurylink, am not so successful. It takes many minutes before an eventual timeout seems to occur and I cannot surf the Internet as long as I have that PAC file configured. But I can always uncheck it and life is good.

Now along comes a new requirement. This organization is going to roll out VPN without split tunneling, and the initial authentication to that VPN is a web page on the VPN switch. Now we have a real problem on our hands.

With my ISP, I can shut off the PAC file, get to the log-on page, establish VPN, but at that point if I wanted to get back out to the Internet (which is required for some job functions) I’d have to re-establish the PAC file setting. Furthermore it is desirable to lock down the proxy settings so that users can’t change them in any case. That makes it sound impossible for Centurylink customers, right?

Wrong. By the way the Comcast guy had this whole scenario working fine.

The Gory Details
This enterprise organization happened to have chosen legitimately owned but unused internal namespace for the PAC file location, analogous to my webproxy.intranet.drjohnstechtalk.com in my example. I reasoned as follows. Internet Explorer (“IE”) must quickly learn in the Comcast case that the domain name of the PAC file (webproxy.intranet.drjohnstechtalk.com) resolves with an NXDOMAIN and so it must fall back to making DIRECT connections to the Internet.

For the unfortunate soul with CenturyLink (me), the domain name is clobbered! It does resolve, and to an active web site. That web site must produce an HTTP 404 not found. At least you’d think so. Today it seems to produce a simplified PAC file, which I am totally astonished by. And I wonder if this is more recent behaviour introduced in an attempt to ameliorate this situation.

In any case, I reasoned that if they were clobbering a non-existent DNS record, we could actually define this domain name, but instead of going through the trouble of setting up a web server with the PAC file, just define the domain name as the loopback interface, 127.0.0.1. There’s no web server to connect to, so I hoped the browser would quickly detect this as a bad PAC URL, go on its way to make DIRECT connections to the VPN authentication web site, and then once VPN was established, use the PAC file again actively to permit the user to surf the Internet. And, furthermore, that this should work for both kinds of users: ones with DNS-clobbering ISPs and ones without.
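The three situations in that reasoning (no such domain, loopback, or resolving somewhere else) can be told apart with a few lines of Python. This is a sketch under my own naming: classify_pac_host and resolve_ipv4 are hypothetical helpers, and the labels are just shorthand for the cases described above.

```python
import ipaddress
import socket


def classify_pac_host(addresses):
    """Map the addresses the PAC host resolves to onto the cases discussed above."""
    if not addresses:
        return "nxdomain"            # nothing resolves: browser should go DIRECT
    if all(ipaddress.ip_address(a).is_loopback for a in addresses):
        return "loopback"            # the workaround: 127.0.0.1
    return "resolves-elsewhere"      # Intranet answer, or an ISP clobbering the name


def resolve_ipv4(name):
    """Return the IPv4 addresses for name, or an empty list if it does not resolve."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []
```

On a home network one would run classify_pac_host(resolve_ipv4("webproxy.intranet.drjohnstechtalk.com")) and hope to see "loopback" regardless of ISP.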

That’s a lot of assumptions in the previous paragraph! But I built the case for it – it’s all based on reasonable extrapolation from observed behaviour. More testing needs to be done. What we have seen so far is that this DNS entry does no harm to the Comcast user. Direct Internet browsing works, VPN log-in works, Internet browsing post-login works. For the CenturyLink user the presence of this DNS entry permitted the browser of the work PC to surf the Internet very readily, which is already progress. VPN was not tested but I see no reason why it wouldn’t work.

More tests need to be done but it appears to be working out as per my educated guess.

April 2012 Update
Our fix seemed to collapse like a house of cards all-of-a-sudden many months later. Read how instead of panicking, we re-fixed it using our best understanding of the problems and mechanisms involved. The IT Detective Agency: Browsing Stopped Working on Internet-Connected Enterprise Laptops

Conclusion
We found a significant issue with DNS clobbering as practiced by some ISPs in an enterprise-class application: VPN. We found a work-around after taking an educated guess as to what would work – defining webproxy… to resolve to 127.0.0.1. We could have also changed the domain name of the PAC file – to one that wouldn’t be clobbered – but that was set by another group and so that option was not available to us. Also, we don’t yet know how extensive DNS clobbering is at other ISPs. Perhaps some clobber every domain name which returns an NXDOMAIN flag. That’s what Google’s DNS FAQ seems to imply at any rate. A more sensible approach may have been to migrate to use the auto-detect proxy settings, but that’s a big change for an enterprise and they weren’t ready to do that. A final concern: what if the PC is running a local web server because some application requires it? That might affect our results.

Case: just about solved!

References
A related case of Verizon clobbering TCP reset packets is described here.

Categories
DNS IT Operational Excellence

DNS Clobbering – How ISPs Twist DNS Replies

Intro
Some ISPs have taken advantage of missing or broken DNS records, using them as an excuse to guide users to their own pages. From an Internet purist’s point-of-view this is bad behavior. I call it DNS clobbering.

In my article Google’s DNS Servers Rock! I mentioned that some ISPs provide a questionable feature that alters the results of DNS queries in unexpected ways, to their advantage.

In DNS if a domain name doesn’t exist the response should have the no such domain flag set. It’s that simple. So for instance I look for a resource record with the name webproxy.drjohnstechtalk.com:

dig webproxy.drjohnstechtalk.com

; <<>> DiG 9.7.1-P2 <<>> webproxy.drjohnstechtalk.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 26054
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;webproxy.drjohnstechtalk.com.  IN      A

;; AUTHORITY SECTION:
drjohnstechtalk.com.    10800   IN      SOA     ns71.domaincontrol.com. dns.jomax.net. 2011040901 28800 7200 604800 86400

See the NXDOMAIN and the ANSWER: 0? That's what I want to see for a non-existent domain name such as this. So all is good with my nameserver (in this case supplied by Amazon Cloud Northeast).
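If you want to check those two fields mechanically rather than by eye, a few lines of Python will pull them out of dig's text output. This is a sketch, with parse_dig_header being a name I made up. It only looks at the header lines in the format shown above.

```python
import re


def parse_dig_header(output: str):
    """Extract the response status and the answer count from dig's text output."""
    status = re.search(r"status: (\w+)", output)
    answers = re.search(r"ANSWER: (\d+)", output)
    if not status or not answers:
        raise ValueError("no dig header found")
    return status.group(1), int(answers.group(1))


sample = """\
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 26054
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
"""
print(parse_dig_header(sample))  # ('NXDOMAIN', 0)
```

A clobbered reply would instead come back as ('NOERROR', 1) or similar for a name that should not exist.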

Now let's try that at home, where I have CenturyLink as my ISP. Lo and behold, I get a different answer, a completely different answer. Unfortunately I have to be on their network to get the result and I currently am not. I will try their DNS server 207.14.188.36. I get:

dig www.xyzaabc.com @207.14.188.36

; <<>> DiG 9.3.2 <<>> www.xyzaabc.com @207.14.188.36
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1394
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;www.xyzaabc.com.               IN      A

;; ANSWER SECTION:
www.xyzaabc.com.        60      IN      A       184.106.31.182

;; Query time: 46 msec
;; SERVER: 207.14.188.36#53(207.14.188.36)
;; WHEN: Thu Sep 01 22:46:04 2011
;; MSG SIZE  rcvd: 64

When you use a web browser the browser is initiating these types of queries for you. So if you mistakenly enter the URL www.drjohns.drsjohntechtalk.com in your browser I would like you to get a browser-generated page-not-found error. With CenturyLink that doesn't happen. They assign any unresolvable domain name which begins with www or web an IP address that points you to a search page on their own web server!

I'm sure they would argue that this is done as a convenience for the user, but I'm a user, too, and I don't like this trick of theirs. I'm sure it earns them a bit of revenue as well. I expect ISPs to follow the rules and the rules are pretty clear in this case.

Not all ISPs do this, by the way. A colleague with Comcast as his ISP did some DNS queries for me. The results showed that Comcast was not clobbering these types of resource records.

And it gets worse than that. I actually witnessed an enterprise application that behaved completely differently depending on whether an ISP played this sort of trick or not. And that's nasty.

It's hard for me to get more data except through cooperating customers of other ISPs. Try a few queries for these fictitious domain names and leave a comment with your results and what ISP you use:

www.xyzaabc.com
webproxy.xyzaabc.net
abc.xyzaabc.us

If you don't have a nice home Linux system or cygwin containing dig, you can even use nslookup on a Windows OS. From a CMD window:

nslookup www.xyzaabc.com
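If Python is handier than dig or nslookup, the same test can be sketched as below. The function names are my own. It uses whatever DNS servers the operating system is configured with, and it assumes the three fictitious names are still unregistered, which may no longer be true by the time you read this.

```python
import socket

TEST_NAMES = ["www.xyzaabc.com", "webproxy.xyzaabc.net", "abc.xyzaabc.us"]


def system_resolve(name):
    """Resolve name with the OS resolver; None means it did not resolve."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        return None


def clobber_report(names, resolve=system_resolve):
    """Return {name: address or None}; an address for a made-up name is suspect."""
    return {name: resolve(name) for name in names}
```

Calling clobber_report(TEST_NAMES) from a home network should give None for every name if the ISP is not clobbering. The resolve parameter is there so a different lookup function can be swapped in.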

Results

ISP              Clobbers DNS?  DNS Server tested  Date    Example Clobber
CenturyLink      YES            207.14.188.36      2011    www.xyzaabc.com returns 72.32.218.57
Comcast          NO             unknown            6/2011  NA
Amazon Cloud NE  NO             172.16.0.23        8/2011  NA

The Amazon Cloud had better not clobber DNS. That is a server environment, and servers may be affected much more than individual users if they get wrong DNS results back.

Categories
DNS IT Operational Excellence Network Technologies Uncategorized

Google’s DNS Servers Rock!

Intro
DNS is the Domain Name System, the Internet service that converts mnemonic names like www.mysite.com into IP addresses, e.g., 200.54.129.57.

I tried to run a cache-only DNS server for use by a proxy server. What I found is that certain sites were not accessible on a frequent basis. I think uol.com.br is one of the problem sites (need to check this). It may not mean much to a US audience, but it’s really popular in Brazil!

At some point I happened to learn that Google has a public DNS service. This is worth pondering. No one of any repute had offered a DNS service to that point. There are a host of concerns about security, especially DNS cache poisoning. They blazed a trail, and did it in a way only Google and very few other major infrastructure players could. Not only did they offer a DNS service, they put their DNS servers all over the Internet and created convenient anycast addresses for their servers.

I am no expert on anycast addresses. You can look it up on Wikipedia, however. The essence for my purposes is that with a single IP address you’re going to hit the closest server, network-wise. So no matter where you are some Google DNS server is not far away. Try it. The anycast addresses are 8.8.8.8 and 8.8.4.4. They don’t mind, really! You can ping them. Traceroute to them, whatever. From the Amazon cloud Northeast 8.8.8.8 responds to PINGs in 3.4 ms. That’s really low. Not so low as to make me think they are in the same data center (it is different companies after all), but not far away.

The gold standard for running a DNS service is BIND. I have been running it for many years now and I want to give the Internet Systems Consortium their due for providing this wonderful application. Once I got wind of my DNS difficulties as mentioned above, I had to wonder why everyone else wasn’t complaining. They had to be using something else. I ran a flat-out performance test: 5000 queries from an actual proxy log, fed straight to my BIND DNS server, and then to Google’s DNS server 8.8.8.8. I have to dig up the numbers, but Google’s won by quite a bit! This result was actually surprising because you’re always going off-site to the Google DNS server, whereas my server can build up its cache and is right on my network. From where I tested the Google server was about 11 ms away. So 5000 x 11 ms = 55 s. So there is a 55 s handicap from just network considerations alone! Yet it is faster. On the quickest of queries the local server is indeed faster, but what happens is that over the course of real life queries, you always get a few problematic ones which either time out or just seem to take a long time to get back a response. That’s what kills the traditional DNS server and where Google has (obviously) made some optimizations.
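For anyone who wants to repeat this kind of test, here is a rough Python sketch of a timing harness. It is not the tool I used. The name time_queries is hypothetical, and the resolve argument stands for whatever lookup function you want to measure (one pointed at the local server, one at 8.8.8.8, and so on).

```python
import time


def time_queries(names, resolve):
    """Time each lookup; return (total_seconds, slowest_name, slowest_seconds)."""
    total = 0.0
    slowest_name, slowest = None, 0.0
    for name in names:
        start = time.perf_counter()
        try:
            resolve(name)
        except Exception:
            pass  # failed or timed-out lookups still count toward elapsed time
        elapsed = time.perf_counter() - start
        total += elapsed
        if elapsed > slowest:
            slowest_name, slowest = name, elapsed
    return total, slowest_name, slowest


# The network handicap estimated in the text: 5000 queries at about 11 ms each.
print(5000 * 11 / 1000, "seconds")  # 55.0 seconds
```

Reporting the slowest lookup alongside the total is deliberate, since it is the handful of slow or timed-out queries that dominate the result.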

And, that’s not all! Google also deals in a more forgiving fashion with broken domain names. I used to get on my high horse and proclaim to others about how broken their DNS servers are – it’s no wonder I can’t resolve their names, which means, by the way, I also cannot get to their web site nor send them email!

It’s effectively like taking yourself off the Internet, or so I thought. Turns out in some cases that’s only true if you’ve constrained yourself to resolving names with BIND. You see, BIND enforces the rules. And I’m a believer in rules. The Internet has about 5,000 technical rules called RFCs. DNS is a topic of many of these rules. The Internet could only have expanded to the size it currently has because all the major players agreed to abide by those rules. What Google has done with their server, in effect, is to say, “Well, if you don’t follow the rules, we’re going to try to work with you anyways.”

Here’s a concrete example: appliedcoatings.ca. I guess at some point they’ll actually fix their severely broken DNS, but at the time I write this, August 21, 2011, these comments are valid and their domain is severely broken. In fact, I was amazed that people weren’t jumping up and down screaming at them. I couldn’t even send an email to them. That’s akin to knocking yourself off the Internet, right? Ah, but it all depends on whose DNS servers you are using!

There used to be lots of good free DNS analyzers, like dnsreport.com. You can still find a few around. www.zonecheck.fr, for instance. It shows FAILURE. If it were better written it would show the real problem, which is a lame delegation. But we’re experts, and we don’t need such tools! We will do the queries ourselves and show the lame delegation. We start by learning who are the authoritative nameservers for .ca, the top-level domain used in Canada:

 dig ns ca

; <<>> DiG 9.7.1-P2 <<>> ns ca
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52928
;; flags: qr rd ra; QUERY: 1, ANSWER: 10, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;ca.                            IN      NS

;; ANSWER SECTION:
ca.                     83585   IN      NS      a.ca-servers.ca.
ca.                     83585   IN      NS      c.ca-servers.ca.
ca.                     83585   IN      NS      e.ca-servers.ca.
ca.                     83585   IN      NS      f.ca-servers.ca.
ca.                     83585   IN      NS      j.ca-servers.ca.
ca.                     83585   IN      NS      k.ca-servers.ca.
ca.                     83585   IN      NS      l.ca-servers.ca.
ca.                     83585   IN      NS      m.ca-servers.ca.
ca.                     83585   IN      NS      z.ca-servers.ca.
ca.                     83585   IN      NS      sns-pb.isc.org.

;; ADDITIONAL SECTION:
a.ca-servers.ca.        83594   IN      A       192.228.27.11

Now we ask one of them about the nameservers for appliedcoatings.ca:

 dig ns appliedcoatings.ca @a.ca-servers.ca.

; <<>> DiG 9.7.1-P2 <<>> ns appliedcoatings.ca @a.ca-servers.ca.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 288
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;appliedcoatings.ca.            IN      NS

;; AUTHORITY SECTION:
appliedcoatings.ca.     86400   IN      NS      sp2.domainpeople.com.
appliedcoatings.ca.     86400   IN      NS      sp1.domainpeople.com.

So far everything's cool. Now, since the authoritative flag (AA) was not present in that response we re-ask that query, but now to one of the nameservers that's supposed to be authoritative for that domain:

dig ns appliedcoatings.ca @sp2.domainpeople.com.

; <<>> DiG 9.7.1-P2 <<>> ns appliedcoatings.ca @sp2.domainpeople.com.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24373
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;appliedcoatings.ca.            IN      NS

;; ANSWER SECTION:
appliedcoatings.ca.     86400   IN      NS      ns1.domainpeople.com.
appliedcoatings.ca.     86400   IN      NS      ns2.domainpeople.com.

Uh-oh. That's not supposed to happen. We're getting back an entirely different set of nameservers. That's a lame delegation. The domain should be considered completely broken. I think even BIND might be forgiving up to this point. A BIND resolver does these types of queries to get at the answer. At this point it says, "OK, this is strange, but not necessarily fatal. I will ask my subsequent queries to ns1.domainpeople.com and ns2.domainpeople.com since they are listed as being the nameservers of record."
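The comparison we just did by eye, the NS names the parent zone hands out versus the NS names the child zone claims for itself, is easy to sketch in Python. The function name delegation_mismatch is my own, and the data is simply copied from the two dig outputs above.

```python
def delegation_mismatch(parent_ns, child_ns):
    """Compare NS names from the parent zone with those the child zone reports."""
    def normalize(names):
        return {n.rstrip(".").lower() for n in names}

    parent, child = normalize(parent_ns), normalize(child_ns)
    return {
        "only_in_parent": sorted(parent - child),
        "only_in_child": sorted(child - parent),
    }


report = delegation_mismatch(
    ["sp2.domainpeople.com.", "sp1.domainpeople.com."],  # from a.ca-servers.ca
    ["ns1.domainpeople.com.", "ns2.domainpeople.com."],  # from sp2.domainpeople.com
)
print(report)
```

Two empty lists would mean the parent and child agree. Here neither list is empty, which is the mismatch described above.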

So now let's get to something useful: looking up the mail exchanger record so we see how to deliver mail to this domain. BIND, which has been fastidiously following the rules, does it as follows:

dig mx appliedcoatings.ca @ns1.domainpeople.com.

; <<>> DiG 9.7.1-P2 <<>> mx appliedcoatings.ca @ns1.domainpeople.com.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 49996
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;appliedcoatings.ca.            IN      MX

;; Query time: 79 msec
;; SERVER: 204.174.223.72#53(204.174.223.72)
;; WHEN: Sun Aug 21 19:05:43 2011
;; MSG SIZE  rcvd: 36

That's not good. Status is REFUSED. But BIND can even forgive this slight. There is one more nameserver to try after all, right? Last chance query:

dig mx appliedcoatings.ca @ns2.domainpeople.com.

; <<>> DiG 9.7.1-P2 <<>> mx appliedcoatings.ca @ns2.domainpeople.com.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 44404
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;appliedcoatings.ca.            IN      MX

;; Query time: 72 msec
;; SERVER: 64.40.96.140#53(64.40.96.140)
;; WHEN: Sun Aug 21 19:07:34 2011
;; MSG SIZE  rcvd: 36

Status also REFUSED. Now we are really and truly dead. If you are using a BIND nameserver you have no way to send email to anyone at appliedcoatings.ca. But not so with Google!

Of course I don't know how Google wrote their DNS server, but I do think that some of their infrastructure experts wrote it themselves rather than using open source programs. So with a Google nameserver you will get a response:

dig mx appliedcoatings.ca @8.8.8.8

; <<>> DiG 9.7.1-P2 <<>> mx appliedcoatings.ca @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6901
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;appliedcoatings.ca.            IN      MX

;; ANSWER SECTION:
appliedcoatings.ca.     82805   IN      MX      10 mail.appliedcoatings.ca.

;; Query time: 4 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sun Aug 21 19:11:14 2011
;; MSG SIZE  rcvd: 57

and just to close the loop and make sure this is a valid host you would do this:

dig mail.appliedcoatings.ca @8.8.8.8

; <<>> DiG 9.7.1-P2 <<>> mail.appliedcoatings.ca @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35190
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;mail.appliedcoatings.ca.       IN      A

;; ANSWER SECTION:
mail.appliedcoatings.ca. 86400  IN      A       66.183.21.181

And we can go the next step and begin an SMTP conversation with that server to make sure it is really operating. After all, if they messed up DNS there's no telling what else they might have gotten wrong.

 telnet  66.183.21.181 25
Trying 66.183.21.181...
Connected to 66.183.21.181.
Escape character is '^]'.
220 mail.appliedcoatings.ca Microsoft ESMTP MAIL Service, Version: 6.0.3790.4675 ready at  Sun, 21 Aug 2011 16:22:04 -0700
HELO localhost
250 mail.appliedcoatings.ca Hello [50.17.188.196]
quit
221 2.0.0 mail.appliedcoatings.ca Service closing transmission channel
Connection closed by foreign host.

Yup. They've got an operating mail server at that IP.

So we can reverse engineer a bit what Google's DNS server must have done behind the scenes to arrive at a valid answer where BIND could not. I'm 100% sure that Google would have also done the query

dig mx appliedcoatings.ca @ns1.domainpeople.com

since that is the right thing to do. But not getting a satisfactory answer (status: REFUSED), what it must do additionally after getting refused a second time by ns2.domainpeople, is to go back to the originally named nameservers sp1 and sp2. Watch what happens in that case:

 dig mx appliedcoatings.ca @sp1.domainpeople.com.

; <<>> DiG 9.7.1-P2 <<>> mx appliedcoatings.ca @sp1.domainpeople.com.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10226
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;appliedcoatings.ca.            IN      MX

;; ANSWER SECTION:
appliedcoatings.ca.     86400   IN      MX      10 mail.appliedcoatings.ca.

;; AUTHORITY SECTION:
appliedcoatings.ca.     86400   IN      NS      ns1.domainpeople.com.
appliedcoatings.ca.     86400   IN      NS      ns2.domainpeople.com.

;; ADDITIONAL SECTION:
mail.appliedcoatings.ca. 86400  IN      A       66.183.21.181

The AA (authoritative) flag is set in the response. So it's a good response, but it comes from the "wrong" nameserver. Nevertheless, it is a response and it gets anyone using that nameserver more functionality than someone using BIND.
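To spell out my guess at that fallback, here is a Python sketch. I want to be clear that this is only my reverse-engineered theory written as code, not Google's actual algorithm, which I have no knowledge of. The function resolve_with_fallback and its query parameter are hypothetical, and the canned responses are taken from the dig outputs above so that nothing touches the network.

```python
def resolve_with_fallback(qname, qtype, child_ns, parent_ns, query):
    """Ask the child-listed nameservers first, then fall back to the parent's list.

    query(server, qname, qtype) is supplied by the caller and returns
    (status, answers), for example ("NOERROR", [...]) or ("REFUSED", []).
    """
    tried = []
    fallback = [s for s in parent_ns if s not in child_ns]
    for server in list(child_ns) + fallback:
        status, answers = query(server, qname, qtype)
        tried.append((server, status))
        if status == "NOERROR" and answers:
            return answers, tried
    return [], tried


# Canned replies copied from the dig transcripts in this post.
CANNED = {
    "ns1.domainpeople.com": ("REFUSED", []),
    "ns2.domainpeople.com": ("REFUSED", []),
    "sp1.domainpeople.com": ("NOERROR", ["10 mail.appliedcoatings.ca."]),
}


def canned_query(server, qname, qtype):
    return CANNED[server]


answers, tried = resolve_with_fallback(
    "appliedcoatings.ca", "MX",
    ["ns1.domainpeople.com", "ns2.domainpeople.com"],
    ["sp1.domainpeople.com", "sp2.domainpeople.com"],
    canned_query,
)
print(answers)  # ['10 mail.appliedcoatings.ca.']
```

A strict resolver in this picture would stop after the two REFUSED replies and report failure. The forgiving one keeps going and gets the MX record from sp1.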

Conclusion
So far we've got three advantages speaking favorably for Google's DNS server: it's faster, its answers are more complete and it's universally available. Wait, there's more! Another nice thing is what it does not do. Some ISPs have a "feature" I call DNS clobbering. In fact it's so annoying I will devote a whole blog post to describing it in more detail. Essentially they take license with DNS and make up answers to some queries! It's true and it's truly annoying. Not all ISPs do this but mine certainly does. So the other nice thing about Google DNS is that it does not do DNS clobbering and it's available for you to use at home to avoid this annoying feature. You just set your DNS servers rather than have them assigned automatically via DHCP.

Other Resources
I should mention that while researching public DNS servers I was also led to commercial versions of the same thing. I went so far as to test the timings on one of those services and found that it is more distant, round-trip-wise, than Google's anycast server. Stands to reason. Google's got the best Internet access of anyone. They're on all the major highways. The commercial offerings have some additional cool features, however. They can serve as a URL filter. So if someone puts in a URL which leads to a malicious site, for example, they can respond with an answer that spares you from going to that infected site. This is a little more crude than URL filtering at the proxy level, since a DNS server has no knowledge of the URI whereas a proxy URL filter does, but it could be quite serviceable. I'm not sure it allows you to pick and choose URL categories to block as with a URL filter (gambling, porn, hacking sites, etc.).

A lot more information on using Google DNS is at http://code.google.com/speed/public-dns/docs/using.html.

September 1 Update - a Crack in the Infrastructure
I now have my first case of a domain name which Google DNS did not resolve correctly, and for no apparent reason. The domain name is forums.tweaktown.com. Here's proof of Google's failure, followed immediately by Amazon's DNS servers' success:

dig forums.tweaktown.com @8.8.8.8

; <<>> DiG 9.7.1-P2 <<>> forums.tweaktown.com @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 15826
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;forums.tweaktown.com.          IN      A

;; AUTHORITY SECTION:
tweaktown.com.          116     IN      SOA     ns21.domaincontrol.com. dns.jomax.net. 2011060602 28800 7200 604800 86400

;; Query time: 4 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Thu Sep  1 14:40:50 2011
;; MSG SIZE  rcvd: 106


 dig forums.tweaktown.com

; <<>> DiG 9.7.1-P2 <<>> forums.tweaktown.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52290
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 1

;; QUESTION SECTION:
;forums.tweaktown.com.          IN      A

;; ANSWER SECTION:
forums.tweaktown.com.   1885    IN      A       38.101.21.25

;; AUTHORITY SECTION:
tweaktown.com.          1943    IN      NS      ns22.domaincontrol.com.
tweaktown.com.          1943    IN      NS      ns21.domaincontrol.com.

;; ADDITIONAL SECTION:
ns21.domaincontrol.com. 753     IN      A       216.69.185.11

;; Query time: 0 msec
;; SERVER: 172.16.0.23#53(172.16.0.23)
;; WHEN: Thu Sep  1 14:40:55 2011
;; MSG SIZE  rcvd: 122

All BIND servers I tried during this time returned the correct answer.

Is this an isolated incident or a tip of an iceberg of problems? I hope it is a one-off. I'll post updates as I find out more. I am slightly concerned now.

References and related
I finally wrote my own web interface to DNS and published the code I did it with. Check it out here.

A web interface to Google's public DNS service, which will give you more debug information, is https://dns.google.com/