Categories
Admin Network Technologies

Are we in Brazil? We are NOT in Brazil.

But Google thinks we are:

$ curl -i http://www.google.com/

returns:

HTTP/1.1 302 Found
Location: http://www.google.com.br/?gws_rd=cr&ei=nrA4UoPZB8fb4AP3loGYAg
Cache-Control: private
Content-Type: text/html; charset=UTF-8
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Date: Tue, 17 Sep 2013 19:42:22 GMT
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic
Content-Length: 262
Proxy-Connection: Keep-Alive
Connection: Keep-Alive
Set-Cookie: PREF=ID=9ed4d9e91767f4ce:FF=0:TM=1379446942:LM=1379446942:S=O9jxL9sXRCc0kF-E; expires=Thu, 17-Sep-2015 19:42:22 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=BJA1pj-o2tu-IDI9KS5MjQvoKPJoY8x3uREAmeGCItGKxxfovFqdjPhvBaHUHcISFLVcTWSmCXwiTAuVhF4DIYCbPcuubfBBEGYaNy2wgveeyvGj35xTzM4Oo-yCLaDe; expires=Wed, 19-Mar-2014 19:42:22 GMT; path=/; domain=.google.com; HttpOnly
 
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.com.br/?gws_rd=cr&amp;ei=nrA4UoPZB8fb4AP3loGYAg">here</A>.
</BODY></HTML>

and MSN is similar:

$ curl -i http://www.msn.com/

HTTP/1.1 302 Found
Cache-Control: no-cache, no-store
Pragma: no-cache
Content-Type: application/xhtml+xml; charset=utf-8
Location: http://br.msn.com/?rd=1&ucc=BR&dcc=BR&opt=0
P3P: CP="NON UNI COM NAV STA LOC CURa DEVa PSAa PSDa OUR IND"
X-AspNet-Version: 4.0.30319
S: BN2SCH030301048
Edge-control: no-store
Date: Tue, 17 Sep 2013 20:02:42 GMT
Content-Length: 172
Proxy-Connection: Keep-Alive
Connection: Keep-Alive
Set-Cookie: MC1=V=3&GUID=7173a7e9bb5444fc88d52c3d302f0cdf; domain=.msn.com; expires=Thu, 17-Sep-2015 20:02:42 GMT; path=/
Set-Cookie: MC1=V=3&GUID=7173a7e9bb5444fc88d52c3d302f0cdf; domain=.msn.com; expires=Thu, 17-Sep-2015 20:02:42 GMT; path=/
Set-Cookie: brdSample=89; domain=.msn.com; expires=Thu, 17-Sep-2015 20:02:42 GMT; path=/
blah, blah
 
 
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="http://br.msn.com/?rd=1&amp;ucc=BR&amp;dcc=BR&amp;opt=0">here</a>.</h2>
</body></html>

Now a request from the IP Next door to it, on the same subnet, produces an entirely different result:

$ curl -i http://www.google.com/

Date: Tue, 17 Sep 2013 19:48:11 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more in
fo."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic
Transfer-Encoding: chunked
Proxy-Connection: Keep-Alive
Connection: Keep-Alive
Set-Cookie: PREF=ID=0e280a6247dbe89f:FF=0:TM=1379447291:LM=1379447291:S=Ws_EoUItgwgQMFLL; expires=Thu, 17-Sep-2015 19:48:11
 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=MEOnbrr2pgKz8MVKEzIQ2v9f4nkR0o5FXJxbeBRk3GZg6GsfKNRZm9UEHTzcuRF5A6D2kXKa4N6FQnP88fkVLrgDokdOlXvX1Oba2JzC
koZ0K0PiACYSiTPCru5eH9C3; expires=Wed, 19-Mar-2014 19:48:11 GMT; path=/; domain=.google.com; HttpOnly
 
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage"><head><meta content="Search the world's information,
 including webpages, images, videos and more. Google has many special features
...

This is a problem when you do searches in which you expect to get localized results, like a search for dinner. Believe me, you see completely different things if Google thinks your IP is in Brazil.

And some services simply shut themselves off. Pandora simply does not work, for instance.

What is going on?

It seems some Geo-location service has incorrectly tagged our IP as Brazil. Not sure how to fix that…

Google does have an IP location change form, but they say it will take a month for them to make the change. That hardly seems like Internet time! https://support.google.com/websearch/contact/ip And what about the others? I don’t see any simple fix for Pandora for instance.

Possible solution?
Someone suggested that the proxy has cached Google’s home page. This is one thing I didn’t test for. However, these proxies are very sophisticated and know to refresh popular pages frequently so I very much doubt that was the cause of the problem.

Workarounds
Google present a link Google.com in the lower right corner of the page which you can click on to pop you to the English-language version of the site, but it’s still not location-aware and treats you more as an English traveler in Brazil when you search for a generic term like restaurant.

You can also go to http://www.google.com/ncr. I personally haven’t tried it, but that’s a tip I just received.

Oct 30 update
Well, finally, after filling out the IP change form twice, and of course keeping all Brazilian users off that proxy, Google has finally deemed fit to declare our proxy really is in the US after all.

It happens again…
Some light usage by a fraction of the user base and I find all proxies are in Brazil yet again. I fill out the form March 19th 2015 and a couple times after that. Now it is May 11th and Google still has not acted. This time I have not kept Brazil users off the proxies, because really this is ridiculous. When users complain I urge them to use duckduckgo.com for its better privacy protections. June 24th update. I’ve been checking every week just about. I’ve filled out Google’s own form about four times in total. So finally after three months, Google set us back to US Enlgish. Without a word to us, of course. Incredible.

Google has gotten big and unresponsive.

Conclusion
Use duckduckgo.com.

References and related
A good web site to learn where your IP is without too much malvertising that you often get with these kinds of things is: https://ipinfo.io/

Categories
Admin Network Technologies

Extended Passive Mode FTP through Checkpoint Firewall

Intro
The vast majority of time there is no problem doing an FTP to a server behind a firewall protected by Checkpoint’s Firewall-1. But occasionally there is.

The details

The problem I am about to document I think will only occur on a server that has multiple interfaces. I have seen it occur on multiple operating systems, so that doesn’t seem to matter. On the other hand, I have also not seen it on other similar systems, a point which I don’t fully yet understand.

Nevertheless, a work-around is always appreciated, so I provide what I found here, to complete my extensive documentation of problems I’ve encountered and resolved.

Here is a snippet from the FTP session showing the problem:

ftp> cd uploadDirectory
250 CWD command successful.
ftp> put smallfile.txt
local: smallfile.txt remote: smallfile.txt
229 Entering Extended Passive Mode (|||36945|)
200 EPRT command successful.
421 Service not available, remote server timed out. Connection closed

And here is the solution:

Enter epsv4 after logon and before any other commands are issued. Problem fixed!

Conclusion
We have shown a way to fix a firewall-related problem that manifests itself during extended passive mode FTPs. Some more research should be done to understand under what circumstances this problem should be expected, but it seems to occur with a Checkpoint Firewall-1 firewall and an FTP server with multiple interfaces.

Categories
Admin Network Technologies

The IT Detective Agency: trouble with wireless at home

Intro
I don’t usually have the luxury of writing about a mystery I’ve solved right out of my own home, but there finally is one that I got got to the bottom of recently – poor WiFi performance.

The details
Considering that I deal with this stuff for a living, I have a thread-bare setup at home. After my company-issued router’s WiFi began to work unreliably, I resuscitated an old Linksys wireless router, WRK54G V2. Superficially it seemed to work. But we weren’t very demanding of it.

It eventually seemed to be the case, as visitors mentioned, that streaming videos does not work through wireless. This was hard for me to check, with my broken-down, aging equipment. I have a desktop which always freezes and crashes is you play any Youtube video. And a Netbook which kind of worked better, but its peculiarity is that its ethernet interface doesn’t work. Wirelessly, its version of Flash was too old and insecure for Firefox, and attempts to update Flash using WiFi in turn were unsuccessful.

In general the Linksys router, as I eventually realized, seemed to initially serve up large downloads ok, but then at some point during the download, things begin to crawl and you are left with a download that proceeds at 10 kbit/s or something ridiculously slow like that.

Providing mixed evidence is a Sony BlueRay player. using WiFi it could sort of manage to show a HuluPlus TV episode. You might have to be patient at times while it’s loading, but we did get through a full episode of Grey’s Anatomy recently.

After more complaints I decided enough is enough. It seemed as though my WiFi was the most likely suspect, sifting through the mixed evidence. I perhaps waited so long because who’d think they’d be dealing with two bad WiFi routers from two totally different vendors?

So hedging my bets, I didn’t go all out with a new Gbit router. I reached back in time a little and got a refurbished Cisco 1200E wireless-N router. It was only $28 from Amazon. But before buying it, I read the comments and got one idea about routers: sometimes they need to be rebooted!

This is pretty funny, really, because it is probably apparent to any homeowner, and here I am, a specialist, missing this point. You see with Cisco enterprise-class gear you almost never have to reboot to fix a problem. These things run uninterrupted for not only weeks and months at a time, years at a time is also not at all uncommon. Same for some Unix servers. So from my perspective rebooting is something for consumer devices running Microsoft OSes!

So, before rebooting the Linksys to see if that would cure it, I ran a Ping to Google’s DNS server (very easy to remember its IP) from a CDM window:

> ping -t 8.8.8.8

I didn’t preserve the output, but it wasn’t pretty. It was something like this:

Pinging 8.8.8.8 with 32 bytes of data:
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
Reply from 8.8.8.8: bytes=32 time=369ms TTL=56
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
Reply from 8.8.8.8: bytes=32 time=1204ms TTL=56
Reply from 8.8.8.8: bytes=32 time=284ms TTL=56
...

51 msec – fine. But round-trip times much greater than that? That’s not right.

So I hopefully reboot the Linksys router and re-run the test on the Netbook:

Pinging 8.8.8.8 with 32 bytes of data:
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
Reply from 8.8.8.8: bytes=32 time=50ms TTL=56
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
...

Much more consistent.

Try a Youtube video from Firefox. Nope, need to update Flash. Update Flash. Nope – download times out and kicks me out.

So I’ve accomplished nothing in rebooting in terms of results that matter.

That’s when I decided to check out of Amazon with that refurbished router.

Aside about Wireless-N
Given my ancient equipment, I was concerned that Wireles-N routers might not be compatible with my wireless radios, which would only support G. Is it backwards compatible? Yes. Some quick research showed that and my own experience confirmed it.

Conclusion
The setup of the router was pretty straightforward although it froze at some point just after I set the wireless password. It helps to have done this a zillion times before. At that point I observed what my default gateway was and hit it as a web site URL. Guessed the admin password incorrectly a zillion times, until I tried the wireless password as the admin password, and, wham – I was in and happily configuring away…

More importantly, I went to that Netbook, updated Flash. No problems. Ran a Youtube video. No problems. Ran a speedtest.net test (which wouldn’t even run before this). Numbers look as good as my wired connection: 6 mbit download, 0.6 mbit upload.

Last test is to see where the speed maxes out within my home network. I plan to hit my Raspberry Pi web server to test this and will provide results as soon as they are available.

Conclusion to the conclusion
So I really was cursed with two bad wireless routers. Sometimes using 10-year-old equipment is really not worth the $30 saved in deferred spending. Read product reviews on Amazon to get hints about real issues others have faced.
To be continued…

Categories
Apache Linux Network Technologies Perl Raspberry Pi

Getting started on my Raspberry Pi

Intro
The Raspberry Pi computer is an awesome idea. Its performance is surprisingly good as well, as I will show below. Available packages? Not so impressive. I share some old X-windows tricks which will allow you to bring up the GUI without ever using the HDMI port.

The details
My Methodology
I was too lazy to set up an HDMI console plus keyboard and mouse. I’m more a server guy anyways so I’m more interested in what I can accomplish from a command prompt. And this also makes getting started that much easier. I had burned the Raspbian Wheezy image to a super-fast SD card (more on that below) the day that my Pi came in the mail. I attached power and ethernet, booted it up, guessed the IP it acquired by running some PINGs, did an ssh using the pi/raspberry user and Bingo! I was in. It couldn’t be easier. How I tested GUI applications without a console is explained further down below.

First Impressions
It feels fast.

Packages
Not much seems to be there by default – no apache, not many X utilities. There is a lame X browser called x-www-browser. I thought this is Debian, right? So we can just start downloading Debian packages, like Firefox. Wrong! It doesn’t work that way. There’s no Firefox, Safari, Chrome or Opera! It does come pre-loaded with curl, however, ha, ha.

No, the Raspbian FAQ explains why this is. It’s rather complicated. I guess the compiler works though I haven’t tested it yet. So I suppose you could compile packages from their source code.

The x-terminal-emulator is pretty decent, however.

If it comes with a web server, I didn’t notice. So I quickly checked for the availability of apache. It’s available. Then installed it:

> sudo apt-get install apache2

That worked out well. It installed it and the packages it depended on and even launched it, and it all felt fairly peppy. See the suggested fix further down if this gives you errors. The default HTML DOCROOT is /var/www. I accessed it locally:

> curl localhost

And a welcome message displayed. A good start.

Where’s the rest of my 16 GB SD card gone to?

Original disk layout:

pi@raspberrypi:~$ df -k
Filesystem     1K-blocks    Used Available Use% Mounted on
rootfs           1804128 1492908    219572  88% /
/dev/root        1804128 1492908    219572  88% /
devtmpfs          224436       0    224436   0% /dev
tmpfs              44900     204     44696   1% /run
tmpfs               5120       0      5120   0% /run/lock
tmpfs              89780       0     89780   0% /run/shm
/dev/mmcblk0p1     57288   16872     40416  30% /boot

Layout after raspi-config:

pi@raspberrypi:~$ df -k
Filesystem     1K-blocks    Used Available Use% Mounted on
rootfs          15251960 1494852  12982544  11% /
/dev/root       15251960 1494852  12982544  11% /
devtmpfs          224436       0    224436   0% /dev
tmpfs              44900     196     44704   1% /run
tmpfs               5120       0      5120   0% /run/lock
tmpfs              89780       0     89780   0% /run/shm
/dev/mmcblk0p1     57288   16872     40416  30% /boot

Whew! That was easy. All 16 GB accounted for and actively used.

Was it worth it to buy that UHS SD card?
I didn’t want a sluggish server, so I paid a couple bucks more and bought a 16 GB SD UHS (ultra high speed) card for my “disk,” not knowing whether or not the Pi had the muscle to put it to work.

A quick aside about SD cards
I did a quickie self-education on this topic. Most SD cards are rated by class, so a class 4 SD card can do 4 MB/sec I/O, and a class 10 card can do at least 10 MB/sec. Faster still are the UHS SD cards. My Sandisk, which only cost about $19, is rated for 45 MB/sec I/O. A great write-up on this topic specifically for Raspberry Pi is: Raspberry Pi SD Card Speed Test – Raspberry Pi

diskSpec.pl benchmark (higher numbers are better)
1333 file creation/destruction operations per second – Raspberry Pi with UHS SD card
6666 file creation/destruction operations per second – EBS volume on small image running CentOS in Amazon cloud
26000 file creation/destruction operations per second – high-end HP server (G7 DL380) running SLES 11

I think I provided the source for this simple Perl program I wrote, diskSpec.pl. It creates a file, writes a random number into it, then deletes it – that all counts as one operation. Here it is:

#!/usr/bin/perl
# DrJ, 1/2000
# Test disk I/O
$DIR = $ARGV[0];
chdir($DIR);
$t0 = time();
while(1) {
  $ran = rand();
  open(FILE,"&gt;$ran") || die "Cannot open file $ran in directory $DIR!!\n";
  print FILE $ran;
  close(FILE);
  unlink($ran);
  $cnt++;
  if ($cnt % 20000 == 0) {
    $rate = $cnt / (time() - $t0) ;
    print "File creation/desctruction rate: $rate\n";
  }
}

DrJ 2017 Note: The notes below are historical and does not seem to work at all for the Raspberry Pi 3 loaded with NOOBS. In NOOBS you select your OS to install. You can’t ssh to it. I know. I just tried! Even after you install Raspbian Wheezy, you still can’t access it via ssh until you enable the ssh daemon with raspi-condfig.

How to get the GUI working without a console
I have this feeling that many people trying out the Pi won’t have the faintest idea how X windows works, unlike us Unix old-timers. It’s fun to put 20-year-old lessons to work on something new. Like I said I’m lazy and didn’t feel the need to set up an actual console to the thing. I used some old X features to allow me to launch specific X-windows applications that are pre-loaded on the device, and display them on my PC. How?

On a Windows PC you install Cygwin. Then launch the XWin Server. You ssh to your pi. How do you know its IP the first time? Guess! It picks it up via DHCP, so start PINGing around the range where your other devices are numbered. My PC is 192.168.5.12/24, my pi was 192.168.5.16. Maybe you have a bunch of devices responding to PING and are unsure which is which? Your MAC table is your friend. Here’s mine:

C:\Documents and Settings&gt;arp -a
 
Interface: 192.168.5.12 --- 0x2
  Internet Address      Physical Address      Type
  192.168.5.1           00-14-f6-e0-c0-4c     dynamic
  192.168.5.16          b8-27-eb-dd-21-02     dynamic
  192.168.5.99          00-90-a9-bb-3d-76     dynamic

arp displays the MAC table with the IP-to-physical (MAC) address correspondence. So most Pi’s will have a MAC address whose beginning is similar to b8-27-eb. A quick aside. Does the MAC address follow the board (SOC) or the SD Card? The board – I tested this with a friend’s SD Card.

You login with the pi/raspberry.

Then set your DISPLAY environment variable:

> export DISPLAY=192.168.5.12:0

Most of your X applications begin with the letter “x,” so enter

> x<tab><tab>

to see a display of available programs like this:

xapian-config        xdg-screensaver      xkbevd               xpdf.real            xxd
xarchiver            xdg-settings         xkbprint             xprop                xz
xargs                xdpyinfo             xkbvleds             x-session-manager    xzcat
xauth                xdriinfo             xkbwatch             xsubpp               xzcmp
xdg-desktop-icon     xev                  xkill                xtables-multi        xzdiff
xdg-desktop-menu     xfd                  xlsatoms             x-terminal-emulator  xzegrep
xdg-email            xfontsel             xlsclients           xvinfo               xzfgrep
xdg-icon-resource    xinit                xlsfonts             x-window-manager     xzgrep
xdg-mime             xkbbell              xmessage             xwininfo             xzless
xdg-open             xkbcomp              xpdf                 x-www-browser        xzmore

Actually I don’t know how many of these are X. But at least a few are.

Start an xterm in Cygwin. In the xterm window, give permission to the Pi to use it as its Xserver:

> xhost +

Now in the Pi shell (ha, ha), type:

> x-terminal-emulator

and you should see the colorful terminal emulator on your PC in a few seconds. this is a true GUI application. You similarly launch the x-www-browser. Don’t forget to background your X-windows in the Pi shell:

<Ctrl-Z>
> bg

so you can use the one window to launch multiple X windows.

Another example the book Programming the Raspberry Pi has is the Python interactive development environment. I reasoned from the screen shots that idles3 would also be an X application – hey, they don’t have to start with the letter x – and indeed it is!

Want the whole ball of wax, a complete console? I just figured this one out by taking an educated guess:

> x-session-manager

and you will see the complete GUI on your PC! Cool, huh?

Want to get rid of the last thing you backgrounded, like, say, that x-session-mnager which has taken over your PC?! Type

> fg
<Ctrl-C>

and it will be killed.

How to get the GUI working without a console, Method 2
The above steps look a little daunting? Even I don’t want to install cygwin on my new PC. There is an alternative which can suffice for light usage.

On the Pi install a vnc server:

$ sudo apt-get install tightvncserver

Launch it:

$ vncserver

The first time only it will ask you to set up a password. Might as well make it raspberry like everything else we do on the Pi.

Then install a VNC client on your PC (Or Macbook). I use RealVNC.

Launch your VNC client and connect to your Pi’s IP address (which you need to know) + the display number, like this:

192.168.0.100:1

For a Pi at IP 192.168.0.100 in which the vncserver started display 1. Normally it will be display 1, but I guess it might be display 0.

Don’t launch vncserver more than once! You don’t want a bunch of those running and dragging on performance.

Anyways, that’s it! You should see the Pi’s GUI on your PC, but it might seem a wee bit small.

Setting a static IP
If you’re going to use the Pi more as a server like I am, I think it’s a good idea to give it a static IP. What I did is to edit /etc/network/interfaces. Mine now looks like this:

Nagios can be installed! That's pretty cool - it's a sophisticated network monitoring utility.

Get a decent browser
The web browsers that come with the Pi are horrible. Midori? Seriously? I found you can get Firefox, but the downside is that it’s sloooww. But at least it works. The secret is that it’s not called Firefox. Instead:

$ sudo apt-get install iceweasel

Yes, it’s iceweasel, not Firefox, in Debian Linux. Go figure.

My cool transparent case
I recommend to get a case. I got the one with the best reviews. It’s kind of expensive, about $20, but worth it. It’s practically a work-of-art. Clear, the PC board fits snugly. I put it in my pocket and showed it around to my friends, feeling it was well protected, and yet also a sight to behold the first time. I even has a thoughtful light guide so the LEDs look beautiful as their light follows the rectangular opening to open air. I never had this much fun in show-and-tell! I just pulled the Pi wrapped in its case from my shirt pocket and amazed those around me. So go ahead and splurge. Anyways some of the cheaper cases look just that. Here is what I bought:

Helping a friend out with his Pi
So I dutifully take my friend’s Pi home and offer to install a web server. What did I do wrong? Well, duh, I could have just taken his SD Card home and plugged it into my Pi case! That concept takes some getting used to! We all have the same hardware. Our SD cards – our disk – are what make one Pi different from another.

So I followed my own blog post to recall some things. This Pi also had a MAC address beginning with the same six characters.

The apache2 installation did not work out, however. What to do? Well, I eventually read the darn output from running it. It suggests to try this:

> sudo apt-get update

So I ran that, figuring it could do no harm. Then I re-ran

> sudo apt-get install apache2

and this time the install actually worked!

Reading a flash drive
I was curious to see if you could stick a flash drive in the thing and just read it. I didn’t think so since I thought it would be formatted for NTFS. But if you have the GUI running and bring up a file manager, I’ll be darned if it doesn’t just work. I noticed the drive is mounted as /media/Cruzer (my flash drive has the brand name Cruzer).

If you don’t launch the file manager, I think you can still work with it as follows:

$ sudo mkdir /media/Cruzer; sudo mount /dev/sda1 /media/Cruzer

Then when you’re all done and before you remove it:

$ sudo umount /media/Cruzer

So that’s pretty cool. You can create tar archives on the flash drive, plug it into someone else’s Pi and untar it, etc, just like on Windows.

Conclusion
Raspberry Pi is respectable as a computer. It will be a lot of fun to explore for the hobbyist.

References

Raspberry Pi SD Card Speed Test – Raspberry Pi – a great discussion of the various speeds of Micro SD cards and how to measure yours
Go here for my next project – using your Raspberry Pi to monitor your home’s power or Internet connection.
Interested in networking? A lot of useful tips can be found in this posting describing how to turn your Pi into a router.
Realvnc.com distributes realVNC viewers for various platforms.
How about a Raspberry Pi-driven digital photo frame? I describe an approach in this article.
Brief Nagios for Raspberry Pi writeup.

Categories
Admin Network Technologies

The IT Detective Agency: Internet Explorer cannot display https web page, part II

Intro
They say when it rains it pours. As a harassed It support specialist its tempting to lump all similar-looking problems that come across your desk at the same time in the same bucket. This case shows the shallowness of that way of thinking, for that’s exactly what happened in this case. It occurred exactly when this case was occurring in which a user had an issue displaying a secure web site. That other case was described here.

The details
I was doing some work for a company when they came to me with a problem one user at HQ was having accessing an https web site. Everything else worked fine. The same web page worked fine from other sites.

Since I had helped them set up their proxy services I was keen to prove that the proxy wasn’t at fault. The user removed the proxy settings – still the error occurred. The desktop support got involved. I suggested a whole battery of tests because this was a weird one.
– what if you access via proxy?
– what if you take that laptop and access it from a VPN connection?
– what if you access another site which uses the same certificate-issuer?
– what if you access the host, but using http?
– can you PING the server?
– can you telnet to that server on port 443?
– what if you access the webpage by IP address?
– what if accessed via Firefox?
– etc.
The error, by the way, was

Internet Explorer cannot display the webpage

What you can try:
Diagnose Connection Problems

The answers came back like this:
– what if you access via proxy? It works!
– what if you take that laptop and access it from a VPN connection? It works.
– what if you access another site which uses the same certificate-issuer? It works.
– what if you access the host, but using http? It works.
– can you PING the server? Yes.
– can you telnet to that server on port 443? Yes
– what if you access the webpage by IP address? It does not work.
– Firefox? I don’t remember the answer to this.

Most of that thinking behind those tests is pretty obvious. Why so many tests? The networking group is kind of crotchety and understaffed, so they really wanted to eliminate all other possibilities first. And at one point I thought it could have been a desktop issue. Why would the network allow some of these packets through but not others when it doesn’t run a firewall? The desktop did have A-V and local firewall, after all.

So I didn’t have proof or even a good idea. Time to take out the big guns. We ran a trace on the server with tcpdump while the error occurred. The server was busy so we had to be pretty specific:

> tcpdump -s 1580 -w /tmp/drjcap.cap -i any host 10.19.79.216 and port 443

What do we see? We see the initial handshake go through just fine. Then a client Hello, then a server Hello, followed by a Certificate sent from the server to the client. The Certificate packet kept getting re-sent because there was no ACK to it from the client. So it’s beginning to smell like a dropped packet somewhere. I was hot on the trail but I decided I needed even stronger proof. We arranged to do simultaneous traces on both PC and server to compare the two results.

On the PC we had to install Wireshark. Actually I think Microsoft also has a utility to do traces but I’ve never used it. What did we find? Proof that there was indeed a dropped packet. That server certificate? Never received by the PC (which is acting as the SSL client). But why?

I had noticed one other funny thing about the trace comparisons. The server hello packet left as a packet of length 1518 bytes, but arrived, apparently, as two packets, one of length 778 bytes, the other of 770 bytes, i.e., it was fragmented. That should be OK. I don’t think the don’t fragment flag was set (have to check this). But it got me to wondering. Because the server certificate packet was also on the large side – 1460 bytes.

Regardless, I had enough ammo to go to the networking group with, which I did. It was like, “Oh yeah, our Telecom provider (let’s call it “CU”) implemented GETVPN around that time.” And further discussion revealed suspected other problems related to this change with MTU, etc. In fact they dredged up this nice description of the pitfalls that await GET VPN implementations from the Cisco deployment guide:

3.8 Designing Around MTU Issues Because of additional IPsec overhead added to each packet, MTU related issues are very common in IPsec deployments, and MTU size becomes a very important design consideration. If MTU value is not carefully selected by either predefining the MTU value on the end hosts or by dynamically setting it using PMTU discovery, the network performance will be impacted because of fragmentation and reassembly. In the worst case, the user applications will not work because network devices might not be able to handle the large packets and are unable to fragment them because of the df-bit setting. Some of the scenarios which can adversely affect traffic in a GET VPN environment and applicable mitigation techniques are discussed below. LAN MTU of 1500 – WAN MTU 44xx (MPLS) In this scenario, even after adding the 50-60 byte overhead, MTU size is much less than the MTU of the WAN. The MTU does not affect GET VPN traffic in any shape or form. LAN MTU of 1500 – WAN MTU 1500 In this scenario, when IPsec overhead is added to the maximum packet size the LAN can handle (i.e. 1500 bytes) the resulting packet size becomes greater than the MTU of the WAN. The following techniques could help reduce the MTU size to a value that the WAN infrastructure can actually handle. Manually setting a lower MTU on the hosts By manually setting the host MTU to 1400 bytes, IP packets coming in on the LAN segment will always have 100 extra bytes for encryption overhead. This is the easiest solution to the MTU issues but is harder to deploy because the MTU needs to be tweaked on all the hosts. TCP Traffic Configure ip tcp adjst-mss 1360 on GM LAN interface. This command will ensure that resulting IP packet on the LAN segment is less than 1400 bytes thereby providing 100 bytes for any overhead. If the maximum MTU is lowered by other links in the core (e.g. some other type of tunneling such as GRE is used in the core), the adjust-mss value can be lowered further. This value only affects TCP traffic and has no bearing on the UDP traffic. 3.8.1.1 Host compliant with PMTU discovery For non-TCP traffic, for a 1500 packet with DF bit set, the GM drops the packet and send ICMP message back to sender notifying it to adjust the MTU. If sender and the application is PMTU compliant, this will result in a packet size which can successfully be handled by WAN. For example, if a GM receives a 1500 byte IP packet with the df-bit set and encryption overhead is 60 bytes, GM will notify the sender to reduce the MTU size to 1440 bytes. Sender will comply with the request and the resulting WAN packets will be exactly 1500 bytes

I don’t claim to understand all those scenarios, but it shouts pretty loudly. Watch out for problems with packets of size 1400 bytes or larger!

Why would they want GET VPN? To encrypt the HQ communication over the WAN. So the idea is laudable, but the execution lacking.

Case: understood but not yet closed! The fix is not yet in…

To be continued…

Categories
Admin Internet Mail Linux Network Technologies

The IT Detective Agency: mail server went down with an old-school problem

Intro
I got a TXT from my monitoring system last night. I ignored it because I knew that someone was working on the firewall at that time. I’ve learned enough about human nature to know that it is easy to ignore the first alert. So I’ve actually programmed HP SiteScope alerts to send additional ones out after four hours of continuous errors. When I got the second one at 9 PM, I sprang into action knowing this was no false alarm!

The details
Thanks to a bank of still-green monitors I could pretty quickly rule out what wasn’t the matter. Other equipment on that subnet was fine, so the firewall/switch/router was not the issue. Then what the heck was it? And how badly was it impacting mail delivery?

This particular server has two network interfaces. Though the one interface was clearly unresponsive to SMTP, PING or any other protocol, I hadn’t yet investigated the other interface, which was more Internet-facing. I managed to find another Linux server on the outer network and tried to ping the outer interface. Yup. That worked. I tried a login. It took a whole long time to get through the ssh login, but then I got on and the server looked quite normal. I did a quick ifconfig – the inner interface listed up, had the right IP, looked completely normal. I tried some PINGs from it to its gateway and other devices on the inner network. Nothing doing. No PINGs were returned.

I happened to have access to the switch. I thought maybe someone had pulled out its cable. So I even checked the switch port. It showed connected and 1000 mbits, exactly like the other interface. So it was just too improbable that someone pulled out the cable and happened to plug another cable from another server into that same switch port. Not impossible, just highly improbable.

Then I did what all sysadmins do when encountering a funny error – I checked the messages file in /var/log/messages. At first I didn’t notice anything amiss, but upon closer inspection there was one line that was out-of-place from the usual:

Nov  8 16:49:42 drjmailgw kernel: [3018172.820223] do_IRQ: 1.221 No irq handler for vector (irq -1)

Buried amidst the usual biddings of cron was a kernel message with an IRQ complaint. What the? I haven’t worried about IRQ since loading Slackware from diskettes onto my PC in 1994! Could it be? I have multiple ways to test when the interface died – SiteScope monitoring, even the mail log itself (surely its log would look very different pre- and post-problem.) Yup.

That mysterious irq error coincides with when communication through that interface stopped working. Oh, for the record it’s SLES 11 SP1 running on HP server-class hardware.

What about my mail delivery? In a panic, realizing that sendmail would be happy as a clam through such an error, I shut down its service. I was afraid email could be piling up on this server, for hours, and I pride myself in delivering a faultless mail service that delivers in seconds, so that would be a big blow. With sendmail shut down I knew the backup server would handle all the mail seamlessly.

This morning, in the comfort of my office I pursued the answer to that question What was happening to my mail stream during this time? I knew outbound was not an issue (actually the act of writing this down makes me want to confirm that! I don’t like to have falsehoods in writing. Correct, I’ve now checked it and outbound was working.) But it was inbound that really worried me. Sendmail was listening on that interface after all, so I didn’t think of anything obvious that would have stopped inbound from being readily accepted then subsequently sat on.

But such was not the case! True, the sendmail listener was available and listening on that external interface, but, I dropped a hint above. Remember that my ssh login took a long while? That is classic behaviour when a server can’t communicate with its nameservers. It tries to do a reverse lookup on the ssh client’s source IP address. It tried the first nameserver, but it couldn’t communicate with it because it was on the internal network! Then it tried its next nameserver – also a no go for the same reason. I’ve seen the problem so often I wasn’t even worried when the login took a long time – a minute or so. I knew to wait it out and that I was getting in.

But in sendmail I had figured that certain communications should never take a long time. So a long time ago I had lowered some of the default timeouts. My mc file in the upstream server contains these lines:

...
dnl Do not use RFC1413 identd. p 762  It requires another whole in the F/W
define(`confTO_IDENT',`0s')dnl
dnl Set more reasonable timeouts for SMTP commands'
define(`confTO_INITIAL',`1m')dnl
define(`confTO_COMMAND',`5m')dnl
define(`confTO_HELO',`1m')dnl
define(`confTO_MAIL',`1m')dnl
define(`confTO_RCPT',`1m')dnl
define(`confTO_RSET',`1m')dnl
define(`confTO_MISC',`1m')dnl
define(`confTO_QUIT',`1m')dnl
...

I now think that in particular the HELO timeout (TO_HELO) of 60 seconds saved me! The upstream server reported in its mail log:

Timeout waiting for input from drjmailgw during client greeting

So it waited a minute, as drjmailgw tried to do a reverse lookup on its IP, unsuccessfully, before proceeding with the response to HELO, then went on to the secondary server as per the MX record in the mailertable. Whew!

More on that IRQ error
Let’s go back to that IRQ error. I got schooled by someone who knows these things better than I. He says the Intel chipset was limited insofar as there weren’t enough IRQs for all the devices people wanted to use. So engineers devised a way to share IRQs amongst multiple devices. Sort of like virtual IPs on one physical network interface. Anyways, on this server he suspects that something is wrong with the multipath driver which is loaded for the fiber channel host adapter card. In fact he noticed that the network interface flaked out several times previous to this error. Only it came back after some seconds. This is the server where we had a very high CPU when the SAN was being heavily used. The SAN vendor checked things out on their end and, of course, found nothing wrong with the SAN equipment. We actually switched from SAN to tmpfs after that. But we didn’t unload the multipath driver. Perhaps now we will.

Feb 22 Update
We haven’t seen the problem in over three weeks now. See my comments on what actions we took.

Conclusion
Persistence, patience and praeternatural practicality paid off in this perplexing puzzle!

Categories
Admin Network Technologies

Firewall is a significant drag on download speeds

Intro
This post might be a restatement of the obvious to some, but I thought it was noteworthy enough to measure and mention this affect. I was twiddling my thumbs during a long sftp upload when I began to notice these transfers I was doing went really quickly between some servers, and not so much between others. How to control for all variables except the ones I wanted to vary? How to measure things in such a way that an overworked network technician with vested interests in saying the status quo is “good enough” will listen to you? These are things I wrestled with.

The details

To be continued…

Categories
Admin Network Technologies

DIY Home Power Monitoring Solution – Perfect for Sandy

Intro
My recent experience losing power thanks to Sandy has gotten me thinking. How can I know when my power’s on? Or what if it gets shut off again, which by the way actually happened to me? I realized that I have all the pieces in place already and merely need to take advantage of the infrastructure already out there.

The details
I started with:

  • a working smartphone
  • an enterprise monitoring system
  • a SoHo router in my home

And that’s pretty much it!

The smartphone doesn’t really matter, as long as it receives emails. I guess a plain old cellphone would work just as well – the messages can be sent as text messages.

For enterprise monitoring I like HP SiteScope because it’s more economical than hard-core systems. I wrote a little about it in a previous post. Nagios is also commonly used and it’s open source, meaning free. Avoid Zabbix at all costs. Editor’s note: OK. I’ve changed my mind about Zabbix some seven years later. It’s still confusing as heck, but now I’m using it and I have to admit it is powerful. See this write-up.

A good SoHo router is Juniper SSG5. It extends the enterprise LAN into the home. You can carve up a Juniper router like that and provide a Home network, Work network, Home wireless and Work wireless. It’s great!

The last requirement is the key. The SoHo router at my house is always on, and so the enterprise LAN is always available, as long as I have power. Get it? I defined a simple PING monitor in SiteScope to ping my SoHo router’s WAN interface, which has a static private IP address on the enterprise LAN. If I can’t ping it, I’ve lost power, and use my monitoring system to send an email alert to my smartphone. When, or in the case of Sandy’s interminable outage, if, power ever comes back on, I send another alert letting me know that as well. If you’re not using SiteScope make sure you send several PINGs. A PING can be lost here or there for various reasons. siteScope sends out five PINGs at a regular interval, as a guideline.

Alternative
Of course if I had a business-class DSL or cable modem service with a dedicated IP I could have just PINGed that, but I don’t. With regaulr consumer grade service your IP can and will change from time to time, and using a dynamic DNS protocol (like dyndns) to mask that problem is a bit tricky.

Yes, if you call the power company they offer to call you back when your power is restored, but I like my monitor better. It tells me when things go off as well as on.

Conclusion
This is my necessity-is-the-mother-of-invention moment. Thank you, Sandy!

Categories
Linux Network Technologies SLES TCP/IP

Ethernet Bridging on the cheap. Fail. Then Success with OLTV

Intro
Some experiments just don’t work out. I became curious about a technology that has various names: ethernet bridging, wide-area VLANs, OTV, L2TP, etc. It looked like it could be done on the cheap, but that didn’t pan out for me. But later on we got hold of high-end gear that implements OTV and began to get it to work.

The details
What this is is the ability to extend a subnet to a remote location. How cool is that? This can be very useful for various reasons. A disaster recovery center, for instance, which uses the same IP addressing. A strategic decision to move some, but not all equipment on a particular LAN to another location, or just for the fun of it.

As with anything truly useful there is an open source implementation(s). I found openvpn, but decided against it because it had an overall client/server description and so didn’t seem quite what I had in mind. Openvpn does have a page about creating an ethernet bridging setup which is quite helpful, but when you install the product it is all about the client/server paradigm, which is really not what I had in mind for my application.

Then I learned about Astaro RED at the Amazon Cloud conference I attended. That’s RED as in Remote Ethernet Device. That sounded pretty good, but it didn’t seem quite what we were after. It must have looked good to Sophos as well because as I was studying it, Sophos bought them! Asataro RED is more for extending an ethernet to remote branch offices.

More promising for cheapo experimentation, or so I thought at the time, is etherip.

Very long story short, I never got that to work out in my environment, which was SLES VM servers.

What seems to be the most promising solution, and the most expensive, is overlay transport virtualization (OLTV or simply OTV), offered by Cisco in their Nexus switches. I’ll amend this post when I get a chance to see if it worked or not!

December Update
OTV is beginning to work. It’s really cool seeing it for the first time. For instance, I have a server in South Carolina on an OTV subnet, IP 10.94.45.2. Its default gateway is in New Jersey! Its gateway is in the ARP table, as it has to be, but merely to PING the gateway produces this unusual time lag:

> ping 10.94.45.1

PING 10.94.45.1 (10.194.54.33) 56(84) bytes of data.
64 bytes from 10.94.45.1: icmp_seq=1 ttl=255 time=29.0 ms
64 bytes from 10.94.45.1: icmp_seq=2 ttl=255 time=29.1 ms
64 bytes from 10.94.45.1: icmp_seq=3 ttl=255 time=29.6 ms
64 bytes from 10.94.45.1: icmp_seq=4 ttl=255 time=29.1 ms
64 bytes from 10.94.45.1: icmp_seq=5 ttl=255 time=29.4 ms

See those response times? Huge. I ping the same gateway from a different LAN but same server room in New Jersey and get this more typical result:

# ping 10.94.45.1

Type escape sequence to abort.
Sending 5, 64-byte ICMP Echos to 10.94.45.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/0/1 ms
Number of duplicate packets received = 0

But we quickly stumbled upon a gotcha. Large packets were killing us. The thing is that it’s one thing to run OTV over dark fiber, which we know another customer is doing without issues; but to run it in an MPLS network is something else.

Before making any adjustment on our servers we found behaviour like the following:
– initial ssh to linux server works OK; but session soon freezes after a directory listing or executing other commands
– pings with the -s parameter set to anything greater than 1430 bytes failed – they didn’t get returned

So this issue is very closely related to a problem we observed on a regular segment where getvpn had just been implemented. That problem, which manifested itself as occasional IE errors, is described in some detail here.

Currently we don’t see our carrier being able to accommodate larger packets so we began to see what we could alter on our servers. On Checkpoint IPSO you can lower the MTU as follows:

> dbset interface:eth1c0:ipmtu 1430

The change happens immediately. But that’s not a good idea and we eventually abandoned that approach.

On SLES Linux I did it like this:

> ifconfig eth1 mtu 1430

In this platform, too, the change takes place right away.

By the we experimented and found that the largest MTU value we could use was 1430. At this point I’m not sure how to make this change permanent, but a little research should show how to do it.

After changing this setting, our ssh sessions worked great, though now we can’t send pings larger than 1402 bytes.

The latest problem is that on our OTV segment we can ping only one device but not the other.

August 2013 update
Well, we are resourceful people so yes we got it running. Once the dust settled OTV worked pretty well, with certain concessions. We had to be able to control the MTU on at least one side of the connection, which, fortunately we always could. Load balancers, proxy servers, Linux servers, we ended up jiggering all of them to lower their MTU to 1420. For firewall management we ended up lowering the MTU on the centralized management station.

Firewalls needed further voodoo. After pushing policy clamping needs to be turned back on and acceleration off like this (for Checkpoint firewalls):

$ fw ctl set int fw_clamp_tcp_mss 1
$ fwaccel off

Conclusion
Having preserved IPs during a server move can be a great benefit and OTV permits it. But you’d better have a talented staff to overcome the hurdles that will accompany this advanced technology.

Categories
Admin Network Technologies

The IT Detecive Agency: FTPs and other things going slow to one part of one location

Intro
What happens when you have a slowly festering problem – slow but tolerable performance? Is it a problem when no one complains but you know the numbers don’t feel right? Read on to see what happens in this exciting adventure.

The details
I always felt that one of my locations, let’s call it Scattering, was just hard-to-explain slow. Web site accesses as measured in HP SiteScope was slower than at another site, lets call it Cooper, and the variation seemed larger. You could always chalk it up to the fact that the SiteScope server was in the data center with the good performance, but there seemed more to it than that.

The situation was, apparently, tolerable and people grew accustomed to it. Until one day someone said that it just wasn’t right. Ping times from his site, let’s call it Vanderbilt, looked like this from his desktop:

C:\Users>ping dmz-host
 
Pinging dmz-host [10.93.23.12] with 32 bytes of data:
Reply from 10.93.23.12: bytes=32 time=80ms TTL=248
Reply from 10.93.23.12: bytes=32 time=107ms TTL=248
Reply from 10.93.23.12: bytes=32 time=74ms TTL=248
Reply from 10.93.23.12: bytes=32 time=78ms TTL=248

And yet, ping times to another piece of equipment in that same physical server room looked much healthier – on order of 22 msec with very little jitter. What the…?

So I started looking around at my equipment. My equipment generally had good response times amongst itself, but when it crossed over to another group’s equipment it started to go up. Finally I saw it – a common denominator. My equipment had its gateway supplied by a different group. PINGing the local gateway gave 15 msec response times, with high jitter! I’ve never seen that before. You normally expect to get response times , 1 msec. Time to turn the problem over to them.

But as a former scientist, I like to have a hypothesis about what might be wrong. I thought a bad cable, because I’ve heard of that. Duplex mis-matches on the port are always a strong possibility. Guess what? It wasn’t any of those things. But it was something about the port on their equipment. What do you suppose it was? Hint: the equipment they were using was leftover stuff that needed to be replaced but hadn’t because it just kept working.

Mystery Resolved
The port was saturated! It was older equipment with 100 mbit ports, the amount of traffic slowly built up on it over time and sort of crept up on us all. I guess finally the demand on it was causing noticeable problems in response times that someone finally complained!

He shifted the gateway to a 1 gbit port and things got much better. Here are those PINGs now:

C:\Users>ping 10.93.23.12
 
Pinging 10.93.23.12 with 32 bytes of data:
Reply from 10.93.23.12: bytes=32 time=24ms TTL=248
Reply from 10.93.23.12: bytes=32 time=23ms TTL=248
Reply from 10.93.23.12: bytes=32 time=23ms TTL=248
Reply from 10.93.23.12: bytes=32 time=23ms TTL=248
 
Ping statistics for 10.93.23.12:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 23ms, Maximum = 24ms, Average = 23ms

That’s much better!

The HP SiteScope timing are more subtle, but no less important. Users don’t tolerate slow-loading web pages, you know!

I need to make an aside here. Is there no free and widely distributed Linux package to calculate the standard deviation of a set of numbers? I can’t believe it. I had to borrow this guy’s shell script, and it’s pretty lousy because it’s really slow: http://panoskrt.wordpress.com/2009/03/10/shell-script-for-standard-deviation-arithmetic-mean-and-median/. Anyways, doing a before and after analysis on my URL timing, where I fetched a Google page plus images during the daytime from 10 AM to 4 PM, I find this before and after result:

Before

Number of data points in “sca” = 144
Arithmetic mean (average) = .630548611
Standard Deviation = .225365467
Median number (middle) = .664500000

After

Number of data points in “sca2” = 144
Arithmetic mean (average) = .349270833
Standard Deviation = .162399076
Median number (middle) = .263000000

Users were very happy after the upgrade.

Conclusion
An old gateway port that maxed out its traffic capacity was to blame for performance problems in one part of a data center. This doesn’t happen too often, but it can happen, obviously.

Linux needs a simple math package to allow to calculate the standard deviation of a set of numbers.