Categories
Admin Network Technologies Raspberry Pi

Basic networking: creating a virtual IP in Debian Linux

Intro
A quick Internet search showed a couple top-level matches that didn’t quite work for me, so I’m documenting how I got my multiple IP assignments on one interface to stick. This was work done for my Raspberry Pi, but it should apply to any Debian Linux system.

The details
This was work done for my Raspberry Pi, but it should apply to any Debian Linux host. I made my file /etc/network/interfaces look as follows:

auto lo
auto eth0
auto eth0:0
 
iface lo inet loopback
# DrJ change: make IP static
# somewhat inspired by http://www.techiecorner.com/486/how-to-setup-static-ip-in-debian/ - DrJ 1/8/13
#iface eth0 inet dhcp
iface eth0 inet static
address 192.168.2.100
gateway  192.168.2.1
netmask 255.255.255.0
network 192.168.2.0
broadcast 192.168.2.255
 
# virtual IP on eth0
iface eth0:0 inet static
address 10.31.42.11
netmask 255.255.255.0
network 10.31.42.0
broadcast 10.31.42.255

I think the key statement which is missing on some people’s examples are the lines at the top of the file:

auto lo
auto eth0
auto eth0:0

When I didn’t have those I was finding that my primary IP was defined upon reboot, but not my virtual IP, although the virtual IP could be dynamically created with a simple

sudo ifup eth0:0

Still, I wanted it to survive a reboot and adding the auto lines did the trick.

Conclusion
A few of the pages you will find on the Internet may give incomplete information on how to configure virtual IPs in Debian Linux. The approach outlined above should work. Additional virtual IPs would just require sections like eth0:1, eth0:2, etc modelled after what was done for eth0:0

References
I present some basic information on one way to get started on the Pi without an external monitor (yes, it can be done) here!
If you think you like networking, you will learn a lot of useful tips in this posting which describes how to turn your Raspberry Pi into a full-blown router.

Categories
Network Technologies

How we got a little extra oomph from our firewall cluster, and why this trick no longer works for us

Intro
I was a running some Checkpoint firewalls in a cluster. In fact it’s been that way for years and years. At some point you get comfortable and forget to challenge and understand how it was set up. In this case re-examining the setup rewarded us with temporary survival as we were able to offload the primary member. Read on for the details…

The details
This firewall cluster included an active/standby pairing – a Nokia cluster with no state sync. The active firewall, an older model, was often hitting 99% or even 100% cpu usage on a daily basis. Dropped packets were correlated with these cpu spikes, and time-sensitive protocols, especially SIP used by IP phones, suffered mightily. Call quality often degraded, or the call was altogether dropped.

Some other relevant facts in this case: these firewalls were not doing NAT, they were acting more like routers with a firewall function. There are a handful of key servers behind them, like a VPN concentrator, a proxy, a Juniper ISG VPN concentrator, etc. On the external side was an Internet router, also under our control.

So the breakthrough was in revisiting what makes them active/passive in the first place. We weren’t relying on Checkpoint clustering. We used VRRP, defined through a Voyager setup. Then we set up our routing on all protected devices to use these VRRP IPs for their default routes. It all worked great until more and more usage crept in and then complaints started rolling in.

Upgrading costs $$ and the procurement cycle takes some time. What to do immediately, if anything?

The loudest complaints were from users of the Juniper ISG SSL VPN concentrators, who ran VOIP over those connections. What I realized (which of course is obvious in hindsight), is that this device could have all its traffic routed to the standby firewall where there was no cpu load whatsoever, and leave everything else on the active firewall.

How we did it
This was accomplished by adjusting the default route of the ISG to use the physical IP of the standby firewall, as opposed to the VRRP IP. Then, to avoid asymmetric routing, a host route was defined on the Internet router for this ISG, using as gateway the physical external IP of the standby firewall (again as opposed to the external VRRP IP.)

How it worked
It worked like a charm. We were well below our Internet link capacity, after all. So the master firewall was really the chokepoint for this voice traffic. Once we got it onto this unused firewall all the complaints stopped.

This is of course just a stop-gap measure because of course now we have no redundancy if we lose one of the firewalls. But meanwhile we’ve bought some time and kept the work-from-home users running smoothly. The master firewall still hits 99% cpu, but not quite as frequently. It’s difficult to find a true root cause, but an upgrade is definitely in order. Acceleration is already in place.

Why it won’t work for you – Checkpoint Cluster
Fast-forward five years and I tried this same trick which has served me well over the years. No worky. Why? Well these days we’ve switched to use of a Checkpoint Cluster with SYNC. In a Checkpoint cluster the secondary firewall will not forward traffic. In fact a firewall guy was the first to inform me of that. I didn’t believe him so I went ahead and configured it anyway. Sure enough, it simply didn’t work.

So for us, this trick has played itself out. But we used it multiple times during the five years it was available to us.

Conclusion
By re-visiting some old design principles were we able to get a little more mileage out of our firewalls and buy ourselves some time until we can do a planned upgrade.

Categories
Network Technologies

The IT Detective Agency: Why our forwarding vserver doesn’t route

Intro
F5 BigiP appliances are very versatile networking appliances. But sometimes you gotta know what you are doing!

The details
We set up a load-balanced Radius service using the same subnet for the radius servers as the load balancer itself. Setting up this service is moderately tricky. You have to set the default route of the Radius servers to be the load balancer, and on the load balancer SNAT (which I prefer to translate as “source NAT,” though technically it is “secure NAT”) and NAT should be disabled. And there are two services, AAA and audit (UDP ports 1812 and 1813).

So everything’s a bit different when all you'[re used to is creating load-balanced pools for web servers.

So with this setup an incoming packet comes in, its source is preserved, but its destination is NAT’d to the radius server by the load balancer. In the response, the source is the radius server. That gets NAT’d to the IP of the load-balanced service. So there are two stages for incoming request packets (pre- and -post NAT) and two for the responses. Here’s a trace which shows all this:

12:10:41.259073 IP drj-wlc-nausresea055-01.drj.com.filenet-rpc > radius.drj.com.radius: RADIUS, Access Request (1), id: 0x30 length: 260
12:10:41.259086 IP drj-wlc-nausresea055-01.drj.com.filenet-rpc > wusandradaa01.drjad.drj.net.radius: RADIUS, Access Request (1), id: 0x30 length: 260
12:10:41.259735 IP wusandradaa01.drjad.drj.net.radius > drj-wlc-nausresea055-01.drj.com.filenet-rpc: RADIUS, Access Reject (3), id: 0x30 length: 44
12:10:41.259745 IP radius.drj.com.radius > drj-wlc-nausresea055-01.drj.com.filenet-rpc: RADIUS, Access Reject (3), id: 0x30 length: 44

So all is good, right? Except that now we have a default route from the Radius server to the load balancer and so all response traffic is going through the load balancer, even things not related to Radius, such as a packets from an RDP session.

So we defined a forwarding_vserver to make the BigIP act as a router:

A forwarding vserver is a virtual server of type Forwarding (IP). In the bigip.conf file it looks like this:

virtual forwarding_vserver {
   ip forward
   destination any:any
   mask 0.0.0.0
   profiles route_friendly_fastL4 {}
}

But it doesn’t work! Packets from the Radius server come to the load balancer, and then they get source NAT’d to the floating self-IP of the load balancer. That’s no good. In TCP your response packets have to come from the IP you connected to! For simple PINGs you kind of get away with it, but with a warning. In TCP your PC will send a RST (reset) packet every time it gets a response packet with the wrong source IP, even if the other information is correct.

The solution
With the help of someone who understands snat auto-maps better than I do (evidently), I got the tip that I have a global snat-automap enabled, which is doing this. That’s how I’ve always run these LTMs (Local Traffic Managers). I had forgotten why or how I did it. Well the snat-automap pretty mcuh applies to all my other load-balanced services so I can’t simply chuck it. And I don’t have another subnet handy for use so I can’t simply exclude one of my vlans. They suggested that it could be turned off on my forwarding_vserver with an irule! Who would have figured? So I created a very simple irule:

# Turn off snat, i.e., for us in our forwarding_vserver
# inspired by https://devcentral.f5.com/wiki/iRules.snat.ashx
# DrJ, 11/2013
when CLIENT_ACCEPTED {
         snat none
}

and applied it to my forwarding_vserver, which now looks like this:

virtual forwarding_vserver {
   ip forward
   destination any:any
   mask 0.0.0.0
   rules snat-none
   profiles route_friendly_fastL4 {}
}

And voila, the LTM now routes those packets correctly without any address translation! And the Radius service still does its translations as desired.

Case closed!

Conclusion
We learned a little about F5 BigIPs today. The frustrating thing about the documentation is that they don’t really cover actual use cases so they introduce configuration settings without fully explaining why you would want to use them.

For the curious, the forwarding_vserver is accomodating an asymmetric routing situation. Incoming RDP (remote desktop protocol) packets get sent directly from a Cisco router to the Radius server. It’s just the response packets that flow from the Radius server, to the LTM, to the Cisco router.

References and related
In this post I show why a basic virtual server might not be working – a kind of rookie mistake we’ve all probably made at some point!
This post shows some non-trivial iRule examples.

Categories
Admin Network Technologies

Are we in Brazil? We are NOT in Brazil.

But Google thinks we are:

$ curl -i http://www.google.com/

returns:

HTTP/1.1 302 Found
Location: http://www.google.com.br/?gws_rd=cr&ei=nrA4UoPZB8fb4AP3loGYAg
Cache-Control: private
Content-Type: text/html; charset=UTF-8
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Date: Tue, 17 Sep 2013 19:42:22 GMT
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic
Content-Length: 262
Proxy-Connection: Keep-Alive
Connection: Keep-Alive
Set-Cookie: PREF=ID=9ed4d9e91767f4ce:FF=0:TM=1379446942:LM=1379446942:S=O9jxL9sXRCc0kF-E; expires=Thu, 17-Sep-2015 19:42:22 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=BJA1pj-o2tu-IDI9KS5MjQvoKPJoY8x3uREAmeGCItGKxxfovFqdjPhvBaHUHcISFLVcTWSmCXwiTAuVhF4DIYCbPcuubfBBEGYaNy2wgveeyvGj35xTzM4Oo-yCLaDe; expires=Wed, 19-Mar-2014 19:42:22 GMT; path=/; domain=.google.com; HttpOnly
 
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.com.br/?gws_rd=cr&amp;ei=nrA4UoPZB8fb4AP3loGYAg">here</A>.
</BODY></HTML>

and MSN is similar:

$ curl -i http://www.msn.com/

HTTP/1.1 302 Found
Cache-Control: no-cache, no-store
Pragma: no-cache
Content-Type: application/xhtml+xml; charset=utf-8
Location: http://br.msn.com/?rd=1&ucc=BR&dcc=BR&opt=0
P3P: CP="NON UNI COM NAV STA LOC CURa DEVa PSAa PSDa OUR IND"
X-AspNet-Version: 4.0.30319
S: BN2SCH030301048
Edge-control: no-store
Date: Tue, 17 Sep 2013 20:02:42 GMT
Content-Length: 172
Proxy-Connection: Keep-Alive
Connection: Keep-Alive
Set-Cookie: MC1=V=3&GUID=7173a7e9bb5444fc88d52c3d302f0cdf; domain=.msn.com; expires=Thu, 17-Sep-2015 20:02:42 GMT; path=/
Set-Cookie: MC1=V=3&GUID=7173a7e9bb5444fc88d52c3d302f0cdf; domain=.msn.com; expires=Thu, 17-Sep-2015 20:02:42 GMT; path=/
Set-Cookie: brdSample=89; domain=.msn.com; expires=Thu, 17-Sep-2015 20:02:42 GMT; path=/
blah, blah
 
 
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="http://br.msn.com/?rd=1&amp;ucc=BR&amp;dcc=BR&amp;opt=0">here</a>.</h2>
</body></html>

Now a request from the IP Next door to it, on the same subnet, produces an entirely different result:

$ curl -i http://www.google.com/

Date: Tue, 17 Sep 2013 19:48:11 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more in
fo."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic
Transfer-Encoding: chunked
Proxy-Connection: Keep-Alive
Connection: Keep-Alive
Set-Cookie: PREF=ID=0e280a6247dbe89f:FF=0:TM=1379447291:LM=1379447291:S=Ws_EoUItgwgQMFLL; expires=Thu, 17-Sep-2015 19:48:11
 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=MEOnbrr2pgKz8MVKEzIQ2v9f4nkR0o5FXJxbeBRk3GZg6GsfKNRZm9UEHTzcuRF5A6D2kXKa4N6FQnP88fkVLrgDokdOlXvX1Oba2JzC
koZ0K0PiACYSiTPCru5eH9C3; expires=Wed, 19-Mar-2014 19:48:11 GMT; path=/; domain=.google.com; HttpOnly
 
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage"><head><meta content="Search the world's information,
 including webpages, images, videos and more. Google has many special features
...

This is a problem when you do searches in which you expect to get localized results, like a search for dinner. Believe me, you see completely different things if Google thinks your IP is in Brazil.

And some services simply shut themselves off. Pandora simply does not work, for instance.

What is going on?

It seems some Geo-location service has incorrectly tagged our IP as Brazil. Not sure how to fix that…

Google does have an IP location change form, but they say it will take a month for them to make the change. That hardly seems like Internet time! https://support.google.com/websearch/contact/ip And what about the others? I don’t see any simple fix for Pandora for instance.

Possible solution?
Someone suggested that the proxy has cached Google’s home page. This is one thing I didn’t test for. However, these proxies are very sophisticated and know to refresh popular pages frequently so I very much doubt that was the cause of the problem.

Workarounds
Google present a link Google.com in the lower right corner of the page which you can click on to pop you to the English-language version of the site, but it’s still not location-aware and treats you more as an English traveler in Brazil when you search for a generic term like restaurant.

You can also go to http://www.google.com/ncr. I personally haven’t tried it, but that’s a tip I just received.

Oct 30 update
Well, finally, after filling out the IP change form twice, and of course keeping all Brazilian users off that proxy, Google has finally deemed fit to declare our proxy really is in the US after all.

It happens again…
Some light usage by a fraction of the user base and I find all proxies are in Brazil yet again. I fill out the form March 19th 2015 and a couple times after that. Now it is May 11th and Google still has not acted. This time I have not kept Brazil users off the proxies, because really this is ridiculous. When users complain I urge them to use duckduckgo.com for its better privacy protections. June 24th update. I’ve been checking every week just about. I’ve filled out Google’s own form about four times in total. So finally after three months, Google set us back to US Enlgish. Without a word to us, of course. Incredible.

Google has gotten big and unresponsive.

Conclusion
Use duckduckgo.com.

References and related
A good web site to learn where your IP is without too much malvertising that you often get with these kinds of things is: https://ipinfo.io/

Categories
Admin Network Technologies

Extended Passive Mode FTP through Checkpoint Firewall

Intro
The vast majority of time there is no problem doing an FTP to a server behind a firewall protected by Checkpoint’s Firewall-1. But occasionally there is.

The details

The problem I am about to document I think will only occur on a server that has multiple interfaces. I have seen it occur on multiple operating systems, so that doesn’t seem to matter. On the other hand, I have also not seen it on other similar systems, a point which I don’t fully yet understand.

Nevertheless, a work-around is always appreciated, so I provide what I found here, to complete my extensive documentation of problems I’ve encountered and resolved.

Here is a snippet from the FTP session showing the problem:

ftp> cd uploadDirectory
250 CWD command successful.
ftp> put smallfile.txt
local: smallfile.txt remote: smallfile.txt
229 Entering Extended Passive Mode (|||36945|)
200 EPRT command successful.
421 Service not available, remote server timed out. Connection closed

And here is the solution:

Enter epsv4 after logon and before any other commands are issued. Problem fixed!

Conclusion
We have shown a way to fix a firewall-related problem that manifests itself during extended passive mode FTPs. Some more research should be done to understand under what circumstances this problem should be expected, but it seems to occur with a Checkpoint Firewall-1 firewall and an FTP server with multiple interfaces.

Categories
Admin Network Technologies

The IT Detective Agency: trouble with wireless at home

Intro
I don’t usually have the luxury of writing about a mystery I’ve solved right out of my own home, but there finally is one that I got got to the bottom of recently – poor WiFi performance.

The details
Considering that I deal with this stuff for a living, I have a thread-bare setup at home. After my company-issued router’s WiFi began to work unreliably, I resuscitated an old Linksys wireless router, WRK54G V2. Superficially it seemed to work. But we weren’t very demanding of it.

It eventually seemed to be the case, as visitors mentioned, that streaming videos does not work through wireless. This was hard for me to check, with my broken-down, aging equipment. I have a desktop which always freezes and crashes is you play any Youtube video. And a Netbook which kind of worked better, but its peculiarity is that its ethernet interface doesn’t work. Wirelessly, its version of Flash was too old and insecure for Firefox, and attempts to update Flash using WiFi in turn were unsuccessful.

In general the Linksys router, as I eventually realized, seemed to initially serve up large downloads ok, but then at some point during the download, things begin to crawl and you are left with a download that proceeds at 10 kbit/s or something ridiculously slow like that.

Providing mixed evidence is a Sony BlueRay player. using WiFi it could sort of manage to show a HuluPlus TV episode. You might have to be patient at times while it’s loading, but we did get through a full episode of Grey’s Anatomy recently.

After more complaints I decided enough is enough. It seemed as though my WiFi was the most likely suspect, sifting through the mixed evidence. I perhaps waited so long because who’d think they’d be dealing with two bad WiFi routers from two totally different vendors?

So hedging my bets, I didn’t go all out with a new Gbit router. I reached back in time a little and got a refurbished Cisco 1200E wireless-N router. It was only $28 from Amazon. But before buying it, I read the comments and got one idea about routers: sometimes they need to be rebooted!

This is pretty funny, really, because it is probably apparent to any homeowner, and here I am, a specialist, missing this point. You see with Cisco enterprise-class gear you almost never have to reboot to fix a problem. These things run uninterrupted for not only weeks and months at a time, years at a time is also not at all uncommon. Same for some Unix servers. So from my perspective rebooting is something for consumer devices running Microsoft OSes!

So, before rebooting the Linksys to see if that would cure it, I ran a Ping to Google’s DNS server (very easy to remember its IP) from a CDM window:

> ping -t 8.8.8.8

I didn’t preserve the output, but it wasn’t pretty. It was something like this:

Pinging 8.8.8.8 with 32 bytes of data:
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
Reply from 8.8.8.8: bytes=32 time=369ms TTL=56
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
Reply from 8.8.8.8: bytes=32 time=1204ms TTL=56
Reply from 8.8.8.8: bytes=32 time=284ms TTL=56
...

51 msec – fine. But round-trip times much greater than that? That’s not right.

So I hopefully reboot the Linksys router and re-run the test on the Netbook:

Pinging 8.8.8.8 with 32 bytes of data:
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
Reply from 8.8.8.8: bytes=32 time=50ms TTL=56
Reply from 8.8.8.8: bytes=32 time=51ms TTL=56
...

Much more consistent.

Try a Youtube video from Firefox. Nope, need to update Flash. Update Flash. Nope – download times out and kicks me out.

So I’ve accomplished nothing in rebooting in terms of results that matter.

That’s when I decided to check out of Amazon with that refurbished router.

Aside about Wireless-N
Given my ancient equipment, I was concerned that Wireles-N routers might not be compatible with my wireless radios, which would only support G. Is it backwards compatible? Yes. Some quick research showed that and my own experience confirmed it.

Conclusion
The setup of the router was pretty straightforward although it froze at some point just after I set the wireless password. It helps to have done this a zillion times before. At that point I observed what my default gateway was and hit it as a web site URL. Guessed the admin password incorrectly a zillion times, until I tried the wireless password as the admin password, and, wham – I was in and happily configuring away…

More importantly, I went to that Netbook, updated Flash. No problems. Ran a Youtube video. No problems. Ran a speedtest.net test (which wouldn’t even run before this). Numbers look as good as my wired connection: 6 mbit download, 0.6 mbit upload.

Last test is to see where the speed maxes out within my home network. I plan to hit my Raspberry Pi web server to test this and will provide results as soon as they are available.

Conclusion to the conclusion
So I really was cursed with two bad wireless routers. Sometimes using 10-year-old equipment is really not worth the $30 saved in deferred spending. Read product reviews on Amazon to get hints about real issues others have faced.
To be continued…

Categories
Apache Linux Network Technologies Perl Raspberry Pi

Getting started on my Raspberry Pi

Intro
The Raspberry Pi computer is an awesome idea. Its performance is surprisingly good as well, as I will show below. Available packages? Not so impressive. I share some old X-windows tricks which will allow you to bring up the GUI without ever using the HDMI port.

The details
My Methodology
I was too lazy to set up an HDMI console plus keyboard and mouse. I’m more a server guy anyways so I’m more interested in what I can accomplish from a command prompt. And this also makes getting started that much easier. I had burned the Raspbian Wheezy image to a super-fast SD card (more on that below) the day that my Pi came in the mail. I attached power and ethernet, booted it up, guessed the IP it acquired by running some PINGs, did an ssh using the pi/raspberry user and Bingo! I was in. It couldn’t be easier. How I tested GUI applications without a console is explained further down below.

First Impressions
It feels fast.

Packages
Not much seems to be there by default – no apache, not many X utilities. There is a lame X browser called x-www-browser. I thought this is Debian, right? So we can just start downloading Debian packages, like Firefox. Wrong! It doesn’t work that way. There’s no Firefox, Safari, Chrome or Opera! It does come pre-loaded with curl, however, ha, ha.

No, the Raspbian FAQ explains why this is. It’s rather complicated. I guess the compiler works though I haven’t tested it yet. So I suppose you could compile packages from their source code.

The x-terminal-emulator is pretty decent, however.

If it comes with a web server, I didn’t notice. So I quickly checked for the availability of apache. It’s available. Then installed it:

> sudo apt-get install apache2

That worked out well. It installed it and the packages it depended on and even launched it, and it all felt fairly peppy. See the suggested fix further down if this gives you errors. The default HTML DOCROOT is /var/www. I accessed it locally:

> curl localhost

And a welcome message displayed. A good start.

Where’s the rest of my 16 GB SD card gone to?

Original disk layout:

pi@raspberrypi:~$ df -k
Filesystem     1K-blocks    Used Available Use% Mounted on
rootfs           1804128 1492908    219572  88% /
/dev/root        1804128 1492908    219572  88% /
devtmpfs          224436       0    224436   0% /dev
tmpfs              44900     204     44696   1% /run
tmpfs               5120       0      5120   0% /run/lock
tmpfs              89780       0     89780   0% /run/shm
/dev/mmcblk0p1     57288   16872     40416  30% /boot

Layout after raspi-config:

pi@raspberrypi:~$ df -k
Filesystem     1K-blocks    Used Available Use% Mounted on
rootfs          15251960 1494852  12982544  11% /
/dev/root       15251960 1494852  12982544  11% /
devtmpfs          224436       0    224436   0% /dev
tmpfs              44900     196     44704   1% /run
tmpfs               5120       0      5120   0% /run/lock
tmpfs              89780       0     89780   0% /run/shm
/dev/mmcblk0p1     57288   16872     40416  30% /boot

Whew! That was easy. All 16 GB accounted for and actively used.

Was it worth it to buy that UHS SD card?
I didn’t want a sluggish server, so I paid a couple bucks more and bought a 16 GB SD UHS (ultra high speed) card for my “disk,” not knowing whether or not the Pi had the muscle to put it to work.

A quick aside about SD cards
I did a quickie self-education on this topic. Most SD cards are rated by class, so a class 4 SD card can do 4 MB/sec I/O, and a class 10 card can do at least 10 MB/sec. Faster still are the UHS SD cards. My Sandisk, which only cost about $19, is rated for 45 MB/sec I/O. A great write-up on this topic specifically for Raspberry Pi is: Raspberry Pi SD Card Speed Test – Raspberry Pi

diskSpec.pl benchmark (higher numbers are better)
1333 file creation/destruction operations per second – Raspberry Pi with UHS SD card
6666 file creation/destruction operations per second – EBS volume on small image running CentOS in Amazon cloud
26000 file creation/destruction operations per second – high-end HP server (G7 DL380) running SLES 11

I think I provided the source for this simple Perl program I wrote, diskSpec.pl. It creates a file, writes a random number into it, then deletes it – that all counts as one operation. Here it is:

#!/usr/bin/perl
# DrJ, 1/2000
# Test disk I/O
$DIR = $ARGV[0];
chdir($DIR);
$t0 = time();
while(1) {
  $ran = rand();
  open(FILE,"&gt;$ran") || die "Cannot open file $ran in directory $DIR!!\n";
  print FILE $ran;
  close(FILE);
  unlink($ran);
  $cnt++;
  if ($cnt % 20000 == 0) {
    $rate = $cnt / (time() - $t0) ;
    print "File creation/desctruction rate: $rate\n";
  }
}

DrJ 2017 Note: The notes below are historical and does not seem to work at all for the Raspberry Pi 3 loaded with NOOBS. In NOOBS you select your OS to install. You can’t ssh to it. I know. I just tried! Even after you install Raspbian Wheezy, you still can’t access it via ssh until you enable the ssh daemon with raspi-condfig.

How to get the GUI working without a console
I have this feeling that many people trying out the Pi won’t have the faintest idea how X windows works, unlike us Unix old-timers. It’s fun to put 20-year-old lessons to work on something new. Like I said I’m lazy and didn’t feel the need to set up an actual console to the thing. I used some old X features to allow me to launch specific X-windows applications that are pre-loaded on the device, and display them on my PC. How?

On a Windows PC you install Cygwin. Then launch the XWin Server. You ssh to your pi. How do you know its IP the first time? Guess! It picks it up via DHCP, so start PINGing around the range where your other devices are numbered. My PC is 192.168.5.12/24, my pi was 192.168.5.16. Maybe you have a bunch of devices responding to PING and are unsure which is which? Your MAC table is your friend. Here’s mine:

C:\Documents and Settings&gt;arp -a
 
Interface: 192.168.5.12 --- 0x2
  Internet Address      Physical Address      Type
  192.168.5.1           00-14-f6-e0-c0-4c     dynamic
  192.168.5.16          b8-27-eb-dd-21-02     dynamic
  192.168.5.99          00-90-a9-bb-3d-76     dynamic

arp displays the MAC table with the IP-to-physical (MAC) address correspondence. So most Pi’s will have a MAC address whose beginning is similar to b8-27-eb. A quick aside. Does the MAC address follow the board (SOC) or the SD Card? The board – I tested this with a friend’s SD Card.

You login with the pi/raspberry.

Then set your DISPLAY environment variable:

> export DISPLAY=192.168.5.12:0

Most of your X applications begin with the letter “x,” so enter

> x<tab><tab>

to see a display of available programs like this:

xapian-config        xdg-screensaver      xkbevd               xpdf.real            xxd
xarchiver            xdg-settings         xkbprint             xprop                xz
xargs                xdpyinfo             xkbvleds             x-session-manager    xzcat
xauth                xdriinfo             xkbwatch             xsubpp               xzcmp
xdg-desktop-icon     xev                  xkill                xtables-multi        xzdiff
xdg-desktop-menu     xfd                  xlsatoms             x-terminal-emulator  xzegrep
xdg-email            xfontsel             xlsclients           xvinfo               xzfgrep
xdg-icon-resource    xinit                xlsfonts             x-window-manager     xzgrep
xdg-mime             xkbbell              xmessage             xwininfo             xzless
xdg-open             xkbcomp              xpdf                 x-www-browser        xzmore

Actually I don’t know how many of these are X. But at least a few are.

Start an xterm in Cygwin. In the xterm window, give permission to the Pi to use it as its Xserver:

> xhost +

Now in the Pi shell (ha, ha), type:

> x-terminal-emulator

and you should see the colorful terminal emulator on your PC in a few seconds. this is a true GUI application. You similarly launch the x-www-browser. Don’t forget to background your X-windows in the Pi shell:

<Ctrl-Z>
> bg

so you can use the one window to launch multiple X windows.

Another example the book Programming the Raspberry Pi has is the Python interactive development environment. I reasoned from the screen shots that idles3 would also be an X application – hey, they don’t have to start with the letter x – and indeed it is!

Want the whole ball of wax, a complete console? I just figured this one out by taking an educated guess:

> x-session-manager

and you will see the complete GUI on your PC! Cool, huh?

Want to get rid of the last thing you backgrounded, like, say, that x-session-mnager which has taken over your PC?! Type

> fg
<Ctrl-C>

and it will be killed.

How to get the GUI working without a console, Method 2
The above steps look a little daunting? Even I don’t want to install cygwin on my new PC. There is an alternative which can suffice for light usage.

On the Pi install a vnc server:

$ sudo apt-get install tightvncserver

Launch it:

$ vncserver

The first time only it will ask you to set up a password. Might as well make it raspberry like everything else we do on the Pi.

Then install a VNC client on your PC (Or Macbook). I use RealVNC.

Launch your VNC client and connect to your Pi’s IP address (which you need to know) + the display number, like this:

192.168.0.100:1

For a Pi at IP 192.168.0.100 in which the vncserver started display 1. Normally it will be display 1, but I guess it might be display 0.

Don’t launch vncserver more than once! You don’t want a bunch of those running and dragging on performance.

Anyways, that’s it! You should see the Pi’s GUI on your PC, but it might seem a wee bit small.

Setting a static IP
If you’re going to use the Pi more as a server like I am, I think it’s a good idea to give it a static IP. What I did is to edit /etc/network/interfaces. Mine now looks like this:

Nagios can be installed! That's pretty cool - it's a sophisticated network monitoring utility.

Get a decent browser
The web browsers that come with the Pi are horrible. Midori? Seriously? I found you can get Firefox, but the downside is that it’s sloooww. But at least it works. The secret is that it’s not called Firefox. Instead:

$ sudo apt-get install iceweasel

Yes, it’s iceweasel, not Firefox, in Debian Linux. Go figure.

My cool transparent case
I recommend to get a case. I got the one with the best reviews. It’s kind of expensive, about $20, but worth it. It’s practically a work-of-art. Clear, the PC board fits snugly. I put it in my pocket and showed it around to my friends, feeling it was well protected, and yet also a sight to behold the first time. I even has a thoughtful light guide so the LEDs look beautiful as their light follows the rectangular opening to open air. I never had this much fun in show-and-tell! I just pulled the Pi wrapped in its case from my shirt pocket and amazed those around me. So go ahead and splurge. Anyways some of the cheaper cases look just that. Here is what I bought:

Helping a friend out with his Pi
So I dutifully take my friend’s Pi home and offer to install a web server. What did I do wrong? Well, duh, I could have just taken his SD Card home and plugged it into my Pi case! That concept takes some getting used to! We all have the same hardware. Our SD cards – our disk – are what make one Pi different from another.

So I followed my own blog post to recall some things. This Pi also had a MAC address beginning with the same six characters.

The apache2 installation did not work out, however. What to do? Well, I eventually read the darn output from running it. It suggests to try this:

> sudo apt-get update

So I ran that, figuring it could do no harm. Then I re-ran

> sudo apt-get install apache2

and this time the install actually worked!

Reading a flash drive
I was curious to see if you could stick a flash drive in the thing and just read it. I didn’t think so since I thought it would be formatted for NTFS. But if you have the GUI running and bring up a file manager, I’ll be darned if it doesn’t just work. I noticed the drive is mounted as /media/Cruzer (my flash drive has the brand name Cruzer).

If you don’t launch the file manager, I think you can still work with it as follows:

$ sudo mkdir /media/Cruzer; sudo mount /dev/sda1 /media/Cruzer

Then when you’re all done and before you remove it:

$ sudo umount /media/Cruzer

So that’s pretty cool. You can create tar archives on the flash drive, plug it into someone else’s Pi and untar it, etc, just like on Windows.

Conclusion
Raspberry Pi is respectable as a computer. It will be a lot of fun to explore for the hobbyist.

References

Raspberry Pi SD Card Speed Test – Raspberry Pi – a great discussion of the various speeds of Micro SD cards and how to measure yours
Go here for my next project – using your Raspberry Pi to monitor your home’s power or Internet connection.
Interested in networking? A lot of useful tips can be found in this posting describing how to turn your Pi into a router.
Realvnc.com distributes realVNC viewers for various platforms.
How about a Raspberry Pi-driven digital photo frame? I describe an approach in this article.
Brief Nagios for Raspberry Pi writeup.

Categories
Admin Network Technologies

The IT Detective Agency: Internet Explorer cannot display https web page, part II

Intro
They say when it rains it pours. As a harassed It support specialist its tempting to lump all similar-looking problems that come across your desk at the same time in the same bucket. This case shows the shallowness of that way of thinking, for that’s exactly what happened in this case. It occurred exactly when this case was occurring in which a user had an issue displaying a secure web site. That other case was described here.

The details
I was doing some work for a company when they came to me with a problem one user at HQ was having accessing an https web site. Everything else worked fine. The same web page worked fine from other sites.

Since I had helped them set up their proxy services I was keen to prove that the proxy wasn’t at fault. The user removed the proxy settings – still the error occurred. The desktop support got involved. I suggested a whole battery of tests because this was a weird one.
– what if you access via proxy?
– what if you take that laptop and access it from a VPN connection?
– what if you access another site which uses the same certificate-issuer?
– what if you access the host, but using http?
– can you PING the server?
– can you telnet to that server on port 443?
– what if you access the webpage by IP address?
– what if accessed via Firefox?
– etc.
The error, by the way, was

Internet Explorer cannot display the webpage

What you can try:
Diagnose Connection Problems

The answers came back like this:
– what if you access via proxy? It works!
– what if you take that laptop and access it from a VPN connection? It works.
– what if you access another site which uses the same certificate-issuer? It works.
– what if you access the host, but using http? It works.
– can you PING the server? Yes.
– can you telnet to that server on port 443? Yes
– what if you access the webpage by IP address? It does not work.
– Firefox? I don’t remember the answer to this.

Most of that thinking behind those tests is pretty obvious. Why so many tests? The networking group is kind of crotchety and understaffed, so they really wanted to eliminate all other possibilities first. And at one point I thought it could have been a desktop issue. Why would the network allow some of these packets through but not others when it doesn’t run a firewall? The desktop did have A-V and local firewall, after all.

So I didn’t have proof or even a good idea. Time to take out the big guns. We ran a trace on the server with tcpdump while the error occurred. The server was busy so we had to be pretty specific:

> tcpdump -s 1580 -w /tmp/drjcap.cap -i any host 10.19.79.216 and port 443

What do we see? We see the initial handshake go through just fine. Then a client Hello, then a server Hello, followed by a Certificate sent from the server to the client. The Certificate packet kept getting re-sent because there was no ACK to it from the client. So it’s beginning to smell like a dropped packet somewhere. I was hot on the trail but I decided I needed even stronger proof. We arranged to do simultaneous traces on both PC and server to compare the two results.

On the PC we had to install Wireshark. Actually I think Microsoft also has a utility to do traces but I’ve never used it. What did we find? Proof that there was indeed a dropped packet. That server certificate? Never received by the PC (which is acting as the SSL client). But why?

I had noticed one other funny thing about the trace comparisons. The server hello packet left as a packet of length 1518 bytes, but arrived, apparently, as two packets, one of length 778 bytes, the other of 770 bytes, i.e., it was fragmented. That should be OK. I don’t think the don’t fragment flag was set (have to check this). But it got me to wondering. Because the server certificate packet was also on the large side – 1460 bytes.

Regardless, I had enough ammo to go to the networking group with, which I did. It was like, “Oh yeah, our Telecom provider (let’s call it “CU”) implemented GETVPN around that time.” And further discussion revealed suspected other problems related to this change with MTU, etc. In fact they dredged up this nice description of the pitfalls that await GET VPN implementations from the Cisco deployment guide:

3.8 Designing Around MTU Issues Because of additional IPsec overhead added to each packet, MTU related issues are very common in IPsec deployments, and MTU size becomes a very important design consideration. If MTU value is not carefully selected by either predefining the MTU value on the end hosts or by dynamically setting it using PMTU discovery, the network performance will be impacted because of fragmentation and reassembly. In the worst case, the user applications will not work because network devices might not be able to handle the large packets and are unable to fragment them because of the df-bit setting. Some of the scenarios which can adversely affect traffic in a GET VPN environment and applicable mitigation techniques are discussed below. LAN MTU of 1500 – WAN MTU 44xx (MPLS) In this scenario, even after adding the 50-60 byte overhead, MTU size is much less than the MTU of the WAN. The MTU does not affect GET VPN traffic in any shape or form. LAN MTU of 1500 – WAN MTU 1500 In this scenario, when IPsec overhead is added to the maximum packet size the LAN can handle (i.e. 1500 bytes) the resulting packet size becomes greater than the MTU of the WAN. The following techniques could help reduce the MTU size to a value that the WAN infrastructure can actually handle. Manually setting a lower MTU on the hosts By manually setting the host MTU to 1400 bytes, IP packets coming in on the LAN segment will always have 100 extra bytes for encryption overhead. This is the easiest solution to the MTU issues but is harder to deploy because the MTU needs to be tweaked on all the hosts. TCP Traffic Configure ip tcp adjst-mss 1360 on GM LAN interface. This command will ensure that resulting IP packet on the LAN segment is less than 1400 bytes thereby providing 100 bytes for any overhead. If the maximum MTU is lowered by other links in the core (e.g. some other type of tunneling such as GRE is used in the core), the adjust-mss value can be lowered further. This value only affects TCP traffic and has no bearing on the UDP traffic. 3.8.1.1 Host compliant with PMTU discovery For non-TCP traffic, for a 1500 packet with DF bit set, the GM drops the packet and send ICMP message back to sender notifying it to adjust the MTU. If sender and the application is PMTU compliant, this will result in a packet size which can successfully be handled by WAN. For example, if a GM receives a 1500 byte IP packet with the df-bit set and encryption overhead is 60 bytes, GM will notify the sender to reduce the MTU size to 1440 bytes. Sender will comply with the request and the resulting WAN packets will be exactly 1500 bytes

I don’t claim to understand all those scenarios, but it shouts pretty loudly. Watch out for problems with packets of size 1400 bytes or larger!

Why would they want GET VPN? To encrypt the HQ communication over the WAN. So the idea is laudable, but the execution lacking.

Case: understood but not yet closed! The fix is not yet in…

To be continued…

Categories
Admin Internet Mail Linux Network Technologies

The IT Detective Agency: mail server went down with an old-school problem

Intro
I got a TXT from my monitoring system last night. I ignored it because I knew that someone was working on the firewall at that time. I’ve learned enough about human nature to know that it is easy to ignore the first alert. So I’ve actually programmed HP SiteScope alerts to send additional ones out after four hours of continuous errors. When I got the second one at 9 PM, I sprang into action knowing this was no false alarm!

The details
Thanks to a bank of still-green monitors I could pretty quickly rule out what wasn’t the matter. Other equipment on that subnet was fine, so the firewall/switch/router was not the issue. Then what the heck was it? And how badly was it impacting mail delivery?

This particular server has two network interfaces. Though the one interface was clearly unresponsive to SMTP, PING or any other protocol, I hadn’t yet investigated the other interface, which was more Internet-facing. I managed to find another Linux server on the outer network and tried to ping the outer interface. Yup. That worked. I tried a login. It took a whole long time to get through the ssh login, but then I got on and the server looked quite normal. I did a quick ifconfig – the inner interface listed up, had the right IP, looked completely normal. I tried some PINGs from it to its gateway and other devices on the inner network. Nothing doing. No PINGs were returned.

I happened to have access to the switch. I thought maybe someone had pulled out its cable. So I even checked the switch port. It showed connected and 1000 mbits, exactly like the other interface. So it was just too improbable that someone pulled out the cable and happened to plug another cable from another server into that same switch port. Not impossible, just highly improbable.

Then I did what all sysadmins do when encountering a funny error – I checked the messages file in /var/log/messages. At first I didn’t notice anything amiss, but upon closer inspection there was one line that was out-of-place from the usual:

Nov  8 16:49:42 drjmailgw kernel: [3018172.820223] do_IRQ: 1.221 No irq handler for vector (irq -1)

Buried amidst the usual biddings of cron was a kernel message with an IRQ complaint. What the? I haven’t worried about IRQ since loading Slackware from diskettes onto my PC in 1994! Could it be? I have multiple ways to test when the interface died – SiteScope monitoring, even the mail log itself (surely its log would look very different pre- and post-problem.) Yup.

That mysterious irq error coincides with when communication through that interface stopped working. Oh, for the record it’s SLES 11 SP1 running on HP server-class hardware.

What about my mail delivery? In a panic, realizing that sendmail would be happy as a clam through such an error, I shut down its service. I was afraid email could be piling up on this server, for hours, and I pride myself in delivering a faultless mail service that delivers in seconds, so that would be a big blow. With sendmail shut down I knew the backup server would handle all the mail seamlessly.

This morning, in the comfort of my office I pursued the answer to that question What was happening to my mail stream during this time? I knew outbound was not an issue (actually the act of writing this down makes me want to confirm that! I don’t like to have falsehoods in writing. Correct, I’ve now checked it and outbound was working.) But it was inbound that really worried me. Sendmail was listening on that interface after all, so I didn’t think of anything obvious that would have stopped inbound from being readily accepted then subsequently sat on.

But such was not the case! True, the sendmail listener was available and listening on that external interface, but, I dropped a hint above. Remember that my ssh login took a long while? That is classic behaviour when a server can’t communicate with its nameservers. It tries to do a reverse lookup on the ssh client’s source IP address. It tried the first nameserver, but it couldn’t communicate with it because it was on the internal network! Then it tried its next nameserver – also a no go for the same reason. I’ve seen the problem so often I wasn’t even worried when the login took a long time – a minute or so. I knew to wait it out and that I was getting in.

But in sendmail I had figured that certain communications should never take a long time. So a long time ago I had lowered some of the default timeouts. My mc file in the upstream server contains these lines:

...
dnl Do not use RFC1413 identd. p 762  It requires another whole in the F/W
define(`confTO_IDENT',`0s')dnl
dnl Set more reasonable timeouts for SMTP commands'
define(`confTO_INITIAL',`1m')dnl
define(`confTO_COMMAND',`5m')dnl
define(`confTO_HELO',`1m')dnl
define(`confTO_MAIL',`1m')dnl
define(`confTO_RCPT',`1m')dnl
define(`confTO_RSET',`1m')dnl
define(`confTO_MISC',`1m')dnl
define(`confTO_QUIT',`1m')dnl
...

I now think that in particular the HELO timeout (TO_HELO) of 60 seconds saved me! The upstream server reported in its mail log:

Timeout waiting for input from drjmailgw during client greeting

So it waited a minute, as drjmailgw tried to do a reverse lookup on its IP, unsuccessfully, before proceeding with the response to HELO, then went on to the secondary server as per the MX record in the mailertable. Whew!

More on that IRQ error
Let’s go back to that IRQ error. I got schooled by someone who knows these things better than I. He says the Intel chipset was limited insofar as there weren’t enough IRQs for all the devices people wanted to use. So engineers devised a way to share IRQs amongst multiple devices. Sort of like virtual IPs on one physical network interface. Anyways, on this server he suspects that something is wrong with the multipath driver which is loaded for the fiber channel host adapter card. In fact he noticed that the network interface flaked out several times previous to this error. Only it came back after some seconds. This is the server where we had a very high CPU when the SAN was being heavily used. The SAN vendor checked things out on their end and, of course, found nothing wrong with the SAN equipment. We actually switched from SAN to tmpfs after that. But we didn’t unload the multipath driver. Perhaps now we will.

Feb 22 Update
We haven’t seen the problem in over three weeks now. See my comments on what actions we took.

Conclusion
Persistence, patience and praeternatural practicality paid off in this perplexing puzzle!

Categories
Admin Network Technologies

Firewall is a significant drag on download speeds

Intro
This post might be a restatement of the obvious to some, but I thought it was noteworthy enough to measure and mention this affect. I was twiddling my thumbs during a long sftp upload when I began to notice these transfers I was doing went really quickly between some servers, and not so much between others. How to control for all variables except the ones I wanted to vary? How to measure things in such a way that an overworked network technician with vested interests in saying the status quo is “good enough” will listen to you? These are things I wrestled with.

The details

To be continued…