Categories
Linux

Gnu Parallel Really Helps With Zcat

I happened across what I think will turn out to be a very useful open source application: Gnu parallel.  Let me explain how I got to that point.  I was uncompressing gzip’d files, of which I have lots.  So many, in fact, it can take the better part of a day to go through them all.  The server I was using is four years old but I don’t have access to anything better.  In fact it seems processor speed has sort of plateaud.  Moore’s Law only applies if you count the number of processors, which are multiplying like rabbits.  So I start to think that maybe my single-threaded approach is wrong, see? I need to scale horizontally, not vetically.

It just happens that in my former life I spent quite some effort into parallelizing a compute-intensive program I wrote.  I can’t believe that it’s still hanging around on the Internet!  I haven’t looked for it for at least 17 years: my FERMISV  Monte Carlo.  So I thought, what if I could parallelize my zcat|grep?  Writing such a thing from scratch would be too time-consuming, but after searching for parallelize zcat I quickly found Gnu Parallel.  It’s one thing to identify such a program in a Google search, another to get it working.  Believe me, I’ve put in my time with some of these WordPress plugins.

The upshot is that, yes, Gnu Parallel actually works and it’s not hard to learn to use.  They have a nice Youtube video.  For me I employed syntax like:  

ls *gz|time parallel -k "zcat {}|grep 192.168.23.34"  > /tmp/matches

.  This, of course, is to be compared with the normal operation I have been doing for eons:

time zcat *gz|grep 192.168.23.34 > /tmp/matches

.  The “time” is just for benchmarking and normally wouldn’t be present. 

The results are that the matches file produced in the two different ways are identical.  That’s good!  The -k switch ensured the order is preserved in the output, and anyways, it didn’t cost me any time. Otherwise the order is not guaranteed. I tested on a server with four cores.  Gnu Parallel reported that it used 375% of my CPU!  It finished in less than half the time of my normal way of doing things.  I need to put it through more real-world exercises, but it really looks promising.  I never got onto the xargs bandwagon (perhaps I should be ashamed to admit it?), but Gnu Parallel can do a lot of that same type of thing, but in my opinion is more intuitive.  Take a look!

Categories
Linux

Performance Degradation With SLES 11 SP1 Under VMWare

Some friends and I found severe performance degradation using SLES 11 SP1 under VMWare.  That’s Suse Linux Enterprise Server Service Pack 1.  I’m still trying to get my head around it – there are lots of variables to control for. Later update For my analysis of the specific problem with grep on SLES 11 SP1, please go here: http://drjohnstechtalk.com/blog/2011/06/grep-is-slow-as-a-snail-in-sles-11/.

The Method

I start with a 20 Mb gzip’d file.  Let’s call it file.gz.  Uncompressed it is 107 MB.  I run

time zcat file.gz > /dev/null

I run it several times and report the lowest number which I am consistently able to reproduce. That feels the fairest way to me to benchmark in this case.

On a fast, 3 GHz physical server running SLES 10 SP3 it takes 0.63 s. Let’s call that server Physical.

Then I add in something to make it useful: grep’ing for an IP address:

time zcat file.gz|grep 192.168.23.34 > /dev/null

On Physical that takes 0.65 s.

My Amazon EC2 image – let’s call it Amazon-EC2 – runs ubuntu 10.10 on a 2.6 GHz VM. To learn CPU speed in ubuntu:

cat /proc/cpuinfo|grep MHz

My SLES 11 SP1 is a guest VM on VMWare ESX. It has a 2.4 GHz processor. Let’s call it SLES11SP1. For CPU speed in SLES*:

dmesg|grep -i cpu

and look for the line that says Intel … GHz.

* 7/1/2011 Correction The correct way to get the processor speed is

grep -i /proc/cpuinfo

The cpu Mhz line shows the running speed. this also seems to work in RHEL (Redhat) and Debian distributions such as Ubuntu. I’ve looked at several models. Usually the model name – which is what you get from the dmesg way – lists a speed that is the same as the cpu MHz given the 1000x difference between GHz and MHz, but not always! I have come across a recently purchased server with a surprisingly slow clock speed, and one that is quite different from the CPU name:

model name      : Intel(R) Xeon(R) CPU           E7520  @ 1.87GHz
cpu MHz         : 1064.000

Who knew you could even buy a server-class CPU that slow these days?

For comparison I have a SLES 10 SP3 VM under VMWare. It also has a 2.4 GHz CPU. SLES10SP3. All servers are X86_64.

The Results
The amazing results:

Server zcat time zcat|grep IP time
Physical 0.63 s 0.65 s
Amazon-EC2 0.73 s 1.06 s
SLES11SP1 0.99 s 5.8 s
SLES10SP3 0.78 s 0.93 s

Analysis
I’ve tried many more variants than displayed in that table, but you get the idea. All VMs are slower than all physical servers tested in all tests. Most discouragingly, the SLES11 SP3 is five or six times slower than a comparable physical server for the real-life test involving grep. I used the same file in all tests.

Conclusions
Virtualization is not a panacea. It has its role, but also its limitations. And either something is wrong with SLES 11 SP1, which I doubt, or something is wrong with the way I set it up, despite the fact that I’ve tried a few variants. We’re going to try it on a physical server next. I’ll update this post if anything noteworthy happens.

Update
I got it tested on a physical server now, too. A HP G6 w/ 4 Gb RAM. SLES 11 SP1. The CPU identified itself as Xeon E5504 @ 2.0 GHz. Here are the awful timings:

Server zcat time zcat|grep IP time
SLES 11 SP1 Physical 0.90 s 4.8 s

That shows that SLES 11 SP1 itself is causing my performance degradation. Not every instance of SLES 11 is faulty, however. It turns out that IBM’s Watson is running it! http://whatis.techtarget.com/definition/ibm-watson-supercomputer.html

For the record, the kernel in SLES 11 SP1 is 2.6.32.12-0.7, while in SLES 10 SP3 it is 2.6.16.60-0.68.1. On a RHEL 5.6 VM (I did not bother to list its results), where the kernel was 2.6.18-238.1.1, the degradation was not nearly so bad either. I wonder if the kernel version is having a major impact? Perhaps not. The kernel in ubuntu 10.10 looks pretty new – 2.6.35-24. I am running

uname -a

to see the kernel version.

Further Update
We also got to test with SLES 10 SP4 VM. You see from the table that it is well-behaved. It had 2 CPUs which identified themselves as X6550 @ 2.0 GHz. 4 GB RAM. Kernel 2.6.16.60-0.85.1

Server zcat time zcat|grep IP time
SLES 10 SP4 VM 0.96 s 1.0 s