Some friends and I found severe performance degradation using SLES 11 SP1 under VMWare. That’s SUSE Linux Enterprise Server 11 Service Pack 1. I’m still trying to get my head around it, as there are lots of variables to control for.

Later update: for my analysis of the specific problem with grep on SLES 11 SP1, please go here: http://drjohnstechtalk.com/blog/2011/06/grep-is-slow-as-a-snail-in-sles-11/.
The Method
I start with a 20 MB gzip’d file. Let’s call it file.gz. Uncompressed it is 107 MB. I run
time zcat file.gz > /dev/null
I run it several times and report the lowest number which I am consistently able to reproduce. That feels the fairest way to me to benchmark in this case.
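The repeat-and-take-the-lowest approach can be sketched as a small shell helper. This is only an illustration, not the exact procedure I used: min_time is a hypothetical name, it assumes GNU date (for the %N nanosecond format) and awk, and it reports the plain minimum rather than "the lowest number I can consistently reproduce".

```shell
# Sketch: run a command several times and print the lowest wall-clock time.
# min_time is a hypothetical helper; it assumes GNU date (for %N) and awk.
min_time() {
    local runs="$1" best="" start end elapsed i=0
    shift
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s.%N)
        "$@" > /dev/null
        end=$(date +%s.%N)
        elapsed=$(awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f", e - s }')
        # Keep this run if it is the first one or faster than the best so far.
        if [ -z "$best" ] || awk -v a="$elapsed" -v b="$best" 'BEGIN { exit !(a < b) }'; then
            best="$elapsed"
        fi
        i=$((i + 1))
    done
    echo "$best"
}

# Example (not run here): min_time 5 sh -c 'zcat file.gz | grep 192.168.23.34'
```

Wrapping a pipeline in sh -c times the whole pipeline, which is also what the shell's time keyword does when placed in front of one.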
On a fast, 3 GHz physical server running SLES 10 SP3 it takes 0.63 s. Let’s call that server Physical.
Then I add in something to make it useful: grep’ing for an IP address:
time zcat file.gz|grep 192.168.23.34 > /dev/null
On Physical that takes 0.65 s.
My Amazon EC2 image, which I’ll call Amazon-EC2, runs Ubuntu 10.10 on a 2.6 GHz VM. To learn the CPU speed in Ubuntu:
cat /proc/cpuinfo|grep MHz
My SLES 11 SP1 is a guest VM on VMWare ESX. It has a 2.4 GHz processor. Let’s call it SLES11SP1. For CPU speed in SLES*:
dmesg|grep -i cpu
and look for the line that says Intel … GHz.
* 7/1/2011 Correction: the correct way to get the processor speed is
grep -i mhz /proc/cpuinfo
The cpu MHz line shows the running speed. This also seems to work in RHEL (Red Hat) and in Debian-based distributions such as Ubuntu. I’ve looked at several models. Usually the model name, which is what you get from the dmesg way, lists a speed that matches the cpu MHz value (allowing for the 1000x difference between GHz and MHz), but not always! I have come across a recently purchased server with a surprisingly slow clock speed, and one that is quite different from the speed in the CPU name:
model name : Intel(R) Xeon(R) CPU E7520 @ 1.87GHz
cpu MHz : 1064.000
Who knew you could even buy a server-class CPU that slow these days?
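To spot this kind of mismatch quickly, the two values can be pulled out of /proc/cpuinfo side by side. The following is a rough sketch: cpu_speed_check is a hypothetical helper name, and it assumes the model name ends in a speed written like 1.87GHz, which is not true of every CPU.

```shell
# Sketch: print the nominal speed from "model name" next to the running "cpu MHz".
# cpu_speed_check is a hypothetical helper; it assumes the model name contains
# a speed written like "1.87GHz". If it does not, the nominal value prints as 0.
cpu_speed_check() {
    awk -F: '
        /^model name/ {
            nominal = 0
            # Pull the number in front of "GHz" and convert it to MHz.
            if (match($2, /[0-9.]+GHz/))
                nominal = substr($2, RSTART, RLENGTH - 3) * 1000
        }
        /^cpu MHz/ {
            printf "nominal %.0f MHz, running %.0f MHz\n", nominal, $2
        }
    ' "${1:-/proc/cpuinfo}"
}

# Example (not run here): cpu_speed_check | sort -u
```

With no argument it reads /proc/cpuinfo and prints one line per core, so piping through sort -u collapses identical cores. For the server above it would report a nominal 1870 MHz against a running 1064 MHz.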
For comparison I have a SLES 10 SP3 VM under VMWare. It also has a 2.4 GHz CPU. Let’s call it SLES10SP3. All servers are x86_64.
The Results
The amazing results:
Server | zcat time | zcat \| grep IP time |
---|---|---|
Physical | 0.63 s | 0.65 s |
Amazon-EC2 | 0.73 s | 1.06 s |
SLES11SP1 | 0.99 s | 5.8 s |
SLES10SP3 | 0.78 s | 0.93 s |
Analysis
I’ve tried many more variants than displayed in that table, but you get the idea. All VMs are slower than all physical servers tested in all tests. Most discouragingly, the SLES 11 SP1 VM is five or six times slower than a comparable physical server for the real-life test involving grep. I used the same file in all tests.
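One way to read the table is to divide each server's zcat-plus-grep time by its zcat-only time, which shows how much the grep stage adds on each system. The sketch below does that arithmetic with awk on the four rows from the table above. grep_overhead is a hypothetical helper name, and the ratio is my own rough yardstick here, not a standard metric.

```shell
# Sketch: ratio of the zcat|grep time to the zcat-only time for each server.
# grep_overhead is a hypothetical helper; input rows are "name|zcat s|zcat+grep s".
grep_overhead() {
    awk -F'|' '{ gsub(/ /, "", $1); printf "%s %.1f\n", $1, $3 / $2 }'
}

# The four rows from the results table above.
grep_overhead <<'EOF'
Physical|0.63|0.65
Amazon-EC2|0.73|1.06
SLES11SP1|0.99|5.8
SLES10SP3|0.78|0.93
EOF
```

On three of the servers adding grep costs at most about half again the zcat time. On SLES11SP1 the ratio is close to six, which suggests the extra time is spent in the grep stage rather than in zcat.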
Conclusions
Virtualization is not a panacea. It has its role, but also its limitations. And either something is wrong with SLES 11 SP1, which I doubt, or something is wrong with the way I set it up, despite the fact that I’ve tried a few variants. We’re going to try it on a physical server next. I’ll update this post if anything noteworthy happens.
Update
I got it tested on a physical server now, too: an HP G6 with 4 GB RAM, running SLES 11 SP1. The CPU identified itself as Xeon E5504 @ 2.0 GHz. Here are the awful timings:
Server | zcat time | zcat \| grep IP time |
---|---|---|
SLES 11 SP1 Physical | 0.90 s | 4.8 s |
That shows that SLES 11 SP1 itself is causing my performance degradation. Not every instance of SLES 11 is faulty, however. It turns out that IBM’s Watson is running it! http://whatis.techtarget.com/definition/ibm-watson-supercomputer.html
For the record, the kernel in SLES 11 SP1 is 2.6.32.12-0.7, while in SLES 10 SP3 it is 2.6.16.60-0.68.1. On a RHEL 5.6 VM (I did not bother to list its results), where the kernel was 2.6.18-238.1.1, the degradation was not nearly so bad either. I wonder if the kernel version is having a major impact? Perhaps not. The kernel in Ubuntu 10.10 looks pretty new (2.6.35-24). I am running
uname -a
to see the kernel version.
Further Update
We also got to test with a SLES 10 SP4 VM. You can see from the table that it is well-behaved. It had 2 CPUs which identified themselves as X6550 @ 2.0 GHz, 4 GB RAM, and kernel 2.6.16.60-0.85.1.
Server | zcat time | zcat \| grep IP time |
---|---|---|
SLES 10 SP4 VM | 0.96 s | 1.0 s |