Categories
Admin Apache IT Operational Excellence Linux Security Web Site Technologies

Apache Tips in Light of Security Problems

Intro
I am far from an expert in Apache. But I have a good knowledge of general best practices which I apply when running Apache web server. None of my tips are particularly insightful – they all can be found elsewhere, but this will be a single place to help find them all together.

To Compile or Not
As of this writing the current version is 2.2.21. The version supplied with the current version of SLES, SLES 11, is 2.2.10. To find the version run httpd -v

I think that’s fairly typical for them to be so many version behind. I recommend compiling your own version. But pay attention to security advisories and check every quarter to see what the latest release is. You’ll have to keep up with it on your own or you’ll actually be in worse shape than if you used the vendor version and applied patches regularly.

What You’ll Need to Know for the Range DOS Vulnerability
When you get the source you might try a simple ./configure, followed by a make and finally make install. And it would all seem to work. You can fetch the home page with a curl localhost. Then you remember about that recent Range header denial of service vulnerability described here. If you test for whether you support the Range header you’ll see that you do. I like to test for this as follows:

$ curl -H "Range: bytes=1-2" localhost

If before you saw something like

<html><body><h1>It works!</h1>

now it becomes

ht

i.e., it grabbed bytes one and two from <html>…

Now there are options and opinions about what to do about this. I think turning off Range header support is the best option. But if you try that you will fail. Why? Because you did not compile in the mod_headers module. To turn off Range headers add these lines to the global part of your configuration:

RequestHeader unset Range
RequestHeader unset Request-Range

To see what modules you have available in your apache binary you do

/usr/local/apache2/bin/httpd -l

which should look like the following if you have taken all the defaults:

Compiled in modules:
  core.c
  mod_authn_file.c
  mod_authn_default.c
  mod_authz_host.c
  mod_authz_groupfile.c
  mod_authz_user.c
  mod_authz_default.c
  mod_auth_basic.c
  mod_include.c
  mod_filter.c
  mod_log_config.c
  mod_env.c
  mod_setenvif.c
  mod_version.c
  prefork.c
  http_core.c
  mod_mime.c
  mod_status.c
  mod_autoindex.c
  mod_asis.c
  mod_cgi.c
  mod_negotiation.c
  mod_dir.c
  mod_actions.c
  mod_userdir.c
  mod_alias.c
  mod_so.c

Notice there is no mod_headers.c which means there is no mod_headers module. And in fact when you restart your apache web server you are likely to see this error:

Syntax error on line 360 of /usr/local/apache2/conf/httpd.conf:
Invalid command 'RequestHeader', perhaps misspelled or defined by a module not included in the server configuration

So you need to compile in mod_headers. Begin by cleaning your slate by running make clean in your source directory; then run configure as follows:

./configure –enable-headers –enable-rewrite

I’ve thrown in the –enable-rewrite qualifier because I like to be able to use mod_rewrite. It is not actually used for the security problems being discussed in this article.

Side note for those using the system-provided apache2 package on SLES
As an alternative to compiling yourself, you may be using an apache package. I have only tested this for SLES (so it would probably be the same for openSUSE). There you can edit the /etc/sysconfig/apache2 file and add additional modules to load. In particular the line

APACHE_MODULES="actions alias auth_basic authn_file authz_host authz_groupfile authz_default authz_user authn_dbm autoindex
 cgi dir env expires include log_config mime negotiation setenvif ssl suexec userdir php5 reqtimeout"

can be changed to

APACHE_MODULES="actions alias auth_basic authn_file authz_host authz_groupfile authz_default authz_user authn_dbm autoindex
 cgi dir env expires include log_config mime negotiation setenvif ssl suexec userdir php5 reqtimeout headers"

Back to compiling. Note that ./configure -help gives you some idea of all the options available, but it doesn’t exactly link the options to the precise module names, though it gives you a good idea via the description.

Then run make followed by make install as before. You should be good to go!

A Built-in Contradiction
You may have successfully suppressed use of range-headers, but on my web server, I noticed a contradictory HTTP Response header was still being issued after all that:

Accept-Ranges:

I use a simple

curl -i localhost

to look at the HTTP Response headers. The contradiction is that your server is not accepting ranges while it’s sending out the message that it is!

So turn that off to be consistent. This is what I did.

# need the following line to not send Accept-Ranges header
Header unset Accept-Ranges
#

Don’t Give Away the Keys
Don’t reveal too much about your server version such as OS and patch level of your web server. I suppose it is OK to reveal your web server type and its major version. Here is what I did:

# don't reveal too much about the server version - just web server and major version
# see http://www.ducea.com/2006/06/15/apache-tips-tricks-hide-apache-software-version/
ServerTokens Major

After all these changes curl -i localhost output looks as follows:

HTTP/1.1 200 OK
Date: Fri, 04 Nov 2011 20:39:02 GMT
Server: Apache/2
Last-Modified: Fri, 14 Oct 2011 15:37:41 GMT
ETag: "12005-a-4af4409a09b40"
Content-Length: 10
Content-Type: text/html

See? I’ve gotten rid of the Accept-Ranges and provide only sketchy information about the server.

I put these security-related measures into a single file I include from the global configuration file httpd.conf into a file I call security.conf. To put it all toegther, at this point my security.conf looks like this:

# 11/2011
# prevent DOS attack.  
# See http://mail-archives.apache.org/mod_mbox/httpd-announce/201108.mbox/%[email protected]%3E - JH 8/31/11
# a good explanation of how to test it: 
# http://devcentral.f5.com/weblogs/macvittie/archive/2011/08/26/f5-friday-zero-day-apache-exploit-zero-problem.aspx
# looks like we do have this vulnerability, 
# trying curl -i -H 'Range:bytes=1-5' http://bsm2.com/index.html
# note that I had to compile with ./configure --enable-headers to be able to use these directives
RequestHeader unset Range
RequestHeader unset Request-Range
#
# need the following line to not send Accept-Ranges header
Header unset Accept-Ranges
#
# don't reveal too much about the server version - just web server and major version
# see http://www.ducea.com/2006/06/15/apache-tips-tricks-hide-apache-software-version/
ServerTokens Major

SSL (added December, 2014)
Search engines are encouraging web site operators to switch to using SSL for the obvious added security. If you’re going to use SSL you’ll also need to do that responsibly or you could get a false sense of security. I document it in my post on working with cipher settings.

Disable folder browsing/directory listing
I recently got caught out on this rookie mistake: Web Directories listing vulnerability. The solution is simple. In side your main HTDOCS section of configuration you may have a line that looks like:

Options Indexes FollowSymLinks ExecCGI

Get rid of that Indexes – that’s what permits folder browsing, So this is better:

Options FollowSymLinks ExecCGI

Turn off php version listing, December 2016 update
Oops. I read about how the 47% of the top million web sites have security issues. One bases for the judgment is to see what version of PHP is running based on the headers. So i checked my https server, and, oops:

$ curl ‐s ‐i ‐k https://drjohnstechtalk.com/blog/|head ‐22

HTTP/1.1 200 OK
Date: Fri, 16 Dec 2016 20:00:09 GMT
Server: Apache/2
Strict-Transport-Security: max-age=15811200; includeSubDomains; preload
Vary: Cookie,Accept-Encoding
X-Powered-By: PHP/5.4.43
X-Pingback: https://drjohnstechtalk.com/blog/xmlrpc.php
Last-Modified: Fri, 16 Dec 2016 20:00:10 GMT
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
 
<!DOCTYPE html>
<html lang="en-US">
<head>
...

So there is was, hanging out for all to see, PHP version 5.4.43. I’d rather not publicly admit that. So I turned it off by adding the following to my php.ini file and re-starting apache:

expose_php = off

After this my HTTP response headers show only this:

HTTP/1.1 200 OK
Date: Fri, 16 Dec 2016 20:00:55 GMT
Server: Apache/2
Strict-Transport-Security: max-age=15811200; includeSubDomains; preload
Vary: Cookie,Accept-Encoding
X-Pingback: https://drjohnstechtalk.com/blog/xmlrpc.php
Last-Modified: Fri, 16 Dec 2016 20:00:57 GMT
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

I must have overlooked this when I compiled my own apache v 2.4 and used it to run my principal web server over https.

June 2017 update
PCI compliance will ding you for lack of an X-Frame-Options header. So for a simple web site like mine I can always safely send one out by adding this to my apache.conf file (or whichever apache conf file you deem most appropriate. I have a special security file in conf.d where I actually put it):

# don't permit framing from other sources, DrJ 6/16/17
# https://www.simonholywell.com/post/2013/04/three-things-i-set-on-new-servers/
Header always append X-Frame-Options SAMEORIGIN

PCI compliance will also ding you if TRACE method is enabled. In that security file of my configuration I disable it thusly:

TraceEnable Off

Test both those things in one fell swoop
$ curl ‐X TRACE ‐i ‐k https://drjohnstechtalk.com/

HTTP/1.1 405 Method Not Allowed
Date: Fri, 16 Jun 2017 18:20:24 GMT
Server: Apache/2
X-Frame-Options: SAMEORIGIN
Strict-Transport-Security: max-age=15811200; includeSubDomains; preload
Allow:
Content-Length: 295
Content-Type: text/html; charset=iso-8859-1
 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>405 Method Not Allowed</title>
</head><body>
<h1>Method Not Allowed</h1>
<p>The requested method TRACE is not allowed for the URL /.</p>
<hr>
<address>Apache/2 Server at drjohnstechtalk.com Port 443</address>
</body></html>

See? X-Frame-Options header now comes out with desired value. TRACE method was disallowed. All good.

Conclusion
Make sure you are taking some precautions against known security problems in Apache2. For information on running multiple web server instances under SLES see my next post Running Multiple Web Server Instances under SLES.

References and related
Remember, for handling the apache SSL hardening go here.
Compiling apache 2.4
drjohnstechtalk is now an HTTPS site!
TRACE method sounds useful for debugging, but I guess there are exploits so it needs to be disabled. Wikipedia documents it: https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods. Don’t forget that curl -v also shows you your request headers!

Categories
IT Operational Excellence Uncategorized

The IT Detective Agency: Debugging a Thorny Citrix Connection Issue

This case begins with the observation by the application owner for Citrix XenApp. External users were being knocked out of their sessions frequently – several times a day. And it happened en masse. Before this problem users were typically logged in all day. You can see that many must have been bumped around 12:30 PM then again around 2 PM. The problems began July 5th.

The users suffering the disconnects were all external users who access the applications via a Citrix Secure Gateway. The XenApp servers being accessed are also used on the Intranet and those users were not seeing any drops.

The AO asked if I had changed anything in the network. Nope. Had he changed anything? Nope.

So now we have the classic stand-off, right? AO vs network. There’s a root cause and it’s either the AO or the network guy who’s ultimately at fault.

My attitude in these cases is the following: the network person should prove it’s an application problem and the application owner should prove it’s a network problem! It sounds cynical, but this approach aligns with the best interests of each party. Both are really working towards the same goal, but preserving their own interests. E.g., the networking person thinks that If I can prove it’s an application problem then the AO will quit bothering me and I can get back to my real job. After all, I am not knowledgeable about the application. Even if it is a networking issue, I do not know where the issue is so I need the AO to point out the problem at a detailed level, e.g., the dropped packet or whatever, so I can focus my energies. The reality in my experience is quite different however. The AO typically does not know enough about networking to make this proof.

Nonetheless I proceeded this way, hoping to prove some knid of application problem so I could get back to my normal activities.

We enabled my own PC to use the application. This is always much easier than bothering other people. I can take traces to my heart’s content! So Monday I was connected to XenApp via the CSG. I was going along fine until 11:35 when I got the disconnected message! I later learned that the bulk of users, who are using a different app, were not disconnected then, but were at about an hour later.

Now there’s lots of pieces to look at, any one of which could be at fault. Working from PC on Internet to the XenApp we have: The Internet, my Internet router, firewall, load balancer, CSG server, firewall, XenApp server. That’s a lot to look after, but you have to start somewhere. I chose the load balancer. It was rather confusing, even to establsih a baseline of “normal” activity. I quickly observed that every 30 seconds packets were being transmitted to the PC even when nothing was going on.. Of course the communication was all encrypted so I did not even attempt to look into the packets. But sometimes I saw seven packets, sometimes six, and more rarely different numbers. The packet order didn’t even make sense. Sometimes the load balancer responded to the XenApp before the PC did! The trace of this behaviour until I was disconnected will be shown here when I get the time to include it:

The end of the trace shows a bunch of FIN packets. FIN is used to terminate a TCP connection. Now we’re getting somewhere. It looks like, from a TCP perspective, that a more-or-less orderly shutdown of the connection was occuring. If confirmed that would point to an application problem and life would be good!

The next day I logged into CSG and used a XenApp app again. This time I did an additional trace and included the CSG server itself. Again I was disconnected after a few hours. In this trace the CSG server is called webservera, the XneApp server is xenapp15. This is not a byte-level trace but rather running snoop on Solaris and looking at the meta-data:

________________________________
11:29:23.81577 xenapp15 -> webservera    ETHER Type=0800 (IP), size = 60 bytes
11:29:23.81577 xenapp15 -> webservera    IP  D=10.201.142.41 S=10.201.88.34 LEN=46, ID=4842, TOS=0x0, TTL=126
11:29:23.81577 xenapp15 -> webservera    TCP D=56011 S=1494 Push Ack=1900995851 Seq=4198089996 Len=6 Win=64134
________________________________
11:29:25.01881 xenapp15 -> webservera    ETHER Type=0800 (IP), size = 60 bytes
11:29:25.01881 xenapp15 -> webservera    IP  D=10.201.142.41 S=10.201.88.34 LEN=46, ID=4844, TOS=0x0, TTL=126
11:29:25.01881 xenapp15 -> webservera    TCP D=56011 S=1494 Push Ack=1900995851 Seq=4198089996 Len=6 Win=64134
________________________________
11:29:27.42530 xenapp15 -> webservera    ETHER Type=0800 (IP), size = 60 bytes
11:29:27.42530 xenapp15 -> webservera    IP  D=10.201.142.41 S=10.201.88.34 LEN=46, ID=4861, TOS=0x0, TTL=126
11:29:27.42530 xenapp15 -> webservera    TCP D=56011 S=1494 Push Ack=1900995851 Seq=4198089996 Len=6 Win=64134
________________________________
11:29:30.87645    webservera -> xenapp15 ETHER Type=0800 (IP), size = 54 bytes
11:29:30.87645    webservera -> xenapp15 IP  D=10.201.88.34 S=10.201.142.41 LEN=40, ID=847, TOS=0x0, TTL=64
11:29:30.87645    webservera -> xenapp15 TCP D=1494 S=56011 Ack=4198090002 Seq=1900995851 Len=0 Win=48871
________________________________
11:29:30.87657    webservera -> xenapp15 ETHER Type=0800 (IP), size = 54 bytes
11:29:30.87657    webservera -> xenapp15 IP  D=10.201.88.34 S=10.201.142.41 LEN=40, ID=848, TOS=0x0, TTL=64
11:29:30.87657    webservera -> xenapp15 TCP D=1494 S=56011 Ack=4198090002 Seq=1900995851 Len=0 Win=48871
________________________________
11:29:33.02325    webservera -> xenapp15 ETHER Type=0800 (IP), size = 54 bytes
11:29:33.02325    webservera -> xenapp15 IP  D=10.201.88.34 S=10.201.142.41 LEN=40, ID=849, TOS=0x0, TTL=64
11:29:33.02325    webservera -> xenapp15 TCP D=1494 S=56011 Ack=4198090002 Seq=1900995851 Len=0 Win=48871
________________________________
11:29:53.34945 xenapp15 -> webservera    ETHER Type=0800 (IP), size = 60 bytes
11:29:53.34945 xenapp15 -> webservera    IP  D=10.201.142.41 S=10.201.88.34 LEN=40, ID=4923, TOS=0x0, TTL=126
11:29:53.34945 xenapp15 -> webservera    TCP D=56011 S=1494 Rst Ack=1900995851 Seq=4198090002 Len=0 Win=0

What I saw this time is that RST packet was being sent from the XenApp server! That’s the very last line, which I will repeat here for emphasis since it is so important to the case:

11:29:53.34945 xenapp15 -> webservera    TCP D=56011 S=1494 Rst Ack=1900995851 Seq=4198090002 Len=0 Win=0

TCP RST is a way to immediately disconnect a connection! It seemed as though this was begin converted to a FIN by the CSG. Now it’s looking very much like for whatever reason the application decided to terminate the connection. It almost has to be an application problem, right?

Wrong! We have to keep an open mind.

This trace, while dense, hints at where the problem may lie. It is taken on the load balancer with tcpdump -i 0.0. The load balancer has two interfaces, one towards the Internet, the other towards webserverw. The hostname of the load balancer’s Internet interface is called CSG, the hostname of the Citrix client on the Internet is drjohnspc.

11:03:59.392810 802.1Q vlan#4093 P0 drjohnspc.20723 > webserverw.https: . ack 35 win 32768 (DF)
11:03:59.455730 802.1Q vlan#4093 P0 webserverw.https > drjohnspc.20723: P 35:68(33) ack 1 win 48677 (DF)
11:03:59.554810 802.1Q vlan#4093 P0 drjohnspc.20723 > webserverw.https: . ack 68 win 32768 (DF)
11:03:59.585845 802.1Q vlan#4094 P0 drjohnspc.20723 > CSG.https: . ack 35 win 64426 (DF)
11:03:59.585855 802.1Q vlan#4094 P0 CSG.https > drjohnspc.20723: P 35:68(33) ack 1 win 32768 (DF)
11:03:59.885805 802.1Q vlan#4094 P0 drjohnspc.20723 > CSG.https: . ack 68 win 64393 (DF)
11:04:59.465070 802.1Q vlan#4093 P0 webserverw.https > drjohnspc.20723: P 68:103(35) ack 1 win 48677 (DF)
11:04:59.465080 802.1Q vlan#4094 P0 CSG.https > drjohnspc.20723: P 68:103(35) ack 1 win 32768 (DF)
11:04:59.564818 802.1Q vlan#4093 P0 drjohnspc.20723 > webserverw.https: . ack 103 win 32768 (DF)
11:05:00.664812 802.1Q vlan#4094 P0 CSG.https > drjohnspc.20723: P 68:103(35) ack 1 win 32768 (DF)
11:05:02.864812 802.1Q vlan#4094 P0 CSG.https > drjohnspc.20723: P 68:103(35) ack 1 win 32768 (DF)
11:05:07.064810 802.1Q vlan#4094 P0 CSG.https > drjohnspc.20723: P 68:103(35) ack 1 win 32768 (DF)
11:05:15.264811 802.1Q vlan#4094 P0 CSG.https > drjohnspc.20723: P 68:103(35) ack 1 win 32768 (DF)
11:05:31.464812 802.1Q vlan#4094 P0 CSG.https > drjohnspc.20723: P 68:103(35) ack 1 win 32768 (DF)
11:05:59.807514 802.1Q vlan#4093 P0 webserverw.https > drjohnspc.20723: P 103:130(27) ack 1 win 48677 (DF)
11:05:59.807741 802.1Q vlan#4093 P0 webserverw.https > drjohnspc.20723: F 130:130(0) ack 1 win 48677 (DF)
11:05:59.807754 802.1Q vlan#4093 P0 drjohnspc.20723 > webserverw.https: . ack 131 win 32768 (DF)
11:05:59.807759 802.1Q vlan#4094 P0 CSG.https > drjohnspc.20723: FP 103:130(27) ack 1 win 32768 (DF)
11:06:03.664813 802.1Q vlan#4094 P0 CSG.https > drjohnspc.20723: . 68:103(35) ack 1 win 32768 (DF)
11:06:12.642847 802.1Q vlan#4093 P0 drjohnspc.20723 > webserverw.https: R 1:1(0) ack 131 win 32768 (DF)
11:06:12.642862 802.1Q vlan#4094 P0 CSG.https > drjohnspc.20723: RP 103:130(27) ack 1 win 32768 (DF)

Notice the time stamp increasing by larger and larger leaps beginning with 11:05:00.664812. 11:05:02, 11:05:07, 11:05:15, 11:05:31 – the time keeps doubling! this is characteristic of a TCP retransmit. Note that all the other information is the same. It must be retransmitting the same packet. Why? Because it never got there! That seems to be the most likely reason. Now my conviction and hope that an application problem lies at the heart of the issue is starting to crumble. See why you need to keep an open mind? Your opinion can change to the polar opposite conclusion with the input of some additional data like that. Where to turn next?

There is a firewall inbetween the load balancer and the Internet. Now we will focus our attention on it. Could be that it dropped that packet and all the re-transmits.

Here’s the trace of that same conversation on the firewall’s internal interface (which faces the CSG) (I(O) means inbound(outbound) with respect to that interface):

11:04:59.441022  I IP CSG.https > drjohnspc.20723: P 2210302227:2210302262(35) ack 1714160833 win 32768
11:05:00.640781  I IP CSG.https > drjohnspc.20723: P 0:35(35) ack 1 win 32768
11:05:02.840742  I IP CSG.https > drjohnspc.20723: P 0:35(35) ack 1 win 32768
11:05:07.040729  I IP CSG.https > drjohnspc.20723: P 0:35(35) ack 1 win 32768
11:05:15.240780  I IP CSG.https > drjohnspc.20723: P 0:35(35) ack 1 win 32768
11:05:31.440571  I IP CSG.https > drjohnspc.20723: P 0:35(35) ack 1 win 32768
11:05:59.783366  I IP CSG.https > drjohnspc.20723: FP 35:62(27) ack 1 win 32768
11:06:03.640595  I IP CSG.https > drjohnspc.20723: . 0:35(35) ack 1 win 32768
^C

and the trace of the same thing on the firewall’s external interface, i.e., facing the Internet and drjohnspc:

11:03:59.269334  O IP CSG.https > drjohnspc.20723: P 2210302159:2210302194(35) ack 1714160833 win 32768
11:03:59.562011  I IP drjohnspc.20723 > CSG.https: . ack 35 win 64426
11:03:59.562139  O IP CSG.https > drjohnspc.20723: P 35:68(33) ack 1 win 32768
11:03:59.861970  I IP drjohnspc.20723 > CSG.https: . ack 68 win 64393

Notice what’s not present in the exterenal interface trace – all those re-transmits, or even the original packet.

Let’s summarize so far. One of those keep-alive packets from the XenApp server reached the firewall, but didn’t exit the firewall. the only possibility is that it got dropped by the firewall!

Now that was a lot of work, but who’s going to do it if not a patient and methodical IT person?

Results
We got a networking problem on our hands after all. Good thing we persisted in this investigation even when it looked like we were off the hook! Later it was confirmed that the firewall was “aggressively aging” its connections because it had either reached or was very close to its connection limit. The firewall connection limit was raised and the Citrix connection issues went away.

Let’s go back to that simplistic question that non-experts like to ask: what had changed that caused this problem? The change was external events – increased usage of that firewall. Network bandwidth, Internet usage – they all tend to increase over time. There were no changes done by either networking or the application group to cause this issue. A seasoned IT detective uses all available clues and arrives at the right conclusion. The “what has changed” question is normally very relevant, but it can’t be the only tool in your toolbox!

Case closed!