Author: john

The IT Detective agency: Excessive Requests for PAC file Crippling Web Server

Post author By john
Post date March 12, 2012
4 Comments on The IT Detective agency: Excessive Requests for PAC file Crippling Web Server

Intro
Funny thing about infrastructure. You may have something running fine for years, and then suddenly it doesn’t. That is one of the many mysteries in the case of the excessive requests for PAC file.

The Details
We serve our Proxy Auto-config (PAC) file from a couple web servers which are load-balanced. It’s worked great for over 10 years. The PAC file is actually produced by a Perl script which can alter the content based on the user’s IP or other variables.

The web servers got bogged down last week. I literally observed the load average shoot up past 200 (on a 2-CPU server). This is not good.

I quickly noticed lots and lots of accesses for wpad.dat and proxy.pac. Some PCs were individually making hundreds of requests for these files in a day! Sometimes there were 15 or 20 requests in a minute. Since the PAC file is generated by a script, it takes some compute power to handle all those requests. So it was time to do one of two things: either learn the root cause and address it, or make a quick fix. The symptoms were clear enough, but I had no idea about the root cause. I was also fascinated by the requests for wpad.dat, which I felt served no purpose whatsoever in our environment. So I went for the quick fix, hoping that understanding would come later.
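To see which clients are responsible for the bulk of the requests, the access log can be tallied per client. The following is a minimal sketch, not the method used at the time; it assumes the Apache common or combined log format, where the first field is the client IP and the seventh is the request path. The function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: rank clients by number of PAC/WPAD requests.
# Assumes Apache common/combined log format: field 1 is the client IP and
# field 7 is the request path. Adjust the field numbers for other formats.
pac_top_clients() {
    awk '$7 ~ /\/(wpad\.dat|proxy\.pac)(\?|$)/ { count[$1]++ }
         END { for (ip in count) print count[ip], ip }' "$@" |
        sort -rn
}
```

Called as pac_top_clients followed by the path to an access log, and piped through head, this prints the busiest clients first, one "count IP" pair per line.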

To be continued…
As promised – three and a half years later! And we still have this problem. It’s probably worse than ever. I pretty much threw in the towel and learned how to scale up our Apache web server to handle more simultaneous PAC file requests; see the references.

References
Scaling apache to handle more requests.

Tags PAC file

Admin Linux SLES

How to Get By Without unix2dos in SLES

Post author By john
Post date March 7, 2012
2 Comments on How to Get By Without unix2dos in SLES

Intro
As a Unix old-timer looking at the latest releases, I have observed only one tendency – that of ever-increasing numbers of commands, always additive – until now. A command I considered useful (well, basically any command I have ever used I consider useful) has gone AWOL in SUSE Linux Enterprise Server (SLES for short): unix2dos.

Why You Need It
These days you need it more than ever. What with sftp being used in place of ftp, your transferred text files will come over from a SLES server to your PC in binary mode, preserving the Linux-style way of declaring a new line with the newline character, “\n”. Bring that file onto your PC and look at it in Notepad and you’ll get one long line because Windows requires more to indicate a new line. Windows OS’s like Windows 7 require a carriage return + newline, i.e., “\r\n”.

Who You Going to Call
I spoke with some experts so I cannot take credit for finding this out personally. Long story short things evolved and there is a more sophisticated command available that does this sort of thing and much else. That’s recode.

But I don’t think I’ll ever use recode for anything else so I decided to re-create a unix2dos command using recode in a tiny shell script:

#!/bin/sh
# inspired by http://yourlinuxguy.com/?p=232 and the fact that they took away this useful command
# 3/6/12
recode latin1..ibmpc "$@"

You call it like this:

> unix2dos file

and it overwrites the file and converts it to the format Windows expects.

My other expert contact says I could find the old unix2dos in OpenSuse but I decided not to go that route.

Of course to convert in the other direction you have dos2unix which for some reason wasn’t removed from the distro. Strange, huh?

How to See That It Worked
I use

> od -c file|more

to look at the ASCII characters in a text file. It also shows the newline and carriage return characters as \n and \r respectively. This is a good command to know because it is also a “safe” way to look at a binary file. By safe I mean it won’t try to print out 8-bit characters that will permanently mess up your terminal settings!
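For a long file, scanning od output by eye gets tedious. As a rough complement, here is a small sketch that counts how many lines end in carriage return plus newline and how many end in a bare newline. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: report how many lines of a file end in CR LF versus bare LF.
count_line_endings() {
    awk '{ if (sub(/\r$/, "")) crlf++; else lf++ }
         END { printf "crlf=%d lf=%d\n", crlf, lf }' "$1"
}
```

A file that was converted completely should report lf=0.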

2017 update
I finally needed this utility again after five years. My program doesn’t work on CentOS. – No recode, whatever that was. However, the one-liner provided in the comments worked just fine for me.
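For systems without recode, one possible stand-in is plain awk. To be clear, this is not the one-liner from the comments, just a sketch of the same idea with a made-up function name. It only touches line endings and does no character set conversion, unlike the recode version.

```shell
#!/bin/sh
# Sketch: convert LF line endings to CR LF in place, without recode.
# Any existing CR is stripped first so converted files are not doubled up.
lf_to_crlf() {
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$f" > "$tmp" &&
            cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}
```

Writing back with cat rather than mv keeps the original file's ownership and permissions.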

Conclusion
We can rest easy and send text files back-and-forth between a PC and a SLES server with the help of this unix2dos script we developed.

Interestingly, RedHat’s RHEL has kept unix2dos in its distribution. Good for them. In Ubuntu Linux unix2dos also seems decidedly missing.

Tags recode, unix2dos

Admin Linux

Common Problems Installing Cognos Gateway on Linux

Post author By john
Post date March 2, 2012
No Comments on Common Problems Installing Cognos Gateway on Linux

Updated for a 2018 Cognos 11 install
with 2013 updates for Cognos 10 installation

Intro
I tried to take a shortcut and get a 2nd Cognos gateway up and running by copying files, etc. rather than a proper install. At one time or another I feel I must have encountered just about every problem conceivable. I didn’t take great, systematic notes, but I’d like to mention some highlights while it is still fresh in my memory!

The Details
Note that I have a working gateway server running on the same version of Linux, SLES 11 SP1. So I thought I’d be clever and just copy all the files below /opt/cognos8 from the working server.

First Rookie Mistake
Let’s call our COGNOS_ROOT /opt/cognos8 for convenience.
Cognos 10 note: /opt/cognos10 would be a more sensible installation directory!

So you’re following along in the documentation and dutifully looking for /opt/cognos8/bin/cogconfig.sh, and not finding it? Me, neither. So I cleverly borrowed it from a working Solaris installation. It’s all Java, right, no OS dependencies, what can go wrong? Ha, ha. You try:

./cogconfig.sh
and get:

Using /usr/lib64/jvm/jre/bin/java
The java class is not found:  CRConfig

Long story short. Give up. Without telling anyone they moved it to /opt/cognos8/bin64. That’s assuming you’re on a 64-bit system like most of us are.

OK. Now you run it from the …bin64 directory, expecting better results, only to perhaps get something like:

./cogconfig.sh

Unable to locate a JRE. Please specify a valid JAVA_HOME environment variable.

Long story short, java-1_4_2-ibm (java-1_6_0-ibm if installing a Cognos 10 gateway) is a good Java environment to install for Cognos Gateway. At least it is on SLES Linux. So you install that and set up environment variables like these:

export JAVA_BINDIR=/usr/lib64/jvm/jre/bin
export JAVA_HOME=/usr/lib64/jvm/jre
export JAVA_ROOT=/usr/lib64/jvm/jre
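Before exporting these variables it may be worth confirming that the directory really contains a runnable java binary, since the symlinks under /usr/lib64/jvm can point somewhere unexpected. A minimal sketch follows; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: confirm a candidate JAVA_HOME actually contains a runnable java.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "ok: $1/bin/java"
    else
        echo "no executable java under $1" >&2
        return 1
    fi
}
```

It could be chained with the export, for example: check_java_home /usr/lib64/jvm/jre && export JAVA_HOME=/usr/lib64/jvm/jre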

Now you’re cooking. Run it yet again. You’re smart and know to set up your DISPLAY environment to a valid XServer you have access to. But even if the X application actually does launch and run (you may need some Motif or additional X packages, possibly even from the SDK DVD – see appendix A), if you try to export the configuration you’ll get an error like this:

java.lang.ClassNotFoundException: org.bouncycastle134.jce.provider.BouncyCastleProvider

Cognos 10 note: I did not have this class missing in my Cognos 10 installation. Yeah!

Yes, you are missing the infamous bouncycastleprovider! This stuff is too good to make up, right? It’s a jar file that’s somewhere in the Cognos Gateway distribution, bcprov-jdk14-134.jar. In my case I need to put it here:

/etc/alternatives/jre/lib/ext

With that in place run it yet again. Now you may be unable to export the configuration with this error:

CAM-CRP-1057 Unable to generate the machine specific symmetric key.

Does it ever end? Yes!

You may have old values of keys and what-not cryptography stuff from your copy of the other system. So you remove these directories and all their contents:

/opt/cognos8/configuration/{encryptkeypair,signkeypair}

And I even saw the following error:

02/03/2012,11:26:56,Err,com.cognos.crconfig.data.DataManagerException: CAM-CRP-1132 An error occurred while attempting to request a certificate from the Certificate Authority service. Unable to connect to the Certificate Authority service. Ensure that the Content Manager computer is configured and that the Cognos 8 services on it are currently running. Reason: java.net.ConnectException: Connection refused, com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:2730)

I think it comes about if you save the default config without editing it and putting in a valid dispatcher URI, but I forget.

The main point towards the end was to start with a clean config by a:

cd /opt/cognos8/configuration;cp cogstartup.xml{.new,}

, making sure there are no encryptkeypair or signkeypair directories, launching …bin64/cogconfig.sh, working with the GUI to define the dispatcher URIs to your working, running Cognos dispatcher, exporting it,

(Let me take a breath here. If that export succeeds, you’re home.)

and finally saving it, which also generates the system-specific keys.

That’s it! A bunch of green check marks are your reward. Hopefully.

Conclusion
In the end you will see that this “cheap method” of installing Cognos Gateway worked. We had a few bumps along the road, but we worked through them all. Now that we’ve seen just about every conceivable problem we have a treasure trove of documented errors and fixes should we ever find ourselves in this situation again.

There is one more Cognos Gateway problem we resolved, by the way, that was previously documented here.

Appendix A – Cognos 10 note
Yes, I referred to this document in my own installation of Cognos version 10 gateway component. The problems are very similar, and this was a big help, if I say so myself.

I notice I write a tight narrative. I have lots of tangential thoughts, but to list them all as I think of them would destroy the flow of the narrative. In this case I wanted to expand on the openmotif packages.

I got a missing libXm.so.4 message when launching issetup the first time. I determined this came from an openmotif package from my previous successful installation on another server. My new server had limited repositories.

> zypper search openmotif

produced these results:

 
S | Name                   | Summary                    | Type
--+------------------------+----------------------------+-----------
  | openmotif21-demos      | Open Motif 2.2.4 Libraries | package
  | openmotif21-libs       | Open Motif 2.2.4 Libraries | package
  | openmotif21-libs       | Open Motif 2.2.4 Libraries | srcpackage
  | openmotif21-libs-32bit | Open Motif 2.2.4 Libraries | package
  | openmotif22-libs       | Open Motif 2.2.4 Libraries | package
  | openmotif22-libs       | Open Motif 2.2.4 Libraries | srcpackage
  | openmotif22-libs-32bit | Open Motif 2.2.4 Libraries | package

Well, I tried to install first openmotif21-libs-32bit then openmotif22-libs-32bit, but neither gave me the right version of libXm.so! I had versions 2, 3 and 6! So I simply did one of these numbers:

> cd /usr/lib; ln -s libXm.so.3.0.3 libXm.so.4

and, to my surprise, it worked!
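Before resorting to a symlink it can help to see exactly which shared libraries the binary cannot resolve. The sketch below filters ldd output for unresolved entries. It reads from standard input so it can also be tried on saved output, and the function name is made up.

```shell
#!/bin/sh
# Sketch: list shared libraries that the dynamic linker cannot resolve.
# Reads ldd output on stdin, e.g.:  ldd ./issetup | missing_libs
missing_libs() {
    awk '/not found/ { print $1 }'
}
```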

More Errors Documented for completeness’ sake
At the risk of making this blog post a total mess, I’ll include a few more errors I encountered during the upgrade. Who knows who might find this useful.

Generating the cryptographic keys is always a hold-your-breath-and-pray operation. I had my upgrade files in place in a new install directory, /opt/cognos10. I ran bin64/cogconfig.sh like usual. It was suggested I could save the configuration even though the application gateway wasn’t running, so I tried that. No dice.

The cryptographic information cannot be encrypted.

Fine. So probably the app server needs to be running before we save the config, right? So they got it running. I tried to save the config. Same error. The details were as follows:

[ ERROR ]
CAM-CRP-1315 Current configuration points to a different Trust Domain than originally configured.
 
[ ERROR ] 
The cryptography information was not generated.

The remedy? Close the configuration and completely remove these directories beneath the /opt/cognos10/configuration directory:

– encryptkeypair
– signkeypair
– csk (actually I didn’t have this one. But I guess it should be removed if present)

I held my breath, re-ran cogconfig and saved. This time it worked!
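A more cautious variation on the remedy is to move the key directories aside rather than delete them, so they can be put back if the save still fails. This is only a sketch with a hypothetical function name; I have not verified how Cognos treats extra directories left under configuration, so move the backups elsewhere if in doubt.

```shell
#!/bin/sh
# Sketch: move stale Cognos key directories aside instead of deleting them.
# $1 is the configuration directory, e.g. /opt/cognos10/configuration.
stash_crypto_dirs() {
    conf_dir=$1
    stamp=$(date +%Y%m%d%H%M%S)
    for d in encryptkeypair signkeypair csk; do
        if [ -d "$conf_dir/$d" ]; then
            mv "$conf_dir/$d" "$conf_dir/$d.bak.$stamp" || return 1
            echo "moved $d to $d.bak.$stamp"
        fi
    done
}
```

Directories that do not exist, such as csk in my case, are simply skipped.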

I also had an error with my Java version:

./cogconfig.sh
Using /usr/lib64/jvm/jre/bin/java
The java class could not be loaded. java.lang.UnsupportedClassVersionError: (CRConfig) bad major version at offset=6

/usr/lib64/jvm/jre/bin/java -version

showed

java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 2.3)
IBM J9 VM (build 2.3, J2RE 1.4.2 IBM J9 2.3 Linux amd64-64 j9vmxa64142ifx-20110628 (JIT enabled)
J9VM - 20110627_85693_LHdSMr
JIT  - 20090210_1447ifx5_r8
GC   - 200902_24)

I installed a newer Java:

zypper install  java-1_6_0-ibm

and got past this error.
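The "bad major version at offset=6" wording makes sense once you know that bytes 6 and 7 of a class file hold its major version number: 48 for Java 1.4, 50 for Java 6, 52 for Java 8. A JVM refuses classes with a higher number than it supports. Here is a small sketch for reading that number out of a .class file. The function name is hypothetical, and a class would first have to be extracted from its jar, for example with unzip.

```shell
#!/bin/sh
# Sketch: print the class-file major version of a compiled .class file.
# Bytes 6 and 7 (zero-based) hold the major version as a big-endian integer.
class_major_version() {
    [ -r "$1" ] || return 1
    set -- $(od -An -j6 -N2 -tu1 "$1")
    echo $(( $1 * 256 + $2 ))
}
```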

April 2013 update
Just when you thought every possible error was covered, you encounter a new one. Cognos Mobile isn’t working so well on actual mobile devices so they wanted to try a Fixpack from IBM. No problem, right? They gave me

up_cogmob_linuxi38664h_10.2.1102.33_ml.rar

and I set to work. I don’t particularly like rar files for Linux, but I figured out there is an unrar command:

$ unrar e up_*rar

But after setting up my DISPLAY environment variable I get this new error running ./issetup:

X Error of failed request:  BadDrawable (invalid Pixmap or Window parameter)
  Major opcode of failed request:  14 (X_GetGeometry)
  Resource id in failed request:  0x2
  Serial number of failed request:  257
  Current serial number in output stream:  257
IDS_MSG_PREFIXIDS_COPYRIGHT_LOGOIDS_MSG_PREFIXIDS_MSG_READ_ARCHIVE

The solution? They downloaded a tar.gz version of the Fixpack. I unpacked that and had absolutely no problems with issetup! The really strange thing is that issetup is an identical file in both packages; I used cksum to do a quick compare. Even the setup.csp files are identical. I did an strace -f of the two cases but the salient difference didn’t pop out at me. The tar.gz does seem to contain fewer files.
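Comparing the two unpacked trees file by file is easy to script. The following sketch, with a made-up function name, checksums every file in each directory and diffs the two listings, which shows both changed files and files present on only one side.

```shell
#!/bin/sh
# Sketch: compare two unpacked directory trees by checksum.
# Prints files that differ or exist on only one side; returns 0 if identical.
compare_trees() {
    a=$(mktemp) && b=$(mktemp) || return 1
    (cd "$1" && find . -type f -exec cksum {} + | sort -k3) > "$a"
    (cd "$2" && find . -type f -exec cksum {} + | sort -k3) > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return $status
}
```

Lines prefixed with < come from the first tree and lines prefixed with > from the second.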

Another random error you will encounter sooner or later

You are doing a Save in cogconfig and you get:

13/05/2013,17:39:05,Err,CAM-CRP-1132 An error occurred while attempting to request a certificate from the Certificate Authority service. Unable to connect to the Certificate Authority service. Ensure that the Content Manager computer is configured and that the IBM Cognos services on it are currently running. Reason: java.net.ConnectException: Connection refused, com.cognos.crconfig.data.crypto.ConfiguringSession.configure(ConfiguringSession.java:35)com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:3037)com.cognos.crconfig.data.DataManager$4.run(DataManager.java:4169)com.cognos.crconfig.data.CnfgActionEngine$CnfgActionThread.run(CnfgActionEngine.java:394)com.cognos.crconfig.data.crypto.ConfiguringSession.configure(ConfiguringSession.java:35)com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:3037)com.cognos.crconfig.data.DataManager$4.run(DataManager.java:4169)com.cognos.crconfig.data.CnfgActionEngine$CnfgActionThread.run(CnfgActionEngine.java:394)com.cognos.crconfig.data.crypto.ConfiguringSession.configure(ConfiguringSession.java:35)com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:3037)com.cognos.crconfig.data.DataManager$4.run(DataManager.java:4169)com.cognos.crconfig.data.CnfgActionEngine$CnfgActionThread.run(CnfgActionEngine.java:394)

This looks scary but has an easy fix. You aren’t communicating with the app server. Probably their dispatcher services are down. Bring them up and it should work fine – it did for me. This is assuming of course that you have your dispatcher URLs set up correctly.

I cloned my Cognos web gateway and got this error
I waited for a few weeks to examine the clone. I ran

$ ./cogconfig.sh

and got this error:

16/05/2013,15:57:35,Err,CAM-CRP-1280 An error occurred while trying to decrypt using the system protection key. Reason: javax.crypto.IllegalBlockSizeException: Input length (with padding) not multiple of 16 bytes

Umm. I don’t have the solution yet. One thing is most highly suspect: in the meantime we re-generated the keys on the production web gateway. So I am hoping that is all we need to do here as well.

Resolved. Here is the process I followed – a sort of colonic for Cognos:

$ cd /opt/cognos10/configuration; rm csk/* signkeypair/* encryptkeypair/* cogstartup.xml
$ cd ../bin64; ./cogconfig.sh

Then in the GUI I re-defined the app servers in the dispatcher URI portion of the environment.
Then did a Save.
Worked like a champ – four green check marks.

cogconfig hangs
This happened to me on an older server. The IBM Cognos Configuration screen displays but it’s supposed to exit so you can get to the part where you edit the configuration and it never does.

Currently no known solution.

June 2018 update
Cognos 11 install problem
The Cognos 11 install was going pretty well. Until it came time to launch cogconfig. That generated this error:

cognos10:/web/cognos11/bin64> ./cogconfig.sh

Using /usr/lib64/jvm/jre/bin/java
Exception in thread "main" java.lang.UnsupportedClassVersionError: JVMCFRE003 bad major version; class=com/cognos/accman/jcam/crypto/CAMCryptoException, offset=6
        at java.lang.ClassLoader.defineClass(ClassLoader.java:286)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:74)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:538)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
        at java.net.URLClassLoader.access$300(URLClassLoader.java:77)
        at java.net.URLClassLoader$ClassFinder.run(URLClassLoader.java:1041)
        at java.security.AccessController.doPrivileged(AccessController.java:448)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:427)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:676)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:358)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:642)
        at java.lang.J9VMInternals.verifyImpl(Native Method)
        at java.lang.J9VMInternals.verify(J9VMInternals.java:73)
        at java.lang.J9VMInternals.initialize(J9VMInternals.java:133)
        at com.cognos.cclcfgapi.CCLConfigurationFactory.getInstance(CCLConfigurationFactory.java:59)
        at com.cognos.crconfig.CnfgPreferences.<init>(CnfgPreferences.java:51)
        at com.cognos.crconfig.CnfgPreferences.<clinit>(CnfgPreferences.java:36)
        at java.lang.J9VMInternals.initializeImpl(Native Method)
        at java.lang.J9VMInternals.initialize(J9VMInternals.java:199)
        at CRConfig.main(CRConfig.java:144)

Note my system java version is woefully out-of-date:

$ /usr/lib64/jvm/jre/bin/java -version

java version "1.6.0"
Java(TM) SE Runtime Environment (build pxa6460sr16fp15-20151106_01(SR16 FP15))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr16fp15-20151020_272943 (JIT enabled, AOT enabled)
J9VM - 20151020_272943
JIT  - r9_20151019_103450
GC   - GA24_Java6_SR16_20151020_1627_B272943)
JCL  - 20151105_01

whereas the Cognos-supplied Java is two versions ahead:
cognos10:/web/cognos11> ./jre/bin/java -version

java version "1.8.0"
Java(TM) SE Runtime Environment (build pxa6480sr4fp10-20170727_01(SR4 FP10))
IBM J9 VM (build 2.8, JRE 1.8.0 Linux amd64-64 Compressed References 20170722_357405 (JIT enabled, AOT enabled)
J9VM - R28_20170722_0201_B357405
JIT  - tr.r14.java_20170722_357405
GC   - R28_20170722_0201_B357405_CMPRSS
J9CL - 20170722_357405)
JCL - 20170726_01 based on Oracle jdk8u144-b01

Instead of the previous approach which involved upgrading the system Java, I decided to just try the Java version Cognos itself had installed. In the following commands note that my installation directory was /web/cognos11.

$ cd /web/cognos11; export JAVA_HOME=`pwd`/jre
$ cd bin64; ./cogconfig.sh

Using /web/cognos11/jre/bin/java
06/06/2018,11:13:04,Dbg,Use Customized settings for font and color.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/web/cognos11/bin/slf4j-nop-1.7.23.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/web/cognos11/configuration/utilities/config-util.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.helpers.NOPLoggerFactory]
06/06/2018,11:13:10,Dbg,The original cogstartup.xml file is clear text. Don't back it up.

That is to say, it worked! I’ve often seen software packages install their own versions of Java. This is the first time I thought to take advantage of that. Wish I had thought of this approach during the Cognos 10 install!

Tags Cognos, Cognos 10, Cognos 11, cognos mobile

Admin Perl Web Site Technologies

Turning HP SiteScope into SiteScope Classic with Perl

Post author By john
Post date February 28, 2012
15 Comments on Turning HP SiteScope into SiteScope Classic with Perl

Intro
HP SiteScope is a terrific web application tool and not too expensive for those who have any kind of a budget. The built-in monitor types are a bit limited, but since it allows calls to user-provided scripts your imagination is the only real limitation. For those with too many responsibilities and too little time on their hands it is a real productivity enhancer.

I’ve been using the product for 12 years now – since it was Freshwater SiteScope. I still have misgivings about the interface change introduced some years ago when it was part of Mercury. It went from simple and reliable to Java, complicated and flaky. To this day I have to re-start a SiteScope screen in my browser on a daily basis as the browser cannot recover from a server restart or who knows what other failures.

So I longed for the days of SiteScope Classic. We kept it running for as long as possible, years in fact. But at some point there were no more releases created for the classic view. So I investigated the feasibility of creating my own conversion tool. And…partially succeeded. Succeeded to the point where I can pull up the web page on my Blackberry and get the statuses and history. Think you can do that with regular HP SiteScope? I can’t. Maybe there’s an upgrade for it, but still. It’s nice to have the classic interface when you want to pull up the statuses as quickly as possible, regardless of the Blackberry display issue.

Looking back at my code, I obviously decided to try my hand at OO (object oriented) programming in Perl, with mixed results. Perl’s OO syntax isn’t the best, which addles comprehension. Without further ado, let’s jump into it.

The Details
It relies on something I noticed, that this URL on your HP SiteScope server, http://localhost:8080/SiteScope/services/APIConfigurationImpl?method=getConfigurationSnapshot, contains a tree of relationships of all the monitors. Cool, right? But it’s not a tree like you or I would design. Between parent and child is an intermediate layer. I suppose you need that because a group can contain monitors (my only focus in this exercise), but it can also contain alerts and maybe some other properties as well. So I guess the intermediate layer gives them the flexibility to represent all that, though it certainly added to my complication in parsing it. That’s why you’ll see the concern over “grandkids.” I developed a recursive, web-enabled Perl program to parse through this xml. That gives me the tools to build the nice hierarchical groupings. But it does not give me the statuses.

For the status of each monitor I wrote a separate scraper script that simply reads the entire daily SiteScope log every minute! Crude, but it works. I use it for an installation with hundreds of monitors and a log file that grows to 9 MB by the end of the day so I know it scales to that size. Beyond that it’s untested.
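The scraper itself is not reproduced here, but its job can be sketched. The main program below expects /tmp/monitorstats.txt to hold one tab-separated line per monitor: name, status, value. The sketch assumes a tab-separated log with a hypothetical column layout, so check it against a real SiteScope log before relying on it. It keeps only the most recent entry for each monitor. The function name is also made up.

```shell
#!/bin/sh
# Sketch: reduce a tab-separated monitor log to the latest status per monitor.
# ASSUMED column layout (hypothetical; verify against a real log):
#   1 timestamp, 2 status (good|warning|error), 3 group, 4 monitor name, 5 value
latest_monitor_status() {
    awk -F '\t' '$2 ~ /^(good|warning|error)$/ { status[$4] = $2; value[$4] = $5 }
         END { for (m in status) printf "%s\t%s\t%s\n", m, status[m], value[m] }' "$1" |
        sort
}
```

Run from cron every minute with its output redirected to /tmp/monitorstats.txt, something along these lines would feed the main program. Later log lines overwrite earlier ones, so the result reflects the last reading of the day so far.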

In addition to giving only the relationships, the xml also changes with every invocation. It attaches ID numbers to the monitors which initially you think is a nice unique identifier, but they change from invocation to invocation! So an additional challenge was to match up the names of the monitors in the xml output to the names as recorded in the SiteScope log. Also a bit tricky, but in general doable.

So without further ado, here’s the source code for the xml parser and main program which gets called from the web:

#!/usr/bin/perl
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# build v.simple SiteScope web GUI appropriate for smartphones
# 7/2010
#
# Id is our package which defines the Id class
use Id;
use CGI::Pretty;
my $cgi=new CGI;
$DEBUG = 0;
# GIF location on SiteScope classic
$ssgifs = "/artwork/";
$health{good} = qq(<img src="${ssgifs}okay.gif">);
$health{error} = qq(<img src="${ssgifs}error.gif">);
$health{warning} = qq(<img src="${ssgifs}warning.gif">);
# report CGI
$rprt = "/SS/rprt";
# the frustrating thing is that this xml output changes almost every time you call it
$url = 'http://localhost:8080/SiteScope/services/APIConfigurationImpl?method=getConfigurationSnapshot';
# get current health of all monitors - which is scraped from the log every minute by a hilgarj cron job
$monitorstats = "/tmp/monitorstats.txt";
print "Content-type: text/plain\n\n" if $DEBUG;
open(MONITORSTATS,"$monitorstats") || die "Cannot open monitor stats file $monitorstats!!";
while(<MONITORSTATS>) {
  chomp;
  ($monitor,$status,$value) = /([^\t]+)\t([^\t]+)\t([^\t]+)/;
  $monitors{"$monitor"} = $status;
  $monitorv{"$monitor"} = $value;
}
open(CURL,"curl $url 2>/dev/null|") || die "cannot open $url for reading!!\n";
my %myobjs = ();
# the xml is one long line!
@lines = <CURL>;
#print "xml line: $lines[0]\n" if $DEBUG;
@multiRefs = split "<multiRef",$lines[0];
#parse multiRefs
# create top-level object
my $id = Id->new (
      id => "id0");
# hash of this object with id as key
$myobjs{"id0"} = $id;
 
# first build our objects...
foreach $mref (@multiRefs) {
  next unless $mref =~ /\sid=/;
#  id="id0" ...
  ($parentid) =  $mref =~ /id=\"(id\d+)/;
  print "parentid: $parentid\n" if $DEBUG;
# watch out for <item><key xsi:type="soapenc:string">groupSnapshotChildren</key><value href="#id3 ...
# vs <item><key xsi:type="soapenc:string">Network</key><value href="#id40"/>
  print "mref: $mref\n" if $DEBUG;
  @ids = split /<item><key/, $mref;
# then loop over ids mentioned in this mref
  foreach $myid (@ids) {
    next unless $myid =~ /href="#(id\d+)/;
    next unless $myobjs{"$parentid"};
# types include group, monitor, alert
    ($typebyregex) = $myid =~ />snapshot_(\w+)SnapshotChildren</;
    $parenttype = $myobjs{"$parentid"}->type();
    $type = $typebyregex ? $typebyregex : $parenttype;
    print "type: $type\n" if $DEBUG;
# skip alert definitions
    next if $type eq "alert";
    print "myid: $myid\n" if $DEBUG;
    ($actualid) = $myid =~ /href="#(id\d+)/;
    print "actualid: $actualid\n" if $DEBUG;
# construct object
    my $id = Id->new (
      id => $actualid,
      type => $type,
      parentid => $parentid );
# build hash of these objects with actualid as key
    $myobjs{$actualid} = $id;
# addchild to parent. note that parent should already have been encountered
    $myobjs{"$parentid"}->addchild($actualid);
    if ($myid !~ /groupSnapshotChildren/) {
# interesting child - has name (every other generation has no name!)
      ($name) = $myid =~ /string\">(.+?)<\/key/;  # use non-greedy operator
      print "name: $name\n" if $DEBUG;
# some names are not of interest to us: alerts, which end in "error" or "good"
      if ($name !~ /(error|good)$/) {
# name may not be unique - get extended name which include all parents
        if (defined $myobjs{"$parentid"}->parentid()) {
          $gdparid = $myobjs{"$parentid"}->parentid();
          $gdparname = $myobjs{"$gdparid"}->extname();
# extname -> extended, or distinguished name.  Should be unique
          $extname = $gdparname. '/' . $name;
        } else {
# 1st generation
          print "1st generation\n" if $DEBUG;
          $extname = $name;
        }
        print "extname: $extname\n" if $DEBUG;
        $id->name($name);
        $id->extname($extname);
        $id->isanamedid(1);
        $myobjs{"$parentid"}->hasnamedkids(1); # want to mark its parent as "special"
# we also need our hash to reference objects by extended name since id changes with each extract and name may not be unique
        $myobjs{"$extname"} = $id;
      } # end conditional over desirable name check
    } else {
      $id->isanamedid(0);
    }
  }
}
#
# now it's all parsed and our objects are alive. Let's build a web site!
#
# build a cookie containing path
my $pi = $ENV{PATH_INFO};
$script = $ENV{SCRIPT_NAME};
$ua = $ENV{HTTP_USER_AGENT};
# Blackberry browser test
$BB = $ua =~ /^BlackBerry/i ? 1 : 0;
$MSIE = $ua =~ /MSIE /;
# font-size depends on browser
$FS = "font-size: x-small;" if $MSIE;
$cookie = $cgi->cookie("pathinfo");
$uri = $script . $pi;
$cookie=$cgi->cookie(-name=>"pathinfo", -value=>"$uri");
print $cgi->header(-type=>"text/html",-cookie=>$cookie);
($url) = $pi =~ m#([^/]+)$#;
#  -title=>'SmartPhone View',
# this doesn't work, sigh...
#print $cgi->start_html(-head=>meta({-http_equiv=>'Refresh'}));
print qq( <HEAD>
<meta http-equiv="Expires" content="0">
<meta http-equiv="Pragma" content="no-cache">
<meta HTTP-EQUIV="Refresh" CONTENT="60; URL=$url">
<TITLE>SiteScope Classic $url Detail</TITLE>
<style type="text/css">
a.good {color: green; }
a.warning {color: green; }
a.error {color: red; }
td {font-family: Arial, Helvetica, sans-serif; $FS}
p.ss {font-family: Arial, Helvetica, sans-serif;}
</style>
<link rel="shortcut icon" href="/favicon.ico" type="image/x-icon" />
<script type=text/javascript>
function changeme(elemid,longvalue)
{
document.getElementById(elemid).innerText=longvalue;
}
function restoreme(elemid,truncvalue)
{
document.getElementById(elemid).innerText=truncvalue;
}
</script>
</HEAD><body>
);
 
#print $cgi->h1("This is the heading");
# parse path
# top lvl name:2nd lvl name:3rd lvl name
$altpi = $cgi->path_info();
print $cgi->p("pi is $pi") if $DEBUG;
#print $cgi->p("altpi is $altpi");
# relative url
$rurl = $cgi->url(-relative=>1);
if ($pi eq "") {
# the top
# top id is id3
  print qq(<p class="ss">);
  $myid = "id3";
  foreach $kid ($myobjs{"$myid"}->get_children()) {
    my $kidname = $myobjs{"$kid"}->name();
# kids can be subgroups or standalone monitors
    my $health = recurse("/$kidname");
    print "$health{$health} <a href=\"$rurl/$kidname\">$kidname</a><br>\n";
    $prodtest = $kid if $kidname eq "Production";
  }
  print "</p>\n";
} else {
  $extname = $pi;
  print "pi,name,extname,script: $pi,$name,$extname,$script\n" if $DEBUG;
# print where we are
  $uriname = $pi;
  $uriname =~ s#^/##;
  #print $cgi->p("name is $name");
  #print $cgi->p("uriname is $uriname");
  $uricompositepart = "/";
  @uriparts = split('/',$uriname);
  $lastpart = pop @uriparts;
  print qq(<p class="ss"><a href="$script"><b>Sitescope</b></a><br>);
  print qq(<b>Monitors in: );
  foreach $uripart (@uriparts) {
    my $healthp = recurse("$uricompositepart$uripart");
# build valid link
    ##$link = qq(<a class="good" href="$script$uricompositepart$uripart">$uripart</a>: );
    $link = qq(<a class="$healthp" href="$script$uricompositepart$uripart">$uripart</a>: );
    $uricompositepart .= "$uripart/";
    print $link;
  }
  my $healthp = recurse("$uricompositepart$lastpart");
  $color = $healthp eq "error" ? "red" : "green";
  print qq(<font color="$color">$lastpart</font></b></p>\n);
  print qq(<table border="1" cellspacing="0">);
  #print qq(<table>);
  %hashtrs = ();
  foreach $kid ($myobjs{"$extname"}->get_children()) {
    print "kid id: " . $myobjs{"$kid"}->id() . "\n" if $DEBUG;
    next unless $myobjs{"$kid"}->hasnamedkids();
    foreach $gdkid ($myobjs{"$kid"}->get_children()) {
      print "gdkid id: " . $myobjs{"$gdkid"}->id() . "\n" if $DEBUG;
      $gdkidname = $myobjs{"$gdkid"}->name();
      $gdkidextname = $myobjs{"$gdkid"}->extname();
      my $health = recurse("$gdkidextname");
      my $type = $myobjs{"$gdkid"}->type();
# dig deeper to learn health of the grandkid's grandkids
      $objct = $healthct{good} = $healthct{error} = $healthct{warning} = 0;
      foreach $ggkid ($myobjs{"$gdkidextname"}->get_children()) {
        print "ggkid id: " . $myobjs{"$ggkid"}->id() . "\n" if $DEBUG;
        next unless $myobjs{"$ggkid"}->hasnamedkids();
        foreach $gggdkid ($myobjs{"$ggkid"}->get_children()) {
          print "gggdkid id: " . $myobjs{"$gggdkid"}->id() . "\n" if $DEBUG;
          $gggdkidname = $myobjs{"$gggdkid"}->name();
          $gggdkidextname = $myobjs{"$gggdkid"}->extname();
          my $health = recurse("$gggdkidextname");
          $objct++;
          $healthct{$health}++;
        }
      }
      $elemct++;
      $elemid = "elemid" . $elemct;
# groups should have distinctive cell background color to set them apart from monitors
      if ($type eq "group") {
        $bgcolor = "#F0F0F0";
        $celllink = "$lastpart/$gdkidname";
        $truncvalue = qq(<font color="red">$healthct{error}</font>/$objct);
        $tdval = $truncvalue;
      } else {
        $bgcolor = "#FFFFFF";
        $celllink = "$rprt?$gdkidname";
# truncate monitor value to save display space
        $longvalue = $monitorv{"$gdkidname"};
        (my $truncvalue) = $monitorv{"$gdkidname"} =~ /^(.{7,9})/;
        $truncvalue = $truncvalue? $truncvalue : "&nbsp;";
        $tdval = qq(<span id="$elemid" onmouseover="changeme('$elemid','$longvalue')" onmouseout="restoreme('$elemid','$truncvalue')">$truncvalue</span>);
      }
      $hashtrs{"$gdkidname"} = qq(<tr><td bgcolor="#000000">$health{$health} </td><td>$tdval</td><td bgcolor="$bgcolor"><a href="$celllink">$gdkidname</a></td></tr>\n);
# for health we're going to have to recurse
    }
  }
# print out in alphabetical order
  foreach $key (sort(keys %hashtrs)) {
    print $hashtrs{"$key"};
  }
  print "</table>";
}
print $cgi->end_html();
#######################################
sub recurse {
# to get the union of health of all descendants
my $moniext = shift;
my ($moni) = $moniext =~ m#/([^/]+)$#;
# don't bother recursing and all that unless we have to...
return $myobjs{"$moniext"}->health() if defined $myobjs{"$moniext"}->health();
print "moni,moniext: $moni, $moniext\n" if $DEBUG;
my ($kid,$gdkidextname,$health,$cumhealth);
$cumhealth = $health = $monitors{"$moni"} ? $monitors{"$moni"} : "good";
foreach $kid ($myobjs{"$moniext"}->get_children()) {
    if ($myobjs{"$kid"}->hasnamedkids()) {
      foreach $gdkid ($myobjs{"$kid"}->get_children()) {
        $gdkidextname = $myobjs{"$gdkid"}->extname();
# for health we're going to have to recurse
        $health = recurse("$gdkidextname");
        if ($health eq "error" || $cumhealth eq "error") {
          $cumhealth = "error";
        } elsif ($health eq "warning" || $cumhealth eq "warning") {
          $cumhealth = "warning";
        }
      }
    } else {
# this kid is end of line
      $health = $monitors{"$kid"} ? $monitors{"$kid"} : "good";
      if ($health eq "error" || $cumhealth eq "error") {
        $cumhealth = "error";
      } elsif ($health eq "warning" || $cumhealth eq "warning") {
        $cumhealth = "warning";
      }
    }
}
$myobjs{"$moniext"}->health("$cumhealth");
return $cumhealth;
} # end sub recurse

I call it simply “ss” to minimize the typing required. You see it uses a package called Id.pm which I wrote to encapsulate the class and methods. Here is Id.pm:

package Id;
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# class for storing data about an id
# URL (not currently protected): http://localhost:8080/SiteScope/services/APIConfigurationImpl?method=getConfigurationSnapshot
# class for storing data about a group
use warnings;
use strict;
use Carp;
#group methods
# constructor
# get_members
# get_name
# get_id
# addmember
#
# member methods
# constructor
# get_id
# get_name
# get_type
# get_gp
# set_gp
 
sub new {
  my $class = shift;
  my $self = {@_};
  bless($self, $class);
  return $self;
}
# get-set methods, p. 355
sub parentid { $_[0]->{parentid}=$_[1] if defined $_[1]; $_[0]->{parentid} }
sub isanamedid { $_[0]->{isanamedid}=$_[1] if defined $_[1]; $_[0]->{isanamedid} }
sub id { $_[0]->{id}=$_[1] if defined $_[1]; $_[0]->{id} }
sub name { $_[0]->{name}=$_[1] if defined $_[1]; $_[0]->{name} }
sub extname { $_[0]->{extname}=$_[1] if defined $_[1]; $_[0]->{extname} }
sub type { $_[0]->{type}=$_[1] if defined $_[1]; $_[0]->{type} }
sub health { $_[0]->{health}=$_[1] if defined $_[1]; $_[0]->{health} }
sub hasnamedkids { $_[0]->{hasnamedkids}=$_[1] if defined $_[1]; $_[0]->{hasnamedkids} }
 
# get children - use anonymous array, book p. 221-222
sub get_children {
# return empty list if the children array hasn't been defined...
# (defined @array is no longer allowed in modern Perl, so test the reference instead)
  $_[0]->{children} ? @{$_[0]->{children}} : ();
}
# adding children
sub addchild {
  $_[0]->{children} = [] unless defined  $_[0]->{children};
  push @{$_[0]->{children}},$_[1];
}
 
1;

ss also assumes the existence of just a few of the images from SiteScope Classic – the green circle for good, red diamond for error, yellow for warning, etc. I borrowed them from SiteScope Classic.

Here is the code for the log scraper:

#!/usr/bin/perl
# analyze SiteScope log file
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# 8/2010
$DEBUG = 0;
$logdir = "/opt/SiteScope/logs";
$monitorstats = "/tmp/monitorstats.txt";
$monitorstatshis = "/tmp/monitorstats-his.txt";
$date = `date +%Y_%m_%d`;
chomp($date);
$file = "$logdir/SiteScope$date.log";
open(LOG,"$file") || die "Cannot open SiteScope log file: $file!!\n";
# example lines:
# 16:51:07 08/02/2010     good    LDAPServers     LDAP SSL test : ldapsrv.drj.com exit: 0, 0.502 sec    1:3481  0       502
#16:51:22 08/02/2010     good    Network DNS: (AMEAST) ns2  0.033 sec   2:3459      200     33      ok
#16:51:49 08/02/2010     good    Proxy   proxy.pac script on iwww    0.055 sec   2:12467 200     55   ok     4288    1280782309      0    0  55      0       0      200  0
#16:52:04 08/02/2010     good    Proxy   Disk Space: earth /logs   66% full, 13862MB free, 41921MB total 3:3598      66      139862
#16:52:09 08/02/2010     good    DrjExtranet  URL: wwwsecure.drj.com     0.364 sec    1:3604      200     364  ok 26125   1280782328     0    0   358     4       2       200  0
while(<LOG>) {
  ($time,$date,$status,$group,$monitor,$value) = /(\S+)\s(\S+)\t(\S+)\t(\S+)\t([^\t]+)\t([^\t]+)/;
  print '$time,$date,$status,$group,$monitor,$value' . "$time,$date,$status,$group,$monitor,$value\n" if $DEBUG;
  next if $group =~ /__health__/; # don't care about these lines
  $mons{"$monitor"} = 1;
  push @{$mont{"$monitor"}} , $time;
  push @{$mond{"$monitor"}} , $date;
  push @{$monh{"$monitor"}} , $status;
  push @{$monv{"$monitor"}} , $value;
}
# open output at last moment to minimize chances of reading while locked for writing
open(MONITORSTATS,">$monitorstats") || die "Cannot open monitor stats file $monitorstats!!\n";
open(MONITORSTATSHIS,">$monitorstatshis") || die "Cannot open monitor stats file $monitorstatshis!!\n";
# write it all out - will always print the latest values
foreach $monitor (keys %mons) {
# dereference our anonymous arrays
  @times = @{$mont{"$monitor"}};
  @dates = @{$mond{"$monitor"}};
  @status = @{$monh{"$monitor"}};
  @value = @{$monv{"$monitor"}};
# last element is the latest measured status and value
  print MONITORSTATS "$monitor\t$status[-1]\t$value[-1]\n";
  print MONITORSTATSHIS "$monitor\n";
  #for ($i=-11;$i<0;$i++) {
# put latest measure on top
  for ($i=-1;$i>-13;$i--) {
    $time = defined $times[$i] ? $times[$i] : "NA";
    $date = defined $dates[$i] ? $dates[$i] : "NA";
    $stat = defined $status[$i] ? $status[$i] : "NA";
    $val = defined $value[$i] ? $value[$i] : "NA";
    print MONITORSTATSHIS "\t$time\t$date\t$stat\t$val\n";
  }
}

As I said it gets called every minute by cron.
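For a quick look at what the scraper left behind, the snapshot file can be filtered from the shell. This is only a sketch: the function name list_unhealthy is made up for illustration, and it assumes the tab-separated monitor, status, value layout that the scraper writes to /tmp/monitorstats.txt.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original scripts): list every monitor
# whose latest status is something other than "good".
# Assumes one line per monitor: name<TAB>status<TAB>value
list_unhealthy() {
  awk -F '\t' '$2 != "good" { print $1 ": " $2 }' "$1"
}

# Example: list_unhealthy /tmp/monitorstats.txt
```

An empty result means every monitor in the snapshot reported good on its last run.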

That’s it! I enter the url sitescope.drj.com/SS/ss to access the main program which gets executed because I made /SS a CGI-BIN directory.

This gives you a read-only, Java-free view into your SiteScope status and hierarchy which harkens back to the good old days of Freshwater SiteScope.

Know your limits
What it does not do, unfortunately, is allow you to run a monitor – that seems like the next most simple thing which I should have been able to do but couldn’t figure out – much less define new monitors (never going to happen) or alerts.

I use this successfully against my HP SiteScope instance of roughly 400 monitors which itself is on a VM and there is no apparent strain. At some point this simple-minded script would no longer scale to suit the task at hand, but it might be good for up to a few thousand monitors.

And now a word about open source alternatives
Since I was so enamored with SiteScope Classic there seemed to be no compelling reason to shell out the dough for HP SiteScope with its unwanted interface, so I briefly looked around at free alternatives. Free sounds good, right? Not so much in practice. Out there in Cyberspace there is an enthusiast for a product called Zabbix. I just want to go on the record that Zabbix is the most confused piece of junk I have run across. You are getting less than what you paid for ($0) because you will be wasting a lot of time with it, and in the end it isn’t all that capable. Nagios also had its limits – I can’t remember the exact reason I didn’t go down that route, but there were definite reasons.

HP SiteScope is no panacea. “HP” and “stifling bureaucracy” need to be mentioned in the same sentence. Every time we renew support it is the most confusing mess of line items. Every time there’s a new cast of characters over at HP who know nothing about the account’s history. You practically have to beg them to accept your money for a low-budget item like SiteScope because they really don’t pursue it in any way. Then their SAID and contract numbers stuff is confusing if you only see it once every few years.

Conclusion
A conversion program does exist for turning the finicky HP SiteScope Java-encumbered view into pure SiteScope Classic because I wrote it! But it’s a limited read-only view. Still, it’s helpful in a pinch and can even be viewed on the Blackberry’s browser.

Another problem is that HP has threatened to completely change the API so this tool, which is designed for HP SiteScope v 10.12, will probably completely break for newer versions. Oh, well.

References
This post shows some silly mistakes to avoid when doing a minor upgrade in version 11.

Tags HP SiteScope, monitoring, nagios, SiteScope Classic, zabbix

Internet Mail

How to run sendmail in queue-only mode

Post author By john
Post date February 24, 2012
4 Comments on How to run sendmail in queue-only mode

Intro
I guess I’ve ragged on sendmail before. Incredibly powerful program. Finding out how to do that simple thing you want to do may not be so easy, even with the bible at your side. So to that end I’m making an effort to document those simple things which I’ve found I’ve struggled with.

The Details
Today I wanted to capture all email coming into my sendmail daemon. Well, actually it’s a little more complicated. I didn’t want to disturb production email, but I wanted to capture a spam sample. Today there was a hugely effective spam campaign purporting to be email from the Better Business Bureau (BBB). All the emails however actually came from various senders @aicpa.org. Postini put a filter in place but I knew more were getting through. But they weren’t coming to me. How to capture them without disturbing users?

In this post I gave some obscure but useful tips for sendmail admins, including the ever-useful smarttable add-on. To reprise, smarttable allows you to make delivery decisions based on sender! That’s totally antithetical to your run-of-the-mill sendmail admin, but it’s really useful… Like now. So I quickly put up a sendmail instance, copying a working config I use in production. But I changed the listener to IP address 127.0.0.2 (which I fortunately had already set up for some other reason I can no longer recall). That one’s pretty standard. That’s just:

DAEMON_OPTIONS(`Name=sm-cap, Addr=127.0.0.2')dnl

Of course you want to create a new queue directory just for the captured emails. I created /mqueue/c0 and put this line into my .mc file:

define(`QUEUE_DIR', `/mqueue/c*')dnl

And here’s the main point, how to defer delivery of all emails. Sendmail actually distinguishes between defer and queueonly. I chose queueonly thusly:

define(`confDELIVERY_MODE',`queueonly')dnl

If by chance you happen to misspell DELIVERY_MODE, like, let’s say, DELIERY_MODE, you don’t seem to get a whole lot of errors. Not that that would ever happen to us, mind you, I’m just saying. That’s why it’s good to also know about the command-line option. Keep reading for that.
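Since a misspelled macro fails quietly, it can help to look at the generated .cf file directly. The following is a minimal sketch, assuming the m4 build writes an active option as a line beginning with "O DeliveryMode=" and leaves the default commented out as "#O DeliveryMode=...". The function name delivery_mode is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the DeliveryMode set in a sendmail .cf file.
# Assumption: an active setting appears as a line "O DeliveryMode=<mode>";
# a commented-out default ("#O DeliveryMode=...") is deliberately ignored.
delivery_mode() {
  mode=$(sed -n 's/^O DeliveryMode=//p' "$1")
  # Return non-zero when no active setting was found
  [ -n "$mode" ] && printf '%s\n' "$mode"
}

# Example: delivery_mode /etc/mail/capture.cf
```

If nothing is printed, the define most likely never made it into the .cf file, which is the symptom a typo such as DELIERY_MODE would produce.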

It’s simple enough to test once you have it running (which I do with this line: sudo sendmail -bd -q -C/etc/mail/capture.cf).

> telnet 127.0.0.2 25
Trying 127.0.0.2...
Connected to 127.0.0.2.
Escape character is '^]'.
220 drj.com ESMTP server ready at Fri, 24 Feb 2012 15:16:40 -0500
helo localhost
250 drjemgw2.drj.com Hello [127.0.0.2], pleased to meet you
mail from: asd@gmail.com
250 2.1.0 asd@gmail.com... Sender ok
rcpt to: drj@drj.com
250 2.1.5 drj@drj.com... Recipient ok
data
354 Enter mail, end with "." on a line by itself
subject: test of the capture-only sendmail instance

Just a test!
-Dr J
.
250 2.0.0 q1OKGet2008636 Message accepted for delivery
quit
221 2.0.0 drj.com closing connection
Connection closed by foreign host.

Is the message there, queued up the way we’d like? You bet:

> ls -l /mqueue/c0

total 16
-rw------- 1 root root  19 2012-02-24 15:17 dfq1OKGet2008636
-rw------- 1 root root 542 2012-02-24 15:17 qfq1OKGet2008636
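As the listing shows, each captured message leaves a df (body) file and a qf (envelope) file in the queue directory. A small sketch for keeping an eye on how many messages have piled up, counting only the qf files. The function name count_queued is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count queued messages in a sendmail queue directory
# by counting the qf* files (one per message, as in the listing above).
count_queued() {
  find "$1" -type f -name 'qf*' | wc -l | tr -d '[:space:]'
}

# Example (reading the queue normally requires root): count_queued /mqueue/c0
```

sendmail's own queue listing (sendmail -bp, pointed at the capture config with -C) should give a more detailed view. The count above is just a quick check.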

There also seems to be a second way to run sendmail in queue-only fashion. I got it to work from the command-line like this:

> sudo sendmail -odqueueonly -bd -C/etc/mail/capture.cf

The book says this is deprecated usage, however. But let’s see, that’s O’Reilly’s Sendmail 3rd edition, published in 2003, we’re in 2012, so, hmm, they still haven’t cut us off…

One last thing, that smarttable entry for my main sendmail daemon. I added the line:

@aicpa.org relay:[127.0.0.2]

Conclusion
It can be useful to queue all incoming emails for various reasons. It’s a little hard to find out how to do this precisely. We found a way to do this without stopping/starting our main sendmail process. This post shows a couple ways to do it, and why you might need to.

May 2012 Update
Just wanted to mention how I handle BBB email now. They told me they maintain an accurate SPF record. Sure enough, they do. Now we only accept bbb.org email when the SPF record is a match. But I don’t use sendmail for that, I use Postini’s (OK, Google’s, technically) mail hygiene service. Postini rocks!
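To see what a domain publishes, the SPF record can be picked out of its TXT records. This is a sketch only: it assumes dig is installed and that its +short output wraps each TXT record in double quotes. The function name spf_records is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only the SPF entries from TXT record output.
# Assumption: input is in the style of "dig +short TXT <domain>", one record
# per line, each wrapped in double quotes.
spf_records() {
  grep '^"v=spf1'
}

# Example (needs network access): dig +short TXT bbb.org | spf_records
```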

My most recent post on how to tame the confounding sendmail log is here.

Tags Postini, queue-only, sendmail, smarttable, SPF

Apache

Running CGI Scripts from any Directory with Apache

Post author By john
Post date February 24, 2012
1 Comment on Running CGI Scripts from any Directory with Apache

Intro
This is a really basic issue, but the documentation out there often doesn’t speak directly to this single issue – other things get thrown into the mix. This document is to show how to enable the running of CGI scripts from any directory under htdocs with the Apache2 web server.

The Details
Let’s get into it. CGI – common gateway interface – is a great environment for simple web server programs. But if you’ve got a fairly generic apache2 configuration file and are trying CGI you might encounter a few different errors before getting it right. Here’s the contents of my test.cgi file:

#!/bin/sh
echo "Content-type: text/html"
echo "Location: http://drjohnstechtalk.com"
echo ""

Not tuning my dflt.conf apache2 configuration file in any way, I find to my horror that the cgi file contents get returned as is initially, just as if it were no more special than a random test file:

> curl -i localhost/test/test.cgi

HTTP/1.1 200 OK
Date: Fri, 24 Feb 2012 14:07:37 GMT
Server: Apache/2
Last-Modified: Fri, 24 Feb 2012 14:05:29 GMT
ETag: "56d8-5c-4b9b640c9dc40"
Content-Length: 92
Content-Type: text/plain
 
#!/bin/sh
echo "Content-type: text/html"
echo "Location: http://drjohnstechtalk.com"
echo ""

That’s no good.

Now I do some research and realize the following line should be added. Here I show it in its context:

<Directory "/usr/local/apache2/htdocs">
# make .cgi and .pl extensions valid CGI types
    AddHandler cgi-script .cgi .pl
    ...

and we get…
> curl -i localhost/test/test.cgi

HTTP/1.1 403 Forbidden
Date: Fri, 24 Feb 2012 14:11:55 GMT
Server: Apache/2
Content-Length: 215
Content-Type: text/html; charset=iso-8859-1
 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access /test/test.cgi
on this server.</p>
</body></html>

Hmmm. Better, perhaps, but definitely not working. What did we miss? Chances are we had something like this in our dflt.conf:

<Directory "/usr/local/apache2/htdocs">
# make .cgi and .pl extensions valid CGI types
    AddHandler cgi-script .cgi .pl
    Options Indexes FollowSymLinks
    ...

but what we need is this:

<Directory "/usr/local/apache2/htdocs">
# make .cgi and .pl extensions valid CGI types
    AddHandler cgi-script .cgi .pl
    Options Indexes FollowSymLinks ExecCGI
    ...

Note the addition of the ExecCGI to the Options statement.
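Both pieces have to be present, so a crude check of the config file can save a restart or two. This is a minimal sketch under stated assumptions: the function name cgi_enabled is made up, and it only looks for the two directives anywhere in the file without checking which Directory block they belong to.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the config file contains both an
# "AddHandler cgi-script" line and an Options line that includes ExecCGI.
# It does not verify that both sit in the same <Directory> block.
cgi_enabled() {
  grep -Eq '^[[:space:]]*AddHandler[[:space:]]+cgi-script' "$1" &&
  grep -Eq '^[[:space:]]*Options[[:space:]].*ExecCGI' "$1"
}

# Example:
# if cgi_enabled dflt.conf; then echo "CGI looks enabled"; else echo "check AddHandler/ExecCGI"; fi
```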

With that tweak to our apache2 configuration we get the desired result after restarting:

> curl -i localhost/test/test.cgi

HTTP/1.1 302 Found
Date: Fri, 24 Feb 2012 14:15:47 GMT
Server: Apache/2
Location: http://drjohnstechtalk.com
Content-Length: 209
Content-Type: text/html; charset=iso-8859-1
 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="http://drjohnstechtalk.com">here</a>.</p>
</body></html>

But what if we want our CGI script to be our default page? What I did for that is to add this statement:

DirectoryIndex index.html index.cgi

so now we have:

<Directory "/usr/local/apache2/htdocs">
# make .cgi and .pl extensions valid CGI types
    AddHandler cgi-script .cgi .pl
    Options Indexes FollowSymLinks ExecCGI
    DirectoryIndex index.html index.cgi
    ...

I create an index.cgi and delete the index.html and index.cgi is run from the top-level URL:

> curl -i johnstechtalk.com

HTTP/1.1 302 Found
Date: Fri, 29 Apr 2013 14:15:47 GMT
Server: Apache/2
Location: http://drjohnstechtalk.com/blog/
Content-Length: 209
Content-Type: text/html; charset=iso-8859-1
 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="http://drjohnstechtalk.com/blog/">here</a>.</p>
</body></html>
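The index.cgi itself is not shown above. A minimal sketch that would produce a redirect like the one in the output, modeled on test.cgi, might look like this. The function name emit_redirect is made up for illustration, and the real file may well differ.

```shell
#!/bin/sh
# index.cgi sketch (an assumption: the post does not show the actual file).
# Emits the CGI headers for a redirect, followed by the blank line that ends
# the header block.
emit_redirect() {
  printf 'Content-type: text/html\n'
  printf 'Location: %s\n' "$1"
  printf '\n'
}

emit_redirect "http://drjohnstechtalk.com/blog/"
```

As with test.cgi, the file has to be executable by the web server user (chmod +x index.cgi) or Apache will refuse to run it.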

Conclusion
We have shown the two most common problems that occur when trying to enable CGI execution with the apache2 web server. We have shown how to fix the configuration.

Tags Apache, CGI

TCP/IP

C++ TCP Socket Program

Intro
I was looking around for a sample TCP socket program written in C++ that might make working with TCP sockets less mysterious. I expected to find a flood of things to pick from, but that really wasn’t the case.

The Details
OK, I only looked for a few minutes, to be honest. The one I did settle on seems adequate. It’s sufficiently old, however, that it doesn’t actually work as-is. Probably if it did I wouldn’t even mention it. So I thought it was worth repeating here, with some tiny semantic updates.

What I used is from this web page: http://cs.baylor.edu/~donahoo/practical/CSockets/practical/. I was really only interested in the TCP echo client. It’s a good stand-in for any TCP client I think.

Here’s TCPechoClient.cpp:

/*
 *   C++ sockets on Unix and Windows
 *   Copyright (C) 2002
 *
 *   This program is free software; you can redistribute it and/or modify
 *   it under the terms of the GNU General Public License as published by
 *   the Free Software Foundation; either version 2 of the License, or
 *   (at your option) any later version.
 *
 *   This program is distributed in the hope that it will be useful,
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *   GNU General Public License for more details.
 *
 *   You should have received a copy of the GNU General Public License
 *   along with this program; if not, write to the Free Software
 *   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 */
 
// taken from http://cs.baylor.edu/~donahoo/practical/CSockets/practical/TCPEchoClient.cpp
 
#include "PracticalSocket.h"  // For Socket and SocketException
#include <iostream>           // For cerr and cout
#include <cstdlib>            // For atoi()
#include <cstring>            // author forgot this
 
using namespace std;
 
const int RCVBUFSIZE = 32;    // Size of receive buffer
 
int main(int argc, char *argv[]) {
  if ((argc < 3) || (argc > 4)) {     // Test for correct number of arguments
    cerr << "Usage: " << argv[0]
         << " <Server> <Echo String> [<Server Port>]" << endl;
    exit(1);
  }
 
  string servAddress = argv[1]; // First arg: server address
  char *echoString = argv[2];   // Second arg: string to echo
// DrJ test
//  echoString = "GET / HTTP/1.0\n\n";
  int echoStringLen = strlen(echoString);   // Determine input length
  unsigned short echoServPort = (argc == 4) ? atoi(argv[3]) : 7;
 
  try {
    // Establish connection with the echo server
    TCPSocket sock(servAddress, echoServPort);
 
    // Send the string to the echo server
    sock.send(echoString, echoStringLen);
 
    char echoBuffer[RCVBUFSIZE + 1];    // Buffer for echo string + \0
    int bytesReceived = 0;              // Bytes read on each recv()
    int totalBytesReceived = 0;         // Total bytes read
    // Receive the same string back from the server
    cout << "Received: ";               // Setup to print the echoed string
    while (totalBytesReceived < echoStringLen) {
      // Receive up to the buffer size bytes from the sender
      if ((bytesReceived = (sock.recv(echoBuffer, RCVBUFSIZE))) <= 0) {
        cerr << "Unable to read";
        exit(1);
      }
      totalBytesReceived += bytesReceived;     // Keep tally of total bytes
      echoBuffer[bytesReceived] = '\0';        // Terminate the string!
      cout << echoBuffer;                      // Print the echo buffer
    }
    cout << endl;
 
    // Destructor closes the socket
 
  } catch(SocketException &e) {
    cerr << e.what() << endl;
    exit(1);
  }
 
  return 0;
}


Note the cstring header file I needed to include. The standard must have changed to require this since the original code was published.

Then I needed PracticalSocket.h, but that has no changes from the original version: http://cs.baylor.edu/~donahoo/practical/CSockets/practical/PracticalSocket.h, and his Makefile is also just fine: http://cs.baylor.edu/~donahoo/practical/CSockets/practical/Makefile. For the fun of it I also set up the TCP Echo Server: http://cs.baylor.edu/~donahoo/practical/CSockets/practical/TCPEchoServer.cpp.

Run

make TCPEchoClient

and you should be good to go. How to test this TCPEchoClient against your web server? I found that the following works:

~/TCPEchoClient drjohnstechtalk.com 'GET / HTTP/1.0
Host: drjohnstechtalk.com
 
' 80

which gives this output:

Received: HTTP/1.1 301 Moved Permanently
Date: Thu, 23 Feb 2012 17:19:02

which, now that I analyze it, looks cut off. Hmm. Because with curl I have:

curl -i drjohnstechtalk.com

HTTP/1.1 301 Moved Permanently
Date: Thu, 23 Feb 2012 17:19:43 GMT
Server: Apache/2.2.16 (Ubuntu)
X-Powered-By: PHP/5.3.3-1ubuntu9.5
Location: http://www.drjohnstechtalk.com/blog/
Vary: Accept-Encoding
Content-Length: 2
Content-Type: text/html

I guess that’s what you get for demo code. At this point I don’t have a need to sort it out so I won’t. Perhaps we’ll come back to it later. Looking at it, I see the received buffer size is quite small, 32 bytes. I tried to set that to a reasonable value, 200 MBytes, but get a segmentation fault. The largest I could manage, after experimentation, is 10000000 bytes:

//const int RCVBUFSIZE = 32;    // Size of receive buffer
const int RCVBUFSIZE = 10000000;    // Size of receive buffer - why is 10 MB the max. value??

and this does indeed give us the complete output from our web server home page now.
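
A closer look suggests the buffer size is not the real culprit. The receive loop stops as soon as it has read as many bytes as it sent (echoStringLen), which is correct for an echo server but not for a web server. With a 32-byte buffer and a request of about 45 bytes, that is two reads, or 64 bytes, which matches the truncated output above. A minimal sketch of a loop that instead reads until the server closes the connection follows. The name readUntilClosed and the recvFn parameter are hypothetical and not part of PracticalSocket; recvFn stands in for sock.recv() and is assumed to return 0 once the peer has closed.

```cpp
#include <functional>
#include <string>

// Hypothetical helper, not part of PracticalSocket.
// recvFn stands in for sock.recv(buffer, size): it is assumed to return the
// number of bytes read, or 0 once the peer has closed the connection.
std::string readUntilClosed(const std::function<int(char*, int)>& recvFn) {
  const int kBufSize = 4096;      // modest buffer; the string grows as needed
  char buffer[kBufSize];
  std::string response;
  int n;
  while ((n = recvFn(buffer, kBufSize)) > 0) {
    response.append(buffer, n);   // keep exactly the bytes received
  }
  return response;
}
```

In the client this might be called as readUntilClosed([&](char *b, int len) { return sock.recv(b, len); }) in place of the while loop, which works for an HTTP/1.0 request because the server closes the connection after its response. As for the 10 MB ceiling, echoBuffer is a local array and therefore lives on the stack, so the limit is most likely the process stack size (see ulimit -s) rather than anything in the socket code.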

Conclusion
There is some demo C++ code which creates a useable class for dealing with TCP sockets. There might be some work to do before it could be used in a serious application, however.

Tags C++

Apache Linux Web Site Technologies

Turning Apache into a Redirect Factory

Post author By john
Post date February 8, 2012
No Comments on Turning Apache into a Redirect Factory

Intro
I’m getting a little more used to Apache. It’s a strange web server with all sorts of bolt-on pieces. The official documentation is horrible so you really need sites like this to explain how to actually do useful things. You need real, working examples. In this example I’m going to show how to use the mod_rewrite engine of Apache to build a powerful and convenient web server whose sole purpose in life is for all types of redirects. I call it a redirect factory.

Which Redirects Will it Handle
The redirects will be read in from a file with an easy, editable format. So we never have to touch our running web server. We’ll build in support for the types of redirect requests that I have actually encountered. We don’t care what kind of crazy stuff Apache might permit. You’ll pull your hair out trying to understand it all. All redirects I have ever encountered fall into a relatively small handful of use cases. Ordered by most to least common:

host -> new_url
host/uri[Suffix] -> new_fixed_url (this can be a case-sensitive or case-insensitive match to the uri)
host/uri[Suffix] -> new_prefix_uri[Suffix] (also either case-sensitive or not)

So some examples (not the best examples because I don’t manage drj.com or drj.net, but pretend I did):

drj.com/WHATEVER -> http://drjohnstechtalk.com/
www.drj.com -> http://drjohnstechtalk.com/
drj.com/abcPATH/Preserve -> http://drjohnstechtalk.com/abcPATH/Preserve
drj.com/defPATH/Preserve -> http://drjohnstechtalk.com/ghiPATH/Preserve
drj.com/path/with/slash -> http://drjohnstechtalk.com/other/path
drj.com/path/with/prefix -> http://drjohnstechtalk.com/other/path
drj.net/pAtH/whatever -> https://drjohnstechtalk.com/straightpath
drj.net/2pAtH/stuff?hi=there -> http://drjohnstechtalk.com/2straightpath/stuff?hi=there
my.host -> http://regular-redirect.com/
whatever-host.whatever-domain/whatever-URI -> http://whatever-new-host.whatever-new-domain/whatever-new-URI

All these different cases can be handled with one config file. I’ve named it redirs.txt. It looks like this:

# redirs file
# The default target has to be listed first
defaultTarget   D       http://www.drjohnstechtalk.com/blog/
# hosts with URI-matching grouped together
# available flags: "P" - preserve part after match
#                  "C" - exact case match of URI
 
# Begin host: drj.com:www.drj.com - ":"-separated list of applicable hostnames
/                       http://drjohnstechtalk.com/
/abc    P       http://drjohnstechtalk.com/abc
/def    P       http://drjohnstechtalk.com/ghi
/path/with/slash https://drjohnstechtalk.com/other/path
/path/with/prefix P  https://drjohnstechtalk.com/other/path
# end host drj.com:www.drj.com
 
# this syntax - host/URI - is also OK...
drj.net/ter             http://drjohnstechtalk.com/terminalredirect
drj.net/pAtH    C       http://drjohnstechtalk.com/straightpath
drj.net/2pAtH   CP      http://drjohnstechtalk.com/2straightpath
 
# hosts with only host-name matching
my.host                 http://regular-redirect.com/
www.drj.edu             http://education-redirect.edu/edu-path
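
To make the file format concrete, here is a small sketch of how one line of redirs.txt breaks down into fields: two whitespace-separated fields mean source and target, three mean source, flags and target. It is written in C++ purely for illustration; RedirLine and parseRedirLine are hypothetical names, and the "Begin host" grouping comments are not handled here.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one line of redirs.txt (names are illustrative).
struct RedirLine {
  std::string source;  // e.g. "/abc", "drj.net/pAtH" or "my.host"
  std::string flags;   // "", "P", "C", "CP" or "D"
  std::string target;  // URL to redirect to
};

// Returns false for comments, blank lines and anything that does not have
// exactly two or three whitespace-separated fields.
bool parseRedirLine(const std::string& line, RedirLine& out) {
  std::istringstream in(line);
  std::vector<std::string> fields;
  std::string field;
  while (in >> field) fields.push_back(field);
  if (fields.empty() || fields[0][0] == '#') return false;
  if (fields.size() == 2) {                 // source target
    out = RedirLine{fields[0], "", fields[1]};
    return true;
  }
  if (fields.size() == 3 && fields[1].size() <= 2) {  // source flags target
    out = RedirLine{fields[0], fields[1], fields[2]};
    return true;
  }
  return false;
}
```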

The Apache configuration file piece is this:

# I really don't think this does anything other than chase away a scary warning in the error log...
RewriteLock ${APACHE_LOCK_DIR}/rewrite_lock
 
# Inspired by the dreadful documentation on http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
RewriteEngine on
RewriteMap  redirectMap prg:conf/vhosts/redirect.pl
#RewriteCond ${lowercase:%{HTTP_HOST}} ^(.+)$
RewriteCond ${redirectMap:%{HTTP_HOST}%{REQUEST_URI}} ^(.+)$
# %N are backreferences to RewriteCond matches, and $N are backreferences to RewriteRule matches
RewriteRule ^/.* %1 [R=301,L]

Remember I split up apache configuration into smaller files. So that’s why you don’t see the lines about logging and what port to listen on, etc. And the APACHE_LOCK_DIR is an environment variable I set up elsewhere. This file is called redirect.conf and is in my conf/vhosts directory.

In my main httpd.conf file I extended the logging to prefix the lines in the access log with the host name (since this redirect server handles many host names this is the only way to get an idea of which hosts are popular):

...
    LogFormat "%{Host}i %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
...

So a typical log line looks something like the following:

drj.com 201.212.205.11 - - [10/Feb/2012:09:09:07 -0500] "GET /abc HTTP/1.1" 301 238 "http://www.google.com.br/url?sa=t&rct=j&q=drjsearch" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E)"
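
With the host name as the first field, finding the popular hosts comes down to counting that field. A rough sketch, assuming the log lines have already been read into memory; countHosts is a hypothetical name and not something from my setup:

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Tally requests per host, assuming the host name is the first
// whitespace-separated field of each access log line.
std::map<std::string, int> countHosts(const std::vector<std::string>& logLines) {
  std::map<std::string, int> counts;
  for (const std::string& line : logLines) {
    std::istringstream in(line);
    std::string host;
    if (in >> host) ++counts[host];  // blank lines yield no field and are skipped
  }
  return counts;
}
```

Clients may send the Host header in any case, so lowercasing the field before counting would be a sensible refinement.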

I had to re-compile apache because originally my version did not have mod_rewrite compiled in. My description of compiling Apache with this module is here.

The directives themselves I figured out based on the lousy documentation at their official site: http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html. The heavy lifting is done in the Perl script because there you have some freedom (yeah!) and are not constrained to understand all their silly flags. One trick that does not seem documented is that you can send the full URL to your mapping program. Note the %{HTTP_HOST}%{REQUEST_URI} after the “:”.

I tried to keep redirect.pl brief and simple. Considering the many different cases it isn’t too bad. It weighs in at 70 lines. Here it is:

#!/usr/bin/perl
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# input is $HTTP_HOST$REQUEST_URI
$redirs = "redirs.txt";
# here I only want the actual script name
$working_directory = $script_name = $0;
$script_name =~ s/.*\///g;
$working_directory =~ s/\/$script_name$//g;
$finalType = "";
$DEBUG = 0;
$|=1;
while (<STDIN>) {
  chomp;
  ($host,$uri) = /^([^\/]+)\/(.*)/;
  $host = lc $host;
# use generic redirect file
  open(REDIRS,"$working_directory/$redirs") || die "Cannot open redirs file $redirs!!\n";
  $lenmatchmax = -1;
  while(<REDIRS>) {
# look for alternate names section
    if (/#\s*Begin host\s*:\s*(\S+)/i) {
      @hostnames = split /:/,$1;
      $pathsection = 1;
    } elsif (/#\s*End host/i) {
      $pathsection = 0;
    }
    @hostnames = () unless $pathsection;
    next if /^#/ || /^\s*$/; # ignore comments and blank lines
    chomp;
    $type = "";
# take out trailing spaces after the target URL
    s/\s+$//;
    if (/^(\S+)\s+(\S{1,2})\s+(\S+)$/) {
      ($redirsURL,$type,$targetURL) = ($1,$2,$3);
    } else {
       ($redirsURL,$targetURL) = /^(\S+)\s+(\S+)$/;
    }
# set default target if specified. It has to come at beginning of file
    $finalURL = $targetURL if $type =~ /D/;
    $redirsHost = $redirsURI = $redirsURIesc = "";
    ($redirsHost,$redirsURI) = $redirsURL =~ /^([^\/]*)\/?(.*)/;
    $redirsURIesc = $redirsURI;
    $redirsURIesc =~ s/([\/\?\.])/\\$1/g;
    print "redirsHost,redirsURI,redirsURIesc,targetURL,type: $redirsHost,$redirsURI,$redirsURIesc,$targetURL,$type\n" if $DEBUG;
    push @hostnames,$redirsHost unless $pathsection;
    foreach $redirsHost (@hostnames) {
    if ($host eq $redirsHost) {
# assume case-insensitive match by default.  Use type of 'C' to demand exact case match
# also note this matches even if uri and redirsURI are both empty
      if ($uri =~ /^$redirsURIesc/ || ($type !~ /C/ && $uri =~ /^$redirsURIesc/i)) {
# find longest match
        $lenmatch = length($redirsURI);
        if ($lenmatch > $lenmatchmax) {
          $finalURL = $targetURL;
          $finalType = $type;
          $lenmatchmax = $lenmatch;
          if ($type =~ /P/) {
# prefix redirect
            if ($uri =~ /^$redirsURIesc(.+)/ || ($type !~ /C/ && $uri =~ /^$redirsURIesc(.+)/i)) {
              $finalURL .= $1;
             }
          }
        }
      }
    } # end condition over input host matching host from redirs file
    } # end loop over hostnames list
  } # end loop over lines in redirs file
  close(REDIRS);
# non-prefix re-direct. This is bizarre, but you have to end URI with "?" to kill off the query string, unless the target already contains a "?", in which case you must NOT add it! Gotta love Apache...
  $finalURL .= '?' unless $finalType =~ /P/ || $finalURL =~ /\?/;
  print "$finalURL\n";
} # end loop over STDIN


The nice thing here is that there are a couple of ways to test it, which gives you a sort of cross-check capability. Of course I made lots of mistakes in programming it, but I worked through all the cases until they were all right, using rapid testing.
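
For anyone who finds the Perl dense, the heart of the script is a longest-prefix match over the rules for the requested host. Here is that one idea restated as a C++ sketch. Rule, pickTarget and the helper names are hypothetical, the host lookup and file parsing are left out, and this is an illustration of the rule rather than a replacement for redirect.pl.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical rule for a single host (names are illustrative).
struct Rule {
  std::string prefix;   // URI prefix without the leading "/", e.g. "abc"
  bool caseSensitive;   // the "C" flag
  bool preserve;        // the "P" flag
  std::string target;   // URL to redirect to
};

static std::string lower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

static bool startsWith(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Pick the rule with the longest matching prefix; fall back to defaultTarget.
std::string pickTarget(const std::vector<Rule>& rules, const std::string& uri,
                       const std::string& defaultTarget) {
  std::string best = defaultTarget;
  int bestLen = -1;
  for (const Rule& r : rules) {
    bool hit = r.caseSensitive ? startsWith(uri, r.prefix)
                               : startsWith(lower(uri), lower(r.prefix));
    int len = static_cast<int>(r.prefix.size());
    if (hit && len > bestLen) {
      bestLen = len;
      best = r.target;
      // "P": keep whatever followed the matched prefix, in its original case
      if (r.preserve) best += uri.substr(r.prefix.size());
    }
  }
  return best;
}
```

An empty prefix matches every URI with length 0, which is how the "/" line for a host acts as that host's catch-all.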

For instance, let’s see what happens for www.drj.com. We run this test from the development server as follows:

> curl -i -H 'Host: www.drj.com' 'localhost:90'

HTTP/1.1 301 Moved Permanently
Date: Thu, 09 Feb 2012 15:24:25 GMT
Server: Apache/2
Location: http://drjohnstechtalk.com/
Content-Length: 235
Content-Type: text/html; charset=iso-8859-1

Moved Permanently

The document has moved here.

And from the command line I test redirect.pl as follows:

> echo "www.drj.com/"|./redirect.pl

http://drjohnstechtalk.com/?

That terminal “?” is unfortunate, but apparently you need it to kill off any possible query_string.
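
The rule behind that "?" is short enough to state on its own. In mod_rewrite, a substitution that ends in "?" drops the original query string, which would otherwise be carried over to the new URL. A sketch of the last step of redirect.pl, again in C++ for illustration only, with finishTarget as a hypothetical name:

```cpp
#include <string>

// Append "?" so that Apache drops the original query string, except for
// preserve ("P") rules or targets that already contain a "?".
std::string finishTarget(std::string target, bool preserve) {
  if (!preserve && target.find('?') == std::string::npos) target += '?';
  return target;
}
```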

You want some more? OK. How about matching a host and the initial path in a case-insensitive manner? No problem, we’re up to the challenge:

> curl -i -H 'Host: DRJ.COM' 'localhost:90/PATH/WITH/SLASH/stuff?hi=there'

HTTP/1.1 301 Moved Permanently
Date: Thu, 09 Feb 2012 15:38:12 GMT
Server: Apache/2
Location: https://drjohnstechtalk.com/other/path
Content-Length: 246
Content-Type: text/html; charset=iso-8859-1

Moved Permanently

The document has moved here.

Refer back to the redirs file and you see this is the desired behaviour.

We could go on with an example for each case, but we’ll conclude with one last one:

> curl -i -H 'Host: DRJ.NET' 'localhost:90/2pAtHstuff?hi=there'

HTTP/1.1 301 Moved Permanently
Date: Thu, 09 Feb 2012 15:44:37 GMT
Server: Apache/2
Location: http://drjohnstechtalk.com/2straightpathstuff?hi=there
Content-Length: 262
Content-Type: text/html; charset=iso-8859-1

Moved Permanently

The document has moved here.

A case-sensitive, preserve match. Change “pAtH” to “path” and there is no matching line in redirs.txt so you will get the default URL.

Creating exceptions
Eventually I wanted to have an exception – a URI which should be served with a 200 status rather than redirected. How to handle?

# Inspired by the dreadful documentation on http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
        RewriteEngine on
# just this one page should NOT be redirected
        RewriteRule ^/dontredirectThisPage.php - [L]
        RewriteMap  redirectMap prg:redirect.pl
        ... etc ...

The above apache configuration snippet shows that I had to put the page which shouldn’t be redirected at the top of the ruleset and set the target to “-”, which turns off redirection for that match, and make this the last executed Rewrite rule. I think this is better than a negated match (!) which always gets complicated.

Conclusion
A powerful redirect factory was constructed from Apache and Perl. We suffered quite a bit during development because of incomprehensible documentation. But hopefully we’ve saved someone else this travail.

References and related

2022 update. This is a very nice commercial service for redirects which I have just learned about: https://www.easyredir.com/
This post describes how to massage Apache so that it always returns a maintenance page no matter what URI was originally requested.
I have since learned that another term used in the industry for redirect server is persistent URL (PURL). It’s explained in Wikipedia by this article: https://en.wikipedia.org/wiki/Persistent_uniform_resource_locator

Tags PURL, redirect

Ajax

Web to ssh gateway – not so difficult with Right Tools

Post author By john
Post date January 31, 2012
No Comments on Web to ssh gateway – not so difficult with Right Tools

Intro
I won’t go into details in this posting for fear that the “bad people” will be more likely to benefit than the legitimate users of what I’m describing. That being said there are some legitimate uses, for instance when you need that terminal access but a direct ssh connection just isn’t available.

Ajaxterm
I’m kind of amazed at how far Javascript has come. You can implement a curses-based application in javascript, i.e., a terminal console? Yup. You bet. And the kicker is that it works quite well. Teraterm it ain’t, but I’ll be danged if you can’t vi a file, run top as well as your basic commands, all over a pretty standard-looking web page. That’s what we mean by gateway – an application which converts one protocol to another. In this case HTTP to shell (I suppose).

The generic application is called ajaxterm. I used it from a distribution that runs a local python server on my server. It’s described here:

https://github.com/antonylesuisse/qweb/tree/master/ajaxterm/README.txt

If you keep the default screen size, 80×24, he says it has few enough characters that a screen refresh can be contained in one packet. In my testing the echo delay was probably under one quarter second.

Forget about a scroll bar holding 1000’s of lines, however. You get just your basic terminal like in the old days.

Someone reminded me about screen, which I hadn’t been using. Screen is an extremely useful tool. It’s like a terminal multiplexor. Now I normally set up my screen escape sequence to be Ctrl-\, but for some reason this particular sequence is not recognized by Ajaxterm. What I settled on instead is Ctrl-g (escape ^Gg in your .screenrc). I don’t like to use the default Ctrl-a because this is a useful emacs editing mode sequence – takes you to beginning of line. Popping between screens is a little slow with ajaxterm as might be expected. It’s a worst-case, everything must be re-drawn situation, I suppose. But ajaxterm + screen is a pretty powerful combination.

Conclusion
Now I have an additional path to my server’s command line if a direct ssh connection isn’t available.

Tags gateway, screen, ssh, teraterm, web

Ajax flot jquery Perl

Making Function Plots fun using Ajax while solving a real-world problem

Post author By john
Post date January 25, 2012
1 Comment on Making Function Plots fun using Ajax while solving a real-world problem

Intro
I learned an awful lot from this exercise. I wanted to plot the trajectory of a foam basketball through the air. You know the kind of thing where you can vary the initial conditions to see what differences the results will produce. Finally, finally a good excuse to learn some Ajax. Ajax is a natural fit because you can work within the same web page and the feel is more interactive.

High level description
There’s so much here to describe I hardly know where to begin. I may never get through describing it all.

At the highest levels I had to learn some of the following:

php
Ajax
DOM
Javascript
jquery
flot
json

Perl and basic physics are not on the list – they are used but I already know those!

I basically only learned as much as I needed to accomplish the task. This saved me quite a bit of time as you can get bogged down for months in any single one of those topics above. I’m pretty good at “programming by analogy” and this really put those skills to the test because, as is usually the case, analogies were indeed present, but they weren’t very exact so I needed a scary amount of extrapolation from what samples were easily available.

The net result of all this? I think it’s pretty neat if I say so myself. This web page follows the trajectory of a small foam basketball from a given set of initial conditions. The trajectory is plotted. You tweak the initial conditions and a new trajectory is plotted on top of the old one so you can see the differences. Here’s a link to the application.

To be continued in great detail, hopefully…

Tags DOM, function, Javascript, json, plotting