Categories
Admin Linux

Common Problems Installing Cognos Gateway on Linux

Updated for a 2018 Cognos 11 install
with 2013 updates for Cognos 10 installation

Intro
I tried to take a shortcut and get a 2nd Cognos gateway up and running by copying files, etc. rather than a proper install. At one time or another I feel I must have encountered just about every problem conceivable. I didn’t take great, systematic notes, but I’d like to mention some highlights while it is still fresh in my memory!

The Details
Note that I have a working gateway server running on the same version of Linux, SLES 11 SP1. So I thought I’d be clever and just copy all the files below /opt/cognos8 from the working server.

First Rookie Mistake
Let’s call our COGNOS_ROOT /opt/cognos8 for convenience.
Cognos 10 note: /opt/cognos10 would be a more sensible installation directory!

So you’re following along in the documentation and dutifully looking for /opt/cognos8/bin/cogconfig.sh, and not finding it? Me, neither. So I cleverly borrowed it from a working solaris installation. It’s all Java, right, no OS dependencies, what can go wrong? Ha, ha. You try:

./cogconfig.sh
and get:

Using /usr/lib64/jvm/jre/bin/java
The java class is not found:  CRConfig

Long story short. Give up. Without telling anyone they moved it to /opt/cognos8/bin64. That’s assuming you’re on a 64-bit system like most of us are.

OK. Now you run it from the …bin64 directory, expecting better results, only to perhaps get something like:

./cogconfig.sh

Unable to locate a JRE. Please specify a valid JAVA_HOME environment variable.

Long story short, java-1_4_2-ibm (java-1_6_0-ibm if installing a Cognos 10 gateway) is a good Java environment to install for Cognos Gateway. At least it is on SLES Linux. So you install that and set up environment variables like these:

export JAVA_BINDIR=/usr/lib64/jvm/jre/bin
export JAVA_HOME=/usr/lib64/jvm/jre
export JAVA_ROOT=/usr/lib64/jvm/jre

Now you’re cooking. Run it yet again. You’re smart and know to set up your DISPLAY environment to a valid XServer you have access to. But even if the X application actually does launch and run (you may need some Motif or additional X packages, possibly even from the SDK DVD – see appendix A), if you try to export the configuration you’ll get an error like this:

java.lang.ClassNotFoundException: org.bouncycastle134.jce.provider.BouncyCastleProvider

Cognos 10 note: I did not have this class missing in my Cognos 10 installation. Yeah!

Yes, you are missing the infamous bouncycastleprovider! This stuff is too good to make up, right? It’s a jar file that’s somewhere in the Cognos Gateway distribution, bcprov-jdk14-134.jar. In my case I need to put it here:

/etc/alternatives/jre/lib/ext

With that in place run it yet again. Now you may be unable to export the configuration with this error:

CAM-CRP-1057 Unable to generate the machine specific symmetric key.

Does it ever end? Yes!

You may have old values of keys and what-not cryptography stuff from your copy of the other system. So you remove these directories and all their contents:

/opt/cognos8/{encryptkeypair,signkeypair}

And I even saw the following error:

02/03/2012,11:26:56,Err,com.cognos.crconfig.data.DataManagerException: CAM-CRP-1132 An error occurred while attempting to request a certificate from the Certificate Authority service. Unable to connect to the Certificate Authority service. Ensure that the Content Manager computer is configured and that the Cognos 8 services on it are currently running. Reason: java.net.ConnectException: Connection refused, com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:2730)

I think it comes about if you save the default config without editing it and putting in a valid dispatcher URI, but I forget.

The main point towards the end was to start with a clean config by a:

cd /opt/cognos8/configuration;cp cogstartup.xml{.new,}

, making sure there is no encryptkeypair and signkeypair directories, launching …bin64/cogconfig.sh, working with the GUI to define the dispatcher URIs to your working, running Cognos dispatcher, exporting it,

(Let me take a breath here. If that export succeeds, you’re home.)

and finally saving it, which also generates the system-specific keys.

That’s it! A bunch of green check marks are your reward. Hopefully.

Conclusion
In the end you will see that this “cheap method” of installing Cognos Gateway worked. We had a few bumps along the road, but we worked through them all. Now that we’ve seen just about every conceivable problem we have a treasure trove of documented errors and fixes should we ever find ourselves in this situation again.

There is one more Cognos Gateway problem we resolved, by the way, that was previously documented here.

Appendix A – Cognos 10 note
Yes, I referred to this document in my own installation of Cognos version 10 gateway component. The problems are very similar, and this was a big help, if I say so myself.

I notice I write a tight narrative. I have lots of tangential thoughts, but to list them all as I think of them would destroy the flow of the narrative. In this case I wanted to expand on the openmotif packages.

I got a missing libXm.so.4 message when launching issetup the first time. I determined this came from an openmotif package from my previous successful installation on another server. My new server had limited repositories.

> zypper search openmotif

produced these results:

 
S | Name                   | Summary                    | Type
--+------------------------+----------------------------+-----------
  | openmotif21-demos      | Open Motif 2.2.4 Libraries | package
  | openmotif21-libs       | Open Motif 2.2.4 Libraries | package
  | openmotif21-libs       | Open Motif 2.2.4 Libraries | srcpackage
  | openmotif21-libs-32bit | Open Motif 2.2.4 Libraries | package
  | openmotif22-libs       | Open Motif 2.2.4 Libraries | package
  | openmotif22-libs       | Open Motif 2.2.4 Libraries | srcpackage
  | openmotif22-libs-32bit | Open Motif 2.2.4 Libraries | package

Well, I tried to install first openmotif21-libs-32bit then openmotif22-libs-32bit, but neither gave me the right version of libXm.so! I had versions 2, 3 and 6! So I simply did one of these numbers:

> cd /usr/lib; ln -s libXm.so.3.0.3 libXm.so.4

and, to my surprise, it worked!

More Errors Documented for completeness’ sake

At the risk of making this blog post a total mess, I’ll include a few more errors I encountered during the upgrade. Who knows who might find this useful.

Generating the cryptographic keys is always a hold-your-breath-and-pray operation. I had my upgrade files in place in a new install directory, /opt/cognos10. I ran bin64/cogconfig.sh like usual. It was suggested I could save the configuration even though the application gateway wasn’t running, so I tried that. No dice.

The cryptographic information cannot be encrypted.

Fine. So probably the app server needs to be running before we save the config, right? So they got it running. I tried to save the config. Same error. The details were as follows:

[ ERROR ]
CAM-CRP-1315 Current configuration points to a different Trust Domain than originally configured.
 
[ ERROR ] 
The cryptography information was not generated.

The remedy? Close the configuration and completely remove these directories beneath the /opt/cognos10/configuration directory:

– encryptkeypair
– signkeypair
– csk (actually I didn’t have this one. But I guess it should be removed if present)

I held my breath, re-ran cogconfig and saved. This time it worked!

I also had an error with my Java version:

./cogconfig.sh
Using /usr/lib64/jvm/jre/bin/java
The java class could not be loaded. java.lang.UnsupportedClassVersionError: (CRConfig) bad major version at offset=6
/usr/lib64/jvm/jre/bin/java -version

showed

java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 2.3)
IBM J9 VM (build 2.3, J2RE 1.4.2 IBM J9 2.3 Linux amd64-64 j9vmxa64142ifx-20110628 (JIT enabled)
J9VM - 20110627_85693_LHdSMr
JIT  - 20090210_1447ifx5_r8
GC   - 200902_24)

I installed a newer Java:

zypper install  java-1_6_0-ibm

and got past this error.

April 20123 update
Just when you thought every possible error was covered, you encounter a new one. Cognos Mobile isn’t working so well on actual mobile devices so they wanted to try a Fixpack from IBM. No problem, right? They gave me

up_cogmob_linuxi38664h_10.2.1102.33_ml.rar

and I set to work. I don’t particularly like rar files for Linux, but I figured out there is an unrar command:

$ unrar e up_*rar

But after setting up my DISPLAY environment variable I get this new error running ./issetup:

X Error of failed request:  BadDrawable (invalid Pixmap or Window parameter)
  Major opcode of failed request:  14 (X_GetGeometry)
  Resource id in failed request:  0x2
  Serial number of failed request:  257
  Current serial number in output stream:  257
IDS_MSG_PREFIXIDS_COPYRIGHT_LOGOIDS_MSG_PREFIXIDS_MSG_READ_ARCHIVE

The solution? They downloaded a tar.gz version of the Fixpack. I unpacked that and had absolutely no problems with issetup! The really strange thing is that in both issetup are identical files. I use cksum to do a quick compare. Even setup.csp are identical files. I did an strace -f of the two cases but the salient difference didn’t pop out at me. The files present in the tar.gz seem to be fewer in number.

Another random error you will encounter sooner or later

You are doing a Save in cogconfig and you get:

13/05/2013,17:39:05,Err,CAM-CRP-1132 An error occurred while attempting to request a certificate from the Certificate Authority service. Unable to connect to the Certificate Authority service. Ensure that the Content Manager computer is configured and that the IBM Cognos services on it are currently running. Reason: java.net.ConnectException: Connection refused, com.cognos.crconfig.data.crypto.ConfiguringSession.configure(ConfiguringSession.java:35)com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:3037)com.cognos.crconfig.data.DataManager$4.run(DataManager.java:4169)com.cognos.crconfig.data.CnfgActionEngine$CnfgActionThread.run(CnfgActionEngine.java:394)com.cognos.crconfig.data.crypto.ConfiguringSession.configure(ConfiguringSession.java:35)com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:3037)com.cognos.crconfig.data.DataManager$4.run(DataManager.java:4169)com.cognos.crconfig.data.CnfgActionEngine$CnfgActionThread.run(CnfgActionEngine.java:394)com.cognos.crconfig.data.crypto.ConfiguringSession.configure(ConfiguringSession.java:35)com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:3037)com.cognos.crconfig.data.DataManager$4.run(DataManager.java:4169)com.cognos.crconfig.data.CnfgActionEngine$CnfgActionThread.run(CnfgActionEngine.java:394)

This looks scary but has an easy fix. You aren’t communicating with the app server. Probably their dispatcher services are down. Bring them up and it should work fine – it did for me. This is assuming of course that you have your dispatcher URLs set up correctly.

I cloned my Cognos web gateway and got this error
I waited for a few weeks to examine the clone. I ran

$ ./cogconfig.sh

and got this error:

16/05/2013,15:57:35,Err,CAM-CRP-1280 An error occurred while trying to decrypt using the system protection key. Reason: javax.crypto.IllegalBlockSizeException: Input length (with padding) not multiple of 16 bytes

Umm. I don’t have the solution yet. One thing is most highly suspect: in the meatime we re-generated the keys on the production web gateway. So I am hoping that is all we need to do here as well.

Resolved. Here is the process I followed – a sort of colonic for Cognos:

$ cd /opt/cognos10/configuration; rm csk/* signkeypair/* encryptkeypair/* cogstartup.xml
$ cd ../bin64; ./cogconfig.sh

Then in the GUI I re-defined the app servers in the dispatcher URI portion of the environment.
Then did a Save.
Worked like a champ – four green check marks.

cogconfig hangs
This happened to me on an older server. The IBM Cognos Configuration screen displays but it’s supposed to exit so you can get to the part where you edit the configuration and it never does.

Currently no known solution.

June 2018 update
Cognos 11 install problem

The Cognos 11 install was going pretty well. Until it came time to launch cogconfig. That generated this error:

cognos10:/web/cognos11/bin64> ./cogconfig.sh

Using /usr/lib64/jvm/jre/bin/java
Exception in thread "main" java.lang.UnsupportedClassVersionError: JVMCFRE003 bad major version; class=com/cognos/accman/jcam/crypto/CAMCryptoException, offset=6
        at java.lang.ClassLoader.defineClass(ClassLoader.java:286)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:74)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:538)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
        at java.net.URLClassLoader.access$300(URLClassLoader.java:77)
        at java.net.URLClassLoader$ClassFinder.run(URLClassLoader.java:1041)
        at java.security.AccessController.doPrivileged(AccessController.java:448)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:427)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:676)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:358)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:642)
        at java.lang.J9VMInternals.verifyImpl(Native Method)
        at java.lang.J9VMInternals.verify(J9VMInternals.java:73)
        at java.lang.J9VMInternals.initialize(J9VMInternals.java:133)
        at com.cognos.cclcfgapi.CCLConfigurationFactory.getInstance(CCLConfigurationFactory.java:59)
        at com.cognos.crconfig.CnfgPreferences.<init>(CnfgPreferences.java:51)
        at com.cognos.crconfig.CnfgPreferences.<clinit>(CnfgPreferences.java:36)
        at java.lang.J9VMInternals.initializeImpl(Native Method)
        at java.lang.J9VMInternals.initialize(J9VMInternals.java:199)
        at CRConfig.main(CRConfig.java:144)

Note my system java version is woefully out-of-date:

$ /usr/lib64/jvm/jre/bin/java ‐version

java version "1.6.0"
Java(TM) SE Runtime Environment (build pxa6460sr16fp15-20151106_01(SR16 FP15))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr16fp15-20151020_272943 (JIT enabled, AOT enabled)
J9VM - 20151020_272943
JIT  - r9_20151019_103450
GC   - GA24_Java6_SR16_20151020_1627_B272943)
JCL  - 20151105_01

whereas the Cognos-supplied Java is two versions ahead:
cognos10:/web/cognos11> ./jre/bin/java ‐version

java version "1.8.0"
Java(TM) SE Runtime Environment (build pxa6480sr4fp10-20170727_01(SR4 FP10))
IBM J9 VM (build 2.8, JRE 1.8.0 Linux amd64-64 Compressed References 20170722_357405 (JIT enabled, AOT enabled)
J9VM - R28_20170722_0201_B357405
JIT  - tr.r14.java_20170722_357405
GC   - R28_20170722_0201_B357405_CMPRSS
J9CL - 20170722_357405)
JCL - 20170726_01 based on Oracle jdk8u144-b01

Instead of the previous approach which involved upgrading the system Java, I decided to just try the Java version Cognos itself had installed. In the following commands note that my installation directory was /web/cognos11.

$ cd /web/cognos11; export JAVA_HOME=`pwd`/jre
$ ./cogconfig.sh

Using /web/cognos11/jre/bin/java
06/06/2018,11:13:04,Dbg,Use Customized settings for font and color.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/web/cognos11/bin/slf4j-nop-1.7.23.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/web/cognos11/configuration/utilities/config-util.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.helpers.NOPLoggerFactory]
06/06/2018,11:13:10,Dbg,The original cogstartup.xml file is clear text. Don't back it up.

That is to say, it worked! I’ve often seen software packages install their own versions of Java. This is the first time I thought to take advantage of that. Wish I had thought of this approach during the Cognos 10 install!

Categories
Admin Perl Web Site Technologies

Turning HP SiteScope into SiteScope Classic with Perl

Intro
HP siteScope is a terrific web application tool and not too expensive for those who have any kind of a budget. The built-in monitor types are a bit limited, but since it allows calls to user-provided scripts your imagination is the only real limitation. For those with too many responsibilities and too little time on their hands it is a real productivity enhancer.

I’ve been using the product for 12 years now – since it was Freshwater SiteScope. I still have misgivings about the interface change introduced some years ago when it was part of Mercury. It went from simple and reliable to Java, complicated and flaky. To this day I have to re-start a SiteScope screen in my browser on a daily basis as the browser cannot recover from a server restart or who knows what other failures.

So I longed for the days of SiteScope Classic. We kept it running for as long as possible, years in fact. But at some point there were no more releases created for the classic view. So I investigated the feasibility of creating my own conversion tool. And…partially succeeded. Succeeded to the point where I can pull up the web page on my Blackberry and get the statuses and history. Think you can do that with regular HP SiteScope? I can’t. Maybe there’s an upgrade for it, but still. It’s nice to have the classic interface when you want to pull up the statuses as quickly as possible, regardless of the Blackberry display issue.

Looking back at my code, I obviously decided to try my hand at OO (object oriented) programming in Perl, with mixed results. Perl’s OO syntax isn’t the best, which addles comprehension. Without further ado, let’s jump into it.

The Details
It relies on something I noticed, that this URL on your HP SiteScope server, http://localhost:8080/SiteScope/services/APIConfigurationImpl?method=getConfigurationSnapshot, contains a tree of relationships of all the monitors. Cool, right? But it’s not a tree like you or I would design. Between parent and child is an intermediate layer. I suppose you need that because a group can contain monitors (my only focus in this exercise), but it can also contain alerts and maybe some other properties as well. So I guess the intermediate layer gives them the flexibility to represent all that, though it certainly added to my complication in parsing it. That’s why you’ll see the concern over “grandkids.” I developed a recursive, web-enabled Perl program to parse through this xml. That gives me the tools to build the nice hierarchical groupings. But it does not give me the statuses.

For the status of each monitor I wrote a separate scraper script that simply reads the entire daily SiteScope log every minute! Crude, but it works. I use it for an installation with hundreds of monitors and a log file that grows to 9 MB by the end of the day so I know it scales to that size. Beyond that it’s untested.

In addition to giving only the relationships, the xml also changes with every invocation. It attaches ID numbers to the monitors which initially you think is a nice unique identifier, but they change from invocation to invocation! So an additional challenge was to match up the names of the monitors in the xml output to the names as recorded in the SiteScope log. Also a bit tricky, but in general doable.

So without further ado, here’s the source code for the xml parser and main program which gets called from the web:

#!/usr/bin/perl
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# build v.simple SiteScope web GUI appropriate for smartphones
# 7/2010
#
# Id is our package which defines th Id class
use Id;
use CGI::Pretty;
my $cgi=new CGI;
$DEBUG = 0;
# GIF location on SiteScope classic
$ssgifs = "/artwork/";
$health{good} = qq(<img src="${ssgifs}okay.gif">);
$health{error} = qq(<img src="${ssgifs}error.gif">);
$health{warning} = qq(<img src="${ssgifs}warning.gif">);
# report CGI
$rprt = "/SS/rprt";
# the frustrating thing is that this xml output changes almost every time you call it
$url = 'http://localhost:8080/SiteScope/services/APIConfigurationImpl?method=getConfigurationSnapshot';
# get current health of all monitors - which is scraped from the log every minute by a hilgarj cron job
$monitorstats = "/tmp/monitorstats.txt";
print "Content-type: text/plain\n\n" if $DEBUG;
open(MONITORSTATS,"$monitorstats") || die "Cannot open monitor stats file $monitorstats!!";
while(<MONITORSTATS>) {
  chomp;
  ($monitor,$status,$value) = /([^\t]+)\t([^\t]+)\t([^\t]+)/;
  $monitors{"$monitor"} = $status;
  $monitorv{"$monitor"} = $value;
}
open(CURL,"curl $url 2>/dev/null|") || die "cannot open $url for reading!!\n";
my %myobjs = ();
# the xml is one long line!
@lines = <CURL>;
#print "xml line: $lines[0]\n" if $DEBUG;
@multiRefs = split "<multiRef",$lines[0];
#parse multiRefs
# create top-level object
my $id = Id->new (
      id => "id0");
# hash of this object with id as key
$myobjs{"id0"} = $id;
 
# first build our objects...
foreach $mref (@multiRefs) {
  next unless $mref =~ /\sid=/;
#  id="id0" ...
  ($parentid) =  $mref =~ /id=\"(id\d+)/;
  print "parentid: $parentid\n" if $DEBUG;
# watch out for <item><key xsi:type="soapenc:string">groupSnapshotChildren</key><value href="#id3 ...
# vs <item><key xsi:type="soapenc:string">Network</key><value href="#id40"/>
  print "mref: $mref\n" if $DEBUG;
  @ids = split /<item><key/, $mref;
# then loop over ids mentioned in this mref
  foreach $myid (@ids) {
    next unless $myid =~ /href="#(id\d+)/;
    next unless $myobjs{"$parentid"};
# types include group, monitor, alert
    ($typebyregex) = $myid =~ />snapshot_(\w+)SnapshotChildren</;
    $parenttype = $myobjs{"$parentid"}->type();
    $type = $typebyregex ? $typebyregex : $parenttype;
    print "type: $type\n" if $DEBUG;
# skip alert definitions
    next if $type eq "alert";
    print "myid: $myid\n" if $DEBUG;
    ($actualid) = $myid =~ /href="#(id\d+)/;
    print "actualid: $actualid\n" if $DEBUG;
# construct object
    my $id = Id->new (
      id => $actualid,
      type => $type,
      parentid => $parentid );
# build hash of these objects with actualid as key
    $myobjs{$actualid} = $id;
# addchild to parent. note that parent should already have been encountered
    $myobjs{"$parentid"}->addchild($actualid);
    if ($myid !~ /groupSnapshotChildren/) {
# interesting child - has name (every other generation has no name!)
      ($name) = $myid =~ /string\">(.+?)<\/key/;  # use non-greedy operator
      print "name: $name\n" if $DEBUG;
# some names are not of interest to us: alerts, which end in "error" or "good"
      if ($name !~ /(error|good)$/) {
# name may not be unique - get extended name which include all parents
        if (defined $myobjs{"$parentid"}->parentid()) {
          $gdparid = $myobjs{"$parentid"}->parentid();
          $gdparname = $myobjs{"$gdparid"}->extname();
# extname -> extended, or distinguished name.  Should be unique
          $extname = $gdparname. '/' . $name;
        } else {
# 1st generation
          print "1st generation\n" if $DEBUG;
          $extname = $name;
        }
        print "extname: $extname\n" if $DEBUG;
        $id->name($name);
        $id->extname($extname);
        $id->isanamedid(1);
        $myobjs{"$parentid"}->hasnamedkids(1); # want to mark its parent as "special"
# we also need our hash to reference objects by extended name since id changes with each extract and name
may not be unique
        $myobjs{"$extname"} = $id;
      } # end conditional over desirable name check
    } else {
      $id->isanamedid(0);
    }
  }
}
#
# now it's all parsed and our objects are alive. Let's build a web site!
#
# build a cookie containing path
my $pi = $ENV{PATH_INFO};
$script = $ENV{SCRIPT_NAME};
$ua = $ENV{HTTP_USER_AGENT};
# Blackberry browser test
$BB = $ua =~ /^BlackBerry/i ? 1 : 0;
$MSIE = $ua =~ /MSIE /;
# font-size depends on browser
$FS = "font-size: x-small;" if $MSIE;
$cookie = $cgi->cookie("pathinfo");
$uri = $script . $pi;
$cookie=$cgi->cookie(-name=>"pathinfo", -value=>"$uri");
print $cgi->header(-type=>"text/html",-cookie=>$cookie);
($url) = $pi =~ m#([^/]+)$#;
#  -title=>'SmartPhone View',
# this doesn't work, sigh...
#print $cgi->start_html(-head=>meta({-http_equiv=>'Refresh'}));
print qq( <HEAD>
<meta http-equiv="Expires" content="0">
<meta http-equiv="Pragma" content="no-cache">
<meta HTTP-EQUIV="Refresh" CONTENT="60; URL=$url">
<TITLE>SiteScope Classic $url Detail</TITLE>
<style type="text/css">
a.good {color: green; }
a.warning {color: green; }
a.error {color: red; }
td {font-family: Arial, Helvetica, sans-serif; $FS}
p.ss {font-family: Arial, Helvetica, sans-serif;}
</style>
<link rel="shortcut icon" href="/favicon.ico" type="image/x-icon" />
<script type=text/javascript>
function changeme(elemid,longvalue)
{
document.getElementById(elemid).innerText=longvalue;
}
function restoreme(elemid,truncvalue)
{
document.getElementById(elemid).innerText=truncvalue;
}
</script>
</HEAD><body>
);
 
#print $cgi->h1("This is the heading");
# parse path
# top lvl name:2nd lvl name:3rd lvl name
$altpi = $cgi->path_info();
print $cgi->p("pi is $pi") if $DEBUG;
#print $cgi->p("altpi is $altpi");
# relative url
$rurl = $cgi->url(-relative=>1);
if ($pi eq "") {
# the top
# top id is id3
  print qq(<p class="ss">);
  $myid = "id3";
  foreach $kid ($myobjs{"$myid"}->get_children()) {
    my $kidname = $myobjs{"$kid"}->name();
# kids can be subgroups or standalone monitors
    my $health = recurse("/$kidname");
    print "$health{$health} <a href=\"$rurl/$kidname\">$kidname</a><br>\n";
    $prodtest = $kid if $kidname eq "Production";
  }
  print "</p>\n";
} else {
  $extname = $pi;
  print "pi,name,extname,script: $pi,$name,$extname,$script\n" if $DEBUG;
# print where we are
  $uriname = $pi;
  $uriname =~ s#^/##;
  #print $cgi->p("name is $name");
  #print $cgi->p("uriname is $uriname");
  $uricompositepart = "/";
  @uriparts = split('/',$uriname);
  $lastpart = pop @uriparts;
  print qq(<p class="ss"><a href="$script"><b>Sitescope</b></a><br>);
  print qq(<b>Monitors in: );
  foreach $uripart (@uriparts) {
    my $healthp = recurse("$uricompositepart$uripart");
# build valid link
    ##$link = qq(<a class="good" href="$script$uricompositepart$uripart">$uripart</a>: );
    $link = qq(<a class="$healthp" href="$script$uricompositepart$uripart">$uripart</a>: );
    $uricompositepart .= "$uripart/";
    print $link;
  }
  my $healthp = recurse("$uricompositepart$lastpart");
  $color = $healthp eq "error" ? "red" : "green";
  print qq(<font color="$color">$lastpart</font></b></p>\n);
  print qq(<table border="1" cellspacing="0">);
  #print qq(<table>);
  %hashtrs = ();
  foreach $kid ($myobjs{"$extname"}->get_children()) {
    print "kid id: " . $myobjs{"$kid"}->id() . "\n" if $DEBUG;
    next unless $myobjs{"$kid"}->hasnamedkids();
    foreach $gdkid ($myobjs{"$kid"}->get_children()) {
      print "gdkid id: " . $myobjs{"$gdkid"}->id() . "\n" if $DEBUG;
      $gdkidname = $myobjs{"$gdkid"}->name();
      $gdkidextname = $myobjs{"$gdkid"}->extname();
      my $health = recurse("$gdkidextname");
      my $type = $myobjs{"$gdkid"}->type();
# dig deeper to learn health of the grankid's grandkids
      $objct = $healthct{good} = $healthct{error} = $healthct{warning} = 0;
      foreach $ggkid ($myobjs{"$gdkidextname"}->get_children()) {
        print "ggkid id: " . $myobjs{"$ggkid"}->id() . "\n" if $DEBUG;
        next unless $myobjs{"$ggkid"}->hasnamedkids();
        foreach $gggdkid ($myobjs{"$ggkid"}->get_children()) {
          print "gggdkid id: " . $myobjs{"$gggdkid"}->id() . "\n" if $DEBUG;
          $gggdkidname = $myobjs{"$gggdkid"}->name();
          $gggdkidextname = $myobjs{"$gggdkid"}->extname();
          my $health = recurse("$gggdkidextname");
          $objct++;
          $healthct{$health}++;
        }
      }
      $elemct++;
      $elemid = "elemid" . $elemct;
# groups should have distinctive cell background color to set them apart from monitors
      if ($type eq "group") {
        $bgcolor = "#F0F0F0";
        $celllink = "$lastpart/$gdkidname";
        $truncvalue = qq(<font color="red">$healthct{error}</font>/$objct);
        $tdval = $truncvalue;
      } else {
        $bgcolor = "#FFFFFF";
        $celllink = "$rprt?$gdkidname";
# truncate monitor value to save display space
        $longvalue = $monitorv{"$gdkidname"};
        (my $truncvalue) = $monitorv{"$gdkidname"} =~ /^(.{7,9})/;
        $truncvalue = $truncvalue? $truncvalue : "&nbsp;";
        $tdval = qq(<span id="$elemid" onmouseover="changeme('$elemid','$longvalue')" onmouseout="restorem
e('$elemid','$truncvalue')">$truncvalue</span>);
      }
      $hashtrs{"$gdkidname"} = qq(<tr><td bgcolor="#000000">$health{$health} </td><td>$tdval</td><td bgcol
or="$bgcolor"><a href="$celllink">$gdkidname</a></td></tr>\n);
# for health we're going to have to recurse
    }
  }
# print out in alphabetical order
  foreach $key (sort(keys %hashtrs)) {
    print $hashtrs{"$key"};
  }
  print "</table>";
}
print $cgi->end_html();
#######################################
sub recurse {
# to get the union of health of all ancestors
my $moniext = shift;
my ($moni) = $moniext =~ m#/([^/]+)$#;
# don't bother recursing and all that unless we have to...
return $myobjs{"$moniext"}->health() if defined $myobjs{"$moniext"}->health();
print "moni,moniext: $moni, $moniext\n" if $DEBUG;
my ($kid,$gdkidextname,$health,$cumhealth);
$cumhealth = $health = $monitors{"$moni"} ? $monitors{"$moni"} : "good";
foreach $kid ($myobjs{"$moniext"}->get_children()) {
    if ($myobjs{"$kid"}->hasnamedkids()) {
      foreach $gdkid ($myobjs{"$kid"}->get_children()) {
        $gdkidextname = $myobjs{"$gdkid"}->extname();
# for health we're going to have to recurse
        $health = recurse("$gdkidextname");
        if ($health eq "error" || $cumhealth eq "error") {
          $cumhealth = "error";
        } elsif ($health eq "warning" || $cumhealth eq "warning") {
          $cumhealth = "warning";
        }
      }
    } else {
# this kid is end of line
      $health = $monitors{"$kid"} ? $monitors{"$kid"} : "good";
        if ($health eq "error" || $cumhealth eq "error") {
          $cumhealth = "error";
        } elsif ($health eq "warning" || $cumhealth eq "warning") {
          $cumhealth = "warning";
        }
    }
}
$myobjs{"$moniext"}->health("$cumhealth");
return $cumhealth;
} # end sub recurse

I call it simply “ss” to minimize the typing required. You see it uses a package called Id.pm which I wrote to encapsulate the class and methods. Here is Id.pm:

package Id;
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# class for storing data about an id
# URL (not currently protected): http://localhost:8080/SiteScope/services/APIConfigurationImpl?method=getC
onfigurationSnapshot
# class for storing data about a group
use warnings;
use strict;
use Carp;
#group methods
# constructor
# get_members
# get_name
# get_id
# addmember
#
# member methods
# constructor
# get_id
# get_name
# get_type
# get_gp
# set_gp
 
sub new {
  my $class = shift;
  my $self = {@_};
  bless($self, "Id");
  return $self;
}
# get-set methods, p. 355
sub parentid { $_[0]->{parentid}=$_[1] if defined $_[1]; $_[0]->{parentid} }
sub isanamedid { $_[0]->{isanamedid}=$_[1] if defined $_[1]; $_[0]->{isanamedid} }
sub id { $_[0]->{id}=$_[1] if defined $_[1]; $_[0]->{id} }
sub name { $_[0]->{name}=$_[1] if defined $_[1]; $_[0]->{name} }
sub extname { $_[0]->{extname}=$_[1] if defined $_[1]; $_[0]->{extname} }
sub type { $_[0]->{type}=$_[1] if defined $_[1]; $_[0]->{type} }
sub health { $_[0]->{health}=$_[1] if defined $_[1]; $_[0]->{health} }
sub hasnamedkids { $_[0]->{hasnamedkids}=$_[1] if defined $_[1]; $_[0]->{hasnamedkids} }
 
# get children - use anonymous array, book p. 221-222
sub get_children {
# return empty array if arrary hasn't been defined...
  defined @{$_[0]->{children}} ? @{$_[0]->{children}} : ();
}
# adding children
sub addchild {
  $_[0]->{children} = [] unless defined  $_[0]->{children};
  push @{$_[0]->{children}},$_[1];
}
 
1;

ss also assumes the existence of just a few of the images from SiteScope classic – the green circle for good, red diamond for error and yellow warning, etc.. I borrowed them SiteScope classic.

Here is the code for the log scraper:

#!/usr/bin/perl
# analyze SiteScope log file
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# 8/2010
$DEBUG = 0;
$logdir = "/opt/SiteScope/logs";
$monitorstats = "/tmp/monitorstats.txt";
$monitorstatshis = "/tmp/monitorstats-his.txt";
$date = `date +%Y_%m_%d`;
chomp($date);
$file = "$logdir/SiteScope$date.log";
open(LOG,"$file") || die "Cannot open SiteScope log file: $file!!\n";
# example lines:
# 16:51:07 08/02/2010     good    LDAPServers     LDAP SSL test : ldapsrv.drj.com exit: 0, 0.502 sec    1:
3481  0       502
#16:51:22 08/02/2010     good    Network DNS: (AMEAST) ns2  0.033 sec   2:3459      200     33      ok
#16:51:49 08/02/2010     good    Proxy   proxy.pac script on iwww    0.055 sec   2:12467 200     55   ok
     4288    1280782309      0    0  55      0       0      200  0
#16:52:04 08/02/2010     good    Proxy   Disk Space: earth /logs   66% full, 13862MB free, 41921MB total
 3:3598      66      139862
#16:52:09 08/02/2010     good    DrjExtranet  URL: wwwsecure.drj.com     0.364 sec    1:3604      200
364  ok 26125   1280782328     0    0   358     4       2       200  0
while(<LOG>) {
  ($time,$date,$status,$group,$monitor,$value) = /(\S+)\s(\S+)\t(\S+)\t(\S+)\t([^\t]+)\t([^\t]+)/;
  print '$time,$date,$status,$group,$monitor,$value' . "$time,$date,$status,$group,$monitor,$value\n" if $DEBUG;
  next if $group =~ /__health__/; # don't care about these lines
  $mons{"$monitor"} = 1;
  push @{$mont{"$monitor"}} , $time;
  push @{$mond{"$monitor"}} , $date;
  push @{$monh{"$monitor"}} , $status;
  push @{$monv{"$monitor"}} , $value;
}
# open output at last moment to minimize chances of reading while locked for writing
open(MONITORSTATS,">$monitorstats") || die "Cannot open monitor stats file $monitorstats!!\n";
open(MONITORSTATSHIS,">$monitorstatshis") || die "Cannot open monitor stats file $monitorstatshis!!\n";
# write it all out - will always print the latest values
foreach $monitor (keys %mons) {
# dereference our anonymous arrays
  @times = @{$mont{"$monitor"}};
  @dates = @{$mond{"$monitor"}};
  @status = @{$monh{"$monitor"}};
  @value = @{$monv{"$monitor"}};
# last element is the latest measured status and value
  print MONITORSTATS "$monitor\t$status[-1]\t$value[-1]\n";
  print MONITORSTATSHIS "$monitor\n";
  #for ($i=-11;$i<0;$i++) {
# put latest measure on top
  for ($i=-1;$i>-13;$i--) {
    $time = defined $times[$i] ? $times[$i] : "NA";
    $date = defined $dates[$i] ? $dates[$i] : "NA";
    $stat = defined $status[$i] ? $status[$i] : "NA";
    $val = defined $value[$i] ? $value[$i] : "NA";
    print MONITORSTATSHIS "\t$time\t$date\t$stat\t$val\n";
  }
}

As I said it gets called every minute by cron.

That’s it! I enter the url sitescope.drj.com/SS/ss to access the main program which gets executed because I made /SS a CGI-BIN directory.

This gives you a read-only, Java-free view into your SiteScope status and hierarchy which beckons back to the good old days of Freshwater SiteScope.

Know your limits
What it does not do, unfortunately, is allow you to run a monitor – that seems like the next most simple thing which I should have been able to do but couldn’t figure out – much less define new monitors (never going to happen) or alerts.

I use this successfully against my HP SiteScope instance of roughly 400 monitors which itself is on a VM and there is no apparent strain. At some point this simple-minded script would no longer scale to suit the task at hand, but it might be good for up to a few thousand monitors.

And now a word about open source alternatives
Since I was so enamored with SiteScope Classic there seemed to be no compelling reason to shell out the dough for HP SiteScope with its unwanted interface, so I briefly looked around at free alternatives. Free sounds good, right? Not so much in practice. Out there in Cyberspace there is an enthusiast for a product called Zabbix. I just want to go on the record that Zabbix is the most confused piece of junk I have run across. You are getting less than what you paid for ($0) because you will be wasting a lot of time with it, and in the end it isn’t all that capable. Nagios also had its limits – I can’t remember the exact reason I didn’t go down that route, but there were definite reasons.

HP SiteScope is no panacea. “HP” and “stifling bureaucracy” need to be mentioned in the same sentence. Every time we renew support it is the most confusing mess of line items. Every time there’s a new cast of characters over at HP who nothing about the account’s history. You practically have to beg them to accept your money for a low-budget item like SiteScope because they really don’t pursue it in any way. Then their SAID and contract numbers stuff is confusing if you only see it once every few years.

Conclusion
A conversion program does exist for turning the finicky HP SiteScope Java-encumbered view into pure SiteScope Classic because I wrote it! But it’s a limited read-only view. Still, it’s helpful in a pinch and can even be viewed on the Blackberry’s browser.

Another problem is that HP has threatened to completely change the API so this tool, which is designed for HP SiteScope v 10.12, will probably completely break for newer versions. Oh, well.

References
This post shows some silly mistakes to avoid when doing a minor upgrade in version 11.

Categories
Internet Mail

How to run sendmail in queue-only mode

Intro
I guess I’ve ragged on sendmail before. Incredibly powerful program. Finding out how to do that simple thing you want to do may not be so easy, even with the bible at your side. So to that end I’m making an effort to document those simple things which I’ve found I’ve struggled with.

The Details
Today I wanted to capture all email coming into my sendmail daemon. Well, actually it’s a little more complicated. I didn’t want to disturb production email, but I wanted to capture a spam sample. Today there was a hugely effective spam campaign purporting to be email from the Better Business Bureau (BBB). All the emails however actually came from various senders @aicpa.org. Postini put a filter in place but I knew more were getting through. But they weren’t coming to me. How to get capture them without disturbing users?

In this post I gave some obscure but useful tips for sendmail admins, including the ever-useful smarttable add-on. To reprise, smarttable allows you to make delivery decisions based on sender! That’s totally antithetical to your run-of-the-mill sendmail admin, but it’s really useful… Like now. So I quickly put up a sendmail instance, copying a working config I use in production. But I changed the listener to IP address 127.0.0.2 (which I fortunately had already set up for some other reason I can no longer recall). That one’s pretty standard. That’s just:

DAEMON_OPTIONS(`Name=sm-cap, Addr=127.0.0.2')dnl

Of course you want to create a new queue directory just for the captured emails. I created /mqueue/c0 and put in this line into my .mc file:

define(QUEUE_DIR, `/mqueue/c*')dnl

And here’s the main point, how to defer delivery of all emails. Sendmail actually distinguishes between defer and queueonly. I chose queueonly thusly:

define(`confDELIVERY_MODE',`queueonly')dnl

If by chance you happen to misspell DELIVERY_MODE, like, let’s say, DELIERY_MODE, you don’t seem to get a whole lot of errors. Not that that would ever happen to us, mind you, I’m just saying. That’s why it’s good to also know about the command-line option. Keep reading for that.

It’s simple enough to test once you have it running (which I do with this line: sudo sendmail -bd -q -C/etc/mail/capture.cf).

> telnet 127.0.0.2 25
Trying 127.0.0.2…
Connected to 127.0.0.2.
Escape character is ‘^]’.
220 drj.com ESMTP server ready at Fri, 24 Feb 2012 15:16:40 -0500
helo localhost
250 drjemgw2.drj.com Hello [127.0.0.2], pleased to meet you
mail from: [email protected]
250 2.1.0 [email protected]… Sender ok
rcpt to: [email protected]
250 2.1.5 [email protected]… Recipient ok
data
354 Enter mail, end with “.” on a line by itself
subject: test of the capture-only sendmail instance

Just a test!
-Dr J
.

250 2.0.0 q1OKGet2008636 Message accepted for delivery
quit
221 2.0.0 drj.com closing connection
Connection closed by foreign host.

Is the message there, queued up the way we’d like? You bet:

> ls -l /mqueue/c0

total 16
-rw------- 1 root root  19 2012-02-24 15:17 dfq1OKGet2008636
-rw------- 1 root root 542 2012-02-24 15:17 qfq1OKGet2008636

There also seems to be a second way to run sendmail in queue-only fashion. I got it to work from the command-line like this:

> sudo sendmail -odqueueonly -bd -C/etc/mail/capture.cf

The book says this is deprecrated usage, however. But let’s see, that’s O’Reilly’s Sendmail 3rd edition, published in 2003, we’re in 2012, so, hmm, they still haven’t cut us off…

One last thing, that smarttable entry for my main sendmail daemon. I added the line:

@aicpa.org relay:[127.0.0.2]

Conclusion
It can be useful to queue all incoming emails for various reasons. It’s a little hard to find out how to do this precisely. We found a way to do this without stopping/starting our main sendmail process. This post shows a couple ways to do it, and why you might need to.

May 2012 Update
Just wanted to mention about BBB email how I handle it now. They told me they maintain an accurate SPF record. Sure enough, they do. Now we only accept bbb.org email when the SPF record is a match. But I don’t use sendmail for that, I use Postini’s (OK, Google’s, technically) mail hygiene service. Postini rocks!

My most recent post on how to tame the confounding sendmail log is here.

Categories
Apache

Running CGI Scripts from any Directory with Apache

Intro
This is a really basic issue, but the documentation out there often doesn’t speak directly to this single issue – other things get thrown into the mix. This document is to show how to enable the running of CGI scripts from any directory under htdocs with the Apache2 web server.

The Details
Let’s get into it. CGI – common gateway interface – is a great environment for simple web server programs. But if you’ve got a fairly generic apache2 configuration file and are trying CGI you might encounter a few different errors before getting it right. Here’s the contents of my test.cgi file:

#!/bin/sh
echo "Content-type: text/html"
echo "Location: http://drjohnstechtalk.com"
echo ""

Not tuning my dflt.conf apache2 configuration file in any way, I find to my horror that the cgi file contents gets returned as is initially, just as if it were no more special than a random test file:

> curl -i localhost/test/test.cgi

HTTP/1.1 200 OK
Date: Fri, 24 Feb 2012 14:07:37 GMT
Server: Apache/2
Last-Modified: Fri, 24 Feb 2012 14:05:29 GMT
ETag: "56d8-5c-4b9b640c9dc40"
Content-Length: 92
Content-Type: text/plain
 
#!/bin/sh
echo "Content-type: text/html"
echo "Location: http://drjohnstechtalk.com"
echo ""

That’s no good.

Now I do some research and realize the following line should be added. Here I show it in its context:

<Directory "/usr/local/apache2/htdocs">
# make .cgi and .pl extensions valid CGI types
    AddHandler cgi-script .cgi .pl
    ...

and we get…
> curl -i localhost/test/test.cgi

HTTP/1.1 403 Forbidden
Date: Fri, 24 Feb 2012 14:11:55 GMT
Server: Apache/2
Content-Length: 215
Content-Type: text/html; charset=iso-8859-1
 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access /test/test.cgi
on this server.</p>
</body></html>

Hmmm. Better, perhaps, but definitely not working. What did we miss? Chances are we had something like this in our dflt.conf:

<Directory "/usr/local/apache2/htdocs">
# make .cgi and .pl extensions valid CGI types
    AddHandler cgi-script .cgi .pl
    Options Indexes FollowSymLinks
    ...

but what we need is this:

<Directory "/usr/local/apache2/htdocs">
# make .cgi and .pl extensions valid CGI types
    AddHandler cgi-script .cgi .pl
    Options Indexes FollowSymLinks ExecCGI
    ...

Note the addition of the ExecCGI to the Options statement.

With that tweak to our apache2 configuration we get the desired result after restarting:

> curl -i localhost/test/test.cgi

HTTP/1.1 302 Found
Date: Fri, 24 Feb 2012 14:15:47 GMT
Server: Apache/2
Location: http://drjohnstechtalk.com
Content-Length: 209
Content-Type: text/html; charset=iso-8859-1
 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="http://drjohnstechtalk.com">here</a>.</p>
</body></html>

But what if we want our CGI script to be our default page? What I did for that is to add this statement:

DirectoryIndex index.html index.cgi

so now we have:

<Directory "/usr/local/apache2/htdocs">
# make .cgi and .pl extensions valid CGI types
    AddHandler cgi-script .cgi .pl
    Options Indexes FollowSymLinks ExecCGI
    DirectoryIndex index.html index.cgi
    ...

I create an index.cgi and delete the index.html and index.cgi is run from the top-level URL:

> curl -i johnstechtalk.com

HTTP/1.1 302 Found
Date: Fri, 29 Apr 2013 14:15:47 GMT
Server: Apache/2
Location: http://drjohnstechtalk.com/blog/
Content-Length: 209
Content-Type: text/html; charset=iso-8859-1
 
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="http://drjohnstechtalk.com/blog/">here</a>.</p>
</body></html>

Conclusion
We have shown the two most common problems that occur when trying to enable CGI execution with the apache2 web server. We have shown how to fix the configuration.

Categories
TCP/IP

C++ TCP Socket Program

Intro
I was looking around for a sample TCP socket program written in C++ that might make working with TCP sockets less mysterious. I expected to find a flood of things to pick from, but that really wasn’t the case.

The Details
OK, I only looked for a few minutes, to be honest. The one I did settle on seems adequate. It’s sufficiently old, however, that it doesn’t actually work as-is. Probably if it did I wouldn’t even mention it. So I thought it was worth repeating here, with some tiny semantic updates.

What I used is from this web page: http://cs.baylor.edu/~donahoo/practical/CSockets/practical/. I was really only interested in the TCP echo client. It’s a good stand-ion for any TCP client I think.

Here’s TCPechoClient.cpp:

/*
 *   C++ sockets on Unix and Windows
 *   Copyright (C) 2002
 *
 *   This program is free software; you can redistribute it and/or modify
 *   it under the terms of the GNU General Public License as published by
 *   the Free Software Foundation; either version 2 of the License, or
 *   (at your option) any later version.
 *
 *   This program is distributed in the hope that it will be useful,
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *   GNU General Public License for more details.
 *
 *   You should have received a copy of the GNU General Public License
 *   along with this program; if not, write to the Free Software
 *   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 */
 
// taken from http://cs.baylor.edu/~donahoo/practical/CSockets/practical/TCPEchoClient.cpp
 
#include "PracticalSocket.h"  // For Socket and SocketException
#include <iostream>           // For cerr and cout
#include <cstdlib>            // For atoi()
#include <cstring>            // author forgot this
 
using namespace std;
 
const int RCVBUFSIZE = 32;    // Size of receive buffer
 
int main(int argc, char *argv[]) {
  if ((argc < 3) || (argc > 4)) {     // Test for correct number of arguments
    cerr << "Usage: " << argv[0]
         << " <Server> <Echo String> [<Server Port>]" << endl;
    exit(1);
  }
 
  string servAddress = argv[1]; // First arg: server address
  char *echoString = argv[2];   // Second arg: string to echo
// DrJ test
//  echoString = "GET / HTTP/1.0\n\n";
  int echoStringLen = strlen(echoString);   // Determine input length
  unsigned short echoServPort = (argc == 4) ? atoi(argv[3]) : 7;
 
  try {
    // Establish connection with the echo server
    TCPSocket sock(servAddress, echoServPort);
 
    // Send the string to the echo server
    sock.send(echoString, echoStringLen);
 
    char echoBuffer[RCVBUFSIZE + 1];    // Buffer for echo string + \0
    int bytesReceived = 0;              // Bytes read on each recv()
    int totalBytesReceived = 0;         // Total bytes read
    // Receive the same string back from the server
    cout << "Received: ";               // Setup to print the echoed string
    while (totalBytesReceived < echoStringLen) {
      // Receive up to the buffer size bytes from the sender
      if ((bytesReceived = (sock.recv(echoBuffer, RCVBUFSIZE))) <= 0) {
        cerr << "Unable to read";
        exit(1);
      }
      totalBytesReceived += bytesReceived;     // Keep tally of total bytes
      echoBuffer[bytesReceived] = '\0';        // Terminate the string!
      cout << echoBuffer;                      // Print the echo buffer
    }
    cout << endl;
 
    // Destructor closes the socket
 
  } catch(SocketException &e) {
    cerr << e.what() << endl;
    exit(1);
  }
 
  return 0;
}

Note the cstring header file I needed to include. The standard must have changed to require this since the original code was published.

Then I neeed PracticalSocket.h, but that has no changes from the original version: “http://cs.baylor.edu/~donahoo/practical/CSockets/practical/PracticalSocket.h, and his Makefile is also just fine: http://cs.baylor.edu/~donahoo/practical/CSockets/practical/Makefile. For the fun of it I also set up the TCP Echo Server: http://cs.baylor.edu/~donahoo/practical/CSockets/practical/TCPEchoServer.cpp.

Run

make TCPEchoclient

and you should be good to go. How to test this TCPEchoClient against your web server? I found that the following works:

~/TCPEchoClient drjohnstechtalk.com 'GET / HTTP/1.0
Host: drjohnstechtalk.com
 
' 80

which gives this output:

Received: HTTP/1.1 301 Moved Permanently
Date: Thu, 23 Feb 2012 17:19:02

which, now that I analyze it, looks cut-off. Hmm. Because with curl I have:

curl -i drjohnstechtalk.com

HTTP/1.1 301 Moved Permanently
Date: Thu, 23 Feb 2012 17:19:43 GMT
Server: Apache/2.2.16 (Ubuntu)
X-Powered-By: PHP/5.3.3-1ubuntu9.5
Location: http://www.drjohnstechtalk.com/blog/
Vary: Accept-Encoding
Content-Length: 2
Content-Type: text/html

I guess that’s what you get for demo code. At this point I don’t have a need to sort it out so I won’t. Perhaps we’ll come back to it later. Looking at it, I see the received buffer size is quite small, 32 bytes. I tried to set that to a reasonable value, 200 MBytes, but get a segmentation fault. The largest I could manage, after experiementation, is 10000000 bytes:

//const int RCVBUFSIZE = 32;    // Size of receive buffer
const int RCVBUFSIZE = 10000000;    // Size of receive buffer - why is 10 MB the max. value??

and this does indeed give us the complete output from our web server home page now.

Conclusion
There is some demo C++ code which creates a useable class for dealing with TCP sockets. There might be some work to do before it could be used in a serious application, however.

Categories
Apache Linux Web Site Technologies

Turning Apache into a Redirect Factory

Intro
I’m getting a little more used to Apache. It’s a strange web server with all sorts of bolt-on pieces. The official documentation is horrible so you really need sites like this to explain how to actually do useful things. You needs real, working examples. In this example I’m going to show how to use the mod_rewrite engine of Apache to build a powerful and convenient web server whose sole purpose in life is for all types of redirects. I call it a redirect factory.

Which Redirects Will it Handle
The redirects will be read in from a file with an easy, editable format. So we never have to touch our running web server. We’ll build in support for the types of redirect requests that I have actually encountered. We don’t care what kind of crazy stuff Apache might permit. You’ll pull your hair out trying to understand it all. All redirects I have ever encountered fall into a relatively small handful of use cases. Ordered by most to least common:

  1. host -> new_url
  2. host/uri[Suffix] -> new_fixed_url (this can be a case-sensitive or case-insensitive match to the uri)
  3. host/uri[Suffix] -> new_prefix_uri[Suffix] (also either case-sensitive or not)

So some examples (not the best examples because I don’t manage drj.com or drj.net, but pretend I did):

  1. drj.com/WHATEVER -> http://drjohnstechtalk.com/
  2. www.drj.com -> http://drjohnstechtalk.com/
  3. drj.com/abcPATH/Preserve -> http://drjohnstechtalk.com/abcPATH/Preserve
  4. drj.com/defPATH/Preserve -> http://drjohnstechtalk.com/ghiPATH/Preserve
  5. drj.com/path/with/slash -> http://drjohnstechtalk.com/other/path
  6. drj.com/path/with/prefix -> http://drjohnstechtalk.com/other/path
  7. drj.net/pAtH/whatever -> https://drjohnstechtalk.com/straightpath
  8. drj.net/2pAtH/stuff?hi=there http://drjohnstechtalk.com/2straightpath/stuff?hi=there
  9. my.host -> http://regular-redirect.com/
  10. whatever-host.whatever-domain/whatever-URI -> http://whatever-new-host.whatever-new-domain/whatever-new-URI

All these different cases can be handled with one config file. I’ve named it redirs.txt. It looks like this:

# redirs file
# The default target has to be listed first
defaultTarget   D       http://www.drjohnstechtalk.com/blog/
# hosts with URI-matching grouped together
# available flags: "P" - preserve part after match
#                  "C" - exact case match of URI
 
# Begin host: drj.com:www.drj.com - ":"-separated list of applicable hostnames
/                       http://drjohnstechtalk.com/
/abc    P       http://drjohnstechtalk.com/abc
/def    P       http://drjohnstechtalk.com/ghi
/path/with/slash https://drjohnstechtalk.com/other/path
/path/with/prefix P  https://drjohnstechtalk.com/other/path
# end host drj.com:www.drj.com
 
# this syntax - host/URI - is also OK...
drj.net/ter             http://drjohnstechtalk.com/terminalredirect
drj.net/pAtH    C       http://drjohnstechtalk.com/straightpath
drj.net/2pAtH   CP      http://drjohnstechtalk.com/2straightpath
 
# hosts with only host-name matching
my.host                 http://regular-redirect.com/
www.drj.edu             http://education-redirect.edu/edu-path

The Apache configuration file piece is this:

# I really don't think this does anything other than chase away a scary warning in the error log...
RewriteLock ${APACHE_LOCK_DIR}/rewrite_lock
 
# Inspired by the dreadful documentation on http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
RewriteEngine on
RewriteMap  redirectMap prg:conf/vhosts/redirect.pl
#RewriteCond ${lowercase:%{HTTP_HOST}} ^(.+)$
RewriteCond ${redirectMap:%{HTTP_HOST}%{REQUEST_URI}} ^(.+)$
# %N are backreferences to RewriteCond matches, and $N are backreferences to RewriteRule matches
RewriteRule ^/.* %1 [R=301,L]

Remember I split up apache configuration into smaller files. So that’s why you don’t see the lines about logging and what port to listen on, etc. And the APACHE_LOCK_DIR is an environment variable I set up elsewhere. This file is called redirect.conf and is in my conf/vhosts directory.

In my main httpd.conf file I extended the logging to prefix the lines in the access log with the host name (since this redirect server handles many host names this is the only way to get an idea of which hosts are popular):

...
    LogFormat "%{Host}i %h %l %u %t \"%r\" %&gt;s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
...

So a typical log line looks something like the following:

drj.com 201.212.205.11 - - [10/Feb/2012:09:09:07 -0500] "GET /abc HTTP/1.1" 301 238 "http://www.google.com.br/url?sa=t&amp;rct=j&amp;q=drjsearch" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E)"

I had to re-compile apache because originally my version did not have mod_rewrite compiled in. My description of compiling Apache with this module is here.

The directives themselves I figured out based on the lousy documentation at their official site: http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html. The heavy lifting is done in the Perl script because there you have some freedom (yeah!) and are not constrained to understand all their silly flags. One trick that does not seem documented is that you can send the full URL to your mapping program. Note the %{HTTP_HOST}%{REQUEST_URI} after the “:”.

I tried to keep redirect.pl brief and simple. Considering the many different cases it isn’t too bad. It weighs in at 70 lines. Here it is:

#!/usr/bin/perl
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# input is $HTTP_HOST$REQUEST_URI
$redirs = "redirs.txt";
# here I only want the actual script name
$working_directory = $script_name = $0;
$script_name =~ s/.*\///g;
$working_directory =~ s/\/$script_name$//g;
$finalType = "";
$DEBUG = 0;
$|=1;
while () {
  chomp;
  ($host,$uri) = /^([^\/]+)\/(.*)/;
  $host = lc $host;
# use generic redirect file
  open(REDIRS,"$working_directory/$redirs") || die "Cannot open redirs file $redirs!!\n";
  $lenmatchmax = -1;
  while() {
# look for alternate names section
    if (/#\s*Begin host\s*:\s*(\S+)/i) {
      @hostnames = split /:/,$1;
      $pathsection = 1;
    } elsif (/#\s*End host/i) {
      $pathsection = 0;
    }
    @hostnames = () unless $pathsection;
    next if /^#/ || /^\s*$/; # ignore comments and blank lines
    chomp;
    $type = "";
# take out trailing spaces after the target URL
    s/\s+$//;
    if (/^(\S+)\s+(\S{1,2})\s+(\S+)$/) {
      ($redirsURL,$type,$targetURL) = ($1,$2,$3);
    } else {
       ($redirsURL,$targetURL) = /^(\S+)\s+(\S+)$/;
    }
# set default target if specified. It has to come at beginning of file
    $finalURL = $targetURL if $type =~ /D/;
    $redirsHost = $redirsURI = $redirsURIesc = "";
    ($redirsHost,$redirsURI) = $redirsURL =~ /^([^\/]*)\/?(.*)/;
    $redirsURIesc = $redirsURI;
    $redirsURIesc =~ s/([\/\?\.])/\\$1/g;
    print "redirsHost,redirsURI,redirsURIesc,targetURL,type: $redirsHost,$redirsURI,$redirsURIesc,$targetURL,$type\n" if $DEBUG;
    push @hostnames,$redirsHost unless $pathsection;
    foreach $redirsHost (@hostnames) {
    if ($host eq $redirsHost) {
# assume case-insensitive match by default.  Use type of 'C' to demand exact case match
# also note this matches even if uri and redirsURI are both empty
      if ($uri =~ /^$redirsURIesc/ || ($type !~ /C/ &amp;&amp; $uri =~ /^$redirsURIesc/i)) {
# find longest match
        $lenmatch = length($redirsURI);
        if ($lenmatch &gt; $lenmatchmax) {
          $finalURL = $targetURL;
          $finalType = $type;
          $lenmatchmax = $lenmatch;
          if ($type =~ /P/) {
# prefix redirect
            if ($uri =~ /^$redirsURIesc(.+)/ || ($type !~ /C/ &amp;&amp; $uri =~ /^$redirsURIesc(.+)/i)) {
              $finalURL .= $1;
             }
          }
        }
      }
    } # end condition over input host matching host from redirs file
    } # end loop over hostnames list
  } # end loop over lines in redirs file
  close(REDIRS);
# non-prefix re-direct. This is bizarre, but you have to end URI with "?" to kill off the query string, unless the target already contains a "?", in which case you must NOT add it! Gotta love Apache...
  $finalURL .= '?' unless $finalType =~ /P/ || $finalURL =~ /\?/;
  print "$finalURL\n";
} # end loop over STDIN

The nice thing here is that there are a couple of ways to test it, which gives you a sort of cross-check capability. Of course I made lots of mistakes in programming it, but I worked through all the cases until they were all right, using rapid testing.

For instance, let’s see what happens for www.drj.com. We run this test from the development server as follows:

> curl -i -H ‘Host: www.drj.com’ ‘localhost:90’

HTTP/1.1 301 Moved Permanently
Date: Thu, 09 Feb 2012 15:24:25 GMT
Server: Apache/2
Location: http://drjohnstechtalk.com/
Content-Length: 235
Content-Type: text/html; charset=iso-8859-1

Moved Permanently

The document has moved here.

 

And from the command line I test redirect.pl as follows:

> echo “www.drj.com/”|./redirect.pl

http://drjohnstechtalk.com/?

That terminal “?” is unfortunate, but apparently you need it to kill off any possible query_string.

You want some more? OK. How about matching a host and the initial path in a case-insensitive manner? No problem, we’re up to the challenge:

> curl -i -H ‘Host: DRJ.COM’ ‘localhost:90/PATH/WITH/SLASH/stuff?hi=there’

HTTP/1.1 301 Moved Permanently
Date: Thu, 09 Feb 2012 15:38:12 GMT
Server: Apache/2
Location: https://drjohnstechtalk.com/other/path
Content-Length: 246
Content-Type: text/html; charset=iso-8859-1

Moved Permanently

The document has moved here.

 

Refer back to the redirs file and you see this is the desired behaviour.

We could go on with an example for each case, but we’ll conclude with one last one:

> curl -i -H ‘Host: DRJ.NET’ ‘localhost:90/2pAtHstuff?hi=there’

HTTP/1.1 301 Moved Permanently
Date: Thu, 09 Feb 2012 15:44:37 GMT
Server: Apache/2
Location: http://drjohnstechtalk.com/2straightpathstuff?hi=there
Content-Length: 262
Content-Type: text/html; charset=iso-8859-1

Moved Permanently

The document has moved here.

 

A case-sensitive, preserve match. Change “pAtH” to “path” and there is no matching line in redirs.txt so you will get the default URL.

Creating exceptions

Eventually I wanted to have an exception – a URI which should be served with a 200 status rather than redirected. How to handle?

# Inspired by the dreadful documentation on http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
        RewriteEngine on
# just this one page should NOT be redirected
        Rewriterule ^/dontredirectThisPage.php - [L]
        RewriteMap  redirectMap prg:redirect.pl
        ... etc ...

The above apache configuration snippet shows that I had to put the page which shouldn’t be redirected at the top of the ruleset and set the target to “-“, which turns off redirection for that match, and make this the last executed Rewrite rule. I think this is better than a negated match (!) which always gets complicated.

Conclusion
A powerful redirect factory was constructed from Apache and Perl. We suffered quite a bit during development because of incomprehensible documentation. But hopefully we’ve saved someone else this travail.

References and related

2022 update. This is a very nice commercial service for redirects which I have just learned about: https://www.easyredir.com/
This post describes how to massage Apache so that it always returns a maintenance page no matter what URI was originally requested.
I have since learned that another term used in the industry for rediect server is persistent URL (PURL). It’s explained in Wikipedia by this article: https://en.wikipedia.org/wiki/Persistent_uniform_resource_locator

Categories
Ajax

Web to ssh gateway – not so difficult with Right Tools

Intro
I won’t go into details in this posting for fear that the “bad people” will be more likely to benefit than the legitimate users of what I’m describing. That being said there are some legitimate uses, for instance when you need that terminal access but a direct ssh connection just isn’t available.

Ajaxterm
I’m kind of amazed at how far Javascript has come. You can implement a curses-based application in javascript, i.e., a terminal console? Yup. You bet. And the kicker is that it works quite well. Teraterm it ain’t, but I’ll be danged if you can’t vi a file, run top as well your basic commands, all over a pretty standard-looking web page. That’s what we mean by gateway – an application which converts one protocol to another. In this case HTTP to shell (I suppose).

The generic application is called ajaxterm. I used it from a distribution that runs a local python server on my server. It’s described here:

https://github.com/antonylesuisse/qweb/tree/master/ajaxterm/README.txt

If you keep the default screen size, 80×24, he says it has few enough characters that a screen refresh can be contained in one packet. In my testing the echo delay was probably under one quarter second.

Forget about a scroll bar holding 1000’s of lines, however. You get just your basic terminal like in the old days.

Someone reminded me about screen, which I hadn’t been using. Screen is an extremely useful tool. It’s like a terminal multiplexor. Now I normally set up my screen escape sequence to be Ctrl-\, but for some reason this particular sequence is not recognized by Ajaxterm. What I settled on instead is Ctrl-g (escape ^Gg in your .screenrc). I don’t like to use the default Ctrl-a because this is a useful emacs editing mode sequence – takes you to beginning of line. Popping between screens is a little slow with ajaxterm as might be expected. It’s a worst-case, everything must be re-drawn situation, I suppose. But ajaxterm + screen is a pretty powerful combination.

Conclusion
Now I have an additional path to my server’s command line if a direct ssh connection isn’t available.

Categories
Ajax flot jquery Perl

Making Function Plots fun using Ajax while solving a real-world problem

Intro
I learned an awful lot from this exercise. I wanted to plot the trajectory of a foam basketball through the air. You know the kind of thing where you can vary the initial conditions to see what differences the results will produce. Finally, finally a good excuse to learn some Ajax. Ajax is a natural fit because you can work within the same web page and the feel is more interactive.

High level description
There’s so much here to describe I hardly know where to begin. I may never get through describing it all.

At the highest levels I had to learn some of the following:

  • php
  • Ajax
  • DOM
  • Javascript
  • jquery
  • flot
  • json

Perl and basic physics are not on the list – they are used but I already know those!

I basically only learned as much as I needed to accomplish the task. This saved me quite a bit of time as you can get bogged down for months in any single one of those topics above. I’m pretty good at “programming by analogy” and this really put those skills to the test because, as is usually the case, analogies were indeed present, but they weren’t very exact so I needed a scary amount of extrapolation from what samples were easily available.

The net result of all this? I think it’s pretty neat if I say so myself. This web page follows the trajectory of a small foam basketball from a given set of initial conditions. The trajectory is plotted. You tweak the initial conditions and a new trajectory is plotted on top of the old one so you can see the differences. Here’s a link to the application.

To be continued in great detail, hopefully…

Categories
Admin Network Technologies

The IT Detective Agency: virus updates are failing

Intro
This case hardly qualifies as worthy subject matter for the IT Detective Agency – it’s pretty run-of-the-mill stuff. But I wanted to document it for completeness and show how a problem in one thing can turn out to have an unexpected cause (At least to me. In hindsight it’s dead obvious what the issue was likely to have been).

The Situation
We have lots of servers at drjohns. So when one of our admins, Shake, said that one of them, nfuz01, can’t reach the Etrust serverto get its virus updates I had no recollection of what that server is or does. Shake asked if the firewall had changed recently. That’s sort of a tricky question because there are always minor changes being done. Most have absolutely no effect because they are additional rules providing new permissions. So I bravely answered No, it hadn’t. And I wondered what he meant in using the word “reach” anyways.

So I walk up to Shake’s desk to get a better idea. He said not only are updates not virus signature updates not occurring, but neither server can PING the other, neither by name nor by IP address. Now we’re getting somewhere. I still haven’t registered where nfuz01 is, but I know the firewall as I’ve set it up permits ICMP traffic to transit. I suggested that maybe nfuz01 had some missing or messed-up routes. Then I went back to my desk to think some more. That’s what gets me motivated – when I’ve publicly speculated about the root cause of something. It’s not so much that I may be proven wrong, but if I am wrong, I want to be the first to find out and issue a correction.

So I tried a PING from my desktop:

C:\>ping 10.91.12.14
 
Pinging 10.91.12.14 with 32 bytes of data:
Reply from 171.18.252.10: TTL expired in transit.
Reply from 171.18.252.10: TTL expired in transit.

I look up where nfuz01 is. It is in a secondary data center. I ping it from a server in that same data center, but one a different segment – it works fine! I ping it from a Linux server in my main data center – totally different results:

> ping 10.91.12.14
PING 10.91.12.14 (10.91.12.14) 56(84) bytes of data.
From 171.18.252.10 icmp_seq=1 Time to live exceeded
From 171.18.252.10 icmp_seq=2 Time to live exceeded
 
--- 10.91.12.14 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1000ms
 
> traceroute -n 10.91.12.14
traceroute to 10.91.12.14 (10.91.12.14), 30 hops max, 40 byte packets
 1  10.136.188.2  1.100 ms  1.523 ms  2.010 ms
 2  10.1.4.2  0.934 ms  0.941 ms  0.773 ms
 3  10.1.4.141  0.869 ms  0.926 ms  0.913 ms
 4  171.18.252.10  1.076 ms  1.096 ms  1.130 ms
 5  10.1.4.129  1.043 ms  1.029 ms  1.018 ms
 6  10.1.4.141  0.993 ms  0.611 ms  0.918 ms
 7  171.18.252.10  0.932 ms  0.916 ms  1.002 ms
 8  10.1.4.129  0.987 ms  1.089 ms  1.121 ms
 9  10.1.4.141  1.152 ms  1.246 ms  1.229 ms
10  171.18.252.10  2.040 ms  2.747 ms  2.735 ms
11  10.1.4.129  1.332 ms  1.418 ms  1.467 ms
12  10.1.4.141  1.477 ms  1.754 ms  1.685 ms
13  171.18.252.10  1.993 ms  1.978 ms  2.013 ms
14  10.1.4.129  1.930 ms  1.960 ms  2.039 ms
15  10.1.4.141  2.065 ms  2.156 ms  2.140 ms
16  171.18.252.10  2.116 ms  5.454 ms  5.453 ms
17  10.1.4.129  4.466 ms  4.385 ms  4.296 ms
18  10.1.4.141  4.266 ms  4.267 ms  4.260 ms
19  171.18.252.10  4.232 ms  4.216 ms  4.216 ms
20  10.1.4.129  4.182 ms  4.063 ms  4.009 ms
21  10.1.4.141  3.994 ms  3.987 ms  2.398 ms
22  171.18.252.10  2.400 ms  2.484 ms  2.690 ms
23  10.1.4.129  2.346 ms  2.449 ms  2.544 ms
24  10.1.4.141  2.534 ms  2.607 ms  2.610 ms
25  171.18.252.10  2.602 ms  2.742 ms  2.736 ms
26  10.1.4.129  2.776 ms  2.856 ms  2.848 ms
27  10.1.4.141  2.648 ms  3.185 ms  3.291 ms
28  171.18.252.10  3.236 ms  3.223 ms  3.235 ms
29  10.1.4.129  3.219 ms  3.277 ms  3.377 ms
30  10.1.4.141  3.363 ms  3.381 ms  3.449 ms

Cool, right? We’ve caught a network loop in the act. Now I know it isn’t the firewall, it isn’t the routes on nfuz01 but it is something with networking. So I sent that off to them….

In less than an hour I got the explanation as well as the fix:


All should be reachable again. There’s a loop I can’t clear amongst some [telecom-owned] routers in the main data center. I’ve superseded it with two /27s until they clear it.

And it pings fine now:

> ping 10.91.12.14
PING 10.91.12.14 (10.91.12.14) 56(84) bytes of data.
64 bytes from 10.91.12.14: icmp_seq=1 ttl=125 time=46.2 ms
64 bytes from 10.91.12.14: icmp_seq=2 ttl=125 time=40.1 ms
64 bytes from 10.91.12.14: icmp_seq=3 ttl=125 time=23.1 ms
 
--- 10.91.12.14 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 23.114/36.502/46.275/9.796 ms

And Shake says the updates came in.

Case closed!

Conclusion
Why wasn’t the problem more obvious to us from the very beginning? Well, if the admin who said nfuz01 couldn’t reach Etrust had tried to log in nfuz01 through the normal Remote Desktop mechanism – and of course failed – then we might have drilled down into a networking cause more quickly. But nfuz01 is a VM and he must have been logged on via VMWare Virtual Center and so he hadn’t noticed that basically the server couldn’t reach anywhere in our main data center. It is also an obscure server (remember that I had no recall about it?) so no one really noticed that it was effectivley out-of-business.

Categories
Admin IT Operational Excellence Proxy

The IT Detective Agency: the case of the Sales and Use tax software

Intro
I have to give credit to my colleague “Ben” for cracking this case, which left me scratching my head. Users at drjohns were getting new Windows 7 PCs and some of the old software wasn’t going to run on those new PCs, including our indirect tax sales and use software from Thomson Reuters. The new approach is SaaS – software as a service. The new package was approved and everyone thought it was going to work fine, until late in the game it was actually tested. They couldn’t bring up their old tax returns. So at the last hour they bring in the Internet experts.

The Details
At drjohns our users are insulated from the Internet by proxy servers. There are no direct routes. It’s private address space and an explicit proxy connection to browse out to Internet. 99% of the time this works fine. And it sure is a secure way to go. But those exceptions can be quite a headache. This case is a very typical presentation of what we see, though the particular solution varies case by case.

We get detailed network requirements. They usually talk about opening up the firewall to certain servers, etc. We always patiently explain that the firewall is open – to the proxy! The desktops have no Internet routes, nor can they resolve Internet domain names. That’s right we have private root DNS servers. Most vendors have never encountered this setup and so they dig in their heels and insist that the only way is to ‘open the firewall…”

This case was no different, except we didn’t actually talk to the vendor. But their requirements were crystal clear in this networking document. Here’s the snippet that would seem to be fatal given our Intranet architecture:

RS APPLICATION SERVERS
The ONESOURCE Sales & Use Application Servers use TCP/IP communications from the client
PC to the Server. The requirements for communications with the ONESOURCE Application
Servers are itemized below:
- DNS Name Resolution is not used for the Application Servers.
- Proxy Server access to the Application Servers is supported ONLY in transparent mode. The
Proxy Server must not translate the TCP/IP address of the Application Servers. PCs must be
able to establish a connection using the actual TCP/IP address and port numbers of the
Application Server without application “awareness” of a Proxy Server.
- Network Address Translation (NAT) is supported for the client addresses but is NOT
supported for the Application Server addresses.
- Connections are outbound only from the client to the server.
- Security policies, firewall rules, proxy rules and router packet filters must allow outbound
connections (and inbound replies) on destination port 2429 to the Class “B” network address
164.57.0.0. when using the non-WCF application servers. If the client’s account has been
configured to use Windows Communications Foundation or WCF, there are no additional port
requirements. The source port selection uses standard port numbers 1024 and above.

The application installs about 10 ActiveX controls and it wouldn’t run on my desktop. Ben managed to get it to run using the OpenText socks client. It has an option to “socksify everything else” which he says proves to be very useful when you don’t know what specific application to socksify. So now let me repeat what I have just said: Ben got it to work without any changes to the firewall, ignoring all the vendor’s advice and requirements!

I was very pleased as this was getting to be a high-priority issue what with these sales and use taxes due each month.

But Ben didn’t stop there. He came up with even better solution. He said he was looking around at the folder where all the stuff is installed by the application. He noticed a file called ConfigProxy. He configured it to use the system proxy settings. Then he exempted the target site from proxy authentication. Lo and behold that worked as well, with no socksification required at all. We only socksify an app as a last resort.

This latest finding completely contradicts the vendor’s stated network requirements. But it’s better this way.

We now have a happy tax department. Case closed.

Conclusion
Vendor network requirements are not always what they seem. Clearly they are not testing in the more obscure environments such as a private Intranet with an independent namespace that connects to the wider Internet only via explicit proxy. If you’re in this situation, which offers some serious security advantages, there are things you can do to get demanding applications to work.