Category: Admin

The IT Detective Agency: Browsing Stopped Working on Internet-Connected Enterprise Laptops

Post author By john
Post date April 18, 2012
No Comments on The IT Detective Agency: Browsing Stopped Working on Internet-Connected Enterprise Laptops

Intro
Recall our elegant solution to DNS clobbering and use of a PAC file, which we documented here: The IT Detective Agency: How We Neutralized Nasty DNS Clobbering Before it Could Bite Us. Things were going smoothly with that kludge in place for months. Then suddenly it wasn’t. What went wrong and how to fix?? Read on.

The Details
We began hearing reports of increasing numbers of users not being able to access Internet sites on their enterprise laptops when directly connected to Internet. Internet Explorer gave that cryptic error to the effect Internet Explorer cannot display the web page. This was a real problem, and not a mere inconvenience, for those cases where a user was using a hotel Internet system that required a web page sign-on – that web page itself could not be displayed, giving that same error message. For simple use at home this error may not have been fatal.

But we had this working. What the…?

I struggled as Rop demoed the problem for me with a user’s laptop. I couldn’t deny it because I saw it with my own eyes and saw that the configuration was correct. Correct as in how we wanted it to be, not correct as in working. Basically all web pages were timing out after 15 seconds or so and displaying cannot display the web page. The chief setting is the PAC file, which is used by this enterprise on their Intranet.

On the Internet (see above-mentioned link) the PAC file was more of an annoyance and we had aliased the DNS value to 127.0.0.1 to get around DNS clobbering by ISPs.

While I was talking about the problem to a colleague I thought to look for a web server on the affected laptop. Yup, there it was:

C:\Users\drj>netstat -an|more

Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  ...

What the…? Why is there a web server running on port 80? How will it respond to a PAC file request? I quickly got some hints by hitting my own laptop with curl:

$ curl -i 192.168.3.4/proxy.pac

HTTP/1.1 404 Not Found Content-Type: text/html; charset=us-ascii Server: Microsoft-HTTPAPI/2.0 Date: Tue, 17 Apr 2012 18:39:30 GMT Connection: close Content-Length: 315 Not Found

Not Found

HTTP Error 404. The requested resource is not found.

So the server is Microsoft-HTTPAPI, which upon invetsigation seems to be a Microsoft Web Deployment Agent Service (MsDepSvc).

The main point is that I don’t remember that being there in the past. I felt it’s presence, probably a new “feature” explained the current problem. What to do about it however??

Since this is an enterprise, not a small shop with a couple PCs, turning off MsDepSvc is not a realistic option. It’s probably used for peer-to-peer software distribution.

Hmm. Let’s review why we think our original solution worked in the first place. It didn’t work so well when the DNS was clobbered by my ISP. Why? I think because the ISP put up a web server when it encountered a NXDOMAIN DNS response and that web server gave a 404 not found error when the browser searched for the PAC file. Turning the DNS entry to the loopback interface, 127.0.0.1, gave the browser a valid IP, one that it would connect to and quickly receive a TCP RST (reset). Then it would happily conclude there was no way to reach the PAC file, not use it, and try DIRECT connections to Internet sites. That’s my theory and I’m sticking to it!

In light of all this information the following option emerged as most likely to succeed: set the Internet value of the DNS of the PAC file to a valid IP, and specifically, one that would send a TCP RST (essentially meaning that TCP port 80 is reachable but there is no listener on it).

We tried it and it seemed to work for me and my colleague. We no loner had the problem of not being able to connect to Internet sites with the PAC file configured and directly connected to Internet.

I noticed however that once I got on the enterprise network via VPN I wasn’t able to connect to Internet sites right away. After about five minutes I could.

My theory about that is that I had a too-long TTL on my PAC file DNS entry. The TTL was 30 minutes. I shortened it to five minutes. Because when you think about it, that DNS value should get cached by the PC and retained even after it transitions to Intranet-connected.

I haven’t retested, but I think that adjustment will help.

Conclusion
I also haven’t gotten a lot of feedback from this latest fix, but I’m feeling pretty good about it.

Case: again mostly solved.

Hey, don’t blame me about the ambiguity. I never hear back from the users when things are working 🙂

References
A closely related case involving Verizon “clobbering” TCP RST packets also bit us.

Tags PAC file

Admin Security

User Add/Delete Jython Script for RSA Authentication Manager

Post author By john
Post date April 6, 2012
8 Comments on User Add/Delete Jython Script for RSA Authentication Manager

Intro
I thought I might save someone else the trouble of re-creating a respectable program using the Authentication Manager SDK for version 7.1 which can simply add and delete users to keep the local database in sync with an external database.

History Lesson
We had this problem licked under Authentication Manager v 6.1.2. In that version the sdk was TCL-based and for whatever reason, it seemed a whole lot simpler to understand the model and get working code. When we began to look at v 7.1 we saw we were confronted with a whole different animal that required new understanding and new skills to master.

The Details
Jython is Python plus Java. I really don’t know either language so I used a technique you might call programming by extrapolation. Here is the code. Not really understanding python I preserved as much as possible for fear of breaking something. I nevertheless had to be a little innovative and create a new function.

'''
 * Jython class demonstrating the Administration API
 * usage from a Jython script.
 *
 * Run this script in the utils directory of the Authentication Manager installation.
 *
 * Execute the command "rsautil jython AdminAPIDemos.py create <admin user name> <password>"
 * Execute the command "rsautil jython AdminAPIDemos.py assign <admin user name> <password>"
 * Execute the command "rsautil jython AdminAPIDemos.py update <admin user name> <password>"
 * Execute the command "rsautil jython AdminAPIDemos.py delete <admin user name> <password>"
 *
 * If you are executing this script in an environment other than the predefined
 * rsautil scripting tool you must make the CommandClientAppContext.xml file
 * available in the end of the classpath for this script. You must also configure
 * the necessary connection parameters in a properties file located in the process
 * working directory. See the provided samples for more information.
'''
 
# imports
from jarray import array
import sys
# DrJ required import
# Not Workign! from org.python.modules import re
from java.util.regex import *
from java.lang import *
 
 
from java.util import Calendar,Date
from java.lang import String
 
from org.springframework.beans import BeanUtils
 
from com.rsa.admin import AddGroupCommand
from com.rsa.admin import AddPrincipalsCommand
from com.rsa.admin import DeleteGroupCommand
from com.rsa.admin import DeletePrincipalsCommand
from com.rsa.admin import LinkGroupPrincipalsCommand
from com.rsa.admin import LinkAdminRolesPrincipalsCommand
from com.rsa.admin import SearchAdminRolesCommand
from com.rsa.admin import SearchGroupsCommand
from com.rsa.admin import SearchPrincipalsCommand
from com.rsa.admin import SearchRealmsCommand
from com.rsa.admin import SearchSecurityDomainCommand
from com.rsa.admin import UpdateGroupCommand
from com.rsa.admin import UpdatePrincipalCommand
from com.rsa.admin.data import AdminRoleDTOBase
from com.rsa.admin.data import GroupDTO
from com.rsa.admin.data import IdentitySourceDTO
from com.rsa.admin.data import ModificationDTO
from com.rsa.admin.data import PrincipalDTO
from com.rsa.admin.data import RealmDTO
from com.rsa.admin.data import SecurityDomainDTO
from com.rsa.admin.data import UpdateGroupDTO
from com.rsa.admin.data import UpdatePrincipalDTO
from com.rsa.authmgr.admin.agentmgt import AddAgentCommand
from com.rsa.authmgr.admin.agentmgt import DeleteAgentsCommand
from com.rsa.authmgr.admin.agentmgt import LinkAgentsToGroupsCommand
from com.rsa.authmgr.admin.agentmgt import SearchAgentsCommand
from com.rsa.authmgr.admin.agentmgt import UpdateAgentCommand
from com.rsa.authmgr.admin.agentmgt.data import AgentConstants
from com.rsa.authmgr.admin.agentmgt.data import AgentDTO, ListAgentDTO
from com.rsa.authmgr.admin.hostmgt.data import HostDTO
from com.rsa.authmgr.admin.principalmgt import AddAMPrincipalCommand
from com.rsa.authmgr.admin.principalmgt.data import AMPrincipalDTO
from com.rsa.authmgr.admin.tokenmgt import GetNextAvailableTokenCommand
from com.rsa.authmgr.admin.tokenmgt import LinkTokensWithPrincipalCommand
from com.rsa.authn import SearchPasswordPoliciesCommand
from com.rsa.authn import UpdatePasswordPolicyCommand
from com.rsa.authn.data import PasswordPolicyDTO
from com.rsa.command import ClientSession
from com.rsa.command import CommandException
from com.rsa.command import CommandTargetPolicy, ConnectionFactory
from com.rsa.command.exception import DataNotFoundException, DuplicateDataException
from com.rsa.common.search import Filter
 
'''
 * This class demonstrates the usage patterns of the
 * Authentication Manager 7.1 API.
 *
 * <p>
 * The first set of operations performed if the first
 * command line argument is equal to "create".
 * The sample creates a restricted agent, a group, and a user.
 * Links the user to the group and the group to the agent.
 * </p>
 * <p>
 * The second set of operations performed if the first
 * command line argument is equal to "delete".
 * Lookup the user, group and agent created above.
 * Delete the user, group and agent.
 * </p>
 * <p>
 * A third set of operations is performed if the first
 * command line argument is equal to "assign".
 * Lookup the user and assign the next available
 * SecurID token to the user.
 * Lookup the SuperAdminRole and assign it to the user.
 * </p>
 * <p>
 * A fourth set of operations performed if the first
 * command line argument is equal to "update".
 * Update the Agent, Group, and User objects.
 * </p>
 * <p>
 * A fifth set of operations performed if the first
 * command line argument is equal to "disable".
 * Lookup a password policy with a name that starts
 * with "Initial" and then disable the password history
 * for that policy. Use this to allow the sample to
 * perform multiple updates of the user password using
 * the same password for each update.
 * </p>
 * <p>
 * The APIs demonstrated include the use of the Filter
 * class to generate search expressions for use with
 * all search commands.
 * </p>
'''
class AdminAPIDemos:
 
    '''
     * We need to know these fairly static values throughout this sample.
     * Set the references to top level security domain (realm) and system
     * identity source to use later.
     *
     * @throws CommandException if something goes wrong
    '''
    def __init__(self):
        searchRealmCmd = SearchRealmsCommand()
        searchRealmCmd.setFilter( Filter.equal( RealmDTO.NAME_ATTRIBUTE, "SystemDomain"))
        searchRealmCmd.execute()
        realms = searchRealmCmd.getRealms()
        if( len(realms) == 0 ):
            print "ERROR: Could not find realm SystemDomain"
            sys.exit( 2 )
 
        self.domain = realms[0].getTopLevelSecurityDomain()
        self.idSource = realms[0].getIdentitySources()[0]
 
 
    '''
     * Create an agent and set it to be restricted.
     *
     * @param: name the name of the agent to create
     * @param: addr the IP address for the agent
     * @param: alt array of alternate IP addresses
     * @return: the GUID of the agent just created
     * 
     * @throws CommandException if something goes wrong
    '''
    def createAgent(self, name, addr, alt):
        # need a HostDTO to be set
        host = HostDTO()
        host.setName(name)
        host.setPrimaryIpAddress(addr)
        host.setSecurityDomainGuid(self.domain.getGuid())
        host.setNotes("Created by AM Demo code")
 
        # the agent to be created
        agent = AgentDTO()
        agent.setName(name)
        agent.setHost(host)
        agent.setPrimaryAddress(addr)
        agent.setAlternateAddresses(alt)
        agent.setSecurityDomainId(self.domain.getGuid())
        agent.setAgentType(AgentConstants.STANDARD_AGENT)
        agent.setRestriction(1) # only allow activated groups
        agent.setEnabled(1)
        agent.setOfflineAuthDataRefreshRequired(0)
        agent.setNotes("Created by AM Demo code")
 
        cmd = AddAgentCommand(agent)
 
	try:        
	    cmd.execute()
        except DuplicateDataException:
            print "ERROR: Agent " + name + " already exists."
	    sys.exit(2)
 
        # return the created agents GUID for further linking
        return cmd.getAgentGuid()
 
 
    '''
     * Lookup an agent by name.
     *
     * @param: name the agent name to lookup
     * @return: the GUID of the agent
     * 
     * @throws CommandException if something goes wrong
    '''
    def lookupAgent(self, name):
        cmd = SearchAgentsCommand()
        cmd.setFilter(Filter.equal(AgentConstants.FILTER_HOSTNAME, name))
        cmd.setLimit(1)
        cmd.setSearchBase(self.domain.getGuid())
        # the scope flags are part of the SecurityDomainDTO
        cmd.setSearchScope(SecurityDomainDTO.SEARCH_SCOPE_ONE_LEVEL)
 
        cmd.execute()
 
	if (len(cmd.getAgents()) < 1):
            print "ERROR: Unable to find agent " + name + "."  
	    sys.exit(2)
 
        return cmd.getAgents()[0]
 
 
    '''
     * Update an agent, assumes a previous lookup done by lookupAgent.
     *
     * @param agent the result of a previous lookup
     *
     * @throws CommandException if something goes wrong
    '''
    def updateAgent(self, agent):
        cmd = UpdateAgentCommand()
 
        agentUpdate = AgentDTO()
        # copy the rowVersion to satisfy optimistic locking requirements
        BeanUtils.copyProperties(agent, agentUpdate)
 
        # ListAgentDTO does not include the SecurityDomainId
        # use the GUID of the security domain where agent was created
        agentUpdate.setSecurityDomainId(self.domain.getGuid())
 
        # clear the node secret flag and modify some others
        agentUpdate.setSentNodeSecret(0)
        agentUpdate.setOfflineAuthDataRefreshRequired(1)
        agentUpdate.setIpProtected(1)
        agentUpdate.setEnabled(1)
        agentUpdate.setNotes("Modified by AM Demo code")
 
        # set the requested updates in the command
        cmd.setAgentDTO(agentUpdate)
 
        # perform the update
        cmd.execute()
 
 
    '''
     * Delete an agent.
     *
     * @param: agentGuid the GUID of the agent to delete
     * 
     * @throws CommandException if something goes wrong
    '''
    def deleteAgent(self, agentGuid):
        cmd = DeleteAgentsCommand( [agentGuid] )
        cmd.execute()
 
 
    '''
     * Create an IMS user, needs to exist before an AM user can be
     * created.
     *
     * @param: userId the user's login UID
     * @param: password the user's password
     * @param: first the user's first name
     * @param: last the user's last name
     * 
     * @return: the GUID of the user just created
     * 
     * @throws CommandException if something goes wrong
    '''
    def createUser(self, userId, password, first, last):
        cal = Calendar.getInstance()
 
        # the start date
        now = cal.getTime()
# DrJ: add 50 years from now!    
        cal.add(Calendar.YEAR, 50)
 
        # the account end date
        expire = cal.getTime()
 
        principal = PrincipalDTO()
        principal.setUserID( userId )
        principal.setFirstName( first )
        principal.setLastName( last )
        #     principal.setPassword( password )
 
        principal.setEnabled(1)
        principal.setLockoutStatus(0)
        principal.setAccountStartDate(now)
        #principal.setAccountExpireDate(expire)
        #principal.setAccountExpireDate(0)
        principal.setAdminRole(0)
        principal.setCanBeImpersonated(0)
        principal.setTrustToImpersonate(0)
 
        principal.setSecurityDomainGuid( self.domain.getGuid() )
        principal.setIdentitySourceGuid( self.idSource.getGuid() )
        principal.setDescription("Created by DrJ utilities")
 
        cmd = AddPrincipalsCommand()
        cmd.setPrincipals( [principal] )
 
        try:
            cmd.execute()
	except DuplicateDataException:
            print "ERROR: User " + userId + " already exists."
	    sys.exit(2)
 
        # only one user was created, there should be one GUID result
        return cmd.getGuids()[0]
 
 
    '''
     * Lookup a user by login UID.
     * 
     * @param: userId the user login UID
     *
     * @return: the GUID of the user record.
    '''
    def lookupUser(self, userId):
        cmd = SearchPrincipalsCommand()
        cmd.setFilter(Filter.equal(PrincipalDTO.LOGINUID, userId))
        cmd.setSystemFilter(Filter.empty())
        cmd.setLimit(1)
        cmd.setIdentitySourceGuid(self.idSource.getGuid())
        cmd.setSecurityDomainGuid(self.domain.getGuid())
        cmd.setGroupGuid(None)
        cmd.setOnlyRegistered(1)
        cmd.setSearchSubDomains(0)
 
        cmd.execute()
 
	if (len(cmd.getPrincipals()) < 1):
            print "ERROR: Unable to find user " + userId + "."
	    sys.exit(2)
 
        return cmd.getPrincipals()[0]
 
 
    '''
     * Update the user definition.
     *
     * @param user the principal object from a previous lookup
    '''
    def updateUser(self, user):
        cmd = UpdatePrincipalCommand()
        cmd.setIdentitySourceGuid(user.getIdentitySourceGuid())
 
        updateDTO = UpdatePrincipalDTO()
        updateDTO.setGuid(user.getGuid())
        # copy the rowVersion to satisfy optimistic locking requirements
        updateDTO.setRowVersion(user.getRowVersion())
 
        # collect all modifications here
        mods = []
 
        # first change the email
        mod = ModificationDTO()
        mod.setOperation(ModificationDTO.REPLACE_ATTRIBUTE)
        mod.setName(PrincipalDTO.EMAIL)
        mod.setValues([ user.getUserID() + "@mycompany.com" ])
        mods.append(mod) # add it to the list
 
        # also change the password
        mod = ModificationDTO()
        mod.setOperation(ModificationDTO.REPLACE_ATTRIBUTE)
        mod.setName(PrincipalDTO.PASSWORD)
        mod.setValues([ "MyNewPAssW0rD1!" ])
        mods.append(mod) # add it to the list
 
        # change the middle name
        mod = ModificationDTO()
        mod.setOperation(ModificationDTO.REPLACE_ATTRIBUTE)
        mod.setName(PrincipalDTO.MIDDLE_NAME)
        mod.setValues([ "The Big Cahuna" ])
        mods.append(mod) # add it to the list
 
        # make a note of this update in the description
        mod = ModificationDTO()
        mod.setOperation(ModificationDTO.REPLACE_ATTRIBUTE)
        mod.setName(PrincipalDTO.DESCRIPTION)
        mod.setValues([ "Modified by AM Demo code" ])
        mods.append(mod) # add it to the list
 
        # set the requested updates into the UpdatePrincipalDTO
        updateDTO.setModifications(mods)
        cmd.setPrincipalModification(updateDTO)
 
        # perform the update
        cmd.execute()
 
 
    '''
     * Delete a user.
     *
     * @param: userGuid the GUID of the user to delete
     * 
     * @throws CommandException if something goes wrong
    '''
    def deleteUser(self, userGuid):
        cmd = DeletePrincipalsCommand()
        cmd.setGuids( array( [userGuid], String ) )
        cmd.setIdentitySourceGuid( self.idSource.getGuid() )
        cmd.execute()
 
 
    '''
     * Create an Authentication Manager user linked to the IMS user.
     * The user will have a limit of 3 bad passcodes, default shell
     * will be "/bin/sh", the static password will be "12345678" and
     * the Windows Password for offline authentication will be "Password123!".
     *
     * @param: guid the GUID of the IMS user
     * 
     * @throws CommandException if something goes wrong
    '''
    def createAMUser(self, guid):
        principal = AMPrincipalDTO()
        principal.setGuid(guid)
        principal.setBadPasscodes(3)
        principal.setDefaultShell("/bin/sh")
        principal.setDefaultUserIdShellAllowed(1)
        # these next three innocent-looking lines cost you a license! do not use them!! - DrJ 
        #principal.setStaticPassword("12345678")
        #principal.setStaticPasswordSet(1)
        #principal.setWindowsPassword("Password123!")
 
        cmd = AddAMPrincipalCommand(principal)
        cmd.execute()
 
 
    '''
     * Create a group to assign a user to.
     *
     * @param: name the name of the group to create
     * @return: the GUID of the group just created
     * 
     * @throws CommandException if something goes wrong
    '''
    def createGroup(self, name):
        group = GroupDTO()
        group.setName(name)
        group.setDescription("Created by AM Demo code")
        group.setSecurityDomainGuid(self.domain.getGuid())
        group.setIdentitySourceGuid(self.idSource.getGuid())
 
        cmd = AddGroupCommand()
        cmd.setGroup(group)
 
	try:
            cmd.execute()
	except DuplicateDataException:
            print "ERROR: Group " + name + " already exists."
	    sys.exit(2)
 
        return cmd.getGuid()
 
    '''
     * Lookup a group by name.
     *
     * @param: name the name of the group to lookup
     * @return: the GUID of the group
     * 
     * @throws CommandException if something goes wrong
    '''
    def lookupGroup(self, name):
        cmd = SearchGroupsCommand()
        cmd.setFilter(Filter.equal(GroupDTO.NAME, name))
        cmd.setSystemFilter(Filter.empty())
        cmd.setLimit(1)
        cmd.setIdentitySourceGuid(self.idSource.getGuid())
        cmd.setSecurityDomainGuid(self.domain.getGuid())
        cmd.setSearchSubDomains(0)
        cmd.setGroupGuid(None)
 
        cmd.execute()
 
	if (len(cmd.getGroups()) < 1):
            print "ERROR: Unable to find group " + name + "."
	    sys.exit(2)
 
        return cmd.getGroups()[0]
 
 
    '''
     * Update a group definition.
     *
     * @param group the current group object
    '''
    def updateGroup(self, group):
        cmd = UpdateGroupCommand()
        cmd.setIdentitySourceGuid(group.getIdentitySourceGuid())
 
        groupMod = UpdateGroupDTO()
        groupMod.setGuid(group.getGuid())
        # copy the rowVersion to satisfy optimistic locking requirements
        groupMod.setRowVersion(group.getRowVersion())
 
        # collect all modifications here
        mods = []
 
        mod = ModificationDTO()
        mod.setOperation(ModificationDTO.REPLACE_ATTRIBUTE)
        mod.setName(GroupDTO.DESCRIPTION)
        mod.setValues([ "Modified by AM Demo code" ])
        mods.append(mod)
 
        # set the requested updates into the UpdateGroupDTO
        groupMod.setModifications(mods)
        cmd.setGroupModification(groupMod)
 
        # perform the update
        cmd.execute()
 
 
    '''
     * Delete a group.
     *
     * @param: groupGuid the GUID of the group to delete
     * 
     * @throws CommandException if something goes wrong
    '''
    def deleteGroup(self, groupGuid):
        cmd = DeleteGroupCommand()
        cmd.setGuids( [groupGuid] )
        cmd.setIdentitySourceGuid( self.idSource.getGuid() )
        cmd.execute()
 
 
    '''
     * Assign the user to the specified group.
     *
     * @param: userGuid the GUID for the user to assign
     * @param: groupGuid the GUID for the group
     * 
     * @throws CommandException if something goes wrong
    '''
    def linkUserToGroup(self, userGuid, groupGuid):
        cmd = LinkGroupPrincipalsCommand()
        cmd.setGroupGuids( [groupGuid] )
        cmd.setPrincipalGuids( [userGuid] )
        cmd.setIdentitySourceGuid(self.idSource.getGuid())
 
        cmd.execute()
 
    '''
     * Assign the group to the restricted agent so users can authenticate.
     *
     * @param: agentGuid the GUID for the restricted agent
     * @param: groupGuid the GUID for the group to assign
     * 
     * @throws CommandException if something goes wrong
    '''
    def assignGroupToAgent(self, agentGuid, groupGuid):
        cmd = LinkAgentsToGroupsCommand()
        cmd.setGroupGuids( [groupGuid] )
        cmd.setAgentGuids( [agentGuid] )
        cmd.setIdentitySourceGuid(self.idSource.getGuid())
 
        cmd.execute()
 
    '''
     * Assign next available token to this user.
     *
     * @param: userGuid the GUID of the user to assign the token to
     * 
     * @throws CommandException if something goes wrong
    '''
    def assignNextAvailableTokenToUser(self, userGuid):
        cmd = GetNextAvailableTokenCommand()
        try:
            cmd.execute()
        except DataNotFoundException:
            print "ERROR: No tokens available"
        else:
            tokens = [cmd.getToken().getId()]
            cmd2 = LinkTokensWithPrincipalCommand(tokens, userGuid)
            cmd2.execute()
            print ("Assigned next available SecurID token to user jdoe")
 
    '''
     * Lookup an admin role and return the GUID.
     *
     * @param name the name of the role to lookup
     * @return the GUID for the required role
     *
     * @throws CommandException if something goes wrong
     '''
    def lookupAdminRole(self, name):
        cmd = SearchAdminRolesCommand()
 
        # set search filter to match the name
        cmd.setFilter(Filter.equal(AdminRoleDTOBase.NAME_ATTRIBUTE, name))
        # we only expect one anyway
        cmd.setLimit(1)
        # set the domain GUID
        cmd.setSecurityDomainGuid(self.domain.getGuid())
 
        cmd.execute()
	if (len(cmd.getAdminRoles()) < 1):
            print "ERROR: Unable to find admin role " + name + "."
	    sys.exit(2)
 
        return cmd.getAdminRoles()[0].getGuid()
 
    '''
     * Assign the given admin role to the principal provided.
     *
     * @param adminGuid the GUID for the administrator
     * @param roleGuid the GUID for the role to assign
     *
     * @throws CommandException if something goes wrong
     '''
    def assignAdminRole(self, adminGuid, roleGuid):
        cmd = LinkAdminRolesPrincipalsCommand()
        cmd.setIgnoreDuplicateLink(1)
        cmd.setPrincipalGuids( [ adminGuid ] )
        cmd.setAdminRoleGuids( [ roleGuid ] )
        cmd.execute()
        print ("Assigned SuperAdminRole to user jdoe")
 
    '''
     * Lookup a password policy by name and return the object.
     *
     * @param name the policy name
     * @return the object
     *
     * @throws CommandException if something goes wrong
     '''
    def lookupPasswordPolicy(self, name):
        cmd = SearchPasswordPoliciesCommand()
        cmd.setRealmGuid(self.domain.getGuid())
 
        # match the policy name
        cmd.setFilter(Filter.startsWith(PasswordPolicyDTO.NAME, name))
 
        cmd.execute()
 
	if (len(cmd.getPolicies()) < 1):
            print ("ERROR: Unable to find password policy with name starting with " + name + ".")
	    sys.exit(2)
 
        # we only expect one anyway
        return cmd.getPolicies()[0]
 
    '''
     * Update the given password policy, currently it just disables
     * password history.
     *
     * @param policy the policy to update
     *
     * @throws CommandException if something goes wrong
     '''
    def updatePasswordPolicy(self, policy):
        cmd = UpdatePasswordPolicyCommand()
 
        # disable password history
        policy.setHistorySize(0)
        cmd.setPasswordPolicy(policy)
 
        cmd.execute()
 
    '''
     * Create a collection of related entities, user, agent, group, token.
     *
     * @param admin the administrator user name
     * @param password the administrator password
     * 
     * @throws Exception if something goes wrong
    '''
    def doCreate(self):
 
        # Create a hypothetical agent with four alternate addresses
        addr = "1.2.3.4"
        alt = [ "2.2.2.2",  "3.3.3.3", "4.4.4.4", "5.5.5.5" ]
 
        # create a restricted agent
        agentGuid = self.createAgent("Demo Agent", addr, alt)
        print ("Created Demo Agent")
 
        # create a user group
        groupGuid = self.createGroup("Demo Agent Group")
        print ("Created Demo Agent Group")
 
        # assign the group to the restricted agent
        self.assignGroupToAgent(agentGuid, groupGuid)
        print ("Assigned Demo Agent Group to Demo Agent")
 
        # create a user and the AMPrincipal user record
        userGuid = self.createUser("jdoe", "Password123!", "John", "Doe")
        self.createAMUser(userGuid)
        print ("Created user jdoe")
 
        # link the user to the group
        self.linkUserToGroup(userGuid, groupGuid)
        print ("Added user jdoe to Demo Agent Group")
 
    '''
     * add user by DrJ
     *
     * @param admin the administrator user name
     * @param password the administrator password
     * 
     * @throws Exception if something goes wrong
    '''
    def doAdd(self):
        # create a user and the AMPrincipal user record
        # loop over all users listed in addusers.txt
        f = open('addusers.txt','r')
        str = f.readline()
        while str:
            strs = str.rstrip()
            cols = strs.split(",")
            userid = cols[0]
            fname = cols[1]
            lname = cols[2]
            print userid
# if user already exists we want to go continue with the list
	    try:
                userGuid = self.createUser(userid, "*LK*", fname, lname)
                self.createAMUser(userGuid)
                print "Created user userid,fname,lname: ", userid,",",lname,",",fname,"\n"
            except:
                print "exception for user ",userid,"\n"
            str = f.readline()
 
        f.close()
 
 
    '''
     * Assign the next available token to the user.
     *
     * @param admin the administrator user name
     * @param password the administrator password
     * 
     * @throws Exception if something goes wrong
    '''
    def doAssignNextToken(self):
 
        # lookup and then ...
        userGuid = self.lookupUser("jdoe").getGuid()
 
        # assign the next available token to this user
        self.assignNextAvailableTokenToUser(userGuid)
 
        # now that he has a token make him an admin
        roleGuid = self.lookupAdminRole("SuperAdminRole")
        self.assignAdminRole(userGuid, roleGuid)
 
    '''
     * Delete the entities created by the doCreate method.
     *
     * @param admin the administrator user name
     * @param password the administrator password
     * 
     * @throws Exception if something goes wrong
    '''
    def doDelete(self):
 
        # lookup and then ...
        # loop over all users listed in delusers.txt
        f = open('delusers.txt','r')
        str = f.readline()
        while str:
            # format: userid,fname,lname  . We just want the userid
            cols = str.split(",")
            userid = cols[0]
            print userid
# if user doesn't exist we want to go continue with the list
	    try:
                userGuid = self.lookupUser(userid).getGuid()
                # ... cleanup
                self.deleteUser(userGuid)
                print "Deleted user ",userid
            except:
                print "exception for user ",userid,"\n"
            str = f.readline()
 
        f.close()
 
    '''
     * Update the various entities created by the doCreate method.
     *
     * @throws Exception if something goes wrong
     '''
    def doUpdate(self):
        # lookup and then ...
        agent = self.lookupAgent("Demo Agent")
        group = self.lookupGroup("Demo Agent Group")
        user = self.lookupUser("jdoe")
 
        # ... update
        self.updateAgent(agent)
        print ("Updated Demo Agent")
        self.updateGroup(group)
        print ("Updated Demo Agent Group")
        self.updateUser(user)
        print ("Updated user jdoe")
 
    '''
     * Disable password history limit on default password policy so
     * we can issue multiple updates for the user password.
     *
     * @throws Exception if something goes wrong
     '''
    def doDisablePasswordHistory(self):
        # lookup and then ...
        policy = self.lookupPasswordPolicy("Initial")
 
        # ... update
        self.updatePasswordPolicy(policy)
        print ("Disabled password history")
 
# Globals here
'''
 * Show usage message and exit.
 * 
 * @param msg the error causing the exit
'''
def usage(msg):
    print ("ERROR: " + msg)
    print ("Usage: APIDemos <create|delete> <admin username> <admin password>")
    sys.exit(1)
 
'''
 * Use from command line with three arguments.
 * 
 * <p>
 * First argument:
 * create - to create the required entities
 * assign - to assign the next available token to the user
 * delete - to delete all created entities
 * </p>
 * <p>
 * Second argument is the administrator user name.
 * Third argument is the administrator password.
 * </p>
 * 
 * @param args the command line arguments
'''
 
if len(sys.argv) != 4:
    usage("Missing arguments")
 
# skip script name
args = sys.argv[1:]
 
# establish a connected session with given credentials
conn = ConnectionFactory.getConnection()
session = conn.connect(args[1], args[2])
 
# make all commands execute using this target automatically
CommandTargetPolicy.setDefaultCommandTarget(session)
 
 
try:
    # create instance
    api = AdminAPIDemos()
    # call delusers before addusers
    print "Deleting users...\n"
    api.doDelete()
    print "Adding users...\n"
    api.doAdd()
 
finally:
    # logout when done
    session.logout()

''' * Jython class demonstrating the Administration API * usage from a Jython script. * * Run this script in the utils directory of the Authentication Manager installation. * * Execute the command "rsautil jython AdminAPIDemos.py create <admin user name> <password>" * Execute the command "rsautil jython AdminAPIDemos.py assign <admin user name> <password>" * Execute the command "rsautil jython AdminAPIDemos.py update <admin user name> <password>" * Execute the command "rsautil jython AdminAPIDemos.py delete <admin user name> <password>" * * If you are executing this script in an environment other than the predefined * rsautil scripting tool you must make the CommandClientAppContext.xml file * available in the end of the classpath for this script. You must also configure * the necessary connection parameters in a properties file located in the process * working directory. See the provided samples for more information. ''' # imports from jarray import array import sys # DrJ required import # Not Workign! from org.python.modules import re from java.util.regex import * from java.lang import * from java.util import Calendar,Date from java.lang import String from org.springframework.beans import BeanUtils from com.rsa.admin import AddGroupCommand from com.rsa.admin import AddPrincipalsCommand from com.rsa.admin import DeleteGroupCommand from com.rsa.admin import DeletePrincipalsCommand from com.rsa.admin import LinkGroupPrincipalsCommand from com.rsa.admin import LinkAdminRolesPrincipalsCommand from com.rsa.admin import SearchAdminRolesCommand from com.rsa.admin import SearchGroupsCommand from com.rsa.admin import SearchPrincipalsCommand from com.rsa.admin import SearchRealmsCommand from com.rsa.admin import SearchSecurityDomainCommand from com.rsa.admin import UpdateGroupCommand from com.rsa.admin import UpdatePrincipalCommand from com.rsa.admin.data import AdminRoleDTOBase from com.rsa.admin.data import GroupDTO from com.rsa.admin.data import IdentitySourceDTO from com.rsa.admin.data import ModificationDTO from com.rsa.admin.data import PrincipalDTO from com.rsa.admin.data import RealmDTO from com.rsa.admin.data import SecurityDomainDTO from com.rsa.admin.data import UpdateGroupDTO from com.rsa.admin.data import UpdatePrincipalDTO from com.rsa.authmgr.admin.agentmgt import AddAgentCommand from com.rsa.authmgr.admin.agentmgt import DeleteAgentsCommand from com.rsa.authmgr.admin.agentmgt import LinkAgentsToGroupsCommand from com.rsa.authmgr.admin.agentmgt import SearchAgentsCommand from com.rsa.authmgr.admin.agentmgt import UpdateAgentCommand from com.rsa.authmgr.admin.agentmgt.data import AgentConstants from com.rsa.authmgr.admin.agentmgt.data import AgentDTO, ListAgentDTO from com.rsa.authmgr.admin.hostmgt.data import HostDTO from com.rsa.authmgr.admin.principalmgt import AddAMPrincipalCommand from com.rsa.authmgr.admin.principalmgt.data import AMPrincipalDTO from com.rsa.authmgr.admin.tokenmgt import GetNextAvailableTokenCommand from com.rsa.authmgr.admin.tokenmgt import LinkTokensWithPrincipalCommand from com.rsa.authn import SearchPasswordPoliciesCommand from com.rsa.authn import UpdatePasswordPolicyCommand from com.rsa.authn.data import PasswordPolicyDTO from com.rsa.command import ClientSession from com.rsa.command import CommandException from com.rsa.command import CommandTargetPolicy, ConnectionFactory from com.rsa.command.exception import DataNotFoundException, DuplicateDataException from com.rsa.common.search import Filter ''' * This class demonstrates the usage patterns of the * Authentication Manager 7.1 API. * * <p> * The first set of operations performed if the first * command line argument is equal to "create". * The sample creates a restricted agent, a group, and a user. * Links the user to the group and the group to the agent. * </p> * <p> * The second set of operations performed if the first * command line argument is equal to "delete". * Lookup the user, group and agent created above. * Delete the user, group and agent. * </p> * <p> * A third set of operations is performed if the first * command line argument is equal to "assign". * Lookup the user and assign the next available * SecurID token to the user. * Lookup the SuperAdminRole and assign it to the user. * </p> * <p> * A fourth set of operations performed if the first * command line argument is equal to "update". * Update the Agent, Group, and User objects. * </p> * <p> * A fifth set of operations performed if the first * command line argument is equal to "disable". * Lookup a password policy with a name that starts * with "Initial" and then disable the password history * for that policy. Use this to allow the sample to * perform multiple updates of the user password using * the same password for each update. * </p> * <p> * The APIs demonstrated include the use of the Filter * class to generate search expressions for use with * all search commands. * </p> ''' class AdminAPIDemos: ''' * We need to know these fairly static values throughout this sample. * Set the references to top level security domain (realm) and system * identity source to use later. * * @throws CommandException if something goes wrong ''' def __init__(self): searchRealmCmd = SearchRealmsCommand() searchRealmCmd.setFilter( Filter.equal( RealmDTO.NAME_ATTRIBUTE, "SystemDomain")) searchRealmCmd.execute() realms = searchRealmCmd.getRealms() if( len(realms) == 0 ): print "ERROR: Could not find realm SystemDomain" sys.exit( 2 ) self.domain = realms[0].getTopLevelSecurityDomain() self.idSource = realms[0].getIdentitySources()[0] ''' * Create an agent and set it to be restricted. * * @param: name the name of the agent to create * @param: addr the IP address for the agent * @param: alt array of alternate IP addresses * @return: the GUID of the agent just created * * @throws CommandException if something goes wrong ''' def createAgent(self, name, addr, alt): # need a HostDTO to be set host = HostDTO() host.setName(name) host.setPrimaryIpAddress(addr) host.setSecurityDomainGuid(self.domain.getGuid()) host.setNotes("Created by AM Demo code") # the agent to be created agent = AgentDTO() agent.setName(name) agent.setHost(host) agent.setPrimaryAddress(addr) agent.setAlternateAddresses(alt) agent.setSecurityDomainId(self.domain.getGuid()) agent.setAgentType(AgentConstants.STANDARD_AGENT) agent.setRestriction(1) # only allow activated groups agent.setEnabled(1) agent.setOfflineAuthDataRefreshRequired(0) agent.setNotes("Created by AM Demo code") cmd = AddAgentCommand(agent) try: cmd.execute() except DuplicateDataException: print "ERROR: Agent " + name + " already exists." sys.exit(2) # return the created agents GUID for further linking return cmd.getAgentGuid() ''' * Lookup an agent by name. * * @param: name the agent name to lookup * @return: the GUID of the agent * * @throws CommandException if something goes wrong ''' def lookupAgent(self, name): cmd = SearchAgentsCommand() cmd.setFilter(Filter.equal(AgentConstants.FILTER_HOSTNAME, name)) cmd.setLimit(1) cmd.setSearchBase(self.domain.getGuid()) # the scope flags are part of the SecurityDomainDTO cmd.setSearchScope(SecurityDomainDTO.SEARCH_SCOPE_ONE_LEVEL) cmd.execute() if (len(cmd.getAgents()) < 1): print "ERROR: Unable to find agent " + name + "." sys.exit(2) return cmd.getAgents()[0] ''' * Update an agent, assumes a previous lookup done by lookupAgent. * * @param agent the result of a previous lookup * * @throws CommandException if something goes wrong ''' def updateAgent(self, agent): cmd = UpdateAgentCommand() agentUpdate = AgentDTO() # copy the rowVersion to satisfy optimistic locking requirements BeanUtils.copyProperties(agent, agentUpdate) # ListAgentDTO does not include the SecurityDomainId # use the GUID of the security domain where agent was created agentUpdate.setSecurityDomainId(self.domain.getGuid()) # clear the node secret flag and modify some others agentUpdate.setSentNodeSecret(0) agentUpdate.setOfflineAuthDataRefreshRequired(1) agentUpdate.setIpProtected(1) agentUpdate.setEnabled(1) agentUpdate.setNotes("Modified by AM Demo code") # set the requested updates in the command cmd.setAgentDTO(agentUpdate) # perform the update cmd.execute() ''' * Delete an agent. * * @param: agentGuid the GUID of the agent to delete * * @throws CommandException if something goes wrong ''' def deleteAgent(self, agentGuid): cmd = DeleteAgentsCommand( [agentGuid] ) cmd.execute() ''' * Create an IMS user, needs to exist before an AM user can be * created. * * @param: userId the user's login UID * @param: password the user's password * @param: first the user's first name * @param: last the user's last name * * @return: the GUID of the user just created * * @throws CommandException if something goes wrong ''' def createUser(self, userId, password, first, last): cal = Calendar.getInstance() # the start date now = cal.getTime() # DrJ: add 50 years from now! cal.add(Calendar.YEAR, 50) # the account end date expire = cal.getTime() principal = PrincipalDTO() principal.setUserID( userId ) principal.setFirstName( first ) principal.setLastName( last ) # principal.setPassword( password ) principal.setEnabled(1) principal.setLockoutStatus(0) principal.setAccountStartDate(now) #principal.setAccountExpireDate(expire) #principal.setAccountExpireDate(0) principal.setAdminRole(0) principal.setCanBeImpersonated(0) principal.setTrustToImpersonate(0) principal.setSecurityDomainGuid( self.domain.getGuid() ) principal.setIdentitySourceGuid( self.idSource.getGuid() ) principal.setDescription("Created by DrJ utilities") cmd = AddPrincipalsCommand() cmd.setPrincipals( [principal] ) try: cmd.execute() except DuplicateDataException: print "ERROR: User " + userId + " already exists." sys.exit(2) # only one user was created, there should be one GUID result return cmd.getGuids()[0] ''' * Lookup a user by login UID. * * @param: userId the user login UID * * @return: the GUID of the user record. ''' def lookupUser(self, userId): cmd = SearchPrincipalsCommand() cmd.setFilter(Filter.equal(PrincipalDTO.LOGINUID, userId)) cmd.setSystemFilter(Filter.empty()) cmd.setLimit(1) cmd.setIdentitySourceGuid(self.idSource.getGuid()) cmd.setSecurityDomainGuid(self.domain.getGuid()) cmd.setGroupGuid(None) cmd.setOnlyRegistered(1) cmd.setSearchSubDomains(0) cmd.execute() if (len(cmd.getPrincipals()) < 1): print "ERROR: Unable to find user " + userId + "." sys.exit(2) return cmd.getPrincipals()[0] ''' * Update the user definition. * * @param user the principal object from a previous lookup ''' def updateUser(self, user): cmd = UpdatePrincipalCommand() cmd.setIdentitySourceGuid(user.getIdentitySourceGuid()) updateDTO = UpdatePrincipalDTO() updateDTO.setGuid(user.getGuid()) # copy the rowVersion to satisfy optimistic locking requirements updateDTO.setRowVersion(user.getRowVersion()) # collect all modifications here mods = [] # first change the email mod = ModificationDTO() mod.setOperation(ModificationDTO.REPLACE_ATTRIBUTE) mod.setName(PrincipalDTO.EMAIL) mod.setValues([ user.getUserID() + "@mycompany.com" ]) mods.append(mod) # add it to the list # also change the password mod = ModificationDTO() mod.setOperation(ModificationDTO.REPLACE_ATTRIBUTE) mod.setName(PrincipalDTO.PASSWORD) mod.setValues([ "MyNewPAssW0rD1!" ]) mods.append(mod) # add it to the list # change the middle name mod = ModificationDTO() mod.setOperation(ModificationDTO.REPLACE_ATTRIBUTE) mod.setName(PrincipalDTO.MIDDLE_NAME) mod.setValues([ "The Big Cahuna" ]) mods.append(mod) # add it to the list # make a note of this update in the description mod = ModificationDTO() mod.setOperation(ModificationDTO.REPLACE_ATTRIBUTE) mod.setName(PrincipalDTO.DESCRIPTION) mod.setValues([ "Modified by AM Demo code" ]) mods.append(mod) # add it to the list # set the requested updates into the UpdatePrincipalDTO updateDTO.setModifications(mods) cmd.setPrincipalModification(updateDTO) # perform the update cmd.execute() ''' * Delete a user. * * @param: userGuid the GUID of the user to delete * * @throws CommandException if something goes wrong ''' def deleteUser(self, userGuid): cmd = DeletePrincipalsCommand() cmd.setGuids( array( [userGuid], String ) ) cmd.setIdentitySourceGuid( self.idSource.getGuid() ) cmd.execute() ''' * Create an Authentication Manager user linked to the IMS user. * The user will have a limit of 3 bad passcodes, default shell * will be "/bin/sh", the static password will be "12345678" and * the Windows Password for offline authentication will be "Password123!". * * @param: guid the GUID of the IMS user * * @throws CommandException if something goes wrong ''' def createAMUser(self, guid): principal = AMPrincipalDTO() principal.setGuid(guid) principal.setBadPasscodes(3) principal.setDefaultShell("/bin/sh") principal.setDefaultUserIdShellAllowed(1) # these next three innocent-looking lines cost you a license! do not use them!! - DrJ #principal.setStaticPassword("12345678") #principal.setStaticPasswordSet(1) #principal.setWindowsPassword("Password123!") cmd = AddAMPrincipalCommand(principal) cmd.execute() ''' * Create a group to assign a user to. * * @param: name the name of the group to create * @return: the GUID of the group just created * * @throws CommandException if something goes wrong ''' def createGroup(self, name): group = GroupDTO() group.setName(name) group.setDescription("Created by AM Demo code") group.setSecurityDomainGuid(self.domain.getGuid()) group.setIdentitySourceGuid(self.idSource.getGuid()) cmd = AddGroupCommand() cmd.setGroup(group) try: cmd.execute() except DuplicateDataException: print "ERROR: Group " + name + " already exists." sys.exit(2) return cmd.getGuid() ''' * Lookup a group by name. * * @param: name the name of the group to lookup * @return: the GUID of the group * * @throws CommandException if something goes wrong ''' def lookupGroup(self, name): cmd = SearchGroupsCommand() cmd.setFilter(Filter.equal(GroupDTO.NAME, name)) cmd.setSystemFilter(Filter.empty()) cmd.setLimit(1) cmd.setIdentitySourceGuid(self.idSource.getGuid()) cmd.setSecurityDomainGuid(self.domain.getGuid()) cmd.setSearchSubDomains(0) cmd.setGroupGuid(None) cmd.execute() if (len(cmd.getGroups()) < 1): print "ERROR: Unable to find group " + name + "." sys.exit(2) return cmd.getGroups()[0] ''' * Update a group definition. * * @param group the current group object ''' def updateGroup(self, group): cmd = UpdateGroupCommand() cmd.setIdentitySourceGuid(group.getIdentitySourceGuid()) groupMod = UpdateGroupDTO() groupMod.setGuid(group.getGuid()) # copy the rowVersion to satisfy optimistic locking requirements groupMod.setRowVersion(group.getRowVersion()) # collect all modifications here mods = [] mod = ModificationDTO() mod.setOperation(ModificationDTO.REPLACE_ATTRIBUTE) mod.setName(GroupDTO.DESCRIPTION) mod.setValues([ "Modified by AM Demo code" ]) mods.append(mod) # set the requested updates into the UpdateGroupDTO groupMod.setModifications(mods) cmd.setGroupModification(groupMod) # perform the update cmd.execute() ''' * Delete a group. * * @param: groupGuid the GUID of the group to delete * * @throws CommandException if something goes wrong ''' def deleteGroup(self, groupGuid): cmd = DeleteGroupCommand() cmd.setGuids( [groupGuid] ) cmd.setIdentitySourceGuid( self.idSource.getGuid() ) cmd.execute() ''' * Assign the user to the specified group. * * @param: userGuid the GUID for the user to assign * @param: groupGuid the GUID for the group * * @throws CommandException if something goes wrong ''' def linkUserToGroup(self, userGuid, groupGuid): cmd = LinkGroupPrincipalsCommand() cmd.setGroupGuids( [groupGuid] ) cmd.setPrincipalGuids( [userGuid] ) cmd.setIdentitySourceGuid(self.idSource.getGuid()) cmd.execute() ''' * Assign the group to the restricted agent so users can authenticate. * * @param: agentGuid the GUID for the restricted agent * @param: groupGuid the GUID for the group to assign * * @throws CommandException if something goes wrong ''' def assignGroupToAgent(self, agentGuid, groupGuid): cmd = LinkAgentsToGroupsCommand() cmd.setGroupGuids( [groupGuid] ) cmd.setAgentGuids( [agentGuid] ) cmd.setIdentitySourceGuid(self.idSource.getGuid()) cmd.execute() ''' * Assign next available token to this user. * * @param: userGuid the GUID of the user to assign the token to * * @throws CommandException if something goes wrong ''' def assignNextAvailableTokenToUser(self, userGuid): cmd = GetNextAvailableTokenCommand() try: cmd.execute() except DataNotFoundException: print "ERROR: No tokens available" else: tokens = [cmd.getToken().getId()] cmd2 = LinkTokensWithPrincipalCommand(tokens, userGuid) cmd2.execute() print ("Assigned next available SecurID token to user jdoe") ''' * Lookup an admin role and return the GUID. * * @param name the name of the role to lookup * @return the GUID for the required role * * @throws CommandException if something goes wrong ''' def lookupAdminRole(self, name): cmd = SearchAdminRolesCommand() # set search filter to match the name cmd.setFilter(Filter.equal(AdminRoleDTOBase.NAME_ATTRIBUTE, name)) # we only expect one anyway cmd.setLimit(1) # set the domain GUID cmd.setSecurityDomainGuid(self.domain.getGuid()) cmd.execute() if (len(cmd.getAdminRoles()) < 1): print "ERROR: Unable to find admin role " + name + "." sys.exit(2) return cmd.getAdminRoles()[0].getGuid() ''' * Assign the given admin role to the principal provided. * * @param adminGuid the GUID for the administrator * @param roleGuid the GUID for the role to assign * * @throws CommandException if something goes wrong ''' def assignAdminRole(self, adminGuid, roleGuid): cmd = LinkAdminRolesPrincipalsCommand() cmd.setIgnoreDuplicateLink(1) cmd.setPrincipalGuids( [ adminGuid ] ) cmd.setAdminRoleGuids( [ roleGuid ] ) cmd.execute() print ("Assigned SuperAdminRole to user jdoe") ''' * Lookup a password policy by name and return the object. * * @param name the policy name * @return the object * * @throws CommandException if something goes wrong ''' def lookupPasswordPolicy(self, name): cmd = SearchPasswordPoliciesCommand() cmd.setRealmGuid(self.domain.getGuid()) # match the policy name cmd.setFilter(Filter.startsWith(PasswordPolicyDTO.NAME, name)) cmd.execute() if (len(cmd.getPolicies()) < 1): print ("ERROR: Unable to find password policy with name starting with " + name + ".") sys.exit(2) # we only expect one anyway return cmd.getPolicies()[0] ''' * Update the given password policy, currently it just disables * password history. * * @param policy the policy to update * * @throws CommandException if something goes wrong ''' def updatePasswordPolicy(self, policy): cmd = UpdatePasswordPolicyCommand() # disable password history policy.setHistorySize(0) cmd.setPasswordPolicy(policy) cmd.execute() ''' * Create a collection of related entities, user, agent, group, token. * * @param admin the administrator user name * @param password the administrator password * * @throws Exception if something goes wrong ''' def doCreate(self): # Create a hypothetical agent with four alternate addresses addr = "1.2.3.4" alt = [ "2.2.2.2", "3.3.3.3", "4.4.4.4", "5.5.5.5" ] # create a restricted agent agentGuid = self.createAgent("Demo Agent", addr, alt) print ("Created Demo Agent") # create a user group groupGuid = self.createGroup("Demo Agent Group") print ("Created Demo Agent Group") # assign the group to the restricted agent self.assignGroupToAgent(agentGuid, groupGuid) print ("Assigned Demo Agent Group to Demo Agent") # create a user and the AMPrincipal user record userGuid = self.createUser("jdoe", "Password123!", "John", "Doe") self.createAMUser(userGuid) print ("Created user jdoe") # link the user to the group self.linkUserToGroup(userGuid, groupGuid) print ("Added user jdoe to Demo Agent Group") ''' * add user by DrJ * * @param admin the administrator user name * @param password the administrator password * * @throws Exception if something goes wrong ''' def doAdd(self): # create a user and the AMPrincipal user record # loop over all users listed in addusers.txt f = open('addusers.txt','r') str = f.readline() while str: strs = str.rstrip() cols = strs.split(",") userid = cols[0] fname = cols[1] lname = cols[2] print userid # if user already exists we want to go continue with the list try: userGuid = self.createUser(userid, "*LK*", fname, lname) self.createAMUser(userGuid) print "Created user userid,fname,lname: ", userid,",",lname,",",fname,"\n" except: print "exception for user ",userid,"\n" str = f.readline() f.close() ''' * Assign the next available token to the user. * * @param admin the administrator user name * @param password the administrator password * * @throws Exception if something goes wrong ''' def doAssignNextToken(self): # lookup and then ... userGuid = self.lookupUser("jdoe").getGuid() # assign the next available token to this user self.assignNextAvailableTokenToUser(userGuid) # now that he has a token make him an admin roleGuid = self.lookupAdminRole("SuperAdminRole") self.assignAdminRole(userGuid, roleGuid) ''' * Delete the entities created by the doCreate method. * * @param admin the administrator user name * @param password the administrator password * * @throws Exception if something goes wrong ''' def doDelete(self): # lookup and then ... # loop over all users listed in delusers.txt f = open('delusers.txt','r') str = f.readline() while str: # format: userid,fname,lname . We just want the userid cols = str.split(",") userid = cols[0] print userid # if user doesn't exist we want to go continue with the list try: userGuid = self.lookupUser(userid).getGuid() # ... cleanup self.deleteUser(userGuid) print "Deleted user ",userid except: print "exception for user ",userid,"\n" str = f.readline() f.close() ''' * Update the various entities created by the doCreate method. * * @throws Exception if something goes wrong ''' def doUpdate(self): # lookup and then ... agent = self.lookupAgent("Demo Agent") group = self.lookupGroup("Demo Agent Group") user = self.lookupUser("jdoe") # ... update self.updateAgent(agent) print ("Updated Demo Agent") self.updateGroup(group) print ("Updated Demo Agent Group") self.updateUser(user) print ("Updated user jdoe") ''' * Disable password history limit on default password policy so * we can issue multiple updates for the user password. * * @throws Exception if something goes wrong ''' def doDisablePasswordHistory(self): # lookup and then ... policy = self.lookupPasswordPolicy("Initial") # ... update self.updatePasswordPolicy(policy) print ("Disabled password history") # Globals here ''' * Show usage message and exit. * * @param msg the error causing the exit ''' def usage(msg): print ("ERROR: " + msg) print ("Usage: APIDemos <create|delete> <admin username> <admin password>") sys.exit(1) ''' * Use from command line with three arguments. * * <p> * First argument: * create - to create the required entities * assign - to assign the next available token to the user * delete - to delete all created entities * </p> * <p> * Second argument is the administrator user name. * Third argument is the administrator password. * </p> * * @param args the command line arguments ''' if len(sys.argv) != 4: usage("Missing arguments") # skip script name args = sys.argv[1:] # establish a connected session with given credentials conn = ConnectionFactory.getConnection() session = conn.connect(args[1], args[2]) # make all commands execute using this target automatically CommandTargetPolicy.setDefaultCommandTarget(session) try: # create instance api = AdminAPIDemos() # call delusers before addusers print "Deleting users...\n" api.doDelete() print "Adding users...\n" api.doAdd() finally: # logout when done session.logout()

I of course worked from their demo file, AdminAPIDemos.py, and kept the name for simplicity. I added a a doAdd routine and modified their doDelete function.

These modified functions expect external files to exist, addusers.txt and delusers.txt. The syntax of addusers.txt is:

loginname1,first_name,last_name
loginname2,first_name,last_name
...

Delusers.txt has the same syntax.

The idea is that if you can create these files once per day with the new users/removed users from your corporate directory by some other means, then you have a way to use them as a basis for keeping your AM internal database in sync with your external enterprise directory, whatever it might be.

Other Notes
Initially I saw my users were set to expire after a year or so. The original code I borrwed from had lines like this:

        cal = Calendar.getInstance()
 
        # the start date
        now = cal.getTime()
 
        cal.add(Calendar.YEAR, 1)
 
        # the account end date
        expire = cal.getTime()

which caused this. I eventually found how to set a flag to create the account with unlimited validity.

I also introduced a very simple regex handling to break up the input lines. This caused the need for importing additional classes:

from java.util.regex import *
from java.lang import *

I could not get python regexes to work.

I also found these three innocent-looking lines were costing me a license unit for each added user:

        principal.setStaticPassword("12345678")
        principal.setStaticPasswordSet(1)
        principal.setWindowsPassword("Password123!")

So I commented them out as I did not need them.

That’s it!

Getting the SDK running cost me a few days but at least I’ve documented that as well in pretty good detail: Problems with Jython API for RSA Authentication Manager.

Conclusion
We’ve shared with the community an actual, working jython API for adding/removing users from an RSA Authentication Manager v 7.1 database.

Tags jython, RSA authentication manager

Admin Security

Problems with Jython API for RSA Authentication Manager

Post author By john
Post date April 6, 2012
5 Comments on Problems with Jython API for RSA Authentication Manager

Intro
This session is not for the faint-of-heart. I describe some of the many issues I had in trying to run a slightly modified jython program which I use to keep the local directory in sync with an external directory source. Since I am a non-specialist in all these technologies, I am describing this from a non-specialist’ point-of-view.

The Details
RSA provides an authentication manager sdk, AM7.1_sdk.zip, which is required to use the API.

I had it all working on our old appliance. I thought I could copy files onto the new appliance and it would all work again. It’s not nearly so simple.

You log on and su to rsaadmin for all this work.

Let’s call RSAHOME = /usr/local/RSASecurity/RSAAuthenticationManager.

Initially the crux of the problem is to get
$RSAHOME/appserver/jdk/samples/admin/src/AdminAPIDemos.py to run.

But how do you even run it if you know nothing about java, python, jython, and, IMS and weblogic? Ha. It isn’t so easy.

As you see from the above path I unpacked the sdk below appserver. I did most of my work in the $RSAHOME/appserver/jdk/samples/admin directory. First thing is to very carefully follow the instructions in the sdk documentation about copying files and about initializing a trust.jks keystore. You basically grab .jar files from several places and copy them to $RSAHOME/appserver/jdk/lib/java. Failure to do so will be disastrous.

Ant is apparently a souped-up version of make. You need it to compile the jython example. They don’t tell you where it is. I found it here:

$RSAHOME/appserver/modules/org.apache.ant_1.6.5/bin/ant

I created my own run script I call jython-build:

#!/bin/bash
Ant=/usr/local/RSASecurity/RSAAuthenticationManager/appserver/modules/org.apache.ant_1.6.5/bin/ant
# compiles the examples
#$Ant compile
#$Ant verify-setup
# this worked!
#$Ant run-create-jython
$Ant run-add-jython

This didn’t work at first. One error I got:

$Ant verify-setup
Error: JAVA_HOME is not defined correctly.
  We cannot execute java

OK. Even non-java experts know how to define the JAVA_HOME environment variable. I did this:

$ export JAVA_HOME=/usr/local/RSASecurity/RSAAuthenticationManager/appserver/jdk

I did a

$ $Ant verify-setup

and got missing com.bea.core.process.5.3.0.0.jar. I eventually found it in …utils/jars/thirdparty. And so I started copying in all the missing files. This was before I discovered the documentation! Copying all the jar files can get you so far, but you will never succeed in copying in wlfullclient.jar because it doesn’t exist! Turns out there is a step described in the sdk documentation which you need to do that creates this jar file. Similarly, trust.jks, your private keystore, does not exist. You have to follow the steps in the sdk documentation to create it with keytool, etc. You’ll need to augment your path, of course:

$ export PATH=$PATH:/usr/local/RSASecurity/RSAAuthenticationManager/appserver/jdk/bin

Some errors are experienced in and around this time:

org/apache/log4j/Appender not found
org/apache/commons/logging not found

This was when I was copying jar files in one-by-one and not following the sdk instructions.

I also got

weblogic.security.SSL.TrustManager class not found. This was more vexing at the time because it didn’t exist in any of my jar files! This is when I discovered that wlfullclient.jar has to be created by hand. It contains that class. Here’s how I check the jar contents:

$ jar tvf wlfullclient.jar|grep weblogic/security/SSL/Trust

   481 Wed May 09 18:12:02 EDT 2007 weblogic/security/SSL/TrustManager.class

For the record, my …lib/java directory, which has a few more files than it actually needs, looks like this:

activation-1.1.jar                commons-pool-1.2.jar                    opensaml-1.0.jar
am-client.jar                     commons-validator-1.3.0.jar             oscache-2.3.2rsa-1.jar
am-server-o.jar                   console-integration-api.jar             replication-api.jar
ant-1.6.5.jar                     dbunit-2.0.jar                          rsaweb-security-3.0.jar
antlr-2.7.6.jar                   dom4j-1.6.1.jar                         rsawebui-3.0.jar
asm-1.5.3.jar                     EccpressoAsn1.jar                       serializer-2.7.0.jar
axis-1.3.jar                      EccpressoCore.jar                       spring-2.0.7.jar
axis-saaj-1.3.jar                 EccpressoJcae.jar                       spring-mock-2.0.7.jar
c3p0-0.9.1.jar                    framework-common.jar                    store-command.jar
certj-2.1.1.jar                   groovy-all-1.0-jsr-05.jar               struts-core-1.3.5.jar
cglib-2.1_3.jar                   hibernate-3.2.2.jar                     struts-extras-1.3.5.jar
classpath.jar                     hibernate-annotations-3.2.1.jar         struts-taglib-1.3.5.jar
classworlds-1.1.jar               hibernate-ejb-persistence-3.2.2.jar     struts-tiles-1.3.5.jar
clu-common.jar                    hibernate-entitymanager-3.2.1.jar       systemfields-o.jar
com.bea.core.process_5.3.0.0.jar  ims-server-o.jar                        trust.jks
commons-beanutils-1.7.0.jar       install-utils.jar                       ucm-clu-common.jar
commons-chain-1.1.jar             iScreen-1-1-0rsa-2.jar                  ucm-server-o.jar
commons-cli-1.0.jar               iScreen-ognl-1-1-0rsa-2.jar             update-instance-node-ext.jar
commons-codec-1.3.jar             jargs-1.0.jar                           wlcipher.jar
commons-collections-3.0.jar       javassist-3.9.0.GA.jar                  wlfullclient.jar
commons-dbcp-1.2.jar              jboss-archive-browsing-5.0.0.Alpha.jar  wrapper-3.2.1rsa1.jar
commons-digester-1.6.jar          jdom-1.0.jar                            wsdl4j-1.5.1.jar
commons-discovery-0.2.jar         jline-0.9.91rsa-1.jar                   xalan-2.7.0.jar
commons-fileupload-1.2.jar        jsafe-3.6.jar                           xercesImpl-2.7.1.jar
commons-httpclient-3.0.1.jar      jsafeJCE-3.6.jar                        xml-apis-1.3.02.jar
commons-io-1.2.jar                jython-2.1.jar                          xmlsec-1.2.97.jar
commons-lang-2.2.jar              license.bea                             xmlspy-schema-2006-sp2.jar
commons-logging-1.0.4.jar         log4j-1.2.11rsa-3.jar
commons-net-2.0.jar               ognl-2.6.7.jar

activation-1.1.jar commons-pool-1.2.jar opensaml-1.0.jar am-client.jar commons-validator-1.3.0.jar oscache-2.3.2rsa-1.jar am-server-o.jar console-integration-api.jar replication-api.jar ant-1.6.5.jar dbunit-2.0.jar rsaweb-security-3.0.jar antlr-2.7.6.jar dom4j-1.6.1.jar rsawebui-3.0.jar asm-1.5.3.jar EccpressoAsn1.jar serializer-2.7.0.jar axis-1.3.jar EccpressoCore.jar spring-2.0.7.jar axis-saaj-1.3.jar EccpressoJcae.jar spring-mock-2.0.7.jar c3p0-0.9.1.jar framework-common.jar store-command.jar certj-2.1.1.jar groovy-all-1.0-jsr-05.jar struts-core-1.3.5.jar cglib-2.1_3.jar hibernate-3.2.2.jar struts-extras-1.3.5.jar classpath.jar hibernate-annotations-3.2.1.jar struts-taglib-1.3.5.jar classworlds-1.1.jar hibernate-ejb-persistence-3.2.2.jar struts-tiles-1.3.5.jar clu-common.jar hibernate-entitymanager-3.2.1.jar systemfields-o.jar com.bea.core.process_5.3.0.0.jar ims-server-o.jar trust.jks commons-beanutils-1.7.0.jar install-utils.jar ucm-clu-common.jar commons-chain-1.1.jar iScreen-1-1-0rsa-2.jar ucm-server-o.jar commons-cli-1.0.jar iScreen-ognl-1-1-0rsa-2.jar update-instance-node-ext.jar commons-codec-1.3.jar jargs-1.0.jar wlcipher.jar commons-collections-3.0.jar javassist-3.9.0.GA.jar wlfullclient.jar commons-dbcp-1.2.jar jboss-archive-browsing-5.0.0.Alpha.jar wrapper-3.2.1rsa1.jar commons-digester-1.6.jar jdom-1.0.jar wsdl4j-1.5.1.jar commons-discovery-0.2.jar jline-0.9.91rsa-1.jar xalan-2.7.0.jar commons-fileupload-1.2.jar jsafe-3.6.jar xercesImpl-2.7.1.jar commons-httpclient-3.0.1.jar jsafeJCE-3.6.jar xml-apis-1.3.02.jar commons-io-1.2.jar jython-2.1.jar xmlsec-1.2.97.jar commons-lang-2.2.jar license.bea xmlspy-schema-2006-sp2.jar commons-logging-1.0.4.jar log4j-1.2.11rsa-3.jar commons-net-2.0.jar ognl-2.6.7.jar

I don’t know if these steps are necessary, but they should work at this stage:

$Ant compile
$Ant verify-setup
$Ant run-create-jython

But are we done? No way. Don’t forget to edit your config.properties:

# JNDI factory class.
java.naming.factory.initial = weblogic.jndi.WLInitialContextFactory
 
# Server URL(s).  May be a comma separated list of URLs if running against a cluster
# NOTE: Replace authmgr-test.drj.com with the hostname of the managed server
java.naming.provider.url = t3s://authmgr-test.drj.com:7002
 
# User ID for process-level authentication.
# run rsautil manage-secrets --action list to learn these
#
com.rsa.cmdclient.user = CmdClient_blahblahblah
 
# Password for process-level authentication
com.rsa.cmdclient.user.password = blahblahblah
 
# Password for Two-Way SSL client identity keystore
com.rsa.ssl.client.id.store.password = password
 
# Password for Two-Way SSL client identity private key
com.rsa.ssl.client.id.key.password = password
 
# Provider URL for Two-Way SSL client authentication
ims.ssl.client.provider.url = t3s://authmgr-test.drj.com:7022
 
# Identity keystore for Two-Way SSL client authentication
ims.ssl.client.identity.keystore.filename = client-identity.jks
 
# Identity keystore private key alias for Two-Way SSL client authentication
ims.ssl.client.identity.key.alias = client-identity
 
# Identity keystore trusted root CA certificate alias
ims.ssl.client.root.ca.alias = root-ca
 
# SOAPCommandTargetBasicAuth provider URL
ims.soap.client.provider.url = https://authmgr-test.drj.com:7002/ims-ws/services/CommandServer

As it says you need to run

$ rsautil manage-secrets –action list

$ rsautil manage-secrets –action listkeys

to get the correct username/password. The URLs also need appropriate tweaking of course. Failure to do these things will produce some fairly obvious errors, however. Strangely, the keystore-related values which look like placeholders since they say “password” really don’t have to be modified.

Try to run jython-build now and you may get, like I did,

Cannot import name AddGroupCommand

which is a clear reference to this line in the python file:

from com.rsa.admin import AddGroupCommand

Rooting around, I find this class in ims-server-o.jar which I already have in my …lib/java. So I decided to make a ~/.api file which contains not only the JAVA_HOME, but a CLASSPATH:

export JAVA_HOME=/usr/local/RSASecurity/RSAAuthenticationManager/appserver/jdk
export CLASSPATH=/usr/local/RSASecurity/RSAAuthenticationManager/appserver/jdk/lib/java/ims-server-o.jar

Actually my ~/.api file contains more. I just borrowed $RSAHOME/utils/rsaenv and added those definitions, but I’m not sure any of the other stuff is needed.

Are we there yet? Not in my case. Now we get a simple “Access Denied.” I was stymied by this for awhile. I went to reporting and found this failure logged under Authentication Monitor:

(Description) User admin attempted authentication using authenticator RSA_password.

(Reason) Authentication method failed.

What this was about is that I had forgot to update my build.xml file with the correct username and password:

    <property name="admin.name" value="admin"/>
    <property name="admin.pw" value="mypassword"/>

After correcting that, it began to work.

Update after several months
Well, after some months of inattention, now once again it doesn’t work! I checked my adduser log file and saw this nugget in the traceback:

[java] File "src/AdminAPIDemos.py", line 846, in ?
[java] com.rsa.authn.AuthenticationCommandException: Access Denied

The only other clue I have is that another administrator changed the Super Admin password a couple weeks ago as it was expired. I haven’t resolved this one yet, it’s a work in progress! Ah. Simple. I needed to update my build.xml with the latest admin password. Hmm. That could be kind of a pain if it’s expiring every 90 days.

Conclusion
I was battered by this exercise, mostly by my own failure to read the manual. But some things are simply not documented. Why all the fuss just to get demo code working? Because we can customize it. I felt that once I had the demo running, the sky was the limit and creating customization to do automated add/delete would not be that difficult.

To see the jython script I created to do the user add/deletes click on this article: Add/Delete Jython Script for RSA Authentication Manager

Tags jython, RSA authentication manager

Admin Apache CentOS Linux Web Site Technologies

Major Headaches Migrating Apache from Ubuntu to CentOS

Post author By john
Post date March 19, 2012
7 Comments on Major Headaches Migrating Apache from Ubuntu to CentOS

Intro
I’m changing servers from Ubuntu server to CentOS. On Ubuntu I just checked off LAMP and got my environment. In CentOS I’m doing it piece-by-piece. I don’t think my Ubuntu install is quite regular, either, as I bastardized it by adding environment variables in the Apache config file, a concept I borrowed from SLES! Turns out it is quite an ordeal to make a smooth transition. I will share all my pitfalls. I still don’t have it working, but I think I’m over the hump. [Update: now it is working, or 99% of it is working. It is a bit sluggish, however.]

The Details
I installed httpd on CentOS using yum. I also installed some php5 packages which I saw were recommended as well. First thing I noticed is that the directory structure for “httpd” as it seems to be known on CentOS, is dramatically different from “apache2” as it is known in Ubuntu. This example illustrates the point. In CentOS the main config file is

/etc/httpd/conf/httpd.conf

while in Ubuntu I had

/etc/apache2/apache2.conf

so I tarred up my /etc/apache2 files and had the thought “Let’s make this work on CentOS.” Ha. Easier said than done.

To remind, the content of /etc/apache2 is:

apache2.conf, conf.d, sites-enabled sites-available mods-enabled mods-available plus some stuff I probably added, including envvars, httpd.conf and ports.conf.

envvars contains environment variables which are subsequently referenced in the config files, like these:

export APACHE_RUN_USER=www-data
export APACHE_RUN_GROUP=www-data
export APACHE_PID_FILE=/var/run/apache2$SUFFIX.pid
export APACHE_RUN_DIR=/var/run/apache2$SUFFIX
export APACHE_LOCK_DIR=/var/lock/apache2$SUFFIX
# Only /var/log/apache2 is handled by /etc/logrotate.d/apache2.
export APACHE_LOG_DIR=/var/log/apache2$SUFFIX

First step? Well we have to hook httpd startup to our new directory somehow. I don’t recall this part so well. I think I tried this from the command line:

$ apachectl -d /etc/apache2 -f apache2.conf -k start

and it may be at that point that I got the MPM workers error. But I forget. I switched to using the service command and that particular error seemed to go away at some point. I don’t believe I needed to do anything special.

So I tried this edit to /etc/sysconfig/httpd (sparing you the failed attempts):

OPTIONS=”-d /etc/apache2 -f apache2.conf”

Now we try to launch and see what happens.

$ service httpd start

Starting httpd: httpd: Syntax error on line 203 of /etc/apache2/apache2.conf: Syntax error on line 1 of /etc/apache2/mods-enabled/alias.load: Cannot load /usr/lib/apache2/modules/mod_alias.so into server: /usr/lib/apache2/modules/mod_alias.so: cannot open shared object file: No such file or directory
[FAILED]

Fasten your seatbelts, put on your big-boy pants or whatever. We’re just getting warmed up.

Let’s look at mods-available/alias.load:

$ more alias.load

LoadModule alias_module /usr/lib/apache2/modules/mod_alias.so

Sure enough, there is not only no such file, there is not even such a directory as /usr/lib/apache2. And all the load files have references like that. Where did the httpd install put its modules anyways? Why in /etc/httpd/modules. So I made a command decision:

$ mkdir /usr/lib/apache2
$ cd !$
$ ln -s /etc/httpd/modules

So where does that put us? Here:

$ service httpd start

Starting httpd: httpd: Syntax error on line 203 of /etc/apache2/apache2.conf: Syntax error on line 1 of /etc/apache2/mods-enabled/ssl.load: Cannot load /usr/lib/apache2/modules/mod_ssl.so into server: /usr/lib/apache2/modules/mod_ssl.so: cannot open shared object file: No such file or directory
     [FAILED]

Not everyone will see this issue. I had used SSL for some testing in Ubuntu so I had that module enabled. my CentOS is a core image and did not come with an SSL module. So let’s get it.

$ yum search mod_ssl

shows the full module name to be mod_ssl.x86_64, so we install it with yum install.

How far did that get us? To here:

$ service httpd start

Starting httpd: httpd: bad user name ${APACHE_RUN_USER}
  [FAILED]

Ah, remember my environment variables from above? As I said I actually use them with lines such as:

User ${APACHE_RUN_USER}

in apache2.conf. But clearly the definitions of those environment variables is not getting passed along. I decide to see if this step might work. I append these two lines to /etc/sysconfig/httpd:

$ Read in our environment variables. Inspired by apache on SLES.
. /etc/apache2/envvars

Could any more go wrong? Sure. Lots! Then there’s this:

$ service httpd start

Starting httpd: httpd: bad user name www-data
      [FAILED]

Amongst all the other stark differences, ubuntu and CentOS use different users to run apache. Great! So I create a www-data user as userid 33, gid 33 because that’s how it was under ubuntu. but GID 33 is already taken in CentOS. It is backup. I decide I will never use it that way, and change the group name to www-data.

That brings us here. you see I have a lot of patience…

$ service httpd start

Starting httpd: Syntax error on line 218 of /etc/apache2/apache2.conf:
Invalid command 'LogFormat', perhaps misspelled or defined by a module not included in the server configuration
   [FAILED]

Now my line 218 looks pretty regular. It’s simply:

LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined

I then realized something interesting. The modules built in to httpd on centOS and apache2 are different. apache2 seems to have some modules built in for logging:

$ apache2 -l

Compiled in modules:
  core.c
  mod_log_config.c
  mod_logio.c
  prefork.c
  http_core.c
  mod_so.c

whereas httpd does not:

$ httpd -l

Compiled in modules:
  core.c
  prefork.c
  http_core.c
  mod_so.c

So I made an empty log_config.conf and a log_config.load in mods-available that reads:

LoadModule log_config_module /usr/lib/apache2/modules/mod_log_config.so

I got the correct names by looking at the apache web site documenttion on that module. And i linked those two files up in the mods-available diretory:

$ cd mods-enabled
$ ln -s ../mods-available/log_config.conf
$ ln -s ../mods-available/log_config.load

Next error, please! Certainly. It is:

$ service httpd start

Starting httpd: Syntax error on line 218 of /etc/apache2/apache2.conf:
Unrecognized LogFormat directive %O
  [FAILED]

where line 218 is as recorded above. Well, some searches showed that you need the logio module. Note that it is also compiled into to apache2, but missing from httpd. So I did a similar thing with defining the necessary mods-{available,enabled} files. logio.load reads:

LoadModule logio_module /usr/lib/apache2/modules/mod_logio.so

The next?

$ service httpd start

Starting httpd: (2)No such file or directory: httpd: could not open error log file /var/log/apache2/error.log.
Unable to open logs
   [FAILED]

Oops. Didn’t make that directory. Naturally httpd and apache2 use different directories for logging. What else could you expect?

Now we’re down to this minimalist error:

$ service httpd start

Starting httpd:     [FAILED]

The error log contained this line:

[Mon Mar 19 14:11:14 2012] [error] (2)No such file or directory: Cannot create SSLMutex with file `/var/run/apache2/ssl_mutex'

After puzzling some over this what I eventually noticed is that my environment has references to directories which I haven’t defined yet:

export APACHE_RUN_DIR=/var/run/apache2$SUFFIX
export APACHE_LOCK_DIR=/var/lock/apache2$SUFFIX

So I created them.

And now I get:

$ service httpd start

Starting httpd:           [  OK  ]

But all is still not well. I cannot stop it the proper way. Trying to read its status goes like this:

$ service httpd status

httpd dead but subsys locked

I looked this one up. Killed off processes and semaphores as recommended with ipcs -s (see this link), etc. But since my case is different, I also did something different. I modified my /etc/init.d/httpd file:

#pidfile=${PIDFILE-/var/run/httpd/httpd.pid}
pidfile=${PIDFILE-/var/run/apache2.pid}
#lockfile=${LOCKFILE-/var/lock/subsys/httpd}
lockfile=${LOCKFILE-/var/lock/subsys/apache2}

Believe it or not, this worked. I can now run service httpd status and service httpd stop. To prove it:

$ service httpd status

httpd (pid  30366) is running...

Another Error Crops Up
I eventually noticed another problem with the web site. My trajectory page was not working. Upon investigation I found this comment in my main apache error log (not my virtual server error log, which I still don’t understand):

sh: /home/drj/traj/traj4.pl: Permission denied

This had to be a result of my call-out to a perl program from a php program:

...
$data = exec('/home/drj/traj/traj4.pl'.' '.$escargs);
...

But what’s so special about that call? Worked fine on Ubuntu, and I did a directory listing to show the file was really there. Well, here’s the thing, that file is under my home directory and guess what? When you crate your users in Ubuntu the home directory permissions are set to group and others read. Not in CentOS! A listing of /home looks kind of like this:

/home$ ll

total 12
drwx------ 2 drj   drj     4096 Mar 19 15:26 drj/
...

I set the permissions for all to read:

$ sudo chmod g+rx,o+rx drj

and I was good to go. The program began to work.

May 2013 Update
I was asked how all this survived after a yum update. Answer: pretty well, but not perfectly. The daemon was fine. And what miseld me is that it started fine. But then a couple days later I was looking at my access log and realized…it wasn’t there! Nor the errors log. Well, actually, the default access and error logs were there, but not for my virtual servers.

I soon realized that

$ service httpd status

produced

httpd dead but subsys locked

Well, who reads or remembers their own posts from a year ago? I totally forgot I had already dealt with this once, and my own post didn’t show up in my DDG search. Anywho, I stepped on the same rake twice. Being less patient this time around, probably because I am one year older, I simply altered the /etc/init.d/httpd file (looks like it had been changed by the update) thusly:

#pidfile=${PIDFILE-/var/run/httpd/httpd.pid}
#lockfile=${LOCKFILE-/var/lock/subsys/httpd}
# try as an experiment - DrJ 5/3/13
pidfile=/var/run/apache2.pid
lockfile=/var/lock/apache2/accept.lock

and I made sure I had a /var/lock/apache2 directory. This worked.

I chose a lock file with that particular name because I noticed this in my /etc/apache2/apache2.conf:

LockFile ${APACHE_LOCK_DIR}/accept.lock

To clean things out as I was working and re-working this problem since I couldn’t run

$ service httpd stop

I ran instead:

$ pkill -9 -f sbin/httpd

and I removed /var/run/apache2.pid.

Now, once again, I can get a status on my httpd service and restart works as well and my access and error logs are being written.

Conclusion
This conversion exercise turned out to be quite a teaching lesson and even after all this more remains. After the mysql migration I find the performance to be sub-par – about twice as slow as it was on Ubuntu.

Four months later, CentOS has not crashed once on me. Contrast that with Ubuntu freezing every two weeks or so. I optimized MySQL to cache some data and performance is adequate. I also have since learned about bitnami, which is kind of a stack for all the stuff I was using. Check out bitnami.org.

Tags bitnami, permissions, ubuntu

Admin Hosting Service Linux

Hosting: You Really Can’t beat Amazon Web Services EC2

Post author By john
Post date March 16, 2012
8 Comments on Hosting: You Really Can’t beat Amazon Web Services EC2

Intro
You want to have your own server hosted by a service provider that’s going to take care of the hard stuff – uninterruptible power, fast pipe to the Internet, backups? That’s what I wanted. In addition I didn’t want to worry about actual, messy hardware. Give me a virtual server any day of the week. I am no hosting expert, but I have some experience and I’d like to share that.

The Details
I’d say chances are about even whether you’d think of Amazon Web Services for the above scenario. I’d argue that Amazon is actually the most competitive service out there and should be at the top of any short list, but the situation wasn’t always so even as recently as February of this year.

You see, Amazon markets itself a bit differently. They are an IaaS (infrastructure as a service) provider. I don’t know who their top competition really is, but AWS (Amazon Web Service) is viewed as both visionary and able to execute by Gartner from a recent report. My personal experience over the last 12 months backs that up. My main point, however, is that hosting a server is a subset of IaaS. I assume that if you want your own server where you get root access, you have the skill set (aided by the vast resources on the Internet including blogs like mine) to install a web server, database server, programming environment, application engines or whatever you want to do with it. You don’t need the AWS utility computing model per se, just a reliable 24×7 server, right? That’s my situation.

I was actually looking to move to “regular” hosting provider, but it turns out to have been a really great time to look around. Some background. I’m currently running such an environment running Ubuntu server 10.10 as a free-tier micro instance. I’ve enjoyed it a lot except one thing. From time to time my server freezes. At least once a month since December. I have no idea why. Knowing that my free tier would be up anyways this month I asked my computer scientist friend “Niz” for a good OS to run a web server and he said CentOS is what I want. It’s basically Redhat Enterprise Linux except you don’t pay Redhat for support.

I looked at traditional hosting providers GoDaddy and Rackspace and 1and1 a bit. I ran the numbers and saw that GoDaddy, with whom I already host my DNS domains, was by far the cost leader. They were also offering CentOS v 5.6 I think RackSpace also had a CentOS offering. I spoke with a couple providers in my own state. I reasoned I wuold keep my business local if the price was within 25% of other offers I could find.

Then, and here’s one of the cool things about IaaS, I fired up a CentOS image at Amazon Elastic Compute Cloud. With utility computing I pay only by the hour so I can experiment cheaply, which I did. Niz said run v 5.6 because all the bugs have been worked out. He hosts with another provider so he knows a thing or two about this topic and many other topics besides. I asked him what version he runs. 5.6. So I fired it up. But you know, it just felt like a giant step backwards through an open source timeline. I mean Perl v 5.8.8 vs Ubuntu’s 5.10.1. Now mind you by this time my version of Ubuntu is itself a year old. Apache version 2.2.3 and kernel version 2.6.18 versus 2.2.16 and 2.6.35. Just plain old. Though he said support would be available for fantastical amount of time, I decided to chuck that image.

Just as I was thinking about all these things Amazon made a really important announcement: prices to be lowered. All of a sudden they were competitive when viewed as a pure hosting provider, never mind all the other features they bring to bear.

I decided I wanted more memory than the 700 MB available to a micro image, and more storage than the 8 GB that tier gives. So a “small” image was the next step up, at 1.7 GB of memory and 160 GB disk space. But then I noticed a quirky thing – the small images only come in 32-bit, not 64-bit unlike all the other tiers. I am so used to 64-bit by now that I don’t trust 32-bit. I want to run what a lot of other people are running to know that the issues have been worked out.

Then another wonderful thing happened – Amazon announced support for 64-bit OSes in their small tier! What timing.

The Comparison Results
AWS lowered their prices by about 35%, a really substantial amount. I am willing to commit up front for an extended hosting because I expect to be in this for the long haul. Frankly, I love having my own server! So I committed to three years small tier, heavy usage after doing the math in order to get the best deal for a 24×7 server. It’s ~~$300~~ $96 up front and about ~~$0.012~~$0.027/hour for one instance hour. So that’s about ~~$18~~ $22/month over three years. Reasonable, I would say. For some reason my earlier calculations had it coming out cheaper. These numbers are as of September, 2013. I was prepared to use GoDaddy which I think is $24/month for a two-year commitment. My finding was that RackSpace and 1and1 were more expensive in turn than GoDaddy. I have no idea how AWS did what they did on pricing. It’s kind of amazing. My local providers? One came in at six times the cost of GoDaddy(!), the other about $55/month. Too bad for them. But I am excited about my new server. I guess it’s a sort of master of my own destiny type of thing that appeals to my independent spirit. Don’t tell Amazon, but really I think they could have easily justified charging a small premium for their hosting, given all the other convenient infrastructure services that are there, ready to be dialed up, say, like a load balancer, snapshots, additional IPs, etc. And there are literally 8000 images to choose from when you are deciding what image (OS) to run. That alone speaks volumes about the choices you have available.

What I’m up to
I installed CentOS 6.0 core image. It feels fresher. It’s based on RedHat 6.0 It’s got Perl v. 5.10.1, kernel 2.6.32, and, once you install it, Apache v 2.2.15. It only came with about 300 packages installed, which is kind of nice, not the usual 1000+ bloated deal I am more used to. And it seems fast, too. Now whether or not it will prove to be stable is an entirely different question and only time will tell. I’m optimistic. But if not, I’ll chuck it and find something else. I’ve got my data on a separate volume anyways which will persist regardless of what image I choose – another nice plus of Amazon’s utility computing model.

A Quick Tip About Additional Volumes
With my micro instance it occupied a full 8 GB so I didn’t have a care about additional disk volumes. On the other hand, my CentOS 6.0 core image is a lean 6 GB. If I’m entitled to 160 GB as part of what I’m paying for, how do I get the access to the remaining 154 GB? I guess you create a volume. Using the Admin GUI is easiest. OK, so you have your volunme, how does your instance see it? It’s not too obvious from their documentation but in CentOS my extra volume is

/dev/xvdj

I mounted that a formatted it as an ext4 device as per their instructions. It didn’t take that long. I put in a line in /etc/fstab like this:

/dev/xvdj /mnt/vol ext4 defaults 1 2

Now I’m good to go! It gets mounted after reboot.

Dec, 2016 update
Amazon has announced Lightsail to better compete with GoDaddy and their ilk. Plans start as low as $5 a month. For $10 a month you get a static IP, 1 GB RAM, 20 GB SSD storage I think and ssh access. So I hope that means root access. Oh, plus a pre-configured WordPress software.

Conclusion
Amazon EC2 rocks. They could have charged a premium but instead they are the cheapest offering out there according to my informal survey. The richness of their service offerings is awesome. I didn’t mention that you can mount the entire data set of the human genome, or all the facts of the world which have been assembled in freebase.org. How cool is that?

Tags CentOS, IaaS, Lightsail

Admin Apache Uncategorized Web Site Technologies

The IT Detective agency: Excessive Requests for PAC file Crippling Web Server

Post author By john
Post date March 12, 2012
4 Comments on The IT Detective agency: Excessive Requests for PAC file Crippling Web Server

Intro
Funny thing about infrastructure. You may have something running fine for years, and then suddenly it doesn’t. That is one of the many mysteries in the case of the excessive requests for PAC file.

The Details
We serve our Proxy Auto-config (PAC) file from a couple web servers which are load-balanced. It’s worked great for over 10 years. The PAC file is actually produced by a Perl script which can alter the content based on the user’s IP or other variables.

The web servers got bogged down last week. I literally observed the load average shoot up past 200 (on a 2-CPU server). This is not good.

I quickly noticed lots and lots of accesses for wpad.dat and proxy.pac. Some PCs were individually making hundreds of requests for these files in a day! Sometimes there were 15 or 20 requests in a minute. Since it is a script it takes some compute power to handle all those requests. So it was time to do one of two things: either learn the root cause and address it, or make a quick fix. The symptoms were clear enough, but I had no idea about the root cause. I also was fascinated by the requests for wpad.dat which I felt was serving no purpose whatsoever in our environment. So I went for the quick fix hopinG that understanding would come later.

To be continued…
As promised – three and a half years later! And we still have this problem. It’s probably worse than ever. I pretty much threw in the towel and learned how to scale up our apache web server to handle more PAC file requests simultaneously, see the references.

References
Scaling apache to handle more requests.

Tags PAC file

Admin Linux SLES

How to Get By Without unix2dos in SLES

Post author By john
Post date March 7, 2012
2 Comments on How to Get By Without unix2dos in SLES

Intro
As a Unix old-timer looking at the latest releases, I only have observed one tendency – that of ever-increasing numbers of commands, always additive – until now. A command I considered useful (well, basically any command I have ever used I consider useful) has gone AWOL in Suse Linux Enterprise Server (SLES for short): unix2dos.

Why You Need It
These days you need it more than ever. What with sftp being used in place of ftp, your transferred text files will come over from a SLES server to your PC in binary mode, preserving the Linux-style way of declaring a new line with the newline character, “\n”. Bring that file onto your PC and look at it in Notepad and you’ll get one long line because Windows requires more to indicate a new line. Windows OS’s like Windows 7 require a carriage return + newline, i.e., “\r\n”.

Who You Going to Call
I spoke with some experts so I cannot take credit for finding this out personally. Long story short things evolved and there is a more sophisticated command available that does this sort of thing and much else. That’s recode.

But I don’t think I’ll ever use recode for anything else so I decided to re-create a unix2dos command using recode in a tiny shell script:

#!/bin/sh
# inspired by http://yourlinuxguy.com/?p=232 and the fact that they took away this useful command
# 3/6/12
recode latin1..ibmpc $*

You call it like this:

> unix2dos file

and it overwrites the file and converts it to the format Windows expects.

My other expert contact says I could find the old unix2dos in OpenSuse but I decided not to go that route.

Of course to convert in the other direction you have dos2unix which for some reason wasn’t removed from the distro. Strange, huh?

How to See That It Worked
I use

> od -c file|more

to look at the ascii characters in a text file. It also shows the newline and carriage return characters with a \n and \r respectively This is a good command to know because it is also a “safe” way to look at a binary file. By safe I mean it won’t try to print out 8-bit characters that will permanently mess your terminal settings!

2017 update
I finally needed this utility again after five years. My program doesn’t work on CentOS. – No recode, whatever that was. However, the one-liner provided in the comments worked just fine for me.

Conclusion
We can rest easy and send text files back-and-forth between a PC and a SLES server with the help of this unix2dos script we developed.

Interestingly, RedHat’s RHEL has kept unix2dos in its disrtibution. Good for them. In ubuntu Linux unix2dos also seems decidedly missing.

Tags recode, unix2dos

Admin Linux

Common Problems Installing Cognos Gateway on Linux

Post author By john
Post date March 2, 2012
No Comments on Common Problems Installing Cognos Gateway on Linux

Updated for a 2018 Cognos 11 install
with 2013 updates for Cognos 10 installation

Intro
I tried to take a shortcut and get a 2nd Cognos gateway up and running by copying files, etc. rather than a proper install. At one time or another I feel I must have encountered just about every problem conceivable. I didn’t take great, systematic notes, but I’d like to mention some highlights while it is still fresh in my memory!

The Details
Note that I have a working gateway server running on the same version of Linux, SLES 11 SP1. So I thought I’d be clever and just copy all the files below /opt/cognos8 from the working server.

First Rookie Mistake
Let’s call our COGNOS_ROOT /opt/cognos8 for convenience.
Cognos 10 note: /opt/cognos10 would be a more sensible installation directory!

So you’re following along in the documentation and dutifully looking for /opt/cognos8/bin/cogconfig.sh, and not finding it? Me, neither. So I cleverly borrowed it from a working solaris installation. It’s all Java, right, no OS dependencies, what can go wrong? Ha, ha. You try:

./cogconfig.sh
and get:

Using /usr/lib64/jvm/jre/bin/java
The java class is not found:  CRConfig

Long story short. Give up. Without telling anyone they moved it to /opt/cognos8/bin64. That’s assuming you’re on a 64-bit system like most of us are.

OK. Now you run it from the …bin64 directory, expecting better results, only to perhaps get something like:

./cogconfig.sh

Unable to locate a JRE. Please specify a valid JAVA_HOME environment variable.

Long story short, java-1_4_2-ibm (java-1_6_0-ibm if installing a Cognos 10 gateway) is a good Java environment to install for Cognos Gateway. At least it is on SLES Linux. So you install that and set up environment variables like these:

export JAVA_BINDIR=/usr/lib64/jvm/jre/bin
export JAVA_HOME=/usr/lib64/jvm/jre
export JAVA_ROOT=/usr/lib64/jvm/jre

Now you’re cooking. Run it yet again. You’re smart and know to set up your DISPLAY environment to a valid XServer you have access to. But even if the X application actually does launch and run (you may need some Motif or additional X packages, possibly even from the SDK DVD – see appendix A), if you try to export the configuration you’ll get an error like this:

java.lang.ClassNotFoundException: org.bouncycastle134.jce.provider.BouncyCastleProvider

Cognos 10 note: I did not have this class missing in my Cognos 10 installation. Yeah!

Yes, you are missing the infamous bouncycastleprovider! This stuff is too good to make up, right? It’s a jar file that’s somewhere in the Cognos Gateway distribution, bcprov-jdk14-134.jar. In my case I need to put it here:

/etc/alternatives/jre/lib/ext

With that in place run it yet again. Now you may be unable to export the configuration with this error:

CAM-CRP-1057 Unable to generate the machine specific symmetric key.

Does it ever end? Yes!

You may have old values of keys and what-not cryptography stuff from your copy of the other system. So you remove these directories and all their contents:

/opt/cognos8/{encryptkeypair,signkeypair}

And I even saw the following error:

02/03/2012,11:26:56,Err,com.cognos.crconfig.data.DataManagerException: CAM-CRP-1132 An error occurred while attempting to request a certificate from the Certificate Authority service. Unable to connect to the Certificate Authority service. Ensure that the Content Manager computer is configured and that the Cognos 8 services on it are currently running. Reason: java.net.ConnectException: Connection refused, com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:2730)

I think it comes about if you save the default config without editing it and putting in a valid dispatcher URI, but I forget.

The main point towards the end was to start with a clean config by a:

cd /opt/cognos8/configuration;cp cogstartup.xml{.new,}

, making sure there is no encryptkeypair and signkeypair directories, launching …bin64/cogconfig.sh, working with the GUI to define the dispatcher URIs to your working, running Cognos dispatcher, exporting it,

(Let me take a breath here. If that export succeeds, you’re home.)

and finally saving it, which also generates the system-specific keys.

That’s it! A bunch of green check marks are your reward. Hopefully.

Conclusion
In the end you will see that this “cheap method” of installing Cognos Gateway worked. We had a few bumps along the road, but we worked through them all. Now that we’ve seen just about every conceivable problem we have a treasure trove of documented errors and fixes should we ever find ourselves in this situation again.

There is one more Cognos Gateway problem we resolved, by the way, that was previously documented here.

Appendix A – Cognos 10 note
Yes, I referred to this document in my own installation of Cognos version 10 gateway component. The problems are very similar, and this was a big help, if I say so myself.

I notice I write a tight narrative. I have lots of tangential thoughts, but to list them all as I think of them would destroy the flow of the narrative. In this case I wanted to expand on the openmotif packages.

I got a missing libXm.so.4 message when launching issetup the first time. I determined this came from an openmotif package from my previous successful installation on another server. My new server had limited repositories.

> zypper search openmotif

produced these results:

 
S | Name                   | Summary                    | Type
--+------------------------+----------------------------+-----------
  | openmotif21-demos      | Open Motif 2.2.4 Libraries | package
  | openmotif21-libs       | Open Motif 2.2.4 Libraries | package
  | openmotif21-libs       | Open Motif 2.2.4 Libraries | srcpackage
  | openmotif21-libs-32bit | Open Motif 2.2.4 Libraries | package
  | openmotif22-libs       | Open Motif 2.2.4 Libraries | package
  | openmotif22-libs       | Open Motif 2.2.4 Libraries | srcpackage
  | openmotif22-libs-32bit | Open Motif 2.2.4 Libraries | package

Well, I tried to install first openmotif21-libs-32bit then openmotif22-libs-32bit, but neither gave me the right version of libXm.so! I had versions 2, 3 and 6! So I simply did one of these numbers:

> cd /usr/lib; ln -s libXm.so.3.0.3 libXm.so.4

and, to my surprise, it worked!

More Errors Documented for completeness’ sake
At the risk of making this blog post a total mess, I’ll include a few more errors I encountered during the upgrade. Who knows who might find this useful.

Generating the cryptographic keys is always a hold-your-breath-and-pray operation. I had my upgrade files in place in a new install directory, /opt/cognos10. I ran bin64/cogconfig.sh like usual. It was suggested I could save the configuration even though the application gateway wasn’t running, so I tried that. No dice.

The cryptographic information cannot be encrypted.

Fine. So probably the app server needs to be running before we save the config, right? So they got it running. I tried to save the config. Same error. The details were as follows:

[ ERROR ]
CAM-CRP-1315 Current configuration points to a different Trust Domain than originally configured.
 
[ ERROR ] 
The cryptography information was not generated.

The remedy? Close the configuration and completely remove these directories beneath the /opt/cognos10/configuration directory:

– encryptkeypair
– signkeypair
– csk (actually I didn’t have this one. But I guess it should be removed if present)

I held my breath, re-ran cogconfig and saved. This time it worked!

I also had an error with my Java version:

./cogconfig.sh
Using /usr/lib64/jvm/jre/bin/java
The java class could not be loaded. java.lang.UnsupportedClassVersionError: (CRConfig) bad major version at offset=6

/usr/lib64/jvm/jre/bin/java -version

showed

java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 2.3)
IBM J9 VM (build 2.3, J2RE 1.4.2 IBM J9 2.3 Linux amd64-64 j9vmxa64142ifx-20110628 (JIT enabled)
J9VM - 20110627_85693_LHdSMr
JIT  - 20090210_1447ifx5_r8
GC   - 200902_24)

I installed a newer Java:

zypper install  java-1_6_0-ibm

and got past this error.

April 20123 update
Just when you thought every possible error was covered, you encounter a new one. Cognos Mobile isn’t working so well on actual mobile devices so they wanted to try a Fixpack from IBM. No problem, right? They gave me

up_cogmob_linuxi38664h_10.2.1102.33_ml.rar

and I set to work. I don’t particularly like rar files for Linux, but I figured out there is an unrar command:

$ unrar e up_*rar

But after setting up my DISPLAY environment variable I get this new error running ./issetup:

X Error of failed request:  BadDrawable (invalid Pixmap or Window parameter)
  Major opcode of failed request:  14 (X_GetGeometry)
  Resource id in failed request:  0x2
  Serial number of failed request:  257
  Current serial number in output stream:  257
IDS_MSG_PREFIXIDS_COPYRIGHT_LOGOIDS_MSG_PREFIXIDS_MSG_READ_ARCHIVE

The solution? They downloaded a tar.gz version of the Fixpack. I unpacked that and had absolutely no problems with issetup! The really strange thing is that in both issetup are identical files. I use cksum to do a quick compare. Even setup.csp are identical files. I did an strace -f of the two cases but the salient difference didn’t pop out at me. The files present in the tar.gz seem to be fewer in number.

Another random error you will encounter sooner or later

You are doing a Save in cogconfig and you get:

13/05/2013,17:39:05,Err,CAM-CRP-1132 An error occurred while attempting to request a certificate from the Certificate Authority service. Unable to connect to the Certificate Authority service. Ensure that the Content Manager computer is configured and that the IBM Cognos services on it are currently running. Reason: java.net.ConnectException: Connection refused, com.cognos.crconfig.data.crypto.ConfiguringSession.configure(ConfiguringSession.java:35)com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:3037)com.cognos.crconfig.data.DataManager$4.run(DataManager.java:4169)com.cognos.crconfig.data.CnfgActionEngine$CnfgActionThread.run(CnfgActionEngine.java:394)com.cognos.crconfig.data.crypto.ConfiguringSession.configure(ConfiguringSession.java:35)com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:3037)com.cognos.crconfig.data.DataManager$4.run(DataManager.java:4169)com.cognos.crconfig.data.CnfgActionEngine$CnfgActionThread.run(CnfgActionEngine.java:394)com.cognos.crconfig.data.crypto.ConfiguringSession.configure(ConfiguringSession.java:35)com.cognos.crconfig.data.DataManager.generateCryptoKeys(DataManager.java:3037)com.cognos.crconfig.data.DataManager$4.run(DataManager.java:4169)com.cognos.crconfig.data.CnfgActionEngine$CnfgActionThread.run(CnfgActionEngine.java:394)

This looks scary but has an easy fix. You aren’t communicating with the app server. Probably their dispatcher services are down. Bring them up and it should work fine – it did for me. This is assuming of course that you have your dispatcher URLs set up correctly.

I cloned my Cognos web gateway and got this error
I waited for a few weeks to examine the clone. I ran

$ ./cogconfig.sh

and got this error:

16/05/2013,15:57:35,Err,CAM-CRP-1280 An error occurred while trying to decrypt using the system protection key. Reason: javax.crypto.IllegalBlockSizeException: Input length (with padding) not multiple of 16 bytes

Umm. I don’t have the solution yet. One thing is most highly suspect: in the meatime we re-generated the keys on the production web gateway. So I am hoping that is all we need to do here as well.

Resolved. Here is the process I followed – a sort of colonic for Cognos:

$ cd /opt/cognos10/configuration; rm csk/* signkeypair/* encryptkeypair/* cogstartup.xml
$ cd ../bin64; ./cogconfig.sh

Then in the GUI I re-defined the app servers in the dispatcher URI portion of the environment.
Then did a Save.
Worked like a champ – four green check marks.

cogconfig hangs
This happened to me on an older server. The IBM Cognos Configuration screen displays but it’s supposed to exit so you can get to the part where you edit the configuration and it never does.

Currently no known solution.

June 2018 update
Cognos 11 install problem
The Cognos 11 install was going pretty well. Until it came time to launch cogconfig. That generated this error:

cognos10:/web/cognos11/bin64> ./cogconfig.sh

Using /usr/lib64/jvm/jre/bin/java
Exception in thread "main" java.lang.UnsupportedClassVersionError: JVMCFRE003 bad major version; class=com/cognos/accman/jcam/crypto/CAMCryptoException, offset=6
        at java.lang.ClassLoader.defineClass(ClassLoader.java:286)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:74)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:538)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
        at java.net.URLClassLoader.access$300(URLClassLoader.java:77)
        at java.net.URLClassLoader$ClassFinder.run(URLClassLoader.java:1041)
        at java.security.AccessController.doPrivileged(AccessController.java:448)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:427)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:676)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:358)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:642)
        at java.lang.J9VMInternals.verifyImpl(Native Method)
        at java.lang.J9VMInternals.verify(J9VMInternals.java:73)
        at java.lang.J9VMInternals.initialize(J9VMInternals.java:133)
        at com.cognos.cclcfgapi.CCLConfigurationFactory.getInstance(CCLConfigurationFactory.java:59)
        at com.cognos.crconfig.CnfgPreferences.<init>(CnfgPreferences.java:51)
        at com.cognos.crconfig.CnfgPreferences.<clinit>(CnfgPreferences.java:36)
        at java.lang.J9VMInternals.initializeImpl(Native Method)
        at java.lang.J9VMInternals.initialize(J9VMInternals.java:199)
        at CRConfig.main(CRConfig.java:144)

Note my system java version is woefully out-of-date:

$ /usr/lib64/jvm/jre/bin/java ‐version

java version "1.6.0"
Java(TM) SE Runtime Environment (build pxa6460sr16fp15-20151106_01(SR16 FP15))
IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr16fp15-20151020_272943 (JIT enabled, AOT enabled)
J9VM - 20151020_272943
JIT  - r9_20151019_103450
GC   - GA24_Java6_SR16_20151020_1627_B272943)
JCL  - 20151105_01

whereas the Cognos-supplied Java is two versions ahead:
cognos10:/web/cognos11> ./jre/bin/java ‐version

java version "1.8.0"
Java(TM) SE Runtime Environment (build pxa6480sr4fp10-20170727_01(SR4 FP10))
IBM J9 VM (build 2.8, JRE 1.8.0 Linux amd64-64 Compressed References 20170722_357405 (JIT enabled, AOT enabled)
J9VM - R28_20170722_0201_B357405
JIT  - tr.r14.java_20170722_357405
GC   - R28_20170722_0201_B357405_CMPRSS
J9CL - 20170722_357405)
JCL - 20170726_01 based on Oracle jdk8u144-b01

Instead of the previous approach which involved upgrading the system Java, I decided to just try the Java version Cognos itself had installed. In the following commands note that my installation directory was /web/cognos11.

$ cd /web/cognos11; export JAVA_HOME=`pwd`/jre
$ ./cogconfig.sh

Using /web/cognos11/jre/bin/java
06/06/2018,11:13:04,Dbg,Use Customized settings for font and color.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/web/cognos11/bin/slf4j-nop-1.7.23.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/web/cognos11/configuration/utilities/config-util.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.helpers.NOPLoggerFactory]
06/06/2018,11:13:10,Dbg,The original cogstartup.xml file is clear text. Don't back it up.

That is to say, it worked! I’ve often seen software packages install their own versions of Java. This is the first time I thought to take advantage of that. Wish I had thought of this approach during the Cognos 10 install!

Tags Cognos, Cognos 10, Cognos 11, cognos mobile

Admin Perl Web Site Technologies

Turning HP SiteScope into SiteScope Classic with Perl

Post author By john
Post date February 28, 2012
15 Comments on Turning HP SiteScope into SiteScope Classic with Perl

Intro
HP siteScope is a terrific web application tool and not too expensive for those who have any kind of a budget. The built-in monitor types are a bit limited, but since it allows calls to user-provided scripts your imagination is the only real limitation. For those with too many responsibilities and too little time on their hands it is a real productivity enhancer.

I’ve been using the product for 12 years now – since it was Freshwater SiteScope. I still have misgivings about the interface change introduced some years ago when it was part of Mercury. It went from simple and reliable to Java, complicated and flaky. To this day I have to re-start a SiteScope screen in my browser on a daily basis as the browser cannot recover from a server restart or who knows what other failures.

So I longed for the days of SiteScope Classic. We kept it running for as long as possible, years in fact. But at some point there were no more releases created for the classic view. So I investigated the feasibility of creating my own conversion tool. And…partially succeeded. Succeeded to the point where I can pull up the web page on my Blackberry and get the statuses and history. Think you can do that with regular HP SiteScope? I can’t. Maybe there’s an upgrade for it, but still. It’s nice to have the classic interface when you want to pull up the statuses as quickly as possible, regardless of the Blackberry display issue.

Looking back at my code, I obviously decided to try my hand at OO (object oriented) programming in Perl, with mixed results. Perl’s OO syntax isn’t the best, which addles comprehension. Without further ado, let’s jump into it.

The Details
It relies on something I noticed, that this URL on your HP SiteScope server, http://localhost:8080/SiteScope/services/APIConfigurationImpl?method=getConfigurationSnapshot, contains a tree of relationships of all the monitors. Cool, right? But it’s not a tree like you or I would design. Between parent and child is an intermediate layer. I suppose you need that because a group can contain monitors (my only focus in this exercise), but it can also contain alerts and maybe some other properties as well. So I guess the intermediate layer gives them the flexibility to represent all that, though it certainly added to my complication in parsing it. That’s why you’ll see the concern over “grandkids.” I developed a recursive, web-enabled Perl program to parse through this xml. That gives me the tools to build the nice hierarchical groupings. But it does not give me the statuses.

For the status of each monitor I wrote a separate scraper script that simply reads the entire daily SiteScope log every minute! Crude, but it works. I use it for an installation with hundreds of monitors and a log file that grows to 9 MB by the end of the day so I know it scales to that size. Beyond that it’s untested.

In addition to giving only the relationships, the xml also changes with every invocation. It attaches ID numbers to the monitors which initially you think is a nice unique identifier, but they change from invocation to invocation! So an additional challenge was to match up the names of the monitors in the xml output to the names as recorded in the SiteScope log. Also a bit tricky, but in general doable.

So without further ado, here’s the source code for the xml parser and main program which gets called from the web:

#!/usr/bin/perl
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# build v.simple SiteScope web GUI appropriate for smartphones
# 7/2010
#
# Id is our package which defines th Id class
use Id;
use CGI::Pretty;
my $cgi=new CGI;
$DEBUG = 0;
# GIF location on SiteScope classic
$ssgifs = "/artwork/";
$health{good} = qq(<img src="${ssgifs}okay.gif">);
$health{error} = qq(<img src="${ssgifs}error.gif">);
$health{warning} = qq(<img src="${ssgifs}warning.gif">);
# report CGI
$rprt = "/SS/rprt";
# the frustrating thing is that this xml output changes almost every time you call it
$url = 'http://localhost:8080/SiteScope/services/APIConfigurationImpl?method=getConfigurationSnapshot';
# get current health of all monitors - which is scraped from the log every minute by a hilgarj cron job
$monitorstats = "/tmp/monitorstats.txt";
print "Content-type: text/plain\n\n" if $DEBUG;
open(MONITORSTATS,"$monitorstats") || die "Cannot open monitor stats file $monitorstats!!";
while(<MONITORSTATS>) {
  chomp;
  ($monitor,$status,$value) = /([^\t]+)\t([^\t]+)\t([^\t]+)/;
  $monitors{"$monitor"} = $status;
  $monitorv{"$monitor"} = $value;
}
open(CURL,"curl $url 2>/dev/null|") || die "cannot open $url for reading!!\n";
my %myobjs = ();
# the xml is one long line!
@lines = <CURL>;
#print "xml line: $lines[0]\n" if $DEBUG;
@multiRefs = split "<multiRef",$lines[0];
#parse multiRefs
# create top-level object
my $id = Id->new (
      id => "id0");
# hash of this object with id as key
$myobjs{"id0"} = $id;
 
# first build our objects...
foreach $mref (@multiRefs) {
  next unless $mref =~ /\sid=/;
#  id="id0" ...
  ($parentid) =  $mref =~ /id=\"(id\d+)/;
  print "parentid: $parentid\n" if $DEBUG;
# watch out for <item><key xsi:type="soapenc:string">groupSnapshotChildren</key><value href="#id3 ...
# vs <item><key xsi:type="soapenc:string">Network</key><value href="#id40"/>
  print "mref: $mref\n" if $DEBUG;
  @ids = split /<item><key/, $mref;
# then loop over ids mentioned in this mref
  foreach $myid (@ids) {
    next unless $myid =~ /href="#(id\d+)/;
    next unless $myobjs{"$parentid"};
# types include group, monitor, alert
    ($typebyregex) = $myid =~ />snapshot_(\w+)SnapshotChildren</;
    $parenttype = $myobjs{"$parentid"}->type();
    $type = $typebyregex ? $typebyregex : $parenttype;
    print "type: $type\n" if $DEBUG;
# skip alert definitions
    next if $type eq "alert";
    print "myid: $myid\n" if $DEBUG;
    ($actualid) = $myid =~ /href="#(id\d+)/;
    print "actualid: $actualid\n" if $DEBUG;
# construct object
    my $id = Id->new (
      id => $actualid,
      type => $type,
      parentid => $parentid );
# build hash of these objects with actualid as key
    $myobjs{$actualid} = $id;
# addchild to parent. note that parent should already have been encountered
    $myobjs{"$parentid"}->addchild($actualid);
    if ($myid !~ /groupSnapshotChildren/) {
# interesting child - has name (every other generation has no name!)
      ($name) = $myid =~ /string\">(.+?)<\/key/;  # use non-greedy operator
      print "name: $name\n" if $DEBUG;
# some names are not of interest to us: alerts, which end in "error" or "good"
      if ($name !~ /(error|good)$/) {
# name may not be unique - get extended name which include all parents
        if (defined $myobjs{"$parentid"}->parentid()) {
          $gdparid = $myobjs{"$parentid"}->parentid();
          $gdparname = $myobjs{"$gdparid"}->extname();
# extname -> extended, or distinguished name.  Should be unique
          $extname = $gdparname. '/' . $name;
        } else {
# 1st generation
          print "1st generation\n" if $DEBUG;
          $extname = $name;
        }
        print "extname: $extname\n" if $DEBUG;
        $id->name($name);
        $id->extname($extname);
        $id->isanamedid(1);
        $myobjs{"$parentid"}->hasnamedkids(1); # want to mark its parent as "special"
# we also need our hash to reference objects by extended name since id changes with each extract and name
may not be unique
        $myobjs{"$extname"} = $id;
      } # end conditional over desirable name check
    } else {
      $id->isanamedid(0);
    }
  }
}
#
# now it's all parsed and our objects are alive. Let's build a web site!
#
# build a cookie containing path
my $pi = $ENV{PATH_INFO};
$script = $ENV{SCRIPT_NAME};
$ua = $ENV{HTTP_USER_AGENT};
# Blackberry browser test
$BB = $ua =~ /^BlackBerry/i ? 1 : 0;
$MSIE = $ua =~ /MSIE /;
# font-size depends on browser
$FS = "font-size: x-small;" if $MSIE;
$cookie = $cgi->cookie("pathinfo");
$uri = $script . $pi;
$cookie=$cgi->cookie(-name=>"pathinfo", -value=>"$uri");
print $cgi->header(-type=>"text/html",-cookie=>$cookie);
($url) = $pi =~ m#([^/]+)$#;
#  -title=>'SmartPhone View',
# this doesn't work, sigh...
#print $cgi->start_html(-head=>meta({-http_equiv=>'Refresh'}));
print qq( <HEAD>
<meta http-equiv="Expires" content="0">
<meta http-equiv="Pragma" content="no-cache">
<meta HTTP-EQUIV="Refresh" CONTENT="60; URL=$url">
<TITLE>SiteScope Classic $url Detail</TITLE>
<style type="text/css">
a.good {color: green; }
a.warning {color: green; }
a.error {color: red; }
td {font-family: Arial, Helvetica, sans-serif; $FS}
p.ss {font-family: Arial, Helvetica, sans-serif;}
</style>
<link rel="shortcut icon" href="/favicon.ico" type="image/x-icon" />
<script type=text/javascript>
function changeme(elemid,longvalue)
{
document.getElementById(elemid).innerText=longvalue;
}
function restoreme(elemid,truncvalue)
{
document.getElementById(elemid).innerText=truncvalue;
}
</script>
</HEAD><body>
);
 
#print $cgi->h1("This is the heading");
# parse path
# top lvl name:2nd lvl name:3rd lvl name
$altpi = $cgi->path_info();
print $cgi->p("pi is $pi") if $DEBUG;
#print $cgi->p("altpi is $altpi");
# relative url
$rurl = $cgi->url(-relative=>1);
if ($pi eq "") {
# the top
# top id is id3
  print qq(<p class="ss">);
  $myid = "id3";
  foreach $kid ($myobjs{"$myid"}->get_children()) {
    my $kidname = $myobjs{"$kid"}->name();
# kids can be subgroups or standalone monitors
    my $health = recurse("/$kidname");
    print "$health{$health} <a href=\"$rurl/$kidname\">$kidname</a><br>\n";
    $prodtest = $kid if $kidname eq "Production";
  }
  print "</p>\n";
} else {
  $extname = $pi;
  print "pi,name,extname,script: $pi,$name,$extname,$script\n" if $DEBUG;
# print where we are
  $uriname = $pi;
  $uriname =~ s#^/##;
  #print $cgi->p("name is $name");
  #print $cgi->p("uriname is $uriname");
  $uricompositepart = "/";
  @uriparts = split('/',$uriname);
  $lastpart = pop @uriparts;
  print qq(<p class="ss"><a href="$script"><b>Sitescope</b></a><br>);
  print qq(<b>Monitors in: );
  foreach $uripart (@uriparts) {
    my $healthp = recurse("$uricompositepart$uripart");
# build valid link
    ##$link = qq(<a class="good" href="$script$uricompositepart$uripart">$uripart</a>: );
    $link = qq(<a class="$healthp" href="$script$uricompositepart$uripart">$uripart</a>: );
    $uricompositepart .= "$uripart/";
    print $link;
  }
  my $healthp = recurse("$uricompositepart$lastpart");
  $color = $healthp eq "error" ? "red" : "green";
  print qq(<font color="$color">$lastpart</font></b></p>\n);
  print qq(<table border="1" cellspacing="0">);
  #print qq(<table>);
  %hashtrs = ();
  foreach $kid ($myobjs{"$extname"}->get_children()) {
    print "kid id: " . $myobjs{"$kid"}->id() . "\n" if $DEBUG;
    next unless $myobjs{"$kid"}->hasnamedkids();
    foreach $gdkid ($myobjs{"$kid"}->get_children()) {
      print "gdkid id: " . $myobjs{"$gdkid"}->id() . "\n" if $DEBUG;
      $gdkidname = $myobjs{"$gdkid"}->name();
      $gdkidextname = $myobjs{"$gdkid"}->extname();
      my $health = recurse("$gdkidextname");
      my $type = $myobjs{"$gdkid"}->type();
# dig deeper to learn health of the grankid's grandkids
      $objct = $healthct{good} = $healthct{error} = $healthct{warning} = 0;
      foreach $ggkid ($myobjs{"$gdkidextname"}->get_children()) {
        print "ggkid id: " . $myobjs{"$ggkid"}->id() . "\n" if $DEBUG;
        next unless $myobjs{"$ggkid"}->hasnamedkids();
        foreach $gggdkid ($myobjs{"$ggkid"}->get_children()) {
          print "gggdkid id: " . $myobjs{"$gggdkid"}->id() . "\n" if $DEBUG;
          $gggdkidname = $myobjs{"$gggdkid"}->name();
          $gggdkidextname = $myobjs{"$gggdkid"}->extname();
          my $health = recurse("$gggdkidextname");
          $objct++;
          $healthct{$health}++;
        }
      }
      $elemct++;
      $elemid = "elemid" . $elemct;
# groups should have distinctive cell background color to set them apart from monitors
      if ($type eq "group") {
        $bgcolor = "#F0F0F0";
        $celllink = "$lastpart/$gdkidname";
        $truncvalue = qq(<font color="red">$healthct{error}</font>/$objct);
        $tdval = $truncvalue;
      } else {
        $bgcolor = "#FFFFFF";
        $celllink = "$rprt?$gdkidname";
# truncate monitor value to save display space
        $longvalue = $monitorv{"$gdkidname"};
        (my $truncvalue) = $monitorv{"$gdkidname"} =~ /^(.{7,9})/;
        $truncvalue = $truncvalue? $truncvalue : "&nbsp;";
        $tdval = qq(<span id="$elemid" onmouseover="changeme('$elemid','$longvalue')" onmouseout="restorem
e('$elemid','$truncvalue')">$truncvalue</span>);
      }
      $hashtrs{"$gdkidname"} = qq(<tr><td bgcolor="#000000">$health{$health} </td><td>$tdval</td><td bgcol
or="$bgcolor"><a href="$celllink">$gdkidname</a></td></tr>\n);
# for health we're going to have to recurse
    }
  }
# print out in alphabetical order
  foreach $key (sort(keys %hashtrs)) {
    print $hashtrs{"$key"};
  }
  print "</table>";
}
print $cgi->end_html();
#######################################
sub recurse {
# to get the union of health of all ancestors
my $moniext = shift;
my ($moni) = $moniext =~ m#/([^/]+)$#;
# don't bother recursing and all that unless we have to...
return $myobjs{"$moniext"}->health() if defined $myobjs{"$moniext"}->health();
print "moni,moniext: $moni, $moniext\n" if $DEBUG;
my ($kid,$gdkidextname,$health,$cumhealth);
$cumhealth = $health = $monitors{"$moni"} ? $monitors{"$moni"} : "good";
foreach $kid ($myobjs{"$moniext"}->get_children()) {
    if ($myobjs{"$kid"}->hasnamedkids()) {
      foreach $gdkid ($myobjs{"$kid"}->get_children()) {
        $gdkidextname = $myobjs{"$gdkid"}->extname();
# for health we're going to have to recurse
        $health = recurse("$gdkidextname");
        if ($health eq "error" || $cumhealth eq "error") {
          $cumhealth = "error";
        } elsif ($health eq "warning" || $cumhealth eq "warning") {
          $cumhealth = "warning";
        }
      }
    } else {
# this kid is end of line
      $health = $monitors{"$kid"} ? $monitors{"$kid"} : "good";
        if ($health eq "error" || $cumhealth eq "error") {
          $cumhealth = "error";
        } elsif ($health eq "warning" || $cumhealth eq "warning") {
          $cumhealth = "warning";
        }
    }
}
$myobjs{"$moniext"}->health("$cumhealth");
return $cumhealth;
} # end sub recurse

#!/usr/bin/perl # Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0 # build v.simple SiteScope web GUI appropriate for smartphones # 7/2010 # # Id is our package which defines th Id class use Id; use CGI::Pretty; my $cgi=new CGI; $DEBUG = 0; # GIF location on SiteScope classic $ssgifs = "/artwork/"; $health{good} = qq(<img src="${ssgifs}okay.gif">); $health{error} = qq(<img src="${ssgifs}error.gif">); $health{warning} = qq(<img src="${ssgifs}warning.gif">); # report CGI $rprt = "/SS/rprt"; # the frustrating thing is that this xml output changes almost every time you call it $url = 'http://localhost:8080/SiteScope/services/APIConfigurationImpl?method=getConfigurationSnapshot'; # get current health of all monitors - which is scraped from the log every minute by a hilgarj cron job $monitorstats = "/tmp/monitorstats.txt"; print "Content-type: text/plain\n\n" if $DEBUG; open(MONITORSTATS,"$monitorstats") || die "Cannot open monitor stats file $monitorstats!!"; while(<MONITORSTATS>) { chomp; ($monitor,$status,$value) = /([^\t]+)\t([^\t]+)\t([^\t]+)/; $monitors{"$monitor"} = $status; $monitorv{"$monitor"} = $value; } open(CURL,"curl $url 2>/dev/null|") || die "cannot open $url for reading!!\n"; my %myobjs = (); # the xml is one long line! @lines = <CURL>; #print "xml line: $lines[0]\n" if $DEBUG; @multiRefs = split "<multiRef",$lines[0]; #parse multiRefs # create top-level object my $id = Id->new ( id => "id0"); # hash of this object with id as key $myobjs{"id0"} = $id; # first build our objects... foreach $mref (@multiRefs) { next unless $mref =~ /\sid=/; # id="id0" ... ($parentid) = $mref =~ /id=\"(id\d+)/; print "parentid: $parentid\n" if $DEBUG; # watch out for <item><key xsi:type="soapenc:string">groupSnapshotChildren</key><value href="#id3 ... # vs <item><key xsi:type="soapenc:string">Network</key><value href="#id40"/> print "mref: $mref\n" if $DEBUG; @ids = split /<item><key/, $mref; # then loop over ids mentioned in this mref foreach $myid (@ids) { next unless $myid =~ /href="#(id\d+)/; next unless $myobjs{"$parentid"}; # types include group, monitor, alert ($typebyregex) = $myid =~ />snapshot_(\w+)SnapshotChildren</; $parenttype = $myobjs{"$parentid"}->type(); $type = $typebyregex ? $typebyregex : $parenttype; print "type: $type\n" if $DEBUG; # skip alert definitions next if $type eq "alert"; print "myid: $myid\n" if $DEBUG; ($actualid) = $myid =~ /href="#(id\d+)/; print "actualid: $actualid\n" if $DEBUG; # construct object my $id = Id->new ( id => $actualid, type => $type, parentid => $parentid ); # build hash of these objects with actualid as key $myobjs{$actualid} = $id; # addchild to parent. note that parent should already have been encountered $myobjs{"$parentid"}->addchild($actualid); if ($myid !~ /groupSnapshotChildren/) { # interesting child - has name (every other generation has no name!) ($name) = $myid =~ /string\">(.+?)<\/key/; # use non-greedy operator print "name: $name\n" if $DEBUG; # some names are not of interest to us: alerts, which end in "error" or "good" if ($name !~ /(error|good)$/) { # name may not be unique - get extended name which include all parents if (defined $myobjs{"$parentid"}->parentid()) { $gdparid = $myobjs{"$parentid"}->parentid(); $gdparname = $myobjs{"$gdparid"}->extname(); # extname -> extended, or distinguished name. Should be unique $extname = $gdparname. '/' . $name; } else { # 1st generation print "1st generation\n" if $DEBUG; $extname = $name; } print "extname: $extname\n" if $DEBUG; $id->name($name); $id->extname($extname); $id->isanamedid(1); $myobjs{"$parentid"}->hasnamedkids(1); # want to mark its parent as "special" # we also need our hash to reference objects by extended name since id changes with each extract and name may not be unique $myobjs{"$extname"} = $id; } # end conditional over desirable name check } else { $id->isanamedid(0); } } } # # now it's all parsed and our objects are alive. Let's build a web site! # # build a cookie containing path my $pi = $ENV{PATH_INFO}; $script = $ENV{SCRIPT_NAME}; $ua = $ENV{HTTP_USER_AGENT}; # Blackberry browser test $BB = $ua =~ /^BlackBerry/i ? 1 : 0; $MSIE = $ua =~ /MSIE /; # font-size depends on browser $FS = "font-size: x-small;" if $MSIE; $cookie = $cgi->cookie("pathinfo"); $uri = $script . $pi; $cookie=$cgi->cookie(-name=>"pathinfo", -value=>"$uri"); print $cgi->header(-type=>"text/html",-cookie=>$cookie); ($url) = $pi =~ m#([^/]+)$#; # -title=>'SmartPhone View', # this doesn't work, sigh... #print $cgi->start_html(-head=>meta({-http_equiv=>'Refresh'})); print qq( <HEAD> <meta http-equiv="Expires" content="0"> <meta http-equiv="Pragma" content="no-cache"> <meta HTTP-EQUIV="Refresh" CONTENT="60; URL=$url"> <TITLE>SiteScope Classic $url Detail</TITLE> <style type="text/css"> a.good {color: green; } a.warning {color: green; } a.error {color: red; } td {font-family: Arial, Helvetica, sans-serif; $FS} p.ss {font-family: Arial, Helvetica, sans-serif;} </style> <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon" /> <script type=text/javascript> function changeme(elemid,longvalue) { document.getElementById(elemid).innerText=longvalue; } function restoreme(elemid,truncvalue) { document.getElementById(elemid).innerText=truncvalue; } </script> </HEAD><body> ); #print $cgi->h1("This is the heading"); # parse path # top lvl name:2nd lvl name:3rd lvl name $altpi = $cgi->path_info(); print $cgi->p("pi is $pi") if $DEBUG; #print $cgi->p("altpi is $altpi"); # relative url $rurl = $cgi->url(-relative=>1); if ($pi eq "") { # the top # top id is id3 print qq(<p class="ss">); $myid = "id3"; foreach $kid ($myobjs{"$myid"}->get_children()) { my $kidname = $myobjs{"$kid"}->name(); # kids can be subgroups or standalone monitors my $health = recurse("/$kidname"); print "$health{$health} <a href=\"$rurl/$kidname\">$kidname</a><br>\n"; $prodtest = $kid if $kidname eq "Production"; } print "</p>\n"; } else { $extname = $pi; print "pi,name,extname,script: $pi,$name,$extname,$script\n" if $DEBUG; # print where we are $uriname = $pi; $uriname =~ s#^/##; #print $cgi->p("name is $name"); #print $cgi->p("uriname is $uriname"); $uricompositepart = "/"; @uriparts = split('/',$uriname); $lastpart = pop @uriparts; print qq(<p class="ss"><a href="$script"><b>Sitescope</b></a><br>); print qq(<b>Monitors in: ); foreach $uripart (@uriparts) { my $healthp = recurse("$uricompositepart$uripart"); # build valid link ##$link = qq(<a class="good" href="$script$uricompositepart$uripart">$uripart</a>: ); $link = qq(<a class="$healthp" href="$script$uricompositepart$uripart">$uripart</a>: ); $uricompositepart .= "$uripart/"; print $link; } my $healthp = recurse("$uricompositepart$lastpart"); $color = $healthp eq "error" ? "red" : "green"; print qq(<font color="$color">$lastpart</font></b></p>\n); print qq(<table border="1" cellspacing="0">); #print qq(<table>); %hashtrs = (); foreach $kid ($myobjs{"$extname"}->get_children()) { print "kid id: " . $myobjs{"$kid"}->id() . "\n" if $DEBUG; next unless $myobjs{"$kid"}->hasnamedkids(); foreach $gdkid ($myobjs{"$kid"}->get_children()) { print "gdkid id: " . $myobjs{"$gdkid"}->id() . "\n" if $DEBUG; $gdkidname = $myobjs{"$gdkid"}->name(); $gdkidextname = $myobjs{"$gdkid"}->extname(); my $health = recurse("$gdkidextname"); my $type = $myobjs{"$gdkid"}->type(); # dig deeper to learn health of the grankid's grandkids $objct = $healthct{good} = $healthct{error} = $healthct{warning} = 0; foreach $ggkid ($myobjs{"$gdkidextname"}->get_children()) { print "ggkid id: " . $myobjs{"$ggkid"}->id() . "\n" if $DEBUG; next unless $myobjs{"$ggkid"}->hasnamedkids(); foreach $gggdkid ($myobjs{"$ggkid"}->get_children()) { print "gggdkid id: " . $myobjs{"$gggdkid"}->id() . "\n" if $DEBUG; $gggdkidname = $myobjs{"$gggdkid"}->name(); $gggdkidextname = $myobjs{"$gggdkid"}->extname(); my $health = recurse("$gggdkidextname"); $objct++; $healthct{$health}++; } } $elemct++; $elemid = "elemid" . $elemct; # groups should have distinctive cell background color to set them apart from monitors if ($type eq "group") { $bgcolor = "#F0F0F0"; $celllink = "$lastpart/$gdkidname"; $truncvalue = qq(<font color="red">$healthct{error}</font>/$objct); $tdval = $truncvalue; } else { $bgcolor = "#FFFFFF"; $celllink = "$rprt?$gdkidname"; # truncate monitor value to save display space $longvalue = $monitorv{"$gdkidname"}; (my $truncvalue) = $monitorv{"$gdkidname"} =~ /^(.{7,9})/; $truncvalue = $truncvalue? $truncvalue : " "; $tdval = qq(<span id="$elemid" onmouseover="changeme('$elemid','$longvalue')" onmouseout="restorem e('$elemid','$truncvalue')">$truncvalue</span>); } $hashtrs{"$gdkidname"} = qq(<tr><td bgcolor="#000000">$health{$health} </td><td>$tdval</td><td bgcol or="$bgcolor"><a href="$celllink">$gdkidname</a></td></tr>\n); # for health we're going to have to recurse } } # print out in alphabetical order foreach $key (sort(keys %hashtrs)) { print $hashtrs{"$key"}; } print "</table>"; } print $cgi->end_html(); ####################################### sub recurse { # to get the union of health of all ancestors my $moniext = shift; my ($moni) = $moniext =~ m#/([^/]+)$#; # don't bother recursing and all that unless we have to... return $myobjs{"$moniext"}->health() if defined $myobjs{"$moniext"}->health(); print "moni,moniext: $moni, $moniext\n" if $DEBUG; my ($kid,$gdkidextname,$health,$cumhealth); $cumhealth = $health = $monitors{"$moni"} ? $monitors{"$moni"} : "good"; foreach $kid ($myobjs{"$moniext"}->get_children()) { if ($myobjs{"$kid"}->hasnamedkids()) { foreach $gdkid ($myobjs{"$kid"}->get_children()) { $gdkidextname = $myobjs{"$gdkid"}->extname(); # for health we're going to have to recurse $health = recurse("$gdkidextname"); if ($health eq "error" || $cumhealth eq "error") { $cumhealth = "error"; } elsif ($health eq "warning" || $cumhealth eq "warning") { $cumhealth = "warning"; } } } else { # this kid is end of line $health = $monitors{"$kid"} ? $monitors{"$kid"} : "good"; if ($health eq "error" || $cumhealth eq "error") { $cumhealth = "error"; } elsif ($health eq "warning" || $cumhealth eq "warning") { $cumhealth = "warning"; } } } $myobjs{"$moniext"}->health("$cumhealth"); return $cumhealth; } # end sub recurse

I call it simply “ss” to minimize the typing required. You see it uses a package called Id.pm which I wrote to encapsulate the class and methods. Here is Id.pm:

package Id;
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# class for storing data about an id
# URL (not currently protected): http://localhost:8080/SiteScope/services/APIConfigurationImpl?method=getC
onfigurationSnapshot
# class for storing data about a group
use warnings;
use strict;
use Carp;
#group methods
# constructor
# get_members
# get_name
# get_id
# addmember
#
# member methods
# constructor
# get_id
# get_name
# get_type
# get_gp
# set_gp
 
sub new {
  my $class = shift;
  my $self = {@_};
  bless($self, "Id");
  return $self;
}
# get-set methods, p. 355
sub parentid { $_[0]->{parentid}=$_[1] if defined $_[1]; $_[0]->{parentid} }
sub isanamedid { $_[0]->{isanamedid}=$_[1] if defined $_[1]; $_[0]->{isanamedid} }
sub id { $_[0]->{id}=$_[1] if defined $_[1]; $_[0]->{id} }
sub name { $_[0]->{name}=$_[1] if defined $_[1]; $_[0]->{name} }
sub extname { $_[0]->{extname}=$_[1] if defined $_[1]; $_[0]->{extname} }
sub type { $_[0]->{type}=$_[1] if defined $_[1]; $_[0]->{type} }
sub health { $_[0]->{health}=$_[1] if defined $_[1]; $_[0]->{health} }
sub hasnamedkids { $_[0]->{hasnamedkids}=$_[1] if defined $_[1]; $_[0]->{hasnamedkids} }
 
# get children - use anonymous array, book p. 221-222
sub get_children {
# return empty array if arrary hasn't been defined...
  defined @{$_[0]->{children}} ? @{$_[0]->{children}} : ();
}
# adding children
sub addchild {
  $_[0]->{children} = [] unless defined  $_[0]->{children};
  push @{$_[0]->{children}},$_[1];
}
 
1;

ss also assumes the existence of just a few of the images from SiteScope classic – the green circle for good, red diamond for error and yellow warning, etc.. I borrowed them SiteScope classic.

Here is the code for the log scraper:

#!/usr/bin/perl
# analyze SiteScope log file
# Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0
# 8/2010
$DEBUG = 0;
$logdir = "/opt/SiteScope/logs";
$monitorstats = "/tmp/monitorstats.txt";
$monitorstatshis = "/tmp/monitorstats-his.txt";
$date = `date +%Y_%m_%d`;
chomp($date);
$file = "$logdir/SiteScope$date.log";
open(LOG,"$file") || die "Cannot open SiteScope log file: $file!!\n";
# example lines:
# 16:51:07 08/02/2010     good    LDAPServers     LDAP SSL test : ldapsrv.drj.com exit: 0, 0.502 sec    1:
3481  0       502
#16:51:22 08/02/2010     good    Network DNS: (AMEAST) ns2  0.033 sec   2:3459      200     33      ok
#16:51:49 08/02/2010     good    Proxy   proxy.pac script on iwww    0.055 sec   2:12467 200     55   ok
     4288    1280782309      0    0  55      0       0      200  0
#16:52:04 08/02/2010     good    Proxy   Disk Space: earth /logs   66% full, 13862MB free, 41921MB total
 3:3598      66      139862
#16:52:09 08/02/2010     good    DrjExtranet  URL: wwwsecure.drj.com     0.364 sec    1:3604      200
364  ok 26125   1280782328     0    0   358     4       2       200  0
while(<LOG>) {
  ($time,$date,$status,$group,$monitor,$value) = /(\S+)\s(\S+)\t(\S+)\t(\S+)\t([^\t]+)\t([^\t]+)/;
  print '$time,$date,$status,$group,$monitor,$value' . "$time,$date,$status,$group,$monitor,$value\n" if $DEBUG;
  next if $group =~ /__health__/; # don't care about these lines
  $mons{"$monitor"} = 1;
  push @{$mont{"$monitor"}} , $time;
  push @{$mond{"$monitor"}} , $date;
  push @{$monh{"$monitor"}} , $status;
  push @{$monv{"$monitor"}} , $value;
}
# open output at last moment to minimize chances of reading while locked for writing
open(MONITORSTATS,">$monitorstats") || die "Cannot open monitor stats file $monitorstats!!\n";
open(MONITORSTATSHIS,">$monitorstatshis") || die "Cannot open monitor stats file $monitorstatshis!!\n";
# write it all out - will always print the latest values
foreach $monitor (keys %mons) {
# dereference our anonymous arrays
  @times = @{$mont{"$monitor"}};
  @dates = @{$mond{"$monitor"}};
  @status = @{$monh{"$monitor"}};
  @value = @{$monv{"$monitor"}};
# last element is the latest measured status and value
  print MONITORSTATS "$monitor\t$status[-1]\t$value[-1]\n";
  print MONITORSTATSHIS "$monitor\n";
  #for ($i=-11;$i<0;$i++) {
# put latest measure on top
  for ($i=-1;$i>-13;$i--) {
    $time = defined $times[$i] ? $times[$i] : "NA";
    $date = defined $dates[$i] ? $dates[$i] : "NA";
    $stat = defined $status[$i] ? $status[$i] : "NA";
    $val = defined $value[$i] ? $value[$i] : "NA";
    print MONITORSTATSHIS "\t$time\t$date\t$stat\t$val\n";
  }
}

#!/usr/bin/perl # analyze SiteScope log file # Copyright work under the Artistic License, http://www.opensource.org/licenses/Artistic-2.0 # 8/2010 $DEBUG = 0; $logdir = "/opt/SiteScope/logs"; $monitorstats = "/tmp/monitorstats.txt"; $monitorstatshis = "/tmp/monitorstats-his.txt"; $date = `date +%Y_%m_%d`; chomp($date); $file = "$logdir/SiteScope$date.log"; open(LOG,"$file") || die "Cannot open SiteScope log file: $file!!\n"; # example lines: # 16:51:07 08/02/2010 good LDAPServers LDAP SSL test : ldapsrv.drj.com exit: 0, 0.502 sec 1: 3481 0 502 #16:51:22 08/02/2010 good Network DNS: (AMEAST) ns2 0.033 sec 2:3459 200 33 ok #16:51:49 08/02/2010 good Proxy proxy.pac script on iwww 0.055 sec 2:12467 200 55 ok 4288 1280782309 0 0 55 0 0 200 0 #16:52:04 08/02/2010 good Proxy Disk Space: earth /logs 66% full, 13862MB free, 41921MB total 3:3598 66 139862 #16:52:09 08/02/2010 good DrjExtranet URL: wwwsecure.drj.com 0.364 sec 1:3604 200 364 ok 26125 1280782328 0 0 358 4 2 200 0 while(<LOG>) { ($time,$date,$status,$group,$monitor,$value) = /(\S+)\s(\S+)\t(\S+)\t(\S+)\t([^\t]+)\t([^\t]+)/; print '$time,$date,$status,$group,$monitor,$value' . "$time,$date,$status,$group,$monitor,$value\n" if $DEBUG; next if $group =~ /__health__/; # don't care about these lines $mons{"$monitor"} = 1; push @{$mont{"$monitor"}} , $time; push @{$mond{"$monitor"}} , $date; push @{$monh{"$monitor"}} , $status; push @{$monv{"$monitor"}} , $value; } # open output at last moment to minimize chances of reading while locked for writing open(MONITORSTATS,">$monitorstats") || die "Cannot open monitor stats file $monitorstats!!\n"; open(MONITORSTATSHIS,">$monitorstatshis") || die "Cannot open monitor stats file $monitorstatshis!!\n"; # write it all out - will always print the latest values foreach $monitor (keys %mons) { # dereference our anonymous arrays @times = @{$mont{"$monitor"}}; @dates = @{$mond{"$monitor"}}; @status = @{$monh{"$monitor"}}; @value = @{$monv{"$monitor"}}; # last element is the latest measured status and value print MONITORSTATS "$monitor\t$status[-1]\t$value[-1]\n"; print MONITORSTATSHIS "$monitor\n"; #for ($i=-11;$i<0;$i++) { # put latest measure on top for ($i=-1;$i>-13;$i--) { $time = defined $times[$i] ? $times[$i] : "NA"; $date = defined $dates[$i] ? $dates[$i] : "NA"; $stat = defined $status[$i] ? $status[$i] : "NA"; $val = defined $value[$i] ? $value[$i] : "NA"; print MONITORSTATSHIS "\t$time\t$date\t$stat\t$val\n"; } }

As I said it gets called every minute by cron.

That’s it! I enter the url sitescope.drj.com/SS/ss to access the main program which gets executed because I made /SS a CGI-BIN directory.

This gives you a read-only, Java-free view into your SiteScope status and hierarchy which beckons back to the good old days of Freshwater SiteScope.

Know your limits
What it does not do, unfortunately, is allow you to run a monitor – that seems like the next most simple thing which I should have been able to do but couldn’t figure out – much less define new monitors (never going to happen) or alerts.

I use this successfully against my HP SiteScope instance of roughly 400 monitors which itself is on a VM and there is no apparent strain. At some point this simple-minded script would no longer scale to suit the task at hand, but it might be good for up to a few thousand monitors.

And now a word about open source alternatives
Since I was so enamored with SiteScope Classic there seemed to be no compelling reason to shell out the dough for HP SiteScope with its unwanted interface, so I briefly looked around at free alternatives. Free sounds good, right? Not so much in practice. Out there in Cyberspace there is an enthusiast for a product called Zabbix. I just want to go on the record that Zabbix is the most confused piece of junk I have run across. You are getting less than what you paid for ($0) because you will be wasting a lot of time with it, and in the end it isn’t all that capable. Nagios also had its limits – I can’t remember the exact reason I didn’t go down that route, but there were definite reasons.

HP SiteScope is no panacea. “HP” and “stifling bureaucracy” need to be mentioned in the same sentence. Every time we renew support it is the most confusing mess of line items. Every time there’s a new cast of characters over at HP who nothing about the account’s history. You practically have to beg them to accept your money for a low-budget item like SiteScope because they really don’t pursue it in any way. Then their SAID and contract numbers stuff is confusing if you only see it once every few years.

Conclusion
A conversion program does exist for turning the finicky HP SiteScope Java-encumbered view into pure SiteScope Classic because I wrote it! But it’s a limited read-only view. Still, it’s helpful in a pinch and can even be viewed on the Blackberry’s browser.

Another problem is that HP has threatened to completely change the API so this tool, which is designed for HP SiteScope v 10.12, will probably completely break for newer versions. Oh, well.

References
This post shows some silly mistakes to avoid when doing a minor upgrade in version 11.

Tags HP SiteScope, monitoring, nagios, SiteScope Classic, zabbix

Admin Network Technologies

The IT Detective Agency: virus updates are failing

Post author By john
Post date January 12, 2012
No Comments on The IT Detective Agency: virus updates are failing

Intro
This case hardly qualifies as worthy subject matter for the IT Detective Agency – it’s pretty run-of-the-mill stuff. But I wanted to document it for completeness and show how a problem in one thing can turn out to have an unexpected cause (At least to me. In hindsight it’s dead obvious what the issue was likely to have been).

The Situation
We have lots of servers at drjohns. So when one of our admins, Shake, said that one of them, nfuz01, can’t reach the Etrust serverto get its virus updates I had no recollection of what that server is or does. Shake asked if the firewall had changed recently. That’s sort of a tricky question because there are always minor changes being done. Most have absolutely no effect because they are additional rules providing new permissions. So I bravely answered No, it hadn’t. And I wondered what he meant in using the word “reach” anyways.

So I walk up to Shake’s desk to get a better idea. He said not only are updates not virus signature updates not occurring, but neither server can PING the other, neither by name nor by IP address. Now we’re getting somewhere. I still haven’t registered where nfuz01 is, but I know the firewall as I’ve set it up permits ICMP traffic to transit. I suggested that maybe nfuz01 had some missing or messed-up routes. Then I went back to my desk to think some more. That’s what gets me motivated – when I’ve publicly speculated about the root cause of something. It’s not so much that I may be proven wrong, but if I am wrong, I want to be the first to find out and issue a correction.

So I tried a PING from my desktop:

C:\>ping 10.91.12.14
 
Pinging 10.91.12.14 with 32 bytes of data:
Reply from 171.18.252.10: TTL expired in transit.
Reply from 171.18.252.10: TTL expired in transit.

I look up where nfuz01 is. It is in a secondary data center. I ping it from a server in that same data center, but one a different segment – it works fine! I ping it from a Linux server in my main data center – totally different results:

> ping 10.91.12.14
PING 10.91.12.14 (10.91.12.14) 56(84) bytes of data.
From 171.18.252.10 icmp_seq=1 Time to live exceeded
From 171.18.252.10 icmp_seq=2 Time to live exceeded
 
--- 10.91.12.14 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1000ms
 
> traceroute -n 10.91.12.14
traceroute to 10.91.12.14 (10.91.12.14), 30 hops max, 40 byte packets
 1  10.136.188.2  1.100 ms  1.523 ms  2.010 ms
 2  10.1.4.2  0.934 ms  0.941 ms  0.773 ms
 3  10.1.4.141  0.869 ms  0.926 ms  0.913 ms
 4  171.18.252.10  1.076 ms  1.096 ms  1.130 ms
 5  10.1.4.129  1.043 ms  1.029 ms  1.018 ms
 6  10.1.4.141  0.993 ms  0.611 ms  0.918 ms
 7  171.18.252.10  0.932 ms  0.916 ms  1.002 ms
 8  10.1.4.129  0.987 ms  1.089 ms  1.121 ms
 9  10.1.4.141  1.152 ms  1.246 ms  1.229 ms
10  171.18.252.10  2.040 ms  2.747 ms  2.735 ms
11  10.1.4.129  1.332 ms  1.418 ms  1.467 ms
12  10.1.4.141  1.477 ms  1.754 ms  1.685 ms
13  171.18.252.10  1.993 ms  1.978 ms  2.013 ms
14  10.1.4.129  1.930 ms  1.960 ms  2.039 ms
15  10.1.4.141  2.065 ms  2.156 ms  2.140 ms
16  171.18.252.10  2.116 ms  5.454 ms  5.453 ms
17  10.1.4.129  4.466 ms  4.385 ms  4.296 ms
18  10.1.4.141  4.266 ms  4.267 ms  4.260 ms
19  171.18.252.10  4.232 ms  4.216 ms  4.216 ms
20  10.1.4.129  4.182 ms  4.063 ms  4.009 ms
21  10.1.4.141  3.994 ms  3.987 ms  2.398 ms
22  171.18.252.10  2.400 ms  2.484 ms  2.690 ms
23  10.1.4.129  2.346 ms  2.449 ms  2.544 ms
24  10.1.4.141  2.534 ms  2.607 ms  2.610 ms
25  171.18.252.10  2.602 ms  2.742 ms  2.736 ms
26  10.1.4.129  2.776 ms  2.856 ms  2.848 ms
27  10.1.4.141  2.648 ms  3.185 ms  3.291 ms
28  171.18.252.10  3.236 ms  3.223 ms  3.235 ms
29  10.1.4.129  3.219 ms  3.277 ms  3.377 ms
30  10.1.4.141  3.363 ms  3.381 ms  3.449 ms

Cool, right? We’ve caught a network loop in the act. Now I know it isn’t the firewall, it isn’t the routes on nfuz01 but it is something with networking. So I sent that off to them….

In less than an hour I got the explanation as well as the fix:

All should be reachable again. There’s a loop I can’t clear amongst some [telecom-owned] routers in the main data center. I’ve superseded it with two /27s until they clear it.

And it pings fine now:

> ping 10.91.12.14
PING 10.91.12.14 (10.91.12.14) 56(84) bytes of data.
64 bytes from 10.91.12.14: icmp_seq=1 ttl=125 time=46.2 ms
64 bytes from 10.91.12.14: icmp_seq=2 ttl=125 time=40.1 ms
64 bytes from 10.91.12.14: icmp_seq=3 ttl=125 time=23.1 ms
 
--- 10.91.12.14 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 23.114/36.502/46.275/9.796 ms

And Shake says the updates came in.

Case closed!

Conclusion
Why wasn’t the problem more obvious to us from the very beginning? Well, if the admin who said nfuz01 couldn’t reach Etrust had tried to log in nfuz01 through the normal Remote Desktop mechanism – and of course failed – then we might have drilled down into a networking cause more quickly. But nfuz01 is a VM and he must have been logged on via VMWare Virtual Center and so he hadn’t noticed that basically the server couldn’t reach anywhere in our main data center. It is also an obscure server (remember that I had no recall about it?) so no one really noticed that it was effectivley out-of-business.