Categories
Perl

Dear Perl programmer, Here is a lifeline

Pythonizer

If you fit a certain profile: been in IT for > 20 years, managed to crate a few utility scripts in Perl, ut never wrapped your head around the newer and flashier Python, this blog post is for you.

Conversely, if you have grown up with Python and find yourself stuck maintaining some obscure legacy Perl code, this post is also for you.

A friend of mine has written a conceptually cool program that converts Perl programs into Python which he calls a Pythonizer.

I’m sure it won’t do well with special Perl packages and such. In fact it is an alpha release I think. But perhaps for those scripts which use the basic built-in Perl functions and operations, it will do the job.

When I get a chance to try it myself I will give some more feedback here. I have a perfect example in mind, i.e., a self-contained little Perl script which ought to work if anything will.

Conclusion

Old Perl programs have been given new life by Pythonizer, which can convert Perl programs into Python.

References and related

https://github.com/softpano/pythonizer

Perl is not a dead language after all. Work continues on Perl 7, which will be known as v5.32. Should be ready next year: https://www.perl.com/article/announcing-perl-7/?ref=alian.info

Categories
TCP/IP Uncategorized Web Site Technologies

The IT Detective Agency: web site not accessible

Intro
In this spellbinding segment we examine what happened when a user found an inaccessible web site.


Some details
The user in a corporate environment reports not being able to access https://login.smartnotice.net/. She has the latest version of Windows 10.


On the trail
I sense something is wrong with SSL because of the type of errors reported by the browser. Something to the effect that it can’t make a secure connection.


But I decided to doggedly pursue it because I have a decent background in understanding SSL-related problems, and I was wondering if this was the first of what might be a systemic problem. I’m always interested to find little problem and resolve them in a way that addresses bigger issues.


So the first thing I try to lean more about the SSL versions and ciphers supported is to use my Go-To site, ssllabs.com, Test your Server: https://www.ssllabs.com/ssltest/. Well, this test failed miserably, and in a way I’ve never seen before. SSLlabs just quickly gave up without any analysis! So we pushed ahead, undaunted.


So I hit the site with curl from my CentOS 8 server (Upgrading WordPress brings a thicket of problems). Curl works fine. But I see it prefers to use TLS 1.3. So I finally buckle down and learn how to properly cnotrol the SSL/TLS version in curl. The output from curl -help is misleading, shall we say?


You think using curl –tlsv1.2 is going to use TLS v 1.2? Think again. Maybe it will, or maybe it won’t. In fact it tells curl to use TLS version 1.2 or higher. I totally missed understanding that for all these years.
What I’m looking for is to determine if the web site is willing to use TLS v 1.2 in addition to TLS v 1.3.


The ticket is … –tls-max 1.2 . This sets the maximum TLS version curl will use to access the URL.


So we have
curl -v –tls-max 1.3 https://login.smartnotice.net/

<!-- /* Font Definitions */ @font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2 4; mso-font-charset:1; mso-generic-font-family:roman; mso-font-format:other; mso-font-pitch:variable; mso-font-signature:0 0 0 0 0 0;} @font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4; mso-font-charset:0; mso-generic-font-family:swiss; mso-font-pitch:variable; mso-font-signature:-469750017 -1073732485 9 0 511 0;} /* Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {mso-style-unhide:no; mso-style-qformat:yes; mso-style-parent:""; margin-top:0in; margin-right:0in; margin-bottom:8.0pt; margin-left:0in; line-height:107%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri",sans-serif; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} .MsoChpDefault {mso-style-type:export-only; mso-default-props:yes; font-family:"Calibri",sans-serif; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} .MsoPapDefault {mso-style-type:export-only; margin-bottom:8.0pt; line-height:107%;} @page WordSection1 {size:8.5in 11.0in; margin:1.0in 1.0in 1.0in 1.0in; mso-header-margin:.5in; mso-footer-margin:.5in; mso-paper-source:0;} div.WordSection1 {page:WordSection1;} -->
*   Trying 104.18.27.134...
* TCP_NODELAY set
* Connected to login.smartnotice.net (104.18.27.134) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
...
html head

But

curl -v –tls-max 1.2 https://login.smartnotice.net/

<!-- /* Font Definitions */ @font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2 4; mso-font-charset:1; mso-generic-font-family:roman; mso-font-format:other; mso-font-pitch:variable; mso-font-signature:0 0 0 0 0 0;} @font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4; mso-font-charset:0; mso-generic-font-family:swiss; mso-font-pitch:variable; mso-font-signature:-469750017 -1073732485 9 0 511 0;} /* Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {mso-style-unhide:no; mso-style-qformat:yes; mso-style-parent:""; margin-top:0in; margin-right:0in; margin-bottom:8.0pt; margin-left:0in; line-height:107%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri",sans-serif; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} .MsoChpDefault {mso-style-type:export-only; mso-default-props:yes; font-family:"Calibri",sans-serif; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} .MsoPapDefault {mso-style-type:export-only; margin-bottom:8.0pt; line-height:107%;} @page WordSection1 {size:8.5in 11.0in; margin:1.0in 1.0in 1.0in 1.0in; mso-header-margin:.5in; mso-footer-margin:.5in; mso-paper-source:0;} div.WordSection1 {page:WordSection1;} -->
*   Trying 104.18.27.134...
* TCP_NODELAY set
* Connected to login.smartnotice.net (104.18.27.134) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS alert, protocol version (582):
* error:1409442E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version
* Closing connection 0
curl: (35) error:1409442E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version

So now we know, this web site requires the latest and greatest TLS v 1.3.
Even TLS 1.2 won’t do.

Well, this old corporate environment still offered users a choice of old
browsers, including IE 11 and the old Edge browser. These two browsers simply do not support TLS 1.3. But I fuond even Firefox wasn’t working, although the Chrome browser was.

How to explain all that? How to fix it?

It comes down to a good knowledge of the particular environment. As I think I stated, the this corporate environment uses proxies, which in turn, most
likely, tried to SSL intercept the traffic. The proxies are old so they in turn
don’t actually support SSL interception of TLS v 1.3! They had separate
problems with Chrome browser so they weren’t intercepting its traffic. This explains why FF was broken yet Chrome worked.

So the fix, such as it was, was to disable SSL interception for this request
URL so that Firefox would work, and tell the user to use either FF or Chrome.

Just being thorough, when i tested from home with Edge Chromium – the newer Edge browser – it worked and SSLlabs showed (correctly) that it supports TLS 1.3. Edge in the corporate environment is the older, non-Chromium one. It seems to max out at TLS 1.2. No good.

For good measure I explained the situation to the desktop support people.

Case: closed.

Appendix

How did I decide the proxies didn’t support TLS 1,3? What if this site had some other issue after all? I looked on the web for another web site which only supports TLS 1.3. I thought hopefully badssl.com would have one. But they don’t! Undaunted yet again, I determined to change my own web site, drjohnstechtalk.com, into one that only supports TLS 1.3! This is easy to do with apache web server. You basically need a line that looks like this:

SSLProtocol all -SSLv3 -TLSv1 -TLSv1.1 -TLSv1.2

Categories
Consumer Interest Inquiring Minds

Inquiring Minds Want to Know: Do you save energy by dimming LED bulbs

Intro

I’ve got my Philips Hue light bulb working with my Amazon Alexa. It’s an older 860 lumens bulb. I also have a voltmeter. So I went through different intensities, recording the power draw for each. The results are in the table below.

Level (%)Power (Watts)
1009.0
907.0
805.7
704.3
603.4
503.1
401.7
301.2
201.0
100.9
5*0.8
0 (off)0.3**
Power draw of LED light bulb at various brightness set by Alexa voice command.

So above 60% or so the relationship looks exponential. 50% seems like an outlier.

*By observation, the lowest lighting you can get from your bulbs is 5%.

**Unexpected finding – smartbulbs are vampire devices

I didn’t originally measure the power draw when “off.” You don’t think to do that. Then I gave it some more thought and had an aha moment – the bulb can only be smart if it is always listening for commands. And that, in turn, must create a power draw when off. A quick measurement and sure enough, confirmed. Though very small – 0.3 watts – it is not nothing. A typical single-family home has over a hundred bulbs. If they were all smartbulbs, it would add up… I believe small draw devices – typically those power adapters for cell phones – are called vampire devices.

Conclusion

So we have a very non-linear relationship here. I probably should plot the current draw as well. But, you definitely can save energy by lowering the intensity – quite a lot. But LED bulbs are drawing very little power anyway, so unless you have bunch of them, why bother?

My second conclusion – a finding I didn’t expect – is that even when off these bulbs are consuming a bit of power. It’s not a lot, 0.3 watts, but it’s something to keep in mind when planning your smartbulb deployment. So, large arrays of smartbulbs? Probably not such a smart idea.

Categories
3D Printing Admin

OpenSCAD export to STL does nothing

Quick Tip
If you are using OpenSCAD for your 3D model construction, and after creating a satisfactory model do an export to STL, you may observe that nothing at all happens!

I was stuck on this problem for awhile. Yes, the solution is obvious for a regular users, but I only use it every few months. If you open the console you will see the problem immediately:

ERROR: Nothing to export! Try rendering first (press F6).

But in my case I had closed the console, forgot there was such a thing, and of course it remembers your settings.

So you have to render your object (F6) before you can export as an STL file.

References and related
I don’t know why this endplate design blog post which I wrote never caught on. I think the pictures are cool.

OpenSCAD is a 3D modelling application that uses CSG – constructive Solid Geometry. It’s very math and basic geometric shapes focussed – perfect for me. https://www.openscad.org/

Categories
Admin Linux

vsftd Virtual Users stopped working after patching: the solution

Intro
vsftpd is a useful daemon which I use to run an ftps service (ftp which uses TLS encryption). Since I am not part of the group that administers the server, it makes sense for me to maintain my own userlist rather than rely on the system password database. vsftpd has a convenient feature which allows this known as virtual users.

More details
In /etc/pam.d/vsftpd.virtual I have:

auth required pam_userdb.so db=/etc/vsftpd/vsftpd-virtual-user
account required pam_userdb.so db=/etc/vsftpd/vsftpd-virtual-user
session required pam_loginuid.so

In the file /etc/vsftpd-virtual-user.db I have my Berkeley database of users and passwords. See references on how to set this up.

The point is that I had this all working last year – 2019 – on my SLES 12SP4 server.

Then it all broke
Then in early May, 2020, all the FTPs stopped working. The status of the vsftpd service hinted that the file /lib64/security/pam_userdb.so could not be loaded. Sure enough, it was missing! I checked some of my other SLES12SP4 servers, some of which are on a different patch schedule. It was missing on some, and present on one. So I “borrowed” pam_userdb.so from the one server which still had it and put it onto my server in /lib64/security. All good. Service restored. But clearly that is a hack.

What’s going on
So I asked a Linux expert what’s going on and got a good explanation.

pam_userdb has been moved to a separate package, named pam-extra
 
1) http://lists.suse.com/pipermail/sle-security-updates/2020-April/006661.html
2) https://www.suse.com/support/update/announcement/2020/suse-ru-20200822-1/
 
Advisory ID: SUSE-RU-2020:917-1
Released: Fri Apr 3 15:02:25 2020
Summary: Recommended update for pam
Type: recommended
Severity: moderate
References: 1166510
This update for pam fixes the following issues:
 
- Moved pam_userdb into a separate package pam-extra. (bsc#1166510)
 
Installing the package pam-extra should resolve your issue.

I installed the pam-extra package using zypper, and, yes, it creates a /lib64/security/pam_userdb.so file!

And vsftpd works once more using supported packages.

Conclusion
Virtual users with vsftpd requires pam_userdb.so. However, PAM wished to decouple itself from dependency on external databases, etc, so they bundled this kind of thing into a separate package, pam-extra, more-or-less in the middle of a patch cycle. So if you had the problem I had, the solution may be as simple as installing the pam-extra package on your system. Although I experienced this on SLES, I believe it has or will happen on other Linux flavors as well.

This problem is poorly documented on the Internet.


References and related

https://www.cyberciti.biz/tips/centos-redhat-vsftpd-ftp-with-virtual-users.html

Categories
Admin Linux Network Technologies

Configure rsyslog to send syslog to SIEM server running TLS

Intro
You have to dig a little to find out about this somewhat obscure topic. You want to send syslog output, e.g., from the named daemon, to a syslog server with beefed up security, such that it requires the use of TLS so traffic is encrypted. This is how I did that.

The details
This is what worked for me:

...
# DrJ fixes - log local0 to DrJ's dmz syslog server - DrJ 5/6/20
# use local0 for named's query log, but also log locally
# see https://www.linuxquestions.org/questions/linux-server-73/bind-queries-log-to-remote-syslog-server-4175
669371/
# @@ means use TCP
$DefaultNetstreamDriver gtls
$DefaultNetstreamDriverCAFile /etc/ssl/certs/GlobalSign_Root_CA_-_R3.pem
$ActionSendStreamDriver gtls
$ActionSendStreamDriverMode 1
$ActionSendStreamDriverAuthMode anon
 
local0.*                                @@(o)14.17.85.10:6514
#local0.*                               /var/lib/named/query.log
local1.*                                -/var/log/localmessages
#local0.*;local1.*                      -/var/log/localmessages
local2.*;local3.*                       -/var/log/localmessages
local4.*;local5.*                       -/var/log/localmessages
local6.*;local7.*                       -/var/log/localmessages

The above is the important part of my /etc/rsyslog.conf file. The SIEM server is running at IP address 14.17.85.10 on TCP port 6514. It is using a certificate issued by Globalsign. An openssl call confirms this (see references).

Other gothcas
I am running on a SLES 15 server. Although it had rsyslog installed, it did not support tls initially. I was getting a dlopen error. So I figured out I needed to install this module:

rsyslog-module-gtls

References and related
How to find the server’s certificate using openssl. Very briefly, use the openssl s_client submenu.

The rsyslog site itself probably has the complete documentation. Though I haven’t looked at it thoroughly, it seems very complete.

Want to know what syslog is? Howtoneywork has this very good writeup: https://www.howtonetwork.com/technical/security-technical/syslog-server/

Categories
Admin JavaScript Network Technologies

Practical Zabbix examples

Intro
I share some Zabbix items I’ve had to create which I find useful.

Low-level discovery to discover IPSEC tunnels on an F5 BigIP

IPSec tunnels are weird insofar as there is one IKE SA but potentially lots of SAs – two for each traffic selector. So if your traffic selector is called proxy-01, some OIDs you’ll see in your SNMP walk will be like …proxy-01.58769, …proxy-01.58770. So to review, do an snmpwalk on the F5 itself. That command is something like

snmpwalk -v3 -l authPriv -u proxyUser -a SHA -A shaAUTHpwd -x AES -X AESpwd -c public 127.0.0.1 SNMPv2-SMI::enterprises >/tmp/snmpwalk-ent

Now…how to translate this LLD? In my case I have a template since there are several F5s which need this. The template already has discovery rules for Pool discovery, Virtual server discovery, etc. So first thing we do is add a Tunnel discovery rule.

Tunnel Discovery Rule

The SNMP OID is clipped at the end. In full it is:

discovery[{#SNMPVALUE},F5-BIGIP-SYSTEM-MIB::sysIpsecSpdStatTrafficSelectorName]

Initially I tried something else, but that did not go so well.

Now we want to know the tunnel status (up or down) and the amount of traffic over the tunnel. We create two item prototypes to get those.

Tunnel Status Item prototype

So, yes, we’re doing some fancy regex to simplify the otherwise ungainly name which would be generated, stripping out the useless stuff with a regsub function, which, by the way, is poorly documented. So that’s how we’re going to discover the statuses of the tunnels. In text, the name is:

Tunnel {{#SNMPINDEX}.regsub(“\”\/Common\/([^\”]+)\”(.+)”,\1\2)} status

and the OID is

F5-BIGIP-SYSTEM-MIB::sysIpsecSpdStatTunnelState.{#SNMPINDEX}

And for the traffic, we do this:

Tunnel Traffic Item prototype

I learned how to choose the OID, which is the most critical part, I guess, from a combination of parsing the output of the snmpwalk plus imitation of those other LLD item prortypes, which were writtne by someone more competent than I.

Now the SNMP value for traffic is bytes, but you see I set units of bps? I can do that because of the preprocessing steps which are

Bytes to traffic rate preprocessing steps

Final tip

For these discovery items what you want to do is to disable Create Enabled and disable Discover. I just run it on the F5s which actually have IPSEC tunnels. Execute now actually works and generates items pretty quickly.

Using the api with a token and security by obscurity

I am taking the approach of pulling the token out of a config file where it has been stored, base85 encoded, because, who uses base85, anyway? I call the following script encode.py:

import sys
from base64 import b85encode

s = sys.argv[1]
s_e = s.encode('utf-8')

s64 = b85encode(s_e)
print('s,s_e,s64',s,s_e,s64)

In my case I pull this encoded token from a config file, but to simplify, let’s say we got it from the command line. This is how that goes, and we use it to create the zapi object which can be used in any subsequent api calls. That is the key.

from base64 import b85decode
import sys

url_zabbix = sys.argv[1]
t_e = sys.argv[2] # base85 encoded token

# Login Zabbix API
t_b = t_e.encode('utf-8')
to_b = b85decode(t_b)

token_zabbix = to_b.decode('utf-8')
zapi = ZabbixAPI(url_zabbix)
zapi.login(api_token=token_zabbix)
...

So it’s a few extra lines of code, but the cool thing is that it works. This should be good for version 5.4 and 6.0. Note that if you installed both py-zabbix and pyzabbix, your best bet may be to uninstall both and reinstall just pyzabbix. At least that was my experience going from user/pass to token-based authentication.


Convert DateAndTime SNMP output to human-readable format

Of course this is not very Zabbix-specific, as long as you realize that Zabbix produces the outer skin of the function:

function (value) {
// DrJ 2020-05-04
// see https://support.zabbix.com/browse/ZBXNEXT-3899 for SNMP DateAndTime format
'use strict';
//var str = "07 E4 05 04 0C 32 0F 00 2B 00 00";
var str = value;
// alert("str: " + str);
// read values are hex
var y256 = str.slice(0,2); var y = str.slice(3,5); var m = str.slice(6,8); 
var d = str.slice(9,11); var h = str.slice(12,14); var min = str.slice(15,17);
// convert to decimal
var y256Base10 = +("0x" + y256);
// convert to decimal
var yBase10 = +("0x" + y);
var Year = 256*y256Base10 + yBase10;
//  alert("Year: " + Year);
var mBase10 = +("0x" + m);
var dBase10 = +("0x" + d);
var hBase10 = +("0x" + h);
var minBase10 = +("0x" + min);
var YR = String(Year); var MM = String(mBase10); var DD = String(dBase10);
var HH = String(hBase10);
var MIN = String(minBase10);
// padding
if (mBase10 &lt; 10)  MM = "0" + MM; if (dBase10 &lt; 10) DD = "0" + DD;
if (hBase10 &lt; 10) HH = "0" + HH; if (minBase10 &lt; 10) MIN = "0" + MIN;
var Date = YR + "-" + MM + "-" + DD + " " + HH + ":" + MIN;
return Date;

I put that javascript into the preprocessing step of a dependent item, of course.

All my real-life examples do not fill in the last two fields: +/-, UTC offset. So in my case the times must be local times. But consequently I have no idea how a + or – would be represented in HEX! So I just ignored those last fields in the SNNMP DateAndTime which otherwise might have been useful.

Here’s an alternative version which calculates how long its been in hours since the last AV signature update.

// DrJ 2020-05-05
// see https://support.zabbix.com/browse/ZBXNEXT-3899 for SNMP DateAndTime format
'use strict';
//var str = "07 E4 05 04 0C 32 0F 00 2B 00 00";
var Start = new Date();
var str = value;
// alert("str: " + str);
// read values are hex
var y256 = str.slice(0,2); var y = str.slice(3,5); var m = str.slice(6,8); var d = str.slice(9,11); var h = str.slice(12,14); var min = str.slice(15,17);
// convert to decimal
var y256Base10 = +("0x" + y256);
// convert to decimal
var yBase10 = +("0x" + y);
var Year = 256*y256Base10 + yBase10;
//  alert("Year: " + Year);
var mBase10 = +("0x" + m);
var dBase10 = +("0x" + d);
var hBase10 = +("0x" + h);
var minBase10 = +("0x" + min);
var YR = String(Year); var MM = String(mBase10); var DD = String(dBase10);
var HH = String(hBase10);
var MIN = String(minBase10);
var Sigdate = new Date(Year, mBase10 - 1, dBase10,hBase10,minBase10);
//difference in hours
var difference = Math.trunc((Start - Sigdate)/1000/3600);
return difference;

More complex JavaScript example

function customSlice(array, start, end) {
    var result = [];
    var length = array.length;

    // Handle negative start
    start = start < 0 ? Math.max(length + start, 0) : start;

    // Handle negative end
    end = end === undefined ? length : (end < 0 ? length + end : end);

    // Iterate over the array and push elements to result
    for (var i = start; i < end &amp;&amp; i < length; i++) {
        result.push(array[i]);
    }

    return result;
}

function compareRangeWithExtractedIPs(ranges, result) {
    // ranges is [&quot;120.52.22.96/27&quot;,&quot;205.251.249.0/24&quot;]
    // I need to know if ips are in some of the ranges
    var ipRegex = /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g;
    var ips = result.match(ipRegex) || [];

    return ips.every(function(ip) {
        return ranges.some(function(range) {
            var rangeParts = range.split('/');
            var rangeIp = rangeParts[0];
            var rangeCidr = rangeParts[1];

            var rangeIpParts = rangeIp.split('.').map(Number);
            var ipParts = ip.split('.').map(Number);

            var rangeBinary = rangeIpParts.map(toBinary).join('');
            var ipBinary = ipParts.map(toBinary).join('');

            return ipBinary.substring(0, rangeCidr) === rangeBinary.substring(0, rangeCidr);
        });
    });
}

function toBinary(num) {
    var binary = num.toString(2);
    return '00000000'.substring(binary.length) + binary;
}

function fetchData(url) {
    try {
        Zabbix.log(4, 'Starting GET request to the provided URL');

        var result = {},
            req = new HttpRequest(),
            resp;

        req.addHeader('Content-Type: application/json');

        resp = req.get(url);

        if (req.getStatus() != 200) {
            throw 'Response code: ' + req.getStatus();
        }

        resp = JSON.parse(resp);
        result.data = resp;
        var ipPrefixes = resp.prefixes.map(function(prefix) {
            return prefix.ip_prefix;
        });
        result.data = ipPrefixes;
    } catch (error) {
        Zabbix.log(4, 'GET request failed: ' + error);

        result = {};
    }

    return JSON.stringify(result.data);
}

var result = fetchData('https://ip-ranges.amazonaws.com/ip-ranges.json');
var ranges =  JSON.parse(result);

return compareRangeWithExtractedIPs(ranges, value);

This is the preprocessing step of a dependent item which ran a net.dns.record to do  DNS outlook and get results as you see them from dig. I don’t fully understand how it works! My colleague wrote it and he used copilot mostly! He started making progress with Copilot once he constrained it to use “duct tape Javascript.” Apparently that’s a thing. This is defined within a template. It compares all of the returned IPs to a json list of expected possible ranges, which are pulled directly from an AWS endpoint. There are currently 7931 ranges, who knew?

Since this returns a true of false there are two subsequent preprocessing steps which Repalce true with 0 and false with 1 and we set the type to Numeric unsigned.

Calculated bandwidth from an interface that only provides byte count
Again in this example the assumption is you have an item, probably from SNMP, that lists the total inbound/outbound byte count of a network interface – hopefully stored as a 64-bit number to avoid frequent rollovers. But the quantity that really excites you is bandwidth, such as megabits per second.

Use a calculated item as in this example for Bluecoat ProxySG:

change(sgProxyInBytesCount)*8/1000000/300

Give it type numeric, Units of mbps. sgProxyInBytesCount is the key for an SNMP monitor that uses OID

IF-MIB::ifHCInOctets.{$INTERFACE_TO_MEASURE}

where {$INTERFACE_TO_MEASURE} is a macro set for each proxy with the SNMP-reported interface number that we want to pull the statistics for.

The 300 in the denominator of the calculated item is required for me because my item is run every five minutes.

Alternative
No one really cares about the actual total value of byte count, right? So just re-purpose the In Bytes Count item a bit as follows:

  • add preprocessing step: Change per second
  • add second preprocessing step, Custom multiplier 8e-6

The first step gives you units of bytes/second which is less interesting than mbps, which is given by the second step. So the final units are mbps.

Be sure to put the units as !mbps into the Zabbix item, or else you may wind up with funny things like Kmbps in your graphs!

Creating a baseline

Even as of Zabbix v 5, there is no built-in baseline item type, which kind of sucks. Baseline can mean many different things to many people – it really depends on the data. In the corporate world, where I’m looking at bandwidth, my data has these distinct characteristics:

  • varies by hour-of-day, e.g., mornings see heavier usage than afternoons
  • there is the “Friday effect” where somewhat less usage is seen on Fridays, and extremely less usage occurs on weekends, hence variability by day-of-week
  • probably varies by day of month, e.g., month-end closings

So for this type of data (except the last criterion) I have created an appropriate baseline. Note I would do something different if I were graphing something like the solar generation from my solar panels, where the day-of-week variability does not exist.

Getting to the point, I have created a rolling lookback item. This needs to be created as a Zabbix Item of type Calculated. The formula is as follows:

(last(sgProxyInBytesCount,#1,1w)+
last(sgProxyInBytesCount,#1,2w)+
last(sgProxyInBytesCount,#1,3w)+
last(sgProxyInBytesCount,#1,4w)+
last(sgProxyInBytesCount,#1,5w)+
last(sgProxyInBytesCount,#1,6w))/6

In this example sgProxyInBytesCount is my key from the reference item. Breaking it down, it does a rolling lookback of the last six measurements taken at this time of day on this day of the week over the last six weeks and averages them. Voila, baseline! The more weeks you include the more likely you are to include data you’d rather not like holidays, days when things were busted, etc. I’d like to have a baseline that is from a fixed time, like “all of last year.” I have no idea how. I actually don’t think it’s possible.

But, anyway, the baseline approach above should generally work for any numeric item.

Refinement

The above approach only gives you six measurements, hence 1/sqrt(6) ~ 40% standard deviation by the law of large numbers, which is still pretty jittery as it turns out. So I came up with this refined approach which includes 72 measurements, hence 1/sqrt(72) ~ 12% st dev. I find that to be closer to what you intuitively expect in a baseline – a smooth approximation of the past. Here is the refined function:

(avg(sgProxyInBytesCount,1h,1w)+
avg(sgProxyInBytesCount,1h,2w)+
avg(sgProxyInBytesCount,1h,3w)+
avg(sgProxyInBytesCount,1h,4w)+
avg(sgProxyInBytesCount,1h,5w)+
avg(sgProxyInBytesCount,1h,6w))/6

I would have preferred a one-hour interval centered around one week ago, etc., e.g., something like 1w+30m, but such date arithmetic does not seem possible in Zabbix functions. And, yeah, I could put 84600s (i.e., 86400 – 1800), but that is much less meaingful and so harder to maintain. Here is a three-hour graph whose first half still reflects the original (jittery) baseline, and last half the refined function.

Latter part has smoothed baseline in light green

What I do not have mastered is whether we can easily use a proper smoothing function. It does not seem to be a built-in offering of Zabbix. Perhaps it could be faked by a combination of pre-processing and Javascript? I simply don’t know, and it’s more than I wish to tackle for the moment.

Data gap between mulitple item measurements looks terrible in Dashboard graph – solution

In a Dashboard if you are graphing items which were not all measured at the same time, the results can be frustrating. For instance, an item and its baseline as calculated above. The central part of the graph will look fine, but at either end giant sections will be missing when the timescale of display is 30 minutes or 60 minutes for items measured every five minutes or so. Here’s an example before I got it totally fixed.

Zabbix item timing mismatch

See the left side – how it’s broken up? I had beguin my fix so the right side is OK.

The data gap solution

Use Scheduling Intervals in defining the items. Say you want a measurement every five minutes. Then make your scheduling interval m/5 in all the items you are putting on the same graph. For good measure, make the regular interval value infrequent. I use a macro {$UPDATE_LONG}. What this does is force Zabbix to measure all the items at the same time, in this case every five minutes on minutes divisible by five. Once I did that my incoming bandwith item and its corresponding baseline item aligned nicely.

Low-level Discovery

I cottoned on to the utility of this part of Zabbix a little late. Hey, slow learner, but I eventually got there. What I found in my F5 devices is that using SNMP to monitor the /var filesystem was a snap: it was always device 32 (final OID digit). But /var/log monitoring? Not so much. Every device seemed different, with no obvious pattern. Active and standby units – identical hardware – and some would be 53, the partner 55. Then I rebooted a device and its number changed! So, clearly, dynamically assigned and no way was I going to keep up with it. I had learned the numbers by doing an snmpwalk. The solution to this dynamically changing OID number is to use low-level discovery.

Tip: using zabbix_sender in a more robust fashion

We run the Zabbix proxies as pairs. They are not run as a cluster. Instead one is active and the other is a warm standby. Then we can upgrade at our leisure the standby proxy, switch the hosts to it, then upgrade the other now-unused proxy.

But our scripts which send results using zabbix_sender run on other servers. Their data stops being recorded when the switch is made. What to do?

I learned you can send to both Zabbix proxies. It will fail on the standby one and succeed on the other. Since one proxy is always active, it will always succeed in sending its data!

A nice DNS synthetic monitor

It would have been so easy for Zabbix to have built in the capability of doing synthetic DNS checks against your DNS servers. But, alas, they left it out. Which leaves it to us to fill that gap. Here is a nice and simple but surprisingly effective script for doing synthetic DNS checks. You put it in the external script directory of whatever proxy is monitoring your DNS host. I called it dns.sh.

#!/bin/sh
# arg1 - hostname of nameserver
# arg2 - DNS server to test
# arg3 - FQDN
# arg4 - RR type
# arg5 - match arg
# [arg6] - tcpflag # this argument is optional

# if you set DEBUG=1, and debug through zabbix, set item type to text
DEBUG=0
timeout=2 # secs - seems a good value
name=$1
nameserver=$2
record=$3
type=$4
match=$5
tcpflag=$6
[[ "$DEBUG" -eq "1" ]] && echo "name: $name, nameserver: $nameserver , record: $record , type: $type , match pattern: $match, tcpflag: $tcpflag"
[[ "$tcpflag" = "y" ]] || [[ "$tcpflag" = "Y" ]] && PROTO="+tcp"
# unless you set tries to 1 it will try three times by default!
MATCH=$(dig +noedns +short $PROTO +timeout=$timeout +tries=1 $type $record @${nameserver} )
[[ "$DEBUG" -eq "1" ]] && echo MATCHed line is $MATCH
return=0
[[ "$MATCH" =~ $match ]] && return=1
[[ "$DEBUG" -eq "1" ]] && echo return $return
echo $return

It gives a value of 1 if it matched the match expression, 0 otherwise.

With Zabbix version 7 we finally replaced this nice external dns.sh script with built-in agent items. But it was a little tricky to throw that sort of item into a template – we had to define a host maro containing the IP of the host!

Convert a string to a number in item output

I’ve been bit by this a couple times. Create a dependent item. In the preprocessing steps first do a RegEx match and keep \0. Important to also check the Custom on Fail checkbox. Set value to 0 on fail. In the second step I used a replace and replaced the expected string with a 1.

I had a harder case where the RegEx had alternate strings. But that’s solvable as well! I created additional Replace steps with all possible alternate strings, setting each one to be replaced with 1. Kludge, yes, but it works. And that is how you turn a text item into a boolean style output of 0 or 1.

Conclusion
A couple of really useful but poorly documented items are shared. Perhaps more will be added in the future.


References and related

https://support.zabbix.com/browse/ZBXNEXT-3899 for SNMP DateAndTime format

My first Zabbix post was mostly documentation of a series of disasters and unfinished business.

Blog post about calculated items by a true expert: https://blog.zabbix.com/zabbix-monitoring-with-calculated-items-explained/9950/

Low-level Discovery write-up: https://blog.zabbix.com/how-to-use-zabbix-low-level-discovery/9993/

Categories
Admin Firewall

The IT Detective Agency: large packets dropped by firewall, but logs show OK

Intro
All of a sudden one day I could not access the GUI of one my security appliances. It had only worked yesterday. CLI access kind of worked – until it didn’t. It was the standby part of a cluster so I tried the active unit. Same issues. I have some ill-defined involvement with the firewall the traffic was traversing, so I tried to debug the problem without success. So I brought in a real firewall expert.

More details
Of course I knew to check the firewall logs. Well, they showed this traffic (https and ssh) to have been accepted, no problems. Hmm. I suspected some weird IPS thing. IPS is kind of a big black box to me as I don’t deal with it. But I have seen cases where it blocks traffic without logging the fact. But that concern led me to bring in the expert.

By myself I had gotten it to the point where I had done tcpdump (I had totally forgotten how to use fw monitor. Now I will know and refer to my own recent blog post) on the corporate network side as well as the protected subnet side. And I saw that packets were hitting the corporate network interface that weren’t crossing over to the protected subnet. Why? But first some more about the symptoms.

The strange behaviour of my ssh session
The web GUI just would not load the home page. But ssh was a little different. I could indeed log in. But my ssh froze every time I changed to the /var/log directory and did a detailed directory listing ls -l. The beginning of the file listing would come back, and then just hang there mid-stream, frozen. In my tcpdump I noticed that the packets that did not get through were larger than the ones sent in the beginning of the session – by a lot. 1494 data bytes or something like that. So I could kind of see that with ssh, you normally send smallish packets, until you need a bigger one for something like a detailed directory listing! And https sends a large server certificate at the beginning of the session so it makes sense that it would hang if those packets were being stopped. So the observed behaviour makes sense in light of the dropping of the large packets. But that doesn’t explain why.

I asked a colleague to try it and they got similar results.

The solution method
It had nothing to do with IPS. The firewall guy noticed and did several things.

  • He agreed the firewall logs showed my connection being accepted.
  • He saw that another firewall admin had installed policy around the time the problem began. We analyzed what was changed and concluded that was a false lead. No way those changes could have caused this problem.
  • He switched the active firewall to standby so that we used the standby unit. It worked just fine!
  • He observed that the current active unit became active around the time of the problem, due to a problem with an interface on the normally active unit.

I probably would have been fine to just work using the standby but I didn’t want to crimp his style, so he continued in investigating…and found the ultimate root cause.

And finally the solution
He noticed that on the bad firewall the one interface – I swear I am not making this up – had been configured with a non-standard MTU! 1420 instead of 1500.

Analysis
I did a head slap when he shared that finding. Of course I should have looked for that. It explains everything. The OS was dropping the packet, not the firewall blade per se. And I knew the history. Some years back these firewalls were used for testing OLTV, a tunneling technology to extend layer 2 across physically separated subnets. That never did work to my satisfaction. One of the issues we encountered was a problem with large packets. So the firewall guy at the time tried this out to help. Normally firewalls don’t fail so the one unit where this MTU setting was present just wasn’t really used, except for brief moments during OS upgrade. And, funny to say, this mis-configuration was even propagated from older hardware to newer! The firewall guys have a procedure where they suck up all the configuration from the old firewall and restore to the newer one, mapping updated interface names, etc, as needed.

Well, at least we found it before too many others complained. Though, as expected, complain they did, the next day.

Aside: where is curl?
I normally would have tested the web page from the firewall iself using curl. But curl has disappeared from Gaia v 80.20. And there’s no wget either. How can such a useful and universal utility be missing? The firewall guy looked it up and quickly found that instead of curl, they have curl_cli. Who knew?

Conclusion
The strange case of the large packets dropped by a firewall, but not by the firewall blade, was resolved the same day it occurred. It took a partner ship of two people bringing their domain-specific knowledge to bear on the problem to arrive at the solution.

Categories
Admin Firewall Network Technologies Security

Linux shell script to cut a packet trace every 10 minutes on Checkpoint firewall

Intro
Scripts are normally not worth sharing because they are so easy to construct. This one illustrates several different concepts so may be of interest to someone else besides myself:

  • packet trace utility in Checkpoint firewall Gaia
  • send Ctrl-C interrupt to a process which has been run in the background
  • giving unqieu filenames for each cut
  • general approach to tacklnig the challenge of breaking a potentially large output into manageable chunks

The script
I wanted to learn about unexpected VPN client disconnects that a user, Sandy, was experiencing. Her external IP is 99.221.205.103.

while /bin/true; do
# date +%H%M inserts the current Hour (HH) and minute (MM).
 file=/tmp/sandy`date +%H%M`.cap
# fw monitor is better than tcpdump because it looks at all interfaces
 fw monitor -o $file -l 60 -e "accept src=99.221.205.103 or dst=99.221.205.103;" &
# $! picks up the process number of the command we backgrounded just above
 pid=$!
 sleep 600
 #sleep 90
 kill $pid
 sleep 3
 gzip $file
done

This type of tracing of this VPN session produces about 20 MB of data every 10 minutes. I want to be able to easily share the trace file afterwards in an email. And smaller files will be faster when analyzed in Wireshark.

The script itself I run in the background:
# ./sandy.sh &
And to make sure I don’t get logged out, I just run a slow PING afterwards:
# ping ‐i45 1.1.1.1

Alternate approaches
In retrospect I could have simply used the -ci argument and had the process terminate itself after a certain number of packets were recorded, and saved myself the effort of killing that process. But oh well, it is what it is.

Small tip to see all packets
Turn acceleration off:
fwaccel stat
fwaccel off
fwaccel on (when you’re done).

Conclusion
I share a script I wrote today that is simple, but illustrates several useful concepts.

References and related
fw monitor cheat sheet.

The standard packet analyzer everyone uses is Wireshark from https://wireshark.org/

Categories
Network Technologies

What is BGP hijacking?

Intro
I just learned of this really clear explanation of BGP hijacking, including interactive links, and what can be done to improve the current situation, namely, implement RPKI. I guess this weighs on me lately after I learned about another massive BGP hijacking out of Russia. See the references for these links.

References and related
This is the post I was referring to above, very interactive and not too technical. Is BGP Safe Yet?

A more in-depth disucssion ow aht RPKI is all about is on this Cloudflare blog: https://blog.cloudflare.com/rpki/

A couple descriptions of the latest BGP hijack out of Russia: https://www.zdnet.com/article/russian-telco-hijacks-internet-traffic-for-google-aws-cloudflare-and-others/ and https://blog.qrator.net/en/how-you-deal-route-leaks_69/