Categories
Admin Web Site Technologies

TCL iRule program with comments for F5 BigIP

Intro

A publicity-adverse colleague of mine wrote this amazing program. I wanted to publish it not so much for what it specifically does, but as well for the programming techniques it uses. I personally find i relatively hard to look up concepts when using TCL for an F5 iRule.

Program Introduction

Test

                    
# RULE_INIT is executed once every time the iRule is saved or on reboot. So it is ideal for persistent data that is shared accross all sessions.
# In our case it is used to define a template with some variables that are later substituted

when RULE_INIT {
# "static" variables in iRules are global and read only. Unlike regular TCL global variables they are CMP-friendly, that means they don't break the F5 clustered multi-processing mechanism. They exist in memory once per CMP instance. Unlike regular variables that exist once per session / iRule execution. Read more about it here: https://devcentral.f5.com/s/articles/getting-started-with-irules-variables-20403
#
# One thing to be careful about is not to define the same static variable twice in multiple iRules. As they are global, the last iRule saved overwrites any previous values.
# Originally the idea was to load an iFile here. That's also the main reason to even use RULE_INIT and static variables. The reasoning was (and I don't even know if this is true), that loading the iFile into memory once would have to be more efficient than to do it every time the iRule is executed. However, it is entirely possible that F5 already optimized iFiles in a way that loads them into memory automatically at opportune times, so this might be completely unnecessary.
# Either way, as you can tell, in the end I didn't even use iFiles. The reason for that is simply visibility. iFiles can't be easily viewed from the web UI, so it would be quite inconvenient to work with.
# The template idea and the RULE_INIT event stayed, even though it doesn't really serve a purpose, except maybe visually separating the templates from the rest of the code.
#
# As for the actual content of the variable: First thing to note is the use of  {} to escape the entire string. Works perfectly, even though the string itself contains braces. TCL magic.
# The rest is just the actual PAC file, with strategically placed TCL variables in the form of $name (this becomes important later)

            set static::pacfiletemplate {function FindProxyForURL(url, host)
{
            var globalbypass = "$globalbypass";
            var localbypass = "$localbypass";
            var ceglobalbypass = "$ceglobalbypass";
            var zpaglobalbypass = "$zpaglobalbypass";
            var zscalerbypassexception = "$zscalerbypassexception";

            var bypass = globalbypass.split(";").concat(localbypass.split(";"));
            var cebypass = ceglobalbypass.split(";");
            var zscalerbypass = zpaglobalbypass.split(";");
            var zpaexception = zscalerbypassexception.split(";");

            if(isPlainHostName(host)) {
                        return "DIRECT";
            }

            for (var i = 0; i < zpaexception.length; ++i){
                        if (shExpMatch(host, zpaexception[i])) {
                                   return "PROXY $clientproxy";
                        }
            }

            for (var i = 0; i < zscalerbypass.length; ++i){
                        if (shExpMatch(host, zscalerbypass[i])) {
                                   return "DIRECT";
                        }
            }

            for (var i = 0; i < bypass.length; ++i){
                        if (shExpMatch(host, bypass[i])) {
                                   return "DIRECT";
                        }
            }

            for (var i = 0; i < cebypass.length; ++i) {
                        if (shExpMatch(host, cebypass[i])) {
                                   return "PROXY $ceproxy";
                        }
            }

            return "PROXY $clientproxy";
}
}

            set static::forwardingpactemplate {function FindProxyForURL(url, host)
{
            var forwardinglist = "$forwardinglist";
            var forwarding = forwardinglist.split(";");

            for (var i = 0; i < forwarding.length; ++i){
                        if (shExpMatch(host, forwarding[i])) {
                                   return "PROXY $clientproxy";
                        }
            }

            return "DIRECT";
}
}
}

# Now for the actual code (executed every time a user accesses the vserver)
when HTTP_REQUEST {
    # The request URI can of course be used to differentiate between multiple PAC files or to restrict access.
    # So can basically any other request attribute. Client IP, host, etc.
            if {[HTTP::uri] eq "/proxy.pac"} {

                        # Here we set variables with the exact same name as used in the template above.
                        # In our case the values come from a data group, but of course they could also be defined
                        # directly in this iRule. Using data groups makes the code a bit more compact and it
                        # limits the amount of times anyone needs to edit the iRule (potentially making a mistake)
                        # for simple changes like adding a host to the bypass list
                        # These variables are all set unconditionally. Of course it is possible to set them based
                        # on for example client IP (to give different bypass lists or proxy entries to different groups of users)
                        set globalbypass [ class lookup globalbypass ProxyBypassLists ]
                        set localbypass [ class lookup localbypassEU ProxyBypassLists ]
                        set ceglobalbypass [ class lookup ceglobalbypass ProxyBypassLists ]
                        set zpaglobalbypass [ class lookup zpaglobalbypass ProxyBypassLists ]
                        set zscalerbypassexception [ class lookup zscalerbypassexception ProxyBypassLists ]
                        set ceproxy [ class lookup ceproxyEU ProxyHosts ]

                        # Here's a bit of conditionals, setting the proxy variable based on which virtual server the
                        # iRule is currently executed from (makes sense only if the same iRule is attached to multiple
                        # vservers of course)
                        if {[virtual name] eq "/Common/proxy_pac_http_90_vserver"} {
                            set clientproxy [ class lookup formauthproxyEU ProxyHosts ]
                        } elseif {[virtual name] eq "/Common/testproxy_pac_http_81_vserver"} {
                            set clientproxy [ class lookup testproxyEU ProxyHosts]
                        } elseif {[virtual name] eq "/Common/proxy_pac_http_O365_vserver"} {
                            set clientproxy [ class lookup ceproxyEU ProxyHosts]
                        } else {
                            set clientproxy [ class lookup clientproxyEU ProxyHosts ]
                }

                        # Now this is the actual magic. As noted above we have now set TCL variables named for example
                        # $globalbypass and our template includes the string "$globalbypass"

                        # What we want to do next is substitute the variable name in the template with the variable values
                        # from the code.
                        # "subst" does exactly that. It performs one level of TCL execution. Think of "eval" in basically
                        # any language. It takes a string and executes it as code.
                        # Except for "subst" there are two in this context very useful parameters: -nocommands and -nobackslashes.
                        # Those prevent it from executing commands (like if there was a ping or rm or ssh or find or anything
                        # in the string being subst'd it wouldn't actually try to execute those commands) and from normalizing
                        # backslashes (we don't have any in our PAC file, but if we did, it would still work).
                        # So what is left that it DOES do? Substituting variables! Exactly what we want and nothing else.
                        # Now since the static variable is read only, we can't do this substitution on the template itself.
                        # And if we could it wouldn't be a good idea, because it is shared accross all sessions. So assuming
                        # there are multiple versions of the PAC file with different proxies or bypass lists, we would
                        # constantly overwrite them with each other.
                        # The solution is simply to save the output of the subst in a new local variable that exists in
                        # session context only.
                        # So from a memory point of view the static/global template doesn't really gain us anything.
                        # In the end we have the template in memory once per CMP and then a substituted copy of the template
                        # once per session. So as noted earlier, could've probably just removed the entire RULE_INIT block,
                        # set the template in session context (HTTP_REQUEST event) and get the same result,
                        # maybe even slightly more efficient.
                        set pacfile [subst -nocommands -nobackslashes $static::pacfiletemplate]

                        # All that's left to do is actually respond to the client. Simple stuff.
                        HTTP::respond 200 content $pacfile "Content-Type" "application/x-ns-proxy-autoconfig" "Cache-Control" "private,no-cache,no-store,max-age=0"
            # In this example we have two different PAC files with different templates on different URLs
            # Other iRules we use have more differentiation based on client IP. In theory we could have one big iRule
            # with all the PAC files in the world and it would still scale very well (just a few more if/else or switch cases)
            } elseif { [HTTP::uri] eq "/forwarding.pac" } {
                set clientproxy [ class lookup clientproxyEU ProxyHosts]
                set forwardinglist [ class lookup forwardinglist ProxyBypassLists ]
            set forwardingpac [subst -nocommands -nobackslashes $static::forwardingpactemplate]
            HTTP::respond 200 content $forwardingpac "Content-Type" "application/x-ns-proxy-autoconfig" "Cache-Control" "private,no-cache,no-store,max-age=0"
            } else {
                # If someone tries to access a different path, give them a 404 and the right URL
                HTTP::respond 404 content "Please try http://webproxy.drjohns.com/proxy.pac" "Content-Type" "text/plain" "Cache-Control" "private,no-cache,no-store,max-age=0"
            }
}

To be continued...

Categories
Linux Perl Raspberry Pi Web Site Technologies

Convert GPS Coordinates into town name or address

Intro

This is a small piece of a larger project – displaying your photos on Google Drive using a Raspberry Pi. That project will require completion of many small investigations, this being just one of them.

I thought, wouldn’t it be cool to ask your photo frame when and where a certain picture was taken? I thought that information was typically embedded into the picture by modern smartphones. Turns out this is disappointingly not the case – at least not on our smartphones, except in a small minority of pictures. But since I got somewhere with my investigation, I wanted to share the results, regardless.

Also, I naively assumed that there surely is a web service that permits one to easily convert GPS coordinates into the name – in text – of the closest town. After all, you can enter GPS coordinates into Google Maps and get back a map showing the exact location. Why shouldn’t it be just as easy to extract the nearest town name as text? Again, this assumption turns out to be faulty. But, I found a way to do it that is not toooo difficult.

Example for Cape May, New Jersey

$ curl -s http://api.geonames.org/address?lat=38.9302957777778&lng=-74.9183310833333&username=drjohns

<geonames>
<address>
<street>Beach Dr</street>
<houseNumber>690</houseNumber>
<locality>Cape May</locality>
<postalcode>08204</postalcode>
<lng>-74.91835</lng>
<lat>38.93054</lat>
<adminCode1>NJ</adminCode1>
<adminName1>New Jersey</adminName1>
<adminCode2>009</adminCode2>
<adminName2>Cape May</adminName2>
<adminCode3/>
<adminCode4/>
<countryCode>US</countryCode>
<distance>0.03</distance>
</address>
</geonames>

The above example used the address service. The results in this case are unusually complete. Sometime the lookups simply fail for no obvious reason, or provide incomplete information, such as a missing locality. In those cases the town name is usually still reported in the adminName2 element. I haven’t checked the address accuracy much, but it seems pretty accurate, like, representing an actual address within 100 yards, usually better, of where the picture was taken.

They have another service, findNearbyPlaceName, which sometimes works even when address fails. However its results are also unpredictable. I was in Merrillville, Indiana and it gave the toponym as Chapel Manor, which is the name of the subdivision! In Virginia it gave the name The Hamlet – still not sure where that came from, but I trust it is some hyper-local name for a section of the town (James City). Just as often it does spit back the town or city name, for instance, Atlantic City. So, it’s better than nothing.

The example for Nantucket

From a browser – here I use curl in the linux command line – you enter:

$ curl -s http://api.geonames.org/findNearbyPlaceName?lat=41.282778&lng=-70.099444&username=drjohns

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<geonames>
<geoname>
<toponymName>Nantucket</toponymName>
<name>Nantucket</name>
<lat>41.28346</lat>
<lng>-70.09946</lng>
<geonameId>4944903</geonameId>
<countryCode>US</countryCode>
<countryName>United States</countryName>
<fcl>P</fcl>
<fcode>PPLA2</fcode>
<distance>0.07534</distance>
</geoname>
</geonames>

So what did we do? For this example I looked up Nantucket in Wikipedia to find its GPS coordinates. Then I used the geonames api to convert those coordinates into the town name, Nantucket.

Note that drjohns is an actual registered username with geonames. I am counting on the unpopularity of my posts to prevent an onslaught of usage as the usage credits are limited for free accounts. If I understood the terms, a few lookups per hour would not be an issue.

I’m finding the PlaceName lookup pretty useless, the address lookup fails about 30% of the time, so I’m thinking as a backstop to use this sort of lookup:

$ curl ‘http://api.geonames.org/extendedFindNearby?lat=41.00050&lng=-74.65329&username=drjohn’

<?xml version=”1.0″ encoding=”UTF-8″ standalone=”no”?>
<geonames>
<address>
<street>Stanhope Rd</street>
<mtfcc>S1400</mtfcc>
<streetNumber>439</streetNumber>
<lat>41.00072</lat>
<lng>-74.6554</lng>
<distance>0.18</distance>
<postalcode>07871</postalcode>
<placename>Lake Mohawk</placename>
<adminCode2>037</adminCode2>
<adminName2>Sussex</adminName2>
<adminCode1>NJ</adminCode1>
<adminName1>New Jersey</adminName1>
<countryCode>US</countryCode>
</address>
</geonames>

Note that gets a reasonably close address, and more importantly, a zipcode. The placename is too local and I will probably discard it. But another lookup can turn a zipcode into a town or city name which is what I am after.

$ curl ‘http://api.geonames.org/postalCodeSearch?country=US&postalcode=07871&username=drjohns’

<?xml version=”1.0″ encoding=”UTF-8″ standalone=”no”?>
<geonames>
<totalResultsCount>1</totalResultsCount>
<code>
<postalcode>07871</postalcode>
<name>Sparta</name>
<countryCode>US</countryCode>
<lat>41.0277</lat>
<lng>-74.6407</lng>
<adminCode1 ISO3166-2=”NJ”>NJ</adminCode1>
<adminName1>New Jersey</adminName1>
<adminCode2>037</adminCode2>
<adminName2>Sussex</adminName2>
<adminCode3/>
<adminName3/>
</code>
</geonames>

See? It was a lot of work, but we finally got the township name, Sparta, returned to us.

Ocean GPS?

I was whale-watching and took some pictures with GPS info. Trying to apply the methods above worked, but just barely. Basically all I could get out of the extended find nearby search was a name field with value North Atlantic Ocean! Well, that makes it sounds like I was on some Titanic-style ocean crossing. In fact I was in the Gulf of Maine a few miles from Provincetown. So they really could have done a better job there… Of course it’s understandable to not have a postalcode and street address and such. But still, bodies of waters have names and geographical boundaries as well. Casinos seem to be the main sponsors of geonames.org, and I guess they don’t care. Yesterday my script came up with a location Earth! But now I see geonames proposed several locations and I only look at the first one. I am creating a refinement which will perform better in such cases. Stay tuned… And…yes…the refinement is done. I had to do a wee bit of xml parsing, which I now do.

To get your own account at geonames.org

The process of getting your own account isn’t too difficult, just a bit squirrelly. For the record, here is what you do.

Go to http://www.geonames.org/login to create your account. It sends an email confirmation. Oh. Be sure to use a unique browser-generated password for this one. The security level is off-the-charts awful – just assume that any and all hackers who want that password are going to get it. It sends you a confirmation email. so far so good. But when you then try to use it in an api call it will tell you that that username isn’t known. This is the tricky part.

So go to https://www.geonames.org/manageaccount . It will say:

Free Web Services
the account is not yet enabled to use the free web services. Click here to enable. 

And that link, in turn is https://www.geonames.org/enablefreewebservice . And having enabled your account for the api web service, the URL, where you’ve put your username in place of drjohns, ought to work!

For a complete overview of all the different things you can find out from the GPS coordinates from geonames, look at this link: https://www.geonames.org/export/ws-overview.html

Working with pictures

Please look at this post for the python code to extract the metadata from an image, including, if available GPS info. I called the python program getinfo.py.

Here’s an actual example of running it to learn the GPS info:

$ ../getinfo.py 20170520_102248.jpg|grep -ai gps

GPSInfo = {0: b'\x02\x02\x00\x00', 1: 'N', 2: (42.0, 2.0, 18.6838), 3: 'W', 4: (70.0, 4.0, 27.5448), 5: b'\x00', 6: 0.0, 7: (14.0, 22.0, 25.0), 29: '2017:05:20'}

I don’t know if it’s good or bad, but the GPS coordinates seem to be encoded in the degrees, minutes, seconds format.

A nice little program to put things together

I call it analyzeGPS.pl and a, using it on a Raspberry Pi, but could easily be adapted to any linux system.

                    
#!/usr/bin/perl
# use in combination with this post https://drjohnstechtalk.com/blog/2020/12/convert-gps-coordinates-into-town-name/
use POSIX;
$DEBUG = 1;
$HOME = "/home/pi";
#$file = "Pictures/20180422_134220.jpg";
while(<>){
$GPS = $date = 0;
$gpsinfo = "";
$file = $_;
open(ANAL,"$HOME/getinfo.py \"$file\"|") || die "Cannot open file: $file!!\n";
#open(ANAL,"cat \"$file\"|") || die "Cannot open file: $file!!\n";
print STDERR "filename: $file\n" if $DEBUG;
while(<ANAL>){
  $postalcode = $town = $name = "";
  if (/GPS/i) {
    print STDERR "GPS: $_" if $DEBUG;
# GPSInfo = {1: 'N', 2: (39.0, 21.0, 22.5226), 3: 'W', 4: (74.0, 25.0, 40.0267), 5: 1.7, 6: 0.0, 7: (23.0, 4.0, 14.0), 29: '2016:07:22'}
   ($pole,$deg,$min,$sec,$hemi,$lngdeg,$lngmin,$lngsec) = /1: '([NS])', 2: \(([\d\.]+), ([\d\.]+), ([\d\.]+)...3: '([EW])', 4: \(([\d\.]+), ([\d\.]+), ([\d\.]+)\)/i;
   print STDERR "$pole,$deg,$min,$sec,$hemi,$lngdeg,$lngmin,$lngsec\n" if $DEBUG;
   $lat = $deg + $min/60.0 + $sec/3600.0;
   $lat = -$lat if $pole eq "S";
   $lng = $lngdeg + $lngmin/60.0 + $lngsec/3600.0;
   $lng = -$lng if $hemi = "W" || $hemi eq "w";
   print STDERR "lat,lng: $lat, $lng\n" if $DEBUG;
   #$placename = `curl -s "$url"|grep -i toponym`;
   next if $lat == 0 && $lng == 0;
# the address API is the most precise
   $url = "http://api.geonames.org/address?lat=$lat\&lng=$lng\&username=drjohns";
   print STDERR "Url: $url\n" if $DEBUG;
   $results = `curl -s "$url"|egrep -i 'street|house|locality|postal|adminName'`;
   print STDERR "results: $results\n" if $DEBUG;
   ($street) = $results =~ /street>(.+)</;
   ($houseNumber) = $results =~ /houseNumber>(.+)</;
   ($postalcode) = $results =~ /postalcode>(.+)</;
   ($state) = $results =~ /adminName1>(.+)</;
   ($town) = $results =~ /locality>(.+)</;
   print STDERR "street, houseNumber, postalcode, state, town: $street, $houseNumber, $postalcode, $state, $town\n" if $DEBUG;
# I think locality is pretty good name. If it exists, don't go  further
   $postalcode = "" if $town;
   if (!$postalcode && !$town){
# we are here if we didn't get interesting results from address reverse loookup, which often happens.
     $url = "http://api.geonames.org/extendedFindNearby?lat=$lat\&lng=$lng\&username=drjohns";
     print STDERR "Address didn't work out. Trying extendedFindNearby instead. Url: $url\n" if $DEBUG;
     $results = `curl -s "$url"`;
# parse results - there may be several objects returned
     $topelemnt = $results =~ /<geoname>/i ? "geoname" : "geonames";
     @elmnts = ("street","streetnumber","lat","lng","locality","postalcode","countrycode","countryname","name","adminName2","adminName1");
     $cnt = xml1levelparse($results,$topelemnt,@elmnts);

     @lati = @{ $xmlhash{lat}};
     @long = @{ $xmlhash{lng}};
# find the closest entry
     $distmax = 1E7;
     for($i=0;$i<$cnt;$i++){
       $dist = ($lat - $lati[$i])**2 + ($lng - $long[$i])**2;
       print STDERR "dist,lati,long: $dist, $lati[$i], $long[$i]\n" if $DEBUG;
       if ($dist < $distmax) {
         print STDERR "dist < distmax condition. i is: $i\n";
         $isave = $i;
       }
     }
     $street = @{ $xmlhash{street}}[$isave];
     $houseNumber = @{ $xmlhash{streetnumber}}[$isave];
     $admn2 = @{ $xmlhash{adminName2}}[$isave];
     $postalcode = @{ $xmlhash{postalcode}}[$isave];
     $name = @{ $xmlhash{name}}[$isave];
     $countrycode = @{ $xmlhash{countrycode}}[$isave];
     $countryname = @{ $xmlhash{countryname}}[$isave];
     $state = @{ $xmlhash{adminName1}}[$isave];
     print STDERR "street, houseNumber, postalcode, state, admn2, name: $street, $houseNumber, $postalcode, $state, $admn2, $name\n" if $DEBUG;
     if ($countrycode ne "US"){
       $state .= " $countryname";
     }
     $state .= " (approximate)";
   }
# turn zipcode into town name with this call
   if ($postalcode) {
     print STDERR "postalcode $postalcode exists, let's convert to a town name\n";
     print STDERR "url: $url\n";
     $url = "http://api.geonames.org/postalCodeSearch?country=US\&postalcode=$postalcode\&username=drjohns";
     $results = `curl -s "$url"|egrep -i 'name|locality|adminName'`;
     ($town) = $results =~ /<name>(.+)</i;
     print STDERR "results,town: $results,$town\n";
   }
   if (!$town) {
# no town name, use adminname2 which is who knows what in general
     print STDERR "Stil no town name. Use adminName2 as next best thing\n";
     $town = $admn2;
   }
   if (!$town) {
# we could be in the ocean! I saw that once, and name was North Atlantic Ocean
     print STDERR "Still no town. Try to use name: $name as last resort\n";
     $town = $name;
   }
   $gpsinfo = "$houseNumber $street $town, $state" if $locality || $town;
   } # end of GPS info exists condition
  } # end loop over ANAL file
  $gpsinfo = $gpsinfo || "No info found";
  print qq(Location: $gpsinfo
);
} # end loop over STDIN

#####################
# function to parse some xml and fill a hash of arrays
sub xml1levelparse{
# build an array of hashes
$string = shift;
# strip out newline chars
$string =~ s/\n//g;
$parentelement = shift;
@elements = @_;
$i=0;
while($string =~ /<$parentelement>/i){
 $i++;
 ($childelements) = $string =~ /<$parentelement>(.+?)<\/$parentelement>/i;
 print STDERR "childelements: $childelements" if $DEBUG;
 $string =~ s/<$parentelement>(.+?)<\/$parentelement>//i;
 print STDERR "string: $string\n" if $DEBUG;
 foreach $element (@elements){
  print STDERR "element: $element\n" if $DEBUG;
  ($value) = $childelements =~ /<$element>([^<]+)<\/$element>/i;
  print STDERR "value: $value\n" if $DEBUG;
  push @{ $xmlhash{$element} }, $value;
 }
} # end of loop over parent elements
return $i;
} # end sub xml1levelparse

Here’s a real example of calling it, one of the more difficult cases:

$ echo -n 20180127_212203.jpg|./analyzeGPS.pl

GPS: GPSInfo = {0: b'\x02\x02\x00\x00', 1: 'N', 2: (41.0, 0.0, 2.75), 3: 'W', 4: (74.0, 39.0, 12.0934), 5: b'\x00', 6: 0.0, 7: (2.0, 21.0, 58.0), 29: '2018:01:28'}
N,41.0,0.0,2.75,W,74.0,39.0,12.0934
lat,lng: 41.0007638888889, -74.6533592777778
Url: http://api.geonames.org/address?lat=41.0007638888889&lng=-74.6533592777778&username=drjohns
results:
street, houseNumber, postalcode, state, town: , , , ,
Address didn't work out. Trying extendedFindNearby instead. Url: http://api.geonames.org/extendedFindNearby?lat=41.0007638888889&lng=-74.6533592777778&username=drjohns
childelements: <address> <street>Stanhope Rd</street> <mtfcc>S1400</mtfcc> <streetNumber>433</streetNumber> <lat>41.00121</lat> <lng>-74.65528</lng> <distance>0.17</distance> <postalcode>07871</postalcode> <placename>Lake Mohawk</placename> <adminCode2>037</adminCode2> <adminName2>Sussex</adminName2> <adminCode1>NJ</adminCode1> <adminName1>New Jersey</adminName1> <countryCode>US</countryCode> </address>string: <?xml version="1.0" encoding="UTF-8" standalone="no"?>
element: street
value: Stanhope Rd
element: streetnumber
value: 433
element: lat
value: 41.00121
element: lng
value: -74.65528
element: locality
value:
element: postalcode
value: 07871
element: countrycode
value: US
element: countryname
value:
element: name
value:
element: adminName2
value: Sussex
element: adminName1
value: New Jersey
dist,lati,long: 3.88818897839883e-06, 41.00121, -74.65528
dist < distmax condition. i is: 0
street, houseNumber, postalcode, state, admn2, name: Stanhope Rd, 433, 07871, New Jersey, Sussex,
postalcode 07871 exists, let's convert to a town name
url: http://api.geonames.org/extendedFindNearby?lat=41.0007638888889&lng=-74.6533592777778&username=drjohns
results,town: <geonames>
<name>Sparta</name>
<adminName1>New Jersey</adminName1>
<adminName2>Sussex</adminName2>
<adminName3/>
</geonames>
,Sparta
Location: 433 Stanhope Rd Sparta, New Jersey (approximate)

Or, if you just want the interesting stuff,

$ echo -n 20180127_212203.jpg|./analyzeGPS.pl 2>/dev/null

Location: 433 Stanhope Rd Sparta, New Jersey (approximate)

Conclusion

An api for reverse lookup of GPS coordinates which returns the nearest address, including town name, is available. I have provided examples of how to use it. It is unreliable, however, and Geonames.org does provide alternatives which have their own drawbacks. In my image gallery, only a minority of my pictures have encoded GPS data, but it is fun to work with them to pluck out the town where they were shot.

I have incorporated this functionality into a Raspberry Pi-based photo frame I am working on.

I have created an example Perl program that analyzes a JPEG image to extract the GPS information and turn it into an address that is remarkably accurate. It is amazing and uncanny to see it at work. It deals with the screwy and inconsistent results returned by the free service, Geonames.org.

References and related

There are lots of different things you can derive given the GPS coordinates using the Geonames api. Here is a list: https://www.geonames.org/export/ws-overview.html

In this photo frame version of mine, I extract all the EXIF metadata which includes the GPS info.

One day my advanced photo frame will hopefully include an option to learn where a photo was taken by interacting with a remote control. Here is the start of that write-up.

You can pay $5 and get a zip codes to cities database in any format. I’m sure they’ve just re-packaged data from elsewhere, but it might be worth it: https://www.uszipcodeslist.com/

For a more professional api, https://smartystreets.com/ looks quite nice. Free level is 250 queries per month, so not too many. But their documentation and usability looks good to me. For this post I was looking for free services and have tried to avoid commercial services.

Categories
Perl Python Raspberry Pi Web Site Technologies

Raspberry Pi photo frame using your pictures on your Google Drive

Editor’s Note

Please note I am putting all my currently active development and latest updates into this newer post: Raspberry Pi photo frame using your pictures on your Google Drive II

Intro

All my spouse’s digital photo frames are either broken or nearly broken – probably she got them from garage sales. Regardless, they spend 99% of the the time black. Now, since I had bought that Raspberry Pi PiDisplay awhile back, and it is underutilized, and I know a thing or two about linux, I felt I could create a custom photo frame with things I already have lying around – a Raspberry Pi 3, a PiDisplay, and my personal Google Drive. We make a point to copy all our cameras’ pictures onto the Google Drive, which we do the old-fashioned, by-hand way. After 17 years of digital photos we have about 40,000 of them, over 200 GB.

So I also felt obliged to create features you will never have in a commercial product, to make the effort worthwhile. I thought, what about randomly picking a few for display from amongst all the pictures, displaying that subset for a few days, and then moving on to a new randomly selected sample of images, etc? That should produce a nice review of all of them over time, eventually. You need an approach like that because you will never get to the end if you just try to display 40000 images in order!

Equipment

This work was done on a Raspberry Pi 3 running Raspbian Lite (more on that later). I used a display custom-built for the RPi, Amazon.com: Raspberry Pi 7″ Touch Screen Display: Electronics), though I believe any HDMI display would do.

The scripts
Here is the master file which I call master.sh.

                    
#!/bin/sh
# DrJ 8/2019
# call this from cron once a day to refesh random slideshow once a day
RANFILE=”random.list”
NUMFOLDERS=20
DISPLAYFOLDER=”/home/pi/Pictures”
DISPLAYFOLDERTMP=”/home/pi/Picturestmp”
SLEEPINTERVAL=3
DEBUG=1
STARTFOLDER=”MaryDocs/Pictures and videos”

echo “Starting master process at “`date`

rm -rf $DISPLAYFOLDERTMP
mkdir $DISPLAYFOLDERTMP

#listing of all Google drive files starting from the picture root
if [ $DEBUG -eq 1 ]; then echo Listing all files from Google drive; fi
rclone ls remote:”$STARTFOLDER” > files

# filter down to only jpegs, lose the docs folders
if [ $DEBUG -eq 1 ]; then echo Picking out the JPEGs; fi
egrep ‘\.[jJ][pP][eE]?[gG]$’ files |awk ‘$1 > 11000 {$1=””; print substr($0,2)}’|grep -i -v /docs/ > jpegs.list

# throw NUMFOLDERS or so random numbers for picture selection, select triplets of photos by putting
# names into a file
if [ $DEBUG -eq 1 ]; then echo Generate random filename triplets; fi
./random-files.pl -f $NUMFOLDERS -j jpegs.list -r $RANFILE

# copy over these 60 jpegs
if [ $DEBUG -eq 1 ]; then echo Copy over these random files; fi
cat $RANFILE|while read line; do
rclone copy remote:”${STARTFOLDER}/$line” $DISPLAYFOLDERTMP
sleep $SLEEPINTERVAL
done

# rotate pics as needed
if [ $DEBUG -eq 1 ]; then echo Rotate the pics which need it; fi
cd $DISPLAYFOLDERTMP; ~/rotate-as-needed.sh
cd ~

# kill any qiv slideshow
if [ $DEBUG -eq 1 ]; then echo Killing old qiv and fbi slideshow; fi
pkill -9 -f qiv
sudo pkill -9 -f fbi
pkill -9 -f m2.pl

# remove old pics
if [ $DEBUG -eq 1 ]; then echo Removing old pictures; fi
rm -rf $DISPLAYFOLDER

mv $DISPLAYFOLDERTMP $DISPLAYFOLDER

#run looping fbi slideshow on these pictures
if [ $DEBUG -eq 1 ]; then echo Start fbi slideshow in background; fi
cd $DISPLAYFOLDER ; nohup ~/m2.pl >> ~/m2.log 2>&1 &

if [ $DEBUG -eq 1 ]; then echo “And now it is “`date`; fi

I call the following script random-files.pl:

                    

#!/usr/bin/perl
use Getopt::Std;
my %opt=();
getopts("c:df:j:r:",\%opt);
$nofolders = $opt{f} ? $opt{f} : 20;
$DEBUG = $opt{d} ? 1 : 0;
$cutoff = $opt{c} ? $opt{c} : 5;
$cutoffS = 60*$cutoff;
$jpegs = $opt{j} ? $opt{j} : "jpegs.list";
$ranpicfile = $opt{r} ? $opt{r} : "jpegs-random.list";
print "d,f,j,r: $opt{d}, $opt{f}, $opt{j}, $opt{r}\n" if $DEBUG;
open(JPEGS,$jpegs) || die "Cannot open jpegs listing file $jpegs!!\n";
@jpegs = ;
# remove newline character
$nopics = chomp @jpegs;
open(RAN,"> $ranpicfile") || die "Cannot open random picture file $ranpicfile!!\n";
for($i=0;$i<$nofolders;$i++) {
  $t = int(rand($nopics-2));
  print "random number is: $t\n" if $DEBUG;
# a lot of our pics follow this naming convention
# 20160831_090658.jpg
  ($date,$time) = $jpegs[$t] =~ /(\d{8})_(\d{6})/;
  if ($date) {
    print "date, time: $date $time\n" if $DEBUG;
# ensure neighboring picture is at least five minutes different in time
    $iPO = $iP = $diff = 0;
    ($hr,$min,$sec) = $time =~ /(\d\d)(\d\d)(\d\d)/;
    $secs = 3600*$hr + 60*$min + $sec;
    print "Pre-pic logic\n";
    while ($diff < $cutoffS) {
      $iP++;
      $priorPic = $jpegs[$t-$iP];
      $Pdate = $Ptime = 0;
      ($Pdate,$Ptime) = $priorPic =~ /(\d{8})_(\d{6})/;
      ($Phr,$Pmin,$Psec) = $Ptime =~ /(\d\d)(\d\d)(\d\d)/;
      $Psecs = 3600*$Phr + 60*$Pmin + $Psec;
      print "hr,min,sec,Phr,Pmin,Psec: $hr,$min,$sec,$Phr,$Pmin,$Psec\n" if $DEBUG;
      $diff = abs($secs - $Psecs);
      print "diff: $diff\n" if $DEBUG;
# end our search if we happened upon different dates
      $diff = 99999 if $Pdate ne $date;
    }
# post-picture logic - same as pre-picture
    print "Post-pic logic\n";
    $diff = 0;
    while ($diff < $cutoffS) {
      $iPO++;
      $postPic = $jpegs[$t+$iPO];
      $Pdate = $Ptime = 0;
      ($Pdate,$Ptime) = $postPic =~ /(\d{8})_(\d{6})/;
      ($Phr,$Pmin,$Psec) = $Ptime =~ /(\d\d)(\d\d)(\d\d)/;
      $Psecs = 3600*$Phr + 60*$Pmin + $Psec;
      print "hr,min,sec,Phr,Pmin,Psec: $hr,$min,$sec,$Phr,$Pmin,$Psec\n" if $DEBUG;
      $diff = abs($Psecs - $secs);
      print "diff: $diff\n" if $DEBUG;
# end our search if we happened upon different dates
      $diff = 99999 if $Pdate ne $date;
    }
  } else {
    $iP = $iPO = 2;
  }
  $priorPic = $jpegs[$t-$iP];
  $Pic = $jpegs[$t];
  $postPic = $jpegs[$t+$iPO];
  print RAN qq($priorPic
$Pic
$postPic
);
}
close(RAN);

Bunch of simple python scripts

I call this one getinfo.py:

                    
#!/usr/bin/python3
import os,sys
from PIL import Image
from PIL.ExifTags import TAGS

for (tag,value) in Image.open(sys.argv[1])._getexif().items():
print (‘%s = %s’ % (TAGS.get(tag), value))

print (‘%s = %s’ % (TAGS.get(tag), value))

And here’s rotate.py:

                    
#!/usr/bin/python3
import PIL, os
import sys
from PIL import Image

picture= Image.open(sys.argv[1])

# if orientation is 6, rotate clockwise 90 degrees
picture.rotate(-90,expand=True).save(“rot_” + sys.argv[1])

While here is rotatecc.py:

                    
#!/usr/bin/python3
import PIL, os
import sys
from PIL import Image

picture= Image.open(sys.argv[1])

# if orientation is 8, rotate counterclockwise 90 degrees
picture.rotate(90,expand=True).save(“rot_” + sys.argv[1])

And rotate-as-needed.sh:

                    
#!/bin/sh
# DrJ 12/2020
# some of our downloaded files will be sideways, and fbi doesn’t auto-rotate them as far as I know
# assumption is that are current directory is the one where we want to alter files
ls -1|while read line; do
echo fileis “$line”
o=`~/getinfo.py “$line”|grep -ai orientation|awk ‘{print $NF}’`
echo orientation is $o
if [ “$o” -eq “6” ]; then
echo “90 clockwise is needed, o is $o”
# rotate and move it
~/rotate.py “$line”
mv rot_”$line” “$line”
elif [ “$o” -eq “8” ]; then
echo “90 counterclock is needed, o is $o”
# rotate and move it
~/rotatecc.py “$line”
mv rot_”$line” “$line”
fi
don

And finally, m2.pl:

                    

#!/usr/bin/perl
# show the pics ; rotate the screen as needed
# for now, assume the display is in a neutral
# orientation at the start
use Time::HiRes qw(usleep);
$DEBUG = 1;
$delay = 6; # seconds between pics
$mdelay = 200; # milliseconds
$mshow = "$ENV{HOME}/mediashow";
$pNames = "$ENV{HOME}/pNames";
# pics are here
$picsDir = "$ENV{HOME}/Pictures";

chdir($picsDir);
system("ls -1 > $pNames");
# forther massage names
open(TMP,"$pNames");
@lines = ;
foreach (@lines) {
  chomp;
  $filesNullSeparated .= $_ . "\0";
}
open(MS,">$mshow") || die "Cannot open mediashow file $mshow!!\n";
print MS $filesNullSeparated;
close(MS);
print "filesNullSeparated: $filesNullSeparated\n" if $DEBUG;
$cn = @lines;
print "$cn files\n" if $DEBUG;
# throw up a first picture - all black. Trick to make black bckgrd permanent
system("sudo fbi -a --noverbose -T 1 $ENV{HOME}/black.jpg");
system("sudo fbi -a --noverbose -T 1 $ENV{HOME}/black.jpg");
sleep(1);
system("sleep 2; sudo killall fbi");
# start infinitely looping fbi slideshow
for (;;) {
# then start slide show
# shell echo cannot work with null character so we need to use a file to store it
    #system("cat $picNames|xargs -0 qiv -DfRsmi -d $delay \&");
    system("sudo xargs -a $mshow -0 fbi -a --noverbose -1 -T 1  -t $delay ");
# fbi runs in background, then exits, so we need to monitor if it's still alive
# wait appropriate estimated amount of time, then look aggressively for fbi
    sleep($delay*($cn - 2));
    for(;;) {
      open(MON,"ps -ef|grep fbi|grep -v grep|") || die "Cannot launch ps -ef!!\n";
      $match = ;
      if ($match) {
        print "got fbi match\n" if $DEBUG > 1;
        } else {
        print "no fbi match\n" if $DEBUG;
# fbi not found
          last;
      }
      close(MON);
      print "usleeping, noexist is $noexit\n" if $DEBUG > 1;
      usleep($mdelay);
    } # end loop testing if fbi has exited
} # close of infinite loop

You’ll need to make these files executable. Something like this should work:

$ chmod +x *.py *.pl *.sh

My crontab file looks like this (you edit crontab using the crontab -e command):

@reboot sleep 25; cd ~ ; ./m2.pl >> ./m2.log 2>&1
24 16 * * * ./master.sh >> ./master.log 2>&1

This invokes master.sh once a day at 4:24 PM to refresh the 60 photos. My refresh took about 13 minutes the other day, but the old slideshow keeps playing until almost the last second, so it’s OK.

The nice thing about this approach is that fbi works with a lightweight OS – Raspbian Lite is fine, you’ll just need to install a few packages. My SD card is unstable or something, so I have to re-install the OS periodically. An install of Raspberry Pi Lite on my RPi 4 took 11 minutes. Anyway, fbi is installed via:

$ sudo apt-get install fbi

But if your RPi is freshly installed, you may first need to do a

$ sudo apt-get update && sudo apt-get upgrade

python image manipulation

The drawback of this approach, i.e., not using qiv, is that we gotta do some image manipulation, for which python is the best candidate. I’m going by memory. I believe I installed python3, perhaps as sudo apt-get install python3. Then I needed pip3: sudo apt-get install python3-pip. Then I needed to install Pillow using pip3: sudo pip3 install Pillow.

m2.pl refers to a black.jpg file. It’s not a disaster to not have that, but under some circumstances it may help. There it is!

Many of my photos do not have EXIF information, yet they can still be displayed. So for those photos running getinfo.py will produce an error (but the processing of the other photos will continue.)

I was originally rotating the display 90 degrees as needed to display the photos with the using the maximum amount of display real estate. But that all broke when I tried to revive it. And the cheap servo motor was noisy. But folks were pretty impressed when I demoed it, because I did it get it the point where it was indeed working correctly.

Picture selection methodology

There are 20 “folders” (random numbers) of three triplets each. The idea is to give you additional context to help jog your memory. The triplets, with some luck, will often be from the same time period.

I observed how many similar pictures are adjacent to each other amongst our total collection. To avoid identical pictures, I require the pictures to be five minutes apart in time. Well, I cheated. I don’t pull out the timestamp from the EXIF data as I should (at least not yet – future enhancement, perhaps). But I rely on a file-naming convention I notice is common – 20201227_134508.jpg, which basically is a timestamp-encoded name. The last six digits are HHMMSS in case it isn’t clear.

Rclone

You must install the rclone package, sudo apt-get install rclone.

Can you configure rclone on a headless Raspberry Pi?

Indeed you can. I know because I just did it. You enable your Pi for ssh access. do the rclone-config (or whatever it’s called) using putty from a Windows 10 system. You’ll get a long Google URL in the course of configuring that you can paste into your browser. You verify it’s you, log into your Google account. Then you get back a url like http://127.0.0.1:5462/another-long-url-string. Well, put that url into your clipboard and in another login window, enter curl clipboard_contents

That’s what I did, not certain it would work, but I saw it go through in my rclone-config window, and that was that!

Don’t want to deal with rclone?

So you want to use a traditional flash drive you plug in to a USB port, just like you have for the commerical photo frames, but you otherwise like my approach of randomizing the picture selection each day? I’m sure that is possible. A mid-level linux person could rip out the rclone stuff I have embedded and replace as needed with filesystem commands. I’m imagining a colossal flash drive with all your tens of thousands of pictures on it where my random selection still adds value. If this post becomes popular enough perhapsI will post exactly how to do it.

Getting started with this

After you’ve done all that, and want to try it out. you can run

$ ./master.sh

First you should see a file called files growing in size – that’s rclone doing its listing. That takes a few minutes. Then it generates random numbers for photo selection – that’s very fast, maybe a second. Then it slowly copies over the selected images to a temporary folder called Picturestmp. That’s the slowest part. If you do a directory listing you should see the number of images in that directory growing slowly, adding maybe three per minute until it reaches 60 of them. Finally the rotation are applied. But even if you didn’t set up your python environment correctly, it doesn’t crash. It effectively skips the rotations. A rotation takes a couple seconds per image. Finally all the images are copied over to the production area, the directory called Pictures; the old slideshow program is “killed,” and the new slideshow starts up. Whole process takes around 15 minutes.

I highly recommend running master.sh by hand as just described to make sure it all works. Probably some of it won’t. I don’t specialize in making recipes, more just guidance. But if you’re feeling really bold you can just power it up and wait a day (because initially you won’t have any pictures in your slideshow) and pray that it all works.

Still missing

I’d like to display a transition image when switching from the current set of photos to the new ones.

Suppressing boot up messages might be nice for some. Personally I think they’re kind of cool – makes it look like you’ve done a lot more techie work than you actually have!

You’re going to get some junk images. I’ve seen where an image is a thumbnail (I guess) and gets blown up full screen so that you see these giant blocks of pixels. I could perhaps magnify those kind of images less.

Movies are going to be tricky so let’s not even go there…

I was thinking about making it a navigation-enabled photo frame, such as integration with a Gameboy controller. You could do some really awesome stuff: Pause this picture; display the location (town or city) where this photo was taken; refresh the slideshow. It sounds fantastical, but I don’t think it’s beyond the capability of even modestly capable hobbyist programmers such as myself.

I may still spin the frame 90 degrees this way an that. I have the servo mounted and ready. Just got to revive the control commands for it.

References and related

This 7″ display is a little small, but it’s great to get you started. It’s $64 at Amazon: Amazon.com: Raspberry Pi 7″ Touch Screen Display: Electronics

I have an older approach using qiv which I lost the files for, and my blog post got corrupted. Hence this new approach.

In this slightly more sophisticated approach, I make a greater effort to separate the photos in time. But I also make a whole bunch of other improvements as well. But it’s a lot more files so it may only be appropriate for a more seasoned RPi command-line user.

My advanced slideshow treatment is beginning to take shape. I just add to it while I develop it, so check it periodically if that is of interest. Raspberry Pi advanced photo frame.

Categories
Security Web Site Technologies

Who’s hacking Drjohnstechtalk lately?

Intro

This headline was inspired by years of listening to our managed service providers: overpromise and underdeliver! Who’s hacking my web site? I have no idea. But what I can deliver is a list of badly behaved IP addresses over the last 24 hours.

Let’s do it

So, here is a dynamically-compiled list of offenders who have “hacked” my web site over the last 24 hours. They are IP addresses caught trying to fetch non-existent web pages (such as the default login page) or post unauthorized content to the site such as spammy comments.

Without further ado, here are the latest IPs which include up-to-the-minute entries.

What are they?

I don’t think it’s anything glamorous like an actual black hat scheming to crack through my site’s defenses, which would probably fall pretty quickly! It looks like a lot of the same type of probes coming from different IPs. So I suspect the work of a botnet that crawls through promising-sounding WordPress sites, looking for weak ones. Probably thousands of bots – things like compromised security cameras and poorly configured routers (IoT) orchestrated by a Command and Control station under the control of a small group of bad actors.

And there is probably a bit of access from “security researchers” (ethical hackers) who look for weaknesses that they can responsibly disclose. I’m imagining this scenario: a security researcher discovers a 0-day WordPress vulnerability and wants to make a blanket statement to the effect: 30% of all WordPress sites are vulnerable to this 0-day exploit. So they have to test it. Well, I don’t want to be anyone’s statistic. So no thank you.

But I don’t have time to deal with any of that. It’s one strike and you’re out at my site: I block every single one of these IPs doing these things, even based on a single offense.

Actual example hacks

Here are some from November 2020:

100.26.218.97 - - [22/Nov/2020:13:31:13 -0500] 704 "GET /blog/ HTTP/1.1" 200 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4240.193 Safari/537.36" 818
100.26.218.97 - - [22/Nov/2020:13:31:14 -0500] 1 "GET /blog//wp-includes/wlwmanifest.xml HTTP/1.1" 200 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4240.193 Safari/537.36" 386
100.26.218.97 - - [22/Nov/2020:13:31:14 -0500] 409 "GET /blog//wp-login.php HTTP/1.1" 404 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4240.193 Safari/537.36" 371

Note the access at the end to /blog//wp-login.php, a link which does not exist on my site! I imagine the user agent is spoofed. Fate: never again to access my site.

46.119.172.173 - - [22/Nov/2020:12:31:43 -0500] 26103 "POST /blog//xmlrpc.php HTTP/1.1" 200 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4240.193 Safari/537.36" 1094

This one (above) is an xmlrpc.php example. The next one is a bit more infuriating to me – a blatant command injection attempt:

45.146.164.211 - - [22/Nov/2020:09:58:43 -0500] 673 "GET /blog/ HTTP/1.1" 200 "https://50.17.188.196:443/index.php?s=/Index/\\think\\app/invokefunction&function=call_user_func_array&vars[0]=md5&vars[1][]=HelloThinkPHP21" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36" 743

I caught it due to the presence of index.php – another string which does not have a legit reason to appear in my access log, AFAIK.

Then there’s the bot trying to pull a non-existent .env (which, if it existed, might have contained environment variables which might have provided hints about the inner workings of the site):

54.226.98.220 - - [22/Nov/2020:09:48:59 -0500] 1248 "GET /.env HTTP/1.1" 404 "-" "python-requests/2.25.0" 184

The 404 status code means not found.

And this one may be trying to convey a message. I don’t like it:

69.30.226.234 - - [12/Nov/2020:00:24:00 -0500] 623 "GET /blog/2011/08/http://Idonthaveanywebsite... HTTP/1.1" 301 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)" 723

Discussion

By looking for specific strings I realize I am implementing a very poor man’s version of a Web Application Firewall. Commercial WAFs are amazing to me – I know because i work with them. They have thousands of signatures, positive and negative matches, stuff you’d never even dream about. I can’t afford one for my self-hosted and self-funded site.

A word about command injection

If you look at the top 10 web site exploits, command injection is #1. A bunch of security vendors got together to help web site operators understand the most common threats by cataloging and explaining them in easy-to-understand terms. It’s pretty interesting. https://owasp.org/www-project-top-ten/

Conclusion

Sadly, the most common visitor to me web site are bots up to no good. I have documented whose hitting me up in real time, in case this proves to be of interest to the security community. Actual offending lines from my access file have been provided to make everything more concrete.

I have offered a very brief security discussion.

I don’t know who’s hacking me, or what’s hacking me, but I have shared a lot of information not commonly shared.

References and related

A great commercial web application firewall (WAF) is offered by F5.

Here’s the link to the top 10 web site exploits in clear language: https://owasp.org/www-project-top-ten/

Categories
Admin Web Site Technologies

Building a regular (non-bloggy) web site with WordPress

Intro

I recently was a first-hand witness to the building of a couple web sites. I was impressed as the webmaster turned them into “regular” web sites – some bit of marketing, some practical functionality – and removed all the traditional blog components. Here are some of the ingredients.

The ingredients

Background images and logo

unsplash.com – a place to look for quality, non-copyrighted images on a variety of topics. These can serve as a background image to the home page for instance.

looka.com – a place to do your logo design.

Theme

Astra

Security Plugins

WPS Hide Login

Layout Plugins

Elementor

Envato Elements

Form Plugins

Contact Form 7

Contact Form 7 Captcha

Ninja Forms. Note that Ninja Forms 3 includes Google’s reCAPTCHA, so no need to get that as a separate plugin. I am trying to work with Ninja Forms for my contact form.

Infrastructure Plugins

WP Mail SMTP – my WordPress server needs this but your mileage may vary.

How-to videos

I don’t have this link yet.

Reference and related

To sign up for an API key for Google’s reCAPTCHA, go here: http://www.google.com/recaptcha/admin

Categories
TCP/IP Uncategorized Web Site Technologies

The IT Detective Agency: web site not accessible

Intro
In this spellbinding segment we examine what happened when a user found an inaccessible web site.


Some details
The user in a corporate environment reports not being able to access https://login.smartnotice.net/. She has the latest version of Windows 10.


On the trail
I sense something is wrong with SSL because of the type of errors reported by the browser. Something to the effect that it can’t make a secure connection.


But I decided to doggedly pursue it because I have a decent background in understanding SSL-related problems, and I was wondering if this was the first of what might be a systemic problem. I’m always interested to find little problem and resolve them in a way that addresses bigger issues.


So the first thing I try to lean more about the SSL versions and ciphers supported is to use my Go-To site, ssllabs.com, Test your Server: https://www.ssllabs.com/ssltest/. Well, this test failed miserably, and in a way I’ve never seen before. SSLlabs just quickly gave up without any analysis! So we pushed ahead, undaunted.


So I hit the site with curl from my CentOS 8 server (Upgrading WordPress brings a thicket of problems). Curl works fine. But I see it prefers to use TLS 1.3. So I finally buckle down and learn how to properly cnotrol the SSL/TLS version in curl. The output from curl -help is misleading, shall we say?


You think using curl –tlsv1.2 is going to use TLS v 1.2? Think again. Maybe it will, or maybe it won’t. In fact it tells curl to use TLS version 1.2 or higher. I totally missed understanding that for all these years.
What I’m looking for is to determine if the web site is willing to use TLS v 1.2 in addition to TLS v 1.3.


The ticket is … –tls-max 1.2 . This sets the maximum TLS version curl will use to access the URL.


So we have
curl -v –tls-max 1.3 https://login.smartnotice.net/

<!-- /* Font Definitions */ @font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2 4; mso-font-charset:1; mso-generic-font-family:roman; mso-font-format:other; mso-font-pitch:variable; mso-font-signature:0 0 0 0 0 0;} @font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4; mso-font-charset:0; mso-generic-font-family:swiss; mso-font-pitch:variable; mso-font-signature:-469750017 -1073732485 9 0 511 0;} /* Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {mso-style-unhide:no; mso-style-qformat:yes; mso-style-parent:""; margin-top:0in; margin-right:0in; margin-bottom:8.0pt; margin-left:0in; line-height:107%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri",sans-serif; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} .MsoChpDefault {mso-style-type:export-only; mso-default-props:yes; font-family:"Calibri",sans-serif; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} .MsoPapDefault {mso-style-type:export-only; margin-bottom:8.0pt; line-height:107%;} @page WordSection1 {size:8.5in 11.0in; margin:1.0in 1.0in 1.0in 1.0in; mso-header-margin:.5in; mso-footer-margin:.5in; mso-paper-source:0;} div.WordSection1 {page:WordSection1;} -->
*   Trying 104.18.27.134...
* TCP_NODELAY set
* Connected to login.smartnotice.net (104.18.27.134) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
...
html head

But

curl -v –tls-max 1.2 https://login.smartnotice.net/

<!-- /* Font Definitions */ @font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2 4; mso-font-charset:1; mso-generic-font-family:roman; mso-font-format:other; mso-font-pitch:variable; mso-font-signature:0 0 0 0 0 0;} @font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4; mso-font-charset:0; mso-generic-font-family:swiss; mso-font-pitch:variable; mso-font-signature:-469750017 -1073732485 9 0 511 0;} /* Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {mso-style-unhide:no; mso-style-qformat:yes; mso-style-parent:""; margin-top:0in; margin-right:0in; margin-bottom:8.0pt; margin-left:0in; line-height:107%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri",sans-serif; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} .MsoChpDefault {mso-style-type:export-only; mso-default-props:yes; font-family:"Calibri",sans-serif; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} .MsoPapDefault {mso-style-type:export-only; margin-bottom:8.0pt; line-height:107%;} @page WordSection1 {size:8.5in 11.0in; margin:1.0in 1.0in 1.0in 1.0in; mso-header-margin:.5in; mso-footer-margin:.5in; mso-paper-source:0;} div.WordSection1 {page:WordSection1;} -->
*   Trying 104.18.27.134...
* TCP_NODELAY set
* Connected to login.smartnotice.net (104.18.27.134) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS alert, protocol version (582):
* error:1409442E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version
* Closing connection 0
curl: (35) error:1409442E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version

So now we know, this web site requires the latest and greatest TLS v 1.3.
Even TLS 1.2 won’t do.

Well, this old corporate environment still offered users a choice of old
browsers, including IE 11 and the old Edge browser. These two browsers simply do not support TLS 1.3. But I fuond even Firefox wasn’t working, although the Chrome browser was.

How to explain all that? How to fix it?

It comes down to a good knowledge of the particular environment. As I think I stated, the this corporate environment uses proxies, which in turn, most
likely, tried to SSL intercept the traffic. The proxies are old so they in turn
don’t actually support SSL interception of TLS v 1.3! They had separate
problems with Chrome browser so they weren’t intercepting its traffic. This explains why FF was broken yet Chrome worked.

So the fix, such as it was, was to disable SSL interception for this request
URL so that Firefox would work, and tell the user to use either FF or Chrome.

Just being thorough, when i tested from home with Edge Chromium – the newer Edge browser – it worked and SSLlabs showed (correctly) that it supports TLS 1.3. Edge in the corporate environment is the older, non-Chromium one. It seems to max out at TLS 1.2. No good.

For good measure I explained the situation to the desktop support people.

Case: closed.

Appendix

How did I decide the proxies didn’t support TLS 1,3? What if this site had some other issue after all? I looked on the web for another web site which only supports TLS 1.3. I thought hopefully badssl.com would have one. But they don’t! Undaunted yet again, I determined to change my own web site, drjohnstechtalk.com, into one that only supports TLS 1.3! This is easy to do with apache web server. You basically need a line that looks like this:

SSLProtocol all -SSLv3 -TLSv1 -TLSv1.1 -TLSv1.2

Categories
Admin Network Technologies Web Site Technologies

Examining certificates over explicit proxy with openssl

Intro
This is pretty esoteric, but I’ve personally been waiting for this for a long time. It seems that beginning with openssl 1.1, the s_client sub-menu has had support for a proxy setting. Until then it was basically impossible to examine the certificates your proxy was sending back to users.

The syntax is something like:

openssl s_client -proxy <proxy_ip>:<proxy_port> -servername expired.badssl.com -showcerts -connect expired.badssl.com:443

where the proxy is a standard HTTP proxy.

Why is it a great thing? If your proxy does SSL interception then it is interfering with with the site’s normal certificate. And worse, it can good. What if its own signing certificate has expired?? I’ve seen it happen, and it isn’t pretty…

To find the openssl version just run openssl version.

My SLES12 SP4 servers have a version which is too old. My Cygwin install is OK, actually. My Redhat 7.7 has a version which is too old. I do have a SLES 15 server which has a good version. But even the version on my F5 devices is too old, surprisingly.

References and related
the openssl project home page: https://www.openssl.org/

A few of my favorite openssl commands.

Categories
Network Technologies Web Site Technologies

The IT Detective Agency: the case of Failed to convert character

Intro
A user of a web form noticed any password that includes an accented character is rejected. He came to use as the operator of the web application firewall for a fix.

More details
The web server was behind an F5 device running ASM – application security manager. The reported error that we saw was Failed to convert character. What does it all mean?

One suggestion is that the policy may have the wrong language, but the application language of this policy is unicode (utf-8), just like all our others we set up. And they don’t have any issues. I see where I can remove the block on this particular input violation, but that seems kind of an extreme measure, like throwing out the baby with the bathwater.

I wondered about a more granular way to deal with this?

Check characters on this parameter value is already disabled I notice, so we can’t further loosen there.

Ask the expert
So I ask someone who speaks a foreign language and has to deal with this stuff a lot more than I do. He responds:

Looking at the website I think that form just defaults to ISO-8859-1 instead of UTF-8 and that causes your problem.
Umlauts or accented letters are double byte encoded in UTF-8 and single byte in ISO-8859-1

To confirm the problem with the form, he enters an “ä” as the username, which the event log shows encoded to %E4 which is not a valid UTF-8 sequence.

Our takeaway
To repeat a key learning from this little problem:
Umlauts or accented letters are double byte encoded in UTF-8 and single byte in ISO-8859-1

So the web form itself was the problem in this case; and I went back to the user/developer with this informatoin.

So he fixed it?
Well, turns out his submission form was a private page he quickly threw together to test another problem, the real problem, when he noticed this particular issue.

So, yes, his form needed to mention utf-8 if he were going to properly encode accented characters, but that did not resolve the real issue, which remains unresolved.

It happens that way sometimes.

But, yes, the problem reported to us was resolved by the developer based on our feedback, so at least we have that success.

Conclusion
If like me, your eyes glaze over when someone mentions ISO-8859-1 versus UTF-8, the differences are pretty stark, easy-to-understand, and, just sometimes, really, important! I think ISO-8859-1 will represent some of the popular accented characters in positions 128 – 255, but not utf-8. utf-8 will use additional bytes to represent characters outside of the Latin alphabet plus the usual special characters.

We’ll call this one Case Closed!

References and related
I like to do a man ascii on any linux system to see the representation of the various Latin characters. I had to install the man-pages package on my RHEL system before that man page was available on my system.

Categories
Web Site Technologies

How to POST with curl

Intro
For the hard-core curl fans I find these examples useful.

Example 1
Posting in-line form data, e.g., to an api:

$ curl ‐d ‘hi there’ https://drjohns.com/api/example

Well, that might work, but I normally add more switches.

Example 2

$ curl ‐iksv ‐d ‘hi there’ https://drjohns.com/api/example|more

Perhaps you have JSON data to POST and it would be awkward or impossible to stuff into the command line. You can read it from a file like this:

Example 3

$ curl ‐iksv ‐d @json.txt https://drjohns.com/api/example|more

Perhaps you have to fake a useragent to avoid a web application firewall. It actually suffices to identify with the -A Mozilla/4.0 switch like this:

Example 4

$ curl ‐A Mozilla/4.0 ‐iksv ‐d @json.txt https://drjohns.com/api/example|more

Suppose you are behind a proxy. Then you can tack on the -x switch like this next example.

Example 5

$ curl ‐A Mozilla/4.0 ‐x myproxy:8080 ‐iksv ‐d @json.txt https://drjohns.com/api/example|more

Those are the main ones I use for POSTing data while seeing what is going on. You can also add a maximum time (-m I think).

Example 6

If you’re sending JSON data, you ought to declare it with a content-type header:

$ curl ‐A Mozilla/4.0 ‐H ‘Content-type: application/json’ ‐iksv ‐d @json.txt https://drjohns.com/api/example|more

POSTman
Just overhearing people talk, I believe that “normal” people use a tool called POSTman to do similar things: POST XML, SOAP or JSON data to an endpoint. I haven’t had a need to use it or even to look into it myself. yet.

Conclusion
We have documented some useful switches in curl. POSTing data occurs when using APIs, e.g., RESTful APIs, so these techniques are useful to master. Roadblocks thrown up by web application firewalls or proxy servers can also be easily overcome.

Categories
IT Operational Excellence Network Technologies Web Site Technologies

F5 Big-IP: When your virtual server does not present your chain certificate

Intro
While I was on vacation someone replaced a certificate which had expired on the F5 Big-IP load balancer. Maybe they were not quite as careful as I would like to hope I would have been. In any case, shortly afterwards our SiteScope monitoring reported there was an untrusted server certificate chain. It took me quite some digging to get to the bottom of it.

The details
Well, the web site came up just fine in my browser. I checked it with SSLlabs and its grade was capped at B because of problems with the server certificate chain. I also independently confirmed usnig openssl that no intermediate certificate was being presented by this virtual server. To see what that looks like with an exampkle of this problem knidly privided by badssl.com, do:

$ openssl s_client ‐showcerts ‐connect incomplete-chain.badssl.com:443

CONNECTED(00000003)
depth=0 /C=US/ST=California/L=San Francisco/O=BadSSL Fallback. Unknown subdomain or no SNI./CN=badssl-fallback-unknown-subdomain-or-no-sni
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 /C=US/ST=California/L=San Francisco/O=BadSSL Fallback. Unknown subdomain or no SNI./CN=badssl-fallback-unknown-subdomain-or-no-sni
verify error:num=27:certificate not trusted
verify return:1
depth=0 /C=US/ST=California/L=San Francisco/O=BadSSL Fallback. Unknown subdomain or no SNI./CN=badssl-fallback-unknown-subdomain-or-no-sni
verify error:num=21:unable to verify the first certificate
verify return:1
---
Certificate chain
 0 s:/C=US/ST=California/L=San Francisco/O=BadSSL Fallback. Unknown subdomain or no SNI./CN=badssl-fallback-unknown-subdomain-or-no-sni
   i:/C=US/ST=California/L=San Francisco/O=BadSSL/CN=BadSSL Intermediate Certificate Authority
-----BEGIN CERTIFICATE-----
MIIE8DCCAtigAwIBAgIJAM28Wkrsl2exMA0GCSqGSIb3DQEBCwUAMH8xCzAJBgNV
BAYTAlVTMRMwEQYDVQQIDApDYWxpZm9ybmlhMRYwFAYDVQQHDA1TYW4gRnJhbmNp
...
HJKvc9OYjJD0ZuvZw9gBrY7qKyBX8g+sglEGFNhruH8/OhqrV8pBXX/EWY0fUZTh
iywmc6GTT7X94Ze2F7iB45jh7WQ=
-----END CERTIFICATE-----
---
Server certificate
subject=/C=US/ST=California/L=San Francisco/O=BadSSL Fallback. Unknown subdomain or no SNI./CN=badssl-fallback-unknown-subdomain-or-no-sni
issuer=/C=US/ST=California/L=San Francisco/O=BadSSL/CN=BadSSL Intermediate Certificate Authority
...
    Verify return code: 21 (unable to verify the first certificate)

So you get that message about benig unable to verify the first certificate.

Here’s the weird thing, the certificate in question was issued by Globalsign, and we have used them for years so we had the intermediate certificate configured already in the SSL client profile. The so-called chain certificate was GlobalsignIntermediate. But it wasn’t being presented. What the heck? Then I checked someone else’s Globalsign certificate and found the same issue.

Then I began to get suspicious about the certificate. I checked the issuer more carefully and found that it wasn’t from the intermediate we had been using all these past years. Globalsign changed their intermediate certificate! The new one dates frmo November 2018 and expires in 2028.

And, to compound matters, F5 “helpfully” does not complain and simply does not send the wrong intermediate certificate we had specified in the SSL client profile. It just sends no intermediate certificate at all to accompany the server certificate.

Conclusion
The case of the missing intermediate certificate was resolved. It is not the end of the world to miss an intermediate certificate, but on the other hand it is not professional either. Sooner or later it will get you into trouble.

References and related
badssl.com is a great resource.
My favorite openssl commands can be very helpful.