
Network utilities for Windows

Intro
Today I came across a simple but useful tool which runs on Windows systems that will help determine if a remote host is listening on a particular port. I wanted to share that information.

The details
PortQry is attractive because of its simplicity; plus, it is supported and distributed by Microsoft itself. Its help section reads like this:

PortQry version 2.0
 
Displays the state of TCP and UDP ports
 
 
Command line mode:  portqry -n name_to_query [-options]
Interactive mode:   portqry -i [-n name_to_query] [-options]
Local Mode:         portqry -local | -wpid pid| -wport port [-options]
 
Command line mode:
 
portqry -n name_to_query [-p protocol] [-e || -r || -o endpoint(s)] [-q]
        [-l logfile] [-sp source_port] [-sl] [-cn SNMP community name]
 
Command line mode options explained:
        -n [name_to_query] IP address or name of system to query
        -p [protocol] TCP or UDP or BOTH (default is TCP)
        -e [endpoint] single port to query (valid range: 1-65535)
        -r [end point range] range of ports to query (start:end)
        -o [end point order] range of ports to query in an order (x,y,z)
        -l [logfile] name of text log file to create
        -y overwrites existing text log file without prompting
        -sp [source port] initial source port to use for query
        -sl 'slow link delay' waits longer for UDP replies from remote systems
        -nr by-passes default IP address-to-name resolution
            ignored unless an IP address is specified after -n
        -cn specifies SNMP community name for query
            ignored unless querying an SNMP port
            must be delimited with !
        -q 'quiet' operation runs with no output
           returns 0 if port is listening
           returns 1 if port is not listening
           returns 2 if port is listening or filtered
 
Notes:  PortQry runs on Windows 2000 and later systems
        Defaults: TCP, port 80, no log file, slow link delay off
        Hit Ctrl-c to terminate prematurely
 
examples:
portqry -n myserver.com -e 25
portqry -n 10.0.0.1 -e 53 -p UDP -i
portqry -n host1.dev.reskit.com -r 21:445
portqry -n 10.0.0.1 -o 25,445,1024 -p both -sp 53
portqry -n host2 -cn !my community name! -e 161 -p udp
...

The PortQry “install” consisted of unzipping a ZIP file, so there was no install at all, and no special permissions were needed, which is a plus in my book.

nmap
Of course there is always nmap. I never really got into it much, but clearly you can go nuts with it. One advantage is that it is available on Linux and macOS as well. But in my opinion it is a heavyweight install.

References and related
PortQry

nmap

Some nmap examples I have used.


Monitoring by Zabbix: a working document

Intro
I panned Zabbix in this post: DIY monitoring. But I have compelling reasons to revisit it. I have to say it has matured, but there remain some very frustrating things about it, especially when compared with SiteScope (now owned by Microfocus) which is so much more intuitive.

I am impressed by the breadth of the user base and the documentation, but learning how to do any specific thing is still an exercise in futility.

I am going to try to structure this post as a list of problems encountered, and how they were resolved.

Current production version as of this writing?
Answer: 6.0

Zabbix Manual does not work in Firefox
That’s right. I can’t even read the manual in my version of Firefox. Its sections do not expand. Solution: use Chrome

Which database?
You may see references to MySQL in the Zabbix docs. MySQL is basically dead. What should you do?

Zabbix quick install on Redhat

Answer
Install MariaDB, which has replaced MySQL and supports the same commands, such as the mysql client from the screenshot. On my Redhat instance I have installed these MariaDB-related packages:

mariadb-5.5.64-1.el7.x86_64
mariadb-server-5.5.64-1.el7.x86_64
mariadb-libs-5.5.64-1.el7.x86_64
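
For the record, getting MariaDB up on a Redhat-family system is just a couple of commands. A minimal sketch, assuming RHEL/CentOS 7 with the standard repos:

$ sudo yum install -y mariadb mariadb-server
$ sudo systemctl enable mariadb
$ sudo systemctl start mariadb
$ mysql -u root        # the familiar MySQL client still works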

Terminology confusion
What is a host, a host group, a template, an item, a web scenario, a trigger, a media type?
Answer

Don’t ask me. When I make progress I’ll post it here.

Web scenario specific issues
Can different web scenarios use different proxies?
Answer: Yes, no problem. In really old versions this was not possible. See web scenario screenshot below.

Can the proxy be a variable so that the same web scenario can be used for different proxies?
Answer: Yes. Let’s say you attach a web scenario to a host. In that host’s configuration you can define a “macro” which sets the variable value. e.g., the value of HTTP_PROXY in my example. I think you can do the same from a template, but I’m getting ahead of myself.

Similarly, can you do basic proxy auth and hide the credentials in a MACRO? Answer: I think so. I did it once at any rate. See above screenshot.

Why does my google.com web scenario work whereas my amazon.com scenario does not, when they’re exactly the same except for the URL?
Answer: some ideas, but the logging information is bad. Amazon does not take to bots hitting it for health check reasons. It may work better to change the agent type to Linux|Chrome, which is what I am trying now. Here’s my original answer: Even with command-line curl I get an error through this proxy. That can’t be good:
$ curl -vikL www.amazon.com

...
NSS error -5961 (PR_CONNECT_RESET_ERROR)
* TCP connection reset by peer
* Closing connection 1
curl: (35) TCP connection reset by peer


My amazon.com web scenario is not working (status of 1), yet the dashboard does not show any obvious warning or error or red color. Why? Answer:
no idea. Maybe you have to define a trigger?

Say you’re on the Monitoring|latest data screen. Does the data get auto-updated? Answer: yes, it seems to refresh every 30 seconds.

In Zabbix Latest Data can you control the history displayed via url parameters? By default only one hour of history is displayed. Answer: There is an undocumented feature I have discovered which permits this. Let’s say your normal URL for your direct link to the latest data of item 1234 is https://drjohns.com/history.php?action=showgraph&itemids[]=1234. The modified version of that to display the last day of data is: https://drjohns.com/history.php?action=showgraph&from=now-1d&to=now&itemids[]=1234

In latest data viewing the graph for one item which has a trigger, sometimes the trigger line is displayed as a dashed line and sometimes not at all. Answer: From what I can tell the threshold line is only displayed if the threshold was entered as a number in the trigger condition. Strange. Unfortunate if true.

Why is the official FAQ so useless? Answer: no idea how a piece of software otherwise so feature-rich could have such a useless FAQ.

Zabbix costs nothing. Is it still actively supported? Answer: it seems very actively supported for some reason. Not sure what the revenue model is, however…

Can I force one or more web scenarios to be run immediately? I do this all the time in SiteScope. Answer: I guess not. There is no obvious way.

Suppose you have defined an item. what is the item key? Answer: You define it. Best to make it unique and use contiguous characters. I’m seeing it’s very important…

What is the equivalent to SiteScope’s script monitors? Answer: Either ssh check or external check.

How would you set up a simple PING monitor, i.e., to see if your host is up? Answer: Create an item as a “simple check”, e.g., with the name ping this host, and the key icmpping[{HOST.IP},3]. That can go into a template, by the way. If it succeeds it returns a 1.
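
A companion trigger is sketched below in the old-style syntax this post uses elsewhere ("myhost" is a placeholder). It fires only after three consecutive failed pings, which cuts down on false alarms:

{myhost:icmpping[{HOST.IP},3].max(#3)}=0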

I’ve made an error in my script for an external check. Why does Latest data show nothing at all? Answer: no idea. If the error is bad enough Zabbix will disable the item on you, so it’s not really running any longer. But even when it doesn’t do that, a lot of times I simply see no output whatsoever. Very frustrating.

Help! The Latest Data graph’s Y axis only shows 0’s and 5’s. Answer: Another wonderful Zabbix feature, this happens because your Units are too long. Even “per minute” as Units can get you into trouble if it is trying to draw a Y axis with values 22.0 22.5 23.0, etc: you’ll only see the .0’s and the .5’s. Change units to a maximum of seven characters such as “per min.”

Why is the output from an ssh check truncated, where does the rest go? Answer: no idea.

How do you increase the information contained in the zabbix server log? Let’s say your zabbix server is running normally. Then run this command: zabbix_server -R log_level_increase
You can run it multiple times to keep increasing the verbosity (log level), I think.
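
The runtime control works in both directions, and I believe you can target a specific process type. A sketch (the poller targeting is my assumption from the zabbix_server help, not something I have tested):

$ zabbix_server -R log_level_increase          # one level more verbose
$ zabbix_server -R log_level_increase=poller   # just the poller processes
$ zabbix_server -R log_level_decrease          # back down one level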

Attempting to use ssh items with key authentication fails with “Public key authentication failed: Callback returned error”. Initially I thought Zabbix was broken with regards to ssh public key authentication. I can get it to work with a password. I can use my public/private key to authenticate by hand from the command line as root. Turns out that running a command such as sudo -u zabbix ssh … showed that my zabbix account did not have permission to write to its home directory (which did not even exist). I guess this is a case of RTFM, because they do go over all those steps in the manual. I fixed up the permissions and now it works for me, yeah.
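
A sketch of that diagnosis and fix, assuming the service account is zabbix with home directory /var/lib/zabbix (your paths may differ):

$ sudo -u zabbix ssh someuser@remotehost    # reproduces the error the GUI hides
$ sudo mkdir -p /var/lib/zabbix/.ssh
$ sudo chown -R zabbix:zabbix /var/lib/zabbix
$ sudo chmod 700 /var/lib/zabbix/.ssh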

Where should the scripts for external checks go? In my install it is /usr/lib/zabbix/externalscripts.

Why is the behaviour of triggers inconsistent? Sometimes the same trigger shows the expected behaviour, sometimes not. Answer: No idea. Very frustrating. See more on that topic below.

How do you force a web scenario check when you are using templates? Answer: No idea.

Why do (resolved) Problems disappear no matter how you search for them if they are older than, say, 30 minutes? Answer: No idea. Just another stupid feature I guess.

Why does it say No media defined for user even though user has been set up with email as his media? Answer: no idea.

Why do too many errors disable an ssh check so that you get Status Disabled and have no graceful way to recover? Answer: no idea. It makes sense that Zabbix should not subject itself to too many consecutive errors. But once you’ve fixed the underlying problem the only recovery I can figure is to delete the item and recreate it. Or delete the host and re-create it. Not cool.

I heard dependent items are the way to go to parse complex data coming out of a rich text item. How do you do that? Answer: Yes they are. I have gotten them to work and really give me the fine-grained control I’ve always wanted. I hope to show a real-life example soon. To get started creating a dependent item you can right-click on the dots of an item, or create a new item and choose type Dependent Item.

I am looking at Latest data and one item is grayed out and has no data. Why? Answer: almost no idea. This happens to me in a dependent item formed by a regular expression where the regular expression does not match the content. I am trying to make my RegEx more flexible to match both good and error conditions.

Why do my dependent items, when running a Check Now, say Cannot send request: wrong data type, yet they are producing data just fine when viewed through Latest data? Answer: this happens if you ran a Check Now on your template rather than when viewing an individual host. Make sure you select a host before you run Check Now. Actually, even still it does not work, so final answer: no idea.

Why do some regular expressions check out just fine on regex101.com yet produce a match for value of type “string”: pattern does not match error in Zabbix? Answer: Some idea. Fancy regular expressions do not seem to work for some reason.

Every time I add an item it takes the absolute maximum amount of time before I see data, whether or not I run Check Now until I turn blue in the face. Why? Answer: no idea. Very frustrating.

If the Zabbix server is in one time zone and I am in another, can I have my view of timestamps customized to my time zone? Otherwise I see all times in the timezone of the Zabbix server. Answer: You are out of luck. The suggestion is to run two GUIs, one in your time zone. But there is a but. Support for this has been announced for v 5.2. Stay tuned…

My DNS queries using net.dns don’t do anything. Why? Answer: no idea. Maybe your host is not running an actual Zabbix agent? That’ll do it. Forget that net.dns check if you can’t install an agent. Zabbix has no agentless DNS monitor for some strange reason.

A DNS query which returns many address records fails (such as querying an AD domain), though occasionally succeeds. Why? Answer: So your key looks something like this, right? net.dns.record[10.1.2.3,my-AD-domain.net,A,10,2,tcp]. And when you do the query through dig it works fine, right? E.g., dig +tcp my-AD-domain.net @10.1.2.3. And you’ve set the Zabbix response to type text? It seems to be just another Zabbix bug. You may have to use a script instead. Zabbix support has been able to reproduce this bug and they are working on it as we speak.

What does Check/Execute Now really do? Answer: essentially nothing as far as I can tell. It certainly doesn’t “check now”, i.e., force the item to be run. However, if you have enough permissions, what you can do when you’re looking at an item for a specific Host is to run Test. Then Get Value. I usually get Permission Denied, however.

I want to show multiple things on a dashboard widget graph like an item plus its baseline (Ed: see references for calculating a baseline). What’s the best way? Answer: You can use the add new data set feature for instance to add your baseline. In your additional data set you put your baselines. Then I like to make the width 2, transparency 0 and fill 0. This will turn it into a thin bold line with a complementary color while not messing too much with the original colors of your items. The interface is squirrely, but, hey, it’s Zabbix, what did you expect?

I have a lot of hosts I want to add to a template. Does that Mass Update feature actually work? Answer: yes. Use it. It will save you time.

Help! I accidentally deleted an entire template. I meant to just delete one of its macros. Is there a revert? Answer: it doesn’t look like it. Hope you remember what you did…

It seems if I choose units in an item which have too many characters, e.g., client connections, the graph (in Latest Data) cuts it off and doesn’t even display the scale? Answer: seems so. It’s a bug. This won’t happen when using vector graphs in Dashboard. The graphs in Latest Data are PNG and limited to short Units, e.g., mbps. Changing to vector graphs was on the roadmap but then disappeared.

Can I create a baseline? Nope. It’s on the roadmap. However, see this clever idea for building one on your own without too much effort.

I’ve put a few things on the same Dashboard graph. Why don’t they align? There are these big gaps. Zabbix runs the items when it feels like it, and the result is gaps in data which Zabbix makes no attempt to conceal at the beginning and end of a graph. You can use Scheduling Intervals on your items to gain some control over this. See this article for details.

Besides cloning the whole thing, how can I change the name of a Dashboard? Answer: If you just click to edit a Dashboard the name appears fixed. However, click on the gear icon and that gives you the option to edit the dashboard name. It’s kind of an undocumented feature.

My SNMP MIB has bytes in/out for an interface when what I really want is bandwidth, i.e., Megabits per second. A little preprocessing on a 64-bit bytes value and you are there (32 bit values may roll over too frequently). See this article for details.
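
The gist of that preprocessing, as the steps would appear in the item’s Preprocessing tab (step names from the Zabbix GUI; see the linked article for the full treatment):

1. Change per second             # turns the bytes counter into bytes/sec
2. Custom multiplier: 8          # bytes/sec -> bits/sec
3. Custom multiplier: 0.000001   # bits/sec -> megabits/sec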

In functions like avg (sec|#num,<time_shift>), why is the time_shift argument so restricted? It can’t be a macro, contain a formula like 1w-30m, or anything semi-sophisticated. It just accepts a dumb literal like 5h? Answer: It’s just another shortcoming in Zabbix. How much did you pay for it? 🙂

I have an SNMP template with items for a hostgroup of dispersed servers. Some work fine. The one in Asia returns a few values, but not all. I am using Bulk Request. Answer (to your implied question!): You must have bad performance to that one. Use a Zabbix proxy with a longer timeout for SNMP requests. I was in that situation and that worked for me.

SNMPv3 situation. I have two identical virtual servers monitored by the same Zabbix proxy. Only one works. Command-line testing of snmpwalk looks fine. What could it be? Answer: We are fighting this now. In our case the SNMP v3 engineIDs are identical on the two virtual servers because they were from the same image, whereas, if you read the specs, they are supposed to be unique, like a MAC address. Who knew? And, yes, once we made the engineIDs unique, they were fine in Zabbix.

Riddle: when is 80% not 80%? Answer: when pulling in used storage on a filesystem via SNMP and comparing it to storage size! I had carefully gotten a filesystem 83% full based on the output of df -m. But my trigger, set to go off at 80%, never went off. How could it be? The 83% includes some kind of reserved user space on the filesystem (ext filesystems typically reserve about 5% of blocks for root) which is not included when you do the calculation directly. So I was at 78% or so in actuality. I changed the trigger to 75%.

My trigger for a DNS item, which relies on a simple diff(), goes off from time-to-time yet the response is the same. Why? Answer: We have seen this behavior for a CNAME DNS item. The response changed the case of the returned FQDN from time-to-time, and that is enough to set off the Zabbix diff()-based trigger! We pre-processed the output with a RegEx to just get the bits we wanted to examine to fix this.

Related question. My diff() trigger for a DNS item does NOT go off when the server actually goes down. What’s up with that? Answer: Although you might expect a suddenly unavailable server constitutes a “difference,” in Zabbix’s contorted view of reality it does not. I recommend an additional trigger using the function nodata().
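
A minimal sketch of such a trigger, in the old-style syntax used elsewhere in this post (host and key are placeholders); it fires if the item has produced no data for 10 minutes:

{myhost:net.dns.record[10.1.2.3,my-AD-domain.net,A,10,2,tcp].nodata(600)}=1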

Does the new feature of login using SAML actually work? Answer: Yes, we are using it in Zabbix v 5.0.

My OIDs for my filesystems keep shifting around. What to do? Answer: Use low-level discovery. It’s yet another layer of abstraction and confusion, but it’s probably worth it. I intend to write up my approach in my practical Zabbix examples blog post.

After a Zabbix agent item goes bad (no data), Zabbix refuses to test it for a full 30 minutes after it went bad, despite an update interval of 5 minutes. Why? Answer: In one of the worst architectural decisions of all time, Zabbix created the concept of unsupported items. It works something like this: the very moment when you need to be told Hey there’s something wrong here is when Zabbix goes quiet. Your item became unsupported, which is like being in the penalty box for 30 minutes, during which time nothing works like you naively expected it to. Even the fact that your item became unsupported is almost impossible to find out from a trigger. An example of software which treats this situation correctly is Microfocus SiteScope. In Zabbix version 5.0 there’s a global timeout for all unsupported items. Ours is set to 30 minutes, you see. In some cases that may make sense and prevent Zabbix from consuming too many resources trying to measure things which don’t work. I find it annoying. For DNS, specifically, it is best to use a key of type net.dns and not net.dns.record. That returns a simple 0 or 1 and does not become unsupported if the dns server can’t be reached. V 5.2 will provide some more options around this issue. For an HTTP agent, and I suppose many other items, it’s best to create triggers which use the nodata() function, which can somewhat compensate for this glaring weakness in Zabbix. If you run Zabbix v 5.2, you should use the new preprocessing rule “Check for not supported value” and then set a new value, e.g., “Error”. Then the item won’t become unsupported and can also be used for triggers.

We’ve got SNMP items set up for a host. What’s the best way to alert for a total outage? Answer: I just learned this. This is closely related to the previous question. To avoid that whole unsupported item thing, you make a Zabbix internal item. The key is literally this: zabbix[host,snmp,available] and the type is numeric unsigned. This will continue to poll even if the other host items became unsupported. This is another poorly documented Zabbix feature.
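
And a sketch of a trigger on that internal item (0 means SNMP unavailable; requiring three consecutive 0’s avoids alerting on a single blip; the host name is a placeholder):

{myhost:zabbix[host,snmp,available].max(#3)}=0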

While trying to set up a host for SNMP monitoring I get the error Cannot update Host. Cannot find host interface on host_name for item key item_name. Answer: You probably used an interface type of Agent instead of SNMP. Under Interfaces for the host, add one for type SNMP and remove the Agent one. Or, maybe the reverse: your item type is of type Zabbix agent but your host’s interface is of type SNMP – that combo also produces this error.

In Zabbix my SNMP item shows error No such instance currently exists at this OID, yet my snmpwalk for same shows it works. Why? Answer: In my case I switched to snmpget for my independent testing and reproduced that error, and found that I needed a literal ."0" at the end of the OID (specifically for swap used on an F5 device). Once I included the ."0" (with the double-quotes) in the OID in Zabbix it began to work. In another case I could do the snmpget from the same zabbix proxy where I was getting this error message. The custom MIB was right there in /usr/share/snmp/mibs on the Zabbix proxy. Zabbix hadn’t been restarted in a while. I restarted it and the problem went away.
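
To illustrate the instance-suffix gotcha with a universally available OID (sysDescr here is just an example, not the F5 swap OID), compare these by hand:

$ snmpget -v2c -c public 10.1.2.3 .1.3.6.1.2.1.1.1      # No Such Instance
$ snmpget -v2c -c public 10.1.2.3 .1.3.6.1.2.1.1.1.0    # works: scalar instance 0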

I wish to use a DNS value instead of an IP in net.tcp.service[service,IP,port] because I use geoDNS or round-robin DNS. Can I? Answer: It seems to work, yes.

Can I send alerts to MS teams? Answer: This is obviously a fake question. But the answer is Yes. You set up a Connector in an MS Teams channel. It’s pretty straightforward and it’s pretty cool. I’ll try to publish more in my Zabbix tips post if I have time.

Get a lot of false positives? Answer: Yes! On F5 equipment this one is vexing me:

Resolved: BIG-IP is unreachable via SNMP for 15 minutes

And for others (pool member unavailable for a few minutes) I tried to require two consecutive failures before sending an alert. Basically still working on it.

I have a bunch of HTTP items on this one Zabbix proxy. They all sort of go bad at the same time (false positives) and Zabbix says this agent is unreachable for five minutes around the same time. Answer: Seen that. Short term it may be advisable to create a dependent trigger: https://www.zabbix.com/documentation/5.0/manual/config/triggers/dependencies  Mid-term I am going to ask support about this problem.

Why is the name field truncated in Monitoring | Latest Data, with no possibility to increase it? Answer: If you have Show Details selected you see very few characters. Deselect that.

What, Zabbix version 5.2 RPMs are not available for RHEL 7? Answer: that is correct, unfortunately, as of this writing. You can run as high as v 5.0.7. We are trying to pressure them to provide this compatibility. Lots of people still run Redhat v 7.

Can you send reminder alerts periodically for a problem which persists? Answer: Yes you can. For instance, every four hours. Read all about it in the manual, under Action | Escalations, and look at their examples. However, the documentation is at odds with the product’s behaviour if you have multiple alerts with different durations defined. I am studying it…

Is Zabbix affected by the same hack that infected SolarWinds? Answer: No idea. Let’s see. Developed in Eastern Europe. Basically, no one’s saying. Let’s hope not.

Is Zabbix stupid enough to send multiple alerts for the same problem? Answer: In a word, yes. If you are unlucky enough to have defined overlapping alert conditions in your various alerts, Zabbix will make no effort to consolidate them.

What does it mean when I look at a host and I see inaccessible template? Answer: Most likely explanation is that you don’t have permission to see that template.

Can the y-axis be drawn in a logarithmic scale in a dashboard graph? I have low values (time for a DNS query) which sometimes soar to high ones. Answer: No. This feature has been requested now for almost 10 years and still is lacking. I will try to make a feature request.

Why does our Zabbix agent time out so often? The message is Zabbix agent on hostname is unreachable for five minutes. The problem is sporadic but it really interferes with the items like our simple net.dns checks. Answer: If you use a lot of net.dns agent items you can actually cause this behavior if you are running agent2. The default agent item is passive. We had better luck using an Active Agent item. We had severe but random timeouts and they all went away.

Our Webhook to MS Teams was working fine. Then we set up a new one to a new channel which wouldn’t work at all. A brief error message says invalid Webhook or something. What’s the fix? Answer: It is a known bug which is fixed in v 5.0.8. Of course a lot else could be wrong. In fairness Microsoft changes the format for webhooks from time-to-time so that could be the problem. This Microsoft page is a great resource to do your own testing of the Webhook: Sending messages to Connectors and Webhooks – Teams | Microsoft Docs

The formatting of alert emails is screwy, especially with line breaks in the wrong places. Can I force it to send HTML email to gain more control? Answer: Sort of. You can define a media type where you use HTML email instead of plain text email. I personally don’t have access to do that. But it is not possible to selectively use HTML email within the Custom email form of the alert setup screen. With the more straightforward custom emails, the trick is to put in extra line breaks. A single solitary linebreak is sometimes ignored, especially if the sequence is MACRO-FUNCTION linebreak more text. But if you use two consecutive linebreaks it will inject two linebreaks.

I swear Zabbix is ignoring my macros in trigger functions used in templates which refer to time values in minutes, and just filling in 0 instead. Is that even possible? Answer: I’m still investigating this one. I will withhold my customary sardonic comments about Zabbix until I know who or what is to blame. [Later] I’m thinking this one is on me, not Zabbix.

Do Zabbix items, particularly HTTP items, have the concept of a hidden field to hide confidential data such as passwords from others with the same level of access? Answer: Apparently not. But if you believe in the terrible idea of security by obscurity, you can obscure values by stuffing them into a macro.

My Zabbix admin won’t let me get creative. No external items, no ssh items, etc. I can run some interesting scripts on my linux server. How do I stuff the results into Zabbix? Answer: Install the zabbix_sender utility on your linux server. Then set up an item of type Zabbix trapper. The link to the RPM for zabbix_sender is in the references.
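
A minimal sketch of the pairing (server, host name and item key are placeholders; the host name must match the Zabbix host exactly, and the key must belong to the trapper item):

$ value=$(df -m / | awk 'NR==2 {print $5}' | tr -d '%')    # any interesting script output
$ zabbix_sender -z zabbix.example.com -s "my-linux-host" -k my.custom.key -o "$value"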

These days nothing is either black or white. So when a trigger fires, it’s likely it will return to good status, and then bad, and then good, etc. The alerts are killing us and casual users tend to discount all of them. What to do? Answer: This is common sense, but a very good strategy in these cases is to define a recovery expression for that trigger which looks at the average value for the last 3600 seconds and requires it to be in the good range before the all-clear gets sent out as an alert.
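
A sketch of the idea in old-style trigger syntax (host, item key and thresholds are all made up):

Problem expression:   {myhost:response.time.last()}>5
Recovery expression:  {myhost:response.time.avg(3600)}<2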

I’m using the dynamic host feature in a dashboard. Unfortunately, one of my hosts has a really short name that matches so many other hosts that it never appears in the drop-down list. What to do? Answer: Click the “select” button to the right of the search field. Then you can choose the host group and from there the host. Or rename the host to something more unique.

I wish to add some explanatory text in the dashboard I’ve created. Is it possible? Answer: This is laughably kludgy, but you can do this with a map widget. What you can do is to create a map, add a text box to it, and put your desired text into the text box. But it is hard to get the sizing correct as things shrink when putting the widget on the dashboard.

My top hosts widget is now displaying 0’s. Answer: This happened after we upgraded from v 6.0 to 6.0.8. In characteristically Zabbix illogical fashion, if you now sort by BottomN instead of TopN you should see the expected results (highest on top). Not all our widgets displayed this bug!

I have an item which only runs once a week. Monitoring > Latest Data doesn’t show any values. Is that a bug or feature? Answer: There is a setting somewhere where you can change this behavior. Set it to last two weeks and all will be well.

While using the pyzabbix Zabbix api I had trouble switching from username/password to an authentication token. Answer: Perhaps you installed both py-zabbix as well as pyzabbix? I was confused by this, as there is some overlap. To use the token auth method – preferred by experts – uninstall both these packages and re-install only pyzabbix. I will give an example in my other Zabbix blog post, Practical Zabbix examples.

The trigger.create api call says a dependent triggerid must be passed? Is that really mandatory? It makes no sense. Answer: No. I experimented with it and found you can just leave the dependencies out altogether. The documentation is wrong.

I need to create about 100 custom alerts. Is there seriously no way to do this via the api? Answer: apparently not.

What’s the correct way to send a compound filter expression via the api? Answer: Watch out! If you are trying to filter on suppressed problems, do not put a reference to suppressed in your filter. Instead it goes outside the filter like so: zapi.problem.get(…,suppressed=False,filter={'name':…})

Monitoring > Problems > History view is slow. Then it grays out periodically. Answer: Zabbix is spending all its time figuring out which host groups you have access to. To speed things up, explicitly enter only your accessible host groups in the filter.

geoMAP in Zabbix 6.0 is cool until you blow up a continent and see all the local geographical names written in their native language. So Asian placenames are inscrutable to English speakers. Is there any fix? Answer: You are probably using the provider, OpenStreetMap in this case, which is using localized names. You can switch providers (global setting).

I’m using a RegEx in the regsub function on an LLD macro. What flavor of RegEx is supported and what characters need to be escaped? Answer: Supposedly Perl-compatible (PCRE) RegExes are supported. For anything remotely complex, enclose your RegEx in double-quotes. Then, for good measure put a backslash (\) in front of any double-quote (") you require as a match character, and a backslash in front of any slash (/) match character, plus the usual rules.

Why am I seeing the same host graph twice? Answer: This is a bug I have personally discovered in Zabbix 6.0. It occurs when you have a template with just a single item and a single graph. They will be working on it as of August 2022.

In latest data I see: Value of type “string” is not suitable for value type “numeric unsigned.” Why? Answer: I got this in Zabbix 6.4 when I used zabbix_sender with argument -o 36, which I thought would feed in the integer 36. But no, it got interpreted as a string. I tried to introduce a preprocessing step but I could not get it to work. In the end I created a dependent item with a RegEx to convert it. I made the original item type character. I could not beat this in a simple way.

I can’t get my new agent to be seen by its Zabbix proxy. Error is failed to accept an incoming connection: from [agent]: reading first byte from connection failed: [104] Connection reset. Answer: Are you perhaps running a Palo Alto firewall? They will permit the tcp handshake and then drop the connection with a “reset both sides,” which produces this error. Thus super-simplified connection tests you run by hand with nc/nmap may appear to work.

Does changing the name of a host change its hostid? Answer: No. We have a multi-stage discovery process which relies on this fact.

Does the host’s IP filter accept a subnet mask? Answer: No, it is very primitive. It does accept a partial IP, strangely enough, so 10.9.9 matches 10.9.9.0/24.

A word about SSH checks and triggers
Through the school of hard knocks I have learned that my ssh check is clipping the output from the executed command. So you know that partial data you see when you look at latest data, and thought it was truncated just for display purposes? Nuh-uh. That’s all you’re getting to go up against in your trigger, which sucks. It’s something like 260 characters. I got lucky in a sense to discover this early by running an ssh check against dns resolution of amazon.com. The response I got varied almost every 60 seconds depending on whether or not the response came out of the dns cache. So this was an excellent testbed to learn about the flakiness of triggers as well as waste an entire day.

Another thing about triggers with a regex: as far as I can tell the logic is reversed. So you think you’re defining the OK condition when you seek to match the output and give the match the value 1. Instead, match the desired output for the OK condition but test that the expression equals 0: since a trigger fires when its expression is true, requiring the OK-pattern match to equal 0 means the problem fires exactly when the OK pattern is absent. Only that approach seems to work for me. And getting the regex to treat multiple lines as a unit was also a little tricky. I think by default it favored testing only against the last line.

So let’s say my output as scraped from Monitoring|Latest Data alternated between either

proxy1>test dns amazon.com
Performing DNS lookup for: amazon.com
 
DNS Response data:
Official Host Name: amazon.com
Resolved Addresses:
  205.251.242.103
  176.32.98.166
  176.32.103.205
Cache TTL: 1, cache HIT
DNS Resolver Response: Success

or

proxy1>test dns amazon.com
Performing DNS lookup for: amazon.com
 
Sending A query for amazon.com to 192.168.135.145.
 
Sending A query for amazon.com to 8.8.8.8.
 
DNS Response data:
Official Host Name: amazon.com
Resolved Addresses:
  20

, then here is my iregexp expression which seems to do the correct thing (treat both of these outcomes as successes):

{proxy1:ssh.run[resolve DNS,1.2.3.4,22,utf-8].iregexp("(?s)((205\.251\.|176\.32\.)|Sending A query.+\s20)")}=0

Note that the (?s) at the beginning helps, I think, to treat the newline character as just another character which matches ".". I may have an extra set of parentheses around the outermost alternating expression, but I can only experiment so much…

I ran various tests such as to change just one of the numbers to make sure it triggered.

I now think I will get better, i.e., more complete, results if I make the item of type text rather than character, at least that switch definitely helped with another truncated output I was getting from another ssh check. So, yes, now I am capturing all the output. So, note to self, use type text unless you have really brief output from your ssh check.

So with all that gained knowledge, my simplified expression now reads like this:

{proxy:ssh.run[resolve a dns name,1.2.3.4,22,utf-8].iregexp("(205\.251\.|176\.32\.)")}=0

Here’s a CPU trigger. From a show status it focuses on the line:

CPU utilization: 29%

and so if I want to trigger a problem for 95% or higher CPU, this expression works for me:

CPU utilization:\s+([ 1-8]\d|9[0-4])\%

A nice online regular expression checker is https://regexr.com/

And a very simple PING test ssh check item, where the expected resulting line will be:

5 packets transmitted, 5 packets received, 0% packet loss

– for that I used the item wizard, altered what it came up with, and arrived at this:

(({proxy:ssh.run[ping 8.8.8.8,1.2.3.4,22,utf-8].iregexp("[45] packets received")})=0)

So I will accept the results as OK as long as at most one of five packets was dropped.

A lesson learned from SNMP monitoring of F5 devices
My F5 BigIP devices began producing problems as soon as we set up the SNMP monitoring. Something like this:

Node /Common/drj-10_1_2_3 is not available in some capacity: blue (4)

It never seemed to matter until now that my nodes appear blue. But perhaps SNMP is enforcing a best practice and expecting nodes to not be blue, meaning to be monitored. And it turns out you can set up a default monitor for your nodes (I use gateway_icmp). It’s found in Nodes | Default Monitor. I’m not sure why this is not better documented by F5. After this, many legacy nodes turn red so I am cleaning them up… But my conclusion is that I have learned something about my own systems from the act of implementing this monitoring, and that’s a good thing.

To be continued…

References and related
A good commercial solution for infrastructure monitoring: Microfocus SiteScope.

DIY monitoring

The Zabbix manual

Direct link to Zabbix Repos (RPMs), including standalone RPMs for zabbix_sender, zabbix_get and zabbix_js: https://repo.zabbix.com/zabbix/5.0/rhel/8/x86_64/

A nice online regular expression (RegEx) checker is: https://RegEx101.com/.

Another online regular expression checker is: https://RegExr.com/.

Just to put it out there: If you like Zabbix you may also like Specto. Specto is an open-source tool for monitoring web sites (“synthetic” monitoring). I know one major organization which uses it so it can’t be too bad. https://specto.sourceforge.net/

Since this document is such a mess I’m starting to document some of my interesting items and Practical Zabbix examples in this newer and cleaner post. It includes the baseline calculation formula.


The IT Detective Agency: the case of Is it the firewall? or routing? or switch? or layer 2?

Intro
This is yet another tale of how things in the IT world often do not turn out the way they seem at first blush. Or possibly a tale of: just when you think you’ve seen it all after decades in the industry, something new (to you) occurs.

What’s going on
The firewall team was all busy, so when this strange problem occurred Friday they called in the second string: me. I consider some of the team to be less-than-customer-focused, so I try to compensate for them and for my lack of knowledge about the firewall by applying a more customer-first attitude. In other words, a sympathetic listening ear. These days it can be hard just to find someone to complain to about your IT problem, and I am keenly aware of that.

There was some strange communication which wasn’t working, mediated by a firewall I had never accessed and was not sure I even had access to. So of course I was asked to join a big conference call where an ongoing debugging session was taking place.

I refused.

I hate being blindsided, and I hate not having answers, making me sound even less competent than I already am.

But what I did do is begin my research to see what the system is, whether perhaps I had access, etc.

Yes. I found that, through a management system I have access to, I could view the policies on that particular firewall and view the logs as well.

So once I had that up, I agreed to join the call.

They had one server communicating to three different systems. Only one of the three systems was being reached. Yes, the other two were on the same subnet. Two of our firewalls were between the system and the three servers.

And, yes, I could see some drops. The interesting TCP error stated: TCP packet out of state, first packet isn’t SYN.

No problem. Routing must be screwed up such that we have asymmetric routing. It happens all the time. Right? Well, these systems are really appliances with only some basic networking information configurable, no real debugging facility, and really no ability to add a host route.

I could not establish a shell session onto the firewall – not sure what the password naming scheme was that they used.

Then a real firewall guy comes on the call. But his connectivity is messed up, so I keep with the debug session, if nothing else than to support him, since four eyes are more effective than just two. He shares the routing tables of our two in-line firewalls. It’s hard to understand as these are all new subnets to me, and some don’t look right. But just focusing on possible host routes for any of these three servers, I don’t see anything amiss.

Firewall policy
And, in firewall policy I see the entire subnet has this traffic permitted. There is no rule specific to one or the other of these systems.

So what do we have up until now?
A purist firewall administrator attitude would be as follows:
The firewall treats all these systems the same, therefore this cannot be a firewall problem. Talk to your networking or system people. Have a nice day.

Well, in fact there was some serious question about the network switch as well. So we had a network guy on the call. So they dug up the MAC addresses of these systems, from which they found the switch ports. Then they checked the port configuration. Ah, some complex 802.1x authentication was configured. As I understand this means the device would not even be allowed onto the subnet until it passed some kind of Radius authentication. So they removed this 802.1x stuff and just made sure that port was assigned to the right vlan.

Still, the problem persisted.

I think the other firewall guy was also new to this equipment. Eventually, though, he tries to do a packet trace of the one that’s working versus the one that isn’t.

You know, I never saw the results of those traces, but I’m pretty sure, reading between the lines, that they surprised him, meaning, they did not fit the hypothesis of the asymmetric routing.

In these situations there is the main communication in the main session, then side communications going on, like between me and the firewall guy. But it is all chaotic. Acoustics are mediocre, accents are hard to understand. So the net transfer of information is pretty low. Statements, even important ones, often have to be repeated multiple times (rebroadcasts) to assure everyone “gets it.”

Typical questions were asked. When did this last work? What had changed? There were a couple of changes. Some kind of networking thing (I forget what), and then the firewalls changed management systems after that. The firewall change seemed closer in time to the last known success.

You acquire more and more information as you dig into problems. It’s hard to judge which is relevant at the time and which lines of inquiry are a complete waste of time. A good incident manager or project manager can sense which are the more productive lines of investigation and nurture those discussions while suppressing the noise.

Actually it was the networking guy who found the Checkpoint link below. I looked at it. The firewall guy was muttering something about badly behaved, older applications that might exhibit this behaviour.

So we agreed to take the suggested steps, which would basically allow these out of state packets. Drat. The firewall returned an error.

But I continue to refresh the firewall logs. The communication was occurring about every minute. Lo and behold, I see the older drops, and then accepts for the last few minutes! I think it worked. I tell them to check.

They check their end. Sure enough. Communication beginning to work…

The customer tries to assert that this was a firewall problem all along. Not so fast. The firewall guy says, well, the firewall is doing exactly what it’s supposed to be doing. Who’s right?

We’re all good for now, but we state this is a kludge for today and a follow-up meeting needs to occur.

So what happened?
I think the single most important thing is that the firewall guy switched his problem hypothesis from Must be asymmetric routing, to Maybe it’s a badly behaved application. Meaning what? What if you have an application that establishes a TCP connection, and then to beat idle timeouts, sends a KEEP ALIVE packet every minute? Well, now, suppose your firewall is rebooted in the middle of that because it has changed management stations and needs to reload policy? What might the situation look like to it?

If you were unlucky, it just might see these KEEP ALIVE TCP packets without having the connection in its connection table, in other words, exactly the situation we are observing!

What should have happened?
It would have been great if the communication were forced to be re-established from time-to-time, even once a day. This problem had been going on for days.

But, given this very stupid behaviour on the part of this application, if the app people had been aware they should have forced their application to re-establish the TCP connection after the firewall reboot. Probably, for the one that did work, it had been forced to re-establish.

A firewall person has to be sufficiently aware to realize this could be happening, and advise the app owner on what to do to prevent it.

Conclusion
So whose problem is it?

To the app people it looks like a firewall issue, cut-and-dried. To a firewall guy it looks like an application issue, cut-and-dried. I see both sides. It is some of both. An app owner has to understand enough about firewalls to see that this type of thing can occur. Assigning blame to one side or the other, as most people are wont to do, is not productive. Only a team effort could have revealed this issue. And recall that the “fix” is actually a kludge that lowers security.

Case: almost closed.

References and related
Checkpoint’s note on TCP packet out of state first packet isn’t SYN: https://community.checkpoint.com/t5/General-Topics/TCP-packet-out-of-state-First-packet-isn-t-SYN-tcp-flags-SYN-ACK/td-p/37166

The IT Detective agency cases are still coming fast and furious. Here’s another recent case. Failed to convert character


The IT Detective Agency: WebEx and the case of the mysterious reset

Intro
A company known to me was a contented user of WebEx until they noticed a strange behaviour: their calls were losing quality or even dropped exactly after one hour. No one, most especially the vendor of WebEx, had the slightest idea of the root cause. Read on to see how this fascinating case is playing out.

Scene: company offices, São Paulo, Brazil
Triago, a very competent IT professional located in São Paulo, was the first to report the problem. More-or-less it went like this:

– when he uses the call-my-computer feature in WebEx the call quality is fine, until he has been on the meeting for one hour. At the one hour mark the voice quality of others (from his perspective) dropped dramatically. Sometimes the call was completely lost. Then, about five minutes later, the quality was OK again.
– the problem only occurred when in the office or using VPN, i.e., when using the company network
– others in South America are having the same issue
– he can use the same company laptop, on the Internet, and will not have the problem

After the usual finger-pointing amongst various vendors a debugging plan was created.

It’s going really well. There have been about 12 test calls, stretched out over the last five months. You have to admire the chutzpah of US software vendors who sell to major customers and then still manage to treat them like crap come time for support.

The pattern, more-or-less, goes like this: test call with several vendors plus Triago and me in the US. Wait around for an hour, produce the problem. Wait for the software vendor to “analyze”. Wait for two weeks. Some small insight may be gleaned by them. Conclusion: another test is needed, we didn’t have all the traces we need. Rinse and repeat.

Scene: A soulless office park somewhere in northern New Jersey
To be continued, literally…

Scene: an enterprise-class server room somewhere in Research Triangle Park, North Carolina
If one uses the company’s guest WiFi, one uses the company’s firewall, but not the company’s proxy server. This test succeeds. But, unfortunately, one also uses UDP rather than TCP for the communication because that is the default. See the references for communication requirements.

So one thought is to knock out the ability to use UDP by blocking UDP port 9000, thereby forcing use of TCP. Testing that today….
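
For a quick test of the same idea on a Linux client, before touching the corporate firewall, something like this should do (iptables syntax; note this only covers the documented UDP media port):

$ sudo iptables -A OUTPUT -p udp --dport 9000 -j DROP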

References and related
Networking requirements for WebEx: https://help.webex.com/en-us/WBX264/How-Do-I-Allow-Webex-Meetings-Traffic-on-My-Network


The IT Detective Agency: the case of Failed to convert character

Intro
A user of a web form noticed that any password that includes an accented character is rejected. He came to us, as the operators of the web application firewall, for a fix.

More details
The web server was behind an F5 device running ASM – application security manager. The reported error that we saw was Failed to convert character. What does it all mean?

One suggestion is that the policy may have the wrong language, but the application language of this policy is unicode (utf-8), just like all our others we set up. And they don’t have any issues. I see where I can remove the block on this particular input violation, but that seems kind of an extreme measure, like throwing out the baby with the bathwater.

I wondered: is there a more granular way to deal with this?

Check characters on this parameter value is already disabled, I notice, so we can’t loosen things further there.

Ask the expert
So I ask someone who speaks a foreign language and has to deal with this stuff a lot more than I do. He responds:

Looking at the website I think that form just defaults to ISO-8859-1 instead of UTF-8 and that causes your problem.
Umlauts or accented letters are double byte encoded in UTF-8 and single byte in ISO-8859-1

To confirm the problem with the form, he enters an “ä” as the username, which the event log shows encoded to %E4 which is not a valid UTF-8 sequence.

Our takeaway
To repeat a key learning from this little problem:
Umlauts or accented letters are double byte encoded in UTF-8 and single byte in ISO-8859-1
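
You can see the two encodings for yourself on any Linux box, assuming a UTF-8 terminal and the standard iconv and xxd utilities:

$ echo -n 'ä' | xxd -p                                  # UTF-8: two bytes
c3a4
$ echo -n 'ä' | iconv -f UTF-8 -t ISO-8859-1 | xxd -p   # ISO-8859-1: one byte
e4

That single e4 byte is exactly the %E4 our event log showed.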

So the web form itself was the problem in this case; and I went back to the user/developer with this information.

So he fixed it?
Well, turns out his submission form was a private page he quickly threw together to test another problem, the real problem, when he noticed this particular issue.

So, yes, his form needed to mention utf-8 if he were going to properly encode accented characters, but that did not resolve the real issue, which remains unresolved.

It happens that way sometimes.

But, yes, the problem reported to us was resolved by the developer based on our feedback, so at least we have that success.

Conclusion
If, like me, your eyes glaze over when someone mentions ISO-8859-1 versus UTF-8, the differences are pretty stark, easy to understand, and, just sometimes, really important! I think ISO-8859-1 represents some of the popular accented characters as single bytes in positions 128 – 255, which UTF-8 does not. UTF-8 uses additional bytes to represent characters outside of the basic Latin alphabet plus the usual special characters.

We’ll call this one Case Closed!

References and related
I like to do a man ascii on any linux system to see the representation of the various Latin characters. I had to install the man-pages package on my RHEL system before that man page was available on my system.


No Internet, secure WiFi status message in Windows 10

Intro
Finding out how Windows decides whether or not there is an Internet connection can be a challenge, mostly because an Internet search on the topic is comprised of words so common they are used in many other contexts. I have to give credit to someone else who found most of these pertinent links that help explain how Windows decides whether or not your PC has an Internet connection.

What they don’t tell you
I think there are a lot more tests Microsoft does than what they’ve documented. In my opinion, based on observation, in addition to the sites they recommend to whitelist, also whitelist

www.msftconnecttest.com

Some PCs get stuck in a loop requesting www.msftconnecttest.com/connecttest.txt indefinitely, which isn’t good for anyone.

Here’s one they don’t mention, of the same ilk:

ipv6.msftconnecttest.com/connecttest.txt

I’m thinking to just leave that one alone, unless you really are fully running on IPv6.
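
You can check what these probe URLs return for yourself; assuming curl is available:

$ curl -s http://www.msftconnecttest.com/connecttest.txt
Microsoft Connect Test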

Now if you have a PAC file, what you’re going to see are accesses for
<PAC-file-address>/connecttest.txt

I don’t think that one’s documented either. I’m not yet sure how best to have the PAC file web server respond, where best means the reply which would make the PC most likely to decide Yes I really do have an Internet connection.

References and related
This Pulse Secure article is pretty good. You start with an Internet connection, then launch Pulse Secure vpn, then find you are told there is no longer an Internet connection. This explains why it might be, but in my opinion it is incomplete as it does not even consider the case where an authenticating proxy is the sole gateway to the Internet:
https://kb.pulsesecure.net/articles/Pulse_Secure_Article/KB43805

These are two more articles about VPN tunneling
https://community.pulsesecure.net/t5/Pulse-Desktop-Clients/Pulse-Secure-blocks-Windows-10-apps-from-internet-access/td-p/11944
https://docs.pulsesecure.net/WebHelp/PCS/8.3R1/Home.htm#PCS/PCS_AdminGuide_8.3/About_VPN_Tunneling.htm

Network Location Awareness (NLA) and Network Connection Status Indicator (NCSI) are explained in these articles
https://support.microsoft.com/en-us/help/4494446/an-internet-explorer-or-edge-window-opens-when-your-computer-connects
https://support.microsoft.com/en-us/help/2778122/using-authenticated-proxy-servers-together-with-windows-8


OLD: Raspberry Pi photo frame using your pictures on your Google Drive

Intro

This posting is messed up. I’ll have to re-post. Working on it… Try this post instead.

All my spouse’s digital photo frames are either broken or nearly broken – probably she got them from garage sales. Regardless, they spend 99% of the time black. Now, since I had bought that Raspberry Pi PiDisplay a while back, and it is underutilized, and I know a thing or two about linux, I felt I could create a custom photo frame with things I already have lying around – a Raspberry Pi 3, a PiDisplay, and my personal Google Drive. We make a point to copy all our cameras’ pictures onto the Google Drive, which we do the old-fashioned, by-hand way. After 17 years of digital photos we have about 40,000 of them, over 200 GB.

So I also felt obliged to create features you will never have in a commercial product, to make the effort worthwhile. I thought, what about randomly picking a few for display from amongst all the pictures, displaying that subset for a few days, and then moving on to a new randomly selected sample of images, etc? That should produce a nice review of all of them over time, eventually. You need an approach like that because you will never get to the end if you just try to display 40000 images in order!

The scripts
Here is the master file which I call master.sh.

#!/bin/sh
# DrJ 8/2019
# call this from cron once a day to refresh the random slideshow
RANFILE="random.list"
NUMFOLDERS=20
DISPLAYFOLDER="/home/pi/Pictures"
DISPLAYFOLDERTMP="/home/pi/Picturestmp"
SLEEPINTERVAL=3
DEBUG=1
STARTFOLDER="MaryDocs/Pictures and videos"
 
echo "Starting master process at "`date`
 
rm -rf $DISPLAYFOLDERTMP
mkdir $DISPLAYFOLDERTMP
 
#listing of all Google drive files starting from the picture root
if [ $DEBUG -eq 1 ]; then echo Listing all files from Google drive; fi
rclone ls remote:"$STARTFOLDER" > files
 
# filter down to only jpegs, lose the docs folders
if [ $DEBUG -eq 1 ]; then echo Picking out the JPEGs; fi
egrep '\.[jJ][pP][eE]?[gG]$' files | grep -vi docs > jpegs.list
# NB: the original post is truncated at this line; the completed filter above is a guess,
# and the remainder of master.sh is missing (see the re-post note above)
 
Needless to say, but I'd better say it, the STARTFOLDER in this script is particular to my own Google drive. Customize it as appropriate for your situation.
 
Then qiv (quick image viewer) is called with a bunch of arguments and some trickery to ensure proper display of files with spaces in the filenames (an anathema for Linux but my spouse doesn't know that so I gotta deal with it). I call this script qiv.sh.
#!/bin/sh
# -f : full-screen; -R : disable deletion; -s : slideshow; -d : delay ; -i : status-bar;
# -m : zoom; [-r : randomize]
# this doesn't handle filenames with spaces:
##cd /media; qiv -f -R -s -d 5 -i -m `find /media -regex ".+\.jpe?g$"`
# this one does:
export DISPLAY=:0
if [ "$1" = "l" ]; then
# print out proposed filenames
  find . -regex ".+\.[jJ][pP][eE]?[gG]$"
else
# args: f fullscreen d delay s slideshow l autorotate R readonly I statusbar
# i nostatusbar m maxspect
  find . -regex ".+\.[jJ][pP][eE]?[gG]$" -print0|xargs -0 qiv -fRsmil -d 5
fi
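For reference, here is how you would exercise both modes of qiv.sh by hand – the lone “l” argument is just a dry run that lists the JPEGs the find command matches, anything else starts the slideshow:

$ cd ~/Pictures
$ ~/qiv.sh l
$ ~/qiv.sh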

Here is the perl script which generates the random numbers and associates them to the file listing we’ve just made with rclone, random-files.pl.

#!/usr/bin/perl
use Getopt::Std;
my %opt=();
getopts("df:j:r:",\%opt);
$nofolders = $opt{f} ? $opt{f} : 20;
$DEBUG = $opt{d} ? 1 : 0;
$jpegs = $opt{j} ? $opt{j} : "jpegs.list";
$ranpicfile = $opt{r} ? $opt{r} : "jpegs-random.list";
print "d,f,j,r: $opt{d}, $opt{f}, $opt{j}, $opt{r}\n" if $DEBUG;
open(JPEGS,$jpegs) || die "Cannot open jpegs listing file $jpegs!!\n";
@jpegs = <JPEGS>;
# remove newline character
$nopics = chomp @jpegs;
open(RAN,"> $ranpicfile") || die "Cannot open random picture file $ranpicfile!!\n";
for($i=0;$i<$nofolders;$i++) {
  $t = int(rand($nopics-2));
  print "random number is: $t\n" if $DEBUG;
  ($dateTime) = $jpegs[$t] =~ /(\d{8}_\d{6})/;
  if ($dateTime) {
    print "dateTime\n" if $DEBUG;
  }
  $priorPic = $jpegs[$t-2];
  $Pic = $jpegs[$t];
  $postPic = $jpegs[$t+2];
  print RAN qq($priorPic
$Pic
$postPic
);
}
close(RAN);

Note that to display 60 pictures only 20 random numbers are used: the picture two prior and the picture two after each randomly selected one are also displayed. This helps to provide, hopefully, some context for what is being shown without including all those near-duplicate pictures that everyone takes nowadays.

There is an attempt to favor recently uploaded pictures but I really haven’t perfected that part of master.sh, it’s more of a thought at this point.

My crontab entries take care of starting a slideshow upon first boot as well as a daily pick of 60 new random pictures!

@reboot sleep 25; cd ~ ; ./m2.pl >> ./m2.log 2>&1
24 16 * * * ./master.sh >> ./master.log 2>&1

Use crontab -e to edit your crontab file.

qiv – an easy install
To install qiv

$ sudo apt-get install qiv

Rclone shown in some detail
The real magic is tapping into the Google Drive, which is done with rclone. There are older packages but they are awful by comparison so don’t waste your time on any other package. More recent rclone packages offer more options than what is shown here, but work basically the same way.

$ sudo apt-get install rclone
$ rclone config

2019/08/05 20:22:42 NOTICE: Config file "/home/pi/.config/rclone/rclone.conf" not found - using defaults
No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> remote
Type of storage to configure.
Choose a number from below, or type in your own value
 1 / Amazon Drive
   \ "amazon cloud drive"
 2 / Amazon S3 (also Dreamhost, Ceph, Minio)
   \ "s3"
 3 / Backblaze B2
   \ "b2"
 4 / Dropbox
   \ "dropbox"
 5 / Encrypt/Decrypt a remote
   \ "crypt"
 6 / Google Cloud Storage (this is not Google Drive)
   \ "google cloud storage"
 7 / Google Drive
   \ "drive"
 8 / Hubic
   \ "hubic"
 9 / Local Disk
   \ "local"
10 / Microsoft OneDrive
   \ "onedrive"
11 / Openstack Swift (Rackspace Cloud Files, Memset Memstore, OVH)
   \ "swift"
12 / Yandex Disk
   \ "yandex" 
Storage>7

Google Application Client Id
Leave blank normally.
Enter a string value. Press Enter for the default ("").
client_id>
Google Application Client Secret
Leave blank normally.
Enter a string value. Press Enter for the default ("").
client_secret>
Remote config
Use auto config?
 * Say Y if not sure
 * Say N if you are working on a remote or headless machine or Y didn't work
y) Yes
n) No
y/n> N
If your browser doesn't open automatically go to the following link: https://accounts.google.com/o/oauth2/auth?client_id=202264815644.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive&state=07ab6a457efc9384772f919dca93375
Log in and authorize rclone for access

You sign in to your Google account with a regular browser.

After sign-in you see:

rclone wants to access your Google Account
<your_account>@gmail.com
This will allow rclone
to:

See, edit, create, and delete all of your Google Drive files

Make sure you trust rclone

After clicking Allow you get:

Please copy this code, switch to your application and paste it there:
 
Enter verification code>4/nQEXJZOTdP_asMs6UQZ5ucs6ecvoiLPelQbhI76rnuj4sFjptxbjm7w
--------------------
[remote]
client_id =
client_secret =
token = {"access_token":"ya29.Il-KB3eniEpkdUGhwdi8XyZyfBFIF2ahRVQtrr7kR-E2lIExSh3C1j-PAB-JZucL1j9D801Wbh2_OEDHthV2jk_MsrKCMiLSibX7oa_YtFxts-V9CxRRUirF1_kPHi5u_Q","token_type":"Bearer","refresh_token":"1/MQP8jevISJL1iEXH9gaNc7LIsABC-92TpmqwtRJ3zV8","expiry":"2019-09-21T08:34:19.251821011-04:00"}
--------------------
y) Yes this is OK
e) Edit this remote
d) Delete this remote
y/e/d> y
Current remotes:

Name                 Type
====                 ====
remote               drive

e) Edit existing remote
n) New remote
d) Delete remote
s) Set configuration password
q) Quit config
e/n/d/r/s/q> q

Note you can very well keep the root folder id blank. In my case we store all our pictures in one top-level folder and the nested folders get pretty deep, plus there’s a busload of other things on the drive, so I wanted to give rclone the best possible shot at running well. Still, listing our 40,000+ pictures takes 90 seconds or so.

Goofed up your config of rclone? No worries. Remove .config/rclone and start over.
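In other words, something like this, then re-run the config dialog:

$ rm -rf ~/.config/rclone
$ rclone config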

Don’t forget to make all these scripts executable (chmod +x <script_name>) or you will end up seeing messages like this:

./master.sh
-bash: ./master.sh: Permission denied

Some noteworthy rclone commands
rclone ls remote: – lists all files, recursively; no problem piping the output to more
rclone lsd remote: – lists the directories in the top level of the drive
rclone copy remote:"MaryDocs/Pictures and videos/Shutterfly books collection of photos/JJH birth photos/img2165.jpg" . – copies the picture to the current directory (does not create the directory hierarchy)

Do a complete directory listing, capture the results in a file and see how long it took:
$ time rclone ls remote: > lsf-complete

real    1m12.201s
user    0m15.270s
sys     0m1.816s

My initial thought was to do a remote mount of the Google Drive onto a Raspberry Pi mount point, but it’s just so slow that it really provides no advantage to do it that way.
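For the curious, recent rclone releases can do such a mount themselves – a sketch, not something this setup uses, and the mount point is arbitrary:

$ mkdir -p ~/gdrive
$ rclone mount remote: ~/gdrive &
$ ls ~/gdrive

Directory listings over the mount were just too slow for this application.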

Some encountered issues
Well, I blew up on crontab, which in all my years working with linux/unix I’ve never done before. But I managed to fix it.

Prior to discovering rclone I made the mistake of using gdrivefs to create a mounted Google Drive – sounds great in principle, right? What a disaster. The files’ binary data were not correctly preserved when accessed through the mount, though the size was! I had never before encountered mounting software that corrupts files, but this piece of garbage does. One way to detect corruption in a binary file is to take a cksum (or md5sum; just be consistent and use one or the other) of the source file and of the destination version of the same file. The results should be the same number.
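For example, with a hypothetical picture present both at the source and on the mount:

$ cksum ~/Pictures/img1234.jpg /mnt/gdrive/img1234.jpg

The first number on each output line is the checksum; if the two lines do not agree the file was corrupted in transit, which is exactly what gdrivefs did.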

Imagined but avoided issue: JPEG orientation

I had prepared a whole python program to orient my pictures correctly, but lo and behold I “discovered” that the -l switch in qiv does that for you! So I actually ripped that whole unnecessary step out.

Conclusion
Re-purposing equipment I had lying around – a Raspberry Pi 3, a Pi Display, and 40,000 JPEG images on Google Drive – I put together a novel photo frame slideshow which randomly displays a different set of 60 pictures each day. It’s a nice way for us to be exposed to our collection of 17+ years of digital photos.

The qiv really is a quick image viewer, i.e., the slideshow runs clean, like a real one.

Long Todo list

  • Improve selection of recent pictures if we’ve just uploaded a bunch of pictures from our smartphones.
  • Hey, how about also showing some of those short videos we also shot with our camera phones and uploaded to Google Drive? And while we’re at it, re-purposing those cheap USB speakers I bought for RetroPi gaming to get the sound, or play a soundtrack!?
  • I realize that although the selection of the 20 anchor pictures is initially random, when they plus the 40 additional photos are presented for display additional order is imposed by the shell’s expansion of the regex and this has a tendency to make the pictures more chronologically organized than they would be by chance.

References and related
Current approach and writeup for this photo frame effort.

PiDisplay

RetroPi, the gaming emulation project for which I bought economical USB speakers.

The rclone home page.

A detailed write-up on using the pipresents program, where we had a Raspberry Pi drive a mixed media display (pictures and videos) for a kiosk.

Categories
IT Operational Excellence Network Technologies Web Site Technologies

F5 Big-IP: When your virtual server does not present your chain certificate

Intro
While I was on vacation someone replaced a certificate which had expired on the F5 Big-IP load balancer. Maybe they were not quite as careful as I hope I would have been. In any case, shortly afterwards our SiteScope monitoring reported an untrusted server certificate chain. It took me quite some digging to get to the bottom of it.

The details
Well, the web site came up just fine in my browser. I checked it with SSL Labs and its grade was capped at B because of problems with the server certificate chain. I also independently confirmed using openssl that no intermediate certificate was being presented by this virtual server. To see what that looks like, with an example of this problem kindly provided by badssl.com, do:

$ openssl s_client -showcerts -connect incomplete-chain.badssl.com:443

CONNECTED(00000003)
depth=0 /C=US/ST=California/L=San Francisco/O=BadSSL Fallback. Unknown subdomain or no SNI./CN=badssl-fallback-unknown-subdomain-or-no-sni
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 /C=US/ST=California/L=San Francisco/O=BadSSL Fallback. Unknown subdomain or no SNI./CN=badssl-fallback-unknown-subdomain-or-no-sni
verify error:num=27:certificate not trusted
verify return:1
depth=0 /C=US/ST=California/L=San Francisco/O=BadSSL Fallback. Unknown subdomain or no SNI./CN=badssl-fallback-unknown-subdomain-or-no-sni
verify error:num=21:unable to verify the first certificate
verify return:1
---
Certificate chain
 0 s:/C=US/ST=California/L=San Francisco/O=BadSSL Fallback. Unknown subdomain or no SNI./CN=badssl-fallback-unknown-subdomain-or-no-sni
   i:/C=US/ST=California/L=San Francisco/O=BadSSL/CN=BadSSL Intermediate Certificate Authority
-----BEGIN CERTIFICATE-----
MIIE8DCCAtigAwIBAgIJAM28Wkrsl2exMA0GCSqGSIb3DQEBCwUAMH8xCzAJBgNV
BAYTAlVTMRMwEQYDVQQIDApDYWxpZm9ybmlhMRYwFAYDVQQHDA1TYW4gRnJhbmNp
...
HJKvc9OYjJD0ZuvZw9gBrY7qKyBX8g+sglEGFNhruH8/OhqrV8pBXX/EWY0fUZTh
iywmc6GTT7X94Ze2F7iB45jh7WQ=
-----END CERTIFICATE-----
---
Server certificate
subject=/C=US/ST=California/L=San Francisco/O=BadSSL Fallback. Unknown subdomain or no SNI./CN=badssl-fallback-unknown-subdomain-or-no-sni
issuer=/C=US/ST=California/L=San Francisco/O=BadSSL/CN=BadSSL Intermediate Certificate Authority
...
    Verify return code: 21 (unable to verify the first certificate)

So you get that message about being unable to verify the first certificate.

Here’s the weird thing: the certificate in question was issued by Globalsign, and we have used them for years, so we had the intermediate certificate configured already in the SSL client profile. The so-called chain certificate was GlobalsignIntermediate. But it wasn’t being presented. What the heck? Then I checked someone else’s Globalsign certificate and found the same issue.

Then I began to get suspicious about the certificate. I checked the issuer more carefully and found that it wasn’t from the intermediate we had been using all these past years. Globalsign changed their intermediate certificate! The new one dates from November 2018 and expires in 2028.

And, to compound matters, F5 “helpfully” does not complain when the configured chain certificate does not match the server certificate’s issuer: it simply sends no intermediate certificate at all to accompany the server certificate.
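The remedy is to import the current GlobalSign intermediate and reference it as the chain in the SSL client profile. From tmsh that looks roughly like the following – a sketch with made-up object and profile names, and note that newer TMOS versions attach the chain within a cert-key-chain object instead:

$ tmsh install sys crypto cert GlobalSign-intermediate-2018 from-local-file /var/tmp/globalsign-intermediate.crt
$ tmsh modify ltm profile client-ssl my-clientssl-profile chain GlobalSign-intermediate-2018
$ tmsh save sys config

Afterwards the openssl s_client -showcerts test above should show two certificates in the Certificate chain section instead of one.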

Conclusion
The case of the missing intermediate certificate was resolved. It is not the end of the world to miss an intermediate certificate, but on the other hand it is not professional either. Sooner or later it will get you into trouble.

References and related
badssl.com is a great resource.
My favorite openssl commands can be very helpful.

Categories
Admin Linux Network Technologies Raspberry Pi Security Web Site Technologies

How to test if a web site requires a client certificate

Intro
I cannot find a link on the Internet for this, yet I think some admins would appreciate a relatively simple test to know whether a web site requires a client certificate. The errors generated in a browser in these situations can be very generic. I see many ways to offer help, from a recipe to a tool to some pointers. I’m not yet sure how I want to proceed!

Why would a site require a client CERT? Most likely as a form of client authentication.

Pointers for the DIY crowd
Badssl.com plus access to a linux command line – such as on a Raspberry Pi, which I so often write about – will do it for you guys.

The Client Certificate section of badssl.com has most of what you need. The page is getting big, so look for that section.

So as a big timesaver badssl.com has created a client certificate for you which you can use to test with. Download it as follows.

Go to your linux prompt and do something like this:
$ wget https://badssl.com/certs/badssl.com-client.pem

If this link does not work, navigate to it starting from this link: https://badssl.com/download/

badssl.com has a web page you can test with which only shows success if you access it using a client certificate, https://client.badssl.com/

To see how this works, try to access it the usual way, without supplying a client CERT:

$ curl -i -k https://client.badssl.com/

HTTP/1.1 400 Bad Request
Server: nginx/1.10.3 (Ubuntu)
Date: Thu, 20 Jun 2019 17:53:38 GMT
Content-Type: text/html
Content-Length: 262
Connection: close

400 Bad Request

No required SSL certificate was sent


nginx/1.10.3 (Ubuntu)

 

Now try the same thing, this time using the client CERT you just downloaded:

$ curl -v -i -k -E ./badssl.com-client.pem:badssl.com https://client.badssl.com/

* About to connect() to client.badssl.com port 443 (#0)
*   Trying 104.154.89.105... connected
* Connected to client.badssl.com (104.154.89.105) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* warning: ignoring value of ssl.verifyhost
* skipping SSL peer certificate verification
* NSS: client certificate from file
*       subject: CN=BadSSL Client Certificate,O=BadSSL,L=San Francisco,ST=California,C=US
*       start date: Nov 16 05:36:33 2017 GMT
*       expire date: Nov 16 05:36:33 2019 GMT
*       common name: BadSSL Client Certificate
*       issuer: CN=BadSSL Client Root Certificate Authority,O=BadSSL,L=San Francisco,ST=California,C=US
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
*       subject: CN=*.badssl.com,O=Lucas Garron,L=Walnut Creek,ST=California,C=US
*       start date: Mar 18 00:00:00 2017 GMT
*       expire date: Mar 25 12:00:00 2020 GMT
*       common name: *.badssl.com
*       issuer: CN=DigiCert SHA2 Secure Server CA,O=DigiCert Inc,C=US
> GET / HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.27.1 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: client.badssl.com
> Accept: */*
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Server: nginx/1.10.3 (Ubuntu)
Server: nginx/1.10.3 (Ubuntu)
< Date: Thu, 20 Jun 2019 17:59:08 GMT
Date: Thu, 20 Jun 2019 17:59:08 GMT
< Content-Type: text/html
Content-Type: text/html
< Content-Length: 662
Content-Length: 662
< Last-Modified: Wed, 12 Jun 2019 15:43:39 GMT
Last-Modified: Wed, 12 Jun 2019 15:43:39 GMT
< Connection: keep-alive
Connection: keep-alive
< ETag: "5d011dab-296"
ETag: "5d011dab-296"
< Cache-Control: no-store
Cache-Control: no-store
< Accept-Ranges: bytes
Accept-Ranges: bytes
<
<style>body { background: green; }</style>
client.
badssl.com
* Connection #0 to host client.badssl.com left intact
* Closing connection #0

No more 400 error status – that looks like success to me. Note that we had to provide the password for our client CERT (the string after the colon in the -E argument), which badssl.com kindly set to badssl.com.

Here’s an example of a real site which requires client CERTs:

$ curl -v -i -k -E ./badssl.com-client.pem:badssl.com https://jp.nissan.biz/

* About to connect() to jp.nissan.biz port 443 (#0)
*   Trying 150.63.252.1... connected
* Connected to jp.nissan.biz (150.63.252.1) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* warning: ignoring value of ssl.verifyhost
* skipping SSL peer certificate verification
* NSS: client certificate from file
*       subject: CN=BadSSL Client Certificate,O=BadSSL,L=San Francisco,ST=California,C=US
*       start date: Nov 16 05:36:33 2017 GMT
*       expire date: Nov 16 05:36:33 2019 GMT
*       common name: BadSSL Client Certificate
*       issuer: CN=BadSSL Client Root Certificate Authority,O=BadSSL,L=San Francisco,ST=California,C=US
* NSS error -12227
* Closing connection #0
* SSL connect error
curl: (35) SSL connect error

OK, so you get an error, but that’s to be expected because our certificate is not one it will accept.

The point is that if you don’t send it a certificate at all, you get a different error:

$ curl -v -i -k https://jp.nissan.biz/

* About to connect() to jp.nissan.biz port 443 (#0)
*   Trying 150.63.252.1... connected
* Connected to jp.nissan.biz (150.63.252.1) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* warning: ignoring value of ssl.verifyhost
* Unable to load client key -8025.
* NSS error -8025
* Closing connection #0
curl: (58) Unable to load client key -8025.

Chrome gives a fairly intelligible error in these situations.

Possibly to be continued…

Conclusion
We have given a recipe for testing from a linux command line whether a web site requires a client certificate or not. Since it relies entirely on command-line tools, it could be turned into a program.
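
To make that concrete, here is a rough sketch of such a program, assuming curl is available. The script name and the status-code heuristic are mine: a 400 response or an SSL connect error (curl exit code 35) is suggestive of a required client certificate, not proof.

#!/bin/sh
# clientcerttest.sh - rough test: does this site require a client certificate?
# usage: ./clientcerttest.sh https://client.badssl.com/
url=$1
# fetch only the HTTP status code, ignoring server certificate problems (-k)
status=$(curl -s -k -o /dev/null -w '%{http_code}' "$url")
rc=$?
if [ $rc -eq 35 ] || [ "$status" = "400" ]; then
  echo "$url may require a client certificate (curl exit $rc, HTTP status $status)"
else
  echo "$url does not appear to require a client certificate (HTTP status $status)"
fi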

References and related
My article about ciphers has been popular.

I’ve also used badssl.com for other related tests.

Can you use openssl directly? You’d hope so, but I haven’t had time to explore it… Here are my all-time favorite openssl commands.
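
For what it’s worth, here is an untested sketch of how openssl ought to handle both halves of the test. The pem file from badssl.com contains both the certificate and the encrypted private key, so openssl should prompt for the pass phrase, which is badssl.com:

$ openssl s_client -connect client.badssl.com:443 -cert badssl.com-client.pem -key badssl.com-client.pem

And to detect a client certificate request without sending a certificate at all, look for the CA list the server advertises during the handshake:

$ openssl s_client -connect client.badssl.com:443 < /dev/null | grep -i 'client certificate'

If the server requests a certificate and names its acceptable CAs, you should see Acceptable client certificate CA names followed by a list; for a site that doesn’t ask for one you typically see No client certificate CA names sent instead. Sites which only request the certificate upon renegotiation won’t reveal themselves this way, however.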

https://badssl.com/ – lots of cool tests here. The creators have been really thorough.

Categories
Admin Network Technologies

Postfix Operational tips

Intro
I’m trying out the system-supplied postfix on a SLES system. I had been using sendmail, but there doesn’t seem to be any development on that software any longer.

Some commands I needed right away
Well, right away I had thousands of queued messages so I needed a way to make sense of what was happening.

For these commands to make sense you need to know that I am running a second postfix configuration out of /etc/postfixEXT.

Display the queue

postqueue -c /etc/postfixEXT -p

Force delivery from the queue

postqueue -c /etc/postfixEXT -f

List one email in detail

postcat -vq -c /etc/postfixEXT QUEUEID

Delete one email

postsuper -c /etc/postfixEXT -d QUEUEID

Put mail on hold

postsuper -c /etc/postfixEXT -h ALL|QUEUEID

Release mail from hold

postsuper -c /etc/postfixEXT -H ALL|QUEUEID
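
With thousands of messages queued, the raw listing is overwhelming, so a quick count can help. A sketch, assuming the traditional short (hex) queue ID format, where each message line in the postqueue output starts with a queue ID:

postqueue -c /etc/postfixEXT -p | grep -c '^[0-9A-F]'

Pipe through grep for a sender or recipient address instead of counting to narrow things down.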

How to force delivery of a single message
This command is not documented anywhere – because it doesn’t exist – so you have to get creative. If you have the luxury of halting all email for a few seconds, simply do this:

Display the queue to find the queue ID of the email you want to force delivery of

postqueue -c /etc/postfixEXT -p

Put all mail on hold

postsuper -c /etc/postfixEXT -h ALL

Now release the hold on that one email

postsuper -c /etc/postfixEXT -H QUEUEID

QUEUEID is, of course, the queue id, e.g., F2A1A27891E, of the email in question.

Look for what happened
Check your mail log’s last lines in /var/log/mail

Revert back to normal running

postsuper -c /etc/postfixEXT -H ALL

Since mail is store-and-forward and not real time, you can run these steps even on a production system and no one will be the wiser, provided you are quick about it. The whole thing probably takes two minutes even for a slow typist.
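
Putting it all together, and adding a flush to force the delivery attempt right away, the whole sequence looks something like this, where F2A1A27891E stands in for your queue ID:

postsuper -c /etc/postfixEXT -h ALL
postsuper -c /etc/postfixEXT -H F2A1A27891E
postqueue -c /etc/postfixEXT -f
postsuper -c /etc/postfixEXT -H ALL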

How to run multiple listeners
I didn’t want to disturb the system-installed postfix too much. I would let it “have” the loopback address, 127.0.0.1, leaving me the other IPs for my relay config to listen on. I added these lines to /etc/postfix/main.cf

multi_instance_enable = yes
multi_instance_directories = /etc/postfixEXT

service postfix start starts up the local postfix plus my relay. Grep the process table for either master or postfix to see them both. However, to be honest, service postfix stop does not kill all the processes, so I always end up killing one of the master processes by hand. Update: postmulti -p stop does the trick and kills them all. There are also status and start options instead of stop.
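
For reference, a sketch of how the two main.cf files might divide up the listening addresses. The 10.0.0.5 address and the queue directory path are made-up examples, not my actual values; at a minimum the second instance needs its own queue_directory:

# /etc/postfix/main.cf (system instance)
inet_interfaces = 127.0.0.1
multi_instance_enable = yes
multi_instance_directories = /etc/postfixEXT

# /etc/postfixEXT/main.cf (relay instance)
inet_interfaces = 10.0.0.5
queue_directory = /var/spool/postfixEXT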

Sendmail to Postfix migration tips
This could be a separate post but I am too lazy to do that.

What happens to the access file? I kept the name of the file, access, but just list all the IPs in it, one per line, without any further arguments, to permit relay access for just those IPs. In my main.cf I have a line like this to tie it together:

mynetworks = /etc/postfixEXT/access

Note that there is no hashed or .db version of this file any longer, unlike in the sendmail case.
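
So /etc/postfixEXT/access looks something like this – made-up addresses, and CIDR notation is also accepted if you want to permit whole ranges:

10.1.2.3
10.1.2.4
192.168.5.0/24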

References and related
Since I mentioned sendmail I have to give a shout out to one of my old sendmail posts.

More info on postfix multiple instances. A pretty complete guide.