Categories
Debian Linux Raspberry Pi

My favorite bash scripting tips

Intro

The linux bash shell is great and very flexible. I love to use it and have even installed WSL 2 on my PCs so I can use it as much as possible. When it comes to scripting it’s not exactly my favorite. there is so much history it has absorbed that there are multiple ways to do everything: the really old way, the new way, the alternate way, etc. And your version of bash can also determine what features you can use. nevertheless, I guess if you stick to the basics it makes sense to use bash for simple scripting tasks.

So just like I’ve compiled all the python tips I need for writing my simple python scripts in one convenient, searchable page, I will now do the same for bash. No one but me uses it, but that’s fine.

Iterate (loop) over a range of numbers

END=255 # for instance to loop over an ocetet of an IP address
for i in $(seq 1 $END); do
  echo $i
done
# But if it's OK to just hard-wire start and end, then it's simpler to use:
for i in {1..255}; do echo $i; done

Infinite loop
while /bin/true; do...done

You can always exit to stop it.

Sort IPs in a sensible order

$ sort -n -t . -k1,1 -k2,2 -k 3,3 -k4,4 tmp

What directory is this script in?

DIR=$(cd $(dirname $0);pwd);echo$DIR

Guarantee this script is interpreted (run) by bash and not good ‘ole shell (sh)!
if [ ! "$BASH_VERSION" ] ; then
  exec /bin/bash "$0" "$@"
  exit
fi
Count total occurrences of the word print in a bunch of files which may or may not be compressed, storing the output in a file

print=0
zgrep -c print tst*|cut -d: -f2|while read pline; do prints=$((prints + pline));echo $prints>prints; done

Note that much of the awkwardness of the above line is to get around issues I had with variable scope.

Legal characters in variable names

Don’t use _ as you might in python! Stick to alphanumeric, but also do not begin with a number!

Execute a command

I used to use back ticks ` in the old days. parentheses is more visually appealing:

print1=$(cat prints)

Variable type

No, variables are not typed. Everything is treated as a string.

Function definition

Put function definitions before they are invoked in the script. Invocation is by plain name. function syntax is as in the example.

sendsummary() {
# function execution statements go here, then close it out
} # optionally with a comment like end function sendsummary
sendsummary # invoke our sendsummary function
Indentation

Unlike python, line indentation does not matter. I recommend to indent blocks of code two spaces, for example, for readability.

Booleans and order of execution
[[ "$DEBUG" -eq "1" ]] && echo subject, $subject, intro, "$intro"

The second statement only gets executed if the first one evaluated as true. Now a more complex example.

[[ $day -eq $DAY ]] || [[ -n “$anomalies” ]] && { statements…}

The second expressions get evaluated if the first one is false. If either the first or second expressions are true, then the last expression — a series of statements in what is essentially an unnamed function, hence the enclosing braces — gets executed. The -n is a test to see of length of a string is non-zero. See man test.

Conditionals

Note that clever use of && and || can in many cases obviate the need for a class if…then structure. But you can use if thens. An if block is terminated by a fi. There is an else statement as well as an elif (else if) statement.

grep conditionals
ping -c1 8.8.8.8|grep -iq '1 received'
[ $? -eq 0 ] && echo this host is alive

So the $? variable after grep is run contains 0 if there was a match and 1 if there was no match. -q argument puts grep in “quiet” mode (no output).

More sophisticated example testing exit status and executing multiple commands

#!/bin/bash
# restart mariaDB if home page response becomes greater than one second
curl -m1 -ksH 'Host:drjohnstechtalk.com' https://localhost/blog/ > /dev/null
# if curl didn't have enough time (one sec), its exit status is 28
[ $? -eq 28 ] && (systemctl stop mariadb; sleep 3; systemctl start mariadb; echo mariadb restart at $(date))

Note that I had to group the commands after the conditional test with surrounding parentheses (). That creates a code block. Without those the semicolon ; would have indicated the end of the block! A semicolon ; separates commands. Further note that I nested parentheses and that seems to work as you would hope. also note that STDOUT has been redirected by the greater than sign > to /dev/null in order to silently discard all STDOUT output. /dev/null is linux-specific. The windows equivalent, apparently, is nul. Use curl -so nul suppress output on a Windows system.

One square bracket or two?

I have no idea and I use whatever I get to work. All my samples work and I don’t have time to test all variations.

Variable scope

I really struggled with this so I may come back to this topic!

Variable interpolation

$variable will suffice for simple, i.e., one-word content. But if the variable contains anything a bit complex such as words separated by spaces, or containing unusual characters, better go with double quotes around it, “$variable”. And sometimes syntactically throw in curly braces to separate it from other elements, “${variable}”

Eval
eval="ls -l"
$eval # executes ls -l
Shell expansion
mv Pictures{,.old} # renames directory Pictures to Pictures.old
Poor man’s launch at boot time

Use crontab’s @reboot feature!

@reboot sleep 25; ./recordswitch.sh > recordswitch.log 2>&1

The above expression also shows how to redirect standard error to standard out and have both go into a file.

Use extended regular expressions, retrieving a positional field using awk, and how to subtract two numbers
t1=`echo -n $line|awk '{print $1}'` 
t2=`echo -n $line|awk '{print $4}'` 
# test for integer inputs 
[[ "$t1" =~ ^[0-9]+$ ]] && [[ "$t2" =~ ^[0-9]+$ ]] && downtime=$(($t1-$t2))

Oops, I used the backticks there! I never claim that my way is the best way, just the way that I know to work! I know of a zillion options to add or subtract numbers…

Why do assignments have no extra spaces?

It simply doesn’t work if you try to put in spacing around the assignment operator =.

Divert stdout and stderr to a file from within the script
log=/tmp/my-log.log
exec 1>$log 
exec 2>&1
Lists, arrays amd dictionary variables

I don’t think bash is for you if you need these types of variables.

Formatted date

date +%F

produces yyyy-mm-dd, i.e., 2024-01-25

Conclusion

I have documented here most of the tecniques I use from bash to achieve simple yet powerful scripts. My style is not always top form, but as I learn better ways I will adopt and improve.

Categories
Admin Linux Raspberry Pi

Scripts checker

Intro

Imagine an infrastructure team empowered to create its own scripts to do such things as regularly update external dynamic lists (EDLs) or interact with APIs in an automated fashion. At some point they will want to have a meta script in place to check the output of the all the automation scripts. This is something I developed to meet that need.

I am getting tired of perl, and I still don’t know python, so I decided to enhance my bash scripting for this script. I learned some valuable things along the way.

checklogs.sh

I call the script checklogs.sh Here it is.

#!/bin/bash
# DrJ 2021/12/17, updated 2023/7/26
# it is desired to run this using the logrotate mechanism
#
# logrotate invokes with /bin/sh so we have to do this trick...
if [ ! "$BASH_VERSION" ] ; then
  exec /bin/bash "$0" "$@"
  exit
fi
DIR=$(cd $(dirname $0);pwd)
INI=$DIR/log.ini
DAY=2 # Day of week to analyze full week of logs. Monday is 1, Tuesday 2, etc
DEBUG=0
maxdiff=10
maxerrors=10
minstarts=10
TMPDIR=/var/tmp
cd $TMPDIR
recipients="[email protected]"
#
checklog2() {
  starts=0;ends=0;errors=0
  [[ "$DEBUG" -eq "1" ]] && echo ID, $ID, LPATH, $LPATH, START, $START, ERROR, $ERROR, END, $END
  LPATH="${LPATH}${wildcard}"
# the Ec switches mean (E) extnded regular expressions, (c) count of matching lines
  zgrep -Ec "$START" ${LPATH}|cut -d: -f2|while read sline; do starts=$((starts + sline));echo $starts>starts; done
  zgrep -Ec "$END" ${LPATH}|cut -d: -f2|while read sline; do ends=$((ends + sline));echo $ends>ends; done
# Outlook likes to remove our newline characters - double up on them with this sed trick!
  zgrep -Ec "$ERROR" ${LPATH}|cut -d: -f2|sed 'a\\'|while read sline; do errors=$((errors + sline));echo $errors>errors; done
  exampleerrors=$(zgrep -E "$ERROR" ${LPATH}|head -10)
  starts=$(cat starts)
  ends=$(cat ends)
  errors=$(cat errors)
  info="${info}===========================================
$ID SUMMARY
  Total starts: $starts
  Total finishes: $ends
  Total errors: $errors
  Most recent errors: "
  info="${info}${exampleerrors}

"
  unset NEW
# get cumulative totals
  starttot=$((starttot + starts))
  endtot=$((endtot + ends))
  errortot=$((errortot + errors))
  [[ "$DEBUG" -eq "1" ]] && echo starttot, $starttot, endtot, $endtot, errortot, $errortot
  [[ "$DEBUG" -eq "1" ]] || rm starts ends errors
} # end of checklog2 function

checklog() {
# clear out stats and some variables
starttot=0;endtot=0;errortot=0;info=""
#this IFS and following line is trick to preserve those darn backslash charactes in the input file
IFS=$'\n'
for line in $(<$INI); do
  [[ "$line" =~ ^# ]] || {
  pval=$(echo "$line"|sed s'/: */:/')
  lhs=$(echo $pval|cut -d: -f1)
  rhs=$(echo "$pval"|cut -d: -f2-)
  lhs=$(echo $lhs|tr [:upper:] [:lower:])
  [[ "$DEBUG" -eq "1" ]] && echo line is "$line", pval is $pval, lhs is $lhs, rhs is "$rhs"
  if [ "$lhs" = "identifier" ]; then
    [[ "$DEBUG" -eq "1" ]] && echo matched lhs = identifer section
    [[ -n "$NEW" ]] && checklog2
    ID="$rhs"
  fi
  [[ "$lhs" = "path" ]] && LPATH="$rhs" && NEW=false
  [[ "$lhs" = "error" ]] && ERROR="$rhs"
  [[ "$lhs" = "start" ]] && START="$rhs"
  [[ "$lhs" = "end" ]] && END="$rhs"
  }
done
# call one last time at the end
checklog2
} # end of checklog function

anomalydetection() {
# a few tests - you can always come up with more...
  diff=$((starttot - endtot))
  [[ $diff -gt $maxdiff ]] || [[ $starttot -lt $minstarts ]] || [[ $errortot -gt $maxerrors ]] && {
    ANOMALIES=1
    [[ "$DEBUG" -eq "1" ]] && echo ANOMALIES, $ANOMALIES, starttot, $starttot, endtot, $endtot, errortot, $errortot
  }
} # end function anomalydetection

sendsummary() {
  subject="Weekly summary of sesamstrasse automation scripts - please review"
  [[ -n "$ANOMALIES" ]] && subject="${subject} - ANOMALIES DETECTED PLEASE REVIEW CAREFULLY!!"

  intro="This summarizes the results from the past week of running automation scripts on sesamstrasse.
Please check that values seem reasonable. If things are out of range, check with Heiko or look at
sesamstrasse yourself.

"

  [[ "$DEBUG" -eq "1" ]] && echo subject, $subject, intro, "$intro", info, "$info"
  [[ "$DEBUG" -eq "1" ]] && args="-v"
  echo "${intro}${info}"|mail "$args" -s "$subject" "$recipients"
} # end function sendsummary

# MAIN PROGRAM
# always check the latest log
checklog
anomalydetection

# only check all logs if it is certain day of the week. Monday = 1, etc
day=$(date +%u)
[[ "$DEBUG" -eq "1" ]] && echo day, $day
[[ $day -eq $DAY ]] || [[ -n "$ANOMALIES" ]] && {
  [[ "$DEBUG" -eq "1" ]] && echo calling checklog with wildcard set
  wildcard='*'
  checklog
  sendsummary
}

[[ "$DEBUG" -eq "1" ]] && echo message so far is "$info"

log.ini

# The suggestion: To have a configuration file with log identifiers
#(e.g. “anydesk-edl”) and per identifier: log file path (“/var/log/anydesk-edl.log”),
# error pattern (“.+\[Error\].+”), start pattern (“.+\[Notice\] Starting$”) end pattern (“.+\[Notice\] Done$”).
#Then just count number of executions (based on start/end) and number of errors.

# the start/end/error values are interpreted as extended regular expressions - see regex(7) man page
identifier: anydesk-edl
path: /var/log/anydesk-edl.log
error: .+\[Error\].+
start: .+\[Notice\] Starting$
end: .+\[Notice\] Done$

identifier: firewall-requester-to-edl
path: /var/log/firewall-requester-to-edl.log
error: .+\[Error\].+
start: .+\[Notice\] Starting$
end: .+\[Notice\] Done$

identifier: sase-ips-to-bigip
path: /var/log/sase-ips-to-bigip.log
error: .+\[Error\].+
start: .+\[Notice\] Starting$
end: .+\[Notice\] Done$

What this script does

So when the guy writes an automation script, he is so meticulous that he follows the same convention and hooks it into the syslogger to create uniquely named log files for it. He writes out a [Notice] Starting when his script starts, and a [Notice] Done when it ends. And errors are reported with an [Error] details. Some of the scripts are called hourly. So we agreed to have a script that checks all the other scripts once a week and send a summary email of the results. I look to see that the count of starts and ends is roughly the same, and I report back the ten most recent errors from a given script. I also look for other basic things. That’s the purpose of the function anomalydetection in my script. It’s just basic tests. I didn’t want to go wild.

But what if there was a problem with one of the scripts, wouldn’t we want to know sooner than possibly six days later? So I decided to have my script run every day, but only send email on the off days if an anomaly was detected. This made the logic a tad more complex, but nothing bash and I couldn’t handle. It fits the need of an overworked operational staff.

Techniques I learned and re-learned from developing this script

cron scheduling – more to it than you thought

I used to naively think that it suffices to look into the crontab files of all users to discover all the scheduled processes. What I missed is thinking about how log rotate works. How does it work? Turns out there is another section of cron for jobs run daily, weekly and monthly. logrotate is called from cron.daily.

logrotate – potential to do more

The person who wrote the automation scripts is a much better scripter than I am. I didn’t want to disappoint so I put in the extra effort to discover the best way to call my script. I reasoned that logrotate would offer the opportunity to run side scripts, and I was absolutely right about that! You can run a script just before the logs rotate, or just after. I chose the just before timing – prerotate. In actual fact logrotate calls the prerotate script with all the log files to be rotate as arguments, which you notice we don’t take advantage of, because at the time we were unsure how we were going to interface. But I figure let’s just leave it now. man logrotate to learn more.

By the way although I developed on a generic Debian system, it should work on a Raspberry Pi as well since it is Debian based.

BASH – the potential to do more, at a price

You’ll note that I use some bash-specific extensions in my script. I figure bash is near universal, so why not? The downside is that when logrotate invokes an external script, it calls is using old-fashioned shell. And my script does not work. Except I learned this useful trick:

if [ ! "$BASH_VERSION" ]; then
  exec /bin/bash "$0" "$@"
  exit
fi

Note this is legit syntax in SHELL and a legit conditional operator expression. So it means if you – and by you I mean the script talking about itself – are invoked via SHELL, then invoke yourself via BASH and exit the parent afterwards. And this actually does work (To do: have to check which occurs first, the syntax checking or the command invocation).

More

Speaking of that conditional, if you want to know all the major comparison tests, do a man test. I have around to use the double bracket expressions [[ more and more, though they are BASH specific I believe. The double bracket can be followed by a && and then an open curly brace { which can introduce a block of code delimited of course by a close curly brace }. So for me this is an attractive alternative to SHELL’s if conditional then code block fi syntax, and probably just slightly more compact. Replace && with || to execute the code block when the condition does not evaluate to be true.

zgrep is grep for compressed files, but we knew that right? But it’s agnostic – it works like grep on both compressed and uncompressed files. That’s important because with rotated logs you usually have a combination of both.

Now the expert suggested a certain regular expression for the search string. It wasn’t working in my first pass. I reasoned that zgrep may have a special mode to act more like egrep which supports extended regular expressions (EREs). EREs aren’t really the same as perl-compatible regular expressions (PCREs) but for this kind of simple stuff we want, they’re close enough. And sure enough zgrep has the -E option to force it to interpret the expression as an ERE. Great.

RegEx

So in the log.ini file the regular expression has a \[…\] syntax. The backslash is actually required because otherwise the […] syntax is interpreted as a character class, where all the characters between the brackets get tried to match a single character in the string to be matched. That’s a very different match!

My big thing was – will I have to further escape those lines read in from log.ini, perhaps to replace a \[ with a \\[? Stuff like that happens. I found as long as I used those double quotes around the variables (see below) I did not need to further escape them. Similarly, I found that the EREs in log.ini did not need to be placed between quotes though the guy initially proposed that. It looks cleaner without them.

Variable scope

I wasted a lot of time on a problem which I thought may be due to some weird variable scoping. I’ve memorized this syntax cat file|while read line; do etc, etc so I use it a lot in my tiny scripts. It’s amazing I got away with it as much as I have because it has one huge flaw. if you start using variables within the loop you can’t really suck them out, unless you write them to a file. So while at first I thought it was a problem of variable scoping – why do my loop variables have no values when the code comes out of the loop? – it really isn’t that issue. It’s that the pipe, |, created a forked process which has its own variables. So to avoid that I switched to this weird syntax for line in $(<$INI); do etc. So it does the line-by-line file reading as before but without the pipe and hence without the “variable scope” problem.

But in another place in the script – where I add up numbers – I felt I could not avoid the pipe. So there I do write the value to a file.

The conclusion is that with the caveat that if you know what you’re doing, all variables have global scope, and that’s just as it should be. Hey, I’m from the old Fortran 66/77 school where we were writing Monte Carlos with thousands of lines of code and dozens of variables in a COMMON block (global scope), and dozens of contributors. It worked just fine thank you very much. Because we knew what we were doing.

Adding numbers in bash

Speaking of adding, I can never remember how to add numbers (integers). In bash you can do starts=$((starts + sline)) , where starts and sline are integers. At least this worked in Debian linux Stretch. I did not really get the same to work so well in SLES Linux – at least not inside a loop where I most needed it.

When you look up how to add numbers in bash there are about a zillion different ways to do it. I’m trying to stick to the built-in way.

Sending mail in Debian linux

You probably need to configure a smarthost if you haven’t used your server to send emails up until now. You have to reconfigure of the exim4 package:

dpkg-reconfigure exim4-config

This also can be done on a RPi if you ever find you need for it to send out emails.

Variables

If a variables includes linebreaks and you want to see that, put it between double-quotes, e.g., echo “$myVariableWithLineBreaks”. If you don’t do that it seems to remove the linebreaks. Use of the double quotes also seems to help avoid mangling variables that contain meta characters found in regular expressions such as .+ or \[.

Result of executing the commands

I grew up using the backtick metacharacter, `, to indicate that the enclosed command should be executed. E.g., old way:

DIR=`dirname $0`

But when you think about it, that metacharacter is small, and often you are unlucky and it sits right alongside a double quote or a single quote, making for a visual trainwreck. So this year I’ve come to love the use of $(command to be executed) syntax instead. It offers much improved readability. But then the question became, could I nest a command within a command, e.g., for my DIR assignment? I tried it. Now this kind of runs counter to my philosophy of being able to examine every single step as it executes because now I’m executing two steps at once, but since it’s pretty straightforward, I went for it. And it does work. Hence the DIR variable is assigned with the compound command:

DIR=$(cd $(dirname $0);pwd)

So now I wonder if you can go more than two levels deep? Each level is an incrementally bad idea – just begging for undetectable mistakes, so I didn’t experiment with that!

By the way the reason I needed to do that is that the script jumps around to another directory to create temporary files, and I wanted it to be able to reference the full path to its original directory, so a simpler DIR=$(dirname $0) wasn’t going to cut it if it’s called with a relative path such as ./checklogs.sh

Debugging

I make mistakes left and right. But I know what results I expect. So I generously insert statements as variables get assigned to double check them, prefacing them with a conditional [[ $DEBUG -eq 1 ]] && print out these values. As I develop DEBUG is set to 1. When it’s finally working, I usually set it to 0, though in some script I never quite reach that point. It looks like a lot of typing, but it’s really just cut and paste and not over-thinking it for the variable dump, so it’s very quick to type.

Another thing I do when I’m stuck is to watch as the script executes in great detail by appending -xv to the first line, e.g., #!/bin/bash -xv. But the output is always confusing. Sometimes it helps though.

Compensating for Outlook’s newline handling

Outlook is too clever for its own good and “helpfully” removes what it considers extra linefeeds. Thanks Microsoft. Really helpful. So if you add extra linefeeds you can kind of get around that, but then you go from 0 linefeeds in the displayed output to two. Again, thanks Microsoft.

Anyway, I disocvered sed ‘a/\\/’ is a way to add an extra linefeed to my error lines, where the problem was especially acute and noticeable.

Techniques I’d like to use in the future

You can assign a function to a variable and then call that variable. I know that will have lots of uses but I’m not used to the construct. So maybe for my next program.

Conclusion

This fairly simple yet still powerful script has forced me to become a better BASH shell scripter. In this post I review some of the basics that make for successful scripting using the BASH shell. I feel the time invested will pay off as there are many opportunities to write such utility scripts. I actually prefer bash to perl or python for these tasks as it is conceptually simpler, less ambitious, less pretentious, yes, far less capable, but adequate for my tasks. A few rules of the road and you’re off and running! bash lends itself to very quick testing cycles. Different versions of bash introduced additional features, and that gets trying. I hope I have found and utilized some of the basic stuff that will be available on just about any bash implementation you are likely to run across.

References and related

The nitty gritty details about BASH shell can be gleaned by doing a man bash. It seems daunting at first but it’s really not too bad once you learn how to skim through it.

This post shows how to properly use the syslog package within python to create these log files that I parse.