Categories
Linux Python

How to prevent an image from being rotated

Intro

I still use my home-grown slideshow software based on Raspberry Pi, which is quite a testament to its robustness as it has been running with only minor modifications for many years now. one recent improvement has been my addition of being able to handle photos from recent iPhones which save photos in the new-to-me HEIC format. My original implementation only handles JPEGs and PNG file types, so it was skipping all our recent iPhone photos.

I figured there just had to be a converter our there which would even work on the RPi, which of course there was, heif-convert. But it has an oddity when it comes to rotation. It converts the HEIC to a jpeg, fine, but it rotates them, but it also leaves all the EXIF meta data, including the orientation meta data, as is. This in turn means display software such as fbi may try to rotate the picture a second time. Or at least that’s what happened to my software where one of my steps is an explicit rotate. That step was creating a double rotation.

So I needed a tiny program which left all the EXIF meta data alone except the rotation, which it sets to 0, i.e., do not rotate. Seeing nothing out there, I developed my own.

The details

Here is that script, which I call 0orientation.py:

#!/usr/bin/python3
# DrJ 10/24
# you need the exif package: pip install exif
# https://github.com/kennethleungty/Image-Metadata-Exif?tab=readme-ov-file
#https://exif.readthedocs.io/en/latest/
import sys
from exif import Image
file = sys.argv[1]
with open(file, 'rb') as image_file:
    my_image = Image(image_file)
exif_flag = my_image.has_exif
print(exif_flag)
exifs = my_image.list_all()
print(exifs)
my_image.orientation = 0
with open('modified_image.jpg', 'wb') as new_image_file:
    new_image_file.write(my_image.get_file())

Conclusion

I have shared how to overwrite just the orientation tag frmo a JPEG photo.

Reference and related

My photo frame software based on photos stored in a Google Drive

Categories
Python

Azure DevOps: how to run python in a virtual environment

Intro

My pipeline job has been running without issue for the last year+. All of a sudden I started to receive this error right after my pip3 install -r requirements.txt:

× This environment is externally managed

Well I never had that problem before. I assumed that ADO sprinkles some magic dust onto the agents in the pool and creates an already virtual environment so why bother requiring further vitrualization?? I don’t know. And that’s how I rationalized how it might have ever worked in the first place if I ever bothered to think about it at all.

But I guess I was living on borrowed time and that house of cards came down hard, probably after the agents were upgraded.

The details

With some small effort I have managed to have the pipeline build up a virtual python environment and install my needed packages into it. Here is the relevant code section in the yaml file.

This by the way is my job to check that all pipelines have run correctly in the last hour:

# As of 2/2024 we need to run pytho in a virtual environment
- script: |
    python3 -m venv venv
    source ./venv/bin/activate
    pip3 install -vvv --timeout 60 -r requirements.txt
    python3 check_all_pipelines.py conf_check_all.ini
  displayName: 'Run script'
  workingDirectory: $(System.DefaultWorkingDirectory)/Pipeline_check
  env:
    ADO_AUTH: $(ado_auth)
    PYTHONPATH: $(System.DefaultWorkingDirectory)/Pipeline_check:$(System.DefaultWorkingDirectory)

The requirements.txt simply contains the line

pymsteams==0.2.2

Conclusion

We have shown how to set up a python virtual environment within your yaml file for an Azure DevOps pipeline. You might need this if you rely on any external packages which are not present in the OS version of python on the agents.

References and related

Wrtie-up of my ADO pipeline checker pipeline

How an ADO piepline can modify its own git repository

My favorite python tips

Categories
Linux Perl Python SLES

Using syslog within python

Intro

We created a convention where-in our scripts log to syslog with a certain style. Originally these were Perl scripts but newer scripts are written in python. My question was, how to do in python what we had done in Perl?

The details

The linux system uses syslog-ng. In /etc/syslog-ng/conf.d I created a test file 03drj.conf with these contents:

destination d_drjtest { file("/var/log/drjtest.log"); };
filter f_drjtest{ program("drjtest"); };
log { source(s_src); filter(f_drjtest); destination(d_drjtest); flags(final); };

So we want that each of our little production scripts has its own log file in /var/log/.

The python test program I wrote which outputs to syslog is this:

[

import syslog
syslog.openlog('jhtest',syslog.LOG_PID,facility=syslog.LOG_LOCAL0)
syslog.syslog(syslog.LOG_NOTICE,'[Notice] Starting')
syslog.syslog(syslog.LOG_ERR,'[Error] We got an error')
syslog.syslog(syslog.LOG_INFO,'[Info] Just informational stuff')

Easy, right? Then someone newer to python showed me what he had done – not using syslog but logger, in which he accomplished pretty much the same goal but by different means. But he had to hard-code a lot more of the values and so it was not as elegant in my opinion.

In any case, the output is in /var/log/drjtest.log which look like this after a test run:

Jul 24 17:45:32 drjohnshost drjtest[928]: [Notice] Starting
Jul 24 17:45:32 drjohnshost drjtest[928]: [Error] We got an error
Jul 24 17:45:32 drjohnshost drjtest[928]: [Info] Just informational stuff
OSes using rsyslog

Today I needed to make this style of logging work on a new system which was running rsyslog. The OS is SLES v 15. For this OS I added a file to /etc/rsyslog.d called drjtest.conf with the contents:

if ($programname == 'drjtest' ) then {
        -/var/log/drjtest.log
        stop
}

But the python program did not need to change in any way.

Conclusion

We show how to properly use the syslog facility within python by using the syslog package. It’s all pretty obvious and barely needs to be mentioned, except when you’re just statring out you want a little hint that you may not find in the man pages or the documentation at syslog-ng.

References and related

I have a neat script which we use to parse all these other scripts and give us once a week summary emails, unless and error has been detected in which case the summary email goes out the day after a script has reported an error. It has some pretty nice logic if I say so myself. Here it is: drjohns script checker.

Categories
Linux Python

Cloudflare DNS: using the python api

Intro

The examples provided on github are kind of wrong. I created an example script which actually works. If you simply copy their example and try the one where you add a DNS record using the python interface to the api, you will get this error:

CloudFlare.exceptions.CloudFlareAPIError: Requires permission “com.cloudflare.api.account.zone.create” to create zones for the selected account

Read on to see the corrected script.

Then some months later I created a script – still using the python api – to do a DNS export of all the zone files our account owns on Cloudflare. I will also share that.

The details

I call the program below listrecords.py. This one was copied from somewhere and it worked without modification:

import CloudFlare
import sys

def main():
    zone_name = sys.argv[1]

    cf = CloudFlare.CloudFlare()

    # query for the zone name and expect only one value back
    try:
        zones = cf.zones.get(params = {'name':zone_name,'per_page':1})
    except CloudFlare.exceptions.CloudFlareAPIError as e:
        exit('/zones.get %d %s - api call failed' % (e, e))
    except Exception as e:
        exit('/zones.get - %s - api call failed' % (e))

    if len(zones) == 0:
        exit('No zones found')

    # extract the zone_id which is needed to process that zone
    zone = zones[0]
    zone_id = zone['id']

    # request the DNS records from that zone
    try:
        dns_records = cf.zones.dns_records.get(zone_id)
    except CloudFlare.exceptions.CloudFlareAPIError as e:
        exit('/zones/dns_records.get %d %s - api call failed' % (e, e))

    # print the results - first the zone name
    print("zone_id=%s zone_name=%s" % (zone_id, zone_name))

    # then all the DNS records for that zone
    for dns_record in dns_records:
        r_name = dns_record['name']
        r_type = dns_record['type']
        r_value = dns_record['content']
        r_id = dns_record['id']
        print('\t', r_id, r_name, r_type, r_value)

    exit(0)

if __name__ == '__main__':
    main()

The next script adds a DNS record. This is the one which I needed to modify.

# kind of from https://github.com/cloudflare/python-cloudflare
# except that most of their python examples are wrong. So this is a working version...
import sys
import CloudFlare

def main():
    zone_name = sys.argv[1]
    print('input zone name',zone_name)
    cf = CloudFlare.CloudFlare()
# zone_info is a list: [{'id': '20bd55fbc94ff155c468739', 'name': 'johnstechtalk-2.com', 'status': 'pending',
    zone_info = cf.zones.get(params={'name': zone_name})
    zone_id = zone_info[0]['id']

    dns_records = [
        {'name':'foo', 'type':'A', 'content':'192.168.0.1'},
    ]

    for dns_record in dns_records:
        r = cf.zones.dns_records.post(zone_id, data=dns_record)
    exit(0)

if __name__ == '__main__':
    main()

The zone_id is where the original program’s wheels fell off. Cloudflare Support does not support this python api, at least that’s what they told me. So I was on my own. What gave me confidence that it really should work is that when you install the python package, it also installs cli4. And cli4 works pretty well! The examples work. cli4 is a command line program for linux. But when you examine it you realize it’s (I think) using the python behind the scenes. And in the original bad code there was a POST just to get the zone_id – that didn’t seem right to me.

Backup all zones in the Cloudflare account by doing a DNS export

I call this script backup-all-zones.py:

import os
import CloudFlare
from datetime import datetime

def listzones(cf):
    allzones = list()
    page_number = 0
    while True:
        page_number += 1
        raw_results = cf.zones.get(params={'per_page':20,'page':page_number})
        #print(raw_results)
        zones = raw_results['result']

        for zone in zones:
            zone_id = zone['id']
            zone_name = zone['name']
            print("zone_id=%s zone_name=%s" % (zone_id, zone_name))
            allzones.append([zone_id,zone_name])

        total_pages = raw_results['result_info']['total_pages']
        if page_number == total_pages:
            break
    #print('allzones',allzones)
    return allzones

# main program
today = datetime.today().date() # today's date
date = today.strftime('%Y%m%d') # formatted date
print('Begin backup of zones on this day:',date)
newdir = 'zones-' + date
os.makedirs(newdir,exist_ok=True)

cf = CloudFlare.CloudFlare(raw=True)
print('Getting list of all zones and zone ids')
allzones = listzones(cf)
print('Begin export of the zone data')
for zone in allzones:
    zone_id,zone_name = zone
    print('Doing dns export of',zone_id,zone_name)
# call to do a BIND-style export of the zone, specified by zoneid
    res = cf.zones.dns_records.export.get(zone_id)
    dns_records = res['result']
    with open(f'{newdir}/{zone_name}','w') as f:
        f.write(dns_records)
# create compressed tar file and delete temp directory
print('Create compressed tar file')
os.system(f'tar czf backups/{newdir}.tar.gz {newdir}')
print(f'Remove directory {newdir} and all its contents')
os.system(f'rm -rf {newdir}')

As mentioned in the comments the cool thing in this backup is that the format output is the BIND style of files, which are quite readable. Obviously this script is designed for linux systems because that’s all I use.

Backup all zones again, without using the CF python api

After we upgraded to python 3.12 the CF api didn’t really work. It may just be we needed the private certificates installed. Regardless, that led us to create this backup script which does not depend on the CF python api. Well, it includes it, but it’s not material and could be removed. Instead direct url calls are made.

import os
import CloudFlare
from datetime import datetime,timezone
import urllib3
import requests

def listzones(cf):
    allzones = list()
    page_number = 0
    results_per_page = 50 # max allowed
    while True:
        page_number += 1
        #raw_results = cf.zones.get(params={'per_page':20,'page':page_number})
        url = f"https://api.cloudflare.com/client/v4/zones?per_page={results_per_page}&page={page_number}"
        raw_results = requests.get(
            url=url,
            headers={
                'Authorization': 'Bearer {}'.format(os.environ['CLOUDFLARE_API_TOKEN']),
                'Content-Type': 'application/json'
            },
        )
        print('results')
        #print(raw_results)
        print(raw_results.json())
        zones = raw_results.json().get('result')

        for zone in zones:
            zone_id = zone['id']
            zone_name = zone['name']
            print("zone_id=%s zone_name=%s" % (zone_id, zone_name))
            allzones.append([zone_id,zone_name])

        total_count = raw_results.json()['result_info']['total_count']

        total_pages = int(total_count/results_per_page)
# fraction of a page is the last page, if applicable
        if total_count % results_per_page > 0: total_pages += 1
        if page_number == total_pages:
            break
    #print('allzones',allzones)
    return allzones

# main program
urllib3.disable_warnings()
#today = datetime.utcnow() # today's date
today = datetime.now(timezone.utc)
date = today.strftime('%Y%m%d') # formatted date
print('Begin backup of zones on this day:',date)
newdir = 'zones-' + date
os.makedirs(newdir,exist_ok=True)

cf = CloudFlare.CloudFlare(raw=True, warnings=False)
print('Getting list of all zones and zone ids')
allzones = listzones(cf)
print('Total zone count:',len(allzones))
print('Begin export of the zone data')
for zone in allzones:
    zone_id,zone_name = zone
    print('Doing dns export of',zone_id,zone_name)
# call to do a BIND-style export of the zone, specified by zoneid
    #res = cf.zones.dns_records.export.get(zone_id)
    raw_results = requests.get(
        url = 'https://api.cloudflare.com/client/v4/zones/{}/dns_records/export'.format(zone_id),
        headers={
            'Authorization': 'Bearer {}'.format(os.environ['CLOUDFLARE_API_TOKEN']),
            'Content-Type': 'application/json'
        },
    )
    dns_records = raw_results.text
    print('dns records: first few lines, skip a few, then a few more entries')
    print(dns_records[:128],dns_records[840:968])
    with open(f'{newdir}/{zone_name}','w') as f:
        f.write(dns_records)
# create compressed tar file and delete temp directory
print('Create compressed tar file')
os.system(f'tar czf backups/{newdir}.tar.gz {newdir}')
print(f'Remove directory {newdir} and all its contents')
os.system(f'rm -rf {newdir}')

The environment

Just to note it, you install the package with a pip3 install cloudflare. Then I set up an environment variable CLOUDFLARE_API_TOKEN before running these programs.

Conclusion

I’ve shown a corrected python script which uses the Cloudflare api. I’ve also shown another one which can do a backup of all Cloudflare zones.

References and related

The Cloudflare api

The (wrong) api examples on github

My hearty endorsement of Using Cloudflare’s free tier to protect your personal web site.

Categories
Linux Python Raspberry Pi

vlc command-line tips

Intro

I’m looking to test my old Raspberry Pi model 3 to see if it can play mp4 videos I recorded on my Samsung Galaxy A51 smartphone. I had assumed it would get overwhelmed and give up, but I haven’t tried in many years, so… The first couple videos did play, sort of. I was using vlc. Now if you’ve seen any of my posts you know I’ve written a zillion posts on running a dynamic slideshow based on RPi. Though the most important of these posts was written years ago, it honestly still runs and runs well to this day, amazingly enough. Usually technology changes or hardware breaks. But that didn’t happen. Every day I enjoy a brand new slideshow in my kitchen.

In most of my posts I use the old stalwart program fbi. In fact I don’t even have XWindows installed – it’s not a requirement if you know what you’re doing. But as far as I can see, good ‘ole fbi doesn’t do streaming media such as videos in mp4 format. As far as I know, vlc is more modern and most importantly, better supported. So after a FAIL trying with mplayer (still haven’t diagnose that one), I switched to trials with vlc.

I haven’t gotten very far, and that’s why I wanted to share my learnings. There’s just so much you can do with vlc, that even what you may think are the most common things anyone would want are very hard to find working examples for. So that’s where I plan to contribute to the community. As I figure out an ‘easy” thing, I will add it here. And if I’m the only one who ever refers to this post, so be it. I love my own My favorite python tips, post, for instance. it has everything I use on a regular basis. So I’m thinking this will be similar.

References and related

My RPi slideshow blog post

My favorite python tips – everything I need!

Categories
Linux Python

Blur images with Python

Intro

I sometimes find myself in need to blur images to avoid giving away details. I once blurred just a section of an image using a labor-intensive method involving MS Paint. Here I provide a python program to blur an entire image.

The program

I call it blur.py. It uses the Pillow package and it takes an image file as its input.

# Dr John - 4/2023
# This will import Image and ImageChops modules
import optparse
from PIL import Image, ImageEnhance
import sys,re

p = optparse.OptionParser()
p.add_option('-b','--brushWidth',dest='brushWidth',type='float')
p.set_defaults(brushWidth=3.0)
opt, args = p.parse_args()
brushWidth = opt.brushWidth
print('brushWidth',brushWidth)

# Open Image
image = args[0]
print('image file',image)


base = re.sub(r'\.\S+$','',image)
file_type = re.sub(r'^.+\.','',image)
canvas = Image.open(image)

width,height = canvas.size
print('Original image size',width,height)
widthn = int(width/brushWidth)
heightn = int(height/brushWidth)
smallerCanvas = canvas.resize((widthn, heightn), resample=Image.Resampling.LANCZOS)

# Creating object of Sharpness class
im3 = ImageEnhance.Sharpness(smallerCanvas)

# no of blurring passes to make. 5 seems to be a minimum required
iterations = 5

# showing resultant image
# 0,1,2: blurred,original,sharpened
for i in range(iterations):
    canvas_fuzzed = im3.enhance(0.0)
    im3 = ImageEnhance.Sharpness(canvas_fuzzed)

# resize back to original size
canvas = canvas_fuzzed.resize((width,height), resample=Image.Resampling.LANCZOS)
canvas.save(base + '-blurred.' + file_type)

So there would be nothing to write about if the the Pillow ImageEnhance worked as expected. But it doesn’t. As far as I can tell on its own it will only do a little blurring. My insight was to realize that by making several passes you can enhance the blurring effect. My second insight is that Image Enhance is probably only working within a 1 pixel radius. I have intruduced the concept of a brush size where the default width is 3.0 (pixels). I effectuate a brush size by reduing the image by the factor equal to the brush size! Then I do the blurring passes, then finally resize back to the original size! Brilliant, if I say so myself.

So in general it is called as

$ python3 blur.py -b 5 image.png

That example would be to use a brush size of 5 pixels. But that is optional so you can use my default value of 3 and call it simply as:

$ python3 blur.py image.png

Example output
Blur a select section of an image

You can easily find the coordinates of a rectangular section of an image by using, e.g., MS Paint and doing a mouseover in the corners of the rectangular section you wish to blur. Note the coordinates in the upper left corner and then again in the lower right corner. Mark them down in that order. My program even allows more than one section to be included. In this example I have three sections. The resulting image with its blurred sections is shown below.

Three rectangular setions of this image were blurred

Here is the code, which I call DrJblur.py for lack of a better name.

# blur one or more sections of an image. Section coordinates can easiily be picked up using e.g., MS Paint
# partially inspired by this tutorial: https://auth0.com/blog/image-processing-in-python-with-pillow/
# This will import Image and ImageChops modules
from PIL import Image, ImageEnhance
import sys,re

def blur_this_rectangle(image,x1,y1,x2,y2):
    box = (x1,y1,x2,y2)
    cropped_image = image.crop(box)

# Creating object of Sharpness class
    im3 = ImageEnhance.Sharpness(cropped_image)

# no of blurring passes to make. 10 seems to be a minimum required
    iterations = 10

# showing resultant image
# 0,1,2: blurred,original,sharpened
    for i in range(iterations):
        cropped_fuzzed = im3.enhance(-.75)
        im3 = ImageEnhance.Sharpness(cropped_fuzzed)

# paste this blurred section back onto original image
    image.paste(cropped_fuzzed,(x1,y1)) # this modified the original image

# Open Image
image = sys.argv[1]
base = re.sub(r'\.\S+$','',image)
file_type = re.sub(r'^.+\.','',image)
canvas = Image.open(image)

argNo = len(sys.argv)
boxNo = int(argNo/4) # number of box sections to blur
# (x1,y1) (x2,y2) of rectangle to blur is the next argument
for i in range(boxNo):
    j = i*4 + 2
    x1 = int(sys.argv[j])
    y1 = int(sys.argv[j+1])
    x2 = int(sys.argv[j+2])
    y2 = int(sys.argv[j+3])
    blur_this_rectangle(canvas,x1,y1,x2,y2)
canvas.save(base + '-blurred.' + file_type)

Here is how I called it:

$ python3 ~/draw/DrJblur.py MultipleVedges.PNG 626 415 1143 452 597 532 1164 566 621 645 1136 679

Conclusion

Since it can be a little hard to find an a simple and easy-to-use blurring program, I have written my own and provided it here for general use. Actually I have provided two programs. One blurs an entire picture, the other blurs rectangular sections within a picture. Although I hardcoded 10 passes, that number may need to be increased depending on the amount of blurriness desired. To blur a larger font I changed it to 50 passes, for example!

Obviously, obviously, if you have a decent image editing program like an Adobe Photoshop, you would just use that. There are also probably some online tools available. I myself am leery of using “free” online tools – there is always a hidden cost. And if you all you want to do is to erase in that rectangle and not blur, even lowly MS Paint can do that quite nicely all on its own. But as for me, I will continue to use my blurring program – I like it!

References and related

The need for the ability to blur an image arose when I wanted to share something concrete resulting from my network diagram as code effort.

I also am blurring some of the Grafana-generated images mentioned in this post: All I need to know about Grafana and InfluxDB.

Categories
Network Technologies Python

Python network diagram generator

Intro

Since they took away our Visio license to save licensing fees, some of us have wondered where to turn to. I once used the venerable old MS Paint after learning one of my colleagues used it. Some have turned to Powerpoint. Since I had some time and some previous familiarity with the components – for instance when I create CAD designs for 3D printing I am basically also doing CAD as code using openSCAD – I wondered if I could generate my network diagram using code? It turns out I can, at least the basic stuff I was looking to do.

Pillow

I’m sure there are much better libraries out there but I picked something that was very common although also very limited for my purposes. That is the python Pillow package. I created a few auxiliary functions to ease my life by factoring out common calls. I call the auxiliary modules aux_modules.py. Here they are.

from PIL import Image, ImageDraw, ImageFont
serverWidth = 100
serverHeight = 40
small = 5
fnt = ImageFont.truetype('/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf', 12)
fntBold = ImageFont.truetype('/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf', 11)

def drawServer(img_draw,xCorner,yCorner,text,color='white'):
# known good colors for visibility of text: lightgreen, lightblue, tomato, pink and of course white
# draw the server
    img_draw.rectangle((xCorner,yCorner,xCorner+serverWidth,yCorner+serverHeight), outline='black', fill=color)
    img_draw.text((xCorner+small,yCorner+small),text,font=fntBold,fill='black')

def drawServerPipe(img_draw,xCorner,yCorner,len,source,color='black'):
# draw the connecting line for this server. We permit len to be negative!
# known good colors if added text is in same color as pipe: orange, purple, gold, green and of course black
    lenAbs = abs(len)
    xhalf = xCorner + int(serverWidth/2)
    if source == 'top':
        coords = [(xhalf,yCorner),(xhalf,yCorner-lenAbs)]
    if source == 'bottom':
        coords = [(xhalf,yCorner+serverHeight),(xhalf,yCorner+serverHeight+lenAbs)]
    img_draw.line(coords,color,2)

def drawArrow(img_draw,xStart,yStart,len,direction,color='black'):
# draw using several lines
    if direction == 'down':
        x2,y2 = xStart,yStart+len
        x3,y3 = xStart-small,y2-small
        x4,y4 = x2,y2
        x5,y5 = xStart+small,y3
        x6,y6 = x2,y2
        coords = [(xStart,yStart),(x2,y2),(x3,y3),(x4,y4),(x5,y5),(x6,y6)]
    if direction == 'right':
        x2,y2 = xStart+len,yStart
        x3,y3 = x2-small,y2-small
        x4,y4 = x2,y2
        x5,y5 = x3,yStart+small
        x6,y6 = x2,y2
        coords = [(xStart,yStart),(x2,y2),(x3,y3),(x4,y4),(x5,y5),(x6,y6)]
    img_draw.line(coords,color,2)
    img_draw.line(coords,color,2)

def drawText(img_draw,x,y,text,fnt,placement,color):
# draw appropriately spaced text
    xy = (x,y)
    bb = img_draw.textbbox(xy, text, font=fnt, anchor=None, spacing=4, align='left', direction=None, features=None, language=None, stroke_width=0, embedded_color=False)
# honestly, the y results from the bounding box are terrible, or maybe I don't understand how to use it
    if placement == 'lowerRight':
        x1,y1 = (bb[0]+small,bb[1])
    if placement == 'upperRight':
        x1,y1 = (bb[0]+small,bb[1]-(bb[3]-bb[1])-2*small)
    if placement == 'upperLeft':
        x1,y1 = (bb[0]-(bb[2]-bb[0])-small,bb[1]-(bb[3]-bb[1])-2*small)
    if placement == 'lowerLeft':
        x1,y1 = (bb[0]-(bb[2]-bb[0])-small,bb[1])
    xy = (x1,y1)
    img_draw.text(xy,text,font=fntBold,fill=color)

How to use

I can’t exactly show my eample due to proprietary elements. So I can just mention I write a main program making lots of calls tto these auxiliary functions.

Tip

Don’t forget that in this environment, the x axis behaves like you learned in geometry class with positive x values to the right of the y axis, but the y axis is inverted! So positive y values are below the x axis. That’s just how it is in a lot of these programs. get used to it.

What I am lacking is a good idea to do element groupings, or an obvious way to do transformations or rotations. So I just have to keep track of where I am, basically. But even still I enjoy creating a network diagram this way because there is so much control. And boy was it easy to replicate a diagram for another one which had a similar layout.

It only required the Pillow package. I am able to develop my diagrams on my local PC in my WSL environment. It’s nice and fast as well.

Example Output

This is an example output from this diagram as code approach which I produced over the last couple days, sufficiently blurred for sharing.

Network diagram (blurred) resulting from use of this code-first approach
Conclusion

I provide my auxiliary functions which permit creating “network diagrams as code.” The results are not pretty, but networking people will understand them.

References and related

I developed a way to blur images using the Python Pillow package.

CAD as code: openSCAD is what I had in mind in taking this code first approach to building up geometries.

My disorganized cheat sheet of python language features I most commonly use.

Categories
Python

Azure DevOps: use the api to copy logs to linux

Intro

As far as I can tell there’s no way to search through multiple pipeline logs with a single command. In linux it’s trivial. Seeing the lack of this basic functionality I decided to copy all my pipeline logs over to a linux server using the Azure DevOps (ADO) api.

The details

This is the main program which I’ve called get_raw_logs.py.

#!/usr/bin/python3
# fetch raw log to local machine
# for relevant api section, see:
#https://learn.microsoft.com/en-us/rest/api/azure/devops/build/builds/get-build-log?view=azure-devops-rest-7.1
import urllib.request,json,sys
from datetime import datetime,timedelta
from modules import aux_modules

conf_file = sys.argv[1]

# pipeline uses UTC so we must follow suit or we will miss files
a_day_ago = (datetime.utcnow() - timedelta(days = 1)).strftime('%Y-%m-%dT%H:%M:%SZ')
print('a day ago (UTC)',a_day_ago)

#url = 'https://dev.azure.com/drjohns4ServicesCoreSystems/Connectivity/_apis/build/builds?minTime=2022-10-11T13:00:00Z&api-version=7.1-preview.7'

# dump config file into a dict
config_d = aux_modules.parse_config(conf_file)

url = config_d['url_base'] + config_d['organization'] + '/' + config_d['project'] + '/_apis/build/builds?minTime=' + a_day_ago + config_d['url_params']
#print('url',url)
req = urllib.request.Request(url)
req.add_header('Authorization', 'Basic ' + config_d['auth'])

# Get buildIds for pipeline runs from last 24 hours
with urllib.request.urlopen(req) as response:
   html = response.read()
txt_d = json.loads(html)
#{"count":215,"value":[{"id":xxx,buildNumber":"20221011.106","definition":{"name":"PAN-Usage4Mgrs-2"
value_l = txt_d['value']
for builds in value_l:
    buildId = builds['id']
    build_number = builds['buildNumber']
    build_def = builds['definition']
    name = build_def['name']
    #print('name,build_number,id',name,build_number,buildId)
    #print('this_build',builds)
    if name == config_d['pipeline1'] or name == config_d['pipeline2']:
        aux_modules.get_this_log(config_d,name,buildId,build_number)

In the modules directory this is aux_modules.py.

import json
import os,urllib.request

def parse_config(conf_file):
# config file should be a json file
    f = open(conf_file)
    config_d = json.load(f)
    f.close()
    return config_d

def get_this_log(config_d,name,buildId,build_number):
# leaving out the api-version etc works better
#GET https://dev.azure.com/{organization}/{project}/_apis/build/builds/{buildId}/logs/{logId}?api-version=7.1-preview.2
#https://dev.azure.com/drjohns4ServicesCoreSystems/d6335c8e-f5b4-44a5-8f6c-7b17fe663a86/_apis/build/builds/44071/logs/7'
        buildId_s = str(buildId)
        log_name = config_d['log_dir'] + "/" + name + "-" + build_number
# check if we already got this one
        if os.path.exists(log_name):
            return
        #url = url_base + organization + '/' + project + '/_apis/build/builds/' + buildId_s + '/logs/' + logId + '?' + url_params
        url = config_d['url_base'] + config_d['organization'] + '/' + config_d['project'] + '/_apis/build/builds/' + buildId_s + '/logs/' + config_d['logId']
        print('url for this log',url)
        req = urllib.request.Request(url)
        req.add_header('Authorization', 'Basic ' + config_d['auth'])
        with urllib.request.urlopen(req) as response:
            html = response.read()
        #print('log',html)
        print("Getting (name,build_number,buildId,logId) ",name,build_number,buildId_s,config_d['logId'])
        f = open(log_name,"wb")
        f.write(html)
        f.close()

Unlike programs I usually write, some of the key logic resides in the config file. My config file looks something like this.

{
"organization":"drjohns4ServicesCoreSystems",
"project":"Connectivity",
"pipeline1":"PAN-Usage4Mgrs",
"pipeline2":"PAN-Usage4Mgrs-2",
"logId":"7",
"auth":"Yaskaslkasjklaskldslkjsasddenxisv=",
"url_base":"https://dev.azure.com/",
"url_params":"&api-version=7.1-preview.7",
"log_dir":"/var/tmp/rawlogs"
}

It runs very efficiently so I run it every three minutes.

In my pipelines, all the interesting stuff is in logId 7 so I’ve hardcoded that. It could have turned out differently. Notice I am getting the logs from two pipelines due to the limitation, discussed previously, that you can only run 1000 pipeline runs a week so I was forced to run two identical ones, staggered, every 12 minutes with pipeline-2 sleeping the first six minutes.

The auth is the base-64 encoded text for any:<my_auth_token>.

Conclusion

I show how to copy the logs over from Azure DevOps pipeline runs to a local Unix system where you can do normal cool linux commands on them.

References and related

Running an ADO pipeline more than 1000 times a week.

ADO Rest api reference section relevant for this post: https://learn.microsoft.com/en-us/rest/api/azure/devops/build/builds/get-build-log?view=azure-devops-rest-7.1

How to secure a sensitive variable in ADO.

Categories
Cloud Python

Azure DevOps: How to work in a subfolder of a project

Intro

Our repo corresponds to a project. Within it are subfolders corresponding to individual apps we want to run in a pipeline.

Someone provided me with a starter yaml file to run my code in a pipeline. Originally my code was running just fine on my own linux server. In the pipeline, not so much as it became apparent it was expecting the current working directory to be the subfolder (directory in linux-speak) for references to modules, config files, etc. The documentation is kind of thin. So I spent hours checking things out and creating the solution which I now present.

The solution

The critical thing is to set the workingDirectory. Here is that section of the yaml file.

 script: python getlogdata.py 'PAN4-5min.aql'
  displayName: 'Run script'
  workingDirectory: $(System.DefaultWorkingDirectory)/PAN_Usage_4_Mgrs
  env:
    AUTH_TOKEN: $(auth_token)
#    PYTHONPATH: $(System.DefaultWorkingDirectory)/PAN_Usage_4_Mgrs/modules

Note that that PYTHONPATH environment variable is another possible way out – if all you need is to include modules, but it won’t help you with other things like finding your config file.

Errors

Now suppose you see an error like I got:

ImportError: cannot import name 'ZabbixMetric' from 'jhpyzabbix' (unknown location).

I had tried to put jhpyzabbix folder at the same level as my subfolder, so, right under the top project level. At first I was getting module not found errors. So I put back my PYTHONPATH like so

    PYTHONPATH: $(System.DefaultWorkingDirectory)/PAN_Usage_4_Mgrs:$(System.DefaultWorkingDirectory)

And that’s when I got that cannot import name error. Whar caused that is that although I had copied over the needed .py files to jhpyzabbix, I forgot one whose purpose seemed irrelevant to me. __init__.py. Turns out that tiny python file is quite important after all. School of hard knocks… It sets up the namespace mapping, I guess. To be concrete, mine looks like this:

from .api import ZabbixAPI, ZabbixAPIException, ssl_context_compat
from .sender import ZabbixMetric, ZabbixSender, ZabbixResponse
References and related

Passing secure variable in Azure DevOps to your program

Categories
Python

My favorite Python tips

Intro

I mostly transitioned from perl to python programming. I resisted for the longest time, but now I would never go back. I realized something. I was never really good at Perl. What I was good at were the regular expressions. So Perl for me was just a framework to write RegExes. But Perl code looks ugly with all those semicolons. Python is neater and therefore more legible at a glance. Python also has more libraries to pick from. Data structures in Perl were just plain ugly. I never mastered the syntax. I think I got it now in Python, which is a huge timesaver – I don’t have to hit the books every time I have a complex data structure.

I will probably find these tips useful and will improve upon them as I find better ways to do things. It’s mostly for my own reference, but maybe someone else will find them handy. I use maybe 5% of python? As I need additional things I’ll throw thm in here.

I’ve added some entries which I realized I needed so I can understand other people’s programming. For instance there are multiple ways to initialize an empty list.

What is this object?

Say you have an object <obj> and want to know what type it is because you’re a little lost. Do:

print(<obj>.__class__)

Check if this key exists in this dict

if “model” in thisdict:

But check out my with suppress(Exception): example further below to avoid this key test altogether.

Remove key from dict

if “model” in thisdict: del thisdict[“model”]

Copy (assign) one dict to another – watch the assignment operator!

Do not use dict2 = dict1! That is accepted, syntactically, but won’t work as you expect because the assignment operator (=) is economical and works by reference. Instead do this:

dict2 = dict1.copy()

It may even be necessary to use deepcopy:

import copy

dict2_complex = copy.deepcopy(dict1_complex)

Multiple assignments in on line

a,b,c = “hi”,23,”there”

Key and value from a single line

for itemid,val in itemvals.items():

Formatting

I guess it is pretty common to use a space-based (not tab) indent of four spaces for each subsequent code block.

Initializing lists and dicts

alist = []

blist = list() # another way to initialize an empty list

adict = {}

adict = dict() # another way to initialize an empty dict

Test for an empty list or empty dict or empty string

if not alist: print(“empty list”)

if not adict: print(“the dict adict is empty”)

astring=””

if not astring: print("the string is empty")
Avoid the KeyError: error

I just learned this technique. Wish I had known sooner!

a = adict.get(‘my_nonexistent_key’) # returns a with None if key does not exist. To test a: if a == None: …

Length of a list or string or dict

len(alist)

len(astring)

len(adict)

Merge two lists together

for elmnt in list2: list1.append(elmnt)

Address first/last element in a list

alist[0] # first element

alist[-1] # last element

Iterate (loop) over a list

for my_element in alist: print(my_element) # all on one line for demo!

First/Last two characters in a string

astring[:2]

astring[-2:]

Third and fourth characters in a string

astring[2:4] # returns AE for astring = EUAEABUDH0014

Lowercase a string

astring.lower() # there also exists an upper() function as well, of course

Conditional (comparison) operators

if a == b: print(“equals”) # so == is comparison operator for strings

if re.search(r’eq’,a):

do something

elif re.search(r’newstring’,a):

do something else

else:

etc.

Order of evaluation of conditionals and max value of a dictionary

a = {‘hi’:0,’there’:1,’man’:2}

if not a or max(a.values()) < 3: do something

Is the above expression safe to evaluate in the case where the dict a is defined but empty? Answer: yes, it is! Although by itself max(a.values()) would produce an error, in this or conditional, execution, I guess, never reaches that statement because the first statement evaluates as True. Same reasoning applies if the boolean operator is and.

Ternary operator

I don’t think is well-developed in Python and shouldn’t be used (my opinion).

++ operator? Doesn’t exist. += and its ilk does, however.

Absolute Value

abs(a)

Boolean variables + multiple assignment example

a, b=True, False

if a==b: print(“equals”)

if a: print(“a is true”)

Reduce number of lines in the program

for n in range(12): colors[n] = ‘red’

if not mykey in mydict: mydict[mykey] = []

Printing stuff while developing

print(“mydict”,mydict,flush=True)

Python figures it out how to print your object, whatever type it is, which is cool. That flush=True is needed if you want to see our output which you’ve redirected to a file right away! Otherwise it gets buffered.

Reading and writing files – prettyify.py

import requests, json, sys, os
import sys,json

from pathlib import Path
aql_file = sys.argv[1]
aql_path = Path(aql_file)
json_file = str(aql_path.with_suffix('.json'))

# Script path
dir_path = os.path.dirname(os.path.realpath(__file__))
dir_path_files = dir_path + "/files/"

# make ugly json file prettier    
# this is kind of a different example, mixed in there
file = sys.argv[1]
f = open(file)
# return json obj as dict
fjson = json.load(f)
nicer = json.dumps(fjson,indent=4)
print(nicer,flush=True)
# back to original example
f = open(dir_path_files + json_file,'w+')
f.write(body)
f.close()

Reading in command-line arguments

Reading in a boolean value

python pgm.py False

So, you could use argparse, but I chose ast. Then I have a line in the script:

import ast
overwrite_s = sys.argv[1] # either True of False - whether to overwrite or not
overwrite = ast.literal_eval(overwrite_s)

Nota Bene that if you fail to take these steps your argument will be read in as a string, not a boolean!

See Reading and Writing files example.

Parsing command line arguments II

Here is a more versatile and generalized way to parse command line arguments.

import optparse
p = optparse.OptionParser()
p.add_option('-b','--brushWidth',dest='brushWidth',type='float')
p.set_defaults(brushWidth=1.0)
opt, args = p.parse_args()
width = opt.brushWidth
print('brushWidth',width)
print(width.class)
remaining arguments
print(args)

$ python3 tst.py -b 1.2 my_file.png

brushWidth 1.2

['my_file.png']
Rounding a floating point number to two significant digits

a = round(901/3600,2)

Command line tips

The command line is your friend and should be used for little tests. Moreover, you can print an object without even a print statement.

>>>a =[1,’hi’,3]

>>>a

Going from byte object to string

s_b = b’string’

s = s_b.decode(‘utf-8’)

Test if object is a string

if type(thisobject) == str: print(“It is a string”)

Python as a calculator

I always used python command line as a calculator, even ebfore I knew the language syntax! It’s very handy.

>>> 5 + 6/23

Breaking out of a for loop

Use the continue statement after testing a condition to skip remaining block and continue onto next iteration. Use the break to completely skip out of this loop. Note that break and continue only apply to the innermost loop!

Infinite loop

while True: # then continue with statements in a code block

Iterator to get key value pairs out of a dict

>>>a = {‘hi’:’there’,’hi2′:12}

>>>for k,v in a.items():

>>> print(‘key,value’,k,v)

Executing shell commands

import os

os.system(“ls -l”)

But, to capture the output, you can use the subprocess package:

import subprocess

output = subprocess.run(cmd, shell=True, capture_output=True)

Generate (pseudo-)random numbers

import random

a = random.random()

Accessing environment variables

os.environ[‘ENV_TOKEN’]

Handling glob (wildcards) in your shell command

import glob

for query_results_file in glob.glob(os.path.join(dir_path_files,OSpattern)): print(“query_results_file”,query_results_file)

But, if you want the results in the same order as the shell gives, put a sorted() around that. Otherwise the results come out randomly.

JSON tips

Python is great for reading and writing JSON files.

# Load inventory file

with open(dir_path_files + inventory_file) as inventory_file:
inventory_json = json.load(inventory_file)

sitenoted={'gmtOffset':jdict["gmtOffset"],'timezoneId':jdict["timezoneId"]}

# update inventory with custom field Site Notes – put GMT – make sitenoted pretty using json.dumps

sitenote=json.dumps(sitenoted,indent=4)
print("sitenote",sitenote)

Convert a string which basically is in json format to a Python data structure

import json
txt_d = json.loads(response.text)
Convert int to a string

str(42) # produces ’42’

Test for null in JSON value

You may see “mykey”:null in your json values. How to test for that?

if my_dict[mykey] == None: continue

Format a json file into something human-readable

curl json_api|python3 -m json.tool

Sleep

from time import sleep

sleep(0.1)

RegExes

Although supported in Python, seems kind of ugly. Many RegExes will need to prefaced with r (raw), or else you’ll get yourself into trouble, as in

import re

r'[a-z]{4}.\s*\w(abc|def)’

if re.search(‘EGW-‘,locale): continue

b = re.sub(‘ ‘,’-‘,locale) # replace the first space with a hyphen

b = re.split(r’\s’,’a b c d e f’) # creates list with value [‘a’,’b’,’c’,’d’,’e’,’f’]

[subnet,descr] = re.split(‘,’,’10.1.2.3/24,descr,etc’,maxsplit=1)

If your match is itself a variable where you have escaped characters, you may wish to use a compile expression:

subnet_preescape = ‘10.0.0.’

subnet_drj = re.sub(r’\.’,’\\.’,subnet_preescape)

subnet = re.compile(subnet_drj)

if subnet.search(subnet_to_test): etc.

Minimalist URL example

import urllib.request

res = urllib.request.urlopen(‘https://drjohnstechtalk.com/’).read()

Function arguments: are they passed by reference or by value?

This section needs more research and may be inaccurate or simply wrong! By reference for complex objects like a dict (not sure about a list), but by value for a simple object like a Boolean! I got burned by this. I wanted a Boolean to persist across a function call. In the end I simple stuffed it into a dict! And that worked. But python doesn’t use that terminology. But it means you can pass your complex data structure, say a list of dicts of dicts, start appending to the list in your function, and the calling program has access to the additional members.

Print to a string a la sprintf

In python 3.6 and later you have the f-format which is way cool. Stuff between curly braces gets evaluated in place. Say a = 3 and b = ‘man’, then

str = f"first some text mixed with value of a, which is {a} and the text of b, which is {b}"

So no need to paste a string together with awkward combos of strings, plus signs and variables! You may also see this done as

str = ‘different way to inject a variable {a}’.format(a)

Insert a newline character into a string

a=’b\nc’ # when you print(a) b and c will be on separate lines

Putting the concepts to work: print out n randomly sampled lines from a file

import random,sys

def random_line(fname):
    lines = open(fname).read().splitlines() # splitlines removes \n chars
    return random.choice(lines)

file = sys.argv[1]
no_lines = int(sys.argv[2])
for n in range(no_lines):
    print(random_line(file))

Count occurences of a substring within a string

if ‘egw-fw’.count(‘egw’) > 1:

String concatenation operator (+)

newstring = ‘first string’ + myoldstringvariable

Working with IP addresses

Is this IP address in this subnet test

import ipaddress
ipad = ipaddress.ip_address(‘192.0.2.1’)
ipsubnet = ipaddress.IPv4Network(‘192.0.0.0/22’)
if ipad in ipsubnet: print(‘hi’)

Excel files

I’ve been using the package openpyxl quite successfully to read and write Excel files but I see that pandas also has built-in functions to read spreadsheets.

Date and time

import time

epoch_time = int(time.time()) # seconds since the epoch

Math

numpy seems to be the go-to package.

Using syslog

Please see this post.

Can a keyword be a variable?

Yes. Here’s an example.

timeunit = ‘days’

numbr = 3

datetime.now() + timedelta(**{timeunit: numbr})

try except block with retry for requests.get

import urllib3
import requests
from time import sleep
url = 'https://drjohnstechtalk.com/api'
try:
raw_results = requests.get(url=url)
except requests.exceptions.RequestException as e:
print('error is',e,'But we will pause and try again! Retrying now...')
sleep(90)
raw_results = requests.get(url=url)

Generically, we can do

try:

code

except Exception as e:

print(f'This exception occurred: {e}')

I'm beginning to think this python 3.4 construct is superior when you don't care at all about the exception and just want program execution to proceed as in this example:

from contextlib import suppress

wired = True

with suppress(Exception):

if item["networkProfile"]["wirelessProfile"]: wired = False # the existence of this key implies wireless

The python philosophy is to try first and ask permission later, like many Silicon Valley startups, hence the importance of adding these try / except blocks. I usually only add them when there is a demonstrated need.

Once when I was a bit unsure about what exception I was catching I just put in a generic except: (to catch some kind of timeout in that case) and it worked like a charm! But I know it is not best practice.

Date arithmetic

import datetime,calendar
from datetime import timedelta
today = datetime.datetime.now(datetime.UTC) # current time in UTC land
date = today.strftime('%Y%m%d') # e.g., 20240418

H = today.hour # just the hour, as an integer
t_hr = today - timedelta(hours = 1)
last_sec = datetime.datetime(t_hr.year, t_hr.month, t_hr.day, t_hr.hour, 59, 59) # the last second of that hour!
time_stamp = calendar.timegm(last_sec.timetuple()) # seconds since epoch for last_sec

Working with exit()

I like to add an exit() when testing code inside a loop so that the first iteration executes but I don't sit around waiting for the whole thing to be done because I probably have other mistakes I need to correct. However, that can cause trouble if that is inside a try/except block! If the except block has no explicit Exception, it will always get executed and therefore you won't exit! To get around this, this construct can be used:

try:
    exit() # this always raises SystemExit
except SystemExit:
    print("exit() worked as expected")
except:
    print("Something is horribly wrong") # some other exception got raised

Python and self-signed certificates, or certificates issued by private CAs

I updated this blog article to help address that: Adding private root CAs in Redhat or SLES or Debian.

Write it with style

Use flake8 to see if your python program conforms to the best practice style. In CentOS I installed flake8 using pip while in Debian linux I installed it using apt-get install flake8. But I end up using pyflakes.

In-line comments

Good code writers (but not me) may spread out their function calls over multiple lines. Yes, you can put a comment using the # character at the end of each of those in-between lines!

Skip first element of a generator function
subnet_g = ipaddress.IPv4Network('10.23.97.0/26').hosts() # subnet_g is a generator
subnet_l = list(subnet_g) # turn it into a list
for ip in subnet_l[1:]: # skips over first element in the list
    print('ip is',ip)
Does it at least pass the compiler - check syntax without running it

Install pyflakes: pip3 install pyflakes. Then

pyflakes your_script.py
Can I modify a Python script while its running?

Sure. No worries. It is safe to do so.

Print statement prints everything twice

This happens if you unfortunately named your program the same as a module you are importing. In this situation the program imports itself and runs twice. Rename your program something different!

Create virtual environment for portability

I like to call my virtual environment venv.

python3 -m venv venv # requires the SYSTEM package python3.11-venv

Use this virtual environment

source ./venv/bin/activate

List all the packages in this virtual environment

Good portable development style would have you install the minimal set of packages in your virtual environment and then build a requirements.txt file:

pip3 freeze > requirements.txt

Leave this virtual environment

deactivate

Test if package has been installed

python3 -c "import pymsteams" # is pymsteams package present?

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'pymsteams'
Multiple python versions in Redhat

Say you want python3 to refer to your nice new python3.12 installation. Then try something like this:

alternatives --set python3 /usr/bin/python3.12
List all attributes of a module
import datetime
print(dir(datetime))
Conclusion

I've written down some of my favorite tips for using python effectively.

References and related

Good guide to working with dates and times in python: https://www.programiz.com/python-programming/datetime

Adding private root CAs in Redhat or SLES or Debian.

Writing output to syslog

A convention for commits