Intro
I mostly transitioned from perl to python programming. I resisted for the longest time, but now I would never go back. I realized something. I was never really good at Perl. What I was good at were the regular expressions. So Perl for me was just a framework to write RegExes. But Perl code looks ugly with all those semicolons. Python is neater and therefore more legible at a glance. Python also has more libraries to pick from. Data structures in Perl were just plain ugly. I never mastered the syntax. I think I got it now in Python, which is a huge timesaver – I don’t have to hit the books every time I have a complex data structure.
I will probably find these tips useful and will improve upon them as I find better ways to do things. It’s mostly for my own reference, but maybe someone else will find them handy. I use maybe 5% of Python? As I need additional things I’ll throw them in here.
I’ve added some entries which I realized I needed so I can understand other people’s programming. For instance there are multiple ways to initialize an empty list.
What is this object?
Say you have an object <obj> and want to know what type it is because you’re a little lost. Do:
print(<obj>.__class__)
Check if this key exists in this dict
if "model" in thisdict:
Remove key from dict
if "model" in thisdict: del thisdict["model"]
Copy (assign) one dict to another – watch the assignment operator!
Do not use dict2 = dict1! That is accepted syntactically, but won’t work as you expect: the assignment operator (=) does not copy anything, it just makes dict2 a second name for the same dict, so a change made through one name shows up under the other. Instead do this:
dict2 = dict1.copy()
If the dict contains nested lists or dicts, copy() still shares those inner objects, so it may even be necessary to use deepcopy:
import copy
dict2_complex = copy.deepcopy(dict1_complex)
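Here is a small sketch of the difference between plain assignment, copy() and deepcopy. The dict contents and the names original, alias, shallow and deep are made up for illustration.

```python
import copy

original = {"model": "x1", "tags": ["a", "b"]}

alias = original                  # same object, just a second name
shallow = original.copy()         # new dict, but nested objects are shared
deep = copy.deepcopy(original)    # new dict and new nested objects

original["model"] = "x2"          # top-level change
original["tags"].append("c")      # change inside the nested list

print(alias["model"])    # x2 - the alias sees everything
print(shallow["model"])  # x1 - top-level keys are independent
print(shallow["tags"])   # ['a', 'b', 'c'] - nested list is still shared
print(deep["tags"])      # ['a', 'b'] - fully independent
```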
Multiple assignments in one line
a,b,c = "hi",23,"there"
Key and value from a single line
for itemid,val in itemvals.items():
Formatting
I guess it is pretty common to use a space-based (not tab) indent of four spaces for each subsequent code block.
Initializing lists and dicts
alist = []
blist = list() # another way to initialize an empty list
adict = {}
adict = dict() # another way to initialize an empty dict
Test for an empty list or empty dict or empty string
if not alist: print("empty list")
if not adict: print("the dict adict is empty")
astring=""
if not astring: print("the string is empty")
Avoid the KeyError: error
I just learned this technique. Wish I had known sooner!
a = adict.get('my_nonexistent_key') # a is None if the key does not exist. To test a: if a is None: …
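get() also takes an optional second argument that is returned instead of None when the key is missing. A minimal sketch, with a made-up dict and key names:

```python
adict = {"color": "red"}

# The second argument to get() is returned when the key is missing
size = adict.get("size", "unknown")
color = adict.get("color", "unknown")

missing = adict.get("size")     # no default given, so this is None
if missing is None:
    print("no size key")

print(size, color)  # unknown red
```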
Length of a list or string or dict
len(alist)
len(astring)
len(adict)
Merge two lists together
list1.extend(list2) # same result as: for elmnt in list2: list1.append(elmnt)
Address first/last element in a list
alist[0] # first element
alist[-1] # last element
Iterate (loop) over a list
for my_element in alist: print(my_element) # all on one line for demo!
First/Last two characters in a string
astring[:2]
astring[-2:]
Third and fourth characters in a string
astring[2:4] # returns AE for astring = EUAEABUDH0014
Lowercase a string
astring.lower() # there is also an upper() method, of course
Conditional (comparison) operators
if a == b: print("equals") # so == is the comparison operator for strings
if re.search(r'eq',a):
    pass # do something
elif re.search(r'newstring',a):
    pass # do something else
else:
    pass # etc.
Order of evaluation of conditionals and max value of a dictionary
a = {'hi':0,'there':1,'man':2}
if not a or max(a.values()) < 3: pass # do something
Is the above expression safe to evaluate in the case where the dict a is defined but empty? Answer: yes, it is! By itself max(a.values()) would raise a ValueError on an empty dict, but Python evaluates or from left to right and stops as soon as the result is known (short-circuit evaluation). Since not a is already True, max() is never called. The same reasoning applies to and, which stops at the first operand that is false.
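To convince yourself, here is a small sketch you can paste into the interpreter. The function names is_small and has_big_value are made up for this example.

```python
def is_small(d):
    # 'not d' is True for an empty dict, so max() is never called on it
    return not d or max(d.values()) < 3

print(is_small({}))                      # True
print(is_small({'hi': 0, 'there': 1}))   # True
print(is_small({'hi': 5}))               # False

# With 'and', evaluation stops at the first falsy operand instead
def has_big_value(d):
    return bool(d) and max(d.values()) >= 3

print(has_big_value({}))          # False, and max() is not reached
print(has_big_value({'hi': 5}))   # True
```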
Ternary operator
Python does have one, written a if condition else b, but I don’t think it is well-developed and it shouldn’t be used (my opinion).
++ operator? Doesn’t exist. += and its ilk do, however.
Absolute Value
abs(a)
Boolean variables + multiple assignment example
a, b=True, False
if a==b: print("equals")
if a: print("a is true")
Reduce number of lines in the program
for n in range(12): colors[n] = 'red'
if mykey not in mydict: mydict[mykey] = []
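For the "create an empty list if the key is missing" pattern, the standard library has two shortcuts: dict.setdefault() and collections.defaultdict. A minimal sketch with made-up keys and values:

```python
from collections import defaultdict

mydict = {}
mydict.setdefault("fruits", []).append("apple")   # creates the list if needed
mydict.setdefault("fruits", []).append("pear")    # reuses the existing list

groups = defaultdict(list)      # missing keys start out as an empty list
groups["fruits"].append("apple")
groups["veg"].append("leek")

print(mydict)         # {'fruits': ['apple', 'pear']}
print(dict(groups))   # {'fruits': ['apple'], 'veg': ['leek']}
```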
Printing stuff while developing
print("mydict",mydict,flush=True)
Python figures out how to print your object, whatever type it is, which is cool. That flush=True is needed if you want to see your output right away when you’ve redirected it to a file! Otherwise it gets buffered.
Reading and writing files – prettyify.py
import requests, json, sys, os
from pathlib import Path
aql_file = sys.argv[1]
aql_path = Path(aql_file)
json_file = str(aql_path.with_suffix('.json'))
# Script path
dir_path = os.path.dirname(os.path.realpath(__file__))
dir_path_files = dir_path + "/files/"
# make ugly json file prettier
# this is kind of a different example, mixed in there
file = sys.argv[1]
with open(file) as f:
    # return json obj as dict
    fjson = json.load(f)
nicer = json.dumps(fjson,indent=4)
print(nicer,flush=True)
# back to original example
# body holds the text to write (it is set elsewhere in the script)
with open(dir_path_files + json_file,'w+') as f:
    f.write(body)
Reading in command-line arguments
Reading in a boolean value
python pgm.py False
So, you could use argparse, but I chose ast. Then I have a line in the script:
import ast
overwrite_s = sys.argv[1] # either True or False - whether to overwrite or not
overwrite = ast.literal_eval(overwrite_s)
Nota Bene that if you fail to take these steps your argument will be read in as a string, not a boolean!
See Reading and Writing files example.
Parsing command line arguments II
Here is a more versatile and generalized way to parse command line arguments.
import optparse
p = optparse.OptionParser()
p.add_option('-b','--brushWidth',dest='brushWidth',type='float')
p.set_defaults(brushWidth=1.0)
opt, args = p.parse_args()
width = opt.brushWidth
print('brushWidth',width)
print(width.__class__)
# remaining arguments
print(args)
$ python3 tst.py -b 1.2 my_file.png
brushWidth 1.2
<class 'float'>
['my_file.png']
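The same thing can be done with argparse, which was mentioned earlier. This is only a sketch: the helper name parse and the positional argument name files are my own choices, not part of the script above.

```python
import argparse

def parse(argv=None):
    p = argparse.ArgumentParser()
    p.add_argument('-b', '--brushWidth', dest='brushWidth', type=float, default=1.0)
    p.add_argument('files', nargs='*')   # remaining positional arguments
    return p.parse_args(argv)            # argv=None means use sys.argv[1:]

opt = parse(['-b', '1.2', 'my_file.png'])
print('brushWidth', opt.brushWidth)   # brushWidth 1.2
print(opt.files)                      # ['my_file.png']
```

Calling parse() with no argument reads the real command line, so the same function works unchanged in a script.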
Rounding a floating point number to two decimal places
a = round(901/3600,2) # 0.25
Command line tips
The interactive command line (just type python3) is your friend and should be used for little tests. Moreover, you can print an object without even a print statement.
>>> a = [1,'hi',3]
>>> a
Going from byte object to string
s_b = b'string'
s = s_b.decode('utf-8')
Test if object is a string
if isinstance(thisobject, str): print("It is a string")
Python as a calculator
I always used the python command line as a calculator, even before I knew the language syntax! It’s very handy.
>>> 5 + 6/23
Breaking out of a for loop
Use the continue statement after testing a condition to skip remaining block and continue onto next iteration. Use the break to completely skip out of this loop. Note that break and continue only apply to the innermost loop!
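A quick sketch of both statements, including the innermost-loop behavior. The variable names and numbers are made up.

```python
found = []
for n in range(10):
    if n % 2 == 0:
        continue        # skip even numbers, go on to the next n
    if n > 7:
        break           # leave the loop entirely
    found.append(n)

print(found)  # [1, 3, 5, 7]

pairs = []
for i in range(3):
    for j in range(3):
        if j == 1:
            break       # only ends the inner j loop; the i loop carries on
        pairs.append((i, j))

print(pairs)  # [(0, 0), (1, 0), (2, 0)]
```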
Infinite loop
while True: # then continue with statements in a code block
Iterator to get key value pairs out of a dict
>>> a = {'hi':'there','hi2':12}
>>> for k,v in a.items():
...     print('key,value',k,v)
Executing shell commands
import os
os.system("ls -l")
But, to capture the output, you can use the subprocess package:
import subprocess
output = subprocess.run(cmd, shell=True, capture_output=True) # cmd is your command string; the captured text is in output.stdout (as bytes)
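Here is a self-contained sketch of capturing output. To avoid depending on any particular shell command it runs a second Python process, which is purely my choice for the demo. text=True makes stdout a string rather than bytes.

```python
import subprocess
import sys

# Run a child Python process so the example works without a specific shell command
result = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    capture_output=True,   # collect stdout and stderr
    text=True,             # decode bytes to str
)

print(result.returncode)       # 0 on success
print(result.stdout.strip())   # hello from child
```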
Generate (pseudo-)random numbers
import random
a = random.random()
Accessing environment variables
os.environ['ENV_TOKEN']
Handling glob (wildcards) in your shell command
import glob
for query_results_file in glob.glob(os.path.join(dir_path_files,OSpattern)): print("query_results_file",query_results_file)
But, if you want the results in the same order as the shell gives, put a sorted() around that. Otherwise the results come out in arbitrary order.
JSON tips
Python is great for reading and writing JSON files.
# Load inventory file
with open(dir_path_files + inventory_file) as f:
    inventory_json = json.load(f)
sitenoted={'gmtOffset':jdict["gmtOffset"],'timezoneId':jdict["timezoneId"]}
# update inventory with custom field Site Notes – put GMT – make sitenoted pretty using json.dumps
sitenote=json.dumps(sitenoted,indent=4)
print("sitenote",sitenote)
Convert a string which basically is in json format to a Python data structure
import json
txt_d = json.loads(response.text)
Test for null in JSON value
You may see “mykey”:null in your json values. How to test for that?
if my_dict[mykey] is None: continue
Format a json file into something human-readable
curl json_api|python3 -m json.tool
Sleep
from time import sleep
sleep(0.1)
RegExes
Although supported in Python, they seem kind of ugly. Many RegExes will need to be prefaced with r (raw), or else you’ll get yourself into trouble, as in
import re
r'[a-z]{4}.\s*\w(abc|def)'
if re.search('EGW-',locale): continue
b = re.sub(' ','-',locale) # replaces every space with a hyphen; add count=1 to replace only the first
b = re.split(r'\s','a b c d e f') # creates list with value ['a','b','c','d','e','f']
[subnet,descr] = re.split(',','10.1.2.3/24,descr,etc',maxsplit=1)
Minimalist URL example
import urllib.request
res = urllib.request.urlopen('https://drjohnstechtalk.com/').read()
Function arguments: are they passed by reference or by value?
I got burned by this. In practice it behaves like by reference for mutable objects such as a dict or a list, and like by value for an immutable object such as a Boolean, an integer or a string. Strictly speaking Python doesn’t use that terminology: a function always receives a reference to the same object the caller has, but assigning a new value to the parameter inside the function only rebinds the local name, and the caller never sees that. I wanted a Boolean to persist across a function call. In the end I simply stuffed it into a dict! And that worked (returning the new value would work too). It also means you can pass your complex data structure, say a list of dicts of dicts, start appending to the list in your function, and the calling program has access to the additional members.
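A short sketch of the behavior described above. The function and variable names are made up for illustration.

```python
def add_item(items):
    items.append("new")      # mutates the caller's list in place

def try_to_flip(flag):
    flag = not flag          # rebinds the local name only
    return flag

def flip_in_dict(state):
    state["done"] = not state["done"]   # mutates the caller's dict

mylist = []
add_item(mylist)

myflag = False
returned = try_to_flip(myflag)

state = {"done": False}
flip_in_dict(state)

print(mylist)    # ['new']
print(myflag)    # False - unchanged in the caller
print(returned)  # True - use the return value instead
print(state)     # {'done': True}
```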
Print to a string a la sprintf
In python 3.6 and later you have the f-string format which is way cool. Stuff between curly braces gets evaluated in place. Say a = 3 and b = 'man', then
s = f"first some text mixed with value of a, which is {a} and the text of b, which is {b}"
So no need to paste a string together with awkward combos of strings, plus signs and variables! You may also see this done as
s = 'different way to inject a variable {a}'.format(a=a)
Insert a newline character into a string
a='b\nc' # when you print(a) b and c will be on separate lines
Putting the concepts to work: print out n randomly sampled lines from a file
import random,sys
def random_line(fname):
    lines = open(fname).read().splitlines() # splitlines removes \n chars
    return random.choice(lines)
file = sys.argv[1]
no_lines = int(sys.argv[2])
for n in range(no_lines):
    print(random_line(file))
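The script above re-reads the file for every line it prints and may print the same line twice. If you want n different lines, random.sample() is one way to get them. This is a variation of mine, not the script above: the function name random_lines and the throwaway demo file are assumptions.

```python
import os
import random
import tempfile

def random_lines(fname, n):
    """Return up to n distinct lines from fname, reading the file once."""
    with open(fname) as f:
        lines = f.read().splitlines()
    return random.sample(lines, min(n, len(lines)))

# Demo on a throwaway file
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("one\ntwo\nthree\nfour\n")

picked = random_lines(tmp.name, 2)
print(picked)            # two different lines, in random order
os.remove(tmp.name)
```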
Count occurrences of a substring within a string
if 'egw-fw'.count('egw') > 1:
String concatenation operator (+)
newstring = 'first string' + myoldstringvariable
Working with IP addresses
Is this IP address in this subnet test
import ipaddress
ipad = ipaddress.ip_address('192.0.2.1')
ipsubnet = ipaddress.IPv4Network('192.0.0.0/22')
if ipad in ipsubnet: print('hi')
Excel files
I’ve been using the package openpyxl quite successfully to read and write Excel files but I see that pandas also has built-in functions to read spreadsheets.
Date and time
import time
epoch_time = int(time.time()) # seconds since the epoch
Math
numpy seems to be the go-to package.
Using syslog
Please see this post.
Can a keyword be a variable?
Yes. Here’s an example.
from datetime import datetime, timedelta
timeunit = 'days'
numbr = 3
datetime.now() + timedelta(**{timeunit: numbr})
try except block with retry for requests.get
import urllib3
import requests
from time import sleep
url = 'https://drjohnstechtalk.com/api'
try:
    raw_results = requests.get(url=url)
except requests.exceptions.RequestException as e:
    print('error is',e,'But we will pause and try again! Retrying now...')
    sleep(90)
    raw_results = requests.get(url=url)
Generically, we can do
try:
    code
except Exception as e:
    print(f'This exception occurred: {e}')
Once when I was a bit unsure about what exception I was catching I just put in a generic except: (to catch some kind of timeout in that case) and it worked like a charm! But I know it is not best practice.
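If one retry is not enough, the same idea can be wrapped in a small helper that retries a few times. This is just a sketch: with_retries, its parameters, and the stand-in function flaky are all made up for this example, and no network access is involved.

```python
import time

def with_retries(func, attempts=3, delay=0.01, exceptions=(Exception,)):
    """Call func(); on a listed exception wait and retry, up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions as e:
            if attempt == attempts:
                raise                      # out of retries: let the caller see it
            print(f'attempt {attempt} failed: {e}; retrying')
            time.sleep(delay)

# Demo with a stand-in function that fails twice, then succeeds
calls = {'n': 0}

def flaky():
    calls['n'] += 1
    if calls['n'] < 3:
        raise ConnectionError('simulated failure')
    return 'ok'

result = with_retries(flaky)
print(result, calls['n'])   # ok 3
```

With requests you would pass something like lambda: requests.get(url=url) as func, a longer delay, and exceptions=(requests.exceptions.RequestException,).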
Date arithmetic
import datetime,calendar
from datetime import timedelta
today = datetime.datetime.now(datetime.UTC) # current time in UTC land (datetime.UTC needs python 3.11+; datetime.timezone.utc works on older versions)
date = today.strftime('%Y%m%d') # e.g., 20240418
H = today.hour # just the hour, as an integer
t_hr = today - timedelta(hours = 1)
last_sec = datetime.datetime(t_hr.year, t_hr.month, t_hr.day, t_hr.hour, 59, 59) # the last second of that hour!
time_stamp = calendar.timegm(last_sec.timetuple()) # seconds since epoch for last_sec
Working with exit()
I like to add an exit() when testing code inside a loop so that the first iteration executes but I don't sit around waiting for the whole thing to be done because I probably have other mistakes I need to correct. However, that can cause trouble if that is inside a try/except block! exit() works by raising SystemExit, so a bare except: with no explicit exception catches it, its block gets executed, and you won't exit! To get around this, this construct can be used:
try:
    exit() # this always raises SystemExit
except SystemExit:
    print("exit() worked as expected")
except:
    print("Something is horribly wrong") # some other exception got raised
Python and self-signed certificates, or certificates issued by private CAs
I updated this blog article to help address that: Adding private root CAs in Redhat or SLES or Debian.
Write it with style
Use flake8 to see if your python program conforms to the best practice style. In CentOS I installed flake8 using pip while in Debian linux I installed it using apt-get install flake8. But I end up using pyflakes.
In-line comments
Good code writers (but not me) may spread out their function calls over multiple lines. Yes, you can put a comment using the # character at the end of each of those in-between lines!
Skip first element of a generator function
subnet_g = ipaddress.IPv4Network('10.23.97.0/26').hosts() # subnet_g is a generator
subnet_l = list(subnet_g) # turn it into a list
for ip in subnet_l[1:]: # skips over first element in the list
    print('ip is',ip)
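If you would rather not build the whole list first, itertools.islice can skip the first element directly. A small sketch using the same subnet (the list() call here is only so the result can be inspected; in a loop you can iterate over the islice itself):

```python
import ipaddress
from itertools import islice

hosts = ipaddress.IPv4Network('10.23.97.0/26').hosts()

# islice(iterable, 1, None) yields everything except the first item
remaining = list(islice(hosts, 1, None))

print(remaining[0])    # 10.23.97.2
print(len(remaining))  # 61
```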
Does it at least pass the compiler - check syntax without running it
Install pyflakes: pip3 install pyflakes. Then
pyflakes your_script.py
Can I modify a Python script while it's running?
Sure. No worries. It is safe to do so.
Print statement prints everything twice
This happens if you unfortunately named your program the same as a module you are importing. In this situation the program imports itself and runs twice. Rename your program something different!
Create virtual environment for portability
I like to call my virtual environment venv.
python3 -m venv venv # requires the SYSTEM package python3.11-venv
Use this virtual environment
source ./venv/bin/activate
List all the packages in this virtual environment
Good portable development style would have you install the minimal set of packages in your virtual environment and then build a requirements.txt file:
pip3 freeze > requirements.txt
Leave this virtual environment
deactivate
Test if package has been installed
python3 -c "import pymsteams" # is pymsteams package present?
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'pymsteams'
Multiple python versions in Redhat
Say you want python3 to refer to your nice new python3.12 installation. Then try something like this:
alternatives --set python3 /usr/bin/python3.12
List all attributes of a module
import datetime
print(dir(datetime))
Conclusion
I've written down some of my favorite tips for using python effectively.
References and related
Good guide to working with dates and times in python: https://www.programiz.com/python-programming/datetime
Adding private root CAs in Redhat or SLES or Debian.
Writing output to syslog
A convention for commits