Categories
Consumer Interest

Consumer Tech: HP Pavilion Aero laptop review

Intro

I am very pleased with my online purchase of an HP lsptop. So I am sharing my experience here. Believe it or not, I did not, unfortunately, receive anything for this endorsement! I simply am thrilled with the product. I heartily recommend this laptop to others if it is similarly configured.

Requirements

Requirements are never made in the abstract, but represent a combination of what is possible and what others offer.

  • laptop
  • 13″ diagonal screen
  • lightweight
  • “fast,” whatever that means
  • future-proof, if at all possible
  • distinctive (you’ll see what that means in a second)
  • durable
  • no touch-screen!! (hate them)
  • Windows 11 Home Edition
  • under $1200
  • 1 TB of storage space
  • SSD
  • HP brand
What I got

I used to be a fan of Dell until I got one a few years back in which the left half of the keyboard went dead. It seems that problem was not so uncommon when you would do a search. Also my company seems to much more on the HP bandwagon than the Dell one, and they generally know what they are doing.

I remember buying an HP Pavilion laptop in November 2017. It was an advertised model which had the features I sought at the time, including Windows 7, 512 GB SSD disk. Surely, with the inexorable improvements in everything, wouldn’t you have thought that in the intervening five years, 1 TB would be commonplace, even on relatively low-end laptop models? For whatever reason, that upgrade didn’t happen and even five years later, 1 TB is all but unheard of on sub $1000 laptops. I guess everyone trusts the cloud for their storage. I work with cloud computing every day. But I want the assurance of having my photos on my drive, and not exclusively owned by some corporation. And we have lots of photos. So our Google Drive is about 400 GB. So with regards to storage, future-proof for me means room to grow for years, hence, 1 TB.

My company uses HP Elitebooks. They have touchscreens which I never use and are more geared towards business uses. Not only do I dislike touchscreens (you’re often touching them unintentionally), but they add weight and draw power. So not having one – that’s a win-win.

So since so few cheap laptops offer 1 TB standard, I imagined, correctly, that HP would have a configurator. The model which supports this is the HP Aero. I configured a few key upgrades, all of which are worthwhile.

I configued a model which has:

  • 13.3″ screen
  • 1 TB SSD disk
  • OLED WQXGA screen (2600 x 1600 pixels)
  • Windows 11 Home Edition
  • AMD Ryzen 7 5825U (up to 4.5 GHz, 16 MB L3 cache, 8 cores, 16 threads) + AMD Radeon Graphics + 8 GB onboard
  • pale rose gold trim

The screen size and the fact of running Windows 11 are not upgrades, everything else on the above list is. Some, like the cpu, a bit pricey. But my five-year-old laptop, which runs fine, by the way, is EOL because Microsoft refuses to support its cpu for Windows 11 upgrade. I’m hoping when I write my five year lookback in 2028 the same does not happen to this laptop!

I especially like the pale rose gold trim. Why? When you go to a public place such as an airport, your laptop does not look like everyone else’s.

We also want to carry this laptop around. So another benefit is that it’s one of the lightest laptops around, for its size. Again, a touchscreen would have been heavier.

Of course the Aero contains microphone, built-in speakers, but no ethernet port (I’m a little leery about that). Only two USB ports, plus a USB-C port and full-sized hdmi port.

One usage beef I have is that it supposedly has a back-lit keyboard, but I’ve never seen it turn on.

My company has a coupon code for a roughly four percent discount – not huge, but every bit helps. Shipping is free. But to get the discount I had to talk to a human being to place the order, which is a good idea anyway for a purchase of this magnitude. She carefully reviewed the order with me multiple times. She commended me on my choice to upgrade to the OLED display, which gave me a good feeling.

Unexpected features

I wasn’t really looking for it, but there it is, a fingerprint scanner(!) in order to do a Windows Hello verification. I did not set it up. I guess it could also do a facial recognition as well (that’s what I use at work for Windows Hello for Business), but I also didn’t try that.

I think there’s a mini stereo output but maybe no microphone input? Of course get a USB microphone and you’re all good…

Price

Price as configured above and with my company coupon code applied was $1080. I think that’s much better than a similarly equipped Surface tablet though I honestly didn’t do any real comparisons since I wanted to go HP from the get-go.

Conclusion

I bought a new HP Pavilion Aero laptop. It’s only been a month but I am very pleased with it so far. I configured it the with upgrades important to me since no off-the-shelf model has adequate storage capacity at the sub $1000 price point where I am.

I recommend this configuration for others. I think it’s really a winning combo. I have – I know this is hard to believe – not been compensated in any way for this glowing review! See my site – no ads? That shows you this is a different kind of web site, the kind that reflects the ideals of the Internet when it was conceived decades ago as an altruistic exchange of ideas, not an overly commercialized hellscape.

Since I saw this laptop was a winner I decided to give it away to a loved one, and now I’m back on that five-year-old HP Pavilion laptop!

References and related

HP Pavilion Aero Customize and Buy

Categories
Cloud

ADO: Check pipeline runs

Intro

I have previously written how to copy all Azure DevOps (ADO) logs to a linux server. In this post I share a script I wrote which does a quality check on all the most recent pipeline runs. If there are any issues, a message is sent to a MS teams channel.

Let’s get into the details.

Preliminary details

I am using the api, needless to say. I cannot say I have mastered the api or even come close to understanding it. I however have leveraged the same api call I have previously used since I observed it contains a lot of interesting data.

conf_check_all.ini

This config file is written as json to make importing a breeze. You set up optional trigger conditions for the various pipeline runs you will have, or even whether or not to perform any checks on it at all.

{
"organization":"drjohns4ServicesCoreSystems",
"project":"Connectivity",
"url_base":"https://dev.azure.com/",
"url_params":"&api-version=7.1-preview.7",
"test_flag":false,
"run_ct_min":2,
"queue_time_max":1800,
"pipelines":{
"comment":{"maximum_processing_time_in_seconds":"integer_value","minimum_processing_time_in_seconds":"integer_value","(optional) check_flag - to potentially disable the checks for this pipeline":"either true or false"},
"default":{"max_proc_time":1800,"min_proc_time":3,"check_flag":true},
"feed_influxdb":{"max_proc_time":180,"min_proc_time":3},
"PAN-Usage4Mgrs":{"max_proc_time":900,"min_proc_time":60,"check_flag":true},
"PAN-Usage4Mgrs-2":{"max_proc_time":900,"min_proc_time":60},
"speed-up-sampling":{"max_proc_time":900,"min_proc_time":"2","check_flag":false},
"Pipeline_check":{"max_proc_time":45,"min_proc_time":"2","check_flag":false},
"Discover new vEdges":{"max_proc_time":3600,"min_proc_time":3,"check_flag":true},
}
}

So you see at the bottom is a dictionary where the keys are the names of the pipelines I am running, plus a default entry.

check_all_pipelines.py
                    

#!/usr/bin/python3
# fetch raw log to local machine
# for relevant api section, see:
#https://learn.microsoft.com/en-us/rest/api/azure/devops/build/builds/get-build-log?view=azure-devops-rest-7.1
import urllib.request,json,sys,os
from datetime import datetime,timedelta
from modules import aux_modules

conf_file = sys.argv[1]

# pipeline uses UTC so we must follow suit or we will miss files
#a_day_ago = (datetime.utcnow() - timedelta(days = 1)).strftime('%Y-%m-%dT%H:%M:%SZ')
startup_delay = 30 # rough time in seconds before the pipeline even begins to execute our script
an_hour_ago = (datetime.utcnow() - timedelta(hours = 1, seconds = startup_delay)).strftime('%Y-%m-%dT%H:%M:%SZ')
print('An hour ago was (UTC)',an_hour_ago)
format = '%Y-%m-%dT%H:%M:%SZ'

#url = 'https://dev.azure.com/drjohns4ServicesCoreSystems/Connectivity/_apis/build/builds?minTime=2022-10-11T13:00:00Z&api-version=7.1-preview.7'

# dump config file into a dict
config_d = aux_modules.parse_config(conf_file)
test_flag = config_d['test_flag']
if test_flag:
    print('config_d',config_d)
    print('We are in a testing mode because test_flag is:',test_flag)

url_base = f"{config_d['url_base']}{config_d['organization']}/{config_d['project']}/_apis/build/builds"
url = f"{url_base}?minTime={an_hour_ago}{config_d['url_params']}"
#print('url',url)
req = urllib.request.Request(url)
req.add_header('Authorization', 'Basic ' + os.environ['ADO_AUTH'])

# Get buildIds for pipeline runs from last 1 hour
with urllib.request.urlopen(req) as response:
   html = response.read()
txt_d = json.loads(html)
#{"count":215,"value":[{"id":xxx, "buildNumber":"20230203.107","status":"completed","result":"succeeded","queueTime":"2023-02-03T21:12:01.0865046Z","startTime":"2023-02-03T21:12:05.2177605Z","finishTime":"2023-02-03T21:17:28.1523128Z","definition":{"name":"PAN-Usage4Mgrs-2"
value_l = txt_d['value']
all_msgs = ''
header_msg = '**Recent pipeline issues**\n'
# check for too few pipeline runs
if len(value_l) <= config_d['run_ct_min']:
    all_msgs = f"There have been fewer than expected pipeline runs this past hour. Greater than **{config_d['run_ct_min']}** runs are expected, but there have been only **{len(value_l)}** runs.  \nSeomthing may be wrong.  \n"

for builds in value_l:
    msg = aux_modules.check_this_build(builds,config_d,url_base)
    if msg: all_msgs = f"{all_msgs}  \n{msg}  \n"

if all_msgs:
    if not test_flag: aux_modules.sendMessageToTeams(header_msg + all_msgs) # send to WebHook if not in a testing mode
    print(header_msg + all_msgs)
else:
    print('No recent pipeline errors')

Short explanation

I consider the code to be mostly self-explanatory. A cool thing I’m trying out here is the f- format specifier to write to a string kind of like sprintf. I run this script every hour from, yes, an ADO pipeline! But since this job looks for errors, including errors which indicate a systemic problem with the agent pool, I run it from a different agent pool.

aux_modules.py
                    

import json,re
import os,urllib.request
from datetime import datetime,timedelta
import pymsteams

def parse_config(conf_file):
# config file should be a json file
    f = open(conf_file)
    config_d = json.load(f)
    f.close()
    return config_d

def get_this_log(config_d,name,buildId,build_number):
# leaving out the api-version etc works better
#GET https://dev.azure.com/{organization}/{project}/_apis/build/builds/{buildId}/logs/{logId}?api-version=7.1-preview.2
#https://dev.azure.com/drjohns4ServicesCoreSystems/d6338e-f5b4-45-6c-7b3a86/_apis/build/builds/44071/logs/7'
        buildId_s = str(buildId)
        log_name = config_d['log_dir'] + "/" + name + "-" + build_number
# check if we already got this one
        if os.path.exists(log_name):
            return
        #url = url_base + organization + '/' + project + '/_apis/build/builds/' + buildId_s + '/logs/' + logId + '?' + url_params
        url = config_d['url_base'] + config_d['organization'] + '/' + config_d['project'] + '/_apis/build/builds/' + buildId_s + '/logs/' + config_d['logId']
        print('url for this log',url)
        req = urllib.request.Request(url)
        req.add_header('Authorization', 'Basic ' + config_d['auth'])
        with urllib.request.urlopen(req) as response:
            html = response.read()
        #print('log',html)
        print("Getting (name,build_number,buildId,logId) ",name,build_number,buildId_s,config_d['logId'])
        f = open(log_name,"wb")
        f.write(html)
        f.close()

def check_this_build(builds,config_d,url_base):
    format = '%Y-%m-%dT%H:%M:%SZ'
    buildId = builds['id']
    build_number = builds['buildNumber']
    status = builds['status'] # normally: completed
    result = builds['result'] # normally: succeeded
    queueTime = builds['queueTime']
    startTime = builds['startTime']
    finishTime = builds['finishTime']
    build_def = builds['definition']
    name = build_def['name']
    print('name,build_number,id',name,build_number,buildId)
    print('status,result,queueTime,startTime,finishTime',status,result,queueTime,startTime,finishTime)
    qTime = re.sub(r'\.\d+','',queueTime)
    fTime = re.sub(r'\.\d+','',finishTime)
    sTime = re.sub(r'\.\d+','',startTime)
    qt_o = datetime.strptime(qTime, format)
    ft_o = datetime.strptime(fTime, format)
    st_o = datetime.strptime(sTime, format)
    duration_o = ft_o - st_o
    duration = int(duration_o.total_seconds())
    print('duration',duration)
    queued_time_o = st_o - qt_o
    queued_time = int(queued_time_o.total_seconds())
    queue_time_max = config_d['queue_time_max']
# and from the config file we have...
    pipes_d = config_d['pipelines']
    this_pipe = pipes_d['default']
    if name in pipes_d: this_pipe = pipes_d[name]
    msg = ''
    if 'check_flag' in this_pipe:
        if not this_pipe['check_flag']:
            print('Checking for this pipeline has been disabled: ',name)
            return msg # skip this build if in test mode or whatever
    print('duration,min_proc_time,max_proc_time',duration,this_pipe['min_proc_time'],this_pipe['max_proc_time'])
    print('queued_time,queue_time_max',queued_time,queue_time_max)
    if duration > this_pipe['max_proc_time'] or duration < this_pipe['min_proc_time']:
        msg = f"ADO Pipeline **{name}** run is outside of expected time range. Build number: **{build_number}**. \n  Duration, max_proc_time, min_proc_time: **{duration},{this_pipe['max_proc_time']},{this_pipe['min_proc_time']}**"
    if not status == 'completed' or not result == 'succeeded':
        msg = f"ADO Pipeline **{name}** run has unexpected status or result. Build number: **{build_number}**. \n  - Status: **{status}** \n  - Result: **{result}**"
    if queued_time > queue_time_max: # Check if this job was queued for too long
        msg = f"ADO Pipeline **{name}** build number **{build_number}** was queued too long. Queued time was **{queued_time}** seconds"
    if msg:
# get the logs meta info to see which log is the largest
        url = f"{url_base}/{buildId}/logs"
        req = urllib.request.Request(url)
        req.add_header('Authorization', 'Basic ' + os.environ['ADO_AUTH'])
# Get buildIds for pipeline runs from last 1 hour
        with urllib.request.urlopen(req) as response:
            html = response.read()
        txt_d = json.loads(html)
        value_l = txt_d['value']
#{"count":11,"value":[{"lineCount":31,"createdOn":"2023-02-13T19:03:17.577Z","lastChangedOn":"2023-02-13T19:03:17.697Z","id":1...
        l_ct_max = 0
        log_id_err = 0
# determine log with either an error or the most lines - it differs for different pipeline jobs
        for logs_d in value_l[4:]:      # only consider the later logs
            url = f"{url_base}/{buildId}/logs/{logs_d['id']}"
            req = urllib.request.Request(url)
            req.add_header('Authorization', 'Basic ' + os.environ['ADO_AUTH'])
            with urllib.request.urlopen(req) as response:
                html = response.read().decode('utf-8')
            if re.search('error',html):
                log_id_err = logs_d['id']
                print('We matched the word error in log id',log_id_err)
            l_ct = logs_d['lineCount']
            if l_ct > l_ct_max:
                l_ct_max = l_ct
                log_id_all = logs_d['id']
        if log_id_err > 0 and not log_id_all == log_id_err: # error over long log file when in conflict
            log_id_all = log_id_err
        url_all_logs = f"{url_base}/{buildId}/logs/{log_id_all}"
        msg = f"{msg}  \n**[Go to Log]({url_all_logs})**  "
    print(msg)
    return msg

def sendMessageToTeams(msg: str):
    """
    Send a message to a Teams Channel using webhook
    """
# my Pipeline_check webhook
    webHookUrl = "https://drjohns.webhook.office.com/webhookb2/66f741-9b1e-401c-a8d3-9448d352db@ec386b-c8f-4c0-a01-740cb5ba55/IncomingWebhook/2c8e881d05caba4f484c92617/7909f-d2f-b1d-3c-4d82a54"
    try:
        # escaping underscores to avoid alerts in italics.
        msg = msg.replace('_', '\_')
        teams_msg = pymsteams.connectorcard(webHookUrl)
        teams_msg.text(f'{msg}')
        teams_msg.send()

    except Exception as e:
        print(f'failed to send alert: {str(e)}')

aux_modules.py contains most of the logic with checking each pipeline against the criteria and constructing an alert in Markdown to send to MS Teams. I’m not saying it’s beautiful code. I’m still learning. But I am saying it works.

I’ve revised the code to find the log file which is most likely to contain the “interesting” stuff. That’s usually the longest one excluding the first five or so. There are often about 10 logs available for even a minimal pipeline run. So this extra effort helps.

Then I further revised the code to fetch the logs and look for the word “error.” That may show up in the longest log or it may not. It not, that log takes precedence as the most interesting log.

check_all_pipelines.yml
                    

# Python package
# Create and test a Python package on multiple Python versions.
# Add steps that analyze code, save the dist with the build record, publish to a PyPI-compatible index, and more:
# https://docs.microsoft.com/azure/devops/pipelines/languages/python

##trigger:
##- main

trigger: none

pool:
  name: dsc-adosonar-drjohns4ServicesCoreSystems-agent
#  name: visibility_agents

#strategy:
#  matrix:
#    Python36:
#      python.version: '3.6'

steps:
#- task: UsePythonVersion@0
#  inputs:
#    versionSpec: '$(python.version)'
#  displayName: 'Use Python $(python.version)'

- script: pip3 install -vvv --timeout 60 -r Pipeline_check/requirements.txt
  displayName: 'Install requirements'

- script: python3 check_all_pipelines.py conf_check_all.ini
  displayName: 'Run script'
  workingDirectory: $(System.DefaultWorkingDirectory)/Pipeline_check
  env:
    ADO_AUTH: $(ado_auth)
    PYTHONPATH: $(System.DefaultWorkingDirectory)/Pipeline_check:$(System.DefaultWorkingDirectory)
schedules:
- cron: "19 * * * *"
  displayName: Run the script at 19 minutes after the hour
  branches:
    include:
    - main
  always: true

This yaml file we sort of drag around from pipeline to pipeline so some of it may not appear too optimized for this particular pipeline. But it does the job without fuss.

Conclusion

My ADO pipeline checker is conveniently showing us all our pipeline runs which have failed for various reasons – takes too long, completed with errors, too few jobs have been run ni the last hour, … It sends its output to a MS Teams channel we have subscribed to, by way of a webhook we set up. So far it’s working great!

References and related

Here’s my post on fetching the script log resulting from the pipeline run.

Categories
Visibility

Everything I need to know about Influxdb and Grafana

Intro

Some of my colleagues had used Influxdb and Grafana at their previous job so they thought it might fit for what we’re doing in the Visibility team. It sounded good in theory, anyway, so I had to agree. There were a lot of pitfalls. Eventually I got it to the point where I’m satisfied with my accomplishments and want to document the hurdles I’ve overcome.

So as time permits I will be fleshing this out.

Grafana

I’m going to lead with the picture and then the explanation makes a lot more sense.

Single vedge heat map

I’ve spent the bulk of my time wrestling with Grafana. Actually it looks like a pretty capable tool. It’s mostly just understanding how to make it do what you are dreaming about. Our installed version currently is 9.2.1.

My goal is to make a heatmap. But a special kind similar to what I saw the network provider has. That would namely entail one vedge per row, and one column per hour, hence, 24 columns in total. A vedge is a kind of SD-Wan router. I want to help the networking group look at hundreds of them at a time. So that’s on potential dashboard. It would give a view of a day. Another dashboard would show just one router with the each row representing a day, and the columns again showing an hour. Also a heatmap. The multi-vedge dashboard should link to the individual dashboard, ideally. In the end I pulled it off. I am also responsible for feeding the raw data into Influxdb and hence also for the table design.

Getting a workable table design was really imporant. I tried to design it in a vacuum, but that only partially worked. So I revised, adding tags and fields as I felt I needed to, while being mindful of not blowing up the cardinality. I am now using these two tables, sorry, measurements.

vedge measurement
vedge_stats measurement

Although there are hundreds of vedges, some of my tags are redundant, so don’t get overly worried about my high cardinality. UTChour is yes a total kludge – not the “right” way to do things. But I’m still learning and it was simpler in my mind. item in the first measurement is redundant with itemid. But it is more user-friendly: a human-readable name.

Influx Query Explorer

It’s very helpful to use the Explorer, but the synatx there is not exactly the same as it will be when you define template variables. Go figure.

Multiple vedges for the last day

So how did I do it in the end?

Mastering template variables is really key here. I have a drop-down selection for region. In Grafana-world it is a custom variable with potential values EU,NA,SA,AP. That’s pretty easy. I also have a threshold variable, with possible values: 0,20,40,60,80,90,95. And a math variable with values n95,avg,max. More recently I’ve added a threshold_max and a math_days variable.

It gets more interesting however, I promise. I have a category variable which is of type query:

from(bucket: "poc_bucket2")
|> range (start: -1d)
|> filter(fn:(r) => r._measurement == "vedge_stats")
|> group()
|> distinct(column: "category")

Multi-value and Include all options are checked. Just to make it meaningful, category is assigned by the WAN provider and has values such as Gold, Silver, Bronze.

And it gets still more interesting because the last variable depends on the earlier ones, hence we are using chained variables. The last variable, item, is defined thusly:

from(bucket: "poc_bucket2")
|> range (start: -${math_days}d)
|> filter(fn:(r) => r._measurement == "vedge_stats" and r.region == "${Region}")
|> filter(fn:(r) => contains(value: r.category, set: ${category:json}))
|> filter(fn:(r) => r._field == "${math}" and r._value >= ${threshold} and r._value <= ${threshold_max})
|> group()
|> distinct(column: "item")

So what it is designed to do is to generate a list of all the items, which in reality are particular interfaces of the vedges, into a drop-down list.

Note that I want the user to be able to select multiple categories. It’s not well-documented how to chain such a variable, so note the use of contains and set in that one filter function.

And note the double-quotes around ${Region}, another chained variable. You need those double-quotes! It kind of threw me because in Explorer I believe you may not need them.

And all that would be simply nice if we didn’t somehow incorporate these template variables into our panels. I use the Stat visualization. So you’ll get one stat per series. That’s why I artifically created a tag UTChour, so I could easily get a unique stat box for each hour.

The stat visualization flux Query

Here it is…

data = from(bucket: "poc_bucket2")
  |> range(start: -24h, stop: now())
  |> filter(fn: (r) =>
    r._measurement == "vedge" and
    r._field == "percent" and r.hostname =~ /^${Region}/ and r.item == "${item}"
  )
  |> drop(columns: ["itemid","ltype","hostname"])
data

Note I hae dropped my extra tags and such which I do not wish to appear during a mouseover.

Remember our regions can be one of AP,EU,NA or SA? Well the hostnames assigned to each vedge start with the two letters of its region of location. Hence the regular explression matching works there to restrict consideration to just the vedges in the selected region.

We are almost done.

Making it a heat map

So my measurement has a tag called percent, which is the percent of available bandwidth that is being used. So I created color-based thresholds:

Colorful percent-based thresholds

You can imagine how colorful the dashboard gets as you ratchet up the threshold template variable. So the use of these thresholds is what turns our stat squares into a true heatmap.

Heatmap visualization

I found the actual heatmap visualization useless for my purposes, by the way!

There is also an unsupported heatmap plugin for Grafana which simply doesn’t work. Hence my roll-your-own approach.

Repitition

How do we get a panel row per vedge? The stat visualization has a feature called Repeat Options. So you repeat by variable. The variable selected is item. Remember that item came from our very last template variable. Repeat direction is Vertical.

For calculation I choose mean. Layout orienttion is Vertical.

The visualization title is also variable-driven. It is ${item} .

The panels are long and thin. Like maybe two units high? – one unit for the label (the item) and the one below it for the 24 horizontal stat boxes.

Put it all together and voila, it works and it’s cool and interactive and fast!

Single vedge heatmap data over multiple days

Of course this is very similar to the multiple vedge dashboard. But now we’re drilling down into a single vedge to look at its usage over a period of time, such as the last two weeks.

Flux query
import "date"
b = date.add(d: 1d, to: -${day}d)
data = from(bucket: "poc_bucket2")
  |> range(start: -${day}d, stop: b)
  |> filter(fn: (r) =>
    r._measurement == "vedge" and
    r._field == "percent" and
    r.item == "$item"
  )
  |> drop(columns:["itemid","ltype","hostname"])
data
Variables

As before we have a threshold, Region and category variable with category derived from the same flux query shown above. A new variable is day, which is custom and hidden, It has values 1,2,3,4,…,14. I don’t know how to do a loop in flux or I might have opted a more elegant method to specify the last 14 days.

I did the item variable query a little different, but I think it’s mostly an alternate and could have been the same:

from(bucket: "poc_bucket2")
|> range (start: -${math_days}d)
|> filter(fn:(r) => r._measurement == "vedge_stats" and r.region == "${Region}")
|> filter(fn:(r) => contains(value: r.category, set: ${category:json}))
|> filter(fn:(r) => r._field == "${math}" and r._value >= ${threshold} and r._value <= ${threshold_max})
|> group()
|> distinct(column: "item")

Notice the slightly different handling of Region. And those double-quotes are important, as I learned from the school of hard knocks!

The flux query in the panel is of course different. It looks like this:

import "date"
b = date.add(d: 1d, to: -${day}d)
data = from(bucket: "poc_bucket2")
  |> range(start: -${day}d, stop: b)
  |> filter(fn: (r) =>
    r._measurement == "vedge" and
    r._field == "percent" and
    r.item == "$item"
  )
  |> drop(columns:["itemid","ltype","hostname"])
data

So we’re doing some date arithmetic so we can get panel strips, one per day. These panels are long and thin, same as before, but I omitted the title since it’s all the same vedge.

The repeat options are repeat by variable day, repeat direction Vertical as in the other dashboard. The visualization is Stat, as in the other dashboard.

And that’s about it! Here the idea is that you play with the independent variables such as Region and threshold, it generates a list of matching vedge interfaces and you pick one from the drop-down list.

Linking the multiple vedge dashboard to the single vedge history dashboard

Of course the more interactive you make these things the cooler it becomes, right? I was excited to be able to link these two dashboards together in a sensible way.

In the panel config you have Data links. I found this link works:

https://drjohns.com:3000/d/1MqpjD24k/single-vedge-usage-history?orgId=1&var-threshold=60&var-math=n95&${item:queryparam}

So to generalize since most of the URL is specific to my implementation, both dashboards utilize the item variable. I basically discovered the URL for a single vedge dashboard and dissected it and parameterized the item, getting the syntax right with a little Internet research.

So the net effect is that when you hover over any of the vedge panels in the multi-vedge dashboard, you can click on that vedge and pull up – in a new tab in my case – the individual vedge usage history. It’s pretty awesome.

Influxdb

Influxdb is a time series database. It takes some getting used to. Here is my cheat sheet which I like to refer to.

  • bucket is named location with retention policy where time-series data is stored.
  • series is a logical grouping of data defined by shared measurement, tag and field.
  • measurement is similar to an SQL database table.
  • tag is similar to indexed columns in an SQL database.
  • field is similar to unindexed columns in an SQL database.
  • point is similar to SQL row.

This is not going to make a lot of sense to anyone who isn’t Dr John. But I’m sure I’ll be referring to this section for a How I did this reminder.

OK. So I wrote a feed_influxdb.py script which runs every 12 minutes in an Azure DevOps pipeline. It extracts the relevant vedge data from Zabbix using the Zabbix api and puts it into my influxdb measurement vedge whose definition I have shown above. I would say the code is fairly generic, except that it relies on the existence of a master file which contains all the relevant static data about the vedges such as their interface names, Zabbix itemids, and their maximum bandwidth (we called it zabbixSpeed). You cuold pretty much deduce the format of this master file by reverse-engineering this script. So anyway here is feed_influxdb.py.

                    

from pyzabbix import ZabbixAPI
import requests, json, sys, os, re
import time,datetime
from time import sleep
from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS
from modules import aux_modules,influx_modules

# we need to get data out of Zabbix
inventory_file = 'prod.config.visibility_dashboard_reporting.json'
#inventory_file = 'inv-w-bw.json' # this is a modified version of the above and includes Zabbix bandwidth for most ports
# Login Zabbix API - use hidden variable to this pipeline
token_zabbix = os.environ['ZABBIX_AUTH_TOKEN']
url_zabbix = 'https://zabbix.basf.net/'
zapi = ZabbixAPI(url_zabbix)
zapi.login(api_token=token_zabbix)
# Load inventory file
with open(inventory_file) as inventory_file:
    inventory_json = json.load(inventory_file)
# Time range which want to get data (unixtimestamp)
inventory_s = json.dumps(inventory_json)
inventory_d = json.loads(inventory_s)
time_till = int(time.mktime(datetime.datetime.now().timetuple()))
time_from = int(time_till - 780)  # about 12 minutes plus an extra minute to reflect start delay, etc
i=0
max_items = 200
item_l = []
itemid_to_vedge,itemid_to_ltype,itemid_to_bw,itemid_to_itemname = {},{},{},{}
for SSID in inventory_d:
    print('SSID',SSID)
    hostname_d = inventory_d[SSID]['hostname']
    for vedge_s in hostname_d:
        print('vedge_s',vedge_s,flush=True)
        items_l = hostname_d[vedge_s]
        for item_d in items_l:
            print('item_d',item_d,flush=True)
            itemname = item_d['itemname']
            lineType = item_d['lineType']
            if 'zabbixSpeed' in item_d:
                bandwidth = int(item_d['zabbixSpeed'])
            else:
                bandwidth = 0
            itemid = item_d['itemid']
            if lineType == 'MPLS' or lineType == 'Internet':
                i = i + 1
                itemid_to_vedge[itemid] = vedge_s # we need this b.c. Zabbix only returns itemid
                itemid_to_ltype[itemid] = lineType # This info is nice to see
                itemid_to_bw[itemid] = bandwidth # So we can get percentage used
                itemid_to_itemname[itemid] = itemname # So we can get percentage used
                item_l.append(itemid)
                if i > max_items:
                    print('item_l',item_l,flush=True)
                    params = {'itemids':item_l,'time_from':time_from,'time_till':time_till,'history':0,'limit':500000}
                    print('params',params)
                    res_d = zapi.do_request('history.get',params)
                    #print('res_d',res_d)
                    #exit()
                    print('After call to zapi.do_request')
                    result_l = res_d['result']
                    Pts = aux_modules.zabbix_to_pts(result_l,itemid_to_vedge,itemid_to_ltype,itemid_to_bw,itemid_to_itemname)
                    for Pt in Pts:
                        print('Pt',Pt,flush=True)
                        # DEBUGGING!!! Normally call to data_entry is outside of this loop!!
                        #influx_modules.data_entry([Pt])
                    influx_modules.data_entry(Pts)
                    item_l = [] # empty out item list
                    i = 0
                    sleep(0.2)
else:
# we have to deal with leftovers which did not fill the max_items
    if i > 0:
                    print('Remainder section')
                    print('item_l',item_l,flush=True)
                    params = {'itemids':item_l,'time_from':time_from,'time_till':time_till,'history':0,'limit':500000}
                    res_d = zapi.do_request('history.get',params)
                    print('After call to zapi.do_request')
                    result_l = res_d['result']
                    Pts = aux_modules.zabbix_to_pts(result_l,itemid_to_vedge,itemid_to_ltype,itemid_to_bw,itemid_to_itemname)
                    for Pt in Pts:
                        # DEBUGGING!!! normally data_entry is called after this loop
                        print('Pt',Pt,flush=True)
                        #influx_modules.data_entry([Pt])
                    influx_modules.data_entry(Pts)
print('All done feeding influxdb!')

I’m not saying it’s great code. I’m only saying that it gets the job done. It relies on a couple auxiliary scripts. Here is aux_modules.py (it may include some packages I need later on).

                    

import re
import time as tm
import numpy as np

def zabbix_to_pts(result_l,itemid_to_vedge,itemid_to_ltype,itemid_to_bw,itemid_to_itemname):

# turn Zabbix results into a list of points which can be fed into influxdb
# [{'itemid': '682837', 'clock': '1671036337', 'value': '8.298851463718859E+005', 'ns': '199631779'},

    Pts = []
    for datapt_d in result_l:
        itemid = datapt_d['itemid']
        time = datapt_d['clock']
        value_s = datapt_d['value']
        value = float(value_s) # we are getting a floating point represented as a string. Convert back to float
        hostname = itemid_to_vedge[itemid]
        lineType = itemid_to_ltype[itemid]
        itemname = itemid_to_itemname[itemid]
# item is a hybrid tag, like a primary tag key
        iface_dir = re.sub(r'(\S+) interface (\S+) .+',r'\1_\2',itemname)
        item = hostname + '_' + lineType + '_' + iface_dir
        if itemid in itemid_to_bw:
            bw_s = itemid_to_bw[itemid]
            bw = int(bw_s)
            if bw == 0:
                percent = 0
            else:
                percent = int(100*value/bw)
        else:
            percent = 0
        tags = [{'tag':'hostname','value':hostname},{'tag':'itemid','value':itemid},{'tag':'ltype','value':lineType},{'tag':'item','value':item}]
        fields = [value,percent]
        Pt = {'measurement':'vedge','tags':tags,'fields':fields,'time':time}
        Pts.append(Pt)
    return Pts

Next I’ll show influx_modules.py.

                    

import influxdb_client, os, time
from datetime import datetime, timezone
import pytz
from influxdb_client import InfluxDBClient, Point, WritePrecision
from influxdb_client.client.write_api import SYNCHRONOUS
import random

def data_entry(Pts):
# Set up variables
    bucket = "poc_bucket2" # DrJ test bucket
    org = "poc_org"
    influxdb_cloud_token = os.environ['INFLUX_AUTH_TOKEN']

# Store the URL of your InfluxDB instance
    url_local ="https://drjohnstechtalk.com:8086/"

# Initialize client
    client = influxdb_client.InfluxDBClient(url=url_local,token=influxdb_cloud_token,org=org)

# Write data
    write_api = client.write_api(write_options=SYNCHRONOUS)

    pts = []
    for Pt in Pts:
        measurement = Pt['measurement']
        tag_host_l = Pt['tags']
# for now we've hard-coded the three tags and two field values
        hostname = tag_host_l[0]['value']
        itemid = tag_host_l[1]['value']
        lineType = tag_host_l[2]['value']
        item = tag_host_l[3]['value']
        value = Pt['fields'][0]
        percent = Pt['fields'][1]

        time = Pt['time']
# convert seconds since epoch into format required by influxdb
        pt_time = datetime.fromtimestamp(int(time), timezone.utc).isoformat('T', 'milliseconds')
# pull out the UTC hour
        ts=datetime.fromtimestamp(int(time)).astimezone(pytz.UTC)
        H = ts.strftime('%H')
        point = Point(measurement).tag("hostname",hostname).tag("itemid",itemid).tag("ltype",lineType).tag("item",item).tag("UTChour",H).field('value',value).field('percent',percent).time(pt_time)
        pts.append(point)
    write_api.write(bucket=bucket, org="poc_org", record=pts, write_precision=WritePrecision.S)

These scripts show how I accumulate a bunch of points and make an entry in influxdb once I have a bunch of them to make things go faster.

Complaints

I am not comfortable with the flux query documentation. Nor the Grafana documentation for that matter. They both give you a taste of what you need, without many details or examples. For instance it looks like there are multiple syntaxes available to you using flux. I just pragmatically developed what works.

Conclusion

I am content if no one reads this and it only serves as my own documentation. But perhaps it will help someone facing similar issues. Unfortunately one of the challenges is asking a good question to a search engine when you’re a newbie and haven’t mastered the concepts.

But without too too much effort I was able to master enough Grafana to create something that will probably be useful to our vendor management team. Grafana is fun and powerful. There is a slight lack of examples and the documentation is a wee bit sparse.

Categories
Linux Raspberry Pi

Interpreting speech with a Raspberry Pi

Or the beginning or creating your own smart speaker

Intro

Imagine you could use a low-cost device to interpret speech without the aid of the big cloud services and their complexity and security and big-brotherly-ness. Well if you have a DIY mindset, you can!

I wanted to control the raspberry pi-based slideshow I have written about many times in the past with voice commands. The question became How could I do it and is it even possible at all? And would I need to master the complex apis provided by either Amazon or Google cloud services? Well, it turns out that it is possible to do passable speech to text without any external cloud provider; and I am very excited to share what I’ve learned so far.

Equipment

raspberry pi 4 (even my old RPI 3 seems to work)

USB microphone

Raspberry Pi OS

Skills

basic linux and python skills are required

vosk – your main tool

I’m going to cut to the chase and just tell you that the vosk api is how I got this all working, but not before I drove into several dead-ends.

Here are the vosk installation instructions, which do work on RPi:

Vosk Installation (alphacephei.com)

It will be helpful to install and test the examples:

git clone https://github.com/alphacep/vosk-api
cd vosk-api/python/example
python3 ./test_simple.py test.wav

On my RPi 4 it took 36 s the first time, and 6.6 s the second time to run this test.wav. So I got worried and fully expected it would be just too slow on these underpowed RPi systems.

But I forged ahead and looked for an example which could do real-time speech-to-text. They provide a microphone example. It requires some additional packages. But even after installing them it still produced a nasty segmentation fault. So I gave up on that. Then I noticed an ffmpeg-based example. Well, turns out I have lots of prior ffmpeg experience as I also post about live recording of audio with the raspberry pi.

It turns out their example was simply to use ffmpeg to interpret a file, but I didn’t know that to begin with. But I know my way around ffmpeg that I could use it for processing a lvie stream. So I made those changes, and voila. I’m glad to say I was dead wrong about the processing speed. On the RPi 4 it can keep up with its text-to-speech task in real time!

Basic program to examine your speech in real time

I developed the following python script based off one of the python examples from the api. I call it drjtst4.py, just to give it a name:

                    

#!/usr/bin/env python3

import subprocess
import re
from modules import aux_modules

from vosk import Model, KaldiRecognizer, SetLogLevel

SAMPLE_RATE = 16000

SetLogLevel(0)

model = Model(lang="en-us")
rec = KaldiRecognizer(model, SAMPLE_RATE)
start,start_a = 0,0
input_device = 'plughw:1,0'
phrase = ''
accumulating = False
# wake word hey photo is often confused with a photo by vosk...
wake_word_re = '^(hey|a) photo'

with subprocess.Popen(["ffmpeg","-loglevel", "quiet","-f","alsa","-i",
                            input_device,
                            "-ar", str(SAMPLE_RATE) , "-ac", "1", "-f", "s16le", "-"],
                            stdout=subprocess.PIPE) as process:

    while True:
        data = process.stdout.read(4000)
        if len(data) == 0:
            break
        if rec.AcceptWaveform(data):
            print('in first part')
            print(rec.Result())
            text = rec.PartialResult()
# text is a "string" which is basically a dict
            start,start_a,accumulating,phrase = aux_modules.process_text(wake_word_re,text,start,start_a,accumulating,phrase)
        else:
# this part always seems to be executed for whatever reason
            print('in else part')
            text = rec.PartialResult()
            start,start_a,accumulating,phrase = aux_modules.process_text(wake_word_re,text,start,start_a,accumulating,phrase)
            print(rec.PartialResult())

# we never seem to get here
    print(rec.FinalResult())
    print('In final part')
    text = rec.FinalResult()

I created a modules directory and in it a file called aux_modules.py. It look like this:

                    

import re,time,json

def process_text(wake_word_re,text_s,start,start_a,accumulating,phrase):
    max = 5.5 # seconds
    inactivity = 10 # seconds
    short_max = 1.5 # seconds
    elapsed = 0
    if time.time() - start_a < inactivity:
# Allow some time to elapse since we just took an action
        return start,start_a,accumulating,phrase
# convert text to real text. Real text is in 'partial'
    text_d = json.loads(text_s)
    text = ''
    if 'partial' in text_d:
        text = text_d['partial']
    if 'text' in text_d:
        text = text_d['text']
    if not text == '': phrase = text
    if re.search(wake_word_re,text):
        if not accumulating:
            start = time.time()
            accumulating = True
            print('Wake word detected. Now accumulating text.')
    l = len(re.split(r'\s',text))
    print('text, word ct',text,l)
    if accumulating:
        elapsed = time.time() - start
        print('Elapsed time:',elapsed)
        if l > 1:
           phrase = text
    if elapsed > max or (elapsed > short_max and l == 1):
# we're at a natural ending here...
        print('This is the total text',phrase)
# do some action
# reset everything
        accumulating = False
        phrase = ''
        start_a = time.time()
    return start,start_a,accumulating,phrase

And you just invoke it with python3 drjtst4.py.

Sample session output
in else part
text, word ct 1
{
"partial" : ""
}
in else part
text, word ct hey 1
{
"partial" : "hey"
}
in else part
text, word ct hey 1
{
"partial" : "hey"
}
in else part
text, word ct hey 1
{
"partial" : "hey"
}
in else part
Wake word detected. Now accumulating text.
text, word ct hey photo 2
Elapsed time: 0.0004639625549316406
{
"partial" : "hey photo"
}
in else part
text, word ct hey photo 2
Elapsed time: 0.003415822982788086
{
"partial" : "hey photo"
}
in else part
text, word ct hey photo 2
Elapsed time: 0.034906625747680664
{
"partial" : "hey photo"
}
in else part
text, word ct hey photo 2
Elapsed time: 0.09063172340393066
{
"partial" : "hey photo"
}
in else part
text, word ct hey photo 2
Elapsed time: 0.2488384246826172
{
"partial" : "hey photo"
}
in else part
text, word ct hey photo 2
Elapsed time: 0.33771753311157227
{
"partial" : "hey photo"
}
in else part
text, word ct hey photo place 3
Elapsed time: 0.7102789878845215
{
"partial" : "hey photo place"
}
in else part
text, word ct hey photo place 3
Elapsed time: 0.7134637832641602
{
"partial" : "hey photo place"
}
in else part
text, word ct hey photo player 3
Elapsed time: 0.8728365898132324
{
"partial" : "hey photo player"
}
in else part
text, word ct hey photo player 3
Elapsed time: 0.8759913444519043
{
"partial" : "hey photo player"
}
in else part
text, word ct hey photo play slideshow 4
Elapsed time: 1.0684640407562256
{
"partial" : "hey photo play slideshow"
}
in else part
text, word ct hey photo play slideshow 4
Elapsed time: 1.0879075527191162
{
"partial" : "hey photo play slideshow"
}
in else part
text, word ct hey photo play slideshow 4
Elapsed time: 1.3674390316009521
{
"partial" : "hey photo play slideshow"
}
in else part
text, word ct hey photo play slideshow 4
Elapsed time: 1.3706269264221191
{
"partial" : "hey photo play slideshow"
}
in else part
text, word ct hey photo play slideshow 4
Elapsed time: 1.5532972812652588
{
"partial" : "hey photo play slideshow"
}
in else part
text, word ct hey photo play slideshow 4
Elapsed time: 1.5963218212127686
{
"partial" : "hey photo play slideshow"
}
in else part
text, word ct hey photo play slideshow 4
Elapsed time: 1.74298095703125
{
"partial" : "hey photo play slideshow"
}
in else part
text, word ct hey photo play slideshow 4
Elapsed time: 1.842745065689087
{
"partial" : "hey photo play slideshow"
}
in else part
text, word ct hey photo play slideshow 4
Elapsed time: 1.9888567924499512
{
"partial" : "hey photo play slideshow"
}
in else part
text, word ct hey photo play slideshow 4
Elapsed time: 2.0897343158721924
{
"partial" : "hey photo play slideshow"
}
in first part
{
"text" : "hey photo play slideshow"
}
text, word ct 1
Elapsed time: 2.3853299617767334
This is the total text hey photo play slideshow
in else part
{
"partial" : ""
}
in else part
{
"partial" : ""
}
A word on accuracy

It isn’t Alexa or Google. No one expected it would be, right? But if you’re a native English speaker it isn’t too bad. You can see it trying to correct itself.

The desire to choose an uncommon wake word of three syllables is at direct odds with how neural networks are trained! So… although I desired my wake word to be “hey photo,” I also allow “a photo.” A photo was probably in their training set whereas Hey photo certainly was not. Hence the bias against recognizing a unique wake word. And no way will I re-train their model – way too much effort. But to lower false positives this phrase has to occur at the beginning of a spoken phrase.

Turning this into a smart speaker

You can see I’ve got all the pieces set up. At least I think I do! I’ve got my wake word. I don’t have natural language processing but I think I can forgo that. I have a place in the code where I print out the “final text.” That’s where the spoken command is perceived to have been uttered and and a potential action could be exectured at that point.

Dead ends

To be fleshed out later as time permits.

Conclusion

I have demonstrated how speech-to-text without use of complex cloud apis such as those provided by Amazon and Google can be easily achieved on an inexpensive raspberry pi.

I will be building on this facility in subsequent posts as I turn my RPi-powered slideshow into a slideshow which reacts to voice commands!

References and related

Vosk Installation (alphacephei.com)

Raspberry Pi slideshow

This conference USB mic works really well for me.

Categories
Python

Azure DevOps: use the api to copy logs to linux

Intro

As far as I can tell there’s no way to search through multiple pipeline logs with a single command. In linux it’s trivial. Seeing the lack of this basic functionality I decided to copy all my pipeline logs over to a linux server using the Azure DevOps (ADO) api.

The details

This is the main program which I’ve called get_raw_logs.py.

                    

#!/usr/bin/python3
# fetch raw log to local machine
# for relevant api section, see:
#https://learn.microsoft.com/en-us/rest/api/azure/devops/build/builds/get-build-log?view=azure-devops-rest-7.1
import urllib.request,json,sys
from datetime import datetime,timedelta
from modules import aux_modules

conf_file = sys.argv[1]

# pipeline uses UTC so we must follow suit or we will miss files
a_day_ago = (datetime.utcnow() - timedelta(days = 1)).strftime('%Y-%m-%dT%H:%M:%SZ')
print('a day ago (UTC)',a_day_ago)

#url = 'https://dev.azure.com/drjohns4ServicesCoreSystems/Connectivity/_apis/build/builds?minTime=2022-10-11T13:00:00Z&api-version=7.1-preview.7'

# dump config file into a dict
config_d = aux_modules.parse_config(conf_file)

url = config_d['url_base'] + config_d['organization'] + '/' + config_d['project'] + '/_apis/build/builds?minTime=' + a_day_ago + config_d['url_params']
#print('url',url)
req = urllib.request.Request(url)
req.add_header('Authorization', 'Basic ' + config_d['auth'])

# Get buildIds for pipeline runs from last 24 hours
with urllib.request.urlopen(req) as response:
   html = response.read()
txt_d = json.loads(html)
#{"count":215,"value":[{"id":xxx,buildNumber":"20221011.106","definition":{"name":"PAN-Usage4Mgrs-2"
value_l = txt_d['value']
for builds in value_l:
    buildId = builds['id']
    build_number = builds['buildNumber']
    build_def = builds['definition']
    name = build_def['name']
    #print('name,build_number,id',name,build_number,buildId)
    #print('this_build',builds)
    if name == config_d['pipeline1'] or name == config_d['pipeline2']:
        aux_modules.get_this_log(config_d,name,buildId,build_number)

In the modules directory this is aux_modules.py.

                    

import json
import os,urllib.request

def parse_config(conf_file):
# config file should be a json file
    f = open(conf_file)
    config_d = json.load(f)
    f.close()
    return config_d

def get_this_log(config_d,name,buildId,build_number):
# leaving out the api-version etc works better
#GET https://dev.azure.com/{organization}/{project}/_apis/build/builds/{buildId}/logs/{logId}?api-version=7.1-preview.2
#https://dev.azure.com/drjohns4ServicesCoreSystems/d6335c8e-f5b4-44a5-8f6c-7b17fe663a86/_apis/build/builds/44071/logs/7'
        buildId_s = str(buildId)
        log_name = config_d['log_dir'] + "/" + name + "-" + build_number
# check if we already got this one
        if os.path.exists(log_name):
            return
        #url = url_base + organization + '/' + project + '/_apis/build/builds/' + buildId_s + '/logs/' + logId + '?' + url_params
        url = config_d['url_base'] + config_d['organization'] + '/' + config_d['project'] + '/_apis/build/builds/' + buildId_s + '/logs/' + config_d['logId']
        print('url for this log',url)
        req = urllib.request.Request(url)
        req.add_header('Authorization', 'Basic ' + config_d['auth'])
        with urllib.request.urlopen(req) as response:
            html = response.read()
        #print('log',html)
        print("Getting (name,build_number,buildId,logId) ",name,build_number,buildId_s,config_d['logId'])
        f = open(log_name,"wb")
        f.write(html)
        f.close()

Unlike programs I usually write, some of the key logic resides in the config file. My config file looks something like this.

{
"organization":"drjohns4ServicesCoreSystems",
"project":"Connectivity",
"pipeline1":"PAN-Usage4Mgrs",
"pipeline2":"PAN-Usage4Mgrs-2",
"logId":"7",
"auth":"Yaskaslkasjklaskldslkjsasddenxisv=",
"url_base":"https://dev.azure.com/",
"url_params":"&api-version=7.1-preview.7",
"log_dir":"/var/tmp/rawlogs"
}

It runs very efficiently so I run it every three minutes.

In my pipelines, all the interesting stuff is in logId 7 so I’ve hardcoded that. It could have turned out differently. Notice I am getting the logs from two pipelines due to the limitation, discussed previously, that you can only run 1000 pipeline runs a week so I was forced to run two identical ones, staggered, every 12 minutes with pipeline-2 sleeping the first six minutes.

The auth is the base-64 encoded text for any:<my_auth_token>.

Conclusion

I show how to copy the logs over from Azure DevOps pipeline runs to a local Unix system where you can do normal cool linux commands on them.

References and related

Running an ADO pipeline more than 1000 times a week.

ADO Rest api reference section relevant for this post: https://learn.microsoft.com/en-us/rest/api/azure/devops/build/builds/get-build-log?view=azure-devops-rest-7.1

How to secure a sensitive variable in ADO.

Categories
IT Operational Excellence

MS Teams tip: how to avoid embarrassment of starting a meeting after it’s finished

Intro

You can only join a meetnig well after it’s started. You don’t want to be “that guy” who has started the meeting that has already finished, sending out an embarrassing alert to all meeting participants that you have started the meeting. So how do you prevent that?

The tip

It’s a little subtle and thus worth mentioning. From the MS Teams calendar (not from Outlook) look for the meeting. Click on it, but not on the Join button, and then click on Chat with particpants. See the screenshot.

Or you can right-click on the meeting and choose Chat with participants from the menu of available actions.

The chat, which is associated with the meeting or meeting series, will show when the meeting has ended!

Assumptions

Allow meeting chat is enabled for the meeting.

It is an internal meeting – not one for which you are waiting in the lobby.

Any participant can start the meeting.

Conclusion

We have shown how to always check to make sure you aren’t starting a Microsoft Teams meeting which is already over. It will spare you some embarrassment. I have even experienced and contributed to “meeting start ping-pong,” which is most embarrassing. By the time you join a meeting with no participants and realize it, it is already too late! The others will have been notified of your stupid action no matter how quick you are to leave the meeting.

Less frequently you will get a warning that a meeting is in progress and x participants have joined the meeting. But I have found that cannot be relied upon.

Categories
Cloud

Azure DevOps: pipeline tips

Intro

I’ve made a lot of mistakes and learned something from every one. I am trynig to pass on some of what I learned.

Limit of 1000 pipeline runs per week

This seems pretty crazy. In my unit we were going all gung-ho and imagining Azure DevOps pipelines as an elegant replacement for running cron jobs on our own linux servers. But something as simple as “I need to run this job every five minutes” seems to be overwhelming for a pipeline. What? Yes there is a hard limit of 1000 pipeline jobs a week. This limit is discussed here.

I wanted a job to run every six minutes, which will still hit that limit. So what I am trynig is to create two pipelines. Each is scheduled to run every 12 minutes. The yaml files are almost the same except in the one I sleep for six minutes. I also needed to remember to re-create the pipeline variables I was using.

Getting the raw logs from your pipeline runs

I wrote a little script to get the raw logs and copy them to a linux filesystem where I can use the linux command-line tools I know and love to examine them in bulk. The main point I wish to share right now is that it is not at all obvious that you need to use the get builds section of the api, not the log section! Who would have guessed? https://learn.microsoft.com/en-us/rest/api/azure/devops/build/builds/get-build-log?view=azure-devops-rest-7.1

Errors I am seeing in my pipeline

[error] We stopped hearing from agent dsc-adosonar-drjohns4servicescoresystems-agent-549c476959-whd72. Verify the agent machine is running and has a healthy network connection. Anything that terminates an agent process, starves it for CPU, or blocks its network access can cause this error. For more information, see: https://go.microsoft.com/fwlink/?linkid=846610

We still need to figure this one out. The error appears only randomly.

I also saw a lot of more subtle errors which amounted to my variables not being defined correctly in the yaml section. Indentation is important! I had variables set up secret environment variables amongst other things. The behavior which results does not always make it obvious what the root cause is.

Don’t run the pipeline for every commit

In your commit comment, put

[skip ci]

somewhere on its own line. This will avoid that the pipeline runs each time you do a commit, which quickly gets annoying.

References and related

How to use the Azure DevOps api to for instance fetch the raw logs

Microsoft’s api documentation pertinent to this topic:

https://learn.microsoft.com/en-us/rest/api/azure/devops/build/builds/get-build-log?view=azure-devops-rest-7.1

Categories
Cloud Python

Azure DevOps: How to work in a subfolder of a project

Intro

Our repo corresponds to a project. Within it are subfolders corresponding to individual apps we want to run in a pipeline.

Someone provided me with a starter yaml file to run my code in a pipeline. Originally my code was running just fine on my own linux server. In the pipeline, not so much as it became apparent it was expecting the current working directory to be the subfolder (directory in linux-speak) for references to modules, config files, etc. The documentation is kind of thin. So I spent hours checking things out and creating the solution which I now present.

The solution

The critical thing is to set the workingDirectory. Here is that section of the yaml file.

 script: python getlogdata.py 'PAN4-5min.aql'
  displayName: 'Run script'
  workingDirectory: $(System.DefaultWorkingDirectory)/PAN_Usage_4_Mgrs
  env:
    AUTH_TOKEN: $(auth_token)
#    PYTHONPATH: $(System.DefaultWorkingDirectory)/PAN_Usage_4_Mgrs/modules

Note that that PYTHONPATH environment variable is another possible way out – if all you need is to include modules, but it won’t help you with other things like finding your config file.

Errors

Now suppose you see an error like I got:

ImportError: cannot import name 'ZabbixMetric' from 'jhpyzabbix' (unknown location).

I had tried to put jhpyzabbix folder at the same level as my subfolder, so, right under the top project level. At first I was getting module not found errors. So I put back my PYTHONPATH like so

    PYTHONPATH: $(System.DefaultWorkingDirectory)/PAN_Usage_4_Mgrs:$(System.DefaultWorkingDirectory)

And that’s when I got that cannot import name error. Whar caused that is that although I had copied over the needed .py files to jhpyzabbix, I forgot one whose purpose seemed irrelevant to me. __init__.py. Turns out that tiny python file is quite important after all. School of hard knocks… It sets up the namespace mapping, I guess. To be concrete, mine looks like this:

from .api import ZabbixAPI, ZabbixAPIException, ssl_context_compat
from .sender import ZabbixMetric, ZabbixSender, ZabbixResponse
References and related

Passing secure variable in Azure DevOps to your program

Categories
Cloud

Securing a sensitive variable in Azure DevOps

Intro

I’m a newbie at Azure DevOps. They sort of have to drag me kicking and screaming to give up the comfort of my personal linux to endure the torture of learning something brand new. But I iguess that’s the new way so I gotta go along with it. I work within a project that has a repo. And we intend to run a pipeline. Question is, how to deal with sensitive parameters like passwords?

One approach

If you just do raw Internet searches you can fall into any number of rabbit holes. Key vault, anyone? How about relying on secure files within the library? I guess – very tentatively – that using a key vault might be the right way to deal with this issue, but it all depends on the ACLs available and I do not know that.. I also do not see a simple-minded way to set up a key vault. So what the guy who seems to know a lot more than I do is to set up a hidden variable for the pipeline itself.

The thing is that even that has its own gotchas. I find that depending on where you start from, you may ior may not see the option to declare a pipline variable as hidden.

If I am in the Pipeline section and looking at Recently Run Pipelines, and click on my pipeline, before I run it I can add variables. Doing it that way, you only get the option to include a name and Value. No option for declaring it to be hidden.

So how do you do it?

Instead of Run Pipline > variables go to Edit Pipeline > Variables. Then variables you create will have the option to be kept secret.

It is a bad idea to pass a sensitive information on the command line. Anyone with access to that agent could do a process listing to read the secret. So instead you use an environment variable. It’s a little tricky as you have to understand yaml variable interpolation versus script interpolation.

My secret variable is auth_token for the Zabbix api token. So in my yaml file I have a reference to it which sets up an environment variable:

- script: python Delete_Unused_VEdges/activity_check.py 'SA'
  displayName: 'Run script'
  env:
    AUTH_TOKEN: $(auth_token)

And in activity_check.py I read that environment variable like this:

token_zabbix = os.environ['AUTH_TOKEN']

And, voila, it’s all good.

But how secure is it really?

Question is, is this just performative security? Because I’m working with a project. Anyone else with access to the project could alter my python program temporarily to print out the value of the secret variable. I’m assuming they can both alter the repo I use as well as run a pipeline. Yes, they make it a tiny bit challenging because you need to break up a secret variable into two pieces in order to print it out, but come on. So maybe we’ve mostly achieved security by obscurity.

But again, I don’t really know the richness of this environment and perhaps more can be done to restrict who can alter the repo and run it in a pipeline.

Conclusion

People who know more than I, a newbie, have sugested a certain approach to dealing with sensitive variables with our Azure DevOps pipelines. I have described here what I as a newbie see as potential pitfalls as well as how to implement their chosen method.

Categories
CentOS Web Site Technologies

CentOS is dead

Intro

As my devoted followers will be aware, I nearly kiled myself converting my Centos 6 VM to CentOS 8. For instance see Upgrading WordPress brings a thicket of problems. That is an experience I only want to go through every 10 years, and fortunately, CentOS was just the right platform as its support was supposed to last 10 years I started this blog in either 2011 I believe. I went to CentOS 8 in 2020.

But instead of eight more good years, I’ve learned that CentOS is basically a dead product. EOL in industry parlance. IBM killed it. The last upgrades to CentOS 8 came at the end of 2021. There is a sort of CentOS, now called CentOS Stream, but it should be basically thought of a another Fedora. Probably IBM was losing too much money with people choosing CentOS (free) over RedHat paid subscription.

But anyway, I’ve come to resent how out-of-date the packages are on CentOS and I am much more favorable to plain old Debian linux, largely due to my work on the Raspberry Pi. There the packages like python are much more uptodate. I guess the support is for five years.

The other VM I would consider for my next iteration is Amazon Linux. It has a lot of what I need already installed, so less fuss. But I think they’re only supported for three years.

References and related

There is a snarky commentary about this topic which inspired this article. I don’t have the link right now but it is enlightening. I will post it if I ever find it.

Upgrading WordPress brings a thicket of problems