Solving this week’s NPR weekend puzzle with a few Linux commands

Intro
I listen to the NPR puzzle every Sunday morning. I’m not particularly good at solving them; I usually don’t. But I always consider whether I could get a little help from my friendly Linux server, i.e., whether the puzzle lends itself to solution by programming. As soon as I heard this week’s challenge I felt that it was a good candidate. I was not disappointed…

The details
So Will Shortz says think of a common word with four letters. Now add O, H and M to that word and scramble the letters to make another common word of seven letters. The words are both things you use daily, and these things might be next to each other.

My thought on that is: great, we can look through a dictionary for seven-letter words which contain O, H and M. That already might be sufficiently limiting.

This reminded me of using the built-in Linux dictionary to give me some great tips when playing Words with Friends, which I document here.

On my CentOS system the dictionary is /usr/share/dict/linux.words. It has 479,829 words:

$ cd /usr/share/dict; wc linux.words

That’s a lot. So of course most of them are garbagey words. Here’s the beginning of the list:

$ more linux.words

1080
10-point
10th
11-point
12-point
16-point
18-point
1st
2
20-point
2,4,5-t
2,4-d
2D
2nd
30-30
3-D
3-d
3D
3M
3rd
48-point
4-D
4GL
4H
4th
5-point
5-T
5th
6-point
6th
7-point
7th
8-point
8th
9-point
9th
-a
A
A.
a
a'
a-
a.
A-1
A1
a1
A4
A5
AA
aa
A.A.A.
AAA
aaa
AAAA
AAAAAA
...

You see my point? But amongst the garbage are real words, so it’ll be fine for our purpose.

What I like to do is to build up to increasingly complex constructions. Mind you, I am no command-line expert. I am an experimentalist through-and-through. My development cycle is Try, Demonstrate, Fix, Try, Demonstrate, Improve. The whole process can sometimes be finished in under a minute, so it must have merit.

First try:

$ grep o linux.words|wc

 230908  230908 2597289

OK. Looks like we got some work to do, yet.

Next (using up-arrow key to recall previous command, of course):

$ grep o linux.words|grep m|wc

  60483   60483  724857

Next:

$ grep o linux.words|grep m|grep h|wc

  15379   15379  199724

Drat. Still too many. But what are we actually producing?

$ grep o linux.words|grep m|grep h|more

abbroachment
abdominohysterectomy
abdominohysterotomy
abdominothoracic
Abelmoschus
abhominable
abmho
abmhos
abohm
abohms
abolishment
abolishments
abouchement
absmho
absohm
Acantholimon
acanthoma
acanthomas
Acanthomeridae
acanthopomatous
accompliceship
accomplish
accomplishable
accomplished
accomplisher
accomplishers
accomplishes
accomplishing
accomplishment
accomplishments
accomplisht
accouchement
accouchements
accroachment
Acetaminophen
acetaminophen
acetoamidophenol
acetomorphin
acetomorphine
acetylmethylcarbinol
acetylthymol
Achamoth
achenodium
achlamydeous
Achomawi
...

Of course: capitalized words, words longer and shorter than seven letters. There are lots of tools left to cut this down to a manageable size.

With this expression we can simultaneously require exactly seven letters in our words and require only lowercase alphabetical letters: egrep '^[a-z]{7}$'. This is an extended regular expression that anchors the match at the beginning (^) and end ($) of the line, allows only the characters a-z, and requires exactly seven of them ({7}).

With that vast improvement, we’re down to 352 entries, a list small enough to browse by hand. But the solution still didn’t pop out at me. Most of the words are obscure ones, which should automatically be excluded because we are looking for common words. We have:

$ grep o linux.words|grep m|grep h|egrep '^[a-z]{7}$'|more

achroma
alamoth
almohad
amchoor
amolish
amorpha
amorphi
amorphy
amphion
amphora
amphore
apothem
apothgm
armhole
armhoop
bemouth
bimorph
bioherm
bochism
bohemia
bohmite
camooch
camphol
camphor
chagoma
chamiso
chamois
chamoix
chefdom
chemizo
chessom
chiloma
chomage
chomped
chomper
chorism
chrisom
chromas
chromed
chromes
chromic
chromid
chromos
chromyl
...
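
The filter chain above can be wrapped in a small function so it is easy to rerun on any word list. This is only a sketch: the function name ohm_candidates is made up for illustration, and it uses grep -E, which is the same thing as egrep. It reads words on standard input, one per line.

```shell
# Keep only seven-letter, all-lowercase words that contain o, h and m.
# Putting the length filter first shrinks the list before the letter checks.
ohm_candidates() {
  grep -E '^[a-z]{7}$' | grep o | grep h | grep m
}

# Example use against the dictionary from this post (path as on my CentOS box):
#   ohm_candidates < /usr/share/dict/linux.words | more
```

Because it reads standard input, the same function works on any other one-word-per-line dictionary.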

So I thought it might be inspiring to put the four letters you would have if you take away the O, H and M next to each word, right?

I probably ought to use xargs but never got used to it. I’ve memorized this other way:

$ grep o linux.words |grep m|grep h|egrep '^[a-z]{7}$'|while read -r line; do
> s=$(echo "$line"|sed s/o//|sed s/h//|sed s/m//)
> echo "$line" "$s"
> done|more

sed is an old standard used to do substitutions. sed s/o// for example is a filter which removes the first occurrence of the letter O.

I could almost use the tr command, as in

> …|tr -d '[ohm]'

in place of all those sed statements, but tr -d deletes every occurrence of the letters O, H and M, whereas the puzzle adds only one of each, so only one of each should be removed. A way around that didn’t jump out at me.

So until I figure that out, use sed. That gives:

achroma acra
alamoth alat
almohad alad
amchoor acor
amolish alis
amorpha arpa
amorphi arpi
amorphy arpy
amphion apin
amphora apra
amphore apre
apothem apte
apothgm aptg
armhole arle
armhoop arop
bemouth beut
bimorph birp
bioherm bier
bochism bcis
bohemia beia
bohmite bite
camooch caoc
camphol capl
camphor capr
chagoma caga
chamiso cais
chamois cais
chamoix caix
chefdom cefd
chemizo ceiz
chessom cess
chiloma cila
chomage cage
chomped cped
chomper cper
chorism cris
chrisom cris
chromas cras
chromed cred
chromes cres
chromic cric
chromid crid
chromos cros
chromyl cryl
...
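
For what it’s worth, the three chained sed calls can be collapsed into a single sed with three -e expressions, each of which still removes only the first occurrence of its letter. Here is a sketch with that idea pulled into two helper functions; the names leftover and pair_with_leftover are my own invention for illustration, not anything standard.

```shell
# Remove one o, one h and one m (the first occurrence of each) from a word.
leftover() {
  printf '%s\n' "$1" | sed -e 's/o//' -e 's/h//' -e 's/m//'
}

# Read words on stdin and print each one next to its four leftover letters.
pair_with_leftover() {
  while read -r word; do
    printf '%s %s\n' "$word" "$(leftover "$word")"
  done
}

# Example: printf 'chromos\n' | pair_with_leftover
```

Since a substitution without the g flag stops after the first match, a word like chromos keeps its second o, which is exactly what tr -d cannot do.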

Friday update
Now that the submission deadline has passed, I can show the section of the listing that reveals the answer. It’s here:

...
schmoes sces
schmoos scos
semihot seit
shahdom sahd
shaloms sals
shamalo saal
shammos sams
shamois sais
shamoys says
shampoo sapo
shimose sise
shmooze soze
shoeman sean
sholoms slos
shopman span
shopmen spen
shotman stan
...

See it? I think it leaps out at you:

shampoo sapo

becomes of course:

soap
shampoo

!

They’re common words found next to each other that obey the rules of the challenge. You can probably tell I’m proud of solving this one. I rarely do. I hope they don’t call on me because I also don’t even play well against the radio on Sunday mornings.
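
I spotted the answer by eye, but the unscrambling step could also be checked mechanically: two words are anagrams of each other exactly when their sorted letters match. The following is a rough sketch of that idea, not something I ran as part of solving the puzzle; the function names sort_letters and anagrams_of are hypothetical.

```shell
# Print the letters of a word in sorted order, e.g. sapo -> aops.
sort_letters() {
  printf '%s\n' "$1" | fold -w 1 | sort | tr -d '\n'
}

# Read words on stdin; print those that are anagrams of the letters in $1.
anagrams_of() {
  target=$(sort_letters "$1")
  while read -r word; do
    if [ "$(sort_letters "$word")" = "$target" ]; then
      printf '%s\n' "$word"
    fi
  done
}

# Example: grep -E '^[a-z]{4}$' linux.words | anagrams_of sapo
```

It starts several processes per word, so it is slow, but the list of four-letter words is short enough that this should not matter much.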

Conclusion
As first written, before the Friday update above, this post could not give out the answer because the submission deadline was still a few days away. Even so, the answer pretty much pops out at you when you review the full listing generated with the above sequence of commands. There is no doubt whatsoever.

I have shown how a person with modest command-line familiarity can solve a word problem that was put out on NPR. I don’t think many people are interested in learning the command line, because there is no instant gratification and the learning curve is steep, but for some it is still worth the effort. I use it, well, all the time. Writing this up took a lot longer than solving the puzzle, which was probably only about 30 minutes of actual tinkering.

Generate Pronounceable Passwords

2017 update
Turns out gpw is an available package in Debian Linux, including Raspbian which runs on Raspberry Pi. Who knew? A simple sudo apt-get install gpw will provide it. So I guess the source wasn’t lost at all.

Intro
15 years ago I worked for a company that wanted to require authentication in order to browse the Internet. I searched around for something to generate the passwords.

What I came up with is gpw – generate pronounceable passwords.

The details
I think this approach to secure passwords is no longer best practice, but I still think it has a place for some applications. What it does is analyze a dictionary that you’ve fed it. It then determines the frequency of occurrence of what it calls trigraphs, which I take to mean combinations of three consecutive letters. Then it generates random, non-dictionary passwords using those trigraphs, which are presumably wholly or partially pronounceable.
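
To make the trigraph idea concrete, here is a small sketch that counts three-letter sequences in a word list. It only illustrates the counting step as I understand it and is not taken from the gpw source; the function name trigraph_counts and the choice to skip entries containing anything other than lowercase letters are my own assumptions.

```shell
# Read words on stdin and print "count trigraph" lines, most frequent first.
# Assumption: only all-lowercase alphabetic entries are counted.
trigraph_counts() {
  awk '/^[a-z]+$/ {
         # Slide a three-letter window across the word.
         for (i = 1; i + 2 <= length($0); i++) count[substr($0, i, 3)]++
       }
       END { for (t in count) print count[t], t }' |
    sort -k1,1nr -k2,2
}

# Example: trigraph_counts < /usr/share/dict/linux.words | head
```

A generator could then pick each next letter in proportion to how often it follows the previous two, which is presumably why the results look word-like.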

Cute, huh? I’d say one problem is that if the bad guys got wind of this approach, the number of combinations they’d have to try when cracking passwords would be severely restricted.

Sophos has a recommendation for forming good strong passwords. See their blog post about the 50 worst passwords, which contains a link to a video on how to choose a good password.

But I still have a soft spot for this old approach, and I think it’s OK to use it: get a password such as inglogri, add a few non-alphanumeric characters, and you have a reasonably good, memorable password. Every site you use should really get a different password, and this tool might make that actually feasible.

I run it as:

$ gpw

which produces:

seminour
shnopoos
alespige
olpidest
hastrewe
nsivelys
shaphtra
bratorid
melexseu
sheaditi

Its output changes every time, of course.

I mostly run it this way:

$ gpw 1

which produces only a single password, for instance:

ojavishd

You see how these passwords are sort of like words, but not words? Much more memorable than those completely random ones you are sometimes forced to type and which are impossible to remember?

I noted the location where I pulled it from the web 15 years ago as is my custom, but it is no longer available. So I have decided to make it available. I tweaked it to compile on CentOS with a C++ compiler.

Here is the CentOS v 6 binary for x86_64 architecture and README file.

Here is the tar file with the sources and the binary mentioned above. Run a make clean first to begin building it.

Enjoy!

Potential Problems
I know when we originally used it to assign 15,000 unique passwords, the randomness algorithm was so bad that I believe some people received identical passwords! So the total number of generatable passwords might be severely limited. Please check this before using it in any meaningful way. I would naively expect and hope that it could generate about two to three times the number of words in my dictionary (/usr/share/dict/linux.words, with 479,829 words). But I never verified this.

2017 update
I ran it, 100 passwords at a time, on my Raspberry Pi for a couple of minutes. I created 275,900 passwords, of which 269,407 were unique. Strange. So you get some repeats but you mostly get new passwords.
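
A tally like that can be produced with a few lines of awk. This is a sketch of one way to do it, not necessarily the exact commands I used; the function name tally_passwords is hypothetical, and it expects one password per line on standard input.

```shell
# Count how many lines were read and how many of them were distinct.
tally_passwords() {
  awk '{ total++; if (!seen[$0]++) unique++ }
       END { printf "%d total, %d unique\n", total, unique }'
}

# Example, generating 100 passwords per call as described above:
#   for i in 1 2 3 4 5; do gpw 100; done | tally_passwords
```

Comparing the two numbers over a long run gives a rough feel for how quickly repeats start to show up.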

Further, I was going to tweak the code to generate 9-letter passwords which would presumably be more secure. But they just didn’t look as good to me, and I’ve only ever used it with 8 letters. So I just decided to keep it at 8 letters. You can experiment with that if you want.

More fun with the Linux dictionary
For another fun example using the Linux dictionary see how I solved the NPR weekend puzzle using it, described here.

A note for Debian Linux users (Ubuntu, Raspberry Pi, …)
The dictionary there is /usr/share/dictd/wn.index. You’ll need to update the Makefile to reflect this. This post about Words with Friends explains the packages I used to provide that dictionary.

Conclusion
An old pronounceable password generating program has been dusted off and given back to the open source community. It may not be state-of-the-art, but it has a role for some usages.

References and related
Want truly random passwords? I want to call your attention to random.org’s password generator: https://www.random.org/passwords/

Most people are becoming familiar with the idea of not reusing passwords, but I don’t know if everyone realizes why. This article is a comprehensive review of the topic, plus a review of password vaults like LastPass, etc., which you may have heard of: https://pixelprivacy.com/resources/reusing-passwords/