How to escape Linux commands for the WordPress editor

Intro
I develop a lot of stuff on the Linux command line and then want to share it on my blog, which runs on WordPress. I only use the HTML editing mode because the alternative was disastrous. But my commands were being mangled, even when they displayed OK: copied to the clipboard and pasted into a shell, they produced some really strange characters that were not at all what I had in my original command. What to do?

The details
An example will go a long way to show what I mean. Say I want to examine what static routes I’ve created on my server and send the results to a file. This came up a couple posts ago. So I developed the command:

$ netstat ‐rn|cut ‐c‐16|egrep ‐v ^'10\.|172|169' > /tmp/results

Now if I enter it literally in my blog with those characters it would appear like this:

$ netstat -rn|cut -c-16|egrep -v ^’10\.|172|169′ > /tmp/results
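
To see exactly which characters were swapped in, a pasted command can be scanned for anything outside ASCII. The following is a minimal sketch, not part of the original workflow; the function name report_non_ascii is made up for illustration, and the two code points in the sample string are an assumption about what the rendered quotes above contain.

```python
import unicodedata

def report_non_ascii(text):
    """Return (index, code point, Unicode name) for each non-ASCII character."""
    findings = []
    for index, ch in enumerate(text):
        if ord(ch) > 127:
            name = unicodedata.name(ch, "UNKNOWN")
            findings.append((index, "U+%04X" % ord(ch), name))
    return findings

# Assumed code points for the two mangled quotes in the rendered command
mangled = "egrep -v ^\u201910\\.|172|169\u2032"
for finding in report_non_ascii(mangled):
    print(finding)
```

An empty result means the text is plain ASCII and should be safe to paste into a shell.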

I can avoid formatting issues by using the <pre> tag, but then I can’t bold my commands. I try to follow the convention that commands typed in by the user appear in boldface.

A Python program solves the problem
I developed the following Python program, which spits out properly encoded versions of the characters that I’ve determined are at risk of being misrepresented in my blog. I call it htmlescape.

#!/usr/bin/env python3
# mostly lifted from https://wiki.python.org/moin/EscapingHtml
# DrJ - 7/22/16
import sys

# Characters at risk of being mangled, mapped to their HTML entities
html_escape_table = {
    "&": "&amp;",
    '"': "&quot;",
    "'": "&apos;",
    ">": "&gt;",
    "<": "&lt;",
    "-": "&hyphen;",
    }

def html_escape(text):
    """Produce entities within text."""
    return "".join(html_escape_table.get(c, c) for c in text)

# Prompt and read from stdin so the shell never interprets the command
sys.stdout.write("Enter your command string: ")
sys.stdout.flush()
code = sys.stdin.readline().rstrip("\n")
print(code)
print(html_escape(code))
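
As a usage illustration, here is roughly what the escaping does to the netstat command from earlier. This is a self-contained sketch in Python 3 that repeats the same table rather than reading from the prompt; the variable name command is only for illustration.

```python
# Same mapping as in htmlescape, repeated so this snippet runs on its own
html_escape_table = {
    "&": "&amp;",
    '"': "&quot;",
    "'": "&apos;",
    ">": "&gt;",
    "<": "&lt;",
    "-": "&hyphen;",
}

def html_escape(text):
    """Replace each at-risk character with its entity; pass others through."""
    return "".join(html_escape_table.get(c, c) for c in text)

# Raw string so Python leaves the backslash alone
command = r"$ netstat -rn|cut -c-16|egrep -v ^'10\.|172|169' > /tmp/results"
print(html_escape(command))
# prints: $ netstat &hyphen;rn|cut &hyphen;c&hyphen;16|egrep &hyphen;v ^&apos;10\.|172|169&apos; &gt; /tmp/results
```

The printed line is what would be pasted into the WordPress HTML editor.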

Why the command string prompt
I realized that if I passed the command as an argument, the shell would intervene and mangle my single quotes, double quotes, dollar signs and a whole lot more. So I wanted to read the command somewhere the shell can’t touch it, which Python provides with its sys.stdin and sys.stdout objects. They are perfect: they do not do any character manipulation.
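
The effect of the shell on quoting can be approximated with Python’s shlex module, which splits a string using POSIX-shell-like quoting rules. This is only a sketch of the idea: shlex does not expand variables or handle pipes, so it illustrates quote removal and nothing more. The variable names are hypothetical.

```python
import shlex

# What was typed, including the single quotes (raw string keeps the backslash)
typed = r"egrep -v ^'10\.|172|169'"

# Roughly what a program would receive as arguments after the shell is done
seen_by_program = shlex.split(typed)
print(seen_by_program)
# prints: ['egrep', '-v', '^10\\.|172|169']
```

The single quotes are gone before the program ever sees its arguments, which is why reading the command from standard input is the safer route.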

A few comments about the characters

Why encode the hyphen? It comes up all the time as the prefix character for command arguments. A single hyphen gets represented OK, but some commands actually require a double hyphen, ‐‐, and that gets mangled. Also, I’ve noticed that the minus sign and the hyphen are represented differently in HTML. The minus sign is shown longer and just doesn’t look right. And that is the default representation of the “-” character, even though in shell commands you almost always mean it as a hyphen, ‐.

The apostrophe is important because it prevents the shell from interpolating variables inside a pair of apostrophes. In the context of the shell it is more appropriately called a single quote or a tick mark. When the post is rendered, the automatic text formatting (as far as I can tell this is WordPress’s doing rather than the browser’s) tries to be fancy: it looks for pairs of single quotes and turns them into curly quotes, rendering each as an entirely different character. Same thing for double quotes.

Strangely, the back tick ` does not suffer a similar fate. That does not get mangled so no need to represent it in encoded form. At least as far as I’ve seen. I suspect that somewhere under some circumstances it too might get mangled, but I can’t produce those conditions right now.
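
It can help to check which character each entity actually decodes to. The sketch below uses Python 3’s html.unescape for that; the helper name decode_entity is made up for illustration, and the numeric reference &#45; is included only for comparison, not because the program above emits it.

```python
import html
import unicodedata

def decode_entity(entity):
    """Return (code point, Unicode name) of the character an entity decodes to."""
    ch = html.unescape(entity)
    return "U+%04X" % ord(ch), unicodedata.name(ch)

for entity in ("&hyphen;", "&apos;", "&quot;", "&#45;"):
    print(entity, *decode_entity(entity))
```

Note that &hyphen; decodes to U+2010 HYPHEN, which is a different character from the ASCII HYPHEN-MINUS (U+002D) typed on the keyboard. Command-line tools generally look for U+002D as the option prefix, so it is worth testing a copy and paste from the published page before relying on it.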

Example 2
$ curl –noproxy –show-error

Those are double hyphens, which is the correct syntax for curl! But each pair renders as a single long dash, which is bad enough, and if you copy it to your clipboard and paste it into a shell it produces garbage characters. At the end I put a <url>, but that just became totally invisible! Running the command through htmlescape makes it look like this:

$ curl ‐‐noproxy ‐‐show‐error <url>

In my editor screen I have entered this:

$ curl &hyphen;&hyphen;noproxy &hyphen;&hyphen;show&hyphen;error &lt;url&gt;

And how did I produce that line? Why, I took the escaped output and ran it through htmlescape one more time!
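
The double pass can be sketched as follows, again in Python 3 with the same table repeated so the snippet stands alone. The names once and twice are only for illustration.

```python
import html

html_escape_table = {
    "&": "&amp;", '"': "&quot;", "'": "&apos;",
    ">": "&gt;", "<": "&lt;", "-": "&hyphen;",
}

def html_escape(text):
    """Replace each at-risk character with its entity; pass others through."""
    return "".join(html_escape_table.get(c, c) for c in text)

command = "$ curl --noproxy --show-error <url>"
once = html_escape(command)   # paste this to display the command itself
twice = html_escape(once)     # paste this to display the entities literally

print(once)
print(twice)
# Decoding the second pass once gives back the first pass
print(html.unescape(twice) == once)
```

The second pass only touches the ampersands, since the first pass has already removed every other at-risk character.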

Alternatives considered
I hate looking for plugins. None work exactly the way you want, they are poorly documented, they fall out of support, etc. So yes, there are plugins that might work, but for my situation I prefer to maintain full control and go my own way.
Up until yesterday I was doing all the character substitutions by hand! That’s another alternative, but it gets tiresome.

Conclusion
A Python program is presented which escapes Linux command line strings so they can be published in a WordPress blog without being mangled.