Categories
Admin IT Operational Excellence Linux

Splitting a Text File Into Two Lines with Awk

Intro
How do you split a text file into two lines output per one original input line? Of course there are zillions of ways, with shell, xargs, Perl, your favorite tool, etc. But I decided to revisit that old standard awk to see if it might not just be the best (most compact and intelligible) way to do it!

The Challenge
I was provided a spreadsheet concerning printers in a new building, which I was to use to create access table entries for sendmail, i.e., so that they would be permitted to relay mail (these days it seems all printers are also scanners).

I wanted to have a comment line with the native printer name, with format

# Printer_Name

Then the appropriate access table entry, which has format

IP_ADDRESS   RELAY

As an additional wrinkle the spreadsheet had columns with variable amount of whitespace! It was very similar to the input below, which I had in a file called tmp:

PA01-USCVI-B52_160-P137C              Bldg 52 Plant 1st 160           10.12.210.161
PA02-Y-B53_160-D220                 Blag 53 Plant 1st 160               10.13.209.162
PA03-UIT-B54_COPY1-D645C         Bldg 54 Plant 1st Copy Rm   10.208.211.163
PA04-RUITY-B55-P235                 Bldg 55 Plant Basement Off         10.14.205.169
PA05-THY-675            John Tollesin    Bldg 53 Plant 2nd 220          10.13.204.156 
 

Fortunately I was interested in the first and last fields, which kept things simple. Here’s what I came up with:
 

awk '{print "# "$1"\n"$NF"\tRELAY"}' tmp

Not bad, eh? In addition to being relatively few characters, it makes sense to me, so I will remember this trick for the next time, which is a timesaver.

I have to get myself to a Unix or Cygwin session to show the output, but it is as I described. I guess the biggest trick is that awk allowed me to conveniently write out two lines in one statement by creating an ASCII newline character with the “\n”character. It’s probably better known that $1 stands for the first field and $NF (number of fields) stands for the last field of a line.

Conclusion
Sexier tools have come along, but don’t give up on our old friend awk – basic knowledge of what it does can be a real timesaver.