Awk.Info

"Cause a little auk awk
goes a long way."

categories: Tips,Aug,2009,EdM

Print Ranges

In comp.lang.awk, Ed Morton offers advice on how to print ranges of Awk records.

Problem

Suppose you are looking to extract a section of code from a text file based on two regular expressions.

Say the file looks like this:

newspaper
magazing
hiking
hiking trails in the city
muir hike
black mountain hike
summer meados hike
end hiking
phone
cell
skype

and you want to extract

hiking trails in the city
muir hike
black mountain hike
summer meados hike

The following range pattern won't work right:

awk '/hiking/,/end hiking/{print}' myfile

since it also prints the start and end marker lines themselves, not just the lines between them.
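
To see the problem concretely, here is a small sketch that feeds the sample lines to that command. It assumes each item of the sample file sits on its own line, and it uses a made-up helper function, sample, in place of myfile.

```shell
# Hypothetical helper standing in for myfile.
# Assumption: every item of the sample is a separate line.
sample() {
  printf '%s\n' newspaper magazing hiking 'hiking trails in the city' \
    'muir hike' 'black mountain hike' 'summer meados hike' \
    'end hiking' phone cell skype
}

# The range pattern prints the lines matching /hiking/ and /end hiking/
# as well as everything between them.
sample | awk '/hiking/,/end hiking/{print}'
# prints:
# hiking
# hiking trails in the city
# muir hike
# black mountain hike
# summer meados hike
# end hiking
```

The first and last lines of that output are the unwanted extras.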

What to do?

Solution

Personally, I rarely if ever use

/start/,/end/

as I'm never immediately sure what it'd output for input such as:

start
a
start
b
end
c
end

and whenever you want to do something just slightly different with the selection you need to change the script a lot.
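
For reference, that nested input can be checked with a quick sketch like the one below (the helper function nested is made up for illustration). The range opens at the first "start" and closes at the first "end", so the second "start" is just another line inside the range, and the trailing "c" and "end" are skipped.

```shell
# Hypothetical helper that emits the nested sample input.
nested() { printf '%s\n' start a start b end c end; }

# A pattern-only range rule prints every line in the range.
nested | awk '/start/,/end/'
# prints:
# start
# a
# start
# b
# end
```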

Not being sure of the semantics is probably a catch-22, since I rarely use it, but the benefit of using that syntax over spelling it out:

/start/{f=1} f; /end/{f=0}

just doesn't seem worthwhile. And if you want to do something extra, like test for some other condition over the block, this:

/start/{f=1} f&&cond; /end/{f=0}

is about as brief as:

/start/,/end/{if (cond) print}

and if you want to exclude the start (or end) of the block you're printing then you just move the "f" test to the obvious place and you don't need to duplicate the condition:

f; /start/{f=1} /end/{f=0}
vs
/start/,/end/{if (!/start/) print}

and note the different semantics now. This:

f; /start/{f=1} /end/{f=0}

will exclude the line at the start of the block you're printing, whereas this:

/start/,/end/{if (!/start/) print}

will exclude that line plus every other occurrence of "start" within the block, which is probably not what you'd want. To exclude only the first line of the block but stay with the /start/,/end/ approach, you'd need to do something like:

/start/,/end/{if (nr++) print; if (/end/) nr=0}

(which is getting fairly obscure.)
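
The three ways of dropping the start line can be compared side by side with a short sketch. The helper functions nested and hiking are made up for illustration, the counter variant is written here as if (nr++) so that only the first line of each block is skipped, and the last command is one possible way to apply the flag idiom to the original problem, assuming the markers "hiking" and "end hiking" each occupy a whole line.

```shell
# Hypothetical helpers that emit the two sample inputs.
nested() { printf '%s\n' start a start b end c end; }
hiking() {
  printf '%s\n' newspaper magazing hiking 'hiking trails in the city' \
    'muir hike' 'black mountain hike' 'summer meados hike' \
    'end hiking' phone cell skype
}

# Flag version, start line excluded: prints a, start, b, end.
nested | awk 'f; /start/{f=1} /end/{f=0}'

# Range version testing !/start/: drops every "start" line,
# so it prints a, b, end.
nested | awk '/start/,/end/{if (!/start/) print}'

# Range version with a counter (nr is an ordinary variable, not the
# built-in NR): skips only the first line, prints a, start, b, end.
nested | awk '/start/,/end/{if (nr++) print; if (/end/) nr=0}'

# Original problem: clear the flag before the print test and set it
# after, so neither marker line is printed. The anchors ^ and $ are an
# assumption that each marker is a line by itself.
hiking | awk '/^end hiking$/{f=0} f; /^hiking$/{f=1}'
# prints:
# hiking trails in the city
# muir hike
# black mountain hike
# summer meados hike
```

Anchoring the start marker matters in the last command: with a plain /hiking/, the line "end hiking" would set f again and the lines after it would be printed too.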
