In comp.lang.awk, Ed Morton offers advice on how to print ranges of Awk records.
Suppose you are looking to extract a section of code from a text file based on two regular expressions.
Say the file looks like this:

    newspaper
    magazing
    hiking
    hiking trails in the city
    muir hike
    black mountain hike
    summer meados hike
    end hiking
    phone
    cell
    skype

and you want to extract

    hiking trails in the city
    muir hike
    black mountain hike
    summer meados hike

The following range pattern won't work right:
awk '/hiking/,/end hiking/{print}' myfile
since that also returns spurious lines: the "hiking" and "end hiking" delimiters themselves.
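To see the problem concretely, here is a small sketch that feeds the sample data to the range pattern. It assumes each item of the sample sits on its own line, and it pipes the data on standard input rather than reading myfile; the variable names are illustrative only.

```shell
# Sample data from the text, one record per line (layout is an assumption).
hiking_file=$(printf '%s\n' newspaper magazing hiking \
    'hiking trails in the city' 'muir hike' 'black mountain hike' \
    'summer meados hike' 'end hiking' phone cell skype)

# Naive range pattern: prints from the first line matching /hiking/
# through the next line matching /end hiking/, both inclusive.
naive=$(printf '%s\n' "$hiking_file" | awk '/hiking/,/end hiking/{print}')
printf '%s\n' "$naive"
# Prints:
#   hiking
#   hiking trails in the city
#   muir hike
#   black mountain hike
#   summer meados hike
#   end hiking
```

The first and last lines of that output are the delimiters, not part of the section you wanted.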
What to do?
Personally, I rarely if ever use
/start/,/end/
as I'm never immediately sure what it'd output for input such as:
    start
    a
    start
    b
    end
    c
    end
and whenever you want to do something just slightly different with the selection you need to change the script a lot.
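For what it's worth, the question about that input can be settled by running it. The sketch below assumes the input is one word per line; the variable names are made up for the illustration.

```shell
# The ambiguous input from the text, one record per line (assumed layout).
input=$(printf '%s\n' start a start b end c end)

# A range pattern with no action prints every record in the range.
ranged=$(printf '%s\n' "$input" | awk '/start/,/end/')
printf '%s\n' "$ranged"
# Prints:
#   start
#   a
#   start
#   b
#   end
```

The second "start" is just another line inside the open range, the first "end" closes the range, and the trailing "c" and "end" are skipped because no new "start" reopens it.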
Not being sure of the semantics is probably a catch-22, since I rarely use it, but the benefit of using that syntax vs. spelling it out:
/start/{f=1} f; /end/{f=0}
just doesn't really seem worthwhile, and then if you want to do something extra, like test for some other condition over the block, this:
/start/{f=1} f&&cond; /end/{f=0}
is about as brief as:
/start/,/end/{if (cond) print}
and if you want to exclude the start (or end) of the block you're printing then you just move the "f" test to the obvious place and you don't need to duplicate the condition:
f; /start/{f=1} /end/{f=0}
vs
/start/,/end/{if (!/start/) print}
and note the different semantics now. This:
f; /start/{f=1} /end/{f=0}
will exclude the line at the start of the block you're printing, whereas this:
/start/,/end/{if (!/start/) print}
will exclude that line plus every other occurrence of "start" within the block, which is probably not what you'd want. To exclude only the first line of the block but stay with the /start/,/end/ approach, you'd need to do something like:
/start/,/end/{if (nr++) print; if (/end/) nr=0}
(which is getting fairly obscure.)
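As a rough check of the claims above, the three "exclude the start line" variants can be run against the start/end input from earlier, and the flag idiom can then be applied to the hiking file from the top. This is only a sketch: it assumes one record per line, the counter variant is written with `if (nr++)` (false only on the first line of a block), and the anchored patterns `^hiking$` and `^end hiking$` are my assumption that the delimiters sit alone on their lines.

```shell
input=$(printf '%s\n' start a start b end c end)

# Flag version: f is tested before /start/ sets it, so the opening line is skipped.
flagged=$(printf '%s\n' "$input" | awk 'f; /start/{f=1} /end/{f=0}')

# Range version filtering on !/start/: drops every line that matches "start".
filtered=$(printf '%s\n' "$input" | awk '/start/,/end/{if (!/start/) print}')

# Range version with a counter: skips only the first line of each block.
counted=$(printf '%s\n' "$input" | awk '/start/,/end/{if (nr++) print; if (/end/) nr=0}')

# Hiking file: clear the flag before the test and set it after, so that
# both delimiter lines are excluded.
hiking_file=$(printf '%s\n' newspaper magazing hiking \
    'hiking trails in the city' 'muir hike' 'black mountain hike' \
    'summer meados hike' 'end hiking' phone cell skype)
section=$(printf '%s\n' "$hiking_file" | awk '/^end hiking$/{f=0} f; /^hiking$/{f=1}')

printf 'flagged:  %s\n' "$(printf '%s' "$flagged" | tr '\n' ' ')"
printf 'filtered: %s\n' "$(printf '%s' "$filtered" | tr '\n' ' ')"
printf 'counted:  %s\n' "$(printf '%s' "$counted" | tr '\n' ' ')"
printf '%s\n' "$section"
```

On this input the flag and counter versions both print a, start, b, end, while the `!/start/` filter prints only a, b, end because it drops the inner "start" too. The last command prints just the four lines between the hiking delimiters.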