Awk.Info

"Cause a little auk awk
goes a long way."

categories: Runawk,Project,Tools,Jan,2010,AlexC

Runawk 0.19 Released

Download

http://sourceforge.net/projects/runawk

About

runawk is a small wrapper for the AWK interpreter that helps one write standalone AWK scripts. Its main feature is a module/library system for AWK, somewhat similar to Perl's "use" command. It also lets you select a preferred AWK interpreter and set up the environment for your scripts, and it ships with numerous useful modules.

Major Changes IN RUNAWK-0.19.0

  • fix in runawk.c: \n was missing from the "running '%s' failed: %s" error message. The problem was seen on an ancient (12 years old) HP-UX.
  • fix in teets/test.mk: "diff -u" is not portable (SunOS, HP-UX), so a DIFF_PROG variable is introduced to fix the problem.
  • fix in modules/power_getopt.awk: after printing the help message we should exit immediately without running the END section, s/exit/exitnow/.
  • new function heapsort_values in heapsort.awk module
  • new function quicksort_values in quicksort.awk module
  • new function sort_values in sort.awk module

Author

Aleksey Cheusov


categories: Runawk,Project,Tools,Nov,2009,AlexC

Runawk 0.18 Released

Download

http://sourceforge.net/projects/runawk

About

runawk is a small wrapper for the AWK interpreter that helps one write standalone AWK scripts. Its main feature is a module/library system for AWK, somewhat similar to Perl's "use" command. It also lets you select a preferred AWK interpreter and set up the environment for your scripts, and it ships with numerous useful modules.

Major Changes IN RUNAWK-0.18.0

Makefile:

  • "install-dirs" target has been renamed to "installdirs"
  • At compile time MODULESDIR can contain a *list* of colon-separated directories, e.g. /usr/local/share/runawk:/usr/local/share/awk
  • Support for options applied multiple times, e.g. -vvv for increasing the verbosity level. If an option without arguments is applied multiple times, the getarg() function returns the number of times it was applied, not just 0 or 1.

New modules:

  • init_getopt.awk, which uses alt_getopt.awk and is used by power_getopt.awk. Its goal is to initialize the `short_opts' and `long_opts' variables without running the `getopt' function.
  • heapsort.awk : heapsort :-)
  • quicksort.awk : quicksort :-)
  • sort.awk : either heapsort or quicksort; the default is heapsort. Unfortunately GAWK's asort() and asorti() functions do *not* satisfy my needs. Another (and more important) reason is portability.
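
To illustrate the portability point, here is a minimal sketch of sorting values with nothing but POSIX awk features, so it does not depend on GAWK's asort(). It is not the code from sort.awk or quicksort.awk; the names sort_lines_numerically and qsort are hypothetical and chosen only for this example.

```sh
# Hypothetical helper, not taken from the runawk modules: sorts the lines
# of stdin numerically with a hand-written quicksort in plain POSIX awk.
sort_lines_numerically() {
    awk '
    # Sort arr[lo..hi] in place, partitioning around the middle value.
    function qsort(arr, lo, hi,    i, j, pivot, tmp) {
        if (lo >= hi) return
        i = lo; j = hi
        pivot = arr[int((lo + hi) / 2)]
        while (i <= j) {
            while (arr[i] < pivot) i++
            while (arr[j] > pivot) j--
            if (i <= j) {
                tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp
                i++; j--
            }
        }
        qsort(arr, lo, j)
        qsort(arr, i, hi)
    }
    { vals[NR] = $0 + 0 }        # force numeric comparison
    END {
        qsort(vals, 1, NR)
        for (k = 1; k <= NR; k++) print vals[k]
    }'
}

printf '3\n1\n10\n2\n' | sort_lines_numerically
```

Because the sort is ordinary awk code, the same script should behave identically under gawk, mawk and other POSIX awk implementations.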

Improvements, clean-ups and fixes in regression tests.

Also, runawk-0.18.0 was successfully tested on the following platforms: NetBSD-5.0/x86, NetBSD-2.0/alpha, OpenBSD-4.5/x86, FreeBSD-7.1/x86, FreeBSD-7.1/sparc, Linux/x86 and Darwin/ppc.

Author

Aleksey Cheusov


categories: Runawk,Project,Tools,Sept,2009,AlexC

New release: RUNAWK 0.17

What is RUNAWK?

RUNAWK is a small wrapper for the AWK interpreter that helps one write standalone AWK scripts. Its main feature is a module/library system for AWK, somewhat similar to Perl's "use" command. It also lets you select a preferred AWK interpreter and set up the environment for your scripts. RUNAWK makes programming AWK easy and efficient. RUNAWK also provides many useful AWK modules.

Sources

Major Changes

Version 0.17.0, by Aleksey Cheusov, Sat, 12 Sep 2009

runawk:

  • ADDED: new option for runawk for #use'ing modules: -f. runawk can also be used for oneliners! ;-)
          runawk -f abs.awk -e 'BEGIN {print abs(-123); exit}'
    
  • In multiline code passed to runawk using option -e, spaces are allowed before #directives.
  • Now that the alt_getopt.awk module exists, there is no reason for the heuristic that detects whether to add `-' to AWK arguments, so I've removed it. Use the alt_getopt.awk module or another "smart" module to handle options correctly!

alt_getopt.awk and power_getopt.awk:

  • FIX: for "abc:" short options specifier BSD and GNU getopt(3) accept "-acb" and understand it as "-a -cb", they also accept "-ac b" and also translate it to "-a -cb". Now alt_getopt.awk and power_getopt.awk work the same way.
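
The clustering rule described above can be sketched in plain awk. This is only an illustration of the getopt(3) convention, not the alt_getopt.awk source; the function name normalize_short_opts is hypothetical, and the arguments are fed to awk on stdin (one per line) so that awk itself never tries to interpret them as its own options.

```sh
# Hypothetical sketch. Usage: normalize_short_opts SPEC ARG...
# Prints the arguments with short-option clusters split apart and
# option values attached to their option.
normalize_short_opts() {
    spec=$1; shift
    printf '%s\n' "$@" | awk -v spec="$spec" '
    { args[NR] = $0 }
    END {
        out = ""; sep = ""
        for (i = 1; i <= NR; i++) {
            arg = args[i]
            if (arg !~ /^-[^-]/) {           # not a short-option cluster
                out = out sep arg; sep = " "
                continue
            }
            for (p = 2; p <= length(arg); p++) {
                c = substr(arg, p, 1)
                pos = index(spec, c)
                if (pos > 0 && substr(spec, pos + 1, 1) == ":") {
                    # Option takes a value: the rest of the cluster,
                    # or else the next argument.
                    val = substr(arg, p + 1)
                    if (val == "" && i < NR) val = args[++i]
                    out = out sep "-" c val; sep = " "
                    break
                }
                out = out sep "-" c; sep = " "
            }
        }
        print out
    }'
}

normalize_short_opts 'abc:' -acb     # prints: -a -cb
normalize_short_opts 'abc:' -ac b    # prints: -a -cb
```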

power_getopt.awk:

  • -h option doesn't print usage information, --help (and its short synonym) does.

New modules:

  • shquote.awk, implementing shquote() function.
    shquote(str):
      `shquote' transforms the string `str' by adding shell escape and quoting characters to include it to the system() and popen() functions as an argument, so that the arguments will have the correct values after being evaluated by the shell.
    Inspired by NetBSD's shquote(3) from libc.
  • runcmd.awk, implementing functions runcmd1() and xruncmd1()
    runcmd1(CMD, OPTS, FILE):
      wrapper for function system() that runs a command CMD with options OPTS and one filename FILE. Unlike system(CMD " " OPTS " " FILE), the function runcmd1() correctly handles FILE and CMD containing spaces, single quotes, double quotes, tildes etc.
    xruncmd1(FILE):
      safe wrapper for runcmd1(). awk exits with an error if running the command fails.
  • isnum.awk, implementing trivial isnum() function, see the source code.
  • alt_join.awk, implementing the following functions:
    join_keys(HASH, SEP):
      returns string consisting of all keys from HASH separated by SEP.
    join_values(HASH, SEP):
      returns string consisting of all values from HASH separated by SEP.
    join_by_numkeys (ARRAY, SEP [, START [, END]]):
      returns string consisting of all values from ARRAY separated by SEP. Indices from START (default: 1) to END (default: +inf) are analysed. Collecting values is stopped on index absent in ARRAY.
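
As an illustration of the idea behind shquote(), the sketch below quotes a string for the shell in plain awk by wrapping it in single quotes and rewriting each embedded single quote. It is an assumption about one reasonable way to do this, not the code of shquote.awk; the wrapper name shquote_lines is hypothetical.

```sh
# Hypothetical sketch: quote every input line so that the shell reads it
# back as exactly one word. The single quote is passed in as q to avoid
# writing a literal quote inside the awk program text.
shquote_lines() {
    awk -v q="'" '
    function shquote(str) {
        # Each embedded quote becomes: quote, backslash, quote, quote.
        gsub(q, q "\\" q q, str)
        return q str q
    }
    { print shquote($0) }'
}

printf '%s\n' "it's a file.txt" | shquote_lines
```

A string quoted this way can be appended to a command line given to system() without the shell splitting it on spaces or interpreting characters such as tilde or double quote.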

categories: Runawk,Project,Tools,Apr,2009,AlexC

New release: Runawk 0.16

In comp.lang.awk, Aleksey Cheusov writes:

I've made runawk-0.16.0 release. This release has lots of important improvements and additions. Sources are available from

What is runawk?

RUNAWK is a small wrapper for the AWK interpreter that helps you write standalone programs in AWK. It provides MODULES for AWK similar to PERL's "use" command, along with other powerful features. Dozens of ready-to-use modules are also provided.

(For more information, see details from the last release.)

Major changes in this release

Lots of demo programs for most runawk modules were created and they are in examples/ subdirectory now.

New MEGA module ;-) power_getopt.awk See the documentation and demo program examples/demo_power_getopt. It makes options handling REALLY easy (see below).

New modules:

  • embed_str.awk
  • has_suffix.awk
  • has_prefix.awk
  • readfile.awk
  • modinfo.awk

Minor fixes and improvements in dirname.awk and basename.awk. Now they are fully compatible with dirname(1) and basename(1)

RUNAWK sets the following environment variables for the child awk subprocess:

  • RUNAWK_MODC - The number of modules (-f filename) passed to AWK
  • RUNAWK_MODV_<n> - Full path to the module #n, where n is in [0..RUNAWK_MODC) range.

RUNAWK sets the RUNAWK_ART_STDIN environment variable for the child awk subprocess to 1 if an additional/artificial `-' was added to the list of awk's arguments.
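
A script can read these variables through awk's ENVIRON array. The following is a minimal sketch; the function name list_runawk_modules and the paths are made up for illustration, and the variables are set by hand here only to simulate what runawk would export to its child process.

```sh
# Hypothetical helper: print every module path that runawk reports
# through RUNAWK_MODC and RUNAWK_MODV_<n>.
list_runawk_modules() {
    awk 'BEGIN {
        n = ENVIRON["RUNAWK_MODC"] + 0
        for (i = 0; i < n; i++)
            print i ": " ENVIRON["RUNAWK_MODV_" i]
    }'
}

# Simulated environment (assumed values, not real runawk output).
(
    export RUNAWK_MODC=2
    export RUNAWK_MODV_0=/usr/local/share/runawk/abs.awk
    export RUNAWK_MODV_1=/home/user/bin/prog
    list_runawk_modules
)
```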

Makefile:

  • bmake-isms were removed. Now the Makefile is fully compatible with FreeBSD make.
  • CLEANFILES target is used instead of hand-made rules
  • Minor fix in 'test_all' target

Power_GetOpt.awk

The most powerful feature of this release is the power_getopt.awk module. It provides a very powerful and very easy way to handle options. Everything is in the usage message; you don't need to do anything else at all. I think the example below makes this clear.

Example Code

% cat 1.awk
#!/usr/bin/env runawk

#use "power_getopt.awk"

#.begin-str help
# power_getopt - program demonstrating a power of power_getopt.awk module
# usage: power_getopt [OPTIONS]
# OPTIONS:
#    -h|--help                  display this screen
#    -f|--flag                  flag
#       --long-flag             long flag only
#    -s                         short flag only
#    =F|--FLAG           flag with value
#.end-str

BEGIN {
        print "f         --- " getarg("f")
        print "flag      --- " getarg("flag")
        print "long-flag --- " getarg("long-flag")
        print "s         --- " getarg("s")
        print "F         --- " getarg("F", "default1")
        print "FLAG      --- " getarg("FLAG", "default2")

        exit 0
}

./1.awk

% ./1.awk
f         --- 0
flag      --- 0
long-flag --- 0
s         --- 0
F         --- default1
FLAG      --- default2

./1.awk -h

% ./1.awk -h
power_getopt - program demonstrating a power of power_getopt.awk module
usage: power_getopt [OPTIONS]
OPTIONS:
   -h|--help                  display this screen
   -f|--flag                  flag
      --long-flag             long flag only
   -s                         short flag only
   -F|--FLAG           flag with value

./1.awk -f

% ./1.awk -f
f         --- 1
flag      --- 1
long-flag --- 0
s         --- 0
F         --- default1
FLAG      --- default2

./1.awk -F value

% ./1.awk -F value
f         --- 0
flag      --- 0
long-flag --- 0
s         --- 0
F         --- value
FLAG      --- value

./1.awk --FLAG=value

% ./1.awk --FLAG=value
f         --- 0
flag      --- 0
long-flag --- 0
s         --- 0
F         --- value
FLAG      --- value

categories: Runawk,Project,Tools,Mar,2009,AlexC

runawk - wrapper for AWK interpreter

(Note: see recent update.)

Download from...

Download from LAWKER or a tar file or from SourceForge.

NAME

runawk - wrapper for AWK interpreter

SYNOPSIS

runawk [options] program_file

runawk -e program

DESCRIPTION

After years of using AWK for programming I've found that, despite its simplicity and limitations, AWK is good enough for scripting a wide range of tasks. AWK is not as powerful as its bigger counterparts like Perl, Ruby, TCL and others, but it has its own advantages such as compactness, simplicity and availability on almost all UNIX-like systems. I personally also like its data-driven nature and token orientation, a very useful technique for simple text processing utilities.

But! Unfortunately, awk interpreters lack some important features and sometimes do not work as well as they should.

Problems I see (some of them, of course)

  1. AWK lacks support for modules. Even if I create small programs, I often want to use functions created earlier and already used in other scripts. That is, it would be great to organise functions into so-called libraries (modules).

  2. In order to pass arguments to a #!/usr/bin/awk -f script (not to the awk interpreter), it is necessary to prepend the list of arguments with -- (two minus signs). In my view, this looks bad.

    Example:

    awk_program:

        #!/usr/bin/awk -f
    
        BEGIN {
           for (i=1; i < ARGC; ++i){
              printf "ARGV [%d]=%s\n", i, ARGV [i]
           }
        }

    Shell session:

        % awk_program --opt1 --opt2
        /usr/bin/awk: unknown option --opt1 ignored
        /usr/bin/awk: unknown option --opt2 ignored
    
        % awk_program -- --opt1 --opt2
        ARGV [1]=--opt1
        ARGV [2]=--opt2
        %

    In my opinion awk_program script should work like this

        % awk_program --opt1 --opt2
        ARGV [1]=--opt1
        ARGV [2]=--opt2
        %

    It is possible using runawk.

  3. When #!/usr/bin/awk -f script handles arguments (options) and wants to read from stdin, it is necessary to add /dev/stdin (or `-') as a last argument explicitly.

    Example:

    awk_program:

        #!/usr/bin/awk -f
    
        BEGIN {
           if (ARGV [1] == "--flag"){
              flag = 1
              ARGV [1] = "" # to not read file named "--flag"
           }
        }
        {
           print "flag=" flag " $0=" $0
        }

    Shell session:

        % echo test | awk_program -- --flag
        % echo test | awk_program -- --flag /dev/stdin
        flag=1 $0=test
        %

    Ideally awk_program should work like this

        % echo test | awk_program --flag
        flag=1 $0=test
        %

runawk was created to solve all these problems.

OPTIONS

-h|--help

Display help information.

-V|--version

Display version information.

-d|--debug

Turn on debugging mode, in which runawk prints the argument list with which the real awk interpreter will be run.

-i|--with-stdin

Always add the stdin file name to the list of awk arguments.

-I|--without-stdin

Do not add the stdin file name to the list of awk arguments.

-e|--execute program

Specify program. If -e is not specified program is read from program_file.

DETAILS/INTERNALS

Standalone script

Under UNIX-like OS-es you can use runawk by beginning your script with

   #!/usr/local/bin/runawk

line or something like this instead of

   #!/usr/bin/awk -f

or similar.

AWK modules

In order to activate modules you should add them into awk script like this

  #use "module1.awk"
  #use "module2.awk"

That is, a line that specifies a module name is treated as a comment by a normal AWK interpreter but is processed specially by runawk.

Note that #use must begin in column 0: no spaces are allowed before it and no spaces are allowed between # and use.

Also note that AWK modules can themselves "use" other modules, and so forth. All of them are collected in depth-first order and each one is added to the list of awk interpreter arguments, prepended with the -f option. That is, the #use directive is *NOT* similar to #include in the C programming language: runawk's module code is not inserted at the place of #use. Runawk's modules are closer to Perl's "use" command. If a module is mentioned more than once, only one -f is added for it, i.e. duplicates are removed automatically.

The position of a #use directive in a source file matters: the earlier a module is mentioned, the earlier its -f is generated.

Example:

  file prog:
     #!/usr/local/bin/runawk

     #use "A.awk"
     #use "B.awk"
     #use "E.awk"

     PROG code
     ...
  file B.awk:
     #use "A.awk"
     #use "C.awk"
     B code
     ...
  file C.awk:
     #use "A.awk"
     #use "D.awk"

     C code
     ...
A.awk and D.awk contain no #use directives.

If you run

  runawk prog file1 file2

or

  /path/to/prog file1 file2

the following command

  awk -f A.awk -f D.awk -f C.awk -f B.awk -f E.awk -f prog -- file1 file2

will actually run.

You can check this by running

  runawk -d prog file1 file2
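
The ordering in this example can be reproduced with a small depth-first traversal. The sketch below is only a model of the rule described above, not runawk's implementation (runawk is written in C and reads the #use lines from the files themselves); here the dependencies are given on stdin, one line per file, and the function name resolve_use_order is hypothetical.

```sh
# Hypothetical model of the #use ordering rule.
# Input lines: FILE followed by the modules it uses, in order.
# The traversal starts from the file named "prog".
resolve_use_order() {
    awk '
    # Visit a file: first everything it uses (in order), then the file
    # itself. A file that was already visited is skipped, which removes
    # duplicates.
    function visit(name,    n, i, parts) {
        if (name in seen) return
        seen[name] = 1
        n = split(deps[name], parts, " ")
        for (i = 1; i <= n; i++) visit(parts[i])
        order = order " -f " name
    }
    { name = $1; $1 = ""; deps[name] = $0 }
    END { visit("prog"); print "awk" order }'
}

resolve_use_order <<'EOF'
prog A.awk B.awk E.awk
B.awk A.awk C.awk
C.awk A.awk D.awk
EOF
# prints: awk -f A.awk -f D.awk -f C.awk -f B.awk -f E.awk -f prog
```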

Module search strategy

A module is first searched for in the directory that contains the main program (or the module in which the #use directive is specified). If it is not found there, the AWKPATH environment variable is checked. AWKPATH holds a colon-separated list of search directories. Finally, the module is searched for in the system runawk modules directory, by default PREFIX/share/runawk, but this can be changed at build time.

An absolute path of the module can also be specified.
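
The search order can be sketched as a shell function. This is an illustration of the strategy as described here, not runawk's code; the function name find_module and its arguments are hypothetical, and the system directory is passed in explicitly because its real value depends on the build.

```sh
# Hypothetical sketch. Usage: find_module NAME INCLUDER_DIR SYSTEM_DIR
# Prints the first existing path for NAME, or returns non-zero.
find_module() {
    name=$1 includer_dir=$2 system_dir=$3

    # An absolute path is used as is.
    case $name in
        /*) [ -f "$name" ] && printf '%s\n' "$name"; return ;;
    esac

    # Candidate directories: the including file's directory, then each
    # AWKPATH entry, then the system modules directory.
    old_ifs=$IFS; IFS=:
    set -- "$includer_dir" ${AWKPATH:-} "$system_dir"
    IFS=$old_ifs

    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Example call (paths are assumptions):
# find_module abs.awk /home/user/bin /usr/local/share/runawk
```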

AWK interpreter and its arguments

In order to pass arguments to an AWK script correctly, runawk treats arguments beginning with `-' (minus) specially. The following command

  runawk prog2 -x -f=file -o=output file1 file2

or

  /path/to/prog2 -x -f=file -o=output file1 file2

will actually run

  awk -f prog2 -- -x -f=file -o=output file1 file2

therefore the -x, -f and -o options will be passed to awk's ARGV/ARGC variables together with file1 and file2. If all arguments begin with `-' (minus), runawk will add the stdin filename to the end of the argument list (unless the -I option is specified), i.e. running

  runawk prog3 --value=value

or

  /path/to/prog3 --value=value

will actually run the following

  awk -f prog3 -- --value=value /dev/stdin
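
The rule described in this section can be modelled with a few lines of shell. This sketch only prints the command line that would result. It is not runawk's code, the function name build_awk_command is hypothetical, and it ignores #use modules and the -i/-I options. Note also that the 0.17 release notes say this heuristic was later removed in favour of alt_getopt.awk.

```sh
# Hypothetical sketch. Usage: build_awk_command PROG ARG...
# Prints the awk command line as described in this section.
build_awk_command() {
    prog=$1; shift
    cmd="awk -f $prog --"
    all_dashed=yes
    for arg in "$@"; do
        cmd="$cmd $arg"
        case $arg in
            -*) ;;                      # looks like an option
            *)  all_dashed=no ;;        # a file operand was given
        esac
    done
    # No file operands at all: read from stdin explicitly.
    if [ "$all_dashed" = yes ]; then
        cmd="$cmd /dev/stdin"
    fi
    printf '%s\n' "$cmd"
}

build_awk_command prog2 -x -f=file -o=output file1 file2
# prints: awk -f prog2 -- -x -f=file -o=output file1 file2
build_awk_command prog3 --value=value
# prints: awk -f prog3 -- --value=value /dev/stdin
```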

Program as an argument

Like some other interpreters runawk can obtain the script from a command line like this

 /path/to/runawk -e '
 #use "alt_assert.awk"

 {
   assert($1 >= 0 && $1 <= 10, "Bad value: " $1)

   # your code below
   ...
 }'

Selecting a preferred AWK interpreter

You may prefer one AWK interpreter over another for some reason. You can select it with the #interp directive like this

  file prog:
     #!/usr/local/bin/runawk

     #use "A.awk"
     #use "B.awk"

     #interp "/usr/pkg/bin/nbawk"

     # your code here
     ...

The reason may be efficiency for a particular task, useful but non-standard extensions, or anything else.

Note that the #interp directive must also begin in column 0: no spaces are allowed before it or between # and interp.

Setting environment

In some cases you may want to run AWK interpreter with a specific environment. For example, your script may be oriented to process ASCII text only. In this case you can run AWK with LC_CTYPE=C environment and use regexp ranges.

runawk provides the #env directive for this. The string inside the double quotes is passed to the putenv(3) libc function.

Example:

  file prog:
     #!/usr/local/bin/runawk

     #env "LC_ALL=C"

     $1 ~ /^[A-Z]+$/ { # A-Z is valid if LC_CTYPE=C
         print $1
     }
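
The effect of the #env line can be approximated without runawk by setting the variable on the awk command line, since in both cases the variable is placed in the interpreter's environment before it starts. A minimal sketch, with a hypothetical function name:

```sh
# Hypothetical stand-in for the example above: run awk with LC_ALL=C
# so that the range A-Z covers only the ASCII upper-case letters.
print_ascii_upper_words() {
    LC_ALL=C awk '$1 ~ /^[A-Z]+$/ { print $1 }'
}

printf 'ABC\nabc\nX1\n' | print_ascii_upper_words    # prints: ABC
```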

EXIT STATUS

If the AWK interpreter exits normally, runawk exits with its exit status. If the AWK interpreter was killed by a signal, runawk exits with exit status 128+signal.
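
A caller can decode this convention as sketched below. The function name describe_status is hypothetical. The demonstration uses a plain sh child that kills itself rather than runawk, because most shells report a signalled child with the same 128+signal encoding. A status above 128 may also be an ordinary large exit code, so the decoding is a heuristic.

```sh
# Hypothetical helper: interpret an exit status using the 128+signal
# convention described above.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# The child shell sends SIGTERM to itself; the status is then decoded.
sh -c 'kill -TERM $$' || describe_status $?
```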

ENVIRONMENT

AWKPATH

Colon separated list of directories where awk modules are searched.

RUNAWK_AWKPROG

Sets the path to the AWK interpreter, used by default, i.e. this variable overrides the compile-time default. Note that #interp directive overrides this.

AUTHOR/LICENSE

Copyright (c) 2007-2008 Aleksey Cheusov <vle@gmx.net>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

BUGS/FEEDBACK

Please send any comments, questions, bug reports etc. to me by e-mail or (even better) register them at the SourceForge project home. Feature requests are also welcome.
