Awk.Info

"Cause a little auk awk
goes a long way."

categories: Games,Top10,TenLiners,Mar,2009,BrianK

Story.awk

Synopsis

echo Goal | gawk -f story.awk [ -v Grammar=FILE ] [ -v Seed=NUMBER ] 
echo Goal | gawk -f storyp.awk [ -v Grammar=FILE ] [ -v Seed=NUMBER ] 

Download

Download from LAWKER.

Description

This code reads a set of grammar productions and, for each goal symbol given on standard input, prints a randomly generated string of words that satisfies those productions.

This page describes two versions of that system: story.awk and storyp.awk. The former selects among a symbol's productions at random with equal probability. The latter lets the user bias the selection by adding a weight at the end of each production line.

Options

-v Grammar=FILE
Sets the FILE containing the productions. Defaults to "grammar".
-v Seed=NUM
Sets the seed for the random number generator. Defaults to "1". A useful idiom for generating different text on each run is Seed=$RANDOM (in shells that provide $RANDOM, such as bash).
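
As a hedged illustration of how the Seed option behaves, the sketch below reuses the script's own srand(Seed ? Seed : 1) expression in a standalone one-liner. The pick helper is hypothetical (it is not part of story.awk), and plain awk is assumed to be on the path; gawk can be substituted. It shows that the same seed reproduces the same draw, and that an empty or zero Seed both fall back to 1 because both test as false in awk.

```shell
# pick SEED: seed the generator exactly as story.awk does, then print one draw.
# (hypothetical helper for illustration; not part of story.awk)
pick() {
    awk -v Seed="$1" 'BEGIN { srand(Seed ? Seed : 1); print int(1000 * rand()) + 1 }'
}

same_a=$(pick 7)      # a fixed seed ...
same_b=$(pick 7)      # ... gives the same draw again
dflt=$(pick "")       # empty Seed: falls back to srand(1)
zero=$(pick 0)        # Seed=0 is false in awk, so it also falls back to srand(1)

echo "seed 7 twice:   $same_a $same_b"
echo "empty vs. zero: $dflt $zero"
```

The actual numbers printed depend on the awk implementation's random number generator, so only the equalities are meaningful here.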

Examples

A Short Example

This grammar ...

Sentence -> Nounphrase Verbphrase   
Nounphrase -> the boy              
Nounphrase -> the girl           
Verbphrase -> Verb Modlist Adverb 
Verb -> runs                    
Verb -> walks                  
Modlist ->                    
Modlist -> very Modlist      
Adverb -> quickly           
Adverb -> slowly           
... and this input ...
for i in 1 2 3 4 5 6 7 8 9 10;do
	echo Sentence | 
	gawk -f ../story.awk -v Grammar=english.rules -v Seed=$i | 
	fmt
done
... generates these sentences:
the boy runs very slowly
the girl runs slowly
the boy runs very slowly
the girl walks very very quickly
the boy runs quickly
the girl walks very very slowly
the boy walks very very very very very very quickly
the boy walks very quickly
the girl runs slowly
the girl runs very quickly

A Longer Example

Here is a longer example, based on Gahan Wilson's sci-fi plot generator (see Author, below).

Using the sci-fi grammar listed later in this section, we can generate the following stories:


 Earth scientists invent giant bugs who want Our Women,  And Take
 A Few And Leave

 Earth is Attacked By tiny lunar superbeings who  Under Stand and
 Are Not radioactive and can not be killed by the Navy but They Die
 From Catching A Cold

 Earth scientists invent enormous bugs who are Friendly and and
 They Get Married And Live Happily Forever After

 Earth is Struck By A Giant cloud and Magically Saved

 Earth scientists invent giant bugs who  Under Stand and Are Not
 radioactive and can not be killed by the Air Force so They Kill
 Us

 Earth is Attacked By enormous extra Galactic blobs who  Under Stand
 and Are Not radioactive and can be killed by the Air Force

 Earth scientists discover enormous blobs who  Under Stand and Are
 Not radioactive and can be killed by a Crowd Of Peasants

 Earth falls Into Sun and  Some  Resuced

 Earth is Struck By A Giant comet but Is Saved

 Earth is Struck By A Giant comet and Is Destroyed

This is generated from the following code:

for i in 1 2 3 4 5 6 7 8 9 10;do
	echo
	echo Start | 
	gawk -f ../story.awk -v Grammar=scifi.rules -v Seed=$i | 
	fmt
done

running on the following grammar:

Start      -> Earth IsStressed
IsStressed -> Catestrophes 
IsStressed -> Science 
IsStressed -> Attack 
IsStressed -> Collision

Catestrophes -> Catestrophe and PossibleMegaDeath

Catestrophe -> burnsUp 
Catestrophe -> freezes
Catestrophe -> fallsIntoSun

Collision -> isStruckByAGiant Floater AndThen

Floater -> comet
Floater -> asteroid
Floater -> cloud

AndThen -> butIsSaved
AndThen -> andIsDestroyed
AndThen -> andMagicallySaved


PossibleMegaDeath -> everybodyDies
PossibleMegaDeath -> Some GoOn 

SomeSaved ->  somePeople
SomeSaved ->  everybody
SomeSaved ->  almostEverybody
  
GoOn -> dies
GoOn -> Resuced
GoOn -> Saved
 
Rescued -> isRescuedBy Sizes Extraterestrial Beings
Saved   -> butIsSavedBy SomeOne scientists the  Science

SomeOne -> earth
SomeOne -> extraterestrial

Science -> scientists DoSomething Sizes Beings Whichetc

DoSomething -> invent
DoSomething -> discover

Attack -> isAttackedBy Sizes Extraterestrial Beings Whichetc

Sizes -> tiny 
Sizes -> giant 
Sizes -> enormous
 
Extraterestrial -> martian
Extraterestrial -> lunar
Extraterestrial -> extraGalactic

Beings -> bugs
Beings -> reptiles
Beings -> blobs
Beings -> superbeings

Whichetc -> who WantSomething

WantSomething -> WantWomen
WantSomething -> areFriendly  and DenoumentOrHappyEnding
WantSomething -> UnderStand ButEtc

Understand -> areFriendly butMisunderstood
Understand -> misunderstandUs
Understand -> understandUsAllTooWell
Understand -> hungry

DenoumentOrHappyEnding -> Denoument
DenoumentOrHappyEnding -> HappyEnding
 
Dine -> Hungry and eat us Denoument?

WhichEtc -> 
Hungry -> lookUponUsAsASourceOfNourishment

WantWomen -> wantOurWomen, AndTakeAFewAndLeave

ButEtc -> AndAre radioactive and TryToKill

AndAre -> andAre
AndAre -> andAreNot

Killers -> Killer 
Killers -> Killer and Killer

Killer -> aCrowdOfPeasants
Killer -> theArmy
Killer -> theNavy
Killer -> theAirForce
Killer -> theMarines
Killer -> theCoastGuard
Killer -> theAtomBomb

TryToKill -> can be killed by Killers
TryToKill -> can not be killed by Killers SoEtc

SoEtc -> butTheyDieFromCatchingACold
SoEtc -> soTheyKillUs
SoEtc -> soTheyPutUsUnderABenignDictatorShip
SoEtc -> soTheyEatUs
SoEtc -> soScientistsInventAWeapon Which
SeEtc -> but Denoument

Which -> whichTurnsThemIntoDisgustingLumps
Which -> whichKillsThem
Which -> whichFails SoEtc

Denomument? ->  
Denomument? -> Denoument  

Denoument ->  aCuteLittleKidConvincesThemPeopleAreOk Ending
Denoument -> aPriestTalksToThemOfGod Ending
Denoument -> theyFallInLoveWithThisBeautifulGirl EndSadOrHappy

EndSadOrHappy -> Ending
EndSadOrHappy -> HappyEnding

Ending -> andTheyDie
Ending -> andTheyLeave
Ending -> andTheyTurnIntoDisgustingLumps

HappyEnding -> andTheyGetMarriedAndLiveHappilyForeverAfter
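
Several odd fragments in the stories above, such as "Under Stand" and "Some Resuced", come from right-hand-side symbols that have no production of their own: story.awk treats any symbol it cannot find on a left-hand side as a terminal and simply prints it. A minimal sketch for spotting such symbols follows. It is a heuristic and not part of the original code: it assumes that any right-hand-side word starting with a capital letter was meant to be a nonterminal, so a capitalized terminal such as "Earth" would be reported too. The sample rules are a shortened excerpt of the grammar above, and plain awk is assumed (gawk behaves the same way).

```shell
# Shortened excerpt of the sci-fi rules, kept in a shell variable for the demo.
rules='GoOn -> dies
GoOn -> Resuced
Rescued -> isRescuedBy Sizes
Sizes -> tiny
WantSomething -> UnderStand
Understand -> misunderstandUs'

# Heuristic lint: list capitalized right-hand-side symbols with no production.
missing=$(printf '%s\n' "$rules" | awk '
    $2 == "->" {
        lhs[$1] = 1                      # symbol has at least one production
        for (j = 3; j <= NF; j++)
            rhs[$j] = 1                  # symbol is used on a right-hand side
    }
    END {
        for (s in rhs)
            if (s ~ /^[A-Z]/ && !(s in lhs))
                print s
    }' | LC_ALL=C sort)

printf '%s\n' "$missing"
```

On this excerpt the sketch reports Resuced and UnderStand, the two spellings that do not match the Rescued and Understand productions.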

Biasing the Story

Here is a grammar suitable for storyp.awk. Note the number at the end of each line, which biases how often that production is selected. For example, "runs" and "slowly" (weight 0.9) are nine times more likely than the other Verb and Adverb (weight 0.1).

Sentence -> Nounphrase Verbphrase   1
Nounphrase -> the boy               0.75
Nounphrase -> the girl              0.25
Verbphrase -> Verb Modlist Adverb   1
Verb -> runs                        0.9
Verb -> walks                       0.1
Modlist ->                          0.5
Modlist -> very Modlist             0.5
Adverb -> quickly                   0.1
Adverb -> slowly                    0.9
The following code runs the biased story generation:
for((i=1;i<=10;i++)); do echo Sentence ;  done |
gawk -f ../storyp.awk -v Grammar=englishp.rules 

This produces the following output. Note that, usually, we run slowly.

the boy runs very slowly 
the boy runs slowly 
the girl runs very slowly 
the boy runs slowly 
the boy runs slowly 
the girl walks very slowly 
the boy walks slowly 
the girl runs slowly 
the boy runs slowly 
the boy runs slowly 
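
The 0.9 versus 0.1 weighting can be checked empirically. The sketch below is a hypothetical standalone tally, not part of storyp.awk: it draws 10,000 Verbs against the same 0.9 threshold and counts how often each one comes up. Exact counts depend on the awk implementation's random number generator, so only the rough 9:1 ratio should be expected.

```shell
# Draw 10000 Verbs with weights runs=0.9, walks=0.1 and tally the results.
tally=$(awk 'BEGIN {
    srand(1)
    n = 10000
    for (k = 1; k <= n; k++) {
        r = rand()                      # uniform in [0, 1)
        if (r <= 0.9) runs++            # draw falls under the first threshold
        else          walks++           # otherwise the second production wins
    }
    printf("%d %d\n", runs, walks)
}')
runs=${tally% *}      # text before the space
walks=${tally#* }     # text after the space
echo "runs=$runs walks=$walks"
```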

Code

Story.awk

BEGIN {
    srand(Seed ? Seed : 1)
    Grammar = Grammar ? Grammar : "grammar"
    while ((getline < Grammar) > 0)    # parenthesized to avoid ambiguous parsing
        if ($2 == "->") {
            i = ++lhs[$1]              # count lhs
            rhscnt[$1, i] = NF-2       # how many in rhs
            for (j = 3; j <= NF; j++)  # record them
                rhslist[$1, i, j-2] = $j
        } else if ($0 !~ /^[ \t]*$/)   # ignore blank lines
            print "illegal production: " $0
}
{   if ($1 in lhs) {  # nonterminal to expand
        gen($1)
        printf("\n")
    } else 
        print "unknown nonterminal: " $0   
}
function gen(sym,    i, j) {
    if (sym in lhs) {       # a nonterminal
        i = int(lhs[sym] * rand()) + 1   # random production
        for (j = 1; j <= rhscnt[sym, i]; j++) # expand rhs's
            gen(rhslist[sym, i, j])
    } else {
        gsub(/[A-Z]/," &",sym)
        printf("%s ", sym) }
}
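
The gsub call in gen is what turns run-together terminals such as isStruckByAGiant into readable words: it inserts a space before every capital letter. A small standalone sketch of that one step follows (the split_caps name is made up for this illustration, and plain awk is assumed; the gsub call itself is copied from the listing above).

```shell
# split_caps WORD: apply the same substitution gen() uses on terminals.
# (hypothetical helper name; brackets are added only to make spaces visible)
split_caps() {
    awk -v sym="$1" 'BEGIN { gsub(/[A-Z]/, " &", sym); printf("[%s]\n", sym) }'
}

a=$(split_caps isStruckByAGiant)   # -> [is Struck By A Giant]
b=$(split_caps Earth)              # -> [ Earth]  (a leading capital gains a space)
c=$(split_caps comet)              # -> [comet]   (no capitals, unchanged)
printf '%s\n' "$a" "$b" "$c"
```

Because gen also prints a space after every terminal, the extra leading space accounts for the double spaces visible in the generated stories, for example before "Under Stand".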

Storyp.awk

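Storyp.awk is almost the same as story.awk, but it assumes that each grammar line ends in a number that biases how often that production is selected. The BEGIN block turns the weights of each symbol's productions into cumulative sums; gen then draws a random number and selects the first production whose cumulative weight is not smaller than it. Unlike story.awk, this version prints terminals as they are, without splitting them at capital letters.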

BEGIN {
    srand(Seed ? Seed : 1) 
    Grammar = Grammar ? Grammar : "grammar"
    while ((getline < Grammar) > 0)
        if ($2 == "->") {
            i = ++lhs[$1]              # count lhs
            rhsprob[$1, i] = $NF       # 0 <= probability <= 1
            rhscnt[$1, i] = NF-3       # how many in rhs
            for (j = 3; j < NF; j++)   # record them
               rhslist[$1, i, j-2] = $j
        } else if ($0 !~ /^[ \t]*$/)   # ignore blank lines, as story.awk does
            print "illegal production: " $0
    for (sym in lhs)
         for (i = 2; i <= lhs[sym]; i++)
            rhsprob[sym, i] += rhsprob[sym, i-1]
}
{   if ($1 in lhs) {  # nonterminal to expand
         gen($1)
         printf("\n")
     } else 
         print "unknown nonterminal: " $0   
}
function gen(sym,    i, j) {
    if (sym in lhs) {       # a nonterminal
        j = rand()          # random production
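        # skip productions whose cumulative weight is below j; stop at the last
        # one so weights that sum to less than 1 cannot run past the end
        for (i = 1; i < lhs[sym] && j > rhsprob[sym, i]; i++) ;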
        for (j = 1; j <= rhscnt[sym, i]; j++) # expand rhs's
            gen(rhslist[sym, i, j])
    } else
        printf("%s ", sym)
}
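
To make the selection step concrete, the sketch below replays it for the two Nounphrase rules with hand-picked draws in place of rand(). It is a simplified illustration rather than the code above: the choose function, the name array, and the fixed draws are invented for the demo, and the loop is written to stop at the last production so that a draw above the final threshold (possible if the weights sum to less than 1) still selects something. Plain awk is assumed.

```shell
picks=$(awk 'BEGIN {
    # weights as read from the grammar: "the boy" 0.75, "the girl" 0.25
    n = 2
    prob[1] = 0.75;  prob[2] = 0.25
    name[1] = "boy"; name[2] = "girl"
    for (i = 2; i <= n; i++)           # turn weights into cumulative sums
        prob[i] += prob[i-1]           # now prob[1] = 0.75, prob[2] = 1.0
    printf("%s %s %s\n", name[choose(0.10)], name[choose(0.75)], name[choose(0.80)])
}
# choose(r): index of the first production whose cumulative weight is >= r
function choose(r,    i) {
    for (i = 1; i < n && r > prob[i]; i++) ;
    return i
}')
echo "$picks"      # prints: boy boy girl
```

Draws of 0.10 and 0.75 fall within the first threshold and select "the boy"; 0.80 exceeds it and selects "the girl".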

Author

The code comes from the book "The AWK Programming Language" by Alfred Aho, Brian Kernighan, and Peter Weinberger, Addison-Wesley, 1988.

The scifi grammar was written by Tim Menzies, 2009, and is based on Gahan Wilson's sci-fi plot generator: "The Science Fiction Horror Movie Pocket Computer" (in "The Year's Best Science Fiction No. 5", edited by Harry Harrison and Brian Aldiss, Sphere, London, 1972).
