echo Goal | gawk -f story.awk [ -v Grammar=FILE ] [ -v Seed=NUMBER ]
echo Goal | gawk -f storyp.awk [ -v Grammar=FILE ] [ -v Seed=NUMBER ]
Download from LAWKER.
This code inputs a set of productions and outputs a string of words that satisfy the production rules.
This page describes two versions of that system: story.awk and storyp.awk. The former selects productions at random with equal probability. The latter allows the user to bias the selection by adding weights at the end of line, after each production.
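The equal-probability choice in story.awk comes from the expression int(lhs[sym] * rand()) + 1 in its gen() function, which maps a draw in [0,1) onto a production number from 1 to n. The sketch below evaluates that expression with the draw supplied by hand so the result is repeatable; the function name uniform_pick and its two arguments are illustrative and not part of story.awk.

```shell
# uniform_pick N R: production number chosen among N productions
# when rand() returns R (hypothetical helper, for illustration only).
uniform_pick() {
    awk -v n="$1" -v r="$2" 'BEGIN { print int(n * r) + 1 }'
}

uniform_pick 2 0.49   # prints 1
uniform_pick 2 0.5    # prints 2
uniform_pick 3 0.99   # prints 3
```

With two productions, draws below 0.5 select the first and the rest select the second, so each is chosen half of the time.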
This grammar ...
Sentence -> Nounphrase Verbphrase
Nounphrase -> the boy
Nounphrase -> the girl
Verbphrase -> Verb Modlist Adverb
Verb -> runs
Verb -> walks
Modlist ->
Modlist -> very Modlist
Adverb -> quickly
Adverb -> slowly
... and this input ...
for i in 1 2 3 4 5 6 7 8 9 10; do
  echo Sentence | gawk -f ../story.awk -v Grammar=english.rules -v Seed=$i | fmt
done
... generates these sentences:
the boy runs very slowly
the girl runs slowly
the boy runs very slowly
the girl walks very very quickly
the boy runs quickly
the girl walks very very slowly
the boy walks very very very very very very quickly
the boy walks very quickly
the girl runs slowly
the girl runs very quickly
Here is Gahan Wilson's sci-fi plot generator ...
Using the above, we can generate the following stories:
Earth scientists invent giant bugs who want Our Women, And Take A Few And Leave
Earth is Attacked By tiny lunar superbeings who Under Stand and Are Not radioactive and can not be killed by the Navy but They Die From Catching A Cold
Earth scientists invent enormous bugs who are Friendly and and They Get Married And Live Happily Forever After
Earth is Struck By A Giant cloud and Magically Saved
Earth scientists invent giant bugs who Under Stand and Are Not radioactive and can not be killed by the Air Force so They Kill Us
Earth is Attacked By enormous extra Galactic blobs who Under Stand and Are Not radioactive and can be killed by the Air Force
Earth scientists discover enormous blobs who Under Stand and Are Not radioactive and can be killed by a Crowd Of Peasants
Earth falls Into Sun and Some Resuced
Earth is Struck By A Giant comet but Is Saved
Earth is Struck By A Giant comet and Is Destroyed
This is generated from the following code:
for i in 1 2 3 4 5 6 7 8 9 10; do
  echo Start | gawk -f ../story.awk -v Grammar=scifi.rules -v Seed=$i | fmt
done
running on the following grammar:
Start -> Earth IsStressed
IsStressed -> Catestrophes
IsStressed -> Science
IsStressed -> Attack
IsStressed -> Collision
Catestrophes -> Catestrophe and PossibleMegaDeath
Catestrophe -> burnsUp
Catestrophe -> freezes
Catestrophe -> fallsIntoSun
Collision -> isStruckByAGiant Floater AndThen
Floater -> comet
Floater -> asteroid
Floater -> cloud
AndThen -> butIsSaved
AndThen -> andIsDestroyed
AndThen -> andMagicallySaved
PossibleMegaDeath -> everybodyDies
PossibleMegaDeath -> Some GoOn
SomeSaved -> somePeople
SomeSaved -> everybody
SomeSaved -> almostEverybody
GoOn -> dies
GoOn -> Resuced
GoOn -> Saved
Rescued -> isRescuedBy Sizes Extraterestrial Beings
Saved -> butIsSavedBy SomeOne scientists the Science
SomeOne -> earth
SomeOne -> extraterestrial
Science -> scientists DoSomething Sizes Beings Whichetc
DoSomething -> invent
DoSomething -> discover
Attack -> isAttackedBy Sizes Extraterestrial Beings Whichetc
Sizes -> tiny
Sizes -> giant
Sizes -> enormous
Extraterestrial -> martian
Extraterestrial -> lunar
Extraterestrial -> extraGalactic
Beings -> bugs
Beings -> reptiles
Beings -> blobs
Beings -> superbeings
Whichetc -> who WantSomething
WantSomething -> WantWomen
WantSomething -> areFriendly and DenoumentOrHappyEnding
WantSomething -> UnderStand ButEtc
Understand -> areFriendly butMisunderstood
Understand -> misunderstandUs
Understand -> understandUsAllTooWell
Understand -> hungry
DenoumentOrHappyEnding -> Denoument
DenoumentOrHappyEnding -> HappyEnding
Dine -> Hungry and eat us Denoument?
WhichEtc ->
Hungry -> lookUponUsAsASourceOfNourishment
WantWomen -> wantOurWomen, AndTakeAFewAndLeave
ButEtc -> AndAre radioactive and TryToKill
AndAre -> andAre
AndAre -> andAreNot
Killers -> Killer
Killers -> Killer and Killer
Killer -> aCrowdOfPeasants
Killer -> theArmy
Killer -> theNavy
Killer -> theAirForce
Killer -> theMarines
Killer -> theCoastGuard
Killer -> theAtomBomb
TryToKill -> can be killed by Killers
TryToKill -> can not be killed by Killers SoEtc
SoEtc -> butTheyDieFromCatchingACold
SoEtc -> soTheyKillUs
SoEtc -> soTheyPutUsUnderABenignDictatorShip
SoEtc -> soTheyEatUs
SoEtc -> soScientistsInventAWeapon Which
SeEtc -> but Denoument
Which -> whichTurnsThemIntoDisgustingLumps
Which -> whichKillsThem
Which -> whichFails SoEtc
Denomument? ->
Denomument? -> Denoument
Denoument -> aCuteLittleKidConvincesThemPeopleAreOk Ending
Denoument -> aPriestTalksToThemOfGod Ending
Denoument -> theyFallInLoveWithThisBeautifulGirl EndSadOrHappy
EndSadOrHappy -> Ending
EndSadOrHappy -> HappyEnding
Ending -> andTheyDie
Ending -> andTheyLeave
Ending -> andTheyTurnIntoDisgustingLumps
HappyEnding -> andTheyGetMarriedAndLiveHappilyForeverAfter
Here is a grammar suitable for storyp.awk. Note the number at the end of each line, which biases how often that production is selected. For example, "runs" is nine times as likely as the other Verb, and "slowly" is nine times as likely as the other Adverb.
Sentence -> Nounphrase Verbphrase 1
Nounphrase -> the boy 0.75
Nounphrase -> the girl 0.25
Verbphrase -> Verb Modlist Adverb 1
Verb -> runs 0.9
Verb -> walks 0.1
Modlist -> 0.5
Modlist -> very Modlist 0.5
Adverb -> quickly 0.1
Adverb -> slowly 0.9
The following code executes the biased story generation:
for((i=1;i<=10;i++)); do echo Sentence ; done | gawk -f ../storyp.awk -v Grammar=englishp.rules
This produces the following output. Note that, usually, we run slowly.
the boy runs very slowly
the boy runs slowly
the girl runs very slowly
the boy runs slowly
the boy runs slowly
the girl walks very slowly
the boy walks slowly
the girl runs slowly
the boy runs slowly
the boy runs slowly
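storyp.awk compares rand(), which lies between 0 and 1, against running totals of the weights, so the weights of each nonterminal should add up to 1. If they add up to less, some draws match no production and that nonterminal expands to nothing. A quick check along those lines, as a sketch only: the function name check_weights and the 1e-9 tolerance are our own choices, and the input is assumed to use the same "LHS -> ... weight" layout as the weighted grammar above.

```shell
# check_weights [FILE...]: report every left-hand side whose weights
# do not sum to 1 (hypothetical helper, not part of storyp.awk).
check_weights() {
    awk '$2 == "->" { total[$1] += $NF }       # last field is the weight
         END {
             for (sym in total) {
                 d = total[sym] - 1
                 if (d < -1e-9 || d > 1e-9) {  # tolerance is an assumption
                     print sym " sums to " total[sym]
                     bad = 1
                 }
             }
             exit bad + 0
         }' "$@"
}

# The two Verb weights from the grammar above, then a deliberately bad pair.
printf 'Verb -> runs 0.9\nVerb -> walks 0.1\n' | check_weights && echo "weights ok"
printf 'Verb -> runs 0.9\nVerb -> walks 0.3\n' | check_weights || echo "weights need fixing"
```

Given a file name instead of piped input, for example check_weights englishp.rules, the same function checks a whole grammar.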
# story.awk: expand a nonterminal with productions chosen at random
BEGIN {
    srand(Seed ? Seed : 1)
    Grammar = Grammar ? Grammar : "grammar"
    while ((getline < Grammar) > 0)
        if ($2 == "->") {
            i = ++lhs[$1]              # count lhs
            rhscnt[$1, i] = NF-2       # how many in rhs
            for (j = 3; j <= NF; j++)  # record them
                rhslist[$1, i, j-2] = $j
        } else
            if ($0 !~ /^[ \t]*$/)
                print "illegal production: " $0
}
{   if ($1 in lhs) {  # nonterminal to expand
        gen($1)
        printf("\n")
    } else
        print "unknown nonterminal: " $0
}
function gen(sym,    i, j) {
    if (sym in lhs) {  # a nonterminal
        i = int(lhs[sym] * rand()) + 1         # random production
        for (j = 1; j <= rhscnt[sym, i]; j++)  # expand rhs's
            gen(rhslist[sym, i, j])
    } else {
        gsub(/[A-Z]/, " &", sym)  # put a space before each capital letter
        printf("%s ", sym)
    }
}
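The gsub call in gen() is what lets a grammar pack a multi-word phrase into a single terminal: each capital letter gets a space in front of it just before printing. A minimal sketch of that one step in isolation follows; the shell function name split_caps is ours and does not appear in story.awk, and LC_ALL=C is set only so that [A-Z] matches plain ASCII capitals.

```shell
# split_caps: apply the same gsub as gen() to each input line
# (hypothetical helper, for illustration only).
split_caps() {
    LC_ALL=C awk '{ sym = $0
                    gsub(/[A-Z]/, " &", sym)   # "&" stands for the matched capital
                    print sym }'
}

echo isAttackedBy | split_caps
echo UnderStand | split_caps
```

The first command prints "is Attacked By". The second prints " Under Stand" with a leading space, which is why the stories above contain "who Under Stand": the scifi grammar defines Understand but uses UnderStand, so gen() treats UnderStand as a terminal and prints it.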
storyp.awk is almost the same as story.awk, but it assumes that each production ends in a number that biases how often that production is selected.
# storyp.awk: as story.awk, but each production ends in a weight
BEGIN {
    srand(Seed ? Seed : 1)
    Grammar = Grammar ? Grammar : "grammar"
    while ((getline < Grammar) > 0)
        if ($2 == "->") {
            i = ++lhs[$1]             # count lhs
            rhsprob[$1, i] = $NF      # 0 <= probability <= 1
            rhscnt[$1, i] = NF-3      # how many in rhs
            for (j = 3; j < NF; j++)  # record them
                rhslist[$1, i, j-2] = $j
        } else
            print "illegal production: " $0
    for (sym in lhs)                  # turn weights into running totals
        for (i = 2; i <= lhs[sym]; i++)
            rhsprob[sym, i] += rhsprob[sym, i-1]
}
{   if ($1 in lhs) {  # nonterminal to expand
        gen($1)
        printf("\n")
    } else
        print "unknown nonterminal: " $0
}
function gen(sym,    i, j) {
    if (sym in lhs) {  # a nonterminal
        j = rand()     # random production
        for (i = 1; i <= lhs[sym] && j > rhsprob[sym, i]; i++) ;
        for (j = 1; j <= rhscnt[sym, i]; j++)  # expand rhs's
            gen(rhslist[sym, i, j])
    } else
        printf("%s ", sym)
}
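The BEGIN block above turns each nonterminal's weights into running totals, and gen() then takes the first production whose running total is not below the random draw. The sketch below replays that lookup for the two Verb productions with the draw passed in by hand, so the result is repeatable; the function name pick and the variables r, word and prob are illustrative and not part of storyp.awk.

```shell
# pick R: report which Verb production a draw of R would select
# (hypothetical helper, for illustration only).
pick() {
    awk -v r="$1" 'BEGIN {
        word[1] = "runs";  prob[1] = 0.9
        word[2] = "walks"; prob[2] = 0.1
        n = 2
        # running totals, as in the BEGIN block: 0.9 then 1.0
        for (i = 2; i <= n; i++)
            prob[i] += prob[i-1]
        # same scan as gen(): stop at the first total that reaches r
        for (i = 1; i <= n && r > prob[i]; i++) ;
        print word[i]
    }'
}

pick 0.25   # prints runs
pick 0.95   # prints walks
```

Draws up to 0.9 select "runs" and the remaining tenth select "walks", which is the nine-to-one bias the grammar asks for.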
The code comes from the book "The AWK Programming Language" by Alfred Aho, Brian Kernighan, and Peter Weinberger, Addison-Wesley, 1988.
The scifi grammar was written by Tim Menzies, 2009, and is based on Gahan Wilson's sci-fi plot generator: "The Science Fiction Horror Movie Pocket Computer" (in "The Year's Best Science Fiction No. 5", edited by Harry Harrison and Brian Aldiss, Sphere, London, 1972).