About awk.info
Mar 01: Michael Sanders demos an X-windows GUI for AWK.
Mar 01: Awk100#24: A. Lahm and E. de Rinaldis' patent search, in AWK
Feb 28: Tim Menzies asks this community to write an AWK cookbook.
Feb 28: Arnold Robbins announces a new debugger for GAWK.
Feb 28: Awk100#23: Premysl Janouch offers an IRC bot, in AWK
Feb 28: Updated: the AWK FAQ
Feb 28: Tim Menzies offers a tiny content management system, in Awk.
Jan 31: Comment system added to awk.info. For example, see the discussion at the bottom of ?keys2awk
Jan 31: Martin Cohen shows that Gawk can handle massively long strings (300 million characters).
Jan 31: The AWK FAQ is being updated. For comments, corrections, or extensions, please mail tim@menzies.us
Jan 31: Martin Cohen finds Awk on the Android platform.
Jan 31: Aleksey Cheusov released a new version of runawk.
Jan 31: Hirofumi Saito contributes a candidate Awk mascot.
Jan 31: Michael Sanders shows how to quickly build an AWK GUI for windows.
Jan 31: Hyung-Hwan Chung offers QSE, an embeddable Awk Interpreter.
Not a single program.
Generate TeX code for a bilingual dictionary from a flat file database. This system has been used to generate multiple editions of dictionaries for several dialects of Carrier, the endangered language of a large portion of the central interior of British Columbia.
Bill Poser
Canada
linguistics - dictionary publishing
Bill Poser
billposer@alum.mit.edu
A dictionary database consists of four flat files containing records in which fields are identified by tags, in a format isomorphic to Standard Dictionary Format. The four files contain main entries, example sentences with translations, verb roots, and verb stems. This provides a modest degree of relativization. Awk scripts controlled by a makefile do the bulk of the work of generating TeX code for printing dictionaries containing front matter, a Carrier-English section, an English-Carrier section, a topical index, an alphabetical root list, a list of roots sorted by English gloss, an alphabetical list of verb stems, a list of verb stems sorted by root, an alphabetical list of affixes, a list of affixes sorted by English gloss, a list of scientific names, a list of placenames, and credits for illustrations.
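The tagged-record-to-TeX step described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the author's awk code. Everything specific in it is an assumption: the backslash-style tags `\lx` (headword) and `\ge` (English gloss), the blank line between records, the `\entry` TeX macro, and the placeholder sample data. The sort is a plain code-point sort, standing in for the collation that a dedicated sort utility would provide.

```python
# Hypothetical sample: two records, blank-line separated, backslash-tagged fields.
# The tags (lx, ge) and the data are placeholders, not taken from the real database.
SAMPLE = "\\lx beta\n\\ge second & last\n\n\\lx alpha\n\\ge first\n"


def parse_records(text):
    """Split tagged text into a list of {tag: value} dicts, one per record."""
    records = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current record (assumed convention).
            if current:
                records.append(current)
                current = {}
            continue
        if line.startswith("\\"):
            # "\lx beta" -> tag "lx", value "beta"
            tag, _, value = line[1:].partition(" ")
            current[tag] = value.strip()
    if current:
        records.append(current)
    return records


def tex_escape(s):
    """Escape a few characters that are special in TeX."""
    for ch in "&%$#_":
        s = s.replace(ch, "\\" + ch)
    return s


def to_tex(records):
    """Emit one hypothetical \\entry{headword}{gloss} line per record,
    sorted by headword using a plain code-point sort."""
    lines = []
    for rec in sorted(records, key=lambda r: r.get("lx", "")):
        headword = tex_escape(rec.get("lx", ""))
        gloss = tex_escape(rec.get("ge", ""))
        lines.append("\\entry{%s}{%s}" % (headword, gloss))
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_tex(parse_records(SAMPLE)))
```

Sorting on a different field (for example, the gloss instead of the headword) would give the reverse-direction section from the same records, which is roughly why one flat-file database can feed many differently ordered lists.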
gawk
The awk scripts are executed from a makefile.
GNU/Linux on x86.
The awk scripts are executed from a makefile by GNU make. The other program used extensively is the sort utility msort.
5500
The first usable version took no more than a day (plus the time to create the TeX template into which the generated code is inserted).
Pure maintenance due to changes in environment, bit rot, etc. has been just about nil. The effort devoted to adding features is very difficult to estimate, as it has taken place at irregular intervals over a period of 15 years.
Status (1=Prototype, 2=Evaluation, 3=Released, 4=No longer supported, 5=Dead): 3, I guess. The code is mature but not really released since the author is the only one who normally uses it.
1=Personal use, 2=In-house use, 3=Free/public domain, 4=Licensed, 5=Sold product: 1
1
June 1993.
A paper describing these databases and the process for generating dictionaries from them is available: Lexical Databases for Carrier
Some information about the resulting dictionaries: http://www.ydli.org/products/dicts.htm