Documenting Code with AWK

In this article I present an AWK script I've been using for quite some time to create HTML documentation from the headers in my source code.

The script fills a role similar to that of tools like Javadoc and Doxygen, but it is not as powerful.

The simplicity comes with some advantages though: AWK is widely available, so I can include the script with my source code and be sure that it will just work. Also, the input (and the output) tends to be simpler, which is ideal for the smaller hobby projects I've been spending my time on.

Introduction to AWK

AWK must certainly be one of the most underappreciated programming languages.

It has a lot of things to like:

• It uses a curly-brace syntax, which makes it easy to learn.
• Its standard library is so compact that you can keep most of it in your head without searching through man pages.
• It is specifically tailored to work on text files, so it has a wide variety of string manipulation functions, yet it is flexible enough for other general-purpose programming tasks.
• It is also widely distributed on Unix-like operating systems (a Windows version of GAWK, the GNU implementation of AWK, is available as part of unxutils).

It certainly has a good pedigree: its creators include Alfred Aho (co-author of the Dragon Book) and Brian Kernighan (co-author of The C Programming Language).

An AWK program consists of a series of rules of the form condition { statements }. Each condition is checked against every line of the input, and if it evaluates to true, the statements are executed. Conditions are typically regular expressions, but there are some special forms for other purposes.

The following example prints a line from the input file only if it matches the regular expression /foo/:


/foo/ {print}
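
To make the condition { statements } structure a bit more concrete, here is a minimal sketch that combines a regular expression condition with a few of the other forms standard AWK provides: BEGIN and END, which run before and after the input is processed, and a plain expression on the built-in line counter NR. The sample input, the function name awk_demo and the variable name count are my own assumptions for illustration and are not part of the documentation script; the shell wrapper is only there to feed the program some input.

```sh
# Hypothetical demo: feed three sample lines to a small AWK program.
awk_demo() {
    printf 'foo one\nbar two\nfoo three\n' |
    awk '
    BEGIN   { print "matches:" }                  # runs once, before any input is read
    /foo/   { count++; print }                    # runs for every line matching the regex
    NR == 2 { print "(line 2 was: " $0 ")" }      # expression condition on the line number
    END     { print count " lines matched" }      # runs once, after the last line
    '
}

awk_demo
```

Each rule is tried against every input line in the order it is written, so a single line can trigger more than one rule. A print with no arguments prints the whole current line, which is also available as $0.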