.aX
.nr H1 1
.H 1 "APPENDIX B: M4 - A Macro Processor"
.PH "'Appendix B''Appendix B'"
Macro processors are used to define and to process
specially defined strings of characters (called macros).
M4 is the name of the \*(x1 macro processor.
By defining a set of macros to be processed by M4,
a programming language can be enhanced to make it:
.BL
.LI
More structured
.LI
More readable
.LI
More appropriate for a particular application
.LE 1
.P
The
.B #define
statement in C
and the analogous
.B define
in Ratfor are examples of the basic facility provided by
any macro processor\*(EMreplacement of text by other text.
.P
Besides the straightforward replacement of one string of text 
by another, M4 provides:
.BL
.LI 
Macros with arguments
.LI
Conditional macro expansions
.LI
Arithmetic expressions
.LI
File manipulation facilities
.LI
String processing functions
.LE 1
.P
The basic operation of M4 is copying its input to its output.
As the input is read, each alphanumeric 
.Q "token"
(that is, string of letters and digits) is checked.
If the token is the name of a macro,
then the name of the macro is replaced by its defining text.
The resulting string is reread by M4.
Macros may also be called with arguments, 
in which case the arguments are collected
and substituted in the right places in the defining text
before M4 rescans the text.
.P
M4 provides a collection of about twenty built-in macros.
In addition, the user can define new macros.
Built-ins and user-defined macros work in exactly the same way, 
except that some of the built-in macros have side effects
on the state of the process.
.H 2 "Command Usage"
The invocation syntax for 
.I m4
is:
.DS I
.B
m4 [\fIfiles\fP]
.R
.DE
Each filename argument is processed in order.
If there are no arguments, or if an argument
is a dash (\-),
then the standard input is read.
The processed text is written to the standard output,
and can be redirected as in the following example:
.DS I
m4 file1 file2 \- >outputfile
.DE
Note the use of the dash in the above example to
indicate processing of the standard input,
.I after
the files 
.FN file1 
and 
.FN file2
have been processed by 
.I m4 .
.H 2 "Defining Macros"
The primary built-in function of M4
is
.B define ,
which is used to define new macros.
The input
.DS I
define(name, stuff)
.DE
causes the string
.Q name
to be defined as
.Q stuff .
All subsequent occurrences of
.Q name
will be replaced by
.Q stuff .
.Q Name
must be alphanumeric and must begin with a letter
(the underscore (\(ul) counts as a letter).
.Q Stuff
is any text, including text that contains balanced parentheses;
it may stretch over multiple lines.
.P
Thus, as a typical example
.DS I
define(N, 100)
 .
 .
 .
if (i > N)
.DE
defines
.Q N
to be 100, and uses this 
.Q "symbolic constant"
in a later
.B if
statement.
.P
The left parenthesis must immediately follow the word
.B define ,
to signal that
.B define
has arguments.
If a macro or built-in name is not followed 
immediately by a left parenthesis,
.Q "(" ,
it is assumed to have no arguments.
This is the situation for
.Q N
above;
it is actually a macro with no arguments.
Thus, when it is used, no parentheses are 
needed following its name.
.P
Notice also that a macro name is recognized as such only
if it is surrounded by nonalphanumeric characters.
For example, in
.DS I
define(N, 100)
 ...
if (NNN > 100)
.DE
the variable 
.Q NNN
is absolutely unrelated to the defined macro
.Q N ,
even though it contains three N's.
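.P
As a small sketch of this rule
(the input line is invented for illustration),
the input
.DS I
define(N, 100)
N NNN N1 (N) N+N
.DE
should produce, after the empty line left by the
.B define ,
the line
.DS I
100 NNN N1 (100) 100+100
.DE
since only an
.Q N
standing alone between nonalphanumeric characters is replaced.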
.P
Things may be defined in terms of other things.
For example
.DS I
define(N, 100)
define(M, N)
.DE
defines both M and N to be 100.
.P
What happens if
.Q N
is redefined?
Or, to say it another way, is
.Q M 
defined as
.Q N
or as 100?
In M4,
the latter is true \*(EM
.Q M
is 100, so even if
.Q N 
subsequently changes,
.Q M
does not.
.P
This behavior arises because
M4 expands macro names into their defining text 
as soon as it possibly can.
Here, that means that when the string
.Q N
is seen as the arguments of
.B define
are being collected, it is immediately replaced by 100;
it's just as if you had said
.DS I
define(M, 100)
.DE
in the first place.
.P
If this isn't what you really want, there are two ways out of it.
The first, which is specific to this situation,
is to interchange the order of the definitions:
.DS I
define(M, N)
define(N, 100)
.DE
Now
.Q M
is defined to be the string
.Q N ,
so when you ask for 
.Q M 
later, 
you will always get the value of
.Q N 
at that time
(because the
.Q M
will be replaced by
.Q N
which, in turn, will be replaced by 100).
.H 2 "Quoting"
The more general solution is to delay the expansion of
the arguments of
.B define 
by
.I quoting
them.
Any text surrounded by single quotation marks \(ga and \(aa
is not expanded immediately, but has the quotation marks stripped off.
If you say
.DS I
define(N, 100)
define(M, `N')
.DE
the quotation marks around the
.Q N
are stripped off as the argument is being collected,
but they have served their purpose, and 
.Q M
is defined as
the string
.Q N ,
not 100.
The general rule is that M4 always strips off
one level of single quotation marks whenever it evaluates
something.
This is true even outside of
macros.
If you want the word
.B define
to appear in the output,
you have to quote it in the input,
as in
.DS I
\&`define' = 1;
.DE
.P
As another instance of the same thing, 
which is a bit more surprising, consider redefining
.Q N :
.DS I
define(N, 100)
 ...
define(N, 200)
.DE
Perhaps regrettably, the
.Q N
in the second definition is
evaluated as soon as it's seen;
that is, it is
replaced by
100, so it's as if you had written
.DS I
define(100, 200)
.DE
This statement is ignored by M4, 
since you can only define things that look
like names, 
but it obviously doesn't have the effect you wanted.
To really redefine 
.Q N ,
you must delay the evaluation by quoting:
.DS I
define(N, 100)
 ...
define(`N', 200)
.DE
In M4,
it is often wise to quote the first argument of a macro.
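.P
The effect of quoting can be checked with a short
sketch such as the following,
which uses only the definitions shown above:
.DS I
define(N, 100)
define(M, `N')
define(`N', 200)
M
.DE
Because
.Q M
is defined as the string
.Q N
rather than as 100,
the last line should expand to 200.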
.P
If the backward and forward quotation marks (\|\` and \'\|) 
are not convenient for some reason,
the quotation marks can be changed with the built-in
.B changequote .
For example:
.DS I
changequote([, ])
.DE
makes the new quotation marks the left and right brackets.
You can restore the original characters with just
.DS I
changequote
.DE
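.P
As an illustrative sketch (the macro names are arbitrary),
the brackets can be used as quotation marks
for a few definitions and the original marks then restored:
.DS I
changequote([, ])
define([N], 100)
define(M, [N])
changequote
M
.DE
The last line should expand to 100, by way of
.Q N .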
.P
There are two additional built-ins related to
.B define .
The built-in
.B undefine
removes the definition of some macro or built-in:
.DS I
undefine(`N')
.DE
removes the definition of
.Q N .
Built-ins can be removed with 
.B undefine ,
as in
.DS I
undefine(`define')
.DE
but once you remove one, you can never get it back.
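.P
A minimal sketch of the effect of
.B undefine
on a user-defined macro:
.DS I
define(`N', 100)
N
undefine(`N')
N
.DE
The first
.Q N
should be replaced by 100;
the second, no longer being a macro, is copied to the output unchanged.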
.P
The built-in 
.B ifdef
provides a way to determine if a macro is currently defined.
For instance, pretend that either the word 
.Q xenix
or
.Q unix
is defined according to a particular implementation of
a program.
To perform operations according to which system you
have, you might say:
.DS I
ifdef(`xenix', `define(system,1)' )
ifdef(`unix', `define(system,2)' )
.DE
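Don't forget the quotation marks in the above example;
without them, both
.B define
calls would be evaluated while the arguments of
.B ifdef
are being collected, whichever name is defined.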
.P
.B Ifdef
actually permits three arguments:
if the name is undefined, the value of
.B ifdef
is then the third argument, as in
.DS I
ifdef(`xenix', on XENIX, not on XENIX)
.DE
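.P
The two forms can be combined.
In the following sketch,
.Q xenix
is defined by hand purely for illustration:
.DS I
define(`xenix', 1)
ifdef(`xenix', `define(`system', 1)', `define(`system', 2)')
system
.DE
Since
.Q xenix
is defined, only the first
.B define
is evaluated, and the last line should expand to 1.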
.H 2 "Arguments"
So far we have discussed the simplest form 
of macro processing \*(EM
replacing one string by another (fixed) string.
User-defined macros may also have arguments, so different invocations
can have different results.
Within the replacement text for a macro
(the second argument of its
.B define )
any occurrence of
.I $n
will be replaced by the 
.I n th
argument when the macro
is actually used.
Thus, the macro
.I bump ,
defined as
.DS I
define(bump, $1 = $1 + 1)
.DE
generates code to increment its argument by 1:
.DS I
bump(x)
.DE
is
.DS I
x = x + 1
.DE
.P
A macro can have as many arguments as you want,
but only the first nine are accessible,
through $1 to $9.
(The macro name itself is $0.)
Arguments that are not supplied are replaced by null strings,
so
we can define a macro
.I cat
which simply concatenates its arguments, like this:
.DS I
define(cat, $1$2$3$4$5$6$7$8$9)
.DE
Thus
.DS I
cat(x, y, z)
.DE
is equivalent to
.DS I
xyz
.DE
The arguments $4 through $9 are null, 
since no corresponding arguments were provided.
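.P
Arguments need not be used in the order in which they are supplied.
As a sketch, a hypothetical macro
.I swap
that reverses its two arguments might be written as:
.DS I
define(`swap', `$2, $1')
swap(x, y)
.DE
which should expand to
.DS I
y, x
.DE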
.P
Leading unquoted blanks, tabs, or newlines that 
occur during argument collection
are discarded.
All other white space is retained.
Thus:
.DS I
define(a,   b   c)
.DE
defines
.Q a
to be
.Q "b\0\0\0c" .
.P
Arguments are separated by commas, 
but parentheses are counted properly,
so a comma 
.Q "protected"
by parentheses 
does not terminate an argument.
That is, in
.DS I
define(a, (b,c))
.DE
there are only two arguments;
the second is literally
.Q (b,c) .
And of course a bare comma or parenthesis can be inserted 
by quoting it.
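.P
The following sketch shows the three cases side by side,
using a hypothetical macro
.I first
that returns its first argument:
.DS I
define(`first', `$1')
first((a,b), c)
first(`a,b', c)
first(a,b, c)
.DE
The three calls should expand to
.DS I
(a,b)
a,b
a
.DE
respectively.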
.H 2 "Arithmetic Built-ins"
M4 provides two built-in functions for doing arithmetic
on integers.
The simpler is
.B incr ,
which increments its numeric argument by 1.
Thus, to handle the common programming situation
where you want a variable to be defined as 
.Q "one more than N" ,
write
.DS I
define(N, 100)
define(N1, `incr(N)')
.DE
Then
.Q N1
is defined as one more than the current value of
.Q N .
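.P
Because the defining text is quoted,
.Q N1
follows later changes to
.Q N ,
as this sketch suggests:
.DS I
define(N, 100)
define(N1, `incr(N)')
N1
define(`N', 200)
N1
.DE
The first
.Q N1
should expand to 101 and the second to 201.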
.P
The more general mechanism for arithmetic is a built-in
called
.B eval ,
which is capable of arbitrary arithmetic on integers.
It provides the following operators
(in decreasing order of precedence):
.DS I
unary + and -
** or ^	(exponentiation)
*  /  %	(multiplication, division, modulus)
+  -
==  !=  <  <=  >  >=
!	(not)
& or &&	(logical and)
| or ||	(logical or)
.DE
.P
Parentheses may be used to group operations where needed.
All the operands of an expression given to
.B eval
must ultimately be numeric.
The numeric value of a true relation
(like 1>0)
is 1, and false is 0.
The precision in
.B eval
is
implementation dependent.
.P
As a simple example, suppose we want 
.Q M
to be 
.Q 2**N+1 .
Then write
.DS I
define(N, 3)
define(M, `eval(2**N+1)')
.DE
As a matter of principle, it is advisable
to quote the defining text for a macro
unless it is very simple indeed
(say just a number);
it usually gives the result you want,
and is a good habit to get into.
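.P
A few more calls, given here only as a sketch,
illustrate the rules above:
.DS I
define(`N', 3)
define(`M', `eval(2**N+1)')
M
eval(7%3)
eval(1>0)
.DE
These should expand to 9, 1, and 1;
the last value is the numeric value of a true relation.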
.H 2 "File Manipulation"
You can include a new file in the input at any time by
the built-in function
.B include :
.DS I
include(filename)
.DE
inserts the contents of
.I filename
in place of the
.B include
command.
The contents of the file are often a set of definitions.
The value
of
.B include
(that is, its replacement text)
is the contents of the file;
this can be captured in definitions, etc.
.P
It is a fatal error if the file named in
.B include
cannot be accessed.
To get some control over this situation, the alternate form
.B sinclude
can be used;
.B sinclude 
(
.Q "silent include" )
says nothing and continues if it can't access the file.
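.P
A typical arrangement, sketched here with hypothetical file names,
reads an optional file of local definitions
and then a required one:
.DS I
dnl local.m4 and defs.m4 are hypothetical file names
sinclude(`local.m4')
include(`defs.m4')
.DE
If the first file cannot be accessed, processing simply continues;
if the second cannot be accessed, M4 stops with a fatal error.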
.P
It is also possible to divert the output of M4 
to temporary files during processing,
and output the collected material upon command.
M4 maintains nine of these diversions, numbered 1 through 9.
If you say
.DS I
divert(n)
.DE
all subsequent output is put onto the end of a temporary file
referred to as
.Q n .
Diverting to this file is stopped by another 
.B divert 
command;
in particular,
.B divert
or
.B divert(0)
resumes the normal output process.
.P
Diverted text is normally output all at once
at the end of processing,
with the diversions output in numeric order.
It is possible, however, to bring back diversions
at any time,
that is, to append them to the current diversion.
.DS I
undivert
.DE
brings back all diversions in numeric order, and
.B undivert
with arguments brings back the selected diversions
in the order given.
The act of undiverting discards the diverted stuff,
as does diverting into a diversion 
whose number is not between 0 and 9 inclusive.
.P
The value of
.B undivert
is
.I not
the diverted stuff.
Furthermore, the diverted material is
.I not
rescanned for macros.
.P
The built-in
.B divnum
returns the number of the currently active diversion.
This is zero during normal processing.
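.P
The following sketch, with invented text,
shows diversions being brought back out of numeric order:
.DS I
divert(1)
one
divert(2)
two
divert
main
undivert(2)
undivert(1)
.DE
Apart from empty lines, the output should consist of the lines
.Q main ,
.Q two ,
and
.Q one ,
in that order.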
.H 2 "System Command"
You can run any program in the local operating system
with the
.B syscmd
built-in.
For example,
.DS I
syscmd(date)
.DE
runs the
.B date
command.
Normally,
.B syscmd
would be used to create a file
for a subsequent
.B include .
.P
To facilitate making unique file names, the built-in
.B maketemp
is provided, with specifications identical to the system function
.B mktemp :
a string of 
.Q XXXXX 
in the argument is replaced
by the process id of the current process.
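.P
The two built-ins can be combined, as in the following sketch.
The macro name
.Q tmpfile
and the path
.FN /tmp/m4XXXXX
are assumptions made for the example:
.DS I
dnl tmpfile and the /tmp path are hypothetical names
define(`tmpfile', maketemp(`/tmp/m4XXXXX'))
syscmd(`date > 'tmpfile)
include(tmpfile)
.DE
Here
.B maketemp
is deliberately left unquoted so that the file name is generated
once, when
.Q tmpfile
is defined, rather than each time it is used.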
.H 2 "Conditionals"
There is a built-in called
.B ifelse
which enables you to perform arbitrary conditional testing.
In the simplest form,
.DS I
ifelse(a, b, c, d)
.DE
compares the two strings
.Q a
and
.Q b .
If these are identical, 
.B ifelse
returns
the string
.Q c ;
otherwise it returns
.Q d .
Thus, we might define a macro called
.I compare
which compares two strings and returns 
.Q "yes"
if they are the same and
.Q "no"
if they are different:
.DS I
define(compare, `ifelse($1, $2, yes, no)')
.DE
Note the quotation marks,
which prevent too-early evaluation of
.B ifelse .
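.P
As a quick check of this definition, the calls
.DS I
compare(abc, abc)
compare(abc, abd)
.DE
should expand to
.Q yes
and
.Q no ,
respectively.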
.P
If the fourth argument is missing, it is treated as empty.
.P
.B Ifelse
can actually have any number of arguments,
and thus provides a limited form of multi-way decision capability.
In the input
.DS I
ifelse(a, b, c, d, e, f, g)
.DE
if the string
.Q a
matches the string
.Q b ,
the result is
.Q c .
Otherwise, if
.Q d
is the same as
.Q e ,
the result is
.Q f .
Otherwise the result is
.Q g .
If the final argument
is omitted, the result is null,
so
.DS I
ifelse(a, b, c)
.DE
is
.Q c
if 
.Q a
matches
.Q b ,
and null otherwise.
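.P
As a sketch of a multi-way decision, a hypothetical macro
.I size
might classify its argument as follows:
.DS I
define(`size', `ifelse($1, 0, none, $1, 1, one, many)')
size(0)
size(1)
size(7)
.DE
The three calls should expand to
.Q none ,
.Q one ,
and
.Q many .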
.H 2 "String Manipulation"
The built-in
.B len
returns the length of the string that makes up its argument.
Thus
.DS I
len(abcdef)
.DE
is 6, and
.DS I
len((a,b))
.DE
is 5.
.P
The built-in
.B substr
can be used to produce substrings of strings.
For example
.DS I
substr(s,\ i,\ n)
.DE
returns the substring of
.Q s
that starts at 
position
.I i 
(origin zero),
and is
.I n
characters long.
If 
.I n
is omitted, the rest of the string is returned,
so
.DS I
substr(`now is the time', 1)
.DE
is
.DS I
ow is the time
.DE
If 
.I i
or
.I n
is out of range, various sensible things happen.
.P
The built-in
.DS I
index(s1,\ s2)
.DE
returns the index (position) in
.Q s1
where the string
.Q s2
occurs, or \-1
if it doesn't occur.
As with
.B substr ,
the origin for strings is 0.
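.P
The three built-ins can be tried on the same string,
as in this sketch:
.DS I
len(`now is the time')
substr(`now is the time', 7, 3)
index(`now is the time', `the')
.DE
These should expand to 15,
.Q the ,
and 7.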
.P
The built-in
.B translit
performs character transliteration.
.DS I
translit(s, f, t)
.DE
modifies
.Q s
by replacing any character found in
.Q f
by the corresponding character of
.Q t .
That is
.DS I
translit(s, aeiou, 12345)
.DE
replaces the vowels by the corresponding digits.
If
.Q t
is shorter than
.Q f ,
characters that don't have an entry in
.Q t
are deleted; as a limiting case,
if
.Q t
is not present at all,
characters from 
.Q f
are deleted from 
.Q s .
So
.DS I
translit(s, aeiou)
.DE
deletes vowels from 
.Q s .
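.P
For instance, applied to a literal string
(chosen here only for illustration), the calls
.DS I
translit(`hello world', aeiou, 12345)
translit(`hello world', aeiou)
.DE
should expand to
.DS I
h2ll4 w4rld
hll wrld
.DE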
.P
There is also a built-in called
.B dnl
which deletes all characters that follow it up to
and including the next newline.
It is useful mainly for throwing away 
empty lines that otherwise tend to clutter up M4 output.
For example, if you say
.DS I
define(N, 100)
define(M, 200)
define(L, 300)
.DE
the newline at the end of each line is not part of the definition,
so it is copied into the output, where it may not be wanted.
If you add
.B dnl
to each of these lines, the newlines will disappear.
.P
Another way to achieve this is to make the definitions
while output is diverted to a diversion that is discarded:
.DS I
divert(-1)
	define(...)
	...
divert
.DE
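.P
A sketch combining both techniques:
.DS I
divert(-1)
define(N, 100)
define(M, 200)
divert(0)dnl
N M
.DE
The output produced up to the second
.B divert
is discarded, the
.B dnl
removes the newline that follows it,
and the output should be the single line
.DS I
100 200
.DE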
.H 2 "Printing"
The built-in
.B errprint
writes its arguments out on the standard error file.
Thus, you can say
.DS I
errprint(`fatal error')
.DE
.P
.B Dumpdef
is a debugging aid that
dumps the current definitions of defined terms.
If there are no arguments, you get everything;
otherwise you get the ones you name as arguments.
Don't forget to quote the names;
otherwise they are expanded before
.B dumpdef
sees them.
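.P
A short sketch of both built-ins:
.DS I
define(`N', 100)
dumpdef(`N')
errprint(`N is defined as 'N)
.DE
The
.B errprint
call should write
.Q "N is defined as 100"
on the standard error file;
the exact layout of the
.B dumpdef
listing depends on the implementation.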
.bp
.H 2 "Summary of Built-ins"
.DS I
changequote(L, R)
define(name, replacement)
divert(number)
divnum
dnl
dumpdef(`name', `name', ...)
errprint(s, s, ...)
eval(numeric expression)
ifdef(`name', this if true, this if false)
ifelse(a, b, c, d)
include(file)
incr(number)
index(s1, s2)
len(string)
maketemp(...XXXXX...)
sinclude(file)
substr(string, position, number)
syscmd(s)
translit(str, from, to)
undefine(`name')
undivert(number,number,...)
.DE
