!head('Rules for threaded code routines')
in general, threaded code routines obey the following rules:
!pt the threaded code register (R4) must be preserved from one threaded
routine to another as it specifies what operation is to be performed.
!pt any other general register (R0 through R3) may be used freely.
!pt the floating point registers may be used freely.
!pt the floating point processor mode may be changed freely from single
to double as required.
!pt since the compiler expects to know the number of values on the stack
at any given time, care should be taken to ensure that the stack pointer is
manipulated properly.
!pt the SBRK routine maintains the break address for the Fortran runtime, so
it should be used to get any additional space required. 
!pt the mode of the floating point processor is changed to DOUBLE before
calling any subroutine in order that C subroutines have the right processor
mode.
it might be reasonable to change the runtime code to use DOUBLE mode always
and switch modes only when doing single precision.
!head('naming conventions')
a function, subroutine, or entry point is given its Fortran name, prefixed
by an underscore ("!47"). this is compatible with the C compiler convention.
Common block names are also prefixed by "!47", and may thus be used to
interface with globally declared C variables, arrays, or structures.
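!p as a rough illustration of the common block convention, the C fragment
below declares a global structure that a Fortran common block could share.
the block name BLK, the member names, and the helper blk_set are all
hypothetical. the member types and their order are assumptions: they must
match whatever layout the Fortran compiler gives the block, which is not
specified here.

```c
/* Sketch only: a C view of a hypothetical Fortran common block
 *     common /blk/ x, y, n
 * The layout (two reals, then an integer) is an assumption. */
struct blk_layout {
    float x;
    float y;
    long  n;    /* assumed to match the Fortran integer size */
};

/* Globally declared, so the C name "blk" gets the same underscore-prefixed
 * external symbol that the Fortran compiler gives the common block. */
struct blk_layout blk;

/* Hypothetical helper: C code updates the shared storage directly. */
void blk_set(float x, float y, long n)
{
    blk.x = x;
    blk.y = y;
    blk.n = n;
}
```

after a call such as blk_set(1.5, 2.5, 7), a Fortran routine that declares
the same common block would see the new values, provided the layouts agree.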
!head('calling sequence and parameter mechanism')
Fortran routines generate a calling sequence that is mostly compatible with the
DEC PC calling sequence, and the calling sequence used by the C 
compiler. the code generated for:
)l2 call sub(a,b,c) !p is: )l2a
push c-address
push b-address
push a-address
push 3      (the number of arguments)
mov sp,r5   (point to parameter list with r5)
jsr pc,sub  (call the subroutine)
add $8,sp   (pop off 3 parameters and arg count)
) !p the Fortran calling sequence has the following characteristics:
!pt the addresses (NOT the values) of the parameters are passed to the called routine.
if an expression is given, the result is stored into a temporary location
and the address of the temporary is passed.
!pt the number of parameters is passed as part of the parameter list.
!pt General register five (R5) is set to point to the start of the parameter
list. this is only so that older subroutines written in assembler language
will still work. new assembler routines should conform to the C calling
sequence.
!pt the actual transfer is via a "JSR PC,SUB" (i.e.~the return address is
on the top of the stack).
this satisfies the DEC PC calling sequence convention.
!pt since the parameter list itself is stored on the top of the stack, the
C calling sequence is also satisfied.
!pt after the called routine returns (it has saved R2, R3, R4, and preserved
the value of the stack pointer SP) the parameter list is popped off the stack.
!p the runtime now assumes that the parameter list has been placed on the
stack (as for a C call), and does not assume that R5 points to it (although
for a Fortran call it will). this makes calling C from Fortran and Fortran
from C more reliable, since it does not assume anything about what R5
points to.
!p note however, that any C routine that calls a Fortran subroutine must provide
the count of the number of arguments in order to satisfy the Fortran calling
sequence, and all arguments must be passed by address, not value.
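!p the requirement on C callers can be sketched as follows. the names fsub
and call_fsub are hypothetical, and fsub is written in C here only so that
the example is self-contained; in practice it would be a Fortran subroutine.
the sketch assumes that the C compiler pushes its arguments from right to
left, so that the count, written first, ends up on top of the parameter list.

```c
/* Sketch only: a C caller satisfying the Fortran calling sequence.
 * fsub stands in for a hypothetical Fortran routine
 *     subroutine fsub(a, b, c)
 * that stores a + b into c. */
void fsub(int nargs, float *a, float *b, float *c)
{
    /* The count arrives first, then one address per argument. */
    if (nargs == 3)
        *c = *a + *b;
}

/* The C side: pass the argument count (3) and the ADDRESS of every
 * argument, never the value itself. */
float call_fsub(float a, float b)
{
    float c = 0.0f;

    fsub(3, &a, &b, &c);
    return c;
}
```

passing a value instead of an address, or omitting the count, would shift
every parameter as seen by the Fortran routine.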
!head('Accessing of Parameters')
as part of the procedure entry code the addresses of the parameters are copied
into local variables. all references to parameters are made through the
addresses stored in these local pointers.
!head('shared and separated code')
the runtime is set up to be sharable and
the compiler generates code that may be made sharable (which is useful
when tracking down code that overwrites itself).
however, since the generated code must be in DATA space if I/D separation
is used, little advantage is gained by putting just the runtime code into
the INSTRUCTION space. it could be done, but the ability to make the code
itself sharable is more generally useful.
!p in order to gain full advantage of I/D separation the compiler would
have to generate actual machine code. this is not trivial, but would be
worthwhile for the speed gain if nothing else.
!head('compiler optimizations')
the compiler does not attempt many optimizations. it optimizes the following
types of expressions (mostly for efficient subscript evaluation).
)l2a
c1 [+ - * /] c2 ==> c3
e [+ -] c1 ==> [+ -] c1 + e
e [+ -] 0  ==> e
e [* /] 1  ==> e
c1 * (c2 [+ -] e) ==> (c1*c2) [+ -] (c1 * e)
) !p
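three of these rewrites can be sketched on a toy expression node. this is
only an illustration, not the compiler's own code: the struct expr type, the
K_ tags, and the simplify function are all hypothetical, and integer
constants are assumed.

```c
/* Sketch only: a toy expression node and three of the rewrites above. */
enum kind { K_CONST, K_VAR, K_BINOP };

struct expr {
    enum kind kind;
    long value;                 /* used when kind == K_CONST */
    char op;                    /* '+', '-', '*' or '/' when kind == K_BINOP */
    struct expr *left, *right;  /* operands when kind == K_BINOP */
};

/* Apply  c1 op c2 ==> c3,  e [+ -] 0 ==> e  and  e [* /] 1 ==> e
 * at a single node. The operands are assumed to be simplified already.
 * Returns the node that should replace e. */
struct expr *simplify(struct expr *e)
{
    struct expr *l, *r;

    if (e->kind != K_BINOP)
        return e;
    l = e->left;
    r = e->right;

    if (l->kind == K_CONST && r->kind == K_CONST) {
        if (e->op == '/' && r->value == 0)
            return e;           /* leave division by zero unfolded */
        switch (e->op) {
        case '+': e->value = l->value + r->value; break;
        case '-': e->value = l->value - r->value; break;
        case '*': e->value = l->value * r->value; break;
        case '/': e->value = l->value / r->value; break;
        default:  return e;     /* unknown operator: no change */
        }
        e->kind = K_CONST;      /* the node itself becomes the constant c3 */
        return e;
    }
    if (r->kind == K_CONST) {
        if ((e->op == '+' || e->op == '-') && r->value == 0)
            return l;           /* e [+ -] 0 ==> e */
        if ((e->op == '*' || e->op == '/') && r->value == 1)
            return l;           /* e [* /] 1 ==> e */
    }
    return e;
}
```

a node for 2 * 3 becomes the constant 6, and a node for i + 0 is replaced
by the node for i.
!p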
in addition, it factors out any constants in subscript calculations that
can be incorporated into the base address of the array. for example:
a(2*i+3) is compiled into value((a+8)+8*i) where (a+8) is known at linkedit
time.
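!p the arithmetic behind this example can be checked with a short sketch.
it assumes 4-byte array elements and subscripts starting at 1, which is
what the constants in the example imply; the function names are hypothetical.

```c
/* Sketch only: the byte offset of a(2*i+3) from the start of a.
 * ELEM_SIZE = 4 is an assumption (a 4-byte element). */
enum { ELEM_SIZE = 4 };

/* Offset computed directly; Fortran subscripts start at 1. */
long naive_offset(long i)
{
    return ELEM_SIZE * ((2 * i + 3) - 1);
}

/* The same offset split as in the text: the constant 8 is folded into
 * the base address (a+8), leaving only 8*i to compute at run time. */
long folded_offset(long i)
{
    return 8 + 8 * i;
}
```

both functions return the same value for every i, so only one multiply and
one add remain at run time.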
!head('IF statement optimizations')
the compiler will optimize IF statements that evaluate to constants in
such a way that code that can never be executed is not actually generated.
for example:
)l2a
1      if (.false.) print,'hi'
2      if (.true.) print,'there'
) !p is actually compiled as if it were written:
)l2a
1      continue
2      print,'there'
) !p this also applies to block IF statements.
whenever the compiler is able to determine the result of a logical expression
at compile time it will do so. for example:
)l2a
      parameter (i=1)
      if (i.eq.1) then
           stmt-1
      else
           stmt-2
      endif
) !p in this case, since the expression is known to be true at compile time,
it is compiled as if it were written:
)l2a
     stmt-1
) 
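!p the decision described above can be sketched as a small function. the
enum names and plan_block_if are hypothetical; the sketch only shows which
arms of a block IF would be generated, not how the compiler does it.

```c
/* Sketch only: choosing which arms of a block IF to generate. */

/* What is known about the condition at compile time. */
enum truth { T_FALSE, T_TRUE, T_UNKNOWN };

/* What to generate for   if (cond) then ... else ... endif   */
enum emit { EMIT_THEN_ONLY, EMIT_ELSE_ONLY, EMIT_BOTH_WITH_TEST };

enum emit plan_block_if(enum truth cond)
{
    switch (cond) {
    case T_TRUE:
        return EMIT_THEN_ONLY;      /* the else arm can never execute */
    case T_FALSE:
        return EMIT_ELSE_ONLY;      /* the then arm can never execute */
    default:
        return EMIT_BOTH_WITH_TEST; /* decided at run time */
    }
}
```

for the PARAMETER example, the condition is known to be true, so only
stmt-1 would be generated.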
