From: DSP Hotline

Subject: 320C30 Code Generation Tools Bug List

-------------------------------------------------------------------------------
* RELEASE 4.50 BUGLIST                                                        *
* Update - 11/18/94                                                           *
*                                                                             *
* This file contains a list of bugs in the release 4.50 of the 320C30         *
* Code Generation Tools                                                       *
*                                                                             *
* All bugs will be fixed in the next production release unless otherwise      *
* indicated. There will sometimes be references to internal versions in       *
* this list. For that reason you must always refer to the release status      *
* information (published every two weeks) to verify the availability of a     *
* specific revision.                                                          *
*                                                                             *
-------------------------------------------------------------------------------


				Part 1

 COMPILER
-------------------------------------------------------------------------------

5051     V4.50  will fix next release
  The compiler/optimizer may produce the incorrect loop count for a loop in
which the index is incremented by a value greater than 65535.

i.e.

main()
{
unsigned int a;
volatile unsigned int b;

for (a = 0xAAAAAAAA; a >= 0x55555555; a -= 0x55555555)
     b = a;
}
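
For reference, the trip count that the loop above should have can be checked
on any host with exact 32-bit unsigned arithmetic. The sketch below is not
part of the original report; the function name expected_trip_count and the use
of uint32_t are assumptions made for illustration only.

```c
#include <stdint.h>

/* Hypothetical helper: counts how many times the body of the 5051
   example loop runs when unsigned arithmetic is exactly 32 bits wide. */
unsigned expected_trip_count(void)
{
    uint32_t a;
    volatile uint32_t b = 0;
    unsigned count = 0;

    for (a = 0xAAAAAAAAu; a >= 0x55555555u; a -= 0x55555555u) {
        b = a;          /* same body as the original example */
        count++;
    }
    (void)b;

    /* a takes the values 0xAAAAAAAA and 0x55555555, then reaches 0 */
    return count;
}
```

Comparing this count with the count observed on the target may help confirm
whether a given build is affected.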


5055     V4.50  will fix next release
  Under some circumstances the optimizer may generate a branch to a location
outside of an RPTB loop, leaving the RM bit set and causing subsequent problems.
This problem may occur when a conditional "return" occurs in a counting loop
that the optimizer can reduce to a RPTB.

e.g.

 len = cluster_p->tmp_code[0];

  for (i=0; i<len; i++)
    if (!isdigit(code[i])) return (FALSE); /* Inlining (-x option) must be
                               enabled for the error to   
                               occur here                 */


 for (i=0; i< num_codes; i++)
 {
   if ((setup->codes[i].num_barcodes) > 1)
     return (FALSE);
   if (++groups[setup->codes[i].group] > 1 )
     return (FALSE);
 }

 for (index = 0; index < 5; index++)
 {
    if ((width[index] < 2) || (width[index] > 7 ))
       return ( NOT_C128_CHAR);
 }

 Workaround:  1. Compile without optimization.

              2. Place a call to a dummy function within the loop.

                 for (index = 0; index < 5; index++)
                 {
                    dummy();
                    if ((width[index] < 2) || (width[index] > 7 ))
                       return ( NOT_C128_CHAR);
                 }

                 dummy() { }

              3. Change the loop bound so that the optimizer cannot use
                 RPTB for the loop.

                 volatile int max = 5;
                 for (index = 0; index < max; index++)
                 {
                    if ((width[index] < 2) || (width[index] > 7 ))
                       return ( NOT_C128_CHAR);
                 }


5060     V4.50  will fix next release
  The compiler may generate floating point operations using illegal registers,
i.e. registers other than R0-R7 (C3x) or R0-R11 (C4x). This may occur when
optimization has been used and the code contains conversions of 
float-to-unsigned.

i.e.

float x,y,z,w;

main()
{  unsigned i,j,k,l;

    i = (unsigned) x;
    j = (unsigned) y;
    k = (unsigned) z;
    l = (unsigned) w;
    .
    .
    .
    .
}


Workaround: Compile without optimization or if the value of the
         unsigned int can really be represented by an int,
         use int vars instead of unsigned.
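
The second workaround can be sketched as follows. This is an illustration
only: the helper names fits_in_int and float_to_unsigned_via_int are
hypothetical, and the range check is an assumption added here so that the
conversion through int is only used where the value is representable.

```c
#include <limits.h>

/* Hypothetical helper: nonzero when v is non-negative and small enough
   to be converted through a signed int. */
int fits_in_int(float v)
{
    /* (float)INT_MAX is typically rounded up to 2^31, hence '<'. */
    return v >= 0.0f && v < (float)INT_MAX;
}

/* Converts through int instead of casting float directly to unsigned,
   so that no float-to-unsigned conversion appears in the source. */
unsigned float_to_unsigned_via_int(float v)
{
    int tmp = (int) v;      /* float-to-int conversion */
    return (unsigned) tmp;  /* int-to-unsigned, no floating point involved */
}
```

A caller would test fits_in_int(x) first and fall back to compiling the
affected file without optimization when the check cannot be guaranteed.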


5091     V4.50  will fix next release
  The compiler may not save the value of IRx across a function call, when
a function call appears on the right-hand side of an assignment statement
and the expression on the left-hand side of the assignment uses IRx to 
calculate the address of the destination.

i.e.

  for  (procNum = 0; procNum < iNUM ; procNum++)
  {
     links[procNum].sendStatus  = (int *) Locate(procNum);
  }


5095     V4.50  will fix next release
  The compiler on rare occasions may generate illegal parallel instructions
to perform a structure assignment.  The parallel instructions will have
an index value for an ARn register that is greater than 1.  The resultant code
will not assemble.


Workaround:  Compile with -mx option.


5096     V4.50  will fix next release
  On rare occasions, when use of RPTS instruction has been disabled and there
is a structure assignment in the code, the compiler may generate a duplicate
label.  The resulting asm file will not assemble.

Workaround: Compile without -mi option.


5121     V4.50  will fix next release
  The compiler fails to generate code for the else clause of a conditional
assignment - or an if-else that the optimizer has reduced to a conditional
assignment under the following circumstances.

    1) The left-hand side of the assignment contains an expression
       that includes a pre/post increment/decrement operation.

    2) Both expressions on the right-hand side of the assignment
       contain pre/post increment/decrement operations, and those
       operations are applied to the same variable.

      i.e.
    
      if ((*p_enc_cam >> 24) != *p_enc_port) {
          *p_enc_buf++ = *p_enc_cam++ | 0x4000;
      }
      else {
          *p_enc_buf++ = *p_enc_cam++;
      }

Workaround: perform the pre/post increment outside of the
         conditional assignment or if-else statement.
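
A minimal sketch of this workaround, applied to the example above, might look
like the following. The pointer names follow the report; the wrapper function
copy_one, its parameter types, and the pointer-to-pointer interface are
assumptions made so that the fragment is self-contained.

```c
/* Hypothetical wrapper around the 5121 example. */
void copy_one(unsigned **pp_enc_buf, unsigned **pp_enc_cam,
              const unsigned *p_enc_port)
{
    unsigned *p_enc_buf = *pp_enc_buf;
    unsigned *p_enc_cam = *pp_enc_cam;

    /* No increment or decrement appears inside the if-else. */
    if ((*p_enc_cam >> 24) != *p_enc_port) {
        *p_enc_buf = *p_enc_cam | 0x4000;
    }
    else {
        *p_enc_buf = *p_enc_cam;
    }

    /* Increments performed once, after the conditional assignment. */
    p_enc_buf++;
    p_enc_cam++;

    *pp_enc_buf = p_enc_buf;
    *pp_enc_cam = p_enc_cam;
}
```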


5204     V4.50  will fix next release
  The compiler may generate an internal error message when trying to
generate code for an expression in which a common index subexpression
appears on both sides of an operation: i.e.    a[expr] * b[expr]
where expr is the same for both indices.

i.e.

main () {
float *obuf, *ibuf;
int idx;

 *obuf += ibuf[idx] * ibuf[idx];   /* Error would occur here */
}


5308     V4.50  will fix next release
  The compiler errs in that it does not perform a store of a pointer value
when the same pointer appears on both sides of an assignment statement.

i.e.

func()
{
   volatile int *p1;

   .
   .
   *p1 = *p1;
}


5360     V4.50  will fix next release
  Under some rare circumstances, the compiler may generate incorrect
code for the statement following a structure copy. This problem 
is most likely to occur when optimization is turned on.

e.g.


typedef struct _stag{
   int          x;
   unsigned     y;
} STRUC_B;

extern STRUC_B     ext;

void func(unsigned int i)
{
   unsigned int j;
   STRUC_B *loc = (STRUC_B *)0;

   *loc = ext;
    j = loc->y;     /* <==== Incorrect code may occur for assignment to j */
}


5514     V4.50  will fix in V4.51
  Conversions from float to unsigned can produce non-existing instructions. 
With these compiler options:

     cl30 -v40 -x2 -mr -s -d__OS2__

this code: 

     unsigned long ulIndex = (unsigned long)usSizeX * 
                    (unsigned long)putLabelElm->dGravPtY +
                             (unsigned long)putLabelElm->dGravPtX;

gets assembled into non-existing instructions.

Where the variables putLabelElm->dGravPtX and putLabelElm->dGravPtY 
are declared as double like:   

     double              dGravPtX, 
                      dGravPtY;

WORK AROUND:
============

/*  First create two temporary variables like this:  */
volatile int TempIntVarXX;  
volatile int TempIntVarYY;  

/*  Then assign values to the temporary variables    */
/*  before using them in assignment to ulIndex.      */
     TempIntVarYY = (int) putLabelElm->dGravPtY;
     TempIntVarXX = (int) putLabelElm->dGravPtX;
     ulIndex = (unsigned long)usSizeX * (unsigned long)TempIntVarYY + 
               (unsigned long)TempIntVarXX;

/*  Note that the TempIntVar's replace the putLabelElm->dGravPt's  */
/* in the assignment statement ulIndex = . . . .                   */


5530     V4.50  will fix in V4.51
  When compiling the statement "val3 = ~val1 | 0xFFFFFFC0;" the second
value 0xFFFFFFC0 gets assembled as a 16-bit value rather than a 32-bit value.
This happens in the following code:

void main ()
{
    unsigned long val1, val2 ,val3;
    val1 = 0x12345678; 
    val2 = val1 | 0xFFFFFFC0;
    val3 = ~val1 | 0xFFFFFFC0; 
}



SOLUTION
======== 

To WORK AROUND this problem, declare a temporary variable to hold the
32 bit value of 0xFFFFFFC0 like this:

void main ()
{
    unsigned long val1, val2 ,val3, temp_val;
    val1 = 0x12345678; 
    val2 = val1 | 0xFFFFFFC0;
    temp_val = 0xFFFFFFC0;
    val3 = ~val1 | temp_val; 
}
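
For comparison, the values that the example should produce when 0xFFFFFFC0 is
treated as a full 32-bit constant can be computed on a host. The sketch below
is an illustration only; the function name expected_5530 and the use of
uint32_t (to force 32-bit arithmetic on hosts where long is wider) are
assumptions, not part of the original report.

```c
#include <stdint.h>

/* Hypothetical check: reference results for the 5530 example. */
void expected_5530(uint32_t *val2, uint32_t *val3)
{
    uint32_t val1 = 0x12345678u;

    *val2 = val1 | 0xFFFFFFC0u;     /* 0xFFFFFFF8 */
    *val3 = ~val1 | 0xFFFFFFC0u;    /* ~val1 = 0xEDCBA987, result 0xFFFFFFC7 */
}
```

If val3 on the target differs from 0xFFFFFFC7 while val2 is correct, the
build is likely showing this problem.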


5590     V4.50  will fix in V4.51
  Code in this form:  x = x->y = z;  should be evaluated as:  x = (x->y = z);
as per ANSI C.   In other words, this should result in z being assigned to 
x->y first and then z being assigned to x.   With optimization turned on, 
however, the compiler rearranges the order of evaluation to be:

     x = z;
     x->y = z;

WORKAROUND:
===========
Break up multiple assignment statements like x = x->y = z; into single
assignment statements like this:

     x->y = z;
     x = z;


5600     V4.50  will fix in V4.51
  The compiler generates incorrect code for the assignment of a
bit field expression to an unsigned long.   This ONLY happens
when the length of the bit field is 24 bits AND when programs
are compiled with the "-v40" option.

This problem is demonstrated in the following code:

 /* first line of program bitfield.ccc  */
typedef unsigned int Uint;
typedef unsigned long Ulong;
 
typedef struct{
   Uint tdma_fn:24;
   Uint err:1;
   Uint taf_:1;
   Uint ab:1;
   Uint sp_:1;
   Uint st:1;
   Uint unused:3;
   }Ych_msgb_ctrl;
 
 
void main(void)
{
Ych_msgb_ctrl ctrl;
Ulong fn;
 
ctrl.tdma_fn=20;
ctrl.ab=1;
fn=ctrl.tdma_fn;  /* Compiler Error : fn is loaded with complete word,  */
                  /* not only 24 bits                                   */
 
}


WORKAROUND
==========
Use an AND instruction to mask the 24 bits.
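
In C source, the mask can be written as shown below. This is a sketch based
on the example above; the accessor name read_tdma_fn is hypothetical, and the
constant 0xFFFFFF is simply the 24-bit width of the tdma_fn field.

```c
typedef unsigned int Uint;
typedef unsigned long Ulong;

typedef struct {
    Uint tdma_fn:24;
    Uint err:1;
    Uint taf_:1;
    Uint ab:1;
    Uint sp_:1;
    Uint st:1;
    Uint unused:3;
} Ych_msgb_ctrl;

/* Hypothetical accessor: reads the 24-bit field and masks it explicitly,
   so that any extra bits loaded from the containing word are cleared. */
Ulong read_tdma_fn(const Ych_msgb_ctrl *ctrl)
{
    return (Ulong) ctrl->tdma_fn & 0xFFFFFFUL;
}
```

In the example program, the assignment would then read
fn = ctrl.tdma_fn & 0xFFFFFF; or fn = read_tdma_fn(&ctrl);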


5621     V4.50  will fix in V4.51
  The expression (h/100)%1000 is not handled properly by the
compiler.  In the following code it gives a result of 5 rather
than 6 as would be expected.

main()
{
unsigned int result;
unsigned int hour;
hour=639;
convert_hour(hour,&result);
}
void output(unsigned int n, unsigned int * p_res)
{
*p_res=n;
}
void convert_hour(unsigned int h, unsigned int * p_result)
{
unsigned int a;
unsigned int b;
a=(h/100)%1000; /* ERROR line result should be 6 as h=639 */
b=(h%100)*10;              /* see note 1 */
output(a,p_result);
output(b,p_result);         /* see note 1 */
/* note 1:                                         */
/* if b=(h%100)*10 and output(b,p_result are suppressed generated code is*/
/* correct and that means a=6                               */
}
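
The expected results for h=639 can be worked out by hand or on a host
compiler. The following sketch is for reference only; the function name
convert_hour_reference and its two output parameters are assumptions and do
not appear in the original test case.

```c
/* Hypothetical host-side reference for the 5621 example. */
void convert_hour_reference(unsigned int h,
                            unsigned int *p_a, unsigned int *p_b)
{
    *p_a = (h / 100) % 1000;   /* 639 / 100 = 6,  6 % 1000 = 6  */
    *p_b = (h % 100) * 10;     /* 639 % 100 = 39, 39 * 10 = 390 */
}
```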


5662     V4.50  will fix in V4.51
  The compiler errs in performing a load of the data page pointer
immediately following a push of the DP to the stack, when it
generates code to save the registers inside of an interrupt
service routine. Note this problem occurs for C40 in big
model only (-v40 -mb options).


5715     V4.40  Fixed in V4.56 
  Version 4.40 of the C30 linker fails to concatenate the input sections
in the order that it encounters them in the input files.  In the following
command file, the .stack output section should contain the .stack section
from rts30.lib followed by all the .stack input sections, but it doesn't.
The linker puts the .stack section from rts30.lib _after_ all the .stack
input sections. THIS BUG HAS BEEN FIXED IN BETA RELEASE 4.56!

MEMORY
{
 SRAM:   o=000000h  l=20000h
 RAM:    o=809800h  l=700h
 STACK:  o=809f00h  l=100h
}
SECTIONS
{
 .text   : {} > SRAM
 .cinit  : {} > SRAM
 .data   : {} > SRAM
 .stack  : {
    rts30.lib(.stack)
    *(.stack) } > STACK
}

Workaround/Solution: 

    Remove the .stack section that is created in the object file
    and allow the linker to handle the location and size of the stack.
OR
    Wait for the V4.60 of the tools and then use the above method.


5782     V4.50  Fixed in V4.56 
  Version 4.50 of the C30 compiler will generate an infinite loop if
the following conditions exist:
        1) a for-loop counter is a pointer to a volatile
   AND  2) there is an if statement in the for-loop with the value
           of the loop counter as a condition
   AND  3) the code is compiled with -x

The following sample code demonstrates this bug.

main()
{
  volatile int *i,*j,dummy; 
  int k=0;

  i=(volatile int *)0x2000;
  j=(volatile int *)0x2010;
  for (i; (i <= j) && (k++ < 20) ; i++) {
      dummy=0;
      if (*i) ;
  }
}


WORK AROUND/SOLUTION:
=====================
If one of the above conditions is not satisfied, then there is no problem. 

This has already been corrected in the next release of the
tools, to be released at a later date.


5862     V4.56  will fix next release
  The optimizer (v4.56 and v4.57) generates incorrect code when it encounters the
following union.  When a member of the union is updated, the other members are not
updated.  In the following code, x.word is originally held in R0.  x.bit.onoff is then
changed and the result is held in R2.  The later read of x.word should therefore use
R2, but R0 is used instead.


Workaround:
     Don't use the optimizer when there's a union.  This bug will be fixed in
a coming release.


typedef union {
    unsigned int     word;
    struct {
     unsigned int     onoff : 1;
    } bit;
}Union ;

 
void bug(void)
{
    Union x;
    unsigned int *iptr;


    x.word = *iptr;
    x.bit.onoff=1;
    *iptr = x.word;
}


-----------BUG---------COMPILED W/OPTIMIZER---------------

******************************************************
*    TMS320C30 C COMPILER     Beta Version 4.56 [Sep 9]
******************************************************
;     ac30 -q y.c y.if 
;     opt30 -q -O2 y.if y.opt 
;     cg30 -q y.opt y.asm y.tmp 
     .version     30
FP     .set          AR3
     .globl     _bug
******************************************************
* FUNCTION DEF : _bug
******************************************************
_bug:
     PUSH     FP
     LDI     SP,FP
     ADDI     1,SP
*
* AR2     assigned to variable  iptr
*
     LDI     *AR2,R0
     STI     R0,*+FP(1)
     LDI     FP,AR0
     LDI     1,R1
     OR     R1,*+AR0(1),R2    >>R2 contains the new result for the union
     STI1     R2,*+AR0(1)
    ||     STI2     R0,*AR2    >>Should be R2 instead of R0
EPI0_1:
     LDI     *-FP(1),R1
     BD     R1
     LDI     *FP,FP
     NOP
     SUBI     3,SP
***     B     R1     ;BRANCH OCCURS
     .end


---------------NO BUGS IF


5943     V4.50  Fixed in V4.56 
  An infinite loop is generated if a switch statement uses the for-loop counter 
as a test.  The following asm code shows that the same register (IR1) is being
used for storing both the loop counter and the switch test.  This only happens when
compiled with optimization (o1 or o2).  This bug has been fixed in version 4.57.



main(){foo();}
foo()
{
  int i,j=0;
  short tab[5] = {0, 2, 3, 4, 5};


  for(i=0; i<5; i++){
    switch(i){
     case 1:
       j += tab[i];
       break;
     case 2:
     case 3:
     case 4:
       j += tab[i];
       break;
    }
  }
  return j;
}

The asm code generated from the above C code:


******************************************************
*    TMS320C40 C COMPILER     Version 4.50
******************************************************
;     ac30 -q -v40 x.c x.if 
;     opt30 -q -v40 -s -O1 x.if x.opt 
;     cg30 -q -o -v40 x.opt x.asm x.tmp 
     .version     40
FP     .set          AR3
     .file     "x.c"

     .sym     _main,_main,36,2,0
     .globl     _main

     .func     1
******************************************************
* FUNCTION DEF : _main
******************************************************
_main:
     PUSH     FP
     LDI     SP,FP
*** 1     -----------------------    foo();
     .line     1
     CALL     _foo
***       -----------------------    return;
EPI0_1:
     .line     1
     LDI     *-FP(1),R1
     LDI     *FP,FP
     SUBI     2,SP
     B     R1
     .endfunc     1,000000000H,0

     .sym     _foo,_foo,36,2,0
     .globl     _foo

     .func     2
******************************************************
* FUNCTION DEF : _foo
******************************************************
_foo:
     PUSH     FP
     LDI     SP,FP
     ADDI     5,SP

     .sect


LINKER
-------------------------------------------------------------------------------

5076     V4.50  will fix next release
  A bug occurs in the linker if you link two files which are both 
assembled from this file:
 
        .global fred
fred    .equ 5
        .sym fred,5,4,2,8
 
The linker generates the error: 
>> internal error: : negative aux table id


6118     V4.50  Fixed in V4.56 
  A .LIB file cannot be named in an input section specification of the
linker command file.


RTS
-------------------------------------------------------------------------------

5104     V4.50  will fix next release
  Note: This is a problem when compiling with SMALL model only.
In several routines in the MATHASM library, the address of a table
is loaded using direct addressing without loading the value of
the data page pointer. Since the address was inadvertently allocated
in .text section rather than .bss, the DP is pointing to the wrong
page.  This leads to erroneous results.

This bug affects the following routines:

acos.asm   asin.asm   sin.asm    sinh.asm    tan.asm   tanh.asm         
atan.asm   atan2.asm  cos.asm    cosh.asm    exp.asm   log.asm
log10.asm  pow.asm    

Workaround:
        1. Extract the source files from mathasm.src
        2. Search for the section comment that says
           "DEFINE CONSTANTS" and make the changes shown
           in the listing that follows these steps.
           (Note: This example is for cos.asm - but
            all of the files require a similar change)
        3. Replace the files in mathasm.src and re-assemble
           the math library.


***********************************************************************
*  DEFINE CONSTANTS
***********************************************************************
COS_ADR:        .word   COS  <===== This is the statement causing
                        the problem.  Delete this statement.

 .if .BIGMODEL == 0                      ;if small model use .bss
          .bss COS_ADR,1  <==== Add these statements 
          .sect ".cinit"  <====
          .word 1,COS_ADR <====
          .word COS       <====

                .bss COS,8
                .sect ".cinit"


5248     V4.50  will fix next release
  The compiler fails to generate code to save the contents of the R10
register before calling the RTS intrinsic function INV_F40.  This
can result in erroneous results since R10 is used by the INV_F40 instruction
and the compiler still assumes that the old value of R10 (before the call)
is still valid.

Workaround:  Edit the INV_F40 routine in the RTS library to
          save and restore R10.

     1. Extract invf.asm from RTS.SRC
          
     ar30 -x rts.src invf.asm

     2. Make the following changes.   

        1. Insert the following two lines after line 125 of
           the file:

           Old Line 125: POP     AR1         ; return address
           New Line 126: PUSH    R10         ; save lower 32 bits of R10
           New Line 127: PUSHF   R10         ; save upper 32 bits of R10

        2. Move lines 137 and 138 before the delayed branch:

           Old line 135: BD    AR1           ; delayed branch to return
           Old line 137: MPYF  R1,R0,R10     ; R10 = v * x[1]
           Old line 138: SUBRF 2.0,R10       ; R10 = 2.0 - v * x[1]

           Becomes:

           New line 135: MPYF  R1,R0,R10     ; R10 = v * x[1]
           New line 136: SUBRF 2.0,R10       ; R10 = 2.0 - v * x[1]
           New line 137: BD    AR1           ; delayed branch to return


        3. Insert the following lines after line 139 to restore R10:

           Old line 139: MPYF   R10,R1,R0    ; R0 = x[2] = x[1] ....
           New line 140: POPF   R10          ; restore upper 32 bits
           New line 141: POP


ASSEMBLER
-------------------------------------------------------------------------------

5188     V4.50  will fix next release
  The assembler does not flag as an error the use of the same register
as both source and destination for the C4x LDA instruction.

i.e.

     LDA  AR0, AR0      ; Assembler should generate an error, since
                        ; this is an illegal instruction.


5706     V4.50  Fixed in V4.60 
  If a C source program defines a C label that has the same name as a 
C function, and the source is compiled with -g option, the assembler
incorrectly combines the two symbols into a single incorrect entry 
within the symbol table.  When the program is linked, this incorrect 
entry will result in the linker aborting with the following error 
message:

>> Relocation symbol not found, index ...  section ...  file ....


DOCUMENTATION
-------------------------------------------------------------------------------

5189     V4.50  will fix next release
  The TMS320C4x User's Guide fails to mention in the description of the LDA
instruction on pg. 11-96, that it is illegal to specify the same register
as both the source and destination register in the LDA instruction.

i.e.

          LDA  AR0, AR0           ;  Is an illegal instruction.


5498     V(current)  will fix in V0.01
  The TMS320C3x User's Guide (Revision F, July 1992), on page 9-28  
in the fourth paragraph states, "If src3 is in external memory and 
src4 is in internal memory, one cycle is necessary to complete two 
reads." This is NOT true.  Two cycles are in fact necessary. 

The third sentence in the third paragraph is also incorrect.   
The sentence should read, "Again, two memory reads are completed
in one cycle."


5746     V(current)  will fix next release
  COFF 'Optional File Header' format given on page A-5
in the 1991 Floating Point Assembly Tools User's Guide
is not correct.  'Entry point address' and 'Beginning of
.text' are swapped.


5759     V(current)  will fix next release
  Example 5-4 on page 5-18 of the 1993 "C3x Source Debugger:
User's Guide" has incorrect ordering of the arguments.

WORK AROUND/SOLUTION:
=====================
The following change needs to be made to the documentation:

Guide gives:   pinc <filename> <pinname>
Correct:       pinc <pinname> <filename>


EMULATOR
-------------------------------------------------------------------------------

5355     V4.60  will fix next release
  On a SUN workstation, emu3x invoked with -p 0 fails; it attempts to open
the device at sd4.


Workaround: Use a device other than sd0.


5359     V4.60  will fix next release
  Note: This is a problem on SUN & PC under SunOS and OS/2

The board configuration path and filename specified after the -f option        
is limited to 32 characters including the '\0' end-of-string character.
Longer names are truncated.

This is especially a problem when the OS expands names as in ~user.


Workaround: Don't use long names.
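
A path can be checked against this limit before it is passed to -f. The
sketch below is illustrative only; the macro CONFIG_PATH_LIMIT and the
function config_path_fits are hypothetical names, and the limit of 32 is the
figure quoted above.

```c
#include <string.h>

/* Limit taken from the report: 32 characters including the '\0'. */
#define CONFIG_PATH_LIMIT 32

/* Hypothetical helper: nonzero when path fits without truncation. */
int config_path_fits(const char *path)
{
    return strlen(path) + 1 <= CONFIG_PATH_LIMIT;
}
```

The check should be applied to the name after any expansion by the OS or
shell (for example of ~user), since it is the expanded name that is truncated.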


5364     V4.60  will fix next release
  Under the following specific circumstance, the debugger may incorrectly 
calculate the number/size of the packets required to perform a transfer 
across the SCSI bus. This problem will affect both the memory save, MS, 
and LOAD commands.  If the length specified in the MS command, or the 
length of the loaded portion of the file given in the LOAD command is such 
that (length%500)%128>116, then the proper number of WORDS will not be 
saved/loaded. 


Workaround: For MS command - 

         1). Break the range of memory specified
             into sizes of appropriate lengths,
             that is, (length%500)%128<=116.

         For LOAD command - 

         2). Check the link map from the link of the object file
             for the size of all loaded sections (UNINITIALIZED
             sections are not loaded) to make sure that the section
             lengths are in bounds, i.e. (length%500)%128<=116.

             If the section length exceeds this figure then pad the
             section to the next 128 word boundary. This may be done
             by altering the link command file SECTIONS directive for
             the corresponding "bad" section.

          i.e.

          badsec : { *(badsec)
                  .+= "pad length" ;   } ....
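
The length test can be automated on a host. The sketch below is an
illustration only: the function names length_is_affected and pad_words_needed
are hypothetical, lengths are assumed to be in words, and searching for the
pad in one-word steps is a choice made for this sketch rather than something
the report prescribes.

```c
/* Condition from the report: a transfer is affected when
   (length % 500) % 128 > 116. */
int length_is_affected(unsigned long length)
{
    return (length % 500UL) % 128UL > 116UL;
}

/* Hypothetical helper: smallest number of pad words to add so that the
   padded length no longer satisfies the condition above. */
unsigned long pad_words_needed(unsigned long length)
{
    unsigned long pad = 0;

    /* Affected remainders are 117..127 past a multiple of 128 within
       each block of 500, so this loop runs at most 11 times. */
    while (length_is_affected(length + pad))
        pad++;

    return pad;
}
```

The returned value could be used as the "pad length" in the link command file
fragment above; re-checking the padded length with length_is_affected is a
cheap safeguard.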


5486     V4.60  will fix in V4.61
  The debugger comes up, but memory and registers cannot be modified and
the target cannot be run.

    CAUSE : emurst was not executed with the correct c?x510ws.out file
            after powering up the XDS510WS and before invoking the debugger.

SOLUTION/WORKAROUND DESCRIPTION : 
    Make sure to execute emurst with the correct c?x510ws.out file
    after powering up the XDS510WS and before invoking the debugger.


5490     V4.60  will fix in V4.61
  The XDS510WS hangs if emu3x is executed on it after it was initialized,
via emurst, with a file other than c3x510ws.out, such as c4x510ws.out.

SOLUTION/WORKAROUND DESCRIPTION:
    Cycle the power on the XDS510WS and re-emurst with the correct
    c?x510ws.out file or simply have enough sense not to execute a
    debugger on an XDS510WS that is not set up for that device.


HEX
-------------------------------------------------------------------------------

5517     V4.50  will fix in V4.51
  The DOS version of "hex30.exe" hangs when the amount of initialized static-
global data is about 12,000 words or more.


XDS510WS
-------------------------------------------------------------------------------

5521     V1.00  will fix in V1.01
  The production version of the XDS510WS fails to respond to probe-scsi or
emurst, nor does its sd? information display when its attached Sparc
reboots.
 
Root cause was that the device ignored the SCSI bus after receiving a SCSI bus
reset.  The bug also exists in the first production release of the C3x
emulator software for the XDS510WS.

SOLUTION/WORKAROUND : 
 
    Don't power-up the XDS510WS until after the Sparc has completed booting,
    or update ROM.


-------------------------------------------------------------------------------
