 _______________________________________________________________________
|                                                                       |
| XILINX APPLICATIONS XAPP004V:  CXSD32-V2.00               BN-3-21-94  |
|_______________________________________________________________________|


README file for the XC3000A Counter CXSD32:
===========================================

Note: A more detailed description of this application can be found in
Section 8 of the Xilinx Data Book.


CXSD32
------

This counter is a 32-bit Loadable Down Counter with a Count Enable (CE),
Parallel Load (PE), and Clock (CLK).  All signals are active-High.  The
operation of the counter is based on 2-bit cells.  The count outputs are
parallel decoded for all zeros and then ANDed in the carry CLBs to form the
carries that advance each set of 2 count bits.  See the application note on
loadable 16- and 32-bit binary counters.  This counter is similar to the
16-bit version, except that it is functionally split into two pieces: the
lower 18 bits with their parallel decoding and carries, and the upper 14
bits with their parallel decoding and carries.  The two pieces are linked
by a carry from the lower 18 bits.
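
The 2-bit cell scheme described above can be sketched as a small behavioral
model.  This is only an illustration and is not derived from the schematic:
the function names, the load data argument d, and the assumption that
Parallel Load takes priority over Count Enable are all hypothetical.

```python
MASK32 = (1 << 32) - 1
LOW_BITS = 18  # lower piece per the README; the upper piece is the other 14 bits


def cell_carries(q):
    """Return one carry flag per 2-bit cell (16 cells, cell 0 = bits 1..0).

    A cell counts down only when every bit below it is zero.  For cells in
    the upper piece, that condition is formed from the single carry out of
    the lower 18 bits ANDed with the zero decode of the upper bits below
    the cell.
    """
    low_zero = (q & ((1 << LOW_BITS) - 1)) == 0  # carry linking the two pieces
    carries = []
    for cell in range(16):
        lsb = 2 * cell
        if lsb < LOW_BITS:
            carries.append((q & ((1 << lsb) - 1)) == 0)
        else:
            upper_below = (q >> LOW_BITS) & ((1 << (lsb - LOW_BITS)) - 1)
            carries.append(low_zero and upper_below == 0)
    return carries


def next_count(q, ce, pe, d):
    """Compute the counter value after one rising clock edge.

    Assumption (not stated in the README): PE overrides CE.
    """
    if pe:
        return d & MASK32
    if not ce:
        return q
    nxt = 0
    for cell, carry in enumerate(cell_carries(q)):
        pair = (q >> (2 * cell)) & 0b11
        if carry:
            pair = (pair - 1) & 0b11  # 2-bit cell counts down, wrapping 00 -> 11
        nxt |= pair << (2 * cell)
    return nxt
```

With CE high and PE low, next_count should agree with an ordinary 32-bit
decrement, including the wrap from all zeros to all ones.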

The speed of the counter is determined by the speed of the carry chain.
A speed of approximately 25 MHz in the XC3000A-6 and 37 MHz in the
XC3100A-3 can be achieved.

Design files included in directory CXSD32:

  README          This README file.
  SCH\CXSD32H.1   Top-level Viewlogic V4.1.3a schematic
  SCH\CXSD32.1    32 Bit Loadable Down Counter              (Sheet 1)
  SCH\CXSD32.2    CLBMAPs for the counter                   (Sheet 2)
  SCH\CXSD32.3    CLBMAPs for the counter                   (Sheet 3)
  SCH\CXSD32.4    CLBMAPs for the counter                   (Sheet 4)
  SYM\*.1         Viewlogic Symbol for Counter
  WIR\*.1         Viewlogic Wire files

  XNF\            Xilinx Netlists for High Level Schematic
  CXSD32H.LCA     Placed and Routed LCA file
  CXSD32H.CST     Constraints file for the CLB placement.
  CXSD32H.XRP     Xdelay Timing report using XC3000A-6

Software Versions used:

  DS390 Version 4.1.3a Viewlogic and Interface

Recommended Layout, Routing:

   Simple floorplanning will significantly improve the performance of any 
design.  The recommended layout is listed in the constraints file CXSD32H.CST. 
The placement is meant to minimize the delays of the Q0-Q15 bits through 
their parallel decoding through the C6, C8, C10, C12, and C16 CLBs. The 
placement is arranged in a rectangle and generally in columns of function from 
left to right.   

The first 8 count bits Q0 through Q7 are in 2 columns and alternating 
left-right up the 2 columns.  The next 2 columns contain the parallel decoding 
for these bits with the CLBs CXSTA/(P0-3, P4-7, P8-11, and P12-15) and the 
carries in CLBs CXSTA/(C6/16, C8/10, and C12/14).  They are placed here to 
be in close proximity to the Q0-7 bits in the left 2 columns and the Q8-15 
bits in the right 2 columns.

The total is a block 4 high and 6 wide.  This is the same as the 16-bit
version of this counter.  The CLBs for bits Q16 and Q17 and the CXSTA/P16-17
CLBs are added just below the Q14 and Q15 CLBs.  A nearly duplicate 4 X 6
group from the 16-bit counter is placed directly above the CLBs for the
lower 18 bits of the counter, except that the two LSBs of this upper counter
are not used since they were placed below the lower 16-bit group.  The
result is that the lower 18 bits are in the lower group of CLBs and the
upper 14 bits are in the upper group of CLBs.

The CXSTA/C18 CLB is placed in the hole in the upper part of the lower group
of CLBs.  From there its inputs CXSTA/(P0-3, P4-7, P8-11, P12-15, and
P16-17) can reach it on short nets, and the CXSTA/C18 net can reach the
CXSTA/(C20/22, C24/26, and C28/30) CLBs on short nets.  This matters because
these blocks participate in the critical path.

The placement is listed below:

place block CXSTA/C6 : HC;
place block CXSTA/C8 : ED;
place block CXSTA/C12 : HD;
place block CXSTA/C18 : EC;
place block CXSTA/C20 : CC;
place block CXSTA/C24 : BD;
place block CXSTA/C28 : DD;
place block CXSTA/P0-1 : FC;
place block CXSTA/P4-5 : GC;
place block CXSTA/P8-9 : FD;
place block CXSTA/P12-13 : GD;
place block CXSTA/P16-17 : ID;
place block CXSTA/P18-19 : BC;
place block CXSTA/P22-23 : DC;
place block CXSTA/P26-27 : CD;
place block CXSTA/P30-31 : DG;
place block TC : CG;
place block Q0 : EA;
place block Q1 : ;
place block Q2 : FA;
place block Q3 : FB;
place block Q4 : GA;
place block Q5 : GB;
place block Q6 : HA;
place block Q7 : HB;
place block Q8 : EE;
place block Q9 : EF;
place block Q10 : FE;
place block Q11 : FF;
place block Q12 : GE;
place block Q13 : GF;
place block Q14 : HE;
place block Q15 : HF;
place block Q16 : IE;
place block Q17 : IF;
place block Q18 : BA;
place block Q19 : BB;
place block Q20 : CA;
place block Q21 : CB;
place block Q22 : DA;
place block Q23 : DB;
place block Q24 : AE;
place block Q25 : AF;
place block Q26 : BE;
place block Q27 : BF;
place block Q28 : CE;
place block Q29 : CF;
place block Q30 : DE;
place block Q31 : DF;
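
A listing like the one above can be checked mechanically before it is used.
The following is a minimal sketch, not a Xilinx tool: the function names are
hypothetical, and it assumes each location is a two-letter CLB name as in
the listing.  It reports blocks that have no location and any location
assigned to more than one block.

```python
import re
from collections import Counter

# Matches lines such as "place block CXSTA/C6 : HC;".  The location is
# optional so that an entry with no location is reported instead of dropped.
PLACE_RE = re.compile(
    r"^\s*place\s+block\s+(\S+)\s*:\s*([A-Z]{2})?\s*;\s*$", re.IGNORECASE
)


def parse_placements(text):
    """Return (placed, unplaced): a name -> location dict and a list of names."""
    placed, unplaced = {}, []
    for line in text.splitlines():
        match = PLACE_RE.match(line)
        if not match:
            continue  # ignore anything that is not a "place block" line
        name, loc = match.group(1), match.group(2)
        if loc:
            placed[name] = loc.upper()
        else:
            unplaced.append(name)
    return placed, unplaced


def duplicate_locations(placed):
    """Return the sorted list of locations assigned to more than one block."""
    counts = Counter(placed.values())
    return sorted(loc for loc, n in counts.items() if n > 1)
```

Calling parse_placements on the text of the constraints file and then
duplicate_locations on the resulting dict gives a quick sanity check of a
hand-edited placement.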

   For maximum performance, some hand routing may be required, although 
PPR will do a very good job on longline assignment and the use of zero delay
routing resources.
 
   The recommended routing is now described.  The longest carry chain is
from the least significant 18 bits of the counter Q0-Q17 through their
respective 4 bit decodes (and 2 bit decode in the case of Q16 and Q17)
through the C18 carry logic and through the C20, C22/24, C26/28, or C30
carry logic and into the CLBs for bits Q18 to Q31.  This is 3 CLB logic
delays and 4 net delays, along with the clock-to-output and setup times,
so this path should be routed first on the vertical long lines.
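
The composition of this path can be written out as a small timing budget.
This is a hedged sketch: the function and parameter names are hypothetical,
and no delay values are supplied here.  Actual delays should be taken from
the XDELAY report.

```python
def clock_period_ns(f_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / f_mhz


def critical_path_ns(t_clock_to_out, clb_delays, net_delays, t_setup):
    """Sum the critical path described in the text.

    The path is clock-to-output, 3 CLB logic delays, 4 net delays, and the
    setup time of the destination flip-flop.  All values are in ns.
    """
    if len(clb_delays) != 3 or len(net_delays) != 4:
        raise ValueError("expected 3 CLB logic delays and 4 net delays")
    return t_clock_to_out + sum(clb_delays) + sum(net_delays) + t_setup


def slack_ns(f_mhz, path_ns):
    """Positive slack means the path fits within one clock period."""
    return clock_period_ns(f_mhz) - path_ns
```

For example, a 25 MHz target leaves a 40 ns budget for the whole path, and
a 37 MHz target leaves about 27 ns.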

Next the CXSTA/CE1 (count enable ORed with parallel enable) line can be 
routed to the .EC pins of the Q0-Q31 CLBs on the vertical long line. 
The remaining critical nets are the Q0 through Q17 counter outputs,
the parallel decode signals P0-3, P4-7, P8-11, P12-15, P16-17, and the C18
through C30 carries.  Do not completely neglect the P18-21, P22-25, P26-29,
and P30-31 nets or the Q18 to Q31 nets: if they are exceedingly long, they
can swamp out the gains made in the critical nets.  The CE1 signal is not
critical, but it is shown in the schematic as long and critical as a
reminder that, for convenience, it is to be routed on long lines to save
resources that other signals could use.

It is recommended that the RoutePin (rp) command in XDE be used to route 
the CXSTA/(P0-3, P4-7, P8-11, P12-15, and P16-17) to the CXSTA/C18 CLB.  
The Q0-Q17 nets should be routed with the RoutePin command to the 
CXSTA/(P0-3, P4-7, P8-11, P12-15, and P16-17) CLBs since these routings 
are critical to the overall performance.  Next the CXSTA/(C18, C20, C22, 
C24, C26, C28, and C30) signals can be routed. 

Performance:

XDELAY was used to report all clock-to-setup paths.  See the .XRP file.

