 _______________________________________________________________________
|                                                                       |
| XILINX APPLICATIONS XAPP004V:  CXSB32-V2.00               BN-3-21-94  |
|_______________________________________________________________________|


README file for the XC3000A Counter CXSB32:
===========================================

Note: A more detailed description of this application can be found in
Section 8 of the Xilinx Data Book.


CXSB32
------

This counter is a 32-bit Loadable Up/Down Counter with a Count Enable (CE),
Parallel Load (PE), Up/Down Control (UP), and Clock (CLK).  All signals are
active-High.  The operation of the counter is based on 2-bit cells.  The count
outputs are decoded in parallel, together with the UP signal, for all zeros
(counting down) or all ones (counting up).  The decoded signals are then ANDed
in the carry CLBs to form the carries that advance each set of 2 count bits.
See the application note on loadable 16- and 32-bit binary counters.  This
counter is similar to the 16-bit version, except that it is functionally
broken into 2 pieces: the lower 18 bits with their parallel decoding and
carries, and the upper 14 bits with their parallel decoding and carries.  The
two pieces are linked by a carry from the lower 18 bits.
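
The carry scheme described above can be illustrated with a small behavioral
model.  This is only a sketch in Python, not the schematic logic: the function
names are hypothetical, the model is flat (it does not reproduce the split
into an 18-bit and a 14-bit piece), and it assumes that a parallel load takes
priority over counting, which this README does not state.

```python
MASK32 = 0xFFFFFFFF

def pair_carries(q, up):
    """Carry into each 2-bit cell: cell k (bits 2k and 2k+1) advances when
    all lower bits are 1 (counting up) or all 0 (counting down)."""
    carries = []
    for cell in range(16):
        mask = (1 << (2 * cell)) - 1      # all bits below this cell
        target = mask if up else 0        # all ones when up, all zeros when down
        carries.append((q & mask) == target)
    return carries

def step(q, ce=True, pe=False, up=True, d=0):
    """Next counter value after one clock edge.
    Assumption: PE (load) takes priority over CE (count)."""
    if pe:
        return d & MASK32
    if not ce:
        return q
    nxt = 0
    for cell, carry in enumerate(pair_carries(q, up)):
        bits = (q >> (2 * cell)) & 0b11
        if carry:
            # advance the 2-bit cell by one, modulo 4
            bits = (bits + (1 if up else -1)) & 0b11
        nxt |= bits << (2 * cell)
    return nxt
```

In this model, pair_carries(q, up)[9] is the carry into bits 18 and 19, which
presumably plays the role of the C18 carry that links the two pieces.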

The speed of the counter is determined by the speed of the carry chain.
A speed of approx. 24 MHz in the XC3000A-6 and 37 MHz in the XC3100A-3 can
be achieved.

Design files included in directory CXSB32:

  README          This README file.
  SCH\CXSB32H.1   Top-level Viewlogic V4.1.3a schematic
  SCH\CXSB32.1    32-Bit Loadable Up/Down Counter           (Sheet 1)
  SCH\CXSB32.2    CLBMAPs for the counter                   (Sheet 2)
  SCH\CXSB32.3    CLBMAPs for the counter                   (Sheet 3)
  SCH\CXSB32.4    CLBMAPs for the counter                   (Sheet 4)
  SYM\*.1         Viewlogic Symbol for Counter
  WIR\*.1         Viewlogic Wire files

  XNF\            Xilinx Netlists for High Level Schematic
  CXSB32H.LCA     Placed and Routed LCA file
  CXSB32H.CST     Constraints file for the CLB placement.
  CXSB32H.XRP     Xdelay timing report using XC3000A-6

Software Versions used:

  DS390 Version 4.1.3a Viewlogic and Interface

Recommended Layout, Routing:

   Simple floorplanning will significantly improve the performance of any
design.  The recommended layout is listed in the constraints file CXSB32H.CST.
The placement described below is meant to minimize the delay along the path
from the Q0-Q17 bits, through their parallel decoding and the C18 CLB, through
the C20 through C30 carry logic, and into the Q18 through Q31 CLBs.

The placement is arranged vertically with the first 18 count bits, Q0 through 
Q17, being in 2 columns and alternating left-right down the 2 columns.  The 
next column contains the parallel decoding for these bits with the CLBs 
CXSTA/(P0-1, P0-3, P4-5, P4-7, P8-9, P8-11, P12-13, P12-15, and P16-17). 
They are placed here to be in close proximity to the Q0-Q17 bits.  In the 
center of the next column adjacent to the parallel decoding CLBs are the 
carry CLBs CXSTA/(C6/16, C18, C8/10, and C12/14).  

In the next column are the carry CLBs, CXSTA/(C20/22, C24/26, and C28/30), for
the upper 14 bits Q18-Q31.  They are placed here because they receive the
CXSTA/C18 net and drive the inputs of the CLBs in the next 2 columns, i.e. the
CLBs for Q18-Q31.  In the last column, after the Q18-Q31 CLBs, is the parallel
decoding for the Q18-Q31 bits.  All CLBs are arranged vertically for the
convenience of the UP line, which must drive all odd count CLBs (i.e. Q1,
Q3, .. Q31) and all parallel decode CLBs CXSTA/(PXX-YY).

The placement is listed below:

place block CXSTA/C6 : DD;
place block CXSTA/C8 : FD;
place block CXSTA/C12 : GD;
place block CXSTA/C18 : ED;
place block CXSTA/C20 : CE;
place block CXSTA/C24 : EE;
place block CXSTA/C28 : GE;
place block CXSTA/P0-1 : AC;
place block CXSTA/P0-3 : CC;
place block CXSTA/P4-5 : BC;
place block CXSTA/P4-7 : DC;
place block CXSTA/P8-9 : EC;
place block CXSTA/P8-11 : FC;
place block CXSTA/P12-13 : IC;
place block CXSTA/P12-15 : GC;
place block CXSTA/P16-17 : HC;
place block CXSTA/P18-19 : BH;
place block CXSTA/P18-21 : CH;
place block CXSTA/P22-23 : DH;
place block CXSTA/P22-25 : EH;
place block CXSTA/P26-27 : FH;
place block CXSTA/P26-29 : GH;
place block CXSTA/P30-31 : HH;
place block TC : IE;
place block Q0 : AA;
place block Q1 : AB;
place block Q2 : BA;
place block Q3 : BB;
place block Q4 : CA;
place block Q5 : CB;
place block Q6 : DA;
place block Q7 : DB;
place block Q8 : EA;
place block Q9 : EB;
place block Q10 : FA;
place block Q11 : FB;
place block Q12 : GA;
place block Q13 : GB;
place block Q14 : HA;
place block Q15 : HB;
place block Q16 : IA;
place block Q17 : IB;
place block Q18 : BF;
place block Q19 : BG;
place block Q20 : CF;
place block Q21 : CG;
place block Q22 : DF;
place block Q23 : DG;
place block Q24 : EF;
place block Q25 : EG;
place block Q26 : FF;
place block Q27 : FG;
place block Q28 : GF;
place block Q29 : GG;
place block Q30 : HF;
place block Q31 : HG;
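
A list like the one above is easy to check by script before it is handed to
the placement tools.  The following Python sketch parses "place block" lines,
reports any CLB location that is assigned twice, and prints a rough grid.  The
function names are hypothetical, and the reading of a location such as AA
(first letter as row, second letter as column) is inferred from the list
above, where Q0 and Q1 sit side by side at AA and AB.

```python
import re
from collections import defaultdict

# Matches lines of the form: place block NAME : XY;
PLACE_RE = re.compile(
    r"^\s*place\s+block\s+(\S+)\s*:\s*([A-Z]{2})\s*;\s*$", re.IGNORECASE
)

def parse_placements(text):
    """Return {block_name: location} for every 'place block' line."""
    placements = {}
    for line in text.splitlines():
        m = PLACE_RE.match(line)
        if m:
            placements[m.group(1)] = m.group(2).upper()
    return placements

def find_conflicts(placements):
    """Return {location: [blocks]} for locations used more than once."""
    by_loc = defaultdict(list)
    for block, loc in placements.items():
        by_loc[loc].append(block)
    return {loc: blocks for loc, blocks in by_loc.items() if len(blocks) > 1}

def render_grid(placements):
    """Text grid of the placement (assumes row letter first, then column)."""
    rows = sorted({loc[0] for loc in placements.values()})
    cols = sorted({loc[1] for loc in placements.values()})
    by_loc = {loc: block.split("/")[-1] for block, loc in placements.items()}
    lines = []
    for r in rows:
        lines.append(" ".join(f"{by_loc.get(r + c, '.'):>7}" for c in cols))
    return "\n".join(lines)

# A short excerpt of the list above, used only as sample input.
SAMPLE = """\
place block CXSTA/C18 : ED;
place block CXSTA/P0-3 : CC;
place block Q0 : AA;
place block Q1 : AB;
"""

if __name__ == "__main__":
    placed = parse_placements(SAMPLE)
    print(find_conflicts(placed) or "no conflicting locations")
    print(render_grid(placed))
```

To check the full placement, the contents of CXSB32H.CST can be read into a
string and passed to parse_placements in place of SAMPLE.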

   For maximum performance, some hand routing may be required, although 
PPR will do a very good job on longline assignment and the use of zero delay
routing resources.
 
   The recommended routing is now described.  The longest carry chain is
from the least significant 18 bits of the counter, Q0-Q17, and the UP signal
through their respective 4 bit decodes (and 2 bit decode in the case of
Q16 and Q17) through the C18 carry logic and through the C20, C22/24, C26/28,
or C30 carry logic and into the CLBs for bits Q18 to Q31.  This is 3 CLB
logic delays and 4 Net delays along with the clock to output and setup time.
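
The path just described translates into a simple delay budget.  The Python
sketch below only shows the arithmetic; the function and parameter names are
hypothetical, and the numbers in the example are placeholders chosen for
illustration.  They are NOT data sheet values and are not taken from
CXSB32H.XRP.

```python
def min_clock_period_ns(t_cko, t_ilo, t_setup, net_delays, clb_levels=3):
    """Clock-to-output + CLB logic levels + net delays + setup, all in ns."""
    return t_cko + clb_levels * t_ilo + sum(net_delays) + t_setup

def max_frequency_mhz(period_ns):
    """Convert a clock period in ns to a frequency in MHz."""
    return 1000.0 / period_ns

# Placeholder numbers only; substitute the values reported by XDELAY.
period = min_clock_period_ns(
    t_cko=5.0,                        # clock to output (placeholder)
    t_ilo=5.0,                        # one CLB logic delay (placeholder)
    t_setup=4.0,                      # setup time at the destination (placeholder)
    net_delays=[3.0, 3.0, 3.0, 3.0],  # the 4 net delays (placeholders)
)
print(f"{period:.1f} ns -> {max_frequency_mhz(period):.1f} MHz")
```

For reference, the approx. 24 MHz quoted for the XC3000A-6 corresponds to a
clock period of about 41.7 ns, so the sum of all the terms above has to fit
within that budget.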

Since the UP net is critical, all CLBs that use the UP signal have access to
it on a vertical long line at the .C or .E inputs on the CLBs, so this net
should be routed first on the vertical long lines.  Next, the CXSTA/CE1
(count enable ORed with parallel enable) line can be routed to the .EC pins
of the Q0-Q31 CLBs on the vertical long line to keep it out of the way.  The
remaining critical nets are the Q0 through Q17 counter outputs, the parallel
decode signals P0-3, P4-7, P8-11, P12-15, P16-17, and the C18 through C30
carries.  The P18-21, P22-25, P26-29, and P30-31 nets and the Q18 to Q31 nets
should not be neglected completely: if they are exceedingly long, they can
swamp out the gains made in the critical nets.

   The CE1 signal is not critical, but it is shown in the schematic as long
and critical as a reminder that, for convenience, it is to be routed on long
lines to save resources that other signals could use.  It is recommended that
the RoutePin (rp) command in XACT be used to route the CXSTA/(P0-3, P4-7,
P8-11, P12-15, and P16-17) nets to the CXSTA/C18 CLB.  The Q0-Q17 nets should
also be routed with the RoutePin command to the CXSTA/(P0-3, P4-7, P8-11,
P12-15, and P16-17) CLBs, since these routings are critical to the overall
performance.  Next, the CXSTA/(C18, C20, C22, C24, C26, C28, and C30) signals
can be routed.


Performance:

XDELAY was used to report all clock-to-set-up paths.  See the CXSB32H.XRP file.

