 _______________________________________________________________________
|                                                                       |
| XILINX APPLICATIONS XAPP004V:  CXSB16-V2.00              BN-3-21-94   |
|_______________________________________________________________________|


README file for the XC3000A Counter CXSB16:
===========================================

Note: A more detailed description of this application can be found in
Section 8 of the Xilinx Data Book.

CXSB16
------

This counter is a 16-bit Loadable Up/Down Counter with a Count Enable (CE),
Parallel Enable (PE), and Clock (CLK).  All signals are active-High.  The
operation of the counter is based on 2-bit cells.  The count outputs are
parallel decoded for all ones and then ANDed in the carry CLBs to form
the carries that advance each set of 2 count bits.  See the application note
on loadable 16- and 32-bit binary counters.  
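
The 2-bit cell operation described above can be sketched as a small
behavioral model.  This is only an illustration, not the schematic itself.
It assumes that PE (load) overrides CE (count), and that when counting down
the decode looks for all zeros instead of all ones; neither point is stated
in this README.  The names pair_carry and next_state are made up for this
sketch.

```python
MASK = 0xFFFF  # 16-bit counter


def pair_carry(q, pair, up):
    """Carry into 2-bit cell `pair` (0..7).

    Assumption: the carry is true when every lower count bit is 1
    (counting up) or 0 (counting down).  Cell 0 has no lower bits,
    so it always advances.
    """
    low_mask = (1 << (2 * pair)) - 1
    low = q & low_mask
    return low == (low_mask if up else 0)


def next_state(q, d, ce, pe, up):
    """Counter value after one rising clock edge.

    Assumption: PE (parallel load of d) takes priority over CE.
    """
    if pe:
        return d & MASK
    if not ce:
        return q
    nxt = 0
    for pair in range(8):
        cell = (q >> (2 * pair)) & 0b11
        if pair_carry(q, pair, up):
            # Advance this 2-bit cell by one, wrapping within the cell.
            cell = (cell + (1 if up else -1)) & 0b11
        nxt |= cell << (2 * pair)
    return nxt
```

Stepping this model with CE high gives the same result as adding or
subtracting 1 modulo 65536, which is a quick check that the per-cell
carries are decoded consistently.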

The speed of the counter is determined by the speed of the carry chain.
Speeds of approximately 31 MHz in the XC3000A-6 and 46 MHz in the XC3100A-3
can be achieved.

Design files included in directory CXSB16:

  README          This README file.
  SCH\CXSB16H.1   Top-level Viewlogic V4.1.3a schematic
  SCH\CXSB16.1    16-Bit Loadable Up/Down Counter           (Sheet 1)
  SCH\CXSB16.2    CLBMAPs for the counter                   (Sheet 2)
  SYM\*.1         Viewlogic Symbol for Counter
  WIR\*.1         Viewlogic Wire files

  XNF\            Xilinx Netlists for High Level Schematic
  CXSB16H.LCA     Placed and Routed LCA file
  CXSB16H.CST     Constraints file for the CLB placement.
  CXSB16H.SCP     Schematics Constraints

Software Versions used:
  DS390 Version 4.1.3a Viewlogic and Interface

Recommended Layout, Routing:

   Simple floorplanning will significantly improve the performance of any
design.  The recommended layout is listed in the constraints file CXSB16H.CST.
The placement is shown in the application note and is meant to minimize the
delays from the Q0-Q15 bits through their parallel decoding and on through the
C6, C8, C10, C12, and C16 CLBs.  The placement is arranged vertically, with
the 16 count bits Q0 through Q15 in 2 columns, alternating left-right down
the 2 columns.

The next column contains the parallel decoding for these bits with the CLBs
CXSTA/(P0-1, P0-3, P4-5, P4-7, P8-9, P8-11, P12-13, and P12-15).  They are
placed here to be in close proximity to the Q0-Q15 bits.  In the center of
the next column adjacent to the parallel decoding CLBs are the carry CLBs
CXSTA/(C6/16, C8/10, and C12/14).  All CLBs are arranged vertically for the
convenience of the UP line, which must drive all odd count CLBs (i.e., Q1, Q3,
... Q15) and all parallel decode CLBs CXSTA/(PXX-YY).

The placement is listed below:

place block CXSTA/C6 : DD;
place block CXSTA/C8 : FD;
place block CXSTA/C12 : GD;
place block CXSTA/P0-1 : AC;
place block CXSTA/P0-3 : CC;
place block CXSTA/P4-5 : BC;
place block CXSTA/P4-7 : DC;
place block CXSTA/P8-9 : EC;
place block CXSTA/P8-11 : FC;
place block CXSTA/P12-13 : HC;
place block CXSTA/P12-15 : GC;
place block Q0 : AA;
place block Q1 : AB;
place block Q2 : BA;
place block Q3 : BB;
place block Q4 : CA;
place block Q5 : CB;
place block Q6 : DA;
place block Q7 : DB;
place block Q8 : EA;
place block Q9 : EB;
place block Q10 : FA;
place block Q11 : FB;
place block Q12 : GA;
place block Q13 : GB;
place block Q14 : HA;
place block Q15 : HB;
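
The Q0-Q15 entries above follow a regular pattern (two bits per row,
alternating between the first two columns), so they can be regenerated
with a short script.  This is a sketch only.  It assumes that each location
is a row letter followed by a column letter, and the function name
count_bit_placements is hypothetical.  The decode and carry CLB placements
are not regular and are not generated here.

```python
def count_bit_placements(n_bits=16, rows="ABCDEFGH", cols="AB"):
    """Return 'place block' lines for the count bits Q0..Q(n_bits-1).

    Assumption: a location is <row letter><column letter>, so Q0 and Q1
    share row A, Q2 and Q3 share row B, and so on down the device.
    """
    lines = []
    for bit in range(n_bits):
        row = rows[bit // len(cols)]   # two count bits per row
        col = cols[bit % len(cols)]    # even bits left, odd bits right
        lines.append(f"place block Q{bit} : {row}{col};")
    return lines


if __name__ == "__main__":
    print("\n".join(count_bit_placements()))
```

The output can be compared line by line against the Q0-Q15 entries in
CXSB16H.CST.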

   For maximum performance, some hand routing may be required, although 
PPR will do a very good job on longline assignment and the use of zero delay
routing resources.
 
   The recommended routing is now described.  The longest carry chain runs
from the 16 counter bits Q0-Q15 and the UP signal through their respective
4-bit decodes, through the C6/16, C8/10, or C12/14 carry logic, and into the
CLBs for bits Q6 to Q15.  This is 2 CLB logic delays and 3 net delays, along
with the clock-to-output and setup times.
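
The critical path described above can be written as a simple timing budget:
clock-to-output, plus 2 CLB logic delays, plus 3 net delays, plus setup.
The sketch below only shows the arithmetic.  The function names are
hypothetical and the delay values in the example are placeholders, not
data sheet or XDELAY numbers; use the figures from the .XRP report for a
real estimate.

```python
def period_ns(freq_mhz):
    """Clock period in ns for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def critical_path_ns(t_cko, t_ilo, t_setup, net_delays):
    """Clock-to-output + 2 CLB logic delays + 3 net delays + setup (ns)."""
    if len(net_delays) != 3:
        raise ValueError("the path described here has exactly 3 net delays")
    return t_cko + 2 * t_ilo + sum(net_delays) + t_setup


def fmax_mhz(path_ns):
    """Maximum clock frequency in MHz for a register-to-register path."""
    return 1000.0 / path_ns


if __name__ == "__main__":
    # Placeholder delays in ns, chosen only to exercise the formula.
    path = critical_path_ns(t_cko=4.0, t_ilo=5.0, t_setup=4.0,
                            net_delays=[2.0, 2.0, 2.0])
    print(f"path = {path:.1f} ns, fmax = {fmax_mhz(path):.1f} MHz")
    print(f"31 MHz corresponds to a {period_ns(31):.2f} ns period")
```

Read the other way, the approximately 31 MHz quoted for the XC3000A-6
means the whole path must fit in a period of roughly 32 ns.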

Since the UP net is critical, all CLBs that use the UP signal have access to
it on a vertical long line at the .C or .E inputs on the CLBs.  This net
should be routed first on the vertical long lines.  Next, the CXSTA/CE1
(count enable ORed with parallel enable) line can be routed to the .EC pins
of the Q0-Q15 CLBs on the vertical long line to keep it out of the way.  The
remaining critical nets are the Q0 through Q15 counter outputs, the parallel
decode signals P0-3, P4-7, and P8-11, and the C6 through C14 carries.  The
other parallel decodes (P0-1, P4-5, P8-9, and P12-13) are also critical, but
because each of them goes to only one place, they are routed after the above
nets are done.

The CE1 signal is not critical, but it is shown in the schematic as long
and critical as a reminder that it is to be routed on long lines to save
resources that other signals could use.  It is recommended that the RoutePin
(rp) command in XDE be used to route the CXSTA/(P0-3, P4-7, P8-11, and
P12-15) nets to the CXSTA/(C6/16, C8/10, and C12/14) CLBs.  The Q0-Q15 nets
should also be routed with the RoutePin command to the CXSTA/(P0-3, P4-7,
P8-11, and P12-15) CLBs, since these routes are critical to the overall
performance.  Next, the CXSTA/(C6, C8, C10, C12, C14, and TC) signals can be
routed.

Performance:

XDELAY was used to report all clock-to-set-up paths. See the .XRP file.
