         Title LRU Massive Cache buffer for SDOS
*
*
**********<copy LRU Cache spec here>
*
* Implementation:
*
* The LRU cache mechanism keeps track of sectors by use of Sector
* Descriptor Blocks (SDBs) and Sector Buffer Blocks (SBBs).
* An SBB holds a sector full of information that belongs to a disk.
* SDBs hold information about SBBs, such as the disk to which the SBB belongs
* (the DCB address), where on the disk the sector contents belong (the LSN),
* whether the SBB is "dirty" (needs to be moved back to the disk),
* some aging ("Least Recently Used") information, and some binary-tree links
* used to help search for a particular LSN, and lastly, a Sector Buffer
* Block Selector code (see below).  An SDBSELECTOR of zero is reserved
* for use as an "empty/end list" marker.
*
* To manage a large pool, the cache mechanism allows up to 65536 SDBs
* to be active simultaneously.  Since the CPU can only manipulate 16
* bit items conveniently, SDBs are referenced by a 16 bit SDBSELECTOR
* code; this code is multiplied by the size of the SDB and added to
* the 24 bit real memory address of the base of the cache pool to
* find the SDB contents.  The LRU cache mechanism cannot conveniently
* modify an SDB in the pool; instead, it fetches the SDB from the pool
* into a local buffer, manipulates it, and places the SDB back.
* Since SDBs are small, this process is acceptably fast.
* The part of the pool containing SDBs is allocated at system boot
* time, and is never used for SBBs.  For details on how an individual
* SDB is allocated, see the code itself.
*
* A sector's size is always a power of two and, by SDOS design, is
* at least 128 bytes.  Rather than allocate SBBs to be large
* enough to hold the largest size sector on a system, the cache manager
* dynamically allocates power-of-two multiples of 128 bytes on
* demand from the balance of the cache pool. The allocated block
* is always placed on a boundary of 128 bytes, and is referenced by
* a 16 bit SBBSELECTOR code.  This code is multiplied by 128 and
* added to the 24 bit real address which designates the top of the SDB area.
*
* The number of SDBs allocated is enough to ensure that if the entire
* cache area is filled with the minimal size sector configured for
* this system (usually 128 byte sectors), there will be exactly
* one SDB for each SBB.  With systems that have drives with larger sectors,
* at any instant in time there will typically be unused SDBs; but
* memory is cheap enough so this doesn't bother us.  Thus the scheme can
* handle a cache which stores 65536 * 128 = 2^23 bytes = 8 Mbytes
* (not counting SDBs).
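*
* A worked sizing example (illustrative figures only, not taken from any
* particular configuration): with a 1 Mbyte SBB area and a minimum
* sector size of 128 bytes,
*      number of SDBs = 1048576 / 128 = 8192
* If the drives on that system actually use 256 byte sectors, at most
*      1048576 / 256 = 4096
* SBBs can exist at once, so at least 4096 of the 8192 SDBs are idle.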
*
* The memory under an SDOS system is likely to be banked,
* with a common region available to all banks.  Thus we need
* a method of splitting SBBs and SDBs across gaps in the address
* space.  The ideal method is to pre-allocate the SBBs and SDBs that
* straddle windows so that they are never used.  This is accomplished
* by calling the allocator at power up time with (address,length)
* pairs that specify regions of storage that are to be pre-allocated.
* Then an SBB selector can be converted to a real address by simply left
* shifting the 16 bit quantity 7 times, producing a 24 bit result with
* the 64K bank number in the upper 8 bits and the offset of the block
* buffer within that 64K bank in the lower 16 bits.
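*
* Worked example of the shift (the selector value is arbitrary, chosen
* only for illustration):
*      SBB selector               = $1235
*      $1235 shifted left 7 times = $091A80
*      bank number                = $09    (upper 8 bits)
*      offset in bank             = $1A80  (lower 16 bits)
* Shifting the 24 bit quantity (selector,0) right once gives the same
* result as shifting the selector left 7 times.  Note that the low bit
* of the selector lands in bit 7 of the low byte of the offset.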
*
lru:convertsbbtoreal ; convert SBB selector to (bank,offset)
         clr   tempx+1                 ; makes (a,b,tempx+1) into 24 bits
         lsra                          ; right shift to obtain bank number
         rorb                          ; right shift upper 8 bits of offset
         ror   tempx+1                 ; low bit of selector becomes bit 7 of offset LSB
         stb   tempx                   ; save upper 8 bits of offset
         ldx   tempx                   ; get offset into selected bank
         rts                           ; all done

lru:convertsdbtoreal ; convert SDB selector to (bank,offset)
; multiply (A,B) by 25 (=16 + 8 + 1) and place into (A,TEMPX)
; Note that it is very unusual to have large SDBs, so this code
; is designed to operate quickly for small SDB numbers
; Note also that there can be 65536 SDBs, so carries are important!
         <this is a miserable idea. Who gave it to you?>
         std   tempx+2                 ; save original value
         clr   tempx                   ; so (tempx,a,b)=1*entry value
         asl   tempx+3                 ; 2* entry value
         rol   tempx+2
         bcc   ?                       ; b/ MSB not set
         ; add 32768*24 to sum.
         asl   tempx+3                 ; 4*entry value
         rol   tempx+2
         bcc   ?                       ; b/ MSB not set
         ; add 16384*24 to sum.
         asl   tempx+3                 ; 8*entry value
         rol   tempx+2
         bcc   ?                       ; b/ MSB not set
         ; add 16384*24 to sum.
         addd  tempx+2                 ; add 8*entry value to sum
         bcc   lru:convertsdbtoreal8nc ; b/ no carry
         inc   tempx                   ; propagate carry
lru:convertsdbtoreal8nc
         asl   tempx+3                 ; 16*entry value
         rol   tempx+2
         bcc   ?                       ; b/ MSB not set
         ; add 16384*24 to sum.
         addd  tempx+2                 ; add 16*entry value to sum
         bcc   lru:convertsdbtoreal16nc ; b/ no carry
         inc   tempx                   ; propagate carry
lru:convertsdbtoreal16nc




; multiply (A,B) by 32 and place into (A,TEMPX)
         clr   tempx+1                 ; (6~) now (a,b,tempx+1)=256*(a,b) on entry
         lsra                          ; (2~) 128 * (a,b) on entry
         rorb                          ; (2~)
         ror   tempx+1                 ; (6~) rotate carry from B into low byte
         lsra                          ; (2~) 64 * (a,b) on entry
         rorb                          ; (2~)
         ror   tempx+1                 ; (6~)
         lsra                          ; (2~) 32 * (a,b) on entry
         rorb                          ; (2~)
         ror   tempx+1                 ; (6~)
         stb   tempx                   ; (4~) so TEMPX has offset
         ldx   tempx                   ; (5~) exit with (A)=bank, (X)=offset
                                       ;-------
                                       ; (45~)

         jmp   fetchsdb                ; (4~) jump to common space
         leax  16,x                    ; set source to middle of SDB desired
         leay  16,y                    ; set destination to middle of SDB to fill
         ldab  #SDBcommon/256          ; (2~) get page number of scratch area
         tfr   b,dpr                   ; (7~) set page register
         setdpr SDBcommon              ; tell assembler where page zero is
         orcc  #%01010000              ; (3~) shut off interrupt system
         staa  bankselect              ; (6~) select bank
; repeat the following 16 times (fetches 32 bytes)
         ldd   -16,x                   ; (5+1~)
         std   SDBcommon+...           ; (5~)

; then...
         clr   bankselect              ; (6~) select system space again
         andcc #\%01010000             ; (3~) re-enable interrupts
         clrb                          ; (2~) select page zero again
         tfr   b,dpr                   ; (7~) set page register
         setdpr SDBcommon              ; tell assembler where page zero is
; repeat the following 16 times (stores 32 bytes)
         ldd   SDBcommon+...           ; (5~)
         std   -16+...,y               ; (5+1~)
                                       ; = 16*(2*11)+45+20=420~ --> 210uS.
         rts
         page
; The following routines must be defined by the I/O package
; with some cleverness, they can probably re-use the /MT primitive
; block move logic

; lru:storecachesdb ; put SDB in system space back into cache area
; (D) holds selector for SDB
; (X) points to buffer holding SDB in system space
; computes real 24 bit address from (D) and copies buffer content there
; The amount of data to move is CACHESDB:SIZE bytes

; lru:fetchcachesdb ; fetch SDB from cache area
; (D) holds selector for SDB
; (X) points to buffer in system space to which the SDB must be copied
; computes real 24 bit address from (D) and copies buffer content from there
; The amount of data to move is CACHESDB:SIZE bytes

; lru:fetchsbbpointer ; fetch next SBB descriptor
; Entered with TEMPX holding buffer selector
; exits with (D) holding 1st 2 bytes of SBB

; lru:storesbbpointer ; store SBB descriptor
; Entered with (D) holding 16 bits to store into 1st 2 bytes of SBB
; TEMPX holding SBB selector

; BlockMoveFromCache ; copy sector of data from LRU cache
; Entered with (X) pointing to target buffer in system space
; (D) holds # bytes to move
; TEMPX holds buffer selector (offset into cache area divided by 128)

; BlockMoveToCache ; copy sector of data to LRU cache
; Entered with (X) pointing to source buffer in system space
; (D) holds # bytes to move
; TEMPX holds buffer selector (offset into cache area divided by 128)

         org   0
; Define CACHE Sector Descriptor Block (CACHESDB:) offsets
CACHESDB:DCB   RMB  2                  pointer to DCB of drive owning sector
CACHESDB:LSN   RMB  3                  Logical Sector Number of sector
CACHESDB:BUFFER RMB 2                  = Offset into cache/128 of sector buffer
CACHESDB:STATE RMB  1                  = One of (Clean, Dirty, Free)
CACHESDB:FLINK RMB  2                  SDB selector for older sector in cache
CACHESDB:BLINK RMB  2                  SDB selector for younger sector in cache
CACHESDB:RSON  RMB  2                  SDB selector of same-drive right son or zero
CACHESDB:LSON  RMB  2                  SDB selector of same-drive left son or zero
CACHESDB:CYLINDER RMB 2                Physical cylinder on drive for sector
CACHESDB:TRACK RMB 2                   Physical track in cylinder for sector
CACHESDB:SECTOR RMB 2                  Physical sector in track for sector
CACHESDB:TIME  RMB  3                  Timestamp of last reference to sector
;                                      Indicates how long block has been dirty
;                                      So we can write any blocks > threshold age
CACHESDB:DFLINK RMB 2                  Dirty block list for same-drive
CACHESDB:FBLINK RMB 2                  Dirty block list for same-drive
; Note: a Dirty Block DCB chain keeps track of which DCB has oldest
; dirty block chain attached to it.

CACHESDB:SIZE  EQU  *
         if    CACHESDB:SIZE>>16
         ? ; Cache Sector Descriptor Block exceeds 16 bytes in size
         fin
We need an LRU chain for all sectors in cache, to determine which
sector to heave out.
We need a binary tree on which to hang sectors with matching hash codes,
so that we can perform associative retrieval of sectors.
We need chains of clean sectors from a particular disk drive, and
dirty sectors from a particular disk drive, so that we can dump/dismount
a drive easily.
We need cylinder oriented, LSN sequenced lists to make efficient dumping
possible. 15 heads * 32 sectors/track --> 480 sectors/cylinder!
Clearly an efficient management technique is required.
We have already been given a powerful discriminant on entry: the
DCB pointer.

>>>>>>>> Give THE FOLLOWING MORE THOUGHT!
Why not attach ordered list of sectors to DCBs?
65536 SDBs max --> binary tree has at most 16 levels.
Perhaps a better scheme is to allow each DCB to have a built-in table
containing hash buckets; then really huge disks can have really huge
hash tables (256 slots).  If buckets are ordered by LSN, then
"same cylinder" determination must only cross to neighboring buckets.
Since accesses to disk are typically sequential, we must take extreme
care to ensure we don't build right-leaning trees.  But we don't want
real balanced trees, because overhead of maintaining them is high.
Need DIRTYLIST per DCB, ordered by LSN (sigh... balanced again?) to
ensure efficient flush algorithm.
>>>> for LRUCACHE:FLUSH, don't we need to thread the binary trees ?
>>> No. Just go thru disk-specific list, extracting dirty sectors,
>>>>sort and THEN flush.
>>> Keep SDBs sorted by LRU to determine which one to throw out.
>>> Keep SDBs simultaneously sorted by LSN so that when we determine
>>> which SDB to dump, we can determine sequentially increasing LSNs
>>> with same cylinder number trivially.
>>> For DISMOUNT, delete un-dirty sectors, then sort and flush.
LRUCACHE:OLDESTSDB ; contains Selector for oldest Sector Descriptor Block
         fdb   changed                 initially zero --> none
LRUCACHE:MOSTRECENTSDB ; contains Selector for most recently referenced SDB
         fdb   changed                 initially zero --> none

         ifund LRU:HASHTABLE:SIZE
LRU:HASHTABLE:SIZE equ  256
         fin

LRU:HASHTABLE ; Used to index lists of CACHESDBs quickly
; This table is indexed by a function of the desired sector's DCB/LSN information
; Each slot points to an (un)balanced tree of CACHESDBs.
         rpt   LRU:HASHTABLE:SIZE
         FDB   CHANGED                 0 --> Empty slot
         page
lru:determinehashslot ; from LSN specified by DCB
; Returns (X) pointing to HashTable slot
; (D) holds contents of hash table slot
; Z set if hash table slot is empty
         ldx   RDSIpointer             fetch descriptor for desired sector
         ldd   RDSI:DSKINFO,x          fetch DCB pointer for desired sector
         std   DCBPOINTER              and store in scratchpad for fast search
; ?? Is the above instruction really necessary ?
         ldd   RDSI:LSN,x              fetch upper 16 bits of LSN
         std   tempx+2                 and store in scratchpad for fast search
         ldab  RDSI:LSN+2,x            use LSB as hash code
         stb   tempx+4                 save in scratchpad for fast search
;        eorb  RDSI:LSN+1,x            hash it up good
;        eorb  RDSI:LSN,x
         andb  #LRU:HASHTABLE:SIZE-1   mask to match hash table
         if    m6801!m6811
         ldx   #LRU:HASHTABLE          set to index hash table
         abx                           form index into hash table
         abx                           (double index to obtain word offset)
         elseif m6809
         ldx   #LRU:HASHTABLE          set to index hash table
         abx                           form index into hash table (ABX adds B unsigned;
         abx                           LEAX B,X would sign-extend B for slots 128..255)
         else  (m6800)
         clra                          double to form word index
         aslb
         rola
         addb  #LRU:HASHTABLE&$FF      add to base of table
         adca  #LRU:HASHTABLE/256
         staa  tempx
         stab  tempx+1
         ldx   tempx                   ...whew!
         fin
         ; Now (X) points to Hash Table entry
         ldd   0,x                     anything in hash table entry ?
         rts
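; Worked example of the slot computation above (the LSN value is
; arbitrary, and LRU:HASHTABLE:SIZE is assumed to be its default of 256):
;        desired LSN            = $012345
;        hash code (LSB of LSN) = $45
;        masked with SIZE-1     = $45 & $FF = $45
;        slot address           = LRU:HASHTABLE + 2*$45 = LRU:HASHTABLE+$8A
; Sequential LSNs therefore land in adjacent slots, and two LSNs share
; a slot only when they differ by a multiple of 256.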
         page
LRU:FINDSECTOR ; entered with RDSIpointer pointing to SECTORDB:
; Exits with carry reset --> found sector and filled SECTORDB:RDSI with data
; Exit with carry set --> can't find sector in cache
         bsr   lru:determinehashslot   find out which hash slot desired
         beq   LRU:FINDSECTORCANT      b/ slot empty, FINDSECTOR operation failed
LRU:FINDSECTORloop ; (D) holds CACHESDB selector code
         ldx   #CACHESDB               (3~) where to fetch SDB to in system bank
         jsr   lru:fetchcachesdb       (150~?) fetch cache sector descriptor block
         ; Since the CACHESDBs are organized as a binary tree, we wish
         ; to make some attempt to keep the trees balanced.  The balancing
         ; is performed using the 40 bit number composed of (DCB,LSN)
         ; as the "key" in the binary tree.  A bad heuristic is to treat
         ; the DCB as the most significant 16 bits, because that places
         ; all the sectors from one drive down one branch of the tree,
         ; with all the sectors from other drives down the other branch.
         ; Since SDOS systems typically have only one large drive on them,
         ; most of the sectors will all be tagged with the same DCB and
         ; therefore the first branch of the tree would almost always
         ; be wasted in terms of its ability to divide the sector population
         ; of the tree in half, slowing us down.
         ; So we treat the 40 bit number as (LSN,DCB) with LSN as most
         ; significant 24 bits.
         ; If we were to treat the DCB as the upper 16 bits, we would not
         ; have to compare it until the entire LSN matches; but to
         ; go down the tree, we must compute the 40 bit difference between
         ; desired (DCB,LSN) and this SDB's (DCB,LSN) anyway, so where
         ; we do the subtraction is really immaterial.  Finally, the
         ; time to fetch each SDB from the pool dominates the compare
         ; time significantly, so no great gain would be made by eliminating
         ; the DCB comparisons.
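         ; Worked example of the key comparison (values are illustrative):
         ;   desired (LSN,DCB)    = ($000102,$4000) --> key $0001024000
         ;   this SDB's (LSN,DCB) = ($000101,$4000) --> key $0001014000
         ;   desired - this SDB   = $0000010000, no borrow
         ;   --> not equal, carry clear --> follow the right son (RSON).
         ; Had the SDB held LSN $000103 instead, the subtraction would
         ; borrow (carry set) and the search would follow the left son.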
         ldd   DCBPOINTER              (5~) same DCB ?
         if    m6800
         subb  CACHESDB:DCB+1          (5~) (lower 8 bits of DCB match?)
         bne   lru:findsectornext1     (3~) b/ no
         sbca  CACHESDB:DCB            (5~) (upper 8 bits of DCB match?)
         else  (m6801!m6811!m6809)
         subd  CACHESDB:DCB            (6~)
         fin
         bne   lru:findsectornext2     (3~) b/ no (usually doesn't branch)
         ldaa  tempx+4                 (4~) LSB of LSN match ?
         sbca  CACHESDB:LSN+2          (5~) ...?
         bne   lru:findsectornext3     (3~) b/ no (usually branches)
         ldd   tempx+2                 (5~) Top/Middle 16 bits of LSN match ?
         sbcb  CACHESDB:LSN+1          (5~) (Middle 8 bits of LSN match?)
         bne   lru:findsectornext4     (3~) b/ no
         sbca  CACHESDB:LSN            (5~) (Top 8 bits of LSN match?)
         bcs   lru:findsectorlessthan  (3~) b/ go down left son
         beq   lru:foundsector         (3~) b/ we found desired CACHESDB!
lru:findsectorgreaterthan ; follow RSON pointer
         ldd   CACHESDB:RSON           (6~) follow right son pointer
         bned  LRU:FINDSECTORloop      (3~) b/ there is a right son!
LRU:FINDSECTORCANT ; FINDSECTOR operation failed
         errorrts                      signal failure

; The following code computes the sign of the difference between desired LSN
; and actual LSN in CACHESDB just examined
;        ldd   DCBPOINTER              (5~) fetch lowest 16 bits of desired key (the DCB)
;        subb  CACHESDB:DCB+1          (5~) compute difference to determine sign
         if    m6800
lru:findsectornext1 ; finish computing difference
         sbca  CACHESDB:DCB            (5~) (upper 8 bits of DCB match?)
         fin
lru:findsectornext2 ; finish computing difference
         ldaa  tempx+4                 (4~) LSB of LSN match ?
         sbca  CACHESDB:LSN+2          (5~) ...?
lru:findsectornext3 ; finish computing difference
         ldd   tempx+2                 (5~) Top/Middle 16 bits of LSN match ?
         sbcb  CACHESDB:LSN+1          (5~) (Middle 8 bits of LSN match?)
lru:findsectornext4 ; finish computing difference
         sbca  CACHESDB:LSN            (5~) (Top 8 bits of LSN match?)
         ; Now carry bit tells us which way to go down the tree.
         bcc   lru:findsectorgreaterthan (3~) b/ follow right son pointer
lru:findsectorlessthan ; follow left son
         ldd   CACHESDB:LSON           (6~) follow left son pointer
         ; Assuming the DCB pointers always match, and that LSB of LSN
         ; never matches, it takes 50~ to get to here for each CACHESDB.
         ; Add 150~ to fetch each cache block as overhead for 200~/block
         ; For Average case = 1Mb, 256 byte sectors, 128 slots in hash table,
         ; tree balanced, --> 32 sectors/slot --> 5 tree compares --> 1000~
         ; Average case with 2Mb, 128 byte sectors, 256 slots in hash table,
         ; and tree reasonably balanced --> 1200~
         ; Worst case = 2Mb, 128 byte sectors, 256 slots in hash table,
         ; and tree right-leaning --> 64 sectors/slot --> 12800~ --> 6 mS.
         ; (With a large cache, it is probably worth reading a track from
         ; the drive if the heads are over the proper cylinder!)
         bned  LRU:FINDSECTORloop      (3~) b/ there is a left son!
         errorrts                      signal failure

lru:foundsector ; we found desired CACHESDB!
         ; by golly, we found the CACHESDB that matches!
; ??? just because we found it in the cache doesn't mean we want its content.
; It might mean we want to UPDATE its content from SDOS's buffer pool.
         ldx   RDSIpointer             point back at descriptor for desired sector
         ldaa  CACHESDB:STATE          copy state into RDSI
         staa  RDSI:STATE,x            (records if dirty or clean)
         clr   CACHESDB:STATE          no point in both (pool,cache) knowing "dirty"
         ldd   CACHESDB:CYLINDER       record cylinder address
         std   RDSI:CYLINDER,x
         ldd   CACHESDB:TRACK
         std   RDSI:TRACK,x
         ldd   CACHESDB:SECTOR
         std   RDSI:SECTOR,x
         lda   SDOS+SDOS:CLOCK         timestamp the SDB
         sta   CACHESDB:TIME           (??? Perhaps only need to do this...
         ldd   SDOS+SDOS:CLOCK+1       on a write??)
         std   CACHESDB:TIME+1
         ; Now fetch sector content from pool.
         if    m6800!m6801
         ldd   RDSI:BUFFER,x           record TO location
         std   tempx
         else  (m6811!m6809)
         ldy   RDSI:BUFFER,x           record TO location
         fin
         ldx   RDSI:DCB,x              determine number of bytes to move
         ldd   DSKINFO:NBPS,x          fetch transfer count
         ldx   CACHESDB:BUFFER         get "FROM" offset (upper 16 bits)
         jsr   BlockMoveFromCache      move data from Cache to Buffer pool
; Make the found SDB be the first SDB in the list.
; Three cases: 1) SDB is ALREADY the first in list.
;              2) SDB is the LAST in the list (but is not first)
;              3) SDB is in middle of list.
; These complications arise because list head is not a CACHESDB.
; Sometimes life is just complicated.
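; A sketch of the three cases in pseudocode, as read from the code that
; follows.  "found", "prev" and "next" are names used only in this sketch:
; found is the SDB just located, prev = found.BLINK (younger neighbour),
; next = found.FLINK (older neighbour).
;   1) prev = 0 (found is already most recent): nothing to do.
;   2) found = OLDESTSDB:   OLDESTSDB := prev
;   3) otherwise (middle):  prev.FLINK := next
;                           next.BLINK := prev
;   In cases 2 and 3, found then moves to the front:
;                           found.BLINK := 0
;                           found.FLINK := MOSTRECENTSDB
;                           MOSTRECENTSDB.BLINK := found
;                           MOSTRECENTSDB := found
;   (For case 2 the BLINK update of the former most recent SDB is
;   assumed to match case 3.)
; Every update to an SDB other than the one in the local buffer costs a
; fetchcachesdb/storecachesdb pair.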
>>> SHOULDNT WE SHUFFLE THE DIRTY BLOCK CHAIN AROUND ALSO?
         ldd   CACHESDB:BLINK          make this SDB the first in the list
         ; The following branch is unlikely to be taken...
         ; but it removes a lot of cases to do it this way.
         beq   lru:foundfirst          b/ no need to shuffle stuff around
         ldx   #0                      set BLINK of new most recent...
         stx   CACHESDB:BLINK          to zero (head of list)
; Note: BLINK of most recent SDB is ALWAYS zero (we do this because
; we always must update the most recent SDB, thus zeroing the slot
; doesn't really add very much extra time to the process).
; Note that FLINK of oldest SDB is simply garbage (because to
; make FLINK of oldest SDB zero is simply extra work that we can avoid).
         ldx   LRUCACHE:OLDESTSDB      is found block the oldest SDB in cache ?
         cmpx  tempx+2                 ...?
         beq   lru:foundlast           b/ SDB is last in list (rarely!)
         ; Found SDB must be in middle of list --> can't have edge conditions.
         pshd                          (remember who preceded this SDB)
         ldd   CACHESDB:FLINK          make it the first SDB in list
         pshd                          (remember who followed this SDB)
         ldx   LRUCACHE:MOSTRECENTSDB  points to old front of LRU list
         stx   CACHESDB:FLINK          make this SDB point to former front
         ldd   tempx+2                 selector for this SDB
         std   LRUCACHE:MOSTRECENTSDB  make this SDB first in LRU list
         ldx   #CACHESDB               which SDB to move back to buffer space
         jsr   lru:storecachesdb       put updated SDB back into pool area
         ldd   CACHESDB:FLINK          make former most recent...
         std   tempx+4                 (remember selector for former most recent)
         ldx   #CACHESDB               where to fetch former most recent SDB to
         jsr   lru:fetchcachesdb       point back to new most recent
         ldd   tempx+2                 selector for new most recent
         std   CACHESDB:BLINK
         ldd   tempx+4                 selector for former most recent
         ldx   #CACHESDB               which SDB to move back to buffer space
         jsr   lru:storecachesdb       move updated SDB back to cache space
         ; Now unlink found SDB from its old place in the LRU chain,
         ; by making former previous SDB point to former next SDB and vice-versa
         puld                          SDB that followed the one we found
         std   tempx+4
         ldx   #CACHESDB               which SDB to fetch
         jsr   lru:fetchcachesdb       get it from cache space
         puld                          SDB that preceded one we found
         std   tempx+2                 make former next SDB...
         std   CACHESDB:BLINK          point back to former previous SDB
         ldd   tempx+4                 descriptor for former next SDB
         ldx   #CACHESDB               which SDB to move back to buffer space
         jsr   lru:storecachesdb       put it back
         ldd   tempx+2                 selector for former previous SDB
         ldx   #CACHESDB               where to fetch to
         jsr   lru:fetchcachesdb       get it from cache space
         ldd   tempx+4                 pointer to former next SDB
         std   CACHESDB:FLINK          update (former) previous to point to next
         ldx   #CACHESDB               which SDB to move back to buffer space
         jsr   lru:storecachesdb       save back in cache space
lru:foundfirst ; SDB found is already first in list
         okrts                         signal "success!"

lru:foundlast ; SDB found is last on list, and is not first on list.
         std   LRUCACHE:OLDESTSDB      remember new "oldest SDB"
         ldx   LRUCACHE:MOSTRECENTSDB  points to old front of LRU list
         stx   CACHESDB:FLINK          make this SDB point to former front
         ldd   tempx+2                 selector for this SDB
         std   LRUCACHE:MOSTRECENTSDB  make this SDB first in LRU list
         ldx   #CACHESDB               which SDB to move back to buffer space
         jsr   lru:storecachesdb       put updated SDB back into pool area
         ldd   CACHESDB:FLINK          make former most recen