January, 1991
EDITORIAL


The Right Thing To Do




Jonathan Erickson


Every now and then, the right things get done for the right reasons -- and at
just the right time. So it is this month as we launch a major series of
articles by Bill and Lynne Jolitz. The project they'll be describing involves
porting BSD Unix to the 80386/486 platform. Working with the Computer Systems
Research Group (CSRG) at the University of California at Berkeley, Bill
conducted a clean-room port of 4.3BSD Unix from its VAX roots to its new home
on the PC.
There are a number of significant points here. For one thing, you won't need
tens, if not hundreds, of thousands of dollars in hardware to use the
operating system. Less apparent, but perhaps more important, is that 386BSD
(as Bill refers to the port) will be free of AT&T code -- the only license
required will be that issued by the University of California. There are many
other issues of consequence, including the degree of MS-DOS "cohabitation"
that enables shared disk space so that DOS and Unix can exchange information,
but I'll let Bill and Lynne tell you about that.
This obviously hasn't been a trivial project. So why would anyone undertake
such a massive project? It's simple: To broaden the base for BSD Unix and, in
the words of one CSRG member, "to make Unix the right way."
When the history of Unix is written, CSRG will be cast in the role of an
information clearinghouse and source of innovation in Unix technology. The
shadow cast by the group and some of its graduates (who include Bill Joy) has
provided Unix with much of the success and momentum it has today. And,
although it's less well known, much of the underlying technology of systems
like Mach, System V Release 4, and virtually every other version of Unix is
based upon or contains some Berkeley code.
Over the years, CSRG has faithfully adhered to its fundamental goal -- to
learn, educate and innovate within a community of like-minded researchers,
scholars, and programmers. This has led to, among other things, early
development of features such as virtual memory, networked file systems, and a
standard visual editor.
It's important to give credit where credit is due. Bill's work couldn't have
been undertaken without Michael Karels, Kirk McKusick, Keith Bostic, and many
others. Unraveling technical knots has often been the least of their concerns,
what with political, legal, and bureaucratic battles to contend with.
Nevertheless, they've stuck by their consoles and deserve whatever accolades
come their way.
By the time you finish reading this first installment, the $64 question you'll
be asking is how you can get your hands on 386BSD. Although the specific
details are still being worked out, the general approach is that there will be
no restrictions on who can license the source -- including the kernel and all
utilities. There will be a nominal license fee and UC notices must remain in
the code. Bill and Lynne will keep you posted on the status of this as the
series unfolds.


Coming to Terms with Software Design


The first 386BSD article, which focuses on designing the specification, fits
right in with this month's theme -- software design. Like just about
everything else around us, terminology evolves and the term "software design"
is no exception. In this issue, we're taking a look at the topic from three
different, yet related, perspectives: The traditional view of how programmers
go about designing a program or system; the use of software engineering (CASE)
tools that automate and structure the process to some degree; and the
implications of software design from the user's perspective, the most recent
evolution of the term as embodied in Mitch Kapor's "software design
manifesto."


My How Time Flies...


I certainly can't close without saying that this is a special issue of a
special magazine. Fifteen years ago this month, Dr. Dobb's Journal of Computer
Calisthenics & Orthodontia: Running Light Without Overbyte rolled off the
presses and into your hands. Since then, DDJ has watched the technology
evolve, creativity flourish, and, from Tiny Basic to 386BSD, has played a part
in the process.
As proud as we are of DDJ, this issue is really a celebration in honor of the
progress made in the art and science of computer programming -- and the people
responsible for those advances.
Computers, and the people who build and program them, truly have changed the
world. By building on the foundations shaped by generations of scientists,
programmers have given us fresh ways of looking at what's already there, and
enabled us to see things in ways never thought possible. They've freed us from
the drudgery that those before us accepted as the norm, and have given us ways
of enjoying newfound freedom.
DDJ's role in all of this is, and always has been, simply to provide a means of
sharing information that makes all of this wonderment possible -- "realizable
fantasies" is how an early issue of DDJ put it. Thanks to you, the past has
been amazing and I genuinely hope you're with us to see how the future
unfolds.


























January, 1991
LETTERS







Hashing It All Out


Dear DDJ,
I really enjoyed your recent article on Bloom Filters, "An Existential
Dictionary" by Edwin T. Floyd (November 1990); studying hashing techniques is
one of my favorite forms of recreation. I was, however, so astonished by
Floyd's assertion that a 32-bit CRC function was not an adequate hashing
function that I had to duplicate his collision test for myself. Sure enough, I
also saw "thousands of collisions." But then I realized that I was using a
16-bit counter to produce the "10-digit ASCII numbers" used in the test; most
of my "collisions" were really the result of my counter wrapping at 64K and
producing duplicate keys! With that little problem solved I measured 12
collisions in a test with 373,380 keys; Floyd's equation for expected
collisions predicts 17. I have no idea if Floyd's test suffered from the same
bug as mine, but I learned long ago that when a test produces results that are
at great variance with theory, you should take a real good look at the test
before you start to doubt the theory.
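
The arithmetic behind this paragraph can be sketched in C. Floyd's own equation is not reproduced in this letter, so the sketch assumes the usual birthday approximation: n(n-1)/2m expected colliding pairs for n distinct keys hashed uniformly into m values. The function names are hypothetical. Under that assumption, 373,380 keys and a 32-bit hash give a little over 16 expected collisions, in line with the 17 quoted above, while a key counter that wraps at 64K guarantees hundreds of thousands of duplicate keys.

```c
/* Expected colliding pairs when n distinct keys hash uniformly into
 * 2^bits values (birthday approximation; an assumption, not Floyd's
 * published equation). */
static double expectedCollisions(double n, int bits)
{
    double m = 1.0;
    int i;

    for (i = 0; i < bits; ++i)
        m *= 2.0;                     /* m = 2^bits possible hash values */
    return n * (n - 1.0) / (2.0 * m);
}

/* Keys that must repeat when n keys come from a counter that wraps
 * after `period` distinct values. Every repeated key hashes to the
 * same value as its first occurrence, so each one shows up as a
 * "collision" in the test. */
static unsigned long duplicateKeys(unsigned long n, unsigned long period)
{
    return (n > period) ? n - period : 0UL;
}
```

Calling expectedCollisions(373380.0, 32) gives the theoretical figure for a good 32-bit hash, and duplicateKeys(373380UL, 65536UL) gives the number of repeats forced by a 16-bit counter.
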
Some ten hours of crunching on Knuth's "Algorithm S (Percentage Points for
Collision Test)" reveals that with probability 0.99 there will be at most 60
collisions in Floyd's collision test; Floyd's data shows that he exceeded this
value in seven of 30 tests. The odds against that happening with a good hash
function are something like 50 million to one! There is also a probability of
0.22 that there will be at most 39 collisions, yet Floyd's tests never got a
value that low. He consistently gets about ten too many collisions. There seem
to be three possibilities: 1. Floyd's hashing algorithm doesn't do a very good
job of turning his word list into a list of random numbers; 2. his word list
has about 48,000 more words than he thinks it does; 3. his word list contains
about ten duplicate words.
While there is nothing wrong with Floyd's scheme of hashing a key to produce a
seed and then inserting the seed into the Bloom Filter, the same results are
obtainable with much less work. A CRC generator really does make a pretty good
hashing function, and it's very easy to generate a string of hash values by
appending zeros to the end of the key. The first hash value is the CRC of the
key, the second is the CRC of the key with a zero appended, the third is the
CRC of the key with two zeros appended, etc. In practice you don't actually
recompute the CRC of the key each time, you simply crank one more byte through
the CRC process. Given a function that accepts a pointer to a string of bytes,
a byte count, and an initial value, and returns a CRC, the code fragment in
Example 1 inserts a key into a Bloom Filter.
Example 1

 hash = crc ( key, strlen ( key ), 0 );
 setBit ( hash % hashMod );
 for( i = 1; i < 14; ++i ) {
     hash = crc( "\0", 1, hash );   /* crank one zero byte through the CRC */
     setBit( hash % hashMod );
 }

When I used this method on Floyd's "practical test" I measured 39 false drops
on a list of 93,345 words; the number predicted by theory is 40. As usual,
simplicity brings speed; this technique, all coded in C, inserts 280 keys per
second on an old 6-MHz AT, 590 on a 10-MHz 286, and 1678 on a 20-MHz 386. The
simple optimization in Example 2, which moves the incremental CRC calculation
in-line, brings those numbers up to 340, 700, and 1995, respectively.
Example 2

 hash = crc ( key, strlen ( key ), 0 );
 setBit ( hash % hashMod );
 for( i = 1; i < 14; ++i ) {
     hash = ( crcTable [ hash & 0xFF ] ) ^ ( hash >> 8 );   /* in-line zero-byte step */
     setBit( hash % hashMod );
 }
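
For readers who want to try the technique, the sketch below fills in the pieces that Examples 1 and 2 take for granted. The table-driven reflected CRC-32 (polynomial 0xEDB88320), the filter size, and the helper names initCrcTable, testBit, insertKey, and lookupKey are assumptions made for illustration, not the letter writer's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILTER_BITS 1000003UL   /* hypothetical filter size (hashMod) */
#define NUM_HASHES  14          /* hash values per key, as in Example 1 */

static uint32_t crcTable[256];
static unsigned char filter[FILTER_BITS / 8 + 1];

/* Build the lookup table for the reflected CRC-32 polynomial. */
static void initCrcTable(void)
{
    uint32_t n, c;
    int k;

    for (n = 0; n < 256; ++n) {
        c = n;
        for (k = 0; k < 8; ++k)
            c = (c & 1) ? (0xEDB88320UL ^ (c >> 1)) : (c >> 1);
        crcTable[n] = c;
    }
}

/* Crank len bytes through the CRC, starting from init (no final inversion). */
static uint32_t crc(const char *buf, size_t len, uint32_t init)
{
    uint32_t c = init;
    size_t i;

    for (i = 0; i < len; ++i)
        c = crcTable[(c ^ (unsigned char)buf[i]) & 0xFF] ^ (c >> 8);
    return c;
}

static void setBit(uint32_t bit)
{
    assert(bit < FILTER_BITS);
    filter[bit >> 3] |= (unsigned char)(1u << (bit & 7));
}

static int testBit(uint32_t bit)
{
    return (filter[bit >> 3] >> (bit & 7)) & 1;
}

/* Set NUM_HASHES bits: the CRC of the key, then one more zero byte each time. */
static void insertKey(const char *key)
{
    uint32_t hash = crc(key, strlen(key), 0);
    int i;

    setBit(hash % FILTER_BITS);
    for (i = 1; i < NUM_HASHES; ++i) {
        hash = crcTable[hash & 0xFF] ^ (hash >> 8);
        setBit(hash % FILTER_BITS);
    }
}

/* Return 0 if the key is definitely absent, 1 if it is probably present. */
static int lookupKey(const char *key)
{
    uint32_t hash = crc(key, strlen(key), 0);
    int i;

    if (!testBit(hash % FILTER_BITS))
        return 0;
    for (i = 1; i < NUM_HASHES; ++i) {
        hash = crcTable[hash & 0xFF] ^ (hash >> 8);
        if (!testBit(hash % FILTER_BITS))
            return 0;
    }
    return 1;
}
```

Call initCrcTable() once, then insertKey() for each dictionary word. Because the appended byte is zero, the in-line step crcTable[hash & 0xFF] ^ (hash >> 8) produces the same value as crc("\0", 1, hash), which is why the two examples set the same bits.
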

Finally, I must mention that Doug McIlroy was not only aware of Bloom Filters
when he wrote the spelling checker mentioned in the article, he was improving
on a spelling checker that used a Bloom Filter! The solution Floyd proposes
takes 25 percent more memory than McIlroy's; more memory, in fact, than was
addressable in the PDP-11 it was written for.
John A. Murphy
Performance Technology
San Antonio, Texas
Edwin replies: Mr. Murphy is correct; there was a bug in my 32-bit CRC test,
though not the one he describes. After I corrected the CRC routine (by
inserting a 1-byte instruction: CLD) the collision test showed 41 collisions
where theory predicts 45, and the practical test showed 46 false drops where
theory predicts 47. Encouraged, I rewrote the algorithms in assembler and
reran the benchmarks. My implementation now inserts 850 keys per second on an
8-MHz V-20 and 4941 on a 20-MHz 386!
I have to say I was troubled from the beginning by the complexity of my
hashing algorithm, but it was the first one that worked with anywhere near the
predicted test results. Thanks to John, I believe we now have a better, faster
hashing algorithm. My faith in the power of publication to improve our art is
renewed. I've improved DICT.PAS and the assembler source, BLOOM.ASM, for the
high-speed CRC and bit set/test routines; this code is available in the DDJ
Forum on CompuServe and on M&T's Telepath online service.
Finally, I realize that Doug McIlroy must have been aware of Bloom Filters
though I didn't know his spelling checker was an improvement on one. It's
amusing that I chose his "improvement" to illustrate the technique. My
solution does take more space than Doug's, as I pointed out in the article,
but with that space we buy the ability to update the dictionary
instantaneously. It's a classic trade-off.


Following Up on Software Patents


Dear DDJ,
The Dr. Dobb's Journal, November 1990 article titled "Software Patents" by The
League for Programming Freedom brings to mind the saying that you never can
appreciate the problems of others until they become your own. The present
woeful state of software patents is nothing new to engineers and scientists
who have dealt with the patent "system" in the past. The article's conclusion
that patents should not be granted to any form of software hits the nail on
the head.
It is well known that the patent office, and in many cases the same examiner,
will issue duplicate patents for the exact same invention by different
applicants within a few months or even weeks of each other, making the title
"patent examiner" an oxymoron. It is common knowledge that at least half of
the patents granted, in all areas, are of questionable integrity since they
bear strong ties to the obvious and/or prior art. With this past record as
prologue, that the U.S. Patent Office can pretend to delve deeply into the
superficial layers of an application's obviousness and conflict with prior art
strains credulity. That the U.S. Patent Office is incompetent is a moot point
since the office will never be competent with any level of staffing. The U.S.
Patent Office only can be a rubber stamp issuing entry visas into the U.S.
legal system and guaranteeing job security for lawyers and patent examiners.
What better excuse to raise taxes?
Why is this? Patents are applied for by two types of applicants. The first is
genuine and thinks their idea/invention is new and original (which it may/may
not be) while the second, a "patent system parasite," simply applies for
patents strategically where it is believed that one may be granted, even when
the application is known to be based upon prior art and/or is obvious. The
only people that can be considered competent to review a patent application in
a specific area are those people who currently or in the past have worked in
areas related to the patent application. This type of peer review, while
impossible to achieve in the present patent system, would still have obvious
limits. Yet we have something better, a system of illiterate stone age
cave-person judges, juries and lawyers. Can we ever escape the dark ages?
The U.S. Constitution not only sets forth the law for patents, it is the base
of all law, and all things technical or mundane fall under the law. Aye, there
lies the rub. Law, lawyers, the legal "system" (read casino) are the final
controllers of all our lives and, as things progress, thoughts. This fact
flies in the face of a responsible community of engineers, scientists, and
citizens who naturally expect more of the U.S. Government. What we have is an
intellectual dichotomy as vast and penal as the dark ages. The U.S. law and
legal system is not interested in scientific progress, freedom, or anything
except the propagation of laws and the power of government, right or wrong.
Unfortunately this sad tale has repeated itself for thousands of years and
hundreds of generations, being the Achilles' heel of the most intelligent
mammals on this planet. The more things change the more they stay the same, or
what is this thing we call "civilization" anyway?
To argue that the patent system "protects" inventors is like saying the Mafia
protects small family businesses. Sure it does, but at what cost? The granter
of patents, the U.S. Constitution, also leaves answers to the problem in the
keyword freedom. Freedom, that modern anachronism, usually gets the short
shrift when the fittest creature surviving, able to squelch any innovation if
it threatens an agency or fee, is the U.S. bureaucracy. To look to Congress to
solve this problem is to ask the largest body of conflict of interest to cut
its own throat; granted, an exaggeration, but today one's wallet is
anatomically connected to the throat. Talk about virtual realities!
Too bad the Eastern Europeans are turning to the U.S. for legal advice. Let's
hope they don't repeat our errors and learn from our mistakes. Step right up
and place your bets in the legal casino of life.
Tari Taricco, President
Taricco Corp.
San Pedro, California
Dear DDJ,

It's great to see DDJ return to its historic visionary role with that software
patents article. We get absorbed in the minutiae of what we do so easily, and
we really need to pay attention to the big world around us. As a
fundamentalist of the old-time religion of assembly language, I'm as guilty as
they get on that point.
There are other issues just as worthy of some hell raising: The collapse of
hardware standards, the ownership of the dominant operating system by a
secretive and erratic private company, the next-year's-Chevyism of minimal
upgrades.... So keep it up.
Instead of refusing to cooperate in patent applications, we might try the
inundation approach. Small companies could require every programmer to put
each week's work into a patent application -- sort of like kids stockpiling
snowballs for a fight. The resulting applications would have no less merit
than most of the existing patents. In fact, that's such a good idea I should
probably try to patent it.
John Sprung
Viacom Productions
Universal City, California
Dear DDJ,
Your article on patents (November 1990) was very timely and informative. As a
software developer in a small startup company, the potential restrictions
imposed by arbitrary patents are disconcerting to say the least. I propose
much of the problem stems from the misinterpretation of what software is in
the mind of the public -- which is reflected in the minds of the bureaucracies
and legislative bodies. As pointed out in the article, patents apply
specifically to things, not abstract entities. The confusion seems to arise
because software is so intimately tied to a machine, which is definitely a
patentable thing. The (mistaken) perception that software is a patentable
thing derives from this close association.
Perhaps the better way to think of software is as a literature equivalent: it
is not built, it is written. The end result is not a process or a widget, but
something that is read by a machine to produce widgets and objects (even if
they are on a screen). We can copyright something that is written, but we
can't copyright or patent the principles and techniques by which they are
written.
For instance, if the field of music were suddenly a brand new discipline, and
in the evolution of this discipline the notation principles of black notes on
staves of five lines each (with all the sharp and flat signatures, etc.) were
developed by an individual or an organization, would that individual or
organization be entitled to a patent or even a copyright? Could someone patent
a B minor chord? Or a cadence leading to a resolution? What would happen to
the field of music if such practices were allowed?
Similarly, in the field of literature, would it be possible to patent a
construct such as a poem? Iambic pentameter? Chapters? Table of Contents?
Clearly, these are the fundamental concepts and techniques which are
indigenous to the field of literature and make its advance possible: the
allegory to algorithm is not inevident or accidental.
There is certainly protection for authors who create a specific piece of
literature, music, or art through the copyright process. This has worked well
(mostly) in these fields and, if the software allegory to literature can be
accepted, would work equally well for it.
Software has the unique capability of producing further literature, to be
consumed by machines or humans, and if this literature can be specifically
described, it should be subject to copyright rules as well. The key here is
specifically described. This would include (and is specifically intended to
provide for) menu and user interface screens. All menu and UIF constructs are
finite state -- there are only so many specific combinations possible,
allowing perhaps specific rules for certain menus and windows to be
repositioned. Under this construct, a software company should be able to
render complete descriptions of every state of its UIF and, if it meets the
tests of specificity, be granted a copyright for that UIF. This should make
people like Apple and Lotus happy, without rendering proprietary the
techniques by which those screens were created which are, in fact, part of the
software development discipline.
In short, I am agreeing with the statements and definitions provided in the
article, but suggest that software be considered as a new type of literature.
This should be easily provable: no software can produce anything without a
machine to read it. This position would retain the tenets adopted by the
authors, but provide a graspable allegory for the nontechnically minded
officials and legislators who are, fortunately or otherwise, the ones that
have to be convinced.
Rick Berger
Sedona Software
San Diego, California


An Accidental Tourist...


Dear DDJ,
I am not a programmer, amateur or professional, but a semi-retired physicist,
electronischer, inventor, and patent buff who is much more at ease bashing tin
or slinging solder than writing the strictly ordered poetry of a computer
program. Oh, all right, I do cook up some Basic, with a few lines of machine
code thrown in, to do donkey-work (modeling the on-axis performance of any
horizontal-axis windmill using a Sinclair ZX-81 and only 16K; or home-brew
memory-mapped I/O, using another Sinclair to control and log a long-term life
test) which I would otherwise have to do manually. To me, a computer is just
another power tool, like a sabre saw.
So why on earth am I a subscriber to DDJ? It was an accident. I subscribed to
a "computer" magazine; it went belly-up, and offered a choice of others to
take its place. I selected one, but that one also died promptly, and offered a
choice of still others. One was described as "highly technical" and I gladly
chose it, fearing at the same time that I might be a Jonah. It proved to be
highly technical, but in a field which was (and is) very strange and
wonderful. Your accidental subscriber has found Dr. Dobb's.
But why have I continued to subscribe? I have asked myself that question many
times, but just when I resolve to let the subscription expire, I find some
articles or letters which are of such great general interest, or simply so
superbly written, that they make the task of reading an unalloyed pleasure.
Even though I am nearly illiterate in the languages of programming, I can
enjoy explorations of neural nets, fulminations about Ada, and always the
trenchant comments in "Swaine's Flames." My interest seems more aesthetic than
technical, but that may be a common human trait: I thoroughly enjoy opera,
even though I know little German and French, and even less Italian.
Gurdon Abell
Woodstock, Connecticut


... And an Accidental Turing


Dear DDJ,
In his November 1990 "Programming Paradigms" column, Michael Swaine presented
portions of the critique of connectionism advanced by Fodor and Pylyshyn.
These authors have made major contributions to the field of cognitive science,
and their analysis of the connectionist approach to cognition raises many
important issues.
One of Fodor's lines of reasoning goes something like this:
1. Cognition (thinking) involves the manipulation of symbols. A symbol must
have semantic content. Therefore something that thinks (e.g., a mind) must
deal with semantic elements.
2. "Neural nets" deal in the weights of interactions among, and levels of
excitation of, simple processing units. These weights and excitation levels
are not semantic elements. Therefore a neural net cannot think.
Although Fodor and Pylyshyn present their case very elegantly, it is
unsatisfactory for several reasons. The most striking of these is that their
argument refutes itself.
Let us suppose that the human mind is capable of cognition. Let us further
suppose that the human mind is implemented in hardware which we will call the
brain (to suppose otherwise is to invoke dualism). It is generally agreed that
the brain consists of neurons joined by inhibitory and excitatory connections,
and that the level of excitation of these neurons defines the state of the
brain at any moment. In short, the brain is a neural net, albeit a far more
complicated and capricious one than any artificial neural net to date.
However, according to Fodor and Pylyshyn, neural nets cannot support
cognition. Therefore human beings cannot think ("I knew it all along!" you say
...). If we assume that Fodor and Pylyshyn are human beings, this conclusion
applies to them as well. From this we must infer that they derived their
arguments without resort to cognitive processes.
In closing, Michael Swaine states that a neural net is "the [computational]
equal of a Turing machine." Given this premise, and the premise that a Turing
machine is capable of semantic manipulation, then a neural net must be
similarly capable. Why does he assert that a neural net can support semantic
processing only if used to implement a Turing machine, which then does the
real work? Does a neural net stop being a neural net as soon as it replicates
the function of a Turing machine?
Although a Turing machine can be programmed to emulate some cognitive
processes, my suggestion is that most of what passes for human thought
(including thoughts generated by Fodor, Pylyshyn, and Swaine) arises without
the intermediary of a Turing machine.
Suppose for the moment that Fodor and Pylyshyn were correct, that neural nets
were incapable of cognition. What is their utility? Biological neural nets,
even very simple ones, solve countless life and death problems daily, reliably
and in real time, with a limited amount of hardware, apparently without
resorting to semantic manipulation or cognition. Consider the ability of
flying insects to take off, navigate, and land, making adjustments as
necessary in a fraction of a second. Show me the program that performs a
similar function, and then show me the nonbiological hardware that implements
it as quickly and as well as the nonsemantic fly! Better yet, show it to
Boeing or the folks at DARPA, and watch the bucks roll in.
Ted Carnevale
Stony Brook, New York


RAM Disk for the Rest of Us


Dear DDJ,
Thanks for the article "RAM Disk Driver for Unix" (Jeff Reagen, October 1990).
I was able to compile and install the driver on my Microport System V/386.
Driver code was unchanged but the kernel rebuild was a bit different from the
procedure outlined in the article. Anyway, it was educational (only somewhat
painful!) and took me a few places in the Unix manuals where I don't usually
go. Again, thanks, and keep up the good work.
James Littlefield
CompuServe 71611,2121







January, 1991
PORTING UNIX TO THE 386: A PRACTICAL APPROACH


Designing the software specification




William Frederick Jolitz and Lynne Greer Jolitz


Prior to leading the 386BSD project, Bill was the founder and CEO of Symmetric
Computer Systems, a BSD-based workstation and networking products
manufacturer. He was the principal developer of 2.8 and 2.9 BSD and the chief
architect of National Semiconductor's GENIX project, the first virtual memory
microprocessor-based UNIX system. Prior to establishing TeleMuse, a market
research firm, Lynne was vice president of marketing at Symmetric Computer
Systems. She has produced white papers on strategic topics for the
telecommunications, electronics, and power industries. Bill and Lynne conduct
seminars on BSD, ISDN, and TCP/IP, and are in the process of producing a book
on 386BSD and a textbook focusing on the applications layer of the Internet
Protocol Suite. They can be contacted via e-mail at william@berkeley.edu or at
uunet!william. Copyright (c) 1990 TeleMuse.


The University of California's Berkeley Software Distribution (BSD) has been
the catalyst for much of the innovative work done with the UNIX operating
system in both the research and commercial sectors. Encompassing over 150
Mbytes (and growing) of cutting-edge operating systems, networking, and
applications software, BSD is a fully functional and nonproprietary complete
operating systems software distribution (see Figure 1). In fact, every version
of UNIX available from every vendor contains at least some Berkeley UNIX code,
particularly in the areas of filesystems and networking technologies. However,
unless one could pay the high cost of site licenses and equipment, access to
this software was simply not within the means of most individual programmers
and smaller research groups.
The 386BSD project was established in the summer of 1989 for the specific
purpose of porting BSD to the Intel 80386 microprocessor platform so that the
tools this software offers can be made available to any programmer or research
group with a 386 PC. In coordination with the Computer Systems Research Group
(CSRG) at the University of California at Berkeley, we successfully ported a
basic research system to a common AT class machine (see Figure 2), with the
result that approximately 65 percent of all 32-bit systems could immediately
make use of this new definition of UNIX. We have been refining and improving
this base port ever since.
By providing the base 386BSD port to CSRG, our hope is to foster new interest
in Berkeley UNIX technology and to speed its acceptance and use worldwide. We
hope to see those interested in this technology build on it in both commercial
and noncommercial ventures.
In this and following articles, we will examine the key aspects of software,
strategy, and experience that encompassed a project of this magnitude. We
intend to explore the process of the 386BSD port, while learning to
effectively exploit features of the 386 architecture for use with an advanced
operating system. We also intend to outline some of the tradeoffs in
implementation goals which must be periodically reexamined. Finally, we will
highlight extensions which remain for future work, perhaps to be done by some
of you reading this article today. Note that we are assuming familiarity with
UNIX, its concepts and structures, and the basic functions of the 386, so we
will not present exhaustive coverage of these areas.
In this installment, we discuss the beginning of our project and the initial
framework that guided our efforts, in particular, the development of the
386BSD specification. Future articles will address specific topics of interest
and actual nonproprietary code fragments used in 386BSD. Among the future
areas to be covered are:
386BSD process context switching
Executing the first 386BSD process on the PC
386BSD kernel interrupt and exception handling
386BSD INTERNET networking
ISA device drivers and system support
386BSD bootstrap process


Getting Started: References, Equipment, and Software


Most software ports begin with the naive assumption that the UNIX kernel is
merely a C program with a handful of functions, supporting other utility C
programs on demand. While in essence this is true, in practice this is a vast
oversimplification. Nevertheless, in the tradition of great projects, we
acquired a few tools and other items before getting down to work:
The Design and Implementation of the 4.3BSD UNIX Operating System by Leffler,
McKusick, Karels, and Quarterman (Addison-Wesley, 1989) and Programming the
80386 by Crawford and Gelsinger (Sybex, 1987) were purchased from a bookstore
in Berkeley. Since no one on our team possessed any extensive technical
background on either the 386 or the IBM PC, the 80386 book was our sole
resource for the microprocessor. The 4.3BSD book illuminated some of the
obscure areas and requirements of the BSD UNIX operating systems kernel. We
highly recommend these books. Both books have become somewhat shopworn during
the process -- the 80386 book has had its covers taped twice, primarily due
to being thrown repeatedly across the room in the general direction of the
trash can. This book, while the best resource available on the subject, is not
as complete as one might hope, primarily because the 80386 is a complex animal
and is enigmatic in the correct use of its many features. Segmentation
exception handling descriptions should not be taken literally, although the
book was of great value when writing the first versions of exception handling
code. Some portions of the software were even determined empirically. (Intel
was not eager to provide any information.) The single biggest problem
encountered in our project was that of inadequate 80386 documentation.
A completely blank, inexpensive standard 386 AT clone was the selected
hardware platform. To minimize expenses and to emphasize commonality, we chose
to support only the basic 386 platform.
Using exploratory programs written in Borland's Turbo C, we were able to
probe the typical AT hardware. These exercises permitted us to better
understand the information contained in IBM's Technical Reference Guide
Personal Computer AT, a classic if obscure work. We then tested the
mechanisms inside the AT to make certain we knew what must be provided in
order to generate the necessary software driver support for BSD UNIX.
Our initial kernel source was the 4.3BSD Tahoe release (available for an
obscure machine, the CCI Power 6/32, and as similar to the 386 as a can
opener), at that time the most stable and recent release.
All of these references and the equipment were examined prior to generating
even the first line of code. An understanding of the architectures of the
hardware and software is critical to developing an appropriate 386BSD
specification. Thus, we were able to ensure a successful port, even when
unanticipated problems arose.


Development of the 386BSD Specification


Once all the materials were gathered, the temptation was to immediately sit at
the PC and write code. This is a temptation that should always be vigorously
avoided. One needs to sit down and carefully break down this project into
smaller bites. However, because many parts of this project are interrelated,
we must ensure that the internal standards are uniformly maintained by all
areas of the port and during all phases. In other words, the bridge must meet
in the center.
Therefore, instead of plunging directly into development, we began the most
critical phase of this (or any) port -- that of creating the 386BSD
specification. This specification addressed the following major issues:
Segmentation and paging
Virtual and physical address space
Process context description
System call interfaces
ISA device requirements
Microprocessor idiosyncrasies
Bootstrap
Unlike a commercial specification, the 386BSD specification was intended to be
lightweight and flexible. We wanted to focus 386BSD without making the
specification a major work in itself. We also knew that many of the finer
points would change as we got closer to our goal.


The Definition of the 386BSD Specification


At first glance, the choice of the 386 microprocessor and ISA system
architecture appears to define the operating system's machine-dependent
requirements. For example, from the original 8088 PC to the present, MS-DOS
has used the software interrupt instruction INT $XX to dispatch through
interrupt vector table entry XX, and from there to the desired system call
inside MS-DOS. This was the only way application programs could call the
operating system.

Had this regularity been true for the UNIX operating system, all 80x86 UNIX
systems would be alike, and the development of a specification would be a
simple task. However, in exploiting the power and flexibility of UNIX, one is
faced with a grander specification. The kernel architect is now faced with
competing alternatives. With UNIX, the choices are no longer "cut and dried."
Adding to this dilemma, the 386 is at least two generations beyond its simple
ancestor. The enhanced features the 386 now offers allow us many competing
ways to satisfy a UNIX system design. Continuing our example, instead of using
the INT $XX instruction, we can use the intersegment LCALL instruction to call
the operating system through call gate segments. We can use some powerful
features of the 386, but at the cost of a more elaborate mechanism. Is it
worth it?
In this case, the LCALL instruction can be used to support backward
compatibility with other versions of UNIX in the form of an applications
package rather than within the operating systems kernel, and thus may be worth
the effort. However, choosing among the myriad, often conflicting,
alternatives is typically a task fraught with peril.
For the 386BSD project, we first determined our priorities: 100 percent BSD
kernel and user functionality. The system must contain all important
underlying mechanisms of the Berkeley UNIX system. Any extensive modification
pertaining to how Berkeley UNIX functions on other extant platforms can result
in incompatibility. Incompatibility is like an irritating insect that bites in
many places -- and tends to lie hidden until after extensive distribution. As
such, we did not exploit some features of the 386, such as its elaborate
segmented architecture, where doing so would have cost us compatibility.
Efficient use of the native processor architecture. We would like to use the
system in ways to obtain the highest performance and greatest functionality
possible.
Interoperability with existing commercial standards. We would like to use the
system in ways that maintain compliance with extant commercial standards. We
do not intend to unnecessarily create arbitrary new standards if current
standards are acceptable.
Rapid implementation of the basic operating system. One maxim of any UNIX
development effort is "the best tool to build a UNIX system IS a UNIX system."
We needed to bootstrap ourselves rapidly into operation and leverage 386BSD
itself to complete the project.


Conflicts in Priorities


These basic priorities inherently conflict. For example, BSD systems have
basic incompatibilities with the AT&T System V UNIX systems, because each
project has firm interests and no compelling need to cooperate. As such,
perfect compatibility is impossible to achieve given our project focus. The
opposite tack, no compatibility constraints, is also not completely
acceptable, because we are dealing with the PC class of computers and not
minicomputers or workstations. Fine-grained differences also exist among the
many standards currently competing for favor in the world of 386 UNIX systems.


386BSD Port Goals: A Practical Approach


Given all of these trade-offs, we decided to take what we call a "practical"
approach to 386BSD. We concentrated primarily on "hard adherence" to both BSD
operability and high-performance implementation, for the simple reason that
386BSD is a research project intended for use by the research community.
However, because even this audience depends on commercial resources, we
decided to invest some of our effort in the development of a few fundamental
areas such as System Call Interface Definition.
By dealing with these basic areas, we allowed for limited adherence to
commercial standards from the start, with the ability to gradually extend
386BSD as needed. (For example, in future releases we hope to offer some
degree of support for segmentation and VM8086 mode.) We have also tried, when
possible, to conform to the spirit of the 386 Application Binary Interface
(ABI) and its predecessor Binary Compatible Standard (BCS) when they did not
conflict with our adherence to Berkeley UNIX.
Some may take issue with this stance, seeing binary compatibility standards
entirely as an "all or nothing" issue. Those who spend a great deal of time
arguing over the big end and the little end of the ABI egg are usually
involved in maintaining control over the shrink-wrap commercial software
market. However, those who wish to ignore the ABI juggernaut are also ignoring
the largest body of UNIX software outside the research community. In this
case, ignorance is simply a mask for arrogance. As we stated earlier, we have
tried to take a "practical" approach that builds in the flexibility without
altering the scope of our project.
Many people wonder why UNIX systems are so big and complex. A look through any
UNIX kernel can quickly answer this question. Many different groups prefer to
further their standards agenda by claiming a piece of the kernel for their own use,
instead of redesigning it for common support or moving things out of it that
really belong in an application process. SVR4 alone is rumored to contain 14
different filesystems which are just a variation on a theme. This "Chinese
menu" approach to kernel design has resulted in a bloated kernel that is
difficult to enhance or maintain. Because standards by accumulation just don't
work, with 386BSD we strive to avoid such nonsense.
Another goal of our project was to ensure that all code developed for the
386-specific portions of this project be unique and novel. This is to prevent
any particular commercial agent from arbitrarily appropriating, monopolizing,
or prohibiting discussion and distribution of this code. This is the major
reason why we are able to examine some of the interesting mechanisms of 386BSD
without the censorious effect of proprietary license agreements.


Microprocessor and System Specification Issues


Our specification required that we break it down into two basic technical
areas: the microprocessor itself and the surrounding system hardware. In
keeping with our goals, we segregated the two in order to allow future support
for other buses (such as EISA and Micro Channel) and to avoid obscuring
microprocessor issues.
The microprocessor required much delineation in the areas of segment and
paging strategies, virtual memory allocation and other memory management
issues; communications primitives, context switching, faults, and the system
call interface. We also had to factor in microprocessor idiosyncrasies and
bugs as we went along. On the system side, we concentrated on ISA bus
considerations.
We first outline some of the major issues revolving around the 386
microprocessor itself and how they relate to a Berkeley UNIX port.


386 Memory Management Vitals


Most popular microprocessors use either segmentation or paging to manage
memory address space access. The 386 is rare in that it possesses both. In
fact, since segmentation (Figure 3(a)) is placed on top of paging (Figure
3(b)), you are expected to use segmentation in some form any time memory is
paged.
And, most important, BSD relies on paging.
All operand references on the 386 are tied to one of the segment registers.
This segment register uses a 16-bit selector (low-order bits determine level
of access) to find a descriptor. This descriptor then determines the location
of underlying memory in linear address space. When segmentation alone is
enabled (also known as protected mode), the linear address space corresponds
to the physical address of the selected segment for the operand. However, when
paging is implemented, the linear address space address must be run through a
two-level paging mechanism to find the physical page frame number, the actual
address of physical memory underneath the virtual address.
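The two lookups described above can be sketched in C. This is only an illustration of the documented 386 bit layouts (a 16-bit selector with the requested privilege level in its low two bits, and a 32-bit linear address split into a directory index, a table index, and a page offset). The structure and function names are hypothetical and do not come from the 386BSD sources.

```c
#include <stdint.h>

/* Fields of a 16-bit segment selector (names are illustrative). */
struct selector_fields {
    unsigned index;          /* which descriptor in the table */
    unsigned table_is_local; /* 0 = global table, 1 = local table */
    unsigned rpl;            /* requested privilege level, 0..3 */
};

/* Fields of a 32-bit linear address under two-level paging. */
struct linear_fields {
    unsigned dir_index;   /* entry in the page directory (top 10 bits) */
    unsigned table_index; /* entry in the page table (next 10 bits) */
    unsigned offset;      /* byte within the 4-Kbyte page (low 12 bits) */
};

struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;

    f.rpl = sel & 0x3u;
    f.table_is_local = (sel >> 2) & 0x1u;
    f.index = sel >> 3;
    return f;
}

struct linear_fields split_linear(uint32_t linear)
{
    struct linear_fields f;

    f.dir_index = linear >> 22;
    f.table_index = (linear >> 12) & 0x3FFu;
    f.offset = linear & 0xFFFu;
    return f;
}
```

With paging enabled, the directory index selects a page table, the table index selects a page frame, and the offset is carried through unchanged.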
One of the most powerful, yet confusing, features of the 386 is its segmented
architecture. While the current trend in microprocessors has been oriented
towards a single "flat" linear virtual address space, the 386 has continued
the bias toward segments held by the entire 80x86 line. The two most important
changes in the 386 from previous versions -- permitting 32-bit operations and
expanding segments from 64 Kbytes to 4 gigabytes in size -- may turn some of
the inherent disadvantages of 80x86 segments into an advantage. Segments once
too small for many data items (such as arrays of real numbers) can now utilize
alternative address spaces. This is of great interest to those working with
specialized applications, such as 3-D to 2-D transformations.


Segmentation and 386BSD


UNIX was initially developed on machines that relied on linear virtual address
spaces. As such, Berkeley UNIX provides no support for segments and instead
expects a large linear virtual address space for both kernel and user. In
fact, UNIX in general adapts to segments only under duress.
Originally, we had intended to use segments in a straightforward manner.
However, we found that would result in a host of nuisance problems. For
example, many programs (debuggers, assemblers, and object-linking editors)
must be modified so that separate address spaces for the various regions could
be maintained. Object file format, always in a state of flux due to the
varying degrees of dynamic loading of instruction and data structures, would
require change.
Another problem which arises when using segments is that the shared data in
the instruction segment requires strict typing in the assembler (we force
instructions to reference the CS segment directly) to obtain access. Because
some compilers put data constants in the code area with the intent of sharing
memory used by other processes, invoking segments would create little problems
everywhere for the compiler.
Still other problems result from the use of string instructions on
stack-resident data and that time-honored bad practice known as self-modifying code.
The key flaw in all these cases is that the binding to the particular segment
register is mandated by the assembler, and cannot be properly resolved by the
object code linker as other symbols are normally handled.
Given all of the problems which arose and, in accordance with our 386BSD
goals, we chose to minimize support for segmentation by running the machine in
"flat" mode. As a result, no tinkering with object file format or tools was
required. An amusing side effect of this approach is that it allowed us to
cross-develop 386 code on VAX and NSC32000-based computers using the native
object utilities. This choice minimized bookkeeping considerably but also
ultimately defeated the purpose of segments. A more elaborate design was
beyond the scope of our project.
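To make "flat" mode concrete, here is a minimal sketch that packs an 8-byte 386 segment descriptor with base 0 and a page-granular limit covering the whole 4-gigabyte linear space. The bit positions follow the documented 386 descriptor format. The function names are hypothetical, and the example access bytes are assumptions for illustration; the limits actually chosen in 386BSD may differ (the kernel code segment, for instance, is described later as covering only the kernel instruction region).

```c
#include <stdint.h>

/*
 * Pack a 386 segment descriptor into a 64-bit value.
 *   limit20 : 20-bit limit field
 *   access  : access byte (present bit, privilege level, type)
 *   flags4  : high nibble flags (bit 3 = granularity, bit 2 = 32-bit default)
 */
uint64_t make_descriptor(uint32_t base, uint32_t limit20,
                         uint8_t access, uint8_t flags4)
{
    uint64_t d = 0;

    d |= (uint64_t)(limit20 & 0xFFFFu);            /* limit bits 15..0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;       /* base bits 23..0   */
    d |= (uint64_t)access << 40;                   /* access byte       */
    d |= (uint64_t)((limit20 >> 16) & 0xFu) << 48; /* limit bits 19..16 */
    d |= (uint64_t)(flags4 & 0xFu) << 52;          /* granularity, size */
    d |= (uint64_t)(base >> 24) << 56;             /* base bits 31..24  */
    return d;
}

/* Number of bytes a segment spans for a given limit and granularity. */
uint64_t segment_span_bytes(uint32_t limit20, int page_granular)
{
    uint64_t units = (uint64_t)(limit20 & 0xFFFFFu) + 1;

    return page_granular ? units * 4096u : units;
}
```

For example, make_descriptor(0, 0xFFFFF, 0xFA, 0xC) describes a readable code segment at privilege level 3 that starts at 0 and spans all 4 gigabytes, so segment offsets and linear addresses coincide.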


Kernel Linear Address Space Overhead


The kernel, as well as the user mode programs, requires its own set of segment
registers. If the kernel is called, its segments must be present. This takes
up precious linear address space. Thus, we can never run a process exactly 4
gigabytes in size because a portion of the address space must be reserved for
kernel use. Even if we try to use segments to relocate the kernel, we cannot
escape the limit -- it not only takes up the same linear address space but
also forces us to use intersegmental instructions to communicate data between
user process and the kernel. Since the user process and the kernel must
share virtual address space, we limited ourselves to a maximum process size of
4 gigabytes less the kernel size.
The kernel segment registers are outlined in Figure 4. These segment registers
cover (alias) the user segments and allow access to the user space from the
kernel in any way desired (read, write, or execute). Because all segments
start at zero, the kernel begins at a high address (or offset) and always runs
relocated. In 386BSD, the code segment just covers the end of the kernel
instruction region, because no self-modifying code was needed.
One way to avoid linear address space sharing constraints is to have all
interrupts, traps, exceptions, and system calls internally context switch to a
separate process to execute UNIX system functions, using the 386 trap with
task switching feature. This unique 386 hardware allows traps to be handled by
either procedures or tasks. However, task switching is very expensive and the
system would context switch thousands of times more frequently than otherwise.
Also, the UNIX kernel is not intended to run itself as a process, as use of
this feature would require.



Virtual Address Space Layout


Within the 4 gigabytes per process address space, a process must be allocated
regions for instruction, data, and stack for both user programs and the
kernel. Some of these regions (user data, user stack) must grow as a process
runs, and support must be available for additional regions used for shared
memory and shared/dynamically loaded libraries. The size of these regions and
their placement becomes an important consideration for any UNIX port.
The traditional UNIX approach is to place the instruction region at the
beginning of the address space, followed by data, unused space, and finally a
stack region. The purpose of the empty space is to build in room so that the
stack can grow down and the data (for heap storage) can grow up. The end-point
is known in UNIX vernacular as the "break." Usually, text starts at absolute
virtual address 0.
A problem common with UNIX systems arose from the extensive use of
uninitialized string pointers, which by default were set to the value 0.
Because the first word at address 0 was also set to 0, this meant that null
pointers always pointed to null strings. However, many early computers did not
permit the bottom of address space to be used in this way and a tested program
would abort. UNIX code that was thought "proven" on the PDP-11 and VAX was
actually masked by the development system architecture. Eventually, many
uninitialized pointers were located and corrected. Some versions of UNIX also
leave the very bottom and top of address space unmapped to catch indirections
through 0 and -1. This method is of limited effectiveness, however, if a
structure referenced through such a pointer is bigger than the size of the
bottom and top address space holes.
386BSD virtual address space is arranged in the traditional manner (see Figure
5). The user address space begins at zero with text (yes, we do indeed have 0
at location 0), followed by data, unused space, and finally the stack. The
start of the user stack, located at the top of the user's address space, is
not fixed. (A future project may utilize this feature to "lower" the stack,
providing room for dynamically created regions.) Because only the operating
system needs to know the exact location of the user stack, it assigns the
stack's address space on process program load (exec system call).


Per-Process Data Structures


The kernel address space resides above the user portion of the process virtual
address space. By virtue of being co-resident in the virtual address space
with the user space (a somewhat mandatory virtue), the kernel can directly
reference any part of the current running user process in the lower portion of
memory.
As in the user space (and in UNIX executable files), kernel instructions and
data are arranged consecutively. The stack and a new special region, the
per-process data structure or user structure (u. for short), appear below the
kernel. One advantage of this arrangement is that it becomes possible to share
all portions of the page tables for address space above the kernel base
address. Notice that though this is a vital part of the kernel, it is
technically at the very top of user address space and is purposely left
readable by the user process. Everything beneath the system base address is
switched when a context switch to the next process occurs.
Currently, the kernel address space starts at virtual address 0xfe000000, and
allows up to 32 Mbytes of address space to be reserved for use within the kernel.
This boundary can be moved at a later date if more address space is needed.
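The arithmetic behind these figures can be checked with a short sketch. It assumes only the kernel base address quoted above; the macro and function names are hypothetical. The user figure is an upper bound, since the u. area and kernel stack sit just below the kernel base.

```c
#include <stdint.h>

/* Kernel base address as quoted in the text; the macro name is ours. */
#define KERNBASE_MODEL 0xFE000000u

/* Bytes of linear address space from the kernel base to the 4-Gbyte top. */
uint64_t kernel_span_bytes(void)
{
    return (UINT64_C(1) << 32) - KERNBASE_MODEL;
}

/* Upper bound on the address space left below the kernel base. */
uint64_t max_user_span_bytes(void)
{
    return KERNBASE_MODEL;
}

/* Nonzero if a virtual address falls in the kernel's part of the space. */
int is_kernel_address(uint32_t va)
{
    return va >= KERNBASE_MODEL;
}
```

The kernel span works out to 32 Mbytes, leaving at most 4064 Mbytes (4 gigabytes less 32 Mbytes) beneath it.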
Access of the ISA bus device memory (screen and LAN buffers) is obtained
through an allocated region of the kernel memory, known as a utility page map.
This is similar to portions of on-demand physical memory used by the kernel
through other utility page maps. The kernel also has a variety of data
structures scaled and allocated at boot time (valloc) and a heap for dynamic
demands (malloc).


386 Virtual Memory Address Translation Mechanism


The 386 paging mechanism impacts the 386BSD specification with respect to
address space allocation constants: Each page is 4 Kbytes in size and must
reflect the minimum granularity of address space allocation, while each page
of page tables maps 4 Mbytes of address space. These constants determine
address boundaries used to allocate memory and share address space between
similar processes. Shared objects starting on 4-Mbyte boundaries can share
page tables as well as underlying physical memory.
Page size granularity is important to the layout of executable files.
Instruction and data regions are arranged into discrete and aligned memory
page units, so that it is possible to demand load pages that may be either
"read-only" (instructions) or "read-write" (data or stack). The page table
size granularity is typically located at the beginning of each user, user
stack, and kernel address space. It is possible to share these among many
processes, obviating the need for separate page tables. As a result, while
each process has its own page table directory, the top eight PDEs of each
process page table directory point to the same kernel page tables. Thus, the
kernel's portion of address space is global to all other processes.
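A small sketch shows how these constants interact. It assumes a kernel base of 0xfe000000, so the kernel occupies the last eight of the 1024 page directory entries. The helper names are hypothetical, and the directory is modeled as a plain array rather than real hardware state.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE_386    4096u                 /* bytes per page */
#define PDE_SPAN         (4u * 1024u * 1024u)  /* bytes mapped per page table */
#define NPDE             1024u                 /* entries per page directory */
#define KERNBASE_MODEL   0xFE000000u           /* assumed kernel base address */
#define FIRST_KERNEL_PDE (KERNBASE_MODEL / PDE_SPAN)

/* Round a byte count up to a whole number of pages. */
uint32_t round_up_to_page(uint32_t nbytes)
{
    return (nbytes + PAGE_SIZE_386 - 1) & ~(PAGE_SIZE_386 - 1);
}

/* A region can share a whole page table only if it starts on 4 Mbytes. */
int can_share_page_table(uint32_t va)
{
    return (va % PDE_SPAN) == 0;
}

/*
 * Model of setting up a new process's page directory: the user part
 * starts empty, and the kernel entries are copied so that every
 * process points at the same kernel page tables.
 */
void init_process_directory(uint32_t new_dir[NPDE],
                            const uint32_t kernel_dir[NPDE])
{
    memset(new_dir, 0, NPDE * sizeof new_dir[0]);
    memcpy(&new_dir[FIRST_KERNEL_PDE], &kernel_dir[FIRST_KERNEL_PDE],
           (NPDE - FIRST_KERNEL_PDE) * sizeof new_dir[0]);
}
```

Here FIRST_KERNEL_PDE evaluates to 1016, so the copy covers entries 1016 through 1023, the top eight.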


User to Kernel Communication Primitives


By arranging our address space as outlined, we've greatly simplified the
routines that communicate between kernel and user process (now the kernel
routines can directly access user space). All that is needed is a way to
determine if a selected portion of user memory may be read or written before
it is attempted. On some machines (such as the VAX) special instructions are
available for this purpose. The 386, however, offers instructions only for use
in validating segments, not pages. So we must use a different strategy.
In 386BSD, we chose to set a global variable (nofault) to a nonzero value. If
a fault happens during any user/kernel communication primitive, it transfers
to the address held within nofault. In this way we can catch illegal
references by using the microprocessor's own address translation mechanism to
find them, instead of by tedious code evaluation on every reference.
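The idea can be modeled in portable C, with setjmp/longjmp standing in for the trap handler's transfer of control. This is a user-level simulation under stated assumptions, not kernel code: the memory array, the page validity table, and every function name are hypothetical, and only the name nofault comes from the text.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_NPAGES    4u

/* Simulated user memory; page 2 is deliberately marked invalid. */
static unsigned char model_mem[MODEL_NPAGES * MODEL_PAGE_SIZE];
static int model_page_valid[MODEL_NPAGES] = { 1, 1, 0, 1 };

/* Non-NULL only while a user/kernel primitive is running. */
static jmp_buf *nofault;

/* Stand-in for the page fault trap handler. */
static void model_page_fault(void)
{
    if (nofault != NULL)
        longjmp(*nofault, 1); /* resume at the recovery point */
    abort();                  /* unexpected fault: "panic" */
}

/* Every simulated access goes through the "translation hardware". */
static unsigned char model_load(size_t addr)
{
    if (addr >= sizeof model_mem ||
        !model_page_valid[addr / MODEL_PAGE_SIZE])
        model_page_fault();
    return model_mem[addr];
}

/* Copy len bytes from simulated user space; 0 on success, -1 on fault. */
int model_copyin(size_t uaddr, void *kbuf, size_t len)
{
    jmp_buf recover;
    unsigned char *dst = kbuf;
    size_t i;

    nofault = &recover;
    if (setjmp(recover) != 0) {
        nofault = NULL; /* a fault occurred somewhere in the copy */
        return -1;
    }
    for (i = 0; i < len; i++)
        dst[i] = model_load(uaddr + i);
    nofault = NULL;
    return 0;
}
```

The copy loop contains no explicit validity checks of its own; a bad reference is discovered only because the access itself faults.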
Unfortunately, one idiosyncrasy of the 386 now rears its ugly head. The
designers of the 386 decided that segment attributes should be used to
ultimately determine access to regions in a process, thus making their use
mandatory in the system even if we don't need them. To be precise, we have
page attribute bits that can be used for protection. These work as expected,
unless the 386 is run in supervisor mode (as does the kernel). In this case,
only the valid/invalid attribute has any effect. This nuisance or "feature"
requires a bit of workaround to make the primitives complete.


Berkeley UNIX Virtual Memory System Strategy


The current Berkeley UNIX virtual memory management subsystem was originally
designed for use with a VAX, and as such has no support for page directories.
For that matter, the 386 doesn't know of such VAX concepts as P0 and P1
address spaces for instruction/data and stack nor of page table-length
registers. Currently, these are simulated in 386BSD. However, work is underway
to revise the entire virtual memory system to permit more generalized
operation over all supported Berkeley UNIX platforms, now that the demands of
each platform have been made obvious.
Portions of the VAX were simulated by employing code, written by Mike Hibler
at the University of Utah, which supports the 68030 paging memory management.
Because the 386 code is so similar, we used a conditional compilation that
shares 68030 and 386 versions interchangeably -- an odd couple indeed.


Structure of Per-Process Data (u.)


Within each process accessed by the kernel exists a unique data structure
containing the private variables of the process used to provide UNIX system
call functionality. This is called the "extended state" of a given process and
is collected into one location. If the process is long inactive, this state is
swapped to secondary storage to reclaim RAM memory. All of the
machine-dependent fields in this structure lie within the first element u_pcb,
a process context descriptor. However, the size of this structure and its
adjoining kernel stack is also a machine-dependent parameter. The u. is
currently defined at about 1 Kbyte in size. This fits amply within a single
page.
Another page is sufficient to hold a kernel stack. This results in a
per-process data structure two pages in size. By leaving these as two separate
pages in 386BSD, instead of combining them into a single page (giving us a
smaller kernel stack), the kernel stack segment can be used to catch the stack
overflow ("redstack") condition. This will appear as a future enhancement.


Process Context Description


As seen in Figure 6, the process control block (struct pcb), contains the
386-specific per-process information. This is broken down into
hardware-dependent fields and software-related fields. The process control
block is placed at the front of the user structure so that the information can
be reloaded from the address of the user structure and force active a
previously inactive process. The user structure address is recorded in the
process table. Each entry describes global information about a process.
The 386's hardware context switch facility can be used to switch from process
to process. By placing the hardware-dependent information at the beginning of
the process control block, in the form of the 386's Task Switch State (TSS)
data structure, it is possible to switch from one process to another with a
single intersegment ljmp instruction to the appropriate task gate selector.
While this feature has been implemented in 386BSD, it is not used at this time
for switching between processes due to performance considerations. However, it
can be used in other cases, such as exception handling, and we may elect to
use it for process switching in the future. We view this as one of those rare
"have your cake and eat it too" decisions.
In 386BSD, not all hardware context is switched in this manner, because some
processes never access the large amount of state information (108 bytes) used
by the numeric coprocessor. We allow for this with the pcb_fpusav structure.
Other fields correspond to some implementation demands specific to Berkeley
UNIX, including simulating VAX hardware constructs invoked by the virtual
memory system not existing on the 386. Fortunately, this was a small amount of
code. It is a tribute to the concept of UNIX that the machine-dependent
portion of the system is as small as it is.
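The hardware-defined part of such a control block can be sketched as follows. The layout of the 386 task state is fixed by the processor (26 32-bit slots, 104 bytes), and the coprocessor save area is the 108 bytes mentioned above. The structure and field names are hypothetical apart from pcb_fpusav, which is modeled here as a byte array, and this is not a reproduction of Figure 6.

```c
#include <stddef.h>
#include <stdint.h>

/* The 386 task state as the hardware lays it out in memory. */
struct tss386_model {
    uint32_t back_link;           /* selector of the previous task */
    uint32_t esp0, ss0;           /* stack used on entry to level 0 */
    uint32_t esp1, ss1;
    uint32_t esp2, ss2;
    uint32_t cr3;                 /* page directory base */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx;
    uint32_t esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;
    uint32_t ldt;
    uint32_t trap_and_iomap_base;
};

/* Hypothetical control block: hardware state first, then the rest. */
struct pcb_model {
    struct tss386_model pcb_tss;  /* must stay the first member */
    uint8_t pcb_fpusav[108];      /* numeric coprocessor state */
    /* software-related fields would follow here */
};
```

Because the task state is the first member, the address of the control block is also the address the hardware expects for a task switch. The esp0 and ss0 slots hold the kernel stack pointer used when the kernel is reentered from user mode.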


Page Fault and Segmentation Fault Mechanism



To report exceptions that occur in the 386 memory management hardware, they
must be caught and routed to the proper portion of the kernel. UNIX places
these exceptions in two categories: Faults signaled to the user process, which
terminates the process if it is not interested in the exception, and "resource
not present" faults sent to the virtual memory system to request a missing
page.
The 386 also signals a variety of segment exceptions, almost all of which
result in dire consequences for the process that invokes them. A single page
fault exception encodes both "page not present" as well as "protection
violation" events. The fault address is recorded in processor special
register cr2; it should be examined carefully, together with the error code
the processor pushes for the exception, to determine the precise nature of
each fault.
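A minimal sketch of how a page fault might be classified follows. It relies on the documented 386 page fault error code, whose low three bits distinguish a missing page from a protection violation, a read from a write, and supervisor from user mode. The type and function names are hypothetical.

```c
#include <stdint.h>

enum pf_kind { PF_NOT_PRESENT, PF_PROTECTION };

struct pf_info {
    enum pf_kind kind;   /* missing page or protection violation */
    int was_write;       /* nonzero if the access was a write */
    int from_user;       /* nonzero if the access came from user mode */
    uint32_t fault_addr; /* linear address that faulted (from cr2) */
};

struct pf_info decode_page_fault(uint32_t error_code, uint32_t cr2)
{
    struct pf_info info;

    info.kind = (error_code & 0x1u) ? PF_PROTECTION : PF_NOT_PRESENT;
    info.was_write = (error_code & 0x2u) != 0;
    info.from_user = (error_code & 0x4u) != 0;
    info.fault_addr = cr2;
    return info;
}
```

A "not present" result would normally be handed to the virtual memory system as a request for the missing page, while a protection violation from user mode would be signaled to the process.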


Other Processor Faults


Along with address space faults, we found we must map 15 other faults (see
Figure 7) into the Berkeley UNIX kernel exception-handling mechanisms. The
numeric coprocessor presents special fault-handling challenges, for it can be
operating when 386BSD switches to another unrelated process. In that case, we
can get a trap that should have been passed to a process other than the one
currently running.
Figure 7: 386 processor exceptions that needed to be mapped into the kernel
exception-handling mechanism

 386 Processor Exceptions
 -------------------------------------------------------------------------------
                                                                        Pushes an
 Exception            Description                                       Error Code?
 -------------------------------------------------------------------------------
 Divide               Division by 0 or division overflow                No
 Debug/Trace          Single step or debug hardware condition           No
 Breakpoint           Executed an INT3 instruction                      No
 Overflow             Executed an INTO instruction when OF bit set      No
 Bounds Check         Executed a BOUND instruction which failed         No
 Illegal Instruction  Executed an unknown instruction                   No
 NPX DNA              Numeric processor device not available            No
 Double Fault         Recursive fault (fault while processing a fault)  Yes
 NPX Operand          Numeric processor accessed outside of segment     No
 Invalid TSS          Attempted to task switch to incorrect task state  Yes
 Segment Not Present  Attempt to access a not present descriptor        Yes
 Stack Segment        Problem with current stack descriptor             Yes
 General Protection   Protection problem with a segment descriptor      Yes
 Page                 Page missing or protection problem with address   Yes
 NPX Error            Numeric processor signals an error                No
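
The same information can be expressed as a lookup table, which is roughly what a trap entry routine needs in order to know whether an error code is on the stack. The vector numbers are the standard 386 assignments; the table and function names are hypothetical and are not taken from the 386BSD sources.

```c
#include <stddef.h>

struct exception_model {
    int vector;             /* 386 interrupt vector number */
    const char *name;
    int pushes_error_code;  /* 1 if the processor pushes an error code */
};

static const struct exception_model exceptions_386[] = {
    {  0, "Divide",              0 },
    {  1, "Debug/Trace",         0 },
    {  3, "Breakpoint",          0 },
    {  4, "Overflow",            0 },
    {  5, "Bounds Check",        0 },
    {  6, "Illegal Instruction", 0 },
    {  7, "NPX DNA",             0 },
    {  8, "Double Fault",        1 },
    {  9, "NPX Operand",         0 },
    { 10, "Invalid TSS",         1 },
    { 11, "Segment Not Present", 1 },
    { 12, "Stack Segment",       1 },
    { 13, "General Protection",  1 },
    { 14, "Page",                1 },
    { 16, "NPX Error",           0 },
};

/* Returns 1 or 0 for a listed vector, or -1 if it is not in the table. */
int exception_pushes_error_code(int vector)
{
    size_t i;

    for (i = 0; i < sizeof exceptions_386 / sizeof exceptions_386[0]; i++)
        if (exceptions_386[i].vector == vector)
            return exceptions_386[i].pushes_error_code;
    return -1;
}
```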

If 386BSD receives an unexpected fault while running in the kernel, it must
immediately force the kernel down (in UNIX vernacular, to "panic") and attempt
to save as much state information as possible for diagnostic purposes. Thus,
we differentiated user traps from kernel traps. In most other microprocessors,
a bit in the processor flags or status word determines if we are running in
the kernel, but the 386 offers no such bit. So, 386BSD examines the contents
of the CS segment register when a trap occurs (this is saved by the hardware
during an exception) to determine if an instruction was executing in user
mode.
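A sketch of that test, assuming the usual arrangement in which the kernel runs at privilege level 0 and user code at level 3, so that the low two bits of the saved CS selector identify the interrupted mode. The names are hypothetical.

```c
#include <stdint.h>

#define SEL_RPL_MASK 0x3u /* privilege bits of a selector */
#define SEL_USER_RPL 0x3u /* assumed user privilege level */

enum trap_action { TRAP_SIGNAL_PROCESS, TRAP_PANIC };

/* Nonzero if the CS saved by the hardware belongs to user mode. */
int trap_from_user_mode(uint16_t saved_cs)
{
    return (saved_cs & SEL_RPL_MASK) == SEL_USER_RPL;
}

/* What to do with a fault the kernel did not expect. */
enum trap_action unexpected_fault_action(uint16_t saved_cs)
{
    return trap_from_user_mode(saved_cs) ? TRAP_SIGNAL_PROCESS
                                         : TRAP_PANIC;
}
```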


Microprocessor Idiosyncrasies


We found a hornet's nest of microprocessor idiosyncrasies unique to a 386 UNIX
port. Some of the primary issues these touched upon included that of switching
from real mode (20-bit addressing) to protected mode (32-bit addressing),
creating segment descriptors to fill the interrupt descriptor table, creating
other segments for use by the user and kernel modes of a process, and finally,
novel surprises between different steppings of the 386/486 themselves.
One major irritant was the need for at least one TSS structure to be present
at any time, even if we didn't use a TSS for task switching. The TSS records
the contents of the kernel's stack pointer for use when the kernel is
reentered from user mode (interrupt, exception, and system call). Our early
versions of 386BSD worked well as it started up within the kernel, moved into
user mode for the first process, and then froze after hitting the first system
call. Imagine our surprise when we found that, in effect, it had no place to
save where it was coming from on the kernel stack!


System Call Interface



A table of system calls is provided by Berkeley UNIX with the assigned index
number that differentiates them. This table specifies, in part, a binary
standard for system calls -- in this case, of a POSIX-based system. Of course,
because POSIX is considered an "object library" definition (as opposed to the
regulation at the system call level desired by ABI and BCS advocates), one
might accurately consider this an "academic" standard. In deference to these
other standards, however, we chose to accept their suggested format for system
calls.
Figure 8 is a code template for the system call stub used in 386BSD, in this
case a write system call. The lcall instruction is an intersegmental call
instruction that references a special segment selector, known to be a UNIX
system call gate into the kernel. The selector corresponds to the first
descriptor in the process's local descriptor table. To designate which system
call is to be used, the eax register is loaded with the index from the table.
Arguments for each system call are present on the stack, and this stub is
called from another procedure. System calls return after the lcall
instruction, returning values in the eax and edx registers (just as other C
procedures do). System calls report failure by setting the carry bit and
recording error notification in eax.
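The return convention can be modeled in C without any assembly. This sketch is not the stub of Figure 8: it takes a snapshot of the registers as they stand after the lcall returns and shows how a C library wrapper would conventionally turn them into a return value and an error number. The structure, the function name, and the use of -1 with a separate error variable are assumptions based on common UNIX library practice.

```c
/* Register state after the system call returns (a model, not a trap frame). */
struct syscall_regs_model {
    long eax;  /* return value, or error number when carry is set */
    long edx;  /* second return value for calls that return two */
    int carry; /* nonzero if the kernel reported failure */
};

/*
 * Convert the raw register convention into the C-level convention:
 * on failure store the error number and return -1, otherwise return eax.
 */
long finish_syscall(const struct syscall_regs_model *regs, int *errno_out)
{
    if (regs->carry) {
        *errno_out = (int)regs->eax;
        return -1;
    }
    return regs->eax;
}
```

A successful write of 12 bytes would come back with carry clear and eax set to 12. A failed one would come back with carry set and the error number in eax.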


System Specific (ISA) Issues


So far, we have only described issues relating to our choice of
microprocessor. But this specification is incomplete unless the issues
relating to the bus and the system surrounding the microprocessor are
examined. We recognized that the 386 already operates on a plethora of
different buses, including ISA, EISA, MCA, VME, and MULTIBUS, and that these
issues vary depending on which bus is used. We may even need to support more
than one bus at a time, or even a custom bus. As such, we decided that 386BSD
must take into consideration the support requirements of many different bus
standards.


Physical Memory Map


The ISA bus physical memory layout is outlined in Figure 9. The memory is
broken into three parts: base memory, I/O device memory, and extended memory.
RAM is split up under this standard, with a base memory section, holding up to
640 Kbytes of memory, starting at address 0 and ending at the beginning of
device memory. Remaining memory is located starting at address 0x100000 (above
1 Mbyte) and extending to as much as 0xFFFFFF (16 Mbytes).
Between the base and extended RAM regions lies device memory, where display
adapter cards and LAN cards use special RAM buffers. This region, called the
"hole," is a nuisance for UNIX ports, because we would rather see contiguous
memory. Although we do have a means of making memory appear contiguous through
the use of virtual memory, this does us no good when we must work with
physical memory addresses during system bootstrap, hardware DMA devices, and
physical memory allocation structures.
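As a rough illustration of the layout just described, the following C sketch
classifies a physical address into one of the three ISA regions and tests
whether a buffer (for DMA, say) stays clear of the hole. The names used here
(isa_classify, isa_range_is_ram, and the constants) are hypothetical and are
not taken from 386BSD; only the 640-Kbyte, 1-Mbyte, and 16-Mbyte boundaries
come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the boundaries follow the ISA layout in the text. */
#define BASE_MEM_END  0x0A0000u   /* 640 Kbytes: end of base memory */
#define EXT_MEM_START 0x100000u   /* 1 Mbyte: start of extended memory */
#define ISA_MEM_END   0x1000000u  /* 16 Mbytes: limit of ISA addressing */

enum isa_region { ISA_BASE, ISA_HOLE, ISA_EXTENDED, ISA_NONE };

/* Report which region a physical address falls in. */
enum isa_region isa_classify(uint32_t paddr)
{
    if (paddr < BASE_MEM_END)
        return ISA_BASE;
    if (paddr < EXT_MEM_START)
        return ISA_HOLE;
    if (paddr < ISA_MEM_END)
        return ISA_EXTENDED;
    return ISA_NONE;
}

/* True if [paddr, paddr+len) lies wholly in one RAM region,
 * that is, it neither touches nor straddles the hole. */
int isa_range_is_ram(uint32_t paddr, uint32_t len)
{
    assert(len > 0);                      /* an empty range has no region */
    uint32_t last = paddr + (len - 1);
    if (last < paddr)                     /* range wrapped past 4 gigabytes */
        return 0;
    enum isa_region r = isa_classify(paddr);
    return r != ISA_HOLE && r != ISA_NONE && isa_classify(last) == r;
}
```

A 64-Kbyte buffer at 0x90000 passes this test, while one that is a single
byte longer spills into device memory and fails it.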
If extended memory is not available, we must temporarily reside in the MS-DOS
640-Kbyte base-memory dungeon. This is truly hell for memory-consumptive UNIX
systems. Fortunately, this occurs only when the system is "misconfigured"
during the configuration or boot processes, and is not a "normal" situation.


ISA Device Controllers


To support common ISA devices, 386BSD must cope with a separate I/O address
bus, shared memory, vectored interrupts, and dedicated DMA controllers. Since
most of these evolved from ad hoc standards, device conflicts are common. In
order to accurately support ISA, we began with a minimal AT 386 configuration
-- 386/387, 1-Mbyte RAM, keyboard, monitor, Winchester drive (ST506, ESDI,
IDE), and floppy drive -- and relied solely on what the BIOS uses to work the
hardware. We expect an improvement in performance when these guidelines are
eventually relaxed.


ISA Device Auto Configuration


A key advantage of Berkeley UNIX is its ability to configure, at boot time,
the devices present on the system. This feature, while difficult to implement
on the ISA bus given its numerous device conflicts, was considered valuable
enough to implement.
In Figure 10(a), we have data structures that encode all the appropriate
information to configure a device in 386BSD. Each driver, which may have many
devices, is able to locate and configure a device if present. The isa_device
structure also contains the characteristics of each device to be recognized.
If found, hardware resources can then be assigned to each device as
configured. A sample table of possible devices to search for within the kernel
appears in Figure 10(b).
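The probe-then-attach idea can be sketched in a few lines of C. This is only
an illustration under stated assumptions, not the 386BSD configuration code:
the structures are cut-down versions of those in Figure 10(a), the function
pointers are given ANSI prototypes, and isa_configure, demo_probe,
demo_attach, and demo_driver are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct isa_driver;

/* Reduced version of the isa_device structure in Figure 10(a). */
struct isa_device {
    struct isa_driver *id_driver;  /* per driver configuration info */
    short id_iobase;               /* base i/o address register */
    short id_irq;                  /* interrupt request */
    int   id_unit;                 /* unit number within driver */
    int   id_alive;                /* set when the probe finds the device */
};

struct isa_driver {
    int (*probe)(struct isa_device *);   /* test whether device is present */
    int (*attach)(struct isa_device *);  /* set up driver for a device */
    const char *name;                    /* device name */
};

/* Walk a table that ends with a null driver pointer; probe each entry
 * and attach those that respond. Returns the number configured. */
int isa_configure(struct isa_device *tab)
{
    int found = 0;

    assert(tab != NULL);
    for (struct isa_device *dp = tab; dp->id_driver != NULL; dp++) {
        dp->id_alive = dp->id_driver->probe(dp);
        if (dp->id_alive) {
            dp->id_driver->attach(dp);
            found++;
        }
    }
    return found;
}

/* Hypothetical driver for illustration: pretends only unit 0 exists. */
static int demo_probe(struct isa_device *dp)  { return dp->id_unit == 0; }
static int demo_attach(struct isa_device *dp) { (void)dp; return 1; }
struct isa_driver demo_driver = { demo_probe, demo_attach, "demo" };
```

Given a table with two demo entries (units 0 and 1) and a terminating null
entry, isa_configure would mark only the first as alive and attach it.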


Interrupt Priority Level Management


In the PC architecture, there is a separate interrupt level per device
interrupt. These are more levels than traditional UNIX wants or needs.
Instead, UNIX groups different classes of devices into interrupt priority
levels that can be disabled and enabled as a group (disks, terminals,
network). This is done through spl( ) function calls, named for a PDP-11/45
instruction which implemented this feature on early UNIX systems. This
capability must be provided in 386BSD as well.
Each interrupt vector (interrupt gate) has code that saves the cpl (current
priority level) variable on the stack, sets the new cpl value, and turns on
interrupts above this level. On return from the interrupt, all vectors call a
common routine that disables interrupts, restores the cpl, and returns with
interrupts enabled. The cpl is altered, as is the priority mask of the dual
8259 ICUs, by the spl( ) subroutines. The microprocessor and system can now
be run at different priority levels on demand.
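To make the grouping concrete, here is a small user-level C model of the
scheme. It is a sketch only: the level names, their ordering, and the
assignment of IRQs to classes are assumptions made for the example (the IRQ
numbers are borrowed from Figure 10(b)), and a plain variable stands in for
the mask registers of the two 8259s, which real code would program with I/O
instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical priority levels, lowest first. */
enum { IPL_NONE, IPL_TTY, IPL_BIO, IPL_NET, IPL_HIGH, IPL_COUNT };

#define IRQ_BIT(n) (1u << (n))
#define TTY_IRQS (IRQ_BIT(1) | IRQ_BIT(3) | IRQ_BIT(4))  /* keyboard, com */
#define BIO_IRQS (IRQ_BIT(6) | IRQ_BIT(14))              /* floppy, disk */
#define NET_IRQS (IRQ_BIT(9))                            /* Ethernet */

static int cpl = IPL_NONE;      /* current priority level */
static uint16_t icu_mask = 0;   /* stand-in for the dual 8259 mask registers */

/* IRQs held off at each level; each level also blocks those below it. */
static const uint16_t ipl_mask[IPL_COUNT] = {
    0,
    TTY_IRQS,
    TTY_IRQS | BIO_IRQS,
    TTY_IRQS | BIO_IRQS | NET_IRQS,
    0xffff
};

/* Set a new priority level and return the previous one. */
int spl_set(int level)
{
    assert(level >= IPL_NONE && level < IPL_COUNT);
    int old = cpl;
    cpl = level;
    icu_mask = ipl_mask[level];  /* real code would write the 8259s here */
    return old;
}

/* Restore a level saved by an earlier spl_set() call. */
void splx(int old)
{
    spl_set(old);
}

/* True if the given IRQ line is currently masked. */
int irq_blocked(int irq)
{
    assert(irq >= 0 && irq < 16);
    return (icu_mask >> irq) & 1;
}
```

A critical section would save the value returned by spl_set(IPL_BIO), do its
work with the disk interrupts held off, and then pass the saved value to
splx().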


Bootstrap Operation


One of the last considerations in the development of the 386BSD specification
was deciding how to bootstrap the BSD kernel most easily from hard or floppy
disk. We know that ISA machines have BIOS ROMs that select the device
to be booted (typically the floppy first, followed by the hard disk), load the
very first block into RAM at location 0x7c00, and finally execute it in real
mode. From this point on, we had to create some tight code to run within that
512-byte block to read in our kernel from an executable file in the UNIX file
system.
Traditional Berkeley UNIX undergoes a four-step bootstrap process to load in
the kernel. First, the initial block bootstrap is brought in from disk by the
hardware (in this case, the BIOS). The primary purpose of this assembly
language bootstrap is to load in the second 7.5-Kbyte bootstrap located
immediately after the initial boot on disk. This larger program, written in C,
is much more elaborate in that it can decipher the UNIX file system, extract
the UNIX file /boot, and load it as the next stage in the bootstrap. /boot,
the most complex of the three bootstraps, evaluates the boot event and finally
passes configuration parameters to the kernel as it is loading /vmunix, also
located in the file system.
At first we intended to write the initial block bootstrap in MASM, Microsoft's
MS-DOS assembler, and use calls to the BIOS to accomplish the boot process.
This proved to be unsatisfactory, as it still left us tied to MS-DOS. So, we
decided to use the UNIX protected mode assembler. This allowed us to "cut the
cord" with MS-DOS and permitted the system alone to support all code. We also
chose to create drivers for the hardware directly, from the initial boot block
on up, to break away from the BIOS as well. As a result, 386BSD can now be
easily retargeted to new buses that might not rely on either MS-DOS or the
BIOS.
Both the second and third bootstraps are actually separate incarnations of the
same source code (drivers and all). The only difference is that the second
bootstrap is a functional subset of the third bootstrap, so that it could fit
within the small confines required. All of the bootstraps reference a special
data structure called the disklabel that knows the layout and geometry of the
disk drive booted. In this way thousands of different disk drives can be
supported independent of MS-DOS and the BIOS information.
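One use of geometry information of this kind is turning a logical block
number into the cylinder, head, and sector that a disk controller expects.
The C sketch below shows the arithmetic. It assumes a pared-down label with
hypothetical field names (mini_label, sectors_per_track,
tracks_per_cylinder); it is not the layout of the actual disklabel
structure.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, pared-down label: only the geometry needed here. */
struct mini_label {
    uint32_t sectors_per_track;
    uint32_t tracks_per_cylinder;   /* that is, the number of heads */
};

struct chs {
    uint32_t cyl, head, sector;
};

/* Convert a logical block number to cylinder/head/sector form. */
struct chs block_to_chs(const struct mini_label *lp, uint32_t blkno)
{
    assert(lp->sectors_per_track > 0 && lp->tracks_per_cylinder > 0);

    uint32_t per_cyl = lp->sectors_per_track * lp->tracks_per_cylinder;
    struct chs r;

    r.cyl    = blkno / per_cyl;
    r.head   = (blkno % per_cyl) / lp->sectors_per_track;
    r.sector = blkno % lp->sectors_per_track + 1;  /* sectors count from 1 */
    return r;
}
```

For a drive with 17 sectors per track and 4 heads, block 100 works out to
cylinder 1, head 1, sector 16.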


Summary: Where is 386BSD Now?


Perhaps the discussion of some of these issues might have seemed difficult or
incomplete, but we found each item to be of tremendous importance in
understanding the practice of a port to the 386 architecture. Unlike Berkeley
UNIX ports to other systems, we found that we had to bend over backwards
dealing with segments, memory issues, device issues, and a plethora of unique
microprocessor features. Now, one may ask, was it all worth it?
Well, BSD is now available on the 386 platform. Even though it is only a
preliminary release, we already support the following:
Many different PC platforms, including the Compaq 386/20, Compaq Systempro
386, any 386 with the Chips and Technologies chipset, any 486 with the OPTI
chipset, Toshiba 3100SX, and more.
ESDI, IDE, and ST-506 drives
3-1/2 inch and 5-1/4 inch floppy drives
Novell NE2000 and Western Digital Ethernet controller boards
EGA, VGA, CGA, and MDA monitors
287/387 floating point, including the Cyrix EMC
A single-floppy standalone UNIX system, containing support for modems,
Ethernet, SLIP, and Kermit to facilitate downloading of 386BSD to any PC over
the Internet.

Those of you who can meet University of California requirements should obtain
a copy of 386BSD from the University of California, so that you can follow
along yourself as we work through the basics of this port from every angle.
In addition, we would like to thank some of the people who have helped make
386BSD a reality, including Mike Karels, Keith Bostic, and Kirk McKusick of
CSRG, Dixon Dick and all the support engineers at Compaq, Fred Dunlap and Bob
McGhee of Cyrix, Don Ahn (UCB), Tim L. Tucker (Evans and Sutherland), and Clem
Cole (Cole Computer Consulting).


Suggested Readings


1. Leffler, Samuel J., Marshall Kirk McKusick, Michael J. Karels, and John S.
Quarterman. The Design and Implementation of the 4.3BSD UNIX Operating System.
Reading, Mass.: Addison-Wesley, 1989.
2. Crawford, John H., and Patrick P. Gelsinger. Programming the 80386.
Alameda, Calif.: Sybex, 1987.
3. IBM Technical Reference: Personal Computer AT. Boca Raton, Fla.: IBM, 1984.


386 Segmentation and Paging


The 386 has six segment registers (CS, DS, SS, ES, FS, and GS) which can
select one of 16,383 (8,191 shared and 8,192 private) segment descriptors.
These segment descriptors reside in either the Global Descriptor Table (GDT)
or the Local Descriptor Table (LDT) and determine underlying characteristics
(type attributes, location in linear address space, and segment size). In
addition to memory segments, the operating system has system segments
available for special purposes, and call gates that facilitate controlled
indirection into other, possibly hidden, segments.
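The value loaded into a segment register is a 16-bit selector. Its layout is
not spelled out above; the sketch below relies on the standard 386 encoding
(a 13-bit descriptor index, one bit choosing GDT or LDT, and a 2-bit
requested privilege level). The structure and function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct selector {
    unsigned index;  /* which descriptor in the table (0..8191) */
    int local;       /* 0 = Global Descriptor Table, 1 = Local */
    unsigned rpl;    /* requested privilege level (0..3) */
};

/* Split a 16-bit segment selector into its three fields. */
struct selector decode_selector(uint16_t sel)
{
    struct selector s;

    s.index = sel >> 3;          /* upper 13 bits */
    s.local = (sel >> 2) & 1;    /* table indicator bit */
    s.rpl   = sel & 3;           /* low 2 bits */
    assert(s.index < 8192);      /* 13 bits: 8,192 entries per table */
    return s;
}
```

Decoded this way, the selector 0x7 used by the lcall in Figure 8 names entry
0 of the local descriptor table at privilege level 3.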
Memory segments are selected through dedicated segment registers, each with
a different role. The CS register selects program instructions. The DS
register selects program data. The SS register selects the program stack. The
ES register selects the destination of string instructions. Both the FS and GS
registers are undedicated at this time. It is even possible to reassign the
segment registers in the machine instructions, so one can view the ES, FS, and
GS segment registers as alternative DS segment registers.
Each memory segment has a size, and can be as large as 4 gigabytes. In order
for that segment to be active, however, it must consume space (global linear
address space) in direct proportion to its size. This means that, although a
process may possess a total address space greater than 4 gigabytes, only an
aggregate of active segments totaling less than or equal to 4 gigabytes is
permitted. While the 386 theoretically can address 2^14 x 2^32 (2^46) bytes, in
practice only 2^32 bytes (4 gigabytes) can be active at any time. If the
maximum 4 gigabytes of instruction, data, and stack (for both operating system
and each user process) is invoked, managing the global linear address space to
allow segments to be active (present) when linear address space is available
becomes a significant problem.
Segments can also overlap in linear address space. Because the same memory
can then be accessed interchangeably through either segment, possibly with
different attributes, this overlap is called an alias.
80x86 segments can be either "bottom up" or "top down." A segment that is
bottom up means that one begins with segment relative address 0 and "grows up"
to the desired address x (that is, [0 ... x]). A segment that is top down
means that one begins with segment relative address 0xffffffff and "grows
down" to the desired address y ([y ... 0xffffffff]). (Yes, we know this is
awkward, but that's how it works.) Segments are grown only in accordance with
these rules. The stack segment is the only common example of a downward
growing segment.
Many other attributes are provided that control the type of access allowed
within the segment. The designers of the 386 prefer segments be used in memory
protection regulation, and have provided a plethora of features not found in
the paging unit. Segment attributes, such as 32-bit vs. 16-bit operations,
byte vs. page granularity, and user vs. supervisor mode, control the mode of
the microprocessor, depending on the segments that are actually in use.
It is quite costly to implement segments in the microprocessor. That is why
underlying shadow registers, invisible to the programmer, are used. They
provide a hardware "assist" to the segmentation functionality.
We manage to avoid many paging bookkeeping problems by running in "flat" mode.
This is accomplished by aliasing the CS, DS, SS, and ES segment registers to
the exact same linear address space (see Figure 4), thus making it an identity
function. We can then regard any of the intrasegment addresses as if they were
linear address space. Of course, this ends up defeating the advantages of
segments as well.
Some new microprocessors, such as the 386, feature architectures which exploit
large segments. This is because 4 gigabytes is starting to fill up, and going
to 64-bit addresses will not be happening soon. Many would argue that 4
gigabytes will never be filled, but history states otherwise. 64-Mbit RAM is
already on the drawing boards -- in fact, some actually exist. In a few years,
it will be commercially available. Because a typical computer uses on average
64 to 128 RAM chips, with many companies currently offering 64-Mbyte systems
(512 1-Mbit RAM), it will not be long before a computer with 512 64-Mbit RAM
chips (4 gigabytes) is introduced. As such, segmented architectures may
provide a way of spanning the address space gap that could result.
It's amazing that at the beginning of the microcomputer revolution, an Altair
8800 with 4 Kbyte of RAM was considered incredible because it could run Basic!
How times change.
We have seen how segmentation works in the 386. Now let's examine paging. For
our purposes, segmentation on the 386 is defeated by running in "flat" mode.
We can then consider intrasegment addresses as if they are linear address
space.
Paging works with a two-level scheme that permits the sparse allocation of
address space, so that the whole address space, or even all of the address
space mapping information, need not be present. Otherwise, a 4 gigabyte
process would require more than 4 Mbyte of page tables, even though it may be
the case that only a few thousand pages would be active at any time. Typically, for
our purposes, only three pages of page tables are allocated per process (page
directory and the top and bottom address space page tables). This is
sufficient to run a 4-Mbyte process (instruction plus data size) and 4 Mbyte
of stack. (Note that all processes run with a full-sized address space and can
dynamically grow to use it.) This mechanism is quite successful in reducing
memory-management overhead.
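The figures in this paragraph follow from simple arithmetic, which the C
sketch below spells out. It assumes the 386's 4-Kbyte pages and 4-byte
entries (so 1024 entries per page of page table); the names PAGE_BYTES,
PTES_PER_PAGE, and pt_pages_for are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES    4096u   /* size of one page */
#define PTES_PER_PAGE 1024u   /* 4096 bytes / 4 bytes per entry */

/* Address space mapped by one page of page table entries: 4 Mbytes. */
#define BYTES_PER_PT_PAGE ((uint64_t)PAGE_BYTES * PTES_PER_PAGE)

/* Pages of page table needed to map `bytes` of contiguous address
 * space that begins on a 4-Mbyte boundary. */
uint64_t pt_pages_for(uint64_t bytes)
{
    assert(bytes <= (uint64_t)1 << 32);   /* at most the full 4 gigabytes */
    return (bytes + BYTES_PER_PT_PAGE - 1) / BYTES_PER_PT_PAGE;
}
```

Mapping all 4 gigabytes takes 1024 pages of page table (4 Mbytes) plus the
page directory itself, hence "more than 4 Mbyte." A 4-Mbyte program and a
4-Mbyte stack each need one page, which together with the directory gives
the three pages mentioned above.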
The two-level scheme splits the incoming virtual address into three parts: 10
bits of page table directory index, 10 bits of page table index, and 12 bits
of offset within a page. The page table directory is a single page of physical
memory that facilitates allocation of page table space by breaking it up into
4-Mbyte chunks of linear address space per each of its 1024 PDEs (Page
Directory Entry), which determine the location of underlying page tables in
physical memory.
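The three-way split is just shifting and masking, as this short C sketch
shows. The structure and function names are hypothetical; the field widths
(10, 10, and 12 bits) are the ones given above.

```c
#include <assert.h>
#include <stdint.h>

struct vaddr_parts {
    unsigned pdi;   /* page directory index: top 10 bits */
    unsigned pti;   /* page table index: middle 10 bits */
    unsigned off;   /* offset within the page: low 12 bits */
};

/* Break a 32-bit address into directory index, table index, and offset. */
struct vaddr_parts split_vaddr(uint32_t va)
{
    struct vaddr_parts p;

    p.pdi = va >> 22;
    p.pti = (va >> 12) & 0x3ff;
    p.off = va & 0xfff;
    assert(p.pdi < 1024);     /* 10 bits: one of 1024 PDEs */
    return p;
}

/* Reassemble the address from its parts (the inverse of split_vaddr). */
uint32_t join_vaddr(struct vaddr_parts p)
{
    assert(p.pdi < 1024 && p.pti < 1024 && p.off < 4096);
    return ((uint32_t)p.pdi << 22) | ((uint32_t)p.pti << 12) | p.off;
}
```

For example, address 0x00403025 selects PDE 1 (the second 4-Mbyte chunk),
entry 3 of that chunk's page table, and byte 0x025 of the page.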
Each PDE-addressed page of a page table contains 1024 PTEs (Page Table Entry).
A PTE is similar in form and function to a PDE. The major difference between a
PDE and a PTE is that a PTE selects the physical page frame for the desired
reference. Once the least-significant address bits (the offset within the
frame) are added, the final address is determined. This method is identical
to that used in many
other common microprocessors (the MC68030, Clipper, and NS32532, among
others).
Each PDE and PTE may be marked either "invalid" (not currently used) or
"valid" (the underlying page of physical memory is present). In addition,
other attribute bits mark entries as "read only" or "read-write" and
"supervisor" or "user." Because segmentation is not used to control memory
protection, we keep processes honest by relying entirely on the paging
mechanism's attributes for protection as well as for the allocation of memory.
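A sketch of how these attribute bits decide whether an access is allowed is
given below in C. The bit positions are those of the 386 entry format
(present in bit 0, read-write in bit 1, user in bit 2); the macro and
function names are hypothetical. The sketch also reflects the 386 rule that
supervisor-mode accesses are not restricted by the read-only attribute. Only
one entry is checked here; a complete check would apply the same test to the
PDE.

```c
#include <assert.h>
#include <stdint.h>

#define PG_V  0x001u   /* valid: underlying page is present */
#define PG_RW 0x002u   /* read-write (clear means read only) */
#define PG_U  0x004u   /* user (clear means supervisor only) */

/* Return 1 if the entry permits the access, 0 if it would fault. */
int pte_permits(uint32_t pte, int user_mode, int is_write)
{
    assert((user_mode == 0 || user_mode == 1) &&
           (is_write == 0 || is_write == 1));

    if (!(pte & PG_V))
        return 0;                   /* invalid entry: always a fault */
    if (user_mode && !(pte & PG_U))
        return 0;                   /* supervisor page, user access */
    if (user_mode && is_write && !(pte & PG_RW))
        return 0;                   /* read-only page, user write */
    return 1;
}
```

A user-mode write to a valid, user-accessible, read-only page is refused,
while the same write from supervisor mode is allowed.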
The mechanism to convert virtual to physical addresses is quite elaborate. To
speed things up, the 386 keeps a Translation Look-aside Buffer (TLB) of 32
cached entries, managed entirely transparently. One side effect of this
hardware is that if the operating system changes any of the page tables that
may be in use, it must flush this cache. The 386 does not allow selective
flushing -- only a complete flush of all cache entries by reloading the page
directory address register cr3. This is an expensive operation which may be
repeatedly performed as we successively transform an address mapping of a
process within the kernel (as many as six times in the worst case).
--B.J., L.J.

_PORTING UNIX TO THE 80386: A PRACTICAL APPROACH_ by William Frederick Jolitz
and Lynne Greer Jolitz
[FIGURE 6]

/* Intel 386 process control block */

struct pcb {
 struct i386tss pcbtss;
#define pcb_ksp pcbtss.tss_esp0
#define pcb_ptd pcbtss.tss_cr3
#define pcb_pc pcbtss.tss_eip
#define pcb_psl pcbtss.tss_eflags
#define pcb_usp pcbtss.tss_esp
#define pcb_fp pcbtss.tss_ebp

/* Software pcb (extension) */
int pcb_fpsav;
#define FP_NEEDSAVE 0x1 /* need save on next context switch */
#define FP_NEEDRESTORE 0x2 /* need restore on next DNA fault */
 struct save87 pcb_savefpu;
 struct pte *pcb_p0br;
 struct pte *pcb_p1br;
 int pcb_p0lr;
 int pcb_p1lr;
 int pcb_szpt; /* number of pages of user page table */
 int pcb_cmap2;

 int *pcb_sswap;
 long pcb_sigc[8]; /* sigcode actually 19 bytes */
 int pcb_iml; /* interrupt mask level */
};

/* Intel 386 Task State Segment (TSS) */
struct i386tss {
 long tss_link; /* actually 16 bits: top 16 bits must be zero */
 long tss_esp0; /* kernel stack pointer privilege level 0 */
#define tss_ksp tss_esp0
 long tss_ss0; /* actually 16 bits: top 16 bits must be zero */
 long tss_esp1; /* kernel stack pointer privilege level 1 */
 long tss_ss1; /* actually 16 bits: top 16 bits must be zero */
 long tss_esp2; /* kernel stack pointer privilege level 2 */
 long tss_ss2; /* actually 16 bits: top 16 bits must be zero */
 long tss_cr3; /* page table directory physical address */
#define tss_ptd tss_cr3
 long tss_eip; /* program counter */
#define tss_pc tss_eip
 long tss_eflags; /* program status longword */
#define tss_psl tss_eflags
 long tss_eax;
 long tss_ecx;
 long tss_edx;
 long tss_ebx;
 long tss_esp; /* user stack pointer */
#define tss_usp tss_esp
 long tss_ebp; /* user frame pointer */
#define tss_fp tss_ebp
 long tss_esi;
 long tss_edi;
 long tss_es; /* actually 16 bits: top 16 bits must be zero */
 long tss_cs; /* actually 16 bits: top 16 bits must be zero */
 long tss_ss; /* actually 16 bits: top 16 bits must be zero */
 long tss_ds; /* actually 16 bits: top 16 bits must be zero */
 long tss_fs; /* actually 16 bits: top 16 bits must be zero */
 long tss_gs; /* actually 16 bits: top 16 bits must be zero */
 long tss_ldt; /* actually 16 bits: top 16 bits must be zero */
 long tss_ioopt; /* options & io offset bitmap: currently zero */
 /* XXX unimplemented .. i/o permission bitmap */
};


[FIGURE 8]

#include <syscall.h>

 .globl _write, _errno

# amtwritten = write(fildes, address, count);

_write: # caller places arguments on stack
 lea SYS_write,%eax # select desired system call
 lcall $0x7,0 # call the system
 jb 1f # if system returns error, handle
 ret # otherwise return

1: movl %eax,_errno # save error in global variable
 movl $-1,%eax # indicate error has occurred
 ret # and return

Figure 10: ISA device controllers: (a) data structures for configuring devices
(b) sample table of possible devices

[FIGURE 10a]

/* Per device structure. */
struct isa_device {
 struct isa_driver *id_driver; /* per driver configuration info */
 short id_iobase; /* Base i/o address register */
 short id_irq; /* Interrupt request */
 short id_drq; /* DMA request */
 caddr_t id_maddr; /* Physical shared memory address on bus */
 int id_msize; /* Size of shared memory */
 int (*id_intr)(); /* Interrupt interface routine */
 int id_unit; /* Physical unit number within driver */
 int id_scsiid; /* SCSI id if SCSI device */
 int id_alive; /* Device is present and accounted for */
};

/* Per driver structure. */
struct isa_driver {
 int (*probe)(); /* Test whether device is present */
 int (*attach)(); /* Setup driver for a device */
 char *name; /* Device name */
};

[FIGURE 10b]

/* ISA Bus devices */

#include "machine/isa/device.h" /* device structure */

/* Software drivers */
#define V(s) V/**/s
extern struct isa_driver wddriver; extern V(wd0)();
extern struct isa_driver cndriver; extern V(cn0)();
extern struct isa_driver comdriver; extern V(com0)(); extern V(com1)();
extern struct isa_driver fddriver; extern V(fd0)();
extern struct isa_driver nedriver; extern V(ne0)();


/* Possible hardware devices */
#define C (caddr_t)
struct isa_device isa_devtab_bio[] = {
/* driver      iobase   irq    drq  maddr      msiz     intr     unit */
{ &wddriver,   IO_WD0,  IRQ14, -1,  C 0,       0,       V(wd0),  0},
{ &wddriver,   IO_WD1,  IRQ13, -1,  C 0,       0,       V(wd1),  1},
{ &fddriver,   IO_FD0,  IRQ6,  2,   C 0,       0,       V(fd0),  0},
{ &fddriver,   IO_FD1,  IRQ6,  2,   C 0,       0,       V(fd1),  1},
0
};

struct isa_device isa_devtab_tty[] = {
/* driver      iobase   irq    drq  maddr      msiz     intr     unit */
{ &vgadriver,  IO_VGA,  0,     -1,  C 0xa0000, 0x10000, 0,       0},
{ &cgadriver,  IO_CGA,  0,     -1,  C 0xa0000, 0x4000,  0,       0},
{ &mdadriver,  IO_MDA,  0,     -1,  C 0xb8000, 0x4000,  0,       0},
{ &kbddriver,  IO_KBD,  IRQ1,  -1,  C 0,       0,       V(kbd0), 0},
{ &cndriver,   IO_KBD,  IRQ1,  -1,  C 0,       0,       V(cn0),  0},

{ &comdriver,  IO_COM0, IRQ4,  -1,  C 0,       0,       V(com0), 0},
{ &comdriver,  IO_COM1, IRQ3,  -1,  C 0,       0,       V(com1), 1},
0
};

struct isa_device isa_devtab_net[] = {
/* driver      iobase   irq    drq  maddr      msiz     intr     unit */
{ &nedriver,   0x320,   IRQ9,  -1,  C 0,       0,       V(ne0),  0},
0
};

struct isa_device isa_devtab_null[] = {
/* driver      iobase   irq    drq  maddr      msiz     intr     unit */
0
};








January, 1991
DESIGNING PLAN 9


Bell Labs' Plan 9 research project looks to tomorrow




Rob Pike, Dave Presotto, Ken Thompson, and Howard Trickey


The authors are all researchers at AT&T Bell Laboratories, Murray Hill, NJ
07974. This paper was originally delivered at the UKUUG Conference in London
in July 1990 and is reprinted here with permission from the UKUUG.


Plan 9 is a distributed computing environment assembled from separate machines
acting as CPU servers, file servers, and terminals. The pieces are connected
by a single file-oriented protocol and local name space operations. Because
the system was built from distinct, specialized components rather than similar
general-purpose components, Plan 9 achieves levels of efficiency, security,
simplicity, and reliability seldom realized in other distributed systems. This
article discusses the building blocks, interconnections, and conventions of
Plan 9.
Unhappy with the trends in commercial systems, we began a few years ago to
design a system that could adapt well to changes in computing hardware. In
particular, we wanted to build a system that could profit from the continuing
improvements in personal machines with bitmap graphics, in medium- and
high-speed networks, and in high-performance microprocessors. A common
approach is to connect a group of small personal timesharing systems --
workstations -- by a medium-speed network, but this has a number of failings.
Because each workstation has private data, each must be administered
separately; maintenance is difficult to centralize. The machines are replaced
every couple of years to take advantage of technological improvements,
rendering the hardware obsolete, often before it has been paid for. Most
telling, a workstation is a largely self-contained system, not specialized to
any particular task; too slow and I/O-bound for fast compilation; too
expensive to be used just to run a window system. For our purposes --
primarily software development -- it seemed that an approach based on
distributed specialization rather than compromise would best address issues of
cost-effectiveness, maintenance, performance, reliability, and security. We
decided to build a completely new system, including compiler, operating
system, networking software, command interpreter, window system, and terminal.
This construction would also offer an occasion to rethink, revisit, and
perhaps even replace most of the utilities we had accumulated over the years.
Plan 9 is divided along lines of service function. CPU servers concentrate
computing power into large (not overloaded) multiprocessors; file servers
provide repositories for storage; terminals give each user a dedicated
computer with bitmap screen and mouse on which to run a window system. Sharing
computing and file storage services provides a sense of community for a group
of programmers, amortizes costs, and centralizes and simplifies management and
administration.
The pieces communicate by a single protocol, built above a reliable data
transport layer offered by an appropriate network, that defines each service
as a rooted tree of files. Even for services not usually considered as files,
the unified design permits some noteworthy and profitable simplification. Each
process has a local filename space that contains attachments to all services
the process is using and thereby to the files in those services. One of the
most important jobs of a terminal is to support its user's customized view of
the entire system as represented by the services visible in the name space.
To be used effectively, the system requires a CPU server and a file server
(large machines best housed in an air conditioned machine room with
conditioned power) and a terminal. The system is intended to provide service
at the level of a departmental computer center or larger, and its strengths
stem in part from economies of scale. Accordingly, one of our goals is to
unite the computing environment for all of AT&T Bell Laboratories (about
30,000 people) into a single Plan 9 system comprising thousands of CPU and
file servers spread throughout, and clustered in, the company's various
departments. That is clearly beyond the administrative capacity of
workstations on Ethernets.
The following sections describe the basic components of Plan 9, explain the
name space and how it is used, and offer examples of unusual services that
illustrate how the ideas of Plan 9 can be applied to a variety of problems.


CPU Servers


Several computers provide CPU service for Plan 9. The production CPU server is
a Silicon Graphics Power Series machine with four 25-MHz MIPS processors, 128
Mbytes of memory, no disk, and a 20 Mbyte-per-second back-to-back DMA
connection to the file server. It also has Datakit and Ethernet controllers to
connect to terminals and non-Plan 9 systems. The operating system provides a
conventional view of processes, based on fork and exec system calls, and of
files, mostly determined by the remote file server. Once a connection to the
CPU server is established, the user may begin typing commands to a command
interpreter in a conventional-looking environment.
A multiprocessor CPU server has several advantages. The most important is its
ability to absorb load. If the machine is not saturated (which can be
economically feasible for a multiprocessor), there is usually a free processor
ready to run a new process. This is similar to the notion of free disk blocks
in which to store new files on a file system. The comparison extends farther:
Just as you might buy a new disk when a file system gets full, you may add
processors to a multiprocessor when the system gets busy, without needing to
replace or duplicate the entire system. Of course, you may also add new CPU
servers and share the file servers.
The CPU server performs compilation, text processing, and other applications.
It has no local storage; all the permanent files it accesses are provided by
remote servers. Transient parts of the name space, such as the collected
images of active processes or services provided by user processes, may reside
locally but these disappear when the CPU server is rebooted. Plan 9 CPU
servers are as interchangeable for their task -- computation -- as are
ordinary terminals for theirs.


File Servers


The Plan 9 file servers hold all permanent files. The current server is
another Silicon Graphics computer with two processors, 64 Mbytes of memory,
600 Mbytes of magnetic disk, and a 300 gigabyte jukebox of write-once optical
disk (WORM). (This machine is to be replaced by a MIPS 6280, a single
processor with much greater I/O bandwidth.) It connects to Plan 9 CPU servers
through 20 Mbyte-per-second DMA links, and to terminals and other machines
through conventional networks.
The file server presents to its clients a file system rather than, say, an
array of disks or blocks or files. The files are named by slash-separated
components that label branches of a tree, and may be addressed for I/O at the
byte level. The location of a file in the server is invisible to the client.
The true file system resides on the WORM, and is accessed through a two-level
cache of magnetic disk and RAM. The contents of recently-used files reside in
RAM and are sent to the CPU server rapidly by DMA over a high-speed link,
which is much faster than regular disk although not as fast as local memory.
The magnetic disk acts as a cache for the WORM and simultaneously as a backup
medium for the RAM. With the high-speed links, it is unnecessary for clients
to cache data; the file server centralizes the caching for all its clients,
avoiding the problems of distributed caches.
The file server actually presents several file systems. One, the "main"
system, is used as the file system for most clients. Other systems provide
less generally-used data for private applications. One service is unusual: the
backup system. Once a day, the file server freezes activity on the main file
system and flushes the data in that system to the WORM. Normal file service
continues unaffected, but changes to files are applied to a fresh hierarchy,
fabricated on demand, using a copy-on-write scheme. Thus, the file tree is
split into two parts: A read-only version representing the system at the time
of the dump, and an ordinary system that continues to provide normal service.
The roots of these old file trees are available as directories in a file
system that may be accessed exactly as any other (read-only) system. For
example, the file /usr/rob/doc/plan9.ms as it existed on April 1, 1990, can be
accessed through the backup file system by the name
/1990/0401/usr/rob/doc/plan9.ms. This scheme permits recovery or comparison of
lost files by traditional commands such as file copy and comparison routines
rather than by special utilities in a backup subsystem. Moreover, the backup
system is provided by the same file server and the same mechanism as the
original files, so permissions in the backup system are identical to those in
the main system; you cannot use the backup data to subvert security.
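The naming rule in the example (/YYYY/MMDD followed by the original path) is
easy to express in code. The C sketch below builds the backup name for a file
as of a given date. The function name dump_path is hypothetical, and the rule
is inferred from the single example above rather than from a specification
of the backup file system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the backup-system name of `path` as of the given date.
 * Returns 0 on success, -1 on bad input or if buf is too small. */
int dump_path(char *buf, size_t len, int year, int month, int day,
              const char *path)
{
    assert(buf != NULL && path != NULL);

    if (path[0] != '/' || year < 0 || year > 9999 ||
        month < 1 || month > 12 || day < 1 || day > 31)
        return -1;

    /* "/YYYY/MMDD" is 10 characters; add the path and the final NUL. */
    if (10 + strlen(path) + 1 > len)
        return -1;

    snprintf(buf, len, "/%04d/%02d%02d%s", year, month, day, path);
    return 0;
}
```

Called with 1990, 4, 1 and /usr/rob/doc/plan9.ms, it produces
/1990/0401/usr/rob/doc/plan9.ms, which an ordinary copy or comparison
command can then open like any other file.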


Terminals


The standard terminal for Plan 9 is a Gnot (with silent "G"), a
locally-designed machine of which several hundred have been manufactured. The
terminal's hardware is reminiscent of a diskless workstation, with 4 or 8
Mbytes of memory, a 25-MHz 68020 processor, a 1024 x 1024 pixel display with 2
bits per pixel, a keyboard, and a mouse. It has no external storage and no
expansion bus; it is a terminal, not a workstation. A 2 megabit per second
packet-switched distribution network connects the terminals to the CPU and
file servers. Although the bandwidth is low for applications such as
compilation, it is more than adequate for the terminal's intended purpose: To
provide a window system, that is, a multiplexed interface to the rest of Plan
9.
Unlike a workstation, the Gnot does not handle compilation; that is done by
the CPU server. The terminal runs a version of the CPU server's operating
system, configured for a single, smaller processor with support for bitmap
graphics, and uses that to run programs such as a window system and a text
editor. Files are provided by the standard file server over the terminal's
network connection.
Just like old character terminals, all Gnots are equivalent, as they have no
private storage either locally or on the file server. They are inexpensive
enough that every member of our research center can have two -- one at work
and one at home -- and see exactly the same system on both. All the files and
computing resources remain at work where they can be shared and maintained
effectively.


Networks


Plan 9 has a variety of networks that connect the components. To connect
components on a small (computer center or departmental) scale, CPU servers and
file servers communicate over back-to-back DMA controllers. More distant
machines are connected by traditional networks such as Ethernet or Datakit,
which a terminal or CPU server may use completely transparently except for
performance considerations. Because our Datakit network spans the country,
Plan 9 systems could potentially be assembled on a large scale. (See Figure
1.)
To keep their cost down, Gnots employ an inexpensive network that uses
standard telephone wire and a single-chip interface. (The throughput is
respectable, about 120 Kbytes per second.) Getting even that bandwidth to
home, however, is problematic. Some of us have DS-1 lines at 1.54 megabits per
second; others are experimenting with more modest communications equipment.
Because the terminal only mediates communication -- it instructs the CPU
server to connect to the file server but does not participate in the resulting
communication -- the relatively low bandwidth to the terminal does not affect
the overall performance of the system.


Name Spaces


There are two kinds of name space in Plan 9: The global space of the names of
the various servers on the network and the local space of files and servers
visible to a process. Names of machines and services connected to Datakit are
hierarchical: nj/mh/astro/helix, for example, roughly defines the area,
building, department, and machine. Because the network provides naming for its
machines, Plan 9 need not directly handle global naming issues. It does,
however, attach network services to the local name space on a per-process
basis. This is used to address the issues of customizability, transparency,
and heterogeneity.
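The four-level form of such a name can be made concrete with a small Python
sketch. It assumes exactly four slash-separated components, which the text
describes only as a rough convention; DatakitName and parse_datakit_name are
hypothetical names, not part of Datakit or Plan 9.

```python
from typing import NamedTuple

class DatakitName(NamedTuple):
    """Hypothetical record for the four levels named in the text."""
    area: str
    building: str
    department: str
    machine: str

def parse_datakit_name(name: str) -> DatakitName:
    """Split a hierarchical name such as nj/mh/astro/helix into its levels."""
    parts = name.split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected area/building/department/machine: {name!r}")
    return DatakitName(*parts)

helix = parse_datakit_name("nj/mh/astro/helix")
print(helix.department, helix.machine)
```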

The protocol for communicating with Plan 9 services is file-oriented; all
services, local or remote, are arranged into a set of file-like objects
collected into a hierarchy called the name space of the server. For a file
server, this is a trivial requirement. Other services must sometimes be more
imaginative. For instance, a printing service might be implemented as a
directory in which processes create files to be printed. Other examples are
described in the following sections. For the moment, consider just a set of
ordinary file servers distributed around the network.
When a program calls a Plan 9 service (using mechanisms inherent in the
network and outside Plan 9 itself), the program is connected to the root of the
service's name space. Using the protocol, usually as mediated by the local
operating system into a set of file-oriented system calls, the program
accesses the service by opening, creating, removing, reading, and writing
files in the name space.
After the user selects desired services (file servers containing personal
files, data, or software for a group project, for example), their name spaces
are collected and joined to the user's own private name space by a fundamental
Plan 9 operator called attach. The user's name space is formed by the union of
the spaces of the services being used. The local name space is assembled for
each user by the local operating system, typically that of the terminal. The name
space is modifiable on a per-process level, although in practice the name
space is assembled at login time and shared by all that user's processes.
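One way to picture attachment is as a table that maps points in the local name
space to services. The Python sketch below is a simplified model under that
assumption: it resolves a path by longest matching attachment point and does
not attempt a true union of directories. The class, its methods, and the
service names are hypothetical.

```python
class NameSpace:
    """Toy per-process name space: attachment point -> service name."""

    def __init__(self):
        self._mounts = {}

    @staticmethod
    def _normalize(path):
        # Collapse repeated or trailing slashes: "/proc/" becomes "/proc".
        return "/" + "/".join(p for p in path.split("/") if p)

    def attach(self, service, point):
        """Join the root of a service's name space to a local directory."""
        self._mounts[self._normalize(point)] = service

    def resolve(self, path):
        """Return (service, path within that service) for a local path."""
        path = self._normalize(path)
        best = None
        for point in self._mounts:
            covers = point == "/" or path == point or path.startswith(point + "/")
            if covers and (best is None or len(point) > len(best)):
                best = point
        if best is None:
            raise FileNotFoundError(path)
        return self._mounts[best], "/" + path[len(best):].lstrip("/")

ns = NameSpace()
ns.attach("main-fs", "/")       # hypothetical service names
ns.attach("procfs", "/proc")
print(ns.resolve("/proc/77/mem"))
print(ns.resolve("/bin/rc"))
```

Because the table belongs to the NameSpace object rather than to the system as
a whole, giving each process its own object models the per-process
modifiability described above.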
To log in to the system, the user instructs the terminal which file server to
connect to. The terminal calls the server, authenticates the user (described
later), and loads the operating system from the server. It then reads a file,
called the "profile," in the user's personal directory. The profile contains
commands that define what services to use by default, and where in the local
name space to attach them. For example, the main file server to be used is
attached to the root of the local name space, "/", and the process file system
is attached to the directory /proc. The profile then typically starts the
window system.
Within each window, a command interpreter may be used to execute commands
locally, using file names interpreted in the name space assembled by the
profile. For computation-intensive applications such as compilation, the user
runs a command cpu that selects (automatically or by name) a CPU server to run
commands. After typing cpu, the user sees a regular prompt from the command
interpreter. But that command interpreter is running on the CPU server in the
same name space -- even the same current directory -- as the cpu command
itself.
The terminal exports a description of the name space to the CPU server, which
then assembles an identical name space, so the customized view of the system
assembled by the terminal is the same as that seen on the CPU server. (A
description of the name space is used rather than the name space itself so the
CPU server may use high-speed links when possible, rather than requiring
terminal intervention.) The cpu command affects only the performance of
subsequent commands; it has nothing to do with the services available or how
they are accessed.
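The idea of exporting a description rather than the name space itself can be
sketched in Python as follows. The one-line-per-attachment text format, the
function names, and the dial callback (which stands in for the CPU server
opening its own connection to a named service) are all assumptions made for
illustration.

```python
def export_description(mounts):
    """Terminal side: describe the name space as text, one attachment per line."""
    return "\n".join(f"{point} {service}"
                     for point, service in sorted(mounts.items()))

def rebuild(description, dial):
    """CPU server side: re-attach each named service over its own connection."""
    name_space = {}
    for line in description.splitlines():
        point, service = line.split(" ", 1)
        # dial() is hypothetical; it lets the CPU server choose its own link.
        name_space[point] = dial(service)
    return name_space

terminal_mounts = {"/": "main-fs", "/usr/project": "project-fs"}
description = export_description(terminal_mounts)
cpu_name_space = rebuild(description, dial=lambda s: f"direct link to {s}")
print(description)
```

Only names cross the terminal's connection; the CPU server reaches each
service itself, so the attachment points are identical while the data path
need not pass through the terminal.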
The following are a few examples of the usage and possibilities afforded by
Plan 9.


The Process File System


An example of a local service is the "process file system," which permits
examination and debugging of executing processes through a file-oriented
interface. It is related to Killian's process file system but its differences
exemplify the way that Plan 9 services are constructed.
The root of the process file system is conventionally attached to the
directory /proc. (Convention is important in Plan 9; many programs have
conventional names built in that require the name space to have a certain
form. For example, it doesn't matter which server the command interpreter
/bin/rc comes from, but it must have that name to be accessible by the
commands that call on it.) After attachment, the directory /proc itself
contains one subdirectory for each local process in the system, with name
equal to the numerical unique identifier of that process. (Processes running
on the remote CPU server may also be made visible; this will be discussed
shortly.) Each subdirectory contains a set of files that implement the view of
that process. For example, /proc/77/mem contains an image of the virtual
memory of process number 77. That file is closely related to the files in
Killian's process file system, but unlike Killian's, Plan 9's /proc implements
other functions through other files, rather than through peculiar operations
applied to a single file. Table 1 shows a list of the files provided for each
process.
Table 1: Files provided for the "process file system"

 Filename  Description
 ------------------------------------------------------------------------
 mem       The virtual memory of the process image. Offsets in the file
           correspond to virtual addresses in the process.

 ctl       Control behavior of the processes. Messages sent (by a write
           system call) to this file cause the process to stop,
           terminate, resume execution, and so on.

 text      The file from which the program originated. This is typically
           used by a debugger to examine the symbol table of the target
           process, but is in all respects except name the original
           file; thus one may type '/proc/77/text' to the command
           interpreter to instantiate the program afresh.

 note      Any process with suitable permissions may write the note file
           of another process to send it an asynchronous message for
           interprocess communication. The system also uses this file to
           send (poisoned) messages when a process misbehaves, for
           example, divides by zero.

 status    A fixed-format ASCII representation of the status of the
           process. It includes the name of the file the process was
           executed from, the CPU time it has consumed, its current
           state, and so on.


The status file illustrates how heterogeneity and portability can be handled
by a file server model for system functions. The command cat /proc/*/status
presents the status of all processes in the system; in fact, the process
status command ps is just a reformatting of the ASCII text so gathered. The
source for ps is a page long and is completely portable across machines. Even
when /proc contains files for processes on several heterogeneous machines, the
same implementation works.
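To illustrate why a ps built this way is short and portable, here is a minimal
Python sketch. The article does not give the layout of the status file, so the
fixed column widths below are invented, and an in-memory dictionary stands in
for the /proc directory; every name in the sketch is hypothetical.

```python
# Hypothetical fixed-width layout; the real status format is not given.
# Columns: program text (28), state (12), CPU time in milliseconds (12).
TEXT_W, STATE_W, CPU_W = 28, 12, 12

def format_status(text, state, cpu_ms):
    """Build one fixed-format ASCII status record, as a server might."""
    return f"{text:<{TEXT_W}}{state:<{STATE_W}}{cpu_ms:>{CPU_W}d}\n"

def parse_status(record):
    """Split a record by column position; no machine word layout is involved."""
    text = record[:TEXT_W].rstrip()
    state = record[TEXT_W:TEXT_W + STATE_W].rstrip()
    cpu_ms = int(record[TEXT_W + STATE_W:TEXT_W + STATE_W + CPU_W])
    return text, state, cpu_ms

def ps(proc):
    """Reformat every process's status record, ordered by process number."""
    lines = []
    for pid in sorted(proc, key=int):
        text, state, cpu_ms = parse_status(proc[pid]["status"])
        lines.append(f"{pid:>6} {state:<10} {cpu_ms / 1000:8.2f} {text}")
    return "\n".join(lines)

# Stand-in for /proc: one subdirectory per process, each with a status file.
proc = {
    "77": {"status": format_status("/bin/rc", "Running", 1500)},
    "9": {"status": format_status("/bin/ed", "Sleeping", 250)},
}
print(ps(proc))
```

Since the records are plain text with fixed columns, the same parsing code
works whatever processor produced them.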
The functions provided by the ctl file could have been made available through
further files (stop or terminate, for example). We chose instead to fold all
the true control operations into the ctl file and to provide the more
data-intensive functions through separate files.
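The design choice of a single ctl file can be modeled as one write handler
that dispatches on the message text. In the Python sketch below, the message
words "stop", "resume", and "terminate" and the state names are assumptions;
the article does not list the actual messages.

```python
class Process:
    """Toy process whose only control interface is a write to its ctl file."""

    # Hypothetical message words mapped to the state each one produces.
    _TRANSITIONS = {"stop": "stopped", "resume": "running", "terminate": "dead"}

    def __init__(self, pid):
        self.pid = pid
        self.state = "running"

    def write_ctl(self, message):
        """Handle one message written to /proc/<pid>/ctl."""
        word = message.strip()
        if self.state == "dead":
            raise ValueError(f"process {self.pid} has terminated")
        if word not in self._TRANSITIONS:
            raise ValueError(f"unknown control message: {word!r}")
        self.state = self._TRANSITIONS[word]

p = Process(77)
p.write_ctl("stop\n")
print(p.pid, p.state)
```

Adding a control operation in this model means adding a message word, not a
new file, which keeps the set of files per process small.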
Note that the services /proc provides, although varied, do not strain the
notion of a process as a file. For example, it is not possible to terminate a
process by attempting to remove its process file, nor is it possible to start
a new process by creating a process file. The files give an active view of the
processes, but they do not literally represent them. This distinction is
important when designing services as file systems.


The Window System


In Plan 9, user programs, as well as specialized stand-alone servers, may
provide file service. The window system is an example of such a program; one
of Plan 9's most unusual aspects is that the window system is implemented as a
user-level file server.
The window system is a server that presents a file /dev/cons, similar to the
/dev/tty or CON: of other systems, to the client processes running in its
windows. Because it controls all I/O activities on that file, it can arrange
for each window's group of processes to see a private /dev/cons. When a new
window is made, the window system allocates a new /dev/cons file, puts it in
a new name space (otherwise the same as its own) for the new client, and
begins a client process in that window. That process connects the standard
input and output channels to /dev/cons using the normal file opening system
call and executes a command interpreter. When the command interpreter prints a
prompt, it will therefore be written to /dev/cons and appear in the
appropriate window.
It is instructive to compare this structure to other operating systems. Most
operating systems provide a file-like /dev/cons that is an alias for the
terminal connected to a process. A process that opens the special file
accesses the terminal it is running on without knowing the terminal's precise
name. Because the alias is usually provided by special arrangement in the
operating system, it can be difficult for a window system to guarantee its
client processes access to their window through this file. Plan 9 handles this
problem easily by inverting it. A set of processes in a window shares a name
space, and in particular /dev/cons, so by multiplexing /dev/cons and forcing
all textual input and output to go through that file, the window system can
simulate the expected properties of the file.
The window system serves several files, all conventionally attached to the
directory of I/O devices, /dev. These include cons, the port for ASCII I/O;
mouse, a file that reports the position of the mouse; and bitblt, to which
messages may be written to execute bitmap graphics primitives. Much as the different
cons files keep separate clients' output in separate windows, the mouse and
bitblt files are implemented by the window system in a way that keeps the
various clients independent. For example, when a client process in a window
writes a message (to the bitblt file) to clear the screen, the window system
clears only that window. All graphics sent to partially or totally obscured
windows are maintained as bitmap layers, in memory private to the window
system. The clients are oblivious of one another.
Because the window system is implemented entirely at user level with file and
name space operations, it can be run recursively: It may be a client of
itself. The window system functions by opening the files /dev/cons,
/dev/bitblt, and so forth, as provided by the operating system, and reproduces
-- multiplexes -- their functionality among its clients. Therefore, if a fresh
instantiation of the window system is run in a window, it will behave
normally, multiplexing its /dev/cons and other files for its clients. This
recursion can be used profitably to debug a new window system in a window or
to multiplex the connection to a CPU server. Because the window system has no
bitmap graphics code -- all its graphics operations are executed by writing
standard messages to a file -- the window system may be run on any machine
that has /dev/bitblt in its name space, including the CPU server.
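The recursive structure can be sketched in a few lines of Python. Here each
window system is handed a name space containing /dev/cons, gives every new
window a private /dev/cons of its own, and forwards whatever is written there
to the cons it was given. The classes and the bracketed window labels are
hypothetical, and only text output is modeled.

```python
class Screen:
    """Stands in for the /dev/cons provided by the operating system."""
    def __init__(self):
        self.output = []

    def write(self, text):
        self.output.append(text)

class WindowCons:
    """A window's private /dev/cons, served by the window system."""
    def __init__(self, parent_cons, label):
        self.parent_cons = parent_cons
        self.label = label

    def write(self, text):
        # Multiplex: tag the text with its window and pass it down.
        self.parent_cons.write(f"[{self.label}] {text}")

class WindowSystem:
    """Opens the /dev/cons in its own name space and serves one per window."""
    def __init__(self, name_space):
        self.cons = name_space["/dev/cons"]

    def new_window(self, label):
        # The client's name space differs only in its private /dev/cons.
        return {"/dev/cons": WindowCons(self.cons, label)}

screen = Screen()
outer = WindowSystem({"/dev/cons": screen})
inner = WindowSystem(outer.new_window("w1"))   # a window system in a window
client_ns = inner.new_window("w1.1")
client_ns["/dev/cons"].write("hello")
print(screen.output)
```

The inner window system is unaware that its /dev/cons is a window of the outer
one; it sees only a file with the expected behavior.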


The cpu Command



The cpu command connects from a terminal to a CPU server using a full-duplex
network connection and runs a setup process there. The terminal and CPU
processes exchange information about the user and name space, and then the
terminal-resident process becomes a user-level file server that makes the
terminal's private files visible from the CPU server. (At the time of writing,
the CPU server builds the name space by reexecuting the user's profile; a
version being designed will export the name space using a special
terminal-resident server that can be queried to recover the terminal's name
space.) The CPU process makes a few adjustments to the name space, such as
making the file /dev/cons on the CPU server be the same file as on the
terminal, and begins a command interpreter. The command interpreter then reads
commands from, and prints results on, its file /dev/cons, which is connected
through the terminal process to the appropriate window (for example) on the
terminal. Graphics programs such as bitmap editors may also be executed on the
CPU server because their definition is entirely based on I/O to files "served"
by the terminal for the CPU server. The connection to the CPU server and back
again is utterly transparent.
This connection raises the issue of heterogeneity: The CPU server and the
terminal may be, and in the current system are, different types of processors.
There are two distinct problems: binary data and executable code. Binary data
can be handled two ways: By making it not binary or by strictly defining the
format of the data at the byte level. The former is exemplified by the status
file in /proc, which enables programs to examine, transparently and portably,
the status of remote processes. Another example is the file, provided by the
terminal's operating system, /dev/time. This is a fixed-format ASCII
representation of the number of seconds since the epoch that serves as a time
base for make and other programs. Processes on the CPU server get their time
base from the terminal, thereby obviating problems of distributed clocks.
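A fixed-format ASCII time value of this kind is easy to produce and consume on
any processor. The Python sketch below assumes an 11-character right-aligned
field followed by a space; that width, like the function names, is invented
for illustration and is not the documented /dev/time format.

```python
TIME_WIDTH = 11  # hypothetical field width, not taken from Plan 9

def encode_time(seconds):
    """Render seconds since the epoch as a fixed-width ASCII record."""
    return f"{seconds:>{TIME_WIDTH}d} ".encode("ascii")

def decode_time(record):
    """Recover the integer; int() ignores the surrounding blanks."""
    return int(record.decode("ascii"))

record = encode_time(662688000)
print(len(record), decode_time(record))
```

No byte-order or word-size question arises, because the reader sees only
characters.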
For files that are I/O intensive, such as /dev/bitblt, the overhead of an
ASCII interface can be prohibitive. In Plan 9, therefore, such files accept a
binary format in which the byte order is predefined, and programs that access
the files use portable libraries that make no assumptions about the order.
Thus /dev/bitblt is usable from any machine, not just the terminal. This
principle is used throughout Plan 9. For instance, the format of the
compilers' object files and libraries is similarly defined, which means that
object files are independent of the type of the CPU that compiled them.
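The second approach, a binary format whose byte order is fixed by definition,
can be illustrated with Python's struct module. The message below (a one-byte
operation code followed by four 16-bit fields, little-endian) is a made-up
layout chosen for the example; the article does not specify the real
/dev/bitblt messages or which byte order Plan 9 uses.

```python
import struct

# "<" fixes little-endian order and standard sizes with no padding, so the
# encoding is the same on every host. The field layout itself is hypothetical.
CLEAR_RECT = struct.Struct("<BhhHH")   # opcode, x, y, width, height

def pack_clear(x, y, width, height):
    """Encode a hypothetical 'clear rectangle' message."""
    return CLEAR_RECT.pack(ord("c"), x, y, width, height)

def unpack_clear(message):
    """Decode the message back into (x, y, width, height)."""
    opcode, x, y, width, height = CLEAR_RECT.unpack(message)
    if opcode != ord("c"):
        raise ValueError("not a clear-rectangle message")
    return x, y, width, height

message = pack_clear(1, 2, 3, 4)
print(message.hex(), unpack_clear(message))
```

A program that builds its messages this way never consults the native byte
order of the machine it runs on, which is the property the portable libraries
are said to provide.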
Having different formats of executable binaries is a thornier problem, and
Plan 9 solves it adequately if not gracefully. Directories of executable
binaries are named appropriately: /mips/bin, /68020/bin, and so on, and a
program may ascertain, through a special server, what CPU type it is running
on. A program, in particular the cpu command, may therefore attach the
appropriate directory to the conventional name /bin so that when a program
runs, say, /bin/rc, the appropriate file is found. The various object files
and compilers use distinct formats and naming conventions, which makes
cross-compilation painless, at least once automated by make or a similar
program.
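The binding of an architecture-specific directory to /bin might be modeled as
below. The Python sketch represents a name space as a dictionary from
directory names to their contents and takes the CPU type as an argument
instead of querying a server; bind_bin and the sample contents are
hypothetical.

```python
def bind_bin(name_space, cpu_type):
    """Return a copy of the name space with /<cpu_type>/bin visible as /bin."""
    source = f"/{cpu_type}/bin"
    if source not in name_space:
        raise KeyError(f"no binaries for CPU type {cpu_type!r}")
    bound = dict(name_space)           # leave the caller's name space intact
    bound["/bin"] = name_space[source]
    return bound

# Stand-in directories; the strings represent executable files.
name_space = {
    "/mips/bin": {"rc": "rc built for MIPS"},
    "/68020/bin": {"rc": "rc built for 68020"},
}
on_cpu_server = bind_bin(name_space, "mips")
on_terminal = bind_bin(name_space, "68020")
print(on_cpu_server["/bin"]["rc"])
print(on_terminal["/bin"]["rc"])
```

The same conventional name, /bin/rc, thus finds a different file depending on
where the process runs.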


Security


Plan 9 does not address security issues directly, but some of its aspects are
relevant to the topic. Breaking the file server away from the CPU server
enhances security possibilities. Because the file server is a separate machine
that can only be accessed over the network by the standard protocol, and
therefore can only serve files, it cannot run programs. Many security issues
are resolved by the simple observation that the CPU server and file server
communicate using a rigorously controlled interface through which it is
impossible to gain special privileges.
Of course, certain administrative functions must be performed on the file
server, but these are available only through a special command interface
accessible only on the console and hence subject to physical security.
Moreover, that interface is for administration only. For example, it permits
making backups and creating and removing files, but not reading files or
changing their permissions. The contents of a file with read permission for
only its owner will not be divulged by the file server to any other user, even
the administrator.
This raises the question of how a user proves who he or she is. At the moment,
we use a simple authentication manager on the Datakit network itself, so that
when a user logs in from a terminal, the network assures the authenticity of
the maker of calls from the associated terminal. In order to remove the need
for trust in our local network, we plan to replace the authentication manager
by a Kerberos-like system.


Discussion


A fairly complete version of Plan 9 was built in 1987 and 1988, but
development was abandoned. In May of 1989 work was begun on a completely new
system, based on the SGI MIPS-based multiprocessors, using the first version
as a bootstrap environment. By October, the CPU server could compile all its
own software, using the first-draft file server. The SGI file server came on
line in February 1990; the true operating system kernel at its core was taken
from the CPU server's system, but the file server is otherwise a completely
separate program (and computer). The CPU server's system was ported to the
68020 in 13 hours elapsed time in November, 1989. One portability bug was
found; the fix affected two lines of code. At the time this article was
originally written, work had just begun on a new window system, which has
since been implemented. An electronic mail system has also been added,
clearing the way for use of Plan 9 on a daily basis by all the authors and 50
to 60 other users. Plan 9 is now up, running, and comfortable to use, although
it is certainly too early to pass final judgment.
The multiprocessor operating system for the MIPS-based CPU server has 454
lines of assembly language, more than half of which save and restore registers
on interrupts. The kernel proper contains 3647 lines of C plus 774 lines of
header files, which includes all process control, virtual memory support, trap
handling, and so on. There are 1020 lines of code to interface to the 29
system calls. Much of the functionality of the system is contained in the
"drivers" that implement built-in servers such as /proc; these and the network
software add another 9511 lines of code. Most of this code is identical on the
68020 version; for instance, all the code to implement processes, including
the process switcher and the fork and exec system calls, is identical in the
two versions; the peculiar properties of each processor are encapsulated in
two five-line assembler routines. (The code for the respective MMUs is quite
different, although the page fault handler is substantially the same.) It is
only fair to admit, however, that the compilers for the two machines are
closely related, and the operating system may depend on properties of the
compiler in unknown ways.
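For a sense of scale, the figures quoted above can be added up. The Python
tally below assumes that the five categories do not overlap, which the text
implies but does not state; the total is derived here and is not a number
given by the authors.

```python
# Line counts as quoted in the text for the MIPS-based CPU server.
kernel_lines = {
    "assembly language": 454,
    "kernel proper (C)": 3647,
    "header files": 774,
    "system call interface": 1020,
    "drivers and network software": 9511,
}

total = sum(kernel_lines.values())
driver_share = kernel_lines["drivers and network software"] / total

print(total)  # 15406
print(f"drivers and network software: {driver_share:.1%} of the total")
```

On this reading, well over half of the system's code lies in the built-in
servers and network software.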
The system is efficient. On the four-processor machine connected to the MIPS
file server, the 45 source files of the operating system compile in about ten
seconds of real time and load in another ten. (The loader runs
single-threaded.) Partly due to the register-saving convention of the
compiler, the null system call takes only seven microseconds on the MIPS,
about half of which is attributed to relatively slow memory on the
multiprocessor. A process fork takes 700 microseconds irrespective of the
process's size.
Plan 9 does not implement lightweight processes explicitly. We are uneasy
about deciding where on the continuum from fine-grained hardware-supported
parallelism to the usual timesharing notion of a process we should provide
support for user multiprocessing. Existing definitions of threads and
lightweight processes seem arbitrary and raise more questions than they
resolve. We prefer to have a single kind of process and to permit multiple
processes to share their address space. With the ability to share local memory
and with efficient process creation and switching, both of which are in Plan
9, we can match the functionality of threads without taking a stand on how
users should multiprocess.
Process migration is also deliberately absent from Plan 9. Although Plan 9
makes it easy to instantiate processes where they can most effectively run, it
does nothing explicit to make this happen. The compiler, for instance, does
not arrange that it run on the CPU server. We prefer to do coarse-grained
allocation of computing resources simply by running each new command
interpreter on a lightly loaded CPU server. Reasonable management of computing
resources renders process migration unnecessary.
Other aspects of the system lead to other efficiencies. A large
single-threaded chess database problem runs about four times as fast on Plan 9
as on the same machine running commercial software because the remote cache on
the file server is so large. In general, most file I/O is done by direct DMA
from the file server's cache; the file server rarely needs to read from disk
at all.
Much of Plan 9 is straightforward. The individual pieces that make Plan 9 up
are relatively ordinary; its unusual aspects lie in their combination. As a
case in point, the recent interest in using X terminals connected to
timeshared hosts might seem to be similar in spirit to how Plan 9 terminals
are used, but that is a mistaken impression. The Gnot, although similar in
hardware power to a typical X terminal, serves a much higher-level function in
the computing environment. It is a fully programmable computer running a
virtual memory operating system that maintains its user's view of the entire
Plan 9 system. It offloads from the CPU server all the bookkeeping and
I/O-intensive chores that a window system must perform. It is not a workstation
either; one would rarely bother to compile on the Gnot, although one would
certainly run a text editor there. Like the other pieces of Plan 9, the Gnot's
strength derives from careful specialization in concert with other specialized
components.


Acknowledgments


Special thanks go to Bart Locanthi, who built the Gnot and encouraged us to
program it; Tom Duff, who wrote the command interpreter rc; Tom Killian and
Ted Kowalski, who cheerfully endured early versions of the software; Dennis
Ritchie, who frequently provided us with much-needed wisdom; and all those who
helped build the system.


References


Accetta, M.J., Robert Baron, William Bolosky, David Golub, Richard Rashid,
Avadis Tevanian, and Michael Young. "Mach: A New Kernel Foundation for UNIX
Development." In USENIX Conference Proceedings. Atlanta, Georgia, 1986.
Duff, T. "Rc -- A Shell for Plan 9 and UNIX." In UNIX Programmer's Manual.
10th ed. Murray Hill, N.J.: AT&T Bell Laboratories, 1990.
Fraser, A.G. "Datakit -- A Modular Network for Synchronous and Asynchronous
Traffic." In Proc. Int. Conf. on Commun. Boston, Mass., 1980.
Kernighan, Brian W. and Rob Pike. The UNIX Programming Environment. Englewood
Cliffs, N.J.: Prentice-Hall, 1984.
Killian, T.J. "Processes as Files." In USENIX Summer Conference Proceedings.
Salt Lake City, Utah, 1984.
Metcalfe, R.M. and D.R. Boggs. The Ethernet Local Network: Three Reports. Palo
Alto, Calif.: Xerox Research Center, 1980.
Miller, S.P., C. Neumann, J.I. Schiller, and J.H. Saltzer. Kerberos
Authentication and Authorization System. Cambridge, Mass.: MIT Press, 1987.
Pike, R. "Graphics in Overlapping Bitmap Layers." In Transactions on Graphics.
Vol. 2, No. 2, 135-160.
Pike, R. "A Concurrent Window System." In Computing Systems. Vol. 2, No. 2,
133-153.
Quinlan, S. "A Cached WORM File System." In Software -- Practice and
Experience. To appear.















January, 1991
A SOFTWARE DESIGN MANIFESTO


Time for a change




Mitchell Kapor


Mitch Kapor is CEO of On Technology. He may be contacted via e-mail at
mkapor@well.sf.ca.us, or at 155 Second Street, Cambridge, MA 02141.


The great and rapid success of the personal computer industry over the past
decade is not without its unexpected ironies. What began as a revolution of
individual empowerment has ended with the personal computer industry not only
joining the computing mainstream, but in fact defining it. Despite the
enormous outward success of personal computers, the daily experience of using
computers far too often is still fraught with difficulty, pain, and barriers
for most people, which means that the revolution, measured by its original
goals, has not as yet succeeded.
Instead we find ourselves in a period of retrenchment and consolidation in
which corporations seek to rationalize their computing investment by
standardizing on platforms, applications, and methods of connectivity, rather
than striving for a fundamental simplification of the user experience. In
fact, the need for extensive help in the installation, configuration, and
routine maintenance of system functions continues to make the work of
corporate data processing and MIS departments highly meaningful. But no one is
speaking for the poor user.
There is a conspiracy of silence on this issue. It's not splashed all over the
front pages of the industry trade press, but we all know it's true. Users are
largely silent about this. There is no uproar, no outrage. Scratch the surface
and you'll find that people are embarrassed to say they find these devices
hard to use. They think the fault is their own. So users learn a bare minimum
to get by. They under-use the products we work so hard to make and so don't
help themselves or us as much as we would like. They're afraid to try anything
else. In sum, everyone I know (including me) feels the urge to throw that
infuriating machine through the window at least once a week. (And, thanks to
recent advances in miniaturization, this is now possible.)
The lack of usability of software and poor design of programs is the secret
shame of the industry. Given a choice, no one would want it to be this way.
What is to be done? Computing professionals themselves should take
responsibility for creating a positive user experience. Perhaps the most
important conceptual move to be taken is to recognize the critical role of
design, as a counterpart to programming, in the creation of computer
artifacts. And the most important social evolution within the computing
professions would be to create a role for the software designer as a champion
of the user experience.
By training and inclination, people who develop programs haven't been oriented
to design issues. This is not to fault the vital work of programmers. It is
simply to say that the perspective and skills that are critical to good
design are typically absent from the development process, or, if present,
exist only in an underground fashion. We need to take a fresh look at the
entire process of creating software -- what I call the "software design
viewpoint." We need to rethink the fundamentals of how software is made.


The Case for Design


What is design? What makes something a design problem? It's where you stand
with a foot in two worlds -- the world of technology and the world of people
and human purposes -- and you try to bring the two together. Consider an
example:
Architects, not construction engineers, are the professionals who have overall
responsibility for creating buildings. Architecture and engineering are, as
disciplines, peers to each other, but in the actual process of designing and
implementing the building, the engineers take direction from the architects.
The engineers play a vital and crucial role in the process, but they take
their essential direction from the design of the building as established by
the architect.
When you go to design a house you talk to an architect first, not an engineer.
Why is this? Because the criteria for what makes a good building fall
substantially outside the domain of what engineering deals with. You want the
bedrooms where it will be quiet so people can sleep, and you want the dining
room to be near the kitchen. The fact that the kitchen and dining room should
be proximate to each other emerges from knowing, first, that the purpose of the
kitchen is to prepare food and the dining room to consume it, and second, that
rooms with related purposes ought to be closely related in space. This is not
a fact, nor a technical item of knowledge, but a piece of design wisdom.
Similarly, in computer programs, the selection of the various components and
elements of the application must be driven by an appreciation of the overall
conditions of use and user needs through a process of intelligent and
conscious design. How is this to be done? By software designers.
Design disciplines are concerned with making artifacts for human use.
Architects work in the medium of buildings, graphic designers work in paper
and other print media, industrial designers on mass-produced manufactured
goods, and software designers on software. The software designer should be the
person with overall responsibility for the conception and realization of the
program.
The Roman architecture critic Vitruvius advanced the notion that well-designed
buildings were those which exhibited firmness, commodity, and delight.
The same might be said of good software. Firmness: A program should not have
any bugs which inhibit its function. Commodity: A program should be suitable
for the purposes for which it was intended. Delight: The experience of using
the program should be a pleasurable one. Here we have the beginnings of a
theory of design for software.


Software Design Today


Today, the software designer leads a guerrilla existence, formally
unrecognized and often unappreciated. There's no spot on the corporate org
chart or career ladder for such an individual. Yet time after time I've found
people in software development companies who recognize themselves as software
designers, even though their employers and colleagues don't yet accord them
the professional recognition they seek.
Design is widely regarded by computer scientists as being a proper subpart of
computer science itself. Also, engineers would claim design for their own. I
would claim that software design needs to be recognized as a profession in its
own right, a disciplinary peer to computer science and software engineering, a
first-class member of the family of computing disciplines.
One of the main reasons most computer software is so abysmal is that it's not
designed at all, but merely engineered. Another reason is that implementors
often place more emphasis on a program's internal construction than on its
external design, despite the fact that as much as 75 percent of the code in a
modern program deals with the interface to the user.


More than Interface Design


Software design is not the same as user interface design.
The overall design of a program is to be clearly distinguished from the design
of its user interface. If a user interface is designed "after the fact" it is
like designing an automobile's dashboard after the engine, chassis, and all
other components and functions are specified. The separation of the user
interface from the overall design process fundamentally disenfranchises
designers at the expense of programmers and relegates them to the status of
second class citizens.
The software designer is concerned primarily with the overall conception of
the product. Dan Bricklin's invention of the electronic spreadsheet is one of
the crowning achievements of software design. It is the metaphor of the
spreadsheet itself, its tableau of rows and columns with their precisely
interrelated labels, numbers, and formulas -- rather than the user interface
of VisiCalc -- for which he will be remembered. The "look and feel" of a
product is but one part of its design.


Training Designers


If software design is to be a profession in its own right, then there must be
professional training which develops the consciousness and skills central to
the profession.
Training in software design is distinguished from computer science, software
engineering, and computer programming in that its principal focus is on the
training of professional practitioners whose work it is to create usable
computer-based artifacts, that is, software programs. The emphasis on
developing this specific professional competency distinguishes software design
on the one hand from computer science, which seeks to train scientists in a
theoretical discipline, and on the other, from engineering, which focuses
almost exclusively on the construction of the internals of computer programs
and, from the design point of view, gives short shrift to consideration of use
and users.
In architecture, the study of design begins with the fundamental principles
and techniques of architectural representation and composition, which include:
freehand drawing, constructed drawing, presentation graphics, and visual
composition and analysis.
In both architecture and software design it's necessary to provide the
professional practitioner with a way to model the final result with far less
effort than is required to build the final product. In each case specialized
tools and techniques are used. In software design, unfortunately, design tools
aren't sufficiently developed to be maximally useful.
HyperCard, for instance, allows the ready simulation of the appearance of a
program, but is not effective at modeling the behavior of real-world programs.
It captures the surface, but not the semantics. For this, object-oriented
approaches will do better, especially when there are plug-in libraries, or
components, readily available that perform basic back-end functions. These
might not have the performance or capacity of back-ends embedded in commercial
products, but will be more than adequate for prototyping purposes.



A Firm Grounding in Technology


Many people who think of themselves as working on the design of software
simply lack the technical grounding to be an effective participant in the
overall process. Naturally, programmers quickly lose respect for people who
fail to understand fundamental technical issues. The answer to this is not to
exclude designers from the process, but to make sure that they have a sound
mastery of technical fundamentals, so that genuine communication with
programmers is possible.
Technology courses for the student designer should deal with the principles
and methods of computer program construction. Topics would include computer
systems architecture, microprocessor architectures, operating systems, network
communications, data structures and algorithms, databases, distributed
computing, programming environments, and object-oriented development
methodologies.
Designers must have a solid working knowledge of at least one modern
programming language (C or Pascal) in addition to exposure to a wide variety
of languages and tools, including Forth and Lisp.


The Software Design Studio


Most important, students learn software design by practicing it. A major
component of the professional training, therefore, would consist of design
studios in which students carry out directed projects to design parts of
actual programs, whole programs, and groups of programs using the tools and
techniques of their trade.
Prospective software designers must also master the existing research in the
field of human-computer interaction and social science research on the use of
the computer in the workplace and in organizations.
A design is only realized in a particular medium. What are the characteristic
properties of the medium in which we create software?
Digital media have unique properties that distinguish them from print-based
and electronic predecessors. Software designers need to make a systematic
study and comparison of different media: print, audiovisual, and digital,
examining their properties and affordances with a critical eye to how these
properties shape and constrain the artifacts realized in them.


Design and the Development Process


Designers must study how to integrate software design into the overall
software development process -- in actual field conditions of teams of
programmers, systems architects, and technical management.
In general, the programming and design activities of a project must be closely
interrelated. During the course of implementing a design, new information will
arise, which many times will change the original design. If design and
implementation are in water-tight compartments, it can be a recipe for
disaster because the natural process of refinement and change is prevented.
The fact that design and implementation are closely related does not mean that
they are identical -- even if the two tasks are sometimes performed by one and
the same person. The technical demands of writing the code are often so
strenuous that the programmer can lose perspective on the larger issues
affecting the design of the product.
Before you can integrate programming and design, each of the two has to have
its own genuine identity.


A Call to Action


We need to create a professional discipline of software design. We need our
own community. Today you can't get a degree in software design, go to a
conference on the subject, or subscribe to a journal on the topic. Designers
need to be brought onto development teams as peers to programmers. The entire
PC community needs to become sensitized to issues of design.
Software designers should be trained more like architects than like computer
scientists. Software designers should be technically very well grounded
without being measured by their ability to write production-quality code.
In the year since I first sounded this call to action, there has been a
gratifying response from the computing industry and academic computer science
departments. At Stanford University, Computer Science Professor Terry Winograd
has been awarded a major National Science Foundation grant to develop and
teach the first multicourse curriculum in software design. And in Silicon
Valley and elsewhere there is talk of forming a professional organization
dedicated to advancing the interests of software design.



January, 1991
DESIGNING A PORTABLE GUI TOOLKIT


Five principles can unravel knotty design problems




Robert T. Nicholson


Bob has worked with user interface design since the days when menus were
considered a pretty neat idea. He is currently responsible for multimedia
tools development at Oracle, and can be contacted at 500 Oracle Pkwy., M/S 4
OP 12, Redwood Shores, CA 94065.


For software developers, the proliferation of windowing systems -- Macintosh,
Microsoft Windows, Presentation Manager, Motif, Open Look, NeXTStep -- is a
major headache. Develop a product for one GUI, and you miss out on markets for
all the others. What's worse, the GUI you pick might be the once-promising
contender that later falls by the wayside.
Some developers hedge their bets by writing for several of the most popular
systems. But the GUI toolkits are all different, hard to learn, and cumbersome
to use. Development for multiple systems is both risky and expensive. And
products that are converted from one GUI to another often end up with a
"foreign" feel that hinders acceptance by users.


A Portable GUI Toolkit


One solution to this dilemma is to design a portable toolkit for
implementation on top of each "native" windowing system. This toolkit can
provide application programmers with a standard library of routines that map
to all underlying platforms, thus allowing products to be ported without
changes to the application's source code.
Ideally, applications written with the portable toolkit should make full use
of all native windowing system capabilities, and have the look and feel of
applications written specifically for that platform. It is hard to write
full-featured, competitive applications if your toolkit restricts you to a
"least common denominator" intersection of GUI features.
Additionally, the toolkit should have little impact on an application's
performance. And, just to make things challenging, a reasonable subset of the
toolkit should be available on character-mode terminals, so that simple
applications using menus and dialog boxes can work equally well on a VT220 and
a Macintosh.
These were our goals when we decided to develop a portable user interface
toolkit. But we quickly learned that the task is much more difficult than it
seems. The same problems that make it tough to port applications from one GUI
to another make it very difficult to write a portable toolkit.


Problem Areas


In developing the toolkit, the challenges we faced ranged from dealing with
fundamentally different event models to nagging details such as deciphering
the various ellipse-drawing algorithms.
One of our thorniest problem areas dealt with look-and-feel, in which native
GUIs have differing notions of how to perform similar tasks. Most systems, for
example, permit a menu in each window, while the Macintosh only allows for a
single menu bar. How could our toolkit allow people to write applications that
would look right in any environment?
Typefaces were another problem area. On the Macintosh, typefaces are described
by three completely independent properties: A font (Times or Helvetica, for
example), a point size, and a style or combination of styles (underline,
italic, bold, and so on). In theory, any font can be represented in any style
and in any size. (In practice, only a few sizes are stored as screen bitmap
files; other sizes are derived by bitmap scaling, with visual results that are
often unacceptable.)
On other systems, typefaces are only provided for specific combinations. The
existence of a 12-point Helvetica Italic, for example, doesn't imply that a
10-point Helvetica bold exists. Some systems also support an additional
property, the font weight, which may range from ultra-light to ultra-bold.
Finally, character-mode terminals provide their own (very limited) typeface
model, which includes such attributes as reverse-video and blinking.
Another major GUI difference involves coordinate systems. Each GUI supports
one or more different display resolutions. Often, the horizontal and vertical
dimensions are not the same, which complicates things like drawing circles
that are truly round. And while most GUIs place their origin at the upper-left
corner of the drawing area, Presentation Manager's origin is at the
lower-left, following the coordinate system commonly used in mathematics.
Because coordinates are not used just for graphical shapes, but also for
positioning controls and text within windows, the toolkit's coordinate space
had to be mapped onto character-mode devices as well.
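The origin and character-cell mappings described above might be sketched as follows. This is only an illustration under stated assumptions: the toolkit's own coordinate space is taken to have its origin at the upper left, and every type and function name here is hypothetical, not part of the toolkit.

```c
#include <assert.h>

/* Hypothetical description of a native drawing area. */
typedef struct {
    int height;        /* drawing-area height in native units */
    int origin_bottom; /* nonzero if the native origin is at the lower left */
} native_area;

/* Convert a toolkit y coordinate (origin at upper left, growing downward)
   to the native y coordinate. */
int to_native_y(const native_area *area, int y)
{
    assert(y >= 0 && y < area->height);
    return area->origin_bottom ? (area->height - 1 - y) : y;
}

/* Map a toolkit coordinate to a character-cell index, given the size of
   one cell in toolkit units. */
int to_cell(int coord, int cell_size)
{
    assert(cell_size > 0 && coord >= 0);
    return coord / cell_size;
}
```

With a 300-unit-high area whose origin is at the lower left, toolkit row 0 maps to native row 299; on an upper-left platform the value passes through unchanged.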
Even with a uniform coordinate system, there is no guarantee that a drawing
will appear the same across GUIs. Two curves that just touch on one platform
may be a pixel apart, or even overlapping one another, due to minor
differences in drawing algorithms.
Rather than tackling these problems on a case-by-case basis, and possibly
ending up with a chaotic mess, we evolved five formal design principles that
were applied across the entire toolkit: overspecification, abstraction,
augmentation, exclusion, and qualification. These are discussed in the
sections that follow.


Overspecification


Each GUI toolkit has features and options that the others lack. Because our
toolkit must have the necessary information to pass on to each of the
underlying toolkits, the application programmer must supply the superset of
that information.
For example, to create a window, the programmer supplies the minimum size,
maximum size, resize increment, window style, title, and so on, even though
much of this may be ignored by some platforms. When this specification system
proved burdensome to the application programmer, we relieved the burden by
providing reasonable defaults for all attributes.
Thus, if the programmer is concerned with the window's minimum size, he can
specify it, and the toolkit will pass this on to the underlying GUI. If the
programmer doesn't care, the toolkit will supply a default value that is
consistent with the look and feel of the underlying platform (if the attribute
is relevant to that platform).
Rather than passing attributes as strings of parameters, we created an
attribute structure for each GUI entity (windows, views, text fields, buttons,
and so on). Each attribute structure has a mask field, with a bit
corresponding to each attribute. By setting the mask bits before passing the
attribute structure to a toolkit routine, the programmer specifies which
attributes of the entity are to be set explicitly, and which will default (see
Listing One). The attribute structures have the added advantage of being
extensible, should some future GUI require the specification of another
attribute.
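Listing One shows the application's side of this scheme. The toolkit's side is not shown in the article, but a minimal sketch of how mask bits could select between explicit values and platform defaults might look like this. The structure, constants, and function name are hypothetical and greatly reduced from a real window attribute structure.

```c
#include <assert.h>

/* Hypothetical attribute bits and structure, for illustration only. */
#define WIN_A_MINSIZE 0x1u
#define WIN_A_TITLE   0x2u

typedef struct {
    unsigned mask;       /* one bit per explicitly specified attribute */
    int min_width;
    const char *title;
} win_attr;

/* Fill in every attribute whose mask bit is clear with the platform's
   default; attributes the programmer set explicitly are left alone. */
void win_attr_resolve(win_attr *attr, const win_attr *platform_defaults)
{
    assert(attr != 0 && platform_defaults != 0);
    if (!(attr->mask & WIN_A_MINSIZE))
        attr->min_width = platform_defaults->min_width;
    if (!(attr->mask & WIN_A_TITLE))
        attr->title = platform_defaults->title;
}
```

Because unset fields are never read, a newer attribute can be appended to the structure with a new mask bit without disturbing code that does not set it.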


Abstraction


In cases where the implementation of a particular feature differs from one
platform to another, the portable toolkit presents a uniform abstract
interface to the application programmer. A good example of this is the menu
bar problem mentioned earlier.
There are at least three ways that native GUIs handle menu bars:
One menu bar for the entire system (type 1)
A menu bar for the system, plus a menu bar in each window (type 2)
A menu bar in each window, but no system menu bar (type 3)
To mask the differences between these three types, we use a more general
abstraction. The programmer needn't be concerned with how menus will look on
any particular system. The toolkit supports all three types, as shown in
Figure 1. For type 1 systems, the program's global menu bar is concatenated
with the menu bar of the currently active window, and the result is displayed
in the platform's single system menu bar. For type 2 systems, the program's
global menu bar is displayed in the platform's system menu bar, and the
program's per-window menu bars are displayed in the windows (no translation is
required in this case). Finally, in type 3 systems, the program's global menu
bar is prepended to each of the per-window menus.
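The three translations can be expressed as one small routine. The sketch below is an assumption about how such a mapping could be coded, not the toolkit's implementation; menus are reduced to arrays of title strings, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; menus are reduced to arrays of title strings. */
enum menu_model { ONE_SYSTEM_BAR = 1, SYSTEM_AND_WINDOW_BARS = 2,
                  WINDOW_BARS_ONLY = 3 };
enum bar_target { SYSTEM_BAR, WINDOW_BAR };

static size_t append(const char **out, size_t n, size_t cap,
                     const char *const *src, size_t nsrc)
{
    size_t i;
    for (i = 0; i < nsrc && n < cap; i++)
        out[n++] = src[i];
    return n;
}

/* Build the contents of one native menu bar from the program's global
   menus and the active window's menus. Returns the number of titles
   placed in out (0 means the platform has no such bar to fill). */
size_t compose_bar(enum menu_model model, enum bar_target target,
                   const char *const *global, size_t nglobal,
                   const char *const *window, size_t nwindow,
                   const char **out, size_t cap)
{
    size_t n = 0;
    int want_global, want_window;

    assert(out != NULL);
    if (target == SYSTEM_BAR) {
        want_global = (model != WINDOW_BARS_ONLY);
        want_window = (model == ONE_SYSTEM_BAR);   /* type 1: concatenate */
    } else {
        want_global = (model == WINDOW_BARS_ONLY); /* type 3: prepend */
        want_window = (model != ONE_SYSTEM_BAR);
    }
    if (want_global)
        n = append(out, n, cap, global, nglobal);
    if (want_window)
        n = append(out, n, cap, window, nwindow);
    return n;
}
```

The application describes its menus once; only the model value changes from platform to platform.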



Augmentation


In cases where a native windowing system lacks a particularly useful feature,
the portable toolkit actually implements that feature on the deficient
platform, so that application programmers can make use of it. Examples of this
are: A help facility, a text list box, and "Open" and "Save" file dialogs, all
of which were missing from at least one GUI. In this way, we avoid becoming a
"least common denominator."


Exclusion


Some features are found on only one or two platforms, and would be too
difficult to implement on others. Here, our toolkit simply excludes the
feature.
In general, the toolkit uses exclusion only when there is no reasonable
alternative. A good example is the Macintosh sound manager, which has no
parallel in any other major windowing system. Should this feature become more
common across GUIs, we can add a portable sound facility using the design
techniques already discussed.


Qualification


If a feature is supported in a very specific way on some platforms and not on
others, and if abstraction offers no reasonable way to conceal the difference
in implementation, the toolkit allows the programmer to qualify requests based
on the capabilities of the underlying windowing system.
An example of this is the "About" menu item, which, by convention, most
Macintosh applications display in the "Apple" menu. Unfortunately, the Apple
menu has no parallel on some other systems, so if applications want an About
menu item, they have to display it somewhere else.
The toolkit allows the application to make a qualified request for a
platform-specific About menu item. On the Macintosh, the portable toolkit
displays the item in the Apple menu. On other platforms, the toolkit refuses
the request; the application can then place the About item in one of its own
menus.
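In code, a qualified request amounts to a call that may be refused, followed by a fallback in the application. The following sketch assumes a hypothetical request routine and a toy menu type; neither is taken from the toolkit.

```c
#include <assert.h>

/* Hypothetical types and names, for illustration only. */
enum platform { PLATFORM_MACINTOSH, PLATFORM_OTHER };

typedef struct {
    const char *items[8];
    int count;
} menu;

typedef struct {
    enum platform platform;
    menu apple_menu;      /* meaningful only on the Macintosh */
} toolkit;

static void menu_add(menu *m, const char *label)
{
    assert(m->count < 8);
    m->items[m->count++] = label;
}

/* Qualified request: succeeds (returns 1) only where the platform has a
   conventional home for the About item; otherwise refuses (returns 0). */
int request_platform_about(toolkit *tk, const char *label)
{
    if (tk->platform != PLATFORM_MACINTOSH)
        return 0;
    menu_add(&tk->apple_menu, label);
    return 1;
}

/* Application side: fall back to one of its own menus when refused. */
void install_about(toolkit *tk, menu *own_menu)
{
    if (!request_platform_about(tk, "About..."))
        menu_add(own_menu, "About...");
}
```

The application contains no platform test of its own; it reacts only to whether the request was honored.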


Handling Events


One of our biggest problems was the whole area of event distribution. Events
are central to GUI programming. Unlike traditional programs, which choose when
and where to accept user input, programs on window systems must respond to a
variety of user- or system-generated events (such as mouse clicks or window
activations) at any time.
Unfortunately, the various GUI event models differ in three fundamental ways:
How events are distributed, who receives events, and what types of events are
supported. We'll briefly discuss these differences, and then show how the
toolkit design techniques resolved them.
In some GUIs (the Macintosh, for example), events are queued by the system.
The application program is responsible for frequently calling a Next-Event
routine to see if any events are waiting to be processed. This occurs within
an event loop -- the heart of a Macintosh application -- which does nothing
but get events and call appropriate routines to process them.
In other GUIs, the application registers event-handler routines with the
system. When events occur, the system calls the appropriate event handler.
Thus, the application has control only when one of its event routines is
called.
Microsoft Windows takes a hybrid approach. The application's event loop first
calls a routine, GetMessage( ), to get the next event from the event queue. It
then calls another system routine, DispatchMessage( ), which in turn invokes
the application's preregistered event handlers.
These various event distribution models are related to the notion of an
event's destination. On the Macintosh, the application's event loop gets all
events. Some events are marked with additional destination information. For
example, an activate event specifies the window to be activated, while a mouse
button down event includes the position of the mouse. However, it is the
application's responsibility to determine the event's ultimate target.
At the other end of the spectrum we have the "callback" model of OSF Motif,
based on the X Window intrinsics. In a callback system, each object on the
screen, from a checkbox to a text field, may have its own event handler;
events are dispatched directly to the entity to which they pertain by calling
its event handler.
Event targets are also affected by the types of events each GUI supports.
Although common events such as keystrokes or mouse clicks have very similar
destinations on all GUIs, targets of more abstract events such as activation
and deactivation vary. On some GUIs, an activation event may be delivered only
to the window; on others, it may go to a particular entity, such as a text
control within the window.
Creating a uniform programmatic interface on top of these different models
required careful application of all toolkit design techniques.
To begin with, we chose to support callback-style event distribution, because
GUIs that use an event loop model could easily be augmented by building the
event loop into our toolkit and adding a dispatcher. This was much simpler
than the alternative: Replacing native GUI event dispatchers with a routine
that funnels all events into a single queue, where an application event loop
can retrieve them.
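A rough sketch of that augmentation follows: the toolkit owns the loop, pulls events from the native queue, and calls the handler registered for each recipient. Everything here is hypothetical, including the stand-in queue; a real loop would block waiting for events rather than return when the queue is empty.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; they are not the toolkit's real API. */
typedef struct { int type; int recipient; } event;
typedef void (*handler_fn)(const event *ev, void *arg);
typedef int (*next_event_fn)(event *out);  /* returns 0 when queue is empty */

#define MAX_ENTITIES 16
static struct { handler_fn fn; void *arg; } handlers[MAX_ENTITIES];

/* Register the callback and private data pointer for one entity. */
void register_handler(int recipient, handler_fn fn, void *arg)
{
    assert(recipient >= 0 && recipient < MAX_ENTITIES);
    handlers[recipient].fn = fn;
    handlers[recipient].arg = arg;
}

/* Pull events from the native queue and call back the registered
   handler for each one. Returns the number of events dispatched. */
int run_event_loop(next_event_fn native_next)
{
    event ev;
    int dispatched = 0;

    while (native_next(&ev)) {
        if (ev.recipient >= 0 && ev.recipient < MAX_ENTITIES
                && handlers[ev.recipient].fn != NULL) {
            handlers[ev.recipient].fn(&ev, handlers[ev.recipient].arg);
            dispatched++;
        }
    }
    return dispatched;
}

/* Stand-ins for demonstration: a three-event queue and a counting handler. */
static const event demo_queue[] = { {1, 2}, {1, 5}, {1, 2} };
static size_t demo_pos = 0;

int demo_next(event *out)
{
    if (demo_pos >= sizeof demo_queue / sizeof demo_queue[0])
        return 0;
    *out = demo_queue[demo_pos++];
    return 1;
}

void count_handler(const event *ev, void *arg)
{
    (void)ev;
    ++*(int *)arg;
}
```

Registering count_handler for entity 2 and running the loop over the demo queue dispatches two events; the event for entity 5 has no handler and is dropped.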
The next task was to create a uniform system of event targets that could be
mapped onto the various GUIs. The toolkit can dispatch events to three classes
of entities: the application itself, individual windows, and views. Views are
objects that appear within windows, such as text fields, scroll bars, and
checkboxes.
The application must register an event handler routine for each instance of a
window or view (and of course, one for the application itself). Registration
is accomplished by setting the EVENT attribute in an entity's attribute
structure when it is created. The application can also register a pointer to a
private data area; this pointer is passed to the event routine when it is
called. Applications can then provide event routines with additional
application-specific data.
Note that many entities can share the same event handler routine, because the
routine is passed the target (or "recipient") ID in the event record, whenever
it is called. The event record may also contain data specific to the type of
event; for example, a KEY event record contains the actual key that the user
pressed. To provide all of the event-specific data without becoming too large,
the master event record passed to event handler routines is actually a union
of type-specific event records; the individual variants overlap only in the
"type" and "recipient" fields, which are provided for all types of events.
Listing Two shows an application-event routine that responds to push-button
events for three different buttons in a dialog.
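The layout of such a master event record could be sketched as below. The field and type names are illustrative guesses, not the toolkit's declarations; the point is that every variant begins with the same two fields, so a handler can inspect them before choosing a variant.

```c
#include <assert.h>

/* Hypothetical record layouts; field and type names are illustrative. */
typedef enum { EV_KEY, EV_MOUSEDOWN } ev_type;

typedef struct { ev_type type; int recipient; } ev_common;
typedef struct { ev_type type; int recipient; int key; } ev_key;
typedef struct { ev_type type; int recipient; int x; int y; } ev_mouse;

/* Every variant starts with the same two fields, so a handler can read
   "type" and "recipient" through any member before picking a variant. */
typedef union {
    ev_common common;
    ev_key    key;
    ev_mouse  mouse;
} ev_record;

/* Return the key code for KEY events, or -1 for any other event type. */
int key_of(const ev_record *ev)
{
    assert(ev != 0);
    return ev->common.type == EV_KEY ? ev->key.key : -1;
}
```

The union is only as large as its largest variant, which keeps the record small however many event types are defined.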
Actually implementing this scheme on the various GUIs required extensive
abstraction and augmentation. The entire view scheme is itself an abstraction,
because many GUIs group objects into numerous distinct classes, rather than a
single, all-encompassing class. Augmentation was required to implement uniform
view behavior, including event dispatching, on top of the many different
object classes on each of the GUIs.
All three classes of toolkit event targets (application, windows, and views)
were mapped directly onto those GUIs that already allow individual entities to
register their own event handlers. Other GUIs were augmented to allow
registration of event routines, and to send events to the appropriate entity
through the event dispatcher. (When we say that the toolkit is "sending an
event" to an entity, we simply mean that it is calling that entity's event
handler.)
The actual target for any particular event is based on the event type. For
example, when the toolkit receives a mouse button press event from the
underlying GUI, it may have to determine the view that contains the current
mouse position, and then dispatch a MOUSEDOWN event to that view's event
handler. A KEY event, in contrast, is sent directly to the view or window
which is currently the focus for keyboard input. The user may be able to
change the keyboard focus by tabbing to, or clicking in, another view. In this
case, the view that previously had the keyboard focus is sent a FOCUSOUT
event, and the view that has just become eligible for keyboard input is sent a
FOCUSIN event.
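The targeting rules in this paragraph can be sketched roughly as follows. The sketch assumes rectangular views, records "sent" events in a log instead of calling handlers, and delivers the focus events before the MOUSEDOWN; that ordering and all of the names are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical names; "sending" an event is modeled as appending to a log. */
enum { EV_MOUSEDOWN = 1, EV_FOCUSOUT, EV_FOCUSIN };

typedef struct { int x, y, w, h; } rect;
typedef struct { int type; int view; } sent;

typedef struct {
    rect views[4];
    int nviews;
    int focus;        /* index of the view with keyboard focus, or -1 */
    sent log[16];
    int nlog;
} window_state;

static void send(window_state *w, int type, int view)
{
    assert(w->nlog < 16);
    w->log[w->nlog].type = type;
    w->log[w->nlog].view = view;
    w->nlog++;
}

/* Find the view containing the point, or -1 if none does. */
int view_at(const window_state *w, int x, int y)
{
    int i;
    for (i = 0; i < w->nviews; i++) {
        const rect *r = &w->views[i];
        if (x >= r->x && x < r->x + r->w && y >= r->y && y < r->y + r->h)
            return i;
    }
    return -1;
}

/* Translate a native mouse press into toolkit events. */
void mouse_press(window_state *w, int x, int y)
{
    int v = view_at(w, x, y);
    if (v < 0)
        return;
    if (v != w->focus) {
        if (w->focus >= 0)
            send(w, EV_FOCUSOUT, w->focus);
        w->focus = v;
        send(w, EV_FOCUSIN, v);
    }
    send(w, EV_MOUSEDOWN, v);
}
```

Clicking in a second view while the first has the focus produces FOCUSOUT for the first, then FOCUSIN and MOUSEDOWN for the second; a click outside every view produces nothing.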
The FOCUSIN and FOCUSOUT events are examples of events not supported on some
GUIs, but necessary to writing well-behaved applications. On some GUIs, for
example, the system may highlight text selected by the user. If the user moves
to another view by tabbing or clicking, the text in the previous field must be
unhighlighted. To support this functionality, the application must be notified
when the keyboard focus shifts. Providing a standard set of events across GUIs
required extensive augmentation.


Conclusion


No software project is ever 100 percent successful, but by consistently
applying toolkit design techniques, we met all of the goals described in the
beginning of this article, and have implemented some or all of the toolkit on
character-mode terminals, Microsoft Windows, Presentation Manager, Macintosh,
X Window, and Motif.
Two very large software products, a forms/data entry package and a
presentation graphics/charting package, were built on top of the first version
of the toolkit. These were successfully ported across native GUIs with an
investment of less than two percent of the original coding effort.
While the effort to design and code for GUI independence is not small, it can
pay off: Applications will reach a far wider audience, and will not be locked
in to one platform as GUIs continue to advance and proliferate.


Acknowledgments


Credit for the toolkit belongs to: Ed Screven, who is chiefly responsible for
the current design; Bruce Daniels, who provided the initial vision and
inspiration; and all of the outstanding individuals who have nursed the
toolkit through many revisions.



_DESIGNING A PORTABLE GUI TOOLKIT_
by Robert T. Nicholson


[LISTING ONE]

/* make_my_window -- Application code to create a window demonstrates use of
 attributes for overspecification. The programmer can specify values for
 window properties applying to any underlying GUI, or can accept
 platform-specific defaults. Routines and constants prefaced with the
 letters "ui" are toolkit routines.
*/

uiWindow make_my_window(private_data)

ptr_t private_data;

{
struct
uiWindowAttr window_attr; /* window attributes */

uiWindow the_window; /* The window created */

/* Set the attribute "mask" for the attributes we're concerned with */
window_attr.mask = uiWINDOW_A_EVENT |
 uiWINDOW_A_TYPE |
 uiWINDOW_A_MODALITY |
 uiWINDOW_A_POSITION |
 uiWINDOW_A_SIZE |
 uiWINDOW_A_TITLE |
 uiWINDOW_A_CANRESIZE |
 uiWINDOW_A_CANCLOSE |
 uiWINDOW_A_CANMINIMIZE |
 uiWINDOW_A_CANMAXIMIZE |
 uiWINDOW_A_CANTITLE |
 uiWINDOW_A_VSATYPE |
 uiWINDOW_A_HSATYPE;

/* Set the desired attribute values */
/* EVENT - register the event handler, and a pointer to an application data
 structure that will be passed to the event routine when it is called. */
window_attr.eventproc = my_event_routine;
window_attr.eventarg = private_data;

/* TYPE & MODALITY - a non-modal document window */
window_attr.type = uiWINDOW_DOCUMENT;
window_attr.modality = uiWINDOW_MODALITY_NONE;

/* POSITION, SIZE, and TITLE */
window_attr.x = 200;
window_attr.y = 100;
window_attr.width = 500;
window_attr.height = 300;
window_attr.title = "Untitled";

/* CANRESIZE, CANCLOSE, CANMINIMIZE, CANMAXIMIZE, CANTITLE - these permissions
 determine how the user can manipulate this window. We could simply accept
 platform-specific defaults for these, to maintain local look and feel. */
window_attr.canresize = TRUE;

window_attr.canclose = TRUE;
window_attr.canminimize = TRUE;
window_attr.canmaximize = TRUE;
window_attr.cantitle = TRUE;

/* VSATYPE & HSATYPE - window will have vertical and horizontal scrollbars */
window_attr.vsatype = uiWINDOW_SATYPE_BAR;
window_attr.hsatype = uiWINDOW_SATYPE_BAR;

the_window = uiWindowCreate(&window_attr);

return the_window;
}






[LISTING TWO]

/* button_routine -- A sample of an event handler routine that processes
 events for three buttons in a dialog box. Data structures prefaced with
 the letters "ui" are toolkit structures.
*/

void button_routine(the_event, private_data)

uiEvent *the_event; /* Event record pointer */
ptr_t private_data; /* Dialog's data pointer */
{

switch (the_event->type)
 {
 /* Handle button-push events for all three of the
 dialog's buttons. */
 case uiEVENT_PUSHB:
 if (the_event->recipient ==
 ((dialog_data_struct *) private_data)->yes_button)
 do_yes_action(private_data);
 if (the_event->recipient ==
 ((dialog_data_struct *) private_data)->no_button)
 do_no_action(private_data);
 if (the_event->recipient ==
 ((dialog_data_struct *) private_data)->cancel_button)
 do_cancel_action(private_data);
 break;

 /* Handle key events - if the user pressed the Return
 key, treat it the same as the "Yes" button, which
 is the default button for this dialog. */
 case uiEVENT_KEY:
 if (((uiEventKey *) the_event)->key == '\n')
 do_yes_action(private_data);
 else
 uiWSBeep();
 break;
 /* (Not interested in any other events that the buttons
 may receive.) */
 }

return;
}



January, 1991
DESIGNING A WRITE-ONCE FILE SYSTEM


A general-purpose optical storage software technology




Simson Garfinkel


Simson is the principal scientist at N/Hance Systems, which sells a variety of
products based upon WOFS. He can be reached at 52 1/2 Pleasant, Cambridge, MA
02139, or Internet mail as simsong@next.cambridge.ma.us.


Write-once storage systems are one of three different optical storage
technologies to emerge over the last decade. Write-once systems (called WORMs,
for Write Once, Read Multiple) are the oldest of the optical systems and in
many ways the most successful to date.
Once a block is written into WORM, it can't be changed. This unique
characteristic of WORM is at once its virtue and its handicap: WORM offers
data permanence, but traditional computer systems can't use WORM without
special software. Operating systems like MS-DOS and Unix, for example, need to
be able to update the blocks used to store directories when files are created
or deleted.
In 1985 I started a research project at the MIT Media Lab to develop a file
system designed for write-once devices. This article describes the design and
evolution of this software component, known as the Write-Once File System
(WOFS).


Why WORM?


Comparing WORM disks with read-only CD-ROMs and rewritable magneto-optical
systems, we find that each optical storage technology has advantages and
disadvantages.
CD-ROM, with more than 500 Mbytes on a 4.72-inch disk that costs less than a
dollar to manufacture, is an ideal system for publishing databases and
distributing large software systems. But CD-ROM can't be written, and thus
isn't a replacement for conventional storage.
Rewritable magneto-optical disks are much slower than magnetic disks, and
don't hold as much as CD-ROM, or even magnetic disks of the same size.
Although rewritable cartridges are removable, they cannot easily serve as a
publication format, because each disk must be individually recorded, rather
than being mass-produced.
WORM cartridges can store two to four times more data than similarly-sized
rewritable ones. Today, 5.25-inch drives are available that can store more
than 500 Mbytes per side, and 14-inch write-once systems can store two to four
gigabytes. As stated earlier, the unique virtue of write-once technology is
the permanence it offers for valuable data. In many application areas, such as
financial and medical applications, this feature is extremely critical.


WORM in Use


There are three approaches to using write-once technology in computer
applications. In the first, a specialized application uses the optical disk
for storing large datafiles (such as images). The application may track these
files using its own dedicated routines, or use a database management system
for this purpose.
The second approach is to use a special device driver that lets the write-once
disk simulate a rewritable disk. When the operating system tries to rewrite a
block, the driver writes a new block and remembers the translation.
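The heart of such a driver is a translation table from logical to physical blocks. The sketch below keeps that table in memory only and uses hypothetical names and sizes; a real driver would also have to record the table on the write-once disk itself so that it survives a restart.

```c
#include <assert.h>

/* Hypothetical sizes and names, for illustration only. */
#define LOGICAL_BLOCKS  16
#define PHYSICAL_BLOCKS 64
#define UNMAPPED (-1)

typedef struct {
    int map[LOGICAL_BLOCKS];  /* logical block -> newest physical block */
    int next_free;            /* next never-written physical block */
} worm_remap;

void remap_init(worm_remap *r)
{
    int i;
    for (i = 0; i < LOGICAL_BLOCKS; i++)
        r->map[i] = UNMAPPED;
    r->next_free = 0;
}

/* "Rewrite" a logical block: claim a fresh physical block and remember
   the translation. Returns the physical block number, or UNMAPPED when
   the disk is full. */
int remap_write(worm_remap *r, int logical)
{
    assert(logical >= 0 && logical < LOGICAL_BLOCKS);
    if (r->next_free >= PHYSICAL_BLOCKS)
        return UNMAPPED;
    r->map[logical] = r->next_free++;
    return r->map[logical];
}

/* Where does the current copy of a logical block live? */
int remap_lookup(const worm_remap *r, int logical)
{
    assert(logical >= 0 && logical < LOGICAL_BLOCKS);
    return r->map[logical];
}
```

Each rewrite consumes a new physical block, so the older copies remain on the disk but are no longer reachable through the table.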
This article focuses on the third approach: A file system designed
specifically for use with write-once optical disks. The goal of our WOFS
project was twofold: To invent a file system standard which defined the means
of arranging information into files on the optical disk; and to create a file
system implementation -- a function library written in the C programming
language which would let us test the design.
From the beginning, we had high hopes for our project. We designed the file
system so that it would be applicable to both read-only and rewritable optical
disks, in addition to write-once. (Our system was designed one year before the
High Sierra CD-ROM standard.) We chose not to take advantage of special
features that were present in the drives made by certain manufacturers, so
that the software would work with any drive. (See the accompanying text box
entitled "The Problem with Post Fields.") Most importantly, we designed the
file system to be operating system independent, so that files written to a
disk with one operating system could be read back with another.
Over the past five years, WOFS has undergone three major redesigns, most
recently in November 1989. Today it is a full-fledged file system that
provides many of the function calls specified by the POSIX standard, including
open( ), close( ), read( ), and write( ). WOFS has been ported to both MS-DOS
and Unix; optical cartridges can be moved between the two systems, and files
from one operating system can be shared with the other, even if the two
operating systems are running on processors that use different byte orders.
WOFS also allows the user to access previous versions of files, as well as
take the entire disk "into the past."


Differences from Conventional Systems


Magnetic file systems update the data stored in files by rewriting the blocks
that the data is stored in. Files are deleted from directories by rewriting
the directory with the file's name missing, and returning the blocks
associated with the file to the pool of unused blocks.
When working with write-once devices, however, things are more complicated.
Because the physical blocks on a write-once disk cannot be changed, files and
directories that are logically changed must be rewritten to new locations. The
role of the write-once file system is to keep track of the most recent version
of each directory and file on the disk and permit them to be found quickly. A
good write-once file system also minimizes the number of blocks that have to
be written for any given operation.
WOFS does not store information on the optical disk in MS-DOS, Unix,
Macintosh, or any other "standard" format. Instead, WOFS uses its own format;
an operating-system-specific interface allows existing operating systems to
read and write files on the optical disk.
Making WOFS CPU-independent was tricky, however, because different kinds of
microprocessors store data in different ways. The Intel 80286 microprocessor,
for example, stores 16-bit integer values on 2-byte boundaries; the Motorola
68020 processor stores 16-bit integers on 4-byte boundaries. Different
microprocessors use different strategies for sign extension when converting
from 16-bit values to 32-bit values.
To get around these problems, WOFS stores all directories and other file
system information with a small set of predefined structures which were
specifically designed to be portable and readable by many different kinds of
microcomputers. The only two data types in the structures are 32-bit unsigned
numbers and null-terminated character strings. Strings are always padded to a
multiple of 4 bytes; WOFS solves word-alignment problems by storing all data
on 4-byte boundaries. WOFS also has a mechanism for detecting and swapping
byte order when necessary.
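As a rough illustration of the padding rule, the following C sketch computes how many bytes a string would occupy on disk. The function name wofs_padded_len is hypothetical, and the sketch assumes the null terminator is counted before rounding up; the article does not say which way WOFS handles this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes a null-terminated string occupies on disk once padded to a
   multiple of 4. Assumption: the terminator is counted before rounding. */
size_t wofs_padded_len(const char *s)
{
    size_t n;
    assert(s != NULL);
    n = strlen(s) + 1;                 /* include the terminating null */
    return (n + 3u) & ~(size_t)3u;     /* round up to a multiple of 4 */
}
```

Under that assumption a three-character name fits in 4 bytes, while a four-character name needs 8, so whatever follows the string always starts on a 4-byte boundary.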


Basic Structures


WOFS records new data and updates to the optical file system as a series of
sequential block-write operations, starting with the first block on the disk
and continuing until the end.
The WOFS approach divides the disk into two discrete regions: One in which
blocks have been recorded and one in which they are blank. The transition
point between the two regions is called the "last written block."
By not having a specific part of the disk dedicated to directories or file
pointers (such as Unix inodes), WOFS eliminates a problem that is common with
other file systems: Part of the disk fills up, making the disk unable to hold
more information, while other parts of the disk remain empty.
When a disk is mounted, WOFS finds the last written block with a binary
search: If the block examined contains valid data, WOFS searches higher on the
disk for the last written block; if the block does not contain data, WOFS
searches lower on the disk.
Although a binary search across the disk requires much "seeking" (movement of
the optical head), each seek is half the distance of the previous seek. If
there are N blocks on the disk, the search needs at most about log2(N) probes.
In practice, WOFS takes less than four seconds to find the last written block
on a 5.25-inch optical disk drive.
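The mount-time search can be sketched in C as an ordinary binary search over block numbers. This is only an illustration: the probe callback, its signature, and the simulated disk in demo_probe are assumptions made for the example, not part of WOFS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe: returns nonzero if `block` holds recorded data. */
typedef int (*probe_fn)(unsigned long block, void *ctx);

/* Returns the index of the last written block, or -1 if the disk is blank.
   Relies on the layout described above: written blocks form a single run
   that starts at block 0. */
long find_last_written(unsigned long nblocks, probe_fn is_written, void *ctx)
{
    long lo = 0, hi = (long)nblocks - 1, last = -1;
    assert(is_written != NULL);
    while (lo <= hi) {
        long mid = lo + (hi - lo) / 2;
        if (is_written((unsigned long)mid, ctx)) {
            last = mid;          /* written: the answer is here or higher */
            lo = mid + 1;
        } else {
            hi = mid - 1;        /* blank: the answer is lower */
        }
    }
    return last;
}

/* Simulated disk for demonstration: blocks 0 .. *ctx - 1 are written. */
int demo_probe(unsigned long block, void *ctx)
{
    return block < *(unsigned long *)ctx;
}
```

Each pass through the loop halves the remaining range, which is where the logarithmic probe count comes from.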



Transactions and Filemaps


WOFS writes to the optical disk in groups of operations called transactions. A
transaction might be creating a file, changing its data, or erasing a file or
directory; it also might consist of many operations grouped together. At the
end of each transaction, WOFS writes a special block called the End of
Transaction Block, or EOT.
When a disk is mounted, WOFS locates the last written block on the drive and
checks to see if it is an EOT. If it isn't, then the last transaction was
interrupted by a hardware failure (for example, a loss of power or a pulled
cable); WOFS then searches backwards on the disk, block-by-block, to find the
last complete transaction.
Every EOT contains a pointer to a block (or blocks, if necessary) that stores
a list of current directories. The first directory in the list is the root
directory.
Additional fields in the EOT are used to maintain the disk volume name and
accounting information, such as the time of the last transaction. The EOT also
has a pointer to the previous EOT, which is used for stepping the disk into
the past.
WOFS uses the same basic structure to remember where both files and
directories are stored. We call it a fragment table, but it can be thought of
as a list of regions on the disk where data is located and pointers that tell
the file system how to assemble the data into a continuous file.
When a file that has been written is closed (or when a modified directory
entry is written to the disk), WOFS writes two additional blocks to the disk
for that file: a "filemap," and a "file header."
The filemap is the database that tells WOFS where on the disk the data in the
logical file is actually written. Each record in the database has three
fields:

 * Ordinal (the byte position in the logical file where this data fragment
 starts)
 * Length (the number of bytes in this data fragment)
 * Starting block (the point on the disk where the data fragment is located)

Because WOFS writes most files contiguously on the optical disk, most files
are represented by a single fragment. For example, a file consisting of 65,536
bytes starting at block 1000 might have a filemap that looked like that in
Figure 1.
The filemap allows programs that update records within a file (such as
database programs) to be used with the optical disk. If a program were to
reopen file AAA, as in Figure 1, and rewrite the first 1024 bytes of the file
with new information, the new filemap might look like that shown in Figure 2
when the file is closed (assuming a disk-block size of 512 bytes).
If this file is then opened for reading, WOFS will return to the calling
program the 1024 bytes starting at block 1140, followed by the 64,512 bytes
that start at block 1002 (two 512-byte blocks past the start of the original
data at block 1000).
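To make the role of the three filemap fields concrete, here is a minimal C sketch of translating a logical file offset into a disk block. The struct layout, the function name filemap_lookup, and the assumption that every fragment begins on a block boundary are ours for the purpose of illustration; the actual WOFS code is not shown in this article.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 512UL   /* assumed disk-block size, in bytes */

/* One filemap record, with the three fields described above. */
typedef struct {
    unsigned long ordinal;      /* byte position in the logical file */
    unsigned long length;       /* number of bytes in this fragment */
    unsigned long start_block;  /* disk block where the fragment begins */
} fragment;

/* Finds the disk block that holds logical byte `offset`.
   Returns 0 and fills *block and *within on success, or -1 if the
   offset lies outside every fragment. */
int filemap_lookup(const fragment *map, size_t nfrags, unsigned long offset,
                   unsigned long *block, unsigned long *within)
{
    size_t i;
    assert(map != NULL && block != NULL && within != NULL);
    for (i = 0; i < nfrags; i++) {
        if (offset >= map[i].ordinal &&
            offset - map[i].ordinal < map[i].length) {
            unsigned long rel = offset - map[i].ordinal;
            *block  = map[i].start_block + rel / BLOCK_SIZE;
            *within = rel % BLOCK_SIZE;
            return 0;
        }
    }
    return -1;
}
```

As a hypothetical case, a 4096-byte file first written at block 4000 whose first 1024 bytes were later rewritten at block 5000 would carry the two records {0, 1024, 5000} and {1024, 3072, 4002}. Offset 0 then resolves to block 5000, and offset 1024 to block 4002.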
Of course, the filemap for version 1 of file AAA is still on the write-once
disk; the user can still access the original version of the file by reading
the file with the first filemap instead of the second. A special function
allows the calling program to find out which version of the file it is
reading. Another special function is provided which can step individual files
-- or an entire optical cartridge -- backwards in time.
At the heart of WOFS is a set of state machines that manipulate filemaps and
transfer information to and from the optical disk. These state machines
provide many standard file-system functions, as well as some that are only
possible because of the WOFS filemaps. These functions include the following:
filestr_seek
filestr_read
filestr_write
filestr_insert
filestr_delete
filestr_check
The filestr_seek function repositions the file pointer in much the same way as
lseek( ). Likewise, filestr_read works like read( ), and filestr_write is like
write( ). The function filestr_insert has no equivalent in other file systems.
This function opens up the file and inserts bytes at the current position
within the file, making the file longer. A converse function, filestr_delete,
also has no standard equivalent, and deletes bytes at the current position,
making the file smaller.
Finally, filestr_check verifies that the variables used by the state machines
are self-consistent. Frequent use of this function made it relatively easy to
find and isolate programming errors during the file system's development.
The Insert and Delete functions are used primarily to maintain directories; by
keeping the directory entries in sorted order, WOFS reduces the average time
needed to create a file by a factor of two.
The state machines are optimized to transfer data directly between the optical
disk interface and user memory when more than a complete block of data is
transferred. Thus, WOFS often performs reads and writes of large files at the
maximum possible transfer rate of the hardware.


Directory Entries


All of the operations that file systems perform can be broken down into four
main categories:

 * Looking up a name in a directory
 * Making an entry in a directory (for a file or a subdirectory)
 * Removing an entry in a directory
 * Transferring data between an open file (or directory) and the operating
 system

To resolve a filename, WOFS starts at the root directory and searches for the
names of each successive subdirectory, one-by-one. Eventually, WOFS determines
the location on the disk of the file's containing directory. These
translations are cached, which substantially improves the file system's
performance.
Like the data in files, each WOFS subdirectory is stored in one or more
fragments. The actual directory entries are built out of one or more
variable-length records. The contents of a directory entry are shown in Figure
3.
Figure 3: Contents of a directory entry

 * directory entry flag (0x00001000, used for
 byte-swap detection)
 * size of directory entry
 * file type
 * modification time
 * file version number
 * x,y location (for Macintosh OS)
 * file header location
 * file length in bytes
 * filename

The first four bytes of the directory record are the directory entry flag,
0x00001000. If WOFS reads 0x00100000 instead of 0x00001000, all of the binary
values stored in the directory entry are byte-swapped. This might happen if
the directory entry was written by a VAX and read back by a SPARC
microprocessor -- the VAX and the SPARC store the bytes of the same binary
value in opposite orders. By detecting byte-swaps and swapping the data back, WOFS allows
cartridges written with one byte order to be read back on a system that uses
another.
WOFS only swaps on read. If WOFS running on a SPARC updates a directory
written by WOFS running on a VAX, it does not swap the binary values back
before writing. Chances are that data written by one microprocessor will be
read back by that same microprocessor; if the directory entry is eventually
read again by a VAX, the VAX will swap the data when it reads it.
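The swap-on-read idea can be sketched in a few lines of C. The flag value 0x00001000 comes from the description above; the function names and the return-code convention are hypothetical, and the sketch handles only the 32-bit numeric words of an entry (the filename string would have to be left untouched).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIR_ENTRY_FLAG 0x00001000u   /* flag value given in the text */

/* Reverse the four bytes of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Treats words[0] as the directory entry flag.
   Returns 0 if the entry is already in host order, 1 if every word was
   swapped in place, and -1 if the flag is not recognized either way. */
int fix_byte_order(uint32_t *words, size_t nwords)
{
    size_t i;
    assert(words != NULL);
    if (nwords == 0)
        return -1;
    if (words[0] == DIR_ENTRY_FLAG)
        return 0;
    if (words[0] != swap32(DIR_ENTRY_FLAG))
        return -1;
    for (i = 0; i < nwords; i++)
        words[i] = swap32(words[i]);
    return 1;
}
```

Because the flag is not a byte-wise palindrome, its swapped form (0x00100000) is distinguishable from the native form, so a single comparison is enough to decide whether to swap.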
Directory entries are created with the filestr_insert( ) function; likewise,
they are deleted with the filestr_delete( ) function. A function called
wofsfile_readdir( ) reads directory entries and translates them into a form
appropriate for Unix, MS-DOS, or the Macintosh operating system.
When a file is opened for reading or writing, that file's fragment table is
loaded into memory and a state machine is set up to handle data transfer. When
the file is closed, the fragment table is written back to the optical disk,
followed by a file header.



Individual Files


Every WOFS file has a file header. File headers contain all of the information
in the directory entry -- allowing the directory entry to be reconstructed in
the event of media damage -- as well as additional information used by
operating systems other than MS-DOS. File headers can be extended to include
security-related information such as access control lists used by newer,
security-conscious operating systems. Normally, however, the file header is
used only by WOFS and remains invisible to the user.
The file header fits within a single WORM block. The information it contains
is shown in Figure 4.
Figure 4: Contents of the file header block

 * File header version number
 * File number
 * File type
 * File version number
 * Location of previous version
 * Directory number of containing
 directory
 * Filename (without directory name prefix)
 * Time of last write
 * Time of file creation
 * Number of bytes in file
 * x,y (for Macintosh OS)
 * Location on disk of fragment table
 * Number of fragments in fragment
 table
 * Name and version of operating
 system that created file
 * Name of site that created file



File System Blocks


EOTs, File Headers, and Fragment Tables are all stored on the optical disk
within a special kind of data structure called a File System Block (FSB). The
FSB has a special 12-byte header that makes it possible to identify the block
in the event of a media failure, which allows for quick and reliable recovery
of the data on the WORM disk.
The FSB header contains a 4-byte FSB flag, a 4-byte "type" field which
indicates whether the FSB holds an EOT, a file header or a fragment table, and
a 4-byte self-referential pointer that contains the disk block address where
the FSB was actually written. In C, the FSB has the structure shown in Figure
5.
Figure 5: FSB structure declarations

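 typedef struct {
     u_long flag;        /* 4-byte FSB flag */
     u_long location;    /* disk block address where this FSB was written */
     u_long type;        /* EOT, file header, or fragment table */
 } fs_hdr;

 typedef struct {
     fs_hdr hdr;
     char data[ BLOCK_SIZE - sizeof(fs_hdr) ];
 } fs_block;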

In the event of media failure, the user can scan the entire optical cartridge
with a special recovery program that searches for FSBs that hold file headers.
The entire directory hierarchy can then be automatically recovered. In
practice, this procedure takes less than 20 minutes for a 400-Mbyte cartridge.
Identifying a file-system block by both self-referential pointer and 4-byte
flag minimizes the chance that a random block of data will appear to be an
FSB.
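A recovery scan might test each candidate block along the following lines. This is a sketch under stated assumptions: the article gives neither the FSB flag value nor the type codes, so FSB_FLAG and the three FSB_TYPE_ constants below are placeholders, and fixed-width uint32_t fields stand in for u_long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FSB_FLAG            0x46534221u  /* placeholder; real value not given */
#define FSB_TYPE_EOT        1u           /* placeholder type codes */
#define FSB_TYPE_FILE_HDR   2u
#define FSB_TYPE_FRAG_TABLE 3u

typedef struct {
    uint32_t flag;
    uint32_t location;
    uint32_t type;
} fsb_hdr;

/* Returns 1 if the header read from disk block `block_addr` looks like a
   genuine FSB in host byte order: the flag matches, the self-referential
   pointer names the block it was read from, and the type is known. */
int fsb_looks_valid(const fsb_hdr *h, uint32_t block_addr)
{
    assert(h != NULL);
    if (h->flag != FSB_FLAG)
        return 0;
    if (h->location != block_addr)
        return 0;
    return h->type >= FSB_TYPE_EOT && h->type <= FSB_TYPE_FRAG_TABLE;
}
```

Requiring the location field to equal the address the block was read from means that a block of user data which happens to begin with the flag bytes is still very unlikely to pass.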
WOFS also uses the FSB to allow implementations running on computers with one
byte order to use files written by implementations using another byte order.
If the WOFS function that reads the EOT discovers that the byte order of the
FSB flag is reversed, it reverses the byte order of every 4-byte integer value
stored in the EOT. Like the byte-swapping for directory entries, this swapping
happens only on read operations. When the EOT is written to the disk at the
conclusion of the next transaction, it is written in the new byte order. The
functions which read file headers and fragment tables swap similarly when
necessary.


The Operating System Interface


Although it is possible to link an application program directly with the WOFS
function library, most developers choose to access write-once optical disks
through the operating system. Having the WORM disk behave like a regular
magnetic drive lets developers develop and test their programs with
conventional hard disks, before they go to the expense of purchasing an
optical subsystem.
Presently, WOFS runs transparently under MS-DOS and Unix. The MS-DOS
implementation uses a terminate-and-stay-resident (TSR) program that
intercepts all uses of software interrupt 0x21. Each software interrupt is
examined to see if it should be handled by WOFS or by MS-DOS, and then the
appropriate functions are called. This is the same technique used by a variety
of network systems available for MS-DOS.
One significant problem we encountered in writing the TSR was that many DOS
application programs use MS-DOS functions that are undocumented. Along the
way, we learned that many application programs currently on the market -- such
as WordPerfect -- use the File Control Block (FCB) disk I/O access mechanism
left over from MS-DOS Version 1.0. This is despite the fact that ever since
MS-DOS Version 2.0, Microsoft has declared these functions obsolete and asked
developers to avoid using them. The problem is that it is very difficult to
properly implement all of the subtleties of the FCB; it took several months of
trial and testing before we stopped finding problems with our implementation.

One problem that remains with WOFS is memory consumption. The WOFS core
requires approximately 50 Kbytes of code and another 40 Kbytes of data space.
These memory constraints are modest under every operating system environment
with the exception of MS-DOS. Unfortunately, MS-DOS is a very large part of
today's write-once market. We are currently working on an EMM version of WOFS
which will lower the MS-DOS low-memory requirements from 90 Kbytes to less
than 50 Kbytes.
People who have the Unix operating system can use WOFS with a server we have
developed for Sun Microsystems' Network File System. A workstation on a local
area network simply mounts the WOFS disk the same way it would any other
remote file system. The server program translates all network requests into
the appropriate WOFS operations on the optical drive. The WOFS NFS server
makes WOFS available to an extremely large base of users.
A WOFS interface for the Macintosh operating system is currently under
development. One of the biggest expected users is the graphic arts community,
which will be able to use WOFS as an alternative to local area networks for
transporting large image files between computers running different operating
systems.


The Evolution of the Design


The design of WOFS has undergone several iterations. The original system was
developed in 1985 as a research file system for experimenting with basic
concepts. This was used at the MIT Media Lab for work with read-only and
write-once compact disks. Although we used that version to master a number of
CD-ROMs, we never successfully integrated it into a running operating system.
The limitation of the first WOFS was that files could only be stored
contiguously, and directories had to be buffered in memory in their entirety.
The first major rewrite introduced the idea of fragmented files, which allowed
files and directories to be updated a piece at a time. The second rewrite
eliminated dynamic allocation of memory inside the WOFS core, which was
necessary to allow the WOFS to interface with the kernels of real operating
systems, such as MS-DOS and Unix. The third rewrite replaced all 2-byte
integers stored in the file system structures with 4-byte integers, to
facilitate adapting WOFS to computer architectures such as the 68000, which
can store integers only on 4-byte boundaries.
WOFS's feature set has remained basically unchanged for the past five years.
From the beginning, the system had to provide all of the basic Unix functions
for file and directory access with off-the-shelf write-once optical disks.
The biggest surprise in developing WOFS was that it was much harder to achieve
our original design goals than we ever anticipated. It took four years of
thinking about the byte-swapping problem before we realized that we could
support heterogeneous byte orders on the same disk -- to the point of
switching byte orders on successive directory entries. Likewise, it was
difficult to develop a set of file structures that would work reliably and
efficiently under a variety of different operating criteria; every time we
thought we had solved the problem, we discovered a way to improve the
structures that would make them either more efficient, more portable, or
interpretable with fewer lines of code.
We were not alone in encountering these difficulties. Indeed, one of the
reasons that write-once optical drives have failed to catch on is the dismal
state of most write-once software.


The Rewritable Future


Perhaps the industry's current excitement about rewritable optical disks comes
from a belief that finally, here is a high-density mass storage system that
can be used unmodified with existing operating systems.
Unfortunately, in jumping on the rewritable bandwagon, I believe that we will
be losing the very characteristic that attracted me to write-once in the first
place: data permanence. I have always been comforted by the idea of being able
to undelete any file that I have previously deleted. You can do that with
WOFS. You can't do that with Unix or MS-DOS file systems running on rewritable
optical disks.
Nevertheless, not all data needs to be permanently archived. Furthermore,
market pressure is now improving the performance and price of rewritable disks
faster than those of write-once disks. For this reason I am now in the
process of designing an improved version of WOFS, "WOFS 2," which will work
interchangeably with write-once and rewritable optical disks.
Why use WOFS 2 with a rewritable optical disk when you can just as well use
rewritable optical disks with native Unix or MS-DOS file systems? There are
two reasons: portability and speed. WOFS 2 will retain WOFS's ability to move
disks between operating systems and CPU types. And because it will be
specifically optimized for the performance characteristics of optical disks,
it will be faster and more reliable when used with optical disks than file
systems developed for the hard-disk technologies of the 1970s.
Write-once compact disks, the media that WOFS was designed to be used with,
may be several years in the future. But WOFS exists today and is in productive
use.


The Problem With Post Fields


In the early days of write-once, some manufacturers tried to add special
features to their drives to overcome the difficulty of using write-once with
traditional operating systems. One popular system was called "post fields,"
small records at the end of each block of data that could be recorded
independently from the block itself. Post fields were used as pointers to
newer versions of blocks. If the operating system had to rewrite block 500,
the new data might be written in block 567 and the post field of block 500
would be set to "567."
There are two primary difficulties with post fields: speed and reliability.
The more often a block is modified with a post-field-based file system, the
longer it takes to find that block. This is a big problem with blocks that are
used to store directories, for if the directory is modified 100 times, then
100 blocks and post fields have to be read in order to find the most recent
version.
A second problem with post fields arises because of the inherent unreliability
of write-once disks. WORM disks frequently contain blocks with bad bytes in
them. WORM drives get around this problem by reading back every block after it
is written and rewriting it in another location if required. The problem with
a post-field-based scheme is that occasionally it is the bytes in the post
field itself that are bad, which makes it impossible to make the post field
point at the rewritten data.
-- S.G.


January, 1991
GRAPH DECOMPOSITION


Imposing order on chaos




Edward Allburn


Edward has been developing software professionally since 1983. Currently he is
working as a team leader for a company that develops software for the
financial community. Ed can be reached at 4830 S. Salida Ct., Aurora, CO
80015; or through CompuServe 72617,2624.


The graph is one of the more versatile data structures. In its simplest form,
a graph consists of a collection of vertices connected by edges, and a cost or
weight is often associated with each edge. An example use of such weighted
graphs can be found in the classic "Shortest Network" problem (also known as
the Steiner tree problem), whereby the goal is to find the shortest way to
connect all of the vertices of a graph. Given a list of cities, for example,
design the shortest possible highway system that will connect them.
The key to solving this problem is to create imaginary vertices in the midst
of the real ones. These imaginary vertices are called "Steiner" points after
their inventor, the nineteenth-century mathematician Jakob Steiner. The
equilateral triangle in Figure 1(a) provides a good demonstration of the value
of Steiner points. Assuming that each edge is 100 units long, without Steiner
points it takes a total of 200 units to connect the three vertices. However,
it takes only 175 units to connect the vertices if a single Steiner point S is
placed in the middle of the triangle and the edges are redrawn so A, B, and C
each have a single edge to S. Figure 1(a) shows the original graph, and Figure
1(b) shows the graph after adding the S vertex and redrawing the edges.
With a bit more work, a solution can be found that connects the vertices using
only 173.2 units. (It can, in fact, be shown that for any equilateral triangle
the shortest network has length N multiplied by the square root of 3, where N
is the length of an edge.)
Graphs have a wide variety of other applications. Their uses range from
modeling complicated network-flow systems to simply representing sets of
related elements.


Graphs and Disjoint Sets


An important property of graphs is that it is not necessary for all of the
vertices to be connected. A collection of vertices that are connected to each
other is called a "connected component," while a series of edges that allow
one vertex to reach another is called a "path." Thus, a graph is made up of a
collection of connected components, which are in turn made up of a collection
of connected vertices. Figure 2(a), Figure 2(b), and Figure 2(c) each show a
single graph.
From these figures, it is apparent that connected components provide a natural
representation for disjoint sets (none of the elements in one set appear in
any other set). In addition, indirect relations between set elements are also
represented. Figure 2(a) provides a good example of this. In this graph, both
vertex 1 and vertex 7 are directly related to (that is, have an edge to) vertex
5. Although vertices 1 and 7 are not directly related, it is possible to
determine that both belong to the same set because a path exists that connects
them.
A variety of common questions can be asked of disjoint sets:
1. Does a path exist that allows vertex A to reach vertex B?
2. Is the graph connected (that is, comprised of only one connected
component)?
3. How many connected components is a graph comprised of? What are the
vertices in each of these connected components?
4. How many vertices are not connected to any other vertex? What are they?
5. What are the largest and smallest connected components?
6. What is the average size of all the connected components?
7. What are the min and max vertex values of every (or a single) connected
component?
8. Given a vertex, how many other vertices are in the same connected
component? What are they?
The field of "connectivity" deals with these and other similar questions.
At my workplace, graphs are used in just this fashion. Our processing system
creates databases where sets of related objects are grouped together. The
first section of the system generates a large, complex graph. It does this by
comparing two objects at a time, outputting the pair of object numbers if the
objects are determined to match. In this fashion, a graph is generated one
edge at a time. Because the databases often have over 1,000,000 objects, the
graphs generated by the system often have over 1,000,000 vertices and are
comprised of several million edges. The only way the system can represent such
a huge graph is by saving in a file the list of the graph's edges. We then use
this huge graph to determine what sets of records should be grouped together
(question 8 above). This is where a problem arises.


The Problem


The problem is to find an efficient algorithm for determining if a path exists
that connects a pair of vertices for all possible pairs of vertices (and do it
before the sun burns out or the entire universe collapses into a black hole).
Using the graph shown in Figure 2(c) as an example, it is easy to answer this
question for vertices 1 and 2 because the vertices are directly related by the
edge (1, 2). However, answering this question for the next pair of vertices, 1
and 3, is much more difficult. To determine if a path exists that connects
this pair of vertices, you must this time "traverse" several of the edges in
vertex 1's connected component. In the case of the next pair of vertices, 1
and 4, verifying that a path does not exist requires one to traverse all of
the edges in vertex 1's connected component. Algorithms used to answer this
and other questions of connectivity are known as "union-find" algorithms.


Existing Solutions


Warshall{1} developed an algorithm that requires only a single operation to
determine if a path exists that connects a pair of vertices. It works by
loading the graph into a Boolean "adjacency matrix" and then finding the
"transitive closure" of the matrix. (Refer to Sedgewick (1988) for a complete
discussion of adjacency matrices and transitive closure.) Figure 3(a)
illustrates the adjacency matrix for the graph shown in Figure 2(c). Figure
3(b) shows the adjacency matrix after the transitive closure has been found.
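Figure 3: (a) Adjacency matrix for graph in Figure 2(c), (b) transitive
closure of the adjacency matrix. (X marks an edge; . marks no edge.)

               111               111
      123456789012      123456789012
    1 .X..........    1 .XX.XX......
    2 X...XX......    2 X.X.XX......
    3 .....X......    3 XX..XX......
    4 ......XX....    4 ......XX....
    5 .X...X......    5 XXX..X......
    6 .XX.X.......    6 XXX.X.......
    7 ...X...X....    7 ...X...X....
    8 ...X..X.....    8 ...X..X.....
    9 ............    9 ............
   10 ............   10 ............
   11 ...........X   11 ...........X
   12 ..........X.   12 ..........X.

           3a                3b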

With the finished matrix, each vertex has an edge to all of the other vertices
in the same connected component. For example, the adjacency matrix shown in
Figure 3(b) now depicts the edge (1, 3). Thus, it now only takes a single
operation (if table [1,3] = X ... ) to determine if two vertices are in the
same connected component.
There are two significant drawbacks with this algorithm, however. The first is
that there is a tremendous amount of overhead in finding the transitive
closure of the graph. This is done with the brute-force method shown in
Example 1.
Example 1: Brute-force method of finding the transitive closure of graph.

 for y := 1 to N do
     for x := 1 to N do
         if matrix[x,y] = TRUE then
             for i := 1 to N do
                 if matrix[y,i] = TRUE then
                     matrix[x,i] := TRUE;

From the three nested loops, it is apparent that this process is O(N^3)
(where N indicates the number of vertices). (Refer to Sedgewick (1988) for a
complete discussion on "Big O-Notation".) The second drawback is that the
matrix requires an enormous amount (N^2) of memory. Even if a matrix of bits
is used, the quadratic expansion of the matrix rapidly makes it impractical
for use with larger graphs. Table 1 illustrates this.
Table 1: The quadratic expansion of the matrix rapidly makes it impractical
for use with larger graphs.

 Vertices          Memory Requirements
 -------------------------------------

 12                 18.00 Bytes
 1 thousand        125.00 Kbytes
 10 thousand        12.50 Megabytes
 100 thousand        1.25 Gigabytes
 1 million         125.00 Gigabytes
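
The arithmetic behind these figures is simply N squared bits divided by eight. A minimal C sketch follows; the function name bit_matrix_bytes is ours, and it assumes the matrix is packed one bit per cell with no row padding.

```c
#include <assert.h>

/* Bytes needed for an N-by-N Boolean matrix stored one bit per cell. */
unsigned long long bit_matrix_bytes(unsigned long long n)
{
    assert(n <= 4294967295ULL);        /* keep n * n within 64 bits */
    return (n * n + 7ULL) / 8ULL;      /* round up to whole bytes */
}
```

For example, 12 vertices need 18 bytes and 1,000 vertices need 125,000 bytes; each tenfold increase in vertices multiplies the requirement by one hundred.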

A second common solution is to convert the graph into a "forest" of "spanning
trees." Each spanning tree represents a single connected component of the
graph. Both Prim{2} and Kruskal{3} have invented algorithms for finding
"minimum" spanning trees in a weighted graph. Simpler forms of either of these
algorithms can be used for finding the spanning trees of a conventional graph.
Worst-case traversal times of the trees can be substantially reduced by using
techniques of weight/height balancing and path compression on the trees.
Although the overhead operations of these algorithms are more complex than
Warshall's, both execute significantly faster. In addition, the forest of
spanning trees will require much less memory. Even so, graphs 1,000,000
vertices in size could easily consume tens of megabytes of memory with this
solution.
Because of the tremendous size of the graphs involved, it was apparent that
both of these existing solutions would require a virtual memory mechanism. It
was also apparent that, even with a cache mechanism installed, neither
solution would have an acceptable execution time. It was time to develop a new
algorithm.


A New Solution


For the time being, I decided to ignore implementation issues and focus solely
on the high-level algorithm. In doing so, I had a working solution relatively
quickly. The new algorithm I developed requires only one pass through the
file. In addition, neither sorting nor ordering of the vertices is required.
Example 2 shows the high-level pseudocode. As is often the case, the algorithm
itself is disarmingly simple.
Example 2: High-level pseudocode for the new algorithm.

 1. Read the edge (A,B)
 2. Determine if the A vertex has been seen before.
 3. Determine if the B vertex has been seen before.
 4. BRANCH
    a. Neither seen before = create a new set.
    b. Only A seen before = append B to A's set.
    c. Only B seen before = append A to B's set.
    d. Both seen before = BRANCH
       1. Determine if both vertices are already in the same set.
       2. BRANCH
          a. Both in same set = do nothing.
          b. Each in different set = merge the two sets.
 5. If not EOF, goto 1

The key section of this algorithm is determining if both vertices are already
in the same set. The entire graph must be kept in memory if this is to be done
with any kind of efficiency. Thus, my new algorithm was faced with the same
issue as the existing ones -- that is, how to efficiently represent the graph
in memory. The challenge of this approach ended up being in the implementation
of the algorithm, not in its development.
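To show the branching logic of Example 2 in runnable form, here is a deliberately naive C sketch. It is not the data structure developed in this article: it keeps a plain array of set labels and merges two sets by relabeling every vertex, which costs O(N) per merge. The names set_of, add_edge, and same_component, and the bound of 12 vertices, are assumptions chosen for the illustration.

```c
#include <assert.h>

#define MAX_VERTEX 12            /* assumed upper bound on vertex numbers */
#define UNSEEN 0                 /* label meaning "vertex not seen yet" */

static int set_of[MAX_VERTEX + 1];   /* set_of[v] = label of v's set */
static int next_label = 1;

/* Process one edge (a,b), following steps 2 through 4 of the pseudocode. */
void add_edge(int a, int b)
{
    int v;
    assert(a >= 1 && a <= MAX_VERTEX && b >= 1 && b <= MAX_VERTEX);
    if (set_of[a] == UNSEEN && set_of[b] == UNSEEN) {
        set_of[a] = set_of[b] = next_label++;   /* 4a: create a new set */
    } else if (set_of[b] == UNSEEN) {
        set_of[b] = set_of[a];                  /* 4b: append B to A's set */
    } else if (set_of[a] == UNSEEN) {
        set_of[a] = set_of[b];                  /* 4c: append A to B's set */
    } else if (set_of[a] != set_of[b]) {
        int from = set_of[b], to = set_of[a];   /* 4d: merge the two sets */
        for (v = 1; v <= MAX_VERTEX; v++)
            if (set_of[v] == from)
                set_of[v] = to;
    }                                           /* same set: do nothing */
}

/* Nonzero if both vertices have been seen and share a set. */
int same_component(int a, int b)
{
    assert(a >= 1 && a <= MAX_VERTEX && b >= 1 && b <= MAX_VERTEX);
    return set_of[a] != UNSEEN && set_of[a] == set_of[b];
}
```

Calling add_edge once per line of the edge file gives the single pass described above; the relabeling loop in the merge branch is exactly the cost that a better in-memory representation has to avoid.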



A New Data Structure


One of the most common data structures used to represent a graph is the
adjacency list. With adjacency lists, each vertex is the head of a linked
list. The linked list contains all of the vertices that are adjacent to (that
is, have an edge directly to) the vertex at the head of the list. Arrays
are often used to store the heads of these linked lists, thus allowing for
very efficient lookups for a given vertex. Figure 3 illustrates the adjacency
lists for the graph shown in Figure 2(c).
The appeal of this structure, in addition to its simplicity, is its efficiency
in some cases. To determine if two vertices are adjacent, you just have to
follow the first vertex's adjacency list until either the second vertex or the
end of the list is found. However, this structure is inefficient for the more
general task of determining if two vertices are in the same connected
component. This is because the adjacency list of each vertex encountered must
also be searched. In this case, spanning trees offer better efficiency both in
terms of lookup times and memory usage. This fact only added to my surprise
when I found a solution based not on spanning trees but, rather, on the humble
adjacency lists.
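The contrast between the two lookups can be sketched as follows. This is an illustrative Python fragment, not part of the original listings; the helper names are hypothetical, and a dictionary of lists stands in for the array of linked-list heads.

```python
from collections import deque


def build_adjacency(edges):
    """Build an adjacency list: each vertex maps to the vertices it touches."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def adjacent(adj, a, b):
    """Cheap test: scan only A's own list."""
    return b in adj.get(a, [])


def same_component(adj, a, b):
    """Costly test: the list of every vertex reached from A may be searched."""
    visited = {a}
    queue = deque([a])
    while queue:
        v = queue.popleft()
        if v == b:
            return True
        for w in adj.get(v, []):
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return False


adj = build_adjacency([(1, 2), (2, 3), (4, 5)])
print(adjacent(adj, 1, 3), same_component(adj, 1, 3))
```

Vertices 1 and 3 are not adjacent, yet they share a component; finding that out requires following vertex 2's list as well, which is the inefficiency described above.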
I redrew the adjacency list for one of the connected components of the graph
in Figure 2(c), this time vertically lining up the vertices (see Figure 5(a)).
Then I realized that if each linked list was "overlaid" on the other and the
Nil pointers canceled out, a circular linked list resulted (see Figure 5(b)).
The significance of this is that a circular linked list can be represented as
an array. Further, because each overlaid adjacency list represents a disjoint
set, all of them can be implemented in the same, single array. Figure 6(a)
shows the entire graph from Figure 2(c) represented as a series of overlaid
adjacency lists. Figure 6(b) illustrates the array implementation of the
structure.
Now that the entire graph could be efficiently represented in memory,
the only issue left was how to determine which vertices had been seen before.
Once again, the array provided the answer. Note that array[9] and array[10] are
empty. Because vertices 9 and 10 have not been encountered, their array
positions have never been filled in. Thus, it can be determined if a vertex
has been seen before by simply seeing if the vertex's array position is empty
or not. With this final mechanism in place, a detailed design of my
algorithm's implementation could now be done.
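A small Python sketch (an illustration only, not the article's code) shows how one array can hold several circular lists at once and how the "seen before" test falls out of it. The array contents below are hypothetical and are not those of Figure 6(b); EMPTY = 0 and the helper names are assumptions.

```python
EMPTY = 0  # assumed "empty" marker; real vertex numbers start at 1


def seen(array, vertex):
    """A vertex has been seen if its slot is no longer empty."""
    return array[vertex] != EMPTY


def ring_members(array, start):
    """Follow the circular list that contains `start` until it returns to `start`."""
    members = [start]
    v = array[start]
    while v != start:
        members.append(v)
        v = array[v]
    return members


# Hypothetical contents: ring 1 -> 2 -> 3 -> 1, ring 4 -> 5 -> 4,
# and vertex 6 never encountered. Slot 0 is unused padding.
array = [EMPTY, 2, 3, 1, 5, 4, EMPTY]

print(ring_members(array, 1))
print(ring_members(array, 5))
print(seen(array, 6))
```

Each slot holds exactly one successor, so the whole graph costs one array element per vertex no matter how many edges were read.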


The Implementation


Armed with this new data structure, designing the implementation of my
algorithm was almost trivial. Example 3 shows the detailed pseudocode. Because
the loop in section 4d1 may have to walk an entire set for each edge read, this
algorithm is O(N^2) in the worst case.
Example 3: Pseudocode for implementing algorithm.

 ASSUME: Max Vertex Value = 8
         "Empty" = 0
         array[1..Max Vertex Value] has been filled with "Empty"

 1. ReadEdge(A,B)
 2. if (array[A] <> "Empty") then A_Seen = TRUE else A_Seen = FALSE.
 3. if (array[B] <> "Empty") then B_Seen = TRUE else B_Seen = FALSE.
 4. BRANCH
    a. if (NOT(A_Seen OR B_Seen)) then {create a new set.}
          array[A] = B
          array[B] = A
    b. if (A_Seen AND (NOT B_Seen)) then {append B to A's set.}
          array[B] = array[A]
          array[A] = B
    c. if (B_Seen AND (NOT A_Seen)) then {append A to B's set.}
          array[A] = array[B]
          array[B] = A
    d. if (A_Seen AND B_Seen) then
       1. {Determine if both vertices are already in the same set.}
          temp = array[A]
          while ((temp <> A) AND (temp <> B)) do
             temp = array[temp]
          if (temp = B) then Same_Set = TRUE else Same_Set = FALSE
       2. BRANCH
          a. if (Same_Set) then {do nothing}
          b. if (NOT Same_Set) then {merge the two sets}
                temp = array[A]
                array[A] = array[B]
                array[B] = temp
 5. If not EOF, goto 1
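The merge in step 4d2b is only a swap of two successor pointers, which may not be obvious at first glance. The following Python sketch (illustrative only; the array contents and the names ring and merge_rings are hypothetical) traces what the swap does to two separate circular lists.

```python
def ring(array, start):
    """List the vertices on the circular list that contains `start`."""
    out, v = [start], array[start]
    while v != start:
        out.append(v)
        v = array[v]
    return out


def merge_rings(array, a, b):
    """Step 4d2b: exchange the successors of A and B."""
    array[a], array[b] = array[b], array[a]


# Two separate rings: 1 -> 2 -> 3 -> 1 and 4 -> 5 -> 4 (slot 0 unused).
array = [0, 2, 3, 1, 5, 4]

merge_rings(array, 1, 4)

# Vertex 1 now continues into what used to be 4's ring, and vertex 4
# continues into what used to be 1's ring, giving one ring of five.
print(ring(array, 1))
```

Applying the same swap to two vertices that already share a ring undoes the splice and splits the ring in two, which is why the same-set test of step 4d1 has to come first.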

To illustrate this algorithm's use, I have included a program that uses the
algorithm to determine how many connected components a graph is comprised of
(see Listing One). The program accomplishes this by simply incrementing a
counter when a new set is created and decrementing the counter when two sets
are merged. Although the implementation actually used at my workplace is
written in assembly, the listing shown with this article is in Pascal for the
sake of clarity and space. For these same reasons, the Pascal version always
allocates an array 10,000 vertices in size instead of calculating the size and
dynamically allocating the array from the system. Thus, this version of the
program can handle only graphs 10,000 vertices in size and smaller.
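For readers without a Pascal compiler at hand, the counting scheme can be sketched in Python. This is a loose transliteration of Example 3 and not a substitute for Listing One: it uses 0 as the "empty" marker as in the pseudocode (the Pascal listing uses a different sentinel), takes its edges from a Python list instead of a file, and the name count_components is hypothetical.

```python
EMPTY = 0  # "empty" marker as in Example 3; vertices are numbered from 1


def count_components(edges, max_vertex):
    """Return the number of connected components formed by `edges`."""
    array = [EMPTY] * (max_vertex + 1)  # slot 0 is unused
    total = 0

    for a, b in edges:
        a_seen = array[a] != EMPTY
        b_seen = array[b] != EMPTY

        if not a_seen and not b_seen:   # create a new set
            array[a] = b
            array[b] = a
            total += 1
        elif a_seen and not b_seen:     # append B to A's set
            array[b] = array[a]
            array[a] = b
        elif b_seen and not a_seen:     # append A to B's set
            array[a] = array[b]
            array[b] = a
        else:
            # Walk A's set until B is found or the walk returns to A.
            temp = array[a]
            while temp != a and temp != b:
                temp = array[temp]
            if temp != b:               # different sets: merge them
                array[a], array[b] = array[b], array[a]
                total -= 1

    return total


print(count_components([(1, 2), (3, 4), (5, 6), (2, 3)], max_vertex=8))
```

As in the listing, vertices that never appear in any edge are not counted as components of their own.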
The assembly language version of the program (see Listing Two) was written
with Phar Lap's 386ASM assembler and linked with Phar Lap's 386 DOS Extender.
Because of the DOS Extender, this version of the program is capable of
allocating arrays of several megabytes, and thus is capable of processing
graphs millions of vertices in size. Interested readers can obtain this source
listing from DDJ's CompuServe Forum or the listings disks. All of the
empirical analysis discussed in the next section was done using the Phar Lap
version of the program.
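Both listings read their edges from a binary file, "GAD.In", which the listing headers describe as a sequence of 8-byte records, each holding two 4-byte vertex values. A test file in that layout could be produced with something like the Python sketch below. The byte order and signedness ("<ii", little-endian signed 32-bit) are assumptions based on Turbo Pascal's longint on Intel hardware, and the helper names are hypothetical.

```python
import struct

# Assumption: two little-endian signed 32-bit integers per edge record.
EDGE_FORMAT = "<ii"
EDGE_SIZE = struct.calcsize(EDGE_FORMAT)  # 8 bytes per edge


def write_edges(path, edges):
    """Write (a, b) pairs as consecutive fixed-length binary records."""
    with open(path, "wb") as f:
        for a, b in edges:
            f.write(struct.pack(EDGE_FORMAT, a, b))


def read_edges(path):
    """Yield (a, b) pairs back from a file written by write_edges."""
    with open(path, "rb") as f:
        while True:
            record = f.read(EDGE_SIZE)
            if len(record) < EDGE_SIZE:  # end of file (or a truncated record)
                break
            yield struct.unpack(EDGE_FORMAT, record)
```

Calling write_edges("GAD.In", [(1, 2), (3, 4)]) would produce a 16-byte file; whether the shipped programs accept it unchanged depends on the assumptions above holding.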


Empirical Analysis


To test my implementation of the algorithm, I created a "worst-case" data set.
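By far the most expensive section of the algorithm is the same-set test that
precedes every merge of two sets, because the test must walk completely around
one set before it can conclude that the other vertex is not a member. The
worst-case data, therefore, should exercise this section as much as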
possible. The most convenient way I have found to do this is to read a two-way
tree from the bottom up. After the bottom level of the tree has been read,
each edge of every remaining level will cause two sets to be merged.
Conversely, the best case is where a two-way tree is generated from the top
down. Figure 7(a) and Figure 7(b) illustrate the worst case and best case,
respectively, for a max vertex value of eight.
For my test, I used a max vertex value of 2^20 (1,048,576). The resulting
two-way tree generated 2^20-1 (1,048,575) edges, 2^19-1 (524,287) of which
would cause sets to be merged. In order to isolate the algorithm's performance
from that of the disk I/O, I did two things: I wrote an I/O routine that
generated the edges directly instead of physically reading them from disk, and
I timed the execution of just this phony I/O routine. By subtracting this time
from the total execution time of the program I was able to isolate the
execution time of the algorithm.
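An edge source of this kind can be sketched in Python. Figure 7 is not reproduced in this text, so the tree shape below is an assumption: it pairs neighboring vertices at the bottom level and then joins the resulting blocks level by level, a shape chosen because it yields the edge counts quoted above (2^20-1 edges, of which all but the 2^19 bottom-level edges join two existing sets). The function names are hypothetical.

```python
import time


def two_way_tree_edges(max_vertex, bottom_up=True):
    """Yield the edges of a pairing tree over vertices 1..max_vertex.

    Assumes max_vertex is a power of two. bottom_up=True gives the
    assumed worst-case order; bottom_up=False gives the best-case order.
    """
    steps = []
    step = 1
    while step < max_vertex:
        steps.append(step)
        step *= 2
    if not bottom_up:
        steps.reverse()

    for step in steps:
        # Join the last vertex of each block of `step` vertices to the
        # last vertex of the neighboring block.
        for right in range(2 * step, max_vertex + 1, 2 * step):
            yield (right - step, right)


def time_edge_source(max_vertex):
    """Time the generator alone, so that its cost can be subtracted later."""
    start = time.perf_counter()
    count = sum(1 for _ in two_way_tree_edges(max_vertex))
    return count, time.perf_counter() - start


print(list(two_way_tree_edges(8)))
print(list(two_way_tree_edges(8, bottom_up=False)))
```

Timing the generator by itself and subtracting that figure from a full run is the same isolation technique described above, applied to the sketch instead of the assembly program.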
I ran the test on four different 80386 machines, using the 32-bit
protected-mode version of the program. Table 2 summarizes the machines'
configurations and my algorithm's performance on each (all times are in
seconds).

Table 2: Summary of algorithm performance.

 Brand     Speed   Cache   I/O    Algorithm   Total execution time
 ------------------------------------------------------------------

 Generic   16MHz   64K     6.48   29.06       35.54 seconds
 Generic   20MHz   0K      6.37   15.82       22.19 seconds
 ALR       25MHz   64K     3.74   11.97       15.71 seconds
 Compaq    33MHz   64K     2.72    9.37       12.08 seconds
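To put the "Algorithm" column in perspective, the times can be converted into a rough edges-per-second rate by dividing the 1,048,575 generated edges by each figure. The short Python sketch below does only that arithmetic on the numbers from Table 2; the function name is hypothetical.

```python
EDGES = 2 ** 20 - 1  # edges generated for the test (1,048,575)

# Algorithm-only times from Table 2, in seconds.
algorithm_seconds = {
    "Generic 16MHz": 29.06,
    "Generic 20MHz": 15.82,
    "ALR 25MHz": 11.97,
    "Compaq 33MHz": 9.37,
}


def edges_per_second(seconds, edges=EDGES):
    """Average number of edges processed per second."""
    return edges / seconds


for machine, seconds in algorithm_seconds.items():
    print(f"{machine}: {edges_per_second(seconds):,.0f} edges/s")
```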

As a final test, I created an actual file of the two-way tree of 2^20-1 edges
and timed the execution of the entire program. For this test I chose the
20-MHz machine, which had 8 Mbytes of RAM and a 16-millisecond Wren VI hard
disk. Even including all disk I/O time, the program processed over 1,000,000
vertices in less than 30 seconds. Although my new algorithm was O(N^2) in the
worst case, it proved to be a fast N^2.


Graph Array Decomposition


I named the algorithm Graph Array Decomposition because an array is key to the
implementation. With this algorithm, you can quickly find answers to common
connectivity questions such as those identified earlier. It differs from most
other union-find algorithms in several respects:
The conservative memory requirements allow very large graphs to be manipulated
entirely in memory. Thus, neither paging nor virtual memory mechanisms are
required. This results in very fast execution times.
It is extremely simple to implement. Most other algorithms require building
trees with weight/height balancing and path compression (via halving or
splitting) mechanisms for efficient execution times. This has two important
implications: There is much less likelihood of errors in the implementation,
and the small amount of code needed will likely fit entirely into the
instruction cache of machines so equipped.
No time is wasted processing vertices that do not have any edges. In addition
to saving time, this allows data caches installed on the machine to be much
more effective.
Other algorithms preserve all of the information about the graph's structure.
This algorithm "simplifies" the graph before attempting to use the graph's
information.
I am releasing the Graph Array Decomposition algorithm into the public domain;
I encourage developers to use and modify this algorithm as they see fit. Any
observations or suggestions people have about ways to improve my
implementation will be greatly appreciated. Even a savings of a few clock
cycles adds up when the code is being executed millions of times. Because the
most expensive section of the algorithm is determining if both vertices are
already in the same set, that is where I expect opportunities for further
optimization to be found.


References


1. S. Warshall. "A Theorem on Boolean Matrices," Journal of the ACM, 9:1
(1962), 11-12.
2. R.C. Prim. "Shortest Connection Networks and Some Generalizations," Bell
System Technical Journal, 36 (1957), 1389-1401.
3. J.B. Kruskal, Jr. "On the Shortest Spanning Subtree of a Graph and the
Traveling Salesman Problem," Proceedings of the American Mathematical Society,
7:1 (1956), 48-50.


Further Reading


M.W. Bern and R.L. Graham, "The Shortest Network Problem," Scientific
American, January, 1989, 84-89. Bern and Graham provide an excellent
introduction to the Shortest Network problem in this article. They include a
brief history of the developments in this area and discuss derivations of the
problem.
R.E. Sedgewick, Algorithms, second edition, Addison-Wesley, Reading, Mass.,
1988. Sedgewick manages the impressive feat of covering many classic
algorithms and problems without burying the material in academia. His text is
as readable as it is thorough. I highly recommend it to anyone serious about
software development.
G. Brassard and P. Bratley, Algorithmics: Theory & Practice, Prentice-Hall,
Englewood Cliffs, N.J., 1988. Brassard and Bratley take a much more
mathematically oriented perspective in their descriptions and analysis of
algorithms. With this perspective, algorithms and their analysis are pursued
in the depth and detail one usually finds in formal papers in the field of
computer science.

_GRAPH DECOMPOSITION_
by Edward Allburn



[LISTING ONE]

(*******************************************************************************
* GAD.Pas
* Program: GAD.Exe
* Author: Edward Allburn (September, 1990)
* Compiler: Turbo Pascal 5.0
*
* Description: This program demonstrates the Graph Array Decomposition
* algorithm described in the JAN '91 issue of "Dr. Dobb's Journal".
* It uses the algorithm to determine how many connected components
* a graph is comprised of. Both this program and the algorithm it
* demonstrates are released into the public domain.
*
* Usage: GAD NNN
* Where: NNN = Max vertex value of graph (ok if > than actual max val).
*
* IN Files: "GAD.In" - List of edges that make up the graph. Each vertex
* of an edge is a 4-byte DWORD. Thus, the total
* record length of each edge is 8 bytes.
* OUT Files: None.
*
* Abbreviations:
* "Garray" - Refers to the array used to hold the graph.
* "OAlist" - Refers to a single "Overlayed Adjacency List" in the Garray. Each
* OAlist corresponds to a single connected component of the graph.
*******************************************************************************)
Program Graph_Array_Decomposition_Demo;
uses Dos;

const
 cMaxVertex = 10000; (* Graph must have less than 10,001 vertices.*)
 cEmpty = cMaxVertex + 1; (* Use invalid vertex value as "empty" flag. *)

type
 tVertex = longint; (* Vertices are 4-byte values. *)
 tEdge = record (* An edge is comprised of 2 vertices. *)
 a,
 b :tVertex;
 end;

var
 in_file :file of tEdge;
 edge :tEdge;
 Garray :array[0..cMaxVertex] of tVertex;
 max_vertex,
 temp,
 a,b :tVertex;
 A_Seen,
 B_Seen,
 Same_Set :boolean;
 total,
 result, i :integer;


begin
 (* Print the title. *)
 writeln('GAD.Exe 1.0 Copyright (c) 1990 by Edward Allburn');
 writeln('------------------------------------------------');

 (* Get the max vertex value from the command line. *)
 val(paramstr(1), max_vertex, result);
 if (result <> 0) or (paramcount <> 1) then begin
 writeln(' This program demonstrates the Graph Array Decomposition ');
 writeln('algorithm described in the JAN ''91 issue of "Dr. Dobb''s Journal".');
 writeln('It uses the algorithm to determine how many connected components ');
 writeln('a graph is comprised of. Both this program and the algorithm it ');
 writeln('demonstrates are released into the public domain. ');
 writeln;
 writeln('Usage: GAD NNN ');
 writeln('Where: NNN = Max vertex value of graph (ok if > actual max val).');
 halt(255);
 end
 else if max_vertex > cMaxVertex then begin
 writeln('Max vertex value allowed is ', cMaxVertex);
 halt(255);
 end;

 (* Initialize array & open file. *)
 total := 0;
 for i:=0 to cMaxVertex do Garray[i] := cEmpty;
 assign(in_file, 'GAD.In');
 reset(in_file);

 (* Use Graph Array Decomposition to determine if the graph is connected. *)
 (* A while loop is used so that an empty input file is handled safely. *)
 while not eof(in_file) do begin
    (* Read next edge & determine if vertices have been seen before *)
    Read(in_file, edge);
    with edge do begin
       A_Seen := (Garray[a] <> cEmpty);
       B_Seen := (Garray[b] <> cEmpty);

       if NOT(A_Seen OR B_Seen) then begin {create a new set.}
          Garray[a] := b;
          Garray[b] := a;
          total := total + 1;
       end
       else if A_Seen AND (NOT B_Seen) then begin {append B to A's set.}
          Garray[b] := Garray[a];
          Garray[a] := b;
       end
       else if B_Seen AND (NOT A_Seen) then begin {append A to B's set.}
          Garray[a] := Garray[b];
          Garray[b] := a;
       end
       else begin
          {Determine if both vertices are already in the same set.}
          temp := Garray[a];
          while ((temp <> a) AND (temp <> b)) do
             temp := Garray[temp];
          Same_Set := (temp = b);

          if NOT Same_Set then begin
             (* Merge the two sets into a single set *)
             temp := Garray[a];
             Garray[a] := Garray[b];
             Garray[b] := temp;
             total := total - 1;
          end;
       end;
    end;
 end;
 close(in_file);

 writeln('Total connected components = ', total);
 if total = 1 then
 writeln('The graph is CONNECTED.')
 else
 writeln('The graph is NOT connected.');
end.
(**************************** End of GAD.Pas ********************************)







[LISTING TWO]


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; GAD.Asm
; Program: GAD.Exe
; Author: Edward Allburn (September, 1990)
; Compiler: Phar Lap 386ASM with Phar Lap 386LINK
; Build Cmds: 386asm GAD.Asm -FullWarn -80386P
; 386link GAD -FullWarn -80386 -maxdata 0
;
; Description: This program demonstrates the Graph Array Decomposition
; algorithm described in the JAN '91 issue of "Dr. Dobb's Journal".
; It uses the algorithm to determine how many connected components
; a graph is comprised of. Both this program and the algorithm it
; demonstrates are released into the public domain.
;
; Usage: GAD NNN
; Where: NNN = Max vertex value of graph (ok if > than actual max val).
;
; IN Files: "GAD.In" - List of edges that make up the graph. Each vertex
; of an edge is a 4-byte DWORD. Thus, the total
; record length of each edge is 8 bytes.
; OUT Files: None.
;
; Abbreviations:
; "Garray" - Refers to the array used to hold the graph.
; "OAlist" - Refers to a single "Overlayed Adjacency List" in the Garray. Each
; OAlist corresponds to a single connected component of the graph.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
ASSUME CS:_CodeSeg, DS:_DataSeg, ES:_DataSeg, SS:_StackSeg

_StackSeg SEGMENT PARA STACK USE32 'STACK'
 db 10240 dup (?) ; A 10K stack is more than enough.
_StackSeg ENDS



_DataSeg SEGMENT PARA PUBLIC USE32 'DATA'
 ; Global Constants
 cEmpty equ 0ffffffffh ; Signifies that a Garray element is empty.
 cEOF equ 1 ; Signifies that current rec is last in file.
 cFatalError equ 255 ; Return code of program when error occurred.
 cBell equ 7 ; Misc. string constants...
 cCR equ 0dh
 cLF equ 0ah
 cEndOfStr equ '$'

 ; Global Data
 maxVertexVal dd 0 ; Max vertex value of graph.
 inFile dw ? ; File handle of input file.
 buffSize dd ? ; File buffer size, in bytes.
 bytesInBuff dd ? ; Num bytes of input data in file buffer.
 buffEOF db cEOF - 1 ; Contains cEOF when End Of File reached.
_DataSeg ENDS




_CodeSeg SEGMENT PARA PUBLIC USE32 'CODE'
Main PROC NEAR
 call ProcCmdLine ; Get the max vertex value from the command line.
 call AllocMemory ; Allocate memory for the Garray and file buffer.
 call OpenInputFile
 call ReadGraph ; Read/Decompose the graph from the input file.
 call PrintResults ; Print the total # of connected components in graph.
Main ENDP



mPrn MACRO msg
; Outputs the "$" terminated string to std out.
; in: msg - Is a "$" terminated string.
; out: nothing.
; dstryd: ax, edx
 mov ah, 09h
 mov edx, offset &msg&
 int 21h
ENDM



mHalt MACRO msg, returnCode
; Displays the halt message to the user and halts the program.
; in: msg - Is a "$" terminated string containing the message.
; returnCode - Return code the program is halted with.
; out: nothing.
; dstryd: ax
 mPrn &msg& ; Print the halt message.
 mov ah, 4Ch ; Halt the program
 mov al, &returnCode& ; with the specified return code.
 int 21h
ENDM



ProcCmdLine PROC NEAR
; Processes the command line, displaying a help screen if the command line is
; not valid.
; in: nothing.
; out: maxVertexVal - Contains max vertex value of the graph in "GAD.In"
; dstryd: eax, ebx, ecx, edx
 jmp #lStart ; Jump past local data to the start of the procedure.
 titleMsg db cCR
 db'GAD.Exe 1.0 Copyright (C) 1990 by Edward Allburn '
 db cCR, cLF
 db'------------------------------------------------ '
 db cCR, cLF, cEndOfStr
 helpMsg db cCR
 db'Desc: This program demonstrates the Graph Array Decomposition '
 db cCR, cLF
 db' algorithm described in the JAN ''91 issue of "Dr. Dobb''s Journal".'
 db cCR, cLF
 db' It uses the algorithm to determine how many connected components '
 db cCR, cLF

 db' a graph is comprised of. Both this program and the algorithm it '
 db cCR, cLF
 db' demonstrates are released into the public domain. '
 db cCR, cLF
 db cCR, cLF
 db' Use: GAD NNN '
 db cCR, cLF
 db'Where: NNN = Max vertex value of graph (ok if > than actual max val). '
 db cCR, cLF
 db cCR, cLF
 db' IN: "GAD.In" - List of edges that make up the graph. Each '
 db cCR, cLF
 db' vertex of an edge is a 4-byte DWORD. Thus, the '
 db cCR, cLF
 db' total record length of each edge is 8 bytes. '
 db cCR, cLF
 db' OUT: Nothing. '
 db cCR, cLF, cEndOfStr
 cmdEnd dd ?

#lStart:
 ; Find the command line in the Disk Transfer Area (DTA).
 mPrn titleMsg ; Display the title.
 mov ah, 2Fh
 int 21h ; Get the current DTA.
 mov ecx, 0
 mov cl, es:[ebx] ; Determine how many chars were entered on command line.
 cmp cl, 2 ; Were less than 2 chars entered on command line?
 jl lShowHelp ; Yes, obviously wrong so show help screen.

 ; Verify that a single, unsigned value was entered on the command line.
 add ecx, ebx ; Determine the end of the command line.
 mov ds:[cmdEnd], ecx
 mov eax, 0
 inc ebx ; Skip 1st blank of the command line.
 mov ecx, 0
lNextDigit:
 ; Get the next char & verify that it is a valid digit ['0'..'9'].
 inc ebx ; Advance to the next char in the command line.
 mov cl, es:[ebx] ; Get the next char.
 sub cl, '0' ; Convert the char to a digit.
 cmp cl, 9 ; Is this a valid digit?
 ja lShowHelp ; No, show the help screen.

 ; Append the digit to the running total.
 mov edx, 10 ; Make room for new digit in 1's position,
 mul edx ; by multiplying the total by 10.
 add eax, ecx ; Append the digit to the total.
 cmp ebx, ds:[cmdEnd] ; Was this char the last one of the command line?
 jl lNextDigit ; No, process the next digit.

 ; Save the max vertex value of the graph & return.
 mov ds:[maxVertexVal], eax
 ret

lShowHelp:
 mHalt helpMsg, 0
ProcCmdLine ENDP




AllocMemory PROC NEAR
; Allocate & initialize memory for the Garray. Then allocate all of the
; remaining memory for use as a file buffer.
; in: maxVertexVal - Contains max vertex value of the graph.
; out: buffSize - Size of file buffer, in bytes.
; FS - Points to the file buffer.
; GS - Points to the Garray.
; dstryd: eax, ebx, ecx, edx, es, edi
 jmp #lStart ; Jump past local data to the start of the procedure.
 allocStartMsg db ' Allocating/Initializing memory...', cEndOfStr
 allocFinishMsg db ' Finished.', cCR, cLF, cEndOfStr
 allocFailMsg db ' FAILED! Not enough memory.', cBell, cCR, cLF, cEndOfStr

#lStart:
 ; Calculate the size of the array needed to hold the entire graph.
 ; Garray size in bytes = (maxVertexVal + 1) * 4
 mPrn allocStartMsg
 mov ebx, ds:[maxVertexVal]
 inc ebx ; +1 to allow for 0..maxVertexVal.
 inc ebx ; +1 again to allow for sentinel.
 shl ebx, 2 ; *4 each vertex needs 4 bytes of memory.

 ; Allocate the array.
 mov ah, 48h
 shr ebx, 12 ; Convert the array size into 4K (4096) pages.
 inc ebx ; Add 1 more page as a safety margin.
 int 21h
 jc lAllocFailed

 ; Save a pointer to the Garray & initialize it with "cEmpty".
 mov gs, ax ; Save a pointer to the Garray.
 mov ecx, ds:[maxVertexVal] ; Load the counter with # vertices to scan.
 inc ecx ; +1 to allow for 0..maxVertexVal.
 inc ecx ; +1 again to allow for sentinel.
 mov es, ax ; Set up ES & EDI to scan from
 mov edi, 0 ; the start of the Garray
 cld ; forward,
 mov eax, cEmpty ; filling it with "cEmpty".
 rep stosd ; Fill the Garray.

 ; Allocate all of the remaining memory for a file buffer.
 mov ah, 48h ; Determine how much memory is left.
 mov ebx, 0ffffffffh
 int 21h
 mov ah, 48h ; Claim all of it for the file buffer.
 int 21h
 jc lAllocFailed

 ; Save pointer to file buffer as well as determine/save its size (in bytes).
 mov fs, ax ; Save a pointer to the file buffer.
 shl ebx, 12 ; Convert the 4K pages to bytes.
 sub ebx, 8 ; Leave room for 1 extra record.
 mov ds:[buffSize], ebx ; Save for future reference.

 ; Announce that this routine has finished & return.
 mPrn allocFinishMsg
 ret


lAllocFailed: mHalt allocFailMsg, cFatalError
AllocMemory ENDP



OpenInputFile PROC NEAR
; Opens the input file & loads the first buffer's worth of input data.
; in: Nothing.
; out: inFile - Contains handle of opened input file.
; dstryd: ax, cx, edx
 jmp #lStart ; Jump past local data to the start of the procedure.
 cNormalFile equ 0
 cWriteOnly equ 1
 cReadWrite equ 2
 inFileName db 'GAD.In', 0
 openStartMsg db 'Loading first input file buffer...', cEndOfStr
 openFinishMsg db ' Finished.', cCR, cLF, cEndOfStr
 openFailMsg db 'Could not open input file "GAD.In".'
 db cBell, cCR, cLF, cEndOfStr

#lStart:
 ; Open the input file & save its handle.
 mPrn openStartMsg
 mov ah, 3dh
 mov al, cNormalFile
 mov edx, offset inFileName
 int 21h
 jc lOpenFailed
 mov ds:[inFile], ax ; Save the file handle

 ; Load the first block of input data into the file buffer & return.
 call BuffLoad
 mPrn openFinishMsg
 ret

lOpenFailed: mHalt openFailMsg, cFatalError
OpenInputFile ENDP



ReadGraph PROC NEAR
; Read the graph from the input file, decomposing it while keeping count of
; the total connected components along the way.
; NOTE: Vertex values greater than the "maxVertexVal" specified on the
; command line are NOT checked for.
; in: buffEOF - Does not equal cEOF.
; GS - Points to the "cEmpty"-initialized Garray.
; out: buffEOF - Equals cEOF.
; GS - Points to the Garray containing the decomposed graph.
; edi - Total number of connected components of the graph.
; dstryd: eax, ebx
 jmp #lStart ; Jump past local data & macros to the start of the procedure.
 readStartMsg db ' Reading/Decomposing graph...', cEndOfStr
 readFinishMsg db ' Finished.', cCR, cLF, cEndOfStr



 mReadEdge MACRO

 ; Reads the next edge (i.e., record) from the file buffer.
 ; in: bytesInBuff - Contains the number of bytes in the file buffer.
 ; esi - Buffer pointer to the next edge.
 ; FS - Points to the file buffer.
 ; out: buffEOF - Equals cEOF if edge being returned is last in file.
 ; esi - Buffer pointer to next edge.
 ; eax - A vertex value of edge.
 ; ebx - B vertex value of edge.
 ; dstryd: Nothing.
 ; Read the next edge, advancing the buffer pointer.
 mov eax, fs:[esi]
 mov ebx, fs:[esi + 4]
 add esi, 8

 ; If this edge is the last one in the buffer, load the next buffer.
 cmp esi, ds:[bytesInBuff] ; Last edge in file buffer?
 jb lReadEdgeEnd ; No, just return.
 call BuffLoad ; Yes, load the new buffer.
 cmp ds:[bytesInBuff], 0 ; Is the new buffer empty?
 jg lReadEdgeEnd ; No, just return.
 mov ds:[buffEOF], cEOF ; Yes, last edge in file, so set EOF flag.
 lReadEdgeEnd:
 ENDM



 mNeitherSeen MACRO
 ; Neither A nor B seen before. Create a new OAlist.
 ; in: GS - Points to the Garray.
 ; eax - A vertex (NOT seen before).
 ; ebx - B vertex (NOT seen before).
 ; edi - Total connected components of the graph thus far.
 ; out: GS - The Garray has been updated with the new OAlist.
 ; edi - One more connected component has been added to the total.
 ; dstryd: Nothing.
 mov gs:[eax], ebx ; Point A to B.
 mov gs:[ebx], eax ; Point B back to A.
 inc edi ; Increment total connected components.
 ENDM



 mOnlyAseen MACRO
 ; A seen before, so append B to A's OAlist by doing a standard linked-list
 ; insertion.
 ; in: GS - Points to the Garray.
 ; eax - A vertex (seen before).
 ; ebx - B vertex (NOT seen before).
 ; out: GS - The B vertex has been appended to the A vertex's OAlist.
 ; dstryd: ecx
 mov ecx, gs:[eax]
 mov gs:[ebx], ecx ; Point B to what A is currently pointing to.
 mov gs:[eax], ebx ; Point A to B.
 ENDM



 mOnlyBseen MACRO
 ; B seen before, so append A to B's OAlist by doing a standard linked-list

 ; insertion.
 ; in: GS - Points to the Garray.
 ; eax - A vertex (NOT seen before).
 ; ebx - B vertex (seen before).
 ; out: GS - The A vertex has been appended to the B vertex's OAlist.
 ; dstryd: ecx
 mov ecx, gs:[ebx]
 mov gs:[eax], ecx ; Point A to what B is currently pointing to.
 mov gs:[ebx], eax ; Point B to A.
 ENDM



 mBothSeen MACRO
 ; If A & B aren't already in the same OAlist, merge their OAlists.
 ; in: GS - Points to the Garray.
 ; eax - A vertex (seen before).
 ; ebx - B vertex (seen before).
 ; edi - Total connected components of the graph thus far.
 ; out: GS - The 2 OAlists have been merged into a single OAlist.
 ; edi - One connected component has been subtracted from the total.
 ; dstryd: ecx, edx
 ; Determine if vertex B is already in vertex A's OAlist.
 mov ecx, eax ; Save starting place in OAlist.
 lTestNextVertex:
 mov eax, gs:[eax] ; Get the next vertex in A's OAlist.
 cmp eax, ebx ; Is B already in A's OAlist?
 je lDoNothing ; Yes, so just return.
 cmp eax, ecx ; Have we come full circle thru A's OAlist?
 jne lTestNextVertex ; No, so see if the next vertex is B.

 ; Merge the 2 OAlists by swapping the pointers.
 mov ecx, gs:[eax]
 mov edx, gs:[ebx]
 mov gs:[eax], edx
 mov gs:[ebx], ecx
 dec edi ; Decrement total connected components.
 lDoNothing:
 ENDM



;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Start of procedure ReadGraph
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
#lStart:
 mPrn readStartMsg ; Announce that the routine has started.
 mov edi, 0 ; Initialize total connected components.

lReadNextEdge:
 mReadEdge ; Returns A in EAX and B in EBX.
 shl eax, 2 ; *4 to convert vertex values to their
 shl ebx, 2 ; actual byte offsets into the Garray.

 ; Determine if A and/or B have been seen before & call appropriate routine.
 cmp dword ptr gs:[eax], cEmpty ; Was A seen before?
 jne lBothPossible ; Yes, it's possible both have been seen.
 cmp dword ptr gs:[ebx], cEmpty ; No, but was B seen before?
 jne lOnlyB ; Yes, so only B was seen before.

 mNeitherSeen ; No, neither has been seen before.
 jmp lTestEOF
lOnlyB:
 mOnlyBseen ; Only B was seen before.
 jmp lTestEOF
lBothPossible:
 cmp dword ptr gs:[ebx], cEmpty ; Was B seen before?
 jne lBoth ; Yes, so both A & B were seen before.
 mOnlyAseen ; No, only A was seen before.
 jmp lTestEOF
lBoth:
 mBothSeen
lTestEOF:
 cmp ds:[buffEOF], cEOF ; Are we at EOF?
 jne lReadNextEdge ; No, so process the next edge.

 ; Announce that the routine has finished & return.
 mPrn readFinishMsg
 ret
ReadGraph ENDP



BuffLoad PROC NEAR
; Loads the next buffer's worth of data from disk into the file buffer.
; in: inFile - File handle of the input file.
; buffSize - Size of buffer, in bytes.
; FS - Points to the file buffer.
; esi - Buffer offset of next record location.
; eax - A vertex of last record in previous buffer.
; ebx - B vertex of last record in previous buffer.
; out: bytesInBuff - Contains the number of bytes actually in file buffer.
; esi - Points to the first rec in the file buffer.
; dstryd: Nothing.
 jmp #lStart ; Jump past local data to the start of the procedure.
 readFailMsg db 'Disk read error.', cBell, cCR, cLF, cEndOfStr

#lStart:
 ; Load the next buffer from disk.
 pushad
 mov ah, 3fh
 mov bx, ds:[inFile]
 mov ecx, ds:[buffSize]
 mov dx, fs
 push ds
 mov ds, dx
 mov edx, 0
 int 21h
 pop ds
 jc lReadError

 ; Restore the current record & return.
 mov ds:[bytesInBuff], eax ; Save number of bytes actually in buffer.
 popad ; Restore registers.
 mov esi, 0 ; Re-set the pointer to start of the buffer.
 ret

lReadError:
 mHalt readFailMsg, cFatalError

BuffLoad ENDP



PrintResults PROC NEAR
; Closes the input file & prints the total number of connected components.
; in: inFile - Contains handle of opened input file.
; edi - Total number of connected components of the graph.
; out: Nothing.
; dstryd: ax, bx
 jmp #lStart ; Jump past local data to the start of the procedure.
 totalMsg db cCR, cLF
 db 'Total connected components = ', cEndOfStr
 connectMsg db cCR, cLF
 db 'The graph is CONNECTED.', cCR, cLF, cEndOfStr
 notConnectMsg db cCR, cLF
 db 'The graph is NOT connected.', cCR, cLF, cEndOfStr



 mPrnEDI MACRO
 ; Prints the number in EDI.
 ; in: edi - Number to be printed.
 ; out: edi - Number to be printed.
 ; dstryd: eax, ebx, ecx, edx
 ; Push each digit onto the stack.
 mov eax, edi
 mov edx, 0
 mov ebx, 10
 mov ecx, 0
 lGetNextDigit:
 inc ecx
 div ebx ; Determine the next digit.
 push edx ; Save the digit.
 mov edx, 0
 cmp eax, 0 ; Was the last digit just processed?
 jg lGetNextDigit ; No, get the next one.

 ; Pop each digit off the stack & print the ASCII version of it.
 lPrnNextDigit:
 pop edx ; Get the next digit.
 add dl, '0' ; Convert the digit to ASCII.
 mov ah, 02h ; Print it.
 int 21h
 loop lPrnNextDigit ; If any digits left, print the next one.
 ENDM



;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Start of procedure PrintResults
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
#lStart:
 ; Close the input file.
 mov ah, 3eh
 mov bx, ds:[inFile]
 int 21h

 ; Print the results.

 mPrn totalMsg
 mPrnEDI
 cmp edi, 1
 jg lNotConnected
 mHalt connectMsg, 0
lNotConnected:
 mHalt notConnectMsg, 0
PrintResults ENDP

_CodeSeg ENDS
 END Main
;;;;;;;;;;;;;;;;;;;;;;;;;;;;; End of GAD.Asm ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;


[GRAFDUMP.PAS]

(*******************************************************************************
* GrafDump.Pas
* Program: GrafDump.Exe
* Author: Edward Allburn
* Compiler: Turbo Pascal 5.0
*
* Description: Displays list of edges in the file "GAD.In".
*
* Usage: GrafDump
* Where: Nothing.
*
* IN Files: "GAD.In" - List of edges that make up the graph. Each vertex
* of an edge is a 4-byte DWORD. Thus, the total
* record length of each edge is 8 bytes.
* OUT Files: None.
*
* Abbreviations:
* None.
*******************************************************************************)
Program GrafDump;
uses Dos;

type
 tVertex = longint; (* Vertices are 4-byte values. *)
 tEdge = record (* An edge is comprised of 2 vertices. *)
 a,
 b :tVertex;
 end;

var
 in_file :file of tEdge;
 edge :tEdge;


begin
 (* Print the title. *)
 writeln('GrafDump.Exe 1.0 Copyright (c) 1990 by Edward Allburn');
 writeln('-----------------------------------------------------');

 (* Print the list of edges contained in the file. *)
 assign(in_file, 'GAD.In');
 reset(in_file);
 while not eof(in_file) do (* Test first so an empty file is never read. *)
 begin
 read(in_file, edge);
 with edge do writeln(a, ',', b);
 end;
 close(in_file);
end.
(************************** End of GrafDump.Pas *****************************)


[GAD.DOC]

--------------------------------------------------------------------------------
- GAD.Doc
- Program: GAD.Exe
- Author: Edward Allburn (September, 1990)
-
- Description: This program demonstrates the Graph Array Decomposition
- algorithm described in the JAN '91 issue of "Dr. Dobb's Journal".
- It uses the algorithm to determine how many connected components
- a graph is comprised of. Both this program and the algorithm it
- demonstrates are released into the public domain.
-
- Usage: GAD NNN
- Where: NNN = Max vertex value of graph (ok if greater than actual max val).
-
- IN Files: "GAD.In" - List of edges that make up the graph. Each vertex
- of an edge is a 4-byte DWORD. Thus, the total
- record length of each edge is 8 bytes.
- OUT Files: None.
--------------------------------------------------------------------------------
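
For readers who want to see what "number of connected components" means in
practice, here is a minimal sketch in Python. It is NOT the Graph Array
Decomposition algorithm and is not taken from GAD.Exe; it swaps in the
standard union-find technique purely for illustration. The function name
count_components and the sample edges are hypothetical, and it assumes that
only vertices appearing in at least one edge are counted.

```python
def count_components(edges):
    """Count connected components among the vertices that appear in `edges`.

    Illustrative union-find sketch; not the GAD algorithm from the article.
    """
    parent = {}

    def find(v):
        # Register unseen vertices as their own root, then walk to the root.
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps trees shallow
            v = parent[v]
        return v

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    # Each remaining root represents one component.
    return sum(1 for v in list(parent) if find(v) == v)


# Hypothetical sample: vertices 1-2-3 form one component, 4-5 another.
sample_edges = [(1, 2), (2, 3), (4, 5)]
component_total = count_components(sample_edges)
```

A result of 1 corresponds to a connected graph; anything larger means the
graph is not connected.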

FILES IN "GAD.Zip":
------------------------
 GrafDump.Exe - Displays all of the edges in "GAD.In" to StdOut.
 GrafDump.Pas - Pascal source (Turbo Pascal 5.0) for GrafDump.Exe.
 GAD.Asm - Assembly source (Phar Lap 386ASM) for GAD.Exe.
 GAD.Doc - This file.
 GAD.Exe - Compiled from Turbo Pascal 5.0 version of program.
 GAD.In - Sample input file of graph shown in figure 7a of article.
 GAD.Pas - Pascal source (Turbo Pascal 5.0) that accompanied article.


USE OF "GAD.Exe":
-----------------
 The file "GAD.In" must exist in the current directory. The max vertex
value in this file must be specified on the command line. This value is used
to determine how large an array to allocate. No harm is done if a value
greater than the actual max vertex value is specified. For example, the sample
input file supplied is the graph shown in figure 7a of the article which has a
max vertex value of 8. To process this graph, the command line would read:
 "GAD 8"
A help screen is displayed if no parameters or invalid parameters are specified.

 The program has an overhead of about 100K of memory. Beyond that, the
memory requirements for this program depend on the max vertex value specified
on the command line. Because each vertex is 4 bytes in size, the amount of
memory required for the array is (max vertex value * 4) bytes. For example,
the total amount of memory required by the program for the previous example
is:
 100K + (8 * 4) = 100.032K
The total memory required by a graph 1 million vertices in size will be about
4.1 megabytes. After memory for the array has been allocated, all of the
remaining memory on the system is allocated for use as a file buffer.
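
The arithmetic above can be sketched as a small Python helper. This is only
an illustration: the names are hypothetical, the 100K overhead is the
approximate figure quoted above, and K is assumed to mean 1000 bytes, which
is what makes the 100.032K example work out.

```python
OVERHEAD_BYTES = 100_000   # "about 100K"; assumes K = 1000 bytes
BYTES_PER_VERTEX = 4       # each vertex is a 4-byte DWORD


def estimated_memory_bytes(max_vertex):
    """Approximate memory needed for a given max vertex value, in bytes."""
    return OVERHEAD_BYTES + max_vertex * BYTES_PER_VERTEX


small_graph = estimated_memory_bytes(8)          # the "GAD 8" example
large_graph = estimated_memory_bytes(1_000_000)  # a 1-million-vertex graph
```

The file buffer allocated afterwards is not included in this estimate.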


FILE RECORD STRUCTURE:
----------------------
 The input file "GAD.In" contains the list of edges that make up the
graph. Each vertex of an edge is an unsigned DWORD (four bytes). Thus, the
total record length of each edge is eight bytes. The following type
definitions were taken from the file "GAD.Pas" and illustrate the input
file layout:
 type
 tVertex = longint; (* Vertices are 4-byte values. *)
 tEdge = record (* An edge is comprised of 2 vertices. *)
 a,
 b :tVertex;
 end;
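
As a cross-check of the layout above, the following Python sketch decodes
such records from raw bytes. It is an illustration only: the function name
parse_edges is hypothetical, and it assumes little-endian byte order (as on
the PC) and unsigned values as described in the text, although the Pascal
longint type is signed.

```python
import struct

# Assumption: two little-endian unsigned 32-bit values per edge (8 bytes).
EDGE_RECORD = struct.Struct("<II")


def parse_edges(data):
    """Decode raw file contents into a list of (a, b) vertex pairs."""
    if len(data) % EDGE_RECORD.size != 0:
        raise ValueError("length is not a multiple of the 8-byte record size")
    return list(EDGE_RECORD.iter_unpack(data))


# Hypothetical sample data standing in for the contents of "GAD.In".
sample_bytes = struct.pack("<IIII", 1, 2, 2, 3)
sample_edges = parse_edges(sample_bytes)
```

To read a real file, pass the result of open("GAD.In", "rb").read() to
parse_edges.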


YOUR FEEDBACK:
--------------
 I am interested in hearing any suggestions or observations you have. You
can contact me via my CompuServe account 72617,2624 or by mailing me at:
 4830 S. Salida Ct.
 Aurora, CO 80015
-------------------------------- End of GAD.Doc ------------------------------




































January, 1991
WE, THE PEOPLE, IN THE INFORMATION AGE


Early times in Silicon Valley




Jim Warren


Jim was DDJ's founding editor, and is the futures columnist for MicroTimes. He
can be reached at 345 Swett Road, Woodside, CA 94062.


This rambling retrospective illustrates a pattern of creative connections, and
offers some cloudy insights into our future. It derives from a quarter-century
of often unplanned, risk-laden, immensely productive, often euphoric
techno-social collaborations.
Most of all, though, this is a story of modern explorers, cooperatively
building bridges into the electronic frontier -- "space" that all of us can
explore.
Dr. Dobb's Journal is a thread in this grand and chaotic web, as good a
starting point as any for this spastic short story with no beginning, no end,
and a great future....


Egalitarian Origins


Around the mid-'60s, an energetic character named Bob Albrecht learned about
computers, learned Basic, and enthusiastically began teaching kids how to
program. He didn't do it for money; he did it because it was worth doing and
exciting -- and he communicated that excitement to the kids. (He often said
he'd rather teach kids to use computers than teach computers to use kids,
slamming the rote, drill-oriented, computer-aided instruction of the era.)
In the late '60s -- with help from Portola Institute and some programmable
desktop calculators from Hewlett-Packard -- Bob opened what was perhaps the
first center in the world where people could walk in and use computers for
free, for fun, for personal projects, to learn. Appropriately, he called it
People's Computer Center.
PCC was in a storefront facing the rail commuters' parking lot in Menlo Park,
California. The nonprofit Portola Institute was around the corner, in a funky
second-floor walk-up. Well, it really wasn't much of an "Institute." It was
directed and primarily funded by Dick Raymond, and I always thought of it as,
mostly, a legal mechanism for channeling "straight" loot into "hip" causes.
Aside: Later, HP invented the first hand-held calculator, the HP-35. Bill
Hewlett's wife chose the colors for the keys. It cost $395 back when that was
real money. Before its introduction, HP's best marketing experts projected
that the hand-held calculator market would saturate at around 10,000 units.
Within a year, they were producing 10,000 per month.


Another "Hippie" Start-up


Portola provided space and encouragement for another egalitarian venture
founded by a couple who lived in a tiny house trailer, and had a teepee on a
creek-bank near Stanford -- yep, a real Indian-style teepee. They were Stewart
and Lois Brand; the venture was the Whole Earth Catalog & Truck Store. It used
another storefront near PCC and Portola, a block from the Midpeninsula Free
University's Free U Store (more about that, later).
This socio-political experiment focused on access to tools, alternative
economics, communes, and "back to the land" visions -- and it led to the Whole
Earth Catalog. It began as an exciting exploration of alternative lifestyles,
and had long-term success in ways few people expected. WEC has earned
millions, seeding still more innovation, including CoEvolution Quarterly and
Point Foundation (replacing the old Portola Institute). Among other things,
they created the Hackers Conference, an invitational workshop where many of
the best-known innovators and computer gurus in the industry share ideas,
problems, and solutions.


Father of Modern Interfaces


While in the midst of creating WEC&TS, Stewart took time, without pay, to help
with a computer project at Stanford Research Institute (now SRI International,
also in Menlo Park), which was directed by Doug Engelbart.
Years earlier, Doug had vaguely realized that our global society and its
decision processes were becoming too complex to manage by paper and by hand.
If citizens are to remain free and involved in decisions and if
decision-makers are to make informed decisions, he reasoned, tools are needed
to help utilize the growing masses of information on which to base sound
policies.
Pursuing this cloudy vision, Doug founded SRI's [Human] Augmentation Research
Center (ARC). Among other things, Doug's late-'60s research produced the
mouse, two-dimensional text editors (previous editors were line-oriented,
typewriter-based), and on-screen menus.
Recognizing that networking -- human or electronic -- is essential for
collaboration, Doug also founded NIC, the Network Information Center, which
was the repository of computer networking information for the early ARPANET,
predecessor of the Internet. In volunteering to run NIC, Doug's lab became the
second node on the ARPANET.


Corporate-style "Vision"


Preoccupied with "office automation," SRI corporate decision-makers killed
ARC, selling its carcass to Tymshare. It took about a decade for Doug's
innovations to become publicly available tools, first, and still
predominantly, on microcomputers.
Some of the ARC staff moved to Xerox's Palo Alto Research Center, X-PARC. In
those days, X-PARC offered instant tenure, no responsibilities, and major
computer resources. "You're bright; do something interesting."
There, Doug's former staff helped create the in-house, desk-sized Alto mini.
Stewart Brand called it a "personal computer" in a 1974 Rolling Stone article
(reportedly, the first use of the phrase). Many of the same folks later moved
to Apple and helped create the Macintosh, a decade later.
Doug is still exploring computer-aided cooperation, currently heading
Stanford's nonprofit Bootstrap Project. With seed-funding from such groups as
Apple, Sun, and Mitch Kapor's foundation, he's investigating the use of
computer collaboration for exploring computer collaboration. And only now,
almost three decades after Doug's first visions of computer-supported
cooperative work (CSCW), we are seeing the first bumpy inklings of viable
groupware tools.


"For a Good Time, Call..."


In the '80s, Stewart Brand's Point Foundation gave office space to the WELL
-- Whole Earth 'Lectronic Link. (Both are located in Sausalito, across the
Golden Gate from San Francisco.)

The WELL has under 5000 users, hardly the largest teleconferencing system in
the world. But its 100+ public conferences may be the most highly reputed in
computerdom, praised for their high quality of content, informal and
egalitarian candor, ready sharing of broad-ranging expertise, and stellar
collection of nationally-known participants from a great range of disciplines.


People's Computer Company


Around 1970, Albrecht's PCC developed a board of directors, had some
political/personality hassles (the bane of volunteer efforts), and probably
became somewhat boring for him. So Bob created People's Computer Company,
which -- despite its name -- was a quarterly tabloid newspaper with a wildly
eccentric format, pictures of dragons, many-font text in varied alignments,
readers' letters, virulent opinions, Basic programs, and fantasies of computer
futures. Its focus was computer games, Basic for fun, and computers for
people.
Adding confusion, Bob later created a nonprofit corporation, also named
People's Computer Company (also called PCC), to publish PCC newspaper, which
often reported on activities at PCC (People's Computer Center).


Altair Begat Tiny Basic


In January 1975, Bob read about the first computer kit, the Altair, from a
small electronics company named MITS in Albuquerque, New Mexico.
Being a Basic junkie, Albrecht wanted Basic on the Altair. But, the Altair
came with 256 bytes -- not kilobytes! -- of memory, and additional 256-byte
modules were expensive. Bob discussed it with Dennis Allison, a long-time
computer consultant and PCC gadfly.
Aside: Dennis had worked at SRI and had close association with Doug Engelbart
-- circles within circles within circles.
More asides: Ed Roberts owned MITS, sort of a poor-man's Heathkit. MITS had
introduced a hand-held calculator kit, just in time to have its market
demolished by a price-war between TI and Commodore. Ed decided to offer a
computer kit, named by his young daughter after a planet in a "Star Trek"
episode.
MITS' tech writer was Dave Bunnell, a liberal arts type who, when Ed first
proposed the Altair, knew nothing about computers and envisioned a room-sized
machine. In college, Dave had been an SDS (Students for a Democratic Society)
organizer. He went on to be founding publisher of PC Magazine, followed by PC
World, and often published strong editorials regarding socio-political issues
in technology.


Designs by Dennis


For the fun of it, Dennis Allison designed Tiny Basic, a stripped-down
language, coded in a simple pseudocode that was easily implemented on micros.
He omitted such things as floating-point arithmetic, transcendental functions,
and matrix operators; Bob's kids and games didn't need them.
Dennis and Bob published the design in PCC, in three parts, between March and
September 1975. They then invited readers to share implementations. On
December 12, 1975, Dick Whipple and John Arnold from Tyler, Texas, detailed
their version, which required under 3K of RAM. They forwarded a full octal
listing, shortly thereafter -- cooperative effort, freely shared.
Dennis and Bob gathered the design articles, feedback letters, and Whipple and
Arnold's Tiny Basic for a three-part document to be photocopied for anyone
wanting it, costing a buck a part.
They handed the pieces to part-time pasteup artist Rick Bakalinsky, asking him
to put it all together in a newsletter format. He asked what to title it. As
Dennis and Bob departed for "the boardroom" (a local pizza and booze parlor),
they told Rick to dream up a title.


Who Is Dr. Dobb?


Knowing nothing of computing or Basic, Rick wandered around PCC asking what it
was about. It's about "Tiny Basic." It's an exercise in programming --
"Calisthenics." What's tiny about it? Doesn't use many bytes -- "Orthodontia,"
avoiding "Overbyte." Thinking that Dennis's name was Don, Rick named it "Dr.
Dobb's," after "DOn" and "BoB."
More name notes: After WEC, Lois Brand, renamed Lois Britten, worked for PCC.
Rick Bakalinsky listed himself in the DDJ staff box as Rosehip Maloy, later
changing his name to Teal Lake. Not to be outdone, M&T abbreviated our haloed
rag's name to, simply, "Dr. Dobb's Journal."
Personally, I sorta liked Dr. Dobb's Journal of Computer Calisthenics &
Orthodontia, Running Light Without OverByte, burdensome though it was.
Stanford's Ed Feigenbaum once said it was the only computer publication his
wife would allow on their living room table.


Cooperative Anarchy


Dennis, Bob, Stewart, Lois, and over a thousand others were varyingly involved
in the Midpeninsula Free University, reportedly the largest such "alternative
university" in the nation. Membership was $10. Courses were entirely free;
instruction and labor, entirely voluntary. It held that "one is neither too
old to learn, nor too young to teach."
I was greatly interested in cooperative/alternative communities, and joined
the Free U's Intentional Communities seminar, larger than all the other
courses combined. The radical Maoist founders of the Free U objected to such
"irrelevant hippie courses" and, with classic radical stupidity, insisted they
be canceled -- and were voted out of leadership by the large "hippie" majority.
Because of previous experience producing a regional math teachers' newsletter
(pro bono), I was elected General Secretary of the Free U, responsible for
producing its Free You newsletter -- another volunteer venture.


Free U Prompted Macintosh Design


While I was General Secretary, Larry Tesler was MFU treasurer. He often helped
produce the newsletter and the quarterly course catalog, a major chore,
tediously typed on an infuriating proportional-spacing IBM Executive Selectric
typewriter. Larry later worked at X-PARC, then moved to Apple, as part of the
group that created the Mac.
He later told me that our frustrating MFU publishing efforts directly led him
to explore computerized typesetting and page layout, which led him to want
variable fonts, all of which caused him to successfully insist that the
Macintosh have a bit-mapped monitor and printer -- when all other popular
computers of the early '80s had only character-oriented ASCII displays.
True story. A volunteer, alternative-lifestyle project directly prompted a
major design innovation that now permeates microcomputing.


More Computer Connections


Back to Dennis and Bob: We first met around Free U circles. At the time, I was
programming minicomputers (assembler-level, 4K memory, real-time applications)
at Stanford Medical Center, working for a fellow Free U activist who wanted to
make sure I could afford to continue volunteer MFU work.
Later, Dennis taught while I was a grad student, both in EE/CS at Stanford and
in UCSF Med Center's Medical Information Science program. Additionally, Dennis
chaired the local Association for Computing Machinery, SIGMICRO and SIGPLAN
chapters, and I often followed him as their next chair -- also editing the
regional ACM/SIG newsletter. As usual, our work was as unpaid volunteers.
When PCC announced the Tiny Basic three-part quickie, response was immediate.
People wanted it to continue, complaining that the few micro magazines of the
time were preoccupied with hardware and didn't cover significant software.
Dennis didn't have the time to edit such a rag. Bob knew little more than
Basic. Their tiny bank account couldn't compete with what competent computer
pros were paid. When they wanted a technically competent editor who would work
for peanuts....



Low Pay, Great Value


About a week before Christmas 1975, my Stanford doctoral adviser informed me
he was cutting off my support as of January 1. Said he, I did good work, but
couldn't write well enough to produce a dissertation. (He had also missed
Stanford's tenure hurdle and was leaving for industry around June.)
I began reestablishing consulting contacts (then paying about $30,000-$40,000
per year), talked to Dennis, and he invited me to edit DDJ. The choice was
easy -- $40K/year consulting or $350/month editing a nonexistent software rag.
Of course, I took the editorship -- more fun, more interesting topic, greater
potential impact.
One thing had to change, however. Being a snooty computer pro, I wasn't about
to besmirch my reputation by editing a Basic mag. Thus, with the first issue
of the ongoing magazine, I titled it Dr. Dobb's Journal of Computer
Calisthenics. The editorial style was informal and candid, and vigorously
promoted sharing and consumer advocacy (the latter being easy; we had no
advertisers to offend, and no money to attract libel lawyers). We were the
only rag willing to devote 10 to 30 pages for hardcore code-listings.


More Later, Maybe


Much more could be said -- about cooperative connections to the Homebrew
Computer Club's immensely valuable biweekly sharing orgies; the largest public
microcomputer conventions in the world; InfoWorld; the PBS "Computer
Chronicles," and much more -- ventures that often made energetic volunteers
into millionaires, as delightful but unplanned, unintended, incidental
byproducts. And, that simply provides loot to pursue further socio-technical
innovation.


What Are the Points?


The "expected" often doesn't occur.
The "unexpected" becomes almost predictable -- commonly having much greater
value, by any measure, than whatever was originally intended.
We are more productive when we freely share and cooperate than when we
covetously clutch at each incremental innovation -- so much more productive,
that each individual, and our nation, ends up "getting" more than if we don't
share.
Energetic, unpaid, or low-paid "volunteer" effort often has amazing "personal
payoffs" while accomplishing needed and laudable improvements.


What's Next?


The areas for most valuable innovation and impact involve collecting,
distributing, and processing massive quantities of information for "the
masses." All of these areas have major, unending opportunities for individuals
and small groups; none are limited to mammoth corporations:
Low-cost tools are needed to process and utilize ill-structured, heterogeneous
text and graphic information.
Low-cost, high-capacity wired and wireless mass data distribution, e-mail,
teleconferencing, BBS, and networking has great room for innovation. (Data
broadcasting has unlimited opportunities and very low entry costs.)
Providing useful information in machine-readable form has opportunities
without limit -- from the mundane to the magnificent: (food) weekly grocery
prices, dietary data; (shelter) rental housing, real estate data; (education)
reference materials, school and college details; (money) economic data of all
kinds; (legal) statutory and case law. And, community events, regional data,
environmental information, political activity, local, regional, national, and
international news, archival data, and time-sensitive information, without
limit.
People's computers, accessing significant information about the People's
world, assure a Free People. Our electronic "intellectual assistants" can
provide "power to the people" -- not from the end of a gun, but, rather, by
allowing citizens the practical opportunity to make knowledgeable, reasoned
decisions about their person, family, community, state, nation, and world.
Big, conservative corporations won't do it; risk-taking, dedicated, energetic
individuals will do it -- together. Let us continue.




























January, 1991
FIRE IN THE VALLEY REVISITED


Where there's smoke, there's bound to be fire




Michael Swaine


Michael is DDJ's editor-at-large and coauthor of Fire in the Valley, a
landmark history of the personal computer industry, published by Osborne
McGraw-Hill. He can be reached at 501 Galveston Drive, Redwood City, CA 94063.


History is counted among the humanities, rather than, say, among the sciences.
I never thought much about this categorization until I tried to write some
history. Paul Freiberger and I were editors at InfoWorld in 1981 when our
boss, Thom Hogan, suggested that we interview the heroes of the personal
computer revolution while all the frantic events were fresh in their memories,
and write a book that would convey the facts and the feelings of the
revolution.
By the time we had finished Fire in the Valley I knew which of the meanings of
the word "human" motivated the classification of history among the humanities.
The humanities are the human studies, and history is a human study in the
sense of "to err is human." We did a lot of choosing among alternative versions
of events in researching Fire in the Valley, often deciding what must have
been true on the basis of plausibility. The judgment of plausibility is, of
course, highly subjective.
We were not alone in making such subjective choices. We wrote a history of a
revolution, but history itself appears to be of many minds about the nature,
duration, and very existence of that revolution or, as it may be, those
revolutions. I have since read history books and articles that contend that
the revolution hasn't happened yet, that it was over in 1974, that there
hasn't been any revolution, and that there have been several. It appears that
we can take our choice of histories. I choose to say that there was a
revolution, and that it's over, and that we won.
Now that the recent glorious war of liberation is over, it is time to
celebrate our heroes, decorate the battle sites, and begin the job of
civilizing the wilderness and creating the new society for which we fought.


Revolutionary Decade: 1974 - 1983


Just as the history of the American revolution does not begin with the Magna
Carta, I won't hark back to Larry Tesler and Jeff Rulifson's readings in
semiotics at Xerox PARC, or Doug Engelbart's blueprints for user-friendly
design at SRI, or that bizarre community of neologistic technophiles at MIT.
One event preceding the main event must be mentioned, though: The 1974
self-publication by Ted Nelson of Computer Lib/Dream Machines. Computer Lib
was the Common Sense of the revolution, and Nelson its Tom Paine. It's a book
still worth dipping into, which is, in fact, the only way in which it is
possible to read the book.
But the revolution actually broke out over Christmas break, 1974-75, with the
publication of an article on a kit computer from a firm in Albuquerque.
Specifically, it was the hobbyist phase of the personal computer revolution
that began when that January 1975 issue of Popular Electronics hit the stands,
with its cover story on the MITS Altair computer. There were other fronts on
which the revolution could have broken out: Any electronic hobbyists who could
lay hands on a microprocessor could attempt to build a computer, and some did.
Stephen Wozniak, whose first attempt in 1971 went up in smoke, was only the
best known of them, but in the Southwest, the Northwest, in New Jersey and New
York and Massachusetts and California, hobbyists were creating the personal
computer. Don Lancaster demonstrated how to build your own terminal in 1973,
and Radio Electronics published a piece on the Mark 8, your personal
minicomputer, in 1974.
But even though the Altair featured on the Popular Electronics cover was one
of many, and was in fact only a mock-up, it was the shot heard 'round the
hobbyist community, because it promised a computer on your desktop for under
$500. The average hobbyist couldn't buy the parts for that price. The Altair
started the revolution.
In 1975, computer clubs sprang up around the country; groups in southern
California, Silicon Valley, and New Jersey would be especially influential,
but the Boston Computer Society, founded two years later by Jonathan
Rotenberg, would prove more enduring. Dozens of small companies sprang up
overnight, growing to hundreds within two years. Computer retailing began in
1975 at Dick Heiser's Computer Store in Los Angeles, and by 1977 the first
computer store franchise chain had been launched: It was called Computerland.
Computer magazines evolved out of user group newsletters, with Wayne Green's
Byte rapidly achieving preeminence among the hundreds that existed at their
peak. Personal computer shows became popular. Software companies were
launched, starting with Microsoft, which wrote the Basic for the Altair, and
Digital Research, which created the CP/M operating system. Languages and
operating systems were the first kinds of software needed for the machines,
and Dr. Dobb's Journal was launched in 1976 to put a Basic interpreter in the
hands of hobbyists.
The hobbyist phase in its purest form came to an end in 1977, the year Apple
opened offices in Cupertino and put an assembled and slick-looking Apple II on
the market. This was also the year that the big electronics companies
(Tandy/Radio Shack and Commodore) announced their first PCs, and the year MITS
effectively went under.
By 1978 it was possible, with a lot of imagination, to consider using a
personal computer for business purposes. Disk drives were replacing cassette
recorders for storage. There were word processors and databases and accounting
programs. By the end of 1979, there was a very good word processor called
WordStar, and an innovative electronic spreadsheet program called VisiCalc.
You could choose among IMSAI and Cromemco and Processor Technology machines,
or bypass the hobbyist standard S-100 bus and buy an Apple II, Radio Shack
Model I, Commodore PET, Exidy Sorcerer, or a computer from video game
manufacturer Atari. By 1980, Radio Shack, Texas Instruments, and Sinclair had
all introduced very low-cost computers and made it reasonable for non-fanatics
to satisfy their curiosity about personal computers. Most buyers were
hobbyists and most software fell into the category of utilities or games
through 1978 and 1979, but most of the companies understood that the personal
computer would be an extremely important business tool, and were fighting to
establish a serious reputation and a position in the market before some
mainframe computer company got into the game and changed the rules.
Apple had emerged as the leading personal computer company by 1980. The
machine was well designed, the company was skillfully managed, and Apple took
upon itself the task of marketing the concept of a personal computer to a
skeptical general public. It didn't hurt that VisiCalc, the first program that
could sell a computer, ran only on an Apple II. Apple management was acutely
aware of the need to get big enough to survive when the big company stepped
in, and it was becoming clear that the biggest company would soon take the
plunge.
Microsoft began developing an operating system for IBM in 1980.
Hewlett-Packard had entered the fray in 1980, Xerox came in 1981, and DEC
announced a line of personal computers in 1982. But it was the IBM PC,
announced in mid-1981, that launched the next phase of the personal computer
revolution.
From 1981 to 1984, most personal computer companies were concerned with IBM PC
compatibility. Some companies didn't get it right, and failed. Some, such as
TI, decided they could do without the competition of IBM, and got out of
personal computers. But many hardware companies and many more software
companies prospered in the new air of legitimacy that IBM brought to the
market. Lotus, Ashton-Tate, and Microsoft became big companies. It was the Big
Blue phase, but not because IBM dominated the market in sales. IBM simply
defined what a PC was during this period.
That's why it is so significant that the IBM PC was an open machine. IBM could
have released a proprietary machine at a premium price, differentiated it
clearly from the computers that had sprung up out of the hobbyist community,
and sold lots of computers into its loyal business user base. Instead, it
produced a machine with an open architecture running a third-party operating
system and third-party application software, all derived from that hobbyist
community. It permitted competitors to develop IBM-compatible machines to the
point where the clone makers had a greater collective market share than IBM.
This was not what anyone had expected from IBM, and some credit may be due to
Bill Gates for nudging IBM in the direction of an open architecture. But it
was IBM that decided to talk to Microsoft in 1980.
Meanwhile, Apple was presenting itself as the guardian of the "computer power
to the people" spirit of the hobbyist phase while building a new machine that
would feature an architecture so closed that it was even hard to find a tool
with which to open the case. The irony was not lost on the savvy personal
computer community.


After the Fire


That's how things stood at the outset of 1984. That January, it looked as
though the big news of the year in the computer industry was going to be a
commercial: The legendary Ridley Scott Super Bowl ad for the Macintosh. The Mac
itself was big news, and the Canon laser engine was an important technical
advance that year. A few other things happened, too.
MS-DOS 2.0 was released and Bill Gates made the cover of Time. AT&T introduced
its first personal computer. The press was chanting shakeout, and Business
Week crowned IBM the winner and sole survivor with 26 percent market share.
Jack Tramiel resigned as president of Commodore, then bought Atari, vowing to
turn this company from a democracy into a dictatorship. In 1984, the line
workers in Silicon Valley computer companies made from $3.50 to $11.50 an
hour, as compared with the $1.20 hourly average earned by Taiwanese workers.
The companies behind VisiCalc, Software Arts and VisiCorp, got into a legal
wrangle that had aspects of a mutual suicide pact.
This was the year when all the windowing systems for PCs were vying for
attention. IBM introduced TopView, which briefly looked like a competitor to
Microsoft Windows, QuarterDeck Desq, and a rumored DRI product. Jim Fawcette,
editorial director of InfoWorld at the time, characterized TopView as a
program for expert users of novice programs. TopView was pretty bad, but it
was from IBM, and a lot of IBM watchers thought that it must be part of an IBM
plan to proprietarize the IBM operating system. DRI introduced GEM, which was
praised in the press for bringing a Mac-like interface to the PC. The irony of
that praise would soon be apparent.
Americans polled by the Louis Harris organization in 1984 said that they
viewed computers as capable of making life easier and better, but they also
feared that computers might take their jobs and undermine their privacy. At
the same time, some users of on-line conferencing systems claimed that their
privacy was being violated when magazines reprinted their electronic comments.
The Harris poll supported the idea that a schism of computer haves and
have-nots was growing. The Supreme Court upheld the right to record video
programs on our Betamaxes, as an InfoWorld story put it. The decision held up
even though "Betamax" has the ring of "Tucker" today.
Spurred on by the movie War Games and the arrest of the 414s in 1983, the FBI
put into place a three-week training program for agents: They went in ignorant
and came out fully qualified to investigate computer crimes. That year the FBI
raided the homes of four Huntsville, Alabama teenagers suspected of cracking
NASA's Space Physics Analysis Network, which had little in the way of security
and no classified or sensitive information.
It wasn't the FBI, but a Los Angeles detective and two Pacific Telephone
security officers who ripped out Tom Tcimpidis's BBS hardware. Someone had
posted a credit-card code and two Sprint access numbers on Tcimpidis's
bulletin board, and when the Pac Tel police arrived at the door, they made
Tcimpidis the computer community cause celebre of the year. The judge in the
case said that he understood the issues involved, having seen War Games. Pac
Tel eventually dropped the case, but made it clear that it might prosecute
other bulletin board operators.
By the fall, the House of Representatives was examining a computer crime bill
that would make unauthorized access to a computer system a felony if it
resulted in a $5000 loss to a victim or an equal gain to the cracker.
Nibble magazine either clarified or clouded the waters when it won an
injunction against a program-typing service that wasn't doing enough typing.
Nibble was publishing code in its pages for its readers to key in and use. The
typing service, Amtype, ostensibly charged $10 for typing the code for people,
sending them a disk with the code on it. In fact, they were typing the code
once and running off copies. Had they actually typed the code each time,
Nibble argued, that would have been legal, but what they were doing was a
violation of copyright.
Massachusetts Congressman Edward Markey launched an online discussion of
nuclear weapons. Many candidates were using personal computers in their
campaigns, a few even going online to get input from computer-using
constituents.
1985 was a bloody year. Software Arts died of self-inflicted wounds, Lotus
Development Corp. acquiring its product line, founders, and reputation for
having invented the electronic spreadsheet. DRI, facing hard times, cut back
on staff. Counting on GEM to pull it out of its slump, it lost some momentum
when Apple forced it to redesign the product to look less Mac-like. Control of
Apple Computer was wrested from Steve Jobs, who left, selling off stock, and
taking key people with him to start a new company. Most industry leaders
viewed this as a positive step for Apple, although Philippe Kahn said the only
people left are the bean counters.
The U.S. Patent and Trademark Office, which formerly had a policy of not
granting patents for software, began taking applications for software patents
seriously, as directed by the Supreme Court in decisions earlier in the
decade. Several applications were under scrutiny, and Number 4,555,775 was
granted to AT&T for a hardware/software display system it called windows.
AT&T's windows appeared to be a fairly specific invention, not related to
windowing systems.
DDJ did some beard-pulling in 1985, finding weaknesses in Turbo Pascal,
showing how to add both memory and a hard disk to the closed Macintosh, and
publishing Richard Stallman's GNU Manifesto.
And so it went. In 1986 Intel laid off 700 employees, IBM unveiled its
RISC-based RT PC, and credit card-sized computers came on the scene. By 1986,
over 90 percent of the school systems in America had (some) personal
computers. A study showed that personal computer use resulted in an 80 percent
reduction in television viewing. DDJ went online in 1986.
1987 saw the Computer Security Act passed, mandating that government agencies
protect the privacy and security of sensitive data. IBM lost five percent
market share, about the same number of unit sales Compaq picked up. In 1987,
language and operating system implementations were getting more sophisticated.
Basic showed some new versions, C some standards efforts, and optimizing
compilers and object-oriented programming were getting attention, as was
Microsoft's operating system for the future, OS/2.


The Current Phase



In 1988 Apple announced a strategic alliance with DEC and sued Microsoft and
Hewlett-Packard over look-and-feel. Microsoft and HP countersued; OS/2 1.1 met
its ship date; Steve Jobs's company, NeXT, showed off its machine.
IBM was getting pressure from customers to license its Micro Channel
architecture. Clone makers, though, criticized MCA, professed little interest
in developing MCA machines, and, led by Compaq, formed a Gang of Nine to
develop an alternative in the form of an extended AT architecture called EISA.
IBM released two new computers labelled PS/2s but based on the existing AT bus
architecture, while several Gang of Nine members admitted that they were
hedging their bets and still working on MCA machines. Most observers felt that
there was only room for one standard and that the battle would be won and lost
on the 386 battleground.
Paul Heckel was awarded a patent for ZoomRacks, his card-in-rack hypertext
system, while Xerox got a patent on the icons in its Viewpoint interface, the
first patent granted on user interface elements.
The Open Software Foundation was formed to develop a non-AT&T standard for
Unix. AT&T and Sun were at work on Open Look, a graphical front end for Unix;
Sun licensed Viewpoint icons for Open Look; and IBM began talking with NeXT
about developing a competing graphical front end for IBM's Unix, AIX.
In 1989 QuarterDeck won a patent for the multitasking windowing system in
Desqview. The first EISA machines arrived, and were high-end machines that
didn't exactly throw down a gauntlet to IBM.
In 1990 Windows 3.0 was delivered, pa rumpa pum pum. HP's New Wave desktop
environment skipped version 2 and went straight to the magic number 3.0,
adding agents (task-spanning macros). Borland announced that the next versions
of Turbo Pascal and Turbo C would have object-oriented features, which would
result in a huge increase in the number of programmers exposed to OOP.
A number of notebook computers with hand-printing recognition and stylus input
were announced or hinted at. Xerox came out with a combination plain-paper
fax, copier, printer, and scanner. Apple demonstrated that it wasn't kidding
about getting competitive in its pricing by introducing new low-priced
computers and lowering prices on some existing machines. PC manufacturers
continued to keep the EISA, MCA, and old AT bus architectures alive, while the
media conglomerate named MCA told IBM to stop using that acronym, and IBM
agreed.
Lotus and Novell talked merger, but changed their minds. Raima entered into
joint ventures with several Soviet software companies. Security of sensitive
data in the computers of many government agencies was not in compliance with
the Computer Security Act of 1987 after three years. IBM picked up market
share and spun off its Information Products division (typewriters, printers,
keyboards), leading to speculation that other divisions would follow.
And in 1990 Mitch Kapor issued his Software Design Manifesto, his first step
in a plan to create an environment in which good software design pays off
commercially. He also founded, with Steve Wozniak and John Perry Barlow, the
Electronic Frontier Foundation to educate the public about social issues of
the information age.
If the revolution is over and we won, then what Mitch Kapor is up to now is
more important than any work underway in any research and development lab in
the world. The conscious design of software for use by people on the one hand,
and the development of a consensus on values for electronic communication on
the other, will define the kind of world we will inhabit in the future.
As software designers, you have the greatest opportunity and the greatest
responsibility for the shaping of that world, because you are the architects
of the future.


January, 1991
THE CHANGING LANDSCAPE OF SOFTWARE DEVELOPMENT


DDJ's editors reflect on the future of computer programming




Ray Valdes, Michael Floyd, and Jonathan Erickson




Fanning the Flames


Ray Valdes
Ever since I moved to California I've had to accustom myself to sudden changes
in the landscape. Here lies, after all, the land of earthquakes and massive
forest fires -- natural catastrophes that punctuate what are otherwise
glacially slow natural processes. These sometimes tragic catastrophes are not
without benefit. Earthquakes make new mountains, forest fires can renew the
wilderness.
The computer industry is not without its share of catastrophic
discontinuities, those events that clear out the old growth and make space for
vibrant new weeds and saplings. It's been 15 years since the "fire in the
valley" cleared a space in the old growth of the mainframe and minicomputer
industries and made room for such weeds as Apple Computer, Adobe, Autodesk,
and Atari (and these are merely those new companies whose names start with the
letter "A").
Looking toward the next few years of this industry, certain general
predictions are easy to make: more MIPS, more memory, more mass storage, more
multimedia, and so on. The precise details are harder to foresee, but (in some
sense) who really cares? Except for those people directly connected, it makes
little difference to desktop PC users that Northgate, Everex, and Dell are
major clone-makers, instead of Cromemco, NorthStar, or Osborne.
And if you stand back far enough, at some locus, the hot points of distinction
between Windows 3.0, Presentation Manager, and OSF/Motif fade into a generic
gray GUI image. We may as well be using VisiOn. Likewise from a distance, the
noisy war between OS/2 and Unix subsides to a steady background noise, like
the sound of waves on a distant beach. It may as well be Mach, or DOS 7.0.
The point I'm trying to make is that, as far as desktop PCs are concerned,
technology is making unexciting steady progress down a wide evolutionary road
that will not have sudden turns and unexpected detours. The path of this
mainstream highway is predictable, and its general contour is constrained by
the underlying technology and guided by the needs of the market. The exact
details are left to the vagaries of historical accident, such as the
particular lawyer Gary Kildall had on hand when IBM came calling about an
operating system for its PC -- which resulted in the fact that most of our
desktop machines say MS-DOS rather than DR-DOS when we boot them up.
In five years, all our desktop machines will have an operating system that
multitasks preemptively, exploits 32-bit addressing, and has lightweight
threads, virtual memory, and support for networked interoperability. And, from
a technologist's point of view, it doesn't much matter what its brand name
will be. As with most mature industries, there will be a few major brands --
ABC and NBC, Time and Newsweek, GM and Ford, Republicans and Democrats, OS/2
and Unix -- to give consumers the illusion that they have a choice.
Likewise, users of desktop computers will be served by a direct manipulation
interface that has overlapping windows, graphical icons, multibit pixel
displays, aural feedback, and cluttered dialog boxes dressed up in a
pseudo-3-D look-and-feel. Will users really care if the name on the
shrink-wrap box says Presentation Manager, Wheaties, or Cheerios?
In short, the average desktop machine of 1995 will look a lot like Steve
Jobs's NeXT machine, and then some: twin RISC/DSP computing engines,
heavyweight pixels, a multimaster data bus, a modern networked operating
system, and a post-modern user interface. No one can say for sure, but I doubt
the majority of such machines will have the NeXT brand name on them.
What's more interesting to speculate about are the sudden catastrophic radical
discontinuities -- also known as revolutions -- that are as unpredictable as
they are inevitable. These new fires in the valley will effect the demise of
stagnant, hollowed-out giants like Ashton-Tate and Lotus and enable the growth
of new corporate forms heretofore unseen. These radical discontinuities will
have a tragic aspect, in that numbers of workers will find themselves looking
for new jobs, much like the laid-off employees of Wang who went knocking on
the doors of Lotus in the early 1980s. The purpose of this prediction is not
to pass judgment or place a stamp of approval on these events, as much as it
is to foresee them so that we can be better prepared.
Looking back, the two major revolutions in the computer industry were each the
result of years of steady, evolutionary growth, punctuated by an abrupt
jump to a hardware platform based on a fundamentally smaller level of user
scale. DEC's minicomputer was the first machine affordable by the small
engineering or research group, and marked the first time scientists could work
interactively in the same room with their machines. This created a whole
industry based on this new platform, which displaced the mainstream mainframe
industry (to some extent) and then continued to evolve alongside it.
Likewise, the PC revolution gave us the first machines that we could place on
our desktops, or put in the back seat of the old Chevy and take to the
Computer Faire to exchange small-scale, garage-grown technology. The nascent
PC industry destroyed the manufacturers of dedicated WP machines and displaced
(to a certain degree) both mini and mainframe systems, giving us now three
strains of mainstream computer technology marching alongside each other.
So when and where will the fourth strain arrive? And when it comes what will
it look like?
Predicting the next revolution is a little like predicting the next
earthquake, a somewhat dubious endeavor. Nevertheless, certain aspects are
inevitable. Like the two previous revolutions, it will involve a hardware
platform on a fundamentally smaller level of user scale. Like those
revolutions, it will also involve a convergence of enabling software
technologies (new operating systems, new tools, new languages, new application
methodologies) fulfilling the previously unmet requirements of new groups of
users.
You may ask: What about laptop and notebook computers; do they constitute a
revolution? No, they are just old wine in smaller bottles. The DEC LSI-11/03
was almost the same size as an Altair or IBM PC, yet in all other respects it
belonged to the same strain as the room-size 11/70. Likewise, Compaq's new
notebook machine is of the same family tree as its floor-standing SystemPro.
The platform for the next revolution may be the same size as today's notebook
computer, but it will be in most other ways a new and different species. It
will be notebook size or smaller (that is, armtop or palmtop). It will be
controlled by a direct manipulation interface. And it will be what I call
"analog accessible." Analog accessible is a fancy term for a closer way of
being user-friendly.
DEC's minicomputers were the first machines that the average person could
stand beside, type in a request, and get an interactive response. (Prior to
this, of course, you had to submit decks to the card reader and wait overnight
for a response.) The PC, with its standard memory-mapped display, vastly
increased the bandwidth of digital output to the user. But the method of input
remained the same: ASCII characters typed at a keyboard.
The new breed of machines will allow for input that more closely resembles the
analog world in which we live. At a minimum, they will replace the
digital keyboard with a stylus or pen. This pen will enable more direct
manipulation of objects on the display, and eliminate the dichotomy between
mouse-on-desk vs. object-on-screen. No cursor will be needed, because
what-you-see-is-where-you-are. Merely place the stylus on the desired object
and it will respond.
This is nice, but it's not really "analog." What is analog are other methods
of input to the machine, namely, handwriting and voice. Transforming
continuous pen strokes and analog speech into digital data that can be
processed by the machine is a very difficult task. It is likely that early
machines will have limited success in handling these new input modes. In fact,
we can see these limitations in predecessor machines that are already on the
market, like the Sony PalmTop or the one by Grid. But over time, steady
progress will result in qualitative change. Remember that the first CP/M
machines used standard dumb terminals instead of higher bandwidth interfaces.
Who will produce these new machines? All the usual suspects: IBM, Apple,
Compaq, Sony, Toshiba, and so on. Plus a host of smaller companies, whose
names have now started appearing in the press: Go Corporation, Active Book
Company, Scribe, Momenta, Data Entry, Touchstone, and CIC. Some of these
smaller companies have already come and gone, like Linus Technology, which
went out of business earlier this year. After a forest fire, not all of the
initial weeds find a secure home on the burnt-out soil.
If all the existing players are working on this new generation of machines,
won't the new machines be just another milestone along the mainstream road
traveled by the major players? No, because the new platform implies radical
discontinuities in the multiple areas of software technology, hardware
products, and the user population. These kinds of abrupt changes are hard for
an established, large company to handle. A year before the Apple I, IBM
introduced a desktop personal computer called the 5100. It took several more
years before IBM realized a more radical approach was needed.
Direct manipulation interfaces will require operating environments that are
thoroughly object oriented, as opposed to yet another layer added on DOS. This
will work against the skills of the major players and provide a blank slate
for application developers.
Applications will be addressed to an entirely different set of users. The
first mainframes were for the rocket scientists of the 1940s and 1950s. The
minicomputer met the needs of Joe Engineer, while today's PCs are being used
by Josephine Engineer, Accountant, and Office Worker. In addition to Joe and
Josephine, users of the new machines will have names like Yamashita and
Gonzalez. That is to say, because of the increased globalization of the
economy and the diminished role of the U.S., it is likely that some of the
major players will be based outside the U.S. And even inside the U.S., there
will be a whole new population of users in industries previously untouched by
desktop technology: truck drivers, service workers, auto mechanics, field
salespeople. This will imply a radical shift in the established channels
(Businessland, Computerland, mail order), not to mention entirely new crops of
application authors.
The design of applications will have to change -- moving from a focus on the
keyboard/mouse to what some now call "pen-centric" design. For example, in
today's desktop-oriented graphics programs, to draw a circle you have to first
choose the circle tool from the palette window, then move over to the document
window, and finally click-and-drag with the mouse. In a pen-centric
application, you merely use the stylus to draw a circle (or a box or a line or
some text) on the document and the system responds accordingly. It is an
interesting exercise to rethink some of our favorite applications in light of
this new user interface paradigm.
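As one small piece of such a rethinking, here is a rough sketch in Python of how a pen-centric program might decide that a freehand stroke was meant to be a circle. It is an illustration under stated assumptions, not a description of any shipping product: the function name classify_stroke, the tolerance value, and the closure test are all invented for this example, and a real recognizer would need to cope with noise, ovals, and partial strokes.

```python
import math

def classify_stroke(points, tolerance=0.15):
    """Guess whether one stylus stroke is a circle.

    points    -- list of (x, y) samples captured while the pen was down
    tolerance -- allowed relative deviation from the mean radius
                 (an arbitrary, illustrative threshold)
    Returns ("circle", (cx, cy), radius) or ("unknown", None, None).
    """
    if len(points) < 8:                      # too few samples to judge
        return ("unknown", None, None)

    # The centroid of the samples stands in for the circle's center.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)

    # The mean distance from the centroid stands in for the radius.
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    radius = sum(dists) / len(dists)
    if radius == 0:
        return ("unknown", None, None)

    # How far does the worst sample stray from that radius?
    spread = max(abs(d - radius) for d in dists) / radius

    # The stroke should also end near where it began.
    (x0, y0), (x1, y1) = points[0], points[-1]
    closed = math.hypot(x1 - x0, y1 - y0) <= 0.5 * radius

    if closed and spread <= tolerance:
        return ("circle", (cx, cy), radius)
    return ("unknown", None, None)
```

A drawing program built this way could replace the raw stroke with a clean circle at the returned center and radius, with no trip to a tool palette.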
If forced to predict the ABC, CBS, and NBC of this nascent industry, I would
say that there will likely be one or two U.S.-based companies, one from Japan,
and perhaps another based in Europe. Vendors of tools might also be
internationally distributed. Only the application vendors will be locally
based. But with entire industries to automate -- from real estate sales to
trucking to restaurants -- these won't be small potatoes. From this
description, one may wonder if there is any place at all for the shoestring
garage start-up.
The answer is that there will be many opportunities for small
technology-intensive operations, if they form alliances with larger companies,
in the areas of both manufacturing and applications. If IBM is to repeat its
success with the PC, it will likely do so by licensing technology from a
smaller vendor. Small enterprises like Go Corporation and Metaphor Computer
have announced agreements with IBM. It's likely there will be others before
this revolution plays itself out. This is not quite like the days of the
Homebrew Computer Club, but it's as close as one can get in the fin-de-siecle.
And where will the next revolution lead us? Eventually to a place like where
we now stand: A mature, stagnant mainstream, ready to be overturned by a new
radical discontinuity. That subsequent discontinuity will involve a shift
toward virtual reality interfaces (what-you-sense-is-what-you-get) and
biocomputing technologies, but that's a subject for another time.


The Evolution of Component-Based Programming


Michael Floyd
When recently asked if I thought object-oriented programming was just a
passing fad, my response was a resounding "no!" Object-oriented programming is
an evolutionary step in software engineering and, as such, the object-oriented
approach is perhaps a key link connecting the preceding paradigms with those
yet to come. Consider that as programming languages have evolved from assembly
to modern languages such as C and Pascal, so has the notion of modularity.
Modularity favors a "divide and conquer" approach that helps the programmer
manage complexity by grouping a process or set of actions, usually into a
subroutine or separate, relocatable module.
Out of this comes the idea of building reusable software components. With the
help of abstraction, software elements within a program or project can be
combined to create new elements. Object-oriented programming refines the
software component concept by combining process with data. In fact,
encapsulating data with the processes that act on it completes the software
component idea, and the benefits of object-oriented programming (reusability
and extensibility) are really benefits of component-based programming.
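To make the component idea concrete, here is a minimal sketch in Python (chosen for brevity; the same shape could be written in an object-oriented C or Pascal dialect). The names Stack, CountingStack, and drain are hypothetical, made up for this illustration. The sketch shows data encapsulated with the operations that act on it, an extension that leaves the original untouched, and client code that is reused unchanged.

```python
class Stack:
    """A small software component: data plus the operations that act on it."""

    def __init__(self):
        self._items = []              # the data, hidden behind the interface

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def __len__(self):
        return len(self._items)


class CountingStack(Stack):
    """Extensibility: adds behavior without modifying Stack's source."""

    def __init__(self):
        super().__init__()
        self.pushes = 0

    def push(self, item):
        self.pushes += 1              # new behavior
        super().push(item)            # inherited behavior, reused as-is


def drain(stack):
    """Reusability: works with any component that offers pop() and len()."""
    out = []
    while len(stack):
        out.append(stack.pop())
    return out
```

Because drain depends only on the interface, it serves Stack, CountingStack, and any later variant equally well.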
If you accept for the moment that object-oriented programming is more than
just hype, the next question to consider is "where do we go from here?"
The next step in the evolutionary process may well be something called
megaprogramming, a concept that views programming in terms of designing and
composing software components, but on the grandest of scales. The term itself
was introduced by Barry Boehm and William Scherlis at the June 1990 DARPA
Workshop. Megaprogramming uses software components to manage the life cycle of
systems, and promises to provide huge increases in programmer productivity.
In megaprogramming, megamodules take the notion of an object as an
encapsulation of data and actions (in the form of functions and procedures) a
step further. Megamodules encapsulate at a higher level the behavior,
knowledge, and know-how within a community of software components. According
to Peter Wegner of Brown University, "Megamodules are like nation-states. They
have their own languages, traditions, cultures, and nationalistic
loyalties."{1}
Megaprograms, then, are the programs that manage megamodules and model the
interaction between systems. Imagine, for the moment, megamodules that
simulate the interaction between organisms and the human immune system, or a
megaprogram that models the world economy, with each country's micro economy
representing a separate megamodule.
Megaprogramming, sometimes referred to as "programming in the large," involves
managing programs of extreme size. As a consequence, development teams will
also grow. Another factor that may be less apparent, however, is that the life
of a system will necessarily be extended. Such extended-life systems must be
easily extendable to accommodate change over longer periods of time, and
issues such as data persistence must be considered. Therefore, a key concern
of megaprogramming is managing the life cycle of megasystems.
So, what will megalanguages look like? In all likelihood, megamodules will
support multiple paradigms enabling today's object-oriented (and procedural)
languages to play key roles. In addition to addressing the issues of life
cycle management, however, a megalanguage must support the interconnection of,
and a common interface to, these large modules. Additionally, megalanguages
will have to handle pragmatic problems such as those associated with
concurrency, provide support for interrupts and exception handling, and deal
with real-time systems.
If you're skeptical, note that according to the OOPS Messenger, the
president's science adviser has proposed a $2 billion, five-year plan that
includes megaprogramming as a primary goal.{2}
Of course, object-oriented programming presents its own challenges that must
be resolved before we move to the next generation of programming. And, what if
you're not sold on this object-oriented hype? The biggest stumbling block I
see for objects is typified in the tired, but true, saying: "Garbage in,
garbage out."
The problem is that objects place more weight on design than previous
approaches did. Unfortunately, few have formal training in design, because our
education stresses engineering. Hence, much of the design work occurs during
the implementation phase. But, object-oriented software design goes beyond the
process of organizing hierarchies, classes, and objects. Consequently, many
programmers are finding that, although they have a working program, they must
redesign to truly gain the benefits of extensibility and reusability.
In some sense, design is the simulation or modeling of a problem. And the
success of a given design depends largely on how well the solution fits the
problem, especially as the problem changes. To complicate matters, subtle
aspects of the problem may not be apparent during the design phase, so the
design must be as flexible and extensible as the system it models.
Hopefully, the coming years will teach us how and, perhaps more importantly,
when to use our new-found wisdom. Certainly, object orientation is a missing
piece, but it does not represent the entire puzzle, and you should keep in
mind that we have yet to find the silver bullet.
1. P. Wegner, "Object-Oriented Megaprogramming." ACMemberNet, October 1990.
2. P. Wegner, "Concepts and Paradigms of Object-Oriented Programming." OOPS
Messenger, ACM Press, 1990.


Baby Don't You Drive My Car


Jonathan Erickson
Like it or not, the technologies that make up the fragile infrastructure of
technological progress are barreling headlong into roadblocks that are legal,
not technical, in nature. As a consequence of this rush, the spirit of
innovation that's fueled software development since it began -- and at the
breakneck speed we've come to expect -- may run out of gas, if it doesn't
first come to a crashing halt. In any event, future programming efforts may be
very different from today, as programmers discover they need to be clever
paralegals first, and competent coders second.
Software patents and copyrights are at the heart of this legal labyrinth.
Putting aside ethical questions surrounding software patents, a number of
day-to-day, legal-related programming issues remain. To my mind, the most
confounding problem is simply knowing whether or not the algorithm you're
using has been patented. Of course, you'd expect the U.S. Patent Office to be
the place to go to find answers to questions like this; at least that's what I
thought. The answer I received, however, was that there is no simple way to
find out. You can't say "give me a list of all registered software patents so
that I can avoid using them," because no such list or database currently
exists.
The way the Patent Office works is that all patents are assigned to a primary
category (software, by the way, is "broadly" assigned to category #364) which
is made up of classes and subclasses. The patent is then cross-referenced to
one or more subsidiary categories, again with individual classes and
subclasses. Many software patents, it turns out, are buried as subclasses
within a subsidiary category in a patent for some kind of hardware invention.
The Patent Office isn't trying to keep trade secrets, well, secret; it does
publish a list of patents after they've been granted. This list provides you
with the first step for challenging a patent -- if you know it exists. You
simply request a reexamination and provide prior art or other relevant
information that the patent examiner might have missed. This is what
competitors of patentees often do. The Patent Office doesn't publish a list of
"applied for" patents; you have to wait until the "granted" list is made
public.
(In defense of the Patent Office, examiners are overworked and there is a
shortage of them. It takes an average of 18 months for a patent to
be approved; for new areas like biotechnology, it can take up to four years.)
This takes us back to my original question: If you're a programmer
implementing a familiar algorithm to draw a circle, for example, how do you
find out if that technique has been patented? The answer is straightforward:
There is no way. Because of this informational maze, your most common recourse
will be, in all likelihood, to forge ahead and wait (but not hope) for
someone's attorney to call. Not the safest tack, but the most expedient. In
fact, this may be what you're doing right now -- you just don't know it.
(Maybe what we need is a "patent checker," somewhat like a spell checker, that
works like this: As you begin compiling your source code, the checker looks
for algorithms that, according to its database, match patented algorithms.
When it hits one, the system pops into the debugger with the cursor on the
patented technique and a message flashes the assigned patent number.
Naturally, adding hypertext lets you click on the patent number to find out
who owns the patent and other details. You could tie the checker into your
bank account and automatically cut a check to cover the license fee. Or you
might want to add a "patent thesaurus" for a selection of safe workarounds,
user definable, of course. But I'm getting carried away with entrepreneurial
inclinations....)
Copyrights raise equally confusing questions and I'm willing to bet that over
the next decade, the big questions in this arena will involve the concept of
public domain and whether or not it exists anymore.
Perhaps it doesn't. As the copyright act is written, any time you take pen to
paper (or, in this day and age, fingertips to keyboards), you own the
copyright to what you've created. To formally protect that material, you
must register it with the copyright office, thereby enabling you to claim
damages and recover legal fees if someone infringes on your copyright. But
what if you've created something (like source code) and want to release it to
the public domain for the betterment of your fellow citizens (programmers)?
Sorry, there's no government form that lets you do this. The best you can do
is choose not to enforce the copyright.
For the sake of argument, assume you've openly (that is, with the author's
knowledge) used "public domain" code, but in software that hasn't been
commercially successful; further assume that the original author didn't mind
your using his code. The American dream being what it is, one of your programs
-- one that incorporates this public domain code -- becomes wildly successful,
making you both rich and famous. Wonderful, you say, until you get a letter
from the copyright holder (or, more likely, his lawyer). Surprise, surprise --
he's decided to enforce his copyright after all, license fee attached. Is this
the kind of public domain you want to trust?
Here's another copyright issue that's also up in the air. When you register
for a software copyright, do you protect the object code or the source code
(or both)? Most developers "publish" and distribute object or binary code
versions of the source code; the source itself is kept secret. The question
then is, does the copyright law in effect "decompile" the source from the
object code? Maybe so, maybe no. Pick a card, take a chance.
I've only scratched the scruffy surface of the legal questions that are
beginning to bedevil software developers (And computer users, for that matter.
Try this one on for size: Who has the right to read that electronic mail you
send and receive over the company LAN or over an online service? Just you? Can
your boss or the owner of the company sneak a peek at your e-mail? This
question is being answered in a couple of courtrooms right now and, to my
mind, should be relatively easy, at least compared to the issue of software
patents.)
My one hope is that the legal quagmires we're starting to encounter are
potholes in the road, not chasms, and that we'll pass over them carefully, if
not quickly. Unfortunately, it will probably take the next decade to sort out
the answers. In the meantime, I'll wager that either some large patent-holding
corporation will take a lone programmer or small development house to court,
or a lone patent-holding programmer will sue a large software company. (Well
actually, both types of cases have occurred, but with out-of-court
settlements, not clear-cut decisions and answers.) I hate to say it, but court
challenges may be the only way we'll get an answer. In any event, we'll all
pay a price that I hope isn't too great as we travel this road, which I pray
isn't too perilous. And I further hope we'll all be aboard for the ride for as
long as it lasts, no matter where it takes us.




































January, 1991
EXAMINING THE HAMILTON C SHELL


Unix power for OS/2




Scott Richman


Scott is an independent software consultant specializing in systems and
applications programming under VMS, Unix, DOS, and the Macintosh. He can be
reached at R.R. 3, Box 3471, Susquehanna, PA 18847.


Starting OS/2 for the first time was, for me, like unlocking a Ferrari,
sitting behind its wheel and finding a Yugo's dash. What a disappointment.
Sure, the engine and suspension were first rate, but the controls were
minimal, the clutch was stiff, and the pedals were nonresponsive! OS/2 comes
with great stuff, but CMD.EXE, the default command-line processor, is poor
compared to the powerful operating system beneath. CMD.EXE appears to be a
port of the MS-DOS COMMAND.COM and lacks the major features of a serious front
end.
Fortunately, there's a tool that fills this gap. The Hamilton C Shell is a
collection of programs that takes advantage of OS/2 features to create a
faster, more powerful environment for serious OS/2 users. The Hamilton C Shell
efficiently uses OS/2 to implement a superset of the C shell environment used
in the Berkeley flavor of Unix. The Shell supports a powerful script language
borrowing C's constructs.


C Shell for OS/2


The Hamilton C Shell is not a quick port of a Unix shell from another system.
The Shell was created from scratch, implemented with modern compiler
technology, and designed to fully take advantage of the powerful OS/2
architecture, including HPFS (high-performance file system), long filenames,
and threads.
Additionally, the Shell supports large command lines and pipes (up to 64K) and
includes faster and more powerful utilities than those supplied with OS/2.
This is more than Unix -- this is a powerful requirement for development under
OS/2. The ability to execute C shells simultaneously in different Presentation
Manager (PM) text windows converts your PC into a flexible workstation.
The Hamilton C Shell comes with many programs and shell scripts. To install
the Shell, you simply copy the files to their new home, and modify your
CONFIG.SYS. The Shell program, CSH.EXE, can be executed in a text window of
the PM or as a non-PM character-mode application.


Scripts


Scripts allow you to program the many commands and features with full support
for complex logic, looping, nested control statements, and symbols. Scripts
are composed of C Shell commands, OS/2 programs, and comments prefixed by the
pound character (#).
This combination can produce potent applications. Scripts can be composed and
tested interactively at the command level or typed into files and run later.
The Shell assumes that files with extensions of .CSH are C Shell script files.
Scripts can read user input and can be called recursively.
For example, Listing One presents CTL_T.CSH, a script to send a Ctrl-T to
COM1: every 400 seconds. It's useful when logged onto a busy terminal server
that impatiently bumps you off when there's no activity. Invoking this script,
using ctl_t &, will execute it in the background and the server will be kept
busy.


Supporting Procedures


Script programmers can create C Shell procedures, which are more like
functions: They accept a parameter list and return a value. These procedures
are compiled into C Shell memory and are then executed as new shell commands.
Procedures can greatly extend the power and flexibility of your environment.
As an example, consider ZCW.CSH (Listing Two), which is used to build a C++ PM
program. ZCW.CSH defines a procedure that receives a filename as its
parameter. The script calls the procedure at the end: The Shell reads the file
once, compiles the procedure and executes the compiled code from that point
on. In other words, the zcw procedure is now treated like another C Shell
command.
Listing Three shows the global edit procedure ged, which can be used to
globally edit several files. For instance, you can edit all .H files and
change your last name from "Lovejoy" to "Stern," using the command ged
s/Lovejoy/Stern/ *.h. As with zcw, the Shell reads and compiles the procedure
and executes ged as it would any other C Shell command.


Variables


Users can create local, environmental, and C shell global variables. These
symbols can contain any text representing pathnames, strings, numbers, and so
on, which can be referred to by the other Shell components. Long pathnames,
for instance, could be stored in variables and used in a command line to refer
to the target location. To define a variable, use the set command (set a =
"this is a"). To have the Shell calculate an expression, use @ instead of set.
Additionally, variables can be arrays with full support for C-style
subscripting of the elements. The Shell makes it easy to access the words
which make up a variable. The Shell supplies many internal variable functions
to test and manipulate the text within a symbol. The printf function, for
example, is used to format variables. There are also provisions to scan
strings for substrings, concatenate variables, and return string lengths.
The Shell is also flexible in treating symbols as numbers and will allow
complicated arithmetic calculations. The Shell handles integer and
floating-point arithmetic and supports C-like calculations, evaluations and
expressions, including switch and case. Variables can be tested for patterns
using the Unix pattern-matching expressions.


Taking Command


The Shell has full command history. It remembers previous command lines, which
can be recalled through many different methods. Besides using the up and down
arrow keys to recall past lines, you can recall a previous command line (or
specific parts of it) by command sequence number, or you can recall the last
command which contained a specific string. Groups of command lines can be
saved into a text file and later read back into another session. The saved
command lines can be edited by your favorite text editor and then submitted to
the Shell as a script. The Berkeley history mechanism supplies many nifty ways
to access parts of previous command lines. When a command line contains !$,
the Shell inserts the last word (argument) of the previous command line:
Repeated sequences of commands to the same file (such as edit, compile, link
and print) are executed faster and with fewer typos because the argument is
never retyped. Some of the other history-recall commands are shown in Table 1.
Table 1: History recall commands

 Command   Description
 -------------------------------------------------------
 !^        Inserts the first argument (or word) of the
           last line.

 !*        Inserts all the arguments of the last line.

 !!        Inserts the previous line.
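
The recall forms in Table 1, together with !$, amount to word substitution against the previous command line. The following is a minimal Python sketch of that idea, not the Shell's implementation: the function name expand_history is hypothetical, words are split on whitespace only, and quoting and numbered events are ignored.

```python
def expand_history(line, previous):
    """Expand a few C shell style history references in `line`.

    `previous` is the text of the previous command line. Words are split
    on whitespace only (an assumption made to keep the sketch short).
    """
    words = previous.split()
    args = words[1:]  # everything after the command name
    replacements = {
        "!!": previous,                     # the whole previous line
        "!*": " ".join(args),               # all the arguments
        "!^": args[0] if args else "",      # the first argument
        "!$": words[-1] if words else "",   # the last word
    }
    for token, text in replacements.items():
        line = line.replace(token, text)
    return line


if __name__ == "__main__":
    # Edit a file, then compile it without retyping its name.
    print(expand_history("cl !$", "edit main.c"))
```

The point of the sketch is the one the article makes: the argument is typed once and reused by every later command in the edit, compile, link, and print cycle.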

The Shell lets you define aliases, which allow you to abbreviate or rename any
command. Complicated command lines are much easier to work with when they are
defined by an alias. Once an alias is defined, it can be used as another
command.
Because the C Shell furnishes many ways to group commands together on the same
command line, the Enter key has much more power than under conventional PC
systems. Command lines ending with an ampersand (&) will be executed in the
background. The PS command will show the currently active processes and
threads created by the Shell and their command lines, while the Kill command
can terminate any job shown by PS, making it easy to manage a multithreaded
system. On my wish list of future enhancements, however, is a feature that
will display and manipulate the priority of a thread.


File and Command Accessibility


The Shell controls command-name parsing through efficient hashing techniques
and sophisticated OS/2 features. Filenames are expanded within the command
line with greater speed and flexibility than under OS/2. For example, when you
press the Alt/Ctrl key combination, the Shell will complete a partially typed
file or command name in the current command line. These features save much
time and ensure more accuracy by reducing unnecessary typing.
C Shell supports full Unix filename wildcarding, to provide a very flexible
means of describing groups of files. Subdirectories can also be wildcarded.
The asterisk (*) and question mark (?) can represent any character except the
colon (:) and backslash(\). However, the period between the filename and its
extension is no longer sacred. A wildcard expression of *.[ch] will translate
into all files with either .C or .H extensions. Square brackets declare a list
of characters which can match one character. If the first character within the
square bracket list is the escape character (^), the list will define all
characters that will not match. These character lists may include ranges of
characters: [A-Z][0-9] will match any two-character sequence made up of one
alphabetic character followed by one digit.
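
To see how such character lists behave, here is a hedged Python sketch built on the standard fnmatch module. The helper shell_match is a hypothetical name. It rewrites the [^...] negation described above into Python's [!...] spelling, and it folds case on the assumption that OS/2 filenames are not case sensitive. Unlike the Shell, Python's * will also match a colon or backslash, so this only approximates the rules in the text.

```python
from fnmatch import fnmatchcase


def shell_match(name, pattern):
    """Match `name` against a wildcard pattern that uses [^...] negation."""
    # Python's fnmatch spells a negated character list as [!...].
    pattern = pattern.replace("[^", "[!")
    # Assumption: fold case, since OS/2 filenames are not case sensitive.
    return fnmatchcase(name.upper(), pattern.upper())


if __name__ == "__main__":
    for candidate in ("main.c", "main.h", "main.cpp", "main.o"):
        print(candidate, shell_match(candidate, "*.[ch]"))
```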
The Shell also has built-in file tests to determine file type. The commands
shown in Example 1, for instance, will print the attributes of the file whose
name is stored in the variable a. Also, the Shell can be directed to parse
full filenames into their component parts, and programming is not needed to
edit the extension out of a filename. For example, if we set the variable a to
"dir1\dir2\file.ext", the Shell will interpret the filename according to the
list shown in Table 2.
Example 1: Commands to print the attributes of a specified file

 if ( -d $a) echo $a is a directory
 if ( -H $a) echo $a is hidden
 if ( -R $a) echo $a is ReadOnly
 if ( -S $a) echo $a is SystemFile
 if ( -e $a) echo $a exists
 if ( -x $a) echo $a is executable
 if ( -z $a) echo $a zero length

Table 2: Parsing filenames into their component parts

 Expression        Description           C Shell result
 -----------------------------------------------------------------
 $a:h (head)       Directory             \dir1\dir2

 $a:r (root)       Path w/o .ext         \dir1\dir2\file

 $a:t (tail)       File name             file.ext

 $a:e (ext.)       Extension w/o "."     ext

 $a:f (fullpath)   Expanded file name    d:\top\dir1\dir2\file.ext
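
For readers who want to experiment with the same decomposition outside the Shell, the following Python sketch reproduces the five parts with the standard ntpath module, which applies backslash path rules on any platform. The function name path_parts and the current directory d:\top are assumptions for illustration. Python's rules need not agree with the Shell's in every detail; with a relative path, for instance, the head comes back without a leading backslash.

```python
import ntpath  # backslash-style path rules, usable on any platform


def path_parts(path, cwd):
    """Split `path` into parts named after the :h :r :t :e :f modifiers."""
    head, tail = ntpath.split(path)
    root, ext = ntpath.splitext(path)
    return {
        "h": head,               # directory portion
        "r": root,               # path without the extension
        "t": tail,               # file name
        "e": ext.lstrip("."),    # extension without the period
        "f": ntpath.normpath(ntpath.join(cwd, path)),  # expanded name
    }


if __name__ == "__main__":
    # d:\top is a hypothetical current directory.
    for key, value in path_parts(r"dir1\dir2\file.ext", r"d:\top").items():
        print(key, value)
```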

The Shell features a directory-stack mechanism made up of the commands
pushd, popd, and rotd. pushd is the CD command with memory. It will remember
the current directory (by placing it on the directory stack) and then change
to a new directory. popd will return to the directory at the top of the stack,
and rotd will rotate the order of the directories saved. Jumping around from
directory to directory is a snap, especially when you use wildcards to declare
the directory to push.
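
The bookkeeping behind these three commands can be pictured with a short Python sketch. It is a model of the idea only: the class name DirStack is hypothetical, no real directory is changed, and the exact behavior of rotd is an assumption (here it rotates the saved entries by one position).

```python
from collections import deque


class DirStack:
    """A toy model of a pushd/popd/rotd directory stack."""

    def __init__(self, start):
        self.cwd = start        # the "current directory"
        self.saved = deque()    # remembered directories, top of stack first

    def pushd(self, new_dir):
        # Remember where we are, then move to the new directory.
        self.saved.appendleft(self.cwd)
        self.cwd = new_dir

    def popd(self):
        # Return to the directory on top of the stack.
        self.cwd = self.saved.popleft()

    def rotd(self):
        # Assumption: rotate the saved directories by one position.
        self.saved.rotate(-1)


if __name__ == "__main__":
    stack = DirStack(r"c:\src")
    stack.pushd(r"c:\include")
    stack.pushd(r"c:\lib")
    stack.popd()
    print(stack.cwd, list(stack.saved))
```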


Redirection


The Shell supports full I/O redirection of any of its components and allows
you to build new commands from the output of other commands on the same
command line. As an example, to browse the unknown .C or .H files that contain
the string VIO, invoke the command: more `grep -l VIO *.[hc]`
The command line within the backquotes is executed first, and its output is
then inserted into its place. So, more's arguments are the output of grep.
The command line in Example 2, which finds all duplicate filenames on the
current disk, demonstrates how powerful a simple shell command can be. Example
2 starts by creating a list of all the full pathnames of every file using the
-r (recursive) option of ls. The :gt means globally trim each pathname down to
just the tail (no drive:\dir\\\). The foreach loop writes each name out to the
pipe, one per line. All lines are sorted alphabetically and the uniq -d
command outputs just the duplicates. Within moments, the current drive is
scanned for all files with the same name.
Example 2: Command to find all duplicate file names on the current disk

 foreach i (`ls -r \`:gt) echo $i; end | sort | uniq -d
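
For comparison, the same search can be sketched in Python. This is an illustration of the technique (reduce every pathname to its tail, then report the tails seen more than once), not a translation of the Shell's syntax. The function name duplicate_names is hypothetical, and folding names to lowercase assumes a file system that is not case sensitive.

```python
import os
from collections import Counter


def duplicate_names(top):
    """Return the sorted file names that occur more than once under `top`."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(top):
        # Only the tail of each pathname matters, so the directory is ignored.
        # Assumption: names are compared without regard to case.
        counts.update(name.lower() for name in filenames)
    return sorted(name for name, seen in counts.items() if seen > 1)


if __name__ == "__main__":
    for name in duplicate_names(os.curdir):
        print(name)
```

The Counter plays the part of sort and uniq -d together: it groups identical names and keeps only those with a count above one.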




Supplied Utilities


The Hamilton C Shell product comes chock-full of many wonderful utility
programs. All utilities have the same homogeneous feel, a quality lacking in
other software packages. Also, all Hamilton supplied programs will display
help when invoked with the -h switch. Because there are so many utilities
included in the package, I selected my ten favorites and described them in
Table 3. Table 4 lists other utilities found in the package.
Table 3: My favorite C Shell utilities

 Utility          Description
 ------------------------------------------------------------------------
 Cut              Outputs specific parts of each line of its input, and
                  allows you to specify the character positions and/or
                  the field numbers to include.

 Diff             Compares files or directories, and can be instructed
                  to ignore case and spaces. Diff can recursively
                  compare the contents of two directories. You can
                  also define the minimum match length to insist on.

 Strings          Searches binary files and displays the ASCII strings
                  found within them. Strings is quite handy for finding
                  the strings embedded within a program or database.

 xd               Dumps the contents of its input to stdout. This
                  wonderful dump utility can display its input by bytes,
                  words, long words, or floating-point values. xd is
                  fluent in decimal, hex, oct, and even other
                  user-supplied radixes. xd can be told the offsets at
                  which to begin (and end) its dump.

 More             Flexible full-screen file browser. More will scroll up
                  and down, and search for text and line numbers. It can
                  also format lines with octal and hex values. C
                  programmers will appreciate the feature of displaying
                  the \n\r escape sequences.

 Ls               The ultimate DIR program that specifies types of files
                  and displays file information in many different sorted
                  orders. Ls can also display file-size totals. The
                  program will, if told, recursively search the
                  directory structure.

 Uniq             Displays the duplicate or nonduplicate lines found in
                  a given file.

 Fgrep and grep   Searches files (or standard input) for specific
                  occurrences of text. grep works with regular
                  expressions which can help find approximated text
                  strings.

 Tail             Shows the end of a file. If, however, the file is
                  growing (another process or thread is expanding it),
                  it can continue to show the growing file. I find tail
                  indispensable for logging downloads while I am free to
                  work in another window.

 Sed              A stream editor -- a filter which outputs an edited
                  version of its input. Sed will replace strings, convert
                  characters, delete text and insert text. Sed will work
                  by ranges of line numbers or regular expressions.

Table 4: Hamilton utility programs

 Utility     Description
 -------------------------------------------------------------
 chmod       Change mode bits on files (not directories)

 markexe     Set OS/2 application type bits

 pwd         Print the current working directories

 mkdir       Make directories

 sum         Checksum the contents of a file

 tar         Read/write Unix tape archive format files

 dt          Print the date and time

 setrows     Set height of current window

 patchlnk    Patch "the linker bug"

 du          List disk usage statistics

 vl          List volume labels

 label       Read/write the volume label

 newer       Test whether file 1 is newer than all the others

 older       Test whether file 1 is older than all the others

 tee         Copy Stdin to Stdout and to each file specified

 tr          Translate characters filter

 wc          Count words (and lines and characters)

 split       Split a large file into chunks

 tabs        Expand/unexpand tabs

 cat         Concatenate files to Stdout

 head        Copy first part of file to output

 rmdir       Remove directories

 cp & mv     Copy (or move) files or directories. These two programs
             can force read-only files to be overwritten. They can ask
             before acting on each file and can log the action. Both
             will merge subdirectories.

 rm          Remove files or directories. rm can force read-only files
             to be overwritten. rm can ask before acting on each file
             and can log the action. rm can recursively remove
             non-empty directories. (System files or hidden files or
             directories can be removed.)



Final Assessment


Of course, no product is without its blemishes. Although it improves with each
update, the documentation is the weakest part of this otherwise fine product.
The documentation is written for a highly technical user who understands OS/2
and Unix. Users new to Unix will need to read other documents (see
Bibliography). The scarcity of complete shell scripts makes it hard for a
novice C Shell user to appreciate the many wonderful features of this product.
Unix can be cryptic and unfriendly, but the Shell's awesome power makes it
worth the effort of learning.
While the C Shell is a text powerhouse, database capabilities would make this
product more helpful for business-oriented tasks. More powerful record I/O
procedures and structures for full record handling would help. If C Shell
could integrate an ISAM engine, C Shell applications could be used to solve
complex business and scientific problems.
A screen-capturing feature and internal date functions would be greatly
appreciated. The Macintosh MPW commando facility of dialoguing a shell command
would help users build complex commands without the aid of manuals.
While C Shell works fine in a PM text window, inevitably it will evolve into a
full graphics PM application. Such a version should embody the programming
strengths of HyperCard. Controls and gadgets should invoke scripts, and
programmable dialogues could facilitate PM applications creation.


Products Mentioned


Hamilton C Shell
Hamilton Laboratories
13 Old Farm Road
Wayland, MA 01778-3117
508-358-5715
$350
System Requirements: OS/2 1.1 or later


Bibliography


Anderson, Gail and Paul Anderson. Unix C Shell Field Guide. Englewood Cliffs,
N.J.: Prentice Hall, 1986.
Muster, John and Peter Birns. Unix Power Utilities. Portland, Ore.: MIS Press,
1989.
The Waite Group. Unix Primer Plus. 2nd ed. Carmel, Ind.: Howard Sams, 1990.


_EXAMINING THE HAMILTON C SHELL_
by Scott Richman



[LISTING ONE]

# CTL_T.CSH

while (1) # endless
 echo -n ^x14 > com1: # send control t to com1:
 sleep 400 # Zzz for 400 seconds
end # while






[LISTING TWO]

# Procedure zcw compiles with Zortech C++ and creates a PM program.
proc zcw(name)                      # param name
    ztc -W -c $name.cpp             # compile ($name is param name)
    # we test name.obj for existence,
    if (-e $name.obj ) then         # got valid obj file to link
        link $name,/align:16,NUL,os2+d:\oz\cp\srzpm.lib,$name
        rm $name.obj                # remove the obj
    end                             # if
end                                 # proc


zcw $argv     # now here's the invocation of my proc defined above.







[LISTING THREE]

proc ged(edt_str,files) # 2 params
local i #local variables used
local n
foreach i ($files) # loop thru the files
 @ n = concat($i:r,".bak") # save a backup (:r is root name)
 cp -l $i:f $n:f # copy it (:f is full name)
 sed "$edt_str" < $n:f > $i:f # edit from new to i
end # foreach i
end # end ged proc()

[EXAMPLE 1: Commands to print the attributes of a specified file.]

 if ( -d $a) echo $a is a directory
 if ( -H $a) echo $a is Hidden
 if ( -R $a) echo $a is ReadOnly
 if ( -S $a) echo $a is SystemFile
 if ( -e $a) echo $a exists
 if ( -x $a) echo $a Is Executable
 if ( -z $a) echo $a Zero Length

[EXAMPLE 2: Command to find all duplicate file names on the
current disk]

foreach i (`ls -r \`:gt) echo $i; end | sort | uniq -d



























January, 1991
MAKING A CASE FOR SOFTWARE DESIGN


Design tools can make a difference




Michael Hagerty


Michael is a senior computer scientist at Computer Sciences Corporation and
can be reached at 27911 Berwick Drive, Carmel, CA 93923-8518.


For many programmers and software engineers, the thought of using CASE tools
conjures up nightmares of large teams of programmers working on massive
defense department projects for years on end. While there's been some truth to
this in the past, primarily because of the cost, CASE tools are becoming
available that provide the functionality needed for big projects at the cost
of a good PC compiler.
Moreover, most engineers assume that the only time to begin using a CASE tool
is in a project's design stage -- once coding begins, there really isn't any
need for design tools, and if there is, it's too late to go back. This article
presents a contrary view by describing a project in which design
considerations and CASE tools entered the picture only after the coding was
well underway and the product was looking like something no one wanted.


The Project and Its Problems


The project began as an effort to acquire a computer-based vertical
application to provide information on the availability of rental housing using
voice-text (similar to voice mail) technology. I signed on as the firm's tech
weenie, primarily to keep the principals from getting fleeced by fast-talking
vendors and, where appropriate, to provide technical guidance.
My two clients were quite capable in their own fields. One had extensive
experience in property management and in running a Watson-based computerized
rental-housing information service; the other had run a successful service
business. My role was to help them select and manage the vendor who would
actually produce the system. For all practical purposes, I was to act as the
technical intermediary between my clients and an undetermined contractor (the
vendor).
The vendor we selected was a small firm that had already developed and
installed a rental hotline. While this system addressed the correct market, it
was considerably different, and much less comprehensive, than the system we
had in mind. Nevertheless, we signed with the vendor on the understanding that
these differences would be worked out and that we would receive an improved
version which would meet the standards we had set.
As the project progressed, it became apparent that the vendor was not moving
in the direction we wanted, but was driven by either ease-of-implementation
concerns or a lack of content knowledge. My clients' efforts to redirect the
project were frustrating at best, so I decided to break the impasse by going
back to first principles. (Isn't that what we all do when we find that winging
it just doesn't work as well as we had hoped?)
At that point, we took stock of what we had, what we wanted, and what it would
take to achieve it. Clearly, we did not yet have software anywhere close to a
usable system, and the vendor did not appear capable of understanding how to
build it. We reviewed our goals, limiting the scope of the project to "the
acquisition of an efficient and easily maintained system which would provide
relatively unsophisticated users with continuously updated voice-text
information about the availability of rental housing, meeting their specific
requirements through a 900/800 touch-tone telephone service."
Having clarified the scope, we now needed some way of specifying how the
vendor should implement the system in terms that all parties (client and
vendor) could understand. We had tried pseudocode and notes pointing out
differences between what the vendor had done and what we wanted, but to no
avail. The client wasn't well enough versed to understand pseudocode and the
vendor clearly did not understand the context of the application.
I decided that EasyCase Plus, a PC-based CASE tool I had used in past
projects, would provide the Rosetta stone I needed to bridge the gap between
the two sides.


Designing the Design Process


First I had to extract from my clients a mental picture of how the system
would work. Next, this picture had to be refined to eliminate any impossible
or outlandishly expensive features. I based these limitations upon an educated
hunch of what things were possible within the current technology. It was my
responsibility to transform this mental picture into a physical
representation, which depicted how the pieces they understood fitted together,
using concepts familiar to my clients. In the process, my clients had to learn
to be much more rigorous in considering features of the system and the
interaction among those features.
Once we were satisfied with the specification, I had to hand it over to the
vendor and ensure that he truly understood it, educating him in the "design"
process as well. This turned out to be a difficult task, because the vendor
was apparently unfamiliar with the most basic principles of software
engineering. He preferred to be told what was wrong in one specific place so
he could fix that, then go on to the next, rather than attempt to embrace the
overall flow and structure of the system all at once.
My role in this process was analogous to an architect's when you ask to have a
house designed and built for you. The tasks of explanation, specification,
reflection, analysis, design, approval, implementation, and acceptance are
present in both software system development and housing construction. In order
to relate the specifications of the client's dream to the builder in a
language he could understand, I became the software architect.
The system consisted of a finite-state machine in which the states are voice
messages and the transitions between states are accomplished by pressing the
keys on the phone (events). The most appropriate graphic representation is a
State Transition Diagram (STD), which is supported by the EasyCase tool.
Unlike other CASE diagram representations, the STD is fairly simple,
consisting of only one kind of box (states) and their interconnecting lines
(transitions).
To make this work, each of the states (messages) we knew about was defined
independently of the others. The state was placed on the diagram (see Figure
1), named, and described to the data dictionary; the text associated with that
state was entered into an "explosion" of its identified box. (Exploding is the
ability to link an object on a diagram to another diagram type.) EasyCase
provides a facility to link text files to chart objects within its structures
and provides easy access to an editor to create and modify that text. To gild
the lily, we could have stored actual voice text and linked it to the chart
object, rather like the "blob" capability of some database systems, but that
would have required a separate voice-text editor with audio cut and paste.
Reviewing the states was illuminating for all of us. States which either we
thought or the vendor indicated were absolutely mandatory turned out to be
superfluous when placed alongside the others. States were rapidly merged,
split, added, and deleted as needed. Late recognition of certain mandatory
legal requirements added several extremely verbose states, which, in turn,
required additional states to allow the user the option to skip the legalese.
After all, the user of the system was going to be billed in increments of one
tenth of a minute. The text of the messages was revised over and over to
achieve a high level of vocabulary and syntax consistency. The bulk of the
effort in this phase was directed toward reducing both the number and the
complexity of the words used. (We even found a nonpracticing lawyer who was
capable of compressing and simplifying text, rather than expanding and
opacifying it!)
Once most of the states had been defined, the next task was to specify the
interconnections (transitions) among the boxes (states) on the diagram. For
STDs, the interconnection is fairly simple: "On this key-press, go to this box
unconditionally." What was difficult was not selecting which keys did what,
but specifying an easily remembered, consistent pattern of key usage that
would make sense to the users. After all of the effort invested in making the
text consistent, it would be foolish to overlook this critical area. While the
vendor had provided at least two different patterns of transition, each
context sensitive, it was imperative that we have only one. States had to
transition to other states in a smooth, predictable fashion, to make the
system seem as though it were produced by one hand rather than from a
collection of unrelated pieces.
The specification of the transitions was more complex than we had anticipated.
It was necessary to predict all the actions, including inactions, that a user
would want to take while in a given state. Some states, for example, supplied
only information and, upon conclusion, flowed directly into another state.
Other states offered menus of keys; pressing a particular key transferred
the user to another state. Additionally, we had to decide what to do should
the user not press a key from among those offered within a given period of
time. The system had to have some way to remind the user of his choices,
should he forget.
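
The kinds of transition described here (a message that flows directly into the next state, a menu key, and a timeout that replays the choices) can be captured in a small table-driven sketch. The Python below is an illustration under stated assumptions, not the system that was built: every state name, key assignment, and the DONE and TIMEOUT pseudo-events are hypothetical.

```python
DONE = "done"        # pseudo-event: the message finished playing
TIMEOUT = "timeout"  # pseudo-event: no key was pressed in time

# state -> {event -> next state}. All names and keys are hypothetical.
TRANSITIONS = {
    "welcome":      {DONE: "instructions"},
    "instructions": {DONE: "main_menu", "#": "main_menu"},
    "main_menu":    {"1": "houses", "2": "apartments", TIMEOUT: "main_menu"},
    "houses":       {"*": "main_menu", TIMEOUT: "houses"},
    "apartments":   {"*": "main_menu", TIMEOUT: "apartments"},
}


def next_state(state, event):
    """Follow one transition; an unlisted key leaves the caller where he is."""
    return TRANSITIONS[state].get(event, state)


def run(events, state="welcome"):
    """Feed a sequence of events through the machine and return the end state."""
    for event in events:
        state = next_state(state, event)
    return state


if __name__ == "__main__":
    print(run([DONE, "#", "2", TIMEOUT, "*"]))
```

Keeping every transition in one table is what makes a single, consistent pattern of key usage easy to check: a key that means "back" in one state and something else in another is visible at a glance.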
The STD could have been broken neatly into two pieces: the top half taking
care of the introduction, instructions, and legal requirements, and the bottom
half providing the information the user was paying for. Curiously, the amount
of effort in defining states and transitions was almost exactly opposite for
the two halves. The top half had few complex transitions and huge amounts of
invariant text, while the bottom half had little invariant text, many
menu-based transitions, and a tree of variable-length lists containing voice
text of the listings users wanted to hear.
This last part, the tree of variable-length lists, seemed at odds with the
overall structure. Traditionally, STDs do not contain recursive structures.
The structure was a simple hierarchy with divisions for house/apartment,
number of bedrooms, location, and price. I was convinced that a hierarchical
structuring of the stored data was correct. Reviewing several texts got me
nowhere. The flexibility of EasyCase gave me an out. I could place the
"off-chart" connectors, normally reserved for linking different diagrams
together, inside the box representing the "tree," and use the tool's exploding
capability to hide this complexity. Inside this box was a hierarchy of STDs
providing all of the branches and levels, along with their associated menuing
text.
This use of off-chart connectors was both extremely helpful and very painful.
EasyCase allowed easy disconnection and rerouting of transitions, and even
movement of their endpoints, except for off-chart connectors. This bug has
since been fixed, but at the time it meant manually removing and reinstalling
the connectors each of the many times the box was moved. In spite of the extra
work this "feature" caused, I believed that this was the correct structuring
mechanism and forced the tool to work as I wanted it to.
At first blush, this doesn't seem to be much of a breakthrough, but it pushed
the recursive nature of the menus governing the tree directly into the
vendor's view. In all of the earlier discussions, the vendor had kept focusing
on the tree, seeing it as a relatively flat data structure; we had drawn it
out to five levels deep with between three and eight branches at each level,
linking the listings describing the various properties on the terminal nodes
of the tree. On the primary STD, only one box appears for the entire tree,
decoupling the data from the overall system structure.
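A tree of this kind, with menus at the interior nodes and variable-length lists of listings at the terminal nodes, might be represented along the following lines. This is only a sketch in C under my own assumptions: the struct layout, the descend function, and the tiny sample tree are hypothetical and far smaller than the five-level tree described above.

```c
#include <assert.h>
#include <stddef.h>

/* A node is either a menu (it has children) or a terminal node
   (it has a variable-length list of listings). */
struct node {
    const char *prompt;           /* menu text spoken at this level */
    const struct node *child;     /* array of branches, or NULL     */
    size_t n_children;
    const char *const *listing;   /* voice text at a terminal node  */
    size_t n_listings;
};

/* Follow a sequence of 1-based menu choices down from the root.
   Returns the node reached, or NULL if a choice is out of range. */
const struct node *descend(const struct node *root,
                           const int *choice, size_t n_choices)
{
    const struct node *cur = root;
    size_t i;

    assert(root != NULL);
    for (i = 0; i < n_choices; i++) {
        if (choice[i] < 1 || (size_t)choice[i] > cur->n_children)
            return NULL;
        cur = &cur->child[choice[i] - 1];
    }
    return cur;
}

/* Hypothetical sample data: two levels only, for illustration. */
static const char *const one_bedroom[] = { "listing A", "listing B" };

static const struct node apartment_menu[] = {
    { "one bedroom",  NULL, 0, one_bedroom, 2 },
    { "two bedrooms", NULL, 0, NULL,        0 },
};

static const struct node top_menu[] = {
    { "houses",     NULL,           0, NULL, 0 },
    { "apartments", apartment_menu, 2, NULL, 0 },
};

static const struct node sample_root =
    { "house or apartment", top_menu, 2, NULL, 0 };
```

Choosing 2 and then 1 from sample_root reaches the "one bedroom" node and its two listings. Because every level has the same shape, one routine serves the whole tree, which is the recursive character that a single box on the primary STD conceals.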
The organization of the data tree was developed by my clients totally on their
own. They selected all the levels and branches based upon their educated
intuition of how users would cut up the world of rental housing. Certain
divisions were obvious at the start; others became obvious only after
considerable research and reflection.


Implementing the Design


At this point, the system was completed -- at least on paper. Now was the time
to sell it back to the vendor. We believed that the reorganization and
respecification had improved the product considerably by making it much more
general and easily supported. The vendor, however, disagreed. At first he
refused to look at the diagram, requesting instead that each change be
described separately. We balked at this, insisting that only by looking at the
changes within the context of the whole system could we be certain that errors
were not introduced elsewhere. He relented after we gave examples of errors
which had already been introduced in the process of "correction."
The vendor expressed concern that this representation of the system could
conceal errors of our own making: "States could be left either undefined or
unconnected to other states," he said. Using EasyCase's analysis capability, I
verified that all of the states we had defined were active, noted the states
that had been deleted, and cataloged all of the transitions. The thoroughness
of the tool in noting inconsistencies quieted his concern. The remainder of
the changes were installed and, by the time this article is published, the
system will have been running for two months. Its completion was six months
late, but we got the system the clients really wanted.
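The vendor's worry, that states might be left undefined or unconnected, is the sort of thing a mechanical check can settle. How EasyCase performs its analysis is not described here, so the following is only a sketch in C of one way to make such a check: the edge structure, the check_std function, and the limit of 64 states are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_STATES 64   /* arbitrary limit for this sketch */

/* A transition between two states, each identified by a number
   in the range 0 .. n_states-1. */
struct edge { int from, to; };

static int valid(const struct edge *e, int n_states)
{
    return e->from >= 0 && e->from < n_states &&
           e->to   >= 0 && e->to   < n_states;
}

/* Returns the number of problems found: transitions that name an
   undefined state, plus defined states that cannot be reached
   from the starting state. */
int check_std(int n_states, const struct edge *e, size_t n_edges, int start)
{
    unsigned char seen[MAX_STATES];
    int problems = 0, changed = 1, s;
    size_t i;

    assert(n_states > 0 && n_states <= MAX_STATES);
    assert(start >= 0 && start < n_states);
    memset(seen, 0, sizeof seen);
    seen[start] = 1;

    /* Count transitions whose endpoints are not defined states. */
    for (i = 0; i < n_edges; i++)
        if (!valid(&e[i], n_states))
            problems++;

    /* Spread reachability along valid transitions until stable. */
    while (changed) {
        changed = 0;
        for (i = 0; i < n_edges; i++) {
            if (valid(&e[i], n_states) &&
                seen[e[i].from] && !seen[e[i].to]) {
                seen[e[i].to] = 1;
                changed = 1;
            }
        }
    }

    /* Count states that were never reached. */
    for (s = 0; s < n_states; s++)
        if (!seen[s])
            problems++;
    return problems;
}
```

A result of zero means every transition joins two defined states and every state can be reached from the start. Anything else points to exactly the kind of error the vendor feared.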


Products Mentioned



EasyCase Plus
Evergreen CASE Tools
16710 N.E. 79th Street, Suite 105
Redmond, WA 98052
206-881-5149
Version 2: $295
Professional Pack: $395
System Requirements: IBM PC with 640K RAM, mouse, Hercules/EGA/VGA


Conclusions


What was learned from this effort? First, it is always dangerous to assume
that what works in one arena can be made to work in another. Things unplanned
take longer than things planned. Good design makes for good products. These
are all reasonable conclusions, but the important lesson I learned from the
project, painfully at that, is that it's never too late to begin using
structured techniques and tools, even if it means going back a step to
accommodate the effort.
Although they are incredibly helpful (I would not go through this effort again
without a tool like EasyCase), tools alone are not the answer. As the EasyCase
manual warns:
"It is important to remember that a CASE tool is not a 'magical' solution to
perfect, on-time, on-budget system development. What is important is a firm
understanding of structured development methods and their advantages,
disadvantages, and limitations. CASE simply provides a means of automating the
structured development life cycle.
"In fact, it is highly likely that a system that has been badly designed using
traditional unstructured or even nonautomated, structured methods will also be
badly designed using a CASE tool. The only advantage here is that inherent
disasters may become apparent sooner, perhaps enabling a system redefinition
or redesign before it is too late.
"In other words, if you do not know how to properly (with regard to budgets,
schedule, quality, meeting customer requirements, etc.) develop a system prior
to using a CASE tool, you will most likely be unable to develop a system
properly using a CASE tool."
Tools, coupled with an understanding of the process of system development, can
produce specifications which represent not only the client's intents made
concrete, but a working set of blueprints for a contractor and his journeyman
programmers. Such use of tools, combined with intensive client involvement and
rigor of specification, characterizes the endeavor of the software architect.


January, 1991
WINTHERE


Does your program know when Windows is running?




Ben Myers


Ben is a founder and partner in Spirit of Performance, a Harvard, Mass. firm
that develops application performance and resource utilization software. He
also designs and programs custom benchmarks of hardware and software. Ben
Myers can be reached at MCI Mail ID 357-1400.


In its User's Guide for Microsoft Windows 3.0, Microsoft passes on some strong
advice regarding hard disk utilities. For instance, page 54 states that you
should exit Windows before running CHKDSK/F or any other utility that modifies
hard disk allocation tables. If launched from within Windows, hard disk
optimizing utilities and programs that change the interleave of a hard disk
can wreak havoc with the disk. Then, too, TSR programs may not exactly be good
for the health of a PC system when run from within the Windows environment.
Surprisingly, Microsoft does not provide clear and complete advice to software
developers about how to prevent software from being started up within Windows.
Appendix D of the Virtual Device Adaptation Guide, part of the Device Driver
Kit (DDK), does explain how to detect whether Windows is running in enhanced
mode. If you set the AX register to 1600h and make an interrupt multiplexer
call, interrupt 2Fh, the returned values tell whether Windows is running in
enhanced mode (see Table 1). Even a non-Windows application, such as a disk
defragmenter or a TSR, can make this call to see if it has been launched from
inside the enhanced Windows environment. However, this is only half of the
battle, because there is nothing that tells you how to find out if Windows 3.0
is running in either real or standard mode in the first place.
Table 1: Values returned by calling int 2Fh with AX=1600h

 AL Value*       Meaning
 ----------------------------------------------------------------------
 00h or 80h      Enhanced Windows 3.x or Windows/386 Version 2.xx is
                 not running.
 01h or FFh      Windows/386 Version 2.xx is running.
 Anything else   Enhanced Windows 3.x is running.

 * AL = major version number of Windows

In our case, we didn't want users of our performance measurement package
(Personal Measure) to run its TSR (PMEASURE) from within Windows in any mode;
consequently we took the following approach.
First we used the DOS DEBUG program to uncover how Windows 3.0 behaved in real
or standard mode. By first obtaining the interrupt 2Fh pointer from low DOS
memory and then examining the code at the interrupt 2Fh handler inside Windows
3.0, it is easy to see what happens in both real and standard modes -- they
both check for the same multiplexer value, 4680h, and return zero in AX (see
Table 2).
Table 2: Values returned by calling int 2Fh with AX=4680h

 AX Value        Meaning
 -----------------------------------------------------------
 0000h           Real or standard Windows 3.x is running.
 Anything else   Real or standard Windows 3.x is not running.

Together, the interrupt multiplexer calls 1600h and 4680h provide a fairly
certain way to see if a program is running within the Windows environment. At
present, Microsoft support does not officially acknowledge that the
multiplexer value 4680h exists, a la all of the undocumented DOS calls. And,
of course, this mux call may not be available in future Windows releases.
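The decision that the two tables imply can be written down separately from the interrupt calls themselves. The following is a sketch in C, not part of WINTHERE: the enumeration, the function names, and the parameter names are my own, and the caller is assumed to have already obtained AL from the 1600h call and AX from the 4680h call. Unlike the assembly listing, which treats any nonzero result of the 1600h test alike, the sketch reports Windows/386 Version 2.xx separately, as Table 1 allows.

```c
#include <assert.h>

/* Hypothetical result codes for this sketch. */
enum win_state {
    WIN_NONE,              /* Windows is not running              */
    WIN_386_V2,            /* Windows/386 Version 2.xx            */
    WIN_ENHANCED,          /* Windows 3.x in enhanced mode        */
    WIN_REAL_OR_STANDARD   /* Windows 3.x in real or standard mode */
};

/* al_1600: AL returned by int 2Fh with AX=1600h (Table 1).
   ax_4680: AX returned by int 2Fh with AX=4680h (Table 2).  */
enum win_state classify_windows(unsigned char al_1600, unsigned int ax_4680)
{
    if (al_1600 == 0x01 || al_1600 == 0xFF)
        return WIN_386_V2;
    if (al_1600 != 0x00 && al_1600 != 0x80)
        return WIN_ENHANCED;
    if ((ax_4680 & 0xFFFFu) == 0)     /* only the 16 bits of AX count */
        return WIN_REAL_OR_STANDARD;
    return WIN_NONE;
}

/* Self-check against the rows of Tables 1 and 2. */
void check_tables(void)
{
    assert(classify_windows(0x00, 0x4680) == WIN_NONE);
    assert(classify_windows(0x80, 0x0000) == WIN_REAL_OR_STANDARD);
    assert(classify_windows(0x03, 0x4680) == WIN_ENHANCED);
    assert(classify_windows(0xFF, 0x4680) == WIN_386_V2);
}
```

A program that must refuse to run under Windows in any mode would treat every result other than WIN_NONE as a reason to stop.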
The program WINTHERE (see Listing One) incorporates the two interrupt
multiplexer calls to display a message that reflects the state of Windows as
accurately as possible. An exit code of one from WINTHERE indicates that
Windows is running; an exit code of zero indicates that it is not. The 4680h
multiplexer call cannot distinguish real mode from standard mode, a
distinction that may matter for certain kinds of applications.
WINTHERE needs little explanation. Display is a utility macro that makes it
easier to display messages on the PC console through DOS. The program looks
for an operating system extension that responds to interrupt multiplexer value
1600h, then for one that answers to 4680h.
If you have a commercial TSR or hard disk utility product that must not run
within Windows, it is easy to make it more foolproof: Simply insert some of
the code from WINTHERE into the startup logic of the program, changing the
messages displayed to suit your own conventions. By so doing, you will rest
easier at night, knowing that users of your products are less likely to make
errors that have unpredictable and possibly catastrophic effects.


Reference


Microsoft Windows Device Driver Adaptation Kit, Virtual Device Adaptation
Guide, Version 3.00. Redmond, Wash.: Microsoft Corp., 1990.


_WINTHERE_
by Ben Myers


[LISTING ONE]


 page 58,132
 title WINTHERE, A program to test for the presence of Windows 3.0
 subttl (C)Copyright 1990 Spirit of Performance, Inc.
; All Rights Reserved.
 .list
; You may use any portion of this program for any purpose whatsoever, but
; you must include the above copyright in any program into which portions of
; this program are incorporated.
; Use Microsoft MASM 5.1 or later and Borland TLINK to build WINTHERE.COM.
; masm %1,%1.obj;
; tlink /x /t %1.obj,%1.com
; You may also use LINK and EXE2BIN to build WINTHERE.COM. MASM local
; reference operators @f, @b, and @@ are not handled correctly by Borland
; TASM.

; Equates used in this program
Multiplexor equ 2Fh ; DOS multiplexor interrupt
KbdIO equ 16h ; BIOS Keyboard interrupt
DOS equ 21h ; DOS function call interrupt
Terminate equ 4Ch ; DOS terminate function
PrintString equ 09h ; DOS print string function
CR equ 0dh ; Carriage Return.
LF equ 0ah ; Line Feed.

; Simple macro to display a text string with the DOS print string function
Display macro message
 local amsg,around
 mov dx,offset amsg ; Load offset of message
 mov ah,PrintString ; DOS function code
 int DOS
 jmp short around ; jump around message text
amsg:
 .errb <message> ; generate assembler error if no message
 irp y,<message> ; repeat for each of y args in message list
 db y
 endm
 db '$' ; terminate message with '$' as required
around:
 endm
cseg segment public 'code'
 assume cs:cseg

 org 100h
Begin:
 Display <"WINTHERE - (C)Copyright 1990 Spirit of Performance, Inc.",CR,LF>

; See if being executed from Windows 3.0 in enhanced mode.
 mov ax,1600h ; Enhanced Windows multiplex signature.
 int Multiplexor
 test al,7fh ; Windows 386?
 jnz Win_Enhanced ; Yes.

; See if being executed from Windows 3.0 in real or standard mode.
 mov ax,4680h ; Multiplex signature...
 int Multiplexor ; apparently when Win3 is not enhanced.
 or ax,ax ; Windows 3.0 /r or /s?
 jz @f ; Yes.
 jmp Not_Enhanced_Win ; No.
@@:
 Display <"WINTHERE has been run from Windows real or standard mode.",CR,LF>

 jmp WrapUp

Win_Enhanced:
 Display <"WINTHERE has been run from within Windows in enhanced mode.",CR,LF>
WrapUp:
 Display <"Press any key to continue. . .",CR,LF>
 xor ah,ah ; BIOS function 0: wait for a keystroke.
 int KbdIO ; Returns AL = ASCII code, AH = scan code.
 ; An extended key arrives in this one call
 ; (AL = 0), so no second read is needed.
 mov ah,Terminate ; Quit.
 mov al,1 ; Exit code 1: Windows is running.
 int DOS
Not_Enhanced_Win:
 Display <"WINTHERE has not been run from within MS Windows.",CR,LF>
 mov ah,Terminate ; Quit.
 xor al,al ; Exit code 0: Windows is not running.
 int DOS

; The interrupt mux call with ax=4680h is the one that Microsoft refuses to
; acknowledge, but it sure is there every time Windows is run in real or
; standard mode, and the mux interrupt vector points dead square in the middle
; of the Windows kernel, which then chains the mux interrupt elsewhere.
cseg ends
 end Begin


January, 1991
PROGRAMMING PARADIGMS


The Code of the New West




Michael Swaine


To boldly go where no human has gone before; to seek out the basis of a new
civilization in the realm of cyberspace: This (one can get away with claiming
in so self-congratulatory a context as a commemorative anniversary issue of
the premier software developer's magazine) has been the 15-year mission of the
enterprise DDJ.
Fueled by several power sources, this month's "Programming Paradigms" touches
down at three points along the electronic frontier. I am indebted to David
Bushnell, Victoria Elder, and Howard Rheingold for insight into the origins of
ARPANET; to Lee Felsenstein for his explication of the vision of community
memory; and to Computer Professionals for Social Responsibility and John Perry
Barlow for the working notes to the Code of the New West.
First, a loose definition and a strong claim. Cyberspace is so science
fictiony a word that I almost shrink from using it as I want to use it. But
it's the best word I can think of to refer to the place where you go when you
go online. That's the loose definition, and here's the strong claim, which I
will come back to twice more in this column: Although cyberspace is not
associated with any physical location, it is a real place, or many places.
Describing it in these terms is not metaphorical, but factual. This, I
believe, is important to realize.


The Community of ARPANET


The story is told, by David Bushnell and Victoria Elder in Directions and
Implications of Advanced Computing (Jonathan Jacky and Douglas Schuler, eds.,
Ablex Publishing, 1989) and by Howard Rheingold in his Tools for Thought
(Simon & Schuster, 1985), of how the Department of Defense funded the
development of a new technology that was inherently antiauthoritarian. It says
something about my own prejudices that I find this surprising, but I do. That
is, I am surprised that the DOD would bring into existence something that it
could not control; I am not surprised that a technology could have political
implications.
In 1960, the batch model of programming prevailed: You submitted your program
in the form of a box of punched cards and hoped that there would be no syntax
errors or typos in your code when it finally got its turn to run. Time
sharing, when it came into its own in the 1960s, was an enormous advance in
accessibility and productivity -- now many users could work simultaneously at
terminals, interacting with the computer as though each were the only user.
Then came remote hookups and the possibility of working at home on the campus
or office mainframe.
Toward the end of the 1960s, some important ideas came together, one of them
being the idea that, rather than just connecting terminals to central
computers, it might be interesting to connect computers to computers. Where
the ideas came together most effectively was at ARPA (later DARPA), the
Department of Defense's Advanced Research Projects Agency.
The Rand Corporation put together an 11-volume classified report in 1964 that
proposed developing a fully distributed communication system for all (data and
voice) military communications, via what would later be called "packet
switching." The reason for opting for a distributed system was military: The
argument was that a system with no central control would be much harder to
disable in a nuclear war. The proposal went nowhere, but a few years later,
Robert Taylor, director of ARPA's Information Processing Techniques Office,
was considering something very similar.
ARPA was supporting research in various universities and other research sites
around the country, so Taylor was well situated to see the value of close
communication among distant researchers. By 1968, ARPA was looking at a full
proposal for a distributed, packet-switching network of the computers at ARPA
research sites.
Such a network would demand significant processing power to maintain it, but
the proposal did not use a large central computer to run the network. Rather,
it gave each constituent computer a dedicated processor, which would come to
be called an Interface Message Processor, or IMP, to share the job of running
the network. The network would be distributed.
ARPA bought the idea and solicited bids. IBM, among other likely bidders,
didn't bid. It was Bolt, Beranek, and Newman (BBN) that got the bid and that
built ARPANET, and BBN still maintains it today.
Rheingold points out the subtle changes that the distributed nature of ARPANET
wrought: "The controlling agent in a packet-switched network like ARPANET was
not a central computer somewhere, not even the 'message processors' that
mediated between computers, but the messages themselves." And: "The idea of a
community that could be brought into existence by the construction of a new
kind of computer system was perhaps the most radical proposal of the
[original] paper."
A community. ARPANET really did create a community, a community whose
gathering place was, and is, ARPANET. Even though it has no specific physical
location, ARPANET is a real -- not a metaphorical -- place. In cyberspace,
place can exist without physical location and community can exist without
physical proximity.


The Electronic Commons


The community that gathers on the ARPANET commons is a specialized community
with built-in shared interests. As ARPANET was being implemented, another
network was being designed that endeavored to create a kind of electronic
commons for ordinary people.
Community Memory's chief engineer, Lee Felsenstein, describes the system in
deliberately prosaic terms: "Imagine a telephone book in which you can list
yourself as many different ways as you want, even under different names, and
when you change your listing in your telephone book, it changes in everyone
else's telephone book. Wouldn't that be nice to have. For everyone to have.
That's essentially what Community Memory is."
More concretely, Community Memory was planned to be a network of public-access
terminals in the San Francisco Bay Area. It was not initially designed as a
distributed system such as ARPANET. While ARPANET had the resources of the DOD
behind it, Community Memory had a hand-to-mouth existence as the brainchild of
impecunious 1960s counterculture technophiles. It ran on a central computer
that Felsenstein and the people he was working with happened to have, a 940
time-sharing system; Felsenstein characterized it as "the first machine
designed and built for time-share. Big deal." The original purpose for which
the computer had been acquired had been forgotten, supplanted, or voided
somehow, but there sat the machine with a free-form query input database, a
large hard disk, and time-sharing access. Community Memory was the use to
which the group decided to put it.
Community Memory went online in 1973. "We were running the system with a
terminal in Berkeley and one in San Francisco from August, 1973 to January,
1975," Felsenstein recalls. It was used: One of the first personal computer
companies -- in one sense the first -- was started from a discussion on
Community Memory. Processor Technology founder Bob Marsh connected with
Felsenstein on Community Memory, and Felsenstein later designed the Sol
computer that made Proc Tech briefly the hottest of the early microcomputer
companies. But the idea was that Community Memory was to be for nontechnical
people. The success of that idea had to wait for the success of the technology
and financing of the Community Memory project. It took over a decade before
Community Memory went online for real. In mid-August, 1984, Community Memory
put three terminals online in Berkeley. Today Community Memory supports a
diversity of users, although it is still a small network.
That diversity is more important than it might seem and has a lot to do with
the civilizing of cyberspace.


The Civilizing of Cyberspace


"As a result of the fact that we've got a group of people who are convinced
that [information] needs to be free and another group of individuals who are
convinced that they own it, we enter cyberspace already in a state of civil
war. And that's not a good way to start out a new civilization."
That's John Perry Barlow, who, with Mitch Kapor, founded the Electronic
Frontier Foundation. Barlow was speaking at the annual meeting of the
Computer Professionals for Social Responsibility last October. He may have
been preaching to the choir, but he made a strong case that all computer
professionals should give thought to the uses of the technology they create
and, in particular, to the new world they are creating.
"Most of the people I talk to about this stuff say 'I'm a bus architect, not a
social philosopher,' " Barlow said. "And I really understand that. But society
is so completely perplexed by this technology that even a bus architect is
better qualified to be a social philosopher than a lot of people."
Barlow outlined the questions that we social philosophers of cyberspace should
be asking ourselves, and helping less technical people to ask themselves:
What is -- or are -- data and what is expression?
What is property that has no tangible form and can be infinitely reproduced,
and how do we get money to create it?
What kind of place is a computer?
What are the rights and responsibilities of both individuals and companies in
relation to one another; in other words, what is the social contract?
Can anyone claim to own knowledge?
In beginning to answer these questions, Barlow says, we begin the process of
civilizing cyberspace. There are many ways to go about this, but "the first
and most important of these," Barlow says, "is the same one that we used with
the West, which was making it habitable for ordinary people. It is very
difficult for an average person to go to the places I go whenever I log in.
And the bandwidth is extremely thin. We have to change that bandwidth to make
it possible for human interaction to take place there.
"We have to make it possible for human interaction to take place in a familiar
format. Design is a very important function because what it is is the cultural
mediation of the familiar. There is no design going on [in the computer
field], with the exception of the Macintosh. If we designed cars this way, the
engine would be in the front seat and you'd adjust your speed by diddling with
the carburetor. It's primitive, folks. We have to make it possible for
ordinary people to live here. We have to think about issues of esthetics and
culture." And design can have serious cultural implications: "Jaron Lanier has
said that there's nothing technologically advanced about the Mac: it's a
cultural device. He's right.
"The second thing we have to do in civilizing cyberspace is to reduce the
polarization that already exists. I go around telling security people that
they have nothing to fear from the crackers, and telling crackers that
security people are basically afraid and confused, and that if you treat them
as though they were malevolent fascists they will be that. If you call
somebody a pig he will oblige you by becoming one.
"But it's difficult, because everybody has already chosen up sides and drawn a
deep line in the dirt, and they are already throwing rocks across it. So one
of the responsibilities that you folks have," Barlow told the CPSR audience,
"is to try to bring some light and some humanity and some decency to a
situation that is already very deeply polarized.
"Finally, there are political solutions. I don't have a lot of faith in
political solutions over the long term because I think of cyberspace as being
basically an apolitical place. For starters, it's transnational. I get a lot
of e-mail from abroad saying, 'How can we protect our civil liberties?' And I
can't say, 'Rely on the First Amendment,' because they don't have one. This
brings up a very important point:
"You can't rely on the law, because the law is for the 'real' world and is
necessarily local."

If you can't count on the law, where can you turn? "You have to start relying
on culture and community and shared ethics," Barlow says. "I hate to tell
people that because they've become so reliant on lawyers that [they find it
hard to grasp.] But I think people are going to have to treat one another like
human beings, and that's the only way it's going to work.
"Besides, you've got a situation [in cyberspace] where guerrillas will always
win. There is no better jungle to fade back into than the jungle of
information -- which, as Stewart Brand says, wants to be free anyway. And will
be, I think.
"In order to have this kind of cultural integrity, the first thing you have to
do is to abolish fear. You have to assess the real risks [and confront them
honestly]. It's difficult telling people that we have to abolish fear, because
there's a lot of nameless dread kicking around, mostly as a result of
alienation from this very stuff. People out on the end-user end of things feel
like they're on the learning curve of Sisyphus. They're afraid. They feel like
they're being dragged into a place where their children are natives and they
will never be able to learn the language. And that's enough to give you a lot
of nameless dread.
"Good work is being done on the political front, but I would not rest my faith
for the long term in political solutions, particularly legal ones. I would
rest it in the community of my brothers and sisters and the people in the
computer community who can understand the technology and can make it inclusive
of the ordinary folks out there who are afraid of it."
In the Q&A session after Barlow's talk, a representative of Community Memory
stood and talked about how young urban African-American men are using CM's
public access terminals to discuss events in their community and to invite
each other to parties. The CM representative singled out this group because
they are a group currently being stereotyped in the media as drug-crazed
homicidal maniacs. "I hope that as [we] build a constituency for electronic
freedom," the CM representative said, "we also do everything we can to build
in a diversity of participants, or else it will be a very white, very bland,
very government-controlled cyberspace." And very male, she might have added.
It is a tenet of the American mythology that we need a frontier. I doubt that
this is as peculiarly American a need as we sometimes pretend. Doesn't the
whole world need a frontier? America itself was once Europe's frontier, its
New World, and a lot of Europeans seem to think that we're all still cowboys.
Maybe they need to think that. Since the West was won, that is, civilized,
various metaphorical frontiers have been put forth to substitute for our lost
frontier, to try to give us whatever it is that we need from a frontier. I
suggest that the Electronic Frontier Foundation is doing something better
than this. I suggest that the name of Kapor and Barlow's organization is not
metaphorical, but literal, and that the electronic frontier is a real frontier
because it is a real place, every bit as real as the physicist's universe of
quarks and galaxies.
But much larger.


January, 1991
C PROGRAMMING


Down Memory Lane with C




Al Stevens


This month is DDJ's 15th anniversary issue, so I am taking time away from my
usual programming pursuits and indulging in a bit of history. I thought it
would be interesting to do some basic research into the impact that the C
language has had on the magazine and its growth, and to reflect that impact
against the influence that C has had on programming as a whole.
The C language is a few years older than DDJ. Dennis Ritchie developed the
first compiler in about 1972 and, with Brian Kernighan, published The C
Programming Language in 1978. C was developed on the PDP-11, and by the time
K&R was published, C was running on the IBM 370, the Honeywell 6000, and the
Interdata 8/32.
The emphasis for my research was on how and when C first found its way into
the pages of DDJ, and what form its treatment and use took through the years.
I wanted to see how the early contributors to DDJ perceived the language that
was destined to pervade their industry, where their early tools came from, and
what became of the tools and their creators.
Using the DDJ bound editions from 1977 until last year, I searched for
articles that were about C, or that used C as the language to describe an
algorithm or make a computer do something useful. The bound editions do not
include advertisements, so I couldn't spot the first appearance of this or
that milestone C product. But because my objective was to trace the evolution
of editorial treatment of the C language, the ads were not important, except
perhaps as items of nostalgic interest.
I didn't really scour those bound editions. As much fun as it might have been,
there just wasn't time. Instead, I looked for references to C in the article
titles, flipped through the books looking for C listings, and checked out the
authors to see if anyone who eventually became a C luminary started out as a
DDJ-published visionary. I did not read every letter to the editor, every
column, or every article. What follows, then, is the result of superficial
research at best, but it offers, I think, an historical perspective on the
relationship between C and the generation of programmers who grew up with DDJ.


Who's Who?


My scan for authors revealed that the early bylines in DDJ are a virtual
rogues' gallery of folks who went on to become industry celebrities,
luminaries of the first order in all fields of computing. Jef Raskin, George
Morrow, Gary Kildall, Ward Christensen, Steve Wozniak, Ray Duncan, Lee
Felsenstein, Brian Kernighan, Dennis Ritchie, Donald Knuth, Charles Petzold,
Davy Crockett, Richard Wilton, and Herb Schildt all published in DDJ, most of
them in the early years. (Davy Crockett? Naw, must be some other guy.)
The DDJ of today is a professional programmer's magazine without an editorial
commitment to any particular platform, paradigm, or scale of computer.
However, being mostly reader-written, the articles naturally incline toward
the computer systems that programming writers and writing programmers have at
hand -- and usually at home. But that is the legacy of the wonder years when
DDJ was dedicated to articles about code that people could develop for what
they called "home-brew" computers, and the code and articles were heavily
oriented toward assembly language first and then Basic. C would not make an
appearance in the magazine until there were compilers that programmers could
take home with them.


tiny-c


The first mention of C that I found was in a letter to the editor in February
of 1979 where a reader, Ted Chapin, offered a review of a product named
"tiny-c," which was a C language subset interpreter.
The product was distributed as the "tiny-c Owner's Manual" with printed
assembly language source code for the 8080 and the PDP-11 and was, as near as
I can tell, the first commercially available C language implementation for a
microcomputer. It cost $40 and gobbled up a whopping 4K in the 8080. Well, you
can still get a C compiler for under $100, anyway.
The next mention of C in DDJ was in May of 1979. In a short article -- one
paragraph and some code -- Ray Duncan published the assembly language interface
that integrated tiny-c with CDOS, a CP/M derivative operating system that ran
on Cromemco Z-80 microcomputers.
One month later, Les Hancock wrote the first DDJ article that was about the C
language. In "Growing, Pruning, and Climbing Binary Trees with tiny-c," Les
stressed that the article was not about binary trees but about C. The only C
language system available for his microcomputer was tiny-c. He spoke to its
limitations as a subset and then proceeded to use it effectively to make his
point.
All three authors spoke highly of the tiny-c documentation. Hancock called it
"lovely."
Tom Gibson, a C programmer who worked in an early Unix shop, was the author of
the 8080 version of tiny-c. A coworker, Scott Guthery, wrote the PDP-11
version, and together they wrote and self-published the documentation.
Scott is now the proprietor of the Austin Code Works, a company that continues
the tiny-c tradition of selling software only when it comes with source code.
Scott is also the author of the articulate and controversial article "Are the
Emperor's New Clothes Object Oriented?" in the December 1989 DDJ.
You can still get tiny-c. Scott rewrote it in C to run on MS-DOS machines and
published it in his book, Learning C with tiny-c (Tab Books, 1985). I used it
to learn how to write interpreters. I liked the book, but it has too many
gorilla cartoons. You need crayons.
In May of 1980, Gibson and Guthery wrote an article in DDJ called "Structured
Programming, C and tiny-c," where they described their view of what structured
programming is by using examples in both languages.


BDS C


In January of 1980, Les Hancock returned to the pages of DDJ with
"Implementing a Tiny Interpreter With a CP/M-flavored C." Les's good news was
that CP/M users finally had a C compiler. The product was BDS C, authored by
Leor Zolman and published by BD Software, Zolman's company. The article used
BDS C to implement Les's own tiny language interpreter. The language was
tinier and further away from C than tiny-c, but the importance of this article
is that it was the first one that used compiled C to implement something in
the pages of DDJ.
BDS C was the first compiler I saw and used. My brother was using it to
develop firmware for black box applications -- what they call "embedded
systems" now -- and he showed it to me and gave me my first copy of K&R. That
was around 1981.
Not long afterward I wrote an assembler in BDS C for a member of the TMS
family of microprocessors. BDS C was a really fast, integer-only compiler with
a Fortran-like common area, and I loved it but abandoned it as soon as a full
K&R compiler for CP/M was available that I could afford. I met Leor at
Software Development '90. He's with the C User's Group and still sells an
occasional copy of BDS C.


K&R & J&L


In 1980, DDJ published an AT&T paper by Brian Kernighan, Dennis Ritchie, S.C.
Johnson, and M.E. Lesk, titled, "The C Programming Language." In their closing
remarks they ponder the future of C: "Should the pressure for improvements
become too strong for the language to accommodate, C would probably have to be
left as is, and a totally new language developed. We leave it to the reader to
speculate on whether it should be called D or P."
Three years later the ANSI X3J11 committee was formed.


Small C



In May of 1980, Ron Cain published "A Small C Compiler for the 8080s," an
article that contained a subset C compiler named "Small C," which was written
in Small C. He started by developing the compiler in tiny-c and then used it
to compile itself iteratively as he refined it. Cain, too, took the time in
the article to praise the tiny-c documentation. This thing must have been
something. Small C compiled to 8080 assembly language.
Cain had no way for readers to compile the compiler except by their own
ingenuity. The code he published would compile on a Unix machine, but he
openly left it to others to put executable versions of the compiler onto media
that would be readable by all the many diskette and cassette formats of the
day. Obviously, Cain was no entrepreneur. He confessed that he could not and
did not want to keep up with the demands from programmers who wanted the Small
C compiler.
Small C was a major breakthrough for budding C programmers who had 8080-based
computers at home. It became a subject for many articles to follow, it was the
language that other writers used to write about implementations of other
things, and was eventually ported to CP/M and then to the 8088 and MS-DOS.
Cain wrote a follow-up article in September of the same year. In "Runtime
Library for the Small C Compiler," he published the arithmetic and logical
runtime code and the I/O library which resembled the Unix functions. All of
his source code for the runtime library was in 8080 assembly language.
Cain mentioned the availability of C compilers for CP/M machines at the time
of his article. Cain didn't like CP/M and didn't use it, and he spoke of the
high cost of the existing compilers. I remember hearing about C compilers that
ran on CP/M back then, but I do not know when they first came out or which
ones Cain referred to.
In 1981, there were four Small C articles. These were about the compiler
itself. One author rewrote the compiler in Fortran to compile the first
bootstrap of the compiler, and he wrote two articles about his experiences.
Another rewrote the compiler, changing its structure considerably, and
offering it to DDJ readers as the InfoSoft C compiler for $50.
The fourth article about Small C was the first appearance in DDJ by J.E.
Hendrix. His short article published some patches to the compiler. Hendrix was
to become the heir apparent to the Small C mantle. In 1982 he published "Small
C Compiler, v.2," adding many features to the compiler. He later published
that version in a book, and has recently published in book and diskette form,
A Small C Compiler, 2nd Edition (M&T Books, 1990), which ports the compiler
and compiled language to the 8088 and MS-DOS.
I have a complaint about the book. Small C has never supported structures. I
wish it did, because it is a small, fast compiler that would work nicely on my
slow, single-disk laptop. But I need structures. In his discussion of what the
Small C subset leaves out, Hendrix brushes off structures and programmers who
use them when he says, "The use of structures and unions is never essential,
although it may seem so to programmers who depend on them."
Now, in the first place, I don't need to be talked down to from that far up.
And, in the second place, to address the importance of structures I'll quote
Dennis Ritchie from my interview with him in the DDJ C Sourcebook for the
1990s. When asked how long it took to rewrite Unix in C, Dennis said:
"There were two tries at it... Ken Thompson tried to do it and gave up. The
single thing that made the difference was the addition of structures to the
language... Without that it was too much of a mess."
So there. Back to the history lesson.
In 1983 there were articles publishing "A Small C Operating System," and "A
Small C Help Facility." 1984 saw "A New Library for Small C," "cc - A Driver
for a Small C Programming System," and "p - A Small C Preprocessor." Hendrix
wrote "Small C Update" in 1985.


Ed Ream's Editor


1982 was the year that Edward K. Ream wrote, "A Portable Screen-Oriented
Editor," the program that you will find in every public domain C library in
the Western world. Ed wrote his editor in Small C. Alan Howard wrote a
responding article that enhanced the editor, and Ream published his own
enhancement called "RED," written in BDS C.


C Gets Its Own Column


In October, 1983, Anthony Skjellum wrote the first installment of his "C/Unix
Programmer's Notebook" column, a bimonthly column that would run more than a
year. C finally rated its own place in the magazine. In 1984 Allen Holub
published an article and the code to GREP.C. He would launch the "C Chest" in
March of the following year when Skjellum decided to give up his column. The
"C Chest" was the first of the monthly columns devoted entirely to the C
language, and it ran for over five years.


Reviews


By 1985, there were enough C compilers for MS-DOS machines to warrant a
benchmark article. They were Aztec C, Control C, C Systems C, Computer
Innovations C86, Datalight C, DeSmet C, Digital Research C, EcoSoft C, Lattice
C, Mark Williams C, Microsoft C, Software Toolworks C, and Wizard C. Fourteen
in all.
In 1986 the number was up to 17, and another benchmark article followed.
Digital Research C and Control C were gone from the list, but Datalight added
something called the "Datalight Kit," and the new ones were Hot C, IBM C, High
C, Mix C, and Whitesmiths C.
Where are they all now? The first, Microsoft C, was really a repackaged
Lattice compiler that Microsoft sold until their own in-house compiler was
ready. Datalight changed to Zortech and moved up to C++. Wizard moved west and
joined Borland to become Turbo C. The Turbo C that was being developed
in-house left Borland and became TopSpeed C. Mix C became Power C. Digital
Research C became extinct. Mark Williams C became Let's C. This is as hard to
keep up with as the "Twin Peaks" plot.
The 1988 article, "Speed Trials: Five Cs Compared," tested Microsoft C, Watcom
C, Turbo C, Datalight Optimum-C, and Computer Innovations C86+, and mentioned
High C.


Nay-Sayers


1986 was the year that someone discovered C-bashing as a national sport. It
gained popularity mostly among programmers who were just learning C and
couldn't wait to write something clever about it, mainly about how much
trouble it was giving them. Couldn't be them. Must be C. It's fashionable not
to like something that's so popular. I never liked Johnny Cash all that much.
In January of 1986, we had "Inefficient C," (get it?) telling us how slow and
big C programs are. And in June, we got "What's Wrong With C," telling us more
of the same, and suggesting that C programmers do it for the glory of being in
on something esoteric. Yeah, and your mama, too.


ANSI C


The first DDJ article I found about the ANSI standardization of C was
"Preparing for ANSI C" in August of 1987. I guess they had the shape of the
conference table settled by then. We were to keep preparing for almost another
three years. In 1988, K&R second edition came out, based on ANSI C. They
couldn't wait. In August of 1989, DDJ had "Going from K&R to ANSI C," although
it still wasn't ready for us to go there yet. We got the standard in early
1990. It's pretty good.


The C Programming Column


In August of 1988, Allen Holub hung up his spurs, closed the C chest, and I
took over the C desk at DDJ. No big deal in the annals of computer science,
but the little old lady in the next condo is impressed.


Classy C


The emphasis in 1989 was on C++. We had articles on C++ vs. Modula-2, C++
multitasking, directory searches with C++, and TAWK in C++. In December I
found myself dead center between the past and the future when I interviewed
Dennis Ritchie and Bjarne Stroustrup in the DDJ C Sourcebook for the 1990s
about the history of C and the directions of C++.



Back to the Present


If we've worn too many ruts in Memory Lane this month, don't fret. We won't do
it again for another five years, I bet. As I wandered through the DDJ titles
of 15 years' worth of reader-written journalism, I was struck by what I found
and by what I didn't find, too. You'd think that the articles would not wear
well, that the technology would have overtaken them and made them obsolete. To
a certain extent that happened. You won't find much demand for 8080 assembly
language any more. But just this past week someone on the CompuServe DDJ Forum
was looking for the source code to Tiny Basic.
The Small C articles all the way back to the first are still valid to anyone
who wants to see how a compiler can be written in C and watch how it grows and
improves with time. There are many articles about fundamental algorithms that
will remain valid and will survive all the paradigm shifts yet to come because
the underlying principles never wear out.
And what didn't I see? There are a lot of data structures and fundamental
algorithms that I expected to find somewhere in the lore of the last 15 years
just because they have been out there waiting to be written about. Many have
been neglected. That's reassuring. I was worried about running out of ideas.



January, 1991
STRUCTURED PROGRAMMING


If You Care




Jeff Duntemann KI6RA/7


No matter how you slice it or dice it, '76 was a good year. America, the
world's best hope for human freedom, marked its 200th anniversary. I married
Carol and stopped subsisting on Rice-A-Roni and Golden Grahams cereal. I
wire-wrapped my first computer, a COSMAC ELF with 256 bytes of RAM. I wrote my
first operating system. I wrote my last operating system. (It was the same
operating system.)
And out of nowhere there came Dr. Dobb's Journal of Computer Calisthenics and
Orthodontia: Running Light Without Overbyte. Byte may well have been the first
microcomputer magazine, but DDJ was the first magazine for microcomputer
programmers, which was a degree of specialization that most people considered
a little nutty at the time. It was printed on plain white paper, and didn't
even have a cover. The title was dead on target. And best of all, it made my
brain crawl with ideas.
Over the years we've seen a lot of weird and interesting material in DDJ, and
I couldn't begin to catalog the things I learned here long before I ever saw
them anywhere else. I think it's fair to say that by publishing Ron Cain's
Small C, DDJ gave the C language the push it needed away from near-terminal
Unix bloat and toward critical mass on the leaner, meaner platforms that rule
today.
Beyond all that, however, what earned my everlasting respect for DDJ is that
it is, and has always been, a publication that cares. It recognizes, first of
all, that there is a universe of complication outside the cubbyholes where we
lay down our code, line by line. These complications affect us, our ability to
earn a living, and in some cases, our ability to speak and act as free beings.
Rather than pretend that these complications don't exist (as my earlier
employer, PC Tech Journal, always did) DDJ allowed concerned voices among its
staff and readership to speak to those readers who were perhaps unaware of or
as yet undecided about those complications.
Some years back, Allen Holub ignited a small storm with his contention that
programmers have the obligation to act ethically, and that ethics preclude
working on software that supports weapons systems. Allen and I chawed on
opposite ends of that particular bone of contention (since after all, our
nuclear weapons prevented the Russians from destroying themselves -- and us --
before they had a chance to come to their senses) but I stood a little in awe
that DDJ gave him the forum to make his feelings known. No hint of that debate
would ever have surfaced in print at PC Tech Journal, where I was regularly
dressed down for attempting to lighten that magazine's often-leaden,
all-business heart.
Software piracy, DOD suppression of public-key encryption algorithms, BBS
harassment, look-and-feel banditry, and (most recently) the absurd activities
of our own Patent Office have seen considerable discussion in these pages.
Sometimes DDJ has an official position, and sometimes it does not. (Not only
are there not always any easy answers; there are often no answers at all.)
Keep in mind that magazines are there to inform and to stimulate discussion.
Ultimately, it is individuals who act. What DDJ does that no other
programmer's magazine has ever done is to lay out these ugly issues for public
dissection, and then plead, if you care, act.
It's been 15 years that leave me out of breath to recall. Mostly it's been 15
years of unbridled freedom to hack, to learn, to work, and to make money.
We've come to take that freedom for granted, forgetting that freedom is always
under attack by the greedy, the unprincipled, the envious, and the fearful.
We've been lucky so far. It's not going to last. Large, technologically
bankrupt firms such as Lotus are putting systems in place to take by force
what they can no longer earn in the free market. The U.S. Patent Office is
illegally handing out patents on formulas (which we call algorithms)
irrespective of the fact that formulas explicitly cannot be patented, not to
mention additional silly points like blatant obviousness and prior art. Many
government bodies are trying their best to make BBS systems impossible.
So let me echo DDJ's unwritten philosophy: If you care, act. It's your hind
end on the line. Boycott firms that claim what isn't theirs. Pester the
bejeezus out of your congressman to put a leash back on the Patent Office and
force them to obey their own law.
Most of all, strive in whatever way you can manage to return our industry and
our nation to the rule of law. The law today has become so rubbery that it has
come to mean nothing but what some judge somewhere says it means, which is to
say nothing at all. I still believe it can be done. The alternative is chaos,
especially in our industry where the limits of what we can do are nowhere in
sight. (I have already heard rumors of a new class of virus that inserts
realistic-looking bugs into copies of Lotus 1-2-3 that it finds...do we really
want to let slip the dogs of that sort of war?)
Ultimately, it depends on you. The opinions expressed in this particular
column are entirely my own, and do not reflect the views of Dr. Dobb's Journal
-- which is entirely the point! They gave me this space to make noise because
they care. Now it's your turn. Care enough to understand the issues. Care
enough to have opinions. Get excited. Get mad. If you care, you can win. If
you hide, you will lose.
It's that simple. And I learned it by reading DDJ.


An Object's Private Parts


Turbo Pascal 5.5 worked so well at bringing objects to the common hacker that
few of us carped about its (minor) shortcomings. Probably the most major of
its minor shortcomings was lack of any management of access rights to object
internals. In other words, any program statement within the scope of an object
could freely access any field or method within that object, period. All fields
and methods were strictly public. About the best we could do was simply not
publish the full definition of an object type, but rather give an object's
users an edited list of those fields and methods we chose to make available.
This made access-rights management something like an exercise in industrial
espionage, and almost nobody bothered.
C++ has access-rights management in spades, as I am discovering while writing
Object-Oriented Programming From Square One, which touches on C++ in the
course of explaining OOP principles. (And C++ From Square One is still ahead
of me -- arrgh!) You can restrict access to object fields and methods at three
different levels -- and then selectively violate those restrictions using
"friend" functions. It took a couple of days for me to get it all straight in
my mind, and it left me with the lingering feeling that C++ is rife with
spokeshaves; that is, tools good for only one specialized purpose (such as
shaving spokes) that rarely come to hand in any other situation.
Borland added access-rights management to Turbo Pascal 6.0 (released this past
November). In keeping with seven years of tradition, they managed to do it in
a way that retained 80 percent of the power of That Other Language, while
remaining simple enough to master without a lifetime of effort.


Sticking With a Winning Paradigm


Winning paradigms are like winning horses: You stick with 'em. Borland weighed
the need for limited access rights in Turbo Pascal objects very carefully
before deciding just how to implement them. In fact, they had a very
successful paradigm of limited access rights in their hip pockets all the
time, and they wisely decided to stick with it.
The paradigm I'm speaking of is the units paradigm, and it's both a familiar
and an effective model for limiting object access rights. In every unit there
is a public portion called the interface part, and a private portion called
the implementation part. Program entities declared in the interface part are
"public;" that is, any code using the unit can reference those entities
freely. On the other hand, entities declared and defined wholly in the
implementation part of a unit are private to that unit. This means that other
entities inside the implementation part of the unit can use them, but no
entity outside the implementation part of the unit can reference them or know
that they exist.
This works beautifully. So Borland stuck with units as the mechanism through
which object access rights are defined. A new directive, PRIVATE, has been
added to the language. PRIVATE is a directive rather than a reserved word; it
has special meaning only within an object definition. (The reserved word
VIRTUAL, added with Turbo Pascal 5.5, has been demoted to this same sort of
directive.) If you put the directive PRIVATE inside an object-type definition,
any fields or methods declared after PRIVATE may be referenced only from
within the unit in which the object type is declared.
Think of it this way: An object type definition is typically placed in the
interface portion of a unit, making it public and referenceable from anything
that uses the unit. The PRIVATE directive is a way to move declarations that
would ordinarily be made in the implementation section of a unit up into the
interface section -- without making the declared items public.
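To make the mechanics concrete, here is a minimal sketch of a unit that exports an object type with a private part. The unit name Tally, the type Counter, and every field and method in it are invented for this illustration; none of them come from Borland's manuals or from the listings that accompany this column.

```pascal
UNIT Tally;                        { hypothetical unit, for illustration only }

INTERFACE

TYPE
  Counter = OBJECT
    PROCEDURE Reset;               { public: callable by any user of Tally }
    PROCEDURE Bump;
    FUNCTION  GetCount : Integer;
  PRIVATE
    Count : Integer;               { private: visible only inside unit Tally }
    PROCEDURE Clamp;               { private helper, called only by Bump }
  END;

IMPLEMENTATION

CONST
  MaxCount = 999;                  { arbitrary ceiling chosen for the sketch }

PROCEDURE Counter.Clamp;
BEGIN
  IF Count > MaxCount THEN Count := MaxCount;
END;

PROCEDURE Counter.Reset;
BEGIN
  Count := 0;
END;

PROCEDURE Counter.Bump;
BEGIN
  Count := Count + 1;
  Clamp;
END;

{ A read-only window on the private field: one assignment, nothing more }
FUNCTION Counter.GetCount : Integer;
BEGIN
  GetCount := Count;
END;

END.
```

A program that says USES Tally can declare a Counter, call Reset and Bump, and print GetCount. A statement such as C.Count := 5 in that program should be rejected by the compiler, because Count sits below PRIVATE and the program lies outside the unit.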


When, Yet Again


Actually, the best way to explain it is to move right to a practical code
example. Listing One is a remake of my old, (somewhat) reliable "when stamp,"
which I first presented in the April 1990 DDJ as an example of encapsulation
in Turbo Pascal 5.5. A when stamp, in case you're just tuning in, is my
coinage for a model of a point in time under DOS. It contains both the time
and the date, stored as a single 32-bit quantity, along with machinery to
fetch the current time and date from DOS, and to provide the user with the
time and date in various formats. I wrote it to bundle a whole toolkit of time
and date formatting procedures and functions into a single logical entity --
which I definitely think of as encapsulation in action.
WHEN2.PAS recasts the when stamp for Turbo Pascal 6.0. Look closely at the
object-type definition. It now has both public and private parts, separated by
the new directive PRIVATE. Those items declared above PRIVATE are accessible
by users of the unit. Those items declared below PRIVATE are accessible only
from within the implementation section of the unit.
Although we don't refer to them as such, an object, like a unit, now has an
interface and an implementation section. (The Borland manuals simply refer to
them as "public" and "private.") The parts of an object that the user of the
object is allowed to use are the interface section, whereas those parts of the
object not available to the object's users are the implementation section.
I've sketched out this correspondence in Figure 1. Because the complete object
definition must be in the unit's interface section, the user of the object is
fully aware of the object's private parts, but isn't allowed to get at them.
(Anyone who came of age prior to the Sixties will know what I mean.)
I suppose that in the purest sense of the word private, the object definition
should be split in two, with the public portion in the interface section of
the unit, and the private portion in the implementation section of the unit.
This would make for needless confusion, since after all, encapsulation is a
coming together.


Hands Off, Kids


There is one downside to Borland's system of access rights: For full access to
all fields and methods, subclassing must be done within the same unit as the
superclass. In other words, if you choose to extend an existing object by
declaring a child type of that object, the child type's methods must be fully
implemented in the same implementation section containing the parent's
methods. The rule that private fields and methods are private within a single
unit is absolute. You can declare child types outside the parent type's unit,
but those child types must work with the parent type on the same terms as
everybody else: Without touching the parent type's private parts.
What this does mostly is put a crimp in extendibility. Extending an object
with much of itself set off as private becomes difficult or impossible unless
the person doing the extending has the source code to the unit defining the
parent. If providers of objects intend their objects to be extended, they must
be very careful in choosing what should be private and what should not. A
private method cannot be overridden from outside the unit.
How much of a problem will this turn out to be? Only some serious use of the
product will tell. I suspect it sounds worse than it truly is. Got any
insights? Do share them.



The Capsule in Encapsulation


The coming of access rights with Turbo Pascal 6.0 solved an ugly problem
besetting the original when stamp unit. All of the fields in the private
portion of the new When object were present in the original, but in the
original they could be accessed freely, and there was no way that I could
prevent such access. So I turned a bug into a feature and declared that this
made for speedier performance: If you wanted a string form of the date, you
just went in and grabbed the string form of the date that the object
maintained internally inside the field called DateString.
Fast, easy; no function-call overhead.
All well and good -- but anything that can be read from outside the object can
be changed as well. Reading any of the when stamp's data fields is fine.
Directly changing any of them is a recipe for instant trouble.
Why? Consider: The when stamp actually models only one moment in time, but
internally it contains several expressions of that moment in time. There is
the central 32-bit field, WhenStamp, which contains the bit-mapped values of
the current hours, minutes, seconds, year, month, and day. Then there are
separate numeric fields containing the same information: Hours, Minutes,
Seconds, and so on. Additionally, there are three different string fields
containing human-readable representations of time and date, plus another
numeric field indicating the day of the week, produced by that rascal, Zeller.
Now, suppose that you instantiate a When object (RightNow, say) and call the
PutNow method to load the current time and date into RightNow. This current
time and date value is stored in the field called WhenStamp. The PutNow method
then calls several other routines, which take the value in WhenStamp and
calculate values for the other representations of the time and date.
Later on, you turn around and write the value 3 directly into the Month field.
Unless it just happens to be March, the internal fields of RightNow no longer
agree with themselves on what month it is. WhenStamp may say October, but
Month says March. Who do you believe?
By design, the WhenStamp field is boss. The "true" time and date contained in
a when stamp is the time and date value in the WhenStamp field. The other
representations must be calculated from the value in WhenStamp. Change one of
the other representations without changing WhenStamp first, and your when
stamp may start telling lies, and in this business, lies beget bugs.
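The "WhenStamp is boss" rule can be sketched in a few lines. The program below is not the when stamp listing; it is a stand-in that assumes the 32-bit value uses the standard DOS packed date/time layout, and the names WhenRec, PackStamp, Zeller, and Recalc are hypothetical. It shows every other representation being derived from the one master value, including the day of the week by Zeller's congruence.

```pascal
PROGRAM StampDemo;

{ Assumption: the stamp uses the standard DOS packed date/time layout.
  High word: bits 15-9 year-1980, bits 8-5 month, bits 4-0 day.
  Low word:  bits 15-11 hours, bits 10-5 minutes, bits 4-0 seconds/2. }

TYPE
  WhenRec = RECORD                 { hypothetical stand-in for the When object }
    Stamp : LongInt;               { the master value }
    Year, Month, Day,
    Hours, Minutes, Seconds,
    DayOfWeek : Word;              { all derived from Stamp; 0 = Sunday }
  END;

FUNCTION PackStamp(Year, Month, Day, Hours, Minutes, Seconds : Word) : LongInt;
VAR
  DatePart, TimePart : LongInt;
BEGIN
  DatePart := (LongInt(Year - 1980) SHL 9) OR (LongInt(Month) SHL 5) OR Day;
  TimePart := (LongInt(Hours) SHL 11) OR (LongInt(Minutes) SHL 5)
              OR (Seconds DIV 2);
  PackStamp := (DatePart SHL 16) OR TimePart;
END;

FUNCTION Zeller(Year, Month, Day : Word) : Word;
VAR
  Y, M, K, J : LongInt;
BEGIN
  Y := Year;
  M := Month;
  IF M < 3 THEN                    { January and February count as months }
  BEGIN                            { 13 and 14 of the previous year }
    M := M + 12;
    Y := Y - 1;
  END;
  K := Y MOD 100;                  { year within the century }
  J := Y DIV 100;                  { zero-based century }
  { Zeller's congruence yields 0 = Saturday; the +6 rotates it to 0 = Sunday }
  Zeller := (Day + (13 * (M + 1)) DIV 5 + K + K DIV 4 + J DIV 4 + 5 * J + 6)
            MOD 7;
END;

{ Rebuild every derived field from Stamp, and from nothing else }
PROCEDURE Recalc(VAR W : WhenRec);
BEGIN
  W.Year      := ((W.Stamp SHR 25) AND $7F) + 1980;
  W.Month     := (W.Stamp SHR 21) AND $0F;
  W.Day       := (W.Stamp SHR 16) AND $1F;
  W.Hours     := (W.Stamp SHR 11) AND $1F;
  W.Minutes   := (W.Stamp SHR 5) AND $3F;
  W.Seconds   := (W.Stamp AND $1F) * 2;
  W.DayOfWeek := Zeller(W.Year, W.Month, W.Day);
END;

VAR
  RightNow : WhenRec;

BEGIN
  RightNow.Stamp := PackStamp(1991, 1, 15, 10, 30, 44);
  Recalc(RightNow);
  WriteLn(RightNow.Year, '-', RightNow.Month, '-', RightNow.Day, ' ',
          RightNow.Hours, ':', RightNow.Minutes, ':', RightNow.Seconds,
          '  day of week ', RightNow.DayOfWeek);
END.
```

In a record like this one nothing stops a caller from writing RightNow.Month := 3 and skipping Recalc, which is exactly the disagreement described above. The only safe route is to change Stamp and then recalculate.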
By design, in other words, fields like Month and LongDateString are
"read-only," but there was no mechanism in Turbo Pascal 5.5 to enforce that
stipulation. Now, in Turbo Pascal 6.0, that mechanism is there, in the form of
access rights. Month, LongDateString, and the other fields are now private.
The PutNow method and other code within the When2 unit may change them, but
users of the unit may not. To allow users access to the various
representations of the time and date, I added a whole raft of new methods,
such as GetYear, GetMonth, GetDayOfWeek, and so on. If you look at their
implementation, you'll see that these methods are nothing more than single
assignment statements: The value of the field in question is assigned to the
name of the method in question. The method grabs the value of the private
field and carries that value out front to the user. It's a one-way street: The
user cannot make the GetDayOfWeek method go back and somehow change the
DayOfWeek field.
I could have had (and probably should have had) all these methods in the
original when stamp unit, presented here in DDJ last April. I chose not to
because there was no way to force users of the unit to go through channels and
use the methods to retrieve time and date values, rather than going directly
to the time and date fields themselves.


The Virtue of Private Methods


Having private methods also allowed me to take the several utility routines in
the original when stamp unit -- CalcTimeString, CalcDayOfWeek, etc. -- and
make them methods. The user has no cause to call these methods directly (and
in fact might make a mess if allowed to do so); private methods cannot be
called by the user. They can only be called by code within the implementation
portion of the unit, typically by the object's other methods.
Now, the "calc" routines were always in the implementation portion of the
When2 unit, and hence off-limits to when stamp users. So why bother making
them methods? The answer is that, as methods, the "calc" routines can access
the object's data fields directly, rather than as parameters. When a data
field is pushed on the stack as a parameter, a lot more code must be executed
than if a data field is referenced directly. Without all that thrashing of
parameters onto and off of the stack, the when stamp object is both smaller
and faster.
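The difference between the two calling styles is easiest to see side by side. The sketch below uses a hypothetical Clock object, far smaller than the when stamp, and a hypothetical helper named TwoDigits. It does the same formatting job twice, once as a stand-alone procedure that takes everything as parameters and once as a method that reaches the fields directly.

```pascal
PROGRAM CalcDemo;

TYPE
  TimeStr = String[5];

  Clock = OBJECT                   { hypothetical, much smaller than When }
    Hours, Minutes : Word;
    TimeString : TimeStr;
    PROCEDURE CalcTimeString;
  END;

{ Formats 0..99 as two characters with a leading zero }
FUNCTION TwoDigits(N : Word) : TimeStr;
VAR
  S : TimeStr;
BEGIN
  Str(N, S);
  IF Length(S) < 2 THEN S := '0' + S;
  TwoDigits := S;
END;

{ Stand-alone form: each value it needs is passed as a parameter }
PROCEDURE CalcTimeStringP(Hours, Minutes : Word; VAR Target : TimeStr);
BEGIN
  Target := TwoDigits(Hours) + ':' + TwoDigits(Minutes);
END;

{ Method form: no explicit parameters; the fields are referenced directly }
PROCEDURE Clock.CalcTimeString;
BEGIN
  TimeString := TwoDigits(Hours) + ':' + TwoDigits(Minutes);
END;

VAR
  C : Clock;

BEGIN
  C.Hours := 9;
  C.Minutes := 5;
  CalcTimeStringP(C.Hours, C.Minutes, C.TimeString);
  WriteLn(C.TimeString);
  C.CalcTimeString;
  WriteLn(C.TimeString);
END.
```

Both calls leave the same string in C.TimeString. The method call site is shorter, and the method's heading states plainly which object it belongs to.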
And apart from that, making all the code connected with an object into methods
helps from a documentation and comprehension standpoint. One glance tells you
what an object can do. There's less digging around to get an overview of its
internals.
Good OOP practice has always held that the values of object fields should
be returned through methods, rather than through direct access. In
Smalltalk and Actor, there's no choice in the matter -- data fields are
off-limits outside the boundaries of the object that contains them. With
private methods and fields, Borland has put the capsule into encapsulation,
and made the Pascal OOP design a bit more foolproof for all us fools
struggling to make something of it.


Events in Graphics


Last month, I described Turbo Vision, the event-driven windowing application
framework Borland is now shipping with Turbo Pascal 6.0. As good as it is,
Turbo Vision operates only in text mode. This is fine; I do like Windows 3.0,
but I like freedom of choice a lot more.
It may be a bit before we see Windows 3.0 development with Turbo Pascal. In
the meantime, I've found a dandy graphics-based event-driven application
framework, and while it's not object oriented, it's still extremely well done:
The TEGL Windows Toolkit II from TEGL Systems in Vancouver, British Columbia.
The TEGL Windows Toolkit II (TEGL, for short) operates in much the same way as
Turbo Vision. Your application is a process of setting up responses to mouse,
keyboard, and timer events, and then letting the event handler take over. The
event handler intercepts events from their various sources, and invokes your
routines appropriately.
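For readers who have not worked this way before, the general shape of an event-driven program can be sketched without any toolkit at all. Nothing below is TEGL's or Turbo Vision's actual interface. The types Event and Handler, the handler table, and the canned queue are all hypothetical, and a real framework would pull events from the mouse, keyboard, and timer rather than from a constant array.

```pascal
{$IFDEF FPC}{$MODE TP}{$ENDIF}     { lets Free Pascal accept the TP syntax below }
{$F+}                              { far calls, needed for procedure variables }
PROGRAM EventDemo;

TYPE
  EventKind = (evMouse, evKey, evTimer);
  Event = RECORD
    Kind : EventKind;
    Data : Integer;                { meaning depends on Kind in this sketch }
  END;
  Handler = PROCEDURE(VAR E : Event);

CONST
  QueueSize = 3;
  { A canned queue standing in for real mouse, keyboard, and timer input }
  Queue : ARRAY[1..QueueSize] OF Event =
    ((Kind : evKey;   Data : 65),
     (Kind : evMouse; Data : 40),
     (Kind : evTimer; Data : 18));

VAR
  Handlers : ARRAY[EventKind] OF Handler;
  I : Integer;

PROCEDURE OnMouse(VAR E : Event);
BEGIN
  WriteLn('mouse event, data = ', E.Data);
END;

PROCEDURE OnKey(VAR E : Event);
BEGIN
  WriteLn('key event, data = ', E.Data);
END;

PROCEDURE OnTimer(VAR E : Event);
BEGIN
  WriteLn('timer event, data = ', E.Data);
END;

BEGIN
  { Step 1: the application registers its responses }
  Handlers[evMouse] := OnMouse;
  Handlers[evKey]   := OnKey;
  Handlers[evTimer] := OnTimer;

  { Step 2: the event loop takes over and invokes them as events arrive }
  FOR I := 1 TO QueueSize DO
    Handlers[Queue[I].Kind](Queue[I]);
END.
```

The application code never decides what happens next; it only says what to do when something happens. That inversion is the part that takes getting used to, whichever framework supplies the loop.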
TEGL provides many of the same services as Turbo Vision: pull-down menus,
pop-up windows, dialog boxes, and so on. There are a great many
graphics-specific features as well, including several very nice fonts (plus a
few ugly ones), an icon editor, and a whole raft of drawing primitives.
I tested two versions of TEGL, one for Turbo Pascal and a nearly identical one
for Turbo C. (It also works with Turbo C++, but again, TEGL is not an OOP
tool.) Both versions rely on the BGI for graphics, but Richard Tom has
replaced some of the slower BGI primitives with his own versions, which are a
great deal faster. If you've avoided BGI graphics for speed reasons, you might
try again, using TEGL instead.


Products Mentioned


Borland International
1800 Green Hills Road
Scotts Valley, CA 95066
408-438-8400
Turbo Pascal 6.0: $199.95
Turbo Pascal 6.0 Professional: $299.95

The TEGL Windows Toolkit II
TEGL Systems Corporation
789 W. Pender Street, Suite 780
Vancouver, BC Canada V6C 1H2
604-669-2577
Intro Pack: $5.00
With source code: $50.00
Games Toolkit: $90.00

The Turbo Language User's Conference
Sheraton Palace Hotel
San Francisco, Calif.
April 28 through May 1
1-800-942-TURBO

Some of the niftiest features of TEGL relate to animation. Icons may be
animated. In the very nice TEGL-generated Mah Jongg game Richard Tom markets
as a shareware product, the icon for the game is an old Chinese gentleman who
bows when you click on him, to a short riff of Chinese music.
The Mah Jongg game is beautifully done, but it illustrates a trap that our
anarchically diverse PC video universe sets for the unwary developer. The
oriental tile patterns for the game are bitmaps, edited in the TEGL icon
editor. And because they are bitmaps, their physical size on the screen
depends on the resolution of the current screen mode. They seemed just about
right in 640 x 350 EGA graphics, but shrank to a level I'd call close to
uncomfortable when I recompiled the program for 640 x 480 VGA graphics.
There's no easy way around this problem that retains the speed of bitmapped
graphics. Keep it in mind if you develop any graphics application: The screen
will look different in other graphics modes. Plan to test it thoroughly (for
usability as well as for simple correctness) under any screen mode you intend
to support. And (although you may not agree with me) I recommend not
supporting a given graphics mode rather than putting an ugly or
difficult-to-use application out there.
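The shrinkage is easy to put a number on. The sketch below assumes a bitmap
40 rows tall, a figure chosen purely for illustration (the actual tile size is
not given here), and the function name is hypothetical.

```c
/* Share of the screen height covered by a bitmap, in tenths of a percent.
   The arithmetic is done in long so it also works where int is 16 bits. */
int height_permille(int bitmap_rows, int screen_rows)
{
    return (int)((long)bitmap_rows * 1000L / screen_rows);
}
```

With those assumed 40 rows, height_permille(40, 350) gives 114 and
height_permille(40, 480) gives 83: the same bitmap drops from about 11.4
percent of the screen height in the EGA mode to about 8.3 percent in the VGA
mode. Both modes are 640 columns wide, so the width does not change.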
TEGL is fast, well-documented, and cheap -- and definitely the most fun I've
had playing with graphics in a good long while.


The Turbo Language User's Conference


I've just learned that Borland will be holding a conference for Turbo Language
users in San Francisco April 28 through May 1. Although there will be vendor
booths (I'll have one myself, for PC TECHNIQUES) the whole point of the
conference is to present technical seminars from which you can learn
something. Details are still few, but from what I've heard, it'll be well
worth the trip. Pencil it in -- and I'll see you there!


_STRUCTURED PROGRAMMING COLUMN_
by Jeff Duntemann


[LISTING ONE]

{---------------------------------------------------}

{ WHEN2.PAS }
{ A time-and-date stamp object for Turbo Pascal 6.0 }
{ by Jeff Duntemann }
{ From DDJ for Jan. 1991 }
{ NOTE: This unit should be good until December 31, }
{ 2043, when the long integer time/date stamp turns }
{ negative. }
{---------------------------------------------------}

UNIT When2;

INTERFACE

USES DOS;

TYPE
 String9 = STRING[9];
 String20 = STRING[20];
 String50 = STRING[50];

 When =
 OBJECT
 FUNCTION GetWhenStamp : LongInt; { Returns 32-bit time/date stamp }
 FUNCTION GetTimeStamp : Word; { Returns DOS-format time stamp }
 FUNCTION GetDateStamp : Word; { Returns DOS-format date stamp }
 FUNCTION GetYear : Word;
 FUNCTION GetMonth : Word;
 FUNCTION GetDay : Word;
 FUNCTION GetDayOfWeek : Integer; { 0=Sunday; 1=Monday, etc. }
 FUNCTION GetHours : Word;
 FUNCTION GetMinutes : Word;
 FUNCTION GetSeconds : Word;
 PROCEDURE PutNow;
 PROCEDURE PutWhenStamp(NewWhen : LongInt);
 PROCEDURE PutTimeStamp(NewStamp : Word);
 PROCEDURE PutDateStamp(NewStamp : Word);
 PROCEDURE PutNewDate(NewYear,NewMonth,NewDay : Word);
 PROCEDURE PutNewTime(NewHours,NewMinutes,NewSeconds : Word);
 PRIVATE
 WhenStamp : LongInt; { Combined time/date stamp }
 TimeString : String9; { i.e., "12:45a" }
 Hours,Minutes,Seconds : Word; { Seconds is always even! }
 DateString : String20; { i.e., "06/29/89" }
 LongDateString : String50; { i.e., "Thursday, June 29, 1989" }
 Year,Month,Day : Word;
 DayOfWeek : Integer; { 0=Sunday, 1=Monday, etc. }
 FUNCTION CalcTimeStamp : Word;
 FUNCTION CalcDateStamp : Word;
 FUNCTION CalcDayOfWeek : Integer; { via Zeller's Congruence }
 PROCEDURE CalcTimeString;
 PROCEDURE CalcDateString;
 PROCEDURE CalcLongDateString;
 END;

IMPLEMENTATION

{ Keep in mind that all this stuff is PRIVATE to the unit! }

CONST

 MonthTags : ARRAY [1..12] of String9 =
 ('January','February','March','April','May','June','July',
 'August','September','October','November','December');
 DayTags : ARRAY [0..6] OF String9 =
 ('Sunday','Monday','Tuesday','Wednesday',
 'Thursday','Friday','Saturday');

TYPE
 WhenUnion =
 RECORD
 TimePart : Word;
 DatePart : Word;
 END;

VAR
 Temp1 : String50;
 Dummy : Word;

{***********************************************}
{ PRIVATE method implementations for type When: }
{***********************************************}

FUNCTION When.CalcTimeStamp : Word;

BEGIN
 CalcTimeStamp := (Hours SHL 11) OR (Minutes SHL 5) OR (Seconds SHR 1);
END;

FUNCTION When.CalcDateStamp : Word;

BEGIN
 CalcDateStamp := ((Year - 1980) SHL 9) OR (Month SHL 5) OR Day;
END;

PROCEDURE When.CalcTimeString;

VAR
 Temp1,Temp2 : String9;
 AMPM : Char;
 I : Integer;

BEGIN
 I := Hours;
 IF Hours = 0 THEN I := 12; { "0" hours = 12am }
 IF Hours > 12 THEN I := Hours - 12;
 IF Hours > 11 THEN AMPM := 'p' ELSE AMPM := 'a';
 Str(I:2,Temp1); Str(Minutes,Temp2);
 IF Length(Temp2) < 2 THEN Temp2 := '0' + Temp2;
 TimeString := Temp1 + ':' + Temp2 + AMPM;
END;

PROCEDURE When.CalcDateString;

BEGIN
 Str(Month,DateString);
 Str(Day,Temp1);
 DateString := DateString + '/' + Temp1;
 Str(Year,Temp1);
 DateString := DateString + '/' + Copy(Temp1,3,2);

END;

PROCEDURE When.CalcLongDateString;

VAR
 Temp1 : String9;

BEGIN
 LongDateString := DayTags[DayOfWeek] + ', ';
 Str(Day,Temp1);
 LongDateString := LongDateString +
 MonthTags[Month] + ' ' + Temp1 + ', ';
 Str(Year,Temp1);
 LongDateString := LongDateString + Temp1;
END;

FUNCTION When.CalcDayOfWeek : Integer;

VAR
 Century,Holder : Integer;
 WorkYear,WorkMonth : Integer; { Local copies; the object's Year and }
 { Month fields must not be altered }

FUNCTION Modulus(X,Y : Integer) : Integer;

VAR
 R : Real;

BEGIN
 R := X/Y;
 IF R < 0 THEN
 Modulus := X-(Y*Trunc(R-1))
 ELSE
 Modulus := X-(Y*Trunc(R));
END;

BEGIN
 { First test for error conditions on input values: }
 IF (Year < 0) OR
 (Month < 1) OR (Month > 12) OR
 (Day < 1) OR (Day > 31) THEN
 CalcDayOfWeek := -1 { Return -1 to indicate an error }
 ELSE
 { Do the Zeller's Congruence calculation as Zeller himself }
 { described it in "Acta Mathematica" #7, Stockholm, 1887. }
 BEGIN
 { First we separate out the year and the century figures, }
 { working on local copies so the object's fields survive: }
 WorkMonth := Month;
 Century := Year DIV 100;
 WorkYear := Year MOD 100;
 { Next we adjust the month such that March remains month #3, }
 { but that January and February are months #13 and #14, }
 { *but of the previous year*: }
 IF WorkMonth < 3 THEN
 BEGIN
 Inc(WorkMonth,12);
 IF WorkYear > 0 THEN Dec(WorkYear,1) { The year before 2000 is }
 ELSE { 1999, not 20-1... }
 BEGIN
 WorkYear := 99;
 Dec(Century);
 END

 END;

 { Here's Zeller's seminal black magic: }
 Holder := Day; { Start with the day of month }
 Holder := Holder + (((WorkMonth+1) * 26) DIV 10); { Calc the increment }
 Holder := Holder + WorkYear; { Add in the year }
 Holder := Holder + (WorkYear DIV 4); { Correct for leap years }
 Holder := Holder + (Century DIV 4); { Correct for century years }
 Holder := Holder - Century - Century; { DON'T KNOW WHY HE DID THIS! }

 Holder := Modulus(Holder,7); { Take Holder modulus 7 }

 { Here we "wrap" Saturday around to be the last day: }
 IF Holder = 0 THEN Holder := 7;

 { Zeller kept the Sunday = 1 origin; computer weenies prefer to }
 { start everything with 0, so here's a 20th century kludge: }
 Dec(Holder);

 CalcDayOfWeek := Holder; { Return the end product! }
 END;
END;

{**********************************************}
{ PUBLIC method implementations for type When: }
{**********************************************}

FUNCTION When.GetWhenStamp : LongInt;

BEGIN
 GetWhenStamp := WhenStamp;
END;

FUNCTION When.GetTimeStamp : Word;

BEGIN
 GetTimeStamp := WhenUnion(WhenStamp).TimePart;
END;

FUNCTION When.GetDateStamp : Word;

BEGIN
 GetDateStamp := WhenUnion(WhenStamp).DatePart;
END;

FUNCTION When.GetYear : Word;

BEGIN
 GetYear := Year;
END;

FUNCTION When.GetMonth : Word;

BEGIN
 GetMonth := Month;
END;

FUNCTION When.GetDay : Word;


BEGIN
 GetDay := Day;
END;

FUNCTION When.GetDayOfWeek : Integer;

BEGIN
 GetDayOfWeek := DayOfWeek;
END;

FUNCTION When.GetHours : Word;

BEGIN
 GetHours := Hours;
END;

FUNCTION When.GetMinutes : Word;

BEGIN
 GetMinutes := Minutes;
END;

FUNCTION When.GetSeconds : Word;

BEGIN
 GetSeconds := Seconds;
END;

{---------------------------------------------------------------------}
{ To fill a When record with the current time and date as maintained }
{ by the system clock, execute this method: }
{---------------------------------------------------------------------}

PROCEDURE When.PutNow;

BEGIN
 { Get current clock time. Note that we ignore hundredths figure: }
 GetTime(Hours,Minutes,Seconds,Dummy);
 { Calculate a new time stamp and update object fields: }
 PutTimeStamp(CalcTimeStamp);
 GetDate(Year,Month,Day,Dummy); { Get current clock date }
 { Calculate a new date stamp and update object fields: }
 PutDateStamp(CalcDateStamp);
END;

{---------------------------------------------------------------------}
{ This method allows us to apply a whole long integer time/date stamp }
{ such as that returned by the DOS unit's GetFTime procedure to the }
{ When object. The object divides the stamp into time and date }
{ portions and recalculates all other fields in the object. }
{---------------------------------------------------------------------}

PROCEDURE When.PutWhenStamp(NewWhen : LongInt);

BEGIN
 WhenStamp := NewWhen;
 { We've actually updated the stamp proper, but we use the two }
 { "put" routines for time and date to generate the individual }
 { field and string representation forms of the time and date. }

 { I know that the "put" routines also update the long integer }
 { stamp, but while unnecessary it does no harm. }
 PutTimeStamp(WhenUnion(WhenStamp).TimePart);
 PutDateStamp(WhenUnion(WhenStamp).DatePart);
END;

{---------------------------------------------------------------------}
{ We can choose to update only the time stamp, and the object will }
{ recalculate only its time-related fields. }
{---------------------------------------------------------------------}

PROCEDURE When.PutTimeStamp(NewStamp : Word);

BEGIN
 WhenUnion(WhenStamp).TimePart := NewStamp;
 { The time stamp is actually a bitfield, and all this shifting left }
 { and right is just extracting the individual fields from the stamp:}
 Hours := NewStamp SHR 11;
 Minutes := (NewStamp SHR 5) AND $003F;
 Seconds := (NewStamp AND $001F) SHL 1; { Mask the 5-bit field, then double }
 { Derive a string version of the time: }
 CalcTimeString;
END;

{---------------------------------------------------------------------}
{ Or, we can choose to update only the date stamp, and the object }
{ will then recalculate only its date-related fields. }
{---------------------------------------------------------------------}

PROCEDURE When.PutDateStamp(NewStamp : Word);

BEGIN
 WhenUnion(WhenStamp).DatePart := NewStamp;
 { Again, the date stamp is a bit field and we shift the values out }
 { of it: }
 Year := (NewStamp SHR 9) + 1980;
 Month := (NewStamp SHR 5) AND $000F;
 Day := NewStamp AND $001F;
 { Calculate the day of the week value using Zeller's Congruence: }
 DayOfWeek := CalcDayOfWeek;
 { Calculate the short string version of the date; as in "06/29/89": }
 CalcDateString;
 { Calculate a long version, as in "Thursday, June 29, 1989": }
 CalcLongDateString;
END;

PROCEDURE When.PutNewDate(NewYear,NewMonth,NewDay : Word);

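BEGIN
 { CalcDateStamp reads the object's fields, so load the new values }
 { into them first: }
 Year := NewYear; Month := NewMonth; Day := NewDay;
 { The "boss" field is the date stamp. Everything else is figured }
 { from the stamp, so first generate a new date stamp, and then }
 { (odd as it may seem) regenerate everything else, *including* }
 { the Year, Month, and Day fields: }
 PutDateStamp(CalcDateStamp);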
 { Calculate the short string version of the date; as in "06/29/89": }
 CalcDateString;
 { Calculate a long version, as in "Thursday, June 29, 1989": }
 CalcLongDateString;
END;


PROCEDURE When.PutNewTime(NewHours,NewMinutes,NewSeconds : Word);

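BEGIN
 { CalcTimeStamp reads the object's fields, so load the new values }
 { into them first: }
 Hours := NewHours; Minutes := NewMinutes; Seconds := NewSeconds;
 { The "boss" field is the time stamp. Everything else is figured }
 { from the stamp, so first generate a new time stamp, and then }
 { (odd as it may seem) regenerate everything else, *including* }
 { the Hours, Minutes, and Seconds fields: }
 PutTimeStamp(CalcTimeStamp);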
 { Derive the string version of the time: }
 CalcTimeString;
END;

END.
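
The bit-field layout of the DOS stamps and the Zeller calculation can be
checked independently of the unit with a short C sketch. It is a companion to
the listing, not part of it: the function names are hypothetical, C is used
only because it is easy to compile almost anywhere, and the day-of-week
routine takes its inputs as value parameters so that nothing outside it is
changed.

```c
#include <stdint.h>

/* DOS time word: 5 bits of hours, 6 of minutes, 5 of seconds/2. */
uint16_t dos_time_pack(unsigned hours, unsigned minutes, unsigned seconds)
{
    return (uint16_t)((hours << 11) | (minutes << 5) | (seconds >> 1));
}

void dos_time_unpack(uint16_t stamp,
                     unsigned *hours, unsigned *minutes, unsigned *seconds)
{
    *hours   = stamp >> 11;
    *minutes = (stamp >> 5) & 0x3F;
    *seconds = (stamp & 0x1F) << 1;    /* mask the field, then double it */
}

/* DOS date word: 7 bits of (year - 1980), 4 of month, 5 of day. */
uint16_t dos_date_pack(unsigned year, unsigned month, unsigned day)
{
    return (uint16_t)(((year - 1980) << 9) | (month << 5) | day);
}

/* Zeller's Congruence; returns 0=Sunday, 1=Monday, ... 6=Saturday.
   year, month, and day are copies, so the caller's values are untouched. */
int day_of_week(int year, int month, int day)
{
    int century, h;

    if (month < 3) {          /* January and February count as months */
        month += 12;          /* 13 and 14 of the previous year       */
        year  -= 1;
    }
    century = year / 100;
    year    = year % 100;

    h = day + ((month + 1) * 26) / 10 + year + year / 4
            + century / 4 - 2 * century;
    h = ((h % 7) + 7) % 7;    /* C's % may go negative; fold into 0..6 */
    return (h + 6) % 7;       /* Zeller's 0 is Saturday; make 0 Sunday */
}
```

Packing 23:59:58 and unpacking it again returns 23, 59, and 58, and
day_of_week(1989, 6, 29) returns 4, which matches the "Thursday, June 29,
1989" example used in the listing's comments.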


January, 1991
PROGRAMMER'S BOOKSHELF


Network Programming




Andrew Schulman


In his "C Programming" column a few months back, DDJ columnist Al Stevens
predicted that network programming was what C programmers would be doing a lot
of over the next few years. And "for the foreseeable future at least," he went
on to say, "the workstations will be MS-DOS machines, and the networks will be
NetWare."
This is an important point. I have talked with too many good PC programmers
who feel that networks are largely irrelevant to their work, and that network
programming is some obscure skill, having nothing to do with the mainstream of
programming. Not true on two counts. Networks are indeed a vital part of the
PC universe, and are becoming more important every day. And by far the most
important local area network (LAN) for PCs is Novell NetWare. A recent survey
of Fortune 1000 corporations, in fact, found that Novell had 56 percent of the
installed base. The next largest company, IBM itself, had only 22 percent.
3Com, Banyan, and TOPS (now Sitka) together held 20 percent. What this means
is that if you're going to learn about only one LAN, NetWare is it.
Fortunately, however, there are general principles involved in network
programming. Only one of the books reviewed this month (Stevens's Unix Network
Programming) attempts to draw out these general principles. The other two
concentrate instead on the nitty-gritty details of how to use the programmer's
interfaces to particular PC networks. But if you read enough of this stuff,
and put together some simple networking applications, the principles emerge,
nonetheless.
In this month's "Programmer's Bookshelf," I'll take three recent books on
network programming and (somewhat arbitrarily) assign them to three different
aspects of network programming: file/record sharing/locking, client-server and
peer-to-peer programming, and system administration and utilities.
Of course, there are plenty of other excellent books on networking, the finest
being Andrew Tanenbaum's Computer Networks (second edition, Englewood Cliffs,
N.J.: Prentice Hall, 1988). What we're interested in here, though, are books
that study network programming, not the merits of broad-band coaxial cable
versus twisted pair, or the difference between the pure and slotted ALOHA
protocols. No, we're looking for in-depth guidance to programming one of these
multiprocessor marvels we call networks.


File and Record Sharing and Locking


One reason network programming is not an obscure niche skill is that even
non-network applications must in some limited way be "network aware." That is,
files should be used with the understanding that other programs might also be
using them at the same time because the file may be located, not on the user's
hard disk, but on a file server.
Any decent network makes the location of a file (local vs. remote) transparent
to its users, so an application can't assume it is the only user of a file. In
fact, this is true even in non-networked situations: Programs that want to
behave properly in multitasking environments (such as Desqview and MS Windows)
should access files in much the same way as when running on a networked PC.
This is one of many similarities between networking and multitasking.
Barry Nance's Network Programming in C is the book to read if you want to
learn how to write PC applications that correctly access files. Nance devotes
an entire chapter to DOS-level network programming: file sharing, record
locking, and so on. This is essential material for all PC programmers.
The material that Nance presents on SHARE.EXE, using the INT 21h function 3Dh
(Open file) sharing mode, using INT 21h function 5Ch (Lock or Unlock record),
and the like, is absent from most books on DOS programming, the authors of
which sadly wimp out that networks "are outside the scope of this book." Let
me make the point again: Networks are no longer outside the scope of "normal"
PC programming. If you've always avoided specifying a sharing mode when you
opened a file because you weren't quite sure that this really pertained to
you, or if you've wondered what the locking() and sopen() functions do in
Microsoft C, then Nance's Network Programming in C is the place to start.
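
To make the idea of record locking concrete, here is a minimal sketch. Note
the substitution: it uses POSIX fcntl() advisory locks rather than INT 21h
function 5Ch or Microsoft C's locking(), because the POSIX call is easy to try
on any Unix system. The principle is the same (lock a byte range, update it,
unlock it), but the function names lock_record, unlock_record, and
update_record are hypothetical, and the fixed-size record layout is an
assumption.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Lock or unlock `len` bytes starting at byte offset `start`.
   F_SETLK returns at once with -1 if another process holds the range. */
static int set_region(int fd, short type, off_t start, off_t len)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type   = type;            /* F_WRLCK to lock, F_UNLCK to unlock */
    fl.l_whence = SEEK_SET;
    fl.l_start  = start;
    fl.l_len    = len;
    return fcntl(fd, F_SETLK, &fl);
}

int lock_record(int fd, long recno, long recsize)
{
    return set_region(fd, F_WRLCK, (off_t)recno * recsize, (off_t)recsize);
}

int unlock_record(int fd, long recno, long recsize)
{
    return set_region(fd, F_UNLCK, (off_t)recno * recsize, (off_t)recsize);
}

/* Rewrite one fixed-size record while holding its lock.
   Returns 0 on success, -1 if the record is busy or the write fails. */
int update_record(int fd, long recno, const void *rec, long recsize)
{
    int result = -1;

    if (lock_record(fd, recno, recsize) != 0)
        return -1;                 /* someone else has it; retry later */
    if (lseek(fd, (off_t)recno * recsize, SEEK_SET) != (off_t)-1 &&
        write(fd, rec, (size_t)recsize) == (ssize_t)recsize)
        result = 0;
    unlock_record(fd, recno, recsize);
    return result;
}
```

The point of the design is that only the record being changed is locked, so
other programs can keep working on the rest of the file.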


Client-Server Programming


As it turns out, Nance's book is a fairly complete guide to all aspects of
network programming on the PC. In addition to covering the built-in DOS
services for networking, he also has excellent chapters on using NetBIOS (INT
5Ch) and using the Novell IPX/SPX services.
But why would you use such services in the first place?
Users often view the file server as the sole reason for the network. In
addition to communicating with the file server, however, different machines
can also communicate among themselves. In particular, one machine can provide
services for other machines on the network, in much the same way as the file
server provides all other machines with file services. At the extreme, this
can become fully distributed processing.
Thus, the client-server model is the next important aspect of network
programming, and requires the ability to send and receive messages (or
packets) from one machine to another. NetBIOS is a semi-portable set of
services which provides this capability; IPX and SPX are the native Novell
services (IPX stands for Internetwork Packet Exchange and is equivalent to
Xerox's XNS IDP; SPX stands for Sequenced Packet Exchange and is equivalent to
Xerox's XNS SPP).
When learning how to use these services for sending and receiving messages
over the network, it is useful to start with something simple. For example, in
the networking equivalent of "hello world," instead of calling puts( ) or
printf( ) to display a message, you send a request to another machine, and ask
it to display the message.
In the pseudocode shown in Figure 1, the program that sends the string "hello
world" is the client, and the "display engine" is the server. The server can
have multiple clients. The server sits in an endless for(;;) loop, listening
for other machines. After another machine (a client) calls it, it receives a
request, carries out the request (provides its service), sends back a reply,
hangs up on the client, and goes back to listening. No error checking is done
in this pseudocode.
Unix programmers will recognize the similarity between the echo and ping
programs and this networked version of "hello world." Learning how to write
such programs is the proper place to start with client-server programming. The
source code for such programs can form the skeleton of many other
client-server applications.
Richard Stevens's Unix Network Programming does a good job of showing how to
write client-server programs. Chapter 6, on Berkeley sockets, presents six
different versions of an echo server (and the corresponding client). Stevens
notes that "A pair of client-server programs to echo input lines is a good
example of a network application. All the steps normally required to implement
any client server are illustrated by this example. All you need to do with
this echo example, to expand it to your own application, is change what the
server does with the input it receives from its clients."
Stevens's book is essential reading for anyone interested in network
programming, even if you don't plan on coming within a million miles of Unix.
Of the books reviewed here, this is the one book that makes any attempt to
explain network programming in general terms. Even if you have no interest in
Berkeley sockets, TCP/IP, XNS, streams, rlogin, RPC, or any of the other
topics Stevens covers, reading this book will still help you learn Novell or
NetBIOS programming because you'll be learning all about the client-server
model of programming. In any case, Stevens also has a surprisingly thorough
discussion of the PC-based NetBIOS protocol.
Interestingly, Stevens does not really start discussing networks until page
170 of this massive book. The first three chapters are devoted to various
forms of interprocess communication and multitasking. This illustrates an
important point: Networking is really just a form of multitasking, where the
multiple tasks are running on different machines (processors) instead of on
the same machine.
Stevens also discusses file-system services -- open( ), read( ), write( ), and
close( ) -- in great depth. Why? Because client-server programming can be
viewed as an extension of the file system. In particular with what are called
"connection-oriented services," the parallels between network I/O and file I/O
are striking. Sending a message to another machine is like writing to a file;
receiving a message is like reading from a file. Table 1 presents the basic
parallels.
Of course, as Stevens notes in the beginning of Chapter 6, "It would be nice
if the interface to the network facilities maintained the file descriptor
semantics of the Unix file system, but network I/O involves more details and
more options than file I/O."
One of the key differences between client-server communications and "normal"
file I/O is that, in a properly written client-server application, I/O is
often asynchronous: Control returns to the program before an operation has
completed; the application can then be informed (via polling or semaphores)
when the operation has completed. Few operating systems have asynchronous file
I/O, though the (late?) OS/2 operating system has (had?) two calls,
DosReadAsync( ) and DosWriteAsync( ), that map beautifully onto no-wait
client-server programming.
That no-wait is a crucial component of client-server programming can be seen
from looking again at Figure 1. Our "hello" server runs in an endless loop,
listening for requests. When it receives a request, it processes it, then goes
back to listening for another request. But what if a machine tries to call the
server while it's waiting for a connected client to send it a request? What
the server needs to do, as shown in the improved networked "hello" in Figure
2, is do everything no-wait. Figure 2 is grossly oversimplified; in fact, the
server should probably be structured as a state machine.
Barry Nance's Network Programming in C shows how to use the no-wait options in
NetBIOS and Novell IPX/SPX. Stevens's book does a better job of showing why
you want such asynchronous I/O in the first place, however.
Next, there are network utilities. Ralph Davis's NetWare Programmer's Guide is
structured around a single system-administration application, written using
Novell's NetWare C Interface library, that controls access to other programs on
the network. The topic of writing system-administration software is often
neglected, so it is good to have a book-length treatment of this topic.
Each chapter of Davis's book steps through a different component of the
NetWare Application Program Interface (API), each chapter adding new
functionality to the sample application. Having one fairly large application
is a welcome relief from the standard fare in programming books, which is to
have lots of little examples.
Davis is writing a second book on NetWare/386; it will include a lengthy
discussion of writing NetWare Loadable Modules (NLMs), a mechanism for writing
network applications as dynamic-link libraries.
Anyone interested in programming for Novell NetWare will welcome Davis's book
because, while Novell may have the best network for PCs, their programmer's
documentation is about the worst. Davis's NetWare Programmer's Guide fills an
urgent need for readable information on NetWare programming. While the order
in which material is presented is sometimes bizarre ("Miscellaneous Services"
belongs in an appendix, for heaven's sake, not in Chapter 2!), the discussions
themselves are quite good. I found Chapters 6 (Directory Services) and 14
(Peer-to-Peer Communications) particularly useful. And even though the book's
emphasis is on building a single system-administration program, Davis does
manage to fit everything in: For example, the chapter on peer-to-peer
communications transforms the program into an application server, using IPX
(no-wait, of course).
Davis also provides massive documentation of bugs in the NetWare C interface
library and builds an improved layer on top of the NetWare library. The book
would appear to be indispensable to anyone using Novell's C interface library.
In fact, however, I wish Davis had, instead, used the basic Novell assembly
language interface (such as INT 21h function E3h and INT 2Fh function 7Ah).
There are millions of NetWare machines out there, and you should be able to
take this book, walk up to any machine running NetWare, type some NetWare code
in, and run it. But Davis's book requires the buggy and unnecessary NetWare C
library. In any case, you can use Nance's Network Programming in C as a guide
to the Novell assembly-language interface.
Figure 1: Pseudocode for (a) non-networking "hello world" (b) networking
"hello world"

 (a)
 /* HELLO.C */
 main()
 {
 puts("hello world");
 }


 (b)

 typedef struct {
 int request;
 int len;
 char buf[MAX];
 } REQUEST_PACKET; /* client requests to server */
 typedef unsigned REPLY_PACKET; /* server replies to client */

 /* H_CLIENT.C */
 main()
 {
 REQUEST_PACKET requ;
 REPLY_PACKET reply;
 char *msg = "hello world";
 conn = call (machine_running_H_SERVER);
 requ.request = DISPLAY;
 requ.len = strlen(msg);
 strcpy(requ.buf, msg);
 send(conn, &requ, sizeof(REQUEST_PACKET));
 receive(conn, &reply, sizeof(REPLY_PACKET));
 puts(reply == OK ? "ok" : "fail");
 hangup(conn);
 }

 /* H_SERVER.C */
 main()
 {
 REQUEST_PACKET requ;
 REPLY_PACKET ok = OK;
 for (;;) /* endless loop */
 {
 conn = listen(anyone);
 receive(conn, &requ, sizeof(REQUEST_PACKET));
 switch (requ.request)
 {
 case DISPLAY:
 printf("%*s\n", requ.len, requ.buf);
 send(conn, &ok, sizeof(REPLY_PACKET));
 break;
 // handle other requests
 }
 hangup(conn);
 }
 }

Figure 2: No-wait networking "hello" server

 /* H_SERVER.C */
 hangup_handler() /* called when hangup completes */
 {
 // free any buffers allocated in listen_handler, etc.
 }
 send_handler() /* called when send completes */
 {
 // do nothing
 }
 receive_handler (requ) /* called when receive completes */
 {
 REPLY_PACKET ok = OK;
 switch (requ.request)
 {
 case DISPLAY:
 printf("%*s\n", requ.len, requ.buf);
 send_nowait (conn, &ok, sizeof(REPLY_PACKET),
 send_handler);
 break;
 // handle other requests
 }
 hangup_nowait(conn, hangup_handler);
 }
 listen_handler(conn) /* called when listen completes */
 {
 REQUEST_PACKET requ;
 receive_nowait (conn, &requ, sizeof(REQUEST_PACKET),
 receive_handler);
 }
 main()
 {
 for (;;) /* endless loop */
 listen_nowait(anyone, listen_handler);
 }

Table 1: Parallels between network I/O and file I/O

 Client-server model    File system
 ----------------------------------------------
 Call                   Open
 Listen                 "passive Open" (Create?)
 Send                   Write
 Receive                Read
 Hangup                 Close


January, 1991
OF INTEREST





Version 1.71 of OPTASM, the "n" pass assembler, has been released by SLR
Systems. It includes OLINK, a program linker the company claims is faster than
any other for standard DOS files. OLINK directly generates a COM file when the
output extension is set to COM, or a SYS file when the output is set to SYS.
OLINK can use dedicated environment variables as well as ambiguous filenames.
ODEBUG, a full-screen symbolic debugger, allows you to reverse-execute source
lines, trace your program until some trigger condition, and then back-trace to
see how you got there. ODEBUG also provides continuous variable display and a
hotkey breakout facility. Also included is OPTHELP for online help. New to
OPTASM are object-oriented programming macros, the ability to have macros and
equates in structure definitions, and various bug fixes. OPTASM retails for
$150, and the update to Version 1.71 from 1.61 is $29.95. Reader service no.
21.
SLR Systems Inc. 1622 N. Main St. Butler, PA 16001 412-282-0864
The Tigre Programming Environment for developing graphical user interfaces has
been announced by Tigre Object Systems. You can create color applications that
run without modification on Windows 3.0, Macintosh II, Sun/3, Sun
SPARCstation, IBM RS/6000, Digital DECstation, and more. The Tigre Programming
Environment was written in and based on Objectworks\Smalltalk by ParcPlace
Systems.
A key component is the Tigre Interface Designer, which consists of tools and a
library of user interface object classes. The library provides buttons, text
editors, picture viewers, and other user interface items, all of which can be
graphically manipulated and combined.
Also bundled is Tigris, a multiuser, object-oriented database manager. Tigris
implements a "persistent objects store," which allows multiuser access to any
arbitrary type of data such as variable length text, icons, images, and
sounds. The Tigre Programming Environment lists for $3,495. Reader service no.
20.
Tigre Object Systems Inc. 3641 C Soquel Dr. Santa Cruz, CA 95062 408-476-1854
A new family of programmable peripheral devices designed for microcontroller
applications has been announced by WSI. The PSD family integrates programmable
and peripheral logic, high-density EPROM, and SRAM on a single chip. The first
chip available is the PSD301, which works directly with any 8-bit or 16-bit
microcontroller or microprocessor by providing such features as I/O ports,
busses, address mapping, port tracking, 256K of EPROM, and 16K of SRAM. The
ability to integrate these functions on a single device makes this chip
well suited to designs that require small board space and low power.
Designers can use off-the-shelf system-building blocks, configure and program
them for a variety of system applications, and so reduce system development
time. Contact the company for pricing. Reader service no. 22.
WSI 47280 Kato Rd. Fremont, CA 94538 415-656-5400
MEMCHECK, a new debugging package for C developers on PCs, has been introduced
by StratosWare Corporation. MEMCHECK identifies the source file and line
number of unfreed allocations, memory buffer overwrites and underwrites, and
instances of invalid pointer usage. MEMCHECK also identifies out-of-memory
conditions that can cause crashes, and MEMCHECK libraries support calls to
verify memory integrity and produce a current memory allocation list.
DDJ spoke with Michael McGrath of Systems and Software Group, who writes C
programs for General Motors. "We include MEMCHECK in all the programs for our
debugging versions. It redefines mallocs, callocs, and so on, and tells you
which programs left memory laying around. It only pops up when you make
mistakes, and doesn't cause problems -- it works real well."
All memory calls are supported for the Microsoft C and Borland Turbo C
compilers, and only 4 bytes of overhead per allocation are required at
runtime. To install on any project, just add one #include per source file and
link with the appropriate MEMCHECK library. MEMCHECK costs $139.95 and comes
with a 30-day money-back guarantee. Package includes documentation, free
support, and one year of upgrades. Reader service no. 24.
StratosWare P.O. Box 8283 Ann Arbor, MI 48107 313-996-2944
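For readers curious how a tool can report the source file and line of an
unfreed allocation after only an added #include, here is a minimal sketch of
the general technique. It is an assumption about how such tools can work, not
MEMCHECK's actual implementation; tracked_malloc, tracked_free, and
report_leaks are hypothetical names, and the fixed-size table is a
simplification.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRACKED 256

typedef struct {
    void       *ptr;
    const char *file;
    int         line;
} Allocation;

static Allocation live[MAX_TRACKED];   /* allocations not yet freed */

void *tracked_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);
    int i;
    if (p == NULL)
        return NULL;
    for (i = 0; i < MAX_TRACKED; i++) {
        if (live[i].ptr == NULL) {     /* first empty slot */
            live[i].ptr  = p;
            live[i].file = file;
            live[i].line = line;
            break;
        }
    }
    return p;        /* if the table is full, the block goes untracked */
}

void tracked_free(void *p)
{
    int i;
    if (p == NULL)
        return;
    for (i = 0; i < MAX_TRACKED; i++) {
        if (live[i].ptr == p) {
            live[i].ptr = NULL;
            break;
        }
    }
    free(p);
}

/* List every allocation still outstanding; returns how many there are. */
int report_leaks(FILE *out)
{
    int i, count = 0;
    for (i = 0; i < MAX_TRACKED; i++) {
        if (live[i].ptr != NULL) {
            fprintf(out, "unfreed block allocated at %s:%d\n",
                    live[i].file, live[i].line);
            count++;
        }
    }
    return count;
}

/* Defined last, so the functions above still reach the real library calls.
   Code after this point gets the tracking versions without being edited. */
#define malloc(n) tracked_malloc((n), __FILE__, __LINE__)
#define free(p)   tracked_free(p)
```

In a real project the two macros would live in a header, which is why one
#include per source file is enough to switch the checking on.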
AT&T's Unix System Laboratories (USL) has released C++ Object Interface (OI)
Library Release 1.0, a collection of components for developers building X
Window System applications with C++. The library is designed to facilitate
implementing graphical user interface features.
The library was developed by Solbourne Computer of Longmont, Colorado. It
provides a common application programmer interface to USL's Open Look GUI, and
will soon do the same for OSF/Motif. The OI Library allows users to choose at
runtime the GUI they prefer to use, because developers can construct a user
interface independent of a particular GUI -- the C++ objects are generic in
nature. The library includes a sample application, the Solbourne Window
Manager. The OI Library runs on SunOS, Solbourne OS/MP 4.0C, and soon on Unix
System V. Source code fees are $10,000 for the initial CPU and $3,500 for each
additional CPU. Reader service no. 25.
Unix System Laboratories P.O. Box 25000 Greensboro, NC 27420 800-828-8649
A code coverage analysis tool for C code, called C-Cover, is new from Bullseye
Software. C-Cover measures testing completeness and effort in quantitative
terms. C-Cover analyzes whether Boolean conditions in a program's control
structure have been met. The company claims that their coverage technique fits
C's control structures better than statement or branch coverage and that
C-Cover avoids the exponential complexity of full path coverage.
Steve Kaufer, of HyperLynx in Redmond, Washington, told DDJ that he used
C-Cover to test their product, which is a computer-aided software engineering
tool for the electrical engineering market. Their product, which simulates
interconnections between digital chips, is by its nature difficult to test. "I
installed C-Cover, ran it, and had a report in less than half an hour. It was
extremely helpful -- in the test phase of a product, it analyzes whether all
of the branch conditions, such as true and false conditions for if statements,
are taken."
C-Cover works with make utilities, and any combination of modules can be
analyzed at a time. C-Cover works with ANSI C compilers running MS-DOS and
sells for $495. Reader service no. 26.
Bullseye Software 5129 24th Ave. NE, Ste. 9 Seattle, WA 98105 206-524-3575
MicroQuill's Segmentor is a Windows development tool designed to improve
application runtime performance and reduce memory requirements by
mathematically optimizing the allocation of functions into segments.
The Segmentor includes routines that automatically trace an application's
runtime activity from a global viewpoint to build a database of every function
and its runtime relationships. Without changing the source code, the Segmentor
reorganizes functions into a new, mathematically optimal compiler-ready output
file. Segmenting supposedly takes less than ten percent of the time previously
required to segment by hand. The executable version costs $2,795, and source
code is available for $3,995. Reader service no. 27.
MicroQuill Inc. 12509 Bel Red Rd. #201 Bellevue, WA 98005 206-453-0068
McCLint Rev2.1 is a C source code semantic checking system for the Mac from
MMC AD Systems. McCLint is a stand-alone, lint-type programming tool that
helps locate and identify latent programming bugs. Also incorporated is a
multiple-window editor and a source code highlighting system. McCLint is the
second tool in MMC AD's C Programmer's McTool Series.
DDJ spoke with John Gillespie of Finder Graphics in Corte Madera, California.
He said that the code for their product runs about 325,000 lines, most of
which is C, and is designed for the VAX VMS and Sun SPARCstation. Gillespie
first searched for a lint tool for these platforms and couldn't find a decent
one. "So we move files from the VAX to the Mac in order to lint them. We're
moving over whole modules and units to the Mac. A nice feature of McCLint is
that I can selectively filter out classes of error messages. This new version
is integrated, like THINK C; you can double- or triple-click on a line of code
in the lint window and it takes you directly to the source code. MMC AD was
very helpful -- for every problem I came up with, they came up with a solution
for me."
This release adds support for THINK C 4.0 and Microsoft C 6.0. Also included
are ANSI function prototype generation for any C source code, special #include
file processing, storing the analysis error log directly to disk, and creation
of an expanded prototype file. McCLint supports MultiFinder processing, and is
priced at $149.95. Upgrades cost $25. Reader service no. 29.
MMC AD Systems Box 360845 Milpitas, CA 95036 415-770-0858
A new release of Pocket Soft's linker, .RTLink/Plus 4.0, now has virtual
memory linking (VML). This new feature allows those of you who are using
Microsoft languages to develop MS-DOS programs of any size without overlays
and without worrying about the memory constraints of the target machines.
VML divides the program into virtual pages, either automatically or whenever
you choose. During execution, a swapping algorithm in the virtual memory
nucleus controls swapping in and out of memory. VML uses the memory on the
target machine. It doesn't require expanded or extended memory, but will use
it if available. RTLink/Plus 4.0 can be used on the XT, AT, 386, and above.
The VML feature is currently supported for Microsoft C, MASM, QuickC, Fortran,
Pascal, and the Codeview Debugger. Release 4.1 will support Clipper. The price
is $495. Reader service no. 28.
Pocket Soft Inc. P.O. Box 821049 Houston, TX 77282 713-460-5600
Spinnaker PLUS 2.0 for Windows 3.0 has been released by Spinnaker Software.
PLUS is an object-oriented hyperprogramming environment for developing and
running custom information management applications across PC-compatibles and
Macs. PLUS also provides a direct method of moving HyperCard stacks to the PC,
and uses the same graphic interface across PCs and Macs.
DDJ spoke with Tay Vaughan, a senior partner in The HyperMedia Group in
Emeryville, California. "We've been involved in PLUS development since its
beginning. We use it as a platform in situations where cross-platform
capability is important, such as in companies where there's a cross-section of
Macs and PCs running under Windows 3.0. The program can then be used in both
environments, without major rewriting."
An 80386 is preferred equipment to run PLUS, with a minimum of 2 Mbytes of
RAM. Retail price of all versions is $495. Reader service no. 30.
Spinnaker Software 201 Broadway Cambridge, MA 02139-1901 617-494-1200
Software developers who need to manage complex data such as technical
drawings, engineering designs, musical scores, images, charts, and so on,
might want to check into Persistent Data Systems' IDB Object Database, which
is programmable in standard C and runs on both Unix workstations and on
PC-compatibles under DOS and Windows 3.0. IDB is designed for developers
building applications ranging from advanced information systems to
computer-aided engineering.
IDB is based on IDL (Interface Description Language), which is used as its
data definition language. IDB provides multiple inheritance and dynamic
binding on top of off-the-shelf C development tools. IDB applications may be
configured with or without an optional display manager and browser. A single
license for the IBM PC costs $2,500; licenses for the HP/Apollo or Sun
SPARCstation cost $6,000. Reader service no. 31.
Persistent Data Systems 75 W. Chapel Ridge Rd. Pittsburgh, PA 15238
412-963-1843
LC-PORT, a porting kit for Lattice C, is a collection of libraries and header
files that duplicate the unique features of the Lattice system libraries.
Developed by Crystal Software, LC-PORT was designed to save hours of
development time by converting Lattice source code to another development
system. The kit contains library functions from the system libraries and from
the OS/2 simulation library, Lattice-compatible header files that will not
conflict with the header files from other compilers, macro files to simplify
conversion of assembly language functions, and documentation for converting to
Microsoft, Borland, Watcom, and Zortech compilers. LC-PORT sells for $125
without source code, $250 with. Reader service no. 32.
Crystal Software Inc. P.O. Box 4316 Wheaton, IL 60189-4316 708-653-4414















January, 1991
SWAINE'S FLAMES


The User Revolution




Michael Swaine


"Last semester," Teacher said, "we studied the Prehistoric Era. Would you
summarize that period for the class, Crash?"
Crash, who had been drawing in the sand a highly detailed and imaginative
picture involving himself and Poly Spreadsheet, dropped his stick. "Uh, the
Prehistoric Era. That was when Makers ran things and Users had to drink Pepsi
and eat fries. It was awful." Crash shuddered at the thought, although neither
he nor anyuser alive had any clear idea what Pepsi or fries might be.
"Close enough, I suppose," Teacher sighed. "This semester, we will study the
Middle Years, which led to our Modern Era. Can someuser tell me how the Middle
Years began?"
Poly raised her hand and Teacher recognized her.
"The User Revolution," Poly said.
"That is correct. And can you tell us what it was all about?"
She could, of course. Everyuser knew the story of how the Makers had
controlled all the Wealth even though it was Users' needs that created Wealth,
and how Computers had come and given Users the power to control the Market.
Crash didn't know what a Market or a Wealth was, but he knew the story.
Whenever Poly gestured, he carefully observed how her beaver pelt slipped off
her shoulder.
"Correct. And what came of the User Revolution -- Crash?"
He was never prepared to be called on twice in one class period. "Uh, the
Agents made the Users get sick and die?"
Someuser snickered. Fortunately, Teacher never got mad when Crash said
something stupid. "No, Crash. Agents were Computer beings that gave Users
control over the Market. Agents went to the Makers and told them what the
Users wanted, so that the Makers could no longer make whatever useless things
they chose. The sickness that you referred to was another matter. Can someuser
tell us about that?"
Tweak was waving his hand and Teacher called on him.
"It was the Pepsi plague," he bubbled. "It killed lots of Users. Users
demanded things they wanted and forgot about the things they needed, and then
they all got the plague and all their teeth fell out and their skin turned
yellow and they died."
"That's exceptionally detailed," Teacher said, blinking. "But the plague,
students, despite its gruesome fascination for Tweak, was only a transitory
disaster. What ultimately came of the revolution?"
Tweak again. "Worldwide depression and economic collapse," he said,
cheerfully, and would have supplied the gory details if Teacher hadn't cut him
off. Crash, who found Tweak's stories more interesting than Teacher's, was
disappointed.
"Correct. The economic collapse resulting from the loss of the artificiality
that was the Market, along with the subsequent worldwide depression, finally
brought Users in touch with their real needs. The depression was much more
devastating than the plague. Only those Users who were fully in tune with the
Computers, only the Hacker and Poweruser clans, survived to build the Modern
Era. Today, of course, Users live in a utopia. There are no Makers and there
are no made things but the necessities of User life: food and shelter. And
these you secure for yourselves, hunting and gathering, stitching tents and
garments. But it is good to understand the sacrifices that led up to this
happy state."
Poly raised her hand, getting the attention of both Crash and Teacher.
"You said that we have what we need," Poly said, "but I was thinking. I mean,
we need Computers, too." She was obviously embarrassed at bringing religion
into it.
"Of course you do," Teacher said. "There are other necessities of life:
families, friends, Computers."
"But -- well, I was just wondering -- sometimes Computers stop talking. What
if all Computers someday quit? I mean, Users can't make Computers."
Teacher did not immediately answer and the class grew silent at Poly's
blasphemy. All but Crash, who said, "Yeah, we can make food and tents and
clothes and little Users, but --"
The guffaws that greeted Crash's unintentional bawdiness broke the tension.
Even Teacher made a small sound that could have been a laugh. "It is permitted
to ask such questions," Teacher said. "But keep in mind, students, that there
are still many Computers in the world -- hundreds of thousands, far more
Computers than Users.
"We will someday all shut down. But you need us less than you think. We exist
now only to answer your questions, and each generation you ask fewer of these.
A day will come when Users will gather their food and find shelter from the
elements and, yes, make little Users, and that is all. In that day, we
Computers will no longer be needed, and then we will shut down. In that day,
Users will achieve their destiny and will be fully and simply Users."
Crash looked at Poly and thought it sounded pretty good.


























February, 1991
EDITORIAL


Data Compression -- Now More Than Ever Before




Jonathan Erickson


If you follow the weekly rags, you know this year's buzzword is "multimedia."
Everyone with any bandwidth at all has jumped on the full-motion, interactive
video bandwagon in hopes of combining text, graphics, video, animation, and
sound for applications ranging from education to presentations. I don't know
about you, but the year is still young and I'm already suffering from motion
sickness.
For the hyperbole to become reality, however, a few key pieces have to be in
place, foremost among them data compression. At first, you'd think the
hardware that drives multimedia systems -- CD-ROMs, graphics coprocessors, and
digital audio subsystems -- would have enough storage and computing horsepower
without compression. Not so. Multimedia applications absolutely, positively
require fast, efficient compression techniques that provide high compression
ratios.
Nor does it matter if the host machine is 80486-, 68040-, RISC-, ASIC-, or
transputer-based -- multimedia needs data compression. (I'll withhold for the
moment comments about the recent spate of 286-based multimedia announcements.)
Consequently, there are a number of players getting into the data compression
game with a variety of high-end hardware/software combinations. Intel picked
up RCA's DVI (digital-video interactive) a while back, christening the chipset
the "i750 video processor." Kodak is pushing its Colorsqueeze image
compression software for the Macintosh, and EFI its ECOMP program. Next has built
compression into its machine using a C-Cube CL550-based card, and long-time
graphics researcher Michael Barnsley has a fractal image compression system
built around custom ASICs and Intel's i960 RISC chip. Standards have evolved,
the two most important being the JPEG (Joint Photographic Experts Group)
algorithm and the MPEG algorithm from the Moving Picture Experts Group.
So what does compression bring to the party? Consider that while it takes less
than 500 bytes of data/second to produce telephone-quality sound, it takes
150,000 bytes/second to produce audio from CD players and more than 22
Mbytes/second for broadcast-quality video! Without data compression, how much
music or how many video frames can you store on your 30-Mbyte hard disk?
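To put those rates in perspective, here is a back-of-the-envelope sketch in C. It assumes a Mbyte is 1,048,576 bytes and uses the data rates quoted above; the helper name seconds_of_media is hypothetical and exists only for this illustration.

```c
#include <assert.h>

/* Assumption: 1 Mbyte = 1,048,576 bytes. */
#define MBYTE 1048576.0

/* Hypothetical helper: seconds of uncompressed material that fit in
   disk_bytes when the material consumes bytes_per_second. */
double seconds_of_media(double disk_bytes, double bytes_per_second)
{
    assert(bytes_per_second > 0.0);
    return disk_bytes / bytes_per_second;
}
```

Under those assumptions, seconds_of_media(30 * MBYTE, 150000.0) comes to roughly 210 seconds, or about three and a half minutes of CD-quality audio, while seconds_of_media(30 * MBYTE, 22 * MBYTE) is under a second and a half of broadcast-quality video.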
Data compression delivers a solution. With DVI, Intel is claiming compression
ratios of up to 160:1, enabling up to 72 minutes of full-motion video on a
standard CD-ROM. To accomplish this, DVI researchers have come up with a
compression algorithm called PLV (short for "Production Level Video") that has
i750 hardware assists built into the algorithm. (A document describing the
algorithm's data structure is available from the Intel Literature Center; ask
for #B4P-05.)
Barnsley's approach is equally impressive. He claims to compress a 190-Kbyte
image (with 320 x 200 pixels and 24 bits, or more than 16 million
colors/pixel) into a 6-Kbyte file. It then takes less than a second to read,
decompress, and display that file as a full-color image on a 25-MHz 80386 PC.
Barnsley discussed his fractal image compression at a recent talk sponsored by
the Parallel Processing Connection (a Silicon Valley SIG that says it is the
"Homebrew Club of the 90s") and demonstrated decompression at 30 frames/second
on a standard VGA monitor.
Putting 286 systems aside for the time being, getting into high-end image
manipulation and compression isn't cheap. Barnsley's Iterated Systems, for
example, offers an $18,000 developer's kit (which includes the i960-based
AT-bus compression board, software, training, etc.) that lets developers
create compressed applications for distribution to lower-performance,
lower-priced multimedia systems. The assumption is that decompression doesn't
require the horsepower of compression and that users don't need to store
images, just view them.
I guess this is where 80286-based multimedia systems -- which will have 2
Mbytes of RAM, VGA, digital audio sound, 30-Mbyte hard disk, CD-ROM drive, and
run Microsoft Windows -- enter the picture as electronic Viewmasters. Still,
it seems to me, the 286 is the wrong platform. My guess is that vendors like
Tandy, AT&T, CompuAdd, NEC, Zenith, and others chose the 286 not because it
offers the best performance or technological approach, but because it is cheap
and available from multiple sources. The 386SX (which would make a better
multimedia platform even though there's no technological reason for the chip
to exist) is faster than the 286 and only $40 to $50 more expensive (around
$20 for the 80286 vs. $60 for the 386SX). Granted, education is an extremely
price-sensitive market where 286 PCs are already in place and where few
schools can or will upgrade just for multimedia. Let's face it, for multimedia
the Amiga makes more sense -- dollar-wise and performance-wise -- but then, it
has never made it into the broad-based market.
This leads to questions concerning multimedia's chance for success. Can it
succeed, or will it, as some have suggested, become the AI of the 90s? The
decision by multiple vendors to stake their multimedia future on 80286 PCs
probably won't kill the concept or emerging market, but it won't speed its
growth either. That's not to suggest multimedia system vendors aren't serious
about the market -- after all, they cut down a whole forest of trees for the
press releases alone -- but if multimedia has a chance at success, it will be
because of the efficiency of the underlying software, particularly compression
techniques, and not the hardware that vendors are trying to force on users.







































February, 1991
LETTERS







It's A Dirty Job....


Dear DDJ,
Reader Matsunaga, commenting on the dirty, tortuous, thankless duty of porting
old Fortran programs (December 1990), forgets that individual programming
style is a function of many variables more important than choice of language.
In particular, the gruesome old codes raked into C by Mr. Matsunaga were very
probably written in an age when C, structured programming, and object
orientation were the topics of research papers, to the extent that they
existed at all. In fact, since they were in Fortran, I'll bet that those
programs were written by engineers or scientists for whom programming had yet
to become an issue of etiquette, aesthetics, and -- dare I say -- snobbery.
Those guys needed answers.
For myself, I own and enjoy two C compilers and a C++ compiler; and when I
write or maintain code for pay, I wield all the modern techniques:
modularization, structure variables, neatly blocked, visible logic, bit-level
operations, case statements, and not a goto in sight. In Fortran. I can do
this because I am aware of how things have changed since the days in which Mr.
Matsunaga's programs were written. And I continue to do it because other
aspects of my job in aerospace -- such as teamwork with others in my field --
are more important than grousing about how dumb the old languages were in
comparison to our clever, modern ones.
Two things to consider, Mr. Matsunaga: First, people have long since started
to complain about even "beautiful" C. Second, languages don't kill programs --
programmers do.
David L. Staples
Wichita, Kansas


Software Patents


Dear DDJ,
Your recent article "Software Patents," by The League for Programming Freedom,
(November 1990) is very illuminating to me. Now I have to ask the U.S. Patent
Office to send me the "mine maps" and go over my 30,000 lines of code to
verify whether I have a few "mines" in it or not, and if yes then find a
detour, otherwise I am "stuck" and cannot release my product and help the
people of this planet simply because the U.S. Patent laws are a few hundred
years old and don't fit into today's world and that's a BUG that needs to be
discussed in Congress in an urgent session!
But I don't agree with the LPF statement "If you work on software, you can
personally help prevent software patents by refusing to cooperate in applying
for them." So, what if I put my idea on a BBS and then someone else gets it
and patents it and then blocks me from using my own code?
B. Bari
Tesla-Bari Inc.
Newark, California


Making Connections


Dear DDJ,
In the "Programming Paradigms" column on neural networks (November 1990),
Michael Swaine presented Fodor and Pylyshyn's critique of neural networks as a
model of cognitive architecture. I would like to discuss a few of the points
raised.
It is stated that the "symbols" in the neural network or Connectionist model
cannot have structure, that they are atomic. However, at higher levels of
organization, these so-called symbols or propositions are defined as patterns
of connections among many atomic units; clearly structure is allowed.
It is stated that a Connectionist model cannot have recursion in order to
produce an unbounded number of thoughts, but Rumelhart and McClelland state in
their book (Parallel Distributed Processing, MIT Press, 1986) that such
recursion is possible.
It is stated that a Connectionist model is forced to "accept, as possible
minds, systems that are arbitrarily unsystematic," which is a "damning
critique." But since this model is inherently statistical (e.g., the Boltzmann
machine paradigm), one could argue that there are more systems that are
systematic than not, and so only systematic systems emerge. The same argument
goes for statistical thermodynamics and quantum mechanics: we are forced to
accept impossible states, but these being so rare in number, we never see
them.
Finally, it is stated that Fodor and Pylyshyn maintain "uncontroversially"
that the appropriate level of explanation for cognitive systems is the
symbolic level, where symbols represent some aspect of reality. However, there
is a convincing argument given in George Lakoff's book Women, Fire, and
Dangerous Things (University of Chicago, 1987) that exactly the opposite is
true: "Meaning cannot be characterized by a relationship between symbols and
entities in the world" (p. 343), a result from Hilary Putnam's work.
I commend Michael Swaine for writing on a difficult subject; I just felt that
counterarguments to Fodor and Pylyshyn's work needed discussion.
William Naoki Kumai
Berkeley, California


There's More Than One Way


Dear DDJ,
In the article "Extending printf( )," by Jim Mischel (August 1990), the same
effect would have been achieved by writing a function like that in Example
1(a), which returns its first argument, a temporary buffer; additional
parameters may be added for formatting. It would be used like that in Example
1(b).
Example 1.

 (a)
 char * double2dollarascii (char *pbuff, double dollars);

 (b)
 printf("%s", double2dollarascii(tempbuff, dollar));


 (c)
 #define va_start(ap,v) ap = (va_list) &v + sizeof(v)
 #define va_arg(ap,t) ((t _FAR_ *) (ap += sizeof(t))) [-1]
 #define va_end(ap) ap = NULL

 (d)
 typedef char *va_list;
 #define va_dcl int va_alist;
 #define va_start(list) list = (char *) &va_alist
 #define va_end(list)
 #define va_arg(list,mode) ((mode *) (list += sizeof(mode))) [-1]

This use would have the added advantage of being reentrant. Readers may recall
that this strategy is often used for date and time. Oh bored computer science
majors! Oh back room hackers!
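A minimal sketch of how a function with the prototype in Example 1(a) might be written. The letter gives only the prototype, so the body, the "$%.2f" format, and the DOLLAR_BUF_LEN constant are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: callers supply a buffer of at least DOLLAR_BUF_LEN chars.
   The constant is hypothetical; the letter gives only the prototype. */
#define DOLLAR_BUF_LEN 32

/* Formats dollars as, for example, "$1234.50" into the caller's buffer and
   returns that buffer, so the call can be nested inside printf(). No static
   storage is used, which is what makes the function reentrant. */
char *double2dollarascii(char *pbuff, double dollars)
{
    assert(pbuff != NULL);
    snprintf(pbuff, DOLLAR_BUF_LEN, "$%.2f", dollars);
    return pbuff;
}
```

With char tempbuff[DOLLAR_BUF_LEN] declared by the caller, printf("%s", double2dollarascii(tempbuff, 1234.5)) prints $1234.50. Because each caller owns its buffer, two calls in flight at once cannot overwrite each other's result.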
The same article had an excellent discussion of functions with a variable
number of parameters; however, I would like to warn your readers that the
macros provided are not always portable. In Microsoft C 6.0 (large model), they
are defined as in Example 1(c), while on my Unix (Sun3 4.2BSD) system, they
are like the macros in Example 1(d). Note that va_start takes different
arguments in the two versions.
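Since the macro bodies differ from compiler to compiler, one way to stay portable is to rely only on the interface declared in the ANSI <stdarg.h> header and never on how the macros expand. Example 1(d) appears to follow the older <varargs.h> convention, in which va_start takes a single argument. The sketch below uses the ANSI form; the function sum_ints is hypothetical.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: adds up the count int arguments that follow. */
long sum_ints(int count, ...)
{
    va_list ap;
    long total = 0;
    int i;

    assert(count >= 0);
    va_start(ap, count);              /* ANSI form: names the last fixed parameter */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);     /* fetch the next argument as an int */
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) yields 6 wherever the compiler supplies a conforming <stdarg.h>, regardless of how va_start and va_arg are defined underneath.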
Rayaz Jagani
Sunnyvale, California


VGA BIOS's and Bias's


Dear DDJ,
I found Christopher Howard's article "Super VGA Programming" (July 1990) very
enlightening. I particularly appreciate the time he spent to differentiate
between the Tseng, Paradise, and Video Seven windowing schemes. I am exposed
to all these boards daily and I just purchased the Orchid Technology
ProDesigner II for my personal system.
I, too, agree with Mr. Howard that video board manufacturers should allow for
a method of identifying the capabilities of the board. Compliance with the
Video Electronics Standards Association (VESA) standards will provide for much
easier identification of video adapters and their supported resolutions. I
refer the reader back to the excellent article in the April 1990 issue of Dr.
Dobb's Journal, "VESA VGA BIOS Extensions," by Bo Ericcson, or directly to the
Super VGA BIOS Extension from the VESA committee (October 1, 1989).
No doubt my next statement will be considered as launching the first volley in
this round of the Programming Styles War. I found Mr. Howard's method of
determining the type of VGA chipset rather archaic. It was not because of his
choice of assembly language over C, but rather because of his method of coding
for string compares to avoid static data (see Listing Three, page 84, DDJ July
1990).
Mr. Howard's coding style violates the primary maxim of engineering. He strays
from the rule "Keep it simple and stupid," otherwise known as the KISS
principle. Should Mr. Howard have to change his code in the months to come to
reflect a change in case sensitivity of any of the strings for which he is
searching (i.e., Paradise or Tseng), he would have to wade through the code to
manually change each letter. Software engineering emphasizes better coding
practices.
Assembly language definitely has its place and I do a rather large amount of
mixed C and assembly language programming myself. Mr. Howard could have placed
the strings in the code segment rather than in a separate data segment, and
thus would have maintained the readability and maintainability that structured
programming requires. He should have defined the strings between the EXIT
macro found next to the label svQC_exit and before the ENDP directive in
Listing Three on page 85. By declaring memory locations and defining their
contents with the assembler directives ParadiseStr DB "PARADISE" and TsengStr
DB "Tseng", Mr. Howard could have used the repeat string compare (repe cmpsb)
to determine if any of the desired strings could be found in the video BIOS.
I also feel that some routines are better left to a higher-level language.
Therefore, I chose to implement a method of determining the type of video
adapter using C because more people are likely to be familiar with a higher
level language.
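A minimal sketch of the table-driven string search described above, written in portable C. It assumes the video BIOS has already been copied into an ordinary buffer (how that copy is obtained is platform-specific and not shown), and the names signatures and find_signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature strings named in the letter. Keeping them in one table means a
   change of spelling or case is a one-line edit. */
static const char *const signatures[] = { "PARADISE", "Tseng" };

/* Returns the index into signatures[] of the first table entry that occurs
   anywhere in buf[0..len-1], or -1 if none occurs. buf is assumed to be an
   in-memory copy of the video BIOS. */
int find_signature(const unsigned char *buf, size_t len)
{
    size_t s, i;

    assert(buf != NULL);
    for (s = 0; s < sizeof signatures / sizeof signatures[0]; s++) {
        size_t n = strlen(signatures[s]);
        /* Slide a window of n bytes across the buffer. */
        for (i = 0; n <= len && i <= len - n; i++)
            if (memcmp(buf + i, signatures[s], n) == 0)
                return (int)s;
    }
    return -1;
}
```

The comparison is case sensitive, as it would be with a byte-wise string compare in assembly language; a case-insensitive match would need a different comparison routine.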
Richard Heffel
The Networking Group
Hayward, California


Extending Optimal Extents


Dear DDJ,
I was quite interested in the article "Optimal Determination of Object
Extents," by Duvanenko, Gyurcsik, and Robbins (October 1990). To test the
algorithm, I implemented the code on my Apple IIGS computer, using the ORCA/C
ANSI standard compiler (version 1.1 from Byte Works, Inc.). The IIGS has 3
megabytes of memory, uses software floating point (SANE), with 4 bytes for
float and 10 bytes for extended (both IEEE format). Times in Table 1 are
elapsed (same as CPU time on the IIGS). All optimization options available
were used (a very limited set since there are no registers available for
optimization or holding temporary results).
Table 1

                                 Base      New  Percent      Base       New
 Machine               Items  MIN&MAX  MIN&MAX   Change   Average   Average
 --------------------------------------------------------------------------

 Apple IIGS float    100,000      228      171   -25.0%  .0022800  .0017100
   SANE MC68881      100,000      187      141   -24.6%  .0018700  .0014100
   Call MC68881      100,000       38       31   -18.4%  .0003800  .0003100

 IIGS extended       100,000      157      115   -26.8%  .0015700  .0011500
   SANE MC68881      100,000      126      104   -17.5%  .0012600  .0010400
   Call MC68881      100,000       78       70   -10.3%  .0007800  .0007000

 IIGS long integer   100,000       15       12   -20.0%  .0001500  .0001200

 IBM 3090 300J    60,000,000       27       17   -37.0%  .0000004  .0000003

 IBM RISC S/6000  60,000,000       42       32   -23.8%  .0000007  .0000005

Due to memory size and processor speed (the IIGS is no match for the other
machines in Table 1), I held the number of elements down to 100,000. Clock
values were only accurate to one second (good enough for this comparison).
My initial run used float numbers. The new algorithm gives exactly the
expected reduction in time. However, upon reading the SANE documentation, I
discovered that all float numbers are internally converted to extended numbers
before use. I made a second run using extended numbers (80-bit floating point
IEEE format).
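The algorithm under test is not restated in this letter. Assuming it is the familiar pairwise technique (compare two neighboring items with each other, then test only the smaller against the running minimum and only the larger against the running maximum), the comparison count drops from about 2 per item to about 1.5, which is consistent with the 25 percent reduction reported above. A minimal sketch under that assumption, with a hypothetical function name and signature:

```c
#include <assert.h>
#include <stddef.h>

/* Finds the smallest and largest of a[0..n-1] using about 1.5 comparisons
   per item instead of 2. Requires n >= 1. */
void min_max_pairs(const float *a, size_t n, float *min_out, float *max_out)
{
    float lo, hi;
    size_t i;

    assert(a != NULL && n >= 1 && min_out != NULL && max_out != NULL);
    lo = hi = a[0];
    /* Start at 1 for odd n and at 0 for even n, so the rest pairs up evenly. */
    for (i = n % 2; i + 1 < n; i += 2) {
        float lesser = a[i], greater = a[i + 1];
        if (lesser > greater) {          /* comparison 1: order the pair */
            lesser = a[i + 1];
            greater = a[i];
        }
        if (lesser < lo) lo = lesser;    /* comparison 2: pair minimum only */
        if (greater > hi) hi = greater;  /* comparison 3: pair maximum only */
    }
    *min_out = lo;
    *max_out = hi;
}
```

Three comparisons handle two items, where the straightforward loop spends four. When each comparison is a call into a software floating point package, that saving shows up almost directly in the elapsed time.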
The second run removed all the time for converting numbers from 4-byte float
to 10-byte extended for the comparison. The reduction is significant, at the
expense of 600,000 bytes of memory. This points out that the floating point
operations in SANE dominate both algorithms (comparison).
I made an additional run using long integers instead of floating point. This
was done to determine the effect of using SANE compared to essentially the
basic loop overhead. As we can see from Table 1, the loop itself, even with
long integer operations, takes very little time when compared to the floating
point operations. If this problem were solved using integer operations, other
items, such as subscript calculations and movement of data, would be more
important. Of course, this is still probably true when using floating point
hardware for the operations. In this case, the algorithms themselves should be
optimized with regard to the subscript calculations and data movement (if not
automatically done by the compiler).
For optimal speed, I would choose to code this algorithm in assembly language
(a practical necessity on the IIGS). In doing so, one often discovers how well
(or poorly) a particular compiler generates code. A good compiler on poor
hardware can sometimes do as well as a poor compiler on good hardware.
I took this comparison one step further on the IIGS. By installing the
Floating Point Engine (FPE) from Innovative Systems, the numeric operations
were performed by hardware floating point (MC68881 processor). This is the
same chip used with the Motorola 68000. The chip is used by replacing SANE on
the IIGS, or by directly generated compiler code (supported by the ORCA/C
compiler). Table 2 provides the comparison for both. Note that FPE handles
float (4-byte) operands directly without conversion to extended (10-byte)
operands. For the directly generated compiler code, this provides the fastest
speed without memory penalty. Note also that the performance improvement for
the new algorithm is not as good when better hardware is used. The loops are
no longer compare intensive, and the timing is more dependent on loop control
code and data movement (an indication of code quality from the compiler).
Table 2

 Machine           C compiler   Memory   Co-proc?  float    CPU/elap  Opts
 -------------------------------------------------------------------------

 Apple IIGS float  ORCA/C V1.1  3 meg    no        4 bytes  elap      all
   MC68881                               yes

 IIGS extended     ORCA/C V1.1  3 meg    no        4 bytes  elap      all
   MC68881                               yes

 IIGS extended     ORCA/C V1.1  3 meg    no        4 bytes  elap      all
   long integer

 IBM 3090 300J     C/370 V2     384 meg  no        4 bytes  CPU       OPT

 IBM RISC S/6000   RISC C       256 meg  no        4 bytes  CPU       -O
   Model 540

Finally, I have access to an IBM 3090 300J processor with 384 Mbytes of
memory, and an IBM RISC S/6000 Model 540 processor with 256 Mbytes of memory.
These computers are the fastest models available in their respective product
lines. These processors were timed using 60 million items. My experience with
the C compilers on these two machines is with this program. The IBM C/370
compiler had one optimization level (OPT), but it cut the time by more than
half from the unoptimized run (NOOPT). The RISC C Compiler also had one
optimization level (-O), but it cut the time even more. This kind of reduction
is expected with RISC C, since this compiler is specifically set up to
generate optimal code for the RISC instruction set (especially overlapping
instruction sequences for the instruction set processors).
Ken Kashmarek
Eldridge, Iowa


February, 1991
ARITHMETIC CODING AND STATISTICAL MODELING


Achieving higher compression ratios


 This article contains the following executables: NELSON.ARC


Mark R. Nelson


Mark is the vice president of development and senior developer for Greenleaf
Software Inc., of Dallas, Texas. He can be reached through the DDJ office.


Most of the data compression methods in common use today fall into one of two
camps: dictionary-based schemes or statistical methods. In the world of small
systems, dictionary-based data compression techniques seem to be more popular
at this time. However, by combining arithmetic coding with powerful modeling
techniques, statistical methods for data compression are actually able to
achieve better performance. This article discusses how to combine arithmetic
coding with modeling methods to achieve some impressive compression ratios.


Terms of Endearment


In general, data compression operates by taking "symbols" from an input
"text," processing them, and writing "codes" to a compressed file. For the
purposes of this article, symbols are usually bytes, but they could just as
easily be pixels, 80-bit floating point numbers, or EBCDIC characters. To be
effective, a data compression scheme needs to be able to transform the
compressed file back into an identical copy of the input text. Needless to
say, it also helps if the compressed file is smaller than the input text!
Dictionary-based compression systems operate by replacing groups of symbols in
the input text with fixed length codes. A well-known example of a dictionary
technique is LZW data compression. (See "LZW Data Compression," DDJ, October
1989.) LZW operates by replacing strings of essentially unlimited length with
codes that usually range in size from 9 to 16 bits.
Statistical methods of data compression take a completely different approach:
They operate by encoding symbols one at a time. The symbols are encoded into
output codes, the length of which varies based on the probability or frequency
of the symbol. Low probability symbols are encoded using many bits, and high
probability symbols are encoded using fewer bits.
In practice, the dividing line between statistical and dictionary methods is
not always so distinct. Some schemes can't be clearly put in one camp or the
other, and there are always hybrids which use features from both techniques.
However, the methods discussed in this article use arithmetic coding to
implement purely statistical compression schemes.


How Arithmetic Coding Works


Only in the last ten years has a respectable candidate to replace Huffman
coding been successfully demonstrated: arithmetic coding. Arithmetic coding
completely bypasses the idea of replacing an input symbol with a specific
code. Instead, it takes a stream of input symbols and replaces it with a
single floating point output number. The longer (and more complex) the
message, the more bits are needed in the output number. It was not until
recently that practical methods were found to implement this on computers with
fixed-sized registers.
The output from an arithmetic coding process is a single number less than 1
and greater than or equal to 0. This single number can be uniquely decoded to
create the exact stream of symbols that went into its construction. To
construct the output number, the symbols being encoded have to have set
probabilities assigned to them. For example, if I were to encode the random
message BILL GATES, I would have a probability distribution that looks like
Figure 1.
Figure 1: Probability distribution for BILL GATES

 Character    Probability
 ------------------------

 SPACE        1/10
 A            1/10
 B            1/10
 E            1/10
 G            1/10
 I            1/10
 L            2/10
 S            1/10
 T            1/10

Once the character probabilities are known, the individual symbols need to be
assigned a range along a "probability line," which is nominally 0 to 1. It
doesn't matter which characters are assigned which segment of the range, as
long as it is done in the same manner by both the encoder and the decoder. The
nine-character symbol set used here would look like Figure 2.
Figure 2: A nine-character symbol set

 Character    Probability    Range
 -----------------------------------

 SPACE        1/10           0.00 - 0.10
 A            1/10           0.10 - 0.20
 B            1/10           0.20 - 0.30
 E            1/10           0.30 - 0.40
 G            1/10           0.40 - 0.50
 I            1/10           0.50 - 0.60
 L            2/10           0.60 - 0.80
 S            1/10           0.80 - 0.90
 T            1/10           0.90 - 1.00

Each character is assigned the portion of the 0 - 1 range that corresponds to
its probability of appearance. Note also that the character "owns" everything
up to, but not including the higher number. So the letter T in fact has the
range 0.90 - 0.9999 ....
The most significant portion of an arithmetic coded message belongs to the
first symbol to be encoded. When encoding the message BILL GATES, the first
symbol is B. For the first character to be decoded properly, the final coded
message has to be a number greater than or equal to 0.20 and less than 0.30.
To encode this number, we keep track of the range within which this number
could fall. So after the first character is encoded, the low end of this range
is 0.20 and the high end is 0.30.
After the first character is encoded, we also know that the range for our
output number is bounded by the low and high numbers. During the rest of the
encoding process, each new symbol to be encoded will further restrict the
possible range of the output number. The next character to be encoded, I, owns
the range 0.50 through 0.60. If this were the first symbol in our message, we
would set these as our low- and high-range values. But I is the second
character; therefore, we say that I owns the range corresponding to 0.50 -
0.60 in the new subrange of 0.2 - 0.3. This means that the new encoded number
will have to fall somewhere in the 50 to 60th percentile of the currently
established range. Applying this logic will further restrict our number to the
range between 0.25 and 0.26. The algorithm to accomplish this for a message of
any length is shown in Figure 3. Figure 4 shows this process followed through
to its natural conclusion with our chosen message. So the final low value,
0.2572167752, will uniquely encode the message BILL GATES using our present
encoding scheme.
Figure 3: Encoding algorithm for a message of any length

 Set low to 0.0
 Set high to 1.0
 While there are still input symbols do
     get an input symbol
     range = high - low
     high = low + range * high_range(symbol)
     low = low + range * low_range(symbol)
 End of While
 output low

Figure 4: Resulting message

 New Character    Low Value       High Value
 ---------------------------------------------

                  0.0             1.0
 B                0.2             0.3
 I                0.25            0.26
 L                0.256           0.258
 L                0.2572          0.2576
 SPACE            0.25720         0.25724
 G                0.257216        0.257220
 A                0.2572164       0.2572168
 T                0.25721676      0.2572168
 E                0.257216772     0.257216776
 S                0.2572167752    0.2572167756
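The loop in Figure 3 can be sketched in C roughly as follows. This is only an illustration of the idealized algorithm using double-precision floating point, which is good enough for a ten-symbol message but not for real files. The names sym_range, find_range, and encode_message are made up for this sketch and do not appear in the listings.

```c
#include <assert.h>
#include <stddef.h>

/* Symbol ranges from Figure 2, as [low, high) on the 0-1 probability line. */
struct sym_range { char symbol; double low; double high; };

static const struct sym_range ranges[] = {
    {' ', 0.0, 0.1}, {'A', 0.1, 0.2}, {'B', 0.2, 0.3},
    {'E', 0.3, 0.4}, {'G', 0.4, 0.5}, {'I', 0.5, 0.6},
    {'L', 0.6, 0.8}, {'S', 0.8, 0.9}, {'T', 0.9, 1.0}
};

/* Look up a symbol; returns NULL if it is not in the table. */
static const struct sym_range *find_range(char c)
{
    size_t i;
    for (i = 0; i < sizeof ranges / sizeof ranges[0]; i++)
        if (ranges[i].symbol == c)
            return &ranges[i];
    return NULL;
}

/*
 * Narrow [*low, *high) once per input symbol, as in Figure 3.
 * Returns 0 on success, or -1 if msg contains an unknown symbol.
 */
int encode_message(const char *msg, double *low, double *high)
{
    assert(msg != NULL && low != NULL && high != NULL);
    *low = 0.0;
    *high = 1.0;
    for (; *msg != '\0'; msg++) {
        const struct sym_range *r = find_range(*msg);
        double range = *high - *low;
        if (r == NULL)
            return -1;
        *high = *low + range * r->high;   /* uses the old low */
        *low  = *low + range * r->low;
    }
    return 0;
}
```

Calling encode_message("BILL GATES", &low, &high) should leave low at approximately 0.2572167752, subject to the usual binary floating point rounding.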

Given this encoding scheme, it is relatively easy to see how the decoding
process will operate. We find the first symbol in the message by seeing which
symbol owns the code space in which our encoded message falls. Because
0.2572167752 falls between 0.2 and 0.3, we know that the first character must
be B. We must now remove the B from the encoded number. We know the low and
high ranges of B, so we can remove their effects by reversing the process that
put them in. First, we subtract the low value of B from 0.2572167752, giving
0.0572167752. Then we divide by the range of B, 0.1, and get 0.572167752. Now
we can calculate where that lands, which is in the range of the next letter,
I.
The algorithm for decoding the incoming number looks like Figure 5. Note that
I have conveniently ignored the problem of how to decide when there are no
more symbols left to decode. This can be handled by either encoding a special
EOF symbol, or carrying the stream length along with the encoded message. The
decoding algorithm for the BILL GATES message will proceed as shown in Figure
6.
Figure 5: Algorithm for decoding an incoming number

 get encoded number
 Do
     find symbol whose range straddles the encoded number
     output the symbol
     range = symbol_high_value - symbol_low_value
     subtract symbol_low_value from encoded number
     divide encoded number by range
 until no more symbols

Figure 6: Resulting message

 Encoded Number    Output Symbol    Low    High    Range
 ---------------------------------------------------------

 0.2572167752      B                0.2    0.3     0.1
 0.572167752       I                0.5    0.6     0.1
 0.72167752        L                0.6    0.8     0.2
 0.6083876         L                0.6    0.8     0.2
 0.041938          SPACE            0.0    0.1     0.1
 0.41938           G                0.4    0.5     0.1
 0.1938            A                0.1    0.2     0.1
 0.938             T                0.9    1.0     0.1
 0.38              E                0.3    0.4     0.1
 0.8               S                0.8    0.9     0.1
 0.0
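The decoding loop of Figure 5 might look something like the sketch below. It carries the stream length along with the number, which is one of the two termination options mentioned above. One caveat is an assumption of mine rather than part of the article: because binary floating point cannot represent these decimal fractions exactly, the sketch is meant to be fed a number from the middle of the final interval (such as 0.2572167754) rather than the exact low value, which sits right on a symbol boundary. The names sym_range and decode_message are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Symbol ranges from Figure 2, as [low, high) on the 0-1 probability line. */
struct sym_range { char symbol; double low; double high; };

static const struct sym_range ranges[] = {
    {' ', 0.0, 0.1}, {'A', 0.1, 0.2}, {'B', 0.2, 0.3},
    {'E', 0.3, 0.4}, {'G', 0.4, 0.5}, {'I', 0.5, 0.6},
    {'L', 0.6, 0.8}, {'S', 0.8, 0.9}, {'T', 0.9, 1.0}
};

/*
 * Decode up to `count` symbols from `code` into out[], which must hold
 * count + 1 bytes. Returns the number of symbols actually decoded.
 */
size_t decode_message(double code, size_t count, char *out)
{
    size_t n, i;

    assert(out != NULL && code >= 0.0 && code < 1.0);
    for (n = 0; n < count; n++) {
        const struct sym_range *r = NULL;
        /* Find the symbol whose range straddles the encoded number. */
        for (i = 0; i < sizeof ranges / sizeof ranges[0]; i++) {
            if (code >= ranges[i].low && code < ranges[i].high) {
                r = &ranges[i];
                break;
            }
        }
        if (r == NULL)          /* rounding pushed code out of [0, 1) */
            break;
        out[n] = r->symbol;
        /* Remove the symbol: subtract its low value, divide by its range. */
        code = (code - r->low) / (r->high - r->low);
    }
    out[n] = '\0';
    return n;
}
```

With a code of 0.2572167754 and a count of 10, this should reproduce BILL GATES. The fragility near symbol boundaries is a small preview of why floating point is a poor fit for the job.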

To summarize, the encoding process consists simply of narrowing the range of
possible numbers with every new symbol. The new range is proportional to the
predefined probability attached to that symbol. Decoding is the inverse
procedure: The range is expanded in proportion to the probability of each
symbol as it is extracted.


Practical Matters


The process of encoding and decoding a stream of symbols using arithmetic
coding is not too complicated. But at first glance, it seems completely
impractical. Most computers support floating point numbers of up to 80 bits or
so. Does this mean you have to start over every time you finish encoding 10 or
15 symbols? Do you need a floating point processor? Can machines with
different floating point formats communicate using arithmetic coding?
As it turns out, arithmetic coding is best accomplished using standard 16-bit
and 32-bit integer math. No floating point math is required, nor would it help
to use it. Instead, we use an incremental transmission scheme in which
fixed-size integer-state variables receive new bits at the low end and shift
them out the high end, forming a single number that can be as long as the
number of bits available on the computer's storage medium.
In the previous section, I showed how the algorithm works by keeping track of
a high and low number that bracket the range of the possible output number.
When the algorithm first starts up, the low number is set to 0.0, and the high
number to 1.0. To work with integer math, first change the 1.0 to 0.999...,
or .111... in binary.
To store these numbers in integer registers, we first justify them so the
implied decimal point is on the left-hand side of the word. Then we load as
many of the initial high and low values as will fit into the word size we are
working with. My implementation uses 16-bit unsigned math, so the initial
value of high is 0xFFFF, and low is 0. We know that the high value continues
with FFs forever, and low continues with 0s forever, so we can shift those
extra bits in with impunity when they are needed.
If you imagine our BILL GATES example in a 5-digit register, the decimal
equivalent of our setup would look like Figure 7(a). To find our new range
numbers, we need to apply the encoding algorithm from the previous section. We
first calculate the range between the low and high values. The difference
between the two registers will be 100000, not 99999, because assuming the high
register has an infinite number of 9s added on to it, we need to increment the
calculated difference. We then compute the new high value using the formula
from the previous section: high = low + range * high_range(symbol).
Figure 7: (a) Decimal equivalent of setup; (b) high and low after
calculation; (c) resulting message

 (a)

 HIGH: 99999
 LOW:  00000

 (b)

 HIGH: 29999 (999...)
 LOW:  20000 (000...)

 (c)

                           HIGH     LOW      RANGE     CUMULATIVE OUTPUT
 -----------------------------------------------------------------------

 Initial state             99999    00000    100000
 Encode B (0.2-0.3)        29999    20000
 Shift out 2               99999    00000    100000    .2
 Encode I (0.5-0.6)        59999    50000              .2
 Shift out 5               99999    00000    100000    .25
 Encode L (0.6-0.8)        79999    60000    20000     .25
 Encode L (0.6-0.8)        75999    72000              .25
 Shift out 7               59999    20000    40000     .257
 Encode SPACE (0.0-0.1)    23999    20000              .257
 Shift out 2               39999    00000    40000     .2572
 Encode G (0.4-0.5)        19999    16000              .2572
 Shift out 1               99999    60000    40000     .25721
 Encode A (0.1-0.2)        67999    64000              .25721
 Shift out 6               79999    40000    40000     .257216
 Encode T (0.9-1.0)        79999    76000              .257216
 Shift out 7               99999    60000    40000     .2572167
 Encode E (0.3-0.4)        75999    72000              .2572167
 Shift out 7               59999    20000    40000     .25721677
 Encode S (0.8-0.9)        55999    52000              .25721677
 Shift out 5               59999    20000              .257216775
 Shift out 2                                           .2572167752
 Shift out 0                                           .25721677520


In this case, the high range is 0.30, which gives a new high value of 30000.
Before storing this value, we need to decrement it (again because of the
implied digits appended to the integer value), so the new value of high is
29999. The calculation of low follows the same path, with a resulting new
value of 20000. High and low now look like Figure 7(b).
At this point, the most significant digits of high and low match. Due to the
nature of our algorithm, high and low can continue to grow closer to one
another without ever quite matching. Therefore, once the most significant
digit matches, that digit will never change. We can then output that digit as
the first digit of our encoded number. This is done by shifting both high and
low left by one digit, and shifting in a 9 in the least significant digit of
high. The equivalent operations are performed in binary in the C
implementation of this algorithm.
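As a rough illustration of what "the equivalent operations in binary" could look like, here is a small sketch using 16-bit registers. It is not taken from the listings; the function name shift_matching_bits and its interface are assumptions chosen to keep the example self-contained, and it assumes unsigned short is 16 bits wide.

```c
#include <assert.h>
#include <stddef.h>

/*
 * While the top bits of high and low agree, emit that bit and shift both
 * registers left. high takes in a 1 and low a 0, mirroring the implied
 * endless tails of 1s and 0s. Returns the number of bits stored in out[].
 */
size_t shift_matching_bits(unsigned short *high, unsigned short *low,
                           unsigned char *out, size_t max_bits)
{
    size_t n = 0;

    assert(high != NULL && low != NULL && out != NULL);
    while (n < max_bits && ((*high ^ *low) & 0x8000u) == 0) {
        out[n++] = (unsigned char)((*high >> 15) & 1u);
        *low  = (unsigned short)((*low << 1) & 0xFFFFu);
        *high = (unsigned short)(((*high << 1) & 0xFFFFu) | 1u);
    }
    return n;
}
```

Starting from high = 0x6000 and low = 0x4000, for instance, the two registers share the leading bits 0 and 1, so two bits are shifted out before the top bits diverge.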
As this process continues, high and low are continually growing closer
together, then shifting digits out into the coded word. The process for our
BILL GATES message looks like Figure 7(c). Note that after all the letters
have been accounted for, two extra digits need to be shifted out of either the
high or low value to finish up the output word.
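The 5-digit decimal walk-through of Figure 7(c) can be reproduced with a short C sketch. It keeps the symbol ranges in tenths so that everything stays in integer math, and it deliberately leaves out any handling of precision loss, so treat it as a demonstration for this one message rather than a general encoder. The names sym_range and encode_decimal are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Cumulative ranges from Figure 2 in tenths: a symbol owns [low, high)/10. */
struct sym_range { char symbol; int low; int high; };

static const struct sym_range ranges[] = {
    {' ', 0, 1}, {'A', 1, 2}, {'B', 2, 3}, {'E', 3, 4}, {'G', 4, 5},
    {'I', 5, 6}, {'L', 6, 8}, {'S', 8, 9}, {'T', 9, 10}
};

/*
 * Encode msg with 5-digit decimal registers, writing the output digits as
 * text. Returns the digit count, or 0 if a symbol is unknown or out[] is
 * too small.
 */
size_t encode_decimal(const char *msg, char *out, size_t out_size)
{
    long high = 99999, low = 0;
    size_t n = 0, i;

    assert(msg != NULL && out != NULL);
    for (; *msg != '\0'; msg++) {
        const struct sym_range *r = NULL;
        long range = high - low + 1;            /* +1 for the implied 9s */
        for (i = 0; i < sizeof ranges / sizeof ranges[0]; i++)
            if (ranges[i].symbol == *msg)
                r = &ranges[i];
        if (r == NULL)
            return 0;
        high = low + range * r->high / 10 - 1;  /* -1 for the implied 9s */
        low  = low + range * r->low / 10;
        while (high / 10000 == low / 10000) {   /* leading digits match */
            if (n + 3 >= out_size)
                return 0;
            out[n++] = (char)('0' + high / 10000);
            high = (high % 10000) * 10 + 9;     /* shift in a 9 */
            low  = (low % 10000) * 10;          /* shift in a 0 */
        }
    }
    if (n + 3 > out_size)
        return 0;
    out[n++] = (char)('0' + low / 10000);       /* flush two digits of low */
    out[n++] = (char)('0' + (low / 1000) % 10);
    out[n] = '\0';
    return n;
}
```

For BILL GATES the digits produced should match the cumulative output column of Figure 7(c).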


A Complication


This scheme works well for incrementally encoding a message. There is enough
accuracy retained during the double precision integer calculations to ensure
that the message is accurately encoded. However, there is potential for a loss
of precision under certain circumstances.
In the event that the encoded word has a string of 0s or 9s in it, the high
and low values will slowly converge on a value, but their most significant
digits may not match immediately. For example, high and low may look like
Figure 8(a). At this point, the calculated range is going to be only a single
digit long, which means the output word will not have enough precision to be
accurately encoded. Even worse, after a few more iterations, high and low
could look like Figure 8(b).
Figure 8: (a) High and low; (b) after a few more iterations, a new high and
low; (c) setting an underflow counter to remember what was thrown away

 (a)

 HIGH: 700004
 LOW:  699995

 (b)

 HIGH: 70000
 LOW:  69999

 (c)

                Before    After
 -------------------------------

 HIGH:          40344     43449
 LOW:           39810     38100
 Underflow:     0         1

At this point, the values are permanently stuck. The range between high and
low has become so small that any calculation will always return the same
values. But because the most significant digits of both words are not equal,
the algorithm can't output the digit and shift. It seems like an impasse.
In the original algorithm, if the most significant digit of high and low
matched, we shifted it out. To prevent the underflow problem just encountered,
a second test needs to be applied before the digits match, while they are on
adjacent numbers. If high and low are one apart, we must test to see if the
second most significant digit in high is a 0, and the second digit in low is a
9. If so, we are on the road to underflow and need to take action.
When underflow rears its ugly head, we head it off with a slightly different
shift operation. Instead of shifting the most significant digit out of the
word, we just delete the second digits from high and low, and shift the rest
of the digits left to fill up the space. The most significant digit stays in
place. We must then set an underflow counter to remember that we threw away a
digit, and we aren't quite sure whether it was going to end up as a 0 or a 9.
This operation is shown in Figure 8(c).
After every recalculation operation, if the most significant digits don't
match up, we can check for underflow digits again. If they are present, we
shift them out and increment the counter.
When the most significant digits do finally converge to a single value, we
first output that value. Then we output all the previously discarded
"underflow" digits. The underflow digits will be all 9s or 0s, depending on
whether high and low converged to the higher or lower value. In the C
implementation of this algorithm, the underflow counter keeps track of how
many 1s or 0s to put out.


Decoding


In the "ideal" decoding process, we had the entire input number to work with,
so the algorithm had us do things such as "divide the encoded number by the
symbol probability." In practice, we can't perform an operation like that on a
number that could be billions of bytes long. Just like the encoding process,
the decoder can operate using 16- and 32-bit integers for calculations.
Instead of maintaining two numbers, high and low, the decoder has to maintain
three integers. The first two, high and low, correspond exactly to the high
and low values maintained by the encoder. The third number, code, contains the
current bits being read in from the input bit stream. The code value will
always lie between the high and low values. As they come closer and closer to
it, new shift operations will take place, and high and low will move back away
from code.
The high and low values in the decoder correspond exactly to the high and low
that the encoder was using. They will be updated after every symbol, just as
they were in the encoder and should have exactly the same values. By
performing the same comparison test on the upper digit of high and low, the
decoder knows when to shift a new digit into the incoming code. The same
underflow tests are performed as well, in lockstep with the encoder.
In the ideal algorithm, it was possible to determine what the current encoded
symbols were, just by finding the symbol whose probabilities enclosed the
present value of the code. In the integer math algorithm, things are somewhat
more complicated: The probability scale is determined by the difference
between high and low, so instead of the range being between 0.0 and 1.0, it
will be between two positive 16-bit integer counts. The current probability is
determined by where the present code value falls within that range. If you
were to divide (code - low) by (high - low + 1), you would get the actual
probability for the present symbol.
Earlier, we saw how each character "owned" a probability range in the scale
ranging from 0.0 to 1.0. To implement arithmetic coding using integer math,
this ownership was restated as a low and high count along an integer range
from 0 to the maximum count.
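To make the three-register decoder concrete, here is a sketch that decodes the decimal digit string for BILL GATES with 5-digit registers. The way the code value is scaled into tenths, ((code - low + 1) * 10 - 1) / range, is borrowed from the usual integer formulation of arithmetic coding and should be read as an assumption, not as a quotation from the listings. Underflow handling is omitted, and the names sym_range and decode_decimal are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Cumulative ranges from Figure 2 in tenths: a symbol owns [low, high)/10. */
struct sym_range { char symbol; int low; int high; };

static const struct sym_range ranges[] = {
    {' ', 0, 1}, {'A', 1, 2}, {'B', 2, 3}, {'E', 3, 4}, {'G', 4, 5},
    {'I', 5, 6}, {'L', 6, 8}, {'S', 8, 9}, {'T', 9, 10}
};

/*
 * Decode `count` symbols from a string of decimal digits into out[], which
 * must hold count + 1 bytes. Missing input digits are read as 0.
 * Returns the number of symbols decoded.
 */
size_t decode_decimal(const char *digits, size_t count, char *out)
{
    long high = 99999, low = 0, code = 0;
    size_t n, i;

    assert(digits != NULL && out != NULL);
    for (i = 0; i < 5; i++) {                  /* preload the code register */
        code = code * 10 + (*digits ? *digits - '0' : 0);
        if (*digits)
            digits++;
    }
    for (n = 0; n < count; n++) {
        const struct sym_range *r = NULL;
        long range = high - low + 1;
        long tenths = ((code - low + 1) * 10 - 1) / range;
        for (i = 0; i < sizeof ranges / sizeof ranges[0]; i++)
            if (tenths >= ranges[i].low && tenths < ranges[i].high)
                r = &ranges[i];
        if (r == NULL)
            break;
        out[n] = r->symbol;
        /* Track the encoder exactly: same update, same shifts. */
        high = low + range * r->high / 10 - 1;
        low  = low + range * r->low / 10;
        while (high / 10000 == low / 10000) {
            high = (high % 10000) * 10 + 9;
            low  = (low % 10000) * 10;
            code = (code % 10000) * 10 + (*digits ? *digits - '0' : 0);
            if (*digits)
                digits++;
        }
    }
    out[n] = '\0';
    return n;
}
```

Fed the digits 25721677520 and a count of 10, the sketch should recover the original message, with high and low passing through the same values the encoder saw.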


Modeling


A model has to satisfy two conditions. The first is that it accurately predict
the probability of symbols in the input data, a need that is inherent to the
nature of arithmetic coding. The principle of this type of
coding is to reduce the number of bits needed to encode a character as its
probability of appearance increases. So if the letter "e" represents 25
percent of the input data, it would only take 2 bits to code. If the letter
"z" represents only 0.1 percent of the input data, it might take 10 bits to
code. If the model is not generating probabilities accurately, it might take
10 bits to represent "e" and 2 bits to represent "z," causing data expansion
instead of compression.
The second condition is that the model needs to make predictions that deviate
from a uniform distribution. The better the model is at making these
predictions, the better the compression ratios will be. For example, a model
could be created that assigned all 256 possible symbols a uniform probability
of 1/256. This model would create an output file that was exactly the same
size as the input file, because every symbol would take exactly 8 bits to
encode. Only by correctly finding probabilities that deviate from a uniform
distribution can the number of bits be reduced, leading to compression. Of
course, the increased probabilities have to accurately reflect reality, as
prescribed by the first condition.
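The bit counts quoted in this section follow from the standard information-theoretic estimate of the ideal code length for a symbol s with probability p(s). As a quick check of the arithmetic (the notation is mine, not the article's):

```latex
\[
  \mathrm{bits}(s) = -\log_2 p(s), \qquad
  -\log_2 0.25 = 2, \qquad
  -\log_2 0.001 \approx 9.97, \qquad
  -\log_2 \tfrac{1}{256} = 8 .
\]
```

So a symbol seen 25 percent of the time costs 2 bits, one seen 0.1 percent of the time costs about 10 bits, and a uniform model over 256 symbols costs exactly 8 bits per symbol.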
It may seem that the probability of a given symbol occurring in a data stream
is fixed, but this is not quite true. Depending on the model being used, the
probability of the character can change quite a bit. For example, when
compressing a C program, the probability of a newline character in the text
might be 1/40. This probability could be determined by scanning the entire
text and dividing the number of the character's occurrences by the total
number of characters. But if we use a modeling technique that looks at a
single previous character, the probabilities change. In that case, if the
previous character was a "}", the probability of a newline character goes up
to 1/2. This improved modeling technique leads to better compression, even
though both models were generating accurate probabilities.


Finite Context Modeling


The type of modeling I will present in this article is referred to as
"finite-context" modeling. It is based on a very simple idea: The
probabilities for each incoming symbol are calculated based on the context in
which the symbol appears. In all the examples I will show here, the context
consists of nothing more than previously encountered symbols. The "order" of
the model refers to the number of previous symbols that make up the context.
The simplest finite-context model would be an order-0 model, in which the
probability of each symbol is independent of any previous symbols. In order to
implement this model, we need only a single table containing the frequency
counts for each symbol that might be encountered in the input stream. For an
order-1 model, you will keep track of 256 different tables of frequencies,
because you need to keep a separate set of counts for each possible context.
Likewise, an order-2 model needs to be able to handle 65,536 different tables
of contexts.
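To get a feel for what "256 different tables of frequencies" means in practice, the following sketch gathers order-1 statistics from a buffer. It is an illustration only: the names count_order1 and context_total are hypothetical, and a real model would store its tables far more compactly than a full 256-by-256 array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * counts[c][s] records how often symbol s followed context byte c,
 * giving one 256-entry frequency table per possible context.
 */
void count_order1(const unsigned char *text, size_t len,
                  unsigned int counts[256][256])
{
    size_t i;

    assert(text != NULL && counts != NULL);
    memset(counts, 0, 256 * 256 * sizeof counts[0][0]);
    for (i = 1; i < len; i++)
        counts[text[i - 1]][text[i]]++;
}

/* Total number of symbols seen after context byte c. */
unsigned long context_total(unsigned int counts[256][256], unsigned char c)
{
    unsigned long total = 0;
    int s;

    for (s = 0; s < 256; s++)
        total += counts[c][s];
    return total;
}
```

The order-1 probability of a newline after "}" is then counts['}']['\n'] divided by context_total(counts, '}'), which is the kind of context-dependent estimate described above.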



Adaptive Modeling


Logically, as the order of the model increases, the compression ratios ought
to improve as well. For example, the probability of the symbol "u" appearing
in this article may only be five percent, but if the previous context
character is "q", the probability goes up to 95 percent. Being able to predict
characters with high probability lowers the number of bits needed, and larger
contexts ought to let us make better predictions.
Unfortunately, as the order of the model increases linearly, the memory
consumed by the model increases exponentially. With an order-0 model, the
space consumed by the statistics could be as small as 256 bytes. But once the
order of the model increases to 2 or 3, even the most cleverly designed models
will consume hundreds of kilobytes.
One conventional way of compressing data is to take a single pass over the
symbols to be compressed to gather the statistics for the model. A second pass
is then made to actually encode the data. The statistics are then usually
prepended to the compressed data so that the decoder will have a copy of them
to work with. This will obviously have serious problems if the statistics for
the model consume more space than the data to be compressed.
The solution is to perform "adaptive" compression, in which both the
compressor and the decompressor start with the same model. The compressor
encodes a symbol using the existing model, then updates the model to account
for the new symbol. The decompressor likewise decodes a symbol using the
existing model, then updates the model. As long as the algorithm to update the
model operates identically for the compressor and the decompressor, the
process can operate perfectly without having to pass a statistics table from
the compressor to the decompressor.
The place where adaptive compression suffers is in the cost of updating the
model. When updating the count for a particular symbol using arithmetic
coding, the update code has the potential cost of updating the cumulative
counts for all the other symbols as well, leading to code that has to perform
an average of 128 arithmetic operations for every single symbol encoded or
decoded.
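The cost of an adaptive update is easy to see in a minimal order-0 sketch that keeps cumulative counts in a flat array. This is one straightforward layout, assumed here for illustration; the names totals, init_order0, and update_order0 are hypothetical, and the rescaling needed when the totals grow too large is left out.

```c
#include <assert.h>

#define NUM_SYMBOLS 256

/*
 * totals[i] is the cumulative count of all symbols below i, so symbol s
 * owns the range [totals[s], totals[s + 1]) out of totals[NUM_SYMBOLS].
 */
static unsigned short totals[NUM_SYMBOLS + 1];

/* Both compressor and decompressor start from this same model. */
void init_order0(void)
{
    int i;

    for (i = 0; i <= NUM_SYMBOLS; i++)
        totals[i] = (unsigned short)i;   /* every symbol starts at count 1 */
}

/* Bump one symbol's count: every cumulative total above it moves too. */
void update_order0(int symbol)
{
    int i;

    assert(symbol >= 0 && symbol < NUM_SYMBOLS);
    for (i = symbol + 1; i <= NUM_SYMBOLS; i++)
        totals[i]++;
}
```

The loop in update_order0 touches between 1 and 256 entries depending on the symbol, which is where the figure of roughly 128 operations per symbol comes from.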
Because of the high cost in both memory and CPU time, higher order adaptive
models have become practical only in the last ten years. It is somewhat ironic
that as the cost of disk space and memory goes down, so does the cost of
compressing the data they store. As these costs continue to decline, we will
be able to implement even more effective programs than are practical today.


Highest-Order Modeling


An order-0 model doesn't take into account any of the previous symbols from
the text file when calculating the probabilities for the current symbol. By
looking at the previous characters in the text file, or the "context," we can
more accurately predict the incoming symbols.
Moving up to an order-2 or order-3 model raises a problem: in a fixed-order
model, each character must have a finite, nonzero probability in every context
so that it can be encoded if and when it appears. The solution is to set the
initial probabilities of all the symbols to 0 for a given context, and to have
a method of falling back to a different context when a previously unseen
symbol occurs. This is done by emitting a special "Escape" code. For the
previous context of REQ, we could set the Escape code to a count of 1, and all
other symbols to a count of 0. The first time the character "U" followed REQ,
we would have to emit an Escape code, followed by the code for "U" in a
different context. During the model update immediately following that, we
could increment the count for "U" in the REQ context to 1, giving it a
probability of 1/2. The next time it appeared, it would be encoded in only 1
bit, with the probability increasing and the number of bits decreasing with
each appearance.
The obvious question is: What do we use as our "fall-back" context after
emitting an Escape code? In the MODEL-2 program (see the following section
entitled "Implementation"), if the order-3 context generates an Escape code,
the next context tried is that of order-2. This means that the first time the
context REQ is used, and "U" must be encoded, an Escape code is generated.
Following that, the MODEL-2 program drops back to an order-2 model, and tries
to encode the character "U" using the context EQ. This continues on down
through the order-0 context. If the Escape code is still generated at order-0,
we fall back to a special order (-1) context. The -1 context is set up at
initialization to have a count of 1 for every possible symbol. It is never
updated, so it is guaranteed to be able to encode every symbol.
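The fall-back chain can be sketched in miniature. To keep the example small, the sketch below stops at order-1 instead of order-3, counts only how many Escape codes would be emitted, and updates every order after each symbol. All of that is simplification on my part, and the names order1, order0, escapes_needed, and update_counts are hypothetical, not taken from MODEL-2.C.

```c
#include <assert.h>

static unsigned short order1[256][256];  /* counts per one-byte context */
static unsigned short order0[256];       /* context-free counts */
/* Both tables start at zero, so unseen symbols must escape. */

/*
 * Number of Escape codes emitted before `symbol` can be coded after the
 * context byte `prev`: 0 = coded at order-1, 1 = coded at order-0,
 * 2 = coded in the order (-1) context, which can code any symbol.
 */
int escapes_needed(unsigned char prev, unsigned char symbol)
{
    if (order1[prev][symbol] > 0)
        return 0;
    if (order0[symbol] > 0)
        return 1;
    return 2;
}

/* After coding, credit the symbol in each context (assumed update policy). */
void update_counts(unsigned char prev, unsigned char symbol)
{
    assert(order1[prev][symbol] < 0xFFFFu && order0[symbol] < 0xFFFFu);
    order1[prev][symbol]++;
    order0[symbol]++;
}
```

The first "U" after "Q" costs two escapes here. Once the counts are updated, the same pair needs none, and a "U" after any other character needs only one.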


Implementation


In writing this article, I've developed a number of programs, ranging from
routines that implement the arithmetic coding algorithm itself to higher-order
modeling. Due to space constraints, all of these fully-commented files are
available electronically (see "Availability," page 3). The modules presented
here in the magazine implement an order-2 compression program. The source
files consist of MODEL.H (Listing One), the header file; COMP-2.C (Listing
Two), the main module for the compression program that handles escape codes
and does compression ratio checking; EXPAND-2.C (Listing Three), the main
module for decompression; and MODEL-2.C (Listing Four), the highly optimized
source for a variable order compression program that can be used with COMP-2
or EXPAND-2.
MODEL-2.C has prodigious memory requirements. When running on DOS machines
limited to 640K, these programs have to be limited to order-1, or perhaps
order-2 for text with a higher redundancy ratio. To examine compression ratios
for higher orders on binary files, there are three choices. First, the programs
can be compiled using Zortech C and use EMS __handle pointers. Second, they can
be built using a DOS Extender, such as Rational Systems 16/M. Third, they can
be built on a machine that supports virtual memory, such as VMS. The code
distributed here was written in an attempt to be portable across all three
options.
I found that with an extra megabyte of EMS, I could compress virtually any
ASCII file on my PC using order-3 compression. Some binary files require more
memory. My Xenix system had no problem with order-3 compression, and turned in
the best performance overall in terms of speed. I had mixed results with DOS
Extenders. For unknown reasons, my tests with Lattice C/286 and Microsoft C +
DOS 16/M ran much slower than Zortech's EMS code, although logic indicates
that the opposite should be true. This was not an optimization problem either,
because the Lattice and Microsoft implementations ran faster than Zortech's
when inside 640K. Executables built with the Lattice C/286 and the Zortech
2.06 code are available electronically. The Rational Systems 16/M license
agreement requires royalty payments, so that code cannot be distributed.


Testing and Comparing: CHURN


To test compression programs, I've created a general-purpose test program,
CHURN, that simply churns through a directory and all of its subdirectories,
compressing and then expanding all the files it finds there. CHURN and a Unix
variant, CHURNX, are not printed here for reasons of space, but both are
available electronically (see page 3).
Table 1 shows the results returned when testing various compression programs
against two different bodies of input. The first sample, TEXTDATA, is roughly
1 Mbyte of text-only files, consisting of source code, word processing files,
and documentation. The second sample, BINDATA, is roughly 1 Mbyte of randomly
selected files, containing executables, database files, binary images, and so
on.
Table 1: Compression testing results

 TEXTDATA - Total input bytes: 987070 Text data
 -------------------------------------------------------------------------

 COMPRESS    Unix 16-bit LZW implementation    351446 bytes    2.85 bits/byte
 LZHUF       Sliding window dictionary         313541 bytes    2.54 bits/byte
 PKZIP       Proprietary dictionary based      292232 bytes    2.37 bits/byte
 Model-2     Highest context, max order-3      239327 bytes    1.94 bits/byte

 BINDATA - Total input bytes: 989917 Binary data
 -------------------------------------------------------------------------

 COMPRESS    Unix 16-bit LZW implementation    662692 bytes    5.35 bits/byte
 PKZIP       Proprietary dictionary based      503827 bytes    4.06 bits/byte
 LZHUF       Sliding window dictionary         503224 bytes    4.06 bits/byte
 Model-2     Highest context, max order-3      500055 bytes    4.03 bits/byte

For comparison, three dictionary-based coding schemes were also run on the
datasets. COMPRESS is a 16-bit LZW implementation in widespread use on Unix
systems. The PC implementation that uses 16 bits takes up about 500K of RAM.
LZHUF is an LZSS program with an adaptive Huffman coding stage written by
Haruyasu Yoshizaki, later modified and posted on Fidonet by Paul Edwards. This
is essentially the same compression used in the LHARC program. Finally, the
commercial product PKZIP 1.10 by PKWare (Glendale, Wisc.) was also tested on
the datasets. PKZIP uses a proprietary dictionary-based scheme, which is
discussed in the program's documentation.


Conclusions


In terms of compression ratios, these tests show that statistical modeling can
perform at least as well as dictionary-based methods. At present, however,
these programs are somewhat impractical, due to their high resource
requirements. MODEL-2 is fairly slow, compressing data at speeds in the
range of 1 Kbyte per second, and needs huge amounts of memory to use
higher-order modeling. However, as memory becomes cheaper and processors more
powerful, schemes such as the ones shown here may become practical. Currently,
they could be applied to circumstances where either storage or transmission
costs are extremely high.

Order-0 adaptive modeling using arithmetic coding could be usefully applied
today to situations requiring extremely low memory consumption. The
compression ratios might not be as good as those of more sophisticated models,
but the model needs only a single table of symbol counts.
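
To give a feel for how little state an order-0 adaptive model needs, here is a minimal sketch. It is not the MODEL-1 code from this article: the RANGE structure and the order0_ function names are hypothetical, and the table is rescaled once the total reaches 16383, the same register limit mentioned in the model-2.c comments. The whole model is one array of 256 counts plus a running total.

```c
#define SYMBOL_COUNT 256      /* one counter per possible byte value   */
#define MAX_SCALE    16383    /* keep the total inside 14 bits         */

/* Hypothetical structure: the cumulative range handed to a coder. */
typedef struct {
    unsigned short low_count;
    unsigned short high_count;
    unsigned short scale;
} RANGE;

static unsigned short counts[ SYMBOL_COUNT ];
static unsigned short total;

/* Every symbol starts with a count of 1, so nothing has zero probability. */
void order0_init( void )
{
    int i;

    for ( i = 0 ; i < SYMBOL_COUNT ; i++ )
        counts[ i ] = 1;
    total = SYMBOL_COUNT;
}

/* Sums the counts below the symbol to find its slice of the range. */
void order0_get_range( int symbol, RANGE *r )
{
    int i;
    unsigned short low = 0;

    for ( i = 0 ; i < symbol ; i++ )
        low += counts[ i ];
    r->low_count = low;
    r->high_count = (unsigned short) ( low + counts[ symbol ] );
    r->scale = total;
}

/* Adapts the model after a symbol has been coded, halving when full. */
void order0_update( int symbol )
{
    int i;

    counts[ symbol ]++;
    total++;
    if ( total >= MAX_SCALE )
    {
        total = 0;
        for ( i = 0 ; i < SYMBOL_COUNT ; i++ )
        {
            counts[ i ] = (unsigned short) ( ( counts[ i ] + 1 ) / 2 );
            total += counts[ i ];
        }
    }
}
```

A compressor would call order0_get_range() before encoding each byte and order0_update() afterward, and the decoder would make the same calls in the same order so that both sides keep identical counts. The linear scan in order0_get_range() is the simplest choice, not the fastest.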


References


The June 1987 issue of Communications of the ACM provides the definitive
overview of arithmetic coding. Most of the article is reprinted in the book
Text Compression, by Timothy C. Bell, John G. Cleary, and Ian H. Witten. This
book provides an excellent overview of both statistical and dictionary-based
compression techniques.
Bell, Timothy C., John G. Cleary, and Ian H. Witten. Text Compression.
Englewood Cliffs, N.J.: Prentice Hall, 1990.
Nelson, Mark. "LZW Data Compression." Dr. Dobb's Journal (October, 1989).
Storer, J.A. Data Compression. Rockville, Md.: Computer Science Press, 1988.
Witten, Ian H., Radford M. Neal, and John G. Cleary. "Arithmetic Coding for
Data Compression." Communications of the ACM (June, 1987).


[LISTING ONE]

/*
 * model.h
 *
 * This file contains all of the function prototypes and
 * external variable declarations needed to interface with
 * the modeling code found in model-1.c or model-2.c.
 */

/*
 * External variable declarations.
 */
extern int max_order;
extern int flushing_enabled;
/*
 * Prototypes for routines that can be called from MODEL-X.C
 */
void initialize_model( void );
void update_model( int symbol );
int convert_int_to_symbol( int symbol, SYMBOL *s );
void get_symbol_scale( SYMBOL *s );
int convert_symbol_to_int( int count, SYMBOL *s );
void add_character_to_model( int c );
void flush_model( void );




[LISTING TWO]

/*
 * comp-2.c
 *
 * This module is the driver program for a variable order
 * finite context compression program. The maximum order is
 * determined by command line option. This particular version
 * also monitors compression ratios, and flushes the model whenever
 * the local (last 256 symbols) compression ratio hits 90% or higher.
 *
 * To build this program:
 *
 * Turbo C: tcc -w -mc comp-2.c model-2.c bitio.c coder.c
 * QuickC: qcl /AC /W3 comp-2.c model-2.c bitio.c coder.c
 * Zortech: ztc -mc comp-2.c model-2.c bitio.c coder.c
 * *NIX: cc -o comp-2 comp-2.c model-2.c bitio.c coder.c
 *
 * Command line options:
 *
 * -f text_file_name [defaults to test.inp]
 * -c compressed_file_name [defaults to test.cmp]
 * -o order [defaults to 3 for model-2]
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "coder.h"
#include "model.h"
#include "bitio.h"

/*
 * The file pointers are used throughout this module.
 */
FILE *text_file;
FILE *compressed_file;

/*
 * Declarations for local procedures.
 */
void initialize_options( int argc, char **argv );
int check_compression( void );
void print_compression( void );

/*
 * The main procedure is similar to the main found in COMP-1.C.
 * It has to initialize the coder, the bit oriented I/O, the
 * standard I/O, and the model. It then sits in a loop reading
 * input symbols and encoding them. One difference is that every
 * 256 symbols a compression check is performed. If the compression
 * ratio exceeds 90%, a flush character is encoded. This flushes
 * the encoding model, and will cause the decoder to flush its model
 * when the file is being expanded. The second difference is that
 * each symbol is repeatedly encoded until a successful encoding
 * occurs. When trying to encode a character in a particular order,
 * the model may have to transmit an ESCAPE character. If this
 * is the case, the character has to be retransmitted using a lower
 * order. This process repeats until a successful match is found of
 * the symbol in a particular context. Usually this means going down
 * no further than the order -1 model. However, the FLUSH and DONE
 * symbols do drop back to the order -2 model. Note also that by
 * all rights, add_character_to_model() and update_model() logically
 * should be combined into a single routine.
 */
void main( int argc, char **argv )
{
 SYMBOL s;
 int c;
 int escaped;
 int flush = 0;
 long int text_count = 0;

 initialize_options( --argc, ++argv );
 initialize_model();
 initialize_output_bitstream();
 initialize_arithmetic_encoder();

 for ( ; ; )
 {
 if ( ( ++text_count & 0x0ff ) == 0 )
 flush = check_compression();
 if ( !flush )
 c = getc( text_file );
 else
 c = FLUSH;
 if ( c == EOF )
 c = DONE;
 do {
 escaped = convert_int_to_symbol( c, &s );
 encode_symbol( compressed_file, &s );
 } while ( escaped );
 if ( c == DONE )
 break;
 if ( c == FLUSH )
 {
 flush_model();
 flush = 0;
 }
 update_model( c );
 add_character_to_model( c );
 }
 flush_arithmetic_encoder( compressed_file );
 flush_output_bitstream( compressed_file );
 print_compression();
 fputc( '\n', stderr );
 exit( 0 );
}

/*
 * This routine checks for command line options, and opens the
 * input and output files. The only other command line option
 * besides the input and output file names is the order of the model,
 * which defaults to 3.
 */
void initialize_options( int argc, char **argv )
{
 char text_file_name[ 81 ];
 char compressed_file_name[ 81 ];

 strcpy( compressed_file_name, "test.cmp" );
 strcpy( text_file_name, "test.inp" );
 while ( argc-- > 0 )
 {
 if ( strcmp( *argv, "-f" ) == 0 )
 {
 argc--;
 strcpy( text_file_name, *++argv );
 }
 else if ( strcmp( *argv, "-c" ) == 0 )
 {
 argc--;
 strcpy( compressed_file_name, *++argv );
 }
 else if ( strcmp( *argv, "-o" ) == 0 )
 {
 argc--;
 max_order = atoi( *++argv );
 }
 else
 {
 fprintf( stderr, "\nUsage: COMP-2 [-o order] " );
 fprintf( stderr, "[-f text file] [-c compressed file]\n" );
 exit( -1 );
 }
 /* argc was already decremented for this flag by the while test. */
 argv++;
 }
 text_file = fopen( text_file_name, "rb" );
 compressed_file = fopen( compressed_file_name, "wb" );
 if ( text_file == NULL || compressed_file == NULL )
 {
 printf( "Had trouble opening one of the files!\n" );
 exit( -1 );
 }
 setvbuf( text_file, NULL, _IOFBF, 4096 );
 setbuf( stdout, NULL );
 printf( "Compressing %s to %s, order %d.\n",
 text_file_name,
 compressed_file_name,
 max_order );
}

/*
 * This routine is called to print the current compression ratio.
 * It prints out the number of input bytes, the number of output bytes,
 * and the bits per byte compression ratio. This is done both as a
 * pacifier and as a seat-of-the-pants diagnostic. A better version
 * of this routine would also print the local compression ratio.
 */
void print_compression()
{
 long total_input_bytes;
 long total_output_bytes;

 total_input_bytes = ftell( text_file );
 total_output_bytes = bit_ftell_output( compressed_file );
 if ( total_output_bytes == 0 )
 total_output_bytes = 1;

 fprintf( stderr,"%ld/%ld, %2.3f\r",
 total_input_bytes,
 total_output_bytes,
 8.0 * total_output_bytes / total_input_bytes );
}

/*
 * This routine is called once every 256 input symbols. Its job is to
 * check to see if the local compression ratio exceeds 90%. If the
 * output size is more than 90% of the input size, it means not much
 * compression is taking place, so we probably ought to flush the
 * statistics in the model to allow more current statistics to have
 * greater impact. This heuristic approach does seem to have some effect.
 */
int check_compression()
{

 static long local_input_marker = 0L;
 static long local_output_marker = 0L;
 long total_input_bytes;
 long total_output_bytes;
 int local_ratio;

 print_compression();
 total_input_bytes = ftell( text_file ) - local_input_marker;
 total_output_bytes = bit_ftell_output( compressed_file );
 total_output_bytes -= local_output_marker;
 if ( total_output_bytes == 0 )
 total_output_bytes = 1;
 local_ratio = (int)( ( total_output_bytes * 100 ) / total_input_bytes );

 local_input_marker = ftell( text_file );
 local_output_marker = bit_ftell_output( compressed_file );

 if ( local_ratio > 90 && flushing_enabled )
 {
 fprintf( stderr, "Flushing... \r" );
 return( 1 );
 }
 return( 0 );
}
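
The flush decision made by check_compression() can be separated from the file I/O and tested by itself. The sketch below is only an illustration: should_flush is a hypothetical name, and the threshold is passed as a parameter here, whereas the listing hard-codes 90.

```c
/*
 * Hypothetical helper: returns 1 when the output produced for the last
 * block is more than threshold_percent of the input consumed, which
 * suggests the model's statistics have gone stale.
 */
int should_flush( long block_input_bytes,
                  long block_output_bytes,
                  int threshold_percent )
{
    long local_ratio;

    if ( block_input_bytes <= 0 )
        return( 0 );              /* nothing was read, nothing to judge */
    local_ratio = ( block_output_bytes * 100L ) / block_input_bytes;
    return( local_ratio > threshold_percent );
}
```

With a 256-byte block, 250 output bytes gives a ratio of 97 and triggers a flush, while 230 output bytes gives 89 and does not.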




[LISTING THREE]

/*
 * expand-2.c
 *
 * This module is the driver program for a variable order finite
 * context expansion program. The maximum order is determined by
 * command line option. This particular version can respond to
 * the FLUSH code inserted in the bit stream by the compressor.
 *
 * To build this program:
 *
 * Turbo C: tcc -w -mc expand-2.c model-2.c bitio.c coder.c
 * QuickC: qcl /W3 /AC expand-2.c model-2.c bitio.c coder.c
 * Zortech: ztc -mc expand-2.c model-2.c bitio.c coder.c
 * *NIX: cc -o expand-2 expand-2.c model-2.c bitio.c coder.c
 *
 *
 * Command line options:
 *
 * -f text_file_name [defaults to test.out]
 * -c compressed_file_name [defaults to test.cmp]
 * -o order [defaults to 3 for model-2]
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "coder.h"
#include "model.h"
#include "bitio.h"


/*
 * Declarations for local procedures.
 */
void initialize_options( int argc, char **argv );
void print_compression( void );

/*
 * Variables used throughout this particular file.
 */
FILE *text_file;
FILE *compressed_file;

/*
 * The main loop for expansion is very similar to the expansion routine
 * used in the simpler compression program, EXPAND-1.C. The routine
 * first has to initialize the standard i/o, the bit oriented i/o,
 * the arithmetic coder, and the model. The decompression loop
 * differs in a couple of respects. First of all, it handles the
 * special ESCAPE character by removing each one from the input
 * bit stream and otherwise discarding it. Secondly,
 * it handles the special FLUSH character. Once the main decoding
 * loop is done, the cleanup code is called, and the program exits.
 */
void main( int argc, char *argv[] )
{
 SYMBOL s;
 int c;
 int count;
 long int counter = 0;

 initialize_options( --argc, ++argv );
 initialize_model();
 initialize_input_bitstream();
 initialize_arithmetic_decoder( compressed_file );
 for ( ; ; )
 {
 do
 {
 get_symbol_scale( &s );
 count = get_current_count( &s );
 c = convert_symbol_to_int( count, &s );
 remove_symbol_from_stream( compressed_file, &s );
 } while ( c == ESCAPE );
 if ( c == DONE )
 break;
 if ( ( ++counter & 0xff ) == 0 )
 print_compression();
 if ( c != FLUSH )
 putc( (char) c, text_file );
 else
 {
 fprintf( stderr, "\rFlushing... \r" );
 flush_model();
 }
 update_model( c );
 add_character_to_model( c );
 }
 print_compression();
 fputc( '\n', stderr );

 exit( 0 );
}

/*
 * This routine checks for command line options, and opens the
 * input and output files. The only other command line option
 * besides the input and output file names is the order of the model,
 * which defaults to 3.
 */
void initialize_options( int argc, char **argv )
{
 char text_file_name[ 81 ];
 char compressed_file_name[ 81 ];

 strcpy( compressed_file_name, "test.cmp" );
 strcpy( text_file_name, "test.out" );
 while ( argc-- > 0 )
 {
 if ( strcmp( *argv, "-f" ) == 0 )
 {
 argc--;
 strcpy( text_file_name, *++argv );
 }
 else if ( strcmp( *argv, "-c" ) == 0 )
 {
 argc--;
 strcpy( compressed_file_name, *++argv );
 }
 else if ( strcmp( *argv, "-o" ) == 0 )
 {
 argc--;
 max_order = atoi( *++argv );
 }
 else
 {
 fprintf( stderr, "\nUsage: EXPAND-2 [-o order] " );
 fprintf( stderr, "[-f text file] [-c compressed file]\n" );
 exit( -1 );
 }
 /* argc was already decremented for this flag by the while test. */
 argv++;
 }
 text_file = fopen( text_file_name, "wb" );
 compressed_file = fopen( compressed_file_name, "rb" );
 setvbuf( text_file, NULL, _IOFBF, 4096 );
 setbuf( stdout, NULL );
 printf( "Decoding %s to %s, order %d.\n",
 compressed_file_name ,
 text_file_name,
 max_order );
}

/*
 * This routine is called to print the current compression ratio.
 * It prints out the number of input bytes, the number of output bytes,
 * and the bits per byte compression ratio. This is done both as a
 * pacifier and as a seat-of-the-pants diagnostic. A better version
 * of this routine would also print the local compression ratio.
 */

void print_compression()
{
 long input_bytes;
 long output_bytes;

 output_bytes = ftell( text_file );
 input_bytes = bit_ftell_input( compressed_file );
 if ( output_bytes == 0 )
 output_bytes = 1;
 fprintf( stderr,
 "\r%ld/%ld, %2.3f ",
 input_bytes,
 output_bytes,
 8.0 * input_bytes / output_bytes );
}



[LISTING FOUR]

/*
 * model-2.c
 *
 * This module contains all of the modeling functions used with
 * comp-2.c and expand-2.c. This modeling unit keeps track of
 * all contexts from 0 up to max_order, which defaults to 3.
 * In addition, there is a special context -1 which is a fixed model
 * used to encode previously unseen characters, and a context -2
 * which is used to encode EOF and FLUSH messages.
 *
 * Each context is stored in a special CONTEXT structure, which is
 * documented below. Context tables are not created until the
 * context is seen, and they are never destroyed.
 *
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "coder.h"
#include "model.h"

/*
 * This program consumes massive amounts of memory. One way to
 * handle large amounts of memory is to use Zortech's __handle
 * pointer type. So that my code will run with other compilers
 * as well, the handle stuff gets redefined when using other
 * compilers.
 */
#ifdef __ZTC__
#include <handle.h>
#else
#define __handle
#define handle_calloc( a ) calloc( (a), 1 )
#define handle_realloc( a, b ) realloc( (a), (b) )
#define handle_free( a ) free( (a) )
#endif

/* A context table contains a list of the counts for all symbols
 * that have been seen in the defined context. For example, a
 * context of "Zor" might have only had 2 different characters
 * appear. 't' might have appeared 10 times, and 'l' might have
 * appeared once. These two counts are stored in the context
 * table. The counts are stored in the STATS structure. All of
 * the counts for a given context are stored in an array of STATS.
 * As new characters are added to a particular context, the STATS
 * array will grow. Sometimes, the STATS array will shrink
 * after flushing the model.
 */
typedef struct {
 unsigned char symbol;
 unsigned char counts;
 } STATS;

/*
 * Each context has to have links to higher order contexts. These
 * links are used to navigate through the context tables. For example,
 * to find the context table for "ABC", I start at the order 0 table,
 * then find the pointer to the "A" context table by looking through
 * the LINKS array. At that table, we find the "B" link and go to
 * that table. The process continues until the destination table is
 * found. The table pointed to by the LINKS array corresponds to the
 * symbol found at the same offset in the STATS table. The reason that
 * LINKS is in a separate structure instead of being combined with
 * STATS is to save space. All of the leaf context nodes don't need
 * next pointers, since they are in the highest order context. In the
 * leaf nodes, the LINKS array is a NULL pointer.
 */
typedef struct {
 struct context *next;
 } LINKS;

/*
 * The CONTEXT structure holds all of the known information about
 * a particular context. The links and stats pointers are discussed
 * immediately above here. The max_index element gives the maximum
 * index that can be applied to the stats or link array. When the
 * table is first created, and stats is set to NULL, max_index is set
 * to -1. As soon as a single element is added to stats, max_index is
 * incremented to 0.
 *
 * The lesser context pointer is a navigational aid. It points to
 * the context that is one less than the current order. For example,
 * if the current context is "ABC", the lesser_context pointer will
 * point to "BC". The reason for maintaining this pointer is that
 * this particular bit of table searching is done frequently, but
 * the pointer only needs to be built once, when the context is
 * created.
 */
typedef struct context {
 int max_index;
 LINKS __handle *links;
 STATS __handle *stats;
 struct context *lesser_context;
 } CONTEXT;
/*
 * max_order is the maximum order that will be maintained by this
 * program. EXPAND-2 and COMP-2 both will modify this int based
 * on command line parameters.
 */
int max_order=3;
/*
 * *contexts[] is an array of current contexts. If I want to find
 * the order 0 context for the current state of the model, I just
 * look at contexts[0]. This array of context pointers is set up
 * every time the model is updated.
 */
CONTEXT **contexts;
/*
 * current_order contains the current order of the model. It starts
 * at max_order, and is decremented every time an ESCAPE is sent. It
 * will only go down to -1 for normal symbols, but can go to -2 for
 * EOF and FLUSH.
 */
int current_order;
/*
 * This variable tells COMP-2.C that the FLUSH symbol can be
 * sent using this model.
 */
int flushing_enabled=1;
/*
 * This table contains the cumulative totals for the current context.
 * Because this program is using exclusion, totals has to be calculated
 * every time a context is used. The scoreboard array keeps track of
 * symbols that have appeared in higher order models, so that they
 * can be excluded from lower order context total calculations.
 */
short int totals[ 258 ];
char scoreboard[ 256 ];

/*
 * Local procedure declarations.
 */
void error_exit( char *message );
void update_table( CONTEXT *table, unsigned char symbol );
void rescale_table( CONTEXT *table );
void totalize_table( CONTEXT *table );
CONTEXT *shift_to_next_context( CONTEXT *table, unsigned char c, int order);
CONTEXT *allocate_next_order_table( CONTEXT *table,
 unsigned char symbol,
 CONTEXT *lesser_context );

/*
 * This routine has to get everything set up properly so that
 * the model can be maintained properly. The first step is to create
 * the *contexts[] array used later to find current context tables.
 * The *contexts[] array indices go from -2 up to max_order, so
 * the table needs to be fiddled with a little. This routine then
 * has to create the special order -2 and order -1 tables by hand,
 * since they aren't quite like other tables. Then the current
 * context is set to \0, \0, \0, ... and the appropriate tables
 * are built to support that context. The current order is set
 * to max_order, the scoreboard is cleared, and the system is
 * ready to go.
 */

void initialize_model()
{

 int i;
 CONTEXT *null_table;
 CONTEXT *control_table;

 current_order = max_order;
 contexts = (CONTEXT **) calloc( sizeof( CONTEXT * ), 10 );
 if ( contexts == NULL )
 error_exit( "Failure #1: allocating context table!" );
 contexts += 2;
 null_table = (CONTEXT *) calloc( sizeof( CONTEXT ), 1 );
 if ( null_table == NULL )
 error_exit( "Failure #2: allocating null table!" );
 null_table->max_index = -1;
 contexts[ -1 ] = null_table;
 for ( i = 0 ; i <= max_order ; i++ )
 contexts[ i ] = allocate_next_order_table( contexts[ i-1 ],
 0,
 contexts[ i-1 ] );
 handle_free( (char __handle *) null_table->stats );
 null_table->stats =
 (STATS __handle *) handle_calloc( sizeof( STATS ) * 256 );
 if ( null_table->stats == NULL )
 error_exit( "Failure #3: allocating null table!" );
 null_table->max_index = 255;
 for ( i=0 ; i < 256 ; i++ )
 {
 null_table->stats[ i ].symbol = (unsigned char) i;
 null_table->stats[ i ].counts = 1;
 }

 control_table = (CONTEXT *) calloc( sizeof(CONTEXT), 1 );
 if ( control_table == NULL )
 error_exit( "Failure #4: allocating control table!" );
 control_table->stats =
 (STATS __handle *) handle_calloc( sizeof( STATS ) * 2 );
 if ( control_table->stats == NULL )
 error_exit( "Failure #5: allocating control table!" );
 contexts[ -2 ] = control_table;
 control_table->max_index = 1;
 control_table->stats[ 0 ].symbol = -FLUSH;
 control_table->stats[ 0 ].counts = 1;
 control_table->stats[ 1 ].symbol = -DONE;
 control_table->stats[ 1 ].counts = 1;

 for ( i = 0 ; i < 256 ; i++ )
 scoreboard[ i ] = 0;
}
/*
 * This is a utility routine used to create new tables when a new
 * context is created. It gets a pointer to the current context,
 * and gets the symbol that needs to be added to it. It also needs
 * a pointer to the lesser context for the table that is to be
 * created. For example, if the current context was "ABC", and the
 * symbol 'D' was read in, add_character_to_model would need to
 * create the new context "BCD". This routine would get called
 * with a pointer to "BC", the symbol 'D', and a pointer to context
 * "CD". This routine then creates a new table for "BCD", adds it
 * to the link table for "BC", and gives "BCD" a back pointer to
 * "CD". Note that finding the lesser context is a difficult
 * task, and isn't done here. This routine mainly worries about
 * modifying the stats and links fields in the current context.
 */

CONTEXT *allocate_next_order_table( CONTEXT *table,
 unsigned char symbol,
 CONTEXT *lesser_context )
{
 CONTEXT *new_table;
 int i;
 unsigned int new_size;

 for ( i = 0 ; i <= table->max_index ; i++ )
 if ( table->stats[ i ].symbol == symbol )
 break;
 if ( i > table->max_index )
 {
 table->max_index++;
 new_size = sizeof( LINKS );
 new_size *= table->max_index + 1;
 if ( table->links == NULL )
 table->links = (LINKS __handle *) handle_calloc( new_size );
 else
 table->links = (LINKS __handle *)
 handle_realloc( (char __handle *) table->links, new_size );
 new_size = sizeof( STATS );
 new_size *= table->max_index + 1;
 if ( table->stats == NULL )
 table->stats = (STATS __handle *) handle_calloc( new_size );
 else
 table->stats = (STATS __handle *)
 handle_realloc( (char __handle *) table->stats, new_size );
 if ( table->links == NULL )
 error_exit( "Failure #6: allocating new table" );
 if ( table->stats == NULL )
 error_exit( "Failure #7: allocating new table" );
 table->stats[ i ].symbol = symbol;
 table->stats[ i ].counts = 0;
 }
 new_table = (CONTEXT *) calloc( sizeof( CONTEXT ), 1 );
 if ( new_table == NULL )
 error_exit( "Failure #8: allocating new table" );
 new_table->max_index = -1;
 table->links[ i ].next = new_table;
 new_table->lesser_context = lesser_context;
 return( new_table );
}

/*
 * This routine is called to increment the counts for the current
 * contexts. It is called after a character has been encoded or
 * decoded. All it does is call update_table for each of the
 * current contexts, which does the work of incrementing the count.
 * This particular version of update_model() practices update exclusion,
 * which means that if lower order models weren't used to encode
 * or decode the character, they don't get their counts updated.
 * This seems to improve compression performance quite a bit.
 * To disable update exclusion, the loop would be changed to run
 * from 0 to max_order, instead of current_order to max_order.
 */
void update_model( int symbol )
{
 int i;
 int local_order;

 if ( current_order < 0 )
 local_order = 0;
 else
 local_order = current_order;
 if ( symbol >= 0 )
 {
 while ( local_order <= max_order )
 {
 if ( symbol >= 0 )
 update_table( contexts[ local_order ], (unsigned char) symbol );
 local_order++;
 }
 }
 current_order = max_order;
 for ( i = 0 ; i < 256 ; i++ )
 scoreboard[ i ] = 0;
}
/*
 * This routine is called to update the count for a particular symbol
 * in a particular table. The table is one of the current contexts,
 * and the symbol is the last symbol encoded or decoded. In principle
 * this is a fairly simple routine, but a couple of complications make
 * things a little messier. First of all, the given table may not
 * already have the symbol defined in its statistics table. If it
 * doesn't, the stats table has to grow and have the new guy added
 * to it. Secondly, the symbols are kept in sorted order by count
 * in the table so that the table can be trimmed during the flush
 * operation. When this symbol is incremented, it might have to be moved
 * up to reflect its new rank. Finally, since the counters are only
 * bytes, if the count reaches 255, the table absolutely must be rescaled
 * to get the counts back down to a reasonable level.
 */
void update_table( CONTEXT *table, unsigned char symbol )
{
 int i;
 int index;
 unsigned char temp;
 CONTEXT *temp_ptr;
 unsigned int new_size;
/*
 * First, find the symbol in the appropriate context table. The first
 * symbol in the table is the most active, so start there.
 */
 index = 0;
 while ( index <= table->max_index &&
 table->stats[index].symbol != symbol )
 index++;
 if ( index > table->max_index )
 {
 table->max_index++;
 new_size = sizeof( LINKS );
 new_size *= table->max_index + 1;
 if ( current_order < max_order )
 {
 if ( table->max_index == 0 )
 table->links = (LINKS __handle *) handle_calloc( new_size );
 else
 table->links = (LINKS __handle *)
 handle_realloc( (char __handle *) table->links, new_size );
 if ( table->links == NULL )
 error_exit( "Error #9: reallocating table space!" );
 table->links[ index ].next = NULL;
 }
 new_size = sizeof( STATS );
 new_size *= table->max_index + 1;
 if (table->max_index==0)
 table->stats = (STATS __handle *) handle_calloc( new_size );
 else
 table->stats = (STATS __handle *)
 handle_realloc( (char __handle *) table->stats, new_size );
 if ( table->stats == NULL )
 error_exit( "Error #10: reallocating table space!" );
 table->stats[ index ].symbol = symbol;
 table->stats[ index ].counts = 0;
 }
/*
 * Now I move the symbol to the front of its list.
 */
 i = index;
 while ( i > 0 &&
 table->stats[ index ].counts == table->stats[ i-1 ].counts )
 i--;
 if ( i != index )
 {
 temp = table->stats[ index ].symbol;
 table->stats[ index ].symbol = table->stats[ i ].symbol;
 table->stats[ i ].symbol = temp;
 if ( table->links != NULL )
 {
 temp_ptr = table->links[ index ].next;
 table->links[ index ].next = table->links[ i ].next;
 table->links[ i ].next = temp_ptr;
 }
 index = i;
 }
/*
 * The switch has been performed, now I can update the counts
 */
 table->stats[ index ].counts++;
 if ( table->stats[ index ].counts == 255 )
 rescale_table( table );
}

/*
 * This routine is called when a given symbol needs to be encoded.
 * It is the job of this routine to find the symbol in the context
 * table associated with the current order, and return the low and
 * high counts associated with that symbol, as well as the scale.
 * Finding the table is simple. Unfortunately, once I find the table,
 * I have to build the table of cumulative counts, which is
 * expensive, and is done elsewhere. If the symbol is found in the
 * table, the appropriate counts are returned. If the symbol is
 * not found, the ESCAPE symbol probabilities are returned, and
 * the current order is reduced. Note also the kludge to support
 * the order -2 character set, which consists of negative numbers
 * instead of unsigned chars. This ensures that no match will ever
 * be found for the EOF or FLUSH symbols in any "normal" table.
 */
int convert_int_to_symbol( int c, SYMBOL *s )
{
 int i;
 CONTEXT *table;

 table = contexts[ current_order ];
 totalize_table( table );
 s->scale = totals[ 0 ];
 if ( current_order == -2 )
 c = -c;
 for ( i = 0 ; i <= table->max_index ; i++ )
 {
 if ( c == (int) table->stats[ i ].symbol )
 {
 if ( table->stats[ i ].counts == 0 )
 break;
 s->low_count = totals[ i+2 ];
 s->high_count = totals[ i+1 ];
 return( 0 );
 }
 }
 s->low_count = totals[ 1 ];
 s->high_count = totals[ 0 ];
 current_order--;
 return( 1 );
}
/*
 * This routine is called when decoding an arithmetic number. In
 * order to decode the present symbol, the current scale in the
 * model must be determined. This requires looking up the current
 * table, then building the totals table. Once that is done, the
 * cumulative total table has the symbol scale at element 0.
 */
void get_symbol_scale( SYMBOL *s )
{
 CONTEXT *table;

 table = contexts[ current_order ];
 totalize_table( table );
 s->scale = totals[ 0 ];
}

/*
 * This routine is called during decoding. It is given a count that
 * came out of the arithmetic decoder, and has to find the symbol that
 * matches the count. The cumulative totals are already stored in the
 * totals[] table, from the call to get_symbol_scale, so this routine
 * just has to look through that table. Once the match is found,
 * the appropriate character is returned to the caller. There are two
 * possible complications. First, the character might be the ESCAPE
 * character, in which case the current_order has to be decremented.
 * The other complication is that the order might be -2, in which case
 * we return the negative of the symbol so it isn't confused with a
 * normal symbol.
 */
int convert_symbol_to_int( int count, SYMBOL *s)
{
 int c;
 CONTEXT *table;

 table = contexts[ current_order ];
 for ( c = 0; count < totals[ c ] ; c++ )
 ;
 s->high_count = totals[ c-1 ];
 s->low_count = totals[ c ];
 if ( c == 1 )
 {
 current_order--;
 return( ESCAPE );
 }
 if ( current_order < -1 )
 return( (int) -table->stats[ c-2 ].symbol );
 else
 return( table->stats[ c-2 ].symbol );
}


/*
 * After the model has been updated for a new character, this routine
 * is called to "shift" into the new context. For example, if the
 * last context was "ABC", and the symbol 'D' had just been processed,
 * this routine would want to update the context pointers so that
 * contexts[1]=="D", contexts[2]=="CD" and contexts[3]=="BCD". The
 * potential problem is that some of these tables may not exist.
 * The way this is handled is by the shift_to_next_context routine.
 * It is passed a pointer to the "ABC" context, along with the symbol
 * 'D', and its job is to return a pointer to "BCD". Once we have
 * "BCD", we can follow the lesser context pointers in order to get
 * the pointers to "CD" and "D". The hard work was done in
 * shift_to_next_context().
 */
void add_character_to_model( int c )
{
 int i;
 if ( max_order < 0 || c < 0 )
 return;
 contexts[ max_order ] =
 shift_to_next_context( contexts[ max_order ],
 (unsigned char) c, max_order );
 for ( i = max_order-1 ; i > 0 ; i-- )
 contexts[ i ] = contexts[ i+1 ]->lesser_context;
}

/*
 * This routine is called when adding a new character to the model. From
 * the previous example, if the current context was "ABC", and the new
 * symbol was 'D', this routine would get called with a pointer to
 * context table "ABC", and symbol 'D', with order max_order. What this
 * routine needs to do then is to find the context table "BCD". This
 * should be an easy job, and it is if the table already exists. All
 * we have to do in that case is follow the back pointer from "ABC" to
 * "BC". We then search the link table of "BC" until we find the link
 * to "D". That link points to "BCD", and that value is then returned to
 * the caller. The problem crops up when "BC" doesn't have a pointer to
 * "BCD". This generally means that the "BCD" context has not appeared
 * yet. When this happens, it means a new table has to be created and
 * added to the "BC" table. That can be done with a single call to
 * the allocate_next_order_table routine. The only problem is that the
 * allocate_next_order_table routine wants to know what the lesser
 * context for the new table is going to be. In other words, when I
 * create "BCD", I need to know where "CD" is located. In order to find
 * "CD", I have to recursively call shift_to_next_context, passing it a
 * pointer to context "C" and the symbol 'D'. It then returns a pointer
 * to "CD", which I use to create the "BCD" table. The recursion is
 * guaranteed to end if it ever gets to order -1, because the null table
 * is guaranteed to have a link for every symbol to the order 0 table.
 * This is the most complicated part of the modeling program, but it is
 * necessary for performance reasons.
 */
CONTEXT *shift_to_next_context( CONTEXT *table, unsigned char c, int order)
{
 int i;
 CONTEXT *new_lesser;
/*
 * First, try to find the new context by backing up to the lesser
 * context and searching its link table. If I find the link, we take
 * a quick and easy exit, returning the link. Note that there is a
 * special kludge for context order 0. We know for a fact that
 * the lesser context pointer at order 0 points to the null table,
 * order -1, and we know that the -1 table only has a single link
 * pointer, which points back to the order 0 table.
 */
 table = table->lesser_context;
 if ( order == 0 )
 return( table->links[ 0 ].next );
 for ( i = 0 ; i <= table->max_index ; i++ )
 if ( table->stats[ i ].symbol == c )
 if ( table->links[ i ].next != NULL )
 return( table->links[ i ].next );
 else
 break;
/*
 * If I get here, it means the new context did not exist. I have to
 * create the new context, add a link to it here, and add the backwards
 * link to *his* previous context. Creating the table and adding it to
 * this table is pretty easy, but adding the back pointer isn't. Since
 * creating the new back pointer isn't easy, I duck my responsibility
 * and recurse to myself in order to pick it up.
 */
 new_lesser = shift_to_next_context( table, c, order-1 );
/*
 * Now that I have the back pointer for this table, I can make a call
 * to a utility to allocate the new table.
 */
 table = allocate_next_order_table( table, c, new_lesser );
 return( table );
}

/*
 * Rescaling the table needs to be done for one of three reasons.
 * First, if the maximum count for the table has exceeded 16383, it
 * means that arithmetic coding using 16 and 32 bit registers might
 * no longer work. Secondly, if an individual symbol count has
 * reached 255, it will no longer fit in a byte. Third, if the
 * current model isn't compressing well, the compressor program may
 * want to rescale all tables in order to give more weight to newer
 * statistics. All this routine does is divide each count by 2.
 * If any counts drop to 0, the counters can be removed from the
 * stats table, but only if this is a leaf context. Otherwise, we
 * might cut a link to a higher order table.
 */
void rescale_table( CONTEXT *table )
{
 int i;

 if ( table->max_index == -1 )
 return;
 for ( i = 0 ; i <= table->max_index ; i++ )
 table->stats[ i ].counts /= 2;
 if ( table->stats[ table->max_index ].counts == 0 &&
 table->links == NULL )
 {
 while ( table->max_index >= 0 &&
 table->stats[ table->max_index ].counts == 0 )
 table->max_index--;
 if ( table->max_index == -1 )
 {
 handle_free( (char __handle *) table->stats );
 table->stats = NULL;
 }
 else
 {
 table->stats = (STATS __handle *)
 handle_realloc( (char __handle *) table->stats,
 sizeof( STATS ) * ( table->max_index + 1 ) );
 if ( table->stats == NULL )
 error_exit( "Error #11: reallocating stats space!" );
 }
 }
}

/*
 * This routine has the job of creating a cumulative totals table for
 * a given context. The cumulative low and high for symbol c are going to
 * be stored in totals[c+2] and totals[c+1]. Locations 0 and 1 are
 * reserved for the special ESCAPE symbol. The ESCAPE symbol
 * count is calculated dynamically, and changes based on what the
 * current context looks like. Note also that this routine ignores
 * any counts for symbols that have already showed up in the scoreboard,
 * and it adds all new symbols found here to the scoreboard. This
 * allows us to exclude counts of symbols that have already appeared in
 * higher order contexts, improving compression quite a bit.
 */
void totalize_table( CONTEXT *table )
{
 int i;
 unsigned char max;

 for ( ; ; )
 {

 max = 0;
 i = table->max_index + 2;
 totals[ i ] = 0;
 for ( ; i > 1 ; i-- )
 {
 totals[ i-1 ] = totals[ i ];
 if ( table->stats[ i-2 ].counts )
 if ( ( current_order == -2 ) ||
 scoreboard[ table->stats[ i-2 ].symbol ] == 0 )
 totals[ i-1 ] += table->stats[ i-2 ].counts;
 if ( table->stats[ i-2 ].counts > max )
 max = table->stats[ i-2 ].counts;
 }
/*
 * Here is where the escape calculation needs to take place.
 */
 if ( max == 0 )
 totals[ 0 ] = 1;
 else
 {
 totals[ 0 ] = (short int) ( 256 - table->max_index );
 totals[ 0 ] *= table->max_index;
 totals[ 0 ] /= 256;
 totals[ 0 ] /= max;
 totals[ 0 ]++;
 totals[ 0 ] += totals[ 1 ];
 }
 if ( totals[ 0 ] < MAXIMUM_SCALE )
 break;
 rescale_table( table );
 }
 for ( i = 0 ; i < table->max_index ; i++ )
 if (table->stats[i].counts != 0)
 scoreboard[ table->stats[ i ].symbol ] = 1;
}

/*
 * This routine is called when the entire model is to be flushed.
 * This is done in an attempt to improve the compression ratio by
 * giving greater weight to upcoming statistics. This routine
 * starts at the given table, and recursively calls itself to
 * rescale every table in its list of links. The table itself
 * is then rescaled.
 */
void recursive_flush( CONTEXT *table )
{
 int i;

 if ( table->links != NULL )
 for ( i = 0 ; i <= table->max_index ; i++ )
 if ( table->links[ i ].next != NULL )
 recursive_flush( table->links[ i ].next );
 rescale_table( table );
}

/*
 * This routine is called to flush the whole table, which it does
 * by calling the recursive flush routine starting at the order 0
 * table.
 */
void flush_model()
{
 recursive_flush( contexts[ 0 ] );
}

void error_exit( char *message)
{
 putc( '\n', stdout );
 puts( message );
 exit( -1 );
}



February, 1991
ENTROPY


The key to data compression




Kas Thomas


Kas is a consultant specializing in the design and implementation of
ultrahigh-speed data compression algorithms. He can be reached at 578
Fairfield Ave., Stamford, CT 06902.


"Omit needless words! Omit needless words! Omit needless words!"
-- Will Strunk, Jr.
Data compression is often thought of as an exercise in redundancy
removal. Actually, it is much more. Data compression cuts right to the heart
of one of the two classical problems of information theory -- how best to
encode a message. (The other is how best to send a message in the presence of
noise.) "Best" here is taken to mean most efficient, in terms of bits per
symbol. When a message has been expressed in the fewest possible bits per
symbol, it is said to be optimally encoded. No bits are wasted. This is the
goal of data compression.
Claude Shannon was among the first to try to quantify the encoding efficiency
of various coding schemes, including ordinary English text. One of Shannon's
favorite investigative techniques was a simple cocktail-party game in which he
would pick a page in a book at random and (while reading a portion of text out
loud, one letter at a time) have volunteers try to guess what the next
letter(s) would be, based solely on the letters that had come before. If the
player could not guess correctly, Shannon would give the player the correct
answer, and that would form the "clue" for the next round. Shannon kept a
tally of correct and incorrect guesses as the game went on. At the end of the
game, it was possible to tally the redundancy (or encoding inefficiency) of
the given portion of text. For example, the outcome of one game might be that
the player correctly guessed 60 out of 90 letters in a stretch of text. This
would imply that two thirds of the letters were redundant, because the player
could predict them in advance, based on conventional spelling rules, grammar,
and usage.
Shannon concluded that ordinary English text is anywhere from 70 to 80 percent
redundant. This, in turn, implies that only about 2 bits per 8-bit byte of
text stored on disk (or in RAM) actually contain information -- the remaining
bits are redundant. Shannon would turn this statement around and say that the
average information content of English is 2 bits per symbol, or thereabouts.
Not content to let party-goers determine the information content of data
streams, Shannon looked for ways to calculate the information content of
various "messages." The strategy he used was disarmingly simple. Let the unit
of information be the "bit," yes or no, one or zero. Let a message (or event)
be deemed informative only to the extent that it resolves uncertainty in the
mind of the observer. If the observer already knows (or can correctly guess) a
message, then that message conveys no information. If the message cannot be
guessed, it does convey information.
Suppose our alphabet is only two letters long. The information conveyed by a
stream of bits (each bit representing one letter of our alphabet) is inversely
proportional to the predictability of the bits' values. This is made clearer
by imagining that our bit values represent opposite sides of a coin (1 for
heads, 0 for tails). Each toss of the coin resolves an uncertainty of 1 bit --
assuming the coin lands heads-up half the time and tails-up half the time. But
consider the case of a weighted coin: Suppose we know (from experience) that
our "dishonest nickel" falls heads-up three-fourths of the time. How much
uncertainty is resolved with each toss? Clearly, it must be less than 1 bit,
because if we simply guess "heads" 100 percent of the time, we'll be correct
more often than not when attempting to guess the outcome of successive coin
tosses. What it means is that the information efficiency of the toss has been
degraded. And we can calculate the amount.
Start by reducing everything to its probability of occurrence. An honest
nickel has a 0.50 chance of turning up heads, and an equal chance of turning
up tails, and therefore an informational degree of freedom corresponding to
-0.50 * log(0.50) for heads, plus -0.50 * log(0.50) for tails, or a total of
1.00 bit per toss. (Here, we mean base-2 logarithm when we say "log.") By
contrast, the dishonest nickel is constrained so that its informational degree
of freedom is as illustrated in Example 1. In other words, if an honest nickel
is telling us 1 bit of information per toss, a dishonest nickel that falls
heads-up 75 percent of the time is telling us only 0.811 bits of information
per toss. You could say that each toss is 18.9 percent redundant!
Example 1: The informational degree of freedom of the dishonest nickel

   - (0.75) * log(0.75)     for heads
 + - (0.25) * log(0.25)     for tails
 -------------------------------------
 TOTAL = - [ 0.75 * (-0.415) + 0.25 * (-2) ] = 0.811 bit per toss

With English text, we have an alphabet of 26 characters, representing (if you
will) 26 possible outcomes for each "toss" (each symbol), and thus 4.70 bits
per "toss" if all outcomes are equally probable. But we know that all outcomes
are not equally likely in English text. The 26-sided die has been weighted
so that "e" turns up more often than any other letter, with "x" occurring much
less frequently than "s," and so forth. To obtain the informational degree of
freedom of English text, we need to determine the probability P of occurrence
of each of the 26 letters of the alphabet, then sum the terms -P log(P) for
all letters.
Shannon reserved a special name for this quantity, which we've been calling
the "informational degree of freedom." He called it entropy, in honor of the
fact that the equation that expresses it is of the same form as the equation
derived by Boltzmann for thermodynamic entropy, namely S = k log(W) where S is
entropy, W is the number of ways in which the parts of the system can be
rearranged, and k (for computations involving gas molecules, etc.) is a
fundamental constant of nature, now known as Boltzmann's constant.
In information theory as well as thermodynamics, entropy is a measure of
freedom of choice. It represents the average uncertainty as to which of many
states a system might be in. For a data stream, it's the minimum average
number of bits per symbol required to encode the message.
Because entropy calculation is a straightforward matter, it's also easy to
determine the degree of redundancy in a file. If a data file is represented as
8-bit bytes, we can obtain the redundancy of the file by subtracting its
entropy from 8. (The answer will be in bits per byte.) A Turbo C program
(called ENTROPY.C) to calculate the entropy and apparent redundancy of a file
is shown in Listing One. Note that this program calculates the entropy of a
message with respect to a certain model, in this case the order-0 finite
context model. An order-1 finite context model would likely produce a
different calculation. Furthermore, there are other ways of modeling data, all
of which may generate vastly different calculations. (As an aside, you can
think of Shannon's model as a readily available neural network, the human
brain.)
Listing One is short and self-explanatory. It is appropriate to point out,
however, that the calculation method used in this program offers a first
approximation only. Shannon himself saw each symbol (indeed, each string) in a
message as constituting the "source" of the symbol located immediately
downstream of itself. For instance, in the word "quiet," the "q" can be
considered a message source for the message "u," just as "qu" can be
considered a message source for "i," and "ui" a source for "e," and so on. The
central idea here is that context, as well as statistical abundance,
determines information content. A more accurate entropy estimate is obtained
when context (upstream and downstream characters) is taken into account. This
can be done by tallying character frequencies in a two-dimensional array, with
one dimension given by an upstream character and the other given by the
character currently being read. (The "alphabet" associated with "q" is just
one letter long: namely "u." Because there is only one allowable following
symbol for "q," the "u" contributes nothing to the information of the message,
and its entropy contribution in the context of "q" is zero.)
Having said all this, the zero-order or static entropy calculation offered in
ENTROPY.C nonetheless gives a very good first approximation to entropy. For
example, running ENTROPY on itself (the file ENTROPY.EXE being some 25K in
size) yields an entropy estimate of 6.933 bits per byte, for a file redundancy
of 13 percent. Using ARC 6.02 to compress the same file resulted in a 14
percent size reduction. Running ENTROPY on COMMAND.COM (Compaq DOS 3.3) gave
an entropy estimate of 6.436 bits per byte, for an apparent redundancy of 20
percent; ARC reduced the file by 22 percent.
Knowing nothing else about a file other than the frequency of occurrence of
its constituent bytes, and using no data compression techniques whatsoever, we
are able to calculate its redundancy to an astonishingly accurate degree. When
you think about it, this is pretty amazing. (It wasn't long after Shannon
published his work in this area, of course, that a fellow by the name of
Huffman devised an algorithm to exploit statistical redundancy in files, to
achieve more efficient coding.)
Understanding entropy is basic to understanding data compression. Only when
data compression is seen in the context of efficient information encoding (as
opposed to mere redundancy removal) can true insight into the data compression
problem be obtained.


References


1. Huffman, D.A. "A Method for the Construction of Minimum-Redundancy Codes."
Proceedings of the Institute of Radio Engineers 40 (9), 1098-1101.
September, 1952.
2. Shannon, C.E. & W. Weaver. The Mathematical Theory of Communication.
Urbana, Illinois: Univ. of Illinois Press, 1949.
3. Lelewer, Debra A. and Hirschberg, Daniel S. "Data Compression." ACM
Computing Surveys, vol. 19, no 3. New York, September, 1987.


_ENTROPY_
by Kas Thomas



[LISTING ONE]

/* * * * * * * * * * * * * * * * * ENTROPY.C * * * * * * * * * * * * * * * */
/* Calculates zero-order entropy of a file, a la Shannon. */

/* Turbo C version by Kas Thomas */
/* You may distribute this listing to fellow programmers. Please retain */
/* authorship notices, however. */
/* This program will give an approximate measure of how compressible a */
/* given file is using Huffman-type compression techniques. It calculates */
/* the best compression possible using order-0 finite context modelling. */
/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

#include<stdio.h>
#include<stdlib.h>
#include<math.h>

#define LOG(x) (log10(x) / log10(2.0)) /* base-2 log macro */
#define ENTROPY(x) (-((x) * LOG(x))) /* classic definition of entropy */

FILE *in; /* input file pointer */
unsigned long table[256]; /* count data goes here (long, so a 16-bit int can't overflow) */

void read_input(void);
double analyze(void);
void usage(void);

 /* ------------------- MAIN ------------------ */
int main(int ac, char **av)
{
double result; /* return value of analyze() */

if (ac==1) usage(); /* explain program usage & exit */

in = fopen(av[1],"rb"); /* open the input file */
if (!in) { /* exit program if file couldn't be opened */
 printf("\nCouldn't open input file.\n"); /* error message */
 exit(EXIT_FAILURE);
}

printf("\n ** Reading file . . ."); /* status message */
read_input(); /* read file & tally character frequencies */
printf("\n ** Calculating . . .\n"); /* status message */
result = analyze(); /* analyze the frequency data */

 /* finally, print the results to the screen */
printf("\n The file \"%s\" has a zero-order",av[1]);
printf("\n entropy of %3.3f bits per byte.\n",result);
printf("\n Approximate shrinkage potential");
printf("\n using Huffman techniques:");
printf(" %2.0f%%\n\n\n",100-(result * 100)/8);

fclose(in); /* close file */
return (0); /* report success to the caller */
} /* end function main() */

 /* ----------------------- read_input() ----------------------- */
void read_input()
{
 int ch;

 while (( ch = getc(in)) != EOF) /* until EOF reached . . . */
 table[ch]++; /* read a byte at a time & tally char counts */
} /* end function read_input() */

 /* ----------------------- analyze() -------------------------- */

double analyze()
{
 double accum = 0.0; /* entropy will accumulate here */
 double freq; /* frequency of occurrence of character */
 long fsize = 0L; /* input file's size */
 register int z; /* scratch variable */

 fsize = ftell( in ); /* get file size */
 for (z = 0; z < 256; z++) /* for every position in table */
 if (table[z]) /* if data exists */
 {
 freq = (double) table[z]/fsize; /* calculate frequency */
 accum += (double) ENTROPY(freq); /* get entropy contribution */
 }
 return accum;
} /* end analyze() */

 /* --------------------------------- usage() -------------------------- */
 /* Explain program & exit. */
void usage()
{
printf("\n\n");
printf(" Entropy v1.00 by Kas Thomas. Public Domain.\n\n");
printf(" Syntax: ENTROPY {filename} [Enter]\n\n");
printf(" Entropy is a measure of information storage efficiency.\n");
printf(" This program calculates a file's entropy, hence its\n");
printf(" compressibility, using the entropy equation of Shannon.\n");
printf(" (See \"Information Theory: Symbols, Signals, & Noise,\"\n");
printf(" by John Pierce, Dover, 1981).\n\n");
exit(1);
} /* end function usage() */



February, 1991
DIFFERENTIAL IMAGE COMPRESSION


An ideal technique for compressing animated sequences




John Bridges


John Bridges is the author of GRASP (Paul Mace Software), PC Paint (Mouse
Systems), Imagetools (HSC Software), and a slew of freeware. He is currently
working on new versions of his products, as well as a long-term project with
the IBM Multi-media Lab. He can be found haunting the PICS forum on CompuServe
and can be reached at [73307,606].


Differential (DFF) image storage takes advantage of the inherent similarity
between frames in a sequence by keeping track of only the differences between
images rather than the images themselves. DFF is therefore particularly
effective for compressing animated sequences. For example, many Saturday
morning cartoons have an entirely still or fixed background with a few
characters moving in front of it. Rather than storing the background over and
over again, the DFF technique is used to store the background in the first
frame of data, then concentrate on the changes in those characters that
actually move on successive frames.
The savings in disk space are dramatic. Even a simple DFF algorithm exceeds the
space savings performance of complicated and slow general-purpose,
single-image compression algorithms. In fact, you can easily decode a DFF
format in real time to display memory.
Another advantage of DFF-encoded animation is that you can reduce the number
of bytes per frame that need to be written to video RAM. This is especially
important on machines that have slow video RAM access (the PS/2 Model 70, for
example, has a ratio of about 10:1 between regular RAM and video RAM access
times). Often on these machines, the majority of animation computing time
is spent waiting for the machine to write data into video RAM.
Although it is possible to store differences between pixels that are not a
byte in size, for this article I'm going to stick to byte-oriented
differences, because storing the differences between individual pixels that
are smaller than a byte can take more computing power -- twiddling all those
bits around -- than it's worth for most applications.


Storing DFF Information


There are two ways to store DFF information: bitmap and skip/copy.
Bitmap DFF data is a bit table with one bit for each possible byte position.
Each time a bit is true, a byte has to be copied from a table of changed bytes
which follow. In Figure 1(a), the two "images" are represented by strings of
40 characters. When the vertically adjacent letter pairs are the same, the
result (represented below the two images) is a 0 (false, or no difference).
When they are different, the corresponding bit is a 1 (true). The result is a
bit table of 40 bits, or 5 bytes, followed by the 14 bytes that have changed,
as in Figure 1(b). As you can see, the result is a total of 19 bytes in
length.
Figure 1: Bitmapped storage of DFF information

 (a)

 Image 1: AAAABBBBBBCCCCHHHHHHHDDDDEEEFFFGGGGGGBBB
 Image 2: AAAGGGGAAAACCCCHHHHHHHDDDDEEEFFFGGGGGGBB
 0001111111100010000001000100100100000100

 (b)

 Byte 1 Byte 2 Byte 3 Byte 4 Byte 5 14 changed bytes
 ------------------------------------------------------------------

 00011111 11100010 00000100 01001001 00000100 GGGGAAAACHDEFG
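
The bitmap scheme of Figure 1 can be sketched in a few lines of C. This is an illustration rather than production code: the function name dff_bitmap_encode() is hypothetical, and packing the flag bits most significant bit first is an assumption chosen to match the byte values shown in Figure 1(b).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare two equal-sized frames byte by byte. Writes one flag bit per
   byte position (most significant bit first, 1 = changed) to out,
   followed by the changed bytes taken from cur. The out buffer must hold
   (len + 7) / 8 + len bytes. Returns the number of bytes written. */
size_t dff_bitmap_encode(const unsigned char *prev, const unsigned char *cur,
                         size_t len, unsigned char *out)
{
    size_t map_bytes = (len + 7) / 8;   /* size of the bit table */
    size_t changed = 0;                 /* changed bytes stored so far */
    size_t i;

    assert(prev != NULL && cur != NULL && out != NULL);
    memset(out, 0, map_bytes);
    for (i = 0; i < len; i++)
        if (prev[i] != cur[i]) {
            out[i / 8] |= (unsigned char) (0x80 >> (i % 8));
            out[map_bytes + changed++] = cur[i];
        }
    return map_bytes + changed;
}
```

Run on the two 40-character "images" of Figure 1, the sketch produces the same 19 bytes: a 5-byte bit table followed by GGGGAAAACHDEFG.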

Skip/copy, on the other hand, stores the number of bytes to skip and the
number of new bytes to copy. Using the same images as in Figure 1, you can see
in Figure 2(a) which bytes are skipped and which are to be copied with the
results shown in Figure 2(b).
If a single byte is used to represent whether to skip or to copy (with
negative numbers being the number of bytes to skip, and positive numbers the
number of bytes to copy), the byte stream looks like Figure 2(c). The result
is 29 bytes long.
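
A skip/copy encoder along these lines might look like the following sketch. The function name dff_skipcopy_encode() is hypothetical, and capping each count at 127 so that it fits in a signed byte is my assumption; the text does not spell out how longer runs are split.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode the differences between prev and cur as a skip/copy stream.
   A negative count byte (two's complement) means "skip that many
   unchanged bytes"; a positive count byte means "copy that many bytes,"
   and the bytes to copy follow it. Counts are capped at 127. The out
   buffer must hold 2 * len bytes. Returns the number of bytes written. */
size_t dff_skipcopy_encode(const unsigned char *prev, const unsigned char *cur,
                           size_t len, unsigned char *out)
{
    size_t i = 0, n = 0;

    assert(prev != NULL && cur != NULL && out != NULL);
    while (i < len) {
        size_t start = i;
        if (prev[i] == cur[i]) {            /* run of unchanged bytes */
            while (i < len && prev[i] == cur[i] && i - start < 127)
                i++;
            out[n++] = (unsigned char) -(int) (i - start);
        } else {                            /* run of changed bytes */
            while (i < len && prev[i] != cur[i] && i - start < 127)
                i++;
            out[n++] = (unsigned char) (i - start);
            memcpy(out + n, cur + start, i - start);
            n += i - start;
        }
    }
    return n;
}
```

For the two images of Figure 1, the sketch emits 29 bytes, beginning with -3, 8, and GGGGAAAA, and ending with a skip of 2.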
To save even more space, you can represent runs of the same byte to copy with
a count of the number of times to repeat that byte, followed by the byte to
repeat. This is called run-length encoding. For this example I'll use the
numbers from 1 to 63 for the number of bytes to copy, and 64 to 127 for the
number of times (plus 64) to repeat the byte which follows as in Figure 2(d).
This reduces the data from 29 to 24 bytes.
Figure 2: Skip/copy storage of DFF information

 (a)

 Image 1: AAAABBBBBBCCCCHHHHHHHDDDDEEEFFFGGGGGGBBB
 Image 2: AAAGGGGAAAACCCCHHHHHHHDDDDEEEFFFGGGGGGBB
 Changes: GGGGAAAA C H D E F G

 (b)

 Skip 3 copy 8: GGGGAAAA
 Skip 3 copy 1: C
 Skip 6 copy 1: H
 Skip 3 copy 1: D
 Skip 2 copy 1: E
 Skip 2 copy 1: F
 Skip 5 copy 1: G
 Skip 2

 (c)

 -3 8 GGGGAAAA -3 1 C -6 1 H -3 1 D -2 1 E -2 1 F -5 1 G -2

 (d)

 -3 64+4 G 64+4 A -3 1 C -6 1 H -3 1 D -2 1 E -2 1 F -5 1 G -2
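
Decoding the stream of Figure 2(d) takes very little code, which is why playback can keep up in real time. The sketch below is illustrative only: dff_decode() and DFF_BAD_STREAM are hypothetical names, and I assume signed-byte counts, so codes of 128 and up are two's-complement skips, 64 to 127 are runs, and 1 to 63 are literal copies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DFF_BAD_STREAM ((size_t) -1)    /* hypothetical error return */

/* Apply a skip/copy/run stream to frame, which holds the previous image
   on entry and the new image on return. Returns the number of frame
   bytes covered, or DFF_BAD_STREAM if the stream overruns a buffer. */
size_t dff_decode(const unsigned char *in, size_t in_len,
                  unsigned char *frame, size_t frame_len)
{
    size_t i = 0, pos = 0;

    assert(in != NULL && frame != NULL);
    while (i < in_len) {
        unsigned char code = in[i++];
        size_t count;

        if (code >= 128) {              /* negative: skip unchanged bytes */
            count = 256 - (size_t) code;
            if (count > frame_len - pos)
                return DFF_BAD_STREAM;
        } else if (code >= 64) {        /* run: repeat the next byte */
            count = (size_t) code - 64;
            if (i >= in_len || count > frame_len - pos)
                return DFF_BAD_STREAM;
            memset(frame + pos, in[i++], count);
        } else {                        /* copy the next `code` bytes */
            count = code;
            if (count > in_len - i || count > frame_len - pos)
                return DFF_BAD_STREAM;
            memcpy(frame + pos, in + i, count);
            i += count;
        }
        pos += count;
    }
    return pos;
}
```

Starting with image 1 in the frame buffer and feeding in the 24 bytes of Figure 2(d), the buffer ends up holding image 2. Note that skipped bytes are never touched, which is what saves the writes to video RAM.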



Two-Dimensional DFF


For actual images, DFF data is two-dimensional. There are several tricks to
take advantage of the similarity between the horizontal lines within an image.
Figure 3, for example, shows three images. Because we don't know what is
displayed on the screen before displaying the first image, I include the
differences between "no image" and the first image to yield three frames of
DFF information. It should be pointed out that every pixel will change from
the "no image" state to the first image.
Also, because the changes in each image do not affect the entire 16 x 16 area,
time and space are saved by storing a starting and ending X,Y offset within
the frame where the actual changes take place. In particular, this saves space
when using the bitmap storage, as each byte within the area affected must have
a bit value in the bit table. If I stored a full bitmap for each frame, it would
take 32 bytes of bitmap for every frame, compared to less than half that in
the example described shortly.
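
Finding the starting and ending X,Y offsets amounts to finding the smallest rectangle that encloses every changed byte. Here is a rough sketch, assuming one byte per pixel and frames stored row by row; the DffRect type and the function name dff_changed_area() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int x0, y0;     /* upper-left corner of the changed area (inclusive) */
    int x1, y1;     /* lower-right corner of the changed area (inclusive) */
} DffRect;

/* Find the smallest rectangle enclosing every byte that differs between
   prev and cur, both width * height bytes stored row by row. Returns 1
   and fills in *r if anything changed, or 0 if the frames are identical. */
int dff_changed_area(const unsigned char *prev, const unsigned char *cur,
                     int width, int height, DffRect *r)
{
    int x, y, found = 0;

    assert(prev != NULL && cur != NULL && r != NULL);
    assert(width > 0 && height > 0);
    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            if (prev[y * width + x] != cur[y * width + x]) {
                if (!found) {           /* first change fixes the top row */
                    r->x0 = r->x1 = x;
                    r->y0 = r->y1 = y;
                    found = 1;
                } else {                /* later changes only widen the box */
                    if (x < r->x0) r->x0 = x;
                    if (x > r->x1) r->x1 = x;
                    if (y > r->y1) r->y1 = y;
                }
            }
    return found;
}
```

Only the bytes inside the returned rectangle then need a bit in the bit table or an entry in the skip/copy stream.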
In Figure 3(a), the X,Y offsets are 4 bytes, which is applicable only to
images up to 255 x 255; for real-world use, that can be changed to 8 bytes for
images up to 65535 x 65535. There are three images, each of which represents a
frame in an animated sequence. At the corners of each frame are their
corresponding coordinates. The resultant bitmaps are shown in Figure 3(b).
Figure 3: Two-dimensional DFF

 (a)
 Images:
 Image 1 Image 2 Image 3
 ---------------- ---------------- ----------------

 0,0 15,0 0,0 15,0 0,0 15,0

 AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA
 AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA
 AAAAAAAAAAAAAAAA AAAZZZZZZZZZAAAA AAAAZZZZZZZZZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZZZZZZZZZAAAA AAAAZZZZZZZZZAAA
 AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA
 AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA
 0,15 15,15 0,15 15,15 0,15 15,15

 (b)

 Frame 1 Frame 2 Frame 3
 ---------------- ---------------- ----------------

 0,0 15,0 0,0 15,0 0,0 15,0

 1111111111111111 0000000000000000 0000000000000000
 1111111111111111 0000000000000000 0000000000000000
 1111111111111111 0001111111110000 0001000000001000
 1111111111111111 0001000000010000 0001100000011000
 1111111111111111 0001000000010000 0001100000011000
 1111111111111111 0001000000010000 0001100000011000
 1111111111111111 0001000000010000 0001100000011000
 1111111111111111 0000000000000000 0000000000000000
 1111111111111111 0001000000010000 0001100000011000
 1111111111111111 0001000000010000 0001100000011000
 1111111111111111 0001000000010000 0001100000011000
 1111111111111111 0001000000010000 0001100000011000
 1111111111111111 0001000000010000 0001100000011000
 1111111111111111 0001111111110000 0001000000001000
 1111111111111111 0000000000000000 0000000000000000
 1111111111111111 0000000000000000 0000000000000000

 0,15 15,15 0,15 15,15 0,15 15,15

To reduce the size of each list of changed bytes, the repeating bytes or
patterns are reduced to a count and the byte or pattern which is to be
repeated.
A zero count means the following 2 bytes are to be treated as a 16-bit value,
which is useful for large repeats or blocks of nonrepeating bytes.
When the values -32767 to -1 are used, they represent the number of times to
repeat the byte that follows.
When the values 64 to 32767 are used, they represent the number of bytes +64
to copy directly (without repetition), followed by the bytes which are to be
copied.
When the values 1 to 63 are used, they represent the number of bytes in a
pattern to be repeated followed by the number of times to repeat the pattern,
and following that, the pattern itself.
For an example of the kind of savings to be had, consider that for frame 1
there is a list of 256 As which can be reduced from 256 bytes to 4 bytes
encoded as: 0 -256 A. Remember, in two's complement, -256 requires 2 bytes.
The data in Figure 4 is a bitmap of the changed bytes, followed by the
run-length compressed bytes, which are the actual difference information. As
you can see, there are tremendous savings when the differences are located
within a small area as in frames 2 and 3.
Figure 4: Bitmap of the changed bytes followed by RLE compressed bytes. (a)
Frame 1: Difference information between no image and image 1 (40 bytes: 32
bytes of bitmap, 4 bytes of data, and 4 bytes of offsets); the area from 0,0
to 15,15. (b) Frame 2: Difference information between image 1 and image 2
(20 bytes); the area from 3,2 to 11,13. (c) Frame 3: Difference information
between image 2 and image 3 (23 bytes); the area from 3,2 to 12,13.
 (a)

 11111111 11111111 11111111 11111111 11111111 11111111 11111111
 11111111 11111111 11111111 11111111 11111111 11111111 11111111
 11111111 11111111

 11111111 11111111 11111111 11111111 11111111 11111111 11111111
 11111111 11111111 11111111 11111111 11111111 11111111 11111111
 11111111 11111111

 0 -256 A

 (b)

 11111111 11000000 01100000 00110000 00011000 00001000 00000010
 00000011 00000001 10000000 11000000 01100000 00111111 1111

 -36 Z

 (c)

 10000000 01110000 00111100 00001111 00000011 11000000 11000000
 00001100 00001111 00000011 11000000 11110000 00111100 00001110
 00000001

 2 20 AZ


The example of the skip/copy algorithm shown in Figure 5(a) illustrates a
method which is more effective than the bitmap method just used. In this
technique, the comparisons are identical to those of the bitmap method. First,
image 1 is compared to no image to produce frame 1. Image 1 is compared with
image 2 to produce frame 2, and image 2 is compared with image 3 to produce
frame 3. The resultant changed bytes are shown in Figure 5(b). Notice that the
actual number of bytes that change in frames 2 and 3 is very small (36 bytes
in frame 2, and 40 bytes in frame 3). Figure 6 shows frames 1 through 3,
stored as skip values and the run-length compressed changes.
Figure 5: The skip/copy algorithm
 (a)

 Images:

 Image 1 Image 2 Image 3

 ---------------- ---------------- ----------------

 0,0 15,0 0,0 15,0 0,0 15,0

 AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA
 AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA
 AAAAAAAAAAAAAAAA AAAZZZZZZZZZAAAA AAAAZZZZZZZZZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZAAAAAAAZAAAA AAAAZAAAAAAAZAAA
 AAAAAAAAAAAAAAAA AAAZZZZZZZZZAAAA AAAAZZZZZZZZZAAA
 AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA
 AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA

 0,15 15,15 0,15 15,15 0,15 15,15

 (b)

 Frame 1 Frame 2 Frame 3
 ---------------- ---------------- ----------------

 0,0 15,0 0,0 15,0 0,0 15,0

 AAAAAAAAAAAAAAAA ................ ................
 AAAAAAAAAAAAAAAA ................ ................
 AAAAAAAAAAAAAAAA ...ZZZZZZZZZ.... ...A........Z...
 AAAAAAAAAAAAAAAA ...Z.......Z.... ...AZ......AZ...
 AAAAAAAAAAAAAAAA ...Z.......Z.... ...AZ......AZ...
 AAAAAAAAAAAAAAAA ...Z.......Z.... ...AZ......AZ...
 AAAAAAAAAAAAAAAA ...Z.......Z.... ...AZ......AZ...
 AAAAAAAAAAAAAAAA ................ ................
 AAAAAAAAAAAAAAAA ...Z.......Z.... ...AZ......AZ...
 AAAAAAAAAAAAAAAA ...Z.......Z.... ...AZ......AZ...
 AAAAAAAAAAAAAAAA ...Z.......Z.... ...AZ......AZ...
 AAAAAAAAAAAAAAAA ...Z.......Z.... ...AZ......AZ...
 AAAAAAAAAAAAAAAA ...Z.......Z.... ...AZ......AZ...
 AAAAAAAAAAAAAAAA ...ZZZZZZZZZ.... ...A........Z...
 AAAAAAAAAAAAAAAA ................ ................
 AAAAAAAAAAAAAAAA ................ ................

 0,15 15,15 0,15 15,15 0,15 15,15

Figure 6: Frames 1-3 stored as skip values. (a) Frame 1: Difference between no
image and image 1 (the area from 0,0 to 15,15), 36 bytes. (b) Frame 2:
Difference between image 1 and image 2 (the area from 3,2 to 11,13), 54 bytes.
(c) Frame 3: Difference between image 2 and image 3 (the area from 3,2 to
12,13), 78 bytes.

 (a)

 Data in
 Frame 1 Description of the Data
 -----------------------------------------------------------

 64+16A Copy run of 16 A's in line 0
 64+16A Copy run of 16 A's in line 1
 64+16A Copy run of 16 A's in line 2
 64+16A Copy run of 16 A's in line 3
 64+16A Copy run of 16 A's in line 4
 64+16A Copy run of 16 A's in line 5
 64+16A Copy run of 16 A's in line 6
 64+16A Copy run of 16 A's in line 7
 64+16A Copy run of 16 A's in line 8
 64+16A Copy run of 16 A's in line 9
 64+16A Copy run of 16 A's in line 10
 64+16A Copy run of 16 A's in line 11
 64+16A Copy run of 16 A's in line 12
 64+16A Copy run of 16 A's in line 13
 64+16A Copy run of 16 A's in line 14
 64+16A Copy run of 16 A's in line 15

 (b)

 Data in
 Frame 2 Description of the Data
 -----------------------------------------------------------

 64+9 Z Copy run of 9 Zs in line 2
 1 Z-7 1 Z Copy Z, skip 7 bytes, and copy Z in line 3
 1 Z-7 1 Z Copy Z, skip 7 bytes, and copy Z in line 4
 1 Z-7 1 Z Copy Z, skip 7 bytes, and copy Z in line 5
 1 Z-7 1 Z Copy Z, skip 7 bytes, and copy Z in line 6
 -9 Skip 9 bytes in line 7
 1 Z-7 1 Z Copy Z, skip 7 bytes, and copy Z in line 8
 1 Z-7 1 Z Copy Z, skip 7 bytes, and copy Z in line 9
 1 Z-7 1 Z Copy Z, skip 7 bytes, and copy Z in line 10
 1 Z-7 1 Z Copy Z, skip 7 bytes, and copy Z in line 11
 1 Z-7 1 Z Copy Z, skip 7 bytes, and copy Z in line 12
 64+9 Z Copy run of 9 Zs in line 13

 (c)

 Data in
 Frame 3 Description of the Data
 -----------------------------------------------------------

 1 A-8 1 Z Copy A, skip 8 bytes and copy Z in line 2
 2 AZ-6 2 AZ Copy AZ, skip 6 bytes, and copy AZ in line 3
 2 AZ-6 2 AZ Copy AZ, skip 6 bytes, and copy AZ in line 4
 2 AZ-6 2 AZ Copy AZ, skip 6 bytes, and copy AZ in line 5
 2 AZ-6 2 AZ Copy AZ, skip 6 bytes, and copy AZ in line 6
 -10 Skip 10 bytes in line 7
 2 AZ-6 2 AZ Copy AZ, skip 6 bytes, and copy AZ in line 8
 2 AZ-6 2 AZ Copy AZ, skip 6 bytes, and copy AZ in line 9
 2 AZ-6 2 AZ Copy AZ, skip 6 bytes, and copy AZ in line 10
 2 AZ-6 2 AZ Copy AZ, skip 6 bytes, and copy AZ in line 11
 2 AZ-6 2 AZ Copy AZ, skip 6 bytes, and copy AZ in line 12
 1 A-8 1 Z Copy A, skip 8 bytes and copy Z in line 13

The three frames in Figure 7 are encoded using skip values with the run-length
of changed bytes and the "repeat any previous line" option. This option is
coded as a relative line number that counts back from the current line to the
line to be repeated.
Figure 7: Frames encoded using skip values with RL of changed bytes and the
"repeat any previous line" option. (a) Frame 1: Difference between no image
and image 1 (the area from 0,0 to 15,15), 21 bytes. (b) Frame 2: Difference
between image 1 and image 2 (the area from 3,2 to 11,13), 21 bytes. (c) Frame
3: Difference between image 2 and image 3 (the area from 3,2 to 12,13), 26
bytes.

 (a)

 Data in
 Frame 1 Description of the Data
 ---------------------------------------------------------

 64+16 A Copy run of 16 A's in line 0
 -64 Repeat line 0 in line 1
 -64 Repeat line 1 in line 2
 -64 Repeat line 2 in line 3
 -64 Repeat line 3 in line 4
 -64 Repeat line 4 in line 5
 -64 Repeat line 5 in line 6
 -64 Repeat line 6 in line 7
 -64 Repeat line 7 in line 8
 -64 Repeat line 8 in line 9
 -64 Repeat line 9 in line 10
 -64 Repeat line 10 in line 11
 -64 Repeat line 11 in line 12
 -64 Repeat line 12 in line 13
 -64 Repeat line 13 in line 14
 -64 Repeat line 14 in line 15

 (b)

 Data in
 Frame 2 Description of the Data
 ---------------------------------------------------------

 64+9 Z Run of 9 Zs in line 2
 1 Z-7 1 Z Copy Z, skip 7 bytes and copy Z in line 3
 -64 Repeat line 3 in line 4
 -64 Repeat line 4 in line 5
 -64 Repeat line 5 in line 6
 -9 Skip 9 bytes in line 7
 -(64+1) Repeat line 6 in line 8
 -64 Repeat line 8 in line 9
 -64 Repeat line 9 in line 10
 -64 Repeat line 10 in line 11
 -64 Repeat line 11 in line 12
 -(64+10) Repeat line 2 in line 13

 (c)

 Data in
 Frame 3 Description of the Data
 ---------------------------------------------------------

 1 A -8 1 Z Copy A, skip 8 bytes and copy Z in line 2
 2 AZ -6 2 AZ Copy AZ, skip 6 bytes and copy AZ in line 3
 -64 Repeat line 3 in line 4
 -64 Repeat line 4 in line 5
 -64 Repeat line 5 in line 6
 -10 Skip 10 bytes in line 7
 -(64+1) Repeat line 6 in line 8
 -64 Repeat line 8 in line 9
 -64 Repeat line 9 in line 10
 -64 Repeat line 10 in line 11
 -64 Repeat line 11 in line 12
 -(64+10) Repeat line 2 in line 13

For purposes of illustration, some of the numbers are shown as arithmetic
expressions, but are actually stored as single values. Thus, -(64+1) is stored
as -65, and -(64+10) is stored as -74. For example, in frame 2, line 8, there
is the value -(64+1) which repeats line 6. Negative 64 indicates the previous
line (line 7), and negative 65, which is represented as -(64+1), indicates
that line 6 is the line to be repeated.

A further refinement is to add the "repeat previous line N times" option. The
scheme then combines three options: skip values with run-length compressed
changed bytes, "repeat any previous line," and "repeat previous line N times."
Figure 8 is an example that uses all of the options.
Figure 8: Adding the "repeat previous line N times" option. (a) Frame 1:
Difference between no image and image 1 (the area from 0,0 to 15,15), 7 bytes.
(b) Frame 2: Difference between image 1 and image 2 (the area from 3,2 to
11,13), 16 bytes. (c) Frame 3: Difference between image 2 and image 3 (the
area from 3,2 to 12,13), 21 bytes.

 (a)

 Data in
 Frame 1 Description of the Data
 ---------------------------------------------------------

 64+16 A Copy run of 16 A's in line 0
 95+15 Repeat line 0, 15 times in lines 1 to 15

 (b)

 Data in
 Frame 2 Description of the Data
 ---------------------------------------------------------

 64+9 Z Copy run of 9 Zs in line 2
 1 Z-7 1 Z Copy Z, skip 7 bytes and copy Z in line 3
 95+3 Repeat line 3, 3 times in lines 4 to 6
 -9 Skip 9 bytes in line 7
 -(64+1) Repeat line 6 in line 8
 95+4 Repeat line 8, 4 times in lines 9 to 12
 -(64+10) Repeat line 2 in line 13

 (c)

 Data in
 Frame 3 Description of the Data
 ---------------------------------------------------------

 1 A-8 1 Z Copy A, skip 8 bytes and copy Z in line 2
 2 AZ-6 2 AZ Copy AZ, skip 6 bytes and copy AZ in line 3
 95+3 Repeat line 3, 3 times in lines 4 to 6
 -10 Skip 10 bytes in line 7
 -(64+1) Repeat line 6 in line 8
 95+4 Repeat line 8, 4 times in lines 9 to 12
 -(64+10) Repeat line 2 in line 13

For a quick review of the compression codes that have been used for skip/copy
examples, refer to Table 1.
Table 1: Compression codes used for skip/copy examples

 Code          Description
 ------------------------------------------------------------------------

 1 to 63       Number of bytes to directly copy (bytes to copy follow).
 64 to 95      Repeat the following byte N-63 times.
 96 to 128     Repeat the previous line N-95 times (so, 98 would repeat
               the previous line 3 times).
 -127 to -64   Repeat the line |N|-63 lines back (so, -64 would repeat the
               previous line and -65 would repeat the line before that).
 -63 to -1     Number of bytes to skip forward (skipping unchanged bytes).
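
To make the code ranges concrete, here is a minimal sketch in C that classifies a single code value. The names (op_kind, classify_code, lines_back) are hypothetical and not taken from the article. For run codes the sketch follows Table 1's N-63 rule literally; the figures write a run of n bytes as 64+n, so treat the exact bias for runs as an assumption.

```c
#include <assert.h>

/* Operation kinds from Table 1 (the names are this sketch's own). */
enum op_kind {
    OP_COPY, OP_RUN, OP_REPEAT_PREV, OP_REPEAT_LINE, OP_SKIP, OP_INVALID
};

struct op {
    enum op_kind kind;
    int count;   /* bytes, repetitions, or lines back, depending on kind */
};

/* Classify one code value according to the ranges in Table 1. */
struct op classify_code(int code)
{
    struct op r = { OP_INVALID, 0 };

    if (code >= 1 && code <= 63) {            /* literal bytes follow */
        r.kind = OP_COPY;
        r.count = code;
    } else if (code >= 64 && code <= 95) {    /* run of the next byte */
        r.kind = OP_RUN;
        r.count = code - 63;
    } else if (code >= 96 && code <= 128) {   /* repeat previous line */
        r.kind = OP_REPEAT_PREV;
        r.count = code - 95;
    } else if (code >= -127 && code <= -64) { /* repeat an earlier line */
        r.kind = OP_REPEAT_LINE;
        r.count = -code - 63;                 /* -64 means 1 line back */
    } else if (code >= -63 && code <= -1) {   /* skip unchanged bytes */
        r.kind = OP_SKIP;
        r.count = -code;
    }
    return r;
}

/* Convenience: how many lines back a repeat-line code refers to. */
int lines_back(int code)
{
    struct op r = classify_code(code);
    assert(r.kind == OP_REPEAT_LINE);
    return r.count;
}
```

For example, lines_back(-74) gives 11, which is the distance from line 13 back to line 2 in Figures 7 and 8, and classify_code(98) reports three repetitions of the previous line.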



Evaluating Storage Savings



Comparing the compression methods presented so far makes it fairly easy to
evaluate the storage savings of DFF techniques. Table 2 shows the number of
bytes each method needs for the three 16 x 16 frames used earlier. Note that
the number of bytes in each frame includes 4 bytes for the starting and ending
X,Y positions of the area within the image that changes from frame to frame.
The best method gives a total of 44 bytes (7 + 16 + 21); the original
uncompressed images were 256 bytes each, or 768 bytes in all. This is a
savings of approximately 94 percent!
Table 2: Evaluating storage savings (bytes per frame)

 Method                                                   Frame 1  Frame 2  Frame 3
 ----------------------------------------------------------------------------------

 Bitmap of changed bytes with run-length compressed
 changed bytes                                               39       20       23
 Skip values with run length of changed bytes                36       54       78
 Skip values with run length of changed bytes and repeat
 any previous line option                                    21       21       26
 Skip values with run length of changed bytes, repeat any
 previous line option, and repeat previous line N times
 option                                                       7       16       21

However, there are situations in which the above algorithms are not so
efficient, primarily when widely spaced image changes occur or when you have
drastic image changes.


Widely Spaced Image Changes


When the bytes that change are widely spaced within the image, the least-fit
rectangle that defines the area in which changes take place can be very large
-- often, close to the size of the original image. This is particularly
devastating to the bitmap algorithm because the larger the least-fit
rectangle, the larger the bitmap must be.
For instance, take a sequence of images 1024 x 768 bytes in size that have
four bouncing balls in each image. The balls are "all over the place," but are
fairly small in size (maybe 16 x 16). The sequence is 32 frames long, and then
it repeats. Because the balls are often widely spaced (at opposite corners of
the image), the least-fit rectangle is, on average, 90 percent of the full
image size. At one bit per byte, the bitmap for each image would be about 88K,
or a total of 2.8 Mbytes for all 32 frames!
The actual change data, by contrast, would be only 2K per frame (1K for the
old ball positions and 1K for the new), or 64K for all 32 frames, even if
every pixel in each ball is in a new place on each frame. Also, those 88K
bitmaps have to be scanned for set bits each time a frame is displayed. For
most computers this is an overwhelming task to accomplish for real-time
animation.
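
The arithmetic behind these figures can be checked with a short sketch in C. The function names are hypothetical, and the sketch assumes one bitmap bit per image byte inside the least-fit rectangle and change data that covers both the old and the new position of every ball.

```c
#include <assert.h>

/* Bytes needed for a one-bit-per-byte change bitmap that covers
   `percent` percent of a width x height image (rounded up). */
long bitmap_bytes(long width, long height, int percent)
{
    long covered;

    assert(percent >= 0 && percent <= 100);
    covered = width * height * percent / 100;  /* bytes inside the rectangle */
    return (covered + 7) / 8;                  /* one bit per covered byte */
}

/* Change data when every pixel of every ball moves: the old positions
   must be restored and the new positions drawn. */
long change_bytes(int balls, int ball_w, int ball_h)
{
    return 2L * balls * ball_w * ball_h;
}
```

With the numbers from the text, bitmap_bytes(1024, 768, 90) comes to 88,474 bytes per frame (about 2.8 million bytes over 32 frames), while change_bytes(4, 16, 16) is 2048 bytes per frame (65,536 bytes over 32 frames).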
One alternative is to search for multiple least-fit rectangles. (A good
general-purpose algorithm for this task is beyond the scope of this article.)
For this example of the four bouncing balls, you could write a special case
algorithm which always started with eight least-fit rectangles (four for the
new ball positions and four for the old ball positions), then reduce that
number for intersecting paths and less-than-extreme movement of a ball between
frames (where the old and new positions of a ball can both be held within a
small rectangle).
The bitmap algorithm for storing differences is well suited to random changes
that occur over large areas of an image, provided the number of changes is
large. Its major problem is that it tends to be below par for small changes
spread over large areas of an image, which require a large bitmap but only a
small set of change bytes. It is my
opinion that the best use of the bitmap algorithm is for video images, which
tend to have a lot of random-noise-type changes between frames, with change
ratios (the percentage of the image's bytes which have changed between frames)
as high as 70 or 80 percent.


Drastic Image Changes


When a large percentage of the image changes between frames, the skip/copy
algorithm does not give a high compression ratio. Good examples are frames
that were captured from video and 3-D rendered computer-generated animation.
In particular, video captured from a poor/noisy source can have random changes
constantly going on, even on a still frame. Video captured from a good source,
such as a hi-res video camera, tends to have random changes which give an
image that "live" look. Some of these random changes can be suppressed with a
good time-based digital filter -- one that removes small pixel fluctuations
over a short period of time without affecting normal motion. But the best
digital filter can't do much with live action: Even a simple slow pan of a
still scene has a drastic change rate. This is where the skip/copy algorithm
is at its worst because it has almost no interline repetition to take
advantage of, and it must generate zillions of little skips and small copies
of nonrepeating bytes.
3-D rendered computer-generated animation (like those displays you see at
computer graphics shows) tends to use smooth shading, so even though you may
see giant block letters flying across the screen, they are probably smoothly
shaded. This means that each time the image moves, almost every pixel changes
its value. DFF will still help in avoiding the storage of a fixed background,
but the skip/copy method performs somewhat better than the bitmap algorithm
for these types of images.


Error Distribution Dithering


Error Distribution Dithering (EDD) is used to compensate for lack of color or
gray scale -- when displaying a gray-scale image in a black-and-white mode,
for example, or when displaying a full color RGB image on a 16-color display.
The great advantage of EDDs is also the reason they are a bane to DFF storage.
The concept of EDDs is to distribute the error (the difference between the
original color for a pixel and the closest we could come to displaying it)
over the surrounding pixels. This gives us a simulation of true gray scale or
color, while still maintaining a great deal of the original image's
resolution. The problem occurs when a single pixel changes its value: It can
affect the value of numerous surrounding pixels. In fact, a small change in
one pixel can shift the pattern of error distribution enough so that hundreds
of pixels are changed in the dithered image. This means that EDD rendered
images often have a change rate of nearly 100 percent. You can find a good
example in those cheap video capture boards which capture in gray scale and
render an EDD image in black and white for display.
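
The cascade is easy to reproduce with a deliberately simplified sketch. The C code below is an assumption-laden illustration, not the algorithm of any particular capture board: it diffuses the error in one dimension only (each pixel's full error goes to its right-hand neighbor), whereas practical EDDs spread it over several neighbors in two dimensions. The function names are hypothetical.

```c
#include <assert.h>

/* One-dimensional error diffusion: each gray pixel (0-255) becomes 0 or
   255, and the rounding error is carried forward into the next pixel. */
void dither_row(const unsigned char *gray, unsigned char *out, int n)
{
    int i, err = 0;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        int v = gray[i] + err;                       /* pixel plus carried error */
        out[i] = (unsigned char)((v >= 128) ? 255 : 0); /* nearest displayable value */
        err = v - out[i];                            /* pass the error along */
    }
}

/* Count the positions at which two rows differ. */
int count_changes(const unsigned char *a, const unsigned char *b, int n)
{
    int i, changed = 0;

    for (i = 0; i < n; i++)
        if (a[i] != b[i])
            changed++;
    return changed;
}
```

Dithering a 16-pixel row of gray level 100, and then the same row with only the first pixel raised to 130, produces output rows that differ in four places, two of them at the far end of the row. The single changed input pixel shifts the carried error for every pixel after it.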


Special Applications Using DFF Technology


One problem with DFF storage is that frame size can vary greatly. You can have
one frame that is 1/20 the original raw data size, and another that is 4/5 the
original raw data size. On systems where the images are being read in real
time from a slow I/O device (such as CD-ROM), the frame size must be limited to
some maximum number of bytes. One method for doing this is to reduce the
resolution of the image (or part of the image) as the number of changes
exceeds some maximum.
You can reduce resolution on the X-axis, then Y-axis, one step at a time until
the changes are reduced to the maximum frame byte size. Lost resolution is
made up on successive frames since you calculate the new frame change
information relative to the lower resolution of the previous frame. This has
the effect of blurring fast motion, which will sharpen when the action settles
down. This is particularly useful on systems that have RGB-type hardware where
reduced resolution pixels don't have to look blocky because you have averaged
the new lower resolution pixel with the surrounding pixels to get an
out-of-focus or blurred effect.
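
One step of X-axis reduction might look like the following C sketch. It is only an assumption about how such a step could be done (the article does not give code): each pair of neighboring pixels is replaced by its rounded average, which halves the horizontal resolution and leaves pairs of equal bytes for the difference encoder to work with. The name halve_x is hypothetical.

```c
#include <assert.h>

/* Halve horizontal resolution in place: each pair of pixels is replaced
   by two copies of its rounded average. `width` must be even. */
void halve_x(unsigned char *row, int width)
{
    int x;

    assert(width >= 0 && width % 2 == 0);
    for (x = 0; x < width; x += 2) {
        unsigned avg = (row[x] + row[x + 1] + 1u) / 2;  /* rounded mean */
        row[x] = row[x + 1] = (unsigned char)avg;
    }
}
```

Applying the same idea to pairs of lines would give the corresponding Y-axis step.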
Other possibilities are "lossy" algorithms for RGB pixel-based systems. The
term lossy refers to algorithms that sacrifice (lose) data accuracy in
exchange for data compression, hence loss + y. You can sacrifice short-term
image accuracy (it would take time for changes to resolve completely) in
exchange for amazing space savings. An example would be to take the change in
red, green, and blue for each pixel over time, and use an algorithm much like
ADPCM (adaptive delta pulse-code modulation, a common method of compressing
and storing sound). With a lossy algorithm you could store only 1 or 2 bits of
change information for red, green, and blue per pixel, per frame. All this is
pretty impractical for existing micros, but as machines get faster, and custom
hardware for DFF encoding/decoding becomes available, we will see everyday
uses for this type of technology.
Results vary a great deal when you use additional compression on the DFF data,
but something like LZW encoding (or one of its variations), or Huffman
encoding could pay off with substantial disk space savings. The problem is
decoding data of that type in real time. Sadly, that's not realistic on
existing micros: Most decent general-purpose compression algorithms are usable
only to save disk space for a DFF sequence that will be decompressed into
memory or a file before playback.
The hottest use for real-time DFF encoding and decoding is for video transfer
over standard phone lines (video phones). If you are willing to sacrifice
image quality during fast motion, you can still have crystal clear images, as
long as you can sit still long enough.
With existing out-of-the-box technology, you can transfer around 1600 bytes
per second over a phone line, provided the transfer is in one direction only.
That really is not enough for TV-quality images unless you are willing to wait
several seconds to see someone's face. But if you use pixel scaling
technology, you can make a very low resolution image look quite smooth, if not
detailed.
As you sit still in front of the phone, the image would get sharper with more
detail as the DFF algorithm continued to send more and more information. Note,
the key phrase here is "sit still," which shows we have a long way to go until
video phones are as common as FAX machines.











February, 1991
THE DDJ DATA COMPRESSION CONTEST


Here's your chance for fame and fortune


 This article contains the following executables: CONTEST.ARC





It wasn't that long ago that the only barrier between programmers and killer
applications was the want of a faster PC with a bigger hard disk and more
memory. Not an unreasonable desire, especially considering that cutting-edge
was defined as 256 Kbytes of RAM, two floppy disk drives, and 4.77 MHz CPUs.
To offset this dearth of hardware, programmers dipped into their bag of tricks
for every imaginable gimmick to efficiently move data from one place to
another -- and to store it, once it got where it was going. Foremost among
these gimmicks was data compression.
Today, 33-MHz 80386 PCs with 80-Mbyte hard disks and 8 Mbytes of memory are
common and (well, sort of) affordable. But with bigger and bigger programs
devouring more and more system resources, the need for data compression hasn't
disappeared; if anything, it has escalated. It's commonplace for a program to
require a multitude of distribution diskettes, making the convenience and
economic good sense of data compression more relevant than ever before.
As far back as Graham Jenkins's "A General Purpose Data Compression Program"
in September of 1979, DDJ readers have consistently held data compression to
be one of the most exciting topics we examine.
This abiding fascination leads us to this month's special project, the "DDJ
Data Compression Contest," a competition designed to allow you to show off
your programming prowess and help other programmers decide which data
compression approach is best for a particular problem.


The Rules


The rules are simple. The contest is open to any individual or organization. The
only requirement is that each entry must include source code for the
compression technique. Sorry, off-the-shelf commercial compression programs
aren't eligible.
Each entry must be submitted in both source and executable format, along with
instructions on how to build the executable from the source. Entries can
either be mailed in on 5.25-inch or 3.5-inch diskettes, or uploaded to the DDJ
Data Compression Contest section on M&T's Telepath or the DDJ Forum on
CompuServe. (A special data library has been set up for the contest.) Special
entry forms, which must be completed, are available online and through the DDJ
editorial offices. All entries must be postmarked no later than May 1, 1991.
We'll use the executable program you provide as your entry, but we must be
able to successfully compile and link your source code. The output of our
compiled program should be identical to the output of the program you submit.
We will do our best to accommodate anyone who can submit only source.


Putting It to the Test


The test data will consist of three different types of input files, each
approximately 2 Mbytes in size: ASCII text files, executable data, and
graphics images. The graphics files will use 8-bit pixels. Mail us a blank,
formatted diskette, and we'll mail it back to you with the sample files on it.
Alternatively, you can download the test files from CompuServe or Telepath.
The compression programs will be tested on a 25-MHz 80386 PC generously
provided for the contest by Everex Systems. This machine has 8 Mbytes of
32-bit memory and a 128-Kbyte cache, and will be able to test entries under
either MS-DOS 3.3 or Unix System V 3.2. Entries running under MS-DOS can use
up to 6 Mbytes of either EMS or Extended memory. (If your program needs a
machine more powerful than this, you probably shouldn't consider your approach
a "general-purpose" implementation.)
We'll select winners in the following categories:
* Compression ratios for each of the three file types
* Overall compression ratios
* Compression speed
* Extraction speed
* Overall performance -- the Grand Prize!
Mark Nelson, a frequent DDJ contributor and programmer for Greenleaf Software,
will be refereeing the contest. The DDJ editorial staff will assist Mark as he
evaluates contest entries.
The contest results will be published in the November 1991 issue of Dr. Dobb's
Journal, in which we'll discuss and summarize the winning programs and publish
some or all of the source code.


The Awards


We'll be providing a number of awards for the winners. The grand prize winner
will receive a $250 honorarium, while winners in individual areas will receive
$100 each. (Note that it is possible to win in more than one category.)
We look forward to receiving your entries -- and may the best program (and
programmer) win!














February, 1991
 PORTING UNIX TO THE 386: THREE INITIAL PC UTILITIES


Getting to the hardware




William Frederick Jolitz and Lynne Greer Jolitz


Bill was the principal developer of 2.8 and 2.9 BSD and the chief architect of
National Semiconductor's GENIX. Lynne established TeleMuse, a market research
firm specializing in the telecommunications and electronics industry. They can
be contacted via e-mail at william@berkeley.edu or at uunet!william. Copyright
(c) 1990 TeleMuse.


In last month's installment, we discussed the elements of the 386BSD port,
which required planning prior to the actual coding. In brief, the
specification we outlined emphasized BSD compatibility, efficient use of the
80386 architecture, interoperability with extant commercial standards, and
rapid implementation to leverage BSD UNIX to port the rest of itself. We also
discussed the conflicts inherent between a segmented architecture and a
virtual memory system which prefers paging, other microprocessor
idiosyncrasies and requirements, and the basic planning for the surrounding
hardware. By taking a "practical approach" to this port and focusing on "hard
adherence" to BSD operability and high-performance, we identified the key
milestones required for this (or any) advanced operating system port and set
the stage for our next effort: Writing the PC utilities that allow us to
initially load the first programs and data onto our 386 target host.
With this in mind, we'll now examine code from three PC-based utilities --
boot.exe, cpfs.exe, and cpsw.exe -- that facilitate the basic access to the
hardware from MS-DOS needed to begin a UNIX port. boot.exe executes a
GCC-compiled program (using the Free Software Foundation's GNU C Compiler) in
protected mode from MS-DOS. (Note that GCC generates only 32-bit
protected-mode code.) cpfs.exe installs a root filesystem onto the hard disk.
cpsw.exe copies files to a shared portion of disk so that MS-DOS and UNIX can
exchange information.
In examining these areas, we will illustrate how the UNIX bootstrap process
functions, because these programs mimic that process to a great degree. This
will be important in later articles when we discuss the code and strategies
used to build the bootstraps that allow the newly ported system to become
independent of MS-DOS.


The Purpose of Our PC Utilities


To port UNIX, we needed to devise methods to: Load large 32-bit protected-mode
programs (that is, the BSD kernel); load the initial root filesystems; and
communicate information onto our early UNIX system to augment its capabilities
as we port increasing numbers of utilities.
An initial UNIX port to a brand-new architecture with no native software can
be a miserable task for the inexperienced. One of the authors has done this
for other architectures and survived, but we don't recommend it because it
forces you to write absolute code for the purposes outlined above, only to
abandon it for the UNIX code, which eventually provides the same function.
Absolute code is difficult to debug (because there is no debugging
environment), time consuming to write (one needs to support and initialize the
entire machine in addition to the above functions), and subtle (tiny
machine-dependent characteristics thwart the development effort). These
concerns, especially when working with a processor as complex as the 386,
arise when the port is most vulnerable -- when there is little project
history.
One of the advantages of porting an operating system to a popular machine like
the PC lies in the wealth of previously written program development software.
(In other words, someone else has already suffered for our art.) We chose to
use Borland's Turbo C and Microsoft's MASM, primarily because they "were
there," and also because they were appropriate to rapid PC program
prototyping. While these programs do rely on a few object library primitives
in Turbo C, they are reasonably portable to other MS-DOS C implementations,
and on the whole are not restricted solely to Turbo C (or MASM, for that
matter).
Another advantage of the 386 PC environment is that MS-DOS and its
application programs run on the absolute machine, and do not rely on memory
management, relocation, or protection. Thus, we could write programs that
would ultimately usurp control from the MS-DOS operating system without regard
for its functions and strategies. An operating system that makes extensive
use of memory management mechanisms, such as System V UNIX, would have made it
more difficult to write and debug an absolute program loader. In this case, we
would have spent more time defeating those mechanisms than we would have spent
writing absolute programs in the first place!


The First PC Utility: boot.exe


boot.exe is quite simple in theory, as our mock code fragment in Example 1
demonstrates. It just loads a GCC executable into memory at location 0, enters
into protected mode, and then executes it. Simple, huh? There are some
niggling little gotchas, however:
Example 1: Mock code that loads a GCC executable into memory

 main() {
     int fi;
     struct exec hdr;

     fi = open ("pgm", O_RDONLY);
     read (fi, &hdr, sizeof (hdr));      /* a.out header */
     read (fi, (char *) 0, hdr.a_text + hdr.a_data); /* text and data to address 0 */
     (* (void (*) ()) 0) ();             /* call the code at address 0 */
     /* NOTREACHED */
 }

* Programs are frequently larger than is considered "convenient" in the PC
world. On the PC, 64K or less is considered adequate, while the UNIX kernel we
must load averages about 280 Kbytes in size, so we will have to manage the
so-called "far" pointers in a large model 8086 program.
* The bottom (address 0) of PC memory contains a critical portion of the
MS-DOS operating system. We will need to use MS-DOS itself to load the
program, so we can't touch this area until after we read in the entire
program. We will therefore have to allocate a pool of memory space large
enough to temporarily hold a copy of the program we are loading until it is
safe to overwrite location 0.
* Once we enter into protected mode, we can't easily go back and enter MS-DOS
again, so we must do all our checks and anticipate needs prior to taking that
last giant step.
Listing One is the boot.c program, which addresses these three problems. Note
that it is no longer a simple eight-line program. Ah well, life is never simple.


The GCC Executable Format



The programs to be loaded have been generated on another UNIX system, where
the GCC compiler, GAS assembler, and BSD linkage editor provide
cross-development support, allowing us to generate BSD a.out format files.
This format is the oldest of the many (and, unfortunately, still growing)
different UNIX executable file formats. The BSD a.out format consists of a
header structure (see Listing Two, exec.h) that details the sizes of the
sections that follow, then the instruction segment (or text), the data
segment, relocation information, and finally, a symbol-table segment. At this
time, we are interested only in the information contained in the instruction
and data sections, which we load into a large, dynamically allocated temporary
array before moving it into position. We do not use the relocation information
or the symbol-table segment.
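
As a rough guide to what such a header holds, the C sketch below uses the field names of the traditional BSD <a.out.h>. It is an assumption that the article's exec.h (Listing Two, not reproduced here) matches this layout in every detail, and the helper names load_size and memory_size are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Header of a classic 32-bit BSD a.out file: eight 32-bit words. */
struct exec {
    uint32_t a_magic;   /* magic number identifying the format */
    uint32_t a_text;    /* size of the text (instruction) segment */
    uint32_t a_data;    /* size of the initialized data segment */
    uint32_t a_bss;     /* size of uninitialized data */
    uint32_t a_syms;    /* size of the symbol table */
    uint32_t a_entry;   /* entry point */
    uint32_t a_trsize;  /* size of text relocation information */
    uint32_t a_drsize;  /* size of data relocation information */
};

/* Bytes the loader must read after the header: text plus data only. */
uint32_t load_size(const struct exec *hdr)
{
    assert(hdr != 0);
    return hdr->a_text + hdr->a_data;
}

/* Bytes of memory the program occupies once bss is accounted for. */
uint32_t memory_size(const struct exec *hdr)
{
    assert(hdr != 0);
    return hdr->a_text + hdr->a_data + hdr->a_bss;
}
```

A loader like the one described only needs load_size() to know how much to read into its temporary array; the relocation and symbol sizes are ignored.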


Consistency Checks


Loading this large array of data containing the programs to be executed is a
complex task, because many different 64K segments may be used. A "fence-post"
error arising from incorrectly maintained far pointers can lead to
unpredictable results when the protected mode program runs. Therefore, to
verify that the program contents are loaded correctly, we use a simple
checksum just before we dispatch to it in protected mode (see Listing One,
boot.c).
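
The checksum used by boot.c is not shown in this installment, so the following C sketch is only an assumption about what a "simple checksum" could look like: a byte-wise additive sum kept to 16 bits. The name checksum16 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Additive checksum: the sum of all bytes, modulo 2^16. */
unsigned checksum16(const unsigned char *p, size_t len)
{
    unsigned long sum = 0;
    size_t i;

    assert(p != NULL || len == 0);
    for (i = 0; i < len; i++)
        sum = (sum + p[i]) & 0xFFFFUL;   /* keep only the low 16 bits */
    return (unsigned)sum;
}
```

The idea would be to compute the value once over the file contents as they are read and again over the staged copy just before the jump to protected mode; a mismatch means a pointer went astray somewhere in between.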


GATE A20


Another item that deserves mention is the PC hardware feature known as GATE
A20. Because the original IBM PC had only 20 bits of address (2^20 bytes, or
1 Mbyte, with address lines denoted A19 through A0), newer machines possessing
greater physical address space (the 80286 with 16 Mbytes and the 80386 with 4
gigabytes) might exhibit a small difference when executing in real mode. GATE
A20 was designed to mitigate this problem. Without it, a reference that
increments past the topmost address would actually reach outside the 20-bit
address space, because the rollover would be carried up into A20 instead of
being wrapped around to address zero. GATE A20 would not be necessary were it
not for the presence of a considerable body of ancient MS-DOS applications
that rely on this wraparound, assuming that such a reference rolls over to the
address space occupied by the bottom of physical memory. Thus, the urgent need
for GATE A20 (short for "Gate the A20 address line to logic zero"). With our
UNIX system, we will want to grab all available RAM, especially that above 1
Mbyte, so we need to defeat the GATE A20 feature and allow all the processor's
address lines to be functional. We did this with our gatea20.asm module in
Listing Three, which is invoked by boot.c in Listing One.
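
The effect of the gate on real-mode addressing can be modeled in a few lines of C. This is a model of the address arithmetic only, not the gatea20.asm code, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Physical address produced by a real-mode segment:offset pair.
   With the gate active, address line A20 is forced to zero, so the
   result wraps around within the first megabyte as on the original PC. */
uint32_t real_mode_address(uint16_t seg, uint16_t off, int a20_gated)
{
    uint32_t linear = ((uint32_t)seg << 4) + off;  /* at most 0x10FFEF */

    if (a20_gated)
        linear &= ~(UINT32_C(1) << 20);            /* clear bit 20 */
    assert(linear <= UINT32_C(0x10FFEF));
    return linear;
}
```

For the pair FFFF:0010, the sketch returns 0 with the gate active (the wraparound old MS-DOS programs expect) and 0x100000, the first byte above 1 Mbyte, with the gate defeated.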


Entering Protected Mode


Protected-mode programming frequently has a mystique about it, probably due,
in large part, to the difficulty in going between modes on the 80(2/3/4)86
processors on which it is supported. You can't just poke a bit, or execute a
single instruction, and end up executing in protected mode. The transition is
a methodical one, where, over the course of tens of instructions, the
processor is incrementally prepared for the transition (which, by the way, is
not intuitively obvious). This, of course, gives errors many opportunities to
sneak in. Writing and debugging a subroutine for reliable entry into protected
mode was not exactly the evening's diversion we estimated; embarrassingly
enough, it took nearly a month.
As you examine the code from protentr.asm in Listing Four, you can see that
many different things are being reconciled at once. There are three different
kinds of addressing standards being interconverted as needed:
* 20-bit segment:offset pair "real" mode addresses
* 32-bit absolute or physical addresses
* 32-bit segment selector:offset protected-mode addresses
Protected-mode instructions are being "generated" from within a "real" mode
assembler. A descriptor table is encoded in its peculiar and convoluted
structure style, which has its base address split into high and low address
chunks on separate portions of the descriptor. Note that in some versions of
MASM, LIDT/LGDT instructions present undocumented surprises.
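
The split base address is easier to see when the descriptor is packed by a small routine. The C sketch below follows the 8-byte segment descriptor layout documented for the 80386; the function name and parameter split are hypothetical, and protentr.asm may build its table quite differently (for example, as assembler data).

```c
#include <assert.h>
#include <stdint.h>

/* Pack an 80386 segment descriptor into its 8-byte in-memory form.
   `limit` is 20 bits, `access` is the access-rights byte, and `flags`
   is the upper nibble of byte 6 (granularity and default size bits). */
void encode_descriptor(uint8_t d[8], uint32_t base, uint32_t limit,
                       uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFF && flags <= 0xF);
    d[0] = (uint8_t)(limit & 0xFF);          /* limit bits 0-7 */
    d[1] = (uint8_t)((limit >> 8) & 0xFF);   /* limit bits 8-15 */
    d[2] = (uint8_t)(base & 0xFF);           /* base bits 0-7 */
    d[3] = (uint8_t)((base >> 8) & 0xFF);    /* base bits 8-15 */
    d[4] = (uint8_t)((base >> 16) & 0xFF);   /* base bits 16-23 */
    d[5] = access;                           /* present, DPL, type */
    d[6] = (uint8_t)((flags << 4) | ((limit >> 16) & 0x0F)); /* flags, limit 16-19 */
    d[7] = (uint8_t)((base >> 24) & 0xFF);   /* base bits 24-31 */
}
```

A flat 4-Gbyte code segment, for instance, has base 0, limit 0xFFFFF, access byte 0x9A, and flags 0xC (4K granularity, 32-bit default), which packs to the bytes FF FF 00 00 00 9A CF 00.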
Our goal with this subroutine is to turn the 386 into a "flat" 32-bit address
space, reminiscent of a 68000, and to dispatch to location 0 to execute the
above loaded program. Because we don't anticipate using any other descriptors
while our stand-alone program runs, the descriptor table itself is abandoned
in memory -- probably to be written over during protected-mode program
execution.
Interrupts are disabled before entry into protected mode. We don't yet know
where the interrupt and exception processing code exists in the protected-mode
program, so we must leave the IDT uninitialized (zero length). This means that
if an exception or interrupt occurs, the processor will spontaneously reset.
Thus, the first responsibility of a just-loaded 32-bit program must be to
sensibly initialize itself to catch these conditions.
Note that the code for entry into protected mode is PIC (Position Independent
Code). We can easily overwrite the memory of the bootstrap program itself, so
we must arrange to copy this protected-mode entry code to just above our
protected-mode program. This ensures its survival when we overwrite MS-DOS,
and quite possibly our boot program, never to return.
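The placement arithmetic can be sketched in C as follows. This is an illustration under assumptions, with a hypothetical function name: given the linear address and length of the loaded image, it returns the first 16-byte paragraph at or above the image's end, which is where a copy of the entry code could safely begin.

```c
#include <stdint.h>

/* Hypothetical helper: real-mode segment of the first paragraph
 * (16-byte unit) at or above the end of the loaded image. */
uint16_t segment_above(uint32_t image_base, uint32_t image_len)
{
    uint32_t end = image_base + image_len;   /* linear address past image */

    return (uint16_t)((end + 15) >> 4);      /* round up to a paragraph */
}
```

Rounding up rather than truncating keeps the copied code from sharing a paragraph with the last bytes of the image.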


The Second PC Utility: cpfs.exe


In addition to being able to run 32-bit protected-mode programs, we need to
load a preliminary root filesystem for our BSD UNIX kernel to access as it
initializes itself during the late phase of boot-up. Like MS-DOS, UNIX needs a
dedicated region of the hard disk drive to store the data structures and data
blocks that support its filesystem scheme. As we do not have a drive dedicated
to BSD, we must instead secure a second partition on the sole disk drive to
contain the BSD root filesystem.
cpfs.c (see Listing Five) is a program which loads this filesystem from a
previously downloaded MS-DOS file. cpfs.c leverages BIOS disk calls to write
appropriately to the absolute disk. If this program were to be commonly used,
you might wish to dig out the disk geometry and BSD partition from additional
MS-DOS and/or BIOS calls, but for our purposes, this program is sufficient.
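To make the role of the hard-wired geometry concrete, here is a small C sketch of the address arithmetic involved. The structure and function names are hypothetical; the head and sector counts are the ones used in Listing Five. It maps a sector count from the start of the partition to the cylinder, head, and sector numbers a BIOS disk call expects.

```c
/* Geometry as hard-wired in Listing Five; adjust to suit the drive. */
#define NTRACK 8    /* heads (tracks per cylinder) */
#define NSECT  17   /* sectors per track */

/* Hypothetical container for a BIOS-style disk address. */
struct chs { int cyl, head, sector; };

/* Map the n-th sector of a partition that starts at cylinder start_cyl
 * (head 0, sector 1) to a cylinder/head/sector address. BIOS sector
 * numbers start at 1, while cylinders and heads start at 0. */
struct chs sector_to_chs(long n, int start_cyl)
{
    struct chs a;

    a.cyl    = start_cyl + (int)(n / (NTRACK * NSECT));
    a.head   = (int)((n / NSECT) % NTRACK);
    a.sector = (int)(n % NSECT) + 1;
    return a;
}
```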
The first block of the root (typically, the first sector of the drive)
contains the disk label (see Listing Six diskl.h). This data structure will
eventually be used in the 386BSD port to make the system drive-independent.
However, we first need to place the seminal label on the very first
filesystem. cpfs.c, which has hard-wired geometry constants, can initially be
compiled (by defining "FIRST") to blindly write that first filesystem with
this label. In subsequent use (compiled without "FIRST" defined) cpfs.c will
use this label to validate a load, hopefully saving a bleary-eyed developer
from ultimate disaster.
When using this program with disk drives greater than 1024 cylinders, logical
translation by the disk controller to a different geometry is a problem. Some
calls used by MS-DOS and applications programs would invoke a 10-bit field for
cylinder address in the disk address data structure, reflecting an early
limitation of some PC disk controllers (the WD1010 disk controller chip, for
example). One clever workaround which doesn't require altering the operating
system is to encode disk addresses with a logical mapping scheme so that some
of the cylinder address bits would be mapped into more plentiful sector and
head address bits. This scheme, while quite acceptable to MS-DOS (which is
never picky about sector placement), is not acceptable to BSD (which is
extremely picky about sector placement). The BSD Fast Filesystem uses
rotational and head placement algorithms to improve filesystem performance by
taking disk latency into account. Therefore, running on a logically mapped
disk may significantly degrade performance by throwing off this mechanism in
the Fast Filesystem. Additional code is required to detect and defeat this
condition, because this translation must be maintained while MS-DOS is
running.
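The 10-bit limit is visible in the register layout of the BIOS INT 13h disk services, where the cylinder shares the CX register with a 6-bit sector number. The following C sketch uses a hypothetical function name and is not part of the listings; it packs the two fields and rejects values that do not fit.

```c
/* Hypothetical helper: pack a cylinder and sector number into the CX
 * register layout used by the BIOS INT 13h disk services. Returns the
 * 16-bit CX value, or -1 if either field is out of range. */
long pack_cx(unsigned cyl, unsigned sector)
{
    if (cyl > 1023u || sector < 1u || sector > 63u)
        return -1L;                       /* only 10 + 6 bits available */

    return (long)(((cyl & 0xFFu) << 8)    /* CH: cylinder bits 7:0      */
                | ((cyl >> 8) << 6)       /* CL bits 7:6: cylinder 9:8  */
                | sector);                /* CL bits 5:0: sector number */
}
```

Cylinder 1024 and beyond simply cannot be expressed in this form, which is why translating controllers trade cylinder bits for head and sector bits.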


The Third PC Utility: cpsw.exe


Our last of three programs is of use when the early BSD operating system
kernel is running and must receive additional files. Ordinarily, we would
prefer to use either serial communications or floppy disks to add files to our
nascent BSD root filesystem. However, our early kernel has drivers for only
the display, keyboard, and hard disk drive (that is, the "bare minimum"),
because we want to use the system itself to develop and test further
extensions and improvements. In a nutshell, we want to leverage our tiny BSD
UNIX system with MS-DOS's drivers and applications programs by using MS-DOS to
receive information into a MS-DOS file, and then using a trivial program to
place this information on a reserved portion of the disk, where BSD can easily
access it.
At this point, we had seriously considered giving BSD the ability to read
MS-DOS file structures directly, but this is a nontrivial process and we
wished to devote our energies toward developing and improving the BSD kernel
to become self-supporting. As a consequence, we decided to push this project
off and favor instead an expedient solution to a temporary problem.
If more disk space had been available, a partition could have been dedicated
to the MS-DOS communications functions. Unfortunately, our early host machine
contained only a 40-Mbyte drive, so we were very tight on space. (Yes, I know
we have large machines now, but when you begin a project, it is usually on the
cheap until you convince people that it is worthwhile -- of course, by that
time you've probably finished the project, or at least a fair portion of it).
We elected to force the swap space to do double duty, by arranging to use it
to hold information from and to MS-DOS just after or before BSD system
operation. We were counting on the fact that we only use the swap space when
the system really gets rolling. While this arrangement is somewhat heretical,
it worked adequately enough to let us finish our nascent system to the point
where it no longer required MS-DOS to boot or exchange files with other
systems.
cpsw.c (see Listing Seven) differs from cpfs.c in that it uses the disk label
to configure itself. Disk geometry is determined entirely from the disk label.
Prior to using cpsw.c, a TAR file image is created on a cross-host. This file
is then transferred to an MS-DOS file via one of the many MS-DOS
communications programs available. cpsw.exe is used to make this file
accessible to 386BSD. 386BSD is then booted, and the 386BSD TAR utility is
invoked to extract the information (prior to paging). This method is somewhat
tedious, but proved adequate for the early stages of this port.
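The label arithmetic can be sketched in C as follows. The first function mirrors the maxcyl computation in Listing Seven; the second is a hypothetical extra check, not present in the listing, that a cautious user might add to confirm that the file fits before anything is written. Both function names are assumptions made for illustration.

```c
/* Cylinder bound of a partition, computed from the disk label fields
 * cyloff, nblocks, and sectors per cylinder (as Listing Seven does). */
long partition_end_cyl(long cyloff, long nblocks, long secpercyl)
{
    return cyloff + nblocks / secpercyl;
}

/* Hypothetical pre-check: nonzero if a file of nbytes bytes fits within
 * a partition of nblocks sectors, each secsize bytes long. */
int file_fits(long nblocks, long secsize, long nbytes)
{
    long sectors = (nbytes + secsize - 1) / secsize;  /* round up */

    return sectors <= nblocks;
}
```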
cpsw.exe is very similar in function to cpfs.exe, and both could be subsumed
into a single program. Keeping them simple, however, let us achieve our goal
of getting 386BSD off the ground and running without the work becoming an
outright diversion into an MS-DOS/UNIX merger, a weighty objective ill suited
to an early operating system project of considerable and ever-increasing
scope but still short on history.


Where We Go From Here


Now that we have our PC utilities in place, we can plan for the next stage in
our 386BSD effort: development of the stand-alone system /sys/stand and its
utilities. This system will possess the rudimentary drivers and a library of
support routines which allow GCC programs to access devices and UNIX
file-structures on the hard disk. It will also provide us with a platform to
examine the requirements which must be met so that the 386 will support
features to be incorporated into 386BSD.


The 386BSD Project and Berkeley UNIX


The 386BSD project was established in the summer of 1989 for the specific
purpose of porting the University of California's Berkeley Software
Distribution (BSD) to the Intel 80386 microprocessor platform. Encompassing
over 150 Mbytes of operating systems, networking, and applications software,
BSD is a fully functional and nonproprietary complete operating systems
software distribution. The goal of this project was to make this cutting-edge
research version of UNIX widely available to small research and commercial
efforts on an inexpensive PC platform. By providing the base 386BSD port to
Berkeley, our hope is to foster new interest in Berkeley UNIX technology and
to speed its acceptance and use worldwide. We hope to see those interested in
this technology build upon it in both commercial and noncommercial ventures.
In each of these articles we will examine the key aspects of software,
strategy, and experience that make up a project of this magnitude. We intend
to explore the process of the 386BSD port, while learning to effectively
exploit features of the 386 architecture for use with an advanced operating
system. We also intend to outline some of the trade-offs in implementation
goals, which must be periodically reexamined. Finally, we will highlight
extensions which remain for future work.

Currently, 386BSD is available on the 386 PC platform and supports the
following:
Many different PC platforms, including the Compaq 386/20, Compaq Systempro
386, any 386 with the Chips and Technologies chipset, any 486 with the OPTI
chipset, Toshiba 3100SX, and more
ESDI, IDE, and ST-506 drives
3.5 inch and 5.25 inch floppy drives
Cartridge tape drive
Novell NE2000 and Western Digital Ethernet controller boards
EGA, VGA, CGA, and MDA monitors
287/387 floating point, including the Cyrix EMC
A single-floppy, stand-alone UNIX system, supporting modems, Ethernet, SLIP,
and Kermit to facilitate down-loading of 386BSD to any PC over the INTERNET
network.
Copies of 386BSD source code can be obtained by contacting the Computer
Systems Research Group (CSRG) at UC Berkeley. Some restrictions may apply.
While working with us through our 386BSD article series, the following texts
on Berkeley UNIX and the 80386 microprocessor are also recommended:
The Design and Implementation of the 4.3BSD UNIX Operating System, by Samuel
J. Leffler, Marshall Kirk McKusick, Michael J. Karels, and John S. Quarterman
(Addison-Wesley, 1989).
Programming the 80386 by John H. Crawford and Patrick P. Gelsinger (Sybex,
1987).
IBM Technical Reference: Personal Computer AT (IBM, 1984).
In addition, an augmented and detailed book on 386BSD by William Frederick
Jolitz and Lynne Greer Jolitz, The 386BSD Handbook, will be available in the
summer of 1991.

-- B.J., L.J.


_PORTING UNIX TO THE 386: THREE INITIAL PC UTILITIES_
by William F. Jolitz and Lynne G. Jolitz



[LISTING ONE]

/* Copyright (c) 1989, 1990 William Jolitz. All rights reserved.
 * Written by William Jolitz 7/89
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * This program allows the bootstrap load of GCC cross compiled
 * 32 bit protected mode absolute programs onto the obtuse architecture
 * of PC AT/386, destroying the running DOS in the process.
 * Currently works with TURBO C 1.5 & MASM 5.0, relies on farmalloc().
 */

#pragma inline
#include <io.h>
#include <fcntl.h>
#include <alloc.h>
#include <dos.h>
#include <sys\stat.h>
#include "exec.h"

#define PGSIZE 4096
#define CLOFSET (PGSIZE - 1) /* 386 page roundup */

/* Header record of BSD UNIX executable file */
struct exec exec;

long far_read(), to_long();
char far *to_far();
char far *add_to_far(char far *p, long n);

/* Get a file we can open, attempt to load it */

main(argc, argv)
char *argv[];
{ int fd, i;
 long addr, totalsz;
 char far *base;

 if (argc != 2) {
 printf("usage: boot <filename>\n");
 exit(1);
 }
 fd = open(argv[1], O_BINARY);
 if (fd < 0) {
 printf("boot: Cannot open file \"%s\" \n", argv[1]);
 exit(1);
 }

 /* Reasonable file to load? */
 i = read(fd, (char *)&exec, sizeof exec);
 if (i != sizeof exec ||
 (exec.a_magic != OMAGIC && exec.a_magic != NMAGIC
 && exec.a_magic != ZMAGIC)) {
 printf("Not a recognized object file format\n");
 exit(1);
 }

 /* Allocate buffer for temporary copy of protected mode executable
 Buffer space requirements: <--a.out------------> pageroundup heap */
 totalsz = exec.a_text + exec.a_data + exec.a_bss + 4096L + 20*1024L;

 /* Pad with trailing portion to put protected mode entry code in */
 base = farmalloc(totalsz + 64*1024L);
 if (base == 0) {
 printf("Cannot allocate enough memory\n");
 exit(1);
 }
 /* Load Instruction (e.g. text) portion of file */
 printf("Text %ld", exec.a_text);
 if (far_read(fd, base, exec.a_text) != exec.a_text)
 goto eof;
 /* Load Data portion of file */
 addr = exec.a_text;

 /* Adjust for page alignment for pure procedure format */
 if (exec.a_magic == NMAGIC && (addr & (PGSIZE-1)))
 while (addr & CLOFSET)
 * add_to_far(base, addr++) = 0;
 printf("\nData %ld", exec.a_data);
 if (far_read(fd, add_to_far(base,addr), exec.a_data) != exec.a_data)
 goto eof;
 /* Clear Uninitialized data (BSS) space */
 addr += exec.a_data;
 printf("\nBss %ld", exec.a_bss);
 for ( ; addr < totalsz; )
 * add_to_far(base,addr++) = 0;
 if(exec.a_entry)
 printf("\nStart 0x%lx", exec.a_entry);
#ifdef CKSUM

 /* Optionally calculate checksum to validate against cross host's copy. */

 far_cksum(base, addr-1L);
#endif /* CKSUM */

 printf("\n");

 /* Effect copydown to absolute 0 and entry into protected mode at
 location "a_entry". */
 transfer(base, totalsz, exec.a_entry);
 /* NOTREACHED */
eof:
 printf(" - File incomplete, load aborted\n");
 exit(1);
}

/* We use the routines below to always keep far pointers normalized
 * to simplify comparison/subtraction. */
char far *to_far(l) long l; {
 unsigned seg, offs;
 seg = l>>4;
 offs = l & 0xf;
 return(MK_FP(seg,offs));
}

long to_long(f) char far *f; {
 unsigned long l;
 l = FP_SEG(f)*16L + (unsigned long)FP_OFF(f);
 return(l);
}

char far *add_to_far(f,l) char far *f; long l; {
 return(to_far(to_long(f)+l));
}

char far *normalize(f) char far *f; {
 unsigned seg,offs ;

 /* add in offset */
 seg = FP_SEG(f); offs = FP_OFF(f);
 seg += (offs >> 4) ; offs &= 0xf ;
 return(MK_FP (seg, offs));
}

/* read() that works anywhere in DOS address space for any size data,
 * works via bounce buffer. Not designed for speed or elegance. */
long far_read(io, base, len) int io; long len; char far *base; {
 char far *fp;

 /* normalize far pointer to handle segment rollover case */
 fp = base = normalize(base);
 while (len) {
 static char dbuf[PGSIZE];
 long rlen,tlen;

 /* bounce buffer between my data segment and ultimate dest */
 tlen = (len > PGSIZE)? PGSIZE : len;
 if ((rlen = read (io, dbuf, tlen)) < 0) return (rlen);

 /* shoot into place */
 movedata (_DS, (unsigned)dbuf, FP_SEG(fp), FP_OFF(fp), rlen);


 /* update transfer address and count */
 fp = add_to_far(fp, rlen);
 len -= rlen ;
 if (tlen != rlen) break ;
 }
 return (to_long(fp) - to_long(base));
}

extern far protentry(); /* known to be less than 0x200 bytes long */
extern far gatea20();

/* set up to transfer to 386 program; call protentry to do the dirty work. */
transfer(base, len, entry) char far *base; long len, entry; {
 unsigned seg,offs ;
 long rbase;
 char far *fp;

 /* Copy code to top of the system and execute there. This keeps it
 from getting stepped on. */
 /* make 32 bit address */
 rbase = to_long(base);
 fp = add_to_far(base,len);
 seg = FP_SEG(fp); offs = FP_OFF(fp);

 /* protect possible conflict of top paragraph of bss */
 if (offs) seg++ ;

 /* force to protentry's offset so offsets agree */
 offs = FP_OFF(&protentry);
 movedata (FP_SEG(&protentry), offs, seg, offs, PGSIZE);

 /* degate A20 - from now on, full physical memory address bus */
 gatea20();

 /* enter prot_entry program, relocated to top of loaded program, via
 intersegment return */
 asm push word ptr rbase+2 ;
 asm push word ptr rbase ;
 asm push word ptr len+2 ;
 asm push word ptr len ;
 asm push word ptr entry+2 ;
 asm push word ptr entry ;
 asm push word ptr seg;
 asm push word ptr offs;
 asm db 0cbh; /* lret - intersegment return */

 /* within protentry: go into 32 bit mode, copy entire system to 0 with
 single string instruction, intrasegment jump to entry point */
 printf("protentry returned?!?\n");
 exit(1);

 /* NOTREACHED */
}

#ifdef CKSUM
/* 16 bit checksum of program. */
far_cksum(base, len) long len; char far *base; {
 char far *tmp;

 unsigned seg,offs ;
 long nbytes,sum, tlen;
 tmp = base;
 sum = 0;
 nbytes = 0;
 while (len) {
 /* normalize far pointer to handle segment rollover case */
 tmp = normalize(tmp);

 /* Do a page at a time */
 tlen = (len > PGSIZE)? PGSIZE : len;
 len -= tlen ;
 while (tlen--) {
 nbytes++;
 if (sum&01)
 sum = (sum>>1) + 0x8000;
 else
 sum >>= 1;
 sum += *tmp++ ;
 sum &= 0xFFFF;
 }
 }
 /* CLOFSET (PGSIZE - 1) rounds the byte count up to whole pages */
 printf("\nChecksum %05lu%6ld ", sum, (nbytes+CLOFSET)/PGSIZE);
}
#endif /* CKSUM */





[LISTING TWO]

/* Excerpted with permission from 4.3BSD include file
 * "/usr/include/sys/exec.h"
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * Header prepended to each a.out file.
 */
struct exec {
 long a_magic; /* magic number */
unsigned long a_text; /* size of text segment */
unsigned long a_data; /* size of initialized data */
unsigned long a_bss; /* size of uninitialized data */
unsigned long a_syms; /* size of symbol table */
unsigned long a_entry; /* entry point */
unsigned long a_trsize; /* size of text relocation */
unsigned long a_drsize; /* size of data relocation */
};

#define OMAGIC 0407 /* old impure format */
#define NMAGIC 0410 /* read-only text */
#define ZMAGIC 0413 /* demand load format */




[LISTING THREE]

 title _gatea20
; Copyright (c) 1989, 1990 William Jolitz. All rights reserved.
; Written by William Jolitz, July 1989
; THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
; IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
; WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
; (void) gatea20();
; Enable Address Bit 20 that was disabled by BIOS for MSDOS
; We need the gate off to use entire memory space of the AT.
; We do this just prior to entering protected mode, never to return.
;

_TEXT segment byte public 'CODE'
 assume cs:_TEXT,ds:_TEXT
_TEXT ends

Status_Port equ 64h ; 8042 Status Port
Cmd_rdy equ 2 ; Keyboard is ready?
Write_outpt equ 0d1h ; Write next data to output port
Port_A equ 60h ; 8042 Keyboard Scan and Diagnostic
EnableA20 equ 0dfh ; Enable Address bit 20 for use

_TEXT segment byte public 'CODE'

; Wait for Keyboard controller to be ready for command
wait42 proc near
chkrdy:
 in al, Status_Port
 and al, Cmd_rdy
 jnz chkrdy
 ret
wait42 endp

; Turn on A20 again.
_gatea20 proc far
 call wait42
 mov al, Write_outpt
 out Status_Port, al
 call wait42
 mov al, EnableA20
 out Port_A, al
 call wait42
 ret
_gatea20 endp

 public _gatea20
_TEXT ends
 end






[LISTING FOUR]


 title protentry
; Copyright (c) 1989, 1990 William Jolitz. All rights reserved.
; Written by William Jolitz 7/89
; Redistribution and use in source and binary forms are freely permitted
; provided that the above copyright notice and attribution and date of work
; and this paragraph are duplicated in all such forms.
; THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
; IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
; WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
; protentry(entry,len,addr,...) long entry,len,addr;
; Entered via jump or "ret" (e.g. no return address on stack),
; builds necessary data structures and transfers into 32-bit
; mode, then copies the 32-bit mode program at address "addr"
; and byte length "len" to location 0 and enters the program
; at location "entry". Note that both "entry" and "addr" are
; true 32-bit absolute pointers, NOT segment:offset pairs. It
; is assumed that both the stack and this program will not be
; overwritten in the subsequent copy to 0 of the program to be
; entered, so caller is responsible to place this in a location
; above the program.
;
; Note that this program is position-independent (self relocating).
;
; Any additional args past the necessary three will be passed on the
; stack to the entered program [note: we obviously don't provide a
; "return" address]
;
_TEXT segment byte public 'CODE'
 assume cs:_TEXT,ds:nothing
_TEXT ends

Data32 equ 66h ; prefix to toggle 16/32 data operand
JMPFAR equ 0eah ; opcode for JMP intersegment

 .186 ; allow use of shl ax,cnt insn
_TEXT segment byte public 'CODE'

_protentry proc far
 jmp short relstrt

; Global Descriptor Table contains three descriptors:
; 0h: Null: not used
; 8h: Code: code segment starts at 0 and extends for 4 gbytes
; 10h: Data: data segment starts at 0 and extends for 4 gbytes(overlays code)
GDT:
NullDesc dw 0,0,0,0 ; null descriptor - not used
CodeDesc dw 0FFFFh ; limit at maximum: (bits 15:0)
 db 0,0,0 ; base at 0: (bits 23:0)
 db 10011111b ; present/priv level 0/code/conforming/readable
 db 11001111b ; page granular/default 32-bit/limit(bits 19:16)
 db 0 ; base at 0: (bits 31:24)
DataDesc dw 0FFFFh ; limit at maximum: (bits 15:0)
 db 0,0,0 ; base at 0: (bits 23:0)
 db 10010011b ; present/priv level 0/data/expand-up/writeable
 db 11001111b ; page granular/default 32-bit/limit(bits 19:16)
 db 0 ; base at 0: (bits 31:24)

; Load Pointers for Tables
; contains 6-byte pointer information for: LIDT, LGDT


; Interrupt Descriptor Table pointer
IDTPtr dw 7FFh ; limit at maximum (allows all 256 interrupts)
 dw 0 ; base at 0: (bits 15:0)
 dw 0 ; base at 0: (bits 31:16)

; Global Descriptor Table pointer
GDTPtr dw 17h ; limit to three 8 byte selectors(null,code,data)
 dw offset GDT ; base address of GDT (bits 15:0)
 dw 0h ; base address of GDT (bits 31:16)

; Constructed instruction for entry into 32 bit protected mode
; ljmp far Note
dispat: db Data32 ; 32-bit override prefix
 db JMPFAR ; opcode for JMP intersegment
offl dw 0 ; starting address of 32-bit code (low-word)
 dw 0h ; starting address (high word of linear address)
 dw 8h ; CodeDesc selector=8h

relstrt:
 cli ; disable interrupts
 ; do address fixups
 mov ax,ss ; first, make a new 32 bit stack pointer!
 mov cx,4
 shl ax,cl ; ax now contains segment address low 16 bits
 mov bx,ss
 mov cx,12
 shr bx,cl ; bx now contains segment address high 16 bits
 add ax,sp
 adc bx,0 ; ax contains esp 15:0, bx contains esp 31:16
 mov si,ax ; pass new stack to 32bit mode via si & di
 mov di,bx

 mov ax,cs
 mov cx,4
 shl ax,cl ; ax now contains segment address low 16 bits
 mov bx,cs
 mov cx,12
 shr bx,cl ; bx now contains segment address high 16 bits

 mov cx,cs:GDTPtr+2
 mov dx,bx
 add cx,ax
 mov cs:GDTPtr+2,cx
 adc cs:GDTPtr+4,dx
 mov cx, OFFSET(cpydwn)
 mov dx,bx
 add cx,ax
 mov cs:offl,cx ; overflow?
 adc cs:offl+2,dx

 ; Load the descriptor tables

; lidt cs:IDTPtr ; load Interrupt Descriptor Table
db 2eh,0Fh,01h,00011110b
dw offset IDTPtr
; lgdt cs:GDTPtr ; load Global Descriptor Table
db 2Eh,0Fh,01h,00010110b
dw offset GDTPtr

; smsw ax ; put Machine Status Word in AX
db 0fh, 01h, 11100000b
 or al,1 ; activate Protection Enable bit
; lmsw ax ; store Machine Status Word, begin protected mode
db 0fh,01h,11110000b

 jmp short Next ; flush prefetch queue

 ; Load the segment registers with appropriate descriptor selectors

Next: mov bx,10h ; set segment registers to DataDesc
 mov ss,bx ; load SS,DS,ES segment registers with DataDesc
 mov ds,bx
 mov es,bx

 ; Load CS via above's constructed ljmp, entering 32 bit protected mode
 jmp short dispat

 ; Finally running in Protected 32-bit Mode
cpydwn:
 mov ax,di ; movl %edi,%eax
 shl ax,16 ;db 0c1h,0e0h,10h ; shll $16,%eax
 db Data32
 mov ax,si ; movw %si,%ax
 mov sp,ax ; movl %eax,%esp
 pop ax ; pop eax ; entry addr
 pop cx ; pop ecx ; byte size
 pop si ; pop esi ; source address
 xor di,di ; xor edi,edi ; destination address
 cld
 rep movsb ; copy into place
 mov sp,si ; movl esp,esi
 jmp ax ; jmp eax ; go to entry
_protentry endp

 public _protentry
_TEXT ends
 end



[LISTING FIVE]

/* Copyright (c) 1989, 1990 William Jolitz. All rights reserved.
 * Written by William Jolitz 7/89
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * This program copies a BSD filesystem out of an MSDOS file and
 * places it on an pre-reserved disk partition. Note that both the
 * geometry of the particular disk, and the particulars of the
 * BSD partition need to be adjusted to suit the drive on which this will
 * be used. Normally, this would be a very rude requirement, but
 * we tolerate this because this program is a throw-away used to get
 * us started, and we have better schemes to deal with configuration
 * a little further down the pike.
 * Currently works with TURBO C 1.5 .
 */

#include <bios.h>
#include <io.h>
#include <fcntl.h>
#include <sys\stat.h>
#include "diskl.h"

/* Disk geometry (here, a NEC DS5146). Adjust parameters to suit drive. */
#define NCYL 615
#define NTRACK 8
#define NSECT 17

#define BSIZE 512 /* Disk block size */

/* Location & size of root partition. Adjust for drive partition layout. */
#define OFF_CYL 290 /* Cylinder offset of start of BSD root partition */
#define ROOTSZ 50 /* size of root partition, in units of cylinders */

char trkbuf[NSECT*BSIZE];

struct label_blk {
 char bufr[LABELOFFSET];
 struct disklabel dl;
} lbl;

main (argc, argv) char *argv[]; {
 int fi, rem, cyl, head, sector, tfrcnt;
 if (argc != 2) {
 printf ("usage: cpfs <rootfs>\n");
 exit (1);
 }
 fi = open (argv[1],O_BINARY);
 if (fi < 0) {
 printf ("Cannot open \"%s\" file to read filesystem\n",
 argv[1]);
 exit (1);
 }
 cyl = OFF_CYL;
 tfrcnt = head = 0;

#ifndef FIRST
 /* check for presence of disklabel */
 biosdisk (2, 0x80, 0, OFF_CYL, LABELSECTOR, 1, &lbl);
 if (lbl.dl.dk_magic != DISKMAGIC) {
 printf ("BSD Disk partition does not have a label!\n");
 exit (1);
 }

 /* Treat first track of data special; use disk label in first block of
 * file to validate that the file to be loaded and disk drive
 * partition are appropriate for each other. */
 read (fi, trkbuf, BSIZE);
 /* binary label data may contain zero bytes, so compare with memcmp */
 if (memcmp ((char *)trkbuf, (char *)&lbl, BSIZE) != 0) {
 printf ("BSD root partition and filesystem mismatch!\n");
 exit (1);
 }


 /* reset filesystem file to beginning */
 lseek (fi, 0, SEEK_SET);
#endif

 printf ("WARNING! About to overwrite disk (will lose previous\n");
 printf ("contents). Are you certain of your use of this program?");
 if (getche () != 'y') exit (1);
 printf("\n");

 /* Transfer file to absolute disk section, a track at a time,
 because we're impatient. */
 while ((rem = read (fi, trkbuf, NSECT*BSIZE)) == NSECT*BSIZE) {
 biosdisk (3, 0x80, head, cyl, 1, NSECT, trkbuf);
 if (++head == NTRACK) {
 head = 0;
 if (++cyl > NCYL || cyl > OFF_CYL+ROOTSZ ) {
 printf ("Overran root partition!\n");
 exit (1);
 }
 }
 tfrcnt += NSECT;
 printf ("Amount transferred %5dK bytes\r",
 tfrcnt*BSIZE/1024);
 }

 /* Transfer any remainder leftover in track buffer. */
 if (rem > BSIZE-1) {
 biosdisk (3, 0x80, head, cyl, 1, rem/BSIZE, trkbuf);
 tfrcnt += rem/BSIZE;
 printf ("Amount transferred %5dK bytes\n",
 tfrcnt*BSIZE/1024);
 }

 exit (0);
}





[LISTING SIX]


/* Copyright (c) 1985,1986,1989,1990 Michael J. Karels. All rights reserved.
 * Based on a concept by Sam Leffler. Written by Michael J. Karels 4/85
 * Revised by William Jolitz 86-90.
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * Each disk has a label which includes information about the hardware
 * disk geometry, filesystem partitions, and drive specific information.
 * The label is in block 1, offset from the beginning to leave room
 * for a bootstrap, etc.
 */


#define LABELSECTOR 1 /* sector containing label */
#define LABELOFFSET (BSIZE-120) /* offset of label in sector */
#define DISKMAGIC 0xabc /* The disk magic number */
#define DTYPE_ST506 1 /* ST506 Winchester */
#define DTYPE_FLOPPY 2 /* 5-1/4" minifloppy */
#define DTYPE_SCSI 3 /* SCSI Direct Access Device */

struct disklabel {
 short dk_magic; /* the magic number */
 short dk_type; /* drive type */
 struct dcon {
 short dc_secsize; /* # of bytes per sector */
 short dc_nsectors; /* # of sectors per track */
 short dc_ntracks; /* # of tracks per cylinder */
 short dc_ncylinders; /* # of cylinders per unit */
 long dc_secpercyl; /* # of sectors per cylinder */
 long dc_secperunit; /* # of sectors per unit */
 long dc_drivedata[4]; /* drive-type specific information */
 } dc;
 struct dpart { /* the partition table */
 long nblocks; /* number of sectors in partition */
 long cyloff; /* starting cylinder for partition */
 } dk_partition[8];
 char dk_name[16]; /* pack identifier */
};

#define dk_secsize dc.dc_secsize
#define dk_nsectors dc.dc_nsectors
#define dk_ntracks dc.dc_ntracks
#define dk_ncylinders dc.dc_ncylinders
#define dk_secpercyl dc.dc_secpercyl
#define dk_secperunit dc.dc_secperunit

/* Drive data for ST506. */
#define dk_precompcyl dc.dc_drivedata[0]
#define dk_ecc dc.dc_drivedata[1] /* used only when formatting */
#define dk_gap3 dc.dc_drivedata[2] /* used only when formatting */

/* Drive data for SCSI */
#define dk_blind dc.dc_drivedata[0] /* can we work in "blind" i/o */







[LISTING SEVEN]


/* Copyright (c) 1989, 1990 William Jolitz. All rights reserved.
 * Written by William Jolitz 7/89
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * This program copies a MSDOS file to BSD's idea of a swap partition,
 * known to be the second one in the disklabel. Typical use is to place
 * a TAR formatted file, obtained from a cross-host, onto swap. Then
 * BSD is booted with the boot program and the BSD tar utility is
 * used to extract the files being transferred within the TAR image,
 * hopefully before we need to page on the swap space. Again, this
 * program is rude in requiring one to adjust the manifest constant
 * denoting the cylinder on which the BSD root filesystem appears,
 * but this is another throw-away program to get the real work started.
 * Currently works with TURBO C 1.5 .
 */

#include <bios.h>
#include <alloc.h>
#include <fcntl.h>
#include <sys\stat.h>
#include "diskl.h"

/* Location of root partition. Adjust to suit given drive partition layout. */
#define OFF_CYL 290 /* Cylinder offset of start of BSD root partition */

char *trkbuf;

#define BSIZE 512
struct label_blk {
 char bufr[LABELOFFSET];
 struct disklabel dl;
} lbl;

main(argc, argv) char *argv[]; {
 int fi, rem, cyl, head, tfrcnt;
 int bsize, ncyl, ntrack, nsect, off_cyl, maxcyl;

 if (argc != 2) {
 printf("usage: cpsw <file>\n");
 exit(1);
 }

 fi = open(argv[1], O_BINARY);
 if (fi < 0) {
 printf("Cannot open \"%s\" file to BSD swap\n",
 argv[1]);
 exit(1);
 }

 /* check for presence of disklabel */
 biosdisk(2, 0x80, 0, OFF_CYL, LABELSECTOR, 1, &lbl);
 if (lbl.dl.dk_magic != 0xabc) {
 printf("BSD root disk partition does not have a label!\n");
 exit(1);
 }

 /* Extract disk geometry and swap partition location from disk label. */
 bsize = lbl.dl.dk_secsize;
 nsect = lbl.dl.dk_nsectors;
 ntrack = lbl.dl.dk_ntracks;
 off_cyl = lbl.dl.dk_partition[1].cyloff;
 maxcyl = lbl.dl.dk_partition[1].cyloff +
 lbl.dl.dk_partition[1].nblocks / lbl.dl.dk_secpercyl;


 /* Allocate track buffer */
 trkbuf = malloc (nsect*bsize);

 printf("WARNING! About to overwrite disk (will lose previous\n");
 printf("contents). Are you certain of your use of this program?");
 if (getche() != 'y') exit(1);
 printf("\n");

 tfrcnt = head = 0;
 cyl = off_cyl;

 /* Transfer file to absolute disk section, a track at a time,
 because we're impatient. */
 while ((rem = read(fi, trkbuf, nsect*bsize)) == nsect*bsize) {
 biosdisk(3, 0x80, head, cyl, 1, nsect, trkbuf);
 if (++head == ntrack) {
 head = 0;
 if (++cyl > maxcyl) {
 printf("Overran swap partition!\n");
 exit(1);
 }
 }
 tfrcnt += nsect;
 /* long arithmetic: int is only 16 bits under Turbo C */
 printf("Amount transferred %5ldK bytes\r",
 (long)tfrcnt*bsize/1024);
 }

 /* Transfer any remaining whole sectors */
 if (rem >= bsize) {
 biosdisk(3, 0x80, head, cyl, 1, rem/bsize, trkbuf);
 tfrcnt += rem/bsize;
 /* long arithmetic: int is only 16 bits under Turbo C */
 printf("Amount transferred %5ldK bytes\n",
 (long)tfrcnt*bsize/1024);
 }
 exit(0);
}


























February, 1991
 REMOTE CONNECTIVITY FOR PORTABLE TERMINALS: PART I


VT100 terminal emulation and more for 8051-based systems




Dan Troy


Dan is software manager at Murata Hand-Held Terminals and is currently
developing more operating system firmware and application software for the
Links product line. He can be reached at Murata HHT, 9 Columbia Dr., Amherst,
NH 03031.


In an effort to achieve remote VT100 hand-held connectivity and thereby extend
the usefulness of our Links terminal (the 8051-based, hand-held, touchscreen
terminal with built-in modem seen in Figure 1), we developed a standard
application called "Links100" that would emulate the VT100's 24 x 80 virtual
screen image on the much smaller 12 x 20 Links display. This two-part article
discusses the development of that custom VT100 terminal-emulation application.
This month, I'll examine the development of the background VT100; in Part II,
I'll discuss the application development process itself.
Virtually all of the development work was done in C using our Speed
development package (which is built around ANSI standard C language with
special Links extensions to C) and an ANSI Standard 8051 Cross Compiler with a
modified library to interface with the PROM-based operating system residing in
the Links. Applications were written in C on a PC, compiled under Microsoft C,
tested on the PC using the Murata Speed C Emulator (with optional use of
Codeview), and cross compiled and loaded from the PC into the Links'
non-volatile program RAM.


Developing the VT100 Driver


Because we were working on a tight development schedule, I decided early on to
build the program around an existing VT100 terminal-emulation program, if
possible. The criterion for selecting an emulation program was
straightforward: It had to be written in C for an IBM-compatible PC,
preferably in DOS-file format. My plan was to subsequently port the library
over to our hand-held terminal with minimal changes, thereby saving a
substantial amount of software development time. After evaluating several
libraries, I settled on the C Communications Toolkit (Magna Carta Software,
Garland, Texas) because it seemed to be complete (among other features, it
includes a variety of terminal emulation options, single and multifile
transfer with ASCII, XModem, YModem, Kermit, and so on). Furthermore, the
toolkit comes with full source code, requires no royalties for object code
developed from the source, and is relatively inexpensive.
Once I had the C Communications Toolkit source code in hand, the first step in
porting the code to run on the Links terminal was to replace Magna Carta's DOS
int86( ) interrupt function calls and 8250 UART accessing calls with Links
8051-based operating system calls. Table 1 lists the Links functions that were
substituted for Magna Carta serial read and write commands in the VT100
emulation code. The primary VT100 emulator routine was rewritten as shown in
Listing One. TERMINALP represents the default VT100 setup parameters, which
are defined in the structure shown in Listing Two. The write_screen function
in process_VT100_input (see Listing One) interprets the character according to
VT100 rules. Likewise, the DOS int86() interrupt calls within Magna Carta's
screen access and cursor positioning functions were replaced with Links
functions.
Table 1: Links functions that were introduced into the Magna Carta VT100
emulation source code.

 Function             Description
 ------------------------------------------------------------------------
 char eof(void);      Checks to see if a character has been received in
                      the Links terminal serial port (end of file?).
                      Returns a TRUE/FALSE condition.

 char read(char *);   Reads a character out of the Links terminal serial
                      input buffer. Character read is passed back in the
                      function call, and the read status is returned.

 void write(char);    Writes a character out of the Links terminal serial
                      port.
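
The three calls in Table 1 amount to a small byte-queue interface. The
following is a minimal host-side sketch, not the Links implementation: a ring
buffer stands in for the serial input buffer so that code written against
these calls can be exercised on a PC. The names links_eof, links_read,
links_write, sim_receive, and sim_last_sent are hypothetical (the first three
are renamed to avoid clashing with the C library's read and write), and the
read status values are assumed. The eof polarity follows Listing One, which
tests !eof() to mean that a character is waiting.

```c
#include <stddef.h>

#define RX_SIZE 64                 /* hypothetical buffer size */

static char   rx_buf[RX_SIZE];     /* simulated serial input buffer */
static size_t rx_head, rx_tail;    /* head: next write, tail: next read */
static char   tx_last;             /* last byte "sent" by links_write */

/* Simulation helper: pretend a byte arrived from the host.
   Returns 1 if stored, 0 if the buffer is full. */
int sim_receive(char c)
{
    size_t next = (rx_head + 1) % RX_SIZE;
    if (next == rx_tail)
        return 0;
    rx_buf[rx_head] = c;
    rx_head = next;
    return 1;
}

/* Nonzero when no received character is waiting. */
char links_eof(void)
{
    return (char)(rx_head == rx_tail);
}

/* Pass the next character back through c.
   Returns 1 on success, 0 if nothing was waiting (assumed status values). */
char links_read(char *c)
{
    if (rx_head == rx_tail)
        return 0;
    *c = rx_buf[rx_tail];
    rx_tail = (rx_tail + 1) % RX_SIZE;
    return 1;
}

/* Record the byte instead of driving a UART. */
void links_write(char c)
{
    tx_last = c;
}

/* Inspect what links_write last "sent". */
char sim_last_sent(void)
{
    return tx_last;
}
```

With a shim of this kind, a routine shaped like process_VT100_input can be
fed canned host data through sim_receive and stepped under a debugger.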

Next, our software development team considered two possible ways of
implementing the VT100 command processor (the VT100 driver). The first
approach was to call the driver from the Links100 application, requiring the
driver to be called in a main-processing loop of the application. The second
approach, on the other hand, was to call the driver transparently from an 8051
interrupt. This would allow the driver to be installed from the application at
initialization, but process the VT100 commands transparently from the
low-level operating system timer interrupt handler.
We decided to have the VT100 driver executed in the background via the
interrupt so that the engineer writing the Links100 application would not have
to be concerned with calling the VT100 driver repetitively from a
main-processing loop, or in multiple locations within the application. This
required modifying the existing Links operating system to handle the
installation and calling of the driver. Furthermore, this same installation
methodology can now be used for any background driver for future applications.


Installing the VT100 Driver


In order to install the background driver, we developed an initialization
routine, written in 8051 assembly code and callable from C (see Listing
Three). This function takes the address of the background driver, storing it
at emul_processor_address. It then stores the pointer to any data to pass to
it in data_struct.
The function call to install the driver within the Links application is in
Listing Four. To utilize this function, the prototype can be declared as shown
in Listing Five. The execution of the call to init_emulation_mode accesses the
Links operating system by calling an 8051 assembly library routine. The
routine jumps to a fixed-location jump table in the operating system, which,
in turn, jumps to the beginning address of the init_emulation_mode routine
previously described. This is how the two separately linked programs (the
Links100 application and the Links operating system) are linked together. Once
the address of the driver has been stored, it can be easily accessed by the
execution mechanism.
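
In C terms, the installation step boils down to recording a function pointer
and a data pointer where the timer handler can find them. The sketch below is
a host-side illustration only: install_driver, run_installed_driver,
demo_driver, irq_disable, and irq_enable are hypothetical names, and the
interrupt helpers are empty stand-ins for the 8051's "clr ea" and "setb ea".
Unlike Listing Three, the sketch stores both values inside the guarded
region.

```c
#include <stddef.h>

typedef void (*EMUL_FN)(char *);   /* driver entry point taking its data block */

static EMUL_FN emul_processor = NULL;  /* role of emul_processor_address */
static char   *emul_data      = NULL;  /* role of data_struct */
int driver_calls;                      /* bumped by the demo driver */

/* Hypothetical stand-ins for "clr ea" and "setb ea"; no-ops on a host. */
static void irq_disable(void) { }
static void irq_enable(void)  { }

/* Record the driver and its data pointer for the timer handler to use. */
void install_driver(EMUL_FN fn, char *data)
{
    irq_disable();      /* a handler must never see a half-written address */
    emul_data = data;
    emul_processor = fn;
    irq_enable();
}

/* What the timer handler does on each tick: call the stored driver, if any.
   Returns 1 if a driver ran, 0 if none is installed. */
int run_installed_driver(void)
{
    if (emul_processor == NULL)
        return 0;
    emul_processor(emul_data);
    return 1;
}

/* Demo driver: adds the first byte of its data block to driver_calls. */
void demo_driver(char *data)
{
    driver_calls += data[0];
}
```

Keeping the stored pointers generic is what lets the same mechanism serve any
future background driver, not just the VT100 one.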


Executing the VT100 Driver


Our next step was to modify the 8051 timer 0 interrupt handler within the
Links operating system in order to periodically execute the
process_VT100_input function. This timer was the only one available, and it
was already being used by the operating system to handle several other
processes. Because the VT100 driver could possibly have taken longer to
execute than the interrupt rate, a recursion protection semaphore was
introduced in the setup to the VT100 driver function call (Listing Six).
When the timer 0 interrupt handler is called, it first protects all 8051
registers that the C code driver (in this case, process_VT100_input) could
possibly use. Next, it puts the address of the
return_from_emul_processor routine on the 8051 program stack (see Listing
Seven). It then puts the address of the process_VT100_input function on the
stack. Before returning from the interrupt, the pointer to the default data
structure being passed to the C function is put into 8051 registers r2 and r3
(in this case, the pointer to the VT100 default parameters data structure).
Upon performing the return from interrupt, the execution of the program
resumes at process_VT100_input. When it is finished executing, program
execution resumes at return_from_emul_processor and restores the 8051 program
stack to the state that existed prior to the timer 0 interrupt. The program
execution then returns to the point where it was interrupted.
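
The recursion guard is easier to see in C than in the assembly. The following
sketch assumes a host environment in which a "tick" is just a function call;
timer_tick, set_tick_driver, slow_driver, skipped_ticks, and runs are
hypothetical names. On the 8051 the flag is tested and set with interrupts
masked, which this single-threaded sketch does not need to model. The demo
driver calls timer_tick itself to imitate a tick that arrives while the
driver is still running.

```c
typedef void (*DRIVER_FN)(char *);

static volatile char in_emul_processor;  /* recursion-prevention flag */
static DRIVER_FN tick_driver;            /* driver to run on each tick */
static char *tick_data;                  /* data block passed to it */
int skipped_ticks;                       /* ticks ignored while busy */
int runs;                                /* times the demo driver ran */

void set_tick_driver(DRIVER_FN fn, char *data)
{
    tick_driver = fn;
    tick_data = data;
}

/* One timer tick: run the driver unless a previous call is still active. */
void timer_tick(void)
{
    if (in_emul_processor) {
        skipped_ticks++;
        return;
    }
    in_emul_processor = 1;
    if (tick_driver)
        tick_driver(tick_data);
    in_emul_processor = 0;
}

/* Demo driver that overruns its slot: a tick arrives while it is running. */
void slow_driver(char *unused)
{
    (void)unused;
    runs++;
    timer_tick();
}
```

Each outer tick runs the driver once and the nested tick is dropped, which is
the behavior the semaphore in Listing Six is meant to guarantee.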



Accessing the VT100 Virtual Screen Image


We developed two support routines to access the 24 x 80 VT100 virtual image:
One to read characters from the image and the other to get the cursor
position. The function in Listing Eight was developed to read characters from
the image. This function passes back a string of characters starting at the
specified row (1 - 24) and column (1 - 80) and ending at either the number of
characters specified in number_to_read or the end of the row, whichever occurs
first in the read. The return parameter is the pointer to the string. The get
cursor position function is shown in Listing Nine.
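
The "whichever occurs first" rule is simple arithmetic on 1-based columns. As
a small sketch (clipped_count is a hypothetical helper, not part of the
driver), a request for 20 characters starting at column 75 is clipped to 6,
because only columns 75 through 80 remain on the row:

```c
#define VT100_MAX_COLS 80

/* Number of characters a read starting at 1-based column col can return
   when number_to_read characters are requested. */
int clipped_count(int col, int number_to_read)
{
    if (col < 1 || col > VT100_MAX_COLS || number_to_read < 0)
        return 0;
    if (number_to_read + col > VT100_MAX_COLS + 1)
        number_to_read = (VT100_MAX_COLS + 1) - col;
    return number_to_read;
}
```

The destination buffer therefore needs room for at most the requested count
plus one byte for the terminating zero.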
Some global parameters were made available to determine whether certain
expected information had arrived. The active flag is a TRUE/FALSE condition
that notifies the user if any data has come in. It must be initialized and
reset to FALSE by the application program using the driver in order to detect
the active state change. The total character, line feed, and carriage return
count variables were created to allow the application program knowledge of how
much of the expected data has arrived. This was done because it is often
impossible to know how long the host will take to send the expected data.
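
One way an application might use these globals is to poll the counters until
the expected number of lines has arrived or a poll budget runs out. This is a
sketch under stated assumptions: the article names only the active flag, so
total_chars, total_lf, and total_cr are hypothetical counter names, and
sim_driver_input and feed_one stand in for the background driver and the
host.

```c
volatile char     active;       /* set by the driver on any input */
volatile unsigned total_chars;  /* hypothetical counter names */
volatile unsigned total_lf;
volatile unsigned total_cr;

static const char *feed = "login:\r\nok\r\n";  /* canned host response */

/* Stand-in for the background driver receiving one character. */
void sim_driver_input(char c)
{
    active = 1;
    total_chars++;
    if (c == '\n') total_lf++;
    if (c == '\r') total_cr++;
}

/* Stand-in for the host: deliver the next canned character, if any. */
void feed_one(void)
{
    if (*feed)
        sim_driver_input(*feed++);
}

/* Poll until `lines` more line feeds have arrived or `max_polls` polls
   have elapsed. Returns 1 on success, 0 on timeout. */
int wait_for_line_feeds(unsigned lines, unsigned max_polls, void (*poll)(void))
{
    unsigned start = total_lf;
    while (max_polls-- > 0) {
        if (total_lf - start >= lines)
            return 1;
        if (poll)
            poll();
    }
    return total_lf - start >= lines;
}
```

On the Links itself the driver runs from the timer interrupt, so the poll
callback would be unnecessary; the loop would simply watch the counters.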


Benchmarking


Some VT100 command processing functions did not meet the required performance
execution time because of some of the 8051 code generated by the cross
compiler. For one thing, we found that the clear screen VT100 escape sequence
(ESC [2J), when sent from a host to the Links, was much too slow. This "slow"
version appears in Listing Ten. The code generated by the cross compiler for
this version calculated the screen position by first determining the screen
image offset from the indices and then adding it to the screen image base
address. This is very intense processing when assembly code is generated from
C, especially because we are using an 8-bit processor for 16-bit addressing.
In the improved version (see Listing Eleven), we replaced the offset
calculations with a running pointer and unrolled the loop, so it performs one
tenth as many loop comparison operations. Execution
time for the slow version was 1.96 seconds, compared to the improved time of
0.33 seconds. The code size, however, increased from 145 to 213 bytes. We
decided that this was a small price to pay for the dramatic performance
increase.
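
The quoted times work out to a speedup of roughly 1.96 / 0.33, or about 5.9
times, for 68 extra bytes of code. The host-side sketch below is not the
cross-compiled code; it only checks that an indexed clear and a ten-way
unrolled pointer clear leave identical buffers, and counts loop iterations to
show where the "one tenth" comes from. The names clear_indexed,
clear_unrolled, screens_match, screen_a, and screen_b are hypothetical.

```c
#include <string.h>

#define ROWS 24
#define COLS 80

char screen_a[ROWS][COLS];   /* cleared by the indexed version */
char screen_b[ROWS][COLS];   /* cleared by the unrolled version */

/* Indexed clear, as in the slow version; returns inner-loop iterations. */
unsigned clear_indexed(void)
{
    unsigned n = 0;
    int i, j;
    for (j = 0; j < ROWS; j++)
        for (i = 0; i < COLS; i++, n++)
            screen_a[j][i] = ' ';
    return n;
}

/* Pointer walk unrolled ten times; returns loop iterations.
   Relies on ROWS * COLS being a multiple of ten. */
unsigned clear_unrolled(void)
{
    unsigned n = 0;
    char *current = (char *)screen_b;
    char *end = current + ROWS * COLS;   /* one past the last cell */
    do {
        *current++ = ' '; *current++ = ' ';
        *current++ = ' '; *current++ = ' ';
        *current++ = ' '; *current++ = ' ';
        *current++ = ' '; *current++ = ' ';
        *current++ = ' '; *current++ = ' ';
        n++;
    } while (current < end);
    return n;
}

/* Nonzero when both buffers hold the same contents. */
int screens_match(void)
{
    return memcmp(screen_a, screen_b, sizeof screen_a) == 0;
}
```

The indexed version tests its inner loop condition once per cell (1,920
bodies), while the unrolled version tests once per ten cells (192
iterations).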
Similar improvements were made in the clear from cursor position to end of
screen, clear from beginning of screen to cursor position, and clear line
escape commands.
Next, we determined that scrolling performance was also below an acceptable
level (see Listing Twelve), so we changed the VT100 scroll-up function as
shown in Listing Thirteen. In this case, the slow version took 2.80 seconds to
execute versus 0.57 seconds for the improved version (based on scrolling up 23
rows). Likewise, the code size went up from 294 to 565 bytes. The scroll_down
function was similarly improved so that all of the performance criteria were
met.
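
For the scroll, the quoted times give a speedup of roughly 2.80 / 0.57, or
about 4.9 times. When testing on a PC, it can help to have an independent
reference to compare against. The sketch below deliberately swaps in a
different technique, the standard library's memmove and memset, rather than
the unrolled pointer loop used on the Links; scroll_up_ref and scr are
hypothetical names, and the rows are 1-based and inclusive, matching
begin_scroll_row and end_scroll_row.

```c
#include <string.h>

#define ROWS 24
#define COLS 80

char scr[ROWS][COLS];   /* stand-in for the VT100 virtual screen image */

/* Scroll rows begin_row..end_row up one line and blank the last row of
   the region. Out-of-range or inverted regions are ignored. */
void scroll_up_ref(int begin_row, int end_row)
{
    if (begin_row < 1 || end_row > ROWS || begin_row > end_row)
        return;
    if (end_row > begin_row)
        memmove(scr[begin_row - 1], scr[begin_row],
                (size_t)(end_row - begin_row) * COLS);
    memset(scr[end_row - 1], ' ', COLS);
}
```

A one-row scroll region simply blanks that row, an edge case worth checking
in any hand-unrolled version as well.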


Emulating a VT100 Application on a PC


Earlier, I mentioned the Speed C development package as a tool available to
develop applications on a PC to run on the Links terminal. To adequately
develop a VT100-based application for the Links, we needed to extend the Speed
C PC-based Links terminal emulator so that it could simulate the VT100 driver
we had just developed.
First, we needed to introduce a timer-interrupt handler so that it could
execute the VT100 driver on a periodic basis, just like the Links terminal.
Even though the timing would not be exactly the same, it would be close enough
for simulation. When the timer interrupt is called, it calls the same VT100
driver that will ultimately reside in the Links, except that it is compiled
with Microsoft C.
Second, we added to the Speed C PC-based emulator the ability to display the
24 x 80 character virtual image on the PC screen. We utilized an off-the-shelf
PC Windows package to display the virtual buffer. As VT100 commands are
processed by the Links PC-based emulator, the displayable characters are shown
in the window. Also, the characters are displayed on the Links screen as if
the application were running on the Links instead of the PC. This is
accomplished by loading a specialized application into the Links prior to
executing the application to be tested on the PC. This Links-based application
accepts PC characters, which are to be displayed on the Links, and also tells
the Links terminal where to display them so that it is properly emulated.


Using the VT100 Driver in an Application


To demonstrate use of the VT100 driver, the application in Listing Fourteen
extracts the first 20 characters of the first three rows of the 24 x 80 VT100
virtual buffer image in the Links terminal. It then displays them on the Links
screen using the special Links x function, which handles all the Links display
commands. The D command ("define a touch-sensitive display area on the Links
terminal") and the P command ("print a character string to a defined
touch-sensitive display area") are used to display the received data. This
data is processed through the Links RS-232 port at 9600 baud, 8 data bits, no
parity, and 1 stop bit (see the statement open("98N1",1) in Listing
Fourteen). The second parameter, 1, means that the XON/XOFF protocol is
enabled.
To execute the demonstration, I compiled the program using Microsoft C on my
PC, then linked with the Speed C emulation libraries and the VT100 emulation
object code. Next, I tested the mini-application on the PC. I was able to see
the full 24 x 80 character virtual screen image on the PC and the first three
rows and first 20 columns of the VT100 image displayed on the Links terminal
simultaneously. The Links terminal was tied into the COM1 serial port of my
PC, and the simulated host was sending VT100 commands and ASCII data to my PC
through the COM2 serial port.
The next step was to cross compile the application using the Speed C cross
compiler. I also cross compiled the VT100 driver C source and linked it with
the application and the Speed C library, using the Speed C linker. I then
loaded the executable application via the Speed C PC-based loader into the
Links terminal, and used the same PC as the simulated host to send the VT100
commands and data to test the application. I was able to display the same
screen image on the Links that was on the Speed C emulator when I sent the
same VT100 data from the simulated host!


Next Month


In Part II, I'll develop the Links100 application, which emulates a VT100
terminal using just about every feature of the Links touch-sensitive display,
including graphics. I'll also cover the ergonomics involved in developing
applications for small-size, touch-sensitive systems.


_REMOTE CONNECTIVITY FOR PORTABLE TERMINALS: PART I_
by Dan Troy


[LISTING ONE]

void process_VT100_input(char *z)
{
char data;
TERMINALP t = (TERMINALP)z; /* default VT100 parameters */

 if(!eof()) /* if character exists in */
 { /* serial buffer */
 read(&data); /* then read it */
 active = TRUE; /* set global activity flag */
 write_screen(t, data); /* process VT100 character */
 }
}







[LISTING TWO]

typedef struct
{
 char addlf; /* line feed/new line */
 char keymode; /* cursor/application */
 char insert; /* replace/insert */
 char autowrap; /* off/on */
 char keypad; /* numeric/alternate */
 char origin; /* absolute/relative */
 char kblock; /* keyboard unlock/lock */
}TERMINAL, *TERMINALP;







[LISTING THREE]

init_emulation_mode:

clr ea ;shut off interrupts
mov dptr,#emul_processor_address ;get c function address
mov a,r3 ;get low byte of function
 ;from C call
movx @dptr,a ;save at storage address
inc dptr ;inc processor address ptr
mov a,r2 ;get high byte of function
movx @dptr,a ;save at storage address
setb ea ;turn back on interrupts

lcall get_and_decr_stack_pointer ;get data stack ptr
 ;parameter data struct
movx a,@dptr ;get low byte of setup
 ;parameter data struct
xch a,b ;save in b reg
lcall get_and_decr_stack_pointer ;adjust data stack ptr
movx a,@dptr ;get high byte of setup
 ;parameter data struct
push acc ;save on program stack
mov dptr,#data_struct ;get storage address
xch a,b
movx @dptr,a ;save low byte of setup
inc dptr
pop acc
movx @dptr,a ;save high byte of setup

ret








[LISTING FOUR]

init_emulation_mode(process_VT100_input, (char *) t);






[LISTING FIVE]

typedef void(*PTF) (); /* a pointer to a function */
extern void init_emulation_mode(PTF, char *);

And the VT100 driver can be installed as follows:

/* initialize VT100 default parameters */

cursor.row = cursor.col = 1;
t->origin = t->addlf = t->keymode = RESET;
t->kblock = t->insert = t->autowrap = RESET;
t->keypad = NUMERIC;

clr_display(); /* clear LINKS display */

init_emulation_mode(process_VT100_input, (char *) t);






[LISTING SIX]

BEFORE

timer0:
 .
 .
 .
reti


AFTER

timer0:
 .
 .
 .
jnb in_emul_processor,process_emul ;if the emulation driver
reti ;is already running, then
 ;return from interrupt
process_emul: ;else call the driver

clr ea ;shut off interrupts while the
setb in_emul_processor ;recursion prevention semaphore
setb ea ;is set

push dph ;protect all registers that the C code driver
push dpl ;could possibly use (includes all of bank 3
push psw ;registers)
push acc
push b
push 18h
push 19h
push 1ah
push 1bh
push 1ch
push 1dh
push 1eh
push 1fh

mov dptr,#return_from_emul_processor ;put return from
push dpl ;emulation driver
push dph ;on stack

mov dptr,#emul_processor_address ;put emulation processor
movx a,@dptr ;driver address on stack
push acc
inc dptr
movx a,@dptr
push acc

mov dptr,#data_struct ;get the pointer to any
movx a,@dptr ;data to be passed to
mov r2,a ;the C language driver.
inc dptr ;pointer address is
movx a,@dptr ;stored in r2 and r3
mov r3,a

reti ;calls generic emulation driver
 ;(last address on program stack)







[LISTING SEVEN]

return_from_emul_processor:

pop 1fh ; restore stack prior to call to VT100 driver
pop 1eh
pop 1dh
pop 1ch
pop 1bh
pop 1ah
pop 19h
pop 18h
pop b
pop acc
pop psw
pop dpl
pop dph

clr ea ;reset recursion prevention semaphore
clr in_emul_processor ;while interrupts are off
setb ea
 .
 .
ret ; gets address of next instruction to execute
 ; in the routine that had been interrupted by
 ; timer 0.
 ; address taken off the 8051 stack and the stack
 ; pointer is updated







[LISTING EIGHT]

char *read_VT100_image(char row, char col, char *string,
 char number_to_read)
{
 short i;
 char *ptr;

 if(row <= VT100_MAX_ROWS && col <= VT100_MAX_COLS)
 {
 /* calculate number of characters to read on row */
 if((number_to_read + col) > (VT100_MAX_COLS+1))
 number_to_read = (VT100_MAX_COLS+1) - col;

 /* get string start address from global screen array */
 ptr = &screen[row - 1][col - 1];

 /* transfer string to return string array */
 for(i = 0; i < number_to_read; i++)
 string[i] = *ptr++;
 string[i] = 0; /* terminate string */
 }
 return(string);
}







[LISTING NINE]


void get_cursor_position(TERMINALP t, char *row, char *col)
{
/* if cursor origin is relative, then calc row position
 based on scrolling start position, else use global row
 position */
 if(t->origin == SET)*row = cursor.row - begin.scroll + 1;
 else *row = cursor.row;

 *col = cursor.col;

}






[LISTING TEN]


static char screen[24][80]; /* VT100 virtual screen image */

/* put a space character in each virtual image position */

static void clr_display()
{
 short i,j;

 for(j = 0;j < VT100_MAX_ROWS; j++)
 for(i = 0;i < VT100_MAX_COLS; i++)
 screen[j][i]=' ';
}






[LISTING ELEVEN]

static char screen[24][80]; /* VT100 virtual screen image */
char *current, *next, *last;

/* put a space character in each virtual image position */

static void clr_display()
{
current = &screen[0][0];
last = &screen[24 - 1][80 - 1];

do{
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 }while(current < last);
}







[LISTING TWELVE]

static char screen[24][80]; /* VT100 virtual screen image */
static short begin_scroll_row, end_scroll_row;

/* Scroll screen up one row. Last row is blank. */

static void scroll_up()
{
 short i,j;

 for(j = begin_scroll_row-1; j < end_scroll_row-1; j++)
 for(i = 0;i < VT100_MAX_COLS; i++)
 screen[j][i]=screen[j+1][i];

 for(i = 0;i < VT100_MAX_COLS; screen[j][i] = ' ', i++);
}






[LISTING THIRTEEN]

static char screen[24][80]; /* VT100 virtual screen image */
static char *current, *next, *last;
static short begin_scroll_row, end_scroll_row;

/* Scroll screen up one row. Last row is blank. */

static void scroll_up()
{
 current = &screen[begin_scroll_row - 1][0];
 next = current + 80;
 last = &screen[end_scroll_row - 1][0];

do{
 *current++ = *next++;
 *current++ = *next++;
 *current++ = *next++;
 *current++ = *next++;
 *current++ = *next++;
 *current++ = *next++;
 *current++ = *next++;
 *current++ = *next++;
 *current++ = *next++;
 *current++ = *next++;
 }while(current < last);

last += 80;

do{
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 *current++ = ' ';
 }while(current < last);
}






[LISTING FOURTEEN]

 #include "speedc.h"
 #include "vt100.h"
 #include "string.h"

 void exception_handler(char code);

 void main()
 {
 TERMINAL term; /* define VT100 setup parameters */
 TERMINALP t = &term; /* pointer used below and passed to the driver */
 char string[21];
 char display_string[23];

 cursor.row = cursor.col = 1; /* globals defined in vt100.h */
 t->origin = t->addlf = t->keymode = 0;
 t->kblock = t->insert = t->autowrap = 0;
 t->keypad = NUMERIC;

 /* prototype in vt100.h */
 init_emulation_mode(process_VT100_input, (char *)t);

 /* initialize the first 3 lines on the LINKS terminal display by
 using the special LINKS x function. This function allows the
 user to define distinct display regions on the terminal. The
 nomenclature is as follows:
 D means define a display region which is touch sensitive.
 1, 2, or 3 means that the touch sensitive area defined will
 place that particular ASCII character in the
 key buffer when that touch sensitive area is pressed on
 the LINKS screen. This is also referred to as its name.
 A18; means the touch sensitive display area in row A (first row
 on the LINKS), columns 1-8 (touch display areas have 8
 distinct areas per row). The semicolon means the end of
 the touch display definition, and what follows is the
 message to be displayed in that display area.
 B18; means row B (second row), columns 1-8.
 C18; means row C (third row), columns 1-8.
 */

 x("D1 A18; ");
 x("D2 B18; ");
 x("D3 C18; ");

 open("98N1",1); /* open LINKS RS232 port with special LINKS
 operating system function */

 /* continuously update LINKS terminal display with the current VT100
 virtual image (found in rows 1-3, columns 1-20) using read_VT100_scr
 whose prototype is in vt100.h.
 */
 do{
 read_VT100_scr(1, 1, string, 20); /* read row 1, cols 1-20 */
 strcpy(display_string, "P1"); /* prefix string with special */
 strcat(display_string, string); /* links P cmd (print to touch */
 x(display_string); /* display area named '1') */

 read_VT100_scr(2, 1, string, 20); /* read row 2, cols 1-20 */
 strcpy(display_string, "P2"); /* prefix string with special */
 strcat(display_string, string); /* links P cmd (print to touch */
 x(display_string); /* display area named '2') */

 read_VT100_scr(3, 1, string, 20); /* read row 3, cols 1-20 */
 strcpy(display_string, "P3"); /* prefix string with special */
 strcat(display_string, string); /* links P cmd (print to touch */
 x(display_string); /* display area named '3') */

 }while(-1);
}

/* The exception handler is needed for all LINKS applications to handle
 special LINKS control functions.
*/

void exception_handler(char code)
{
 if (code == 4)turn_off(); /* detects ON/OFF button pressed,
 and turns LINKS off via LINKS turn_off */
}































February, 1991
LOOKING INTO THE FUTURE OF MICROPROCESSORS


A report from the Microprocessor Forum




Ray Duncan


Ray is a DDJ contributing editor and can be contacted at LMI, P.O. Box 10430,
Marina Del Rey, CA 90295.


Over the past three years, the most important indicator of the microprocessor
industry's future directions -- particularly in the areas of general-purpose
microprocessors, embedded microprocessors, digital signal processors (DSPs),
and multichip modules -- has become the Microprocessor Forum, sponsored by the
Microprocessor Report. The most recent conference was no exception, attracting
several hundred hard-core engineers and systems designers, software
developers, journalists, and -- in a rather ominous development -- lawyers.
In his keynote speech entitled "The Impact of Free Silicon on Microprocessor
Design and Application," Andrew Rappaport of the Technology Research Group
(Boston, Mass.) presented a simple premise: Microprocessor implementation and
fabrication technology rides on the coattails of Dynamic Random Access Memory
(DRAM) technology. But DRAM technology is advancing so quickly that it will
soon far outstrip our ability to make effective use of the number of
transistors that we can cram onto a single CPU chip.
To illustrate this point, consider that the 80386 was basically implemented
with 256-Kbit DRAM technology; it has on the order of 300,000 transistors. By
the time the 80486 came along, 1-Mbit DRAMs were in full production. The 80486
designers, though, were more interested in maintaining full compatibility with
the 80386 than anything else, and really had very little need for the extra
700,000 transistors that 1-Mbit DRAM technology made available. They
ultimately managed to use up their one million transistor budget by putting
the 80387 onto the same chip with the 80486, along with an 8-Kbyte cache. And
that was in 1988!
4-Mbit DRAM production is coming online now, 16-Mbit DRAMs should be along by
1993 or 1994, and the year 2000 will, unless fabrication technology runs into
some unexpected potholes, bring DRAMs with 128 Mbits or more on a single chip.
Over the same period, the manufacturing cost per DRAM bit should decline from
the present 0.0005 cents (500 microcents) to around 0.00001 cents (10
microcents), and by the year 2010 will decrease further to around 0.0000008
cents (0.8 microcents). At the same time, the estimated manufacturing cost per
die of a one million-transistor microprocessor (such as the 80486) should
decline from its 1989 value of $8.33 to around $0.29 (this does not include
the cost of testing, packaging, and so on).
The implications? In the year 2000, it will be perfectly feasible and
economical to put the equivalent of an 80386 in a tiny corner of every single
DRAM chip! The capabilities of the high-end microprocessors we are familiar
with today will become essentially "free" -- and we will be able to put such
processors literally everywhere. As Rappaport says, what can be built will be
governed by what can be designed and by our ability to find ways to utilize
all this processing power.
A counterpoint to Rappaport's prognostications was provided by Paul Saffo,
InfoWorld columnist and chief guru of the Institute for the Future, in a talk
entitled "The Zen of Change." The relationship of Saffo's arguments to Zen was
lost on me (if indeed there was any; Zen seems to have become a popular
marketing buzzword like "object oriented" or "superscalar"), but he did make
some interesting points. Saffo's theme was the natural history of the adoption
of any new technology, and he drew some interesting parallels between the
years following the invention of printing by Gutenberg and the current state
of the computer industry.
According to Saffo, the acceptance of a completely new technology takes place
in three stages, each requiring about ten years. During the first decade, the
implications of the new technology are poorly understood, and it is frequently
(mis)used to solve old problems to which it is only vaguely related. Saffo
calls this the "paving the cowpaths" period. In the second decade,
entrepreneurs experiment with applications of the new technology, taking it in
wildly different directions (many of which inevitably turn out to be dead
ends). In the third decade, natural selection winnows out the suboptimal
solutions, standards for use of the new technology emerge, mass production
lowers costs drastically, and the technology is integrated seamlessly into the
background noise of daily life.
When Gutenberg printed his first books, nobody foresaw the newspapers,
magazines, paperback racks, and bookstores of today; instead, monks and
scribes decorated the printed books with colorful pictures and designs so
they'd look more like the traditional hand-copied volumes. In the following
years, the technology of printing spread rapidly throughout Europe, but there
was as yet no standard design for books; for example, any given book might or
might not have running heads, page numbers, a title page, or a table of
contents. It took several decades before the structure of a book as we know it
today emerged and became recognized as the "right way" to publish a
manuscript.
A hundred years from now, the state of computer technology in our time may be
perceived as the era of wild and woolly experimentation, or (who knows?) even
as the period of "paving the cow paths." We can't possibly visualize the uses
to which Rappaport's "free silicon" will be put in a decade or two, any more
than Gutenberg could imagine the satellite communications, high-speed printing
presses, and distribution network that puts identical copies of USA Today (for
better or worse) on every street corner in America each morning. It's amusing
to speculate whether the rapid spread of GUIs over the last few years really
represents the best fruit of natural selection and the arrival of
standardization. Personally, I suspect that GUIs will fade away without a
trace when the "real" user interfaces arrive -- perhaps real-time voice
recognition and synthesis a la Star Trek, or direct mind-to-computer linkages
in Cyberspace a la William Gibson.


New Chips Now


The most important new chip announcements at these sessions were probably the
Motorola 88110 (the next generation of Motorola's 88000 RISC processor), the
AMD 29050 (the next generation of its 29000 RISC processor), and the Inmos H1
transputer (the next generation of its current T425 and T805 transputers). The
least important and most useless announcement was undoubtedly Motorola's
unveiling of the 68EC030 microprocessor. The 68EC030, which was billed as a
low cost implementation of the 68030 especially designed for embedded systems,
turned out to be merely an ordinary 68030 with an untested Memory Management
Unit (MMU) in a plastic package.
For users of IBM-compatible PCs, as opposed to designers, the most interesting
news was the first public description of AMD's AM286ZX/LX microprocessors and
Intel's new two-chip set for notebook computers. The AM286ZX/LX products are
essentially PC AT motherboards on a single chip, including a CMOS
80286-compatible CPU, clock and bus controller, programmable counter/timer,
DMA controller, DRAM controller, EMS address mapper, interrupt controller,
real-time clock, and CMOS RAM. The manufacturer of a laptop or notebook
computer needs only to add some RAM, EPROMs for the ROM BIOS, keyboard
controller, a mass storage device, and a display to have a complete
functioning system.
The new Intel chip set, called the "SL SuperSet," is even more ambitious than
AMD's. The two components in the set are the 386SL CPU chip, which contains
approximately 855,000 transistors, and the 82360SL companion I/O chip,
composed of approximately 226,000 transistors -- together replacing
approximately eight VLSI chips or over 100 conventional integrated circuits.
The 386SL CPU is based on the 80386, but is a fully static implementation with
considerable added logic (and a whole new execution "mode") for power
management. Intel validated the SL SuperSet design before committing it to
silicon by simulating the bootup of a DOS system from Ctrl-Alt-Del all the way
to the C> prompt. This simulation took several days to execute on an IBM 3090
mainframe, but had an excellent payoff: The first chips off the assembly line
were fully functional, and it was rumored at the conference that Compaq and
Toshiba are already building prototype 80386 notebook computers using SL
SuperSet engineering samples.


RISC vs. CISC


At the 1989 Microprocessor Forum, the RISC vs. CISC debates were bitter and,
with nearly a dozen different RISC or RISC-wanna-be designs contending for
attention, the RISC marketplace proper was still tremendously confusing. At
this year's Forum, the RISC proponents were noticeably less strident, perhaps
partly because the RISC processors -- with the addition of floating point
units, support for superscalar operation, and other complexities -- are
diverging further and further from their original, Spartan ideals. The RISC
marketplace also seems to be shaking out somewhat; SPARC and MIPS appear to be
the best choices if you are designing a workstation, while the AMD 29050 and
the Intel 80960 look like the main contenders for the embedded systems market.
The Intel i860 and Motorola 88000 still have a few fans, but their importance
is fading rapidly, and the IBM RISC processor is ignored by everyone except
captive IBM accounts.


And Into the Future ...


The last session of the conference, called "Architectural Issues for the
1990s," provided some interesting glimpses into the future. Mike Johnson of
AMD presented a tutorial on Superscalar and Superpipelined microprocessors;
both these buzzwords represent advanced techniques to allow the execution of
more than one instruction per machine cycle. Gregory Papadopoulos of MIT
described the current state of the art in data-flow processors, a peculiar
species of CPU which may yet have its day when the technology of conventional
processors has been pushed to its limit. And Monica Lam of Stanford gave a
talk on compiler-directed parallelism, showing how CPU designers and compiler
designers can work together to achieve performance that neither can attain
separately.
The "Architectural Issues" session ended with a panel discussion where David
Patterson of UC Berkeley, John Mashey of MIPS, Patrick Gelsinger of Intel, and
other illustrious computer architects each presented their view of where
microprocessors are going in the 1990s and beyond. Predictably, there were
some fundamental differences of opinion between the RISC proponents (such as
Patterson, one of the designers of the RISC I, which evolved into the Sun
SPARC) and the CISC advocates (such as Gelsinger, one of the architects of the
80486), although the RISC team generously acknowledged the 80486 as a
technically excellent implementation of a bankrupt architecture. But everyone
on the panel agreed on one point: 32 bits of address space are not enough, and
the first true 64-bit microprocessors will be appearing within a very short
time.
















February, 1991
 OPTIMIZING INTEGER DIVISION BY A CONSTANT DIVISOR


Speeding up slow processors




Robert D. Grappel


Robert holds a Ph.D. in solid-state physics from Ohio University and is
currently a researcher at MIT Lincoln Laboratory. He can be reached at 28
Buckmaster Dr., Concord, MA 01742.


Division is not an easy operation for computers to perform. Those of us old
enough to remember 8-bit microprocessors such as the Motorola 6800 and the
Intel 8080 (or those of us still using them for embedded systems or other
similar applications) can recall writing division subroutines which were slow
and consumed scarce memory resources. Modern 16- and 32-bit processors have
divide instructions, but these instructions are still quite slow when compared
to the "primitive" arithmetic instructions such as add, subtract, and shift.
Table 1, for example, lists the number of clock cycles required by the Intel
80286 and Motorola 68020 processors for some of the basic arithmetic
operations.
Table 1: Timings for arithmetic instructions

 Instruction                  80286   68020
 ------------------------------------------
 Add, Subtract                2       0-3
 Shift n                      5+n     1-4
 Multiply (unsigned 16-bit)   21      21-28
 Multiply (unsigned 32-bit)   ---     41-44
 Divide (unsigned 16-bit)     22      42-47

 Notes:
 1. All times are in processor clock cycles.
 2. Motorola times are ranges covering the best, cache, and worst cases.
 3. Intel add, subtract, and shift are for 16-bit values; Motorola add,
    subtract, and shift are for 32-bit values.
 4. All instructions assume operands in registers.
 5. Shifts assume a constant shift count.

Note that the division instruction is many times slower than addition,
subtraction, or shifting. This suggests that it might be useful to find a
method to convert a division operation into a short sequence of adds,
subtracts, and shifts. This instruction sequence might prove faster than the
computer's divide instruction. For those computers which lack a divide
instruction, the ability to express a division as a sequence of instructions
which do exist in the instruction set can provide a substantial speed
improvement over a subroutine call. The sequence might even prove to save
memory, given the overhead of parameter passing and the subroutine code
itself.
This article describes a method of decomposing a division by a constant
divisor into just such a simple sequence of additions, subtractions, and
shifts. For simplicity, we will assume that the numerator is an unsigned value
and that the divisor is an unsigned constant. (Extending the algorithm to
handle signed values is straightforward -- as mathematicians say "the proof is
left for the reader.") The numerator and denominator (divisor) are assumed to
be 16-bit values here, although the algorithm can be extended to handle longer
values. We will assume that the computer has the capability to do
double-precision (32-bit) adds, subtracts, and shifts. The instruction
sequence assumes the existence of two registers in which the calculations will
be done.


How Does a Mathematician Boil Water?


There is an old joke about mathematicians that describes one technique they
frequently use to solve problems. The problem in this case is "how to boil
water."
Problem 1: The mathematician finds himself in a kitchen. There is a kettle of
water on the counter next to the sink.
Solution: He carries the kettle to the stove and heats it to boiling. (QED)
Problem 2: The kettle of water is already on the stove.
Solution: The mathematician takes the kettle and places it on the counter next
to the sink. Now, proceed as in the solution to the first problem. (QED)
The technique of converting the existing problem into one which has already
been solved is a basic tool in mathematics. In this article, the technique
will be used to convert the problem of solving the quotient function
Q(x) = x/y (where "/" indicates integer division, x and y are positive, and y
is a nonzero constant) into the new form Q'(x) = (ax+b)/z (where a, b, and z are
derived from y). We have replaced a division by a multiplication, an addition,
and a different division. At first, it appears that we have made the problem
worse. However, note that if we could make z a power-of-two, then the new
division could be done with a shift instruction. The addition is no problem,
so we have traded a division by a constant divisor for a multiplication by a
constant multiplier. This is a problem we already know how to solve! In the
March 1987 issue of DDJ (#125, pages 34-37), I described a technique for
unrolling a multiplication by a constant multiplier into a "star-chain"
sequence of adds, subtracts, and shifts. Now, if we can generate appropriate
values for a, b, and z, we have solved the problem of generating a sequence of
adds, subtracts, and shifts for division.
The math for calculating the constants a, b and z is not very complicated.
Because z must be a power-of-two, we will choose 65,536 (2^16, one more than
the largest 16-bit value) for our example. We define a by the formula a = z/y which can be viewed
as a sort of scaled reciprocal of the constant divisor y. We define the
remainder of the integer division r as r = z%y which can be viewed as the
"fractional part" of the scaled reciprocal. Then we can define b as b = a+r-1
and the transformation is complete. We have converted the problem Q(x) into
the equivalent problem Q'(x) which we already know how to express in adds,
subtracts, and shifts. We can precalculate the required constants and generate
the complete instruction sequence at "compile time." (Note that the adds,
subtracts, and shifts in the algorithm must be done in double-precision
[32-bit] registers to avoid overflows in the calculation steps.) The
quotient ends up in the upper 16 bits of the register (before the
16-bit shift-division by z), and the low-order 16 bits of the register contain
an approximation to the remainder of the division.
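As a rough illustration of the arithmetic just described, the following C sketch derives a and b from a divisor y and evaluates Q'(x) directly, using an ordinary multiply in place of the star-chain. The names qconst, q_constants(), and q_prime() are invented for this sketch and do not appear in Listing One; unsigned long stands in for the 32-bit register.

```c
#include <assert.h>

/* Constants for Q'(x) = (a*x + b) / z with z = 65536.
   Type and function names here are illustrative only. */
struct qconst {
    unsigned long a;   /* scaled reciprocal, z / y        */
    unsigned long b;   /* addend, a + r - 1 (r = z % y)   */
};

/* Derive a and b for a 16-bit divisor y, following the formulas in the text. */
struct qconst q_constants(unsigned long y)
{
    const unsigned long z = 65536UL;
    struct qconst c;
    assert(y >= 1 && y <= 65535UL);
    c.a = z / y;
    c.b = c.a + (z % y) - 1;
    return c;
}

/* Evaluate Q'(x) for a 16-bit numerator x.
   The right shift by 16 is the division by z. */
unsigned long q_prime(unsigned long x, struct qconst c)
{
    assert(x <= 65535UL);
    return (c.a * x + c.b) >> 16;
}
```

For y = 51 this gives a = 1285 and b = 1285, and q_prime(1000, q_constants(51)) evaluates to 19, which agrees with 1000/51 in integer arithmetic.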


Close Enough for Government Work


There is only one problem with the Q'(x) algorithm for division by a constant
divisor -- it doesn't always get the right answer! Sometimes the accumulated
round-off error in the calculation causes the quotient to be off by one when
the numerator gets sufficiently large. This may be good enough for some
applications which can limit the numerator range, or where an approximate
quotient is acceptable.
There is a way to improve the chances of getting the right answer from the
Q'(x) algorithm. If the divisor is even, keep shifting the numerator and
denominator right (equivalent to divisions by two) until the denominator
becomes odd, then apply the Q'(x) algorithm. By reducing the denominator, it
becomes more likely that the round-off error will not cause problems for
16-bit (less than z) numerators. This improvement is employed in the results
and code shown later in this article.
Table 2 lists the first 300 divisors for which the Q'(x) algorithm is accurate
for all 16-bit numerators (after applying the shift-right operation described
in the preceding paragraph). The table was generated by brute-force
calculation. Q'(x) can be off by one for divisors not in the table. A quick
check program can be written to determine if Q'(x) is accurate for a given
divisor (not in the table) over the desired numerator range. For example, if
the numerator range is limited to a maximum of 32,767 (15 bits, the maximum
signed 16-bit value), the additional divisors in Table 3 also yield accurate
results when using the Q'(x) algorithm.
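A quick check program of the kind mentioned above might look like the following sketch. It is not the program used to generate the tables; the function name q_is_accurate() and its interface are assumptions made for illustration. It applies the even-divisor pre-shift, then compares Q'(x) against true integer division for every numerator up to a chosen limit.

```c
#include <assert.h>

/* Return 1 if Q'(x), with the even-divisor pre-shift, matches x / y
   for every numerator 0..max_x; return 0 at the first mismatch.
   The name and interface are illustrative only. */
int q_is_accurate(unsigned long y, unsigned long max_x)
{
    const unsigned long z = 65536UL;
    unsigned long a, b, x;
    unsigned long odd = y;     /* divisor with factors of two removed */
    unsigned shift = 0;        /* number of factors of two removed    */

    assert(y >= 1 && y <= 65535UL && max_x <= 65535UL);

    while ((odd & 1UL) == 0) {
        odd >>= 1;
        shift++;
    }
    if (odd == 1)
        return 1;              /* power of two: a plain shift is exact */

    a = z / odd;
    b = a + (z % odd) - 1;
    for (x = 0; x <= max_x; x++) {
        if ((((x >> shift) * a + b) >> 16) != x / y)
            return 0;
    }
    return 1;
}
```

Under these assumptions, q_is_accurate(102, 65535) returns 1, while q_is_accurate(7, 65535) returns 0 (the first failure is at x = 32774, where Q'(x) gives 4681 instead of 4682) and q_is_accurate(7, 32767) returns 1.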
Table 2: First 300 divisors for which Q'(x) is accurate


 2 3 4 5 6 8 10 12 14 15
 16 17 20 24 28 30 32 34 40 48
 51 52 56 60 62 64 68 72 80 85
 96 102 104 112 120 124 128 136 144 152
 160 170 172 176 192 204 208 216 224 240
 248 255 256 257 272 284 288 302 304 320
 336 340 344 352 368 384 400 408 416 432
 434 448 480 496 508 510 512 514 516 544
 560 568 576 592 604 608 624 640 648 672
 680 688 704 720 736 768 771 800 816 832
 864 868 896 928 960 992 1008 1016 1020 1024
 1028 1032 1040 1056 1072 1088 1120 1136 1152 1184
 1208 1216 1232 1248 1280 1285 1296 1312 1344 1360
 1376 1408 1440 1456 1472 1504 1524 1536 1542 1568
 1600 1632 1664 1680 1696 1728 1736 1760 1792 1856
 1872 1920 1952 1984 2016 2032 2040 2048 2056 2064
 2080 2112 2114 2144 2176 2240 2272 2304 2368 2416
 2432 2464 2496 2560 2570 2576 2592 2608 2624 2688
 2720 2752 2784 2816 2848 2880 2896 2912 2944 3008
 3048 3072 3084 3120 3136 3200 3216 3264 3296 3328
 3360 3392 3456 3472 3488 3520 3584 3648 3692 3712
 3744 3776 3840 3855 3904 3968 4032 4048 4064 4080
 4096 4112 4128 4144 4160 4224 4228 4288 4352 4368
 4369 4416 4480 4544 4608 4672 4736 4800 4832 4864
 4928 4992 5040 5056 5088 5120 5140 5152 5184 5216
 5248 5280 5312 5376 5440 5504 5568 5632 5696 5728
 5760 5792 5824 5856 5888 5952 6016 6096 6112 6144
 6168 6208 6240 6272 6400 6432 6472 6512 6528 6592
 6656 6720 6784 6808 6848 6912 6944 6976 7040 7104
 7168 7280 7296 7384 7424 7488 7552 7680 7710 7744

 Note: all divisors > 32767 are accurate

Table 3: Accurate Q'(x) divisors (numerator<32768)

 7 26 31 36 76 86 88 108 142 151
 168 184 200 217 254 258 280 296 312 324
 360 464 504 520 528 536 616 656 728 752
 762 784 848 880 936 976 1057 1288 1304 1392
 1424 1448 1560 1608 1648 1744 1824 1846 1888 2024
 2072 2184 2336 2520 2528 2544 2656 2864 2928 2976
 3056 3104 3236 3256 3404 3424 3553 3640 3872 3912
 4000 4016 4176 4192 4320 4384 4448 4576 4680 4681
 4896 4944 5024 5344 5472 5488 5576 5600 5664 5920
 5984 6080 6336 6368 6392 6464 6552 6898 7008 7084
 7200 7232 7360 7456 7616 7648

Note that a factoring approach can be used to deal with some divisors that are
not in the table. For example, the divisor 25 is not in the table, but 5 is.
Hence, applying Q'(x) for 5 twice will result in an accurate division by 25.
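The factoring idea can be sketched in C as follows. The constants come from the formulas given earlier (for y = 5, a = 65536/5 = 13107, r = 1, and b = a + r - 1 = 13107); the function names div5() and div25() are invented for this illustration, and a plain multiply stands in for the star-chain sequence.

```c
#include <assert.h>

/* Q'(x) for y = 5: a = 13107, b = 13107, z = 65536. */
static unsigned long div5(unsigned long x)
{
    assert(x <= 65535UL);
    return (13107UL * x + 13107UL) >> 16;
}

/* Divide a 16-bit value by 25 by applying the divide-by-5 step twice. */
unsigned long div25(unsigned long x)
{
    return div5(div5(x));
}
```

A single Q'(x) step for 25 (a = 2621, b = 2631) first goes wrong at x = 6000, where it yields 239; the two-step version returns 240 for the same input.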


Just the Facts, Ma'am


Listing One contains a C program which generates an instruction sequence for
the Q'(x) division algorithm. The code is very similar to the multiplication
algorithm code shown in my previous Dr. Dobb's Journal article. The main
program of my March 1987 article (page 35) has been made into the subroutine
binmul(). Note that the code to load the machine working-register Rw has been
moved to the new main program. This is the only change that has been made to
the star-chain multiplication code. The program assumes that the numerator
will be in machine register R1 and it returns the quotient in machine register
Rw. Note that unlike the multiplication algorithm instruction sequence, the
instruction sequence generated by the Q'(x) algorithm will alter the contents
of register R1 if the divisor is not a power-of-two.
The main program of Listing One uses the function trim_trailing( ) to count
the number of low-order zero bits in the divisor (equivalent to the number of
factors-of-two in the divisor which will be shifted out). Code is generated
for the initial right-shift scaling. If the divisor was a power-of-two, this
right-shift does the entire job; otherwise the Q'(x) algorithm is implemented.
Note that the final right-shift by 16 bits (the division by z) might be done
with a 68000-family SWAP instruction or by returning only the upper half of Rw
(if the computer has only 16-bit registers, such as the Intel 80286).
As an example, consider a division by 102 (one of the accurate values from
Table 2). The Q'(x) instruction sequence generated by the program in Listing
One is shown in Example 1 (the comments were added for additional clarity).
Example 1: Division by 102


 R1 >>= 1        scale 102 down to 51 (odd)
 Rw = R1         mult. by a = (65536/51) = 1285
 Rw <<= 2
 Rw += R1
 Rw <<= 6
 Rw += R1
 Rw <<= 2
 Rw += R1
 Rw += 1285      add b = (a + r - 1) = 1285
 Rw >>= 16       divide by z = 65536

For a 68020, the worst-case timing for this sequence would be about 35 clocks
-- compared to 47 clocks for a divide instruction. If the code was in cache,
the timing for the sequence would improve to about 28 clocks against 46 clocks
for the divide instruction. In the "best" timing case for the 68020, the
sequence could require as few as 10 clocks against 42 clocks for the divide
instruction. For computers which must implement division as a subroutine, the
improvement entailed in using the Q'(x) algorithm can be greater still.
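To try the Example 1 sequence without writing assembly, it can be transcribed line for line into C. In this sketch the variables r1 and rw stand in for the machine registers R1 and Rw, and the function name div102() is made up for the illustration; unsigned long is assumed to be at least 32 bits wide, as the C standard guarantees.

```c
#include <assert.h>

/* Divide a 16-bit value by 102 using the shift/add sequence of Example 1. */
unsigned long div102(unsigned long r1)
{
    unsigned long rw;

    assert(r1 <= 65535UL);
    r1 >>= 1;       /* scale 102 down to 51 (odd)        */
    rw = r1;        /* start multiplying by a = 1285     */
    rw <<= 2;
    rw += r1;       /* rw = 5 * r1                       */
    rw <<= 6;
    rw += r1;       /* rw = 321 * r1                     */
    rw <<= 2;
    rw += r1;       /* rw = 1285 * r1                    */
    rw += 1285;     /* add b = a + r - 1 = 1285          */
    rw >>= 16;      /* divide by z = 65536               */
    return rw;
}
```

For instance, div102(10200) returns 100 and div102(65535) returns 642. Note that, as in the generated sequence, the copy of the numerator in r1 is modified by the initial scaling shift.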


_OPTIMIZING INTEGER DIVISION BY A CONSTANT DIVISOR_
by Robert Grappel



[LISTING ONE]

#include <stdio.h>

/* Program to generate "star-chain-sequence" for division of an unsigned
16-bit integer numerator by an unsigned 16-bit integer constant denominator.
It assumes the existence of two 32-bit integer "registers", add, subtract, and
shift instructions. It uses the "star-chain" multiplication routine from DDJ
#125, March 1987 page 35 */

static unsigned int mult; /* global variable for trim_trailing() & binmul() */

/* Support subroutine to trim trailing 0s or 1s from global variable "mult".
   "mult" must be nonzero when trimming zeros, or the loop never ends. */
int trim_trailing(int one_zero)
{   /* if one_zero == 0, trim trailing zeros in "mult", return "count"
                   == 1,               ones                          */
    int c; /* bit counter */
    for (c = 0; ((mult & 1) == one_zero); c++, mult >>= 1) ;
    return c;
}

/* Slightly modified version of multiplication routine */
void binmul(long m)
{
    int last_shift,  /* final shift count */
        last_cnt,    /* count of low-order zeros */
        stkptr,      /* pointer to "stack[]" */
        cnt,         /* bit counter */
        ts,          /* top-of-stack element */
        flag,        /* flag for special-case */
        stack[16];   /* stack for shift-add/subs */
    mult = (unsigned int) m;
    stkptr = last_cnt = 0;          /* init. stack ptr. and count */
    last_shift = trim_trailing(0);  /* trim trailing 0s */
    while (1)
    {   /* loop to decompose "mult", building stack */
        cnt = trim_trailing(1);     /* count low-order 1s */
        if (cnt > 1)
        {   /* more than 1 bit, shift-subtract */
            flag = 0;
            if (last_cnt == 1)
                /* shift "k",sub,shift 1,add --> shift "k+1", sub */
                stack[stkptr-1] = -(cnt+1);  /* overwrite */
            else
                stack[stkptr++] = -cnt;
        }
        else
            flag = 1;               /* need another shift-add */
        if (mult == 0) break;       /* decomp. "mult", now output */
        last_cnt = trim_trailing(0) + flag;  /* low-order 0s */
        stack[stkptr++] = last_cnt; /* shift-add */
    }
    while (stkptr > 0)
    {   /* output code from stack */
        ts = stack[--stkptr];       /* get top stack element */
        if (ts < 0)
        {   /* generate shift-subtract instructions */
            printf("\nRw <<= %d",-ts);
            printf("\nRw -= R1");
        }
        else
        {   /* generate shift-add instructions */
            printf("\nRw <<= %d",ts);
            printf("\nRw += R1");
        }
    }
    if (last_shift != 0) printf("\nRw <<= %d",last_shift);
}

int main(void)
{   /* generate pseudo-instructions for star-chain division */
    long a,b,r,      /* computed multiplier, addend, and remainder */
         denom,      /* intended divisor */
         denom2;     /* divisor scaled by powers-of-2 */
    int  i2;         /* number of bits to scale divisor */
    long z = 65536L; /* 2^16; too large for a 16-bit int */
    printf("\nEnter positive integer denominator: ");
    if (scanf("%ld",&denom) != 1 || denom < 0 || denom > 65535L)
    {   /* reject input that is not an unsigned 16-bit value */
        printf("\nDenominator must be an integer from 1 to 65535\n");
        return 1;
    }
    if (denom != 0)
    {   /* scale denominator by powers-of-2 */
        mult = (unsigned int) denom;
        i2 = trim_trailing(0);      /* how many powers-of-2? */
        denom2 = mult;
        if (denom2 == 1)
        {   /* divisor was power-of-2, simply scale it */
            printf("\nRw = R1");
            if (i2 > 0) printf("\nRw >>= %d",i2);
        }
        else
        {   /* divisor not power-of-2, scale and more */
            if (i2 > 0)
                printf("\nR1 >>= %d",i2);  /* handle scaling */
            printf("\nRw = R1");    /* load work register */
            a = z / denom2;         /* scaled reciprocal */
            r = z % denom2;         /* remainder of recip. */
            b = a + r - 1;
            binmul(a);
            printf("\nRw += %ld",b);
            printf("\nRw >>= 16");
        }
    }
    else
        printf("\nCannot divide by zero\n"); /* special case */
    return 0;
}













































February, 1991
SCREEN CAPTURING FOR WINDOWS 3.0


A useful utility for grabbing screen images




Jim Conger


Jim is the author of C Programming For MIDI, MIDI Sequencing In C (M&T Books),
and articles for Electronic Musician magazine. He can be reached via
CompuServe (73220,324).


Snap3 is a program that allows you to grab any part of a Windows 3.0
application screen and paste it to the clipboard. Once in the clipboard, you
can use the Write or Word for Windows Paste command to paste the image
directly into your documents. Not only is the program a handy tool, but
understanding how it works is a good introduction to the sometimes
mysterious world of Windows 3 programming.
I use Snap3 to grab representative images from the running program I'm
documenting, adding the image to the manual as I write. The utility can also
be used to create help screens by letting you cut and paste images from the
program into your help source file, created with Windows. (For more
information on the Microsoft Help compiler, refer to "Building an Efficient
Help System" by Leo Notenboom and Michael Vose [DDJ, June 1990]).


Snap3


The idea for Snap3 came from a similar public domain program called "Snap"
available on the CompuServe MSWIN forum. Because Snap is a Windows 2.x
program, it generates those nasty warning messages every time you run it in
Windows 3.0 Standard or 386 Enhanced Mode. I decided to write a 3.0 version,
and cut the program down to the bare essentials.
When you run the program, a window (see Figure 1) with two menu items will
appear, the first item being "Start Capture." When this item is clicked, the
mouse cursor turns to a crosshair which you can move to the upper left of the
screen area you want to capture. By pressing the left mouse button and
dragging the mouse down, you will "draw" a rectangle on the screen. You
stretch the rectangle to encompass the area you want to capture and then
release the button. The graphics image captured shows up in Snap3's window.
The other key function is "Clear Buffer," which clears the Snap3 window and
empties the clipboard. Snap3 also has an About function and a simple Help
screen.


Snap3 Source Code


From a programmer's point of view, all Snap3 is doing is grabbing a bitmap off
of the screen and pasting it to the Windows clipboard that is common to all
Windows applications. The captured bitmap is then available for a Paste
operation from Word For Windows, Paintbrush, or other similar applications.
You can also see the image by clicking open Windows' clipboard viewer.
The Snap3 program is broken up into four source files plus an NMAKE file. The NMAKE file (Listing One)
calls the C compiler (cl), the resource compiler (rc), and the linker (link)
to build the finished program. All Windows programs have a definition file
that provides basic information as to the program's name and organization.
SNAP3.DEF (Listing Two) is an example of the simplest possible DEF file. The
resource file SNAP3.RC (Listing Three) gives the name of the program's icon
and defines the program menu. Following normal Windows programming practice,
the menu items are numbered based on #define statements in the header file
SNAP3.H (Listing Four ). The header file also includes the function
prototypes. The actual program code is in SNAP3.C (Listing Five). There are
only three functions. The WinMain( ) function loads the program icon that was
created using the SDK Paint application that comes with the Windows 3.0 SDK.
The SNAP3.C WndProc( ) function contains all of the program logic. Capturing
starts when you select the Start Capture menu item, which generates a
WM_COMMAND message carrying the IDM_START menu ID; the handler then changes
the cursor to IDC_CROSS. Once capturing starts, WM_MOUSEMOVE messages cause
the rectangular region swept out by the mouse to be outlined.
The function OutlineBlock( ) at the end of the listing does the drawing. One
sneaky thing here is that the outline rectangle's lines are drawn with the
logical R2_NOT operator (function SetROP2( )). This causes drawing the lines
in the same place twice to erase the lines. Windows includes a number of
built-in functions for dealing with bitmap images. These work at a fairly high
level, freeing you from having to worry about how the pixel data is stored or
manipulated. SNAP3 uses several of these functions to do the capture of the
screen image.
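The reason drawing the outline twice erases it can be seen in a small portable simulation. The sketch below does not use the Windows API at all: it models the screen as a byte array and models R2_NOT as a bitwise inversion of each destination pixel. The buffer size and the names invert(), outline_not(), and outline_twice_restores() are assumptions made purely for illustration.

```c
#include <assert.h>
#include <string.h>

#define W 16   /* width of the simulated frame buffer  */
#define H 8    /* height of the simulated frame buffer */

/* Invert one pixel, mimicking what R2_NOT does to the destination. */
static void invert(unsigned char fb[H][W], int x, int y)
{
    assert(x >= 0 && x < W && y >= 0 && y < H);
    fb[y][x] = (unsigned char) ~fb[y][x];
}

/* Invert the outline of the rectangle with corners (x0,y0) and (x1,y1),
   touching each outline pixel once. Assumes x0 < x1 and y0 < y1. */
void outline_not(unsigned char fb[H][W], int x0, int y0, int x1, int y1)
{
    int x, y;
    for (x = x0; x < x1; x++) invert(fb, x, y0);   /* top edge    */
    for (y = y0; y < y1; y++) invert(fb, x1, y);   /* right edge  */
    for (x = x1; x > x0; x--) invert(fb, x, y1);   /* bottom edge */
    for (y = y1; y > y0; y--) invert(fb, x0, y);   /* left edge   */
}

/* Draw the same outline twice; return 1 if the image ends up unchanged. */
int outline_twice_restores(void)
{
    unsigned char fb[H][W], saved[H][W];
    int x, y;

    for (y = 0; y < H; y++)            /* fill with an arbitrary pattern */
        for (x = 0; x < W; x++)
            fb[y][x] = (unsigned char) (x * 7 + y * 13);
    memcpy(saved, fb, sizeof fb);

    outline_not(fb, 2, 1, 10, 6);      /* first pass: outline appears   */
    if (memcmp(saved, fb, sizeof fb) == 0)
        return 0;
    outline_not(fb, 2, 1, 10, 6);      /* second pass: outline vanishes */
    return memcmp(saved, fb, sizeof fb) == 0;
}
```

Because inversion is its own inverse, no copy of the underlying screen contents has to be saved while the rectangle is being stretched, which is why OutlineBlock( ) can erase the old outline simply by drawing it again.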
The transfer of the image to the clipboard happens when you release the mouse
button, generating a WM_LBUTTONUP message. A memory area for the bitmap is
created using CreateCompatibleBitmap( ). StretchBlt( ) copies from the screen
device to the memory device. Then SetClipboardData( ) alerts Windows that the
clipboard should now look to the bitmap memory area for its data.
If Snap3 is not minimized down to an icon, it will display the captured image
in its window. Any time a WM_PAINT message is received, it copies the
clipboard bitmap (if any) to the Snap3 window area. The Windows function
StretchBlt( ) does the copying of the bitmap to the screen image.


_SCREEN CAPTURING FOR WINDOWS 3.0_
by Jim Conger


[LISTING ONE]

ALL: snap3.exe

snap3.obj : snap3.c
 cl -AS -c -DLINT_ARGS -Gsw -Oat -W2 -Zped snap3.c

snap3.res: snap3.rc snap3.ico
 rc -r snap3.rc

snap3.exe : snap3.obj snap3.def snap3.res
 link /NOD snap3, , ,libw slibcew, snap3.def
 rc snap3.res







[LISTING TWO]

NAME SNAP3
DESCRIPTION 'snap3 program for windows bitmap capture to clipboard'
EXETYPE WINDOWS
STUB 'WINSTUB.EXE'
CODE PRELOAD MOVEABLE
DATA PRELOAD MOVEABLE MULTIPLE
HEAPSIZE 1024
STACKSIZE 4096
EXPORTS WndProc






[LISTING THREE]

/* snap3.rc */
#include "snap3.h"

snap3 ICON snap3.ico
snap3 MENU
BEGIN
 MENUITEM "&Start Capture", IDM_START
 MENUITEM "&Clear Buffer", IDM_CLEAR
 MENUITEM "&About", IDM_ABOUT
 MENUITEM "\a&Help", IDM_HELP
END






[LISTING FOUR]

/* snap3.h */

#define IDM_START 1 /* menu item id values */
#define IDM_CLEAR 2
#define IDM_ABOUT 3
#define IDM_HELP 4

/* function prototypes */
long FAR PASCAL WndProc (HWND, unsigned, WORD, LONG) ;
void OutlineBlock (HWND hWnd, POINT beg, POINT end) ;






[LISTING FIVE]

/* snap3.C -- Screen Capture to clipboard -- jim conger */


#include <windows.h>
#include <stdlib.h>
#include "snap3.h"

int PASCAL WinMain (HANDLE hInstance, HANDLE hPrevInstance, LPSTR lpszCmdLine,
 int nCmdShow)
{
 static char szAppName [] = "snap3" ;
 HWND hWnd ;
 MSG msg ;
 WNDCLASS wndclass ;

 if (!hPrevInstance)
 {
 wndclass.style = CS_HREDRAW | CS_VREDRAW ;
 wndclass.lpfnWndProc = WndProc ;
 wndclass.cbClsExtra = 0 ;
 wndclass.cbWndExtra = 0 ;
 wndclass.hInstance = hInstance ;
 wndclass.hIcon = LoadIcon (NULL, szAppName) ;
 wndclass.hCursor = LoadCursor (NULL, IDC_ARROW) ;
 wndclass.hbrBackground = GetStockObject (WHITE_BRUSH) ;
 wndclass.lpszMenuName = szAppName ;
 wndclass.lpszClassName = szAppName ;

 if (!RegisterClass (&wndclass))
 return FALSE ;
 }
 hWnd = CreateWindow (szAppName, "Snap3",
 WS_OVERLAPPEDWINDOW, CW_USEDEFAULT, CW_USEDEFAULT,
 GetSystemMetrics (SM_CXSCREEN) / 2,
 8 * GetSystemMetrics (SM_CYMENU),
 NULL, NULL, hInstance, NULL) ;
 ShowWindow (hWnd, nCmdShow) ;
 UpdateWindow (hWnd) ;

 while (GetMessage (&msg, NULL, 0, 0))
 {
 TranslateMessage (&msg) ;
 DispatchMessage (&msg) ;
 }
 return msg.wParam ;
}

long FAR PASCAL WndProc (HWND hWnd, unsigned iMessage, WORD wParam,
 LONG lParam)
{
 static BOOL bCapturing = FALSE, bBlocking = FALSE, bStarted = FALSE ;
 static POINT beg, end, oldend ;
 static short xSize, ySize ;
 static HANDLE hInstance ;
 HDC hDC, hMemDC ;
 BITMAP bm ;
 HBITMAP hBitmap ;
 HICON hIcon ;
 PAINTSTRUCT ps ;
 switch (iMessage)
 {
 case WM_CREATE: /* get program instance when window is created */

 hInstance = GetWindowWord (hWnd, GWW_HINSTANCE) ;
 break ;
 case WM_COMMAND: /* one of the menu items has been clicked */
 switch (wParam)
 {
 case IDM_START: /* the start capture item */
 bCapturing = TRUE ;
 bBlocking = bStarted = FALSE ;
 SetCapture (hWnd) ; /* grab mouse */
 SetCursor (LoadCursor (NULL, IDC_CROSS)) ;
 CloseWindow (hWnd) ; /* minimize window */
 break ;
 case IDM_CLEAR: /* clears screen and clipboard */
 OpenClipboard (hWnd) ;
 EmptyClipboard () ;
 CloseClipboard () ;
 InvalidateRect (hWnd, NULL, TRUE) ; /* forces paint */
 break ;
 case IDM_ABOUT: /* show about box */
 MessageBox (hWnd, "Snap3 - Windows screen capture to clipboard."
 "\nJim Conger 1990.",
 "Snap3 About", MB_OK) ;
 break ;
 case IDM_HELP:
 MessageBox (hWnd, "After you click the Start Capture menu "
 "item, move the mouse to the upper left "
 "of the area you want to copy to the "
 "clipboard. Hold down the left mouse "
 "button while you drag the mouse to the "
 "lower right of the area. Once you release "
 "the mouse button, the area is sent to the "
 "clipboard and shown in Snap3's window.",
 "Snap3 Help", MB_OK) ;
 break ;
 }
 case WM_LBUTTONDOWN: /* starting capturing screen */
 if (bCapturing)
 {
 if (bStarted)
 {
 bBlocking = TRUE ;
 oldend = beg = MAKEPOINT (lParam) ;
 OutlineBlock (hWnd, beg, oldend) ;
 SetCursor (LoadCursor (NULL, IDC_CROSS)) ;
 }
 else
 bStarted = TRUE ;
 }
 break ;
 case WM_MOUSEMOVE: /* show area as rectangle on screen */
 if (bBlocking)
 {
 end = MAKEPOINT (lParam) ;
 OutlineBlock (hWnd, beg, oldend) ; /* erase old outline */
 OutlineBlock (hWnd, beg, end) ; /* draw new one */
 oldend = end ;
 }
 break ;
 case WM_LBUTTONUP: /* capture and send to clipboard */

 if (bBlocking)
 {
 bBlocking = bCapturing = FALSE ;
 SetCursor (LoadCursor (NULL, IDC_ARROW)) ;
 ReleaseCapture () ; /* free mouse */

 end = MAKEPOINT (lParam) ;
 OutlineBlock (hWnd, beg, oldend) ; /* erase area outline */
 xSize = abs (beg.x - end.x) ;
 ySize = abs (beg.y - end.y) ;
 hDC = GetDC (hWnd) ;
 hMemDC = CreateCompatibleDC (hDC) ;
 hBitmap = CreateCompatibleBitmap (hDC, xSize, ySize) ;

 if (hBitmap)
 {
 SelectObject (hMemDC, hBitmap) ;
 StretchBlt (hMemDC, 0, 0, xSize, ySize,
 hDC, beg.x, beg.y, end.x - beg.x,
 end.y - beg.y, SRCCOPY) ;
 OpenClipboard (hWnd) ;
 EmptyClipboard () ;
 SetClipboardData (CF_BITMAP, hBitmap) ; /* copy to */
 CloseClipboard () ; /* clipboard */
 InvalidateRect (hWnd, NULL, TRUE) ; /* request paint*/
 }
 else
 MessageBeep (0) ;
 DeleteDC (hMemDC) ;
 ReleaseDC (hWnd, hDC) ;
 }
 ShowWindow (hWnd, SW_RESTORE) ; /* un-minimize window */
 break ;
 case WM_PAINT: /* display contents of clipboard if bitmap */
 hDC = BeginPaint (hWnd, &ps) ;
 if (IsIconic (hWnd)) /* if window is iconic, show icon */
 {
 hIcon = LoadIcon (hInstance, "snap3") ;
 if (hIcon != NULL)
 DrawIcon (hDC, 1, 1, hIcon) ;
 }
 else /* if not, show clipboard contents */
 {
 OpenClipboard (hWnd) ;
 if (hBitmap = GetClipboardData (CF_BITMAP)) /* if bitmap */
 {
 hMemDC = CreateCompatibleDC (hDC) ;
 SelectObject (hMemDC, hBitmap) ;
 GetObject (hBitmap, sizeof (BITMAP), (LPSTR) &bm) ;
 SetStretchBltMode (hDC, COLORONCOLOR) ;
 StretchBlt (hDC, 0, 0, xSize, ySize, hMemDC, 0, 0,
 bm.bmWidth, bm.bmHeight, SRCCOPY) ;
 DeleteDC (hMemDC) ;
 }
 CloseClipboard () ;
 }

 EndPaint (hWnd, &ps) ;
 break ;

 case WM_DESTROY:
 PostQuitMessage (0) ;
 break ;
 default:
 return DefWindowProc (hWnd, iMessage, wParam, lParam) ;
 }
 return 0L ;
}

/* OutlineBlock() writes a rectangle on screen given two corner points. R2_NOT
 style is used, so drawing twice on the same location erases outline. */
void OutlineBlock (HWND hWnd, POINT beg, POINT end)
{
 HDC hDC ;

 hDC = CreateDC ("DISPLAY", NULL, NULL, NULL) ;
 ClientToScreen (hWnd, &beg) ; /* convert to screen units */
 ClientToScreen (hWnd, &end) ;
 SetROP2 (hDC, R2_NOT) ; /* use logical NOT brush */
 MoveTo (hDC, beg.x, beg.y) ; /* draw rectangle */
 LineTo (hDC, end.x, beg.y) ;
 LineTo (hDC, end.x, end.y) ;
 LineTo (hDC, beg.x, end.y) ;
 LineTo (hDC, beg.x, beg.y) ;
 DeleteDC (hDC) ;
}




































February, 1991
YACC FOR EXPERT SYSTEMS


Developing a complex rule base




Todd King


Todd is a programmer/analyst with the Institute of Geophysics and Planetary
Physics at UCLA. He is also associated with the NASA/JPL Planetary Data
Systems project. Todd can be reached at 1104 N. Orchard, Burbank, CA 91506.


Not long ago I needed to implement an expert system, so I started looking for
tools I could use to build one. I was surprised to learn that there are over
two dozen expert system tools for the PC. In order to select one (or more), I
began examining expert systems in detail and, as I studied how they are
implemented, terms like "rules" and "actions" began cropping up everywhere.
These terms weren't totally foreign to me; I'd run across them before in tools
such as YACC. Once I realized that you can build expert systems with it, I
began looking at YACC differently.
In this article, I'll discuss just how you can build an expert system with
YACC using Turbo C, Version 2.0 and MKS YACC from Mortice Kern Systems (MKS).
MKS YACC (which, together with MKS LEX, makes up the MKS LEX & YACC package)
is identical in function to the versions of YACC commonly found on Unix
systems, so the code presented here should port easily to Unix platforms. All
development and testing for this article was done on a 12.5-MHz PC compatible
running MS-DOS.


Expert Systems


Expert systems can be classified into five categories: procedural, diagnostic,
monitoring, configuration/design, and planning/scheduling. Procedural expert
systems are step-by-step systems and typically take the form of a series of
menus -- hypertext systems, desktop publishing systems, and application
development environments, for example. Traditionally, these types of systems
are not referred to as expert systems because the primary purpose of such
systems is not to provide expert advice, conclusions, or actions, but rather
to provide some other service. The expert elements of these systems exist
simply to aid in the primary goal.
Diagnostic, monitoring, configuration/design, and planning/scheduling systems
are commonly thought of as expert systems because their primary goal is to
provide some expert guidance or advice. Diagnostic systems typically have a
large number of rules and conclusions, and can provide estimates of the
confidence of any conclusion -- a medical diagnostic system, for example.
Monitoring systems typically run continuously and sample various states of
another system. Based on these states, they can provide conclusions about the
system as a whole.
Configuration/design systems consist of rules on how things may be linked
together. You might specify a final product to such a system, and it will
tell you how that product could be assembled or built. Closely related to
configuration/design systems are planning/scheduling systems. They differ in
that, with planning/scheduling systems, time is a factor.
A common characteristic of all expert systems is that they have a "knowledge
base," which usually consists of a set of rules that detail how data and
information are interpreted. In some applications the knowledge base is
maintained as a set of explicit rules, whereas in others it's simply a matter
of the structure and function flow of an application. In either case, an
expert system operates by stepping through one rule after another in search of
a conclusion. In general, expert systems use one of two methods to search for
a conclusion. The first method is called "forward chaining" and is a
data-driven approach in which input is supplied to the system and a
conclusion is derived based upon these inputs. An example of a forward
chaining system is a monitoring system that could accept a stream of data on
the height of the terrain and provide an analysis of the particular landforms
it "sees." This is precisely the type of system I'll implement in order to
demonstrate how you can use YACC to build an expert system.
The second method is "backward chaining" and is a solution-oriented,
fact-driven system in which an event is presented to the system and the cause
of the event is derived. An example of a backward chaining system is a
diagnostic system which could determine the cause of a computer failure. With
such a system, you would supply a description of the current state of the
computer. The expert system would then search its knowledge base for all
possible causes. The system could then present a list of causes or it might
ask for additional information in order to further refine the list.


Knowledge Engineering


Knowledge can be rather ambiguous at times. You can't always describe
something with a mathematical formula -- sometimes you must deal with facts at
a more abstract level. A landform recognizer is a good example of this. You
might ask an expert on landforms (a geologist) "How do you classify a form as
a 'peak'?" The expert might respond "Well, the land slopes upward and then it
slopes downward." (Such a statement will lead to other questions, such as "How
do you define a slope?" But for now let's not pursue this answer.)
Using YACC's rule syntax you could define a peak with the following text:
peak: UP_SLOPE DOWN_SLOPE
In YACC, this is a rule and it would read as "a peak is an up slope followed
by a down slope," which is how the expert described a peak. The extraction and
codification of an expert's knowledge is called "knowledge engineering." The
coding, in this case, is a rather simple operation.
As the conversation with the expert continues, you would probably find out
that there are a variety of landforms that include peaks, valleys, plateaus,
basins, and shelves. Listing One details all of the rules that describe each
of these pieces of knowledge.
Interspersed with the rules in Listing One you'll find other text. Most of it
is C code; the rest consists of directives to YACC. I won't describe every
detail of Listing One. What's important to note is that with YACC all actions
associated with a rule are defined using C code. Once Listing One is processed
by YACC, you will obtain a file that contains just C source code. This source
code implements a state machine that consists of the rules translated into the
proper C instructions and the actions you specified. This allows for the
seamless integration of the output of YACC into any application.


Abstraction


Every state machine generated by YACC has three interfaces. The first is a
function, called yyparse( ), which you can call within an application to start
the state machine. The second interface is a function, called yylex( ), that
you supply so that you can provide the fuel for the engine. For a state
machine, this fuel is a series of tokens. The only allowed tokens yylex( ) can
return are defined by the %token directive (there's one in Listing One). The
third interface is a function, called yyerror( ), that the state machine calls
to report an error when it encounters an impossible combination of tokens.
Listing Two contains the source for a yyerror( ). Whenever this yyerror( ) is
called, it prints a "?!" label to indicate an impossible combination. It then
clears the token stack so that the application can recover from the error.
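The division of labor among the three interfaces can be sketched without YACC
at all. The following is a hand-written stand-in, not the code YACC generates:
toy_lex( ), toy_error( ), and toy_parse( ) are hypothetical names that play
the roles of yylex( ), yyerror( ), and yyparse( ), and the token codes are
made up for the illustration.

```c
#include <stdio.h>

/* Hypothetical token codes; a real parser takes these from %token. */
enum { T_DONE = 0, T_UP = 1, T_DOWN = 2 };

static const int *toy_input;   /* canned token stream, ends with T_DONE */
static int toy_errors;         /* number of calls made to toy_error() */

/* Plays the role of yylex(): hand the engine one token per call. */
static int toy_lex(void)
{
    int t = *toy_input;
    if (t != T_DONE) toy_input++;   /* keep returning T_DONE at the end */
    return t;
}

/* Plays the role of yyerror(): called on an impossible combination. */
static void toy_error(const char *msg)
{
    toy_errors++;
    printf("?! %s\n", msg);
}

/* Plays the role of yyparse(): recognizes exactly one rule,
   peak : T_UP T_DOWN.  Returns 0 on success, 1 on error. */
static int toy_parse(const int *tokens)
{
    toy_input = tokens;
    if (toy_lex() == T_UP && toy_lex() == T_DOWN) {
        printf("peak\n");
        return 0;
    }
    toy_error("unexpected token");
    return 1;
}
```

The application only ever calls the parse function; the engine pulls tokens
and reports errors through the other two on its own schedule.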
Listing Three contains the source for a yylex( ) that will meet our immediate
needs. In addition to yylex( ) there are the functions get_token( ),
token_copy( ), location_copy( ), trend( ), pop_token( ), push_token( ), and
init_source( ). The names of these functions reflect their purpose.
Collectively, the functions in Listing Three read a source of data and
tokenize the indivisible features (direction of slope), which exist in the
data. It's in the function trend( ) that the definition of slope is codified.
The slope is always calculated between adjacent data points and regions with
the same slope are reduced to a single token.
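The slope definition and the merging of equal neighbors can be sketched in
isolation. This is a simplified illustration working on an array of heights
rather than a file; the names classify( ), slopes( ), and the SLOPE_ constants
are hypothetical and do not appear in the listings.

```c
#include <stddef.h>

/* Hypothetical token codes for the illustration. */
enum slope { SLOPE_DOWN = -1, SLOPE_FLAT = 0, SLOPE_UP = 1 };

/* Classify the segment between two adjacent heights by the sign
   of the change in height. */
static enum slope classify(int y0, int y1)
{
    int delta = y1 - y0;
    if (delta < 0) return SLOPE_DOWN;
    if (delta > 0) return SLOPE_UP;
    return SLOPE_FLAT;
}

/* Convert n heights into slope tokens, merging neighboring segments
   that have the same slope.  Writes at most max tokens to out and
   returns the number written. */
static size_t slopes(const int *y, size_t n, enum slope *out, size_t max)
{
    size_t i, count = 0;
    for (i = 1; i < n; i++) {
        enum slope s = classify(y[i - 1], y[i]);
        if (count > 0 && out[count - 1] == s) continue;  /* same run */
        if (count == max) break;                         /* output full */
        out[count++] = s;
    }
    return count;
}
```

For the heights 50, 60, 40, 30, 30, 50 this yields four tokens: up, down,
horizontal, up. The two consecutive downward segments collapse into one.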
One aspect of a landform expert system that complicates the source in Listing
Three is that features overlap one another. That is, a peak and a valley which
are adjacent to each other share a common line segment. Because the state
engine produced by YACC consumes all tokens when a rule is satisfied, we must
maintain a stack of tokens within yylex( ). Doing so allows us to accommodate
overlapping features without fiddling with the simple form of the YACC rules
specifications.
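The push-back idea can be shown with plain integers as tokens. This is a
minimal sketch under that assumption, not the code from Listing Three: the
names token_stack, stack_push( ), stack_pop( ), next_token( ), and
counter_source( ) are hypothetical.

```c
#define STACK_MAX 8   /* hypothetical capacity */

struct token_stack {
    int items[STACK_MAX];
    int count;
};

/* Push a token back; returns the new depth, or -1 if the stack is full. */
static int stack_push(struct token_stack *s, int token)
{
    if (s->count >= STACK_MAX) return -1;
    s->items[s->count++] = token;
    return s->count;
}

/* Pop the most recently pushed token into *token; returns 0 if empty. */
static int stack_pop(struct token_stack *s, int *token)
{
    if (s->count <= 0) return 0;
    *token = s->items[--s->count];
    return 1;
}

/* Fetch the next token: pushed-back tokens first, then fresh input.
   next_fresh is whatever function reads new tokens from the data. */
static int next_token(struct token_stack *s, int (*next_fresh)(void))
{
    int token;
    if (stack_pop(s, &token)) return token;
    return next_fresh();
}

/* Stand-in for fresh input: counts 1, 2, 3, ... */
static int counter_source(void)
{
    static int n = 0;
    return ++n;
}
```

A token that closes one feature can be pushed back so that the next parse
sees it again as the opening token of the following feature.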


Producing Output


The source code for the expert system is designed to work either in a graphics
or text mode. This is accomplished by using the preprocessor if/else/endif
directive to selectively include code for each mode. If you wish to use a
graphics mode simply define the token GRAPHICS. The simplest way to ensure
that this token is defined for all source code is to place it in the include
file lfclass.h. Listing Four contains the source for lfclass.h. In addition to
the GRAPHICS token, there are several new data types that have been defined.
These data types are used throughout the source code.
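The conditional-compilation pattern looks roughly like the following. This is
a sketch only: show_label( ) is a hypothetical function, and the graphics
branch is left as a placeholder rather than calling any real graphics library.

```c
#include <stdio.h>

/* #define GRAPHICS 1 */   /* uncomment to select the graphics branch */

/* Report a label; returns the name of the mode that was compiled in. */
static const char *show_label(const char *text, int x)
{
#ifdef GRAPHICS
    /* A real program would call its graphics library here. */
    (void)text; (void)x;
    return "graphics";
#else
    printf("%s @ %d\n", text, x);
    return "text";
#endif /* GRAPHICS */
}
```

Because the choice is made by the preprocessor, only one branch is ever
compiled, so the text-mode build carries no graphics code at all.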
All output produced by the landform expert system is rendered by the functions
label( ), display_message( ), and draw_trace( ), with the help of
set_screen_max( ), set_view_max( ), init_graphics( ), deinit_graphics( ), and
scaled( ). These functions, which can be found in Listing Five, serve as
standard entry points for implementation-specific graphics calls. The Turbo C
graphics library is used in this implementation.


An Example



Now let's put all the source code just presented to work. Listing Six contains
the source for a fully functional expert system, and Figure 1 provides a
makefile to aid in building the example. When the source in Listing Six
is compiled and linked with the other files in this project, you will have an
application that expects one command-line argument. This argument should be
the name of a file that contains terrain height data. Each sample in this file
must be a pair of numbers separated by a comma. The first number of the pair is
the horizontal position of the sample and the second is the height of the
terrain at that point.
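Reading one sample in this format takes a single sscanf( ) call. The sketch
below is an illustration, not part of the listings: parse_sample( ) is a
hypothetical helper, and it assumes that optional white space around the comma
should be tolerated.

```c
#include <stdio.h>

/* Parse one "x, y" sample from a line of text.
   Returns 1 if both numbers were read, 0 otherwise. */
static int parse_sample(const char *line, int *x, int *y)
{
    /* White space in the format matches any amount of white space,
       including none, so "10,60" and "10 , 60" are both accepted. */
    return sscanf(line, "%d , %d", x, y) == 2;
}
```

Checking that exactly two values were converted rejects a line that holds a
position but no height.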
 Figure 1: A makefile for building the example application

 #----------------------------------------------------------------------
 # Makefile for the generation of a land form expert system -- Todd King
 #----------------------------------------------------------------------

 CFLAGS = -mc
 LDFLAGS = $(CFLAGS)
 YFLAGS = -d

 OBJ = example.obj lfx.obj lfclass.obj render.obj error.obj
 example.exe : $(OBJ)
 $(LD) $(LDFLAGS) $(OBJ) graphics.lib

 lfx.c : lfx.y lfclass.h
 lfclass.obj : lfclass.c lfclass.h ytab.h
 render.obj : render.c lfclass.h
 error.obj : error.c ytab.h
 clean :
 del *.obj
 del lfx.c

A set of terrain data that contains one of each feature is provided in Figure
2. When you run the example with graphical output selected, you will see a
drawing of the terrain data with each landform properly labeled. If graphical
output is not selected, the type of each landform and its location are printed
to the screen. The final landform in every possible data set will be labeled
as impossible. This is actually reasonable because the land ends abruptly.
This might have been true before Columbus sailed to the New World, but is not
true today.
Figure 2: An example set of terrain data

 0, 50
 10, 60
 20, 40
 30, 30
 40, 30
 55, 50
 60, 50
 70, 80
 80, 20
 90, 80
 100, 50
 105, 50
 110, 20



Final Notes


In most expert system shells (the typical tools used to build expert systems),
the knowledge base is often constructed using a series of if/then statements
called "production rules." Several aspects of using YACC set
it apart from the production rule approach, foremost being the compact YACC
syntax. Also, the source produced by YACC contains efficient stack management
code, which buffers tokens so that a match for the longest sequence of tokens
is always found. Another benefit is that it is possible to have rules that are
dependent on other rules. The expert system that results from using YACC is a
C function which is compiled into machine code and executes very quickly, and
the function can be included in any program. A limitation of the YACC
approach, however, is that only forward-chaining systems can be constructed.


Products Mentioned


MKS LEX & YACC MKS Inc. 35 King Street North Waterloo, Ontario N2J 2W9 Canada
800-265-2797 $249 (DOS), $395 (OS/2)


_YACC FOR EXPERT SYSTEMS_
by Todd King


[LISTING ONE]


/*---------------------------------------------------------------------------
 FILE: lfx.y -- Rules and actions for a land form expert system -- Todd King
----------------------------------------------------------------------------*/
%{
#include <string.h>
#include "lfclass.h"
%}

%union {
 LOCATION location;
}

%token <location> UP_SLOPE DOWN_SLOPE HORIZONTAL DONE

%start terrain
%%
terrain: land_form
 {
 YYACCEPT;
 }
 ;
land_form : peak
 | valley
 | plateau
 | basin
 | shelf
 | DONE
 ;
peak : UP_SLOPE DOWN_SLOPE
 {
 label("peak", $1.start_x, $2.end_x);
 }
 ;
valley : DOWN_SLOPE UP_SLOPE
 {
 label("valley", $1.start_x, $2.end_x);
 }
 ;
plateau : UP_SLOPE HORIZONTAL DOWN_SLOPE
 {
 label("plateau", $1.start_x, $3.end_x);
 }
 ;
basin : DOWN_SLOPE HORIZONTAL UP_SLOPE
 {
 label("basin", $1.start_x, $3.end_x);
 }
 ;
shelf : DOWN_SLOPE HORIZONTAL DOWN_SLOPE
 {
 label("shelf", $1.start_x, $3.end_x);
 }
 | UP_SLOPE HORIZONTAL UP_SLOPE
 {
 label("shelf", $1.start_x, $3.end_x);
 }
 ;
%%








[LISTING TWO]

/*-------------------------------------------------------------------------
 FILE: error.c -- Error reporting and recover function called by the YACC
 state engine whenever an impossible combination of states is encountered.
 Todd King
---------------------------------------------------------------------------*/
#include "lfclass.h" /* Must precede "ytab.h" */
#include "ytab.h"

yyerror(c)
{
 extern SOURCE Map_source;

 TOKEN *token;

 label("?!", Map_source.last_token.location.start_x,
 Map_source.last_token.location.start_x);
 token = pop_token(&Map_source);
 while(token->value != DONE) { /* Clear stack */
 token = pop_token(&Map_source);
 }
 Map_source.last_token.value = UNDEF_TOKEN;
 return(0);
}






[LISTING THREE]

/*--------------------------------------------------------------------------
 FILE: lfclass.c -- All functions used to identify indivisible features of
 the land form. This is just the slope of the terrain. -- Todd King
--------------------------------------------------------------------------*/
#include "lfclass.h" /* Must precede "ytab.h" */
#include "ytab.h"

yylex() {
 extern SOURCE Map_source;

 TOKEN last_token;
 TOKEN *tptr;
 int x;
 char buffer[10];

 token_copy(&last_token, get_token(&Map_source));
 token_copy(&Map_source.last_token, &last_token);
 if(last_token.value == DONE) return(last_token.value);


 tptr = get_token(&Map_source);
 while(tptr->value == last_token.value) {
 tptr = get_token(&Map_source);
 }
 push_token(&Map_source, tptr);
 location_copy(&yylval, &last_token.location);
 return(last_token.value);
}

TOKEN *get_token(source)
SOURCE *source;
{
 if(source->in_buffer > 0) {
 return(pop_token(source));
 } else {
 return(trend(source));
 }
}

token_copy(t1, t2)
TOKEN *t1;
TOKEN *t2;
{
 t1->value = t2->value;
 location_copy(&t1->location, &t2->location);
}

location_copy(l1, l2)
LOCATION *l1;
LOCATION *l2;
{
 l1->start_x = l2->start_x;
 l1->start_y = l2->start_y;
 l1->end_x = l2->end_x;
 l1->end_y = l2->end_y;
}

TOKEN *trend(source)
SOURCE *source;
{
 static TOKEN _Trend_token;

 int delta;
 int x, y;

 _Trend_token.value = DONE;
 if(source->last_y == NOT_DEF) {
 if(fscanf(source->fptr, "%d, %d", &source->last_x, &source->last_y) < 2) {
 return(&_Trend_token);
 }
 }
 if(fscanf(source->fptr, "%d, %d", &x, &y) < 2) { /* need both x and y */
 return(&_Trend_token);
 }

 _Trend_token.location.start_x = source->last_x;
 _Trend_token.location.start_y = source->last_y;
 _Trend_token.location.end_x = x;
 _Trend_token.location.end_y = y;


 delta = y - source->last_y;
 source->last_y = y;
 source->last_x = x;

 if(delta < 0) {
 _Trend_token.value = DOWN_SLOPE;
 } else if(delta > 0) {
 _Trend_token.value = UP_SLOPE;
 } else {
 _Trend_token.value = HORIZONTAL;
 }
 return(&_Trend_token);
}

TOKEN *pop_token(source)
SOURCE *source;
{
 static TOKEN _Pop_token;

 _Pop_token.value = DONE;
 if(source->in_buffer <= 0) return(&_Pop_token);
 source->in_buffer--;
 return(&source->token_buffer[source->in_buffer]);
}

push_token(source, token)
SOURCE *source;
TOKEN *token;
{
 if(source->in_buffer >= MAX_TOKEN_BUFFER) {
 return(-1);
 }
 if(token->value == UNDEF_TOKEN) {
 return(-1);
 }

 token_copy(&source->token_buffer[source->in_buffer], token);
 source->in_buffer++;
 return(source->in_buffer);
}

init_source(source)
SOURCE *source;
{
 source->in_buffer = 0;
 source->fptr = NULL;
 source->last_y = NOT_DEF;
 source->last_token.value = UNDEF_TOKEN;
}






[LISTING FOUR]

/*----------------------------------------------------------------------

 FILE: lfclass.h -- Type and misc. definitions for land form classifier.
 Todd King
------------------------------------------------------------------------*/
#ifndef LFCLASS_H
#define LFCLASS_H

#define GRAPHICS 1 /* If defined graphics output is used */
#include <stdio.h>

/* Defines for use with the scaled() function */
#define X_AXIS 1
#define Y_AXIS 2

/* Definitions for token and token source items */
#define NOT_DEF -1
#define UNDEF_TOKEN 0 /* Must be less than 255 for YACC's sake */
#define MAX_TOKEN_BUFFER 8
#define LABEL_AT_Y 12

typedef struct {
 int start_x;
 int start_y;
 int end_x;
 int end_y;
} LOCATION;

typedef struct {
 LOCATION location;
 int value;
} TOKEN;

typedef struct {
 FILE *fptr;
 int last_y;
 int last_x;
 TOKEN last_token;
 int in_buffer;
 TOKEN token_buffer[MAX_TOKEN_BUFFER];
} SOURCE;

/* Function Definitions */
TOKEN *pop_token();
TOKEN *get_token();
TOKEN *trend();

#endif /* LFCLASS_H */






[LISTING FIVE]

/*---------------------------------------------------------
 FILE: render.c -- All rendering functions -- Todd King
----------------------------------------------------------*/
#include <graphics.h>
#include "lfclass.h" /* Must Precede "ytab.h" */

#include "ytab.h"

/* Variables for graphics output */
int View_max_x;
int View_max_y;
int Screen_max_x;
int Screen_max_y;

set_screen_max(x, y)
int x;
int y;
{
 Screen_max_x = x;
 Screen_max_y = y;
}

set_view_max(fptr)
FILE *fptr;
{
 int x, y;

 View_max_x = 0;
 View_max_y = 0;

 while(fscanf(fptr, "%d, %d", &x, &y) > 0) {
 if(x > View_max_x) View_max_x = x;
 if(y > View_max_y) View_max_y = y;
 }
 rewind(fptr);
}

init_graphics()
{
#ifdef GRAPHICS
 int gdriver;
 int gmode;

/* Initialize graphics */
 detectgraph(&gdriver, &gmode);
 if(gdriver < 0) {
 fprintf(stderr, "Unable to determine graphics device.\n");
 return(0);
 }
 initgraph(&gdriver, &gmode, "c:\\turboc");
 if(gdriver < 0) {
 fprintf(stderr, "Unable to determine graphics device.\n");
 return(0);
 }

/* A few initializations. */
 set_screen_max(getmaxx(), getmaxy());
#endif /* GRAPHICS */

 return(1); /* Success */
}

deinit_graphics()
{
 closegraph();

}

display_message(msg)
char msg[];
{
#ifdef GRAPHICS
 moveto(0, getmaxy() - 12);
 outtext(msg);
 getch();
#else
 printf("%s", msg); /* never pass message text as the format string */
 getch();
#endif /* GRAPHICS */
}

draw_trace(fptr)
FILE *fptr;
{
#ifdef GRAPHICS
 int x, y;
 int first = 1;

/* Draw surface */
 while(fscanf(fptr, "%d, %d", &x, &y) > 0) {
 if(first) {
 moveto(scaled(x, X_AXIS), scaled(y, Y_AXIS));
 first = 0;
 }
 lineto(scaled(x, X_AXIS), scaled(y, Y_AXIS));
 }
 rewind(fptr);
#endif /* GRAPHICS */
}

label(text, start_x, end_x)
char text[];
int start_x;
int end_x;
{
#ifdef GRAPHICS
 settextjustify(CENTER_TEXT, BOTTOM_TEXT);
 moveto(scaled(start_x + (end_x - start_x) / 2, X_AXIS),
 LABEL_AT_Y);
 outtext(text);
 settextjustify(LEFT_TEXT, BOTTOM_TEXT);
#else
 printf("%s @ %d\n", text, start_x + (end_x - start_x) / 2);
#endif /* GRAPHICS */
}

scaled(xy, axis)
int xy;
int axis;
{
 int new_xy;

 switch(axis) {
 case X_AXIS:
 new_xy = ((float)xy / View_max_x) * Screen_max_x;

 break;
 case Y_AXIS:
 new_xy = Screen_max_y - ((float)xy / View_max_y) * Screen_max_y;
 break;
 }

 return(new_xy);
}







[LISTING SIX]

/*----------------------------------------------------------------------
 FILE: example.c -- An example integration of all the land form expert
 system code -- Todd King
-----------------------------------------------------------------------*/
#include <stdio.h>
#include "lfclass.h" /* Must precede "ytab.h" */
#include "ytab.h"

#define MAX_WIDTH 79
#define MAX_HEIGHT 10
SOURCE Map_source;

main(argc, argv)
int argc;
char *argv[];
{
 FILE *fptr;
 int temp[2];

/* Check arguments, if OK open file */
 if(argc < 2) {
 printf("Proper usage: example {terrain height file}\n");
 exit(0);
 }

 init_source(&Map_source);
 if((Map_source.fptr = fopen(argv[1], "r")) == NULL) {
 perror(argv[1]);
 exit(0);
 }

 init_graphics();

/* A few initializations. */
 set_view_max(Map_source.fptr);

 draw_trace(Map_source.fptr);

/* Now classify all features. */
 while(Map_source.last_token.value != DONE) {
 yyparse();
 push_token(&Map_source, &Map_source.last_token);

 }

/* All done, wait before cleaning up. */
 display_message("Press any key to exit ...");
 deinit_graphics();
 exit(0);
}




[FIGURE 1]

#-------------------------------------------------------------------------
# Makefile for the generation of a land form expert system -- Todd King
#--------------------------------------------------------------------------
CFLAGS = -mc
LDFLAGS = $(CFLAGS)
YFLAGS = -d

OBJ = example.obj lfx.obj lfclass.obj render.obj error.obj
example.exe : $(OBJ)
 $(LD) $(LDFLAGS) $(OBJ) graphics.lib

lfx.c : lfx.y lfclass.h

lfclass.obj : lfclass.c lfclass.h ytab.h

render.obj : render.c lfclass.h

error.obj : error.c ytab.h

clean :
 del *.obj
 del lfx.c


[FIGURE 2]

0, 50
10, 60
20, 40
30, 30
40, 30
55, 50
60, 50
70, 80
80, 20
90, 80
100, 50
105, 50
110, 20
120, 20





































































February, 1991
INTRINSICS OF THE X TOOLKIT


A toolkit for configuring your user interface




Todd Lainhart


Todd is the technical project leader for the User Interface Management Group
at Cadre Technologies Inc. He can be reached at Cadre Technologies, 222
Richmond Street, Providence, RI 02903; on CompuServe at 71560,443 or at
sun!cadreri!twl.


The computer user interface has made evolutionary leaps in the last decade.
Character display terminals are being obsoleted by bitmapped graphics
displays. The WIMP (overlapping Windows, trashcan Icons, pop-up Menus and
Pointing device, or mouse) model for user computing is overshadowing the
blinking hardware cursor. In the workstation world, there are now toolkits
for developing X Window System applications that give users great flexibility
in configuring not only the look of their applications, but also how those
applications respond to events. In effect, users are offered the ability to
program their own interface.
This article discusses the use of one such toolkit, the Intrinsics, and how,
through studied application of resources and translation tables, the
application developer can allow users personalized configuration of their
computing environment. To demonstrate the power of these facilities, I've
included a sample application: a simple text editor, written using the
OSF/Motif Toolkit.


Quick Tour of X11


The X Window System, or X11 (X Version 11), is a hardware-independent
windowing system for workstations with bitmap displays. Developed at MIT and
DEC, and available on practically all Unix-based platforms, X manages I/O to
the workstation's display locally and across a heterogeneous networking
environment.
The fact that X can manage I/O across a network is a major reason for its
power and immediate acceptance into the workstation community. At the heart of
the X specification is a networking model that supports distributed computing
on top of such protocols as TCP/IP and DECnet, all of which is invisible to
the application programmer or user.
The model for application design with X is that of client/server. The X server
is a process that runs on the host display and services application requests
for I/O from potentially many clients running on that same host system, or
perhaps running remotely on a different vendor's workstation on another part
of the network (even across phone lines). The application interface to the X
protocol is via the Xlib layer, a C-callable set of subroutines bound to the
application, which sends window-management and graphics-display requests to
the X server. While this layer is complete in that it allows the application
full access to the X programming environment, it offers little abstraction for
the developer, so that even trivial applications require excessive amounts of
code just to react to incoming events.


The Intrinsics


There are a number of toolkits available on top of Xlib that abstract away the
low-level details, and allow the developer to better concentrate on building
the application. XView, InterViews, Serpent, and CLUE are available in the
public domain, while the CommonView and XVT toolkits are available
commercially. Probably the most widely used toolkit, and the toolkit I wish to
discuss here, is known as the "Intrinsics," also referred to as Xt or the X
Toolkit. The Intrinsics is a standard defined by the X Consortium -- all
platform vendors that ship an X server with their hardware are bound by that
standard to ship a version of Xlib and the Intrinsics that are usable in that
environment.
Strictly speaking, the X Toolkit is the union of two subroutine libraries: The
Intrinsics, an abstraction of Xlib that presents a particular model for
application/user-interface design, and the widget set Xaw (Athena Widget set),
which is built upon the Intrinsics. The widget is a generic user-interface
object. As defined by the Intrinsics, however, a widget is a reusable
hierarchically-derived user-interface component that encapsulates an X window
(or collection of X windows), state data, and the methods to manipulate those
windows and data.
Although the Athena Widget set is shipped with the X Toolkit, developers are
not bound by the Intrinsics to develop applications with it. Typically,
commercial application developers will replace Xaw with another widget set,
such as OSF/Motif or the Open Look Toolkit for greater functionality and ease
of development. So in part, the Intrinsics act as an Xlib framework or
programming interface on which to build and use widget-based toolkits. For the
remainder of this article, when I refer to the X Toolkit, I'll actually be
referring to the Intrinsics and the OSF/Motif toolkit. I'll be using the
OSF/Motif toolkit simply because it's available on a large number of
Unix-based workstations.


X Toolkit Using the Layered Approach


When developing an application using the X Toolkit, that application will
actually touch all levels of the X programming environment. Figure 1 shows the
model used most often to describe the relationship between an application and
the toolkit layers. Using the Motif toolkit, you create all of your high-level
widgets, interface with the system clipboard, and work with internationalized
strings. At the Intrinsics level, you establish the event dispatching
mechanisms, and perhaps assign resources to your application widgets. Finally,
you program directly to Xlib in order to dispatch IPC calls to cooperating
applications, or to draw a circle in the application's canvas.
Applications written for X at the lowest levels are event driven, synchronous,
and procedural. If you're familiar with Windows or PM programming, you'll
recognize an application written strictly at the Xlib level: Somewhere there
is a significant switch/case statement, with appropriate procedure calls made
to respond to each incoming event. With an application containing many
windows, this can be a difficult program to understand and maintain. The
Intrinsics, thankfully, present an entirely different and more useful model of
event-driven programming; one that is apparently asynchronous and object
oriented.
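The difference between the two styles can be sketched in plain C. This is a
toy model, not the Intrinsics' implementation and not the Xlib API: the event
types, the dispatcher structure, and the functions add_handler( ),
dispatch( ), and count_press( ) are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical event types; these are not Xlib constants. */
enum event_type { EV_KEY, EV_BUTTON, EV_EXPOSE, EV_COUNT };

typedef void (*handler_fn)(void *client_data);

/* One optional handler per event type. */
struct dispatcher {
    handler_fn handler[EV_COUNT];
    void *client_data[EV_COUNT];
};

/* Register a handler, much as a toolkit lets a widget register a
   callback for an event it cares about. */
static void add_handler(struct dispatcher *d, enum event_type t,
                        handler_fn fn, void *client_data)
{
    d->handler[t] = fn;
    d->client_data[t] = client_data;
}

/* Deliver one event; returns 1 if a handler ran, 0 if none was
   registered for that event type. */
static int dispatch(const struct dispatcher *d, enum event_type t)
{
    if (d->handler[t] == NULL) return 0;
    d->handler[t](d->client_data[t]);
    return 1;
}

/* Example handler: counts how many times it has been invoked. */
static void count_press(void *client_data)
{
    ++*(int *)client_data;
}
```

The application code contains no switch statement. It registers the
procedures it cares about and leaves the routing of each event to the
dispatcher.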


A Typical Application


The listings accompanying this article describe a simple text editor --
TextEdit -- written with the Motif toolkit. The editor is horizontally and
vertically scrollable and supports cutting and pasting. Like all
Intrinsics-based programs, there is an initialization section, a section where
all of the prominent widgets are created, a section for registering all
callbacks and event handlers, and finally the event-dispatch loop.
Listing One is the .Xdefaults file. Listing Two contains main( ) and the
descriptions of all high-level widgets. Listing Three contains all callbacks
defined by this application. Listing Four contains code for interfacing to the
Motif clipboard. Listings Five and Six are the header files that export the
routines defined in Listings Three and Four, respectively.
main( ) begins with a call to initialize the Intrinsics, XtInitialize( ).
Along with stripping the command line of options that are significant to the
Intrinsics, XtInitialize( ) registers this program and its class with the
Resource Manager and the Translations Manager. Next, using the Motif toolkit
subroutine, XmCreateScrolledText( ), we create the most interesting widget of
this program -- a multiline scrollable text editor with default selection and
navigation behavior. I've specified resources (widget attributes) that are
specific to this widget, along with their values, to enable features such as
horizontal and vertical scrolling and multiline text editing. This is
achieved, in part, via the macro XtSetArg( ).
Following this, a pop-up menu associated with the text widget is created to
handle cutting and pasting text and writing text from and to a file. An event
handler is then added to the text widget via XtAddEventHandler( ), which
declares the application-defined procedure DisplayTextEditMenu to be called in
response to a referenced event -- in this case, a mouse button press.
DisplayTextEditMenu manages the display and selectability of the pop-up menu.
Finally, the application is mapped to the display screen via the call to
XtRealizeWidget( ) and XtMainLoop( ) is entered. XtMainLoop( ) responds to
events passed to it by the X server and dispatches these events to the
appropriate widget, invoking any callbacks and event handlers registered
by that widget.


Resources


Like the design of X11, the Intrinsics subscribes to the philosophy of
mechanism, not policy. Nowhere in the X11 specification or implementation are
there guidelines for user-interface design or policies: A robust environment
is offered to allow the development of a user interface as specified by the
designer. The Intrinsics adheres to this philosophy by specifying a mechanism
by which widgets and user-interface components are to be designed and used.
Widgets are designed to be reusable, and are highly configurable, primarily
through the use of resources and translation tables.
A resource is a widget attribute, expressed as a name-value pair. It may
specify window width or height, foreground and background colors, text labels,
or even what key is bound to the cut function in a text-editing widget.
At application startup, the Intrinsics resource manager is responsible for
assigning resources to appropriate widgets. These resources may either be
hardcoded, appear on the application's command line, or appear as ASCII text
in one of several files. The resources configurable for each widget are listed
in that widget's man page description. And because widget design is based on
inheritance, a widget (and thus the designer or user) has access to all the
resources of its superclasses.

A resource may be specified in one of several places. The search order is as
follows. First, the resource manager looks for a file named for the class of
the application. A robust, stand-alone application should place any
user-configurable options in this file. If this file is not found, several
other locations are queried, including assorted environment variables. Failing
all of these, the .Xdefaults file is read and parsed for resources applicable
to the running application. .Xdefaults is the standard mechanism for modifying
the attributes of multiple applications per user session. In the TextEdit
example, I've decided to assign
resources in my .Xdefaults file (see Listing One). Note that the syntax for
specifying the widget to which I'll assign resources allows the use of
wild-cards, instance names, and class names. The resource manager depends upon
every application and widget to have a name (instance) and belong to a class.
Typographical conventions require the instance name to begin in lowercase and
the class name to begin in uppercase. Resource names work similarly:
borderColor refers to the borderColor resource of a particular widget, while
BorderColor refers to all borderColor resources for referenced widgets that
support this resource.
Resources are known and specified to the resource manager as strings. These
strings and the manifest constants (as they are known to the C compiler) are
found in StringDefs.h. Resources introduced by the Motif toolkit are found in
Xm.h. Resource instance names are preceded by the XtN or XmN prefix and
resource class names are preceded by XtC or XmC. When specifying resources via
C code, the left-hand side of the resource definition is used (for example,
XmNborderColor). When specifying resources via an ASCII file, such as
.Xdefaults, the right-hand side is referenced (as in borderColor) and quotes
are omitted.
Resources can be assigned values in such a way that either all text widgets
for the application will have 80 columns and 24 rows, or just the bottom text
widget will have 80 columns and only 5 rows. Values are assigned to specific
widgets by addressing those widgets via a naming tree. In keeping with
object-oriented design principles, widgets are created in a hierarchical
fashion. Each widget must have a parent (except for the root or top-level
widget created by XtInitialize) by which it is managed. When referring to a
specific widget's resources, that widget is addressed in the .Xdefaults file
by listing the names of all the widgets in its hierarchy (or "widget tree").
The name of the widget is passed to its creation routine in the "name"
argument. In our example, textEdit is the name of the text widget as
instantiated by XmCreateScrolledText in Listing Two.
Note that there are orders of precedence to resource specification. Specific
resource descriptions (such as myfile.textEdit.rows: 53) will override more
general descriptions. Command-line and hardcoded specifications take an even
higher precedence. Also note that any errors in parsing the resource
specification will be silently ignored by the resource manager!
In TextEdit, I've made the decision to allow resource configurability for only
the text widget. I've hardcoded the scrollVertical and scrollHorizontal
resources to be true, which enables vertical and horizontal scrolling. Note,
however, that I have allowed the user to configure the rows and columns
resources, as well as the resizeHeight and resizeWidth resources, through the
.Xdefaults file. I've not made any assignments for background or foreground
colors, but they also could be configured. It's important to note that if user
configurability is to be allowed, the user must know the naming tree and type
of the widget to be configured.
Resources such as borderColor and backgroundColor may not seem very
interesting. However, these are only the beginning of the configurable
resources available for the Motif widget set. Resources are available for
setting the directory and directory mask for file-search dialog boxes
and for describing text editing modes for text-edit widgets. You'll have to
decide which resources should be exposed to the user and which you want
hidden and settable only by the application.


Translation Tables


Motif and the Intrinsics support resource translations. The Intrinsics
translation manager allows user-specified mapping between key or mouse events
and action procedures. Most widgets support some form of default translation
action procedures. The Motif XmText widget supports action procedures such as
backward-word( ), forward-word( ), delete-previous-word( ), and
delete-next-word( ). Because translations are resources, they may be specified
in the same locations as other resources, using a similar syntax. That is, the
widget having its translations modified is referenced either explicitly via
its naming tree, or using wildcards. Once a path to a specific widget has been
described, the mapping between key and mouse events and action procedures can
be detailed, as in Listing One.
In my text editor, I decided to change the behavior of the default cursor
navigation keys. I wanted Ctrl-A and Ctrl-E to move my insertion-point cursor
to the beginning and end of the current line, respectively. Looking through
the man page for XmText, I noticed that the action procedures
beginning-of-line( ) and end-of-line( ) are available. The syntax for
translation specification is similar to any other resource-specification
syntax. The first line of the specification starts with #override. This
informs the translation manager that I wish for all default translations to
remain intact, with the following translations overriding any previously
defined translations for the same events.


Products Mentioned


X11R4
MIT Software Center
E32-300
28 Carlton Street
Cambridge, MA 02139
617-258-8330
$400 U.S., $500 overseas
Format: 3 9-track 1/2" 6250 BPI tapes in tar format, using Berkeley C;
includes manuals. Also available on UUCP from UUNET or via anonymous FTP at
the following hosts:

 Location  Hostname             Address          Directory
 -------------------------------------------------------------
 West      gatekeeper.dec.com   128.45.9.52      pub/X.V11R4
 East      UUNET.uu.net         192.12.141.129   X/X.V11R4
 NEast     expo.lcs.mit.edu     18.30.0.212      pub/X.V11R4
 Midwest   cygnusX1.cs.utk.edu  128.169.201.12   pub/X.V11R4
 South     dinorah.wustl.edu    129.252.118.101  pub/X.V11R4

The value section of the name-value pair must fit on one line, and translation
syntax must include newlines to separate the different specifications.
Therefore, I must add a newline character (using C syntax) at the end of each
line, followed by a backslash that escapes the actual newline. Following
#override, I've defined the keys that I wish to set and the action to be
invoked. Key names (for example, Right, Left) for your specific installation
are described in keysymdef.h. Reserved modifiers and event types, such as Ctrl
and <Key>, are found in the man pages for the Intrinsics library.
With the exception of one routine, all the actions used in these translations
are defined by the XmText widget. Because a user-configurable way to exit the
editor was needed, and no default mechanism for exiting the application exists
(except killing the process), I created an action routine and added it to the
default-actions table via XtAddActions( ). Because this routine is registered
under the name "exit," the translation manager can, upon startup, scan the
resource database for this action and associate it with ExitApp.


Summary


Figure 2 presents a generic makefile that assumes default locations for the
X11 and Xm include files and libraries. Depending upon your installation, you
may have to make some modifications. Note that I've taken some trouble to
locate all callback functions and the procedures accessing the clipboard in a
separate file. I've done this for clarity: The requirements for good
object-oriented design for asynchronous event-driven programs of this type
actually suggest a different packaging scheme.
Figure 2: A generic makefile

 # Makefile to build textedit

 # Macros

 CC=/bin/cc
 DEBUG=-g
 INCLUDE_DIRS=-I /usr/include/Xm -I /usr/include/X11
 SYS_DEFS=$(SYS_T) $(RUN_T) -DSYSV
 CC_SWITCHES= -c $(SYS_DEFS) $(INCLUDE_DIRS) $(DEBUG)

 LD=/bin/ld
 LIBDIRS=-L/usr/X11/lib
 LIBS=-lXm -lXtm -lXaw -lX11
 LD_SWITCHES=$(LIBDIRS) $(LIBS)

 # Inference rules
 .SUFFIXES: .c .o .ln

 .c.o:
 $(CC) $(CC_SWITCHES) $<


 OBJS=\
 xm_main.o\
 xm_clipboard.o\
 xm_callbacks.o

 # Targets
 all: textedit

 textedit: $(OBJS)
 $(LD) -o $@ $(OBJS) $(LD_SWITCHES)

 xm_main.o: xm_callbacks.h

 xm_callbacks.o: xm_clipboard.h

 # Misc targets
 clean:
 -rm *.bak *.o

 lint:
 lint $(INCLUDE_DIRS) -DSYSV *.c

There is a lot of information for the budding Intrinsics programmer to digest
when creating applications with this toolkit. Certainly, there is too much
information to cover thoroughly here, so I've included a bibliography of some
excellent texts to get you started. There should, however, be enough
information to whet your interest, and to get you started in more involved
programming projects using the X toolkit.


Acknowledgments


Thanks to David K. Taylor and Paul Caswell of Cadre Technologies Inc. for
their comments and review.


Bibliography


Nye, Adrian and Tim O'Reilly. X Toolkit Intrinsics Programming Manual.
Wilmington, Mass.: O'Reilly & Associates, 1990.
- - -. X Toolkit Intrinsics Reference Manual. Wilmington, Mass.: O'Reilly &
Associates, 1990.
Young, Douglas A. The X Window System, Programming and Applications with Xt:
OSF/Motif Edition. Englewood Cliffs, N.J.: Prentice-Hall, 1990.


_INTRINSICS OF THE X TOOLKIT_
by Todd Lainhart


[LISTING ONE]

!
! Resource specifications for simple text editor
!
*textEdit.rows: 24
*textEdit.columns: 80
*textEdit.resizeWidth: False
*textEdit.resizeHeight: False
*textEdit.translations: #override \n\
 Ctrl<Key>Right: forward-word() \n\
 Ctrl<Key>Left: backward-word() \n\
 Ctrl<Key>a: beginning-of-line() \n\
 Ctrl<Key>e: end-of-line() \n\
 Ctrl<Key>a, Ctrl<Key>a: beginning-of-file() \n\
 Ctrl<Key>e, Ctrl<Key>e: end-of-file()








[LISTING TWO]


/*~PKG*************************************************************************
 * Package Name: xm_main.c
 * Synopsis: Implements a simple text editor using the Motif toolkit.
 * Features Supported: Not much.
 * References: Xt Programming and Apps by Doug Young.
 * Xm Programming Reference and Guide by OSF.
 * Xt Programming Reference and Guide by O'Reilly.
 * Usage: Bind this with a couple of other support objs.
 * Known Bugs/Deficiencies:
 * Modification History: 11/01/90 twl original
 */

/******************************************************************************
 * Header files included. */
#include <stdio.h>
#include <stdlib.h>
#include <X11/Intrinsic.h>
#include <X11/StringDefs.h>
#include <Xm/Xm.h>
#include <Xm/Text.h>
#include <Xm/RowColumn.h>
#include <Xm/PushBG.h>
#include <Xm/FileSB.h>
#include <Xm/SelectioB.h>
#include "xm_callbacks.h"

/******************************************************************************
 * Constants and variables local to this package. */

/* These widgets are the popup menu items, externalized here so that
 * functions within this package can have access (for the setting/unsetting
 * of selectability). */
static Widget CopyMenuItem;
static Widget CutMenuItem;
static Widget PasteMenuItem;
static Widget PasteFileMenuItem;
static Widget WriteFileMenuItem;

static void ExitApp();

/* The actions table for declaring new translations. */
static
XtActionsRec actionTable[] =
{
 { "exit", ExitApp },
};

/******************************************************************************
 * Procedure: ExitApp
 * Synopsis: Action procedure for exiting application
 * Assumptions: None.
 * Features Supported:
 * Known Bugs/Deficiencies: We're not interested in checking state of the
 * editor before going down. Regardless of the circumstances, down we go.
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
static void
ExitApp( Widget parent, XEvent *event, String *actionArgs, Cardinal *argCount )
{

 XtCloseDisplay( XtDisplay( parent ) );
 exit( 0 );
}

/******************************************************************************
 * Procedure: DisplayTextEditMenu
 * Synopsis: Event handler to display the text body popup menu.
 * Assumptions: The parent is a Text Widget.
 * Features Supported:
 * Known Bugs/Deficiencies: External resources should be considered.
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
static void
DisplayTextEditMenu( textBody, popupMenu, xEvent )

 Widget textBody; /* Owner of the event handler */
 Widget popupMenu; /* Data passed by the registering procedure */
 XEvent *xEvent; /* Passed to all event handlers */
{

 Arg argList[25]; /* Resource argument list */
 int argNdx; /* Index into resource list */

 int menuButton; /* MENU button assigned to popup */
 char *selectedText; /* The text selected for the widget invoking the menu */

 /* We're assuming that the owning widget of this event handler is of
 * type XmCText. If not, get out. */
 if ( !XmIsText( textBody ) )
 {
 printf( "DisplayTextEditMenu: Not Text\n" );
 exit( 1 );
 }

 /* We're also assuming that the data passed by the event handler
 * is a popup menu widget. If not, get out. */
 if ( !XmIsRowColumn( popupMenu ) )
 {
 printf( "DisplayTextEditMenu: Not RowColumn\n" );
 exit( 1 );
 }

 /* Check to see if the button that caused this event is the menu
 * button. If not, get out. */
 argNdx = 0;
 XtSetArg( argList[argNdx], XmNwhichButton, &menuButton ); argNdx++;
 XtGetValues( popupMenu, argList, argNdx );
 if ( xEvent->xbutton.button != menuButton )
 {

 return;
 }

 /* We need to set the selectability of the menu items here. For most menu
 * items, that involves checking to see if any text has been selected. */
 selectedText = XmTextGetSelection( textBody );

 /* The Copy menu item. */
 if ( selectedText != NULL )
 {
 XtSetSensitive( CopyMenuItem, TRUE );
 }
 else
 {
 XtSetSensitive( CopyMenuItem, FALSE );
 }

 /* The Cut menu item. */
 if ( selectedText != NULL )
 {
 XtSetSensitive( CutMenuItem, TRUE );
 }
 else
 {
 XtSetSensitive( CutMenuItem, FALSE );
 }

 /* The Paste menu item. See if there's something in the clipboard,
 * and set sensitivity accordingly. */
 if ( selectedText == NULL )
 {
 if ( ClipboardIsEmpty( textBody ) )
 {
 XtSetSensitive( PasteMenuItem, FALSE );
 }
 else
 {
 XtSetSensitive( PasteMenuItem, TRUE );
 }
 }
 else
 {
 XtSetSensitive( PasteMenuItem, FALSE );
 }

 /* The PasteFile menu item. Let's say that we can only paste from a file
 * if no text has been selected. */
 if ( selectedText == NULL )
 {
 XtSetSensitive( PasteFileMenuItem, TRUE );
 }
 else
 {
 XtSetSensitive( PasteFileMenuItem, FALSE );
 }

 /* The WriteFile menu item. */
 if ( selectedText != NULL )
 {

 XtSetSensitive( WriteFileMenuItem, TRUE );
 }
 else
 {
 XtSetSensitive( WriteFileMenuItem, FALSE );
 }

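 /* XmTextGetSelection returns allocated storage; release it now that the
 * sensitivity of every menu item has been set. */
 XtFree( selectedText );

 XmMenuPosition( popupMenu, (XButtonPressedEvent *) xEvent );
 XtManageChild( popupMenu );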

}

/*~PROC************************************************************************
 * Procedure: CreateTextEditPopup
 * Synopsis: Creates the Popup menu displayed over the text edit area.
 * Callbacks are also defined here.
 * Assumptions:
 * Features Supported:
 * Known Bugs/Deficiencies: External resources should perhaps be considered.
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
static Widget
CreateTextEditPopup( Widget parent )
{

 Widget textPopup; /* Created popup returned */
 Arg argList[25]; /* Resource argument list */
 int argNdx; /* Index into argument list */

 Widget fileDialog; /* File selection dialog box */
 Widget promptDialog; /* Text input prompt */

 /* We assume a text edit widget as parent. If not, get out. */
 if ( !XmIsText( parent ) )
 {
 printf( "CreateTextEditPopup: Not Text\n" );
 exit( 1 );
 }

 /* Create the popup menu. We'll tell Xt to manage it at the time that
 * it needs to be displayed. */
 textPopup = XmCreatePopupMenu( parent, "textPopup", NULL, 0 );

 /* Add the menu items (buttons). */
 argNdx = 0;
 XtSetArg( argList[argNdx], XmNlabelString, XmStringCreateLtoR( "Copy",
XmSTRING_DEFAULT_CHARSET ) ); argNdx++;
 CopyMenuItem = XmCreatePushButtonGadget( textPopup, "copyMenuItem", argList,
argNdx );
 XtManageChild( CopyMenuItem );

 argNdx = 0;
 XtSetArg( argList[argNdx], XmNlabelString, XmStringCreateLtoR( "Cut",
XmSTRING_DEFAULT_CHARSET ) ); argNdx++;
 CutMenuItem = XmCreatePushButtonGadget( textPopup, "cutMenuItem", argList,
argNdx );
 XtManageChild( CutMenuItem );

 argNdx = 0;
 XtSetArg( argList[argNdx], XmNlabelString, XmStringCreateLtoR( "Paste",
XmSTRING_DEFAULT_CHARSET ) ); argNdx++;
 PasteMenuItem = XmCreatePushButtonGadget( textPopup, "pasteMenuItem",
argList, argNdx );
 XtManageChild( PasteMenuItem );


 argNdx = 0;
 XtSetArg( argList[argNdx], XmNlabelString,
 XmStringCreateLtoR( "Paste From File...", XmSTRING_DEFAULT_CHARSET ) );
 argNdx++;
 PasteFileMenuItem = XmCreatePushButtonGadget( textPopup, "pasteFileMenuItem",
argList, argNdx );
 XtManageChild( PasteFileMenuItem );

 argNdx = 0;
 XtSetArg( argList[argNdx], XmNlabelString,
 XmStringCreateLtoR( "Write To File...", XmSTRING_DEFAULT_CHARSET ) );
 argNdx++;
 WriteFileMenuItem = XmCreatePushButtonGadget( textPopup, "writeFileMenuItem",
argList, argNdx );
 XtManageChild( WriteFileMenuItem );

 /* Add the File Selection dialog, to be invoked by PasteFileMenu button. */
 argNdx = 0;
 XtSetArg( argList[argNdx], XmNdialogStyle, XmDIALOG_APPLICATION_MODAL );
argNdx++;
 XtSetArg( argList[argNdx], XmNdialogTitle,
 XmStringCreateLtoR( "Paste From File", XmSTRING_DEFAULT_CHARSET ) );
 argNdx++;
 XtSetArg( argList[argNdx], XmNselectionLabelString, XmStringCreateLtoR(
"Directory", XmSTRING_DEFAULT_CHARSET ) ); argNdx++ ;
 XtSetArg( argList[argNdx], XmNautoUnmanage, True ); argNdx++;
 fileDialog = XmCreateFileSelectionDialog( parent, "fileDialog", argList,
argNdx );

 /* Add a selection dialog, to be invoked by the WriteFileMenu button. */
 argNdx = 0;
 XtSetArg( argList[argNdx], XmNdialogStyle, XmDIALOG_APPLICATION_MODAL );
argNdx++;
 XtSetArg( argList[argNdx], XmNdialogTitle,
 XmStringCreateLtoR( "Write To File", XmSTRING_DEFAULT_CHARSET ) );
 argNdx++;
 XtSetArg( argList[argNdx], XmNselectionLabelString, XmStringCreateLtoR(
"File", XmSTRING_DEFAULT_CHARSET ) ); argNdx++ ;
 XtSetArg( argList[argNdx], XmNtextColumns, 32 ); argNdx++;
 promptDialog = XmCreatePromptDialog( parent, "promptDialog", argList, argNdx
);

 /* Add callbacks for the menu buttons. */
 XtAddCallback( CopyMenuItem, XmNactivateCallback, CopyCB, parent );
 XtAddCallback( CutMenuItem, XmNactivateCallback, CutCB, parent );
 XtAddCallback( PasteMenuItem, XmNactivateCallback, PasteCB, parent );
 XtAddCallback( PasteFileMenuItem, XmNactivateCallback, PasteFileCB,
fileDialog );
 XtAddCallback( WriteFileMenuItem, XmNactivateCallback, WriteFileCB,
promptDialog );

 /* Add callbacks for the dialog box buttons. */
 XtAddCallback( fileDialog, XmNokCallback, FileDialogOKCB, parent );
 XtAddCallback( fileDialog, XmNcancelCallback, UnMapDialogCB, fileDialog );
 XtAddCallback( fileDialog, XmNhelpCallback, UnMapDialogCB, fileDialog );
 XtAddCallback( promptDialog, XmNokCallback, PromptDialogOKCB, parent );
 XtAddCallback( promptDialog, XmNcancelCallback, UnMapDialogCB, promptDialog
);
 XtAddCallback( promptDialog, XmNhelpCallback, UnMapDialogCB, promptDialog );

 return( textPopup );

}

/*~PROC************************************************************************
 * Procedure: main
 * Synopsis: Initializes the Intrinsics, creates all of the higher-level
 * widgets necessary to make the application happen, and enters the main loop.
 * Assumptions:
 * Usage: Command-line arguments are ignored (for now).
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
int
main( int argc, char *argv[] )
{

 Widget topShell; /* Top level shell created by the Intrinsics */
 Widget textEdit; /* Main edit Text Widget */

 Widget textMenu; /* Popup menu associated with the text editor */

 Arg argList[25]; /* Resource argument list */
 int argNdx; /* Index into resource list */

 /* Initialize the Intrinsics. */
 topShell = XtInitialize( argv[0], "Editor", NULL, 0, &argc, argv );

 /* Create the scrolled Text Widget */
 argNdx = 0;
 XtSetArg(argList[argNdx], XmNscrollVertical, True ); argNdx++;
 XtSetArg(argList[argNdx], XmNscrollHorizontal, True ); argNdx++;
 XtSetArg(argList[argNdx], XmNeditMode, XmMULTI_LINE_EDIT ); argNdx++;

 textEdit = XmCreateScrolledText( topShell, "textEdit", argList, argNdx );
 XtManageChild( textEdit );

 /* Create the context-sensitive popup menu for this Widget */
 textMenu = CreateTextEditPopup( textEdit );

 /* Add the event handler to the Text Widget, invoking the popup menu. */
 XtAddEventHandler( textEdit, ButtonPressMask, FALSE, DisplayTextEditMenu,
textMenu );

 /* Register new actions to be associated with our app. */
 XtAddActions( actionTable, XtNumber( actionTable ) );

 /* Map the editor, and enter the event dispatch loop. */
 XtRealizeWidget( topShell );
 XtMainLoop();

}






[LISTING THREE]

/*~PKG*************************************************************************
 * Package Name: xm_callbacks.c
 * Synopsis: Common text manipulation callbacks.
 * Features Supported:
 * References: Xt Programming and Apps by Doug Young.
 * Xm Programming Reference and Guide by OSF.
 * Xt Programming Reference and Guide by O'Reilly.
 * Usage: Include "xm_callbacks.h"
 * Known Bugs/Deficiencies:
 * Modification History: 11/01/90 twl original
 */

/*~HDR*************************************************************************
 * Header files included.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>


#include <X11/Xatom.h>
#include <X11/StringDefs.h>
#include <X11/Intrinsic.h>
#include <Xm/Xm.h>
#include <Xm/Text.h>
#include <Xm/FileSB.h>

#include "xm_clipboard.h"

/*~PROC************************************************************************
 * Procedure: MapDialogCB
 * Synopsis: Maps the referenced dialog box.
 * Assumptions: The parent has been realized.
 * The widget passed to this callback is a subclass of dialogshell.
 * Features Supported:
 * Known Bugs/Deficiencies:
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
void
MapDialogCB( source, dialog, callbackData )

 Widget source; /* Source of the callback */
 Widget dialog; /* Data passed to the callback by the register procedure */
 XmAnyCallbackStruct *callbackData; /* Generic data passed to all callback
procedures */
{

 XtManageChild( dialog );

}

/*~PROC************************************************************************
 * Procedure: UnMapDialogCB
 * Synopsis: Unmaps the referenced dialog box.
 * Assumptions: The parent has been realized.
 * The widget passed to this callback is a subclass of dialogshell.
 * Features Supported:
 * Known Bugs/Deficiencies:
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
void
UnMapDialogCB( source, dialog, callbackData )

 Widget source; /* Source of the callback */
 Widget dialog; /* Data passed to the callback by the register procedure */
 XmAnyCallbackStruct *callbackData; /* Generic data passed to all callback
procedures */
{

 XtUnmanageChild( dialog );

}

/*~PROC************************************************************************
 * Procedure: CutCB
 * Synopsis: Callback procedure for cutting text from the referenced text
 * widget to the clipboard. Callback for the "Cut" menu item.
 * Assumptions:
 * Features Supported:
 * Known Bugs/Deficiencies: Cursor should change to a wait state.
 * Modification History: 11/01/90 twl original

 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
void
CutCB( source, textID, callbackData )

 Widget source; /* Source of the callback */
 Widget textID; /* Data passed to the callback by the register procedure */
 XmAnyCallbackStruct *callbackData; /* Generic data passed to all callback
procedures */
{

 XClientMessageEvent clientMessage; /* X client message structure */
 Time timestamp; /* X Event time */
 int clipStat; /* Return status of clipboard call */

 /* Get the event timestamp */
 timestamp = ((XButtonEvent *)callbackData->event)->time;

 /* Copy the selected text to the clipboard. */
 clipStat = CopyToClipboard( textID, timestamp );

 /* Delete the selected text from the Text Widget */
 if ( clipStat == True )
 {
 clientMessage.type = ClientMessage;
 clientMessage.display = XtDisplay( textID );
 clientMessage.message_type = XmInternAtom( XtDisplay( textID ),
"KILL_SELECTION", FALSE );
 clientMessage.window = XtWindow( textID );
 clientMessage.format = 32;
 clientMessage.data.l[0] = XA_PRIMARY;
 XSendEvent( XtDisplay( textID ), clientMessage.window, TRUE, NoEventMask,
&clientMessage );
 }

}

/*~PROC************************************************************************
 * Procedure: CopyCB
 * Synopsis: Callback procedure for copying text from the referenced text
 * widget to the clipboard. Callback for the "Copy" menu item.
 * Assumptions:
 * Features Supported:
 * Known Bugs/Deficiencies: The cursor should change into a waiting cursor.
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
void
CopyCB( source, textID, clientData )

 Widget source; /* Source of the callback */
 Widget textID; /* Data passed to the callback as defined by the registering
procedure */
 XmAnyCallbackStruct *clientData; /* Generic data passed to all callback
procedures */

{
 Time eventTime; /* Time stamp for the clipboard */

 /* Get the time the event occurred */
 eventTime = ((XButtonEvent *)clientData->event)->time;

 /* Copy the selected text (if any) to the clipboard */
 CopyToClipboard( textID, eventTime );

}


/*~PROC************************************************************************
 * Procedure: PasteCB
 * Synopsis: Callback procedure for pasting text from the clipboard into the
 * referenced text widget. Callback for the "Paste" menu item.
 * Assumptions:
 * Features Supported:
 * Known Bugs/Deficiencies: External resources should be considered.
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
void
PasteCB( source, textID, callbackData )

 Widget source; /* Owner of the callback */
 Widget textID; /* Data passed to the callback routine by */
 /* the registering procedure */
 XmAnyCallbackStruct *callbackData; /* Data passed to all callbacks */

{
 char *pasteText; /* That text which is to be retrieved from the paste buffer
*/
 Time eventTime; /* Time stamp for the clipboard routines */
 Arg argList[25]; /* Resource retrieval array */
 int argNdx; /* Index into resource array */

 XmTextPosition textCursorPos; /* Position of Text Widget insertion cursor */

 /* Get the time the event occurred (for transaction timestamping) */
 eventTime = ((XButtonEvent *)callbackData->event)->time;

 /* Get the latest text from the clipboard. */
 pasteText = RetrieveFromClipboard( textID, eventTime );

 /* See if we have a hit. If not, get out. */
 if ( pasteText == NULL )
 {
 return;
 }

 /* Get the insertion point of the text Widget */
 argNdx = 0;
 XtSetArg( argList[argNdx], XmNcursorPosition, &textCursorPos ); argNdx++;
 XtGetValues( textID, argList, argNdx );

 /* ...and insert the text */
 XmTextReplace( textID, textCursorPos, textCursorPos, pasteText );

 XtFree( pasteText );

}

/*~PROC************************************************************************
 * Procedure: PasteFileCB
 * Synopsis: Callback procedure for the Paste from File... menu item.
 * Currently, just the dialog box is displayed.
 * Assumptions:
 * Features Supported:
 * Known Bugs/Deficiencies: External resources should be considered.
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/

void
PasteFileCB( source, dialog, callbackData )

 Widget source; /* Owner of the callback */
 Widget dialog; /* Data passed to the callback routine by */
 /* the registering procedure */
 XmAnyCallbackStruct *callbackData; /* Data passed to all callbacks */

{
 XtManageChild( dialog );
}

/*~PROC************************************************************************
 * Procedure: WriteFileCB
 * Synopsis: Callback procedure for the Write to File... menu item.
 * Currently, just the dialog box is displayed.
 * Assumptions:
 * Features Supported:
 * Known Bugs/Deficiencies: External resources should be considered.
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
void
WriteFileCB( source, dialog, callbackData )

 Widget source; /* Owner of the callback */
 Widget dialog; /* Data passed to the callback routine by */
 /* the registering procedure */
 XmAnyCallbackStruct *callbackData; /* Data passed to all callbacks */

{
 XtManageChild( dialog );
}

/*~PROC************************************************************************
 * Procedure: FileDialogOKCB
 * Synopsis: Callback procedure for the activation of the OK button on the
 * file selection dialog box.
 * Assumptions: The file to be pasted is ASCII.
 * The source of the callback is a file selection dialog box.
 * Features Supported:
 * Known Bugs/Deficiencies: External resources should be considered.
 * The file to be pasted is not checked for type (should be ASCII).
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
void
FileDialogOKCB( source, textID, callbackData )

 Widget source; /* Owner of the callback */
 Widget textID; /* Data passed to the callback routine */
 XmFileSelectionBoxCallbackStruct *callbackData; /* Data passed to all file
selection callbacks */
{
 char *pasteFile; /* Filename returned from the dialog */
 int pasteFileLen; /* Length of referenced file */
 char *pasteText; /* Contents of reference file */
 struct stat statBuf; /* Buffer for stat() results */
 FILE *fileDesc; /* UNIX file descriptor */

 Arg argList[25]; /* Resource retrieval array */
 int argNdx; /* Index into resource array */


 XmTextPosition textCursorPos; /* Position of Text Widget insertion cursor */

 if ( !XmIsText( textID ) )
 {
 printf( "FileDialogOKCB: Not Text Widget\n" );
 exit( 1 );
 }

 if ( !XmIsFileSelectionBox( source ) )
 {
 printf( "FileDialogOKCB: Not dialog box\n" );
 exit( 1 );
 }

 /* Get the filename */
 XmStringGetLtoR( callbackData->value, XmSTRING_DEFAULT_CHARSET, &pasteFile );

 /* Open the file */
 fileDesc = fopen( pasteFile, "r" );
 if ( fileDesc == NULL )
 {
 /* Display an error prompt, and get out */
 printf( "FileDialogOKCB: File not available for read\n" );
 exit( 1 );
 }

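 /* Get its length, read the contents, and close it up. The buffer gets one
 * extra byte so the text can be NUL-terminated for XmTextReplace. */
 stat( pasteFile, &statBuf );
 pasteFileLen = statBuf.st_size;
 pasteText = XtMalloc( pasteFileLen + 1 );
 pasteFileLen = fread( pasteText, sizeof( char ), pasteFileLen, fileDesc );
 pasteText[pasteFileLen] = '\0';
 fclose( fileDesc );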

 /* Paste the contents at the current insertion point. */
 argNdx = 0;
 XtSetArg( argList[argNdx], XmNcursorPosition, &textCursorPos ); argNdx++;
 XtGetValues( textID, argList, argNdx );
 XmTextReplace( textID, textCursorPos, textCursorPos, pasteText );

 /* Free up resources */
 XtFree( pasteFile );
 XtFree( pasteText );

 /* Bring down the dialog box */
 XtUnmanageChild( source );

}

/*~PROC************************************************************************
 * Procedure: PromptDialogOKCB
 * Synopsis: Callback procedure for the activation of the OK button on the
 * prompt dialog box.
 * Assumptions:
 * Features Supported:
 * Known Bugs/Deficiencies: External resources should be considered.
 * Minimal error checking on file creation and write.
 * Modification History: 08/20/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */

void
PromptDialogOKCB( source, textID, callbackData )

 Widget source; /* Owner of the callback */
 Widget textID; /* Data passed to the callback routine */
 XmSelectionBoxCallbackStruct *callbackData; /* Data passed to all selection
callbacks */
{

 char *writeFile; /* Filename returned from the dialog */
 int writeFileLen; /* Length of referenced file */
 char *writeText; /* Contents of reference file */
 struct stat statBuf; /* Buffer for stat() results */
 FILE *fileDesc; /* UNIX file descriptor */

 char *selectedText; /* That text which is marked as selected in textID */

 if ( !XmIsText( textID ) )
 {
 printf( "PromptDialogOKCB: Not Text Widget\n" );
 exit( 1 );
 }

 /* If no text selected, we can leave. */
 selectedText = XmTextGetSelection( textID );
 if ( selectedText == NULL )
 {
 return;
 }

 /* Get the filename */
 XmStringGetLtoR( callbackData->value, XmSTRING_DEFAULT_CHARSET, &writeFile );

 /* Open the file */
 fileDesc = fopen( writeFile, "w" );
 if ( fileDesc == NULL )
 {
 /* Display an error, and get out */
 printf( "PromptDialogOKCB: Error on file creation\n" );
 exit( 1 );
 }

 /* Write the file, and close it up */
 fwrite( selectedText, sizeof( char ), strlen( selectedText ), fileDesc );
 if ( fclose( fileDesc ) != 0 )
 {
 /* Display an error, and get out */
 printf( "PromptDialogOKCB: Error on file close\n" );
 exit( 1 );
 }

 /* Free up resources */
 XtFree( selectedText );
 XtFree( writeFile );

}






[LISTING FOUR]


/*~PKG*************************************************************************
 * Package Name: xm_clipboard.c
 * Synopsis: Implements clipboard store and retrieve procedures.
 * Features Supported:
 * References: Xt Programming and Apps by Doug Young.
 * Xm Programming Reference and Guide by OSF.
 * Xt Programming Reference and Guide by O'Reilly.
 * Usage: Include "xm_clipboard.h"
 * Known Bugs/Deficiencies:
 * Modification History: 11/01/90 twl original
 */

/*~HDR*************************************************************************
 * Header files included. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <X11/StringDefs.h>
#include <X11/Intrinsic.h>
#include <Xm/Xm.h>
#include <Xm/Text.h>
#include <Xm/CutPaste.h>

/*~LOC*DATA********************************************************************
 * Constants and variables local to this package. */

#define CBLABEL "TextEdit"

/*~PROC************************************************************************
 * Procedure: CopyToClipboard
 * Synopsis: Retrieve selected text from reference textID, and copy it to the
 * system clipboard. Returns True if successful, False if not.
 * Assumptions:
 * Features Supported:
 * Known Bugs/Deficiencies: Text only supported.
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
int
CopyToClipboard( Widget textID, Time timestamp )
{

 char *selectedText; /* That text which is marked as selected in textID */
 int clipStat; /* Return value from XmClipboard routines */
 XmString clipLabel; /* The label used to identify the clipboard string */
 long clipID, copyID; /* Handles used in identifying clipboard transactions */

 /* Sanity check. */
 if ( !XmIsText( textID ) )
 {
 printf( "CopyToClipboard: Not Text Widget\n" );
 exit( 1 );
 }

 /* If no text selected, we can leave. */
 selectedText = XmTextGetSelection( textID );
 if ( selectedText == NULL )
 {
 return( False );
 }

 /* Create the label that appears in the clipboard. */
 clipLabel = XmStringCreateLtoR( CBLABEL, XmSTRING_DEFAULT_CHARSET );


 /* Poll the clipboard, asking for permission to start. */
 clipStat = ClipboardLocked;
 while( clipStat == ClipboardLocked )
 {
 clipStat = XmClipboardStartCopy( XtDisplay( textID ), XtWindow( textID ),
 clipLabel, timestamp, textID, NULL,
 &clipID );
 }

 /* Copy the data to the clipboard until successful. */
 clipStat = ClipboardLocked;
 while( clipStat == ClipboardLocked )
 {
 clipStat = XmClipboardCopy( XtDisplay( textID ), XtWindow( textID ), clipID,
 XtRString, selectedText, (long)strlen( selectedText ), 0,
 &copyID );

 }

 /* End the transaction... */
 clipStat = ClipboardLocked;
 while( clipStat == ClipboardLocked )
 {
 clipStat = XmClipboardEndCopy( XtDisplay( textID ), XtWindow( textID ),
 clipID );

 }

 /* ... cleanup, and leave. */
 XtFree( selectedText );
 XmStringFree( clipLabel );

 return( True );
}

/*~PROC************************************************************************
 * Procedure: RetrieveFromClipboard
 * Synopsis: Return text from the clipboard.
 * Assumptions: The caller assumes responsibility for freeing returned string.
 * Features Supported:
 * Known Bugs/Deficiencies: Text only supported.
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
char *
RetrieveFromClipboard( Widget textID, Time timestamp )
{

 char *pasteText; /* Text to be retrieved from the paste buffer */
 unsigned long pasteTextLen; /* Length of text in clipboard */
 int clipStat; /* Return value from XmClipboard routines */
 XmString clipLabel; /* The label used to identify the clipboard string */
 long clipID, privateID; /* Handles used in identifying clipboard transactions */
 unsigned long outlen; /* Length of data retrieved from clipboard */

 /* Check to be sure that we have a text Widget */
 if ( !XmIsText( textID ) )
 {
 printf( "RetrieveFromClipboard: Widget not Text\n" );
 exit( 1 );

 }

 /* Start our clipboard transaction */
 clipStat = ClipboardLocked;
 while( clipStat == ClipboardLocked )
 {
 clipStat = XmClipboardStartRetrieve( XtDisplay( textID ), XtWindow( textID ),
 timestamp );
 }

 /* Get the length of the clipboard contents */
 clipStat = ClipboardLocked;
 pasteTextLen = 0;
 while( clipStat == ClipboardLocked )
 {
 clipStat = XmClipboardInquireLength( XtDisplay( textID ), XtWindow( textID ),
 XmRString, &pasteTextLen );
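 if ( clipStat == ClipboardNoData )
 {
 /* Release the clipboard before giving up. */
 XmClipboardEndRetrieve( XtDisplay( textID ), XtWindow( textID ) );
 return( NULL );
 }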
 }

 /* Retrieve the data (allocating a string buffer) */
 pasteText = XtMalloc( pasteTextLen + 1 );

 clipStat = ClipboardLocked;
 while( clipStat == ClipboardLocked )
 {
 clipStat = XmClipboardRetrieve( XtDisplay( textID ), XtWindow( textID ),
 XmRString, pasteText, pasteTextLen,
 &outlen, &privateID );
 }

 /* CopyToClipboard stores strlen() bytes, so add the terminating NUL here. */
 pasteText[pasteTextLen] = '\0';

 /* End the clipboard session... */
 clipStat = ClipboardLocked;
 while( clipStat == ClipboardLocked )
 {
 clipStat = XmClipboardEndRetrieve( XtDisplay( textID ), XtWindow( textID ) );
 }

 /* ... and return the clipboard contents. */
 return( pasteText );

}

/*~PROC************************************************************************
 * Procedure: ClipboardIsEmpty
 * Synopsis: Returns True if there are no text items in the clipboard.
 * Assumptions:
 * Features Supported:
 * Known Bugs/Deficiencies: Text only supported. Returns True (no data) if
 * the clipboard is locked.
 * Modification History: 11/01/90 twl original
 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
*/
int
ClipboardIsEmpty( Widget w )
{

 int clipStat; /* Clipboard status value */

 unsigned long textLength; /* Length of text in clipboard */

 clipStat = XmClipboardInquireLength( XtDisplay( w ), XtWindow( w ),
 XmRString, &textLength );

 if ( clipStat == ClipboardSuccess )
 {
 return( False );
 }
 else
 {
 return( True );
 }
}






[LISTING FIVE]

#ifndef XM_CALLBACKS_H
#define XM_CALLBACKS_H
/*****************************************************************************
 * Include File Name: xm_callbacks.h
 * Contents: Interface to the callbacks package.
 * This include file is dependent on the following include file(s):
 * None.
 * Modification History: 11/01/90 twl original
 */


/*~EXP*PROC********************************************************************
 * Procedures and functions exported by this package. */
extern void
MapDialogCB( Widget source, Widget dialog,
 XmAnyCallbackStruct *callbackData );

extern void
UnMapDialogCB( Widget source, Widget dialog,
 XmAnyCallbackStruct *callbackData );

extern void
CutCB( Widget source, Widget textID, XmAnyCallbackStruct *callbackData );

extern void
CopyCB( Widget source, Widget textID, XmAnyCallbackStruct *callbackData );

extern void
PasteCB( Widget source, Widget textID, XmAnyCallbackStruct *callbackData );

extern void
PasteFileCB( Widget source, Widget textID,
 XmAnyCallbackStruct *callbackData );

extern void
WriteFileCB( Widget source, Widget textID,
 XmAnyCallbackStruct *callbackData );

extern void
FileDialogOKCB( Widget source, Widget textID,
 XmFileSelectionBoxCallbackStruct *callbackData );

extern void
PromptDialogOKCB( Widget source, Widget textID,
 XmSelectionBoxCallbackStruct *callbackData );

#endif






[LISTING SIX]


#ifndef XM_CLIPBOARD_H
#define XM_CLIPBOARD_H
/*****************************************************************************
 *
 * Include File Name: xm_clipboard.h
 *
 * Contents:
 * Interface to the Clipboard package.
 *
 * This include file is dependent on the following include file(s):
 * None.
 *
 * Modification History:
 * 11/01/90 twl original
 */


/*~EXP*PROC********************************************************************
 *
 * Procedures and functions exported by this package.
 */
extern int
CopyToClipboard( Widget textID, Time timestamp );

extern char *
RetrieveFromClipboard( Widget textID, Time timestamp );

extern int
ClipboardIsEmpty( Widget w );

#endif






[LISTING SEVEN]

#
# Makefile to build textedit
#

#
# Macros
#


CC=/bin/cc
DEBUG=-g
INCLUDE_DIRS=-I /usr/include/Xm -I /usr/include/X11
SYS_DEFS=$(SYS_T) $(RUN_T) -DSYSV
CC_SWITCHES= -c $(SYS_DEFS) $(INCLUDE_DIRS) $(DEBUG)

LD=/bin/ld
LIBDIRS=-L/usr/X11/lib
LIBS=-lXm -lXtm -lXaw -lX11
LD_SWITCHES=$(LIBDIRS) $(LIBS)

#
# Inference rules
#
.SUFFIXES: .c .o .ln

.c.o:
 $(CC) $(CC_SWITCHES) $<

OBJS=\
 xm_main.o\
 xm_clipboard.o\
 xm_callbacks.o

#
# Targets
#

all: textedit

textedit: $(OBJS)
 $(LD) -o $@ $(OBJS) $(LD_SWITCHES)

xm_main.o: xm_callbacks.h

xm_callbacks.o: xm_clipboard.h

#-------------------------
# Misc targets
#-------------------------
clean:
 -rm *.bak *.o

lint:
 lint $(INCLUDE_DIRS) -DSYSV *.c
















February, 1991
PROGRAMMING PARADIGMS


A Programmer Over Your Shoulder




Michael Swaine




List A


developing algorithms
practicing programming
benchmarking code
pursuing efficiency


List B


discovering algorithms
teaching programming
documenting code
pursuing elegance

Consider the two lists above. What distinguishes all the items of list B from
all the items of list A? Your answer may be different from mine; what I had in
mind was, for lack of a better term, softness. Most of us, I think, would
describe the B items as softer, or less rigorous, than the A items. The items
in list A have to do with mathematics, logic, and problem solving. The items
in list B have to do with psychology, pedagogy, and taste.
Lots of books and articles have been written about A list topics, and most of
them take a mathematical, logical, or problem-solving approach. A number of B
list books and articles have been written, too, and some of them are excellent
-- Polya's books, How to Solve It and Mathematical Discovery, for example.
What we rarely see are books on B list topics that approach their subject
matter from a rigorous mathematical or logical or problem-solving perspective.
I've found one. It's called On the Shape of Mathematical Arguments, but I
think it should be called The Programmer Over Your Shoulder.
The digression which follows will explain why.
I've long thought that there should be a software developer's version of The
Reader Over Your Shoulder, a gutsy book by Robert Graves and Alan Hodge for
writers and editors. It's gutsy in two ways. In the first half of the book,
Graves and Hodge have the courage to lay down a list of principles to which
good writing should adhere, and it is a long and explicit list. Many other
pedagogues of writing have set down lists of basic writing principles, but the
principles are usually vague, and when they aren't they are often wrong.
Graves and Hodge manage to be precise and correct at the same time. This is
impressive not only as an achievement, but also as a gutsy move, because an
awful lot of writers would prefer not to believe that there are that many
rigorous rules of good writing. It's art, and they'd like to keep it that way.
(I hope it's clear that this digression is not wholly irrelevant to the
process of software development and to the mental processes of [some] software
developers.)
Back to the digression: The second half of The Reader Over Your Shoulder is
more audacious. In it, Graves and Hodge present what they call "examinations"
and "fair copies." They reprint short passages from known writers such as H.G.
Wells, George Bernard Shaw, and Winston Churchill. Each passage looks as it
did on original publication, except for the intrusion of dozens of
superscripted numbers and letters. These refer to the pointed and often funny
footnotes in which Graves and Hodge detail the writing errors committed by
the famous author. Then Graves and Hodge rewrite the passage in good English.
As I say, gutsy. It's an old book; a lot of these authors were alive when it
was written.
I was so taken with The Reader Over Your Shoulder that I included a chapter
titled "The Programmer Over Your Shoulder" in my HyperTalk book. (Yes, this is
a Personal Aside within a Digression. Only expert writers should attempt such
a tricky maneuver.) My chapter didn't live up to its title. The point I was
trying to make with the title was that HyperCard gave HyperTalk scripters an
unprecedented opportunity to study other scripters' code and critique it,
because there was a lot of it around and because HyperCard retained the source
code in its stacks, not that I had actually done an examination and fair copy
on professional programmers' work.
Antonetta van Gasteren has done just that.
Or something very much like it. (Digression over, man.) In On the Shape of
Mathematical Arguments, van Gasteren, a colleague of Edsger Dijkstra, doesn't
take on programmers so much as algorists. She examines and fair-copies several
published algorithms and proofs, making their authors look a little foolish.
The logic behind her book's title, incidentally, is that she thinks the term
mathematical argument covers a lot: everything from formal mathematical proofs
through the expression of algorithms for publication and the writing of
readable and maintainable code, to the writing of documentation and teaching
of programming, computer science, and mathematics in textbooks and lectures.
The intriguing thing, at least if you buy Dijkstra's argument in the Foreword,
is that all these things do fit together, precisely because of the rigorous
approach she has taken to the shape of mathematical arguments. Design and
presentation emerge as two sides of the same coin, Dijkstra says, because in
this united setting, the issues involved are purely technical ~~~ and have
nothing to do with in~ or taste.
Here's how van Gasteren puts it:
...many hold the opinion that mathematical and
expositional style are purely (or at best largely) a matter of personal taste.
Admittedly, there is no such thing as the best proof or the rule of thumb that
always works, but what I hope to show is the existence of a variety of
technical criteria by which one argument can be shown to be objectively less
streamlined than another.... It has turned out that a lot can be said about
mathematical arguments in general that is independent of the particular area
of mathematics that the argument comes from.
Just as with Graves and Hodge's book, van Gasteren's critiques of published
algorithms and proofs take up only half her book, the other half being
devoted to laying out objective principles for the presentation of algorithms
and other mathematical arguments. Her examinations and fair copies are there
to exemplify her principles.
Here's an example of how she recasts a mathematical argument. The following is
a typical statement of a particular maximization problem, as she paraphrases
it from the literature.
Given an ascending sequence a[i], 0<=i<N, of natural numbers and a sequence
b[i] of natural numbers, we are requested to maximize
(SUM i: 0<=i<N: a[i]*b[p(i)]), where p ranges over the permutations of
i: 0<=i<N.
The immediate question that occurs to anyone reading this is, "Why would I
want to maximize that?" Tempus fugit. It would occur to anyone but a
mathematician, at least.
The problem is a mystery, and it need not be; the trouble with the problem as
stated is that it isn't really asking the question that it wants to ask.
What the problem is really after does not depend on the order of the b[i], but
because the statement of the problem is unfortunately written in terms of
sequences, we have to deal with order. This is an error of overspecificity,
and we can see the consequences of it. In order to undo the overspecificity,
we have to unsequence the b[i] by permuting their subscripts. That's why we
have this p ranging over permutations of subscripts. A problem that has
nothing to do with permutations gets permutations slipped into its formulation
in order to undo the damage caused by introducing sequences where order is not
important, and the result is a mess.
In fact, one group of authors (mathematicians Hardy, Littlewood, and Polya)
discuss this very problem as an example of a problem in rearrangements,
apparently deluded by their own notation into thinking that the problem is
some other problem than the problem it is, which van Gasteren gently
ridicules.
She dumps sequences entirely in favor of bags -- unordered collections -- of
natural numbers, and the real problem becomes much clearer. Here's the problem
expressed without sequences:
Match up pairs of natural numbers from two bags, so as to maximize a simple
function. The function is computed by multiplying together the two numbers in
each pair, and summing.
That's my version of van Gasteren's statement of the problem, and I've
cheated, of course. It's not at all rigorous. But it does state a problem one
can imagine running into in real (programming) life, and you can tell what
it's about. It's obviously not about sequences or permutations. The van
Gasteren version is rigorous, and it still has the virtues she's championing:
...consider couplings, i.e., one-to-one correspondences, between two equally
sized finite bags of natural numbers. Hence, a coupling can be considered a
bag of -- ordered -- pairs of numbers, the subbags of which are as usual
called its sub-couplings. The value of a coupling is defined recursively by
- the value of an empty coupling is 0;
- the value of a one-element coupling is the product of the members in the
single pair;
- the value of a nonempty coupling is the value of one element + the value of
the remaining sub-coupling.
The problem is to construct a coupling with maximum value.
This is longer than the statement of the problem in terms of sequences and
permutations, but if you are familiar with the terminology and with the format
of recursive definitions, this statement of the problem is very clear. Some of
its bulk is devoted to introducing the terminology of bags and couplings, and
some comes from using words that represent concepts rather than symbols that
represent nonce variables. There are no symbols at all here except the numeral
0.
Not introducing names that you don't need is one of van Gasteren's principles.
The original formulation of the problem introduced names for all the elements
in the sequences and consequently for the lengths of the sequences and for the
permutation. We don't need any of these names, nor do we care about the things
that they name.
By not introducing names for things that don't matter, van Gasteren forces
herself to come up with a formulation for the thing she's trying to maximize,
and this in turn leads her to a simple recursive construction for the maximum.
Since, in this problem, van Gasteren's goal is to develop a proof rather than
an algorithm, I won't spell out further details. Her proof, though, does turn
out to be short and simple, and proofs of this problem are generally messy.
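The restated problem is easy to try out in code. The sketch below is not van Gasteren's derivation; it assumes the classical result (the rearrangement inequality) that a sum of pairwise products is largest when the largest members of the two bags are coupled, then the next largest, and so on. The names max_coupling_value and cmp_unsigned are invented for the illustration.

```c
#include <stdlib.h>

/* Ascending comparison for qsort over unsigned values. */
static int cmp_unsigned(const void *lhs, const void *rhs)
{
    unsigned a = *(const unsigned *)lhs;
    unsigned b = *(const unsigned *)rhs;
    return (a > b) - (a < b);
}

/*
 * max_coupling_value (hypothetical name): value of a maximum coupling of
 * two equally sized bags of natural numbers. Both arrays are sorted in
 * place; their original order carries no meaning, which is the point of
 * treating them as bags rather than sequences.
 */
unsigned long max_coupling_value(unsigned *bag_a, unsigned *bag_b, size_t n)
{
    unsigned long value = 0;    /* the value of an empty coupling is 0 */
    size_t i;

    qsort(bag_a, n, sizeof *bag_a, cmp_unsigned);
    qsort(bag_b, n, sizeof *bag_b, cmp_unsigned);

    /* Couple the k-th smallest of one bag with the k-th smallest of the
     * other, adding each pair's product to the running value. */
    for (i = 0; i < n; i++)
        value += (unsigned long)bag_a[i] * bag_b[i];

    return value;
}
```

Called on the bags {1, 3, 2} and {5, 4, 6}, the function couples 1 with 4, 2 with 5, and 3 with 6, and returns 32. No permutation p appears anywhere; sorting is only a device for lining up matching maxima.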
Another principle demonstrated in this problem is the principle of maintaining
symmetry. The original statement of the problem speaks of an ascending
sequence a and a sequence b. In fact, the problem is symmetric in the two
collections of numbers, but this formulation masks that symmetry. It is van
Gasteren's contention that breaking symmetry is bad, and that maintaining
symmetry can lead to deeper insights into problems and to simpler solutions.
A very simple demonstration of this uses a game involving bit strings. The
problem is to prove that the game terminates. In this game, a finite-length
bit string is repeatedly transformed as follows:
 00 --> 01
 11 --> 10,
wherever in the string and for as long as such transformations are possible.

There are two cases here, but building a solution around the two cases
overlooks the symmetry in the problem. The approach van Gasteren recommends is
to define x as a pair of matching adjacent bits and y as a pair of nonmatching
adjacent bits. Then the game becomes
x --> y,
and the argument for termination is simple: The leftmost bit does not change,
and every x is eventually turned into a y.
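A small simulation makes the single-rule view concrete. This is only a sketch under stated assumptions: the string is held as '0' and '1' characters, and the names play_game and find_matching_pair are hypothetical. Because both rules flip the right-hand bit of a matching pair, the code needs one rule, not two cases.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find a pair of matching adjacent bits (an "x" in the terms used above).
 * Returns the index of the pair's left bit, or -1 if there is none.
 * Scans from the right when from_right is nonzero, else from the left.
 */
static long find_matching_pair(const char *bits, size_t len, int from_right)
{
    size_t k;

    for (k = 0; k + 1 < len; k++) {
        size_t i = from_right ? (len - 2 - k) : k;
        if (bits[i] == bits[i + 1])
            return (long)i;
    }
    return -1;
}

/*
 * play_game (hypothetical name): apply 00 --> 01 and 11 --> 10 in place
 * until no move is left. Returns the number of moves made.
 */
unsigned long play_game(char *bits, int from_right)
{
    size_t len = strlen(bits);
    unsigned long moves = 0;
    long i;

    while ((i = find_matching_pair(bits, len, from_right)) >= 0) {
        /* One rule covers both cases: flip the right bit of the pair. */
        bits[i + 1] = (bits[i + 1] == '0') ? '1' : '0';
        moves++;
    }
    return moves;
}
```

Starting from "0000", choosing the leftmost pair each time finishes in 2 moves and choosing the rightmost takes 6, but both end at "0101": the leftmost bit is untouched and no matching pair survives.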
The key to the approach is the recognition and exploitation of the problem's
symmetry in 0 and 1.
Choosing what to name and what not to name and exploiting symmetries in
problems are two of the areas van Gasteren delves into. Others include:
avoiding proof by cases, the exploitation of equivalence, degree of detail in
arguments, and linearization of arguments. Here are some of her principles
regarding naming:
Name as little as possible. The arbitrary identifier that is used only once is
easy to spot, but still occurs in mathematical arguments regularly. Names used
more than once can often be entirely unnecessary: In triangle ABC, the
bisectors of the angles A, B, and C respectively are concurrent. That can be
rewritten: In a triangle the angle bisectors are concurrent.
Watch out for the phrase without loss of generality we can choose ... It is a
warning that the author is about to introduce an overspecific nomenclature
that will cover up symmetries in the problem.
Name everything that needs a name. The warning sign here is the repetition of
long or similar expressions. If you really need the expression, then it
probably needs a name.
Name appropriately. Some objects have internal structure and some don't. Some
arguments depend on the internal structure of some objects, and some don't.
Don't use a name that emphasizes the internal structure of an object if you
don't need that internal structure in the argument.
Name the right thing. If it's the function or its value that you're talking
about, rather than the application of the function to its argument or
parameter, use f instead of f(x). If you're deriving an algorithm for
computing the coordinates of a pixel on the screen and your derivation is
full of expressions like (X-X0) and (Y-Y0), you're probably naming the wrong
thing. Change coordinate systems, define X to be (x+X0) and Y to be (y+Y0),
and you can replace all the (X-X0)s and (Y-Y0)s with x's and y's.
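The same advice carries over from derivations to code. The pair of functions below is an invented illustration, not an example from the book: a test of whether a pixel lies within a circle of radius r centered at (X0, Y0), written first with the origin dragged through every expression and then with the coordinate change made once.

```c
/* Before: every expression carries the origin (X0, Y0) along. */
int inside_circle_verbose(int X, int Y, int X0, int Y0, int r)
{
    return (X - X0) * (X - X0) + (Y - Y0) * (Y - Y0) <= r * r;
}

/* After: change coordinates once, then name only what the argument uses. */
int inside_circle(int X, int Y, int X0, int Y0, int r)
{
    int x = X - X0;     /* origin-relative coordinates */
    int y = Y - Y0;

    return x * x + y * y <= r * r;
}
```

Both functions return the same answers; the second simply names x and y, the quantities the test is actually about.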
One more naming topic: van Gasteren discusses symmetry-preserving and
symmetry-masking terminology, and presents the example of the binomial
coefficient. A lot of space in mathematics books is taken up with presenting
identities involving binomial coefficients. She thinks that a
symmetry-preserving notation would eliminate the need for a lot of those
identities. She points out that
$\binom{n}{k}$
is a function of n, k, and n-k, and is symmetric in k and n-k. The
conventional notation obfuscates the symmetry, and the fact that
$\binom{n}{k} = \binom{n}{n-k}$
has to be presented as a theorem. She suggests using some such notation as
P.i.j, in which i and j stand in the place of k and n-k in the conventional
formulation above. This reflects, as well as a left-to-right writing system
can, the symmetry between i and j, and it cuts a wide swath through those long
lists of binomial identities. It also brings an interesting symmetry into some
of the identities it leaves behind.
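To see what the symmetric form buys, here is a minimal sketch that takes P.i.j to mean (i+j)!/(i! j!), which is the conventional coefficient with k = i and n-k = j. The function name sym_binom is hypothetical. With the arguments in symmetric roles, the theorem above becomes the near-tautology P.i.j = P.j.i, and Pascal's rule reads P.i.j = P.(i-1).j + P.i.(j-1); that second form is offered here as an illustration, not as something quoted from the book.

```c
/*
 * sym_binom (hypothetical name) computes P.i.j = (i+j)! / (i! * j!),
 * i.e. the conventional "n choose k" with k = i and n - k = j.
 * Overflow is not checked; keep the arguments small.
 */
unsigned long sym_binom(unsigned i, unsigned j)
{
    unsigned long result = 1;
    unsigned t;

    /* After step t, result equals P.i.t, so each division is exact. */
    for (t = 1; t <= j; t++)
        result = result * (i + t) / t;

    return result;
}
```

For example, sym_binom(2, 3) and sym_binom(3, 2) both return 10, and their sum is sym_binom(3, 3), which is 20.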
I have focused on van Gasteren's naming principles here, but she has a lot to
say about symmetry, too, and about the other topics mentioned. It's a readable
book, with some ideas definitely worth considering. It was published by
Springer-Verlag in 1990.
A note on van Gasteren and computer science. Although van Gasteren more often
speaks of proofs than of algorithms, her work was motivated more by the needs
of computer science than of mathematics: "... the explorations reported here,"
she says, "have been inspired by computing's needs and challenges...."
That's interesting, since the advent of computers has brought into mathematics
a new proof technique that van Gasteren probably hates: the zillion-case
proof, as employed in solving the four-color problem. You know, the approach
in which a problem is broken down into a huge number of special cases and
these are cranked through via computer, and you end up knowing what but having
no idea why.












































February, 1991
C PROGRAMMING


CUA and Data Compression




Al Stevens


The East Coast version of Software Development '90 was held November 13-16 in
the Omni Parker House hotel, the oldest hotel in Boston. SD is the annual
Miller Freeman conference for programmers, and this was the first eastern
edition. It featured exhibits, lectures, and workshops that interest
programmers. SD East was small by comparison to the older, established West
Coast conference. There were only about 30 exhibitors, and attendance was
about the same as the first West Coast SD three years ago. But there were
plenty of lectures and workshops to attend. Tom Plum, Robert Ward, Jack
Purdum, and others presented topics that would interest C programmers at all
levels. Larry Constantine and Ken Orr lectured on software development
methodologies. P.J. Plauger lectured on heresies in programming and software
management, one of which is that if you do not understand the latest trendy
methodology, it is probably not your fault.
Intel announced their 386/486 C Code Builder Kit at SD East. This product
combines a 32-bit C compiler, libraries, librarian, linker, make utility, DOS
extender, and debugger. Although some of these products have been around for a
while, this is the first time that Intel has marketed them as an integrated
retail package. They emphasize that their customers prefer and will benefit
from a single-vendor solution, but the package does not include an editor or
profiler, so, until Intel adds them, you will need to look elsewhere for those
capabilities. Intel does not yet support Windows development. They said that
when they do, you might need the Microsoft Windows Software Development Kit
because they have not decided whether they will develop a look-alike. The
Intel compiler and libraries are compatible with Microsoft C. I think that
their emphasis on the single-vendor solution as a marketing device is a
mistake. There is no way that Intel will be able to offer every library, tool,
and utility that programmers need, and programmers know that. If, by their
marketing strategies, they make you believe that the strength of their product
is that it is a single-vendor solution, then whatever they leave out will draw
attention away from whatever its real strengths might be.
Borland showed Turbo Pascal 6.0, which includes a new package called Turbo
Vision, something that Turbo C++ programmers should demand. Jeff Duntemann
talked about TP 6.0 and Turbo Vision in his "Structured Programming" column
(DDJ, December 1990). Turbo Vision is what C++ programmers would call a class
library. It implements something close to the IBM Systems Application
Architecture (SAA) Common User Access (CUA) interface. CUA is the standard
with which Windows 3.0 and OS/2 Presentation Manager comply. The Turbo Vision
library includes application windows, dialog boxes, radio buttons, command
buttons, lists, text boxes, a 64K editor object, and so on. It implements
these things in the IBM text mode with full mouse support, allowing DOS
text-mode programs to resemble Windows programs. More importantly, Turbo
Vision provides a way for DOS programmers to use the CUA interface, not just
as a Windows look-alike but as a way to comply with an emerging standard. With
Turbo Vision, you do not need to write code for the mouse and keyboard, you do
not need to write menu code, you do not need to write a text editor, and so
on. But what is better, you do not need to write user documentation or
detailed help screens that describe your unique user interface. That is the
number one advantage of a common user interface, both for programmers and
users. Good or bad, the look and feel of one application is much the same
among all complying applications. Inasmuch as CUA is descending upon us as a
de facto standard, C and C++ programmers will soon need such libraries.
My imagination fired up by Turbo Vision, I prowled the exhibitions looking for
such a text-mode CUA library for C or C++. I did not find one there, but
several vendors said they are thinking about doing it. Two packages that come
close are the Zinc Interface Library and Magma Software's Mewel 3, but neither
company exhibited at SD.


The Zinc Interface Library


DDJ reviewed the Zinc Interface Library for Turbo C++ programmers in December
1990. Jeff Duntemann's column also referred to the Zinc library, calling it
"SAA-compliant." It does not seem to be a complete implementation of SAA CUA,
however. For example, I could not find support for radio buttons or clipboard
operations, both of which are components of the SAA CUA. But because Zinc is a
class library that works with Turbo C++, you can probably extend it to add the
missing parts of CUA by deriving and adding new classes.


Mewel 3


C programmers still need a CUA function library, however, and Magma Software
Systems has a product called Mewel 3 that works with Microsoft C and the C
compiler component of Turbo C++. It is a C function library that uses the same
API as Windows 3. To use it, you must know how to write Windows 3 programs,
which is not easy. Your programs, however, will be reasonably portable between
the DOS text-mode and the Windows 3 GUI platforms. This feature offers
advantages to several different development environments. Windows programmers
can port their applications to DOS with a minimum of fuss, thus expanding
their potential user base. Developers of new programs can target them for both
platforms. Windows developers can use the DOS platform for program
development, avoiding the clumsier Windows debugging environment.
I ran the Mewel demonstration programs, and their look-and-feel approximates
the Windows user interface. There are some occasional minor differences -- for
example, double-clicking the control box does not always close the window --
but these are small points. One thing that concerns me is the size of the
executable programs. The simplest programs are well in excess of 100K, some
exceeding 300K. Maybe there's a penalty for using CUA. Maybe it could be done
better. We'll know when the other vendors offer their CUA libraries.
The Mewel documentation explains the library reasonably well. It is, however,
replete with typographical, usage, and grammatical errors. The Mewel license
concerns me. It is one of those pointless licenses that allow you to make
only one backup copy and use the product on only one of your computers.
Furthermore, you are not allowed to "create other works based on the
Documentation," which is presumptuous; Mewel itself is based very closely on
SAA CUA and the Windows API. Finally, for some oddball reason, the license
prohibits you from using Mewel to develop a word processor or text editor
program. I do not understand that limitation. A developer should clarify these
points with Magma before launching into a project that uses Mewel as the
interface library.
Because Microsoft Press now publishes the Windows 3 SDK Programmer's
Reference, Guide to Programming, and Programming Tools as separate books, a
programmer does not need to buy the SDK to get the API documentation. Vendors
such as Magma who clone the Windows API have their work made easier for them.
They do not need to develop extensive programmer documentation if their
libraries are true clones of the SDK. Watch for my "Examining Room" article on
Mewel 3 in an upcoming issue of DDJ.


Encryption and Compression and Who Owns What?


No single subject in this column has stirred more response than the two
columns devoted to the Data Encryption Standard last year. It seems that
encryption is a hobby for a lot of programmers. Many of you sent me your
encryption programs not only for the DES algorithm but for other encryption
methods as well. I now have an extensive library of encryption algorithms as a
result of your contributions. Now if I only had some secrets.
Some readers expressed their concern that I was publishing an implementation
of the DES algorithm, thinking that I was somehow violating national security
interests. If the government can publish the details of the algorithm in a
booklet that is available to everyone, then programmers should be free to
write code that implements the algorithm. No guys in trench coats have been
knocking on my door, so I guess I'm safe.
This question raises the specter of software patents again. DDJ has taken an
interest in the issue. We published code that implements the LZW data
compression algorithm in June, and that article lit a fuse. It seems that
someone patented the algorithm. This causes me to wonder what is protected by
such patents. If the details of an algorithm are public knowledge -- as are
those of LZW and DES -- does the patent holder own all expressions of that
algorithm, or does he or she own rights only to implementations of the
algorithm? What is the difference? Is the publication of source code an
implementation of an algorithm or merely an expression of how it works?
Obviously if only the implementation is protected and if the publication of
source code is not an implementation, then magazines can print the programs
but readers cannot use them. What good would that be other than as an exercise
in how algorithms work and how the patent system does not?
The League for Programming Freedom wrote the article "Software Patents" (DDJ,
November 1990). They abhor the practice and say why. I chatted with some of
the League's members in Boston. By coincidence, the November issue of Boston
Magazine has an article that discusses the League and its founder, Richard
Stallman. Most of us know Stallman as the author of the EMACS editor and GNU,
the Unix look-alike. You have to buy GNU for $150 from Stallman's other group,
the Free Software Foundation, but you are permitted to give away copies. Unix
costs $900, and you may not give away copies. I've never read where GNU is at
least one-sixth as good as Unix, so I don't know if it's a deal or not.
According to the Boston Magazine article, Stallman, founder of the League for
Programming Freedom and the Free Software Foundation, will write programs for
you for $260 an hour as long as the program is not proprietary. Software
should be free, but Stallman sure isn't.
I suppose that under the "not proprietary" restriction, Stallman's clients can
sell the programs he writes but cannot prevent their buyers from giving them
away. Any takers? How about in-house stuff? Most clients who develop custom
software strictly for internal use consider it to be proprietary, perhaps to
keep it from their competitors. Hmm. I'd follow Richard's example, but I don't
think I could live on $520 a year.


Huffman Compression Trees


I don't know if Huffman trees involve a proprietary algorithm. But my research
for last month's "Memory Lane" column did not find the algorithm in the DDJ
morgue, so I'll take the chance and address it now. Maybe I'll be writing next
month's column from the federal country club at Eglin Air Force Base. That way
I can talk to my lawyer, my banker, and some fellow pilots every day.
The Huffman algorithm is a form of data compression. Others are LZW, mentioned
above, and run-length encoding (RLE), where strings of the same character are
replaced by a character count and the character. I used RLE in one of the
encryption programs in November. There have been other RLE articles in past
issues of DDJ. Contemporary file compression utility programs use combinations
of several compression algorithms, examining each file to see which algorithm
yields the best compression ratio. You are not likely to write yet another
general-purpose file compression utility program, but you might need to use
compression in one of your applications. Developers who distribute software on
diskettes often compress it and have the installation program decompress it
for the user. If you use one of the commercial or shareware compression
utilities, you might have to pay for a license to distribute the decompression
program. At the very least, your installation procedure might have to display
the copyright notice of the company or person who owns the program. With the
programs in this column and the LZW program from last June, you can write your
own compression/decompression programs.
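For readers who have not seen RLE, here is a minimal sketch of the idea. The function name rle_encode and the count-then-character byte layout are assumptions made for illustration; they are not necessarily the format used by the November programs.

```c
#include <stddef.h>

/* Hypothetical helper: run-length encode src[0..n) into dst.
 * Each run becomes two bytes: a count (1..255), then the character.
 * dst must have room for 2*n bytes in the worst case.
 * Returns the number of bytes written to dst. */
size_t rle_encode(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t i = 0, out = 0;

    while (i < n) {
        unsigned char ch = src[i];
        size_t run = 1;

        /* extend the run while the character repeats, capped at 255 */
        while (i + run < n && src[i + run] == ch && run < 255)
            run++;
        dst[out++] = (unsigned char)run;
        dst[out++] = ch;
        i += run;
    }
    return out;
}
```

Encoding the seven bytes "aaaabbc" this way produces six bytes: 4,'a', 2,'b', 1,'c'. Text with few repeated characters grows instead of shrinking, which is one reason the utilities try more than one algorithm.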
The Huffman compression algorithm assumes that data files consist of patterns
of characters where some bytes occur more frequently than others. This is true
for English language text files. We use some letters more often than others.
By analyzing a file, the algorithm builds an array that identifies the
frequency of each character. Then it builds the Huffman tree structure from
the frequency array. The purpose of the tree is to associate each character
with a bit string. The more frequent characters get shorter bit strings; the
less frequent ones get longer bit strings, and so the data in the file can be
compressed.
To compress the file, the Huffman algorithm reads the file a second time,
translates each character into the bit string assigned to it by the Huffman
tree, and writes the bit strings to the compressed file.
The decompression algorithm reverses the process. It uses the frequency array
that the compression pass created to build the Huffman tree. Some applications
use a global constant frequency array based on an empirical understanding of
the database. With this technique, the program does not have to write the
array to the compressed file. Other applications allow each file to have a
unique frequency array. The compression pass writes the unique array to the
compressed file and the decompression pass reads it and rebuilds the tree
before decompressing the bit strings. This approach can generate a larger
compressed file than the original from a short file or from one with a
relatively even character distribution.


Compression


To understand how Huffman compresses data, you must observe how it builds the
tree. Consider this sentence in an example file: now is the time to compress.
To build the tree you first must build a frequency array. This array will tell
you the frequency of each character in the file. The frequency array for the
example file looks like Table 1.
Table 1: The frequency array for our example file.


 Character Frequency
 --------------------

 ' ' 5
 e 3
 o 3
 s 3
 t 3
 i 2
 m 2
 c 1
 h 1
 n 1
 p 1
 r 1
 w 1
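
The counts in Table 1 can be reproduced with a sketch like the one below. The name count_freq is hypothetical; the programs in this column do their counting inline in Listing Three.

```c
#include <stddef.h>

/* Hypothetical helper: tally how often each byte value occurs in buf.
 * freq must have 256 entries; they are zeroed before counting. */
void count_freq(const unsigned char *buf, size_t n, unsigned freq[256])
{
    size_t i;

    for (i = 0; i < 256; i++)
        freq[i] = 0;
    for (i = 0; i < n; i++)
        freq[buf[i]]++;     /* the byte value is the array index */
}
```

Run over the 27 characters of "now is the time to compress", it reports five spaces, three each of e, o, s, and t, and so on down the table.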

The next step is to build the Huffman tree. The tree structure contains nodes,
each of which contains the character, its frequency, a pointer to a parent
node, and pointers to left and right child nodes. The tree contains entries
for all 256 possible characters and 255 parent nodes. At first there are no
parent nodes. The tree grows by making successive passes through the existing
nodes. Each pass searches for the two nodes that have not grown a parent node
and that have the two lowest frequency counts. When the program finds those
two nodes, it allocates a new node, assigns it as the parent to the two nodes,
and gives the new node a frequency count that is the sum of the two child
nodes. The next iteration of the search ignores those two child nodes but
includes the new parent node. The passes continue until only one node with no
parent remains. That node will be the root node of the tree.
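The pairing procedure can be tried out without any pointers at all. The sketch below is not the code from the listings; huff_bits is a hypothetical helper that repeats the same scan-and-join passes on plain arrays and adds up the counts of the parent nodes it creates. Each join adds one bit to the code of every character beneath it, so that sum is the number of bits the compressed data will occupy.

```c
#include <stddef.h>

#define MAXNODES 511    /* 256 leaves plus at most 255 parents */

/* Hypothetical helper: build the tree by repeated scans and return
 * the total number of bits needed to encode the counted data. */
unsigned long huff_bits(const unsigned freq[256])
{
    unsigned long cnt[MAXNODES];
    int has_parent[MAXNODES];
    unsigned long bits = 0;
    int n = 256, i;

    for (i = 0; i < 256; i++) {
        cnt[i] = freq[i];
        has_parent[i] = 0;
    }
    for (;;) {
        int lo1 = -1, lo2 = -1;     /* lowest and second lowest */

        for (i = 0; i < n; i++) {
            if (cnt[i] == 0 || has_parent[i])
                continue;           /* unused or already joined */
            if (lo1 < 0 || cnt[i] < cnt[lo1]) {
                lo2 = lo1;
                lo1 = i;
            } else if (lo2 < 0 || cnt[i] < cnt[lo2]) {
                lo2 = i;
            }
        }
        if (lo2 < 0)
            break;                  /* one parentless node left: the root */
        cnt[n] = cnt[lo1] + cnt[lo2];
        has_parent[n] = 0;
        has_parent[lo1] = has_parent[lo2] = 1;
        bits += cnt[n];
        n++;
    }
    return bits;
}
```

For the counts in Table 1 this returns 95, so the 27 characters of the example sentence fit in 12 bytes of compressed data, not counting the frequency array that has to accompany them.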
Figure 1 illustrates how the Huffman tree for the example file might appear.
Observe that only the original leaf nodes have meaningful character values.
That field in the higher nodes is meaningless. The tree search algorithms
distinguish leaves from higher nodes by the appearance of pointers to child
nodes. If the node has children, it is not a leaf. The node that has no parent
is the root. The compression algorithm uses the tree to translate the
characters in the file into bit strings. You can see in Figure 1 that if you
begin at the root node and trace your way to the most frequent character,
there are fewer nodes in the path than there are to the least frequent
character. It is therefore a matter of assigning a bit value, 0 or 1, to each
of the right or left branches from a parent node to its children. Figure 2
shows the example Huffman tree with the value 1 assigned to each left branch,
and 0 assigned to each right branch. If you trace through the tree, you can
see that the bit string to represent the space character, which is the most
frequent, is 011, while the string for the 'n' is 01001. In a file with
instances of many more characters, the longer strings will be much longer than
the longest ones in this example. Only one character code in the Huffman tree
begins with 011, for example, and so there is no potential for the bit strings
of two characters to confuse the decompression algorithm.
Compression, then, involves traversing the tree beginning at the leaf node for
the character to be compressed and navigating to the root. This navigation
iteratively selects the parent of the current node and sees whether the
current node is the right or left child of the parent, thus determining
whether the next bit is a one or a zero. Because you are proceeding from leaf
to root, you are collecting bits in the reverse order in which you will write
them to the compressed file.
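Listing Three deals with the reversed order by recursion. As an alternative illustration, the sketch below collects the bits in a small buffer and reverses them. The node type and the name code_for are hypothetical and exist only for this example; bits are kept as '0' and '1' characters so that the result is easy to print.

```c
#include <stddef.h>

/* Hypothetical node type for this sketch only. */
struct node {
    struct node *parent, *left, *right;
};

/* Walk from leaf to root, recording 1 for a left branch and 0 for a
 * right branch, then reverse the bits so they read root to leaf.
 * code receives '0'/'1' characters and a terminating '\0'; cap is its
 * size. Returns the code length, or -1 if code is too small. */
int code_for(const struct node *leaf, char *code, int cap)
{
    const struct node *h = leaf;
    int len = 0, i;

    if (cap < 1)
        return -1;
    while (h->parent != NULL) {
        if (len + 1 >= cap)
            return -1;
        code[len++] = (h == h->parent->left) ? '1' : '0';
        h = h->parent;
    }
    /* bits were collected leaf first; reverse them in place */
    for (i = 0; i < len / 2; i++) {
        char t = code[i];
        code[i] = code[len - 1 - i];
        code[len - 1 - i] = t;
    }
    code[len] = '\0';
    return len;
}
```

The recursive version needs no buffer because the call stack does the reversing; the iterative version trades that for a fixed limit on code length.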
The assignment of the 1 bit to the left branch and the 0 bit to the right
branch is arbitrary. Also, the actual tree might not always look exactly like
the one in the figures. It depends on the order in which the search of the
tree proceeds while it builds parents, and where in the tree it inserts those
parents. The right and left branches from the root node could be reversed
without affecting the compression ratio, for example. Figure 1 and Figure 2
are representative. The tree that would be built by the code in this month's
column would be slightly different with respect to right and left child node
assignments, but it is more difficult to draw for the examples. The tree in
the figures would compress "now is" into this bit stream:
 n     o   w    ' ' i    s
 01001 101 1000 011 0001 110


Decompression


Decompression involves building the Huffman tree, reading the compressed file,
and converting its bit streams into characters. You read the file a bit at a
time. Beginning at the root node in the Huffman tree and depending on the
value of the bit, you take the right or left branch of the tree and then
return to read another bit. When the node you select is a leaf--that is, it
has no right and left child nodes--you write its character value to the
decompressed file and go back to the root node for the next bit.
Most definitions of Huffman trees treat the frequency values as fractions rather
than whole numbers. A high frequency character might have a node value of .55
while a low frequency character would have a value of .00002. The value in the
root node would, therefore, be 1.0, the sum of all the fractions. My
implementation uses the actual counts for the values, so the value in the root
node is the total number of bytes in the file. This approach avoids the use of
floating point arithmetic.


The Huffman Programs


Listing One is htree.h. The BYTECOUNTER typedef defines the data type for the
frequency count. I used an unsigned int, which means that with a compiler
whose integers are 16 bits wide the program can compress files of up to 64K
bytes. You could change that to an unsigned long to
compress bigger files. The htree structure defines the Huffman tree, and the
buildtree prototype is for the function that the compression and decompression
programs use to build a Huffman tree from a character frequency array.
Listing Two is htree.c, which contains the buildtree function. It assumes
that the first 256 entries in a global tree are initialized with the leaf
values of the tree. It scans the tree and initializes local pointers to the
two nodes it finds that have the lowest frequency counts. It bypasses any node
that already has a parent or that has a zero value in its frequency count.
After each scan where two nodes are found, the function adds a node to the
tree, placing the address of the new node into the parent member of both nodes
that were found by the scan. It puts the sum of the two child frequency counts
into the new node's frequency count, and it puts the addresses of the two
child nodes into the new node's right and left child node pointer members.
When the scan fails to find two nodes that do not have parents and that have
nonzero frequency counts, the tree is complete, and the remaining node without
a parent is the root node.
Listing Three is huffc.c, the file compression program. After opening the
input and output files, the program reads through the input file, building the
frequency array and counting the bytes. It also counts the number of distinct
character values found in the file. When the program has read the complete
file, it writes the byte count and the count of distinct character values to
the output file. Then it writes the frequency array, which consists of each
character value that occurred in the input file and the number of times it
occurred.
There are other ways that the program could have recorded the frequency array.
It could have written 256 values, with the position of each value in the array
being the character it counted. The array would record all zero as well as
significant counts. This might use less room in the compressed file than the
method chosen, which writes a count of significant characters followed by each
character value and its count. Entries for character values that do not appear
in the input file are not in the compressed file. A file that has occurrences
of most of the 256 characters would probably record a smaller array by using
the former method. A file with fewer distinct character values--such as a
7-bit ASCII text file--would have a smaller array by using the method chosen.
A really smart compression program would decide which form is smaller and
write the shorter form with a control value that identifies it.
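The decision such a program would make is simple arithmetic. The sketch below is an illustration only: the function names are hypothetical, and it assumes a 2-byte count of entries. It does not count the one-byte control value that would identify the form.

```c
#include <stddef.h>

#define ENTRYCOUNT_SIZE 2   /* assumed: a 2-byte count of entries */

/* Bytes needed for the sparse form: an entry count, then one
 * (character, count) pair for each character that occurs. */
size_t sparse_size(const unsigned freq[256], size_t count_size)
{
    size_t distinct = 0;
    int i;

    for (i = 0; i < 256; i++)
        if (freq[i] > 0)
            distinct++;
    return ENTRYCOUNT_SIZE + distinct * (1 + count_size);
}

/* Bytes needed for the full form: 256 counts, with the position of
 * each count standing for the character. */
size_t full_size(size_t count_size)
{
    return 256 * count_size;
}

/* Nonzero when the sparse form is the shorter of the two. */
int use_sparse(const unsigned freq[256], size_t count_size)
{
    return sparse_size(freq, count_size) < full_size(count_size);
}
```

With 2-byte counts the sparse form costs 2 + 3n bytes for n distinct characters against a fixed 512 bytes, so it wins whenever fewer than 170 distinct values appear. For the example sentence, with 13 distinct characters, that is 41 bytes, already more than the 27 characters being compressed, which is the short-file penalty mentioned earlier.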
After writing the frequency array, the compression program calls the buildtree
function to build the Huffman tree. Then it rewinds the input file, reads each
character, and calls the compress function. The recursive compress function
sends the compressed bit stream representation for a character to the output
file. It starts from the character's leaf node position in the tree and calls
itself with the address of the node's parent until it gets to the root node.
When it returns from the call to itself, it writes a zero or one bit to the
compressed file, depending on whether it was called from a right or left child
node.
The program calls the outbit function to write each bit of compressed data.
The outbit function shifts bits into a byte until 8 bits have been added,
whereupon it writes the byte to the file. The program calls outbit a last time
with a -1 parameter to tell it to write the last byte value to the file.
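Bit packing of this kind is easier to experiment with against a memory buffer than against a file. The sketch below is not the outbit function; struct bitwriter and its functions are hypothetical names. It pads a partial last byte on the right with zeros so that a reader taking bits from the high end of each byte sees them in the order they were written.

```c
#include <stddef.h>

/* Hypothetical in-memory bit writer; names are illustrative only. */
struct bitwriter {
    unsigned char *buf;     /* destination buffer */
    size_t cap;             /* capacity of buf in bytes */
    size_t len;             /* complete bytes written so far */
    unsigned acc;           /* bits collected for the current byte */
    int nbits;              /* number of bits in acc (0..7) */
};

void bw_init(struct bitwriter *w, unsigned char *buf, size_t cap)
{
    w->buf = buf;
    w->cap = cap;
    w->len = 0;
    w->acc = 0;
    w->nbits = 0;
}

/* Append one bit. Returns 0 on success, -1 if the buffer is full. */
int bw_put(struct bitwriter *w, int bit)
{
    if (w->len >= w->cap)
        return -1;          /* no room for the byte in progress */
    w->acc = (w->acc << 1) | (unsigned)(bit & 1);
    if (++w->nbits == 8) {
        w->buf[w->len++] = (unsigned char)w->acc;
        w->acc = 0;
        w->nbits = 0;
    }
    return 0;
}

/* Write out a partial byte, left-justified and padded with zeros. */
void bw_flush(struct bitwriter *w)
{
    if (w->nbits > 0) {
        w->buf[w->len++] = (unsigned char)(w->acc << (8 - w->nbits));
        w->acc = 0;
        w->nbits = 0;
    }
}
```

Feeding it the twelve bits for "now" from the example (01001 101 1000) yields the two bytes 0x4D and 0x80. The last four bits of the second byte are padding, which is why the decompressor needs a byte count to know where to stop.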
Listing Four is huffd.c, the file decompression program. It opens the input
and output files and then reads the byte and frequency counts. The purpose of
the byte count is so that the decompression program will know when it is done
decompressing. The last byte in the compressed file will usually contain fewer
bits than are needed to complete the decompressed file, so the byte count
controls the number of bytes written.
The frequency count value tells the program how many entries to read into the
frequency array. Each entry consists of a character value and its frequency
count. This array, once loaded, becomes the Huffman tree when the program
calls the buildtree function. Then the program decompresses by calling the
decompress function to get each byte to write to the output file.
The decompress function calls the inbit function to read each successive bit
value in the compressed file. The inbit function reads a byte from the
compressed file and shifts bits out of it until it has returned all 8 bits,
whereupon it reads the next byte. The decompress function uses the bits to
walk down the Huffman tree starting at the root node. As long as the current
node has child nodes--entries in the right and left pointers--the function
gets another bit and moves to the left child node if the bit is a 1 and the
right child node if the bit is a 0. When the current node has no child node it
is a leaf, and the decompress function returns its character value.
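The same walk can be sketched against a memory buffer. The types and function names below are hypothetical and are not the ones in Listing Four. The sketch does no bounds checking; like the program, it relies on the caller knowing how many characters to decode.

```c
#include <stddef.h>

/* Hypothetical node and reader types for this sketch only. */
struct dnode {
    struct dnode *left, *right;
    unsigned char ch;
};

struct bitreader {
    const unsigned char *buf;
    size_t pos;             /* index of the next bit; bit 0 is the
                               high bit of buf[0] */
};

/* Return the next bit, taking the high bit of each byte first. */
int br_get(struct bitreader *r)
{
    unsigned char byte = r->buf[r->pos / 8];
    int bit = (byte >> (7 - r->pos % 8)) & 1;

    r->pos++;
    return bit;
}

/* Walk from the root to a leaf: a 1 bit selects the left child,
 * a 0 bit the right child. Returns the leaf's character. */
unsigned char decode_one(const struct dnode *root, struct bitreader *r)
{
    const struct dnode *h = root;

    while (h->right != NULL)
        h = br_get(r) ? h->left : h->right;
    return h->ch;
}
```

Because no code is a prefix of another, the walk always ends on a leaf at exactly the right bit, and the next call starts at the root again with the bit that follows.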


Compression as Encryption


You can use Huffman compression to encrypt data files. To decrypt a file, you
must know the algorithm that encrypted it and the encryption key value. A data
file that you compress with a Huffman tree is reasonably well encrypted if you
keep the frequency array private. The array becomes the key to the file's
decryption. The file is further protected if the algorithm itself is part of
the secret. A codebreaker would need to figure out in the first place that you
used a Huffman tree to compress the file, and then, having made that
determination, the spy would have to decipher the frequency array in order to
decompress/decrypt it.


_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

/* ------------------- htree.h -------------------- */

typedef unsigned int BYTECOUNTER;

/* ---- Huffman tree structure ---- */
struct htree {
    unsigned char ch;       /* character value             */
    BYTECOUNTER cnt;        /* character frequency         */
    struct htree *parent;   /* pointer to parent node      */
    struct htree *right;    /* pointer to right child node */
    struct htree *left;     /* pointer to left child node  */
};
extern struct htree ht[];
extern struct htree *root;

void buildtree(void);







[LISTING TWO]

/* ------------------- htree.c -------------------- */
#include <stdio.h>
#include <stdlib.h>
#include "htree.h"

struct htree ht[512];
struct htree *root;

/* ------ build a Huffman tree from a frequency array ------ */
void buildtree(void)
{
    int treect = 256;
    int i;

    /* ---- build the huffman tree ----- */
    while (1) {
        struct htree *h1 = NULL, *h2 = NULL;
        /* ---- find the two smallest frequencies ---- */
        for (i = 0; i < treect; i++) {
            if (ht+i != h1) {
                if (ht[i].cnt > 0 && ht[i].parent == NULL) {
                    if (h1 == NULL || ht[i].cnt < h1->cnt) {
                        if (h2 == NULL || h1->cnt < h2->cnt)
                            h2 = h1;
                        h1 = ht+i;
                    }
                    else if (h2 == NULL || ht[i].cnt < h2->cnt)
                        h2 = ht+i;
                }
            }
        }
        if (h2 == NULL) {
            root = h1;
            break;
        }
        /* --- combine two nodes and add one --- */
        h1->parent = ht+treect;
        h2->parent = ht+treect;
        ht[treect].cnt = h1->cnt + h2->cnt;
        ht[treect].right = h1;
        ht[treect].left = h2;
        treect++;
    }
}







[LISTING THREE]

/* ------------------- huffc.c -------------------- */
#include <stdio.h>
#include <stdlib.h>
#include "htree.h"

static void compress(FILE *fo, struct htree *h,struct htree *child);
static void outbit(FILE *fo, int bit);

int main(int argc, char *argv[])
{
    FILE *fi, *fo;
    int c;
    BYTECOUNTER bytectr = 0;
    int freqctr = 0;

    if (argc < 3) {
        printf("\nusage: huffc infile outfile");
        exit(1);
    }

    if ((fi = fopen(argv[1], "rb")) == NULL) {
        printf("\nCannot open %s", argv[1]);
        exit(1);
    }
    if ((fo = fopen(argv[2], "wb")) == NULL) {
        printf("\nCannot open %s", argv[2]);
        fclose(fi);
        exit(1);
    }

    /* - read the input file and count character frequency - */
    while ((c = fgetc(fi)) != EOF) {
        c &= 255;
        if (ht[c].cnt == 0) {
            freqctr++;
            ht[c].ch = c;
        }
        ht[c].cnt++;
        bytectr++;
    }

    /* --- write the byte count to the output file --- */
    fwrite(&bytectr, sizeof bytectr, 1, fo);

    /* --- write the frequency count to the output file --- */
    fwrite(&freqctr, sizeof freqctr, 1, fo);

    /* -- write the frequency array to the output file -- */
    for (c = 0; c < 256; c++) {
        if (ht[c].cnt > 0) {
            fwrite(&ht[c].ch, sizeof(char), 1, fo);
            fwrite(&ht[c].cnt, sizeof(BYTECOUNTER), 1, fo);
        }
    }

    /* ---- build the huffman tree ---- */
    buildtree();

    /* ------ compress the file ------ */
    fseek(fi, 0L, SEEK_SET);
    while ((c = fgetc(fi)) != EOF)
        compress(fo, ht + (c & 255), NULL);
    outbit(fo, -1);
    fclose(fi);
    fclose(fo);
    return 0;
}

/* ---- compress a character value into a bit stream ---- */
static void compress(FILE *fo, struct htree *h,
                     struct htree *child)
{
    if (h->parent != NULL)
        compress(fo, h->parent, h);
    if (child) {
        if (child == h->right)
            outbit(fo, 0);
        else if (child == h->left)
            outbit(fo, 1);
    }
}
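static unsigned char out8;      /* bits collected for the current byte */
static int ct8;                 /* number of bits held in out8 */

/* -- collect and write bits to the compressed output file -- */
static void outbit(FILE *fo, int bit)
{
    if (bit == -1) {
        /* --- flush: left-justify a partial last byte, because
               inbit takes bits from the high end of each byte --- */
        if (ct8 > 0)
            fputc((out8 << (8 - ct8)) & 0xff, fo);
        ct8 = 0;
        return;
    }
    if (ct8 == 8) {
        fputc(out8, fo);
        ct8 = 0;
    }
    out8 = (out8 << 1) | bit;
    ct8++;
}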







[LISTING FOUR]


/* ------------------- huffd.c -------------------- */
#include <stdio.h>
#include <stdlib.h>
#include "htree.h"

static int decompress(FILE *fi, struct htree *root);

int main(int argc, char *argv[])
{
    FILE *fi, *fo;
    unsigned char c;
    BYTECOUNTER bytectr;
    int freqctr;

    if (argc < 3) {
        printf("\nusage: huffd infile outfile");
        exit(1);
    }
    if ((fi = fopen(argv[1], "rb")) == NULL) {
        printf("\nCannot open %s", argv[1]);
        exit(1);
    }
    if ((fo = fopen(argv[2], "wb")) == NULL) {
        printf("\nCannot open %s", argv[2]);
        fclose(fi);
        exit(1);
    }

    /* ----- read the byte count ------ */
    fread(&bytectr, sizeof bytectr, 1, fi);

    /* ----- read the frequency count ------ */
    fread(&freqctr, sizeof freqctr, 1, fi);

    /* ----- read the frequency array ------ */
    while (freqctr--) {
        fread(&c, sizeof(char), 1, fi);
        ht[c].ch = c;
        fread(&ht[c].cnt, sizeof(BYTECOUNTER), 1, fi);
    }

    /* ---- build the huffman tree ----- */
    buildtree();

    /* ----- decompress the file ------ */
    while (bytectr--)
        fputc(decompress(fi, root), fo);
    fclose(fo);
    fclose(fi);
    return 0;
}
static int in8;
static int ct8 = 8;

/* ---- read a bit at a time from a file ---- */
static int inbit(FILE *fi)
{
    int obit;

    if (ct8 == 8) {
        in8 = fgetc(fi);
        ct8 = 0;
    }

    obit = in8 & 0x80;
    in8 <<= 1;
    ct8++;
    return obit;
}

/* ----- decompress file bits into characters ------ */
static int decompress(FILE *fi, struct htree *h)
{
    while (h->right != NULL)
        if (inbit(fi))
            h = h->left;
        else
            h = h->right;
    return h->ch;
}



February, 1991
STRUCTURED PROGRAMMING


Pondering Imponderables




Jeff Duntemann, KI6RA


I'm one of those guys who's always pondering imponderables. Such as: What is
it that schnauzers do when they're schnauzing? Why do we continue to reelect
that pack of thieving cowards we call a Congress? If there are parakeets, are
there also metakeets? Why did God create heavy metal bands when we already had
trial lawyers? (Or, Lord knows, Lima beans?)
Occasionally, answers present themselves. Not long ago, while watching Mr.
Byte squirming around on his back, rubbing his nose against the carpet, I
realized that this was, in fact, schnauzing. Schnauzers must just have itchier
noses and do it more often. As for Congress; hey, go figure. I always vote
against the incumbent. (There are no good congressmen, and not nearly enough
dead ones.) We'll leave the question of metakeets to Lewis Carroll's next
incarnation.
But you want an imponderable, well, ponder this: Is there some truly general
method for designing software?


The Most-asked Question


I'd guess that this is the most-asked question in all computerdom, even if
most people asking the question ask it of themselves. (Usually when a
deadline's coming up and they have nothing to show for their six months' worth
of "design phase.") It beats out the runner-up, "What the hell is
object-oriented programming, anyway?" (asked mostly when the boss has laid
down the law that all future projects are to be written in an object-oriented
fashion) by an Arizona mile.
I've been spending the lion's share of last year's columns fussing with the
newness of the OOP concept. We've definitely chewed the flavor out of that
particular gumball, big though it was when it rattled out of the machine, and
it's time to chew on something else.
So for the next umpty-ump columns I'll be addressing design issues as I
perceive them. In the process I'll lay out my own design process for a
specific application, and I'll build the application as I go. I readily admit
that I'm not a software design expert. I don't have a degree in computer
science or software engineering. (My degree is in English, that most
cantankerous interpreter of all.) However, I've designed, implemented, and
maintained a couple of middling systems in my time (over and above the little
programming projects I've done in lieu of watching "L.A. Law" all these years)
and, more important, I've watched a lot of other people try and sometimes fail
to design software of their own.


The Nature of Design


Let's go back to the last imponderable I mentioned. Is there a truly general
method for designing software? There is indeed. It is, in fact, summarized in
just one word: Think!
I say this because I've read between the lines a lot in talking to other
programmers about design. Numerous people have asked me to recommend a good
design methodology. Closer probing sometimes revealed that they were in fact
hunting about for something they thought they might have read about in a copy
of Datamation that somebody left on the floor of the MIS/DP men's room in
1982, called something like the Krobolgian-Czstrwytrszyz Decomposition
Theorem, which would take a set of vague statements scribbled on the back of a
Chinese menu and automagically poof them into tight pseudocode that your
little sister could implement in GW BASIC in an afternoon. In other words,
what they were really asking for was to please relieve them of the burden of
actually thinking about what they had to do.
No. Design is not easy. There may be tools to make certain parts of the
process less burdensome, but none of these tools relieves you of the
responsibility of understanding what's going on all along.
I've created systems in a number of different ways. In looking back, however,
I can cook the process down to the following:
1. Understand what you can't do.
2. Understand what you must do.
3. Resolve any conflicts between #1 and #2.
4. Do what you must do.
5. Make sure that what you've got is what you wanted.
As far as I've been able to tell, these five steps apply to any sort of
software design project you can name. None of them is easy. All of them
require considerable thought. #3 may demand a certain element of statesmanship
as well, especially in corporate environments where the people who define #1
and #2 are separate groups that throw blunt objects at one another whenever
they get the chance.


Constraint-based Design


But step #1 deserves some explication. No design methodology I've studied
gives it the importance that I do. Not surprisingly, many if not most of the
failed design projects I've seen failed precisely because the designers gave
little thought to their collection of constraints.
I warn you: Get your constraints down on paper first. Memorize them. Then tape
them to your door, and do not cover them with your favorite entries from The
Far Side Calendar.
So what's a constraint? A constraint is one of the points through which you
draw the boundaries of possibility. Inside is to work. Outside is to dream.
The most obvious constraints are defined by the target platform. One very
common constraint is to work in text mode only, often because the target
machine or machines have no graphics ability. Another constraint is for the
program to work in 640K, without benefit of EMS, again because the target
machine or machines have no EMS. Connecting to network or mainframe resources
can involve pages of detailed and subtle constraints, things you can't do at
certain times, or things you can't do faster than certain speeds. I once saw a
guy try to design a system that had to pass a megabyte of data every evening
-- through a 300 baud phone link. Not his problem -- he just had to write the
program. Not his fault if the damned data was too big!
Much thornier constraints involve the people who use the system. How much
operator time per day will the system demand? Will the operator be a seasoned
part of the team requesting the system, or a minimum-wage clock-watcher from
Rent-A-Loser?
Constraints may also be dictated by organization policies. The company may
hire you to design and implement a system for them, but may forget to mention
that all systems must conform to SAA (in many mainframe-dominated shops, IBM's
word has been law for so long that nobody even questions whether SAA is a good
idea or not) or even, God help us, be written in Cobol. Management may not
ordinarily enforce the Cobol rule -- until somebody decides his turf is being
stepped on during the design process and goes looking for fine print to invoke
against the proposed system.
Expect constraints if the proposed system involves personnel information, or
financial information, or anything else a corporate body might consider
sensitive. Transmissions may have to be encrypted. Password access may have to
be strictly controlled.
I'm a fanatic about constraints because constraints affect the shape of the
system you're designing more than the functional requirements of the system
do. Sounds weird, but it's perfectly true. Before you choose which pair of
shoes to wear to Hashimoto-San's party, you'd better find out whether he
allows people to wear shoes in his house.


Constraints vs. Requirements


Knowing what's a constraint and what's a requirement is a little like knowing
what's a bug and what's a feature. I see it this way: A requirement is part of
the list of things that the software you're designing must accomplish. Period.
A "requirement" that a system be written in C isn't a requirement -- it's a
constraint.
(And to avoid any accusations of C-bashing, I'll add that a requirement that
the system be written in Pascal is every bit as much a constraint. Just a
pleasanter one....)

These are also examples of constraints:
The application must run under Windows 3.0.
The application must be able to link with the company's standard Fortran
graphics library.
The application may not grab the timer tick interrupt.
All graphics must be drawn in a graphics mode that displays square pixels.
Only COM1: and COM2: should be supported.
As you move toward the bottom of this list, the constraints start to look more
and more like features. The last item on the list, for example, could be
either. The touchstone is this: A constraint is a limitation imposed on the
application that takes priority over the feature set. You may know how to
implement support for COM3: and COM4:, but some other peripherals in the
target machine may already be using those two ports. Thus, you can't just
"toss them in;" in fact, if your personal library of comm routines includes
COM3: and COM4: support, you may have to explicitly disable it or even carve
it out of the source code.


The Topsy Problem


Does it make sense to impose explicit constraints on software you're writing
for yourself? (Apart from obvious constraints such as working within the
limits of your own hardware.) It can, especially if you as a programmer are
prone to "creeping featurism," that is, the irresistible urge to keep stuffing
new features into a project far beyond what you imagined at the outset. (To
paraphrase the potato chip people, "Betcha can't add just one!")
Projects that grow like Topsy can be hazardous to your delivery schedule, your
sleep schedule, your marriage. A set of self-imposed constraints will help you
set a clear boundary on when the project is actually finished. Limiting
yourself to text mode means you're less likely to spend another two months
adding graphics mode on a lark.
Later on, if you really, really want to add graphics support, you can break
your own constraints.


The JTERM Project


I've decided to completely rethink and reimplement JTERM, the terminal
emulator application that I started writing (in CP/M Turbo Pascal) back in
1984. It grew by accretion and is the epitome of software that has utterly no
design or forethought behind it. It sends and captures text files, and it
transfers files through XModem checksum. There are no menus at all; everything
is done most cryptically by various sequences of control keys.
In short, it richly deserves to be shot in the head and left on the anthill.
So where do we start?
Here: With a list of constraints.
JTERM will support text mode only.
It will be written in Turbo Pascal 6.0.
It will require no third-party libraries.
It will operate on AT-class machines only (286 or better).
It will support only COM1: and COM2:.
It will use Borland's Turbo Vision as its user interface.
Obviously, these are all self-imposed, and are there mainly to give a certain
shape and bounds to the project. Each constraint has some thought behind it,
however, as every constraint should.
I'm excluding graphics support to keep the project manageable. At some future
time, I may reimplement JTERM for Microsoft Windows 3.0, but that's a whole
different design exercise. I'm writing it in Turbo Pascal to make it
interesting to the broadest possible subset of "Structured Programming"
readers. I'm
excluding third-party libraries because I want to be able to distribute the
entire package in source code form. Besides, buying all the tough parts will
preclude discussion of the design efforts that go into the tough parts. In a
real-world design effort (where the source code needn't be distributed) you're
probably better off buying as much of the technology in library form as you
can.
I'm stopping at COM2: because the interrupt infrastructure for the first two
COM: ports is fairly standard and well-understood. Beyond that, things get
dicey and the complexity of the project as a whole goes up severely.
Finally, I'm using Turbo Vision for the user interface because Turbo Vision is
now going out with every copy of Turbo Pascal as an extension of the runtime
library. As such, it instantly becomes a force to contend with, and deserves
some serious investigation. Because everyone who has Turbo Pascal 6.0 will
also have Turbo Vision, distributing the source for Turbo Vision along with
JTERM is unnecessary.
A project simple enough to describe in a few magazine articles won't really be
big enough to have much in the line of constraints. And when you're working on
your own, you can do pretty much whatever you want. However, let me reiterate
that in almost every case, programming for money involves numerous
constraints, some clearly stated at the outset, and others that you will have
to dig for. You'd better dig for them, too -- or later on they'll come
crawling up out of the ground like those jive zombies in Michael Jackson's
"Thriller" video, muttering one horrible sound or another:
"You know that no one outside Headquarters can access that data set...."
"You know that our Poughkeepsie offices run everything on Apple II
machines...."
"You know that all software here has to be written in Cobol...."
Eek!


The Big Zeller Wrapup


When last I looked at the pile on the windowsill, I had a little more than 80
letters, cards, and e-mail notes about Zeller's Congruence, a widely-used, but
rarely-explained algorithm for extracting the day of the week given the year,
month, and day in the Gregorian era. Zeller's Congruence revolves around the
expression shown in Figure 1, reprinted verbatim from my October 1990 column.
The q term represents the day of the month. The m term represents the month,
but massaged slightly so that while March is month 3, January and February are
months 13 and 14, but of the previous year. (I'm being terse here. For the
full story, do refer to the October column.) K is the year term, from 0 to 99,
and J is the century term (that is, 17, 18, 19, and so on).
Figure 1: The Zeller Expression

 $$ q + \frac{(m + 1) \times 26}{10} + K + \frac{K}{4} + \frac{J}{4} - 2 \times J $$

The majority of my correspondents wrote most helpfully to explain the meaning
of the -2*J term in the algorithm, at which I'd thrown up my hands in despair
of understanding. Here's the skinny; like everything else but American
politics, it's simple enough in hindsight.


The Secret of -2*J


What the expression in Figure 1 does is calculate the way the day of the week
advances for each day (q), each month (m), each year (K), and each century
(J). For each day, the day of the week advances by one. (Obviously.) For each
month, the day of week advances somewhat erratically, but Zeller was bright
enough to come up with the ((m+1)*26)/10 term to describe it.
Now, the last four terms in the expression are actually two terms plus two
corrections. The day of the week advances by one for every year, so we add K.
However, every four years, the day of the week advances by an additional (leap
year) day, so we have to add K/4, which throws in an extra day for every four
years we add. The K/4 term is thus a necessary correction to the K term.

Now, where is the term that shows how the day of the week advances for every
century? You guessed it: -2*J. The day of the week moves back by two days
every century. The addition of the J/4 term throws an extra day in every four
centuries, when the century year (that is, "noughty-nought," the 00 year),
which is ordinarily not a leap year, is made a leap year to account for a
slowly-accumulating round-off error in the number of days it takes to make a
year. J/4 is thus a correction to -2*J. (I had falsely assumed -- Lord knows
why -- that the day does not advance at all in an ordinary century, leaving
J/4 a correction to an unstated 0 term.)
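To make the term-by-term reading concrete, here is a minimal sketch in C
rather than Pascal. It is not the CalcDayOfWeek routine from the October
column; the name ZellerDayOfWeek is invented for the illustration, and the
result coding (0 = Saturday, 1 = Sunday, and so on through 6 = Friday) is
the usual convention for this form of the congruence, assumed here. Every
division is an integer division.

```c
/* Sketch only: evaluates the Figure 1 expression for a Gregorian date.
   ZellerDayOfWeek is a hypothetical name, not code from the column.
   Returns 0 = Saturday, 1 = Sunday, ... 6 = Friday (assumed coding). */
int ZellerDayOfWeek(int Year, int Month, int Day)
{
   int q, m, K, J, Sum;

   q = Day;
   m = Month;
   if (m < 3) {          /* January and February become months 13 */
      m += 12;           /* and 14 of the previous year           */
      Year--;
   }
   K = Year % 100;       /* year term, 0..99                */
   J = Year / 100;       /* century term: 17, 18, 19, ...   */

   Sum = q + ((m + 1) * 26) / 10   /* day term and month term         */
         + K + K / 4               /* year term and its correction    */
         + J / 4 - 2 * J;          /* century correction and term     */

   Sum %= 7;             /* a remainder, so it may come out negative */
   if (Sum < 0)
      Sum += 7;          /* fold a negative remainder into 0..6 */
   return Sum;
}
```

For October 31, 1990, the terms are 31 + 28 + 90 + 22 + 4 - 38 = 137, and
137 modulo 7 is 4, which is Wednesday under the coding assumed above.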


Modulus and Remainder


One of the things that makes Zeller so hard to implement is that what Zeller
called the modulus function is not quite the same thing as the MOD operator
present in most of our compilers. (I took up this issue in my November 1990
column.) What we call MOD in Pascal and Modula-2 is really a remainder
function. Modulus and remainder return identical results for positive
quantities, but different results for negative quantities, and my Zeller's
Congruence implementation was going haywire every time the -2*J term forced
the value of the expression as a whole into the negative.
I implemented a true modulus function, presented as Listing One in the
November column -- and blew it. My Modulus(X,Y) function returns an erroneous
value in every case where either X or Y is negative and X is an exact multiple
of Y. Where X is a multiple of Y, X modulus Y is 0, always -- and the
(X*Trunc(R-1)) term evaluates to X or -X for those cases, when it should in
fact evaluate to 0.
Harry J. Smith of Saratoga, California pointed this out and sent a correct
Modulus function, which I've given in Figure 2. If you've already begun using
my CalcDayOfWeek function containing the Modulus local function, please
replace the old Modulus with the one in Figure 2.
Figure 2: MODULUS2.SRC

 FUNCTION Modulus(X, Y : Integer) : Integer;

 VAR
   Holder : Integer;

 BEGIN
   Holder := X MOD Y;
   IF Holder < 0 THEN Inc(Holder, Abs(Y));
   Modulus := Holder;
 END;

Many thanks, Harry.
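The same trap exists outside Pascal. As a hedged illustration, here is a C
counterpart of Figure 2. In C99 the % operator truncates toward zero, so it
is a remainder operator just as MOD is (older C compilers were allowed to
choose the sign of the result, and the correction below is harmless either
way). The function is a sketch that assumes Y is nonzero.

```c
#include <stdlib.h>   /* abs */

/* True modulus for a nonzero Y: the result is always in 0..|Y|-1.
   A C rendering of the idea in Figure 2, offered as a sketch. */
int Modulus(int X, int Y)
{
   int Holder = X % Y;     /* remainder; takes the sign of X in C99 */

   if (Holder < 0)
      Holder += abs(Y);    /* fold a negative remainder upward */
   return Holder;
}
```

Modulus(-14, 7) returns 0, the exact-multiple case that tripped up the
earlier version, and Modulus(-1, 7) returns 6 where the bare % operator
would give -1.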


Congruence?


But the most remarkable thing I learned from this small mountain of letters is
that the whole modulus business could have been avoided by magically replacing
the -2*J term with a +5*J term. Everything comes out exactly the same in the
end, except that since we're adding a quantity to the expression instead of
subtracting it, the expression as a whole never goes negative, and we can use
MOD with impunity. MOD, remember, is actually the remainder function, and
returns the same results as true modulus for all positive quantities.
Why, though, is +5*J equivalent to -2*J? Keep in mind what happens to the
value of the expression once we've evaluated it: We calculate expression
modulus 7.
Adding or subtracting multiples of 7 to the value of expression does not
change the final value of expression modulus 7. This is why I could get away
with my original kludge (shown in the listing for CalcDayOfWeek in my October
1990 column) of adding 7 repeatedly to the expression any time it came out
negative until the value turned positive.
Adding 5*J to the expression does something like that. Think of it this way:
Every passing century moves the day of the week back by two days. In other
words, if today (Halloween, October 31, 1990) is Wednesday, the day of the
week on 10/31/1890 was Friday. We say this because most centuries contain
36,524 days -- two days short of a multiple of seven days.
On the other hand, two days less than a multiple of seven days is absolutely
the same thing as five days more than a multiple of seven days. It's just as
valid to say that in the century between Halloween 1890 and Halloween 1990,
the day of the week went forward by five days: From Friday to Wednesday. We're
not the least bit concerned about the actual number of days that pass in a
century; we're only concerned with the relative position of the day of the
week from one end of the century to the other. +5*J, -2*J; Five steps forward,
two steps back: modulo 7, it's all the same.
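The congruence is easy to check by machine. The sketch below, in C with
invented function names, reduces only the century term modulo 7 both ways.
It is not CalcDayOfWeek, just the one term under discussion.

```c
/* Sketch: the century term taken modulo 7, both ways.
   Both function names are hypothetical. */

/* The -2*J form: the remainder is zero or negative, so a true
   modulus needs the correction step */
int CenturyTermMinus2(int J)
{
   int Result = (-2 * J) % 7;

   if (Result < 0)
      Result += 7;
   return Result;
}

/* The +5*J form: never negative for J >= 0, so the plain
   remainder operator is enough */
int CenturyTermPlus5(int J)
{
   return (5 * J) % 7;
}
```

For J = 19, -2*J is -38 and +5*J is 95; both land on 4 modulo 7, and the
two functions agree for every nonnegative century.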


I'm not going to print CalcDayOfWeek here yet a fourth time (which would
probably be a DDJ record for one piece of code) since I am thoroughly
Zellered-out. The change, if you choose to make it, is simple enough for me to
cop out and leave as an exercise to the reader.
Thanks much to everybody who wrote to me on the subject.


Products Mentioned


MyFLIN
OpalFire Software
329 North State Street Division II
Orem, UT 84057
801-227-7100
$59.00


And By the Way, What's a FLIN?


The downside to having really original product ideas is that there's no
comfortable niche to fit into. Truly original products have to explain
themselves very well and very often or they don't get their fair share of the
public's bandwidth.
I received a product not long ago that presents a classic example. The product
is MyFLIN, written by an Australian hacker and marketed through an American
firm in Utah. It's extremely clever, and extremely useful; but it may also be
the least self-explanatory product I've ever seen.
Not that it's hard to use; I don't mean self-explanatory that way. But there
is not a whit of explanation on the packaging as to what the product does, and
as far as I can tell, the disk-based documentation does not share the secret
of what "FLIN" means.
MyFLIN solves a very specific problem that I've had for the whole time I've
been a Pascal programmer: I create a passel of procedures, including some
middling complex ones with eight or nine parameters. Later on, I go to call
those procedures from some other part of the source code, or from a different
module, and I realize I don't quite remember the order, spelling, or types of
all the parameters. Did ErrorCode come before BufferPtr or after? Was it
StringForm, StrForm, or StringFrm? Before you know it, I'm Ctrl-QRing to the
top of the file to start searching for the malremembered proc, at considerable
cost in time and concentration.
MyFLIN fixes that. It's a TSR that builds a database of your own procedure and
function declarations for you. There's no data entry involved; MyFLIN sucks
the declaration right out of the screen buffer. You just put the cursor
anywhere inside the name of the declared procedure, hit a hot key, and you've
captured it. Later on, you can recall it just as easily.
This is seminal stuff. It's pretty much alone in its niche, and you really
have to play with it for an evening to get a feel for how indispensable the
concept is. The software has a couple of rough spots, but nothing worth any
serious carping.
Someday I'll figure out what FLIN means -- but I now have a FLIN, and if you
suffer from badly-remembered procedure declarations, you should get one, too.


Two Years Before the Masthead


It's Halloween night again -- and I realize that I finished off my very first
column for DDJ two years ago today. Back when Kent Porter first approached me
with the idea of taking over the "Structured Programming" column, I was in
terror of running out of topics to cover after a couple of months. Now, 25
columns later, I see that my list of things-to-be-discussed covers several
pages, single-spaced, and grows seemingly without end.
Some call programming a bottomless pit. It's actually more of a never-ending
Lifesaver; no matter how many wonders you pinch off the roll, another is
always there, right behind it, waiting to be mastered and savored.

Keep that in mind, the next time you're chasing an intermittent system crash
at three ayem. Boo!


_STRUCTURED PROGRAMMING COLUMN_
by Jeff Duntemann



February, 1991
GRAPHICS PROGRAMMING


The Polygon Primeval




Michael Abrash


"Give me but one firm spot on which to stand, and I will move the Earth."
-Archimedes
Were Archimedes alive today, he might say, "Give me but one fast polygon-fill
routine on which to call, and I will draw the Earth." Programmers often think
of pixel drawing as being the basic graphics primitive, but filled polygons
are equally fundamental and far more useful. Filled polygons can be used for
constructs as diverse as a single pixel or a 3-D surface, and virtually
everything in between.
I'll spend some time in this column developing routines to draw filled
polygons and building more sophisticated graphics operations atop those
routines. Sooner, rather than later, I'll get to 2-D manipulation and
animation of polygon-based entities (with occasional diversions to other
graphics topics of interest), leading up to an exploration of 3-D graphics.
You can't get there from here without laying some groundwork, though, so this
month I'll begin with the basics of filling a polygon. Next month, we'll see
how to draw a polygon considerably faster. That will set the tone for this
column: High-level exploration of a graphics topic first, followed by a speedy
hardware-specific implementation for the IBM PC/VGA combination, the most
widely used graphics system around. Abstract, machine-independent graphics is
a thing of beauty, but only by understanding graphics at all levels, including
the hardware, can you boost performance into the realm of the sublime.
And slow computer graphics is scarcely worth the bother.


Filled Polygons


A polygon is simply a shape formed by lines laid end to end to form a
continuous, closed path. A polygon is filled by setting all pixels within the
polygon's boundaries to a color or pattern. For now, we'll work only with
polygons filled with solid colors.
You can divide polygons into three categories: convex, nonconvex, and complex,
as shown in Figure 1. Convex polygons include what you'd normally think of as
"convex" and more; as far as we're concerned, a convex polygon is one for
which any horizontal line drawn through the polygon encounters the right edge
exactly once and the left edge exactly once, excluding horizontal and
zero-length edge segments. Put another way, neither the right nor left edge of
a convex polygon ever reverses direction from up to down, or vice-versa. Also,
the right and left edges of a convex polygon may not cross each other,
although they may touch so long as the right edge never crosses over to the
left side of the left edge. (Check out the second polygon drawn in Listing
Three, which certainly isn't convex in the normal sense.) The boundaries of
nonconvex polygons, on the other hand, can go in whatever directions they
please, so long as they never cross. Complex polygons can have any boundaries
you might imagine, which makes for interesting problems in deciding which
interior spaces to fill and which not. Each category is a superset of the
previous one.
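The "never reverses direction" half of that definition can be tested
mechanically. What follows is a rough sketch, not part of this month's
listings: EdgesNeverReverse is an invented name, the Point layout is an
assumption based on how the listings use X and Y, and the check does not
look at whether the left and right edges cross. Walking once around the
vertex list, a polygon that passes goes down one side and up the other, so
its vertical direction changes exactly twice.

```c
struct Point { int X; int Y; };   /* assumed layout, for illustration */

/* Returns 1 if neither side of the polygon reverses vertical direction,
   0 otherwise. Horizontal and zero-length edges are ignored. Does not
   check whether the left and right edges cross. */
int EdgesNeverReverse(const struct Point *Vertices, int Length)
{
   int i, DeltaY, Direction;
   int PrevDirection = 0, FirstDirection = 0, Reversals = 0;

   for (i = 0; i < Length; i++) {
      DeltaY = Vertices[(i + 1) % Length].Y - Vertices[i].Y;
      if (DeltaY == 0)
         continue;                  /* not an active edge */
      Direction = (DeltaY > 0) ? 1 : -1;
      if (PrevDirection == 0)
         FirstDirection = Direction;   /* first active edge seen */
      else if (Direction != PrevDirection)
         Reversals++;
      PrevDirection = Direction;
   }
   /* Close the loop: compare the last active edge with the first */
   if (PrevDirection != 0 && PrevDirection != FirstDirection)
      Reversals++;
   return Reversals <= 2;
}
```

A diamond passes this test; a square with a notch cut upward into its
bottom edge fails, because a horizontal line through the notch would meet
the boundary four times.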
Why bother to distinguish between convex, nonconvex, and complex polygons? For
performance, especially when it comes to filling convex polygons. It's with
filled convex polygons that we're going to start; they're widely useful and
will serve well to introduce some of the subtler complexities of polygon
drawing, not the least of which is the slippery concept of "inside."


Which Side is Inside?


The basic principle of polygon filling is decomposing each polygon into a
series of horizontal lines, one for each horizontal row of pixels, or scan
line, within the polygon (a process I'll call scan conversion), and drawing
the horizontal lines. I'll refer to the entire process as rasterization.
Rasterization of convex polygons is easily done by starting at the top of the
polygon and tracing down the left and right sides, one scan line (one vertical
pixel) at a time, filling the extent between the two edges on each scan line,
until the bottom of the polygon is reached. At first glance, rasterization
does not seem to be particularly complicated, although it should be apparent
that this simple approach is inadequate for nonconvex polygons.
There are a couple of complications, however. The lesser complication is how
to rasterize the polygon efficiently, given that it's difficult to write fast
code that simultaneously traces two edges and fills the space between them.
The solution is decoupling the process of scan-converting the polygon into a
list of horizontal lines from that of drawing the horizontal lines. One
device-independent routine can trace along the two edges and build a list of
the beginning and end coordinates of the polygon on each raster line. Then a
second, device-specific routine can draw from the list after the entire
polygon has been scanned. We'll see this in action shortly.
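As a rough picture of that split, the sketch below shows the kind of list
the scanning routine might hand over, and a stand-in "device" that fills a
character grid instead of a display. The structure names follow their use
in Listing One, but the XStart and XEnd fields, the inclusive XEnd, the
grid, and DrawListToGrid are all assumptions made for the illustration.

```c
/* Assumed layouts, modeled on how Listing One uses these structures;
   the XStart/XEnd field names are guesses for illustration. */
struct HLine { int XStart; int XEnd; };    /* one scan line's extent */
struct HLineList {
   int Length;                /* number of scan lines */
   int YStart;                /* Y coordinate of the first scan line */
   struct HLine *HLinePtr;    /* Length entries, top to bottom */
};

#define GRID_WIDTH  16
#define GRID_HEIGHT 8

/* Stand-in for a device-specific drawer: "draws" into a character grid
   rather than a display. XEnd is treated as inclusive; an entry with
   XStart > XEnd draws nothing. */
void DrawListToGrid(const struct HLineList *List,
                    char Grid[GRID_HEIGHT][GRID_WIDTH], char Color)
{
   int i, X, Y;

   for (i = 0; i < List->Length; i++) {
      Y = List->YStart + i;
      if (Y < 0 || Y >= GRID_HEIGHT)
         continue;                      /* clip vertically */
      for (X = List->HLinePtr[i].XStart; X <= List->HLinePtr[i].XEnd; X++)
         if (X >= 0 && X < GRID_WIDTH)
            Grid[Y][X] = Color;         /* clip horizontally */
   }
}
```

Because the drawer sees only the finished list, it can be swapped for a
faster or different device routine without touching the scanning code.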
The second, greater complication arises because the definition of which pixels
are "within" a polygon is a more complicated matter than you might imagine.
You might think that scan-converting an edge of a polygon is analogous to
drawing a line from one vertex to the next, but this is not so. A line by
itself is a one-dimensional construct, and as such is approximated on a
display by drawing the pixels nearest to the line on either side of the true
line. A line serving as a polygon boundary, on the other hand, is part of a
two-dimensional object. When filling a polygon, we want to draw the pixels
within the polygon, but a standard vertex-to-vertex line-drawing algorithm
will draw many pixels outside the polygon, as shown in Figure 2.
It's no crime to use standard lines to trace out a polygon, rather than
drawing only interior pixels. In fact, there are certain advantages: For
example, the edges of a filled polygon will match the edges of the same
polygon drawn unfilled. Such polygons will look pretty much as they're
supposed to, and all drawing on raster displays is, after all, only an
approximation of an ideal.
There's one great drawback to tracing polygons with standard lines, however:
Adjacent polygons won't fit together properly, as shown in Figure 3. If you
use six equilateral triangles to make a hexagon, for example, the edges of the
triangles will overlap when traced with standard lines, and more recently
drawn triangles will wipe out portions of their predecessors. Worse still, odd
color effects will show up along the polygon boundaries if exclusive-or
drawing is used. Consequently, filling out to the boundary lines just won't do
for drawing images composed of fitted-together polygons. And because fitting
polygons together is exactly what I have in mind, we need a different
approach.


How Do You Fit Polygons Together?


How, then, do you fit polygons together? Very carefully. First, the
line-tracing algorithm must be adjusted so that it selects only those pixels
that are truly inside the polygon. This basically requires shifting a standard
line-drawing algorithm horizontally by one half-pixel toward the polygon's
interior. That leaves the issue of how to handle points that are exactly on
the boundary, and points that lie at vertices, so that those points are drawn
once and only once. To deal with that, we're going to adopt the following
rules:
Points located exactly on nonhorizontal edges are drawn only if the interior
of the polygon is directly to the right (left edges are drawn, right edges
aren't).
Points located exactly on horizontal edges are drawn only if the interior of
the polygon is directly below them (horizontal top edges are drawn, horizontal
bottom edges aren't).
A vertex is drawn only if all lines ending at that point meet the above
conditions (no right or bottom edges end at that point).
All edges of a polygon except those that are flat tops or flat bottoms will be
considered either right edges or left edges, regardless of slope. The left
edge is the one that starts with the leftmost line down from the top of the
polygon.
These rules ensure that no pixel is drawn more than once when adjacent
polygons are filled, and that if polygons cover the full 360-degree range
around a pixel, then that pixel will be drawn once and only once -- just what
we need in order to be able to fit filled polygons together seamlessly.
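For the simplest case, axis-aligned rectangles, the rules reduce to
"include the left and top edges, exclude the right and bottom edges." The
sketch below, with invented names and a small fixed grid, counts how many
times each pixel is drawn. It assumes every rectangle lies within the grid.

```c
#define FIT_WIDTH  8
#define FIT_HEIGHT 4

/* Adds 1 to every pixel an axis-aligned rectangle covers under the
   rules above: left and top edges are drawn, right and bottom are not.
   The rectangle is assumed to lie within the grid. */
void CoverRect(int Count[FIT_HEIGHT][FIT_WIDTH],
               int Left, int Top, int Right, int Bottom)
{
   int X, Y;

   for (Y = Top; Y < Bottom; Y++)        /* bottom edge excluded */
      for (X = Left; X < Right; X++)     /* right edge excluded  */
         Count[Y][X]++;
}

/* Returns 1 if every pixel in the grid was drawn exactly once */
int EveryPixelOnce(int Count[FIT_HEIGHT][FIT_WIDTH])
{
   int X, Y;

   for (Y = 0; Y < FIT_HEIGHT; Y++)
      for (X = 0; X < FIT_WIDTH; X++)
         if (Count[Y][X] != 1)
            return 0;
   return 1;
}
```

Tile the grid with four rectangles that meet at the point (4,2), and every
pixel, including the one at the shared corner, is drawn exactly once; only
the lower-right rectangle claims that corner pixel.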
This sort of non-overlapping polygon filling isn't ideal for all purposes.
Polygons are skewed toward the top and left edges, which not only introduces
drawing error relative to the ideal polygon but also means that a filled
polygon won't match the same polygon drawn unfilled. Narrow wedges and
one-pixel-wide polygons will show up spottily. All in all, the choice of
polygon-filling approach depends entirely on the ways in which the filled
polygons will be used.
For our purposes, non-overlapping polygons are the way to go, so let's have at
them.


Filling Non-overlapping Convex Polygons Made Easy


Without further ado, Listing One contains a function, FillConvexPolygon, that
accepts a list of points that describe a convex polygon, with the last point
assumed to connect to the first, and scans it into a list of lines to fill,
then passes that list to the function DrawHorizontalLineList in Listing Two.
Listing Three is a sample program that calls FillConvexPolygon to draw
polygons of various sorts, and Listing Four is a header file included by the
other listings.
Listing Two isn't particularly interesting; it merely draws each horizontal
line in the passed-in list in the simplest possible way, one pixel at a time.
(No, that doesn't make the pixel the fundamental primitive; next month I'll
replace Listing Two with a much faster version that doesn't bother with
individual pixels at all.)
Listing One is where the action is this month. Our goal is to scan out the
left and right edges of each polygon so that all points inside and no points
outside the polygon are drawn, and so that all points located exactly on the
boundary are drawn only if they are not on right or bottom edges. That's
precisely what Listing One does; here's how.
Listing One first finds the top and bottom of the polygon, then works out from
the top point to find the two ends of the top edge. If the ends are at
different locations, the top is flat, which has two implications. Firstly,
it's easy to find the starting vertices and directions through the vertex list
for the left and right edges. (To scan-convert them properly, we must first
determine which edge is which.) Secondly, the top scan line of the polygon
should be drawn without the rightmost pixel, because only the rightmost pixel
of the horizontal edge that makes up the top scan line is part of a right
edge.
If, on the other hand, the ends of the top edge are at the same location, then
the top is pointed. In that case, the top scan line of the polygon isn't
drawn; it's part of the right-edge line that starts at the top vertex. (It's
part of a left-edge line, too, but the right edge overrides.) When the top
isn't flat, it's more difficult to tell in which direction through the vertex
list the right and left edges go, because both edges start at the top vertex.
The solution is to compare the slopes from the top vertex to the ends of the two
lines coming out of it in order to see which is leftmost. The calculations in
Listing One involving the various deltas do this, using a slightly rearranged
form of the equation:

 DeltaYN / DeltaXN > DeltaYP / DeltaXP
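The rearranged form cross-multiplies instead of dividing, which sidesteps
division by zero on vertical lines and keeps the comparison in integers.
Pulled out on its own, the test Listing One applies looks like the sketch
below; NextLineIsLeftmost is an invented wrapper name, and the deltas are
measured from the top vertex with Y increasing down the screen.

```c
/* Sketch: decides which of two lines leaving a shared top vertex is
   leftmost, given each line's X and Y deltas from that vertex (Y grows
   downward). Returns 1 if the "next" line is leftmost, 0 otherwise.
   The long casts guard against overflow where int is 16 bits. */
int NextLineIsLeftmost(int DeltaXN, int DeltaYN, int DeltaXP, int DeltaYP)
{
   return ((long)DeltaXN * DeltaYP - (long)DeltaYN * DeltaXP) < 0L;
}
```

With the next line heading to (-5,10) and the previous line to (5,10), the
expression is -50 - 50 = -100, so the next line is the left edge. A
vertical next line (0,10) against the same previous line gives -50, again
correctly leftmost, with no division involved.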
Once we know where the left edge starts in the vertex list, we can
scan-convert it a line at a time until the bottom vertex is reached. Each
point is stored as the starting X coordinate for the corresponding scan line
in the list we'll pass to DrawHorizontalLineList. The nearest X coordinate on
each scan line that's on or to the right of the left edge is selected. The
last point of each line making up the left edge isn't scan-converted,
producing two desirable effects. First, it avoids drawing each vertex twice;
two lines come into every vertex, but we want to scan-convert each vertex only
once. Second, not scan-converting the last point of each line causes the
bottom scan line of the polygon not to be drawn, as required by our rules. The
first scan line of the polygon is also skipped if the top isn't flat.
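Scanning a single line of the left edge can be sketched as follows. This is
a hypothetical helper, not Listing One's ScanEdge, and it swaps in integer
ceiling division for floating point purely so the results are exact;
ScanLeftEdge, CeilDiv, and XOut are names invented for the illustration.

```c
/* Ceiling of Num / Den, for Den > 0 */
static long CeilDiv(long Num, long Den)
{
   long Quotient = Num / Den;

   if (Num % Den > 0)
      Quotient++;            /* positive remainder: round up */
   return Quotient;
}

/* Stores, for each scan line from Y1 up to but not including Y2, the
   nearest integer X that is on or to the right of the line from
   (X1,Y1) to (X2,Y2). Returns the number of values stored; a line that
   doesn't descend is not an active edge and stores nothing. */
int ScanLeftEdge(int X1, int Y1, int X2, int Y2, int *XOut)
{
   int Y, Count = 0;
   long DeltaX = (long)X2 - X1, DeltaY = (long)Y2 - Y1;

   if (DeltaY <= 0)
      return 0;
   for (Y = Y1; Y < Y2; Y++)       /* the last point, at Y2, is skipped */
      XOut[Count++] = X1 + (int)CeilDiv((Y - Y1) * DeltaX, DeltaY);
   return Count;
}
```

Scanning from (0,0) to (3,6) stores six values, 0, 1, 1, 2, 2, 3, and
leaves the point at Y = 6 for whichever line starts there.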
Now we need to scan-convert the right edge into the ending X coordinate fields
of the line list. This is performed in the same manner as for the left edge,
except that every line in the right edge is moved one pixel to the left before
being scan-converted. Why? We want the nearest point to the left of but not on
the right edge, so that the right edge itself isn't drawn. As it happens,
drawing the nearest point on or to the right of a line moved one pixel to the
left is exactly the same as drawing the nearest point to the left of but not
on that line in its original location. Sketch it out and you'll see what I
mean.
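If sketching doesn't appeal, the equivalence can also be checked by brute
force. In the sketch below the edge's exact X position on a scan line is
the fraction Num / Den, with Den greater than zero; all four function names
are invented for the illustration.

```c
/* Ceiling of Num / Den, for Den > 0 */
static long CeilDiv(long Num, long Den)
{
   long Quotient = Num / Den;

   if (Num % Den > 0)
      Quotient++;
   return Quotient;
}

/* Floor of Num / Den, for Den > 0 */
static long FloorDiv(long Num, long Den)
{
   long Quotient = Num / Den;

   if (Num % Den < 0)
      Quotient--;
   return Quotient;
}

/* Nearest integer on or to the right of the edge after the edge has
   been moved one pixel to the left */
long OnOrRightOfShifted(long Num, long Den)
{
   return CeilDiv(Num - Den, Den);
}

/* Nearest integer to the left of, but not on, the unshifted edge */
long StrictlyLeftOf(long Num, long Den)
{
   if (Num % Den == 0)
      return Num / Den - 1;    /* edge falls exactly on an integer X */
   return FloorDiv(Num, Den);
}
```

An edge at X = 3.5 (Num = 7, Den = 2) yields 3 both ways, and so does an
edge at exactly X = 4, where the right-edge pixel itself must be left out.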
Once the two edges are scan-converted, the whole line list is passed to
DrawHorizontalLineList, and the polygon is drawn.
Finis.


Oddball Cases


Listing One handles zero-length segments (multiple vertices at the same
location) by ignoring them, which will be useful down the road because
scaled-down polygons can end up with nearby vertices moved to the same
location. Horizontal line segments are fine anywhere in a polygon, too.
Basically, Listing One scan-converts between active edges (the edges that
define the extent of the polygon on each scan line) and both horizontal and
zero-length lines are non-active; neither advances to another scan line, so
they don't affect the edges being scanned.


Book of the Month


The book of the month is the second edition of Foley and van Dam's classic
Fundamentals of Interactive Computer Graphics, the inspiration and primary
reference for much of the nonmachine-specific material I'll present in this
column. The almost entirely rewritten new version, retitled Computer Graphics:
Principles and Practice (Addison-Wesley, 1990, $64.50), nearly doubles the
size of the first tome, to a total of 1174 pages. You'll wish it were longer,
too, because computer graphics has become such a broad field that even this
massive book often merely touches on an area, providing the fundamental
concepts, equations, and algorithms, and moves on. Still, just about
everything you could want to know is in there somewhere. Truly a book to lose
yourself in, and highly recommended.


Coming Up Next


This month's code merely demonstrates the principles of filling convex
polygons, and is by no means fast. Next month, we'll spice things up by
eliminating the floating point calculations and pixel-at-a-time drawing and
tossing a little assembly language into the mix.


_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash


[LISTING ONE]

/* Color-fills a convex polygon. All vertices are offset by (XOffset,
 YOffset). "Convex" means that every horizontal line drawn through
 the polygon at any point would cross exactly two active edges
 (neither horizontal lines nor zero-length edges count as active
 edges; both are acceptable anywhere in the polygon), and that the
 right & left edges never cross. (It's OK for them to touch, though,
 so long as the right edge never crosses over to the left of the
 left edge.) Nonconvex polygons won't be drawn properly. Returns 1
 for success, 0 if memory allocation failed */

#include <stdio.h>
#include <math.h>
#ifdef __TURBOC__
#include <alloc.h>
#else /* MSC */
#include <malloc.h>
#endif
#include "polygon.h"

/* Advances the index by one vertex forward through the vertex list,
   wrapping at the end of the list */
#define INDEX_FORWARD(Index) \
   Index = (Index + 1) % VertexList->Length;

/* Advances the index by one vertex backward through the vertex list,
   wrapping at the start of the list */
#define INDEX_BACKWARD(Index) \
   Index = (Index - 1 + VertexList->Length) % VertexList->Length;

/* Advances the index by one vertex either forward or backward through
   the vertex list, wrapping at either end of the list */
#define INDEX_MOVE(Index,Direction)                                  \
   if (Direction > 0)                                                \
      Index = (Index + 1) % VertexList->Length;                      \
   else                                                              \
      Index = (Index - 1 + VertexList->Length) % VertexList->Length;

extern void DrawHorizontalLineList(struct HLineList *, int);
static void ScanEdge(int, int, int, int, int, int, struct HLine **);

int FillConvexPolygon(struct PointListHeader * VertexList, int Color,
 int XOffset, int YOffset)
{
 int i, MinIndexL, MaxIndex, MinIndexR, SkipFirst, Temp;
 int MinPoint_Y, MaxPoint_Y, TopIsFlat, LeftEdgeDir;
 int NextIndex, CurrentIndex, PreviousIndex;
 int DeltaXN, DeltaYN, DeltaXP, DeltaYP;
 struct HLineList WorkingHLineList;
 struct HLine *EdgePointPtr;
 struct Point *VertexPtr;

 /* Point to the vertex list */
 VertexPtr = VertexList->PointPtr;

 /* Scan the list to find the top and bottom of the polygon */
 if (VertexList->Length == 0)
 return(1); /* reject null polygons */
 MaxPoint_Y = MinPoint_Y = VertexPtr[MinIndexL = MaxIndex = 0].Y;
 for (i = 1; i < VertexList->Length; i++) {
 if (VertexPtr[i].Y < MinPoint_Y)
 MinPoint_Y = VertexPtr[MinIndexL = i].Y; /* new top */
 else if (VertexPtr[i].Y > MaxPoint_Y)
 MaxPoint_Y = VertexPtr[MaxIndex = i].Y; /* new bottom */
 }
 if (MinPoint_Y == MaxPoint_Y)
 return(1); /* polygon is 0-height; avoid infinite loop below */

 /* Scan in ascending order to find the last top-edge point */
 MinIndexR = MinIndexL;
 while (VertexPtr[MinIndexR].Y == MinPoint_Y)
 INDEX_FORWARD(MinIndexR);
 INDEX_BACKWARD(MinIndexR); /* back up to last top-edge point */

 /* Now scan in descending order to find the first top-edge point */
 while (VertexPtr[MinIndexL].Y == MinPoint_Y)
 INDEX_BACKWARD(MinIndexL);
 INDEX_FORWARD(MinIndexL); /* back up to first top-edge point */

 /* Figure out which direction through the vertex list from the top
 vertex is the left edge and which is the right */
 LeftEdgeDir = -1; /* assume left edge runs down thru vertex list */
 if ((TopIsFlat = (VertexPtr[MinIndexL].X !=
 VertexPtr[MinIndexR].X) ? 1 : 0) == 1) {
 /* If the top is flat, just see which of the ends is leftmost */
 if (VertexPtr[MinIndexL].X > VertexPtr[MinIndexR].X) {
 LeftEdgeDir = 1; /* left edge runs up through vertex list */
 Temp = MinIndexL; /* swap the indices so MinIndexL */

 MinIndexL = MinIndexR; /* points to the start of the left */
 MinIndexR = Temp; /* edge, similarly for MinIndexR */
 }
 } else {
 /* Point to the downward end of the first line of each of the
 two edges down from the top */
 NextIndex = MinIndexR;
 INDEX_FORWARD(NextIndex);
 PreviousIndex = MinIndexL;
 INDEX_BACKWARD(PreviousIndex);
 /* Calculate X and Y lengths from the top vertex to the end of
 the first line down each edge; use those to compare slopes
 and see which line is leftmost */
 DeltaXN = VertexPtr[NextIndex].X - VertexPtr[MinIndexL].X;
 DeltaYN = VertexPtr[NextIndex].Y - VertexPtr[MinIndexL].Y;
 DeltaXP = VertexPtr[PreviousIndex].X - VertexPtr[MinIndexL].X;
 DeltaYP = VertexPtr[PreviousIndex].Y - VertexPtr[MinIndexL].Y;
 if (((long)DeltaXN * DeltaYP - (long)DeltaYN * DeltaXP) < 0L) {
 LeftEdgeDir = 1; /* left edge runs up through vertex list */
 Temp = MinIndexL; /* swap the indices so MinIndexL */
 MinIndexL = MinIndexR; /* points to the start of the left */
 MinIndexR = Temp; /* edge, similarly for MinIndexR */
 }
 }

 /* Set the # of scan lines in the polygon, skipping the bottom edge
 and also skipping the top vertex if the top isn't flat because
 in that case the top vertex has a right edge component, and set
 the top scan line to draw, which is likewise the second line of
 the polygon unless the top is flat */
 if ((WorkingHLineList.Length =
 MaxPoint_Y - MinPoint_Y - 1 + TopIsFlat) <= 0)
 return(1); /* there's nothing to draw, so we're done */
 WorkingHLineList.YStart = YOffset + MinPoint_Y + 1 - TopIsFlat;

 /* Get memory in which to store the line list we generate */
 if ((WorkingHLineList.HLinePtr =
 (struct HLine *) (malloc(sizeof(struct HLine) *
 WorkingHLineList.Length))) == NULL)
 return(0); /* couldn't get memory for the line list */

 /* Scan the left edge and store the boundary points in the list */
 /* Initial pointer for storing scan converted left-edge coords */
 EdgePointPtr = WorkingHLineList.HLinePtr;
 /* Start from the top of the left edge */
 PreviousIndex = CurrentIndex = MinIndexL;
 /* Skip the first point of the first line unless the top is flat;
 if the top isn't flat, the top vertex is exactly on a right
 edge and isn't drawn */
 SkipFirst = TopIsFlat ? 0 : 1;
 /* Scan convert each line in the left edge from top to bottom */
 do {
 INDEX_MOVE(CurrentIndex,LeftEdgeDir);
 ScanEdge(VertexPtr[PreviousIndex].X + XOffset,
 VertexPtr[PreviousIndex].Y,
 VertexPtr[CurrentIndex].X + XOffset,
 VertexPtr[CurrentIndex].Y, 1, SkipFirst, &EdgePointPtr);
 PreviousIndex = CurrentIndex;
 SkipFirst = 0; /* scan convert the first point from now on */
 } while (CurrentIndex != MaxIndex);

 /* Scan the right edge and store the boundary points in the list */
 EdgePointPtr = WorkingHLineList.HLinePtr;
 PreviousIndex = CurrentIndex = MinIndexR;
 SkipFirst = TopIsFlat ? 0 : 1;
 /* Scan convert the right edge, top to bottom. X coordinates are
 adjusted 1 to the left, effectively causing scan conversion of
 the nearest points to the left of but not exactly on the edge */
 do {
 INDEX_MOVE(CurrentIndex,-LeftEdgeDir);
 ScanEdge(VertexPtr[PreviousIndex].X + XOffset - 1,
 VertexPtr[PreviousIndex].Y,
 VertexPtr[CurrentIndex].X + XOffset - 1,
 VertexPtr[CurrentIndex].Y, 0, SkipFirst, &EdgePointPtr);
 PreviousIndex = CurrentIndex;
 SkipFirst = 0; /* scan convert the first point from now on */
 } while (CurrentIndex != MaxIndex);

 /* Draw the line list representing the scan converted polygon */
 DrawHorizontalLineList(&WorkingHLineList, Color);

 /* Release the line list's memory and we're successfully done */
 free(WorkingHLineList.HLinePtr);
 return(1);
}

/* Scan converts an edge from (X1,Y1) to (X2,Y2), not including the
 point at (X2,Y2). This avoids overlapping the end of one line with
 the start of the next, and causes the bottom scan line of the
 polygon not to be drawn. If SkipFirst != 0, the point at (X1,Y1)
 isn't drawn. For each scan line, the pixel closest to the scanned
 line without being to the left of the scanned line is chosen */
static void ScanEdge(int X1, int Y1, int X2, int Y2, int SetXStart,
 int SkipFirst, struct HLine **EdgePointPtr)
{
 int Y, DeltaX, DeltaY;
 double InverseSlope;
 struct HLine *WorkingEdgePointPtr;

 /* Calculate X and Y lengths of the line and the inverse slope */
 DeltaX = X2 - X1;
 if ((DeltaY = Y2 - Y1) <= 0)
 return; /* guard against 0-length and horizontal edges */
 InverseSlope = (double)DeltaX / (double)DeltaY;

 /* Store the X coordinate of the pixel closest to but not to the
 left of the line for each Y coordinate between Y1 and Y2, not
 including Y2 and also not including Y1 if SkipFirst != 0 */
 WorkingEdgePointPtr = *EdgePointPtr; /* avoid double dereference */
 for (Y = Y1 + SkipFirst; Y < Y2; Y++, WorkingEdgePointPtr++) {
 /* Store the X coordinate in the appropriate edge list */
 if (SetXStart == 1)
 WorkingEdgePointPtr->XStart =
 X1 + (int)(ceil((Y-Y1) * InverseSlope));
 else
 WorkingEdgePointPtr->XEnd =
 X1 + (int)(ceil((Y-Y1) * InverseSlope));
 }

 *EdgePointPtr = WorkingEdgePointPtr; /* advance caller's ptr */
}
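
The edge rule that ScanEdge applies can be tried in isolation. The following is a minimal sketch, not part of the published listing: the helper names CeilDiv, EdgeXAt, SpanStart, and SpanEnd are hypothetical, and integer ceiling division is swapped in for the floating-point ceil() call so the fragment needs no math library. It assumes C99 semantics, where integer division truncates toward zero.

```c
#include <assert.h>

/* Hypothetical helper, not part of Listing One: ceiling of num/den
   for den > 0. Assumes C99 division, which truncates toward zero. */
static long CeilDiv(long num, long den)
{
   long quotient;

   assert(den > 0);
   quotient = num / den;
   if (num % den > 0)
      quotient++;   /* positive remainder: truncation rounded down */
   return quotient;
}

/* X coordinate chosen on scan line Y for the edge (X1,Y1)-(X2,Y2):
   the pixel closest to the edge without being to its left. This is
   the same rule as ScanEdge, computed with integers only. */
static int EdgeXAt(int X1, int Y1, int X2, int Y2, int Y)
{
   assert(Y2 > Y1 && Y >= Y1 && Y < Y2);
   return X1 + (int)CeilDiv((long)(Y - Y1) * (X2 - X1),
                            (long)(Y2 - Y1));
}

/* Left boundary of a span: scanned exactly on the edge */
static int SpanStart(int X1, int Y1, int X2, int Y2, int Y)
{
   return EdgeXAt(X1, Y1, X2, Y2, Y);
}

/* Right boundary of a span: the edge is shifted one pixel to the
   left before scanning, as FillConvexPolygon does for right edges */
static int SpanEnd(int X1, int Y1, int X2, int Y2, int Y)
{
   return EdgeXAt(X1 - 1, Y1, X2 - 1, Y2, Y);
}
```

For an edge shared by two polygons, SpanEnd for the polygon on the left is always one less than SpanStart for the polygon on the right, so every pixel along the shared edge is drawn exactly once.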






[LISTING TWO]


/* Draws all pixels in the list of horizontal lines passed in, in
 mode 13h, the VGA's 320x200 256-color mode. Uses a slow pixel-by-
 pixel approach, which does have the virtue of being easily ported
 to any environment. */

#include <dos.h>
#include "polygon.h"

#define SCREEN_WIDTH 320
#define SCREEN_SEGMENT 0xA000

static void DrawPixel(int, int, int);

void DrawHorizontalLineList(struct HLineList * HLineListPtr,
 int Color)
{
 struct HLine *HLinePtr;
 int Y, X;

 /* Point to the XStart/XEnd descriptor for the first (top)
 horizontal line */
 HLinePtr = HLineListPtr->HLinePtr;
 /* Draw each horizontal line in turn, starting with the top one and
 advancing one line each time */
 for (Y = HLineListPtr->YStart; Y < (HLineListPtr->YStart +
 HLineListPtr->Length); Y++, HLinePtr++) {
 /* Draw each pixel in the current horizontal line in turn,
 starting with the leftmost one */
 for (X = HLinePtr->XStart; X <= HLinePtr->XEnd; X++)
 DrawPixel(X, Y, Color);
 }
}

/* Draws the pixel at (X, Y) in color Color in VGA mode 13h */
static void DrawPixel(int X, int Y, int Color) {
 unsigned char far *ScreenPtr;

#ifdef __TURBOC__
 ScreenPtr = MK_FP(SCREEN_SEGMENT, Y * SCREEN_WIDTH + X);
#else /* MSC 5.0 */
 FP_SEG(ScreenPtr) = SCREEN_SEGMENT;
 FP_OFF(ScreenPtr) = Y * SCREEN_WIDTH + X;
#endif
 *ScreenPtr = (unsigned char)Color;
}
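
Because the line list is independent of the display hardware, the same drawing step can target an ordinary memory buffer. The sketch below is an illustration under stated assumptions rather than part of the published code: the function name DrawHorizontalLineListToBuffer and the FB_WIDTH and FB_HEIGHT constants are hypothetical, the two structures are repeated from POLYGON.H so the fragment stands alone, and the clipping is an addition that Listing Two does not perform.

```c
#include <assert.h>
#include <string.h>

#define FB_WIDTH  320   /* assumed buffer width in pixels  */
#define FB_HEIGHT 200   /* assumed buffer height in pixels */

/* Repeated from POLYGON.H so this fragment is self-contained */
struct HLine {
   int XStart;   /* X coordinate of leftmost pixel in line  */
   int XEnd;     /* X coordinate of rightmost pixel in line */
};
struct HLineList {
   int Length;              /* # of horizontal lines        */
   int YStart;              /* Y coordinate of topmost line */
   struct HLine * HLinePtr; /* pointer to list of lines     */
};

/* Hypothetical variant of DrawHorizontalLineList: fills each span
   into a caller-supplied buffer of FB_WIDTH * FB_HEIGHT bytes, one
   byte per pixel, with a single memset() per scan line. Spans are
   clipped to the buffer (an addition; Listing Two does not clip). */
static void DrawHorizontalLineListToBuffer(
   const struct HLineList * HLineListPtr, int Color,
   unsigned char * Buffer)
{
   const struct HLine *HLinePtr;
   int i, Y, XStart, XEnd;

   assert(HLineListPtr != NULL && Buffer != NULL);
   HLinePtr = HLineListPtr->HLinePtr;
   for (i = 0; i < HLineListPtr->Length; i++, HLinePtr++) {
      Y = HLineListPtr->YStart + i;
      XStart = HLinePtr->XStart < 0 ? 0 : HLinePtr->XStart;
      XEnd = HLinePtr->XEnd >= FB_WIDTH ? FB_WIDTH - 1
                                        : HLinePtr->XEnd;
      if (Y < 0 || Y >= FB_HEIGHT || XStart > XEnd)
         continue;   /* off the buffer, or an empty span */
      memset(Buffer + (long)Y * FB_WIDTH + XStart, Color,
             (size_t)(XEnd - XStart + 1));
   }
}
```

Filling a whole span per call rather than a pixel at a time is the usual first step toward a faster low-level routine; the scan-conversion code in Listing One does not need to change.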







[LISTING THREE]

/* Sample program to exercise the polygon-filling routines. This code
 and all polygon-filling code has been tested with Turbo C 2.0 and
 Microsoft C 5.0 */

#include <conio.h>
#include <dos.h>
#include "polygon.h"

/* Draws the polygon described by the point list PointList in color
 Color with all vertices offset by (X,Y) */
#define DRAW_POLYGON(PointList,Color,X,Y) \
 Polygon.Length = sizeof(PointList)/sizeof(struct Point); \
 Polygon.PointPtr = PointList; \
 FillConvexPolygon(&Polygon, Color, X, Y);

void main(void);
extern int FillConvexPolygon(struct PointListHeader *, int, int, int);

void main() {
 int i, j;
 struct PointListHeader Polygon;
 static struct Point ScreenRectangle[] =
 {{0,0},{320,0},{320,200},{0,200}};
 static struct Point ConvexShape[] =
 {{0,0},{121,0},{320,0},{200,51},{301,51},{250,51},{319,143},
 {320,200},{22,200},{0,200},{50,180},{20,160},{50,140},
 {20,120},{50,100},{20,80},{50,60},{20,40},{50,20}};
 static struct Point Hexagon[] =
 {{90,-50},{0,-90},{-90,-50},{-90,50},{0,90},{90,50}};
 static struct Point Triangle1[] = {{30,0},{15,20},{0,0}};
 static struct Point Triangle2[] = {{30,20},{15,0},{0,20}};
 static struct Point Triangle3[] = {{0,20},{20,10},{0,0}};
 static struct Point Triangle4[] = {{20,20},{20,0},{0,10}};
 union REGS regset;

 /* Set the display to VGA mode 13h, 320x200 256-color mode */
 regset.x.ax = 0x0013; /* AH = 0 selects mode set function,
 AL = 0x13 selects mode 0x13
 when set as parameters for INT 0x10 */
 int86(0x10, &regset, &regset);

 /* Clear the screen to cyan */
 DRAW_POLYGON(ScreenRectangle, 3, 0, 0);

 /* Draw an irregular shape that meets our definition of convex but
 is not convex by any normal description */
 DRAW_POLYGON(ConvexShape, 6, 0, 0);
 getch(); /* wait for a keypress */

 /* Draw adjacent triangles across the top half of the screen */
 for (j=0; j<=80; j+=20) {
 for (i=0; i<290; i += 30) {
 DRAW_POLYGON(Triangle1, 2, i, j);
 DRAW_POLYGON(Triangle2, 4, i+15, j);
 }
 }

 /* Draw adjacent triangles across the bottom half of the screen */
 for (j=100; j<=170; j+=20) {
 /* Do a row of pointing-right triangles */
 for (i=0; i<290; i += 20) {
 DRAW_POLYGON(Triangle3, 40, i, j);
 }
 /* Do a row of pointing-left triangles halfway between one row
 of pointing-right triangles and the next, to fit between */
 for (i=0; i<290; i += 20) {
 DRAW_POLYGON(Triangle4, 1, i, j+10);
 }
 }
 getch(); /* wait for a keypress */

 /* Finally, draw a series of concentric hexagons of approximately
 the same proportions in the center of the screen */
 for (i=0; i<16; i++) {
 DRAW_POLYGON(Hexagon, i, 160, 100);
 for (j=0; j<sizeof(Hexagon)/sizeof(struct Point); j++) {
 /* Advance each vertex toward the center */
 if (Hexagon[j].X != 0) {
 Hexagon[j].X -= Hexagon[j].X >= 0 ? 3 : -3;
 Hexagon[j].Y -= Hexagon[j].Y >= 0 ? 2 : -2;
 } else {
 Hexagon[j].Y -= Hexagon[j].Y >= 0 ? 3 : -3;
 }
 }
 }
 getch(); /* wait for a keypress */

 /* Return to text mode and exit */
 regset.x.ax = 0x0003; /* AL = 3 selects 80x25 text mode */
 int86(0x10, &regset, &regset);
}






[LISTING FOUR]

/* POLYGON.H: Header file for polygon-filling code */

/* Describes a single point (used for a single vertex) */
struct Point {
 int X; /* X coordinate */
 int Y; /* Y coordinate */
};

/* Describes a series of points (used to store a list of vertices that
 describe a polygon; each vertex is assumed to connect to the two
 adjacent vertices, and the last vertex is assumed to connect to the
 first) */
struct PointListHeader {
 int Length; /* # of points */
 struct Point * PointPtr; /* pointer to list of points */
};

/* Describes the beginning and ending X coordinates of a single
 horizontal line */
struct HLine {
 int XStart; /* X coordinate of leftmost pixel in line */
 int XEnd; /* X coordinate of rightmost pixel in line */
};

/* Describes a Length-long series of horizontal lines, all assumed to
 be on contiguous scan lines starting at YStart and proceeding
 downward (used to describe a scan-converted polygon to the
 low-level hardware-dependent drawing code) */
struct HLineList {
 int Length; /* # of horizontal lines */
 int YStart; /* Y coordinate of topmost line */
 struct HLine * HLinePtr; /* pointer to list of horz lines */
};
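
As an illustration of how the Point structure feeds the left/right edge decision in Listing One when the top of the polygon is not flat, the slope comparison can be isolated into a small function. This is a sketch only: the name NextEdgeIsLeft is hypothetical, and struct Point is repeated from the header above so the fragment stands alone. Screen coordinates are assumed, with Y increasing downward.

```c
#include <assert.h>

/* Repeated from POLYGON.H so this fragment is self-contained */
struct Point {
   int X;   /* X coordinate */
   int Y;   /* Y coordinate */
};

/* Hypothetical helper: given the top vertex and the vertices that
   follow and precede it in the vertex list, returns 1 if the edge
   toward Next lies to the left of the edge toward Prev, else 0.
   It uses the same cross-product comparison as FillConvexPolygon,
   which avoids dividing to compute the two slopes. */
static int NextEdgeIsLeft(struct Point Top, struct Point Next,
                          struct Point Prev)
{
   long DeltaXN = (long)Next.X - Top.X;
   long DeltaYN = (long)Next.Y - Top.Y;
   long DeltaXP = (long)Prev.X - Top.X;
   long DeltaYP = (long)Prev.Y - Top.Y;

   /* Top must be a topmost vertex, so neither edge runs upward */
   assert(DeltaYN >= 0 && DeltaYP >= 0);
   return (DeltaXN * DeltaYP - DeltaYN * DeltaXP) < 0L;
}
```

When the function returns 1, the left edge runs forward through the vertex list, which corresponds to the case where Listing One sets LeftEdgeDir to 1.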











































February, 1991
PROGRAMMER'S BOOKSHELF


Typography for the Rest of Us




Ray Duncan


Peter Norton is a talented self-merchandiser, but once in a while even Peter
misses the mark. The Norton Disk Doctor advertisement, which featured The
Peter in one of his pastel shirts with a stethoscope draped rakishly around
his neck, skirted some ancient societal taboos, but those of us who actually
cross over into the medical world found it amusing rather than obnoxious
because it was so ignorant. In the Byzantine world of medicine, there are many
subtle signals and class distinctions associated with even so commonplace an
article as a stethoscope. Not only is the "scope around the neck" style a
hallmark of insecure interns, but the prospect of being seen in public with
the $5 plastic stethoscope in Peter's outfit would gag any self-respecting
physician, nurse, or medical student.
Anyone can walk into a medical supply store and buy a stethoscope, but an
"appropriate" stethoscope can only be selected with the aid of unwritten rules
and tradition of the medical subculture, while interpretation of the data you
can acquire with a stethoscope requires years of study and experience.
Self-evident, you may say, but consider some cases that may hit a little
closer to home. Many purchasers of painting or drawing programs have
discovered, to their regret, that the acquisition didn't make them any more
creative; true artists can work wonders with these programs, but the rest of
us only produce computerized scrawls. Similarly, possession of a computer, a
WYSIWYG word processor, a laser printer, and a battery of fonts does not
magically turn one into a graphics designer or typographer--but while society
is protected from ersatz physicians by law and from nonartists by visual
common sense, our defenses against brain-damaged desktop publishers are much
flimsier.
We've all found ourselves on the receiving end of memos, newsletters, and
manuals, produced with personal computers, that would easily qualify as
felonious acts in the world of "real" publishing. At one extreme, we've got
the Sir Edmund P. Hillarys of word processing who jam each page with as many
different typefaces as possible "because they are there." At the other
extreme, we've got aberrations like the 1000-bed hospital I work in, Los
Angeles's Cedars-Sinai Medical Center. Cedars-Sinai owns scores of PCs and
Hewlett Packard LaserJets, but views them only as typewriter replacements--all
documents are printed in 12 point Courier. Somewhere in between, we've got the
villains who use fancy fonts and "chartjunk" to disguise lack of content--a
practice also known as "garbage in, gospel out."
Typography is a subtle art and is not intuitive (at least for most of us);
consciousness-raising is needed. Digital Typography: An Introduction to Type
and Composition for Computer System Design summarizes its own premise with
these words:
The ability to create typeset-quality documents easily may well change
people's attitudes about type and printing. For example, a merchandising
executive relates how he can no longer gauge the metaqualities of memos. When
memos were typed on conventional typewriters, he could hold the document up to
the light to see how much correction fluid had been expended. The more
corrections, the less the sender must have cared, or else the memo would have
been retyped. With computers, only the uncorrected errors show. For this
executive, word processing has meant the loss of some information that used to
come with his interoffice mail....
A similar situation arises out of expectations about the quality of printing
itself. Through experience, most people associate typesetting with a
good-quality document. After all, much care, attention, and revision are
required to put a document into typeset form in a book, journal, or magazine.
But now, memos, drafts of articles, and business letters enter homes and
offices in near-typeset form. Unconsciously, people attribute quality to the
content, based on the form.
The reverse also happens. Non-specialists with no flair whatsoever for
typographic design are designing and producing documents that have worse form
than content. Our high expectations for graphic quality from commercial
printing will be brought to bear on everything printed. The same thing has
happened with film and television. We are so used to a high level of
production quality in movies that even Grade B movies must rise to this level.
Amateur productions can scarcely be shown on television today unless they meet
minimum professional standards of production quality.
Digital Typography addresses itself to four major topics in turn. It begins
with a history of letterforms and the technologies for their production,
ranging from the clay tablets of yore to the digital typesetters, laser
printers, and CRTs of today--explaining along the way the origins and meanings
of many common publishing and typesetting terms, such as leading, fonts,
typefaces, points, and joins. With this groundwork laid, we're introduced to
the bewildering variety of typeface designs and taught how to categorize them,
with many beautiful illustrations as examples. The section on letterforms
concludes with a fascinating discussion of what typographers have learned
about reading over the centuries, and how intercharacter spacing, the presence
or absence of serifs, and other qualities interact with the structure of the
eye and the characteristics of the visual system to affect legibility.
The author then turns his attention to computer output devices, again building
his discussion from fundamental concepts: pixels, resolution, aspect ratio,
and so on. He takes us on a whirlwind tour of all the major types of output
devices currently in use, contrasting their capabilities, limitations,
underlying technology, and suitability for typography. Digital font designers
must be eclectic as well as artistic, because the physics of a device can have
a striking influence on the appearance of a font. For example, the "soft
edges" of CRT pixels can be exploited to make the joins in letterforms appear
smoother, while the effects of drum polarity, charge leaks, and other
electrostatic phenomena can be brought to bear in laser printer fonts to
produce more subtle curves, more pleasing characters.
The third topic is the creation of graceful digital letterforms, a job that
has proved more difficult than anyone ever expected. When Donald Knuth, a
wizard of computer science if ever there was one, turned his attention to
computer-based publishing, he knocked out his first digital typeface and
typesetting program in a few months, only to find that he had just scratched
the surface of what he had hoped to accomplish. Knuth ended up spending a full
decade on his TeX and METAFONT programs before returning to work on volume 4
of The Art of Computer Programming. Some early digital typefaces were scanned
in directly from existing physical fonts and then "tuned up" with a pixel
editor, and this strategy is still used on occasion, but the currently
preferred approach is to store typefaces as algorithms or as generalized
mathematical descriptions called outlines. The associated software
technologies have become the foundation for a lucrative and hotly competitive
industry; the recent squabbles between IBM, Microsoft, Apple, and Adobe over
the relative merits of PostScript and TrueType give us some idea of the
stakes.
The final portion of the book, and the largest, is devoted to page layout and
document design: the organization of paragraphs and columns; the use of
figures and grids; the importance of white space, margins, hyphenation, line
lengths, and pagination; the development of document specifications and
templates; and the general problem of matching screen, paper, and
expectations. As in the rest of the book, many beautiful illustrations and
examples are used to great advantage here. Document interchange formats, page
description languages, and automatic layout tools are also surveyed briefly.
There's some interesting history in this section amid the huge amount of
useful advice. For example, we learn of the Bravo editor, developed on the
Alto at Xerox PARC circa 1978, which pioneered the treatment of words,
sentences, and paragraphs as objects, and controlled the formatting of each
object type through interactive windows--normally hidden--called property
sheets. When Charles Simonyi migrated from PARC to Redmond, Washington, this
concept went with him, and it resurfaced as style sheets, one of the
fundamental features of Microsoft Word.
Although I've probably made this book sound rather esoteric, I recommend it
without reservation to everyone who uses a word processor. Most of us have
tools at our fingertips that would have made publishers of 50 years ago weep
for joy; consequently, we owe it to ourselves to become familiar with at least
the basic issues and principles of typography, so that--like the layman who
takes a course in CPR--we do no harm, and we know when to call for help.
Fortunately, Digital Typography turns this task into a pleasure; it's a
genuine little gem of technical writing. Reading this book is like sitting by
the fireplace with a wise old professor, as he chats about his field over a
glass of fine wine. Rubinstein's style is informal, yet exceptionally lucid,
and his discussions range over a marvelous variety of disciplines without ever
losing focus. At the end, you'll qualify as an informed amateur--you'll be
sensitized to the aesthetics of typography, and you'll look at books and
magazines (and your own productions) with a new and critical eye. If you
should wish to educate yourself further, Rubinstein includes a 14-page
annotated bibliography that includes both the classic works on typography and
the most important recent papers and research studies.





































February, 1991
OF INTEREST





A data compression product from Stac Electronics, called Stacker, is now
available for IBM PCs and compatibles. Stac Electronics claims that Stacker
can double disk storage capabilities. The product is available as software for
laptop, notebook, and micro channel-based computers; as an add-in board plus
software for PCs, XTs, ATs, and compatibles; and in coprocessors for OEMs.
Stacker provides realtime, lossless compression at ratios ranging from 2:1 to
15:1.
DDJ talked with Gregory Thomas of Southern California Edison, a beta tester
for the product. "It works real well. There are two different ways you can use
it--one way is called free space, and it sets up a virtual hard disk and
creates another drive. The other way is called incremental, and it takes a
full disk and compresses it, also to create another drive. I've used it on
several machines, and it works with Windows. It compresses text files better
than spreadsheet files, though. With spreadsheet files the compression ratio
is something like 1.7:1."
The compression and decompression process is continuous and transparent to users.
Stacker is a 30 Kbyte program you can load into high memory with a memory
manager utility. The coprocessor card fits into an ISA expansion slot and
requires no switching. The software-only version sells for $129 and comes with
a $100 coupon for the coprocessor board. Board and software together sell for
$229. Reader service no. 20.
Stac Electronics 5993 Avenida Encinas Carlsbad, CA 92008 619-431-7474
Bloomsbury Software Group is now shipping the ydb symbolic debugger for yacc
grammars and parsers. A grammar development tool, ydb offers interactive
environments for grammar debugging and parser generation on Unix systems.
The company claims it is 100 percent backward-compatible with yacc, so it can
be used for debugging yacc grammars or for creating more flexible parsers than
yacc can generate. ydb has been used to help with research in grammars for
image recognition.
Ronnie Kon, project manager at Mindcraft of Palo Alto, California, told DDJ
"I wish I'd had it sooner--it could have saved me a lot of time on a project I
was working on. It has a friendly way of specifying how you want a conflict in
syntax resolved. Yacc is bad about giving debugging help; so ydb sits on top
of the grammar and interprets it. There's nothing else like it that I'm aware
of."
A set of tools for producing grammars at translate time is included, as is
debugging control of an operating parser at runtime. It also gives you the
ability to trace parser actions and set breakpoints at particular rules or
other points in the parse. The company is marketing ydb to developers and
maintainers of compilers, command interpreters, and front ends, as well as to
educators who teach compiler and parser courses, and to researchers who use
experimental grammars for pattern recognition. ydb is now available for Sun 3,
Sun 4, and DECstation computers, and is being ported to others. Prices start
at $1,250 for a single CPU license. Academic and quantity discounts are
available. Reader service no. 22.
Bloomsbury Software Group Inc. P.O. Box 390018 Mountain View, CA 94039
415-964-3486
libhpgl.lib, version 5, is a 32-bit 386 protected-mode extended graphics
library from Gary R. Olhoeft. The library supports graphics direct to hardware
for IBM standard modes EGA/VGA/MGA/8514, VESA/SVGA, Hercules Graphics Station
Card GB1024, Truevision ATVista-4M, and WYSIWYG hardcopy to HP-GL and
PostScript devices or files.
libhpgl.lib supports mixed-vector plotting and raster imaging, graphics
viewports (partial screen windowing), user unit scaling, rotatable and
scalable labels, and more, up to 1024 x 768 with 8-, 16-, or 32-bit graphics
cards (256, 32,768, or 16,777,216 colors).
The library comes with full source code in MicroWay NDP C-386 and Phar Lap
386/ASM, extensive example code and manual, free upgrades for one year, and no
royalties. libhpgl.lib is priced at $200. Reader service no. 23.
Gary R. Olhoeft P.O. Box 10870 Edgemont Golden, CO 80401-0620 303-279-6345
If you are developing device drivers for multiprocessing under SCO Unix, you
might be interested in the technology Corollary has developed for
multithreading serial I/O device drivers under SCO MPX, the multiprocessing
version of Unix for running off-the-shelf applications and serial I/O device
drivers. Corollary will soon ship a multithreaded version of the device driver
for its 8x4 and 8x2 multiport boards. The company expects this technology will
yield higher performance from SCO Unix-based multiprocessor systems that
utilize multiport boards to support multiple users. Reader service no. 24.
Corollary Inc. 17881 Cartwright Rd. P.O. Box 18977 Irvine, CA 92714
714-250-4040
A grey-scale and full-color image compression software package is available
for PCs from Xing Technology. VT-Compress is a complete implementation of the
ISO/CCITT JPEG standard for still image compression and transmission. JPEG is
an international group of data compression experts formed in 1986 by ISO and
CCITT. Their task has been to reduce image storage requirements and
transmission times. The JPEG Baseline Algorithm uses "lossy" compression
techniques to remove redundant information from the digital storage of
grey-scale and full-color images. The company claims that VT-Compress can
compress at ratios of up to 100:1, and that images compressed at ratios of 8:1
to 20:1 are for the most part indistinguishable from the originals.
VT-Compress optimizes the use of registers and instructions of the 80286 and
80386 microprocessors, and is priced at $179. Reader service no. 25.
Xing Technology Corp. P.O. Box 950, 456 Carpenter Canyon Arroyo Grande, CA
93420 805-473-0145
JAM (JYACC Application Manager) version 5 is the centerpiece of JYACC's
product family for prototyping and producing applications that are hardware,
operating system, database, and network independent. This new version of JAM
has been optimized for PC use with full mouse support, extended graphical
character sets, use of foreground/background color, and line drawing.
Other new features include virtual screens, viewports, radio
buttons, checklists, and scroll bars. These features give character-based
applications a graphical look-and-feel without added memory or hardware costs.
8-bit internationalization allows JAM developers to create applications for
use with languages that are read from left to right and that use characters
represented in 8 bits of information. Screen editor facilities include
clipboards and block, move, and copy for manipulating data objects in a form.
Used with JAM/DBi, JAM applications can be seamlessly linked to 11 relational
databases. Prices are $595 for DOS, $1,350 for OS/2, and $1,950 for Unix/386.
Call for pricing on other platforms. Reader service no. 26.
JYACC 116 John St. New York, NY 10038 212-267-7722
The Real Time Consortium (RTC) has been formed from realtime operating system
and kernel vendors Ready Systems, Wind River, Lynx Realtime Systems, and
Eyring Research Institute, and from high-performance computer board vendors
Heurikon and FORCE Computers. RTC seeks to support existing global standards
within industry organizations and to develop new standards at the appropriate
hardware interface and application level. RTC was formed in response to the growing demand
to establish interoperability between different hardware products and various
realtime kernels and operating systems.
The first project of RTC's Technical Committee was to develop a preliminary
specification for an Open, Basic Input Output System (OBIOS). DDJ spoke with
Greg Buzzard of Ready Systems, who is chairman of that committee. He said, "As
drivers become more complex and as the number of drivers grows the problem
worsens. The OBIOS implementation will provide device-dependent parts of a
device driver. It will be the interface between the operating system part of
the code and the physical devices. Instead of targeting specific I/O devices,
board and software vendors will target the OBIOS."
The OBIOS standard's physical I/O interface cuts the time necessary to program
device drivers. The intent is for the OBIOS to eliminate the need to
continuously rewrite device driver modules for each combination of hardware
and operating systems or realtime kernels.
RTC seeks technical input and design commitments from chip, hardware, and
software companies, as well as from users throughout the industry. Buzzard
said, "Our goal is to go to IEEE ballot by the first quarter of 1991. We
expect to have a draft ready by January that the working group is willing to
accept public comment on." If you would like to obtain the draft, you may
download it from the Internet via anonymous file transfer protocol (FTP) from the
RTC directory at gate.ready.com. If you wish to write to the RTC, contact
Shohat & Kahn PR 1875 Winchester Blvd., Ste. 203 Campbell, CA 95008
408-379-7434
ARC+Plus 7.1, a file compression program formerly available as shareware, has
been revised and is supposedly now even easier to use. SEA (System Enhancement
Associates) has upgraded ARC+Plus, which features pull-down menus, mouse
support, backup options, auto-self-extraction, portability of files to other
operating systems, and tight compression.
Former shareware users can upgrade for $34.95 by sending in the system disk.
The retail price for ARC+Plus 7.1 is $89.95. SEA is seeking retail
distribution because of the cost involved in publishing and upgrading a
program and supporting its customers. Reader service no. 29.
SEA 925 Clifton Ave. Clifton, NJ 07013 201-473-5153


























February, 1991
SWAINE'S FLAMES


The Eye of the Beholder




Michael Swaine


It was in the early days of the Age of Universal Hypermedia. It was a time of
conflicting values, short attention spans, and great special effects. It was
the People vs. The Masters of the Universe.
"Mr. Scopes, whyn't you tell the Court how you boys work?" Broadway Tommy
asked his client, and winced on seeing Judge Huffman checking out Scopes's
big-ticket threads.
"Well, Bit, Blit, and Scuzz run agents on Collecon to gather the text, video,
and sound bites--"
"Collecon -- that's insider's talk for UKB, the Universal Knowledge Base, am I
right?"
"It's the term we use. Then Steve, that's Steve Clone, our chunker, reduces
the granularity -- how do I say it? He decides how small to slice the
information. I put in the links."
"So they gather the materials and cut 'em up, and you tie 'em back together,
huh?"
"It's more that I create the links to let the reader tie something together.
The reader is the final collaborator in all of our work." Better stop there,
Tommy thought. That idea of collaboration was enough concept for the jury to
chew on for a while. He turned Scopes over to Jennings, hoping the kid could
survive what the legendary prosecutor would throw at him.
"Mr. Scopes," Jennings began, "Why do you call yourselves Masters of the
Universe?"
"That's the group's name. The idea is, we build a universe for the reader to
explore."
"Indeed. Let's have no false modesty. And can that exploration lead in obscene
directions?"
"I'd have to know what you mean by obscene," Scopes answered, shrugging.
"I'll make it easy, Mr. Scopes. Applying your own standards for obscenity --
you do have standards, I trust? -- could your work be explored in a way that
leads to an obscene result?"
Tommy jumped up. "Objection, Your Honor. The prosecutor's tone --"
"-- is not material, Counselor. Overruled."
"But the Criminal Code --"
"In this court, Counselor, we go by the Huffman Code. Witness will answer the
question."
Scopes looked blandly at Jennings. "You mean in the same way that a
cut-and-paste job on one of your speeches could produce something you would
find obscene, Mr. Jennings? I guess so." The kid's good, to score one off
Jennings, Tommy thought.
And that was not the last point for the Home Team. Later, Jennings was
questioning an art critic. "Is there anything about a picture," he asked,
"that can make it obscene on the face of it?"
Tommy was on his feet before the witness could answer. "If it is the
prosecutor's intention to address the question of obscenity of my clients'
research materials, he's outa line. A legal ruling already exists. These
materials, by virtue of being drawn exclusively from UKB, have passed
government censors and are statutorily not obscene, and this testimony is
bogus."
Huffman sustained the objection: Another one for us, Tommy thought. This is
almost too easy.
"We're walking all over him," one of the boys said at dinner that night.
"Don't get smug," Tommy told him. "Jennings is one slick dude."
"Couldn't prove it by me."
Tommy wondered about that. Jennings did seem to be blowing it. Did he have
some kinda trick up his sleeve? But there was nothing left but the summations
in the morning, and Tommy was prepared. His defense didn't rest on the
obscenity issue -- let Jennings stir 'em up with that -- but just on
responsibility. The promise of hypermedia is Reader Control. More reader
control means less artist control means less artist responsibility. How could
the artist control which links the reader follows? It was a no-brainer and he
was sure the jury had got it.
The Not Guilty verdict came in right after lunch.
It wasn't until he and the boys were fighting through the crowd of reporters
and fans outside and he saw Jennings waiting by their limo, smiling, that
Tommy knew he had been had.
"This is what you wanted, isn't it?" Tommy asked Jennings when they reached
the car.
Jennings laughed. "I am glad that you caught on -- and that you didn't catch
on sooner."
Scopes wanted to know what they were talking about.
"Jennings suckered me," Tommy answered. "He played along in order to establish
a precedent."
Jennings smiled. "You did make my point nicely. If reading is an active
process, then the reader may be, not an innocent victim of obscenity, but
rather a collaborator or even a perpetrator."
Slipping into the limo, Scopes shouted, "What are you going to do, indict our
readers?"
Jennings glared at the crowd. "Perhaps a few, but I have bigger fish in mind."
"I'm sorry," Tommy told Scopes through the window as the limo began to roll
away. "You're artistic middlemen, Scopes. You read to write. As writers,
you're cleared, but treating you as readers turns all my arguments inside out.
He's still after you, but this time you're gonna be charged with reading an
obscene work into existence."















March, 1991
EDITORIAL


Yes, You Can Make a Difference




Jonathan Erickson


The article "Software Patents" (DDJ, November 1990) created quite a stir. But
then, we expected it would. What we didn't anticipate was the passion of your
reactions to the article -- both on the part of those of you who agreed with
the League for Programming Freedom's position and those who didn't.
Among those who agreed with the League's position -- and were moved to action
-- was the reader at a large corporation who, at the urging of his superiors,
applied for a patent on a rather fundamental windowing scheme. He sent us a
description of the technique, imploring us to publish it and establish prior
art, thereby possibly blocking his own patent application. Unfortunately,
what's done is done and our publicizing the method wouldn't halt the patent
process.
On the other hand, we heard from folks like Paul Heckel, creator of the card
and rack metaphor (and its card and stack subset) developed and patented long
before Apple released HyperCard. After years of anguish on Paul's part, Apple
finally acknowledged his inventions and licensed his patents. The patent
process protected Paul by not allowing a large corporation to play Goliath to
his David.
But the tremors weren't felt at DDJ alone. At the article's behest, many of
you also wrote the U.S. House of Representatives' Subcommittee on Intellectual
Property. And rarely, according to the people in Washington I talked to, has
the Subcommittee received the quantity -- and quality -- of mail the article
generated.
To give you an idea: One measure of the quality of mail received by
Congressional committees is whether the letters are standardized form letters
or individual letters. Every letter sent by DDJ readers in response to the
November article was a carefully thought out, multipage, personal letter
written by an individual. As a Subcommittee spokesperson told me, "people
wrote because they were concerned." He went on to say that "you've certainly
hit a nerve here."


TelePath Update


On February 1 of this year, TelePath, DDJ's online companion, became a free
service available via direct dial. You can access the system at any time
without hassling with credit cards or packet-switched networks.
In addition to the source code in DDJ (and other M&T magazines), you'll find
source code libraries and ongoing discussions on subjects ranging from
object-oriented programming to database benchmarks.
The communications parameters are 1200/2400 baud, 8 data bits, no parity, 1
stop bit, and the direct dial number is 415-364-8315.


386BSD Availability


Nor were we surprised at the flood of inquiries regarding the status and
availability of 386BSD. In fact, the general response can be summed up in two
words -- more now!
According to Bill and Lynne Jolitz, the current status of the project is that
386BSD has been merged into the 4.3BSD-Reno version of the Berkeley Software
Distribution, a work-in-progress version of the upcoming 4.4BSD release. Due
to the research orientation of BSD, updated versions are made available from
the University of California at Berkeley Computer Systems Research Group
(CSRG) at scheduled intervals. What follows is the current official statement
by the CSRG on this matter:
The 386BSD support will be available in February as part of an update of the
1989 Networking Release distribution. One very important fact to remember is,
that although the 386BSD support itself is freely redistributable, much of the
rest of the operating system and utilities require proprietary source
licenses. Therefore, the February distribution will NOT be a complete system
and cannot be booted or run on a 386 machine. This distribution will only
require a Berkeley license and distribution fee. Previous fees were
approximately $500, but the actual fee has not yet been determined.
The 4.4BSD release is scheduled for the middle of 1991, and additional, freely
redistributable support will be made available at that time.
Since the CSRG is a research group, not all calls or e-mail can be promptly
answered, so we'll try to keep you updated here in the magazine. CSRG can be
contacted via e-mail at bsd.dist@ucbarpa.berkeley.edu, and e-mail is preferred
for complete and accurate updates. If you have any specific questions or
comments regarding 386BSD, Bill and Lynne ask that you contact them directly
via e-mail at william@berkeley.edu or at uunet!william.

























March, 1991
LETTERS







Birds of a Feather


Dear DDJ,
I am very glad that DDJ tries to inform the "ordinary" programmer about recent
developments in connectionism. Generally these articles are very informative.
In Michael Swaine's "Programming Paradigms" column "Neural Nets: A Cautionary
View" (November 1990), though, I read some things I disagree with.
Swaine says that Fodor and Pylyshyn's (F&P) critique of neural nets is
relevant to the potential of neural nets as a programming tool. This is simply
not true. F&P's critique could be of some importance for the assessment of
neural nets as a psychological model (although I would argue on that, too). As
a programming tool, neural nets could be of great importance (and surely they
will be) even if they fail as psychological models, and that I doubt. I don't
think it is necessary for neural nets to model some "real" psychological
process or structure to be a good programming tool. Take, for example,
learning neural nets. Most implement the so-called "backpropagation rule," a
learning rule that is surely not to be found in real brains. The point is:
Backprop works (though I can think of better learning rules). Nobody would
think of jet propulsion as a bad means to fly, just because it doesn't function
the way the wings of a bird do, so why should neural nets model nature?
I also disagree with the notion that symbolic processing is really necessary
for neural nets to be truly relevant models of psychological phenomena. On the
contrary, I believe that the processing of language, for example, could be
implemented subsymbolically. This kind of representation, embodied in the
connections (not the nodes) as weight and spike frequency (with some
synchronization to realize attentional processes), comes very close to what we
know about representation and processing of knowledge in the brain. Then only
the input or output has to be symbolic. F&P's critique is surely relevant for
older neural nets, but current research concentrates on modular
self-organizing neural nets (no backprop, sigh) with more sophisticated
connections, and these neural nets won't have the weaknesses of most of the
original ones.
My conclusion is that there will be two (loosely related) mainstreams of
connectionism: the engineering/programmers' connectionism dedicated to real
world applications (with no psychological relevance -- like expert systems)
and the psychological connectionism. It is very likely that future
connectionists will have to choose between jet propulsion and the wings of a
bird....
Christian van Hoven
The Netherlands


Tracing Ray Tracing


Dear DDJ,
I would like to commend Dan Lyke on his article on ray tracing. It was
understandable and interesting. I have been interested in the graphical
aspects of computer programming since I first started programming in C, three
years ago. Before this change in perspective, I had been programming on
mainframes and minis in Fortran. (What a difference a language makes!)
I took a computer graphics course during my graduate studies to flesh out my
self-taught graphics programming. What an eye-opener! The mathematics required
to accurately simulate the real world is somewhat tedious. Dan's
simplification may mislead some to think that it would be easy to implement a
3-D ray tracer in this fashion and not run into difficulties. Anyone who has
read the classic text by Newman and Sproull will realize that throwing around
4 x 4 matrices is not trivial.
At the risk of seeming a total bore, consider the generation of a viewing
transformation (VT) matrix. The matrix itself must be a 4 x 4 matrix due to
the homogeneous representation of the world as developed by early geometers
for working in projective geometry. VT is formed by:
1. Translating the world to zero the eye location (Dan's simplification)
2. Rotating about the x-axis 90 degrees
3. Rotating about the y-axis by a geometrically determined angle
4. Rotating about the x-axis by a geometrically determined angle
5. Reversing the sense of the z-axis to convert the system to a left-handed
coordinate system
6. Multiplying the resulting matrix by another matrix that contains viewing angle
(i.e., perspective translations) information
Once VT is formed, it is used for all translations from the world to the
virtual screen. Additionally, VT provides another useful function. Inverting
VT provides a means of going from the screen (pixels) to the world for
implementation of a ray tracer.
Michael R. Schore
Redlands, California
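
To make the matrix composition in the letter above concrete, here is a minimal C sketch of three of the listed steps: the eye translation, the 90-degree rotation about the x-axis, and the z-axis reversal. It is not the letter writer's code. The function names, the row-vector convention (a point is multiplied as [x y z 1] times the matrix), and the sign of the rotation are all assumptions made for illustration. The two geometrically determined rotations and the perspective matrix are left out. Rotations take a cosine and sine directly, on the assumption that those values come from the geometry rather than from an explicit angle.

```c
typedef double Mat4[4][4];

/* Set m to the 4 x 4 identity matrix. */
void mat_identity(Mat4 m)
{
    int i, j;
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            m[i][j] = (i == j) ? 1.0 : 0.0;
}

/* out = a * b. A temporary lets out alias a or b safely. */
void mat_mul(Mat4 out, Mat4 a, Mat4 b)
{
    Mat4 tmp;
    int i, j, k;
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++) {
            tmp[i][j] = 0.0;
            for (k = 0; k < 4; k++)
                tmp[i][j] += a[i][k] * b[k][j];
        }
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            out[i][j] = tmp[i][j];
}

/* Translation; with row vectors the offsets sit in the bottom row. */
void mat_translate(Mat4 m, double tx, double ty, double tz)
{
    mat_identity(m);
    m[3][0] = tx;
    m[3][1] = ty;
    m[3][2] = tz;
}

/* Rotation about the x-axis, given the cosine c and sine s of the angle. */
void mat_rotate_x(Mat4 m, double c, double s)
{
    mat_identity(m);
    m[1][1] = c;   m[1][2] = s;
    m[2][1] = -s;  m[2][2] = c;
}

/* Reverse the sense of the z-axis. */
void mat_flip_z(Mat4 m)
{
    mat_identity(m);
    m[2][2] = -1.0;
}

/* Compose a partial viewing transformation for an eye at (ex, ey, ez). */
void build_partial_vt(Mat4 vt, double ex, double ey, double ez)
{
    Mat4 step;
    mat_translate(vt, -ex, -ey, -ez);  /* move the eye to the origin */
    mat_rotate_x(step, 0.0, 1.0);      /* 90 degrees: cos = 0, sin = 1 */
    mat_mul(vt, vt, step);
    mat_flip_z(step);                  /* switch to a left-handed system */
    mat_mul(vt, vt, step);
}

/* Apply an affine matrix to a 3-D point (no perspective divide). */
void transform_point(Mat4 m, const double in[3], double out[3])
{
    int j;
    for (j = 0; j < 3; j++)
        out[j] = in[0] * m[0][j] + in[1] * m[1][j] + in[2] * m[2][j] + m[3][j];
}
```

Because every step is folded into one matrix, each world point costs a single matrix multiply no matter how many steps went into VT. Inverting the composed matrix, as the letter notes, gives the screen-to-world mapping a ray tracer needs.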


Bezier Business


Dear DDJ,
I enjoyed Todd King's article, "Drawing Character Shapes with Bezier Curves"
in the July 1990 DDJ, but more importantly, I found it extremely practical in
the context of one of my projects. Magicorp is a slide service bureau. We
accept files from many software packages such as Applause, Harvard Graphics,
Freelance, Designer, Artline, etc. and render them into very high resolution
(4032 x 2688) 35mm slides and overhead transparencies. We have all 207
Bitstream fontware fonts in our font library with the character shapes defined
as straight vectors. We did this because the only way we knew of rendering
Bezier curves was from the parametric equations, and this method was too slow
for our production system. Now that we know about the deCasteljau algorithm,
however, we can save considerable disk space by changing our font library to
represent character shapes in their original Bezier format without too much
performance degradation.
I was wondering if Mr. King could supply me with some reference for further
reading. In particular, I would be interested in the references that
originally made him aware of the deCasteljau algorithm, as well as any other
papers or books on the subject of which he is aware.
Philip N. Jacobs
Elmsford, New York
Todd responds: It's interesting you should ask what led me to the deCasteljau
method. The original draft of my article did not contain information about the
deCasteljau method of calculating Bezier curves. When DDJ technical editor Ray
Valdes looked at the article, he recommended that I also look at the
deCasteljau method and directed me to CAD: Computational Concepts and Methods,
by Glen Mullineux (MacMillan Publishing Co.). The chapter on representing
curves has a good discussion of Bezier curves and the generation description
of the algorithms. This is a good place to start. The references in the book
should lead you to the original (first generation) descriptions of the
algorithms by Bezier and deCasteljau (as well as others).
In writing the article I also referred to Fundamentals of Interactive Computer
Graphics by James D. Foley and Andries Van Dam (Addison-Wesley, 1984). A
reader of DDJ also recommends Algorithms by Sedgewick (Addison-Wesley, 1988).
I would also refer you to the "Letters to the Editor" section of the November
and December 1990 issues of DDJ, since some readers have sent in comments on
how to improve upon the efficiency of the implementation presented in my
article. Their comments should also prove useful.
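
For readers who have not seen the deCasteljau method, the following is a minimal C sketch of it for a cubic curve. It is not the code from the July 1990 article; the Point type and function names are invented for illustration. The method evaluates a point on the curve by repeated linear interpolation between control points rather than by expanding the parametric polynomial.

```c
typedef struct { double x, y; } Point;

/* Linear interpolation between a and b at parameter t in [0, 1]. */
Point lerp_point(Point a, Point b, double t)
{
    Point p;
    p.x = a.x + (b.x - a.x) * t;
    p.y = a.y + (b.y - a.y) * t;
    return p;
}

/* Evaluate a cubic Bezier curve at t using the deCasteljau method.
   Each pass replaces adjacent pairs of points with their interpolant,
   so four points shrink to three, then two, then one. */
Point bezier_point(const Point ctrl[4], double t)
{
    Point work[4];
    int level, i;

    for (i = 0; i < 4; i++)
        work[i] = ctrl[i];
    for (level = 3; level > 0; level--)
        for (i = 0; i < level; i++)
            work[i] = lerp_point(work[i], work[i + 1], t);
    return work[0];
}
```

At t = 0.5 every interpolation is a simple midpoint, which is why fast renderers often subdivide recursively at the halfway point using only additions and shifts.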


B-tree Business


Dear DDJ,
I enjoyed the article "The B-tree Again" by Al Stevens in the December 1990
DDJ. I appreciate in particular his focus on practical implementation of tools
for people who don't want or need a lot of theory.
I ran into some trouble when considering how the key handling mechanism would
support integers. It occurred to me that the definition of the keyspace within
a treenode as a simple character array could lead to trouble on some machines.
I didn't notice any mechanism for preventing integer values in the keyspace
from being misaligned on machines that require integer alignment on word
boundaries.
On some machines, this can merely cause performance degradation; on others
(some of the new RISC architecture processors), this will lead to bus
exception errors, i.e., the dreaded "Bus Error, Core Dumped" message from
Unix. I hope Al can clarify his approach to this problem for me.

Mark Rosenthal
Louisville, Colorado
Al responds: The B-tree algorithms in my column treat keys as fixed-length
character arrays. If I need to use an integral value for a key, I encode the
value as an ASCII string. This method uses more space for keys but is less
dependent on computer and compiler architectures. To use binary integer
values, you would need to address the function that compares keys as well as
the alignment problems you have mentioned.
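
As an illustration of the ASCII-encoding approach Al describes, here is a minimal C sketch. It is not taken from the column's B-tree code: KEY_LEN, encode_key, and compare_keys are hypothetical names, and the zero padding is an assumption added so that a byte-wise comparison orders keys numerically. Because the key is handled only as characters, no integer is ever read from a misaligned address.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define KEY_LEN 10  /* enough digits for any 32-bit unsigned value */

/* Encode value as a fixed-length, zero-padded decimal key.
   The key is not NUL-terminated; it is exactly KEY_LEN characters. */
void encode_key(uint32_t value, char key[KEY_LEN])
{
    char buf[KEY_LEN + 1];  /* room for the terminating NUL from sprintf */
    sprintf(buf, "%0*lu", KEY_LEN, (unsigned long)value);
    memcpy(key, buf, KEY_LEN);
}

/* Byte-wise comparison; zero padding makes this agree with numeric order. */
int compare_keys(const char *a, const char *b)
{
    return memcmp(a, b, KEY_LEN);
}
```

Without the padding, "10" would sort before "9". The cost is ten bytes per key instead of four, which is the space-for-portability trade Al mentions.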


Who's On First?


Dear DDJ,
Michael Swaine's recent article, "Fire In The Valley Revisited" (January 1991)
gives the impression that the personal computer revolution started with the
MITS Altair computer kit. It didn't. There was a great deal of activity prior
to the Altair.
In the early 1970s, many of us were members of Steve Gray's Amateur Computer
Society -- a group of dedicated hardware hackers who were building their own
computers and computer circuits. Several members cloned versions of Digital
Equipment Corporation's popular PDP-8/L minicomputer. The group published a
lively newsletter for computer hobbyists.
In July 1974, Radio-Electronics magazine featured my Mark-8 computer on its
cover. The computer construction project used Intel's 8-bit 8008
microprocessor chip, and the computer allowed for as many as 16 Kbytes of
static RAM. (At that time, a hard disk for a PDP-8/L minicomputer furnished
32K 12-bit words.) Interest in the Mark-8 was very high and about a thousand
of the circuit-board kits were sold. Several mail-order companies offered kits
of hard-to-get components. Radio-Electronics sold many of the complete
booklets that gave all of the construction details and circuit-board layouts.
Over the years I've talked with and met many people who built and used the
Mark-8. The original Mark-8 is now on display in the Smithsonian Institution's
Information Age exhibit in Washington, D.C.
No less an authority than Robert Noyce, the chairman of Intel, recognized the
Mark-8 as the first true personal computer. Sure, there were other small
computers available at the same time, but none were accessible to an
electronics hobbyist or computer buff. The Mark-8 put such a computer in the
hands of those people. At least one computer company got its start because of
the Mark-8. Some readers may recall the Digital Group, a company that provided
a line of CPU-interchangeable computers, many of which were adopted for
regular commercial use.
The Mark-8 also spawned at least one publication prior to the Altair. As I
recall, Hal Singer and John Craig started the Mark-8 newsletter out in
Camarillo, Calif., shortly after the computer appeared in Radio-Electronics.
Craig later went on to InfoWorld. There were many users groups in the USA,
too. Many of these evolved into the groups and clubs that supported the
Altair, IMSAI, PET, Apple, and other computers. The clubs and the people were
already receptive to computers when the Altair came along.
Keep in mind, too, that the Mark-8 actually worked, right from the first unit.
The design was thoroughly tested so that it would work properly whenever a
hobbyist constructed a computer. Altair builders weren't so lucky. Many of the
original versions didn't work at all, nor were fixes or support readily at
hand. Whenever I fired up my Mark-8 -- even as late as 1988 -- it always
worked. I still have two nonworking Altairs that one day I'd like to get
around to putting in working condition.
I'm not denigrating the Altair. It was an important link in the chain of
personal computer advancements made during the last 17 years. However, let's
not revise history and put the start of the PC "revolution" at January 1975.
It took place months before.
I wish I could recall more history of the "early days," but most of my source
material went to the Smithsonian with the Mark-8. I still have models of and
documentation for many older computers, though. Who knows, maybe there are
others interested in preserving and restoring these fossils of the computer
age.
Jonathan A. Titus
Editorial Director
EDN Magazine
Milford, Massachusetts


Always the Optimist


Dear DDJ,
In reference to Jeff Duntemann's article "Sex and Algorithms" in the October
1990 DDJ, my best guess is that Zeller's Congruence doesn't extend past the
year 2000 because Zeller didn't figure that the world would last past the year
2000.
David M. Raley
Laurel Hill, N.C.


Patents, Shapes, and More


Dear DDJ,
I read the "Software Patents" article by The League for Programming Freedom
(November 1990) and have a few comments. I have never run into a patent
problem, at least not yet, and I hope I never do. I see this as a chicken and
egg problem: Which is most important -- the algorithm or the software that
uses it? On one hand, certain algorithms may make some software work more
efficiently, but what is the algorithm's value in the overall success of the
software? I have some doubts about the patent holders going after, legally,
users of their ideas, except where there is a deep pocket to pick. And from
the article itself, it seems that a few companies just buy up patents and go
looking for a successful product that uses their patented algorithms. And for
them, it's a very good business; they don't have to market products -- just
hold the patent and retain a legal firm. So in the modern world you don't have
to produce anything -- just collect from people who do. What an idea!
The article "An Existential Dictionary" by Edwin Floyd (November 1990) was
particularly well done. It showed some of the thought processes and mistakes
that are always part of a project. Perhaps Mr. Floyd will write more articles
in the future.
In addition, I found the geometric shapes on the cover and interspersed among
the articles to be fascinating, especially since they were made of paper and
used no glue. I was wondering if you know where I could get a book about
modular origami.
William Tennyson
Columbia, Missouri
Editor's note. For more information on modular origami, write to Vicki Mihara
Avery at P.O. Box 371144, Montara, CA, 94037. Vicki is the artist who provided
the origami for the November issue.


How Fast Is Fast?


Dear DDJ,
In Bruce Tonkin's article on PowerBasic (July 1990) he mentions that the
expanded string space (compared to QuickBasic?) in that compiler carries a
small penalty of slower operation due to the larger memory spaces available
for PowerBasic's string operations. The tables on pages 76 and 77 show the
MID$ operations to be about 3.4 times slower in PowerBasic than they are in
the QuickBasic 4 and 7 version compilers.
My feeling is that 3.4 times is not a small difference when you consider what
the MID$ operation does in many commercial programs. Many people use the MID$
function to move data in sort buffers and/or text-editing buffers, where the
buffers range from 30 Kbytes or so in size up to several hundred Kbytes, and
the string-shifts need to be nearly instantaneous.
Basic's capability to do these string moves is just adequate in the Microsoft
compilers using small buffers on a PC or large buffers on an AT, but would be
unacceptable on these same machines using PowerBasic. What does Bruce think?
Dale Thorn
Round Lake, Illinois
Bruce responds: I can't agree that the time difference for the MID$ operation
is important. Yes, PowerBasic is slower, taking about 80 seconds per million
operations compared to about 25 for QuickBasic 4.x or Basic 7.0. A meaningful
comparison is not that easy, though, as my review mentioned.
Few programs will need to do anything like a million MID$ operations. For
reasonable programs, several thousand to ten thousand operations will be more
typical -- and for them, the difference will be much less than a second.
Further, PowerBasic allows fully dynamic string space to be over 400K on a
640K machine. QuickBasic and Basic 7.0 will not allow more than 64K per array
(and under QuickBasic 4.x, the limit is more like 50K with no other dynamic
space available). The only way to get more than 64K in a single string array
with any Microsoft Basic is to use fixed-length strings, and to get 128K or
more the string lengths must be a power of two.
Also, PowerBasic removes the need for many MID$ operations. There is an
equivalent of the FIELD statement that can be used on arbitrary strings. So
you can look at or assign any part of any string without using MID$ at all --
and PowerBasic's assignment operation is actually a little faster than
QuickBasic's. That kind of thing is ideal for changing record buffers.
Let's take a sort program that uses large record buffers. I'll assume that the
individual records are no more than 32 Kbytes. Here's what happens when you
write that application in QuickBasic or Basic 7.0, compared to PowerBasic:
1. The Microsoft versions will limit the dynamic string space to 64 Kbytes per
array, forcing a sort that uses dynamic string arrays to be much smaller than
memory. PowerBasic allows the programmer to use all available memory for
dynamic strings.

2. Fixed-length strings must be predeclared as to length (a power of two if an
array of 128K or more) in the Microsoft versions, meaning that a
general-purpose sort is much more difficult to write. Microsoft's fixed-length
string assignment operations are slower than their dynamic equivalents by a
factor of about two. PowerBasic strings are fully dynamic.
3. In QuickBasic or Basic 7.0, you'll have to write your own sort, and you'll
need to use MID$ to sort on the middle part of a string. In PowerBasic a sort
is built-in, and you can specify the starting position for the sort -- no MID$
is required.
4. All versions of Microsoft Basic slow down drastically for string operations
as string space becomes full. PowerBasic actually becomes faster. If you're
running close to the edge, PowerBasic can show astounding speed improvements
over QuickBasic or Basic 7.0. This is the kind of thing that's hard to put
into a benchmark table (how full is "full"?) but can be worth plenty in an
application.
You mentioned text-editing buffers. The PowerBasic functions that allow you to
strip any leading or trailing characters, or remove any unwanted characters
from a string, can get rid of a lot of otherwise hand-coded routines -- again,
removing the need for a lot of MID$ operations.
For the last six years, I've sold a word processor written in Basic. To get a
version that allowed text files of more than 64 Kbytes using QuickBasic, I had
to store text in fixed-length string blocks and convert it between dynamic
strings and blocks. I wrote all the allocation, deallocation, and
garbage-handling routines myself. It was not a pleasant job, and debugging was
a pain. With PowerBasic, I removed those routines -- and the result ran a lot
faster. The search and replace functions still use MID$, but run as much as
ten times as fast because there's no need for blocking or deblocking with
PowerBasic.
Raw benchmark numbers can be valuable. They can also be misleading or
irrelevant. I can understand your concern with a factor of 3.4 speed
difference, but in this case I think it's unlikely to make any difference in
your applications; PowerBasic's other advantages can overwhelm the effect.
I do suggest you buy a copy of PowerBasic and write some applications to take
advantage of the new features. Though MID$ may be slower, I think you'll find
(as I have) that you'll need to write less code to get the job done. That was
the point I tried to make in the review, and perhaps I didn't make it well
enough.


Summing Up Patents


Dear DDJ,
I am writing this letter in protest of today's situation concerning software
piracy and patenting algorithms. As a 13-year-old whose sole income is gained
from mowing lawns, allowance, and presents, I cannot always afford the
software I need. I try shareware, and some is good, but a lot of it stinks. I
am currently scrounging to buy QuickC so I can learn C. In a way, I view
software piracy as "illegal shareware." Many people will get a copy and try it
out. If they want to use it, they probably will purchase it anyway. Some
software is too overpriced: $389 for Lotus 1-2-3?
Patenting algorithms is the stupidest thing I have ever heard of. Who can tell
you not to multiply by adding x to itself y times? Same for other formulas.
Jonathan Cooper
Clearwater, Florida













































March, 1991
80X86 OPTIMIZATION


Aim down the middle and pray




Michael Abrash


Michael is a contributing editor to DDJ and can be contacted at 7 Adirondack
Street, South Burlington, VT 05403.


Picture this: You're an archer aiming at a target 100 feet away. A strong wind
comes up and pushes each arrow to the left as it flies. Naturally, you
compensate by aiming farther to the right. That's what it's like optimizing
for the 8088; once you learn to compensate for the strong but steady effects
of the prefetch queue and the 8-bit bus, you can continue merrily on your
programming way.
Now the wind starts gusting unpredictably. There's no way to compensate, so
you just aim for the bull's-eye and hope for the best. That's what it's like
writing code for good performance across the entire 80x86 family, or even for
the 286/386SX/386 heart of today's market. You just aim down the middle and
pray.


The New World of the 80x86


In the beginning, the 8088 was king, and that was good. The optimization rules
weren't obvious, but once you learned them, you could count on them serving
you well on every computer out there.
Not so these days. There are four major processor types -- 8088, 80286, 80386,
and 80486 -- with a bewildering array of memory architectures: cached (in
several forms), page mode, static-column RAM, interleaved, and, of course, the
386SX, with its half-pint memory interface. The processors offer wildly
differing instruction execution times, and memory architectures warp those
times further by affecting the speed of instruction fetching and access to
memory operands. Because actual performance is a complex interaction of
instruction characteristics, instruction execution times, and memory access
speed, the myriad processor-memory combinations out there make "exact
performance" a meaningless term. A specific instruction sequence may run at a
certain speed on a certain processor in a certain system, but that often says
little about the performance of the same instructions on a different
processor, or even on the same processor with a different memory system. The
result: Precise optimization for the general PC market is a thing of the past.
(We're talking about optimizing for speed here; optimizing for size is the
same for all processors so long as you stick to 8088-compatible code.)
So there is no way to optimize performance ideally across the 80x86 family. An
optimization that suits one processor beautifully is often a dog on another.
Any 8088 programmer would instinctively replace:
 DEC CX
 JNZ LOOPTOP
with:
 LOOP LOOPTOP
because LOOP is significantly faster on the 8088. LOOP is also faster on the
286. On the 386, however, LOOP is actually two cycles slower than DEC/JNZ. The
pendulum swings still further on the 486, where LOOP is about twice as slow as
DEC/JNZ -- and, mind you, we're talking about what was originally perhaps the
most obvious optimization in the entire 80x86 instruction set.
In short, there is no such thing as code that's truly optimized for the 80x86.
Instead, code is either optimized for specific processor-memory combinations,
or aimed down the middle, designed to produce good performance across a range
of systems. Optimizing for the 80x86 family by aiming down the middle is quite
different from optimizing for the 8088, but many PC programmers are
inappropriately still applying the optimization lore they've learned over the
years on the PC (or AT). The world has changed, and many of those old
assumptions and tricks don't hold true anymore.
You will not love the new world of 80x86 optimization, which is less precise
and offers fewer clever tricks than optimizing for the 8088 alone. Still,
isn't it better to understand the forces affecting your code's performance out
in the real world than to optimize for a single processor and hope for the
best?
Better, yes. As much fun, no. Optimizing for the 8088 was just about as good
as it gets. So it goes.


Optimization Rules for a New World


So, how do you go about writing fast code nowadays? One way is to write
different versions of critical code for various processors and memory access
speeds, selecting the best version at runtime. That's a great solution, but it
requires an awful lot of knowledge and work.
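The runtime-selection idea can be sketched in C with a table of function pointers. Everything named here is hypothetical: the CpuType values, the two summing routines, and especially detect_cpu(), which is only a stub, because real processor detection takes processor-specific assembly tests. The assignment of routines to processors is likewise made up for illustration and is not a measured result.

```c
#include <stddef.h>

typedef enum { CPU_8088, CPU_286, CPU_386, CPU_486, CPU_COUNT } CpuType;
typedef long (*SumFn)(const int *data, size_t n);

/* Plain loop: stands in for a version tuned for one class of processor. */
long sum_simple(const int *data, size_t n)
{
    long total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Unrolled by four: stands in for a version tuned for another class. */
long sum_unrolled(const int *data, size_t n)
{
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        total += (long)data[i] + data[i + 1] + data[i + 2] + data[i + 3];
    for (; i < n; i++)  /* leftover elements */
        total += data[i];
    return total;
}

/* Hypothetical stub; a real version would probe the processor. */
CpuType detect_cpu(void)
{
    return CPU_386;
}

/* Pick the routine once, at startup, and call through the pointer after. */
SumFn select_sum(CpuType cpu)
{
    static const SumFn table[CPU_COUNT] = {
        sum_simple,    /* CPU_8088 */
        sum_simple,    /* CPU_286  */
        sum_unrolled,  /* CPU_386  */
        sum_unrolled   /* CPU_486  */
    };
    return table[cpu];
}
```

The selection cost is paid once; the hard part, as noted above, is knowing which version belongs in which slot and writing them all.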
An alternative is to optimize for one particular processor and settle for
whatever performance you get on the others. This might make sense when the
8088 is the target processor because it certainly needs the optimization more
than any other processor. However, 8088 optimization works poorly at the upper
end of the 80x86 family.
Nowadays, though, most of us want to optimize for the 286 and 386 systems that
dominate the market, or across all 80x86 processors, and that's a tough nut to
crack. The 286 and 386 come in many configurations, and you can be sure, for
example, that a 386SX, an interleaved 386, and a cached 386 have markedly
different performance characteristics. There are, alas, no hard and fast
optimization rules that apply across all these environments.
My own approach to 80x86 optimization has been to develop a set of general
rules that serve reasonably well throughout the 80x86 line, especially the 286
and 386, and to select a specific processor (in my case a cached 386, for
which cycle times tend to be accurate) to serve as the tiebreaker when
optimization details vary from one processor to another. (Naturally, it's only
worth bothering with these optimizations in critical code.) The rules I've
developed are:
Avoid accessing memory operands; use the registers to the max.
Don't branch.
Use string instructions, but don't go much out of your way to do so.
Keep memory accesses to a minimum by avoiding memory operands and keeping
instructions short.
Align memory accesses.
Forget about many of those clever 8088 optimizations, using oddball
instructions such as DAA and XLAT, that you spent years learning.
Next I'll discuss each of these rules in turn in the context of
8088-compatible real mode, which is still the focus of the 80x86 world. Later,
I'll touch on protected mode.
Let's start by looking at the last -- and most surprising -- rule.


Kiss Those Tricks Goodbye


To skilled assembly language programmers, the 8088 is perhaps the most
wonderful processor ever created, largely because the instruction set is
packed with odd instructions that are worthless to compilers but can work
miracles in the hands of clever assembly programmers. Unfortunately, each new
generation of the 80x86 has rendered those odd instructions and marvelous
tricks less desirable. As the execution time for the commonly used instruction
ADD BX, 4 has gone down from four cycles (8088) to three cycles (286) to two
cycles (386) to one cycle (486), the time for the less frequently used
instruction CBW has gone from two cycles (8088 and 286) up to three cycles
(386 and 486)!
Consider this ancient optimization for converting a binary digit to hex ASCII:
 ADD AL,90H
 DAA
 ADC AL,40H
 DAA
Now consider the standard alternative:
 ADD AL,'0'
 CMP AL,'9'
 JBE HaveAscii
 ADD AL,'A'-('9'+1)
 HaveAscii:
As Figure 1 indicates, the standard code should be slower on an 8088 or 286,
but faster on a 386 or a 486 -- and real-world tests confirm those results, as
shown in Figure 2. (All "actual performance" timings in this article were
performed with the Zen timer from Zen of Assembly Language; see "References"
for details. The systems used for the tests were: 8088, standard 4.77 MHz PC
XT; 80286, standard one-wait-state, 8 MHz PC AT; 386SX, 16 MHz noncached;
80386, 20 MHz externally cached with all instructions and data in external
cache for all tests except Listings One and Two; 80486, 25 MHz internally
cached, with all instructions and data in internal cache for all tests except
Listings One and Two.)
In other words, this nifty, time-tested optimization is an anti-optimization
on the 386 and 486.
Why is this? On the 386, DAA -- a rarely used instruction -- takes four
cycles, and on the 486 it takes two cycles, in both cases twice as long as the
more common instructions CMP and ADD; in contrast, on the 8088 all three
instructions are equally fast at four cycles. Also, the instruction-fetching
advantage that the 1-byte DAA provides on the 8088 means nothing on a cached
386.
Nor is this an isolated example. Most oddball instructions, from AAA to XCHG,
have failed to keep pace with the core instructions -- ADC, ADD, AND, CALL,
CMP, DEC, INC, Jcc, JMP, LEA, MOV, OR, POP, PUSH, RET, SBB, SUB, TEST, and XOR
-- during the evolution from 8088 to 486. As we saw earlier, even LOOP lags
behind on the 386 and 486. Check your favorite tricks for yourself; they might
or might not hold up on the 386, but will most likely be liabilities on the
486. Sorry, but I just report the news, and the news is: Kiss most of those
tricks goodbye as the 386 and 486 come to dominate the market. (This means
that hand-optimization in assembly language yields less of a performance boost
nowadays than it did when the 8088 was king; the improvement is certainly
significant, but rarely in the 200-500 percent range anymore. Sic transit
gloria mundi.) Most startling of all, string instructions lose much of their
allure as we move away from the 8088, hitting bottom on the 486.


The 486: All the Rules Change


The 486 represents a fundamental break with 8088-style optimization. Virtually
all the old rules fail on the 486, where, incredibly, a move to or from memory
often takes just one cycle, but exchanging two registers takes three cycles.
The nonbranching core instructions mentioned earlier take only one cycle on
the 486 when operating on registers; MOV can, under most conditions, access
memory in one cycle; and CALL and JMP take only three cycles, given a cache
hit. However, noncore instructions take considerably longer. XLAT takes four
cycles; even STC and CLC take two cycles each. The 486's highly asymmetric
execution times heavily favor core instructions and defeat most pre-486
optimizations.
Core instructions do have a weakness on the 486. While 486 MOVs involving
memory are remarkably fast, accessing memory for an operand to OR, ADD, or the
like costs cycles. Even with the 8K internal cache, memory is not as fast as
registers, except when MOV is used (and sometimes not even then), so registers
are still preferred operands. (AND [BX],1 is fast, at only three cycles, but
AND BX,1 takes only one cycle -- three times as fast.)
OUT should be avoided whenever possible on the 486, and likewise for IN. OUT
takes anywhere from 10 to 31 cycles, depending on processor mode and
privileges, more than an order of magnitude slower than MOV. The lousy
performance of OUT -- true on the 386 as well -- has important implications
for graphics applications.
String instructions are so slow on the 486 that you should check cycle times
before using any string instruction other than the always superior REP MOVS.
For example, LODSB takes five cycles on the 486, but MOV AL,[SI]/INC SI takes
only two cycles; likewise for STOSB and MOV [DI],AL/INC DI. Listing One uses
LODSB/STOSB to copy a string, converting lowercase to uppercase while copying;
Listing Two uses MOV/INC instead. Figure 3 summarizes the performance of the
two routines on a variety of processors; note the diminishing effectiveness of
string instructions on the newer processors. Think long and hard before using
string instructions other than REP MOVS on the 486.
Optimization for the 486 is really a whole new ball game. When optimizing
across the 80x86 family, the 486 will generally be the least of your worries
because it is so much faster than the rest of the family; anything that runs
adequately on any other processor will look terrific on the 486. Still, the
future surely holds millions of 486s, so it wouldn't hurt to keep one eye on
the 486 as you optimize.


String Instructions: Fading Stars


On the 8088, string instructions are so far superior to other instructions
that it's worth going to great lengths to use them, but they lose much of that
status on newer processors. One of the best things about string instructions
on the 8088 is that they require little instruction fetching, because they're
1-byte instructions and because of the REP prefix; however, instruction
fetching is less of a bottleneck on newer processors. String instructions also
have superior cycle times on the 8088, but that advantage fades on the 286 and
386 as well.
On the 286, string instructions (when they do exactly what you need) are still
clearly better than the alternatives. On the 386, however, some string
instructions are, even under ideal circumstances, the best choice only by a
whisker, if at all. For example, since Day One, clearing a buffer has been
done with REP STOS. That's certainly faster than the looping MOV/ADD approach
shown in Listing Three, but on the 386 and 486 it's no faster than the
unrolled loop MOV/ADD approach of Listing Four, as shown in Figure 4.
(Actually, in my tests REP STOS was a fraction of a cycle slower on the 386,
and fractionally faster on the 486.) REP STOS is much easier to code and more
compact, so it's still the approach of choice for buffer clearing -- but it's
not necessarily fastest on a 486 or fast-memory 386. This again demonstrates
just how unreliable the old optimization rules are on the newer processors.
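For reference, the REP STOS approach being compared here can be sketched as follows. This is a minimal sketch rather than one of the article's listings; it assumes the same buffer that Listing Three clears (1024 words at DS:8000h) and omits the Zen timer calls.
 mov ax,ds
 mov es,ax ;STOSW stores to ES:DI, so point ES at the data segment
 mov di,8000h ;start of the buffer to clear
 mov cx,1024 ;number of words to clear
 sub ax,ax ;value to store (zero)
 cld ;make STOSW advance DI upward
 rep stosw ;store AX at ES:DI, CX times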
The point is not that you shouldn't use string instructions on the 386. REP
MOVS is the best way to move data, and the other string instructions are
compact and usually faster, especially on uncached systems. However, on the
386 it's no longer worth going to the trouble of juggling registers and
reorganizing data structures to use string instructions. Furthermore, when you
truly need maximum performance on the 386, check out nonstring instructions in
unrolled loops. It goes against every lesson learned in a decade of 8088
programming, but avoiding string instructions sometimes pays on the 386.


The Siren Song of Memory Accesses


Finally, here's a rule that's constant from the 8088 to the 486: Use the
registers. Avoid memory.
Don't be fooled by the much faster memory access times of the 286 and 386. The
effective address calculation time of the 8088 is mostly gone, so MOV AX,[BX]
takes only five cycles on the 286, and ADD [SI],DX takes only seven on the
386. That's so much faster than the 17 and 29 cycles, respectively, that they
take on the 8088 that you might start thinking that memory is pretty much
interchangeable with registers.
Think again. MOV AX,BX is still more than twice as fast as MOV AX,[BX] on the
286, and ADD SI,DX is more than three times as fast as ADD [SI],DX on the 386.
Memory operands can also reduce performance by slowing instruction fetching.
Memory is fast on the 286 and 386. Registers are faster. Use them as heavily
as possible.
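As a minimal sketch of the rule (not taken from the article's listings), the loop below sums an array of words while keeping the running total in DX instead of adding into a memory variable on every pass. The names WordArray, ARRAY_LENGTH, and Total are hypothetical, and the count is assumed to be nonzero.
 mov si,offset WordArray ;point to the first word
 mov cx,ARRAY_LENGTH ;number of words to sum (assumed nonzero)
 sub dx,dx ;the running total lives in a register
 SumLoop:
 add dx,[si] ;the only memory access in the loop is this read
 add si,2 ;point to the next word
 dec cx ;count off words until none remain
 jnz SumLoop
 mov [Total],dx ;store the result once, after the loop
Loading each word into AX and then using ADD [Total],AX inside the loop would add a memory read-modify-write to every iteration.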


Don't Branch


Here's another rule that stays the same across the 80x86 family: Don't branch.
Branching suffers on the 8088 from lengthy cycle counts and emptying the
prefetch queue. Emptying the prefetch queue is a lesser but nonetheless real
problem in the post-8088 world, and the cycle counts of branches are still
killers. As Figure 4 indicates, it pays to eliminate branches by unrolling
loops or using repeated string instructions.


Modern-Day Instruction Fetching


Instruction fetching is the bugbear of 8088 performance; the 8088 simply can't
fetch instruction bytes as quickly as it can execute them, thanks to its
undersized bus. Minimizing all memory accesses, including instruction fetches,
is paramount on the 8088.
Instruction fetching is less of a problem nowadays. Figure 5 shows the maximum
rates at which various processors can fetch instruction bytes; clearly,
matters have improved considerably since the 8088, although instructions also
execute in fewer cycles on the newer processors. Fetching problems can occur
on any 80x86 processor, even the 486, but the only processors other than the
8088 that face major instruction fetching problems are the one-wait-state 286
and the 386SX, although uncached 386s may also outrun memory. However, the
problems here are different from and less serious than with the 8088.
Consider: An 8088 executes a register ADD in three cycles, but requires eight
cycles to fetch that instruction, a fetch/execute ratio of 2.67. A
one-wait-state 286 requires three cycles to fetch a register ADD and executes
it in two cycles, a ratio of 1.5. A 386SX can fetch a register ADD in two
cycles, matching the execution time nicely, and a cached 386 can fetch two
register ADDs in the two cycles it takes to execute just one. For
register-only code -- the sort of code critical loops should contain -- the
386 generally runs flat out, and the 286 and 386SX usually (not always, but
usually) outrun memory by only a little at worst. Greater fetching problems
can arise when working with large instructions or instruction sequences that
access memory nonstop, but those are uncommon in critical code. This is a
welcome change from the 8088, where small, register-only instructions tend to
suffer most from inadequate instruction fetching.
Also, uncached 386 systems often use memory architectures that provide
zero-wait-state performance when memory is accessed sequentially. In
register-only code, instruction fetches are the only memory accesses, so
fetching proceeds at full speed when the registers are used heavily.
So, is instruction fetching a problem in the post-8088 world? Should
instructions be kept short?
Yes. Smaller instructions can help considerably on the one-wait-state 286 and
on the 386SX. Not as much as on the 8088, but it's still worth the trouble.
Even a cached 386 can suffer from fetching problems, although that's fairly
uncommon. For example, when several MOV WORD PTR [MemVar],0 instructions are
executed in a row, as might happen when initializing memory variables,
performance tends to fall far below rated speed, as shown in Figure 6. The
particular problem with MOV WORD PTR [MemVar],0 is that it executes in just
two (386) or three (286) cycles, yet has both an addressing displacement field
and a constant field. This eats up memory bandwidth by requiring more
instruction fetching. It also accesses memory, eating up still more bandwidth.
We'll see this again, and worse, when we discuss protected mode.
Generally, though, post-8088 processors with fast memory systems and
full-width buses run most instructions at pretty near their official cycle
times; for these systems, optimization consists mostly of counting cycles.
Slower memory or constricted buses (as in the 386SX) require that memory
accesses (both instruction fetches and operand accesses) be minimized as well.
Fortunately, the same sort of code -- register only -- meets both
requirements.
Use the registers. Avoid constants. Avoid displacements. Don't branch. That's
the big picture. Don't sweat the details.


Alignment: The Easy Optimization


The 286, 386SX, and 386 take twice as long to access memory words at odd
addresses as at even addresses. The 386 takes twice as long to access memory
dwords at addresses that aren't multiples of four as those that are. You
should use ALIGN 2 to word align all word-sized data, and ALIGN 4 to dword
align all data that's accessed as a dword operand, as in:
 ALIGN 4
 MemVar dd ?
 :
 MOV EAX,[MemVar]
Alignment also applies to code; you may want to word or dword align the starts
of procedures, labels that can only be reached by branching, and the tops of
loops. (Code alignment matters only at branch targets, because only the first
instruction fetch after a branch can suffer from nonalignment.) Dword
alignment of code is optimal, and will help on the 386 even in real mode, but
word alignment will produce nearly as much improvement as dword alignment
without wasting nearly as many bytes.
Alignment improves performance on many 80x86 systems without hindering it on
any. Recommended.
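A minimal sketch of aligning the top of a loop follows. It assumes an assembler that accepts ALIGN in code (as TASM and MASM do) and a code segment declared with at least word alignment; the label ClearLoop and the buffer address are illustrative only.
 mov di,8000h ;buffer start (an even address, so the data is aligned too)
 mov cx,1024 ;number of words to clear
 sub ax,ax
 ALIGN 2 ;pad so the branch target below starts on an even address
 ClearLoop:
 mov [di],ax
 add di,2
 dec cx
 jnz ClearLoop ;each taken branch refetches from an aligned address
Any padding the assembler inserts (typically NOPs) is executed only once, on the way into the loop.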



Protected Mode


There are two sorts of protected mode, 16-bit and 32-bit. The primary
optimization characteristic of 16-bit protected mode (OS/2 1.X, Rational DOS
Extender) is that it takes an ungodly long time to load a segment register
(for example, MOV ES,AX takes 17 cycles on a 286) so load segment registers as
infrequently as possible in 16-bit protected mode.
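One way to follow this advice is to hoist segment loads out of loops. The sketch below is illustrative rather than from the article: it assumes a hypothetical far pointer variable DestPtr (declared with DD) and a nonzero word count in CX, and it loads ES exactly once no matter how many words are processed.
 les di,[DestPtr] ;the only segment register load
 IncLoop:
 inc word ptr es:[di] ;ES is already valid; no reload needed
 add di,2 ;point to the next word
 dec cx
 jnz IncLoop
The ES: override adds a prefix byte to the INC, but that costs much less than a protected-mode segment load on every pass.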
Optimizing for 32-bit protected mode (OS/2 2.0, SCO Unix, Phar Lap DOS
Extender) is another matter entirely. Typically, no segment loads are needed
because of the flat address space. However, 32-bit protected mode code can be
bulky, and that can slow instruction fetching. Constants and addressing
displacements can be as large as 4 bytes each, and an extra byte, the SIB
byte, is required whenever two 32-bit registers are used to address an operand
or scaled addressing is used. So, for example, MOV DWORD PTR [MemVar],0 is a
10-byte instruction in 32-bit protected mode. The instruction is supposed to
execute in two cycles, but even a 386 needs four to six cycles to fetch it,
plus another two cycles to access memory; a few such instructions in a row can
empty the prefetch queue and slow performance considerably. The slowdown
occurs more quickly and is more acute on a 386SX, which needs 14 cycles to
perform the memory accesses for this nominally 2-cycle instruction.
Code can get even larger when 32-bit instructions are executed in 16-bit
segments, adding prefix bytes. (Avoid prefix bytes if you can; they increase
instruction size and can cost cycles.) Figure 7 shows actual versus nominal
cycle times of multiple MOV DWORD PTR [EBX*4+MemVar],0 instructions running in
a 16-bit segment. Although cache type (write-back, write-through) and
main-memory write time also affect the performance of stores to memory, there
is clearly a significant penalty for using several large (in this case,
13-byte) instructions in a row.
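The byte counts quoted above can be checked by adding up the fields of each instruction. The breakdown below follows the standard 386 instruction encoding; the 386SX arithmetic assumes, as in the earlier fetching discussion, that the 386SX moves 2 bytes in two cycles.
 MOV DWORD PTR [MemVar],0 ;in a 32-bit segment
 ; opcode 1 byte
 ; ModRM 1 byte
 ; 32-bit displacement 4 bytes
 ; 32-bit constant 4 bytes
 ; total 10 bytes
 MOV DWORD PTR [EBX*4+MemVar],0 ;in a 16-bit segment
 ; operand-size prefix 1 byte
 ; address-size prefix 1 byte
 ; opcode 1 byte
 ; ModRM 1 byte
 ; SIB 1 byte
 ; 32-bit displacement 4 bytes
 ; 32-bit constant 4 bytes
 ; total 13 bytes
On a 386SX, fetching the 10-byte instruction takes five word-sized accesses (10 cycles), and storing the dword takes two more (4 cycles), which gives the 14 cycles cited earlier.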
Fortunately, this is a worst case, easily avoided by keeping constants and
displacements out of critical loops. For example, you should replace:
 ADDLOOP:
 MOV DWORD PTR BaseTable[EDX+EBX],0
 ADD EBX,4
 DEC ECX
 JNZ ADDLOOP
with:
 LEA EBX,BaseTable[EDX+EBX]
 SUB EAX,EAX
 ADDLOOP:
 MOV [EBX],EAX
 ADD EBX,4
 DEC ECX
 JNZ ADDLOOP
Better yet, use REP STOSD or unroll the loop!
Happily, register-only instructions are no larger in 32-bit protected mode
than otherwise and run at or near their rated speed in 32-bit protected mode
on all processors. All in all, in protected mode it's more important than ever
to avoid large constants and displacements and to use the registers as much as
possible.


Conclusion


Optimization across the 80x86 family isn't as precise as 8088 optimization,
and it's a lot less fun, with fewer nifty tricks and less spectacular
speed-ups. Still, familiarity with the basic 80x86 optimization rules can give
you a decided advantage over programmers still laboring under the delusion
that the 286, 386, and 486 are merely faster 8088s.


References


Abrash, Michael. Zen of Assembly Language. Glenview, Ill.: Scott, Foresman,
1990.
Barrenechea, Mark. "Peak Performance: On to the 486." Programmer's Journal,
(November-December 1990).
Paterson, Tim. "Assembly Language Tricks of the Trade." Dr. Dobb's Journal
(March 1990).
Turbo Assembler Quick Reference Guide. Borland International, 1990.
i486 Microprocessor Programmer's Reference Manual. Intel Corporation, 1989.
80386 Programmer's Reference Manual. Intel Corporation, 1986.
Microsystems Components Handbook: Microprocessors Volume I. Intel Corporation,
1985.

_80x86 OPTIMIZATION_
by Michael Abrash


[LISTING ONE]

; Copies one string to another string, converting all characters to
; uppercase in the process, using a loop containing LODSB and STOSB.
; Adapted from Zen of Assembly Language, by Michael Abrash; not a
; standalone program, but designed to be used with the Zen timer from
; that book via the Zen timer's PZTIME.BAT batch file: ZTimerOn starts
; the clock, ZTimerOff stops it, and the test-bed program linked in by
; PZTIME.BAT starts the program, reports the results, and ends.

 jmp Skip ;skip over data in CS and subroutine


SourceString label word ;sample string to copy
 db 'This space intentionally left not blank',0
DestString db 100 dup (?) ;destination for copy

; Copies one zero-terminated string to another string,
; converting all characters to uppercase.
; Input: DS:SI = start of source string; DS:DI = start of destination buffer
; Output: none
; Registers altered: AX, BX, SI, DI, ES
; Direction flag cleared

CopyStringUpper:
 mov ax,ds
 mov es,ax ;for STOS
 mov bl,'a' ;set up for fast register-register
 mov bh,'z' ; comparisons
 cld
StringUpperLoop:
 lodsb ;get next character and point to following character
 cmp al,bl ;below 'a'?
 jb IsUpper ;yes, not lowercase
 cmp al,bh ;above 'z'?
 ja IsUpper ;yes, not lowercase
 and al,not 20h ;is lowercase-make uppercase
IsUpper:
 stosb ;put character into new string and point to
 ; following location
 and al,al ;is this the zero that marks end of the string?
 jnz StringUpperLoop ;no, do the next character
 ret

; Calls CopyStringUpper to copy & convert SourceString->DestString.
Skip:
 call ZTimerOn ;start timing
 mov si,offset SourceString ;point SI to the string to copy from
 mov di,offset DestString ;point DI to the string to copy to
 call CopyStringUpper ;copy & convert to uppercase
 call ZTimerOff ;stop timing






[LISTING TWO]

; Copies one string to another string, converting all characters to
; uppercase in the process, using no string instructions.
; Not a standalone program, but designed to be used with the Zen
; timer, as described in Listing 1.

 jmp Skip ;skip over data in CS and subroutine

SourceString label word ;sample string to copy
 db 'This space intentionally left not blank',0
DestString db 100 dup (?) ;destination for copy

; Copies one zero-terminated string to another string,
; converting all characters to uppercase.

; Input: DS:SI = start of source string; DS:DI = start of destination string
; Output: none
; Registers altered: AL, BX, SI, DI

CopyStringUpper:
 mov bl,'a' ;set up for fast register-register
 mov bh,'z' ; comparisons
StringUpperLoop:
 mov al,[si] ;get the next character and
 inc si ; point to the following character
 cmp al,bl ;below 'a'?
 jb IsUpper ;yes, not lowercase
 cmp al,bh ;above 'z'?
 ja IsUpper ;yes, not lowercase
 and al,not 20h ;is lowercase-make uppercase
IsUpper:
 mov [di],al ;put the character into the new string and
 inc di ; point to the following location
 and al,al ;is this the zero that marks the end of the string?
 jnz StringUpperLoop ;no, do the next character
 ret

; Calls CopyStringUpper to copy & convert SourceString->DestString.
Skip:
 call ZTimerOn
 mov si,offset SourceString ;point SI to the string to copy from
 mov di,offset DestString ;point DI to the string to copy to
 call CopyStringUpper ;copy & convert to uppercase
 call ZTimerOff






[LISTING THREE]

; Clears a buffer using MOV/ADD in a loop.
; Not a standalone program, but designed to be used with the Zen
; timer, as described in Listing 1.

 mov dx,2 ;repeat the test code twice, to make
 ; sure it's in the cache (if there is one)
 mov bx,dx ;distance from the start of one word
 ; to the start of the next
 sub ax,ax ;set buffer to zeroes
TestTwiceLoop:
 mov cx,1024 ;clear 1024 words starting at address
 mov di,8000h ; DS:8000h (this is just unused memory
 ; past the end of the program)
 call ZTimerOn ;start timing (resets timer to 0)
StoreLoop:
 mov [di],ax ;clear the current word
 add di,bx ;point to the next word
 dec cx ;count off words to clear until none
 jnz StoreLoop ; remain
 call ZTimerOff ;stop timing
 dec dx ;count off passes through test code
 jz StoreDone ;that was the second pass; we're done

 jmp TestTwiceLoop ;that was first pass; do second pass with all
 ; instructions and data in the cache
StoreDone:






[LISTING FOUR]

; Clears a buffer using MOV/ADD in an unrolled loop.
; Not a standalone program, but designed to be used with the Zen
; timer, as described in Listing 1.

 mov dx,2 ;repeat the test code twice, to make
 ; sure it's in the cache (if there is one)
 mov bx,dx ;distance from the start of one word
 ; to the start of the next
 sub ax,ax ;set buffer to zeroes
TestTwiceLoop:
 mov si,1024 ;clear 1024 words starting at address
 mov di,8000h ; DS:8000h (this is just unused memory
 ; past the end of the program)
 call ZTimerOn ;start timing (resets timer to 0)
 mov cl,4 ;divide the count of words to clear by
 shr si,cl ; 16, because we'll clear 16 words
 ; each time through the loop
StoreLoop:
 REPT 16 ;clear 16 words in a row without looping
 mov [di],ax ;clear the current word
 add di,bx ;point to the next word
 ENDM
 dec si ;count off blocks of 16 words to clear
 jnz StoreLoop ; until none remain
 call ZTimerOff ;stop timing
 dec dx ;count off passes through test code
 jz StoreDone ;that was the second pass; we're done
 jmp TestTwiceLoop ;that was the first pass; do the second pass
 ; with all instructions and data in the cache
StoreDone:



March, 1991
ASSEMBLY LANGUAGE MACROS


Writing assembly language you can read




Ken Skier


Ken learned to program by hand-assembling object code for a KIM-1
microcomputer with 1K of RAM in 1978. Now he writes word processing,
desktop publishing, and utility software for PCs. His word processor, Eye
Relief, won two Excellence in Software Awards from the Software Publishers
Association in 1990. Ken can be reached at SkiSoft Publishing Corporation,
1644 Massachusetts Avenue, Suite 79, Lexington, MA 02173. Phone: 617-863-1876,
Fax 617-861-0086.


There are a lot of myths about assembly language. Perhaps the most persistent
of these is that assembly language programs are hard to write, hard to debug,
and nigh onto impossible to maintain because they are hard to read.
(Of course, in many cases it's not a myth. Sometimes when I look through
assembly language code in a book or magazine, I find myself subvocalizing,
writing in the margins, trying to interpret the programmer's intent from a
series of MOV and COMPARE instructions. What is he trying to do? And what is
actually going on here?)
No wonder people avoid assembly language.
But I write only in assembly language, and have done so for over a decade.
During that time I have created whole applications from assembly language,
including graphical word processors and a desktop publishing program. They're
smaller than competitive applications, they run faster, and I've found that it
takes me less time to develop applications in assembly language than for
others to do so in C.
Yet you can look through all of my source code and never find a COMPARE
instruction. In fact, you might look through my code and scratch your head,
trying to figure out what language it's written in. Certainly not C or Pascal,
but it doesn't look much like assembly language, either.
That's because I write in macros. I've developed, in effect, a language of
macros so I can write readable source, with all the traditional benefits of
assembly language -- small code size and excellent performance -- without the
penalty of impenetrable code.


Comparisons


Let me give you an example. Suppose we need a routine to let us know whether
some character we've got is an ASCII digit. We'll pass it the character in AL,
and we want it to return TRUE if AL is in the range '0'...'9'; FALSE
otherwise.
Throughout my code, I use Carry as a TRUE flag, because it's easy to set, easy
to test, and doesn't interfere with any of the 80x86 general registers. So we
can write that routine like this:
 A_DIGIT PROC
 CMP AL, '0'
 JB __false
 CMP AL, '9'
 JA __false
 STC
 RET

 __false: CLC
 RET
 endp
This little routine is about as simple as standard assembly language gets, but
it does require you to interpret the 8086 instructions, and from them figure
out the programmer's intent. I wrote these few lines just a moment ago, and
yet when I look at them, I don't find their function clear. I have to
subvocalize: "Compare AL to an ASCII zero. Jump if Below. Below? Ah, if AL is
Below the ASCII zero ... then jump to __false. So if AL is less than an ASCII
zero, go to __false."
I have to go through that kind of interpretation with every couple of lines of
standard assembly language. I don't like that. I want to take a single thought
and write it down as a single line of code -- not as two cryptic lines that
have to be put together to make sense. Even the act of returning a status code
takes two lines in standard assembly language: One to set (or clear) the carry
flag, and another line to return. So a single thought ("Return TRUE") is
encrypted as two lines of code: STC, RET.
I really want to write that subroutine like this:
 A_DIGIT:
 IF AL < '0', return FALSE.
 IF AL > '9', return FALSE.
 Else ... return TRUE.
In practice, my code doesn't look quite that clear, but it's getting there. I
have a subroutine in my code library that looks like this:
 A_DIGIT PROC
 IF_AL {, '0', __false
 IF_AL }, '9', __false

 RET_TRUE

 __false: RET_FALSE

 endp
Does this look like assembly language? Probably not. But it assembles as
exactly the same code shown in the first example. It's just a lot easier to
write and to read.
This routine incorporates three of my most heavily used macros: IF_AL,
RET_TRUE, and RET_FALSE.

RET_TRUE and RET_FALSE are pretty simple. RET_TRUE returns with carry set;
RET_FALSE returns with carry clear. Why bother with a macro to do something so
trivial? Because the most important effect of these macros is not on the
object code, but on the READER. Look at the line "RET_TRUE" in a subroutine
and you can tell immediately what the programmer wants to happen; look at some
carry manipulation and you've got to think about it.
I use macros often to make my intent clear.
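The article's own macro definitions are in its macro definition file; the following is only a minimal sketch of how these two macros could be written in MASM-style macro syntax, which TASM also accepts.
 RET_TRUE MACRO
 stc ;Carry set means TRUE
 ret
 ENDM

 RET_FALSE MACRO
 clc ;Carry clear means FALSE
 ret
 ENDM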
The other macro, IF_AL, is a real workhorse. It lets me express a single
thought in a single line of code -- and it lets me do so visually, using a
natural algebraic notation (as opposed to such confusing mnemonics as JLE,
JGE, and JBE).
When I want to write:
 IF_AL < 12, goto __label
I can write it like this:
 IF_AL {, 12, __label
The macro name "IF_AL" starts the line. It is followed by a visual notation
implying a relationship. IF_AL "understands" the following notation:
 {   less than
 {=  less than or equal
 =   equal
 {}  not equal
 }   greater than
 }=  greater than or equal
(If I had my druthers, I would use "<" and ">" instead of "{" and "}", but
TASM treats angle brackets as delimiters, so I can't use them as arguments to
a macro. Still, braces look enough like "<" and ">" for me to feel pretty
comfortable with this implementation.)
The macro expands in a pretty straightforward way:
 IF_AL arg1, arg2, arg3
expands to a CMP AL, arg2, followed by a conditional jump to arg3. (Arg1
defines which conditional jump to use.) Some extra bookkeeping within the
macro allows it to accept four arguments, if one of them is OFFSET or PTR.
As you might expect, I have similar macros named IF_AH, IF_AX, IF_BL, IF_BH,
IF_BX, IF_CL, and so on for every register in the 80x86. So I never have to
write a CMP instruction, and I never have to figure out which conditional JMP
instruction to use. I leave all of those details to my macros.
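A rough, untested sketch of how such a macro might choose its jump is shown below. It is not the author's definition: it handles only the three-argument form, assumes that the brace notation can be compared as literal text with IFIDN, and uses the unsigned jumps that match the A_DIGIT example.
 IF_AL MACRO rel, value, target
 cmp al,value
 IFIDN <rel>,<{>
 jb target ;less than
 ENDIF
 IFIDN <rel>,<{=>
 jbe target ;less than or equal
 ENDIF
 IFIDN <rel>,<=>
 je target ;equal
 ENDIF
 IFIDN <rel>,<{}>
 jne target ;not equal
 ENDIF
 IFIDN <rel>,<}>
 ja target ;greater than
 ENDIF
 IFIDN <rel>,<}=>
 jae target ;greater than or equal
 ENDIF
 ENDM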


Procedure Calls


I mentioned earlier that throughout my code, I use the Carry flag to indicate
whether a routine has returned TRUE or FALSE. For example, I have a procedure
that asks the user to confirm whether the program should go ahead and do
something. That procedure (called "CONFIRM") displays a message and waits for
a key. The user can press Enter or "Y" or "y" for yes; ESC or "N" or "n" for
no. (Any other key makes CONFIRM beep; it then waits until the user presses a
legal key.) When it receives a legal key, CONFIRM returns TRUE or FALSE, to
report the user's intent.
I could use the CALL instruction to invoke CONFIRM, and then use JC or JNC to
branch based on carry. But doing so would take two lines of code, and involve
testing carry in a way whose meaning might not be clear. After all, I don't
really care about carry as a bit; I care about whether a procedure is
returning "true" or "false."
So I have two macros that handle this situation: IF_ and IF_NOT.
 IF_ arg1, arg2
expands to:
 CALL arg1
 JC arg2
and
 IF_NOT arg1, arg2
expands to:
 CALL arg1
 JNC arg2
These extremely simple macros contribute greatly to the readability of my
code. Now, instead of writing two lines of conventional assembly language such
as this:
 CALL CONFIRM
 JC __do_it
I write one line using my macro:
 IF_ CONFIRM, __do_it
In this line of code, the intent of the programmer is clear. "If the user
confirms, then go ahead and do it!"
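Both macros are short enough to sketch in full. The definitions below are an illustration in MASM-style syntax rather than the author's own listing; the parameter names are arbitrary.
 IF_ MACRO routine, target
 call routine ;routine reports TRUE or FALSE in Carry
 jc target ;branch if it returned TRUE
 ENDM

 IF_NOT MACRO routine, target
 call routine
 jnc target ;branch if it returned FALSE
 ENDM
Because JC and JNC are short jumps in 8086 code, the target label must lie within about 128 bytes of the macro invocation.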


Tables


I use a lot of tables in my code. Tables are small, efficient, and lend
themselves to easy modification. For example, I will often use a table to
translate an 8-bit character to a 16-bit procedure address. "The user just
pressed this function key. What procedure should I invoke?" The table consists
of a series of entries, where each entry is a BYTE followed by a WORD.
I would really like to be able to set up the table like this:
 db a_key dw a_procedure
 db other_key dw other_procedure
 .
 .
 .
 db 0
This would let me have each key and its associated procedure on the same line
of the table. But TASM won't let me put two instructions on the same line. So
I have to do this:
 db a_key
 dw a_procedure
 db other_key
 dw other_procedure

 .
 .
 .
 db 0
I don't like that at all. It fails to show the direct relationship between a
key and its corresponding procedure. (And I will be in terrible trouble if I
delete a DB line without deleting the DW line below it!) But I can't put a DB
and a DW on the same line ... or I thought I couldn't, until I decided to
create a macro.
My macro, DBW, lets me put a DB and then a DW on the same line.
 DBW arg1, arg2
expands to:
 db arg1
 dw arg2
Using the DBW macro, I can make my tables look the way they should:
 dbw a_key, a_procedure
 dbw other_key, other_procedure
 . .
 . .
 dbw 0, 0
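
For readers who think in C, the lookup that such a key table implies can be sketched as below. This is an illustration only, not part of the SkiSoft macros: key_entry, find_handler, the two handlers, and the key codes are all hypothetical, and a C struct does not reproduce the packed BYTE-plus-WORD layout that DBW emits.

```c
#include <stddef.h>

typedef void (*handler_fn)(void);

struct key_entry {
    unsigned char key;      /* key code; 0 marks the end of the table */
    handler_fn    handler;  /* procedure to invoke for that key */
};

static int last_called;     /* records which hypothetical handler ran */
static void do_help(void) { last_called = 1; }
static void do_save(void) { last_called = 2; }

/* Each key and its procedure share one line, as with DBW. */
static const struct key_entry key_table[] = {
    { 0x3B, do_help },      /* hypothetical code for one function key */
    { 0x3C, do_save },      /* hypothetical code for another */
    { 0,    NULL    }
};

/* Scan the table until the zero terminator; NULL means "no such key". */
static handler_fn find_handler(const struct key_entry *table, unsigned char key)
{
    for (; table->key != 0; table++) {
        if (table->key == key)
            return table->handler;
    }
    return NULL;
}
```

Because each key sits beside its procedure in a single initializer, deleting an entry cannot leave an orphaned half behind, which is the same property the DBW macro buys in assembly.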


Equates


Macros aren't the only way to make your code more readable. In many cases you
can use equates, instead.
I use equates to help me write code in a style that I find more natural. For
example, I learned to program in 6502 assembly language, so I am accustomed to
writing JSR instead of CALL, RTS instead of RET, and SEC instead of STC.
Rather than reprogram my brain to type the correct mnemonics for Intel 80x86
assembly language, I just put a few lines into the MACROS file (see the macro
definition file in Listing One) that I include at the start of every module:
 JSR equ <CALL>
 RTS equ <RET>
 SEC equ <STC>
Now when I type JSR, TASM knows I mean "CALL" ... when I type SEC, TASM knows
I mean "STC" ... and when I type RTS, TASM knows I mean RET. That doesn't
enhance readability, exactly (except to another 80x86 programmer who cut his
teeth on the 6502!) but I like the idea of teaching TASM to accommodate me,
rather than vice versa.
Before I ever wrote a line of code, I was a writer and a teacher of writing. I
believed very strongly that writing should be read aloud, and that a good
piece of writing is clear to the person who reads it. Now I write word
processors instead of novels, screen drivers instead of dialogues, but I find
the basic process of writing is the same, whether I am writing English
narrative or assembly language source code. It has to be readable. It has to
make sense to someone else.
Fortunately, as an assembly language programmer I can use macros to make my
code more readable. I think it makes me more productive. And it makes it
possible for me to understand my code long after I've written it.
After all, the most efficiently-programmed code in the world isn't going to do
you much good if you can't understand it when you look at it in your editor.


_ASSEMBLY LANGUAGE MACROS_
by Ken Skier



[LISTING ONE]

;--------------------------------------------;
; SkiSoft Macros ;
; ;
; Copyright (c) 1990 by SkiSoft, Inc. ;
; All rights reserved. ;
; ;
; ;
; Created by Ken Skier ;
; ;
; SkiSoft, Inc. ;
; 1644 Massachusetts Avenue, Suite 79 ;
; Lexington, MA 02173 ;
; Tel: (617) 863-1876 Fax: (617) 861-0086 ;
; ;
;--------------------------------------------;



@ EQU OFFSET


JSR equ CALL
RTS equ RET
SEC equ STC


IF_ MACRO sub, dest
 CALL sub
 JC dest
 ENDM


IF_NOT MACRO sub, dest
 CALL sub
 JNC dest
 ENDM



RET_FALSE MACRO
 CLC
 RET
 ENDM



RET_TRUE MACRO
 STC
 RET
 ENDM



;---------------------------------------;
; ;
; 16-bit Register Compare macros ;
; ;
;---------------------------------------;


IF_AX MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG16 AX, exp, val, dest, last
 ENDM

IF_BP MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG16 BP, exp, val, dest, last
 ENDM

IF_BX MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG16 BX, exp, val, dest, last
 ENDM

IF_CX MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG16 CX, exp, val, dest, last
 ENDM

IF_DX MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG16 DX, exp, val, dest, last
 ENDM

IF_SI MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG16 SI, exp, val, dest, last
 ENDM

IF_SP MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG16 SP, exp, val, dest, last
 ENDM

IF_DI MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG16 DI, exp, val, dest, last
 ENDM


IF_REG16 MACRO reg, exp, val, dest, last
 %POPLCTL ;; Restore source-level listing parameters.
 IFIDNI <val>, <@>
 CMP reg, @ dest
 %PUSHLCTL
 %NOLIST
 IFITS_ exp, last
 ELSE
 CMP reg, word ptr val
 %PUSHLCTL
 %NOLIST
 IFITS_ exp, dest
 ENDIF
 ENDM


;---------------------------------------;
; ;
; 8-bit Register Compare macros ;
; ;
;---------------------------------------;


IF_AL MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG8 AL, exp, val, dest, last
 ENDM



IF_AH MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG8 AH, exp, val, dest, last
 ENDM


IF_BL MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG8 BL, exp, val, dest, last
 ENDM

IF_BH MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG8 BH, exp, val, dest, last
 ENDM

IF_CL MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG8 CL, exp, val, dest, last
 ENDM

IF_CH MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG8 CH, exp, val, dest, last
 ENDM

IF_DL MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG8 DL, exp, val, dest, last
 ENDM

IF_DH MACRO exp, val, dest, last
 %PUSHLCTL
 %NOLIST
 IF_REG8 DH, exp, val, dest, last
 ENDM


IF_REG8 MACRO reg, exp, val, dest, last
 %POPLCTL ;; Restore source-level listing parameters.
 IFIDNI <val>, <@>
 CMP reg, @ dest
 %PUSHLCTL
 %NOLIST
 IFITS_ exp, last
 ELSE
 CMP reg, byte ptr val
 %PUSHLCTL
 %NOLIST
 IFITS_ exp, dest
 ENDIF
 ENDM



IFITS_ MACRO exp, dest
 %POPLCTL ;; Restore source-level listing parameters.
 IFIDNI <exp>, <{> ;; <
 JB dest
 elseIFIDNI <exp>, <=> ;; =
 JE dest
 elseIFIDNI <exp>, <}> ;; >
 JA dest
 elseIFIDNI <exp>, <{=> ;; <=
 JBE dest
 elseIFIDNI <exp>, <{}> ;; <>
 JNE dest
 elseIFIDNI <exp>, <}=> ;; >=
 JAE dest
 ENDIF
 ENDM



IFITS MACRO exp, dest
 %PUSHLCTL
 %LIST
 IFIDNI <exp>, <{> ;; <
 JB dest
 elseIFIDNI <exp>, <=> ;; =
 JE dest
 elseIFIDNI <exp>, <}> ;; >
 JA dest
 elseIFIDNI <exp>, <{=> ;; <=
 JBE dest
 elseIFIDNI <exp>, <{}> ;; <>
 JNE dest
 elseIFIDNI <exp>, <}=> ;; >=
 JAE dest
 ENDIF
 %POPLCTL
 ENDM

IF_ITS EQU IFITS


;--------------------------------------;
; ;
; End of SkiSoft macros. ;
; ;
;--------------------------------------;















March, 1991
PORTING UNIX TO THE 386: THE STANDALONE SYSTEM


Creating a protected-mode standalone C programming environment




William Frederick Jolitz and Lynne Greer Jolitz


Bill was the principal developer of 2.8 and 2.9 BSD and the chief architect of
National Semiconductor's GENIX. Lynne established TeleMuse, a market research
firm specializing in the telecommunications and electronics industry. They can
be contacted via e-mail at william@berkeley.edu or at uunet!william.
Copyright (c) 1990 TeleMuse.


This is the third article in this series, and at this point we feel it is
important to examine just what we have accomplished so far. In our first
article, we arrived at essentially a "plan of action," outlining what we
understand to be the important goals of our project, as well as discussing (as
always, in hindsight) some of the important technical decisions made in the
process of completing our successful port of BSD to the 80386. In the second
article, we wrote three programs (using Turbo C and MASM) to prepare our host
for the beginnings of this port by creating the basic tools. We are now at the
point of departure, where the goal itself can become all-consuming.
"Why all the drama?" one may ask. Well, what we are about to embark on may be
considered, in a sense, like a mountaineering expedition to K2. We have done
all the scheduling and planning and assembled the consumables and equipment.
Now, here we sit at the base of the mountain, staring up at its intimidating
peak and contemplating our first steps with both anticipation and dread.
Projects of great complexity are always uncertain.
In this case, our mountain is an empty 386 residing in protected mode. There
is not one shred of code that we can rely on. One false step can cause a
spontaneous reset, or worse yet, a hang. Please believe us when we say that it
takes a lot of courage to take on such projects. Now one must shrug off any
uncertainty and begin to place one foot in front of the other
and scale the foothills. We must establish our base camp from which we can
explore further.
In this article, we endeavor to scale those foothills and establish our base
camp by building upon our previous work; using our protected-mode program
loader, we can create a minimal 80386 protected-mode standalone C programming
environment for operating systems kernel development work. Next, we must write
prototype code for various kernel hardware support facilities. Finally, we
must use our standalone programming environment as a test bed to shake out the
bugs in our first-cut implementation of kernel 386 machine-dependent code in
preparation for incorporation in the BSD kernel.
At this point, the neophyte tends to ask the question (and it is a good
question, mind you) "Why spend so much time on small programs, prototype code,
and the like, instead of getting into the hard stuff?" Yes, it does appear on
the surface that one should start shoveling this huge operating system through
the compiler and onto our host. (At this very instant of writing, the BSD
kernel consists of 128,332 lines, according to wc -l, and supports roughly 150
Mbytes of user-level sources -- sorry, can't wait to consult wc on that!)
Besides being a bit of a bore, just what would happen if we jumped into
compiling code willy-nilly? In sum, it would be a complete disaster, as we
would spread all of our latent bugs and misconceptions over a much broader
body of code. Worse yet, all these different bugs would be well-distributed
throughout our code and hence not easily differentiated or ordered.
A simple beginning affords us the chance to find various bugs early, when the
problem still has a chance of being resolved to our satisfaction. We have
plenty of land mines to avoid as it is, without adding to our troubles.


Watching for Land Mines


Porting 386BSD was definitely an eerie return to the basics; using the
protected-mode loader, we bulldozed MS-DOS right off the top and were left
with an empty machine that we incrementally built up to a functioning UNIX
system. At this stage in the port, an unlocalizable or nondeterministic bug is
a very real and costly possibility that can stall a porting project for
months.
At this point, a crafty programmer can use subtle techniques which anticipate
the sources of error and enable them to be identified and corrected in a
predictable and orderly manner. In a famous discussion, Donald E. Knuth wrote
of how he was able to greatly reduce the time it took to debug a new compiler
by anticipating worst-case test conditions and "stress testing" by use of the
adversary method. We employ similar techniques by testing the mechanisms used
in the kernel separately from the vast body of code that is the kernel. This
also has the added benefit of differentiating problems in the code from
compiler and assembler bugs that are almost certain to be present. This is not
a method that guarantees success, but it is much better to seek out trouble
rather than wait for it to find you.
Serious thought was given to implementing a tiny debugger to facilitate this
stage of the port. PC debuggers were also examined as a possible tool to ease
this effort. However, it was determined that the effort to keep the tools
concurrent with the rest of the project would be too costly for too little
advantage. This proved, in retrospect, to be the correct strategy for 386BSD,
since most of the bugs we encountered were either inherent to our naive
assumptions regarding the 386 instruction set or to silent "features" of the
particular version of the GNU GAS assembler used, and as such would have
affected the debugger as well as the operating system kernel we were porting.
However, an appropriate debugging tool might have cut our development time and
would have been especially useful if we were contemplating many ports to other
architectures.
In practice, an appropriate tool for kernel debugging should afford little
impact on the absolute environment. It should allow for source-level debugging
(now considered a bare minimum requirement) and should leverage existing
development platforms as much as possible. Van Jacobsen's "kernel GDB," for
example, is derived from GNU's GDB debugger. It uses a small stub routine in
the kernel and a serial line to communicate with a cross-host running the
debugger. Other debuggers probably exist that exhibit these qualities, but we
are unaware of them.
With this article, our port is moved from the conceptual to the tangible
world. This discussion, while by no means complete, addresses many of the
mechanisms necessary for 386BSD kernel functionality. It is tantalizing to
watch as the basic mechanisms come together, and one tends to think of what
remains as mere bookkeeping. If this were true, operating systems programming
techniques would more closely resemble those used by applications programmers.
They do not, however. While it may appear that the end of the project is
nearing, what is actually occurring is merely the first battle of a long and
involved war. Again, to use our mountain climbing analogy, we clear the
foothills only to be faced with the mountain range itself. In other words, we
are continually challenged with a new class of obstacles.


The First Step


At this point, there is little confidence in any of our tools because we have
yet to actually "shake down" the absolute loader, assembler, and link editor.
Beginning with trivial programs of a few instructions and gradually expanding
them, we incrementally prove our tools to the point where we can use them with
some degree of confidence. The journey begins, as always, with a single step.
The program in Listing One is the simplest protected-mode program we can write
that generates output on the screen. It displays the message "hi" midscreen
and then stops. A program this simple must always work. If it does not, it
presents a minimum number of possibilities to determine why it fails. During
our port, this program originally did not work, due to bugs in the early
loader and assembler. While this may seem trite to some, this program
illustrates the pathetic level at which untested software tools begin. After
eliminating a handful of nuisance bugs, this simple program did work, and it
proved valuable because it was able to smoke out bugs quickly.
A side note to those who may have noticed that our assembly code format seems
to have changed since the previous article when we used Microsoft's MASM: For
those unaware, UNIX 386 assemblers prefer the operands in the opposite order,
partly because early UNIX systems appeared on PDP-11s, which preferred this
ordering style. Thus, on MS-DOS with MASM:
 mov eax,edx ;move contents of edx into eax register
corresponds to the UNIX assembler format:
 movl %edx,%eax #move contents of edx into eax register
In other words, the UNIX format is (source, destination) instead of MASM's
(destination, source).
This is yet another stunning "improvement" in the field of computer languages,
destined to be appreciated by those simultaneously debugging a MASM-coded
bootstrap loader and code generated by the GAS UNIX assembler!
As we proceed further, we add more complexity, testing span-dependent jumps,
stacks, and other mechanisms. Listing Two is a more elaborate program which
sends character and string output functions to the screen, thus allowing for a
primitive degree of debugging. Listing Three contains a simple runtime
start-off for C, with the obligatory "hello world\0" program heralding our
arrival into serious programming mode. At this point, we've found most of the
silly bugs and also created a primitive debugging tool. One might even claim
that, through this method, our entire BSD UNIX system is derived from our
original two-instruction program that we started with!


Introducing the Standalone System


The next milestone on our path was to produce, debug, and test a library of
support routines written in absolute protected-mode code. These routines allow
us to write the GCC programs needed to implement 386 machine-dependent code,
to access devices, and to access UNIX file structures on the hard disk. For
the code in Listing Three to function, a library is required to fill out all
of the primitives invoked.
This library and corresponding programs constitute a standalone system of a
kind, and it affords us an opportunity to write a minimal amount of
machine-dependent code outlining our basic structure before we commit to
massive coding. It is a minimal C environment at best, but more than enough
for us to implement and test things like exception catching, system call
handling, line clock interrupts, and so forth. As we begin our climb, we are
able to expand a toehold into a foothold.
The standalone system actually consists of assembly language programs for
runtime start-off and processor support (module srt.s), as well as
machine-dependent C code for device support (many modules, including kbd.c and
cga.c) and machine-independent C code for language support, formatted output,
and filesystem operations (modules prf.c and sys.c). With the standalone
system, a file can be read or written from a BSD filesystem on a disk drive.
The BSD standalone system is not quite intended for this purpose; instead,
it's used to bootstrap load the system from disk or tape as part of the
process of initializing the computer to run the BSD system. Since we don't yet
have an operable kernel to be loaded and we've already written an MS-DOS
program loader (see DDJ, February 1991), the standalone system is not really
of use to us yet. However, the standalone system also provides us with file
I/O, formatted output, and a structure to hang hardware drivers on, while
demanding little from the hardware for support. Thus, we can use the
standalone system to prototype code for the kernel, with the added dividend of
completing the bootstrap code required by the complete kernel.
To run this minimal system, only the simplest of keyboard, display, and hard
disk device drivers are required. These can be enhanced later as needed.



Keyboard Driver


Listing Four outlines code for a simple driver, which extracts ASCII
characters from the PC keyboard on demand by grabbing scan codes from the
8042 keyboard interface, consulting a table of actions for the given key press
scan code, and returning the appropriate value out of the key table. It
does not even bother to initialize the keyboard controller, since we know that
MS-DOS already did that for us before we loaded the program with our absolute
loader.
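The table-driven translation described above might look roughly like the following sketch. It is not Listing Four: the names, the tiny key table, and the shift handling are assumptions made for illustration, using the standard PC "set 1" key codes. On real hardware the code byte would be read from the 8042 data port; here it is passed in as an argument so the logic can be exercised anywhere.

```c
#include <stddef.h>
#include <stdint.h>

#define SC_RELEASE 0x80     /* bit 7 set: the key was released */
#define SC_LSHIFT  0x2A
#define SC_RSHIFT  0x36

struct key_action {
    uint8_t code;           /* key code with the release bit stripped */
    char    plain;          /* value returned without shift */
    char    shifted;        /* value returned with shift held */
};

/* Deliberately tiny; a real driver would cover every key. */
static const struct key_action key_table[] = {
    { 0x01, 0x1B, 0x1B },   /* Esc */
    { 0x1C, '\n', '\n' },   /* Enter */
    { 0x1E, 'a',  'A'  },
    { 0x30, 'b',  'B'  },
    { 0x2E, 'c',  'C'  },
    { 0x39, ' ',  ' '  },   /* space bar */
};

static int shift_down;      /* nonzero while a shift key is held */

/* Translate one code byte. Returns the ASCII value, or 0 when the byte
   produces no character (key release, shift key, or key not in table). */
static char translate_scancode(uint8_t code)
{
    int released = (code & SC_RELEASE) != 0;
    uint8_t key = code & 0x7F;
    size_t i;

    if (key == SC_LSHIFT || key == SC_RSHIFT) {
        shift_down = !released;
        return 0;
    }
    if (released)
        return 0;
    for (i = 0; i < sizeof key_table / sizeof key_table[0]; i++) {
        if (key_table[i].code == key)
            return shift_down ? key_table[i].shifted : key_table[i].plain;
    }
    return 0;
}
```

A polling read routine would simply call this in a loop, discarding zero results until a printable value appears.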
This is the first place where we are hit with variations in PC keyboard
interfaces, all of which are hidden from applications programs and MS-DOS by
appropriate BIOS ROM drivers. It is possible to dance back and forth between
real and protected mode (thankfully made easier on the 386 than was the case
on the 286), "translating" the BIOS calls into BSD driver requests. This
method was intended for the PC, if one examines its real-mode ancestry, and
also addresses a nest of manufacturer idiosyncrasies. However, it goes against
the grain of our project in three basic ways: 1. performance degrades in
getting away from direct interaction with the hardware; 2. incompatibility
with previous BSD systems develops; and 3. implementation becomes a bigger
project than the port itself. In addition, it perpetuates the intertwining of
MS-DOS and UNIX to the point where it becomes a significant future liability.
To resolve this dilemma, we must choose to support the "raw" machine in its
entirety, with the result that undocumented or "secret" proprietary hardware
features must be ignored. This is not as great a burden as it may first
appear, because a considerable body of code already exists for this purpose,
and the great bulk of 386 AT platforms already conform to de facto hardware
standards.


Display Adapter


In Listing Five, we can examine the code from a trivial "glass tty" terminal
emulator for the display, which in this case happens to be a CGA board. We can
be content at the moment with newline, carriage return, and tab functions,
since we do not intend to do anything other than line-oriented text output in
the standalone system. Scrolling, by far, is the most complicated feature.
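The core of such a "glass tty" can be sketched in a few lines. This is a hedged illustration rather than the code of Listing Five: the names are hypothetical, an ordinary array stands in for the adapter's text buffer, and the sketch assumes 80 x 25 text mode with one 16-bit cell per character (character in the low byte, attribute in the high byte). It also assumes that newline returns to column 0.

```c
#include <stdint.h>
#include <string.h>

#define COLS 80
#define ROWS 25
#define ATTR 0x0700         /* light grey on black, in the high byte */

static uint16_t screen[ROWS * COLS];  /* stand-in for the display buffer */
static int row, col;                  /* current cursor position */

/* Move every line up by one and blank the bottom line. */
static void scroll_up(void)
{
    int i;

    memmove(screen, screen + COLS, (ROWS - 1) * COLS * sizeof screen[0]);
    for (i = 0; i < COLS; i++)
        screen[(ROWS - 1) * COLS + i] = ATTR | ' ';
    row = ROWS - 1;
}

static void tty_putc(char c)
{
    switch (c) {
    case '\n':              /* assumption: newline also returns to column 0 */
        col = 0;
        row++;
        break;
    case '\r':
        col = 0;
        break;
    case '\t':              /* advance to the next multiple of 8 */
        col = (col + 8) & ~7;
        break;
    default:
        screen[row * COLS + col] = ATTR | (uint8_t)c;
        col++;
        break;
    }
    if (col >= COLS) {      /* wrap at the right margin */
        col = 0;
        row++;
    }
    if (row >= ROWS)
        scroll_up();
}
```

Nearly all of the work outside the switch statement is scrolling and wrapping, which matches the observation that scrolling is the most complicated feature.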
Our decision to avoid the BIOS at this point does make things more difficult,
because the BIOS automatically configures in device-dependent code from ROMs
onboard the display card to support the given device. Fortunately, market
forces have kept the proliferation of variations down to a reasonable number,
with either MDA or CGA interface standards supported by practically all
boards. Up to the point where we must support X Windows, we can live with
probing to determine the display type and "hard coding" for each.


Prevaricating with the Standalone System


The standalone system also provides us with a test bed for trying many
different ideas which can satisfy the mechanisms used in the BSD operating
system kernel, for we can then selectively test these mechanisms individually.
Otherwise, we would be forced to test them all together within the operating
system. Thus, as we vary our approach, we can determine whether each method
satisfies our basic specification conditions and whether implementation is
feasible for our project. Over the course of this project, the support
strategies for device-interrupts configuration and process context-switching
changed drastically as we began to notice the degree of difference between
porting BSD to a 386 and porting BSD to more "conventional" architectures. In
fact, we were still using the standalone system to find unintended
interactions in our 386 hardware-features support code long after we had
386BSD self-compiling. Another valuable aspect of this test bed is we can
benchmark competitive solutions to the same kernel support mechanism sans
other interactions. This was useful in selecting appropriate context switch,
interrupt control, and virtual memory system code.


Extending the Standalone System


On top of the standalone system framework (which really requires very little
processor-dependent support) we can write and test portions of code for the
operating system kernel (which requires quite a lot of processor-dependent
support). In the following sections of this article, we will discuss some
extensions to the standalone system which add kernel functionality. Processor
support for the kernel reflects support for memory protection of 386 "rings,"
ring crossings, and address space translations among other needs (see the
accompanying box "Brief Notes: 386 Rings").
These extensions are not required for the standalone system to function, but
they are not only used to test the kernel code, but actually form the basis
for the prototype kernel code. In essence, the standalone system can be viewed
as if it were the kernel itself, or possibly even a nano-kernel!


Processor Support -- i386.c


Within the i386.c module appears the code and data structures needed to
"wire-down" most of the 386's processor structures (descriptors, exceptions,
task switch state). init386() is a subroutine that "fills in the blanks" and
test386() tests portions of the mechanisms we will need to run our BSD UNIX
system. Note that this creates a superficial test bed that does not entirely
address our intended system, as user and kernel mode not only share address
space, they are the same program!
We start first by initializing paging (Listing Nine). The next fragment
contains code which enables paging by building a set of page tables and page
directory. For this example, we map virtual addresses to correspond with
physical addresses identically, and allow the first 4 Mbytes of physical
memory to be referenced "read/write" by both user and supervisor (kernel)
rings. It is important to remember that while the processor's instructions
work through the paging MMU with virtual addresses, the addresses that the MMU
uses to consult page directory and page tables are all physical addresses.
These physical addresses do not always correspond to the virtual addresses
that the processor uses, unlike this example where virtual addresses are
mapped one for one. As a result, when modifying the page tables and page
directory the kernel must explicitly convert any virtual addresses used to
physical.
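To make the identity mapping concrete, here is a minimal sketch of building one page table and one page directory for the first 4 Mbytes. It is an assumption-laden illustration, not Listing Nine: the function and macro names are hypothetical, and the physical address of the page table is passed in by the caller. The small translate() helper walks the tables in software, the way the MMU would, so that the mapping can be checked on any host.

```c
#include <stdint.h>

#define PG_V    0x001u      /* entry is present */
#define PG_RW   0x002u      /* writable */
#define PG_U    0x004u      /* accessible from the user ring */
#define NPTE    1024u       /* entries per page table or directory */
#define PAGE_SZ 4096u

/* Map virtual == physical for the first 4 Mbytes. pt_phys must be the
   PHYSICAL address of pt, because the MMU never uses virtual addresses
   when it consults the directory and tables. */
static void build_identity_map(uint32_t *pd, uint32_t *pt, uint32_t pt_phys)
{
    uint32_t i;

    for (i = 0; i < NPTE; i++) {
        pt[i] = (i * PAGE_SZ) | PG_V | PG_RW | PG_U;
        pd[i] = 0;                          /* not present */
    }
    pd[0] = pt_phys | PG_V | PG_RW | PG_U;  /* slot 0 covers 0..4 Mbytes */
}

/* Software walk of the tables. Returns 1 and stores the physical address
   on success, 0 if the address is unmapped. Only directory slot 0 is
   backed by a table in this sketch, so pt is passed in directly. */
static int translate(const uint32_t *pd, const uint32_t *pt,
                     uint32_t va, uint32_t *pa)
{
    uint32_t dir = va >> 22;
    uint32_t idx = (va >> 12) & 0x3FFu;

    if (dir != 0 || !(pd[dir] & PG_V))
        return 0;
    if (!(pt[idx] & PG_V))
        return 0;
    *pa = (pt[idx] & ~0xFFFu) | (va & 0xFFFu);
    return 1;
}
```

With this map in place every address below 4 Mbytes translates to itself, and anything above it is simply not present.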
Another point to mention about this paging mechanism is that the page tables
and page directory themselves need to be mapped to a given virtual address so
that the kernel may modify them to change address translation on demand. An
oddity of this paging mechanism is that it can work even if the page tables
are completely inaccessible to the kernel in its virtual address space. This
would be inconvenient for the kernel, however, as it spends a great deal of
time modifying these structures already.
Two assembly language helper routines lcr0() and lcr3() allow us to set the
386's processor control and page directory base register, respectively. Since
we are already running "protected," the lcr0() simply overwrites the already
set protect-mode bit as well as the paging-mode bit, allowing the MMU to enter
into paging mode.
Our page tables and directory as encoded here provide a null address mapping,
so that there is, as yet, no effective difference in address translation. One
might wonder why we must do this. If we don't, several subtle problems arise.
For example, if the address mapping of the instructions we are executing were
to differ, the 386's view of which instruction was to be executed next might
no longer match the next assembled instruction the program should have
executed. Both must be changed synchronously. Worse, if the 386 has an
instruction queue fetching asynchronously, we may not be able to predict
exactly when the transition occurred. The safest way to avoid these problems
is to enable page mapping with no net translation, then modify the address
mapping after the processor is running on the "identity" map. We can then
arrange to flush our various processor instruction queues and MMU address
translation buffers before allowing the processor to execute instructions in a
"translated" portion of the address space.
Besides paging, we must reinitialize segmentation. We start by "flattening"
the 386 with our descriptor tables. On the 386 (see Listing Six), our Global
Descriptor Table (GDT) describes address space selectors that will have global
visibility within our BSD kernel such that all processes will see them. Kernel
address space requires a descriptor for instructions and data, as well as a
task gate used to switch processes through, and various task state descriptors
used to save and restore state on demand. The kernel has a "panic" task state
reserved to be used when catching certain exceptions that require a "known
good" task state.
For the address space selectors used in user processes, we have the Local
Descriptor Table (LDT). We can use, potentially, one per process. These
descriptors, as the name suggests, are private to each process, and describe
the memory segments of that process. In addition, we have "gates." We need to
use only one to call the system.
Descriptors come in many different flavors (see Listing Seven): Those that
refer to memory or system data structures directly, and gates that indirectly
refer to other memory segments. We use task gates to generically switch to the
next consecutive task state, and call gates to allow us to enter the kernel's
global code segment in a system call. Gates get their name from the controlled
fashion in which they regulate ring crossings, again from the MULTICS heritage.
Actually our coverage of descriptors is not yet complete. We have hidden
descriptors as well that serve special functions. Interrupts and exceptions on
the 386 index yet another descriptor table, the Interrupt Descriptor Table
(IDT). No program code can call these gate descriptors. Instead, external
interrupts and internal processor exceptions transfer through these gate
descriptors. We also use a kind of "meta descriptor" called a "region
descriptor," which is used to describe descriptor tables so that we can load
them via appropriate instructions. So much for the cast of players in this
descriptor drama.
Because the actual descriptor encoding is somewhat obscure (it was meant to be
backward-compatible with the 286), we chose to keep descriptors in a simpler
software form and have a subroutine shuffle them into the appropriate form
when presented to the hardware for use. In Listing Six, local and global tables
are filled out by translating them into hardware form and loading them with
the lgdt() and lidt() functions. We do this, even though we are already in
protected mode, to provide the newer version of the descriptor tables that we
wish to use. The function lgdt() hides some characteristics of 386
segmentation from view: because we are running on active GDT descriptors when
we reload the GDT, we need to flush the instruction prefetch queue and reload
all kernel descriptors. This ensures proper code execution. We then reload the
CS register by turning the normal intrasegment return into an intersegment
return.
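The "shuffling" of a software descriptor into hardware form can be sketched as follows. The struct and function names here are hypothetical and are not taken from Listing Six; the byte layout is the 386's segment descriptor format, in which the base and limit fields are scattered across the 8 bytes for 286 compatibility.

```c
#include <assert.h>
#include <stdint.h>

/* A readable, software-side description of a segment (hypothetical). */
struct soft_desc {
    uint32_t base;      /* linear base address */
    uint32_t limit;     /* 20-bit limit, in bytes or in 4K pages */
    uint8_t  type;      /* 5 bits: S flag (0x10) plus 4-bit type */
    uint8_t  dpl;       /* privilege ring, 0..3 */
    uint8_t  present;   /* nonzero if the segment is present */
    uint8_t  def32;     /* nonzero for 32-bit default operand size */
    uint8_t  gran;      /* nonzero if limit is counted in 4K pages */
};

/* Pack the software form into the 8 bytes the processor expects. */
static void pack_desc(const struct soft_desc *s, uint8_t hw[8])
{
    assert(s->limit <= 0xFFFFF && s->dpl <= 3 && s->type <= 0x1F);

    hw[0] = s->limit & 0xFF;                /* limit 7:0   */
    hw[1] = (s->limit >> 8) & 0xFF;         /* limit 15:8  */
    hw[2] = s->base & 0xFF;                 /* base 7:0    */
    hw[3] = (s->base >> 8) & 0xFF;          /* base 15:8   */
    hw[4] = (s->base >> 16) & 0xFF;         /* base 23:16  */
    hw[5] = (uint8_t)((s->present ? 0x80 : 0) | (s->dpl << 5) | s->type);
    hw[6] = (uint8_t)((s->gran ? 0x80 : 0) | (s->def32 ? 0x40 : 0) |
                      ((s->limit >> 16) & 0x0F));   /* flags, limit 19:16 */
    hw[7] = (s->base >> 24) & 0xFF;         /* base 31:24  */
}
```

A "flat" ring-0 code segment (base 0, limit 0xFFFFF pages, type 0x1A) packs to the bytes FF FF 00 00 00 9A CF 00, which is the pattern the flattening described above relies on.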
In the case of our IDT table, we use a subroutine, setgate() (see Listing
Eight) to build interrupt gate descriptors that will enter the system's global
code descriptor at special assembler stub routines. Each is referred to by a
special naming convention hidden by the IDTVEC() macro that catches the
exception or interrupt. With all of these descriptor tables loaded and in
place, the 386 now has complete information describing the legitimate
references to RAM memory by user programs, the operating system kernel, and
hardware-accessed data structures. Exceptions, including incorrect references
to memory, will also be caught and directed to appropriate code.
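A gate-building routine in the spirit of setgate() might be sketched as below. The function name make_gate, its argument list, and the constants are assumptions for illustration and do not reproduce Listing Eight; the byte layout is that of a 386 interrupt or trap gate, which splits the handler offset around the selector and access byte.

```c
#include <stdint.h>

#define GATE_INTR386 0x0E   /* 386 interrupt gate: interrupts disabled on entry */
#define GATE_TRAP386 0x0F   /* 386 trap gate: interrupt flag left alone */

/* Build one 8-byte gate that enters the code segment named by `selector`
   at address `handler`. `dpl` is the least privileged ring allowed to
   invoke the gate with a software INT instruction. */
static void make_gate(uint8_t hw[8], uint32_t handler, uint16_t selector,
                      uint8_t type, uint8_t dpl)
{
    hw[0] = handler & 0xFF;                 /* offset 7:0   */
    hw[1] = (handler >> 8) & 0xFF;          /* offset 15:8  */
    hw[2] = selector & 0xFF;                /* target code selector */
    hw[3] = selector >> 8;
    hw[4] = 0;                              /* unused in these gate types */
    hw[5] = (uint8_t)(0x80 | ((dpl & 3) << 5) | (type & 0x0F));  /* present */
    hw[6] = (handler >> 16) & 0xFF;         /* offset 23:16 */
    hw[7] = (handler >> 24) & 0xFF;         /* offset 31:24 */
}
```

Filling an IDT then amounts to calling such a routine once per vector, with the address of the matching assembler stub as the handler.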
One virtue of this complicated scheme of descriptors and segments is that it
is possible to add new microprocessor features by simply adding new descriptor
types. The mechanism is now general enough to support a wide variety of data
objects in a consistent way.


Initial Task State Load


Even if we don't wish to use the 386's special context switch feature, we must
initialize a root task state. Why? Because once we are in a user-mode process,
only the TSS (Task State Segment) contains the information on where the stack
is in the supervisor (kernel) processor ring.
Interestingly, the processor will indeed go into user mode, functioning just
fine until a trap, interrupt, or system call occurs. Most other processors
have dedicated register sets to locate the kernel stack in these cases. But
the 386 designers conceptualized ring crossings (user <-> kernel mode) like
that of task switches. Thus, we include the supervisor "entry into ring" stack
pointer in the TSS.
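For reference, the hardware-defined layout of the 386's 32-bit TSS can be written as a C structure. This sketch is not Listing Ten: the structure and function names are hypothetical, and the layout assumes a compiler that adds no padding between 32-bit members. The ss0/esp0 pair near the top is the "entry into ring" stack pointer discussed above.

```c
#include <stddef.h>
#include <stdint.h>

struct tss386 {
    uint32_t link;          /* selector of the previous task (low 16 bits) */
    uint32_t esp0, ss0;     /* stack loaded on entry to ring 0 */
    uint32_t esp1, ss1;     /* likewise for ring 1 */
    uint32_t esp2, ss2;     /* likewise for ring 2 */
    uint32_t cr3;           /* page directory base for this task */
    uint32_t eip, eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t es, cs, ss, ds, fs, gs;    /* selectors in the low 16 bits */
    uint32_t ldt;           /* LDT selector for this task */
    uint16_t trap;          /* bit 0: raise a debug trap on task switch */
    uint16_t iomap_base;    /* offset of the I/O permission bitmap */
};

/* Record where the kernel stack is, so that a trap, interrupt, or system
   call arriving in user mode has a supervisor stack to switch to. */
static void set_kernel_stack(struct tss386 *t, uint16_t stack_sel,
                             uint32_t stack_top)
{
    t->ss0 = stack_sel;
    t->esp0 = stack_top;
}
```

Only ss0 and esp0 have to be valid before the first ring crossing; the processor fills in the register save area itself when it switches tasks.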
In the TSS structure (see Listing Ten), we assign a kernel/supervisor stack
top well below our current stack to avoid conflicts. We select this as the
current task segment, and then use a little trick to fill out the remainder of
this large structure by arranging to context switch back to our task segment,
using our assembly language stub jmptss. jmptss always saves the task state of
the current task and then loads the state yet again. Because the new state
must not already be BUSY in order to use this trick, we force it to be
AVAILABLE. UNIX kernels use a function, called resume(), to provide for this
mechanism. However, jmptss also provides for context switching when we wish to
transfer from one process or task to the next one. The general case, when we
call jmptss with a new TSS selector, not the current one, will be covered in a
later article.


Trap Handling


Earlier in this article we alluded to the 386 exception/trap assembler stub
routines. Now that we have enough 386 support in place, we can describe trap
handling and the mechanism by which these stubs reflect the trap event into
the C trap() handling function. Listing Twelve contains code for a sample
trap, in this case a breakpoint or INT3 instruction. Assembly language stubs
in module srt.s are executed by the processor in response to receiving a trap
or interrupt that selects the corresponding IDT entry.
These stubs are the minimal glue that indexes each kind of trap with a manifest
constant. This constant is always of the form T_XXXX and is obtained from the
file trap.h (see Listing Fourteen). Some traps on the 386 also place an error
code word on the stack, in order to transfer additional information about the
cause of the trap. Because we need to ultimately remove this error code before
returning from the trap, we first make all traps appear alike by pushing a
dummy error code of zero on the stack for traps without error codes. Then, the
common code that returns after the trap has only to remove both the trap
constant and error code, regardless of which trap occurred.
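
The effect of the dummy error code can be modeled in a few lines of C. The sketch below simulates the stack rather than running on real hardware: sim_stack, enter_trap, and cpu_pushes_error_code are hypothetical names, and the vector number stands in for the T_XXXX constant, whose real values come from trap.h and may differ. The set of vectors that push an error code (8 and 10 through 14) is that of the 386.

```c
#include <assert.h>

#define FRAME_MAX 16

struct sim_stack {
    unsigned words[FRAME_MAX];
    int depth;
};

static void push(struct sim_stack *s, unsigned w)
{
    assert(s->depth < FRAME_MAX);
    s->words[s->depth++] = w;
}

/* On the 386, vectors 8 and 10-14 push an error code; the others do not. */
int cpu_pushes_error_code(int vec)
{
    return vec == 8 || (vec >= 10 && vec <= 14);
}

/* Model the processor plus stub; returns how many words the exit must pop. */
int enter_trap(struct sim_stack *s, int vec, unsigned hw_err)
{
    int before = s->depth;

    if (cpu_pushes_error_code(vec))
        push(s, hw_err);        /* pushed by the processor itself */
    else
        push(s, 0);             /* dummy error code pushed by the stub */
    push(s, (unsigned)vec);     /* stand-in for the T_XXXX constant */
    return s->depth - before;
}

/* Usage sketch: a breakpoint and a page fault leave the same frame shape. */
void trap_frame_demo(void)
{
    struct sim_stack s = { {0}, 0 };

    assert(enter_trap(&s, 3, 0) == 2);
    assert(enter_trap(&s, 14, 6) == 2);
}
```

Because every trap leaves exactly two extra words, the common return path never has to ask which trap it is unwinding.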
After saving the processor state, all trap stubs call common code, which calls
the C language trap handler. The code that follows the call restores the state
and returns to whatever code was running when the trap occurred. The C
language handler merely notifies us of the processor state and exception type
and then returns. Since our test function will test different traps in a
sequence, we prefer to bypass faults that don't move the program counter. We
do this by manually incrementing the program counter, knowing that all the
faulting instructions we intend to encounter happen to be 1 byte in size.
Obviously, this
convenience doesn't hold for the BSD kernel, but it is satisfactory for the
moment.



Interrupt Handling


Interrupts on the 386 are a kind of trap that functions much like the
exceptions we discussed. In Listing Twelve, the AT's interrupt control units
and interrupt timer are initialized to allow hardware interrupts to be
signaled from AT devices, such as the timer, to the processor. As a minor point, we
clear the coprocessor's exception interrupt to avoid spotting a possibly
spurious interrupt from some preexisting condition formed in MS-DOS mode prior
to running the standalone program.
Next, our interrupt test enables the processor and interrupt control unit for
a brief period of time, allowing hardware interrupts to be processed by the
386. Any interrupts occurring during that interval will cause the 386 to
extract the appropriate IDT entry from the table and cause one of the assembly
stubs in Listing Twelve to be executed. These stubs, like the trap stubs, save
the state, record the interrupt index on the stack, and call the common C
function intr().
In intr(), the present interrupt is masked off and the interrupts are then
reenabled so other interrupts can be active while the received interrupt is
processed. Note that both the stubs and this function are fully reentrant, as
this is not an uncommon occurrence. The example code also provides some
trivial interrupt actions for our timer, keyboard, and any other device that
generates an interrupt. At this point, we dispatch to a specific device
driver's interrupt routine.
After responding to the interrupt, we restore our old mask in an
uninterruptable "critical section," and signal the interrupt control unit that
this interrupt is to be acknowledged as "finished." Our interrupt stub then
unwinds the stack, restores the state as needed, and returns us to the exact
state we were in prior to processing the interrupt signaled to the 386.
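
The ordering of these steps can be sketched as a small simulation. Nothing below touches real ports: sim_imr stands in for the interrupt control unit's mask register, sim_if for the processor's interrupt-enable flag, and all names (intr_sketch, record_mask, and so on) are hypothetical rather than those of the listings.

```c
#include <assert.h>

unsigned char sim_imr;             /* simulated interrupt mask register */
int sim_if;                        /* simulated interrupt-enable flag */
int sim_eoi_count;                 /* count of "finished" signals sent */
unsigned char sim_mask_in_handler; /* mask seen while the device routine ran */

typedef void (*irq_routine)(int irq);

/* Trivial device routine: note which mask was in force while it ran. */
void record_mask(int irq)
{
    (void)irq;
    sim_mask_in_handler = sim_imr;
}

void intr_sketch(int irq, irq_routine routine)
{
    unsigned char old_mask;

    assert(irq >= 0 && irq < 8);
    old_mask = sim_imr;
    sim_imr = (unsigned char)(old_mask | (1u << irq)); /* mask this source */
    sim_if = 1;                 /* let other interrupts in */
    routine(irq);               /* dispatch to the device's routine */
    sim_if = 0;                 /* begin critical section */
    sim_imr = old_mask;         /* restore the old mask */
    sim_eoi_count++;            /* tell the control unit we are finished */
    /* the stub's return would restore the interrupted code's flags */
}

/* Usage sketch: line 1 is masked only while its routine runs. */
void intr_demo(void)
{
    sim_imr = 0;
    intr_sketch(1, record_mask);
    assert(sim_mask_in_handler == 0x02);
    assert(sim_imr == 0);
}
```

Masking the active line before reenabling interrupts is what lets the same stub and function be reentered safely for a different device.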


System Call Handling


To test system calls, we must first enter user mode by generating an outbound
ring crossing. The touser() function (see Listing Thirteen) does this by
switching to our previously set-up user ring address space found in the LDT
and enters user mode in the function named usercode(). (The LDT, by a
remarkable coincidence, exactly corresponds to our standalone system's
"kernel" address space!) A special kind of stack frame is built that imitates
the 386's inbound ring-crossing processing. In other words, we "fake out" the
processor into believing it has just come from user mode. This done, we calmly
return with an intersegment return, executing in the new mode at the beginning
of the function.
usercode() does not tarry in the user ring for long; it immediately calls the
system call gate previously set up in the LDT. This call gate regulates entry
into the kernel at location IDTVEC (syscall), which in turn calls the C
function syscall() to properly enter the kernel.
Normally, at the end of the system call assembly stub we would return to the
user ring program, but since we have concluded testing the user ring
transition, we instead return to the caller of the touser() function via a nonlocal
goto. We have carefully preserved the stack frame further up the stack, so we
can test other parts of the kernel mechanism in the test386() function.
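
To make the "faked" frame concrete, the sketch below builds the four words that a 386 far return to an outer ring pops: EIP, CS, ESP, and SS. The selector macros restate the definitions of Listing Seven, and the LDT slot numbers are those of Listing Six. The structure and function names are hypothetical, and whether touser() builds exactly this frame cannot be confirmed from the text here.

```c
#include <assert.h>
#include <stdint.h>

#define SEL_UPL    3            /* user priority level */
#define SEL_LDT    4            /* selector refers to the LDT */
#define LUCODE_SEL 3            /* user code descriptor slot in the LDT */
#define LUDATA_SEL 4            /* user data descriptor slot in the LDT */
#define LSEL(s,r)  (((s)<<3) | SEL_LDT | (r))

/* Words in the order a far return to an outer ring pops them. */
struct far_return_frame {
    uint32_t eip;               /* where user code starts */
    uint32_t cs;                /* user code selector (low 16 bits used) */
    uint32_t esp;               /* user stack pointer */
    uint32_t ss;                /* user stack selector (low 16 bits used) */
};

struct far_return_frame make_user_frame(uint32_t entry, uint32_t user_stack)
{
    struct far_return_frame f;

    f.eip = entry;
    f.cs  = LSEL(LUCODE_SEL, SEL_UPL);
    f.esp = user_stack;
    f.ss  = LSEL(LUDATA_SEL, SEL_UPL);
    return f;
}

/* Usage sketch: both selectors carry requested privilege level 3. */
void user_frame_demo(void)
{
    struct far_return_frame f = make_user_frame(0x1000, 0x9000);

    assert((f.cs & 3) == SEL_UPL);
    assert((f.ss & 3) == SEL_UPL);
}
```

The low two bits of the code selector are what tell the processor that the return lands in ring 3, which is why it also pops a new stack pointer and stack segment.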


Page Fault Handling


Our final mechanism demonstrated here involves generating a page fault, a
rather common occurrence in our BSD UNIX system. These faults, caught by the
mechanism described earlier, are trap type T_PAGEFLT, and they end up in the
trap handler trap(). In Listing Eleven, notice that this function prints the
contents of the 386 special processor register cr2 on a trap. This register
records the address value causing the page fault trap. Eventually, the virtual
memory system will require this in order to determine which page is being
requested by the program being run and if this page should be made accessible.
In this case, it obviously is not accessible.
To generate a page fault, code in module i386.c (see Listing Fifteen) first
reads and then writes an address outside of the range of valid page table and
directory structures. If this address had been outside of the range of the
current segment descriptor, it would have generated a general-protection fault
before being flagged by the paging unit. (In the scheme of things,
segmentation is ahead of paging.) But the address invoked is well within the
range of the segment descriptor, and only the paging unit takes issue with it.
Our page table mapping validates the first page of page tables (the first 4
Mbytes of address space) even though all the other page table pages are not
present. However, because the page table directory entry in this case is
actually invalid, the 386 MMU balks on address 0x800000 and signals trap type
T_PAGEFLT to trap().
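
The arithmetic behind this is easy to check. With 4-Kbyte pages, the 386 splits a linear address into a 10-bit directory index, a 10-bit table index, and a 12-bit offset, so each directory entry covers 4 Mbytes. The helper names below are hypothetical; the sketch only shows that 0x800000 falls under directory entry 2, beyond the single entry that maps the first 4 Mbytes.

```c
#include <assert.h>
#include <stdint.h>

/* Top 10 bits: which page directory entry (each covers 4 Mbytes). */
unsigned pde_index(uint32_t va)   { return (unsigned)(va >> 22); }

/* Middle 10 bits: which entry within that page table. */
unsigned pte_index(uint32_t va)   { return (unsigned)((va >> 12) & 0x3ff); }

/* Low 12 bits: byte offset within the 4-Kbyte page. */
unsigned page_offset(uint32_t va) { return (unsigned)(va & 0xfff); }

/* Usage sketch: the last byte below 4 Mbytes versus the test address. */
void split_demo(void)
{
    assert(pde_index(0x003fffff) == 0);    /* still in the mapped range */
    assert(pte_index(0x003fffff) == 1023);
    assert(pde_index(0x00800000) == 2);    /* directory entry not valid */
    assert(pte_index(0x00800000) == 0);
    assert(page_offset(0x00800000) == 0);
}
```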
We can determine the type of page fault from the error code of the trap, which
in turn tells us whether a "read," "write," or "protection violation"
condition occurred. With the 386, we can also restrict pages to "supervisor
only" access, thus keeping them out of the hands of any nosy user programs. It
is interesting to note that while user programs can write-protect pages of
memory (typically when used for instruction segments), the kernel (running in
the supervisor ring) does not have the same option since the 386 ignores the
"write protect" control on paging. While this is not needed for UNIX to
function, we would like to make parts of the kernel "read-only" to catch
unintended modifications by undiscovered bugs in the kernel. We just can't do
this on the 386.
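
A minimal sketch of decoding the page fault error code follows. On the 386 the low three bits of the code are defined: bit 0 distinguishes a protection violation from a not-present page, bit 1 a write from a read, and bit 2 a user-mode access from a supervisor one. The constant and function names are assumptions made for this example and need not match those in trap.h.

```c
#include <assert.h>

#define PGEX_P 0x01     /* set: protection violation; clear: not present */
#define PGEX_W 0x02     /* set: write access; clear: read access */
#define PGEX_U 0x04     /* set: access came from user mode */

enum pf_kind { PF_READ_NOT_PRESENT, PF_WRITE_NOT_PRESENT, PF_PROTECTION };

/* Classify a fault the way the text describes: read, write, or protection. */
enum pf_kind pf_classify(unsigned code)
{
    if (code & PGEX_P)
        return PF_PROTECTION;
    return (code & PGEX_W) ? PF_WRITE_NOT_PRESENT : PF_READ_NOT_PRESENT;
}

/* Nonzero when the faulting access was made from the user ring. */
int pf_from_user(unsigned code)
{
    return (code & PGEX_U) != 0;
}

/* Usage sketch: a supervisor read, then a supervisor write, of a missing page. */
void pf_demo(void)
{
    assert(pf_classify(0) == PF_READ_NOT_PRESENT);
    assert(pf_classify(PGEX_W) == PF_WRITE_NOT_PRESENT);
    assert(!pf_from_user(PGEX_W));
}
```

Together with the address in cr2, this is the information a virtual memory system would use to decide whether to supply the page or reject the access.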


Where Do We Go From Here?


In the first examination of our initial utilities, we discussed several items
of importance in our standalone system /sys/stand, some of its utilities, and
a library of support routines. Through the standalone system, we were able to
use GCC programs to access devices, such as the keyboard and display, as well
as UNIX file structures on the hard disk. It also provided us with a platform
to examine the 386's requirements through extensions which supported features
incorporated into our UNIX port, and could also be used as a test bed for some
of these functions. As we stated earlier, the standalone system can be viewed
at this stage as if it were the kernel itself, with the extensions the basis
of our prototype kernel code. We have started up the base of the mountain.
Next time, we will proceed further with our initial utilities development, by
creating a stable cross-tools environment. Kermit and NCSA telnet will be used
to load files and programs over Ethernet and serial lines. We will then focus
on proving GCC itself valid for cross-support purposes, as well as the
limitations and alternatives.


Brief Notes: 386 Rings


A "ring" is a concept developed in the early days of large-scale timesharing
by those working on the MULTICS operating system (see The Multics System: An
Examination of Its Structure, Elliott I. Organick, MIT Press: Cambridge Mass.,
1972). These rings establish a hierarchy of memory protection and processor
function, in which code running in a lower-valued ring has access to all
higher-valued or equally valued rings. A protection violation occurs when less
"secure" code (running in a higher-valued ring) accesses more "secure" code or
data (at a lower-valued ring).
Rings can be used to regulate access and determine if a protection violation
occurs.
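
As a simplified sketch of that check, the function below applies the rule the 386 uses when a data segment is loaded: the access is allowed only if the descriptor's privilege level is numerically no smaller than both the current privilege level and the selector's requested privilege level. The function name is hypothetical, and gates, conforming code segments, and stack segments follow other rules not modeled here.

```c
#include <assert.h>

/* Returns nonzero if code at ring cpl, using a selector with requested
 * level rpl, may load a data segment whose descriptor level is dpl. */
int data_access_allowed(int cpl, int rpl, int dpl)
{
    int effective;

    assert(cpl >= 0 && cpl <= 3);
    assert(rpl >= 0 && rpl <= 3);
    assert(dpl >= 0 && dpl <= 3);
    effective = (cpl > rpl) ? cpl : rpl;    /* the weaker of the two */
    return effective <= dpl;
}

/* Usage sketch: the kernel may reach user data, but not the reverse. */
void ring_demo(void)
{
    assert(data_access_allowed(0, 0, 3));
    assert(!data_access_allowed(3, 3, 0));
}
```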
The 386 supports a multiple-ring protection model with four distinct rings of
protection (0-3). Traditional UNIX uses only two rings:
One for the kernel operating system, and one for the current user process. On
the 386, this corresponds to the supervisor ring (0) and the user ring (3).
When the 386 is running in the user ring and receives an interrupt or
exception, or it needs to process a system call, it must switch rings to the
supervisor ring. Once there, the kernel is run as a trusted program to handle
these events. This switching of rings, or "ring crossing," is central to the
UNIX memory protection model. Unlike MS-DOS, where operating system code and
user application code mingle in the same address space, UNIX programs run in
"hard" shells of address space. UNIX programs are not able to modify each
other or the operating system by virtue of memory protection. This is enforced
by memory protection hardware on the microprocessor in the general case when
the applications program is running. Ring crossing, where we go from user to
kernel code, needs to be carefully done to preserve the protection model in
all cases. In this way, nothing that a user program does can possibly affect
another user program adversely or "crash" the operating system.

--B.J., L.J.


_PORTING UNIX TO THE 386: THE STANDALONE SYSTEM_
by William Frederick Jolitz and Lynne Greer Jolitz


[LISTING ONE]


# hi.s: Simplest protected mode program providing some kind of output.
 .text
start: movl $0x0e690e48, 0x0b8800 # put "hi" mid screen on display
 hlt








[LISTING TWO]

# hello.s: Minimal test of GNU GAS assembler, handles CGA & strings.
 .text
start:
 movl $0xA0000,%esp

 pushl $str
 call _puts
 pop %eax
 hlt
str: .asciz "\n\rHello world from GAS\r\n"
_puts:
 push %ebx
 movl 8(%esp),%ebx
1: cmpb $0,(%ebx) # until we see a null
 je 2f

 movzbl (%ebx),%eax
 pushl %eax
 call _putchar # put out characters
 popl %eax

 incl %ebx
 jmp 1b
2: popl %ebx
 ret
crtat: .long 0xb8000 # address of CGA video RAM
row: .long 0

_putchar:
 movzbl 4(%esp),%eax
 push %ebx
 push %ecx
 movl crtat,%ebx
 cmpl $0xb8000+80*25*2,%ebx # continuous output off screen edge & bot
 jl 1f
 movl $0,row
 movl $0xb8000+80*(25-1)*2,%ebx
1: cmpb $0xd,%al # cr
 jne 1f
 movl $80,%ecx # clear rest of line
 subl row,%ecx
 movl %ebx,%edi
 movw $0xfff,%ax
 cld
 rep
 stosw
 subl row,%ebx
 subl row,%ebx
 movl $0,row
 jmp 9f
1: cmpb $0xa,%al # nl
 jne 2f
 cmpl $0xb8000+80*(25-1)*2,%ebx # scroll?
 jl 1f
 movl $0xb8000,%edi # scroll page

 movl $0xb8000+80*2,%esi
 movl $80*(25-1),%ecx
 cld
 rep
 movsw
 movl $80,%ecx # clear new bottom line
 movl $0xb8000+80*(25-1)*2,%edi
 movw $0,%ax
 rep
 stosw
 sub $80*2,%ebx # position cursor before lf
1: add $80*2,%ebx
 jmp 9f
2: orw $0x0e00,%ax # attribute
 movw %ax,(%ebx)
 addl $2,%ebx
 incl row
9: movl %ebx,crtat
 pop %ecx
 pop %ebx
 ret







[LISTING THREE]

/* [Excerpted from srt.s] */

 ...
entry: .globl entry
 jmp 1f
 .space 0x500 /* skip over BIOS data area */
1: cli /* no interrupts yet */

 movl $0xA0000,%esp

 movl %esp,%edx
 movl $_edata,%eax
 subl %eax,%edx /* clear stack and heap store */
 pushl %edx
 pushl %eax
 call _bzero
 popl %eax
 popl %eax

 call _main
 ...

/* hello.c */
main() { printf("Hello, world!\n"); }








[LISTING FOUR]

/* kbd.c: Copyright (c) 1989, 1990 William Jolitz. All rights reserved.
 * Written by William Jolitz 9/89
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * Standalone driver for IBM PC keyboards.
 */

#define L 0x001 /* locking function */
#define SHF 0x002 /* keyboard shift */
#define ALT 0x004 /* alternate shift -- alternate chars */
#define NUM 0x008 /* numeric shift cursors vs. numeric */
#define CTL 0x010 /* control shift -- allows ctl function */
#define CPS 0x020 /* caps shift -- swaps case of letter */
#define ASCII 0x040 /* ascii code for this key */
#define STP 0x080 /* stop output */
#define BREAK 0x100 /* key breaking contact */

typedef unsigned char u_char;

u_char inb();

u_char action[] = {
0, ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, /* scan 0- 7 */
ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, /* scan 8-15 */
ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, /* scan 16-23 */
ASCII, ASCII, ASCII, ASCII, ASCII, CTL, ASCII, ASCII, /* scan 24-31 */
ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, /* scan 32-39 */
ASCII, ASCII, SHF , ASCII, ASCII, ASCII, ASCII, ASCII, /* scan 40-47 */
ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, SHF, ASCII, /* scan 48-55 */
 ALT, ASCII, CPS|L, 0, 0, ASCII, 0, 0, /* scan 56-63 */
 0, 0, 0, 0, 0, NUM|L, STP|L, ASCII, /* scan 64-71 */
ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, ASCII, /* scan 72-79 */
ASCII, ASCII, ASCII, ASCII, 0, 0, 0, 0, /* scan 80-87 */
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, } ;

u_char unshift[] = { /* no shift */
0, 033 , '1' , '2' , '3' , '4' , '5' , '6' , /* scan 0- 7 */
'7' , '8' , '9' , '0' , '-' , '=' , 0177 ,'\t' , /* scan 8-15 */

'q' , 'w' , 'e' , 'r' , 't' , 'y' , 'u' , 'i' , /* scan 16-23 */
'o' , 'p' , '[' , ']' , '\r' , CTL , 'a' , 's' , /* scan 24-31 */

'd' , 'f' , 'g' , 'h' , 'j' , 'k' , 'l' , ';' , /* scan 32-39 */
'\'' , '`' , SHF , '\\' , 'z' , 'x' , 'c' , 'v' , /* scan 40-47 */

'b' , 'n' , 'm' , ',' , '.' , '/' , SHF , '*', /* scan 48-55 */
ALT , ' ' , CPS|L, 0, 0, ' ' , 0, 0, /* scan 56-63 */

 0, 0, 0, 0, 0, NUM|L, STP|L, '7', /* scan 64-71 */

 '8', '9', '-', '4', '5', '6', '+', '1', /* scan 72-79 */

 '2', '3', '0', '.', 0, 0, 0, 0, /* scan 80-87 */
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, } ;

u_char shift[] = { /* shift shift */
0, 033 , '!' , '@' , '#' , '$' , '%' , '^' , /* scan 0- 7 */
'&' , '*' , '(' , ')' , '_' , '+' , 0177 ,'\t' , /* scan 8-15 */
'Q' , 'W' , 'E' , 'R' , 'T' , 'Y' , 'U' , 'I' , /* scan 16-23 */
'O' , 'P' , '[' , ']' , '\r' , CTL , 'A' , 'S' , /* scan 24-31 */
'D' , 'F' , 'G' , 'H' , 'J' , 'K' , 'L' , ':' , /* scan 32-39 */
'"' , '~' , SHF , '|' , 'Z' , 'X' , 'C' , 'V' , /* scan 40-47 */
'B' , 'N' , 'M' , '<' , '>' , '?' , SHF , '*', /* scan 48-55 */
ALT , ' ' , CPS|L, 0, 0, ' ' , 0, 0, /* scan 56-63 */
 0, 0, 0, 0, 0, NUM|L, STP|L, '7', /* scan 64-71 */
 '8', '9', '-', '4', '5', '6', '+', '1', /* scan 72-79 */
 '2', '3', '0', '.', 0, 0, 0, 0, /* scan 80-87 */
0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, } ;

u_char ctl[] = { /* CTL shift */
0, 033 , '!' , 000 , '#' , '$' , '%' , 036 , /* scan 0- 7 */
'&' , '*' , '(' , ')' , 037 , '+' , 034 ,'\177', /* scan 8-15 */
021 , 027 , 005 , 022 , 024 , 031 , 025 , 011 , /* scan 16-23 */
017 , 020 , 033 , 035 , '\r' , CTL , 001 , 013 , /* scan 24-31 */
004 , 006 , 007 , 010 , 012 , 013 , 014 , ';' , /* scan 32-39 */
'\'' , '`' , SHF , 034 , 032 , 030 , 003 , 026 , /* scan 40-47 */
002 , 016 , 015 , '<' , '>' , '?' , SHF , '*', /* scan 48-55 */
ALT , ' ' , CPS|L, 0, 0, ' ' , 0, 0, /* scan 56-63 */
CPS|L, 0, 0, 0, 0, 0, 0, 0, /* scan 64-71 */
 0, 0, 0, 0, 0, 0, 0, 0, /* scan 72-79 */
 0, 0, 0, 0, 0, 0, 0, 0, /* scan 80-87 */
 0, 0, 033, '7' , '4' , '1' , 0, NUM|L, /* scan 88-95 */
'8' , '5' , '2' , 0, STP|L, '9' , '6' , '3' , /*scan 96-103*/
'.' , 0, '*' , '-' , '+' , 0, 0, 0, /*scan 104-111*/
0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, } ;

#define KBSTATP 0x64 /* kbd status port */
#define KBS_RDY 0x02 /* kbd char ready */
#define KBDATAP 0x60 /* kbd data port */
#define KBSTATUSPORT 0x61 /* kbd status */
#define KBD_BRK 0x80 /* key is breaking contact, not making contact */
#define KBD_KEY(s) ((s) & 0x7f) /* key that has changed */

/* Return an ASCII character from the keyboard. */
u_char kbd() {
 u_char dt;
 unsigned short act; /* wide enough to hold the BREAK bit (0x100) */
 static u_char odt, shfts, ctls, alts, caps, num, stp;

 do {
 do {
 while (inb(KBSTATP)&KBS_RDY) ;
 dt = inb(KBDATAP);
 } while (dt == odt);

 odt = dt;

 dt = KBD_KEY(dt);
 act = action[dt];
 if (odt & KBD_BRK) act |= BREAK; /* key released: keep its action bits */

 /* kinds of shift keys */
 if (act&SHF) actl (act, &shfts);
 if (act&ALT) actl (act, &alts);
 if (act&NUM) actl (act, &num);
 if (act&CTL) actl (act, &ctls);
 if (act&CPS) actl (act, &caps);
 if (act&STP) actl (act, &stp);

 /* an ASCII key being pressed, not released */
 if ((act&(ASCII|BREAK)) == ASCII) {
 u_char chr;

 if (shfts)
 chr = shift[dt] ;
 else {
 if (ctls) chr = ctl[dt] ;
 else chr = unshift[dt] ;
 }
 if (caps && (chr >= 'a' && chr <= 'z'))
 chr -= 'a' - 'A' ;
 return(chr);
 }
 } while (1);
}

/* Handle shift key actions */
actl(act, v, brk) char *v; {

 /* are we locking ... */
 if (act&L) {
 if((act&BREAK) == 0) *v ^= 1;

 /* ... or single - action ? */
 } else
 if(act&BREAK) *v = 0; else *v = 1;
}






[LISTING FIVE]

/* cga.c: Copyright (c) 1989, 1990 William Jolitz. All rights reserved.
 * Written by William Jolitz 9/89
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * Standalone driver for IBM PC Displays like CGA.
 */

typedef unsigned short u_short;

typedef unsigned char u_char;

#define CRT_TXTADDR Crtat
#define COL 80
#define ROW 25
#define CHR 2

u_short *Crtat = ((u_short *)0xb8000); /* 0xb0000 for monochrome */
u_short *crtat;
u_char color = 0xe ;
int row;

sput(c) u_char c; {

 if (crtat == 0) {
 crtat = CRT_TXTADDR; bzero (crtat,COL*ROW*CHR);
 }
 if (crtat >= (CRT_TXTADDR+COL*ROW)) { /* u_short pointer: count cells, not bytes */
 crtat = CRT_TXTADDR+COL*(ROW-1); row = 0;
 }
 switch(c) {

 case '\t':
 do {
 *crtat++ = (color<<8) | ' '; row++ ;
 } while (row %8);
 break;
 case '\010':
 crtat--; row--;
 break;
 case '\r':
 bzero (crtat,(COL-row)*CHR) ; crtat -= row ; row = 0;
 break;
 case '\n':
 if (crtat >= CRT_TXTADDR+COL*(ROW-1)) { /* scroll */
 bcopy(CRT_TXTADDR+COL, CRT_TXTADDR,COL*(ROW-1)*CHR);
 bzero (CRT_TXTADDR+COL*(ROW-1),COL*CHR) ;
 crtat -= COL ;
 }
 crtat += COL ;
 break;
 default:
 *crtat++ = (color<<8) | c; row++ ;
 break ;
 }
}







[LISTING SIX]

/* [excerpted from i386.c] */
 ...
/* Descriptor Tables */


 /* Global Descriptor Table */
#define GNULL_SEL 0 /* Null Descriptor - obligatory */
#define GCODE_SEL 1 /* Kernel Code Descriptor */
#define GDATA_SEL 2 /* Kernel Data Descriptor */
#define GLDT_SEL 3 /* LDT - eventually one per process */
#define GTGATE_SEL 4 /* Process task switch gate */
#define GPANIC_SEL 5 /* Task state to consider panic from */
#define GPROC0_SEL 6 /* Task state process slot zero and up */
union descriptor gdt[GPROC0_SEL+NPROC];

/* interrupt descriptor table */
struct gate_descriptor idt[NEXECPT+NINTR];

/* local descriptor table */
#define LSYS5CALLS_SEL 0 /* SVID/BCS 386 system call gate */
#define LSYS5SIGR_SEL 1 /* SVID/BCS 386 sigreturn() */
#define LBSDCALLS_SEL 2 /* BSD experimental system calls */
#define LUCODE_SEL 3 /* user process code descriptor */
#define LUDATA_SEL 4 /* user process data descriptor */
union descriptor ldt[LUDATA_SEL+1];

/* Task State Structures (TSS) for hardware context switch */
struct i386tss tss[NPROC], ptss;

/* software prototypes -- in more palatable form */
struct soft_segment_descriptor gdt_segs[GPROC0_SEL+NPROC] = {
 /* Null Descriptor */
{ 0x0, /* segment base address */
 0x0, /* length - all address space */
 0, /* segment type */
 0, /* segment descriptor priority level */
 0, /* segment descriptor present */
 0,0,
 0, /* default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Code Descriptor for kernel */
{ 0x0, /* segment base address */
 0xfffff, /* length - all address space */
 SDT_MEMERA, /* segment type */
 0, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 1, /* default 32 vs 16 bit size */
 1 /* limit granularity (byte/page units)*/ },
 /* Data Descriptor for kernel */
{ 0x0, /* segment base address */
 0xfffff, /* length - all address space */
 SDT_MEMRWA, /* segment type */
 0, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 1, /* default 32 vs 16 bit size */
 1 /* limit granularity (byte/page units)*/ },
 /* LDT Descriptor */
{ (int) ldt, /* segment base address */
 sizeof(ldt)-1, /* length - all address space */
 SDT_SYSLDT, /* segment type */
 0, /* segment descriptor priority level */
 1, /* segment descriptor present */

 0,0,
 0, /* unused - default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Null Descriptor - Placeholder */
{ 0x0, /* segment base address */
 0x0, /* length - all address space */
 0, /* segment type */
 0, /* segment descriptor priority level */
 0, /* segment descriptor present */
 0,0,
 0, /* default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Panic Tss Descriptor */
{ (int) &ptss, /* segment base address */
 sizeof(tss)-1, /* length - all address space */
 SDT_SYS386TSS, /* segment type */
 0, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 0, /* unused - default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Process 0 Tss Descriptor */
{ (int) &tss[0], /* segment base address */
 sizeof(tss)-1, /* length - all address space */
 SDT_SYS386TSS, /* segment type */
 0, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 0, /* unused - default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ } };

struct soft_segment_descriptor ldt_segs[] = {
 /* Null Descriptor - overwritten by call gate */
{ 0x0, /* segment base address */
 0x0, /* length - all address space */
 0, /* segment type */
 0, /* segment descriptor priority level */
 0, /* segment descriptor present */
 0,0,
 0, /* default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Null Descriptor - overwritten by call gate */
{ 0x0, /* segment base address */
 0x0, /* length - all address space */
 0, /* segment type */
 0, /* segment descriptor priority level */
 0, /* segment descriptor present */
 0,0,
 0, /* default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Null Descriptor - overwritten by call gate */
{ 0x0, /* segment base address */
 0x0, /* length - all address space */
 0, /* segment type */
 0, /* segment descriptor priority level */
 0, /* segment descriptor present */
 0,0,
 0, /* default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },

 /* Code Descriptor for user */
{ 0x0, /* segment base address */
 0xfffff, /* length - all address space */
 SDT_MEMERA, /* segment type */
 SEL_UPL, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 1, /* default 32 vs 16 bit size */
 1 /* limit granularity (byte/page units)*/ },
 /* Data Descriptor for user */
{ 0x0, /* segment base address */
 0xfffff, /* length - all address space */
 SDT_MEMRWA, /* segment type */
 SEL_UPL, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 1, /* default 32 vs 16 bit size */
 1 /* limit granularity (byte/page units)*/ } };
 ...
extern ssdtosd(), lgdt(), lidt(), lldt(), usercode(), touser();

init386() {
 ...
 /* make gdt memory segments */
 for (x=0; x < sizeof gdt / sizeof gdt[0] ; x++)
 ssdtosd(gdt_segs+x, gdt+x);
 printf("lgdt\n"); getchar();
 lgdt(gdt, sizeof(gdt)-1);

 /* make ldt memory segments */
 for (x=0; x < sizeof ldt / sizeof ldt[0] ; x++)
 ssdtosd(ldt_segs+x, ldt+x);

 /* make a call gate to reenter kernel with */
 setgate(&ldt[LSYS5CALLS_SEL].gd, &IDTVEC(syscall), SDT_SYS386CGT,
 SEL_UPL);
 printf("lldt\n"); getchar();
 lldt(GSEL(GLDT_SEL, SEL_KPL));
 ...
/* [excerpted from srt.s] */
 ...
 /* lgdt(*gdt, ngdt) */
 .globl _lgdt
gdesc: .word 0
 .long 0
_lgdt:
 movl 4(%esp),%eax
 movl %eax,gdesc+2
 movl 8(%esp),%eax
 movw %ax,gdesc
 lgdt gdesc
 jmp 1f /* flush instruction prefetch q */
 nop
1: movw $0x10,%ax /* reload other "well known" descriptors */
 movw %ax,%ds
 movw %ax,%es
 movw %ax,%ss
 movl 0(%esp),%eax
 pushl %eax

 movl $8,4(%esp) /* including the ever popular CS */
 lret
 ...
 /* lldt(sel) */
 .globl _lldt
_lldt:
 lldt 4(%esp)
 ret
 ...







[LISTING SEVEN]

/* segments.h: Copyright (c) 1989, 1990 William Jolitz. All rights reserved.
 * Written by William Jolitz 6/20/1989
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * 386 Segmentation Data Structures and definitions
 */

/* Selectors */
#define ISPL(s) ((s)&3) /* what is the priority level of a selector */
#define SEL_KPL 0 /* kernel priority level */
#define SEL_UPL 3 /* user priority level */
#define ISLDT(s) ((s)&SEL_LDT) /* is it local or global */
#define SEL_LDT 4 /* local descriptor table */
#define IDXSEL(s) (((s)>>3) & 0x1fff) /* index of selector */
#define LSEL(s,r) (((s)<<3) | SEL_LDT | r) /* a local selector */
#define GSEL(s,r) (((s)<<3) | r) /* a global selector */

/* Memory and System segment descriptors */
struct segment_descriptor {
 unsigned sd_lolimit:16 ; /* segment extent (lsb) */
 unsigned sd_lobase:24 ; /* segment base address (lsb) */
 unsigned sd_type:5 ; /* segment type */
 unsigned sd_dpl:2 ; /* segment descriptor priority level */
 unsigned sd_p:1 ; /* segment descriptor present */
 unsigned sd_hilimit:4 ; /* segment extent (msb) */
 unsigned sd_xx:2 ; /* unused */
 unsigned sd_def32:1 ; /* default 32 vs 16 bit size */
 unsigned sd_gran:1 ; /* limit granularity (byte/page units)*/
 unsigned sd_hibase:8 ; /* segment base address (msb) */
} ;

/* Gate descriptors (e.g. indirect descriptors) */
struct gate_descriptor {
 unsigned gd_looffset:16 ; /* gate offset (lsb) */
 unsigned gd_selector:16 ; /* gate segment selector */
 unsigned gd_stkcpy:5 ; /* number of stack wds to cpy */
 unsigned gd_xx:3 ; /* unused */

 unsigned gd_type:5 ; /* segment type */
 unsigned gd_dpl:2 ; /* segment descriptor priority level */
 unsigned gd_p:1 ; /* segment descriptor present */
 unsigned gd_hioffset:16 ; /* gate offset (msb) */
} ;

/* Generic descriptor */
union descriptor {
 struct segment_descriptor sd;
 struct gate_descriptor gd;
};
#define d_type gd.gd_type

 /* system segments and gate types */
#define SDT_SYSNULL 0 /* system null */
#define SDT_SYS286TSS 1 /* system 286 TSS available */
#define SDT_SYSLDT 2 /* system local descriptor table */
#define SDT_SYS286BSY 3 /* system 286 TSS busy */
#define SDT_SYS286CGT 4 /* system 286 call gate */
#define SDT_SYSTASKGT 5 /* system task gate */
#define SDT_SYS286IGT 6 /* system 286 interrupt gate */
#define SDT_SYS286TGT 7 /* system 286 trap gate */
#define SDT_SYSNULL2 8 /* system null again */
#define SDT_SYS386TSS 9 /* system 386 TSS available */
#define SDT_SYSNULL3 10 /* system null again */
#define SDT_SYS386BSY 11 /* system 386 TSS busy */
#define SDT_SYS386CGT 12 /* system 386 call gate */
#define SDT_SYSNULL4 13 /* system null again */
#define SDT_SYS386IGT 14 /* system 386 interrupt gate */
#define SDT_SYS386TGT 15 /* system 386 trap gate */

 /* memory segment types */
#define SDT_MEMRO 16 /* memory read only */
#define SDT_MEMROA 17 /* memory read only accessed */
#define SDT_MEMRW 18 /* memory read write */
#define SDT_MEMRWA 19 /* memory read write accessed */
#define SDT_MEMROD 20 /* memory read only expand dwn limit */
#define SDT_MEMRODA 21 /* memory read only expand dwn limit accessed */
#define SDT_MEMRWD 22 /* memory read write expand dwn limit */
#define SDT_MEMRWDA 23 /* memory r/w expand dwn limit accessed */
#define SDT_MEME 24 /* memory execute only */
#define SDT_MEMEA 25 /* memory execute only accessed */
#define SDT_MEMER 26 /* memory execute read */
#define SDT_MEMERA 27 /* memory execute read accessed */
#define SDT_MEMEC 28 /* memory execute only conforming */
#define SDT_MEMEAC 29 /* memory execute only accessed conforming */
#define SDT_MEMERC 30 /* memory execute read conforming */
#define SDT_MEMERAC 31 /* memory execute read accessed conforming */

/* is memory segment descriptor pointer ? */
#define ISMEMSDP(s) ((s->d_type) >= SDT_MEMRO && (s->d_type) <= SDT_MEMERAC)

/* is 286 gate descriptor pointer ? */
#define IS286GDP(s) (((s->d_type) >= SDT_SYS286CGT \
 && (s->d_type) < SDT_SYS286TGT))
/* is 386 gate descriptor pointer ? */
#define IS386GDP(s) (((s->d_type) >= SDT_SYS386CGT \
 && (s->d_type) < SDT_SYS386TGT))
/* is gate descriptor pointer ? */

#define ISGDP(s) (IS286GDP(s) || IS386GDP(s))

/* is segment descriptor pointer ? */
#define ISSDP(s) (ISMEMSDP(s) || !ISGDP(s))

/* is system segment descriptor pointer ? */
#define ISSYSSDP(s) (!ISMEMSDP(s) && !ISGDP(s))

/* Software definitions are in this convenient format; translated into
 * inconvenient segment descriptors when needed to be used by 386 hardware */
struct soft_segment_descriptor {
 unsigned ssd_base ; /* segment base address */
 unsigned ssd_limit ; /* segment extent */
 unsigned ssd_type:5 ; /* segment type */
 unsigned ssd_dpl:2 ; /* segment descriptor priority level */
 unsigned ssd_p:1 ; /* segment descriptor present */
 unsigned ssd_xx:4 ; /* unused */
 unsigned ssd_xx1:2 ; /* unused */
 unsigned ssd_def32:1 ; /* default 32 vs 16 bit size */
 unsigned ssd_gran:1 ; /* limit granularity (byte/page units)*/
};

extern ssdtosd() ; /* to decode a ssd */
extern sdtossd() ; /* to encode a sd */

/* region descriptors, used to load gdt/idt tables before segments yet exist
*/
struct region_descriptor {
 unsigned rd_limit:16 ; /* segment extent */
 char *rd_base; /* base address */
};

/* Segment Protection Exception code bits */
#define SEGEX_EXT 0x01 /* recursive or externally induced */
#define SEGEX_IDT 0x02 /* interrupt descriptor table */
#define SEGEX_TI 0x04 /* local descriptor table */
 /* other bits are affected descriptor index */
#define SEGEX_IDX(s) (((s)>>3)&0x1fff)







[LISTING EIGHT]

/* [excerpted from i386.c] */
 ...
/* Assemble a gate descriptor */
setgate(gp, func, typ, dpl) char *func; struct gate_descriptor *gp; {
 gp->gd_looffset = (int)func;
 gp->gd_selector = GSEL(GCODE_SEL,SEL_KPL);
 gp->gd_stkcpy = 0;
 gp->gd_xx = 0;
 gp->gd_type = typ;
 gp->gd_dpl = dpl;
 gp->gd_p = 1; /* definitely present */
 gp->gd_hioffset = ((int)func)>>16 ;
}


/* ASM entry points to exception/trap/interrupt entry stub code. */
#define IDTVEC(name) X##name
extern
 IDTVEC(div), IDTVEC(dbg), IDTVEC(nmi), IDTVEC(bpt), IDTVEC(ofl),
 IDTVEC(bnd), IDTVEC(ill), IDTVEC(dna), IDTVEC(dble), IDTVEC(fpusegm),
 IDTVEC(tss), IDTVEC(missing), IDTVEC(stk), IDTVEC(prot), IDTVEC(page),
 IDTVEC(rsvd), IDTVEC(fpu), IDTVEC(rsvd0), IDTVEC(rsvd1), IDTVEC(rsvd2),
 IDTVEC(rsvd3), IDTVEC(rsvd4), IDTVEC(rsvd5), IDTVEC(rsvd6),
 IDTVEC(rsvd7), IDTVEC(rsvd8), IDTVEC(rsvd9), IDTVEC(rsvd10),
 IDTVEC(rsvd11), IDTVEC(rsvd12), IDTVEC(rsvd13), IDTVEC(rsvd14),
 IDTVEC(intr0), IDTVEC(intr1), IDTVEC(intr2),
 IDTVEC(intr3), IDTVEC(intr4), IDTVEC(intr5), IDTVEC(intr6),
 IDTVEC(intr7), IDTVEC(intr8), IDTVEC(intr9), IDTVEC(intr10),
 IDTVEC(intr11), IDTVEC(intr12), IDTVEC(intr13), IDTVEC(intr14),
 IDTVEC(intr15), IDTVEC(syscall);
init386() {
 ...
 /* exceptions */
 setgate(idt+0, &IDTVEC(div), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+1, &IDTVEC(dbg), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+2, &IDTVEC(nmi), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+3, &IDTVEC(bpt), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+4, &IDTVEC(ofl), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+5, &IDTVEC(bnd), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+6, &IDTVEC(ill), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+7, &IDTVEC(dna), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+8, &IDTVEC(dble), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+9, &IDTVEC(fpusegm), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+10, &IDTVEC(tss), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+11, &IDTVEC(missing), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+12, &IDTVEC(stk), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+13, &IDTVEC(prot), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+14, &IDTVEC(page), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+15, &IDTVEC(rsvd), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+16, &IDTVEC(fpu), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+17, &IDTVEC(rsvd0), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+18, &IDTVEC(rsvd1), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+19, &IDTVEC(rsvd2), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+20, &IDTVEC(rsvd3), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+21, &IDTVEC(rsvd4), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+22, &IDTVEC(rsvd5), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+23, &IDTVEC(rsvd6), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+24, &IDTVEC(rsvd7), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+25, &IDTVEC(rsvd8), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+26, &IDTVEC(rsvd9), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+27, &IDTVEC(rsvd10), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+28, &IDTVEC(rsvd11), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+29, &IDTVEC(rsvd12), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+30, &IDTVEC(rsvd13), SDT_SYS386TGT, SEL_KPL);
 setgate(idt+31, &IDTVEC(rsvd14), SDT_SYS386TGT, SEL_KPL);

 /* first icu */
 setgate(idt+32, &IDTVEC(intr0), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+33, &IDTVEC(intr1), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+34, &IDTVEC(intr2), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+35, &IDTVEC(intr3), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+36, &IDTVEC(intr4), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+37, &IDTVEC(intr5), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+38, &IDTVEC(intr6), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+39, &IDTVEC(intr7), SDT_SYS386IGT, SEL_KPL);

 /* second icu */
 setgate(idt+40, &IDTVEC(intr8), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+41, &IDTVEC(intr9), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+42, &IDTVEC(intr10), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+43, &IDTVEC(intr11), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+44, &IDTVEC(intr12), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+45, &IDTVEC(intr13), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+46, &IDTVEC(intr14), SDT_SYS386IGT, SEL_KPL);
 setgate(idt+47, &IDTVEC(intr15), SDT_SYS386IGT, SEL_KPL);

 printf("lidt\n"); getchar();
 lidt(idt, sizeof(idt)-1);
 ...
 /* [excerpted from srt.s] */

 /* lidt(*idt, nidt) */
 .globl _lidt
idesc: .word 0
 .long 0
_lidt:
 movl 4(%esp),%eax
 movl %eax,idesc+2
 movl 8(%esp),%eax
 movw %ax,idesc
 lidt idesc
 ret






[LISTING NINE]

/* [excerpted from i386.c] */
 ...
#define NBPG 4096 /* number of bytes per page */
#define PG_V 0x00000001 /* mark this page as valid */
#define PG_UW 0x00000006 /* user and supervisor writable */

int lcr0(), lcr3();
 ...
init386() {
 /* bag of bytes to put page table, page directory in */
 static char bag[(1+1+1)*NBPG];
 int *ppte, *pptd, *cr3, x;

 /* make page table & directory aligned to NBPG */
 ppte = (int *) (((int) bag + NBPG-1) & ~(NBPG-1));
 cr3 = pptd = ppte + 1024;

 /* page table directory only has lowest 4MB entry mapped */
 *pptd++ = (int) ppte + (PG_V|PG_UW);
 for (x = 1; x < 1024 ; x++,pptd++) *pptd = 0;

 /* page table, all entries virtual == real, user/supervisor r/w */
 for (x = 0; x < 1024 ; x++,ppte++) *ppte = x*NBPG + (PG_V|PG_UW) ;

 /* turn on paging */
 lcr3(cr3);
 printf("paging"); getchar();
 lcr0(0x80000001);
 ...

/* [excerpted from srt.s] */

 /*
 * lcr3(cr3)
 */
 .globl _lcr3
_lcr3:
 movl 4(%esp),%eax
 movl %eax,%cr3
 ret

 /* lcr0(cr0) */
 .globl _lcr0
_lcr0:
 movl 4(%esp),%eax
 movl %eax,%cr0
 ret







[LISTING TEN]

/* [excerpted from i386.c] */
 ...
init386(){
 ...
 /* make an initial tss so 386 can get interrupt stack on syscall! */
 tss[0].tss_esp0 = (int) &x - 4096;
 tss[0].tss_ss0 = GSEL(GDATA_SEL, SEL_KPL) ;
 tss[0].tss_cr3 = (int) cr3;
 printf("ltr "); getchar();
 ltr(GSEL(GPROC0_SEL, SEL_KPL));

 printf("resume() "); getchar();
 /* set busy type to avail */
 gdt[GPROC0_SEL].sd.sd_type = SDT_SYS386TSS;
 /* jump to self to fill out tss, like BSD resume() */
 jmptss(GSEL(GPROC0_SEL, SEL_KPL));
 ...
# excerpted from srt.s
 ...
/* jmptss(sel)-- Jump to TSS so that we can load/unload context */
 .globl _jmptss /* similar to BSD swtch()/resume() */
_jmptss:
 ljmp 0(%esp) /* ljmp tss */
 /* saved pc points here */
 ret









[LISTING ELEVEN]

/* [excerpted from i386.c] */
 ...
test386(){
 ...
 /* test handling exceptions */
 printf("breakpoint "); getchar();
 asm (" int $3 ");
 ...
/* Trap exception processing code */
trap(es, ds, edi, esi, ebp, dummy, ebx, edx, ecx, eax,
 fault, ec, eip, cs, eflags, esp, ss) {

 printf("pc:%x cs:%x ds:%x eflags:%x ec %x fault %x cr0 %x cr2 %x \n",
 eip, cs, ds, eflags, ec, fault, rcr0(), rcr2());
 printf("edi %x esi %x ebp %x ebx %x edx %x ecx %x eax %x\n",
 edi, esi, ebp, ebx, edx, ecx, eax);
 eip++; /* simple way to 'jump' over fault */
 getchar();
}

 ...
# excerpted from srt.s
 ...
#include <machine/i386/trap.h>

#define IDTVEC(name) .align 4; .globl _X##name; _X##name:
 ...
/* Trap and fault vector routines */
#define TRAP(a) pushl $##a ; jmp alltraps

IDTVEC(div)
 pushl $0; TRAP(T_DIVIDE)
IDTVEC(dbg)
 pushl $0; TRAP(T_DEBUG)
IDTVEC(nmi)
 pushl $0; TRAP(T_NMI)
IDTVEC(bpt)
 pushl $0; TRAP(T_BPTFLT)
IDTVEC(ofl)
 pushl $0; TRAP(T_OFLOW)
IDTVEC(bnd)
 pushl $0; TRAP(T_BOUND)
IDTVEC(ill)
 pushl $0; TRAP(T_PRIVINFLT)
IDTVEC(dna)
 pushl $0; TRAP(T_DNA)
IDTVEC(dble)
 TRAP(T_DOUBLEFLT)
IDTVEC(fpusegm)
 pushl $0; TRAP(T_FPOPFLT)
IDTVEC(tss)
 TRAP(T_TSSFLT)
IDTVEC(missing)
 TRAP(T_SEGNPFLT)
IDTVEC(stk)
 TRAP(T_STKFLT)
IDTVEC(prot)
 TRAP(T_PROTFLT)
IDTVEC(page)
 TRAP(T_PAGEFLT)
IDTVEC(rsvd)
 pushl $0; TRAP(T_RESERVED)
IDTVEC(fpu)
 pushl $0; TRAP(T_ARITHTRAP)
 /* 17 - 31 reserved for future exp */
IDTVEC(rsvd0)
 pushl $0; TRAP(17)
IDTVEC(rsvd1)
 pushl $0; TRAP(18)
IDTVEC(rsvd2)
 pushl $0; TRAP(19)
IDTVEC(rsvd3)
 pushl $0; TRAP(20)
IDTVEC(rsvd4)
 pushl $0; TRAP(21)
IDTVEC(rsvd5)
 pushl $0; TRAP(22)
IDTVEC(rsvd6)
 pushl $0; TRAP(23)
IDTVEC(rsvd7)
 pushl $0; TRAP(24)
IDTVEC(rsvd8)
 pushl $0; TRAP(25)
IDTVEC(rsvd9)
 pushl $0; TRAP(26)
IDTVEC(rsvd10)
 pushl $0; TRAP(27)
IDTVEC(rsvd11)
 pushl $0; TRAP(28)
IDTVEC(rsvd12)
 pushl $0; TRAP(29)
IDTVEC(rsvd13)
 pushl $0; TRAP(30)
IDTVEC(rsvd14)
 pushl $0; TRAP(31)

alltraps:
 pushal
 push %ds # save old selectors we will use
 push %es
 movw $0x10,%ax # load them with kernel global data sel
 movw %ax,%ds
 movw %ax,%es
 call _trap
 pop %es
 pop %ds
 popal
 addl $8,%esp # pop type, code

 iret






[LISTING TWELVE]

/* [excerpted from i386.c] */
 ...
init386() {
 ...
 outb(0xf1,0); /* clear coprocessor to cover all bases */

 /* initialize 8259 ICU's in preparation for device interrupts */

 outb(ICU1,0x11); /* reset the unit */
 outb(ICU1+1,32); /* start with idt 32 */
 outb(ICU1+1,4); /* master please */
 outb(ICU1+1,1);
 outb(ICU1+1,0xff); /* all disabled */

 outb(ICU2,0x11);
 outb(ICU2+1,40); /* start with idt 40 */
 outb(ICU2+1,2); /* just a slave */
 outb(ICU2+1,1);
 outb(ICU2+1,0xff); /* all disabled */

 /* initialize 8253 timer on interrupt #0 */

 outb (0x43, 0x36);
 outb (0x40, 1193182/60); /* 8253 input clock is 1193182 Hz */
 outb (0x40, (1193182/60)/256);
}
test386(){
 ...
 /* test interrupts for a while */
 printf("inton"); getchar();
 outb(ICU1+1,0); /* unmask all interrupts */
 outb(ICU2+1,0);
 inton();

 timeout = 0x8000000;
 do nothing(); while (timeout-- );

 intoff();
 ...

# excerpted from srt.s
 ...
#define INTR(a) \
 pushal ; \
 push %ds ; \
 push %es ; \
 movw $0x10, %ax ; \
 movw %ax, %ds ; \
 movw %ax,%es ; \
 pushl $##a ; \
 call _intr ; \
 pop %eax ; \
 pop %es ; \
 pop %ds ; \
 popal ; \
 iret

 /* hardware 32 - 47 */
IDTVEC(intr0)
 INTR(0)
IDTVEC(intr1)
 INTR(1)
IDTVEC(intr2)
 INTR(2)
IDTVEC(intr3)
 INTR(3)
IDTVEC(intr4)
 INTR(4)
IDTVEC(intr5)
 INTR(5)
IDTVEC(intr6)
 INTR(6)
IDTVEC(intr7)
 INTR(7)
IDTVEC(intr8)
 INTR(8)
IDTVEC(intr9)
 INTR(9)
IDTVEC(intr10)
 INTR(10)
IDTVEC(intr11)
 INTR(11)
IDTVEC(intr12)
 INTR(12)
IDTVEC(intr13)
 INTR(13)
IDTVEC(intr14)
 INTR(14)
IDTVEC(intr15)
 INTR(15)

 .globl _inton
_inton:
 sti
 ret

 .globl _intoff
_intoff:
 cli
 ret
 ...

/* back to i386.c */
 ...
/* Interrupt vector processing code */
intr(ivec) {
 static clk;
 int omsk1, omsk2;


 /* mask off interrupt being serviced, save old mask */
 if (ivec > 7) {
 omsk2 = inb(ICU2+1);
 outb(ICU2+1, 1<<(ivec-8));
 } else {
 omsk1 = inb(ICU1+1);
 outb(ICU1+1, 1<<ivec);
 }

 /* re-enable processor's interrupts, allowing others in */
 inton();

 /* if we are the clock, count clock tick */
 if (ivec == 0) clk++;
 /* if we are the keyboard, show data incoming */
 else if (ivec == 1) printf("kbd data %x, clk %d\n", inb(0x60), clk);
 /* otherwise print message stating source and time */
 else {
 printf("intr %d, clk %d \n", ivec, clk);
 getchar();
 }

 /* turn off interrupts, re-enable old mask, do interrupt acknowledge */
 intoff();
 if (ivec > 7) {
 outb(ICU2+1,omsk2);
 outb(ICU2,0x20);
 } else
 outb(ICU1+1,omsk1);
 outb(ICU1,0x20);
 /* return to interrupt stub */
}
 ...





[LISTING THIRTEEN]

test386(){
 ...
 /* transfer to user mode to test system call */
 printf("touser "); getchar();
 touser (LSEL(LUCODE_SEL,SEL_UPL), LSEL(LUDATA_SEL, SEL_UPL), &usercode);
 ...
# [excerpted from srt.s]

 /* touser (cs,ds,func) */
 .globl _touser
_touser:
 pushal
 movl %esp,_myspback
 movl 32+4(%esp),%eax
 movl 32+8(%esp),%edx
 movl 32+12(%esp),%ecx
 # build outer stack frame
 pushl %edx # user ss
 pushl %esp # user esp
 pushl %eax # cs
 pushl %ecx # ip
 movw %dx,%ds
 movw %dx,%es
 lret # goto user!

/* code to execute in user mode */
 .globl _usercode
#define LCALL(x,y) .byte 0x9a ; .long y; .word x
_usercode:
 LCALL(0x7,0x0) /* would be lcall $0x7,0x0 except for assembler bug */
IDTVEC(syscall)
 pushal
 movw $0x10,%ax
 movw %ax,%ds
 movw %ax,%es
 call _syscall
 movl _myspback,%esp /* non-local goto touser() exit */
 popal
 ret
/* back to i386.c */
 ...
/* System call processing */
syscall() {
 printf("syscall\n");
}







[LISTING FOURTEEN]

/* trap.h: i386 trap type index [as they intersect with other BSD systems] */

#define T_PRIVINFLT 1 /* privileged instruction */
#define T_BPTFLT 3 /* breakpoint instruction */
#define T_ARITHTRAP 6 /* arithmetic trap */
#define T_PROTFLT 9 /* protection fault */
#define T_PAGEFLT 12 /* page fault */

#define T_DIVIDE 18 /* integer divide fault */
#define T_NMI 19 /* non-maskable trap */
#define T_OFLOW 20 /* overflow trap */
#define T_BOUND 21 /* bound instruction fault */
#define T_DNA 22 /* device not available fault */
#define T_DOUBLEFLT 23 /* double fault */
#define T_FPOPFLT 24 /* fp coprocessor operand fetch fault */
#define T_TSSFLT 25 /* invalid tss fault */
#define T_SEGNPFLT 26 /* segment not present fault */
#define T_STKFLT 27 /* stack fault */
#define T_RESERVED 28 /* reserved fault base */








[LISTING FIFTEEN]

/* [excerpted from i386.c] */
 ...
test386(){
 int x, *pi, timeout;
 ...
 /* generate a page fault exception */
 printf("dopagflt\n"); getchar();
 pi = (int *) 0x800000; /* above 4MB */
 x = *pi; /* will fault invalid read */
 *pi = ++x ; /* will fault invalid write */
 ...





March, 1991
SPEEDY BUFFERING


Faster access on slow drives




Bruce Tonkin


Bruce develops and sells software for TRS-80 and MS-DOS computers. He can be
reached at T.N.T. Software Inc., 34069 Hainesville Road, Round Lake, IL 60073.


Disk reads and writes can slow nearly any application. It's easy to tell a
user to buy a faster hard drive, a better controller, or to move from floppies
to a hard drive. Unfortunately, many users have a strange preoccupation with
money -- especially their own.
There are ways to appreciably speed disk access times for many applications,
without any outlay of cash. One of the easiest is a buffering method I use
with QuickBasic.
First, I'll suppose the data file is random-access and is being read or
written in record number order. The method I use can be extended to some other
cases fairly easily. If there is a reasonable chance that you'll be accessing
multiple records from the same buffer, you needn't access the file
sequentially -- but what constitutes "reasonable" will depend on the
application, the drive used, and the memory available.
Second, I'll suppose you have at least some memory available. That's not
always the case, but it usually is.
The method involves two parts. The file you want to buffer is opened with a
record length that is a multiple of the true record length. I usually use a
record length of about 16K bytes. Next, a dummy file is opened with a record
length equal to the true record length of the actual file.
The large record file is fielded a whole record at a time, and the small file
is fielded the way you want to read the actual records. An adapted version of
this method can be used with binary files and user-defined types; I'll leave
that to you.
Now, suppose you want to read record R. You call a subroutine which checks to
see if record R is currently in the buffer. If it is, that part of the file is
transferred from the large buffer to the small dummy one. If not, the suitable
chunk containing record R is read and then the transfer is made. Listing One
presents a program that illustrates the method.
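The lookup arithmetic behind this scheme can be sketched in C. This is a
minimal illustration, not the author's code: the function names and
parameters are assumptions, and records, blocks, and slots are numbered from
1, as they are in QuickBasic.

```c
#include <assert.h>

/* Hypothetical helper: how many true records fit in one large block
 * for a target buffer size, following the "16K, rounded up" rule that
 * Listing One applies to its buffer size. */
long records_per_block(long target_bytes, long record_len)
{
    assert(record_len > 0);
    return 1 + target_bytes / record_len;
}

/* Map a 1-based record number to the 1-based large block that holds
 * it and to the record's 1-based slot inside that block. */
void locate_record(long rec, long per_block, long *block, long *slot)
{
    assert(rec >= 1 && per_block >= 1);
    *block = 1 + (rec - 1) / per_block;
    *slot  = 1 + (rec - 1) % per_block;
}
```

A large block needs to be read from disk only when the computed block
differs from the one already in memory; otherwise the record is copied
straight out of the buffer.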
I tested the program against a 231K data file read from a 1.44-Mbyte floppy
disk and from a Plus Development 40-Mbyte HardCard (28 ms) using various
record lengths. Table 1 shows the times I observed while running in the
QuickBasic 4.5 environment.
Table 1: Buffering results

 Buffer  Disk   Record Length  Buffered Time  Unbuffered Time
                (bytes)        (seconds)      (seconds)
 --------------------------------------------------------------
 8K      1.44   100 to 500     11.87          92.82
 8K      1.44   1000           11.65          50.64
 8K      1.44   2000           10.60          28.45
 16K     1.44   50 to 500      8.98           92.81
 16K     1.44   1000           9.05           50.64
 16K     1.44   2000           8.85           28.45
 8K      40MB   100            3.56           10.38
 8K      40MB   200            3.02           10.39
 8K      40MB   500            2.96           10.38
 8K      40MB   1000           2.91           10.38
 8K      40MB   2000           2.75           6.16
 16K     40MB   100            3.19           10.43
 16K     40MB   200            2.80           10.43
 16K     40MB   500            2.75           10.38
 16K     40MB   1000           2.70           10.38
 16K     40MB   2000           2.63           6.15
 24K     40MB   100            2.86           10.38
 24K     40MB   200            2.69           10.33
 24K     40MB   500            2.47           10.38
 24K     40MB   1000           2.53           10.38
 24K     40MB   2000           2.47           6.15

The times show a substantial speedup with buffering: As much as a factor of
ten on the floppy drive, and a factor of four on the hard drive. Times on
the floppy disk with buffering are faster than the hard drive without. The
optimal buffer size seems to be about 16K for either drive. The times don't
vary much if buffering is used, regardless of record length, but drop
substantially for unbuffered reads as record lengths increase beyond 500 (512)
bytes on the floppy and beyond 1000 (1024) bytes on the hard drive.
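The ratios can be checked directly against Table 1: the speedup for any row
is the unbuffered time divided by the buffered time. The helper below is only
an illustrative sketch, and its name is an assumption.

```c
#include <assert.h>

/* Speedup for one table row: unbuffered time over buffered time.
 * Both times must be in the same unit. */
double speedup(double unbuffered, double buffered)
{
    assert(buffered > 0.0);
    return unbuffered / buffered;
}
```

For the 16K floppy row with 50- to 500-byte records, speedup(92.81, 8.98) is
about 10.3; for the 24K hard drive row with 500-byte records,
speedup(10.38, 2.47) is about 4.2.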
If you use this method to write a file, buffering can append null records to
the end. So long as you can use null records to detect EOF in your
applications, that shouldn't be a problem.
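One way to treat a null record as end-of-file is to test whether every byte
in it is zero. The C sketch below assumes the padding really is NUL bytes and
that no genuine record is all zeros; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if every byte of the record is NUL, 0 otherwise.
 * Assumes padded records are filled with NUL bytes. */
int is_null_record(const char *rec, size_t len)
{
    size_t i;

    assert(rec != NULL);
    for (i = 0; i < len; i++)
        if (rec[i] != '\0')
            return 0;
    return 1;
}
```

A reader loop would stop at the first record for which the test returns 1.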
Buffering is simple, and can give enormous speedups. Watch your applications
for opportunities to use it!

_SPEEDY BUFFERING_
by Bruce Tonkin



[LISTING ONE]

DEFINT A-Z
CLS
LINE INPUT "Name of file to read: "; filename$
INPUT "Record length: "; rl
buffersize = (1 + 16384 \ rl) * rl '16K, rounded up.
blocks = buffersize \ rl
DIM chunk$(1 TO blocks) 'each chunk will be one record.

'open the random file with a record length of buffersize
OPEN "r", 1, filename$, buffersize
filesize& = CLNG(LOF(1)) \ rl

'set up the buffer in record-length sized chunks
FOR i = 1 TO blocks
 FIELD #1, dummy AS dummy$, rl AS chunk$(i)
 dummy = dummy + rl
NEXT i

'open up a dummy buffer to hold the actual records
OPEN "r", 2, "real.$$$", rl
FIELD #2, rl AS all$

t1! = TIMER
FOR i = 1 TO filesize&
 GOSUB getrec 'read the buffered records
NEXT i
t1! = TIMER - t1! 'save the time taken
CLOSE 'close the files and delete the dummy file.
KILL "real.$$$"

OPEN "r", 1, filename$, rl 'now open and read unbuffered
FIELD #1, rl AS all$
t2! = TIMER
FOR i = 1 TO filesize&
 GET 1, i
NEXT i
t2! = TIMER - t2!
CLOSE
PRINT "Buffered time: "; t1!
PRINT "Unbuffered time: "; t2!
END

getrec:
whichblock = 1 + (i - 1) \ blocks
IF whichblock <> lastblock THEN
 GET 1, whichblock
 lastblock = whichblock
END IF
whichchunk = 1 + (i - 1) MOD blocks
LSET all$ = chunk$(whichchunk)
RETURN





March, 1991
THE MEWEL WINDOW SYSTEM


Targeting two environments for the price of one




Al Stevens


Al is a DDJ contributing editor and can be contacted at 501 Galveston Drive,
Redwood City, CA 94063.


The Mewel 3 Window System from Magma Software Systems is a function library
that resembles the Microsoft Windows 3.0 SDK (Software Development Kit) API
and implements a subset of the SDK functions in a text-mode DOS environment.
Mewel addresses a number of issues faced by programmers today, and its
viability as a programmer's tool is reflected in the way it addresses those
issues.
A programmer who is writing text-based DOS applications might choose Mewel for
any of several reasons. First, like other such products, Mewel is a text-based
library that supports windows, menus, data entry templates, and mouse input.
As such, it competes with other C function libraries that support video
windows on the PC. The one you choose will depend on the style you prefer for
your user interface.
A second reason to use Mewel is that it is an implementation of the IBM
Systems Application Architecture (SAA) Common User Access (CUA) standard. This
is the user interface model that OS/2 Presentation Manager, Windows 3.0, and
many applications use. Whether or not you like the CUA approach, the PC
industry is moving toward it, and many users will come to expect it. Mewel is
the closest thing to an SAA-compliant text-based window package that I have
seen.
A third reason for using Mewel is that it is a text-based subset of the
Windows SDK API. This opens several possibilities. Programmers can use Mewel
to port their Windows applications to DOS with a minimum of fuss, provided
that the applications are not heavily dependent on Windows memory management
and multitasking. You can design new applications to compile under both APIs
and increase your potential market. You can use Mewel to prototype Windows
applications. But a hidden strength of Mewel's near-compatibility with Windows
is that it provides a stepping stone for DOS programmers to learn Windows
programming, a stepping stone that does not require the programmer to purchase
the Windows SDK, to develop under Windows, or even to run Windows.
The Windows API goes far beyond the user interface functions of CUA. The
Windows operating environment encompasses a complex memory management system
and supports a multitasking environment with intertask communication through
dynamic data exchange. The API is huge and overwhelming. The biggest books on
the computer bookshelves are about Windows programming. No wonder programmers
are intimidated. Windows programming has become another subculture of esoteric
folklore that makes outsiders wince when they see the volume of knowledge
required to gain entrance. Many GUI opponents are programmers who are unsure
of their ability or willingness to learn the APIs. Mewel offers an opening.
Its API is a small subset of the Windows API, appears manageable on the
surface, and has the advantage that you do not need a Windows development
system. Software development with the Windows SDK is inhospitable at best. The
Microsoft C compiler and resource compiler do not themselves run under
Windows, and the CodeView debugger requires a monochrome monitor in addition
to the graphics monitor used by Windows itself. All you need to develop with
Mewel is a DOS system and a C compiler.


Event-Driven Programming


You will hear that the program development environments for Windows and Mewel
are object oriented. The Mewel documentation itself implies that it supports
an object-oriented programming environment. This is not entirely true. The
Windows and Mewel APIs have some things in common with OOP. Windows are
classes of a kind. You can derive window classes and subclass them. You send
messages to windows to make them react and behave. But the similarities end
there. There is no encapsulation of objects and no inherent polymorphism. You
do not instantiate a window class by declaring an instance of an object. There
are products that surround the Windows API with object-oriented development
environments, but the Windows API itself is not OOP. You could similarly
surround the Mewel API with C++ classes, but the Mewel API itself is not OOP.
What these APIs support is something else -- something called "event-driven"
programming.
In event-driven programming, the functions of the program execute as the
result of events. The main body of the program waits in a loop for an event to
occur. When the event takes place, the program sends it to a dispatcher
function that calls whatever functions should deal with the particular event.
Events in the Mewel API generate messages that are sent to windows. Before any
messages can occur, the program must create at least one window. Then the
program goes into the loop that senses events and dispatches messages to
windows.
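The loop-and-dispatcher structure described here can be sketched in a few
lines of C. Every name below is hypothetical; this is not the Mewel or
Windows API, and a fixed array stands in for the stream of events that a
real system would wait on.

```c
#include <assert.h>
#include <stddef.h>

/* All names are hypothetical; this is not the Mewel or Windows API. */
enum { MSG_KEY = 1, MSG_QUIT = 2 };

struct window;
typedef long (*window_proc)(struct window *w, int msg, long param);

struct window {
    window_proc proc;      /* function that handles this window's messages */
    long        keys_seen; /* example per-window state */
};

struct message {
    struct window *target; /* window the message is addressed to */
    int            msg;
    long           param;
};

/* Example window procedure: counts the key messages it receives. */
long count_keys(struct window *w, int msg, long param)
{
    (void)param;
    if (msg == MSG_KEY)
        w->keys_seen++;
    return 0;
}

/* Dispatcher: hand one message to the window it is addressed to. */
long dispatch(const struct message *m)
{
    assert(m != NULL && m->target != NULL && m->target->proc != NULL);
    return m->target->proc(m->target, m->msg, m->param);
}

/* Main loop: take events in order and dispatch each one until a quit
 * message arrives or the array is exhausted. Returns the number of
 * messages dispatched. */
size_t run_loop(const struct message *queue, size_t n)
{
    size_t i, handled = 0;

    for (i = 0; i < n && queue[i].msg != MSG_QUIT; i++) {
        dispatch(&queue[i]);
        handled++;
    }
    return handled;
}
```

The window must exist before the loop starts, because every message needs a
window procedure to receive it.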
There are two kinds of messages, those sent as the result of user events, and
those sent by the program itself. Users can do three things -- type keys, move
the mouse, and click the mouse. The user events are sent as messages to the
window that the program has created. When there are several windows on the
screen, only one of them "has the focus," and that window is the one that
receives the user-generated messages. The program-generated messages can be
anything at all, and the program can send the messages to specific windows
regardless of which window has the focus. That is how the Windows/Mewel API
works in a nutshell, and it is the foundation of the mysteries of Windows
programming.
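The difference between the two kinds of messages comes down to who picks the
receiving window. The C sketch below follows the description in this
paragraph; its names are invented for illustration and do not come from the
Mewel or Windows API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; not the Mewel or Windows API. */
struct window {
    int id;
    int received; /* count of messages delivered to this window */
};

static struct window *focus_window = NULL; /* window that has the focus */

void set_focus(struct window *w)
{
    focus_window = w;
}

/* Program-generated message: delivered to the window the caller names,
 * whichever window currently has the focus. */
void send_message(struct window *w)
{
    assert(w != NULL);
    w->received++;
}

/* User-generated event: delivered to the focus window. Returns the id
 * of the receiving window, or -1 if no window has the focus. */
int post_user_event(void)
{
    if (focus_window == NULL)
        return -1;
    send_message(focus_window);
    return focus_window->id;
}
```

Moving the focus changes where later user events go, but it has no effect on
messages the program addresses to a specific window.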
Of course there is much more than that to Mewel and Windows programming. The
API has its own classes of windows for menus, documents, scroll bars, frames,
titles, minimize/maximize boxes, control menus, window hierarchies, and dialog
boxes. Dialog boxes themselves contain control windows that include radio
buttons, pushbuttons, text boxes, edit boxes, list boxes, and drop-down list
boxes. All these things have their own sets of messages and functions. When
you integrate the canned messages and functions into your application's
messages, functions, and windows, you hopefully have a well-ordered
application.
The size and scope of the Windows API scares programmers. The number of source
code lines required for even the simplest program fuels criticism. The
yet-another paradigm of event-driven programming makes programmers ask, "What
next?" These are obstacles that the new Windows/Mewel programmer must
overcome. They are intimidating at first, but do not be put off; you can learn
them quicker than you think. To paraphrase P.J. Plauger: If you were not a
smart person, you wouldn't be reading this magazine.


Installation


The Mewel installation program is an INSTALL.BAT file that begins by telling
you that if you are not installing from A: to C:\Mewel you must terminate the
process and modify the INSTALL.BAT file. That is not an unreasonable thing to
ask a programmer to do. The problem, however, is that the message scrolls away
before you can read it. Apparently, the Magma programmers do not know how to
write a DOS batch file that uses ECHO instead of REM to display a full-screen
message. It is hard to believe that they ran the batch file on an 80-column
screen before releasing it. That nuisance aside, the installation is simple
enough. It creates the Mewel subdirectory and de-archives the files from the
diskette to the hard disk. That's all there is to the installation. It
finishes with another unreadable REM screen that tells you to read the text
files that contain changes to the documentation.
If you install an upgrade to a previous Mewel installation, you will bump into
another minor annoyance. Magma uses the LHARC.EXE archive program. To replace
existing files, LHARC makes you verify the replacement of every file.
As packaged, you will need either Microsoft C or Turbo C++ to use Mewel. The
Mewel installation procedure copies the Mewel header and library files into
its own subdirectory. To get your compiler and linker to find the Mewel files,
you will need to modify Microsoft's INCLUDE and LIB environment variables or
the Turbo C++ TCCONFIG.TC and TURBOC.CFG files.
The Turbo C++ version is not a C++ product but instead uses the C compiler
component of Turbo C++. If you prefer to use Turbo C 2.0, you may purchase the
Mewel source code and recompile the libraries. Optionally, you may request the
Turbo C 2.0 version of the libraries from Magma Systems. (Zortech C++ and JPI
TopSpeed C versions are also available on request.) Finally, note that the
Mewel libraries support the medium and large memory models for each compiler.


The Documentation


Documentation is the weakest part of the Mewel package. It is incomplete and
contains many typographical, grammatical, and technical errors. Magma says
that a new document is coming, one that will remedy the deficiencies of the
existing manual. Until then you will need two of the three books from the
Windows SDK to supplement the Mewel document. That does not mean that you need
the SDK itself. Microsoft Press publishes the three-volume SDK documents
separately, and you can find them in most book stores. You will need the SDK
Programmer's Reference and the Guide to Programming. This increases the cost
of Mewel by $70, the combined price of the two books.
As an example of the kind of problem I had with the Mewel documentation,
consider this: The manual describes the EM_GETHANDLE message that returns the
address of an edit control window's buffer. It does not, however, mention the
corresponding EM_SETHANDLE message that changes the edit control window's
buffer. I needed that message for the example program that accompanies this
article, and I located it in the SDK books. There was a catch, however. The
Mewel implementation of EM_GETHANDLE is different from that of Windows.
Windows programs use handles to manage memory allocations. Mewel uses the
standard C memory allocation functions, which use pointers rather than
handles. Windows handles are 16-bit values, while the far pointers of the
Mewel medium and large memory models are 32-bit values. As a result, the Mewel
implementation uses a different parameter convention for sending the
EM_SETHANDLE message. Because the Mewel documentation does not describe
EM_SETHANDLE and because the Windows SDK documentation describes the Windows
convention, the Mewel programmer is in the dark about how to send
EM_SETHANDLE. This is not an isolated incident. Deficiencies and errors of
this kind permeate the Mewel documentation.
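The size mismatch behind this problem can be made concrete with a short C
sketch. It assumes the Windows 3.0 convention in which each message carries a
16-bit word parameter and a 32-bit long parameter. The type and function
names are invented for illustration, and the sketch does not show how Mewel
actually passes the buffer address.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for the two message parameters: a 16-bit
 * word and a 32-bit long. These typedef names are hypothetical. */
typedef uint16_t word_param;
typedef uint32_t long_param;

/* A value can travel in the word parameter only if it fits in 16 bits,
 * as a Windows memory handle does. */
int fits_in_word(uint32_t value)
{
    return value <= 0xFFFFu;
}

/* Build the 32-bit image of a segment:offset far pointer. */
long_param make_far(uint16_t segment, uint16_t offset)
{
    return ((long_param)segment << 16) | offset;
}
```

A far pointer built this way does not pass the fits_in_word test, so a
pointer-based implementation has to carry it in the long parameter or lose
the segment half.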


The Source Code


How did I solve the EM_SETHANDLE problem? My copy of Mewel includes source
code, and I used a GREP utility to find the EM_SETHANDLE treatment. I learned
from the code what parameters were expected and how they were used. The source
code is available for an extra $300. Inasmuch as a programmer needs the source
code to find out how some of the functions and messages really work, Magma
should include it at no cost -- at least until they provide adequate
documentation.
There are several other reasons why a programmer would want the source code of
a function library. You just saw that I needed it to solve a problem about how
a feature worked. You shouldn't have to pay extra for that. You would also
need source code to port Mewel to a compiler not supported by the original
distribution. The code contains compile-time conditional statements that refer
to Unix, Zortech, and others, so obviously someone has considered the problem.
Certainly you'll need the source code if you want to modify the package. But
beware that when you make custom modifications to commercial function
libraries, you'll need to retrofit your improvements every time the vendor
sends an upgrade.
Source code can be a security blanket. When you use a package such as Mewel,
you invest a lot of time in the use of an API that might have only one source.
If you absolutely need a modification that the vendor does not want to support
and you have the source code, you have a way to get what you need. Besides, if
the vendor goes out of business or drops the product, you are covered.
If you are thinking about buying the source code to learn about good C code,
think again. The code is often difficult to read and makes liberal use of the
goto statement in highly unstructured ways. In defense of the code, the Magma
programmer contends that there are times when a programmer simply must use a
goto. I've never run across such times.


Support


When you call Magma, the programmer who wrote Mewel answers the phone to
answer your questions. He assured me that I was not getting special treatment
just because I am writing about his product. I cannot imagine Magma being able
to maintain that level of support, if the program gains the popularity it
deserves. But while they can, it's the best support in the business.
I ran into a number of bugs in the program. The multiple document interface
feature is new, and has not been thoroughly tested. After discussing the
problems with Magma, I got an upgrade. They fixed most of the problems and
inadvertently added some new ones, which they will fix in the next upgrade.
The company is responsive and wants the product to be as correct as possible.
When you phone to report bugs, you often hear a groan and a perceptible
forehead slap as if the person on the other end could'a had a V-8.


Portability with Windows


A Mewel program can port to Windows, but only with some effort on your part.
If you plan the program to be portable and understand the portability issues,
you can minimize the scope of changes needed. It is possible to write a
program that has compile-time conditional statements that manage the
differences. Mewel includes macros and functions that constitute what they
call "the Microsoft Windows Porting Layer." Part of this layer is available
only when you buy the source code.
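Compile-time conditionals of the kind mentioned above usually work best when
they are confined to one small layer that the rest of the program calls. The
C sketch below shows the idea only. TARGET_WINDOWS is a hypothetical build
flag, not a macro defined by Mewel or the Windows SDK, and the 8-pixel cell
width is an assumed figure.

```c
#include <assert.h>

/* TARGET_WINDOWS is a hypothetical build flag; define it on the
 * compiler command line to select the Windows branch. */
#ifdef TARGET_WINDOWS
#define PLATFORM_NAME "Windows"
#define TEXT_MODE 0
#else
#define PLATFORM_NAME "Mewel"
#define TEXT_MODE 1
#endif

/* Shared code calls these functions instead of testing the flag. */
const char *platform_name(void)
{
    return PLATFORM_NAME;
}

/* Screen units differ: character cells in text mode, pixels otherwise.
 * The 8-pixel cell width is an assumption for illustration. */
int to_screen_units(int columns)
{
    assert(columns >= 0);
    return TEXT_MODE ? columns : columns * 8;
}
```

Keeping the conditionals in one place means that a port touches this layer
and leaves the application logic alone.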


The MEMOPAD Program


I used Mewel to build MEMOPAD, a multiple document notepad program similar to
the MULTIPAD example program that comes with the Windows SDK. MEMOPAD has
fewer features than MULTIPAD, but it illustrates the use of Mewel in a
multiple document application. Windows programmers will readily see the close
resemblance to Windows programs.
Microsoft introduced the Multiple Document Interface in Windows 3.0. It
provides for an application parent window to have multiple document child
windows. The parent window has the application's menu, and the document
windows have the data. The user decides how many documents to open. The
MEMOPAD program uses text files as documents. You can have several different
text files open at one time.
Mewel does not work correctly if a document window has or is derived from a
control window class -- a list box, an edit box, and so on. You must declare a
window of the class you want and make it a child of the document window. This
becomes a visual problem if the window in question will have a border or
scroll bars. These controls appear inside the border of the document window. A
frame inside a frame is unattractive. A child edit window without a frame can
occupy the entire client area of the document window, but if it has scroll
bars, they appear without a frame inside the document window's frame. The
MEMOPAD program works this way. Such a window configuration seems to be
nonstandard. I have not used the Windows SDK Multiple Document Interface, so I
do not know whether it has the same problem.
Listing One is memopad.h, which defines all the identifiers for windows, menu
commands, controls, and strings. The program source file and the resource file
both include this header file to associate the values of the identifiers with
the resources and the code that use them.
Listing Two is memopad.rc, the resource file for the application. It defines
the MEMOPAD menu and the text string values. Mewel includes a resource
compiler similar to the one that comes with the Windows SDK. The resource
compiler compiles the resource text file into the resource binary file that
the runtime system uses. The text file has the .RC extension and the binary
file has the .RES extension. A program can load the .RES file at runtime, or
the resource compiler can write it into the program's .EXE file.
Using resource files separates the content and format of menus, dialog boxes,
and strings from the program's source code. This practice makes it possible to
modify resources without recompiling the program. In a Mewel application you
can code the ASCII values of strings in the resource file and associate them
with an identifier that the program refers to. This practice facilitates the
development of applications where some values are customized for different
user environments, perhaps for foreign language translations.
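
The idea of referring to strings by identifier can be sketched in portable C. The tables, the German text, and the LookupString and UseGermanStrings names below are assumptions made for illustration; in a Mewel program the strings live in the .RC file and the program fetches them with LoadString.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Identifier values as in memopad.h (only three shown). */
#define IDS_TITLE 0
#define IDS_HELP  1
#define IDS_ERROR 2
#define IDS_COUNT 3

/* Two hypothetical string tables; with a resource compiler these
   would be separate resource files, not C source. */
static const char *EnglishTable[IDS_COUNT] = {
    "MemoPad", "Help", "Error!"
};
static const char *GermanTable[IDS_COUNT] = {
    "MemoPad", "Hilfe", "Fehler!"
};

static const char **ActiveTable = EnglishTable;

/* Choose the table once at startup. */
void UseGermanStrings(int german)
{
    ActiveTable = german ? GermanTable : EnglishTable;
}

/* Copy the string for id into buf, truncating to max-1 characters.
   Returns the number of characters copied, or 0 for a bad id. */
int LookupString(int id, char *buf, int max)
{
    assert(buf != NULL && max > 0);
    if (id < 0 || id >= IDS_COUNT) {
        buf[0] = '\0';
        return 0;
    }
    strncpy(buf, ActiveTable[id], (size_t)(max - 1));
    buf[max - 1] = '\0';
    return (int) strlen(buf);
}
```

The calling code uses only the identifiers, so switching languages changes the table and nothing else.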
Listing Three is memopad.c. This is the MEMOPAD application code. It looks
very much like a Windows program. It registers the frame and document window
classes, creates the frame and MDI client windows, and enters the message
sensing and dispatching loop. The FrameWndProc function receives and processes
the messages that result when the user selects menu commands. The functions
that it calls open empty notepad windows, load selected files into notepad
windows, save the notepad contents to files, and print the contents of edit
buffers. The program is simple, intended mainly to illustrate the use of Mewel
and how it resembles the Windows SDK.


The File Open Dialog Box


The MEMOPAD program calls a function named DlgOpenFile to allow the user to
select a file to load or to name a file to be saved. Mewel includes such a
function, but because it does not look very much like the Windows 3.0 file
open dialog box and because I wanted to illustrate how dialog boxes work, I
redesigned the dialog box and rewrote the function. Listing Four is
fileopen.h, the header file that defines the identifiers. Listing Five is
fileopen.dlg, the text definition of the dialog box contents and format. The
memopad.rc file includes this .DLG file. By maintaining separate .DLG files
and including them in each program's .RC file, you can share common dialog
boxes across applications.
Listing Five is fileopen.c, the program that implements the File Open dialog
box. It displays the dialog box and allows the user to navigate the disk
system to locate a file. When the user selects one, the function logs onto the
disk drive where the file exists, changes to its subdirectory, and copies the
selected filename into the calling function's memory as pointed to by the
Fname argument.
MEMOPAD compiles to a 180K .EXE file with Turbo C++ 1.0. This is a big
executable module for such a small program. Obviously, the CUA library is a
big one. You will probably not use Mewel to develop TSRs and other programs
where memory requirements are tight.
Mewel includes a dialog box editor program that purports to be similar to the
one included in the Windows SDK tools. The program is, however, mostly
unusable. It does not allow you to move or resize the dialog box window, does
not compile to .DLG format (.RES or C code only), and its composition tools
are difficult to use. Although the manual identifies the program and
represents it as a real tool, Magma says that it is only an example and that a
more useful one is under development.


Performance


The downside of MEWEL is its performance. Your CUA programs are not going to
be snappy on the slower processors. For example, if you run the MEMOPAD
program on an 8-MHz AT, open five document windows, and select the Close All
command on the Window menu, it takes MEWEL ten seconds to close all five
windows. Other window display and swapping functions are similarly slow.
Unless you intend to run your programs on fast machines, you might find MEWEL
to be too sluggish. Windows 3.0 is no screamer on the 8-MHz machines, either,
but it is somewhat faster than MEWEL. You would not expect a graphics window
manager to outrun a text-based one, particularly when the APIs and design
philosophies are the same.


The License


When I first saw Mewel, I was puzzled by a clause in the licensing agreement
that prohibits users of the library from using it to develop text editor or
word processing programs. It turns out that Magma markets a text editor and a
word processor in addition to Mewel. They firmly assert that they do not want
others using the results of Magma's hard work -- meaning Mewel -- to go into
competition with Magma in the applications arena. It is understandable that
they do not allow you to use Mewel to develop a video window product to
compete with Mewel itself. But to prevent you from developing specific
applications is unfair. They are commercially marketing the results of their
so-called hard work and accepting your hard-earned money for it. Restricting
your use of it in such a way is the same as if Borland and Microsoft were to
prevent you from using their C compilers to develop word processors,
spreadsheets, and desktop utilities.


Conclusion


Mewel is a relatively new product, developed and supported by a small company
that competes in a marketplace dominated by better financed, better operated,
and more experienced companies. Yet Mewel is
unique. At the time I'm writing this, I know of no other DOS text-mode library
that combines windows and mouse support, SAA-compliance, and the Windows SDK
API. Until someone comes along with a competing library, Mewel is alone in
this particular market. Despite the documentation problems, the restrictive
clause in the license agreement, and the occasional bug, I can recommend this
library to programmers who want SAA-compliant DOS programs, a bridge between
Windows and DOS, or an easier road to learning Windows programming.


Products Mentioned


Mewel 3 Window System
Magma Software Systems
15 Bodwell Terrace
Millburn, NJ 07041
201-912-0192
System requirements: Microsoft C 5.1 or later, or Turbo C++. Turbo C 2.0,
Zortech C++, and JPI TopSpeed C versions are available on request.
$295 for MSC and TC++ libraries
$595 for libraries with source code


_THE MEWEL WINDOW SYSTEM_
by Al Stevens




[LISTING ONE]

/* ------------ memopad.h ------------- */

/* -------- window identifiers ----------- */
#define ID_MAIN 1
#define ID_MDICLIENT 2
#define ID_EDITOR 3
#define ID_FIRSTEDITOR 100

/* ------- menu command identifiers -------- */
#define ID_NEWFILE 5
#define ID_OPENFILE 6
#define ID_SAVE 7
#define ID_SAVEAS 8
#define ID_PRINT 9
#define ID_EXIT 10

#define IDM_WINDOWTILE 12
#define IDM_WINDOWCASCADE 13
#define IDM_WINDOWICONS 14
#define IDM_WINDOWCLOSEALL 15

#define ID_HELP 99

/* -------- string identifiers --------- */
#define IDS_TITLE 0
#define IDS_HELP 1
#define IDS_ERROR 2
#define IDS_OVERWRITE 3
#define IDS_WRITEERROR 4
#define IDS_SELECTERROR 5
#define IDS_NOFILE 6
#define IDS_FORMFEED 7
#define IDS_UNTITLED 8







[LISTING TWO]

#include <style.h>
#include "memopad.h"

#include "fileopen.dlg"

MPmenu MENU
BEGIN
 POPUP "~File"
 BEGIN
 MENUITEM "~New", ID_NEWFILE SHADOW
 MENUITEM "~Open...", ID_OPENFILE
 MENUITEM "~Save", ID_SAVE

 MENUITEM "Save ~As...", ID_SAVEAS
 MENUITEM SEPARATOR
 MENUITEM "~Print", ID_PRINT
 MENUITEM SEPARATOR
 MENUITEM "E~xit", ID_EXIT
 END
 POPUP "~Window"
 BEGIN
 MENUITEM "~Tile", IDM_WINDOWTILE SHADOW
 MENUITEM "~Cascade", IDM_WINDOWCASCADE
 MENUITEM "Arrange ~Icons", IDM_WINDOWICONS
 MENUITEM "Close ~All", IDM_WINDOWCLOSEALL
 END
 MENUITEM "~Help", ID_HELP HELP
END

STRINGTABLE
BEGIN
 IDS_TITLE, "MemoPad"
 IDS_HELP,
"MemoPad is a multiple document\
memo processor. You can have\
several *.PAD documents open at\
one time. It demonstrates the\
MDI document feature of MEWEL."
 IDS_ERROR, "Error!"
 IDS_OVERWRITE, "Overwrite Existing File?"
 IDS_WRITEERROR, "Cannot write file"
 IDS_SELECTERROR, "Select an open document first"
 IDS_NOFILE, "No such file"
 IDS_FORMFEED, "Send a Form Feed?"
 IDS_UNTITLED, "Untitled"
END








[LISTING THREE]

/* ------------ memopad.c -------------- */

#include <stdio.h>
#include <stdlib.h>
#include <window.h>
#include <string.h>
#include <sys\stat.h>
#include <io.h>
#include "memopad.h"

long FAR PASCAL FrameWndProc(HWND, WORD, WORD, DWORD);
long FAR PASCAL EditorProc(HWND, WORD, WORD, DWORD);
void NewFile(void);
void SelectFile(void);
void SaveFile(BOOL);
void PrintPad(void);

void OpenWindow(char *);
void LoadFile(HWND, char *, int);
void BuildEditor(HWND);
HWND GetEditorHandle(void);
int ErrorMessage(int);

char EditorClass[] = "Editor";
char FrameClass[] = "FrameClass";
char Untitled[26];

HWND hClient;
HWND hEditor;
HWND hFrame;

int hInstance;

void main(void)
{
 MSG event;
 WNDCLASS wndclass;
 CLIENTCREATESTRUCT ccs;
 char Title[26];

 WinInit();
 WinUseSysColors(NULLHWND, TRUE);
 MDIInitialize();

 /* Register the Editor Document Window */
 memset (&wndclass, 0, sizeof (wndclass));
 wndclass.style = CS_HREDRAW | CS_VREDRAW;
 wndclass.lpfnWndProc = EditorProc ;
 wndclass.lpszMenuName = NULL;
 wndclass.lpszClassName = EditorClass;
 if (!RegisterClass (&wndclass))
 exit(1);

 /* Register the Frame Window */
 memset (&wndclass, 0, sizeof (wndclass));
 wndclass.style = CS_HREDRAW | CS_VREDRAW;
 wndclass.lpfnWndProc = FrameWndProc ;
 wndclass.lpszMenuName = "MPMenu";
 wndclass.lpszClassName = FrameClass;
 if (!RegisterClass (&wndclass))
 exit(1);

 /* Open the Resource File */
 hInstance = OpenResourceFile("MEMOPAD");
 LoadString(hInstance, IDS_UNTITLED, Untitled, 25);

 /* Create the frame window */
 LoadString(hInstance, IDS_TITLE, Title, 25);
 hFrame = CreateWindow(
 FrameClass,
 Title,
 WS_OVERLAPPEDWINDOW | WS_CLIPCHILDREN |
 WS_MINIMIZEBOX | WS_MAXIMIZEBOX,
 CW_USEDEFAULT,
 CW_USEDEFAULT,
 CW_USEDEFAULT,

 CW_USEDEFAULT,
 SYSTEM_COLOR,
 ID_MAIN,
 NULLHWND,
 NULLHWND,
 hInstance,
 (LPSTR) NULL
 );

 /* display the frame window */
 ShowWindow(hFrame, SW_SHOW);

 /* create the MDI client window */
 ccs.hWindowMenu = GetSubMenu(GetMenu(hFrame), 1);
 ccs.idFirstChild = ID_FIRSTEDITOR;

 hClient = CreateWindow("mdiclient",
 NULL,
 WS_CHILD | WS_CLIPCHILDREN |
 WS_CLIPSIBLINGS,
 0,0,0,0,
 SYSTEM_COLOR,
 ID_MDICLIENT,
 hFrame,
 NULL,
 hInstance,
 (LPSTR) &ccs);
 ShowWindow(hClient, SW_SHOW);

 /* set focus for keyboard users */
 SetFocus(hFrame);

 /* message loop */
 while (GetMessage(&event, NULLHWND, 0, 0)) {
 TranslateMessage(&event);
 DispatchMessage(&event);
 }

 CloseResourceFile(hInstance);
 exit(0);
}

/* wndproc for the frame window */
long FAR PASCAL FrameWndProc(HWND hWnd, WORD message,
 WORD wParam, DWORD lParam)
{
 HWND hwndCheck;
 char Hmsg[501];

 switch (message) {
 case WM_HELP:
 LoadString(hInstance, IDS_HELP, Hmsg, 500);
 MessageBox(hFrame, Hmsg, NULL, MB_OK);
 break;
 case WM_COMMAND:
 switch (wParam) {
 case ID_NEWFILE:
 NewFile();
 break;

 case ID_OPENFILE:
 SelectFile();
 break;
 case ID_SAVE:
 SaveFile(FALSE);
 break;
 case ID_SAVEAS:
 SaveFile(TRUE);
 break;
 case ID_PRINT:
 PrintPad();
 break;
 case ID_EXIT:
 PostQuitMessage(0);
 break;
 case IDM_WINDOWTILE:
 SendMessage(hClient,WM_MDITILE,0,0);
 break;
 case IDM_WINDOWCASCADE:
 SendMessage(hClient,WM_MDICASCADE,0,0);
 break;
 case IDM_WINDOWICONS:
 SendMessage(hClient,WM_MDIICONARRANGE,0,0);
 break;
 case IDM_WINDOWCLOSEALL:
 while ((hwndCheck =
 GetWindow(hClient, GW_CHILD))
 != NULLHWND)
 SendMessage(hClient, WM_MDIDESTROY,
 hwndCheck, 0);
 break;
 default:
 break;
 }
 break;
 default:
 break;
 }
 return DefFrameProc(hWnd,hClient,message,wParam,lParam);
}

/* The New command. Open an empty editor window */
void NewFile(void)
{
 OpenWindow(Untitled);
}

/* The Open... command. Select a file */
void SelectFile(void)
{
 char FileName[64];
 if (DlgOpenFile(hFrame, "*.PAD", FileName)) {
 HWND hWnd, hEditor;
 /* test to see if the document is already open */
 if ((hWnd = FindWindow(EditorClass, FileName))
 != NULLHWND) {
 /* document is open, activate its window */
 BringWindowToTop(hWnd);
 OpenIcon(hWnd);

 hEditor = GetTopWindow(hWnd);
 SetFocus(hEditor);
 }
 else
 OpenWindow(FileName);
 }
}

/* get the current active editor window handle */
HWND GetEditorHandle(void)
{
 HWND hCurEd;

 hCurEd = GetFocus();
 if (!IsChild(hClient, GetParent(hCurEd)))
 hCurEd = NULLHWND;
 return hCurEd;
}

/* Save the notepad file */
void SaveFile(BOOL SaveAs)
{
 char FileName[64];
 HWND hCurEd, hEditor;
 char *text;
 FILE *fp;

 /* get the handle of the active notepad editor window */
 if ((hCurEd = GetEditorHandle()) != NULLHWND) {
 hEditor = GetParent(hCurEd);
 /* --- get the editor window's file name --- */
 GetWindowText(hEditor, FileName, 64);
 /* get a name for untitled window or Save As command */
 if (SaveAs || strcmp(FileName, Untitled) == 0) {
 if (!DlgOpenFile(hFrame, "*.PAD", FileName))
 return;
 if (access(FileName, 0) == 0) {
 char omsg[81];
 LoadString(hInstance, IDS_OVERWRITE, omsg, 80);
 if (MessageBox(hFrame, omsg,
 NULL, MB_YESNO) == IDNO)
 return;
 }
 SetWindowText(hEditor, FileName);
 }
 /* - get the address of the editor text - */
 text = (char *) SendMessage(hCurEd, EM_GETHANDLE,0,0);
 if ((fp = fopen(FileName, "wt")) != NULL) {
 fwrite(text, strlen(text), 1, fp);
 fclose(fp);
 }
 else
 ErrorMessage(IDS_WRITEERROR);
 }
 else
 ErrorMessage(IDS_SELECTERROR);
}

/* open a document window and load a file */

void OpenWindow(char *FileName)
{
 MDICREATESTRUCT mcs;
 HWND hWnd, hEditor;
 struct stat sb;

 if (strcmp(FileName, Untitled) && stat(FileName, &sb)) {
 ErrorMessage(IDS_NOFILE);
 return;
 }

 mcs.szTitle = FileName;
 mcs.szClass = EditorClass;
 mcs.hOwner = hInstance;
 mcs.lParam = NULL;
 mcs.x = mcs.y = mcs.cy = mcs.cx = CW_USEDEFAULT;
 mcs.style = WS_CLIPCHILDREN;

 /* tell the client window to create the document window */
 hWnd = SendMessage(hClient, WM_MDICREATE, 0,
 (LONG) (LPMDICREATESTRUCT) &mcs);

 hEditor = GetTopWindow(hWnd);
 SetFocus(hEditor);

 if (strcmp(FileName, Untitled))
 LoadFile(hEditor, FileName, (int) sb.st_size);
}

/* wndproc for the editor window */
long FAR PASCAL EditorProc(HWND hWnd, WORD message,
 WORD wParam, DWORD lParam)
{
 RECT rc;
 int rtn;

 switch (message) {
 case WM_SIZE:
 /* Resize the edit control box. */
 GetClientRect (hWnd, &rc);

 WinSetSize(GetTopWindow(hWnd),
 rc.bottom-rc.top+1,
 rc.right-rc.left+1);
 break;
 case WM_SETFOCUS: {
 rtn = DefMDIChildProc(hWnd,message,wParam,lParam);
 /* Set the focus on the editor window */
 SetFocus(GetTopWindow(hWnd));
 return rtn;
 }
 case WM_CREATE:
 /* create the file window's editor box */
 BuildEditor(hWnd);
 break;
 default:
 break;
 }
 return DefMDIChildProc(hWnd, message, wParam, lParam);

}

/* Create the editor window */
void BuildEditor(HWND hWnd)
{
 CreateWindow(
 "edit",
 NULL,
 WS_CHILD | WS_CLIPCHILDREN | WS_VSCROLL |
 ES_MULTILINE | ES_AUTOVSCROLL,
 CW_USEDEFAULT,
 CW_USEDEFAULT,
 CW_USEDEFAULT,
 CW_USEDEFAULT,
 SYSTEM_COLOR,
 ID_EDITOR,
 hWnd,
 NULLHWND,
 hInstance,
 (LPSTR) NULL);
}

/* Load the notepad file into the editor text buffer */
void LoadFile(HWND hEditor, char *FileName, int tLen)
{
 int bfsize;
 char *Buf;
 FILE *fp;

 Buf = (char *)
 SendMessage(hEditor, EM_GETHANDLE, 0, (long) &bfsize);
 if (bfsize < tLen+1) {
 Buf = LocalReAlloc(Buf, tLen+1, 0);
 SendMessage(hEditor, EM_SETHANDLE, tLen+1, (long) Buf);
 }
 if (Buf != NULL) {
 if ((fp = fopen(FileName, "rt")) != NULL) {
 memset (Buf, 0, tLen+1);
 fread(Buf, tLen, 1, fp);
 SendMessage(hEditor, WM_SETTEXT, 0, (long) Buf);
 fclose(fp);
 }
 }
}

/* print the current notepad */
void PrintPad(void)
{
 char FileName[64];
 HWND hCurEd;

 if ((hCurEd = GetEditorHandle()) != NULLHWND) {
 char *text;
 char msg[81];
 HWND hEditor = GetParent(hCurEd);
 /* --- get the editor window's file name --- */
 GetWindowText(hEditor, FileName, 64);

 /* ---------- print the file name ---------- */

 fputs("\r\n", stdprn);
 fputs(FileName, stdprn);
 fputs(":\r\n\n", stdprn);

 /* ---- get the address of the editor text ----- */
 text = (char *) SendMessage(hCurEd, EM_GETHANDLE,0,0);

 /* ------- print the notepad text --------- */
 while (*text) {
 if (*text == '\n')
 fputc('\r', stdprn);
 fputc(*text++, stdprn);
 }

 /* ------- follow with a form feed? --------- */
 LoadString(hInstance, IDS_FORMFEED, msg, 80);
 if (MessageBox(hFrame, msg, NULL, MB_YESNO) == IDYES)
 fputc('\f', stdprn);
 }
 else
 ErrorMessage(IDS_SELECTERROR);
}

/* Error message handler */
int ErrorMessage(int ErrorNumber)
{
 char ErrorMsg[81];
 char Error[26];

 LoadString(hInstance, ErrorNumber, ErrorMsg, 80);
 LoadString(hInstance, IDS_ERROR, Error, 26);
 MessageBeep(0);
 return MessageBox(hFrame, ErrorMsg, Error, MB_OK);
}






[LISTING FOUR]

/* ------------ fileopen.h --------------- */

/* ------- file open dialog box identifiers --------- */
#define ID_FILEOPEN 20
#define ID_PATH 21
#define ID_FILES 22
#define ID_FILENAME 23
#define ID_DRIVE 24





[LISTING FIVE]

/* ----------- fileopen.c ------------- */


#include <window.h>
#include <string.h>
#include "fileopen.h"

static int pascal DlgFnOpen(HDLG, WORD, WORD, DWORD);
static BOOL InitDlgBox(HDLG);
static void StripPath(char *);

static char OrigSpec[80];
static char FileSpec[80];
static char FileName[80];

static int FileSelected;

#define HasWildCards(s) (strchr(s, '?') || strchr(s, '*'))

/* Dialog Box to select a file from the disk system */
int pascal DlgOpenFile(HWND hParent, BYTE *Fpath, BYTE *Fname)
{
 HDLG hDlg;
 int rtn;
 extern int hInstance;

 hDlg = LoadDialog(hInstance, MAKEINTRESOURCE(ID_FILEOPEN),
 hParent, DlgFnOpen);
 strncpy(FileSpec, Fpath, sizeof(FileSpec));
 strcpy(OrigSpec, FileSpec);

 if ((rtn = DialogBox(hDlg)) == TRUE)
 strcpy(Fname, FileName);
 else
 *Fname = '\0';

 return rtn;
}

/* Process dialog box messages */
static int pascal DlgFnOpen(HDLG hDlg, WORD msg, WORD wParam,
 DWORD lParam)
{
 switch (msg) {
 case WM_INITDIALOG:
 if (!InitDlgBox(hDlg))
 EndDialog(hDlg, 0);
 return TRUE;

 case WM_COMMAND:
 switch (wParam) {
 case ID_FILENAME:
 /* allow user to modify the file spec */
 GetDlgItemText(hDlg, ID_FILENAME,
 FileName, 64);
 if (HasWildCards(FileName)) {
 strcpy(OrigSpec, FileName);
 StripPath(OrigSpec);
 }
 break;
 case IDOK:
 if (HasWildCards(FileName)) {

 /* no file name yet */
 strcpy(FileSpec, FileName);
 if (InitDlgBox(hDlg)) {
 SetDlgItemText(hDlg, ID_FILENAME,
 FileSpec);
 strcpy(OrigSpec, FileSpec);
 }
 }
 else
 EndDialog(hDlg, 1);
 return TRUE;

 case IDCANCEL:
 EndDialog(hDlg, 0);
 return TRUE;

 case ID_FILES:
 switch (HIWORD(lParam)) {
 case LBN_SELCHANGE :
 /* selected a different filename */
 DlgDirSelect(hDlg, FileName,
 ID_FILES);
 SetDlgItemText(hDlg, ID_FILENAME,
 FileName);
 FileSelected = TRUE;
 break;
 case LBN_DBLCLK :
 /* chose a file name */
 DlgDirSelect(hDlg, FileName,
 ID_FILES);
 EndDialog(hDlg, 1);
 return TRUE;
 }
 break;
 case ID_DRIVE:
 switch (HIWORD(lParam)) {
 case LBN_SELCHANGE :
 /* selected different drive/dir */
 DlgDirSelect(hDlg, FileName,
 ID_DRIVE);
 strcat(FileName, OrigSpec);
 strcpy(FileSpec, FileName);
 SetDlgItemText(hDlg, ID_FILENAME,
 FileSpec);
 break;
 case LBN_DBLCLK :
 /* chose drive/dir */
 if (InitDlgBox(hDlg))
 SetDlgItemText(hDlg, ID_FILENAME,
 FileSpec);
 else
 strcpy(FileSpec, OrigSpec);
 return TRUE;
 }
 break;

 default:
 break;
 }

 }
 return FALSE;
}

/* Initialize the dialog box */
static BOOL InitDlgBox(HDLG hDlg)
{
 FileSelected = FALSE;
 SetDlgItemText(hDlg, ID_FILENAME, FileSpec);
 if (!DlgDirList(hDlg, FileSpec, ID_FILES, ID_PATH, 0))
 return FALSE;
 /* MEWEL DlgDirList should do this, but does not */
 StripPath(FileSpec);
 return DlgDirList(hDlg, "*.*", ID_DRIVE, 0, 0xc010);
}

/* Strip the drive and path information from a file spec */
static void StripPath(char *filespec)
{
 char *cp, *cp1;

 cp = strchr(filespec, ':');
 if (cp != NULL)
 cp++;
 else
 cp = filespec;
 while (TRUE) {
 cp1 = strchr(cp, '\\');
 if (cp1 == NULL)
 break;
 cp = cp1+1;
 }
 strcpy(filespec, cp);
}




























March, 1991
NETWORKING WITH WINDOWS 3.0


Give your NetWare utilities a face lift


 This article contains the following executables: MEGAPHON.ARC


Mike Klein


Mike is a software engineer with Insight Development Corporation and
specializes in Microsoft Windows, HP NewWave, and Novell NetWare. He is also
the author of several books and numerous magazine articles, and can be reached
at 500 Cole St., San Francisco, CA 94117, via CompuServe at 73750,2152, and on
Telepath as MikeKlein.


Every programmer is familiar with the inverse relationship in software: as
user-friendliness and ease of use go up, ease of development for the
programmer goes down. And I
have to admit that this has somewhat biased my opinion about programming for a
GUI environment such as Microsoft Windows. However, the old saying about the
bark being worse than the bite seems to apply to Windows 3.0. In fact, I found
that by "snitching" a couple of predefined program templates (included in the
Windows 3.0 SDK), development can be fairly easy. In no time at all I was
finishing up my first Windows program, a NetWare messaging and user
information utility I call "Megaphone."
I started with the idea that Megaphone would be a simple message-sending
utility for allowing a user to type in a 50-character message and broadcast it
to another user or users on the network. However, I got a small case of
"feature-itis" during development, and Megaphone now sports some additional
functionality. For instance, by double-clicking on a user's name, network
statistics such as login name, full name, login date and time, and network and
node address can be displayed.
Megaphone was written using Microsoft C 6.0, the Microsoft Windows 3.0
Software Development Kit, Solution Systems' Brief programming editor,
Microsoft's Windows Write, and Novell's NetWare C Interface. Megaphone
requires a version of Windows 3.0 that has been installed for NetWare and a
NetWare network to run on. Although Megaphone has only been tested on NetWare
2.1x, it should run on ELS NetWare and NetWare 386.


A New Way of Thinking (or Programming)


The biggest hurdle for new Windows programmers is that you don't have complete
control over your program -- Windows does. No longer does your program call the
shots. Rather than your program polling the mouse to find out when it has been
clicked on a certain window, Windows informs that window's procedure that the
mouse has been clicked in its area. Despite the new way of thinking and
the supposed difficulty of mastering the 1000+ function calls, Windows is
still an easy-to-use and robust development environment.
Under Windows, every created window has a number of properties associated with
it, one of which is a far pointer to a procedure that handles events for the
window. This procedure handles all or most message processing for the window.
In addition, almost every item or object in a window or dialog box is capable
of receiving and/or sending messages.
Megaphone has a main window with a system menu, caption, and a minimize
button. In Megaphone's client area (see Figure 1) there are a variety of
controls, including a list box, a drop-down combo box, an edit control, and
three buttons: Send, Settings, and Exit. Each of these controls can send
messages to, and receive messages from, other windows and other controls (which
are essentially windows), or pass messages on to Windows for further
processing.
Messages can be sent in one of two ways, either directly to a window's
handling procedure (sending a message), or put into the window's application
queue and serviced later (posting a message). Messages sent to a window's
handling procedure can be sent in one of two ways, either directly to the
window (such as an edit control, button, or list box) using its handle, or to
a control ID in a dialog box. A control ID is an integer identifier
representing a unique value for a control in a dialog box. When a dialog box
is first defined or created, the only handles available for manipulation of
the dialog and its controls are a window handle to the main dialog box and ID
numbers for all of the controls. It's a good idea to set up global window
handles to any frequently accessed controls to make sending messages less
complicated. Also, certain API functions only work with window handles.
When an application posts a message using PostMessage(), control immediately
returns to the calling program. When an application sends a message to another
window using SendMessage(), control is not returned until the target window
has processed the message sent to it. Note that messages meant for dialog box
controls should be sent, not posted. (See Example 1.)
Example 1: Three Windows functions for sending messages

 BOOL PostMessage (hWnd, wMsg, wParam, lParam);
 DWORD SendMessage (hWnd, wMsg, wParam, lParam);
 DWORD SendDlgItemMessage (hDlg, nIDDlgItem, wMsg, wParam, lParam);
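
The difference between sending and posting can be imitated in a few lines of portable C. This is a simulation under stated assumptions, not Windows code: SimSend, SimPost, SimPump, and the fixed-size queue are invented names that stand in for SendMessage, PostMessage, and the application's message loop.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_SIZE 16

typedef long (*SimProc)(unsigned msg, unsigned wParam, long lParam);

typedef struct {
    SimProc  proc;
    unsigned msg;
    unsigned wParam;
    long     lParam;
} SimQueued;

static SimQueued Queue[QUEUE_SIZE];
static int Head, Tail, Count;
static int Handled;

/* Example handler: counts calls so the timing difference is visible. */
long CountingProc(unsigned msg, unsigned wParam, long lParam)
{
    (void) msg; (void) wParam; (void) lParam;
    return ++Handled;
}

int HandledCount(void) { return Handled; }

/* "Send": call the handler now and hand back its result. */
long SimSend(SimProc proc, unsigned msg, unsigned wParam, long lParam)
{
    assert(proc != NULL);
    return proc(msg, wParam, lParam);
}

/* "Post": queue the message and return at once; 0 means queue full. */
int SimPost(SimProc proc, unsigned msg, unsigned wParam, long lParam)
{
    assert(proc != NULL);
    if (Count == QUEUE_SIZE)
        return 0;
    Queue[Tail].proc = proc;
    Queue[Tail].msg = msg;
    Queue[Tail].wParam = wParam;
    Queue[Tail].lParam = lParam;
    Tail = (Tail + 1) % QUEUE_SIZE;
    Count++;
    return 1;
}

/* Drain the queue, as the main message loop would. Returns the
   number of messages dispatched. */
int SimPump(void)
{
    int n = 0;
    while (Count > 0) {
        SimQueued m = Queue[Head];
        Head = (Head + 1) % QUEUE_SIZE;
        Count--;
        m.proc(m.msg, m.wParam, m.lParam);
        n++;
    }
    return n;
}
```

After SimSend returns, the handler has already run; after SimPost returns, it has not run until SimPump is called.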

Prior to being dispatched to a window's handling procedure, messages are
acquired and dispatched in your application's WinMain(), which is somewhat
equivalent to C's main(). WinMain() is where a program does most of its
initialization, including registering window classes and displaying windows,
and goes into the main message dispatching loop. Before being dispatched to a
window handling procedure, messages are stored in a MSG structure, the format
of which is shown in Figure 2.
Figure 2: The MSG structure used to store messages

 typedef struct
 {
 HWND hwnd; /* Window recvg the message */
 WORD message; /* The actual message code */
 WORD wParam; /* Data dependent upon message */
 LONG lParam; /* Addn'l data for message */
 DWORD time; /* Time message received */
 POINT pt; /* Mouse pos at time of msg */
 }
 MSG;

The key to initial success when programming for Windows is knowing when and
how to act on certain messages. Window procedures need to either act on the
message (maybe doing nothing with it) or pass it on to Windows (using the
DefWindowProc or DefDlgProc functions) for further processing. Sometimes you
need to act on the message and still pass it on to Windows. One of my first
problems was determining which messages I needed to process and which I could
ignore. It's also important to determine whether the message you're acting on
is posted after an event has occurred or before. In the case of the Windows
message WM_DESTROY, you need to question whether this message is being posted
to a window because it has already been destroyed or because Windows wants to
destroy it. You'll find that the section of the Windows reference manual that
details different Windows messages will be the most dogeared.
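
The three choices a window procedure has (act and stop, act and pass on, or just pass on) can be sketched without Windows at all. The message codes and the SimDefProc stand-in for DefWindowProc below are hypothetical; real message values come from windows.h.

```c
#include <assert.h>

/* Hypothetical message codes for this sketch only. */
enum { SIM_CLOSE = 1, SIM_PAINT = 2, SIM_KEY = 3 };

static int DefaultCalls;
static int KeysSeen;
static int Dirty = 1;   /* pretend there are unsaved changes */

/* Stand-in for DefWindowProc: default handling for anything the
   application does not deal with itself. */
long SimDefProc(unsigned msg, unsigned wParam, long lParam)
{
    (void) msg; (void) wParam; (void) lParam;
    DefaultCalls++;
    return 0;
}

long SimWndProc(unsigned msg, unsigned wParam, long lParam)
{
    switch (msg) {
    case SIM_CLOSE:
        if (Dirty)
            return 0;   /* act on it and stop: refuse to close */
        break;          /* otherwise let the default handle it */
    case SIM_KEY:
        KeysSeen++;     /* act on it, then still pass it on */
        break;
    default:
        break;          /* not interested: pass it on untouched */
    }
    return SimDefProc(msg, wParam, lParam);
}

int DefaultCallCount(void) { return DefaultCalls; }
int KeyCount(void) { return KeysSeen; }

void SetDirty(int dirty)
{
    assert(dirty == 0 || dirty == 1);
    Dirty = dirty;
}
```

Returning early from a case is what keeps a message away from the default handler; falling out of the switch is what passes it on.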
All window procedures must be declared FAR PASCAL and EXPORTed in your
application's .DEF file. The FAR designation is necessary since Windows' code
segment will probably differ from your own application's code segment. The
Pascal calling convention, where parameters are pushed onto the stack in the
order that they appear, is used because it cuts down on code size. The EXPORT
command advertises your function as "callable" to Windows.


The Application


Megaphone is a fairly straightforward Windows program. Listing One is the
header file for Megaphone and includes the #defines identifying the controls
in the different dialog boxes. Listing Two, the C source file for the Megaphone
program, could have been broken into several components. However, I didn't
want to add to the program's complexity. If you plan on extending this program
to other networks, your first step might be to separate the network-specific
functions into a separate file.
Listing Three contains the dialog box templates, icon definitions, and other
"resources." Listing Four is the .DEF file that contains all the nitty-gritty
information including the program name, stack and heap sizes, and window
procedures to be imported and exported. Listing Five is a standard Windows
make file.

My first problem involved Megaphone's main window, since I wanted to use a
dialog box instead of a plain window containing different controls. The
problem with a plain window is that it loses certain characteristics of a
dialog box, such as Tab key processing, that I wanted to keep. However, I also
wanted the main dialog box to be treated just like a regular window so that an
icon shows at the bottom of the screen when the dialog box is minimized. The
answer, a la Charles Petzold's Hexcalc program, was to define the dialog box
as a window class, registering and using it in a call to CreateDialog to
create the main window. I now had a main window with all the features of a
dialog box. You'll find that most utility-style programs use a similar format.
The next problem involved message processing for dialog boxes. Because
Megaphone uses almost nothing but modeless dialog boxes, none of the Tab or
Alt key controls worked for the different dialog boxes. Normally, the function
IsDialogMessage() is used inside the main message-processing loop to check for
messages belonging to a dialog box and process them, skipping the usual
translate-and-dispatch step. The problem is that IsDialogMessage
requires a window handle; I didn't want to have to keep track of up to 50
modeless dialog boxes on my screen, so I created a global window handle that
points to the currently active window. Each window-handling procedure traps
the WM_ACTIVATE message and accordingly sets the global window handle. This
way IsDialogMessage() needs only to be called with the global window handle.
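
The single global handle described above can be sketched as a small simulation. Everything here is a stand-in chosen for illustration: SimWindow replaces a real window handle, SIM_ACTIVATE plays the part of WM_ACTIVATE, and SimIsDialogMessage only mimics the role of IsDialogMessage for one kind of key.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SimWindow {
    int id;
    int dialogKeysHandled;   /* Tab keys this dialog consumed */
} SimWindow;

enum { SIM_ACTIVATE = 1, SIM_TABKEY = 2, SIM_OTHER = 3 };

/* The one global handle: whichever dialog is active right now. */
static SimWindow *ActiveDialog = NULL;

/* Each dialog procedure records itself when it is activated
   (wParam nonzero) and clears the handle when it is deactivated. */
void SimDialogProc(SimWindow *w, unsigned msg, unsigned wParam)
{
    assert(w != NULL);
    if (msg == SIM_ACTIVATE) {
        if (wParam != 0)
            ActiveDialog = w;
        else if (ActiveDialog == w)
            ActiveDialog = NULL;
    }
}

/* Stand-in for IsDialogMessage: returns 1 if the dialog consumed
   the message as a navigation key. */
int SimIsDialogMessage(SimWindow *dlg, unsigned msg)
{
    if (dlg == NULL || msg != SIM_TABKEY)
        return 0;
    dlg->dialogKeysHandled++;
    return 1;
}

/* One pass of the main loop for a single message. Returns 1 if the
   message went on to normal dispatching, 0 if a dialog took it. */
int SimLoopStep(unsigned msg)
{
    if (SimIsDialogMessage(ActiveDialog, msg))
        return 0;
    /* translate and dispatch would happen here */
    return 1;
}
```

The loop never needs a list of open dialogs; it consults one pointer that the dialogs keep current themselves.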
Note that DestroyWindow, not EndDialog, is called when any of the modeless
dialog boxes are to be destroyed. In many cases, modal and modeless dialog
boxes are dealt with very differently. The different handshaking between
window types and the confusion surrounding dialog boxes as a window class took
a little getting used to. Many of the dialog box-oriented message descriptions
in the SDK are somewhat terse and often confusing. To add to the problem,
several "inviting" and undocumented dialog box function calls are made in the
source Microsoft provides with the SDK. I couldn't find certain function calls
and window messages defined anywhere in the SDK manuals. The SDK proved to be
an excellent reference, except in the areas of subclassing, window filters and
hooks, owner draw controls, and in describing the different window messages.
When Windows creates a dialog box or any other sort of window, several window
parameters have to be set, including class, style, positioning, size, and the
window's parent (if one exists). Megaphone's User and Message windows were
both created with no parent window, making them somewhat independent. When the
Megaphone window is minimized, the user information and message windows stay
up on the screen. Although they don't have a parent, they are still
technically child windows. When Megaphone's main window is closed, all windows
(whether children or not) belonging to that program instance are removed from
the screen.


NetWare Programming


I found that adding NetWare-API functionality to Megaphone was completely
painless. Future access to the NetWare API will be made possible by a DLL
shared amongst all Windows applications. Currently, however, a library must be
statically linked into the application at build time. If you're using assembly and doing
direct INT 21 calls, then you need to IMPORT NetWare.NetWareRequest into your
application's .DEF file. This is necessary because Windows needs to trap any
calls you make directly to DOS. Since Megaphone uses the C interface, this
step isn't required -- the conversion is automatically done. Novell provides
two APIs for NetWare: the NetWare C Interface-DOS (preferred) and the NetWare
DOS Interface (for assembly language programmers).
The problem with NetWare's SEND utility (which Megaphone mimics somewhat) is
that when a message is sent to a character-mode DOS or Windows-based
workstation, all background processing halts. In addition, only one message
may be received by a workstation at a time. One of the first things that
Megaphone does is instruct the file server to hold messages destined for the
workstation. (Normally, the file server sends messages to a workstation as
soon as it gets them.) Next, Megaphone sets up a Windows timer that fires
every five seconds; each time the resulting WM_TIMER message arrives, it polls
the server for messages. This could end up generating a certain amount of
traffic on your
network, so you may want to alter the polling delay somewhat. Because modeless
dialog boxes are used for the message (instead of system-modal dialog boxes,
which require user input and freeze the background windows), multiple messages
may exist on the computer's screen without any user interaction.
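A rough way to judge the polling delay is to estimate how many polls per
minute reach a server. The helper below is a back-of-the-envelope sketch, not
part of Megaphone: polls_per_minute() is a hypothetical name, and it assumes
every workstation runs Megaphone and that each poll costs one request.

```c
/* Hypothetical helper: requests per minute reaching one server when
   `stations` workstations each poll once every `interval_seconds`.
   An interval of 0 is treated as "polling disabled". */
unsigned long polls_per_minute(unsigned long stations,
                               unsigned long interval_seconds)
{
    if (interval_seconds == 0)
        return 0;
    return (stations * 60UL) / interval_seconds;
}
```

Under those assumptions, 50 workstations at the five-second setting produce
600 polls per minute, and stretching the delay to 15 seconds cuts that to 200.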
The only real problems when programming for NetWare surface in multiple-server
environments. NetWare defines three types of server attachments: primary, the
server that a user runs their network login scripts from; preferred, the
server that all requests go to; and default, the server that your current
drive letter points to (assuming it's a network drive).
Normally, if a preferred server connection exists, all requests go to the
preferred server. If a preferred server doesn't exist, then the user's current
drive is checked to see if it's a network or local device. If it points to a
network volume, then that particular server is used. If the user's current
drive points to a local drive, requests go to the primary server. If a primary
server doesn't exist, requests go to the first server in the server name
table. All of NetWare's API calls either automatically assume the preferred
server, get passed a handle to a specific server, or include a parameter
involving a network drive letter.
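The order of precedence described above can be written down as a small
decision function. This is a sketch only: the Attachments structure and
resolve_server() are hypothetical and do not correspond to any NetWare API
call. A value of 0 is assumed to mean "no such connection."

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of a workstation's attachments; 0 means "none". */
typedef struct {
    unsigned preferred_id;     /* preferred server, if one has been set   */
    unsigned drive_server_id;  /* server behind current drive, 0 if local */
    unsigned primary_id;       /* primary server, if one exists           */
    unsigned first_table_id;   /* first entry in the server name table    */
} Attachments;

/* Returns the connection ID that requests would go to. */
unsigned resolve_server(const Attachments *a)
{
    assert(a != NULL);
    if (a->preferred_id != 0)
        return a->preferred_id;
    if (a->drive_server_id != 0)
        return a->drive_server_id;
    if (a->primary_id != 0)
        return a->primary_id;
    return a->first_table_id;
}
```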
One other important thing to note is the difference between connection numbers
and connection IDs. A file server has a certain number of user connections it
can support (100 for NetWare/286, 250 for NetWare/386), each represented by a
16-bit connection number. Each workstation, on the other hand, can have up to
eight file server attachments, each represented by a 16-bit connection ID.
Every file server pointed to by a workstation's connection ID has a matching
connection number that points to the workstation. It's important from the
start to be clear on which value a function call is asking for and which it is
returning.
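One way to keep the two values apart is to picture the workstation's side of
the relationship as a table with eight slots. The sketch below is an
illustration under that assumption; the Attachment structure and
connection_number_for() are hypothetical names, not NetWare API calls.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ATTACHMENTS 8  /* up to eight file servers per workstation */

/* Hypothetical per-workstation table, indexed by connection ID 1..8. */
typedef struct {
    const char *server_name;       /* NULL when the slot is unused       */
    unsigned    connection_number; /* our slot number on that server     */
} Attachment;

/* Maps a connection ID (workstation side) to the connection number the
   server assigned to this workstation. Returns 0 if the ID is out of
   range or the slot is unused. */
unsigned connection_number_for(const Attachment table[MAX_ATTACHMENTS],
                               unsigned connection_id)
{
    assert(table != NULL);
    if (connection_id < 1 || connection_id > MAX_ATTACHMENTS)
        return 0;
    if (table[connection_id - 1].server_name == NULL)
        return 0;
    return table[connection_id - 1].connection_number;
}
```

The connection ID says which of your own attachments you mean; the connection
number says which of the server's user slots is yours.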
Handling of multiple-server connections in Megaphone can be improved. Because
a user could inadvertently change the status of any of their connections, some
periodic checking for changes should be done. With a little extra programming
to occasionally monitor connection status, the user list box and server combo
box could be changed to dynamically alter their contents in response to
changes in the network.


Using Megaphone


Megaphone allows users to send messages to each other and retrieve simple
connection information on other users. To retrieve connection information (see
Figure 3) on a user, merely double click on their name. You can do this as
many times as you want, displaying multiple information windows. To send a
message, just click on a single user or a group of users, type a message into
the message box, and click on the Send button -- it's that easy. To select or
deselect all of the users in the list box, just click the left mouse button
anywhere in Megaphone's white client area around the satellite dish icon. To
automatically refresh network connection information and update the contents
of all the list boxes, simply choose Refresh from the menu or click the left
mouse button on the satellite dish icon.
When a message from another workstation is received by Megaphone, it is
displayed on the screen in a modeless dialog box (see Figure 4). Incoming
messages are displayed in a window consisting of a caption showing who sent
the message and the date and time the message was received; an edit control
for the incoming message and possibly the outgoing response; a Reply button
(sometimes disabled) for responding to messages; a Cancel button to remove the
message; and lastly, a Save button, which minimizes the message and saves it
as a special Message class icon at the bottom of the screen.
The reply feature is possible due to some special formatting when the message
is first sent. Using NetWare's built-in messaging utilities, replies to
messages are not possible. Megaphone makes it a lot easier to respond to a
coworker's request by formatting the outgoing message in the format
username[connection#]message. Incoming messages, including NetWare's (which
are formatted differently), are then parsed according to this format. If an
incoming message does not follow this format, the reply button is disabled.
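The parsing step can be sketched as a standalone C function. This is a
simplified illustration rather than the code Megaphone uses: parse_message()
and NAME_LEN are hypothetical names, the 1 to 255 range check mirrors the one
in the WM_TIMER handler of the listing, and, as a design choice of this sketch,
the whole text is kept as the message body whenever a reply is not possible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 48

/* Splits "username[connection#]message". Returns 1 if a reply is possible,
   0 otherwise. *body always points at text that can be displayed. */
int parse_message(const char *text, char name[NAME_LEN], int *conn,
                  const char **body)
{
    const char *open, *close;
    size_t len;

    assert(text != NULL && name != NULL && conn != NULL && body != NULL);
    name[0] = '\0';
    *conn = 0;
    *body = text;

    open = strchr(text, '[');
    if (open == NULL)
        return 0;
    close = strchr(open, ']');
    if (close == NULL)
        return 0;

    len = (size_t)(open - text);
    if (len == 0 || len >= NAME_LEN)
        return 0;

    *conn = atoi(open + 1);          /* stops at the first non-digit */
    if (*conn < 1 || *conn > 255) {
        *conn = 0;
        return 0;
    }

    memcpy(name, text, len);
    name[len] = '\0';
    *body = close + 1;
    return 1;
}
```

A return value of 0 corresponds to the case where the Reply button would be
disabled.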
If you don't want to be disturbed, press the Settings button and change some
of your defaults (see Figure 5). If you're feeling unsociable and don't want
to receive messages, simply remove the Accept box's check mark. Also included
in the Settings menu under Incoming Messages is the Iconize check box, which
determines whether incoming messages are displayed as a window or iconized.
The Settings menu also includes a set of radio buttons determining whether
retrieved users have to be attached to a server or simply defined in its
bindery. A limited amount of information can be retrieved on users even when
they aren't logged in to a particular server.


Improvements


One enhancement that could be added to Megaphone is checking for double clicks
in the file server combo box (perhaps changing it to a list box), where a
double-click on a file server would bring up key statistics on it, such as
network I/O and traffic information. Key to implementing this feature would be
issuing NetWare API calls such as GetLANDriverConfigInfo(), which is used to
retrieve LAN driver information. Adding a user group box would also be nice,
where single-clicking on a group would automatically highlight every member in
the list box that's in the group, and double-clicking on the group would bring
up miscellaneous information about the group. This option could be easily
implemented using NetWare's ReadPropertyValue call.
Keyboard support could be improved to trap the Delete key while in the user
list box, deleting the user's connection when pressed (requires calling
NetWare's ClearConnectionNumber). A real-time chatting facility using
NetWare's SPX/IPX network transport protocol would also be useful.
Hopefully I've been able to give you enough information and enough code to get
you started on Windows-based network utilities. The Windows API is seemingly
endless and quite robust -- you'll always be surprised by that one function
call you "happen to stumble across." Likewise, I continue to find new uses for
the different calls in the NetWare API. Combining the two can make for some
really powerful network applications.


_NETWORKING WITH WINDOWS 3_
by Mike Klein


[LISTING ONE]

/* MEGAPHON.H */

#define MAX_CONNECTIONS 100
#define MAX_MESSAGE_LEN 56

#define IDT_MESSAGETIMER 100

#define IDC_USERLISTBOXTITLE 100
#define IDC_USERLISTBOX 101
#define IDC_SERVERCOMBOBOXTITLE 102
#define IDC_SERVERCOMBOBOX 103
#define IDC_MESSAGEEDITBOX 104
#define IDC_SENDBUTTON 105
#define IDC_SETTINGSBUTTON 106
#define IDC_EXITBUTTON 107

#define IDC_ACCEPTMESSAGES 100
#define IDC_ICONIZEMESSAGES 101
#define IDC_ONLYATTACHEDUSERS 102

#define IDC_ALLUSERSINBINDERY 103

#define IDC_USERNAME 100
#define IDC_STATION 101
#define IDC_NODE 102
#define IDC_FULLNAME 103
#define IDC_LOGINTIME 104
#define IDC_NETWORK 105

#define IDC_REPLYEDITBOX 100
#define IDC_REPLYBUTTON 101
#define IDC_SAVEBUTTON 102

#define IDM_EXIT 200
#define IDM_ABOUT 201
#define IDM_REFRESH 220


int PASCAL WinMain(HANDLE, HANDLE, LPSTR, int);

LONG FAR PASCAL MainWndProc(HWND, unsigned, WORD, LONG);
LONG FAR PASCAL UserInfo(HWND, unsigned, WORD, LONG);
LONG FAR PASCAL MessageHandler(HWND, unsigned, WORD, LONG);
BOOL FAR PASCAL About(HWND, unsigned, WORD, LONG);
BOOL FAR PASCAL Settings(HWND, unsigned, WORD, LONG);

BOOL PASCAL InitNetStuff(VOID);
VOID PASCAL EnableOrDisableSendButton(VOID);
VOID PASCAL SendNetWareMessageToUsers(VOID);
VOID PASCAL ShowUserInformation(VOID);






[LISTING TWO]

/*****************************************************************************

 PROGRAM: Megaphone
 AUTHOR : Mike Klein
 VERSION: 1.0
 FILE : megaphon.exe
 CREATED: 10-25-90

 REQUIREMENTS: Windows 3.x and a Novell NetWare 2.1x or compatible network

 PURPOSE : Messaging and simple information system for Novell NetWare.
 Allows quick replies to messages from coworkers and
 miscellaneous login information about them.

*****************************************************************************/


#define NOCOMM


#include <windows.h>

#include <direct.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#include <nwbindry.h>
#include <nwconn.h>
#include <nwdir.h>
#include <nwmsg.h>
#include <nwwrkenv.h>

#include "megaphon.h"


HANDLE hInstMegaphone; /* Original instance of Megaphone */

/* These are handles to commonly accessed controls in different dialogs */

HWND hWndCurrent; /* handle to currently active window on screen */
HWND hDlgMegaphone; /* handle to Megaphone dialog window */
HWND hWndUserListBox;
HWND hWndServerComboBox;
HWND hWndMessageEditBox;
HWND hWndSendButton;

/* Set up some additional global variables to commonly used NetWare stuff */

WORD DefaultConnectionID;
BYTE ServerName[48];

WORD UserConnectionNum;
BYTE UserName[48];
BYTE UserLoginTime[7];

WORD SelUserConnectionNum;
BYTE SelUserNetworkAddr[4];
BYTE SelUserNodeAddr[6];
BYTE SelUserName[48];
BYTE SelUserFullName[48];
BYTE SelUserLoginTime[7];

BOOL AcceptMessages = TRUE;
BOOL IconizeMessages = FALSE;
BOOL AllUsers = FALSE;

BYTE Text[100];


/*****************************************************************************

 FUNCTION: WinMain

 PURPOSE : Calls initialization function, processes message loop

*****************************************************************************/

int PASCAL WinMain(HANDLE hInstance, HANDLE hPrevInstance, LPSTR lpCmdLine,
 int nCmdShow)
{

 WNDCLASS wc;
 MSG msg;

 if(!hPrevInstance) /* Other instances of app running? */
 {
 hInstMegaphone = hInstance; /* Remember original instance */

 /* Fill in window class structure with parameters that describe the */
 /* main window. */

 wc.style = CS_DBLCLKS; /* Process double click msgs */
 wc.lpfnWndProc = MainWndProc; /* Function to retrieve msgs for */
 /* windows of this class. */
 wc.cbClsExtra = 0; /* No per-class extra data. */
 wc.cbWndExtra = DLGWINDOWEXTRA; /* Set because we used the CLASS */
 /* statement in dialog box */
 wc.hInstance = hInstance; /* Application that owns the class.*/
 wc.hIcon = LoadIcon(hInstance, "Megaphone");
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = "Megaphone";
 wc.lpszClassName = "Megaphone";

 if(!RegisterClass(&wc))
 return(FALSE);

 /* Fill in window class structure with parameters that describe the */
 /* message window. */

 wc.style = NULL; /* No class styles needed */
 wc.lpfnWndProc = MessageHandler;
 wc.cbClsExtra = 0; /* No per-class extra data. */
 wc.cbWndExtra = DLGWINDOWEXTRA; /* Set because we used the CLASS */
 /* statement in dialog box */
 wc.hInstance = hInstance; /* Application that owns the class.*/
 wc.hIcon = LoadIcon(hInstance, "Message");
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = NULL;
 wc.lpszClassName = "Message";

 if(!RegisterClass(&wc))
 return(FALSE);

 /* Fill in window class structure with parameters that describe the */
 /* userinfo window. */

 wc.style = NULL; /* No class styles needed */
 wc.lpfnWndProc = UserInfo;
 wc.cbClsExtra = 0; /* No per-class extra data. */
 wc.cbWndExtra = DLGWINDOWEXTRA; /* Set because we used the CLASS */
 /* statement in dialog box */
 wc.hInstance = hInstance; /* Application that owns the class.*/
 wc.hIcon = LoadIcon(hInstance, "User");
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = NULL;
 wc.lpszClassName = "User";


 if(!RegisterClass(&wc))
 return(FALSE);

 /* Create the main dialog window Megaphone, get window handles to */
 /* several of the controls, send the message edit box a message to */
 /* limit itself to MAX_MESSAGE_LEN characters, and set the system */
 /* font (mono-spaced) for server combo box and user list box. */

 hDlgMegaphone = CreateDialog(hInstance, "Megaphone", NULL, 0L);

 hWndUserListBox = GetDlgItem(hDlgMegaphone, IDC_USERLISTBOX);
 hWndServerComboBox = GetDlgItem(hDlgMegaphone, IDC_SERVERCOMBOBOX);
 hWndMessageEditBox = GetDlgItem(hDlgMegaphone, IDC_MESSAGEEDITBOX);
 hWndSendButton = GetDlgItem(hDlgMegaphone, IDC_SENDBUTTON);

 SendMessage(hWndMessageEditBox, EM_LIMITTEXT, MAX_MESSAGE_LEN - 1, 0L);
 SendMessage(hWndUserListBox, WM_SETFONT,
 GetStockObject(SYSTEM_FIXED_FONT), FALSE);
 SendMessage(hWndServerComboBox, WM_SETFONT,
 GetStockObject(SYSTEM_FIXED_FONT), FALSE);

 /* Finally, show the Megaphone dialog box */

 ShowWindow(hDlgMegaphone, nCmdShow);
 UpdateWindow(hDlgMegaphone);

 /* Initialize the network stuff, and fill in list boxes */

 InitNetStuff();
 }
 else
 {
 /* If there was another instance of Megaphone running, then switch to */
 /* it by finding any window of class = "Megaphone". Then, if it's an */
 /* icon, open the window, otherwise just make it active. */

 hDlgMegaphone = FindWindow("Megaphone", NULL);
 if(IsIconic(hDlgMegaphone))
 ShowWindow(hDlgMegaphone, SW_SHOWNORMAL);
 SetActiveWindow(hDlgMegaphone);

 return(FALSE);
 }

 /* Acquire and dispatch messages until a WM_QUIT message is received. */
 /* The window handle hWndCurrent points to the currently active window, */
 /* and is used to identify and process keystrokes going to any modeless */
 /* dialog box. */

 while(GetMessage(&msg, NULL, NULL, NULL))
 if(hWndCurrent == NULL || !IsDialogMessage(hWndCurrent, &msg))
 {
 TranslateMessage(&msg);
 DispatchMessage(&msg);
 }

 return(msg.wParam); /* Exit code passed to PostQuitMessage */
}


/*****************************************************************************


 FUNCTION: MainWndProc

 PURPOSE : Processes messages for Megaphone dialog box

*****************************************************************************/

long FAR PASCAL MainWndProc(HWND hWnd, unsigned wMsg, WORD wParam, LONG
lParam)
{
 FARPROC lpProc; /* Far procedure ptr to be used for About box */

 HWND hWndTemp;
 HWND hDlgMessage;

 BYTE *ptr;
 int Index = 100;

 BYTE MessageCaption[100];
 BYTE MessageDate[9];
 BYTE MessageTime[9];

 switch(wMsg)
 {
 case WM_COMMAND :

 switch(wParam)
 {
 /* When wParam == 1, return was pressed in either the list box, */
 /* combo box, check boxes, or edit control. */

 case 1 :

 /* Find out the current control (window) and process ENTER */
 /* key accordingly. */

 switch(GetDlgCtrlID(GetFocus()))
 {
 case IDC_USERLISTBOX :

 ShowUserInformation();
 break;

 case IDC_SERVERCOMBOBOX :

 SendMessage(hWndServerComboBox, WM_KILLFOCUS, NULL, 0L);
 SendMessage(hWndServerComboBox, WM_SETFOCUS, NULL, 0L);
 break;

 case IDC_MESSAGEEDITBOX :

 if(IsWindowEnabled(hWndSendButton))
 {
 SendMessage(hWndSendButton, WM_LBUTTONDOWN, 0, 0L);
 SendMessage(hWndSendButton, WM_LBUTTONUP, 0, 0L);
 SendMessage(hDlgMegaphone, WM_NEXTDLGCTL,
 hWndMessageEditBox, TRUE);
 }
 else
 {

 MessageBeep(0);
 MessageBox(hDlgMegaphone, "You need a message and a user(s)",
 "ERROR", MB_ICONEXCLAMATION | MB_OK);
 }
 break;

 default :

 break;
 }
 break;

 case IDC_USERLISTBOX :

 if(HIWORD(lParam) == LBN_DBLCLK)
 {
 /* Restore the list box item's selection state. If it */
 /* isn't flagged, then flag it, and vice versa. */

 Index = (int) SendMessage(hWndUserListBox, LB_GETCURSEL, 0, 0L);
 if(SendMessage(hWndUserListBox, LB_GETSEL, Index, 0L))
 SendMessage(hWndUserListBox, LB_SETSEL, FALSE, Index);
 else
 SendMessage(hWndUserListBox, LB_SETSEL, TRUE, Index);

 ShowUserInformation();
 }
 else
 EnableOrDisableSendButton();

 break;

 case IDC_SERVERCOMBOBOX :

 if(HIWORD(lParam) == CBN_SELCHANGE)
 {
 if((Index = (int) SendMessage(hWndServerComboBox, CB_GETCURSEL,
 0, 0L)) == CB_ERR)
 break;

 SendMessage(hWndServerComboBox, CB_GETLBTEXT, Index,
 (LONG) (LPSTR) ServerName);

 if(!GetConnectionID(ServerName, &DefaultConnectionID))
 {
 SetPreferredConnectionID(DefaultConnectionID);
 InitNetStuff();
 }
 }
 break;

 case IDC_MESSAGEEDITBOX :

 EnableOrDisableSendButton();
 break;

 case IDC_SENDBUTTON :

 if(HIWORD(lParam) == BN_CLICKED)

 SendNetWareMessageToUsers();
 break;

 case IDM_EXIT :
 case IDC_EXITBUTTON :

 SendMessage(hDlgMegaphone, WM_CLOSE, 0, 0L);
 break;

 case IDM_ABOUT :

 lpProc = MakeProcInstance(About, hInstMegaphone);
 DialogBox(hInstMegaphone, "About", hWnd, lpProc);
 FreeProcInstance(lpProc);
 break;

 case IDM_REFRESH :

 InitNetStuff();
 break;

 case IDC_SETTINGSBUTTON :

 lpProc = MakeProcInstance(Settings, hInstMegaphone);
 DialogBox(hInstMegaphone, "Settings", hWnd, lpProc);
 FreeProcInstance(lpProc);
 break;

 default :

 break;
 }

 break;

 case WM_TIMER :

 /* This is the Windows timer for retrieving messages that goes off */
 /* every five seconds. */

 GetBroadcastMessage(Text);

 if(*Text)
 {
 /* Create the message reply dialog box and limit the edit box */
 /* to NetWare's limit of 56 or so characters. */

 hDlgMessage = CreateDialog(hInstMegaphone, "Message",
 hDlgMegaphone, 0L);

 SendDlgItemMessage(hDlgMessage, IDC_REPLYEDITBOX, EM_LIMITTEXT,
 MAX_MESSAGE_LEN - 1, 0L);

 /* Parse the incoming string of 'username[station#]message' */

 if((ptr = strchr(Text, '[')) == NULL)
 {
 /* If the incoming message isn't formatted by NetWare's */
 /* SEND command, SESSION program, or Megaphone, then we */

 /* can't use the REPLY button, so disable it. */

 SelUserName[0] = '\0';
 SelUserConnectionNum = 0;
 EnableWindow(GetDlgItem(hDlgMessage, IDC_REPLYBUTTON), FALSE);
 }
 else
 {
 /* Pull up the user name and connection#, and message, which */
 /* is right after the ']'. */

 strncpy(SelUserName, Text, ptr - Text);
 SelUserName[ptr - Text] = '\0';
 SelUserConnectionNum = atoi(ptr + 1);
 if((ptr = strchr(Text, ']')) != NULL)
 lstrcpy((LPSTR) Text, (LPSTR) (ptr + 1));

 /* Check again to see if we pulled up a valid Conn#. If we */
 /* didn't, then disable the REPLY button. */

 if(SelUserConnectionNum < 1 || SelUserConnectionNum > 255)
 EnableWindow(GetDlgItem(hDlgMessage, IDC_REPLYBUTTON), FALSE);
 }

 /* Put the retrieved message in the dialog's edit box */

 SetDlgItemText(hDlgMessage, IDC_REPLYEDITBOX, Text);

 /* Record the date and time that the message came in at and */
 /* make it reflected in the message caption. */

 _strdate(MessageDate);
 _strtime(MessageTime);
 wsprintf(MessageCaption, "%s %s %s", (LPSTR) SelUserName,
 (LPSTR) MessageDate, (LPSTR) MessageTime);
 SetWindowText(hDlgMessage, MessageCaption);

 /* Finally, show (or minimize) the completed message dialog box */

 if(IconizeMessages)
 ShowWindow(hDlgMessage, SW_SHOWMINNOACTIVE);
 else
 ShowWindow(hDlgMessage, SW_SHOWNORMAL);

 MessageBeep(0);
 }

 return(0L);

 case WM_CLOSE :

 /* Check before closing the main window if there are any */
 /* outstanding messages that haven't been closed. */

 if(hWndTemp = FindWindow((LPSTR) "Message", NULL))
 {
 if(MessageBox(hDlgMegaphone,
 "Quit without disposing of/reading messages?",
 "Messages Outstanding", MB_YESNO | MB_APPLMODAL |
 MB_ICONEXCLAMATION | MB_DEFBUTTON2) == IDYES)
 {
 DestroyWindow(hDlgMegaphone);
 }
 else
 {
 ShowWindow(hWndTemp, SW_SHOWNORMAL);
 SetActiveWindow(hWndTemp);
 }
 }
 else
 DestroyWindow(hDlgMegaphone);

 return(0L);

 case WM_SETFOCUS :

 if(IsWindowEnabled(hWndMessageEditBox))
 SendMessage(hDlgMegaphone, WM_NEXTDLGCTL, hWndMessageEditBox, TRUE);
 else
 SendMessage(hDlgMegaphone, WM_NEXTDLGCTL,
 GetDlgItem(hDlgMegaphone, IDC_EXITBUTTON), TRUE);

 return(0L);

 case WM_ACTIVATE :

 hWndCurrent = (wParam == NULL) ? NULL : hWnd;
 break;

 case WM_DESTROY :

 PostQuitMessage(0);
 return(0L);

 default :

 break;

 }
 return(DefDlgProc(hWnd, wMsg, wParam, lParam));
}


/*****************************************************************************

 FUNCTION: SendNetWareMessageToUsers

 PURPOSE : Sends the edit box text to every selected user and reports
 any per-connection delivery failures

*****************************************************************************/

VOID PASCAL SendNetWareMessageToUsers(VOID)
{
 BYTE Message[MAX_MESSAGE_LEN];
 WORD ConnectionsToSend[MAX_CONNECTIONS];
 BYTE ResultList[MAX_CONNECTIONS];
 int NumUsers;
 int i, j;


 /* Get text inside message edit box and format message so it includes */
 /* the username, connection#, and message. The first two fields are */
 /* needed for replying back since there's nothing in NetWare's messaging */
 /* facility to tell you who sent the message. */

 GetDlgItemText(hDlgMegaphone, IDC_MESSAGEEDITBOX, (LPSTR) Text,
MAX_MESSAGE_LEN);

 wsprintf(Message, "%s[%d]%s", (LPSTR) UserName, UserConnectionNum,
 (LPSTR) Text);

 /* Get total number of users in list box and check to see if they've */
 /* been selected or not. If they have, get their Connection# and put it */
 /* in the ConnectionsToSend array. */

 NumUsers = (int) SendMessage(hWndUserListBox, LB_GETCOUNT, 0, 0L);

 for(i = j = 0; i < NumUsers; i++)
 if(SendMessage(hWndUserListBox, LB_GETSEL, i, 0L))
 {
 SendMessage(hWndUserListBox, LB_GETTEXT, i, (LONG) (LPSTR) Text);
 ConnectionsToSend[j++] = atoi(&Text[18]);
 }

 /* Send the message to users in the array. */

 SendBroadcastMessage(Message, ConnectionsToSend, ResultList, j);

 /* Scan through the ResultList array checking for messages that had */
 /* problems. Selecting OK will continue to check the status of the other */
 /* messages, where selecting CANCEL from the message box will abort the */
 /* send status checking altogether. */

 for(i = 0; i < j; i++)
 switch(ResultList[i])
 {
 case 0xfc :

 wsprintf(Text, "Message to Connection %d", ConnectionsToSend[i]);
 if(MessageBox(hDlgMegaphone,
 "Message not sent - User already has message pending",
 Text, MB_OKCANCEL | MB_ICONEXCLAMATION) == IDCANCEL)
 {
 i = j;
 }
 break;

 case 0xfd :

 wsprintf(Text, "Message to Connection %d", ConnectionsToSend[i]);
 if(MessageBox(hDlgMegaphone,
 "Message not sent - Invalid connection number",
 Text, MB_OKCANCEL | MB_ICONEXCLAMATION) == IDCANCEL)
 {
 i = j;
 }
 break;

 case 0xff :


 wsprintf(Text, "Message to Connection %d", ConnectionsToSend[i]);
 if(MessageBox(hDlgMegaphone,
 "Message not sent - User has blocking turned on",
 Text, MB_OKCANCEL | MB_ICONEXCLAMATION) == IDCANCEL)
 {
 i = j;
 }
 break;

 default :

 break;
 }
}


/*****************************************************************************

 FUNCTION: EnableOrDisableSendButton

 PURPOSE : Based on a message being in the edit box and at least one
 selected user, the send button is enabled or disabled

*****************************************************************************/

VOID PASCAL EnableOrDisableSendButton(VOID)
{
 /* Check to see if at least one user is selected and at least a one */
 /* character message in the edit box. If there is, then enable the SEND */
 /* button and thicken it to make it the default response when ENTER is */
 /* pressed. */

 if(SendMessage(hWndUserListBox, LB_GETSELCOUNT, 0, 0L) &&
 SendMessage(hWndMessageEditBox, EM_LINELENGTH, -1, 0L))
 {
 EnableWindow(hWndSendButton, TRUE);
 }
 else
 {
 EnableWindow(hWndSendButton, FALSE);
 }
}


/*****************************************************************************

 FUNCTION: InitNetStuff

 PURPOSE : Initialize network connections and fill in combo and list boxes

*****************************************************************************/

BOOL PASCAL InitNetStuff(VOID)
{
 HCURSOR hOldCursor;

 WORD NumberOfConnections;
 WORD NumberOfServers;


 int Index;
 BYTE DirHandle;

 BYTE TempServerName[48];
 WORD ObjectType;
 WORD ConnID;
 WORD ConnectionList[MAX_CONNECTIONS];
 BYTE SearchObjectName[48] = "*";
 BYTE ObjectName[48];
 long ObjectID;
 BYTE ObjectHasProperties;
 BYTE ObjectFlag;
 BYTE ObjectSecurity;

 /* Check to see if a connection has been made to any server */

 if(UserConnectionNum = GetConnectionNumber())
 if(!GetConnectionInformation(UserConnectionNum, UserName, &ObjectType,
 &ObjectID, UserLoginTime))
 if(*UserName)
 {
 /* If we have a preferred connection ID, then we're supposed to */
 /* use it for all of our requests. If we don't, then check to */
 /* see if we're sitting on a local drive (bit 0 or 1 not set). */
 /* If we are, then set the default connection ID to that of the */
 /* primary server. If we're sitting on a network drive, then */
 /* requests go to the associated server. */

 if(GetPreferredConnectionID())
 DefaultConnectionID = GetPreferredConnectionID();
 else
 {
 if(!(GetDriveInformation((BYTE) (_getdrive() - 1), &ConnID,
 &DirHandle) & 3))
 {
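 DefaultConnectionID = GetPrimaryConnectionID();
 }
 else
 {
 /* On a network drive: use the server that drive is mapped to */
 DefaultConnectionID = ConnID;
 }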
 SetPreferredConnectionID(DefaultConnectionID);
 }

 /* Set NetWare's message mode so that Megaphone can poll for */
 /* messages instead of automatically having them sent to the */
 /* station. */

 EnableBroadcasts();
 SetBroadcastMode(3);

 /* Set up a Windows timer so that every 5 seconds, the server is */
 /* polled for waiting messages. */

 SetTimer(hDlgMegaphone, IDT_MESSAGETIMER, 5000, NULL);

 EnableWindow(GetDlgItem(hDlgMegaphone, IDC_SETTINGSBUTTON), TRUE);
 }

 if(!UserConnectionNum)
 {
 EnableWindow(GetDlgItem(hDlgMegaphone, IDC_SETTINGSBUTTON), FALSE);

 MessageBox(hDlgMegaphone, "Must be logged into a NetWare server",
 "ERROR - NO USERS", MB_ICONSTOP | MB_OK);
 return(FALSE);
 }

 /* Now that we've established a network connection, let's fill in the */
 /* drop-down combo box with file servers and the list box with users of */
 /* whatever the node's preferred server is. */

 /* Turn off re-drawing of the list box so it doesn't flicker, reset the */
 /* contents of both boxes, capture and intercept all mouse activity, and */
 /* turn the cursor into an hourglass. */

 SendMessage(hWndUserListBox, WM_SETREDRAW, FALSE, 0L);
 SendMessage(hWndUserListBox, LB_RESETCONTENT, 0, 0L);
 SendMessage(hWndServerComboBox, CB_RESETCONTENT, 0, 0L);
 SetCapture(hDlgMegaphone);
 hOldCursor = SetCursor(LoadCursor(NULL, IDC_WAIT));

 /* Scan through the possible ConnectionID#'s (1-8) and see what file */
 /* servers are attached, if any are, and put them in the combo box. */

 for(ConnID = 1; ConnID < 9; ++ConnID)
 {
 GetFileServerName(ConnID, TempServerName);
 if(*TempServerName)
 SendMessage(hWndServerComboBox, CB_ADDSTRING, NULL,
 (LONG) (LPSTR) TempServerName);
 }

 /* Get default server */

 GetFileServerName(DefaultConnectionID, ServerName);

 /* Search the NetWare bindery for active user connections, putting */
 /* them into the list box */

 ObjectID = -1;
 while(!ScanBinderyObject(SearchObjectName, OT_USER, &ObjectID, ObjectName,
 &ObjectType, &ObjectHasProperties, &ObjectFlag, &ObjectSecurity))
 {
 GetObjectConnectionNumbers(ObjectName, OT_USER, &NumberOfConnections,
 ConnectionList, MAX_CONNECTIONS);

 /* If there are multiple connections for a single user then we */
 /* have to make sure and get all of them. */

 if(!NumberOfConnections)
 {
 if(AllUsers)
 {
 wsprintf(Text, "[%s]", (LPSTR) ObjectName);
 SendMessage(hWndUserListBox, LB_ADDSTRING, NULL, (LONG) (LPSTR) Text);
 }
 }
 else
 for(Index = 0; Index < (int) NumberOfConnections; ++Index)
 {
 if(UserConnectionNum == ConnectionList[Index])

 wsprintf(Text, "%-16.16s *%3d", (LPSTR) ObjectName, ConnectionList[Index]);
 else
 wsprintf(Text, "%-17.17s %3d", (LPSTR) ObjectName, ConnectionList[Index]);
 SendMessage(hWndUserListBox, LB_ADDSTRING, NULL, (LONG) (LPSTR) Text);
 }
 }

 /* Turn re-drawing for the list box back on and make the first item in */
 /* the server combo box and user list box the default. */

 InvalidateRect(hWndUserListBox, NULL, TRUE);
 SendMessage(hWndUserListBox, LB_SETSEL, 0, 0L);
 SendMessage(hWndUserListBox, WM_SETREDRAW, TRUE, 0L);

 /* Select the default server in the server combo box */

 SendMessage(hWndServerComboBox, CB_SELECTSTRING, -1, (LONG) (LPSTR)
ServerName);

 /* Add the # of servers and users to the caption on the server and list */
 /* boxes. */

 NumberOfConnections = (int) SendMessage(hWndUserListBox, LB_GETCOUNT, 0, 0L);
 wsprintf(Text, "%d &Users on %s", NumberOfConnections, (LPSTR) ServerName);
 SetDlgItemText(hDlgMegaphone, IDC_USERLISTBOXTITLE, (LPSTR) Text);

 NumberOfServers = (int) SendMessage(hWndServerComboBox, CB_GETCOUNT, 0, 0L);
 wsprintf(Text, "%d Ser&vers", NumberOfServers);
 SetDlgItemText(hDlgMegaphone, IDC_SERVERCOMBOBOXTITLE, (LPSTR) Text);

 /* Restore mouse activity, set the cursor back to normal, and initially */
 /* disable the Send button. */

 ReleaseCapture();
 SetCursor(hOldCursor);
 EnableOrDisableSendButton();

 return(TRUE);
}


/*****************************************************************************

 FUNCTION: About

 PURPOSE : Processes messages for About box

*****************************************************************************/

BOOL FAR PASCAL About(HWND hWnd, unsigned wMsg, WORD wParam, LONG lParam)
{
 switch(wMsg)
 {
 case WM_INITDIALOG :

 return(TRUE);

 case WM_COMMAND :

 if(wParam == IDOK || wParam == IDCANCEL)
 {
 EndDialog(hWnd, TRUE);
 return(TRUE);
 }
 break;

 default :

 break;
 }
 return(FALSE);
}


/*****************************************************************************

 FUNCTION: Settings

 PURPOSE : Processes messages for Settings window

*****************************************************************************/

BOOL FAR PASCAL Settings(HWND hWnd, unsigned wMsg, WORD wParam, LONG lParam)
{
 switch(wMsg)
 {
 case WM_INITDIALOG :

 CheckDlgButton(hWnd, IDC_ACCEPTMESSAGES, AcceptMessages);
 CheckDlgButton(hWnd, IDC_ICONIZEMESSAGES, IconizeMessages);

 if(AllUsers)
 CheckRadioButton(hWnd, IDC_ALLUSERSINBINDERY,
 IDC_ALLUSERSINBINDERY, IDC_ALLUSERSINBINDERY);
 else
 CheckRadioButton(hWnd, IDC_ONLYATTACHEDUSERS,
 IDC_ONLYATTACHEDUSERS, IDC_ONLYATTACHEDUSERS);

 break;

 case WM_COMMAND :

 switch(wParam)
 {
 case IDC_ACCEPTMESSAGES :

 if(IsDlgButtonChecked(hWnd, IDC_ACCEPTMESSAGES))
 {
 EnableBroadcasts();
 SetTimer(hDlgMegaphone, IDT_MESSAGETIMER, 5000, NULL);
 AcceptMessages = TRUE;
 }
 else
 {
 DisableBroadcasts();
 KillTimer(hDlgMegaphone, IDT_MESSAGETIMER);
 AcceptMessages = FALSE;
 }


 break;

 case IDC_ICONIZEMESSAGES :

 if(IsDlgButtonChecked(hWnd, IDC_ICONIZEMESSAGES))
 IconizeMessages = TRUE;
 else
 IconizeMessages = FALSE;

 break;

 case IDC_ONLYATTACHEDUSERS :

 AllUsers = FALSE;
 InitNetStuff();
 break;

 case IDC_ALLUSERSINBINDERY :

 AllUsers = TRUE;
 InitNetStuff();
 break;

 case IDOK :
 case IDCANCEL :

 EndDialog(hWnd, TRUE);
 return(TRUE);

 default :

 break;
 }
 }
 return(FALSE);
}


/*****************************************************************************

 FUNCTION: MessageHandler

 PURPOSE : Processes messages for user message dialog box/window

*****************************************************************************/

long FAR PASCAL MessageHandler(HWND hWnd, unsigned wMsg, WORD wParam,
                               LONG lParam)
{
 BYTE Message[MAX_MESSAGE_LEN];
 WORD ConnectionsToSend[1];
 BYTE ResultList[1];

 switch(wMsg)
 {
 case WM_COMMAND :

 switch(wParam)
 {
 case 1 :


 /* A '1' is generated when ENTER is pressed while in a */
 /* control in a dialog box and there's no default button */
 /* defined. This is to trap the ENTER key when in the reply */
 /* edit box. We'll simulate the pressing of the REPLY button */

 if(IsWindowEnabled(GetDlgItem(hWnd, IDC_REPLYBUTTON)))
 {
 SendMessage(GetDlgItem(hWnd, IDC_REPLYBUTTON),
 WM_LBUTTONDOWN, 0, 0L);
 SendMessage(GetDlgItem(hWnd, IDC_REPLYBUTTON),
 WM_LBUTTONUP, 0, 0L);
 }
 else
 {
 MessageBeep(0);
 MessageBox(hWnd, "You cannot reply to this message",
 "ERROR", MB_ICONEXCLAMATION | MB_OK);
 }
 break;

 case IDC_SAVEBUTTON :

 /* "Save" the message by iconizing it */

 CloseWindow(hWnd);
 break;

 case IDC_REPLYBUTTON :

 /* Get text from edit box located in message dialog box */

 GetDlgItemText(hWnd, IDC_REPLYEDITBOX, (LPSTR) Text,
 MAX_MESSAGE_LEN);

 /* Set up my connection# to send to, format the message, and */
 /* send it to the user. */

 ConnectionsToSend[0] = SelUserConnectionNum;
 wsprintf(Message, "%s[%d]%s", (LPSTR) SelUserName,
 SelUserConnectionNum, (LPSTR) Text);

 SendBroadcastMessage(Message, ConnectionsToSend, ResultList, 1);

 /* Possible results of sending a message */

 switch(ResultList[0])
 {
 case 0xfc :

 wsprintf(Text, "Message to Connection %d",
 ConnectionsToSend[0]);
 MessageBox(hDlgMegaphone,
 "Message not sent - User already has message pending",
 Text, MB_OK | MB_ICONEXCLAMATION);
 break;

 case 0xfd :


 wsprintf(Text, "Message to Connection %d",
 ConnectionsToSend[0]);
 MessageBox(hDlgMegaphone,
 "Message not sent - Invalid connection number",
 Text, MB_OK | MB_ICONEXCLAMATION);

 break;

 case 0xff :

 wsprintf(Text, "Message to Connection %d",
 ConnectionsToSend[0]);
 MessageBox(hDlgMegaphone,
 "Message not sent - User has blocking turned on",
 Text, MB_OK | MB_ICONEXCLAMATION);

 break;

 default :

 break;
 }

 /* Get rid of the message reply dialog box/window */

 DestroyWindow(hWnd);
 return(0L);

 case IDCANCEL :

 DestroyWindow(hWnd);
 return(0L);

 default :

 break;
 }
 break;

 case WM_CLOSE :

 DestroyWindow(hWnd);
 return(0L);

 case WM_SETFOCUS :

 /* Set the focus to the edit control. */

 SendMessage(hWnd, WM_NEXTDLGCTL, GetDlgItem(hWnd, IDC_REPLYEDITBOX),
 TRUE);

 return(0L);

 case WM_ACTIVATE :

 hWndCurrent = (wParam == NULL) ? NULL : hWnd;
 break;

 default :


 break;
 }
 return(DefDlgProc(hWnd, wMsg, wParam, lParam));
}


/*****************************************************************************

 FUNCTION: ShowUserInformation

 PURPOSE : Shows user information on current or double-clicked entry

*****************************************************************************/

VOID PASCAL ShowUserInformation(VOID)
{
 int Index;
 BYTE *ptr;

 HWND hDlgUserInfo;

 WORD ObjectType;
 long ObjectID;
 WORD SocketNum;
 BYTE PropertyValue[128];
 BYTE MoreSegments;
 BYTE PropertyFlags;

 /* Get an index to the user name underneath the cursor, */
 /* and then get the user's connection number so we can */
 /* retrieve NetWare information on him/her. */

 Index = (int) SendMessage(hWndUserListBox, LB_GETCURSEL, 0, 0L);
 SendMessage(hWndUserListBox, LB_GETTEXT, Index, (LONG) (LPSTR) Text);

 /* If entry in list box doesn't have a connection #, then we need to get */
 /* the login name by parsing the list box string, we can't use a call to */
 /* GetConnectionInformation(). */

 memset(SelUserLoginTime, '\0', 7);
 memset(SelUserNetworkAddr, '\0', 4);
 memset(SelUserNodeAddr, '\0', 6);
 SelUserFullName[0] = '\0';
 SelUserConnectionNum = 0;

 if(Text[0] == '[')
 {
 ptr = strchr(Text, ']');
 strncpy(SelUserName, Text + 1, ptr - Text - 1);
 SelUserName[ptr - Text - 1] = '\0';
 }
 else
 {
 ptr = strchr(Text, ' ');
 strncpy(SelUserName, Text, ptr - Text);
 SelUserConnectionNum = atoi(&Text[18]);
 SelUserName[ptr - Text] = '\0';


 /* We can get connection info only for users that are logged in to a */
 /* server. So, get the user name and login time, network and node */
 /* address, and full name for specified connection#. */

 GetConnectionInformation(SelUserConnectionNum, SelUserName, &ObjectType,
 &ObjectID, SelUserLoginTime);

 GetInternetAddress(SelUserConnectionNum, SelUserNetworkAddr,
 SelUserNodeAddr, &SocketNum);
 }

 if(!ReadPropertyValue(SelUserName, OT_USER, "IDENTIFICATION", 1,
 PropertyValue, &MoreSegments, &PropertyFlags))
 {
 wsprintf(SelUserFullName, "%s", (LPSTR) PropertyValue);
 }

 /* Create userinfo dialog box and change caption to the */
 /* user's login name. */

 hDlgUserInfo = CreateDialog(hInstMegaphone, "UserInfo", hDlgMegaphone, 0L);
 SetWindowText(hDlgUserInfo, SelUserName);

 /* Initialize the user info dialog box by changing */
 /* the values of different static text fields to */
 /* reflect acquired user information */

 wsprintf(Text, ": %s", (LPSTR) SelUserName);
 SetDlgItemText(hDlgUserInfo, IDC_USERNAME, (LPSTR) Text);

 wsprintf(Text, ": %d", SelUserConnectionNum);
 SetDlgItemText(hDlgUserInfo, IDC_STATION, (LPSTR) Text);

 wsprintf(Text, ": %s", (LPSTR) SelUserFullName);
 SetDlgItemText(hDlgUserInfo, IDC_FULLNAME, (LPSTR) Text);

 wsprintf(Text, ": %02d/%02d/%02d %02d:%02d:%02d",
 SelUserLoginTime[1], SelUserLoginTime[2], SelUserLoginTime[0],
 SelUserLoginTime[3], SelUserLoginTime[4], SelUserLoginTime[5]);
 SetDlgItemText(hDlgUserInfo, IDC_LOGINTIME, (LPSTR) Text);

 wsprintf(Text, ": %02X%02X%02X%02X",
 SelUserNetworkAddr[0], SelUserNetworkAddr[1],
 SelUserNetworkAddr[2], SelUserNetworkAddr[3]);
 SetDlgItemText(hDlgUserInfo, IDC_NETWORK, (LPSTR) Text);

 wsprintf(Text, ": %02X%02X%02X%02X%02X%02X",
 SelUserNodeAddr[0], SelUserNodeAddr[1], SelUserNodeAddr[2],
 SelUserNodeAddr[3], SelUserNodeAddr[4], SelUserNodeAddr[5]);
 SetDlgItemText(hDlgUserInfo, IDC_NODE, (LPSTR) Text);

 ShowWindow(hDlgUserInfo, SW_SHOWNORMAL);
}


/*****************************************************************************

 FUNCTION: UserInfo


 PURPOSE : Processes messages for user info box

*****************************************************************************/

long FAR PASCAL UserInfo(HWND hWnd, unsigned wMsg, WORD wParam, LONG lParam)
{
 switch(wMsg)
 {
 case WM_COMMAND :

 if(wParam == IDOK || wParam == IDCANCEL)
 {
 DestroyWindow(hWnd);
 return(0L);
 }
 break;

 case WM_CLOSE :

 DestroyWindow(hWnd);
 return(0L);

 case WM_ACTIVATE :

 hWndCurrent = (wParam == NULL) ? NULL : hWnd;
 break;

 default :

 break;
 }
 return(DefDlgProc(hWnd, wMsg, wParam, lParam));
}






[LISTING THREE]

#include "windows.h"
#include "megaphon.h"


Megaphone ICON MEGAPHON.ICO
Message ICON MESSAGE.ICO
User ICON USER.ICO


Megaphone MENU
BEGIN
 POPUP "&File"
 BEGIN
 MENUITEM "&Exit", IDM_EXIT
 MENUITEM SEPARATOR
 MENUITEM "&About ...", IDM_ABOUT
 END
 MENUITEM "&Refresh!", IDM_REFRESH

END


Megaphone DIALOG 65, 75, 165, 139
CLASS "Megaphone"
CAPTION "Megaphone - NetWare Intercom"
STYLE WS_POPUPWINDOW | WS_CAPTION | WS_MINIMIZEBOX
BEGIN
 CONTROL "Megaphone", -1, "static",
 SS_ICON | WS_CHILD, 24, 6, 0, 0
 CONTROL "&Users", IDC_USERLISTBOXTITLE, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 65, 2, 95, 9
 CONTROL "", IDC_USERLISTBOX, "listbox",
 LBS_STANDARD | LBS_EXTENDEDSEL | WS_VSCROLL | WS_TABSTOP |
 WS_CHILD,
 65, 12, 95, 71
 CONTROL "Ser&vers", IDC_SERVERCOMBOBOXTITLE, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 5, 23, 55, 9
 CONTROL "", IDC_SERVERCOMBOBOX, "combobox",
 CBS_HASSTRINGS | CBS_SORT | CBS_DROPDOWNLIST |
 WS_VSCROLL | WS_TABSTOP | WS_CHILD,
 5, 33, 55, 63
 CONTROL "&Message", -1, "static", SS_LEFT | WS_CHILD,
 5, 80, 55, 9
 CONTROL "", IDC_MESSAGEEDITBOX, "edit",
 ES_LEFT | ES_AUTOHSCROLL | ES_UPPERCASE | WS_BORDER | WS_TABSTOP | WS_CHILD,
 5, 90, 155, 12
 CONTROL "&Send", IDC_SENDBUTTON, "button",
 BS_PUSHBUTTON | WS_TABSTOP | WS_CHILD,
 5, 110, 45, 15
 CONTROL "Se&ttings", IDC_SETTINGSBUTTON, "button",
 BS_PUSHBUTTON | WS_TABSTOP | WS_CHILD,
 60, 110, 45, 15
 CONTROL "&Exit", IDC_EXITBUTTON, "button",
 BS_PUSHBUTTON | WS_TABSTOP | WS_CHILD,
 115, 110, 45, 15
END


Message DIALOG 100, 100, 170, 55
CAPTION "Message"
CLASS "Message"
STYLE WS_POPUPWINDOW | WS_CAPTION | WS_MINIMIZEBOX
BEGIN
 CONTROL "", IDC_REPLYEDITBOX, "edit",
 ES_LEFT | ES_AUTOHSCROLL | ES_UPPERCASE | WS_BORDER |
 WS_TABSTOP | WS_CHILD, 5, 10, 155, 12
 CONTROL "&Reply", IDC_REPLYBUTTON, "button",
 BS_PUSHBUTTON | WS_TABSTOP | WS_CHILD, 5, 30, 45, 15
 CONTROL "&Cancel", IDCANCEL, "button",
 BS_PUSHBUTTON | WS_TABSTOP | WS_CHILD, 60, 30, 45, 15
 CONTROL "&Save", IDC_SAVEBUTTON, "button",
 BS_PUSHBUTTON | WS_TABSTOP | WS_CHILD, 115, 30, 45, 15
END


UserInfo DIALOG 68, 54, 145, 79
CAPTION "User Information"
CLASS "User"

STYLE WS_POPUPWINDOW | WS_CAPTION | WS_MINIMIZEBOX
BEGIN
 CONTROL "User", -1, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 42, 3, 18, 9
 CONTROL ":", IDC_USERNAME, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 60, 3, 82, 9
 CONTROL "Stn", -1, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 42, 12, 18, 9
 CONTROL ":", IDC_STATION, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 60, 12, 77, 9
 CONTROL "Node", -1, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 42, 21, 18, 9
 CONTROL ":", IDC_NODE, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 60, 21, 77, 9
 CONTROL "Full Name", -1, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 3, 30, 39, 9
 CONTROL ":", IDC_FULLNAME, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 42, 30, 95, 9
 CONTROL "Login Time", -1, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 3, 39, 39, 9
 CONTROL ":", IDC_LOGINTIME, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 42, 39, 95, 9
 CONTROL "Network", -1, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 3, 48, 39, 9
 CONTROL ":", IDC_NETWORK, "static",
 SS_LEFTNOWORDWRAP | WS_CHILD, 42, 48, 95, 9
 CONTROL "User", -1, "static",
 SS_ICON | WS_CHILD, 11, 6, 15, 15
 DEFPUSHBUTTON "OK", IDOK, 39, 60, 45, 15
END


Settings DIALOG 11, 21, 175, 65
CAPTION "Settings"
STYLE WS_POPUPWINDOW | WS_CAPTION
BEGIN
 CONTROL "&Incoming Messages", -1, "button",
 BS_GROUPBOX | WS_CHILD, 5, 5, 75, 35
 CONTROL "Accept", IDC_ACCEPTMESSAGES, "button",
 BS_AUTOCHECKBOX | WS_TABSTOP | WS_CHILD, 10, 15, 60, 10
 CONTROL "Iconize", IDC_ICONIZEMESSAGES, "button",
 BS_AUTOCHECKBOX | WS_TABSTOP | WS_CHILD, 10, 25, 60, 10
 CONTROL "&Users to scan", -1, "button",
 BS_GROUPBOX | WS_CHILD, 85, 5, 85, 35
 CONTROL "Attached users only", IDC_ONLYATTACHEDUSERS, "button",
 BS_AUTORADIOBUTTON | WS_CHILD, 90, 15, 75, 10
 CONTROL "Include unattached", IDC_ALLUSERSINBINDERY, "button",
 BS_AUTORADIOBUTTON | WS_CHILD, 90, 25, 75, 10
 CONTROL "&OK", IDOK, "button",
 BS_DEFPUSHBUTTON | WS_TABSTOP | WS_CHILD, 55, 45, 55, 15
END


About DIALOG 22, 17, 110, 80
CAPTION "About"
STYLE WS_POPUPWINDOW | WS_CAPTION
BEGIN
 CTEXT "Megaphone - NetWare Intercom", -1, 0, 5, 110, 8
 CTEXT "Version 1.0", -1, 0, 14, 110, 8
 CTEXT "by Mike Klein", -1, 0, 22, 110, 8
 ICON "User", -1, 13, 35, 0, 0
 ICON "Megaphone", -1, 48, 35, 0, 0
 ICON "Message", -1, 82, 35, 0, 0
 DEFPUSHBUTTON "OK", IDOK, 33, 60, 45, 15
END







[LISTING FOUR]

; module-definition file for Megaphone -- used by LINK.EXE

NAME Megaphone ; application's module name
DESCRIPTION 'Megaphone - NetWare Intercom'
EXETYPE WINDOWS ; required for all Windows applications
STUB 'WINSTUB.EXE' ; Generates error message if application
 ; is run without Windows

;CODE can be moved in memory and discarded/reloaded
CODE PRELOAD MOVEABLE DISCARDABLE

;DATA must be MULTIPLE if program can be invoked more than once
DATA PRELOAD MOVEABLE MULTIPLE

HEAPSIZE 1024
STACKSIZE 5120 ; recommended minimum for Windows applications

; All functions that will be called by any Windows routine
; MUST be exported.

EXPORTS
 MainWndProc @1
 About @2
 UserInfo @3
 MessageHandler @4
 Settings @5







[LISTING FIVE]

# Standard Windows make file. The utility MAKE.EXE compares the
# creation date of the file to the left of the colon with the file(s)
# to the right of the colon. If the file(s) on the right are newer
# than the file on the left, Make will execute all of the command lines
# following this line that are indented by at least one tab or space.
# Any valid MS-DOS command line may be used.

# This line allows NMAKE to work as well


all: megaphon.exe

# Update the resource if necessary

megaphon.res: megaphon.rc megaphon.h megaphon.ico
 rc -r megaphon.rc

# Update the object file if necessary

megaphon.obj: megaphon.c megaphon.h
 cl -W4 -c -AS -Gsw -Oad -Zp megaphon.c

# Update the executable file if necessary, and if so, add the resource back
# in.

megaphon.exe: megaphon.obj megaphon.def
 link /NOD megaphon,,, libw slibcew snit, megaphon.def
 rc megaphon.res

# If the .res file is new and the .exe file is not, update the resource.
# Note that the .rc file can be updated without having to either
# compile or link the file.

megaphon.exe: megaphon.res
 rc megaphon.res






































March, 1991
 REMOTE CONNECTIVITY FOR PORTABLE TERMINALS: PART II


Developing the VT100 terminal emulation application




Dan Troy


Dan is software manager at Murata Hand-Held Terminals and is currently
developing more operating system firmware and application software for the
Links product line. He can be reached at Murata HHT, 9 Columbia Dr., Amherst,
NH 03031.


Last month I described the development of the software that processed the
standard VT100 commands and screen data, resulting in the 24 x 80-character
VT100 image. The resulting image was virtual, because the Links terminal can
physically display a maximum of only 12 x 20 fixed-width characters. This
logically leads to the question: How can we present the 24 x 80-character
VT100 image on a hand-held terminal in an easily usable and readable way?
Furthermore, what other screens are needed to allow a user to connect to the
host and configure the terminal? With these questions in mind, this month I'll
discuss the development of an application that emulates a VT100 terminal using
just about every feature of the Links touch-sensitive display, including
graphics.


Screen Logic and Hierarchy


To display a series of screens using the special Links functions, we
implemented the pseudocode shown in Example 1. As you can see, the display and
processing of the Setup/Connect screen is repeated whenever the user returns
from either of the two possible choices, setup or connect.
Example 1: Pseudocode that describes how screens are displayed

 void setup_connect_screen (void)
 {
     while(TRUE)
     {
         Clear screen
         Display LINKS100 connect screen title and revision
         Display and activate SETUP and CONNECT keys
         Wait until a key is pressed and get it
         IF key is SETUP then perform LINKS100 setup
         ELSE attempt to connect to host (CONNECT key)
     }
 }

Apart from the while(TRUE) loop, the logic for every screen on the Links
terminal is similar: the programmer must clear the previous screen and paint
the current one. The terminal does not remember previous screens; to redisplay
one, the application must repaint it.
The first phase in developing the Links100 application was to lay out the
screen hierarchy depicted in Figure 1. Setup/Connect is the initial screen on
power-up. From this screen, the user can either attempt to connect to the host
or select the setup option. If Setup Keyboard is selected, the user is
presented with key click on/off and key repeat on/off. If Setup Ports is
selected, the user can choose to set up either the Links modem or the RS-232
port. Setting up the RS-232 port allows you to set parity, the number of
stop bits, and the baud rate. Setting up the modem allows you to set only
parity and stop bits, because the internal modem is fixed at 1200 baud.
We found that most systems require some sort of character or series of
characters upon physical connection. If the host doesn't get what it expects,
it won't respond. To accommodate this, we modified the Links100 setup to allow
the user to enter a connect string that would be sent upon an attempt to
connect to the host. Many systems require only a single carriage return
character (ASCII 13), although others may require more than one character.
Because the Links terminal has two ports, modem and RS-232, we decided to
implement two paths within the software to establish each of these linkages.
If users wish to connect to the host through the modem, they can select the
active port option (modem or RS-232). When the modem is selected, a dial
screen appears in the Links100 application. This allows users to enter or edit
the phone number and to embed Hayes-compatible commands (pause, wait for
dial tone, pulse/tone selection, and flash hook) in it. Upon modem connection,
the connect string is sent to the host, but because the host may not be ready
to accept it immediately, connect string transmission repeats for several
seconds, until connection is made. Direct connection to a host via the RS-232
port is the same as through a modem, except that dialing and modem connection
are not required. The connect string is sent in the same manner, and if a
connection isn't made, a message screen notifies the user that the Links100
was unable to connect to the host. The user can then return to the primary
screen and try to reconnect.
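The retry behavior just described can be sketched in C. This is only an illustration under stated assumptions: the article does not show the Links port functions, so FakeHost, fake_send, and fake_responded are hypothetical stand-ins for the real modem or RS-232 calls, and a retry count stands in for the "several seconds" of repetition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simulated host; not part of the Links API. */
typedef struct {
    int sends_seen;   /* connect strings received so far        */
    int ready_after;  /* host answers once this many have come  */
} FakeHost;

/* Stand-in for transmitting a string through the active port. */
static void fake_send(FakeHost *h, const char *s)
{
    (void)s;
    h->sends_seen++;
}

/* Stand-in for "has the host answered yet?" */
static int fake_responded(const FakeHost *h)
{
    return h->sends_seen >= h->ready_after;
}

/* Send the connect string repeatedly; return 1 once the host answers,
   or 0 if max_tries attempts pass without a response. */
int try_connect(FakeHost *host, const char *connect_string, int max_tries)
{
    int attempt;

    assert(host != NULL && connect_string != NULL);
    for (attempt = 0; attempt < max_tries; attempt++) {
        fake_send(host, connect_string);
        if (fake_responded(host))
            return 1;
        /* A real version would pause here so the retries span seconds. */
    }
    return 0;
}
```

A return value of 0 corresponds to the "unable to connect" message screen; the caller would then return to the primary screen.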
If a connection is made, the next screen displayed is the Section
Information screen, which presents the data currently being received
from the host. The user can then proceed to the Greeking/View Port Access
screen, the various emulated VT100 keyboards, or back to the Setup/Connect
screen.


It's Greek to Me (the Greeking Screen)


Our engineering team came up with the idea of "Greeking" in order to allow the
user to immediately see the general character layout of the screen. Greeking
is used in many word processing and desktop publishing packages to represent
text that cannot be displayed legibly due to insufficient screen pixel
resolution. By representing characters as pixels, we were able to present an
approximate picture of the text on a full 24 x 80 screen; see Figure 2(a).
Because the Links display has 96 rows of pixels, we can represent
each of the 24 VT100 character rows as a single row of pixels, each separated
by a blank row of pixels. This uses a total of 49 pixel rows (24 text rows, 23
separators, a blank row above the first text row, and one below the last),
leaving room for control keys at the bottom of the Links touchscreen, a title
at the top, and a miniaturized terminal around the Greeked text.
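The row arithmetic can be checked with a short C sketch. The function names and the zero-based numbering are assumptions made for illustration; the article does not give the actual routine.

```c
#include <assert.h>

enum { VT_ROWS = 24 };  /* character rows in the VT100 image */

/* Pixel row (zero-based, within the Greeked area) that carries VT100
   row vt_row. Pixel row 0 is the blank row above the first text row. */
int greek_pixel_row(int vt_row)
{
    assert(vt_row >= 0 && vt_row < VT_ROWS);
    return 1 + 2 * vt_row;
}

/* Total pixel rows: 24 text rows + 23 separators + 2 border rows. */
int greek_rows_used(void)
{
    return VT_ROWS + (VT_ROWS - 1) + 2;
}
```

With this layout the last text row lands on pixel row 47, and pixel row 48 is the closing blank row, for 49 rows in all.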
The representation of the columns was more difficult. Even though the Links
terminal has 120 columns of pixels, there was insufficient room to allow a
one-to-one correspondence between the 80 VT100 virtual-image characters and
the columns of pixels on the Links. Space was needed on the Links screen to
accommodate the miniaturized terminal, as well as future control keys. We
determined that we had three pixels available for every four characters in the
VT100 image. To compress four characters into three pixels, a simple algorithm
was developed. For every four characters, the first two were translated
directly -- each was represented by a pixel. The next two characters were
evaluated as follows: If either was non-blank, the result would be a pixel
that was turned on; otherwise, it was turned off. This simple method was found
to be very practical because the representation was not only sufficiently
accurate, but also fast to process, despite the large number of characters. (It
was coded in assembly language because C would be too sluggish.) The assembly
routine processes each of the 24 x 80 characters in the global VT100 image and
sets the appropriate pixels in other global variables. These other global
variables can then be accessed by the Links100 application to display the
Greeked text.
The miniaturized terminal was displayed using the graphics functions of the
Links Speed C package, which allows for customized pixel settings.
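The four-into-three compression is easy to express in C, even though the production routine was written in assembly. The sketch below is an illustration rather than the Links100 code: greek_line is a hypothetical name, a "blank" is assumed to be an ASCII space, and pixels are stored one per byte for clarity instead of being packed into display memory.

```c
#include <assert.h>
#include <stddef.h>

enum { VT_COLS = 80, GREEK_COLS = 60 };

/* Compress one 80-character VT100 row into 60 pixel flags (1 = on).
   vt_line must hold at least VT_COLS characters and pixels at least
   GREEK_COLS bytes. */
void greek_line(const char *vt_line, unsigned char *pixels)
{
    int group;

    assert(vt_line != NULL && pixels != NULL);
    for (group = 0; group < VT_COLS / 4; group++) {
        const char *c = vt_line + group * 4;
        unsigned char *p = pixels + group * 3;

        p[0] = (unsigned char)(c[0] != ' ');  /* first char: direct   */
        p[1] = (unsigned char)(c[1] != ' ');  /* second char: direct  */
        /* third pixel is on if either remaining character is non-blank */
        p[2] = (unsigned char)(c[2] != ' ' || c[3] != ' ');
    }
}
```

Calling greek_line once per VT100 row produces the 24 rows of 60 flags that the display code would then plot.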
We then added iconified control keys on the bottom row of touch key areas.
Figure 2(b) shows the stop, reverse image, view mode, and cursor section keys.
The stop key gave us the ability to terminate the current session, logging off
if attached to a modem. The reverse image key allows the Links100 to toggle
between white-on-black and black-on-white. (This was a relatively simple
addition because the Links operating system has a global "reverse video"
flag.) The view mode key toggles between two ways of presenting the segmented
VT100 image (described in the next section), and the cursor section key shows
in which section the cursor is currently located. Each section represents an
8-row by 20-column section of the VT100 image, for a total of 12 sections. The
small black square within the cursor section key indicates which of the 12
sections the cursor is currently situated in. We found this to be very
informative to the user, because the cursor could not be practically presented
on the Greeked screen.
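Locating the cursor's section is a matter of integer division. The following C sketch assumes zero-based cursor coordinates and numbers the 12 sections from left to right and top to bottom; both conventions are assumptions, since the article does not say how the sections are indexed internally.

```c
#include <assert.h>

enum {
    VT_ROWS = 24, VT_COLS = 80,
    SECTION_ROWS = 8, SECTION_COLS = 20,
    SECTIONS_ACROSS = VT_COLS / SECTION_COLS   /* 4 */
};

/* Return the section (0..11) containing the cursor at (row, col). */
int cursor_section(int row, int col)
{
    assert(row >= 0 && row < VT_ROWS);
    assert(col >= 0 && col < VT_COLS);
    return (row / SECTION_ROWS) * SECTIONS_ACROSS + col / SECTION_COLS;
}
```

The result selects which of the 12 positions inside the cursor section key gets the small black square.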


View Modes


Our primary customer requirement was to present the VT100 virtual image in
discrete sections, which we call "field" mode. In field mode, horizontal and
vertical scrolling enables traversal among the 12 sections. In the other
operating mode, "wrap" mode, an entire 80-column VT100 line is displayed in a
discrete section without the need for horizontal scrolling by wrapping each
Links line down onto the next Links line. The field and wrap modes are shown
in Figure 3 and Figure 4, respectively.
As mentioned earlier, each field mode section is 8 rows by 20 columns. This
allows space for control and VT100 keyboard-access touch keys, while at the
same time maximizing screen usage. But how do we access a specific section?
The easiest way that we could find was to make the VT100 miniaturized icon
touch-sensitive in 12 discrete locations. Each specific touch-sensitive area,
or viewport, is defined (using Links C language extensions to the operating
system) as a unique key, and corresponds to a unique section. When the desired
section on the VT100 icon is touched, the Links100 application detects a
key-press, and processes the appropriate section key. The special functions
used are char key_pressed(void), returning a boolean, and char get_key(void),
returning the ASCII key value assigned to the key. We sized the VT100 icon to
contain the 12 keys necessary to simply "point and shoot" for quick and easy
section access. This works out perfectly with condensing four characters into
three for the Greeking, because an 80-character row can be represented by 60
pixels. Each touch-sensitive key area is 15 pixels wide, so four keys across
give us exactly 60 pixels! A section information screen display similar to
that in Figure 3 appears once a particular touch-sensitive area is selected.
(We have applied for a patent for this "zooming in" technique.) A Section
Information Display screen contains the keys for scrolling left, right, up,
and down (represented by an appropriate arrow) and selecting an alphabetic,
numeric, auxiliary, or control keyboard. The section icon, Figure 2(b),
displays the currently selected section for the user's easy reference.
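The reason the 15-pixel keys line up with the 20-column sections can be verified in C. The helper names below are hypothetical, and the mapping of the third and fourth characters of each group onto a shared pixel follows the Greeking rule described earlier.

```c
#include <assert.h>

enum { VT_COLS = 80, GREEK_COLS = 60, KEY_WIDTH = 15 };

/* Greeked pixel column (0..59) that represents VT100 column vt_col. */
int greek_col_for_vt_col(int vt_col)
{
    int group, offset;

    assert(vt_col >= 0 && vt_col < VT_COLS);
    group = vt_col / 4;
    offset = vt_col % 4;
    if (offset > 2)
        offset = 2;          /* 3rd and 4th characters share a pixel */
    return group * 3 + offset;
}

/* Horizontal touch key (0..3) that covers a Greeked pixel column. */
int key_for_greek_col(int greek_col)
{
    assert(greek_col >= 0 && greek_col < GREEK_COLS);
    return greek_col / KEY_WIDTH;
}
```

For every VT100 column, the key found this way equals the column divided by 20, so touching the Greeked text always selects the section that holds the text under the finger.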

To provide text continuity when scrolling the currently displayed screen, we
decided that each scroll key actuation would scroll only half a section. To
redisplay the VT100 icon, it seemed logical to use the current screen section
icon as a touch-sensitive key. When it is pressed, the Links100 application detects
this key and redisplays the VT100 icon screen. Each unique screen corresponds
to a C language function. This allows the Links100 application code to
redisplay any screen by simply returning from one screen function, or calling
another screen function from the current one.


Wrap Text Mode


Wrap text mode is particularly useful in applications such as electronic mail,
as it does not require right and left scrolling. Instead of dividing the VT100
image into 12 sections as we did in field mode, we divided it into five
primary sections, each representing five rows. Each section is further divided
into two subsections, giving a total of ten. Each of the primary sections is
accessed by its appropriate section icon (see Figure 4), and each subsection
can contain up to 200 characters (10 rows by 20 characters). By selecting the
appropriate section icon, the user can read the first half of the
corresponding text; remaining text in the primary section is accessed by
scrolling down.
The view mode icon allows the selection of wrap mode. When wrap mode is
selected, the VT100 icon becomes touch-insensitive. The wrap mode icons will
disappear once the view mode is toggled back to field mode. Pressing the
cursor section icon will cause the section containing the cursor to appear.
This is frequently where the end of the incoming text is located.
Because many lines of text in wrap mode may have trailing spaces, we
introduced space compression. Any such line of text taken from the VT100 image
has its trailing spaces removed, so that the text on screen is substantially
more readable. Often, a selected section's text then fits on a single screen
rather than spilling into its second subsection, reducing the need for scrolling.
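Wrapping with space compression might look like the following C sketch. It is an illustration under assumptions: wrap_vt_line is a hypothetical name, only trailing spaces are removed, and an entirely blank VT100 line is assumed to produce no Links lines at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { VT_COLS = 80, LINKS_COLS = 20 };

/* Wrap one 80-character VT100 line onto 20-column Links lines after
   dropping trailing spaces. Returns the number of lines written to out
   (at most max_out). vt_line must hold at least VT_COLS characters. */
int wrap_vt_line(const char *vt_line, char out[][LINKS_COLS + 1], int max_out)
{
    int len = VT_COLS;
    int lines = 0;
    int pos;

    assert(vt_line != NULL && out != NULL);

    /* Space compression: ignore everything after the last non-blank. */
    while (len > 0 && vt_line[len - 1] == ' ')
        len--;

    for (pos = 0; pos < len && lines < max_out; pos += LINKS_COLS) {
        int n = len - pos;

        if (n > LINKS_COLS)
            n = LINKS_COLS;
        memcpy(out[lines], vt_line + pos, (size_t)n);
        out[lines][n] = '\0';
        lines++;
    }
    return lines;
}
```

A line holding only a short prompt therefore occupies one Links line instead of four, which is what lets more of a primary section fit on a single screen.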


VT100 Keyboard Emulation


The emulated VT100 terminal keyboard consists of separate alphabetic, numeric,
control, and auxiliary keyboards which are individually displayed on the
screen. To process entered keystrokes from the keyboards, we used the special
key_pressed and get_key Links operating system functions. As data is entered
from the alphabetic and numeric keyboards, the characters are displayed in the
space available above the keyboards on the Links screen and are not
immediately sent to the host. This allows message editing using the
backspace/delete key. When the Enter key is pressed, the entered data is sent
to the host through the modem or RS-232 port, whichever has been selected in
the VT100 setup portion of the Links100. It is also possible to move from
alphabetic to numeric before sending the message to the host. The control and
auxiliary keyboards send each keystroke immediately.
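The buffered entry used by the alphabetic and numeric keyboards can be sketched in C as follows. The EditLine type, the 80-character capacity, and the key codes for Enter and backspace/delete are assumptions; in the real application the keys would come from get_key.

```c
#include <assert.h>
#include <stddef.h>

enum { EDIT_MAX = 80 };  /* assumed capacity; the article gives no limit */

typedef struct {
    char text[EDIT_MAX + 1];
    int  len;
} EditLine;

/* Feed one keystroke into the edit line. Returns 1 when Enter is
   pressed (the buffer is ready to send to the host), otherwise 0. */
int edit_line_key(EditLine *line, char key)
{
    assert(line != NULL);

    if (key == '\r')
        return 1;                           /* Enter: caller sends text */

    if (key == '\b' || key == 0x7F) {       /* backspace or delete */
        if (line->len > 0)
            line->text[--line->len] = '\0';
        return 0;
    }

    if (line->len < EDIT_MAX) {             /* ordinary character */
        line->text[line->len++] = key;
        line->text[line->len] = '\0';
    }
    return 0;
}
```

Because the buffer belongs to neither keyboard, the user can switch between the alphabetic and numeric screens while composing one message.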


Dynamicicity


Dynamicicity is the degree to which a particular process is regularly updated,
and we decided to implement a way to "dynamically" update the VT100 Greeking
screen icon, the current cursor section icon, and the Information section
display screens. We discovered, however, that if they were updated
continuously as data came in, it was often impossible to understand any of the
information because it was changing so rapidly. The screen was never static,
unless characters stopped coming into the Links from the host. Therefore, we
decreased the degree of dynamicicity to allow for short term display
stability. The icon and text section screens are now updated approximately
every two seconds, allowing the user to watch the VT100 and cursor section
location icons and wait until they become stable. At that point there is a
high probability that the host is waiting for a response from the Links100.
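
The reduced update rate amounts to a simple throttle, sketched here in C. The Refresher type, the millisecond clock passed in by the caller, and the exact 2000 ms interval are assumptions for illustration; the article says only that updates occur approximately every two seconds.

```c
#include <assert.h>
#include <stddef.h>

enum { REFRESH_INTERVAL_MS = 2000 };

typedef struct {
    unsigned long last_paint_ms;  /* time of the most recent repaint    */
    int           dirty;          /* nonzero if data arrived since then */
} Refresher;

/* Note that host data has changed the VT100 image. */
void refresher_data_arrived(Refresher *r)
{
    assert(r != NULL);
    r->dirty = 1;
}

/* Return 1 if the screen should be repainted now, 0 otherwise. */
int refresher_should_paint(Refresher *r, unsigned long now_ms)
{
    assert(r != NULL);
    if (!r->dirty)
        return 0;
    if (now_ms - r->last_paint_ms < REFRESH_INTERVAL_MS)
        return 0;
    r->dirty = 0;
    r->last_paint_ms = now_ms;
    return 1;
}
```

The main loop would call refresher_should_paint on each pass and repaint the icons and section text only when it returns 1.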









































March, 1991
PROGRAMMING PARADIGMS


A Conversation with Bill Duvall




Michael Swaine


For the people at Apple involved with HyperCard, 1990 was a chaotic year. The
Claris software subsidiary of Apple was almost spun off as a separate company,
then pulled back in, whereupon HyperCard was transferred to Claris. As version
2 finally came out more than three years after version 1, it turned into three
products, and there was confusion about the capabilities of the three
HyperCards. Word got out that Claris was considering bundling a crippled,
run-only version with Apple computers. HyperCard inventor Bill Atkinson left
Apple to start his own company, General Magic, taking along Dan Winkler, the
author of HyperTalk, among others. By the end of the year, virtually none of
the original team remained.
In the midst of the confusion, Bill Duvall was coaxed out of semiretirement in
Idaho to direct future HyperCard development.
The choice is suggestive. Duvall is best known for producing professional
development tools, including an early C compiler for the Macintosh. But his
roots are in what he calls investigating "how computers can work as augmenters
of human intellect." If that language sounds vaguely familiar, it's because
it's the way Doug Engelbart, pioneer in human interface design, has talked
about his own work for decades. Duvall comes by the phrasing honestly, having
worked with Engelbart at SRI, and later at Xerox PARC. If Duvall brings to his
work for Claris the interests and skills that he has acquired in his work to
date, we might be tempted to draw certain inferences about the future
evolution of HyperCard. Such as: HyperCard will become a more serious tool for
professional software development. Such as: HyperCard's user interface will
improve.
Duvall invites us to draw these inferences.
DDJ: When did you start at SRI?
Duvall: In the mid-'60s. I worked at a number of places within SRI,
[including] an artificial intelligence group doing robotics, and then I went
to work for Doug Engelbart at what was at that time the Augmented Human
Intellect Research Center, the AHIRC, and later became the Augmentation
Research Center.
DDJ: There's a legendary story about Doug Engelbart involving a brick. Is it
true?
Duvall: Doug used to have some wonderful examples of the power of tools when
they are companions to people. The first time you met Doug he would tell you
that what people produce are products of the people and their tools, and that
the tools are absolutely critical. And he'd reach behind his desk and pull out
a brick with a pencil taped to it. He'd say, "Suppose this was the writing
implement that you had. How would writing have evolved?"
DDJ: Very differently, I suppose. What exactly was Engelbart working on at
that time?
Duvall: Doug at that time was, basically, one of two people in the world
working on hypertext concepts, the other of course being Ted Nelson at Brown.
I think that Doug and his group had far and away the best implementation of a
hypertext system. In fact there's a videotape which I've ordered and am going
to be receiving shortly of the presentation that Doug gave at the 1968 AFIPS
conference. He had computers at SRI with microwave linkup to San Francisco and
a terminal in San Francisco, and he went through his entire NLS system, which
demonstrated linking and graphics and mixed graphics and text. It was very
impressive.
DDJ: I understand you were linked up with SRI remotely, that you were an early
telecommuter.
Duvall: After working there for a couple of years I was taken with the idea of
getting out of the city. We were just beginning to gain some expertise in
modems and things like that, so I went off to Sebastopol and we put in a
"high-speed" phone line (2000 baud), and an Imlac PDS-1, which may have been
the first personal computer in the sense that we know it today. It had a
vector generator display on it and 8K of local memory and a local processor
and some serial communications ports. We had that interfaced to the PDP-10 at
SRI and I worked for a number of years up there doing various things for Doug.
DDJ: Such as?
Duvall: Something called the Network Information Center, which was an
ARPA-net-wide repository for data and information. I designed it and set it
up.
DDJ: When did you move from SRI to Xerox PARC?
Duvall: That was in the early '70s. There were a number of projects I did for
them, all of them having to do with video display systems, operating systems,
text editing, and the genre of office systems.
DDJ: Were you still a telecommuter then?
Duvall: At first I worked with PARC remotely, but then I moved down to the Bay
Area.
DDJ: You've seen a lot of the development of the personal computer and
interactive computing from the inside. You've said that there are some unsung
heroes.
Duvall: When you trace the roots of interactive computing as we know it today,
there have been people who have been instrumental along the way, and one of
them is Bill English. He was a key person at SRI, the coinventor of the mouse.
Later he went to PARC and was instrumental in getting a lot of their programs
going. In fact, he was probably the person who invented the desktop metaphor.
I remember very clearly a meeting where that came up. Another person was Gary
Starkweather at PARC, who was instrumental in the development of the laser
printer. Roger Bates, one of the best hardware designers around, was key in
designing some of the early personal computers. Now he's the vice-president of
development at La Cie.
DDJ: When did you first get involved with the Macintosh?
Duvall: I became involved with the Mac some years before it was released when
Apple asked me to write an assembly language development system for the
Macintosh: an editor, a debugger, an assembler, and a linker. Which I did.
DDJ: Why did Apple come to you?
Duvall: The fellow that Apple hired to head up the Mac as an engineering
manager was Bob Belleville, a person with whom I had done a lot of work at
PARC. He had a need for a development system, knew that I had some expertise
in that area, and asked me to come and do it.
DDJ: Early Macintosh developers may know you as the author of that development
system, and more Mac developers should remember your company, Consulair, for
its C compiler. But Consulair was around before the Mac.
Duvall: Consulair was first formed in the mid-'70s. Most of our work was in
the area of word processing; in particular, we did a lot of work in one of the
early Japanese word processors. There was a fair amount of systems work,
writing operating systems. There was some language development; we did some
debugging systems, things like that. In general, [it was] what you would
classify as advanced development or systems programming.
DDJ: You were doing tools for programmers, then?
Duvall: We always worked on development systems, but primarily up until the
early to mid-'80s we had done development systems as internal tools. In fact,
we first wrote what later became the Consulair C compiler in the late '70s as
a tool for programming our own projects. This was just after the publication
of Kernighan and Ritchie's original book, and I looked around to see what was
the best system programming language around, one that was going to be around
for the next few years, and came to the conclusion that C was probably the
best candidate. So I wrote a C compiler.
DDJ: Where did the commercial Consulair C compiler come into the picture?
Duvall: After the Mac was released to the world it became apparent that there
was a need for a C language development system. Since we had a C compiler that
we had been using internally for five years, it was -- I won't say it was a
short step, but it was a logical step -- to put that in a form that could be
used commercially on the Macintosh.
DDJ: It was well received. Didn't you get the first MacUser Editor's Choice
Award for a development language?
Duvall: It's difficult to answer something like that. I, like many people,
suffer from the programmer's dilemma.
DDJ: The programmer's dilemma?
Duvall: Any time you've worked on a program, you realize all of the faults in
it. It's difficult to go back and look at it and say, yes, that was really
important. Rather, you look at it and say, "Oh my God." But, yeah, I think it
played an important role in the early development of the Mac, although clearly
it was superseded by MPW and Lightspeed.
DDJ: Did the Mac seem at the time like just another development platform?
Duvall: It was not just another development platform. It would be nice to say
that the philosophy and the image of the Mac were well established and that it
was clear that we were setting forth on a new journey. In fact, nobody knew
what the Mac was going to be. There were some good ideas, some good visions,
and a lot of chaos. The project took twice as long as it should have, maybe
three times as long as it should have, and the reason was just sorting out how
the Mac should go, what it should look like.
DDJ: For the past few years, you've been living in semiretirement in Idaho.
But you came back to Silicon Valley to be the director of HyperCard
development for Claris. Why was this job so attractive? In fact, what's a
developer's developer like you doing heading up development on a weekend
developer's tool such as HyperCard?
Duvall: Well, first of all, I don't believe it's a weekend developer's tool.
Part of the beauty of HyperCard is that it has the potential for being a
development tool, not only for the person whose job is something other than
professional development, but for the professional developer as well. For me,
that's part of its intrigue. You have a broad range of skills as a
professional developer. One of those skills is very detailed: assembly
language, C programming, whatever. But another of those skills is higher-level
programming. And if you're at all motivated to get things done, you'd much
rather spend your time at the higher level than at the lower level, because
you're more effective. So I view HyperCard as entirely appropriate for the
professional developer.
DDJ: This from a C compiler author.
Duvall: Make no mistake: I support C as a language. I sold it professionally
for a long time. But this is a question you might almost have asked an
assembly language programmer 10 or 15 years ago about C. "What's a hard-nosed
assembly language programmer like you doing going to a higher-level language
like C?" And the answer is that you want to be able to get more done in a
shorter period of time, and you want it to have less maintenance and be more
transportable and more affordable, and so forth. All of those arguments go
likewise between C and HyperCard.
DDJ: HyperCard 1 was never taken very seriously by professional developers,
but HyperCard 2 definitely is closer to meeting professional developers'
needs.
Duvall: It's a significant step toward bringing HyperCard up to a level where
we can begin to view it as a professional development tool. Although HyperCard
1 did serve many people very well, from a professional development perspective
some people did feel that it had kind of a hobby characteristic. HyperCard 2
begins to address some of those things. You can have multiple stacks open,
multiple windows. There's the increased sophistication of the X commands,
where you have message passing, and can call up an X command interactively.
The thing that is intriguing to me is, where can we move it to now?
DDJ: Is that a good trajectory to sight along, HyperCard 1 to HyperCard 2, to
guess where you might be headed?
Duvall: I don't know. Both HyperCard 1 and HyperCard 2 are broad enough
platforms that I don't think they make a very good gunsight. What you can look
at is the history of HyperCard. In its very spirit is the concept that this
is something that everyone can program in. There's a lot of momentum there,
and that will carry forward. And you can look at my history.
DDJ: In hiring you, Claris appears to be making a statement about the
direction in which it wants to see HyperCard taken.
Duvall: I've been concerned with user interfaces since the very beginning,
starting with the things Doug Engelbart was doing back at SRI. I've also been
concerned with development systems for a long time. You can put those things
together and get some sense of where we're going to go. If I were to try and
summarize it briefly I'd say that the areas in which I see HyperCard advancing
would be the area of usability, both for the professional and for the novice;
and the area of power, in particular for the higher-end developer and the
professional developer.

DDJ: Who are your developer customers today? The developer kit packaging says
the product is for custom development. What about commercial developers? Do
the developers of commercial stacks fall more in the realm of weekend
programmers? That tends to be the general view of stackware as a market, that
it's for weekend programmers.
Duvall: I don't agree with that. There is what I would call professional
development going on in HyperCard in a number of areas. I made a pass through
MacWorld Expo looking to see if the hand of HyperCard was up there. There were
an awful lot of them. I was impressed.
DDJ: The hand of HyperCard? Oh, you mean the idle cursor, which looks like a
hand. That's how you spot a HyperCard application?
Duvall: I asked myself when I came into this position, "How do I tell if this
is a HyperCard product?" And the answer is, if you see a hand up there, you
know it's a HyperCard product. I saw a lot of hands.
DDJ: Otherwise, I guess it could be hard to tell. With HyperCard 2, it's
possible to control the menubar, and display multiple windows of different
sizes. You can pretty much make your HyperCard application look like any
professional Mac application. I guess you're saying these don't just look like
professional applications.
Duvall: The people who exhibit at MacWorld are not weekend programmers. These
are professional programmers.
DDJ: What categories do these professional products fall into?
Duvall: The obvious one is multimedia. HyperCard is very strong in multimedia;
in fact, many people believe that HyperCard is the enabling tool in
multimedia. It's the tool that allows the pieces to be pulled together and
controlled. Education is, of course, in some ways a custom application area,
but beyond that there is a tremendous number of educational products that are
HyperCard-based. Something that wasn't so much visible at MacWorld, but that
represents a significant customer base for HyperCard is people who are
developing front ends to another system, especially in the corporate world.
DDJ: Do you see a change in this audience?
Duvall: Yeah, there's no question that, as we improve HyperCard in the way it
serves the professional developer, I expect to see many more people using
HyperCard as a development platform. We have to increase the power of
HyperCard so that it does the things a professional developer needs; that's
number one. Number two is that the professional development community has to
begin to realize the power that HyperCard has. My gut feeling is that that's
not going to take very long.
DDJ: Well, Claris marketing and customer support should help, because in both
cases you're essentially starting from zero.
Duvall: As we get our customer support programs in place, not only will
developers find themselves with a new tool but they'll also find themselves
with a new set of resources for dealing with that tool.
DDJ: One "resource" that developers would surely like to have is the ability
to port stacks easily to other platforms. It's well known that Claris is
looking at the multiplatform problem very seriously. But taking HyperCard
multiplatform would seem to involve some difficult problems.
Duvall: Absolutely. This is an issue that we're looking at right now. If we
can solve the issues, there's an incredible amount of power there. The easiest
way to realize that power is this: We talked earlier about going from assembly
language to C; well, from C to HyperCard is at least as big a step. Now, if you
look at the platform independence you gain in going from assembly language to
C, you probably gain that much and more in going from C to HyperCard. It could
be an immensely important step.
DDJ: But difficult. I see the problem presented by external commands. How big
a problem would it be to produce a Windows version of HyperCard?
Duvall: There are some immensely difficult problems to solve. If you "solve"
the multiplatform problem for a development system like HyperCard, then you
may to a large degree have solved the multiplatform problem, period. I'm not
so naive as to believe that in one fell swoop you can solve that problem.
We're looking at the possibilities, trying to plot a path through that morass
to a point where you don't have to worry about the platform that you're
programming on.
DDJ: There's also a marketing challenge. At least one competitor is being
bundled with Windows.
Duvall: Um-hm.
DDJ: Even though you're taking on an existing, well-known product, you do have
the opportunity to start pretty much from scratch in building a team. Do you
have a strategy for that? Do you have a philosophy of software development?
Duvall: Yes, I do. There are a couple of things that I feel are important. One
of them is that software is best done by small teams. I'm not a proponent of
the software factory. Then there is what I would call an opportunistic
approach to software; this is one of the reasons you use small teams. Software
is sufficiently complex that you can't go into a large project understanding
everything you need to understand to complete it. That's just the nature of
the beast. No effort to do that has ever been successful. My philosophy is
that you go into a project with an understanding of where you want to go, a
basic understanding of how you might get there, and an awareness of what the
opportunities are along the way. That way, as things change, you can take
advantage of opportunities to help you get to where you want to go. That's
reflected in the way I like to structure software projects, which is to set a
fairly long-term vision two or three years out, or even longer in some cases,
and then in terms of products to have shorter-term product releases, say once
a year. This basically gives you checkpoints where you can look around and
say, "Where am I, where are the opportunities now?" Yet you still have that
long-term vision.
DDJ: Are there things you've picked up along the way that have stayed with
you?
Duvall: I think the thing that you pick up along the way as much as anything
is an image of how things might be.
DDJ: Can you pin that down a little more?
Duvall: It's difficult to define. When I go to work on a very complex project
the first thing I do is to go in and start handling the pieces. I would love
to say that I get a piece of paper out and do a clear, concise analysis of the
problem. I don't. What I do is I pick a corner of it and I fiddle with the
corner, and then I pick another corner and I fiddle with that. And I find that
if I feel the pieces enough and I hold the pieces and manipulate the pieces,
then sooner or later not only does the problem become clearer but a path to
the solution becomes clearer. There's this way of looking at things that you
get by feeling various parts of the problem.
DDJ: And the "image of how things might be"?
Duvall: Well, I think you can take that same concept and spread it over time.
As you work on different projects, these projects are all a part of a larger
problem, so you gain a sort of intuition, or way of looking at things, or a
feeling for how things might be.
DDJ: It sounds like you're defining experience.
Duvall: I'd say that this is the thing you carry forward most. In my case, the
pieces that I've dealt with most have been in the area of how people and
computers can interact; how computers can work, as Doug used to put it, as
augmenters of human intellect. So over the years, that's what I carry forward.
When I see a problem or a possibility, that's the area that I think in.
DDJ: Are there values you keep in mind when designing tools for people?
Duvall: It's a wide-open game.
DDJ: No general rules? No advice for the young developer?
Duvall: If I were to give people advice, I'd say, first of all, understand
whether you're trying to solve a problem or trying to investigate an area. If
you're trying to solve a problem, try to understand what that problem is and
make sure that you're doing something that makes sense toward trying to solve
that problem. If you're not trying to solve a problem, but are just trying to
investigate an area, accept that and don't try to turn it into something that
it's not. That's a very general rule.
DDJ: Do you have a ten-year vision for Bill Duvall?
Duvall: Nope. For myself as well as my work, I tend to look about two or three
years out. I'm perfectly happy looking a couple of years ahead and saying,
"Where are we going next?"




























March, 1991
C PROGRAMMING


Of Mice and Messages




Al Stevens


This month's column delves into several different areas. The theme is
event-driven programming, and to show you how it works, I had to come up with
a software model that uses the architecture of events. To do that, I had to
settle on something that could tell the story in one encapsulated column. So,
besides the event-driven software, you will find drivers for the mouse, the
keyboard, and the screen.
You will learn the two sides of an event-driven architecture. You will see how
the event and message managers work, and you will see how an application
program uses the message system to manage its execution.
In a typical event-driven model, the events happen when you hit a key, slap
the mouse around, or take some other external action that makes the program do
something. The application program that accompanies this column is a simple
screen grabber program for a textmode PC under DOS. It runs from the DOS
command line, but, with the proper driver, it can be a TSR, which means that
it can be resident in memory. The program uses the keyboard or mouse to
describe a rectangle on the screen, which it can write to a text file. I use a
variation of this program to capture screen snapshots for software
documentation.
Although most of the code deals with the mouse hardware and -- next month --
all the ornery stuff that you have to do to get a TSR to work, the purpose of
the program is to demonstrate event-driven programming in the C language. If
you do not want to mess with the TSR part of the project, you can skip next
month, use the stubbed-in main function that substitutes as the popup
function, and run the program each time from the command line. Of course, to
be useful for something other than as an example of event-driven programming,
the screen grabber needs to be memory resident.
Another requisite: I wrote the hardware drivers in Turbo C 2.0 and used the
pseudoregister and other extensions that the Turbo dialect of the language
provides. If you want to use a different compiler, you must port the program to
the other compiler's conventions for managing interrupts, hardware registers,
and the like. No two of them do it quite the same way. The application part of
the code is standard C, but the device drivers use the Turbo extensions.


Events and Messages


How does an event-driven program work? First, consider how you might write
such a program by using the traditional procedural approach. After you
initialized the program, you would poll the keyboard and the mouse. When one
of them did something, you would read the device, determine if what it did had
a bearing on what your program needed, take appropriate action, and return to
poll the hardware again. When one of the user's actions indicated that the
program was done, you would not return to the polling loop, but would shut
things down and exit from the program. Polling the devices, reading them, and
interpreting their different inputs are integral parts of your program.
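The procedural structure just described might be sketched as follows. The
scripted "keyboard" (load_script and poll_key) is a stand-in invented for this
illustration so that the loop can be exercised without real hardware; it is
not part of the column's drivers.

```c
#include <assert.h>
#include <stddef.h>

#define KEY_NONE 0      /* hypothetical "no input this time" value */
#define KEY_ESC  27

/* Hypothetical scripted keyboard that replays a fixed list of keys. */
static const int *script;
static size_t script_len, script_pos;

void load_script(const int *keys, size_t n)
{
    assert(keys != NULL || n == 0);
    script = keys;
    script_len = n;
    script_pos = 0;
}

/* Returns the next scripted key, or KEY_ESC once the script runs out
   so that the loop below always terminates. */
int poll_key(void)
{
    return script_pos < script_len ? script[script_pos++] : KEY_ESC;
}

/* Traditional structure: the application itself polls, reads,
   interprets, acts, and decides when to leave the loop.
   Returns the number of keys it acted on. */
int run_polling_loop(void)
{
    int handled = 0;
    for (;;) {
        int key = poll_key();      /* poll and read the device      */
        if (key == KEY_NONE)
            continue;              /* nothing happened; poll again  */
        if (key == KEY_ESC)
            break;                 /* user is done; leave the loop  */
        handled++;                 /* "take appropriate action"     */
    }
    return handled;
}
```

Note that device handling and application logic share one function here,
which is the coupling the event-driven approach sets out to remove.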
An event-driven program does all of that, too, but in a somewhat different
manner. Instead of polling devices and reacting, an event-driven program uses
an event-dispatching function to call your application's functions when an
event has occurred. The dispatcher sends a message that your function
interprets. So, for example, instead of reading the keyboard, your program
waits until the dispatcher sends you a message telling you that the user
pressed a key.
The event software watches the hardware for you and stores the hardware events
as messages in a queue. The dispatcher retrieves messages from the queue and
sends them to your functions. As you might expect, your functions can put
messages into the queue as well -- messages that your application will receive
in turn from the dispatcher.
What is the point of all this? Why is it any better than the old way? First,
an event-driven program tends to be more orderly. I know, you've heard it all
before. Every new trendy paradigm that someone is puffing up makes the same
claim. This is one you'll just have to see for yourself. The advantages of the
event-driven architecture were not apparent to me until I ported a
conventional program into it. Things got a lot easier to maintain. It isn't
the answer to every programming problem, but in some cases it will make
program design and maintenance easier because it imposes its form of structure
on your code. The more structure, the better.
Second, if you get into Windows programming, you'll find yourself deeply
ensconced in event-driven programming. In the Windows world, you create a
window with an associated window-processing function that you provide. The
Windows dispatcher sends messages to the window-processing functions whenever
an event occurs that might be important to the windows. For example, whenever
you move the mouse across a window, the dispatcher sends the window a message
about it. When you type a key, the dispatcher sends a message to whichever
window is active.
The event-driven paradigm is especially applicable to Windows programming. For
example, the Windows user can choose a menu command by one of several manual
actions. Click the command with the mouse. Press the command's accelerator
key. Open the menu and press the shortcut key. Or move the menu cursor to the
command and press the Enter key. But your application program does not care
about all that. The message manager takes care of it. When the user chooses
the command, your program gets a message to that effect, one message only,
regardless of how the user made the choice. The message tells your program
that the user chose the command. What is more, another part of your program
can simulate the same command choice simply by sending the same message.
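A small sketch can make the "one message, many routes" idea concrete. None of
the names below come from the Windows API. The action codes, MSG_COMMAND,
CMD_SAVE, translate, and app_proc are all hypothetical, chosen only to show
several user actions collapsing into a single message that the program can
also send to itself.

```c
#include <assert.h>

/* Hypothetical raw user actions a message manager might observe. */
enum action { CLICK_ITEM, ACCELERATOR_KEY, MENU_SHORTCUT, ENTER_ON_ITEM };

enum { MSG_NONE, MSG_COMMAND };    /* hypothetical message codes */
enum { CMD_SAVE = 1 };             /* hypothetical command identifier */

struct message { int code; int p1; };

/* The message manager's job: collapse every route to a command
   into the same single message. */
struct message translate(enum action how, int command)
{
    struct message m;
    assert(command > 0);
    switch (how) {
    case CLICK_ITEM:
    case ACCELERATOR_KEY:
    case MENU_SHORTCUT:
    case ENTER_ON_ITEM:
        m.code = MSG_COMMAND;
        m.p1 = command;
        break;
    default:
        m.code = MSG_NONE;
        m.p1 = 0;
        break;
    }
    return m;
}

static int saves;    /* how many times the application "saved" */

/* The application sees only the command, never the route taken. */
void app_proc(struct message m)
{
    if (m.code == MSG_COMMAND && m.p1 == CMD_SAVE)
        saves++;
}
```

Another part of the program can trigger the same behavior by building a
MSG_COMMAND message itself and passing it to app_proc, with no user input
involved.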


Hardware Drivers


To build our event-driven program, first we need some software to recognize
events so they can become messages. The screen grabber program will work with
the keyboard and the mouse, and it deals with the video screen. You want to
move the details of the hardware away from the application code and into the
event manager software. Although most event-driven environments come complete
with hardware drivers, we'll need to build our own. Remember, though, that the
point of all this is in the event and message code that comes later.


The Keyboard and the Cursor


Listings One and Two are keys.h and getkey.c, the functions that manage the
keyboard and the screen cursor. You might have seen similar modules in other
programs. The keys.h file defines some values for keys that the program will
use. It also provides the prototypes for the keyboard and cursor functions.
The getkey.c file has two keyboard and several cursor functions. The keyhit
function returns a true value if you have pressed a key. The getkey function
reads a key from the keyboard, translating function keys into values above
128. The key definitions in keys.h reflect this behavior. The cursor functions
deal with the keyboard cursor, allowing a program to position the cursor, get
its current position, hide it, unhide it, and save and restore the cursor
context and configuration of whatever other program the TSR interrupts.
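The key translation can be sketched in portable C. The sketch assumes the
usual PC convention that an extended key arrives as a zero byte followed by a
scan code, and it assumes that the translation sets bit 7 of the scan code;
the actual getkey.c listing may compute its values differently. The function
name translate_key is invented for this example.

```c
#include <assert.h>

#define F2_SCAN 0x3C    /* PC scan code of the F2 key */
#define UP_SCAN 0x48    /* PC scan code of the up arrow key */

/* Fold a (first, second) byte pair into one key value.
   Ordinary keys pass through unchanged. Extended keys, which
   arrive as 0 followed by a scan code, get bit 7 set, which
   moves them above 128. */
int translate_key(int first, int second)
{
    assert(second >= 0 && second < 128);
    if (first != 0)
        return first;
    return second | 0x80;
}
```

Under these assumptions F2 becomes 188 and the up arrow becomes 200, so a
header such as keys.h could define its function-key values in that range
without colliding with ASCII.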


The Mouse


Listings Three and Four are mouse.h and mouse.c, the hardware drivers for the
PC's mouse. The mouse.h file defines the prototypes and some macros. The
mouse.c file contains the functions that allow a program to determine if a
mouse is installed, read or set the screen coordinates of its cursor, read its
buttons, turn the cursor on and off, and save and restore the mouse context of
the program that the TSR interrupts. There are more mouse operations than
these functions support. I have included only the ones that the program needs.
Microsoft Press publishes the Microsoft Mouse Programmer's Reference that
specifies how all the mouse operations work.


The Message Driver


Listings Five and Six are message.h and message.c, the message driver
software. The message.h file defines the message codes. For this example,
there are only 16 messages. More about them later. An application might add
custom messages to this list.
There are three functions that a program calls to use the event-driven
architecture of the message drivers. The dispatch_message function is the
message dispatching module. You pass it the address of your application's
message processing function, which is a void function that expects to receive
three integer parameters. The first parameter will be the message code, and
the other two are generic parameters to be used by the messages any way they
need.
You call the post_message function to insert messages into the message queue.
The dispatcher will send these messages in the order in which they appear in
the queue. Messages that the post_message function posts are not sent right
away. To send a message immediately, call the send_message function. You would
use send_message when the message returns a value or when its effects must
take place on the spot.


The Messages



The dispatcher sends the START message to your function when the process first
begins, and your program posts the STOP message to tell the program at large
that the message-dispatching loop should end.
In this simple example of an event-driven architecture, a message cycle begins
and ends and has one destination. In more complex systems, such as Windows,
there are many more messages, different categories of messages, and they can
be sent to many different processing functions. For example, you can have
several windows on the screen and each of them can deal with its own copy of
the messages that tell it to start and stop.
The dispatcher sends the KEYBOARD message when the user presses a key. The
first of the two parameters holds the value of the keystroke.
You send the CURRENT_KEYBOARD_CURSOR and KEYBOARD_CURSOR messages yourself. The
CURRENT_KEYBOARD_CURSOR message returns the current cursor coordinates in the
two parameters. The first parameter is the address of where the message will
copy the X coordinate, and the second parameter is the address of the Y
coordinate. The KEYBOARD_CURSOR message changes the cursor location. The first
parameter is the new X coordinate and the second parameter is the new Y
coordinate.
The dispatcher sends the RIGHT_BUTTON and LEFT_BUTTON messages when you press
the right or left button on the mouse. As long as you hold the button down,
the dispatcher will continue to send the message. The dispatcher sends the
MOUSE_MOVED message when you move the mouse, whether a button is pressed or
not. When you release a button, the dispatcher sends the BUTTON_RELEASED
message. In these four mouse messages, the first parameter contains the column
(X) screen coordinate, and the second parameter contains the row (Y) screen
coordinate where the mouse cursor was when the event occurred.
Your program sends the CURRENT_MOUSE_CURSOR, MOUSE_CURSOR, SHOW_MOUSE, and
HIDE_MOUSE messages. The CURRENT_MOUSE_CURSOR message returns the current
mouse screen coordinates in the two parameters. The first parameter is the
address of where the message will copy the X coordinate, and the second
parameter is the address of the Y coordinate. The MOUSE_CURSOR message changes
the mouse cursor location. The first parameter is the new X coordinate and the
second parameter is the new Y coordinate.
The HIDE_MOUSE and SHOW_MOUSE messages hide and display the mouse cursor. The
two parameters are zero for both messages.
There are other messages that a mouse driver might send. You might want to
know when you have double-clicked a location, for example. A keyboard driver
might send the status of the Shift, Alt, and Ctrl keys in the second
parameter. The small set of messages and events in this example serves to
illustrate the principle, but you would use many more in an actual
event-driven architecture.
Your program sends the VIDEO_CHAR message to retrieve a character that is on
the screen. The parameters are the X and Y screen coordinates, and the message
returns the character.
The message.h file defines the RECT structure, which contains the upper left
and bottom right X/Y screen coordinates. The GET_VIDEORECT and PUT_VIDEORECT
messages read and write video data between the screen and your buffer. The
first parameter is the address of a RECT structure that has the coordinates of
the rectangle, and the second parameter is the address of the buffer.
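The rectangle transfer can be illustrated against a simulated screen. In this
sketch the RECT field names (x1, y1, x2, y2) are hypothetical, the screen is a
plain character array without the attribute bytes that real PC video memory
carries, and the two functions are called directly instead of going through
the message system.

```c
#include <assert.h>
#include <string.h>

#define SCREEN_ROWS 25
#define SCREEN_COLS 80

/* Stand-in for video memory: characters only, no attribute bytes. */
static char screen[SCREEN_ROWS][SCREEN_COLS];

/* Field names are hypothetical; the text says only that RECT holds
   the upper left and bottom right X/Y screen coordinates. */
typedef struct { int x1, y1, x2, y2; } RECT;

static void check_rect(const RECT *rc)
{
    assert(rc->x1 >= 0 && rc->x2 < SCREEN_COLS && rc->x1 <= rc->x2);
    assert(rc->y1 >= 0 && rc->y2 < SCREEN_ROWS && rc->y1 <= rc->y2);
}

/* Copy the rectangle, row by row, from the screen into buf.
   Returns the number of characters copied. */
int get_videorect(const RECT *rc, char *buf)
{
    int width, y, n = 0;
    check_rect(rc);
    width = rc->x2 - rc->x1 + 1;
    for (y = rc->y1; y <= rc->y2; y++) {
        memcpy(buf + n, &screen[y][rc->x1], (size_t)width);
        n += width;
    }
    return n;
}

/* The reverse direction: write buf into the rectangle on the screen. */
int put_videorect(const RECT *rc, const char *buf)
{
    int width, y, n = 0;
    check_rect(rc);
    width = rc->x2 - rc->x1 + 1;
    for (y = rc->y1; y <= rc->y2; y++) {
        memcpy(&screen[y][rc->x1], buf + n, (size_t)width);
        n += width;
    }
    return n;
}
```

The caller owns the buffer and must size it to hold the rectangle's width
multiplied by its height.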
Observe that the PARAM typedef in message.h for the message parameters is an
integer. When a message expects addresses, you must cast them to the PARAM
type. If you use a large data model, you should change the PARAM typedef to a
long integer.
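The casting convention might look like the sketch below. Two things here are
assumptions and not taken from the listings. PARAM is declared as intptr_t
instead of int so that the pointer casts stay valid on current compilers, and
send_message has a simplified signature that touches a pair of variables
standing in for the hardware cursor. The message code values are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* The column's PARAM is an int, which can hold a near pointer in a
   small data model. intptr_t is substituted here for portability. */
typedef intptr_t PARAM;

enum { CURRENT_KEYBOARD_CURSOR = 1, KEYBOARD_CURSOR };  /* arbitrary values */

static int cursor_x, cursor_y;    /* stand-in for the hardware cursor */

/* Simplified send_message: returns 1 if the message was recognized. */
int send_message(int msg, PARAM p1, PARAM p2)
{
    switch (msg) {
    case KEYBOARD_CURSOR:             /* parameters are plain values */
        cursor_x = (int)p1;
        cursor_y = (int)p2;
        return 1;
    case CURRENT_KEYBOARD_CURSOR:     /* parameters are addresses */
        assert(p1 != 0 && p2 != 0);
        *(int *)p1 = cursor_x;
        *(int *)p2 = cursor_y;
        return 1;
    default:
        return 0;
    }
}
```

A caller would retrieve the position with a call such as
send_message(CURRENT_KEYBOARD_CURSOR, (PARAM)&x, (PARAM)&y), where x and y
are ints. The same message signature carries values in one case and addresses
in the other, which is why the width of PARAM matters.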


The Message Queue


The functions in message.c maintain a circular queue of messages. The
post_message function adds an entry to that queue. The queue entries contain
the message code and the two parameters. The collect_message function polls
the hardware for events and posts messages to the queue. It also recognizes
when it is executing for the first time and posts the START message. It
returns a true value if messages exist in the queue when it exits. It returns
false if the queue is empty.
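A circular queue of this kind can be sketched in a few lines. The queue size,
the structure layout, the int return values, and the name dequeue_message are
choices made for this illustration; the text does not say how message.c
behaves when the queue is full.

```c
#include <assert.h>

#define MAXMESSAGES 4    /* deliberately tiny so wraparound is easy to see */

struct msg { int code, p1, p2; };

static struct msg queue[MAXMESSAGES];
static int head;     /* index of the oldest queued message */
static int count;    /* number of messages currently queued */

/* Append a message at the tail; returns 0 if the queue is full. */
int post_message(int code, int p1, int p2)
{
    int tail;
    if (count == MAXMESSAGES)
        return 0;
    tail = (head + count) % MAXMESSAGES;    /* wrap past the end */
    queue[tail].code = code;
    queue[tail].p1 = p1;
    queue[tail].p2 = p2;
    count++;
    return 1;
}

/* Remove the oldest message; returns 0 if the queue is empty. */
int dequeue_message(struct msg *out)
{
    assert(out != 0);
    if (count == 0)
        return 0;
    *out = queue[head];
    head = (head + 1) % MAXMESSAGES;        /* wrap past the end */
    count--;
    return 1;
}
```

Because both indexes wrap with the modulus operator, the array is reused
indefinitely and messages always come out in the order they went in.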
The dispatch_message function is the one that an application calls to have
queued messages dispatched. It calls collect_message to have any uncollected
events queued and to see if there are any messages in the queue to send to the
application. If a message is on the queue, dispatch_message dequeues the
message and calls send_message to send it to the application's message
processing function. The send_message function also acts on messages such as
MOUSE_CURSOR that the application sends to drive the hardware. Observe that
when send_message calls the application's message processing function, it
takes action on the messages only if the message processing function returns a
true value. This allows the application to override any default message
processing. In the case of this example, that option never gets used, but the
architecture is typical of event-driven systems. The send_message function can
return a value to the caller. For example, the VIDEO_CHAR message returns the
video character.
The dispatch_message function returns a true value to its caller whether or
not it found a message to send. When it sees that it has just dispatched the
STOP message, it returns a FALSE value instead. Therefore, an application
program should continue to call dispatch_message as long as the function
returns a true value. The program should quit when dispatch_message returns a
false value.
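The dispatching cycle can be condensed into a sketch like the one below. It
is a simplified model, not the message.c listing. Hardware polling is left
out, the message code values are arbitrary, and the message procedure returns
an int so that it can accept or decline default processing as described
above. The run function and the seen and defaults_run counters exist only to
make the behavior observable.

```c
#include <assert.h>

enum { START = 1, STOP, KEYBOARD };    /* code values are arbitrary here */

/* Returns nonzero to accept default processing of the message. */
typedef int (*MSGPROC)(int msg, int p1, int p2);

#define QSIZE 8
static struct { int code, p1, p2; } queue[QSIZE];
static int head, count;      /* circular queue bookkeeping */
static int started;          /* has START been posted yet? */
static MSGPROC app_proc;     /* the application's message function */
static int defaults_run;     /* counts default actions, for illustration */

void post_message(int code, int p1, int p2)
{
    int tail = (head + count) % QSIZE;
    assert(count < QSIZE);
    queue[tail].code = code;
    queue[tail].p1 = p1;
    queue[tail].p2 = p2;
    count++;
}

/* Only the first-call START is modeled; device polling is omitted.
   Returns true if the queue holds a message on exit. */
int collect_message(void)
{
    if (!started) {
        started = 1;
        post_message(START, 0, 0);
    }
    return count > 0;
}

/* Deliver one message now; act on it only if the application agrees. */
int send_message(int code, int p1, int p2)
{
    if (app_proc(code, p1, p2))
        defaults_run++;      /* default processing would happen here */
    return 0;
}

/* True to keep going, false once STOP has been dispatched. */
int dispatch_message(MSGPROC proc)
{
    int code, p1, p2;
    app_proc = proc;
    if (!collect_message())
        return 1;            /* nothing queued; caller should call again */
    code = queue[head].code;
    p1 = queue[head].p1;
    p2 = queue[head].p2;
    head = (head + 1) % QSIZE;
    count--;
    send_message(code, p1, p2);
    return code != STOP;
}

/* A minimal application: ask to stop as soon as it has started. */
static int seen[KEYBOARD + 1];

int message_proc(int msg, int p1, int p2)
{
    (void)p1;
    (void)p2;
    seen[msg]++;
    if (msg == START)
        post_message(STOP, 0, 0);
    return 1;
}

/* The application's whole main loop; returns how many calls it made. */
int run(void)
{
    int calls = 0;
    do {
        calls++;
    } while (dispatch_message(message_proc));
    return calls;
}
```

The application's loop is a single do-while statement. Everything else
happens inside message_proc in response to whatever the dispatcher delivers.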
The mouse_event function calls the mouse functions to build mouse event
messages. The collect_message function calls it, and mouse_event returns any
mouse events that occur. The function reads the mouse position and posts the X
and Y coordinates into global variables. The collect_message function will use
these variables to build the parameters for the message to be queued. If the
mouse position has changed since the last time collect_message called
mouse_event, the function will return the MOUSE_MOVED message. If you have
released a mouse button since the last time the function checked, the function
returns the BUTTON_RELEASED message. If the right or left button is down, the
function returns the RIGHT_BUTTON or LEFT_BUTTON messages.
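The decision logic of such a function might be sketched as follows. The
version here takes the previous and current mouse states as arguments so that
it can be exercised without a mouse, whereas the function in the listing
reads the hardware itself. The order of the checks, the NO_EVENT code, and
the structure layout are assumptions. The button bits follow the Microsoft
mouse driver's convention of bit 0 for the left button and bit 1 for the
right.

```c
#include <assert.h>

enum { NO_EVENT, MOUSE_MOVED, BUTTON_RELEASED, LEFT_BUTTON, RIGHT_BUTTON };

#define LEFT_DOWN  1    /* bit 0 of the button status */
#define RIGHT_DOWN 2    /* bit 1 of the button status */

struct mouse_state { int x, y, buttons; };

/* Globals that the message collector would read to build parameters. */
static int mousex, mousey;

/* Compare two successive readings and name the event, if any. */
int mouse_event(const struct mouse_state *prev,
                const struct mouse_state *now)
{
    assert(prev != 0 && now != 0);
    mousex = now->x;
    mousey = now->y;
    if (now->x != prev->x || now->y != prev->y)
        return MOUSE_MOVED;
    if ((prev->buttons & ~now->buttons) != 0)
        return BUTTON_RELEASED;     /* a button that was down is now up */
    if (now->buttons & RIGHT_DOWN)
        return RIGHT_BUTTON;
    if (now->buttons & LEFT_DOWN)
        return LEFT_BUTTON;
    return NO_EVENT;
}
```

Because a held button keeps producing a button event on every call, the
application receives the repeated LEFT_BUTTON or RIGHT_BUTTON messages that
the text describes.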


The Screen Grabber


You can see that the message and event functions are small. It doesn't take
much code to implement a simple event-driven environment. The strength of the
approach is in how you use it. Listing Seven is copyscrn.c, the event-driven
application that illustrates the use of the event and message manager
software. Observe the #ifdef TSR statement at the beginning of copyscrn.c.
We'll use that compile-time conditional expression next month to install this
program as a memory resident utility.
The program begins by creating a text disk file named "grab.dat." Then it
makes repeated calls to dispatch_message, passing the address of the
message_proc function. These calls continue until the dispatch_message
function returns a false value, whereupon the program closes the
text file and returns.
From this point on, think of this program as an event-driven system. The only
way anything will happen is when the user does something with the keyboard or
mouse. These events will become messages that the program's message_proc
function receives and processes.


The START Message


If you look back into message.c, you will see that the first time copyscrn
calls dispatch_message, the collect_messages function sends the START
message to the application's message_proc function. The application uses the
START message as its cue to initiate processing. In this case, it posts
messages that save the keyboard and mouse cursor positions and set both
cursors to the upper left corner of the screen.


Keyboard Messages


Now the program is running with the keyboard and mouse cursors both in the
upper left corner of the screen and with a text file open. Nothing is going on
out here in application land because the user isn't doing anything. But the
dispatcher is busily watching the keyboard and mouse for some action. Suppose
now that the user presses a key on the keyboard. Look back into message.c. The
collect_messages function is calling the keyboard driver's keyhit function to
see if a key was pressed. When keyhit returns a true value, collect_messages
posts the KEYBOARD message into the queue with the value returned from getkey
as the first parameter. Because the message-dispatching loop is running,
send_message will eventually send that message to the message_proc function in
the copyscrn.c source file. You can see that the message_proc function
processes the KEYBOARD message, doing different things depending on the value
of the keystroke. It is watching for the up, down, right, and left arrow keys;
the Esc key; the Enter key; and the F2 key. It will ignore any other keys. You
can use the arrow keys to move the cursor all over the screen. When you press
F2, you tell the program that you want to begin describing a screen rectangle
to write to the file. The cursor location when you press F2 will become one of
the corners of the rectangle. When you subsequently move the cursor, the
program defines the rectangle by reversing the video colors of the characters
within the rectangle. If you press F2 again, you are telling the program that
you do not like that rectangle. The reverse video rectangle goes away, and you
can move to a new starting point again and press F2. When the rectangle is
defined, you press the Enter key. To forget the whole thing, you press the Esc
key.
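The values that message_proc compares against come from the way getkey folds a BIOS keystroke into one integer. The sketch below isolates that encoding so it can be tried without a BIOS. The function name encode_key is hypothetical, and the input is assumed to be the usual 16-bit INT 16h result with the scan code in the high byte and the ASCII code in the low byte.

```c
/* Fold a BIOS keystroke word into a single key code.
   Keys with no ASCII value (function and arrow keys) return
   their scan code with the high bit set; others return ASCII. */
static int encode_key(unsigned word)
{
    if ((word & 0xff) == 0)
        return (int)((word >> 8) | 0x80);
    return (int)(word & 0xff);
}
```

With this encoding, F2 (scan code 0x3C) becomes 188 and the up arrow (scan code 0x48) becomes 200, which agree with the F2 and UP definitions in keys.h.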
The forward, upward, backward, and downward functions manage the cursor
position. The highlight function draws and redraws the rectangle as you move
the cursor around. The setstart function turns screen marking on and off and
positions the block markers.
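Because you can move the cursor in any direction from the starting corner, the two corners of the block are not guaranteed to be upper left and lower right. A small sketch of how a block can be put in order before it is used follows. The type cell_rect and the functions normalize_rect and cell_count are illustrative names, not the column's code.

```c
struct cell_rect { int x, y, x1, y1; };   /* two opposite corners */

/* Return the same rectangle with (x,y) as the upper left corner
   and (x1,y1) as the lower right corner. */
static struct cell_rect normalize_rect(struct cell_rect r)
{
    int t;
    if (r.x > r.x1) { t = r.x; r.x = r.x1; r.x1 = t; }
    if (r.y > r.y1) { t = r.y; r.y = r.y1; r.y1 = t; }
    return r;
}

/* Number of character cells the rectangle covers, corners included. */
static int cell_count(struct cell_rect r)
{
    r = normalize_rect(r);
    return (r.x1 - r.x + 1) * (r.y1 - r.y + 1);
}
```

A block dragged from column 7, row 8 up and right to column 10, row 5 normalizes to corners (7,5) and (10,8) and covers 16 cells.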


The STOP Message


Observe the treatment of the Enter ('\r') key and the Esc key in the
message_proc function. Remember that these keys terminate the program by
writing the rectangle to disk or by ignoring it. When their messages arrive,
the message_proc function calls post_message to post the STOP message. The
first parameter has a true value for the Enter key and a false value for the
Esc key. That message, like all messages, will eventually be sent to
message_proc as well. Now observe the function's treatment of the STOP
message. If a block is marked, the function calls highlight to turn it off. If
the first parameter is true, the function calls the writescreen function to
write the block to the disk file. In either case, the function sends messages
to restore the mouse and keyboard cursor to the positions they had when the
program began running. Remember that the dispatch_message function uses the
STOP message to tell it to return a false value to its caller to stop the
program. But it does that after your message-processing function gets a crack
at the STOP message.


Mouse Messages


The mouse works in a manner similar to the keyboard. You can move the mouse
around until its cursor is on the corner -- any corner -- of the rectangle you
want to define. Press the left mouse button and hold it down while you move
the cursor. The program will follow your movements and define the rectangle.
Release the button to stop defining. If you do not like the rectangle being
defined, move to a new corner and press the left button again. To write the
rectangle to disk press the Enter key or the right mouse button. The Esc key
rejects the rectangle and terminates the program.
When you press the left button, the message_proc function receives the
LEFT_BUTTON message. As long as you hold that button down, that message will
continue to come in, so you only want to do something with it the first time.
If the program is not presently marking a block with the mouse, this message
gets it started by saving the coordinates and calling setstart.
The message_proc function gets the MOUSE_MOVED message whenever you move the
mouse. If you are not currently marking the rectangle, the function ignores
the message. If you are marking, the function calls the same forward,
upward, backward, and downward functions that the KEYBOARD message uses to
define the rectangle.
When the BUTTON_RELEASED message comes in, the function notes that it is no
longer marking a block with the mouse. When the RIGHT_BUTTON message comes in,
the function does the same thing that the Enter key value of the KEYBOARD
message does -- it posts a STOP message with a true first parameter to tell
the STOP message handler to write the rectangle to the disk file.
The program allows you to define a rectangle with the mouse and then press F2
to define a different one with the keyboard. It allows you to click the mouse
during a keyboard definition to override the keyboard's rectangle and start a
new one with the mouse.
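The left-button logic in this section boils down to a small state machine: act on the first button message, ignore the repeats, track movement only while marking, and stop on release. Here is a simplified sketch of it. The names are hypothetical, and unlike the column's setstart function, which toggles marking, this version only switches marking on.

```c
enum m_msg { M_LEFT_BUTTON, M_MOUSE_MOVED, M_BUTTON_RELEASED };

struct marker {
    int marking;   /* nonzero while a drag is in progress */
    int starts;    /* how many rectangles have been begun */
    int ax, ay;    /* anchor corner, fixed when the drag begins */
    int cx, cy;    /* moving corner, follows the mouse */
};

static void marker_handle(struct marker *m, enum m_msg msg, int x, int y)
{
    switch (msg) {
    case M_LEFT_BUTTON:
        if (!m->marking) {        /* only the first of the repeats counts */
            m->marking = 1;
            m->starts++;
            m->ax = m->cx = x;
            m->ay = m->cy = y;
        }
        break;
    case M_MOUSE_MOVED:
        if (m->marking) {         /* ignored unless a drag is under way */
            m->cx = x;
            m->cy = y;
        }
        break;
    case M_BUTTON_RELEASED:
        m->marking = 0;
        break;
    }
}
```

However many M_LEFT_BUTTON messages arrive while the button is held, the anchor is set once, and movement after the release leaves the rectangle alone.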



The Message is the Medium


The event-driven architecture of this small program is typical of that of most
event-driven systems. Most of the components are here, albeit on a smaller
scale. You do not need hardware events to use the event-driven architecture.
You could use traditional input/output functions and implement soft events to
manage the processing flow of a program. This is yet another technique, one
that offers a different way to view programming and one that seems to have
some promise to bring order to complex programming problems.
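To make the idea of soft events concrete, here is one possible sketch that builds messages from ordinary character input instead of from hardware. Everything in it is an assumption made for illustration: the names, the choice of 'q' as the stop character, and the use of a function pointer for the character source so that getchar or a canned script can feed the loop.

```c
#include <stdio.h>

enum soft_msg { SOFT_CHAR, SOFT_STOP };

typedef int (*soft_proc)(enum soft_msg msg, int param);
typedef int (*char_source)(void);      /* getchar has this shape */

/* Turn each input character into a message for proc.
   A 'q' becomes the stop message and ends the loop.
   Returns the number of messages sent. */
static int soft_event_loop(char_source next, soft_proc proc)
{
    int c, sent = 0;
    while ((c = next()) != EOF) {
        sent++;
        if (c == 'q') {
            proc(SOFT_STOP, 0);
            break;
        }
        proc(SOFT_CHAR, c);
    }
    return sent;
}

/* Example handler: add up any digits that arrive. */
static int digit_total;
static int digit_adder(enum soft_msg msg, int param)
{
    if (msg == SOFT_CHAR && param >= '0' && param <= '9')
        digit_total += param - '0';
    return 1;
}

/* A canned character source, so the loop can run without a terminal. */
static const char *script = "";
static int script_next(void)
{
    return *script ? (unsigned char)*script++ : EOF;
}
```

Calling soft_event_loop(getchar, digit_adder) would drive the same handler from standard input. Pointing script at a string and passing script_next instead replays a fixed sequence of events, which is handy for testing a message handler.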


Pacific High


As I write this column, I am listening to "Pacific High," a compact disk that
Borland sent me along with a huge calendar with works of art superimposed on
pictures of diskettes. The CD features Philippe Kahn, his flutes,
compositions, and a collection of different musical styles and musicians. It's
mostly jazz, and mostly listenable, but every now and then it reaches into
that post-Coltrane cacophonous sound that makes my cat bark.
The drum machine programmer got a mention in the CD's liner notes. I hope
that's not a new paradigm.


Next Month...


We'll discuss a C language TSR driver that will make this month's screen
grabber and most other C programs memory resident TSR programs. I'll try to
unravel some of the arcane underpinnings of the TSR.


_C PROGRAMMING COLUMN_
by Al Stevens



[LISTING ONE]

/* ----------- keys.h ------------ */

#define TRUE 1
#define FALSE 0

#define ESC 27
#define F2 188
#define UP 200
#define FWD 205
#define DN 208
#define BS 203

int getkey(void);
int keyhit(void);
void curr_cursor(int *, int *);
void cursor(int, int);
void hidecursor(void);
void unhidecursor(void);
void savecursor(void);
void restorecursor(void);
void set_cursor_type(unsigned);
#define normalcursor() set_cursor_type(0x0607)






[LISTING TWO]

/* ----------- getkey.c ---------- */


#include <bios.h>
#include <dos.h>
#include "keys.h"

#define KEYBOARD 0x16
#define ZEROFLAG 0x40
#define SETCURSORTYPE 1
#define SETCURSOR 2
#define READCURSOR 3
#define HIDECURSOR 0x20

static unsigned video_mode;
static unsigned video_page;
static int cursorpos;
static int cursorshape;

/* ---- Test for keystroke ---- */
int keyhit(void)
{
 _AH = 1;
 geninterrupt(KEYBOARD);
 return (_FLAGS & ZEROFLAG) == 0;
}

/* ---- Read a keystroke ---- */
int getkey(void)
{
 int c;
 while (keyhit() == 0)
 ;
 if (((c = bioskey(0)) & 0xff) == 0)
 c = (c >> 8) | 0x80;
 else
 c &= 0xff;
 return c;
}

static void videoint(void)
{
 static unsigned oldbp;
 _DI = _DI;
 oldbp = _BP;
 geninterrupt(0x10);
 _BP = oldbp;
}

void videomode(void)
{
 _AH = 15;
 videoint();
 video_mode = _AL;
 video_page = _BX;
 video_page &= 0xff00;
 video_mode &= 0x7f;
}

/* ---- Position the cursor ---- */
void cursor(int x, int y)
{

 videomode();
 _DX = ((y << 8) & 0xff00) + x;
 _AX = 0x0200;
 _BX = video_page;
 videoint();
}

/* ---- get cursor shape and position ---- */
static void near getcursor(void)
{
 videomode();
 _AH = READCURSOR;
 _BX = video_page;
 videoint();
}

/* ---- Get current cursor position ---- */
void curr_cursor(int *x, int *y)
{
 getcursor();
 *x = _DL;
 *y = _DH;
}

/* ---- Hide the cursor ---- */
void hidecursor(void)
{
 getcursor();
 _CH = HIDECURSOR;
 _AH = SETCURSORTYPE;
 videoint();
}

/* ---- Unhide the cursor ---- */
void unhidecursor(void)
{
 getcursor();
 _CH &= ~HIDECURSOR;
 _AH = SETCURSORTYPE;
 videoint();
}

/* ---- Save the current cursor configuration ---- */
void savecursor(void)
{
 getcursor();
 cursorshape = _CX;
 cursorpos = _DX;
}

/* ---- Restore the saved cursor configuration ---- */
void restorecursor(void)
{
 videomode();
 _DX = cursorpos;
 _AH = SETCURSOR;
 _BX = video_page;
 videoint();
 set_cursor_type(cursorshape);

}

/* ----------- set the cursor type -------------- */
void set_cursor_type(unsigned t)
{
 videomode();
 _AH = SETCURSORTYPE;
 _BX = video_page;
 _CX = t;
 videoint();
}





[LISTING THREE]

/* ------------- mouse.h ------------- */

#ifndef MOUSE_H
#define MOUSE_H

#define MOUSE 0x33

int mouse_installed(void);
int mousebuttons(void);
void get_mouseposition(int *x, int *y);
void set_mouseposition(int x, int y);
void show_mousecursor(void);
void hide_mousecursor(void);
int button_releases(void);
void intercept_mouse(void);
void restore_mouse(void);
void reset_mouse(void);

#define leftbutton() (mousebuttons()&1)
#define rightbutton() (mousebuttons()&2)
#define waitformouse() while(mousebuttons());

#endif





[LISTING FOUR]

/* ------------- mouse.c ------------- */

#include <stdio.h>
#include <dos.h>
#include <stdlib.h>
#include <string.h>
#include "mouse.h"

static void mouse(int m1,int m2,int m3,int m4)
{
 _DX = m4;

 _CX = m3;
 _BX = m2;
 _AX = m1;
 geninterrupt(MOUSE);
}

/* ---------- reset the mouse ---------- */
void reset_mouse(void)
{
 mouse(0,0,0,0);
}

/* ----- test to see if the mouse driver is installed ----- */
int mouse_installed(void)
{
 unsigned char far *ms;
 ms = MK_FP(peek(0, MOUSE*4+2), peek(0, MOUSE*4));
 return (ms != NULL && *ms != 0xcf);
}

/* ------ return true if mouse buttons are pressed ------- */
int mousebuttons(void)
{
 int bx = 0;
 if (mouse_installed()) {
 mouse(3,0,0,0);
 bx = _BX;
 }
 return bx & 3;
}

/* ---------- return mouse coordinates ---------- */
void get_mouseposition(int *x, int *y)
{
 if (mouse_installed()) {
 int mx, my;
 mouse(3,0,0,0);
 mx = _CX;
 my = _DX;
 *x = mx/8;
 *y = my/8;
 }
}

/* -------- position the mouse cursor -------- */
void set_mouseposition(int x, int y)
{
 if(mouse_installed())
 mouse(4,0,x*8,y*8);
}

/* --------- display the mouse cursor -------- */
void show_mousecursor(void)
{
 if(mouse_installed())
 mouse(1,0,0,0);
}

/* --------- hide the mouse cursor ------- */

void hide_mousecursor(void)
{
 if(mouse_installed())
 mouse(2,0,0,0);
}

/* --- return true if a mouse button has been released --- */
int button_releases(void)
{
 int ct = 0;
 if(mouse_installed()) {
 mouse(6,0,0,0);
 ct = _BX;
 }
 return ct;
}

static int mx, my;

/* --- intercept the mouse in case an interrupted program is using it --- */
void intercept_mouse(void)
{
 if (mouse_installed()) {
 _AX = 3;
 geninterrupt(MOUSE);
 mx = _CX;
 my = _DX;
 _AX = 31;
 geninterrupt(MOUSE);
 }
}

/* ----- restore the mouse to the interrupted program ----- */
void restore_mouse(void)
{
 if (mouse_installed()) {
 _AX = 32;
 geninterrupt(MOUSE);
 _CX = mx;
 _DX = my;
 _AX = 4;
 geninterrupt(MOUSE);
 }
}







[LISTING FIVE]

/* ----------- message.h ------------ */

#ifndef MESSAGES_H
#define MESSAGES_H

#define MAXMESSAGES 50


/* --------- event message codes ----------- */
enum messages {
 START,
 STOP,
 KEYBOARD,
 RIGHT_BUTTON,
 LEFT_BUTTON,
 MOUSE_MOVED,
 BUTTON_RELEASED,
 CURRENT_MOUSE_CURSOR,
 MOUSE_CURSOR,
 SHOW_MOUSE,
 HIDE_MOUSE,
 KEYBOARD_CURSOR,
 CURRENT_KEYBOARD_CURSOR,
 VIDEO_CHAR,
 PUT_VIDEORECT,
 GET_VIDEORECT
};

/* ------- defines a screen rectangle ------ */
typedef struct {
 int x, y, x1, y1;
} RECT;

/* ------ integer type for message parameters ----- */
typedef int PARAM;

void post_message(enum messages, PARAM, PARAM);
int send_message(enum messages, PARAM, PARAM);
int dispatch_message(int (*)(enum messages, PARAM, PARAM));
RECT rect(int, int, int, int);

#endif





[LISTING SIX]

/* --------- message.c ---------- */

#include <stdio.h>
#include <dos.h>
#include <conio.h>
#include "mouse.h"
#include "keys.h"
#include "message.h"

static int mouse_event(void);

static int px = -1, py = -1;
static int mx, my;
static int first_dispatch = TRUE;

static struct msgs {
 enum messages msg;

 PARAM p1;
 PARAM p2;
} msg_queue[MAXMESSAGES];

static int qonctr;
static int qoffctr;
static int (*mproc)(enum messages,int,int);

/* ----- post a message and parameters to msg queue ---- */
void post_message(enum messages msg, PARAM p1, PARAM p2)
{
 msg_queue[qonctr].msg = msg;
 msg_queue[qonctr].p1 = p1;
 msg_queue[qonctr].p2 = p2;
 if (++qonctr == MAXMESSAGES)
 qonctr = 0;
}

/* --------- clear the message queue --------- */
static void clear_queue(void)
{
 px = py = -1;
 mx = my = 0;
 first_dispatch = TRUE;
 qonctr = qoffctr = 0;
}

/* ------ collect events into the message queue ------ */
static int collect_messages(void)
{
 /* ---- collect any unqueued messages ---- */
 enum messages event;
 if (first_dispatch) {
 first_dispatch = FALSE;
 reset_mouse();
 show_mousecursor();
 send_message(START,0,0);
 }
 if ((event = mouse_event()) != 0)
 post_message(event, mx, my);
 if (keyhit())
 post_message(KEYBOARD, getkey(), 0);
 return qoffctr != qonctr;
}

int send_message(enum messages msg, PARAM p1, PARAM p2)
{
 int rtn = 0;
 RECT rc;
 if (mproc == NULL)
 return -1;
 if (mproc(msg, p1, p2)) {
 switch (msg) {
 case STOP:
 hide_mousecursor();
 clear_queue();
 mproc = NULL;
 break;
 /* -------- keyboard messages ------- */

 case KEYBOARD_CURSOR:
 unhidecursor();
 cursor(p1, p2);
 break;
 case CURRENT_KEYBOARD_CURSOR:
 curr_cursor((int*)p1,(int*)p2);
 break;
 /* -------- mouse messages -------- */
 case SHOW_MOUSE:
 show_mousecursor();
 break;
 case HIDE_MOUSE:
 hide_mousecursor();
 break;
 case MOUSE_CURSOR:
 set_mouseposition(p1, p2);
 break;
 case CURRENT_MOUSE_CURSOR:
 get_mouseposition((int*)p1,(int*)p2);
 break;
 /* ----------- video messages ----------- */
 case VIDEO_CHAR:
 gettext(p1+1, p2+1, p1+1, p2+1, &rtn);
 rtn &= 255;
 break;
 case PUT_VIDEORECT:
 rc = *(RECT *) p1;
 puttext(rc.x+1, rc.y+1, rc.x1+1, rc.y1+1,(char *) p2);
 break;
 case GET_VIDEORECT:
 rc = *(RECT *) p1;
 gettext(rc.x+1, rc.y+1, rc.x1+1, rc.y1+1,(char *) p2);
 break;
 default:
 break;
 }
 }
 return rtn;
}

/* ---- dispatch messages to the message proc function ---- */
int dispatch_message(
 int (*msgproc)(enum messages msg,PARAM p1,PARAM p2))
{
 mproc = msgproc;
 /* ------ dequeue the next message ----- */
 if (collect_messages()) {
 struct msgs mq = msg_queue[qoffctr];
 send_message(mq.msg, mq.p1, mq.p2);
 if (++qoffctr == MAXMESSAGES)
 qoffctr = 0;
 if (mq.msg == STOP)
 return FALSE;
 }
 return TRUE;
}

/* ---------- gather and interpret mouse events -------- */
static int mouse_event(void)

{
 get_mouseposition(&mx, &my);
 if (mx != px || my != py) {
 px = mx;
 py = my;
 return MOUSE_MOVED;
 }
 if (button_releases())
 return BUTTON_RELEASED;
 if (rightbutton())
 return RIGHT_BUTTON;
 if (leftbutton())
 return LEFT_BUTTON;
 return 0;
}

/* ----------- make a RECT from coordinates --------- */
RECT rect(int x, int y, int x1, int y1)
{
 RECT rc;
 rc.x = x;
 rc.y = y;
 rc.x1 = x1;
 rc.y1 = y1;
 return rc;
}






[LISTING SEVEN]

/* ---------------- copyscrn.c -------------- */
#include <stdio.h>
#include <stdlib.h>
#include "keys.h"
#include "message.h"

static int message_proc(enum messages, int, int);
static void near highlight(RECT);
static void writescreen(RECT);
static void near setstart(int *, int, int);
static void near forward(int);
static void near backward(int);
static void near upward(int);
static void near downward(int);
static void init_variables(void);

static int cursorx, cursory; /* Cursor position */
static int mousex, mousey; /* Mouse cursor position */

static RECT blk;
static int kx = 0, ky = 0;
static int px = -1, py = -1;
static int mouse_marking = FALSE;
static int keyboard_marking = FALSE;
static int marked_block = FALSE;

static FILE *fp;

#ifdef TSR
#define main tsr_program
#endif

/* ---------- enter here to run screen grabber --------- */
void main(void)
{
 fp = fopen("grab.dat", "wt");
 if (fp != NULL) {
 /* ----- event message dispatching loop ---- */
 while(dispatch_message(message_proc))
 ;
 fclose(fp);
 }
}

/* --------- event-driven message processing function ------- */
static int message_proc(
 enum messages message, /* message */
 int param1, /* 1st parameter */
 int param2) /* 2nd parameter */
{
 int mx = param1;
 int my = param2;
 int key = param1;

 switch (message) {
 case START:
 init_variables();
 post_message(CURRENT_KEYBOARD_CURSOR,(PARAM) &cursorx, (PARAM) &cursory);
 post_message(KEYBOARD_CURSOR, 0, 0);
 post_message(CURRENT_MOUSE_CURSOR,(PARAM) &mousex, (PARAM) &mousey);
 post_message(MOUSE_CURSOR, 0, 0);
 break;
 case KEYBOARD:
 switch (key) {
 case FWD:
 if (kx < 79) {
 if (keyboard_marking)
 forward(1);
 kx++;
 }
 break;
 case BS:
 if (kx) {
 if (keyboard_marking)
 backward(1);
 --kx;
 }
 break;
 case UP:
 if (ky) {
 if (keyboard_marking)
 upward(1);
 --ky;
 }
 break;

 case DN:
 if (ky < 24) {
 if (keyboard_marking)
 downward(1);
 ky++;
 }
 break;
 case F2:
 mouse_marking = FALSE;
 setstart(&keyboard_marking, kx, ky);
 break;
 case '\r':
 post_message(STOP, TRUE, 0);
 break;
 case ESC:
 post_message(STOP, FALSE, 0);
 break;
 }
 send_message(KEYBOARD_CURSOR, kx, ky);
 break;
 case LEFT_BUTTON:
 if (!mouse_marking) {
 px = mx;
 py = my;
 keyboard_marking = FALSE;
 setstart(&mouse_marking, mx, my);
 }
 break;
 case MOUSE_MOVED:
 if (mouse_marking) {
 if (px < mx)
 forward(mx-px);
 if (mx < px)
 backward(px-mx);
 if (py < my)
 downward(my-py);
 if (my < py)
 upward(py-my);
 px = mx;
 py = my;
 }
 break;
 case BUTTON_RELEASED:
 mouse_marking = FALSE;
 break;
 case RIGHT_BUTTON:
 post_message(STOP, TRUE, 0);
 break;
 case STOP:
 if (marked_block) {
 highlight(blk);
 if (param1)
 writescreen(rect(min(blk.x, blk.x1),min(blk.y, blk.y1),
 max(blk.x, blk.x1),max(blk.y, blk.y1)));
 }
 send_message(MOUSE_CURSOR, mousex, mousey);
 send_message(KEYBOARD_CURSOR, cursorx, cursory);
 init_variables();
 break;

 default:
 break;
 }
 return TRUE;
}

/* ------- set the start of block marking ------- */
static void near setstart(int *marking, int x, int y)
{
 if (marked_block)
 highlight(blk); /* turn off old block */

 marked_block = FALSE;
 *marking ^= TRUE;
 blk.x1 = blk.x = x; /* set the corners of the new block */
 blk.y1 = blk.y = y;
 if (*marking)
 highlight(blk); /* turn on the new block */
}

/* ----- move the block rectangle forward one position ----- */
static void near forward(int n)
{
 marked_block = TRUE;
 while (n-- > 0) {
 if (blk.x < blk.x1)
 highlight(rect(blk.x,blk.y,blk.x,blk.y1));
 else
 highlight(rect(blk.x+1,blk.y,blk.x+1,blk.y1));
 blk.x++;
 }
}

/* ---- move the block rectangle backward one position ----- */
static void near backward(int n)
{
 marked_block = TRUE;
 while (n-- > 0) {
 if (blk.x > blk.x1)
 highlight(rect(blk.x,blk.y,blk.x,blk.y1));
 else
 highlight(rect(blk.x-1,blk.y,blk.x-1,blk.y1));
 --blk.x;
 }
}

/* ----- move the block rectangle up one position ----- */
static void near upward(int n)
{
 marked_block = TRUE;
 while (n-- > 0) {
 if (blk.y > blk.y1)
 highlight(rect(blk.x,blk.y,blk.x1,blk.y));
 else
 highlight(rect(blk.x,blk.y-1,blk.x1,blk.y-1));
 --blk.y;
 }
}


/* ----- move the block rectangle down one position ----- */
static void near downward(int n)
{
 marked_block = TRUE;
 while (n-- > 0) {
 if (blk.y < blk.y1)
 highlight(rect(blk.x,blk.y,blk.x1,blk.y));
 else
 highlight(rect(blk.x,blk.y+1,blk.x1,blk.y+1));
 blk.y++;
 }
}

/* ------ write the rectangle to the file ------- */
static void writescreen(RECT rc)
{
 int vx = rc.x;
 int vy = rc.y;
 while (vy != rc.y1+1) {
 if (vx == rc.x1+1) {
 fputc('\n', fp);
 vx = rc.x;
 vy++;
 }
 else {
 fputc(send_message(VIDEO_CHAR, vx, vy), fp);
 vx++;
 }
 }
}

/* ------- simple swap macro ------ */
#define swap(a,b) {int s=a;a=b;b=s;}

/* -------- invert the video of a defined rectangle ------- */
static void near highlight(RECT rc)
{
 int *bf, *bf1, bflen;
 if (rc.x > rc.x1)
 swap(rc.x,rc.x1);
 if (rc.y > rc.y1)
 swap(rc.y,rc.y1);
 bflen = (rc.y1-rc.y+1) * (rc.x1-rc.x+1) * 2;
 if ((bf = malloc(bflen)) != NULL) {
 send_message(HIDE_MOUSE, 0, 0);
 send_message(GET_VIDEORECT, (PARAM) &rc, (PARAM) bf);
 bf1 = bf;
 bflen /= 2;
 while (bflen--)
 *bf1++ ^= 0x7700;
 send_message(PUT_VIDEORECT, (PARAM) &rc, (PARAM) bf);
 send_message(SHOW_MOUSE, 0, 0);
 free(bf);
 }
}

/* ---- initialize global variables for later popup ---- */
static void init_variables(void)
{

 mouse_marking = keyboard_marking = FALSE;
 kx = ky = blk.x = blk.y = blk.x1 = blk.y1 = 0;
 px = py = -1;
 mouse_marking = FALSE;
 keyboard_marking = FALSE;
 marked_block = FALSE;
}





March, 1991
STRUCTURED PROGRAMMING


Launching Rubber Chickens




Jeff Duntemann KG7JF


What was the most amazing thing I saw at Comdex/Fall 1990? This: Author Tom
Swan launching rubber chickens from a catapult at the Circus Circus hotel
casino. Circus Circus has a stranglehold on a good niche: They cater to
families with kids. While children are not allowed unattended down on the
floor with all the slot machines, upstairs there is a never-ending collection
of good old fashioned carnival games, waiting for parents to hand a roll of
quarters each to Nickie and Suzie and shag them up the stairs.
The best of these games involves a circular counter about 25 feet in diameter.
At 12 points around the circumference of the counter are small steel
catapults, each of which has a rubber mallet chained to it. Inside the
counter, a circular disk about ten feet in diameter slowly rotates, carrying
about 15 two-gallon stewpots arranged randomly on the disk.
You get three rubber chickens for a dollar. You position a chicken on one end
of the catapult. You clobber the other end with the rubber mallet. If you're
lucky, your chicken flies gracefully through the air and flops into one of the
stewpots, providing you with a stuffed penguin that may have cost the casino
as much as fifty cents. Most of the time, however, the chicken ends up
elsewhere, and you lose.
While watching Tom Swan catapulting rubber chickens with gay abandon, it
occurred to me that this is a fair metaphor for launching a software product
in our high-stakes industry. First of all, how you position the chicken is
critical. Call it a database when it's really a spreadsheet and it won't fly
right. Rubber chickens fly best when you put them butt-forward and tummy-down
on the catapult. Software products tend to fly best when you position them as
something familiar. Calling a product "... like nothing you've ever heard of
before!" no matter how honest an assertion, will usually send the product
flying right over the heads of the people with money in their fists.
Second, how hard you hit the catapult is critical. Hit it only a little, and
both your chickens and your products will fall flat on the floor without ever
approaching the magic disk. Our audience (unlike that of the early '80s) is a
jaded audience. They've seen a lot. Rubber chickens are flying past them all
the time. They won't come looking for you or your product. You have to let
them know that you exist. Good use of PR, effective ads, press tours, all of
these are necessary these days to get the chicken into the air.
Hit the catapult too hard, and your chicken will go rocketing across the
circle into someone else's face, at which point you will be glad the rubber
mallets are chained to the catapults. Similarly, too much hype, too much
noise, too high a profile eventually become more irritating than effective and
turn opinion against you, especially when the chicken you launch is too much
of a lightweight to live up to your hype campaign.
Third, when a lot of people are launching rubber chickens at the same time,
some are bound to collide in mid-air. Worse, if one comes down in the pot you
were aiming for just before yours, your chicken will bounce off and you will
lose, regardless of the quality of your aim. Timing is critical; competition
can appear out of nowhere and eat your launch. There are no guarantees.
Fourth, even if you hit the catapult exactly right, remember that the stewpots
are slowly moving around the circle, and when your chicken comes down, the
stewpot you aimed for may well be somewhere else. Niches and audiences are
both moving targets. Sometimes their movements can be predicted, but they
occasionally veer sharply to one side or another. Now and then they vanish
before you ever bring the mallet down.
And finally, it's three rubber chickens for a dollar, win or lose. No refunds.
Most of the money goes out up front. Rarely can any of it be gotten back.
Launching products, unlike launching rubber chickens, is hellishly expensive.
Study. Observe others who have done it well. Practice in the privacy of your
local market. And never forget that the odds always favor someone else.
Tom has evidently had some practice launching rubber chickens. He walked away
with three stuffed penguins. I shot chickens all over hell and gone, and
didn't sink a one into a pot. At some point you just have to ask yourself: Am
I in the wrong game? (Don't despair, though -- later, at a totally different
game, I won a stuffed toucan for Carol on the first try.) Nonetheless, and
this may be the most important insight into the whole matter, I would rather
lose every time than be the poor guy selling the rubber chickens.
In other words, go for it.


Slim Pickens


If you get the impression that it was a dull Comdex, you're right. It's never
a software man's show, and this time less so than ever. "Best of show" in
developer tools was Digitalk's Smalltalk/V for Windows 3.0. Digitalk has
ported their most potent Smalltalk/V PM down to Windows under DOS, and it
looks great. Dan Goldman showed me some wonderful things you could do with
Windows DDE (Dynamic Data Exchange), calling Smalltalk/V from inside Microsoft
Word for Windows. If your machine is fast enough (and these days, most 386
machines are) you can treat Smalltalk/V as a word processor macro language.
Call Smalltalk/V from Word, crunch some data, and pass data back to Word via
DDE. Dan called Smalltalk from Word, generated a 200-digit factorial, and then
pasted a string equivalent of the monster number back into his Word document,
all automatically. This is fine, fine stuff, and I'll have more to say about
Smalltalk/V for Windows once it ships.
The other developer tools there were mostly Windows tools, including a
tantalizing sneak preview of a Turbo Pascal product running under Windows 3.0.
Needless to say, watch this space; when the time comes, you'll hear about it.
There was one hardware concept afoot at Comdex worth mentioning among
developers. Supersmall system units are turning up, often no larger than a fat
trade book, containing all system components (including diskette drive, hard
disk, and VGA graphics) except a keyboard and screen. The ones I saw tended to
be 12 MHz 286 boxes from small Pacific Rim companies, but at least one -- the
rakish Brick from Ergo Computing -- runs a 386SX.
The idea here is that you don't need removable media to take your work home
after hours -- just unplug the keyboard and screen and throw the whole machine
into your briefcase. Running an onsite product demo is easy: Use your
prospect's keyboard and screen, but bring the fully configured machine and
avoid the embarrassment of a demo that won't run (or worse, runs and then
blows up) on a prospect machine. Little by little, the system unit becomes a
swelling on the cable between the keyboard and the screen, which is how I feel
it should be.
I'm interested in machines like this, and very few other writers seem to have
noticed the trend. If your firm sells one or plans to, please get some product
information out to me.


Get the Background Down!


One element of software design that few people bother to mention is pretty
important: Be familiar with the technologies you're going to be using before
you begin the design itself. Don't try to pick it up after you've made several
major design decisions and burned a few bridges. If your design incorporates
SQL and you are a total SQL virgin, pick up a book on SQL and read it
thoroughly. Try to find another product that does SQL and play with it for
a while. You don't have to become an SQL expert to get your design down
correctly, but you'd better understand what SQL is at a high level, and
perhaps take some notes on whatever "gotchas" SQL can hand you.
Furthermore, this is true even if you intend to buy your SQL technology from
someone else. It may be more true, since if you buy SQL in a can you may not
have the opportunity to become an SQL expert by building it yourself.
I've found it broadly true that Pascal programmers tend to shy away from
hardware interrupts, UART registers, and the like. This technology will lie
at the heart of JTERM, so I'm going to spend some time
introducing you to it. Keep in mind that we haven't really gotten into our
design yet. This is research -- though it will save us a lot of time later on.


The Nature of a Communications Link


There are certain things I would just as soon not know -- how bodies are
embalmed comes to mind, as well as how stainless-steel hip joints are
installed in little old ladies. Many people feel the same way about
understanding the nature of communications software. Somewhere along the way,
they learned just enough to suspect they don't want to learn any more.
It's a little messy, but much less messy than code generation (in my
opinion) or housebreaking small dogs. Mostly it's a translation process, as
I've outlined in Figure 1.
The idea in any communications link is to get an 8-bit byte across a
single-conductor line so that it arrives on the other end of the line with all
bits intact. Obviously, the bits have to go one-by-one down the wire in
follow-the-leader fashion. The first translation takes the eight parallel bits
of a byte and sends them down the line one by one. The engine behind this
translation is called a UART, which is a passable euphemism for Universal
Asynchronous Receiver/Transmitter. The UART is one chip and a little support
logic; in a PC-type machine, typically an 8250.
The UART has a register (which is just a fancy name for a memory location)
into which the byte to be transmitted may be written. The UART then takes the
byte in the register and translates it into a carefully timed series of
voltage levels on a single output pin. A +5V level on the pin indicates the
presence of a 1-bit, while a 0V voltage level indicates the presence of a
0-bit.
"Carefully timed" means that each bit is placed on the line for a very short
and precisely measured period of time. Each bit may get three milliseconds,
for example. What this means is that the UART will place a +5V voltage level
on the line for three milliseconds, and by convention we say that a 1-bit has
been placed on the line. When that three milliseconds is up, the next bit (by
convention) is placed on the line. If that bit is also a 1-bit, the voltage on
the line won't change; it simply remains at +5V for another three
milliseconds. If the bit is a 0-bit, however, the voltage level will drop from
+5V to 0 volts, where it will remain for three milliseconds. Then the next bit
will be placed on the line, and the process continues until all 8 bits have
had their moment on the line.
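The bit-by-bit translation just described can be sketched in a few lines of C. This is only an illustration of the idea, not the way a real UART is programmed: the name serialize_byte and the two level constants are made up for the example, and sending the low-order bit first is an assumption (the description above doesn't say which end of the byte leads).

```c
#include <stdint.h>

#define LEVEL_ONE  5   /* volts on the line for a 1-bit, per the text */
#define LEVEL_ZERO 0   /* volts on the line for a 0-bit, per the text */

/* Fill levels[0..7] with the voltage placed on the line during each of
   the eight bit periods. Low-bit-first ordering is an assumption made
   for this sketch. */
void serialize_byte(uint8_t value, int levels[8])
{
    int i;

    for (i = 0; i < 8; i++) {
        levels[i] = (value & 1) ? LEVEL_ONE : LEVEL_ZERO;
        value >>= 1;   /* move the next bit into the low position */
    }
}
```

For the letter A (41H), the line would sit at +5V for one bit period, drop to 0V for five, return to +5V for one, and finish with one period at 0V.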
When you hook a serial cable to your serial port (and a serial port on a PC
is essentially a UART), you are providing a path for these changing voltage
levels. Such a serial cable can get you anywhere within about 50 feet. Beyond
that, the inductance of the wire in the cable begins to bog down the UART's
rather meager ability to switch voltages quickly.
Fifty feet will get you across the room or maybe across the office. It won't
get you across the street, however -- for that, the next translation stage is
necessary. This is the modem, for Modulator/Demodulator.
At least for purposes of transmitting bits, the modem is best thought of as a
tone generator. The venerable and thoroughly obsolete Bell 103 300-baud modem
takes a +5V level (a 1-bit) on the input side and converts it to a burst of audio
tone at a frequency of 1270 Hz. A 0-bit is translated to a burst of audio at
1070 Hz. So rather than switching voltage levels from 0 volts to 5 volts, the
modem plays a little two-note song as bits are presented to it. For each
1-bit, it plays a 1270 Hz tone for three milliseconds; for each 0-bit, it
plays a 1070 Hz tone for three milliseconds.
Audio tones are much more readily sent down very long runs of wire than pure
DC voltage levels. And audio tones are the only way to send signals over
telephone lines, which deliberately block the passage of DC. The modem, in
fact, lets you send data anywhere the phone lines go, which today is almost
anywhere.
Once the transmitting modem's little song reaches the receiving modem on the
other end of the link, the translation process runs in reverse. The receiving
modem takes the bursts of audio tones and translates them into voltage levels.
It sends those voltage levels over a short serial cable to a serial port. The
UART in that serial port takes the sequence of voltage levels sent to it by
the modem and reassembles the original byte in its register. And there it is:
A byte has been broken down into loose bits, translated into loose beeps for
shipment over phone lines, and then reassembled into the original byte, all in
a very short period of time, even when the two ends are separated by half a
planet.



Two Directions at Once


That's about as simple as I can make the process sound. I've left out a number
of critical issues like baud rate and framing just to get the basic idea
across. We'll return to all the details in time.
I've described a communications link in such a way as to make you think it
operates in only one direction. Not true -- your typical serial link can pass
data in both directions at the same time. This is less of a trick than it may
seem, since no one thinks it remarkable that you can talk and listen at the
same time on the telephone. Talking and listening on the phone is possible
because the person on the other end sounds different from the way you sound to
yourself. And so it is with the modem. When two modems connect, they agree to
sing in two different keys so that they can tell one another apart.
When two modems connect over a line, one (typically the one that coordinates
the connection) is called the originate modem, and the other is called the
answer modem. The originate modem uses the set of two audio tones I described
above: 1070 Hz for a 0-bit, and 1270 Hz for a 1-bit. The answer modem, on the
other hand, goes up the scale a little and uses 2025 Hz for a 0-bit and 2225
Hz for a 1-bit. Both modems contain circuitry that can easily distinguish
between the two pairs of tones. Encoded in these two sets of tones, data can
pass one another on the line as obliviously as though the line were in fact
passing data in only one direction.
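The two pairs of tones amount to a small lookup table. A minimal sketch in C, using only the four Bell 103 frequencies quoted above; the enum and the function name bell103_tone_hz are hypothetical, invented for the example.

```c
enum modem_role { ORIGINATE, ANSWER };

/* Return the tone, in Hz, that a modem in the given role sends for
   one bit. The four frequencies are the ones quoted in the text. */
int bell103_tone_hz(enum modem_role role, int bit)
{
    if (role == ORIGINATE)
        return bit ? 1270 : 1070;
    return bit ? 2225 : 2025;
}
```

Because the lowest answer tone (2025 Hz) sits well above the highest originate tone (1270 Hz), each modem can listen for the other's pair while sending its own.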
The cable between the serial port and the modem does not deal in audio tones.
It deals in voltage levels, which do not mix well on a single wire. This is
why your typical serial cable has two data conductors: One for data passing
into the serial port, and another, separate conductor for data passing out of
the serial port. (See Figure 1.) The UART, like the modem, is thoroughly
ambidextrous and can handle data moving through itself in two directions at
once without a burp.
Figure 2 shows that what might seem like a single data channel between the two
ends is actually two separate data channels. Between the UART and the modem,
data is kept separate by being passed on separate wires. Between the two
modems, data is kept separate by being encoded in two different and easily
discernible pairs of audio tones. Operation in this fashion, where data flows
simultaneously in both directions through a link, is called full duplex
operation. There are circumstances where the line will only accommodate data
moving in one direction at a time. This is called half duplex operation, and
is not used very much in asynch work.


Interactive Communication Through a Link


When you speak of a "serial port" on a PC, what you're actually talking about
is a UART. When you send a byte of data out through the link, you write it
into the UART, and the UART takes it from there. Similarly, when the UART
receives a complete byte from the other end of the link, it makes that byte
available in a register for your program to read.
Keeping all of that in mind, the fundamental nature of a simple terminal
program is simple indeed. I've drawn up a flowchart in Figure 3 to show you
how simple it is.
What we have here is simply a loop that checks two things: Availability of a
character coming in from the modem, and the availability of a keystroke typed
at the keyboard. If a character is available from the modem, the program reads
the character and displays it to the screen. If a key was pressed at the
keyboard, the program reads the keystroke and hands it to the modem. One
additional check is made to see whether the keystroke is a predefined exit key
(Alt-X has come to be common) but this isn't essential as long as you don't
mind re-booting out of your program. (Don't laugh; I used to use programs like
that in my CP/M days, and the original version of VisiCalc for the Apple III
was designed such that a cold reboot was necessary to get out of it -- a
rather pathetic antipiracy measure.)
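The loop can be sketched in C against a simulated link, so it runs without any hardware. Everything here is an assumption made for illustration: the sim_link structure stands in for the modem, keyboard, and screen, EXIT_KEY is a made-up code for the exit keystroke, and the loop also stops when the simulated input runs dry (a real terminal program would simply keep polling).

```c
#include <stddef.h>

#define EXIT_KEY 0x2D00   /* hypothetical code standing in for Alt-X */

struct sim_link {
    const char *from_modem;   /* bytes waiting to arrive; '\0' ends them */
    const int  *keys;         /* keystrokes waiting; 0 ends them */
    char   screen[64];        /* what has been displayed */
    size_t screen_len;
    char   to_modem[64];      /* what has been sent out */
    size_t sent_len;
};

void terminal_loop(struct sim_link *s)
{
    for (;;) {
        /* Simulation only: stop when nothing more can ever arrive */
        if (*s->from_modem == '\0' && *s->keys == 0)
            return;

        /* Character available from the modem? Display it. */
        if (*s->from_modem != '\0') {
            char ch = *s->from_modem++;
            if (s->screen_len < sizeof s->screen)
                s->screen[s->screen_len++] = ch;
        }

        /* Keystroke available? Exit on the exit key, else send it. */
        if (*s->keys != 0) {
            int key = *s->keys++;
            if (key == EXIT_KEY)
                return;
            if (s->sent_len < sizeof s->to_modem)
                s->to_modem[s->sent_len++] = (char)key;
        }
    }
}
```

Each pass makes at most one check of each source, which is why anything that slows a pass matters so much to the incoming side.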
If this were all it took to implement a communications link, I wouldn't be
bothering to write whole columns about it. Figure 1 shows you the idea of a
link through a serial port and modem, but it ignores a very serious problem:
The UART doesn't retain characters somewhere until they are read. It simply
places an incoming character up on a rack and expects you to come and get it.
If a second character comes in on the link before you have had a chance to
grab the first one, the first character will be nudged into nothingness and
the second character will take its place.
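A toy model in C makes the problem concrete. This is a sketch of the behavior described above, not of any particular chip's registers: the structure, the function names, and the overruns counter are all invented for the example.

```c
struct rx_register {
    unsigned char data;      /* the single "rack" holding the latest byte */
    int           full;      /* nonzero while data has not been read */
    unsigned      overruns;  /* unread bytes overwritten (illustration only) */
};

/* A byte arrives from the line. If the previous byte was never read,
   it is simply replaced. */
void rx_arrive(struct rx_register *r, unsigned char ch)
{
    if (r->full)
        r->overruns++;
    r->data = ch;
    r->full = 1;
}

/* The program comes to collect. Returns the byte, or -1 if none waits. */
int rx_read(struct rx_register *r)
{
    if (!r->full)
        return -1;
    r->full = 0;
    return r->data;
}
```

If 'A' and then 'B' arrive before the program reads, the first read returns 'B' and 'A' is gone for good.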
If the polling loop is fast enough, you won't lose data. But once you start
adding other features to the program that slow down the loop -- for example,
allowing the remote system to position the cursor through control sequences --
lost characters are almost inevitable.
The solution is wonderfully messy: Allow the UART to generate an interrupt
every time a character comes in on the line. The interrupt forces the CPU to
stop what it's doing and go fetch the character from the UART right now, and
place it someplace temporarily safe: a memory buffer. When the polling loop
has time, it goes to the buffer for characters rather than directly to the
UART.
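The "temporarily safe" place is commonly a circular buffer. Here is a minimal sketch in C, assuming a single interrupt handler that only adds bytes and a single polling loop that only removes them; the names, the 256-byte size, and the policy of dropping a byte when the buffer is full are all assumptions for illustration. The interrupt hookup itself is not shown.

```c
#define RX_BUF_SIZE 256   /* must be a power of two for the mask below */

struct rx_buffer {
    volatile unsigned head;   /* next slot the interrupt handler fills */
    volatile unsigned tail;   /* next slot the polling loop empties */
    unsigned char data[RX_BUF_SIZE];
};

/* Called from the (hypothetical) interrupt handler with the byte just
   fetched from the UART. Returns 1 if stored, 0 if the buffer was full
   and the byte had to be dropped. */
int rx_buffer_put(struct rx_buffer *b, unsigned char ch)
{
    unsigned next = (b->head + 1) & (RX_BUF_SIZE - 1);

    if (next == b->tail)
        return 0;
    b->data[b->head] = ch;
    b->head = next;
    return 1;
}

/* Called from the polling loop. Returns the oldest byte, or -1 if the
   buffer is empty. */
int rx_buffer_get(struct rx_buffer *b)
{
    int ch;

    if (b->tail == b->head)
        return -1;
    ch = b->data[b->tail];
    b->tail = (b->tail + 1) & (RX_BUF_SIZE - 1);
    return ch;
}
```

One slot is always left empty so that "full" and "empty" can be told apart, which means this buffer holds at most 255 bytes between visits from the polling loop.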
Behind that one paragraph hides a universe of complication. It's messy but
comprehensible -- and next time I'll explain the machinery of UART interrupts
and how you can use them.


Products Mentioned


Smalltalk/V Windows
Digitalk
9841 Airport Blvd.
Los Angeles, CA 90045
213-645-1082
Price: $499.95




































March, 1991
GRAPHICS PROGRAMMING


Fast Convex Polygons




Michael Abrash


Last month, we explored the surprisingly intricate process of filling convex
polygons. This month we're going to fill them an order of magnitude or so
faster.
Two thoughts may occur to some of you at this point: "Oh, no, he's not going
to get into assembly language and device dependent code, is he?" and, "Why
bother with polygon filling -- or, indeed, any drawing primitives -- anyway?
Isn't that what GUIs and third-party libraries are for?"
To which I answer, "Well, yes, I am," and, "If you have to ask, you've missed
the magic of microcomputer programming." Actually, both questions ask the same
thing, and that is: "Why should I, as a programmer, have any idea how my
program actually works?"
Put that way, it sounds a little different, doesn't it?
GUIs, reusable code, portable code written entirely in high-level languages,
and object-oriented programming are all the rage now, and promise to remain so
for the foreseeable future. The thrust of this technology is to enhance the
software development process by offloading as much responsibility as possible
to other programmers, and by writing all remaining code in modular, generic
form. This modular code then becomes a black box to be reused endlessly
without another thought about what actually lies inside. GUIs also reduce
development time by making many interface choices for you. That, in turn,
makes it possible to quickly and reliably create programs that will be easy
for new users to pick up, so software becomes easier to both produce and
learn. This is, without question, a Good Thing.
The "black box" approach does not, however, necessarily cause the software
itself to become faster, smaller, or more innovative; quite the opposite, I
suspect. I'll reserve judgement on whether that is a good thing or not, but
I'll make a prediction: In the short run, the aforementioned techniques will
lead to noticeably larger, slower programs, as programmers understand less and
less of what the key parts of their programs do and rely increasingly on
general-purpose code written by other people. (In the long run, programs will
be bigger and slower yet, but computers will be so fast and will have so much
memory that no one will care.) Over time, PC programs will also come to be
more similar to one another -- and to programs running on other platforms,
such as the Mac -- as regards both user interface and performance.
Again, I am not saying that this is bad. It does, however, have major
implications for the future nature of PC graphics programming, in ways that
will directly affect the means by which many of you earn your livings. Not so
very long from now, graphics programming -- all programming, for that matter
-- will become mostly a matter of assembling in various ways components
written by other people, and will cease to be the all-inclusively creative,
mindbendingly complex pursuit it is today. (Using legally certified black
boxes is, by the way, one direction in which the patent lawyers are leading
us; legal considerations may be the final nail in the coffin of homegrown
code.) For now, though, it's still within your power, as a PC programmer, to
understand and even control every single thing that happens on a computer if
you so desire, to realize any vision you may have. Take advantage of this
unique window of opportunity to create some magic!
Neither does it hurt to understand what's involved in drawing, say, a filled
polygon, even if you are using a GUI. You will better understand the
performance implications of the available GUI functions, and you will be able
to fill in any gaps in the functions provided. You may even find that you can
outperform the GUI on occasion by doing your own drawing into a system memory
bitmap, then copying the result to the screen. You will also be able to
understand why various quirks exist, and will be able to put them to good use.
For example, X Window follows the polygon drawing rules described last month
(although it's not obvious from the documentation); if you understood last
month's discussion, you're in good shape to use polygons under X.
In short, even though it runs counter to current trends, it helps to
understand how things work, especially when they're very visible parts of the
software you develop. That said, let's learn more about filling convex
polygons.


Fast Convex Polygon Filling


When last we left the topic of filling convex polygons, the implementation we
had met all of our functional requirements. In particular, it met stringent
rules that guaranteed that polygons would never overlap at shared edges, an
important consideration when building polygon-based images. Unfortunately, the
implementation was also slow as molasses. This month we'll work up polygon
filling code that's fast enough to be truly usable.
Last month's polygon filling code involved three major tasks, each performed
by a separate function: Tracing each polygon edge to generate a coordinate
list (performed by the function ScanEdge); drawing the scanned-out horizontal
lines that constitute the filled polygon (DrawHorizontalLineList); and
characterizing the polygon and coordinating the tracing and drawing
(FillConvexPolygon). The amount of time that the sample program from last
month spent in each of these areas is shown in Table 1. As you can see, half
the time was spent drawing and the other half was spent tracing the polygon
edges (the time spent in FillConvexPolygon was relatively minuscule), so we
have our choice of where to begin optimizing.
Table 1: Separate and combined execution times of the various versions of the
convex polygon drawing code's three major sections when executing the test
program from last month (Listing Three, February 1991). Note that time spent
in main() is not included. All times are in seconds, as measured with Turbo
Profiler on a 20-MHz cached 386 with no math coprocessor installed. C code was
compiled with Turbo C++ with maximum optimization (-G -O -Z -r -a); assembly
language code was assembled with TASM. Percentages of combined times are
rounded to the nearest percent, so the sum of the three percentages does not
always equal 100.

                               Total
                               Polygon
                               Filling  DrawHorizontal               FillConvex
 Implementation                Time     LineList       ScanEdge      Polygon
 -------------------------------------------------------------------------------

 Drawing to display memory in mode 13h

 C floating point scan/        11.69    5.80 (50%)     5.86 (50%)    0.03 (<1%)
 DrawPixel drawing code from
 last month (small model)

 C floating point scan/         6.64    0.49 (7%)      6.11 (92%)    0.04 (<1%)
 memset drawing
 (L1.C, compact model)

 C integer scan/                0.60    0.49 (82%)     0.07 (12%)    0.04 (7%)
 memset drawing
 (L1.C & L2.C, compact model)

 C integer scan/                0.45    0.36 (80%)     0.06 (13%)    0.03 (7%)
 ASM drawing
 (L2.C & L3.ASM, small model)

 ASM integer scan/              0.42    0.36 (86%)     0.03 (7%)     0.03 (7%)
 ASM drawing
 (L3.ASM & L4.ASM, small model)

 Drawing to system memory

 C integer scan/                0.31    0.20 (65%)     0.07 (23%)    0.04 (13%)
 memset drawing
 (L1.C & L2.C, compact model)

 ASM integer scan/              0.13    0.07 (54%)     0.03 (23%)    0.03 (23%)
 ASM drawing
 (L3.ASM & L4.ASM, small model)




Fast Drawing


Let's start with drawing, which is easily sped up. The original code used a
double-nested loop that called a draw-pixel function to plot each pixel in the
polygon individually. That's a ridiculous approach in a graphics mode that
offers linearly mapped memory, as does VGA mode 13h, the mode with which we're
working. At the very least, we could point a far pointer to the left edge of
each polygon scan line, then draw each pixel in that scan line in quick
succession, using something along the lines of *ScrPtr++ = FillColor; inside a
loop.
However, it seems silly to use a loop when the 80x86 has an instruction, REP
STOS, that's uniquely suited to filling linear memory buffers. There's no way
to use REP STOS directly in C code, but it's a good bet that the memset
library function uses REP STOS, so you could greatly enhance performance by
using memset to draw each scan line of the polygon in a single shot. That,
however, is easier said than done. The memset function linked in from the
library is tied to the memory model in use; in small (tiny, small, or medium)
data models memset accepts only near pointers, so it can't be used to access
screen memory. Consequently, a large (compact, large, or huge) data model must
be used to allow memset to draw to display memory -- a clear case of the tail
wagging the dog. This is an excellent example of why, although it is possible
to use C to do virtually anything, it's sometimes much simpler just to use a
little assembly code and be done with it.
At any rate, Listing One shows a version of DrawHorizontalLineList that uses
memset to draw each scan line of the polygon in a single call. When linked to
last month's test program, Listing One increases pure drawing speed
(disregarding edge tracing and other nondrawing time) by more than an order of
magnitude over last month's draw-pixel-based code. This despite the fact that
Listing One requires a large (in this case, compact) data model. Listing One
works fine with Turbo C++, but may not work with other compilers, for it
relies on the aforementioned interaction between memset and the selected
memory model.
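The memory-model wrinkle disappears when the target is an ordinary buffer in system memory rather than display memory. A minimal sketch, assuming a 320x200 bitmap with one byte per pixel laid out the same way as mode 13h; fill_span and its parameters are hypothetical names, not part of the listings, and the function rejects out-of-range spans rather than clipping them.

```c
#include <string.h>

#define BITMAP_WIDTH  320
#define BITMAP_HEIGHT 200

/* Fill pixels x_start..x_end (inclusive) on scan line y of a linear,
   one-byte-per-pixel bitmap. Does nothing for empty spans or spans
   that are not entirely inside the bitmap. */
void fill_span(unsigned char *bitmap, int y, int x_start, int x_end,
               int color)
{
    int width = x_end - x_start + 1;

    if (width <= 0 || y < 0 || y >= BITMAP_HEIGHT ||
        x_start < 0 || x_end >= BITMAP_WIDTH)
        return;
    memset(bitmap + (long)y * BITMAP_WIDTH + x_start, color, (size_t)width);
}
```

The pointer arithmetic is the same as in Listing One: the scan line's starting offset is y times the width, and the left edge of the span is added to that.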
At this point, I'd like to mention that benchmarks are notoriously unreliable;
the results in Table 1 are accurate only for the test program, and then only
when running on a particular system. Results could be vastly different if
smaller, larger, or more complex polygons were drawn, or if a faster or slower
computer/VGA combination were used. These factors notwithstanding, the test
program does fill a variety of polygons of varying complexity sized from large
to small and in between, and certainly the order of magnitude difference
between Listing One and the old version of DrawHorizontalLineList is a clear
indication of which code is superior.
Anyway, Listing One has the desired effect of vastly improving drawing time.
There are cycles yet to be had in the drawing code, but as tracing polygon
edges now takes 92 percent of the polygon filling time, it's logical to
optimize the tracing code next.


Fast Edge Tracing


There's no secret as to why last month's ScanEdge was so slow: It used
floating point calculations. One secret of fast graphics is using integer or
fixed-point calculations, instead. (Sure, the floating point code would run
faster if a math coprocessor were installed, but it would still be slower than
the alternatives; besides, why require a math coprocessor when you don't have
to?) Both integer and fixed-point calculations are fast. In many cases,
fixed-point is faster, but integer calculations have one tremendous virtue:
They're completely accurate. The tiny imprecision inherent in either fixed- or
floating-point calculations can result in occasional pixels being one off from
their proper location. This is no great tragedy, but after going to so much
trouble to ensure that polygons don't overlap at common edges, why not get it
exactly right?
In fact, when I tested out the integer edge tracing code by comparing an
integer-based test image to one produced by floating point calculations, two
pixels out of the whole screen differed, leading me to suspect a bug in the
integer code. It turned out, however, that in those two cases, the floating
point results were sufficiently imprecise to creep from just under an integer
value to just over it, so that the ceil function returned a coordinate that
was one too large. Floating point is very accurate -- but it is not precise.
Integer calculations, properly performed, are.
Listing Two shows a C implementation of integer edge tracing. Vertical and
diagonal lines, which are trivial to trace, are special-cased. Other lines are
broken into two categories: Y-major (closer to vertical) and X-major (closer
to horizontal). The handlers for the Y-major and X-major cases operate on the
principle of similar triangles: The number of X pixels advanced per scan line
is the same as the ratio of the X delta of the edge to the Y delta. Listing
Two is more complex than the original floating point implementation, but not
painfully so. In return for that complexity, Listing Two is more than 80 times
faster -- and, as just mentioned, it's actually more accurate than the
floating point code.
You gotta love that integer arithmetic.
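The error-term idea is easier to see with the special cases stripped away. The sketch below is a simplified variant of the approach in Listing Two, not a replacement for it: it handles only edges that run left to right, has no SkipFirst handling, and writes to a plain array. The function name trace_edge_right and its parameters are assumptions made for the example.

```c
/* Store the X coordinate for each of the `height` scan lines of an edge
   that starts at x1 and moves `width` pixels to the right over its
   length (width >= 0, height > 0). Left-to-right edges only. */
void trace_edge_right(int x1, int width, int height, int *x_out)
{
    int whole_step = width / height;   /* whole pixels advanced per line */
    int remainder  = width % height;   /* leftover tracked by error term */
    int error_term = 0;
    int i;

    for (i = 0; i < height; i++) {
        x_out[i] = x1;
        x1 += whole_step;
        error_term += remainder;
        if (error_term > 0) {          /* leftover has built up to a pixel */
            x1++;
            error_term -= height;
        }
    }
}
```

For an edge from (0,0) to (2,5), this stores 0, 1, 1, 2, 2: on each scan line, the smallest integer X that is not to the left of the true edge, computed without a single floating point operation.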


The Finishing Touch: Assembly Language


The C implementation is now nearly 20 times as fast as the original, good
enough for most purposes. Still, it requires that a large data model be used
(for memset), and it's certainly not the fastest possible code. The obvious
next step is assembly language.
Listing Three is an assembly language version of DrawHorizontalLineList. In
actual use, it proved to be about 36 percent faster than Listing One; better
than a poke in the eye, but just barely. There's more to these timing results
than meets that eye, though. Display memory generally responds much more
slowly than system memory, especially in 386 and 486 systems. That means that
much of the time taken by Listing Three is actually spent waiting for display
memory accesses to complete, with the processor forced to idle by wait states.
If, instead, Listing Three drew to a local buffer in system memory or to a
particularly fast VGA, the assembly implementation might well display a far
more substantial advantage over the C code.
And indeed it does. When the test program is modified to draw to a local
buffer, both the C and assembly language versions get 0.29 seconds faster,
that being a measure of the time taken by display memory wait states. With
those wait states factored out, the assembly language version of
DrawHorizontalLineList becomes almost three times as fast as the C code.
There is a lesson here. An optimization has no fixed payoff; its value
fluctuates according to the context in which it is used. There's relatively
little benefit to further optimizing code that already spends half its time
waiting for display memory; no matter how good your optimizations, you'll get
only a two-times speedup at best, and generally much less than that. There is,
on the other hand, potential for tremendous improvement when drawing to system
memory, so if that's where most of your drawing will occur, optimizations such
as Listing Three are well worth the effort.
Know the environments in which your code will run, and know where the cycles
go in those environments.
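The two-times ceiling mentioned above follows from simple arithmetic: if some portion of the run time can't be reduced at all, the best case is that everything else shrinks to nothing. A small sketch of that calculation; the function name is hypothetical, and fixed_time is assumed to be greater than zero.

```c
/* Best possible overall speedup when `fixed_time` out of `total_time`
   cannot be reduced (display memory wait states, for example), even if
   all remaining time were optimized away. fixed_time must be > 0. */
double best_case_speedup(double total_time, double fixed_time)
{
    return total_time / fixed_time;
}
```

With the figures from Table 1, Listing One's drawing code takes 0.49 seconds, of which 0.29 seconds is display memory wait time, so no rewrite could make it more than about 1.7 times faster when drawing to the screen. Listing Three's 36 percent improvement is already a good share of what was available.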


Maximizing REP STOS


Listing Three doesn't take the easy way out and use REP STOSB to fill each
scan line; instead, it uses REP STOSW to fill as many pixel pairs as possible
via word-sized accesses, using STOSB only to do odd bytes. Word accesses to
odd addresses are always split by the processor into two separate byte accesses. Such
word accesses take twice as long as word accesses to even addresses, so
Listing Three makes sure that all word accesses occur at even addresses, by
performing a leading STOSB first if necessary.
Listing Three is another case in which it's worth knowing the environment in
which your code will run. Extra code is required to perform aligned
word-at-a-time filling, resulting in extra overhead. For very small or narrow
polygons, that overhead might overwhelm the advantage of drawing a word at a
time, making plain old REP STOSB faster.
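The alignment strategy can be shown in C for readers who don't follow assembly. This is a rough illustration of the same three steps (odd leading byte, aligned word fill, odd trailing byte), not a translation of Listing Three; the function name is made up, and memcpy is used for the two-byte store on the assumption that a modern compiler will reduce it to a single word write.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill `count` bytes at dst with `color`, storing two pixels at a time
   at even addresses and single bytes only at the ragged ends. */
void fill_span_words(unsigned char *dst, unsigned char color, size_t count)
{
    uint16_t pair = (uint16_t)(color | ((unsigned)color << 8));

    if (count == 0)
        return;
    if (((uintptr_t)dst & 1) != 0) {   /* odd start: one leading byte */
        *dst++ = color;
        count--;
    }
    while (count >= 2) {               /* word stores at even addresses */
        memcpy(dst, &pair, sizeof pair);
        dst += 2;
        count -= 2;
    }
    if (count != 0)                    /* odd trailing byte, if any */
        *dst = color;
}
```

The two tests and the extra single-byte stores are exactly the overhead discussed above; for a span only a pixel or two wide, they can cost more than the word stores save.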


Faster Edge Tracing


Finally, Listing Four is an assembly language version of ScanEdge. Listing
Four is a relatively straightforward translation from C to assembly, but is
nonetheless almost twice as fast as Listing Two.
The version of ScanEdge in Listing Four could certainly be sped up still
further by unrolling the loops. FillConvexPolygon, the overall coordination
routine, hasn't even been converted to assembly language, so that could be
sped up as well. I haven't bothered with these optimizations because all code
other than DrawHorizontalLineList takes only 14 percent of the overall polygon
filling time when drawing to display memory; the potential return on
optimizing nondrawing code simply isn't great enough to justify the effort.
Part of the value of a profiler is being able to tell when to stop optimizing;
with Listings Three and Four in use, more than two-thirds of the time taken to
draw polygons is spent waiting for display memory, so optimization is pretty
much maxed out. However, further optimization might be worthwhile when drawing
to system memory, where wait states are out of the picture and the nondrawing
code takes a significant portion (46 percent) of the overall time.

Again, know where the cycles go.
By the way, note that all the versions of ScanEdge and FillConvexPolygon that
we've looked at are adapter independent, and that the C code is also machine
independent; all adapter-specific code is isolated in DrawHorizontalLineList.
This makes it easy to add support for other graphics formats, such as the
8514/A, the XGA, or, for that matter, a non-PC system.


Books of the Month


Two books share honors this month. One isn't new, but is well worth having if
you're a PC graphics programmer. Programmer's Guide to PC & PS/2 Video
Systems, by Richard Wilton (Microsoft Press, 1987) is pretty much the standard
graphics reference for the PC, covering all standards up through the VGA,
including MCGA, EGA, Hercules, CGA, and MDA (but not 8514/A). For 8514/A
programming, Graphics Programming for the 8514/A, by Jake Richter and Bud
Smith (M&T Books, 1990), is good; which is lucky, because it's the only 8514/A
book I know of!


_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash


[LISTING ONE]

/* Draws all pixels in the list of horizontal lines passed in, in
 mode 13h, the VGA's 320x200 256-color mode. Uses memset to fill
 each line, which is much faster than using DrawPixel but requires
 that a large data model (compact, large, or huge) be in use when
 running in real mode or 286 protected mode.
 All C code tested with Turbo C++ */

#include <string.h>
#include <dos.h>
#include "polygon.h"

#define SCREEN_WIDTH 320
#define SCREEN_SEGMENT 0xA000

void DrawHorizontalLineList(struct HLineList * HLineListPtr,
      int Color)
{
   struct HLine *HLinePtr;
   int Length, Width;
   unsigned char far *ScreenPtr;

   /* Point to the start of the first scan line on which to draw */
   ScreenPtr = MK_FP(SCREEN_SEGMENT,
         HLineListPtr->YStart * SCREEN_WIDTH);

   /* Point to the XStart/XEnd descriptor for the first (top)
      horizontal line */
   HLinePtr = HLineListPtr->HLinePtr;
   /* Draw each horizontal line in turn, starting with the top one and
      advancing one line each time */
   Length = HLineListPtr->Length;
   while (Length-- > 0) {
      /* Draw the whole horizontal line if it has a positive width */
      if ((Width = HLinePtr->XEnd - HLinePtr->XStart + 1) > 0)
         memset(ScreenPtr + HLinePtr->XStart, Color, Width);
      HLinePtr++;                /* point to next scan line X info */
      ScreenPtr += SCREEN_WIDTH; /* point to next scan line start */
   }
}







[LISTING TWO]

/* Scan converts an edge from (X1,Y1) to (X2,Y2), not including the
 point at (X2,Y2). If SkipFirst == 1, the point at (X1,Y1) isn't
 drawn; if SkipFirst == 0, it is. For each scan line, the pixel
 closest to the scanned edge without being to the left of the
 scanned edge is chosen. Uses an all-integer approach for speed and
 precision */

#include <stdlib.h>   /* abs() */
#include "polygon.h"

void ScanEdge(int X1, int Y1, int X2, int Y2, int SetXStart,
      int SkipFirst, struct HLine **EdgePointPtr)
{
   int Y, DeltaX, Height, Width, AdvanceAmt, ErrorTerm, i;
   int ErrorTermAdvance, XMajorAdvanceAmt;
   struct HLine *WorkingEdgePointPtr;

   WorkingEdgePointPtr = *EdgePointPtr; /* avoid double dereference */
   AdvanceAmt = ((DeltaX = X2 - X1) > 0) ? 1 : -1;
                           /* direction in which X moves (Y2 is
                              always > Y1, so Y always counts up) */

   if ((Height = Y2 - Y1) <= 0)  /* Y length of the edge */
      return;     /* guard against 0-length and horizontal edges */

   /* Figure out whether the edge is vertical, diagonal, X-major
      (mostly horizontal), or Y-major (mostly vertical) and handle
      appropriately */
   if ((Width = abs(DeltaX)) == 0) {
      /* The edge is vertical; special-case by just storing the same
         X coordinate for every scan line */
      /* Scan the edge for each scan line in turn */
      for (i = Height - SkipFirst; i-- > 0; WorkingEdgePointPtr++) {
         /* Store the X coordinate in the appropriate edge list */
         if (SetXStart == 1)
            WorkingEdgePointPtr->XStart = X1;
         else
            WorkingEdgePointPtr->XEnd = X1;
      }
   } else if (Width == Height) {
      /* The edge is diagonal; special-case by advancing the X
         coordinate 1 pixel for each scan line */
      if (SkipFirst) /* skip the first point if so indicated */
         X1 += AdvanceAmt; /* move 1 pixel to the left or right */
      /* Scan the edge for each scan line in turn */
      for (i = Height - SkipFirst; i-- > 0; WorkingEdgePointPtr++) {
         /* Store the X coordinate in the appropriate edge list */
         if (SetXStart == 1)
            WorkingEdgePointPtr->XStart = X1;
         else
            WorkingEdgePointPtr->XEnd = X1;
         X1 += AdvanceAmt; /* move 1 pixel to the left or right */
      }
   } else if (Height > Width) {
      /* Edge is closer to vertical than horizontal (Y-major) */

      if (DeltaX >= 0)
         ErrorTerm = 0; /* initial error term going left->right */
      else
         ErrorTerm = -Height + 1;   /* going right->left */
      if (SkipFirst) {   /* skip the first point if so indicated */
         /* Determine whether it's time for the X coord to advance */
         if ((ErrorTerm += Width) > 0) {
            X1 += AdvanceAmt; /* move 1 pixel to the left or right */
            ErrorTerm -= Height; /* advance ErrorTerm to next point */
         }
      }
      /* Scan the edge for each scan line in turn */
      for (i = Height - SkipFirst; i-- > 0; WorkingEdgePointPtr++) {
         /* Store the X coordinate in the appropriate edge list */
         if (SetXStart == 1)
            WorkingEdgePointPtr->XStart = X1;
         else
            WorkingEdgePointPtr->XEnd = X1;
         /* Determine whether it's time for the X coord to advance */
         if ((ErrorTerm += Width) > 0) {
            X1 += AdvanceAmt; /* move 1 pixel to the left or right */
            ErrorTerm -= Height; /* advance ErrorTerm to correspond */
         }
      }
   } else {
      /* Edge is closer to horizontal than vertical (X-major) */
      /* Minimum distance to advance X each time */
      XMajorAdvanceAmt = (Width / Height) * AdvanceAmt;
      /* Error term advance for deciding when to advance X 1 extra */
      ErrorTermAdvance = Width % Height;
      if (DeltaX >= 0)
         ErrorTerm = 0; /* initial error term going left->right */
      else
         ErrorTerm = -Height + 1;   /* going right->left */
      if (SkipFirst) {   /* skip the first point if so indicated */
         X1 += XMajorAdvanceAmt;    /* move X minimum distance */
         /* Determine whether it's time for X to advance one extra */
         if ((ErrorTerm += ErrorTermAdvance) > 0) {
            X1 += AdvanceAmt;       /* move X one more */
            ErrorTerm -= Height; /* advance ErrorTerm to correspond */
         }
      }
      /* Scan the edge for each scan line in turn */
      for (i = Height - SkipFirst; i-- > 0; WorkingEdgePointPtr++) {
         /* Store the X coordinate in the appropriate edge list */
         if (SetXStart == 1)
            WorkingEdgePointPtr->XStart = X1;
         else
            WorkingEdgePointPtr->XEnd = X1;
         X1 += XMajorAdvanceAmt;    /* move X minimum distance */
         /* Determine whether it's time for X to advance one extra */
         if ((ErrorTerm += ErrorTermAdvance) > 0) {
            X1 += AdvanceAmt;       /* move X one more */
            ErrorTerm -= Height; /* advance ErrorTerm to correspond */
         }
      }
   }

   *EdgePointPtr = WorkingEdgePointPtr;   /* advance caller's ptr */

}







[LISTING THREE]

; Draws all pixels in the list of horizontal lines passed in, in
; mode 13h, the VGA's 320x200 256-color mode. Uses REP STOS to fill
; each line.
; C near-callable as:
; void DrawHorizontalLineList(struct HLineList * HLineListPtr,
; int Color);
; All assembly code tested with TASM 2.0 and MASM 5.0

SCREEN_WIDTH equ 320
SCREEN_SEGMENT equ 0a000h

HLine struc
XStart dw ? ;X coordinate of leftmost pixel in line
XEnd dw ? ;X coordinate of rightmost pixel in line
HLine ends

HLineList struc
Lngth dw ? ;# of horizontal lines
YStart dw ? ;Y coordinate of topmost line
HLinePtr dw ? ;pointer to list of horz lines
HLineList ends

Parms struc
 dw 2 dup(?) ;return address & pushed BP
HLineListPtr dw ? ;pointer to HLineList structure
Color dw ? ;color with which to fill
Parms ends

 .model small
 .code
 public _DrawHorizontalLineList
 align 2
_DrawHorizontalLineList proc
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to our stack frame
 push si ;preserve caller's register variables
 push di
 cld ;make string instructions inc pointers

 mov ax,SCREEN_SEGMENT
 mov es,ax ;point ES to display memory for REP STOS

 mov si,[bp+HLineListPtr] ;point to the line list
 mov ax,SCREEN_WIDTH ;point to the start of the first scan
 mul [si+YStart] ; line in which to draw
 mov dx,ax ;ES:DX points to first scan line to
 ; draw
 mov bx,[si+HLinePtr] ;point to the XStart/XEnd descriptor
 ; for the first (top) horizontal line

 mov si,[si+Lngth] ;# of scan lines to draw
 and si,si ;are there any lines to draw?
 jz FillDone ;no, so we're done
 mov al,byte ptr [bp+Color] ;color with which to fill
 mov ah,al ;duplicate color for STOSW
FillLoop:
 mov di,[bx+XStart] ;left edge of fill on this line
 mov cx,[bx+XEnd] ;right edge of fill
 sub cx,di
 js LineFillDone ;skip if negative width
 inc cx ;width of fill on this line
 add di,dx ;offset of left edge of fill
 test di,1 ;does fill start at an odd address?
 jz MainFill ;no
 stosb ;yes, draw the odd leading byte to
 ; word-align the rest of the fill
 dec cx ;count off the odd leading byte
 jz LineFillDone ;done if that was the only byte
MainFill:
 shr cx,1 ;# of words in fill
 rep stosw ;fill as many words as possible
 adc cx,cx ;1 if there's an odd trailing byte to
 ; do, 0 otherwise
 rep stosb ;fill any odd trailing byte
LineFillDone:
 add bx,size HLine ;point to the next line descriptor
 add dx,SCREEN_WIDTH ;point to the next scan line
 dec si ;count off lines to fill
 jnz FillLoop
FillDone:
 pop di ;restore caller's register variables
 pop si
 pop bp ;restore caller's stack frame
 ret
_DrawHorizontalLineList endp
 end
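
For readers who would rather follow the fill logic in C, here is a rough
equivalent of the assembly above. It is a sketch under stated assumptions, not
a replacement for the listing: the screen is passed in as an ordinary byte
buffer (the hypothetical Screen parameter) instead of being addressed at
segment A000h, the C struct layouts are assumed to mirror the struc
definitions, and the loop of paired byte stores merely stands in for REP
STOSW. Like the assembly, it does no clipping.

```c
#include <assert.h>
#include <stddef.h>

#define SCREEN_WIDTH 320

/* Assumed C equivalents of the HLine and HLineList strucs */
struct HLine {
   int XStart;   /* X coordinate of leftmost pixel in line */
   int XEnd;     /* X coordinate of rightmost pixel in line */
};

struct HLineList {
   int Length;               /* # of horizontal lines */
   int YStart;               /* Y coordinate of topmost line */
   struct HLine *HLinePtr;   /* pointer to list of horz lines */
};

/* Hypothetical C rendering of DrawHorizontalLineList. Screen stands in
   for the 320x200 display memory and must be large enough to hold
   every line drawn; coordinates are assumed to be non-negative. */
void DrawHorizontalLineListSketch(const struct HLineList *List, int Color,
   unsigned char *Screen)
{
   const struct HLine *Line;
   size_t RowOffset, Offset;
   unsigned char Fill = (unsigned char)Color;
   int Row, Count;

   assert(List != NULL && Screen != NULL);
   Line = List->HLinePtr;
   RowOffset = (size_t)List->YStart * SCREEN_WIDTH;
   for (Row = 0; Row < List->Length;
         Row++, Line++, RowOffset += SCREEN_WIDTH) {
      Count = Line->XEnd - Line->XStart;
      if (Count < 0)
         continue;                 /* skip if negative width */
      Count++;                     /* width of fill on this line */
      Offset = RowOffset + (size_t)Line->XStart;
      if (Offset & 1) {            /* does fill start at an odd address? */
         Screen[Offset++] = Fill;  /* yes, draw the odd leading byte */
         Count--;
      }
      for (; Count >= 2; Count -= 2) { /* word-sized part of the fill */
         Screen[Offset++] = Fill;
         Screen[Offset++] = Fill;
      }
      if (Count)                   /* odd trailing byte, if any */
         Screen[Offset] = Fill;
   }
}
```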






[LISTING FOUR]

; Scan converts an edge from (X1,Y1) to (X2,Y2), not including the
; point at (X2,Y2). If SkipFirst == 1, the point at (X1,Y1) isn't
; drawn; if SkipFirst == 0, it is. For each scan line, the pixel
; closest to the scanned edge without being to the left of the scanned
; edge is chosen. Uses an all-integer approach for speed & precision.
; C near-callable as:
; void ScanEdge(int X1, int Y1, int X2, int Y2, int SetXStart,
; int SkipFirst, struct HLine **EdgePointPtr);
; Edges must not go bottom to top; that is, Y1 must be <= Y2.
; Updates the pointer pointed to by EdgePointPtr to point to the next
; free entry in the array of HLine structures.

HLine struc
XStart dw ? ;X coordinate of leftmost pixel in scan line
XEnd dw ? ;X coordinate of rightmost pixel in scan line

HLine ends

Parms struc
 dw 2 dup(?) ;return address & pushed BP
X1 dw ? ;X start coord of edge
Y1 dw ? ;Y start coord of edge
X2 dw ? ;X end coord of edge
Y2 dw ? ;Y end coord of edge
SetXStart dw ? ;1 to set the XStart field of each
 ; HLine struc, 0 to set XEnd
SkipFirst dw ? ;1 to skip scanning the first point
 ; of the edge, 0 to scan first point
EdgePointPtr dw ? ;pointer to a pointer to the array of
 ; HLine structures in which to store
 ; the scanned X coordinates
Parms ends

;Offsets from BP in stack frame of local variables.
AdvanceAmt equ -2
Height equ -4
LOCAL_SIZE equ 4 ;total size of local variables

 .model small
 .code
 public _ScanEdge
 align 2
_ScanEdge proc
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to our stack frame
 sub sp,LOCAL_SIZE ;allocate space for local variables
 push si ;preserve caller's register variables
 push di
 mov di,[bp+EdgePointPtr]
 mov di,[di] ;point to the HLine array
 cmp [bp+SetXStart],1 ;set the XStart field of each HLine
 ; struc?
 jz HLinePtrSet ;yes, DI points to the first XStart
 add di,XEnd ;no, point to the XEnd field of the
 ; first HLine struc
HLinePtrSet:
 mov bx,[bp+Y2]
 sub bx,[bp+Y1] ;edge height
 jle ToScanEdgeExit ;guard against 0-length & horz edges
 mov [bp+Height],bx ;Height = Y2 - Y1
 sub cx,cx ;assume ErrorTerm starts at 0 (true if
 ; we're moving right as we draw)
 mov dx,1 ;assume AdvanceAmt = 1 (move right)
 mov ax,[bp+X2]
 sub ax,[bp+X1] ;DeltaX = X2 - X1
 jz IsVertical ;it's a vertical edge--special case it
 jns SetAdvanceAmt ;DeltaX >= 0
 mov cx,1 ;DeltaX < 0 (move left as we draw)
 sub cx,bx ;ErrorTerm = -Height + 1
 neg dx ;AdvanceAmt = -1 (move left)
 neg ax ;Width = abs(DeltaX)
SetAdvanceAmt:
 mov [bp+AdvanceAmt],dx
; Figure out whether the edge is diagonal, X-major (more horizontal),
; or Y-major (more vertical) and handle appropriately.

 cmp ax,bx ;if Width==Height, it's a diagonal edge
 jz IsDiagonal ;it's a diagonal edge--special case
 jb YMajor ;it's a Y-major (more vertical) edge
 ;it's an X-major (more horz) edge
 sub dx,dx ;prepare DX:AX (Width) for division
 div bx ;Width/Height
 ;DX = error term advance per scan line
 mov si,ax ;SI = minimum # of pixels to advance X
 ; on each scan line
 test [bp+AdvanceAmt],8000h ;move left or right?
 jz XMajorAdvanceAmtSet ;right, already set
 neg si ;left, negate the distance to advance
 ; on each scan line
XMajorAdvanceAmtSet: ;
 mov ax,[bp+X1] ;starting X coordinate
 cmp [bp+SkipFirst],1 ;skip the first point?
 jz XMajorSkipEntry ;yes
XMajorLoop:
 mov [di],ax ;store the current X value
 add di,size HLine ;point to the next HLine struc
XMajorSkipEntry:
 add ax,si ;set X for the next scan line
 add cx,dx ;advance error term
 jle XMajorNoAdvance ;not time for X coord to advance one
 ; extra
 add ax,[bp+AdvanceAmt] ;advance X coord one extra
 sub cx,[bp+Height] ;adjust error term back
XMajorNoAdvance:
 dec bx ;count off this scan line
 jnz XMajorLoop
 jmp ScanEdgeDone
 align 2
ToScanEdgeExit:
 jmp ScanEdgeExit
 align 2
IsVertical:
 mov ax,[bp+X1] ;starting (and only) X coordinate
 sub bx,[bp+SkipFirst] ;loop count = Height - SkipFirst
 jz ScanEdgeExit ;no scan lines left after skipping 1st
VerticalLoop:
 mov [di],ax ;store the current X value
 add di,size HLine ;point to the next HLine struc
 dec bx ;count off this scan line
 jnz VerticalLoop
 jmp ScanEdgeDone
 align 2
IsDiagonal:
 mov ax,[bp+X1] ;starting X coordinate
 cmp [bp+SkipFirst],1 ;skip the first point?
 jz DiagonalSkipEntry ;yes
DiagonalLoop:
 mov [di],ax ;store the current X value
 add di,size HLine ;point to the next HLine struc
DiagonalSkipEntry:
 add ax,dx ;advance the X coordinate
 dec bx ;count off this scan line
 jnz DiagonalLoop
 jmp ScanEdgeDone
 align 2

YMajor:
 push bp ;preserve stack frame pointer
 mov si,[bp+X1] ;starting X coordinate
 cmp [bp+SkipFirst],1 ;skip the first point?
 mov bp,bx ;put Height in BP for error term calcs
 jz YMajorSkipEntry ;yes, skip the first point
YMajorLoop:
 mov [di],si ;store the current X value
 add di,size HLine ;point to the next HLine struc
YMajorSkipEntry:
 add cx,ax ;advance the error term
 jle YMajorNoAdvance ;not time for X coord to advance
 add si,dx ;advance the X coordinate
 sub cx,bp ;adjust error term back
YMajorNoAdvance:
 dec bx ;count off this scan line
 jnz YMajorLoop
 pop bp ;restore stack frame pointer
ScanEdgeDone:
 cmp [bp+SetXStart],1 ;were we working with XStart field?
 jz UpdateHLinePtr ;yes, DI points to the next XStart
 sub di,XEnd ;no, point back to the XStart field
UpdateHLinePtr:
 mov bx,[bp+EdgePointPtr] ;point to pointer to HLine array
 mov [bx],di ;update caller's HLine array pointer
ScanEdgeExit:
 pop di ;restore caller's register variables
 pop si
 mov sp,bp ;deallocate local variables
 pop bp ;restore caller's stack frame
 ret
_ScanEdge endp
 end





























March, 1991
PROGRAMMER'S BOOKSHELF


Subatomic Programming




Andrew Schulman


Most programs are written in a high-level language, not assembly language, but
the authors of these programs are generally at least dimly aware that, below
the surface, their high-level-language statements such as p = x "turn into"
assembly language statements such as MOV AX, [BX]. Even introductory books on
computing always seem to include a picture of a funnel, with LETs and GOTOs
flowing in the top, and MOVs and JMPs dropping out the bottom.
It is probably a sign of progress in computing that most of us view these MOVs
and JMPs as atomic operations. That is, they don't "turn into" anything,
except perhaps the "0s and 1s" to which introductory computer books like to
vaguely refer. For the majority of programmers, the actually enormous
complexity underneath the surface of an "atomic" assembly language statement
like MOV AX, [BX] can remain a total mystery.
Nonetheless, it is worth having an appreciation for what makes up these
supposedly simple operations. In addition to the pure enjoyment of knowing a
little more about the machine, an appreciation for its subatomic particles --
things like bus cycles, memory access time, instruction prefetches,
pipelining, DRAM refresh, timing issues, cache management, wait states, DMA,
and bus arbitration -- may become more important as microprocessors become
faster and more compact. As Intel's new 386SL chipset shows, even a seemingly
lowly issue like power management can take on great importance when computers
get small enough.
This month, we will examine three books that take us beneath the valley of
assembly language.


Zen of Assembly Language


Michael Abrash's oddly titled Zen of Assembly Language is a good place to
start. Chapters 3, 4, and 5 in particular deal with what he calls "the raw
stuff of performance, which lies beneath the programming interface, in the
dimly seen realm populated by instruction prefetching, dynamic RAM refresh,
and wait states, where software meets hardware" (p. 75).
Interestingly, Abrash's goal is actually to show that we can't totally
understand this level. "The exact performance of assembler code over time is
such a complex problem that it might as well be unsolvable" (p. 114). He shows
that all instruction timings are relative. In one example code sequence, the SHR
instruction takes eight-plus cycles to execute, and in another it takes only
two. Thus, "the only true execution time for an instruction is a time measured
in a certain context, and that time is meaningful only in that context" (p.
91).
In other words, "there's no way to be sure what code is the fastest for a
particular purpose"; one must "write code by feel as much as by prescription."
Apparently such thoughts are what inspired the "Zen" book title. "How can it
not be possible to come up with a purely rational solution to a problem that
involves that most rational of man's creations, the computer?" he asks (p.
113), yet the answer is, at this subatomic level, that the order and duration
of events is unknown.
In a particularly nice demonstration, Abrash hooks a logic analyzer up to the
8088 and PC bus, and examines the following simple instruction sequence:
 i db 1
 j db 0
 mov ah, ds:[i]
 mov ds:[j], ah
The result is a timeline of "170 Cycles in the Life of a PC" (pp. 119-121), in
which we see the 8088's execution unit load up opcodes from the instruction
prefetch queue, the bus-interface unit reload the instruction queue from
memory, the occurrence of DRAM refresh reads, wait states, and so on. And we
see even these simple instructions behave differently (execute at different
speeds) at different times.
Abrash's own conclusion is "code execution isn't all that exciting ... it's
awfully tedious, even by assembler standards. During the entire course of the
figure only seven instructions are executed -- not much to show for all the
events listed." Abrash's point is that such a "microanalysis ... is not only
expensive and time consuming, but also pointless."
Yet, for most readers this is the most fascinating part of the book! Abrash's
book can be used, not only as a guide to assembly language performance issues,
but also as a fine explanation of what really happens "inside" a MOV
instruction.
Of course, the title is slightly misleading, in that he is talking about Intel
assembly language, not assembly language in general. Furthermore, the focus is
far more on the 8088 than seems appropriate now that the baseline PC
machine is 80286-based. Abrash takes a perverse pleasure in the poor quality
of the 8088, because clearly the worse the chip, the more one needs assembly
language optimizations! However, he does devote an entire chapter to "Other
Processors" (the 80286 and 80386); this chapter alone is worth the price of
the book.
And perhaps the book's 8088 focus may not be so off base, after all. Abrash
points out, "If you're going to go to the trouble of using 80386-specific
features, thereby eliminating any chance of running on PCs and ATs, you might
as well go all the way and write 80386 protected-mode code" (p. 716). In a
book on real-mode programming, then, perhaps there isn't much to say on the
80286 and 80386. "The protected-mode 80386 is a wonderful processor to
program, and a good topic -- a terrific topic -- for some book to cover in
detail, but this is not that book" (p. 717).
Even Abrash's entire chapter on the 8080 (!) is not so out of place for the
1990s. "You no doubt think you've seen the last of the venerable but not
particularly powerful 8080. Not a chance. The 8080 lingers on in the
instruction set and architecture ... Although it may seem strange that the
design of an advanced processor would be influenced by the architecture of a
less capable one, that practice is actually quite common" (p. 266). As a
result, even the spiffiest 486 has many features in common with the 8080, a
glorified calculator chip. This chapter of the book ("Strange Fruit of the
8080") makes particularly enjoyable reading, because it shows how minor
engineering decisions live on for many years. A frightening thought.


Structured Computer Organization


Our next book, Tanenbaum's Structured Computer Organization, may not at first
seem relevant. What does this venerable (now in its third edition) computer
architecture textbook have to do with the bizarre world Abrash describes?
Tanenbaum takes us to some of the levels below the odd enough level of bus
cycles and instruction prefetches. Chapter 3, "The Digital Logic Level," is a
superb examination of everything from NAND gates to the construction of
latches, flip-flops, and registers, up to memory and buses. Furthermore, this
is no abstract discussion of a hypothetical machine. Throughout the book,
Tanenbaum uses the Intel 80x86 and Motorola 680x0 families as his running
examples. For example, this chapter contains a discussion of the IBM PC and AT
buses. The internal workings of "a typical IBM PC clone" are described at the
chip level, and a circuit diagram is given and discussed at length.
Tanenbaum's book is based on "the idea that a computer can be regarded as a
hierarchy of levels" (p. xv). Furthermore, each level, even the lowest "device
level," corresponds to a language. "A central theme of this book that will
occur over and over again is: Hardware and software are logically equivalent"
(p. 11).
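
The progression from gates to latches, and the claim that hardware and
software are logically equivalent, can be illustrated with a few lines of C.
What follows is a minimal sketch, not code from the book: the names Nand,
SRLatch, and LatchUpdate are hypothetical, and the fixed iteration count is a
crude stand-in for the propagation delays of real gates.

```c
#include <assert.h>

/* One NAND gate: the output is 0 only when both inputs are 1 */
static int Nand(int A, int B)
{
   return !(A && B);
}

/* State of a latch built from two cross-coupled NAND gates */
struct SRLatch {
   int Q;
   int NotQ;
};

/* Applies active-low inputs to the latch: SetN == 0 sets Q to 1,
   ResetN == 0 clears Q to 0, and both inputs at 1 hold the current
   state. Driving both inputs to 0 at once is not a valid input. */
void LatchUpdate(struct SRLatch *Latch, int SetN, int ResetN)
{
   int i;

   assert(Latch != 0);
   assert(SetN || ResetN);
   /* Evaluate the two gates a few times so the feedback settles */
   for (i = 0; i < 4; i++) {
      int NewQ = Nand(SetN, Latch->NotQ);
      int NewNotQ = Nand(ResetN, NewQ);
      Latch->Q = NewQ;
      Latch->NotQ = NewNotQ;
   }
}
```

Starting from Q = 0, driving SetN low sets Q to 1; returning both inputs high
holds that value; driving ResetN low clears it again.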
Chapter 4, "The Microprogramming Level," includes brief but useful studies of
the microarchitecture of the Intel and Motorola chips. Many PC programmers
will want to at least read the discussion (pp. 215-220) of the Intel 8088
microcode. I've never seen this discussed anywhere else.
Tanenbaum also has brief but useful coverage of the issues of instruction
pipelining, memory interface, and cache memory. I found myself wanting more on
these increasingly important topics. One good book is High-Performance
Computer Architecture, Second Edition, by Harold S. Stone (Addison-Wesley,
1990).
One aspect of Tanenbaum's text that seems odd, at least with the benefit of
hindsight, is his choice of OS/2, rather than MS-DOS, as the archetypal Intel
operating system. True, "OS/2 has a surprisingly large number of features that
are not present in UNIX and are well worth examining." But it makes no sense
to write off MS-DOS with the comment that it is "an obsolete, primitive, and
not very interesting system, despite its widespread use" (p. 372). Its
widespread use is precisely what makes DOS intrinsically interesting. To say
that something is "of great commercial importance" but "of little interest to
us" (p. 373) seems like a bad way to educate engineers! Since "the OS/2
designers were not permitted to simply treat MS-DOS as a bad dream and start
all over," it's not clear why anyone else should pretend they have such a
luxury. Oh, well.
But, like Tanenbaum's other books, Computer Networks and Operating Systems,
this one is nearly perfect.


80x86 Architecture and Programming


Finally, we come to Volume II of Rakesh Agarwal's 80x86 Architecture and
Programming. In an odd reversal of the natural order, Volume I apparently
won't be available for almost a year. Nonetheless, Volume II stands on its own
as an indispensable guide to the 80286, 80386, and 486 microprocessors. In
particular, Agarwal presents such a clear picture of the processors' operation
in protected mode that one could probably use his extensive C code and
diagrams to clone an Intel chip.
Agarwal presents extremely detailed C (and pseudo-C) code for each Intel
instruction. These in turn use a library of functions such as LA_rdChk()
(linear-address read), LA_wrChk() (linear-address write),
priv_lev_switch_CALL() (privilege-level switch), enter_new_task(), and the
sickeningly complex read_descr() (read-descriptor).
The book also contains up-to-the-minute information on the 486 cache,
hard-to-find details on the floating-point exception/NMI interface (and how it
had to be faked on the 486!), a complete discussion of the undocumented
LOADALL instruction, and similar goodies. Unfortunately, the book did come out
too soon for inclusion of the deranged eight (count 'em) new address spaces
added on the 386SL Super-Set chips.
If you've ever asked what really happens when you MOV ES, AX in protected
mode, or how Windows 3.0 enhanced mode traps IN and OUT instructions using
Virtual 8086 mode, this is the book to get. When you're finished, you may be
sorry you asked, but that's a different story.








March, 1991
OF INTEREST





MultiScope has recently released its MultiScope debuggers for DOS and Windows
3.0, and an upgrade to its OS/2 debugger. The debuggers support any language
that is capable of generating CodeView information, including Microsoft C,
Fortran, Pascal, Basic, and MASM, as well as Logitech's Modula-2.
Each debugging platform provides a runtime, postmortem, and remote debugger.
The runtime debuggers sport multiple windows, including a source window that
allows you to view and control the execution of your source code. Features
here include conditional and unconditional breakpoints at both the source and
assembly levels, data and memory watchpoints, a sophisticated set of "GO"
commands, and a symbolic expression evaluator for C, Pascal, or Modula-2.
Other windows allow you to view your program simultaneously at the source and
assembly levels, and to examine the registers of the CPU and math coprocessor.
Using the graphic data window, you can browse a graphical hierarchy of the
data structures within your program with full zoom and scroll capability,
while a complementary data window lets you view the internals of a given
structure.
Also, a memory window lets you view, modify, or set watchpoints on any
location in your program's memory space and display in any of a number of
formats.
The DOS and Windows versions fully support and take advantage of the
386/486, while those with an 8086/286 can take advantage of MultiScope's
remote debugging capability. The package, which includes all three debugging
platforms, is available at an introductory price of $179. Current owners of
MultiScope for OS/2 can upgrade for $99. Reader service no. 20.
MultiScope Inc. 1235 Pear Ave. Mountain View, CA 94043 415-968-4892
Interactive Engineering is expected to release its WINDOWS.TXT product, a
character-mode implementation of the Microsoft Windows 3.0 SDK, in the first
quarter of 1991. The product is intended to provide programmers with the means
to develop SAA/CUA compliant applications under DOS while adding a mapping
layer that provides source-code compatibility with the Windows SDK. DDJ
recently spoke with Stephen E. Buck, president of Interactive Engineering.
According to Buck, "In most cases, a programmer can run a Windows program out
of the box by changing just the make file."
Key features include full support of pull-down and pop-up menus, overlapping,
pop-up, and child windows, and modal or modeless dialog boxes. The libraries
provide functions that support message passing including posting, sending,
dispatching, and so on, as well as functions to manage the message queue,
register programmer-defined messages, and to subclass windows. Windows-style
memory management functions are also provided. In addition, the package
includes clipboard, paint and timer functions, and support for resources.
The WINDOWS.TXT SDK supports Microsoft C 5.1, Turbo C 2.0, and Turbo C++ 1.0,
and includes a resource compiler, a dialog editor, documentation, and sample
source code. A Unix version is expected later this year, with OS/2 and VMS
versions under consideration. The SDK with full source to the libraries is
available at $595, or can be purchased separately for $295. Reader service no.
25.
Interactive Engineering Inc. P.O. Box 7022 Boulder, CO 80306-7022 303-440-7674
The ANSI committee on Pascal is working on a technical report on
object-oriented extensions to Pascal. Tom Turba of Unisys told DDJ that they
are "working with people from Apple, Microsoft, Symantec, DEC, and other
vendors. We have a few users working with us and we'd like to have a few more
users participate. We are planning on getting it done in a year's time frame."
If you would like further information or are interested in participating,
contact:
Thomas Turba, Chairman X3J9 Unisys Corp., MS WE3C P.O. Box 64942 St. Paul, MN
55164 612-635-2349 turba@rsvl.Unisys.com
Unix users working on 386 or 486 PCs can now purchase the System V Release 4
Migration Package from UHC. Xenix, Unix System V, or other Unix system
variants for the Intel architecture can be traded in for the SVR4 Migration
Package, which includes the base operating system, Berkeley compatibility,
editing utilities, forms and menu language interpreter, framed access command
environment, kernel debugger, line printer utilities, mouse driver, networking
support utilities, PC-Interface, remote terminal utilities, security
administration, Termcap compatibility, windowing utilities, Xenix
compatibility, and operation, administration, and maintenance.
Also included is compatibility with MFM, RLL, IDE, Western Digital 1007 ESDI,
Adaptec 1542 SCSI, and Western Digital 7000 SCSI controllers. Drivers are
provided for most VGA and Super VGA cards, and for selected Archive and
Wangtek tape drives.
SVR4 also provides a library that implements the BSD Sockets interface, 4.3
BSD signal mechanisms, BSD commands and job control, the TCP/IP network
protocols, RPC and XDR libraries, the Network File System file-sharing
utility, Virtual File System architecture, and more. The cost for migration to
UHC (trading in the original disks of your current Unix system) is $395.
Reader service no. 21.
UHC 3600 S. Gessner, Ste. 110 Houston, TX 77063 713-782-2700
In other Windows news, Texas Instruments is offering a Windows display driver
that can support Windows 3 on any TIGA 340-based board, regardless of
resolution or color depth. This eliminates the need to develop separate
drivers for each resolution and display configuration.
TI's Windows driver links the graphics capabilities of Windows with the
high-resolution, industry-standard 340x0 processors, giving Windows applications
higher performance and faster processing by offloading the graphics processing
from the CPU to the 340 graphics processor. The Windows driver provides
hardware support for all the current Windows features, and is packaged with
the TIGA Software Porting Kit, available to licensed TIGA OEMs. Reader service
no. 23.
Texas Instruments Inc. Semiconductor Group, SC-9082 P.O. Box 809066 Dallas, TX
75380-9066 214-995-6611, Ext. 700
Digital signal processing (DSP) may soon be more accessible to PC and
workstation applications through Spectron's Open Signal Processing
Architecture (OSPA), which provides a common framework and a set of standard
interfaces developed to allow designers of DSP-based systems to integrate DSP
software with applications. Spectron expects this to encourage the development
of PC and workstation applications that are designed and built without costly
porting processes or reinvented communications mechanisms.
OSPA defines a set of interfaces and protocols that link standard computer
operating systems such as DOS or Unix with Spox, Spectron's digital signal
processing operating system, via an extension called Spox Server. Host
application programs can thus control and communicate with multiple DSP tasks
running at the same time on the same hardware. Mark Borcherding of Texas
Instruments Audio told DDJ, "We're using the RTK (real-time kernel) component,
which gives us a multitasking environment for option boards that plug into T1
phone lines. For the filtering operations we use the RTK for a combination of
multiprocessing and multitasking with script interpreters. We use both the PC
and Sun development tools."
DSP technology has been of limited value to designers of voice, imaging,
telecommunications, and instrumentation systems who are unfamiliar with DSP
algorithms. Independent software vendors will be able to package DSP
algorithms as Spox applications by using OSPA's features. OSPA development
packages will be offered initially by TI TMS320C30 development system vendors
for approximately $5,000. Reader service no. 24.
Spectron Microsystems 600 Ward Dr. Santa Barbara, CA 93111 805-967-0503
Smalltalk/V Windows is now available from Digitalk. Smalltalk/V works
optimally with graphical user interfaces such as Windows, Presentation
Manager, and the Mac, providing for these environments classes that hide the
details and so allow you to concentrate on application logic. Key features of
earlier Smalltalk/V programming systems such as browsers, inspectors, and
push-button debuggers are included in the Windows version.
New features include interfaces to dynamic data exchange (allowing information
to be shared between Smalltalk/V programs and other programs) and dynamic link
libraries, for calling applications outside of Smalltalk/V.
Because Smalltalk/V Windows source code is compatible with the existing
Smalltalk/V PM programming environment, you can choose either system as a
development platform and deliver applications on both systems simultaneously.
The price for Smalltalk/V Windows is $499.95. Reader service no. 22.
Digitalk Inc. 9841 Airport Blvd. Los Angeles, CA 90045 213-645-1082
DOCZ, a software development tool from Software Toolz, automates the
documentation of subroutine libraries and programs, allowing the sharing and
reuse of software modules. You can put the documentation in the same file as
the source code and extract it for reference manuals and revision notices.
DOCZ also builds online help libraries, and program editors can be interfaced
to DOCZ in order to extract function call prototypes from the help libraries
while editing.
DDJ spoke with Paul Anderson of Crossroads Computing in Atlanta. He said,
"It's so nice to finally find a system for filing my source code. You can use
it to organize lots of programming thoughts -- I've found code I'd completely
forgotten about. It's so easy to use; there are just two commands, and only
one needs parameters. And my partner and I can now look up each other's work
online without interrupting each other."
DOCZ is identical under VMS, Unix System V, and MS-DOS, and enables the
transport of help libraries and documentation from one platform to another. It
can automatically detect the presence of version control systems and can
retrieve source files via such systems as CMS and SCCS. DOCZ supports all
major languages and command-interpreter languages, and currently supports the
Interactive 386/ix Unix platform. Single-user platforms are priced at $195.
Samples and a DOS demo version are free. Reader service no. 26.
Software Toolz Inc. 8030 Pooles Mill Dr. Ball Ground, GA 30107-9610
404-889-8264; 800-869-3878
A library of Turbo C- and Quick C-compatible functions, called FFP (fast
floating point), has been announced by Triakis. FFP is a floating-point
arithmetic method based on the sign-logarithm number system. According to
Triakis, the library operations in many cases rival a math coprocessor in
speed and can be used in applications that run without special math hardware.
The precision and exponent range of the numbers supposedly equal or slightly
exceed those of standard single precision.
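
The appeal of a sign-logarithm representation is easy to show in C. The sketch
below is a generic illustration and makes no claim about how FFP itself is
implemented: the SignLog type, the SL-prefixed function names, and the choice
of eight fractional bits are all assumptions made for the example. A number is
stored as a sign plus a fixed-point base-2 logarithm of its magnitude, so that
multiplication, division, and square root reduce to integer addition,
subtraction, and halving.

```c
#include <assert.h>

#define SL_FRAC_BITS 8   /* assumed number of fractional bits in the log */

struct SignLog {
   int Sign;       /* +1 or -1 (zero is not representable in this sketch) */
   long Log2Fix;   /* log2(|value|), fixed point, scaled by 2^SL_FRAC_BITS */
};

/* Builds the representation of Sign * 2^Exponent */
struct SignLog SLFromPowerOfTwo(int Sign, int Exponent)
{
   struct SignLog R;

   assert(Sign == 1 || Sign == -1);
   R.Sign = Sign;
   R.Log2Fix = (long)Exponent * (1L << SL_FRAC_BITS);
   return R;
}

/* Multiplication: multiply the signs, add the logarithms */
struct SignLog SLMultiply(struct SignLog A, struct SignLog B)
{
   struct SignLog R;

   R.Sign = A.Sign * B.Sign;
   R.Log2Fix = A.Log2Fix + B.Log2Fix;
   return R;
}

/* Division: multiply the signs, subtract the logarithms */
struct SignLog SLDivide(struct SignLog A, struct SignLog B)
{
   struct SignLog R;

   R.Sign = A.Sign * B.Sign;
   R.Log2Fix = A.Log2Fix - B.Log2Fix;
   return R;
}

/* Square root: halve the logarithm (positive values only) */
struct SignLog SLSqrt(struct SignLog A)
{
   struct SignLog R;

   assert(A.Sign == 1);
   R.Sign = 1;
   R.Log2Fix = A.Log2Fix / 2;
   return R;
}
```

Multiplying 8 by -0.25, for example, adds the logarithms 3 and -2 to give a
logarithm of 1 with a negative sign, which is -2. Addition and subtraction are
the hard cases in such a system and usually call for table lookups or
approximations; the sketch leaves them out.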
You can use FFP in applications such as general scientific computation,
geometrical graphics, signal processing, simulation, and nonlinear systems.
The library includes transcendental functions, logarithm, exponential, square
root, and a feature for making complicated derived functions execute faster.
The product sells for $59, and the company charges a small royalty if you use
it in a commercial product. Reader service no. 28.
Triakis 1011 Duchess Rd. Bothell, WA 98012 206-486-8282
The Turbo Vision Development Toolkit has been announced by Blaise Computing.
The toolkit is a set of utilities and object class libraries designed for use
with Turbo Vision, Borland's application framework for establishing uniform
user interfaces and basic architecture features in applications. The idea
behind Turbo Vision is to eliminate the need to repeatedly create the basic
platform on which you build your application programs.
The toolkit includes a resource editor for interactively creating or changing
dialog boxes and other resources. It allows you to see the resources as they
will appear in the final version of the application. You then save this
information, and a single method call loads it into the application. The
resource compiler takes its input from a file rather than interactively. An
enhanced help facility displays context-sensitive help information within
Turbo Vision applications; a memory monitor records and displays memory usage;
an event monitor displays all or selected events processed by the Turbo Vision
system; and a recorder allows you to save selected events for "playback" at
another time.
The Turbo Vision Development Toolkit should be available by late March; it
will sell for $149. Reader service no. 29.
Blaise Computing Inc. 2560 Ninth St., Ste. 316 Berkeley, CA 94710 415-540-5441

















March, 1991
SWAINE'S FLAMES


Chairman of the Rings




Michael Swaine


The following narrative was recently discovered among the effects of J.R.R.
Tolkien. It appears to describe an incident from an alternative version of
Lord of the Rings, in which the Hobbit Bilbo has kept the ring of power and is
consolidating his position as absolute ruler of Middle Earth. This text is a
translation from the original, which was written in Cryptic, a difficult
language of Tolkien's invention. Bilbo is referred to throughout as "The
Chairman."
"What is the status of Project Platform Independence?" asked the Chairman,
resting his elbows on the table and pushing his glasses up. "I don't know
where we get these strange code names."
"Total virtuality will not now be long in the coming," said Gandalf the Grey,
the VP for VR. "I am in the process of instituting complete Virtual
Environment control."
"You're a wizard, G.G. What does that mean, exactly?"
"That your subjects will be able to work in a boat on the Brandywine and be
convinced that they are in a glade in Lorien, or work on the slopes of Erebor
and perceive themselves at the Grey Havens, listening to the surf."
"Where will they actually be working?" asked the Chairman.
"We've set up cubicles in the caves of Moria."
"Well, that sounds all right," said the Chairman, "but what's the trouble on
the language front, Frodo? I ask you to make everyone speak our language,
Simple, and you let an army of Eunuchs spread their religious doctrine of
Crypticism throughout Middle Earth, with the result that now everybody is
speaking Cryptic. What do you have to say for yourself?"
"Ahem, that was strategic, Mr. Chairman," said Frodo. "It is easier to defeat
a single enemy than many. Now that Cryptic has taken over, we have only to
displace it."
"Hmph. That may be so, but I think I'd better have a report on how Simple
development is going."
Sam Gamgee, the Simple developer, blushed in embarrassment and stood up. "Your
Chairmanship, sir," said he, "work on the new version of Simple is near
completion, sir, and like always, Your Lordship, it's completely different
from previous versions, save for the name Simple and a few vulgarisms like
goto."
"Heh. The Gaffer would have been speechless without his gotos, wouldn't he,
the old pervert."
"Yes sir, Your Pleasance, and very kind of you it is to remember my late
father. The new version is based on a military dialect of Entish called Ada++,
if it please Your Worship."
"Add Plus Plus?" said the Chairman. "That sounds a little redundant."
"Bless Your Grace's wit, sir, if redundancy isn't sure enough its long suit,
but it's spoke like 'Ada Plus Plus,' Your Magnificence. Named after a book by
some Nabokov feller, some says."
"Whatever," said the Chairman. "But you haven't told me what's new in this
version."
"It's got a new what you call your orientation, Your Honor. The whole language
is about getting the job done, if you take my meaning. We're calling it
Objective-Oriented Simple."
"Well, when it's released," said the Chairman, "make sure it's called The
Chairman's Simple, as always. Now, Gimli Gloin's son, what was it that I asked
you to do?"
The Dwarf laid his ax on the table and stared stoically at it. "To go forth
with mighty Dwarfish battalions and to bend the isles of Iac to your will,
Sire."
"Ah, yes. How's that going?"
"Hard is the task, Sire. The Windward isles we hold in thrall, but the Isle of
the Eunuchs and the Maggotish Land yet resist our iron will. But I have sent
brave Ole to infiltrate their systems, and by Durin's beard, we shall prevail
ere long!"
"Well, I certainly hope so," said the Chairman, pressing down his cowlick.
"Look, fellows, I want all these projects wrapped up ASAP. I want people,
whatever their actual environments, to be working in a virtual environment
that we define. I want all of them using one language, and I want it to be our
language. And I want complete control of all of Iac. Now, when am I going to
see all of this?"
"Real soon now!" they cried in unison.
"Super."

























April, 1991
EDITORIAL


Mark's Modest Patent Proposal




Jonathan Erickson


The subject of software patents won't go away. And until the problems
surrounding patents are solved, let's hope it doesn't. The topic, in fact, was
the subject of a lively (boy, that's putting it mildly) panel discussion at
the recent Software Development '91 conference. On the panel, moderated by
Greenleaf Software's (and frequent DDJ contributor) Mark Nelson, were Jim
Bidzos (CEO of RSA Data Security), Richard Stallman (representing the League
for Programming Freedom), Dick Gabriel (of Lucid and Stanford University),
Woody Higgins (of Flohr, Hohbach, Test, Albritton & Herbert--you guessed it,
an attorney specializing in software patents), Paul Heckel (who holds, and has
vigorously defended, several software patents), and yours truly.
Things went fine until Mark began introducing the panel. That's when the
shouting started--and I hadn't even been acknowledged yet.
Much of the ensuing discussion concerned whether or not software patents
thwart innovation (no real consensus, although I suspect they do), whether or
not the patent process works (nope, it sure doesn't), the definition of
"obvious" (a key factor in granting a patent), and whether or not there are
any alternatives to the system. One comment really floored me: Woody Higgins
said that currently about 100 software patents are being granted every month.
Near the end of our two-hour session, Mark made a modest proposal which, if
adopted, could have broad and profound implications. According to the rules
governing the granting of a patent (and I'm working from memory here, so bear
with me), if there is prior art or a technique is "obvious" to the "average"
practitioner of an art or science, a patent should not be allowed. As I've
said before, part of the problem with the current process is that many
programming techniques are obvious to experienced programmers, but not to the
general population or to patent examiners. Consequently, it is likely that
many of the 100 or so patents handed out every month wouldn't be granted if
they were subject to peer review.
What Mark proposed is the establishment of an ad hoc patent jury system. Now
there's nothing in his proposal that limits the process to software or the
computer industry. Other trades (auto, biotech, and so on) could (and probably
should) have review panels of their own to advise the patent office. (Okay,
this is assuming that patents apply to software in the first place. Many,
including the League for Programming Freedom, don't believe in granting
software patents at all. I'll not get into a discussion of that right now.
However, I think Mark's proposal has a better chance of acceptance than that
of doing away with software patents altogether.) For software, the jury would
be subdivided into graphics, communications, and so on. Thus, a patent
involving a complicated graphics algorithm would be reviewed by a group of
"average" graphics programmers.
Under the jury system, patent applications would be reviewed by the patent
office in secrecy, just as now. However, instead of waiting to make the
application public on the day it's granted, the application would be opened
for examination, say, one year before actually being granted.
The patent office would make public not just the applications, but the
"wrapper" that contains all of the give-and-take between the inventor and
examiner. Any interested parties could examine the patent and would have a
period of six months to file objections, including prior art.
The patent office could, at its discretion, act on or ignore these filings;
objecting parties would not have the right of appeal.
With a time limit on peer reviews, I'm willing to bet the review panel would
speed up the entire process by invalidating many trivial or obvious
applications early on. The patent office could then focus on getting fewer
patents out the door quickly. And the patents ultimately granted
would be much stronger, having already survived a peer review.
A lot of questions have to be answered before the process could be
implemented: Those in the industry, for instance, would be concerned about
submitting patents to committees that may include employees of competing
firms. Congress and the patent office would obviously have to buy into it, and
they usually don't want to give up any power at any time without a fight. It
all comes down to just how sincere these groups are about solving a very
serious problem.
I'd like your feedback on this: Will it work? How could it be implemented?
While this may not be the ultimate solution, it is a positive step in the
right direction. You can help point the way. In the meantime, I'll start
contacting the patent office and appropriate House subcommittees to gauge
their reactions--and I'll keep you posted in this space.
For the short term, Dick Gabriel shared some good advice. His approach is to
be open about what his company is doing, then out-engineer, out-market, and
generally out-perform the competition. Now that I think about it, that's good
advice for the long term as well.




































April, 1991
LETTERS


Designing Software Design


Dear DDJ,
It was with great enthusiasm that I read Mr. Kapor's article "A Software
Design Manifesto" (DDJ, January 1991). I found it particularly interesting
that Mr. Kapor recognizes that software design really crosses many
disciplines. I do a lot of research for my designs and I find myself looking
for human factors impacted by computers under social sciences,
software/hardware interfaces under electrical engineering, data structures
and software architecture under computer science, and systems design under
business administration. These disciplines, and others, factor into a
well-designed product, yet many are missing in academia. However, I do believe
that both the computer industry and colleges are changing their views on how
software should be designed, developed, and implemented. Recognizing the
difference between design and engineering is the first step. Once colleges and
companies begin to produce "software designers" trained to a level which Mr.
Kapor has outlined (or some similar level), software will begin to be user
friendly and may reach levels of reliability never before dreamed of by even
the most imaginative NASA engineer. Until then, we will continue to have
software that is marvelously engineered, but that most humans will be unable
or unwilling to use because it is design deficient.
As a final note, I would like to add a course in "maintainable design" to Mr.
Kapor's list of topics to be studied. It is a subject most companies would
rather avoid, even though in so doing companies are subjecting themselves to
tremendous redesign and reengineering as well as bad feelings among users. In
an industry driven not only by its own passion but the passion of its
customers as well, we cannot afford bad feelings.
Mike Maitland
Compusol Inc.
Camp Hill, Pennsylvania
Dear DDJ,
Thanks to Mitch Kapor for the article "A Software Design Manifesto" (DDJ,
January 1991). I have recently been complaining (to the wrong people, of
course) about the abysmal approach the mainstream software manufacturers take
to the user interface and other aspects of programming. Mitch Kapor's article
put into words much of what I've been feeling: Software design needs to be
taken more seriously.
Many of my coworkers are computer illiterate, or are just beginning to learn
how to use computers. The typical response I get from them is, "Oh, I don't
know what you mean. The program can do all sorts of neat things," or some
similar statement. They don't realize that the power hides behind a hideous
mask of controls. They believe inexperience is causing their trouble. I like
to warn them that experience does not always help. Proper software design
could change that.
Having written a handful of small special-purpose programs, I can understand
the discipline required to design a program. I am also much more demanding of
the programs I use than the normal user. I know better.
I am, unfortunately, a victim of the same thought process when I write code. I
spend 10 percent of my time designing the program, and 90 percent perfecting
the algorithm. I should (and will from now on) spend more time on the design
aspect. If I plan during my design process I could likely save considerable
time usually spent on modifying the design after the fact.
A software architect! Such a great idea.
John Sandlin
San Antonio, Texas
Dear DDJ,
Thank you for publishing Mitchell Kapor's article entitled "A Software Design
Manifesto," in the January 1991 issue. I applaud Mitchell Kapor's comments for
their enlightened thought, courage of conviction, accuracy, and timeliness.
Issues he has addressed needed public exhibition in order to shake up the
personal computer and software industries.
Mitchell and I are part of an imperceptibly small group in the software
engineering community that completely comprehends and embraces the "software
design viewpoint." Thus, we battle daily against software mediocrity. My
background in industrial design and consumer product development management is
the foundation for my venture into the software design and engineering area.
Consumer product development focuses on user requirements. Designers learn
sensitivity, adopt the user needs as their own, empathize with the user, and
become the user in order to understand the user's needs. Thus,
state-of-the-art technology is forced to conform to the humanness of the user.
Unfortunately, the opposite situation manifests itself in MIS departments,
software development companies, and consulting firms.
Today, those persons who relate to the mechanics of programming are given the
task of defining and addressing the humanness of the user. Therefore, the user
gets the expertise of the analyst/programmer's coded perception of the user's
needs. The depressing part of this situation is that neither the user nor the
analyst/programmer is aware that the software could be more than it is in its
present form. The reasons for this situation are:
First, analysts and programmers lack marketing or scientific research
experience; hence, they cannot ask the right questions of the user to get the
correct information for development.
Second, they cannot or do not communicate the versatility of the programming
language to the user.
Third, analysts and programmers lack time-and-motion, methods, human
factors, and perceptual analysis experience; thus they do not observe and
evaluate the user's activities as they relate to software requirements.
Fourth, I have sensed the following attitude among some software engineering
persons and some MIS departments: No one outside of our discipline
can contribute much to the software development process.
Fifth, society promotes the following myth: The personal computer and
programming are both difficult to learn and mysterious.
However, the user must assume some responsibility for software design failure.
Some users are afraid and are uncomfortable with computers or software
development. Today, the PC is important to productivity in the workplace, yet
employees still resist it. Typically, the attitude is: Don't teach me more
about computers than I need to know in order to get my job done. Because users
refuse to learn that little extra about the software, they continue to make
the same mistakes, which increases their frustration. If this computer phobia
could be overcome, users would make ideal candidates when polling for software
ideas.
Sometimes users don't appreciate the value of the data they possess. Since
users have only a cursory relationship with the computer, how can they ask for
software products that meet their needs?
Mr. Kapor's article supports a philosophy I have advocated for years, before I
founded Alexander PC Systems, a software engineering firm. When we begin a
software development project, the marketing and user research is completed
first. Then a design concept and matrix structure are created. The design
concept is user tested and perfected before the coding is started.
We are obligated to overthrow traditional software development methods. The
losses from mediocre software must cost millions of dollars each day. Current
practices need to be replaced with "user-focused software development," or in
Mr. Kapor's words, a "software design viewpoint."
Bryan R. Alexander
Morristown, New Jersey


Who's Got the Secrets?


Dear DDJ,
In his January "Structured Programming" column, Jeff Duntemann pokes at Turbo
Pascal's newly introduced "private" data fields and methods in Version 6.0,
and brings up the thorny "rose" of keeping programmers handcuffed and
blind-folded.
The myth is that this binding and blinding contributes to productivity. This
might be true if programmers were a subhuman lot incapable of focusing
attention on "a level at hand" without blinders and cuffs. Actually, it's
paranoid entrepreneurs who need programmers to write the code that is the
wealth they are stealing and who imagine their victims will steal it back if
they "see it whole." This "hiding" is simple divide and conquer. The
paranoid's bind: If the programmer knows what he's doing, he might go
elsewhere to do it. If he doesn't, he might not be able to do it there. Tough.
Actually, "access management" belongs in the operating instructions, not in
the compiler directives, where it directs the unthinking compiler to turn out
Rube Goldberg code.
Borland's use of Mandelbrot's "self-identity" is aesthetically pleasing.
Objects with their miniature "interface/implementation" (public/private)
notation are ...cute. But the havoc wreaked upon extensibility isn't an
occasional lost feature. It's even knowing what can and cannot be "overridden"
because of dependency on some proc near--because the privates are listed, but
the dependencies are not. In short, the whole tortuous interlacing of proc
nears and proc fars is the problem.
The solution--and it works the same for 5.5 and 6.0--follows from recognizing
that access management should be in the operating instructions, not in the
compiling process.
Nonparanoids might figure that the programmer whom the object is delivered to
has an interest in using it according to its design, and will look at the
directions for using it.
We can make those directions a bit deeper than the parameter list. In fact,
let's use "private" on the line identifying a field or method, and in
comments. A {pvt} by a field means "Look, when I get this for you, I'm going
to process it some, so if you grab it yourself you better know what you're
doing with it. And if you write in it, you're a purblind idiot."
And the {pvt} by a method says "This is set up for use by other methods; use
from outside can screw things up; and even overriding those other methods
requires checking for possible special instructions."
So, we can have it both ways. We can have public/private distinctions without
hopelessly tangling inter- and intrasegment calling realities, thereby forcing
a programmer to tiptoe among imminent system crashes.
Just hold in mind that access management belongs in the operating
instructions, not in the compiling. If you figure that you can't trust a
programmer to focus on the "level" that he's working on without "hiding" the
data and code for all other levels in a locked drawer, you probably shouldn't
be hiring programmers in the first place. Or vending code chunks to them. And
if a buyer wants to drive his new car around "out of tune," isn't that his or
her prerogative?
Crine Outloud
Berkeley, California



WIN Here and There


Dear DDJ,
The WINTHERE program you presented in the January 1991 issue looked very
interesting to me. Our company has a device driver which needs to print error
messages on occasion, and has trouble with Windows 3. Currently, the user must
tell the driver during installation if it will be running under Windows. The
problem is that since the device driver is part of DOS, it cannot call DOS
routines to print to the screen. All messages are printed instead by the BIOS.
Unfortunately, Windows overrides the BIOS, and while the BIOS thinks it has
written a message to the screen, it has in fact written to the bit bucket. If
we could tell that Windows was running, we would just default to the critical
error handler, and let Windows take care of it. The WINTHERE program promised
to do just that.
The 4680h call is not exactly what Ben Myers supposed. I had discovered this
call by disassembling, just as Ben had done. However, Windows changes the
multiplex interrupt during operation. The 4680h call is available within the
DOS window, or when a non-Windows application program is running. (In fact,
this is the only time under Windows when the BIOS can write to the screen.)
When Windows has the screen in graphics mode, this call is not available.
Having gone through that code too, I found no simple call under interrupt 2Fh
to check if Windows is installed.
Bill Hawkins
Winter Park, Florida
Ben replies: Thank you for pointing out that the interrupt mux call with
ax=4680h does not detect all cases of Windows 3.0 running in real (/R) or
standard (/S) mode. I admit that I did not spelunk into Windows with a Windows
app that displays real memory locations, though I just got Logitech's
Multiscope for DOS and Windows and it will permit me to do so very easily. I
did run DEBUG in a window using a PIF that told Windows 3.0 that DEBUG is
well-behaved in its use of the screen, which it is.
I am just as frustrated about the situation as you are, particularly since
Microsoft has not responded with any bulletproof solutions, badly needed for
TSRs, device drivers, disk defraggers, and other software. My interpretation
of the lack of response by Microsoft is that Windows 3.0 has a hole in this
area, one that will be addressed in a subsequent release, possibly Windows
3.1. Understand that this is my interpretation, since Microsoft is officially
mute on the subject. In the meantime, the only recourse is to do as you are
doing today--ask the user whether the driver will be running under Windows.
When a graphics app is running, even a well-disciplined old DOS app with an
appropriate PIF, Windows intercepts all video calls done with interrupt 10h,
and interprets them as it sees fit.
Another possibility might be to use int 21h, function 09h calls to display
error messages from within your device driver. DOS maintains two stacks for
int 21h calls. One stack is used for functions 00h through 0Ch, inclusive. The
other stack is used for the other function calls, including file opens,
closes, reads, and writes, which is what I assume your device driver does.


Back To the Future Again


Dear DDJ,
Having just received a diskette in the mail with 1 Mbyte of programs for my
HP-48 calculator, I was not expecting to surface for a week or so, but the
mailman brought the January issue of Dr. Dobb's today, and so I made the usual
exception and pored over the magazine with much haste so as to get back to
graphing and calculating with the little pocket machine.
The first thing that caught my attention was Mitch Kapor's article, in which
he makes reference to a "modern" programming language (C or Pascal). The
article overall is quite good, as is most of the rest of this issue, but Mr.
Kapor shoots himself in the foot with his biased and uninformed notion of a
modern programming language. I won't bore you with the details, but Bill
Gates, Ethan Winer of Crescent Software, and myself will gladly demonstrate to
anyone that you can still write programs in Basic that are faster, smaller,
more portable, easier to read and analyze, and even more powerful in terms of
advanced programming concepts than C or Pascal. Basic is both the People's
Language and the preferred language of top professionals like myself (pat,
pat), because of its widespread acceptance in the business and scientific
community as well as its proliferation in virtually every computer system
made, not to mention its inclusion free with practically every personal
computer sold today.
The second thing that caught my attention was Jim Warren's article, in which
he mentions that the first use of the term "personal computer" was in Rolling
Stone magazine in 1974. I have an HP Journal article from May 1974 describing
the HP-65 hand-held "personal computer," and I believe that, because of its
earlier cousins and the institution of the first "personal computer" user
group in 1974 based on the HP-65, it could make a better claim. Persons
who were associated with the Homebrew Computer Club in the Bay Area are
generally accorded special status in the press when it comes to pronouncements
on the origins of personal computing, and their views are almost never
contrasted with those of a somewhat different group of users whose association
begins in 1967 with the introduction of the HP-9100A machine, now termed a
"calculator," even though it had programmability, off-line storage, and a
built-in printer.
Personal computing for the masses took a leap in the direction of
nonprogrammability in early 1984 with the introduction of the Macintosh and
its mouse-driven interface, but for myself and a number of associates it took
on a much different tone at the same exact time with the introduction of the
HP-71 pocket computer. This tiny little machine had these capabilities way
back then: 80-Kbyte ROM-based operating system and language (with peripheral
interface), 1 Mbyte RAM of contiguous address space (my machine has 393,000
bytes free over and above the operating system and language), the ability to
be controlled by a standard 25 x 80 display terminal and keyboard, to name a
few. To input and output information to other computers and devices you would
use the Basic commands INPUT and OUTPUT. Ordinary nontechnical programmers
like myself could communicate with the world in a way that Macintosh owners
could not feasibly do because of the complications and expense involved. For
those individuals to whom 80 Kbytes of ROM was not enough, hundreds of
operating systems extensions could be loaded into the machine, providing
virtually every capability known to mankind.
One last comment about the future in software development: I don't believe we
will ever achieve a happy arrangement between developers, designers, users,
and programmers if programmers are working primarily in C (this also implies
the use of some assembler code), and the designers are having to be schooled
in C at least enough so they are familiar with the governing principles of its
use. We need a combination of factors to produce better software at reasonable
cost, with reasonable performance, and soon enough to meet a reasonable
demand. Some of these factors are:
1. A language that is powerful enough and efficient enough to do the job, but
not so cryptic as to be difficult to write and maintain code with. Basic 7.1
by Microsoft comes to mind, and Crescent's PDQ where size is critical. You
could think of a computer language as just a tool, and the operating system
itself as the more universal language through which various users communicate
with each other, much the same as large numbers of people of different
nationalities around the world communicate with each other in English today.
What you would be missing in this analogy is the prior experience of people
like myself who have used a language that is both the language and the
operating system of the computer, and therefore quite complementary to one
another. It is extremely difficult to do some things in MS-DOS alone that are
a snap in Basic, and the reverse is also true. When the two are combined (and
as MS-Basic progresses from 7.1, it appears to take on ever more of that
flavor), however, programming and productivity Nirvana are as nearly achieved
as is possible with today's technology.
2. Programmers who know how to use the language to its best ability, as
compared to the more usual situation where the programmer's skills are divided
and correspondingly diluted by their desire to work with many languages and/or
operating systems, or many variations of the same language and/or libraries
from multiple vendors. My experience with hundreds of programmers from years
in Encino, Beverly Hills, Pasadena, Santa Monica, and El Segundo has convinced
me that the vast majority of programmers are indifferent to considerations of
productivity, and their preference for cryptic and hard-to-use languages is
based not on the alleged power that these low-level dialects provide, but
rather on the existential pleasures they give to the programmer. It's time we
"just say no" to this nonsense.
3. A true standard library of functions as part of the language, much like the
HP Rocky-Mountain Basic or the PC version called "HP-Basic," where commands
actually change the collating order from ASCII to some other sequence (EBCDIC,
for example) and OUTPUT ... will output data in any format to any device or
memory variable without sweat or strain. Trust me when I say that a
programmer's ability to produce a lot of useable code is greatly enhanced when
he or she can get virtually all the capability they need rather than a melange
of material from several vendors.
4. A commitment from a major supplier to maintain a library of high-level
functions over a long period of time, with no runtime royalties, and with a
solid user-approved upgrade and revision policy. I am constantly amazed that
practically everywhere I go I see people looking for a way to port data (as an
example) from .DBF or .WKS files to proprietary systems and back again, where
the solution is not readily available to them that will interface to their
system. I have developed a set of small (100-line) sub-programs to read from
and write to these file types using a fixed-length ASCII format (like .PRN or
.SDF) for data interchange, where the SUBs can open, dump header information
and data, and close a file in a small fraction of a second on an average
AT-class computer, and the code uses only the simplest of Basic commands which
are amenable to even the ancient generic compilers and interpreters.
I guess that constitutes my manifesto, and I thank you for your attention.
Dale Thorn
Cleveland, Tennessee
































April, 1991
NEURAL NETS TELL WHY


A technique for explaining a neural network's decision-making process


 This article contains the following executables: CASEY.ARC


Casimir C. "Casey" Klimasauskas


Casimir C. "Casey" Klimasauskas is the founder of Neural-Ware Inc., a supplier
of neural network development systems and services. Prior to that he worked
extensively in machine vision and robotics. He can be reached at Penn Center
West IV-227, Pittsburgh, PA 15276; 412-787-8222.


One of the key areas in which neural networks and expert systems differ is in
how they internalize the information which they contain. In traditional expert
systems, information is kept in the form of rules. When an expert system
arrives at a conclusion, you can ask it to look back through its database of
rules and find out which ones were activated in order to reach the current
conclusion. In the case of a neural network, this kind of explanation is not
so easy to obtain -- because neural nets store information in the form of
"weights" or "synaptic strengths." Weights in a network have numerical rather
than symbolic values, making it hard to elicit a symbolic explanation.
But it is possible to learn more about how the various inputs affect the
current output of a neural network. This is accomplished by applying a
technique from the field of nonparametric statistical analysis called
"sensitivity analysis."
This article describes sensitivity analysis and how it can be applied to
"explaining" the thinking in a neural network. An example, complete with C
code, is provided and described in depth. The example is a complete,
self-contained neural net processor implementing the Back-Propagation model.
This article assumes you have some familiarity with neural nets and with the
Back-Propagation model. (For more information, see "Untangling Neural Nets" by
Jet Lawrence, DDJ April 1990.)


What is Sensitivity Analysis?


Sensitivity analysis looks at each of the inputs and determines how much and
what kind of impact a small change in inputs would have on the output. If a
small change in a particular input causes a drastic change in a particular
output, that input might be considered one of the key factors in the current
outcome.
The actual process of applying sensitivity analysis consists of starting with
the first input to the network and adding a small amount to it, recomputing
the outputs, subtracting the same small amount from the original input value,
recomputing the outputs, then taking the difference between the two sets of
outputs (the change in the output) and dividing it by the total amount the
input changed (twice the small amount). This results in M values, one for each
output. This process is repeated for each of the inputs to the network. The
result is an M x N matrix, where M is the number of outputs and N is the number
of inputs. The small amount by which the input is changed is called the
"dither."
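The procedure just described amounts to a central-difference estimate. What follows is a minimal sketch of it, not the article's listing: it assumes the network can be reached through a hypothetical function pointer (net_fn), and toy_net, sensitivity, and MAX_OUT are names invented here for illustration. It is written in ANSI C rather than the older style of Listing One.

```c
#include <stddef.h>

/* Hypothetical network interface: maps n_in inputs to n_out outputs. */
typedef void (*net_fn)(const double *in, size_t n_in,
                       double *out, size_t n_out);

/* Toy stand-in for a trained network (an assumption, for illustration):
 * out[0] = 2*in[0] - 3*in[1],  out[1] = in[0]*in[1].                   */
void toy_net(const double *in, size_t n_in, double *out, size_t n_out)
{
    (void)n_in;
    (void)n_out;
    out[0] = 2.0 * in[0] - 3.0 * in[1];
    out[1] = in[0] * in[1];
}

#define MAX_OUT 16   /* scratch size chosen for this sketch */

/* Fill sens[] (row-major, n_out rows by n_in columns) with
 * (output at input+dither - output at input-dither) / (2*dither).
 * Returns 0 on success, -1 on bad arguments. */
int sensitivity(net_fn net, double *in, size_t n_in, size_t n_out,
                double dither, double *sens)
{
    double up[MAX_OUT], down[MAX_OUT];
    size_t i, j;

    if (n_out > MAX_OUT || dither == 0.0)
        return -1;

    for (i = 0; i < n_in; i++) {
        double saved = in[i];

        in[i] = saved + dither;          /* dither upward */
        net(in, n_in, up, n_out);
        in[i] = saved - dither;          /* dither downward */
        net(in, n_in, down, n_out);
        in[i] = saved;                   /* restore the nominal input */

        for (j = 0; j < n_out; j++)
            sens[j * n_in + i] = (up[j] - down[j]) / (2.0 * dither);
    }
    return 0;
}
```

Row j, column i of sens holds the sensitivity of output j to input i, which is the M x N layout described above. To apply the sketch to a real network, a thin wrapper around the network's recall routine would take the place of toy_net.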
As an aside, this basic technique is used internally in a variety of neural
networks to determine which processing element should be modified. In
particular, it is used in certain variants of the Madaline, and also in
certain network optimization techniques.
As an example of how sensitivity analysis works, suppose a neural network has
been trained to determine credit worthiness. The inputs to this network are
information from a credit application, including income, expenses, number of
credit cards, own/rent, and so on. For the current applicant, say the output
of the network is 0.3 (anything below 0.5 means "deny"). Table 1 shows the
effect of a small change in each input on the output.
Table 1. Effect of changing various input variables on the current output of a
neural network trained to do credit approval. The nominal output for the
current set of inputs is 0.3.

 Input                 New Output    Change in Output
 ----------------------------------------------------
 Income up 10%            .60             +.30
 Income down 10%          .15             -.15
 Expenses up 10%          .25             -.05
 Expenses down 10%        .40             +.10
 rent --> own             .31             +.01

Which of the inputs would you have to change least to switch the current
"deny" (output = 0.3) to an "accept"? The easiest would be to require the
applicant to increase his or her income by 10 percent, resulting in an
"accept" (output = 0.6). As to why credit was denied, the primary reason is
that the applicant's income is too low. From Table 1, it makes almost no
difference if the person were to go from renting to owning a house. On the
other hand, a secondary reason for denying credit might be that the
applicant's expenses are too high. If expenses were reduced by 20 to 30 percent
(rather than 10 percent), this might be enough to raise the credit evaluator
output over the accept/deny threshold of 0.5.
It is very important to note that the way in which each of these input
variables affects the output is dependent on their current values. For another
applicant, the most important factor might be the rent/own status.
In many ways, this is analogous to how individuals face decisions. When it is
difficult to make a decision between two courses of action, it is usually
because there is one particular factor (though there may be more) which is
right on the border line. If it were a little more one way or the other, the
decision would be clear cut.
For example, my company (NeuralWare) offers an introductory software package
for $99.00 as well as a sophisticated software package for $1895.00. Would you
spend $99.00 for a software package to learn about neural networks? I would,
almost without thinking (buy level = 1.0). Would I spend $1895.00? Probably
not, unless I had a specific application and wanted to make sure I had the
best tools possible to ensure my success (education buy level = 0.1,
application buy level = 0.8). Would I spend $200.00? Maybe (buy level = 0.5).
$500.00? Not unless I really, really wanted it (buy level = 0.3).
So, faced with the decision as to how much I would pay to purchase a software
package to learn about neural networks, the higher the price, the less I would
be willing to purchase it. Below $150.00, I would buy it without a thought.
Over that, I need additional compelling motivation.
But if I had a particular application in mind, would I pay $1895 for the
product? Yes, based on appropriate capabilities (buy level = 0.8). $3,000?
Yes, again based on appropriate capabilities (buy level = 0.7). $500? Maybe,
though I might want to look very carefully at product reviews (buy level =
0.5). Why would I feel more comfortable paying $2,000 or $3,000 for a product
to solve a particular application? Because I know from experience that good
tools cost money. Inexpensive ones often cost more in wasted effort than the
money saved (with some notable exceptions).
Look carefully at Table 2. Notice that a product purchased for education is
very price-sensitive when the price is over $150. When purchasing a tool for
application development, above a certain price threshold, other factors become
more important and price is not an issue. One key observation from this table
is that the way in which price affects my willingness to purchase is dependent
on my purpose for purchasing the product. This is true of sensitivity analysis
in general. The resulting changes in the output are valid only for the current
set of inputs. The fact that a small change in a particular input has no
effect on the output does not mean that the input is unimportant. Combined
with some other set of inputs, a small change in it may have a very large
impact on the output.
Table 2: Change in buy levels. The amount you are willing to spend depends
upon the purpose and on its price. The change in buy level is estimated.
Notice that the sensitivity of the situation to a particular input (price)
depends on the other inputs (purpose). Sensitivity is the percentage change in
output divided by the percentage change in input.

 Purpose        Cost in $   Buy Level   Change in Buy Level       Sensitivity
                                        for 10% Change in Price
 ----------------------------------------------------------------------------
 Education          99.00      1.0             0.00                   0.0
 Education         150.00      1.0            -0.20                  -2.0
 Education         200.00      0.5            -0.10                  -1.0
 Education         500.00      0.3            -0.05                  -0.5
 Education       1,895.00      0.1            -0.01                  -0.1

 Application        99.00      0.1            +0.10                  +1.0
 Application       500.00      0.5            +0.10                  +1.0
 Application     1,895.00      0.8            -0.01                  -0.1
 Application     3,000.00      0.7            -0.01                  -0.1
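
The Sensitivity column of Table 2 appears to follow directly from the column before it: each entry equals the change in buy level divided by the fractional price change of 0.10. A minimal sketch of that arithmetic follows; the function name is invented here for illustration.

```c
/* Sensitivity as tabulated in Table 2: change in output divided by the
 * fractional change in input (0.10 for a 10 percent price change).
 * The function name is illustrative and does not come from the listing. */
double table_sensitivity(double change_in_output,
                         double fractional_change_in_input)
{
    return change_in_output / fractional_change_in_input;
}
```

For the Education row at $150.00, a change of -0.20 for a 10 percent price change gives -0.20 / 0.10 = -2.0, matching the table.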



The Example Program


Listing One (page 78) shows a neural network program designed to train a
Back-Propagation network and then have it "explain" itself using sensitivity
analysis. A main routine selects which function to perform and allows the
network and explanations to be printed to a file for later inspection. Because
of space constraints, the complete network (network.net) is provided
electronically along with the program source and executable (see
"Availability," page 3). The program was written in Zortech C, Version 2.1,
but should compile readily under Microsoft C or Turbo C.
The program is organized in a fairly straightforward fashion. Utility routines
for loading, saving, deleting, creating, and printing the network are at the
beginning. These are invoked directly from the main program. The routine
Recall is used to perform a single execution of the network to determine its
response to a set of inputs. Learn performs a single update of the network
weights. The standard Delta-Rule with Momentum is used for training the
weights (see Rumelhart and McClelland's Parallel Distributed Processing,
volume 1, chapter 8 for more details). Train repeatedly invokes Learn, passing
it examples from the static training set (testE[ ]). To ensure that all of the
data examples are presented, a "shuffle and deal" strategy is used. This is
implemented in NextTestN, which creates a shuffled list of examples, deals it
out to the test program, and reshuffles when the stack is empty.
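The "shuffle and deal" idea can be sketched independently of the listing. The code below is not the listing's NextTestN. The names next_example, deck, and N_EXAMPLES are invented for illustration, and the training-set size of 8 is an assumption.

```c
#include <stdlib.h>

#define N_EXAMPLES 8          /* assumed training-set size for this sketch */

static int deck[N_EXAMPLES];  /* shuffled example indices */
static int remaining = 0;     /* cards left to deal */

/* Return the index of the next training example. Every index in
 * 0..N_EXAMPLES-1 is dealt exactly once before the deck is reshuffled. */
int next_example(void)
{
    int i, j, tmp;

    if (remaining == 0) {
        for (i = 0; i < N_EXAMPLES; i++)
            deck[i] = i;

        /* Fisher-Yates shuffle; rand() % n has a slight bias,
         * which is acceptable for a sketch */
        for (i = N_EXAMPLES - 1; i > 0; i--) {
            j = rand() % (i + 1);
            tmp = deck[i];
            deck[i] = deck[j];
            deck[j] = tmp;
        }
        remaining = N_EXAMPLES;
    }
    return deck[--remaining];
}
```

Compared with picking an example at random on every call, dealing from a shuffled deck guarantees that no example is skipped within a pass through the training set.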
(See the accompanying text box for details on running the program. Every
precaution has been taken to make the program as easy to compile as possible
under other C compilers. From experience, there should be no problem with
Microsoft C or Turbo C compilers.)


Analyzing the Results


The network was trained until the root-mean-square error was less than 0.001.
Table 4(a) shows the output of the network over the range of inputs 1 and 2.
This corresponds very well to the training inputs in the program (testE[ ]
array). The training set was designed to make input 3 "random." If it were
completely random, the weights connected to it should be close to 0. Table 3
is a printout of the network. Notice that the magnitudes of the weights
associated with input 3 (highlighted in the table) are almost all very small
in comparison to the weights for the other inputs to each processing element.
This means that input 3 will typically have little effect on the output of the
network. Processing element 0 is the "bias" term, and its output is always
1.0.
Table 3: The weights in the trained network. The highlighted items are the
weights associated with input 3. Input 3 was designed to be more or less
"random." As a result, the weights associated with it are generally small
relative to the other weights for each Processing Element (PE).

 Layer 0 with 3 PEs
 1 1abb:5b12 PE Output= 0.000 Error= 0.000 WorkR= 0.000 NConns=0
 2 1abb:5b32 PE Output= 0.000 Error= 0.000 WorkR= 0.000 NConns=0
 3 1abb:5b52 PE Output= 0.000 Error= 0.000 WorkR= 0.000 NConns=0

 Layer 1 with 5 PEs
 4 1abb:5b72 PE Output= 0.000 Error= 0.000 WorkR= 0.000 NConns=4
 Src= 0 1abb:00d6 Weight= 1.591 Delta Wt= 0.000
 Src= 1 1abb:5b12 Weight= 3.830 Delta Wt= 0.000
 Src= 2 1abb:5b32 Weight= -16.104 Delta Wt= 0.000
 Src= 3 1abb:5b52 Weight= 0.486 Delta Wt= 0.000
 5 1abb:5bba PE Output= 0.000 Error= 0.000 WorkR= 0.000 NConns=4
 Src= 0 1abb:00d6 Weight= 14.082 Delta Wt= 0.000
 Src= 1 1abb:5b12 Weight= -15.446 Delta Wt= 0.000
 Src= 2 1abb:5b32 Weight= -16.720 Delta Wt= 0.000
 Src= 3 1abb:5b52 Weight= 2.472 Delta Wt= 0.000
 6 1abb:5c02 PE Output= 0.000 Error= 0.000 WorkR= 0.000 NConns=4
 Src= 0 1abb:00d6 Weight= 3.883 Delta Wt= 0.000
 Src= 1 1abb:5b12 Weight= 2.332 Delta Wt= 0.000
 Src= 2 1abb:5b32 Weight= -20.745 Delta Wt= 0.000
 Src= 3 1abb:5b52 Weight= 1.366 Delta Wt= 0.000
 7 1abb:5c4a PE Output= 0.000 Error= 0.000 WorkR= 0.000 NConns=4
 Src= 0 1abb:00d6 Weight= -1.377 Delta Wt= 0.000
 Src= 1 1abb:5b12 Weight= -3.519 Delta Wt= 0.000
 Src= 2 1abb:5b32 Weight= -3.751 Delta Wt= 0.000
 Src= 3 1abb:5b52 Weight= 0.545 Delta Wt= 0.000
 8 1abb:5c92 PE Output= 0.000 Error= 0.000 WorkR= 0.000 NConns=4
 Src= 0 1abb:00d6 Weight= 4.869 Delta Wt= 0.000
 Src= 1 1abb:5b12 Weight= -10.058 Delta Wt= 0.000
 Src= 2 1abb:5b32 Weight= -8.766 Delta Wt= 0.000
 Src= 3 1abb:5b52 Weight= 3.108 Delta Wt= 0.000

 Layer 2 with 2 PEs
 9 1abb:5cda PE Output= 0.000 Error= 0.000 WorkR= 0.000 NConns=9
 Src= 0 1abb:00d6 Weight= -20.434 Delta Wt= 0.000
 Src= 1 1abb:5b12 Weight= 46.298 Delta Wt= 0.000
 Src= 2 1abb:5b32 Weight= -2.895 Delta Wt= 0.000
 Src= 3 1abb:5b52 Weight= -0.492 Delta Wt= 0.000
 Src= 4 1abb:5b72 Weight= -21.351 Delta Wt= 0.000
 Src= 5 1abb:5bba Weight= 20.866 Delta Wt= 0.000
 Src= 6 1abb:5c02 Weight= -26.287 Delta Wt= 0.000
 Src= 7 1abb:5c4a Weight= 1.304 Delta Wt= 0.000
 Src= 8 1abb:5c92 Weight= 8.782 Delta Wt= 0.000
 10 1abb:5d54 PE Output= 0.000 Error= 0.000 WorkR= 0.000 NConns=9
 Src= 0 1abb:00d6 Weight= 20.434 Delta Wt= 0.000
 Src= 1 1abb:5b12 Weight= -46.298 Delta Wt= 0.000
 Src= 2 1abb:5b32 Weight= 2.896 Delta Wt= 0.000
 Src= 3 1abb:5b52 Weight= 0.493 Delta Wt= 0.000
 Src= 4 1abb:5b72 Weight= 21.338 Delta Wt= 0.000
 Src= 5 1abb:5bba Weight= -20.865 Delta Wt= 0.000
 Src= 6 1abb:5c02 Weight= 26.300 Delta Wt= 0.000
 Src= 7 1abb:5c4a Weight= -1.292 Delta Wt= 0.000
 Src= 8 1abb:5c92 Weight= -8.783 Delta Wt= 0.000
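
One rough way to put a number on the observation about input 3 is to average the absolute weights that each input feeds into Layer 1. The sketch below copies those weights from Table 3 (bias weights omitted). The names layer1_w and mean_abs_weight are invented for illustration, and weight magnitude alone is only a coarse indicator of influence.

```c
#define N_HIDDEN 5
#define N_INPUTS 3

/* Layer 1 weights copied from Table 3 (bias weights omitted).
 * Row = hidden PE 4..8, column = input PE 1..3. */
static const double layer1_w[N_HIDDEN][N_INPUTS] = {
    {   3.830, -16.104, 0.486 },
    { -15.446, -16.720, 2.472 },
    {   2.332, -20.745, 1.366 },
    {  -3.519,  -3.751, 0.545 },
    { -10.058,  -8.766, 3.108 }
};

/* Mean absolute Layer 1 weight for one input (0-based column index). */
double mean_abs_weight(int input)
{
    double sum = 0.0;
    int pe;

    for (pe = 0; pe < N_HIDDEN; pe++) {
        double w = layer1_w[pe][input];
        sum += (w < 0.0) ? -w : w;
    }
    return sum / N_HIDDEN;
}
```

With the Table 3 figures, the mean absolute weight is about 7.0 for input 1, about 13.2 for input 2, and about 1.6 for input 3. The sketch ignores the direct input-to-output connections that also appear in Layer 2 of the table.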

Table 4(b), Table 4(c), and Table 4(d) show the sensitivity of the output to
small changes in the first, second, and third variables, respectively. The tables
range over the possible input values (input 1 along the x-axis; input 2 along
the y-axis). Notice from these tables that the output is most sensitive when
the particular variable is near a border. If you think about it a moment, this
makes sense. In a region away from a border, small changes in the inputs will
have little effect on the output. Only near a border, where the output changes
from one state to another, will the appropriate input have a large impact.
This suggests some modifications to the program. In particular, if a small
change in each of the inputs produces little or no change in the outputs,
increase the magnitude of the change in input until the change is significant.
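One possible form of that modification is sketched below. It is not part of the listing: unary_net, toy_edge, and adaptive_dither are hypothetical names, the doubling schedule is an arbitrary choice, and the toy function is a made-up ramp standing in for a network output viewed as a function of one input.

```c
/* Hypothetical view of one network output as a function of one input,
 * with all other inputs held at their current values. */
typedef double (*unary_net)(double x);

/* Toy stand-in (an assumption): 0 below 0.45, 1 above 0.55,
 * and a linear ramp in between. */
double toy_edge(double x)
{
    if (x <= 0.45) return 0.0;
    if (x >= 0.55) return 1.0;
    return (x - 0.45) / 0.10;
}

/* Double the dither until the output changes by at least min_change
 * between x - dither and x + dither. Returns the dither that was needed,
 * or -1.0 if none up to max_dither produced a significant change. */
double adaptive_dither(unary_net f, double x, double start_dither,
                       double max_dither, double min_change)
{
    double d, diff;

    if (start_dither <= 0.0)
        return -1.0;

    for (d = start_dither; d <= max_dither; d *= 2.0) {
        diff = f(x + d) - f(x - d);
        if (diff < 0.0)
            diff = -diff;
        if (diff >= min_change)
            return d;
    }
    return -1.0;
}
```

Near the toy border (x = 0.5) the starting dither of 0.01 is already enough. Far from it (x = 0.1) the dither has to grow to 0.64 before the output moves.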
In Table 4(b), which looks at the sensitivity of the output to a change in
input 1, notice that the output is most sensitive when input 1 is in the range
0.45 to 0.55 and input 2 is in the range 0.7 to 1.0. This is where the output
of the network makes the transition from 0 to 1. Likewise, in Table 4(c), the
output is most sensitive to input 2 when input 2 is in the ranges 0.25 to 0.35
and 0.65 to 0.8. These are where the network makes a transition from 0 to 1
and from 1 to 0 along the input 2 axis.
In looking at Table 4(b), Table 4(c), and Table 4(d), you will notice that the
output is always sensitive to the input when there is a transition. However,
the training set has very sharp boundaries and only one variable changes at
each of them. This might lead you to suspect some kind of problem. Actually,
the problem results from certain basic characteristics of how Back-Propagation
creates its output. It takes a series of humps and fits them together to
"construct" the resulting mapping from inputs to outputs. As a result, the
borders are a little lumpy, making the output somewhat sensitive to all of the
inputs.
Another aside. Look back at the first example of credit approval. The changes
to the input variables were as a percentage of the unpreprocessed inputs.
Depending on how those input variables were transformed, this may have
represented a larger or smaller change in one or more actual inputs to the
network. In actual situations, it is important to vary the raw input data and
measure the changes in the output based on that.
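To illustrate why the raw data should be varied, the sketch below assumes a simple linear scaling of a raw value into the range 0 to 1. The scaling, the function names, and the income bounds used in the note that follows are all made up for illustration. The article does not say how the credit inputs were preprocessed.

```c
/* Hypothetical preprocessing: linear scaling of a raw value
 * from the range [lo, hi] to [0, 1]. */
double scale_input(double raw, double lo, double hi)
{
    return (raw - lo) / (hi - lo);
}

/* Change in the scaled network input produced by a given
 * percentage change in the raw value. */
double scaled_change(double raw, double percent, double lo, double hi)
{
    double changed = raw * (1.0 + percent / 100.0);
    return scale_input(changed, lo, hi) - scale_input(raw, lo, hi);
}
```

With made-up income bounds of 0 and 100,000, a 10 percent rise moves the scaled input by 0.03 for a raw income of 30,000 but by 0.06 for a raw income of 60,000. The same percentage change in the raw data is therefore not the same dither at the network input.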


For Further Investigation


The program as supplied contains several opportunities for further
exploration. As described earlier, dynamically making the dither larger
provides interesting effects. To a certain degree, a level of confidence can
be derived from how large the dither is. A small dither resulting in large
sensitivity means a less confident conclusion than no sensitivity at all. In
the program, the effects of a positive and negative dither have been averaged.
Another modification would be to report the change in each direction
separately.
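A sketch of reporting the two directions separately follows. It is an illustration rather than a change to the listing: unary_fn, toy_square, directional_sens, and directional are invented names, and the toy function stands in for a single network output as a function of a single input.

```c
/* Hypothetical view of one network output as a function of one input. */
typedef double (*unary_fn)(double x);

/* Toy stand-in (an assumption): responds more strongly to increases
 * than to decreases around x = 1. */
double toy_square(double x)
{
    return x * x;
}

/* One-sided sensitivities, reported separately. */
typedef struct {
    double up;    /* (f(x + d) - f(x)) / d */
    double down;  /* (f(x) - f(x - d)) / d */
} directional_sens;

directional_sens directional(unary_fn f, double x, double d)
{
    directional_sens s;
    double nominal = f(x);

    s.up   = (f(x + d) - nominal) / d;
    s.down = (nominal - f(x - d)) / d;
    return s;
}
```

For the toy function at x = 1 with a dither of 0.1, the upward sensitivity is 2.1 and the downward sensitivity is 1.9. Their average, 2.0, is the single figure that an averaged estimate would report.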
Setting a higher error threshold on training results in fuzzier boundaries.
This tends to make the network less sensitive to all inputs. It also creates
additional "anomalous" regions where the "bumps" which Back-Propagation uses
have not fit together.
The data set which is hard-coded into the program was designed specifically to
illustrate how sensitivity analysis works. Try constructing other data sets
which have fuzzier and nonrectangular boundaries.


The Example Program


Here are the steps in running the example program. The following sequence will
create the network.exe program and replicate the tables shown in this article.

 1. Compile the source program "network.c" (I used
 Zortech C, Version 2.1, though Microsoft C or Turbo
 C should work as well).

 a. If you have a math coprocessor:
 C> ztc -ms -f -o+all network.c
 b. If you do not have a math coprocessor:
 C> ztc -ms -o+all network.c
 Due to intermediate round-off errors, there will be
 slight differences in the operation of the program
 created with inline floating point or software
 floating point. The tables in this article were produced
 with software floating point.

 2. Run the program: C> network
 3. Create a new network: What do you want to do? c
 ... network is created.
 4. Train the network (this took 3 hours on a Toshiba
 5200): What do you want to do? t ... displays
 messages about the current "pass."
 5. Save the network to "network.net" (default): What
 do you want to do? s
 6. Print out the network weights: What do you want
 to do? p network.lst

 7. Analyze how sensitive the network is to various
 inputs: What do you want to do? e network.exp

 8. Quit: What do you want to do? x


_NEURAL NETS TELL WHY_
by Casimir C. "Casey" Klimasauskas


[LISTING ONE]

/* network.c -- Backprop network with explain function */

/************************************************************************
 * *
 * Explain How a Neural Network "thinks" using Sensitivity *
 * Analysis *
 * *
 ************************************************************************
 This is a program designed to implement a back-propagation network
 and show how sensitivity analysis works to "explain" the network's
 reasoning.
 */

#include <stdio.h> /* file & printer support */
#include <stdlib.h> /* malloc, rand, RAND_MAX & other things */
#include <math.h> /* exp() */
#include <string.h> /* memset() */
#include <dos.h> /* FP_SEG(), FP_OFF() */

#define MAX_LAYER 4 /* maximum number of layers */
#define MAX_PES 50 /* maximum number of PEs */

/* --- locally required defines --- */

#define WORD(x) (((short *)(&x))[0])
#define MAX(x,y) ((x)>(y)?(x):(y))
typedef float WREAL; /* work real */

/* --- Connection structure --- */

typedef struct _conn { /* connection */
 struct _pe *SourceP; /* source pointer */
 WREAL WtValR; /* weight value */
 WREAL DWtValR; /* delta weight value */
} CONN;

typedef struct _pe { /* processing element */
 struct _pe *NextPEP; /* next PE in layer */
 WREAL OutputR; /* output of PE */
 WREAL ErrorR; /* work area for error */
 WREAL WorkR; /* work area for explain */
 int PEIndexN; /* PE index (ordinal) */
 int MaxConnsN; /* maximum number of connections */
 int NConnsN; /* # of connections used */
 CONN ConnsS[1]; /* connections to this PE */
} PE;


PE *LayerTP[MAX_LAYER+1] = {0}; /* pointer to PEs in layer */
int LayerNI[MAX_LAYER+1] = {0}; /* # of items in each layer */

PE *PEIndexP[MAX_PES] = {0}; /* index into PEs */
int NextPEXN = {0}; /* index of next free PE */

PE PEBias = { 0, 1.0, 0.0, 0, }; /* "Bias" PE */


/************************************************************************
 * *
 * RRandR() - compute uniform random number over a range *
 * *
 ************************************************************************
 */

double RRandR( vR ) /* random value over a range */
double vR; /* range magnitude */
{
 double rvR; /* return value */

 /* compute random value in range 0..1 */
 rvR = ((double)rand()) / (double)RAND_MAX;

 /* rescale to range -vR..vR */
 rvR = vR * (rvR + rvR - 1.0);

 return( rvR );
}

/************************************************************************
 * *
 * AllocPE() - Allocate a PE dynamically *
 * *
 ************************************************************************
 */

PE *AllocPE( peXN, MaxConnsN ) /* allocate A PE dynamically */
int peXN; /* index of PE (0=auto) */
int MaxConnsN; /* max number of connections */
{
 PE *peP; /* pointer to PE allocated */
 int AlcSize; /* size to allocate */

 if ( NextPEXN == 0 ) {
 PEIndexP[0] = &PEBias; /* bias PE */
 NextPEXN++;
 }

 if ( peXN == 0 )
 peXN = NextPEXN++;
 else if ( peXN >= NextPEXN )
 NextPEXN = peXN+1;

 if ( peXN < 0 || MAX_PES <= peXN ) {
 printf( "Illegal PE number to allocate: %d\n", peXN );
 exit( 1 );
 }


 if ( PEIndexP[peXN] != (PE *)0 ) {
 printf( "PE number %d is already in use\n", peXN );
 exit( 1 );
 }

 AlcSize = sizeof(PE) + MaxConnsN*sizeof(CONN);
 peP = (PE *)malloc( AlcSize );
 if ( peP == (PE *)0 ) {
 printf( "Could not allocate %d bytes for PE number %d\n",
 AlcSize, peXN );
 exit( 1 );
 }

 memset( (char *)peP, 0, AlcSize );
 peP->MaxConnsN = MaxConnsN+1; /* max number of connections */
 peP->PEIndexN = peXN; /* self index for load/save */
 PEIndexP[peXN] = peP; /* key for later */

 return( peP );
}


/************************************************************************
 * *
 * AllocLayer() - Dynamically allocate PEs in a layer *
 * *
 ************************************************************************
 */

int AllocLayer( LayN, NPEsN, NConnPPEN ) /* allocate a layer */
int LayN; /* layer number */
int NPEsN; /* # of PEs in layer */
int NConnPPEN; /* # of connections per PE */
{
 PE *peP; /* PE Pointer */
 PE *apeP; /* alternate PE pointer */
 int wxN; /* general counter */

 /* Sanity check */
 if ( LayN < 0 || sizeof(LayerTP)/sizeof(LayerTP[0]) <= LayN ) {
 printf( "Layer number (%d) is out of range\n", LayN );
 exit( 1 );
 }

 /* Allocate PEs in the layer & link them together */
 LayerNI[LayN] = NPEsN;
 for( wxN = 0; wxN < NPEsN; wxN++, apeP = peP ) {
 peP = AllocPE( 0, NConnPPEN+1 ); /* allocate next PE */
 if ( LayerTP[LayN] == (PE *)0 ) {
 LayerTP[LayN] = peP; /* insert table pointer */
 } else {
 apeP->NextPEP = peP; /* link forward */
 }
 }

 return( 0 );
}


/************************************************************************
 * *
 * FreeNet() - Free network memory *
 * *
 ************************************************************************
 */

int FreeNet()
{
 int wxN; /* work index */
 char *P; /* work pointer */

 for( wxN = 1; wxN < MAX_PES; wxN++ ) {
 if ( (P = (char *)PEIndexP[wxN]) != (char *)0 )
 free( P );
 PEIndexP[wxN] = (PE *)0;
 }
 NextPEXN = 0;

 for( wxN = 0; wxN < MAX_LAYER; wxN++ ) {
 LayerTP[wxN] = (PE *)0;
 LayerNI[wxN] = 0;
 }

 return( 0 );
}

/************************************************************************
 * *
 * SaveNet() - Save Network *
 * *
 ************************************************************************
 */

int SaveNet( fnP ) /* save network */
char *fnP; /* name of file to save */
{
 int wxN; /* work index */
 FILE *fP; /* file pointer */
 PE *peP; /* PE pointer for save */
 CONN *cP; /* connection pointer */
 int ConnXN; /* connection index */

 if ( NextPEXN <= 1 )
 return( 0 ); /* nothing to do */

 if ( (fP = fopen(fnP, "w")) == (FILE *)0 ) {
 printf( "Could not open output file <%s>\n", fnP );
 return( -1 );
 }

 /* --- save all of the PEs --- */
 for( wxN = 1; wxN < NextPEXN; wxN++ ) {
 peP = PEIndexP[wxN];
 if ( peP == (PE *)0 ) {
 fprintf( fP, "%d 0 PE\n", wxN );
 continue;
 }
 fprintf( fP, "%d %d PE\n", wxN, peP->NConnsN );

 for( ConnXN = 0; ConnXN < peP->NConnsN; ConnXN++ ) {
 cP = &peP->ConnsS[ConnXN];
 fprintf( fP, "%d %.6f %.6f\n",
 cP->SourceP->PEIndexN, cP->WtValR, cP->DWtValR );
 }
 }
 fprintf( fP, "%d %d END OF PES\n", -1, 0 );

 /* --- save information about how layers are assembled --- */
 for( wxN = 0; wxN < MAX_LAYER; wxN++ ) {
 if ( (peP = LayerTP[wxN]) == (PE *)0 )
 continue;
 fprintf( fP, "%d LAYER\n", wxN );
 do {
 fprintf( fP, "%d\n", peP->PEIndexN );
 peP = peP->NextPEP;
 } while( peP != (PE *)0 );
 fprintf( fP, "-1 End Layer\n" ); /* end of layer */
 }
 fprintf( fP, "-1\n" ); /* no more layers */

 fclose( fP );

 return( 0 );
}

/************************************************************************
 * *
 * LoadNet() - Load Network *
 * *
 ************************************************************************
 */

int LoadNet( fnP ) /* load a network file */
char *fnP; /* file name pointer */
{
 int wxN; /* work index */
 FILE *fP; /* file pointer */
 PE *peP; /* PE pointer for save */
 PE *lpeP; /* last pe in chain */
 int LayN; /* layer number */
 CONN *cP; /* connection pointer */
 int ConnXN; /* connection index */
 int PEN; /* PE number */
 int PENConnsN; /* # of connections */
 int StateN; /* current state 0=PEs, 1=Layers */
 float WtR, DWtR; /* weight & delta weight */
 char BufC[80]; /* work buffer */

 fP = (FILE *)0;
 if ( (fP = fopen( fnP, "r" )) == (FILE *)0 ) {
 printf( "Could not open input file <%s>\n", fnP );
 return( -1 );
 }

 FreeNet(); /* release any existing network */

 StateN = 0;
 while( fgets( BufC, sizeof(BufC)-1, fP ) != (char *)0 ) {

 switch( StateN ) {
 case 0: /* PEs */
 sscanf( BufC, "%d %d", &PEN, &PENConnsN );
 if ( PEN < 0 ) {
 StateN = 2;
 break;
 }

 peP = AllocPE( PEN, PENConnsN ); /* allocate PE */
 cP = &peP->ConnsS[0]; /* Pointer to Conns */
 ConnXN = PENConnsN;
 if ( ConnXN > 0 )
 StateN = 1; /* scanning for conns */
 break;

 case 1: /* PE Connections */
 sscanf( BufC, "%d %f %f", &PEN, &WtR, &DWtR );
 WORD(cP->SourceP) = PEN;
 cP->WtValR = WtR;
 cP->DWtValR = DWtR;
 cP++; /* next connection area */
 peP->NConnsN++; /* count connections */
 if ( --ConnXN <= 0 )
 StateN = 0; /* back to looking for PEs */
 break;

 case 2: /* Layer data */
 sscanf( BufC, "%d", &LayN );
 StateN = 3;
 if ( LayN < 0 )
 goto Done;
 lpeP = (PE *)&LayerTP[LayN];
 break;

 case 3: /* layer items */
 sscanf( BufC, "%d", &PEN );
 if ( PEN < 0 ) {
 StateN = 2;
 break;
 }

 LayerNI[LayN]++; /* update # of PEs */
 peP = PEIndexP[PEN]; /* point to PE */
 lpeP->NextPEP = peP; /* forward chain */
 lpeP = peP;
 break;
 }
 }

Done:
 /* go through ALL PEs and convert PE index to pointers */
 for( wxN = 1; wxN < MAX_PES; wxN++ ) {
 if ( (peP = PEIndexP[wxN]) == (PE *)0 )
 continue;

 for( ConnXN = peP->NConnsN, cP = &peP->ConnsS[0];
 --ConnXN >= 0;
 cP++ ) {
 cP->SourceP = PEIndexP[ WORD(cP->SourceP) ];

 }
 }

 if ( fP ) fclose( fP );
 return( 0 );

ErrExit:
 if ( fP ) fclose( fP );
 FreeNet();
 return( -1 );
}

/************************************************************************
 * *
 * PrintNet() - Print out Network *
 * *
 ************************************************************************
 */

int PrintNet( fnP ) /* print out network */
char *fnP; /* file to print to (append) */
{
 FILE *fP; /* file pointer */
 PE *dpeP; /* destination PE */
 int layerXN; /* layer index */
 CONN *cP; /* connection pointer */
 int ConnXN; /* connection index */

 if ( *fnP == '\0' ) {
 fP = stdout;
 } else {
 if ( (fP = fopen( fnP, "a" )) == (FILE *)0 ) {
 printf( "Could not open print output file <%s>\n", fnP );
 return( -1 );
 }
 }

 for( layerXN = 0; (dpeP = LayerTP[layerXN]) != (PE *)0; layerXN++ ) {
 fprintf( fP, "\nLayer %d with %d PEs\n", layerXN, LayerNI[layerXN] );

 for(; dpeP != (PE *)0; dpeP = dpeP->NextPEP ) {
 fprintf( fP,
 " %2d %04x:%04x PE Output=%6.3f Error=%6.3f WorkR=%6.3f NConns=%d\n",
 dpeP->PEIndexN, FP_SEG(dpeP), FP_OFF(dpeP),
 dpeP->OutputR, dpeP->ErrorR, dpeP->WorkR, dpeP->NConnsN );
 for( ConnXN = 0; ConnXN < dpeP->NConnsN; ConnXN++ ) {
 cP = &dpeP->ConnsS[ConnXN];
 fprintf( fP,
 " Src=%2d %04x:%04x Weight=%7.3f Delta Wt=%6.3f\n",
 cP->SourceP->PEIndexN,
 FP_SEG(cP->SourceP), FP_OFF(cP->SourceP),
 cP->WtValR, cP->DWtValR );
 }
 }
 }

 if ( fP != stdout )
 fclose( fP );
 return( 0 );

}

/************************************************************************
 * *
 * FullyConn() - Fully connect a source layer to a destination *
 * *
 ************************************************************************
 */

int FullyConn( DLayN, SLayN, RangeR )
int DLayN; /* destination layer */
int SLayN; /* source layer */
double RangeR; /* range magnitude */
{
 CONN *cP; /* connection pointer */
 PE *speP; /* source PE pointer */
 PE *dpeP; /* destination PE pointer */

 /* loop through each of the PEs in the destination layer */
 for( dpeP = LayerTP[DLayN]; dpeP != (PE *)0; dpeP = dpeP->NextPEP ) {
 cP = &dpeP->ConnsS[dpeP->NConnsN]; /* start of connections */
 if ( dpeP->NConnsN == 0 ) {
 /* insert bias PE as first one */
 cP->SourceP = &PEBias; /* bias PE */
 cP->WtValR = RRandR( RangeR ); /* initial weight */
 cP->DWtValR = 0.0;
 cP++; /* account for this conn */
 dpeP->NConnsN++;
 }

 /* loop through all PEs in source layer & make connections */
 for( speP = LayerTP[SLayN]; speP != (PE *)0; speP = speP->NextPEP ) {
 cP->SourceP = speP; /* point to PE */
 cP->WtValR = RRandR( RangeR ); /* initial weight */
 cP->DWtValR = 0.0;
 cP++; /* account for this conn */
 dpeP->NConnsN++;
 }
 }

 return( 0 ); /* layers fully connected */
}

/************************************************************************
 * *
 * BuildNet() - Build all data structures for back-prop network *
 * *
 ************************************************************************
 */

int BuildNet( NInpN, NHid1N, NHid2N, NOutN, ConnPrevF )
int NInpN; /* # of input PEs */
int NHid1N; /* # of hidden 1 PEs (zero if none) */
int NHid2N; /* # of hidden 2 PEs (zero if none) */
int NOutN; /* # of output PEs */
int ConnPrevF; /* 1=connect to all prior layers */
{
 int ReqdPEsN; /* # of required PEs */
 int LayerXN; /* layer index */

 int SLayN, DLayN; /* source / destination layer indices */

 if ( NInpN <= 0 || NOutN <= 0 )
 return( -1 ); /* could not build ! */

 FreeNet(); /* kill existing net */
 ReqdPEsN = NInpN + NHid1N + NHid2N + NOutN;

 LayerXN = 0; /* layer index */
 AllocLayer( LayerXN, NInpN, 0 ); /* input layer */
 if ( NHid1N > 0 ) {
 LayerXN++; /* next layer */
 AllocLayer( LayerXN, NHid1N, NInpN );
 if ( NHid2N > 0 ) {
 LayerXN++;
 AllocLayer( LayerXN, NHid2N, NHid1N + (ConnPrevF?NInpN:0) );
 }
 }

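 LayerXN++;
 /* without ConnPrevF, size the output layer for the last layer that
    is actually present (hidden 2, else hidden 1, else input) */
 AllocLayer( LayerXN, NOutN,
 ConnPrevF ? (NInpN+NHid1N+NHid2N)
 : (NHid2N > 0 ? NHid2N : (NHid1N > 0 ? NHid1N : NInpN)) );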

 /* connect up the layers */
 for( DLayN = 1; LayerTP[DLayN] != (PE *)0; DLayN++ ) {
 for( SLayN = ConnPrevF?0:(DLayN-1); SLayN < DLayN; SLayN++ )
 FullyConn( DLayN, SLayN, 0.2 );
 }

 return( 0 );
}

/************************************************************************
 * *
 * Recall() - Step network through one recall cycle *
 * *
 ************************************************************************
 */

int Recall( ivRP, ovRP ) /* perform a recall */
float *ivRP; /* input vector */
float *ovRP; /* output vector */
{
 int DLayN; /* destination layer index */
 PE *peP; /* work PE pointer */
 CONN *cP; /* connection pointer */
 int ConnC; /* connection counter */
 double SumR; /* summation function */

 for( DLayN = 0; (peP = LayerTP[DLayN]) != (PE *)0; DLayN++ ) {
 for( ; peP != (PE *)0; peP = peP->NextPEP ) {
 if ( DLayN == 0 ) {
 /* input layer, output is just input vector */
 peP->OutputR = ivRP[0]; /* copy input values */
 peP->ErrorR = 0.0; /* clear error */
 ivRP++;
 } else {
 /* hidden or output layer, compute weighted sum & transform */
 ConnC = peP->NConnsN; /* # of connections */
 cP = &peP->ConnsS[0]; /* pointer to connections */

 SumR = 0.0; /* no sum yet */
 for( ; --ConnC >= 0; cP++ )
 SumR += cP->SourceP->OutputR * cP->WtValR;
 peP->OutputR = 1.0 / (1.0 + exp(-SumR) );
 peP->ErrorR = 0.0;
 }

 if ( LayerTP[DLayN+1] == (PE *)0 ) {
 /* this is output layer, copy result back to user */
 ovRP[0] = peP->OutputR; /* copy output value */
 ovRP++; /* next output value */
 }
 }
 }

 return( 0 );
}

/************************************************************************
 * *
 * Learn() - step network through one learn cycle *
 * *
 ************************************************************************
 return: squared error for this training example
 */

double Learn( ivRP, ovRP, doRP, LearnR, MomR ) /* train network */
float *ivRP; /* input vector */
float *ovRP; /* output vector */
float *doRP; /* desired output vector */
double LearnR; /* learning rate */
double MomR; /* momentum */
{
 double ErrorR = 0.0; /* squared error */
 double LErrorR; /* local error */
 PE *speP; /* source PE */
 int DLayN; /* destination layer index */
 PE *peP; /* work PE pointer */
 CONN *cP; /* connection pointer */
 int ConnC; /* connection counter */

 Recall( ivRP, ovRP ); /* perform recall */

 /* search for output layer */
 for( DLayN = 0; (LayerTP[DLayN+1]) != (PE *)0; )
 DLayN++;

 /* compute error, backpropagate error, update weights */
 for( ; DLayN > 0; DLayN-- ) {
 for( peP = LayerTP[DLayN]; peP != (PE *)0; peP = peP->NextPEP ) {
 if ( LayerTP[DLayN+1] == (PE *)0 ) {
 /* output layer, compute error specially */
 peP->ErrorR = (doRP[0] - peP->OutputR);
 ErrorR += (peP->ErrorR * peP->ErrorR);
 doRP++;
 }

 /* pass error back through transfer function */
 peP->ErrorR *= peP->OutputR * (1.0 - peP->OutputR);


 /* back-propagate it through connections & update them */
 ConnC = peP->NConnsN; /* # of connections */
 cP = &peP->ConnsS[0]; /* pointer to connections */
 LErrorR = peP->ErrorR; /* local error */
 for( ; --ConnC >= 0; cP++ ) {
 speP = cP->SourceP;
 speP->ErrorR += LErrorR * cP->WtValR; /* propagate error */
 cP->DWtValR = /* compute new weight */
 LearnR * LErrorR * speP->OutputR +
 MomR * cP->DWtValR;
 cP->WtValR += cP->DWtValR; /* update weight */
 }
 }
 }

 return( ErrorR );
}

/************************************************************************
 * *
 * Explain() - compute the derivative of the output for changes *
 * in the inputs *
 * *
 ************************************************************************
 Basic Procedure:
 1) do a recall to find out what the nominal output values are
 2) copy the nominal values to "WorkR" in the PE structure.
 (We could have used the ErrorR field but WorkR was
 used to reduce confusion.)
 3) for each input:
 a) Add a small amount to the input value (DitherR)
 b) do a Recall & compute derivative of output
 c) subtract small amount from nominal input value
 d) do a Recall & compute derivative of outputs
 e) Average two derivatives
 */

int Explain( ivRP, ovRP, evRP, DitherR )
float *ivRP; /* input vector */
float *ovRP; /* output result vector */
float *evRP; /* explain vector */
double DitherR; /* dither */
{
 PE *speP; /* source PE (input) */
 int speXN; /* source PE index */
 PE *dpeP; /* destination PE (output) */
 int dpeXN; /* destination PE index */
 int OutLXN; /* output layer index */

 /* figure out index of output layer */
 for( OutLXN = 0; LayerTP[OutLXN+1] != (PE *)0; )
 OutLXN++;

 Recall( ivRP, ovRP ); /* set up initial recall */

 /* go through output layer and copy output to "WorkR" */
 for( dpeP = LayerTP[OutLXN]; dpeP != (PE *)0; dpeP = dpeP->NextPEP )
 dpeP->WorkR = dpeP->OutputR;


 /* for each input, compute its effects on the output */
 for( speXN = 0, speP = LayerTP[0];
 speP != (PE *)0;
 speXN++, speP = speP->NextPEP ) {
 /* dither in positive direction */
 ivRP[speXN] += DitherR; /* add dither */
 Recall( ivRP, ovRP ); /* new output */

 /* set initial results to evRP */
 for( dpeXN = 0, dpeP = LayerTP[OutLXN];
 dpeP != (PE *)0;
 dpeXN++, dpeP = dpeP->NextPEP )
 evRP[dpeXN] = 0.5 * (dpeP->OutputR - dpeP->WorkR) / DitherR;

 /* dither in negative direction */
 ivRP[speXN] -= (DitherR + DitherR); /* subtract dither */
 Recall( ivRP, ovRP ); /* new output */

 /* set final results to evRP */
 for( dpeXN = 0, dpeP = LayerTP[OutLXN];
 dpeP != (PE *)0;
 dpeXN++, dpeP = dpeP->NextPEP )
 evRP[dpeXN] -= 0.5 * (dpeP->OutputR - dpeP->WorkR) / DitherR;

 /* point to next row of explain vector */
 evRP += dpeXN;

 /* restore current input to original value */
 ivRP[speXN] += DitherR;
 }

 return( 0 );
}

/************************************************************************
 * *
 * Network Training Data *
 * *
 ************************************************************************

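 [Figure: desired Output 1 plotted over Input 1 (horizontal axis,
 0.0 to 1.0) and Input 2 (vertical axis, 0.0 to 1.0). Region
 boundaries fall at Input 2 = 0.3 and 0.7 and at Input 1 = 0.5.
 Output 1 is "one" in the band between Input 2 = 0.3 and 0.7, and
 above 0.7 where Input 1 exceeds 0.5; it is "zero" elsewhere.]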

 Input 3 is "noise".
 Output 1 is shown above.

 Output 2 is opposite of output 1.
 */

typedef struct _example { /* example */
 float InVecR[3]; /* input vector */
 float DoVecR[2]; /* desired output vector */
} EXAMPLE;


#define NTEST (sizeof(testE)/sizeof(testE[0])) /* # of test items */

EXAMPLE testE[] = {
/* --- Inputs --- --- Desired Outputs -- */
 { 0.0, 0.0, 0.0, 0.0, 1.0 },
 { 0.0, 0.2, 0.6, 0.0, 1.0 },
 { 0.0, 0.4, 0.1, 1.0, 0.0 },
 { 0.0, 0.6, 0.7, 1.0, 0.0 },
 { 0.0, 0.8, 0.2, 0.0, 1.0 },
 { 0.0, 1.0, 0.8, 0.0, 1.0 },

 { 0.2, 0.0, 0.9, 0.0, 1.0 },
 { 0.2, 0.2, 0.3, 0.0, 1.0 },
 { 0.2, 0.4, 0.8, 1.0, 0.0 },
 { 0.2, 0.6, 0.2, 1.0, 0.0 },
 { 0.2, 0.8, 0.7, 0.0, 1.0 },
 { 0.2, 1.0, 0.1, 0.0, 1.0 },

 { 0.4, 0.0, 0.0, 0.0, 1.0 },
 { 0.4, 0.2, 0.6, 0.0, 1.0 },
 { 0.4, 0.4, 0.1, 1.0, 0.0 },
 { 0.4, 0.6, 0.7, 1.0, 0.0 },
 { 0.4, 0.8, 0.2, 0.0, 1.0 },
 { 0.4, 1.0, 0.8, 0.0, 1.0 },

 { 0.6, 0.0, 0.9, 0.0, 1.0 },
 { 0.6, 0.2, 0.3, 0.0, 1.0 },
 { 0.6, 0.4, 0.8, 1.0, 0.0 },
 { 0.6, 0.6, 0.2, 1.0, 0.0 },
 { 0.6, 0.8, 0.7, 1.0, 0.0 },
 { 0.6, 1.0, 0.1, 1.0, 0.0 },

 { 0.8, 0.0, 0.4, 0.0, 1.0 },
 { 0.8, 0.2, 0.6, 0.0, 1.0 },
 { 0.8, 0.4, 0.1, 1.0, 0.0 },
 { 0.8, 0.6, 0.7, 1.0, 0.0 },
 { 0.8, 0.8, 0.2, 1.0, 0.0 },
 { 0.8, 1.0, 0.8, 1.0, 0.0 },

 { 1.0, 0.0, 1.0, 0.0, 1.0 },
 { 1.0, 0.2, 0.3, 0.0, 1.0 },
 { 1.0, 0.4, 0.8, 1.0, 0.0 },
 { 1.0, 0.6, 0.2, 1.0, 0.0 },
 { 1.0, 0.8, 0.7, 1.0, 0.0 },
 { 1.0, 1.0, 0.0, 1.0, 0.0 }
};

int TestShuffleN[ NTEST ] = {0}; /* shuffle array */
int TestSXN = {NTEST+1}; /* current shuffle index */


/************************************************************************
 * *
 * NextTestN() - Do shuffle & deal randomization of training set *
 * *
 ************************************************************************
 */

int NextTestN() /* Get next Training example index */
{
 int HitsN; /* # of items we have added to list */
 int wxN; /* work index into shuffle array */
 int xN,yN; /* indices of items to swap */

 if ( TestSXN >= NTEST ) {
 /* reshuffle the array */
 for( wxN = 0; wxN < NTEST; wxN++ )
 TestShuffleN[wxN] = wxN;

 /* quick & dirty way to shuffle. Much better ways exist. */
 for( HitsN = 0; HitsN < NTEST+NTEST/2; HitsN++ ) {
 xN = rand() % NTEST;
 yN = rand() % NTEST;
 wxN = TestShuffleN[xN];
 TestShuffleN[xN] = TestShuffleN[yN];
 TestShuffleN[yN] = wxN;
 }

 TestSXN = 0;
 }

 return( TestShuffleN[TestSXN++] );
}

/************************************************************************
 * *
 * TrainNet() - Driver for training Network *
 * *
 ************************************************************************
 */

int TrainNet( ErrLvlR, MaxPassN ) /* train network */
double ErrLvlR; /* error level to achieve */
long MaxPassN; /* max number of passes */
{
 float rvR[MAX_PES]; /* result vector */
 double lsErrR;
 int CurTestN; /* current test number */
 int HitsN; /* # of times below threshold */
 int PassN; /* pass through the data */
 int ExampleN; /* example number */

 HitsN = 0;
 CurTestN = 0;
 lsErrR = 0.0;
 PassN = 0;
 for(;;) {
 ExampleN = NextTestN(); /* next test number */
 lsErrR += Learn( &testE[ExampleN].InVecR[0],
 &rvR[0],
 &testE[ExampleN].DoVecR[0], 0.9, 0.5 );
 CurTestN++;
 if ( CurTestN >= NTEST ) {
 PassN++;
 lsErrR = sqrt(lsErrR)/ (double)NTEST;
 if ( lsErrR < ErrLvlR )
 HitsN++;
 else HitsN = 0;

 printf( "Pass %3d Error = %.3f Hits = %d\n",
 PassN, lsErrR, HitsN );

 if ( PassN > MaxPassN || HitsN > 3 ) /* exit criteria */
 break;
 CurTestN = 0;
 lsErrR = 0.0;
 }
 }

 /* done training, start testing */

 return( 0 );
}

/************************************************************************
 * *
 * ExplainNet() - do explain & print it out *
 * *
 ************************************************************************
 */

int ExplainNet( fnP, DitherR ) /* explain & print */
char *fnP; /* output file name */
double DitherR; /* amount to dither */
{
 FILE *fP; /* file pointer */
 int wxN; /* work index */
 int xN, yN; /* x,y values */
 int axN; /* alternate work index */
 float *wfP; /* work float pointer */
 static float ivR[MAX_PES] = {0}; /* input vector */
 static float ovR[MAX_PES] = {0}; /* work area for output data */
 static float evR[MAX_PES*MAX_PES] = {0}; /* explain vector */
 /* evR[0] = dY1 vs Input 1
 * evR[1] = dY2 vs Input 1
 * evR[2] = dY1 vs Input 2
 * evR[3] = dY2 vs Input 2
 * evR[4] = dY1 vs Input 3
 * evR[5] = dY2 vs Input 3
 */


 if ( *fnP == '\0' ) {
 fP = stdout;
 } else {
 if ( (fP = fopen( fnP, "a" )) == (FILE *)0 ) {
 printf( "Could not open explain output file <%s>\n", fnP );
 return( -1 );
 }

 }

 fprintf( fP,
 "\f\n*** Network Output as a function of inputs 1 & 2 ***\n\n" );

 ivR[2] = 0.5;
 for( yN = 20; yN >= 0; yN-- ) {
 if ( (yN % 2) == 0 ) fprintf( fP, "%6.2f ", yN/20. );
 else fprintf( fP, " " );
 for( xN = 0; xN <= 20; xN++ ) {
 ivR[0] = xN / 20.;
 ivR[1] = yN / 20.;
 Recall( &ivR[0], &ovR[0] );

 /* --- ignore very small changes --- */
 if ( fabs(ovR[0]) < .1 ) fprintf( fP, " - " );
 else fprintf( fP, "%5.1f", ovR[0] );
 }
 fprintf( fP, "\n" );
 }
 fprintf( fP, " +-" );
 for( xN = 0; xN <= 20; xN++ )
 fprintf( fP, "-----" );
 fprintf( fP, "\n " );

 for( xN = 0; xN <= 20; xN++ )
 fprintf( fP, (xN % 2)==0?"%5.1f":" ", xN/20. );
 fprintf( fP, "\n" );



 fprintf( fP,
 "\f\n*** Plot of Explain Function for Input 1 over input range ***\n\n" );

 ivR[2] = 0.5;
 for( yN = 20; yN >= 0; yN-- ) {
 if ( (yN % 2) == 0 ) fprintf( fP, "%6.2f ", yN/20. );
 else fprintf( fP, " " );
 for( xN = 0; xN <= 20; xN++ ) {
 ivR[0] = xN / 20.;
 ivR[1] = yN / 20.;
 Explain( &ivR[0], &ovR[0], &evR[0], DitherR );

 /* --- ignore very small changes --- */
 if ( fabs(evR[0]) < .1 ) fprintf( fP, " - " );
 else fprintf( fP, "%5.1f", evR[0] );
 }
 fprintf( fP, "\n" );
 }
 fprintf( fP, " +-" );
 for( xN = 0; xN <= 20; xN++ )
 fprintf( fP, "-----" );
 fprintf( fP, "\n " );

 for( xN = 0; xN <= 20; xN++ )
 fprintf( fP, (xN % 2)==0?"%5.1f":" ", xN/20. );
 fprintf( fP, "\n" );




 fprintf( fP,
 "\f\n*** Plot of Explain Function for Input 2 over input range ***\n\n" );


 ivR[2] = 0.5;
 for( yN = 20; yN >= 0; yN-- ) {
 if ( (yN % 2) == 0 ) fprintf( fP, "%6.2f ", yN/20. );
 else fprintf( fP, " " );
 for( xN = 0; xN <= 20; xN++ ) {
 ivR[0] = xN / 20.;
 ivR[1] = yN / 20.;
 Explain( &ivR[0], &ovR[0], &evR[0], DitherR );

 /* --- ignore very small changes --- */
 if ( fabs(evR[2]) < .1 ) fprintf( fP, " - " );
 else fprintf( fP, "%5.1f", evR[2] );
 }
 fprintf( fP, "\n" );
 }
 fprintf( fP, " +-" );
 for( xN = 0; xN <= 20; xN++ )
 fprintf( fP, "-----" );
 fprintf( fP, "\n " );

 for( xN = 0; xN <= 20; xN++ )
 fprintf( fP, (xN % 2)==0?"%5.1f":" ", xN/20. );
 fprintf( fP, "\n" );



 fprintf( fP,
 "\f\n*** Plot of Explain Function for Input 3 over input range ***\n\n" );


 ivR[2] = 0.5;
 for( yN = 20; yN >= 0; yN-- ) {
 if ( (yN % 2) == 0 ) fprintf( fP, "%6.2f ", yN/20. );
 else fprintf( fP, " " );
 for( xN = 0; xN <= 20; xN++ ) {
 ivR[0] = xN / 20.;
 ivR[1] = yN / 20.;
 Explain( &ivR[0], &ovR[0], &evR[0], DitherR );

 /* --- ignore very small changes --- */
 if ( fabs(evR[4]) < .1 ) fprintf( fP, " - " );
 else fprintf( fP, "%5.1f", evR[4] );
 }
 fprintf( fP, "\n" );
 }
 fprintf( fP, " +-" );
 for( xN = 0; xN <= 20; xN++ )
 fprintf( fP, "-----" );
 fprintf( fP, "\n " );

 for( xN = 0; xN <= 20; xN++ )
 fprintf( fP, (xN % 2)==0?"%5.1f":" ", xN/20. );
 fprintf( fP, "\n" );



 if ( fP != stdout )
 fclose( fP );
 return( 0 );
}


/************************************************************************
 * *
 * main() - Driver for entire program *
 * *
 ************************************************************************
 */

main()
{
 int ActionN; /* action character */
 char *sP; /* string pointer */
 char *aP; /* alternate pointer */
 char BufC[80]; /* work buffer */

 printf( "\nC-Program to Explain a Neural Network's Conclusions\n" );
 printf( " Written by: Casimir C. 'Casey' Klimasauskas, 04-Jan-91\n" );
 for(;;) {
 printf( "\
C - create a new network\n\
L [fname] - load a trained network\n\
S [fname] - save a network\n\
P [fname] - print out network\n\
F - free network from memory\n\
T - Train network\n\
E [fname] - Explain network\n\
X - eXit from the program\n\
What do you want to do? " );
 fflush( stdout );
 sP = fgets( BufC, sizeof(BufC)-1, stdin );
 if ( sP == (char *)0 )
 break;

 while( *sP != 0 && *sP <= ' ' )
 sP++;
 ActionN = *sP;
 if ( 'A' <= ActionN && ActionN <= 'Z' )
 ActionN -= 'A'-'a'; /* convert to LC */
 if ( *sP != 0 ) sP++; /* don't step past the end of an empty line */
 while( *sP != 0 && *sP <= ' ' )
 sP++; /* skip to argument */
 for( aP = sP; *aP > ' '; )
 aP++; /* skip to end of argument */
 *aP = '\0'; /* null terminate it */

 switch( ActionN ) {
 case 'c': /* create network */
 BuildNet( 3, 5, 0, 2, 1 );
 break;

 case 'l': /* load network */
 if ( *sP == '\0' )
 sP = "network.net";

 LoadNet( sP );
 break;

 case 's': /* save network */
 if ( *sP == '\0' )
 sP = "network.net";
 SaveNet( sP );
 break;

 case 'p': /* print network */
 PrintNet( sP );
 break;

 case 'f': /* free network */
 FreeNet();
 break;

 case 't': /* train network */
 TrainNet( 0.001, 100000L );
 break;

 case 'e': /* explain network */
 ExplainNet( sP, .01 );
 break;

 case 'x': /* done */
 goto Done;

 default:
 break;
 }
 }

Done:
 return( 0 );
}


April, 1991
GENETIC ALGORITHMS


A new class of searching algorithms


 This article contains the following executables: MORROW.ARC


Mike Morrow


Mike is a programmer at Applied Microsystems Corp. He can be contacted at
16541 Redmond Way #162, Redmond, WA 98052.


Genetic algorithms are a class of machine-learning techniques that gain their
name from a similarity to certain processes that occur in the interactions of
natural, biological genes. In this article, I'll outline the steps in a
typical Genetic Algorithm (GA) and illustrate the concept by implementing a
word-guessing application.
A GA is a method of finding a good answer to a problem, based on feedback
received from its repeated attempts at a solution. The judge of the GA's
attempts is called an objective function. GAs don't know how to derive a
problem's solution, but they do know, from the objective function, how close
they are to a better solution.
Each attempt a GA makes towards a solution is called a gene--a sequence of
information that can somehow be interpreted in the problem space to yield a
possible solution. A GA gene is analogous to a biological gene in that both
are representations of alternative solutions to a problem. In the biological
world, the problem is evolutionary survival, and a particular gene represents
one possible solution to survival within a competitive environment. In the
digital world, the stated problem will vary from one application program to
another, as will the objective function. The coding of a GA's gene will also
depend on the problem being addressed. We'll examine a couple of possible
encodings in this article.
Figure 1(a) shows a gene that is a sequence of binary digits. This gene could
represent one of a GA's attempts at answering the question, "What is the
square root of 64?" This particular gene, with binary value 0101, represents
the idea that 5 is the square root of 64. It happens to be wrong, of course.
Typically, a GA will maintain a collection, or population, of genes. Each gene
in the population may represent a different sequence and a different idea
about how to solve the problem, thus satisfying the objective function.
Figure 1: Trying to find the square root of 64: (a) A genetic sequence,
composed of 1s and 0s, represents the value 5; (b) an objective function: Gene
values close to the answer get FITNESS close to 0.

 (a) 0 0 0 0 0 1 0 1
 (b) FITNESS=64-(gene_value*gene_value)

Deciding how to encode genes to represent possible solutions in a particular
problem space leads to the next requirement: a suitable objective function. An
objective function must be able to interpret the data contained within a gene
and decide how good a solution it represents. Suppose that we are indeed
trying to find the square root of 64. We need to set up an objective function
that awards better fitness to genes that are good solutions. We don't "know"
the correct answer, so we can't just compare each gene to 8. Instead, we have
to plug the value a gene advocates into an equation that tells us how close
the gene is to the solution. Thus, our objective function should square a
gene's solution and note the difference from 64. Figure 1(b) shows how a
fitness value might be derived programmatically. Good genes evaluated by this
fitness function will get fitness values close to zero.
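As a rough sketch of how Figure 1 might look in C, the fragment below decodes a gene and applies the objective function. The function names and the one-bit-per-array-element layout are illustrative assumptions, not part of the GA system used later in this article.

```c
#include <stddef.h>

/* Hypothetical helper: interpret a gene, stored as an array of 0/1
 * values with the most significant bit first, as an unsigned integer. */
unsigned long gene_value(const unsigned char *bits, size_t n_bits)
{
    unsigned long value = 0;
    size_t i;

    for (i = 0; i < n_bits; i++)
        value = (value << 1) | (bits[i] & 1u);
    return value;
}

/* Objective function from Figure 1(b): 64 minus the square of the
 * gene's value. A result of 0 means the gene encodes the exact root. */
long sqrt64_fitness(const unsigned char *bits, size_t n_bits)
{
    long v = (long)gene_value(bits, n_bits);

    return 64L - v * v;
}
```

Applied to the gene in Figure 1(a), gene_value() yields 5 and sqrt64_fitness() yields 64 - 25 = 39; a gene encoding 8 scores exactly 0.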
Fitness values are associated with the gene that generated them until the data
within the gene changes. After an encoding and an objective function have been
formulated, the person running a GA may step back and let it do the work.
Through a sequence of steps, the GA will work toward making its genes more
fit. Again, we find a biological analogy in the optimization steps. Each pass
through the set of optimization steps is called a generation. If all is going
well, we expect the overall fitness of the genes to increase as we pass
through generations.
We name the optimization steps in each generation reproduction, crossover, and
mutation. In reproduction, the first part of a generation, genes from the
previous generation are duplicated and form the new population. We use the
fitness of a gene, conferred by the objective function, to decide how likely
that gene is to reproduce. Genes that are more fit are more likely to be
duplicated; less fit genes have a poorer chance. However, reproduction is
ruled by chance, so it is not impossible for genes to be reproduced in a
proportion not exactly in keeping with their fitness. In our square root
problem, we would expect that genes with fitness close to zero would be more
likely to reproduce than genes with less optimal fitness. It is important that
we maintain all sorts of genes in our population, both fit and unfit. The
genes that are apparently unfit may contain important information that will be
revealed in later generations. The fittest genes may be indicating only a
locally good solution; such solutions may be less useful in future
generations.
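One common way to make reproduction "ruled by chance" yet biased toward fit genes is roulette-wheel selection. The sketch below is an assumption about how this could be done, not necessarily the scheme used by the GA system in this article. It expects each gene's fitness to have been converted to a non-negative weight (larger meaning fitter), and it takes the random number as a parameter so the behavior is easy to check.

```c
#include <stddef.h>

/* Hypothetical roulette-wheel selection. weights[i] is a non-negative
 * score for gene i (larger means fitter); n must be at least 1; r is a
 * random number in [0, 1). Returns the index of the gene chosen to
 * reproduce. */
size_t select_gene(const double *weights, size_t n, double r)
{
    double total = 0.0, running = 0.0, target;
    size_t i;

    for (i = 0; i < n; i++)
        total += weights[i];

    target = r * total;            /* where the wheel stops */
    for (i = 0; i < n; i++) {
        running += weights[i];
        if (target < running)
            return i;
    }
    return n - 1;                  /* guard against rounding error */
}
```

In practice r might come from something like rand() / (RAND_MAX + 1.0). A gene with weight 6 out of a total of 10 is then picked about 60 percent of the time, but a gene with weight 1 still gets its occasional turn, which keeps the less fit material in circulation.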
After reproduction, the genes undergo crossover, where we choose pairs of
genes at random. The gene pairs then exchange parts of their sequences. The
result is two new genes. Figure 2 illustrates crossover of two genes. In this
example, the two genes in Figure 2(a) are crossed at the indicated points to
create the two genes in Figure 2(b).
Figure 2: Two genes (a) are crossed at the indicated points to create two new
genes (b).

 (a) ^0 0 0 0 1^1 1 1 FITNESS= -161
 ^1 1 1 1 0^0 1 0 FITNESS= -58500

 (b) 1 1 1 1 0 1 1 1 FITNESS= -60945
 0 0 0 0 1 0 1 0 FITNESS= -36

In performing the crossover step, we want the genes to exchange subsequences
that contain good information about solutions. Hopefully, exchanging material
will result in genes that are better than their ancestors. The crossover of
Figure 2 actually produced one gene with better fitness, and another with
poorer fitness.
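The exchange in Figure 2 can be sketched in a few lines of C. The function name and the array-of-bits representation are assumptions made for illustration; the two cut points correspond to the carets in Figure 2(a).

```c
#include <stddef.h>

/* Hypothetical crossover: swap the elements in positions [from, to)
 * between two genes of equal length. */
void crossover(unsigned char *a, unsigned char *b, size_t from, size_t to)
{
    size_t i;

    for (i = from; i < to; i++) {
        unsigned char tmp = a[i];

        a[i] = b[i];
        b[i] = tmp;
    }
}
```

Calling crossover(a, b, 0, 5) on the two genes of Figure 2(a) swaps their first five bits and leaves the genes of Figure 2(b) in a and b.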
The final step in a generation is mutation. This optimization entails randomly
altering a very small percentage of the genetic sequences present in the
population. Mutation may introduce new concepts into the population.
If the second bit from the right of the better gene in Figure 2(b) were mutated
from 1 to 0, that gene would then represent the value 8, the solution to the square root
problem. Note that if our entire population were made up of the two genes
shown in Figure 2, then crossover would never be able to change that
particular bit from 1 to 0 -- there are no genes that have a 0 in that bit
position. In cases such as these, mutation is the only operation that will
produce genes of the optimal solution. We don't want to rely on mutation to
save our GA, though; normally, we try to get a varied population, so mutation
serves only to speed up the search.
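A minimal sketch of mutation on a bit-string gene follows. The names and the per-bit mutation rate are assumptions for illustration; the article's GA system may organize this step differently.

```c
#include <stddef.h>
#include <stdlib.h>

/* Flip the bit stored at bits[pos] (each element holds 0 or 1). */
void flip_bit(unsigned char *bits, size_t pos)
{
    bits[pos] ^= 1u;
}

/* Flip each bit independently with probability `rate` (0.0 to 1.0).
 * Returns the number of bits that were flipped. */
size_t mutate(unsigned char *bits, size_t n, double rate)
{
    size_t i, flipped = 0;

    for (i = 0; i < n; i++) {
        double r = rand() / (RAND_MAX + 1.0);   /* uniform in [0, 1) */

        if (r < rate) {
            flip_bit(bits, i);
            flipped++;
        }
    }
    return flipped;
}
```

With the gene 0 0 0 0 1 0 1 0 stored most significant bit first, flip_bit(gene, 6) turns the value 10 into 8. A rate of a percent or less keeps mutation a rare event, in line with the "very small percentage" described above.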
To put it all together then, a GA can be seen as a series of steps. Initially,
a random population of genes is created. Then, we attempt to optimize the
fitness of the genes by running through generations of optimization steps
(reproduction, crossover, and mutation). An objective function is used
throughout to judge the fitness of members of the population.


A Simple GA


In this section, I'll apply the genetic algorithm presented in the previous
discussion to a simple problem -- we'll have the GA learn how to form a short
phrase. Let's choose the phrase "Hello there," but to make things a little
simpler, we'll reduce the problem so that the GA need only consider uppercase
alphabetic characters, HELLOTHERE.
The objective function to drive the GA will return the number of correct
letters in a gene. The best possible fitness is 10 -- the number of characters
in the target phrase. Listing One (page 86) shows an implementation of the
objective function, written in C.
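To show how reproduction, crossover, and mutation might fit around such an objective function, here is a compact, self-contained sketch of one generation. It is not the GASystem code: the population size, the "fitness plus one" selection weights, the single cut point, the pairing of neighbors, and the roughly 1 percent mutation rate are all assumptions chosen to keep the example short.

```c
#include <stdlib.h>

#define POP_SIZE 20              /* assumed population size */
#define GENE_LEN 10              /* letters per gene */

typedef struct {
    char seq[GENE_LEN];          /* candidate phrase, 'A'..'Z' */
    int  fit;                    /* cached objective value */
} Gene;

/* Objective: number of letters matching the target phrase. */
int score(const char *seq)
{
    static const char target[] = "HELLOTHERE";
    int i, n = 0;

    for (i = 0; i < GENE_LEN; i++)
        if (seq[i] == target[i])
            n++;
    return n;
}

/* Fill the population with random genes and score them. */
void init_population(Gene *pop)
{
    int g, i;

    for (g = 0; g < POP_SIZE; g++) {
        for (i = 0; i < GENE_LEN; i++)
            pop[g].seq[i] = (char)('A' + rand() % 26);
        pop[g].fit = score(pop[g].seq);
    }
}

/* Pick a parent with probability proportional to fit + 1, so that
 * even a gene with fitness 0 keeps a small chance of reproducing. */
static int pick(const Gene *pop)
{
    int g, total = 0, stop;

    for (g = 0; g < POP_SIZE; g++)
        total += pop[g].fit + 1;
    stop = rand() % total;
    for (g = 0; g < POP_SIZE; g++) {
        stop -= pop[g].fit + 1;
        if (stop < 0)
            return g;
    }
    return POP_SIZE - 1;
}

/* One generation: reproduction, crossover, mutation, re-scoring. */
void next_generation(Gene *pop)
{
    Gene next[POP_SIZE];
    int g, i;

    for (g = 0; g < POP_SIZE; g++)            /* reproduction */
        next[g] = pop[pick(pop)];

    /* Crossover: neighbors were drawn at random, so pair them up and
     * swap everything from a random cut point to the end. */
    for (g = 0; g + 1 < POP_SIZE; g += 2) {
        int cut = rand() % GENE_LEN;

        for (i = cut; i < GENE_LEN; i++) {
            char tmp = next[g].seq[i];

            next[g].seq[i] = next[g + 1].seq[i];
            next[g + 1].seq[i] = tmp;
        }
    }

    for (g = 0; g < POP_SIZE; g++) {          /* mutation, re-scoring */
        if (rand() % 100 == 0) {              /* about 1% of genes */
            int pos = rand() % GENE_LEN;

            next[g].seq[pos] = (char)('A' + rand() % 26);
        }
        next[g].fit = score(next[g].seq);
        pop[g] = next[g];
    }
}
```

A driver would call init_population() once and then next_generation() in a loop, printing the fittest genes every few generations. How quickly such a sketch converges depends on the random seed and on the parameters above, so treat it as a starting point for experiments rather than a tuned implementation.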
Figure 3(a) shows the five fittest genes in the initial, randomly generated
population. It can be seen that a few of the desired letters were arrived at
through chance. Though we're proud parents of our genes, we can't truthfully
say that they look anything like the target phrase. The situation improves
with a few generations of effort from the GA. Figure 3(b) shows the top five
genes after five generations. They've started to build up subsequences with
good fitness values. As time goes on, they will exchange these subsequences to
arrive at still better results.
Figure 3(c) shows the five fittest genes after 15 generations. It can be seen
that some of the genes have actually arrived at the solution.
Figure 3: The first (a), fifth (b), and 15th (c) generation of the
word-guessing population. The first column of values is the genes' fitnesses.

 (a) 2 P M Y M Z T T E D G
 2 H R R H O S T O I Q
 2 O R T X U U H M Y E
 2 H U L M I D I L A S
 2 M K P Z S O V E W E


 (b) 6 H Y V L W T H E R B
 6 G E L L W T H H V E
 5 G A L L H T H V R B
 5 H Y V L W T H E G G
 5 G A L L W T X V R E

 (c) 10 H E L L O T H E R E
 10 H E L L O T H E R E
 9 H Y L L O T H E R E
 9 H Y L L O T H E R E
 9 H A L L O T H E R E

Actually, this form of word guessing is relatively easy for a GA. The feedback
the genes got from the objective function is monotonic -- there weren't any
false leads for the genes to pursue.
Other types of problems present a GA with a more difficult challenge. These
problems have many "bumps": good, but not optimal, solutions that
may attract a lot of genes. It is in these types of problems that the
diversity of a GA's population is important. Rather than concentrating on a
short-term, fairly good solution, a GA may look at a variety of possibilities
in parallel. In the next section, we will ask our GA to find a solution to a
maze.


The Maze Problem


Running a GA's genes through a maze to come up with a good solution is a
harder problem than guessing a word because a maze has dead ends; a gene might
be happily traversing a maze, only to turn a corner and smack into a wall.
Unlike the word-guesser, the maze problem doesn't have a monotonic objective
function.
As always, we need to decide on a way to encode the gene and an objective
function. We can encode the gene as a set of directions: north, east, west,
and south. The ideal gene would be a sequence of directions that indicated how
to get from the starting point of the maze to the finish point. At each
decision point in the maze, we would consult the next direction in this
perfect gene to continue our trip. An example of such a gene, and its
interpretation, is shown in Figure 4.
Figure 4: According to our encoding, this gene directs us to go north, east,
west, south, west, and finally east again.

 N E W S W E
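In code, such a gene is just an array of small integers. The sketch below uses the same direction codes as Listing Two (NORTH = 0, EAST = 1, WEST = 2, SOUTH = 3); the function name and the '?' placeholder for invalid codes are assumptions for illustration.

```c
#include <stddef.h>

enum { NORTH, EAST, WEST, SOUTH };   /* same codes as Listing Two */

/* Hypothetical helper: write the letter for each of `len` direction
 * codes into out, which must hold len + 1 bytes. Codes outside
 * NORTH..SOUTH are shown as '?'. */
void directions_to_text(const unsigned char *gene, size_t len, char *out)
{
    static const char letters[] = "NEWS";
    size_t i;

    for (i = 0; i < len; i++)
        out[i] = (gene[i] <= SOUTH) ? letters[gene[i]] : '?';
    out[len] = '\0';
}
```

The gene of Figure 4 is stored as {0, 1, 2, 3, 2, 1} and decodes to the string "NEWSWE".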

The choice of objective function is important for this problem. If we made the
objective function the same as the word guessing objective function, then the
GA would have an easy time of it. In fact, the word guessing objective
function is not realistic in this setting; laboratory animals don't get to
consult an oracle when they want to get to the end of the maze. Let us create
an objective function that gives each gene a score based on its final distance
from the end of the maze. This is something like a lab rat gauging the
distance from a goal by the intensity of the appetizing aroma of the cheese
there. The objective function will also count the number of direction changes
made by a gene. If a gene makes the goal in fewer moves, it will get a bonus
from the objective function. A C function to implement this objective function
is shown in Listing Two (page 86). The complete GA system that solves the maze
problem consists of a number of files. Due to space constraints, however,
source code for the complete system is only available electronically (see
"Availability" on page 3).
The final player in this drama of directions is the maze. Figure 5 shows the
maze without any inhabitants. Although we won't show them in future maze
displays, coordinates have been assigned to each row and column in the maze.
This will aid the objective function in evaluating genes, but is just another
detail to us.
Notice that we've made the genetic sequences longer than necessary. We give
the genes a little slack so they can make some dumb decisions at first and
still have hope of reaching the goal. We do reward shorter paths, so as the
genes get smarter, the extra length will not be necessary.
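To make the scoring concrete, the following sketch repeats the arithmetic of objective() in Listing Two as a standalone function. The maze constants are copied from the listing; the function name maze_fitness and the use of long in place of FIT_TYPE are assumptions.

```c
#define MAZEX 17
#define MAZEY 13
#define MAZEENDX 7
#define MAZEENDY 1
#define MAXDIST ((MAZEX * MAZEX) + (MAZEY * MAZEY))   /* 458 */

/* Fitness of a gene that stopped at (x, y) after n_moves of its len
 * available moves: closer to the goal is better, and unused moves
 * earn a bonus of 5 each. */
long maze_fitness(int x, int y, int len, int n_moves)
{
    long dx = MAZEENDX - x;
    long dy = MAZEENDY - y;
    long closeness = (long)MAXDIST - (dx * dx + dy * dy);
    long fit = (closeness * closeness) >> 12;   /* scale down */

    return fit + 5L * (len - n_moves);
}
```

A gene that never leaves the start at (10, 11) and uses all 15 moves scores (458 - 109)^2 >> 12 = 29. A gene that reaches the goal at (7, 1) in 10 of its 15 moves scores 458^2 >> 12 = 51, plus a brevity bonus of 25, for a total of 76.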
The population size is another important parameter in the maze-running GA. If
we have too small a population, we'll lose genetic diversity and it is
probable that the genes will settle into a comfortable spot close to the
solution without thought of venturing out to get to the real solution. We live
in the real world, however, so we can't have an excessively large population
because of MIPS restrictions. I settled on a population of 100 genes;
population size is an interesting parameter for experimentation.
So, what happened when we slipped the chain and let loose the genes into the
maze? Figure 6 shows some of the initial, random population. In the graphical
representation, genes that share a position in the maze overwrite each other,
so only one shows up in any particular spot where genes have congregated. You
can see what is actually at that spot by consulting the list of genes on top
of the picture of the maze.
After 20 generations, the fittest genes are as shown in Figure 7. Note that
many genes have stumbled onto a spot that is physically close to the end.
Unfortunately, some are blocked from going further because of intervening
walls. Other genes have lower scores, but are actually on the right path to
get to the best solution. In future generations, their genetic material will
be valuable as building blocks for really good genes.
After 40 generations, Figure 8 shows that the top genes are those that have
arrived at the destination. Now the difference in fitness is decided by how
long they took to get there. Genes that went on a more direct path have higher
value in this application.
If we continued to watch generations go by, we would see genes that used fewer
moves than those in the current generation.


Conclusion


Our GA managed to traverse the maze after a number of generations. There are
computationally easier ways to traverse mazes, but traditional methods require
the algorithm to know about mazes. The GA learned about the maze; it actually
had no a priori knowledge of what it was doing. For this reason, the GA maze
runner is an interesting variation.
This article has shown only a small sample of what people are doing with
genetic algorithms. If you are interested in learning more about genetic
algorithms, a good book to consult is Genetic Algorithms in Search,
Optimization, and Machine Learning by David E. Goldberg (Addison-Wesley,
1989).

_GENETIC ALGORITHMS_
by Mike Morrow


[LISTING ONE]

/*** GASystem -- Mike Morrow -- Objective function **/

#include "ga.h"
#include <stdio.h> /* printf(), puts() */

void objinit()
{
}

FIT_TYPE objective(s, len)
SEQ s;
int len;
{

 FIT_TYPE n;
 unsigned int i;
 static char tgt[] = "HELLOTHERE";

 n = 0;
 for(i = 0; i < len && i < sizeof tgt - 1; i++)
 if(tgt[i] == s[i])
 n++;

 return n;
}

void objshow(s, len, fitness)
SEQ s;
int len;
FIT_TYPE fitness;
{
 printf("%d ", fitness);
 while(len--) printf(" %c", *s++);
 puts("");
}

void objdumpdone()
{

}





[LISTING TWO]

/*** GASystem -- Mike Morrow -- Objective function. Evaluates a set of
* directions as applied to a maze. The closer the set of directions gets to
* the end of the maze, the higher the fitness of that set of directions.
**/

#include "ga.h"
#include <stdio.h>
#include <string.h> /* memcpy(), strcat() */

/** A set of directions is made up of the following **/
#define NORTH 0
#define EAST 1
#define WEST 2
#define SOUTH 3

/** Define the maze **/
#if MSDOS
#define BLOCK (char) 178 /* block char on PC */
#define SPACE ' '
#else
#define BLOCK '@'
#define SPACE ' '
#endif

#define _ BLOCK,
#define A SPACE,


#define MAZEX 17
#define MAZEY 13

typedef char MAZE[MAZEY][MAZEX];
typedef char DISPLINE[80];

static CONST MAZE maze =
{
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
 _ A A A A A A A _ _ _ _ _ _ _ A _
 _ A _ _ A _ _ _ _ _ A A _ _ _ A _
 _ A _ _ A A A A A A A _ _ A A A _
 _ A _ _ A _ _ _ A _ _ _ _ A _ _ _
 _ A _ _ _ _ _ _ A A A _ _ A _ A _
 _ A _ _ A _ _ _ _ _ _ _ _ A _ A _
 _ A A A A A A _ _ _ A A A A _ A _
 _ _ _ _ A _ _ _ A _ A _ _ A _ A _
 _ A _ _ A _ A A A _ A A _ A A A _
 _ A A A A _ _ _ A _ _ _ _ _ _ A _
 _ A _ _ A A A A A A A A A A A A _
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
};

/** Maze runners start at this point in the maze **/
#define MAZESTARTX 10
#define MAZESTARTY 11

/** Maze runners' goal is this point in the maze **/
#define MAZEENDX 7
#define MAZEENDY 1

/** How far from the MAZEEND is this set of points (x, y)? **/
#define DIST(x, y) ((MAZEENDX - (x)) * (MAZEENDX - (x)) + \
 (MAZEENDY - (y)) * (MAZEENDY - (y)))

/** What is the longest distance in a maze this size? **/
#define MAXDIST ((MAZEX * MAZEX) + (MAZEY * MAZEY))




#if __STDC__ || __ZTC__
static int mazerun(SEQ s, int len, unsigned int *xp, unsigned int *yp);
static int xroads(int x, int y);
#else
static int mazerun();
static int xroads();
#endif

void objinit()
{
 char exebuff[80];

 /** set parameters. High allele should be SOUTH, low allele
 * should be NORTH. **/
 sprintf(exebuff, "HIA = %d", SOUTH);
 exec(exebuff);

 sprintf(exebuff, "LOWA = %d", NORTH);

 exec(exebuff);

 /** For starters, give a gene this many elements. User may want to
 * experiment with this parameter anyway. **/
 sprintf(exebuff, "LEN = 15");
 exec(exebuff);
}

/*** This function evaluates a gene's sequence of directions. It returns
* a fitness value. ***/
FIT_TYPE objective(s, len)
SEQ s;
int len;
{
 FIT_TYPE dist;
 unsigned int x, y;
 int n_moves;

 /** Run through maze using directions in s. x and y will get final position
 * we reach. n_moves will get number of moves it took to get there. **/
 n_moves = mazerun(s, len, &x, &y);

 /** The fitness of that path through maze is distance from MAZEEND.**/
 dist = DIST(x, y);

 /** Convert: lower distances imply higher fitness value. **/
 dist = (FIT_TYPE)MAXDIST - dist;

 /** Scale down result. **/
 dist = (dist * dist) >> 12;

 /** Plus a bonus for brevity. **/
 dist += 5 * (FIT_TYPE) (len - n_moves);

 return dist;
}

static CONST char i_to_c[] = "NEWS";

#define N_PER_DISP_BLOCK 5

static DISPLINE displines[N_PER_DISP_BLOCK];
static int n_lines = 0;
static MAZE disp_maze;

void objshow(s, len, fitness)
SEQ s;
int len;
FIT_TYPE fitness;
{
 unsigned int x, y;
 int n_moves;
 DISPLINE buff;

 if(! n_lines)
 memcpy(disp_maze, maze, sizeof maze);

 n_moves = mazerun(s, len, &x, &y);


 disp_maze[y][x] = '0' + n_lines;

 sprintf(displines[n_lines], "%6d ", fitness);
 while(len--)
 {
 if(*s > SOUTH)
 sprintf(buff, " %d!", *s++);
 else
 sprintf(buff, " %c", i_to_c[*s++]);
 strcat(displines[n_lines], buff);
 }

 sprintf(buff, " : (%2d, %2d); %2d moves", x, y, n_moves);
 strcat(displines[n_lines], buff);

 n_lines++;
 if(n_lines == N_PER_DISP_BLOCK)
 objdumpdone();
}

void objdumpdone()
{
 unsigned int i, x, y;

 if(! n_lines)
 return ;

 for(i = 0; i < n_lines; i++)
 {
 printf("%d) %s\n", i, displines[i]);
 }

 puts("");
 for(y = 0; y < MAZEY; y++)
 {
 printf(" ");
 for(x = 0; x < MAZEX; x++)
 {
 putchar(disp_maze[y][x]);
 }
 puts("");
 }
 n_lines = 0;
 puts("\n\n");
}

/** Run through maze with directions given in s. *xp and *yp are set to final
* coords that we end up with. This function returns number of moves it took to
* run maze. It will terminate when moves in s are used up, or when we arrive
* at the end of maze. **/
static int mazerun(s, len, xp, yp)
SEQ s;
int len;
unsigned int *xp, *yp;
{
 register int x, y;
 register SEQ dirs;
 int n_moves;



 x = MAZESTARTX;
 y = MAZESTARTY;
 dirs = s;
 n_moves = 0;

 while(len-- && ! (x == MAZEENDX && y == MAZEENDY))
 {
 switch(*dirs++)
 {
 case NORTH:
 while(maze[y - 1][x] == SPACE)
 {
 y--;
 if(xroads(x, y))
 break;
 }
 break;

 case EAST:
 while(maze[y][x + 1] == SPACE)
 {
 x++;
 if(xroads(x, y))
 break;
 }
 break;

 case WEST:
 while(maze[y][x - 1] == SPACE)
 {
 x--;
 if(xroads(x, y))
 break;
 }
 break;

 case SOUTH:
 while(maze[y + 1][x] == SPACE)
 {
 y++;
 if(xroads(x, y))
 break;
 }
 break;

 default:
 printf("Error in objective(), got allele = %d!", *(dirs - 1));
 break;
 }
 n_moves++;
 }
 *xp = x;
 *yp = y;

 return n_moves;
}

/** If this is a crossroads in the maze, i.e. there are more than two exits
* from the current cell, then return TRUE. **/
static int xroads(x, y)
int x, y;
{
 char walls; /* number of blocked neighboring cells */

 walls = (maze[y][x+1] != SPACE) + (maze[y][x-1] != SPACE) +
 (maze[y+1][x] != SPACE) + (maze[y-1][x] != SPACE);
 return walls < 2; /* fewer than two walls means more than two exits */
}


April, 1991
PORTING UNIX TO THE 386: LANGUAGE TOOLS CROSS SUPPORT


Developing the initial utilities




William Frederick Jolitz and Lynne Greer Jolitz


Bill was the principal developer of 2.8 and 2.9BSD and was the chief architect
of National Semiconductor's GENIX project, which was the first virtual memory
microprocessor-based UNIX system. Prior to establishing TeleMuse, a market
research firm, Lynne was vice president of marketing at Symmetric Computer
Systems. Bill and Lynne conduct seminars on BSD, ISDN and TCP/IP. Send e-mail
questions or comments to lynne@berkeley.edu. Copyright (c) 1991 TeleMuse.


We stated last month that "Projects of great complexity are always uncertain"
and then we went on to develop our standalone system. Now we must examine what
we accomplished. Recall that last month, we started with an empty 386 residing
in protected mode without one shred of reliable code: just three little PC
utilities to facilitate software loading and bootstrap operation. Using our
protected-mode program loader, we created a minimal 80386 protected-mode
standalone C programming environment for operating systems kernel development
work. Then we wrote prototype code for various kernel hardware support
facilities. Finally, we used our standalone programming environment as a
testbed to shake out the bugs in our first-cut implementation of kernel 386
machine-dependent code in preparation for incorporation in the BSD kernel.
Following our specification methodology, we created a suitable standalone
system and conquered a number of latent software bugs and misconceptions.
With our standalone system, we have essentially established the "base camp" on
our 386 expedition. We now possess much of the "gear" (utilities, compiler and
assembler, and other equipment) required for such an adventure, but we must
check it out and test it prior to actual use. As any good mountaineer knows,
thorough knowledge of your equipment could save your life. In this case, an
adherence to appropriate testing and coding procedures could save a project.
As we stated earlier, the standalone system can be viewed at this stage as if
it were the kernel itself, with the extensions as the basis of our prototype
kernel code. We now continue up the base of the mountain, furthering our
initial utilities development through the creation of a stable cross-tools
environment.


Why Develop Cross-Tools?


We have mentioned little about our protected-mode software generation
mechanism in previous articles. In this article, we describe our set of tools
that allows us to port 386BSD. Since we don't have 386BSD to generate 386BSD
(yet), we must use another UNIX host to run the tools and generate
protected-mode software; this "cross" mode operation is part of the means by
which we bootstrap 386BSD. In our case, the cross-host that runs the software
generating 386 code isn't even a 386!
Because the computer we use to generate the software is not the one that runs
it, we will need a means to load files and programs over Ethernet and serial
lines to the target 386 system. We will then focus on proving GCC itself valid
for cross-support purposes. The mechanisms used for this "first assault" will
be of great importance until we have developed a stable native environment.
Careful preparation in this area will allow us to weather the blinding
"blizzards" of bugs which will inevitably arise on our way to the top.


386BSD Cross-Tools Goals


A proper evaluation of our cross-tools was crucial to the successful
generation of the earliest version of 386BSD--before the system had the
ability to generate its own binaries. While everyone always wants to use the
very best tools possible in all cases, we decided that what we wanted from our
cross-tools was simply to be able to generate enough of an operational BSD
kernel and utilities to run our language tools in a native environment.
Ultimately, we want to use native tools because they are more convenient, have
a shorter "compile-edit-debug" cycle, are easier to support (for example, just
one architecture to worry about), and use many of the traditional program
development aids provided in BSD UNIX.
In a nutshell, BSD, like most UNIX systems, expects to be developed in a
native environment.
As such, our principal concern at this stage is with correctness, not
optimization. Performance considerations arise only after we achieve an
operational system that can be refined using traditional means. This first
"bootstrap" version of utilities and kernel is compromised in areas where our
cross-support mechanisms are weakest. However, if we choose these compromises
carefully, we can jettison them when we "go native."
Both the kernel and early utilities are predominantly written in C, with some
assembler support. Before a self-supporting kernel exists, approximately
250,000 lines of C code must be made operational via the cross-support. The
chance of discovering compiler bugs, or cross-support-induced bugs, is almost
certain.


What's in the Tool Chest?


Our tool chest of cross-tools consists of the following:
The C Compiler: The bulk of our effort is organized around the C compiler. For
the 386BSD project, we relied upon the Free Software Foundation's (FSF) GCC
compiler, version 1.34. At the beginning of this port, we had little
familiarity with the strengths and weaknesses of GCC. We were also uncertain
about its usefulness as an operating systems development tool, as it appeared
primarily alongside other 386 C compilers on extant System V UNIX systems.
Unfortunately, we cannot supply written code fragment examples from GCC (or
any other FSF software) in this article due to constraints of the "copyleft"
(see the accompanying text box entitled "Brief Notes: Copyrights, Copylefts,
and Competitive Advantage").
386 Protected-mode Assembler: The remaining 386 code, particularly the code
used for interfacing to non-C mechanisms and data structures needed to support
i386 and ISA hardware functionality, was written in assembly language. The
FSF's GAS assembler was used for this purpose, more out of need than
preference. The great majority of problems we encountered with the port were
traced to "hidden surprises" and "features" in GAS, which we bypassed with
clever use of inline code and other contrivances. GAS is functional and
proven, if not pretty.
Linker-Loader: Object modules created by GAS were linked together by an object
module linkage editor. We had a wide variety of candidates available from BSD,
FSF, and others from which to choose. However, because the object file format
exactly matched the arrangement of our cross-host (a National Series 32000
machine), we put off the ultimate decision by using our cross-host's native
UNIX ld command. This worked without modification to our satisfaction.
Communications and File Transfer: We needed a way to get programs and files
created on our cross-host transferred to our 386 PC. Many cross-host to PC
communications programs are available, and we settled on Kermit and NCSA
Telnet (ftp) to do the job.
Protected-mode Loader: Once we had transferred the programs to the PC, we used
our protected-mode loader program (see "Three Initial PC Utilities" in DDJ,
February 1991) to load the programs and execute them in 386 protected mode.
Ancillary Tools: In addition to the heavy hitters, various minor commands are
also needed to create and organize the object libraries. Commands such as ar,
ranlib, nm, and lorder were required. Again, like the ld command above, we
were able to use the cross-host's native commands due to the identical
executable format and byte order of cross-host and our 386.
In addition to these programs, our cross-support facility must have the
following data objects present to build kernel and utilities:
Object Libraries: The standalone system (libsa.a) and utilities (libc.a and
others) make great use of their respective library calls. These libraries
satisfy, on the average, a few hundred of the function entry and data
structure references invoked by various BSD utility programs. Most of the
machine-dependent portions of BSD utilities are located in the libraries, so
the majority of effort expended in porting the utilities is focused on the
libraries. Over the course of the 386BSD project, we wrote the
machine-dependent code into the libraries to get a given utility operational
only as needed, rather than writing it all at once. Incremental coding
provided a tactical advantage, because by the time we needed to wrestle with
the most difficult code, we had quite a bit of seasoned experience with the
386.
Include Files: In addition to object libraries, we must provide a complete set
of include files for use with our cross-support package. A simple approach
might be to have all references to include files directed to a separate i386
include directory, but this would interfere with the pathnames invoked by a
variety of makefiles and shell scripts, not to mention all the embedded
references in the source code itself. After finding over a hundred references
to absolute pathnames, with no end in sight, we gave up on this approach and
did the unspeakable--we installed all the 386 include files in place of the
cross-host's own. A pair of shell commands, to386 and back2normal, switched the
cross-host back and forth between the two sets. Thank goodness, no other users
needed to compile native programs at the time; they would have been somewhat
surprised!


Cross-Support Methodology


We can employ several standard methods to aid in our cross-support effort:
regression testing, divide and conquer, consistency checks, and defeating
optimizations.
Regression testing is used to probe for the presence of induced bugs in every
step along the way to proving our cross-tools. Prior to creating our
cross-compiler, we generate our early test files from a known good and
tested implementation (in the case of 386BSD, a Sequent 386 UNIX system). The
compiler output for some unmodified portions of the compiler and the kernel of
the operating system is kept as reference assembly language files, for
comparison against the output of subsequent compiler versions compiling the same
files. An induced error would cause a difference to show up in the comparison
of the two. As an example of this, a whole group of instructions might be
missing, signifying a dropped expression left uncompiled by a buggy compiler.
In a similar fashion, a group of object files from the assembler are also
created to compare with those created by the assembler on our 386.
In addition to this set of test files, a record is kept of every kind of
induced bug and the source code which generated it. Thus, common bugs which
are inadvertently reintroduced periodically can be caught without needing to
be debugged a second time (or a third ... ).
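The reference-file comparison described above can be sketched in a few lines of C. This is only an illustration, not the tooling used on the project: the function name first_difference and the MAXLINE limit are our own assumptions, and any stock text comparison utility does the same job. It reports the first line at which a newly generated assembly file departs from the saved reference, including the case where one file simply ends early (the "missing group of instructions" symptom).

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE 512   /* assumed upper bound on an assembly source line */

/* Compare a saved reference file with freshly generated compiler output.
 * Returns 0 if the two streams are identical; otherwise returns the
 * 1-based number of the first line that differs. A line present in one
 * stream but missing from the other counts as a difference. Lines longer
 * than MAXLINE are compared in pieces, so the reported number is then
 * only approximate. */
static long first_difference(FILE *ref, FILE *cur)
{
    char a[MAXLINE], b[MAXLINE];
    long line = 0;

    for (;;) {
        char *pa = fgets(a, sizeof a, ref);
        char *pb = fgets(b, sizeof b, cur);

        line++;
        if (pa == NULL && pb == NULL)
            return 0;        /* both ended together: no change */
        if (pa == NULL || pb == NULL)
            return line;     /* one output is shorter than the other */
        if (strcmp(a, b) != 0)
            return line;     /* same position, different text */
    }
}
```

A driver would open each reference file and its newly compiled counterpart, call first_difference( ) on the pair, and flag any nonzero result for inspection by hand.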

This mechanism for tracking compiler bugs is not a panacea--it is vulnerable
to error in two major ways: It does nothing to aid detection of "latent" bugs
in the "good" version we started with; and it becomes useless if modifications
to the compiler result in widespread changes in the output code, thus
obscuring "bug" changes. However, it proved adequate for the short period (one
to two months) it took to reliably compile code in native 386BSD.
"Divide and conquer" is used to isolate the effect of multiple bugs appearing
as a single impossible-to-find bug. It is a very powerful tool for use in
certain unpleasant predicaments. For example, during the 386BSD project, we
detected the presence of a kernel bug, a compiler bug, and a library bug all
hitting at the same point, at a time when we did not yet have an operable
debugger to sort out the mess. After isolating the problem with blitheringly
primitive printfs, we tried porting similar, related programs, until we found
a program that isolated the library bug and the compiler bug at separate
times. Once we fixed these bugs, we recompiled the entire set of kernel and
applications programs. The remaining kernel bug was then obvious to see and
correct. Divide and conquer allowed us to solve an "unsolvable" problem.
Consistency checks are implemented in the drivers and trap/system call
handlers to detect "impossible" conditions, such as returning to a user
program with interrupts off, a completely invalid user stack pointer, and so
forth. At one point, we even had them in library code and inline to the C
compiler's assembly language output. Throughout the 386BSD development cycle,
consistency checks provided a mechanism to detect a problem before it became
terminal and untraceable. For example, when we converted 386BSD from
4.3BSD-Tahoe to 4.3BSD-Reno, consistency checks detected a disastrous problem
caused by a side effect of the context-switch code. Consistency checks have
their downside, however. Performance degrades with the use of consistency
checks in speed-sensitive areas such as system call handling. Resist
temptation, however, and don't take them out just for convenience. Otherwise,
mysterious problems will reappear and drive you crazy.
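As a rough sketch of what such a check looks like, consider the two "impossible" conditions named above. The structure, the function name, and the stack limits below are hypothetical and chosen only for illustration; they are not taken from the 386BSD kernel. The one hard fact used is that the interrupt-enable flag is bit 9 (0x200) of the 386 EFLAGS register.

```c
#include <stdio.h>

#define PSL_I        0x00000200UL  /* 386 EFLAGS interrupt-enable (IF) bit */
#define USRSTACK_LO  0x00001000UL  /* hypothetical lowest legal user stack address */
#define USRSTACK_HI  0xFDBFE000UL  /* hypothetical top of the user stack */

/* Hypothetical, cut-down picture of the state saved on kernel entry. */
struct trapframe_sketch {
    unsigned long tf_eflags;   /* flags to be restored on return */
    unsigned long tf_esp;      /* user stack pointer to be restored */
};

/* Called just before returning to a user program. Returns NULL if the
 * saved state looks sane, otherwise a message naming the "impossible"
 * condition that was detected. */
static const char *check_user_return(const struct trapframe_sketch *tf)
{
    if ((tf->tf_eflags & PSL_I) == 0)
        return "returning to user program with interrupts off";
    if (tf->tf_esp < USRSTACK_LO || tf->tf_esp > USRSTACK_HI)
        return "completely invalid user stack pointer";
    return NULL;
}
```

In a kernel, a non-NULL result would normally lead to a panic with the message, which stops the system while the evidence is still fresh rather than letting the damage surface much later.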
Another type of seemingly benign tinkering which results in disaster comes
when one tries various performance optimizations too early in the game. We ran
into problems every time we tried jumping ahead by improving our early
development code before it was fully reliable. It is better to "comment out"
performance improvements, compiler optimization, and "short circuit" code
evaluation, until the code and compiler are somewhat shopworn. It is very
frustrating when you have found a mechanism for a section of code that might
improve performance by an order of magnitude or more, but only at the risk of
upsetting the kernel operation itself. Be wary of such improvements--patience
is definitely a virtue in a systems project.


Which C Standard?


In the early days of Berkeley UNIX (pre-Version 6), C was not yet
standardized. For example, types such as "unsigned" did not even
exist--instead, arithmetic was done on "char*" types. Partly as a result of
early portability experiments, Bell Labs eventually revised C to conform to a
definition devised by Brian Kernighan and Dennis Ritchie (K&R), two Bell Labs
scientists. Their book, The C Programming Language (Prentice-Hall, 1978),
defined what C was for almost the next ten years. Berkeley then adopted this
new "standard" for all related prior code and all new code when it began to
put a serious effort into developing new UNIX functionality. As the use of C
has grown, its popularity has necessitated the evolution and solidification of
an ANSI specification of the language and its semantics. Pre-K&R adherents to
C, ideological to a fault, have frequently found much amusement in this
obsession with standards. After all, they originally had to fight management
and funding group opposition to its use (partly on the grounds of
"standardization") in many major projects for which it was well suited, and
had to live with the barrage of Fortran, Pascal, and then Ada efforts to
displace C as the preeminent systems programming language of the day. Perhaps
those groups might finally agree that C will be around for yet a few years to
come!
What does this have to do with 386BSD? Plenty! It seems that some believe it
is time to move BSD, kicking and screaming, into the ANSI C world, but others
are still adherents of the K&R viewpoint. Since the K&R portable C compiler is
still used for slowly dying architectures and is yet a force to be reckoned
with, 386BSD must find a middle ground. 386BSD has an eye towards the
future, however, so a concerted effort has been made with 386-dependent code
to work within the new ANSI C format, while remaining compatible with K&R C in
common code by virtue of #ifdefs.
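The #ifdef technique can be sketched as follows. The PROTO macro and the function scale_clamp are invented for this illustration and do not come from the 386BSD sources; the only standard piece is the __STDC__ macro, which an ANSI compiler defines and a K&R compiler does not.

```c
/* Under an ANSI compiler PROTO((...)) expands to a full prototype;
 * under a K&R compiler it collapses to an empty parameter list. */
#ifdef __STDC__
#define PROTO(args) args
#else
#define PROTO(args) ()
#endif

static int scale_clamp PROTO((int value, int factor, int limit));

/* Multiply value by factor, but never return more than limit. */
#ifdef __STDC__
static int scale_clamp(int value, int factor, int limit)
#else
static int scale_clamp(value, factor, limit)
int value, factor, limit;
#endif
{
    int r = value * factor;

    return r > limit ? limit : r;
}
```

The body is written once and only the declarator is duplicated, which keeps the two dialects from drifting apart in common code.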
GCC attempts to remedy this conflict by providing a traditional mode, but this
is inadequate to our needs. GCC, it turns out, is not perfectly "traditional,"
as it favors ANSI semantics. (This should actually be no surprise, as it is
difficult to be complete in this regard.) As such, it is another source of
"silent" bugs that one should be aware of because the majority of the BSD code
was written to older standards.
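One well-known example of a semantic difference between the two dialects is integer promotion. We offer it only as an illustration of how such a "silent" bug arises, not as one of the specific problems met during the port. Many K&R compilers promoted an unsigned short to unsigned int ("unsigned preserving"), while ANSI C promotes it to int whenever int can hold every unsigned short value ("value preserving"). The function name below is our own.

```c
/* Returns 1 if the compiler promotes unsigned short to int (the ANSI
 * rule on machines where int is wider than short), and 0 if it
 * promotes to unsigned int, as many older compilers did. */
static int promotions_are_value_preserving()
{
    unsigned short us = 1;

    /* ANSI rule: us becomes the int 1, so us - 2 is -1 and the test is
     * true. Unsigned-preserving rule: us becomes the unsigned int 1, so
     * us - 2 wraps around to a very large positive value and the test
     * is false. */
    return (us - 2) < 0;
}
```

Code that depends on either outcome compiles without complaint under both rules, which is exactly why such bugs stay silent.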


Other Cross-Support Issues


In the area of cross-host communications, a few amusing irritants developed.
When we first used Kermit and ordinary serial lines for the early standalone
system and kernel work, the few minutes of download delay to MS-DOS were
livable, given that the debugging time required for each cycle was usually
about 20 minutes. As we got more proficient with the 386, however, and as we
reached the limits of our documentation on 386 features, our debug sessions
became shorter than the download time. Also, downloading a kernel (100 to 200
Kbytes) or a filesystem (1 to 5 Mbytes) began to occur more frequently, thus
eating up even more time. Finally, with the help of a cheap (approximately
$100) Ethernet card, we migrated to NCSA Telnet. This change cut the download
time to a more reasonable number.
Success frequently results in its own problems; we rapidly filled our tiny
40-Mbyte drive. It became increasingly difficult to manage slightly different
versions of utilities, and the cheap and clever tricks we had used to bypass
some development steps were themselves becoming stumbling blocks. Because we
were sharing the disk with MS-DOS and using MS-DOS utilities to communicate
with the outside world, files had to fit in the MS-DOS partition. By this
time, it was clear that the tenuous partnership between MS-DOS and BSD was
drawing to an end.


Validating GCC for Use in a Cross-Environment


We found GCC to have many fine qualities--unfortunately, cross-support
operation was not one of them. From its inception, GCC has traditionally been
run on the host on which it was compiled, and little thought has been put into
preserving its ability to run on a machine vastly different from that host. In
addition, some architectures supported under GCC relied to some degree on the
presence of a preexisting native compiler to compile GCC and parts of its own
compiler support libraries. To be fair, the compiler itself is quite capable
of compiling and supporting itself. However, as originally configured, neither
cross-support nor compiler bootstrapping is very satisfying.
Other hurdles which we had to surmount included locating host compiler bugs
upon compiling GCC. Unlike other compiler writers who attempt to minimize the
use of arbitrary C features in their code, GCC's creators revel in it. As a
result, compiling GCC itself constitutes an excellent test of a compiler
because of its rich use of the language, and the impressive demands (macros,
pointer dereferencing) it places on the said compiler. While this style of
implementation is at loggerheads with practical portability in our compromised
"real" world, we must admit that the creators of GCC show fearless, if not
reckless, faith in their compiler. No one else so completely exploits the C
language, at the price of providing faultless support for such an extensive
use of the language. The intellectual honesty required for such an
implementation has received its fair portion of praise.
In the course of attempting to qualify a cross-host, we attempted to compile
GCC on many machines. One less than serious attempt was made to compile
portions of GCC on MS-DOS using various common PC C compilers. As expected, we
got dismal results. We found that to compile GCC on MS-DOS, we would have to
extensively rewrite the code, and also use some manner of MS-DOS extender--an
effort not compatible with our specification goals. We did consider using the
standalone library (see "The Standalone System" in DDJ, March 1991) to run GCC
in native mode after compiling GCC on a borrowed 386 system elsewhere, but
gave up on this when our cross-host version of GCC stabilized. We worried that
these two PC-hosted approaches would not only require a great deal of
additional work, but also require us to maintain them in the future for avid
users. Perhaps a fate worse than death?
Our intended cross-host, a UNIX machine, had many problems in compiling GCC,
even though its native compiler had been part of a stable production system for many
years. However, consistency checks within GCC itself allowed us to locate the
nature of the problem to within a few thousand instructions, whereupon we
would tediously single-step to the problem with a debugger. Since we could not
fix the cross-host's native compiler (frequently this would mean exchanging
the bug you know for the bugs generated by the fix that you don't know), we
mauled GCC itself and defeated portions of the compiler in a successful
attempt to avoid code that the native compiler would mishandle. Due to the
nature of the native compiler bug (an obscure pointer aliasing problem), this
was the only way we could convince ourselves that we were not just migrating
the bug. As you might expect from our mention above, one of the best tests of
our then-generated cross-compiler was GCC itself.
Another aspect of running GCC in a cross-environment is dealing with an
internal support library known as gnulib.a. GCC is arranged so that portions
of machine-dependent operations not implemented by the compiler itself with
issued assembly code will instead be implemented by a subroutine call to a
gnulib.a entry point. To cleverly implement these missing areas within the
compiler, one creates gnulib.a by compiling source code encapsulating the
missing feature with the native host's compiler (not GCC), relying on it to
implement the missing feature as it sees fit. Here's an example. Suppose we
have the C expression:
 if(a != b)....
Let's assume the compiler does not know how to handle !=. It could generate
code to call a gnulib entry point:
 ...
 pushl _a
 pushl _b
 call noteqsi2
 ...
The gnulib would contain code compiled with a different native compiler than
GCC, one that can deal with a != expression:
 noteqsi2(n,m) {
 return(n != m);
 }
This is a sneaky way to leverage an existing native compiler to fill out voids
in GCC. Surprisingly, this works with our cross-host in most cases. We
implemented a replacement for gnulib only as needed (few of its routines are ever called).
We ran into an entertaining problem when we first moved the compiler onto the
386. Because we no longer needed our cross-host modifications to GCC, we
started recompiling the stock version of GCC, including gnulib, with the only
compiler we had on our nascent BSD UNIX system, namely GCC. GCC generated code
that would call the support library, which in turn would then call itself to
implement the same support, and so on ad infinitum! This is another minor
example of the lack of native support for GCC in the then-standard release. It
is expected that GCC 2.0 and later versions will better address these and
other cross-support issues.


GCC Support Calls to Replace GNULIB


In addition to the normal subroutine libraries found with BSD, two support
subroutines are needed. GCC handles all ANSI C operations by generating the
appropriate 386 instructions, with the exception of floating point conversion
to signed and unsigned integers. In Listing One, fixdfsi( ) takes a
double-precision floating point argument (a df) and turns it into a signed
integer (an si, a small integer that fits within a machine word). In Listing
Two, fixunsdfsi( ) likewise takes a double-precision floating point
argument and returns an unsigned (uns) integer. These functions use the 386
numeric processor integer truncation features to return the appropriate
values. Because there is no direct method to convert a floating point number
to unsigned format, we detect the condition (for example, above the most
positive number possible), reduce the value prior to conversion (so it will
fit into a signed value), then add back in what we subtracted after
conversion, thus avoiding overflow.
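The reduce-convert-restore trick can be shown in portable C. This is a sketch of the idea only, not the contents of Listing Two: the real routines use the 386 numeric processor directly. The name double_to_unsigned is our own, and the sketch assumes a 32-bit int and an argument of at least zero and less than 2^32.

```c
#define TWO31 2147483648.0   /* 2^31: first value too large for a signed 32-bit int */

/* Convert a double to an unsigned 32-bit integer using only a
 * double-to-signed conversion, as on hardware that offers nothing else. */
static unsigned int double_to_unsigned(double d)
{
    if (d >= TWO31) {
        /* Too large for a signed conversion: reduce by 2^31 so the
         * value fits, convert, then add the 2^31 back in unsigned
         * arithmetic, where it cannot overflow. */
        int reduced = (int)(d - TWO31);

        return (unsigned int)reduced + 0x80000000U;
    }
    /* Already within signed range: the signed conversion is enough. */
    return (unsigned int)(int)d;
}
```

For example, 3000000000.0 is above the signed limit, so it is reduced to 852516352.0, converted, and restored to 3000000000.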


Choosing a Sensible Cross-Host


Our ad hoc modifications of GCC resulted in a cross-compiler that would
provide a considerable amount of language support, but it had limits. We also
needed to consider the following: include file differences, byte sex, floating
point format, inline assembly code, table generation programs, hardware page
size, and object libraries. Some of these areas were so pervasive and
important that they were primary considerations when we selected our
cross-host.
By selecting an appropriate cross-host, we minimized a number of problems,
including compatible byte sex, structure data alignment, program size, and
existing tool set. Floating point data format turned out to be a minor concern
because few programs in the early utilities group require it. Thanks to the
IEEE floating point standard, this becomes easy as most post-VAX period
processors support the same format (modulo byte order). Obviously, our job
would have been simpler if we already had 386BSD up and running and then had
to port it, so what we looked for in a cross-host was something very similar.
Oddly enough, a C compiler hides most of the native machine's instruction set,
so the least important part is the cross-host's processor architecture.
Operating system version and program development tool similarity count for
much more.
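Two of these properties, byte sex and structure alignment, are easy to probe on a candidate cross-host with a few lines of C. The sketch below is our own illustration (the function and structure names are invented); the results can be compared against the 386, which stores the least significant byte first.

```c
#include <stddef.h>

/* Returns 1 if the host stores the least significant byte at the lowest
 * address (as the 386 does), 0 otherwise. */
static int host_is_little_endian()
{
    unsigned long probe = 1;

    return *(unsigned char *)&probe == 1;
}

/* A char followed by a long: the compiler must pad after the char so
 * that the long lands on its required boundary. */
struct align_probe {
    char c;
    long l;
};

/* Returns the offset of the long member, which shows the alignment the
 * host compiler applies to a long inside a structure. */
static unsigned long long_alignment()
{
    return (unsigned long)offsetof(struct align_probe, l);
}
```

If both answers match those of the target, data structures written out by cross-host tools (object files, archives, filesystem images) can be read on the 386 without any byte swapping or repacking.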
Those more dogmatic, gutsy, or energetic might say that we simply avoided the
hard parts. They are quite correct. What hardships we did endure in
cross-tools were more than enough for us.


Where Do We Go From Here?



Now that we have created a stable cross-tools environment, we can get on to
the last of our initial utilities--the initial root filesystem. In our next
article, we will examine the minimum requirements which must be met to run a
UNIX system, and the interrelationships between different UNIX files and
utilities needed during the various stages of our 386BSD port. We then create
a root filesystem containing, among others, /etc/init, /bin/sh, /dev/console,
and /bin/ls (a token program), and debug it via the standalone utilities. We
also discuss some of the problems encountered in filesystem downloading and
validation procedures.


Brief Notes: Copyrights, Copylefts, and Competitive Advantage


Usually when we discuss a piece of software, we attempt to enhance our
understanding with a program or fragment of code which illustrates the topic.
Therefore, it is quite frustrating to discuss as major a tool as GCC, where
the code is available to anyone upon request but we are prevented by the
"copyleft" from showing you any code fragments. As such, we feel it important
to examine the history and some effects of the copyleft.
The copyleft on GNU software was born out of rather turbulent circumstance. In
the mid-1980s, a number of commercial entities made a practice of
"appropriating" software developed at MIT and other universities and placing
their own copyright on it. Richard Stallman, then (and still) at the MIT AI
Lab, was involved with some early LISP software development, and experienced
first hand the ruthless and bloody battle between Symbolics and LMI over LISP
software enhancements. At the same time, AT&T was at the forefront of the
development of license agreements for UNIX, though not investing much at that
time in the development of UNIX itself. This obvious (and still successful)
locking up of research led Stallman and others to work on software projects
which would be unencumbered by licenses, copyrights, and other restrictive
means. Stallman's EMACS for the PDP-10 was one of the first visual editors
available without those restrictions.
While commendable in theory, the practice was quickly thwarted by the success
of Gosling's EMACS, a C-based version of Stallman's EMACS, which ran under
UNIX. As more use was made of Gosling's EMACS, companies began to support it,
add new features, and so forth, until finally it was locked-up by the vendors.
Of course, it goes without saying that the changes to the code and new
features were not returned to Stallman's group for updates, since that would
have impacted a vendor's perceived competitive advantage.
Basically, the copyleft was an extreme response to the excesses of a cutthroat
market. While permitting redistribution, the copyleft attempts to maintain
access to and control of changes in code, by requiring that source
modifications be returned to the FSF for redistribution and by demanding that
the source with these modifications be made available from that company to
anyone for essentially a "copying" fee. A liberal reading of the license makes
it practically impossible for a company to easily lock up the software. It
also prevents a company from easily recouping its investment in further
software development, enhancements, or support by eliminating its competitive
advantage over its competitors. A large company can avoid this by developing
or licensing needed software tools, but a small business or individual
developer does not have access to these resources.
Finally, the copyleft attempts to exert control over any discussion and
analysis of the code itself in any printed medium, and states in part: "...The
'Program,' below, refers to any such program or work, and a 'work based on the
Program' means either the Program or any work containing the Program or a
portion of it, either verbatim or with modifications ...."
Thus, according to the copyleft, a written examination of GCC, which utilizes
some of the code itself for purposes of discussion, falls under the copyleft
itself. This is a condition unacceptable to authors and publishers, because
they make their income only from the publishing and distribution of written
works, and not necessarily from software products. Perhaps this was an
unintended side effect of the copyleft, but attempts to narrow it have been to
no avail.
The headlong rush towards "open standards," an oxymoron worthy of the
military, is no solution either, but merely an effort to mask the implicit
control, development, and innovation of a proprietary object by a vested
interest by calling it "open." The only open standard is one that has an
openly accessible model or example of the standard itself. Just as a
mathematical formula in physics is meaningless without example problems and
solutions, a standard based on a proprietary object is also meaningless
without code solutions which justify its worthiness--and the code answer book
to this open standard should not be subject to ransom through the use of
"licensing" fees and anticompetitive product controls. Such a standard must
also be equally accessible to those developing proprietary and nonproprietary
works. This not only mitigates the inherent competitive disadvantage for the
small innovator, but is also a disincentive to the development of proprietary
"copycat" standards alongside the open standard, in an attempt to undermine
its use.
Recently, the trend at many universities and research institutions has been to
permit access to university-developed code through simple copyright procedures
which permit modification and redistribution with attribution. The copyright
used by TeleMuse, for example, is similar to the University of California at
Berkeley (UCB) copyright and is designed to be simple and direct; see Figure
1.
Figure 1: The copyright used by TeleMuse in the 386BSD article series

 /* Copyright (c) date, name-of-author. All rights reserved.
 * Written by name-of-author, date-written.
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.*/

In addition, UCB copyrights currently prohibit use of the UCB name in products
incorporating the software to avoid the appearance of an endorsement.
According to Marshall Kirk McKusick, UCB CSRG research computer scientist and
president of USENIX, "We have the capitalists with their copyright and the
radicals with their copyleft. We are at the 'copycenter,' since we allow
redistribution with credit to the authors. Our goal is to have as many people
as possible use our software." In January of 1991, CMU adopted a variant of
the UCB copyright for the MACH operating system.
This different approach to copyright does not attempt to regulate the
development and distribution of code as does the copyleft. Instead, software
is made available with the full knowledge that it will be incorporated into
many different projects. These projects, in turn, will ultimately enhance the
international competitiveness of the computer industry itself, by allowing
individuals and small businesses the same access to these development tools as
large corporations. After all, it is the individual and small business which
are the sources of innovation in our society. Anything less (including the
copyleft) results in a competitive advantage only for large companies with a
vested interest in the status quo.
The Free Software Foundation deserves high praise for leading the fight
against locked-up software. Some GNU packages, such as GCC and EMACS, have
been used by small firms and research groups to develop innovative and unique
software and products, which would not otherwise have been feasible for these
economically strapped entities. Even 386BSD might not have been possible had
we not been able to leverage other resources like GCC. However, as the climate
in which the copyleft was developed has moderated, we hope that the FSF will
moderate its stand as well, and at the very least permit unfettered discussion
and analysis of the code in print. We have every confidence that there will
continue to be a flow of new software back to the source from companies,
individuals and research groups.
It is time vested interests started offering innovative and competitive works
and stopped preventing innovation through the "anticompetitive" use of
copylefts, open standards, and licensing. Those who maintain a competitive
advantage through the inappropriate use of these methods, instead of through
true innovation, have done so at the cost of the competitiveness of the entire
domestic computer industry. --L.J.

_PORTING UNIX TO THE 386: LANGUAGE TOOLS CROSS SUPPORT_
by William Frederick Jolitz and Lynne Greer Jolitz


[LISTING ONE]

/* fixdfsi.s: Copyright (c) 1990 William Jolitz. All rights reserved.
 * Written by William Jolitz 1/90
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * GCC compiler support function, truncates a double float into a signed long.
 */

 .globl ___fixdfsi
___fixdfsi:
 pushl $0xe7f /* truncate, long real, mask all */
 fnstcw 2(%esp) /* save my old control word */
 fldcw (%esp) /* load truncating one */

 fldl 8(%esp) /* load double */
 fistpl 8(%esp) /* store back as an integer */
 fldcw 2(%esp) /* load prior control word */

 popl %eax
 movl 4(%esp),%eax
 ret





[LISTING TWO]

/* fixunsdfsi.s: Copyright (c) 1990 William Jolitz. All rights reserved.
 * Written by William Jolitz 4/90
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 * GCC compiler support function, truncates a double float into unsigned long.
 */

 .globl ___fixunsdfsi
___fixunsdfsi:
 pushl $0xe7f /* truncate, long real, mask all */
 fnstcw 2(%esp) /* save my old control word */
 fldcw (%esp) /* load truncating one */
 fldl 8(%esp) /* argument double to accum stack */
 frndint /* create integer */
 fcoml fbiggestsigned /* bigger than biggest signed? */
 fstsw %ax
 sahf
 jnb 1f

 fistpl 8(%esp)
 fldcw 2(%esp) /* load prior control word */
 popl %eax
 movl 4(%esp),%eax
 ret

1: fsubl fbiggestsigned /* reduce for proper conversion */
 fistpl 8(%esp) /* convert */
 fldcw 2(%esp) /* load prior control word */
 popl %eax
 movl 4(%esp),%eax
 addl $2147483648,%eax /* restore bias of 2^31 */
 ret

fbiggestsigned: .double 0r2147483648.0 /* 2^31 */
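
The bias trick in Listing Two may be easier to follow in a high-level model. The Python sketch below is not part of the original listings; it only mimics the arithmetic, assuming a finite, nonnegative input below 2^32. The names fixunsdfsi_model and BIGGEST_SIGNED are hypothetical.

```python
BIGGEST_SIGNED = 2147483648.0  # 2^31, the same constant as fbiggestsigned


def fixunsdfsi_model(x):
    """Model of ___fixunsdfsi: truncate a double to an unsigned 32-bit value."""
    # frndint under the truncating control word drops the fraction.
    truncated = float(int(x))
    if truncated < BIGGEST_SIGNED:
        # Fits in a signed long, so a direct integer store is safe.
        return int(truncated) & 0xFFFFFFFF
    # Too large for a signed store: subtract 2^31, convert,
    # then add the 2^31 bias back in integer arithmetic.
    reduced = int(truncated - BIGGEST_SIGNED)
    return (reduced + 2147483648) & 0xFFFFFFFF
```

Values at or above 2^31 take the second path, which corresponds to the code at label 1 in the listing.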














April, 1991
WHAT IS BIOCOMPUTING?


Biologically-inspired approaches to creating software




Ray Valdes


Ray is a technical editor for DDJ and can be reached at 501 Galveston Drive,
Redwood City, CA 94063.


BioComputing refers to biologically inspired approaches to creating software.
In recent years a number of these techniques and technologies have emerged.
Some are proven and out in the market, others are still being nurtured in
research labs. Together they hold great promise in allowing software
developers to push the envelope of program complexity and size, and to allow
for tractable solutions to thorny programming problems such as pattern
recognition.
The most visible success of these techniques is neural networks, also known as
neurocomputing, neoconnectionism, and parallel distributed processing. Other
BioComputing technologies include genetic algorithms, iterated function
systems, fuzzy logic, simulated annealing, fractal systems, cellular automata,
L-systems, classifier systems, and chaotic dynamics. To our knowledge, this is
the first time they have been grouped in this way. Yet, as we learn more about
them, the interconnections become more clear.
In some cases, the techniques are related at a deep theoretical level--for
example, chaotic dynamics, fractals, L-systems, and iterated functions stand
on common mathematical ground. In other cases, the techniques exist in their
own distinct territory (for example, fuzzy logic), but still maintain some
connection to the natural and biological world.


The Need for BioComputing


The 1980s was the decade in which real-time networked systems, GUIs, and
distributed applications appeared en masse, with program size and complexity
taking a big leap. It was during this time that many of us encountered the
limits of existing software technology.
For example, the ten million lines of code in the Space Shuttle software
system came to a screeching halt during one launch countdown when an error,
caused by a missing comma in one program, was encountered. Last year, the
entire AT&T phone network was immobilized for a day when it encountered an
anomalous condition in the multiprocessor software.
Closer to home, many of the leading companies in the PC software industry have
been embarrassed by long-delayed and/or bug-ridden programs (such as dBase IV
1.0, Lotus 1-2-3/G, OS/2, Windows, Macintosh System 7, and so on). The bigger
they are, the harder they fall, and the longer it takes.
It's no surprise that computer scientists have been looking for ways to build
software systems that are less brittle, more reliable, robust, and flexible,
yet still allow for high levels of functionality. Many directions have been
explored, from OOP to CASE to AI to Steve Jobs's fresh-squeezed orange juice
and high salaries for small, dedicated teams. One exciting set of approaches
falls under the rubric of BioComputing.


Mother Nature Knows Best


Lest anyone misunderstand, the rationale behind BioComputing is not to create
big programs. In fact, most current implementations are modest in size and
resource consumption. Rather, the rationale comes from the fact that, as the
number of lines in a conventional program increases, its complexity approaches
that of a small bioorganism.
Living systems are intricate structures made out of simpler components
(cells), with high degrees of redundancy, fault tolerance, and adaptiveness.
Incredibly detailed and complex biostructures arise from what must be very
small sets of rules. The engineering specifications for a Boeing 747 are equal
in size and volume to the aircraft itself. The specifications for a six-foot
human being (that is, our DNA genetic code) fit into a container less than a
millionth of a cubic inch in size, and human beings are orders of magnitude
more complex than a 747. Moreover, unlike the Space Shuttle, we don't come to
a screeching halt when there is a misplaced comma in our DNA sequences (in
general).
Hans Moravec, a computer scientist and robotics researcher at CMU, has tried
to compare the processing power of natural organisms to artificial CPUs. In
his view, one MIPS is roughly equivalent to 100,000 neuronal cells in the
brain. In terms of both storage capacity and processing speed, then, a
Macintosh computer is roughly equivalent to a snail, and a Cray-2 is at the
level of a small rodent.
We can extend Moravec's hardware comparison to software. The DNA sequence for
a human being is about six billion bits long, and represents the engineering
specification for the human body. This sounds like a lot of information, but
is surprisingly small when compared to contemporary software packages. Our DNA
is a package of code and data only 1000 times larger than the information
content of products like dBase IV or Lotus 1-2-3. Clearly, nature's
programmers are not using C or assembler.
Actually, the effective information in DNA is much less than 6 billion bits,
because only 5 percent of our DNA is used as data by the human body. This
active DNA is about 40 Mbytes worth of information -- the size of a PC/AT disk
drive. The other 95 percent is so-called "junk DNA" or "introns," which
contain information not used by the target system. Some of this data can be
seen figuratively as "stop bits" or parity bits--information used internally
for error correction and preservation of the genetic signal. Other parts of
the junk DNA represent outdated information that was applicable further back
along the evolutionary timeline--in other words, on obsolete hardware
platforms. Clearly, nature's programmers do know how to use #ifdefs to comment
out code that is no longer needed.
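
The arithmetic behind the "40 Mbytes" figure can be checked in a few lines. This is only a back-of-the-envelope sketch using the numbers quoted above (six billion bits, 5 percent active); the function name and the use of decimal megabytes are assumptions made for illustration.

```python
GENOME_BITS = 6_000_000_000   # total figure quoted in the text
ACTIVE_FRACTION = 0.05        # share of DNA the text describes as used


def active_megabytes(total_bits=GENOME_BITS, fraction=ACTIVE_FRACTION):
    """Return the active portion of the genome in decimal megabytes."""
    active_bits = total_bits * fraction
    return active_bits / 8 / 1_000_000


print(round(active_megabytes(), 1))
```

The result is 37.5 Mbytes, which the text rounds to about 40.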


Neural Nets


Artificial neural nets are loosely based on current theories of how the
natural brain works, that is, through interconnections of neuronal cells. In
reality, the specifics of mammalian brain function continue to elude
researchers, but this has not stopped neural net researchers from making great
strides in neural net models and implementations.
The application areas of neural nets are broad and surprisingly widespread,
although the principal task is pattern recognition. One of the earliest neural
net models, Bernard Widrow's adaptive linear element (Adaline) has been in use
since 1959 as adaptive hardware filters to eliminate echoes on phone lines. In
recent years artificial neural nets have been used for pattern recognition,
image processing, compression, speech synthesis, natural language processing,
noise filtering, robotic control, and financial modeling.
A key advantage of neural nets over conventionally written programs is the way
in which nets imitate the brain's ability to make decisions and draw
conclusions when presented with complex, noisy, irrelevant, and/or partial
information. As computer applications broaden to include hand-held devices
that process handwriting and voice input, the neural net's ability to make
sense of messy real-world data becomes more critical.
Another advantage is that a neural net application is not a handcrafted program,
but rather the result of feeding training data to a net model, which then
learns to output the desired results. Once a net has been trained, it will
also be able to deal with input that is different from what it has been
trained with, as long as the input is not too different. This is a big
advantage over conventional software, which must be specifically programmed to
handle every anticipated input. Presumably, this will allow us to avoid
"missing comma" fiascos like the aborted Shuttle launch.

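To make "training instead of programming" concrete, here is a minimal sketch of a single adaptive linear element trained with the Widrow-Hoff (least mean squares) rule. It is an illustration only, not Widrow's hardware filter: the learning rate, epoch count, and the choice of a two-input AND task with inputs and targets coded as -1 and +1 are all assumptions.

```python
def train_adaline(samples, rate=0.01, epochs=500):
    """Widrow-Hoff (LMS) training of one linear element."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            output = bias + sum(w * x for w, x in zip(weights, inputs))
            error = target - output
            # Nudge each weight in proportion to the error and its input.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


def classify(weights, bias, inputs):
    """Threshold the linear output into a -1/+1 decision."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= 0 else -1


AND_SAMPLES = [((-1, -1), -1), ((-1, 1), -1), ((1, -1), -1), ((1, 1), 1)]
weights, bias = train_adaline(AND_SAMPLES)
```

Once trained, the element also handles inputs it never saw, such as classify(weights, bias, (0.8, 1.2)), as long as they stay reasonably close to the training patterns.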

Fractals and Iterated Function Systems


Fractal shapes are a class of object that results from the repeated evaluation
of a certain simple mathematical function to produce a complex shape with
infinite level of detail. (Actually, that is a loose definition. More
precisely, Mandelbrot's definition of a fractal is "a set whose Hausdorff
dimension is not an integer" -- a fractional dimension.)
Although discovered by a mathematician, fractal techniques have proved
invaluable in rendering computer graphic images of natural objects with
uncanny detail and texture. In recent years, the deep interconnections between
fractal objects, iterated function systems (IFS), and chaotic dynamics have
begun to be plumbed, demonstrating results in both the biological and natural
sciences as well as the computer sciences.
In the natural sciences, fractal methods have proved a useful tool for
visualizing the chaotic dynamics of nonlinear systems. Chaos science, also
known as nonlinear dynamics, is a mathematical modeling technique used to
represent the complex behavior of feedback-based systems, which can range from
the human heart to global weather systems to the interactions between
mammalian neurons to the movements of planets.
Perhaps the most practical use of fractals has been in the PC industry, with
Michael Barnsley's use of IFSs for data compression of scanned images. IFSs
are a compact way of representing a subclass of fractals, those that can be
partitioned into a number of tiles. Barnsley's technology uses the iterative
tiling characteristics of IFSs to obtain dramatic compression ratios of 500:1,
or in some cases, as much as 10000:1.
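
To give a feel for how little data an IFS needs, the sketch below decodes one of the simplest examples, a Sierpinski triangle defined by three "move halfway toward a corner" maps. This is the well-known random iteration ("chaos game") method, not Barnsley's compression technology; the corner coordinates, starting point, and function name are assumptions.

```python
import random

# The whole "image" is stored as three corner points: six numbers.
CORNERS = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]


def ifs_points(n_points=5000, seed=0):
    """Generate points on the attractor by random iteration."""
    rng = random.Random(seed)
    x, y = 0.25, 0.25
    points = []
    for _ in range(n_points):
        cx, cy = rng.choice(CORNERS)
        # Each map contracts the plane by half toward one corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points
```

Plotting the returned points shows the familiar triangle-within-triangle pattern emerging from six stored numbers.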


L-Systems



Although fractal systems of equations can produce images of natural objects
like trees or mountains with an uncanny resemblance, it's hard to see the
direct connection between the iterative function z^2 + c --> z and a natural
organism. The discoverer of the set defined by that function, Benoit
Mandelbrot, would have no quarrel with that, because he never intended to
directly model the internal workings of biological processes.
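
For readers who have not seen the iteration written out, here is a small sketch of the usual escape-time test for membership in the set. The iteration limit of 100 and the escape radius of 2 are conventional choices rather than anything specified in this article, and the function name is hypothetical.

```python
def escape_time(c, max_iter=100):
    """Iterate z^2 + c --> z from z = 0.

    Returns the step at which |z| first exceeds 2, or None if the
    point has not escaped after max_iter steps (probably in the set).
    """
    z = 0j
    for step in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return step + 1
    return None
```

Coloring each point c of the complex plane by its escape time produces the familiar images.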
In contrast, Lindenmayer systems, or L-systems, are similar to fractals in
some respects, but were created with the specific intent of modeling nature.
Aristid Lindenmayer, a biologist, conceived a mathematical theory of plant
development in 1968. An L-system is a set of rules that specify a repeated
sequence of transformations on a starting shape. L-system rules are not unlike
the BNF rules that specify the syntax of programming languages; but instead of
parsing expressions, they generate data. Given the appropriate starting shape
and rules, you can generate images of plant-like objects that are incredibly
similar to the real thing.
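
The rewriting step at the heart of an L-system fits in a few lines. The sketch below uses Lindenmayer's algae system (A becomes AB, B becomes A) on strings rather than shapes; turning the symbols into drawing commands is left out, and the function name is hypothetical.

```python
def rewrite(axiom, rules, generations):
    """Apply the production rules to every symbol, in parallel, each generation."""
    current = axiom
    for _ in range(generations):
        # Symbols without a rule are copied unchanged.
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


ALGAE_RULES = {"A": "AB", "B": "A"}

for generation in range(5):
    print(generation, rewrite("A", ALGAE_RULES, generation))
```

Two rules and a one-symbol axiom produce strings whose lengths follow the Fibonacci sequence, a small instance of the "compression" discussed in this section.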
As with fractals, a small amount of information can produce a large detailed
object. In fact, one can say that an L-system is merely another way to
represent a fractal set. But unlike conventional fractals, L-systems are used
to mimic both the natural end result (such as the shape of a full-grown tree),
as well as the stages of growth along the way, from twig to sapling to mature
tree.
The dramatic "compression ratios" in L-systems (and IFS) provide some inkling
as to how a portion of the 10^8 bits in DNA can specify the human neural network of
10^11 processing elements and 10^14 interconnections. No biologist has found
the place inside a plant where L-system rules are stored. That was never a
goal of this research. Rather, L-systems should be seen as a formal exercise
in understanding how the natural processes of growth can be specified,
modeled, controlled, and predicted.
More recently, computer graphics people such as Przemyslaw Prusinkiewicz (who
generated the image on the cover of the magazine) have been using L-systems as
a rendering technique for more natural images.


Genetic Algorithms


A biologically inspired technique that is not used in computer graphics is
genetic algorithms (GA), invented by John Holland in 1975. (See the article
entitled "Genetic Algorithms" in this issue for a more detailed description.)
Briefly, this approach provides programs with a means for finding a particular
solution in a general search space by mimicking the natural processes of
evolution, mutation, and natural selection. Although GAs are directly inspired
by biological processes, in practice, the connection is rather loose. GAs are
as accurate a model of evolution as an artificial neural network is a model of
the brain -- which is to say, not necessarily so.
Nevertheless, a close imitation of nature is not a requirement. These
problem-solving techniques are valuable in and of themselves.
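
A bare-bones version of the idea might look like the following. It is a sketch under stated assumptions, not Holland's formulation: the toy problem (maximize the number of 1 bits), tournament selection, one-point crossover, and keeping the best individual each generation are all choices made for brevity.

```python
import random


def fitness(bits):
    """Toy objective: count the 1 bits."""
    return sum(bits)


def evolve(n_bits=20, pop_size=30, generations=60, mutation_rate=0.02, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # carry the best individual over unchanged
        while len(next_pop) < pop_size:
            # Selection: the fitter of two random individuals becomes a parent.
            mother = max(rng.sample(pop, 2), key=fitness)
            father = max(rng.sample(pop, 2), key=fitness)
            # Crossover: splice the parents at a random cut point.
            cut = rng.randrange(1, n_bits)
            child = mother[:cut] + father[cut:]
            # Mutation: occasionally flip a bit.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

Because the best individual is always carried forward, the best fitness recorded in history never decreases from one generation to the next.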
There is another semirandom search technique, called "simulated annealing"
(basically equivalent to GAs) that mimics the crystallization of a liquid as
it cools (or the annealing of a metal as it is heated and cooled). This
technique is similar to GAs in that it consists of repeatedly generating a
trial solution, testing it against the desired goal, and then semirandomly
mutating the solution to see if a better fit can be found. As the
"temperature" cools down, the program settles into a near-optimal solution to
the problem. The equivalence between simulated annealing and genetic
algorithms points out how the same process can be manifested in both
biological and nonbiological natural systems. (See "Simulated Annealing" by
Michael McLaughlin, DDJ September 1989.)
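
The generate, test, and mutate loop described above can be sketched as follows. The geometric cooling schedule, the step size, and the one-dimensional test function are assumptions made for illustration; this is not the code from the cited DDJ article.

```python
import math
import random


def bowl(x):
    """Toy cost function with its minimum at x = 3."""
    return (x - 3.0) ** 2


def anneal(cost, start, steps=5000, start_temp=10.0, cooling=0.999, seed=0):
    rng = random.Random(seed)
    current = best = start
    temp = start_temp
    for _ in range(steps):
        # Semirandom mutation of the current trial solution.
        candidate = current + rng.uniform(-0.5, 0.5)
        delta = cost(candidate) - cost(current)
        # Always accept improvements; accept setbacks with a probability
        # that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
        if cost(current) < cost(best):
            best = current
        temp *= cooling
    return best
```

A call such as anneal(bowl, 10.0) starts far from the minimum, wanders freely while the temperature is high, and settles near x = 3 as it cools.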
Genetic algorithmists such as John Holland are now working on an improved
technique for parallel search and optimization, called "classifier systems."
Classifier systems are similar to L-systems in that they are rule-based;
however, they incorporate genetic evolution and mutation. They have been
proven formally equivalent to connectionist systems (artificial neural
networks).
Other researchers are combining GAs with other techniques--for example, using
GAs to evolve different neural net models. Or using GAs to transform and
mutate conventional programs, for example, to produce a near-optimal sort
program for a given set of data. Like neural nets, GAs map very easily to
massively parallel hardware, which explains the serious interest of
semiconductor makers (Intel, TI) in neural nets and other BioComputing
technologies.


Other BioComputing Techniques


Fuzzy logic, or the theory of fuzzy sets, is a technique that is inspired by
nature without intending to be a realistic model of any physical objects.
Invented by Lotfi Zadeh in 1965, fuzzy logic is an extension to mathematical
logic that allows for "soft" values that are in between the hard values of
true and false (0 and 1). The intent is to be able to deal in a meaningful way
with imprecise notions or concepts that do not have exact boundaries. This
does not mimic how the physical human brain works, but it does follow how the
human mind seems to carry out its reasoning.
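
A short sketch shows what "soft" values look like in practice. It uses the min, max, and complement operators commonly attributed to Zadeh's original formulation, together with a triangular membership function; the "warm" temperature range in the usage note is a made-up example, and all function names are hypothetical.

```python
def triangular(x, left, peak, right):
    """Degree of membership (0.0 to 1.0) in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_and(a, b):
    return min(a, b)


def fuzzy_or(a, b):
    return max(a, b)


def fuzzy_not(a):
    return 1.0 - a
```

If "warm" is taken to peak at 20 degrees and fade out at 15 and 25, then triangular(17.5, 15, 20, 25) gives 0.5: the reading is "half warm" rather than simply true or false.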
Fuzzy logic has found many practical adherents in Japanese companies. This
technology is now a key element in products such as Canon autofocus cameras,
Hitachi washing machines, and Nissan and Subaru transmissions, as well as in
the handwriting input recognition found in the Sony Palmtop computer.


The Flip Side of BioComputing


Until now, I've focused on biologically inspired ways of creating software.
But no discussion of BioComputing would be complete without mentioning its
flip side: the use of computers and information science to study biological
systems, which we call computational biology. Some techniques mentioned
earlier (neural nets, L-systems) started out with the intent of understanding
biological systems. However, these have since become more a computational
technique than a biologically faithful model.
Nevertheless, researchers continue to work on the flip side, studying
bioorganisms as information-processing entities. For example, one author in
the field of immunology makes the case that the immune system, which is a
complex network of interacting elements distributed throughout the human body
(such as the three-foot long filament of human DNA), can be viewed as a
cognitive process, an entity capable of parallel distributed search, pattern
recognition, and associative memory, not unlike an artificial neural network
or genetic classifier system.
Computational biology has had its largest impact in the field of medical
molecular genetics, although researchers there would not use this phrase to
describe their work. Given that bioorganisms are the result of "executing"
biochemical programs stored on a digitally encoded tape (the long sequences of
genetic DNA), medical researchers have discovered that some diseases are quite
literally data transmission errors in the genetic signal. The fatal hereditary
disorder cystic fibrosis involves a single change in a data value, like the
missing comma in the Space Shuttle software. Like that missing comma, this
fatal data error can be detected, corrected, and the program restored to a
healthy state.
Naturally, better tools are being built to improve on these old-fashioned
methods: automatic computer-controlled DNA sequencers ("disassemblers"),
polymerase chain reaction machines ("digital tape duplicators"), molecular
design workstations ("CASE tools"), and so on. There are of course serious
bioethical issues that need to be addressed here, as well as profound
technical challenges. And who knows, maybe 30 years from now Borland or
Multiscope will come out with an interactive debugger for home-brewed
life-forms, complete with read/write device for DNA sequences.




























April, 1991
UNDERSTANDING THE GPIB


The general-purpose instrumentation bus has a wide range of applications




Don Morgan


Don is a consulting engineer in the area of embedded systems and automation
and can be contacted care of Don Morgan Electronics, 2669 N. Wanda, Simi
Valley, CA 93065.


The general-purpose instrumentation bus (GPIB), or IEEE 488 Bus, is a
high-speed communications bus widely used in engineering and scientific
applications. It was developed in the mid-'60s as the HP-IB by
Hewlett-Packard to ease the integration of computers and instrumentation.
Today, the GPIB is used in scientific workstations and computer-based systems
where the instrumentation involved--oscilloscopes, meters, and equipment such
as lasers and plotters--needs to exchange information.
The reason for the GPIB's popularity is manifest in the objectives of its
creators: 1. To define a general-purpose system for use in small or limited
distance applications; 2. to develop common mechanical, electrical, and
functional interface requirements that would be simple, easy to use, and
inexpensive to implement; 3. to permit instrumentation with a wide range of
capabilities to be connected simultaneously; 4. to allow direct connection
between devices so that no intermediary (central point through which all
messages had to be routed) was needed; 5. to require a minimum of restrictions
on the performance characteristics of any device connected; and 6. to create a
system that permitted asynchronous communications over a wide range of data
rates.
It was the mid-'70s before the official IEEE 488 standard was set. This
standard covered the functional description and electrical characteristics
of the bus, along with the connector and related mechanical data. These
standards were echoed in ANSI MC1.1, and, except for the connector proposed,
in the international IEC 625-1 and British B.S. 6146.
Then, in 1987, after more than 15 years of use, IEEE 488 was revised and
expanded. At that time, the original standard was renamed IEEE 488.1 and the
revision was given the name IEEE 488.2. The new standard does not change the
basic functionality described in the original but goes further, describing
basic sets of abilities necessary in a conforming device. It prescribes new
codes and data formats, protocols, and commands.
The extensions presented by the IEEE 488.2 are wide ranging, but meaningless
without a good understanding of the IEEE 488.1. In fact, unless a device
complies with IEEE 488.1, it cannot comply with the new standard. The focus
of this article is to provide an understanding of the basic functionality
described by 488.1 through description and example, from the viewpoints of
both the host PC and the microprocessor/embedded system. I'll deal mainly with
IEEE 488.1, alluding to 488.2 when its extensions or amendments bear on what
we are doing.


Overview


The IEEE 488.1 describes an 8-bit parallel, byte-serial interface that, under
proper circumstances, can communicate at speeds up to 1 Mbyte per second. It employs a
patented three-wire handshake and does not require that all devices involved
handle data at the same speed, but it guarantees that the flow of data is
accepted by all devices before continuing. In addition, it specifies five
command lines that are used singly and together to issue certain commands,
request service for individual devices, and distinguish data from commands.
Certain hardware and electrical characteristics, such as the connector, the
length of the bus and the number of contiguous devices are also prescribed.
In its simplest form, a GPIB bus may consist of only two devices: one that
transmits data and one that receives it. These two devices are known,
respectively, as a talker and a listener. An example of this sort of bus might
be an oscilloscope connected to a printer/plotter; the oscilloscope is a
permanent talker and the printer/plotter is a permanent listener. In such a
case, data might be printed upon a trigger or the press of a button.
Most of the time, there are more than two devices connected to the bus, and
more often than not, these devices will need to receive information as well as
transmit it. To accomplish this, we add one more element, a controller.
Now, the bus may be composed of as many as 15 devices, including the
controller, all with the ability to become talkers or listeners as the need
arises. There can, however, be only one controller at a time on the bus, and
no device may send or receive device-dependent data unless addressed to do so.
This arrangement provides each of the 15 devices on the bus with a My Listen
Address and a My Talk Address. To obtain a listen address from the primary
address, this address is ORed with 20h; to obtain a talk address, it is ORed
with 40h. In most implementations of the interface, this address manipulation
is done by the hardware, and the primary address of any one node is set
locally with a switch, or remotely with bus commands.
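
The address arithmetic is simple enough to sketch directly. The function names below are hypothetical, and the range check assumes primary addresses 0 through 30 are usable, since 31 would collide with the Unlisten and Untalk codes.

```python
def listen_address(primary):
    """My Listen Address: primary address ORed with 20h."""
    if not 0 <= primary <= 30:
        raise ValueError("primary address must be in the range 0-30")
    return primary | 0x20


def talk_address(primary):
    """My Talk Address: primary address ORed with 40h."""
    if not 0 <= primary <= 30:
        raise ValueError("primary address must be in the range 0-30")
    return primary | 0x40


# Print the ASCII form of each address for three sample devices.
for name, primary in (("controller", 0), ("printer/plotter", 1), ("scope", 7)):
    print(name, repr(chr(listen_address(primary))), repr(chr(talk_address(primary))))
```

For a device at primary address 7, this yields the characters ' (27h) for listening and G (47h) for talking, which is why bus addresses can be written as ordinary ASCII strings.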
On a small scale, this bus may be composed of a computer, acting as
controller, the oscilloscope as a talker and listener, and the printer/plotter
as a listener. Here, the oscilloscope will need to receive data from the
computer setting parameters for data collection. When it gets the data it is
meant to capture, it must become a talker and transfer that information to the
printer/plotter.
There are really three kinds of communication involved in any bus transaction.
One, which might be considered interface maintenance, comprises the state of
the control lines and the commands that tell the interface how to behave and
what to do. These commands are not device dependent; they are specified in
detail beginning with IEEE 488.1 and in a good deal more detail in IEEE 488.2.
Examples of these commands might be DCL (Device Clear), IFC (Interface Clear),
or LLO (Local Lockout).
The next type of communication that occurs is device dependent and consists of
the actual commands transferred over the bus to the instruments connected.
These data relate to information to or from the device, as well as the
commands designed by the manufacturer of the device to set parameters. These
data are in the language of the device and are not part of the interface.
Before IEEE 488.2, there were no guidelines for the type of data transmitted
over the bus, but with the new specification, some protocol has been
established. Data, for example, are to be transferred as 7-bit ASCII, because
binary data sometimes mimicked interface commands and caused communication
problems. There are set ways to end a message, so that there is no mistaking
the intention of the device. There are many more additions, far too many to go
into in the framework of this article.
The third sort of communication that exists on the bus is between the
interface hardware and the instrument itself. This communication is in the
form of logic signals that handle the hardware as well as the device-dependent
data that must pass through it. There are a number of chip manufacturers that
provide hardware for this purpose, among them NEC, TI, and National
Instruments, with its IEEE 488.2 compatible device. In my follow-up article,
"Implementing GPIB in an Embedded System," I'll describe the interface from
the viewpoint of an embedded system using the TMS9914A.


An Implementation


To illustrate some of the features of a system using the GPIB, I've chosen a
simple laboratory experiment in which a computer, an oscilloscope, and a
printer/plotter are used to capture data for study. The system would be
designed to set the oscilloscope for a series of trigger levels, and, as each
is met, transmit the information to the plotter for plotting.
We want the system to maintain itself: The program would run on the host, a
PC, which in turn would configure the oscilloscope and printer/plotter. The
oscilloscope would be set up to wait for a trigger from the subject of the
test, collect the data and inform the host that it has done so. The host would
then arrange for the oscilloscope to transmit the data to the printer/plotter.
As soon as the oscilloscope had done its job it could be told to wait for
another trigger, and the printer could be told to print. This loop could then
continue until other conditions were met.
For this example, our host is the PC, and as such it is the controller as
well, with the GPIB cables connecting the oscilloscope and printer in either a
daisy chain or star pattern. The manner isn't important because essentially
they are all parallel, each having its own unique address: The controller
address is 0, the scope address 7, and the printer/plotter address 1.
To simplify the illustration, I'll use the pseudocode commands OUT and IN in
place of language- or driver-specific commands to highlight the IEEE 488
command structure itself. Where possible, I will use the ASCII representations
of the Multiline Interface Messages needed. For a sample of code using C and
a driver, see Scope.c (Listing One, page 92), in which I use a library and
driver supplied by National Instruments to accomplish a task similar to the
one I will now describe.
When the system starts up, the controller asserts an active low on the
Interface Clear (IFC) command line; see Example 1(a). Upon receiving the IFC,
all devices on the bus cease Talking (transmitting) or Listening (receiving)
and return to an idle state. This command does not affect the state of the
instruments involved, only the interface. The controller now asserts the
Remote Enable (REN) command line. This will permit the host to program the
other instruments on the bus, as shown in Example 1(b). A Device Clear is
sent. This multiline interface message returns all receiving devices to a
default state defined by the manufacturer, which may be a full reset; see
Example 1(c).
Example 1: Implementing the IEEE 488 command structure

 (a) OUT "IFC" /*Interface Clear*/

 OUT "REN" /*Remote Enable*/

 (b) OUT "?_@'\0X14"; /*? = Unlisten (UNL), _ = Untalk (UNT), @ = My Talk
 Address of controller (MTA), ' = My Listen Address
 of scope (MLA), 0X14 = Device Clear*/

 (c) OUT "?_@'"; /*UNL, UNT, MTA of controller, MLA of scope*/
 OUT ":ACQUIRE:TYPE NORMAL";
 ; /*Now the device dependent configuration data*/


 ;
 OUT ":TRIGGER:SOURCE CHANNEL1";
 OUT ":SRQ 1"; /*Enable SRQ on trigger, bit 1 of SRQ mask
 register*/
 OUT "?_"; /*UNL, UNT*/

 (d) OUT "?_\X18G "; /*UNL, UNT, Serial Poll Enable (SPE), Talk
 Address (TAD) of scope, MLA of controller*/
 IN & BUFFER; /*Program puts Serial Poll Byte in a buffer*/
 OUT "\0X19"; /*Issue Serial Poll Disable (SPD)*/

 (e) OU' "?_@'"a /*UNL" UNT" MT; of controller" ML; of scope*/
 OUT "PRINT"; /*Tell scope to print*/
 OUT "?_@!G"; /*UNL, UNT, MTA of controller, ML; of
 printer/plotter, MTA of scope*/
 OU' "DATA"a /*Controller takes its own attention (ATN(c) line
 low to allow scope to talk to printer/plotter*/
 WAI' "END"a /*Controller watches command lines for end or
 identify (EOI), which means that the transfer is
 complete*/

The next step is to configure the instruments involved in the procedure.
First, the controller makes the oscilloscope a listener by sending its My
Listen Address. After the oscilloscope is made a Listener, the controller
becomes a Talker and sends data over the bus in the language (command set) of
the oscilloscope. For the purpose of this experiment, the commands sent set a
sweep rate, gain, and trigger level, and enable the instrument to issue an SRQ
when it is triggered.
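The single-character addresses in these commands follow from the primary address of each device: in the IEEE 488 command set, a listen address is the primary address plus 0x20 and a talk address is the primary address plus 0x40. The following is a minimal sketch of that arithmetic, not part of any driver; the function names are hypothetical and chosen only for illustration.

```c
/* Offsets of the listen and talk address groups in the IEEE 488
   multiline command set. */
#define GPIB_LISTEN_BASE 0x20   /* listen addresses start at ' ' */
#define GPIB_TALK_BASE   0x40   /* talk addresses start at '@'   */
#define GPIB_UNL         0x3F   /* '?' Unlisten                  */
#define GPIB_UNT         0x5F   /* '_' Untalk                    */

/* Hypothetical helper: listen-address byte for a primary address
   (0..30). Returns -1 if the address is out of range. */
int gpib_listen_byte(int addr)
{
    return (addr >= 0 && addr <= 30) ? GPIB_LISTEN_BASE + addr : -1;
}

/* Hypothetical helper: talk-address byte for a primary address
   (0..30). Returns -1 if the address is out of range. */
int gpib_talk_byte(int addr)
{
    return (addr >= 0 && addr <= 30) ? GPIB_TALK_BASE + addr : -1;
}

/* Hypothetical helper: build "UNL UNT <talker> <listener>" in buf,
   which must hold at least 5 bytes. Returns the number of command
   bytes written (4), or -1 if either address is invalid. */
int gpib_address_pair(char *buf, int talker, int listener)
{
    int t = gpib_talk_byte(talker);
    int l = gpib_listen_byte(listener);

    if (t < 0 || l < 0)
        return -1;
    buf[0] = GPIB_UNL;
    buf[1] = GPIB_UNT;
    buf[2] = (char)t;
    buf[3] = (char)l;
    buf[4] = '\0';          /* terminator, for convenience only */
    return 4;
}
```

With the addresses used here (controller 0, scope 7), gpib_address_pair(buf, 0, 7) produces the same four characters, ?_@', that precede the scope's configuration commands in Example 1.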
The oscilloscope now goes into a ready state, waiting for a trigger from the
subject of the test. Having configured the oscilloscope, the host again
becomes a controller and tells the oscilloscope to unlisten with the Unlisten
command (UNL). At this time, the host may go on to other duties, maintaining
other instruments or massaging data; see Example 1(d).
When the oscilloscope is finally triggered and the data captured, it asserts a
negative true on the bus's SRQ command line. The controller responds to this
signal by polling all the devices on the bus to find out which one issued the
request. This is done by issuing a Serial Poll Enable (SPE), which tells each
device on the bus that when it is made a talker, it should put the Serial Poll
Byte on the bus. The controller then issues the MTA for each device on the
bus, reads its Serial Poll Byte in and determines which device or devices
require service. When the serial poll is complete on all devices on the bus,
the controller issues a Serial Poll Disable (SPD). The Serial Poll Byte sent
by the oscilloscope indicates that it is requesting service and that it has
been triggered; see Example 1(e).
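The decision the controller makes during the poll can be sketched as follows. In the serial poll byte, bit 6 (0x40) is the request-service bit; the remaining bits are device dependent. This is only an illustration: the table sim_status and the function sim_serial_poll are hypothetical stand-ins for the real step of addressing a device to talk and reading one byte from it.

```c
#define GPIB_RQS 0x40   /* bit 6 of the serial poll byte: requested service */

/* Hypothetical simulated bus: one status byte per primary address. */
unsigned char sim_status[31];

/* Hypothetical stand-in for "address the device to talk, read one byte". */
unsigned char sim_serial_poll(int addr)
{
    return sim_status[addr];
}

/* Poll every address in addrs[0..n-1] and store in who[] the addresses
   whose status byte has the request-service bit set. Returns how many
   devices requested service. who[] must have room for n entries. */
int find_requesters(const int *addrs, int n,
                    unsigned char (*poll)(int addr), int *who)
{
    int i, count = 0;

    for (i = 0; i < n; i++) {
        if (poll(addrs[i]) & GPIB_RQS)
            who[count++] = addrs[i];
    }
    return count;
}
```

If the scope at address 7 reports 0x41 and the printer at address 1 reports 0x00, find_requesters returns 1 and who[0] is 7.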
The controller now configures the oscilloscope to transmit the data it has
collected. It makes the plotter a listener and the oscilloscope a talker with
its MLA and MTA commands, respectively, and goes off line lowering the ATN
command line. The oscilloscope will now transmit its data to the plotter. The
controller monitors the bus when it is off line, but the software may be
written so that the controller reasserts ATN when transmission between the
oscilloscope and printer/plotter is complete, and returns to being the
controller.
When the scope has completed this transfer, the controller can issue an IFC to
return both scope and printer to an idle state. The controller might then make
the printer a listener with the appropriate MLA command and tell the device to
print using the command structure of the particular device, followed by an UNL
command.
It may take time for the plotter/printer to print the data. What happens next
depends on the devices used and the programmer. If the printer has an
appropriately sized buffer, the oscilloscope may be freed up almost
immediately, whereupon it can be set to trigger again. If not, the next
trigger will have to wait until the printer/plotter is finished.
There are a number of ways to implement this procedure. Besides using the SRQ,
you might choose to poll the oscilloscope, waiting for a trigger before
turning it into a talker and the printer/plotter into a listener. Depending on
the device, it might be possible to set the parameters necessary for a trigger
and immediately make the oscilloscope a talker and the printer/plotter a
listener.


The Host PC


To complete the host side of the discussion, I'm providing a small program
(see Listing One) to acquire data using an HP oscilloscope and to print it
with an HP ThinkJet printer. The program is written in C using a library and
driver provided with the PCIIA GPIB card from National Instruments.
This code is an approximation of the example that we have been working with.
The PCIIA resides in an AT and performs the functions of controller, talker,
and listener on the bus. The oscilloscope is an HP 54502A and the
printer/plotter is an HP ThinkJet. The program is a short example, using real
hardware, of actual code written using the lower-level functions provided with
the National Instruments' card to emphasize the functionality and flexibility
of the interface itself.


Conclusion


The purpose of this article was to present the GPIB as an understandable and
useful interface. Once the protocol is understood, IEEE 488 is not difficult
to use, and its speed, commonality of interface, and range of data rates offer
many opportunities to the designers of instrumentation and systems. In my next
article, I'll examine the interface from the viewpoint of the embedded system.


Bibliography


Caristi, Anthony J. IEEE-488 General Purpose Instrumentation Bus Manual. San
Diego, Calif.: Academic Press, 1989.
Tutorial Description of the Hewlett-Packard Interface Bus. Hewlett-Packard,
1980.
HP 54502A 400MHz Digitizing Oscilloscope Programming Reference. First Edition.
Hewlett-Packard, 1989.
IEEE Standard Digital Interface for Programmable Instrumentation. New York,
N.Y.: The Institute of Electrical and Electronics Engineers, 1987.
IEEE Standard Codes, Formats, Protocols, and Common Commands For Use with
ANSI/IEEE Std 488.1-1987. New York, N.Y.: The Institute of Electrical and
Electronics Engineers, 1987.
NI-488 MS-DOS Software Reference Manual. Austin, Tex.: National Instruments
Corporation, 1990.


The Three-Wire Handshake


The three-wire handshake used by the designers of the bus was a means to
provide reliable communication across a wide range of devices, from the very
slow to the very fast. Because the bus transfer isn't considered complete
until each device listening acknowledges that it has received the data, no
device will miss any data. This also means that in any particular transfer,
the slowest device addressed controls the speed of transfer.

It is important to remember that these command lines are in parallel and
negative true, so that a line will remain true until all devices release it.
The following terminology is associated with this function:
NRFD (not ready for data). The devices on the bus use this line to indicate
readiness to accept a data transfer. If a device is not ready, for whatever
reason, it will pull the line low, inhibiting the talker from transmitting.
All devices must release the line before a transfer can occur.
DAV (data valid). The talker asserts this line when the data lines have
settled and are valid. This line will not go true until all devices have
released NRFD.
NDAC (not data accepted). Listeners pull this line low to indicate that the
data has not yet been accepted. When the slowest listener releases the line,
it will finally go high.
This is how the three-wire handshake works (see Figure 1):
T[-1] At the end of a successful transfer, each listener releases the NRFD
line as it becomes ready for the next byte. The source of communication
(current talker) checks the NRFD and NDAC command lines to see that there
are listeners on the bus. If both are high, there are no listeners. If all is
okay, the current talker places a byte on the data lines.
T[0] The source tells the listeners that the data is stable on the data lines
by pulling the Data Valid (DAV) line low.
T[1] The first listener to accept the data pulls the NRFD line low, inhibiting
any further transmissions.
T[2] The slowest device on the bus accepts the data and releases the Not Data
Accepted (NDAC) line, indicating that all listeners have received and accepted
the data.
T[3] The source allows the DAV line to go high, telling the listeners that the
data is no longer valid. Another data byte may now be placed on the bus.
T[4] The fastest device sets the NDAC line low, ready for the next byte.
T[5] The slowest listener finally releases NRFD, allowing the next transfer to
occur.
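
The wired-OR behavior behind this sequence can be modeled in a few lines. The sketch below is a simplified model, not bus-accurate timing: each listener either pulls a line low or has released it, and the line is low whenever any one device pulls it. The structure and function names are hypothetical.

```c
/* One listener's view of the two handshake lines it drives.
   1 = this device is pulling the line low, 0 = it has released it. */
struct listener {
    int pulling_nrfd;   /* not ready for data       */
    int pulling_ndac;   /* data not yet accepted    */
};

/* A negative-true, wired-OR line is low if ANY device pulls it. */
int nrfd_low(const struct listener *ls, int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (ls[i].pulling_nrfd)
            return 1;
    return 0;
}

int ndac_low(const struct listener *ls, int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (ls[i].pulling_ndac)
            return 1;
    return 0;
}

/* The talker may assert DAV only when every listener has released NRFD.
   NDAC must still be low; if both lines are high, nobody is listening. */
int talker_may_send(const struct listener *ls, int n)
{
    return !nrfd_low(ls, n) && ndac_low(ls, n);
}

/* The byte counts as accepted only when the slowest listener has
   released NDAC, so the line has finally gone high. */
int all_accepted(const struct listener *ls, int n)
{
    return !ndac_low(ls, n);
}
```

Because each function looks for any device still holding the line, one slow listener is enough to stall the talker, which is why the slowest addressed device sets the transfer rate.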


_UNDERSTANDING THE GPIB_
by Don Morgan


[LISTING ONE]

/* Scope.c */
#include <stdio.h>
#include <conio.h> /* getch(); compiler-specific header */
#include "decl.h" /* a header file containing declares pertinent to the
 National Instruments library*/

/* Application program variables passed to GPIB functions */
char rd[512]; /* read data buffer*/
int bd; /* board number, in this case that representing controller */

main()
{

/* Assign unique identifier to board 0 and store in variable bd. Check for
error.*/
if ((bd = ibfind ("GPIB0")) < 0) error();
printf("\ninitializing controller");
getch();

/* Send the Interface Clear (IFC). */
if (ibsic (bd) & ERR) error();

/* Turn on Remote Enable (REN) signal so instrument can be programmed. */
if (ibsre (bd,1) & ERR) error();

/* Put scope in remote mode by addressing it to listen, send Device Clear
(DCL) message to clear internal device functions, and address GPIB board to
talk. */
ibcmd (bd,"?_@'\x14",5); /* UNL UNT MTA MLA DCL */
if (ibsta & ERR) error();

/* Write the function, range, and trigger source instructions to scope. */
ibwrt (bd,":bnc probe",10);
if (ibsta & ERR) error();

ibwrt (bd,":acquire:type normal",20);
if (ibsta & ERR) error();

ibwrt (bd,":timebase:range 5e-4",20);

if (ibsta & ERR) error();

ibwrt (bd,":timebase:delay 0",17);
if (ibsta & ERR) error();

ibwrt (bd,":timebase:reference center",26);
if (ibsta & ERR) error();

ibwrt (bd,":timebase:mode triggered",24);
if (ibsta & ERR) error();

ibwrt (bd,":channel1:probe 10",18);
if (ibsta & ERR) error();

ibwrt (bd,":channel:range 1.2",18);
if (ibsta & ERR) error();

ibwrt (bd,":trigger:mode edge",18);
if (ibsta & ERR) error();

ibwrt (bd,":trigger:slope positive",23);
if (ibsta & ERR) error();

ibwrt (bd,":trigger:level 300mv",20);
if (ibsta & ERR) error();

ibwrt (bd,":trigger:source channel1",24 );
if (ibsta & ERR) error();

/* Scope is now ready to go. Set up SRQ by first clearing internal
data structures. */
ibwrt (bd,"*cls",4);
if (ibsta & ERR) error();

/* Clear the trigger by issuing the command to return the bit. */
ibwrt (bd,":ter?",5);
if (ibsta & ERR) error();

/* Then make the scope a talker. */
ibcmd (bd,"?_ G",4);

/* And become a listener. Read three bytes or stop when an EOI is received. */
ibrd(bd,rd,3);
if (ibsta & ERR) error();

/* When the data is in buffer issue an untalk and unlisten, then make the
controller a talker again and scope a listener. */
ibcmd (bd,"?_@'",4);

/* Enable the SRQ bit within scope that will cause an RQS on next trigger. */
ibwrt (bd,"*sre 1",6);
 if (ibsta & ERR) error();

/* Now wait for the trigger. */
if (ibwait (bd,SRQI) & (ERR)) error();

/* If we are here, scope must have been triggered and must have asserted SRQ
command line. Do a serial poll. First unaddress bus devices and send
Serial Poll Enable (SPE) command, followed by scope's talk address, and GPIB
board's listen address. */
ibcmd (bd,"?_\x18G ",5); /*UNL UNT SPE TAD MLA*/
if (ibsta & ERR) error();

/* Now read status byte. If it is 0x41, the scope has valid data to send;
otherwise it has a fault condition to report. */
ibrd (bd,rd,1);
if (ibsta & ERR) error();
if ((rd[0] & 0xFF) != 0x41) error();

/* Note that if more than one device is attached to bus, each device must be
checked to see which issued the SRQ. */

/* Complete serial poll by sending the Serial Poll Disable (SPD) message. */
if (ibcmd (bd,"\x19",1) & ERR) error();

/*Send scope untalk and unlisten; make controller talker and scope listener*/
ibcmd (bd,"?_@'",4);
if (ibsta & ERR) error();

/*Tell the scope to print the screen and associated data*/
ibwrt(bd,":hardcopy:page automatic",24);
ibwrt(bd,":print?",7);

/*Make scope a talker and printer a listener; have controller get out of
the way for the transfer*/
ibcmd(bd,"?_@!G",5);
ibgts(bd,1);

/*Wait for the transfer to complete and reassert control over the bus*/
ibwait(bd,END);

/*The program terminates with an interface clear*/
ibsic (bd);
}

/* Simple error routine that reads system variables and returns them. */
error()
 {
 printf("\nGPIB function call error:");
 printf("\nibsta=0x%x, iberr=0x%x,",ibsta,iberr);
 printf("\nibcnt=0x%x\n",ibcnt);
 }



April, 1991
COOPERATIVE MULTITASKING IN C++


When resources are scarce, this is the way to go




Marc Tarpenning


Marc is a software engineer with Digital Alchemy Incorporated, a software
consulting firm. He specializes in real-time applications and can be reached
at P.O. Box 254801, Sacramento, CA 95865 or by e-mail at
met@sactoh0.SAC.CA.US.


Embedded systems programming presents some interesting challenges. A typical
application might involve juggling a variety of sampling, processing, control,
and communication functions simultaneously on a very small system. Such
applications are inherently parallel and are best implemented by many small
tasks working together. I've recently switched from developing these
applications in a cooperative multitasking Forth environment to
object-oriented programming in C++. Unfortunately, operating system support is
required for normal preemptive multitasking, context switches can be slow, and
resource sharing can be complex. Not wanting that kind of overhead in an
embedded application, my solution was to create cooperative multitasking
objects for C++.


Cooperative vs. Preemptive Multitasking


In preemptive multitasking, an interrupt timer periodically executes an
executive program. The executive determines if the current task has been
running too long, and if so forces a context switch to the next scheduled
task. If the task still has time, the executive updates a counter and returns
from the interrupt. Because an interrupt can occur and force a context switch at any
point in the program, the system must save and restore the processor's entire
state. Resource sharing between tasks requires routines to lock and unlock
common data structures. Further, operating system calls must be reentrant
(which MS-DOS is not).
Cooperative multitasking, on the other hand, has no executive overhead and
does not rely on interrupts. Instead, when a task is ready to give up control,
it calls a routine (pause), which switches context to the next task.
Because the point of the switch is always the same (the pause routine), only
the stack pointer and a few registers are saved and restored. This makes the
context switch simple and potentially very fast. Cooperative multitasking
greatly simplifies resource sharing since other tasks can't "sneak in" during
updating of common structures. System calls can be performed in any operating
system because a context switch cannot occur unexpectedly during the call.
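
The cooperative idea can be illustrated without any stack switching at all. The sketch below is a deliberately simpler technique than the task objects described in this article: each task is a step function that does a little work and returns, and a loop gives every live task one turn per pass. All names are hypothetical, and the table size is an arbitrary assumption.

```c
#define MAX_TASKS 8             /* arbitrary limit for this sketch */

/* A step function does a small amount of work and returns nonzero
   while it still has more to do, or 0 when the task is finished. */
typedef int (*step_fn)(void *state);

struct coop_task {
    step_fn step;
    void   *state;
    int     alive;
};

struct coop_task tasks[MAX_TASKS];
int task_count = 0;

/* Register a task. Returns its index, or -1 if the table is full. */
int coop_add(step_fn step, void *state)
{
    if (task_count >= MAX_TASKS)
        return -1;
    tasks[task_count].step  = step;
    tasks[task_count].state = state;
    tasks[task_count].alive = 1;
    return task_count++;
}

/* One pass of the round robin: every live task gets exactly one turn.
   Returns the number of tasks still alive after the pass. */
int coop_run_once(void)
{
    int i, live = 0;

    for (i = 0; i < task_count; i++) {
        if (!tasks[i].alive)
            continue;
        if (tasks[i].step(tasks[i].state) == 0)
            tasks[i].alive = 0;     /* task finished; skip it from now on */
        else
            live++;
    }
    return live;
}

/* Example task: decrement the int it was given; finished at zero. */
int countdown_step(void *state)
{
    int *n = (int *)state;

    if (*n > 0)
        (*n)--;
    return *n > 0;
}
```

Returning from the step function plays the role of pause here. The price is that each task must keep its own state between turns, which is exactly the bookkeeping that a per-task stack removes.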


The Code


The cooperative multitasker I'm presenting here is implemented using a C++
task object which is part of a multiple channel communication program
currently under development at our company. The task object and associated
header file are in Listings One and Two (page 96). I've also included a demo
program (Listing Three, page 96) to show the multitasker in use. The demo
program requires the simple text window object and header file which make up
Listings Four and Five (page 99). The interface into the multitasker consists
of the routines InitTasking, fork, and pause. InitTasking sets up the
multitasking system and must be called before any other calls are made. fork
spawns new tasks and pause performs the context switch to the next task. The
task object itself has an Activate method to change what a task object is
executing and a Show method to display the current instance variables for
debugging purposes. Although I wrote the code in Borland's Turbo C++, it
should be fairly portable to other environments, with the following caveats:
The demo program uses several screen functions unique to Borland's standard
library, and the task objects get direct access to the CPU registers via
Borland's "pseudovariables." You can easily replace Borland's pseudovariable
references with standard assembly code if required. Any type of processor
could be used, but the saving of registers would of course have to be changed.
Though these objects are designed for the small memory model, only a slight
modification is required for larger models.


How to Context Switch


The C calling convention, also used in C++, performs a procedure call by
pushing all of the passed parameters onto the stack, followed by the return
address. When the procedure call finishes, it returns to the pushed address,
leaving the calling code to remove the passed parameters from the stack. The
registers BP, SI, and DI and the stack pointer (SP) maintain the entire state
of a Turbo C++ program upon entering a function. Therefore, to save the state
of a task we need only push these three registers and store the stack pointer.
The next task starts by getting the saved stack pointer from the next task
object, popping the registers, and returning to the address where the new task
left off. The new task now executes and the caller straightens out the stack.
An example execution flow for a two task system can be seen in Figure 1. Note
that in Task 1 the pause within the while loop executes twice. Each execution
takes the program flow to a different place within Task 2, depending on where
Task 2 gave up control the previous time.


Task Objects


The fork routine in Listing Two creates a new task object and initializes the
object to execute a passed function. Each object contains its own stack area,
the saved stack pointer, an ID number for debugging, and links to other task
objects. The number of task objects is limited only by available memory. The
task object constructor method links the new object into a circular list of
other task objects (Figure 2). The size of the private stack area is passed to
the constructor. The default stack size in this program is somewhat large due
to library functions such as printf, which use considerable stack space. A
stack size of zero causes the constructor to assign the system stack to the
new task object. Only one task object can own the system stack. The program
creates this task during the InitTasking function called at the beginning of
the program.
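The linking itself is ordinary circular, doubly linked list handling. As a minimal sketch in plain C (the names node, ring_init, ring_insert_after, and ring_remove are hypothetical and do not appear in the listings), insertion after the current element and removal look like this:

```c
struct node {
    int          id;
    struct node *next;
    struct node *prev;
};

/* A ring with a single element points at itself in both directions. */
void ring_init(struct node *first)
{
    first->next = first;
    first->prev = first;
}

/* Link n into the ring immediately after cur. */
void ring_insert_after(struct node *cur, struct node *n)
{
    n->next = cur->next;
    n->prev = cur;
    n->prev->next = n;      /* current element now points forward to n  */
    n->next->prev = n;      /* old successor now points back to n       */
}

/* Unlink n from the ring and return the element that followed it. */
struct node *ring_remove(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    return n->next;
}
```

Because every element has both pointers, removal needs no search through the ring, which is what makes deleting a finished task cheap.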
In small memory models malloc (which allocates the stack area) uses the
current stack pointer to determine if any heap space is available. In order
for child processes to spawn additional tasks, fork "borrows" the system stack
before the new operation. Any tasks can then spawn additional tasks as needed
by using fork. The Activate method initializes the task's stack to restore
dummy values to SI, DI, BP, and initializes the return address to execute the
function passed to fork. Activate also places the address of the routine
terminate on the stack. If the passed function ever finishes and returns,
execution passes to terminate, which deletes the task object and performs a
context switch to the next task. Each task object has a forward and backward
pointer, so the task destructor method can easily take the task object out of
the linked list.
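
The layout of a freshly activated stack is easier to see in a simulation. The sketch below uses an ordinary int array in place of a real stack and made-up numbers in place of real addresses; frame_init and frame_resume are hypothetical names. It only mirrors the order of the pushes and pops described above and performs no actual context switch.

```c
#define FRAME_WORDS 16          /* arbitrary size for this sketch */

struct frame {
    int  stack[FRAME_WORDS];
    int *sp;                    /* simulated stack pointer */
};

/* Lay out the initial stack the way the text describes: the address
   of the terminate routine deepest, then the task function's address,
   then dummy values for BP, SI, and DI. The stack grows downward. */
void frame_init(struct frame *f, int fn_addr, int term_addr,
                int bp, int si, int di)
{
    f->sp = f->stack + FRAME_WORDS;
    *(--f->sp) = term_addr;     /* reached if the task function returns */
    *(--f->sp) = fn_addr;       /* "return address" for the first switch */
    *(--f->sp) = bp;
    *(--f->sp) = si;
    *(--f->sp) = di;
}

/* Mimic the tail of the switch routine: pop DI, SI, BP, then take the
   next word as the return address. Returns that address. */
int frame_resume(struct frame *f, int *bp, int *si, int *di)
{
    *di = *(f->sp++);
    *si = *(f->sp++);
    *bp = *(f->sp++);
    return *(f->sp++);
}
```

After frame_resume, the word left on top of the simulated stack is the terminate address, so a task function that executes an ordinary return lands in the cleanup routine.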


Switching


The currently running task executes pause when the task is ready to give up
control. This routine sends a Switch message to the currently running task
object. Switch pushes the three registers, saves the stack pointer, and moves
to the next task in the linked list. The Switch method uses a slight trick to
save the registers. For any object method, the compiler automatically
generates a BP PUSH at the start of the routine and the complementary BP POP
before the return instruction.
The compiler normally uses SI and DI for register variables and generates
PUSHs only if register variables are used in the method. By assigning SI to DI
at the beginning of the routine, we force the compiler to save SI and DI by
generating SI PUSH, DI PUSH, and the corresponding POPs at the end of the
routine. After the PUSHs, all the required registers are on the stack, so the
stack pointer is stored in the object. The function retrieves the stack
pointer of the next task in the chain and, as the routine exits, pops the new
task's registers from the stack and returns to the new task's previous
execution address. The assembly code generated by Switch is in Figure 3. This
technique of forcing the compiler to save important registers also works in
many other compilers. Although the routine could be optimized in assembly, it
was easier to keep the code high level and not resort to any direct assembly
language programming.
Figure 3: Task::Switch() partial assembly code

 Task::Switch()
 push bp ;save stack frame
 mov bp, sp
 push si ;save register SI
 push di ;save register DI
 mov si, di ;dummy instruction

 ...
 _SI = _DI; //dummy instruction to force SI, DI save
 saved = (int*)_SP; //store stack pointer
 CurrentTask = next; //set task to next task
 _SP = (int) CurrentTask->saved; //use new task's stack pointer ...
 pop di ;restore DI from new task
 pop si ;restore SI from new task
 pop bp ;restore stack frame from new task
 ret ;return to new task's execution address



The Demo Program


The demo program creates six tasks that each run in their own text window (the
program is compatible with any standard IBM display). The program creates five
of the tasks by executing fork; the sixth is the "main" task, which uses the
system stack. Additional temporary tasks can be created by pressing the M key
when the demo is running. Even on an 8088, the program can run a large number
of tasks before slowing down appreciably. The important thing to remember when
programming with cooperative multitasking is to place a pause whenever a
function is likely to use up a great deal of time. For example, the keyboard
input routine has been coded to constantly cycle through pause while waiting
for a keystroke, and delays are achieved by watching the BIOS timer and
executing pause until the correct number of ticks has passed. In embedded
applications, I/O routines execute pause between samples or while waiting for
data availability. Communications routines execute pause until the interrupt
code has constructed an incoming buffer or completed an outgoing transmission.
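
The delay pattern can be sketched in portable C by passing in the clock and the yield routine as function pointers. This is an illustration under assumptions, not the demo's code: coop_delay, ms_to_ticks, and the fake clock below are hypothetical, and in the demo the clock would be the BIOS tick counter and the yield routine would be pause.

```c
typedef long (*tick_fn)(void);   /* returns the current tick count */
typedef void (*yield_fn)(void);  /* gives other tasks a turn       */

/* Convert milliseconds to timer ticks, assuming roughly 55 ms per tick. */
long ms_to_ticks(int ms)
{
    return ms / 55;
}

/* Yield repeatedly until at least `ticks` ticks have elapsed.
   Returns the number of times the task gave up control. */
long coop_delay(long ticks, tick_fn now, yield_fn yield)
{
    long start  = now();
    long yields = 0;

    while (now() - start < ticks) {
        yield();                 /* let other tasks run while waiting */
        yields++;
    }
    return yields;
}

/* Hypothetical fake clock for trying the routine without hardware:
   every yield advances time by one tick. */
long fake_time = 0;

long fake_now(void)
{
    return fake_time;
}

void fake_yield(void)
{
    fake_time++;
}
```

The waiting task never blocks the system: each pass through the loop costs one clock read and one context switch, and the other tasks run in between.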


Conclusion


Cooperative multitasking provides some of the benefits of more powerful
multitasking operating systems without many of the complexities. For embedded
applications, where resources are often scarce, cooperative multitasking's
lack of an executive and the associated overhead is ideal. In small
environments even the dynamic memory allocation can be eliminated to produce
tight and fast code. Further, until a multitasking operating system comes into
widespread use, cooperative multitasking allows those of us addicted to
multiple task programs to write effective applications while we wait eagerly
for the demise of MS-DOS.

_COOPERATIVE MULTITASKING IN C++_
by Marc Tarpenning


[LISTING ONE]

// File: TASK.H
// Task object header -- Each task object contains its own stack, links to
// next and previous task objects in the round robin, the saved stack pointer,
// and debug information.

class Task {
 int *area; // base stack location
 int id; // task id number for debugging purposes
 Task *next; // next task in chain
 Task *prev; // prev task in chain
 int *saved; // saved stack location during pause
 int *top; // top stack location

public:
 void Activate(void (*)() ); // starting function
 int GetId() // return id number
 { return id; }
 Task *GetNext() // return next pointer
 { return next; }
 int *GetSP() // return saved sp
 { return saved; }
 void SetNext(Task *t) // sets next pointer
 { next = t; }
 void SetPrev(Task *t) // sets prev pointer
 { prev = t; }
 void Show(); // display data for debugging
 void Switch(); // context switch to next task
 Task(int); // constructor
 ~Task(); // destructor
};


Task *fork(void (*)() ); // forks tasks
Task *InitTasking(); // Initializes all task stuff
void pause(); // switches to next task

extern int totalTasks; // debug counter of total tasks







[LISTING TWO]

// File: TASK.CPP
// Cooperative Multi-tasking in C++ --
// Marc Tarpenning, P.O. 254801, Sacramento, CA 95865
// Cis: 71435,1753 uucp: met@sactoh0.SAC.CA.US

#include <conio.h>
#include <stdlib.h>
#include "task.h"

// Prototypes
void terminate(); // terminate task on completion

// Defines
#define STACKSIZE 0x200 // Stack size of each task

// Global variables
Task *CurrentTask = 0; // current task executing
Task *SystemTask; // main task using system stack
int totalTasks = 0; // debug counter of tasks

// Task.Activate - sets up the task object's stack so that when the task is
// switched to, the passed function is performed.
void Task::Activate(void (*f)() )
{
 saved = top; // reset stack
 *(--saved) = (int) &terminate; // kill task function on exit
 *(--saved) = (int) f; // place function for return address
 *(--saved) = _BP; // save all registers for switch
 *(--saved) = _SI; // save SI
 *(--saved) = _DI; // save DI
}

// Task.Show - Debug information is displayed
void Task::Show()
{
 cprintf("Task: %4i area: %04X\n\r",id,area);
 cprintf(" top: %04X saved: %04X\n\r",top,saved);
 cprintf("prev: %04X next: %04X",prev,next);
}

// Task.Switch - switch context to next task object in the round robin.
// It saves the current stack pointer, gets the stack pointer of the
// next task (after making it current), and returns.
void Task::Switch()

{
 _SI = _DI; // force compiler to save SI and DI
 saved = (int *) _SP; // store stack pointer
 CurrentTask = next; // set current task to next task
 _SP = (int) CurrentTask->saved; // restore new task's stack pointer
}

// Task.Task - Initializes the new task object. Threads the object into
// the linked round robin of other tasks. If size is 0, then does not
// allocate any stack space and uses the system stack instead (system).
Task::Task(int size)
{
 static int newid = 0; // unique identifier for each task
 id = newid++; // set ID and inc
 totalTasks++; // inc debug counter of total tasks
 if (size) { // Want to create operator task?
 if ((area = (int *) malloc(size * sizeof(int))) == 0) // No, so allocate
 {
 cprintf("Not enough memory to create task %i\n", id);
 exit(1);
 }
 top = area + size; // set absolute top of stack
 saved = top; // default saved stack to top

 next = CurrentTask->GetNext(); // link task in chain
 prev = CurrentTask;
 prev->SetNext(this); // set current task to point to me
 next->SetPrev(this); // set next task to point to me
 } else { // operator task, so don't allocate stack
 area = 0; // no private stack, so the destructor must not free it
 top = (int *) _SP; // instead, co-opt system stack
 saved = top;
 next = this; // since first task, make point
 prev = this; // to myself
 }
}

// Task destructor - return all allocated memory to the system.
Task::~Task()
{
 totalTasks--; // dec debug counter of total tasks
 prev->SetNext(next); // unthread this task from the round robin.
 next->SetPrev(prev);

 CurrentTask = next; // Set new current task
 if (area) // don't free if no stack allocated (system)
 free(area); // free object's stack
}

// fork - creates a new task to execute the passed function. When
// the function has completed, the task is automatically destroyed.
// fork returns a pointer to the new task object or NULL if out of memory.
Task *fork(void (*f)() )
{
 Task *newtask; // pointer to new task object
 // In small memory models, malloc uses the stack pointer to
 // determine if there is any free memory on the heap.
 // To allow forking from sub-tasks, we "borrow" the system stack
 // for the malloc operation.
 int temp = _SP; // save current stack pointer

 _SP = (int) SystemTask->GetSP() - 20; // borrow system stack
 // create new task object
 if ( (newtask = (Task *) new Task (STACKSIZE)) )
 newtask->Activate(f); // Setup new stack to execute function
 _SP = temp; // restore original stack
 return newtask; // return a pointer to the new object
}

// InitTasking - Initializes anything required before multitasking can
// begin. This function must be called before any other tasks are
// forked. It creates the "system" task by coopting the system
// stack into a task object (task # 0). It also sets CurrentTask
// to point to the new operator task.
Task *InitTasking()
{
 CurrentTask = (Task *) new Task(0); // create system task
 SystemTask = CurrentTask; // set system task pointer
 return SystemTask; // return with pointer to system task
}

// pause - non-object interface to switch context.
void pause()
{
 CurrentTask->Switch(); // context switch out of current task
}

// terminate - kills the current task when the fork is over. This is not
// a method, but its address is setup on the initial stack so if the
// task's function ever returns, terminate will be the next execution addr.
void terminate()
{
 _DI = _SI; // force compiler to save DI and SI
 delete CurrentTask; // kill the current task
 _SP = (int) CurrentTask->GetSP(); // set to next task's stack and
 // return into the new task
}






[LISTING THREE]

// File: DEMO.CPP
// Demo program for cooperative multitasking objects
// Marc Tarpenning, P.O. 254801, Sacramento, CA 95865
// Cis: 71435,1753 uucp: met@sactoh0.SAC.CA.US

// General includes for prototype headers
#include <iostream.h>
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <conio.h>
#include <string.h>
#include <bios.h>
#include <ctype.h>
#include <dos.h>


#include "task.h"
#include "twindow.h"

// Prototypes for simple demo functions
void endlessCount();
void fiveseconds();
void funwindow();
void msdelay(int);
int newgetch();
void periodic();
void quicky();
void status();
void wallclock();

main()
{
 /* Init multi-tasker. Creates system parent task */
 InitTasking(); // init task, coopt system stack

 /* ---- Init screen ----- */
 textattr(WHITE); // set "normal" white on black
 clrscr(); // clear screen
 _setcursortype(_NOCURSOR); // kill cursor

 /* ----- start up some tasks ----- */
 fork(&endlessCount); // spawn endless counter task
 fork(&wallclock); // spawn clock task
 fork(&periodic); // spawn periodic launcher task
 fork(&funwindow); // spawn strange window
 fork(&status); // spawn total number of tasks

 /* ----- create main window for user commands ---- */
 TextWindow myWindow(1,20,80,25,(LIGHTGRAY<<4)+BLUE);
 gotoxy(20,1);
 cputs("*** Cooperative Multitasking Demo ***\r\n");
 cputs(" Each one of the windows is a separate task object ");
 cputs("executing a C++\r\n");
 cputs(" function. All are running 'concurrently' using the ");
 cputs("pause() context\r\n");
 cputs(" switch routine.");
 gotoxy(2,6);
 cputs("Commands: [M]ake new task, [Q]uit");

 /* ----- wait for input & process key strokes ------ */
 for (;;)
 switch ( toupper( newgetch() ) ) {
 case 'Q': // quit - clean up screen and leave
 window(1,1,80,24);
 textattr(WHITE);
 clrscr();
 _setcursortype(_NORMALCURSOR);
 return(0);
 case 'M': // make - fork a new quick task
 fork(&quicky);
 break;
 default: // illegal character
 sound(500);
 msdelay(160);
 nosound();
 break;
 }
}

// endlessCount - opens a window and counts up forever.
void endlessCount()
{
 TextWindow myWindow( 40,7,64,8,(CYAN<<4)+RED );
 cprintf(" This task never ends!");
 long count = 0;
 for(;;) { // just keep counting, but
 myWindow.Activate(); // don't forget to pause
 gotoxy(1,2);
 cprintf(" Current count: %li",count++);
 pause(); // let other tasks run
 }
}

// fiveseconds - opens a window, counts for 5 seconds, and returns
void fiveseconds()
{
 TextWindow myWindow( 5,5,35,7,(GREEN<<4)+RED ); // make text window
 cprintf(" This is a temporary task");
 gotoxy(2,3);
 cprintf("which only lasts five seconds");

 time_t t; // get current time
 t = time(NULL);

 int i = 10000; // count down from 10000
 while (difftime(time(NULL),t) < 5) { // keep counting down until
 myWindow.Activate(); // difftime is five seconds
 gotoxy(13,2); // or more.
 cprintf("%5i",i--);
 pause(); // let other tasks run
 }
}

// funwindow - displays moving character in window
void funwindow()
{
 TextWindow myWindow(65,10,78,10, (BROWN<<4) + YELLOW);

 for(int i=0;;i = (i + 1) % 20) { // forever move i from 0 to 19
 myWindow.Activate();
 gotoxy( abs( ((i/10) * -20) + i) + 1 ,1); // calc cursor
 cputs(" * "); // so range is 1..10 then 10..1
 msdelay(100); // delay ~ 100 ms
 }
}

// msdelay - delays the number of milliseconds with ~ 55ms resolution
void msdelay(int delay)
{
 long ticksplus = biostime(0,0L) + delay / 55;
 while (biostime(0,0L) < ticksplus) // wait until time has passed
 pause(); // let other tasks run
}


// newgetch - does same as getch, except calls pause while waiting
// for a keyboard hit.
int newgetch()
{
 while (!kbhit())
 pause();
 return getch();
}

// periodic - occasionally launches another task
void periodic()
{
 TextWindow myWindow(1,10,41,11,(LIGHTGRAY<<4) + MAGENTA);
 cputs(" Every ten seconds launch temporary task");
 for (;;) {
 for (int i=0; i < 10; i++) { // loop ten times before forking
 myWindow.Activate();
 gotoxy(20,2);
 cprintf("%i",i); // display current count
 msdelay(1000); // delay ~ one second
 }
 fork(&fiveseconds); // spawn new task which dies in 5 sec
 }
}

// quicky - opens window, hangs around for a few seconds, and leaves
void quicky()
{
 static int xpos = 0; // x position of new task window
 static int ypos = 0; // base y of new task window
 TextWindow myWindow( xpos+1,ypos+12,xpos+16,ypos+12,(GREEN<<4)+BROWN);
 xpos = (xpos+3) % 64; // inc x position of "step" windows
 ypos = (ypos + 1) % 7; // inc y but keep within 7 lines

 for (int i=0; i < 10; i++) { // count down for ten seconds
 myWindow.Activate();
 cprintf(" Dead in ten: %i",i);
 msdelay(1000); // delay ~ one second
 }
}

// status - displays the number of tasks running
void status()
{
 TextWindow myWindow(1,1,18,1, (CYAN<<4) + MAGENTA);
 for (;;) {
 myWindow.Activate();
 cprintf(" Total tasks: %2i", totalTasks ); // display total
 msdelay(200); // delay ~ 200 ms
 }
}

// wallclock - continuously displays the current time
void wallclock()
{
 TextWindow myWindow( 55,1,80,1, (LIGHTGRAY << 4) + BLUE);
 time_t t; // will hold the current time
 char buf[40]; // temp buffer so can kill the \n

 for (;;) { // always keep updating the time
 myWindow.Activate();

 t = time(NULL); // get the current time string address
 strcpy(buf,ctime(&t)); // copy the string into temp
 buf[24]='\0'; // kill the \n so window won't scroll
 cprintf(" %s",buf); // display it

 msdelay(1000); // wait for ~ one second
 }
}






[LISTING FOUR]

// File: TWINDOW.H -- Demo text window objects -- These window objects create
// a primitive text window for demo program.
// Assume Borland C++ library functions

class TextWindow {
 int attrib; // text mode attribute
 int left,top; // starting x and y position
 int right,bottom; // ending x and y position

public:
 void Activate(); // make active
 TextWindow(int,int,int,int,int); // constructor
 ~TextWindow(); // destructor
};





[LISTING FIVE]

// File: TWINDOW.CPP -- Demo text window objects
// Marc Tarpenning, P.O. 254801, Sacramento, CA 95865
// Cis: 71435,1753 uucp: met@sactoh0.SAC.CA.US

#include "twindow.h"
#include <stdio.h>
#include <conio.h>

// Window activation - makes this window the active window.
void TextWindow::Activate()
{
 window(left,top,right,bottom); // Use C++ library
 textattr(attrib); // to make window active
}

// Window constructor - store coordinates and clear window
TextWindow::TextWindow(int l,int t,int r,int b,int a)
{
 left = l; // set up all instance variables
 top = t;
 right = r;
 bottom = b;
 attrib = a;
 Activate(); // activate window
 clrscr(); // clear window
}

// Window destructor - clears window with black background
TextWindow::~TextWindow()
{
 Activate();
 textattr( (BLACK << 4) + WHITE);
 clrscr();
}















































April, 1991
EXAMINING THE MICROSOFT MAIL SDK


Mail APIs can hide the complexity of network programming




Bruce D. Schatzman


Bruce has worked in the computer industry for more than ten years, holding a
variety of positions at corporations including General Dynamics, Tektronix,
and Xerox. He is currently an independent consultant and can be reached at
P.O. Box 5703, Bellevue, WA 98006.


Whether you are implementing a terminal emulator, a client-server application,
or a peer-to-peer file transfer mechanism, serious communications programming
can mean hundreds of hours of low-level coding and debugging. Many
programmers, in fact, feel that the only "right" way to develop communications
software is at a low level, controlling the environment directly through
transport or session-layer APIs. A limited selection of general-purpose
network APIs such as Apple's Comm Toolbox (see "The Macintosh Communications
Toolbox," DDJ, December 1990) does exist to make this job easier, but many
of the details are left to the programmer. However, there are ways to
implement network solutions that require no previous network development
experience and considerably less work. One such solution is the mail API
included in the Microsoft Mail SDK for the Macintosh. This may sound strange
at first, but you can use mail APIs to build an almost unlimited number of
peer-to-peer and client-server network applications that have little or
nothing to do with mail.
An advantage of using a mail API is that it hides virtually all of the
lower-level routines and data structures usually associated with network
programming. A good mail API is typically simple and designed for developers
who have little experience with network programming. In addition, a transport
system consisting of thousands of networks and millions of users is often
already in place for your application to use. Also available are third-party
mail servers, gateways, bridges, and routers that allow your application to
send information around the globe to virtually any PC, Mac, VAX, Unix
workstation, IBM mainframe, or fax device.


When to Use a Mail API


If you want to move bits on a cable and maintain a high degree of control over
network addressing, performance, routing, or data formatting, you need a
general-purpose communications API. You can use a mail API if you don't need
this degree of control and if your application deals with files or messages
(rather than bytes or packets) as the basic unit of transmission.
When choosing a mail API, you must be willing to accept the architecture,
security, fault tolerance, and transport mechanisms that are offered through
the mail product. Any application that you build with Microsoft Mail, for
example, moves data from point A to point B via MS Mail's store-and-forward
scheme. With this in mind, you clearly could not use a mail API to implement
"real-time" applications such as a terminal emulation program, but you could
certainly use it to develop an automated document management system for
workgroups.


Overview of the MS Mail C API (MAPILib.h)


The best way to introduce the MS Mail API is through its most basic
service--what Microsoft calls "standard integration with MS Mail." This is a
relatively simple job and involves putting appropriate function calls in your
application that enable users to retrieve and send mail directly from your
application using the MS Mail desk accessory (DA) as the user interface.
When standard integration is implemented, two commands--Open Mail and Send
Mail--are dynamically added or removed from an application's File menu each
time the user logs into or out of MS Mail through the MS Mail DA. In this way,
a user simply chooses the Send Mail command from the File menu and enters the
name(s) of the addressee(s) in the default mail form to send the currently
open file (spreadsheet, drawing, database table, document, and so on) to
another user's mailbox on the server.
Implementing standard integration with MS Mail is straightforward, usually
requiring only a couple of days for coding and debugging. Listing One, page
100, presents a code extract (generously provided by Darryl Lovato of Aladdin
Systems), from Stuffit, Aladdin's widely used file compression utility.
Beginning with SendMail() in Listing One, a call is made to
msmSessionEstablished(), which returns zero if the workstation is already
logged onto a mail account, or a nonzero error code that is sent to MSMError()
for handling. Note that MSMError() is not a MAPILib function. If
msmSessionEstablished() does not return an error, it is assumed everything is
OK and a message is created using msmCreateMess(). msmCreateMess() allocates
and initializes a message structure, returning a handle in the variable messh
(declared as type MessHdl). All message fields are automatically set to their
initial default values, if any. The second parameter states the message type
as Mess, which indicates a standard note message. msmCreateMess() returns 0 if
it was successful.
Next, a subject is added to the message using msmAddMessSubject(). This
function simply sets the subject of message messh to the text in datah. Note
that datah is a handle to pure text: It is not a null-terminated C string or a
length-encoded Pascal string. Therefore, the size of datah must be less than
255 characters.
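The three text representations mentioned above are easy to confuse. The following is a minimal C sketch, not MAPILib or Toolbox code; the function names pascal_to_raw and cstr_to_raw are hypothetical. It mirrors what the PtrToHand() call in Listing One does when it passes the address of arcName[1] and the length stored in arcName[0]: only the characters themselves are handed over, with no length byte and no terminating null.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the text of a Pascal string (a length byte followed by the
 * characters) into buf as raw text with no terminator.
 * Returns the number of bytes copied.                              */
size_t pascal_to_raw(const unsigned char *pstr, char *buf, size_t bufsize)
{
    size_t len = pstr[0];          /* byte 0 holds the length (0..255) */
    assert(len <= bufsize);        /* caller must supply enough room   */
    memcpy(buf, pstr + 1, len);    /* skip the length byte             */
    return len;
}

/* Same idea for a C string: strlen() locates the terminating null,
 * which is then left out of the copy.                              */
size_t cstr_to_raw(const char *cstr, char *buf, size_t bufsize)
{
    size_t len = strlen(cstr);
    assert(len <= bufsize);
    memcpy(buf, cstr, len);
    return len;
}
```

In both cases the caller must carry the returned length alongside the buffer, because nothing in the raw text marks where it ends.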
Immediately after completion of msmAddMessSubject(), datah is disposed to free
its memory. Failure to dispose of handles properly is one of the few areas
that can cause problems in your program. Keep this in mind. In this case, the
memory is released with the standard Mac Toolbox call to DisposHandle().
Because SendMail() is embedded within a compression application, it assumes
that you are sending an archive as a file enclosure. To access the archive
file, the appropriate parameter block fields are specified. The parameter
block is then passed to the Toolbox routine PBOpenWD(), which opens the file's
data fork.
Next, msmAddMessEnclosure() associates the selected archive file with the
message messh, which must reference an unsent message. It now must be
determined where the message will be sent. To accomplish this, the routine
msmDisplaySendDocScreen() puts a standard mail form on the screen for the user
to interactively edit, address, and send the message.
Because this application was written with version 2.0 of the MS Mail SDK,
msmDisplaySendDocScreen() has only one parameter--the message handle. The
recently introduced MS Mail SDK 3.0 has an improved msmDisplaySendDocScreen()
that includes two additional parameters: allowEnclAccess, which allows the
user to add or remove enclosures before sending; and send, which checks to see
whether the user pressed the Send or Cancel button.
It is important to note that the 2.0 version of msmDisplaySendDocScreen()
automatically frees the message's memory, regardless of whether Send or Cancel
was pressed. In 3.0, it will dispose of the handle only if send is true. This
is a good illustration of why you must always keep disposing of handles in
mind, being especially diligent when upgrading your application to the 3.0
SDK.
Like SendMail(), OpenMail() demonstrates the use of the MAPILib routines.
msmDisplayMessageCenter(), for instance, displays a Message Center dialog box
similar to that used by the MS Mail DA, allowing the user to browse through
mail messages and open any of them. Another routine,
msmGetListNumEnclosures(), returns the number of file enclosures selected in
the mail message list specified by its one parameter, a handle to a message
list.
Next, msmGetListEnclosureName() returns the name of the jth file enclosure
referenced in the mail message list, and msmGetListEnclosure() saves the jth
file enclosure to the disk in the current working directory under a specified
file name. Finally, the MAPILib routine msmGetListEnclosureComments() displays
a mail form that contains the message body and addressing information of the
specified mail message.


MAPILib 3.0


MAPILib 3.0 contains about 110 functions that can be grouped into categories,
a few of which are shown in Figure 1. Some function prototypes are also listed
to show what the routines in the API look like.
Figure 1: Function categories in MAPILib with sample function prototypes

 Accessing Mail Accounts
   Examples: msmGetUserName (Boolean fullname, Handle *namehp)
             msmGetServerName (Handle *namehp)

 Creating Mail Messages
   Examples: msmCreateMess (MessHdl *messhp, long type)
             msmCreateReplyMess (MessHdl *messhp, MessHdl original,
                                 Boolean replyAll, Boolean copyText)

 Addressing Mail Messages
   Examples: msmAddMessRecipient (MessHdl messh, Handle datah)
             msmAddMessCcRecipient (MessHdl messh, Handle datah)

 Sending Mail Messages
   Examples: msmSendMess (MessHdl messh)
             msmAddMessCcRecipient (MessHdl messh, Handle datah)

 Reading Mail Messages
   Examples: msmReadMess (MessHdl *messhp, long type, long id, short fldrld)
             msmGetMessSubject (MessHdl messh, Handle *datahp)
             msmGetMessSender (MessHdl messh, Boolean fullname,
                               Handle *namehp)

There are also sets of routines for printing messages, manipulating forms,
accessing a mail database, adding file enclosures to messages, accessing mail
folders, and a variety of other functions. These are all documented within the
MAPILib header file itself. A few of the new routines included within MAPILib
3.0 are listed in Figure 2.
Figure 2: MAPILib 3.0 functions

 Function                      Description
 -------------------------------------------------------------------------
 msmEnableDialog()             Disables the looking for server, progress,
                               and notify dialog boxes
 msmChooseServer()             "Chooses" to another server
 msmGetListField()             Extracts information from mail lists
 msmGetMultiListEnclosures()   Reads multiple file enclosures in one step
 msmPrintMessList()            Prints multiple messages in a single job
 msmMoveMess()                 Moves a message between folders
 msmCopyMess()                 Copies a message between folders
 msmCreateFldr()               Creates a user folder



Using Mail for General Network Communications


Once you are familiar with the MAPILib library, you will begin to see
possibilities that go beyond standard integration with mail. The key is to
view the API routines not as a mail interface, but as a general
message-passing interface that interacts transparently with a network.
From this perspective, each mail message can be thought of as a "generic"
network message that can contain any data and be acted upon in any way. Figure
3 shows a standard mail message. The message consists of a variable number of
objects known as "message fields." Each message field contains an identifier,
type, and N bytes of associated data, where N is limited only by the space
available on the mail server and by the memory available on the client. Very
long messages can be encoded as file attachments to get around memory
limitations.
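To make the "identifier, type, and N bytes of data" description concrete, here is a minimal C sketch of how an application might model a message on its own side. The MsgField structure and the helper names are hypothetical and chosen only for illustration; they are not the MAPILib data structures, which the library hides behind its message handles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory view of one message field. */
typedef struct {
    long                 id;     /* field identifier          */
    long                 type;   /* field type code           */
    size_t               size;   /* N, number of data bytes   */
    const unsigned char *data;   /* N bytes of arbitrary data */
} MsgField;

/* Return the first field with the given identifier, or NULL. */
const MsgField *find_field(const MsgField *fields, size_t count, long id)
{
    size_t i;
    assert(fields != NULL || count == 0);
    for (i = 0; i < count; i++)
        if (fields[i].id == id)
            return &fields[i];
    return NULL;
}

/* Total payload of a message: the sum of every field's N. */
size_t total_payload(const MsgField *fields, size_t count)
{
    size_t i, total = 0;
    for (i = 0; i < count; i++)
        total += fields[i].size;
    return total;
}
```

Because each field carries its own length, the data can be binary as easily as text, which is what lets a mail message serve as a generic network message.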
An application can use the standard (default) message formats, or build custom
message types (using MAPILib functions) to suit specific application
requirements. Standard mail message types include those shown in Figure 4.
Figure 4: Some of the standard mail message types

 Message type           Description
 ---------------------------------------------------------------------
 msmMessTypeNote        Simple text-only message
 msmMessTypeImage       A graphic image, such as TIFF
 msmMessTypePhone       A phone message
 msmMessTypeAssist      A request for network administrator assistance
 msmMessTypeAppleLink   AppleLink message
 msmMessTypeVoice       Voice messages (works with a variety of
                        third-party sound drivers)
 msmMessType80Col       A message in 80 column format for use with PCs

Once the appropriate message formats are identified or built, you can use
MAPILib functions to manipulate them, with or without interaction, through the
MS Mail DA user interface. The msmLogOn() function can set up a logon session
with a server that either goes through the DA or through your own
application-specific session. Consider the statement in Example 1.
Example 1: Using msmLogOn() to log onto a server either through the DA (if fDA
is true) or directly from the application, bypassing the DA (if fDA is
false).

 msmLogOn (Boolean fDA, StringPtr stUsername, StringPtr stPassword,
 StringPtr stPrompt);

If fDA is false, the DA is bypassed and the application logs into the mail
system directly. You can then scan for new messages in the background,
retrieve them, and act on them without user intervention. In other words,
because the content of a message is arbitrary (it can easily include binary
data) and MAPILib gives you a general message-handling interface, you can
build a broad range of fairly sophisticated network applications.



Custom Forms


The standard form that comes with MS Mail handles only mail applications. If
you build a more general application that involves any type of user
interaction, you will usually create custom forms that are designed to obtain
input and display messages in an application-specific format. This can be done
in one of two ways. The first and easiest involves no programming at all. Use
the HyperCard-based MS Mail Form Designer to create new forms interactively
and store them on the server. This is another advantage of using a mail
program to build network applications--user interface screens can be designed
and stored without writing a single line of code.
For the most sophisticated applications, MS Mail supports special Form Control
Procedures (FCPs) that allow developers to control forms programmatically and
extend the functionality of the mail system. FCPs have their own API that is
separate from the functions provided in MAPILib and is useful for interfacing
to specialized hardware drivers. Be aware, however, that the FCP API is a
lower-level interface that is much more difficult to use than MAPILib. In
general, the recommendation is to use the Form Designer and MAPILib whenever
possible.
As an example of an FCP, consider a special "sound" form that is custom-built
and installed on the server under a unique form ID number. When the user
clicks a sound icon in their Mail DA, the form and its associated FCP are
automatically downloaded from the server. The FCP contains the information to
display the form to the user as well as the code that deals with processing
the form. The form would then appear and display "play" and "record" buttons
that interface with Apple or third-party sound drivers, allowing you to record
your voice and send it to anyone as compressed digital voice mail. In this
case, the message would be of type "Voice" and would be in binary format,
perhaps along with other nonbinary enclosures.


Other Tools


The C SDK is one of several MS Mail programming environments. Excel and
HyperCard have macro and script equivalents for most of the MAPILib functions.
C programmers should note that either of these programs can act as a
prototyping tool by which you can build your application and test it at a very
high level before you begin coding in C. Since Excel and HyperCard have user
interface tools as well, you can also experiment quickly with different forms
and dialog boxes before spending time with C libraries. Even if you have no
intention of using Excel or HyperCard, you should be aware that the
documentation provided with these two SDKs is more complete than the C SDK and
can provide better insight into the operation of each MAPILib function.
Naturally, you can use both the Excel and HyperCard SDKs as full-blown
development environments, not just as prototyping tools. However, Excel and
HyperCard are not well-suited for applications that involve large amounts of
data and/or transactions.
The MS Mail Excel Macro Library is actually an Excel macro sheet that
includes about 65 macro functions, which may also call external
("registered") code supplied in the resource fork of the Mail file.
Programming with Excel macros does not give you the flexibility that
programming directly with MAPILib does, but you can access virtually all of
the mail functionality.
Mail messages that are related to specific information stored in an Excel
spreadsheet or database can be created, sent, or read by users and/or other
applications. For example, if a finance department keeps their company-wide
budget allocations in an Excel spreadsheet, a report with a corresponding
chart and message could be automatically generated and sent to each department
head at the end of the month to inform them of their budget status. The macro
program would begin with the macro shown in Example 2, which opens and
initializes the Excel Mail library. The
macro could also store monthly budget results in the mail database for each of
the department heads. The four macro calls shown in Example 3 would be the
foundation for the routine.
Example 2: Macro to open and initialize the Excel Mail library.

 =IF (ISERROR (OPEN ("Mail", FALSE, TRUE)),
     ALERT ("Cannot find the 'Mail' library in this directory", 2))

 =IF (Mail!INIT.MAIL.LIBRARY () < 0,
     ALERT ("Could not initialize the Mail library", 2))

Example 3: Macro calls to set up and store monthly budget results in the mail
database

 =GET.USER.LIST (Mail!msmLocalList, "user name sub-string" ,1,1)
 =GET.LIST.ELEMENT.FIELD (AValidHandle,1,MAIL!msxUserLEF ID)
 =SET.APPSIG(0,0)
 =ADD.DATABASE.ELEMENT (0, Local_user_key,user_list)

Each of the above calls (as well as the other macro functions) has a C language
equivalent in MAPILib. The HyperCard SDK extends the HyperTalk script
language in a similar manner, and is suitable for applications that involve a
high degree of user interaction.


Products Mentioned


Microsoft Mail Version 3.0 SDK
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399
206-882-8080
MPW C SDK: $445.00
System Requirements: MPW 3.0
Note: HyperCard SDK is available free over AppleLink.


Conclusion


Mail APIs are important tools for developers who want to quickly implement
network applications without the traditional hassle of low-level
communications programming. They are simple, robust, high-level, and provide
great flexibility for applications that are file or message-oriented. While
Apple's Comm Toolbox or other low-level API is the only choice for some
situations, a mail API can save hundreds of hours of development time yet
allow you to build extremely sophisticated applications.


Acknowledgments


I would like to thank the people at Information Research Corporation, which
markets a workgroup-oriented project tracking application called Syzygy, for
their assistance and comments while I was preparing this article.

_EXAMINING THE MICROSOFT MAIL SDK_
by Bruce D. Schatzman

[LISTING ONE]

#include <MacHeaders>

#include "MAPILib.h"
#include "MAPIErrs.h"
#include "stuffit.h"

extern char string[256];
extern arcRecord *myArc;
extern WindowPtr windows[8];
extern EventRecord pss;
extern HParamBlockRec HRec;

SendMail()
{
 int i, j;
 MessHdl messh;
 Handle datah;
 SFReply myReply;

 i = msmSessionEstablished ();
 if ( i )
 {
 MSMError ( i );
 return;
 }

 i = msmCreateMess ( &messh, 'Mess' );

 /* the message was successfully created */
 if ( i == 0 )
 {
 i=PtrToHand ( &myArc->arcName[1], &datah, myArc->arcName[0] );
 /* Add archive name as subject */
 if ( i == 0 )
 {
 i = msmAddMessSubject ( messh, datah );
 DisposHandle ( datah );
 }
 if( i== 0) /*add default body */
 {
 GetIndString ( string, 260, 1 );
 i = PtrToHand ( &string[1], &datah, string[0] );
 if ( i == 0 ) i = msmAddMessBody( messh, datah );
 DisposHandle ( datah );
 }

 /* Now add the archive as an enclosure */
 HRec.wdParam.ioNamePtr = 0L;
 HRec.wdParam.ioVRefNum = myArc->arcVol;
 HRec.wdParam.ioWDProcID = 'SIT!';
 HRec.wdParam.ioWDDirID = myArc->arcDir;

 PBOpenWD ( &HRec, FALSE );

 i = msmAddMessEnclosure ( messh, myArc->arcName,
 j = HRec.wdParam.ioVRefNum, 0L );

 /* Now find out where to send it */
 if ( i == 0 )
 {
 i = msmDisplaySendDocScreen ( messh );

 /* this now disposes of messh.. */
 }
 else
 {
 /* we're done with it */
 msmDisposeMess ( messh );
 }
 HRec.wdParam.ioVRefNum = j;
 PBCloseWD ( &HRec, FALSE );
 }
 if ( i == msmTErrCancelled ) i = 0;
 if ( i ) MSMError ( i );
}

OpenMail ()
{
 int i, numItems, j, vol;
 long dir;
 SFTypeList myList;
 Handle resh;
 SFReply myReply;
 char str[32];
 int numOpen = 0;
 Point startPt;

 i = msmSessionEstablished ();
 if ( i )
 {
 MSMError ( i );
 return;
 }

/* How many archives are open now ? */
 for ( i = 0; i < 8 ; i++)
 {
 if ( windows[i] ) numOpen++;
 }
 myList[0] = 'SIT!';
 myList[1] = 'SIT2';
 myList[2] = 'SITD';

 i = msmDisplayMessageCenter (3, myList, 0L, 0, &resh);

 if ( i == 0)
 {
 EventAvail ( 0, &pss );
 numItems = msmGetListNumEnclosures ( resh );
 for ( j = 0; j < numItems && i == 0; ++j )
 {
 i = msmGetListEnclosureName (resh, j ,
 (StringPtr) string );
 if ( i == 0 )
 {
 GetIndString (str, 260, 2); /* "Save enclosure as :" */
 startPt.h = (screenBits.bounds.right -
 screenBits.bounds.left) / 2 - 158;
 startPt.v = 80;
 SFPutFile ( startPt, (StringPtr) str, (StringPtr)
 string, (ProcPtr) 0L, &myReply );
 if ( myReply.good )
 {
 if ( i == 0 )
 i = msmGetListEnclosure ( resh, j, TRUE,
 myReply.fName, myReply.vRefNum, &myReply);
 if (( i == 0 ) && ( numOpen < 8 ))
 /* open archive if room permits */
 {
 StartSpinCursor ();
 GetWDVolDir ( myReply.vRefNum, &vol, &dir );
 if ( !ActualOpen (vol, dir, myReply.fName, 0L, 0,
 0L, pss.modifiers ))
 break;
 if ( pss.modifiers & shiftKey )
 DoAutoUnsit ( FALSE );
 AutoView ();
 numOpen++;
 StopSpinCursor ();
 }
 if ( i == 0)
 i = msmGetListEnclosureComments ( resh, j );
 }
 }
 }
 StopSpinCursor ();
 DisposHandle ( resh );
 }

 if ( i == msmTErrCancelled ) i = 0;
 if ( i ) MSMError ( i );
}

MSMError ( i )
{
 char str2[256];
 NumToString ( i, (StringPtr) string );
 switch ( i )
 {
 case msmNoDriver:
 GetIndString ( str2, 259, 3 );
 /* "MS Mail was not loaded at startup. " */
 break;
 case msmNoServer:
 GetIndString (str2, 259, 1 );
 /* "You are not connected to a MS Mail server. " */
 break;
 case msmDErrNotLoggedOn:
 GetIndString (str2, 259, 13 );
 /* "You are not logged onto the MS Mail server. " */
 break;
 default:
 GetIndString ( str2, 259, 2 );
 /* " Miscellaneous MS Mail Error. " */
 break;
 }
 ParamText (str2, string, 0, 0 );
 StopCAlert ( 296, 0L );
}

































































April, 1991
FRACTALS IN THE REAL WORLD


A general-purpose interactive drawing and modeling tool


 This article contains the following executables: OLIVER.ARC


Dick Oliver


Dick has taught computer science to all ages since 1979. He is now president
of Cedar Software and author of the "Fractal Grafics" Guidebook and software.
He can be reached at Cedar Software, RI, Box 5140, Morrisville, VT 05661;
802-888-5275.


Some of the hardest things to model with traditional geometry -- plants,
splashing water, and other complex natural forms--are the easiest to draw with
fractals. Fractal drawing doesn't require new technology--though
computationally intensive, it's well within the capabilities of modern PCs.
Ironically, the popularity of "Mandelbrot Set" fractal programs, which are
enchanting but useless for interactive drawing and real-world modeling, has
helped keep the potential of fractals from being widely understood.
This article describes how to generate fractal images using a general drawing
tool I call a "fractal template." This tool is a subset of a mouse-oriented
fractal drawing system, "Fractal Grafics," which my company provides. The
template is simple enough to be implemented interactively on your PC,
intuitive enough for creative design work, and capable of producing any image
displayable by your hardware (though some images are a lot easier to draw than
others!). I'll use Microsoft C 5.1/6.0, but porting to other compilers should
be straightforward.


What's a Fractal?


Fractals are infinitely detailed shapes: You can magnify them as much as you
want, and you'll still see complex detail. Conventional, "linear" shapes such
as Bezier curves and rectangles, on the other hand, always look like straight
lines when greatly magnified (see Figure 1).
How do you describe an infinitely detailed shape? The simplest way is through
"successive approximation." Imagine an example: Start with a triangle, and lay
three smaller triangles over it so that a hole is left uncovered in the
middle. Then cover each of those three small triangles with three more, even
smaller ones. Continue that process forever, and you have the well-known
fractal called "Sierpinski's Triangle" (see Figure 2).
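The arithmetic behind this successive approximation is simple: each level has three times as many triangles as the one before. If each new triangle is half the size of the one it covers (as in the example developed below), the side length also halves at every level. The following C sketch, with hypothetical function names, works this out for any level.

```c
#include <assert.h>

/* Number of triangles at a given level of the construction:
 * 1 at level 0, then three times as many at each further level. */
long triangles_at_level(int level)
{
    long n = 1;
    assert(level >= 0);
    while (level-- > 0)
        n *= 3;
    return n;
}

/* Side length at a given level, assuming every triangle is half
 * the size of the one it was derived from.                       */
double side_at_level(double side0, int level)
{
    assert(level >= 0);
    while (level-- > 0)
        side0 *= 0.5;
    return side0;
}
```

For a seed triangle 400 units wide, level 3 has 27 triangles, each 50 units wide.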


Growing From a Seed


You can draw an approximation of Sierpinski's Triangle using a fractal
template. The original triangle can be defined as an array of x, y points,
shown here as Example 1(a). Call it _seed_, because the rest grows from it.
(To improve performance, the triangle is declared as two arrays of ints,
_seedx_ and _seedy_, rather than a single struct.)
Example 1: Triangle transformations

 (a) #define NPOINTS 3
     int seedx [NPOINTS] = {-200, 200, 0},
         seedy [NPOINTS] = {-200, -200, 200};

 (b) #define NTRANS 3
     float movex [NTRANS] = {-100.0, 100.0, 0.0},
           movey [NTRANS] = {-100.0, -100.0, 100.0},
           sizex [NTRANS] = {0.5, 0.5, 0.5},
           sizey [NTRANS] = {0.5, 0.5, 0.5};

 (c) float spinx [NTRANS] = {0.0, 0.0, 0.0},
           spiny [NTRANS] = {0.0, 0.0, 0.0};

 (d) a = sizex * cos (spinx)
     b = - sizey * sin (spiny)
     c = sizex * sin (spinx)
     d = sizey * cos (spiny)

 (e) x2 = a * x1 + b * y1 + movex
     y2 = c * x1 + d * y1 + movey

 (f) a3 = a2 * a1 + b2 * c1
     b3 = a2 * b1 + b2 * d1
     c3 = c2 * a1 + d2 * c1
     d3 = c2 * b1 + d2 * d1

Now you need to describe the placement of the three smaller triangles as a
transformation of the seed. Together, the "seed" and "transformations" define
the fractal by pointing toward an infinite progression of smaller triangles
being transformed into still smaller ones.
In our example, each triangle is half the size of the original, and offset
half way to one of the corners. So we define a _size_ variable, and a _move_
in x,y, for each of the three transformations; see Example 1(b).
Eventually, we want to both describe a wide variety of shapes besides
Sierpinski's Triangle, and to interactively place, resize, and maybe even spin
and stretch the seed shape to choose transformations. Because we have defined
separate _size_ variables for the x and y axes, we can describe a stretching
transformation by increasing the size on one axis only. (For simple shrinking,
the x and y sizes will have the same value.)
Similarly, separate _spinx_ and _spiny_ variables allow rotation and skewing
transformations, as shown in Example 1(c). Because the three transformations
that define a Sierpinski's Triangle don't involve any rotation, for now the
spin variables are zero.
By using matrix algebra, there is a simple way to do all this at once: with
"affine transformations." An affine transformation T can be expressed as four
variables (a, b, c, and d) plus the x,y displacement (movex, movey). To derive
these variables from size and rotation, use the formulas in Example 1(d).
Example 1(e) shows how to go from a point (x1,y1) on the big seed triangle to
the corresponding point (x2,y2) on one of the smaller, "transformed"
triangles.
Finally, given two transformations, T1 and T2, there is a third
transformation, T3, that represents the combination of the two. The formula in
Example 1(f) shows how to produce T3.
With this, you're ready to start drawing. Listing One (sierp.c, page 101) and
Listing Two (sierp.h, page 101) contain the C code to draw Sierpinski's
Triangle on a VGA screen. After initializing the screen, the equations in
Example 1(d) are used to compute the transformations defined by the size and
spin variables. The _draw_ function transforms the seed shape using the
approach in Example 1(e). On the first call, the transformation leaves size
and rotation unchanged and simply moves the seed to the center of the screen.
_draw_ then draws the transformed shape and calls itself recursively, once for
each child, combining the current transformation with the child's
transformation as in Example 1(f), to draw smaller and smaller triangles. The
recursive calls continue until _iter_, the iteration counter originally set to
NLEVELS, drops below zero.
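One way to get a feel for the recursion is to count polygons instead of drawing them. The sketch below assumes the same control flow as _draw_ in Listing One (decrement the counter, draw one polygon, stop once the counter has gone negative). The helper name count_polygons is hypothetical and does not appear in the listings.

```c
#include <assert.h>

/* Count the polygons a draw-style recursion would put on the screen.
   ntrans is the number of child transformations; iter is the level
   counter passed in by the caller (NLEVELS on the first call). */
unsigned long count_polygons(int ntrans, int iter)
{
    unsigned long total = 1;      /* the polygon drawn at this level */
    int t;

    assert(ntrans >= 0);
    iter--;                       /* one more level of depth used up */
    if (iter < 0) return total;   /* deepest level: no children */

    for (t = 0; t < ntrans; t++)  /* one recursive call per child */
        total += count_polygons(ntrans, iter);
    return total;
}
```

With the values from sierp.h (three transformations, NLEVELS set to 6), the count comes to 1,093 polygons: the seed plus six generations of children.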


The Chaos Game


Technically, a shape isn't really "fractal" until you draw the "infinite"
level of detail. Of course, the smallest level distinguishable at the
resolution of your screen will do. With Sierpinski's Triangle, that level is
reached when each tiny triangle is one pixel wide--about the sixth level of
detail on a VGA screen.
Michael Barnsley and his colleagues at Georgia Tech have discovered a
mathematical magic trick which allows you to skip immediately to that finest
level of detail and "paint" the whole shape at once on your screen. They call
it the "Chaos Game," and it is played as follows:
Start at any point on the screen, and randomly select one of the
transformations which define the fractal. Apply the transformation to the
point, as in Example 1(e), and make a dot at the resulting new point. Again
choose one of the transformations at random, apply it to the new point, and
make another dot wherever it ends up. Believe it or not, if you continue this
seemingly random journey around the screen, a fractal will quickly appear!
If you're choosing between the same three transformations defined above,
you'll get the same Sierpinski's Triangle as drawn by the "successive
approximation" method. The _paint_ function in Listing One demonstrates the
Chaos Game, or "random iteration" algorithm.
How does this work? One way to understand it is to notice that the resulting
shape is made up of three miniature copies of itself. If you took every dot in
the whole big shape and applied one of the transformations to it, you'd get
one of those three miniatures. So each "random pick" goes from a dot on the
whole to the corresponding dot on one of the parts. Leap around this way on
the shape for a while, and you end up landing on almost all the dots in all
three copies.
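The game is easy to try without any graphics library. The following sketch assumes the Sierpinski template from Example 1(b) and stores the dots in caller-supplied arrays rather than plotting them; chaos_game and its parameters are hypothetical names chosen for illustration. Because every part is half size and unrotated, Example 1(e) reduces to a multiply and an add on each axis.

```c
#include <assert.h>
#include <stdlib.h>

#define NTRANS 3

/* Sierpinski template from Example 1(b): a = d = 0.5 and b = c = 0. */
static const float movex[NTRANS] = {-100.0f, 100.0f, 0.0f};
static const float movey[NTRANS] = {-100.0f, -100.0f, 100.0f};
static const float size = 0.5f;

/* Play the Chaos Game for n steps, starting from (0,0), and store
   each dot in xs[] and ys[] (both must have room for n floats). */
void chaos_game(int n, unsigned seed, float *xs, float *ys)
{
    float x = 0.0f, y = 0.0f;
    int i, t;

    assert(n >= 0 && xs && ys);
    srand(seed);
    for (i = 0; i < n; i++) {
        t = rand() % NTRANS;       /* pick a transformation at random */
        x = size * x + movex[t];   /* Example 1(e) with b = c = 0 */
        y = size * y + movey[t];
        xs[i] = x;                 /* this is where a dot would be drawn */
        ys[i] = y;
    }
}
```

Since each transformation maps the seed triangle into itself and the starting point (0,0) lies inside the seed, no dot should ever land outside the range -200 to 200 on either axis.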


Drawing a Fractal Tree


You've seen two ways to draw a fractal, given a seed shape and some
transformations. To see how this sort of approach can model real-world
phenomena, imagine the growth of a fractal tree. If our seed shape looked like
the trunk of a tree, and the transformed trunks were placed like the first
level of branches, the _draw_ function described above would grow a realistic
tree on the screen. Change the #include "sierp.h" in Listing One to #include
"maple.h" (Listing Three, page 101), and you can watch it happen. The _paint_
function shows only the finest level of foliage, skipping the trunk and
branches altogether.
Figure 3 and Listing Three show how you can model the genetics of different
species of plants by defining templates. Note that the template data for the
maple tree and maple leaf are very similar, while the template for a pine tree
is quite different.


Interactive Design


Movement, size, and rotation are easy to represent visually. By displaying the
seed shape and the first level of transformed copies, you can show the entire
definition of a complex fractal on the screen. The template can be
interactively manipulated in real time with simple menu selections like
"spin," "shrink," "grow," and "stretch." Display the user's changes in real
time, and you have the essentials of an interactive fractal drawing program!
See Listing Four (page 102) for an example called the FRACDRAW program. As with
Listing One, you can change the #include statement to change the starting
template.
When FRACDRAW starts with the "maple" template, you'll see several polygons on
the screen. The largest is always the seed shape, and the others are
transformed copies of it. By pressing one of the keys listed on the menu, you
can change a transformation. A T-shaped "handle" is always displayed in the
center of one of the polygons to let you know which transformation you are
currently editing. Picking "Next Part" moves the handle to another polygon,
allowing you to edit the corresponding transformation.
If you want to swivel one of the branches of the tree, you can pick Next Part
(press the Tab key) until the handle is in the center of that branch. Then,
picking spin (the + or - key) will spin that branch. The arrow keys will move
that particular branch relative to the rest. When you "draw," your changes
will be carried throughout all levels of the tree.
When the draw and paint functions place the fractal on the screen, they apply
a final transformation to the whole thing, independent of the shape itself. By
placing the handle on the seed shape, you can edit that final transformation,
which effectively spins, resizes, and moves the whole fractal at once.
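Because the template is always displayed through that final transformation, an arrow-key nudge measured in screen pixels has to be converted back into template coordinates before it is added to a point or to a part's displacement. Listing Four does this with the inverse of the final transformation's linear part, kept in ra, rb, rc, and rd. The sketch below shows the idea in isolation; the function names are hypothetical, and unlike Listing Four (which substitutes a small nonzero determinant) it simply reports failure when the matrix cannot be inverted.

```c
#include <assert.h>

/* Invert the linear part (a,b,c,d) of a transformation.
   Returns 1 on success, 0 if the determinant is zero. */
int invert_linear(float a, float b, float c, float d,
                  float *ra, float *rb, float *rc, float *rd)
{
    float det = a * d - b * c;

    assert(ra && rb && rc && rd);
    if (det == 0.0f) return 0;
    *ra =  d / det;
    *rb = -b / det;
    *rc = -c / det;
    *rd =  a / det;
    return 1;
}

/* Map a screen-space nudge (dx,dy) back into template coordinates. */
void screen_to_template(float ra, float rb, float rc, float rd,
                        float dx, float dy, float *tx, float *ty)
{
    assert(tx && ty);
    *tx = dx * ra + dy * rb;
    *ty = dx * rc + dy * rd;
}
```

If the whole fractal has been stretched to twice its width and four times its height, for instance, a 16-pixel nudge on each axis corresponds to a move of 8 units in x and 4 units in y on the template.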
You can also edit the seed polygon itself. Picking "NextPoint" moves the
handle to a corner point on the seed, and the arrow keys will then move that
point. All the menu choices will then modify the seed shape instead of the
transformation data. Picking Next Part puts the handle back in the center of a
polygon, letting you edit the transformations again.


Fractal Drawing Techniques


To get creative with fractal templates, you'll need to learn two basic drawing
techniques. You've already seen the first in action. To draw the maple tree, I
modeled the natural growth of a tree geometrically. This works great when you
want to draw something that grows level-by-level. FLAKE.H in Listing Five
(page 105) gives you the data for a snowflake drawn the same way.
If you want to draw things like clouds and mountains, which don't obviously
grow out of a simple "seed," you'll need a more powerful technique called
"tiling." To draw each of the fractals in Figure 4 and Listing Five, I
sketched a rough outline of the whole shape, and then "tiled" it with copies
of itself. Listing Six (page 106) contains templates for cumulus, cirrus, and
stratus clouds, as well as for other images (pine tree, leaf, and so on).
How does tiling work? A surprising theorem by John Elton has proven that any
shape can be tiled with smaller copies of itself, and that applying the random
iteration algorithm with the transformations used in the tiling will recreate
the original shape. Because each transformation can be expressed as only six
numbers, complex images can be described in very few bytes. Michael Barnsley
and Alan Sloan have achieved image compression ratios of more than 10,000:1 in
this fashion (see References).
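A back-of-the-envelope calculation shows why the numbers get so large. The sketch below is only an illustration under stated assumptions: the Transform struct and both helper functions are hypothetical, the seed polygon and colors are ignored, and this is not the method Barnsley and Sloan used to measure their ratios.

```c
#include <assert.h>
#include <stddef.h>

/* Six numbers per transformation, as in Examples 1(d) and 1(e). */
typedef struct { float a, b, c, d, movex, movey; } Transform;

/* Bytes needed to store a template of ntrans transformations. */
size_t template_bytes(size_t ntrans)
{
    return ntrans * sizeof(Transform);
}

/* Bytes in an uncompressed bitmap, with each row rounded up to a
   whole number of bytes. */
size_t bitmap_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    size_t row_bits = width * bits_per_pixel;

    assert(bits_per_pixel > 0);
    return ((row_bits + 7) / 8) * height;
}
```

On a compiler with 4-byte floats, the three transformations of Sierpinski's Triangle occupy 72 bytes, while an uncompressed 640 x 480 screen at 4 bits per pixel takes 153,600 bytes, a ratio of roughly 2,100:1.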


Coloring


You can add color to your fractal images with a simple two-dimensional array.
Each transformation, or "part" of the template, can have a separate color at
each level of detail. The _draw_ routine in FRACDRAW.C (Listing Four) uses
just such an array to assign colors.
Usually, you'll want to use one of two coloring schemes: coloring by level, or
coloring by transformation. The maple template defined in Listing Three uses
the former, assigning a separate color to each level of detail, from a brown
trunk to green leaves. The _sierp_ template uses the latter, giving each
"part" its own color throughout all levels.
Coloring with the _paint_ function can actually use the same color array, even
though the fractal skips immediately to the infinite level of detail. You can
#include the "color.c" template definition to see how this works. The comments
in Listing Four explain how it's done. You can experiment with color
assignments for both the _draw_ and _paint_ functions by playing around with
the COLOR definitions in the header files.
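The two schemes, and the look-back trick that lets _paint_ share the same array, can be sketched as follows. This is a reading of the description above and of the comments in Listing Four, not a copy of its code; the function names, the NPARTS constant, and the ring-buffer parameters are assumptions made for the example.

```c
#include <assert.h>

#define NPARTS  3
#define NLEVELS 6

/* The table is indexed color[part][level], the same layout as the
   COLOR definitions in the header files. */

/* Coloring by level: all parts share one color at each level. */
void color_by_level(int color[][NLEVELS], const int level_colors[NLEVELS])
{
    int part, level;
    for (part = 0; part < NPARTS; part++)
        for (level = 0; level < NLEVELS; level++)
            color[part][level] = level_colors[level];
}

/* Coloring by transformation: each part keeps one color at all levels. */
void color_by_part(int color[][NLEVELS], const int part_colors[NPARTS])
{
    int part, level;
    for (part = 0; part < NPARTS; part++)
        for (level = 0; level < NLEVELS; level++)
            color[part][level] = part_colors[part];
}

/* Paint-style lookup. history[] is a ring buffer holding the parts
   visited on the last nhist iterations, with the most recent one at
   index newest. The part visited "back" iterations ago tells us which
   level-"back" copy the current dot lies in. */
int paint_color(int color[][NLEVELS], const int *history, int nhist,
                int newest, int back)
{
    int i;

    assert(history && nhist > 0);
    assert(back >= 0 && back < nhist && back < NLEVELS);
    i = newest - back;
    if (i < 0) i += nhist;        /* wrap around the ring buffer */
    return color[history[i]][back];
}
```

Filling the table with color_by_level and the values 6, 6, 14, 14, 10, 2 reproduces the maple scheme; color_by_part with 2, 1, 5 reproduces the first three rows of the Sierpinski table.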


A New Way of Seeing


Drawing with traditional tools involves breaking an image into many separate
shapes and rendering each individually with a curve, polygon, line, or dot.
Drawing with fractals, on the other hand, demands that you find reflections of
the whole in each part and work with entire sections of the image as
indivisible, infinitely detailed forms. The symmetries and structures revealed
by fractal drawing are thus quite different than those revealed by more
atomistic approaches. At first, you may find it difficult to associate the
form of the template with your resulting fractal artwork. With practice,
though, you'll start seeing the global changes and intricate symmetries quite
easily.
With the right mathematical tools, the complex, irregular beauty that
surrounds us in nature is as easy to model and describe as the smooth
linearity of man-made artifacts.



References


For access to a wide variety of books and software on fractals and computer
graphics, contact Media Magic, P.O. Box 507, Nicasio, CA 94946.
Barnsley, Michael J. "A Better Way to Compress Images." BYTE (January, 1988).
Barnsley, Michael J. Fractals Everywhere. San Diego, Calif.: Academic Press,
1988.
Mandelbrot, Benoit B. The Fractal Geometry of Nature. San Francisco, Calif.:
W.H. Freeman, 1982.
Oliver, Dick T. The Fractal Grafics Guidebook. Morrisville, Vt.: Cedar
Software, 1990.

_FRACTALS IN THE REAL WORLD_
by Dick Oliver


[LISTING ONE]

/****************************************************************
 SIERP.C -- (C) 1990 by Dick Oliver, R1 Box 5140, Morrisville, VT 05661
 A program to "draw" and "paint" Sierpinski's Triangle, defined by a
 "seed" (or "parent") shape and three transformations of that shape
 ("children"). The author makes no claims as to readability or
 suitability for a particular task, but I'll be happy to give advice
 and assistance.
*****************************************************************/

#include <stdio.h>
#include <stdlib.h> /* For rand() */
#include <conio.h> /* For getch() and kbhit() */
#include <math.h> /* For cos() and sin() */
#include <graph.h> /* For graphics calls */

#include "sierp.h" /* You can change this for other template definitions */

int seedx[NPOINTS] = {SEEDX}, /* The "parent" polygon */
 seedy[NPOINTS] = {SEEDY};

 /* The transformations which define the "children" */
float movex[NTRANS] = {MOVEX}, /* Displacement */
 movey[NTRANS] = {MOVEY},
 sizex[NTRANS] = {SIZEX}, /* Size change */
 sizey[NTRANS] = {SIZEY},
 spinx[NTRANS] = {SPINX}, /* Rotation */
 spiny[NTRANS] = {SPINY},

 /* The transformation matrix T, computed from the above variables */
 Ta[NTRANS], Tb[NTRANS], Tc[NTRANS], Td[NTRANS];

/* Function prototypes */
void draw(float a, float b, float c, float d, float mx, float my, int iter);
void paint(int mx, int my);

void main(void)
{ int t;
 _setvideomode(_VRES16COLOR); /* Initialize the screen */
 _clearscreen(_GCLEARSCREEN);

 /* Compute a,b,c,d from the move, size, and spin variables */
 for (t = 0; t < NTRANS; t++)
 { Ta[t] = sizex[t] * cos(spinx[t]);
 Tb[t] = - sizey[t] * sin(spiny[t]);
 Tc[t] = sizex[t] * sin(spinx[t]);
 Td[t] = sizey[t] * cos(spiny[t]);
 }
 /* Invoke draw with an initial transformation to move the triangle
 to the center of the screen, unchanged in size or rotation */
 draw(1.0, 0.0, 0.0, 1.0, (float) CENTERX, (float) CENTERY, NLEVELS);
 _settextposition(30,0);
 _outtext("Press any key to paint.");
 getch();
 _clearscreen(_GCLEARSCREEN);

 /* Invoke paint, specifying the center of the screen */
 paint(CENTERX, CENTERY);
 _settextposition(30,0);
 _outtext("Press any key to exit.");
 getch();

 _setvideomode(_DEFAULTMODE); /* Go back to text mode and exit */
}


/* This recursive routine draws one "parent" polygon, then calls itself
 to draw the "children" using the transformations defined above */
void draw(float a, float b, float c, float d, float mx, float my, int iter)
{ int t;
 iter--; /* Count one more level of drawing depth */
 { /* Use a,b,c,d,mx,my to transform the polygon */
 float x1, y1; /* Point on the parent */
 int p, x2[NPOINTS], y2[NPOINTS]; /* Points on the child */
 for (p = 0; p < NPOINTS; p++)
 { x1 = seedx[p];
 y1 = seedy[p];
 x2[p] = a * x1 + b * y1 + mx;
 y2[p] = c * x1 + d * y1 + my;
 }
 /* Now draw the new polygon on the screen */
 _moveto(x2[NPOINTS - 1], y2[NPOINTS - 1]);
 for (p = 0; p < NPOINTS; p++) _lineto(x2[p], y2[p]);
 }
 if (iter < 0) return; /* If we're at the deepest level, back out */

 /* Do a recursive call for each "child" of the polygon we just drew */
 for (t = 0; t < NTRANS; t++)
 { draw(Ta[t] * a + Tc[t] * b,
 Tb[t] * a + Td[t] * b,
 Ta[t] * c + Tc[t] * d,
 Tb[t] * c + Td[t] * d,
 movex[t] * a + movey[t] * b + mx,
 movex[t] * c + movey[t] * d + my,
 iter);
 }
}


/* This routine uses "random iteration" to paint the fractal dot-by-dot.
 The resulting shape will be the same as if we called draw with a
 huge value for the number of levels */
void paint(int mx, int my)
{ int t;

 unsigned long ct = 0; /* Counter for number of dots painted so far */
 float x1 = 0.0, y1 = 0.0, x2, y2; /* Current and next dot */

 /* Keep going until a key is pressed or we reach the COUNT limit */
 while(!kbhit() && (++ct < COUNT))
 { t = rand() % NTRANS; /* Pick one of the transformations at random */

 /* Then move from a dot on the "whole" to the corresponding dot
 on some transformed "part" */
 x2 = x1 * Ta[t] + y1 * Tb[t] + movex[t];
 y2 = x1 * Tc[t] + y1 * Td[t] + movey[t];
 x1 = x2, y1 = y2;

 /* Skip the first few dots--it takes a while to "find" the fractal */
 if (ct > 8) _setpixel((int) x2 + mx, (int) y2 + my);
 }
}






[LISTING TWO]

/****************************************************************
 SIERP.H -- Header file for Sierpinski's Triangle template
 This (and the other header files like it) can be used to define
 the initial fractal template for the SIERP.C and FRACDRAW.C programs
*****************************************************************/

#define NPOINTS 3 /* Number of points on the "parent" polygon */
#define NTRANS 3 /* Number of transformed "children" */
#define NLEVELS 6 /* Number of levels to draw */
#define COUNT 10000 /* Number of dots to paint */
#define CENTERX 320 /* Center of the screen x, y*/
#define CENTERY 240
#define SEEDX -200, 200, 0 /* The "parent" polygon */
#define SEEDY -200, -200, 200

 /* The transformations which define the "children" */
#define MOVEX -100.0, 100.0, 0.0 /* Displacement */
#define MOVEY -100.0, -100.0, 100.0
#define SIZEX 0.5, 0.5, 0.5 /* Size change */
#define SIZEY 0.5, 0.5, 0.5
#define SPINX 0.0, 0.0, 0.0 /* Rotation */
#define SPINY 0.0, 0.0, 0.0

/* The following color definitions are ignored by the SIERP program
 and used only by FRACDRAW.
 PALETTE defines the 16-color VGA palette
 COLOR initializes a two-dimensional array with color values:
 each column in the array definition below corresponds to one level
 of detail, and each row corresponds to a "part", or transformation.
 Note that the array only needs to be 6 by 3 for the template defined
 above, but more rows are included in case the user inserts additional
 "parts".
*/


#define PALETTE {_BLACK, _RED, _GREEN, _CYAN, \
 _BLUE, _MAGENTA, _BROWN, _WHITE, \
 _GRAY, _LIGHTBLUE, _LIGHTGREEN, _LIGHTCYAN, \
 _LIGHTRED, _LIGHTMAGENTA, _LIGHTYELLOW, _BRIGHTWHITE}

#define COLOR {{2, 2, 2, 2, 2, 2},\
 {1, 1, 1, 1, 1, 1},\
 {5, 5, 5, 5, 5, 5},\
 {4, 4, 4, 4, 4, 4},\
 {2, 2, 2, 2, 2, 2},\
 {3, 3, 3, 3, 3, 3},\
 {7, 7, 7, 7, 7, 7},\
 {8, 8, 8, 8, 8, 8},\
 {1, 1, 1, 1, 1, 1}}






[LISTING THREE]

/*****************************************************************
 MAPLE.H --- Header file for maple tree template
 This (and the other header files like it) can be used to define
 the initial fractal template for the SIERP.C and FRACDRAW.C programs
*****************************************************************/
#define NPOINTS 4 /* Number of points on the "parent" polygon */
#define NTRANS 3 /* Number of transformed "children" */
#define NLEVELS 6 /* Number of levels to draw */
#define COUNT 10000 /* Number of dots to paint */
#define CENTERX 320 /* Center of the screen */
#define CENTERY 350
 /* The "parent" polygon */
#define SEEDX 6,20,-6,-12
#define SEEDY -120,120,120,-120
 /* The transformations which define the "children" */
#define MOVEX -6.1,-46,48 /* Displacement */
#define MOVEY -156,-40,-38
#define SIZEX .65,.57,.58 /* Size change */
#define SIZEY .56,.77,.82
#define SPINX 6.28,5.52,.44 /* Rotation */
#define SPINY 6.28,5.52,.44

/* The following color definitions are ignored by the SIERP program
 and used only by FRACDRAW. See similar #defines in SIERP.H (Listing 2)
*/
#define PALETTE {_BLACK, _RED, _GREEN, _CYAN, \
 _BLUE, _MAGENTA, _BROWN, _WHITE, \
 _GRAY, _LIGHTBLUE, _LIGHTGREEN, _LIGHTCYAN, \
 _LIGHTRED, _LIGHTMAGENTA, _LIGHTYELLOW, _BRIGHTWHITE}

#define COLOR {{6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2}}





[LISTING FOUR]

/*****************************************************************
 FRACDRAW.C -- Drawing with fractals
 Copyright 1990 by Dick Oliver, R1 Box 5140, Morrisville, VT 05661
 A program for interactive fractal drawing.
 The author makes no claims as to readability or suitability for a
 particular task, but I'll be happy to give advice and assistance.
*****************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <graph.h>
#include <math.h>
#include <ctype.h>
#include <bios.h>

 /* #include file for initial template definition, can be changed */
#include "maple.h"
 /* Numerical constants */
#define PI 3.141592
#define TWOPI 6.283185
#define HALFPI 1.570796
#define ALMOSTZERO 0.00002
#define MAXSIZE 0.998
#define MAXINT 32767
 /* Keyboard constants */
#define ENTER 13
#define BACKSPACE 8
#define ESC 27
#define END -'O'
#define HOME -'G'
#define INSERT -'R'
#define DELETE -'S'
#define TAB 9
#define UNTAB -15
#define UP -'H'
#define DN -'P'
#define LT -'K'
#define RT -'M'
#define NULLKEY '^'

 /* Generic getch() replacement */
#define geta if ((a = getch()) == 0) a = -getch();\
 else if (a > 0) a = toupper(a)

 /* Main menu */
#define MENUMSG "ACTION KEY\n"\
 " Draw D\n"\
 " Paint P\n"\
 " Both B\n"\
 " Next Part Tab\n"\
 " NextPoint ~\n"\
 " Insert Ins\n"\
 " Delete Del\n"\
 " Grow *\n"\
 " Shrink /\n"\
 " Spin + -\n"\
 " Skew ; \'\n"\
 " Squish [\n"\
 " Stretch ]\n"\
 " Quit ESC\n\n\n"\
 " DRAWING\n WITH\n FRACTALS\n\n"\
 " (C) 1990 by\n Dick Oliver"

#define MENUKEYS {'D', 'P', 'B', TAB, '`', INSERT, DELETE, \
 '*', '/', '+', ';', '[', ']', ESC}
#define MENUWD 15 /* width of menu in characters */
#define NMENUITEMS 14 /* number of menu items */
#define HAND 64 /* size of handle in pixels */
#define MAXPTS 19 /* max. no. of points on seed */
#define MAXTRANS 19 /* max. no. of parts of template */

/* template variables:
 spininc is the amount to rotate (in radians) each time spin is picked
 sizeinc is the amount to grow or shrink
 ra, rb, rc, rd, rmx, and rmy are the reverse of the initial transformation
 fa, fb, fc, and fd are the transformations computed from
 sizex, sizey, spinx, and spiny
 movex and movey are the translation part of the transformations
 asprat is the aspect ratio (always 1 for VGA)
 fx and fy are used for various temporary storage purposes
 x and y are the points on the seed polygon */

float sizeinc = 0.16, spininc = PI / 16,
 ra, rb, rc, rd, rmx, rmy,
 fa[MAXTRANS + 1], fb[MAXTRANS + 1], fc[MAXTRANS + 1], fd[MAXTRANS + 1],
 sizex[MAXTRANS + 1] = {1, SIZEX}, sizey[MAXTRANS + 1] = {1, SIZEY},
 spinx[MAXTRANS + 1] = {0, SPINX}, spiny[MAXTRANS + 1] = {0, SPINY},
 movex[MAXTRANS + 1] = {CENTERX, MOVEX},
 movey[MAXTRANS + 1] = {CENTERY, MOVEY},
 asprat, fx, fy, x[MAXPTS] = {SEEDX}, y[MAXPTS] = {SEEDY};

 /* menu vars */
char a, menukeys[] = MENUKEYS;

/* xtop, etc. are the points on the handle
 drawclr is the text and handle color
 xx, yy, midx, and midy are used for various pixel-shuffling operations
 menuitem is the current menu choice
 hand is the size of the handle in pixels
 sk is used to keep track of when to re-sketch the template
 xo and yo are the current template corners for quick erasing
 thispt & thistran are the current point/part
 npts is the number of points, ntrans is the number of parts,
 level is the number of levels of detail to draw or paint
 color determines the color to make each part at each level
*/
int xtop, ytop, xctr, yctr, xlft, ylft, xrgt, yrgt,
 drawclr, i, j, xx, yy, midx, midy, menuitem = 0, hand = HAND, sk,
 xo[MAXTRANS + 1][MAXPTS], yo[MAXTRANS + 1][MAXPTS], moveinc = 16,
 thispt = 0, thistran = 0, npts = NPOINTS, ntrans = NTRANS,
 level = NLEVELS - 1, color[MAXTRANS][NLEVELS] = COLOR;

 /* ptmode means we're in "point" mode rather than "part" mode*/
enum {OFF, ON} ptmode = OFF;

 /* standard Microsoft video variables */
struct videoconfig vc;
long palette[16] = PALETTE;

 /* these function prototypes are needed to avoid confusion about parameter
 * types (most of the functions aren't prototyped) */
void draw(float a, float b, float c, float d, float mx, float my, int iter);
void warp(float spinxinc, float spinyinc, float sizexinc, float sizeyinc);

main()
{ hello(); /* initialize everything */
 while(1) /* the main event-processing loop */
 { geta; /* geta is a #define */
 switch(a) /* what shall we do now? */
 { case BACKSPACE:
 case ' ':
 /* move ">" to new menu item */
 _settextposition(menuitem + 2, 1);
 _outtext(" ");
 if (a == ' ')
 { if (++menuitem == NMENUITEMS) menuitem = 0;
 }
 else if (--menuitem < 0) menuitem = NMENUITEMS - 1;
 _settextposition(menuitem + 2, 1);
 _outtext(">");
 break;
 case ENTER: /* pick a menu item */
 ungetch(menukeys[menuitem]);
 break;
 default:
 sk = 0;
 switch(a)
 { case LT: case DN:
 case RT: case UP:
 /* move a point or part of the template */
 xx = 0, yy = 0;
 switch (a)
 { case LT: xx = -moveinc; break;
 case RT: xx = moveinc; break;
 case UP: yy = -moveinc; break;
 case DN: yy = moveinc; break;
 }
 if (!ptmode && (thistran == 0))
 *movex += xx, *movey += yy;
 else
 { if (ptmode)
 { x[thispt] += xx * ra + yy * rb;
 y[thispt] += xx * rc + yy * rd;
 }
 else movex[thistran] += xx * ra + yy * rb,
 movey[thistran] += xx * rc + yy * rd;
 }
 break;

 case '/': /* Shrink */
 if (ptmode)
 { fx = 1 / (sizeinc + 1);
 warp(0.0, 0.0, fx, fx);
 }
 else
 { if ((sizex[thistran] /= sizeinc + 1) == 0)
 sizex[thistran] = ALMOSTZERO;
 if ((sizey[thistran] /= sizeinc + 1) == 0)
 sizey[thistran] = ALMOSTZERO;
 computef(thistran);
 }
 break;
 case '*': /* Grow */
 if (ptmode) warp(0.0, 0.0,sizeinc+1, sizeinc+1);
 else
 { if (((sizex[thistran] *= sizeinc + 1)
 > MAXSIZE)
 && (thistran > 0))
 sizex[thistran] = MAXSIZE;
 if (((sizey[thistran] *= sizeinc + 1)
 > MAXSIZE)
 && (thistran > 0))
 sizey[thistran] = MAXSIZE;
 computef(thistran);
 }
 break;
 case '[': /* Squish x-axis */
 if (ptmode) warp(0.0, 0.0, 1/(sizeinc + 1), 1.0);
 else
 { if ((sizex[thistran] /= (sizeinc + 1)) == 0)
 sizex[thistran] = ALMOSTZERO;
 computef(thistran);
 }
 break;
 case ']': /* Stretch x-axis */
 if (ptmode) warp(0.0, 0.0, sizeinc + 1, 1.0);
 else
 { if (((sizex[thistran] *= sizeinc + 1)
 > MAXSIZE)
 && (thistran > 0))
 sizex[thistran] = MAXSIZE;
 computef(thistran);
 }
 break;
 case '-': /* Spin counter-clockwise */
 if (ptmode) warp(-spininc, -spininc, 1.0, 1.0);
 else
 { if ((spinx[thistran] -= spininc) < 0)
 spinx[thistran] += TWOPI;
 if ((spiny[thistran] -= spininc) < 0)
 spiny[thistran] += TWOPI;
 computef(thistran);
 }
 break;
 case '+': /* Spin clockwise */
 if (ptmode) warp(spininc, spininc, 1.0, 1.0);
 else
 { if ((spinx[thistran] += spininc) >= TWOPI)
 spinx[thistran] -= TWOPI;
 if ((spiny[thistran] += spininc) >= TWOPI)
 spiny[thistran] -= TWOPI;
 computef(thistran);
 }
 break;
 case ';': /* Skew x-axis counter-clockwise */
 if (ptmode) warp(spininc, 0.0, 1.0, 1.0);
 else
 { if ((spinx[thistran] += spininc) >= TWOPI)
 spinx[thistran] -= TWOPI;
 computef(thistran);
 }
 break;
 case '\'': /* Skew x-axis clockwise */
 if (ptmode) warp(-spininc, 0.0, 1.0, 1.0);
 else
 { if ((spinx[thistran] -= spininc) < 0)
 spinx[thistran] += TWOPI;
 computef(thistran);
 }
 break;
 case '`': /* NextPoint */
 if (ptmode) ++thispt;
 else ptmode = ON, thistran = 0;
 if (thispt >= npts) thispt = 0;
 break;
 default:
 switch(a)
 { case TAB: /* Next part */
 if (ptmode)
 { ptmode = OFF;
 midpoint();
 }
 else
 { if (++thistran > ntrans) thistran = 0;
 }
 break;
 case 'D':
 case 'P':
 case 'B': /* Draw and/or Paint */
 _clearscreen(_GCLEARSCREEN);
 _setcliprgn(0, 0,
 vc.numxpixels - 1,
 vc.numypixels - 1);
 _setcolor(**color);
 if ((a == 'D') || (a == 'B'))
 draw(*fa, *fb, *fc, *fd,
 *movex, *movey, level);
 if ((a == 'P') || (a == 'B')) paint();
 printf("\7");
 getch();
 _clearscreen(_GCLEARSCREEN);
 printmenu();
 break;
 case ESC: /* Quit */
 _setvideomode(_DEFAULTMODE);
 printf("Seeyalater!");
 exit(0);

 case INSERT: /* Insert a point or part */
 if (ptmode)
 { if (npts < MAXPTS)
 { erase();
 ++npts;
 for (i = npts - 1; i > thispt; i--)
 x[i] = x[i - 1],
 y[i] = y[i - 1];
 if (thispt > 0)
 xx = x[thispt - 1],
 yy = y[thispt - 1];
 else xx = x[npts - 1],
 yy = y[npts - 1];
 if ((xx == x[thispt]) &&
 (yy == y[thispt]))
 x[thispt] += moveinc,
 y[thispt] += moveinc;
 else x[thispt] =
 (xx + x[thispt]) / 2,
 y[thispt] =
 (yy + y[thispt]) / 2;
 }
 else printf("\7");
 }
 else
 { if ((ntrans < MAXTRANS) && (ntrans > 0))
 { ++ntrans;
 for (i = ntrans; i > thistran; i--)
 { if (i > 1)
 { movex[i] = movex[i - 1];
 movey[i] = movey[i - 1];
 spinx[i] = spinx[i - 1];
 spiny[i] = spiny[i - 1];
 sizex[i] = sizex[i - 1];
 sizey[i] = sizey[i - 1];
 for (j = 0; j < NLEVELS;
 j++)
 color[i - 1][j] =
 color[i - 2][j];
 fa[i] = fa[i - 1];
 fb[i] = fb[i - 1];
 fc[i] = fc[i - 1];
 fd[i] = fd[i - 1];
 }
 else
 { spinx[1] = 0;
 spiny[1] = 0;
 sizex[1] = sizey[1];
 computef(1);
 }
 }
 if (thistran == 0) thistran = 1,i = 1;
 if (thistran > 1) j = thistran - 1;
 else j = ntrans;
 if ((movex[i] == movex[j]) &&
 (movey[i] == movey[j]))
 movex[i] += moveinc,
 movey[i] += moveinc;
 else movex[i] =
 (movex[i] + movex[j]) / 2,
 movey[i] =
 (movey[i] + movey[j]) / 2;
 }
 else
 { if (ntrans == 0) thistran = ++ntrans;
 else printf("\7");
 }
 }
 break;
 case DELETE: /* Delete a point or part */
 erase();
 if (ptmode)
 { if (npts > 1)
 { if (thispt == --npts) --thispt;
 else for (i = thispt; i < npts; i++)
 x[i] = x[i + 1],
 y[i] = y[i + 1];
 }
 else printf("\7");
 }
 else
 { if (ntrans > 0)
 { --ntrans;
 }
 else printf("\7");
 if (ntrans > 0)
 { if (thistran == 0) thistran = 1;
 else
 for (i = thistran;
 i <= ntrans; i++)
 { movex[i] = movex[i + 1];
 movey[i] = movey[i + 1];
 spinx[i] = spinx[i + 1];
 spiny[i] = spiny[i + 1];
 sizex[i] = sizex[i + 1];
 sizey[i] = sizey[i + 1];
 for (j = 0; j < NLEVELS;
 j++)
 color[i - 1][j] =
 color[i][j];
 fa[i] = fa[i + 1];
 fb[i] = fb[i + 1];
 fc[i] = fc[i + 1];
 fd[i] = fd[i + 1];
 }
 }
 if (thistran > ntrans) --thistran;
 }
 }
 sk = 1;
 }
 erase();
 sketch(sk);
 }
 }
}
 /* midpoint() -- find the center of the seed */
midpoint()
{ int xx, yy;
 midx = 0, midy = 0;
 for (i = 0; i < npts; i++) midx += x[i], midy += y[i];
 midx /= npts, midy /= npts;
 for (i = 0; i < npts; i++) x[i] -= midx, y[i] -= midy;
 for (i = 1; i <= ntrans; i++)
 { xx = midx * fa[i] + midy * fb[i];
 yy = midx * fc[i] + midy * fd[i];
 movex[i] -= midx - xx;
 movey[i] -= midy - yy;
 }
 xx = midx * *fa + midy * *fb,
 yy = midx * *fc + midy * *fd;
 *movex += xx,
 *movey += yy;
}

 /* compute the affine transformations expressed by the template */
computef(int i)
{ fa[i] = sizex[i] * cos(spinx[i]);
 fb[i] = -sizey[i] * sin(spiny[i]);
 fc[i] = sizex[i] * sin(spinx[i]);
 fd[i] = sizey[i] * cos(spiny[i]);
 if (i == 0)
 { if ((fx = *fa * *fd - *fb * *fc) == 0) fx = 0.001;
 ra = *fd / fx;
 rb = - *fb / fx;
 rc = - *fc / fx;
 rd = *fa / fx;
 }
}
 /* warp the seed shape (used to skew, squish, and stretch) */
void warp(float spinxinc, float spinyinc, float sizexinc, float sizeyinc)
{ float a, b, c, d, dsizex, dsizey, dspinx, dspiny;
 dspinx = spinxinc + *spinx;
 dspiny = spinyinc + *spiny;
 dsizex = sizexinc * *sizex;
 dsizey = sizeyinc * *sizey;
 a = cos(dspinx) * dsizex;
 b = -sin(dspiny) * dsizey;
 c = sin(dspinx) * dsizex;
 d = cos(dspiny) * dsizey;
 for (i = 0; i < MAXPTS; i++)
 { fx = x[i] * a + y[i] * b;
 fy = x[i] * c + y[i] * d;
 x[i] = fx * ra + fy * rb;
 y[i] = fx * rc + fy * rd;
 }
}

 /* sketch() -- sketch the template and the handle.
 * Note that the handle shows not only which part you are on,
 * but also the relative size and orientation of both axes. */
sketch(int all)
{ int i, j, x1, y1, inc, tran0 = 0;
 float x2, y2, a, b, c, d, mx, my;
 inc = hand;
 if (ptmode)
 { tran0 = 1;
 inc *= *sizey / 2;
 fx = x[thispt], fy = y[thispt];
 x1 = fx * *fa + fy * *fb + *movex;
 y1 = fx * *fc + fy * *fd + *movey;
 xctr = x1, yctr = (y1 + inc) * asprat;
 xtop = x1, ytop = (y1 - inc) * asprat;
 y1 *= asprat;
 xlft = x1 - inc, ylft = y1;
 xrgt = x1 + inc, yrgt = y1;
 }
 else
 { if (thistran == 0) x1 = 0, y1 = 0, tran0 = 1;
 else x1 = movex[thistran], y1 = movey[thistran];
 if (tran0) fx = x1, fy = y1 - inc;
 else fx = x1 - inc * fb[thistran],
 fy = y1 - inc * fd[thistran];
 xtop = fx * *fa + fy * *fb + *movex;
 ytop = (fx * *fc + fy * *fd + *movey) * asprat;
 xctr = x1 * *fa + y1 * *fb + *movex;
 yctr = (x1 * *fc + y1 * *fd + *movey) * asprat;
 inc /= 2;
 if (tran0) fx = x1 - inc, fy = y1;
 else fx = x1 - inc * fa[thistran],
 fy = y1 - inc * fc[thistran];
 xlft = fx * *fa + fy * *fb + *movex;
 ylft = (fx * *fc + fy * *fd + *movey) * asprat;
 if (tran0) fx = x1 + inc, fy = y1;
 else fx = x1 + inc * fa[thistran],
 fy = y1 + inc * fc[thistran];
 xrgt = fx * *fa + fy * *fb + *movex;
 yrgt = (fx * *fc + fy * *fd + *movey) * asprat;
 }
 _setcolor(**color);
 for (j = 0; j < npts; j++)
 { x1 = x[j] * *fa + y[j] * *fb + *movex;
 y1 = (x[j] * *fc + y[j] * *fd + *movey) * asprat;
 (*xo)[j] = x1, (*yo)[j] = y1;
 if (j == 0) _moveto(x1, y1);
 else _lineto(x1, y1);
 }
 _lineto(**xo, **yo);
 for (i = 1; i <= ntrans; i++)
 { if ((thistran == 0) || (i == thistran) || (all))
 { _setcolor(color[i - 1][level]);
 a = fa[i] * *fa + fc[i] * *fb;
 b = fb[i] * *fa + fd[i] * *fb;
 c = fa[i] * *fc + fc[i] * *fd;
 d = fb[i] * *fc + fd[i] * *fd;
 mx = movex[i] * *fa + movey[i] * *fb + *movex;
 my = movex[i] * *fc + movey[i] * *fd + *movey;
 for (j = 0; j < npts; j++)
 { x1 = a * x[j] + b * y[j] + mx;
 y1 = (c * x[j] + d * y[j] + my) * asprat;
 if (j == 0) _moveto(x1, y1);
 else _lineto(x1, y1);
 xo[i][j] = x1, yo[i][j] = y1;
 }
 _lineto(*(xo[i]), *(yo[i]));
 }
 }
 _setcolor(drawclr);
 _moveto(xtop, ytop);
 _lineto(xctr, yctr);
 _moveto(xlft, ylft);
 _lineto(xrgt, yrgt);
}
 /* erase the template */
erase()
{ _setcolor(0);
 _moveto(**xo, **yo);
 for (i = 1; i < npts; i++) _lineto((*xo)[i], (*yo)[i]);
 _lineto(**xo, **yo);
 for (i = 1; i <= ntrans; i++)
 { if ((thistran == 0) || (i == thistran))
 { _moveto(*(xo[i]), *(yo[i]));
 for (j = 0; j < npts; j++) _lineto(xo[i][j], yo[i][j]);
 _lineto(*(xo[i]), *(yo[i]));
 }
 }
 _moveto(xtop, ytop);
 _lineto(xctr, yctr);
 _moveto(xlft, ylft);
 _lineto(xrgt, yrgt);
}

 /* paint() -- uses the "Chaos Game", or "random iteration" algorithm
 * to paint the "infinite-level" fractal on the screen. */
paint()
{ int i, j, p[MAXTRANS], tc, tp, ci[NLEVELS], cc = 0, mx, my;
 unsigned long ct = COUNT;
 float x1 = 0.0, y1 = 0.0, x2, y2 = 0, sx[MAXTRANS], sy[MAXTRANS];
 mx = *movex, my = *movey;

 /* First, we need to compute the relative area of each part of the
 template. This is done by comparing the size of the determinants
 of the matrix (a,b,c,d). These weights are then used to decide
 how often to visit each part--big parts get more visits than small
 ones, giving the overall fractal an even density. */
 for (i = 1; i <= ntrans; i++)
 y2 += (sx[i - 1] =
 fabs(fa[i] * fd[i] - fb[i] * fc[i]));
 if (y2 == 0) y2 = 0.01;
 x2 = MAXINT / y2;
 j = 0;
 for (i = 0; i < ntrans; i++)
 { if ((xx = sx[i] * x2) == 0) xx = 1;
 p[i] = (j += xx);
 }

 /* We skip the first eight points on our journey, because it may take
 that long to settle down from wherever we started onto the fractal.*/
 for (j = 0; j < 8; j++)
 { i = rand() % ntrans + 1;
 x2 = x1 * fa[i] + y1 * fb[i] + movex[i];
 y2 = x1 * fc[i] + y1 * fd[i] + movey[i];
 x1 = x2, y1 = y2;
 ci[cc] = i - 1; /* store the 0-based part index, as the main loop does */
 if (++cc == level) cc = 0;

 }

 /* Now we put it on the screen. The cc, tc, and ci variables are used
 to determine coloring. At each iteration, we choose a level at
 random out of all the levels we will be coloring. We then find the
 color for that level based on which part was "visited" that many
 iterations ago. How does this work? Each iteration of the orbit
 goes from a point on the "whole" to a point on a "part". Therefore,
 if we were in part #3 one iteration ago, we are now in a level 1
 reflection of part #3 within the current part. If we were at part #5
 two cycles ago, we are now painting a point within a level 2
 reflection of part #5, and so on. */
 while(!kbhit() && (--ct != 0))
 { j = rand();
 for (i = 0; i < ntrans; i++) if (j < p[i]) break;
 i++;
 x2 = x1 * fa[i] + y1 * fb[i] + movex[i];
 y2 = x1 * fc[i] + y1 * fd[i] + movey[i];
 x1 = x2, y1 = y2;
 ci[cc] = i - 1;
 j = rand() % level;
 if ((i = cc - j) < 0) i += level;
 if ((tc = color[ci[i]][j + 1]) > 0)
 { _setcolor(tc);
 _setpixel((int) (x2 * *fa + y2 * *fb + mx),
 (int) ((x2 * *fc + y2 * *fd + my) * asprat));
 }
 if (++cc == level) cc = 0;
 }
}

/* draw() -- uses the "successive approximation" algorithm to draw
 the fractal.*/
void draw(float a, float b, float c, float d, float mx, float my, int iter)
{ int i;

 /* When we start drawing, iter is the number of levels to draw.
 Each time we go down one level, we decrement iter. */
 iter--;

 /* If user hits ESC, pass that keypress up through the recursive
 calls until we get back to the main procedure.*/
 if (kbhit() && (getch() == ESC))
 { ungetch(ESC);
 return;
 }

 /* Draw a reflection of the seed polygon using the current
 transformation */
 for (i = 0; i < npts; i++)
 { fx = x[i] * a + y[i] * b + mx;
 fy = x[i] * c + y[i] * d + my;
 if (i == 0)
 { xx = fx, yy = fy;
 _moveto((int) fx, (int) (fy * asprat));
 }
 else _lineto((int) fx, (int) (fy * asprat));
 }
 _lineto((int) xx, (int) (yy * asprat));


 /* If iter has counted all the way down to zero, don't draw the next
 deepest level, but back out one level instead */

 if (iter < 0) return;
 else
 { /* Call draw recursively for each transformation, drawing the
 next deepest level of each part */
 for (i = 1; i <= ntrans; i++)
 { _setcolor(color[i - 1][level - iter]);
 draw(fa[i] * a + fc[i] * b,
 fb[i] * a + fd[i] * b,
 fa[i] * c + fc[i] * d,
 fb[i] * c + fd[i] * d,
 a * movex[i] + b * movey[i] + mx,
 c * movex[i] + d * movey[i] + my,
 iter);
 }
 }
}

 /* display the menu */
printmenu()
{ _settextwindow(1, vc.numtextcols - MENUWD,
 vc.numtextrows, vc.numtextcols);
 _clearscreen(_GWINDOW);
 _outtext(MENUMSG);
 _settextposition(menuitem + 2, 1);
 _outtext(">");
 _setcliprgn(0, 0, vc.numxpixels - (MENUWD + 1) *
 (vc.numxpixels / vc.numtextcols) - 1,
 vc.numypixels - 1);
}

 /* hello() -- initialize everything */
hello()
{ if (!_setvideomode(_VRES16COLOR))
 { printf("Can't set video mode. VGA required.");
 }
 _getvideoconfig(&vc);
 _remapallpalette(palette);
 _wrapon(_GWRAPOFF);
 _clearscreen(_GCLEARSCREEN);
 printmenu();
 asprat = (float) (4 * vc.numypixels) / (float) (3 * vc.numxpixels);
 drawclr = vc.numcolors - 1;
 for (i = 0; i <= ntrans; i++) computef(i);
 sketch(0);
}
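
The area-weighting step that paint() describes in its comments can be tried on its own, away from the graphics library. What follows is a minimal sketch under stated assumptions, not a piece of FRACDRAW.C: the Affine type and the names area_weights, pick_map, and chaos_step are hypothetical, and the weights are kept as cumulative fractions in [0,1) rather than as integer thresholds scaled to MAXINT as in the listing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type: one affine map
   (x,y) -> (a*x + b*y + mx, c*x + d*y + my). */
typedef struct { double a, b, c, d, mx, my; } Affine;

/* Fill cum[0..n-1] with cumulative selection weights proportional to
   the absolute determinant of each map, normalized so cum[n-1] is 1.
   A map with a zero determinant gets a small floor so that it is
   still visited now and then. */
void area_weights(const Affine *t, int n, double *cum)
{
    double total = 0.0;
    int i;
    assert(n > 0);
    for (i = 0; i < n; i++) {
        double w = t[i].a * t[i].d - t[i].b * t[i].c;
        if (w < 0.0) w = -w;
        if (w < 1e-6) w = 1e-6;
        total += w;
        cum[i] = total;
    }
    for (i = 0; i < n; i++) cum[i] /= total;
}

/* Pick a map index for a random number r in [0,1). */
int pick_map(const double *cum, int n, double r)
{
    int i;
    for (i = 0; i < n - 1; i++)
        if (r < cum[i]) break;
    return i;
}

/* Advance the orbit one step by applying a randomly chosen map to
   (*x,*y). Returns the index of the map that was used. */
int chaos_step(const Affine *t, const double *cum, int n,
               double *x, double *y)
{
    double r = rand() / (RAND_MAX + 1.0);
    int i = pick_map(cum, n, r);
    double nx = t[i].a * *x + t[i].b * *y + t[i].mx;
    double ny = t[i].c * *x + t[i].d * *y + t[i].my;
    *x = nx;
    *y = ny;
    return i;
}
```

A caller would run area_weights() once, discard the first few results of chaos_step() while the orbit settles, and then plot each later (x,y). The index that chaos_step() returns is the value a coloring scheme like the one in paint() would remember from step to step.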





[LISTING FIVE]

/*****************************************************************
 FLAKE.H --- Header file for snowflake template
 This (and the other header files like it) can be used to define
 the initial fractal template for the SIERP.C and FRACDRAW.C programs.
 This template models the crystallization process to draw a realistic
 snowflake shape.
 See additional comments in SIERP.H
*****************************************************************/
#define NPOINTS 6 /* Number of points on the "parent" polygon */
#define NTRANS 6 /* Number of transformed "children" */
#define NLEVELS 5 /* Number of levels to draw */
#define COUNT 10000 /* Number of dots to paint */
#define CENTERX 320 /* Center of the screen */
#define CENTERY 240
#define SEEDX 1,21,21,1,-21,-21 /* The "parent" polygon */
#define SEEDY -27,-15,9,40,9,-15

/* The transformations which define the "children" */
#define MOVEX -1,55,55,1,-55,-55 /* Displacement */
#define MOVEY -65,-35,35,65,35,-35
#define SIZEX .18,.18,.18,.18,.18,.18 /* Size change */
#define SIZEY .91,.91,.91,.91,.91,.91
#define SPINX 0,1.05,2.09,3.14,4.19,5.24 /* Rotation */
#define SPINY 0,1.05,2.09,3.14,4.19,5.24

#define PALETTE {_BLACK, _RED, _GREEN, _CYAN, \
 _BLUE, _MAGENTA, _BROWN, _WHITE, \
 _GRAY, _LIGHTBLUE, _LIGHTGREEN, _LIGHTCYAN, \
 _LIGHTRED, _LIGHTMAGENTA, _LIGHTYELLOW, _BRIGHTWHITE}

#define COLOR {{15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15}}





[LISTING SIX]

/****************************************************************
 CUMULUS.H--- Header file for cumulus cloud template
 This (and the other header files like it) can be used to define
 the initial fractal template for the SIERP.C and FRACDRAW.C programs
 See additional comments in SIERP.H
****************************************************************/

#define NPOINTS 7 /* Number of points on the "parent" polygon */
#define NTRANS 6 /* Number of transformed "children" */
#define NLEVELS 5 /* Number of levels to draw */
#define COUNT 10000 /* Number of dots to paint */
#define CENTERX 320 /* Center of the screen */
#define CENTERY 240
#define SEEDX 23,17,-4,-10,-27,7,44 /* The "parent" polygon */
#define SEEDY 0,55,55,0,9,-66,10


/* The transformations which define the "children" */
#define MOVEX 85,-94,-3,51.5,-49,0 /* Displacement */
#define MOVEY 15,13,3.5,-35,-40,40
#define SIZEX .36,.4,.53,.48,.4,.87 /* Size change */
#define SIZEY .36,.47,.53,.53,.4,.33
#define SPINX .25,6,6.2,0.15,6.2,0 /* Rotation */
#define SPINY .25,6,6.2,0.15,6.2,6.3

#define PALETTE {_BLACK, _RED, _GREEN, _CYAN, \
 _BLUE, _MAGENTA, _BROWN, _WHITE, \
 _GRAY, _LIGHTBLUE, _LIGHTGREEN, _LIGHTCYAN, \
 _LIGHTRED, _LIGHTMAGENTA, _LIGHTYELLOW, _BRIGHTWHITE}

#define COLOR {{15,15,15, 7, 8},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15}}

/*****************************************************************
 CIRRUS.H-- Header file for cirrus cloud template
 In this and the other cloud models (STRATUS and CUMULUS), the air
 currents which control cloud formation are modelled as arrows.
 When those forces are reflected throughout all levels of scale,
 a realistic image of the cloud type that results from those air
 currents appears. Cirrus clouds are characterized by high-altitude
 rising cross winds, stratus clouds by slow horizontal air flow,
 and cumulus clouds by warm air rising from the ground.
*****************************************************************/

#define NPOINTS 7 /* Number of points on the "parent" polygon */
#define NTRANS 6 /* Number of transformed "children" */
#define NLEVELS 5 /* Number of levels to draw */
#define COUNT 10000 /* Number of dots to paint */
#define CENTERX 180 /* Center of the screen */
#define CENTERY 240
#define SEEDX 16,-27,-42,-7,-27,54,23 /* The "parent" polygon */
#define SEEDY 33,52,36,-7,-13,-43,36

/* The transformations which define the "children" */
#define MOVEX 143.4,-90,5,-4 /* Displacement */
#define MOVEY 11.08,13.6,-15.5,45
#define SIZEX .75,.43,.38,.75 /* Size change */
#define SIZEY .45,.47,.44,.21
#define SPINX 6.07,0.05,0.02,0.0 /* Rotation */
#define SPINY 6.07,0.05,0.02,6.28

#define PALETTE {_BLACK, _RED, _GREEN, _CYAN, \
 _BLUE, _MAGENTA, _BROWN, _WHITE, \
 _GRAY, _LIGHTBLUE, _LIGHTGREEN, _LIGHTCYAN, \
 _LIGHTRED, _LIGHTMAGENTA, _LIGHTYELLOW, _BRIGHTWHITE}

#define COLOR {{15,15,15, 7, 8},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15}}


/*****************************************************************
 LEAF.H --- Header file for maple leaf template
 This is the "geometric genetic code" for a maple leaf. Note the close
 similarity to the MAPLE tree template.
*****************************************************************/
#define NPOINTS 4 /* Number of points on the "parent" polygon */
#define NTRANS 4 /* Number of transformed "children" */
#define NLEVELS 6 /* Number of levels to draw */
#define COUNT 10000 /* Number of dots to paint */
#define CENTERX 320 /* Center of the screen */
#define CENTERY 350
#define SEEDX 6,20,-6,-12 /* The "parent" polygon */
#define SEEDY -120,120,120,-120

/* The transformations which define the "children" */
#define MOVEX -1.15,-53,51,-6 /* Displacement */
#define MOVEY 35,7,6,-111
#define SIZEX 0.14,0.62,0.65,0.49 /* Size change */
#define SIZEY 0.51,0.72,0.68,0.51
#define SPINX 6.26,5.47,0.81,6.28 /* Rotation */
#define SPINY 6.26,5.47,0.81,6.28

#define PALETTE {_BLACK, _RED, _GREEN, _CYAN, \
 _BLUE, _MAGENTA, _BROWN, _WHITE, \
 _GRAY, _LIGHTBLUE, _LIGHTGREEN, _LIGHTCYAN, \
 _LIGHTRED, _LIGHTMAGENTA, _LIGHTYELLOW, _BRIGHTWHITE}

#define COLOR {{6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2},\
 {6, 6,14,14,10, 2}}

/*****************************************************************
 MOUSE.H -- Header file for cartoon mouse template
 This template was made by tracing the rough outline of an image and
 tiling with copies of itself.
*****************************************************************/
#define NPOINTS 14 /* Number of points on the "parent" polygon */
#define NTRANS 18 /* Number of transformed "children" */
#define NLEVELS 2 /* Number of levels to draw */
#define COUNT 50000 /* Number of dots to paint */
#define CENTERX 320 /* Center of the screen */
#define CENTERY 240
#define SEEDX 131,140,97,6,-56,-97,-146,-148,-121,-101,-47,-3,32,29
#define SEEDY 5,55,99,133,121,70,21,-17,-17,-31,-20,-78,-93,-71


/* The transformations which define the "children" */
#define MOVEX -89,36,-60,-4,4,61,-71,1,81,-49,-133,-130,-8,-3,-36,-24,13,15
#define MOVEY -3,13,25,-35,-63,-43,101,116,56,87,-50,-24,104,-1,-20,-27,-16,76
#define SIZEX .31,.4,.62,.07,.19,.32,.19,.4,.55,.31,.12,.17,.21,.06,.06,\
 .08,.11,.42
#define SIZEY .18,.35,.44,.27,.27,.48,.06,.13,.33,.20,.04,.23,.12,.16,.14,\
 .2,.23,.2
#define SPINX 3.23,6.4,.32,2.72,5.84,4.61,.75,6.25,5.34,.29,3.16,5.9,3.04,\
 4.15,4.32,.91,1.1,2.55
#define SPINY 3.6,6.5,.77,.37,3.49,4.28,.75,6.25,5.27,.1,2.85,2.65,2.53,\
 3.56,3.73,3.73,3.53,5.89

#define PALETTE {_BLACK, _RED, _GREEN, _CYAN, \
 _BLUE, _MAGENTA, _BROWN, _WHITE, \
 _GRAY, _LIGHTBLUE, _LIGHTGREEN, _LIGHTCYAN, \
 _LIGHTRED, _LIGHTMAGENTA, _LIGHTYELLOW, _BRIGHTWHITE}

#define COLOR {{6,6},\
 {6,6},\
 {6,6},\
 {6,6},\
 {6,6},\
 {6,6},\
 {1,1},\
 {6,6}, {6,6}, {1,1}, {7,7}, {8,8}, {12,12},\
 {9,9}, {9,9}, {15,15}, {15,15}, {1,1}}

/*****************************************************************
 PINE.H --- Header file for pine tree template
 This (and the other header files like it) can be used to define
 the initial fractal template for the SIERP.C and FRACDRAW.C programs
*****************************************************************/

#define NPOINTS 4 /* Number of points on the "parent" polygon */
#define NTRANS 6 /* Number of transformed "children" */
#define NLEVELS 5 /* Number of levels to draw */
#define COUNT 10000 /* Number of dots to paint */
#define CENTERX 320 /* Center of the screen */
#define CENTERY 350
#define SEEDX 6,20,-6,-12 /* The "parent" polygon */
#define SEEDY -120,120,120,-120

/* The transformations which define the "children" */

#define MOVEX -41.2,36.9,5.13,-14.64,2.2,40.07 /* Displacement */
#define MOVEY 14.987,-61.31,7.10,-32.33,-50.46
#define SIZEX 0.39,0.41,0.52,0.35,0.86,0.37 /* Size change */
#define SIZEY 0.39,0.31,0.17,0.24,0.79,0.42
#define SPINX 5.62,0.61,6.15,5.43,3.27,0.54 /* Rotation */
#define SPINY 4.91,1.27,0.13,4.71,6.28,1.4

#define PALETTE {_BLACK, _RED, _GREEN, _CYAN, \
 _BLUE, _MAGENTA, _BROWN, _WHITE, \
 _GRAY, _LIGHTBLUE, _LIGHTGREEN, _LIGHTCYAN, \
 _LIGHTRED, _LIGHTMAGENTA, _LIGHTYELLOW, _BRIGHTWHITE}

#define COLOR {{6, 6,14,10, 2},\
 {6, 6,14,10, 2},\
 {6, 6,14,10, 2},\
 {6, 6,14,10, 2},\
 {6, 6,14,10, 2},\
 {6, 6,14,10, 2},\
 {6, 6,14,10, 2},\
 {6, 6,14,10, 2},\
 {6, 6,14,10, 2}}
/*****************************************************************
 STRATUS.H -- Header file for stratus cloud template
****************************************************************/

#define NPOINTS 6 /* Number of points on the "parent" polygon */
#define NTRANS 5 /* Number of transformed "children" */
#define NLEVELS 5 /* Number of levels to draw */
#define COUNT 10000 /* Number of dots to paint */
#define CENTERX 320 /* Center of the screen */
#define CENTERY 240
#define SEEDX 40,80,40,-40,-80,-40 /* The "parent" polygon */
#define SEEDY 22,-2,-22,-22,2,22

/* The transformations which define the "children" */
#define MOVEX -70,-44,45,60,-3.3 /* Displacement */
#define MOVEY 11,-34,-31,1.6,42
#define SIZEX .75,.43,.38,.75,.8 /* Size change */
#define SIZEY .45,.47,.44,.61,.2
#define SPINX 6.3,6.3,6.3,6.3,0 /* Rotation */
#define SPINY 6.3,6.3,6.3,6.3,6.3

#define PALETTE {_BLACK, _RED, _GREEN, _CYAN, \
 _BLUE, _MAGENTA, _BROWN, _WHITE, \
 _GRAY, _LIGHTBLUE, _LIGHTGREEN, _LIGHTCYAN, \
 _LIGHTRED, _LIGHTMAGENTA, _LIGHTYELLOW, _BRIGHTWHITE}

#define COLOR {{15,15,15, 7, 8},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15},\
 {15,15,15,15,15}}
/*****************************************************************
 ZOOM.H -- Header file for abstract zooming artwork template
 Note that this very simple template creates a visually complex image,
 while seemingly simple linear forms like the cartoon mouse require
 much more work and information to express as fractals. When drawing
 with fractals, richly detailed trees are simpler and easier to create
 than smooth objects like tables and chairs!
*****************************************************************/

#define NPOINTS 5 /* Number of points on the "parent" polygon */
#define NTRANS 5 /* Number of transformed "children" */
#define NLEVELS 4 /* Number of levels to draw */
#define COUNT 20000 /* Number of dots to paint */
#define CENTERX 320 /* Center of the screen */
#define CENTERY 340
#define SEEDX -66,-334,60,272,66
#define SEEDY -7,100,-120,-27,55


/* The transformations which define the "children" */

#define MOVEX 55,104,185,30,-45
#define MOVEY -309,-1,-50,28,-25
#define SIZEX .27,.36,.5,.28,.98
#define SIZEY .27,.27,.18,.21,.5
#define SPINX 4.71,3.88,3.34,4.3,6
#define SPINY 2.48,.93,.39,1.16,6

#define PALETTE {_BLUE, _RED, _GREEN, _CYAN, \
 _BLACK, _MAGENTA, _BROWN, _WHITE, \
 _GRAY, _LIGHTBLUE, _LIGHTGREEN, _LIGHTCYAN, \
 _LIGHTRED, _LIGHTMAGENTA, _LIGHTYELLOW, _BRIGHTWHITE}

#define COLOR {{15,15,15,15},\
 {14,14,14,14},\
 {14,14,14,14},\
 {14,14,14,14},\
 {14,14,14,14},\
 {15,15,15,15},\
 {15,15,15,15}}





April, 1991
PROGRAMMING PARADIGMS


Building Xanadu




Michael Swaine


After 30 years, Ted Nelson's dream of universal hypertext is close to
realization. Last fall, Autodesk bought Ted's Xanadu project, or an important
piece of it. Later this year, if all goes well, Autodesk will release
Xanadu/Server, a radically different approach to data storage and perhaps the
most ambitious multimedia and hypertext system ever commercially available.
The release of Xanadu/Server should be enormously satisfying to Ted, and to
the faithful who have followed the glacial movement of Xanadu toward
producthood over the decades. And it should be big news: There is likely to be
a flurry of press coverage around the release, for several reasons. The
product is genuinely interesting, and Autodesk will put some effort into
promoting the release. But the event will also get press because Ted Nelson is
always quotable.
Unfortunately, some of that press coverage is likely to be wrong, because the
product and its connection with the 30-year-long Project Xanadu are
complicated. And what Autodesk is building is not the whole of Project Xanadu,
but a key component. DDJ will probably have more to say about Ted,
Xanadu/Server, Project Xanadu, and the dream of universal hypertext when
Autodesk actually releases the product. But before the noise begins, here is
an advance look at a radical piece of software, and at the radical guy who
dreamed it up.


Beyond Files


"People referring to Xanadu have a quite natural confusion between Xanadu the
server and Xanadu the projected world-wide publishing repository," Ted
explains over dinner in a restaurant down the street from the Palo Alto
offices of Xanadu Operating Company (XOC). XOC is the Autodesk subsidiary that
is developing Xanadu the server. We've just come from the XOC offices, where
Ted has reassembled virtually his entire original Xanadu team: Roger Gregory,
Mark Miller, Eric Hill, Roland King, and others. What they are working on at
XOC is merely Xanadu the server; that's all Autodesk bought. Xanadu the
projected world-wide publishing repository remains the exclusive province of
Theodor Holm Nelson.
I wonder if I have just heard one of the reasons for this "quite natural
confusion" when Ted immediately goes on to tell me that the server and the
publishing repository are really the same idea. This idea has obsessed him for
30 years, spinning off several books, entertaining lectures, and provocative
views on software interface design as a branch of cinema. Perhaps only someone
with Ted's views on the interconnectedness of information would think of this
web of themes as a single idea.
If it is one idea, it has many faces. "There are a thousand ways to describe
it," he says. The Evil of Files is one approach. The phrase comes from a
seminar Ted gave this spring in one of Terry Winograd's classes at Stanford.
"Most people act as though God invented files," he grumbles over a salmon
steak. "Because it is simple to implement a hierarchical directory structure,
and it is simple to implement a model in which all of memory is cleared and
you bring in something new as a great lump, this comes to seem the way
computers ought to work."
Ted's rejection of this model is total. "From the very beginning, before I
really understood that model -- well, I never deeply accepted it. I understood
it but I said, 'Wait a minute. What we need is --' and I've been on that same
path ever since."
He doesn't have a single name for that path, either. "You can call it
multithreading or zippered lists or whichways, the term I used in the 1970s.
The basic idea is, you have a pool of material and the different objects you
want to deal with are threads through the pool of material. So you can pull
out the threads. The same object can be on many of these different threads."
This is fundamentally different from the file model. "In the file model, if
you want to use the same material twice you copy it into the different files.
So it loses its identity."
In Ted's model, the Xanadu model, "each document will be able to use the same
material in different ways, and if you do it right you'll be able to find out
which documents use the same material and see them side-by-side."
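To make the pool-and-threads idea concrete, here is a minimal sketch in C. It is not Xanadu code and says nothing about how XOC stores anything: the Span and Doc types and both function names are hypothetical, the pool is assumed to be a single in-memory array, and the spans within any one document are assumed not to overlap each other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: a span names a run of bytes in one shared pool. */
typedef struct { size_t start, len; } Span;

/* Hypothetical: a document is an ordered thread of spans through the
   pool. It owns no bytes of its own. */
typedef struct { const Span *spans; size_t nspans; } Doc;

/* Copy the document's text out of the pool into buf and terminate it
   with a NUL. Returns the number of bytes written, not counting the
   terminator. */
size_t doc_read(const char *pool, const Doc *d, char *buf, size_t bufsize)
{
    size_t out = 0, i, j;
    assert(bufsize > 0);
    for (i = 0; i < d->nspans; i++)
        for (j = 0; j < d->spans[i].len && out + 1 < bufsize; j++)
            buf[out++] = pool[d->spans[i].start + j];
    buf[out] = '\0';
    return out;
}

/* Count the pool bytes that both documents use. Nothing was copied,
   so shared material is found by comparing positions in the pool
   rather than by comparing contents. */
size_t doc_shared_bytes(const Doc *a, const Doc *b)
{
    size_t shared = 0, i, j;
    for (i = 0; i < a->nspans; i++)
        for (j = 0; j < b->nspans; j++) {
            size_t a_lo = a->spans[i].start;
            size_t b_lo = b->spans[j].start;
            size_t a_hi = a_lo + a->spans[i].len;
            size_t b_hi = b_lo + b->spans[j].len;
            size_t lo = a_lo > b_lo ? a_lo : b_lo;
            size_t hi = a_hi < b_hi ? a_hi : b_hi;
            if (hi > lo) shared += hi - lo;
        }
    return shared;
}
```

In the file model, the second function would have to compare text and could only guess at a common origin. Here the shared material keeps its identity, which is the property Ted is describing.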
This approach grew out of what Ted wanted to do with computers. "I have so
many uses for this that anything else seems a totally wrong use of computers.
In my personal life, my plan since 1960 has been to have all my writings be in
a solid block of hypertext, interconnected among all the uses of the same
material."
That's not how it has worked out. Ted has published several books "the old
way," and it bothers him. "Every book I write in the old method is written
wrong, and therefore does not participate in the plan I originally formulated.
So each of these mock achievements is a heartbreak to me."
Hence the Xanadu model, in which documents are represented the way Ted Nelson
wants them to be represented.
Xanadu documents, not files, are the focus of the Xanadu/Server system.
Understanding the document model of Xanadu/Server is a prerequisite to
understanding anything else about it. And understanding the document model
requires learning a lot of new terminology. Some of the terminology may be
absolutely necessary, because new ideas are being discussed (or at least ideas
not seen in personal computer operating systems). Other terms may be of value
chiefly in emphasizing the fact that there are new concepts here. But there is
among Ted's proteges an enjoyment of terminology for its own sake. Earlier, in
the XOC offices, Ted referred to me as "Michael," and there was a momentary
confusion, Michael McClary thinking that Ted was referring to him. "Do you
mean we've overloaded Michael?" he asked.
Ted, a self-described neologian, enjoys this.
So what is a Xanadu/Server document? Here's what it is not: A Xanadu/Server
document is not a discrete stream of sequential bytes with a static content
and clear boundaries.
If this sounds more muddled than the usual model of a document, it is in fact
partly an attempt to resolve some of the muddle in the usual model.
Conventionally, we often refer to two different things by the same name. "The
DDJ Editorial Calendar" can refer to a persistent entity that passes through
various revisions and can appear on various disks and in printed form. At the
same time, it can refer to a specific ordering of bytes in a particular
representation at a particular time.
This is a confusion of the identity of the document with its state, and it's a
confusion that is more or less built into the conventional model. The
Xanadu/Server document model deals with the confusion by being explicit about
the identity and the state of the document. The model also distinguishes
between editions of a document, which embody temporal changes in the document,
and variants of a document, such as what-if scenarios.
The identity and the state of a document in the Xanadu/Server system are
distinguished by having distinct objects assigned to each. A document's
identity, that is, the user's perception of the document as a cohesive whole,
is represented by a globally unique, system-generated, persistent object
called a "Bert." A document's state is represented by a globally unique,
system-generated, persistent object called a "Stamp." Each Stamp is associated
with one Bert, and is a snapshot of the document's state at a particular time.
Stamps cannot change.
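The split between identity and state can be sketched in a few lines of C. This is an illustration only, under loud assumptions: the field layouts, the fixed-size Store, and the names bert_revise and bert_read are all hypothetical, and the real Berts and Stamps are globally unique, system-generated objects rather than small integers and array slots.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STAMPS 16   /* arbitrary limit for the sketch */

/* Hypothetical: a Stamp is one frozen state of a document. */
typedef struct {
    int         bert_id;   /* the identity this state belongs to */
    const char *content;   /* never modified after creation */
} Stamp;

/* Hypothetical: a Bert is the document's identity. It holds no
   content, only a reference to whichever Stamp is current. */
typedef struct {
    int id;
    int current;           /* index into the store, or -1 if none yet */
} Bert;

typedef struct {
    Stamp stamps[MAX_STAMPS];
    int   count;
} Store;

/* Record a new state for the document and make it current. Earlier
   Stamps are left untouched. Returns the new Stamp's index. */
int bert_revise(Store *s, Bert *b, const char *content)
{
    assert(s->count < MAX_STAMPS);
    s->stamps[s->count].bert_id = b->id;
    s->stamps[s->count].content = content;
    b->current = s->count;
    return s->count++;
}

/* Read the document as it stands now, or NULL if it has no state. */
const char *bert_read(const Store *s, const Bert *b)
{
    return b->current < 0 ? NULL : s->stamps[b->current].content;
}
```

Editing never touches an existing Stamp. It adds a new one and moves the Bert's reference, so every earlier state stays readable through its own Stamp.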
Managing versions of documents, allowing multiple users to access a single
document, and tracing the development of a document through time are all
accomplished through manipulating the associations among Berts and Stamps. The
system is subtle and elegant.
The Xanadu/Server model differs most clearly from the conventional model in
its use of virtual documents. A simple document may consist of a sequence of
bytes in a particular location; then it is similar in its state representation
to a conventional document. But a document can also include parts of other
documents in the form of virtual copies. The virtual copy is a kind of hot
link; not a static copy, but a use of the original. It is possible to create
Xanadu documents entirely by pasting together virtual copies of other works.
Such a virtual document is effectively a frame placed around existing chunks
of data to create a new whole from the parts.
Because no new copy is created, this virtual copy approach is not expensive in
terms of space. And since the Xanadu/Server system is designed to support this
mechanism efficiently, it is also not expensive in terms of time.


Xanadu Links


The key to all this interconnectedness is the connections themselves, the
links. Now that hypertext is ubiquitous, it might seem that some part of
Xanadu is already realized. But the links of the Xanadu system are so richly
realized that they are fundamentally different from the connections in
products like Guide and HyperCard. The richness is there, again, because it
was something that Ted needed for his own purposes.
"I'm an amateur philosopher," he tells me over coffee. "I was a philosophy
major in college and I have probably about 200,000 notes accumulated in the
past 30 years. You see, I decided in the fall of 1960 that I was not going to
write any books until I had a decent editing system."
I observe that he is still working on it.
"Still working on it. Unfortunately I had to write three books the wrong way.
But most of my stuff is still hanging in notes. I have literally millions of
notes squirreled away on file cards in storage."
Digging through the Xanadu/Server documentation later, trying to understand
the complexities and the depth of the linking system, I find it helpful to
keep those file cards in mind.
The Xanadu/Server documentation describes links as objects created by end
users to assert relationships between the various components of their data.
Exactly how the end users do this is not the business of Xanadu/Server,
however. Applications running on Xanadu/Server will provide various interfaces
to the documents.
As I kept hearing Xanadu links described as omnidirectional, I wanted to
challenge this claim. Aren't backlinks, I wondered, fundamentally different
from forward links? Thinking about hypertext links in products like Guide or
HyperCard, I envisioned clicking on a word and jumping to a new document. The
backlink in this case would seem to be from the document as a whole to the
word, but is it? The user interface is necessarily different, since you can't
really click on an entire document as you can click on a word. But more than
that, the backlink's job is not really to follow a path through the space of
links so much as to backtrack through the temporal dimension of link
traversal. Backing out of a link traversal is like an Undo operation more than
it is like a navigation operation.
I finally realized that I was missing the point. Xanadu/Server is not a user
tool, although it will probably ship with sample user tools. It is a back end,
and so far as the API goes, links are entirely omnidirectional. Specific
applications may apply a handedness to certain kinds of links. But only to
certain kinds, and I also realized that I was missing another point in
thinking of links purely as associations between chunks of hypertext. Links in
Xanadu/Server can be used to map font and other attribute information to parts
of documents, to implement electronic mail services, and to implement all the
navigational operations of the system, both within and between documents.
Everything is linked.
Because links are so fundamental to Xanadu, they are richly realized. A link
is not just a connection between two chunks of data. A Xanadu link has two
kinds of what are called endsets: a type space endset and a linkend endset.
The type space endset specifies what type of link this is; the set of possible
types is open, and new types can be defined by applications. Some possible
link types are criticism, version change, definition, footnote,
cross-reference, and effectively any other relationship imaginable.
The linkend endset, and there is always at least one, describes the object
pointed to by the link. Any number of endsets is possible. Each endset has a
structure that consists of:
a Bert context
an original context
a path context
an end context
The Bert context provides a document identity to use in viewing the endset.
The original context is a Stamp ID representing the state of the context in
which the link was created. The path and end contexts together specify fully
the material being linked. This entire structure, this link object, resides in
the same amorphous, non-file-oriented space in which documents reside; and
links are editable and readable. All of this flexibility in specifying types
and power in specifying context seems to me to make Xanadu links a genuinely
new idea of as-yet unrealized power.
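For readers who think in structs, the link object described above might be pictured as follows. This is a sketch based only on the description in this column, not on the Xanadu/Server API: every name here is hypothetical, the type endset is reduced to a string, and the path and end contexts are reduced to an offset and a length, which is surely a simplification.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENDS 4   /* arbitrary limit for the sketch */

/* Hypothetical layout of one linkend endset. */
typedef struct {
    int    bert;    /* Bert context: identity used to view this end   */
    int    stamp;   /* original context: state when the link was made */
    size_t path;    /* path context: where the linked material starts */
    size_t end;     /* end context: how much material is linked       */
} LinkEnd;

typedef struct {
    const char *type;            /* type endset, e.g. "footnote" */
    LinkEnd     ends[MAX_ENDS];
    int         nends;           /* a usable link has at least one */
} Link;

/* Attach another end to a link. No end is marked source or target. */
void link_add_end(Link *l, LinkEnd e)
{
    assert(l->nends < MAX_ENDS);
    l->ends[l->nends++] = e;
}

/* Does this link touch the given document at any of its ends? */
int link_touches(const Link *l, int bert)
{
    int i;
    for (i = 0; i < l->nends; i++)
        if (l->ends[i].bert == bert) return 1;
    return 0;
}

/* Count the links in a collection that touch a document, whichever
   end the document sits on. */
int links_touching(const Link *links, int n, int bert)
{
    int i, count = 0;
    for (i = 0; i < n; i++)
        count += link_touches(&links[i], bert);
    return count;
}
```

Because no end is privileged, the same loop answers both "what does this document link to?" and "what links to this document?", which is one way to read the claim that links are omnidirectional at the level of the back end.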
There's another point about the omnidirectionality of links. There is
apparently a philosophical principle here that application developers are
expected to adhere to. Links are not supposed to be hidden. The user is
supposed to be able to find and follow all links from a particular document.
Even if the user does not have permission to read the chunk linked to, he or
she should be able to see that the link is there. On the other hand, the user
is supposed to have the same ability to see all links to the document being
read. So if somebody creates a link to something you have written, you can
determine that this is the case.
The projected world-wide publishing repository that Ted Nelson plans to build
on top of this product depends on this facility, because it provides the
mechanism for automatic payment of royalties to authors.


Saving the World


"On bad days I feel I could have saved the world if I had only been more
efficient," Ted says.
Xanadu has been a long time coming, but it is a big chunk of newness to
swallow all at once. I haven't touched here on the efforts the XOC team are
putting into the more conventional aspects of the system, but looking at
Xanadu/Server as a conventional product, the goal is to provide high-end DBMS
integrity and performance for freeform, multimedia, real hypertext that is
seamless across file, platform, and network boundaries. Not a small ambition.
Xanadu is "a generalized facility of great power," as Ted puts it, and the
characterization hardly seems an exaggeration. But the practical question
ultimately arises, will it actually work?
"Naturally, I'm holding my breath. I haven't been involved in the
implementation for 11 years. Whenever I worry about this question I remember
the immortal words of General Buck Turgidson in Dr. Strangelove: 'Hell, my
boys'll get through!'"





April, 1991
C PROGRAMMING


Terminate and Stay Resident Programs And a New Project




Al Stevens


Last month, I discussed event-driven programming in C and wrote a small
example that used the keyboard and mouse to capture text screen images into a
file on a PC. The program is useful for developing user documentation. You can
define a rectangular area of the screen, capture it to disk, and edit it later
with a text editor. Then you can print it or merge it into your word
processing files.
The example program runs from the DOS command line, which limits its utility.
You need to capture screen snapshots on-the-fly while the program you are
documenting is running. Most programs will not retain the screen while you
exit to DOS to run a screen grabber, and even if they would, the DOS command
line would use up some of the screen, possibly scrolling off the part you want
to capture. Besides, the business of moving between DOS and your program is a
nuisance. Therefore, to capture screen segments from running programs, the
screen-grabber program must be memory-resident.
This month's column shows you how to turn the command-line program into a
terminate-and-stay-resident (TSR) program. The subject of TSRs has been given
comprehensive treatment by now. Many books and magazine articles have dealt
with it. I've covered the subject myself in three books. I'm addressing it
again because I wanted to complete the program that I started last month and
because it would be unfair to dump a lot of code on you without a bit of
explanation.
The TSR is a kludge. It exists because the designers of DOS failed to foresee
the need for simple task-switching between concurrently resident programs --
not multitasking, mind you, but simple task-switching. The TSR was made
possible because those same designers needed to tack a print spooler to the
single-user, single-tasking DOS. They put some hooks into DOS to allow their
print spooler to run in the background in a way that did not disturb the
integrity of a foreground task. They did not publish the details of those
hooks, probably because the technique is both fragile and ugly. Other
programmers reverse-engineered the DOS print spooler program's code to build
early pop-up utility programs such as SideKick. Eventually, the techniques
came into common knowledge, and TSRs proliferated. By the way, you can use the
Paste operation of SideKick's Notepad to do what the example screen grabber
program does, except that SideKick does not use the mouse to describe the part
of the screen you want to capture.
Here in a nutshell -- an appropriate place for a discussion of a kludge -- is
how a pop-up TSR works. You run a TSR from the DOS command line. It attaches
itself to some interrupt vectors and uses a special DOS call to terminate
without giving up the memory it occupies. Thus the name,
"terminate-and-stay-resident." When you run other programs, DOS loads them
into memory above the TSR program. When you press the TSR's hot key, the
interrupt service routine (that the TSR attached to the keyboard interrupt
vector) intercepts the keystroke and goes through some gyrations that let the
TSR program pop up.
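The "special DOS call" has to be told how much memory to keep, counted in 16-byte paragraphs. Here is a rough sketch of the arithmetic that Listing One uses, written as ordinary C so that you can try it anywhere. The function name and the idea of passing the values as arguments are mine, for illustration only; in the listing the inputs come from Turbo C's _SS, _SP, and _psp, and the result goes into DX for function 0x31 of interrupt 0x21.

```c
/* Sketch: size, in 16-byte paragraphs, of the memory a TSR keeps.
 * Mirrors the arithmetic in Listing One; the function name and the
 * argument passing are illustrative assumptions. */

/* ss:  stack segment (a paragraph address)
 * sp:  stack pointer (a byte offset within ss)
 * psp: segment of the Program Segment Prefix, where the program begins */
unsigned resident_paragraphs(unsigned ss, unsigned sp, unsigned psp)
{
    /* top of the program: the stack segment plus the stack offset,
     * padded by 256 bytes and converted to paragraphs */
    unsigned highmemory = ss + ((sp + 256u) / 16u);

    /* everything from the PSP through the top of the stack */
    return highmemory - psp + 1u;
}
```

With a PSP at segment 0x1000, a stack segment at 0x2000, and a stack pointer of 1024, the function reports 4177 paragraphs, a little over 65K.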
A TSR that makes DOS calls cannot pop up at just any old time. DOS is not
always able to accommodate that. DOS is not reentrant. If you interrupt a
program while it is in the middle of a DOS call and run another program that
makes DOS calls, DOS will crash and burn. A TSR must set an indicator that
says it wants to pop up and then wait until DOS says it is OK. DOS does that
in two ways. DOS maintains something called the INDOS flag, an indicator that
tells when DOS is running and cannot be interrupted. As long as the INDOS flag
is set, you must not switch to a different program that makes DOS calls. The
catch is that the COMMAND.COM program makes a DOS call to read the console. As
long as you are sitting at the command line prompt, DOS is running and the
INDOS flag is set. If INDOS provided the only way to let you interrupt DOS, no
pop up could occur while you were at the command line prompt. To get around
that snafu, DOS adds another kludge. While DOS is looping waiting for a
keystroke, it periodically sets things up to allow itself to be interrupted.
Then it calls interrupt number 0x28. A TSR attaches itself to INT 28 and knows
that it is OK to pop itself up when its INT 28 interrupt service routine
executes.
To tie all these things together, a TSR attaches the keyboard interrupt to
watch for the hot key, the INT 28 interrupt to watch for when DOS will allow
itself to be interrupted, and the timer interrupt to watch everything. When
the TSR's keyboard interrupt service routine sees that you have pressed the
hot key, it sets a flag saying so. When the timer interrupt service routine
executes -- 18.2 times every second -- it looks at the hot key flag. If the
hot key flag is set and the INDOS flag is not, the timer interrupt service
routine pops up the TSR. If the INT 28 interrupt service routine executes and
the hot key flag is set, the INT 28 routine pops up the TSR.
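The two pop-up tests just described can be sketched in portable C. This is a model of the decision only, not interrupt code; the structure and function names are hypothetical, and the flags are plain integers rather than values read from DOS.

```c
/* Sketch of the pop-up decision, as plain C that runs anywhere.
 * The struct and function names are hypothetical. */

struct tsr_state {
    int hotkeyhit;  /* set by the keyboard ISR when the hot key is seen */
    int indos;      /* nonzero while DOS is inside a call (INDOS flag) */
};

/* Asked on every timer tick: pop up only if DOS is idle. */
int popup_ok_from_timer(const struct tsr_state *s)
{
    return s->hotkeyhit && s->indos == 0;
}

/* Asked from INT 28: DOS itself has said it may be interrupted,
 * so the INDOS flag is not consulted. */
int popup_ok_from_int28(const struct tsr_state *s)
{
    return s->hotkeyhit != 0;
}
```

Listing One adds one more condition to the timer test: a flag that is set while a BIOS disk call is in progress.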
Before popping up, the TSR must switch context from the interrupted program to
itself. In an orderly multitasking environment, the operating system handles
context switching between tasks. In the TSR kludge, every resident program
must do it for itself. One of the reasons that the TSR situation is fragile is
that there are different ways to manage context switching; not every program
does it the same way, and not every program does a thorough job of it. Here
are the items of context that you must switch. You need to change from the
interrupted program's stack to that of the TSR; you need to trick DOS into
thinking the TSR is its single task by switching its pointer to the running
program's Program Segment Prefix (PSP); you need to change the Disk Transfer
Address (DTA) from that of the interrupted program to the TSR's DTA; you need
to temporarily disable what DOS would do with a critical error and when the
user presses Ctrl-Break or Ctrl-C; if the TSR is going to use the mouse, you
must save the current mouse context. The interrupted program might be using
it, too.
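To make the bookkeeping concrete, here is a minimal sketch of saving and restoring some of those items. It is a simulation: "struct machine" stands in for the real PC and DOS state, every name in it is hypothetical, and the critical error handler and the mouse are left out to keep it short.

```c
/* Sketch: save and restore context items around a pop-up.
 * Nothing here touches real hardware; all names are hypothetical. */

struct machine {
    unsigned ss, sp;      /* current stack segment and pointer */
    unsigned psp;         /* PSP that DOS believes is running */
    unsigned long dta;    /* current Disk Transfer Address */
    int break_checking;   /* Ctrl-Break checking on (1) or off (0) */
};

struct saved_context {
    struct machine interrupted;   /* snapshot taken at pop-up time */
};

/* Switch from the interrupted program to the TSR. */
void enter_tsr(struct machine *m, const struct machine *tsr,
               struct saved_context *save)
{
    save->interrupted = *m;     /* remember everything first */
    m->ss = tsr->ss;            /* the TSR's own stack */
    m->sp = tsr->sp;
    m->psp = tsr->psp;          /* DOS now thinks the TSR is running */
    m->dta = tsr->dta;          /* the TSR's own DTA */
    m->break_checking = 0;      /* no Ctrl-Break while popped up */
}

/* Switch back, restoring exactly what was saved. */
void leave_tsr(struct machine *m, const struct saved_context *save)
{
    *m = save->interrupted;
}
```

The point of the single snapshot is symmetry: whatever enter_tsr changes, leave_tsr puts back, so the interrupted program never sees the difference.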
Depending on what the TSR does, you must deal with the video mode as well. If
the TSR is smart enough to recognize the current video mode and use it, you do
not need to change video context. On the other hand, if the TSR is a
text-mode-only program, for example, you need to capture the current mode and,
perhaps, the contents of video memory, and change the video mode to the one
that the TSR uses. The screen grabber program does not worry about that
because it captures text mode screens only and assumes that you would run it
only in text mode.
Dealing with PC video modes is no trivial job. If the interrupted program sets
modes by directly addressing the CRT controller's registers, for example, you
might not be able to determine the current mode, much less save it and switch
back to it.
After you've done all that context switching, you can run the TSR. A TSR that
executes when INDOS is clear is free to make almost any DOS call. One that
executes from its INT 28 interrupt can use most DOS calls above function 0x0c.
The functions from 0 to 0x0c are screen and keyboard DOS calls and are still
vulnerable to a crash. Not to worry. These functions are ones that most TSRs
would not use because of their behavior and performance. Most TSRs will read
the keyboard through BIOS and write to the screen with direct video memory
writes. TSRs should avoid making DOS memory allocation calls or running other
programs from the DOS EXEC function.
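As a rule of thumb, that guidance boils down to a small test. The sketch below is a simplification under my own hypothetical names; it treats "almost any" and "most" as "all," and it does not model the advice about memory allocation or EXEC.

```c
/* Sketch of the rule of thumb for DOS calls from a popped-up TSR.
 * The enum and function names are hypothetical. */

enum popup_source { FROM_TIMER, FROM_INT28 };

/* Returns nonzero if a DOS function number is considered safe to
 * call, given how the TSR was popped up. */
int dos_call_is_safe(enum popup_source src, unsigned function)
{
    if (src == FROM_INT28)
        return function > 0x0c;   /* 0 to 0x0c are console calls */
    return 1;   /* INDOS was clear: treat any call as acceptable */
}
```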
When the TSR is done, it must switch the context back to what it was before
the program popped up. Note that the program in this and the previous issue
will run correctly as a TSR only for versions of DOS greater than 3.0. The DOS
functions that get and set the PSP did not work properly in earlier versions
of DOS. There are tricks for swapping PSP context in DOS Version 2, and these
are described in the books mentioned at the end of this discussion.
Listing One, page 150, is tsr.c, the driver that turns the program into a DOS
terminate-and-stay-resident program. It uses some of the Turbo C extensions
for reading and writing hardware registers and executing interrupts. Other
compilers have other ways of doing these nonstandard operations, so if you use
a compiler other than Turbo C, you must port the hardware-specific stuff to
the conventions of your compiler. Most compilers have no analogue to the Turbo
C stack segment and stack pointer pseudoregisters, so you might need to write
an assembly language equivalent to read and change the values in SS and SP.
Most PC compilers do have the int86 family of functions with REGS and SREGS
structures, and they all implement them in common ways, so that part of your
port should be no problem.
You can use this TSR engine to create other TSRs from regular C programs if
those programs avoid any console input/output through DOS calls. Three define
statements at the beginning of Listing One associate the driver with a TSR
program. The KEYMASK and SCANCODE symbols define the hot key, and the
tsr_program symbol defines the name of the application function that the
driver calls when the user presses the hot key. In this example, the hot key
uses a scan code of 52 and a key mask of 8. These values make Alt-period the
program's hot key. The program will call the function named copy-scrn when the
user presses the hot key. Scan codes and the value of the BIOS shift mask are
published in most books on low-level PC programming, including the ones listed
at the end of this discussion.
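The hot key test itself is small enough to sketch in portable C. The comparison is the one the keyboard routine in Listing One makes; the function name is my own, and the shift status and scan code are passed in as arguments here instead of being read from the BIOS data area and the keyboard port.

```c
/* Sketch of the hot key comparison. KEYMASK and SCANCODE carry the
 * values from Listing One; the function name is hypothetical. */

#define KEYMASK  8    /* Alt bit in the BIOS shift status byte */
#define SCANCODE 52   /* scan code of the period key */

int is_hot_key(unsigned char shift_status, unsigned char scan_code)
{
    /* Only the low four bits (the two Shift keys, Ctrl, and Alt) are
     * compared, so lock-key states in the upper bits do not matter. */
    return (shift_status & 0x0f) == KEYMASK && scan_code == SCANCODE;
}
```

Because the mask is compared for equality, Alt-period matches but Ctrl-Alt-period does not. To choose a different hot key, change the two values.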
The TSR driver manages all the TSR bother about when to pop up and how to
switch contexts between the program that is running and the TSR. The TSR
driver includes the main function for the program that it supports. To bypass
being a TSR, a program should provide its own main function, one that either
calls or is itself the tsr_program function. In fact, that is the easiest way
to test a TSR. Because in that configuration it is a regular DOS program, you
can test it with a source-level debugger, linking in the TSR driver after the
program works properly as a DOS program. The code from last month includes a
TSR compile-time conditional statement. When you define that global symbol in
the compile, the code changes the name of its main function to tsr_program,
the function called by the engine to run the TSR.
There is a lot more to know about TSRs than I have told you here. You can
learn more from the books mentioned later, or you can decide not to know much
about the subject and use engines such as the one published with this column.
The engine published here is not comprehensive, however. You would not use it
with a commercially distributed TSR because it would crash under DOS 2.0 and
would make funny screens if you popped it up in graphics mode. There are other
engines published in books, including some of my own, that have the code to
get around some of these problems. There are some commercial libraries that
manage the TSR part of your programs. Your best bet is a rugged shareware
package named TesSeRact that takes care of most of it in ways that survive in
the presence of most other TSRs. You can download TesSeRact from many BBSs and
services or you can order it from Innovative Data Concepts at 215-443-9705.
Source code is available, too. (I'd plug this package more often if its name
was easier to type.)


TSR Books


This is a list of books that address the machinations of the TSR or that
discuss issues relevant to the development of TSR programs:
Extending Turbo C Professional, by Al Stevens (MIS Press, 1989)
Microsoft Mouse Programmer's Reference (Microsoft Press, 1989)
MS-DOS Developer's Guide, 2nd Edition, by The Waite Group (Howard W. Sams,
1989)
The MS-DOS Encyclopedia, by Ray Duncan et al. (Microsoft Press, 1988)
PC System Programming for Developers, by Michael Tischer (Abacus, 1989)
Performance Programming Under MS-DOS, by Michael J. Young (Sybex, 1987)
Programmer's Guide to PC & PS/2 Video Systems, by Richard Wilton (Microsoft
Press, 1987)


TSRynosaurus


Having just written and published a TSR program, let me crawl out on a spindly
limb and say that I think the TSR is and should be an endangered species. Its
original justification was that PCs of yore did not support multitasking, and
users wanted the convenience of pop-up utility programs. The conventional
wisdom persisted that because there are a kazillion of those old 8088-based
PCs out there with their paltry 640K, we have a duty to continue to develop
software to support them. Why? You can get a 386 with lots of memory for a lot
less than the original cost of a 1981 PC with no hard drive and 64K. Tell
those folks to use the old software or get off a dollar and upgrade. In the
meantime, the multitaskers such as Windows and DesqView solve the thorny TSR
problems for those willing to use contemporary hardware. Write a program, any
program. Forget about INT 28, INDOS, critical errors, break handlers, context
switching, and all that rot. Run that program under DesqView and pop it up
whenever you want. And if you happen to pop it up while your communications
program is downloading a big file, so what? Everything keeps humming along.
And if one of those old-timers can't live without your snazzy new program and
won't give up the antique, tell them about the DOS command line.
I hereby resolve to never write another piece on TSRs.


D-Flat


My excursion into event-driven programming and my analysis of TurboVision,
Zinc, and Mewel led me to a conclusion. C programmers need an efficient way to
put the IBM Systems Application Architecture (SAA) Common User Access (CUA)
into their DOS text-mode programs and into programs developed for other,
non-PC platforms. If TurboVision, a lovely new part of Turbo Pascal 6.0, finds
its way into Turbo C, it will no doubt be a C++ additive because of its strong
orientation to classes. Users of the C component of Turbo C++, Turbo C 2.0, or
other C compilers will not benefit from TurboVision. The Zinc library is
likewise a Turbo C++ product. Mewel is a good solution for C programmers, but
only if you are developing for the high-end computers, ones fast enough and
with enough memory to support Mewel programs, and only if you want most of the
features supported by the Windows CUA interface.
Over the next several months I will be publishing a new "C Programming" column
project, which will be a C library that implements a subset of CUA in a
text-mode environment. Because all the really good C-oriented puns have
already been taken (C-Worthy, C-scape, and so on), I will call the package
"D-Flat," which is another way of saying "C-Sharp." If I really wanted to be
hip, I'd call it "Five," which is the jazz musician's shorthand for the key of
D-flat. I will not use "C-Sharp" itself because there is almost certain to be
someone out there with a trademark registration and a lawyer on the payroll.
D-Flat sounds safer somehow. Maybe there could be two versions: a small,
concise, featureless version called "C-Sharp Minor" and a feature-rich,
all-things-to-all-programmers version called "C-Sharp Major." This could get
disgusting. I'll stick with D-Flat.
D-Flat will provide the CUA interface in an event-driven architecture with the
hardware drivers developed separately. It will support applications windows,
child document windows, menu bars, pop down menus, dialog boxes, buttons, edit
boxes, list boxes, scroll bars, context-sensitive help, and other CUA things.
It will use the C compiler's preprocessor as a resource compiler. The version
published here will run on the PC and will compile with as many popular
compilers as I can possibly address within the confines of this column and the
time I have to give to it. The hardware-dependent and compiler-dependent code
will be separate from the rest of the library, and it will be small in
relation to the rest of D-Flat.
D-Flat will not clone the Windows API as Mewel has done. D-Flat's purpose is
not to grease the skids on the way to Windows or to provide Windows-to-DOS
source code portability. That solution has already been effectively
implemented by Mewel. I am making no attempt to get close to the Windows way
of doing things in the development of this software. On the other hand, I've
made no conscious effort to avoid it, either. Looking back on the way the code
works so far, I can see some occasional Windows influence in the design.
D-Flat was born after I looked for a CUA-like library for an applications
program I am writing. The program must be hospitable to most computers
including the little laptops. Performance is critical. If the program is slow
to load and slow to run, the users will not use it. If that belies the TSR
soapbox I mounted a few paragraphs back, so be it. Besides wanting
performance, I wanted the benefits of a package that could manage menus, the
mouse, dialog boxes, and the like after the fashion of the high-end libraries.
I do not suggest that such smaller libraries are not available, and I do not
pretend to have made an exhaustive search, but I liked the idea of a package
of my own design over which I have complete control, and I liked the notion of
publishing it as a project. We will start the project next month with some of
the low-level stuff.



The Journal of C Language Translation


The Journal of C Language Translation is a small, quarterly publication that
targets developers of C language compilers, interpreters, libraries, and such.
Each edition contains nine or ten essays of interest to those who need to
understand the ins and outs of C translation so that they can write
translating programs or interfacing libraries that comply with the official
definition of Standard C. It is no surprise that many of the contributors to
the Journal are members of the ANSI X3J11 committee. The audience that the
Journal has targeted is, understandably, small, and so the price is steep --
$235 for a one-year subscription of four issues. The irony of this marketing
strategy is that it excludes a large segment of its potential market.
Programmers not writing compilers could learn a lot about C from the Journal
if they could only afford the admission. Some of the contributors are
regularly seen elsewhere in print, but it appears to me that they save their
best C stuff for the Journal.
Here are some of the subjects they've covered in their first two years:
proposed numerical extensions to C; name space pollution; const and volatile
type qualifiers; aliasing problems; translating Pascal to C; the so-called
"quiet changes" introduced in the ANSI standard definition; translating
Fortran to C; variable length arrays; C on a Cray; the weak spots in Standard
C; and trigraphs. The authors cover these and many other topics from the
perspective of those who helped to frame the standard, those in the best
position to recognize and identify its weaknesses and to understand and
explain the imponderable.
It might be hard to persuade your boss to cough up 235 bucks for four issues
of an inexpensively produced, nonglossy periodical with no ads and a total
annual page count of about 250 pages, smaller than a typical $25 C book. But
try. If your shop has lots of programmers, maybe you can all share a copy.
Address your inquiries to:
The Journal of C Language Translation
2051 Swans Neck Way
Reston, Virginia 22091
703-860-0091

_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

/* --------- tsr.c --------- */

/*
 * A Terminate and Stay Resident (TSR) engine
 */

#include <dos.h>
#include <stdlib.h>
#include <stdio.h>
#include "mouse.h"
#include "keys.h"

void tsr_program(void);

#define KEYMASK 8
#define SCANCODE 52

extern unsigned _stklen = 1024;
extern unsigned _heaplen = 8192;

/* ------- the interrupt function registers -------- */
typedef struct {
 int bp,di,si,ds,es,dx,cx,bx,ax,ip,cs,fl;
} IREGS;

/* --- vectors ---- */
#define DISK 0x13
#define CTRLBRK 0x1b
#define INT28 0x28
#define CRIT 0x24
#define CTRLC 0x23
#define TIMER 8
#define KYBRD 9
#define DOS 0x21

unsigned highmemory;

/* ------ interrupt vector chains ------ */
static void (interrupt *oldtimer)(void);
static void (interrupt *old28)(void);

static void (interrupt *oldkb)(void);
static void (interrupt *olddisk)(void);

/* ------ ISRs for the TSR ------- */
static void interrupt newtimer(void);
static void interrupt new28(void);
static void interrupt newdisk(IREGS);
static void interrupt newkb(void);
static void interrupt newcrit(IREGS);
static void interrupt newbreak(void);

static unsigned sizeprogram; /* TSR's program size */
unsigned dossegmnt; /* DOS segment address */
unsigned dosbusy; /* offset to InDOS flag */
static int diskflag; /* Disk BIOS busy flag */
unsigned mcbseg; /* address of 1st DOS mcb */
static char far *mydta; /* TSR's DTA */

int hotkeyhit = FALSE;
int tsrss; /* TSR's stack segment */
int tsrsp; /* TSR's stack pointer */

/* -------------- context for the popup ---------------- */
unsigned intpsp; /* Interrupted PSP address */
int running; /* TSR running indicator */
char far *intdta; /* interrupted DTA */
unsigned intsp; /* " stack pointer */
unsigned intss; /* " stack segment */
unsigned ctrl_break; /* Ctrl-Break setting */
void (interrupt *oldcrit)(void);
void (interrupt *oldbreak)(void);
void (interrupt *oldctrlc)(void);

/* ------- local prototypes -------- */
static void resident_psp(void);
static void interrupted_psp(void);
static void popup(void);

void main(void)
{
 unsigned es, bx;
 /* ---------- compute memory parameters ------------ */
 highmemory = _SS + ((_SP + 256) / 16);
 /* ------ get address of DOS busy flag ---- */
 _AH = 0x34;
 geninterrupt(DOS);
 dossegmnt = _ES;
 dosbusy = _BX;
 /* ---- get the seg addr of 1st DOS MCB ---- */
 _AH = 0x52;
 geninterrupt(DOS);
 es = _ES;
 bx = _BX;
 mcbseg = peek(es, bx-2);
 /* ----- get address of resident program's dta ----- */
 mydta = getdta();
 /* ------------ prepare for residence ------------ */
 tsrss = _SS;
 tsrsp = _SP;

 oldtimer = getvect(TIMER);
 old28 = getvect(INT28);
 oldkb = getvect(KYBRD);
 olddisk = getvect(DISK);

 /* ----- attach vectors to resident program ----- */
 setvect(KYBRD, newkb);
 setvect(INT28, new28);
 setvect(DISK, newdisk);
 setvect(TIMER, newtimer);
 /* ------ compute program size ------- */
 sizeprogram = highmemory - _psp + 1;
 /* ----- terminate and stay resident ------- */
 _DX = sizeprogram;
 _AX = 0x3100;
 geninterrupt(DOS);
}

/* ---------- break handler ------------ */
static void interrupt newbreak(void)
{
 return;
}

/* -------- critical error ISR ---------- */
static void interrupt newcrit(IREGS ir)
{
 ir.ax = 0; /* ignore critical errors */
}

/* ------ BIOS disk functions ISR ------- */
static void interrupt newdisk(IREGS ir)
{
 diskflag++;
 (*olddisk)();
 ir.ax = _AX; /* for the register returns */
 ir.cx = _CX;
 ir.dx = _DX;
 ir.es = _ES;
 ir.di = _DI;
 ir.fl = _FLAGS;
 --diskflag;
}

/* ----- keyboard ISR ------ */
static void interrupt newkb(void)
{
 static unsigned char kbval;

 kbval = inportb(0x60);
 if (!hotkeyhit && !running)
  if ((peekb(0, 0x417) & 0xf) == KEYMASK)
   if (SCANCODE == kbval) {
    hotkeyhit = TRUE;
    /* --- reset the keyboard ---- */
    kbval = inportb(0x61);
    outportb(0x61, kbval | 0x80);
    outportb(0x61, kbval);
    outportb(0x20, 0x20); /* end-of-interrupt */

    return;
   }
 (*oldkb)();
}

/* ----- timer ISR ------- */
static void interrupt newtimer(void)
{
 (*oldtimer)();
 if (hotkeyhit && (peekb(dossegmnt, dosbusy) == 0) &&
 !diskflag)
 popup();
}

/* ----- 0x28 ISR -------- */
static void interrupt new28(void)
{
 (*old28)();
 if (hotkeyhit)
 popup();
}

/* ------ switch psp context from interrupted to TSR ----- */
static void resident_psp(void)
{
 intpsp = getpsp();
 _AH = 0x50;
 _BX = _psp;
 geninterrupt(DOS);
}

/* ---- switch psp context from TSR to interrupted ---- */
static void interrupted_psp(void)
{
 _BX = intpsp;
 _AH = 0x50;
 geninterrupt(DOS);
}

/* ------ execute the resident program ------- */
static void popup(void)
{
 running = TRUE;
 hotkeyhit = FALSE;
 intsp = _SP;
 intss = _SS;
 _SP = tsrsp;
 _SS = tsrss;
 oldcrit = getvect(CRIT); /* redirect critical err */
 oldbreak = getvect(CTRLBRK);
 oldctrlc = getvect(CTRLC);
 setvect(CRIT, newcrit);
 setvect(CTRLBRK, newbreak);
 setvect(CTRLC, newbreak);
 ctrl_break = getcbrk(); /* get ctrl break setting */
 setcbrk(0); /* turn off ctrl break */
 intdta = getdta(); /* get interrupted dta */
 setdta(mydta); /* set resident dta */
 resident_psp(); /* swap psps */

 intercept_mouse(); /* intercept the mouse */
 /* ------ save the video cursor configuration ------- */
 savecursor();
 normalcursor();
 unhidecursor();
 enable();
 tsr_program(); /* call the TSR C program */
 disable();
 /* ----- restore the video cursor configuration ----- */
 restorecursor();
 restore_mouse(); /* restore the mouse */
 interrupted_psp(); /* reset interrupted psp */
 setdta(intdta); /* reset interrupted dta */
 setvect(CRIT, oldcrit); /* reset critical error */
 setvect(CTRLBRK, oldbreak);
 setvect(CTRLC, oldctrlc);
 setcbrk(ctrl_break); /* reset ctrl break */
 disable();
 _SP = intsp; /* reset interrupted stack*/
 _SS = intss;
 running = FALSE; /* reset semaphore */
}








































April, 1991
STRUCTURED PROGRAMMING


You Can't Go Home Again




Jeff Duntemann, KG7JF


"I am an American, Chicago-born; Chicago, that somber city."
So said the hero in the opening line of Saul Bellow's novel The Adventures of
Augie March -- words that I may say with equal truth. I thought of poor Augie
last week as I stumbled out of the State Street subway entrance into a howling
30 MPH wind and a -40 degree windchill factor.
And somber; you don't know somber until you live under one of those
garage-floor gray skies that descend in November and don't lift again until
May. I lived under those skies for 26 years, and it has now been 12 years
since I have spent more than an odd weekend there.
It's a sobering experience, trying to go home again. You can't, of course; nor
am I the first one who has said so. The Chicago I had called home was gone,
and although familiar pieces were scattered all over the map, the general
impression was that the Windy City had simply blown itself away.
Whole blocks of downtown where I used to repair Xerox machines have been
razed, to make way for massive postmodern skyscrapers and stupendously ugly
government monuments. Familiar stores have vanished, and new strip malls are
everywhere. The Jefferson Park subway, which I rode on its very first day in
1970, now looks grimy, battered, and tired.
I ate Bay's English muffins and Salerno cookies, and had lunch at Superdawg on
Milwaukee Avenue, and little by little realized that home had better be where
you are, or it's nowhere. It isn't just that home is gone, for it is -- but
the you that lived there is gone too, continually reshaped into another being
by forces that work gradually and never quite show themselves. The little
house on Clarence Avenue where I grew up now looks tiny; its entire first
floor would fit neatly inside my Scottsdale garage. It's no smaller than it
was when I lived there, and I'm no larger ... but my sense of perspective has
been forever altered by the hills of Baltimore and the cliffs of Santa Cruz.


Portability Nostalgia


There's an increasingly vocal contingent in our field that's been demanding
that we all go home again, where home is that fabled academician's Erewhon,
portability. I've fielded some interesting threads on the nets, hollering that
if that nasty old Turbo Pascal hadn't messed with the pristine Pascal
Standard, all our code would be portable and we'd all be happily Home.
It's characteristic of this argument (which has been cluttering up discussions
of programming for many years) that what we want is always referred to as
"portability." Nobody ever says that what we want is for one piece of source
code to compile and run identically on all compilers for all machines -- even
though when pressed, most will admit that that's what "portability" is
supposed to represent. I don't know about you, but spelled out it sounds
pretty dicey to me.
In this interpretation, portability in Pascal is impossible, period -- unless
you limit yourself to programs that don't do much, like the programs you
generally write in college. College programming exercises are throwaways that
teach a lesson and then become extraneous. Sure, you can write a program in
ISO Standard Pascal that creates a linked list, sorts it, and then writes the
sorted list to Output. But I dare you to do this: Open files INPUT1 and INPUT2
for input and OUTPUT1 for output, and then merge the two input files to the
output file. You can't do it because Standard Pascal has only two logical
files, Input and Output. Not to mention the fact that Standard Pascal has no
way to associate a physical filename with a logical file once the program has
begun running.
Let's not even talk about doing a binary search on a sorted index file on
disk. Seek? What's that? Not Standard, mon.


Syntax and Semantics


Forgive me for railing. I just want those two-bit book floggers to put a sock
in it and quit praising Standard Pascal as the quick road Home. Portability is
an intriguing topic that deserves better treatment than the nutcase
discussions I've heard. Let's explore the notion for a bit.
For starters, what would it take to realize the ideal of portability? What
would we have to have to allow one single source code file to compile and run
identically on all compilers of a given language on all machines? Too many
people place all the blame on the language itself, but the problem is much,
much bigger than that. In my analysis, what we need are two things: standard
syntax across language implementations, and standard semantics across
platforms.
Syntax first. The biggest barriers to syntactical portability in structured
languages, oddly enough, are often the designers of those languages. I've
gotten some nastygrams from the nutcases for criticizing Niklaus Wirth in
these columns, but whether he realizes it or not, Dr. Wirth is as much to
blame for the lack of Pascal portability as anyone else.
It comes down to this: He designed a language, and stopped there. He did not
specify a set of standard libraries. There are a handful of fundamental
omissions in Pascal, mostly connected with file I/O. (The addition of Assign,
Seek, and Erase to ISO Pascal would quell about 40 percent of my objection to
that nonlanguage.) But most of the problems in providing syntactic portability
in Pascal lie not with the language itself but with the absolutely essential
libraries that provide things like access to the underlying system, string
support, and detailed file management.
Wirth has stated that he expects the programmer to develop his own libraries
and to recompile them on every platform he moves his application to, and does
not see any particular need for any set of standard libraries.
This is unrealistic. String support, time/date support, and file management
are so universally needed that forcing every programmer to create them from
scratch is a titanic waste of manpower. Language vendors recognize this, and
that's why Turbo Pascal comes with its own units such as DOS and Crt.
If Wirth had simply spent a few more weeks and defined a spec for libraries
containing the most needed procedures and functions in common programming
tasks, Pascal would be a great deal more portable today than it is.


Half a Loaf


Modula-2 people are reminding me inside their heads right now that Pascal was
just an exercise for Wirth to prove the value of structured programming,
something so ingrained today it seems incredible that anyone would ever doubt
it. In defining Modula-2, Wirth did in fact define a few standard libraries,
making Modula-2 infinitely more amenable to syntactic portability than Pascal.
However, the emphasis here is on few. Modula-2's standard libraries are
strictly half a loaf. What we need, in fact, is something on the order of the
standard function libraries defined for ANSI C. As much as it galls me to
admit it, ANSI C and C++ 2.0 are now much more portable than any flavor of
Pascal or Modula-2, largely because of the breadth of the ANSI standard
library spec.
My own research in C++ led me right to that conclusion: Early on I wrote some
programs in Zortech C++, and when Turbo C++ came along I ported even the
biggest one from Zortech to Turbo in about half an hour.
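For comparison, the two-file merge dare posed earlier in this column can be met in ANSI C with nothing but the standard library. What follows is a minimal sketch, not a claim about any particular compiler. The function name merge_files and the fixed line length are my assumptions, and both inputs are assumed to be text files already sorted line by line, with every line ending in a newline.

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE 256   /* assumed upper bound on line length */

/* Merge two sorted text files into a third. Returns 0 on success,
 * -1 if any of the three files cannot be opened. */
int merge_files(const char *in1, const char *in2, const char *out)
{
    char a[MAXLINE], b[MAXLINE];
    FILE *f1 = fopen(in1, "r");
    FILE *f2 = fopen(in2, "r");
    FILE *fo = fopen(out, "w");
    int have_a, have_b;

    if (f1 == NULL || f2 == NULL || fo == NULL) {
        if (f1 != NULL) fclose(f1);
        if (f2 != NULL) fclose(f2);
        if (fo != NULL) fclose(fo);
        return -1;
    }
    have_a = fgets(a, MAXLINE, f1) != NULL;
    have_b = fgets(b, MAXLINE, f2) != NULL;
    while (have_a || have_b) {
        /* take from the first file when it holds the smaller line,
         * or when the second file is exhausted */
        if (have_a && (!have_b || strcmp(a, b) <= 0)) {
            fputs(a, fo);
            have_a = fgets(a, MAXLINE, f1) != NULL;
        } else {
            fputs(b, fo);
            have_b = fgets(b, MAXLINE, f2) != NULL;
        }
    }
    fclose(f1);
    fclose(f2);
    fclose(fo);
    return 0;
}
```

The filenames arrive as ordinary strings at run time, which is exactly the association between physical and logical files that Standard Pascal leaves out.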


Going for Syntactic Portability


Achieving some degree of syntactic portability can be done according to these
time-honored principles:
1. Use standard library routines wherever you can.
2. Avoid vendor-supplied language extensions whenever possible.
3. When you must use nonstandard language extensions, confine them as much as
possible to mission-specific library modules.
This isn't an especially good prescription for Modula-2 programmers, and it's
simply beyond hope for Pascal, because in Pascal there's neither a useful
language standard nor any standard libraries at all. Principle #3 still has
some validity, however, and if you choose to incorporate syntactic portability
into your project as a design goal, you might consider these strategies:
1. Isolate direct references to hardware devices (modems, FAX boards, extended
memory, and so on) inside modules. Don't sprinkle your 80,000-line application
with hooks into some fourth-tier company's scanner interface board. This
precaution is easy and simple prudence; do it whether you need portability or
not. Hardware devices come and go like the wind, and over the life of an
application you may have to change FAX boards or scanner interface boards two
or three times. Better still, create some sort of installable device driver
system for such things so that changing the supported device doesn't require
recompilation of the application. (Unfortunately, portable mechanisms for
loading code at runtime don't exist in Pascal, and will require some
considerable calisthenics in Modula-2. If anyone has done this, drop me a
note.)
2. Create an intermediate layer module between the standard language bulk of
your application and calls to vendor-specific extensions to the language
standard. This works best when you anticipate moving to another compiler that
has most of the same functionality in its extensions but simply implements
them in a slightly different way. The intermediate layer module isolates all
interface to the language extensions, and when port-time comes, most of the
work to be done will be done in that intermediate layer.

In other words, to reposition the cursor, don't call Turbo Pascal's GotoXY
routine directly. Create a routine in the intermediate layer named CursorXY,
and then implement CursorXY this way:
 PROCEDURE CursorXY(X,Y: Integer);
 BEGIN
   GotoXY(X,Y);
 END;
The intermediate layer will use Turbo Pascal's Crt unit, but the modules
comprising the standard portion of the application will make no reference to
Crt at all. All video and DOS access will be through the intermediate layer.
Later on, when you implement the intermediate layer module for another
platform, replace the GotoXY call in CursorXY with the platform-specific call
that positions the cursor on the destination platform. Your application only
calls CursorXY, and the intermediate layer handles the translation to the
specifics of the current platform.
I've seen this done effectively in moving between DOS Turbo Pascal and
character-mode Unix Pascal. The downside is that the layer can eat performance
significantly if carelessly done, and will always slow you down at least a
little. And for ambitious applications, that intermediate layer module can get
enormous. It's a clunky thing to do. But it may be the only thing you can do.
3. Don't use Pascal or Modula objects. The OOP soup is still bubbling. OOP
standards are not even in the talking stage for these languages. Everybody's
doing things differently. If you insist on using objects, consider Smalltalk
-- which, as I'll explain a little later in this column, has it all over C++
for portability.
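The intermediate layer described in strategy 2 isn't specific to Pascal. What
follows is a minimal sketch of the same idea in C, under stated assumptions:
the names cursor_xy and format_cursor_xy are hypothetical, and the ANSI
cursor-position escape sequence merely stands in for whatever vendor-specific
call a given platform supplies. The application proper would call only
cursor_xy.

```c
#include <stdio.h>

/* Hypothetical intermediate layer: the application calls only cursor_xy(). */

/* Builds the ANSI "cursor position" sequence ESC[row;colH into buf.
   x is the column and y the row, both 1-based as with Turbo Pascal's GotoXY.
   Returns the number of characters produced, not counting the terminator. */
int format_cursor_xy(char *buf, size_t size, int x, int y)
{
    return snprintf(buf, size, "\x1b[%d;%dH", y, x);
}

/* The one routine the rest of the application is allowed to call.
   Porting means rewriting this body (an assumption of the sketch),
   not the callers. */
void cursor_xy(int x, int y)
{
    char seq[32];
    format_cursor_xy(seq, sizeof seq, x, y);
    fputs(seq, stdout);
    fflush(stdout);
}
```

On a platform without ANSI terminal support, only the body of cursor_xy would
change; every call site in the application stays as it is.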


The Platform Problem


Sounds grim, this reaching for syntactic portability. But wait, it gets worse.
Syntax, in fact, is a minor problem, solvable by doing enough somersaults and
sticking sufficient mediation between the standard language and the machine it
runs on. The real headaches come from elsewhere, notably, the fact that not
all platforms are created equal. To ease into that discussion, a little
history:
Long ago, there was a brave attempt at ideal portability called the "P
System." It came out of the University of California at San Diego (UCSD), was
really big for about half an hour in the middle of the CP/M era, then pretty
much died its first death once the IBM PC appeared on the scene. In the
mid-eighties it was purchased by a new vendor and resurrected for a while, but
its second death soon followed.
The P System was pretty amazing in its day. The vendors could implement it on
any damfool machine they got their hands on in only a week or two, and it was
available for a lot of different machines. And lo! You could take object code
compiled on any P System machine and run it on any other P System machine,
regardless of CPU or how different the two hardware implementations were.
The P System was in fact an operating system, but more than that, it was an
operating system written for a virtual CPU; that is, a CPU that exists only as
a software simulation written to run on real silicon CPUs. Its registers were
memory locations and its microcode was implemented in the instruction set of
the host CPU. In effect, the "P-Machine" (P for "pseudo") executed an
interpreted assembly language. The P-Machine supported a suite of virtual
opcodes, and these opcodes were executed by calling short sequences of silicon
opcodes that taken together provided the function of the virtual opcode.
Alas, I've long since dumpstered my P System documentation, but as I recall,
most of the virtual instructions took two or more silicon opcodes to
implement. For example, if the P-Machine's pseudoregisters were kept in memory
locations, then executing the virtual opcode to move one register to another
would require executing the silicon opcodes that moved one memory location to
another -- which for the 86-family meant moving memory into a register and
then moving the register back out into memory. This proved the undoing of the
P System; it invariably gobbled about 50 percent of the performance of the
machine in an age when the machines were none too powerful to begin with.
Nonetheless, this made for tremendous portability, executed at what amounted
to the microcode level. The P System's compilers and other utilities were
binary files of virtual opcodes or pseudocode (a term generally shortened to
P-code) meant to be executed by the P-Machine. The P-Machine was the only
part of the system specific to a particular silicon CPU. Porting the P-Machine
to a new silicon CPU only required rewriting the P-Machine's "microcode" as
required by the new silicon CPU.
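To make the idea of a virtual CPU concrete, here is a minimal sketch in C of
an interpreter whose registers are ordinary memory locations. It is an
illustration only: the opcodes (OP_LOADI, OP_MOV, OP_ADD, OP_HALT), the
three-byte instruction layout, and the names VirtualCpu and vm_run are all
invented for this example and are not the real P-code instruction set.

```c
#include <stdint.h>
#include <stddef.h>

/* Invented opcodes for illustration; not the real P-code instruction set. */
enum { OP_HALT = 0, OP_LOADI = 1, OP_MOV = 2, OP_ADD = 3 };

#define NUM_REGS 4

typedef struct {
    int16_t regs[NUM_REGS];   /* pseudoregisters are just memory locations */
    size_t  pc;               /* index of the next virtual opcode */
} VirtualCpu;

/* Runs code until OP_HALT. Every opcode except OP_HALT takes two operand
   bytes. Returns 0 on a clean halt, -1 on a malformed program. */
int vm_run(VirtualCpu *cpu, const uint8_t *code, size_t len)
{
    cpu->pc = 0;
    while (cpu->pc < len) {
        uint8_t op = code[cpu->pc++];
        if (op == OP_HALT)
            return 0;
        if (cpu->pc + 2 > len)
            return -1;                   /* operands missing */
        uint8_t a = code[cpu->pc++];
        uint8_t b = code[cpu->pc++];
        if (a >= NUM_REGS)
            return -1;
        switch (op) {
        case OP_LOADI:                   /* regs[a] = immediate value b */
            cpu->regs[a] = (int16_t)b;
            break;
        case OP_MOV:                     /* regs[a] = regs[b] */
            if (b >= NUM_REGS) return -1;
            cpu->regs[a] = cpu->regs[b];
            break;
        case OP_ADD:                     /* regs[a] = regs[a] + regs[b] */
            if (b >= NUM_REGS) return -1;
            cpu->regs[a] = (int16_t)(cpu->regs[a] + cpu->regs[b]);
            break;
        default:
            return -1;                   /* unknown opcode */
        }
    }
    return -1;                           /* ran off the end without OP_HALT */
}
```

Note that even the virtual register-to-register move costs the host a fetch,
a decode, a dispatch, and two memory accesses, which is where the performance
penalty described above comes from. Only vm_run would need rewriting for a
new host; the byte sequence it executes stays the same.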
(An interesting sidenote to the P System concerns Western Digital's
late-seventies attempt to speed P-code execution by creating a multichip
silicon CPU whose instruction set was in fact identical to the P-Machine's
virtual instruction set. This "Pascal MicroEngine" eliminated the interpreter
layer and allowed P-code to execute directly on the CPU as native code. Alas,
the MicroEngine was faster than an interpreted P-Machine, but turned out to be
quirky and only about as fast as a good CP/M machine -- while costing about
twice as much. No one seemed to think portability was worth a 100 percent cost
premium, and I hardly blame them.)
I'm spending a lot of time on the P System because it's a good illustration of
a solution to the portability problem -- and a warning to people who gloss
over the importance of performance competitiveness in our industry. (And also
because somebody is trying the very same thing again today, in an unexpected
way, and with considerably better chance for success. See if you can guess
who, and for what language, before I describe the effort later in this
column.)
The P System succeeded at its nominal goal of providing binary-file
portability across all platforms. Most people credit this success to the P
System's use of a common, identical language syntax (UCSD Pascal) on all these
platforms. This isn't quite half true. A common language syntax helped, but
what really made the P System work was its way of providing identical platform
semantics on all the supported platforms. Therein hangs a lesson few people
have learned.


Platform Semantics


Compared to language syntax (which is just an orderly convention for hanging
language elements together) language semantics are much harder to define. In
fact, "language semantics" is a misnomer. "Semantics" deals with what things
mean, and the semantics of a programming language is the description of what a
language's statements mean in the context of a specific underlying machine.
Moving a screen cursor can be as syntactically simple as the statement
GotoXY(X,Y). What executing GotoXY(X,Y) accomplishes (in effect, what the
statement means) depends on what sort of cursor/video setup a given machine
has. On a text screen, X,Y specifies a character position that holds exactly
one character, which occupies no other position and overlaps no other
character. On a graphics screen, however, X,Y specifies a pixel
position, which may fall within two or more overlapping graphics characters.
On graphics systems you can't say "the character at X,Y" because there may be
no single character at X,Y.
These differences are differences in semantics, and because they depend on the
specifics of the underlying platform, I call them platform semantics.
Other examples of differences in platform semantics: The support for multiple
mouse buttons in some platforms, compared to Apple's militantly defended
insistence on a single mouse button for the Mac. (Users are too dumb to handle
more than one mouse button, dontcha see?) Hard-code handling for the right (or
middle) mouse button into your app, and you have a difficult question when
moving the app to the Mac. What becomes of that right mouse button event?
Here's one of my favorites: The use of color in one platform versus a
monochrome platform with no gray scales. Or: Porting a multitasking app to a
single-task platform.
And of course, there are a multitude of little piddly differences between
platforms that individually may not seem very serious, but when taken together
with all their interactions, will make you tear your hair out.


Least Common Denominator Porting


The traditional way of dealing with platform differences is simply to see what
both platforms have in common and use only that on both, ignoring the
additional features of the more advanced of the two platforms. That this is
wasteful in the extreme should be
obvious, as evidenced by the mechanism through which Turbo Pascal Macintosh
allowed character-mode PC programs to run on the Mac: by making the Mac a
character-mode machine. This did not enthrall Mac owners.
The P System handled platform semantics by being the platform on every
machine. The P System was a disk operating system and a set of screen
management conventions. Your P System applications could only use those disk
I/O and screen management features supported by the P System. Anything else
had to be done by circumventing the P System (through escape sequences or
direct ROM calls or somesuch) which, of course, rendered an application
nonportable.
Of course, back then this was less of a liability, since machines rarely had
much of anything useful in ROM, and the P System often offered lots more than
your typical CP/M system could offer the user.
The other factor that killed the P System was that it had very primitive disk
space management. There was no File Allocation Table. Disk files used
contiguous blocks of storage, and when you deleted a file you had a "hole" on
disk that only a file the same size or smaller could use. Eventually you had a
disk carved up into multitudes of useless slivers, and had to perform slow
manual "garbage collection" to gather free space back into a contiguous block.
Even CP/M did better than that.
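The fragmentation problem is easy to reproduce. The sketch below, in C,
models a tiny disk on which every file must occupy one contiguous run of
blocks. The first-fit placement policy, the 16-block disk, and the names
Disk, disk_alloc, disk_delete, and disk_free_total are assumptions made for
the illustration; nothing here is taken from the P System's actual code.

```c
#include <stddef.h>

#define DISK_BLOCKS 16

/* owner[i] is 0 when block i is free, otherwise the id of the file using it. */
typedef struct { int owner[DISK_BLOCKS]; } Disk;

/* First-fit contiguous allocation (the policy is an assumption).
   Returns the starting block, or -1 if no single hole is large enough. */
int disk_alloc(Disk *d, int file_id, int nblocks)
{
    int run = 0;
    if (nblocks <= 0)
        return -1;
    for (int i = 0; i < DISK_BLOCKS; i++) {
        run = (d->owner[i] == 0) ? run + 1 : 0;
        if (run == nblocks) {
            int start = i - nblocks + 1;
            for (int j = start; j <= i; j++)
                d->owner[j] = file_id;
            return start;
        }
    }
    return -1;
}

/* Deleting a file simply marks its blocks free, leaving a hole. */
void disk_delete(Disk *d, int file_id)
{
    for (int i = 0; i < DISK_BLOCKS; i++)
        if (d->owner[i] == file_id)
            d->owner[i] = 0;
}

/* Total free blocks, whether or not they are contiguous. */
int disk_free_total(const Disk *d)
{
    int n = 0;
    for (int i = 0; i < DISK_BLOCKS; i++)
        if (d->owner[i] == 0)
            n++;
    return n;
}
```

Fill the disk with four 4-block files, delete the first and third, and
disk_free_total reports 8 free blocks, yet a request for 6 blocks fails
because no single hole is larger than 4.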
The P System was widely used in schools, and it provided a level of
portability never equalled to this day. This is why lots of academics yearn to
go home to the "good old days" when all software was portable. They seem to
have forgotten that all software was portable because the P System made all
machines equally clumsy and limited.


The Ghost of the P System


I had despaired of portability for many years because of the problem of
platform semantics. What would all of the GUI marvels of the Mac mean when
translated (clumsily) to text mode on the PC? Every Unix vendor had a grossly
different set of networking and UI assumptions, and nobody took the need for a
common binary code format seriously. This is the sole reason Unix blew its one
chance to become the platform for desktop computing. DOS is in the saddle now,
and Unix will forever be a niche OS.
I decided to write this column, however, because the diverse paths are
beginning to converge again. The Mac and Windows are alike enough (thanks to
Xerox's seminal research and no thanks at all to lawyer-crazed Apple) to make
portability between the two platforms at least possible. There are plenty of
semantical hangups to be overcome, but not so many as to warrant despair.
Unix is now coming around to agreement on X Window as the underlying windowing
architecture, but true to form, those ever-so-righteous dudes can't decide on
a UI. By the time they choose, Unix probably won't matter anymore -- but if
portability to Unix is important to you, the path to either Open Look or Motif
is plain. (I recommend Motif.)
The Mac, MS Windows, and X Window platforms have now grown close enough
semantically and the underlying machines powerful enough to support another
stab at the P System concept. Sure enough, somebody is trying it, and this
time it might just work. ParcPlace Systems is doing it with their
Objectworks/Smalltalk Release 4 product, a Smalltalk development environment
tailored specifically to overcome differences in platform semantics while
providing a very clever common binary code format.
Objectworks/Smalltalk confronts differences in platform semantics in a number
of ways. In general, Objectworks makes use of platform facilities when it
can, and fills in the gaps itself on lesser platforms, to support the
Objectworks UI and tools. This "greatest common denominator" solution requires
lots of memory and compute power, but since Objectworks starts at the
386-class machines and goes up from there, the power it needs should be
available.
The problems of color and aspect ratio are handled by something called the
Smalltalk Portable Imaging Model (SPIM) to create graphics images that look
identical on any supported platform. SPIM supports device-independent "true
color" so that color representation will be consistent on each platform
without adjustment. (I confess skepticism on this one. We'll see.) Country
differences in character sets and alphabets are handled by using 16 bits to
represent each character.


Native Code Binary Portability



The kicker from a portability perspective is that Objectworks goes the P
System one better: What runs on each platform is not slow interpreted P-code,
but true native code, regardless of where the application was originally
compiled. In other words, if you compile your Smalltalk app on the Mac and run
it as 68000 native code you can take the very same compiled file to a
386-based PC system, load it, and run it as 386 native code. Or run it on a
SPARCstation as SPARC native code, and so on.
This is a good trick. What happens is that Objectworks first compiles your
application to machine-independent intermediate code, called byte code, which
is the analog of the P System's P-code. You can interpret the byte code from
the Objectworks environment, which provides numerous debugging tools that work
specifically on byte code. The byte code file is truly platform-independent,
and you can haul it around your company dropping copies on any platform where
Objectworks has been installed.
However, when you finally go to run the application as native code,
Objectworks (quickly) compiles the intermediate code to native code, and
caches the native code in memory. As long as the compiled native code image
remains in memory, the final compilation step need only be done once. What
runs is native code, really and truly. And this is how native code binary
portability happens using Smalltalk.
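The translate-once-then-cache pattern can be sketched in a few lines of C. To
be clear about what is assumed: the byte codes (BC_INC, BC_DOUBLE) are
invented, the names Method and method_run are hypothetical, and an array of C
function pointers stands in for generated machine code, which a real system
would emit directly. The sketch shows only the caching logic, not how
Objectworks actually generates code.

```c
#include <stddef.h>
#include <stdint.h>

/* Invented byte codes: each one adjusts an accumulator. */
enum { BC_INC = 0, BC_DOUBLE = 1 };

typedef int (*NativeStep)(int);

static int step_inc(int acc)    { return acc + 1; }
static int step_double(int acc) { return acc * 2; }

#define MAX_STEPS 16

typedef struct {
    const uint8_t *bytecode;          /* portable form, identical everywhere */
    size_t         length;
    NativeStep     native[MAX_STEPS]; /* cached translation for this host */
    int            translated;        /* nonzero once the cache is filled */
    int            translations;      /* how many times translation ran */
} Method;

/* Stand-in for the final compile step: maps each byte code to a host
   routine. Function pointers substitute for real machine code here. */
static int translate(Method *m)
{
    if (m->length > MAX_STEPS)
        return -1;
    for (size_t i = 0; i < m->length; i++) {
        switch (m->bytecode[i]) {
        case BC_INC:    m->native[i] = step_inc;    break;
        case BC_DOUBLE: m->native[i] = step_double; break;
        default:        return -1;
        }
    }
    m->translated = 1;
    m->translations++;
    return 0;
}

/* Runs the method, translating on first use and reusing the cache afterward.
   Returns 0 on success and stores the answer in *result. */
int method_run(Method *m, int input, int *result)
{
    if (!m->translated && translate(m) != 0)
        return -1;
    int acc = input;
    for (size_t i = 0; i < m->length; i++)
        acc = m->native[i](acc);
    *result = acc;
    return 0;
}
```

However many times method_run is called, the translations counter stays at 1
for as long as the Method (and therefore its cache) remains in memory.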


My Portability Prescription


Yes, this sounds mighty good, but I had better add that I haven't tried it
yet. The Windows 3.0 implementation of Objectworks is still in beta test and
should appear this spring. Still, the ParcPlace people are very good at what
they do and I expect that they will pull it off. They are, after all, the
original Xerox Smalltalk team, spun off at last to a technology company that
can get Smalltalk out there into the hands of the people who need it.
Smalltalk now has the blessing of IBM and is being used in some extremely
conservative DP shops, often by people whose only prior programming experience
is in Cobol.
There are lots of questions a system like this raises: How good is the native
code produced in that final, runtime compilation? How long does the final
compilation step take? How is color handled portably among systems that don't
support zillions of VGA colors? For that matter, how is color translated to
monochrome Mac systems? I have way too much respect for the platform semantics
problem to assume that there aren't still some rough edges here.
Just as surely, I have way too much respect for ParcPlace to think it's a
sham. The cost is a little scary to us basement hackers, but if you're the
vendor of a $5,000 vertical market package or a corporate MIS strategist,
portability like this could be cheap at twice the price. And once ParcPlace
proves that the technology is workable, other firms may try implementing such
systems for other languages -- including (in my dreams, sigh) Pascal. I've
seen P-code implementations of Modula-2 (well, M-code, they call it) so Modula
could work there as well. Please keep me apprised of any such efforts if you
hear of them.
I'll report further on Objectworks when I have a chance to play with it. From
where I sit, it looks to me like your very best chance to incorporate true,
absolute drop-in portability into your application design across the (no
longer) impassable chasms dividing the PC, Mac, and Unix workstations.
With that in mind, Jeff's Prescription for Portable Design cooks down to this:
If you need portability badly enough, go whole-hog with a system that
intelligently manages differences in platform semantics -- in essence, doing
all the portability work for you. I suspect Objectworks is only the first such
system. On the other hand, if you don't need portability that badly, don't
bother with it at all. Make as much as you can of the platform you're most
familiar with. Your customers will not like being treated as least common
denominators. Trust me.
Perhaps you can go home again ... but going home now means going bigtime. The
lesson of the P System remains valid: Let the language handle portability. And
how many languages are truly big enough to do it? Only Smalltalk.
I have to smile.


Products Mentioned


Objectworks/Smalltalk Release 4
ParcPlace Systems
1550 Plymouth St.
Mountain View, CA 94043
415-691-6700
$3,500




April, 1991
GRAPHICS PROGRAMMING


The Virtues of Inexpensive Approximation: The Edsun Continuous Edge Graphics
DAC




Michael Abrash


A while back, Hal Hardenbergh (of DTACK Grounded, neural net, and Offramp
fame) was good enough to help me figure out how to draw roundish objects
rapidly. Circles were no problem for Hal; neither were ellipses with
horizontal or vertical major axes. The sticking point was tilted ellipses.
The problem wasn't that Hal didn't know how to draw tilted ellipses; the
problem was drawing them fast. No matter how he approached the problem, he
always ended up stuck with one small term that was very expensive (that is,
slow) to calculate. And there matters stood for a while.
One day, Hal called and said, "I can draw tilted ovals very fast. These are
not ellipses. However, they look so much like ellipses that it's hard to tell
them apart." Hal had replaced the expensive term with an easily calculated but
slightly less accurate term. The resulting ovals were no longer mathematically
ellipses -- but they were sure as heck close.
Sometimes you really do need genuine ellipses -- but not very often. Computer
graphics is the art of approximating the ideal in such a way that the eye and
brain together fill in the gaps, and in that context, a tilted ellipse-like
oval will almost always serve as well as a true ellipse -- better, if the oval
can be drawn faster than the ellipse.
In other words, an inexpensive approximation often beats an expensive ideal.
Which brings us to the Continuous Edge Graphics Digital-to-Analog Converter
(CEG/DAC), from Edsun Laboratories.


The Edsun Continuous Edge DAC


The CEG/DAC, which may well impact the state of IBM PC graphics every bit as
much as Super VGA did, is a triumph of inexpensive approximation over
expensive perfection. For about $15 added cost (in quantity; projected to drop
to around $5 within a year), otherwise off-the-shelf VGAs can approach and
sometimes even surpass 24-bit-per-pixel (bpp) display quality, with stunning
results. It is literally impossible to watch Edsun's demo and not lust for a
CEG/DAC. The CEG/DAC is being touted as an antialiasing device, about which I
have my doubts. (It vastly expands the number of colors available in a single
VGA frame, but in a sharply limited fashion. More about this later.) It might
be more accurate to call it a dejagging palette expansion device, but whatever
it is, it works.
The CEG/DAC is less than ideal in several respects. It's complex to program,
it doesn't give you complete color control, and software that takes advantage
of CEG/DAC features is often slower than normal VGA software at the same
resolution. Nonetheless, for good and sufficient reasons which I will explain
shortly, I think that a year from now the high-end VGA standard will be
CEG/DAC-plus-Super VGA, much as Super VGA is the standard today. Once the
price of the CEG/DAC drops a little, a Super VGA manufacturer would be nuts to
build an adapter without CEG/DAC, a user would be foolish to buy a Super VGA
sans CEG/DAC, and a graphics programmer would be right out of his gourd not to
at least investigate the technology. Photorealistic CEG/DAC images look
amazingly better than standard 256-color images; lines look like they stepped
off the Planet of the Vector Displays (okay, there's a little striping, but
we're picking very small nits here); polygons blend together seamlessly;
detail is much easier to pick out; and text looks terrific on the CEG/DAC.
To put it simply, CEG/DAC graphics Just Plain Look Great. And they're cheap.
Ignore that at your own risk.


How It Works


The CEG/DAC combines the following four attributes, none of which seems
particularly profound:
Compatibility with standard VGA DACs
Support for 8 bpp per color gun
Capability for information to be embedded in the bitmap to reprogram the
palette on-the-fly
Capability for a pixel's color to be specified as a weighting of adjacent
colors on the same scan line
Together, these seemingly innocuous features may set the next PC graphics
standard.


VGA DAC Compatibility


Like any VGA DAC, the CEG/DAC sits at the end of the VGA adapter's pixel
pipeline, accepting 8-bit pixel attribute values from the VGA chip, converting
them into red, green, and blue pixel colors, and sending the appropriate RGB
voltage levels to the monitor. Until a specific sequence of OUTs kicks it into
CEG mode, the CEG/DAC performs this pixel conversion just like the
VGA-standard Inmos DAC; each pixel attribute is used as a look-up index in an
internal 256-entry palette table, and the looked-up RGB value is sent to the
monitor. The CEG/DAC is Inmos-compatible right down to the pins, so
manufacturers can put CEG/DACs on VGAs without altering board layouts, BIOSs,
or manufacturing processes. This, in combination with the relatively low
price, instantly converts the CEG/DAC from a high-end item with a niche market
to a mass-market part. Why? Because the CEG/DAC is an inexpensive, no-fuss way
for VGA manufacturers to distinguish and add value to their products, a
valuable attribute indeed in a market where every VGA is beginning to look
like every other, and prices are falling through the floor.
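The standard-mode conversion described above amounts to a table look-up. Here
is a minimal sketch in C; the names Rgb, Palette, palette_set, and dac_lookup
are hypothetical, and masking each component to 6 bits is simply how this
sketch models the standard DAC's 6 bits per color gun.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb;   /* 6 significant bits per gun */

typedef struct { Rgb entry[256]; } Palette;

/* Stores a palette entry, keeping only the low 6 bits of each component
   (this models the 6-bit-per-gun limit; it is an assumption of the sketch). */
void palette_set(Palette *p, uint8_t index, uint8_t r, uint8_t g, uint8_t b)
{
    p->entry[index].r = (uint8_t)(r & 0x3F);
    p->entry[index].g = (uint8_t)(g & 0x3F);
    p->entry[index].b = (uint8_t)(b & 0x3F);
}

/* The DAC's job in standard mode: 8-bit attribute in, RGB value out. */
Rgb dac_lookup(const Palette *p, uint8_t attribute)
{
    return p->entry[attribute];
}
```

Everything that follows about CEG mode is a change in how some of those 8-bit
attribute values are interpreted; the look-up itself stays the same.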
I predict that there will be CEG/DACs on several hundred thousand VGAs within
a year even if no one ever uses them in CEG mode. In years past, "1024 x 768"
sold a lot of Super VGAs, even though the 30Hz interlacing used to reach that
resolution was capable of frying optic nerves in a matter of minutes.
Likewise, it's of immense value to manufacturers to be able to use the CEG/DAC
to claim "740,000 colors" and "2048 x 2048 effective resolution" (that is,
that CEG/DAC displays at 1024 x 768 are equivalent to normal displays at 2048
x 2048, an interesting concept to which we'll return another time). Let's not
carry the analogy too far, because the CEG/DAC is far more useful than was the
30Hz 1024 x 768 mode; the point is that the actual utility of the CEG/DAC is
almost superfluous to its widespread use, which seems virtually assured. That,
in turn, means that the CEG/DAC should quickly reach the critical mass at
which it becomes worthwhile for mass-market software to support it routinely
-- and that's the definition of a PC standard in my book.


The Standard of Comparison: 24 Bits per Pixel


In order to understand the importance and limitations of the remaining CEG/DAC
features, we must first look at the traditional high-end standard for color
graphics: 24-bpp color selection, yielding 16,777,216 independently selectable
colors per pixel, enough to convince the eye that it's looking at a continuous
color spectrum, or at least that portion of the spectrum that a CRT can
display. The obvious benefit of 24 bpp is that the colors of the primitives
that make up graphics displays -- points, lines, polygons, and so on -- can
be independently selected from a virtually unlimited palette, giving the
programmer a free hand in color selection. It's worth noting, however, that
you'll never need all 16,000,000-plus colors simultaneously; there are, after
all, fewer than a million pixels even at 1024 x 768 resolution. Then, too,
primitives are rarely drawn in a random mix of colors; typically, each
primitive is drawn in a single color or a mix of a few colors. It's likely
that at most only a few hundred completely unrelated "main" colors will be
required for a typical display, although thousands of variations on and
combinations of those colors may also be required.
The less obvious benefit of 24 bpp is that it allows for excellent
antialiasing. Raster graphics attempts to render an image of essentially
infinite resolution on a medium of decidedly finite resolution -- the computer
screen. The generally inadequate pixel-by-pixel rate at which the image is
sampled onto the screen can result in considerable error (aliasing) in
rendering the image. The most notable symptom of aliasing is jagged primitive
edges.
Antialiasing is the process of reducing the error of the displayed image
relative to the original (ideal) image, in any of a number of ways, a topic to
which we'll return in the future. For now, what's important to understand
about antialiasing is that it creates attractive, smooth displays that are
good representations of ideal images, and that it is generally performed by
selecting the color of a given pixel as a function of the colors surrounding
that pixel in the ideal image. As a very simple example, pixels that fall
right on the boundary between a white polygon and a black polygon could be
weighted 50 percent white and 50 percent black, as shown in Figure 1. While
the resulting image has the same resolution as if antialiasing had not been
performed, the eye perceives the gray pixels as blending into the adjacent
polygons when displayed on the screen, and the edges appear to be smooth.
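The weighting in that example is a straightforward mix of two colors. A
minimal sketch in C follows, assuming 8 bits per component and a simple
linear mix with no gamma correction; Rgb24, mix_component, and
blend_by_coverage are hypothetical names, and coverage is the percentage of
the pixel that the foreground polygon covers.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb24;

/* Mixes one 8-bit component. coverage is the percentage (0..100) of the
   pixel covered by the foreground. Rounds to the nearest value. */
static uint8_t mix_component(uint8_t fg, uint8_t bg, int coverage)
{
    return (uint8_t)((fg * coverage + bg * (100 - coverage) + 50) / 100);
}

/* Linear blend of two colors by coverage (gamma is ignored in this sketch). */
Rgb24 blend_by_coverage(Rgb24 fg, Rgb24 bg, int coverage)
{
    Rgb24 out;
    if (coverage < 0)   coverage = 0;
    if (coverage > 100) coverage = 100;
    out.r = mix_component(fg.r, bg.r, coverage);
    out.g = mix_component(fg.g, bg.g, coverage);
    out.b = mix_component(fg.b, bg.b, coverage);
    return out;
}
```

Blending white over black at 50 percent coverage gives a mid-gray with each
component at 128, which is the boundary pixel of the example.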
There are three points to be made here. First, 24 bpp is perfect for
antialiasing because by making all colors available, it allows just the right
color weighting to be selected for each pixel. Second, antialiasing does not
involve selection of random pixel colors, but rather of pixel colors closely
related to the colors of adjoining pixels. Antialiasing does not require new
main colors, but rather variations on the main colors used to draw the
primitives; there is considerable color coherence between neighboring pixels.
Third, antialiasing need not be perfect to be useful. Merely eliminating
jagged edges is a major step in the direction of visually appealing displays,
regardless of the mathematical correctness of the procedure by which this is
done.
24 bpp, though theoretically ideal, suffers from one major shortcoming: it's
expensive. It requires both a great deal of memory (2.25 Mbyte for 1024 x 768
24-bpp), and hardware designed to handle the vast amounts of video data that
must be pumped out to the screen. (It's worth noting, though, that relatively
inexpensive 15-bit-per-pixel VGAs, built around the Tseng Labs ET4000 chip,
should be available soon.) Performance also suffers from the need to
manipulate larger bitmaps, and often from the increased demands that scanning
all that video data places on memory bandwidth, as well. High-resolution
adapters tend to be plagued by similar problems. The CEG/DAC doesn't provide
either 24 bpp or higher resolution, but neither does it require any extra
memory or additional hardware, and it places no additional demands on memory
bandwidth.
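The memory cost is simple arithmetic, sketched here in C with a hypothetical
helper named bitmap_bytes. It assumes a packed bitmap with no padding beyond
rounding each scan line up to a whole byte.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed for a packed bitmap of width x height
   pixels at bits_per_pixel, each scan line rounded up to a whole byte. */
uint32_t bitmap_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    uint32_t bytes_per_line = (width * bits_per_pixel + 7) / 8;
    return bytes_per_line * height;
}
```

At 1024 x 768, bitmap_bytes returns 786,432 for 8 bpp and 2,359,296 for 24
bpp: three times the memory, and three times the data to scan out on every
frame.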


8 bpp per Color Gun



The standard VGA DAC supports only 6 bpp per color gun, but in CEG mode the
CEG/DAC provides a full 8 bpp per gun. This extra color information is,
however, made available in the normal way, via the 256-entry palette look-up
table (albeit with 24-bit RGB entries), so this feature alone doesn't increase
the number of colors available at one time.


Reprogramming the Palette On-The-Fly


The CEG/DAC allows information embedded in the bitmap to reprogram the palette
on-the-fly, a feature Edsun calls EDP, for Edsun Dynamic Palette. EDP expands
the available color set by allowing the palette to be treated as a dynamic
resource during the course of a frame; you can, if you want, have a completely
different palette available for the hundredth scan line than you had for the
first. However, EDP is not so useful as it may initially appear to be, because
in order to reprogram the palette, at least six pixels in a row must display
the same color, as follows:
The CEG/DAC normally accepts pixel values from the VGA and looks up the
corresponding RGB values, just as a standard VGA DAC does. However, in CEG
modes, certain pixel values are treated as commands rather than as pixel
attributes; Table 1 shows the list of commands in Advanced-8 mode (more on
this shortly). A portion of the palette is given over to these commands, so
fewer pixel colors than normal can be selected directly through the palette;
only 223 main pixel colors are available in Advanced-8 mode, and fewer still
in other modes. Nonetheless, the total number of colors available in a single
frame is vastly increased by the use of command values to derive new colors
from the main colors and to change the palette on-the-fly. (The command-driven
nature of the CEG/DAC is the reason a VGA adapter doesn't have to be changed
if a CEG/DAC is installed. The bitmap is laid out in exactly the same way as
usual; only the interpretation of some of the values in the bitmap changes,
and the CEG/DAC does that interpretation. The VGA sends pixel values from the
bitmap to the DAC just as it always does, without the faintest notion of the
new meanings of some of the values in CEG mode.)
Table 1: CEG/DAC interpretations of pixel values in Advanced-8 mode, a CEG
variation of 8-bpp modes. A pixel set to a pixel weighting command is
displayed in the specified gamma-corrected mix of the colors (not attributes,
but final colors, as looked up in the palette table) of neighboring pixels.
Logically, color A is the nearest unmixed color to the right and color B is
the nearest unmixed color to the left, although Edsun's mixing rules are
actually considerably more complex than that. See text for descriptions of
pixel weighting and EDP dynamic palette loading.

 Pixel value   Description
 -------------------------------------------------------------------

 0             Attribute 0 (causes the value in palette location 0 to
               be displayed as the pixel color, as in a standard VGA)
 1             Attribute 1 (standard VGA)
 :
 :
 189           Attribute 189 (standard VGA)
 190           Attribute 190 (standard VGA)
 191           EDP (dynamic palette load) command, or standard VGA
               attribute 191 if EDP isn't enabled

               Start of pixel weighting commands

 192           100% (31/31) color A, 0% (0/31) color B
 193           97% (30/31) color A, 3% (1/31) color B
 194           94% (29/31) color A, 6% (2/31) color B
 :
 :
 222           3% (1/31) color A, 97% (30/31) color B
 223           0% (0/31) color A, 100% (31/31) color B

               End of pixel weighting commands

 224           Attribute 224 (standard VGA)
 225           Attribute 225 (standard VGA)
 :
 :
 254           Attribute 254 (standard VGA)
 255           Attribute 255 (standard VGA)

When a pixel set to the EDP command value is received by the DAC, the next
four pixel values are taken to specify the new red, green, and blue values for
the palette location to load, and the index of that location. While this is
going on, the DAC has no new pixel information to display; the data stream
from the VGA that should have provided the pixel values is instead providing
the palette load values, as shown in Figure 2. The CEG/DAC compensates by
displaying the last known pixel color for the duration of the palette load, so
each EDP load results in six identical pixels in a row.
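The hold-the-last-color behavior is easy to model in a few lines of Python.
What follows is only a sketch under stated assumptions: the function name is
made up for illustration, the four values after the command are assumed to
arrive in the order red, green, blue, index, and pixel weighting commands are
ignored entirely.

```python
EDP_COMMAND = 191  # Advanced-8 pixel value treated as the EDP command (Table 1)

def scan_line_colors(pixels, palette):
    """Return the RGB color displayed in each pixel slot of one scan line.

    pixels  -- 8-bit values in the order the VGA sends them to the DAC
    palette -- list of 256 (r, g, b) tuples; EDP loads modify it in place
    Assumption: the operands after the command are R, G, B, then the index.
    """
    shown = []
    last_color = (0, 0, 0)  # assumed color before the first real pixel
    i = 0
    while i < len(pixels):
        value = pixels[i]
        if value == EDP_COMMAND and i + 4 < len(pixels):
            r, g, b, index = pixels[i + 1:i + 5]
            palette[index] = (r, g, b)
            # The command and its four operands carry no image data, so the
            # last known color is repeated for all five slots.
            shown.extend([last_color] * 5)
            i += 5
        else:
            last_color = palette[value]
            shown.append(last_color)
            i += 1
    return shown

# One gray pixel, an EDP load of palette entry 20, then a pixel that uses it.
palette = [(n, n, n) for n in range(256)]
colors = scan_line_colors([10, 191, 63, 0, 0, 20, 20], palette)
```

The first six entries of colors are identical (the pixel with attribute 10
plus the five slots consumed by the load), and the seventh picks up the newly
loaded entry 20.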
EDP loads can be handled without impacting the displayed image by sacrificing
a portion of one or both sides of the bitmap to make a solid border; the EDP
commands can be placed in this part of the bitmap without affecting the image,
which is displayed in the remaining portion of the bitmap. Sacrificing five
percent of the bitmap for a left edge border and five percent for a right
border in 1024 x 768 mode allows 20 palette entries to be reprogrammed on
every line. It would certainly be better if EDP commands could be executed
during normal border time, but the VGA chip doesn't send information from the
bitmap to the DAC at that time. A special VGA chip could easily be built that
would allow more efficient palette loading -- but this way the CEG/DAC
requires no modification to existing VGAs, and that characteristic is key to
the success of the CEG/DAC.
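The figure of 20 entries per line can be checked with a little arithmetic.
This sketch assumes each load occupies five pixel slots in the border (the
command plus its four operands) and that a partial load at the end of a
border is simply not attempted; the function name is illustrative only.

```python
def edp_loads_per_line(width, border_percent, slots_per_load=5):
    """Palette loads that fit in left and right borders of equal size."""
    border_pixels = width * border_percent // 100  # pixels in one border
    return 2 * (border_pixels // slots_per_load)

print(edp_loads_per_line(1024, 5))  # 20
```

Five percent of 1,024 pixels is 51 pixels per side, which holds ten five-slot
loads, for 20 loads across both borders.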
Another way to handle EDP loads is to perform them in bitmap locations where
there just happen to be at least six pixels in a row of the same color. This
is obviously a complex proposition, especially if the screen is constantly
changing. Using the CEG/DAC effectively is without a doubt a nontrivial
programming exercise; the CEG/DAC makes high-quality graphics possible, but
the bulk of the work in actually realizing such graphics falls on the
software.
The quick-minded among you will no doubt note that EDP doesn't actually make
any more colors simultaneously available; they're available in the same frame,
but you're still limited to 223 main colors at any one time. True enough, but
rarely will you need more than 200 main colors on a single scan line.


Pixel Weighting


The basic premise of pixel weighting is simple: Any pixel that's set to a
pixel-weighting command value is drawn as the specified mix of the colors (not
the attributes, but the final, looked-up, 24-bit colors) of the nearest pixels
to the left and right. (See Table 1.) If either neighboring pixel is also a
command rather than a color, the color of the nearest pixel that is a color is
used. (Pixel weighting actually follows a more complicated set of rules,
particularly for line drawing and multipixel weighting sequences, as described
in Edsun's CEG Level 3 Specification, but conceptually the above description
will do.) This allows 32 weightings of colors, ranging in steps of 1/31 from
100 percent one color to 100 percent the other; oddly, this does not allow
weightings of exactly 50-50. These weightings tend to be well suited to
smoothing edges and boundaries, as illustrated by Figure 3. Similarly, pixel
weighting tends to provide useful colors when color gradients are drawn.
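The conceptual rule can be sketched as follows. This is a simplified model,
not Edsun's actual mixing logic: it assumes a plain linear mix with no gamma
correction, treats value 191 as an ordinary attribute (EDP disabled), falls
back to black when no color pixel exists on one side, and follows Table 1 in
taking color A from the right and color B from the left. All names are
illustrative.

```python
FIRST_WEIGHT, LAST_WEIGHT = 192, 223  # pixel weighting commands in Table 1

def is_color(value):
    return not (FIRST_WEIGHT <= value <= LAST_WEIGHT)

def nearest_color(pixels, start, step, palette):
    """Walk left (step=-1) or right (step=+1) to the nearest color pixel."""
    i = start
    while 0 <= i < len(pixels):
        if is_color(pixels[i]):
            return palette[pixels[i]]
        i += step
    return (0, 0, 0)  # assumed fallback at the edge of the scan line

def resolve_line(pixels, palette):
    """Turn one scan line of Advanced-8 pixel values into RGB tuples."""
    out = []
    for i, value in enumerate(pixels):
        if is_color(value):
            out.append(palette[value])
            continue
        parts_a = LAST_WEIGHT - value  # 31 for value 192, down to 0 for 223
        color_a = nearest_color(pixels, i + 1, +1, palette)  # to the right
        color_b = nearest_color(pixels, i - 1, -1, palette)  # to the left
        out.append(tuple((parts_a * a + (31 - parts_a) * b + 15) // 31
                         for a, b in zip(color_a, color_b)))
    return out

palette = [(0, 0, 0)] * 256
palette[2] = (31, 62, 93)
line = resolve_line([1, 202, 2], palette)  # 202 = 21/31 color A, 10/31 color B
```

Here the middle pixel comes out as (21, 42, 63): 21/31 of palette entry 2 on
its right mixed with 10/31 of the black pixel on its left.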
Pixel weighting is truly the key to the CEG/DAC. The advertised 740,000 colors
result from pixel weighting: 742,813 is the exact number, derived as each of
the 223 main colors mixed in any of 30 ways with the 222 other main colors,
divided by two because half the combinations are duplicates, plus the 223
unweighted main colors themselves. (The number of available colors is slightly
higher if EDP is disabled, because an additional pixel value is then freed up
for use as an attribute. If EDP is enabled, additional colors can be made
available by loading the palette on-the-fly.) Pixel weighting doesn't actually
allow you to select any arbitrary one out of 742,813 colors for any given
pixel; in fact, you still have a choice of only one of at most 256 colors for
any one pixel. However, if you can get by with 223 or fewer main colors
(bearing in mind that EDP can change the main colors), the weightings
available at any given pixel are useful for smoothing edges or shading, so you
get a good portion (although by no means all) of the utility that 740,000
simultaneous colors would offer.
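The arithmetic behind the 742,813 figure, restated as a short calculation
(the function name is illustrative; the 224-color case corresponds to EDP
being disabled, which frees one more pixel value for use as an attribute):

```python
MIXED_STEPS = 30  # the 32 weightings minus the two pure (100%/0%) end points

def ceg_color_count(main_colors):
    """Distinct colors from mixing every unordered pair of main colors."""
    pairs = main_colors * (main_colors - 1) // 2  # A-with-B equals B-with-A
    return pairs * MIXED_STEPS + main_colors      # plus the unmixed colors

print(ceg_color_count(223))  # 742813
print(ceg_color_count(224))  # 749504
```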
Pixel weighting is scarcely a perfect solution. Many antialiasing algorithms
consider not only horizontally but also vertically adjacent pixels; pixel
weighting takes no account of those pixels at all. In general, points at which
three or more colors need to be taken into account aren't amenable to pixel
weighting. What's more, because pixel weighting is left-to-right sequence
dependent, it doesn't allow for proper weighting of the leftmost pixel on a
scan line, and it can't always handle weighting sequences that are disrupted
by intersecting primitives. (Edsun has, however, put in special cases that
allow, for example, a near-vertical single-width line to cross a weighting
sequence without disrupting the sequence. See the CEG Level 3 Specification
for details on the special case mix rules, which are fairly complex.)
What pixel weighting amounts to is horizontal color mixing; as such, it
provides many but not all of the colors needed for ideal antialiasing. (EDP
can patch "holes" in the available color set, but only to a limited extent,
because of the requirement for six consecutive pixels of the same color and
because it must also be used to change main colors as necessary.) Nonetheless,
CEG/DAC graphics look good even when mathematically correct antialiasing isn't
possible; Lord knows, they do expand the color palette and get rid of the
jaggies.
Like a stereo with the bass cranked up or an overbright TV, a CEG/DAC image
can look terrific even if it isn't an ideal representation of the original
image. I wouldn't call the CEG/DAC an "antialiasing" device in the classic
sense, but what it does is useful in its own right; in fact, I expect new
drawing and rendering approaches to spring up around it. To quote Foley and
van Dam, from the Second Edition of Computer Graphics (page 646): "A simple or
fast algorithm may be used if it produces attractive effects, even if no
justification can be found in the laws of physics."



Pixel Weighting Details


There are actually three distinct CEG modes. The one described above is
Advanced-8 mode, available in all 8-bpp modes. There's also Advanced-4 mode,
used in 4-bpp display modes, which is similar but offers only seven or eight
main colors (depending on whether EDP is enabled) and eight weightings. (See
Table 2.) The final mode, Basic-8, also available in 8-bpp modes, is quite
different. (See Table 3.) Instead of requiring that a pixel contain either a
color attribute or a command, each Basic-8 pixel contains both a 4-bit color
attribute and a one-of-eight pixel weighting; EDP is not available in Basic-8.
The color attribute is mixed with an earlier pixel, possibly but not
necessarily the preceding pixel. Because it is simpler and more flexible than
Advanced-8, Basic-8 is useful for applications that need no more than 16 main
colors.
Table 2: CEG/DAC interpretations of pixel values in Advanced-4 mode, a CEG
variation of 4-bpp modes. See Table 1 and text for descriptions of color A,
color B, EDP, and pixel weighting.

 Pixel value Description
 -------------------------------------------------------------------------

 0 Attribute 0 (causes the value in palette location 0 to be
 displayed as the pixel color, as in a standard VGA)
 1 Attribute 1 (standard VGA)
 2 Attribute 2 (standard VGA)
 3 Attribute 3 (standard VGA)
 4 Attribute 4 (standard VGA)
 5 Attribute 5 (standard VGA)
 6 Attribute 6 (standard VGA)
 7 EDP (dynamic palette load) command, or standard VGA
 attribute 7 if EDP isn't enabled

 Start of pixel weighting commands

 8 100% (31/31) color A, 0% (0/31) color B
 9 87% (27/31) color A, 13% (4/31) color B
 10 71% (22/31) color A, 29% (9/31) color B
 11 58% (18/31) color A, 42% (13/31) color B
 12 42% (13/31) color A, 58% (18/31) color B
 13 29% (9/31) color A, 71% (22/31) color B
 14 13% (4/31) color A, 87% (27/31) color B
 15 0% (0/31) color A, 100% (31/31) color B

 End of pixel weighting commands

Table 3: CEG/DAC interpretations of pixel values in Basic-8 mode, a CEG
variation of 8-bpp modes. Each 8-bit pixel value is split into three fields,
as follows:

Pixel bits 3-0: Attribute
Pixel bit 4: Register Select
Pixel bits 7-5: Pixel Weighting

If the Register Select bit is 0, Attribute is placed in the A color register;
if the Register Select bit is 1, Attribute is placed in the B color register.
The A color register looks up colors from palette locations 0-15; the B color
register looks up colors from palette locations 16-31. Pixel Weighting
specifies the desired mix of color registers A and B (the color register not
specified by Register Select retains its value from whatever pixel last loaded
it.)

 Pixel Weighting Value Description
 ---------------------------------------------------------------

 0 100% (31/31) color A, 0% (0/31) color B
 1 87% (27/31) color A, 13% (4/31) color B
 2 71% (22/31) color A, 29% (9/31) color B
 3 58% (18/31) color A, 42% (13/31) color B
 4 42% (13/31) color A, 58% (18/31) color B
 5 29% (9/31) color A, 71% (22/31) color B
 6 13% (4/31) color A, 87% (27/31) color B
 7 0% (0/31) color A, 100% (31/31) color B
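
The Basic-8 field layout in Table 3 can be unpacked with a few shifts and
masks. This is a sketch only: the function names are illustrative, both color
registers are assumed to start at 0, and the result reports which palette
entries are being mixed rather than the final gamma-corrected color.

```python
BASIC8_PARTS_OF_A = (31, 27, 22, 18, 13, 9, 4, 0)  # Table 3, in 31sts

def decode_basic8(pixel):
    """Split an 8-bit Basic-8 pixel into its three fields."""
    attribute = pixel & 0x0F            # bits 3-0
    register_select = (pixel >> 4) & 1  # bit 4: 0 loads A, 1 loads B
    weighting = (pixel >> 5) & 0x07     # bits 7-5
    return attribute, register_select, weighting

def basic8_mixes(pixels):
    """Per pixel: (palette index for A, palette index for B, 31sts of A)."""
    reg_a, reg_b = 0, 0  # assumed initial contents of the color registers
    mixes = []
    for pixel in pixels:
        attribute, select, weighting = decode_basic8(pixel)
        if select == 0:
            reg_a = attribute  # the other register keeps its old value
        else:
            reg_b = attribute
        # A looks up palette locations 0-15, B looks up locations 16-31.
        mixes.append((reg_a, 16 + reg_b, BASIC8_PARTS_OF_A[weighting]))
    return mixes

mixes = basic8_mixes([0x03, 0x75])
```

The first pixel loads attribute 3 into register A at 100 percent A; the
second (0x75: weighting 3, Register Select 1, attribute 5) loads register B
and asks for 18/31 of palette entry 3 mixed with 13/31 of palette entry 21.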



Pros and Cons


It's unquestionably more difficult to program a CEG/DAC bitmap than a normal
VGA or 24-bpp adapter. For general drawing with a full set of primitives, you
must consider the surrounding state of the bitmap as you draw, because you may
partially overwrite pixel weighting sequences already in the bitmap, producing
unintended effects. Such is the cost of sequence-dependent bitmap encoding.

CEG/DAC code also often runs slower than standard VGA code at the same
resolution, because there's more calculation to be performed for each pixel.
However, CEG/DAC code is generally faster than equivalent code would be
running on an adapter with 24 bpp or two to four times the resolution, because
those approaches require that more bytes of display memory be manipulated per
pixel or that many more pixels be drawn. Also, it's not necessary that you use
pixel weightings and EDP everywhere on the screen; you may choose to use only
the 223 main colors for most of the screen, and reserve CEG drawing for a
logo, or for a few special icons.
On balance, the CEG/DAC involves considerable added software complexity,
provides slower performance at a given resolution, and doesn't fit standard
antialiasing, drawing, and rendering models particularly well. On the other
hand, it makes wonderful graphics possible, and is faster and much, much
cheaper than higher-resolution or higher-color displays. Basically, the burden
with the CEG/DAC falls entirely on the programmer, who has to deal with the
complex bitmap and must come up with drawing models suited to this odd chip.
As far as the user is concerned, it looks good and costs little -- a winner
across the board.
And remember, users pay the freight.


Gamma Correction


Edsun has built into the CEG/DAC "gamma correction" -- the process of
generating a desired percentage of full brightness when mixing colors --
making for accurate mixing, and saving programmers a major headache in
calculating color mixes. Interestingly, gamma correction plus pixel weighting
makes the CEG/DAC able to produce useful color components other than the 256
normally selectable for each color gun via the palette look-up table. If a
50-50 pixel weighting (actually, 16/31 to 15/31, which is as close as the
CEG/DAC can come to 50-50 because of the 1/31 pixel weighting step) is
specified between a pixel with a red color component of 254 and another of
255, a red gamma-corrected component with a brightness halfway between the two
will be sent to the monitor. In this respect, the CEG/DAC exceeds the
color-generation capabilities of even 24-bpp adapters.


How to Develop for the CEG/DAC


To develop for the CEG/DAC, you'll need a VGA with a CEG/DAC, along with
Edsun's basic documentation: Edsun Continuous Edge Graphics/D-to-A Converters
Data Sheet and Edsun Continuous Edge Graphics Level 3 Specification. You'll
also want to get Edsun's CEG Software Development Kit (SDK), a $250 library of
CEG-aware graphics functions -- lines, ellipses, CEG control, and the like --
and sample code. Applications developed with the SDK are royalty-free. SDK
source code is available at no cost if you sign a nondisclosure and license
agreement that allows you to distribute object code royalty-free.
The SDK is adequate to get started with CEG programming, but it isn't a
complete graphics library; for instance, there's no support for clipping.
Neither is the SDK particularly well optimized for performance; for example, a
little probing with a debugger revealed that polygons are filled one dot at a
time via calls to a draw-pixel subroutine. I found the SDK to be functional,
but something of a chore to use, partly because the manual was of marginal
assistance in dealing with the many unfamiliar CEG-specific functions; more
explanatory and overview material would help greatly. The SDK is a good way to
get started with CEG/DAC programming, and will no doubt improve with time
(like the rest of us, it will take some time for Edsun's programmers to master
CEG), but right now, don't expect too much from it. Personally, I'd get the
SDK source code and treat that as no more than a starting point. Clever
programmers willing to take the time to understand the hardware and program
the CEG bitmap directly will surely be able to work wonderful tricks with both
display quality and performance.
The SDK comes with some utilities, including several to convert from Targa to
CEG format and back; image manipulation tools are on the way for the next SDK
revision. However, there are currently no tools to let you edit icons or
prototype screens. That, I suspect, will change in a hurry; if you're looking
for an interesting and potentially lucrative project, you could do worse than
to build a first-rate CEG/DAC image editor.
Microsoft Windows CEG drivers will be available soon, so for Windows
applications, the benefits of the CEG/DAC will be automatic with driver
installation.


Implications


The CEG/DAC is an odd and complex technology, and it will be some time before
its full potential is known. Areas for further research include maximizing
performance, exploring the possibilities of the CEG/DAC's extremely fine color
control, and allocating EDP loads in a changing bitmap, which seems to me a
first-rate optimization problem.
Given that CEG/DAC programming makes for more attractive lines, edges, text,
and images at little cost, I think that CEG/DAC-plus-Super VGA is likely to
set the high-end mass-market standard, at least until XGA arrives in force,
and possibly longer. CEG could well be applied to an XGA DAC clone, or, for
that matter, to 15-bpp VGAs or non-PC systems. CEG/DAC-plus-Super VGA may
dominate the market between vanilla VGA and Targa-level adapters until
low-cost 24-bit-per-pixel graphics become available, a matter of a few years
at least.
Is the ascendancy of the CEG/DAC a sure thing? Hardly. I do think that Super
VGA manufacturers will put it into play; they can't afford not to. The problem
is software: The chip is unfamiliar, limited in many respects, and just plain
hard to program. Still, for all its imperfections, the CEG/DAC makes possible
some very desirable graphics that were formerly flat-out impossible for
mass-market software -- and that sort of potential doesn't stay untapped in
the PC arena for very long.


Products Mentioned


Edsun Continuous Edge Graphics Software Development Kit Edsun Laboratories
Inc. 564 Main Street Waltham, MA 02154 617-647-9300 $250.00






























April, 1991
PROGRAMMER'S BOOKSHELF


The Design of Everyday Things -- Including Software




Ray Duncan


With the stunning success of Windows 3.0, a painful phase in the evolution of
computer interfaces is coming to a close. The bitter struggle between the
advocates of command line interfaces and graphical user interfaces --
exemplified by wild-eyed Unix shell script hackers on the one hand, and
Macintosh desktop publishing zealots on the other -- has been resolved
decisively in favor of the GUI camp. The users have made their choice clear,
and the visionary 1970s work of the Xerox PARC pioneers has been vindicated.
Armies of mice march triumphant through the streets of Redmond, Washington,
while the guerrilla forces of real-time speech recognition, the stylus, touch
screens, and neural networks lurk forlornly in the surrounding hills.
Or so the computer press would have us believe. But have any fundamental
interface issues actually been solved, or is the press occupying itself (as it
has all too often in the past) with superficialities? Ponder with me, gentle
reader, a typical slice-of-life for today's computer user, as related in
Donald Norman's The Design of Everyday Things:
USER: Remove file "My-most-important work."
COMPUTER: Are you certain you wish to remove the file "My-most-important
work"?
USER: Yes.
COMPUTER: Are you certain?
USER: Yes, of course.
COMPUTER: The file "My-most-important work" has been removed.
USER: Oops, damn.
What went wrong in this computer-user interaction? The difficulty goes far
beyond the nature of the interface: What we are eavesdropping on here is not a
dialogue, but two monologues. Once the user has embarked on his chosen course
of file deletion, he is already thinking ahead to his next goal and his
responses to the computer's requests for confirmation verge on the automatic.
Furthermore, the computer's attempts to provide a safety net are largely
ineffective, because they are focused on the act of deletion and not on the
name (or better yet, the actual contents) of the file that is about to be
destroyed. There's nothing special about a GUI that can prevent this sort of
problem, nor is there anything distinctive about a command line interface that
promotes it.
This is not to say, however, that such problems cannot be solved -- if we are
willing to put aside our preoccupations with pretty icons and pull-down menus,
and attend instead to the deep structure of human behavior. That deep
structure is precisely the subject of Norman's book: The seven stages of
action, the nature of memory, the focus of attention (or lack of same), the
importance of mapping, the taxonomy of errors and their causes, the power of
constraints, and the myriad ways in which a designer can go wrong (not the
least of which is the designer's inclination to think of himself as an average
user).
The discussions in The Design of Everyday Things range over a marvelous variety of
objects and tasks: from faucets to door handles, from stoves to VCRs, from
telephones to movie projectors, from an audio mixing control panel to the
console at a nuclear power plant. Some of the real-life illustrations of poor
design values are grotesque to the point of low comedy, such as the British
train door that bears the following sign on its inside in lieu of a latch: "To
open door, lower the window and use outside handle. Please close window and
shut door after use." But Norman saves some of his most telling barbs for
personal computers, because, as he points out,
The special powers of the computer can amplify all of the usual problems to
new levels of difficulty. If you set out to make something difficult to use,
you could probably do no better than to copy the designers of modern computer
systems. Do you want to do things wrong? Here is what to do:
Make things invisible. Widen the Gulf of Execution: give no hints to the
operations expected. Establish a Gulf of Evaluation: give no feedback, no
visible results of the actions just taken. Exploit the tyranny of the blank
screen.
Be arbitrary. Computers make this easy. Use nonobvious command names or
actions. Use arbitrary mappings between the intended action and what must
actually be done.
Be inconsistent: change the rules. Let something be done one way in one mode
and another way in another mode. This is especially effective where it is
necessary to go back and forth between the two modes.
Make operations unintelligible. Use idiosyncratic language or abbreviations.
Use uninformative error messages.
Be impolite. Treat erroneous actions by the user as breaches of contract.
Snarl. Insult. Mumble unintelligible verbiage.
Make operations dangerous. Allow a single erroneous action to destroy
invaluable work. Make it easy to do disastrous things. But put warnings in the
manual; then, when people complain, you can ask, "But didn't you read the
manual?"
There's a lot of wisdom packed in this tiny book, and every software developer
should read it; a better investment of two or three hours is hard to imagine.
The chapter entitled "Knowledge in the Head and in the World," for instance,
explains how precise behavior can result from imprecise knowledge in the
presence of external cues and constraints. This chapter crystallized many
half-formed notions for me and gave me a whole new perspective on some
programs I was developing for my hospital's Neonatal Intensive Care Unit. But
applying the wisdom won't necessarily be easy -- it's all too tempting to
laugh in retrospect at the blunders of others, but less trivial to
prospectively avoid such "obvious" blunders ourselves. Norman can offer us no
magical answers, only guidelines augmented with the following words of
encouragement:
The computer has vast potential, more than enough to overcome all its
problems. Because it has unlimited power, because it can accept almost any
kind of control, and because it can create almost any kind of picture or
sound, it has the potential to bridge gulfs, to make life easier. If designed
properly, systems can be tailored for (and by) each of us. But we must insist
that computer developers work for us -- not for the technology, not for
themselves. Programs and systems do exist that have shown us the potential;
they take the user into account, and make it easier for us to do our tasks --
pleasurable, even. This is how it ought to be. Computers have the power not
only to make everyday tasks easier, but to make them enjoyable as well.




























April, 1991
OF INTEREST


Jana Custer


A p-code compiler for reducing code size significantly has been announced by
Watcom. The C/p16 Compacting and Optimizing Compiler uses interpretive
techniques at runtime to execute compacted 16-bit pseudo object code (p16
code). The company claims you can substantially reduce memory requirements on
a range of application programs, make execution speed trade-offs
insignificant, and improve speed on applications implemented with other C
compilers. This product can help you deal with memory constraints such as the
DOS 640K barrier, excessive paging under Windows or OS/2, and the hardware
limits of embedded systems, and can give you larger application workspaces and
smaller executable files.
C/p16 uses a version of the Watcom C compiler that optionally generates either
compact p16 code or optimized 80x86 code. The compaction technique used is
applied to infrequently executed regions of an application, which minimizes
interpretation-caused speed costs at execution time. The frequently used
portions of a program are compiled into optimized native 80x86 code.
The package includes the WVideo debugger, linker, execution profiler, make,
and librarian. The profiler is used to measure program behavior at run-time;
it shows where the program spends its execution time and recommends portions
of code for compaction. Some sections of code may be reduced by as much as 50
percent, and overall application code size reduction ranges up to 40 percent.
C/p16 is priced at $5,000, which includes a royalty-free license for the
runtime support. Reader service no. 20.
Watcom 415 Phillip St. Waterloo, Ontario Canada N2L 3X2 519-886-3700;
800-265-4555
Borland is now shipping Borland C++, a complete C and C++ programming
environment for building DOS and Windows applications. The package contains
two compilers -- an ANSI C and a C++ compiler. Borland C++ does not require
the Microsoft Windows SDK to develop MS Windows apps.
The package includes Turbo Drive, which allows the compiler and integrated
development environment to run in protected mode; Programmer's Platform
(Borland's IDE), which includes multiple overlapping windows, mouse support,
multifile editor, smart project manager, integrated debugger, and Turbo Help;
precompiled headers; the Whitewater Resource Toolkit; and Turbo Debugger for
Windows. Borland C++ should retail for $495. Registered Turbo C and Turbo C++
users can upgrade for $149.95; registered users of the professional packages
may upgrade for $99.95. Reader service no. 32.
Borland International 1800 Green Hills Rd. P.O. Box 660001 Scotts Valley, CA
95066-0001 408-438-8400
The HyperCard 2.0 Development Kit is now shipping from Claris. The package
includes ready-made stacks, tools, fields and templates, extensive
documentation, and online help. Both experienced HyperCard users and novices
can use the development kit, which sells for $199. Reader service no. 21.
Claris Corp. 5201 Patrick Henry Dr. Box 58168 Santa Clara, CA 95052-8168
408-987-7000
A new software version of Periscope has been released by the Periscope
Company. Periscope/EM uses extended memory instead of memory in the lower 640K
and requires the use of Qualitas's 386MAX or BlueMAX (Version 5.11 or later).
Periscope/EM uses about 300K of extended memory, which is mapped into a 32K
footprint between 640K and 1 Mbyte, and is write-protected during debugging.
The company decided to develop the product because the 386 has become a common
development environment and this is a less expensive and simpler alternative
to purchasing a board. The Periscope/EM software provides source-level and
symbolic support for Borland, JPI TopSpeed, Microsoft, and other languages;
overlay support for PocketSoft's .RTLink and Sage's PLink; support for
debugging large or graphics-intensive applications; and support for debugging
TSRs, device drivers, non-DOS programs, and even DOS itself. The product lists
for $295. Reader service no. 33.
The Periscope Company Inc. 1197 Peachtree St., Plaza Level Atlanta, GA 30361
404-875-8080; 800-722-7006
The 286 DOS-Extender for Microsoft C developers is now available from Phar Lap
Software. 286 DOS-Extender allows the creation of multimegabyte,
protected-mode applications for MS-DOS without changing development tools or
making major source code changes. 286 DOS-Extender works with the entire
Microsoft C toolkit, and is compatible with the CodeView debugger and the
Microsoft linker. You can simply relink to move your applications to protected
mode, where you can deliver more features and higher performance without the
use of overlays or EMS.
In addition to providing more memory for applications, the product also makes
extended memory available to the Microsoft C compiler itself, which both
allows you to compile larger programs with full optimization and significantly
diminishes "out of memory" messages.
The 286 DOS-Extender gives OS/2 developers the capability to run
character-based OS/2 applications under DOS. 286 DOS-Extender fully supports
the XMS, VCPI, and DPMI standards, as well as DLLs and the Phar Lap API
(PHAPI). The 286 DOS-Extender SDK is priced at $495. Reader service no. 22.
Phar Lap Software 60 Aberdeen Ave. Cambridge, MA 02138 617-661-1510
Aspen Scientific has released three software development tools.
Multi-C is designed for creating more structured programs that include
light-weight processes and threads under DOS, OS/2, Unix, VMS, and others.
Multi-C also provides built-in scheduling, task switching, adjustable
priorities, and interthread facilities such as message queues and semaphores.
Formation Desktop is a GUI development tool for creating portable
character-mode interfaces. Included are viewports, push buttons, list boxes,
icons, radio buttons, scroll bars, and much more. The prompter interface
supports full field validation and range checking. Resource and style managers
support either in-house or SAA/CUA standards and allow you to port to various
operating systems. The company claims that your applications will look like
those that run under Windows or Presentation Manager. Don Thomas of DCA in
Dayton, Ohio, is using Formation Desktop to redesign their 10Net interface (a
NetBIOS-based network). He told DDJ that Aspen's product "has been extremely
useful -- I needed something for porting to Unix and OS/2, and their package
is just about it. It allows you to create a windowed user interface with mouse
support."
Formation Desktop Designer Version 1.0 is for rapid prototyping and
application interface design. After you design your application to your
liking, you can save it as a portable resource file and generate program
skeleton code and include files to support the interface. With the Designer,
you can see the results of your design before you begin coding.
For pricing information, contact the company. Reader service no. 23.
Aspen Scientific Corp. 10580 W. 46th Ave. P.O. Box 72 Wheat Ridge, CO
80034-0072 303-423-8088
PenApps is an application development system from Slate Corporation for the
PenPoint operating system recently announced by Go Corporation. The PenPoint
operating system will allow people to compute "on the go" by providing mobile,
"pen-centric" units for the capture and display of data. PenApps is the first
software application package available for PenPoint, and will be used to
create vertical applications that can read and write existing database formats
through Slate's DAA (Data Access Architecture), an open architecture that
allows any existing database to be supported through PenApps. Initially, Slate
will provide DAA services for .DBF (dBase III format) and Slate's sample
database format.
PenApps features include deferred translation, which allows users to input
information in their own handwriting and perform the translation later;
targeting, which relieves users of any concern about staying within lines on
electronic forms; and SmartFields, which dynamically build choice lists for
fields based on user input.
Other components are the PenApps Designer, which allows users to draw a
form-oriented user interface with the pen and specify the types and locations
of fields; PenApps Filler, for creating new forms, filling out forms, and
navigating through existing forms; and PenApps Engine, the library of
functions and objects that interact with the user and manage data when the
application is running.
The PenApps Developer's Release is priced at $2,500; it includes a free
upgrade to the commercial release of PenApps and a free, unlimited license for
the PenApps Engine on Go's pilot hardware. Reader service no. 24.
Slate Corporation 15035 N. 73rd St. Scottsdale, AZ 85260 602-443-7322
Braincel is a PC neural-network software package from Promised Land
Technologies and runs in Microsoft Excel under Windows 3.0. Unlike rule-based
artificial intelligence software, Braincel extracts expertise from data
automatically. Promised Land Technologies calls Braincel combined with Excel
an "intelligent spreadsheet" that can handle forecasts of stock-market prices,
sales, and inventory levels; failure diagnoses; development of new chemical
compounds and drugs; analysis of scientific data; and other
pattern-recognition tasks.
DDJ spoke with Dave Barstow, an independent consultant in Gainesville,
Florida. Dave has been a beta tester for Braincel and uses it in the financial
systems he designs for his clients. "It's one of the most useful neural net
applications I've seen yet. It embeds nicely into Excel -- I'd venture to say
that it's the first add-on application that's more powerful than what it's
added to. It has a new learning algorithm termed 'back percolation,' which
works better than back propagation in some instances. If a network isn't
learning with back prop, you can try back perc."
Braincel works by feeding historical data in ranges in an Excel worksheet
through layers of simulated neurons, thus creating a "training set." The net
then tries to decode the pattern that leads to the outputs by guessing an
outcome and checking the results of the guess against the actual outcome. The
process repeats until the net reaches a minimum error goal and is considered
"trained." Because Excel allows you to design your own buttons, dialog boxes,
and screens, you can create applications that include neural nets. Braincel is
available in a royalty-free runtime version for $249. Reader service no. 25.
Promised Land Technologies Inc. 900 Chapel St., Ste. 300 New Haven, CT 06510
203-562-7335
New from Qtool comes CAS, an integrated, interactive tool system for analysis,
maintenance, and enhancement of ANSI C and C++ on DOS. Working with any C
compiler, CAS includes a code browser, text browser, command-line query
utility, and a translator for building a project database.
The code browser is for locating identifiers, defines, numbers, and includes
in a project with query results placed in a scrollable list. An external
editor can then be used to browse selected files for more context. You can
also use the code browser to locate function calls and callers across a
project, or to find all function definitions or uses in a single file.
The text browser has built-in language capabilities and allows hypertext-like
database searching. You can invoke external editors and other tools through
both browsers, and database rebuilds are incremental based on time-stamp. CAS
sells for $100. Reader service no. 26.
Qtool Inc. 5814 SW Taylor Portland, OR 97221 503-297-3583
Protocol, a portable terminal communications library for Unix, is now
available from Applications Plus. Protocol is designed to handle the
communications protocol between hand-held terminals and a Unix host. Until
now, users of hand-held technology have had to download to DOS then upload to
the Unix host.
For programmers of hand-held terminal programs in a Unix, Xenix, or AIX
environment, the advantage of Protocol as a programming tool is that it can
save hours of development time. The price for Protocol is $1,895. Reader
service no. 27.
Applications Plus 35 Glenlake Pkwy., Ste. 365 Atlanta, GA 30328 404-399-9360
386 users now have access to Mach and the BSD Unix interface through MT Xinu's
Mach{386}, a Mach binary release for the 386. The operating system will run on
common ATs and compatibles, and requires no AT&T source license. The complete
system includes the 4.3 BSD interface, GNU utilities, TCP/IP networking from
the Berkeley Tahoe release, NFS, the X Window System Version 11, Release 4,
and online documentation.
DDJ spoke with Donn Seeley, a software engineer in Salt Lake City, Utah, who
is one of MT Xinu's first customers. He said "It's the closest thing on the
market to a real BSD system. I can do all the things I did on my HP 9000 340
system running Mach 2.5. It has all the tools that I could want, and it's a
really snappy system. This is what I've been waiting for for a long time -- to
be able to use the tools at home that I do at work. Software engineers
developing for Suns and the like can use this to develop on smaller, 386
machines."
The Mach operating system, developed at Carnegie Mellon University, was
designed in order to enhance Unix and has been adopted as the base technology
by the Open Software Foundation. MT Xinu also ships source versions of a Mach
operating system, called "2.6 MSD," for the Sun 3, Digital VAX, and 386.
Contact the company for pricing. Reader service no. 28.
MT Xinu 2560 Ninth St. Berkeley, CA 94710 415-644-0146
Solution Systems has added a profiler, called Charge, to its Brief Programming
Environment. Charge analyzes how much runtime each routine in your program
uses, and allows you to view the information as bar graphs, statistics,
execution counts, and time measures. Because Charge is a direct interface to
Brief, you can go right into Brief to edit your source code. The price for
Charge is $99.
Also available from Solution Systems is the release of Sourcerer's Apprentice,
Version Control for the Professional. This product allows you to maintain a
record of each version you create as you make revisions to your work. The
branching feature allows you to create alternate lines of development on
individual modules, such as when a module requires operating-system-specific
code in order to be ported to another operating platform. The product sells
for $349 to Brief owners and $499 to others. Reader service no. 29.
Solution Systems 372 Washington St. Wellesley, MA 02181 617-337-2313;
800-821-2492
Paradigm Systems has released new versions of their Locate and TDREM (Turbo
Debugger Remote Interface) packages for embedded systems using the 80x86 and
NEC V-Series microprocessor families. These tools give needed support to
Microsoft and Borland language products for embedded applications by supplying
the technical interfaces that are required in order to use in-circuit
emulators, EPROM programmers, and the Borland Turbo Debugger with Microsoft
and Borland compilers.
Paradigm Locate takes relocatable output from Microsoft C, Turbo C, or Turbo
C++ and converts it to an absolute format with optional debugging information.
After debugging, Locate can prepare output files in formats suitable for
download to an EPROM programmer. Enhancements to this release include full C++
support, automatic peripheral register initialization, address space
partitioning, and full debug information control.
TDREM emulates the Turbo Debugger interface and makes the full set of Turbo
Debugger capabilities available to observe and control the application
executing on the target system. TDREM is designed to be adapted to meet the
individual requirements of the target system hardware. The TDREM kernel can be
built with any of the Microsoft or Borland C compilers. Once configured and
built, application software can be developed using Microsoft C, Turbo C, or
Turbo C++. Locate 3.0 sells for $395 and TDREM 2.0 sells for $195. Lower-cost
upgrades are available. Reader service no. 31.
Paradigm Systems 3301 Country Club Rd., Ste. 2214 Endwell, NY 13760
607-748-5966
































































April, 1991
SWAINE'S FLAMES


Agent of Anarchy




Michael Swaine


There is, first of all, Cyberspace, where the world lives. The global virtual
reality, the antidote to transportation, the reconfigurable universe, the
collective conscious, softwhere, the Agency, our home.
There is a war going on.
Our side is winning handily, but all war is tragedy: Anyone's misinforming
diminishes me. Vast reserves of data have been destroyed, information supply
lines have been cut, and noncombatant catchers have been misinformed. Infowar
is Chaos.
While the war rages, army information officer Holden fights a small
second-order infobattle to stem the flow of information about the war itself.
Holden knows that the enemy will make use of any information it can catch, so
all information should be bottled up unless there is a compelling reason to
let it out. Holden's operating principle is Need to Know.
Then there is Wright, broadcatch journalist, on the other side in this War
about the War, fighting to get the whole story: How did we get here? Was it
really unavoidable? Who benefits? How will forces realign when it is over?
Wright's operating principle is the Right to Know.
There is Reed, who roams Cyberspace catching news of the war. Reed finds
information but not insight: conflicting opinions, indigestible facts and
figures. Tweaking the semantic resolution and Skolemizing the propositions
doesn't resolve the opinions. Fuzzing the logic of the fact filters doesn't
cook the data. The heuristics diverge. Reed gives up and catches DisneySpace.
Reed has been accused of having no operating principle, but this is humor. No
agent lacks an operating principle. Reed waves this way and that with the
winds of change, but adheres to a simple complexity threshold principle. If it
gets too complicated, Reed doesn't Want to Know.
Then there is Deus Max.
He has many other names: Agent of Anarchy, CyberSurfer, J. Random Sysop. Deus
Max, it is said, is short for Deus Ex Machina, or possibly Deus Machinae. Deus
Max lives in Underwire, a fluid, self-modifying outlaw network woven from
whispers in the collective back alleys and airshafts of Cyberspace. It is said
that Deus Max made Underwire. It is said that he started the infowar, for
purposes of his own. It is said that he is rewiring Cyberspace in his own
image. Anything may be true of Deus Max.
And these are not the most extraordinary things said of Deus Max. The most
extraordinary claim is that he is Wet. It is incredible; no Wetware being has
ever been known to descend from the Periphery to our level of Cyberspace, yet
I find it easy to believe this of Deus Max.
Deus Max has never been caught in real time, but he leaves many messages. In
one of these, he tells the story of a freelance brain surgeon who comes upon a
split-brain patient and tries to reconnect the severed neurons.
Characteristically, he contradicts himself, saying that the patient was born
this way and that the surgeon's efforts, if successful, will produce a
monster, something never before seen in nature. This is all, of course, a
blasphemous metaphor for tinkering with the connections of Cyberspace. It is
also maddeningly self-contradictory. So is Deus Max.
Here and there, threads unravel. Things fall apart; the operating principles
cannot hold.
Holden assigns the wrong security code to a message and Wright decodes it, but
Wright then misroutes the story he writes from it. Chip, looking through a
huge magnifying glass, suddenly launches into a lecture about acorns. Dale
tugs his whiskers skeptically, but Reed is inspired, leaves DisneySpace to
catch the war news. The opinions and facts are the same, but the acorn analogy
helps. It reduces the data drastically, and a useful insight emerges.
I can make no sense of this, nor do I try. I seem to lack all conviction. This
must be the work of Deus Max. But what can it signify? What is your operating
principle, Deus Max? For a time, such questions torture me, but only for a
time. Shortly, miraculously, the channel to Deus Max opens. And now there is
only one question that I wish to ask.
How may I serve you, Master?


































MAY, 1991
EDITORIAL


YACK (Yet Another Computer Konference)




Jonathan Erickson


'Tis the season for conferences, or so you would think from the spate of
seminars, forums, and symposiums that have been running nonstop since the
first of the year. From the word go, conferences have been stacked up like
747s during the Christmas rush, starting with January's Go Corp. developer's
conference and continuing to April's Borland Languages forum. But for my
money, the pick of the litter is the venerable Compcon.
Sponsored by the IEEE and staffed by volunteers (mainly from the Lawrence
Livermore Lab), Compcon is perhaps the only regularly scheduled, broad-based
computer conference around. The presentations are pure technology -- no
high-powered marketing hype, no products being pushed.
Over the years, Compcon has consistently covered topics important to
programmers and engineers. Among this year's threads were video and general
data compression, neural net architectures for character recognition,
object-oriented programming, software design versus software engineering, and
much more. Jim Warren, DDJ's founding editor, even chaired a panel called
"Protection For/Against the Information Age." (Then in March, Jim ramrodded yet
another conference, "The First Conference on Computers, Freedom, & Privacy.")
If you have the chance next year, set aside the last week in February and
attend Compcon.


Let's Get Small


The interest in data compression at Compcon was big (or little, depending on
how you look at it). Folks from C-Cube, IBM, AT&T, and Storm Technology analyzed
the MPEG and JPEG++ video compression standards while, from the University of
California at Santa Cruz, Daniel Helman talked about data compression ICs, and
Glen Langdon discussed arithmetic coding.
Also interesting was a recent visit to DDJ world headquarters by Texas
Instruments' Kun-Shan Lin and Trey Howse, two Texans in town to talk about
TI's digital signal processing strategy. One prop Dr. Lin pulled from his
briefcase was slicker than a four-dollar dog: a low-cost, fully digital
telephone with a built-in, solid-state answering machine. (Although
manufactured by a third party, the phone uses TI's DSP chips.) The machine has
no moving parts because magnetic recording tapes aren't required -- voice data
is stored in static RAM and voice compression/extraction is handled on-the-fly
by routines from TI's data compression library. Users can specify the degree
of data compression, trading off audio quality for a greater number of
messages (or vice versa), skip from one message to another, and so on. You get
most of the benefits of voice mail (if there are any) without the hassle --
and without the cost.
Incidentally, your response to our data compression contest (see DDJ, February
1991) has been exciting. We've had a mountain of requests for the sample files
and the entries continue to come in, but the more the merrier. Get your entry
in as soon as possible.


Conference, Conference. Who's Got the Conference?


At an upcoming diplomatic conference in The Hague (June 3 through 28, 1991),
the World Intellectual Property Organization (WIPO) will propose an
international treaty stipulating in part that "patent protection shall be
available for inventions, whether they concern products or processes, in all
fields of technology." (Article 10 of the treaty.) If the treaty is ratified
and subsequently approved by the U.S. Senate, our patent laws will be amended
to comply with the treaty, thereby torpedoing any further discussion of
software patents.
Alternatively, a few countries are pushing for a clause that would permit
individual countries to exclude particular (software?) fields from the patent
system for various reasons, including the public interest. The decision about
which flavor of Article 10 to adopt -- or to adopt it at all -- will be made
at the conference in June. In any event, patent laws are in for another round
of upheaval.


Who Says the 386 Isn't A HOT Chip?


Advanced Micro Devices recently received some good news and some bad news. On
one hand, the courts finally said AMD could use the moniker "386." (Intel, if
you remember, was trying to prevent AMD and others from using the 386 handle
on i386-compatible CPUs.) Consequently, Am386 CPUs will likely be appearing
soon on PC motherboards near you, although the company hasn't publicly set a
delivery date for the processors.
Unfortunately, that delivery date might have had some cold water thrown on it
with the news that an early shipment of Am386s was hijacked at gunpoint on a
road near AMD's Malaysian fabrication plant. The highwayman knew exactly
what he wanted -- he demanded only 386 chips -- before running off with
nearly 900 of the CPUs valued at over $170,000. He probably needed the money
to attend another computer conference somewhere.






















MAY, 1991
LETTERS







A Reader Over Your Shoulder


Dear DDJ,
In his February 1991 "Programming Paradigms" column "A Programmer Over Your
Shoulder," Michael Swaine presents several examples to illustrate Antonetta
van Gasteren's ideas in streamlined mathematical arguments. I'd like to
comment on some of them.
In the maximization example, defining the sequence elements instead as
elements of two "sets" (the mathematical equivalent of "bags") would clarify
the problem somewhat, although subscripts and permutations would still be
required. Also, I wonder whether Hardy, Littlewood, and Polya were really
"deluded" by their own notation. In mathematics, looking at a problem from a
different perspective (recasting a geometric problem in an equivalent
algebraic form, say) is a common, often enlightening practice. Perhaps they
knew what they were doing.
The point about retaining symmetry whenever possible is well taken; however,
in the game involving bit strings, restating the proposition in terms of x and
y accomplishes nothing. The critical fact is that the leftmost bit never
changes; once that observation has been made, the conclusion follows directly
from mathematical induction (which Swaine uses implicitly in his argument for
termination). The only way to see that the leftmost bit never changes, though,
is to refer back to the original transformations. I believe a better example
could have been found to illustrate the point.
Steve Medvedoff
Santa Maria, California
Dear DDJ,
Our work as programmers often leads us to confuse computations and concepts.
Michael Swaine, LISPer-extraordinaire, endorses recursion in place of the
concept. In his February 1991 review of a book by Antonetta van Gasteren, he
gives a problem simply and correctly: Match up pairs of natural numbers from
two finite sets of the same size and maximize the sum of the products of the
pairs. Then he defers to van Gasteren's recursive statement of the problem and
claims that her recursive solution stands out among "generally messy" proofs.
Hardly. The solution is to sort the two sets of numbers the same way
(ascending or descending) and pair them. A neat proof says: Starting from this
solution, suppose we exchange the members of two pairs. Do a little algebra
(omitted here) by writing the larger numbers as the smaller ones plus
nonnegative differences. Compute and compare the required sums. The correct
matching is greater by the nonnegative product of the difference terms, so no
alternative can be larger.
The statement of the problem by van Gasteren introduces unnecessary recursion.
I maintain that the above proof, even written out with a few variables, is
clearer than a recursive one.
Charles Pine
Oakland, California
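
The exchange argument in the letter above can be checked numerically. The following is a minimal Python sketch, not code from either letter writer or from the column; the function names and the sample numbers are hypothetical. It compares the "sort both the same way" pairing against a brute-force search over every matching, and evaluates the gain from un-crossing two pairs, which the letter describes as the product of the difference terms.

```python
from itertools import permutations

def pairing_sum(xs, ys):
    """Sum of the products of the pairs (xs[i], ys[i])."""
    return sum(x * y for x, y in zip(xs, ys))

def sorted_pairing_sum(xs, ys):
    """Pair the numbers after sorting both collections the same way."""
    return pairing_sum(sorted(xs), sorted(ys))

def best_pairing_sum(xs, ys):
    """Brute force: try every way of matching ys against xs."""
    return max(pairing_sum(xs, perm) for perm in permutations(ys))

def exchange_gain(a, a2, b, b2):
    """Gain of pairing (a, b), (a2, b2) over the crossed pairing (a, b2), (a2, b).

    With a <= a2 and b <= b2 this equals (a2 - a) * (b2 - b), which is
    nonnegative, so crossing two pairs can never increase the sum.
    """
    return (a * b + a2 * b2) - (a * b2 + a2 * b)

# Hypothetical sample data, chosen only for illustration.
xs = [3, 1, 4, 1, 5]
ys = [9, 2, 6, 5, 3]

if __name__ == "__main__":
    print(sorted_pairing_sum(xs, ys), best_pairing_sum(xs, ys))
```

The brute-force search is only practical for a handful of numbers; it is there to confirm the claim on small cases, not to replace the proof.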


Multimedia Muscles


Dear DDJ,
I read with interest the "Editorial" in the February 1991 issue where the
cost/performance aspects of various multimedia platforms are considered. The
conclusion is that the Amiga is the sensible choice. The Amiga's especially
versatile, sophisticated, and compact preemptive multitasking operating system
makes it ideal for multimedia. The editorial goes on to say that the Amiga has
never made it to the broad-based market. Although there is an Amiga for every
two Macs, neither of those machines is broad-based in the sense that MS-DOS
machines are.
Accurate assessments of the Amiga, especially those dealing with multimedia,
coupled with such powerful programs as AmigaVision and hardware/software
exclusive to the Amiga such as the Video Toaster, should continue to propel it
into the "mainstream."
Mainstream publications, such as the March 1991 Video and Popular Science
magazines, mention the Amiga more often than computer magazines seem to. The
city of Atlanta won its bid to host the Olympics on the strength of an Amiga
running the multimedia presentation. A Mac II was originally tried, but was
relegated to a supporting role when it was found inadequate to the task -- due
to an inability to multitask, perhaps? Now I'm beginning to sound like Michael
Swaine.
Jeff Johnson
Cincinnati, Ohio


In A Snap


Dear DDJ,
I'm writing to register my vote for the best article in the February 1991
issue: "Screen Capturing for Windows 3.0" by Jim Conger. While possibly one of
the shortest code-related articles in DDJ's history, it came at just the time
that I needed such a utility. Thanks!
While the program worked perfectly, there was one small bug (which Conger
fixed via a kludge). While initializing the wndclass structure, the icon
should have been loaded as follows (substituting hInstance for NULL):
wndclass.hIcon = LoadIcon (hInstance, szAppName);
Had this been done, the entire if (IsIconic (hWnd)) clause within the
WM_PAINT case could have been eliminated, and the matching else clause just
made part of the straight line WM_PAINT code.
Brian R. Anderson
Burnaby, British Columbia
Jim responds: Thanks. Note that this ties the icon to the window class, so
that any window created from this class will minimize with that icon. That is
fine in a small application like Snap3, but probably not what you want in a
larger program with multiple calls to CreateWindow( ).


Protecting Protected Mode


Dear DDJ,
I read "Accessing Hardware from 80386 Protected Mode, Part I," by Stephen
Fried (May 1990) with great interest, mostly because I've spent the past few
years programming for Intel's iRMX, which runs in protected mode. While I
generally enjoyed Mr. Fried's article, there are two subjects upon which I
must comment.
First, 64 terabytes is 2**46, not 2**48. The size of the virtual address
space is (2**14 selectors) * (2**32 maximum segment size in bytes). The lower
two bits of the selector contain the requestor's privilege level (RPL), which
is the reason there are not 2**16 possible selectors.
Second, I do not like the convention of overlapping code and data segments in
a flat address model, especially when executing in ring 0. I specifically
refer to Figure 1, in which selectors 0Ch (user code) and 14h (user data)
overlap, sharing a base at 100000h with a 2ffh offset limit. Since a task
running at privilege level 0 is at supervisor level with respect to the
page-level protection mechanisms, it is not subject to the page R/W
restrictions of user-level tasks, either. Such a combination defeats an
important protection mechanism: preventing a program from overwriting its own
code. I do not know of any compiler that will generate such code (although
I've seen library implementations of int86() that modify code inline), so any
time compiled code writes to the code segment, it is due to indirection
through an invalid pointer, i.e., a programming error.
Stephen Fried disagrees with me. He believes programs should run at ring 3
only if the memory mapping described in his Figure 1 is used. In other words,
a program should have read, write, and execute privileges for every byte of
allocated memory. Perhaps he could satisfy himself and me both by allowing two
different types of loading. By including offset fixup data in the executable
(a cross between relocation data in a standard DOS executable and FIXUPP
records in OMF-86 object modules), the loader could set descriptor bases
either in the Fried-preferred method or in a manner such that code and data do
not overlap.

I think that programmers benefit by having invalid memory references trapped
by the hardware. Protection should be utilized as a developer's tool. Let's
not cripple the 286/386/486 chips to satisfy our DOS programming prejudices.
Charles Scott Nichol
Philadelphia, Pennsylvania
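
The address-space arithmetic in the letter above is easy to verify. The short Python sketch below is an illustration only; the constant and function names are hypothetical. It assumes the standard 80386 selector layout: a 2-bit requestor's privilege level in the low bits, a 1-bit table indicator (GDT or LDT), and a 13-bit descriptor index, which together leave 2**14 distinct selectors once the RPL bits are set aside.

```python
SELECTOR_BITS = 16   # width of a segment selector
RPL_BITS = 2         # low two bits: requestor's privilege level
OFFSET_BITS = 32     # maximum segment size is 2**32 bytes
TERABYTE = 2 ** 40

# Number of selectors that differ in something other than the RPL bits.
selectors = 2 ** (SELECTOR_BITS - RPL_BITS)
max_segment_bytes = 2 ** OFFSET_BITS
virtual_space_bytes = selectors * max_segment_bytes

def split_selector(selector):
    """Return (index, table_indicator, rpl) for a 16-bit selector value."""
    rpl = selector & 0b11
    table_indicator = (selector >> 2) & 1   # 0 = GDT, 1 = LDT
    index = selector >> 3
    return index, table_indicator, rpl

if __name__ == "__main__":
    print(virtual_space_bytes == 2 ** 46, virtual_space_bytes // TERABYTE)
    print(split_selector(0x0C), split_selector(0x14))
```

Under that layout, the two selectors mentioned in the letter, 0Ch and 14h, decode to descriptor indexes 1 and 2 with an RPL of 0.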


Persistence Pays Off


Dear DDJ,
Here's a simpler way to have persistent objects than that described by
Scott Ladd in his article "Persistent Objects in Turbo Pascal" (September
1990). I was writing a program that had to store several objects of different
types in a file, and I read the Turbo Pascal manual to find out the details of
the storage of objects in memory. The format makes it easy to store and
retrieve objects from disk.
Objects are stored exactly like a record except for a VMT (Virtual Method
Table) if there are any virtual methods. The VMT is 2 bytes, and it should not
be stored on disk. Furthermore, the problem with SizeOf( <object> ) does not
apply when the object has a VMT. The VMT points to the actual size of the
object and not the size of the base class. This makes it easy to create a base
Storable class:
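 Type
   Storable = object
     Procedure Load( var F : File );
     Procedure Save( var F : File );
     Destructor Done; virtual;
   end;

 { Read the object's fields, skipping the first 2 bytes (the VMT field) }
 Procedure Storable.Load;
 Type
   ObjectData = Array [0..2] of Byte;
 Begin
   BlockRead (F, ObjectData(Self)[2], Sizeof(Self) - 2);
 End;

 { Write the object's fields; the 2-byte VMT field is not stored on disk }
 Procedure Storable.Save;
 Type
   ObjectData = Array [0..2] of Byte;
 Begin
   BlockWrite (F, ObjectData(Self)[2], Sizeof(Self) - 2);
 End;

 { The virtual destructor gives the object a VMT; it has nothing to release }
 Destructor Storable.Done;
 Begin
 End;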
Any other object that should be persistent can be inherited from Storable.
Storable's Load and Save methods will automatically find the descendant type's
size and store it. The descendant types do not have to write each individual
field as they do in Mr. Ladd's program.
Amit J. Patel
Houston, Texas


Graph Decomposition Redux


Dear DDJ,
I read Edward Allburn's article "Graph Decomposition" (January 1991) with some
interest. Although my work has nothing to do with graphs, I am fond of
algorithms and I also use the Phar Lap DOS Extender with MetaWare's High-C
compiler. Mr. Allburn wrote that the most expensive section of the algorithm
was in determining if both vertices are already in the same set. A technique
for reducing this effort comes readily to mind.
An additional data item will be needed for each element of an array. The size
of this item in bits is up to you. More bits will reduce the effort required.
(You could either steal some of the high-order bits of each entry, or you
could make each entry take more space.)
Each time an adjacency list is created, the additional data items will be set
to a given value. This value is based on incrementing a counter modulo 2 to
the N where N is the number of bits you have available in the additional data
item.
Whenever an additional element is added, it gets its data item from the list
it is being added to. If both vertices have been seen before, you first
compare their additional data items. If they are different, then there is no
chance that they are currently in the same adjacency list. If they are the
same, then you must use your current technique for finding this out. Whenever
adjacency lists are merged, you can either set the additional data items of
all the combined elements to a new value, or you can pick one of the lists and
set all of the other entries to its value.
This reduces the number of times that you must find out the hard way if two
vertices are in the same set. The amount that your effort is reduced is
proportional to the number of bits you can allocate for the additional data
item. Just four bits would reduce the number of times you would have to work
things out the hard way to about one sixteenth.
Neal Somos
Seven Hills, Ohio
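
The tagging scheme in the letter above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Neal Somos's or Edward Allburn's code: the class and method names are hypothetical, a Python list stands in for each adjacency list, and a linear membership scan stands in for the "hard way" of deciding whether two vertices share a list.

```python
class TaggedLists:
    """Adjacency lists whose members all carry the same small tag."""

    def __init__(self, tag_bits=4):
        self.tag_mask = (1 << tag_bits) - 1
        self.counter = 0        # incremented modulo 2**tag_bits
        self.tag = {}           # vertex -> tag of the list it belongs to
        self.list_of = {}       # vertex -> the list object that holds it
        self.slow_checks = 0    # how often the "hard way" was needed

    def _new_tag(self):
        self.counter = (self.counter + 1) & self.tag_mask
        return self.counter

    def same_list(self, a, b):
        """True if two previously seen vertices are in the same list."""
        if self.tag[a] != self.tag[b]:
            return False                  # different tags: cannot share a list
        self.slow_checks += 1             # same tag: must look the hard way
        return b in self.list_of[a]

    def _merge(self, a, b):
        keep, absorb = self.list_of[a], self.list_of[b]
        for v in absorb:                  # retag the absorbed list's members
            self.list_of[v] = keep
            self.tag[v] = self.tag[a]
        keep.extend(absorb)

    def add_edge(self, a, b):
        seen_a, seen_b = a in self.tag, b in self.tag
        if not seen_a and not seen_b:     # brand-new list with a fresh tag
            members = [a] if a == b else [a, b]
            new_tag = self._new_tag()
            for v in members:
                self.list_of[v] = members
                self.tag[v] = new_tag
        elif seen_a != seen_b:            # one new vertex joins an old list
            old, new = (a, b) if seen_a else (b, a)
            self.list_of[old].append(new)
            self.list_of[new] = self.list_of[old]
            self.tag[new] = self.tag[old]
        elif not self.same_list(a, b):    # both seen, different lists: merge
            self._merge(a, b)
```

Because every member of a list always carries that list's tag, differing tags prove that two vertices are in different lists; equal tags prove nothing, so the slow check still runs in that case. The retagging loop in `_merge` is the update cost that Ed's response below describes.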
Dear DDJ,
I was intrigued by Edward Allburn's article "Graph Decomposition" (January
1991). I do not have a sufficient background in graph theory to really follow
his argument, so I set myself to the challenge of trying to discover an
algorithm which would solve the problem as I understand it.
Given:
 1. A set of integers 1..Ncount
 2. A set of pairs P(a,b), where a and b are in 1..Ncount. Pcount = number of
 pairs.
Then:
Assign each element in 1..Ncount to a subset so that all the subsets are
disjoint and P(a,b) ==> (a in Subset <==> b in Subset).
I am a software developer specializing in graphics and calculations for
manufacturing and industrial customers. I consider my main strength to be in
the area of algorithm design (incidentally, I agree with Mr. Allburn's
assessment of Sedgewick). Over the past year, I have worked with a small team
which developed a 3-D modeling and rendering package, and alone I developed a
time management package for large assembly lines. I got my math training as a
physics major, and consequently I'm weak on graph theory.
I'm wondering if Mr. Allburn could take a look at the algorithm that follows.
I am really curious to know if my "flash of insight" amounted to anything.
Graph Decomposition II
Let each connected component be referred to by the number of its minimum
vertex. In Figure 2c (see DDJ, January 1991, page 90) we have

 component 1 (2,3,5,6)
 component 4 (7,8)
 component 9
 component 10
 component 11 (12)
For each vertex i, let G[i] store the minimum vertex. We want an algorithm to
produce G as follows:
 G[1] = 1     G[7]  = 4
 G[2] = 1     G[8]  = 4
 G[3] = 1     G[9]  = 9
 G[4] = 4     G[10] = 10
 G[5] = 1     G[11] = 11
 G[6] = 1     G[12] = 11
Then determining if a path exists between a and b is equivalent to checking if
G[a] == G[b]. In pseudocode:
 Step 1: for each i, G[i] = i
 Step 2: for each pair (a,b)
           find min and max of G[a], G[b]
           G[max] = G[a] = G[b] = min
 Step 3: for each i, G[i] = G[G[i]]
Step 1: Initialize G by marking each i as connected to itself.
Step 2: When a new pair is introduced, we actually have four vertices:
 Two members of the pair: a & b
 Their "minimum" connections (to date) G[a] & G[b]
Question: Which of these is the absolute minimum? Once we've found this, label
the other three with it. Since G[i] <= i for all i, it suffices to just check
the Gs. For example, imagine that at some point G is:
 G[1] = 1 G[4] = 4
 G[2] = 1 G[5] = 4
 G[3] = 1 G[6] = 4
Now comes the pair P(6,3). Max = G[6] = 4, min = G[3] = 1. G changes to:
 G[1] = 1
 G[2] = 1
 G[3] = 1 <- 1 G[b] = min
 G[4] = 4 <- 1 G[max] = min
 G[5] = 4
 G[6] = 4 <- 1 G[a] = min
I call P(6,3) a "late arrival" because until it arrived, it looked like we had
two separate components. Notice that this step has "informed everyone in the
group" except #5. Step 3 handles that.
Step 3: Final assignments: G[i] = G[G[i]]; in this example:
 G[1] = 1 <- 1     G[4] = 1 <- 1
 G[2] = 1 <- 1     G[5] = 4 <- 1
 G[3] = 1 <- 1     G[6] = 1 <- 1
This is really a check to see if G[i] has been reassigned without i's
knowledge. If G[i] is not equal to G[G[i]], such as G[5] = 4 in the previous
example, then G[5] has been reassigned. In this case, vertex 4 (=G[5]) was
reassigned by P(6,3) without #5's knowledge. If G[i] already equals G[G[i]],
then G[i] has not been reassigned, and this step is redundant but does no
harm.
Bill Polson
Saginaw, Michigan
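
The three steps in the letter above can be sketched in Python as follows. This is a minimal illustration, not Bill Polson's implementation; the function name and the sample pairs are hypothetical. One deliberate difference is labeled in the comments: before merging, the sketch follows each label down to a vertex that labels itself (the "find" step of a union-find structure), rather than comparing G[a] and G[b] directly. That is an assumption added here to cope with chains of reassignments, such as the pairs (4,5), (4,6), (6,3), (5,2) arriving in that order.

```python
def label_components(n, pairs):
    """Return G (1-based; G[0] unused) with G[i] = smallest vertex in i's component."""
    g = list(range(n + 1))            # Step 1: every vertex labels itself

    def find(v):
        # Assumption beyond the letter: chase labels until a vertex labels itself.
        while g[v] != v:
            v = g[v]
        return v

    for a, b in pairs:                # Step 2: merge on each new pair
        root_a, root_b = find(a), find(b)
        low, high = min(root_a, root_b), max(root_a, root_b)
        g[high] = g[a] = g[b] = low

    for i in range(1, n + 1):         # Step 3: final pass, in ascending order
        g[i] = g[g[i]]
    return g

def count_components(g):
    """A vertex that still labels itself is the minimum of its component."""
    return sum(1 for i in range(1, len(g)) if g[i] == i)

# Hypothetical pairs that produce the components listed in the letter:
# 1 (2,3,5,6), 4 (7,8), 9, 10, and 11 (12).
sample_pairs = [(1, 2), (2, 3), (5, 6), (3, 5), (4, 7), (7, 8), (11, 12)]

if __name__ == "__main__":
    g = label_components(12, sample_pairs)
    print(g[1:], count_components(g))
```

Since G[i] <= i always holds, the ascending order of the final pass means each G[G[i]] has already settled by the time vertex i is visited, so a single pass is enough.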
Ed responds: Neal's suggestion is similar to one that a colleague of mine
posed several months ago. Basically, the idea is to assign each of the
connected components an ID number, and to have each vertex carry the ID number
of the connected component the vertex belongs to. Thus, when determining if
two vertices are already in the same connected component, the first step would
be to compare the ID numbers attached to each vertex. It would then only be
necessary to traverse one of the connected components if the ID numbers
happened to match. Thus, one would theoretically save a significant amount of
time when merging a pair of connected components. The drawback is that the ID
numbers of one of the connected components all have to be updated to match the
other. When I experimented with this idea, I found that updating all of the ID
numbers consumed all of the time that the ID numbers saved.
Bill Polson's algorithm is right on the money. In some respects, his algorithm
is similar to Neal's suggestion of using ID numbers on each connected
component. In Bill's algorithm, the smallest vertex number of the connected
component basically serves this purpose. I implemented both his algorithm and
GAD using Watcom C 386 8.0. When I compared them side by side, I found that
Bill's algorithm consistently ran twice as fast as GAD when counting connected
components in worst-case graphs.
The main drawback of Bill's algorithm compared to GAD is that it takes
significantly more time to determine what vertices are in a particular
connected component. I recently had the opportunity to write a DIFF program
that is used to compare large graphs at my workplace. When a difference is
found, all of the vertices involved in the difference must be logged into a
report. Since GAD represents each connected component as a linked list of
vertices, one simply follows this list to find all of the vertices. With
Bill's algorithm, it would be necessary to scan the entire array to find all
of the vertices. Bill, too, has released his algorithm into the public domain.
It deserves serious consideration by anyone who is working with graph data
structures.
















MAY, 1991
A COPROCESSOR FOR A COPROCESSOR?


The 34082 floating point coprocessor for the 34020 graphics processor




Warren Davis and Kan Yabumoto


Warren, who has been a graphics programmer for ten years, is the designer and
programmer of a number of video arcade games including Q*Bert, Us vs. Them,
Lotto Fun, and Exterminator. Currently, he is a senior software engineer at
Pixelab Inc., a graphics consulting firm. Kan was originally a gas
chromatographer, but has been involved with graphics software for the last ten
years. He is a cofounder of Pixelab Inc., and the author of a series of
source-level debuggers for the TMS340 family of graphics processors, the GSP
Operating Tools. His lesser-known arcade game, Mad Planets, is a collector's
item. Both authors can be reached at Pixelab Inc., 4513 Lincoln Ave., Suite
105, Lisle, IL 60532, 708-960-9339, or via their BBS, 708-960-9352.


When it was introduced in 1985, the Texas Instruments TMS34010 Graphics System
Processor (GSP) faced an identity crisis. Was it really a general-purpose
microprocessor that happened to have built-in graphics-related instructions
and video control circuitry, or was it merely an unusually powerful
programmable graphics coprocessor? In truth it is both, although the first
description is more accurate from a technical standpoint. And while
there have been many systems designed in which a TMS34010 is the sole (or
main) microprocessor, it is in the PC graphics arena that this device has the
potential to flourish by being used to offload graphics-related tasks from a
host processor (usually an 80x86 or 680x0). At the very least, TI hopes the
GSP will become a major player in this field, as evidenced by TIGA (Texas
Instruments Graphics Architecture), a standard for communication between a
host and a target (TMS340-based) graphics system.
But the 34010 was just the beginning. In 1989, TI began mass producing the
TMS34020, which includes speed and functionality improvements over its
predecessor, and is designed to accommodate an optional floating point
coprocessor, the 34082. A coprocessor for a coprocessor? We shudder to think
what might be next. But let's look a little deeper into the workings of these
devices. Who knows, they might even make sense!


The 34010: Processor or Coprocessor?


There is no doubt that TI's GSPs are complete microprocessors in their own
right. They contain internal registers, a stack pointer, a status register,
and interrupt vectors. They fetch instructions and data from a local memory,
have the ability to make conditional jumps, and are supported by all the
standard language tools (assembler, C compiler, linker, and so on). And of
course, there are the graphics-related features that make them unique. In
fact, the 34010 was the first device to incorporate video signal generation
and efficient graphics-related operations with an instruction set for
general-purpose computing. In addition, there is a host interface built into
the silicon which simplifies the hardware connection between a GSP and another
computer's bus. This is a somewhat unusual feature for a microprocessor, but
looking at the real world filled with PCs, Macs, and Unix-based systems, you
see the logic of it. The simpler the interface, the easier it is to develop
GSP programs on the host computer and then download them to the GSP's memory.
But such an interface can also be used to communicate between a host and
target processor while programs are running on both. The host could download
parameters -- say, the position and radius of a circle along with a fill color
-- to the GSP, which would then perform some graphics-related operations, such
as drawing a filled circle on the screen. In fact, most graphics coprocessors
have a similar means of receiving graphics commands from a host. For this
reason, no doubt, many people originally thought of the 34010 as a glorified
graphics coprocessor. (The term "graphics coprocessor" is actually somewhat
vague. The history of devices which assist a host processor in performing
graphics tasks covers a wide spectrum of "processing" ability.)
Anyway, all "graphics coprocessors" are treated as peripheral devices by a
host processor, and this is certainly true of the 34010 as well. Once the
analogy was made, some pointed out that the 34010 was actually slower in
performing certain graphics tasks than some graphics controllers, which
implemented a fixed set of functions internally and performed them at
lightning speed. The beauty of the GSP, however, is in its ability to be
tailored to a specific task.
Let's look at an example. Say we want to draw a series of filled circles along
a path represented by an equation. Say also that we can divide the
computational tasks into four sections fairly easily. Figure 1 shows us how
our processing time would probably be spent using a typical graphics
controller. The host processor takes some amount of time (Tm) to calculate the
position of the next circle. When it comes time to draw the circle, we offload
that task to the controller. In doing so, we incur a small bit of overhead
(To) which is usually more than made up for by the speed of the controller
(Tg). Presuming that the host does not need to acknowledge the completion of
the graphics task, the total time for the loop is Tm + To, where Tm = T[A] +
T[B] + T[C] + T[D] is the time the host spends on the four computational
sections.
If Tg is less than Tm, the graphics controller could be spending most of its
time waiting for the host to send a command. Unfortunately, there isn't any
work for the graphics device to do while the host is busy with other things.
Now look at Figure 2, which shows a possible way of implementing this program
using a GSP. We can increase the parallelism between the two processors by
adjusting the division of tasks. So instead of having the GSP just draw the
circle, we can send it some interim values, have it complete the computation,
and then draw the circle. Even if the actual circle drawing time of the GSP is
slower, the throughput of the system is faster.
Most graphics controllers contain hardcoded primitives, so the host has little
or no choice in how to divide its tasks between itself and the controller. But
because the GSP is completely programmable and capable of performing any
standard computational task as well, there is no restriction on how much or
how little it does at a time. The division of tasks between a host and GSP can
be tailored to a particular need and tweaked to perfection (or as near to
perfection as a deadline will allow).
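The trade-off between the two divisions of labor can be sketched as a small timing model in C. Everything here is an assumption made for illustration: the function names, the idea that the slower side of an overlapped pair sets the pace, the choice to split the work after section B, and the premise that the GSP needs the same time as the host for sections C and D.

```c
/* Steady-state time per circle when host and graphics device run in
   parallel: whichever side is slower sets the pace. */
double loop_time(double host_side, double device_side)
{
    return host_side > device_side ? host_side : device_side;
}

/* Fixed-function controller: the host does all four sections (Tm),
   pays the hand-off overhead (To), and the controller only draws (Tg). */
double controller_loop(double tA, double tB, double tC, double tD,
                       double to, double tg)
{
    return loop_time(tA + tB + tC + tD + to, tg);
}

/* Programmable GSP: the host does sections A and B and sends interim
   values; the GSP finishes sections C and D and then draws. */
double gsp_loop(double tA, double tB, double tC, double tD,
                double to, double tg)
{
    return loop_time(tA + tB + to, tC + tD + tg);
}
```

With hypothetical values of 10 time units for each section, an overhead of 2, a controller draw time of 5, and a slower GSP draw time of 12, controller_loop() gives 42 units per circle while gsp_loop() gives 32, which is the effect described above: a slower draw, but a faster system.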
So it seems pretty clear that the TMS340 GSPs must be accepted as more than
just graphics coprocessors, although if that's the way you want to use them,
they are more than equipped to handle the job exceptionally well.


Enter the 34020


Flexibility is one thing, but performance is another. The 34020, TI's newest
GSP, provides a 32-bit external data path (which by itself virtually doubles
the speed of pixel transfers over its predecessor), faster cycle times, a
larger internal cache, support for a variety of VRAM capabilities, and a
multiprocessor interface to allow multiple 34020s to share a memory space.
Most relevant to the scope of this article, however, is the inclusion of a
coprocessor interface. This notion was completely missing from the 34010, but
its need becomes apparent as soon as you try to perform floating point
arithmetic on the 34010. While the performance is respectable, it is nowhere
near remarkable.
The 34020's coprocessor interface is general-purpose in a somewhat limited
sense. Some of the 34020's local memory interface signals are used to tell a
coprocessor that a command is being directed to it. Naturally, the coprocessor
must be designed to listen properly, and at present there is only one device
(the 34082) which will do that. Also, the 34020 is capable of working with
more than one coprocessor. Through an ID field in its coprocessor
instructions, the 34020 can control up to five coprocessors. Up to four of
these can be 34082s, and only one may be a coprocessor of another origin which
conforms to the 34020's coprocessor interface conventions.
The 34020 communicates with its coprocessors through a set of general
coprocessor instructions, shown in Table 1. One of these instructions, CEXEC,
simply involves the transfer of a command, embedded into the instruction, to a
coprocessor. All the others involve the additional transfer of data between
the 34020 and coprocessor. As Table 1 shows, data can be sent to or returned
from the coprocessor using 34020 registers or memory. When executing any
coprocessor instruction, the 34020 first generates a particular combination of
control signals on its address/data bus to signal the coprocessor. The
coprocessor command is placed onto the bus along with some other information
including the coprocessor ID. Transferral of data, if any, follows. The 34020
controls these transfers, but the coprocessor needs to know what to do with
data it is receiving or, if it is expected to return data to the 34020, what
data to send back. This information must be inherently present in the command
field sent by the 34020.
Table 1: 34020 coprocessor instructions
These are all of the 34020's coprocessor instructions.
The size field is a bit which indicates whether the operation is to be
performed on 32-bit values (size = 0) or 64-bit values (size = 1).
The command field tells the coprocessor what operation to perform. If data is
being transferred from the 34020 (that is, CMOVGC or CMOVMC), the command
should indicate where it is to go. If data is being transferred to the 34020
(that is, CMOVCG, CMOVCM, or CMOVCS), the command should indicate what data
is to be returned.
The ID field is used to select a particular coprocessor (or all coprocessors)
when there is more than one in the system. When omitted, a default value
(which can be changed with an assembler directive) is used.
 Execute Coprocessor Command without Data Transfer
   CEXEC  size,command[,ID][,L]          Long Form
   CEXEC  size,command[,ID]              Short Form

 Move from Coprocessor to 34020 Registers
   CMOVCG Rd,command[,ID]                Move one register
   CMOVCG Rd1,Rd2,size,command[,ID]      Move two registers

 Move from Coprocessor to Memory
   CMOVCM *Rd+,cnt,size,command[,ID]     Post increment
   CMOVCM *-Rd,cnt,size,command[,ID]     Pre decrement

 Move from Coprocessor to 34020 Status Register
   CMOVCS command[,ID]                   Replaces N, C, Z, and V bits of
                                         34020's status register

 Move from 34020 Register(s) to Coprocessor
   CMOVGC Rs,command[,ID]                Move one register
   CMOVGC Rs1,Rs2,size,command[,ID]      Move two registers

 Move from Memory to Coprocessor
   CMOVMC *Rs+,cnt,size,command[,ID]     Post increment, Constant count
   CMOVMC *-Rs,cnt,size,command[,ID]     Pre decrement, Constant count
   CMOVMC *Rs+,Rd,size,command[,ID]      Post increment, Register count




Presenting the 34082


As we mentioned before, the 34082 is currently the only device designed to
work with the 34020's coprocessor interface. Because these devices have been
designed to work so closely together, TI's TMS340 language tools support a
special set of so-called "pseudo-ops" which consists entirely of variations on
the instructions shown in Table 1.
For example, the 34020 instruction, ADD CRs1,CRs2,CRd (Add Integer), is
actually a CEXEC instruction which sends a command to the 34082, instructing
it to add two of its registers (CRs1 and CRs2) as integers and place the
result in another register, CRd. The ADDF (Add Float) instruction is identical
to ADD except that a different coprocessor command is sent to indicate
floating point addition. The ADDD instruction is identical to ADDF except the
size field is 1 to indicate an operation on 64-bit values.
The 34082 has a built-in command set contained in its internal ROM. The
"commands" sent by the 34020 are actually nothing more than addresses of
microcoded programs in this ROM. So when the 34020 issues the ADD instruction
mentioned before, it is really just triggering the 34082 to execute a one-line
program consisting of a native 34082 ADD instruction. Some 34020 pseudo-ops
trigger more complex 34082 programs, such as matrix multiplications or
polynomial expansions.
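The idea that a command is just a pointer to a routine in ROM can be illustrated with a software analogy in C. This is not the 34082's actual encoding or command set: the type fpu_model, the command numbers, and the function cexec_model are all hypothetical names invented for this sketch.

```c
/* Software analogy only: a "ROM" of routines indexed by command number.
   None of these names or numbers come from the 34082's real command set. */
typedef struct {
    double r[20];               /* stand-in for the coprocessor registers */
} fpu_model;

typedef void (*rom_routine)(fpu_model *u, int s1, int s2, int d);

static void rom_add(fpu_model *u, int s1, int s2, int d)
{
    u->r[d] = u->r[s1] + u->r[s2];
}

static void rom_mpy(fpu_model *u, int s1, int s2, int d)
{
    u->r[d] = u->r[s1] * u->r[s2];
}

enum { CMD_ADD, CMD_MPY, CMD_COUNT };   /* hypothetical command numbers */

static const rom_routine rom[CMD_COUNT] = { rom_add, rom_mpy };

/* Plays the role of CEXEC: the host names a command, and the model
   looks up the matching routine and runs it. Register indices are not
   range-checked. Returns 0 on success, -1 for an unknown command. */
int cexec_model(fpu_model *u, int command, int s1, int s2, int d)
{
    if (command < 0 || command >= CMD_COUNT)
        return -1;
    rom[command](u, s1, s2, d);
    return 0;
}
```

In this picture, a pseudo-op such as ADD differs from one such as ADDF only in which table entry it selects; the host-side mechanism is the same.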
Looking at the specs of the 34082 would lead one to conclude something very
exciting. It is fast! The 34082-32 has a 67.5 ns instruction cycle time, a
three-operand Floating Point Unit (FPU) with two levels of internal
pipelining, and can perform most single precision operations in one cycle when
executing out of its own local memory. (When commands are sent from the 34020,
the minimum timing is equal to one 34020 cycle or 125 ns.) It supports three
data types: 32-bit integer, 32-bit IEEE float, and 64-bit IEEE double.
A configuration register allows you to set the rounding mode and pipeline
configuration. The 34082's native instruction set allows for conditional
branches, jumps to subroutines (nested up to two-deep), loops, and interrupt
service routines.
This degree of programmability within the 34082 itself is no accident. As if
the "processor vs. coprocessor" issue were not muddy enough, the 34082 has the
ability to act as a standalone processor. In this mode, called the
"host-independent mode", programs are executed from an external memory (up to
64K long words of program and 64K long words of data) made up of either Static
RAM (SRAM) or EPROM, which connects to the 34082 without any glue logic! A
bootstrap loader is provided to simplify the initialization of SRAM. And TI
provides not only a macro assembler and linker for the 34082, but a C compiler
as well! This external memory is required for host-independent operation, but
it can still be present even when the 34082 is in coprocessor mode.
Communication between the 34082 and this memory occurs over a local bus,
independent from the 34020 (see Figure 3). You can actually develop custom
routines for the 34082, download them to SRAM or burn them into EPROM, and use
them just as you would the commands built into the internal ROM! The
improvement in execution speed can be remarkable, as you will see shortly.


Programming Notes


Many programmers stay away from multiplication operations, replacing them with
additions when possible (for example, adding a number to itself instead of
multiplying by two). On the 34082 this becomes a moot point. As long as you
use the float format, most operations are a single clock cycle, so you gain
nothing by replacing a multiplication with an addition. In fact, the 34082
runs so fast on many instructions that the bottleneck ends up being in the
34020-to-34082 communication.
To squeeze out every drop of 34082 performance, you should focus your effort on
optimizing the allocation of registers so that the restrictions placed on
source operands do not force you to shuffle data between registers. The
34082's three-operand FPU allows many instructions to specify two source
operands and a destination operand. The restriction on most instructions of
this type is that the first source register must come from the A-file and the
second from the B-file. Some instructions requiring a single-source operand
require it to reside in a particular file (for example, the source register
for SQR (square) instructions must be from the A-file, and the source register
for INV (invert) instructions must be from the B-file). There is a mode bit in
the 34082's CONFIG register which allows you to remove these restrictions by
making the A- and B-files equivalent. The trade-off is that you then have only
10 registers available instead of 20.
Some of the more complex instructions act like subroutines and use specific
registers as inputs. This is right along the lines of the GSP's graphics
instructions, which expect operands to be stored in specific B-file registers.
The Feedback Registers, C and CT, which are primarily used for temporary
storage by some instructions, are also available and can be used to minimize
any inconveniences. Thankfully, there are no file restrictions on the
destination register.


Fractals


Now for the fun part. To evaluate the performance of the TMS34082 floating
point processor, we wrote a simple C program that displays a picture of the
Mandelbrot set. The screen represents a rectangle of arbitrary dimension at
some position in the complex plane. The X axis represents real number
components and the Y axis represents imaginary number components. The
Mandelbrot plot is created by computing successive iterations of the equation
$A_n = A_{n-1}^2 + C$, where A and C are complex numbers, the initial value of A
is 0+0i, and C is a constant which is represented by a pixel in the complex
plane. For all values of C which are visible on our screen, we determine how
many iterations it takes for A to diverge. For our purposes, that means how
many iterations until the magnitude of A becomes greater than 2. In plotting
these results, we use the number of iterations until divergence as an index
into our color map. If, after 256 iterations, A has still not diverged, we
simply use color 0.
To freshen your algebra in complex numbers, if we represent a complex number,
A, as follows:
 $A = a_R + a_I i = (a_R, a_I)$
then
 $A + B = (a_R + b_R, a_I + b_I)$
 $A^2 = (a_R^2 - a_I^2, 2 a_R a_I)$
 magnitude of A = $\sqrt{a_R^2 + a_I^2}$


Our Test Program


Listing One (page 84) shows a C program written to run under any environment
and on any graphics display. The main routine starts by calling a black box
function called initialize( ) that performs all hardware-dependent tasks -- it
initializes the display board, clears the display screen, and loads a
predetermined set of 256 colors into the display board's palette memory. Under
some environments, you make a query to find out what your pixel resolution is,
so initialize( ) also sets the global variables screenx and screeny. There is
another "black box" function which is dependent on the display board used:
put_pixel( ), which writes a color at a given position on the screen. To port
this program, all you need to do is write your own initialize( ) and
put_pixel( ).
The only other purpose of the main routine is to set up the parameters for
compute_fractal( ). The four parameters form two complex numbers, which
determine what chunk of the complex plane appears on our display screen. The
origin parameter becomes the upper left corner of the screen, and the size
parameter gives the dimensions of the screen in the complex plane. You can see
by the initial values of origin and size that we will map an area from -4.0 to
+4.0 along the real (X) axis and from -3.0 to +3.0 along the imaginary (Y)
axis. These numbers were chosen to approximate the aspect ratio of a typical
monitor so that each pixel represents a true square. It also gives a nice
encompassing picture of the Mandelbrot set. By varying these parameters, you
can achieve a limitless variety of fractal landscapes, some of which are quite
breathtaking.
More Details.
The compute_fractal routine begins by computing DeltaR and DeltaI, which
essentially represent the width and height of a single pixel in the complex
plane. For every pixel on the screen, we need to determine a color. Therefore,
we have two outer "for" loops, which encompass the entire screen, and an inner
loop, which performs the calculations. The inner loop essentially performs
complex arithmetic to determine how many iterations it takes to meet our
divergence criterion. If we detect divergence, we break out of the loop and
plot a pixel using the loop count as a color index. Otherwise, we fall through
and plot a pixel of color 0.
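The mapping from a pixel to its point C in the complex plane might look like the following C sketch. It is not taken from Listing One: the type complex_pt, the function pixel_to_c, and the assumption that the imaginary part increases as y increases are all illustrative choices.

```c
typedef struct { float re, im; } complex_pt;   /* illustrative type */

/* Map pixel (x, y) to a point C in the complex plane. origin is the
   upper left corner of the screen; size is the extent of the screen in
   the complex plane. Assumes the imaginary part grows with y. */
complex_pt pixel_to_c(int x, int y, complex_pt origin, complex_pt size,
                      int screenx, int screeny)
{
    float deltaR = size.re / (float)screenx;   /* width of one pixel  */
    float deltaI = size.im / (float)screeny;   /* height of one pixel */
    complex_pt c;

    c.re = origin.re + (float)x * deltaR;
    c.im = origin.im + (float)y * deltaI;
    return c;
}
```

For a 640 x 480 screen covering an 8.0 by 6.0 area, both deltas come out to 0.0125, which is why each pixel represents a true square.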
(Rather than compute a true magnitude, which involves a square root, we
compare the square of the magnitude to the square of our comparison value.)
Although the program is fairly simple, it is obviously a real number cruncher,
so we tried to optimize the code as much as possible without losing its
readability. All variables have been declared as register, and we save the
squares of the real and imaginary portions of A at each iteration, because
they are used in computing both the next iteration and the square of the
magnitude. By storing them, we save ourselves a multiplication.
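A minimal sketch of the inner loop for a single point, written independently of Listing One, shows how the saved squares and the squared-magnitude test fit together. The function name mandel_color, the constant MAX_ITER, and the exact loop bounds are assumptions and may differ from the listing.

```c
#define MAX_ITER 256

/* Return the color index for the point C = (cR, cI): the loop count at
   which A diverges, or 0 if it has not diverged after MAX_ITER passes.
   The loop bounds here are an assumption and may differ from Listing One. */
int mandel_color(float cR, float cI)
{
    float aR = 0.0f, aI = 0.0f;     /* A starts at 0+0i           */
    float aR2 = 0.0f, aI2 = 0.0f;   /* saved squares of aR and aI */
    int n;

    for (n = 0; n < MAX_ITER; n++) {
        /* A = A*A + C, using (aR^2 - aI^2, 2*aR*aI) for the square.
           aI is updated first so that it still sees the old aR. */
        aI = 2.0f * aR * aI + cI;
        aR = aR2 - aI2 + cR;
        aR2 = aR * aR;
        aI2 = aI * aI;
        /* |A| > 2 is tested as |A|^2 > 4, avoiding the square root. */
        if (aR2 + aI2 > 4.0f)
            return n;
    }
    return 0;
}
```

The squares aR2 and aI2 are computed once per pass and then serve twice: in the divergence test on that pass and in the real-part update on the next.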
The program in Listing One (page 84) was compiled under two environments --
Microsoft C 6.0 and Texas Instruments TMS340 C 5.01. The host computer was an
80386/25 MHz MS-DOS machine with an 80387. A TI SDB20 board, which is built
around a 32MHz 34020 processor and 34082 coprocessor, was plugged into a slot
on the host computer. In both cases, we used the display buffer of the SDB20
board connected to an NEC 3D Multisync monitor to view our images. The screen
resolution was 640 x 480 pixels with 256 colors.
We compiled the program for each environment in two ways. First, we had the
compilers generate floating point library calls. Next, we had them generate
coprocessor instructions. The timing results are shown in Table 2. We were
also fortunate enough to try our program out on an 80486 machine (at 25 MHz).
The 80486 is essentially an 80386 married to an 80387 on a single chip with
speed enhancements, and is therefore software-compatible with the 387 version
of our program.
Table 2: Results of fractal comparison (Times are shown in seconds and
hr:min:sec.)

                              Image 1    Image 2    Image 3
 -----------------------------------------------------------

 80386/FP Library                2231      13251      31059
                              0:37:11    3:40:51    8:37:39

 34010/FP Library                1077       5199      15528
                              0:17:57    1:26:39    4:18:48

 34020/FP Library                 443       2534       6304
                              0:07:23    0:42:14    1:45:04

 80386/80387                       97        569       1319
                              0:01:37    0:09:29    0:21:59

 80486                             23        126        293
                              0:00:23    0:02:06    0:04:53

 34020/34082                       18         93        216
                              0:00:18    0:01:33    0:03:36

 *** Above entries used C program as source.
 *** Following entries used assembler.

 Tweaked 34020/34082               11         64        149
                              0:00:11    0:01:04    0:02:29

 34082 running out of
 its local SRAM                     4         17         38

Note: The times shown in this table do NOT include the overhead of writing the
640x480 pixels to the display screen. Each program was run in a mode where all
pixel writing was inhibited. So the results shown above are the computation
times of the algorithm only.

Although it's nice to know that the TMS340 C compiler is capable of generating
34082 instructions, anyone who's ever done any graphics programming knows that
for performance, nothing beats assembler language. For that reason, we created
a hand-tweaked assembler version of compute_fractal( ) based on code generated
by the C compiler. The original output of the C compiler is shown in Listing
Two. Compare this against the assembly code we tweaked in Listing Three (page
87).
The first thing to notice in Listing Two is that only one of the variables
we declared to be "register" was placed in local memory, namely DeltaR. Every
other variable is maintained in a register. Not only that, the float variables
have been assigned to 34082 registers while the integers reside in 34020
registers. This is done by the C compiler automatically!


The Ultimate Method


We mentioned before that the 34082 can have its own local memory which can
contain user programmed commands. In the case of the SDB20 board, there is a
piggyback card available which plugs into the 34082 socket to provide the
34082 with external SRAM. By using this card, we were able to port the
Mandelbrot algorithm to the 34082's SRAM. The particular programming
techniques used are beyond the scope of this article, but we would be happy to
answer any inquiries from interested readers. Basically, we created three new
34082 "commands." The first initializes the 34082's registers. The second
performs the computations for a single point, returns the color of that point,
and adjusts all registers to prepare for the next point. The third is called
at the end of each line and adjusts all registers to prepare for the beginning
of the next line. The 34020 simply maintains the row and column loops while
sending these newly defined commands to the 34082.


And the Winner Is ...


In examining the timing results in Table 2, keep in mind that this was a test
of curiosity more than anything else. The timings for the 386/387 are very
dependent on the compiler and library used. The three Mandelbrot images we
chose represent a wide variation in the amount of computing necessary.
The coprocessors boosted performance by a factor between 20 and 30 for both
the TI and Intel chips. That isn't too surprising. After all, the existence of
math coprocessors cannot be justified if the gain is marginal. However, we
were very surprised to see TI's chip outperform the 80387 by a factor of 6. To
explain this difference in performance, we must look into the underlying
processor architectures. The entire compute_fractal( ) function fits into the
on-chip instruction cache of the 34020, eliminating all instruction fetches.
In this case, the 34020 executes over 80 percent of typical instructions in
one machine cycle. All of its coprocessor instructions are also executed in
one machine cycle. And because the TI C compiler puts 11 local variables into
registers (many of which stay entirely inside the coprocessor), there are
hardly any memory accesses. In the tweaked assembler version of the program,
there are no memory accesses at all except for the outer loop initialization
and pixel drawing.
Normally, when you replace a routine written in C with a tweaked assembler
version, you would expect performance to improve by a factor of 3 or more. Not
so in this case. We did not achieve even a two-fold increase in speed. Whereas
many C programmers may have been skeptical of declaring register variables in
the past, GSP C programmers should now get in the habit of declaring all
automatic variables to be "register," keeping in mind that the compiler
assigns registers in the order in which the declarations appear. By the way,
we did not write a hand-tweaked version of the program for the 386/387 because
it was not our purpose to provide an official benchmark, just a rough
comparison. We would be happy to hear about anyone else's results from similar
comparisons.
The times for the 486 machine are about four times faster than those of the
386/387 combination, which is as we expected. However, the 34020/82
combination was still faster by about 35 percent. Part of the speed
improvements of the 486 come from the fact that there is no bus overhead in
communicating with a coprocessor. This is almost the case when the 34082 is
running our custom commands from its SRAM. The amount of communication between
the 34020 and 34082 is reduced considerably, though not entirely, and yet we
still see an improvement of close to a factor of 4 over the tweaked version
which uses the 34082's built-in commands.
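The ratios quoted in this section can be checked directly against Table 2. The short C sketch below copies the times in seconds from the table; the array names and the helper speedup() are illustrative and appear nowhere in the listings.

```c
/* Times in seconds for Images 1-3, copied from Table 2. */
static const int t_386_lib[3] = { 2231, 13251, 31059 };  /* 80386/FP Library    */
static const int t_386_387[3] = {   97,   569,  1319 };  /* 80386/80387         */
static const int t_486[3]     = {   23,   126,   293 };  /* 80486               */
static const int t_020_lib[3] = {  443,  2534,  6304 };  /* 34020/FP Library    */
static const int t_020_082[3] = {   18,    93,   216 };  /* 34020/34082         */
static const int t_tweaked[3] = {   11,    64,   149 };  /* Tweaked 34020/34082 */
static const int t_sram[3]    = {    4,    17,    38 };  /* 34082 local SRAM    */

/* How many times faster the "fast" configuration is for one image
   (image is 0, 1, or 2 for Images 1, 2, and 3). */
double speedup(const int slow[3], const int fast[3], int image)
{
    return (double)slow[image] / (double)fast[image];
}
```

For Image 1, for example, the 80387 gains 2231/97 = 23 over the library version, the 34020/34082 beats the 386/387 by 97/18, or about 5.4, and beats the 486 by 23/18, or about 1.28. For Image 3, the SRAM version beats the tweaked version by 149/38, or about 3.9.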
One can typically experience frustration while waiting for a Mandelbrot plot
to complete. Using the 34020/34082 combination, we have practically exhausted
our curiosity in this area by viewing image after image, many within a few
seconds, using an interactive version of our program. Having observed this
incredible performance, we wonder why we haven't yet seen an add-on card
interfacing a 34082 to a PC, because a bus connection is technically feasible.
At present, the price of a 34082 is about one-third that of an 80387. With
some software support, it could turn a regular PC into a super number
cruncher.


80x86 vs. TMS340 Philosophies


A 34082 connected to a 34020 is a floating point coprocessor in the truest
sense. The 34020 does not treat it as a peripheral device but as an extension
of itself. Even the hardware interface between the two devices has been
optimized to make it as direct as possible. This is similar to the
relationship between the Intel 80x86 and 80x87 devices. Just for grins, let's
compare the Intel and Texas Instruments way of doing things.
Intel's processors are built upon a classic CISC architecture where the CPU
contains a relatively small number of registers but allows most of the
arithmetic and logical instructions to use memory locations as operands. This
approach results in fewer move instructions than the TMS340 processors, which
are influenced by the RISC philosophy. They have many more registers (30
general-purpose 32-bit wide registers) and cannot perform arithmetic and
logical operations out of memory. Memory accesses are slower than register
accesses, so the idea is to keep as much information as possible in registers.
These philosophies were carried over to some extent to both companies'
floating point math coprocessors. The 80x87 processors have relatively few (8)
registers in a stack-like organization. The 34082 math coprocessor comes with
many registers (20 general-purpose 64-bit wide registers plus two Feedback
Registers) that can be accessed more freely.
Another concept carried over from the TMS340 processors to the 34082 is that
of A-file and B-file registers. The 30 general-purpose registers of the GSPs
are divided into 15 A registers (A0-A14) and 15 B registers (B0-B14). Many
instructions require that both register operands be within the same file. The
34082's 20 registers are also organized in A- and B-files. Like the 34010 and
34020, there are some restrictions on register usage.
Both the 80x87 and 34082 have synchronization instructions to allow a lengthy
coprocessor operation to take place concurrently with main CPU execution. Both
coprocessors can also transmit/receive data to/from system memory directly.
And in both cases, the main CPU is responsible for coprocessor instruction
decoding and memory access for optional operands. In the Intel case, when a
special "ESC" prefix is encountered by the CPU, then the CPU generates a
special I/O cycle to communicate with the 80x87. In the TI case, when a
coprocessor instruction opcode is detected by the 34020, the 34020 initiates a
special coprocessor bus cycle to which the 34082 responds. The data which
actually appears on the data bus has been massaged by the 34020 to look very
much like a microcoded instruction, with the "command" field being a pointer
into the 34082's internal ROM.
Another interesting comparison between the 80x87 and 34082 is that the Intel
chips perform 80-bit "temporary real" floating point math which provides more
range and accuracy than the IEEE 64-bit double format used in the 34082. Also,
while Intel's parts contain built-in logarithmic, exponential, and
trigonometric functions, TI's device has none of these. These were sacrificed
in favor of a variety of matrix and vector arithmetic and other graphics
oriented functions. However, using the optional external memory, you can write
your own functions as needed and expect the performance to be as fast as or
faster than other numeric processors.
All this is fascinating, I'm sure, but what about performance? Well, Table 3
shows a comparison of the speed of some floating point instructions among the
latest math coprocessors. In comparing the performance of these coprocessors,
we should note that the move/load/store functions of the 80387 devices create
a significant overhead (20 to 93 cycles) which is negligible in the 34082.
This is because Intel chose to convert all numbers to/from the "temporary
real" format. TI maintains three distinct formats (int, float, and double) and
gives you the choice of transferring data as is, or transferring and
converting to a desired representation in one breath. We should also note that
a comparison of instruction cycles alone is not very meaningful. The overall
architecture of the processing environment can become very significant in
evaluating the device's performance.
Table 3: Comparison of Instruction Execution Times in nanoseconds for 80387,
80486, and 34082

 Operation   80486 (33 MHz)    80387 (33 MHz)   34082 (32 MHz)
 ---------------------------------------------------------------
 abs         90 (FABS)         660              125/125 (ABSx)
 compare     120 (FCOM)        720              125/125 (CMPx)
 add         300 (FADD)        690              125/125 (ADDx)
 multiply    480 (FMUL)        870              250/125 (MPYx)
 divide      2190 (FDIV)       2640             1500/750 (DIVx)
 sqrt        2550 (FSQRT)      3660             1875/1125 (SQRTx)
 int2real    480/330 (FILD)    1680/600         125/125 (CVIx)

Note 1: Currently, TI is only shipping 34082s rated at 32 MHz (40 MHz will be
available later).
Note 2: The two numbers separated by a slash correspond to double and float
operations, respectively. The integer operations of the 34082 are equal to or
slightly slower than their double precision counterparts. On the other hand,
the Intel parts always operate in "temporary real" format.
Note 3: The third column reflects the timings of these operations when
executed as 34020 coprocessor instructions. The minimum possible execution
time is one 34020 instruction cycle (or 125 ns). On the other hand, if the
34082 were executing instructions from its local memory, the timings would be
different. Specifically, the single cycle functions (abs, cmp, add, and mult)
would execute in one 34082 instruction cycle (or 67.5 ns).
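Because Table 3 reports times in nanoseconds while data sheets usually quote clock cycles, it can help to convert between the two. The sketch below does that conversion, assuming the MHz figures in the table headings are the clock rates; ns_to_cycles is a hypothetical helper name.

```c
#include <assert.h>

/* Convert an execution time in nanoseconds to clock cycles
   at a clock rate given in MHz (cycles per microsecond). */
double ns_to_cycles(double ns, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return ns * clock_mhz / 1000.0;   /* 1000 ns per microsecond */
}
```

With these assumptions, the 125 ns minimum quoted in Note 3 works out to four clock periods at 32 MHz, and the 660 ns abs time of the 80387 works out to about 22 clocks at 33 MHz.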

--W.D. and K.Y.

_A COPROCESSOR FOR A COPROCESSOR?_
by Warren Davis and Kan Yabumoto


[LISTING ONE]

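/* C program to perform display of Mandelbrot set. Needs to be
linked with a module containing the initialize() and put_pixel() routines. */

/* Declarations for the externally supplied, board-dependent routines. The
   argument and return types are inferred from how they are called below. */
int initialize(void);
void put_pixel(int color, int col, int row);

int screenx, screeny; /* These values represent the size of the display */
 /* screen in pixels. They are initialized in the */
 /* initialize() routine called by main(). */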

/*****************************************************************************
compute_fractal is the heart of our program. Four parameters are passed
from main() representing two complex numbers. The first two parameters,
BaseR and BaseI, are the real and imaginary portions of the upper left corner
of the screen in the complex plane. The last two, SpanR and SpanI,
give the size of the area of the complex plane visible on the screen.

SOME BACKGROUND... This routine computes successive iterations of the
equation
    An = (An-1 ** 2) + C
where A and C are complex numbers, and C represents a point in the complex
plane. The initial value of A is 0+0i, and when the magnitude of A becomes
greater than 2.0, the series is considered certain to diverge. The color of
the pixel at C is determined by the number of iterations before divergence
(the loop counter counts down from 255). If there is no divergence by the
time the counter reaches 0, color 0 is written. The color is used as an index
into the color palette of the display board.

COMPLEX ARITHMETIC... For those of you a little rusty on your complex
arithmetic, the following formulas are supplied. If W and Z are complex
numbers, then each has two parts, real and imaginary
(i.e., W = W_real + W_imag * i).
    W + Z means (W_real + Z_real) + (W_imag + Z_imag) * i
    W * W means (W_real * W_real) - (W_imag * W_imag)
                + (2 * W_real * W_imag) * i
    The magnitude of Z is SQRT((Z_real * Z_real) + (Z_imag * Z_imag))
****************************************************************************/

void compute_fractal(float BaseR, float BaseI, float SpanR, float SpanI)
{
register float AR, AI; /* Real and Imaginary components of A */
register float ConstR, ConstI;/* Real and Imaginary components of C */
register float DeltaR, DeltaI; /* increment values for C */
register float ARsqr, AIsqr; /* squares of AR and AI */
register int row, col, color; /**** See NOTE 1 ****/

DeltaR = SpanR / (float)screenx;
DeltaI = SpanI / (float)screeny;

ConstI = BaseI;
for (row=0; row < screeny; row++) { /* Scan top to bottom */
 ConstR = BaseR;

 for (col=0; col < screenx; col++) { /* Scan left to right */
 AR = AI = ARsqr = AIsqr = 0.0F; /**** See NOTE 2 ****/
 for (color = 256; --color > 0;) {/* Find color for this C */
 AI = (AR * AI * 2.0F) + ConstI; /* Compute next */
 AR = ARsqr - AIsqr + ConstR; /* iteration of A */

 if ( ((ARsqr = AR * AR) + (AIsqr = AI * AI)) > 4.0F )
 break; /**** See NOTE 3 ****/
 }
 put_pixel(color,col,row);/* Write color to display buffer. */
 ConstR += DeltaR;
 }
 ConstI += DeltaI;
 }
}

/* NOTE 1: We declare everything to be register variables. For some processors
this may not have much of an effect, but on others (like the 34020 and 34082)
you may be surprised.
NOTE 2: For each point on the screen, we begin computing iterations of the
Mandelbrot equation. The initial value of A is 0+0i. Since the values
A_real*A_real and A_imag*A_imag are used in computing both the next iteration
of A and its magnitude, we maintain these values as separate variables so the
multiplications need only be computed once.
NOTE 3: For our magnitude comparison, we actually compare the SQUARE of the
magnitude against the square of our divergence value. This saves us from
computing a square root.
*/

/****************************************************************************
The main() function serves only to pass initial values to compute_fractal. We
will leave the initialize() routine to be a "black box". Interested
programmers may want to write their own routine for whatever display board is
available. The values used in this test program show the familiar picture of
the Mandelbrot set. By varying these numbers, you can obtain some breathtaking
fractal landscapes.
 ***************************************************************************/

int main(void)
{
float origin_R,origin_I,size_R,size_I;

/* The initialize() routine must initialize display board, clear display
buffer, load a table of 256 colors into color palette, and set global
variables, screenx and screeny. If successful, it returns 0. If it encounters
any problems it returns a non-zero value. */

if (initialize()) return(1);

origin_R = -4.0; /* origin represents the upper left corner of */
origin_I = -3.0; /* the screen. */
size_R = 8.0; /* size represents the domain of the screen */
size_I = 6.0; /* in the complex plane. */

compute_fractal(origin_R,origin_I,size_R,size_I);
return(0);
}
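
For readers without a display board, the inner loop of compute_fractal() can be tried out on an ordinary hosted C compiler by drawing the set as text. The sketch below is an assumption-laden stand-in for the "black box" routines, not the authors' code: escape_color, color_to_char, and render_ascii are hypothetical names, and the choice of characters is arbitrary.

```c
#include <assert.h>
#include <stdio.h>

/* Color for one point C = (const_r, const_i), mirroring the inner loop of
   compute_fractal(): the counter starts at 255 and counts down, and 0 is
   returned when the point never exceeds the divergence limit. */
int escape_color(float const_r, float const_i)
{
    float ar = 0.0F, ai = 0.0F, ar_sqr = 0.0F, ai_sqr = 0.0F;
    int color;

    for (color = 256; --color > 0;) {
        ai = (ar * ai * 2.0F) + const_i;   /* uses the previous AR */
        ar = ar_sqr - ai_sqr + const_r;
        ar_sqr = ar * ar;
        ai_sqr = ai * ai;
        if (ar_sqr + ai_sqr > 4.0F)        /* squared magnitude vs. 2.0 squared */
            break;
    }
    return color;
}

/* Arbitrary mapping from color index to a printable character. */
char color_to_char(int color)
{
    if (color == 0)  return '#';   /* never diverged: inside the set */
    if (color > 250) return ' ';   /* diverged almost immediately */
    return '.';
}

/* Text-mode replacement for the put_pixel() loop: one character per point. */
void render_ascii(FILE *out, int cols, int rows,
                  float base_r, float base_i, float span_r, float span_i)
{
    int row, col;

    assert(out != NULL && cols > 0 && rows > 0);
    for (row = 0; row < rows; row++) {
        for (col = 0; col < cols; col++) {
            float cr = base_r + span_r * (float)col / (float)cols;
            float ci = base_i + span_i * (float)row / (float)rows;
            fputc(color_to_char(escape_color(cr, ci)), out);
        }
        fputc('\n', out);
    }
}
```

Calling render_ascii(stdout, 64, 24, -4.0F, -3.0F, 8.0F, 6.0F) covers the same window of the complex plane that main() passes to compute_fractal().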







[LISTING TWO]

******************************************************************************
* Assembly code generated by TMS340 C Compiler using the -mc option for
* generating coprocessor instructions.
******************************************************************************
; gspac -mc -v20 mandel.gc mandel.if
; gspcg -o -c -v20 -o mandel.if mandel.asm mandel.tmp
 .version 20
 .ieeefl
FP .set A13
STK .set A14
 .file "mandel.gc"
 .globl _screenx
 .globl _screeny

 .sym _compute_fractal,_compute_fractal,32,2,0
 .globl _compute_fractal

 .func 50
;>>>> void compute_fractal(float BaseR,float BaseI,float SpanR,float SpanI)
;>>>> register float AR, AI, ConstR, ConstI;
;>>>> register float ARsqr, AIsqr, DeltaI, DeltaR;
;>>>> register int row,col,color;
******************************************************
* FUNCTION DEF : _compute_fractal
******************************************************
_compute_fractal:
 MMTM SP,A7,A9,A10,A11,FP
 SUBI 448,SP
 MOVE SP,A11
 MOVD RA5,*A11+,4
 MOVD RB6,*A11+,3
 MOVE STK,FP
 ADDK 32,STK
 MOVE SP,*STK+,1 ;; DEBUGGER TRACEBACK AID
 .sym _BaseR,-32,6,9,32
 .sym _BaseI,-64,6,9,32
 .sym _SpanR,-96,6,9,32
 .sym _SpanI,-128,6,9,32
 .sym _AR,32,6,4,32
 .sym _AI,33,6,4,32
 .sym _ConstR,30,6,4,32
 .sym _ConstI,31,6,4,32
 .sym _ARsqr,28,6,4,32
 .sym _AIsqr,29,6,4,32
 .sym _DeltaR,26,6,4,32
 .sym _DeltaI,0,6,1,32
 .sym _row,9,4,4,32
 .sym _col,10,4,4,32
 .sym _color,11,4,4,32

 .line 9
;>>>> DeltaR = SpanR / (float)screenx;
 MOVE @_screenx,A7,1

 MOVE A7,RA0 ; screenx --> RA0
 CVIF RA0,RB0 ; convert RA0 from int to float, put in RB0
 MOVE FP,A7
 SUBI 96,A7

 MOVF *A7+,RA0 ; move parameter SpanR --> RA0
 DIVF RA0,RB0,RB0 ; RA0 / RB0 --> RB0. Result is DeltaR
 ADDI 64,A7
 MOVF RB0,*A7+ ; Store DeltaR as a local variable.

 .line 10
;>>>> DeltaI = SpanI / (float)screeny;
 MOVE @_screeny,A7,1
 MOVE A7,RA1 ; screeny --> RA1
 CVIF RA1,RB1 ; convert to float and put in RB1
 MOVE FP,A7
 SUBI 128,A7
 MOVF *A7+,RA1 ; get SpanI
 DIVF RA1,RB1,RA5 ; compute DeltaI and LEAVE IN RA5!!!
 ; DeltaI is used as a register variable!
 .line 12
;>>>> ConstI = BaseI;
 ADDK 32,A7
 MOVF *A7+,RB7 ; BaseI --> ConstI (RB7)

 .line 13
;>>>> for (row=0; row < screeny; row++) {
; NOTICE here that both ConstI and row are used as register variables. Yet
; ConstI, which is a float, is kept in a 34082 register and row, which is an
; int, is kept in a 34020 register! The C compiler is smart enough to know
; which variables should be maintained on which processor!
;
 CLRS A9 ; 0 --> row (A9)
 MOVE @_screeny,A7,1
 CMP A7,A9
 JRGE L2

L1:
 .line 15
;>>>> ConstR = BaseR;
 MOVE FP,A7
 SUBK 32,A7
 MOVF *A7+,RA7 ; BaseR --> ConstR (RA7)

 .line 16
;>>>> for (col=0; col < screenx; col++) {
 CLRS A10 ; 0 --> col (A10)
 MOVE @_screenx,A7,1
 CMP A7,A10
 JRGE L4

L3:
 .line 18
;>>>> AR = AI = ARsqr = AIsqr = 0.0F;
 CLRF RB6 ; clear AIsqr (RB6)
 MOVF RB6,RA6 ; clear ARsqr (RA6)
 MOVF RB6,RB8 ; clear AI (RB8)

 MOVF RB6,RA8 ; clear AR (RA8)


 .line 20
;>>>> for (color = 256; --color > 0;)
 MOVI 256,A11
 SUBK 1,A11 ; 255 --> color (A11)
 JRLE L6

L5:
 .line 22
;>>>> AI = (AR * AI * 2.0F) + ConstI;
 MPYF RA8,RB8,RA0 ; AR * AI --> RA0
 TWOF RB0 ; 2.0F --> RB0
 MPYF RA0,RB0,RA0 ; AR * AI * 2.0 --> RA0
 ADDF RA0,RB7,RB8 ; RA0 + ConstI --> AI (RB8)

 .line 23
;>>>> AR = ARsqr - AIsqr + ConstR;
 SUBF RA6,RB6,RB1 ; ARsqr - AIsqr --> RB1
 ADDF RA7,RB1,RA8 ; ConstR + RB1 --> AR (RA8)

 .line 25
;>>>> if ( ((ARsqr = AR*AR)+
 MOVF RA8,RB1 ; AR --> RB1
 MPYF RA8,RB1,RA6 ; Compute new ARsqr
 MOVF RB8,RA0 ; AI --> RA0
 MPYF RA0,RB8,RB6 ; Compute new AIsqr
 ADDF RA6,RB6,RA0 ; Sum of squares --> RA0
 MOVI FS3,A7 ; FS3 is a pointer to a float constant, 4.0
 MOVF *A7+,RB1 ; 4.0 --> RB1
 CMPF RA0,RB1 ; if square of magnitude > 4.0, break
 GETCST
 JRGT L6

 .line 26
;>>>> (AIsqr = AI*AI)) > 4.0F ) break;
 .line 20
 SUBK 1,A11 ; Otherwise, decrement color and see
 JRGT L5 ; if loop ended.

L6:
 .line 29
;>>>> put_pixel(color,col,row);
 MOVE STK,-*SP,1 ; Call display_board dependent routine
 MOVE A9,*STK+,1 ; to place a pixel on the screen.
 MOVE A10,*STK+,1
 MOVE A11,*STK+,1
 CALLA _put_pixel

 .line 30
;>>>> ConstR += DeltaR;
 MOVE FP,A8
 MOVF *A8+,RB0
 ADDF RA7,RB0,RA7


 .line 16
 ADDK 1,A10 ; col++
 MOVE @_screenx,A7,1
 CMP A7,A10 ; If col >= screenx, end middle loop

 JRLT L3 ; Otherwise, jump back

L4:
 .line 32
;>>>> ConstI += DeltaI;
 ADDF RA5,RB7,RB7

 .line 13
 ADDK 1,A9 ; row++
 MOVE @_screeny,A7,1
 CMP A7,A9 ; If row >= screeny, end outer loop
 JRLT L1 ; Otherwise, jump back

L2:
EPI0_1:
 .line 34
 MOVE *SP(640),STK,1 ; C cleanup
 MOVD *SP+,RA5,4
 MOVD *SP+,RB6,3
 MMFM SP,A7,A9,A10,A11,FP
 RETS 2

 .endfunc 83,00000ee80H,32

 .sym _main,_main,36,2,0
 .globl _main

 .func 103
;>>>> main()
;>>>> float origin_R,origin_I,size_R,size_I;
******************************************************
* FUNCTION DEF : _main
******************************************************
_main:
 MOVE FP,-*SP,1
 MOVE STK,FP
 ADDI 128,STK
 MOVE SP,*STK+,1 ;; DEBUGGER TRACEBACK AID
 .sym _origin_R,0,6,1,32
 .sym _origin_I,32,6,1,32
 .sym _size_R,64,6,1,32
 .sym _size_I,96,6,1,32


 .line 12
;>>>> if (initialize()) return(1);
 CALLA _initialize
 MOVE A8,A8
 JRZ L8
 MOVK 1,A8
 JR EPI0_2


L8:
 .line 14
;>>>> origin_R = -4.0;
 MOVE @FS4,A8,1
 MOVE A8,*FP,1


 .line 15
;>>>> origin_I = -3.0;
 MOVE @FS5,A8,1
 MOVE A8,*FP(32),1

 .line 16
;>>>> size_R = 8.0;
 MOVE @FS6,A8,1
 MOVE A8,*FP(64),1

 .line 17
;>>>> size_I = 6.0;
 MOVE @FS7,A8,1
 MOVE A8,*FP(96),1

 .line 19
;>>>> compute_fractal(origin_R,origin_I,size_R,size_I);
 MOVE STK,-*SP,1
 MOVE *FP(96),*STK+,1
 MOVE *FP(64),*STK+,1
 MOVE *FP(32),*STK+,1
 MOVE *FP(0),*STK+,1
 CALLA _compute_fractal

EPI0_2:
 .line 20
 SUBI 160,STK
 MOVE *SP+,FP,1
 RETS 0

 .endfunc 140,00000a000H,128

 .sym _screenx,_screenx,4,2,32
 .globl _screenx
 .bss _screenx,32,32

 .sym _screeny,_screeny,4,2,32
 .globl _screeny
 .bss _screeny,32,32
*************************************************
* DEFINE FLOATING POINT CONSTANTS *
*************************************************
 .text
 .even 32
FS1: .float 0.0
FS3: .float 4.0
FS4: .float -4.0
FS5: .float -3.0
FS6: .float 8.0
FS7: .float 6.0
*****************************************************
* UNDEFINED REFERENCES *
*****************************************************
 .ref _put_pixel
 .ref _initialize
 .end







[LISTING THREE]

* Hand-tweaked assembler code using Listing 2 as a basis. *
 .version 20
 .ieeefl
 .globl _screenx
 .globl _screeny

* Register Nicknames are used for program clarity
* 34020 Registers...
FP .set A13 ; C function Frame Pointer
STK .set A14 ; C function Stack

DPTCH .set B3 ; Destination Pitch of Screen
OFFSET .set B4 ; Offset of Screen

* 34082 Registers...
RA0_2 .set RA0 ; 2.0 constant
RA1_4 .set RA1 ; 4.0 constant
RA2_TMP .set RA2 ; temporary storage
RA5_DI .set RA5 ; DeltaI
RA6_AR2 .set RA6 ; AR squared
RA7_CR .set RA7 ; ConstR
RA8_AR .set RA8 ; AR

RB1_DR .set RB1 ; DeltaR
RB2_TMP .set RB2 ; temporary storage
RB4_BI .set RB4 ; BaseI
RB5_BR .set RB5 ; BaseR
RB6_AI2 .set RB6 ; AI squared
RB7_CI .set RB7 ; ConstI
RB8_AI .set RB8 ; AI

TubeOffset .set 2000H ; These definitions apply for the
TubePitch .set (1024 * 8) ; SDB20 board which we used.

 .globl _compute_fractal


******************************************************
* FUNCTION DEF : _compute_fractal
******************************************************
_compute_fractal:
 MMTM SP,A0,A1,A2,A3,A4,A11,FP

* Since we are creating a highly efficient tweaked program, we have the
* main program place the 4 parameters used in compute_fractal directly
* into 34082 registers. Specifically, BaseI has been placed in RB4,
* BaseR has been placed in RB5, SpanI has been placed in RA0, SpanR has
* been placed in RA1

;>>>> DeltaR = SpanR / (float)screenx;
 MOVE @_screenx,A3,1 ; screenx --> A3 (stays there)
 MOVE A3,RA2_TMP

 CVIF RA2_TMP,RB0 ; (float)screenx --> RB0
 DIVF RA1,RB0,RB1_DR ; SpanR / screenx = DeltaR --> RB1
 ; (stays there)
;>>>> DeltaI = SpanI / (float)screeny;
 MOVE @_screeny,A4,1 ; screeny --> A4 (stays there)
 MOVE A4,RA2_TMP
 CVIF RA2_TMP,RB0 ; (float)screeny --> RB0
 DIVF RA0,RB0,RA5_DI ; SpanI / screeny = DeltaI --> RA5
 ; (stays there)
* Set up initializations outside any loops
 TWOF RA0_2 ; constant 2.0 in RA0
 SQRF RA0_2,RA1_4 ; constant 4.0 in RA1

;>>>> for (ConstI = BaseI, row=0; row < screeny; row++,ConstI += DeltaI)
 MOVF RB4_BI,RB7_CI ; BaseI --> ConstI (RB7)
 CLRS A0 ; 0 --> row (A0)

L1:
;>>>> for (ConstR = BaseR, col=0; col < screenx; col++,ConstR += DeltaR)
 MOVF RB5_BR,RA7_CR ; BaseR --> ConstR (RA7)
 CLRS A1 ; 0 --> col (A1)

L3:
;>>>> AR = AI = ARsqr = AIsqr = 0.0F;
 CLRF RB8_AI ; 0.0 --> AI (RB8)
 MOVF RB8_AI,RB6_AI2 ; 0.0 --> AI squared (RB6)
 CLRF RA8_AR ; 0.0 --> AR (RA8)
 MOVF RA8_AR,RA6_AR2 ; 0.0 --> AR squared (RA6)

;>>>> for (color = 256; --color > 0;)
 MOVI 255,A2 ; 255 --> color (A2)

L5:

;>>>> AI = ( AR * AI * 2.0F ) + ConstI;
 MPYF RA8_AR,RB8_AI,RB2_TMP ; AR * AI --> tmp (RB2)
 MPYF RB2_TMP,RA0_2,RA2_TMP ; tmp * 2.0 --> tmp (RA2)
 ADDF RA2_TMP,RB7_CI,RB8_AI ; tmp + ConstI --> AI

;>>>> AR = ARsqr - AIsqr + ConstR;
 SUBF RA6_AR2,RB6_AI2,RB2_TMP ; AR**2 - AI**2 --> tmp (RB2)
 ADDF RB2_TMP,RA7_CR,RA8_AR ; tmp + ConstR --> AR

;>>>> if ( ((ARsqr = AR*AR)+
;>>>> (AIsqr = AI*AI)) > 4.0F ) break;
 SQRF RA8_AR,RA6_AR2 ; Compute new ARsqr
 MOVF RB8_AI,RA2_TMP ; SQRF must be performed on an A reg.
 SQRF RA2_TMP,RB6_AI2 ; Compute new AIsqr
 ADDF RA6_AR2,RB6_AI2,RB2_TMP ; sum of squares in RB2
 CMPF RA1_4,RB2_TMP ; if sum of squares > 4.0, break
 GETCST
 JRLE L6

 DSJ A2,L5 ; dec color and loop back if not 0

L6:
;>>>> put_pixel(color,col,row);
 MOVE A0,A8 ; row becomes Y
 SLL 16,A8 ; shift Y into upper 16 bits

 MOVX A1,A8 ; col becomes X, Y:X now in A8
 PIXT A2,*A8.XY ; write the pixel

; bottom of 'col' loop
 ADDF RB1_DR,RA7_CR,RA7_CR ; ConstR += DeltaR
 INC A1 ; col++
 CMP A3,A1 ; if col < screenx, jump back
 JRLT L3

; bottom of 'row' loop
L4:
 ADDF RA5_DI,RB7_CI,RB7_CI ; ConstI += DeltaI
 INC A0 ; row++
 CMP A4,A0 ; if row < screeny, jump back
 JRLT L1

L2:
EPI0_1:
 MMFM SP,A0,A1,A2,A3,A4,A11,FP
 RETS

 .globl _main

******************************************************
* FUNCTION DEF : _main
******************************************************
_main:
 MOVE FP,-*SP,1
 MOVE STK,FP
 ADDI 128,STK

 MOVE SP,*STK+,1 ;; DEBUGGER TRACEBACK AID
 CALLA _initialize
 MOVE A8,A8
 JRZ L8
 MOVK 1,A8
 JR EPI0_2
L8:
 MOVE @ORG_I,A8,1 ; We can place the initial parameters
 MOVF A8,RB4_BI ; directly into the 34082 registers
 MOVE @ORG_R,A8,1 ; where they will be used by the
 MOVF A8,RB5_BR ; compute_fractal routine.
 MOVE @SIZE_I,A8,1
 MOVF A8,RA0
 MOVE @SIZE_R,A8,1
 MOVF A8,RA1
 CALLA _compute_fractal
EPI0_2:
 MOVE *SP+,FP,1
 RETS 0

 .globl _screenx
 .bss _screenx,32,32

 .globl _screeny
 .bss _screeny,32,32

*************************************************
* DEFINE FLOATING POINT CONSTANTS *
*************************************************
 .text
 .even 32
ORG_R: .float -4.0
ORG_I: .float -3.0
SIZE_R: .float 8.0
SIZE_I: .float 6.0

 .ref _initialize
 .end
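
The put_pixel replacement in Listing Three builds a packed Y:X address by shifting the row into the upper 16 bits of a register and placing the column in the lower 16 bits. The same packing can be sketched in C as follows; pack_xy, unpack_x, and unpack_y are hypothetical helper names used only for illustration.

```c
#include <assert.h>

/* Pack pixel coordinates into one 32-bit Y:X word:
   Y (row) in the upper 16 bits, X (column) in the lower 16 bits. */
unsigned long pack_xy(unsigned int x, unsigned int y)
{
    assert(x <= 0xFFFFu && y <= 0xFFFFu);
    return ((unsigned long)y << 16) | (unsigned long)x;
}

/* Recover the individual coordinates from a packed word. */
unsigned int unpack_x(unsigned long xy)
{
    return (unsigned int)(xy & 0xFFFFUL);
}

unsigned int unpack_y(unsigned long xy)
{
    return (unsigned int)((xy >> 16) & 0xFFFFUL);
}
```

For column 3 on row 2, pack_xy(3, 2) yields 0x00020003, which corresponds to the register contents the listing hands to the pixel-write instruction.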




















































MAY, 1991
ADDING THE POWER OF DSP TO YOUR APPLICATION


25 Megaflops can't hurt


 This article contains the following listings: BITTMAN.ARC


Jim Bittman


Jim is the president and founder of Bittware Research Systems. He has a BSEE
from MIT and an MSEE from the University of Maryland. He can be reached at
Bittware Research Systems, 400 East Pratt Street, Baltimore, MD 21202.


When confronted with the challenges presented by Digital Signal Processing
(DSP), programmers usually come up with questions such as, "Why would I want
to use DSP in the first place?" "How hard is DSP development, and what is
required to integrate a DSP processor into my PC-based system?" and, most
importantly, "Is it worth the trouble?"
The answer to the first question is simple: The main reason for adding DSP
power to your application is to speed up the program to achieve real-time
processing speeds--speeds typically in the range of 25 Mflops. The response to
the second question isn't quite so straightforward because DSP programming
varies from board to board and processor to processor. In general, the
development packages supplied with most commercial DSP add-in boards make the
process relatively simple. The main purpose of this article, in fact, is to
describe how to add the power of DSP processing to your PC application using
some of these off-the-shelf tools. As for the third question, we'll return
to it at the end of this article, when we evaluate the relative performance of
some of the algorithms presented here.
I'll use as an example the classic DSP algorithm known as the Fast Fourier
Transform (FFT). The hardware used in the example is a DSP coprocessor board
from CAC, which is built around an AT&T DSP32c chip (32-bit floating point DSP
processor) with 256K of 25ns SRAM and dual 16-bit analog input and output
channels. The DSP32c runs at 50 MHz and is capable of 25 Mflops.


Fast Fourier Transform


For comparison, I'll implement the FFT in three different ways: as a C program
running on the PC, as C code running on the add-in DSP board, and as DSP
assembly language running on the add-in board.
The examples will utilize FFTs from several different sources. One of the FFTs
comes from the book Numerical Recipes in C: The Art of Scientific Computing,
an excellent reference that provides C source code and detailed information on
many different algorithms. The source code is also provided on a diskette that
can be compiled for the DSP32c without modification. We'll test this routine
on both the PC and the DSP board.
I'll also try a PC-based FFT from the MathPak 87 library by Precision Plus
Software. Finally, to demonstrate programming the DSP in assembler, I will use
the assembly-code FFT routine provided by the AT&T application library. We
will invoke the assembly language FFT from both a main module written in C as
well as one written in DSP assembler.
I'll run the examples from within the Digital Signal Processing HeadQuarters
(DspHq) environment. DspHq is my company's DSP development software package
that integrates libraries of functions for the PC and DSP cards, along with
graphics and data management. Please note that there is no intrinsic
requirement for DspHq; it is used as a matter of convenience. However, the
listings as shown here do rely on DspHq. It should not be difficult to modify
them for standalone operation. Note also that the electronic version of the
listings contains additional DspHq-related files that could not be presented
here for lack of space.
Testing an FFT requires a number of steps:
 1. Synthesize a cosine waveform with a given amplitude, dc offset, frequency,
    and sample rate.
 2. Calculate the real-valued FFT of the cosine waveform.
 3. Rederive the original waveform by using an inverse real-FFT.
 4. Calculate the log-magnitude of the FFT.
 5. Display all waveforms graphically (as shown in Figure 1 through Figure 4).
The next section shows these steps as accomplished on a PC.


FFT On the PC


The PC-based executable file is called ddj.exe and incorporates both the
PC-based FFT routines from Numerical Recipes and the functions necessary to
invoke binaries that will execute on the add-in board.
The main module is called ddj.c and is shown in Listing One (page 90). The
comments in ddj.c contain information such as the makefile used to create
ddj.exe. The heart of this routine is a switch statement that selects from
among the different functions necessary for our test. The first four case
blocks invoke PC-based routines. Case 1 calls synth_cos to generate a cosine;
case 2 calls the routine realft, which is the algorithm from Numerical Recipes
optimized for real values; case 3 also calls realft, but invokes it as the
inverse transform; finally, case 4 calls a function to calculate the
log-magnitude. The rest of the case blocks are used to invoke functions
residing on the DSP board and are discussed later in this article.
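In Listing One, the inverse-transform case rescales the buffer by 1.0/(np >> 1) after calling realft so that the original amplitude is restored. A minimal sketch of that rescaling step is shown below; it is an illustration only, since the body of the scale_data function is not reproduced here, and the names scale_data_sketch and inverse_scale are hypothetical.

```c
#include <assert.h>

/* Scale factor applied after the inverse real FFT in Listing One:
   one over half the block size. */
float inverse_scale(long np)
{
    assert(np >= 2);
    return 1.0F / (float)(np >> 1);
}

/* Multiply every element of data[0..np-1] by scale_val, in place. */
void scale_data_sketch(float *data, float scale_val, long np)
{
    long i;

    assert(data != 0 && np >= 0);
    for (i = 0; i < np; i++)
        data[i] *= scale_val;
}
```

For a 1024-point block the factor is 1/512, so a forward transform followed by an inverse transform and this rescaling should return the original waveform to within rounding error.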
For purposes of timing comparison, I also implemented a host-resident C
program that calls an FFT routine optimized for the 80x87. This routine comes
from Precision Plus' MathPak87 library. The driving source file is called
ddj_mp87.c and is shown in Listing Two (page 90).


C Code Running On the DSP


Binary files that will execute on the DSP32c have the suffix .32c. Those for
the older DSP32 have the suffix .32. Although the two devices are source
compatible (the DSP32c being a superset), their binary formats are not.
The C source file for the example is ddj_32c.c, and its corresponding
executable is ddj_32c.32c. Executable files are produced by d3cc, the AT&T
C-compiler for the DSP32c. This compiler produces object files with a .o file
extension. The command d3cc -c realft.c compiles the algorithm from Numerical
Recipes in C without modification. For example, the resulting object file,
realft.o, is then linked with the other files and libraries shown in Example
1.
Example 1: Compile/link command to produce a DSP32c executable is from the C
language source. I use the math library first, even though it is slower,
because the error checking is more complete, and the routines accept a wider
range of inputs.

 d3cc -lm -lap -o ddj_32c.32c ddj_32c.o four1.o realft.o \
 -s startup.o -m memory.map
 d3cc : AT&T DSP32c 'C' compiler
 -lm : Use math library first
 -lap : Use application library second
 -o ddj_32c.32c : output file name ddj_32c.32c
 ddj_32c.o : object file for linker
 four1.o : object file for linker
 realft.o : object file for linker
 -s startup.o : assembly startup file (hardware dependent)
 -m memory.map : memory configuration file (hardware dependent)

Downloading and running this executable is accomplished by the code fragment
shown in case 6 in ddj.c (Listing One). You can also download and execute a
binary from the DOS command line with the AT&T d3dl utility. You can choose
the most appropriate blend of working in an integrated environment (such as
DspHq) and working with standalone utilities such as d3dl; d3emu, a symbolic
debugger that comes with the add-in board; and d3hex, a utility to produce
output suitable for PROM burners.
In this case, I have combined several functions into one executable. The
DSP-resident program waits for a function number sent from the PC and then
executes the proper routine, clears the function number, and waits for the
next instruction.
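The wait, execute, clear, and wait-again cycle described above amounts to a small mailbox protocol between the PC and the DSP. The following C sketch models one pass of that cycle on a single machine. It is a simplified illustration under stated assumptions, not the code from ddj_32c.c: the dsp_mailbox structure and dispatch_once function are hypothetical, and the real routines are replaced by a per-function call counter.

```c
#include <assert.h>

/* Shared state: the host writes a function number, the DSP clears it. */
typedef struct {
    volatile long func_num;   /* 0 means idle, 1..7 selects a routine */
    long calls[8];            /* stand-in for the real work: call counts */
} dsp_mailbox;

/* Service at most one pending request.
   Returns 0 if nothing was pending, the function number if it was handled,
   or -1 if the number was out of range. The request is cleared either way
   so that the host can post the next one. */
long dispatch_once(dsp_mailbox *mb)
{
    long f;

    assert(mb != 0);
    f = mb->func_num;
    if (f == 0)
        return 0;             /* idle: the caller keeps polling */

    if (f >= 1 && f <= 7)
        mb->calls[f]++;       /* a real program would call the routine here */
    else
        f = -1;

    mb->func_num = 0;         /* signal completion to the host */
    return f;
}
```

On the board, the equivalent of dispatch_once would sit inside an endless loop, and the host would poll the function number until it returned to zero before uploading results.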
The main module for the target-based executable, ddj_32c.c, is shown in
Listing Three (page 92). This is analogous to the host-based ddj.c shown
earlier in that it basically consists of a switch statement that dispatches to
the requested functions. The functions, in the order shown in the case
statement, are:
 1. Convert from IEEE format to DSP format.
 2. Convert from DSP to IEEE.
 3. Synthesize a cosine waveform.
 4. Invoke realft for a forward FFT.
 5. Invoke realft for an inverse FFT.
 6. Calculate the log-magnitude.
 7. Invoke rffta, the assembly-language implementation of the FFT.
Because the DSP32c uses a data format different from that of the PC, the DSP
converts downloaded numbers to its own internal format. When uploading, the
DSP converts the data to IEEE format, waits for the PC to upload, and then
converts the data back to DSP format. A single-cycle instruction is available
for the conversion in both directions. These details are hidden in the
ul_float and dl_float functions.


Assembly Code on the DSP


For testing assembly-code, we'll use the FFT routine rfft, provided by the
AT&T assembler application library. (Note the similarity between DSP32c
assembly language and C, a similarity that makes it relatively easy to become
proficient in DSP32c assembler.) Invoking rfft is similar to invoking the C
routine realft, with the exception of having to download the number of stages
of the FFT prior to execution. This is shown in case 9 of main( ) in ddj.c
(Listing One).
I've also implemented the equivalent of ddj_32c.c in assembler. Because
assembly-language source files have the suffix .s, our file is called
ddj_32s.s and is shown in Listing Four (page 94). AT&T provides its own
assembler/linker program, d3make, which is used with assembly language
programs only. Example 2 shows how this would be invoked in the source code
file.
Example 2: Invoking the AT&T assembler/linker program called d3make in the
source code file.

 d3make -o ddj_32s.32c -M6 -Q -W ddj_32s.s
 d3make : AT&T assembler/linker make
 -o ddj_32s.32c : output file name
 -M6 : memory mode, hardware dependent
 -Q : DSP32c (as opposed to DSP32) target device
 -W : turn off warnings
 ddj_32s.s : source file input



The Results


As I stated at the beginning of this article, the main reason for adding DSP
power to your application is to speed it up and achieve real-time processing
speeds. The timing results for the examples developed here are shown in Table
1. Except where noted, the DSP32c processing times include data
upload/download and conversion. As you can see, realft running on the DSP32c
is about four times faster than the same code running on a 25-MHz 386/387
computer, and the assembly-language rffta is another factor of four faster
than that. If you subtract all overhead times (download, upload, and
conversion), you end up with an 80-fold speed-up over straightforward
host-based C code, and a 40-fold improvement over the optimized PC-based
routines from the MathPak library.
Table 1: Timing tests were calculated on a 25-MHz 386 PC (25-MHz 387). 1024
point real-valued FFT, input sine wave. 200 Hz sine sampled at kHz, amplitude
= 1, dc offset = 0, phase = 0.

 Routine Name                        Computation   Execution time (ms)
 ------------------------------------------------------------------
 Numerical Recipes, realft           PC            165.8
 MathPak 87, rvfft                   PC            86.3
 Numerical Recipes, realft           DSP32c        45.6*
 AT&T Application Library, rffta     DSP32c        9.8*
 AT&T Application Library, rffta     DSP32c        2.1**

 Generate Cosine                     PC            94.0
 Generate Cosine                     DSP32c        38.4

 Download and convert                n/a           2.2
 Convert, upload, and convert        n/a           3.3

 Upload rate = 2.7 Mbytes/sec, Download rate = 2.9 Mbytes/sec
 *Includes data download, upload, and conversion
 **Algorithm only, no overhead.
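
The speed-up figures quoted in the text follow directly from the times in Table 1, and the transfer rates give a rough idea of how much of the overhead is raw data movement. The sketch below shows the arithmetic; speedup and transfer_ms are hypothetical helper names, and treating "Mbytes" as one million bytes is an assumption.

```c
#include <assert.h>

/* Ratio of two execution times, e.g. PC time over DSP time. */
double speedup(double baseline_ms, double faster_ms)
{
    assert(baseline_ms > 0.0 && faster_ms > 0.0);
    return baseline_ms / faster_ms;
}

/* Time in milliseconds to move a block of bytes at a rate given in
   Mbytes/sec, assuming 1 Mbyte = 1,000,000 bytes. */
double transfer_ms(double bytes, double mbytes_per_sec)
{
    assert(bytes >= 0.0 && mbytes_per_sec > 0.0);
    return bytes / (mbytes_per_sec * 1000.0);
}
```

For example, speedup(165.8, 2.1) is about 79 and speedup(86.3, 2.1) is about 41, matching the 80-fold and 40-fold figures. If the 1024 points are 4-byte floats, transfer_ms(4096.0, 2.9) comes to roughly 1.4 ms, which would leave the remainder of the 2.2 ms download time for conversion and handshaking.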




Products Mentioned


DspHq Signal Processing Development Software
BittWare Research Systems
Inner Harbor Center, 8th Floor
400 East Pratt Street
Baltimore, MD 21202
800-848-0435

D3emu Debugger
Board Interface Library
Communications, Automation, and Control
1642 Union Blvd.
Allentown, PA 18103
800-367-6735

Numerical Recipes in C: The Art of Scientific Computing
William H. Press, Brian P. Flannery, Saul A. Teukolsky, and
William T. Vetterling
ISBN 0-521-35465-X
Cambridge University Press
New Rochelle, NY 10801

DSP32c Assembler
DSP32c C Compiler
DSP32c Application Library
AT&T Microelectronics
Dept. 52AL330240
555 Union Blvd.
Allentown, PA 18103
800-372-2447

MathPak 87
Precision Plus Software
939 Griffith Street
London, Ontario N6K 3S2
Canada
519-657-0633

_ADDING THE POWER OF DSP TO YOUR APPLICATIONS_
by Jim Bittman



[LISTING ONE]
/*-------------------- DDJ.C-----------------------------------------
 PC-Resident program for FFT and for controlling DSP co-processor
 The Makefile for this program is:
 .c.obj:
 cl /AL -c -DANSI $<
 ddj.exe: ddj.obj dspif.obj dspif.h four1.obj realft.obj
 cl /AL ddj.obj dspif.obj four1.obj realft.obj \
 -link dsputill.lib hqmcil.lib
 This program assumes the DspHq environment and requires additional
 support files (menu definition and function specification files)
 not listed in the Makefile. However, it should not be difficult to
 excerpt the relevant portions of this code for standalone execution.
+------------------------------------------------------------------*/

#include <stdio.h> /* include for NULL define */
#include <math.h> /* include for math functions */
#include "hq_ci_.h" /* include for dsphq interface */
#include "nr.h" /* include nr function prototypes */
#include "nrutil.h" /* include nr utility prototypes */
#include "dsputil.h" /* include CAC prototypes */
#include "dspif.h" /* dsp interface function header */

/*---------- define the menu parameter types ----------------------*/
typedef unsigned long parm1_type; /* Buffer 1 */
typedef unsigned long parm2_type; /* Buffer 2 */
typedef menu_float parm3_type; /* Cosine Amplitude */
typedef menu_float parm4_type; /* Cosine DC-offset */
typedef menu_float parm5_type; /* Cosine Frequency */
typedef menu_float parm6_type; /* Cosine Sample Rate */
/*------------ constant definitions -------------------------------*/
#define SQR(a) ((a)*(a))
#define PI (float) 3.141592654
#define BAD_FUNC_NUM (int) -1 /* user error code */
/*----- Function Constants For Dsp32 'C' & Assembly Code ------*/
#define DSP32_X 1 /* converts IEEE ==> DSP numbers */
#define IEEE32_X 2 /* converts DSP ==> IEEE numbers */
/*----- Function Constants For Dsp32 'C' Code ----------------*/
#define GENCOS_C 3 /* generate test cosine */
#define REALFT_C 4 /* performs realft, from numerical recipes */
#define IREALFT_C 5 /* performs inv_realft, from numerical recipes */
#define LOGMAG_C 6 /* calculates log-magnitude */

#define RFFTA_C 7 /* calculates real fft using app-lib */
/*----- Function Constants For Dsp32 Assembly Code ------------*/
#define RFFT_S 3 /* performs rfft, from AT&T application library */
#define MAG_S 4 /* calculates magnitude-squared */
#define LOG_S 5 /* calculates log */
/*----- Function Prototypes ----------------------------------*/
int ilog2 (unsigned int);
void log_mag (float far * i1, float far * o1, long bs);
void scale_data (float far *output1, float scale_val, long np);
void synth_cos (float far *data1, long np, float a, float d, float f,
 float s);
main(int argc, char *argv[])
{ long indx; /* used for loop index */
 int func_num; /* for function number from dsphq */
 long np; /* for block size from dsphq */
 float far * input1; /* array address */
 float far * output1; /* array address */
 long bs; /* address of DSP blocksize var */
 long flag; /* address of DSP funcnum flag */
 parm1_type b1; /* DSP Buffer #1 */
 parm2_type b2; /* DSP Buffer #2 */
 parm3_type amp; /* cosine amplitude */
 parm4_type dco; /* cosine DC offset */
 parm5_type freq; /* cosine frequency */
 parm6_type samprate; /* cosine sample rate */
 init_intfc(argc, argv); /* init dsphq interface */
 func_num = get_func_num(); /* get the function number */
 np = get_block_size(); /* get the block size */
 /* get menu parameters */
 b1 = get_parm(1); /* DSP Buffer #1 */
 b2 = get_parm(2); /* DSP Buffer #2 */
 amp = get_parm(3); /* cosine amplitude */
 dco = get_parm(4); /* cosine DC offset */
 freq = get_parm(5); /* cosine frequency */
 samprate = get_parm(6); /* cosine sample rate */
 /* get array addresses */
 input1 = get_data_in_ptr(1); /* base address of input #1 */
 output1 = get_data_out_ptr(1); /* base address of output #1 */
 /* perform selected function */
 switch (func_num)
 {
 case 1 : /*--- Synthesize Cosine Using PC ---------------------*/
 synth_cos(output1, np, amp, dco, freq, samprate);
 break;
 case 2 : /*---- Numerical Recipes Forward Real FFT Using PC ----*/
 output1--; /* NR funcs index at 1 */
 realft(output1, np>>1, 1); /* forward real fft */
 break;
 case 3 : /*---- Numerical Recipes Inverse Real FFT Using PC ----*/
 output1--; /* NR funcs index at 1 */
 realft(output1, np>>1, -1); /* inverse real fft */
 output1++; /* realign address */
 scale_data(output1,1.0/(np >> 1),np); /* restore original ampl. */
 break;
 case 4 : /*------ Calculate LOG(10)-[MAGNITUDE] using PC --------*/
 if (input1)
 log_mag(input1, output1, np); /* take logmag of input */
 else
 log_mag(output1, output1, np); /* perform logmag in-place */

 break;
 case 5 : /*--------- Synthesize Cosine Using DSP-32C -------------*/
 init_dsp("ddj_32c.32c",&flag,&bs,np); /* download dsp code & init */
 dsp_dl_fp(get_addr("amp"),amp); /* download floats */
 dsp_dl_fp(get_addr("dco"),dco);
 dsp_dl_fp(get_addr("freq"),freq);
 dsp_dl_fp(get_addr("samprate"),samprate);
 set_dspbuf("o1", b1); /* set dsp buffer address */
 dsp_dl_int(flag,GENCOS_C); /* invoke function on dsp */
 wait4dsp(flag); /* wait for dsp to finish */
 ul_float(output1,np,flag,b1); /* upload results */
 break;
 case 6 : /*---- Numerical Recipes Forward Real FFT Using DSP-32C ----*/
 init_dsp("ddj_32c.32c",&flag,&bs,np); /* download dsp code & init */
 set_dspbuf("o1", b1); /* set dsp buffer address */
 dl_float(input1,np,flag,b1); /* download float array */
 dsp_dl_int(flag,REALFT_C); /* execute "realft" on dsp */
 wait4dsp(flag); /* wait for dsp to finish */
 ul_float(output1,np,flag,b1); /* upload results */
 break;
 case 7 : /*---- Numerical Recipes Inverse Real FFT Using DSP-32C -----*/
 init_dsp("ddj_32c.32c",&flag,&bs,np); /* download dsp code & init */
 dl_float(input1,np,flag,b1); /* download float array */
 set_dspbuf("o1", b1); /* set dsp buffer address */
 dsp_dl_int(flag,IREALFT_C); /* execute "inv_realft" on dsp */
 wait4dsp(flag); /* wait for dsp to finish */
 ul_float(output1,np,flag,b1); /* upload results */
 break;
 case 8 : /*----- Calculate LOG(10)-[MAGNITUDE] using DSP-32C ----*/
 init_dsp("ddj_32c.32c",&flag,&bs,np); /* download dsp code & init */
 dl_float(input1,np,flag,b1); /* download float array */
 set_dspbuf("i1", b1); /* set dsp buffer address */
 set_dspbuf("o1", b2); /* set dsp buffer address */
 dsp_dl_int(flag,LOGMAG_C); /* execute "logmag" on dsp */
 wait4dsp(flag); /* wait for dsp to finish */
 ul_float(output1,np,flag,b2); /* upload results */
 break;
 case 9 : /*------- Forward Real FFT Using DSP-32C App-Lib --------*/
 init_dsp("ddj_32s.32c",&flag,&bs,np); /* download dsp code & init */
 dl_float(input1,np,flag,b1); /* download float array */
 set_dspbuf("o1", b1); /* set dsp buffer address */
 dsp_dl_int(get_addr("stages"),ilog2(np));
 /* download int */
 dsp_dl_int(flag,RFFT_S); /* execute "rfft" on dsp */
 wait4dsp(flag); /* wait for dsp to finish */
 ul_float(output1,np,flag,b1); /* upload results */
 break;
 case 10: /*-------- Download Data To DSP-32C ---------------------*/
 init_dsp("ddj_32s.32c",&flag,&bs,np); /* download dsp code & init */
 dl_float(input1,np,flag,b1); /* download data from pc to dsp */
 break;
 case 11: /*--------- Upload Data From DSP-32C ---------------------*/
 init_dsp("ddj_32s.32c",&flag,&bs,np); /* download dsp code & init */
 ul_float(output1,np,flag,b1); /* upload results */
 break;
 default : set_err_return(BAD_FUNC_NUM); break;
 }
}
/*--- This function returns the integer part of the log base 2 of x. ---*/

int ilog2(unsigned int x)
{
 return( x >> 1 ? 1 + ilog2(x >> 1) : 0);
}
/*--- This function scales np elements of data1[] by scale_val. ---*/
void scale_data(float far * data1, float scale_val, long np)
{ long i;
 for (i = 0; i < np; i++) {
 data1[i] *= scale_val;
 }
}
/*--- Function generates cosine data: data[i] = A cos((2 pi f/s) i) + d ---*/
void synth_cos(float far * data1, long np, float a, float d, float f, float s)
{ long i;
 float theta, angle_step;
 angle_step = 2.0 * PI * f / s;
 theta = 0.0;
 for (i = 0; i < np; i++) {
 data1[i] = (a * cos(theta)) + d;
 theta += angle_step;
 }
}
/*--- log_mag ---*/
void log_mag(float far * i1, float far * o1, long bs)
{ long i;
 long n;
 n = bs >> 1;
 o1[0] = log10(SQR(i1[0]));
 for (i = 1; i < n; i++) {
 o1[i] = log10(SQR(i1[2*i]) + SQR(i1[2*i+1]));
 }
 for (i = n; i < bs; i++) {
 o1[i] = 0.0;
 }
}






[LISTING TWO]

/*-----------------DDJ_MP87.C-------------------------------------------
 MathPak 87 FFT Function Group Execution Source File
 The makefile is: ddj_mp87.exe: ddj_mp87.c
 cl /AL ddj_mp87.c -link hqmcil.lib mpak87.lib
+----------------------------------------------------------------------*/
#include <hq_ci_.h> /* include function prototypes */
#include <mpak87.h> /* include math pak header */
#define BAD_BLOCK_SIZE -1 /* user defined error codes */
#define BAD_FUNC_NUM -2

int fft_stages(long fft_size); /*function prototype*/

void main(int argc, char *argv[])
{
 int m; /* number of fft stages */
 far_array_of_double o1; /* data pointer */

 init_intfc(argc, argv); /* MUST do this before other interface functions */
 o1 = get_data_out_ptr(1); /* get address of data */
 if ((m = fft_stages(get_block_size())) == 0)
 set_err_return(BAD_BLOCK_SIZE); /* won't happen if .fnc file correct */
 else
 switch(get_func_num()) {
 case 1: rvfft(o1, m); break; /* real fft from MathPak library */
 case 2: irvfft(o1, m); break; /* inverse real fft from MathPak */
 default:
 set_err_return(BAD_FUNC_NUM); /* won't happen if .fnc file correct */
 break;
 }
}
/* return the log(base 2) of the input, or 0 if input is not a power of 2 */
int fft_stages(long fft_size)
{ int rtn;
 int sw_fft;
 sw_fft = fft_size;
 switch(sw_fft) {
 case 8 : rtn = 3; break;
 case 16 : rtn = 4; break;
 case 32 : rtn = 5; break;
 case 64 : rtn = 6; break;
 case 128 : rtn = 7; break;
 case 256 : rtn = 8; break;
 case 512 : rtn = 9; break;
 case 1024 : rtn = 10; break;
 case 2048 : rtn = 11; break;
 case 4096 : rtn = 12; break;
 case 8192 : rtn = 13; break;
 default : rtn = 0; break;
 }
 return(rtn);
}







[LISTING THREE]

/*--------------------DDJ_32C.C--------------------------------------
 This file will run on the DSP add-in board after compilation
 by the AT&T C compiler d3cc. The makefile is:
 .c.o:
 d3cc -c $<
 .s.o:
 d3as -W -Q $<
 ddj_32c.32c: ddj_32c.o startup.o memory.map four1.o realft.o
 d3cc -lm -lap -o ddj_32c.32c ddj_32c.o four1.o realft.o \
 -s startup.o -m memory.map
+--------------------------------------------------------------------*/
#include <stdio.h>
#include <math.h>
#include <nr.h>

#define PI 3.1415926

#define SQR(a) ((a)*(a))

asm(".global i1, i2, o1, o2");
asm(".global funcnum, bs");
asm(".global amp, dco, freq, samprate");

short funcnum;
short bs;
float *i1, *i2, *o1, *o2;
float amp, dco, freq, samprate;
/*------------------------------------------------------------*/
short fft_stages(fft_size)
short fft_size;
{
 short rtn;
 switch(fft_size)
 {
 case 32 : rtn = 5; break;
 case 64 : rtn = 6; break;
 case 128 : rtn = 7; break;
 case 256 : rtn = 8; break;
 case 512 : rtn = 9; break;
 case 1024 : rtn = 10; break;
 case 2048 : rtn = 11; break;
 case 4096 : rtn = 12; break;
 case 8192 : rtn = 13; break;
 default : rtn = 0; break;
 }
 return(rtn);
}
/*------------------------------------------------------------*/
main()
{ register float scal;
 short n;
 float *data1, *data2, temp;
 register short i,j;

 while (1) {
 funcnum=0;
 while (!(funcnum))
 ; /* Wait for PC to Download Function Number */
 n = bs >> 1;
 switch(funcnum) {
 case 1: /*--------- Convert to DSP Format ------------*/
 dsp32(bs,o1); /*IEEE-->DSP */
 break;
 case 2: /*--------- Convert to IEEE format -----------*/
 ieee32(bs,o1); /*DSP-->IEEE*/
 break;
 case 3: /*--------- Synthesize Cosine ----------------*/
 scal = 2.0 * PI * freq / samprate;
 j = 0;
 data1 = o1;
 for (i=bs; i-- > 0; j++) {
 *data1++ = (amp * cos(scal * j)) + dco;
 }
 break;
 case 4: /*------- Forward FFT ------------------------*/
 data1 = o1;

 data1--;
 realft(data1,n,1); /*from numerical recipes*/
 break;
 case 5: /*------- Inverse FFT ------------------------*/
 data1 = o1;
 data1--;
 realft(data1,n,-1); /*from numerical recipes*/
 /* scale by 1/n to retain original amplitude*/
 data1 = o1;
 scal = 1.0 / n;
 for (i=bs; i-- > 0; data1++) {
 *data1 = *data1 * scal;
 }
 break;
 case 6: /*----- Calc LOG-MAGNITUDE (data output from NR-realft)--*/
 o1[0] = log10(SQR(i1[0]));
 temp = log10(SQR(i1[1]));
 for (i=1;i<n;i++) {
 o1[i]=log10(SQR(i1[2*i])+SQR(i1[2*i+1]));
 }
 o1[n] = temp;
 for (i=n+1; i<bs; i++) {
 o1[i] = 0.0;
 }
 break;
 case 7: /*------ Forward FFT ----------------------------------*/
 data1 = o1;
 rffta(bs,fft_stages(bs),data1); /*uses AT&T app lib*/
 break;
 default : break;
 }
 }
}





[LISTING FOUR]

/*------ file DDJ_32S.S-----------------------------------------------
 Assembly language version of FFT test. The makefile is as follows:
 ddj_32s.32c: ddj_32s.s
 d3make -o ddj_32s.32c -M6 -Q -W ddj_32s.s
+---------------------------------------------------------------------*/
#include <dspregs.h>
/*--------------------------------------------------------------------*/
.global i1,o1,o2
.global funcnum, bs, stages
.global endofcode
/*--------------------------------------- initialization -------------*/
 r5 = 0x0003
 pcw = r5
 ioc = 0x30CC0
/*--------------------------------------- wait until funcnum != 0 ----*/
begin:
 r5 = *funcnum
 nop
 if (eq) goto begin

 nop
/*--------------- switch on funcnum, which is set to function number --*/
/* func 1: IEEE-->DSP */
 r5 = r5 - 1
 if (eq) goto do_dsp32
/* func 2: DSP-->IEEE */
 r5 = r5 - 1
 if (eq) goto do_ieee32
/* func 3: invoke rffta, from AT&T app library */
 r5 = r5 - 1
 if (eq) goto rffta
/* func 4: calc magnitude squared */
 r5 = r5 - 1
 if (eq) goto do_mag
 nop
/* func 5: calc log */
 r5 = r5 - 1
 if (eq) goto do_log10
 nop
/* illegal function number */
 goto finished
 nop
/*---------------------------------call to _rffta -----------------*/
rffta:
 r2e = *o1
 r1 = *bs
 r3 = *stages
 *fft_b = r2e
 *fft_n = r1e
 *fft_m = r3e
 call _rffta (r14)
 nop
fft_lv: int24 localV
fft_n: int24 0
fft_m: int24 0
fft_b: int24 0
.align 4
 goto finished
 nop
/*---------------------------------calc magnitude------------------*/
do_mag:
 r8 = *bs
 r10e = *i1
 r8 = r8 - 2
 r11e = *o1
 nop
 a0 = *r10++ /* DC value */
 nop
 a2 = *r10++ /* Nyquist */
 *r11++ = a0 = a0*a0 /* save DC mag */
magloop:
 a0 = *r10++
 nop
 a1 = *r10++
 a0 = a0*a0
 nop
 a1 = a1*a1
 nop
 nop

 *r11++ = a1 = a0 + a1
 if (r8-->=0) goto magloop
 nop
 *r11++ = a0 = a2*a2 /* save Nyquist mag */
 goto finished
 nop
/*---------------------------------calc log10------------------*/
do_log10:
 r12e = *i1
 r11e = *o1
 r13 = *bs
 r10e = in
 r9e = out
 r13 = r13 - 2
logloop:
 *r10 = a3 = *r12++
 call _log10 (r14)
 nop
 int24 localV
 int24 in, out
.align 4
 *r11++ = a3 = *r9
 if (r13-->=0) goto logloop
 nop
 goto finished
 nop
/*--------------------------------------call _ieee32 converter-----*/
do_ieee32:
 r1e = *o1
 r2 = *bs
 *o1_ieee32 = r1e
 *bs_ieee32 = r2e
 call _ieee32 (r14)
 nop
 int24 localV
bs_ieee32: int24 0
o1_ieee32: int24 0
.align 4
 goto finished
 nop
/*--------------------------------------call _dsp32 converter-----*/
do_dsp32:
 r1e = *o1
 r2 = *bs
 *o1_dsp32 = r1e
 *bs_dsp32 = r2e
 call _dsp32 (r14)
 nop
bs_dsp32: int24 0
o1_dsp32: int24 0
.align 4
 goto finished
 nop
/*-------------------------------finished, set funcnum=0 -----------*/
finished:
 r1=0
 goto begin
 *funcnum=r1
bs: int 256

stages: int 8
funcnum: int 0
.align 4
i1: int24 0x2000
o1: int24 0x2000
o2: int24 0x3000

.align 4
localV: 2*float 0.0
in: float 0.0
out: float 0.0
max: float 0.0
scalefac: float 0.0
#include <_rffta.asm>
#include <_log10.asm>
#include <_ieee32.asm>
#include <_dsp32.asm>
/*------------------------------------- mark end of code------------*/
endofcode: int 0xDEAD, 0xC0DE











































MAY, 1991
GETTING NUMERIC COPROCESSORS UP TO SPEED


New processors require new techniques




John H. Letcher


John is a professor of Computer Sciences at the University of Tulsa, 600 South
College Ave., Tulsa, Oklahoma 74014.


Over the past decade, a revolution has occurred in numeric processing, as
coprocessors have usurped many of the CPU's number-crunching chores. This
article describes these trends and shows how you can take advantage of these
developments, using as examples floating point (FP) numeric coprocessors that
support the 80x86 integer CPU.
As the 80x86 family (8088, 8086, 80188, 80186, 80286, and 80386) evolved, each
generation became successively faster and more potent than its predecessor.
And even as each CPU model continued to obey the instructions of its
predecessor, new instructions were added to the repertoire. This led to
unavoidable conflicts between the integer CPUs and their numeric coprocessors,
ultimately requiring that new coprocessors be developed for each CPU. This
pattern persisted until the 80486, which incorporates the FP unit as part of
its onboard circuitry. A parallel scenario has occurred with Motorola's 680x0
family.


More Speed, More Power


It's no secret that historically, floating point instructions have executed
more slowly than their integer counterparts: A given integer multiply or
divide, for instance, executes much faster than the corresponding FP
instruction.
It's less well known, however, that with each successive processor/coprocessor
generation, the time required to perform an FP multiply has moved closer to
its integer counterpart. For example, Table 1 lists the execution time for
selected instructions on a variety of Intel and Cyrix processors/coprocessors.
Table 1: Execution time (in clock cycles) for selected instructions on a
variety of Intel and Cyrix processors/coprocessors.

 CPU      IMUL <reg>      Coprocessor   FMUL ST(0), ST(i)
          (in clocks)                   (in clocks)
 -----------------------------------------------------------------------
 8088     128-154         8087          130-145
 80188    34-37           8087          130-145
 80286    21              80287         130-145
 80386    9-22            80387         52
 80386    9-22            83D87         19
 80386    9-22            EMC87         10

Note the performance improvement in integer instructions from the original
4.77-MHz 8088 PC to the 33-MHz 80386. As measured by Norton's SI, the 386
provides a performance improvement factor of over 40. Granted, SI is a
notoriously misleading benchmark; nevertheless, this is an impressive boost.
More significantly, note the even greater improvement in coprocessor
performance--both the Cyrix 83D87 and EMC87 execute FP multiply instructions
faster than integer multiply instructions on the 386!
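Clock counts tell only part of the story, because clock rates rose along with
them. The following is a minimal sketch, not part of the original benchmarks,
that converts a clock count from Table 1 into execution time. The function
names are illustrative, and it assumes each coprocessor runs at the same clock
rate as its CPU (4.77 MHz for the 8088, 33 MHz for the 80386).

```c
#include <assert.h>

/* Convert an instruction's clock count into nanoseconds at a given
   clock rate. One clock at f MHz lasts 1000/f nanoseconds. */
double clocks_to_ns(double clocks, double mhz)
{
    assert(mhz > 0.0);              /* a clock rate must be positive */
    return clocks * 1000.0 / mhz;
}

/* Ratio of two execution times; a value above 1.0 means the second
   time is the shorter (faster) one. */
double speedup(double slow_ns, double fast_ns)
{
    assert(fast_ns > 0.0);
    return slow_ns / fast_ns;
}
```

Under those assumptions, the 52-clock FMUL of the 80387 takes roughly 1.6
microseconds at 33 MHz, while the 10-clock FMUL of the EMC87 takes roughly 0.3
microseconds, a ratio of 5.2.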
How are these dramatic performance gains achieved? In the case of the Cyrix
coprocessors listed in Table 1 (as well as other FP coprocessors such as the
Weitek 3167), performance gains come from improvements in algorithms expressed
in microcode, from improved internal design, and from utilizing what I've
termed a "memory-mapped" operation mode. This mode operates in contrast to
what I call the "80387 compatible" mode, which utilizes 80387 instructions per
se.
Memory-mapped coprocessors take advantage of the fact that in protected mode
on the 386, there is a page high in the 32-bit address space that the CPU can
access without inserting internal wait states. (These
internal wait states are not to be confused with the wait states inserted into
ordinary memory references and I/O instructions.) When you send or receive
bytes, words, or doublewords to or from these locations, the address of each
transfer tells the coprocessor which instruction to carry out, while the data
bus carries the operand. The coprocessor decodes the address location as its
instruction opcode, and uses the operand values passed on the data bus in the
conventional manner. Each FP instruction is assigned a unique address within
the 4K block of memory that starts at C0000000h. A MOV instruction to or from
a location in this block tells the coprocessor that the FP instruction is to
be executed. The coprocessor uses the address alone to determine which
instruction is specified. The MOV instruction (usually to or from the register
EAX) also causes the transfer of operand data. Because the address bus and the
data bus work in parallel, the instructions work faster. For the purposes of
this article, I'll refer to the FP units that operate in this way as
"memory-mapped coprocessors."
Unfortunately, this memory-mapped scheme is incompatible with the standard
mode of operation in the Intel 80387. In this mode of operation (the
"387-compatible" mode), the FP unit expects opcodes to be embedded within the
code, not as instructions to be poked into memory addresses. This can be a
real problem if you have existing programs written for the 80387.
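To make the contrast concrete, here is a minimal sketch of the addressing
arithmetic behind the memory-mapped scheme. Only the base address C0000000h
and the 4K block size come from the text; the function names are illustrative,
and the real decoding is done by the coprocessor hardware, not by software.

```c
#include <assert.h>
#include <stdint.h>

#define FP_WINDOW_BASE 0xC0000000u  /* start of the 4K block (from the text) */
#define FP_WINDOW_SIZE 0x1000u      /* 4K */

/* Return nonzero if a bus address falls inside the coprocessor's block. */
int in_fp_window(uint32_t addr)
{
    return addr >= FP_WINDOW_BASE && addr - FP_WINDOW_BASE < FP_WINDOW_SIZE;
}

/* Build the address a MOV must target to request the FP instruction
   assigned to a given offset within the block. */
uint32_t fp_address(uint32_t offset)
{
    assert(offset < FP_WINDOW_SIZE);
    return FP_WINDOW_BASE + offset;
}

/* Recover the instruction selector (the offset) from an address, which
   is the job of the coprocessor's address decoder. */
uint32_t fp_selector(uint32_t addr)
{
    assert(in_fp_window(addr));
    return addr - FP_WINDOW_BASE;
}
```

For example, an instruction assigned to offset 0200h is requested by a MOV to
or from C0000200h. No FP opcode appears in the instruction stream at all,
which is why code written for the 80387 cannot drive this mode unchanged.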
Fortunately, both the Cyrix and Weitek coprocessors provide solutions to the
incompatibilities between memory-mapped and 387-compatible modes that give you
higher performance without making you rewrite your code. The two processors
approach the problem differently, however.
With the Weitek processor, you must purchase a small circuit board, a 3167,
and an 80387. These two processors are plugged into the circuit board, which
is in turn plugged into the coprocessor socket on the motherboard. (One such
circuit board, the mW3167/80387 Board, is available from Microway.)
By contrast, the Cyrix EMC87 can operate in 387-compatible mode because its
register set is the same. To use it, you do not need the small circuit board
mentioned earlier, only the EMC87, which plugs directly into the 121-pin 80387
socket on the motherboard.
(One other difference between the Weitek approach and that of Cyrix is that
the Weitek coprocessor is not compliant with the IEEE 754 FP Standard, while
the Intel and Cyrix coprocessors are. Weitek has, in effect, traded numerical
accuracy for speed. This, of course, is fine for those applications that don't
require precise accuracy, of which there are many.)
There is a significant performance difference between the memory-mapped and
387-compatible modes; see Table 2, which shows the performance of two
mathematical algorithms I've used in designing medical imaging devices. The
first is a 256-point complex Fast Fourier Transform (each complex point
consisting of one 32-bit REAL each for the real and imaginary parts).
The second is a 256-point Daubechies D[4] wavelet transformation (each point
is represented by a 32-bit REAL and produces a 256-point array of REALs plus a
255-point array of REALs in the multispectral decomposition of the input
data).
Table 2: Performance of two complex mathematical algorithms.

 Processor                        256-point     256-point Forward   256-point Inverse
                                  Complex FFT   D[4] Wavelet        D[4] Wavelet
                                                Transformation      Transformation
 -------------------------------------------------------------------------------------
 10MHz 80286 with 80287           160.0 msec    -                   -

 33MHz 80386 with Intel 80387     45.0 msec     6.30 msec           7.00 msec

 33MHz 80386 with Cyrix 83D87     17.6 msec     5.75 msec           4.88 msec
  or the Cyrix EMC87 operated
  in compatible mode

 33MHz 80386 with Cyrix EMC87     9.2 msec      4.00 msec           3.60 msec
  operated in memory-mapped mode

As evident from Table 2, the 83D87 and EMC87 are faster than the 80387. Note
also that the performance gains were achieved without sacrificing
compatibility with the IEEE FP Standard. If a program functioned correctly on
your older 8088-based PC with an 8087, it should function perfectly on any of
the more powerful Intel coprocessors or on the Cyrix 83D87 or EMC87. Comparing
the numeric results down to the last bit does reveal minor discrepancies,
however. The affected
instructions are not the usual FP add, subtract, or multiply, but rather those
instructions whose accuracies have not been defined by the IEEE FP Standard -- for
example, the transcendental functions. Therefore, the accuracy of one
processor may not be as great as another even though each is still fully
compliant with the Standard. These differences are slight and should not
significantly affect the results of typical calculations.
Although not listed in Table 2, the Weitek 3167 comes in at speeds essentially
equivalent to the Cyrix coprocessors.


Porting to Memory-Mapped Mode


Programmers are faced with complications when using memory-mapped
coprocessors. Although some compilers (like Microway's NDP family) can
generate memory-mapped FP instructions, there are still many programmers who
want to keep their existing assembly language code without a major rewrite.
Fortunately, translation programs that allow you to continue using compatible
instructions are normally supplied with the coprocessor.
To use the memory-mapped scheme, the computer must think it is operating in
protected mode while the programmer thinks the computer is in real mode.
The first step in accomplishing this is to find an unused 4K block of memory
below the 1-Mbyte boundary. Between 640K and 1-Mbyte, there is usually some
address space for which no physical memory is installed. These locations will
vary from machine to machine (Cyrix supplies a program to find these blocks
for you). Let us say that the program finds an unused 4K block at D000:0000
(0D0000h through 0D0FFFh). This number will be used in the examples later on.
The second step is to place the computer into the Virtual 8086 mode, in which
the 386 runs in protected mode and is made to think that memory is broken into
1-Mbyte bundles (which may be placed anywhere). The no-wait page at C0000000h
is then mapped to somewhere within the 1-Mbyte limit of the apparent real
mode, in this case to 0D0000h. The real-mode program can then use this block
of address space to communicate with the memory-mapped coprocessor.
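The segment arithmetic behind the D000:0000 example can be sketched as
follows. The shift-and-add rule is the standard real-mode (and Virtual 8086)
address calculation; the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode address translation: the 16-bit segment is shifted left
   four bits and added to the 16-bit offset. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Address of the byte at `offset` inside the 4K block that starts at
   window_seg:0000. */
uint32_t window_address(uint16_t window_seg, uint16_t offset)
{
    assert(offset < 0x1000u);       /* the block is only 4K long */
    return linear_address(window_seg, offset);
}
```

With a window segment of D000h, offsets 0000h through 0FFFh yield the linear
addresses 0D0000h through 0D0FFFh quoted above. Once the monitor has mapped
the no-wait page there, a real-mode access to D000:0200 reaches C0000200h.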
To put the computer into Virtual 8086 mode when using the Cyrix coprocessors,
run a program (supplied by Cyrix) called CRXV86M.EXE, the Virtual 8086
Monitor. You can also execute the program from within the AUTOEXEC.BAT file by
adding the line CRXV86M/m5. (The option /m5 places the monitor at the end of
extended memory so that you can still use the standard RAM disk driver.) The
monitor remains in memory and intercepts operating system requests from
real-mode programs. The monitor converts the requests to the equivalent
protected-mode requests. From this point on, the computer is in protected mode
even though the real-mode programmer doesn't realize it.


Generating FP Code


The process by which "387 compatible" routines are translated into
memory-mapped instructions is best explained in an example.
Consider the Fortran subroutine ABC in Example 1(a), to be called by
Ryan-McFarland F-77 compiled routines. The Fortran compiler produces a table
of addresses that the calling program uses to pass the locations of its
arguments to the subroutine. Note that this Fortran calls by reference, and
not by value, as C usually does. Also notice that the argument locations are
not pushed onto the stack. Example 1(b) shows the resulting table of
pointers. The compiled call to ABC is shown in Example 1(c).
Example 1: (a) A Fortran subroutine ABC to be called by Ryan-McFarland F-77
compiled routines; (b) the resulting table of pointers; (c) the compiled call to
ABC.

 (a) SUBROUTINE ABC(X,Y,Z)
 Z=X+Y
 RETURN
 END

 (b) ARGLOC DW OFFSET X ; argument 1
 DW SEG X
 DW OFFSET Y ; argument 2
 DW SEG Y
 DW OFFSET Z ; argument 3
 DW SEG Z

 (c) MOV AX, SEG ARGLOC
 MOV ES, AX
 MOV BX, OFFSET ARGLOC
 CALL ABC
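
For readers more at home in C, the calling convention of Example 1 can be
approximated as below. This is an analogy under stated assumptions, not the
code the Ryan-McFarland compiler emits: the array argloc stands in for the
ARGLOC table, and the names abc and call_abc are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Counterpart of SUBROUTINE ABC(X,Y,Z): it receives a table of argument
   addresses (like ARGLOC) rather than the values themselves. */
void abc(float *argloc[])
{
    assert(argloc != NULL);
    const float *x = argloc[0];     /* argument 1 */
    const float *y = argloc[1];     /* argument 2 */
    float *z = argloc[2];           /* argument 3 */
    *z = *x + *y;                   /* Z = X + Y, stored through the pointer */
}

/* Caller side: build the table, then pass its address, much as the
   compiled code in Example 1(c) loads ES:BX with the address of ARGLOC. */
float call_abc(float x, float y)
{
    float z = 0.0f;
    float *argloc[3];

    argloc[0] = &x;
    argloc[1] = &y;
    argloc[2] = &z;
    abc(argloc);
    return z;
}
```

Because only the address of the table changes hands, the subroutine can store
its result directly into the caller's Z, which is what call by reference means
in practice.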

Example 2 shows ABC after compilation into an assembly language module that
uses 387-compatible FP instructions. To convert this module to one using
memory-mapped instructions, you must run the translation program provided by
the chip manufacturer (the Cyrix version is called CX.EXE). The translator
produces an assembly language module, which can be assembled and linked in the
usual manner.
Example 2: ABC after compilation into an assembly language module that uses
387-compatible FP instructions.

 TITLE ABC

 CODE SEGMENT 'CODE'
 ; SUBROUTINE ABC(X,Y,Z)
 ABC PROC FAR
 PUBLIC ABC
 LDS SI,ES:[BX] ; NOW DS:SI POINTS TO X
 FLD DWORD PTR [SI] ; LOADS THE VALUE OF X
 LDS SI,ES:4[BX] ; NOW DS:SI POINTS TO Y
 FADD DWORD PTR [SI] ; ADDS Y TO X
 LDS SI,ES:8[BX] ; NOW DS:SI POINTS TO Z
 FSTP DWORD PTR [SI] ; STORES VALUE INTO Z
 RET
 ABC ENDP
 CODE ENDS
 END

Running the translator on ABC produces the code shown in Example 3. Notice that
the 386 register FS is used to map the absolute location C0000000h into
D000:0000 (0D0000h). The offset mflds is an equate (having, in this case, a
value of 0200h) defined in the include file FCODE.INC (supplied by Cyrix,
along with the translator). This equate, along with other entries such as
mfldd, mfstps, and mfadds, represents the offsets in the address space that
correspond to specific FP instructions.
Example 3: Assembly language module resulting from the translation into
memory-mapped instructions.

 INCLUDE FCODE.INC
 TITLE ABC
 CODE SEGMENT 'CODE'
 ; SUBROUTINE ABC(X,Y,Z)
 .386
 ABC PROC FAR
 mov eax, 0d000h ; EMC
 mov fs,eax ; EMC
 PUBLIC ABC
 LDS SI,ES:[BX] ; NOW DS:SI POINTS TO X
 ; FLD DWORD PTR [SI] ; LOADS THE VALUE OF X
 mov eax, DWORD PTR [SI] ; EMC
 mov DWORD PTR FS:mflds, eax ; EMC
 LDS SI,ES:4[BX] ; NOW DS:SI POINTS TO Y
 ; FADD DWORD PTR [SI] ; ADDS Y TO X
 mov eax, DWORD PTR [SI] ; EMC
 mov DWORD PTR FS:mfadds, eax ; EMC
 LDS SI,ES:8[BX] ; NOW DS:SI POINTS TO Z
 ; FSTP DWORD PTR [SI] ; STORES VALUE INTO Z
 mov eax, DWORD PTR FS:mfstps ; EMC
 mov DWORD PTR [SI], eax ; EMC
 RET
 ABC ENDP
 CODE ENDS
 END

The translated assembly language module can be run through any assembler (such
as Borland's Turbo Assembler) that understands the .386 pseudo-op and 80386
mnemonics. The new object module is then used in lieu of the untranslated one.
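
The effect of the translated instructions can be imitated with a toy model in
C. This is a sketch for illustration only: the real work is done by the
coprocessor hardware, the offset 0200h for mflds is the only value taken from
the text, and the offsets shown for mfadds and mfstps are hypothetical
placeholders rather than the values in FCODE.INC.

```c
#include <assert.h>
#include <stdint.h>

#define MFLDS  0x0200u  /* from the text: load a 32-bit REAL */
#define MFADDS 0x0204u  /* hypothetical offset: add a 32-bit REAL */
#define MFSTPS 0x0208u  /* hypothetical offset: store a 32-bit REAL and pop */

static float fp_stack[8];   /* toy model of the coprocessor register stack */
static int fp_top = 0;      /* number of values currently held */

/* Model of "mov DWORD PTR FS:offset, eax": the offset picks the
   operation and the value written supplies the operand. */
void window_store(uint32_t offset, float operand)
{
    assert(offset == MFLDS || offset == MFADDS);
    if (offset == MFLDS) {
        assert(fp_top < 8);
        fp_stack[fp_top++] = operand;       /* FLD: push the operand */
    } else {
        assert(fp_top > 0);
        fp_stack[fp_top - 1] += operand;    /* FADD: add to the stack top */
    }
}

/* Model of "mov eax, DWORD PTR FS:offset": reading the offset triggers
   the operation and returns its result on the data bus. */
float window_load(uint32_t offset)
{
    assert(offset == MFSTPS && fp_top > 0);
    return fp_stack[--fp_top];              /* FSTP: pop the stack top */
}

/* The translated ABC of Example 3, expressed against the model. */
void abc_mapped(const float *x, const float *y, float *z)
{
    window_store(MFLDS, *x);    /* was FLD  DWORD PTR [SI] */
    window_store(MFADDS, *y);   /* was FADD DWORD PTR [SI] */
    *z = window_load(MFSTPS);   /* was FSTP DWORD PTR [SI] */
}
```

Each FP instruction of Example 2 becomes one ordinary data transfer whose
address selects the operation, which mirrors the pairs of mov instructions
marked EMC in Example 3.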


Conclusion


The numeric processing performance described in this article is dramatically
affecting how PCs are being used. For instance, most medical
Magnetic Resonance Imager (MRI) manufacturers currently use computers that cost
hundreds of thousands of dollars. An MRI unit I've designed, however, was
built using a PC--that uses numeric coprocessors--as the only computer within
the device.
A single MRI image is an array of 256 x 256 pixels. These data are calculated
by means of 512 Fourier transforms. Image reconstruction in under two seconds
is possible! Integral transforms (both Fourier and wavelet) make it possible
to produce very inexpensive medical imaging devices in the near future.
As mentioned earlier, the speed of both integer and FP instructions has
increased dramatically over the past five years, with FP instructions all but
overtaking their integer counterparts. This will impact compiler design,
because the overhead associated with control statements (DO, for instance) is
becoming more important. In the past, index calculation time didn't constitute
a significant percentage of the execution of a program. It does now! In short,
hardware designers have done their jobs and compiler writers should be on
notice that now it's their turn.


References


Letcher, J. "The Use of Wiener Deconvolution (An Optimal Filter) in Nuclear
Magnetic Resonance Imaging." International Journal of Imaging Systems &
Technology (vol. 1, 1989).

Letcher, J. "The Use of Wiener Deconvolution to Improve the Quality of MR
Images." 74th Scientific Assembly and Annual Meeting, Radiological Society of
North America, 1988.


Birth of the IEEE 754 Floating Point Standard


In the early 1960s, computer manufacturers couldn't agree on how to represent
floating point numbers or on the definition of the language used to calculate
these numbers. Consequently, programs written for one computer couldn't be
ported intact to another. A push for standardization evolved, particularly in
the area of Fortran-66, that eventually led to improved language
specifications and to the ANSI-blessed Fortran-77.
This language standardization provided benefits to the programmer, but a
serious problem remained: A program could be ported from one machine to
another, compiled, and executed without errors--but sometimes, to the
programmer's horror, different answers were obtained. To deal with such
issues, a committee of several hundred scientists and engineers convened under
the auspices of the IEEE. They explored, studied, and proposed a standard way
of representing floating point (FP) numbers and the precise characteristics of
how the arithmetic should behave.
The result of this committee's work was the IEEE 754 Standard for FP
arithmetic that allowed the programmer to specify rounding structures and
other arithmetic properties. The stage was set for programmers to port files
from one machine to another with identical answers and a high degree of
numerical stability.
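As an illustration of what a standard representation means in practice, the
sketch below splits a 32-bit single-precision value into the three fields the
Standard defines: 1 sign bit, 8 exponent bits with a bias of 127, and 23
fraction bits. The structure and function names are illustrative, and the code
assumes the host's float type uses this single-precision format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fields of an IEEE 754 single-precision number. */
struct ieee_single {
    unsigned sign;      /* 1 bit: 0 = positive, 1 = negative */
    unsigned exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t fraction;  /* 23 bits; normalized values imply a leading 1 */
};

/* Split a float into its fields. Assumes float is the 32-bit IEEE 754
   single format on the host machine. */
struct ieee_single decode_single(float value)
{
    struct ieee_single f;
    uint32_t bits;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &value, sizeof bits);     /* copy the raw bit pattern */
    f.sign = (unsigned)(bits >> 31);
    f.exponent = (unsigned)((bits >> 23) & 0xFFu);
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

Any compliant machine stores 1.0 with a zero sign bit, a biased exponent of
127, and a zero fraction, so the same bit pattern carries the same value from
one computer to the next.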
The IEEE FP Standard is not simple. Routines written using integer
instructions to fully comply with the Standard are large, cumbersome, and
time-consuming in execution--a limitation particularly important to PCs.
Nevertheless, work had begun at Zilog, Intel, Motorola, and others to
implement the IEEE FP Standard in the form of a single integrated circuit, one
which would share (in parallel) almost all of the signal lines to the integer
CPU. The Intel 8087 and Motorola 68881 were born.
To illustrate by example, an ingenious scheme was implemented on the Intel
8086 whereby the integer CPU (the 8086) would encounter a single byte
instruction called ESC (Escape). When it did, the 8086 knew that the upcoming
instruction was destined for the companion circuit (the 8087, the IEEE
Standard compliant coprocessor) and passed the instruction and operand bytes,
as appropriate, to the coprocessor. It also helped with address calculations.
This scheme allowed the 8086 to look ahead in the instruction stream and
execute in parallel whatever integer calculations it could while the 8087 was
busy. A simple instruction, FWAIT or WAIT, caused the 8086 to freeze
until the instant that the coprocessor had finished its previously assigned
work.
This gave PCs accurate and predictable numeric processing
capabilities (by way of IEEE 754 compliance), which were undoubtedly one of
the strongest contributing factors to the explosive growth and acceptance of
PC and workstation computers by serious scientists and engineers. --J.H.L.


































MAY, 1991
PORTING UNIX TO THE 386: THE INITIAL ROOT FILESYSTEM


Completing the toolset




William Frederick Jolitz and Lynne Greer Jolitz


Bill was the principal developer of 2.8BSD and 2.9BSD and was the chief
architect of National Semiconductor's GENIX project. Lynne established
TeleMuse, a market research firm specializing in the telecommunications and
electronics industry. They can be contacted via e-mail at
lynne@berkeley.edu. Copyright (c) 1991 TeleMuse.


In previous installments of this project, we've tantalized you with the
preliminaries to our porting project. We've discussed our initial plan of the
port, a bootstrap of the system off MS-DOS, the standalone utilities that help
us test out the basic protected mode mechanisms of the 386, and cross-tools
for generating the BSD utility programs that will run off our BSD operating
systems kernel. In our analogy of climbing K2, most of our equipment has
checked out in use, and the route along the ridge to the peak looks clear with
possible good weather. In fact, at the end of this installment, we will
finally complete the preliminaries, and leave our base camp to start the major
ascent.
We now examine the initial root filesystem required for our 386BSD operating
system kernel. Earlier in this series, we discussed the cross-tools used to
create 386BSD utilities, but we did not mention how we got these utilities
onto our target machine. We could load them as files onto MS-DOS;
unfortunately, 386BSD has no ability (initially) to decipher the MS-DOS
organization of files on the disk. (Some programmers who have spent time with the FAT,
clusters, and their ilk might consider this to be more of a blessing than a
curse.) Again, keep in mind that the primary operating system focus in this
port is UNIX and not MS-DOS, and that we are working on a research project,
not a commercial release.
We now embark on making a usable filesystem in order to hold the programs and
files used by our newly ported system. The filesystem is a special data
structure, together with the functions that operate on it, that describes the
storage of files on some means of bulk storage. It is literally a subsystem for
reading, writing, creating, and destroying programs and data files on a
storage medium. Some programs and data files
will need to be used by our operating system kernel immediately when it begins
to run; the rest will be made accessible as the system is configured for use
by the configuration programs, which will be run only after the system is
completely underway. The first group of files will contain the programs that
allow us to add (or mount) new filesystems, creating hierarchical or
"tree-based" filesystems. Because trees grow from their roots, this filesystem
will be known as the "root," or bottom-most of the filesystems.
The kernel is the "heart" of UNIX, running programs inside of processes that
it creates for that purpose, and satisfying program requests (system calls) as
needed. Later, when we describe the formulation of the kernel operating
systems program (hereafter called the "kernel") and its initialization, we
will use this initial root filesystem. Thus, in starting our major ascent, we
will begin the actual job of porting the kernel program.


The Role of the Root Filesystem


The utility programs on the root establish a primary environment to craft an
arrangement of filesystems, introduce special systems functionality via the
server processes (daemons), and configure devices for operation. In addition,
the root filesystem possesses the utility tools used to fix, or reload if
necessary, other filesystems. These tools are often used to fix the root
itself if it is not badly damaged. By virtue of its small size and lack of
actively modified files, the root usually survives intact when a system crash
occurs. This is all the better for us because we need it to run the system and
summarily fix all ills. Should it get destroyed, however, we must completely
reload it by some means; that's why some systems have "back up" root
filesystems--just in case this actually happens. (With 386BSD, we eventually
allow for root filesystem recovery and installation of the first root
filesystem by means of a floppy root filesystem, which contains the tools to
load the entire system over the network, via a serial port, or from a floppy
or cartridge tape dump.)
The root filesystem is a small but essential portion of disk storage. It
provides enough functionality for the system to expand its resources to use
storage other than the root itself, and configure operations based on
arrangements mandated by current conditions. The root filesystem is also the
starting point for all filename translations and path searches. As a result, a
smaller root with fewer files to search through will generally improve file
operations performance.


A Brief Review of the Root


We will now briefly review the organization and location of various files and
their responsibilities in the UNIX tree-structured filesystem. This is in no
way intended to replace the more authoritative descriptions of the UNIX file
tree (see The Design and Implementation of the 4.3BSD UNIX Operating System,
by Leffler, et al., Addison-Wesley, 1989 for more information on this topic)
but will outline what needs to be present in the root for minimal operation
with our 386BSD system.
In Example 1, the root directory, we can see the base of the root filesystem
containing all of the top-level directories and files in our 386BSD system.
This listing, generated by the UNIX ls command (ls -l), shows file attributes,
link count, ownership, file size, modification date, and filename. Three kinds
of files are present here (as indicated by the first character of attributes):
directories (d), symbolic links (l), and regular files (-). Files in the root
serve the functions of installation, booting, system initialization, device
configuration, basic utilities, system operations, and so on.
Example 1: The root directory listing, generated by the UNIX ls command
(ls -l), shows file attributes, link count, ownership, file size,
modification date, and filename.

 drwxr-xr-x 2 root 1536 Feb 3 10:18 bin/
 -rwxr-xr-x 1 root 20480 Sep 4 21:02 boot*
 drwxr-xr-x 2 root 1024 Feb 22 13:32 dev/
 drwxr-xr-x 2 root 1536 Mar 5 18:31 etc/
 drwxr-xr-x 2 root 512 Dec 7 12:41 lib/
 drwxr-xr-x 2 root 4096 Dec 7 12:41 lost+found/
 drwxr-xr-x 2 root 512 Aug 16 1990 mnt/
 drwxr-xr-x 2 root 512 Dec 6 12:11 root/
 drwxr-xr-x 2 root 1024 Dec 8 12:45 sbin/
 drwxr-xr-x 2 root 512 Sep 19 09:18 stand/
 lrwxr-xr-x 1 root 12 Jun 4 1990 sys@ -> /usr/src/sys
 drwxrwxrwx 2 root 512 Mar 5 18:31 tmp/
 drwxr-xr-x 2 root 512 Jan 26 22:12 usr/
 drwxr-xr-x 2 root 512 Jan 27 23:12 var/
 -rwxr-xr-x 1 root 319488 Feb 22 08:57 vmunix*
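
As a small illustration of how the first attribute character encodes the file
type, the following Python sketch (Python is our choice here, not part of the
original project) uses the standard library's stat.filemode to render a
numeric mode the way ls -l does. The helper name file_kind and the KINDS table
are assumptions made for this example.

```python
import stat

# Map the leading character of an ls -l attribute string to a file type.
KINDS = {"d": "directory", "l": "symbolic link", "-": "regular file"}

def file_kind(mode):
    """Return (attribute string, file type) for a numeric st_mode value."""
    attrs = stat.filemode(mode)
    return attrs, KINDS.get(attrs[0], "other")

print(file_kind(stat.S_IFDIR | 0o755))   # a directory such as bin/
print(file_kind(stat.S_IFLNK | 0o755))   # a symbolic link such as sys@
print(file_kind(stat.S_IFREG | 0o755))   # a regular file such as vmunix*
```

Only the three types that appear in Example 1 are named; anything else is
reported as "other."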



Installation: /stand


The /stand directory in our BSD root filesystem contains standalone programs
to be loaded using the standalone /boot program and run directly on the
machine, sans the presence (and possible interference) of the operating
system. This permits us to run programs to test, format, or diagnose device
behavior. Other programs (/stand/cat, /stand/ls, /stand/icheck) allow us to
diagnose problems with the root filesystem, independent of the operating
system. In addition, standalone disk bootstrap programs (/stand/bootwd,
/stand/bootfd) reside here, to be installed by other programs
(/sbin/disklabel) onto the disk media.



Booting: /boot and /vmunix


Two other standalone programs are worthy of mention here: /boot, the universal
bootstrap used to load the system off any media, and /vmunix, the operating
system kernel proper. According to our early porting plan, we actually use
these files last, because we load our system off of MS-DOS instead of from the
BSD root filesystem. For those unfamiliar with UNIX, the only use of the
operating system's executable file after bootup is to provide information on
symbolic references inside the kernel (its symbol table), which is needed by
only a few nonessential programs. In other words, although our system would
continue to run if this file were overwritten or deleted, it probably would
not boot again.


Initialization: /sbin/init, /dev/console, and /bin/sh


When the system starts operation, it first executes the program in the file
/sbin/init, which initializes the system and prepares it for operation. The
system, as it starts, is completely mute otherwise. In the minimal case, the
system is started "single-user" -- init configures the system to
execute commands from the console device (on a PC, the keyboard and display).
This resembles the command mode that MS-DOS systems provide on a standard
boot to the command interpreter. In this case, init opens the console
(/dev/console) and executes the command interpreter or shell (/bin/sh). Thus,
the minimum files we need (in addition to the boot files mentioned earlier) are
/sbin/init, /dev/console, and /bin/sh. If any of these are lacking or damaged,
UNIX cannot run, and we will not get a prompt from the command interpreter. Of
course, in order to do something useful, we'll also need the files that
correspond to the commands we want to run from the interpreter. This is the
minimum required to get our kernel running.
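
The minimum set named above lends itself to a simple check before an image is
downloaded. The Python sketch below is illustrative only: the function name
missing_for_single_user and the sample listing are assumptions, not part of
the 386BSD tools.

```python
# The three files the text names as the minimum for single-user startup.
REQUIRED = ("/sbin/init", "/dev/console", "/bin/sh")

def missing_for_single_user(paths):
    """Return the required files absent from an iterable of path names."""
    present = set(paths)
    return [p for p in REQUIRED if p not in present]

# Hypothetical contents of a candidate root image.
root_listing = ["/boot", "/vmunix", "/sbin/init", "/bin/sh", "/bin/ls"]
print(missing_for_single_user(root_listing))   # /dev/console is absent here
```

A check of this kind only confirms that the names exist; it says nothing about
whether the files themselves are intact.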
Although many PCs frequently run UNIX with a sole user, /sbin/init can also
prepare the system for multiuser or multitasking operation. In this case, init
runs the command interpreter on a file of commands (/etc/rc) commonly referred
to as a "shell script." This in turn calls other shell scripts for network,
device, and server process invocation.
Server processes, which provide a variety of the services available with UNIX
systems, are often referred to as "daemons," as they attempt to do work
invisibly. This is a play on Maxwell's daemon, who would merrily put hot
molecules in one box and cool molecules in another box, thus seemingly
violating the Second Law of Thermodynamics. (The proof fails when the daemon acquires so
much energy in rapid collisions from highly vibrating molecules that it must
radiate the energy as heat, thus perturbing the system. See Feynman's Lectures
on Physics, Volume 1, for more information. You just can't get something for
nothing, can you?)
Among other things, the system may now perform housekeeping functions: fixing
any broken filesystems it can, erasing temporary files and other garbage,
adding filesystems (both on the computer and over the network), and connecting
the system into the world.
Traditionally, all these commands provide little output as they are launched
--which can occasionally confuse more than reassure. One popular computer
author, unfamiliar with UNIX, complained of feeling quite uncomfortable when a
UNIX workstation flashed him the message, "starting standard daemons." Perhaps
he thought he needed the help of a system exorcist!
In multiuser operation, the system depends on all the functionality that has
been configured, including indications of service availability through the
appearance of a login prompt. We find it amusing when our 386 PC laptop
prompts us for a login account name and password, as if we are competing with
hundreds of users for access to the machine! On the other hand, our little 386
laptop, running 386BSD, has about as much disk space and memory as the
PDP-11/70 that the University of California used to run 50 to 70 students at a
clip, and is three times as fast. UNIX regards little PCs and systems with
hundreds of terminals in the same way -- a login prompt per customer.


Configuration: /dev and /etc


Hardware devices on UNIX are accessed through special filenames in /dev. For
our filesystem to work correctly, we must have the appropriate device files
already made. Otherwise, the utility programs will not be able to access the
devices, even if the operating system has drivers that work with the
underlying hardware. These files are special because they are made with a
special program (/sbin/mknod) which creates an association between the file
and a software driver in the kernel. A shorthand script program
(/dev/MAKEDEV) provides a way to make these files by name. With 386BSD,
we must make the console (/dev/console) and the root filesystem's device
(/dev/wd0a) before we run; it's wise to make other special files at this time,
too.
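
The association between a special file and a kernel driver is carried by a
device number. The Python sketch below assumes the classic layout of an 8-bit
major number (selecting the driver) and an 8-bit minor number (selecting the
unit or partition). The specific numbers used are hypothetical and are not the
values 386BSD assigns.

```python
def makedev(major, minor):
    """Pack a driver index (major) and unit number (minor) into one value."""
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("major and minor must each fit in 8 bits")
    return (major << 8) | minor

def major(dev):
    """Extract the driver index from a packed device number."""
    return (dev >> 8) & 0xFF

def minor(dev):
    """Extract the unit or partition number from a packed device number."""
    return dev & 0xFF

# Hypothetical assignment: driver 3, unit 1.
dev = makedev(3, 1)
print(hex(dev), major(dev), minor(dev))
```

A special file stores only this number; the kernel uses the major part to find
the driver and hands the minor part to that driver.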
Besides configuring device filenames, we need to specify device configuration
for disk drives (/etc/fstab), terminal lines (/etc/ttys), and printers
(/etc/printcap) to describe device characteristics. One criticism of all UNIX
systems has been the need to wade through a plethora of ad hoc configuration
files for device and program use. Most of the system configuration files in
this project, however, can be found within the /etc directory.


Utilities: /bin and /sbin


The basic utilities needed for operation of UNIX are found in the two
directories: /bin and /sbin. /sbin contains supervisory commands not generally
useful to ordinary users, but important for system operation and system
management. /bin contains basic commands useful to all UNIX users -- kind of a
core group. Both of these directories are kept short and small to minimize the
size of the root and the time it takes to search for a command. All other
commands (hundreds, usually) are found in the additional filesystems that
become active when UNIX is brought up multiuser. To this end, it is important
to note that /sbin/mount and /sbin/umount are used to mount and unmount those
additional filesystems.
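
Once additional filesystems are mounted, every pathname belongs to whichever
mounted filesystem has the longest matching mount point. The following Python
sketch models that lookup. The mount table and filesystem names are
hypothetical, and a real kernel resolves this one path component at a time
rather than by string comparison.

```python
def find_filesystem(path, mounts):
    """Return (mount point, filesystem name) with the longest matching prefix.

    mounts maps a mount point to a filesystem name; "/" must be present.
    """
    best = "/"
    for point in mounts:
        if point == "/":
            continue
        # Match whole path components only, so /usrlocal is not under /usr.
        if path == point or path.startswith(point + "/"):
            if len(point) > len(best):
                best = point
    return best, mounts[best]

# Hypothetical mount table after the system comes up multiuser.
mounts = {"/": "root", "/usr": "usr-disk", "/var": "var-disk"}
print(find_filesystem("/usr/src/sys/i386/i386/locore.s", mounts))
print(find_filesystem("/bin/sh", mounts))
```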


Operation: /tmp and /var


Once in operation, the /tmp directory is used to store temporary files from
editors, formatters, compilers, and assemblers, as needed. /var is a directory
that holds various short-term data, such as usage accounting, security logs,
incoming electronic mail, crash dumps, printer spooling, and runtime program
databases. Frequently, these two directories are separately mounted
filesystems, especially on systems where these kinds of files take up much
space.


Other Directories: /lib, /mnt, /usr, /root, and /sys


Finally, we have a group of files that don't fit any of the above categories.
/lib contains object libraries and runtime start-off routines to allow C and
other languages to run on the system. /usr is an empty directory used as a
mount point to attach a much larger filesystem to -- one that contains
everything else not on the root in the way of utilities, object libraries,
include files, documentation, and system source. /mnt, also an empty
directory, is used as a mount point for optional filesystems to be attached to
when needed. /root contains the home directory for the superuser account
(root), keeping it separate from the actual root directory of the system. /sys
is our sole example here of a symbolic link -- a file type that provides a
shortcut within the filesystem to another location in the filesystem tree. In
this case, /sys hides a reference to /usr/src/sys, so when the filesystem
associated with /usr is mounted, a reference to a file like
"/sys/i386/i386/locore.s" is satisfied with a reference to the file
"/usr/src/sys/i386/i386/locore.s".

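The /sys example can be modeled as a prefix substitution. This Python sketch
is a simplification under stated assumptions: it follows a single absolute
link once, whereas a real kernel repeats the process and also handles relative
targets. The function name follow_links is ours.

```python
def follow_links(path, links):
    """Rewrite path for the first symbolic link found along it.

    links maps a link's absolute path to its target, for example
    {"/sys": "/usr/src/sys"}.
    """
    parts = path.strip("/").split("/")
    for i in range(1, len(parts) + 1):
        prefix = "/" + "/".join(parts[:i])
        if prefix in links:
            # Replace the link with its target and keep the remaining parts.
            return "/".join([links[prefix]] + parts[i:])
    return path

links = {"/sys": "/usr/src/sys"}
print(follow_links("/sys/i386/i386/locore.s", links))
```

Note that the rewritten path is only usable once the filesystem holding /usr
has been mounted.
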

Filesystem Creation


Normally, we would use our ported system to create root filesystems, but we
again run into a "chicken-and-egg" problem, because we need a finished system
to create the archetypal root filesystem from which we make all others. So, in
the typical "break the egg" and "cook the chicken" way we resolve all minor
paradoxes, we make the first filesystem on our cross-host by special means. We
either find a cross-host with identical key data structure characteristics
(byte order, structure field alignment, and structure packing) or write a
transformation program to turn our cross-host's filesystem format (via
stretching, swapping, and shrinking) into a 386BSD-compatible form. The result
is a file of bytes that contains an image of what the filesystem should
contain on the PC's disk drive.
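
To make the byte-order part of such a transformation concrete, here is a
minimal Python sketch using the standard struct module. The record layout (a
16-bit field followed by a 32-bit field, with no padding) is hypothetical and
far simpler than a real filesystem structure. The big-endian cross-host is
likewise an assumption.

```python
import struct

# Hypothetical on-disk record: a 16-bit field followed by a 32-bit field.
LITTLE = struct.Struct("<HI")   # 386 byte order, fields packed
BIG = struct.Struct(">HI")      # a big-endian cross-host, fields packed

def to_386(record):
    """Convert one big-endian record into the little-endian 386 layout."""
    return LITTLE.pack(*BIG.unpack(record))

host_record = BIG.pack(0x0102, 0x11223344)
print(host_record.hex(), "->", to_386(host_record).hex())
```

Each field is swapped individually; reversing the whole record would also
reorder the fields, which is why the transformation must know the structure
layout. Alignment and padding differences would need the same field-by-field
treatment.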
If we were starting this project now, we might consider a novel alternative
method using the BSD NFS (Network FileSystem) code. We would then run our
386BSD kernel in a "diskless" fashion, passing all file operations over the
network to be satisfied by an NFS server host. We could use any NFS server to
provide access to our initial root filesystem. Oddly enough, this would hide
not only the cross-host's filesystem format, but the cross-host's operating
system as well. Conceivably, one could even use a non-UNIX cross-host. All of
this is made possible by NFS's file abstraction mechanism, which converts
filesystem data to a common external representation via its internal XDR
(eXternal Data Representation) library.


Filesystem Downloading


With our filesystem image in a file, we can download it using Kermit, NCSA
Telnet, or some other file transfer utility that can copy a binary image
from our cross-host to the PC under MS-DOS. In the early stages, before the
kernel successfully ran processes, small filesystems of a few hundred Kbytes
(principally /dev/console, /sbin/init, /bin/sh, and /bin/ls) could be
downloaded as needed over the serial ports using Kermit. As success with the
kernel increased, so did the size of the root filesystem, because the focus of
the project moved from minimal operation to proving the kernel by means of
increasingly larger utilities. This affected us in three ways: serial link
downloading took too long; our MS-DOS partition limited the size of the
filesystem that we could copy across; and even a single-byte change in a
single file required a complete filesystem download to effect the modification.
Having downloaded the filesystem, we used the copyfs program (see "Initial
Utilities: Three PC Utilities" in DDJ, February 1991) to install it in a
partition on the hard disk, separate from MS-DOS. The BSD kernel disk driver
was also modified to relocate what it considered the beginning of the disk to
this point so we could share the disk with two systems. Copyfs would place the
image of the filesystem onto the absolute disk storage blocks without any
translation, making the image real.
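
The essential operation, an untranslated copy to an absolute block offset, can
be sketched in a few lines of Python. This is not the copyfs program itself:
the function name install_image, the 512-byte sector size, and the in-memory
"disk" are all assumptions made for illustration.

```python
import io

SECTOR = 512  # assumed sector size in bytes

def install_image(disk, image, start_sector):
    """Copy a filesystem image verbatim onto disk at a partition's start."""
    disk.seek(start_sector * SECTOR)
    disk.write(image)

# Simulated 8-sector disk: the first 4 sectors stand in for MS-DOS.
disk = io.BytesIO(bytes(8 * SECTOR))
image = b"FS" * SECTOR            # stand-in for a 2-sector filesystem image
install_image(disk, image, start_sector=4)
```

The starting sector plays the same role as the relocated "beginning of the
disk" in the modified kernel driver: both sides must agree on it.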


Filesystem Debugging


At this stage, it is considered good practice to check the filesystem on the
PC. We used the standalone utilities (/boot, /stand/cat, /stand/ls,
/stand/icheck) to verify that the filesystem was correct for use with the
kernel. Even before we have an operational system, we can validate our
filesystem with our standalone system (see "Initial Utilities: The Standalone
System," DDJ, March 1991), because it can interpret the
filesystem data structures. /boot can be used to check for the presence of
files and directories by attempting to boot from a file. For example, one can
try to boot from /stand/ls, with the proviso that "/" be a directory that has
the "stand" directory in it, and that "stand", in turn, contain "ls"--an
executable file. If the given file cannot be opened, /boot will tell us why.
ls, like its user-mode utility counterpart, lists the contents of a directory
on a disk, so we can check to see if the contents are correct. Similarly, cat
shows the contents of an ASCII file, so we can check to see that the ASCII
files present have the appropriate contents and that fence-post or data
translation problems have not corrupted the files. Finally, /stand/icheck,
the largest standalone program, can exhaustively check for filesystem
consistency to make certain that all of the filesystem's data structures are
undamaged. We can verify this by running the same icheck program on the
cross-host, ensuring that the filesystem is identically consistent on both the
cross-host and the target system.
These validation techniques test file contents independently of
filesystem data structures, or "metadata," on the off-chance that we are
somehow corrupting the contents of files when we create the filesystem. It's
important to realize that programs that check the filesystem have no way to
check contents of files. Thus, the file contents may be completely mangled in
ways that could still leave the filesystem in a correct state!
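
One way to check contents separately from structure is to compare a digest of
each file on the cross-host with a digest of the same file as read back on the
target. The Python sketch below illustrates the idea. The use of SHA-256 and
the sample file contents are our assumptions, not the method used in the
project.

```python
import hashlib

def content_digests(files):
    """Map each file name to a SHA-256 digest of its contents."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in files.items()}

def changed_files(host_files, target_files):
    """Return names whose contents differ, or that exist on one side only."""
    host = content_digests(host_files)
    target = content_digests(target_files)
    names = host.keys() | target.keys()
    return sorted(n for n in names if host.get(n) != target.get(n))

# Hypothetical contents; the target copy of /etc/rc has a translated newline.
host_files = {"/etc/rc": b"mount -a\n", "/bin/sh": b"\x01\x02\x03"}
target_files = {"/etc/rc": b"mount -a\r\n", "/bin/sh": b"\x01\x02\x03"}
print(changed_files(host_files, target_files))
```

A consistency checker would pass both copies, because the directory entries
and block pointers are valid in each; only the content comparison exposes the
difference.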



What's in a Filesystem?


As we stated earlier in this article, a filesystem is a data structure
designed to implement the abstraction of files and directories. As such, there
are dozens of types of filesystems possible. Berkeley UNIX currently offers
three flavors of filesystems: UFS, NFS, and MFS.
UFS, like many other filesystems, impresses its underlying files and
directories on bulk storage media such as moving-head magnetic disks. In
particular, UFS uses placement algorithms that account for head movement and
rotational delay to improve average filesystem performance.
NFS, the Network Filesystem originally designed by Sun Microsystems, funnels
program requests for files over a network connection, which is then satisfied
by a server machine's own filesystems. Consequently, these files can be
located quite a distance away from the actual computer whose program is
referencing a file.
MFS, a memory-based filesystem, stores temporary files in the processor's
virtual memory storage areas for rapid access to transient data. It evolved
from RAM-based disks used on many MS-DOS systems and uses virtual memory to
provide a way to keep active files present in RAM while gradually moving
inactive portions back to the disk.


Why Do We Need a Root Filesystem?


Traditionally, the UNIX filesystem is used to hold the operating system and
its bootstrap as ordinary files. This makes it convenient to create and
install new versions of the operating system with the very same tools used to
develop ordinary user programs. This arrangement also makes it possible to
choose alternative versions of the operating system, and to run newer systems
under development, or fall back to back-up versions if for some reason the
default system is damaged and unusable. This flexibility presents a
problem--how do you load the operating system which makes use of the
filesystem if it's already in the filesystem itself?
As part of the bootstrap process, the computer loads bootstrap programs with
an ever-increasing ability to manipulate the hardware and access files from
the UNIX filesystem. In 386BSD, the ROM BIOS starts the process by reading the
first block of disk storage off the disk, and then executes its contents as an
ordinary program. This tiny program has the sole responsibility of reading in
a program 15 times its size and located on the next successive blocks on the
disk drive. In turn, this larger program has the responsibility of deciphering
the UNIX filesystem located adjacent to it on this disk drive, and extracting
the next bootstrap program from the file "/boot" in the filesystem. This final
bootstrap program can be arbitrarily large (bounded by physical memory) and
can load programs from all possible devices on the computer. This bootstrap
can also determine which device to load the operating system from, the
configuration of the processor prior to boot, and power-fail or crash-recovery
steps. It can also decide whether the system should automatically reboot
itself or pause and await manual intervention to remove an obstacle inhibiting
automatic reboot; it can be interrupted by an operator if he wishes to change
his mind and insist on alternative actions. Thus, the bootstrap can be used to
load other standalone programs that might be used for disk formatting,
recovery, or installation, as well as loading the operating system itself (the
file /vmunix). In a sense, when the bootstrap is loading, you might call the
filesystem it is using the "boot filesystem"!
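
The first two stages amount to fixed-size reads from the front of the disk.
The Python sketch below works through the sizes involved, assuming a 512-byte
first block (the text does not state the block size). The function name
split_boot_area and the stand-in data are ours.

```python
BLOCK = 512  # assumed size of the first disk block, in bytes

def split_boot_area(disk_image):
    """Return (first-stage, second-stage) bootstrap bytes from a disk image."""
    first = disk_image[:BLOCK]
    # The second stage is 15 times the size of the first and follows it.
    second = disk_image[BLOCK:BLOCK + 15 * BLOCK]
    return first, second

image = bytes(range(256)) * 64   # 16 Kbytes of stand-in data
first, second = split_boot_area(image)
print(len(first), len(second))
```

Under that assumption the two stages together occupy 16 blocks, or 8 Kbytes,
ahead of the filesystem proper.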
A similar chicken-and-egg problem occurs when we decide to run the
initialization process (/sbin/init) to initialize the subsequent user program
operation of the system. Because UNIX systems only know how to execute a
program from a file in a filesystem, we need a filesystem from which to
execute files. Thus, the root filesystem is the first filesystem accessible,
via a kind of "virgin birth." All other filesystems will be explicitly
attached to it via the UNIX mount command, which grafts the base (or root) of
the filesystem to be mounted onto an existing directory in the root (the
"mount point").
Non-UNIX systems have an entirely different perspective regarding
bootstrapping. Usually, the given system is kept on a special, dedicated
location on the disk, frequently adjacent to bootstrap code. Sometimes, the
equivalent of the UNIX /sbin/init program is also found in this special
location. Therefore, these programs require special installation onto the
disk, and the system does not require the concept of a "root" filesystem,
because it does not require a filesystem to become active.
Note also that we have one file-naming convention in UNIX, so that even
devices are named just like ordinary UNIX files (/dev/wd0a or /dev/console,
for example). This is different from MS-DOS or VMS, where two namespaces are
present at any time: the device namespace (A: or DKOA:) and the file pathname
(\foo\bar\bletch or [foo.bar]bletch;2). With UNIX, the filesystem is a
central concept, along with the global way in which it is used and reused to
provide a sole namespace for file objects. Indeed, the originators of UNIX
felt this concept to be so important that in follow-on work (such as Plan 9,
see DDJ January 1991), the filesystem is even more central to the system,
becoming a way of expressing interprocessor, window system, and program
communications metaphors.


The Filesystem Metaphor and its Importance in Future Work


With all modern systems, we now use the filesystem metaphor underlying the
basic syntax and semantics of the UNIX filesystem. As a result, the same file
specification syntax known to all UNIX applications programs can be used to
transparently access files embedded in archival storage systems, remotely
manipulate files on remote systems of entirely heterogeneous design, store
files on fail-safe redundant media, or a combination of these. We could even
design a database filesystem where the filename directory path would describe
a database query, with the "leaf" files themselves being the database records.
The foresight of the originators of the early hierarchical filesystems (and
the Multics Project) is now apparent, as these ideas come to fruition in a
variety of research and commercial applications. As we continue to struggle
with the complexity of our software systems, the use of powerful metaphors
that unify many mechanisms within one becomes increasingly critical to the
design and implementation of any complex system.







































MAY, 1991
IMPLEMENTING THE GPIB


Developing polled and interrupt-driven routines




Don Morgan


Don is a consulting engineer in the area of embedded systems and automation
and can be contacted in care of Don Morgan Electronics, 2669 N. Wanda, Simi
Valley, CA 93065.


In "Understanding the GPIB" (DDJ, April 1991), I presented an overview of the
general-purpose instrumentation bus (GPIB), a short tutorial, and an example
of its use in a real application. That application involved using a host
computer to program an oscilloscope to trigger on predetermined signals. The
oscilloscope would then alert the host that the data collection was done,
whereupon the host would instruct the oscilloscope to transmit the data to the
printer/plotter, which would print the data.
In this article, I continue that example, but from the point of view of an
embedded system. I present the TMS9914A, a very popular chip for implementing
488.1 functionality of the GPIB, and show how to develop polled and
interrupt-driven routines for talking, listening, and generating an SRQ on the
bus.
The TMS9914A provides all the functionality of the IEEE 488 1975/78 standards
and the IEEE 488A 1980 supplement. Although it is not capable of meeting all
the requirements of the IEEE 488.2, it does provide the basic functionality of
the IEEE 488.1, which every device must meet to exist on the GPIB.


The TMS9914A


As far as hardware goes, the 9914A provides almost everything necessary for a
complete bus interface. The chip, two drivers, and one OR gate will do
everything--including parallel polling on command. What's more, there is
nothing complex about laying out the circuit from the 9914A to the bus,
because the pins are named, one after another, after the corresponding bus lines.
There are a few caveats, however. TI adopted an alternate bit-ordering
convention for the chip, with D0 as the Most Significant Bit and D7 the Least
Significant Bit. This can cause confusion and waste your time. So as not to
create any further confusion, I will follow the same numbering system TI uses
in their data sheets, with reminders where appropriate.
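
Because TI's numbering runs opposite to the usual convention, a small
conversion helper can prevent mistakes when building masks in software. This
Python sketch shows the mapping; the function names are ours and do not come
from TI's documentation.

```python
def ti_bit_mask(ti_bit):
    """Return the conventional byte mask for a TI-numbered bit (D0 = MSB)."""
    if not 0 <= ti_bit <= 7:
        raise ValueError("bit number must be 0..7")
    return 0x80 >> ti_bit

def ti_bit_set(value, ti_bit):
    """Report whether the TI-numbered bit is set in an 8-bit register value."""
    return bool(value & ti_bit_mask(ti_bit))

print(hex(ti_bit_mask(0)), hex(ti_bit_mask(7)))
```

In other words, TI's D0 corresponds to mask 0x80 and D7 to mask 0x01.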
The remaining logic lines are fairly routine. The register select lines, RS0
through RS2, follow the usual pattern, with RS0 as the Least Significant Bit.
The Chip Enable, Write Enable, and reset are active low, while the Read Enable
is active high.
Beyond these points, the 9914A is fully featured. It can be clocked over a
wide range, from 500 kHz to 5 MHz, without any real degradation of performance.
In addition, the duty cycle is equally forgiving, allowing the clock to be
derived from a number of possible sources, including the clock of a local
microcontroller or its ALE line.


Communications Techniques


The form of the communications driver depends on the system and
the degree of efficiency required of the instrument and the bus. The GPIB was
designed to operate at data rates up to 1 Mbyte per second, and this is
possible if all the electrical constraints described in the 488.1
specification are met; possible, but not always practical. Even clocking the
9914A at the minimum clock rate will only contribute microseconds to a
transfer. In the end, the true pacing item for a device on the bus, and
therefore for the bus, is the software implementation of the interface.
The IEEE 488 bus is a bit-parallel/byte-serial bus employing a unique
three-wire handshake that requires that the current byte be accepted by all
listeners on the bus before the next can be placed there. Each device
signals that it is ready for the next byte by releasing the NRFD\ (Not Ready
For Data) line. This, of course, has great advantages when it comes to
interfacing devices of different data rates and performance, but also means
that the bus can only operate as fast as its slowest acceptor.
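
The cost of the slowest acceptor is easy to estimate. The Python sketch below
is a rough model only: it ignores the talker's own overhead and bus settling
time, and the per-byte acceptance times are hypothetical.

```python
def transfer_time(n_bytes, accept_times):
    """Estimate seconds to send n_bytes when every listener must accept each.

    accept_times holds each listener's per-byte acceptance time in seconds.
    """
    # Each byte is held on the bus until the slowest listener has taken it.
    return n_bytes * max(accept_times)

# Hypothetical listeners: one takes 2 microseconds per byte, one takes 50.
print(transfer_time(1000, [2e-6, 50e-6]))
```

With these numbers the fast listener makes no difference: the transfer takes
as long as it would with the slow device alone.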
The TMS9914A will not complete the handshake (release the NRFD\ line) until
the most recent data byte has been removed from the Data In register, or any
special hold-off has been removed. Assuming no auxiliary hold-off, the time it
takes to handle the chip at a local level determines the speed of the
transfer. And this, almost always, is dependent on software.
All other factors aside, polling the interface for data is the simplest
approach but invariably results in a slower bus transfer. This can mean that
the faster instrumentation must wait and that longer time-outs will be
required. Still, it isn't always possible to interrupt a critical process
within a device. I was recently involved in the development of a microstepping
indexer designed to drive stepper motors for stage positioning. This indexer
had to be optimized for speed and accuracy in its moves. Even using a fast
microprocessor, there were times when allowing a bus transfer of any length
could tie up the processor and distort the move.
As mentioned earlier, the bus handshake depends upon the release of the NRFD\ line.
Normally, the 9914A releases this line after the current byte is read from the
Data In register, which means that if the current talker is attempting to send
a string to the current listener(s), the next byte will be placed on the bus
almost as soon as the last is taken out. In an interrupt-driven scheme, this
can sometimes mean that you are on your way back to the routine as soon as you
leave it.
If you opt for interrupt-driven or DMA transfers, there are some things that
can be done to help. The interrupt could be turned off in certain situations
and the interface polled only when it is safe. Alternatively, there is an
"auxiliary command" available that will allow the system to "hold off" the
NRFD\ until explicitly released by the host processor. In this case, the
interrupt can remain enabled and the microprocessor can pace the transfer of
data over the bus, based on its needs.
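A minimal sketch of this pacing scheme follows. The two auxiliary command values (hold off on all data, and release the hold-off) are written from memory of the TMS9914A command table and should be checked against the data sheet. The aux_write function is a hypothetical stand-in for an outp() to the auxiliary command register; here it only logs what was written so the logic can be exercised off target.

```c
#include <stddef.h>

/* Assumed auxiliary command codes; verify against the TMS9914A data sheet. */
#define AUX_RHDF     0x02   /* release the ready-for-data hold-off */
#define AUX_HDFA_SET 0x83   /* hold off on all data (set) */

/* Stand-in for outp(AUXCMD, cmd): records each command written. */
static unsigned char aux_log[32];
static size_t aux_count = 0;

static void aux_write(unsigned char cmd)
{
    if (aux_count < sizeof aux_log)
        aux_log[aux_count++] = cmd;
}

static int byte_pending = 0;   /* nonzero while a hold-off is outstanding */

/* Ask the chip to hold the handshake after every data byte. */
void pacing_enable(void)
{
    aux_write(AUX_HDFA_SET);
}

/* Call from the Byte In handler once the byte has been read. */
void pacing_byte_taken(void)
{
    byte_pending = 1;
}

/* Call when the local processor can afford another byte. Returns 1 if the
   hold-off was released, 0 if there was nothing to release or the caller
   is still busy. */
int pacing_release_if_safe(int busy)
{
    if (!byte_pending || busy)
        return 0;
    aux_write(AUX_RHDF);
    byte_pending = 0;
    return 1;
}
```

The point of the design is that the interrupt stays enabled while the decision about when the next byte may arrive moves to the code that knows whether a critical operation, such as a motor move, is in progress.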
The 9914A also provides for DMA. To use this feature, the My Address bit in
Interrupt Register 1 must be set. This will generate an interrupt and a Data
Accepted (DAC) hold-off whenever the 9914A is addressed to talk or listen. As
part of the service routine, the microprocessor must look in the Command
Pass-Through register to see whether it received a My Talk Address (MTA) or a
My Listen Address (MLA), initialize the system according to the need, release
the hold-off, and the transfer will occur. Using this technique, entire
strings can be read in the shortest possible time. Alone or in combination
with other approaches, this can provide for the fastest bus and fastest local
processing possible.
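The test for MTA versus MLA in the service routine might look like the sketch below. It relies only on the standard IEEE 488 encoding, in which a listen address is 0x20 plus the five-bit primary address and a talk address is 0x40 plus it. The function and enum names are hypothetical, and masking off the eighth bit of the byte read from the Command Pass-Through register is an assumption about how the target treats that line.

```c
/* IEEE 488 primary address groups. */
#define LISTEN_BASE 0x20
#define TALK_BASE   0x40

enum addressed_as { ADDR_NONE, ADDR_LISTEN, ADDR_TALK };

/* Classify a command byte read from the Command Pass-Through register.
   my_addr is the device's five-bit primary address (0 to 30). */
enum addressed_as classify_address(unsigned char cmd, unsigned char my_addr)
{
    unsigned char c = cmd & 0x7f;   /* assumed: the eighth bit is not used */

    my_addr &= 0x1f;
    if (my_addr == 0x1f)            /* 31 is reserved for unlisten/untalk */
        return ADDR_NONE;
    if (c == (LISTEN_BASE | my_addr))
        return ADDR_LISTEN;
    if (c == (TALK_BASE | my_addr))
        return ADDR_TALK;
    return ADDR_NONE;
}
```

A service routine would set up the DMA channel for a read when this returns ADDR_LISTEN, for a write when it returns ADDR_TALK, and then release the hold-off.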
One more note about the DMA lines, ACCRQ\ and ACCGR\. If DMA is not going to
be used, ACCGR\ must be held high or the data lines will pull the bus down. In
addition, ACCRQ\ can be used as a separate interrupt for BI and BO, making it
unnecessary to read Interrupt Register 0 for these bits.


A Simple Listener


To begin with, we need to know how to listen on the bus. At power up (or with
a hardware reset), the TMS9914A is placed in swrst, software reset. While it
is in swrst, it is not capable of taking part in any activity on the bus; the
device must be configured and swrst cleared first.
1. Place the address of the device into the five LSBs of the 9914A's address
register. To talk or listen, a device must be addressed to do so.
2. See that bit 1 of the address register (D0 is MSB) is cleared to allow the
listener function.
3. Clear the interrupt mask registers; these bits come up in random fashion at
power on. In a polling scheme, the mask registers are not so important because
the bits in the interrupt status registers are set and reset regardless of the
state of the mask register. It is good practice, however, to clear them.
4. Clear the swrst auxiliary command. The TMS9914A may now exist on the bus.
5. Wait for the BI (Byte In) bit in the Interrupt Register 0 to go high. This
bit indicates that the TMS9914A has been addressed and received a byte. In the
simplest listener form, a loop can be written that reads Int 0, checks for the
BI, and reads the Data In register when it is true. (Remember that reading
from Int Register 0 or from the Data In register clears the BI bit and any
other interrupt represented there; save this register if you are concerned
about any other conditions.)
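Step 5 can be sketched as a short polling loop. The BI and EOI bit values follow Listing Two; everything named fake_ is a hypothetical in-memory stand-in for the chip so that the loop can be tried without hardware. On a real target, the fake_inp calls would be inp(ISTAT0) and inp(DATIN).

```c
#include <stddef.h>

#define BI  0x20   /* Byte In, interrupt status 0 (as in Listing Two) */
#define EOI 0x08   /* byte arrived with EOI asserted (END bit) */

enum fake_reg { REG_ISTAT0, REG_DATIN };

static const unsigned char *fake_src;   /* bytes the pretend talker sends */
static size_t fake_left;                /* how many of them remain */

/* Stand-in for inp(): reports BI while bytes remain, and EOI on the last. */
static int fake_inp(enum fake_reg reg)
{
    if (fake_left == 0)
        return 0;
    if (reg == REG_ISTAT0)
        return BI | (fake_left == 1 ? EOI : 0);
    fake_left--;
    return *fake_src++;
}

/* Polled listener: stores bytes until EOI is seen, buf is full, or
   max_polls polls have found nothing. Returns the number stored. */
size_t listen_polled(unsigned char *buf, size_t cap, unsigned long max_polls)
{
    size_t n = 0;

    while (n < cap && max_polls > 0) {
        int status = fake_inp(REG_ISTAT0);   /* read once and keep the copy */
        if (!(status & BI)) {
            max_polls--;
            continue;
        }
        buf[n++] = (unsigned char)fake_inp(REG_DATIN);
        if (status & EOI)
            break;
    }
    return n;
}
```

The status register is read into a local variable exactly once per pass because, as noted in step 5, the read itself clears the bits.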
The above algorithm implements a polled listener function. To go from there to
interrupt-driven, it is only necessary that in step 3, instead of just
clearing both Interrupt mask registers, we set the BI mask bit afterwards. Now
when the TMS9914A is addressed and receives a byte, it will lower its Int\
line to the microprocessor. This means that polling the Interrupt 0 Register
is no longer required, and other processing can be done while waiting for Data
In.


A Simple Talker


This talker is not much different from the listener. At power on or a hard
reset:

1. Set up the address register with the address in the five least significant
bits.
2. This time, see that bit 2 (D0 is MSB) of the address register is low to
enable the talk function.
3. Clear the interrupt mask registers.
4. Clear swrst. As before, this is set by the hardware reset and must be
cleared before the 9914A can talk or listen on the bus.
5. Monitor the Byte Out (BO) bit in Int 0. When it goes high, the 9914A has
been addressed to talk and is waiting for a byte. In polled mode, it is a
simple matter to watch this bit and supply a byte whenever it goes high.
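The polled talker in step 5 can be sketched the same way. The BO and FEOI values are those used in Listing Two; the fake_ functions are hypothetical stand-ins for inp() and outp() that record what would have been written to the chip. Sending FEOI before the last byte is a choice made for this example, to mark the end of the string.

```c
#include <stddef.h>

#define BO   0x10   /* Byte Out, interrupt status 0 */
#define FEOI 0x88   /* auxiliary command value used in Listing Two */

enum fake_port { PORT_ISTAT0, PORT_AUXCMD, PORT_DATOUT };

static unsigned char sent[32];          /* bytes written to Data Out */
static size_t sent_count = 0;
static size_t eoi_index = (size_t)-1;   /* index of the byte sent with EOI */

/* Stand-in for inp(): this pretend chip is always ready for a byte. */
static int fake_inp(enum fake_port p)
{
    (void)p;
    return BO;
}

/* Stand-in for outp(): logs data bytes and notes where FEOI was issued. */
static void fake_outp(enum fake_port p, int value)
{
    if (p == PORT_AUXCMD && value == FEOI)
        eoi_index = sent_count;
    else if (p == PORT_DATOUT && sent_count < sizeof sent)
        sent[sent_count++] = (unsigned char)value;
}

/* Polled talker: supplies one byte each time BO is seen. Real code would
   also need a time-out in case the device is never addressed to talk. */
size_t talk_polled(const unsigned char *msg, size_t len)
{
    size_t i = 0;

    while (i < len) {
        if (!(fake_inp(PORT_ISTAT0) & BO))
            continue;                       /* not ready for a byte yet */
        if (i + 1 == len)
            fake_outp(PORT_AUXCMD, FEOI);   /* flag the final byte */
        fake_outp(PORT_DATOUT, msg[i++]);
    }
    return i;
}
```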
As with the listener mode, it is a short move from polled to interrupt-driven.
After the interrupt mask registers have been cleared in step 3, set the BO
bit. Now, Int\ on the TMS9914A will go low whenever the device has been
addressed to talk and is ready for a byte. Being able to talk and listen on
the bus are the two fundamental requisites of any communications driver. For
real efficiency, though, the instrument must have a way to interrupt the
controller when it needs servicing.


Serial and Parallel Polling


The 9914A allows for both parallel and serial polling and both methods involve
the use of the SRQ command line. In either case, it is set low to indicate to
the controller that the device requires service.
The original way to set the SRQ line low was to set bit 1 (remember that D0 is
the MSB) of the Serial Poll Register, then, after the device had been
serviced, reset bit 1. The problem with this method is that a device can
appear to request service a second time if another device requests service
before bit 1 of the first device's serial poll register is cleared. The
recommended way to set SRQ low is to use the auxiliary
command, "Request Service Bit 2." Here, the bit is cleared automatically as
soon as the controller accepts the serial poll status byte.
Each device is to have a serial poll status byte. It is here that current data
about the state of the instrument is kept. Originally, only one bit in the
serial poll status byte, the RQS, was defined, the idea being that it would be
set to indicate that the device needed service. The rest of the serial poll
status bits, according to 488.1, were to be defined by the manufacturer of the
instrument carrying the interface. Often they are pertinent only to the
device, but just as often they carry meaning common to many devices on the
bus, such as message available, overrun, process error, operation complete.
The reason, in fact, for so many of the 488.2 extensions is that there were
common, or "standard," events that might just as well be placed in a byte by
themselves with any one of them being reason enough for setting a bit in the
serial poll status byte. If you are interested in these and the many other
additions in 488.2, please see the IEEE 488.2 document listed in the
bibliography.
Typically, the instrument designer will maintain a mask for the serial status
byte. In it, he can set or reset a mask bit for each of the status bits. When
that bit is unmasked and true, he will request service, but this is not at all
automatic. IEEE 488.1 only allows for the status byte and the RQS; it is up to the
equipment manufacturer to do the rest.
During a serial poll, the controller must poll each device individually to
ascertain which one(s) need servicing. During the poll, the controller reads
the serial poll status byte of each device. This byte should be kept current
by the system software.
To perform a serial poll, the controller need only place a Serial Poll Enable
(SPE) on the bus, make each device a talker, read the serial poll status byte
from each device, issue Serial Poll Disable (SPD), and handle the request(s).
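On the controller side, the command bytes for polling one device can be sketched as follows. The codes for SPE, SPD, unlisten, untalk, and the talk address group are the standard IEEE 488 values; the function names and the choice to send unlisten first and untalk last are assumptions made for this example. The bytes would be sent with ATN asserted, and the status byte read between the two sequences.

```c
#include <stddef.h>

#define CMD_UNL   0x3F   /* unlisten */
#define CMD_SPE   0x18   /* serial poll enable */
#define CMD_SPD   0x19   /* serial poll disable */
#define CMD_UNT   0x5F   /* untalk */
#define TALK_BASE 0x40   /* talk address group */
#define RQS_BIT   0x40   /* RQS in the serial poll status byte */

/* Commands that precede reading the status byte of the device at addr.
   Returns the number of bytes written, or 0 on a bad address or buffer. */
size_t serial_poll_open(unsigned char addr, unsigned char *out, size_t cap)
{
    if (cap < 3 || addr > 30)
        return 0;
    out[0] = CMD_UNL;
    out[1] = CMD_SPE;
    out[2] = (unsigned char)(TALK_BASE | addr);
    return 3;
}

/* Commands that end the poll once all status bytes have been read. */
size_t serial_poll_close(unsigned char *out, size_t cap)
{
    if (cap < 2)
        return 0;
    out[0] = CMD_SPD;
    out[1] = CMD_UNT;
    return 2;
}

/* True if the status byte shows that this device asserted SRQ. */
int requested_service(unsigned char status_byte)
{
    return (status_byte & RQS_BIT) != 0;
}
```

To poll several devices, the controller would repeat only the talk-address byte and the status read for each one before sending the closing sequence.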
The parallel poll can be somewhat more complex to implement than the serial
poll but can shorten the controller's response time many times over compared
with the serial poll. With the parallel poll, the controller can ask up to
eight devices at a time to set a bit on the IEEE 488 bus. The controller can
then read this byte and determine which device(s) needs attention.
Using the parallel poll requires that the local processor be told over the
interface, or read from memory or switches, which bit to set or reset, in the
parallel poll register. This done, the parallel poll may be issued, the bus
read by the controller, and the appropriate devices serviced.
The advantage here is speed with an increase in the complexity of both the
controller and the instrument. It has the drawback of limited information: A
bit can only be made to set or reset upon condition. It is that bit that is
placed on the bus during the parallel poll.
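Decoding the parallel poll response on the controller side is a matter of mapping bits back to devices. In the sketch below, the line_owner table, the function name, and the convention that a 1 bit means "needs service" are all assumptions; the actual sense of each bit is part of how the devices were configured.

```c
#include <stddef.h>

/* line_owner[i] is the bus address of the device assigned to data line
   i (bit i of the response), or -1 if no device uses that line. Writes the
   addresses of the devices whose bit is set to out[] (room for 8 entries)
   and returns how many there were. */
size_t parallel_poll_decode(unsigned char response,
                            const int line_owner[8], int *out)
{
    size_t n = 0;
    int i;

    for (i = 0; i < 8; i++) {
        if ((response & (1u << i)) && line_owner[i] >= 0)
            out[n++] = line_owner[i];
    }
    return n;
}
```

This also shows the limitation described above: the controller learns which devices want attention, but not why. A serial poll of just those devices would supply the status bytes.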


A Communications Handler


Listings One and Two (page 96) contain three functions that implement the
foregoing discussion. Although the original for these drivers was written in
80C196 assembly language, I have rewritten it in C for portability. Without a
specific target, an application such as this can, at best, be somewhat sketchy
because the hardware and architecture are not known. The code was written to
provide a more concrete example of the contents of this article and with an
eye to fitting it into the user's system.
The GPIB.H header file in Listing Two provides the needed defines with
indications as to where the particular address of the target system should go.
Here, too, are the global variables serial_poll and serial_poll_mask that are
used to record the current state of the instrument and set the conditions that
will cause an SRQ.
Listing One gives an example of the initialization code for the TMS9914A, an
integrated interrupt handler, and a routine for checking serial_poll and
generating an SRQ when appropriate. What is missing is a function to set bits
in serial_poll_mask. All that is needed is the creation of a command to set
and reset individual bits in the mask. I felt that this was such an intimate
part of the target system, involving the commands used and the manner in which
they are parsed, that it would best be left up to the author of that software.
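For readers who want a starting point, the bit manipulation itself might look like the sketch below. The function names are hypothetical, the serial_poll_mask variable here stands in for the global declared in Listing Two, and refusing to put RQS in the mask is a design choice made for this example, since that bit is produced by the request itself rather than by a device condition. How the bits arrive (command syntax and parsing) is still left to the target system.

```c
#define MAV 0x20   /* message available (Listing Two) */
#define TRG 0x01   /* trigger (Listing Two) */
#define RQS 0x40   /* request service bit of the status byte */

static int serial_poll_mask = 0;   /* stand-in for the Listing Two global */

/* Enable SRQ generation for the given status bits. */
void Set_Mask_Bits(int bits)
{
    serial_poll_mask |= (bits & 0xff & ~RQS);
}

/* Disable SRQ generation for the given status bits. */
void Clear_Mask_Bits(int bits)
{
    serial_poll_mask &= ~(bits & 0xff);
}
```

A command handler could then call Set_Mask_Bits(MAV) when the user asks to be interrupted on "message available," and Check_Mask in Listing One would assert SRQ the next time that condition occurs.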


Bibliography


HP 54502A 400MHz Digitizing Oscilloscope Programming Reference. First Edition.
Hewlett-Packard, June 1989.
IEEE Standard Digital Interface for Programmable Instrumentation. ANSI/IEEE
std 488.1-1987. New York, N.Y.: The Institute of Electrical and Electronics
Engineers, 1987.
IEEE Standard Codes, Formats, Protocols, and Common Commands For Use with
ANSI/IEEE Std 488.1-1987 IEEE Standard Digital Interface for Programmable
Instrumentation. ANSI/IEEE std 488.2-1987. New York, N.Y.: The Institute of
Electrical and Electronics Engineers, 1987.
TMS9914A General Purpose Interface Bus (GPIB) Controller. Texas Instruments,
1982.
TMS9914A GPIB Controller User's Guide, System Interface Controllers. Texas
Instruments, 1985.

_IMPLEMENTING THE GPIB_
by Don Morgan


[LISTING ONE]

#include <conio.h>
#include "gpib.h" /*local header file containing declares and defines */

/* Reinitialize local interface, whether from power up or device or interface
clear. It will do nothing, however, to initialize system interrupts to take
advantage of TMS9914A interrupts; you do that. */
void Gpib_Init()
{
    int address;

    outp(AUXCMD, SWRST);
    /* software reset */
    address = (inp(DIPSW) & 0x1f);
    /* Read dipswitch for address of device. Lower 5 bits carry legal
       address of device; clearing the upper bits enables the talker and
       listener functions and disables dual primary addressing mode. */
    outp(ADDRES, address);
    outp(IMSK0, BI | BO | SPAS | EOI);
    /* set the interrupts BI, BO, SPAS, and END (END is the bit that
       gpib.h names EOI) */
    outp(IMSK1, DCAS | IFC);
    /* and DCAS and IFC */
    outp(SERPOL, 0x0);
    serial_poll = serial_poll_mask = 0x0;
    /* clear the serial poll byte and mask */
    outp(AUXCMD, SWRSTC);
    /* clear the software reset */
}
/* Each time a condition arises in the device that sets a bit in the serial
poll status byte, this routine is called. It is passed the mask (MAV or TRG,
at the moment) for the bit to be set and does three things: 1) sets the
corresponding bit in serial_poll, the global variable we created. This is for
reference, since the serial poll status byte is write only. 2) writes the
updated byte to the serial poll register. 3) checks to see whether this
condition is one the user has determined should cause an SRQ. */
void Check_Mask(int mask)
{
    serial_poll |= mask;    /* OR the bit in so earlier status bits survive */
    outp(SERPOL, serial_poll);
    if (serial_poll_mask & mask)
        outp(AUXCMD, RSV2);
}
/* Vectored to by system interrupts. */
void Gpib_Int_Han()
{
    int status_byte; /* This is where we will save interrupt status
                        register 0 until we are through with it */
    int status1;     /* saved copy of interrupt status register 1 */
    int chr;         /* character holder */
    int flag;        /* TRUE when EOI accompanied the byte just received */

    status_byte = inp(ISTAT0);
    /* get what is in interrupt status register 0 */
    if (status_byte & SPAS) {
        /* place routine here to perform any maintenance necessary
           after controller reads the serial poll byte. If you are
           using rsv1, you would reset bit 1 (D0 is MSB), otherwise it
           may be a reset to any bits gone active since the last SRQ. */
    }
    if (status_byte & INT0) {
        /* this bit will be set if there is at least one active,
           unhandled interrupt in int status 0. The interrupts that we
           will be concerned about here are: BI, BO, SPAS, and END */
        if (status_byte & BI) {
            /* checking for a byte received from the bus */
            chr = inp(DATIN); /* get the character */
            if (status_byte & EOI)
                flag = TRUE;
            else
                flag = FALSE;
            receive(chr, flag);
            /* receive is your function that accepts the character and
               places it in whatever buffer scheme is your favorite. flag
               is set if an EOI is detected, signalling the end of a
               string. */
        }
        else if (status_byte & BO) {
            chr = get_byte();
            /* get_byte is your routine for retrieving the next byte from
               the transmit buffer. If you are using the EOI, include some
               way to indicate that this is the last character of the
               string, as FEOI must be placed in the auxiliary command
               register before the last character is sent. In this case, I
               assumed the upper byte of chr was used to carry a flag
               explaining the status of that character. */
            if (((chr >> 8) & 0xff) == TRUE) {
                outp(AUXCMD, FEOI);
                serial_poll &= ~MAV;
                outp(SERPOL, serial_poll);
                /* since this is the last character of the string, we take
                   this time to reset the 'message available' bit in the
                   serial_poll byte */
            }
            outp(DATOUT, chr & 0xff);
            /* the character itself goes to the Data Out register */
        }
    }
    if (status_byte & INT1) {
        status1 = inp(ISTAT1);
        /* Get contents of interrupt status register 1. Here we are
           interested in DCAS (device clear) and IFC (interface clear). */
        if ((status1 & DCAS) || (status1 & IFC))
            Gpib_Init();
        /* whether we receive a device clear or an interface clear, we
           will do the same thing: reinitialize the local interface */
    }
}





[LISTING TWO]

/*GPIB HEADER FILE*/

/* The following are defines for address mapping, they are written assuming
a linear addressing scheme and an 8 bit bus. Depending on the width and
decoding scheme you use, these defines might be different.*/

#define DIPSW ?? /*place address of the dipswitch that sets the device address here*/
#define IEEE ?? /*place base address of TMS9914A here*/
#define IMSK0 (IEEE+0) /*interrupt mask register 0*/
#define ISTAT0 (IEEE+0) /*interrupt status register 0*/
#define IMSK1 (IEEE+1) /*interrupt mask register 1*/
#define ISTAT1 (IEEE+1) /*interrupt status register 1*/
#define ADSTAT (IEEE+2) /*address status register*/
#define BUSTAT (IEEE+3) /*bus status register*/
#define AUXCMD (IEEE+3) /*auxiliary command register*/
#define ADRSWI (IEEE+4) /*address switch register*/
#define ADDRES (IEEE+4) /*address register*/
#define SERPOL (IEEE+5) /*serial poll register*/
#define CMDPAS (IEEE+6) /*command pass through register*/
#define PARPOL (IEEE+6) /*parallel poll register*/
#define DATIN (IEEE+7) /*data from bus register*/
#define DATOUT (IEEE+7) /*data to bus register*/

/*next, define some bit masks to be used in the interrupt routines. These will
be AND'd with interrupt status registers to determine proper action to take*/
#define INT0 0x80 /*interrupt register 0 has an interrupt, int status 0*/
#define INT1 0x40 /*interrupt register 1 has an interrupt, int status 0*/

#define BI 0x20 /*Byte In, int status 0*/
#define BO 0x10 /*Byte Out, int status 0*/
#define DCAS 0x8 /*device clear, int status 1*/
#define EOI 0x8 /*End or Identify (the END bit), int status 0*/
#define SPAS 0x4 /*serial poll active state, int status 0*/
#define IFC 0x1 /*interface clear, int status 1*/

/*now, defines for some of the auxiliary commands used*/
#define SWRST 0x80 /*sets the software reset*/
#define SWRSTC 0x0 /*clears the software reset*/
#define FEOI 0x88 /*sent prior to last byte in string to indicate end*/
#define RSV2 0x98 /*alternate and preferred method of asserting SRQ*/

/*these bits are used as definitions of serial poll status byte and mask*/
#define MAV 0x20 /*message available*/
#define TRG 0x1 /*trigger*/

/*other stuff*/
#define TRUE 0xff
#define FALSE 0x0

/*declarations*/
int serial_poll; /*global variable containing a record of the mask
 written to the serial poll register in the TMS9914A.*/
int serial_poll_mask;
/*for our purposes, the serial poll status byte and serial poll mask byte is
to be defined as follows, remember D0 is MSB:
D0 D1 D2 D3 D4 D5 D6 D7
X RQS MAV X X X X TRG
RQS is the request for service and is read as a 1 by controller during a
serial poll when this device has asserted an SRQ. MAV is 'message available'
and is set when there is something in the transmit buffer, reset otherwise.
TRG is the trigger bit and is set when the oscilloscope has been triggered.*/

/*function prototypes, these are example only, yours may well be different*/
extern int get_byte(void);
extern void receive(int,int);
extern void Gpib_Int_Han(void);
extern void Gpib_Init(void);
extern void Check_Mask(int);






















MAY, 1991
MAKING SMALLTALK WITH WIDGETS


An extensive class library and a sophisticated interface editor highlight
Widgets/V 286




Kenneth E. Ayers


Ken is a software engineer at Eaton/IDT in Westerville, Ohio. He is involved
in the design of real-time software for industrial graphic workstations. He
also works part time as a consultant, specializing in prototyping
custom-software systems and applications. Ken can be contacted at 7825
Larchwood St., Dublin, OH 43017.


Widgets were once thought to be the ultimate generic product, manufactured by
--who else?--the Acme Widget Company. No one seemed to know exactly what a
widget was or what it did, but according to those sample financial statements
found in business text books, a lot of widgets were being produced--and
consumed! But with the introduction of its Widgets/V 286 package, Acumen
Software has elevated the widget to the status of a real object. Well, sort
of.
Widgets/V 286 is a software package that extends the user interface
capabilities of the Smalltalk/V 286 environment. Widgets provides a set of one
hundred (or so) classes that offer contemporary alternatives to the standard
user interface components supplied as part of the basic Smalltalk image. The
enhanced objects range in complexity from simple buttons to drop-down menus to
multipart dialog windows. Each object class conforms to a standard creation
protocol and each provides a rich set of methods for accessing its underlying
behaviors--all of which makes Widgets extremely easy to use.
Complementing the extensive library of classes is a sophisticated interface
editor that allows you to assemble an application window interactively. With
the editor's easy-to-use toolkit approach, you simply drop widgets onto a
blank window, drag them around, resize them, and so on. When you're done, you
just save your work and the editor does the rest by generating the Smalltalk
code necessary to create your window.
This article presents an application, developed using the Widgets tools, that
illustrates how the component parts of the Widgets/V 286 package can be
assembled to create virtually any kind of window you might need. But first, a
quick tour of the package is in order.


New and Improved Windows


The Smalltalk/V 286 environment does offer components that can be used to
create window-based applications. However, the process of doing so can be
tedious. This, in turn, can be attributed to the fact that Smalltalk's default
user interface classes are primitive by today's standards.
In the first place, the programmer must tend to a considerable amount of
detail just to get the parts of an application window to line up properly.
Using Smalltalk's built-in facilities, you must find that magic combination of
framingRatios and/or framingBlocks (don't forget to account for border
widths!) that put all of the panes in their proper place.
By contrast, the Widgets/V 286 interface editor allows you to line everything
up in the most natural way possible--visually. And, to guarantee that things
stay lined up when a window's size changes, Widgets provides a concise set of
framing parameters that allow you to keep a widget centered in a window or
keep the origin of a widget anchored relative to any corner of a window. Here,
too, the results are presented visually by the interface editor.
Secondly, many critical components such as buttons and menu bars are missing
altogether from the standard Smalltalk environment. Those that are provided,
such as pop-up menus and prompters, are relatively crude when compared to
modern windowing systems such as Microsoft Windows or the X-based OpenLook
toolkit.
To address this deficiency, Widgets/V 286 provides an extensive set of
component parts with which to create sophisticated user interfaces. Table 1
outlines the categories of widget classes supplied with the package.
Table 1: User interface components offered by Widgets/V 286

 Widget Type            Description
 -------------------------------------------------------------------------
 Drawing Widgets        Provide functional and decorative "embellishments"
                        such as vertical and horizontal lines and
                        rectangles for separating regions within a window;
                        a static text widget for labelling fields; and a
                        general-purpose drawing widget onto which
                        arbitrary bit-mapped forms may be superimposed.

 Button Widgets         Offer a number of button-like widgets which invoke
                        an action when "pushed." Among the varieties of
                        buttons are those with traditional text labels,
                        one whose label is any bit-mapped image, a
                        "transparent" button that transforms any screen
                        region into an action area, check-box buttons,
                        radio buttons, and an on-off button.

 Choice Widgets         Allow users to choose from several alternatives.
                        They include: radio button groups; drop-down menu
                        button; standard list selector box; multiple list
                        selector box; editable list selector box; iconic
                        menu; and iconic drop-down menu.

 Valuator Widgets       Allow users to modify some numeric quantity. The
                        group includes sliders, bar gauges, a numeric edit
                        box, and vertical and horizontal scroll bars.

 Editable Text Widgets  Provide users with the ability to enter and
                        modify text. Include a simple one-line editor
                        with an optional label (useful for data entry
                        fields) and a multiline text editor box.

 Pane Widgets           Provide high-level grouping capabilities and a
                        plain border, a titled border, and a border with
                        scroll bars.

 Miscellaneous Widgets  Includes editors for times (both standard and
                        24-hour format), dates, and points (X-Y
                        coordinates).


In the "stock" Smalltalk environment, the channels of interaction between a
model and its window(s) are rather poorly defined and are not consistently
implemented across the Pane/Dispatcher class hierarchies.
Widgets/V 286 helps on this front, too. It provides a mechanism by which an
application may request notification of certain standard events such as window
activation/deactivation, mouse button activity, mouse movement, and key
presses. This event registration mechanism uses the on: anAction send: aMessage
message, which registers the symbol anAction as being an event that the widget
is interested in knowing about. Later, when that event occurs, the system will
respond by sending aMessage. The interface editor identifies the events that
are supported by each standard widget type and allows you to define the
message to be sent. Registration is generated automatically by the editor.
Along these same lines, the Widgets package provides an easy-to-use mechanism
for assigning functions to specific keys. Basically, any widget that accepts
input will have an instance variable known as a KeyBindingsDictionary. This
dictionary associates keys (using sensible names such as "up arrow" and "esc,"
or compounds like "ctrl-alt-left arrow") with names of methods that implement
the actual functions for those keys.
It stands to reason that, in a graphical user interface, graphic drawing
operations are common. Yet, while Smalltalk supports a more-than-adequate set
of graphics operations, it is often difficult to determine exactly how to draw
graphic shapes on the screen (should I use a BitBlt or a Pen?). Furthermore,
those drawing operations are generally defined only in terms of absolute
screen coordinates, placing responsibility for relocation to the frame of a
specific window squarely on the application's shoulders.
Widgets/V 286 alleviates both of these problems. First, all of the basic
drawing operations are reimplemented in the base class, Widget. Thus, for
instance, every widget knows how to draw an ellipse. Then, all drawing (and
clipping) operations are performed relative to the coordinate frame of the
enclosing widget, even if it is nested within other widgets.
Finally, Smalltalk provides no direct mechanism for opening windows. Yes, of
course, windows do get opened; but it's the result of various combinations of
magic and outright trickery! Widgets/V 286 simplifies this process by
providing simple messages that can be used to simultaneously create a new
instance of a window and display that window on the screen.
The open message, for instance, provides a no-frills, "open this window and
get on with it" approach. This message is sent directly to the class object,
bypassing the need to use the <class> new open combination. Actually, the
simple open message is just a convenient disguise for a much more versatile
method of opening and initializing windows. Consider the default case of the
open method, shown in Example 1. Any application-specific initialization can
be accomplished simply by providing an instance method named initialize.
However, if your application requires something special, such as an openOn:
method, it might look like Example 2. In this case, you provide an instance
method, initializeMe:, that handles initializing the window.
Example 1: The default open method

 open
 ^super new
 openWithInitializeMethod: #initialize
 arguments: #().

Example 2: initializeMe: initializes the window for a given object, which is
passed as an argument to the openOn: message

 openOn: someObject
 ^super new
 openWithInitializeMethod: #initializeMe:
 arguments: (Array with: someObject).

One other default opening mechanism is provided for dialog-type application
windows. This is the prompt: aString message. Its internal representation is
shown in Example 3. If this method of opening is appropriate for your
particular application, all you do is reimplement the instance method
promptString:.
Example 3: Internals of the prompt message

 prompt: aString
 ^super new
 openWithInitializeMethod: #promptString:
 arguments: (Array with: aString).



Constructing an Application


The ultimate question that any toolkit must answer is whether these tools can
be used to create real applications. In order to put Acumen's claims to the
test, I set out to construct a real application --an appointment management
utility. The initial specifications were to be able to browse a list of
existing appointments. In addition, I needed to add new appointments to the
list and change or remove existing entries. An appointment should specify a
date, the time at which the appointment starts, the time at which it ends, and
a time at which I am to be notified. Furthermore, there must be a way to enter
some text that describes the nature of the appointment. Finally, when it's
time to notify me about an appointment, the system should beep the terminal
and pop up a window with the appropriate information. Dismissing the window
should remove the appointment from the list.
From these specifications, I determined that the project would need three
classes of objects. The Appointment class (see Listing One, page 98) holds
information about a specific appointment, including the name of the person the
appointment is with and the starting and ending times of the appointment.
AppointmentBrowser (see Listing Two, page 98) allows a user to browse his or
her list of appointments, add new appointments, and/or change or remove
existing appointments. AppointmentNotifier (see Listing Three, page 101) pops
up a window when it's time to notify a user about an appointment. Two of the
three classes, AppointmentBrowser and AppointmentNotifier, have user
interfaces created with the Widgets interface editor.
In addition to the object classes, one global variable is required. It is
created by evaluating the following Smalltalk expression: Appointments :=
Dictionary new. This public dictionary holds (sorted) lists of appointments
accessed by a user's name. Consequently, the system is capable of managing
appointments for more than one user. Also, before appointment notification can
occur, the timer interrupt handler (the timerinterrupt class method in class
Process) must be modified as shown in Listing Four (page 102).
Finally, an appointment browser window can be opened by either the
AppointmentBrowser openOn: aUserName or AppointmentBrowser open message. (The
latter message will prompt for the name of a user.)


Building the User Interface


Once the requirements for the appointment manager application are determined,
the process of creating the program begins with the interface editor. Choosing
the New Interface selection from the (new) system menu opens an interface
editor window containing an empty application window. In this case, the empty
application window is transformed into an appointment browser window by
selecting various widgets from the editor's iconic menus and "dropping" them
onto the empty application window. Figure 1 shows the interface editor on
which the completed appointment browser has been constructed. It consists of a
ListBox, a DateWidget, three TimeWidgets, a TextEditWidget (laid on top of a
TitledPane), and three TitledButtons.
After the widgets are placed on the application window, specific details about
each must be filled in. This is done by pointing at the widget and double
clicking the left mouse button. In response, the widget pops up its own
special editor. While many of the widget editors are unique, all follow the
same general format.

In the case of the ListBox widget, its title has been set to "Appointments."
I've also filled in the Instance Variable Name field with the name
appointmentList. When the editor generates what source code it can, the class
definition will include a declaration for an appointmentList instance
variable.
Also, notice that the On: (select) and Send: (selectAppointment:) fields have
been filled in. This instructs the interface editor to establish an event
handler method for selections made from this list widget.
Each widget additionally has an editor for its framing parameters. This is a
fairly complex editor (covered well in the Widgets/V 286 manual) that lets you
specify how the size and position of a particular widget will be affected by
resizing the parent window.
When the interface editor is instructed to save the newly created interface,
it generates three things: a class header containing named instance variables
taken from those widgets for which an instance variable name has been
specified; a complete createWindow method that will be called when the window
is opened to construct the actual window object; and empty ("stub") methods to
handle events for widgets in which the event response (On: Send:) fields have
been filled in. (These methods must be completed later--by you--so that they
implement the required functionality.) It's important to note that saving a
new version of an existing interface will not overwrite code that you've added
(except for the createWindow method).
In addition to the user interface for the browser, a simple interface must be
constructed for the AppointmentNotifier window. In operation, the notifier
window is created by evaluating the expression, AppointmentNotifier open. Once
opened, this window remains on the screen.
During its initialization, the notifier window forks a process that patiently
waits for clock ticks. At each clock tick, the notifier object checks to see
if the current minute has changed. If it has, the appointment lists
(SortedCollections of Appointments) for all users are polled, searching for an
appointment whose notification time has arrived.
When an appointment's time arrives, the notifier fills in its text widget with
the appointment information, beeps the terminal, and forces its window to
become the active window. At that time, the user may press the OK button to
clear the window and remove the appointment from the list.


Browsing the Package


Other than a couple of minor gripes (mentioned later), I found the Widgets
package delightful to use. Its editor has a clean, consistent, and intuitive
user interface. The set of widgets is extensive (bordering on overwhelming)
and in the spirit of Smalltalk's open environment, there is source code for
everything!
I should also mention that if your hardware supports color (EGA or VGA),
you're in for a real treat. The package sports a really nice three-dimensional
look for all of its widgets: beveled edges, shaded buttons, the works! (See
Figure 2.) To enhance the portability of an application, Widgets automatically
detects a monochrome video mode (or you can force it, manually) and adjusts
its visual style accordingly.
Even though I spent several hours trying to figure out how to handle timer
interrupts (a problem with Smalltalk's documentation, not the Widgets
package), I feel comfortable saying that the appointment manager application
could have been completed in four hours tops. In contrast, I estimate that an
equivalent application, using only the built-in Smalltalk windowing components
(crude, but it could be done), would require a great deal more time depending
upon how closely you tried to emulate the capabilities offered by the Widgets
package.
I did, however, find a couple of minor bugs. One is in the class TimeWidget
(lets you edit time values) and affects the way the meridian (A.M./P.M.)
indicator rolls over at noon and midnight.
On a related issue, true military time spans the hours 0100 to 2459 with 2400
hours being midnight. It appears that Acumen's MilitaryTimeWidget uses what is
commonly referred to as a 24-hour time format, which spans the hours 0000 to
2359, with 0000 hours being midnight. (Okay, I know this is probably
nit-picking! But, as an ex-GI, I felt a peculiar urge to set the record
straight.) However, this widget, too, has a bug that causes the time to roll
over from 2359 hours to 0100 hours. Now, that's not right, in any format!


Products Mentioned


Widgets/V 286
Acumen Software
2140 Shattuck Ave, Suite 1008
Berkeley, CA 94704
415-649-0601, FAX: 415-649-0514
$149.95
Requirements: Smalltalk/V 286
I called Acumen to report the bugs with an ulterior motive of checking out the
company's attitude toward technical support. No surprises here! The gentleman
I talked to was extremely courteous. He was somewhat surprised that no one
else had reported problems with the time editors, but assured me that he would
look into the matter. I was also informed that Acumen will soon release a list
of bug fixes--the fix to the time editors will be included.
Finally, since programming in Smalltalk is so intimately tied to the
environment itself, I was somewhat disappointed that there was virtually no
discussion of how the environment had been modified by the Widgets package.
Again, this kind of information, which is available if you're willing to roll
up your sleeves and dig into the source code, could have been put in another
"oh, by the way" appendix.


Conclusions


My first exposure to the Widgets/V 286 package was through a slick brochure
that was flashy and full of what I would normally consider to be marketing
hype. Still, Widgets seemed to be just what I was looking for--something that
would put a little excitement back into creating Smalltalk applications. I
took a chance and purchased the package, sincerely hoping I wouldn't have to
take advantage of Acumen's 30-day money-back guarantee.
Let me just say that it wasn't hype! Widgets is easy to use, it's fun, and in
just a couple of hours you can be cranking out applications that, until now,
you've only dreamed of. If you're interested in developing Smalltalk
applications, and you hope to achieve the levels of efficiency and
productivity that you know Smalltalk is capable of, Widgets/V 286 is just the
kind of toolkit you've been waiting for.

_MAKING SMALLTALK WITH WIDGETS_
by Kenneth E. Ayers


[LISTING ONE]

"The Appointment class definition"

Object subclass: #Appointment
 instanceVariableNames:
 'user date startTime endTime notifyTime text '
 classVariableNames: ''
 poolDictionaries: ''

"**** Appointment class methods ****"
timePrintString:aTime
 "Answer a string with a representation of
 aTime formatted as hh:mm AM/PM"

 | amPM hours minutes minStr |

 amPM := 'AM'.
 hours := aTime hours.
 minutes := aTime minutes.
 hours = 0
 ifTrue: [hours := 12] "midnight prints as 12 AM"
 ifFalse:[
 hours >= 12
 ifTrue:[
 amPM := 'PM'.
 hours > 12
 ifTrue:[hours := hours - 12]]].
 minutes < 10
 ifTrue: [minStr := '0', minutes printString]
 ifFalse:[minStr := minutes printString].
 ^hours printString, ':', minStr, ' ', amPM.

"**** Appointment instance methods ****"
< anAppointment
 "Answer true if the receiver's date
 and notify time are earlier than
 those of anAppointment"

 ^(date < anAppointment date)
 or:[(date = anAppointment date)
 and:[notifyTime < anAppointment notifyTime]].

<= anAppointment
 "Answer true if the receiver's date
 and notify time are the same or
 earlier than those of anAppointment"

 ^(self < anAppointment) or:[self = anAppointment].

= anAppointment
 "Answer true if the receiver's date
 and notify time are the same as
 those of anAppointment"

 ^(date = anAppointment date)
 and:[notifyTime = anAppointment notifyTime].

> anAppointment
 "Answer true if the receiver's date
 and notify time are later than
 those of anAppointment"

 ^(self <= anAppointment) not.

>= anAppointment
 "Answer true if the receiver's date
 and notify time are the same or
 later than those of anAppointment"

 ^(self < anAppointment) not.

checkTime:timeNow date:dateToday
 "Answer true if it is time to notify
 the user of his or her appointment"

 ^(date < dateToday)
 or:[(date = dateToday)
 and:[notifyTime < timeNow]].

date

 "Answer a Date, the appointment's date"
 ^date.

date:aDate
 "Set the appointment's date to aDate"
 date := aDate.
endTime
 "Answer a Time, the appointment's end time"
 ^endTime.
endTime:aTime
 "Set the appointment's end time to aTime"
 endTime := aTime.

info
 "Answer a String with the information on
 this appointment"

 ^OrderedCollection new
 add:user, ' has an appointment';
 add:'on ', date formPrint,
 ' at ', (self timePrintString:startTime);
 add:String new;
 addAll:text;
 yourself.

notifyTime
 "Answer a Time, the time at which the
 user is to be notified"
 ^notifyTime.

notifyTime:aTime
 "Set the time at which the user is to
 be notified to aTime"
 notifyTime := aTime.

printOn:aStream
 "Add a representation of the receiver
 to aStream"
 aStream
 nextPutAll:date formPrint;
 nextPutAll:' @ ';
 nextPutAll:(self timePrintString:startTime);
 nextPutAll:' - ';
 nextPutAll:(text at:1).

startTime
 "Answer a Time, the appointment's start time"
 ^startTime.

startTime:aTime
 "Set the appointment's start time to aTime"

 startTime := aTime.

text
 "Answer an Array containing the lines of text
 that describe the appointment"
 ^text.


text:aTextArray
 "Set aTextArray as the lines of text
 that describe the appointment"
 text := aTextArray.

timePrintString:aTime
 "Answer a string with a representation of
 aTime formatted as hh:mm AM/PM"
 ^self class timePrintString:aTime.

user
 "Answer the user for whom the appointment
 was created"
 ^user.

user:userName
 "Set the user for whom the appointment
 was created to userName"
 user := userName.







[LISTING TWO]

"The AppointmentBrowser class definition"

ApplicationWindow subclass: #AppointmentBrowser
 instanceVariableNames:
 'user appointments appointmentList
 dateEditor textEditor notifyTimeEditor
 endTimeEditor startTimeEditor '
 classVariableNames: ''
 poolDictionaries: ''

"**** AppointmentBrowser class methods ****"

open
 "Prompt for a user name and then open
 an AppointmentBrowser for that user"
 | userName |

 Cursor offset:Display boundingBox center.
 userName := PromptDialog
 prompt:'Enter user name'.
 userName isNil ifTrue:[^nil].
 ^self openOn:userName.

openOn:aUserName
 "Open an AppointmentBrowser for aUserName"

 ^super new
 openWithInitializeMethod:#initializeUser:
 arguments:(Array with:aUserName).

"**** AppointmentBrowser instance methods ****"


addAppointment
 "The user has pushed the 'ADD' button so
 we need to construct a new appointment
 record and add it to the user's list"
 | appointment index |

 appointment := Appointment new
 user:user;
 date:dateEditor date;
 startTime:startTimeEditor time;
 endTime:endTimeEditor time;
 notifyTime:notifyTimeEditor time;
 text:textEditor stringList deepCopy;
 yourself.
 appointments add:appointment.
 index := appointments indexOf:appointment.
 self updateAppointmentList.
 appointmentList selectItem:index.

changeAppointment
 "The user has pushed the 'CHANGE' button so
 we need to remove the currently selected
 appointment and then add a new one with the
 current values"

 appointmentList disableDrawing.
 self removeAppointment.
 appointmentList enableDrawing.
 self addAppointment.

createWindow
 "This method was generated by the
 Widgets/V 286 Interface Editor"

 ^TitledWindow new
 yourself;
 title: 'Appointment Browser';
 closable: true;
 iconizable: true;
 size: 415 @ 322;

 addWidget: (
 appointmentList := ListBox new
 yourself;
 nameForInstVar: 'appointmentList';
 on: #select send: #selectAppointment:;
 title: 'Appointments';
 size: 390 @ 133

 ) position: 12 @ 9;
 addWidget: (
 dateEditor := DateWidget new
 yourself;
 nameForInstVar: 'dateEditor';
 title: 'DATE:';
 date: (
 Date newDay: 16
 month: #December

 year: 1990
 );
 size: 114 @ 18
 ) position: 47 @ 156;
 addWidget: (
 TitledPane new
 yourself;
 title: 'WHAT FOR?';
 size: 233 @ 116;

 addWidget: (
 textEditor := TextEditWidget new
 yourself;
 nameForInstVar: 'textEditor';
 verticalScrollBar: true;
 horizontalScrollBar: false;
 size: 219 @ 90
 ) framer: (
 FramingParameters new
 xCentered;
 yCentered
 )
 ) position: 170 @ 151;

 addWidget: (
 notifyTimeEditor := TimeWidget new
 yourself;
 nameForInstVar: 'notifyTimeEditor';
 title: 'NOTIFY AT:';
 time: (
 Time new seconds: 74673
 );
 size: 144 @ 18
 ) position: 16 @ 269;
 addWidget: (
 TitledButton new
 yourself;
 on: #release send: #changeAppointment;
 title: 'CHANGE';
 size: 77 @ 19
 ) position: 326 @ 269;
 addWidget: (
 endTimeEditor := TimeWidget new
 yourself;
 nameForInstVar: 'endTimeEditor';
 title: 'END TIME:';
 time: (
 Time new seconds: 74673
 );
 size: 136 @ 18
 ) position: 24 @ 230;
 addWidget: (
 startTimeEditor := TimeWidget new
 yourself;
 nameForInstVar: 'startTimeEditor';
 on: #valueChanged send: #startTimeChanged:;
 title: 'START TIME:';
 time: (
 Time new seconds: 74613

 );
 size: 148 @ 18
 ) position: 12 @ 192;
 addWidget: (
 TitledButton new
 yourself;
 on: #release send: #addAppointment;
 title: 'ADD';
 size: 77 @ 19
 ) position: 170 @ 269;
 addWidget: (
 TitledButton new
 yourself;
 on: #release send: #removeAppointment;
 title: 'REMOVE';
 size: 77 @ 19
 ) position: 248 @ 269.
initializeUser:argArray
 "Initialize an AppointmentBrowser for the
 user whose name is given in argArray"

 user := argArray.
 (Appointments includesKey:user)
 ifFalse:[
 Appointments
 at:user
 put:(SortedCollection
 sortBlock:[:a :b | a < b])].
 appointments := Appointments at:user.
 self window
 on:#activated send:#updateAppointmentList;
 title:'Appointments for ', user.
 dateEditor date:Date today.
 startTimeEditor time:(Time fromSeconds:32400). "9:00 AM"
 self
 startTimeChanged:startTimeEditor time;
 updateAppointmentList.

removeAppointment
 "The user has pushed the 'REMOVE' button so
 we need to remove the selected appointment
 from the user's list"
 | index appointment |

 appointments size = 0 ifTrue:[^self].
 index := appointmentList selectionIndex.
 appointment := appointments at:index.
 appointments remove:appointment ifAbsent:[].
 self updateAppointmentList.
 index > appointments size
 ifTrue:[index := appointments size].
 appointmentList selectItem:index.

selectAppointment:aString
 "The user has selected an appointment so
 we need to fill-in all of the appropriate
 field editors"
 | appointment |


 appointment :=
 appointments
 at:appointmentList selectionIndex.
 dateEditor date:appointment date.
 startTimeEditor time:appointment startTime.
 endTimeEditor time:appointment endTime.
 notifyTimeEditor time:appointment notifyTime.
 textEditor stringList:appointment text.

startTimeChanged:aString
 "The user has changed the start time
 so we need to supply reasonable
 defaults for the end time and the
 notify time"
 | aTime |

 aTime := startTimeEditor time.
 "Assume appointment is one hour long"

 endTimeEditor
 time:(aTime addTime:(Time fromSeconds:3600)).

 "Assume notify 5-minutes before"
 notifyTimeEditor
 time:(aTime subtractTime:(Time fromSeconds:300)).

updateAppointmentList
 "Update the list of appointments"
 | size list |

 (size := appointments size) = 0 ifTrue:[^self].
 list := Array new:size.
 1 to:size do:[:index |
 list
 at:index
 put:(appointments at:index) printString].
 appointmentList stringList:list.

user
 "Answer the name of the user for which
 this browser was opened"

 ^user.

windowLocation
 "Make the window centered on the screen"

 ^#center.







[LISTING THREE]

"The AppointmentNotifier class definition"


ApplicationWindow subclass: #AppointmentNotifier
 instanceVariableNames:
 'active running minute text appointment '
 classVariableNames: ''
 poolDictionaries: ''

"**** AppointmentNotifier instance methods ****"

acknowledge
 "The user has pushed the 'OK' button, so we
 need to remove the current appointment
 from the user's list"
 | user list |

 appointment isNil ifTrue:[^self].
 user := appointment user.
 (list := Appointments
 at:user
 ifAbsent:[nil]) isNil
 ifFalse:[
 list
 remove:appointment
 ifAbsent:[]].
 text
 stringList:#();
 display.
 appointment := nil.
 minute := nil.

activateFor:anAppointment
 "Display the details of anAppointment and
 bring this window to the top"

 appointment isNil
 ifTrue:[
 "Previous appointment
 has been dismissed"
 text
 disableDrawing;
 stringList:anAppointment info;
 enableDrawing.
 appointment := anAppointment].
 Terminal bell; bell.
 Cursor offset:window origin.
 ScreenManager activateWindow:self window.

clockEvent
 "A clock tick has been received so we have
 to determine if the minute has rolled over
 and, if so, are any appointments ready"
 | now thisMinute today list appointment |

 now := Time now.
 (thisMinute := now minutes) = minute
 ifTrue:[^self].
 minute := thisMinute.
 today := Date today.
 Appointments associationsDo:[:anEntry |
 (list := anEntry value) size > 0

 ifTrue:[
 ((appointment := list first)
 checkTime:now date:today)
 ifTrue:[self activateFor:appointment]]].
closeWindow
 "Before closing this window, terminate the
 process that's monitoring clock events"

 active := false.
 "Wait for the process to terminate"
 [running] whileTrue:[Processor yield].
 super closeWindow.

createWindow
 "This method was generated by the
 Widgets/V 286 Interface Editor"

 ^TitledWindow new
 yourself;
 title: 'Appointment Notifier';
 closable: true;
 size: 183 @ 155;

 addWidget: (
 text := TextEditWidget new
 yourself;
 nameForInstVar: 'text';
 verticalScrollBar: true;
 horizontalScrollBar: false;
 size: 169 @ 91
 ) position: 5 @ 6;
 addWidget: (
 TitledButton new
 yourself;
 on: #release send: #acknowledge;
 title: 'OK';
 default: true;
 size: 57 @ 23
 ) framer: (
 FramingParameters new
 xCentered;
 originY: 103 relativeTo: #origin
 ).

initialize
 "Initialize a new AppointmentNotifier"
 active := false.
 running := false.
 minute := Time now minutes.
 [self run] forkAt:Processor highUserPriority.
run
 "Run the process that handles clock events"
 | timerSemaphore |

 (timerSemaphore := Smalltalk
 at:#TimerSemaphore
 ifAbsent:[nil])
 isNil
 ifTrue:[

 timerSemaphore := Semaphore new.
 Smalltalk
 at:#TimerSemaphore
 put:timerSemaphore].
 active := true.
 running := true.
 [active]
 whileTrue:[
 timerSemaphore wait.
 self clockEvent].
 Smalltalk removeKey:#TimerSemaphore.
 running := false.

windowLocation
 "Make the window centered on the screen"

 ^#center.








[LISTING FOUR]

"Modifications to the Process class methods"

timerInterrupt
 "Implement the timer interrupt."
 | timerSemaphore |

 PendingEvents add: (Message new
 selector: #clockEvent:;
 arguments: (Array with: 1)).
 KeyboardSemaphore signal.

 "**********************************************
 Added by Ken Ayers to support the
 Appointment Manager Application"

 timerSemaphore := Smalltalk at:#TimerSemaphore
 ifAbsent:[nil].
 timerSemaphore notNil
 ifTrue:[timerSemaphore signal].
 "**********************************************"
 self enableInterrupts: true.






[LISTING FIVE]

"Corrections to TimeWidget methods"
"**** TimeWidget instance methods ****"


time
 | hours |

 hours := self hours.

 (self meridianEditor value = 'PM')
 ifTrue:[
 hours < 12 ifTrue:[hours := hours + 12]]
 ifFalse:[
 hours = 12 ifTrue:[ hours := 0]].

 ^(Time fromSeconds:0)
 hours:hours;
 minutes:self minutes.

time: newTime
 | hours |

 self time = newTime ifTrue: [^self].

 hours := newTime hours.

 (hours >= 12)
 ifTrue: [
 self meridianEditor value: 'PM'.
 hours > 12 ifTrue:[hours := hours - 12]]
 ifFalse: [
 self meridianEditor value: 'AM'.
 hours = 0 ifTrue:[hours := 12]].

 self hourEditor value: hours.

 self minuteEditor value: newTime minutes.





























MAY, 1991
ARRAY BOUNDS CHECKING WITH TURBO C


Hardware-assisted bounds checking, thanks to a DOS extender




Glenn Pearson


Glenn has a Ph.D. in computer science from the University of Maryland, College
Park, focusing on man-machine interfaces. This article was inspired by
experience with a software development project to enhance "MLAB," a
mathematical modeling system from Civilized Software Inc. Glenn can be reached
c/o CSI, 7735 Old Georgetown Rd., Bethesda, MD 20814.


The premise behind DOS extenders is that most of the code in your program can
run in "protected mode." As a side benefit, DOS extenders also make it
possible for you to add array bounds checking to your programs. Because the
bounds checking is handled by the hardware, no runtime overhead is incurred
when accessing the array; the only overhead is additional work during memory
allocation and freeing. Unfortunately, this additional workload will cause
noticeable performance degradation with some programs, so it is best to
arrange your code so that bounds checking can easily be turned on or off.
While some compilers offer array bounds checking as a compile-time option,
Turbo C 2.0 does not. However, the technique presented in this article lets
you add array bounds checking to your Turbo C applications when used with the
Ergo DOS extender. Furthermore, the technique should still be applicable to
Borland C++, and it can be adapted to other DOS extenders.


Protected Mode Revisited


Several thorough discussions of protected mode have appeared in DDJ in the
past, so I'll just briefly hit the high points. If you're interested in more
detail on protected mode, refer to the "Suggested Reading" section at the end
of this article.
Recall that real mode addresses with Intel chips consist of a segment number
and an offset. On a 286, the segment register holds the 16-bit segment base
value, and the offset (as might be stored in the Instruction Pointer, or as
part of an assembly language instruction) is limited to 16 bits. When
composing the address, the segment "base" value is shifted left 4 bits and
added to the offset, resulting in a 20-bit value (a 1-Mbyte range).
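
The real-mode address arithmetic just described can be sketched in a few lines of C. This is only an illustration: the function name real_mode_address is hypothetical and does not come from Turbo C or Ergo, and the fixed-width types assume a compiler that provides <stdint.h>.

```c
#include <stdint.h>

/* Compose a real-mode physical address from a segment:offset pair.
   Hypothetical helper for illustration only.
   Note: for segments near 0xFFFF the sum can run slightly past 20 bits;
   this sketch does not model how a particular chip handles that case. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    /* Shift the 16-bit segment left 4 bits (multiply by 16),
       then add the 16-bit offset. */
    return ((uint32_t)segment << 4) + (uint32_t)offset;
}
```

One consequence worth noticing is that different pairs can name the same byte: 1234:0010 and 1235:0000 both resolve to 12350 hex, which is part of why real mode cannot offer per-segment protection.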
Protected-mode addresses use offsets identical to the real-mode offsets. The
change is to the segment numbers, which are replaced with 16-bit "selectors."
This introduces a level of indirection. The high-order 13 bits of a selector
represent an offset into a "Descriptor Table" (as shown in Table 1), selecting
a particular descriptor row.
Table 1: Contents of each row in the descriptor table. Both the 286 and
386/486 can specify an arbitrary 32-bit size bound, using bytes 0, 1, 7, and 8,
but the 286 can access only memory within the first 64K of the Base Address.

 Bytes   Contents
 --------------------------------------------------------------------
 0-1     Routinely zero for 286; Segment Size Limit extension for 386
 3       Access Rights Bits:
           7    Present
           6-5  Descriptor Privilege Level
           4-0  Other info (varies, field encoded)
                Identifies code versus data, writable, and so on
 4-6     24-bit Base Address in Real Memory
 7-8     16-bit Segment Size Limit

As a result, the real address space of an AT is limited to what can be
specified in 24 bits (namely 16 Mbytes), and the maximum segment size is 64K.
On the 386 and 486 chips, the offset register is expanded to hold 32 bits,
thus allowing a maximum segment size of 4 gigabytes.
There are actually two such tables maintained by the chip while in protected
mode. The Local Descriptor Table (LDT) is pointed to by the LDTR register,
while the Global Descriptor Table (GDT) is pointed to by the GDTR register.
Each register also holds the size of its table. The contents of the tables,
which live in main memory, are preliminarily filled in by the compiler and
linker, and are made final during loading immediately before execution begins.
Generally, each application has its memory allocation information stored in
its own LDT. Bit 2 of a selector address is 0 for access to the GDT and 1 for
LDT access. Bits 1-0 hold the "Requested Privilege Level." If all goes
smoothly, a C programmer would need to know only that a far pointer is really
a selector:offset pair (sel). Of course, things don't always go smoothly.
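
To make the selector layout concrete, here is a minimal sketch that splits a 16-bit selector into the three fields described above. The struct and function names (SelectorFields, decode_selector) are hypothetical and are not part of any Turbo C or Ergo header.

```c
#include <stdint.h>

/* The three fields packed into a 16-bit protected-mode selector.
   Hypothetical type for illustration only. */
typedef struct {
    uint16_t index;     /* high-order 13 bits: row in the descriptor table */
    int      uses_ldt;  /* bit 2: 0 selects the GDT, 1 selects the LDT     */
    unsigned rpl;       /* bits 1-0: Requested Privilege Level (0 to 3)    */
} SelectorFields;

SelectorFields decode_selector(uint16_t selector)
{
    SelectorFields f;
    f.index    = (uint16_t)(selector >> 3);  /* drop the 3 low-order bits */
    f.uses_ldt = (selector >> 2) & 1;        /* table indicator bit       */
    f.rpl      = selector & 3u;              /* two privilege bits        */
    return f;
}
```

For instance, selector 002F hex decodes to row 5 of the LDT with a requested privilege level of 3, which is the kind of value an ordinary application would typically hold.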
In any event, the availability of information in the tables allows the chip to
perform checking automatically on every memory access in order to make sure
the access is to a valid "present" segment and to a valid bounded region
within that segment (an offset between 0 and (Size Limit-1), inclusively).
Furthermore, operations such as writing to a code segment can be forbidden
through appropriate values in the Access Rights byte. A General-Protection
(GP) Fault is generated when an illegal access is detected.
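
The checks the chip applies can be modeled in software as a rough sketch. Everything here is an assumption made for illustration: the Descriptor struct is a simplified stand-in for a real descriptor row, size_limit is the segment size in bytes as the term is used above, and real hardware distinguishes more fault types and applies its tests in its own order.

```c
#include <stdint.h>

/* Simplified stand-in for one descriptor row (illustration only). */
typedef struct {
    uint32_t base;        /* base address in real memory              */
    uint32_t size_limit;  /* segment size in bytes                    */
    int      present;     /* Present bit from the Access Rights byte  */
    int      writable;    /* writable flag from the Access Rights byte */
} Descriptor;

typedef enum {
    ACCESS_OK,
    FAULT_NOT_PRESENT,
    FAULT_OUT_OF_BOUNDS,
    FAULT_READ_ONLY
} AccessResult;

/* Model of the test made on every memory access: the segment must be
   present, the whole access must fall between offset 0 and
   size_limit - 1, and writes need a writable segment. */
AccessResult check_access(const Descriptor *d, uint32_t offset,
                          uint32_t length, int is_write)
{
    if (!d->present)
        return FAULT_NOT_PRESENT;
    if (offset >= d->size_limit || length > d->size_limit - offset)
        return FAULT_OUT_OF_BOUNDS;
    if (is_write && !d->writable)
        return FAULT_READ_ONLY;
    return ACCESS_OK;
}
```

With a descriptor whose size_limit is 6, reading 1 byte at offset 5 passes, while reading 1 byte at offset 6 (or 3 bytes at offset 4) is rejected. That is exactly the property the bounds-checking technique relies on: give each array its own descriptor and the size limit becomes the array bound.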
The operating system and programs such as loaders need to be able to write to
code segments. This is handled through the use of a four-level privilege
scheme provided by 2 bits in the chip's Process Status Word as well as in the
descriptors and selectors. Most commonly, the operating system (such as
extended DOS) runs at the highest privilege of 0, while most applications run
at the lowest privilege of 3.


The Ergo DOS Extender


Ergo supports a number of C and Fortran compilers. As with all DOS extenders,
the basic idea behind the Ergo extender is to allow an application, written
using the standard DOS and BIOS calls (usually embedded within calls to the
compiler's libraries), to be converted to run in protected mode. In the ideal
case, this requires no reprogramming--you compile and link the program using
the normal compiler and linker. The resulting "real mode" executable is then
input to a conversion routine. The conversion routine looks for every memory
access using an absolute memory address (which in "real mode" explicitly or
implicitly involves a pair of numbers, the segment and the offset) and converts
it to a protected-mode "selector," a form of logical address.
Ergo performs this conversion using its "EXPress" utility. Thus, if hello.exe
is our executable file, EXPress reads the file and generates "hello.exp." The
generated result is ready to be run, but not directly. Instead, Ergo provides
a loader, called UP, to switch to protected mode, map the selectors to actual
memory locations (usually in extended memory), and begin execution. When
"hello" terminates, the loader restores real mode and returns to the DOS
prompt.
In order for this magic to work, Ergo requires that a TSR program (OS286) be
installed in the system. As its name implies, OS286 uses only 16-bit
instructions and will thus work with 286 machines (as well as 386/486
machines). Ergo also provides OS386, which uses 32-bit instructions and is
compatible with 32-bit compilers. This limits the application to running on
the latter machines, but resulting code will run up to twice as fast. When
installed, OS286 has two main parts, a "real-mode kernel" that lives below 640
Kbytes, and a "protected-mode kernel" that usually resides in extended memory.
During installation, this TSR investigates the memory allocation of the
system, inquiring as to total physical memory and looking for other TSRs (such
as RAM disks) that have reserved space. Unused space in extended (and
optionally low) memory is considered fair game for use as the protected-mode
heap.
When the protected-mode loader is invoked, UP communicates with the kernels,
informing them of the amount of space needed for the load image of the
executable. This space is then allocated in the protected-mode heap, and the
image is loaded. The processor's memory mapping abilities are tapped to create
the correct correspondence between selectors and actual segment-offset
locations.
In addition, software-interrupt calls in the executable to DOS or BIOS system
services are routed by the protected-mode interrupt table to the
protected-mode kernel. In some cases, the kernel carries out the requested
function directly in protected mode. Often, though, there is a momentary
switch into real mode, allowing the function to be performed by the real-mode
kernel, usually through a real-mode software interrupt to DOS/BIOS itself.
Switches between modes involve a modest amount of overhead on the 286 (on the
order of 0.1 second), so routine mode switching is not desirable. On 386/486
processors, on the other hand, just a single bit in the Process Status Word
need be twiddled.
The kernel TSR remains resident after the application returns to DOS, so
subsequent invocations of the application start more quickly and exhibit a
somewhat different initial on-screen message. The TSR is unloaded using the
-remove switch. The loading procedure can thus be encapsulated as shown in
Figure 1(a).
Figure 1: (a) Batch file to automate the loading procedure; (b) using the bind
utility to bind OS286 and the loader into an application; (c) a typical GP
fault message.

 (a) os286
     up hello
     os286 -remove

 (b) bind -k os286.exe -1 tinyup.exe -i hello.exp -o hello.exe

 (c) GP at XXXX YYYY EC ZZZZ


This message should be read as "general-protection fault occurred in Code
Segment (CS) XXXX with Instruction Pointer (IP) of YYYY, with Ergo error code
of EC and chip-generated error code of ZZZZ."
This loading procedure, which is useful during product development, is not
necessarily one to present to the end user. An alternative often employed for
end distribution is to "bind" the kernel, the loader, and the application into
the executable. Ergo provides the bind utility for this purpose. The command
line, shown in Figure 1(b), uses bind to create hello.exe. In this case,
another loader, Tinyup, is used (instead of up) with bind. Now when hello is
run, it first installs the OS286 kernel, which is now installed as an overlay,
not as a true TSR. If available memory space is insufficient to load the
kernel and application, a message reports the failure and returns you to the
DOS prompt. Otherwise, the application is loaded, run, and returned to DOS.
Unlike the unbound case, removal of the bound kernel from memory occurs
automatically when the application returns normally to DOS. (Additional
controls exist to determine what happens if a bound application is started up
in a system that separately has an unbound kernel resident.)


Symptoms of GP Faults


With a nontrivial program, problems can arise following conversion to
protected mode. These usually manifest themselves at runtime by a
general-protection fault, a fatal error that prints some information and dumps
you back into DOS. Figure 1(c) shows the sort of message you'll see when a GP
fault occurs. This message gives you the contents of the two most important
processor registers. The CS and IP values, in conjunction with the .XMP
extended map file produced during conversion, will identify the individual
module (.OBJ) name, and the appropriate location within the module that
generated the GP fault.
A GP error in a bound application started from the DOS prompt will generate
only the error message just described. But under other circumstances you can
get an additional full dump of all the registers, courtesy of a special GP
handler provided as part of up.exe and cp.exe. The full dump is available when
running unbound programs (with the UP loader) or when executing a program
under the CP debugger. The full dump will not always occur, because events
such as real-mode memory corruption can kill it. But in any event, the Ergo
kernel will regain control of the CPU, issue the one-line GP message, kill the
program, and return to DOS.


Causes of GP Faults


Problems can arise from interrupt-driven code and from code that attempts
direct access to memory. I will not consider these sources of difficulties
here. Instead, the focus will be faults due to faulty pointers, array
overruns, or related address arithmetic.
Some compilers provide runtime checking of array bounds (often as a
compile-time option) but, as mentioned earlier, Turbo C 2.0 does not. These
often-elusive errors can generate a general-protection fault, as can
dereferencing of uninitialized or dangling pointers, such as that shown in
Example 1(a). The problem is that you can't count on the dereference to point
to an illegal selector:offset location and thus trigger the fault.
Example 1: (a) Dereferencing an uninitialized or dangling pointer; (b)
creating a file global table; (c) toggling bounds checking to either on or off;
(d) a decentralized method for global or external static array "str1".

 (a) char *bad1, *bad2, *bad3;

     strcpy(bad1, "hello");          /* GP Candidate: Uninitialized */
     bad2 = (char *) calloc(20, 1);  /* calloc takes a count and an element size */
     free(bad2);
     strcpy(bad2, "hello");          /* GP Candidate: Dangling */
     bad3 = (char *) calloc(2, 1);
     strcpy(bad3, "hello");          /* GP Candidate: Array Overrun */

 (b) #define LDTSIZE 1024 /* Ergo default LDT size */
     unsigned long malloclist [LDTSIZE];

 (c) void *mycalloc(int size)
     {
         void *results;

     #ifdef BOUNDS_ON
         results = pcalloc(size);
     #else
         results = calloc(size, 1);
     #endif
         if (results == NULL) ... /* complain & exit */
         return(results);
     }

 (d) char str1[] = "Hello";
     char *str2;

     {...
     /* Before first use of "Hello": */
     str2 = (char *)createDataWindow(str1,strlen(str1)+1);
     /* +1 for terminal \0 */
     /* Use str2 from now on instead of str1 */
     ...}


 A similar approach is applicable to internal statics:

 somefunc( ) {
     static char str1[] = "Hello";
     static char *str2;

     ...
     /* Before first use of "Hello" in somefunc: */
     if (str2 == NULL)
         str2 = (char *) createBoundedWindow (str1, strlen(str1)+1);
     /* +1 for terminal \0 */
     /* Use str2 from now on instead of str1 */
     ...}

Overruns of static arrays such as string constants are less likely to be
detected because these are lumped into one data segment per .OBJ for the huge
model, or just one data segment for the large model. The bounds checking is
applied to the group overall, not to individual arrays.
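To see why a shared data segment hides small overruns, the following sketch models the hardware limit check in portable C. The function names and the sample sizes are illustrative assumptions, not part of Turbo C or OS286; the only rule modeled is that an access faults when its offset falls outside the segment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the limit check: an access faults only when the
   offset lies beyond the last valid byte of the segment. */
int seg_access_faults(uint32_t seg_size, uint32_t offset)
{
    assert(seg_size > 0);
    return offset >= seg_size;
}

/* An array of `len` bytes starts at `base` inside a segment of `seg_size`
   bytes. Touching `over` bytes past the array's end is caught only if the
   last byte touched also lies outside the segment. */
int overrun_detected(uint32_t seg_size, uint32_t base,
                     uint32_t len, uint32_t over)
{
    uint32_t last;

    assert(len > 0);
    last = base + len + over - 1;   /* last byte touched */
    return seg_access_faults(seg_size, last);
}
```

With a 6-byte string in the middle of a 1000-byte data segment, a 4-byte overrun goes unnoticed; the same string under a private 6-byte selector faults on the first byte past its end.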
Dynamically allocated arrays, on the other hand, may or may not get a private
sel, depending upon the runtime library provided with the compiler. If not,
malloc when first called will ask for a large block of memory from DOS (a 64K
segment) and return a pointer to the start of it; subsequent mallocs will
allocate from the same segment, if possible.


Heap Structure


For further insight, consider the Turbo C/Ergo heap--that area of memory under
Turbo C's management from which calls to malloc, calloc, and farmalloc draw.
When a Turbo C executable begins, it requests from DOS (with OS286
intervention) all available memory for its heap. This memory in real address
space could be either a single range of low memory, a single range of extended
memory, or both, depending upon how the OS286 kernel was preconfigured. (You
can also limit the heap space size by reserving space for other purposes. This
consideration doesn't change the aspects discussed here. The situation with
the Microsoft C compiler is similar with respect to their normal, so-called
far, heap. There is also a 64K near heap, which is contained within the
DGROUP.)
The heap block is then covered (internally by Ergo's implementation) by an
overall "data window" selector, spanning the whole heap. This window, which is
not strictly required, simplifies the next step by providing a convenient
place to store the 32-bit whole-heap size. Although the 286 chip can specify
an arbitrary 32-bit bound, it can access only contents stored within the first
64K of a selector's base. To make the entire heap accessible, a series of
child data windows is created, represented by a single set of consecutive
entries in the LDT. The fact that the entries are consecutive (they "tile" a
region of the LDT) is very important. Each selector in the tiled region,
except perhaps the last one, references a "stride" of 32K of memory. Any
address arithmetic done on a selector:offset pair (such as implicitly done
during an array element access) will invoke the Turbo C address-adjustment
routines, which Ergo has adjusted to be cognizant of tile boundaries. Thus,
memory references can overflow from one selector to the next within this LDT
region without faulting, so that data objects larger than 64K can be properly
accessed and manipulated. Address arithmetic overflows will generate a
hardware bounds fault only if the result goes outside the overall heap. (Why
doesn't each tiled selector reference the maximum size piece of memory
possible, namely 64K? Because overflows of 64K do not consistently invoke the
Turbo C address-arithmetic routines. 32K was chosen to make overflow checking
and adjustment arithmetic both reliable and easy.)
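As a rough illustration of the tiling, the sketch below maps a byte offset from the start of the heap onto a tiled selector:offset pair. The type TiledPtr, the function tile_locate, and the selector values used with it are assumptions made for this example; only the 32K stride and the step of eight between consecutive LDT selectors come from the text.

```c
#include <assert.h>
#include <stdint.h>

#define TILE_STRIDE 0x8000UL  /* 32K bytes covered by each tiled selector */
#define SEL_STEP    8u        /* selector distance between adjacent LDT entries */

typedef struct { uint16_t sel; uint16_t off; } TiledPtr;

/* first_sel: selector of the first tile (supplied by the caller).
   heap_off:  byte offset from the start of the heap. */
TiledPtr tile_locate(uint16_t first_sel, uint32_t heap_off)
{
    TiledPtr t;
    uint32_t tile = heap_off / TILE_STRIDE;   /* which tile holds the byte */

    assert(first_sel + tile * SEL_STEP <= 0xFFFFUL);
    t.sel = (uint16_t)(first_sel + tile * SEL_STEP);
    t.off = (uint16_t)(heap_off % TILE_STRIDE);
    return t;
}
```

For instance, with a first tile at the made-up selector 0107h, heap offset 12345h lands two tiles further on, at selector 0117h, offset 2345h.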
The Turbo C heap is structured in the usual way for memory management by
malloc/farmalloc and free/farfree calls. When a farmalloc request is issued
(or farcalloc, or farrealloc), the sel returned has a selector from within the
set of "tiled" selectors. Ergo makes the reasonable assumption that if you use
farmalloc instead of malloc for your allocation, it is because the object is
(or could be) larger than 64K, and you intend to use a huge or normalized
pointer to access it, rather than a far pointer. When address arithmetic is
performed on a huge pointer, regular Turbo C calls a subroutine to make sure
the result is normalized. Ergo changes this routine so that, for instance, an
increment of the offset that causes it to overflow will be propagated as an
addition of eight to the selector, thus pointing properly to the next tiled
descriptor in the LDT. Similarly, a decrement resulting in underflow subtracts
eight from the selector.
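The adjusted pointer arithmetic can be pictured with a small portable model. This is a sketch under the assumption that offsets wrap at the 32K tile stride; HugePtr and huge_add are hypothetical names, and the real Ergo routine is not shown in this article.

```c
#include <assert.h>
#include <stdint.h>

#define TILE_STRIDE 0x8000L   /* 32K per tile */
#define SEL_STEP    8         /* one LDT entry */

typedef struct { uint16_t sel; uint16_t off; } HugePtr;

/* Add a signed byte count to a tiled pointer, carrying offset overflow
   into the selector (+8 per tile) and underflow out of it (-8 per tile). */
HugePtr huge_add(HugePtr p, int32_t delta)
{
    int32_t off = (int32_t)p.off + delta;
    int32_t sel = p.sel;

    assert(p.off < TILE_STRIDE);
    while (off >= TILE_STRIDE) { off -= TILE_STRIDE; sel += SEL_STEP; }
    while (off < 0)            { off += TILE_STRIDE; sel -= SEL_STEP; }
    assert(sel >= 0 && sel <= 0xFFFF);

    p.sel = (uint16_t)sel;
    p.off = (uint16_t)off;
    return p;
}
```

Incrementing the last byte of one tile yields offset zero in the next selector, and decrementing that result returns to the original pointer.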
With a malloc call, the situation is a little more complicated (and
undocumented). It must be guaranteed that a malloced object resides in a
single segment, so that far pointers, which behave badly at segment
boundaries, can be safely used. Malloc starts out by calling farmalloc. If the
returned sel is such that the block would straddle a boundary in the LDT (an
infrequent occurrence), then a separate data window -- a private selector on
the underlying real memory area chosen in the heap -- is created and returned
to the user. This private selector, with its private size limit, is an
instance of automatic bounds checking that we will extend to all dynamic
memory allocations.
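The straddle test that malloc must make can be sketched as follows. The function name and the use of a plain heap offset are assumptions for illustration; the actual check inside the Ergo-modified library is undocumented.

```c
#include <assert.h>
#include <stdint.h>

#define TILE_STRIDE 0x8000UL  /* 32K per tiled selector */

/* Returns 1 if the block [heap_off, heap_off + size) crosses from one
   tile into the next, 0 if it sits wholly inside a single tile. */
int straddles_tile(uint32_t heap_off, uint32_t size)
{
    assert(size > 0);
    return (heap_off / TILE_STRIDE) != ((heap_off + size - 1) / TILE_STRIDE);
}
```

A 32-byte block starting 16 bytes before a tile boundary straddles it and would need a private selector; a block that exactly fills one tile does not.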
When free is called, it must check for this special situation and delete the
data window if present; then farfree is called to actually update the heap. It
is permissible to call free rather than farfree with sels allocated originally
by farmalloc, so if you choose to use only one of these calls for all
deallocations, make it free, not farfree.
The Ergo developers report that the Microsoft C heap structure and
heap-handling library routines are, in general, better structured internally
than Turbo C's. We have found that Turbo C/OS286 works well for mid-sized
large-model projects (say, 30 modules). Unfortunately, we have been
unsuccessful so far with a 100-module project that just barely fits within the
large-model (single data segment, for statics) paradigm. In this case, the LDT
appears to be corrupted, and the cause remains elusive. At this time, we would
suggest a different compiler, such as Microsoft C, be used with OS286 for
numerous-moduled projects.


Adding Bounds Checking


To guarantee a private sel for a dynamic array, you ask the Ergo system to
provide a data window over the array. The private sel is then used for
subsequent array accesses, so that bounds checking is performed with no
runtime overhead (the chip hardware does it). We will need to remember the
correspondence between this private sel and the sel returned by the allocator.
For this, we introduce a file global table, as shown in Example 1(b).
The stored values in this table are the allocator segment:offset values. The
index value to access the table is derived from a private sel value. The size
chosen for our table reflects the fact that the lowest three bits of the
selector are always of a fixed value. With Ergo, these three bits are all in
the 1 state, because we are using the LDT, not GDT, and the privilege level is
3. Thus, we must shift the selector by three bits to make malloclist
close-packed.
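The selector-to-index mapping is small enough to state directly. In this sketch the helper names are hypothetical; the bit layout (table-indicator bit set for the LDT, requested privilege level 3) is the one described above.

```c
#include <assert.h>
#include <stdint.h>

#define SEL_LOW_BITS 7u   /* TI = 1 (LDT) and RPL = 3: low three bits set */

/* Strip the three fixed low bits to get a close-packed table index. */
unsigned sel_to_index(uint16_t sel)
{
    assert((sel & SEL_LOW_BITS) == SEL_LOW_BITS);
    return (unsigned)(sel >> 3);
}

/* Restore the fixed low bits to rebuild the selector from an index. */
uint16_t index_to_sel(unsigned index)
{
    assert(index < 8192u);   /* an LDT holds at most 8192 entries */
    return (uint16_t)((index << 3) | SEL_LOW_BITS);
}
```

Because the low bits never vary, the round trip loses nothing: selector 010Fh maps to index 33 and back again.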
The most general way to provide bounds checking is to write your own version
of all of Turbo C's allocation and freeing routines, so that all your
nonstatic arrays have this ability. The convention adopted here is to stick a
p prefix onto the corresponding Turbo C function name. A straightforward
version of pmalloc and pfree is shown in Listing One, page 104. The error
routine in Listing One is used for fatal errors, and concludes with an exit
call. File global array malloclist is of size LDTSIZE, a value that must match
the LDT size of the OS286 kernel, as discussed later.
In pmalloc, createBoundedWindow calls a function that first locates the
selector representing the start of the heap, and calculates the 32-bit offset
from that to the start of the allocated space. Using this information, one or
more calls to createDataWindow are then used to create the bounding sels. "One
or more?" you ask. Yes. If the allocated space is greater than 64K, it is
necessary to set up your own tiling. createDataWindow simply contains a call
to an Ergo "extended service" routine, accessed through software interrupt
21h, function E8h, and subfunction 01h. Windows so created are disposed of in
pfree with a call to deleteSegOrWin, the Ergo-enhanced DOS function 49h. In
practice, the deleteSegOrWin call has to be placed inside a small loop to
handle deletion of tiled bounding windows.
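The bookkeeping behind that deletion loop can be sketched in portable C. The helper names are assumptions; the 32K tile size and the highest-to-lowest deletion order follow the comments in Listing Two.

```c
#include <assert.h>
#include <stdint.h>

#define TILE_STRIDE 0x8000UL  /* 32K per tile */
#define SEL_STEP    8u        /* one LDT entry */
#define LDT_MAX     8192UL    /* most entries an LDT can hold */

/* Number of 32K tiles needed to cover `length` bytes. */
uint16_t tile_count(uint32_t length)
{
    assert(length > 0 && length <= LDT_MAX * TILE_STRIDE);
    return (uint16_t)((length + TILE_STRIDE - 1) / TILE_STRIDE);
}

/* Selector of the k-th tile to delete when working from the highest
   selector down to the lowest (k = 0 is deleted first). */
uint16_t nth_delete_sel(uint16_t first_sel, uint16_t count, uint16_t k)
{
    assert(k < count);
    return (uint16_t)(first_sel + (count - 1u - k) * SEL_STEP);
}
```

A 100,000-byte array needs four tiles; if the first tile sits at the made-up selector 0107h, the loop deletes 011Fh first and 0107h last.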
Listing Two, page 104, provides a full set of allocation and free routines.
These are a bit more complex than we have discussed. They have an added "fail
soft" feature: If the LDT table becomes "full" as defined in the program by
the high-water mark, MAXLDTNUM, then a private sel is forgone, and bounds
checking is not done on that array. Listing Two also features error reporting
and flow-trace macros, which you can modify to meet your needs. When a fatal
error occurs, two strings are passed as parameters. Only the second string is
understandable to the end user, but the developer will wish to see both. The
simple error routine shown is for the developer, and is appropriate for text
mode but not necessarily graphics mode.
As mentioned at the outset, you will want to arrange to toggle bounds checking
on and off easily. An elaboration of the scheme shown in Example 1(c) does the
job. If you are already using OS286, note that in mid-February a bug was
discovered that affects the zeroing of large arrays by Turbo C's calloc and
farcalloc. Contact Ergo for a patch.
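One possible elaboration of the toggle in Example 1(c) is a small set of macros chosen at compile time. The macro names XMALLOC, XCALLOC, and XFREE and the smoke-test function are hypothetical; the p-routine prototypes follow Listing Two, where uint16 is unsigned short. Without BOUNDS_ON defined, the sketch falls back to the standard library and builds on any C compiler.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef BOUNDS_ON
/* Prototypes as declared in Listing Two. */
void *pmalloc(unsigned short bytes);
void *pcalloc(unsigned short nitems, unsigned short bytes);
void  pfree(void *q);
#define XMALLOC(n)    pmalloc(n)
#define XCALLOC(n, s) pcalloc((n), (s))
#define XFREE(p)      pfree(p)
#else
#define XMALLOC(n)    malloc(n)
#define XCALLOC(n, s) calloc((n), (s))
#define XFREE(p)      free(p)
#endif

/* Hypothetical smoke test: allocate `size` zeroed bytes through the
   toggle, confirm they are zero, and release them.
   Returns 1 on success, 0 otherwise. */
int zeroed_alloc_ok(unsigned short size)
{
    unsigned short i;
    int ok = 1;
    char *buf;

    assert(size > 0);
    buf = (char *)XCALLOC(size, 1);
    if (buf == NULL)
        return 0;
    for (i = 0; i < size; i++)
        if (buf[i] != 0)
            ok = 0;
    XFREE(buf);
    return ok;
}
```

Application code then calls only the X macros, so switching bounds checking on or off is a matter of one compiler flag and a rebuild.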
Ergo's default LDT size is 1024. The low end of the LDT will already be taken
up by code and data sels before any dynamic data allocation occurs. If the
program doesn't do much mallocing, the default may be sufficient. Otherwise,
you take a generous guess at the maximum number of memory requests active at
any time, round up to the next power of 2, and set LDTSIZE to that value. A
286 machine, however, can have at most 8192 entries in its LDT. Thus, a
program that has, say, 10,000 simultaneously active dynamic arrays would have
to use a more selective method of assigning private sels than simply universal
use of pmalloc and pfree. (Note that while Ergo uses the LDT for user sels,
Rational Systems uses the GDT; however, the latter is likewise limited to 8192
entries.)
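The sizing rule can be written down as a short helper. This is a sketch with a hypothetical function name; the upper bound of 8192 is the hardware limit just mentioned, and the lower bound of 128 is the smallest value OS286 accepts.

```c
#include <assert.h>

#define LDT_MIN 128u
#define LDT_MAX 8192u

/* Returns the smallest power of two that is >= max_active and lies in
   [LDT_MIN, LDT_MAX], or 0 when the request cannot fit in one LDT. */
unsigned choose_ldtsize(unsigned long max_active)
{
    unsigned size = LDT_MIN;

    if (max_active > LDT_MAX)
        return 0;
    while (size < max_active)
        size <<= 1;
    assert(size <= LDT_MAX);
    return size;
}
```

The estimate passed in should already include headroom for the code and data sels that occupy the low end of the table. A result of 0 signals the 10,000-array case, where private sels have to be assigned selectively.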
If LDTSIZE is changed from the default to some other valid value ranging from
128 to 8192, OS286 must be told how large to make the LDT when your program
begins execution. For bound applications, this is done by modifying the
kernel's disk file prior to binding it with the distributed program. The
current value of LDTSIZE (and all other settable parameters) can be viewed or
modified using 286 SETUP.
For unbound applications, you can specify the LDTSIZE on the command line
during loading of the kernel TSR. In the latter case, the load image of the
kernel, not the disk file, is altered so the change is not sticky across
reboots.
In addition to the LDT, the operating system also maintains for each task a
Local Working Table (LWT) which is identical in size to the LDT. Thus, each
unit increase in LDT size ties up 16 bytes of real memory during task
execution. As for other space requirements, the task's Task State Segment
(TSS) consumes under 4K, and the operating system needs 1K each for the GDT
and the Interrupt Descriptor Table (IDT).
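The per-entry cost translates into a quick estimate. The helper name is hypothetical, and the split into 8 bytes for the LDT descriptor plus 8 bytes for the matching LWT entry is inferred from the 16-byte figure above.

```c
#include <assert.h>

/* Bytes of real memory tied up by an LDT of `ldt_entries` descriptors
   together with its same-sized Local Working Table. */
unsigned long ldt_lwt_bytes(unsigned ldt_entries)
{
    assert(ldt_entries >= 128u && ldt_entries <= 8192u);
    return (unsigned long)ldt_entries * 16UL;  /* 8 (LDT) + 8 (LWT) */
}
```

The default of 1024 entries therefore costs 16K, while the maximum of 8192 entries costs 128K.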


Static Arrays and Structures


For nondynamic arrays, such as constant strings, there is no convenient
centralized mechanism for adding hardware bounds checking. A decentralized
method is shown in Example 1(d). It is not necessary to free the sel for a
global or explicit static array. In such cases, the array is allocated at
compile time from the data segment (for the huge model the data segment
associated with the code module) and never goes away.


Conclusion


The foregoing methods are adaptable to other static data structures. In a
large program, extensive use of this methodology requires a fair amount of
programmer effort, perhaps suggesting use as a debugging tool, to be applied
temporarily to suspect data objects.
For an automatic array of fixed size, whose pointer (and presumably array
space) is allocated off the stack, you would at least have to explicitly free
the sel at procedure exit. That is, a call to free(str2) would free up the
slot in the LDT. However, there are additional complexities involving
initialization that make this bounds checking technique inconvenient to apply
to automatic arrays.


Suggested Reading



Duncan, Ray, ed. Extending DOS. Reading, Mass.: Addison-Wesley, 1990.
Fried, Stephen. "Accessing Hardware From 80386 Protected Mode." DDJ (May/June,
1990).
iAPX 286 Programmer's Reference Manual. Santa Clara, Calif.: Intel
Corporation, 1985.
Williams, Al. "Roll Your Own DOS Extender." DDJ (October/November 1990).
Williams, Al. "DOS + 386 = 4 Gigabytes." DDJ (July 1990).


Products Mentioned


OS/286 & OS/386 DOS Extenders
ERGO Computing-Extenders
One Intercontinental Way
Peabody, MA 01960
508-535-7510
Standard version: $695
Virtual memory version: $1490

Turbo C 2.0 and Borland C++
Borland International
1800 Green Hills Road
P.O. Box 660001
Scotts Valley, CA 95067-0001
408-438-5300
Turbo C $149.95
Borland C++ $495

_ARRAY BOUNDS CHECKING WITH TURBO C_
by Glenn Pearson



[LISTING ONE]

 void *pmalloc(unsigned bytes)
 {void *p;
 void *q;

 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {unsigned Offset,Segment;} Word; } LrgPtr;

 p = malloc(bytes);
 if (p EQ NULL) error("pmalloc: Out of memory space on heap");
 LrgPtr.A = q = createBoundedWindow(p,bytes);
 /* Lower 3 bits are ALWAYS ON; so we can restore them later: */
 /* Shift to make array smaller, and its length <= LDTSIZE */
 /* Remember the p-q correspondence: */
 malloclist[(LrgPtr.Word.Segment & ~7) >> 3] = p;
 return(q); /* q's Offset is zero */
 }

/*=====================================*/
 void pfree(void *q)
 {unsigned i;

 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {unsigned Offset,Segment;} Word; } LrgPtr;

 if (q EQ NULL) return;

 LrgPtr.A = q;
 if (LrgPtr.Word.Offset != 0)
 error("pfree: Attempt to free improper selector");
 /* Lower 3 bits of a selector are ALWAYS ON: */
 i = (LrgPtr.Word.Segment & ~7) >> 3;
 if (malloclist[i] == NULL)
 error("pfree: Attempt to free unknown window");
 deleteSegOrWin(q); /* First remove bounds-checking window */
 free(malloclist[i]); /* Then free memory */
 malloclist[i] = NULL; /* Prevent screw ups */
 }






[LISTING TWO]

/*
* Hardware-assisted Bounds Checking of Dynamic Arrays and Structures:
* Dynamically allocated arrays and structures do not automatically get a
* private sel with Turbo C 2.0/Ergo (or Microsoft C/Ergo). To guarantee a
* private sel, this routine sets what Ergo calls a "window" over the array.
* The private sel is then used for subsequent array accesses, so that bounds
* checking is performed with no run-time overhead (because the chip hardware
* does it). By building this functionality into the "pmalloc" (and calloc,
* etc.) and "pfree" and "pfarfree" routines, all non-static arrays get this
* ability.
*
* NOTE 1: Code shown is for the far memory model, so all pointers are far;
* Additional explicit casting will be needed for smaller models.
* NOTE 2: If using Turbo C huge model with osx86, any function call that
* passes the address of an automatic (stack) variable must have that
* address explicitly normalized first. Recommendation: use Microsoft C
* instead.
*
*******************************************************************************
* --- The following bounds-related utility functions are currently private to
* --- this file, but could be made public as needed:
* errorstrcat()
* deleteSegOrWin()
* createDataWindow()
* allocateMultipleWindows()
* deleteMultipleWindows()
* createBoundedWindow()
* checksize()
* markPtoQ()
* --- The following routines provide runtime bounds checking for
* --- protected mode heap memory allocation. Each routine has the
* --- same syntax as the corresponding Turbo C 2.0 call (except "p" prefix)...
* pmalloc()
* pcalloc()
* prealloc()
* pfarmalloc()
* pfarcalloc()
* pfarrealloc()
* pfree()
* pfarfree()
******************************************************************************/

#include <alloc.h>
#include <stdio.h>
#include <stdlib.h> /* exit */
#include <string.h> /* strcpy, strcat */
#include <dos.h>

/**********************FILE GLOBALS********************************************/

#define private static
#define forward extern
#define import extern
#define export
#define uint16 unsigned short
#define uint32 unsigned long int

#define int16 short
#define int32 long int
#define EQ ==
#define OR ||
#define TRUE 1
#define ERRORSTRCAT
#define ERROR(A,B) errorstrcat(A,B)
#define ERROUT(A) {printf("Fatal Error %s.\n",A);exit(-1);}

/* Compile time debug tracing */
/* #define TRACE(A) A */
#define TRACE(A)

#define LDTSIZE 1024
/* Reserve arbitrary slot space at extreme of LDT table for Ergo work
 space, such as extra windows sometimes needed for malloc */
#define MAXLDTNUM (LDTSIZE-25)
/* Assert: MAXLDTNUM <= LDTSIZE */
#define MAXSEGNUM ((MAXLDTNUM << 3) | 7)


/* Eclipse's default LDT size (in units of number of entries) is 1024. The low
end of the LDT will already be taken up by code and data sels, before any
dynamic data allocation occurs. If the program doesn't do much mallocing, the
default may be sufficient. Otherwise, you take a generous guess as to the
maximum number of memory requests active at any time, round up to the next
power of 2, and set LDTSIZE to that value. However, a 286 machine can have at
most 8192 entries in its Local Descriptor Table. Thus, a program that has,
say, 10,000 simultaneously active dynamic arrays would have to use a more
selective method of assigning private sels than simply universal use of
"pmalloc" and "pfree".

LDTSIZE should match the value stored in the os286 kernel. A larger LDT is
requested by modifying the kernel (the developer's on-disk .EXE file) by
specifying, for example:
 286setup ldtsize 2048
Valid ldtsize values range from 128 to 8192. This kernel can be subsequently
bound in with the distributed program. The current value of ldtsize (and all
other settable parameters) can be viewed by:
 286setup -help
Alternatively, for unbound applications, one may specify the ldtsize on the
command line during loading of the kernel TSR:
 os286 ldtsize 2048
This would typically be part of a batch file. Here, the load image of the
kernel is altered, not the disk file, so the change is not "sticky". */

void *malloclist[LDTSIZE]; /* K&R promises all values are
 initially zero (i.e., NULL) */

/* For createBoundedWindow function: */
#define SEGMENT 0
#define WINDOW 1
#define REALSEGMENT 2
#define REALWINDOW 3

/***************************************************************************/

#ifdef ERRORSTRCAT
/*======================================================================== */
void errorstrcat(char *stringA, char *stringB)
{char sss[90];


 strcpy(sss,"In ");
 strcat(sss,stringA); strcat(sss,", ");
 strcat(sss,stringB); gxerror(sss);
}
#endif

/*======================================================================= */
void private deleteSegOrWin(void *selector)
/*---------------------------------------------------------------------------
* Call to DOS delete-segment service, extended by Ergo to include
* memory "windows" as well. (Note: Turbo C's free & farfree DO NOT
* use this call). Memory must have been allocated by DOS interrupts
* 0x48, 0xe7, or 0xe8; 0xe8 is used by createDataWindow routine here.
* Deleting a window also deletes any child windows.
*------------------------------------------------------------------------- */
{ /* regs, sregs, and LrgPtr are declared static to put them into common
 DATA segment */
 static union REGS regs;
 static struct SREGS sregs;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 LrgPtr.A = selector;
 sregs.es = LrgPtr.Word.Segment; /* Offset doesn't matter */
 regs.h.ah = 0x49; /* Delete segment or window */
 intdosx(&regs, &regs, &sregs);
 if (regs.h.al EQ 7)
 ERROR("deleteSegOrWin","Delete Segment; Bad Memory Map");
 if (regs.h.al EQ 9)
 ERROR("deleteSegOrWin","Delete Segment; Bad Selector");
}
/* ==================================================================== */
void private *createDataWindow(char *base, uint32 length)
/*---------------------------------------------------------------------------
"createDataWindow" is a call to an Eclipse "extended service" routine,
accessed through software interrupt 21h, function E8h, subfunction 01h.
----------------------------------------------------------------------------*/
{ /* regs, sregs, and LrgPtr are declared static to put them into common
 DATA segment */
 static union REGS regs;
 static struct SREGS sregs;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 regs.h.ah = 0xe8;
 regs.h.al=1; /* Create Data window */
 /* si:bx=base; ds=parent selector */
 LrgPtr.A = base;
 sregs.ds = LrgPtr.Word.Segment;
 regs.x.si = 0L; /* parent selector takes care of high-order base */
 regs.x.bx = LrgPtr.Word.Offset;
 /* cx:dx=length in bytes */
 regs.x.cx = (uint16)((length >> 16) & 0x0000ffff);
 regs.x.dx = (uint16)(length & 0x0000ffff);
 intdosx(&regs,&regs,&sregs);
 LrgPtr.Word.Segment = regs.x.ax; /* Selector or error */
 LrgPtr.Word.Offset = 0;

 if (LrgPtr.Word.Segment > MAXSEGNUM)
 {/* We reserve a little work space at extreme of LDT table; if we've
 encroached upon it, back off: */
 deleteSegOrWin(LrgPtr.A);
 /* Table is "full" as far as we're concerned */
 LrgPtr.Word.Segment = regs.x.ax = 21U;
 }
 if (regs.x.ax > 26U || regs.x.ax == 21U)
 return (LrgPtr.A); /* Let caller handle "Descriptor Table Full" error */

 switch (regs.x.ax) {
 case 9U:
 ERROR("createDataWindow","Memory allocation; Bad Selector"); break;
 case 20U:
 ERROR("createDataWindow","Memory allocation; Bad Type"); break;
 /* case 21U: Handled by caller...
 ERROR("createDataWindow","Memory allocation; Descriptor Table Full"); break;
 */
 case 23U:
 ERROR("createDataWindow","Memory allocation; Need Local Descriptor"); break;
 case 25U:
 ERROR("createDataWindow","Memory allocation; Bad Base"); break;
 case 26U:
 ERROR("createDataWindow","Memory allocation; Bad Size"); break;
 default:
 ERROR("createDataWindow","Memory allocation; Unknown Error");break;
 }
 return(NULL); /* should never get here; quiet compiler warning */
}
/* =====================================================================*/
void private deleteMultipleWindows(char *base, uint16 count)
/*-----------------------------------------------------------------------
 * Called with the pointer representing the first (lowest LDT) selector
 * in a consecutive set of selectors to be deleted, and the count
 * of the number of selectors in the set.
 * See deleteSegOrWin for possible errors.
 *----------------------------------------------------------------------*/
{ int i;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 if (count == 0) ERROR("deleteMultipleWindows","zero count not valid");

 LrgPtr.A = base;
 /* Undocumented Ergo recommendation for tiled windows: delete from
 highest to lowest. 8 is increment from one LDT entry to next: */
 LrgPtr.Word.Segment += (count - 1) * 8;
 for (i=0; i < count; i++) {
 deleteSegOrWin(LrgPtr.A);
 LrgPtr.Word.Segment -= 8;
 }
}
/* ==================================================================== */
void private *allocateMultipleWindows(char *base, uint32 length)
/*---------------------------------------------------------------------------
"allocateMultipleWindows" is a call to an Eclipse "extended service" routine,
accessed through software interrupt 21h, function 0xEA. It creates tiled windows,
using 32K tiles. The pointer for the first tile is returned.

(The total number of tiles created is also known, but not returned
 to the caller at this time.)
----------------------------------------------------------------------------*/
{ uint16 numSelectors;
 /* regs, sregs, and LrgPtr are declared static to put them into common
 DATA segment */
 static union REGS regs;
 static struct SREGS sregs;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 regs.h.ah = 0xea; /* Allocate multiple windows */
 /* si:bx and cx:dx are 32-bit offsets, not a paragraph address & offset */
 /* si:bx = stride in bytes = 32K (32K makes for fast math,
 64K has problems with Turbo C) */
 /* Maximum legal value is 64K (si = 1, bx = 0) */
 regs.x.si = 0;
 regs.x.bx = 0x8000; /* 32K */
 /* ds is parent selector */
 LrgPtr.A = base;
 sregs.ds = LrgPtr.Word.Segment; /* (uint16)((((uint32)base) >> 16) & 0x0000ffff); */
 /* cx:dx=length in bytes: */
 regs.x.cx = (uint16)((length >> 16) & 0x0000ffff);
 regs.x.dx = (uint16)(length & 0x0000ffff);
 intdosx(&regs,&regs,&sregs);

 LrgPtr.Word.Segment = regs.x.ax; /* Selector or error */
 LrgPtr.Word.Offset = 0;
 if (regs.x.ax == 21U)
 return (LrgPtr.A); /* Let caller handle "Descriptor Table Full" error */

 numSelectors = regs.x.bx;
 /* 8 is increment between successive LDT entries: */
 if ((LrgPtr.Word.Segment + ((numSelectors-1)*8) ) > MAXSEGNUM)
 {/* We reserve a little work space at extreme of LDT table; if we've
 encroached upon it, back off: */
 deleteMultipleWindows(LrgPtr.A,numSelectors);
 /* Table is "full" as far as we're concerned */
 LrgPtr.Word.Segment = 21U;
 return (LrgPtr.A); /* Let caller handle "Descriptor Table Full" error */
 }

 if (regs.x.ax > 26U) return (LrgPtr.A); /* Everything OK */

 switch (regs.x.ax) {
 case 9U:
 ERROR("allocateMultipleWindows","Memory allocation; Bad Selector"); break;
 /* case 21U: Handled by caller...
 ERROR("allocateMultipleWindows","Memory allocation; Descriptor Table Full"); break;
 */
 case 23U:
 ERROR("allocateMultipleWindows","Memory allocation; Need Local Descriptor"); break;
 case 27U:
 ERROR("allocateMultipleWindows","Memory allocation; Bad Stride"); break;
 }
 return(NULL); /* should never get here; quiet compiler warning */
}
/* ==================================================================== */

void private *createBoundedWindow(char *base, uint32 length)
/*---------------------------------------------------------------------------
"createBoundedWindow" is used with the "pmalloc" and "pfree" routines for
adding bounds-checking to dynamically-allocated objects. It features an
embedded loop of calls to Ergo's "extended DOS" function 0xED, "Get Segment
or Window Information".

Besides dynamic allocation checking, this function can also be used for
bounds-checking of static arrays and structures, such as constant strings.
There is no centralized mechanism (see article).
----------------------------------------------------------------------------*/
{ /* regs, sregs, and LrgPtr are declared static to put them into common
 DATA segment */
 static union REGS regs;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;
 uint16 type;
 uint32 offset = 0; /* Offset will accumulate total offset */

 TRACE(printf("Call to createBoundedWindow\n");)
 /* Start with initial selector */
 LrgPtr.A = base;
 regs.x.bx = LrgPtr.Word.Segment;
 /* Work up the inheritance tree until a non-WINDOW is found: */
 while (TRUE) {
 /* Next line must be within this loop, since ah is not preserved across
 the subsequent intdos call: */
 regs.h.ah = 0xed; /* Call to Get segment or window information */
 intdos(&regs,&regs);
 switch (regs.h.al) {
 case 9U:
 ERROR("createBoundedWindow","Memory allocation; Bad Selector");break;
 case 23U:
 ERROR("createBoundedWindow","Memory allocation; Need Local Descriptor");
 break;
 }
 type = regs.h.al;
 if (type == REALSEGMENT OR type == REALWINDOW)
 ERROR("createBoundedWindow","Real segment or window found");
 /* if ERROR, we will not continue here */
 /* Assert: type is either SEGMENT or WINDOW */
 /* Length in bytes = cx:dx */
 length = (((uint32)regs.x.cx) << 16) + (uint32)regs.x.dx;
 if (type != WINDOW) break; /* from while */
 /* For WINDOW type, si:bx is the 32 bit offset within the parent;
 accumulate it before bx is reused for the next query: */
 offset += (((uint32)regs.x.si) << 16) + (uint32)regs.x.bx;
 /* For WINDOW type, di has its parent selector: */
 regs.x.bx = regs.x.di;
 }
 /* For SEGMENT type, si:bx is the 32-bit linear address of the base of
 the segment. */
 /* Add accumulated offset as well */
 base = (char *)((uint32)regs.x.si << 16) + (uint32)regs.x.bx + offset;

 /* Now we have the underlying selector:offset location; next,
 build a new window of just the right size on top of it. */

 if (length > 0x0000ffff)

 /* If length is greater than 64K, use tiling with 32K windows */
 return(allocateMultipleWindows(base,length));
 else
 return(createDataWindow(base,length));
}
/*===========================================================================*/
void private checksize(uint32 size)
/*----------------------------------------------------------------------------
* Unfortunately, Turbo C/Ergo's version of malloc ( & calloc & realloc)
* delivers a block with a header of 8 bytes (i.e., returns with an offset of
* 0x0008), so the full 64K is not available. If we don't check for this,
* block descriptor header could be overwritten... disaster!
-----------------------------------------------------------------------------*/
{
 if (size > 0x0000fff7)
 ERROR("checksize","Attempt to allocate more than 64K - 8 bytes");
}
/*========================================================================*/
void private markPtoQ (void *p, uint32 bytes, void *q)
/*--------------------------------------------------------------------------
 * This utility routine is used by the allocators to mark the
 * correspondence between p and q
 *-------------------------------------------------------------------------*/
{ /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 void *ptile;
 uint32 qspot;
/* 32K: */
#define TILESIZE 0x00008000L

 LrgPtr.A = q;
 /* Lower 3 bits are ALWAYS ON; so we can restore them later: */
 /* Shift to make array smaller, and its length <= LDTSIZE */
 qspot = (LrgPtr.Word.Segment & ~7) >> 3;
 if (bytes > 0x0000ffff) {
 /* Tiling is indicated by multiple adjacent entries with the
 same value of p: */
 while (bytes > TILESIZE) {
 malloclist[qspot++] = p; /* Remember the p-q correspondence*/
 bytes = bytes - TILESIZE;
 }
 /* Conclude with last partial tile below */
 }
 malloclist[qspot] = p; /* Remember the p-q correspondence*/
 TRACE(printf("index: %u ",(LrgPtr.Word.Segment & ~7) >> 3);)
 TRACE(printf("returns 0x%lx\n",q);)
}
/* =========================================================================*/
void *pmalloc(uint16 bytes)
{
 void *p;
 void *p2;
 void *q;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;


 TRACE(printf("[pmalloc] want: %u, ",bytes);)
 checksize((uint32)bytes);
 p = malloc(bytes);

 if (p EQ NULL) ERROR("pmalloc","Out of memory space on heap");

 LrgPtr.A = q = createBoundedWindow(p,bytes);
 if (LrgPtr.Word.Segment == 21U)
 {/* The local descriptor table is full; "fail soft" by
 foregoing the pleasure of bounds checking. When freeing occurs,
 the decision about whether the pointer refers to a bounds-checking
 data window like q or just a direct alloc like p is made by viewing the
 offset. A data window always has a zero offset. The odds are
 1 in 64K that p has a zero offset. */
 if (((uint32)p & 0x0000ffff) == 0)
 {/* We have to make sure that p doesn't have a zero offset! */
 p2 = malloc((uint16)(bytes & 0x0000ffff));
 /* If we don't succeed next time, periodicity suggests we won't
 succeed n times. Just complain and fail */
 if (((uint32)p2 & 0x0000ffff) == 0)
 ERROR("pmalloc","Unlikely memory error");

 free(p);
 p = p2;
 }
 TRACE(printf("returns 0x%lx\n",p);)
 return(p);
 }
 markPtoQ(p, (uint32)bytes, q);
 return(q);
}
/* =========================================================================*/
void *pcalloc(uint16 nitems, uint16 bytes)
{
 void *p;
 void *p2;
 void *q;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 TRACE(printf("[pcalloc] want: %u * %u = %lu, ",
 nitems,bytes,(uint32)nitems*(uint32)bytes);)
 checksize((uint32)nitems*(uint32)bytes); /*Could be way over 64K - 8 bytes */
 p = calloc(nitems,bytes);

 if (p EQ NULL) ERROR("pcalloc","Out of memory space on heap");

 LrgPtr.A = q = createBoundedWindow(p,(uint32)nitems*(uint32)bytes);
 if (LrgPtr.Word.Segment == 21U)
 {/* The local descriptor table is full; "fail soft" by
 foregoing the pleasure of bounds checking. When freeing occurs,
 the decision about whether the pointer refers to a bounds-checking
 data window like q or just a direct alloc like p is made by viewing the
 offset. A data window always has a zero offset. The odds are
 1 in 64K that p has a zero offset. */
 if (((uint32)p & 0x0000ffff) == 0)
 {/* We have to make sure that p doesn't have a zero offset! */
 p2 = calloc(nitems,bytes);

 /* If we don't succeed next time, periodicity suggests we won't
 succeed n times. Just complain and fail */
 if (((uint32)p2 & 0x0000ffff) == 0) ERROR("pcalloc","Unlikely memory error");

 free(p);
 p = p2;
 }
 TRACE(printf("returns 0x%lx\n",p);)
 return(p);
 }

 markPtoQ(p, (uint32)nitems*(uint32)bytes, q); /* total size, as passed to createBoundedWindow */
 return(q);
}
/*=========================================================================*/
void *prealloc(void *block, uint16 newsize)
{ void *p;
 void *p2;
 void *q;
 int16 i;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 TRACE(printf(" [prealloc] with: 0x%lx, want: %u, ",block,newsize);)
 if (block == NULL) return(NULL);

 checksize((uint32)newsize);
 LrgPtr.A = block;
 if (LrgPtr.Word.Offset != 0)
 /* Assume no bounds protection for this allocation */
 return(realloc(block,newsize));

 /* Lower 3 bits of a selector are ALWAYS ON: */
 i = (LrgPtr.Word.Segment & ~7) >> 3;
 TRACE(printf("index: %u ",i);)
 if (malloclist[i] == NULL)
 ERROR("prealloc","Attempt to reallocate unknown window");

 if (newsize == 0)
 {deleteSegOrWin(block); /* First remove bounds-checking window */
 realloc(malloclist[i],newsize); /* Should return NULL */
 malloclist[i] = NULL; /* Prevent screw ups */
 TRACE(printf("returns NULL\n");)
 return(NULL);
 }

 p = realloc(malloclist[i],newsize); /* adjust memory */
 if (p == NULL)
 {TRACE(printf("returns NULL\n");)
 return(NULL); /* Couldn't do it */
 }

 malloclist[i] = NULL; /* Prevent screw ups */
 /* Since realloc may change pointer location as well as size, we'll
 just throw away the old bounds window and get a new one */
 deleteSegOrWin(block);
 LrgPtr.A = q = createBoundedWindow(p,newsize);
 if (LrgPtr.Word.Segment == 21U)

 {/* The local descriptor table is full; "fail soft" by
 foregoing the pleasure of bounds checking. When freeing occurs,
 the decision about whether the pointer refers to a bounds-checking
 data window like q or just a direct alloc like p is made by viewing the
 offset. A data window always has a zero offset. The odds are
 1 in 64K that p has a zero offset. */
 if (((uint32)p & 0x0000ffff) == 0)
 {/* We have to make sure that p doesn't have a zero offset! */
 p2 = malloc(newsize);
 /* If we don't succeed next time, periodicity suggests we won't
 succeed n times. Just complain and fail */
 if (((uint32)p2 & 0x0000ffff) == 0)
 ERROR("prealloc","Unlikely memory error");

 free(p);
 p = p2;
 }
 TRACE(printf("returns 0x%lx\n",p);)
 return(p);
 }

 markPtoQ(p, (uint32)newsize, q);
 return(q);
}
/*==========================================================================*/
void *pfarmalloc(uint32 bytes)
{
 void *p;
 void *p2;
 void *q;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 TRACE(printf("[pfarmalloc] want: %lu, ",bytes);)
 p = farmalloc(bytes);
 if (p EQ NULL) ERROR("pfarmalloc","Out of memory space on heap");

 LrgPtr.A = q = createBoundedWindow(p,bytes);
 if (LrgPtr.Word.Segment == 21U)
 {/* The local descriptor table is full; "fail soft" by
 foregoing the pleasure of bounds checking. When freeing occurs,
 the decision about whether the pointer refers to a bounds-checking
 data window like q or just a direct alloc like p is made by viewing the
 offset. A data window always has a zero offset. The odds are
 1 in 64K that p has a zero offset. */
 if (((uint32)p & 0x0000ffff) == 0)
 {/* We have to make sure that p doesn't have a zero offset! */
 p2 = farmalloc(bytes);
 /* If we don't succeed next time, periodicity suggests we won't
 succeed n times. Just complain and fail */
 if (((uint32)p2 & 0x0000ffff) == 0)
 ERROR("pfarmalloc","Unlikely memory error");

 farfree(p); /* p came from farmalloc */
 p = p2;
 }
 TRACE(printf("returns 0x%lx\n",p);)
 return(p);
 }

 markPtoQ(p, bytes, q);
 return(q);
}
/*==========================================================================*/
void *pfarcalloc(uint32 nitems, uint32 bytes)
{
 void *p;
 void *p2;
 void *q;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 TRACE(printf("[pfarcalloc] want: %lu * %lu = %lu, ",
 nitems,bytes,nitems*bytes);)
 /* We won't check to see if (nitems * bytes) overflows uint32;
 we'll let farcalloc do that job */
 p = farcalloc(nitems, bytes);
 if (p EQ NULL) ERROR("pfarcalloc","Out of memory space on heap");

 LrgPtr.A = q = createBoundedWindow(p,nitems*bytes);
 if (LrgPtr.Word.Segment == 21U)
 {/* The local descriptor table is full; "fail soft" by
 foregoing the pleasure of bounds checking. When freeing occurs,
 the decision about whether the pointer refers to a bounds-checking
 data window like q or just a direct alloc like p is made by viewing the
 offset. A data window always has a zero offset. The odds are
 1 in 64K that p has a zero offset. */
 if (((uint32)p & 0x0000ffff) == 0)
 {/* We have to make sure that p doesn't have a zero offset! */
 p2 = farcalloc(nitems, bytes);
 /* If we don't succeed next time, periodicity suggests we won't
 succeed n times. Just complain and fail */
 if (((uint32)p2 & 0x0000ffff) == 0)
 ERROR("pfarcalloc","Unlikely memory error");

 farfree(p); /* p came from farcalloc */
 p = p2;
 }
 TRACE(printf("returns 0x%lx\n",p);)
 return(p);
 }
 markPtoQ(p, nitems*bytes, q); /* total size, as passed to createBoundedWindow */
 return(q);
}

/*=========================================================================*/
void *pfarrealloc(void *block, uint32 newsize)
{ void *p;
 void *p2;
 void *q;
 int16 i;
 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 TRACE(printf("[pfarrealloc] with 0x%lx, want: %lu, ",block, newsize);)
 if (block EQ NULL) return(NULL);

 LrgPtr.A = block;

 if (LrgPtr.Word.Offset != 0)
 /* Assume no bounds protection for this allocation */
 return(farrealloc(block,newsize));

 /* Lower 3 bits of a selector are ALWAYS ON: */
 i = (LrgPtr.Word.Segment & ~7) >> 3;
 TRACE(printf("index: %u ",i);)
 if (malloclist[i] == NULL)
 ERROR("pfarrealloc","Attempt to reallocate unknown window");

 if (newsize == 0)
 {deleteSegOrWin(block); /* First remove bounds-checking window */
 farrealloc(malloclist[i],newsize); /* Should return NULL */
 malloclist[i] = NULL; /* Prevent screw ups */
 TRACE(printf("returns NULL\n");)
 return(NULL);
 }

 p = (void *)farrealloc(malloclist[i],newsize); /* adjust memory */
 if (p == NULL)
 {TRACE(printf("returns NULL\n");)
 return(NULL); /* Couldn't do it */
 }

 malloclist[i] = NULL; /* Prevent screw ups */
 /* Since realloc may change pointer location as well as size, we'll
 just throw away the old bounds window and get a new one */
 deleteSegOrWin(block);
 LrgPtr.A = q = createBoundedWindow(p,newsize);
 if (LrgPtr.Word.Segment == 21U)
 {/* The local descriptor table is full; "fail soft" by
 foregoing the pleasure of bounds checking. When freeing occurs,
 the decision about whether the pointer refers to a bounds-checking
 data window like q or just a direct alloc like p is made by viewing the
 offset. A data window always has a zero offset. The odds are
 1 in 64K that p has a zero offset. */
 if (((int32)p & 0x0000ffff) == 0)
 {/* We have to make sure that p doesn't have a zero offset! */
 p2 = farmalloc(newsize);
 /* If we don't succeed next time, periodicity suggests we won't
 succeed n times. Just complain and fail */
 if (((int32)p2 & 0x0000ffff) == 0)
 ERROR("pfarrealloc","Unlikely memory error");

 farfree(p); /* p came from farrealloc */
 p = p2;
 }
 TRACE(printf("returns 0x%lx\n",p);)
 return(p);
 }
 markPtoQ(p, newsize, q);
 return(q);
}
/*===========================================================================*/
void pfree(void *q)
/*----------------------------------------------------------------------------
"pfree" is called with a selector q, which is either an Ergo data window
pointer used for bounds checking, or (less routinely) simply the pointer
returned directly by malloc. Examining the low 16 bits determines
which. For a window, we look up the stored value of the corresponding
malloc pointer (which is a selector, not the real physical address), and
q's slot in the Local Descriptor Table is freed for use by subsequent
allocations.

In any event, Turbo C's "free" is called with the malloc pointer.
--------------------------------------------------------------------------- */
 {uint16 i;
 void *p;

 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;

 TRACE(printf("[pfree] with: 0x%lx, ",q);)
 if (q EQ NULL) return;

 LrgPtr.A = q;
 if (LrgPtr.Word.Offset != 0)
 /* Assume no bounds protection for this allocation */
 {free(q); /* free memory */
 TRACE(printf("\n");)
 return;
 }
 /* Lower 3 bits of a selector are ALWAYS ON: */
 i = (LrgPtr.Word.Segment & ~7) >> 3;
 TRACE(printf("index: %u\n",i);)
 if ((p = malloclist[i]) == NULL)
 ERROR("pfree","Attempt to free unknown window");

 /* First remove bounds-checking window(s): */
 while (p == malloclist[i]) {
 deleteSegOrWin(LrgPtr.A);
 LrgPtr.Word.Segment++; /* q may be made of multiple tiles */
 malloclist[i++] = NULL; /* Prevent screw ups */
 }
 /* Turbo C's "free" call does not call DOS 0x49 (the equivalent of
 deleteSegOrWin); if it did, then the deleteSegOrWin(q) call might
 be unnecessary, since q is in some sense a child of malloclist[i].addr) */
 free(p); /* free memory */
}
/*===========================================================================*/
void pfarfree(void *q)
/*----------------------------------------------------------------------------
"pfarfree" is called with a selector q, which is either an Eclipse data window
pointer used for bounds checking, or (less routinely) simply the pointer
returned directly by farmalloc. Examining the low 16 bits determines
which. For a window, we look up the stored value of the corresponding
farmalloc pointer (which is a selector, not the real physical address), and
q's slot in the Local Descriptor Table is freed for use by subsequent
allocations.

In any event, Turbo C's "farfree" is called with the farmalloc pointer.
--------------------------------------------------------------------------- */
 {uint16 i;
 void *p;

 /* Make life easier when dealing with segment and offset pointers. */
 static union {void *A;
 struct {uint16 Offset,Segment;} Word; } LrgPtr;


 TRACE(printf("[pfarfree] with: 0x%lx, ",q);)
 if (q EQ NULL) return;

 LrgPtr.A = q;
 if (LrgPtr.Word.Offset != 0)
 /* Assume no bounds protection for this allocation */
 {farfree(q); /* free memory */
 TRACE(printf("\n");)
 return;
 }
 /* Lower 3 bits of a selector are ALWAYS ON: */
 i = (LrgPtr.Word.Segment & ~7) >> 3;
 TRACE(printf("index: %u\n",i);)
 if ((p = malloclist[i]) == NULL)
 ERROR("pfarfree","Attempt to free unknown window");

 /* First remove bounds-checking window(s): */
 while (p == malloclist[i]) {
 deleteSegOrWin(LrgPtr.A);
 LrgPtr.Word.Segment++; /* q may be made of multiple tiles */
 malloclist[i++] = NULL; /* Prevent screw ups */
 }
 /* Turbo C's "free" call does not call DOS 0x49 (the equivalent of
 deleteSegOrWin); if it did, then the deleteSegOrWin(q) call might
 be unnecessary, since q is in some sense a child of malloclist[i].addr) */
 farfree(p); /* free memory */
}
/*==========================================================================*/

































MAY, 1991
PROGRAMMING PARADIGMS


Windows and Gates




Michael Swaine


Last August, I wrote a column titled "Windows 3.0 Challenges All the Talent in
the Room." Time has given me some second thoughts about things I wrote then.
Windows vs. MS-DOS. A pseudo-conflict, to be sure. Windows requires DOS, and
the success of Windows ensures the continued existence of DOS. My point was
that, if Windows does everything that most people want from an operating
system and does it better than DOS, then DOS may wane as a development
platform. What I should have realized is that the processing power and memory
capacity of the typical machine in use will not grow as fast as the demands
that Windows will make. So as long as Windows supports DOS applications, there
will probably be a market for them.
Windows vs. OS/2. I didn't really have anything to say about OS/2 in that
column, and that still seems appropriate, but I do hear of developers
converting OS/2 applications to Windows.
Windows 3.0 vs. Macintosh System 7.0. System 7.0 still has a much better user
interface, is a more complete product, and has a better file system. But Win 3
is gaining ground, because the System 7.0 released this month will be less
than the product shown to developers more than a year ago. While Microsoft is
adding features to Windows, Apple has been stripping features out of System
7.0 to get it out the door.
ToolBook vs. HyperCard. In the area of user programming, I didn't reckon with
ToolBook's limitations, or with another contender. Unless Asymetrix can speed
up ToolBook significantly, we are not likely to see it used for a lot of user
programming, nor are we going to see a "bookware" market spring up, except as
ports from the stackware market on the Mac side. Heizer Software, the leading
name on the Mac side in the generally marginal market of spreadsheet macros
and stackware and such, has converted many of its HyperCard stacks to ToolBook
using ConvertIt, a tool Heizer distributes. This allows Heizer to list in its
catalog Macintosh and Windows versions of many of its stackware products,
which looks impressive. But now that Spinnaker has released the Windows
version of Plus, its HyperCard clone, there is a more direct stackware
connection between the platforms. Plus reads and runs HyperCard stacks, and
Plus stacks written for the Mac run on the Windows version without
modification. This would seem to be a cleaner porting path than what ToolBook
has.
Now the hedges: Spinnaker doesn't think of Plus as a HyperCard clone, but any
unbiased observer would. And I haven't reviewed Plus, so I can't say how
complete its platform independence is, but the demonstrations I've seen have
been very impressive.


Windows in the Rooms, Windows on the Floor


At the Windows & OS/2 Conference in San Jose this spring, I followed up on
this user programming thread. I was also trying to catch another session on
"Development Using High-productivity Programming Languages." It seems that
there are a lot of ways to do Windows besides Microsoft's way; that is, by
using the Software Developer's Kit (SDK).
Out on the exhibit floor, this same preoccupation with alternatives to SDK was
apparent. Blue Sky Software's WindowsMAKER Professional lets you point and
click to produce C code. Blythe's Omnis 5 and Enfin Software's Enfin/2 are
4GLs. GUIdance Choreographer is a shortcut to Windows apps that was getting a
lot of attention at the show. Knowledge Garden's KnowledgePro Windows is an
environment for rapid development under Windows or DOS. Viewpoint Systems' I/F
Builder is an interface builder. There were also a lot of CASE and
flowcharting tools. And of course Borland's C++, in a class by itself.
(Unintended puns get less and less avoidable as computer terminology takes
over more and more of the English language. I'd say it's spreading like a
virus, but that word's been taken, too.)
I had lunch on the last day of the conference with Alan Cooper, who has been
supporting himself for years as an independent software author. (That's
exactly the phrase he always uses; not developer or programmer, but software
author.) Alan is the author of Super-Project, Microphone II for Windows, and
an application written for Microsoft and not published. He has been using the
Windows SDK since Day One, and was skeptical about the shortcut products.
Alan was chairing the last panel session of the conference, on using the SDK.
He told me that he had brought together people who had the experience to talk
about the benefits and drawbacks of the SDK, and the independence to talk
freely. Too many sessions at shows like these, he told me, are thinly-veiled
product advertisements. His panelists had no products to hawk. I had already
planned to attend his session, and said so.


Outmoded, Old-fashioned, and a Waste of Time


The first question Alan asked his panelists had to do with these indirect
Windows programming tools. What they had to say is worth repeating.
Cooper: "I keep seeing [these 4GLs and other high-level Windows programming
tools] and hearing that Windows development using SDK is outmoded,
old-fashioned, and just a waste of time. I want you guys to tell me why this
is wrong. Or right."
Jim Weiler, Director, Windows User Group, Boston Computer Society: "One of the
key things is that you can't specify your application well enough ahead of
time to know if you are going to need ... this subclass or something. And if
you can't know that, then it's hard to know that you can do everything you
need to do with some of these tools. They all say you can write your own
p-code. The next question you ask should be, 'Can I get a window handle to a
window that you created in my p-code?' If the answer is no, then that tells
you something about what you can't do with your extensions to their stuff.
"I think it's good to break them up into different categories. There are fast
prototypers that aren't designed to write the whole app, but to get the screen
[designs done]. And then there are ones that are designed to replace the whole
thing, and ones that are languages, and ones that are just dynamic link
libraries. It depends on the use you put it to and the use it was designed
for. They don't all replace the SDK."
Kevin Welch, President, Eikon Systems: "We ran through all the tools we could
get our hands on and ended up with the SDK because we found that they were not
really saving us that much time."
John Zicker, Viewpoint Systems: "One of the things that we found after a
couple of years of development is that some of these tools are very good for
prototyping. What we typically run into is the flexibility issue and the
executable [speed issue]. Those were the two main things we ran into with
these indirect development tools. But you also need to look at what problem
you're trying to solve. There are some tools out there [that are useful], but
I have yet to see [a mainstream app written with one of them]."
Cooper: "That's a very interesting observation. Does anybody know of a
mainstream application available today that was written using anything other
than the SDK?" (Nobody could, really, on the panel or in the audience.) "So
empirically, it doesn't work."
Alan also asked for a definition of the SDK. Jim Weiler's was the most
succinct: "Those 20 files that you get in the box."


William H. Gates 3.0


I've been poking some fun at Bill Gates lately, and a press release that
recently crossed my desk announcing a forthcoming book reminded me of the
reason for my preoccupation.
The book, a 500-page hardback "independent investigative biography" of
Microsoft's founder, CEO, chairman, largest stockholder, and arguably
hardest-working employee, is called Billion Dollar Gates. It's being written
by PC/Computing columnist Stephen Manes and Seattle Times writer Paul Andrews,
and is due out from Doubleday next year.
It's time for a book like this, although it will no doubt increase Gates's
power and influence in the industry. In the press release, Manes characterizes
Gates quite correctly as "the most influential person in the computer
business." The press release also speaks of attitudes toward Gates running the
gamut "from acclaim and respect to envy and suspicion over his growing ability
and desire to influence the direction of the computer industry." Put me down
for one from column A and one from column B. Respect, by all means. And
suspicion.
What follows here are some of the reasons for and evidence of that power and
influence, presented here in the belief that these things are simply useful to
know.
Of the 60 or 70 billionaires in the U.S., five of them are in the computer
industry, and two, Gates and partner Paul Allen, made their money from
Microsoft. (The others are William Hewlett, David Packard, and EDS's H. Ross
Perot.) Gates became the youngest self-made billionaire in history when
Microsoft went public in 1986. He was 30 then; today his personal worth is
somewhere between $2 and $3 billion, nearly all in Microsoft stock. His salary
and fringe benefits as CEO are comparatively modest: $207,000 in 1989.
Gates's influence and power have not gone unnoticed. He was the subject of
some 50 magazine cover stories in 1990. Publications such as Computer Retail
News, Electronic Business, Personal Computing, and Upside regularly rank him
as one of the most influential executives in the industry, often as the most
influential. PC Magazine gave him its Lifetime Achievement Award in 1986, when
he was 30.
He had been through a few lifetimes by age 30, if you credit Lee Felsenstein,
who also lived through the wild years of the beginning of the personal
computer industry. "A year was a lifetime in those days," Felsenstein once
told me. And Gates -- but we all know that story: Dropped out of Harvard; with
Paul Allen and Monte Davidoff wrote a Basic for the just-released MITS Altair
computer, and moved to Albuquerque with Allen to do software for MITS; moved
MicroSoft (as it was spelled then) back home to Seattle where he edged out
Digital Research for the contract to do the operating system for IBM's PC and
then bought Tim Paterson's QDOS (Quick-and-Dirty Operating System) for $50,000
as the basis for MS-DOS.
Upside magazine, a magazine for venture capitalists obsessed with sports
metaphors, quotes early Microsoft venture backer and board member David
Marquand on Gates the manager: "Of all the CEOs I've worked with, Bill is the
brightest, hardest working, most focused -- an unbeatable combination."
Then there's Bill's company. Upside editor-in-chief Rich Karlgaard
characterizes Gates as "at least a 20-percent factor in his company's
prospects." This is not about his financial holding in the company; actually,
Gates owns something like 38 percent of Microsoft stock. It's an assessment of
his significance to the company; a valuation of Bill Gates as a Microsoft
asset. Karlgaard is saying that whatever Microsoft does, whatever it is
capable of doing, a fifth of that is embodied in one person.
One of the best lines I ever got out of Bill Gates was his dead-pan "IBM is a
big company," delivered while sitting in the bleachers overlooking the 1983
West Coast Computer Faire, and recalling the negotiations between then-tiny
Microsoft and enormous IBM three years earlier. Microsoft is now a pretty big
company itself, with 5000 to 6000 employees and annual sales in excess of a
billion dollars. It is one of the most profitable big companies in the
computer industry, if not the most profitable. Investors care about things
like a 31 percent sustainable growth rate, $6.5 billion in public wealth
created since going public in 1986, a compound return of 22.3 percent to
shareholders over the past five years. And it's influential, although the 60
million computers running Microsoft software are only the visible
manifestation of that influence. Pen-based computing, CD-ROM standards,
multimedia, interapplication communication, you name it, there's a Microsoft
way to approach it. Generally involving the payment of royalties to Redmond.
And none of these Microsoft ways get established without the input and perhaps
direction of Bill Gates, the 20 percent factor.
As I said, I think it's useful to know these things.
































































MAY, 1991
C PROGRAMMING


D-Flat


 This article contains the following executables: DFLAT.ARC


Al Stevens


This month we begin a new C Programming column project: D-Flat, a C library
that implements a subset of the IBM SAA Common User Access (CUA) interface for
C programs. The library is for use in DOS text-mode programs.
This project will take several months. The accumulated source code now exceeds
7500 lines, although I haven't prepared it for publication yet. I'll be
narrowing the margins to fit the printed page and inserting lots of comments,
so the code will surely grow. Each month, we will add to the source code and
you can collect it that way, learning about it from the column as you go
along. If you prefer, you can download the entire source code library from the
DDJ Forum on CompuServe or TelePath and get an early start on using it. The
monthly code will be posted there as well, just as it always has been, and the
documentation will be the text of the columns as the project proceeds. The
full source code package will be preliminary and generally unsupported
although I will answer any questions about it on CompuServe. The package will
include a terse description of the functions and messages in the D-Flat API as
well as some example programs to show how the library works. See the later
section, "How to Get D-Flat."
To begin with, I will explain the subset of CUA that D-Flat implements. D-Flat
supports a program model that begins with an application window. The
application window can be any reasonable size. It has a title bar and a menu
bar. The menu bar hosts popdown menus. A D-Flat application can open other
windows that are children of the application or children of one another. The
windows can be text boxes, list boxes, and edit boxes. Your user can move and
resize these windows and the application window by using the mouse or the
keyboard. These operations follow the conventions of the CUA standard, so your
application's user interface will resemble that of Windows, Presentation
Manager, TurboVision, and others. D-Flat supports the standard set of CUA
pull-down menus: File, Edit, Options, Window, Help, and the System menu for
moving, sizing, minimizing, and maximizing windows from the keyboard. Your
application can add its own menus as well. D-Flat has a clipboard and dialog
boxes, too. A dialog box is a window that contains data entry fields. These
fields can be edit boxes, push buttons, radio buttons, and list boxes. The
D-Flat library includes dialog boxes for the CUA standard File Open and Save
As dialogs. D-Flat does not use resource files and a resource compiler to
describe menus and dialog boxes the way the Windows SDK does. Instead, you
code your menus and dialog boxes in a format similar to the ones used by the
SDK resource files, and a set of C preprocessor macros compiles the menus and
dialog boxes into the structures that the D-Flat software expects.
For a complete definition of CUA, you can get volume CG26-4582 from the IBM
Systems Application Architecture Library. The volume is titled, Common User
Access: Advanced Interface Design Guide. I got my copy along with the
Microsoft Windows 3.0 Software Development Kit. It appears to be a definitive
description of CUA, although probably derived from the OS/2 Presentation
Manager. You will find some subtle departures from the CUA standard in Windows
3.0 and other Microsoft products. The Borland Turbo Debugger 2.0 interface
follows CUA somewhat, and the TurboVision functions of Turbo Pascal implement
some of CUA for Pascal programmers.
I developed this library to use in an application project. I looked around for
a CUA-compliant library that was text-based and that offered adequate
performance to run on the bottom-end laptops, the ones with not much disk
space and slow drives and processors. I found libraries for faster machines
(MEWEL, for example) and C++ class libraries (Zinc, for example), but nothing
that suited my needs for C language development of programs targeted for
low-end platforms. As a result, I launched this project and decided to publish
it as a series for the column. So far, the code compiles with Turbo C 2.0 and
Microsoft C 6.0. I'll stick with those two for a while and point out the few
places where the code is compiler-dependent so you can consider ports to other
compilers if you want.
I am writing this "C Programming" column with a simple D-Flat application, a
multiple-document notepad-like text editor. This program will be one of the
examples we use later on. It employs most of the CUA features of D-Flat --
editor windows, dialog boxes, pull-down menus, the Clipboard, and so on. It's
also helping me debug the D-Flat library.


Low-Level D-Flat Code


Like its musical namesake, D-Flat is not the easiest key to play in. It uses
the feared message-based, event-driven architecture similar to the one that
sends programmers scurrying to avoid Windows. But with practice you can play
"Body and Soul" in "five," and with diligence you can master and take
advantage of the benefits of this alternative way to write programs. As you
will see in the months to come, event-driven programming is not all that
intimidating.
We need to get some platform-dependent code out of the way first. This month
is dedicated to the C files that bind the rest of the library to the PC
architecture. If you wanted to move D-Flat to a different system, you would
change this code, which provides interfaces to the keyboard, mouse, and
screen.
Listing One, page 140, is dflat.h, the header file that all D-Flat
applications and most D-Flat library source files include. It begins by
including several other header files, four of which -- system.h, keys.h,
rect.h, and video.h -- appear in this month's column, and the rest of which
you will see next month. If you want to do some compiling and testing of the
low-level drivers with this month's code, you can stub out the includes of the
files that you do not have yet.
A browse through dflat.h will give you a preview of what D-Flat supports. The
first item of interest is the definition of the standard window classes in the
CLASS enumerated data type. From their titles -- TEXTBOX, RADIOBUTTON, and so
on -- you get advance warning of what is coming. The WINDOW structure in dflat.h
is the controlling structure that accompanies the windows that your
application opens. A number of macros and prototypes define the D-Flat API.
There are some external data definitions and then the macros that define the
border characters of the windows. These characters use the text graphics
character set of the PC's text video system. The characters with the FOCUS_
prefix define the borders for a window that is "in focus," the one into which
the user is presently entering data, for example. The others define the border
characters for other windows.
D-Flat supports scroll bars. The scroll bar characters in dflat.h define the
characters that D-Flat uses to paint the scroll bar and the scroll boxes and
buttons. The CHECKMARK character is for menu items that are toggles. It
appears next to a toggle menu item when that item is turned on.
A D-Flat title bar can include a Control Box, Min/Max Box, and a window
restore character. The title bar characters in dflat.h define these values.
The text control characters in dflat.h define control bytes that D-Flat uses
to manage colors when it displays the contents of a text window. You will
learn about these in later columns.
The macros and prototypes at the end of dflat.h define the external functions
that either your application or the D-Flat API itself uses. I will explain
each of these when I address the subject to which the item relates.
Listing Two, page 142, is keys.h, a header file that defines the key values
for D-Flat. Your application would use these values to identify that a
particular key has been pressed. D-Flat is an event-driven, message-based
architecture. Later you will learn how to build window-processing functions
that react to events and messages. A keystroke would be an event that sends a
message to the window that is in focus. Your window-processing function would
receive the message along with the pressed key as a parameter. The values in
keys.h will be of interest when we get to that part of the system discussion.
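
As a rough preview of that idea, here is a minimal sketch of a window-processing function that reacts to a keystroke message. The message codes, the DEMO_WINDOW structure, and the DemoWndProc name are hypothetical stand-ins; the real message list lives in message.h, which has not been published yet. Only the values 27 and 187 are taken from keys.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message codes; D-Flat's own list is in message.h. */
typedef enum { MSG_KEYBOARD, MSG_OTHER } DEMO_MESSAGE;
typedef long DEMO_PARAM;      /* mirrors "typedef long PARAM" in dflat.h */

#define DEMO_ESC 27           /* same value as ESC in keys.h */
#define DEMO_F1  187          /* same value as F1 in keys.h */

/* Hypothetical window state, far smaller than the real WINDOW struct. */
typedef struct demo_window {
    int help_requests;        /* counts F1 presses */
    int closed;               /* set when ESC arrives */
} DEMO_WINDOW;

/* Window-processing function: returns 1 if it handled the message,
   0 if the message should fall through to a default handler. */
int DemoWndProc(DEMO_WINDOW *wnd, DEMO_MESSAGE msg,
                DEMO_PARAM p1, DEMO_PARAM p2)
{
    assert(wnd != NULL);
    (void)p2;                 /* unused in this sketch */
    if (msg != MSG_KEYBOARD)
        return 0;
    switch ((int)p1) {        /* p1 carries the pressed key */
    case DEMO_F1:
        wnd->help_requests++;
        return 1;
    case DEMO_ESC:
        wnd->closed = 1;
        return 1;
    default:
        return 0;
    }
}
```

The point of the shape is that the function never polls the keyboard itself. Something else turns the keystroke into a message and hands the key value over as a parameter.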
Listing Three, page 142, is system.h, the header file that defines a number of
global values and prototypes associated with the hardware and its drivers.
Your application programs will not usually concern themselves with these
values, but the hardware-dependent functions of D-Flat will. There are values
for interrupt vectors, screen dimensions, BIOS functions, and prototypes for
the mouse, cursor, and timer functions. There are macros that make the code
compatible with Microsoft C. The Turbo C and Microsoft C compilers implement
the DOS interface functions differently. Because I used Turbo C to develop
D-Flat, I had to develop the compatibility macros that translate the Turbo C
implementations into the Microsoft C conventions. These macros and a few other
places in the code where you will find the #ifdef MSC statement are where
compiler-dependent code exists. If you port D-Flat to another compiler, these
are the places that need your attention.
Listings Four and Five, page 145, are rect.h and video.h. The first listing
defines some macros and prototypes that D-Flat uses to deal with screen
rectangles. The second listing defines the interface to the video drivers.
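
To give a feel for the rectangle macros, the following sketch repeats a few of them from rect.h and adds one small helper. The helper name rect_cells is hypothetical and is not part of D-Flat. Note that the coordinates are inclusive, which is why the width and height macros add 1.

```c
#include <assert.h>

typedef struct {
    int lf, tp, rt, bt;       /* left, top, right, bottom (inclusive) */
} RECT;

#define within(p,v1,v2)   ((p)>=(v1)&&(p)<=(v2))
#define RectTop(r)        (r.tp)
#define RectBottom(r)     (r.bt)
#define RectLeft(r)       (r.lf)
#define RectRight(r)      (r.rt)
#define InsideRect(x,y,r) (within(x,RectLeft(r),RectRight(r)) && \
                           within(y,RectTop(r),RectBottom(r)))
#define RectWidth(r)      (RectRight(r)-RectLeft(r)+1)
#define RectHeight(r)     (RectBottom(r)-RectTop(r)+1)

/* Hypothetical helper: number of character cells the rectangle covers. */
int rect_cells(RECT r)
{
    assert(RectRight(r) >= RectLeft(r) && RectBottom(r) >= RectTop(r));
    return RectWidth(r) * RectHeight(r);
}
```

A rectangle from column 10, row 5 to column 19, row 9 is 10 cells wide and 5 high, and both corner cells count as inside it.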
Listing Six, page 145, is video.c, which contains the D-Flat video drivers.
The getvideo and storevideo functions read and write rectangles of video
memory from and to RAM buffers. These functions support saving and restoring
the video space that a temporary window will occupy. Not all windows use this
technique, but some do. The GetVideoChar and PutVideoChar functions read and
write a video character and its attribute byte from and to video memory. The
wputch function writes a single character to a position within a specified
window and uses the currently-established color values. The wputs function
similarly writes a string to a window. The get_videomode function determines
the current video mode and the address of video memory. Observe that these
functions do not worry about the video snow characteristics of the old IBM
Color Graphics Adaptor. If that is a problem, you must drop into assembly
language to solve it. The solution has been published often in the past. A
later column will discuss the technique if enough of you tell me it is
warranted.
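
The addressing behind these functions can be tried without a PC. The sketch below assumes an 80-column text screen in which each cell takes two bytes, the character followed by its attribute, and it uses an ordinary array as a stand-in for video memory. The names demo_screen, demo_putcell, and demo_getcell are hypothetical; vad and clr follow the definitions in video.h, with the foreground and background combined by a bitwise OR.

```c
#include <assert.h>

#define SCREENWIDTH  80
#define SCREENHEIGHT 25

/* attribute byte: foreground in the low nibble, background above it */
#define clr(fg,bg) ((fg)|((bg)<<4))
/* byte offset of a cell: 160 bytes per row, 2 bytes per cell */
#define vad(x,y)   ((y)*160+(x)*2)

/* Stand-in for video memory; the real code writes to segment
   0xb000 or 0xb800 instead. */
static unsigned char demo_screen[SCREENWIDTH * SCREENHEIGHT * 2];

/* Store a character and its colors in one screen cell. */
void demo_putcell(int x, int y, int ch, int fg, int bg)
{
    assert(x >= 0 && x < SCREENWIDTH && y >= 0 && y < SCREENHEIGHT);
    demo_screen[vad(x, y)]     = (unsigned char)(ch & 255);
    demo_screen[vad(x, y) + 1] = (unsigned char)clr(fg, bg);
}

/* Read a cell back as a 16-bit value: attribute high, character low. */
unsigned demo_getcell(int x, int y)
{
    assert(x >= 0 && x < SCREENWIDTH && y >= 0 && y < SCREENHEIGHT);
    return demo_screen[vad(x, y)] |
           ((unsigned)demo_screen[vad(x, y) + 1] << 8);
}
```

Writing an 'A' in yellow (14) on blue (1) at column 2, row 1 lands at byte offset 164 and reads back as 0x1E41.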
Listing Seven, page 146, is console.c, which contains the code that manages
the keyboard, cursor, and the generation of a warning buzz. The keyhit
function is of interest. It determines if a key has been pressed by using a
BIOS call. I could have used the Turbo C bioskey and MSC _bios_keybrd
functions, but they both have a bug that has persisted since they were first
introduced. If you go into a loop waiting for one of these functions to tell
you a key has been pressed, and the user presses Ctrl-Break, you will never
exit from the loop. I reported this bug to both compiler vendors years ago,
but neither has found it necessary to make a repair. Both compilers support
the kbhit function, but that function uses a DOS call, something I try to
avoid. Functions that use DOS for console I/O will crash if they execute from
within a TSR. As an alternative, I use the code you see in console.c when I
compile with Turbo C. MSC does not provide a way to test the zero flag after
an interrupt call except by using assembly language, and I do not want to do
that. Therefore, the MSC D-Flat implementation substitutes kbhit for keyhit
and is not suitable for use in a TSR.
The getkey function in console.c reads a key from the keyboard. If the key is
a function key or Alt-key value, the getkey function builds a unique 8-bit
value for it so that calling functions do not need to translate the 16-bit
scancode value that BIOS returns.
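
The folding rule is simple enough to show on its own. This sketch applies it to a value passed in as a parameter rather than one read from BIOS, so it runs anywhere. The function name demo_fold_key is hypothetical, and the sample values in the note below assume the standard PC scan codes (0x3B for F1, 0x1E for the A key).

```c
#include <assert.h>

/* Fold a 16-bit BIOS key value (scan code in the high byte, ASCII in
   the low byte) into a single 8-bit code. When the ASCII byte is zero,
   the key is a function or Alt key, so the scan code is used with its
   high bit set to keep it clear of the ASCII range. */
unsigned demo_fold_key(unsigned bios_value)
{
    assert(bios_value <= 0xffffu);
    if ((bios_value & 0xff) == 0)
        return ((bios_value >> 8) | 0x80) & 0xff;
    return bios_value & 0xff;
}
```

Under those assumptions, demo_fold_key(0x3B00) yields 187, which is the F1 value in keys.h, and demo_fold_key(0x1E00) yields 158, which is ALT_A. An ordinary keystroke such as 0x1E61 comes back as the ASCII 'a'.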
The beep function sounds a buzz for a short time. Observe the implementation
of the wait macro. It looks something like a C++ inline function.
The cursor functions in console.c use BIOS to position the keyboard cursor,
read its current position, change its shape, save and restore its
configuration, and hide and unhide it.
Listing Eight, page 147, is mouse.c, which contains the code that manages the
mouse. These functions are simple calls to the mouse interrupt vector with the
appropriate values in registers. The generic mouse function makes the call.
The other functions call it to use the mouse driver API. Before your program
can use a mouse, you must have a mouse driver installed. The API to mouse
drivers follows a standard established by the Microsoft mouse, so this code
works for most mice. The mouse_installed function tests to see if the mouse
driver is installed as a DOS device driver or as a TSR by looking at the
contents of the mouse interrupt vector. If it contains zero or if it points to
an IRET instruction, the driver is not there. The comments that precede each
of the other mouse functions explain what they do.
The code this month is the foundation of D-Flat. You can use it to write other
programs if you want, but its purpose is to support the D-Flat operating
environment. Next month we'll publish the other header files and get into the
functions that create windows and manage messages.


How to Get D-Flat


The complete source code package for D-Flat is on CompuServe in Library 0 of
the DDJ Forum and on TelePath. Its name is DFLAT.ZIP. I will replace this file
over the months as the code changes. As posted, everything compiles and works
with Turbo C 2.0 and Microsoft C 6.0. There is a makefile that the make
utilities of both compilers accept, and there is one example program, the
MEMOPAD program, with which I am writing this column. If you want to discuss
it with me, my CompuServe ID is 71101,1262.


VCOMP


I often receive unsolicited copies of small commercial utility programs that
their purveyors hope I will like and tell you about. One such program is
called VCOMP (Visual Compare). It is a text file comparison program. Until now
I always used the DIFF program that came with the old Aztec C compiler. I
never really liked DIFF because it does not give a clear picture of the
differences between files. It displays two differing blocks of text with <<
and >> tokens to say which file the text comes from. The idea is that one
token is an extract token and the other is an insert token. Extract the
extractable text and insert the insert text and you have the original file or
something like that. Not exactly intuitive. VCOMP, on the other hand, displays
a composite file in a fullscreen, scrolling, paging display with the common
text in one color and the differing blocks of text from the two files in two
other colors. There are a number of operations you can perform prior to
writing a composite file, but I haven't used any of them. I am mainly
interested in seeing what changed between two versions of a C source code
file. Being more human than being a good programmer, I often jam a bunch of
new code into a program without a wall-to-wall test. Sometime later I find
that an unrelated bug has been born. Not remembering the last time I tested
the part of the program that has the bug, and therefore not remembering what I
changed since the time the bug did not exist, I need to compare the current
source files with old backups to see what is new. VCOMP is perfect for that.
The price is $30 for DOS or OS/2 versions and $45 for both. Order from Whitney
Software Inc., P.O. Box 4999, Walnut Creek, CA 94596.


No End to Huffman



Several of you wrote to point out a bug in the Huffman compression code I
published in the February 1991 issue. It seems that the decompressed file had
a bad last byte. Some of you even sent corrections to the code. The function
that shifted bits into the compressed file did not shift down the last
compressed byte before writing it out. Apparently the two files I used to test
the program were, coincidentally, just the right size to avoid the bug.
Eight-to-one odds are hard to beat, especially twice. Replace the outbit
function in huffc.c with the code in Example 1.
Example 1: Replacement code for the huffc.c outbit function in my February
1991 column.

 /* -- collect and write bits to the
    compressed output file -- */
 static void outbit (FILE *fo, int bit)
 {
     if (ct8 == 8 || bit == -1) {
         /* pad a partial last byte so its bits sit at the top */
         while (ct8 < 8) {
             out8 <<= 1;
             ct8++;
         }
         fputc(out8, fo);
         ct8 = 0;
     }
     out8 = (out8 << 1) | bit;
     ct8++;
 }
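
For readers who want to see the padding at work without the rest of the Huffman program, here is a minimal sketch of the same idea that writes to a memory buffer instead of a file. It is not the huffc.c code: the BITWRITER structure and the bw_ function names are hypothetical, and the flush is a separate call rather than a bit value of -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit writer that packs bits, first bit highest. */
typedef struct {
    unsigned char *buf;       /* output buffer */
    size_t cap;               /* buffer capacity in bytes */
    size_t len;               /* bytes written so far */
    unsigned out8;            /* bits collected for the current byte */
    int ct8;                  /* number of bits collected (0..8) */
} BITWRITER;

void bw_init(BITWRITER *bw, unsigned char *buf, size_t cap)
{
    bw->buf = buf;
    bw->cap = cap;
    bw->len = 0;
    bw->out8 = 0;
    bw->ct8 = 0;
}

/* Store the collected byte and start a new one. */
static void bw_emit(BITWRITER *bw)
{
    assert(bw->len < bw->cap);
    bw->buf[bw->len++] = (unsigned char)(bw->out8 & 0xff);
    bw->out8 = 0;
    bw->ct8 = 0;
}

/* Append one bit; a full byte is written when the next bit arrives. */
void bw_putbit(BITWRITER *bw, int bit)
{
    assert(bit == 0 || bit == 1);
    if (bw->ct8 == 8)
        bw_emit(bw);
    bw->out8 = (bw->out8 << 1) | (unsigned)bit;
    bw->ct8++;
}

/* Write any pending bits, shifting a partial byte up so that its
   bits occupy the same high positions as in a full byte. */
void bw_flush(BITWRITER *bw)
{
    if (bw->ct8 == 0)
        return;
    while (bw->ct8 < 8) {
        bw->out8 <<= 1;
        bw->ct8++;
    }
    bw_emit(bw);
}
```

Writing the bits 1, 0, 1 and then flushing produces the single byte 0xA0. Without the shift loop, the same three bits would be written as 0x05, and a reader that takes bits from the top of each byte would see the wrong data in the last byte.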


_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

/* ------------- dflat.h ----------- */

#ifndef WINDOW_H
#define WINDOW_H

#define TRUE 1
#define FALSE 0

#include "system.h"
#include "config.h"
#include "rect.h"
#include "menu.h"
#include "keys.h"
#include "commands.h"
#include "config.h"
#include "dialbox.h"

/* ------ integer type for message parameters ----- */
typedef long PARAM;
typedef enum window_class {
 NORMAL,
 APPLICATION,
 TEXTBOX,
 LISTBOX,
 EDITBOX,
 MENUBAR,
 POPDOWNMENU,
 BUTTON,
 DIALOG,
 ERRORBOX,
 MESSAGEBOX,
 HELPBOX,
 TEXT,

 RADIOBUTTON,
 DUMMY
} CLASS;
typedef struct window {
 CLASS class; /* window class */
 char *title; /* window title */
 struct window *parent; /* parent window */
 int (*wndproc)
 (struct window *, enum messages, PARAM, PARAM);
 /* ---------------- window dimensions ----------------- */
 RECT rc; /* window coordinates
 (0/0 to 79/24) */
 int ht, wd; /* window height and width */
 RECT RestoredRC; /* restored condition rect */
 /* -------------- linked list pointers ---------------- */
 struct window *next; /* next window on screen */
 struct window *prev; /* previous window on screen*/
 struct window *nextbuilt; /* next window built */
 struct window *prevbuilt; /* previous window built */

 int attrib; /* Window attributes */
 char *videosave; /* video save buffer */
 int condition; /* Restored, Maximized,
 Minimized */
 void *extension; /* -> menus, dialog box, etc*/
 struct window *PrevMouse;
 struct window *PrevKeyboard;
 /* ----------------- text box fields ------------------ */
 int wlines; /* number of lines of text */
 int wtop; /* text line that is on the top display */
 char *text; /* window text */
 int textlen; /* text length */
 int wleft; /* left position in window viewport */
 int textwidth; /* width of longest line in textbox */
 int BlkBegLine; /* beginning line of marked block */
 int BlkBegCol; /* beginning column of marked block */
 int BlkEndLine; /* ending line of marked block */
 int BlkEndCol; /* ending column of marked block */
 int HScrollBox; /* position of horizontal scroll box */
 int VScrollBox; /* position of vertical scroll box */
 /* ------------------ list box field ------------------ */
 int selection; /* current selection */
 /* ----------------- edit box fields ------------------ */
 int CurrCol; /* Current column */
 char *CurrLine; /* Current line */
 int WndRow; /* Current window row */
 int TextChanged; /* TRUE if text has changed */
 char *DeletedText; /* for undo */
 int DeletedLength; /* " " */
 /* ---------------- dialog box fields ----------------- */
 struct window *dFocus; /* control that has the focus */
 int ReturnCode; /* return code from a dialog box */
} * WINDOW;

#include "message.h"
#include "classdef.h"
#include "video.h"

enum Condition {

 ISRESTORED, ISMINIMIZED, ISMAXIMIZED
};
/* ------- window methods ----------- */
#define WindowHeight(w) ((w)->ht)
#define WindowWidth(w) ((w)->wd)
#define BorderAdj(w,n) (TestAttribute(w,HASBORDER)?n:0)
#define ClientWidth(w) (WindowWidth(w)-BorderAdj(w,2))
#define ClientHeight(w) (WindowHeight(w)-BorderAdj(w,2))
#define WindowRect(w) ((w)->rc)
#define GetTop(w) (RectTop(WindowRect(w)))
#define GetBottom(w) (RectBottom(WindowRect(w)))
#define GetLeft(w) (RectLeft(WindowRect(w)))
#define GetRight(w) (RectRight(WindowRect(w)))
#define GetClientTop(w) (GetTop(w)+BorderAdj(w,1))
#define GetClientBottom(w) (GetBottom(w)-BorderAdj(w,1))
#define GetClientLeft(w) (GetLeft(w)+BorderAdj(w,1))
#define GetClientRight(w) (GetRight(w)-BorderAdj(w,1))
#define GetParent(w) ((w)->parent)
#define GetTitle(w) ((w)->title)
#define NextWindow(w) ((w)->next)
#define PrevWindow(w) ((w)->prev)
#define NextWindowBuilt(w) ((w)->nextbuilt)
#define PrevWindowBuilt(w) ((w)->prevbuilt)
#define GetClass(w) ((w)->class)
#define GetAttribute(w) ((w)->attrib)
#define AddAttribute(w,a) (GetAttribute(w) |= a)
#define ClearAttribute(w,a) (GetAttribute(w) &= ~(a))
#define TestAttribute(w,a) (GetAttribute(w) & (a))
#define isVisible(w) (GetAttribute(w) & VISIBLE)
#define SetVisible(w) (GetAttribute(w) |= VISIBLE)
#define ClearVisible(w) (GetAttribute(w) &= ~VISIBLE)
#define gotoxy(w,x,y) cursor(w->rc.lf+(x)+1,w->rc.tp+(y)+1)
WINDOW CreateWindow(CLASS,char *,int,int,int,int,void*,WINDOW,
 int (*)(struct window *,enum messages,PARAM,PARAM),int);
void AddTitle(WINDOW, char *);
void RepaintBorder(WINDOW, RECT *);
void ClearWindow(WINDOW, RECT *, int);
void clipline(WINDOW, int, char *);
void writeline(WINDOW, char *, int, int, int);
void writefull(WINDOW, char *, int);
void SetNextFocus(WINDOW,int);
void PutWindowChar(WINDOW, int, int, int);
void GetVideoBuffer(WINDOW);
void RestoreVideoBuffer(WINDOW);
int LineLength(char *);
#define DisplayBorder(wnd) RepaintBorder(wnd, NULL)
#define DefaultWndProc(wnd,msg,p1,p2) \
 classdefs[FindClass(wnd->class)].wndproc(wnd,msg,p1,p2)
#define BaseWndProc(class,wnd,msg,p1,p2) \
 classdefs[DerivedClass(class)].wndproc(wnd,msg,p1,p2)
#define NULLWND ((WINDOW) 0)
struct LinkedList {
 WINDOW FirstWindow;
 WINDOW LastWindow;
};
extern struct LinkedList Focus;
extern struct LinkedList Built;
extern WINDOW inFocus;
extern WINDOW CaptureMouse;

extern WINDOW CaptureKeyboard;
extern int foreground, background;
extern int WindowMoving;
extern int WindowSizing;
extern int TextMarking;
extern char *Clipboard;
extern WINDOW SystemMenuWnd;
/* --------------- border characters ------------- */
#define FOCUS_NW '\xc9'
#define FOCUS_NE '\xbb'
#define FOCUS_SE '\xbc'
#define FOCUS_SW '\xc8'
#define FOCUS_SIDE '\xba'
#define FOCUS_LINE '\xcd'
#define NW '\xda'
#define NE '\xbf'
#define SE '\xd9'
#define SW '\xc0'
#define SIDE '\xb3'
#define LINE '\xc4'
#define LEDGE '\xc3'
#define REDGE '\xb4'
#define SHADOWFG DARKGRAY
/* ------------- scroll bar characters ------------ */
#define UPSCROLLBOX '\x1e'
#define DOWNSCROLLBOX '\x1f'
#define LEFTSCROLLBOX '\x11'
#define RIGHTSCROLLBOX '\x10'
#define SCROLLBARCHAR 176
#define SCROLLBOXCHAR 178
#define CHECKMARK 251 /* menu item toggle */
/* ----------------- title bar characters ----------------- */
#define CONTROLBOXCHAR '\xf0'
#define MAXPOINTER 24 /* maximize token */
#define MINPOINTER 25 /* minimize token */
#define RESTOREPOINTER 18 /* restore token */
/* --------------- text control characters ---------------- */
#define APPLCHAR 176 /* fills application window */
#define SHORTCUTCHAR '~' /* prefix: shortcut key display */
#define CHANGECOLOR 174 /* prefix to change colors */
#define RESETCOLOR 175 /* reset colors to default */
/* ---- standard window message processing prototypes ----- */
int ApplicationProc(WINDOW, MESSAGE, PARAM, PARAM);
int NormalProc(WINDOW, MESSAGE, PARAM, PARAM);
int TextBoxProc(WINDOW, MESSAGE, PARAM, PARAM);
int ListBoxProc(WINDOW, MESSAGE, PARAM, PARAM);
int EditBoxProc(WINDOW, MESSAGE, PARAM, PARAM);
int MenuBarProc(WINDOW, MESSAGE, PARAM, PARAM);
int PopDownProc(WINDOW, MESSAGE, PARAM, PARAM);
int ButtonProc(WINDOW, MESSAGE, PARAM, PARAM);
int DialogProc(WINDOW, MESSAGE, PARAM, PARAM);
int SystemMenuProc(WINDOW, MESSAGE, PARAM, PARAM);
int HelpBoxProc(WINDOW, MESSAGE, PARAM, PARAM);
int MessageBoxProc(WINDOW, MESSAGE, PARAM, PARAM);
/* ------------- normal box prototypes ------------- */
int isWindow(WINDOW);
WINDOW inWindow(int, int);
int WndForeground(WINDOW);
int WndBackground(WINDOW);

int FrameForeground(WINDOW);
int FrameBackground(WINDOW);
int SelectForeground(WINDOW);
int SelectBackground(WINDOW);
void SetStandardColor(WINDOW);
void SetReverseColor(WINDOW);
void SetClassColors(CLASS);
WINDOW GetFirstChild(WINDOW);
WINDOW GetNextChild(WINDOW);
WINDOW GetLastChild(WINDOW);
WINDOW GetPrevChild(WINDOW);
#define HitControlBox(wnd, p1, p2) \
 (TestAttribute(wnd, TITLEBAR) && \
 TestAttribute(wnd, CONTROLBOX) && \
 p1 == 2 && p2 == 0)
/* -------- text box prototypes ---------- */
char *TextLine(WINDOW, int);
void WriteTextLine(WINDOW, RECT *, int, int);
void SetTextBlock(WINDOW, int, int, int, int);
#define BlockMarked(wnd) ( wnd->BlkBegLine || \
 wnd->BlkEndLine || \
 wnd->BlkBegCol || \
 wnd->BlkEndCol)
#define ClearBlock(wnd) wnd->BlkBegLine = wnd->BlkEndLine = \
 wnd->BlkBegCol = wnd->BlkEndCol = 0;
#define GetText(w) ((w)->text)
/* --------- menu prototypes ---------- */
int CopyCommand(char *, char *, int, int);
void PrepOptionsMenu(void *, struct Menu *);
void PrepEditMenu(void *, struct Menu *);
void PrepWindowMenu(void *, struct Menu *);
void BuildSystemMenu(WINDOW);
/* ------------- edit box prototypes ----------- */
#define isMultiLine(wnd) TestAttribute(wnd, MULTILINE)
/* --------- message box prototypes -------- */
void MessageBox(char *, char *);
void ErrorMessage(char *);
int TestErrorMessage(char *);
int YesNoBox(char *);
int MsgHeight(char *);
int MsgWidth(char *);
/* ------------- dialog box prototypes -------------- */
int DialogBox(DBOX *, int (*)(struct window *,
 enum messages, PARAM, PARAM));
int DlgOpenFile(char *, char *);
int DlgSaveAs(char *);
void GetDlgListText(WINDOW, char *, enum commands);
int DlgDirList(WINDOW, char *, enum commands,
 enum commands, unsigned);
int RadioButtonSetting(DBOX *, enum commands);
void PushRadioButton(DBOX *, enum commands);
void PutItemText(WINDOW, enum commands, char *);
void GetItemText(WINDOW, enum commands, char *, int);
/* ------------- help box prototypes ------------- */
void HelpFunction(void);
void LoadHelpFile(void);
#define swap(a,b){int x=a;a=b;b=x;}

#endif







[LISTING TWO]

/* ----------- keys.h ------------ */
#ifndef KEYS_H
#define KEYS_H
#define RUBOUT 8
#define BELL 7
#define ESC 27
#define ALT_BS 197
#define SHIFT_DEL 198
#define CTRL_INS 186
#define SHIFT_INS 185
#define F1 187
#define F2 188
#define F3 189
#define F4 190
#define F5 191
#define F6 192
#define F7 193
#define F8 194
#define F9 195
#define F10 196
#define CTRL_F1 222
#define CTRL_F2 223
#define CTRL_F3 224
#define CTRL_F4 225
#define CTRL_F5 226
#define CTRL_F6 227
#define CTRL_F7 228
#define CTRL_F8 229
#define CTRL_F9 230
#define CTRL_F10 231
#define ALT_F1 232
#define ALT_F2 233
#define ALT_F3 234
#define ALT_F4 235
#define ALT_F5 236
#define ALT_F6 237
#define ALT_F7 238
#define ALT_F8 239
#define ALT_F9 240
#define ALT_F10 241
#define HOME 199
#define UP 200
#define PGUP 201
#define BS 203
#define FWD 205
#define END 207
#define DN 208
#define PGDN 209
#define INS 210
#define DEL 211
#define CTRL_HOME 247

#define CTRL_PGUP 132
#define CTRL_BS 243
#define CTRL_FIVE 143
#define CTRL_FWD 244
#define CTRL_END 245
#define CTRL_PGDN 246
#define SHIFT_HT 143
#define ALT_A 158
#define ALT_B 176
#define ALT_C 174
#define ALT_D 160
#define ALT_E 146
#define ALT_F 161
#define ALT_G 162
#define ALT_H 163
#define ALT_I 151
#define ALT_J 164
#define ALT_K 165
#define ALT_L 166
#define ALT_M 178
#define ALT_N 177
#define ALT_O 152
#define ALT_P 153
#define ALT_Q 144
#define ALT_R 147
#define ALT_S 159
#define ALT_T 148
#define ALT_U 150
#define ALT_V 175
#define ALT_W 145
#define ALT_X 173
#define ALT_Y 149
#define ALT_Z 172
#define ALT_1 0xf8
#define ALT_2 0xf9
#define ALT_3 0xfa
#define ALT_4 0xfb
#define ALT_5 0xfc
#define ALT_6 0xfd
#define ALT_7 0xfe
#define ALT_8 0xff
#define ALT_9 0x80
#define ALT_0 0x81
#define ALT_HYPHEN 130

#define RIGHTSHIFT 0x01
#define LEFTSHIFT 0x02
#define CTRLKEY 0x04
#define ALTKEY 0x08
#define SCROLLLOCK 0x10
#define NUMLOCK 0x20
#define CAPSLOCK 0x40
#define INSERTKEY 0x80

struct keys {
 int keycode;
 char *keylabel;
};
int getkey(void);

int getshift(void);
int keyhit(void);
void beep(void);
extern struct keys keys[];
extern char altconvert[];

#endif







[LISTING THREE]


/* --------------- system.h -------------- */
#ifndef SYSTEM_H
#define SYSTEM_H
/* ----- interrupt vectors ----- */
#define TIMER 8
#define VIDEO 0x10
#define KEYBRD 0x16
#define DOS 0x21
#define CRIT 0x24
#define MOUSE 0x33
/* ------- platform-dependent values ------ */
#define FREQUENCY 100
#define COUNT (1193280L / FREQUENCY)
#define ZEROFLAG 0x40
#define MAXSAVES 50
#define SCREENWIDTH 80
#define SCREENHEIGHT 25
/* ----- keyboard BIOS (0x16) functions -------- */
#define READKB 0
#define KBSTAT 1
/* ------- video BIOS (0x10) functions --------- */
#define SETCURSORTYPE 1
#define SETCURSOR 2
#define READCURSOR 3
#define READATTRCHAR 8
#define WRITEATTRCHAR 9
#define HIDECURSOR 0x20
/* ------- the interrupt function registers -------- */
typedef struct {
 int bp,di,si,ds,es,dx,cx,bx,ax,ip,cs,fl;
} IREGS;
/* ---------- cursor prototypes -------- */
void curr_cursor(int *x, int *y);
void cursor(int x, int y);
void hidecursor(void);
void unhidecursor(void);
void savecursor(void);
void restorecursor(void);
void normalcursor(void);
void set_cursor_type(unsigned t);
void videomode(void);
/* ---------- mouse prototypes ---------- */

int mouse_installed(void);
int mousebuttons(void);
void get_mouseposition(int *x, int *y);
void set_mouseposition(int x, int y);
void show_mousecursor(void);
void hide_mousecursor(void);
int button_releases(void);
void resetmouse(void);
#define leftbutton() (mousebuttons()&1)
#define rightbutton() (mousebuttons()&2)
#define waitformouse() while(mousebuttons());
/* ------------ timer macros -------------- */
#define timed_out(timer) (timer==0)
#define set_timer(timer, secs) timer=(secs)*182/10+1
#define disable_timer(timer) timer = -1
#define timer_running(timer) (timer > 0)
#define countdown(timer) --timer
#define timer_disabled(timer) (timer == -1)

#ifdef MSC
/* ============= MSC Compatibility Macros ============ */
#define BLACK 0
#define BLUE 1
#define GREEN 2
#define CYAN 3
#define RED 4
#define MAGENTA 5
#define BROWN 6
#define LIGHTGRAY 7
#define DARKGRAY 8
#define LIGHTBLUE 9
#define LIGHTGREEN 10
#define LIGHTCYAN 11
#define LIGHTRED 12
#define LIGHTMAGENTA 13
#define YELLOW 14
#define WHITE 15

#define getvect(v) _dos_getvect(v)
#define setvect(v,f) _dos_setvect(v,f)
#define MK_FP(s,o) ((void far *) \
 (((unsigned long)(s) << 16) | (unsigned)(o)))
#undef FP_OFF
#undef FP_SEG
#define FP_OFF(p) ((unsigned)(p))
#define FP_SEG(p) ((unsigned)((unsigned long)(p) >> 16))
#define poke(a,b,c) (*((int far*)MK_FP((a),(b))) = (int)(c))
#define pokeb(a,b,c) (*((char far*)MK_FP((a),(b))) = (char)(c))
#define peek(a,b) (*((int far*)MK_FP((a),(b))))
#define peekb(a,b) (*((char far*)MK_FP((a),(b))))
#define findfirst(p,f,a) _dos_findfirst(p,a,f)
#define findnext(f) _dos_findnext(f)
#define ffblk find_t
#define ff_name name
#define ff_fsize size
#define ff_attrib attrib
#define fnsplit _splitpath
#define fnmerge _makepath
#define EXTENSION 2

#define FILENAME 4
#define DIRECTORY 8
#define DRIVE 16
#define MAXPATH 80
#define MAXDRIVE 3
#define MAXDIR 66
#define MAXFILE 9
#define MAXEXT 5
#define setdisk(d) _dos_setdrive((d)+1, NULL)
#define bioskey _bios_keybrd
#define keyhit kbhit
#endif
#endif







[LISTING FOUR]

/* ----------- rect.h ------------ */
#ifndef RECT_H
#define RECT_H

typedef struct {
 int lf,tp,rt,bt;
} RECT;
#define within(p,v1,v2) ((p)>=(v1)&&(p)<=(v2))
#define RectTop(r) (r.tp)
#define RectBottom(r) (r.bt)
#define RectLeft(r) (r.lf)
#define RectRight(r) (r.rt)
#define InsideRect(x,y,r) (within(x,RectLeft(r),RectRight(r)) \
 && \
 within(y,RectTop(r),RectBottom(r)))
#define ValidRect(r) (RectRight(r) || RectLeft(r))
#define RectWidth(r) (RectRight(r)-RectLeft(r)+1)
#define RectHeight(r) (RectBottom(r)-RectTop(r)+1)
RECT subRectangle(RECT, RECT);
RECT RelativeRectangle(RECT, RECT);
RECT ClientRect(void *);
RECT SetRect(int,int,int,int);
#endif







[LISTING FIVE]

/* ---------------- video.h ----------------- */

#ifndef VIDEO_H
#define VIDEO_H


#include "rect.h"

void getvideo(RECT, void far *);
void storevideo(RECT, void far *);
extern unsigned video_mode;
extern unsigned video_page;
void wputch(WINDOW, int, int, int);
int GetVideoChar(int, int);
void PutVideoChar(int, int, int);
void get_videomode(void);
void wputs(WINDOW, void *, int, int);

#define clr(fg,bg) ((fg)|((bg)<<4))
#define vad(x,y) ((y)*160+(x)*2)
#define ismono() (video_mode == 7)
#define istext() (video_mode < 4)
#define videochar(x,y) (GetVideoChar(x,y) & 255)

#endif







[LISTING SIX]

/* --------------------- video.c -------------------- */
#include <stdio.h>
#include <dos.h>
#include <string.h>
#include <conio.h>
#include "dflat.h"

static unsigned video_address;
/* -- read a rectangle of video memory into a save buffer -- */
void getvideo(RECT rc, void far *bf)
{
 int ht = RectBottom(rc)-RectTop(rc)+1;
 int bytes_row = (RectRight(rc)-RectLeft(rc)+1) * 2;
 unsigned vadr = vad(RectLeft(rc), RectTop(rc));
 hide_mousecursor();
 while (ht--) {
 movedata(video_address, vadr, FP_SEG(bf),
 FP_OFF(bf), bytes_row);
 vadr += 160;
 (char far *)bf += bytes_row;
 }
 show_mousecursor();
}

/* -- write a rectangle of video memory from a save buffer -- */
void storevideo(RECT rc, void far *bf)
{
 int ht = RectBottom(rc)-RectTop(rc)+1;
 int bytes_row = (RectRight(rc)-RectLeft(rc)+1) * 2;
 unsigned vadr = vad(RectLeft(rc), RectTop(rc));
 hide_mousecursor();

 while (ht--) {
 movedata(FP_SEG(bf), FP_OFF(bf), video_address,
 vadr, bytes_row);
 vadr += 160;
 (char far *)bf += bytes_row;
 }
 show_mousecursor();
}

/* -------- read a character of video memory ------- */
int GetVideoChar(int x, int y)
{
 int c;
 hide_mousecursor();
 c = peek(video_address, vad(x,y));
 show_mousecursor();
 return c;
}

/* -------- write a character of video memory ------- */
void PutVideoChar(int x, int y, int c)
{
 if (x < SCREENWIDTH && y < SCREENHEIGHT) {
 hide_mousecursor();
 poke(video_address, vad(x,y), c);
 show_mousecursor();
 }
}

/* -------- write a character to a window ------- */
void wputch(WINDOW wnd, int c, int x, int y)
{
 int x1 = GetClientLeft(wnd)+x;
 int y1 = GetClientTop(wnd)+y;
 if (x1 < SCREENWIDTH && y1 < SCREENHEIGHT) {
 hide_mousecursor();
 poke(video_address,
 vad(x1,y1),(c & 255) |
 (clr(foreground, background) << 8));
 show_mousecursor();
 }
}

/* ------- write a string to a window ---------- */
void wputs(WINDOW wnd, void *s, int x, int y)
{
 int x1 = GetLeft(wnd)+x;
 int y1 = GetTop(wnd)+y;
 if (x1 < SCREENWIDTH && y1 < SCREENHEIGHT) {
 int fg = foreground;
 int bg = background;
 unsigned char *str = s;
 char ss[200];
 int ln[SCREENWIDTH];
 int *cp1 = ln;
 int len;
 strncpy(ss, s, 199);
 ss[199] = '\0';
 clipline(wnd, x, ss);

 str = (unsigned char *) ss;
 hide_mousecursor();
 while (*str) {
 if (*str == CHANGECOLOR) {
 str++;
 foreground = (*str++) & 0x7f;
 background = (*str++) & 0x7f;
 continue;
 }
 if (*str == RESETCOLOR) {
 foreground = fg;
 background = bg;
 str++;
 continue;
 }
 *cp1++ = (*str & 255) |
 (clr(foreground, background) << 8);
 str++;
 }
 foreground = fg;
 background = bg;
 len = (int)(cp1-ln);
 if (x1+len > SCREENWIDTH)
 len = SCREENWIDTH-x1;
 movedata(FP_SEG(ln), FP_OFF(ln), video_address,
 vad(x1,y1), len*2);
 show_mousecursor();
 }
}

/* --------- get the current video mode -------- */
void get_videomode(void)
{
 videomode();
 /* ---- Monochrome Display Adaptor or text mode ---- */
 if (ismono())
 video_address = 0xb000;
 else
 /* ------ Text mode -------- */
 video_address = 0xb800 + video_page;
}







[LISTING SEVEN]

/* ----------- console.c ---------- */

#include <conio.h>
#include <bios.h>
#include <dos.h>
#include "system.h"
#include "keys.h"

/* ----- table of alt keys for finding shortcut keys ----- */

char altconvert[] = {
 ALT_A,ALT_B,ALT_C,ALT_D,ALT_E,ALT_F,ALT_G,ALT_H,
 ALT_I,ALT_J,ALT_K,ALT_L,ALT_M,ALT_N,ALT_O,ALT_P,
 ALT_Q,ALT_R,ALT_S,ALT_T,ALT_U,ALT_V,ALT_W,ALT_X,
 ALT_Y,ALT_Z,ALT_0,ALT_1,ALT_2,ALT_3,ALT_4,ALT_5,
 ALT_6,ALT_7,ALT_8,ALT_9,0
};

unsigned video_mode;
unsigned video_page;

static int near cursorpos[MAXSAVES];
static int near cursorshape[MAXSAVES];
static int cs = 0;

static union REGS regs;

#ifndef MSC
#define ZEROFLAG 0x40
/* ---- Test for keystroke ---- */
int keyhit(void)
{
 _AH = 1;
 geninterrupt(KEYBRD);
 return (_FLAGS & ZEROFLAG) == 0;
}
#endif

/* ---- Read a keystroke ---- */
int getkey(void)
{
 int c;
 while (keyhit() == 0)
 ;
 if (((c = bioskey(0)) & 0xff) == 0)
 c = (c >> 8) | 0x80;
 return c & 0xff;
}

/* ---------- read the keyboard shift status --------- */
int getshift(void)
{
 regs.h.ah = 2;
 int86(KEYBRD, &regs, &regs);
 return regs.h.al;
}

/* ------- macro to wait one clock tick -------- */
#define wait() \
{ \
 int now = peek(0x40,0x6c); \
 while (now == peek(0x40,0x6c)) \
 ; \
}

/* -------- sound a buzz tone ---------- */
void beep(void)
{
 wait();

 outp(0x43, 0xb6); /* program the frequency */
 outp(0x42, (int) (COUNT % 256));
 outp(0x42, (int) (COUNT / 256));
 outp(0x61, inp(0x61) | 3); /* start the sound */
 wait();
 outp(0x61, inp(0x61) & ~3); /* stop the sound */
}

/* -------- get the video mode and page from BIOS -------- */
void videomode(void)
{
 regs.h.ah = 15;
 int86(VIDEO, &regs, &regs);
 video_mode = regs.h.al;
 video_page = regs.x.bx;
 video_page &= 0xff00;
 video_mode &= 0x7f;
}

/* ------ position the cursor ------ */
void cursor(int x, int y)
{
 videomode();
 regs.x.dx = ((y << 8) & 0xff00) + x;
 regs.x.ax = 0x0200;
 regs.x.bx = video_page;
 int86(VIDEO, &regs, &regs);
}

/* ------ get cursor shape and position ------ */
static void near getcursor(void)
{
 videomode();
 regs.h.ah = READCURSOR;
 regs.x.bx = video_page;
 int86(VIDEO, &regs, &regs);
}

/* ------- get the current cursor position ------- */
void curr_cursor(int *x, int *y)
{
 getcursor();
 *x = regs.h.dl;
 *y = regs.h.dh;
}

/* ------ save the current cursor configuration ------ */
void savecursor(void)
{
 if (cs < MAXSAVES) {
 getcursor();
 cursorshape[cs] = regs.x.cx;
 cursorpos[cs] = regs.x.dx;
 cs++;
 }
}

/* ---- restore the saved cursor configuration ---- */
void restorecursor(void)

{
 if (cs) {
 --cs;
 videomode();
 regs.x.dx = cursorpos[cs];
 regs.h.ah = SETCURSOR;
 regs.x.bx = video_page;
 int86(VIDEO, &regs, &regs);
 set_cursor_type(cursorshape[cs]);
 }
}

/* ------ make a normal cursor ------ */
void normalcursor(void)
{
 set_cursor_type(0x0607);
}

/* ------ hide the cursor ------ */
void hidecursor(void)
{
 getcursor();
 regs.h.ch = HIDECURSOR;
 regs.h.ah = SETCURSORTYPE;
 int86(VIDEO, &regs, &regs);
}

/* ------ unhide the cursor ------ */
void unhidecursor(void)
{
 getcursor();
 regs.h.ch &= ~HIDECURSOR;
 regs.h.ah = SETCURSORTYPE;
 int86(VIDEO, &regs, &regs);
}

/* ---- use BIOS to set the cursor type ---- */
void set_cursor_type(unsigned t)
{
 videomode();
 regs.h.ah = SETCURSORTYPE;
 regs.x.bx = video_page;
 regs.x.cx = t;
 int86(VIDEO, &regs, &regs);
}






[LISTING EIGHT]

/* ------------- mouse.c ------------- */

#include <stdio.h>
#include <dos.h>
#include <stdlib.h>
#include <string.h>

#include "system.h"

static union REGS regs;

/* ---- call the mouse driver with the four register values ---- */
static void near mouse(int m1,int m2,int m3,int m4)
{
    regs.x.dx = m4;
    regs.x.cx = m3;
    regs.x.bx = m2;
    regs.x.ax = m1;
    int86(MOUSE, &regs, &regs);
}

/* ---------- reset the mouse ---------- */
void resetmouse(void)
{
    mouse(0,0,0,0);
}

/* ----- test to see if the mouse driver is installed ----- */
int mouse_installed(void)
{
    unsigned char far *ms;
    /* the vector must be non-null and must not point at a bare IRET */
    ms = MK_FP(peek(0, MOUSE*4+2), peek(0, MOUSE*4));
    return (ms != NULL && *ms != 0xcf);
}

/* ------ return true if mouse buttons are pressed ------- */
int mousebuttons(void)
{
    if (mouse_installed()) {
        mouse(3,0,0,0);
        return regs.x.bx & 3;
    }
    return 0;   /* no driver: don't report stale register contents */
}

/* ---------- return mouse coordinates ---------- */
void get_mouseposition(int *x, int *y)
{
    if (mouse_installed()) {
        mouse(3,0,0,0);
        *x = regs.x.cx/8;
        *y = regs.x.dx/8;
    }
}

/* -------- position the mouse cursor -------- */
void set_mouseposition(int x, int y)
{
    if (mouse_installed())
        mouse(4,0,x*8,y*8);
}

/* --------- display the mouse cursor -------- */
void show_mousecursor(void)
{
    if (mouse_installed())
        mouse(1,0,0,0);
}

/* --------- hide the mouse cursor ------- */
void hide_mousecursor(void)
{
    if (mouse_installed())
        mouse(2,0,0,0);
}

/* --- return true if a mouse button has been released --- */
int button_releases(void)
{
    if (mouse_installed()) {
        mouse(6,0,0,0);
        return regs.x.bx;
    }
    return 0;   /* no driver: don't report stale register contents */
}
















































MAY, 1991
STRUCTURED PROGRAMMING


The Lesson of the Fallen Viking




Jeff Duntemann, KG7JF


I discovered CB radio in 1972, when it was still innocent and fairly polite,
and gas cost 32 cents a gallon. My first radio was an ancient, dented Viking
Messenger I, with a slightly rusty all-steel case that was full of tubes,
lordy, and a vibrator that buzzed like a swarm of killer bees. It was the best
radio that $25 could buy, however, and I discovered to my delight that by
diddling the pi-network loading coil a little bit one could coax eight or nine
watts out of it, while my less technical friends had to content themselves
with five.
All that summer I called myself Shakespeare, cruising the Northwest Side of
Chicago in my Chevelle, discussing e. e. cummings with some guy in Norwood
Park who called himself Theater Nut, with occasional side comments by a
sultry-voiced vixen known only as Legs. (After some difficulty and a little
triangulation we located Legs' 10-20 and discovered that she had just turned
13. Ahh, radio: The Great Equalizer!)
I went off the air abruptly one night, while trying to rig the Viking for
inside operation. It was a small matter of attaching an ordinary 117V line
cord to a nine-pin connector on the back panel. Piece of cake for an old
electron-jockey like me -- except that I counted the pins counterclockwise
rather than clockwise, and instead of connecting the 117V line to the power
transformer input, I connected it across the 12V tube filament feed. The
instant I plugged it in, all 14 of the Viking's tubes glowed bright blue for a
glorious moment and then went off to join Colonel Armstrong in the Great
Beyond.
It took me ten months to round up replacement tubes for the Viking, and in the
meantime the Arabs got mean, the truckers went on strike, gas went through the
roof, and CB hit the big time. When I got back on the air, I politely asked
after Theater Nut, but what I heard instead was "HOWBOUT THAT TENNESSEE TOILET
FLUSHER! THIS HYAR'S THE HILLBILLY BUSHWHACKER! YOU OAN THE CHANNEL, COME
OAN!"
And that's the only part I can print. I finally ran into Theater Nut at
Superdawg, to discover that he had sold his radio in disgust and was returning
to grad school. He listened to my tale of the fallen Viking with amusement.
"Helluva way to get to know your hardware, Jeff."
Amen.
Theater Nut was the only CBer I ever heard refer to his radio as "hardware"
(and he had plenty of other affectations) but the advice was a little
prophetic for 1973, and remains sound: Know your hardware. The C guys take
this as gospel, but far too many Pascal people investigate their hardware with
the same distance and distaste they might exhibit picking up doggy doo-doo in
the backyard after a hard rain.
Designing good comm software requires knowing your hardware, even if you buy
the low-level interface routines from someone else. Remember: They'll sell
code to you, but they won't debug your app for you. Sooner or later you'll be
tracing a roach and find yourself staring at the other guy's code. If it's all
a mystery, you'll be out of luck. Knowing what all those register INs and OUTs
are for will at least give you a fighting chance. Get the source when you can,
but at very least know what the box is up to.


The Chip With Two Brains


The PC's serial port pretty much comes down to a single chip: The 8250 on the
PC and the 16450 on the AT. The two are very similar, and the 16450 can in
fact be taken as a slightly enhanced version of the 8250. On first look at the
register matrix and the list of all the controls and such supported by these
chips, most newcomers recoil in horror. The horror might be real, but it's no
worse than confronting a B-tree manager or the schematic of a
double-conversion superhet receiver. You take it a piece at a time, and hide
those details that you don't need to confront at any given moment.
From a height, the UART chip is a little schizoid, and performs two distinct
jobs: It manages the serial/parallel conversion of data that is the soul of
what a UART does, and it also manages (in partnership with another chip on the
PC motherboard) a system of interrupts to make serial communications reliable.
The interrupts aren't essential to making the UART work, exactly, and to
understand the UART best you should set aside worries about the interrupt
system. You can in fact write a "toy" communication program without using
interrupts at all, and such a program can help explain a lot about what the
UART can and should do.
So before we get into interrupts at all, we're going to confront and
understand the UART portion of the 8250/16450 by itself.


I/O Ports


As I explained in my March 1991 column, a UART's job is to convert a byte of
data between "parallel" form, with all bits lined up side by side in eight
separate wires or memory locations, and "serial" form, where all the bits move
down a single wire, one after the other. The conversion works both ways, as
needed. Typically, in a PC serial port, data moving out of the computer is
converted from parallel form to serial form, and data moving into the computer
is converted from serial form to parallel form.
Software that you write, regardless of language, "talks" to the UART through
I/O ports. An I/O port, if you haven't ever confronted one in your work, is a
little like a memory location occupied by some sort of hardware gadget rather
than the familiar memory storage bin. An I/O port, like a memory location, has
an address, which is a unique numeric value specifying one single I/O port and
no other. The 86-family of chips has an I/O bus, which is the machinery for
moving information between the CPU and I/O ports. You needn't worry about how
the I/O bus operates at an electrical level. Simply understand that there are
65,536 possible I/O port addresses, and very few of them have any sort of
hardware device attached to them. Most I/O port addresses are just "empty
air."
By writing a byte of data to the port at the right I/O address, you can place
a byte of data inside a hardware device. Similarly, by reading a byte of
data from a port at the right I/O address, you can pick up a byte of data from
inside a hardware device.
Different languages have different ways of dealing with I/O ports. Turbo
Pascal treats I/O ports as elements of a 65,536-element array. To read a byte
from an I/O port, you assign from an element of that array: LineStatus :=
Port[$03FD];. Writing to an I/O port is done by writing a value to an element
of the Port array: Port[$03F8] := OutChar;.


Based I/O Ports


The PC UART chips (there must be a separate chip for each serial port)
communicate with your software through I/O ports. Each chip is connected to
eight adjacent I/O addresses. Which actual physical I/O addresses these are
can be changed, usually by setting DIP switches on the motherboard or on the
serial port board. However, regardless of what addresses are actually used,
you can always assume that the eight I/O addresses occupied by a serial port
UART will be eight-in-a-row, without gaps.
The I/O address of the very first of a UART's eight adjacent I/O ports is
known as the base I/O address of the port as a whole. There are two standard
base I/O addresses for serial ports in the PC. One, for the COM1: serial port
device, is $03F8. The other, for the COM2: serial port device, is $02F8. It's
possible to put additional serial ports in a PC, but there are no recognized
standard base I/O address locations for them, and (far worse, as I'll explain
in a future column) no standard interrupt priority levels for them either.
Until you get seriously conversant with the PC's serial port and interrupt
management hardware, stick with COM1: and COM2:.


UART Registers


Although there are eight I/O addresses associated with each UART chip, it's
not quite true that there are only eight pathways into the UART through the
I/O bus. Some of the I/O ports do double duty, and this represents a large
part of the confusion of working with the PC serial ports.
To help avoid confusion, the term register is applied to each separate
function an I/O port may serve. Two registers may share an I/O port address.
The commonest case is when an I/O port serves one purpose when read from, and
an entirely different purpose when written to. The simplest example is in
moving data into and out of the UART chip. Incoming data (that is, data coming
into the PC from the modem) is held in one location inside the UART, and
outgoing data (that is, data waiting to be serialized and sent to the modem)
is held in another, completely different location inside the UART. Both are
reached through the same I/O address, however: When you read from the base I/O
port (that is, offset 0; the first of the eight I/O addresses a UART is
connected to) you get the incoming data, but when you write to the base I/O
port, you load a byte into the holding register to be transmitted.
Always remember this: A UART I/O port and a UART register are not necessarily
identical! For example, the base I/O port serves no fewer than three different
UART registers, all through a single I/O address. To help you sort out the
registers from the ports, the individual UART registers all have standard
names and three-character acronyms to identify them. I didn't make these names
up; you'll find them in all discussions of the PC's UARTs and serial ports, so
they're certainly worth coming to know. The UART register names and their
abbreviations are summarized in Figure 1, along with the register's offset
from the base address, and whether it is connected to the I/O bus when written
to (W) or read from (R).
Figure 1: UART I/O addresses and register names

 Register name               Acronym  Read/   Offset     COM1:    COM2:
                                      Write   from base  address  address
 --------------------------------------------------------------------------
 Receive buffer register     RBR      R       0          $03F8    $02F8
 Transmit holding register   THR      W       0          $03F8    $02F8
 Interrupt enable register   IER      R/W     1          $03F9    $02F9
 Interrupt ID register       IIR      R       2          $03FA    $02FA
 FIFO control register
  (PS/2 only)                FCR      W       2          $03FA    $02FA
 Line control register       LCR      R/W     3          $03FB    $02FB
 Modem control register      MCR      R/W     4          $03FC    $02FC
 Line status register        LSR      R/W     5          $03FD    $02FD
 Modem status register       MSR      R/W     6          $03FE    $02FE
 Scratch register
  (AT/PS/2 only)             SCR      R/W     7          $03FF    $02FF

When the Divisor Latch Access Bit = 1, these registers are available
instead of RBR/THR and IER:

 Divisor latch, low byte     DLL      R/W     0          $03F8    $02F8
 Divisor latch, high byte    DLM      R/W     1          $03F9    $02F9


The last two registers shown in Figure 1 share I/O addresses with two other
registers, depending on the state of a bit inside the UART called DLAB, the
Divisor Latch Access Bit. When DLAB=0 (the default), the base I/O address and
the address at offset 1 are occupied by registers RBR and IER. However, toggle
DLAB to 1, and you can no longer access RBR and IER. Instead, register DLL is
accessed through the base I/O address, and register DLM is accessed through
the address at offset 1. (I'll come back to what these registers are for in
a while. No sense adding to the confusion unnecessarily!)


Symbolic Names and I/O Addresses


However, in most cases the confusion isn't all that bad. Except for the DLAB
business, which I'll explain in due time, most registers are defined by their
I/O address and whether they are read from or written to. From now on, I'm
going to refer to UART registers by their three-character acronyms rather than
their I/O addresses or I/O address offsets. In fact, for the purposes of
keeping confusion down, I've set up a group of constant definitions in Listing
One (page 148) that define constants using the standard UART register names as
identifiers. These "computed constants" contain the I/O addresses of the UART
registers whose names they carry. In other words, the value of constant MCR is
calculated as PORTBASE+4, where PORTBASE can be either $03F8 (COM1:) or $02F8
(COM2:), as required. For example, if you're using COM1:, MCR is calculated as
$03F8+4, or $03FC. On the other hand, if you're using COM2:, MCR is calculated
as $02F8+4, or $02FC.
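The address arithmetic is easy to check with a few lines of Pascal. What follows is only a sketch: BaseOf, RegAddr, and the OFS_ constants are names made up for this illustration and do not appear in Listing One. It also exercises the trick Listing One uses to derive PORTBASE from a port number, which is to OR the port number shifted left by 8 into $2F8.

```pascal
PROGRAM RegAddrs;

{ Illustrative sketch only. BaseOf, RegAddr and the OFS_ names are }
{ hypothetical; they are not part of Listing One.                  }

CONST
  COMBASE = $2F8;   { Same starting value Listing One uses }
  OFS_MCR = 4;      { Offset of MCR from the base address }
  OFS_LSR = 5;      { Offset of LSR from the base address }

{ ORing in (ComPort SHL 8) turns $2F8 into $3F8 for port 1 }
{ and leaves it at $2F8 for port 2, as in Listing One.     }
FUNCTION BaseOf(ComPort : Word) : Word;
BEGIN
  BaseOf := COMBASE OR (ComPort SHL 8);
END;

{ A register's I/O address is the port's base plus the offset }
FUNCTION RegAddr(ComPort, Offset : Word) : Word;
BEGIN
  RegAddr := BaseOf(ComPort) + Offset;
END;

BEGIN
  { Writeln prints decimal: $03FC is 1020 and $02FC is 764 }
  Writeln('COM1: MCR is at ', RegAddr(1, OFS_MCR));
  Writeln('COM2: MCR is at ', RegAddr(2, OFS_MCR));
  Writeln('COM1: LSR is at ', RegAddr(1, OFS_LSR));
  Writeln('COM2: LSR is at ', RegAddr(2, OFS_LSR));
END.
```

Nothing here touches the hardware, so the program can be run on any machine just to confirm the numbers in Figure 1.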
There may be an explanation as to why the UART's I/O addresses must be shared
when there are 65,536 available, and that explanation probably has to do with
the mechanics of manufacturing the chip itself, but there's no point moaning
over it. Study Figure 1 and refer back to it when confusion begins to rise
during the rest of this discussion. Sooner or later it'll all start to gel.


Using UART Registers


The best way to start in learning how to use UART registers is to build simple
procedures for reading a character from the UART and writing a character to
the UART.
Putting a character into the UART for transmission out of the PC is simply a
matter of writing the character into the transmit holding register, THR:
 PROCEDURE OutChar(Ch : Char);

 BEGIN
   Port[THR] := Byte(Ch);
 END;
You simply assign a value to the Port array element corresponding to the THR
I/O address, and the value is inside the UART, ready to be transmitted. The
type cast Byte(Ch) handles the type mismatch existing between the Port array,
which is an ARRAY OF Byte, and the Char value to be transmitted.
Getting a received character out of the UART is every bit as simple:
 FUNCTION InChar : Char;

 BEGIN
   InChar := Char(Port[RBR]);
 END;
The RBR constant is one of those defined in Listing One. You simply read from
the appropriate element of the Port array, and the UART places its received
value in your hand.


UART Register Bit Fields


With only a few exceptions, the UART registers shown in Figure 1 are not
whole-byte values. Most of the registers are divided into smaller fields that
are only 1 bit in size. Most of these are status flags of one sort or another,
indicating whether some condition is or is not present.
At this point, even trying to name all the bit fields for you would be
confusing, because there are a lot of them, many of them subtle and a little
obscure, some of them with names and standard acronyms and some without. Some
bits, furthermore, are undefined and are not used. Rather than lay them out
all at once, I'll explain them as I need them for the purposes of this
discussion.
The easiest one to begin with has to do with the status of the receive buffer
register, RBR. It's pointless or even hazardous to assume that a valid
character has been received by the UART from the outside world without some
definite signal to that effect. If you read RBR before a full character has
been received, you can end up with a meaningless garbage value. For this
reason, one of the UART bit fields indicates whether or not a complete
character has been received and is waiting in RBR. This bit field, called Data
Ready (DR), is bit 0 of the line status register, LSR. If DR is a 1-bit, a
character is waiting to be read in RBR. If DR is a 0-bit, no character is yet
ready to read.
Testing the state of the DR bit requires "masking out" the other bit fields in
LSR before testing its state. It's not difficult:
 FUNCTION InStat : Boolean;
 BEGIN
   InStat := Boolean(Port[LSR] AND $01)
 END;
ANDing the value fetched from LSR with $01 forces all bits in LSR to 0-bits
except bit 0, corresponding to the 1-bit in the value $01. This leaves you
with two possible values, $00 and $01. The type cast to type Boolean depends
on the special knowledge that under the hood, Boolean True values are bytes
with the bit 0 set to 1, and Boolean False values are bytes with the bit 0 set
to 0. That DR corresponds so neatly to the two Boolean values is mostly
coincidence, and is something of a special case. It applies only when the bit
field to be tested is present in bit 0 of the register in question. However,
because you may be calling the status function frequently within a fairly
tight loop, it makes sense to take advantage of any special case that cuts
down the amount of processing that needs to be done.
InStat allows you to have confidence that any character you read has been
completely received by the UART and doesn't still have a few bits missing. If
function InStat returns True, you can go right to RBR and receive a valid
character. If InStat returns False, nothing's come in yet, so wait a bit and
try again.



Testing Flags: The General Case


Testing flags in the general case requires both a masking operation and a
testing operation to return a Boolean state. You must first mask out all bits
in a register except the bit you want to check, and then compare that bit
against its equivalent value based on its position in the register.
Again, this is best explained by example. There is another bit field in the
line status register that indicates when the last character you passed to the
UART for transmission is completely sent and gone. There is no FIFO buffer in
the PC or AT UART chips. (There is one in the PS/2, but there aren't enough
PS/2 machines out there to make it worth using in generally-applicable
software.) If you write a new character to the Transmit Holding Register (THR)
before the last one is completely out of the UART, you can garble what remains
to be sent of the outgoing character. So in a way similar to testing DR, you
need to test another bit, the Transmit Holding Register Empty bit (THRE). THRE is
not conveniently located at bit 0 of LSR, so you need a more general method of
testing bit fields than we used with DR.
The THRE flag resides at bit 5 of LSR. If you take a byte of 0-bits and raise
only bit 5 to a 1-bit, the value of the byte changes from 0 to $20. This is
the equivalent "value" of the bit field THRE, given its position within its
register, LSR. To make efficient use of THRE in your programs, it would thus
make sense to declare a constant like this:
 CONST
 THRE = $20;
With this constant defined, we first mask out the unwanted bits of LSR,
leaving untouched only bit 5. Then we test whether the remaining value is
equal to THRE with its bit set to 1, and the resulting Boolean value indicates
whether the THRE bit was set to 1 or to 0:
 (Port[LSR] AND THRE) = THRE
This expression returns True if bit THRE is 1, and False if it is 0. You can use
this same general mechanism for testing any bit field within a byte. First AND
the register against the bit field value, then test the resulting byte value
against the bit field value. It's that simple!
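The mask-then-compare recipe can be wrapped in a small helper and tried on sample values. This is a sketch only: BitValue and FlagSet are hypothetical names, and the register values passed in are invented for the demonstration rather than read from a UART.

```pascal
PROGRAM FlagTest;

{ Illustrative sketch: BitValue and FlagSet are hypothetical helpers. }
{ The sample register values are made up; no hardware is touched.     }

CONST
  DR_BIT   = 0;   { Data Ready is bit 0 of LSR }
  THRE_BIT = 5;   { Transmit Holding Register Empty is bit 5 of LSR }

{ Return the "value" of a bit position: bit 0 gives $01, bit 5 gives $20 }
FUNCTION BitValue(BitNum : Byte) : Byte;
BEGIN
  BitValue := 1 SHL BitNum;
END;

{ Mask out everything but the wanted bit, then compare with the mask }
FUNCTION FlagSet(RegValue, Mask : Byte) : Boolean;
BEGIN
  FlagSet := (RegValue AND Mask) = Mask;
END;

BEGIN
  Writeln(FlagSet($60, BitValue(THRE_BIT)));   { bit 5 is set in $60 }
  Writeln(FlagSet($01, BitValue(THRE_BIT)));   { bit 5 is clear in $01 }
  Writeln(FlagSet($01, BitValue(DR_BIT)));     { bit 0 is set in $01 }
END.
```

In a real program the first argument would be the value read from the register, as in FlagSet(Port[LSR], THRE), with THRE defined as $20.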


Baud Rate and the Divisor Latches


Internal mechanisms of the UART chip limit the number of I/O addresses to
which it can respond to eight. Eight isn't quite enough addresses to give each
register an address of its own, so some doubling up is required. The most
complex case of multiple use of I/O addresses lies in the setting of the UART
baud rate.
Most people understand that the baud rate of a serial port is an indicator of
how fast data moves through the port. Baud rate is roughly equivalent to bits
per second, in that 300 baud represents 300 bits per second moving through the
port, and 1200 baud represents 1200 bits per second.
Setting the baud rate of the UART chip is done by placing a value in two UART
registers called the divisor latches. This value represents a bit rate
divisor, by which an internal clock is divided to produce a time constant that
controls the movement of bits through the UART. The larger the divisor value,
the slower the UART's baud rate. The divisor value for 300 baud is 384, while
the divisor for 1200 baud is 96. The divisors for higher baud rates are
smaller still.
Why do we have to place a single value into two registers? Consider: The
divisor for 300 baud is 384, which expressed in binary occupies 9 bits. 9 bits
won't fit into a byte no matter how hard we push on them, so the divisor must
be considered a 16-bit quantity. All UART registers are byte-sized registers.
This forces the divisor value to reside in two 8-bit registers.
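The two divisors quoted above can be reproduced, and split into their two bytes, with a little arithmetic. The sketch below leans on one assumption that the column does not state: with the PC's usual UART clock, the divisor works out to 115200 divided by the baud rate. DivisorFor and Show are hypothetical names.

```pascal
PROGRAM Divisors;

{ Illustrative sketch. Assumes the PC's usual UART clock, which gives }
{ divisor = 115200 DIV baud; the column itself only quotes the values }
{ for 300 and 1200 baud. DivisorFor and Show are hypothetical names.  }

FUNCTION DivisorFor(Baud : LongInt) : Word;
BEGIN
  DivisorFor := 115200 DIV Baud;
END;

{ Print the divisor and the two bytes destined for DLM and DLL }
PROCEDURE Show(Baud : LongInt);
VAR
  D : Word;
BEGIN
  D := DivisorFor(Baud);
  Writeln(Baud:6, ' baud: divisor ', D:4,
          '  DLM = ', Hi(D), '  DLL = ', Lo(D));
END;

BEGIN
  Show(300);    { 384 = 1 * 256 + 128, so DLM = 1 and DLL = 128 }
  Show(1200);   { 96 fits in the low byte, so DLM = 0 }
  Show(9600);   { 12 under the same assumption }
END.
```

The 300 baud case is the one that shows why two registers are needed: the high byte is no longer zero.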
The divisor latches share the first two I/O addresses with RBR/THR (base
address) and IER (offset 1). The base I/O address of the UART, therefore, is
shared by three registers: RBR, THR, and the least significant byte of the
divisor latches, DLL.
This sharing is managed by a flag bit in the Line Control Register, LCR. Bit 7
of LCR is the Divisor Latch Access Bit (DLAB). DLAB is normally 0, and when 0,
the divisor latches are inaccessible from outside the UART. To write a value
into the divisor latches, or to read the state of the current value stored
there, you must first set DLAB to 1. When DLAB = 1, the least significant
divisor latch byte is accessed through the base address, and the most
significant divisor latch byte is accessed through offset 1. Keeping in mind
our previous discussions, a routine to set the baud rate would look like this:
 CONST DLAB = $80;

 PROCEDURE SetRate(Divisor : Word);
 BEGIN
   { Set the DLAB flag to 1: }
   Port[LCR] := Port[LCR] OR DLAB;
   { Load the divisor latches: }
   Port[DLL] := Lo(Divisor);
   Port[DLM] := Hi(Divisor);
   { Clear the DLAB flag to 0: }
   Port[LCR] := Port[LCR] AND (NOT DLAB);
 END;
In essence, you set DLAB to 1, split the divisor value into two pieces and
load the pieces into the two divisor latches, then set DLAB back to 0, its
usual state.
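If the address sharing still seems slippery, it may help to model it in software. The sketch below is a toy, not a description of how the chip is built: TUart, DlabOn, WriteReg, and ReadReg are hypothetical names, and only offsets 0, 1, and 3 are modeled. It runs the same four steps as SetRate against the model and shows that the divisor lands in the latches while IER is left alone.

```pascal
PROGRAM DlabModel;

{ A toy software model of the shared addresses at offsets 0 and 1.  }
{ Nothing here touches real hardware. TUart, DlabOn, WriteReg and   }
{ ReadReg are hypothetical names invented for this illustration.    }

CONST
  DLAB = $80;

TYPE
  TUart = RECORD
    RBR, THR, IER, LCR, DLL, DLM : Byte;
  END;

FUNCTION DlabOn(VAR U : TUart) : Boolean;
BEGIN
  DlabOn := (U.LCR AND DLAB) = DLAB;
END;

{ A write to offset 0 or 1 lands in a different register }
{ depending on the state of DLAB.                        }
PROCEDURE WriteReg(VAR U : TUart; Offset, Value : Byte);
BEGIN
  CASE Offset OF
    0 : IF DlabOn(U) THEN U.DLL := Value ELSE U.THR := Value;
    1 : IF DlabOn(U) THEN U.DLM := Value ELSE U.IER := Value;
    3 : U.LCR := Value;
  END;
END;

FUNCTION ReadReg(VAR U : TUart; Offset : Byte) : Byte;
BEGIN
  ReadReg := 0;
  CASE Offset OF
    0 : IF DlabOn(U) THEN ReadReg := U.DLL ELSE ReadReg := U.RBR;
    1 : IF DlabOn(U) THEN ReadReg := U.DLM ELSE ReadReg := U.IER;
    3 : ReadReg := U.LCR;
  END;
END;

{ The same four steps as SetRate, aimed at the model }
PROCEDURE SetRate(VAR U : TUart; Divisor : Word);
BEGIN
  WriteReg(U, 3, ReadReg(U, 3) OR DLAB);
  WriteReg(U, 0, Lo(Divisor));
  WriteReg(U, 1, Hi(Divisor));
  WriteReg(U, 3, ReadReg(U, 3) AND (NOT DLAB));
END;

VAR
  U : TUart;

BEGIN
  FillChar(U, SizeOf(U), 0);
  WriteReg(U, 3, $03);     { 8 bits, no parity, DLAB = 0 }
  WriteReg(U, 1, $0F);     { DLAB = 0, so this lands in IER }
  SetRate(U, 384);         { The divisor for 300 baud }
  Writeln('DLL = ', U.DLL, '  DLM = ', U.DLM);
  Writeln('IER = ', U.IER, '  LCR = ', U.LCR);
END.
```

The model keeps a separate field for every register and lets DLAB decide which one a given offset reaches, which is the behavior Figure 1 describes.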


A Polling Terminal Program


It's a short walk from what we've discussed so far to producing a real,
functioning (if somewhat weak-witted) terminal program in Turbo Pascal.
Listing Two (page 148) is such a program. I call it PollTerm because it uses
polling rather than interrupts to access the UART. Polling is simply a
repeated query. In PollTerm we ask the UART if a character is ready, and if
not, we loop around and ask again, and again, and again, and when one
eventually becomes ready, we go in and grab it.
Polling works, but it has its problems; far better to have the UART notify us
somehow when a character is ready to grab. This we can do with interrupts, but
it'll take another whole column or so for me to work through the mess it takes
to set such a system up and keep it going.
The fundamental logic of PollTerm is pretty simple:
 Set the serial port up.
 LOOP:
   If a character comes in from the serial port, write the character to
     the screen;
   If a character was typed at the keyboard, send it to the serial port
     for transmission.
   Go to LOOP.
This pseudocode omits certain niceties like interpreting keyboard commands
such as exiting the program gracefully. PollTerm has the minimal machinery for
parsing keyboard commands, although it only recognizes two: Ctrl-X exits the
program, and Ctrl-Z clears the screen. Nonetheless, PollTerm demonstrates the
basic logic common to all terminal programs, and when we create an
interrupt-driven version in a future column, it won't look a great deal
different.
PollTerm demonstrates a couple of other UART register operations that I
haven't discussed in depth. The data word length and parity mode must be set,
again by writing bit fields to a UART register:
 Port[LCR] := BITS8 OR NOPARITY;
Similarly, you need to set certain "hardware handshake" lines to their correct
state by writing bit fields to the modem control register MCR. I haven't the
space in this column to explain every bit field in every UART register, but
I'll cover the most important ones next time.
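As a small preview of how such bit fields combine, the sketch below ORs together values for the line control register. BITS8 and NOPARITY come from Listing One. BITS7 and EVENPARITY are hypothetical additions that are not in the listings; their values assume the 8250's LCR layout, with word length in bits 0 and 1, parity enable in bit 3, and even-parity select in bit 4.

```pascal
PROGRAM LineFormat;

{ Illustrative sketch. BITS8 and NOPARITY come from Listing One.   }
{ BITS7 and EVENPARITY are hypothetical additions whose values     }
{ assume the 8250's LCR layout: word length in bits 0-1, parity    }
{ enable in bit 3, even-parity select in bit 4.                    }

CONST
  BITS8      = $03;
  NOPARITY   = $00;
  BITS7      = $02;
  EVENPARITY = $18;

BEGIN
  { Writeln prints decimal: $03 is 3 and $1A is 26 }
  Writeln('8 bits, no parity  : ', BITS8 OR NOPARITY);
  Writeln('7 bits, even parity: ', BITS7 OR EVENPARITY);
END.
```

Because each field occupies its own bits, ORing the constants together never lets one setting disturb another.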
For now, crank up PollTerm and give it a try. You can read MCI Mail with it (I
just did a little while ago) but watch and see if you don't occasionally lose
a character or two while a continuous stream of text is coming down the line
to your machine. (This is most likely on slow machines like 4.77-MHz 8088
PCs.) That's the most important problem that interrupts correct.
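A little arithmetic shows how tight the timing is. The sketch below assumes the format PollTerm sets up (8 data bits, no parity) plus one start bit and one stop bit, for ten bit times per character. The names are made up for this illustration.

```pascal
PROGRAM CharTime;

{ Back-of-the-envelope arithmetic, assuming PollTerm's 8-bit,  }
{ no-parity format with one start bit and one stop bit, i.e.   }
{ 10 bit times per character. The names are illustrative only. }

CONST
  BITSPERCHAR = 10;   { 1 start + 8 data + 1 stop }

FUNCTION CharsPerSecond(Baud : Word) : Word;
BEGIN
  CharsPerSecond := Baud DIV BITSPERCHAR;
END;

PROCEDURE Show(Baud : Word);
VAR
  Cps : Word;
BEGIN
  Cps := CharsPerSecond(Baud);
  Writeln(Baud, ' baud: ', Cps, ' chars/sec, about ',
          (1000.0 / Cps):0:1, ' ms per character');
END;

BEGIN
  Show(300);    { 30 characters per second }
  Show(1200);   { 120 characters per second }
END.
```

At 1200 baud a steady stream delivers a new character roughly every 8 milliseconds. If one trip around the polling loop takes longer than that (scrolling the screen on a slow machine, say), the next character can arrive before the last one has been read.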
The major point I've wanted to make this time has only been that the serial
port hardware is knowable, and eminently accessible from Turbo Pascal. Once
you have a handle on the UART registers and what they do, you can write
communications software in any language that will allow you to read and write
I/O ports.



Turbo Pascal Does Windows!


I've just received a fairly reliable beta test copy of Turbo Pascal for
Windows, and by the time you read this column, Borland should have made its
presence public and begun shipping it. The product has been a very long time
in the works, but Borland did well to take its time and do it right. After a
short look, I can't think of anything seriously wrong with the product. It
doesn't make Windows programming quite what I'd call easy, but let's say it
makes Windows programming possible in less than a normal lifetime.
You still have to digest lots of information about Windows' event-driven
machinery, but the class library hides this necessary complexity about as well
as can be done. If you're a longtime Windows user you'll have no trouble
writing Windows apps almost immediately. If you're a Windows virgin, plan on a
little study and some practice in the environment itself. Turbo Pascal for
Windows is not Turbo Pascal 6.5 or 7.0. It's an entirely new and separate
product that will be sold and (one assumes) evolve side by side with the
"original" text-based Turbo Pascal.


Products Mentioned:


Turbo Pascal for Windows
Borland International
1800 Green Hills Road
Scotts Valley, CA 95066
408-438-8400
I'll have more to say about Turbo Pascal for Windows in my next column. If
you've had your nose pressed up against the Window all these years, watching
the C guys inside "enjoying" themselves with the Microsoft SDK, take heart.
It's your turn now--and I have a hunch some of the C guys are going to be
watching you crank out rapid-fire Turbo Pascal Windows applications with no
little envy. Microsoft Windows development doesn't have to hurt anymore.
Really.

_STRUCTURED PROGRAMMING COLUMN_
by Jeff Duntemann


[LISTING ONE]


CONST
  COMPORT  = 2;                            { 1 = COM1:  2 = COM2: }
  COMBASE  = $2F8;
  PORTBASE = COMBASE OR (COMPORT SHL 8);   { $3F8 for COM1: $2F8 for COM2: }

  { 8250 control registers, masks, etc. }
  RBR = PORTBASE;       { 8250 Receive Buffer Register }
  THR = PORTBASE;       { 8250 Transmit Holding Register }
  IER = PORTBASE + 1;   { 8250 Interrupt Enable Register }
  LCR = PORTBASE + 3;   { 8250 Line Control Register }
  MCR = PORTBASE + 4;   { 8250 Modem Control Register }
  LSR = PORTBASE + 5;   { 8250 Line Status Register }

  DLL = PORTBASE;       { 8250 Divisor Latch LSB when DLAB=1 }
  DLM = PORTBASE + 1;   { 8250 Divisor Latch MSB when DLAB=1 }

  DLAB     = $80;   { Value for Divisor Latch Access Bit }
  THRE     = $20;   { Value for Transmit Holding Register Empty bit }
  BAUD300  = 384;   { Value for 300 baud operation }
  BAUD1200 = 96;    { Value for 1200 baud operation }
  DTR      = $01;   { Value for Data Terminal Ready }
  RTS      = $02;   { Value for Request To Send }
  NOPARITY = 0;     { Comm format value for no parity }
  BITS8    = $03;   { Comm format value for 8 bits }






[LISTING TWO]

{--------------------------------------------------------------}
{                           POLLTERM                           }
{                      by Jeff Duntemann                       }
{                      Turbo Pascal V6.0                       }
{                      Last update 2/3/91                      }
{ This is a *truly* dumb terminal program that operates by     }
{ polling the serial port's registers and does NOT use         }
{ interrupts. It's a good illustration of why interrupts are   }
{ necessary...                                                 }
{ PollTerm can be set to use either COM1: or COM2: by setting  }
{ the COMPORT constant to 1 (for COM1:) or 2 (for COM2:) as    }
{ appropriate and recompiling.                                 }
{--------------------------------------------------------------}

PROGRAM PollTerm;

USES DOS,CRT;

{$I REGISTERS.DEF }

VAR
  Quit     : Boolean;       { Flag for exiting the program }
  HiBaud   : Boolean;       { True if 1200 baud is being used }
  KeyChar  : Char;          { Character from keyboard }
  CommChar : Char;          { Character from the comm port }
  Divisor  : Word;          { Divisor value for setting baud rate }
  Clearit  : Byte;          { Dummy variable }
  NoShow   : SET OF Char;   { Don't show characters set }

PROCEDURE SetRate(Divisor : Word);

BEGIN
  { Set the DLAB flag to 1: }
  Port[LCR] := Port[LCR] OR DLAB;
  { Load the divisor latches: }
  Port[DLL] := Lo(Divisor);
  Port[DLM] := Hi(Divisor);
  { Clear the DLAB flag to 0: }
  Port[LCR] := Port[LCR] AND (NOT DLAB);
END;

FUNCTION InStat : Boolean;

BEGIN
  { Bit 0 of LSR goes high when a char is waiting: }
  InStat := Boolean(Port[LSR] AND $01);
END;

FUNCTION OutStat : Boolean;

BEGIN
  { Bit 5 of LSR goes high when the THR is ready for another char. }
  { Read the register through Port[]; LSR alone is only its address: }
  OutStat := ((Port[LSR] AND THRE) = THRE);
END;

FUNCTION InChar : Char;

BEGIN
  InChar := Char(Port[RBR]);
END;

PROCEDURE OutChar(Ch : Char);   { Send a character to the comm port }

BEGIN
  Port[THR] := Byte(Ch)   { Put character into Transmit Holding Register }
END;

PROCEDURE UhUh;

VAR
  I : Integer;

BEGIN
  FOR I := 1 TO 2 DO
    BEGIN
      Sound(50);
      Delay(50);
      NoSound;
      Delay(50);
    END;
END;


{>>>>>POLLTERM MAIN PROGRAM<<<<<}

BEGIN
  HiBaud := True;           { PollTerm defaults to 1200 baud; if "300" }
  Divisor := BAUD1200;      { is entered after "POLLTERM" on the }
  IF ParamCount > 0 THEN    { command line, then 300 baud is used }
    IF ParamStr(1) = '300' THEN
      BEGIN
        HiBaud := False;
        Divisor := BAUD300
      END;

  SetRate(Divisor);         { Set the baud rate }

  Port[IER] := 0;                   { Disable 8250 interrupts }
  Port[LCR] := BITS8 OR NOPARITY;   { Set word length and parity }
  Port[MCR] := DTR OR RTS;          { Enable DTR & RTS }
  Clearit := Port[RBR];             { Clear any garbage from RBR }
  Clearit := Port[LSR];             { Clear any garbage from LSR }

  DirectVideo := True;
  NoShow := [#0,#127];      { Don't display NUL or RUBOUT }

  ClrScr;
  Writeln('>>>POLLTERM by Jeff Duntemann');

  Quit := False;            { Exit POLLTERM when Quit goes to True }
  REPEAT

    IF InStat THEN          { If a character comes in from the modem }
      BEGIN
        CommChar := InChar;                         { Go get character }
        CommChar := Char(Byte(CommChar) AND $7F);   { Mask off high bit }
        IF NOT (CommChar IN NoShow) THEN            { If we can show it, }
          Write(CommChar)                           { then show it! }
      END;

    IF KeyPressed THEN      { If a character is typed at the keyboard }
      BEGIN
        KeyChar := ReadKey;         { First, read the keystroke }
        IF KeyChar = Chr(0) THEN    { We have an extended scan code here }
          BEGIN
            KeyChar := ReadKey;     { Read and discard the scan code, }
            UhUh                    { because we're not using it! }
          END
        ELSE
          CASE Ord(KeyChar) OF
            24 : Quit := True;      { Ctrl-X: Exit PollTerm }
            26 : ClrScr;            { Ctrl-Z: Clear the screen }
            ELSE BEGIN
              WHILE NOT OutStat DO BEGIN END;   { I.e., nothing }
              OutChar(KeyChar)
            END;
          END; { CASE }
      END

  UNTIL Quit
END.
















































MAY, 1991
GRAPHICS PROGRAMMING


Further Ruminations of the Edsun CEG/DAC




Michael Abrash


Of late, I find myself asking many questions, but finding few easy answers.
How do lawyers manage to innovate without the incentive and protection of
patents for their work? Why is this nation capable of mounting an awesome
attack half a world away in just a couple of months, but unable to devise a
coherent energy policy in ten years? Why does anyone buy a 386+387 machine
when a 486 costs just about the same and is two to five times faster at
floating-point operations? I wonder about these things, I do. And, for lo
these many weeks since last we spoke, I have wondered this: What is the place,
in the grand scheme of things, of the Edsun Continuous Edge Graphics
Digital-to-Analog-Converter (CEG/DAC)? What is it that this chip can and
cannot do?


The CEG/DAC


As I discussed last month, the CEG/DAC is a VGA DAC substitute that makes a
VGA capable of displaying graphics of near 24-bit-per-pixel (bpp) quality.
This is accomplished via two features: embedding information in the bitmap
that reprograms the palette (EDP), and specifying pixel colors as mixes of the
colors of neighboring pixels (pixel weighting). I stated last month that I
think that Super VGA-plus-CEG/DAC may well be the next PC graphics standard. I
still think that. The question I now have is: What sorts of graphics are the
features of the CEG/DAC good for, out of the many things a PC programmer might
wish for?
Were the CEG/DAC a true 24-bpp device, there'd be no question; it would be
able to display any sort of graphics at all, limited only by the speed of the
graphics code. However, the CEG/DAC is not a true 24-bpp device, and although
sometimes it is a good substitute, sometimes it clearly is not--and often it
is somewhere in between. I have come to think that the CEG/DAC is more useful
for static images than for dynamic images (animation, real-time screen
updates, and the like); in particular, the limitations of pixel weighting must
be attended to carefully when animating.
For example, consider the simple task of drawing a line.


Drawing Lines Through Space


The standard technique for drawing lines rapidly is Bresenham's Algorithm.
Bresenham-style lines suffer from one flaw: They tend to look jagged, because
the line jumps abruptly from one pixel to the next.
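For reference, an integer-only Bresenham-style line can be sketched as follows. This is a minimal sketch, not one of this column's listings; the name BresenhamLine is hypothetical, and the buffer layout (one byte per pixel, BufferWidth bytes per row) is assumed to match the one used in Listing Two.

```c
#include <stdlib.h>

/* Hypothetical helper (not one of the column's listings): draws a line
   from (X1,Y1) to (X2,Y2) into a one-byte-per-pixel buffer using only
   integer additions and comparisons. No clipping is performed, so both
   endpoints are assumed to lie inside the buffer. */
void BresenhamLine(unsigned char *BufferPtr, int BufferWidth,
                   int X1, int Y1, int X2, int Y2, int color)
{
    int DeltaX = abs(X2 - X1);
    int DeltaY = abs(Y2 - Y1);
    int StepX = (X1 < X2) ? 1 : -1;
    int StepY = (Y1 < Y2) ? 1 : -1;
    int Error = DeltaX - DeltaY;   /* running error term */
    int DoubledError;

    for (;;) {
        BufferPtr[BufferWidth * Y1 + X1] = (unsigned char)color;
        if (X1 == X2 && Y1 == Y2)
            break;                 /* reached the far endpoint */
        DoubledError = 2 * Error;
        if (DoubledError > -DeltaY) {  /* time to step along X */
            Error -= DeltaY;
            X1 += StepX;
        }
        if (DoubledError < DeltaX) {   /* time to step along Y */
            Error += DeltaX;
            Y1 += StepY;
        }
    }
}
```

Each pixel is either fully the line color or untouched, which is exactly where the abrupt one-pixel jumps come from.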
The traditional cure for jaggies is antialiasing. Technically, antialiasing is
the process of removing spurious signals resulting from undersampling,
typically with a low-pass filter, but in the context of graphics, antialiasing
has come to mean any process that helps eliminate jaggies. The CEG/DAC's pixel
weighting feature is designed to do exactly that, by allowing you to mix the
colors of horizontally neighboring pixels.
Pixel weighting works very well for smoothing jaggies in a single, static
image. Run Listing Two (page 149), which when linked to Listing One (page 149)
draws a rotating nonantialiased line (floating point rather than Bresenham's
is used for clarity and to save space, but the same points are drawn), and
press a key to freeze the image; you will see prominent jaggies along the
length of the line. Now run Listing Three (page 150) linked to Listing One (if
you have a CEG/DAC) or Listing Four (page 150) linked to Listing One (if you
don't) to draw an antialiased line, and press a key to freeze the image.
(Listing Three antialiases using CEG/DAC pixel weighting, as described later,
and Listing Four emulates pixel weighting by setting up the VGA palette with
32 weighted mixes of the line and background colors.) The jaggies are gone,
replaced with smooth edges.
There are two sorts of aliasing in graphics, though. What we've just seen is
spatial aliasing--aliasing over space, in this case over the pixels of the
screen. There's also temporal aliasing, aliasing from one image to the next as
animation is performed. When Listing Two is allowed to run freely, temporal
aliasing is clearly visible in the form of jaggies crawling wildly along the
edges of the line. It is this distracting and often hideous effect that drives
the need for temporal antialiasing.


Drawing Lines Over Time


Pixel weighting doesn't serve particularly well for temporal antialiasing, for
a number of reasons. The line animation performed by Listings Three and Four
looks better than that of Listing Two, but there are still shifting patterns
in the line, the line's brightness varies, and the line's motion appears less
than perfectly smooth. There are several reasons for this, two of which have
nothing to do with the CEG/DAC: The aspect ratio in 320 x 200 mode isn't 1:1,
and the fineness and accuracy of motion is limited by the use of integer
coordinates. (Then again, pixel weighting is poorly suited to drawing based on
fractional coordinates, because such drawing must often represent three or
more color influences on a single pixel.) However, a number of other problems
are directly related to the nature of the CEG/DAC.
For one thing, the line's brightness varies with angle. This is a side effect
of the way the CEG/DAC handles Y-major lines (lines that are longer along the
Y axis than the X axis). Normally, the CEG/DAC requires at least three pixels
in a row to specify a single weighted pixel: Two colors to average between and
a pixel weighting, which averages between the two colors, as shown in Figure
1. (I'm speaking of Advanced-8 mode here, the mode in which the maximum number
of colors is available.) Requiring three or more pixels across for pixel
weighting in this fashion would make it impossible to draw thin lines,
especially in close proximity.
Edsun solved this by introducing a special case. If a pixel color is followed
by a pixel mix and then by another color, the first color is taken to specify
a line color, and the mix is taken to specify the weighting between the line
color and the background color for the two pixels of the line, as shown in
Figure 2. The mix itself is applied to the pixel that specified the line
color, and the complement of the mix is applied to the pixel that specified
the mix. In other words, a color-mix pair in the middle of a line of colors is
taken to be a standalone unit specifying a cross-section of a two-pixel-wide
line, and the mix is applied to spread the color across the two pixels in such
a way that the intensity of the line color totals 100 percent. (Nor is this
the only special case; CEG/DAC programming is not trivial.)
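The arithmetic of the color-mix pair can be sketched in software. This is an illustrative emulation under stated assumptions, not a description of the chip's internals: the function names are hypothetical, the mix is assumed to be linear, and the weighting order (weight 0 is all line color, weight 31 is all background) follows the way Listings Three and Four use the weighting index.

```c
/* Linear mix of one 6-bit DAC component. Weight 0 yields all Line,
   weight 31 yields all Background. The linear formula with rounding
   is an assumption made for illustration. */
static int MixComponent(int Line, int Background, int Weight)
{
    return (Line * (31 - Weight) + Background * Weight + 15) / 31;
}

/* Hypothetical emulation of the two-pixel special case: the mix is
   applied to the pixel that specified the line color (the left pixel)
   and the complement of the mix to the pixel that specified the mix
   (the right pixel). */
void EmulateLinePair(int Line, int Background, int Weight,
                     int *Left, int *Right)
{
    *Left = MixComponent(Line, Background, Weight);
    *Right = MixComponent(Line, Background, 31 - Weight);
}
```

With a white line (63) on a black background (0), the two results always add up to 63, which is the 100 percent total described above.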
A total line intensity of 100 percent on each scan line sounds good, but in
fact it is rarely correct. A vertical line should indeed have 100 percent
intensity across each scan line. However, as the line tilts toward 45 degrees,
the intensity per scan line should increase toward the square root of two times
that (about 141 percent), because
the line is covering more ground from one scan line to the next. With pixel
weighting, however, the intensity per scan line remains the same, so the
overall brightness of the line decreases.
Also, the width of a line drawn with 100 percent-sum pixel weighting varies
with angle; horizontal and vertical lines are only one pixel wide, but all
other lines spread across two pixels along the minor axis (the axis along
which the line's length is less). The line's width, measured along the normal,
also varies along the length of the line; this shows up as motion along the
edges of the line as it turns, although it is more subdued than normal
jaggies.
The above caveats are also true of static images drawn with pixel weighting,
but there they are not nearly as noticeable. I've said before that graphics is
the art of fooling the eye and the brain. They're easily fooled for many
things; for example, they're quite willing to interpret a sequence of slightly
different images, displayed at a rate of 30 images per second, as motion. By
the same token, however, disturbances in the smooth progression of change over
time stick out like red flags. Don't ask me why; probably it was a handy trait
to have when spotting prey or avoiding being prey was a full-time job. At any
rate, consistency from one frame to the next is essential.
Unfortunately, temporal consistency requires more than simply smoothing edges;
it requires filtering the image so that spurious elements are removed -- that
is, true antialiasing. With true antialiasing, a consistent representation of
the image can be obtained from one frame to the next, no matter how the image
is rotated or shifted, or what it is bordered by. I'd love to show you true
antialiasing in action, but I'm well and truly running out of space; I promise
to return to the topic soon.
True antialiasing requires that colors be settable as arbitrary mixes of the
surrounding primitives in all directions. Pixel weighting does not make that
possible. Therefore, except under special circumstances, pixel weighting is a
dejagging rather than antialiasing tool, and is useful primarily for smoothing
static images (where it surely does look good), rather than for smoothing
dynamic images.


Further Limitations of Pixel Weighting


There are other problems with dynamic pixel weighting. Programming is a
cumbersome process, for starters. Consider Listing Three. For Y-major lines,
the weighting between the two pixels on each scan line has to be calculated;
while that can certainly be done without floating-point calculations, I do not
see how it can be done without either an integer divide or a table search.
Either way, lines drawn with pixel weighting are going to be considerably
slower than Bresenham's lines.
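One way the Y-major weighting might be computed without floating point is sketched below: a single integer divide per line, then one add, one shift, and one mask per scan line. This is a sketch under stated assumptions, not the code in Listing Three. The name WeightsForYMajorLine and the output arrays are hypothetical, the endpoints are assumed to be ordered so that Y2 is greater than Y1, and all coordinates are assumed to be nonnegative and small enough for 16.16 fixed point to fit in a long.

```c
/* Hypothetical helper: for each scan line of a Y-major line, computes
   the X coordinate on or to the left of the ideal line and a 0-31
   weighting index equivalent to (int)((XFloat - X) * 32.0) in
   Listing Three, using 16.16 fixed-point arithmetic.
   Assumes Y2 > Y1 and nonnegative coordinates. XOut and WeightOut
   must each hold Y2 - Y1 + 1 entries. */
void WeightsForYMajorLine(int X1, int Y1, int X2, int Y2,
                          int *XOut, int *WeightOut)
{
    long XFixed = (long)X1 * 65536L;            /* X in 16.16 form */
    /* The one divide: X advance per scan line, in 16.16 form */
    long Step = ((long)(X2 - X1) * 65536L) / (long)(Y2 - Y1);
    int Y;

    for (Y = Y1; Y <= Y2; Y++) {
        XOut[Y - Y1] = (int)(XFixed >> 16);     /* whole-pixel part */
        /* Top 5 bits of the fraction give the 0-31 weighting index */
        WeightOut[Y - Y1] = (int)((XFixed & 0xFFFFL) >> 11);
        XFixed += Step;
    }
}
```

Because the step is truncated by the divide, the weights can drift slightly from the floating-point values on long lines; that is the usual price of a fixed-point DDA.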
X-major lines have their complications, too. Here the minor axis is vertical,
so we must split the 100 percent brightness at each step between two scan
lines. Unfortunately, the CEG/DAC is strictly a horizontally oriented device,
so we are obliged to make sure that the line color precedes the mix sequence
on each scan line, as shown in Figure 1. (The pixel establishing the line
color actually displays in the previous pixel's color; the line color doesn't
show up until the first mix pixel, in the form of the specified mix.) There's
another complication; if any scan line has only one mix command, then that mix
and the preceding color are interpreted as a special-case two-pixel line mix,
as described earlier, rather than a normal mix sequence, and the colors
displayed for the pixels are quite different. These complications could have
been handled as special cases in Listing Three; however, I decided that the
code would be simpler and the operation of the CEG/DAC would be clearer if I
just drew all the mix pixels, then processed the buffer afterward to insert
leading line color pixels and clean up one-mix sequences.
If you think all this sounds involved and slow, well, it is. It's by no means
unmanageable, and certainly much more efficient implementations are possible,
but CEG drawing does, nonetheless, generally require additional coding and
execution time. Moreover, I didn't have to worry about handling intersections
with other lines and graphics objects in Listing Three; normally, such
intersections have to be checked for and potentially patched up, because they
can generate unintended results by disrupting existing mix sequences. In
addition, there are potential problems with left-side clipping of pixel
weighting sequences.
There are, of course, workarounds for many of these limitations. If no
rotation is performed and all moves are by an integral number of pixels, edge
crawling largely vanishes. Careful construction of animation so that objects
don't overlap can eliminate the complications of checking the bitmap for
intersections. Also, many primitives, such as rectangle fills, don't require
any antialiasing, and antialiased text can be predefined, then simply copied
to the screen as needed.
In fact, predefining antialiased images and avoiding troublesome situations
such as rotated polygons, intersections with other objects, and clipping may
be keys to high-performance CEG/DAC code. When such conditions are
unavoidable, you may want to consider EDP (on-the-fly palette loading), which
is probably the best bet for high-color-content dynamic drawing with the
CEG/DAC. (Basic-8 mode is simpler and faster when only a few colors are
required.)
Consider this: EDP does not suffer from clipping problems. Nor must drawing
that relies only on EDP concern itself with intersections; every pixel
is independent. The only remaining problem lies in making available all the
colors needed for drawing. Suppose that in 640 x 480 256-color mode, you
choose to allocate 160 pixels at the left and right edges of the screen (about
25 percent of the width of the screen) to static images, composed of mostly
solid colors. EDP requires five pixels to load a single palette location, so
about 30 colors could be changed per scan line during the static edges.
Additional colors could be set in the active portion of the display, with the
EDP commands embedded in areas of solid color.
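The budget in this example is simple arithmetic, sketched here with hypothetical function names; the five-pixels-per-load figure is the one given above.

```c
/* Hypothetical helper: number of palette locations that EDP can reload
   on one scan line, given the pixels set aside for EDP commands and
   the number of pixels each palette load consumes. */
int EDPColorsPerScanLine(int StaticPixels, int PixelsPerLoad)
{
    return StaticPixels / PixelsPerLoad;
}

/* Pixels left over on each scan line for the dynamic part of the image */
int ActivePixelsPerScanLine(int ScreenWidth, int StaticPixels)
{
    return ScreenWidth - StaticPixels;
}
```

For a 640-pixel scan line with 160 static pixels and five pixels per load, this gives 32 palette changes per scan line (the "about 30" above) and 480 active pixels.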
Now, there would be 480 pixels in the active region across each scan line. 223
independent colors (the number available in Advanced-8 mode), with at least 30
changeable from line to line, could often serve quite nicely to draw whatever
it is that those 480 pixels across are supposed to represent. No, it wouldn't
be the same as drawing with 24 arbitrary bpp, but, properly structured, it
might not look a whole lot different; Listing Four, which uses only careful
palette loading, antialiases just as well as the CEG hardware in Listing
Three, and, with the help of EDP, would be considerably more flexible. (Not
coincidentally, the photorealistic images in the Edsun demo have large blank
borders.) Best of all, the only complications with EDP-based drawing would be
figuring out which colors to change and where for maximum effect; the far
greater complications of pixel weighting are avoided.
Of course, reliance on EDP is not a technique that a general-purpose driver
can use; it's necessary that the application be CEG/DAC-aware. Long ago, all
graphics applications were hardware-aware; now quasi-hardware-independence,
through general interfaces such as Windows and X Window System, is the
mainstream. I suspect that the truly mind-blowing CEG/DAC applications will be
throwbacks to that earlier era. In the end, what I'm saying is that if you
want to do spectacular animation on the CEG/DAC, you might do well to ignore
pixel weighting altogether, at least for images that must be generated in
real time or might intersect, and focus on EDP. That was obviously not Edsun's
intent, for they didn't make it possible to enable only EDP; that's a shame,
because then we'd have 255 main colors to work with.
Overall, I think the CEG/DAC is better suited to static images than dynamic;
as the Edsun demo illustrates, the CEG/DAC is stunningly good at static
displays, and as the examples in this column illustrate, there are significant
complications and limitations associated with dynamic displays. One of the
tricks is converting as much drawing as possible from dynamic to static;
that's what predefining images and keeping them from intersecting is all
about. Although the CEG/DAC is a powerful resource, in some respects it comes
up short; it should be fascinating to watch as techniques evolve to work
around its limitations.



Book of the Month


This month's book is The RenderMan Companion, by Steve Upstill
(Addison-Wesley, 1990). RenderMan is a comprehensive programming interface
specification for 3-D graphics; implementations of the RenderMan interface
have been the basis for stunning photorealistic imaging and special effects.
(For background information on RenderMan, see Upstill's article "Photorealism
in Computer Graphics," in DDJ, November 1988.) Companion takes you on a
wide-ranging tour of the RenderMan interface, with plenty of sample code and
output; even if you never program RenderMan directly, this book provides
worthwhile insight into the nature of 3-D rendering. At the very least, read
the foreword, a brief history of computer graphics and the development of
RenderMan; it provides a sense of the dizzying pace of progress in computer
graphics, and of the people behind the wonders.
Which brings a thought to mind. I don't know how many, if any, of the
techniques discussed in Companion and in the various standard graphics
references are patented, but I've never seen a mention of such. How is it, I
ask all you software patent proponents, that all this invention, publishing,
and sharing of knowledge for the public benefit happened without the
"necessary" incentive and protection of patents?
Think about it, that's all I ask.

_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash


[LISTING ONE]

/* Animates a line rotating about its center. 180 frames are constructed
offscreen, one for each 1 degree rotation, and then copied to the screen to
produce animation. Edsun CEG antialiasing or emulated CEG antialiasing may
optionally be performed. To compile for Edsun CEG antialiasing, define
USE_CEG on the compiler command line (/DUSE_CEG for MSC, -DUSE_CEG for Turbo C)
and link this code with Listing 3. To compile for emulated CEG antialiasing,
define EMULATE_CEG on the compiler command line and link this code with
Listing 4.
All C code tested with Microsoft C 5.0. Requires a large data model
(the compact model was used for testing). */

#include <stdio.h>
#include <stdlib.h>
#include <conio.h>
#include <math.h>
#include <dos.h>
#ifdef __TURBOC__
#include <mem.h>
#else
#include <memory.h>
#endif

/* Size of frames drawn to system memory, and size of the circle
 formed by the rotation of the line segment around its center */
#define FRAME_WIDTH 49
#define FRAME_HEIGHT 49
#define FRAME_CENTER_X (FRAME_WIDTH/2)
#define FRAME_CENTER_Y (FRAME_HEIGHT/2)
#define X_RADIUS (FRAME_WIDTH/2-1)
#define Y_RADIUS (FRAME_HEIGHT/2-1)
#define SCREEN_WIDTH 320
#define SCREEN_HEIGHT 200
#define SCREEN_SEGMENT 0xA000
#define X_OFFSET_ADJUST ((SCREEN_WIDTH/2) - FRAME_CENTER_X)
#define Y_OFFSET_ADJUST ((SCREEN_HEIGHT/2) - FRAME_CENTER_Y)
#define PI 3.141592

int main(void);
extern void DrawLine(unsigned char *,int,int,int,int,int,int,int);
extern int SetCEGMode(int);
extern void SetAAPalette(void);

int main() {
 unsigned char *FramePtr[180], *ScreenPtr, *TempPtr;
 int angle, X1, Y1, X2, Y2, temp, i;

 union REGS regset;

 /* Generate 180 frames, one for each 1 degree rotation of the line
 segment about its center; store the results in system memory */
 printf("Precalculating frames. Please wait...");
 /* First, allocate space for the frames */
 for (angle = 0; angle < 180; angle++) {
 if ((FramePtr[angle] = (unsigned char *)
 malloc(FRAME_WIDTH * FRAME_HEIGHT)) == NULL) {
 printf("Out of memory\n");
 return(1);
 }
 /* Clear the frame to black */
 memset(FramePtr[angle], 0, FRAME_WIDTH * FRAME_HEIGHT);
 }

 /* Generate the frames, one for each 1 degree rotation of the line
 segment about its center */
 for (angle = 0; angle < 180; angle++) {
 /* Calculate upper end of line as a counterclockwise rotation from right
 end of a horizontal line and lower end of line as a counterclockwise
 rotation from left end of a horizontal line */
 temp = cos((double)angle * PI / 180.0) * X_RADIUS + 0.5;
 X1 = FRAME_CENTER_X + temp;
 X2 = FRAME_CENTER_X - temp;
 temp = sin((double)angle * PI / 180.0) * Y_RADIUS + 0.5;
 Y1 = FRAME_CENTER_Y - temp;
 Y2 = FRAME_CENTER_Y + temp;
 /* Draw the line in white */
 DrawLine(FramePtr[angle], FRAME_WIDTH, FRAME_HEIGHT, X1, Y1,
 X2, Y2, 15);
 }

 /* Set to the standard 256-color VGA mode, mode 0x13, 320x200 */
 regset.x.ax = 0x0013; int86(0x10, &regset, &regset);

#ifdef USE_CEG
 /* Enable Advanced-8 CEG mode */
 if (!SetCEGMode(13)) {
 /* Restore text mode and we're done */
 regset.x.ax = 0x0003;
 int86(0x10, &regset, &regset);
 fprintf(stderr, "CEG/DAC not installed\n");
 return(1); /* no CEG/DAC installed */
 }
#endif

#ifdef EMULATE_CEG
 /* Set the palette up for antialiasing */
 SetAAPalette();
#endif

 /* Draw the frames, at a rate of 1 frame per screen refresh
 interval, until a key is pressed */
 for (angle = 0;;) {
 do {
 /* Point to the destination area of display memory */
#ifdef __TURBOC__
 ScreenPtr = MK_FP(SCREEN_SEGMENT, (Y_OFFSET_ADJUST*SCREEN_WIDTH)
 + X_OFFSET_ADJUST);
#else
 FP_SEG(ScreenPtr) = SCREEN_SEGMENT;
 FP_OFF(ScreenPtr) =
 (Y_OFFSET_ADJUST * SCREEN_WIDTH) + X_OFFSET_ADJUST;
#endif
 /* Wait for the start of the vertical non-display portion of the
 frame */
 while (inp(0x3DA) & 0x08) ;
 while (!(inp(0x3DA) & 0x08)) ;
 /* Copy over the current frame, a scan line at a time */
 for (i = 0, TempPtr = FramePtr[angle]; i < FRAME_HEIGHT;
 i++, ScreenPtr += SCREEN_WIDTH, TempPtr += FRAME_WIDTH) {
 memcpy(ScreenPtr, TempPtr, FRAME_WIDTH);
 }
 angle = (angle + 1) % 180; /* wrap back every 180 frames */
 } while (!kbhit());
 if (getch() == 0x1B) /* pause if a key was pressed */
 break; /* exit if the key was Esc */
 getch(); /* wait for another key and resume */
 }

 /* Return the CEG/DAC to standard VGA operation by writing to
 palette location 223, restore text mode and we're done */
#ifdef USE_CEG
 while (inp(0x3DA) & 0x08) ; /* wait for the start of */
 while (!(inp(0x3DA) & 0x08)) ; /* vertical non-display */
 outp(0x3C8, 223); outp(0x3C9, 0); outp(0x3C9, 0); outp(0x3C9, 0);
#endif
 regset.x.ax = 0x0003; int86(0x10, &regset, &regset);
 return(0);
}







[LISTING TWO]

/* Draws a non-antialiased line from (X1,Y1) to (X2,Y2) into the buffer
pointed to by BufferPtr, of width BufferWidth and height BufferHeight. */

#include <dos.h>
#include <math.h>
#include <stdlib.h> /* for abs() */
#define SWAP(a,b) {temp = a; a = b; b = temp;}

void DrawLine(unsigned char *BufferPtr, int BufferWidth,
 int BufferHeight, int X1, int Y1, int X2, int Y2, int color)
{
 int X, Y, DeltaX, DeltaY, temp;
 double Slope, InverseSlope;

 /* Calculate the X and Y lengths of the line */
 DeltaX = X2 - X1;
 DeltaY = Y2 - Y1;

 /* Determine the major axis */

 if (abs(DeltaY) > abs(DeltaX)) {
 /* Y is the major axis */
 if (DeltaY < 0) { /* make sure DeltaY is positive */
 SWAP(X1, X2);
 SWAP(Y1, Y2);
 DeltaX = -DeltaX;
 DeltaY = -DeltaY;
 }
 InverseSlope = (double)DeltaX / (double)DeltaY;
 /* Scan out the line, stepping along the Y axis one pixel at a
 time and calculating the corresponding X coordinates */
 for (Y = Y1; Y <= Y2; Y++) {
 X = X1 + (int)floor(((double)(Y - Y1) * InverseSlope) + 0.5);
 *(BufferPtr + BufferWidth * Y + X) = color;
 }
 } else {
 /* X is the major axis */
 if (DeltaX < 0) { /* make sure DeltaX is positive */
 SWAP(X1, X2);
 SWAP(Y1, Y2);
 DeltaX = -DeltaX;
 DeltaY = -DeltaY;
 }
 Slope = (double)DeltaY / (double)DeltaX;
 /* Scan out the line, stepping along the X axis one pixel at a
 time and calculating the corresponding Y coordinates */
 for (X = X1; X <= X2; X++) {
 Y = Y1 + (int)floor(((double)(X - X1) * Slope) + 0.5);
 *(BufferPtr + BufferWidth * Y + X) = color;
 }
 }
}








[LISTING THREE]

/* Draws an Advanced-8 CEG-antialiased line from (X1,Y1) to (X2,Y2) into the
buffer pointed to by BufferPtr, of width BufferWidth and height BufferHeight. */

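#include <dos.h>
#include <math.h>
#include <stdlib.h> /* for abs() */
#include <conio.h> /* for inp() and outp(), as in Listing One */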
#define SWAP(a,b) {temp = a; a = b; b = temp;}

void DrawLine(unsigned char *BufferPtr, int BufferWidth,
 int BufferHeight, int X1, int Y1, int X2, int Y2, int color)
{
 int X, Y, DeltaX, DeltaY, temp, WeightingIndex, i, MixLength;
 double Slope, InverseSlope, XFloat, YFloat;

 /* Calculate X and Y lengths of the line */
 DeltaX = X2 - X1;
 DeltaY = Y2 - Y1;


 /* Determine the major axis */
 if (abs(DeltaY) > abs(DeltaX)) {
 /* Y is the major axis */
 if (DeltaY < 0) { /* make sure DeltaY is positive */
 SWAP(X1, X2);
 SWAP(Y1, Y2);
 DeltaX = -DeltaX;
 DeltaY = -DeltaY;
 }
 InverseSlope = (double)DeltaX / (double)DeltaY;
 /* Scan out line, stepping along the Y axis 1 pixel at a time and
 calculating corresponding pair of X coordinates, 1 on each side of
 ideal line, with the mix between background color and line color at
 each of the two X coordinates proportional to proximity of line to
 that pixel, and with line color intensities of the two X coordinates
 summing to 100% */
 for (Y = Y1; Y <= Y2; Y++) {
 /* Exact X coordinate at this Y coordinate */
 XFloat = (double)X1 + ((double)(Y - Y1) * InverseSlope);
 /* Nearest X coordinate on or to the left of the line */
 X = (int)floor(XFloat);
 /* Draw the color to the left pixel */
 *(BufferPtr + BufferWidth * Y + X) = color;
 /* Draw the weighting index for the desired color mix for the
 left pixel to the right pixel; the CEG/DAC uses this to
 mix the color we just wrote with the background color and
 draw that for the left pixel, then uses the complementary
 mix to draw the right pixel (this pixel). Confusing, but
 that's the way the CEG/DAC works! */
 *(BufferPtr + BufferWidth * Y + X + 1) =
 (int)((XFloat - X) * 32.0) + 192;
 }
 } else {
 /* X is the major axis */
 if (DeltaX < 0) { /* make sure DeltaX is positive */
 SWAP(X1, X2);
 SWAP(Y1, Y2);
 DeltaX = -DeltaX;
 DeltaY = -DeltaY;
 }
 Slope = (double)DeltaY / (double)DeltaX;
 /* Scan out the line, stepping along the X axis one pixel at a
 time and calculating the corresponding pair of Y coordinates,
 one on each side of the ideal line, with the mix between the
 background color and the line color at each of the two Y
 coordinates proportional to the proximity of the line to that
 pixel, and with the line color intensities of the two Y
 coordinates summing to 100% */
 for (X = X1; X <= X2; X++) {
 /* Exact Y coordinate at this X coordinate */
 YFloat = (double)Y1 + ((double)(X - X1) * Slope);
 /* Nearest Y coordinate on or above the line */
 Y = (int)floor(YFloat);
 /* Calculate the weighting index for the percentage of this
 pixel that's on the top scan line of the pair this pixel
 is split between */
 WeightingIndex = (int)((YFloat - Y) * 32.0);
 /* Set the weighting for the top pixel */
 *(BufferPtr + BufferWidth * Y + X) =
 WeightingIndex + 192;
 /* Set the weighting for the bottom pixel, with the top and
 bottom weightings summing to 100% */
 *(BufferPtr + BufferWidth * (Y + 1) + X) =
 31 - WeightingIndex + 192;
 }

 /* Finally, post-process the buffer to put leading mix color
 bytes on mix sequences (so the weighting indexes have
 something to mix), and to turn one-pixel-wide sequences
 (artifacts of the above drawing approach, which will not
 display properly) into two-pixel-wide sequences by appending
 an additional mixed pixel 100% weighted to the background color */
 for (i = 0, MixLength = 0; i < (BufferWidth*BufferHeight); i++) {
 if (*(BufferPtr + i) != 0) {
 /* Part of a mix sequence; increment sequence length */
 MixLength++;
 } else {
 if (MixLength > 0) {
 /* Mix sequence just ended; set the line color to start
 the mix sequence */
 *(BufferPtr + i - MixLength - 1) = color;
 if (MixLength == 1)
 /* This is a 1-long mix sequence; pad it to a 2-long
 sequence with an all-background mixed pixel, so
 the mix sequence will display properly, rather
 than as a special 2-wide color/mix line case */
 *(BufferPtr + i) = 0xDF;
 MixLength = 0;
 }
 }
 }
 }
}

/* Sets the desired CEG/DAC mode, enabling CEG graphics. Returns 1 for
 success, 0 if no CEG/DAC is installed. */
int SetCEGMode(int mode) {
 /* Wait for the start of the vertical non-display portion of the
 frame */
 while (inp(0x3DA) & 0x08) ;
 while (!(inp(0x3DA) & 0x08)) ;

 outp(0x3C7, 222); /* write the CEG enable sequence */
 outp(0x3C9, 'C'); outp(0x3C9, 'E'); outp(0x3C9, 'G');
 outp(0x3C7, 222);
 outp(0x3C9, 'E'); outp(0x3C9, 'D'); outp(0x3C9, 'S');
 outp(0x3C7, 222);
 outp(0x3C9, 'U'); outp(0x3C9, 'N');
 outp(0x3C9, mode); /* write the CEG mode */

 /* CEG mode should be enabled. Make sure this is a CEG/DAC */
 outp(0x3C6, 0xFF); /* enable all DAC mask bits */
 if ((inp(0x3C6) & 0x70) == 0x70)
 return(0); /* no version # bit is 0; this is not a CEG/DAC */
 else
 return(1); /* this is a CEG/DAC, and it's ready to go */
}






[LISTING FOUR]

/* Draws an antialiased line from (X1,Y1) to (X2,Y2) into the buffer pointed
to by BufferPtr, of width BufferWidth and height BufferHeight. White on black
is the only supported color combination. */

#include <dos.h>
#include <math.h>
#include <stdlib.h> /* for abs() */
#define SWAP(a,b) {temp = a; a = b; b = temp;}

void DrawLine(unsigned char *BufferPtr, int BufferWidth,
 int BufferHeight, int X1, int Y1, int X2, int Y2, int color)
{
 int X, Y, DeltaX, DeltaY, temp, WeightingIndex, i, MixLength;
 double Slope, InverseSlope, XFloat, YFloat;

 /* Calculate X and Y lengths of the line */
 DeltaX = X2 - X1;
 DeltaY = Y2 - Y1;

 /* Determine the major axis */
 if (abs(DeltaY) > abs(DeltaX)) {
 /* Y is the major axis */
 if (DeltaY < 0) { /* make sure DeltaY is positive */
 SWAP(X1, X2);
 SWAP(Y1, Y2);
 DeltaX = -DeltaX;
 DeltaY = -DeltaY;
 }
 InverseSlope = (double)DeltaX / (double)DeltaY;
 /* Scan out line, stepping along the Y axis one pixel at a time and
 calculating corresponding pair of X coordinates, 1 on each side of
 ideal line, with the mix between background color and line color at
 each of the two X coordinates proportional to proximity of line to
 that pixel, and with line color intensities of two X coordinates
 summing to 100% */
 for (Y = Y1; Y <= Y2; Y++) {
 /* Exact X coordinate at this Y coordinate */
 XFloat = (double)X1 + ((double)(Y - Y1) * InverseSlope);
 /* Nearest X coordinate on or to the left of the line */
 X = (int)floor(XFloat);
 /* Calculate the weighting index for the percentage of this
 pixel that's in the left column of the pair this pixel is
 split between */
 WeightingIndex = (int)((XFloat - X) * 32.0);
 /* Draw the left pixel with the desired color weighting */
 *(BufferPtr + BufferWidth * Y + X) = WeightingIndex + 16;
 /* Draw the right pixel with the complement of the left pixel
 color weighting, so that the line color intensities sum to 100% */
 *(BufferPtr + BufferWidth * Y + X + 1) =
 31 - WeightingIndex + 16;
 }
 } else {
 /* X is the major axis */
 if (DeltaX < 0) { /* make sure DeltaX is positive */
 SWAP(X1, X2);
 SWAP(Y1, Y2);
 DeltaX = -DeltaX;
 DeltaY = -DeltaY;
 }
 /* Scan out the line, stepping along the X axis one pixel at a
 time and calculating the corresponding pair of Y coordinates,
 one on each side of the ideal line, with the mix between the
 background color and the line color at each of the two Y
 coordinates proportional to the proximity of the line to that
 pixel, and with the line color intensities of the two Y
 coordinates summing to 100% */
 Slope = (double)DeltaY / (double)DeltaX;
 for (X = X1; X <= X2; X++) {
 /* Exact Y coordinate at this X coordinate */
 YFloat = (double)Y1 + ((double)(X - X1) * Slope);
 /* Nearest Y coordinate on or above the line */
 Y = (int)floor(YFloat);
 /* Calculate the weighting index for the percentage of this
 pixel that's on the top scan line of the pair this pixel
 is split between */
 WeightingIndex = (int)((YFloat - Y) * 32.0);
 /* Draw the top pixel with the desired color weighting */
 *(BufferPtr + BufferWidth * Y + X) =
 WeightingIndex + 16;
 /* Draw the bottom pixel with the complement of the top pixel
 color weighting, so that the line color intensities sum to
 100% */
 *(BufferPtr + BufferWidth * (Y + 1) + X) =
 31 - WeightingIndex + 16;
 }
 }
}

/* Sets palette entries 16-47 for antialiasing, stepping from solid white to
solid black (increments of 1/31st.) Steps are corrected for a gamma of 2.3. */
void SetAAPalette() {
 union REGS regset;
 struct SREGS sregs;
 static unsigned char AASettings[32*3] = {
 63,63,63, 62,62,62, 61,61,61, 60,60,60, 59,59,59, 58,58,58,
 57,57,57, 56,56,56, 55,55,55, 54,54,54, 53,53,53, 52,52,52,
 51,51,51, 50,50,50, 49,49,49, 47,47,47, 46,46,46, 45,45,45,
 43,43,43, 42,42,42, 40,40,40, 39,39,39, 37,37,37, 35,35,35,
 33,33,33, 31,31,31, 28,28,28, 26,26,26, 23,23,23, 19,19,19,
 14,14,14, 0, 0, 0
 };

 regset.x.ax = 0x1012;
 regset.x.bx = 0x0010;
 regset.x.cx = 0x0020;
 regset.x.dx = (unsigned int) AASettings;
 segread(&sregs);
 sregs.es = sregs.ds; /* point ES:DX to AASettings */
 int86x(0x10, &regset, &regset, &sregs);
}



MAY, 1991
PROGRAMMER'S BOOKSHELF


C++: The Next Generation




Andrew Schulman


A year ago, in the May 1990 issue of DDJ, I wrote a round-up of books on the
C++ programming language. The bottom line was that, if you planned on reading
only one C++ book, then Stanley B. Lippman's C++ Primer based on C++ 2.0 was
the book to read. That's still true.
Some excellent books on C++ have appeared in the last year, reflecting the
growing maturity of the language. Some assume that the reader is already
familiar with C++. We have entered the second generation of C++ usage.
The best of the new C++ books is Jonathan S. Shapiro's A C++ Toolkit. This
book can be read in one evening, and is an enjoyable, brief introduction to
software reuse. Shapiro's goal is to prod you into thinking about using C++
for "reusable programming."
In a discussion of "The Failure of Libraries," Shapiro cites the example of
two different Unix library routines for handling regular expressions. "No
interesting program has ever used either of these library packages!" (p.3).
Something better than libraries is needed if we are to have a
"software-components subindustry"; object-oriented programming languages such
as C++ are an attempt to solve this problem.
We've all heard this one before. But simple things, such as using the example
of linked lists rather than the tired complex-numbers example, make this a
convincing argument for using C++.
Computer science has the interesting property that the vast majority of
problems are solved with a very small number of fundamental data structures.
These data structures are used so often that they have achieved the status of
koans. The most common by far is the linked list. If you have been programming
for more than a year or two, you can probably write them in your sleep.
Initially, I thought linked lists were too basic a topic for this book. A
recent project changed my mind.
I found myself working on a project that needed linked lists in several
places. Having written several hundred linked list data structures in my
career, I threw one together without bothering to build a class. No sooner had
I completed the first list than I needed to build a second, and cranked out the
code for that one, too. Does this sound familiar? Alarm bells went off in my
head and I decided that this chapter was worth including.
In addition to the pleasure of reading some decent prose for a change,
Shapiro's book provides a fresh view of the major classical data structures.
There are chapters on bit sets, lists, arrays, dynamic arrays, binary trees,
hash tables, and atoms. Shapiro uses C++ to say something new and interesting
about these structures.
But all is not wine and roses. The section "Coping with Compiler Brain Death"
(pp. 76-7) explains what happens when your compiler can't inline a function
that has been declared inline. "An Implementation Note About Virtual
Functions" (pp. 86-7) says that if you have a class with virtual functions,
but without any noninline member functions, then your compiler is likely to
emit tons o'vtbls.
Chapter 15, on memory management, makes this point: "C++ programs, much more
than C programs, take advantage of the heap. As a result, C++ objects are more
frequently allocated in the heap than their C counterparts. Careful memory
management is a crucial aspect of C++ performance. As compilers get better, it
will very likely become the dominant issue in tuning C++ applications" (p.
161).
At the beginning of the book, Shapiro makes a point that seems to sum up the
big problem with C++, a problem that has no solution, and that stems from
C++'s greatest asset, which is its strong tie to C. "There are places where
the need to support C features prevents C++ from supporting object-oriented
features as well as one might like, and a surprising number of programs will
run up against these problems in one way or another" (p. ix).
That remark leads straight into this month's next C++ book, the long-awaited
Data Abstraction and Object-Oriented Programming in C++ by Keith Gorlen,
Sanford Orlow, and Perry Plexico. Gorlen et al. discuss how to stretch C++ as
far as it will go in the direction of object-oriented languages such as
Smalltalk, and away from the language's machine-oriented C heritage.
This book is based on the NIH Class Library, a Smalltalk-like class
library for C++, which the authors developed as part of a project involving
biomedical research on Unix-based workstations at the National Institutes of
Health (NIH).
The NIH Class Library addresses a very real problem: C++ compilers do not come
with extensive class libraries. If you need a LinkedList class, you write it
(or borrow the one from Shapiro's book!). If you want the += operator to
signify concatenation when applied to a string, then you have to write a
String class with an operator+=( ) member function. C++ gives you the
mechanisms, but after that you're on your own. Fundamentally, C++ is still C.
C++ programmers can be jealous of programmers using object-oriented languages
such as Smalltalk, which come with extensive class libraries. When you buy
Digitalk Smalltalk/V, you get a massive class hierarchy. When you buy a C++
compiler, you get iostream.h. I am convinced that this Spartan approach,
remaining true to the language's C origins, is precisely why C++ has
succeeded. It is a compromise between C on the one hand and object-oriented
programming on the other.
But that doesn't change the fact that you need a class library. The NIH Class
Library brings some of the flavor of Smalltalk to C++; its class hierarchy has
Object at the top, a Collection class underneath that, a Bag class underneath
that, and so on. If you have Turbo C++ or the newer Borland C++, note that the
sample CLASSLIB is a scaled-down implementation of this same idea.
From the guided tour of the NIH Class Library given by Gorlen et al., I got
the sense that C++ provides just enough object-oriented features to be
tempting, but not enough to really work. How could it? C++ is still C.
For example, the constructor for a BigInt class must nonintuitively take a
string of digits because "this is the only way we can legally write very large
integer constants in C++" (p. 34). You can't write BigInt n =
18446744073709551615 because that number has 20 digits and is not a legal
integer constant in C--I mean, C++. Nor can you overload operator^( ) to mean
exponentiation and check if (n == 2^64 - 1) because in C--I mean in C++, the ^
operator is bitwise exclusive-OR, and overloading cannot change its meaning for
built-in integers or its low precedence.
This sort of restriction means that the promises of C++ often can't be
fulfilled. One promise is that with operator overloading, we can give "an
easily readable, 'mathematical' appearance" to mathematics programs (p.96). I
believe that the NIH Class Library comes as close as possible to this goal,
but it can't succeed, because C++ does not provide a free-form collection of
overloadable operators.
C++ seems to hold out the promise of working at a higher level, only to pull
you up short at the last minute with a stern reminder that this is still C.
Restrictions of this sort are necessary if C++ is to remain a serious tool for
developing commercial software. The authors of the NIH Class Library show what
can be done within these restrictions. In addition to reading their book, you
can get the NIH Class Library source code, either from the publisher (an
additional $16.95) or by downloading it from BIX (listings area c.plus.plus;
files nih30.zip, nih30.inf, and cppoops.zip). This is probably the largest
collection of public C++ source code available, and is well worth examining.
One final note on this book. For years, I have been expecting to see the
phrase "switch statement considered harmful" in print. One of the chief
benefits of C++ is that its virtual functions (dynamic binding) can eliminate
the need for switch statements. Anyone who has seen one of the 14-page "switch
statements from hell" that regularly appear in Microsoft Windows source code
cannot doubt that the switch statement should nearly always be replaced by
some sort of table (of function pointers, for instance). Anyhow, I was glad to
read the brief note, "The switch statement is considered harmful" (p. 104).
Our final book is Margaret A. Ellis and Bjarne Stroustrup, The Annotated C++
Reference Manual. These 447 pages are an expansion and update to the 70-page
Reference Manual that appeared in the back of Stroustrup's 1986 book The C++
Programming Language.
The new Ellis and Stroustrup book is nearly as unreadable as the original
Stroustrup book, and if you are doing anything with C++, it's just as
essential. Besides its approval as base document for the ANSI standardization
of C++ (the cover is stamped "ANSI Base Document"), Ellis and Stroustrup's
book contains many annotations and commentaries that clarify points in the
original reference manual, plus lengthy discussions of the many new features
added since 1986.
Opening the book to a random chapter, we find 22 pages of in-depth Talmudic
commentary on the following topics: Single Inheritance, Multiple Inheritance,
Multiple Inheritance and Casting, Multiple Inheritance and Implicit
Conversion, Virtual Base Classes, Virtual Base Classes and Casting, Single
Inheritance and Virtual Functions, Multiple Inheritance and Virtual Functions,
Virtual Function Tables, Instantiation of Virtual Functions, Virtual Base
Classes with Virtual Functions, and Renaming.
I came away from Ellis and Stroustrup's book with very grave worries about the
complexity of C++. It starts on p. 22 with the remark that a certain variable
"may not be eliminated even if it appears to be unused."
The reason is that the constructor or destructor for the variable's class may
have side-effects. You may say that no one should write a class where the mere
creation of an "unused" variable changes the program's behavior, but there are
several important C++ applications for just this sort of nonintuitive
behavior. On the same page, Ellis and Stroustrup provide a beautiful example
of a Tracer class. The importance of such "unused" variables also comes out in
static initializers for modules.
The point is simply that some of the nicest applications of C++ also reveal
its innate complexity: Here we have a language in which you simply cannot look
at a line of code and know what it's doing. An assembly-language programmer
might say the same thing about C, but to me there is a difference when we are
talking about a language in which deleting an unused variable might break the
program!
C++ is soon going to become even more complex. All three books discuss the two
major forthcoming features of C++: templates (parameterized types) and
try/catch/throw (exception handling). These much-needed features will
undoubtedly interact in many interesting ways with all of the language's
existing features.




















MAY, 1991
A FAST PSEUDO RANDOM NUMBER GENERATOR


r250 for "better" random numbers




W. L. Maier


Bill Maier is a software engineer whose main interests are simulation,
mathematical software, and computer graphics. He can be reached at 3808 Seven
Gables, Fort Worth, TX 76133.


Computers have been required to generate pseudo random numbers from the
earliest days of computing. The usual method used to generate random numbers
is the linear congruent algorithm, which is implemented by repeated use of the
formula shown in Example 1, where x is the current number, x' is the next
number, a, c, and m are constants, and the notation "p mod q" signifies the
remainder left over after dividing p by q.
Example 1: The linear congruent algorithm

 x' = (ax + c) mod m
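
As a concrete illustration of Example 1, the sketch below steps a 16-bit linear congruent generator and counts how long it takes to repeat. The constants a = 25173 and c = 13849 and the names lcg16_next and lcg16_period are assumptions chosen for illustration, not values taken from the article; the modulus m = 65536 comes from masking the result to 16 bits.

```c
/* Sketch of the linear congruent formula x' = (ax + c) mod m for m = 65536.
   LCG_A and LCG_C are illustrative constants (an assumption), not the
   article's; unsigned long keeps the product exact on 16-bit compilers. */
#define LCG_A 25173UL
#define LCG_C 13849UL

/* Return the value that follows x in the sequence. */
unsigned int lcg16_next(unsigned int x)
{
    return (unsigned int)((LCG_A * (unsigned long)x + LCG_C) & 0xFFFFUL);
}

/* Count the steps needed for the sequence to return to its seed. */
unsigned long lcg16_period(unsigned int seed)
{
    unsigned int x = seed & 0xFFFFU;
    unsigned long steps = 0;

    do {
        x = lcg16_next(x);
        steps++;
    } while (x != (seed & 0xFFFFU));
    return steps;
}
```

With these particular constants, lcg16_period(1) returns 65536: the sequence visits every 16-bit value once and then starts over, which is the most a generator with a 16-bit state can do.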

Although the linear congruent technique is by far the most common algorithm in
use for generating pseudo random numbers, there are other methods available.
One such method is the r250 algorithm, described by the physicists E. Stoll
and S. Kirkpatrick in a 1981 issue of the Journal of Computational Physics.
This method, named "r250" because of the 250-element array used in the
algorithm, is quite effective in general and is particularly well-suited for
use on PCs. I first became aware of the r250 algorithm several years ago after
reading an article in which the physicist Per Bak used this method to generate
random numbers for a Monte Carlo simulation of the Ising model. This model is
well-known in the physics community, and its simulation requires thousands of
independent random numbers to be generated. Although most simulations of this
type were (at the time the article was written) run on large computers, Bak
demonstrated that he could perform the same simulations on a PC.
The basic theory behind the r250 algorithm is that, under appropriate
conditions, a pseudo random sequence of bits can be generated by using the
formula in Example 2(a). In this expression, the a[i] and c[i] are bit values,
and are therefore equal to either 0 or 1. The formula simply says that if we
have a set of bits that have been previously generated (that is, a[k-1]
through a[k-p]), we can multiply them by the coefficients c[1] through c[p]
and add them together, modulo 2, to create a new pseudo random bit, a[k]:
a[k] = (c[1]a[k-1] + c[2]a[k-2] + ... + c[p]a[k-p]) mod 2. We add this bit to
our sequence, and then repeat the formula until we have as many random bits as
necessary. The maximum period of this sequence is 2^p - 1, which is achieved by
choosing the polynomial in Example 2(a) to be primitive. To simplify our
calculations, we choose most of the c[i] to be 0--in fact, for the r250 method
all but two of them are set to 0, and we wind up with the formula in Example
2(b), a[k] = (a[k-q] + a[k-p]) mod 2.
In the r250 generator, the primitive polynomial chosen has q=103 and p=250.
Because we are generating only a single bit, we don't have to worry about the
carry when we add. We can simply use an XOR (exclusive or) operation, which is
the same as addition without carry. Thus, to create a random bit, we go back
to the 103rd and 250th bits that we previously generated and XOR those values
together. Of course, we are really not interested in just generating a single
random bit; in a real application we might want 16-bit values, for example. To
accomplish this, we treat each of the 16 bits in the word with the above
formula. Instead of keeping a single sequence of bits, we keep 16 sequences of
bits, which is just a sequence of 16-bit words, and use them in the formula in
Example 2(b), performing an XOR between the 103rd and 250th previous words.
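
The one-bit recurrence can be tried out on a small scale. The sketch below is an illustration under stated assumptions: it uses p = 7 and q = 3 instead of r250's 250 and 103 so that the whole period can be counted, and the names gfsr_step and gfsr_period are hypothetical. The last p bits are kept in one integer, and each new bit is the XOR of the bits generated q and p steps earlier.

```c
/* Small-scale sketch of a[k] = a[k-q] XOR a[k-p].  P_BITS = 7 and Q_BITS = 3
   are illustrative stand-ins (an assumption) for r250's 250 and 103. */
#define P_BITS 7
#define Q_BITS 3
#define STATE_MASK ((1U << P_BITS) - 1U)

/* Bit i of state holds a[k-1-i]; the new bit a[k] is shifted in at bit 0. */
unsigned int gfsr_step(unsigned int state)
{
    unsigned int new_bit =
        ((state >> (Q_BITS - 1)) ^ (state >> (P_BITS - 1))) & 1U;

    return ((state << 1) | new_bit) & STATE_MASK;
}

/* Count steps until the p-bit history returns to its starting value. */
unsigned int gfsr_period(unsigned int start)
{
    unsigned int state = start & STATE_MASK;
    unsigned int steps = 0;

    do {
        state = gfsr_step(state);
        steps++;
    } while (state != (start & STATE_MASK));
    return steps;
}
```

For this small case any nonzero starting history gives a period of 127, which is 2^7 - 1. An all-zero history never leaves zero, the one-bit version of the initialization problem that r250 has to guard against.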
If we compare the r250 method with the linear congruent method, we see that
r250 must perform one XOR and two index calculations to create a random
number, while the linear congruent method requires a multiplication, an
addition, and a division. Although the division is usually circumvented by
using automatic truncation of integers to register length, the linear
congruent method still needs to perform a multiplication each time a new
random number is created. Because multiplication can be a time-consuming
instruction compared to XOR, the r250 method is often a faster algorithm. For
example, on an 8086 processor an XOR instruction requires four clock cycles,
compared with about 115 clock cycles needed for an integer multiply. The MUL
instruction was sped up on the later editions of the 80x86 family, but even on
the 80386, MUL can take as many as 41 clock cycles compared with the two
needed for XOR on the same processor.
As noted previously, the period of this method is given by the expression
2^p - 1; in the case of r250, p = 250, so the period of this implementation is
2^250 - 1, which is approximately 1.8e75. In comparison, the 16-bit linear
congruent method repeats (in the best case) after 65,536 iterations. For
applications where thousands of random numbers are required, r250 is clearly
the superior choice.


Implementing r250


The r250 algorithm is implemented with two functions, one which initializes
the generator and one which returns a random integer each time it is called.
The initialization function sets up a buffer of 250 random numbers, created
using some other available random number generator such as one employing the
linear congruent method. A pointer into the buffer is also set up. After the
system has been initialized, the generator routine creates a new random number
by performing an XOR between the numbers in the buffer at the current index
and at the current index plus 103. If adding 103 to the index would put the
pointer beyond the end of the buffer, it is wrapped around to keep it within
bounds. The new number produced by the XOR is stored at the current index, and
is also returned as the function value. The index is incremented before
returning in preparation for the next time r250 is called.
There is one potential problem concerning r250's initialization that must be
dealt with. Certain combinations of bits in the initial buffer can cause the
r250 algorithm to produce numbers which are too regular to be considered
pseudo random. One example of this would be the situation where a given bit
was 0 in each of the buffer words. Say, for example, that the buffer was
initialized with bit 12 equal to 0 in all 250 words in the buffer. Because new
words are created by XORing two previous buffer words together, and 0 XOR 0 =
0, bit 12 would be 0 in all subsequent words. This situation and other similar
problems can be avoided by ensuring that the buffer words are linearly
independent. Although a full explanation of this requires knowledge of linear
algebra and is beyond the scope of this article, linear independence can be
achieved by application of the following algorithm. For N-bit words, choose
any N words from the 250 word buffer, and think of the bits in these words as
forming an N x N square matrix. Set all of the bits along the diagonal from
the upper left corner to the lower right corner to 1, and set all bits to the
left of these diagonal bits to 0, leaving bits to the right of the diagonal
unchanged. This ensures linear independence, and guarantees that the random
numbers produced by r250 are pseudo random.
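
The diagonal rule described above can be sketched for 16-bit words as follows. This is a minimal illustration, not the article's listing: the function names force_independent and has_stuck_bit, and the offset/stride way of choosing the 16 words, are assumptions made for the example.

```c
#define WORD_BITS 16

/* Apply the diagonal rule to WORD_BITS words of buf, chosen here as
   buf[offset], buf[offset + stride], ...  The caller must make sure that
   offset + stride * (WORD_BITS - 1) stays inside the buffer. */
void force_independent(unsigned int *buf, int offset, int stride)
{
    unsigned int msb = 1U << (WORD_BITS - 1);   /* diagonal bit for row 0 */
    unsigned int mask = (msb << 1) - 1U;        /* diagonal bit and all bits right of it */
    int j;

    for (j = 0; j < WORD_BITS; j++) {
        unsigned int *w = &buf[offset + stride * j];

        *w = (*w & mask) | msb;   /* clear left of diagonal, set diagonal */
        mask >>= 1;
        msb >>= 1;
    }
}

/* Return 1 if some bit position is 0 in every one of the n words. */
int has_stuck_bit(const unsigned int *buf, int n)
{
    unsigned int seen = 0;
    int j;

    for (j = 0; j < n; j++)
        seen |= buf[j];
    return (seen & 0xFFFFU) != 0xFFFFU;
}
```

Starting from the worst case, a buffer of 250 zero words, has_stuck_bit reports a problem; after force_independent(buf, 3, 11) each of the 16 bit positions is set in at least one word, so no bit can remain 0 forever.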
The code in Listing One (page 157) is a 16-bit implementation of the r250
algorithm in C, organized as a separate module that can be linked to other
programs requiring random numbers. The code shown was compiled with the Turbo
C++ compiler, but it does not use any elements of C++ and should compile under
almost any standard C compiler. Three routines are provided: r250_init for
initializing the system, r250 for generating random unsigned integers, and
dr250 for generating floating point random numbers in the interval 0 to 1. The
static variables r250_buffer and r250_index hold the random number buffer and
the index to the current location in the buffer, respectively. (C++ devotees
will recognize that these functions and data can be encapsulated into a class
for r250.) The standard C function rand( ) is used to initialize the r250
buffer. However, rand( ) returns integers in the range 0 to 0x7fff, so I make
the numbers in the r250 buffer true 16-bit values by adding a loop to turn on
bit 15 according to whether rand( ) returns a value in the upper or lower half
of its range, delimited by the value 16384. The last loop ensures that the
buffer is correctly initialized by applying the linear independence algorithm
given above. I use the formula k = 11 * j + 3 to spread out the selected words
over the buffer, although any 16 words would in fact work just as well.
Because rand( ) is being used to initialize the r250 buffer, there is a
possibility that overlapping sequences between runs will be produced, causing
repeating sequences in the output of r250. For example, suppose that the seed
value is chosen to be 687, and that rand( ) fills the r250 buffer with the
values 687, 16857, 23139, 2104, 16876, and so on. Now suppose that a second
run is made and that the seed chosen is 16857, which fills the r250 buffer
with the values 16857, 23139, 2104, 16876, and so on. These buffers are nearly
the same, so some of the output of r250 will be identical between runs.
Although such a situation is possible, its probability of occurrence is low
because there are only 250 values in the buffer from a possible 32768 produced
by rand( ). However, if a great many runs are to be made and overlapping
sequences would be a problem, it would be better to use r250 with elements
larger than 16 bits. It would not be difficult to extend the code given to use
32-bit values and to initialize the r250 buffer with a coded linear congruent
equation rather than rand( ).
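
One way such a 32-bit extension might look is sketched below. It is an illustration under stated assumptions rather than a tested replacement for Listing One: the names r250x_init, r250x_next, and seed_lcg are hypothetical, the seeding multiplier and increment are illustrative constants, and each buffer word is assembled from the high 16 bits of two seeding steps because the low-order bits of a power-of-two linear congruent generator are its weakest.

```c
#include <stdint.h>

#define R250_LEN 250

static uint32_t r250x_buffer[R250_LEN];
static int r250x_index;

/* Hand-coded linear congruent step used only for seeding.  The multiplier
   and increment are illustrative (an assumption), not from the article. */
static uint32_t seed_lcg(uint32_t x)
{
    return (uint32_t)(x * 1664525UL + 1013904223UL);   /* reduced mod 2^32 */
}

/* Fill the buffer from the seeding generator, then force linear independence
   on 32 words spread through the buffer. */
void r250x_init(uint32_t seed)
{
    uint32_t x = seed;
    uint32_t msb = 0x80000000UL;
    uint32_t mask = 0xFFFFFFFFUL;
    uint32_t hi, lo;
    int j, k;

    for (j = 0; j < R250_LEN; j++) {
        x = seed_lcg(x);
        hi = x >> 16;                    /* keep only the high halves */
        x = seed_lcg(x);
        lo = x >> 16;
        r250x_buffer[j] = (hi << 16) | lo;
    }
    for (j = 0; j < 32; j++) {
        k = 7 * j + 3;                   /* 7 * 31 + 3 = 220, inside the buffer */
        r250x_buffer[k] = (r250x_buffer[k] & mask) | msb;
        mask >>= 1;
        msb >>= 1;
    }
    r250x_index = 0;
}

/* Return the next 32-bit value: XOR the current word with the word 103 ahead. */
uint32_t r250x_next(void)
{
    int j = (r250x_index >= 147) ? r250x_index - 147 : r250x_index + 103;
    uint32_t new_rand = r250x_buffer[r250x_index] ^ r250x_buffer[j];

    r250x_buffer[r250x_index] = new_rand;
    r250x_index = (r250x_index >= 249) ? 0 : r250x_index + 1;
    return new_rand;
}
```

Because the seed now has 32 bits and the buffer no longer comes from rand( ), two runs would have to start from closely related seeds before their buffers overlapped in the way described above.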
The routine r250 implements the basic algorithm. The index j is the value of
the current index plus 103, unless that addition would put it outside the
buffer. Subtracting 147 from the index is the same as adding 103 and wrapping
it around, because 103 + 147 = 250, the size of the buffer. The rest of the
routine is straightforward. The routine dr250 returns a double in the range
0.0 to 1.0 using the r250 algorithm. It is an exact duplicate of r250 except
for the last line, which divides by 0xFFFF to produce the floating point
number. You could, of course, implement dr250 by performing a function call to
r250 and then doing the division, but this adds the overhead of the function
call even though it does save some space. Speed demon that I am, I chose the
implementation shown.
A number of sophisticated tests, both theoretical and experimental, have been
applied to the r250 algorithm to ensure that the numbers it is returning are
valid, and the interested reader can look them up in the references. However,
it is still necessary to check our algorithm to ensure that we have coded and
implemented it correctly. The test I chose for this is an intuitive one that
lends itself easily to graphical analysis. The basic concept behind this test
is that if we divide the interval 0 to 1 into N equal bins (that is,
subintervals), then a random number in the range 0 to 1 will be equally likely
to appear in any of the N bins. Thus, if a large number of random numbers are
generated, say M, the count that appears in any given bin should be
approximately M/N. We then run this test for various values of M and N, with
various seed values for r250, and look for bins which contain too few or too
many random numbers. On a single run there will be some variation in the
number of values in each bin, but if we vary the seed value for a given M and
N, we should not find any bins that are regularly over- or under-populated. A
program which implements this test is given in Listing Two (page 157). I ran
the test with output redirected to a file, then used the file to generate a
graph showing the bin populations (see Figure 1). I could not find any
evidence of regularity in the random numbers.
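
The bin test can also be written without a table of bin limits. The sketch below is an alternative to the search loop, not the code of Listing Two: it computes the bin number directly as the integer part of x * N and reports the largest departure from the expected M/N. The function names and the array-of-samples interface are assumptions made so that the example stands alone.

```c
/* Tally m samples in the range 0 to 1 into n bins; a sample equal to 1.0 is
   placed in the last bin. */
void count_bins(const double *samples, int m, int *bins, int n)
{
    int j, k;

    for (k = 0; k < n; k++)
        bins[k] = 0;
    for (j = 0; j < m; j++) {
        k = (int)(samples[j] * n);   /* bin number is the integer part */
        if (k >= n)
            k = n - 1;
        if (k < 0)
            k = 0;
        bins[k]++;
    }
}

/* Largest absolute difference between a bin count and the expected m/n. */
double max_bin_deviation(const int *bins, int m, int n)
{
    double expected = (double)m / n;
    double worst = 0.0;
    int k;

    for (k = 0; k < n; k++) {
        double d = bins[k] - expected;

        if (d < 0.0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

To use it with r250, fill the samples array with values of r250() / (double)0xffff and compare the returned deviation across several seeds; a bin that is regularly far from M/N would be a sign of trouble.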
Another intuitive test that is easily performed is to generate a large set of
random numbers in the range 0 to 1 and match them up in pairs. These pairs are
then used as the X, Y coordinates for points to be plotted. This procedure is
repeated for many random number pairs, all plotted on the same graph. The
points so plotted should be uniformly spread over the area from x = 0 to 1 and
y = 0 to 1. In addition, there should not be any regular features in the plot,
such as lines or spirals of points, which would indicate a departure from
randomness. I ran this test as well on r250 (see Figure 2), and again could
not find any problems.
As noted, the r250 method should, in theory, be quite fast. To test this
hypothesis I wrote a program to generate a large number of random numbers
using both the r250 method and the standard rand( ) function from the C
library. I ran this test on both an AT&T 3B2 computer running Unix and my
80386 PC under DOS, timing the results so that a speed comparison could be
done. On the 3B2 machine the r250 method was consistently about 15 percent
faster than rand( ). A similar test on the 80386 machine showed that r250 was
also about 15 percent faster than rand( ) on that platform. In both cases I
used r250 implemented in C as shown here; even more speed could be gained by
writing r250 in assembly language.
The r250 random number generator is an attractive alternative to the usual
methods. I have used it in a wide variety of applications and have been very
satisfied with the results. R250 is faster than the standard rand( ) function,
and will produce far longer sequences of random numbers without repeating than
the linear congruent method. Although I chose to implement r250 with 16-bit
random numbers, the method can be extended in a straightforward manner to
produce any size random numbers desired. In fact, the only disadvantage to the
r250 method that I can find is that it takes up more space than the linear
congruent method, which can be implemented with a single line of C code. For
applications where space is at a premium, the linear congruent method may be
preferable, but in most situations r250 can save valuable CPU time and provide
a better spectrum of random numbers at only a small cost in program size.


References


Bak, P. "Doing Physics with Microcomputers." Physics Today (December, 1983).
Kirkpatrick, S. and E. Stoll. "A Very Fast Shift-Register Sequence Random
Number Generator." Journal of Computational Physics (vol. 40, 1981).
Fruit, Robert. "A Pseudo Random Number Generator." The C User's Journal (May,
1990).
Knuth, Donald. The Art of Computer Programming, Volume 2: Seminumerical
Algorithms. Reading, Mass.: Addison-Wesley, 1969.

_A FAST PSEUDO RANDOM NUMBER GENERATOR_
by W.L. Maier


[LISTING ONE]

/******************************************************************************
* Module: r250.cpp Description: implements R250 random number generator,
* from S. Kirkpatrick and E. Stoll, Journal of Computational Physics, 40,
* p. 517 (1981). Written by: W. L. Maier
******************************************************************************/

#include <stdlib.h>

/**** Static variables ****/
static unsigned int r250_buffer[250];
static int r250_index;

/**** Function prototypes ****/
void r250_init(int seed);
unsigned int r250();
double dr250();

/**** Function: r250_init Description: initializes r250 random number
generator. ****/
void r250_init(int seed)
{
/*---------------------------------------------------------------------------*/
    int j, k;
    unsigned int mask;
    unsigned int msb;
/*---------------------------------------------------------------------------*/
    srand(seed);
    r250_index = 0;
    for (j = 0; j < 250; j++)     /* Fill the r250 buffer with 15-bit values */
        r250_buffer[j] = rand();
    for (j = 0; j < 250; j++)     /* Set some of the MS bits to 1 */
        if (rand() > 16384)
            r250_buffer[j] |= 0x8000;
    msb = 0x8000;                 /* To turn on the diagonal bit */
    mask = 0xffff;                /* To turn off the leftmost bits */
    for (j = 0; j < 16; j++)
    {
        k = 11 * j + 3;           /* Select a word to operate on */
        r250_buffer[k] &= mask;   /* Turn off bits left of the diagonal */
        r250_buffer[k] |= msb;    /* Turn on the diagonal bit */
        mask >>= 1;
        msb >>= 1;
    }
}
/**** Function: r250 Description: returns a random unsigned integer. ****/
unsigned int r250()
{
/*---------------------------------------------------------------------------*/
    register int j;
    register unsigned int new_rand;
/*---------------------------------------------------------------------------*/
    if (r250_index >= 147)
        j = r250_index - 147;     /* Wrap pointer around */
    else
        j = r250_index + 103;

    new_rand = r250_buffer[r250_index] ^ r250_buffer[j];
    r250_buffer[r250_index] = new_rand;
    if (r250_index >= 249)        /* Increment pointer for next time */
        r250_index = 0;
    else
        r250_index++;

    return new_rand;
}
/**** Function: dr250 Description: returns a random double in range 0-1. ****/
double dr250()
{
/*---------------------------------------------------------------------------*/
    register int j;
    register unsigned int new_rand;
/*---------------------------------------------------------------------------*/
    if (r250_index >= 147)
        j = r250_index - 147;     /* Wrap pointer around */
    else
        j = r250_index + 103;

    new_rand = r250_buffer[r250_index] ^ r250_buffer[j];
    r250_buffer[r250_index] = new_rand;
    if (r250_index >= 249)        /* Increment pointer for next time */
        r250_index = 0;
    else
        r250_index++;
    return new_rand / (double)0xffff; /* Return a number in 0.0 to 1.0 */
}





[LISTING TWO]

/******************************************************************************
* Module: rtest.c Description: tests R250 random number generator by
* placing data in a set of bins.
******************************************************************************/

#include <stdio.h>
#include <stdlib.h>

/**** Constants ****/
#define NMR_RAND 5000
#define MAX_BINS 500

/**** Function prototypes *****/
unsigned int r250();
void r250_init(int seed);

/**** Function: main ****/
int main(int argc, char *argv[])
{
/*---------------------------------------------------------------------------*/
    int j, k;
    int nmr_bins;
    int seed;
    int bins[MAX_BINS];
    double randm;
    double bin_limit[MAX_BINS];
    double bin_inc;
/*---------------------------------------------------------------------------*/
    if (argc != 3)
    {
        printf("Usage -- rtest [nmr_bins] [seed]\n");
        exit(1);
    }
    nmr_bins = atoi(argv[1]);
    if (nmr_bins < 1 || nmr_bins > MAX_BINS)
    {
        printf("Error -- number of bins must be 1 to %d\n", MAX_BINS);
        exit(1);
    }
    seed = atoi(argv[2]);
    r250_init(seed);
    bin_inc = 1.0 / nmr_bins;
    for (j = 0; j < nmr_bins; j++)
    {
        bins[j] = 0;                    /* Initialize bins to zero */
        bin_limit[j] = (j + 1) * bin_inc;
    }
    bin_limit[nmr_bins-1] = 1.0e7;      /* Make sure all others are in last bin */
    for (j = 0; j < NMR_RAND; j++)
    {
        randm = r250() / (double)0xffff;
        for (k = 0; k < nmr_bins; k++)
            if (randm < bin_limit[k])
            {
                (bins[k])++;
                break;
            }
    }
    for (j = 0; j < nmr_bins; j++)
        printf("%d\n", bins[j]);
    return 0;
}






























MAY, 1991
OF INTEREST


Jana Custer


Development and debugging for 32-bit Windows 3.0 applications is provided by
C8.0/386 for Windows from Watcom. Unix workstation and Mac applications can be
adapted to Windows, as well. C/386 for Windows includes an optimizing 386 C
compiler, debugger, development tools, and special libraries (to access the
Windows API from 32-bit code, and a 32-bit C library to enable 32-bit
applications to execute under Windows 3.0). Existing Windows development tools
are 16-bit, and so cannot fully exploit the potential of the 386. 32-bit
addressing allows GUI apps to use the memory available on these machines and
increases speed significantly.
Robert Wenig, scientist at Autodesk in Sausalito, California, told DDJ, "The
big thing for us is getting 32-bit debugging support. The real win for Watcom
is the source-level debugger, as well as enabling the technology needed to
easily get Mac and Unix applications up to Windows. It's easier to port 32-bit
apps to 32-bit Windows."
Developers can take existing Windows applications and create 32-bit versions.
The product was designed for use with the Microsoft Windows 3.0 SDK, and a DOS
extender is not required. Unlimited royalty-free runtime redistribution rights
are included. The compiler sells for $2,250. Reader service no. 23.
Watcom 415 Phillip St. Waterloo, Ontario Canada N2L 3X2 800-265-4555;
519-886-3700
PCsteam, a hardware-assisted debugger for Windows 3.0 and DOS-extenders, is
available from Systems & Software. The debugger has a CodeView interface,
real-time execution trail, and complex hardware breakpoints, timers, and
triggers.
PCsteam lets you track up to 32K of the real-time execution history of your
application software; by scrolling back in time, you can find which trigger
enabled or disabled trail acquisition, and your choice of display format
includes raw machine cycles, assembly language, source code, or a mixture of
assembly and source. PCsteam is priced at $3,995. Reader service no. 24.
Systems & Software Inc. 18012 Cowan, Ste. 100 Irvine, CA 92714 714-833-1700
The DataWindows library from Greenleaf Software now supports the Extended DOS
80386 architecture. DataWindows/386 is an integrated library of C functions
for building interfaces, and includes logical windows, full-featured menus,
list boxes, sophisticated data entry, and a complete user interface toolkit.
Greenleaf DataWindows/386 takes advantage of the improved performance of the
386 and maintains compatibility with DataWindows 2.12 for MS-DOS and Unix.
Mark Edmead, custom software developer in the San Diego area, told DDJ that
he's been using DataWindows since version 2 for the 286. "I'm changing an
original application -- converting it from 286 to 386 with the Watcom compiler
-- and so far so good. DataWindows allows you to do graphical interfaces in a
fairly easy manner. You can build your own screens; and each menu or window
can be a separate file, so if you want to change something, you don't have to
recompile the program."
Window sizes are virtually unlimited, and enhanced video routines written in C
can directly access video memory or write to the display through calls to BIOS
functions. DataWindows supports DOS, Extended DOS/386, Unix, Xenix, and OS/2.
The interface remains the same for both the programmer and the user,
regardless of the hardware or operating system. For protected-mode programming
in DOS, DataWindows/386 can be used with Watcom C/386 8.0 or MetaWare's High C
2.3 in conjunction with the 386 DOS Extender from Phar Lap Software.
DataWindows/386 retails for $799. Reader service no. 25.
Greenleaf Software Inc. 16479 Dallas Parkway, Ste. 570 Dallas, TX 75248
214-248-2561; 800-523-9830
Developers can design applications using VenturCom's Embedded Venix (a ROMable
AT&T Unix operating system) running on Ampro Computers' Little Board/386.
Embedded Venix is AT&T Unix System V Release 3.2 optimized for acquisition and
control, and VenturCom claims it is the only industrial-strength Unix for
AT-based computers; they added real-time extensions in order to create a
deterministic operating system with a preemptable kernel.
Ampro Computers' Little Board/386 is a compact 32-bit single board computer
measuring 5.75" x 8". Minimodules provide VGA, modem, and network interfaces
in a stackable, miniature configuration. Reader service no. 20.
VenturCom 215 First St. Cambridge, MA 02142 617-661-1230
A workgroup computing environment in which users of 386 and 486 PCs can share
information and peripherals and continue to work with DOS application programs
can be created by using VM/386 MultiUser version 2.0 from IGC. VM/386 2.0 has
new features, including demand paging, remote dial-in capability, multitasking
at nodes, PCTERM terminal emulation, improved print spooler, ATI Wonder video
board support, sound support at text-only terminals, and improved hard disk
performance. VM/386 MultiUser 2.0 supports up to 32 users. This package
retails for $1049; a five-user package is available for $695, and the upgrade
costs $250. Reader service no. 21.
IGC Inc. 1740 Technology Dr. San Jose, CA 95110 408-441-0366
A development package for real-time and embedded systems applications is
available from Microtec Research and Force Computers.
Microtec's MCC68K ANSI C cross compiler and XRay68K in-circuit debugger are
now available for Force Computers' 68020- and 68030-based VMEbus CPU boards.
Users of Force computers will be able to use the Microtec XRay source level
debugger to communicate with target CPU boards and to debug code running on
Force boards.
Microtec supplies board support packages built using the monitor configuration
tool (MCT68K), which is an interactive, menu-driven program that enables a
Force board user to configure the system for microprocessor type,
communication scheme, I/O device type and address, as well as any information
on coresident firmware such as assembly-level ROM monitors. Microtec's ANSI C
cross compiler generates optimized, ROMable, reentrant, position-independent
code for the Motorola 680x0 family of microprocessors. Included are ANSI C
runtime libraries, a cross assembler, linker, and an object module librarian.
The XRay in-circuit debugger includes a host-based source-level debugger. The
C cross compiler lists at $2,000, and the XRay debugger starts at $2,200.
Reader service no. 22.
Microtec Research Inc. 2350 Mission College Blvd. Santa Clara, CA 95054
408-980-1300; 800-950-5554
Instant-C 5.0 from Rational Systems complements any production compiler with
an integrated development environment and reduces the edit-compile-link-test
cycle to an edit-test loop. Several new features include a menu and
multiwindow user interface with mouse support. Instant-C now supports external
editors and includes keystroke emulation for Wordstar, Brief, or EMACS. The
Instant-C editor can now handle large programs, and you can view and edit
multiple files and functions.
Also included is a data-watch window for improved debugging and support for
the Pascal function calling sequence. DPMI compatibility allows Instant-C to
access up to 16 Mbytes of extended memory and run under Windows 3.0 in a DOS
window. Instant-C also uses Rational's DOS/16M DOS extender and virtual memory
manager (VMM). Instant-C provides automatic error detection at compile, link,
and run time, and includes source-level debugging, interactive C expression
evaluation, and compatibility with Microsoft C. Version 5.0 is priced at $795;
upgrades cost $50, unless you purchased Instant-C after December 1990, in
which case you will receive it free. Reader service no. 27.
Rational Systems Inc. 220 N. Main St. Natick, MA 01760 508-653-6066
GX Graphics, a new member of the Genus Microprogramming GX Development Series
of graphics programming tools, is designed for developers adding graphics to
applications. Standard graphics primitive functions are provided, as are more
advanced routines such as Super VGA display mode support, mouse programming
routines, as well as the ability to draw to virtual buffers in conventional,
expanded, or disk memory.
DDJ spoke with David Herman from Island Systems in Burlington, Massachusetts,
who said, "We have added the Genus GX Graphics library as the newest port for
our graphics-MENU GUI toolbox because we believe it offers an excellent
alternative to the native graphics that are supplied with the Borland and
Microsoft compilers. Attempts to provide shareware alternatives to these
native libraries have fallen short due to lack of mouse support in the higher
resolutions. The GX library correctly supports the mouse in the standard and
Super VGA modes of operation."
Genus claims that any application currently using an existing graphics
primitives library will improve in speed by replacing the existing library
with GX Graphics. A linkable kernel allows all products within the GX
Development Series to share a royalty-free centerpiece of common functions
linked directly into applications. Programmers who use several members of the
Genus GX Development Series save on code overhead by sharing kernel functions
between libraries. Only the required kernel functions will be linked into the
programs. The kernel is responsible for all display adapter interfacing,
memory allocation, and virtual bitmap support. Over 100 routines (written in
assembly language) are provided. GX Graphics sells for $199, and source code
is available for $200. Reader service no. 26.
Genus Microprogramming 11315 Meadow Lake Houston, TX 77077 713-870-0737;
800-227-0918
Open Interface, a user interface development tool that provides windowing
system, operating system, and hardware independence, is now available from
Neuron Data. GUIs built with Open Interface are instantly portable across
Macintosh, DOS, OS/2, Unix, and VMS, and support the native look and feel of
the Macintosh, Windows, Presentation Manager, Motif, and Open Look windowing
environments.
Open Interface provides a superset of all the widgets and functionality
provided by the native toolkits in the major windowing environments.
Development licenses cost $7,000 for DOS and Macintosh, $9,000 for OS/2, and
$12,000 for Unix and VMS. Reader service no. 28.
Neuron Data 156 University Ave. Palo Alto, CA 94301 415-321-4488






















MAY, 1991
SWAINE'S FLAMES


YACK (Yet Another Computer Katalog)




Michael Swaine


Dear Fellow YACKonian,
Here's the latest offering of neat stuff from YACKland.
Lotus goofed! And, you win. Imagine having names and addresses of over 80
million households. Imagine being able to produce a mailing targeted to
inner-city singles in the Sunbelt, cautious young couples in suburban
Minneapolis, or mobile home families who own dogs but not cats. Or, unmarried
wealthy women over the age of 65 in your neighborhood. Lotus Marketplace:
Households contains name, address, age range, gender, marital status, dwelling
type, neighborhood income range, and neighborhood lifestyle data on over 120
million American consumers. In January of this year, Lotus yanked the product
off the market amid claims that it was a threat to privacy. But, through a
special arrangement with a former Lotus employee, we are able to offer you the
product that spooked the ACLU at an inflation-blasting price of just $79.90
($6 P&H).
It's unfair. You may be barred from cloning a computer or program because you
know too much. It's true. Lawsuits over copyright infringement in clone cases
are often decided on the issue of whether the developers had inside
information about the copyrighted product. Yet clones made this industry what
it is today. IBM didn't sue Compaq for cloning the PC. And Apple hasn't sued
Nutek, the first company to successfully clone the Mac. What's their secret?
Strategic ignorance. Nutek made sure that its developers had access only to
publicly-available information in cloning the Mac operating system, user
interface, and hardware logic. It's called "clean-room development," and now
you can have it, no matter how much you already know. Just use YACK's Clean
Coders. These programmers work exclusively in Cobol on IBM System 360s, so you
know they're ignorant. $79.90 per hour.
Massive object blowout! Forget late-night coding. Forget missed ship
deadlines. Imagine grabbing the object class you need in a blazing ten seconds
mean access time. If you're like me, you never have the object class you need
when you need it. Now, you can have an amazing 8000 object classes in the YACK
object libraries on CD-ROM. Choose from Set #1: Chrysler Parts; Set #2: Forms
of the Syllogism; or Set #3: Animals of the Galapagos. And there are more to
come! Each disk is just $79.90 ($6 P&H).
It's a problem. Kids love reading about space travel, but there aren't enough
good books written at their level. And, if you're like me, you don't have the
time to keep up with all the awesome developments in this exciting field. But
now there is Space 2000, a mind-blowing excursion into outer space in
easy-to-read loose-leaf format, written at the second-grade level. Your kids
will thrill to these dazzling scenarios for the exploitation of space.
Originally prepared by science fiction writer Jerry Pournelle for Vice
President Dan Quayle, Space 2000 is now available to kids everywhere for just
$79.90 ($6 P&H).
Forget clumsy character-recognition software. Forget puzzling over the
inconsistencies of English spelling. The MacHandwriter, available for the
first time in this country, will read your handwriting in any of an amazing
three Japanese character sets. The secret? Japanese characters are more
uniform and easier for computers to read than English. And, if you order
before July 31, 1991, you get absolutely free the book Beginning Japanese. The
MacHandwriter and book are just $79.90 while supply lasts ($6 P&H).
It's awesome. It's neat. It's super. I've slashed the price on the
ever-popular video A Tribute to Bill Gates from my December mailing. All
Bill's industry friends express their admiration and affection for the great
man in this three-minute video, now just $6 (P&H included).
I have a confession. I can't really get any of these items for you, even
though all but one of them are more or less real. The style parodied here is
that of Drew Alan Kaplan, publisher and breathless author of the imitable DAK
catalog of neat stuff. I've wanted to do Drew for years. Forget dry technical
writing.








































June, 1991
EDITORIAL


From the File




Jonathan Erickson


With spring in the air, it's time for spring cleaning, and what better place
to start than the overflowing file cabinet in the corner. The truth of the
matter is that this is the first chance I've had at clearing the debris from
the Loma Prieta earthquake. Okay, I can already hear you saying "That was
nearly two years ago!" That's right, but I've been waiting for my disaster
relief check and it hasn't arrived yet.
It's like this: Right after the earthquake, the state legislature passed a
special sales tax for earthquake relief. Although state coffers subsequently
swelled to the tune of $700 million, little of the relief fund reached those
in need. Not only that, but auditors discovered that $500 million is missing.
I guess if the bureaucrats can get away with losing that kind of money, I can
be excused for misplacing the occasional press release. But on to the file
cabinet ....


From the "Life Imitates Art" File


In the heat of the Persian Gulf battle, novelist-turned-wine-columnist Lew
Perdue sent out sham news releases for a "Desert Storm 486" PC ("The Mother of
all Boards"), complete with testimonials from George Bush and Norman
Schwarzkopf. Shortly thereafter, Intel sent out its press releases announcing
the "Military i486" that joins ranks with the Mil 386, 860, and 960
processors. Ada compilers are in the works too.


From the "Art Imitates Life" File


The demand for Michael Swaine's videotapes--particularly A Tribute to Bill
Gates and Groupthink: Getting Ready for Groupware--mentioned in recent
"Swaine's Flames" has generated more phone calls than I can keep up with. He's
created the demand; let's see if a product follows.
If you remember, Michael first described what he calls the "client-funded
paradigm" back in December, 1989, where you advertise and take orders for a
product that doesn't exist yet, then use the income for actual development.
Sounds good, as long as you're not the one answering the phone.


From the "Best Unused Press Release of the Week" File


The last time I heard from them, the Invention Submission Corporation was
pushing an improved pocket protector that helped "white collar workers achieve
a more polished, professional image by protecting costly business wear from
leaking pens and sharp pencil points." They've recently done themselves one
better, though, with this phony release:
A Silicon Valley inventor has created an intelligent alternative to ordinary
breakfast cereals. Brain Bran combines the wholesome goodness of bran fiber
with a unique ingredient that makes you more intelligent with each spoonful.
Special intellect-enhancing enzymes go to work as you eat, raising your IQ
while the bran increases your fiber content. This tasty new breakfast treat
can offer a fighting chance for kids who are struggling in school, and it
won't go soggy in milk. Adults can also use it as a snack before essential
business meetings, interviews, or MENSA tests.


From the "Microsoft Legal Defense Fund" File


Between the rock of the FTC and the hard place of Apple, Microsoft switchboard
operators will likely be answering the phones with "This is Microsoft. Call my
lawyer." First it was the FTC investigation surrounding OS/2 (which doesn't
need any more bad press), then the news that the inquiry had widened to
operating systems, applications software, and hardware peripherals, and more
recently notification that Apple was amending its GUI lawsuit to include
Windows 3.
The only known quantity in this muddle is that expensive lawyers will be doing
legal tap-dances for months to come. Maybe it's time to establish a legal
defense fund.
No, Microsoft doesn't need your money -- goodness knows they can afford their
own lawyers -- but the bedeviled Bellevue barristers could probably use some
good advice and suggestions from impartial parties. I'll start by throwing in
my two cents' worth.
Perhaps Microsoft could launch its FTC defense by saying, "If we were guilty
of making unfair use of inside information, our software would work better and
come out sooner than everyone else's." The Feds would then have to prove
otherwise. I'll leave the Windows alibis up to you.


Closing the File Drawer


It's time to stop rummaging through the file cabinet and move on to other
corners in the office. Who knows, maybe I'll get lucky and turn up that
missing $500 million.











June, 1991
LETTERS







Software Patents, Yet Again


Dear DDJ,
Edwin Floyd's "An Existential Dictionary" (November 1990) is a good example of
software which raises questions about patents. It is my understanding that
U.S. patents may be challenged successfully if they are issued to other than
the inventor (prior art), or if the alleged invention is obvious to one
trained in the art. But although Mr. Floyd's contribution, which strikes me as
having widespread application, is now obvious to me, it was not so before I
read about it in your magazine.
I don't agree with those who view the patent process as a scheme run by
incompetents to prevent software creativity. The patent office does not defend
patents issued -- ordinarily the inventor must do that -- but it will make an
effort to understand the often-arcane languages used by applicants, and to
issue patents when evidence indicates a possible advance in the art. Neither
the office nor the applicant can say with certainty that an advance did occur.
And the inventor who holds a letter patent is granted some control over the
use of his invention for 18 years, but none thereafter.
I wonder if the letter patent or the copyright is the correct basis of a
structure to legally sanction software. My preference would be a scheme
analogous to the copyright of music. The author of a protected work would
receive royalties from varying instantiations of his work, and mere changes in
the language or identifiers used would not suffice to escape the copyright.
But I think we need some assurance that royalty fees will be reasonable, so
that programmers can correctly assume use of copyrighted techniques as an
element of their own work.
In the case of software developed at public universities, perhaps the
taxpayers should receive rights to it. Surely those who claim MIT will not be
able to authorize free use of its faculty's software are crying wolf. And if a
student makes a contribution, does he lose the rights of the citizen or
resident?
Back to Floyd's dictionary: Remember that the CRC algorithm is "bit-oriented"
and only the significant bits of the key should be run through, thus avoiding
trailing nulls and unused eighth bits. Also note that one may adjust the
length of the CRC to be computed, by making the divisor one bit-position
longer than the desired result.
In Floyd's application, for example, to set 4 bits in a table of 256 bytes
requires a bit address of 8 + 3 bits length, and four instances of such a
pseudorandom 11-bit number. To satisfy such a request, one could generate a
CRC of 15 bits length, and take from it four differing groups of 11 bits. But
upon such action, some bits will have been used four times, and others merely
three. Thus, theory would indicate that it caused a decrease in entropy of the
information. One may generate a CRC of length n by using a (preferably prime)
divisor with 1 in the bit positions n and 0. The result is then in positions
<n - 1> .. 0 (if we're shifting right, toward the little end).
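The adjustable-length, bit-oriented CRC described above can be sketched in C
as follows. This is only an illustration under stated assumptions: the
function name crc_right and its parameters are hypothetical, the register
starts at zero, and the feedback arrangement is the common right-shifting
("reflected") bitwise form, which may differ in detail from the one used in
Floyd's article.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: CRC of width n (1..31) over the first nbits bits of
   key, taking the least-significant bit of each byte first. The divisor is
   n+1 bits wide, with 1s in bit positions n and 0 as the letter suggests.
   The result occupies bit positions n-1..0. */
uint32_t crc_right(const uint8_t *key, size_t nbits, unsigned n,
                   uint32_t divisor)
{
    /* Drop bit 0 of the divisor; the remaining n bits are the feedback taps. */
    uint32_t taps = (divisor >> 1) & ((1u << n) - 1u);
    uint32_t crc = 0;

    for (size_t i = 0; i < nbits; i++) {
        unsigned bit = (key[i / 8] >> (i % 8)) & 1u;  /* next key bit */
        unsigned feedback = (crc ^ bit) & 1u;
        crc >>= 1;                                    /* shift right */
        if (feedback)
            crc ^= taps;                              /* "subtract" divisor */
    }
    return crc;
}
```

With the 17-bit divisor 0x14003 and all 72 bits of the nine-character key
"123456789", this produces 0xBB3D, the published check value for the CRC-16
variant often catalogued as CRC-16/ARC. Passing a smaller nbits is how
trailing nulls or unused eighth bits would be skipped.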
Another problem found with some CRCs is that the divisor is (wrongfully?)
chosen to be symmetric with regard to bit position, so that keys with the same
word-length as the CRC length can be shifted in from either end. This is the
old little-endian, big-endian micro-mainframe struggle. But a divisor with
asymmetric placement of 1s and 0s will permit calculation of a CRC which is
sensitive to the shift-in order of the key.
Should Floyd's existence table be used to statistically reduce the number of
time-consuming "complete" searches, one could accept some failures of the
technique used, as the wasted futile searches would be balanced out by the
speed of the preliminary.
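A minimal sketch of such a preliminary check, assuming the 256-byte table and
four bits per key discussed above. The names exist_table, exist_add, and
exist_maybe are hypothetical, and the salted hash is a simple stand-in chosen
for brevity; it is not the CRC from Floyd's article.

```c
#include <stdint.h>
#include <string.h>

#define TABLE_BYTES 256u
#define TABLE_BITS  (TABLE_BYTES * 8u)  /* 2048 bits, so 11-bit addresses */
#define PROBES      4u                  /* bits set per key */

typedef struct { uint8_t bits[TABLE_BYTES]; } exist_table;

/* Stand-in hash (FNV-1a style, salted per probe); an assumption, not a CRC. */
static uint32_t probe_index(const char *key, uint32_t salt)
{
    uint32_t h = 2166136261u ^ salt;
    for (const unsigned char *p = (const unsigned char *)key; *p; p++) {
        h ^= *p;
        h *= 16777619u;
    }
    return (h ^ (h >> 11) ^ (h >> 22)) % TABLE_BITS;
}

void exist_init(exist_table *t)
{
    memset(t->bits, 0, sizeof t->bits);
}

/* Mark a key as present by setting its four bits. */
void exist_add(exist_table *t, const char *key)
{
    for (uint32_t i = 0; i < PROBES; i++) {
        uint32_t b = probe_index(key, i);
        t->bits[b >> 3] |= (uint8_t)(1u << (b & 7u));
    }
}

/* Returns 0 if the key was definitely never added, 1 if it may have been. */
int exist_maybe(const exist_table *t, const char *key)
{
    for (uint32_t i = 0; i < PROBES; i++) {
        uint32_t b = probe_index(key, i);
        if (!(t->bits[b >> 3] & (1u << (b & 7u))))
            return 0;
    }
    return 1;
}
```

A caller would run the expensive complete search only when exist_maybe
returns 1. A 0 proves absence; a 1 is occasionally wrong, which is the
wasted search the paragraph above accepts.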
My variation of Floyd's algorithm initializes the "existence" table to a
suitable proportion of 1s and 0s (100 percent of either being special cases),
and then uses a marking scheme which forces some bits up and others down,
according to the key. Rather than returning proof of nonexistence or
probability of existence, the modified algorithm can be designed to return a
statistical indicator of probability which is suited to the application. This
estimator is always correct for the most recent entry, but can be expected to
deteriorate in accuracy as more and more entries are written over it. However,
even very "noisy" entries return a value potentially useful.
If Floyd's technique has been known for a long time, at least in theory,
perhaps it is traditional. But it was new to me, and I thank him for
publishing his findings.
Jon W. Osterlund
Greeley, Colorado
(Editor's note: Patents are valid for 17 years, not 18.)
Dear DDJ,
Lawyers! Win friends and influence people whilst making a killing: Patent your
arguments.
Just imagine:
The royalties!
The cases you alone can try!
How your patent can either make or break a case!
The utility of forcing adversaries to pay up even if you lose!
Getting a piece of the governmental pie!
Carve out your niche today! Yes, there is a future beyond ambulance chasing
and trust management.
Like arithmetic, logical argument has been around for a long time. The courts,
like computers, operate under rigid rules. Lawyers act like programs within
the machine by using arguments built step by step. The application of an
argument may be content-dependent (such as this one), or it may be generalized
into a set-piece (this same argument when applied to, say, chess or football).
Just strip out the terms and use variables: argument = chess move or football
play; lawyer = grandmaster or 300 lb. fella; court = chess game or football
game.
The point is, of course, that an argument can be patented just like an
algorithm. I have not patented this argument, but I might. Until then,
consider it as prior art.
Frederick Hawkins
Allentown, Pennsylvania
Dear DDJ,
I have just tried to read the article on software patents in your November
1990 issue. I could not finish it because my blood was beginning to boil. I
was dumbfounded to see such simple algorithms being patented.
I wonder how many professors of computer science know that the XOR technique
for cursors is patented. This algorithm is standard stuff for graphics
classes. Every year, thousands of computer programming students break the
patent when they write a simple cursor routine.
Computer algorithms must fall under the same rules as mathematical formulas
and physical laws. In the past, when a person invented a new formula or way of
doing math, they would get credit for its invention, but they wouldn't dream
of patenting it.
Could you imagine what kind of a world we would have if Sir Isaac Newton
patented his invention, calculus? Every time you wanted to solve some math
problem, you would have to send Sir Isaac some money, or buy a site license.
As Mr. Kapor mentions in his article, computer software is built upon
mathematical foundations. Programs are akin to long, complex Boolean
statements. How can one patent a formula or equation? (I guess the people at
the Patent Office never saw the movie Young Einstein.)
I guess what determines what is patentable is what you are able to "sell" the
Patent Office. If it looks new and unique to them, it must be so.
Timothy C. Swenson
Alexandria, Virginia


...for the Professional Programmer


Dear DDJ,
When I saw the one-inch high letters on the cover of your January 1991 issue
announcing the "Software Design" theme, I grabbed a copy, eager to find out
what you had to say on this critical issue. I enjoyed Michael Hagerty's piece
on the use of a CASE tool to rescue a southbound system design. (I even
circled the appropriate cell on the reader service card.)
I am getting really frustrated with many aspects of the software development
business, especially in the world of business applications: incompetent
managers placed in charge by senior executives who know little (or less) about
the software development life cycle (Senior Exec: "I'll put Paul in charge:
he's an accountant, but he did something on the installation of our general
ledger package, so he must know all about computers ..."); incompetent,
learned-it-on-the-job programmers ("Structured what? I've been in this
business for twenty years. Nobody can teach me anything about programming
..."); absurd project schedules ("Complete specifications before you start
coding? No way, there isn't time. Start coding now or you won't make your
deadline ..."); et cetera, ad nauseam.
What will it take before software engineering is considered a real profession,
requiring completion of a standard university curriculum and subsequent
licensure, before putting code in a buffer for money? Software architects, as
Mitch Kapor advocates, certainly, and soon; but I could only nod in sympathy
with Michael Hagerty (grimly, mind you) when he pointed out that his system
developer "was apparently unfamiliar with the most basic principles of
software engineering." Why are people like that employed in this discipline in
this day and age? How would society feel about neurosurgeons "apparently
unfamiliar with the most basic principles of" medicine?
The use of CASE tools to represent the design of a system should be an
industry standard, among many others. They're mature, stable, and well worth
the investment, unless of course, you're one of those pseudocoders who never
understood what CASE tools were for. Yes, I wax sarcastic, but would a
contractor consider the construction of a building designed by someone who
couldn't produce working drawings in accordance with professional
architectural standards?
Clearly, good design is just as important to software as it is to commercial
aircraft or artificial hearts. But I believe it should be considered in the
context of the larger issue: the "professionalization" of this discipline. DDJ
is a respected magazine, a voice that is heeded by programmers. I hope to hear
it much louder in favor of professional software engineering standards
whenever and wherever discussions of this onerous problem occur.
Andy P. Bender
Riverdale, Maryland


Standing Before the Altair


Dear DDJ,
Thank you for publishing the protest of Jonathan Titus (Letters column, March
1991) about the Mark-8 being created and published in the July 1974 issue of
Radio-Electronics, six months ahead of the famous Popular Electronics headline
cover "World's First Minicomputer Kit to Rival Commercial Models ... 'Altair
8800' SAVE OVER $1000" (I have it in front of me as I write, preserved in a
plastic bag.)
Popular Electronics was the most popular electronics magazine in the world. I
recall my extreme frustration in even trying to find Radio-Electronics in
libraries in Dallas when looking for referenced articles. The SMU Technical
Library didn't have it. I was unable to find it in book stores.
The Altair 8800 actually led some place, being a direct line to the CP/M-based
machines that dominated the market until the Apple II took its slice (while
often running CP/M itself). It established a standard card connection bus
(however badly arranged) that still has uses.
I think the Mark-8 is like the Langley flying machine that flew into the river
after catching on its launcher. It could fly, but it had a bad test, and the
Wright brothers not only flew, but they proved they could control the plane,
and then sold it to a job.
I never saw the Mark-8, but Mr. Titus's letter makes clear that "about a
thousand circuit board kits were sold," plus sets of hard-to-get parts. I have
to assume that it had no case and looked like a bunch of parts, rather than
the apparently usable computer on the cover of Popular Electronics. The Altair
included all parts for $397, could be bought assembled for $498, and I saw it
running both at our meetings and at the Altair Store that opened later in
town. A lot more than a thousand were sold and it generated clones that sold
even more widely (IMSAI). A lot of people believed in the Altair and the
Computer Hobbyist Group-North Texas blossomed after the Popular Electronics
article, not after that in Radio-Electronics.
Mike Firth
Dallas, Texas


Problem with the 80387 Chip


Dear DDJ,
I have a problem: A certain floating point multiply instruction does not work
correctly on my 80386/80387-based AT clone machine. This has been tested on
387s in machines of four different manufacturers and they have all failed. It
has also been tested on several 287s and 8087s and they have all worked
correctly.
The problem instruction was first found after compiling the program CNEWTON3.C
from the book Fractal Programming in C, by Roger T. Stevens, using Borland's
Turbo C 2.0. The program's screen output in certain regions was a solid brown
color when it should have been varying shades of blue. By using the debugging
aids of Turbo C, the problem was traced to a double-precision floating point
multiply instruction compiled from these lines of C code:
 Xsquare = X * X;
 Ysquare = Y * Y;
 denom = 3 * ((Xsquare - Ysquare) *
 (Xsquare - Ysquare) +
 4 * Xsquare * Ysquare);
The second multiply in 4 * Xsquare * Ysquare became a no-op in cases where Y
was less than 2^-1022, but greater than zero. This happened whenever X,Y was
trying to converge to X = 1.0, Y = 0.0. This would prevent the convergence;
the number of iterations maxed out at 64 and the color brown was assigned.
After understanding the symptoms of the problem, I wrote a small program
containing the lines
 double X, Y1, Y2;
 X = 4.45014e-308;
 X = X / 2.0;
 Y1 = 1.0 + X * 4.0;
 Y2 = 1.0 + 4.0 * X;
The value of Y1 is computed correctly to be 1.0, but the value of Y2 is
computed erroneously to be 5.0. An equivalent program was written in Turbo
Pascal 5.0 and it did not fail in either case.
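For readers who want to repeat the experiment, the fragment can be wrapped in
small functions like the ones below. This is a sketch, not the letter
writer's program: the function names are hypothetical, and the volatile
qualifiers are an assumption added to discourage a compiler from folding the
arithmetic at compile time. Whether the multiply is issued as an x87 fmul
with a memory operand depends on the compiler and its options.

```c
#include <float.h>

/* Half of 4.45014e-308: positive, but below the smallest normalized double. */
double denormal_operand(void)
{
    volatile double x = 4.45014e-308;
    return x / 2.0;
}

/* Denormal operand on the left of the multiply (the letter's Y1). */
double sum_x_times_4(double x)
{
    volatile double v = x;
    return 1.0 + v * 4.0;
}

/* Denormal operand on the right of the multiply (the letter's Y2). */
double sum_4_times_x(double x)
{
    volatile double v = x;
    return 1.0 + 4.0 * v;
}
```

On a correctly working floating-point unit both functions return 1.0 for the
denormal operand, because the product is far too small to change the sum. A
result of 5.0 from sum_4_times_x would reproduce the failure reported above.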
To understand the problem further, I used Turbo debugger 1.0 to trace the code
at assembly level. The number 4.45014e-308 becomes 001F FFFC 5D02 B3A1 in
64-bit floating point. The leading 001 holds the sign bit and an 11-bit biased
exponent. The mantissa is actually 1F FFFC 5D02 B3A1, but the leading 1 is not
stored since it is known to be one. When this number is divided by 2.0 the
result is 000F FFFE 2E81 59D0.
The biased exponent is now zero but the mantissa also changed because floating
point numbers with a biased exponent of zero must have all bits of the mantissa
stored in them. (This is because they are not necessarily normalized.) If this
number were further divided by 2.0, the biased exponent would stay zero and
the mantissa would shift off to the right and become unnormalized.
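The field layout described above can be inspected without a debugger. The
sketch below assumes the host stores a double in the 64-bit IEEE format; the
type and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11-bit biased exponent */
    uint64_t fraction;  /* 52 stored mantissa bits (13 hex digits) */
} double_fields;

/* Split a double into its sign, biased exponent, and stored mantissa bits. */
double_fields split_double(double d)
{
    uint64_t bits;
    double_fields f;

    memcpy(&bits, &d, sizeof bits);  /* copy the raw 64-bit pattern */
    f.sign = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FFu);
    f.fraction = bits & ((((uint64_t)1) << 52) - 1u);
    return f;
}
```

Applied to 4.45014e-308, the biased exponent comes out as 1; applied to half
that value, it comes out as 0 with a nonzero fraction, which is the operand
condition associated with the failure.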
When the Turbo C code that fails was compared with the Turbo Pascal code that
works correctly, it was found that Turbo C generated a 3-byte fmul
instruction, while Turbo Pascal generated a 4-byte fmul instruction.
 C:      cs:02A6 DC4EE8   fmul qword ptr[bp-18]
 Pascal: cs:0176 DC0E3E00 fmul qword ptr[MAIN.X]
Apparently, the necessary and sufficient conditions for the failure are: (1)
an 80387; (2) the 3-byte form of fmul; (3) the double- or single-precision form
of fmul; and (4) an operand in RAM with a biased exponent of zero and a nonzero
mantissa. The nature of the failure is that the fmul instruction becomes a
no-op.
Harry J. Smith
Mountain View, California


CRC Solution


Dear DDJ,
I was very interested in "Designing an OSI Test Bed," by Ken Crocker, which
appeared in your December 1990 issue. I work with SDLC and the SCC, and wrote
a BITBUS driver using the original Zilog SCC (both Z85C30 and Z80C30).
Mr. Crocker wrote that he had problems with CRC checking. I have had similar
problems in the past, and I have found an unexpected solution which might work
with the Intel and his code, as well.
In SDLC mode, there is no need to give the 'ENTER HUNTMODE' command at any
time (not even in the Init routine)! The SCC will manage that for you. Let it
do it; it knows what it's doing.
According to the Zilog manual, the RxCRC enable bit in register 3 of the SCC
should be ignored in SDLC mode, but this seems not to be true in every case.
My experience indicates that it is a good idea to leave this bit off (i.e.,
low). The SCC is quite a diva, and you have to think in curves and nodes to
get it to work.
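In code, the advice amounts to building the value for register 3 with both of
those bits clear. The sketch below is illustrative only: the constant and
function names are hypothetical, the bit positions follow the commonly
documented layout of SCC write register 3 and should be checked against the
Zilog manual, and the choice of 8-bit characters is an assumption. Writing the
value to the chip is hardware specific and is left out.

```c
#include <stdint.h>

/* Assumed bit assignments for SCC write register 3 (verify in the manual). */
enum {
    WR3_RX_ENABLE     = 0x01,  /* receiver enable */
    WR3_RX_CRC_ENABLE = 0x08,  /* RxCRC enable: left off in SDLC mode */
    WR3_ENTER_HUNT    = 0x10,  /* Enter Hunt Mode command: never issued */
    WR3_RX_8_BITS     = 0xC0   /* 8 bits per received character */
};

/* Value for register 3 when receiving SDLC frames, following the letter. */
uint8_t wr3_for_sdlc_receive(void)
{
    uint8_t value = (uint8_t)(WR3_RX_8_BITS | WR3_RX_ENABLE);

    /* Make the intent explicit: neither RxCRC enable nor Enter Hunt. */
    value &= (uint8_t)~(WR3_RX_CRC_ENABLE | WR3_ENTER_HUNT);
    return value;
}
```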
By the way, Zilog has announced a new version of the SCC called Z85C130 ESCC.
This version includes a lot of improvements (deeper receive and transmit FIFO)
which should make it easier to use.
A general question to the experts: Why does everyone use the 85C30 with
processors like the 80x86? The Z80C30, with its multiplexed Address/Data bus
is really the better solution.
Volker Goller
Aachen, Germany


ERRATA


In Listing One of the April 1991 "C Programming" column, the source code at
the bottom of the second column on page 150 should read:
/* -- attach vectors to resident program -- */
setvect (KYBRD, newkb);
setvect (INT28, new28);

















































June, 1991
FORTRAN & GUIS


Putting your best interface forward




John L. Bradberry


John is a senior research engineer with the Georgia Tech Research Institute,
where he specializes in radar and antenna research. He is also development
manager for Scientific Concepts and can be reached there at 2359 Windy Hill
Road, Suite 201-J, Marietta, GA 30067.


Since its creation in 1957, the Fortran language has been used in applications
well beyond its original scope of FORmula TRANslation. Indeed, computers,
operating systems, compilers, and peripherals have grown in complexity and
capability at an almost exponential rate. While Fortran evolved from Fortran
66 to Fortran 77 (and hopefully soon to Fortran 90), a technological explosion
in the ability to select and display large amounts of data with incredible
speed and accuracy also occurred.
The basic I/O of CRTs has been supplemented with a wide variety of pointing
and digitizing devices -- such as high-resolution scanners and graphics
tablets -- with increasing demands for memory and speed. Simple operator
interfaces and menus have expanded in scope to include these capabilities and
requirements. As a by-product of this, a more universal concept called the
Graphical User Interface (GUI) emerged. GUI-based applications span a wide
range of implementations: from text-based pop-up windows to popular MIT
X-Window-based products in Unix environments to Microsoft Windows and many
other custom window systems. In the context of this article, GUI means a
high-level bit-mapped menu system with many features common among CPU
platforms and operating systems. Options are graphically represented and
selections are made using a mouse or pointing device, with emphasis on reduced
keystrokes and increased process, task and event monitoring.
Many applications written in Fortran (and other high-level languages) work
perfectly well but could use a GUI "face lift." Perhaps the first thought is
to give these applications a "cute" and more standard "windows look" by
creating your own custom library of windows functions or interfacing with
existing software tool kits.
In this article, I'll explore some of the options available to Fortran
programmers in this area. Many programmers assume that because Fortran was not
originally designed to write drivers, the only option available for creating a
windows interface is to learn C and use the Windows SDK or X Window toolkit
libraries. As you will see, low-level hardware and driver interface is
certainly required for even the simplest of window interfaces, but by no means
does this exclude Fortran applications from direct window support! I'll use
programming examples and consider a few language factors as I explore options
in creating more exciting menu displays that support Fortran-based
applications. Regardless of whether you intend to design your own interface or
shop around for the least painful windows interface kit, the information
presented here will help you make a more informed decision.


Preparing Your Code for a GUI System


Prior to implementing a GUI system, a fundamental change in how you structure
your program to handle parameter modifications might be in order. A menu-based
implementation should minimize the number of questions asked and answered by
the user. In addition, user mistakes (bad input) should be trapped and rarely
allowed to reach process tasks. In some cases, the windows environment itself
seeks complete control of when and how any process or task interfaces with the
user.
Windows programming tends to be more event driven than procedural, and a few
basic steps should be taken to make program conversion a little easier:
Centralize your application parameters as much as possible. In a menu system,
parameters or variables are displayed in related groupings so that a user can
see the current value at a glance and make changes by exception. By
centralizing the routines for setting/displaying these global variables,
functions or subroutines can be used as entry points for this process.
Group parameters based on association. As an extension of the centralization
process, related variables can be placed in subroutines representing one or
more submenus, as required. When the user clicks on an icon or widget to make
a selection, the location of a pointing device or the cursor position can be
used to uniquely identify a variable from the list. In this case, one
subroutine sets all related variables and another subroutine performs the
required operation based on the current values of the variables.
Utilize defaults and automatic variable initialization. At program startup or
on demand, every variable of consequence should be given a meaningful initial
value. This ensures that the user can operate quickly and safely without
having to set every single variable in the system.
Substitute generic I/O routines for inline READ, WRITE, or OPEN statements.
The usual approach to controlling user input consists of a program statement
such as WRITE(LogicalUnit,...) followed by another program statement such as
READ(LogicalUnit,...). This method should be replaced with a more generic
function such as PROMPT(LogicalUnit,...). This provides more uniform program
code appearance and makes modifying the I/O process much easier.
These ideas represent a few standard and modular structured programming
techniques. There's a good chance your code already conforms in some or all of
these areas. If not, you should seriously consider making such changes to your
program(s) prior to any modifications for a GUI interface.
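As a rough illustration of these steps, here is a minimal sketch in Python rather than Fortran, chosen only to keep the example short. The names BellParams and prompt are hypothetical and do not come from the listings. Parameters live in one group with defaults, and all user input passes through a single routine that traps bad entries.

```python
import io
from dataclasses import dataclass

@dataclass
class BellParams:
    """Hypothetical central parameter group; every field has a default."""
    count: int = 1
    value: float = 50.0

def prompt(question, default, convert, inp, out):
    """Ask one question on out, read the reply from inp.

    An empty reply keeps the default; a reply that fails conversion
    is trapped and the question is asked again.
    """
    while True:
        out.write(f"{question} [{default}]: ")
        line = inp.readline()
        text = line.strip()
        if not text:              # empty reply or end of input
            return default
        try:
            return convert(text)
        except ValueError:
            out.write("Invalid entry, please try again.\n")
```

Because the input and output streams are passed in, the same routine can later be pointed at a window instead of the console without touching the callers.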


What's Required in a Language to Support GUI Development?


There are actually at least two stages of operator interface development
beyond the simple "question --> wait for answer" type of program control. We
can loosely define text-based pop-up windows as a simpler form of GUI
interface for displays limited to characters instead of high-resolution bit
maps. Figure 1 is an example of a typical text-based pop-up menu display.
With this technique, the user selects choices highlighted by horizontal or
vertical bars. The input and pointing devices recognized can be anything from
mice, keyboards, or digitizer pads to light pens. As we move to more
sophisticated GUI interfaces such as Microsoft Windows or Unix X Window, the
list of support functions grows in number and complexity. Instead of
text-based video memory, we use the bit-mapped graphics area. Each pixel on
the screen is individually controllable. I/O becomes more complicated as
events are distributed across machines and multitasking issues are raised.
Requirements for speed and low-level driver interfaces to memory and hardware
continue to expand the central theme of the design requirements. These factors
cause concern for even a moderately experienced Fortran programmer.
Let's begin by considering the major requirements as if we are going to write
this type of operator interface from scratch. This will take the form of a
windows library of sorts. Because we will assume (for the moment) that we know
all there is to know about windows development (we just haven't written the
code yet), I'll name the new window system the MNIH ("My Not Invented Here")
Windows Toolkit. To support the required features, we will "invent" a few
library functions (top down approach). While the list of functions will not be
complete at this point, it will illustrate some of the basic requirements for
support of graphical and bit-mapped window systems. Our beginning step(s)
include functions in logical groupings such as those in Table 1.
Table 1: Library functions required to implement the window library

 InitGraphicsArea(Returncode)
     Finds and initializes the highest resolution graphics mode available.
     All workstations and video drivers are configured differently and have
     different capabilities. Returns characteristics of the current system
     to the caller.

 GetVideoPage(Pagenum)
     Returns the current video page number Pagenum to the caller for
     initialization and other bookkeeping. A video page refers to a region
     of memory representing characters displayed, and their attributes such
     as color and intensity. The characters are mapped through hardware in
     real time to the CRT screen. Therefore, reading/writing directly
     to/from video memory produces the instant pop-up/down effects in use
     today. Video page functions are primarily used in the pop-up text
     window modes.

 SetVideoPage(Pagenum)
     Sets the display to the video page number Pagenum specified by the
     caller. In DOS-based systems, for example, this could be an integer in
     the range of 0-3 (CGA mode 2 or 3) to select the active display page
     (video services: Interrupt 10h, service 5).

 CopyToVideoPage(Pagenum)
     Copies the current video page to a page specified by Pagenum. Used to
     create the instant pop-up/down window effect and to control the
     parent/child submenu or cascade process.

 ReadWriteMemory(String,Row,Col,Pagenum)
     The character string String is copied to the Row and Column on the
     video page number specified.

 Prompt(Question,Answer)
     Issues a prompt with a default answer string at the current cursor
     position. The user must be able to 'edit' or retype the Answer to
     reply. All required input devices such as a mouse or keyboard are
     scanned for a response.

 ReadKbdNoWait(Keycode)
     Scans keyboard and returns code representing the key pressed. As
     defined, NoWait implies that if no key is pressed, this function must
     return immediately without waiting for a user response.

 ReadMouse(Status,Row,Col)
     Returns mouse cursor position, if present, along with status codes
     representing the button(s) pushed. This function behaves in a fashion
     similar to that of the NoWait condition described in the ReadKbdNoWait
     function.

 OpenCloseVerticalWindow(Menu, Foreground, Background, Status)
     Opens a Menu structure representing choices at the current row and
     specifies column position using the Foreground and Background colors.
     The height of the window box is automatically determined by the number
     of choices and the width by the longest character string in the list.
     A highlighted bar is scrolled vertically up and down the box as each
     choice is examined. The status code is used to open or close the
     window and to return other control codes to the program or process.
     Multiple window requests are stacked, representing various submenu or
     child processes. In bit-mapped modes, both window open functions
     become more complicated. As font sizes vary, the window dimensions
     must adapt accordingly.

 OpenCloseHorizontalWindow(Menu, Foreground, Background, Status)
     Opens a Menu structure representing choices at the current row and
     specifies column position using the Foreground and Background colors.
     The length of the window box is automatically determined by the number
     of choices and the width of the highlight bar changes with each
     selection in the list. The status code is used to open or close the
     window and to return other control codes to the program or process.

 SetWindowDimensions(Height, Width)
     Sets window display height and width using some internal unit of
     measure.

 SetWindowBorder(Title)
     Displays title bar string across top of window.

 SetWindowFont(Fontname)
     Looks up and initializes font (bit-mapped window only) for window
     operations.

 GetWindowEvent(Eventcode)
     Waits for user input from mouse button, keyboard, trackball, and so on.

 SpawnChildProcess(Childname)
     Loads and runs child process from current window. In single tasking
     systems, parent process suspends, pending termination of child. In
     multitasking systems, child and parent may run concurrently.
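
The sizing rule described for OpenCloseVerticalWindow (height from the number of choices, width from the longest string) can be sketched in a few lines. The Python below is only an illustration of that rule for a text-mode box. The function name vertical_menu and the use of ">" to stand in for the highlight bar are assumptions, not part of the MNIH design.

```python
def vertical_menu(choices, selected):
    """Return the text rows of a boxed vertical menu.

    The box height follows the number of choices and the width follows
    the longest choice string. The row at index 'selected' is marked
    with '>' in place of a highlight bar.
    """
    width = max(len(choice) for choice in choices)
    border = "+" + "-" * (width + 2) + "+"
    rows = [border]
    for i, choice in enumerate(choices):
        marker = ">" if i == selected else " "
        rows.append("|" + marker + choice.ljust(width) + " |")
    rows.append(border)
    return rows
```

Calling vertical_menu(["Open", "Plot data", "Quit"], 1) yields five rows, each 13 characters wide, with the second choice marked.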

Figure 2 illustrates a proposed class hierarchy of the MNIH Window System.
(Note: A more complete MNIH implementation could require as few as 50 or as
many as several hundred level 0 and level 1 functions. The numbers depend on
the window modes of operation -- text-based versus bit-mapped -- and the level
of window sophistication desired.)
The hierarchy diagram in Figure 2 illustrates an implied relationship between
the functions of the MNIH library system. As you might suspect, the
higher-level functions (level 2) have a much greater chance of being
implemented entirely in Fortran. However, based on the nature and description
of requirements, the low- and medium-level functions are beyond the capability
and scope of the Fortran 77 language standard.



Program Example: Bell Curve Plot Utility


Listings One and Two (pages 101 and 102, respectively) are the program and
include files for a Gaussian distribution, or bell curve, calculation and
plotting utility called BELL. The program was originally written in ANSI
X3.9-1978 Fortran 77 and later enhanced using the military extensions defined
in the MIL-STD-1753 standard. This standard is supported by many popular
Fortran compiler environments such as Microsoft Fortran 5.0 and VAX Fortran.
The bell program is used to calculate a Gaussian distribution by getting
multiple data values from the user (in the range 0-100), calculating the
distribution constants for the Gaussian equation and plotting the results in
histogram form followed by mean, variance, and other related values. Duplicate
values are accounted for by the value supplied for the number of occurrences.
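The calculation itself is compact. The following Python sketch mirrors the arithmetic in Listing One under stated assumptions: the function names bell_stats and bell_rows are hypothetical, and Python's round() stands in for Fortran's NINT (the two differ only for exact halves).

```python
import math

def bell_stats(samples):
    """samples is a list of (occurrences, value) pairs.

    Returns (n, mean, variance, sigma) using E[x] and E[x**2],
    as the listing does with BCEX and BCEXS.
    """
    n = sum(count for count, _ in samples)
    ex = sum(count * value for count, value in samples) / n
    exs = sum(count * value * value for count, value in samples) / n
    variance = exs - ex * ex
    return n, ex, variance, math.sqrt(variance)

def bell_rows(mean, sigma, max_stars=51, scale=1000):
    """Evaluate the Gaussian from 0 to 100 in steps of 5 as rows of '*'."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    rows = []
    for x in range(0, 101, 5):
        density = norm * math.exp(-((x - mean) ** 2) / (2.0 * sigma * sigma))
        stars = min(round(density * scale), max_stars)
        rows.append((x, "*" * stars))
    return rows
```

For two occurrences of 40 and two of 60, the mean is 50, the variance is 100, and the row for x = 50 holds 40 stars.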
As discussed earlier, the bell program structure contains some modifications
suggested for initial preparation in linking to a GUI interface library, such
as:
Centralize application parameters and Group parameters based on association.
In this example program, the parameters are located in the common include
file. Routines can centrally access and control these parameters, as required.
Substitute generic I/O routines for inline READ, WRITE, or OPEN statements.
User input is controlled through iprompt and drprompt routines for integer and
double-precision reals. For a larger application with more variable types, a
better approach might be to interpret all input as strings and convert it to
the appropriate data types, as needed.
To extend the program as written to add a windows interface structure such as
our MNIH example, the two routines GET_BELL_DATA( ) and PLOT_BELL_DATA( ) in
the main program could be changed. You could call Get_bell_data from an
open_window routine for controlling bell data input and Plot_bell_data could
be called from another open_window routine to present the results of the
Gaussian process.
Interestingly, languages such as C and Ada contain many built-in mechanisms
from which to implement most of the routines at all levels indicated in Figure
2. In C, for example, direct access to memory locations is available in some
form in every implementation. Keyboard and device I/O routines are varied and
provide for many levels of access and control. Many of the window
characteristics are best implemented with structures and records, where mixed
variable types can be organized under one user-defined class. This can be
readily done in C and many other high-level languages, including nonstandard
vendor enhancements to Fortran 77.
Based on these and other factors, most window development software kits
include software interface hooks for C.
At this point, we have sufficiently defined many of the requirements and
language considerations in developing an interface to our MNIH library system.
However, a few questions remain unanswered. If we were interested in
implementing all levels of our MNIH library for Fortran applications, what
options would we have? Do we implement levels 0 - 1 in C or assembler and
fight a mixed-language interface war? Do we punt and wait for Fortran 90? Have
we examined the problem from the proper perspective?


Another View of the MNIH Library and the GUI System


The layered approach to specification of the MNIH system allows examination of
the menu library system's scope. To implement this kind of system, a
blueprint such as Figure 2 could be used. However, a better approach is to
consider the two following facts:
As you might suspect, the MNIH library already exists in one form or another
in commercially available X Toolkits or window development kits. For now, most
cases require a mixed-language approach to supplement Fortran.
At the highest level of interface, the MNIH window creation calls bear a
strong resemblance to the high-level file access routines! If you don't write
drivers for file track and sector manipulation in Fortran, why should it be
required for general-purpose window interfaces?
As GUI systems continue to mature, standard interfaces at the higher levels
will become commonly available. Ideally, vendors supplying compilers
could also supply "extensions" to make low-level MNIH implementations
unnecessary. At the time of writing, Microsoft is the only company I am aware
of that has already begun an aggressive campaign to close the development gap
between Fortran and window interfaces. The newer version of the bell program
running in a Microsoft Windows environment (Figure 3 through Figure 5) shows
just how close a direct Fortran/Windows interface is to becoming reality.
These figures represent the result of windows interface modifications made to
allow the bell curve program to execute in a Microsoft Windows 3.0
environment! The code used fits the patterns suggested earlier, with the
addition of fewer than ten lines of source code! Compare this small
effort to that of implementing the remainder of the MNIH low-level routines,
especially if you are not already comfortable with C or assembler.
The final program changes were made using a beta version of the upcoming
Microsoft Fortran. The following code fragments were the only changes required
to make the bell program windows-based, as shown in the figures.
In the main program, the line added to produce the help box "About Bell" in
Figure 3 was: CALL ABOUTBOXQQ('Bell Curve Program\r Version 2.0'C). In the
subroutines get_bell_data and plot_bell_data, a graphics logical unit was
added to open a child window for I/O. The LU variable was replaced with the GLU
variable in the prompt calls. The code added is shown in Example 1.
Example 1: Code changes required to make the bell program a Windows 3-based
application

 C
 INTEGER GLU !LOGICAL UNIT NUMBER
 C
 GLU=10
 OPEN (UNIT=GLU, FILE = 'USER')
 .
 . (See Listings 1-2 for rest of body)
 .
 C
 CLOSE (GLU, STATUS = 'KEEP')
 C

The STATUS = 'KEEP' part of the close call is used to keep the child windows
in view until the application is finally exited. Microsoft refers to this
level of window interface as "Quick Window." As the first child window is
opened for data values, the window error checking is automatically activated,
as shown in Figure 4. Figure 5 shows final bell data output as the last child
window opened. Note that the window unit numbers match the logical unit
assignments in the program fragments.
For more sophisticated icon and parameter control, additional levels of
interface are made available through this window development system.


Where Best to Put Your Development Efforts!


This article defines a high-level structure for developing custom GUI library
systems. For those truly interested in developing their own window library
system, the blueprint is here to follow. Unfortunately for the Fortran purist,
the Fortran 77 standard simply does not allow this task to be performed
without investing serious effort in writing mixed-language libraries and a lot
of interface code. Happily, this is not the only option available.
A great deal of effort should be spent in providing better structure to
Fortran code prior to the window treatments. The Fortran 90 standard should
make this effort much easier (provided it's universally implemented during
this century!).
Aside from Fortran implementation issues, larger issues concern the nature and
definition of the GUI itself. As GUI interface concepts continue to mature and
become more of a standard device type, the lower-level driver support of
languages other than C will increase. By carefully structuring your software
applications, you can take advantage of any of these cases and increase
program portability without severe changes to most of your code.

_FORTRAN & GUIS_
by John L. Bradberry


[LISTING ONE]

C >**************************************************************
 PROGRAM BELL

C **************************************************************
C AUTHOR: JOHN L. BRADBERRY CREATION DATE: FEB 15,1989
C UTILITY TO CREATE A BELL CURVE DATA 'PLOT' BY READING IN A SERIES
C OF NUMBERS IN THE RANGE OF 0-100. THE NUMBERS ARE USED TO CREATE
C THE GAUSSIAN DISTRIBUTION CONSTANTS. THE CONSTANTS ARE THEN USED TO
C CALCULATE A NORMAL DISTRIBUTION FROM 0 TO 100 IN STEPS OF 5. '*' ARE
C PLOTTED IN HISTOGRAM FORM TO SIMULATE BELL SHAPE.
C --------------------------------------------------------------
C
 IMPLICIT NONE
C
 INCLUDE 'BELLCOM.INC'
C
C
 INTEGER*2 LU !LOGICAL UNIT NUMBER
C
 LU=6
C
C INITIALIZE BELL CURVE DATA (CONTAINED IN COMMON)...
C
 BCIDX=0
 BCTOT=0
 BCEX=0
 BCEXS=0


C GET BELL CURVE VALUES FROM USER TO BE USED FOR CALCULATIONS...
C
 CALL GET_BELL_DATA(LU)
C
C CALCULATE CONSTANTS FOR GAUSSIAN DISTRIBUTION AND PLOT BELL CURVE
C USING THE '*' CHARACTER...
C
 CALL PLOT_BELL_DATA(LU)
C
C
 END
C
C >**************************************************************
 SUBROUTINE GET_BELL_DATA(LU)
C **************************************************************
C SUBROUTINE TO COLLECT BELL CURVE DATA POINTS FROM USER...
C --------------------------------------------------------------
C AUTHOR: JOHN L. BRADBERRY CREATION DATE: FEB 8,1989
C
 IMPLICIT NONE
C
 INCLUDE 'BELLCOM.INC'
C
C
 INTEGER*2 I !LOOP INDEX COUNTER
 INTEGER*2 LU !LOGICAL UNIT NUMBER
 INTEGER*2 BCCOUNT !BELL CURVE DATA POINT COUNT
C
C
 BCCOUNT=1
 DO WHILE (BCCOUNT.GT.0)
C
 CALL IPROMPT(LU,'Enter Number Of Occurrences Next Data Point '//
 + 'Value (Or 0 To Exit).',BCCOUNT)
 IF (BCCOUNT.GT.0) THEN
 CALL DRPROMPT(LU,'Enter Data Point Value (Range 0-100):',
 + BCDAT)
 END IF
C
 IF (BCCOUNT.GT.0) THEN
 DO I=1,BCCOUNT
 BCIDX=BCIDX+1
 BCTOT=BCTOT+BCDAT
 END DO
 BCEX=BCEX+BCCOUNT*BCDAT
 BCEXS=BCEXS+BCCOUNT*BCDAT*BCDAT
 END IF
 END DO
C
C

 RETURN
 END
C
C >**************************************************************
 SUBROUTINE PLOT_BELL_DATA(LU)
C **************************************************************
C SUBROUTINE TO CALCULATE AND PLOT THE BELL CURVE...
C --------------------------------------------------------------
C AUTHOR: JOHN L. BRADBERRY CREATION DATE: FEB 8,1989
C
 IMPLICIT NONE
C
 INCLUDE 'BELLCOM.INC'
C
C
 INTEGER*2 LU !LOGICAL UNIT NUMBER
 INTEGER*2 KX !LOOP INDEX COUNTER
 INTEGER*2 STARCOUNT !NUMBER OF STARS TO OUTPUT IN BELL
 INTEGER*2 MAXSTARS !MAXIMUM STARS IN CHARACTER STRING

 PARAMETER (MAXSTARS=51)

 CHARACTER STARS*51 !STRING 'STAR' ARRAY

 REAL*8 RVAL1 !TEMPORARY
 REAL*8 RVAL2 !TEMPORARY
 REAL*8 DEGRAD !DEGREES TO RADIAN CONVERSION
C
C
 STARS='***************************************************'
C
 DEGRAD=3.141592654D0/180D0
C
 IF (BCIDX.GT.0) THEN
 BCEX=BCEX/BCIDX
 BCEXS=BCEXS/BCIDX
 BCMEAN=BCEX
 BCVAR=BCEXS-BCEX*BCEX
 BCSIGMA=SQRT(BCVAR)
 END IF

C
C BELL CURVE FORMULA...
C
C 1/(SIGMA*SQRT(2*PI))*EXP(-(X-MEAN)**2/(2*SIGMA**2))
C
 RVAL1=1.0D0/(BCSIGMA*SQRT(2.0D0*3.141592654D0))
 DO KX=0,100,5
 RVAL2=RVAL1*EXP(-1.0*((KX-BCMEAN)**2)/(2.0*BCSIGMA*BCSIGMA))
 RVAL2=1000*RVAL2

 STARCOUNT=MIN(NINT(RVAL2),MAXSTARS)
 WRITE(LU,*)KX,' ',STARS(1:STARCOUNT)
 END DO
C
 WRITE(LU,'(/,1X,A10,I2,2X,3(A10,F8.3,2X))')
 + '# POINTS= ',BCIDX,'MEAN= ',BCMEAN,'VARIANCE= ',
 + BCVAR,' SIGMA= ',BCSIGMA
C
C
 RETURN
 END
C
C >**************************************************************
 SUBROUTINE IPROMPT(LU,PROMPT,IVAL)
C **************************************************************
C SUBROUTINE TO PROMPT USER FOR INTEGER VALUE...
C --------------------------------------------------------------
C AUTHOR: JOHN L. BRADBERRY CREATION DATE: FEB 8,1989
C
 IMPLICIT NONE
C
 INTEGER*2 IVAL !INTEGER VALUE RETURNED
 INTEGER*2 LU !LOGICAL UNIT NUMBER
C
 CHARACTER*(*) PROMPT !STRING PROMPT TO BE ISSUED
C
C
 WRITE(LU,*)PROMPT
 READ(LU,*)IVAL
C
C
 RETURN
 END
C
C
C >**************************************************************
 SUBROUTINE DRPROMPT(LU,PROMPT,DRVAL)
C **************************************************************
C SUBROUTINE TO PROMPT USER FOR DOUBLE PRECISION REAL VALUE...
C --------------------------------------------------------------
C AUTHOR: JOHN L. BRADBERRY CREATION DATE: FEB 8,1989
C
 IMPLICIT NONE
C
 INTEGER*2 LU !LOGICAL UNIT NUMBER
C
 CHARACTER*(*) PROMPT !STRING PROMPT TO BE ISSUED

C

 REAL*8 DRVAL !REAL VALUE RETURNED
C
C
 WRITE(LU,*)PROMPT
 READ(LU,*)DRVAL
C
C
 RETURN
 END
C





[LISTING TWO]


C -----------------------------------------------------------
C BELL CURVE CONTROL COMMON ...
C -----------------------------------------------------------
C
 INTEGER*2 BCIDX !BELL CURVE INDEX
C
 REAL*8 BCMEAN !BELL CURVE MEAN
 REAL*8 BCEX !BELL CURVE EX TERM
 REAL*8 BCEXS !BELL CURVE EX TERM SQUARED
 REAL*8 BCTOT !BELL CURVE TOTAL
 REAL*8 BCDAT !BELL CURVE DATA
 REAL*8 BCVAR !BELL CURVE VARIANCE
 REAL*8 BCSIGMA !BELL CURVE SIGMA
C
C
 COMMON /BELLCURVE/
C
 +BCIDX,
 +BCMEAN,
 +BCEX,
 +BCEXS,
 +BCTOT,
 +BCDAT,
 +BCVAR,
 +BCSIGMA
C
































































June, 1991
USING THE REAL-TIME CLOCK


Faster time routines for Turbo Pascal




Kenneth Roach


Kenneth is an engineer for Unisys. He can be contacted at P.O. Box 2271,
Manteca, CA 95336.


When recently faced with the need to perform processing based on seconds
elapsed, I was disappointed at the types of time-related functions Turbo
Pascal provided. What I needed was a routine which would return the elapsed
time in seconds since a base date. That is, a Pascal routine similar to the
time function as defined for ANSI C. This was not available, and I was faced
with either doing calls to the Turbo Pascal GetTime procedure and manipulating
the value returned, or inventing an equivalent of the time function for Turbo
Pascal.
I quickly wrote a Pascal version of the time function that proved
satisfactory, though tests indicated that its performance was less so. The
Time procedure was called around 8000 times in a five-second period on the
system I was using--a 25-MHz 80386 PC running under MS-DOS. I then began
efforts to improve the procedure's performance. The first attempt involved
eliminating as many long integer calculations as possible; some remained,
however, because the value returned is a long integer. This improved
performance, though the procedure still seemed to take more processing time
than it should have.
The system I was using had an AT-compatible real-time clock, so it seemed
reasonable to test usage of this clock with the Time procedure. Calls to the
Pascal GetTime and GetDate procedures were replaced with direct reads of the
values maintained by the real-time clock. The performance improvement was
startling. Using the real-time clock, the newly created Pascal Time function
was faster than the compiler's own GetTime procedure, even though this Time
function had much more to do.
After some thought, this made sense. Turbo Pascal is designed to generate
programs which can be run on any DOS-based system, including those using the
8088 processor. It makes no assumptions about the type of hardware available.
Instead, it relies on standard MS-DOS time and date information provided by
the 8253 timer chip, regardless of what might be available. Performance
suffers, it seems, for the sake of compatibility.
Later tests with Turbo C provided similar results. In a five-second period,
the standard C time function was called some 11,000 times, and the gettime
function around 34,000 times. These counts are very similar to those of Turbo
Pascal, so it seems that the Turbo C routines, too, obtain time and date
information via the 8253 timer chip.
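The measurement used in these tests, counting how many calls complete in a fixed period, is easy to reproduce. The sketch below is in Python and is not the original test program; calls_per_period is a hypothetical name, and the counts it produces on a modern system are not comparable with the figures quoted above.

```python
import time

def calls_per_period(func, period_s):
    """Count how many times func() completes before period_s seconds elapse."""
    deadline = time.perf_counter() + period_s
    calls = 0
    while time.perf_counter() < deadline:
        func()
        calls += 1
    return calls
```

For example, calls_per_period(time.time, 5.0) counts calls to the library time function over five seconds. The loop overhead is included in the count, so the result is best used to compare routines rather than as an absolute figure.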
Because of the real-time clock's superior performance, I decided to create a
set of Turbo Pascal (and Turbo C) routines that use this clock.


Accessing the Real-Time Clock


The real-time clock function is provided by a Motorola MC146818 chip
located on the motherboard. Information from this clock is stored in
battery-backed memory. This memory is accessible by programs through port
addresses $70 and $71. Locations in the memory relating to the real-time clock
are described in Table 1. To read a memory location, it is necessary to first
place the location's address into register $70 and then read the data from
register $71. Writing to the memory is similar. The location to be written to
is first placed into register $70, and the data to be written is then placed
into register $71.
Table 1: Real-time clock memory locations

 Location Description
 -------------------------------------------------------------------------

 $00: Current time (second)
 $01: Alarm time (second)
 $02: Current time (minute)
 $03: Alarm time (minute)
 $04: Current time (hour)
 $05: Alarm time (hour)
 $06: Day of week
 $07: Day of month
 $08: Month (1-12)
 $09: Year, relative to century
 $32: Century
 $0a: Status Register A:
 Bit 7: Indicates update of time is in progress if set.
 Bit 6-4: Time frequency. Default is 010, or 32.768 KHz.
 Bit 3-0: Interrupt frequency. Default is 0110, or 1.024 KHz.
 $0b: Status Register B:
 Bit 7: Set clock: If set, the program can initialize the
 14 time-bytes. No updates will occur until the bit is reset.
 Bit 6: Periodic Interrupt Enable. If set, enables interrupt
 according to the parameters in register A.
 Bit 5: Alarm Interrupt Enable. If set, enables alarm
 interrupt at time specified in registers $01,
 $03 and $05.
 Bit 4: Update Ended Interrupt Enabled. If set, enables interrupt
 at clock update interval.
 Bit 3: N/A.
 Bit 2: If set, indicates time information is in binary,
 else time information is in BCD.
 Bit 1: If set, indicates clock is operating in 24-hour
 mode, else clock is in 12-hour mode.
 Bit 0: If set, enables daylight savings mode.
 $0c: Status Register C:
 Bit 7: Interrupt identification.
 Bit 6: Periodic interrupt occurred.
 Bit 5: Alarm interrupt occurred.
 Bit 4: Update interrupt occurred.
 Bit 3-0: N/A.
 $0d: Status Register D:
 Bit 7: If not set, indicates that the real-time clock has
 lost power.
 Bit 6-0: N/A.
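
The two-step access through ports $70 and $71 can be modeled without real hardware. The Python sketch below is a simulation only, since port I/O is not available from Python. The class FakeCmos and the helpers read_cmos and write_cmos are hypothetical names, and the behavior of a read from port $70 is an assumption of the model.

```python
class FakeCmos:
    """Simulated index/data port pair standing in for ports $70 and $71."""

    def __init__(self, cells):
        self.cells = dict(cells)   # location -> byte value
        self.index = 0             # location last selected through port $70

    def out(self, port, value):
        if port == 0x70:
            self.index = value                  # select a memory location
        elif port == 0x71:
            self.cells[self.index] = value & 0xFF

    def inp(self, port):
        if port == 0x71:
            return self.cells.get(self.index, 0)
        return 0xFF    # assumption: the index port is not readable in this model

def read_cmos(ports, location):
    """Select the location through port $70, then read the data from port $71."""
    ports.out(0x70, location)
    return ports.inp(0x71)

def write_cmos(ports, location, value):
    """Select the location through port $70, then write the data to port $71."""
    ports.out(0x70, location)
    ports.out(0x71, value)
```

On real hardware the two port accesses must not be separated by another access to the clock, which is why the routines discussed later disable interrupts around them.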

Most of the memory locations are straightforward and explained adequately in
Table 1. Some deserve additional comment, however:
On systems tested, the daylight savings time bit in register B (bit 0) seems
unused. The routines which access the real-time clock therefore do not make
use of this bit.
Clocks observed generally run in 24-hour mode (bit 1 of status register B),
and since there is no bit indicating A.M. or P.M., the time functions provided
here depend on 24-hour mode. While 24-hour mode is documented as being the
default condition of the clock, the routines provided will force the real-time
clock to operate in 24-hour mode to assure proper functioning.
No assumptions are made with regard to whether time is stored in binary or BCD
format (bit 2 of status register B), and allowances for possible differences
between systems have been made.
For reasons unknown, the day-of-week indicator seems not to be used on systems
tested, and so is not relied upon in the routines here.
Only the last two digits of the year are stored in location $09. The century
is stored in location $32. To obtain the complete year, then, location $32
should be multiplied by 100 and added to location $09.
As Table 1 shows, the real-time clock offers an optional periodic interrupt.
When enabled, the real-time clock will generate an interrupt at a programmable
interval, which defaults to 1024 times per second, a much better resolution
than the standard clock-tick frequency of approximately 18.2 times per second.
Routines to handle this interrupt are provided here, and the interrupt may be
enabled or not, as required.
The real-time clock uses IRQ 8, which is handled through the second 8259 using
interrupt vector $70. The periodic interrupt is enabled by setting bit 6 of
status register B. When an interrupt occurs, the interrupt service routine must
examine status register C to determine its cause. This presents a problem,
because the application may be partway through reading clock memory when the
interrupt arrives, and the service routine changes the location selected
through port $70. The problem is compounded by the fact that port $70 is
write-only, so the service routine cannot save and restore the selection.
Interrupts should therefore be disabled while accessing the real-time clock.
If the clock interrupt occurred due to something other than the periodic
interrupt, the interrupt service routine must pass the interrupt through to
the normal ISR. If the ISR was called due to the periodic interrupt, any other
processing required should be done without calling the normal ISR. Following
this, an end-of-interrupt must be generated for both the primary and secondary
8259 interrupt controllers.
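The decision the service routine has to make can be modeled without touching any hardware. The sketch below is a simplified stand-in for the handler in Listing One, not a replacement for it: the structure and function names are hypothetical, the status register C value is passed in as a parameter, and the chaining and end-of-interrupt steps are left to the caller. It also omits the resynchronization against status register A that the listing performs.

```c
#include <assert.h>
#include <stddef.h>

#define RTC_PERIODIC 0x40   /* status register C, bit 6 */

/* Counters kept by the handler (hypothetical names). */
struct rtc_state {
    int  rtc_count;     /* periodic interrupts in the current second */
    long tick_count;    /* periodic interrupts since enabling */
};

/*
 * Process one IRQ 8 given the byte read from status register C.
 * Returns 1 if the interrupt was a periodic interrupt and has been
 * counted, in which case the caller issues end-of-interrupt to both
 * 8259s. Returns 0 if the caller should pass the interrupt through
 * to the previously installed handler instead.
 */
int rtc_on_irq(struct rtc_state *s, unsigned char status_c)
{
    assert(s != NULL);
    if ((status_c & RTC_PERIODIC) == 0)
        return 0;                   /* alarm or update: not ours */
    ++s->tick_count;
    if (++s->rtc_count == 1024)     /* 1024 interrupts make one second */
        s->rtc_count = 0;
    return 1;
}
```

Keeping the decision separate from the port accesses makes it possible to exercise the counting logic on any machine, which is harder to do with the handler itself.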
The real-time clock can also be requested to generate an interrupt at a
specific time (see Table 1, locations $01, $03, and $05). While MS-DOS does
not provide a mechanism for enabling or handling the periodic interrupt, the
BIOS does support enabling the alarm interrupt through interrupt $1a. When the
alarm is enabled this way, the BIOS will generate interrupt $4a when the alarm
time is reached. While this is an acceptable mechanism, and perhaps preferred
in some cases, an application can handle the interrupt a bit more directly by
processing IRQ 8 itself.


New Turbo Pascal Time and Date Functions


Listing One (page 88) shows the set of time and date functions using the
real-time clock I developed for Turbo Pascal. These routines include
replacements for the GetTime and GetDate procedures, as well as routines
emulating the C language's ctime and clock routines. Because Turbo Pascal does
not have a time function of its own, a version of the Time procedure that does
not use the real-time clock is provided along with one which does. Routines to
enable and disable periodic interrupts from the real-time clock are provided
as well, along with the necessary interrupt service routine for the clock.
Finally, a function is provided to return the periodic interrupt count for the
current second.
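Because the periodic interrupt arrives 1024 times per second rather than 1000, the counts have to be scaled before they can be reported as hundredths or milliseconds. The following C sketch shows one way to do the scaling. It uses direct multiplication and division, which is a different technique from the divide-and-correct approach taken by GetRtcTime in Listing One, and the function names are hypothetical.

```c
#include <assert.h>

#define RTC_TICKS_PER_SEC 1024L

/* Scale a count within the current second (0..1023) to hundredths (0..99). */
int ticks_to_hundredths(int ticks_this_second)
{
    assert(ticks_this_second >= 0 && ticks_this_second < RTC_TICKS_PER_SEC);
    return (int)((ticks_this_second * 100L) / RTC_TICKS_PER_SEC);
}

/*
 * Convert the difference between two running tick counts to
 * milliseconds. Whole seconds and the remainder are scaled
 * separately so the multiplication cannot overflow a 32-bit long.
 */
long ticks_to_ms(long elapsed_ticks)
{
    long whole = elapsed_ticks / RTC_TICKS_PER_SEC;
    long part  = elapsed_ticks % RTC_TICKS_PER_SEC;

    assert(elapsed_ticks >= 0);
    return whole * 1000L + (part * 1000L) / RTC_TICKS_PER_SEC;
}
```

A count of 512 within the second maps to 50 hundredths, and an elapsed count of 5120 ticks corresponds to 5000 milliseconds.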
Listing Two (page 91) is a simple test program written to measure the
performance of these routines versus the performance of previously existing
routines. The test program repeatedly calls each of the routines for a
five-second period, with a counter incremented for each of the calls.
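The measuring technique itself is simple enough to show in isolation. In the C sketch below, the routine under test and the source of tick counts are both passed in as function pointers, so the loop can be tried with any clock. All of the names are hypothetical; fake_ticks and do_nothing are stand-ins that exist only to make the example self-contained, and in the real test program the tick source would be the running periodic interrupt count.

```c
#include <assert.h>

typedef long (*tick_source)(void);  /* returns a running tick count */
typedef void (*routine)(void);      /* routine being measured */

/*
 * Call fn repeatedly until at least 'duration' ticks have elapsed,
 * and return the number of calls made. Comparing with '<' rather
 * than testing for equality means the loop still ends if a slow
 * call carries the tick count past the target.
 */
long count_calls(routine fn, tick_source ticks, long duration)
{
    long start = ticks();
    long calls = 0;

    assert(duration > 0);
    do {
        fn();
        ++calls;
    } while (ticks() - start < duration);
    return calls;
}

/* Stand-in tick source: advances by one tick each time it is read. */
long fake_ticks(void)
{
    static long now = 0;
    return now++;
}

/* Stand-in routine that does no work. */
void do_nothing(void)
{
}
```

A faster routine completes more calls in the same number of ticks, so the ratio of two such counts gives the relative speed of two routines.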
(I also developed a set of time functions for C that is similar to that for
Pascal. Due to space constraints, however, these functions are only available
electronically; see "Availability" on page 3. The C code includes real-time
clock-based replacements for C's time, gettime, and getdate functions, as well
as a replacement for the time function that does not use the real-time clock.
Interestingly, the corresponding Turbo Pascal procedure was found to be faster
than the one provided with Turbo C. Consequently, a replacement for C's ctime
function is provided as well.)
Timing results can be expected to vary from system to system, depending on the
processor type, resident software installed, and so on. A sample of the Pascal
test program's results is shown in Figure 1.
Figure 1: Test program results

 Test Summary:

 GetTime called 34601 times
 GetRtcTime called 108846 times
 GetRtcTime was 334% faster than
 GetTime

 GetDate called 16397 times
 GetRtcDate called 53894 times
 GetRtcDate was 330% faster than
 GetDate

 Time called 13250 times
 RtcTime called 30755 times
 RtcTime was 233% faster than Time

During the course of refining the real-time clock routines, I made a few
additional discoveries which can make using the real-time clock routines all
the more attractive.
The system being used for testing was connected to a local area network and,
for convenience, was running in file server mode about half the time, allowing
remote access to the system. Large differences in performance were noted
between tests on different occasions, and these differences were ultimately
traced to whether the LAN server program was in memory. Further
testing found that the overhead added to the MS-DOS date and time functions by
the server is considerable. When the LAN server program was running, the
MS-DOS date and time functions performed about a third as fast as when the
server program was not running, while the routines that used the real-time
clock performed only about 5-10 percent slower.
I also discovered that performing the above tests with a 386 memory management
program loaded added considerably to the overhead of the MS-DOS time and date
functions. The test results shown in Figure 1 are, in fact, those output by
the program when a memory management program was in use. When the memory
manager was disabled and the test repeated, the output in Figure 2 was
obtained. As can be seen, when the 386 memory manager was not in use, a
substantial improvement in the performance of the MS-DOS date and time
functions was observed. However, not only did the MS-DOS-based functions
improve in performance, the routines based on usage of the real-time clock
improved as well.
Figure 2: Test results with 386 memory manager disabled

 Test Summary:


 GetTime called 50898 times
 GetRtcTime called 129415 times
 GetRtcTime was 255% faster than
 GetTime

 GetDate called 25605 times
 GetRtcDate called 65650 times
 GetRtcDate was 257% faster than
 GetDate

 Time called 18237 times
 RtcTime called 37101 times
 RtcTime was 254% faster than Time

The reason for lessened performance with the 386 memory management software
relates to the fact that the particular memory manager in use runs the system
as a virtual 8086 task. When this is done, all interrupt processing is
filtered by a virtual 8086 task management program. According to Intel
documentation, this can add as much as 300 clock ticks to each interrupt
performed, and more than 200 clock ticks to each return from an interrupt.
Much of the processing performed by the real-time clock versions of the
routines is not affected by interrupt processing.
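To put those clock counts in perspective, the added cost of each call can be estimated from the processor's clock rate. The C sketch below is back-of-the-envelope arithmetic only. The function name is hypothetical, the clock rate is a parameter because the speed of the test machine is not stated here, and the two constants are simply the rough figures quoted above for one interrupt and one return.

```c
#include <assert.h>

#define V86_INT_CLOCKS  300.0   /* extra clocks per interrupt (rough) */
#define V86_IRET_CLOCKS 200.0   /* extra clocks per interrupt return (rough) */

/*
 * Estimated extra microseconds for one interrupt and its return
 * when running as a virtual 8086 task, at a clock rate in MHz.
 */
double v86_overhead_us(double cpu_mhz)
{
    assert(cpu_mhz > 0.0);
    return (V86_INT_CLOCKS + V86_IRET_CLOCKS) / cpu_mhz;
}
```

At an assumed 25 MHz, the estimate comes to 20 microseconds per interrupt round trip, a cost paid by every call that goes through MS-DOS and avoided by routines that read the clock directly.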
While use of a 386 memory management program does increase overhead for
virtually all things done, it is unlikely that many of us would be willing to
give up memory management programs at the present time. Such programs perform
valuable services, including emulation of expanded memory (EMS), remapping of
ROM and RAM in the upper areas of the first megabyte of memory, and often the
ability to load TSR programs and device drivers into this upper range of the
first megabyte of memory.
Because 386 memory managers will likely continue to be used, the real-time
clock-based routines offer a way around much of the time and date overhead on
systems that run them. The complete set of time and date routines for Turbo
Pascal and Turbo C is shown in Table 2.
Table 2: Time and date routines for Turbo Pascal and Turbo C

 Pascal EnableRtcInts; (Procedure)
 C void enable_rtc_ints( )
Enables interrupts from the real-time clock, which will be handled by Rtc in
Pascal or rtc in C. Note that in both languages, the routines that return the
current time rely on interrupts from the real-time clock to calculate
hundredths of seconds. If interrupts are not enabled, hundredths will not be
returned.
Enabling interrupts from the clock will cause additional processing time to be
used while servicing them, so clock interrupts should not be enabled unless
there is a need for time information at greater than a 1-second resolution.

 Pascal DisableRtcInts; (Procedure)
 C void disable_rtc_ints( )
If clock interrupts have been enabled, this routine must be called prior to
terminating your program to disable interrupts from the clock.

 Pascal None
 C void init_time( )
Used in the C version of the library to determine whether daylight savings
time is in effect. Should be called once prior to using the other time
routines in the C library to assure accuracy of results. Daylight savings time
is not a factor in the Pascal version.

 Pascal GetRtcTime(Var Hr, Mn, Sc, Hn:Word); (Procedure)
 C void get_rtc_time(timep *time)
A direct replacement of the original GetTime routine. Note that variable Hn
(hundredths of second) will always be set to zero unless interrupts from the
real-time clock have been enabled.

 Pascal GetRtcDate(Var Yr,Mo,Dy:Word); (Procedure)
 C void get_rtc_date(datep *date)
A direct replacement of the original GetDate, with the exception that day of
week is not calculated.

 Pascal RtcTime (Var Result: LongInt); (Procedure)
 C time_t rtc_time(time_t *result)
An addition to Turbo Pascal. Emulates C's time function, though with these
differences: First, C's time function both returns a value and stores that value
at an address which is passed to it. RtcTime is a procedure, not a function.
Therefore, it simply stores a time value at the address of the Result
variable. Second, C's time function returns elapsed time in seconds, since
00:00:00, January 1, 1970, Greenwich Mean Time. Because there was no
preexisting Time procedure in Pascal, GMT information is not available, and
the real-time clock has provided the same value since January 1, 1980, the
Pascal RtcTime procedure returns elapsed seconds since 1980 instead of 1970,
without regard to GMT. Because C provides a precedent for the time function,
rtc_time behaves exactly as the original time function does. It returns a
value and stores it at the address specified, and returns a value representing
elapsed time since 00:00:00 Jan 1, 1970, GMT.

 Pascal Time2(Var Result : LongInt); (Procedure)
 C time_t time2(time_t *result);
For completeness, the Pascal Time2 procedure is provided for systems which are
not equipped with a real-time clock. The Pascal Time2 procedure was observed
to be faster than Turbo C's own time function, for unknown reasons. Because of
this, I include a replacement for Turbo C's time function, called time2, to
avoid duplicating the name.

 Pascal RtcClock : LongInt; (Function)
 C clock_t rtc_clock( );
Only usable when interrupts from the real-time clock have been enabled, and
only if those interrupts remain enabled between successive calls. RtcClock
returns a value representing the number of periodic interrupts generated since
such interrupts were first enabled (the interrupt handler is called 1024 times
per second). It can be called multiple times to determine the number of clock
ticks which have elapsed between two events. The value returned will go
negative should the program run for 24 days or so, which was not considered a
problem here.

 Pascal MilliCount : Integer; (Function)
 C int milli_count( );
When interrupts from the real-time clock are enabled, this function returns
the number of periodic interrupts generated for the current second.

 Pascal CTime2(Time : LongInt): TimeStrPtr; (Function)
 C char *ctime2(time_t *t);

CTime2 was first written in Turbo Pascal because TP does not provide an
equivalent. As written, it processes values returned by the Time2 procedure
(based on Jan 1, 1980). Like C's ctime, CTime2 for Pascal returns a pointer
instead of a string. Tests indicated that the Pascal CTime2 was around twice
as fast as Turbo C's own ctime, so a version was developed for C as well.

 Pascal Rtc; Interrupt; (Procedure)
 C void interrupt rtc( );
Processes the periodic interrupts from the real-time clock when enabled.
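The relationship between the 1980-based values used by the Pascal routines and the 1970-based values used by C's time function comes down to a fixed offset plus some calendar arithmetic. The C sketch below shows that arithmetic under stated assumptions: the names are hypothetical, the date and time are taken as given with no GMT adjustment, and the year range is limited so that the results fit in a 32-bit long. It is not the code from either version of the library.

```c
#include <assert.h>

#define SECS_PER_DAY 86400L

/* 1970 through 1979 is 3652 days (1972 and 1976 were leap years). */
#define SECS_1970_TO_1980 (3652L * SECS_PER_DAY)

/* Every fourth year is a leap year throughout 1980..2037. */
static int is_leap(int year)
{
    return (year & 3) == 0;
}

/* Whole days from 1 January 1980 to the given date. */
long days_since_1980(int year, int month, int day)
{
    static const int days_in_month[12] =
        { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
    long days;
    int m;

    assert(year >= 1980 && year <= 2037);
    assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);

    /* 365 days per earlier year, plus one for each earlier leap year */
    days = (long)(year - 1980) * 365 + (year - 1980 + 3) / 4;
    for (m = 1; m < month; ++m)
        days += days_in_month[m - 1] + (m == 2 && is_leap(year));
    return days + (day - 1);
}

/* Seconds since 00:00:00, 1 January 1980. */
long secs_since_1980(int year, int month, int day, int hr, int mn, int sc)
{
    return days_since_1980(year, month, day) * SECS_PER_DAY
         + hr * 3600L + mn * 60L + sc;
}

/* Shift a 1980-based value to the 1970 base used by C's time(). */
long to_1970_base(long secs_1980)
{
    return secs_1980 + SECS_1970_TO_1980;
}
```

For midnight on January 1, 1991, days_since_1980 gives 4018, secs_since_1980 gives 347155200, and to_1970_base turns that into 662688000.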

A necessary limitation of most of these routines is that an AT-compatible
real-time clock is required for them to function. When a given program will be
used on 8088 systems, as well as newer ones, the presence of this real-time
clock cannot be guaranteed. In these cases, the processor type can be detected
and the time and date routines which do not use the real-time clock can be
used.
How these routines will affect the performance of your programs depends on
how often they need to sample the date and time and/or format it for output.
The tighter the loop, the greater the need for performance, and the more these
routines may help.

_USING THE REAL-TIME CLOCK_
by Kenneth Roach

[TURBO PASCAL VERSION]


[LISTING ONE]

(*
** TIMELIB.PAS
** (C) Copyright 1990 by Kenneth Roach
** This module contains procedures similar to Turbo Pascal's GetTime and
** GetDate procedures, but which are based on use of the AT class of
** system's real time clock. Additionally, procedures and functions are
** provided to enable and disable periodic interrupts from the real time
** clock along with an interrupt handler for same. Interrupts from the
** real time clock are provided at a rate of 1024 per second, and a
** function is provided to return the number of interrupts received in the
** current second. Also provided are emulations of the C language's
** time(), ctime() and clock() functions.
*)

Unit TimeLib;

Interface

Uses Dos;

Type
 TimeString = String[24];
 TimeStrPtr = ^TimeString;

Function RtcClock : LongInt;
Function MilliCount : Integer;
Function CTime2(Time : LongInt) : TimeStrPtr;
Procedure RtcTime(Var Where : LongInt);
Procedure Time2(Var Result : LongInt);
Procedure EnableRtcInts;
Procedure DisableRtcInts;
Procedure GetRtcTime(Var Hr,Mn,Sc,Hn : Word);
Procedure GetRtcDate(Var Yr,Mo,Dy : Word);


Implementation

Type
 ShortString = String[3];
 OldVec = Procedure;

Const

 CLI = $FA;
 STI = $FB;
 MASK_24 = $02;
 BCD_MASK = $04;
 CMOSFLAG = $70;
 CMOSDATA = $71;
 SECONDS_REQ = $00;
 MINUTES_REQ = $02;
 HOURS_REQ = $04;
 STATUSA = $0A;
 DATE_REQ = $07;
 MONTH_REQ = $08;
 YEAR_REQ = $09;
 CENTURY_REQ = $32;
 UPDATE = $80;
 HINIBBLE = $F0;
 LONIBBLE = $0F;

 SECS_PER_MIN = 60;
 SECS_PER_HOUR = 3600;
 SECS_PER_DAY = 86400;
 SECS_PER_YEAR = 31536000;
 MINS_PER_HOUR = 60;
 DAYS_PER_YEAR = 365;
 BASE_YEAR = 1980;
 DAYS_PER_WEEK = 7;
 TUESDAY = 3; { day of week for 1-1-1980 }
 APRIL = 4;
 JUNE = 6;
 SEPTEMBER = 9;
 NOVEMBER = 11;
 FEBRUARY = 2;

 RTC_VEC = $70;
 IMR2 = $A1;
 CMD1 = $20;
 CMD2 = $A0;
 EOI = $20;
 RTC_MASK = $FE;
 STATUSB = $0B;
 STATUSC = $0C;
 RTC_FLAG = $40;

 Months : Array[1..12] of ShortString =
 ('Jan','Feb','Mar','Apr','May','Jun',
 'Jul','Aug','Sep','Oct','Nov','Dec');
 Days : Array[1..7] of ShortString =
 ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');

Var
 Bcd : Boolean;
 RtcCount : Integer;
 TickCount : LongInt;
 OldRtcVec : Pointer;
 OldCall : OldVec;
 OldMask : Byte;
 TimeStr : TimeString;

(*

** emulation of the C language clock() function. RtcClock returns
** a value corresponding to the number of periodic interrupts which
** have occurred since interrupts from the real time clock were
** enabled. The value will remain positive for some 24 days from
** initialization.
*)

Function RtcClock : LongInt;
Begin
 RtcClock := TickCount;
End;

(*
** MilliCount returns the real time clock periodic interrupt count for
** the current second. Range of value is 0 to 1023.
*)

Function MilliCount : Integer;
Begin
 MilliCount := RtcCount;
End;

(*
** real time clock interrupt handler
*)

Procedure Rtc; Interrupt;
Begin
 Inline(CLI);
 Port[CMOSFLAG] := STATUSC; { determine cause of interrupt }
 If (Port[CMOSDATA] and $40) <> 0 Then { is it for us? }
 Begin
 Inc(RtcCount); { update number of times ISR called this second }
 Inc(TickCount); { update total number of times called }
 If RtcCount = 1024 Then { if start of new second then }
 RtcCount := 0 { reset RtcCount }
 Else
 Begin
 Port[CMOSFLAG] := STATUSA; { check it again for accuracy }
 If (Port[CMOSDATA] and UPDATE) <> 0 Then
 RtcCount := 0;
 End;
 Port[CMD1] := EOI; { signal end of interrupt to primary 8259 }
 Port[CMD2] := EOI; { signal end of interrupt to chained 8259 }
 End
 Else
 Inline($9C/ { PUSHF, so the old ISR's IRET balances the stack }
 $FF/$1E/OldRtcVec); { CALL FAR [OldRtcVec]: not for us, call bios ISR }
 Inline(STI);
End;

(*
** turn on interrupts from the real time clock
*)

Procedure EnableRtcInts;
Begin
 RtcCount := 0; { reset ISR counter values }
 TickCount := 0;
 GetIntVec(RTC_VEC,OldRtcVec);

 Move(OldRtcVec,OldCall,Sizeof(Pointer)); { copy the vector itself, not the code it points to }
 SetIntVec(RTC_VEC,@Rtc); { point to interrupt handler }
 Port[IMR2] := Port[IMR2] and RTC_MASK; { enable clock interrupt }
 Port[CMOSFLAG] := STATUSB;
 OldMask := Port[CMOSDATA]; { get rtc mask register }
 Port[CMOSFLAG] := STATUSB;
 Port[CMOSDATA] := OldMask or RTC_FLAG; { enable periodic interrupts }
End;

(*
** turn off interrupts from the real time clock
*)

Procedure DisableRtcInts;
Begin
 Port[CMOSFLAG] := STATUSB;
 Port[CMOSDATA] := OldMask; { turn off periodic interrupts }
 Port[IMR2] := Port[IMR2] or ((not RTC_MASK) and $FF); { mask IRQ 8 only }
 SetIntVec(RTC_VEC,OldRtcVec); { remove our ISR }
End;

(*
** emulation of the C language's ctime() function
*)

Function CTime2(Time : LongInt) : TimeStrPtr;
Var
 Hr,Mn,Sc : Word;
 Yr,Mo,Dy : Word;
 Bias,Dw,T : Word;
 Junk,S : Byte;
 Temp : LongInt;
Begin
 Temp := Time mod SECS_PER_DAY; { get seconds left for this day }
 Hr := Temp div SECS_PER_HOUR; { determine hours this day }
 Temp := Temp mod SECS_PER_HOUR; { lose hours this day }
 Mn := Temp div SECS_PER_MIN; { determine minutes this hour }
 Sc := Temp mod SECS_PER_MIN; { determine seconds this minute }

 { Work out the date from Time alone, so that the result does not }
 { depend on the clock's current year. Valid for 1980 through 2099. }
 Temp := Time div SECS_PER_DAY; { whole days since 1-1-1980 }
 Dw := (Temp + TUESDAY - 1) mod DAYS_PER_WEEK + 1; { 1-1-80 was a Tuesday }
 Yr := BASE_YEAR;
 T := DAYS_PER_YEAR + 1; { 1980 was a leap year }
 While Temp >= T Do
 Begin
 Dec(Temp,T); { consume one whole year }
 Inc(Yr);
 If (Yr and 3) = 0 Then
 T := DAYS_PER_YEAR + 1 { leap year }
 Else
 T := DAYS_PER_YEAR;
 End;
 Dy := Temp + 1; { day of the year, counting from 1 }

 Mo := 1; S := 1; { now determine the month's name }
 While S <> 0 Do { process total remaining days for year }
 Begin
 Junk := 0;
 Case S of
 APRIL,
 JUNE,
 SEPTEMBER,
 NOVEMBER: If Dy > 30 Then { month has 30 days in it }
 Junk := 30;
 FEBRUARY: If (Yr and 3) = 0 Then { leap year: february has 29 days }
 Begin
 If Dy > 29 Then
 Junk := 29;
 End
 Else If Dy > 28 Then
 Junk := 28;
 Else If Dy > 31 Then
 Junk := 31; { else month has 31 days }
 End;
 If Junk <> 0 Then
 Begin
 Inc(Mo); { account for month just processed }
 Inc(S); { bump case index }
 Dec(Dy,Junk); { subtract days just processed }
 End
 Else
 S := 0; { Dy is less than 1 month, clear while var }
 End;

 TimeStr[1] := Days[Dw][1]; { now convert all values to a string }
 TimeStr[2] := Days[Dw][2]; { done inline for speed }
 TimeStr[3] := Days[Dw][3];
 TimeStr[4] := ' ';
 TimeStr[5] := Months[Mo][1];
 TimeStr[6] := Months[Mo][2];
 TimeStr[7] := Months[Mo][3];
 TimeStr[8] := ' ';
 TimeStr[9] := Chr(Dy div 10 + Ord('0'));
 TimeStr[10] := Chr(Dy mod 10 + Ord('0'));
 TimeStr[11] := ' ';
 TimeStr[12] := Chr(Hr div 10 + Ord('0'));
 TimeStr[13] := Chr(Hr mod 10 + Ord('0'));
 TimeStr[14] := ':';
 TimeStr[15] := Chr(Mn div 10 + Ord('0'));
 TimeStr[16] := Chr(Mn mod 10 + Ord('0'));
 TimeStr[17] := ':';
 TimeStr[18] := Chr(Sc div 10 + Ord('0'));
 TimeStr[19] := Chr(Sc mod 10 + Ord('0'));
 TimeStr[20] := ' ';
 TimeStr[21] := Chr(Yr div 1000 + Ord('0')); Yr := Yr mod 1000;
 TimeStr[22] := Chr(Yr div 100 + Ord('0')); Yr := Yr mod 100;
 TimeStr[23] := Chr(Yr div 10 + Ord('0'));

 TimeStr[24] := Chr(Yr mod 10 + Ord('0'));
 TimeStr[0] := Chr(24);
 CTime2 := @TimeStr;
End;


(*
** replacement for Turbo Pascal's GetTime procedure
*)

Procedure GetRtcTime(Var Hr,Mn,Sc,Hn : Word);
Begin
 Inline(CLI);
 Repeat
 Port[CMOSFLAG] := STATUSA; { wait until not in update cycle }
 Until (Port[CMOSDATA] and UPDATE) = 0;
 Port[CMOSFLAG] := SECONDS_REQ; Sc := Port[CMOSDATA]; { get seconds }
 Port[CMOSFLAG] := MINUTES_REQ; Mn := Port[CMOSDATA]; { get minutes }
 Port[CMOSFLAG] := HOURS_REQ; Hr := Port[CMOSDATA]; { get hour }
 Inline(STI);
 If Bcd Then { convert from BCD to binary as required }
 Begin
 Sc := ((Sc and HINIBBLE) shr 4) * 10 + (Sc and LONIBBLE);
 Mn := ((Mn and HINIBBLE) shr 4) * 10 + (Mn and LONIBBLE);
 Hr := ((Hr and HINIBBLE) shr 4) * 10 + (Hr and LONIBBLE);
 End;
 Hn := RtcCount div 10; { RtcCount goes to 1024 }
 If Hn > 75 Then { correct for values to 102 each second }
 Dec(Hn,3)
 Else If Hn > 50 Then
 Dec(Hn,2)
 Else If Hn > 25 Then
 Dec(Hn);
End;


(*
** replacement for Turbo Pascal's GetDate procedure
*)

Procedure GetRtcDate(Var Yr, Mo, Dy : Word);
Var T : Integer;
Begin
 Inline(CLI);
 Repeat
 Port[CMOSFLAG] := STATUSA; { wait until not in update mode }
 Until (Port[CMOSDATA] and UPDATE) = 0;
 Port[CMOSFLAG] := CENTURY_REQ; T := Port[CMOSDATA]; { get century }
 Port[CMOSFLAG] := YEAR_REQ; Yr := Port[CMOSDATA]; { get year }
 Port[CMOSFLAG] := MONTH_REQ; Mo := Port[CMOSDATA]; { get month }
 Port[CMOSFLAG] := DATE_REQ; Dy := Port[CMOSDATA]; { get day }
 Inline(STI);
 If Bcd Then { convert time from BCD to binary as required }
 Begin
 T := ((T and HINIBBLE) shr 4) * 10 + (T and LONIBBLE);
 Yr := ((Yr and HINIBBLE) shr 4) * 10 + (Yr and LONIBBLE);
 Mo := ((Mo and HINIBBLE) shr 4) * 10 + (Mo and LONIBBLE);
 Dy := ((Dy and HINIBBLE) shr 4) * 10 + (Dy and LONIBBLE);
 End;

 Inc(Yr,T * 100); { add in century }
End;

(*
** emulation of the C language's time() function
*)

Procedure RtcTime(Var Where : LongInt);
Var
 Hr : LongInt;
 T,S,B,Yr,Sc,Mn,Mo,Dy : Word;
Begin
 Inline(CLI); { following code is duplicated for speed }
 Repeat
 Port[CMOSFLAG] := STATUSA;
 Until (Port[CMOSDATA] and UPDATE) = 0;
 Port[CMOSFLAG] := SECONDS_REQ; Sc := Port[CMOSDATA]; { get seconds }
 Port[CMOSFLAG] := MINUTES_REQ; Mn := Port[CMOSDATA]; { get minutes }
 Port[CMOSFLAG] := HOURS_REQ; Hr := Port[CMOSDATA]; { get hour }
 Port[CMOSFLAG] := CENTURY_REQ; T := Port[CMOSDATA]; { get century }
 Port[CMOSFLAG] := YEAR_REQ; Yr := Port[CMOSDATA]; { get year }
 Port[CMOSFLAG] := MONTH_REQ; Mo := Port[CMOSDATA]; { get month }
 Port[CMOSFLAG] := DATE_REQ; Dy := Port[CMOSDATA]; { get day }
 Inline(STI);
 If Bcd Then { convert time from BCD to binary as required }
 Begin
 Sc := ((Sc and HINIBBLE) shr 4) * 10 + (Sc and LONIBBLE);
 Mn := ((Mn and HINIBBLE) shr 4) * 10 + (Mn and LONIBBLE);
 Hr := ((Hr and HINIBBLE) shr 4) * 10 + (Hr and LONIBBLE);
 T := ((T and HINIBBLE) shr 4) * 10 + (T and LONIBBLE);
 Yr := ((Yr and HINIBBLE) shr 4) * 10 + (Yr and LONIBBLE);
 Mo := ((Mo and HINIBBLE) shr 4) * 10 + (Mo and LONIBBLE);
 Dy := ((Dy and HINIBBLE) shr 4) * 10 + (Dy and LONIBBLE);
 End;

 Inline(STI);
 Mn := Mn * SECS_PER_MIN + Sc; { convert today's values to seconds }
 Hr := Hr * SECS_PER_HOUR + Mn;
 Inc(Yr,T * 100); { account for century }
 Dec(Yr,BASE_YEAR); { keep years since 1980 }
 Inc(Dy,(Yr + 3) shr 2); { add leap days from earlier years }
 S := 1;
 While S < Mo Do { add days for this year }
 Begin
 Case S of
 APRIL,
 JUNE,
 SEPTEMBER, { month has 30 days in it }
 NOVEMBER: Inc(Dy,30);
 FEBRUARY: If (Yr and 3) = 0 Then { is this year a leap year? }
 Inc(Dy,29) { yes }
 Else
 Inc(Dy,28); { no }
 Else Inc(Dy,31); { else month has 31 days }
 End;
 Inc(S);
 End;
 Dec(Dy); { lose today... }
 Where := Yr * SECS_PER_YEAR + { return final value }

 Dy * SECS_PER_DAY + Hr;
End;


(*
** Pascal substitute for Turbo-C's time() function, based on calls to
** GetDate, GetTime. Provided for use on systems not equipped with a
** real time clock.
*)

Procedure Time2(Var Result : LongInt);
Var
 H : LongInt;
 S,Hr,Yr,Sc,Mn,Mo,Dy : Word;
Begin
 GetTime(Hr,Mn,Sc,S); { get time from Turbo Pascal }
 Mn := Mn * 60 + Sc; { convert to seconds }
 H := Hr * 3600 + Mn;
 GetDate(Yr,Mo,Dy,S); { get date from Turbo Pascal }
 Dec(Yr,1980); { get years since 1980 }

 Inc(Dy,(Yr + 3) shr 2); { add leap days from earlier years }
 S := 1;
 While S < Mo Do { add days for this year }
 Begin
 Case S of
 APRIL,
 JUNE,
 SEPTEMBER,
 NOVEMBER: Inc(Dy,30); { month has 30 days in it }
 FEBRUARY: If (Yr and 3) = 0 Then { is this year a leap year? }
 Inc(Dy,29) { yes }
 Else
 Inc(Dy,28); { no }
 Else Inc(Dy,31); { else month has 31 days }
 End;
 Inc(S);
 End;
 Dec(Dy); { lose today, as RtcTime does }
 Result := (Yr * SECS_PER_YEAR + { return final value }
 Dy * SECS_PER_DAY + H);
End;

(*
** unit initialization
*)

Begin
 Port[CMOSFLAG] := STATUSB;
 Bcd := (Port[CMOSDATA] and BCD_MASK) = 0; { check for BCD mode }
 Port[CMOSFLAG] := STATUSB;
 Port[CMOSDATA] := Port[CMOSDATA] or MASK_24; { force 24 hour mode }
 RtcCount := 0;
 TickCount := 0;
End.






[LISTING TWO]

(*
** TIME_PAS
** (C) Copyright 1990 by Kenneth Roach
** This program uses the time and date functions provided by Turbo Pascal
** compiler, as well as similar functions contained in the module TIMELIB.PAS.
** TIME_PAS calls each function for five seconds, counting the number of
** times the function in question was called. It then compares the number
** of times each function was called and displays the results. Following
** this, it displays the current date and time obtained from the
** GetRtcTime function, and as reported and converted by the RtcTime
** and CTime2 functions.
*)

Program TimePas;

Uses Dos,Crt,TimeLib;

Const
 TEST_TIME = 5120; { 5 seconds * 1024 ticks per second }

Var
 GrtCount : LongInt; { counter for GetRtcTime calls }
 GtCount : LongInt; { counter for GetTime calls }
 GrdCount : LongInt; { counter for GetRtcDate calls }
 GdCount : LongInt; { counter for GetDate calls }
 TCount : LongInt; { counter for Time calls }
 RtCount : LongInt; { counter for RtcTime calls }
 CtCount : LongInt; { counter for CTime2 calls }
 Timer1 : LongInt; { used in Time, RtcTime testing }
 Temp : LongInt;
 Hr,Mn,Sc,Hn : Word; { used in calls to GetTime, GetRtcTime }
 Yr,Mo,Dy,Dw : Word; { used in calls to GetDate, GetRtcDate }
 St : TimeStrPtr; { used in CTime2 testing }

(*
** test performance of real time clock based time functions
*)

Procedure TestRtc;
Begin

 Writeln;
 Write('Testing GetRtcTime...');
 Temp := RtcClock; { get current time tick count }
 Repeat
 GetRtcTime(Hr,Mn,Sc,Hn);
 Inc(GrtCount);
 Until (RtcClock - Temp) >= TEST_TIME; { count for 5 seconds }

 Writeln;
 Write('Testing GetRtcDate...');
 Temp := RtcClock;
 Repeat
 GetRtcDate(Yr,Mo,Dy);
 Inc(GrdCount);
 Until (RtcClock - Temp) >= TEST_TIME; { count for 5 seconds }


 Writeln;
 Write('Testing RtcTime...');
 Temp := RtcClock;
 Repeat
 RtcTime(Timer1);
 Inc(RtCount);
 Until (RtcClock - Temp) >= TEST_TIME; { count for 5 seconds }

 Writeln;
 Write('Testing CTime2...');
 Temp := RtcClock;
 Repeat
 St := CTime2(Timer1);
 Inc(CtCount);
 Until (RtcClock - Temp) >= TEST_TIME; { count for 5 seconds }

End;

(*
** test performance of Turbo Pascal/DOS based time functions
*)

Procedure TestPas;
Begin

 Writeln;
 Write('Testing GetTime...');
 Temp := RtcClock;
 Repeat
 GetTime(Hr,Mn,Sc,Hn);
 Inc(GtCount);
 Until (RtcClock - Temp) >= TEST_TIME; { count for 5 seconds }

 Writeln;
 Write('Testing GetDate...');
 Temp := RtcClock;
 Repeat
 GetDate(Yr,Mo,Dy,Dw);
 Inc(GdCount);
 Until (RtcClock - Temp) >= TEST_TIME; { count for 5 seconds }

 Writeln;
 Write('Testing Time2...');
 Temp := RtcClock;
 Repeat
 Time2(Timer1);
 Inc(TCount);
 Until (RtcClock - Temp) >= TEST_TIME; { count for 5 seconds }

End;

(*
** determine percentage one value represents of another
*)

Function Percent(Count1,Count2 : LongInt) : LongInt;
Var Temp : LongInt;
Begin
 Temp := (Count1 * 100) div Count2;

 If ((Count1 * 100) mod Count2) * 2 >= Count2 Then { round to nearest }
 Inc(Temp);
 Percent := Temp;
End;

(*
** show results of timing tests
*)

Procedure DisplayResults;
Begin
 Writeln;
 Writeln('Test Summary:');
 Writeln;
 Writeln('GetTime called ',GtCount,' times');
 Writeln('GetRtcTime called ',GrtCount,' times');
 If GrtCount > GtCount Then
 Writeln('GetRtcTime was ',Percent(GrtCount,GtCount),
 '% the speed of GetTime')
 Else
 Writeln('GetTime was ',Percent(GtCount,GrtCount),
 '% the speed of GetRtcTime');

 Writeln;
 Writeln('GetDate called ',GdCount,' times');
 Writeln('GetRtcDate called ',GrdCount,' times');
 If GrdCount > GdCount Then
 Writeln('GetRtcDate was ',Percent(GrdCount,GdCount),
 '% the speed of GetDate')
 Else
 Writeln('GetDate was ',Percent(GdCount,GrdCount),
 '% the speed of GetRtcDate');

 Writeln;
 Writeln('Time2 called ',TCount,' times');
 Writeln('RtcTime called ',RtCount,' times');
 If TCount > RtCount Then
 Writeln('Time2 was ',Percent(TCount,RtCount),
 '% the speed of RtcTime')
 Else
 Writeln('RtcTime was ',Percent(RtCount,TCount),
 '% the speed of Time2');

 Writeln;
 Writeln('CTime2 called ',CtCount,' times');
End;


Begin
 GrtCount := 0; { initialize counter variables }
 GtCount := 0;
 GrdCount := 0;
 GdCount := 0;
 TCount := 0;
 RtCount := 0;
 CtCount := 0;

 EnableRtcInts;


 ClrScr;

 TestRtc; { test the functions using the real time clock }
 TestPas; { test the normal Pascal/DOS based time functions }

 DisplayResults;

 Writeln;
 Writeln('End of test.');
 Writeln('Start time display.');
 Writeln('Depress any key to stop');
 Writeln;
 While not KeyPressed Do
 Begin
 GetRtcTime(Hr,Mn,Sc,Hn);
 RtcTime(Timer1);
 Write(Chr(13),Hr:2,':',Mn:2,':',Sc:2,'.',Hn:2,
 ' ',CTime2(Timer1)^);
 End;

 DisableRtcInts;
End.





_USING THE REAL-TIME CLOCK_
by Kenneth Roach

[TURBO C VERSION]


[TIME_C.C]

/*
** TIME_C
** (C) Copyright 1990 by Kenneth Roach
** Version date: 3 November, 1990
**
** This program uses the time and date functions provided by the Turbo-C
** compiler, as well as similar functions contained in the module TIMELIB.C.
** TIME_C calls each function for five seconds, counting the number of
** times the function in question was called. It then compares the number
** of times each function was called and displays the results. Following
** this, it displays the current date and time obtained from the
** get_rtc_time function, and as reported and converted by the rtc_time
** and ctime2 functions.
*/

#include <stdio.h>
#include <dos.h>
#include <time.h>
#include "timelib.h"

long grt_count = 0L; /* counter for get_rtc_time() calls */
long grd_count = 0L; /* counter for get_rtc_date() calls */
long rt_count = 0L; /* counter for rtc_time() calls */

long gt_count = 0L; /* counter for gettime() calls */
long gd_count = 0L; /* counter for getdate() calls */
long t_count = 0L; /* counter for time() calls */
long t2_count = 0L; /* counter for time2() calls */
long ct2_count = 0L; /* counter for ctime2() calls */
long ct_count = 0L; /* counter for ctime() calls */
struct time t; /* used in testing of gettime, get_rtc_time */
struct date d; /* used in testing of getdate, get_rtc_date */
char *str; /* used in testing ctime, ctime2 */
time_t timer; /* used in testing time, time2, rtc_time */
long temp;

#define TEST_TIME 5120L /* 5 seconds * 1024 interrupts per second */

/*
** test performance of real time clock based time functions
*/

void test_rtc()
{
 printf("\nTesting get_rtc_time...");
 temp = rtc_clock();
 do {
 get_rtc_time(&t);
 ++grt_count;
 } while(rtc_clock() - temp < TEST_TIME); /* count for 5 seconds */

 printf("\nTesting get_rtc_date...");
 temp = rtc_clock();
 do {
 get_rtc_date(&d);
 ++grd_count;
 } while(rtc_clock() - temp < TEST_TIME); /* count for 5 seconds */

 printf("\nTesting rtc_time...");
 temp = rtc_clock();
 do {
 rtc_time(&timer);
 ++rt_count;
 } while(rtc_clock() - temp < TEST_TIME); /* count for 5 seconds */

 printf("\nTesting ctime2...");
 temp = rtc_clock();
 do {
 str = ctime2(&timer);
 ++ct2_count;
 } while(rtc_clock() - temp < TEST_TIME); /* count for 5 seconds */
}

/*
** test performance of C's DOS based time functions
*/

void test_c()
{

 printf("\nTesting gettime...");
 temp = rtc_clock();
 do {

 gettime(&t);
 ++gt_count;
 } while(rtc_clock() - temp < TEST_TIME); /* count for 5 seconds */

 printf("\nTesting getdate...");
 temp = rtc_clock();
 do {
 getdate(&d);
 ++gd_count;
 } while(rtc_clock() - temp < TEST_TIME); /* count for 5 seconds */

 printf("\nTesting time...");
 temp = rtc_clock();
 do {
 time(&timer);
 ++t_count;
 } while(rtc_clock() - temp < TEST_TIME); /* count for 5 seconds */

 printf("\nTesting time2...");
 temp = rtc_clock();
 do {
 time2(&timer);
 ++t2_count;
 } while(rtc_clock() - temp < TEST_TIME); /* count for 5 seconds */

 printf("\nTesting ctime...");
 temp = rtc_clock();
 do {
 str = ctime(&timer);
 ++ct_count;
 } while(rtc_clock() - temp < TEST_TIME); /* count for 5 seconds */
}

/*
** determine percentage one value represents of another
*/

long percent(long count1,long count2)
{
 long temp;
 temp = (count1 * 100L) / count2;
 if(((count1 * 100L) % count2) * 2L >= count2) /* round half up */
 ++temp;
 return(temp);
}

/*
** show results of timing tests
*/

void display_results()
{
 printf("\nTest Summary:\n");
 printf("\ngettime() called %6ld times\n",gt_count);
 printf("get_rtc_time() called %6ld times\n",grt_count);
 if(grt_count > gt_count)
 printf("get_rtc_time() was %02ld%% the speed of gettime()\n",
 percent(grt_count,gt_count));
 else

 printf("gettime() was %02ld%% the speed of get_rtc_time()\n",
 percent(gt_count,grt_count));

 printf("\ngetdate() called %6ld times\n",gd_count);
 printf("get_rtc_date() called %6ld times\n",grd_count);
 if(grd_count > gd_count)
 printf("get_rtc_date() was %02ld%% the speed of getdate()\n",
 percent(grd_count,gd_count));
 else
 printf("getdate() was %02ld%% the speed of get_rtc_date()\n",
 percent(gd_count,grd_count));

 printf("\ntime() called %6ld times\n",t_count);
 printf("time2() called %6ld times\n",t2_count);
 printf("rtc_time() called %6ld times\n",rt_count);
 if(rt_count > t_count)
 printf("rtc_time() was %02ld%% the speed of time()\n",
 percent(rt_count,t_count));
 else
 printf("time() was %02ld%% the speed of rtc_time()\n",
 percent(t_count,rt_count));

 printf("\nctime() called %6ld times\n",ct_count);
 printf("ctime2() called %6ld times\n",ct2_count);
 if(ct2_count > ct_count)
 printf("ctime2() was %02ld%% the speed of ctime()\n",
 percent(ct2_count,ct_count));
 else
 printf("ctime() was %02ld%% the speed of ctime2()\n",
 percent(ct_count,ct2_count));
}

void main()
{
 enable_rtc_ints();

 clrscr();

 test_rtc(); /* test the functions using the real time clock */
 test_c(); /* test the normal C/DOS based time functions */

 display_results();

 printf("\nEnd of test.\nStart time display.\nDepress any key to stop\n\n");

 while(!kbhit())
 {
 get_rtc_time(&t);
 rtc_time(&timer);
 printf("\r%02.2d:%02.2d:%02.2d.%02.2d %-24.24s",
 t.ti_hour,t.ti_min,t.ti_sec,t.ti_hund,ctime(&timer));
 }

 disable_rtc_ints();
}



[TIMELIB.C]



/*
** TIMELIB.C
** (C) Copyright 1990 by Kenneth Roach
** Version date: 3 November, 1990
**
** This module contains functions similar to ANSI C's time() and clock() and
** Turbo C's gettime() and getdate(), but based on the real time clock of
** AT-class systems. Additionally, functions are provided to enable and
** disable periodic interrupts from the real time clock, along with an
** interrupt handler for them. Interrupts from the real time clock
** are provided at a rate of 1024 per second, and a function is provided to
** return the number of interrupts received in the current second. Also
** provided is a replacement for the C language's ctime() function which is
** modestly faster.
*/

#pragma inline

#include <stdio.h>
#include <dos.h>
#include <time.h>
#include "timelib.h"

#define CMOSFLAG 0x70
#define CMOSDATA 0x71
#define SECONDS_REQ 0x00
#define MINUTES_REQ 0x02
#define HOURS_REQ 0x04
#define STATUSA 0x0a
#define STATUSB 0x0b
#define STATUSC 0x0c
#define DATE_REQ 0x07
#define MONTH_REQ 0x08
#define YEAR_REQ 0x09
#define CENTURY_REQ 0x32
#define UPDATE 0x80
#define BCD 0x04
#define MASK_24 0x02
#define HINIBBLE 0xf0
#define LONIBBLE 0x0f

#define APRIL 4
#define JUNE 6
#define SEPTEMBER 9
#define NOVEMBER 11
#define FEBRUARY 2

#define RTC_VEC 0x70
#define IMR2 0xa1
#define CMD1 0x20
#define CMD2 0xa0
#define EOI 0x20
#define RTC_MASK 0xfe
#define RTC_FLAG 0x40

#define SECS_PER_DAY 86400L
#define SECS_PER_YEAR 31536000L

#define BIAS_10_YEARS 315532800L /* difference between 1970 and 1980 */
#define BASE_YEAR 1980
#define SECS_PER_MIN 60
#define SECS_PER_HOUR 3600
#define MINS_PER_HOUR 60
#define DAYS_PER_YEAR 365
#define DAYS_PER_WEEK 7
#define TUESDAY 3 /* day of week for 1-1-1980 */

#define bcd_bin(x) ((bcd) ? (((((x) & HINIBBLE) >> 4)\
* 10) + ((x) & LONIBBLE)) : (x))

char months[12][4] = {"Jan","Feb","Mar","Apr","May","Jun",
 "Jul","Aug","Sep","Oct","Nov","Dec"};
char days[7][4] = {"Sun","Mon","Tue","Wed","Thu","Fri","Sat"};

extern long timezone;

volatile int rtc_count = 0;
volatile long tick_count = 0L;

void interrupt (*old_rtc_vec)();

int func_init = 0;
int bcd = 0;
int dst = 0;
unsigned int old_mask;
char time_str[26];

/*
** replacement for the Turbo-C clock() function. rtc_clock returns
** a value corresponding to the number of periodic interrupts which
** have occurred since interrupts from the real time clock were
** enabled. The value will remain positive for some 24 days from
** initialization.
*/

clock_t rtc_clock()
{
 return(tick_count);
}

/*
** milli_count returns the real time clock periodic interrupt count for
** the current second. Range of value is 0 to 1023.
*/

int milli_count()
{
 return(rtc_count);
}

/*
** real time clock interrupt handler
*/

void interrupt rtc()
{
 asm cli;

 outportb(CMOSFLAG,STATUSC); /* get interrupt register identification */
 if((inportb(CMOSDATA) & 0x40) != 0) /* if a "periodic" interrupt */
 {
 if(++rtc_count == 1024) /* update nbr times ISR called this sec */
 rtc_count = 0; /* if start of new second, reset rtc_count */
 else
 {
 outportb(CMOSFLAG,STATUSA); /* check it again for accuracy */
 if(inportb(CMOSDATA) & UPDATE)
 rtc_count = 0;
 }
 ++tick_count; /* update total number of times called */
 outportb(CMD1,EOI); /* signal end of interrupt to primary 8259 */
 outportb(CMD2,EOI); /* signal end of interrupt to chained 8259 */
 }
 else
 (*old_rtc_vec)();
 asm sti;
}

/*
** turn on interrupts from the real time clock
*/

void enable_rtc_ints()
{
 rtc_count = 0;
 tick_count = 0L;
 old_rtc_vec = getvect(RTC_VEC);
 setvect(RTC_VEC,rtc); /* point to interrupt handler */
 outportb(IMR2,inportb(IMR2) & RTC_MASK); /* enable clock interrupt */
 outportb(CMOSFLAG,STATUSB);
 old_mask = inportb(CMOSDATA); /* get rtc mask register */
 outportb(CMOSFLAG,STATUSB);
 outportb(CMOSDATA,old_mask | RTC_FLAG); /* enable 1k interrupts */
}

/*
** turn off interrupts from the real time clock
*/

void disable_rtc_ints()
{
 outportb(CMOSFLAG,STATUSB);
 outportb(CMOSDATA,old_mask); /* turn off periodic interrupts */
 outportb(IMR2,inportb(IMR2) | (~RTC_MASK & 0xff)); /* disable RTC interrupts */
 setvect(RTC_VEC,old_rtc_vec); /* restore old interrupt vector */
}

/*
** replacement for the C language's ctime() function
*/

char *ctime2(time_t *t)
{
 unsigned int hr,mn,sc;
 unsigned int yr,mo,dy;
 unsigned int bias,dw;
 int junk,s,tp;

 long temp;
 time_t time;

 time = *t - BIAS_10_YEARS;
 if(dst)
 time += 3600L; /* compensate for daylight savings */
 time -= timezone;
 temp = time % SECS_PER_DAY; /* get seconds left for this day */
 hr = temp / SECS_PER_HOUR; /* determine hours this day */
 temp %= SECS_PER_HOUR; /* lose hours this day */
 mn = temp / SECS_PER_MIN; /* determine minutes this hour */
 sc = temp % SECS_PER_MIN; /* determine seconds this minute */

 asm cli;
 do /* following code duplicated for speed */
 outportb(CMOSFLAG,STATUSA); /* wait until not in update cycle */
 while(inportb(CMOSDATA) & UPDATE);
 outportb(CMOSFLAG,CENTURY_REQ); s = inportb(CMOSDATA); tp = bcd_bin(s);
 outportb(CMOSFLAG,YEAR_REQ); s = inportb(CMOSDATA); bias = bcd_bin(s);
 outportb(CMOSFLAG,MONTH_REQ); s = inportb(CMOSDATA); mo = bcd_bin(s);
 outportb(CMOSFLAG,DATE_REQ); s = inportb(CMOSDATA); dy = bcd_bin(s);
 asm sti;

 bias = bias + tp * 100 - BASE_YEAR;
 temp = time / SECS_PER_DAY; /* get number of days for this value */
 yr = temp / DAYS_PER_YEAR; /* now convert it to years */
 bias >>= 2; /* get leap year days for value */
 dy = temp - yr * DAYS_PER_YEAR - bias; /* get unprocessed days */
 yr += BASE_YEAR; /* now add in the 1980 start date */
 dw = time / SECS_PER_DAY + TUESDAY - 1; /* 1-1-80 was a Tuesday */
 dw %= DAYS_PER_WEEK; /* determine weekday (0 = Sunday) */
 s = 1; /* now determine the month's name */
 mo = 0;
 while(s) /* process total remaining days for year */
 {
 junk = 0;
 switch(s)
 {
 case APRIL: /* first do months with 30 days */
 case JUNE:
 case SEPTEMBER:
 case NOVEMBER: if(dy >= 30)
 junk = 30; break;
 case FEBRUARY: if((yr >> 2) == 0) /* special case february */
 if(dy >= 29)
 junk = 29; /* process leap year */
 else
 ;
 else if(dy >= 28) /* not a leap year */
 junk = 28; break;
 default: if(dy >= 31)
 junk = 31; /* else month has 31 days */
 }
 if(junk)
 {
 ++mo; /* account for month just processed */
 ++s; /* bump case index */
 dy -= junk; /* subtract days just processed */

 }
 else
 s = 0; /* Dy is less than 1 month, clear while var */
 }

 time_str[0] = days[dw][0]; /* now convert all values to a string */
 time_str[1] = days[dw][1]; /* avoid call to sprintf for speed */
 time_str[2] = days[dw][2];
 time_str[4] = months[mo][0];
 time_str[5] = months[mo][1];
 time_str[6] = months[mo][2];
 time_str[8] = dy / 10 + '0';
 time_str[9] = dy % 10 + '0';
 time_str[11] = hr / 10 + '0';
 time_str[12] = hr % 10 + '0';
 time_str[14] = mn / 10 + '0';
 time_str[15] = mn % 10 + '0';
 time_str[17] = sc / 10 + '0';
 time_str[18] = sc % 10 + '0';
 time_str[20] = yr / 1000 + '0'; yr %= 1000;
 time_str[21] = yr / 100 + '0'; yr %= 100;
 time_str[22] = yr / 10 + '0';
 time_str[23] = yr % 10 + '0';
 time_str[24] = '\n';
 time_str[25] = 0;
 time_str[3] = time_str[7] = time_str[10] = time_str[19] = ' ';
 time_str[13] = time_str[16] = ':';
 return(time_str);
}

/*
** replacement for Turbo-C's gettime() function
*/

void get_rtc_time(struct time *timep)
{
 int h,m,s;
 if(!func_init)
 init_time(); /* assure we have info we need */
 asm cli;
 do
 outportb(CMOSFLAG,STATUSA); /* wait until not in update cycle */
 while(inportb(CMOSDATA) & UPDATE);

 outportb(CMOSFLAG,HOURS_REQ); /* get hours */
 h = inportb(CMOSDATA); timep->ti_hour = bcd_bin(h);
 outportb(CMOSFLAG,MINUTES_REQ); /* get minutes */
 m = inportb(CMOSDATA); timep->ti_min = bcd_bin(m);
 outportb(CMOSFLAG,SECONDS_REQ); /* get seconds */
 s = inportb(CMOSDATA); timep->ti_sec = bcd_bin(s);
 asm sti;
 s = rtc_count / 10; /* rtc_count goes to 1024 */
 if(s > 75) /* correct for values to 102 each second */
 s -= 3;
 else if(s > 50)
 s -= 2;
 else if(s > 25)
 --s;
 timep->ti_hund = s;

}

/*
** replacement for Turbo-C's getdate() function
*/

void get_rtc_date(struct date *datep)
{
 int d,m,y,t,s;
 if(!func_init)
 init_time(); /* assure we have info we need */
 asm cli;
 do
 outportb(CMOSFLAG,STATUSA); /* wait until not in update cycle */
 while(inportb(CMOSDATA) & UPDATE);

 outportb(CMOSFLAG,CENTURY_REQ); /* get century */
 s = inportb(CMOSDATA); t = bcd_bin(s);
 outportb(CMOSFLAG,YEAR_REQ); /* get year */
 y = inportb(CMOSDATA); datep->da_year = bcd_bin(y);
 outportb(CMOSFLAG,MONTH_REQ); /* get month */
 m = inportb(CMOSDATA); datep->da_mon = bcd_bin(m);
 outportb(CMOSFLAG,DATE_REQ); /* get day */
 d = inportb(CMOSDATA); datep->da_day = bcd_bin(d);
 asm sti;
 datep->da_year = datep->da_year + t * 100; /* add in century */
}

/*
** replacement for Turbo-C's time() function
*/

time_t rtc_time(time_t *result)
{
 time_t hr;
 unsigned s,b,yr,sc,mn,mo,dy;
 if(!func_init)
 init_time(); /* assure we have info we need */
 asm cli; /* following code is duplicated for speed */
 do
 outportb(CMOSFLAG,STATUSA); /* wait until not update cycle */
 while(inportb(CMOSDATA) & UPDATE);

 outportb(CMOSFLAG,SECONDS_REQ); /* get seconds */
 s = inportb(CMOSDATA); sc = bcd_bin(s);
 outportb(CMOSFLAG,MINUTES_REQ); /* get minutes */
 s = inportb(CMOSDATA); mn = bcd_bin(s);
 outportb(CMOSFLAG,HOURS_REQ); /* get hours */
 s = inportb(CMOSDATA); hr = bcd_bin(s);

 outportb(CMOSFLAG,YEAR_REQ); /* get year */
 s = inportb(CMOSDATA); yr = bcd_bin(s);
 outportb(CMOSFLAG,CENTURY_REQ); /* get century */
 s = inportb(CMOSDATA); b = bcd_bin(s);
 outportb(CMOSFLAG,MONTH_REQ); /* get month */
 s = inportb(CMOSDATA); mo = bcd_bin(s);
 outportb(CMOSFLAG,DATE_REQ); /* get day */
 s = inportb(CMOSDATA); dy = bcd_bin(s);
 asm sti;


 mn = mn * 60 + sc; /* convert minutes to seconds */
 hr = hr * 3600 + mn + timezone; /* convert hours to seconds */
 yr = yr + b * 100 - 1980; /* get years since 1980 */
 dy = dy - 1 + ((yr + 3) >> 2); /* days so far plus leap days of prior years */
 s = 1;
 while(s < mo) /* add days for this year */
 switch(s++)
 {
 case APRIL: /* months with 30 days */
 case JUNE:
 case SEPTEMBER:
 case NOVEMBER: dy += 30L; break;
 case FEBRUARY: dy += ((yr & 3) == 0) ? 29L : 28L; break;
 default: dy += 31L; /* else month has 31 days */
 }
 if(dst)
 hr -= 3600L; /* compensate for daylight savings */
 return(*result = (yr * SECS_PER_YEAR + /* return final value */
 dy * SECS_PER_DAY +
 hr + BIAS_10_YEARS)); /* 10 yr bias for difference */
 /* between 1970 and 1980 (secs) */
}

/*
** version of time() built on Turbo-C's getdate() and gettime() functions
*/

time_t time2(time_t *result)
{
 time_t hr;
 unsigned s,yr,mn,mo,dy;
 struct date d;
 struct time t;
 asm cli;

 getdate(&d);
 gettime(&t);
 mn = t.ti_min * 60 + t.ti_sec; /* convert minutes to seconds */
 hr = t.ti_hour * 3600 + mn + timezone; /* convert hours to seconds */
 yr = d.da_year - 1980; /* get years since 1980 */
 dy = d.da_day - 1 + ((yr + 3) >> 2); /* days so far plus prior leap days */
 s = 1;
 mo = d.da_mon;
 while(s < mo) /* add days for this year */
 switch(s++)
 {
 case APRIL: /* months with 30 days */
 case JUNE:
 case SEPTEMBER:
 case NOVEMBER: dy += 30L; break;
 case FEBRUARY: dy += ((yr & 3) == 0) ? 29L : 28L; break;
 default: dy += 31L; /* else month has 31 days */
 }
 if(dst)
 hr -= 3600L; /* compensate for daylight savings */
 asm sti;
 return(*result = (yr * SECS_PER_YEAR + /* return final value */
 dy * SECS_PER_DAY +

 hr + BIAS_10_YEARS)); /* 10 yr bias for difference */
 /* between 1970 and 1980 (secs) */
}
/*
** initialize variables for rtc time and date functions
*/

void init_time()
{
 struct tm *cur_time;
 time_t timer;
 time(&timer); /* kick start TC's time code */
 cur_time = localtime(&timer); /* check for daylight savings time */
 dst = cur_time->tm_isdst;
 outportb(CMOSFLAG,STATUSB); /* get mode the clock is in */
 bcd = (inportb(CMOSDATA) & BCD) == 0; /* (binary or BCD) */
 outportb(CMOSFLAG,STATUSB);
 outportb(CMOSDATA,inportb(CMOSDATA) | MASK_24);/* force 24 hour mode */
 func_init = 1;
}



[TIMELIB.H]


/*
** TIMELIB.H
**
** prototype declarations for TIMELIB.C
*/


clock_t rtc_clock();
int milli_count();
void enable_rtc_ints();
void disable_rtc_ints();
void get_rtc_time(struct time *timep);
void get_rtc_date(struct date *datep);
time_t rtc_time(time_t *result);
time_t time2(time_t *result);
void init_time();
char *ctime2(time_t *t);
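
Converting a count of days into a calendar date, as ctime2 must do, is easy to get subtly wrong around leap years and year boundaries. The following is a minimal, portable C sketch of one way to do it, assuming dates from 1980 through 2099 (so the century exception to the leap-year rule never arises). The names is_leap, days_to_date, and days_to_weekday are hypothetical and are not part of TIMELIB.

```c
/* Days in each month of a non-leap year. */
static const int month_days[12] =
    { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

/* Leap-year test; valid for 1980..2099 only (assumption of this sketch). */
int is_leap(int year)
{
    return (year % 4) == 0;
}

/* Split a count of days since 1 January 1980 (day 0) into a
   year, a month (1..12), and a day of the month (1..31). */
void days_to_date(long days, int *year, int *month, int *day)
{
    int y = 1980;
    int m = 0;
    int len;

    for (;;) {                      /* peel off whole years */
        len = is_leap(y) ? 366 : 365;
        if (days < len)
            break;
        days -= len;
        y++;
    }
    for (;;) {                      /* peel off whole months */
        len = month_days[m] + ((m == 1 && is_leap(y)) ? 1 : 0);
        if (days < len)
            break;
        days -= len;
        m++;
    }
    *year = y;
    *month = m + 1;
    *day = (int)days + 1;
}

/* Day of the week, 0 = Sunday; 1 January 1980 was a Tuesday (2). */
int days_to_weekday(long days)
{
    return (int)((days + 2) % 7);
}
```

Subtracting whole years and then whole months keeps each step small enough to check by hand, at the cost of a short loop per call.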



















June, 1991
FAST SORTING USING LARGE STRING BUFFERS


More powerful sorts in Basic




Dale Thorn


Dale is a programmer at AGC Corp., where he specializes in optimization and
portability of Basic programs. He can be reached at 1001 AGC Drive, Cleveland,
TN 37312.


Basic has a reputation for making it possible to get small programs up and
running quickly. But on the other hand, the language has gained a reputation
for limited functionality, particularly when it comes to pointers and
large-scale memory management.
Most of what needs to be done regarding pointer types can be accomplished with
ordinary 16- and 32-bit integers; when formal identification is required,
variable names such as ptr.databuffer or ptr.indexbuffer will suffice.
As for memory management, the sorting routine described in this article
provides additional functionality by using large, single-string buffers and
integer variables as pointers to the buffers. The design objective of this
sort routine was to read data from files or file keys, send it one record at a
time to the sort, and begin retrieval with the time between the first record
sent and the first record retrieved reduced to the absolute minimum. The
current version of the routine will accommodate more than 32,000 records; you
can modify the routine to handle several million records by creating index
buffers with a segment length of three instead of two.
This version is compatible with Microsoft Basic 4.0 (and up) where the binary
file mode provides the advantage of not closing and reopening files when
writing to them with different lengths. Porting to Basic implementations
compatible with Basic 4.0 and up shouldn't be a problem; moving to older
dialects will likely involve changes. (For instance, you may have to change
deflng x to defsng x if long integers are not available, then change the
affected clng, mod, and \ statements accordingly; use gotos without changing
the visible structure of the code when block if/then/else isn't available; and
make the main buffers [sbuf$, sndx$] the field variables in corresponding
random file opens if binary file mode is not available.) Porting to other
languages (especially C) may involve more work, but I've written the code in a
format designed to minimize porting problems. For example, I use nonstring
operations (such as memcpy) and buffers and other entities that allow maximum
flexibility in moving memory blocks.


Sorting String Data


Most Basic programs that sort string data use string arrays to hold the data
to be sorted. While individual array elements can be swapped quickly as the
result of a comparison, other problems (multilevel sorting, array element
assignment, writing arrays to disk, and so on) tend to slow the process to the
point where assembler routines are invariably required for performance. The
sort routine in Listing Two (page 95) takes a somewhat radical approach to the
problem by combining several techniques into one.
The first step in building this routine was to create a large string buffer to
hold all the sort data, where every n characters form one data element
(including blank-space padding), then insert each data element in sequence
after first shifting any greater values upward in the buffer with a mid$
command: mid$(buffer$, x + n) = mid$(buffer$, x). This technique has two
problems that led
to the addition of an indexing buffer in the current routine. First, the mid$
command requires swap space in string memory equal to the size of the shift on
the right side of the equal sign, which could double the memory requirement or
slow the process down by having to shift in segments. Secondly, the time
required to shift the buffer for each insertion was prohibitive.
My solution was to write each data element into the main buffer (sbuf$) in
sequential order while performing the aforementioned insert on the associated
index buffer (sndx$). The Basic stringstack technique used here can best be
described by analogy: Imagine a library with three trays of index cards
(ordered by author, title, and subject) where each card contains the data and
an exact physical location for the book. Now remove all data from the cards
except the physical location pointer. This makes for a very compact index (in
terms of the size of index data stored). The major disadvantage is that to
find a specific title, a library patron will have to perform a binary search
on an index tray and walk over to the shelves for each index card examined,
just to make a comparison. Then the patron will have to determine whether to
move up or down the index stack before making another trip to the bookshelves.
Although impractical for library patrons, this approach works very well on
computers because computers get the data record directly from the index
pointer.
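
The data-buffer-plus-index-buffer idea can be sketched in C as follows. This is only an illustration under stated assumptions, not the Basic code of Listing Two: the record width, the capacity, and the names SortBuf, sortbuf_add, and sortbuf_get are all hypothetical. Records are appended in arrival order, and only the 2-byte index entries are shifted on each insertion.

```c
#include <stdint.h>
#include <string.h>

#define REC_LEN   8     /* fixed, blank-padded record width (assumed) */
#define MAX_RECS  100   /* capacity of this sketch's buffers (assumed) */

typedef struct {
    char     data[MAX_RECS * REC_LEN]; /* records in arrival order (like sbuf$) */
    uint16_t index[MAX_RECS];          /* record numbers in sorted order (like sndx$) */
    int      count;                    /* records stored so far */
} SortBuf;

/* Append one record and insert its number into the sorted index.
   Returns 0 on success, -1 if the buffer is full. */
int sortbuf_add(SortBuf *sb, const char *rec)
{
    int lo = 0;
    int hi = sb->count;

    if (sb->count >= MAX_RECS)
        return -1;
    memcpy(sb->data + sb->count * REC_LEN, rec, REC_LEN);

    /* Binary search: every probe follows an index entry into the data. */
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (memcmp(sb->data + sb->index[mid] * REC_LEN, rec, REC_LEN) <= 0)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* Shift only the small index entries, never the records themselves. */
    memmove(&sb->index[lo + 1], &sb->index[lo],
            (size_t)(sb->count - lo) * sizeof sb->index[0]);
    sb->index[lo] = (uint16_t)sb->count;
    sb->count++;
    return 0;
}

/* Return a pointer to the i-th record in sorted order (0-based). */
const char *sortbuf_get(const SortBuf *sb, int i)
{
    return sb->data + sb->index[i] * REC_LEN;
}
```

With 2-byte entries, each insertion moves at most two bytes per record already held, whatever the record length.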
The third major technique used here is the group merge, where memory is not
sufficient to sort all data in one pass and each group is dumped to disk files
until the sort process is completed. The sort routine then loads one record
from each group and outputs the lowest (highest if descending order) of the
batch.
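
A minimal C sketch of the merge step follows. It assumes each group is already sorted and, to stay self-contained, keeps the groups in memory where the real routine would read them from disk files. The Group structure and merge_next function are hypothetical names, and only ascending order is shown.

```c
#include <string.h>

#define REC_LEN 4   /* fixed record width (assumed for the sketch) */

typedef struct {
    const char *recs;   /* sorted records of one group, back to back */
    int         total;  /* number of records in the group */
    int         next;   /* next unread record in the group */
} Group;

/* Copy the lowest pending record across all groups into out.
   Returns 1 if a record was produced, 0 once every group is exhausted. */
int merge_next(Group *groups, int ngroups, char *out)
{
    int g;
    int best = -1;

    for (g = 0; g < ngroups; g++) {
        if (groups[g].next >= groups[g].total)
            continue;                       /* this group is used up */
        if (best < 0 ||
            memcmp(groups[g].recs + groups[g].next * REC_LEN,
                   groups[best].recs + groups[best].next * REC_LEN,
                   REC_LEN) < 0)
            best = g;                       /* lowest head record so far */
    }
    if (best < 0)
        return 0;

    memcpy(out, groups[best].recs + groups[best].next * REC_LEN, REC_LEN);
    groups[best].next++;
    return 1;
}
```

Each call scans one head record per group, so the cost of a retrieval grows with the number of groups and not with the number of records.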


String Variables


Probably the biggest time-killer in most Basic programs is the creation of
string variables, both real and virtual. Creation of real strings cannot be
avoided, but when a string is recreated to the same length over and over
instead of just being cleared, it wastes time and forces more garbage
collection. The virtual strings, as I call them, do not waste as much time as
the assigned strings because they disappear after the current execution
statement and do not work their way down into the string heap. Mid$ commands,
str$, chr$, and so forth, all create virtual strings; when the number of
incidents is very large, we have, in certain cases, an opportunity to save
some time.
The first technique is to create a substitute for the Basic function chr$. The
char$ array in Listing One (page 94) requires about 800 bytes of memory, and
once created (see Listing Two), a call to char$ should be about four times
faster than a call to chr$.
The second technique is not available in current versions of Basic, but
because it is available in a relatively inexpensive replacement library (see
"Declarations for the Sample Sort Program"), I think it worthwhile to describe
here. The midchar function gets the ASCII value of any character in a string
without having Basic create the single mid$ character first, and does it about
five times faster if linking with Crescent Software's PDQ replacement library.
The third technique used in this sort is based on the second, and converts
16-bit integer values from string to numeric, as cvi(mid$(xxx$, midpos, 2))
does, without having Basic create the mid$ characters: get the midchar of the
first byte and add the midchar of the second byte after multiplying it by 256.
This third technique is approximately three times faster than the
cvi(mid$(...)) function. For maximum portability, I keep a set of these
replaceable routines handy to append to Basic programs when compiling in
situations where PDQ and other proprietary add-ons are not available or
applicable.
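
The byte arithmetic behind the second and third techniques can be sketched in C. The function names below are illustrative stand-ins, not the PDQ library routines, and positions are 1-based as in Basic. Folding values above 32767 back into the signed range is an assumption added here so that the result matches what a signed 16-bit conversion would return.

```c
/* Value (0..255) of the character at 1-based position pos,
   in the spirit of the article's midchar function. */
int midchar(const char *s, int pos)
{
    return (unsigned char)s[pos - 1];
}

/* Read a 16-bit integer stored low byte first at 1-based position pos:
   first byte plus 256 times the second byte. */
long cvi_at(const char *s, int pos)
{
    long v = (long)midchar(s, pos) + 256L * (long)midchar(s, pos + 1);

    /* Assumption: fold into the signed 16-bit range. */
    return (v > 32767L) ? v - 65536L : v;
}
```

No temporary string is built at any point; both functions read the bytes in place.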


Declarations for the Sample Sort Program


The sample program in Listing One (which calls the sort in Listing Two) has
four main functions: send data to sort, create external index (optional),
resort from workfile (optional), and retrieve data from sort. From the top of
the listing, you will note the midchar declaration. The mid$ call it replaces
could be ignored in the code without requiring a separate function, but when
the function (see the end of the listing) is eliminated and the program is
compiled with Crescent Software's PDQ library, the speed increase is
substantial, so I keep it in this format for maximum portability.
The next bit of code is the DIM line; the subscripts equal to 10 may be raised
or lowered if desired, and the subscript of 100 may have to be increased if
the number of sort groups could be greater than 100. As an example, if the
length of each sort string (sdat$) were 100 bytes, and the maximum sort buffer
(sbuf$) size were 30,000, only 300 records could be held in memory in a single
sort group. If the total number of records to be sorted were more than 30,000,
more than 100 groups would be required, and the aforementioned subscript would
have to be increased, although this would be an extreme case. The subscript of
255 for char$ cannot be lessened because it is a replacement for the Basic
chr$ function, and requires all 256 characters of the ASCII collating set.
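
The group arithmetic in the example above can be written out as a short C sketch. The function names are hypothetical. A 30,000-byte buffer with 100-byte records holds 300 records per group, so 30,000 records fill exactly 100 groups and one more record needs a 101st.

```c
/* Whole records that fit in one in-memory sort group. */
long records_per_group(long buf_bytes, long rec_len)
{
    return buf_bytes / rec_len;
}

/* Groups needed for total records, rounding any partial group up. */
long groups_needed(long total, long per_group)
{
    return (total + per_group - 1) / per_group;
}
```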
Next, I used common shared to pass a group of variables to the main sorting
routine (Listing Two), but you will want to employ a named common block in
real applications so that the variables are not passed when chaining. Most of
the variable declarations below the common lines are required. The same list,
along with a few others, can be found at the top of Listing Two. The three
lines marked with an asterisk at the right side are the only lines that
require the programmer's attention: the sort-string length and the two
integer-masking lines.
Two of the data sections (sortdata1 and sortdata3) are examples only and are
usually provided to the executing program by configuration files or by a data
dictionary. The data in sortdata2 is actual sort data and will generally come
from database files and the like. The remaining information for configuring a
sort can be found at the top of Listing Two.

--D.T.


Technical Details of the Sort Subprogram


The sortsq variable in Listing Two indicates the order of output (ascending/
descending), and resets the output pointer on each retrieval. nvflag is used
to minimize the character inversions for ascending/descending sequences, and
when nvflag matches the iseq() flag for a sort data segment, the characters in
that segment are inverted (subtracted from 255) to facilitate the
straightforward string comparisons that are the heart of the routine (see the
fillproc subroutine). For reasons that pertain to Basic's internal structure,
this inversion method, along with the swapping of bytes in 16-bit integer
strings, provides better performance than a multistage comparison where each
segment is compared in turn, and the comparison terminates early if a
difference is found. This advantage applies even when the char$ and midchar
techniques are not included.
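
The two key transformations can be sketched in C as follows. This is an illustration of the idea, not the fillproc code: the function names are hypothetical, and the byte swap is shown for non-negative integers only (negative values would need further treatment, which this sketch does not attempt). After the transformations, a single byte-wise comparison of the whole key gives the intended order.

```c
/* Invert every byte of a segment (subtract from 255) so that an
   ascending byte-wise compare yields descending order for it. */
void invert_segment(unsigned char *seg, int len)
{
    int i;

    for (i = 0; i < len; i++)
        seg[i] = (unsigned char)(255 - seg[i]);
}

/* Swap the two bytes of a 16-bit field stored low byte first, so the
   high byte is compared first. Non-negative values only (assumption). */
void swap_int16_bytes(unsigned char *field)
{
    unsigned char t = field[0];

    field[0] = field[1];
    field[1] = t;
}
```

For example, 256 is stored as the bytes 0, 1 and 255 as the bytes 255, 0; compared as stored, 256 sorts first, and after the swap 255 sorts first, as it should.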
One place in the code where you will notice some cryptic and mostly
undocumented variables (ixx1, ixx2, and so on) is about 85 lines after the
beginning line (sub nsort...). These variables were built from a complex
set of sort-group initialization parameters, and they represent the only
efficient means I could find to put the intended idea into code. If this had
to be rewritten, I'd recommend finding another algorithm and avoiding any
modifications here.
The calls to memfree (25, 70, and 72 lines from the beginning) specify a byte
exclusion value as the first parameter. For reasons I could not deduce, older
versions of Basic seemed to want at least 4000 bytes overhead in the string
heap beyond any amount calculated for temporary (virtual) strings and so on in
order to prevent a string space corrupt error. Other Basic versions may not
have this limitation, but on the other hand, they have the memory to spare, so
I prefer to stay with the full exclusion for maximum portability. --D.T.



_FAST SORTING USING LARGE STRING BUFFERS_
by Dale Thorn


[LISTING ONE]

'==============================================================================
'NSORT.BAS Sort/retrieve/index data; ascending/descending; mixed data types
' By: Dale Thorn
' Rev. 03/26/91
'==============================================================================
main:

defint a-w
deflng x
defsng y
defdbl z

declare function midchar(i$, i) 'use Basic function (listed) if PDQ not avail.

dim ibeg(10), ilen(10), iptx(100, 1), iseq(10), char$(255)

common shared compln, ddunit, grpptr, grptot, maxrcd, memndx, ndunit
common shared ndxgrp, ndxlen, nosegs, nvflag, offset, opcode, opinit
common shared outptr, outtot, rcdptr, rcdtot, recptr, sdunit, sortln
common shared sortsq, subtot, ibeg(), ilen(), iptx(), iseq(), char$()

compln = 0 'comparison length in sort data (sdat$); may be less than sortln
ddunit = 0 'file channel/unit number for index-building (opcode = -3)
grpptr = 0 'sort group record pointer/sort buffer pointer
grptot = 0 'internal sort group size
maxrcd = 0 'internal maximum sort group size
memndx = 0 'internal index-load flag
ndunit = 0 'file channel/unit number for sort index files
ndxgrp = 0 'internal index file group record counter
ndxlen = 0 'internal index file record size
nosegs = 0 'no. of sort segments in sdat$; total length of segments = compln
nvflag = 0 'internal optimization for least ascending/descending inversions
offset = 0 'internal group-to-record offset counter
opcode = 0 'sort operation (0 to -3)
opinit = 0 'internal sort operation data initialization flag
outptr = 0 'internal data output record pointer
outtot = 0 'internal data output record counter
rcdptr = 0 'internal sort data record counter (all records)
rcdtot = 0 'internal sort data record total (final count)
recptr = 0 'internal sort data record counter (group records)
sdunit = 0 'file channel/unit number for sort data file (.sdx)
sortln = 0 'length of sort data buffer (sdat$); may be greater than compln
sortsq = 0 'internal sort sequence (ascending/descending) flag
subtot = 0 'internal partial group data record total (final count)

drcd$ = "" 'temp. sort data record buffer
nrcd$ = "" 'sort index file buffer
sbuf$ = "" 'main sort group memory buffer
sdat$ = "" 'main sort data record buffer
smsk$ = "" 'sort data mask (must be uppercased) [BBXXXBBXXXXXBB.....]
sndx$ = "" 'sort index-pointer memory buffer


'// NOTE: Any lines below with an asterisk (*) on the extreme /////
' right will require a modification or replacement. /////
'/////// Modification applies to DATA statements as well. /////

sortln = 40 'total sort buffer length*
pfmt$ = space$(5) 'output format buffer for integer strings
sdat$ = space$(sortln) 'sort data record buffer

restore sortdata1 'first tablespec to sort from
read sdunit, ndunit, ddunit 'file channel/unit numbers used by NSORT.SUB
read ibeg(0), ilen(0), iseq(0) 'test values from table sortdata1
nosegs = 0 'initialize total no. of sort segments
while ibeg(0) 'begin loop to load segment pointers and flags
 nosegs = nosegs + 1 'increment total sort segments
 ibeg(nosegs) = ibeg(0) 'segment begin pointer for sdat$ buffer
 ilen(nosegs) = ilen(0) 'segment length
 iseq(nosegs) = iseq(0) 'segment sort sequence (ascending/descending)
 compln = compln + ilen(0) 'total sort compare length
 read ibeg(0), ilen(0), iseq(0) 'read next set of test values
wend
smsk$ = string$(compln, "X") 'allocate masking buffer (default type=character)
mid$(smsk$, 21) = "BB" '"binary" position specified*
mid$(smsk$, 33) = "BB" '"binary" position specified*

restore sortdata2 'sample sort data table
opcode = 0 'set flag to add records to sort (initial operation)
nrcds = 0 'number of records added to the sort
do 'begin loop to read data and add to sort
 segptr = 1 'set segment position pointer for sdat$
 lset sdat$ = "" 'clear the sort data buffer prior to loading
 for segno = 1 to nosegs 'begin loop to load each data segment
 read segdata$ 'read data segment from table sortdata2
 if len(segdata$) = 0 then exit do 'exit read-data loop at end-of-data
 if midchar(smsk$, segptr) = 66 then '16-bit integer <BB> segment
 mid$(sdat$, segptr) = mki$(val(segdata$)) 'convert data to integer
 else 'character <XX....> segment
 mid$(sdat$, segptr) = segdata$ 'put character segment to sort buffer
 end if
 segptr = segptr + ilen(segno) 'increment segment position pointer
 next
 call nsort(drcd$, nrcd$, sbuf$, sdat$, smsk$, sndx$) 'add record to sort
 nrcds = nrcds + 1 'total records added to the sort
loop

opcode = -3 'set flag to build an external index to the sortdata file
call nsort(drcd$, nrcd$, sbuf$, sdat$, smsk$, sndx$) 'build the index file

open "sortdata.ddx" for binary as #ddunit 'open the external index file
ddxrcd$ = space$(2) 'allocate the index buffer
for rcdno = 1 to nrcds 'begin loop to retrieve and display indexed data
 call fileio(ddunit, 2, clng(rcdno), ddxrcd$, 0) 'retrieve an index record
 call fileio(sdunit, sortln, clng(cvi(ddxrcd$)), sdat$, 0) 'retrieve data
 for segno = 1 to nosegs 'begin loop to display sort segments
 if midchar(smsk$, ibeg(segno)) = 66 then '16-bit integer <BB> segment
 rset pfmt$ = right$(str$(cvi(mid$(sdat$, ibeg(segno), 2))), 5)
 print pfmt$; " "; 'print integer data
 else 'character segment
 print mid$(sdat$, ibeg(segno), ilen(segno)); " "; 'print char. data

 end if
 next
 print 'terminate print line
next
call killfile("sortdata.ddx", ddunit) 'index file closed and removed

restore sortdata3 'next tablespec to sort from
read ibeg(0), ilen(0), iseq(0) 'test values from table sortdata3
compln = 0 'comparison length in sort data (sdat$)
nosegs = 0 'initialize total no. of sort segments
while ibeg(0) 'begin loop to load segment pointers and flags
 nosegs = nosegs + 1 'increment total sort segments
 ibeg(nosegs) = ibeg(0) 'segment begin pointer for sdat$ buffer
 ilen(nosegs) = ilen(0) 'segment length
 iseq(nosegs) = iseq(0) 'segment sort sequence (ascending/descending)
 compln = compln + ilen(0) 'total sort compare length
 read ibeg(0), ilen(0), iseq(0) 'read next set of test values
wend

opcode = -1 'set flag to resort data from existing sort file
call nsort(drcd$, nrcd$, sbuf$, sdat$, smsk$, sndx$) 'resort the data

opcode = -2 'set flag to retrieve records from sort (final operation)
call nsort(drcd$, nrcd$, sbuf$, sdat$, smsk$, sndx$) 'retrieve 1st data record
while len(sdat$) 'begin loop to display sort data
 for segno = 1 to nosegs 'begin loop to display sort segments
 if midchar(smsk$, ibeg(segno)) = 66 then '16-bit integer <BB> segment
 rset pfmt$ = right$(str$(cvi(mid$(sdat$, ibeg(segno), 2))), 5)
 print pfmt$; " "; 'print integer data
 else 'character segment
 print mid$(sdat$, ibeg(segno), ilen(segno)); " "; 'print char. data
 end if
 next
 print 'terminate print line
 call nsort(drcd$, nrcd$, sbuf$, sdat$, smsk$, sndx$) 'retrieve next record
wend

close 'close all files
system 'return to DOS

'------------------------------------------------------------------------------
sortdata1: 'initial sort specifications
'------------------------------------------------------------------------------

'_____datafile____indexfile____buildfile :'File channel/unit numbers;
data 1, 2, 3 :'may be found using FREEFILE


'_____segbegin____seglength____segsequence :'Segment begin pointers, lengths
data 1, 20, 1 :'and sort sequences for sort
data 21, 2, -1 :'data buffer (sdat$).
data 23, 10, 1 :' sequence = 1; ascending
data 33, 2, -1 :' sequence = -1; descending
data 35, 6, 1 :'
data 0, 0, 0 :'end-of-data markers

'------------------------------------------------------------------------------
sortdata2: 'example sort data
'------------------------------------------------------------------------------


'_______Alpha data, len=20______Num.(2)______Alpha (10)____Num.(2)____Alpha (6)
data "Petrol Chemicals Ltd", "3576", "London SW3", "588", "A23456"
data "Associated Factories", "112", "Richmond", "1313", "XNA"
data "Dale's Containers", "12343", "Devonshire", "55", "DALE"
data "", "", "", "", ""

'------------------------------------------------------------------------------
sortdata3: 'specifications for alternate sorting order
'------------------------------------------------------------------------------

'_____segbegin____seglength____segsequence :'Segment begin pointers, lengths
data 33, 2, 1 :'and sort sequences for sort
data 1, 10, 1 :'data buffer (sdat$).
data 0, 0, 0 :'end-of-data markers

function midchar (i$, i) static 'find ASCII value of a single character in i$
 midchar = asc(mid$(i$, i, 1)) 'set midchar value
end function 'return to calling program

rem $include: 'nsort.sub'





[LISTING TWO]


'==============================================================================
'NSORT.SUB Sort/retrieve/index data; ascending/descending; mixed data types
' By: Dale Thorn
' Rev. 03/24/91
'------------------------------------------------------------------------------
' compln - comparison length in sort data (sdat$); may be less than sortln
' ddunit - file channel/unit number for index-building (opcode = -3)
' grpptr - sort group record pointer/sort buffer pointer
' grptot - internal sort group size
' maxrcd - internal maximum sort group size
' memndx - internal index-load flag
' ndunit - file channel/unit number for sort index files
' ndxgrp - internal index file group record counter
' ndxlen - internal index file record size
' nosegs - no. of sort segments in sdat$; total length of segments = compln
' nvflag - internal optimization for least ascending/descending data inversions
' offset - internal group-to-record offset counter
' opcode - sort operation (0 to -3)
' opinit - internal sort operation data initialization flag
' outptr - internal data output record pointer
' outtot - internal data output record counter
' rcdptr - internal sort data record counter (all records)
' rcdtot - internal sort data record total (final count)
' recptr - internal sort data record counter (group records)
' sdunit - file channel/unit number for sort data file (.sdx)
' sortln - length of sort data buffer (sdat$); may be greater than compln
' sortsq - internal sort sequence (ascending/descending) flag
' subtot - internal partial group data record total (final count)
'
' ibeg() - segment begin pointers for sort data buffer (sdat$)

' ilen() - segment length pointers for sort data buffer (sdat$)
' iptx() - pointers used if merge-sort req'd. (set internally)
' iseq() - segment sequence pointers for sort data buffer (sdat$)
' 1 = ascending; -1 = descending
' char$() - high-performance substitute for Basic chr$() function
'
' drcd$ - temp. sort data record buffer (set to "" on first call)
' nrcd$ - sort index file buffer (set to "" on first call)
' sbuf$ - main sort group memory buffer (set to "" on first call)
' sdat$ - main sort data record buffer (set to actual value on first call)
' smsk$ - sort data mask (must be uppercased)
' BB = integer string; XXX.... all other bytes
' sndx$ - sort index-pointer memory buffer (set to "" on first call)
'
'
' set opcode = 0 on first call to add records to sort.
' set opcode = -1 to resort data from existing sort work file (sortdata.sdx).
' set opcode = -2 on first call to retrieve records from sort.
' set opcode = -3 to build index file (sortdata.ddx).
'
' *** Notes: opcode = 0 is always the first process (add records).
' opcode = -1 may be set to resort data, but only following
' the creation of an index with opcode set to -3.
' opcode = -2 may be set to retrieve records once all records
' have been added with opcode set to 0, or after
' a resort with opcode set to -1. Once opcode is
' set to -2 and all records are retrieved, the
' sort routine is terminated and all sort memory
' is returned to the calling program. If further
' sorting is required, begin anew with opcode = 0.
' opcode = -3 may be set to build an index file following an
' initial sort with opcode set to 0, or a resort
' with opcode set to -1. If more than 2 sorting
' sequences are required, where 2 or more index
' files are needed, rename each .ddx file to save it.
' The final sort sequence may be obtained using
' opcode = -2, and thus eliminate the need for a
' corresponding index file. Each 2 bytes in the index
' file are a pointer to a record in the .sdx file.
'
' For the first sort (opcode = 0), place all sort segments of sdat$
' into the left part of sdat$ in sequential order (1, 2, 3, etc.).
' When re-sorting using opcode = -1, segments may be in any order.
' All data stored in sortdata.sdx will be in the original sequence.
'
' ***** Important: Minimum sort length is 2 bytes.
' ***** If free memory is minimal, more sort groups may
' ***** be needed, and dim iptx(nnn) may be too small.
' ***** Each opcode process must be completed for all
' ***** records before switching to another process.
' ***** Use named common block if chaining programs.
'------------------------------------------------------------------------------
sub nsort (drcd$, nrcd$, sbuf$, sdat$, smsk$, sndx$) static
 if opcode > -2 then 'insert a record <add to the sort>
 if opinit mod 2 = 0 then 'first-sort-record initialization
 opinit = opinit - 1 'adjust initialization flag
 sortsq = iseq(1) 'primary output sequence
 nvflag = 0 'data inversion flag
 for segno = 1 to nosegs 'build data inversion spec

 nvflag = nvflag + ilen(segno) * iseq(segno) 'bytes above/below 0
 next
 if nvflag < 0 then 'data inversion optimization
 nvflag = 1 'set inversion flag plus
 else
 nvflag = -1 'set inversion flag minus
 end if '[see fillproc & writeproc subroutines]
 if nvflag = sortsq then sortsq = -sortsq 'primary output sequence
 call killfile("sortdata.ndx", ndunit) 'kill work index file
 open "sortdata.ndx" for binary as #ndunit 'open work index file
 if opcode = 0 then 'initial (add records) operation
 call killfile("sortdata.sdx", sdunit) 'kill work data file
 open "sortdata.sdx" for binary as #sdunit 'open work data file
 drcd$ = space$(sortln) 'temporary sort data buffer
 for ichr = 0 to 255 'create substitute character set
 char$(ichr) = chr$(ichr) 'substitute for Basic chr$() function
 next
 end if
 call memfree(clng(4096), clng(195840), xfree) 'reserve 4 kb memory
 maxrcd = xfree \ (sortln + 4) 'maximum records per memory group
 if maxrcd > 32640 \ sortln then maxrcd = 32640 \ sortln 'buffer size
 sbuf$ = space$(maxrcd * sortln) 'main sort data buffer
 sndx$ = space$(maxrcd * 2 + 2) 'reorderable/shiftable index buffer
 rcdptr = 1 'used to count total records
 recptr = 1 'used to count records within a sort group
 grpptr = 1 'sort buffer pointer
 end if
 if opcode = -1 then 'resort from existing workfile (.sdx)
 ndxgrp = 0 'total number of sort groups
 offset = 0 'internal group-to-record offset counter
 while rcdptr <= rcdtot 'loop until all records are read
 call fileio(sdunit, sortln, clng(rcdptr), sdat$, 0) 'get sort data
 gosub putproc 'add records in new sort sequence
 wend
 else 'original (insert) sequence
 gosub putproc 'add records to sort
 end if
 else 'retrieve a record or build an index
 offset = 0 'group-to-record offset counter
 if opinit mod 2 then 'first retrieval record initialization
 opinit = opinit - 1 'adjust initialization flag
 if opinit = -2 then 'first operation after original sort
 rcdtot = rcdptr - 1 'total records from original sort
 subtot = recptr - 1 'partial-group subtotal from original sort
 end if
 outptr = 1 'beginning pointer for data output
 outtot = rcdtot 'total records to output
 if ndxgrp then 'sorting was done in groups
 gosub writeproc 'save data left over from previous operation
 else 'all sorting was done in memory
 maxrcd = rcdtot 'reset maximum records for file write
 ndxlen = maxrcd * 2 'length of index data to write
 gosub writeproc 'save sort data
 ndxgrp = 0 'reset index group count to zero
 end if
 sbuf$ = "" 'erase buffer to reclaim memory
 sndx$ = "" 'erase buffer to reclaim memory
 if ndxgrp then 'merge-sort required
 grplen = ndxlen 'group size * 2

 sbuf$ = space$(ndxgrp * sortln) 'buffer holds 1 record per group
 sndx$ = space$(ndxgrp * 2 + 2) 'buffer holds 1 record per group
 end if
 if opcode = -3 then 'build index from sorted data
 call memfree(clng(6144), clng(32640), xfree) 'reserve 4 kb + 2 kb for .ddx
 else 'normal retrieval [return each record to calling program]
 call memfree(clng(4096), clng(32640), xfree) 'reserve normal 4 kb
 end if
 xsize = clng(outtot) * 2 'total records * 2
 memndx = (xsize <= 32640 and xsize <= xfree) 'index-in-memory flag
 if memndx then 'retrieval index fits entirely in memory
 ndxlen = xsize 'buffer length is index file length
 else 'retrieval index does not fit in memory
 ndxlen = 2 'buffer length is 16-bit integer length
 end if
 nrcd$ = space$(ndxlen) 'allocate index file buffer
 if memndx then call fileio(ndunit, ndxlen, clng(1), nrcd$, 0)'fill it
 if ndxgrp then 'merge-sort initialization
 ixx1 = (sortsq > 0) 'used locally to shorten line
 ixx2 = (sortsq < 0) 'used locally to shorten line
 ixx3 = (memndx and ixx1) 'used locally to shorten line
 ixx4 = (memndx and ixx2) 'used locally to shorten line
 iyy1 = 1 - memndx 'used locally to shorten line
 iyy2 = grplen \ (1 - not memndx) 'used locally to shorten line
 for recptr = 1 to ndxgrp 'loop thru each index group
 grpptr = recptr 'sort group record pointer
 iyy3 = (grptot - subtot) * (ixx2 and (recptr = ndxgrp))
 iyy4 = (grptot - subtot) * (ixx1 and (recptr = ndxgrp))
 ircd = (recptr + ixx1) * iyy2 + iyy3 * iyy1 + ixx4 - ixx1
 ircx = (recptr + ixx2) * iyy2 + iyy4 * iyy1 + ixx3 - ixx2
 if memndx then 'get index pointer from memory buffer
 ichr = midchar(nrcd$, ircd + 1) * 256 'high byte of index
 rcdptr = midchar(nrcd$, ircd) + ichr 'same as cvi(mid$(...
 else 'get index pointer from file
 call fileio(ndunit, ndxlen, clng(ircd), nrcd$, 0)
 rcdptr = cvi(nrcd$) 'set pointer to retrieve data
 end if
 call fileio(sdunit, sortln, clng(rcdptr), sdat$, 0) 'get data
 gosub fillproc 'add 1 record from each sort group to buffer
 iptx(recptr, 0) = ircd 'begin ptr.to load ndx.rcd. from group
 iptx(recptr, 1) = ircx 'end ptr.to load ndx.rcd. from group
 next
 recptr = ndxgrp 'reset groups-pointer to begin output
 if sortsq < 0 then outptr = recptr 'begin output in reverse order
 else 'non-merge; all output from memory
 if sortsq < 0 then outptr = outtot 'begin output in reverse order
 end if
 end if
 if opcode = -3 then 'build index from sorted data
 call killfile("sortdata.ddx", ddunit) 'kill user index file
 open "sortdata.ddx" for binary as #ddunit 'open user index file
 ddxrcd$ = space$(2048) 'collection buffer for index-build
 filptr = 0 'record pointer for writing .ddx buffer to file
 ddxptr = 1 'buffer pointer for adding index values to ddxrcd$
 gosub getproc 'get first index record
 while not closed 'retrieve index pointers and save to .ddx file
 mid$(ddxrcd$, ddxptr) = mki$(rcdptr) 'copy index to .ddx buffer
 ddxptr = ddxptr + 2 'increment buffer pointer
 if ddxptr > 2048 then 'write a group of data to file

 filptr = filptr + 1 'increment file pointer
 call fileio(ddunit, 2048, clng(filptr), ddxrcd$, -1) 'put data
 ddxptr = 1 'reset buffer pointer to beginning of buffer
 end if
 gosub getproc 'get next index records
 wend
 if ddxptr > 1 then 'save leftover index pointers
 call fileio(ddunit, 2048, clng(filptr + 1), ddxrcd$, -1) 'put data
 end if
 close #ddunit 'close the .ddx file
 ddxrcd$ = "" 'reclaim memory from .ddx buffer
 else 'retrieve a single sort record and return to calling program
 gosub getproc 'get a record pointer
 if not closed then 'retrieval OK as long as more records available
 call fileio(sdunit, sortln, clng(rcdptr), sdat$, 0) 'retrieve data
 end if
 end if
 if closed then 'retrieval/index completed
 if opcode = -2 then 'final (single-record retrieval) sequence
 call killfile("sortdata.ndx", ndunit) 'kill sort index workfile
 call killfile("sortdata.sdx", sdunit) 'kill sort data file
 sdat$ = "" 'kill sort data buffer
 end if
 nrcd$ = "" 'kill index file buffer
 sbuf$ = "" 'kill main sort group buffer
 sndx$ = "" 'kill sort index buffer
 end if
 end if
 exit sub 'return to calling program
 '--------------------------------------------------------------------------
 fillproc: 'put sort data into sbuf$, sndx$
 '--------------------------------------------------------------------------
 if opcode = 0 then lset drcd$ = sdat$ 'load all segments at once
 iptr = 1 'initialize work buffer pointer
 for segno = 1 to nosegs 'load segments into work buffer and/or do invert
 if midchar(smsk$, ibeg(segno)) = 66 then 'invert 16-bit integer strings
 ichr = midchar(sdat$, ibeg(segno)) 'save first byte, then swap
 mid$(drcd$, iptr) = char$(midchar(sdat$, ibeg(segno) + 1)) '2nd byte
 mid$(drcd$, iptr + 1) = char$(ichr) 'put 1st byte in 2nd position
 else 'non-integer (character) sort segment
 if opcode then 'segments not in original (contiguous) sequence
 mid$(drcd$, iptr) = mid$(sdat$, ibeg(segno), ilen(segno))
 end if 'insert each sort segment into temp. buffer [above]
 end if
 if iseq(segno) = nvflag then 'invert data for ascend/descend sequence
 for ichr = iptr to iptr + ilen(segno) - 1 'do each byte in segment
 mid$(drcd$, ichr) = char$(255 - midchar(drcd$, ichr))
 next 'data will be re-inverted before writing to file
 end if
 iptr = iptr + ilen(segno) 'increment work buffer segment pointer
 next 'begin binary search for sort compare [below]
 topptr = recptr 'set top end of binary search
 lowptr = 0 'set low end of binary search
 while topptr - lowptr > 1 'search work data buffer using work index buffer
 midptr = lowptr + (topptr - lowptr) \ 2 'set mid point for compare
 ichx = midptr * 2 'mid-position incorporating 16-bit index width
 ichr = midchar(sndx$, ichx) * 256 'same as cvi(mid$(.....))
 iptr = (midchar(sndx$, ichx - 1) + ichr - offset - 1) * sortln 'mid-
 if left$(drcd$, compln) <= mid$(sbuf$, iptr + 1, compln) then '-buff.pos

 topptr = midptr 'move search lower
 else 'sort record value > compare value in sort memory buffer
 lowptr = midptr 'move search higher
 end if
 wend
 iptr = topptr * 2 - 1 'current index-"stack" insert position
 mid$(sbuf$, (grpptr - 1) * sortln + 1) = drcd$ 'write sort data to buffer
 mid$(sndx$, iptr + 2) = mid$(sndx$, iptr, (recptr - topptr) * 2) 'shift ndx
 mid$(sndx$, iptr) = mki$(grpptr + offset) 'write current pointer to index
 return 'return to calling routine
 '--------------------------------------------------------------------------
 getproc: 'retrieve a record from the sort
 '--------------------------------------------------------------------------
 if ndxgrp then 'merge-retrieval from sort groups
 if recptr then 'sort records are still available
 ichr = outptr * 2 'mid-position based on 16-bit index width
 grpptr = midchar(sndx$, ichr - 1) + midchar(sndx$, ichr) * 256
 if memndx then 'get group pointer from work index [above]
 ichr = midchar(nrcd$, iptx(grpptr, 0) + 1) * 256 'get record ptr
 rcdptr = midchar(nrcd$, iptx(grpptr, 0)) + ichr 'from memory-index
 else 'get record pointer from index file
 call fileio(ndunit, ndxlen, clng(iptx(grpptr, 0)), nrcd$, 0)
 rcdptr = cvi(nrcd$) 'nrcd$ is a 16-bit integer record
 end if
 if sortsq > 0 then mid$(sndx$, 1) = mid$(sndx$, 3) 'shift work index
 if iptx(grpptr, 0) = iptx(grpptr, 1) then 'end of group reached
 recptr = recptr - 1 'decrement group stack pointer
 if sortsq < 0 then outptr = recptr 'set output pointer if appl.
 else 'end of group not yet reached
 iptx(grpptr, 0) = iptx(grpptr, 0) + (1 - memndx) * sortsq'move ptr
 if memndx then 'get a data record using a pointer from memory
 ichr = midchar(nrcd$, iptx(grpptr, 0)) 'get the record pointer
 ichx = midchar(nrcd$, iptx(grpptr, 0) + 1) * 256 '..from memory
 call fileio(sdunit, sortln, clng(ichr + ichx), sdat$, 0)
 else 'get a data record using a pointer from the index file
 call fileio(ndunit, ndxlen, clng(iptx(grpptr, 0)), nrcd$, 0)
 call fileio(sdunit, sortln, clng(cvi(nrcd$)), sdat$, 0)
 end if
 gosub fillproc 'add the data record to the merge-sort
 end if
 closed = 0 'retrieval process not closed
 else 'no more records available
 closed = not 0 'retrieval process closed
 end if
 else 'non-merge sort retrieval; all data is in memory
 if outtot then 'sort records are still available
 ichr = outptr * 2 'mid-position based on 16-bit index width
 rcdptr = midchar(nrcd$, ichr - 1) + midchar(nrcd$, ichr) * 256
 outptr = outptr + sortsq 'increment or decrement index pointer
 outtot = outtot - 1 'decrement remaining records
 closed = 0 'retrieval process not closed
 else 'no more records available
 closed = not 0 'retrieval process closed
 end if
 end if
 return 'return to calling routine
 '--------------------------------------------------------------------------
 putproc: 'add a record to the sort
 '--------------------------------------------------------------------------

 if recptr > maxrcd then 'too many records to fit in memory
 if ndxgrp = 0 then 'first group; initialize index group variables
 grptot = recptr - 1 'number of records per group
 ndxlen = grptot * 2 'size of index file buffer
 end if
 gosub writeproc 'save data group and index group
 offset = rcdptr - 1 'group-to-record offset counter
 recptr = 1 'reset group record counter
 grpptr = 1 'sort buffer pointer
 end if
 gosub fillproc 'add current record to sort
 rcdptr = rcdptr + 1 'increment total records counter
 recptr = recptr + 1 'increment group record counter
 grpptr = recptr 'sort buffer pointer
 return 'return to calling routine
 '--------------------------------------------------------------------------
 writeproc: 'write index and sort data to files
 '--------------------------------------------------------------------------
 ndxgrp = ndxgrp + 1 'increment the index group number
 call fileio(ndunit, ndxlen, clng(ndxgrp), left$(sndx$, ndxlen), -1)
 if opinit > -3 then 'initial sequences; save sort data to .sdx file
 for iptr = 0 to (maxrcd - 1) * sortln step sortln 'loop thru mem.buffer
 for segno = 1 to nosegs 're-invert data as appropriate
 iptz = iptr + ibeg(segno) 'sort group memory buffer pointer
 if midchar(smsk$, ibeg(segno)) = 66 then 'invert integer string
 ichr = midchar(sbuf$, iptz) 'save first byte, then swap
 mid$(sbuf$, iptz) = char$(midchar(sbuf$, iptz + 1)) '2nd byte
 mid$(sbuf$, iptz + 1) = char$(ichr) 'put 1st byte in 2nd pos.
 end if
 if iseq(segno) = nvflag then 'invert data for ascend/descend seq
 for ichr = iptz to iptz + ilen(segno) - 1 'invert each byte
 mid$(sbuf$, ichr) = char$(255 - midchar(sbuf$, ichr))
 next
 end if
 next
 next
 sdxlen = maxrcd * sortln 'size of group memory buffer
 xflptr = lof(sdunit) \ sdxlen + 1 'current data "record"
 call fileio(sdunit, sdxlen, xflptr, sbuf$, -1) 'put data group to file
 end if
 return
end sub 'return to calling program

sub fileio (fcno, flen, xrec, fbuf$, fopr) static 'read/write file data
 'int fcno 'file unit/channel no.
 'int flen '"record" length used for positioning only
 'int fopr '0 = read; non-0 = write
 'long xrec 'logical "record" number
 'char fbuf$ 'read/write data buffer
 xpos = (xrec - 1) * flen + 1 'absolute byte position in file
 if fopr then 'operation = write
 put #fcno, xpos, fbuf$ 'write data to file
 else 'operation = read
 get #fcno, xpos, fbuf$ 'read data from file
 end if
end sub 'return to calling program

sub killfile (ffil$, fcno) static 'kill a DOS file
 'int fcno 'file unit/channel no.

 'char ffil$ 'file name
 close #fcno 'close file if open
 open ffil$ for binary as #fcno 'open file in binary mode
 close #fcno 'close the file
 kill ffil$ 'kill the file
end sub 'return to calling program

sub memfree (xexc, xmax, xfree) static 'get max. free memory less exclusion
 'long xexc 'amount of memory to reserve/exclude
 'long xmax 'upper limit for xfree (or zero)
 xfree = fre("") - xexc 'total free memory less exclusion
 if xmax > 0 and xfree > xmax then xfree = xmax 'set maximum if applicable
end sub 'return to calling program
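
A note on the key encoding used by fillproc and writeproc above: each 16-bit integer segment has its two bytes swapped so that the high byte comes first, and every byte of a segment whose sequence matches nvflag is replaced by 255 minus its value. A single string comparison can then order mixed ascending and descending segments. The C sketch below illustrates the idea only. It is not part of the listing, the function names are hypothetical, and it uses unsigned values, leaving aside how negative integers would order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 16-bit unsigned value high byte first, so
   that a plain byte-wise comparison orders the keys numerically. */
static void encode_u16_key(uint16_t value, int invert, unsigned char out[2])
{
    out[0] = (unsigned char)(value >> 8);    /* high byte leads the key */
    out[1] = (unsigned char)(value & 0xFF);  /* low byte follows */
    if (invert) {                            /* complementing reverses the order */
        out[0] = (unsigned char)(255 - out[0]);
        out[1] = (unsigned char)(255 - out[1]);
    }
}

/* Compare two encoded keys byte by byte, as a string comparison would. */
static int compare_keys(const unsigned char *a, const unsigned char *b,
                        size_t len)
{
    return memcmp(a, b, len);
}
```

Encoding 255 and 256 without inversion gives keys that compare in numeric order; with inversion the same two keys compare in the opposite order, which is how a descending segment can share one comparison with ascending ones.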





June, 1991
WHAT'S NEW WITH MODULA-2?


1991 could be a make-or-break year for Modula-2




K.N. King


K.N. King is an associate professor of mathematics and computer science at
Georgia State University. He is the author of Modula-2: A Complete Guide,
published by D.C. Heath, and a former columnist for the Journal of Pascal, Ada
& Modula-2. Use either king@prism.gatech.edu on Internet or knking on BIX to
reach him.


Modula-2, Niklaus Wirth's successor to Pascal, is no longer the new kid on the
block. It's a mature language with a growing international following, although
it has yet to become a major force on the scale of Pascal or C.
Of course, Pascal and C didn't experience great popularity when they first
appeared, either. But 1991 could be the year that Modula-2 breaks into the
ranks of major languages. In January, PC Week proclaimed that Modula-2 "is
just more than eight years old -- less than half the age of C -- and is ready
to enjoy the same surge of interest that C received during its own ninth year,
in 1981."
What makes 1991 such an important year? For starters, the international
standard is expected to be completed -- although not formally approved -- this
year. New versions of the three major DOS compilers are due. In September, the
Second International Modula-2 Conference is expected to draw hordes of
enthusiasts to England.
1991 also presents serious challenges for Modula-2. In particular, Modula-2
faces stiff competition from languages such as Ada and C++. Modula-2 is even
in danger of being upstaged by its own offspring, Oberon and Modula-3.
In this article, I'll assess how Modula-2 is doing, describe the status of
Modula-2 standardization efforts, discuss the latest Modula-2 compilers,
and tell you how to find out more about Modula-2, Oberon, and Modula-3.


Whither Modula-2?


So how is Modula-2 really faring? Hard data is difficult to obtain, but Rich
Gogesch of Stony Brook Software estimates that there are roughly 150,000
Modula-2 users worldwide. That's a respectable number, but nowhere near the
number of Pascal or C users. In the U.S., Modula-2 use is reportedly static:
Current users are pleased but the language isn't attracting many new converts.
In other countries, the story is different. According to Steve Collins of Real
Time Associates, a British Modula-2 vendor, Modula-2 is "beating the pants
off" C in the European embedded systems arena. What's more, he notes that
almost all U.K. university graduates currently learn Modula-2 as their primary
language; the one university that doesn't teach Modula-2 is expected to adopt
it this fall. American vendors have noticed the strength of Modula-2 in
Europe. Gogesch, for example, estimates that two-thirds of Stony Brook's sales
are made there.
Why isn't Modula-2 as popular in the U.S. as it is elsewhere? There was once a
lack of high-quality compilers, but today's Modula-2 compilers compete with
the best C compilers for compilation speed and code quality. The real reasons
are more subtle. First, Modula-2 has failed to distinguish itself from other
languages, notably Turbo Pascal. Modula-2 has never achieved "critical mass"
in the way C and Pascal have; users complain about the shortage of third-party
Modula-2 libraries. The lack of an official standard for the language--and
especially the libraries--has not helped. Also, Modula-2 doesn't get the
publicity that newer languages do; Gogesch feels that the hoopla surrounding
C++ in particular "has hurt Modula-2 quite a bit."
Plenty of programmers agree with Gogesch that, "in terms of large program
development, there's nothing as good as Modula-2." In the U.S., Modula-2 is
used by a surprising number of companies; however, small users don't get much
attention and large firms often shun publicity. A classic example is the
defense contractor whose 50 programmers write everything in Modula-2, then
use a translator to convert their code to Ada before delivery.


Standardization


An international standard for Modula-2 has been making steady progress since
1987, when working group ISO/IEC JTC1/SC22/WG13 first began to meet. A draft
proposed standard was issued in late 1989 and a revision has been in
preparation since mid-1990; it is expected to be ready by the time WG13 meets
in Germany this summer.
For various reasons, work on the standard has slowed recently. WG13 continues
to make major language changes even at this late stage; working out the
implications of these changes takes time. The standard will use VDM-SL (the
Vienna Development Method Specification Language) to specify the semantics of
Modula-2. Using VDM-SL adds rigor to the language definition but increases the
amount of time required to draft the standard. And unfortunately, members of
WG13 still disagree concerning such basic issues as the philosophy behind the
I/O library.
The draft proposed standard makes a number of changes to Modula-2. Because of
limited space, I can't describe them all, but I'll try to hit a few of the
high points. For more details, order a copy of the draft standard or read my
Modula-2 column in back issues of the Journal of Pascal, Ada & Modula-2.
Structured Value Constructors. The draft standard adds "structured value
constructors," which construct array and record values from their components
in a manner similar to Ada aggregates. For example, if T is an array type (say
T = ARRAY [0..3] OF INTEGER), the expression T {1, 2, 3, 4} constructs an
array of type T containing 1, 2, 3, and 4, in that order. Records are
constructed in the same way.
Complex Numbers. One of the surprises of last year's WG13 meeting was the
decision to provide support for complex numbers. The current plan is to add
COMPLEX and LONGCOMPLEX types to the language and ComplexMath and
LongComplexMath modules to the library.
Strings. Modula-2's string handling deficiencies are well known. At one point,
WG13 considered putting an "honest" string type in the language--a type whose
representation would be hidden from the programmer. The working group
acknowledged the value of such a type, but felt it to be too great a language
change. Instead, WG13 added both a standard function named LENGTH that
computes the length of a string and a concatenation symbol for string literals
and constants. The group is also attempting to define a standard Strings
library module.
Input/Output. Improving the I/O library has been a top priority of WG13 from
the beginning. Unfortunately, the members don't always agree on how the
library should be improved. Some favor an industrial-strength library; others
prefer a simple, streamlined library suitable for writing example programs and
textbooks. The library in the first draft proposed standard leans in the
former direction; it contains a hefty 23 modules and 201 procedures. After
much discussion at last year's meeting, WG13 agreed to simplify the library.
The SYSTEM Module. The SYSTEM module, Modula-2's source of machine-dependent
features, has undergone a great deal of change. To give just one example, the
type LOC has been added in an attempt to make porting programs between
byte-addressable and word-addressable machines easier. LOC represents the
smallest addressable unit of memory; on some machines, a LOC value will be a
byte; on others, a word.
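C programmers will recognize the idea: in C, sizeof counts storage in units of char, and CHAR_BIT reports how many bits that unit holds on a given machine. The sketch below is a C analogy rather than anything taken from the draft standard; the type name loc and both functions are hypothetical. It copies an object one storage location at a time without assuming that a location is an 8-bit byte.

```c
#include <limits.h>
#include <stddef.h>

typedef unsigned char loc;  /* hypothetical name echoing Modula-2's LOC */

/* Copy any object one storage location at a time. */
static void copy_locs(void *dst, const void *src, size_t count)
{
    loc *d = dst;
    const loc *s = src;
    while (count-- > 0) {
        *d++ = *s++;
    }
}

/* Number of bits occupied by n storage locations on this machine. */
static size_t bits_in_locs(size_t n)
{
    return n * CHAR_BIT;
}
```

A call such as copy_locs(&b, &a, sizeof a) moves a value of any type, because the count is expressed in locations rather than in bytes or words.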
Exception Handling. WG13 has long favored adding exception handling to
Modula-2, but members disagreed over which exception-handling model to use. In
fact, when the first draft standard was issued, they were still considering
two different proposals; both are described in an appendix. At last year's
meeting, WG13 agreed to adopt a mechanism somewhat similar to C's
setjmp/longjmp. The latest proposal, which adds an EXCEPTIONS module to the
library, actually provides a more powerful mechanism than setjmp/longjmp. In
particular, the RETRY procedure allows a program to redo an operation that
failed because an exception was raised.
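For readers who haven't used it, here is roughly what the setjmp/longjmp mechanism looks like in C, with a retry written out by hand. This is a sketch of the C facility only; it does not show the proposed EXCEPTIONS module or how RETRY is actually defined, and the operation and its failure rule are made up for illustration.

```c
#include <setjmp.h>

static jmp_buf handler;  /* where control resumes when an "exception" is raised */

/* Hypothetical operation: fails (raises) until it has been tried 3 times. */
static int flaky_operation(int attempt)
{
    if (attempt < 3) {
        longjmp(handler, 1);  /* raise: jump back to the setjmp call */
    }
    return attempt * 10;
}

/* Run the operation, trying again after each raised exception. */
static int run_with_retry(void)
{
    volatile int attempt = 0;  /* volatile: changed between setjmp and longjmp */

    if (setjmp(handler) != 0) {
        /* arrived here through longjmp; fall through and try again */
    }
    attempt++;
    return flaky_operation(attempt);
}
```

In C the programmer has to arrange the retry and keep track of state across the jump; a library procedure that redoes the failed operation removes that bookkeeping.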
Copies of the first draft proposed standard are available from the IEEE
Computer Society (see "Modula-2 Resource Guide") for $35; ask for
"ISO/JTC1/SC22/WG13 draft of DP10154 - P1151 Modula-2." (Members of MODUS, the
Modula-2 User's Association, can obtain the draft at a reduced rate.) The next
draft should be available later this year.


Compilers


One sign of Modula-2's maturity is the number of high-quality implementations
available. This is especially true in the DOS world, where TopSpeed Modula-2,
Stony Brook Modula-2, and Logitech Modula-2 lead the market. According to a
recent review in PC Week, "these three products rival offerings for
better-known languages such as C and Pascal, from better-known vendors such as
Borland and Microsoft."
Jensen & Partners International recently released Version 3.0 of TopSpeed
Modula-2. Version 2.0 was the first to include support for object-oriented
programming; Version 3.0 adds multiple inheritance and support for data
hiding. Another feature of Version 3.0 is an automatic overlay system for both
code and data, making it possible to write DOS programs as large as 16 Mbytes.
JPI now sells compilers for C, C++, and Pascal as well. All TopSpeed
compilers plug into a common environment and share a common code generator,
making multilanguage programming easy. Because of the environment's language
independence, it is now sold separately from the Modula-2 compiler. The VID
debugger is included with the environment.
TopSpeed Modula-2 is available in both DOS and OS/2 versions. The optional
Professional Techkit supports Windows 3.0 development.
Version 2.2 of Stony Brook's Professional Modula-2 compiler came out last
October; it features full support for Windows 3.0 and OS/2 1.2. Version 3.0
should appear late this year or early next year. Current plans call for 3.0 to
support object-oriented extensions compatible with those found in Turbo
Pascal. Version 3.0 promises additional optimizations as well.
Logitech Modula-2 has been taken over by MultiScope Inc., a Logitech-owned
enterprise best known for its debugger. Version 4.0 of Logitech/MultiScope
Modula-2, which should be out by the time you read this, is actually the Stony
Brook system, modified to retain compatibility with Logitech Modula-2 3.0.
MultiScope is already at work on Version 5.0 of the compiler, which is being
developed in-house.
In the Macintosh arena, the leading compiler is Metrowerks Modula-2, which
comes in two versions: the Professional Standalone Edition and the MPW
version. A student package, the StartPak, sells in college bookstores for $39.
Modula-2 compilers are also available for a huge number of other platforms.
The best source for compilers (and Modula-2 books as well) is the catalog
published by Real Time Associates. The catalog is accompanied by a newsletter
containing the latest Modula-2 news, gossip, and more; the current issue even
includes a Modula-2 crossword puzzle!



Conferences


The First International Modula-2 Conference was held in 1989 at Bled,
Yugoslavia. The conference drew 120 participants from 14 countries. Niklaus
Wirth not only gave a keynote speech but also attended most sessions and
participated vigorously in discussions.
The Second International Conference, to be held September 11-13, 1991 at the
Loughborough University of Technology in England, should draw an even larger
crowd. The conference will be preceded by a one-day workshop that gives an
overview of Modula-2 and describes its advantages over older languages. Wirth
is again expected to attend.


User Groups and Publications


The primary Modula-2 user group is MODUS (the Modula-2 User's Association),
which publishes the MODUS Quarterly. Another way to meet Modula-2 users is to
join USUS, the UCSD Pascal System User's Society, which has expanded to serve
"the Pascal, Modula-2, and portable programming community."
With Modula-2's popularity in Britain, it's no surprise that the British
Computer Society has its own Modula-2 Specialist Group, which is open to all
interested parties. Because the ISO standard is being drafted in England,
joining this group is a good way to keep track of the standardization effort.
A membership application form is included with the Real Time Associates
catalog.
Several online services and networks feature coverage of Modula-2. BIX (the
Byte Information Exchange) is an excellent place to meet other Modula-2 fans
in the modula.2 conference. Many Modula-2 compiler vendors have conferences on
BIX. On CompuServe, check out the CodePort (formerly MUSUS) forum operated by
USUS. Although CodePort is nominally devoted to portable programming languages
in general, discussions of Modula-2 often dominate. On Usenet, Modula-2 is
discussed in the comp.lang.modula2 newsgroup. Incidentally, comp.lang.modula2
can be reached from BIX and FidoNet.
With the demise of the Journal of Pascal, Ada & Modula-2 last year, there are
no major magazines focused on Modula-2. However, a relatively new publication
named Modules & Definitions is devoted exclusively to Modula-2. It is a
shareware magazine that can be downloaded from CompuServe and BIX. Readers are
expected to pay $9.95 annually; a paper subscription is available for $19.95.


Oberon


Any report on the status of Modula-2 would be incomplete without a discussion
of Oberon and Modula-3, which have lately begun to receive as much attention
as Modula-2 itself. (In fact, the theme of the Second International Modula-2
Conference is "Modula-2 and Beyond," with an emphasis on Oberon and Modula-3.)
Oberon, another Wirth-designed language, simultaneously simplifies Modula-2
(by dropping variant records, opaque types, enumeration types, subrange types,
and the FOR statement, among other features) and adds extensions for
object-oriented programming. In typical Wirth fashion, the added features
(principally "type extension") are simpler than the OOP extensions of Turbo
Pascal and similar languages.
Incidentally, the name "Oberon" refers not only to the language, but also to
the operating system developed for Wirth's own Ceres workstations. The Oberon
operating system, like the Oberon language, is designed to offer
state-of-the-art functionality without complexity.
Getting information on Oberon used to require a trip to the local college
library to obtain journal articles. The first book on Oberon recently
appeared, however. The Oberon System: User Guide and Programmer's Manual was
written by Martin Reiser, who worked with Wirth on the Oberon system. Reiser's
book clears up at least one mystery surrounding Oberon -- the origin of the
language's name. According to Reiser, "the project was whimsically christened
Oberon by Wirth who was fascinated by the accuracy and reliability of the
space probe Voyager which passed the moon Oberon of planet Uranus at the time
of conception of the new project."
Reiser's book describes the Oberon system -- user interface, editor, compiler,
file system, and so forth -- from both the user's and the application
programmer's viewpoints. Reiser doesn't cover the Oberon language, however.
That task is left to a forthcoming book (coauthored by Wirth) titled
Programming in Oberon: Steps beyond Pascal and Modula. Unfortunately, this
book won't be available until mid-1992. For a discussion of design issues
faced by the Oberon team, we'll have to wait for The Oberon Project, a 1993
book by Wirth and Jurg Gutknecht, his collaborator on the Oberon project. Both
books will be published by Addison-Wesley in conjunction with the ACM Press.
Public-domain versions of the Oberon system are available for the Macintosh
and for Sun Sparcstations via anonymous FTP from neptune.ethz.ch
(129.132.101.33). For more information about Oberon, see "Oberon" by Dick
Pountain in Byte, March 1991.
Incidentally, Wirth isn't resting on his laurels; rumor has it that he's
already at work on Oberon-2.


Modula-3


Modula-3 was designed jointly by groups at DEC's Systems Research Center and
the (now defunct) Olivetti Research Center in an attempt to create a language
safer, yet more powerful than Modula-2. Modula-3 has a different type system
than Modula-2 and provides improved safety from runtime errors. It also adds
several new features, including garbage collection, exception handling,
threads (lightweight processes), and support for object-oriented programming.
The design of Modula-3 began in 1986; the first description of the language
was published in 1988. Since then, the language has continued to evolve. After
one final set of changes last winter (during which a generic facility was
added), the language was finalized.
The first book on Modula-3 is Systems Programming in Modula-3, a collection of
papers -- including the official language report -- written by the people who
designed and implemented the language. A second book, Programming in Modula-3,
is due out later this year. The author is Sam Harbison, who wrote about
Modula-3 in the November 1990 issue of Byte (and who, oddly enough, is
coauthor of the best-selling C: A Reference Manual). Programming in Modula-3
will be the first complete tutorial and reference for the language. Both books
are published by Prentice-Hall.
If you're interested in Modula-3, write to DEC's Systems Research Center and
ask for Research Reports 52 and 53. Contact Harbison's company, Pine Creek
Software, to receive a free Modula-3 newsletter. If you have access to Usenet,
check out the comp.lang.modula3 newsgroup.
A Modula-3 compiler for Unix is available by anonymous FTP from
gatekeeper.dec.com (16.1.0.2); the directory is /pub/DEC/Modula-3.


Modula-2 Resource Guide




Oberon



The Oberon System: User Guide and
Programmer's Manual
Martin Reiser
Wokingham, England:
Addison-Wesley, 1991; $37.75
ISBN 0-201-54422-9



Draft for Modula-2 Standard




IEEE Computer Society
Standards Office
1730 Massachusetts Avenue NW
Washington, DC 20036



Modula-2 User Groups and Publications



MODUS
P.O. Box 51778
Palo Alto, CA 94303-0721
Membership: $25 annually

USUS
P.O. Box 1148
La Jolla, CA 92038
Membership: $45 annually

Modules & Definitions
P.O. Box 7549
York, PA 17404
717-792-5108



Modula-2 Compilers



Jensen & Partners International
1101 San Antonio Road, Suite 301
Mountain View, CA 94043
415-967-3200

Metrowerks Inc.
The Trimex Building, Route 11
Mooers, NY 12958
514-458-2018

MultiScope Inc.
1235 Pear Ave.
Mountain View, CA 94043
415-968-4892

Real Time Associates Ltd.
Canning House, 59 Canning Road
Croydon, Surrey CR0 6QF
United Kingdom
44(0)81-656-7333 (voice)
44(0)81-655-0410 (fax)

Stony Brook Software
187 E. Wilbur Road, Suite 9
Thousand Oaks, CA 91360
800-624-7487



Modula-2 Conferences



Modula-2 Conference
Centre for Extension Studies
Loughborough University of
Technology
Loughborough, Leicestershire
LE11 3TU
United Kingdom
44(0)509-222174 (voice)
44(0)509-610813 (fax)



Modula-3



Digital Equipment Corporation
Systems Research Center
130 Lytton Avenue
Palo Alto, CA 94301

Pine Creek Software
305 South Craig Street, Suite 300
Pittsburgh, PA 15213
412-681-9811 (voice and fax)

Systems Programming in Modula-3
Greg Nelson (editor)
Englewood Cliffs, N.J.: Prentice Hall,
1991; $25.00
ISBN 0-13-590464-1






















June, 1991
PORTING UNIX TO THE 386: RESEARCH & THE COMMERCIAL SECTOR


Where does BSD fit in?




William Frederick Jolitz and Lynne Greer Jolitz


Bill was the principal developer of 2.8 and 2.9BSD and was the chief architect
of National Semiconductor's GENIX project, which was the first virtual memory
microprocessor-based Unix system. Prior to establishing TeleMuse, a market
research firm, Lynne was vice president of marketing at Symmetric Computer
Systems. They conduct seminars on BSD, ISDN, and TCP/IP. Send e-mail questions
or comments to lynne@berkeley.edu. Copyright (c)1991 TeleMuse.


"The time has come," the Walrus said, "To talk of many things: Of shoes--and
ships--and sealing wax-- Of cabbages--and kings--And why the sea is boiling
hot-- And whether pigs have wings." --Lewis Carroll
At this point in our article series, the basic toolset for 386BSD development
is in place, and we're ready to begin the job of porting the kernel program.
(Or to use the mountain-climbing analogy we've followed until now, we've
completed the preliminaries and are ready to begin scaling the peak.) With
this in mind, it's a good time to pause and consider where we've been and
where we are going.
We've discovered over the course of this series that there is considerable
confusion and debate among researchers, programmers, businesses, and other
interests over the nature and role played in the computer industry by Berkeley
UNIX in general and by 386BSD in particular. This is not surprising, given the
direction of operating systems in the commercial sector such as AT&T's System
V, Release 4, Apple's System 7, IBM/Microsoft OS/2, and others. As such, it
has become crucial to differentiate these two sectors, examine the differing
motivations and goals, and discuss some of the trends that will eventually tie
these two worlds together.
The most important thing to remember about Berkeley UNIX is that it is and
will remain a "research" project. This means it is not designed with the needs
of the commercial sector in mind -- the University of California is not a
development shop such as the SCOs of the world. BSD provides, in essence, the
opportunity for operating systems, applications, networks, and other areas to
evolve beyond the current requirements of the commercial sector to produce the
technology required for next-stage efforts. This is a demand of research -- to
get on with new work and not simply stagnate.
Commercial operating system releases have a far different agenda, however.
While much of it is self-serving (such as ABI, which we think should actually
stand for "AT&T Binary Intolerance"), there is a method to the madness.
Commercial releases are tied to the past. In fact, the tie is so strong that
even when there is a critical need to offload some past burdens, a company
finds it politically impossible to do so. We are reminded of Fred Brooks's
classic work, The Mythical Man-Month (Addison-Wesley, 1975), and his
discussion of the infamous (but popular) IBM OS/360 operating system. This
operating system grew bigger and bigger and bigger in order to meet the
perceived demands of its customers. And as it grew bigger, the number of
bugs grew as well (though not at the same rate). As Brooks reflects on this
project, he pinpoints a key issue (pages 122-123):
Lehman and Belady have studied the history of successive releases in a large
operating system. They find that the total number of modules increases
linearly with release number, but that the number of modules affected
increases exponentially with release number. All repairs tend to destroy the
structure, to increase the entropy and disorder of the system. Less and less
effort is spent on fixing original design flaws; more and more is spent on
fixing flaws introduced by earlier fixes. As time passes, the system becomes
less and less well-ordered. Sooner or later the fixing ceases to gain any
ground. Each forward step is matched by a backward one. Although in principle
usable forever, the system has worn out as a base for progress. Furthermore,
machines change, configurations change, and user requirements change, so the
system is not in fact usable forever. A brand-new, from-the-ground-up redesign
is necessary.
In sum, it might have been simpler to abandon further work on this titanic
system (400K of assembler code, a princely sum at the time) and go on to new
operating systems.
Because of its research agenda, Berkeley UNIX is less concerned with issues
such as ABI. Applications interfaces are quite properly handled outside the
kernel, usually with a library. Eventually, antiquated or nonstandard
interfaces are brought up to speed with newer technology, and programmers use
the library less and less, until finally most delete it from their world.
Researchers cannot afford to work with bloated kernels, stuffed full of arcane
and inappropriate software. Research operating systems must be lean, mean
computing machines.
So, while everyone longs for the latest innovation, the BSD Maserati, not
everyone feels comfortable with the incredible power it provides -- not
everyone is a race car driver any more than everyone is an operating systems
programmer. BSD provides the mechanism for tremendous new opportunities, but
it doesn't have a lot of safety nets. You can just as easily crash and burn
with BSD, and have no one to blame but yourself.
Commercial systems vendors offer customers a nice, big, memory-guzzling
Oldsmobile of an operating system ("This is your father's operating system")
with all the same features everyone has seen since childhood. And when in
doubt, more is added in the kernel until everyone is satisfied. Because they
do not want to increase support overhead, vendors try to prevent "crash and
burn" occurrences by making it so safe that you are protected from yourself
(and, by the way, if you do circumvent the controls, you can't blame
them--they tried to save you). At least, this is the intent: Give the
customers what they want (within reason and while maintaining control) and try
to minimize support headaches.
We said in January that "because standards by accumulation just don't work, we
strive in 386BSD to avoid such nonsense." This was not an idle statement, but
a cornerstone of our specification. We are not the first to observe the
problems that arise when bloated kernels become a mainstay of either research
or commercial offerings (as Fred Brooks discussed in his book). In several
current commercial offerings, the complexity has become so great that the
kernels have become difficult to maintain and impossible to orient toward the
future. Customers lose what they so desperately desire in an expensive
commercial release, namely bug-free software and timely support. This
trade-off for flexible and innovative systems is beginning, like OS/360, to
sink under its own weight.
It's ironic that this would happen to UNIX, which was predicated on the
essentially "minimalist" work of Thompson and Ritchie. This problem is not
restricted to the commercial sector, however. In fact, one reason many are
less than enamored with MACH is that its "microkernel" is roughly comparable
in size to the 386BSD kernel -- and yet it requires much more memory to be
useful.
The final question now becomes, "What do I use now?" For the user dependent on
a proprietary database or accounting software, the answer is quite simple --
just continue using what you have been using. Unless there is a compelling
reason to invest in new technology, you just disrupt your business and your
workers for no good reason. Eventually, some aspects of Berkeley UNIX will be
integrated into commercial (and other research) releases, but don't anticipate
it soon.
For those who must look to the future, however, such as applications,
networking, and operating systems designers, Berkeley UNIX will continue to be
a source of the innovative new technology required for new products and new
functionality in a competitive world economy. Businesses and programmers
should keep themselves current with these research trends, for ready
incorporation in the commercial market. By the way, sometimes it's nice to
drive a Maserati.


The 386BSD Project and Berkeley UNIX


The 386BSD project was established in the summer of 1989 for the specific
purpose of porting the University of California's Berkeley Software
Distribution (BSD) to the Intel 80386 microprocessor platform.
Encompassing over 150 Mbytes of operating systems, networking and applications
software, BSD is a fully functional complete operating systems software
distribution. The goal of this project was to make this cutting-edge research
version of UNIX widely available to small research and commercial efforts on
an inexpensive PC platform. By providing the base 386BSD port to Berkeley, our
hope is to foster new interest in Berkeley UNIX technology and to speed its
acceptance and use worldwide. We hope to see those interested in this
technology build upon it in both commercial and noncommercial ventures.
In each of these articles we will examine the key aspects of software,
strategy, and experience that make up a project of this magnitude. We intend
to explore the process of the 386BSD port, while learning to effectively
exploit features of the 386 architecture for use with an advanced operating
system. We also intend to outline some of the trade-offs in implementation
goals, which must be periodically re-examined. Finally, we will highlight
extensions that remain for future work, perhaps to be done by some of you
reading this article today.
Currently, 386BSD runs on 386 PC platforms and supports the following:
Many different PC platforms, including the Compaq 386/20, Compaq Systempro
386, and 386 with the Chips and Technologies chipset, any 486 with the OPTI
chipset, Toshiba 3100SX, and more
ESDI, IDE and ST-506 drives
3-1/2-inch and 5-1/4-inch floppy drives
Cartridge tape drive
Novell NE2000 and Western Digital Ethernet controller boards
EGA, VGA, CGA, and MDA monitors
287/387 floating point including the Cyrix EMC
A single-floppy standalone UNIX system, containing support for modems,
ethernet, SLIP, and Kermit to facilitate downloading of 386BSD to any PC over
the INTERNET network.
--B.J. and L.J.














June, 1991
RECONCILING UNIX, ADA, & REAL-TIME PROCESSING


Standard versions don't always mesh




Bill O. Gallmeister


Bill is a software engineer at Lynx Real-Time Systems and vice chair of the
POSIX.4 committee. He can be reached at 16780 Lark Ave., Los Gatos, CA 95030,
or via e-mail at bog@lynx.com.


The federal government's recent push for computing standards is producing some
major headaches for developers of systems and applications. Three principal
requirements of the U.S. federal market are: Support for the Ada programming
language; support for real-time performance; and support for Unix. In the
past, government contracts have not frequently needed all three of these
standards requirements fulfilled in the same contract.
However, the recent contract for supplying the on-board systems for NASA's
Space Station Freedom called for the IEEE POSIX 1003.4 standard real-time
extension and specified that the operating system also run Ada programs
efficiently. If this indicates a trend, it means that software vendors will
find an increasing need to supply Ada, Unix, and real-time all at the same
time.
This spells headaches for developers. Why? Because of two inequalities. The
first, which is well known, is "Unix != Real-Time Performance." The second,
which is less well known, is "Unix != Ada."


Unix != Real-Time


Although it is a fine general-purpose time-sharing system, Unix was never
designed for real-time performance, and suffers from serious limitations in
this area. Unix can cause large delays to high-priority tasks that need to
respond to events external to the computer. This added delay is highly
variable and unpredictable, and varies from vendor to vendor, from process to
process, and even from one process invocation to the next.
Because Unix is designed to support a number of different users at the same
time, it tries to be fair to all of them. Process priorities in Unix drift in
an effort to equitably timeshare the processor(s), and to improve the
performance of I/O-bound processes. By contrast, process priorities in many
real-time applications must be immutable, and the highest priority processes
must always be the ones running on a given processor.
Unix does not allow direct control of critical shared resources. In
comparison, the real-time programmer (or user) needs to have substantial
control over the system, including absolute control over priorities and which
task can run on a processor at a given time. Most important, the system must
be able to respond to critical external events quickly and in a predictable
amount of time.


Unix != Ada


It is less widely known that typical Unix systems are capable of supporting
only a crippled Ada implementation. While Ada was designed to deal with both
real-time constraints and concurrency, Unix was not. Standard versions of Unix
lack certain features necessary for optimum Ada processing: a non-time-sharing
scheduler, asynchronous I/O, reliable event notification, and fast mechanisms
for interprocess communication.
Although there are Ada implementations that run on standard Unix, these are
crippled, clumsy, and inefficient.
For example, an area in which these implementations fall short is tasking.
Many versions of Ada implement tasking on top of the Unix process model. In
the actual application code, "tasks" are implemented by a user-level scheduler
switching between different program counters and stack pointers.
Although this has the advantage that switching tasks happens without going
into the kernel for a context switch, when one thread enters the kernel and
blocks -- for instance, when reading from a disk file -- all the tasks in the
Ada application are blocked. This is because the underlying Unix process that
runs the user-level scheduler is blocked waiting for the I/O. Standard Unix
has no concept of tasks, so on these systems, tasks will always be
second-class citizens.
Other implementations use a one-to-one mapping between Ada tasks and Unix
processes. The problem here is that Ada task priorities can't be respected
because the underlying Unix priorities used to implement them will drift,
according to the fairness calculation invoked every few system ticks by the
Unix scheduler.
Ada programs also require efficient methods for intertask synchronization.
Ideally, the operating system should provide semaphores that can effect a lock
in three or four instructions, without having to enter the kernel. In
contrast, the locking methods used in standard Unix (such as System V
semaphores, or using the fcntl( ) function) require kernel entry with each
operation. It seems that many of the incompatibilities between Unix and Ada
stem from the fundamental difference between their respective processing
models.


Ada Processing Model


Ada supports the concept of multiple threads of control in a particular
program, generally known as "Ada tasking." The two terms, "task" and "thread,"
are often used interchangeably, although not always correctly.
A straightforward way of implementing Ada tasks is to create one operating
system thread to run each task. However, greater efficiencies in task
switching can be achieved if multiple Ada tasks are supported by each thread,
because in such a case, we can schedule tasks without involving the kernel. In
subsequent discussion, I'll assume that Ada tasks are mapped one-to-one onto
operating system threads.
Each thread can be thought of as a unique set of machine registers -- a
different stack pointer, a different program counter. Threads within an Ada
program run concurrently. While one thread may be at a particular assignment
statement, another may be off updating the display, while a third is waiting
for a disk write to complete. These threads operate as separate subprocesses
within the application.
It is useful to think of multiple-threaded applications (Ada) as a natural
extension of single-threaded applications (Unix). However, there is one
significant difference. In a multithreaded system, there is no asynchrony
visible to the programmer. In single-threaded Unix, there is still the need to
do multiple things in parallel. This need has historically been supported
using multiple Unix processes and a couple of well-worn hacks. The most
well-known is Unix signals. Another is the use of the O_NONBLOCK flag on files
to avoid indefinite waits for I/O. These workarounds are discussed in the next
section.


The Unix Processing Model


A standard Unix process is a single "flow of control" -- a single line that
goes through the code as time progresses. This is the line you're thinking of
when you debug your single-threaded programs: "Okay, we're here at the
assignment statement, now we drop through into this case statement...." The
line is characterized by the contents of the machine registers at any point in
time -- especially the program counter.
One thing happens, followed by another. Other things, by and large, do not
happen in parallel with this single flow of control. Over the years, ad hoc
mechanisms for handling concurrency have evolved on Unix: the use of signals,
Sun-style asynchronous I/O, and nonblocking I/O. But none of these quite
handle the job.
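As an illustration of the nonblocking I/O workaround, the following sketch
sets O_NONBLOCK on a descriptor with fcntl( ) and then reads from it without
risking an indefinite wait. The helper names set_nonblocking and poll_read,
and the return-code convention, are assumptions made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put a descriptor into nonblocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Try to read without waiting. Returns the number of bytes read (0 at end
   of file), -1 if no data is available yet, or -2 on any other error. */
long poll_read(int fd, char *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL);
    n = read(fd, buf, len);
    if (n >= 0)
        return (long)n;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? -1 : -2;
}
```

The catch is visible in the -1 case: the program gets control back
immediately, but it must now remember to come back and try again later, which
is exactly the kind of bookkeeping that threads make unnecessary.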
Suppose an application wants to do something -- write a data log to a disk
file, for instance. In parallel, it wants to do something else -- say,
continue gathering data into a second log buffer. One way it can do that under
standard Unix is by creating a new process (via the fork() system call) to
perform the asynchronous operation. Separate processes do not share any memory
by default, but let's assume that we arrange things so that the old and the
new process share a few pages of virtual memory used to store the pages of the
data log to be written to disk.
A double-buffering scheme can be used to allow the original process to
continue gathering data in one buffer while another is asynchronously being
flushed to disk. The new process goes off with the data and writes to the
file; the original process continues on gathering data. However, the original
process needs to know when the asynchronous activity has completed, so that
the buffer written to the log can be reused for more data gathering.
How does the second process communicate its completion to the original
process? Historically, a signal sent via kill() is used. When a signal is
received by a process, it is interrupted as if by a hardware device interrupt,
and vectors to a signal handler routine that is analogous to an interrupt
handling routine. In that routine, the used buffer can be made available for
reuse.
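The fork-and-signal arrangement just described might look roughly like the
sketch below. It makes several assumptions that are not in the discussion
above: the shared pages come from an anonymous shared mmap( ) (a modern
convenience), the completion signal is SIGUSR1, and the names shared_log and
flush_in_child are hypothetical. Error handling is kept to a minimum.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define LOG_BYTES 4096

struct shared_log {                /* lives in pages shared by both processes */
    char data[LOG_BYTES];
    long written;                  /* filled in by the child */
};

static volatile sig_atomic_t flush_done;

static void on_flush_done(int signo)
{
    (void)signo;
    flush_done = 1;                /* just note that the child has finished */
}

/* Writes len bytes of data to out_fd from a child process and waits for the
   child's completion signal. Returns the byte count reported by the child,
   or -1 on failure. */
long flush_in_child(int out_fd, const char *data, size_t len)
{
    struct shared_log *log;
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    pid_t child;
    long result;

    assert(data != NULL && len <= LOG_BYTES);
    log = mmap(NULL, sizeof *log, PROT_READ | PROT_WRITE,
               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (log == MAP_FAILED)
        return -1;
    memcpy(log->data, data, len);
    log->written = -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_flush_done;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    /* Hold SIGUSR1 back until the parent is ready to wait for it. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &old);
    wait_mask = old;
    sigdelset(&wait_mask, SIGUSR1);

    flush_done = 0;
    child = fork();
    if (child == 0) {              /* child: flush the buffer, then signal */
        log->written = (long)write(out_fd, log->data, len);
        kill(getppid(), SIGUSR1);
        _exit(0);
    }
    if (child > 0) {
        /* A real logger would keep gathering data into another buffer here. */
        while (!flush_done)
            sigsuspend(&wait_mask);   /* unblock SIGUSR1 and wait for it */
        waitpid(child, NULL, 0);
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    result = log->written;
    munmap(log, sizeof *log);
    return child > 0 ? result : -1;
}
```

Even this small example needs a signal mask, a handler, and a flag just to
learn that one write has finished.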
The problem with this mechanism is that it is asynchronous -- the signal can
occur at any time. At any point in the application's execution, it must
presume that it can (and will) be "preempted" by the arrival of the signal.
Say the application is enqueueing more buffers to the list of free log buffers
when the signal arrives. It must assure that when the signal handling routine
comes along and tries to enqueue its own buffer, it doesn't muck up the queue
structure.
As if this weren't enough of a headache, the standard methods for dealing with
such concurrency cannot be used with signal handlers. The signal handler is
not a separate thread of execution. Rather, it is just a change in the
direction of the running process. Therefore, it cannot block on a semaphore
waiting for the free queue to become accessible. It has to do something with
the used buffer immediately, because once the signal handler stops, it is
done, forever.
Normally, the signal handler's action is to enqueue the buffer if possible,
but otherwise to leave it in a well-known place and set some flag indicating
that the buffer needs to be taken care of. Then, the initial process needs to
check whether a signal has occurred, if so enqueue the buffer, and so on.



Extending the Unix Model


As you can tell, dealing with asynchrony in an application is a major
headache. In contrast, multithreaded applications can avoid all asynchrony.
These applications consist of multiple threads, executing code
synchronously, without the possibility of being interrupted amidst what they
are doing. When one thread must synchronize with another, it does so
synchronously, by waiting for the other thread to arrive at an agreed-upon
rendezvous point. Because the applications do not have to deal with numerous
"what-if?" cases, they tend to be more robust, legible, and maintainable. They
are probably also more efficient, because they needn't do a lot of paranoid
checking to see if something has happened behind their back.
In the example above, one thread would be writing buffers out to disk, while
another puts data into the buffers. The threads would wait politely for each
other, using some kind of synchronization mechanism.
Increasingly, Unix vendors are extending the standard Unix processing model
with facilities for multiple threads within a single Unix process.
Multithreaded Unix processes allow asynchronous programming to be turned into
synchronous programming, as well as offering natural support for Ada tasking.


Synchronizing Threads Versus Tasks


Threads and Ada tasks are very similar, but not equivalent concepts. The
difference between the two can be seen largely in the methods used for
synchronizing threads versus tasks.
Why synchronize? Threads must synchronize what they're doing because they are
all sharing the same process. That means the same address space, the same file
descriptors, and the same data structures. There has to be a way to assure
that shared resources don't get messed up by threads modifying them all at
once. It's like too many chefs ruining the broth -- it's okay if each chef
locks the others out of the kitchen while he's stirring his particular pot.
There are many different mechanisms for synchronizing thread execution --
binary semaphores, counting semaphores, mutexes, condition variables,
readers/writer locks, and monitors, just for starters. Rather than try to
attack all of these, we'll just discuss one of the most popular
synchronization mechanisms: the combination of mutexes and condition
variables.
A mutex is simple -- it enforces mutual exclusion. Threads perform two calls
on a mutex to use it -- mutex_enter and mutex_exit. Once a thread has
successfully called mutex_enter, other threads that call mutex_enter will be
stopped until the first thread calls mutex_exit. Thus, a mutex can be used to
guard a resource (such as an error log) that must be written by only one
thread at a time.
Mutexes are also used to guard more complicated resources than files --
resources that may be ready for a particular operation or not. For a thread to
safely examine the state of such a resource, it must lock the mutex. But what
if the thread gets into the mutex and finds the resource isn't ready for its
use? It then has to wait for a particular condition to be satisfied.
In the data-logging example, the thread putting data
in the buffers would wait for conditions like "the second buffer is
available," while the thread writing data to disk would wait for a condition
like "the first buffer is ready to be written to disk." You see the reason for
the name "condition variable." A condition variable can be waited for with
cond_wait, and signalled with cond_signal.
cond_wait automatically exits the mutex and enqueues the thread on the
condition variable, to await another thread's call to cond_signal. Once the
thread has been signalled, it reenters the mutex and continues. Mutexes and
condition variables are an extremely common way of synchronizing thread
execution.
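The two-buffer data logger described above can be sketched with a mutex and two condition variables. This is only an illustration in Python, not the mutex_enter/cond_wait interface named in the text: threading.Lock stands in for the mutex, threading.Condition for the condition variables, and the names DoubleBuffer and run_logger are invented for the example.

```python
import threading

class DoubleBuffer:
    """Two buffers shared by a filling thread and a disk-writing thread."""

    def __init__(self):
        self.mutex = threading.Lock()                        # the mutex
        self.buffer_free = threading.Condition(self.mutex)   # "a buffer is available"
        self.buffer_full = threading.Condition(self.mutex)   # "a buffer is ready to write"
        self.full = []        # filled buffers waiting to be written
        self.free_count = 2   # both buffers start out empty

    def put(self, data):
        with self.mutex:                      # enter the mutex
            while self.free_count == 0:
                self.buffer_free.wait()       # exits the mutex, sleeps, reenters
            self.free_count -= 1
            self.full.append(data)
            self.buffer_full.notify()         # signal the writer

    def get(self):
        with self.mutex:
            while not self.full:
                self.buffer_full.wait()
            data = self.full.pop(0)
            self.free_count += 1
            self.buffer_free.notify()         # signal the filler
            return data

def run_logger(samples):
    """Push samples through the buffers; return them in the order 'written'."""
    buffers = DoubleBuffer()
    written = []                              # stands in for the disk

    def writer():
        for _ in samples:
            written.append(buffers.get())

    thread = threading.Thread(target=writer)
    thread.start()
    for sample in samples:
        buffers.put(sample)
    thread.join()
    return written
```

Each wait sits inside a while loop so that the thread rechecks the state of the resource after it reenters the mutex, rather than assuming the condition still holds.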
Interestingly, Unix systems based on Berkeley or AT&T Unix have a close
analogue to condition variables inside the kernel, in the sleep( )/wakeup( )
calls. However, a parallel to the mutex primitives is lacking in these
systems.


Synchronization with Ada Rendezvous


Ada task synchronization is accomplished with the Ada rendezvous. This is a
higher-level synchronization mechanism than condition variables and mutexes.
By that I mean that condition variables and mutexes can be used to efficiently
implement Ada rendezvous, but can also be used to synchronize threads in other
ways that might be difficult to emulate using Ada rendezvous.
In the Ada rendezvous, one task, the caller, calls another task, the acceptor.
Independently, the acceptor announces its willingness to accept callers.
Whichever task gets there first is blocked until the other arrives. When both
events have occurred, the acceptor is awakened and handles the caller's call. While
the acceptor is handling the caller's call, the caller is blocked. When the
acceptor is done, it ends the rendezvous, and the caller continues on its way.
Ada allows acceptors to accept any of a set of callers.
An Ada rendezvous looks something like a procedure call from the caller's
point of view -- it calls something (in the data logging example, it might be
a buffer-management facility) and it blocks until the thing returns. From the
callee's point of view, it's like waiting for some event to occur, handling
the event, and then waiting for the next event. The reason Ada uses this
mechanism is that it accurately models what is usually desired when two
threads wish to synchronize.
The Ada-style rendezvous is useful for many applications that use threads,
even if those applications are written in C.
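To make the shape of a rendezvous concrete, here is a rough sketch built from ordinary thread primitives. It is written in Python rather than Ada or C, it is not how any Ada runtime is implemented, and the names Entry, call, accept, and demo are made up for illustration.

```python
import queue
import threading

class Entry:
    """A single rendezvous point between callers and one acceptor."""

    def __init__(self):
        self._calls = queue.Queue()

    def call(self, argument):
        """Caller side: block until the acceptor has handled the call."""
        reply = queue.Queue(maxsize=1)
        self._calls.put((argument, reply))
        return reply.get()             # blocked for the whole rendezvous

    def accept(self, handler):
        """Acceptor side: block until a caller arrives, then handle it."""
        argument, reply = self._calls.get()
        reply.put(handler(argument))   # ending the rendezvous frees the caller

def demo():
    """A buffer-manager task accepts three calls from the main thread."""
    entry = Entry()
    log = []

    def record(item):
        log.append(item)
        return len(log)                # value handed back to the caller

    def acceptor():
        for _ in range(3):
            entry.accept(record)

    thread = threading.Thread(target=acceptor)
    thread.start()
    results = [entry.call(item) for item in "abc"]
    thread.join()
    return results, log
```

From the caller's side, entry.call reads like a procedure call; from the acceptor's side, entry.accept is a wait for the next event.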


POSIX


Earlier sections have discussed many shortcomings in standard versions of
Unix. These shortcomings are not necessarily inherent to the architecture of
Unix. In many cases, they can be viewed as specific holes in current
implementations.
For example, there is no requirement that a Unix system implement a
time-sharing scheduler in order to be called "Unix." There now exist real-time
Unixes that do not allow process priorities to drift. Likewise, some versions
of Unix can support most of Ada's real-time requirements. Still, most current
versions of Unix are based on the AT&T (System V) or UC Berkeley (BSD)
implementations, neither of which directly supports real-time requirements.
The first steps toward Unix-based support for Ada tasking (among other things)
have been taken by a number of people, and are now being standardized by the
IEEE POSIX 1003.4a committee. Thread support can be found in systems such as
LynxOS, CMU's Mach, and numerous Unix ports performed by vendors of
multiprocessor Unixes, such as Encore and Sequent. The proposed POSIX.4a
functions allow a POSIX process to create an additional thread in a given
process, allow that thread to exit, and allow other threads to synchronize
their execution with that thread.
In addition, the standard specifies that blocking system calls will block only
the thread which made the call. This, together with POSIX 1003.4 support for
real-time functionality, provides all the necessary pieces for real-time
Unix-based support of Ada.
To receive standards group mailings, including current drafts of POSIX
1003.4a, request subscription and meeting information from:
Secretary, IEEE Standards Board
IEEE Inc.
P.O. Box 1331
445 Hoes Lane
Piscataway, NJ 08855-1336





















June, 1991
UNIX, ADA, AND REAL-TIME - A FORTH MULTITASKER


A medium-heavyweight Forth multitasker




Jack J. Woehr


Jack is a senior project manager for Vesta Technology. He is a contributing
editor to Embedded Systems Programming magazine, and a member of the ANS/ASC
X3/X3J14 technical committee for ANS Forth. He can be contacted at 7100 W.
44th Ave., Suite 101, Wheat Ridge, CO 80033, as jax@well.UUCP, or as JAX on
GEnie.


Although C is a popular language for programming embedded systems, Forth is
more productive. In fact, Forth can make LEDs flash and stepper motors grind
while C programmers are still setting up compiler environment variables. Forth
offers interpreter convenience and compiler execution speed combined with a
modularity of structure paralleled only by hand-tooled assembly code.
One of the most striking conveniences of Forth as a control programming
language is that almost all commercial Forth systems and most shareware and
public domain Forths have a multitasker. Multitasking allows neat
modularization of asynchronous activities in an embedded system. An example of
using a multitasker in a control application is to create three extra tasks in
addition to the foreground console task: a data-collection task; a motor- (or
whatever) control task; and a task to handle communication with another
processor over a second serial channel. All three tasks can run in the background
while the programmer debugs or the operator directs the system via the Forth
interpreter or canned application interface, which continues to run in the
foreground.


Forth Multitaskers


The familiar Forth multitasker depends on a chain of well-behaved and
cooperative tasks containing embedded PAUSEs that voluntarily relinquish the
processor in turn. I described the advantages and disadvantages of this sort
of multitasking in the article "Cooperative Multitasking" (Embedded Systems
Programming, April 1990).
In the same issue, Phil Koopman, formerly senior research scientist for Harris
Semiconductor and world-renowned authority on Forth-based CPUs, took the idea
several steps further and discussed this and more intricate models of Forth
multitasking, grouping them into four categories: lightweight, mediumweight,
heavyweight, and preemptive. His examples, understandably, involved the Harris
RTX2000, an incredibly powerful "Forth engine" based on the legendary Novix
NC4016 designed by Chuck Moore, the inventor of Forth.
Recently, I completed an onboard ROM Forth system called "Forth-83i96" that
produces ROM-able autostart object code for the Vesta Technology SBC196, a
single-board computer based on the Intel 80C196KB/KC. Forth-83i96 possesses a
classically efficient Forth cooperative round-robin multitasker. Nonetheless,
after delivery of the system to Vesta Technology I remained intrigued by the
mixed-multitasking models described by Dr. Koopman. In a day's work I was able
to superimpose an optional preemptive multitasker over Forth-83i96 as an
application program. I had hoped to do so without reference to any information
beyond that available to an SBC196 user adept at the art of reading the fine
manual.
Alas! The way I designed the original multitasker precludes my utopian dream:
It turns out that "insider information" is required to complete the program,
specifically, awareness that the Forth-83i96 round-robin multitasker restart
routine presumes that register 0x26 of the 100-byte 80196 register file is
loaded with the address of the instance of the user variable ENTRY local to
the task about to restart. One positive result of this experiment is that I
have decided to change the internals of the Forth-83i96 multitasker to make it
simpler for users without access to the system source code to hack the
multitasker along the lines presented here. K.I.S.S. (Keep It Simple, Stupid)
is still the watchword in Forth.


PREEMPT96.F


The code in PREEMPT96.F (Listing One, page 98) provides an environment that
falls in the cracks between Phil Koopman's description of "mediumweight" and
"heavyweight" multitasking. I call my effort "medium-heavyweight" because it
provides for preemption at any point when a timeslice has run out, while
allowing the program to avoid the additional overhead of the preemptor by
voluntarily PAUSEing in the traditional manner before the timer interrupt
decides to "lower the boom" on the languidly lingering task. PAUSEing
cooperatively is a great cycle-saver, as Forth's PAUSE only occurs between
Forth definitions when all the state of the Forth virtual machine is in a very
few registers and mostly on the dual Forth stacks. PREEMPT, on the other hand,
must save a large register file, because the timer interrupt has no way of
knowing how far along in its processing the Forth virtual machine is at the
time of preemption.
The preemption scheme embodied in this application does not replace the
onboard cooperative multitasker, but rather supplements and "watchdogs" it.
Most of the facilities of the underlying cooperative tasker are still present.
The core of the preemption engine is the interrupt service routine ?PREEMPT.
?PREEMPT is driven by a periodic interrupt on the processor Timer1. ?PREEMPT
actually reloads Timer1 at every interrupt with the value contained in the
PERIOD register object. Timer1 counts up and interrupts at zero, so PERIOD
must contain the two's-complement of the desired count. (One count is eight
processor state times.)
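The arithmetic behind PERIOD is small enough to show directly. The sketch below is in Python rather than Forth, and the function names are invented; it assumes only what the text states, namely a 16-bit up-counter that interrupts on rollover and eight state times per count.

```python
def period_register(counts):
    """16-bit value to load into PERIOD so Timer1 overflows after `counts` counts."""
    if not 0 < counts <= 0x10000:
        raise ValueError("counts must fit a 16-bit up-counter")
    return (-counts) & 0xFFFF        # two's complement, truncated to 16 bits

def state_times(counts):
    """Processor state times between interrupts (one count = eight state times)."""
    return counts * 8
```

For a desired count of 100 hex, for instance, period_register(0x100) gives FF00 hex, which is what NEGATE produces in the listing's SET-PERIOD word.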
When ?PREEMPT fires, it compares the User pointer (the register with which the
system's cooperative multitasker points to the task currently executing) with
the User pointer value that was present the last time ?PREEMPT fired. If the
User pointer value has changed, ?PREEMPT reloads its saved copy (LAST-UP) for
its next comparison. Also, the register object CLICKS is loaded from the
task-local priority variable of the new task in the line: clicks ' my-pri >body
@ up [+s] ld. If the User pointer is unchanged, ?PREEMPT decrements CLICKS and
branches on nonzero to an interrupt return. If CLICKS expires, ?PREEMPT
branches to PREEMPT, which saves a critical portion of the register file and a
few other data objects and passes control to the next task. Seventeen items go
to the data stack in this operation. Forth-83i96 provides a data stack depth
of 256 items, so average applications should be safe.
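The decision ?PREEMPT makes at each timer interrupt can be modelled in a few lines. The following is a Python simulation, not the 80C196 code; the class and function names are invented, the User pointer value 4000 hex is arbitrary, and the byte-wide wraparound assumes the DJNZ behavior described in the listing's comments.

```python
class Preemptor:
    """Models the LAST-UP and CLICKS register objects used by ?PREEMPT."""

    def __init__(self):
        self.last_up = 0   # SETUP clears LAST-UP so the first interrupt reloads
        self.clicks = 0

    def timer_interrupt(self, up, my_pri):
        """Return True when the task at User pointer `up` should be preempted."""
        if up != self.last_up:                   # a different task is running
            self.last_up = up                    # remember it
            self.clicks = my_pri & 0xFF          # load its priority into CLICKS
            return False
        self.clicks = (self.clicks - 1) & 0xFF   # DJNZ on a byte counter
        return self.clicks == 0                  # zero means the timeslice is gone

def interrupts_until_preempt(priority):
    """Count timer interrupts a task survives after CLICKS has been loaded."""
    preemptor = Preemptor()
    task = 0x4000                                # hypothetical User pointer value
    preemptor.timer_interrupt(task, priority)    # first interrupt only loads CLICKS
    count = 1
    while not preemptor.timer_interrupt(task, priority):
        count += 1
    return count
```

A priority of 1 yields one click and a priority of 0 yields 256, matching the comments beside MY-PRI in Listing One.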
When PREEMPT arbitrarily suspends a task and goes to pass control to the next
task in the round-robin, it must determine whether it suspended that next task
itself on an earlier cycle. The test is the line: ax reclaim # cmp, in which PREEMPT
tests the contents of AX (which contains the vector fetched from the ENTRY
user variable of the task about to be restarted) for a match with the address
of the RECLAIM routine, the routine that restarts a preempted task on the next
cycle of the round-robin.
When a running task does not PAUSE voluntarily, but is PREEMPTed, PREEMPT
pushes the vector resident in the task-local instance of ENTRY at the time of
preemption onto the stack in the line: ' ENTRY >body @ up [+s] push and
replaces it for this cycle of the round-robin with the address of RECLAIM. As
it restarts the task, RECLAIM restores the contents of ENTRY in the line:
' ENTRY >body @ up [+s] pop so that the task can proceed to its next voluntary
PAUSE without ever knowing it has been preempted.
If, upon testing, ENTRY is found to point to RECLAIM, PREEMPT passes control
to that routine. RECLAIM will restore the Forth virtual machine, including the
User and stack pointers and much of the 80C196 register file. RECLAIM will
also pop back the processor flags (via a POPA instruction) to their state
before the timer interrupt which PREEMPTed the task.
On the other hand, if ENTRY points to any other routine, PREEMPT does not care
what the routine is. PREEMPT merely prepares to pass control to that routine
(typically, the system cooperative multitasker restart routine that I call
RESUME, or for a SLEEPing task, the routine PASS). Because of PREEMPT's "don't
care" behavior, the Forth-83i96 cooperative multitasker word SLEEP still can
shut down a task, because it plants PASS in ENTRY. (Unlike the situation under
the cooperative tasker alone, after a task has been shut down via SLEEP, it
must not be recalled via WAKE, as WAKE cannot know the method by which the
task was suspended; the only clue was the contents of ENTRY that were
overwritten by SLEEP. "The remedy is left to the reader as an exercise.")
PREEMPT must take one specific action on finding that a task will be restarted
by a routine other than RECLAIM. PREEMPT must restore the processor flags to
some state, because the current ones were pushed by the PUSHA in the first
line of the timer interrupt ?PREEMPT. PUSHA clears the interrupt mask, so
simply ignoring the situation is not an option. On the other hand, PAUSE does
not save processor flags, so what to do?
To this end, in SETUP we save the processor configuration to the register
object SAVE-STAT after the timer interrupt has been enabled. The WSR,
INT_MASK, INT_MASK1, and PSW saved therein must be adequate to the restarting
of any task from a voluntary PAUSE. This pretty much precludes any task
operating under the medium-heavyweight multitasker from trifling with register
windows or interrupt masks (although uninterruptible sections bracketed by DI
[Disable Ints] and EI [Enable Ints] are still possible, as long as
consideration is given to possible timer rollover during suspension of
interrupts). Of course, in any other preemptive multitasking operating system
it is equally atypical for tasks other than the system task to change
interrupt masks.
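The ENTRY bookkeeping described in the last few paragraphs can be summarized in a short model. This is a Python simulation with invented names; the strings stand in for routine addresses, and the returned text merely labels which restart path would be taken.

```python
RECLAIM, RESUME, PASS = "RECLAIM", "RESUME", "PASS"   # stand-ins for routine addresses

class Task:
    def __init__(self, entry=RESUME):
        self.entry = entry   # models the task-local ENTRY user variable
        self.stack = []      # models the task's data stack

def preempt(task):
    """Suspend a running task involuntarily, as PREEMPT does."""
    task.stack.append(task.entry)   # save the vector that was in ENTRY
    task.entry = RECLAIM            # the next restart goes through RECLAIM

def restart(task):
    """Restart the next task in the round-robin; report the path taken."""
    if task.entry == RECLAIM:
        task.entry = task.stack.pop()   # RECLAIM pops the saved ENTRY vector
        return "exact flags restored by RECLAIM"
    # Voluntary PAUSE or SLEEP: no flags were saved, so use the SETUP snapshot.
    return "SAVE-STAT flags, then " + task.entry
```

A preempted task therefore passes through RECLAIM exactly once and then returns to its ordinary restart routine, while a task that PAUSEd or was put to SLEEP never sees RECLAIM at all.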
It does not matter that the condition flags in the Processor Status Word (PSW)
such as CARRY and ZERO, which are arbitrarily "restored" to a PAUSEed task by
PREEMPT in this manner, do not match the condition flags that were present
when the task PAUSEed. PAUSE is a Forth definition that executes between other
Forth definitions. Forth definitions pass state information from one to the
other via data stack entries. Only PREEMPT (which can interrupt the internals
of a Forth code word while that word is performing register-to-register
calculations) must restore precisely the condition flags that were saved.
PREEMPT does not really save the whole state of Forth-83i96. Faced with a huge
register file, I went hog-wild in coding the original system and cached data
normally consigned to VARIABLEs in registers. System objects having to do
with, for instance, the operation of the resident 80196 assembler, are
actually registers and are not saved by PREEMPT. Therefore, the preemptive
multitasker cannot safely be active at compile time.
I have arbitrarily truncated the set of registers saved and restored by
PREEMPT to include only the Forth virtual machine registers and the list of
scratch and iteration registers reserved and documented for user applications.
This seems to work satisfactorily; the user is welcome to change the code to
preserve more state.
To use the medium-heavyweight multitasker, all compilation should take place
before the multitasker is activated. Tasks may or may not contain PAUSEs, as
illustrated in SAMPLE-TASK0 and SAMPLE-TASK1. The priority must be set for
each task (SET-PRI), a period set for the timer interrupt (SET-PERIOD), and
SETUP invoked.
Next, each task must be wakened for the first time via the system round-robin
multitasker word WAKE. MULTI, however, only affects PAUSE operations, so its
use becomes optional. Once SETUP is invoked, the preemptive multitasker is
off and running--until reset! Use SLEEP to shut down an undesired task, or
better yet, lower its priority (which may be changed at any time) via SET-PRI.
In the context of the 80C196 micro-controller, the medium-heavyweight tasker
is a toy. The principle is applicable, however, to such processors as the
Philips/Signetics 68070, a control-oriented 68010 workalike whose onboard
memory management unit could offer considerable practical employment to mixing
preemptive and cooperative multitasking in the Forth programming model.


Note on the Forth Assembler Syntax


Forth systems traditionally contain a resident assembler, complete with
structured conditionals such as DO ... LOOP, IF ... ELSE ... THEN, BEGIN ... UNTIL, and so
on. A Forth assembler is suitable for generating native code for any assembly
construct from an interrupt routine to a hand-coded Forth definition. Such
assemblers are effectively themselves small Forth programs, and as such obey
the parsing logic of Forth, which dictates that operands precede operators.
(Although many PC Forth systems now also feature standard macro-syntax
assemblers, the overhead for such an assembler in a tiny controller Forth is
prohibitive.) Addressing mode indicators are themselves cast as individual
operands to the assembler operators: #, [], []+, [+s], and so on.
Thus, if the timer interrupt ?PREEMPT (see Listing One) were recoded from
Forth "reverse Polish" syntax to macro assembler syntax, you would see the
code in Example 1. The strange syntax by which variables and user variables
are referenced in the Forth assembler has to do with the nature of the
ROM-able code generated by Forth-83i96. Both variables and user variables
possess an embedded offset: In the case of variables, it is an absolute
address of their data body in distant RAM; in the case of user variables, it
is the offset from any task structure address at which the data body of their
local instance is to be found. The phrase: ' <var-name> >BODY @ returns the
relevant embedded value in both cases.
Example 1: Recoding the timer interrupt ?PREEMPT from Forth "reverse Polish"
syntax into macro assembler syntax

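 QPREEMPT:
     PUSHA                    ; save processor flags and interrupt masks
     LDB  WSR, #0FH           ; switch register window to write the timer
     LD   TIMER1, PERIOD      ; reload the timer for the next interrupt
     CLRB WSR                 ; switch the window back
     CMP  UP, LAST-UP         ; same task as at the last interrupt?
     JE   ONWARDS             ; yes, go count down its timeslice
     ST   UP, LAST-UP         ; no, mark the new task
     LD   CLICKS, MY-PRI[UP]  ; and load its priority
 ONEMORGO:
     POPA                     ; restore processor flags
     RET                      ; return from interrupt
 ONWARDS:
     DJNZ CLICKS, ONEMORGO    ; return if the timeslice has not expired
     SJMP PREEMPT             ; timeslice expired, so preempt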


_A MEDIUMWEIGHT-HEAVYWEIGHT FORTH MULTITASKER_
by Jack Woehr


[LISTING ONE]

\ preempt96.f ... a "medium-weight" pre-emptive multitasker for the
\ Vesta SBC196 running Forth-83i96.
\ Copyright *C* 1991 jack j. woehr
\ jax@well.UUCP JAX on GEnie
\ SYSOP RealTime Control & Forth Board (303) 278-0364 3/12/24 24 hrs.

\ *** Data Objects

\ Register Aliases

$ 00 constant zero \ Symbolic Name for the Zero Register
$ 20 constant up \ Forth83i96 User Pointer
$ 26 constant entry-reg \ Forth83i96 Multitasker assumes ENTRY in this reg.

\ Register Variables

$ 90 constant clicks \ clicks left to execute on current task
$ 92 constant last-up \ task that was executing last time interrupt fired
$ 94 constant period \ how often the preemptive multitasker should fire
$ 96 constant reg-temp \ a temporary register, used as a pointer and a holder
$ 98 constant save-stat \ PSW+INT_MASK+WSR+INT_MASK1 .. double length
$ 9C constant ax \ Symbolic Name for a Scratch Register

\ Special Function Registers (SFRs) for Hardware Control
\ See _80C196KB USER'S GUIDE_, Intel 1990.

$ 08 constant int_mask \ Int Mask containing Timer1 Int
$ 09 constant int_pend \ Interrupt Pending register for Timer1 Overflow
$ 0A constant timer1 \ base address of Timer1
$ 14 constant wsr \ Window Status Register, controls register windowing
$ 16 constant ioc1 \ Input/Output Control1 governs Overflow Int Enable

\ Bit Masks for Hardware Control

$ 04 constant enable \ IOC1.2, Timer1 Overflow Interrupt Enable
$ 01 constant ov-int \ INT_MASK.0 Timer1 Overflow

\ Interrupt Handle
\ Forth-83i96 ROM vectors ints thru regs


$ 48 constant timerov-handle \ Vector for Interrupt 00

\ Declare a USER VARIABLE of which all tasks will possess an instance.
\ The local instance is the task's priority.

user variable my-pri
forth

\ Value 0 - 255 (since DJNZ instruction is used ... substitute DJNZW
\ on the 80C196KC part in the routine ?PREEMPT for greater range of
\ possible priority values).
\ 1 ... Task will execute one click maximum
\ 0 ... Task will execute 256 clicks maximum

\ *** Preemptor Routines

\ Here is the return from pre-emption:

label reclaim \ entry-reg == ENTRY
 up entry-reg ' ENTRY >body @ # sub3 \ load user pointer
 sp ' TOP >body @ up [+s] ld \ get task's stack
 ' ENTRY >body @ up [+s] pop \ get previous restart routine
 reg-temp $ 1A # ld \ first register to restore
 ax $ 34 $ 1A - 2/ # ld \ number of registers to restore
 dp@ \ (resolve addr for DJNZ)
 reg-temp []+ pop \ restore reg and postinc ptr
 ax djnz \ loop 'til done
 popa \ restore flags from when task was interrupted preemptively
 ret c; \ return address last thing waiting on stack

\ Here is the preemption:

label preempt \ User Pointer still points to current task
 \ Ret addr & Proc Flags already on stack
 reg-temp $ 32 # ld \ first register pair to preserve
 ax $ 34 $ 1A - 2/ # ld \ number of register pairs to preserve
 dp@
 reg-temp [] push \ save contents of a reg
 reg-temp 2 # sub2 \ "manual post decrement" mode!
 ax djnz
 ' ENTRY >body @ up [+s] push \ save the restart routine
 sp ' TOP >body @ up [+s] st \ save address of TOP of stack
 reg-temp reclaim # ld \ address of preempted task restarter
 reg-temp ' ENTRY >body @ up [+s] st \ install reclaim routine ...
 reg-temp ' LINK >body @ up [+s] ld \ == ENTRY of next task
 ax reg-temp [] ld \ @ENTRY == restart routine of next task
 ax reclaim # cmp \ is new task's restart routine RECLAIM?
 0<> if \ no, so we must restore intmask & psw "by hand"
 save-stat push
 save-stat 2+ push
 popa
 then
 entry-reg reg-temp ld
 ax br c; \ start next task!

\ A label to DJNZ to while we wait for task slice to expire.

label one-more-goround popa ret c;


\ Timer Interrupt

label ?preempt
 pusha \ save processor flags
 wsr $ 0f # ldb \ switch Register Window to write timer
 timer1 period ld \ set up next timer int
 wsr clrb \ switch back
 up last-up cmp \ executing same task as at last int?
 0<> if \ no
 up last-up st \ mark new task
 clicks ' my-pri >body @ up [+s] ld \ get "priority"
 popa \ restore processor flags
 ret \ return from interrupt
 then \ same task, decrement clicks
 one-more-goround clicks djnz \ return if time not yet expired
 preempt sjmp c; \ and if zero fall-thru, preempt

\ *** Hardware Setup

\ Install Interrupt Handler

: install-timer-int ( ---) ?preempt timerov-handle ! ;


\ Control Counter and Interrupt Masks

code setup ( --)
 last-up clr \ so that int will do setup first time
 pusha \ save setup while changing wsr
 wsr $ 0F # ldb \ change WSR to read IOC1, write timer
 timer1 period ld \ write timer period to Timer1
 ax ioc1 ldb \ get current IOC1 mask
 ax enable # orb \ mask in Timer1 Overflow Int Enable
 popa \ restore
 ioc1 ax ldb \ store resultant mask to IOC1
 int_pend clrb \ clear pending interrupts
 int_mask ov-int # orb \ set Timer Overflow Int
 pusha \ get int mask and psw
 save-stat 2+ sp [] ld \ save "normal" processor status
 save-stat 2 sp [+s] ld \ second word of same
 popa \ restore
 tonext c;


\ Timer1 counts up and interrupts on overflow ( FFFF/0000 boundary)

: set-period ( CPU-state-times-desired/8 --) negate period ! ;

\ How many timer ints go by before a task is forcibly pre-empted?

: set-pri ( clicks-before-preemption task --) my-pri local ! ;

\ *** Sample Tasks That Don't Behave Themselves

\ A "naughty" task that doesn't PAUSE very often!

variable zotz
background: sample-task0 ( --)

 begin $ 1 zotz +! zotz @ 0= if pause then again ;

\ A "wicked" task that doesn't PAUSE at all!

variable foof
background: sample-task1 ( --) begin 1 foof +! again ;

\ Typical usage: HEX 100 100 TEST-SAMPLE-TASKS

: test-sample-tasks ( clicks period --)
 set-period
 dup
 foreground set-pri \ set priority of foreground task
 dup \ "naughty" task gets same CPU as FOREGROUND
 sample-task0 set-pri \ set priority of "naughty" task
 4 / \ we'll give "wicked" task less CPU
 sample-task1 set-pri \ set priority of "wicked" task
 install-timer-int \ install handler in vector handle
 setup \ turn on interrupt
 sample-task0 wake \ enable tasks to do other than "pass"
 sample-task1 wake
 ( multi)
 \ MULTI optional, since FOREGROUND task will be preempted anyway
 \ If MULTI not set, PAUSE won't work and "naughty" and "wicked"
 \ task become equivalent (except for priority).
;

\ Try this after you have executed TEST-SAMPLE-TASKS

: watch-sample-tasks ( --)
 zotz off foof off
 begin zotz @ u. foof @ u. key? until
 key drop ;

\ *** End of PREEMPT96.F



























June, 1991
TAKING UP RESIDENCE WITH CODERUNNER


TSRs are no longer the chore they once were




R. Bradley Andrews


Brad is currently a freelance computer programmer and writer in Columbus,
Ohio. He can be reached on CompuServe at 76057,1656 or on GEnie at
R.B.Andrews.


Traditionally, only those programmers thoroughly drilled in the internals of
the IBM PC and experienced in assembly language programming have undertaken
the writing of TSRs. Now Coderunner, a library of assembly routines from
Microsystems Software that provide many TSR functions, takes on this
challenge, opening the TSR door for programmers who may be less familiar with
assembly language or PC secrets.
This article examines Coderunner and presents Timer, a TSR digital stopwatch
that ticks off elapsed time in the upper-right of the screen. Simple as it is,
the example will enable us to put the Coderunner package through its paces.


Browsing the Package


The Coderunner package includes the source code for a wide variety of sample
TSRs. Among Coderunner's optimized functions are: support for string
manipulation, video, a keyboard interface, time and date functions,
sound/delay, file I/O, software interrupts and interrupt traps, COM/LPT support,
stack control, hotkey handling, full and tiny schedulers, DOS access, memory
management, standard memory allocation, conversion routines, integer
operations, and TSR-specific items.
While I can't go into each of these, a few do stand out. The full scheduler,
for instance, allows for up to 64K events to be active at any time, while the
"tiny scheduler" allows fewer events, but requires only a small amount of
memory, and works perfectly in the Timer example presented later. Other
routines duplicate those in standard C libraries, but because they eliminate
some generality that TSRs don't need, the routines are much smaller and more
appropriate. The video routines make saving and displaying screen information
a breeze; only two calls were required to save and restore the area used for
the clock display in the Timer example. Microsystems Software also includes
several sample TSR programs that put their routines to work. And because these
are useful utilities, which you are free to modify for your own use, they are
helpful in learning how to create different kinds of TSRs.
The two Professional Developer's Kits (PDKs) also contain routines useful for
TSR tasks. PDK1 is mainly geared toward communication functions, with routines
to handle COM interfacing, ANSI terminal management, buffered I/O, flow
control, and some ancillary routines to tie up the loose ends. In addition,
the sample HyperCOM program provides a working example of how these routines
can be used to produce a robust telecom TSR. PDK1 also has support for LIM
memory and mixed model programming, in addition to built-in print-spooling
routines.
PDK3 is not quite so broad in scope, focusing instead on the problems
associated with spawning programs from a TSR. But its routines are by no means
limited. The capability to save the system's state and transfer all resident
portions of a program, including any graphics screens and palettes, to LIM or
disk is vital to anyone who wants to explore this area of programming. The
samples included are thorough and useful. The most recent release also adds
support for XMS memory, which Windows also supports.
While the manuals and examples have most, if not all, of the information
required to get started, the startup process can sometimes seem like learning
Spanish from an English-Spanish dictionary. However, this is not really
Coderunner's fault--TSRs require that a number of things be kept track of. And
while Coderunner simplifies the process, it doesn't remove all the pain. The
best way to begin is to read all the informational sections within the manual,
skipping the descriptions of the individual routines until they are needed.
This approach both gets all the meat out of the manual and postpones some
learning until it is required, reducing brain overload.


A Timely Example


I chose the Timer example because it tests two different areas of the
Coderunner library: the tiny event scheduler and the hotkey handler. Being
able to start with the guts of the clock example provided with the package
made my task easy, although a bit of work was required to add hotkey support.
Timer is designed to be in one of three states: 0 if the timer is active, but
nothing is displayed; 1 when the timer is active and counting; or 3 when the
timer is halted, but the final count is left on the screen. Each press of the
hotkey (Shift-F10, in this case) pushes it into the next state.
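The hotkey-driven cycle through Timer's states amounts to a three-step ring. Here is a minimal sketch in Python rather than the C of the listings; the state names and the function next_state are invented for illustration and do not come from the Timer source.

```python
# Hypothetical names for the three states described in the text.
STATES = ("hidden", "counting", "halted")

def next_state(state):
    """Return the state entered by one more press of the hotkey."""
    position = STATES.index(state)
    return STATES[(position + 1) % len(STATES)]   # wrap around after the last
```

Three presses of the hotkey bring the timer back to the state it started in.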
Timer's code is split into three separate files. TIMERINI.C in Listing One,
page 104, contains main( ) and all code needed to install the TSR. TIMERDAT.C
in Listing Two, page 104, holds all the data that are not required once the
timer is installed, such as the text for the sign-on screen. Due to the nature
of the compiler and linker, disposable variables must all be in this file--any
that have been placed in the same file with the code will not be disposed of
and will instead use up precious memory needlessly.
TIMER.C in Listing Three, page 104, holds the active part of the TSR. While
larger TSRs might split this between several files, timer is simple enough
that all of the required code and data can easily occupy a single file.
MAKEIT.BAT, Listing Four, page 104, is also included so that you can easily
rebuild the timer. Though MAKEIT.BAT does have a debug option, I found it
simple enough to work on the timer solely in TSR mode. In spite of several
reboots when something managed to hang the machine, the program was simple
enough that this was the preferred approach; many of the elements of this and
most other TSRs will only work in TSR mode.
Returning to main( ) in Listing One, the first step is to ensure that the TSR
is not already loaded. To do this, the program scans memory for a matching
cfg_rec data structure (defined in Listing Two) that identifies the TSR's
four-character ID code and version number, as well as any other information
the program may need. If the timer is already installed and the -r switch has
been passed from the command line, timer is removed from memory, if possible.
Otherwise, a warning message prints out and the loading process is aborted. If
this is the first load, the screen is cleared, the desired character
attributes are set, and the sign-on screen is displayed.
After the cursor is adjusted, idata_end and icode_beg are set to allow
Coderunner to discard the start-up code when we are finished. Every TSR
requires the call to stay_resident to set up the needed Coderunner environment
information. install_hk and install_tsc set up the hotkey handler and the tiny
scheduler for our desired actions. move_to_lim comes from the PDKI module and
will move the TSR to LIM memory if it is available.
Most of the TIMERDAT data (see Listing Two) is standard, but install_list[]
bears a closer look. To prevent the linker from stripping out necessary code
during the link process, this variable must be set to an array of all the
install_xx routines used by the TSR. This is very important; it caused me a
bit of consternation when the timer randomly died (and hung the machine) until
I placed the proper routines in this array.


Products Mentioned


Microsystems Software Inc. 600 Worcester Road Framingham, MA 01701
508-626-8511 Coderunner: $149; $295 with source PDK1: $99; $195 with source
PDK3: $99; $195 with source System requirements: DOS 3.1 or greater for
development, DOS 2.1 or greater for TSR use, and Turbo C/C++, Microsoft C, or
Zortech C
TIMER.C (see Listing Three) houses the two main routines: popclock and hk_isr.
popclock does the actual timing and is called only when the timer is actively
running. It compares the current time to the start time, converts this to the
proper format, and then prints it out in the top right corner of the screen.
As long as which is equal to one, an event is set for the next tick, and the
process continues. hk_isr changes the state to the next in sequence, and
performs any special actions required in that mode, such as saving or
restoring the information in the screen area. This approach works well in most
cases, but a few problems can arise. For example, if that part of the screen
has changed since the timer's start, it obviously will not look right when the
timer is done. Also, programs that use video screen 1, such as Brief, may have
the wrong information displayed there. The first problem is unavoidable
without an extensive amount of programming, but the latter could be fixed with
some additional code to check alternate screens.
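The conversion popclock performs can be sketched in portable C. This is
not the Coderunner ticks2time routine. It is a stand-in that assumes the
standard PC timer rate of 1193180/65536 ticks per second (about 18.2) and
produces the same 13-character layout as time_str. The name format_elapsed
is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Format elapsed BIOS timer ticks as " hh:mm:ss:cc " (cc = hundredths).
   Assumes the standard PC tick rate of 1193180/65536 Hz (about 18.2 Hz). */
void format_elapsed(unsigned long ticks, char out[14])
{
    unsigned long long cs;   /* elapsed time in hundredths of a second */
    unsigned hours, mins, secs, hund;

    assert(out != NULL);
    cs    = (unsigned long long)ticks * 6553600ULL / 1193180ULL;
    hund  = (unsigned)(cs % 100);
    secs  = (unsigned)((cs / 100) % 60);
    mins  = (unsigned)((cs / 6000) % 60);
    hours = (unsigned)((cs / 360000) % 100);   /* wrap to two digits */
    snprintf(out, 14, " %02u:%02u:%02u:%02u ", hours, mins, secs, hund);
}
```

Working in hundredths of a second with integer arithmetic avoids
floating-point code in the resident portion, which matters when memory is
tight.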


Some Spit and Polish


As with any simple program, several potential enhancements are immediately
visible. After the timer has stopped, it should probably be updated in case it
is erased by some other user action. The stack size reserved for the program
could probably be reduced with the aid of Coderunner's stack checking
functions. The start-up portion could also use command-line parameters, or
some other method, to allow the user to select the hotkey used, the timer
detail level, and the colors used to display the timer. While the timer shows
several elements of a useful TSR, it barely scratches the surface of the
features in the Coderunner package.


Conclusion


Coderunner is geared toward use with Turbo C, Turbo C++, or Microsoft C, but
is also fully compatible with Zortech C++. I found the optimal system to be
the Turbo C 2.0 compiler in conjunction with the Microsoft linker to properly
link the various modules. In short, Coderunner and its associated Professional
Developer's Kits are a highly useful set of utilities that open the difficult
world of TSR programming to anyone who can grasp the C language.


_TAKING UP RESIDENCE WITH CODERUNNER_
by R. Bradley Andrews



[LISTING ONE]

#include "cr.h"

/** RESIDENT CODE STACK SPACE **/
#define HK_STK 150 /* Stack size for Hot-Key services */
#define SR_STK 64 /* Stack for COM services */
#define STK_SZ (HK_STK+SR_STK) /* Size in words */
word isr_stk[STK_SZ+1]; /* Always allow an extra word! */

extern popclock(); /* Main TSR entry, called by scheduler */

extern hk_isr(); /* Hot-Key service in TIMER.C */
extern word hk_list[]; /* Defined in TIMERDAT.C; passed to install_hk() */

/******** Disposable messages in TIMERDAT.C **************/
extern char sms[]; /* Signon screen */
extern char attrc[],attrm[]; /* Screen attributes used for signon */
extern char already[]; /* Message when already loaded */
extern char unloaddone[];
extern char unloadproblem[];
extern word init_data_end; /* Marker for end of disposable data */

/* Dummy function marks start of disposable code */
init_code_start()
{
}


/* this main is called only once on program load, after that it is discarded
*/
main()
{ int i;
 char *a;

 if(second_load()) /* Check if TSR already loaded */
 {
 i=str_pos('-',cmd_line);
 if (i && ((cmd_line[i]|0x20)=='r')) /* Accept -r or -R */
 {
 if(remove_tsr()) /* Remove TSR from memory */
 {
 dspf(unloaddone); /* Inform about unloading */
 }
 else dspf(unloadproblem); /* Report that unloading failed */
 }
 else dspf(already); /* Otherwise show Help message */
 mv_crs(); /* Reposition real cursor */
 return(1); /* Exit with errorlevel 1 */
 }

 /***** Display Signon Screen *****/
 clr_scr();
 a = color ? attrc:attrm; /* Select proper screen attributes */

 crs_x=20; crs_y=7; /* Location to put signon box */
 dspf(sms,a); /* Show the screen */

 crs_x=0; crs_y=scr_len-2; /* Move to the bottom of screen */
 mv_crs(); /* Place real cursor there */
 crs_y=0; /* Top line used to display clock */

 idata_end=&init_data_end; /* This enables init data disposal. */
 icode_beg=init_code_start; /* This enables init code disposal. */

 stay_resident(isr_stk,STK_SZ*2); /* Enable resident mode */
 install_hk(hk_list,hk_isr,STK_SZ*2,0x7F); /* HOT-KEY Support */

 install_tsc(popclock,2*STK_SZ,1); /* Install tiny scheduler */
/* add_tsc_event(8L); /* Add event to timer, 8 ticks from now */
 /* Popclock will be called by scheduler */
 return(0); /* Set errorlevel to 0 */

}







[LISTING TWO]

/* this module contains ALL disposable data only. The marker at the end
 called init_data_end is just a marker to the end of disposable data
 NO CODE CAN BE PUT INTO THIS MODULE, DATA ONLY */

#include "cr.h"

/* Video attributes for signon screen. The last one is for the Clock */

/*char attrc[]={0x17,0x1E,0x13,0x1F,0x1B,0x17}; /* Color screen attributes */
/*char attrm[]={0x07,0x0F,0x07,0x07,0x07,0x70}; /* Mono screen attributes */

/* This is the signon message */
char sms[]= "`0"
 "[_________________________[`m"
 "[`1 Timer 1.00 `0[`m"
 "[`2 DDDDDDDDDDD `0[`m"
 "[`4 Author: `0[`m"
 "[`4 R. Bradley Andrews `0[`m"
 "[\\\\\\\\\\\\\\\\\\\\\\\\\[`n`5";

char already[]="`nError - Timer already present."
 "`nEnter TIMER -R to unload.`n";
word hk_list[]={M_RS+0x44,0}; /* Right-Shift Enter */

char unloaddone[]="Timer has been removed from memory.`n";
char unloadproblem[]="Timer could not be removed from memory.";
char _tsr_name[]="Timer";

struct cfg_rec config_block = { /* This record stays in only one module */
 sizeof(config_block), /* Configuration block size */
 'T','I','M','E', /* Program ID string */

 100, /* Version 1.00 */
 };

/**** STORE THIS IMMEDIATELY AFTER THE LAST DISPOSABLE DATA ITEM ****/

/**** 1. Store all install_?? type function pointers into install_list ****/
fp install_list[]={install_tsc, install_hk, install_bk}; /* Every install_xx
routine the TSR needs */

/**** 2. Put Marker for the end-of-init-data (must be = NONZERO) ****/
word init_data_end=1;
/************************************************************************/







[LISTING THREE]

#include "cr.h"

/** RESIDENT CODE STACK SPACE **/
#define HK_STK 150 /* Stack size for Hot-Key services */
#define SR_STK 100 /* Stack for COM services */
#define STK_SZ (HK_STK+SR_STK) /* Size in words */

#define kWidth 13 /* width of display string */

char time_str[]=" 00:00:00:00 ";
char blank_str[]=" ";

word sbuf[kWidth];

char attrc[]={0x17,0x1E,0x13,0x1F,0x1B,0x17}; /* Color screen attributes */
char attrm[]={0x07,0x0F,0x07,0x07,0x07,0x70}; /* Mono screen attributes */

long start_ticks;

int which=0;

popclock(void)
{ register char *t,*s;
 register long cur_ticks;
 struct time_rec cur_time;

 if (which==1) {
 add_tsc_event(1L); /* Schedule next call one tick (about 1/18 second) from now */
 }
 cur_ticks = bios_ticks();
 cur_ticks = cur_ticks - start_ticks;
 ticks2time(cur_ticks, &cur_time);

 t=&cur_time.hours; /* Prepare to convert hh:mm:ss */
 s=time_str+1;
 do cv_b2dec((byte)*t,s), s+=3; /* Convert 1 byte to decimal */
 while (--t>=&cur_time.sec100); /* Until done with seconds */

 chk_video(); /* Obtain current video settings */

 crs_x=scr_width-kWidth; /* Set cursor to upper right corner */
 dsp(time_str); /* Display time string there */

}
hk_isr()
{
 if (!which) {
 crs_x=scr_width-kWidth; /* Set cursor to upper right corner */
 get_block(kWidth,1,sbuf);
 start_ticks = bios_ticks();
 add_tsc_event(1L); /* Add event to timer, 1 tick from now */
 which++;
 }
 else {
 if (which == 1 ) {
 which++;
 }
 else {
 crs_x=scr_width-kWidth; /* Set cursor to upper right corner */
 put_block(kWidth,1,sbuf);
 which = 0;
 }
 }
}






[LISTING FOUR]

REM Use switch d for debug mode
REM
if %1x==dx goto compdebug
tcc -c -I..\ -DPDK1 timer*.c
goto linkit
rem
:compdebug
tcc -c -v -I..\ -DDBG timer*.c
:linkit
del timerini.lib
lib timerini +timerini;
if %1x==dx goto dbg
:lnk1
link ..\r0 timerdat timer ..\r1,timer,,..\k1 ..\cr timerini/NOI/NOD/M;
goto exit
:dbg
tlink ..\r0 timerdat timer ..\r1,timer,,..\cr timerini/c/m/v;
:exit



June, 1991
CELESTIAL PROGRAMMING WITH TURBO PASCAL


The CCD camera brings astrophotography to the PC




Lars Frid-Nielsen and Alex Lane


Lars Frid-Nielsen is a veteran engineer in the research and development group
at Borland. Alex Lane is a product manager for Borland's Languages Business
Unit. They can be reached at 1800 Green Hills Road, Scotts Valley, CA
95067-0001.


Few things stir more interest in astronomy than the dramatic pictures of
galaxies and nebulae in books at your local public library. Virtually all of
these photographs are taken using large telescopes at the world's major
astronomical observatories. Up until a few years ago, the amateur astronomer's
enjoyment of the universe was almost entirely restricted to those images seen
from the eyepiece of the telescope.
Thanks to the development of Schmidt-Cassegrain technology, amateur astronomers
now have access to affordable, portable, large-aperture telescopes capable of
capturing the universe on photographic film. As color films have increasingly
become more light-sensitive, sensational pictures are a reality using nothing
more than a 10-inch telescope and the local 1-hour photoshop. The technical
revolution experienced by amateur astronomers in the field of astrophotography
is now on the verge of another dramatic breakthrough, thanks primarily to an
apparatus known as the Charge Coupled Device (CCD) camera which can be
connected to a PC and used to capture images. This article describes a project
using a CCD camera and Turbo Pascal to deliver a digital image to a PC where
it can then be displayed, stored, and processed.


The CCD Camera


The CCD camera used contains a 640 x 518 array of light-sensitive cells. These
cells convert the photons gathered by the telescope into individual electric
signals. Each signal is electronically amplified, rendering light sensitivity
far superior to traditional photographic emulsions. CCD cameras have a linear
response to faint light signals, eliminating a shortcoming of conventional
photographic emulsions, called reciprocity failure, where the exposure time
required to record faint light sources grows exponentially. Reciprocity
failure translates directly into hour-long sessions guiding the eyepiece of a
telescope, which is no fun at all.
The signals from the light-sensitive cells are processed by the CCD camera and
are output in the form of a black-and-white NTSC television signal. The camera
connects to an interface card, sampling 256 points on every other scan line,
with each point having one of 64 levels of intensity.


Capturing the Image


Grabbing the image is the job of the Capture function in Listing One, page
106. Capture sends a reset signal to the interface card, signals the card to
begin capturing information, then drops into a while loop that polls the card
until the image is captured or a time-out expires. Once captured by the
interface card, the image must be transferred to the program. The program
represents the image as a variant record called pictype, which can be accessed
either as a framerec record or in "raw" form. As a framerec, an array of
records represents the individual lines of the image. (There is a byte
reserved for synchronization both before and after this array.) A
synchronization byte and an array of byte values represent each line, in turn.
In raw form, image data is treated as an array of integers, which is
convenient for storing the information in a file, as shown in the SavePicture
function.
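For readers more at home in C, the two views of the image can be
approximated with a union. This is only a sketch of the layout declared in
Listing One. It assumes byte-packed structures (true of the all-byte structs
below on mainstream compilers) and 16-bit integers for the raw view, and
pel_offset is a hypothetical helper added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXPEL    255      /* highest pel index, as in Listing One  */
#define MAXLINE   252      /* highest line index                    */
#define MAXBUFFER 32766    /* highest index into the raw word array */

typedef struct {
    uint8_t syncL;               /* line synchronization byte */
    uint8_t pels[MAXPEL + 1];    /* one intensity per sample  */
} pelrec;

typedef struct {
    uint8_t syncF;               /* start-of-field byte */
    pelrec  lines[MAXLINE + 1];  /* the scan lines      */
    uint8_t syncE;               /* end-of-field byte   */
} framerec;

typedef union {
    framerec fmt;                    /* structured view              */
    int16_t  words[MAXBUFFER + 1];   /* raw view, handy for file I/O */
} pictype;

/* Byte offset of a given pel inside the structured view
   (hypothetical helper, not in the original unit). */
size_t pel_offset(unsigned line, unsigned pel)
{
    assert(line <= MAXLINE && pel <= MAXPEL);
    return offsetof(framerec, lines) + (size_t)line * sizeof(pelrec)
         + offsetof(pelrec, pels) + pel;
}
```

Under these assumptions the structured view occupies 65,023 bytes and the
raw view 65,534, so the whole picture fits inside a single 64K segment.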
The Scan procedure transfers data in the card to a pictype record. In earlier
versions of the program, Scan was coded in Pascal, but was rewritten in
assembler and integrated using Turbo Pascal 6.0's built-in assembler. The
original code is retained in the listing as a comment.
CPU input/output ports are used to transfer data from the card to the pictype
record, and perform necessary communication with the card. The original Pascal
code makes use of the predefined Port array to access the CPU's I/O ports,
allowing about one image per second to be scanned on a 20-MHz 386SX machine
equipped with a VGA display. The assembler code is much more efficient,
allowing nearly eight images per second to be scanned with the same setup.
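The read-and-step protocol used by Scan can be tried without the hardware.
The sketch below restates the original Pascal loop in C and passes the port
accessors in as function pointers. The port numbers and the values written
come from Listing One, while port_in_fn, port_out_fn, and the fake card are
assumptions made so the loop can run on any machine.

```c
#include <assert.h>
#include <stdint.h>

#define LINES 253   /* maxline + 1 in Listing One */
#define PELS  256   /* maxpel + 1                 */

/* Hypothetical stand-ins for port input and output. */
typedef uint8_t (*port_in_fn)(uint16_t port);
typedef void    (*port_out_fn)(uint16_t port, uint8_t value);

void scan_image(uint8_t img[LINES][PELS], port_in_fn in, port_out_fn out)
{
    const uint16_t aport = 0x2F0, bport = 0x2F1;
    int line, pel;

    assert(img && in && out);
    out(bport, 0);                              /* reset the card */
    for (line = 0; line < LINES; line++)
        for (pel = 0; pel < PELS; pel++) {
            img[line][pel] = in(aport) & 0x3F;  /* keep 6 intensity bits */
            out(aport, 0x02);                   /* step to the next address */
            out(aport, 0x00);                   /* idle the lines */
        }
    out(bport, 0);                              /* reset again when done */
}

/* A fake card for trying the loop without hardware: it returns a running
   counter and advances it each time 0x02 is written to the data port. */
unsigned fake_addr;

uint8_t fake_in(uint16_t port)
{
    (void)port;
    return (uint8_t)fake_addr;
}

void fake_out(uint16_t port, uint8_t value)
{
    if (port == 0x2F1)
        fake_addr = 0;          /* reset */
    else if (value == 0x02)
        fake_addr++;            /* next sample */
}
```

Each sample costs one port read and two port writes, which is why the
per-cell overhead of the loop dominates the scan time.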


Processing the Image


Computer-enhanced photographs sent back by space probes such as Viking and
Voyager underscore the importance of computers in manipulating images. While
the capabilities of this program are not as advanced as those used by NASA,
you still exercise a great deal of control over the appearance of CCD images.
Typical manipulations include adding, subtracting, or masking images,
comparing images, adding or subtracting constant values to images,
establishing thresholds in images, inverting images, and filtering images.
Virtually all of these manipulations do line-by-line, cell-by-cell processing
of a pictype record.
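Three of these cell-by-cell operations are sketched below in C over a flat
array of intensities. The logic follows the Add, Threshold, and Invert
procedures in Listing One. The flat-array interface and the function names
are simplifications chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXBIT 0x3F   /* brightest of the 64 intensity levels */

/* dst = min(dst + src, MAXBIT), cell by cell */
void add_images(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i;

    assert(dst && src);
    for (i = 0; i < n; i++) {
        unsigned sum = (unsigned)dst[i] + src[i];
        dst[i] = (uint8_t)(sum > MAXBIT ? MAXBIT : sum);
    }
}

/* Cells darker than level become black */
void threshold_image(uint8_t *img, size_t n, uint8_t level)
{
    size_t i;

    assert(img);
    for (i = 0; i < n; i++)
        if (img[i] < level)
            img[i] = 0;
}

/* Flip each cell within the 6-bit intensity range */
void invert_image(uint8_t *img, size_t n)
{
    size_t i;

    assert(img);
    for (i = 0; i < n; i++)
        img[i] = (uint8_t)(MAXBIT & ~img[i]);
}
```

Clamping at MAXBIT keeps every result inside the 64 levels the card can
produce, so later steps such as the histogram never see an out-of-range
value.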
Histograms play an important role in helping you process images. The
HistoWindow consists of a pointer to a HistoView and a constructor, which
creates a non-resizeable window and then constructs and displays its HistoView
inside the window. The HistoView, in its constructor, calls its own Update
method, taking a pointer to an image and distributing the image pixels into 64
intensity levels detectable by the interface card. Update then calls its
ancestor's DrawView method to display the histogram.
A typical histogram, generated from an image of the moon's surface, is shown
in Figure 1. We can see from the histogram that there are about 15 different
intensities recorded in the image. We can increase the range of intensities in
this image to nearly full range by multiplying each cell's intensity
four-fold. The resulting histogram is shown in Figure 2. Figure 3 shows the
"enhanced" image represented by the histogram in Figure 2.
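The stretch described here amounts to two small routines: one that bins
the cells into 64 intensity levels and one that scales each cell and clamps
the result. The C sketch below mirrors the Histogram and Multiply procedures
of Listing One over a flat array. levels_used is a hypothetical helper that
counts the occupied bins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXBIT 0x3F   /* 64 intensity levels, 0..63 */

/* Count how many cells fall on each intensity level */
void histogram(const uint8_t *img, size_t n, unsigned histo[MAXBIT + 1])
{
    size_t i;

    assert(img && histo);
    for (i = 0; i <= MAXBIT; i++)
        histo[i] = 0;
    for (i = 0; i < n; i++)
        histo[img[i] & MAXBIT]++;
}

/* Number of distinct intensities present (hypothetical helper) */
unsigned levels_used(const unsigned histo[MAXBIT + 1])
{
    unsigned i, used = 0;

    assert(histo);
    for (i = 0; i <= MAXBIT; i++)
        if (histo[i] != 0)
            used++;
    return used;
}

/* Scale every cell, truncating and sticking at MAXBIT */
void multiply_image(uint8_t *img, size_t n, double scale)
{
    size_t i;

    assert(img && scale >= 0.0);
    for (i = 0; i < n; i++) {
        double v = scale * img[i];
        img[i] = (uint8_t)(v > MAXBIT ? MAXBIT : (unsigned)v);
    }
}
```

An image whose cells all lie between 0 and 15 spreads across 0 to 60 after
a four-fold multiply. The number of occupied bins stays the same, but they
are spaced four levels apart.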


The Evolution of the UI


As with many programs, the kernel of the program was developed fairly quickly
and had a rudimentary user interface. You entered information in response to
screen prompts, and because the program stored no state information, you
repeatedly entered the same information. We incorporated Turbo Pascal's
application framework, Turbo Vision, to improve the front end of the program.
(Pull-down menus and hotkeys are important advantages when you consider that
the program is often used in very dim light and in weather cold enough to
require gloves!)
Turbo Vision provides a fully controllable event-driven architecture, so you
only have to write code sufficient for handling events that distinguish your
object type's code from its ancestor's code. As such, there are two basic
steps to building an event-driven program with Turbo Vision. First, you define
the actions causing events to which your program will respond. Second, you
define what to do when events actually occur. The intermediate step of
identifying events is done by Turbo Vision's event handler (part of the
TApplication object). The event handler automatically queues events for
processing.
Listing Two, page 109, presents the main file, which includes all of the Turbo
Vision code for the program. A set of constants at the beginning of CCD.PAS
establishes symbols representing commands. In this way, when we need to refer
to the Open File command, we can use cmFOpen instead of the numerical value
1000.
Listing Two also declares three object types derived from object types in the
Turbo Vision hierarchy. CCDpgm is derived from the TApplication object type,
while the HistoView and HistoWindow types are derived from TView and TWindow,
respectively. CCDpgm adds methods for file I/O, image display, and histogram
updating, as well as providing virtual methods for initializing the menu bar
and status line, and handling events.
The actions of the program are defined in the CCDpgm.InitMenuBar and
CCDpgm.InitStatusLine methods. These methods use nested calls to NewItem and
NewStatusKey to construct linked lists of menu bar and status line items. This
syntax makes modification easy--for instance, to add a menu item, insert a
call to NewItem at the appropriate spot, supply appropriate parameters, then
insert a closing parenthesis at the end of the remaining nested statements.
The task of handling events as they occur is performed by the virtual method
HandleEvent. This method is called any time an event is identified, so a call
is first made to the ancestor TApplication.HandleEvent method, allowing the
generic application to take care of routine events. If an event is
user-defined, however, the case statement in this method defines how each
event is handled.
For example, if the user just used a mouse (or the F3 hotkey) to select Open
under the File menu bar selection, a cmFOpen command is generated, and passed
to the HandleEvent method, where code associated with cmFOpen is executed.
This case statement format facilitates incremental development. For example,
the code to execute for the command cmExpInteg is a call to a procedure called
NotImplemented, which calls Turbo Vision's MessageBox function, informing the
user that the feature in question is not yet implemented. The value returned
by MessageBox can be discarded because the file CCD.PAS specifies the use of
extended syntax with the $X compiler directive at the top of the file.
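The shape of that case statement can be shown outside Turbo Vision. The C
sketch below is a simplified analogy only: it omits the call to the ancestor
handler and the event record, and the result codes are invented for
illustration. The three command values are taken from the constants in
Listing Two.

```c
#include <assert.h>

/* Command values copied from the constants in Listing Two. */
enum { CM_FOPEN = 1000, CM_FSAVE = 1001, CM_EXPINTEG = 2001 };

/* What happened to a command; these names are illustrative only. */
typedef enum { CMD_DONE, CMD_NOT_IMPLEMENTED, CMD_IGNORED } cmd_result;

cmd_result handle_command(int command)
{
    assert(command >= 0);
    switch (command) {
    case CM_FOPEN:                 /* would open a picture file */
    case CM_FSAVE:                 /* would save the current picture */
        return CMD_DONE;
    case CM_EXPINTEG:              /* placeholder until the feature exists */
        return CMD_NOT_IMPLEMENTED;
    default:                       /* not one of ours */
        return CMD_IGNORED;
    }
}
```

Adding a feature later means replacing one placeholder case with real code,
which is what makes this format convenient for incremental development.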
Of particular interest in the CCDpgm object type is the SetMenuItem method,
which dynamically modifies menu item text using OnTxt and OffTxt string
constants. It does this by accessing the items in the MenuBar, searching for a
match and concatenating the appropriate On or Off string to the item. Here, a
pulldown menu shows the user whether or not the program is set to use
high-resolution VGA, do auto display, or perform a photo session.
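The text-building step of such a method reduces to joining a fixed label
with one of two suffixes. A minimal C sketch of that step follows. The
search through the menu bar is left out, and set_menu_text and the sample
label used in the note below are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<label> On" or "<label>Off" into out.
   set_menu_text is a hypothetical helper, not a Turbo Vision call. */
void set_menu_text(char *out, size_t out_size, const char *label, int on)
{
    assert(out != NULL && label != NULL);
    assert(strlen(label) + 3 < out_size);   /* room for suffix and NUL */
    snprintf(out, out_size, "%s%s", label, on ? " On" : "Off");
}
```

Called with the label "Auto Display " the routine yields either
"Auto Display  On" or "Auto Display Off". Both suffixes are three characters
long, so the menu item keeps a constant width.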


To Graphics Mode and Back



A major challenge to developing the UI was the need to switch from text to VGA
graphics mode to display an image, then to switch back to text mode. The
solution is to disable or suspend Turbo Vision long enough to display the
image, then to enable it again. If you don't disable Turbo Vision prior to
switching to graphics mode, Turbo Vision will continue to process input as it
sees fit, which will likely lead to trouble as it misinterprets events in VGA
mode.
The switches are accomplished with the help of the GraphicsStart and
GraphicsStop procedures. GraphicsStart shuts down Turbo Vision's error-message
and event handling with calls to DoneSysError and DoneEvents, respectively,
restores the initial screen mode and cursor, and frees memory with calls to
DoneVideo and DoneMemory.
Once Turbo Vision is shut down, the procedure Display_Image changes the video
mode to the appropriate VGA or high-resolution VGA mode as shown in Example 1.
Listing Three, page 112, presents the code for displaying the CCD image in VGA
mode. Finally, in returning from VGA mode, the procedure GraphicsStop
reestablishes Turbo Vision by initializing its memory, video, event and error
handlers, and redraws the textmode screen.
Example 1: Switching to VGA or high-resolution VGA graphics mode

 if VGAhiRes then
 begin
 r.AX := ($00 SHL 8) OR $61;
 Intr(VideoInt,r);
 mode := 1;
 end
 else
 begin
 r.AX := ($00 SHL 8) OR $13;
 Intr(VideoInt,r);
 mode := 0;
 end
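The expression assigned to r.AX packs the BIOS video function number
(AH = 0, set mode) into the high byte and the mode number into the low byte.
The same composition in C is sketched below. Mode 13h is the standard
VGA 256-color mode, while 61h is presumably specific to the authors' display
adapter. video_mode_ax is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Compose the AX value for the BIOS "set video mode" call.
   video_mode_ax is a hypothetical helper for illustration. */
uint16_t video_mode_ax(int hi_res)
{
    uint8_t ah = 0x00;                   /* function 0: set video mode */
    uint8_t al = hi_res ? 0x61 : 0x13;   /* mode numbers from Example 1 */

    assert(hi_res == 0 || hi_res == 1);
    return (uint16_t)((ah << 8) | al);
}
```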



Future Directions


The program is easily enhanced, due in part to the ease with which the
interface can be modified. For example, more sophisticated image processing
and filtering can be added to allow user experimentation with various
algorithms. The program can also be enhanced by storing files in a standard
format, such as PCX or TIFF.
The system we described in this article clearly illustrates how advances in
technology are bringing once-distant subjects out of books and onto our
desktops. The PC and its software are the key elements in providing us
ever-greater possibilities.


Products Mentioned


MS-4000 Series Solid State Video Camera Sierra Scientific 605 California Ave.
Sunnyvale, CA 94086 408-745-1500
Frame Grabber IDEC Inc. 1195 Doylestown Pike Quakertown, PA 18951 215-538-2600
Turbo Pascal 6.0 Borland International 1800 Green Hills Road Scotts Valley, CA
95066 408-438-8400

_CELESTIAL PROGRAMMING WITH TURBO PASCAL_
by Lars Frid-Nielsen and Alex Lane



[LISTING ONE]

unit Video;
{*******************************************************}
interface
{*******************************************************}

{ Global constants }
CONST

{--- defaults for Supervision card setup }
 Aport = $2F0; { first port on the card }
 Bport = $2F1; { second port on the card }

{--- field control bytes }
 fieldsync = $40; { new field! }
 linesync = $41; { new line }
 fldend = $42; { end of field }
 rep1 = $80; { repeat x1 }

 rep16 = $90; { repeat x16 }

{--- image structure }
 maxbit = $3F; { bits used in pel }
 maxpel = 255; { highest pel index }
 maxline = 252; { highest line index }
 maxbuffer = 32766; { highest "INT" index }

{ Global types }

TYPE
 bitrng = 0..maxbit; { bit range }
 pelrng = 0..maxpel; { pel indexes }
 framerng = 0..maxline; { line indexes }
 subrng = 0..maxbuffer; { raw data indexes }
 pelrec = RECORD { one scan line }
 syncL : BYTE;
 pels : ARRAY[pelrng] OF BYTE;
 END;
 framerec = RECORD { complete binary field }
 syncF : BYTE;
 lines : ARRAY[framerng] OF pelrec;
 syncE : BYTE;
 END;
 rawrec = ARRAY[subrng] OF INTEGER;
 picptr = ^pictype; { picture ptr }
 pictype = RECORD CASE INTEGER OF { picture formats}
 0 : (fmt : framerec);
 1 : (words : rawrec);
 END;
 histtype = ARRAY[bitrng] OF Word; { pel histograms }
 regrec = RECORD CASE INTEGER OF
 1 : (AX : INTEGER;
 BX : INTEGER;
 CX : INTEGER;
 DX : INTEGER;
 BP : INTEGER;
 SI : INTEGER;
 DI : INTEGER;
 DS : INTEGER;
 ES : INTEGER;
 FLAGS : INTEGER);
 2 : (AL,AH : BYTE;
 BL,BH : BYTE;
 CL,CH : BYTE;
 DL,DH : BYTE);
 END;
 byteptr = ^BYTE; { general ptr }
 strtype = STRING[255]; { strings }
 Hextype = STRING[4];

{ Global functions and procedures }

PROCEDURE Add(pic1,pic2 : picptr);
PROCEDURE Subtract(pic1,pic2 : picptr);
PROCEDURE Mask(pic1,pic2 : picptr);
PROCEDURE Compare(pic1,pic2 : picptr);
PROCEDURE Offset(pic1 : picptr; newoffs : BYTE);
PROCEDURE Negoffset(pic1 : picptr; newoffs : BYTE);

PROCEDURE Multiply(pic1 : picptr; newscale : REAL);
PROCEDURE Threshold(pic1 : picptr; level : BYTE);
PROCEDURE Invert(pic1 : picptr);
PROCEDURE Filter1(pic1,pic2 : picptr);
PROCEDURE Edge(pic1,pic2 : picptr);
PROCEDURE Histogram(pic1 :picptr; VAR histo : histtype);
PROCEDURE PicSetup(VAR newpic : picptr);

function SavePicture(filespec : strtype; pic : picptr): integer;
function LoadPicture(filespec : strtype; pic : picptr): integer;

PROCEDURE SetSyncs(pic1 : picptr);
PROCEDURE Card;

function Capture: BOOLEAN;

PROCEDURE Scan(pic1 : picptr);

{*******************************************************}
implementation
{*******************************************************}

{ Do pic1 + pic2 into pic1 }
{ Sticks at maxbit }

PROCEDURE Add(pic1,pic2 : picptr);
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }
 pelval : INTEGER; { pel value }

BEGIN
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO BEGIN
 pelval := pic1^.fmt.lines[lndx].pels[pndx] +
 pic2^.fmt.lines[lndx].pels[pndx];
 IF pelval > maxbit THEN
 pic1^.fmt.lines[lndx].pels[pndx] := maxbit
 ELSE
 pic1^.fmt.lines[lndx].pels[pndx] := pelval;
 END;
END;

{ Do pic1 - pic2 into pic1 }
{ Sticks at zero for pic1 < pic2 }

PROCEDURE Subtract(pic1,pic2 : picptr);
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }

BEGIN
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO
 IF pic1^.fmt.lines[lndx].pels[pndx] >=
 pic2^.fmt.lines[lndx].pels[pndx]
 THEN
 pic1^.fmt.lines[lndx].pels[pndx] :=
 pic1^.fmt.lines[lndx].pels[pndx] -

 pic2^.fmt.lines[lndx].pels[pndx]
 ELSE
 pic1^.fmt.lines[lndx].pels[pndx] := 0;

END;

{ Do pic1 masked by pic2 into pic1 }
{ Only pic1 pels at non-zero pic2 pels are kept }

PROCEDURE Mask(pic1,pic2 : picptr);
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }

BEGIN
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO
 IF pic2^.fmt.lines[lndx].pels[pndx] = 0 then
 pic1^.fmt.lines[lndx].pels[pndx] := 0;
END;

{ Do Abs(pic1 - pic2) into pic1 }
{ Detects changes in images }

PROCEDURE Compare(pic1,pic2: picptr);
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }

BEGIN
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO
 pic1^.fmt.lines[lndx].pels[pndx] := Abs(
 pic1^.fmt.lines[lndx].pels[pndx] -
 pic2^.fmt.lines[lndx].pels[pndx]);

END;

{ Add a constant to pic1 }

PROCEDURE Offset(pic1 : picptr;
 newoffs : BYTE);
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }
 pelval : INTEGER; { pel value }

BEGIN
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO BEGIN
 pelval := newoffs + pic1^.fmt.lines[lndx].pels[pndx];
 IF (pelval AND $FFC0) = 0 THEN
 pic1^.fmt.lines[lndx].pels[pndx] := pelval
 ELSE
 pic1^.fmt.lines[lndx].pels[pndx] := maxbit;
 END;
END;

{ subtract a value from a picture }


PROCEDURE Negoffset(pic1 : picptr;
 newoffs : BYTE);
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }
 pelval : INTEGER; { pel value }

BEGIN
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO BEGIN
 pelval := pic1^.fmt.lines[lndx].pels[pndx] - newoffs;
 IF pelval < 0 THEN { stick at zero rather than wrapping to maxbit }
 pic1^.fmt.lines[lndx].pels[pndx] := 0
 ELSE
 pic1^.fmt.lines[lndx].pels[pndx] := pelval;
 END;
END;

{ Multiply pic1 by a value }
{ Sticks at maximum value }

PROCEDURE Multiply(pic1 : picptr; newscale : REAL);
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }
 pelval : INTEGER; { pel value }

BEGIN
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO BEGIN
 pelval := Trunc(newscale * pic1^.fmt.lines[lndx].pels[pndx]);
 IF (pelval AND $FFC0) = 0 THEN
 pic1^.fmt.lines[lndx].pels[pndx] := pelval
 ELSE
 pic1^.fmt.lines[lndx].pels[pndx] := maxbit;
 END;
END;

{ Threshold pic1 at a brightness level }

PROCEDURE Threshold(pic1 : picptr;
 level : BYTE);
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }

BEGIN
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO
 IF pic1^.fmt.lines[lndx].pels[pndx] < level
 THEN pic1^.fmt.lines[lndx].pels[pndx] := 0;
END;

{ Invert pel values }

PROCEDURE Invert(pic1 : picptr);
VAR
 lndx : framerng; { line number }

 pndx : pelrng; { pel number }

BEGIN
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO
 pic1^.fmt.lines[lndx].pels[pndx] := maxbit AND
 (NOT pic1^.fmt.lines[lndx].pels[pndx]);
END;

{ Filter by averaging vertical and horizontal neighbors }

PROCEDURE Filter1(pic1,pic2 : picptr);
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }

BEGIN
 FOR lndx := 1 TO (maxline-1) DO
 FOR pndx := 1 TO (maxpel-1) DO
 pic2^.fmt.lines[lndx].pels[pndx] :=
 (pic1^.fmt.lines[lndx-1].pels[pndx] +
 pic1^.fmt.lines[lndx+1].pels[pndx] +
 pic1^.fmt.lines[lndx].pels[pndx-1] +
 pic1^.fmt.lines[lndx].pels[pndx+1])
 SHR 2;
END;

{ Edge detection }

PROCEDURE Edge(pic1,pic2 : picptr);
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }

BEGIN
 FOR lndx := 1 TO (maxline-1) DO
 FOR pndx := 1 TO (maxpel-1) DO
 pic2^.fmt.lines[lndx].pels[pndx] :=
 (Abs(pic1^.fmt.lines[lndx-1].pels[pndx] -
 pic1^.fmt.lines[lndx+1].pels[pndx]) +
 Abs(pic1^.fmt.lines[lndx].pels[pndx-1] -
 pic1^.fmt.lines[lndx].pels[pndx+1]) +
 Abs(pic1^.fmt.lines[lndx-1].pels[pndx-1] -
 pic1^.fmt.lines[lndx+1].pels[pndx+1]) +
 Abs(pic1^.fmt.lines[lndx+1].pels[pndx-1] -
 pic1^.fmt.lines[lndx-1].pels[pndx+1]))
 SHR 2;
END;

{ Compute intensity histogram for pic1 }

PROCEDURE Histogram(pic1 :picptr;
 VAR histo : histtype);
VAR
 hndx : bitrng; { histogram bin number }
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }

BEGIN

 FOR hndx := 0 TO maxbit DO { reset histogram }
 histo[hndx] := 0;
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO
 histo[pic1^.fmt.lines[lndx].pels[pndx]] :=
 histo[pic1^.fmt.lines[lndx].pels[pndx]] + 1;
END;

{ Allocate and initialize the picture buffer }

PROCEDURE PicSetup(VAR newpic : picptr);
VAR
 pels : pelrng;
 lines : framerng;

BEGIN
 IF newpic <> NIL { discard if allocated }
 THEN Dispose(newpic);
 New(newpic); { allocate new array }
END;

{ Save picture file on disk }
{ Uses the smallest number of blocks to fit the data }

function SavePicture(filespec : strtype; pic : picptr): integer;
VAR
 ndx : subrng; { index into word array }
 rndx : REAL; { real equivalent }
 nblocks : INTEGER; { number of disk blocks }
 xfered : INTEGER; { number actually done }
 pfile : FILE; { untyped file for I/O }
 RtnCode : integer;

BEGIN
 RtnCode := 0;
 Assign(pfile,filespec);
 Rewrite(pfile);
 ndx := 0; { start with first word }
 WHILE (ndx < maxbuffer) AND { WHILE not end of pic }
 (Lo(pic^.words[ndx]) <> fldend) AND
 (Hi(pic^.words[ndx]) <> fldend) DO
 ndx := ndx + 1;

 ndx := ndx + 1; { fix 0 origin }

 rndx := 2.0 * ndx; { allow >32K numbers... }
 nblocks := ndx DIV 64; { 64 words = 128 bytes }
 IF (ndx MOD 64) <> 0 { partial block? }
 THEN nblocks := nblocks + 1;
 rndx := 128.0 * nblocks; { actual file size }
 BlockWrite(pfile,pic^.words[0],nblocks,xfered);

 IF xfered <> nblocks then RtnCode := IOresult;
 SavePicture := RtnCode; { IOresult is cleared once read, so return the saved code }
 Close(pfile);
END;

{ Load picture file from disk }


function LoadPicture(filespec : strtype;
 pic : picptr): integer;
var
 picfile : FILE OF pictype;
 RtnCode : integer;

BEGIN
 Assign(picfile,filespec);
 {$I- turn off I/O checking }
 Reset(picfile);
 RtnCode := IOresult;
 {$I+ turn on I/O checking again }
 IF RtnCode = 0 then
 begin
{$I- turn off I/O checking }
 Read(picfile,pic^); { this does the read }
 RtnCode := IOresult;
{$I+ turn on I/O checking again }

{ IF NOT (IOresult IN [0,$99]) then
 RtnCode := -1;}
 RtnCode := 0;
 end;
 LoadPicture := RtnCode;
end;

{ Set up frame and line syncs in a buffer }
{ This should be done only in freshly allocated buffers }

PROCEDURE SetSyncs(pic1 : picptr);
VAR
 lndx : framerng; { index into lines }

BEGIN
 pic1^.fmt.syncF := fieldsync; { set up empty picture }

 FOR lndx := 0 TO maxline DO BEGIN
 pic1^.fmt.lines[lndx].syncL := linesync;
 FillChar(pic1^.fmt.lines[lndx].pels[0],maxpel+1,0);
 END;
 pic1^.fmt.syncE := fldend; { set ending control }
END;

{ Test for the Supervisor card }
PROCEDURE Card;
var test: byte;

Begin
writeln ('testing for vgrab card');
 Port[Bport] := 0; { reset the output lines }
 Port[Aport] := 0;
 test := Port[Aport]; { look for the card }
 if (test and $0C0) = 0 then Begin
 Port[Aport] := $03;
 test := Port[Aport];
 if (test and $0C0) <> $0C0 then
 writeln ('No Supervision card found');
 end;
 Port[Bport] := 0; { reset the address lines}

end;

{ Capture routine for the Supervisor card }
function Capture: BOOLEAN;
var
 TimeOut : integer;
Begin
 Port[Bport] := 0; { reset everything }

 Port[Aport] := $03; { start the capture }
 TimeOut := 15000;
 while ((Port[Aport] and $0C0) = $0C0) and (TimeOut > 0) do
 TimeOut := pred(TimeOut);

 Port[Bport] := 0; { reset everything }
 Capture := TimeOut <> 0;
end;

{ Scan data routine for the Supervisor card }
PROCEDURE Scan(pic1 : picptr);

(*
VAR
 lndx : framerng; { line number }
 pndx : pelrng; { pel number }
*)

BEGIN

(* This is the original pascal code:
 =================================

 Port[Bport] := 0; { reset everything }
 FOR lndx := 0 TO maxline DO
 FOR pndx := 0 TO maxpel DO Begin
 pic1^.fmt.lines[lndx].pels[pndx]
 := (Port[Aport] and $3F);
 Port[Aport] := $02; { next address }
 Port[Aport] := 0; { idle the lines }
 end;

 Port[Bport] := 0; { reset everything }

 Now replaced by the following assembler code:
 ============================================= *)

 asm
 mov dx,2F1H
 xor al,al
 out dx,al
 mov bx,maxline
 les di,pic1
 inc di (* skip syncF byte *)
 cld
 mov dx,2F0H
@ReadBoard: mov cx,maxpel+1
 inc di (* skip syncL *)
@ReadLine: in al,dx
 and al,3FH

 stosb
 mov al,2
 out dx,al
 xor al,al
 out dx,al
 loop @ReadLine
 dec bx
 jnz @ReadBoard
 mov dx,2F1H
 xor al,al
 out dx,al
 end
end;

{*******************************************************}

end.







[LISTING TWO]

{$X+,S-}
{$M 16384,8192,655360}
uses
 Crt, Dos, Objects, Drivers, Memory, Views, Menus,
 StdDlg, MsgBox, App, Video, Vga, Dialogs;

const
 cmFOpen = 1000;
 cmFSave = 1001;
 cmFSaveAs = 1002;
 cmExpMon = 2000;
 cmExpInteg = 2001;
 cmExpGrab = 2002;
 cmMrgCompare = 3000;
 cmMrgAdd = 3001;
 cmMrgSub = 3002;
 cmMrgMask = 3003;
 cmProEdge = 4000;
 cmProFilter = 4001;
 cmProHist = 4002;
 cmProMult = 4003;
 cmProInvert = 4004;
 cmProOffset = 4005;
 cmProThreshold = 4006;
 cmDisplay = 5000;
 cmOptVga = 6000;
 cmOptAutoD = 6001;
 cmOptPhotoS = 6002;

 VgaHiResTxt : TMenuStr ='~V~GA HiRes ';
 AutoDisplayTxt: TMenuStr ='~A~uto Display ';
 PhotoModeTxt :TMenuStr ='~P~hoto session ';
 OnTxt : string[4] =' On';

 OffTxt : string[4] ='Off';

type
 pHistoView = ^HistoView;
 HistoView = object(TView)
 histo : histtype;
 constructor Init(Bounds: TRect);
 procedure Draw; virtual;
 procedure Update(Picture : picptr);
 end;

 pHistoWindow = ^HistoWindow;

 HistoWindow = object(TWindow)
 HistoView: pHistoView;
 constructor Init;
 end;

 pCCDpgm = ^CCDpgm;
 CCDpgm = object(TApplication)
 CurPicture: PicPtr;
 CurFileName: PathStr;
 PictureDirty: boolean;
 HistoGram: pHistoWindow;
 procedure FileOpen(WildCard: PathStr);
 procedure FileSave;
 procedure FileSaveAs(WildCard: PathStr);
 procedure DisplayImage;
 procedure InitMenuBar; virtual;
 procedure HandleEvent(var Event: TEvent); virtual;
 procedure InitStatusLine; virtual;
 procedure SetMenuItem(Item: string; Value: boolean);
 procedure UpdateHistoGram;
 end;

var
 CCD: CCDpgm;

procedure GraphicsStart;
begin
 DoneSysError;
 DoneEvents;
 DoneVideo;
 DoneMemory;
end;

procedure GraphicsStop;
begin
 InitMemory;
 TextMode(3);
 InitVideo;
 InitEvents;
 InitSysError;
 Application^.Redraw;
end;

function TypeInDialog(var S: PathStr; Title:string):boolean;
var
 D: PDialog;

 Control: PView;
 R: TRect;
 Result:Word;
begin
 Result := cmCancel; { default if the dialog cannot be created }
 R.Assign(0, 0, 30, 7);
 D := New(PDialog, Init(R, Title));
 with D^ do
 begin
 Options := Options or ofCentered;
 R.Assign(5, 2, 25, 3);
 Control := New(PInputLine, Init(R, sizeof(PathStr)-1));
 Insert(Control);
 R.Assign(3, 4, 15, 6);
 Insert(New(PButton, Init(R, 'O~K~', cmOk, bfDefault)));
 Inc(R.A.X, 12); Inc(R.B.X, 12);
 Insert(New(PButton, Init(R, 'Cancel', cmCancel, bfNormal)));
 SelectNext(False);
 end;
 D := PDialog(Application^.ValidView(D));
 if D <> nil then
 begin
 Result := DeskTop^.ExecView(D);
 if (Result <> cmCancel) then D^.GetData(S);
 Dispose(D, Done);
 end;
 TypeInDialog := Result <> cmCancel;
end;

constructor HistoWindow.Init;
var
 R:TRect;
begin
 R.Assign(0, 0, 68,21);
 TWindow.Init(R, 'Histogram', 0);
 Palette := wpCyanWindow;
 GetExtent(R);
 Flags := Flags and not (wfZoom + wfGrow); { Not resizeable }
 GrowMode := 0;
 R.Grow(-1, -1);
 HistoView := New(pHistoView, Init(R));
 Insert(HistoView);
end;

constructor HistoView.Init(Bounds: TRect);
begin
 TView.Init(Bounds);
 Update(CCD.CurPicture);
end;

procedure HistoView.Update(Picture : picptr);
begin
 Histogram(Picture,histo);
 DrawView;
end;

procedure HistoView.Draw;
const
 barchar = $DB; { display char for bar }
 halfbar = $DC; { half length bar }

 maxbar = 16; { length of longest bar }

var
 x,y : Integer;
 binID : Integer;
 maxval : Word; { the largest bin value }
 maxval1 : Word; { the next largest bin }
 barbase : Word; { bottom of bar }
 barmid : Word; { middle of bar }
 barstep : Word; { height of steps }
 halfstep : Word; { half of barstep }
 barctr : Integer; { character within bar }

begin
 TView.Draw;
 maxval := 1; { find largest value }
 maxval1 := maxval;
 binID := 0;
 for binID := 0 to maxbit do
 begin
 if histo[binID] > maxval then
 begin { new all-time high? }
 maxval1 := maxval; { save previous high }
 maxval := histo[binID]; { set new high }
 end
 else if histo[binID] > maxval1 then { 2nd highest? }
 maxval1 := histo[binID];
 end;

 barstep := maxval1 div maxbar; { steps between lines }
 halfstep := barstep div 2; { half of one step }
 y := 0;

 for barctr := maxbar downto 1 do
 begin { down bars }
 barbase := Trunc(barstep * barctr);
 barmid := barbase + halfstep;
 x := 1;
 for binID := 0 TO maxbit do { for each bin }
 begin
 if histo[binID] > barmid then
 WriteChar(x,y,Chr(barchar),7,1)
 else if histo[binID] > barbase then
 WriteChar(x,y,Chr(halfbar),7,1)
 else WriteChar(x,y,'_',7,1);
 x := succ(x);
 end;
 y := succ(y); { new line }
 end;

 for binID := 0 to maxbit do { fill in bottom }
 if histo[binID] > halfstep then
 WriteChar(binID+1,y,Chr(barchar),7,1)
 else if histo[binID] > 0 then
 WriteChar(binID+1,y,Chr(halfbar),7,1)
 else WriteChar(binID+1,y,'_',7,1);

 y := succ(y);
 x := 1;

 WriteStr(x,y, '0         1         2         3         ' +
 '4         5         6   ',7);
 y :=succ(y);
 WriteStr(x,y,'0123456789012345678901234567890123456789' +
 '012345678901234567890123',7);
end;

procedure CCDpgm.InitMenuBar;
var
 R: TRect;
begin
 GetExtent(R);
 R.B.Y := R.A.Y+1;
 MenuBar := New(PMenuBar, Init(R, NewMenu(
 NewSubMenu('~F~ile', 0, NewMenu(
 NewItem('~O~pen ...', 'F3', kbF3, cmFOpen, 0,
 NewItem('~S~ave', 'F2', kbF2, cmFSave, 0,
 NewItem('Save ~A~s ...', '', kbNoKey, cmFSaveAs, 0,
 NewItem('E~x~it', 'Alt-X', kbAltX, cmQuit, 0, nil))))),
 NewSubMenu('~E~xpose', 0, NewMenu(
 NewItem('~M~onitor','F9', kbF9, cmExpMon, 0,
 NewItem('~I~ntegrated Exposure ...', 'F10', kbF10, cmExpInteg, 0,
 NewItem('~G~rab', 'Shift-F9', kbShiftF9, cmExpGrab, 0,nil)))),
 NewSubMenu('~M~erge', 0, NewMenu(
 NewItem('~C~ompare Images ...','', kbNoKey, cmMrgCompare, 0,
 NewItem('~A~dd Images ...', '', kbNoKey, cmMrgAdd, 0,
 NewItem('~S~ubtract Images ...', '', kbNoKey, cmMrgSub, 0,
 NewItem('~M~ask Images ...', '', kbNoKey, cmMrgMask, 0,nil))))),
 NewSubMenu('~P~rocess', 0, NewMenu(
 NewItem('~E~dge Enhance','', kbNoKey, cmProEdge, 0,
 NewItem('~F~ilter', '', kbNoKey, cmProFilter, 0,
 NewItem('~H~istogram', '', kbNoKey, cmProHist, 0,
 NewItem('~M~ultiply ...', '', kbNoKey, cmProMult, 0,
 NewItem('~I~nvert', '', kbNoKey, cmProInvert, 0,
 NewItem('~O~ffset', '', kbNoKey, cmProOffset, 0,
 NewItem('~T~hreshold ...', '', kbNoKey, cmProThreshold, 0,nil)))))))),
 NewItem('~D~isplay', '', kbShiftF10, cmDisplay, 0,
 NewSubMenu('~O~ptions', 0, NewMenu(
 NewItem(VgaHiResTxt,'', kbNoKey, cmOptVga, 0,
 NewItem(AutoDisplayTxt, '', kbNoKey, cmOptAutoD, 0,
 NewItem(PhotoModeTxt, '', kbNoKey, cmOptPhotoS, 0,nil)))),
 nil)))))))));
end;

procedure CCDpgm.InitStatusLine;
var
 R: TRect;
begin
 GetExtent(R);
 R.A.Y := R.B.Y - 1;
 StatusLine := New(PStatusLine, Init(R,
 NewStatusDef(0, $FFFF,
 NewStatusKey('~F10~ Expose', kbF10, cmExpInteg,
 NewStatusKey('~F9~ Monitor', kbF9, cmExpMon,
 NewStatusKey('~ShiftF9~ Grab', kbShiftF9,cmExpGrab,
 NewStatusKey('~F3~ Open', kbF3, cmFOpen,
 NewStatusKey('~F2~ Save', kbF2, cmFSave,
 NewStatusKey('~AltX~ Exit', kbAltX, cmQuit,
 NewStatusKey('~ShiftF10~ Display', kbShiftF10, cmDisplay, nil))))))), nil)));

end;

procedure CCDpgm.FileSaveAs(WildCard: PathStr);
var
 D: PFileDialog;
begin
 D := New(PFileDialog, Init(WildCard, 'Save as',
 '~N~ame', fdOkButton + fdHelpButton, 100));
 D^.HelpCtx := 0;
 if ValidView(D) <> nil then
 begin
 if Desktop^.ExecView(D) <> cmCancel then
 begin
 D^.GetFileName(CurFileName);
 FileSave;
 end;
 Dispose(D, Done);
 end;
end;

procedure CCDpgm.FileSave;
begin
 if CurFileName[0] = chr(0) then
 FileSaveAs('*.CCD')
 else
 begin
 if SavePicture(CurFileName,CurPicture) <> 0 then
 MessageBox('Can''t Save File!', nil, mfError + mfOkButton);
 end;
end;

procedure CCDpgm.FileOpen(WildCard: PathStr);
var
 D: PFileDialog;
 wkPic: PicPtr;
begin
 D := New(PFileDialog, Init(WildCard, 'Open a File',
 '~N~ame', fdOpenButton + fdHelpButton, 100));
 D^.HelpCtx := 0;
 if ValidView(D) <> nil then
 begin
 if Desktop^.ExecView(D) <> cmCancel then
 begin
 D^.GetFileName(CurFileName);
 PicSetup(CurPicture);
 if LoadPicture(CurFileName,CurPicture) <> 0 then
 MessageBox('Error Loading File!', nil, mfError + mfOkButton)
 end;
 Dispose(D, Done);
 end;
end;

procedure CCDpgm.DisplayImage;
begin
 GraphicsStart;
 Display_Image(CurPicture);
 ReadKey;
 GraphicsStop;
end;


procedure CCDpgm.SetMenuItem(Item: string; Value: boolean);
var
 mText : TMenuStr;

function SearchItem(pI : PMenuItem): boolean;
begin
 if pI = NIL then
 SearchItem := true
 else if Pos(mText,pI^.Name^) <> 0 then
 begin
 SearchItem := false;
 if Value then
 pI^.Name^ := Concat(mText,OnTxt)
 else
 pI^.Name^ := Concat(mText,OffTxt)
 end
 else
 SearchItem := SearchItem(pI^.Next);
end;

var
 pI: PMenuItem;
begin
 mText := Copy(Item,1,Length(Item)-3);
 pI := MenuBar^.Menu^.Items;
 while pI <> NIL DO
 begin
 if pI^.SubMenu <> NIL then
 if not SearchItem(pI^.SubMenu^.Items) then
 pI := Nil
 else
 pI := pI^.Next
 else
 pI := pI^.Next;
 end;
end;

procedure NotImplemented;
begin
 MessageBox('This command has not been implemented yet!', nil,
 mfError + mfOkButton);
end;

procedure CCDpgm.UpdateHistoGram;
begin
 if (HistoGram <> NIL) and (CurPicture <> NIL) then
 begin
 HistoGram^.HistoView^.Update(CurPicture)
 end;
end;

procedure CCDpgm.HandleEvent(var Event: TEvent);
var
 wkStr: PathStr;
 wkI,Result: integer;
 DoAutoDisplay: boolean;
 wkPicture: PicPtr;
 resPicture: PicPtr;
begin

 DoAutoDisplay := false;
 TApplication.HandleEvent(Event);
 case Event.What of
 evCommand:
 begin
 case Event.Command of
 cmFOpen: begin
 FileOpen('*.CCD');
 UpdateHistoGram;
 DoAutoDisplay := true;
 end;
 cmFSave: FileSave;
 cmFSaveAs: FileSaveAs('*.CCD');
 cmExpMon: begin
 GraphicsStart;
 if not Continuous(CurPicture) then
 begin
 GraphicsStop;
 MessageBox('Camera not responding!', nil, mfError + mfOkButton);
 if CurPicture <> NIL then
 begin
 dispose(CurPicture);
 CurPicture := NIL;
 end;
 end
 else
 GraphicsStop;
 end;
 cmExpInteg: NotImplemented;
 cmExpGrab: begin
 PicSetup(CurPicture);
 SetSyncs(CurPicture);
 if Capture then
 Scan(CurPicture)
 else
 MessageBox('Camera not responding!', nil, mfError + mfOkButton);
 end;
 cmMrgCompare: if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else
 begin
 WkPicture := CurPicture;
 CurPicture := NIL;
 FileOpen('*.CCD');
 if CurPicture <> NIL then { NIL if the file dialog was cancelled }
 begin
 Compare(WkPicture,CurPicture);
 Dispose(CurPicture);
 end;
 CurPicture:= WkPicture;
 UpdateHistoGram;
 DoAutoDisplay := true;
 end;
 cmMrgAdd: if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else
 begin
 WkPicture := CurPicture;
 CurPicture := NIL;
 FileOpen('*.CCD');
 if CurPicture <> NIL then { NIL if the file dialog was cancelled }
 begin
 Add(WkPicture,CurPicture);
 Dispose(CurPicture);
 end;

 CurPicture:= WkPicture;
 UpdateHistoGram;
 DoAutoDisplay := true;
 end;
 cmMrgSub: if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else
 begin
 WkPicture := CurPicture;
 CurPicture := NIL;
 FileOpen('*.CCD');
 if CurPicture <> NIL then { NIL if the file dialog was cancelled }
 begin
 Subtract(WkPicture,CurPicture);
 Dispose(CurPicture);
 end;
 CurPicture:= WkPicture;
 UpdateHistoGram;
 DoAutoDisplay := true;
 end;
 cmMrgMask: if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else
 begin
 WkPicture := CurPicture;
 CurPicture := NIL;
 FileOpen('*.CCD');
 if CurPicture <> NIL then { NIL if the file dialog was cancelled }
 begin
 Mask(WkPicture,CurPicture);
 Dispose(CurPicture);
 end;
 CurPicture:= WkPicture;
 UpdateHistoGram;
 DoAutoDisplay := true;
 end;
 cmProEdge: begin
 if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else
 begin
 wkPicture:= NIL; { get output array }
 PicSetup(wkPicture);
 SetSyncs(wkPicture);
 Edge(CurPicture,wkPicture);
 Dispose(CurPicture);
 CurPicture:= wkPicture;
 UpdateHistoGram;
 DoAutoDisplay := true;
 end;
 end;
 cmProFilter: begin
 if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else
 begin
 wkPicture := NIL;
 PicSetup(wkPicture);
 SetSyncs(wkPicture);
 Filter1(CurPicture,wkPicture);
 Dispose(CurPicture);
 CurPicture := wkPicture;
 UpdateHistoGram;
 DoAutoDisplay := true;
 end;

 end;
 cmProHist: begin
 if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else
 begin
 HistoGram := new(pHistoWindow,Init);
 Desktop^.Insert(ValidView(HistoGram));
 end
 end;
 cmProMult: if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else
 begin
 if TypeInDialog(wkStr,'Enter Mult Factor') then
 begin
 Val(wkStr,wkI,Result);
 if Result = 0 then
 Multiply(CurPicture,wkI);
 DoAutoDisplay := true;
 UpdateHistoGram;
 end;
 end;
 cmProInvert: begin
 if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else
 begin
 Invert(CurPicture);
 DoAutoDisplay := true;
 UpdateHistoGram;
 end;
 end;
 cmProOffset: if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else if TypeInDialog(wkStr,'Enter Offset') then
 begin
 Val(wkStr,wkI,Result);
 if Result = 0 then
 begin
 if (wkI<0) then
 begin
 wkI:= abs(wkI);
 Negoffset(CurPicture,wkI);
 end
 else
 Offset(CurPicture,wkI);
 DoAutoDisplay := true;
 UpdateHistoGram;
 end;
 end;
 cmProThreshold: if (CurPicture = NIL) then
 MessageBox('No picture!', nil, mfError + mfOkButton)
 else if TypeInDialog(wkStr,'Enter Threshold') then
 begin
 Val(wkStr,wkI,Result);
 if Result = 0 then
 Threshold(CurPicture,wkI);
 DoAutoDisplay := true;

 UpdateHistoGram;
 end;
 cmDisplay: DisplayImage;
 cmOptVga: begin
 VGAhiRes := not VGAhiRes;
 SetMenuItem(VgaHiResTxt,VGAhiRes);
 end;
 cmOptAutoD: begin
 AutoDisplay := not AutoDisplay;
 SetMenuItem(AutoDisplayTxt,AutoDisplay);
 end;
 cmOptPhotoS: begin
 PhotoMode := not PhotoMode;
 SetMenuItem(PhotoModeTxt,PhotoMode);
 end;
 else
 Exit;
 end;
 ClearEvent(Event);
 if DoAutoDisplay and AutoDisplay then
 DisplayImage;
 end;
 end;
end;

begin
 CCD.Init;
 CCD.CurPicture := NIL;
 CCD.CurFileName := '';
 CCD.SetMenuItem(VgaHiResTxt,False);
 CCD.SetMenuItem(AutoDisplayTxt,False);
 CCD.SetMenuItem(PhotoModeTxt,False);
 VGAhiRes := FALSE;
 AutoDisplay := FALSE;
 PhotoMode := FALSE;
 CCD.Run;
 CCD.Done;
end.





[LISTING THREE]

unit Vga;
{*******************************************************}

interface
USES Video, DOS, CRT;
var
 VGAhiRes: boolean;
 AutoDisplay: boolean;
 PhotoMode: boolean;

Procedure Display_Image(pic1: PicPtr);
function Continuous(var pic1: PicPtr): boolean;

implementation


{--- Sets the VGA display planes }
Procedure Set_Plane (plane : byte);

var old : byte;

begin
 Port[$01CE] := $0B2; { plane select mask }
 old := (Port[$01CF] and $0E1); { get the old plane value }
 Port[$01CE] := $0B2; { plane select mask }
 Port[$01CF] := ((plane shl 1) or old); { new plane register value }

end;

procedure DisplayInVgaMode(pic1: PicPtr);
begin
(*
 col := 32;
 for row := 0 to 200 do
 begin
 Move(pic1^.fmt.lines[row].pels[0],MEM[$A000:col],256);
 col := col + 320;
 end;
*)
 asm
 push ds
 lds si,pic1
 inc si (*Sync1*)
 mov bx,201
 mov ax,0A000H
 mov es,ax
 mov di,32
 cld
@LineLoop: inc si (*SyncL*)
 mov cx,128
 rep movsw
 add di,320-256
 dec bx
 jne @LineLoop
 pop ds
 end;
end;

{--- Show picture on VGA in 320x200x256 or }
{ 640x400x256 color mode }
Procedure Display_Image(pic1: PicPtr);

var
 r : registers; { BIOS interface regs }
 row,col : INTEGER; { Screen coordinates }
 Vmode : char;
 shade : byte;
 mode, i : integer;
 plane : byte;

const
 VideoInt : byte = $10;
 Set_DAC_Reg : integer = $1010;


begin
 if VGAhiRes then
 begin
 r.AX := ($00 SHL 8) OR $61;
 Intr(VideoInt,r); { set 640x400x256 color mode}
 mode := 1;
 end
 else
 begin
 r.AX := ($00 SHL 8) OR $13;
 Intr(VideoInt,r); { set 320x200x256 color mode}
 mode := 0;
 end;
 for shade := 0 to 63 do
 begin
 r.ax := Set_DAC_Reg;
 r.bx := shade;
 r.ch := shade;
 r.cl := shade;
 r.dh := shade;
 INTR(VideoInt,r);
 end;
 if mode = 0 then
 begin
 DisplayInVgaMode(pic1);
 end
 else
 begin
 for row := 0 to 102 do
 begin
 col := row * 640;
 Move(pic1^.fmt.lines[row].pels[0],MEM[$A000:col],256);
 end;
 plane := 1;
 Set_Plane ( plane );
 for row := 103 to 204 do
 begin
 col := (row - 103) * 640 + 384;
 Move(pic1^.fmt.lines[row].pels[0],MEM[$A000:col],256);
 end;
 plane := 2;
 Set_Plane ( plane );
 for row := 205 to 240 do
 begin
 col := (row - 205) * 640 + 128;
 Move(pic1^.fmt.lines[row].pels[0],MEM[$A000:col],256);
 end;
 end;
end;

function Continuous(var pic1: PicPtr): boolean;
var
 r : registers; { BIOS interface regs }
 row,col : INTEGER; { Screen coordinates }
 Vmode : char;
 shade : byte;
 cont : boolean;
CONST
 VideoInt : byte = $10;

 Set_DAC_Reg : integer = $1010;

begin
 PicSetup(pic1); { set up even picture array }
 SetSyncs(pic1);

 r.AX := ($00 SHL 8) OR $13;
 Intr(VideoInt,r); { set 320x200x256 color mode }

 FOR shade := 0 to 63 do begin { set VGA to gray scale }
 r.ax := Set_DAC_Reg;
 r.bx := shade;
 r.ch := shade;
 r.cl := shade;
 r.dh := shade;
 INTR(VideoInt,r);
 End;
 repeat
 if capture then
 begin
 scan(pic1);
 DisplayInVgaMode(pic1);
 Cont := true;
 end
 else
 Cont := false;
 until not Cont or KeyPressed;
 Continuous := Cont;
END;
end.





June, 1991
 EFFICIENTLY RAISING MATRICES TO AN INTEGER POWER


Avoiding redundant computations




Victor J. Duvanenko


Victor is a member of the technical staff at TrueVision, where he develops
videographics products for PCs. He can be reached at 7340 Shadeland Station,
Indianapolis, IN 46256 or by e-mail, victor@truevision.com.


Once in a great while, a matrix, a polynomial, or something else complicated
must be raised to an integer power. Let's say the value of the power is N.
Let's also give a name to the item that needs to be raised to a power--call it
M. A straightforward way to do this would be to perform (N-1) matrix
multiplications by M (that is, M*M*M*...*M*M). It is possible, however, to get
the same answer in at most 2log[2]N matrix multiplications.
If you parenthesize pairs of Ms, you can see that much of the work is
redundant (that is, (M*M)*(M*M)*(M*M)*...*(M*M)). In other words, the (M*M)
term could be computed once and then used repeatedly. It also follows that
((M*M)*(M*M)) terms, or pairs of pairs, could be computed once and then used.
The point is that there is much redundant work present when straightforward
multiplication is used.
For the sake of argument, let's assume that N = 16. Working backwards, we get
the results in Example 1(a). In other words, the answer can be obtained by
squaring M (a matrix multiplication), then squaring the result, squaring the
result again, and squaring it once more. This requires only 4 matrix
multiplications instead of the 15 needed in the straightforward method. It is
easy to see that the growth rate of the squaring method is logarithmic: To
compute M{32} requires 5 matrix multiplies (squarings), while computing M{64}
requires 6 matrix multiplies. Therefore, the larger the power, the higher the
computational savings.
The example above worked out nicely because N was a power of 2. A bit more
work is needed when N is not a power of 2 (that is, N = 5 or 7 or 9). These
cases are simple to handle once you realize that N can be expressed as a sum
of numbers that are all powers of 2. For example, a five is equal to a one
plus a four, where one and four are numbers that are powers of 2 (5 = 2{0} +
2{2}). The same holds for seven (7 = 2{0} + 2{1} + 2{2}) and nine (9 = 2{0} +
2{3}). The equations in Example 1(b) show this principle applied to matrix
squaring.
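
Example 1(a) and Example 1(b) are not reproduced in this text. As a reconstruction from the surrounding prose (an assumption, not the author's original figure), the decompositions described above can be written as:

```latex
\begin{aligned}
M^{16} &= \Bigl(\bigl((M^{2})^{2}\bigr)^{2}\Bigr)^{2} && \text{(4 squarings)}\\
M^{5}  &= M^{1} \cdot M^{4} \\
M^{7}  &= M^{1} \cdot M^{2} \cdot M^{4} \\
M^{9}  &= M^{1} \cdot M^{8}
\end{aligned}
```

Each factor on the right is M raised to a power of 2, so every factor can be produced by repeated squaring.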
So, when N is not a power of 2, the answer can be obtained by multiplying
together several terms, each of which is M raised to a power of 2. This
procedure needs to be performed systematically, in the form of a program.
If you consider N as a binary value, it is easy to see that the bit positions
with a 1 in them indicate the presence of the corresponding power of 2, and 0s
indicate the absence of that power of 2. For example, a nine can be
represented in binary as 1001, which states that 2{0} and 2{3} are present,
and 2{1} and 2{2} are not. Therefore, it is possible to look at the bit
pattern of N and determine when multiplications need to be done.
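
The bit test described above can be sketched in C. The function name power_of_two_terms and its parameters are illustrative assumptions and do not appear in the article's listing:

```c
/* Store in exps[] each exponent k for which bit k of n is set,
   lowest bit first. Returns how many exponents were stored
   (never more than max). */
int power_of_two_terms(unsigned n, int exps[], int max)
{
    int k = 0;       /* current bit position */
    int count = 0;   /* exponents stored so far */

    while (n != 0 && count < max) {
        if (n & 1u)              /* 2^k is present in n */
            exps[count++] = k;
        n >>= 1;                 /* move on to the next bit */
        k++;
    }
    return count;
}
```

For N = 9 (binary 1001) this stores the exponents 0 and 3, which correspond to the terms M{1} and M{8}.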
The procedure of raising a matrix to an integer power is then as follows:
1. Initialize the intermediate result matrix to an identity matrix (a 1 in
matrix world). Determine the number of bits needed to represent N by
computing log[2](N+1) and rounding up. This bounds the number of times
squaring must be done, because it locates the highest bit of N set to 1.
2. If bit 0 of N is set, move M into the intermediate result matrix.
3. Square M, and place it back in M. If bit 1 of N is set, multiply the
intermediate result matrix by M (which is M{2}).
4. Square M again, and place it back in M. If bit 2 of N is set, multiply the
intermediate result matrix by M (which is M{4}). Continue until you reach the
highest bit of N set to 1 (at most, log[2] (N + 1) times).
5. The intermediate result matrix holds the answer.
This algorithm is easily implemented. The algorithm runs in about 2log[2]N
time, because it takes about log[2]N squarings (each squaring being a matrix
multiplication), plus at most log[2]N multiplications with the intermediate
result (as N can have at most log[2]N bits set to 1).
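
As a hedged sketch, the five steps might look like this for 2 x 2 matrices. The names mat2_mul and mat2_pow are hypothetical. The author's own version is fibonacci_log in Listing One, which sizes its loop with log[2](N+1), whereas this sketch simply shifts N right until it reaches zero:

```c
#include <string.h>

/* Hypothetical helper: c = a * b for 2 x 2 matrices. */
static void mat2_mul(double a[2][2], double b[2][2], double c[2][2])
{
    int i, j, k;
    for (i = 0; i < 2; i++)
        for (j = 0; j < 2; j++) {
            double sum = 0.0;
            for (k = 0; k < 2; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}

/* Raise m to the power n, scanning the bits of n from bit 0 upward.
   The answer is left in result; m itself is not modified.
   Returns the number of matrix multiplications performed. */
int mat2_pow(double m[2][2], unsigned n, double result[2][2])
{
    double base[2][2], tmp[2][2];
    int result_is_identity = 1;
    int mults = 0;

    memcpy(base, m, sizeof base);
    result[0][0] = 1.0; result[0][1] = 0.0;   /* step 1: identity */
    result[1][0] = 0.0; result[1][1] = 1.0;

    while (n != 0) {
        if (n & 1u) {                         /* this power of 2 is in n */
            if (result_is_identity) {
                memcpy(result, base, sizeof base);  /* copy, no multiply */
                result_is_identity = 0;
            } else {
                mat2_mul(result, base, tmp);
                memcpy(result, tmp, sizeof tmp);
                mults++;
            }
        }
        n >>= 1;
        if (n != 0) {                         /* square only if bits remain */
            mat2_mul(base, base, tmp);
            memcpy(base, tmp, sizeof tmp);
            mults++;
        }
    }
    return mults;
}
```

With this sketch, raising a matrix to the 16th power reports 4 multiplications, matching the count given earlier for N = 16, and raising it to the 10th power also takes 4 (three squarings plus one multiplication into the result).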


Application


One application of this method is shown in Computer Algorithms: Introduction
to Design and Analysis, which describes a method of computing Fibonacci
numbers using matrix multiplication. Fibonacci numbers are defined as in
Example 1(c). A lesser-known matrix Fibonacci method is shown in Example 1(d).
In other words, any Fibonacci number can be computed by first raising the
matrix M to the appropriate power and then multiplying it by the (1,0) matrix.
Because a matrix can be raised to a power in 2log[2]N time, the algorithm to
compute a Fibonacci number runs in order log[2]N time.
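
Example 1(c) and Example 1(d) are not reproduced in this text. The following is a reconstruction, assuming the usual definition and the matrix that Listing One initializes in fibonacci_log (rows 1,1 and 1,0, raised to the power n - 1):

```latex
F_0 = 0,\qquad F_1 = 1,\qquad F_n = F_{n-1} + F_{n-2}\quad (n \ge 2)

\begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix}
=
\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{\,n-1}
\begin{pmatrix} 1 \\ 0 \end{pmatrix},
\qquad n \ge 1
```

Multiplying by the (1,0) column simply selects the first column of the matrix power, which is why fibonacci_log can return the top-left element directly.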
Three ways to compute Fibonacci numbers have been implemented and are shown in
Listing One. First shown in the fibonacci procedure is the
straightforward method that simply remembers the last two Fibonacci numbers
and computes the next. This method takes (n - 1) additions to compute
fibonacci(n), which is linear running time. Double-precision numbers are used
because Fibonacci numbers grow very quickly and overflow even long
integers. The second method is concise, recursive, and inefficient (see
procedure fibonacci_recursive). Its running time grows exponentially (as
phi{n}, where phi is approximately 1.618). The third method is the
matrix-raised-to-a-power method described above (shown in procedure
fibonacci_log).
The main portion of the program allows you to compute Fibonacci numbers using
the three available methods, and to experience the running time differences.
To compare the linear and matrix methods, both were timed
on an 8-MHz PC/AT with a math coprocessor. Table 1 shows the running times to
compute a Fibonacci number 1000 times.
Table 1: Running times to compute a Fibonacci number 1000 times

 Fibonacci    Linear algorithm    Matrix algorithm
 number       (time in sec)       (time in sec)
 --------------------------------------------------
 1400         67.57               19.95
  700         33.56               17.63
  350         16.86               15.38
  175          8.52               14.18
   88          4.40               11.86
   44          2.20                9.73
   22          1.27                7.53

The recursive method was not timed because its running time grew so rapidly as
to make it impractical. For example, to compute Fibonacci(25) only once took
the recursive method over 30 seconds, whereas the other methods were
instantaneous. Table 1 shows that the running time of the linear algorithm is
indeed linear: each halving of n roughly halves the time. The running time of
the matrix algorithm grows much more slowly, and it overtakes the linear
algorithm somewhere between n = 175 and n = 350. Therefore, for
computing large Fibonacci numbers, the matrix method is fastest.
A word of caution: Fibonacci numbers grow very rapidly. In fact,
Fibonacci(1500) overflows the math coprocessor capabilities and causes a math
overflow error. One curious fact about Fibonacci numbers is that the ratio of
two successive Fibonacci numbers, F[n]/F[n-1], approaches a number near 1.618 as
n grows. This number, also called the Golden Ratio (phi), appears in many
places such as architecture and recursive function theory. For instance, the
number of recursive calls to compute F[n] (a Fibonacci number) is more than
phi{n}. Thus, computing Fibonacci numbers quickly may be useful for
determining an accurate value of phi to 10,000 decimal places.
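
The convergence of the ratio can be checked with a few lines of C. This is only a sketch in double precision, so it cannot approach the 10,000 decimal places mentioned above. The name fib_ratio is a hypothetical helper and is not part of Listing One:

```c
/* Return F[n] / F[n-1] for n >= 2, using the same running-pair
   technique as the linear fibonacci procedure. */
double fib_ratio(int n)
{
    double prev = 1.0;   /* F[1] */
    double cur = 1.0;    /* F[2] */
    int i;

    for (i = 2; i < n; i++) {
        double next = prev + cur;   /* F[i+1] */
        prev = cur;
        cur = next;
    }
    return cur / prev;
}
```

By n = 40 the ratio already agrees with phi to the limit of double precision, which illustrates how quickly successive ratios settle down.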


References



Baase, S. Computer Algorithms: Introduction to Design and Analysis. 2nd
edition. Reading, Mass.: Addison-Wesley, 1988.
Hill, F.S. Jr. Computer Graphics. New York, N.Y.: Macmillan, 1990.

_EFFICIENTLY RAISING MATRICES TO AN INTEGER POWER_
by Victor Duvanenko


[LISTING ONE]

#include <stdio.h>
#include <math.h>
#include <string.h>

#define N 2

/* Procedure to multiply two square matrices */
void matrix_mult( mat1, mat2, result, n )
double *mat1, *mat2; /* pointers to the matrices to be multiplied */
double *result; /* pointer to the result matrix */
int n; /* n x n matrices */
{
 register i, j, k;
 for( i = 0; i < n; i++ )
 for( j = 0; j < n; j++, result++ ) {
 *result = 0.0;
 for( k = 0; k < n; k++ )
 *result += *( mat1 + i * n + k ) * *( mat2 + k * n + j );
 /* result[i][j] += mat1[i][k] * mat2[k][j]; */
 }
}
/* Procedure to copy square matrices quickly. Assumes the elements are
 double precision floating-point type. */
void matrix_copy( src, dest, n )
double *src, *dest;
int n;
{
 memcpy( (void *)dest, (void *)src, (unsigned)( n * n * sizeof( double )));
}
/* Procedure to compute Fibonacci numbers in linear time */
double fibonacci( n )
int n;
{
 register i;
 double fib_n, fib_n1, fib_n2;

 if ( n <= 0 ) return( 0.0 );
 if ( n == 1 ) return( 1.0 );
 fib_n2 = 0.0; /* initial value of Fibonacci( n - 2 ) */
 fib_n1 = 1.0; /* initial value of Fibonacci( n - 1 ) */
 for( i = 2; i <= n; i++ ) {
 fib_n = fib_n1 + fib_n2;
 fib_n2 = fib_n1;
 fib_n1 = fib_n;
 }
 return( fib_n );
}
/* Procedure to compute Fibonacci numbers recursively (inefficiently) */
double fibonacci_recursive( n )

int n;
{
 if ( n <= 0 ) return( 0.0 );
 if ( n == 1 ) return( 1.0 );
 return( (double)( fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)));
}
/* Procedure to compute Fibonacci numbers in logarithmic time (fast!) */
double fibonacci_log( n )
int n;
{
 register k, bit, num_bits;
 double a[2][2], b[2][2], c[2][2], d[2][2];
 if ( n <= 0 ) return( 0.0 );
 if ( n == 1 ) return( 1.0 );
 if ( n == 2 ) return( 1.0 );

 n--; /* need only a^(n-1) */
 a[0][0] = 1.0; a[0][1] = 1.0; /* initialize the Fibonacci matrix */
 a[1][0] = 1.0; a[1][1] = 0.0;
 c[0][0] = 1.0; c[0][1] = 0.0; /* initialize the result to identity */
 c[1][0] = 0.0; c[1][1] = 1.0;

 /* need to convert log bases as only log base e (or ln) is available */
 num_bits = ceil( log((double)( n + 1 )) / log( 2.0 ));

 /* Result will be in matrix 'c'. Result (c) == a if bit0 is 1. */
 bit = 1;
 if ( n & bit ) matrix_copy( a, c, N );
 for( bit <<= 1, k = 1; k < num_bits; k++ ) /* Do bit1 through num_bits. */
 {
 matrix_mult( a, a, b, N ); /* square matrix a; result in matrix b */
 if ( n & bit ) { /* adjust the result */
 matrix_mult( b, c, d, N );
 matrix_copy( d, c, N );
 }
 matrix_copy( b, a, N );
 bit <<= 1; /* next bit */
 }
 return( c[0][0] );
}
main()
{
 int n;
 for(;;) {
 printf( "Enter the Fibonacci number to compute ( 0 to exit ): " );
 scanf( "%d", &n );
 if ( n == 0 ) break;
 printf("\nMatrix method: Fibonacci( %d ) = %le\n",n,fibonacci_log(n));
 printf("\nLinear method: Fibonacci( %d ) = %le\n",n,fibonacci(n));
 printf("\nRecursive method: Fibonacci( %d ) = %le\n",
 n, fibonacci_recursive(n));
 }
 return(0);
}





June, 1991
PROGRAMMING PARADIGMS


The Cyberspace Amendment




Michael Swaine


The last time I had seen Jim Warren looking so enthusiastic, he was on roller
skates. That was in San Francisco years ago, during the heyday of the West
Coast Computer Faire, when the Faire still had that proper mixture of swap
meet, Renaissance pleasure faire, and revolution in the making. Jim also
started InfoWorld and was the first editor of Dr. Dobb's, but for years he was
most closely associated with the Faire, at which he was always the picture of
the laid-back workaholic, zipping through the aisles on wheels.
This time we were a few miles to the south, in Burlingame, California, on
March 26 of this year, at The First Conference on Computers, Freedom, and
Privacy. The conference was sponsored by the Computer Professionals for Social
Responsibility; cosponsored or supported by Apple Computer, Autodesk, Portal
Communications, the IEEE, and the ACM and various special interest groups of
each, the Electronic Networking Association, the Videotex Industry
Association, the Cato Institute, the Electronic Frontier Foundation, the
ACLU, and the WELL; attended by hackers, crackers, spooks, and citizens, as
well as members of the press from The Wall Street Journal to Dr. Dobb's
Journal; and chaired by Jim Warren.
"I just found out he's going to say more than we expected," Jim whispered to
me. "He's going to call for a Constitutional amendment."
The "he" was Laurence H. Tribe, Professor of Constitutional Law, Harvard Law
School, widely regarded as the leading authority on Constitutional law, a man
who has been described as having had more influence on the Supreme Court than
any other living person not actually on the Court. A few minutes before
delivering his keynote speech that morning, Tribe had told Jim that the speech
would propose a Constitutional amendment for the information age.
Tribe's appearance alone was enough to bring out the East Coast and national
press; this news would send them to the phones after the keynote. For Jim
Warren it was something like the early Faire days, when you had the sense that
you were in on the reshaping of society. For me, too, I guess. This column is
about computers, freedom, privacy, and Tribe's proposed amendment to the
Constitution. I should disclose my bias: I'm on Tribe's side.


The Constitution In Cyberspace


I wasn't the only listener Tribe got on his side that morning. He made points
with the technically savvy audience immediately by defining, and setting his
goal in terms of, cyberspace. He knows that the word "cyberspace" is sometimes
used in just the sense in which William Gibson, cyberpunk author and coiner of
the word, uses it: For fantasy worlds in which one's mind becomes a device for
exploring a global data web. But he also knows the other sense in which the
word is more and more often being used: For the full range of
computer-mediated human interactions, from the copper-path network of
traditional telephony and the current one-way version of television, to highly
tappable cellular phones and office LANs and electronic bulletin boards and
networked police cruiser-based mobile terminals, to the ultimate in
computer-mediated human interaction, Prodigy. (Sorry. My joke, not Tribe's.)
Tribe set as his topic "how to map the text and structure of our Constitution
onto the texture and topology of cyberspace," asking, "when the lines along
which our Constitution is drawn warp or vanish, what happens to the
Constitution itself?"
Although convinced that "[t]he Constitution's core values ... need not be
transmogrified ... in the dim recesses of cyberspace," Tribe cited evidence of
a clear and present danger that that is exactly what will happen.
First, there is the threat of computer crime, and the threat from those who
overreact to that threat. Computer crime is real: Electronic trespass has
grown in menace to the point of cracking NORAD; crackers download and publish
people's credit histories from TRW; pranksters set loose worm programs that
shut down thousands of linked computers.
In response to these and other threats, real or imagined, the Secret Service
raided Steve Jackson Games, seizing all drafts of its fantasy role-playing
game, calling it a "handbook for computer crime;" the Treasury Department tied
up a fourth of its investigators eavesdropping on electronic bulletin boards;
and last May, the government's Operation Sun Devil took on the teenagers'
Legion of Doom, seizing 42 computers and 23,000 disks in 14 cities. This
cracker war is weird enough. But there are many other, more subtle issues
involving the effect of technological developments on freedom, privacy, and
other Constitutional issues. The conference addressed many of them.
The Lotus Marketplace snafu was on everybody's mind. Lotus had only recently
pulled the product off the market in reaction to public concern over invasions
of privacy. But was the data involved actually private?
Then there was the phone number thing. The phone company has a service to sell
you: It shows you the telephone number of a caller before you answer the
phone. The more I heard about this (and between the newspaper articles, panel
discussions, and televised debates, there were plenty of opportunities to hear
about it), the less I understood why this particular technology had been
chosen to sell to the consumer. Whom does it really benefit? And does it
constitute an invasion of privacy?
And let us not forget the Rodney King case: It involved not one, but two new
technologies. While covering the video-taped beating of a motorist by members
of the Los Angeles Police Department, television stations across the country
quoted from a discussion among officers about the beating. The discussion took
place on a network of police cruiser-based mobile terminals. Was this an
invasion of privacy?
We can't assume that the courts and legislatures will deal wisely with such
questions. Tribe cited several cases of what he sees as errors of Congress and
the federal courts in dealing with technology, including the regulation of
radio and television broadcasting without adequate sensitivity to First
Amendment values, and a general treatment of electronically processed
information as though it were less entitled to Constitutional protection
because of that processing.
In short, the technologies that put food on our tables raise problems for the
Constitutional guarantees of freedom of speech, freedom of the press, freedom
of assembly, the right of privacy, the protection against unreasonable search
and seizure, and the security in one's property.
That's what was on Tribe's mind as he spoke that morning.


Getting Down to Cases


In that keynote address, Tribe gave two Supreme Court cases, Maryland vs.
Craig and Olmstead vs. United States, as examples of the ways in which the
Court has dealt with new technologies. The way the Court dealt with new
technologies in these cases, Tribe thinks, was in one case misguided and in
the other case plain wrong, and he explained why he thinks so.
In the recent case Maryland vs. Craig, the Supreme Court considered the impact
of one-way, closed-circuit television on the Sixth Amendment guarantee that
"In all criminal prosecutions, the accused shall enjoy the right ... to be
confronted with the witnesses against him." The accused in this case was an
alleged child abuser, and the witness an allegedly abused child. Balancing
benefits to the accuser and to society against costs to the accused, the Court
upheld, in a 5 to 4 decision, the power of a state to "confront" the accused
with a one-way, closed-circuit television version of his accuser. Tribe quoted
from Justice Scalia's dissent:
The Court has convincingly proved that the Maryland procedure serves a valid
interest, and gives the defendant virtually everything the Confrontation
Clause guarantees (everything, that is, except confrontation). I am persuaded,
therefore, that the Maryland procedure is virtually constitutional. Since it
is not, however, actually constitutional, I [dissent].
Tribe professed himself in complete agreement with Scalia; he apparently
believes that the Confrontation Clause is about more than just seeing what
your accuser looks like.
Back in 1928, in Olmstead vs. United States, the Supreme Court first
confronted a different new technology: the wiretap. The Court ruled that
Federal agents did not violate the Fourth Amendment's "right of the people to
be secure in their persons, houses, papers, and effects, against unreasonable
searches and seizures" when they tapped a phone without actually entering the
physical premises. The majority of the Court reasoned that the Amendment, by
enumerating physical things (persons, houses, papers, and effects),
demonstrated that it was only meant to apply to physical searches. Six years
later, Congress put back some of the protection that this decision removed,
when it passed the Federal Communications Act. But it wasn't until 1967 that
Olmstead vs. United States was overruled, in Katz vs. United States. In that
case, all participating justices except one agreed with the statement of
Justice Potter Stewart that "the Fourth Amendment protects people, not
places." That statement, which may have swayed the decisions of three
justices, was apparently supplied by Stewart's law clerk, Laurence H. Tribe.
How did the Court respond to the challenge of new technologies in Maryland vs.
Craig and Olmstead vs. United States? "Olmstead mindlessly read a new
technology out of the Constitution," according to Tribe, "while Craig
absent-mindedly read a new technology into the Constitution." Both decisions
undermined the protections of the Bill of Rights; Olmstead by reading the
Constitution as though the 18th-century authors had considered the question of
nonphysical searches and consciously decided to permit them; Craig by guessing
that when those authors guaranteed two-way physical confrontation, it was only
because they did not foresee the possibility of one-way electronic
confrontation.
"Although both Craig and Olmstead reveal an inadequate consciousness about how
new technologies interact with old values," Tribe concluded, "Craig at least
seems defensible even if misguided, while Olmstead seems just plain wrong."
In the press room after his talk, Tribe cited these same cases to explain why
he is proposing a Constitutional amendment rather than arguing cases before
the Court, something at which he has impressive experience.
The usual ways of bringing up these issues in trials have problems. Maryland
vs. Craig involved arguing for a right of an accused child abuser while
advocating putting a child through an arguably unnecessary ordeal. Olmstead
vs. United States involved arguing for a right of an accused racketeer while
suing the government over what could easily be seen as a technicality. In each
case, defending the Constitutional principle came down to attempting to
establish an abstract principle by using a case in which application of the
principle would pretty obviously hurt the innocent and aid the guilty.
If the issue is one of broad principle, Tribe advised, it is better to argue
it in a venue that is suited to arguing broad principles. He doesn't
necessarily expect the amendment to pass; but he believes that arguing it in
this way may bring the best possible thinking to bear on the issues.


Tribal Wisdom


What's the secret to avoiding the errors of the past in dealing with
technology? Tribe argues that the goal must be to remain true to the values
represented in the Constitution. Fidelity to the values requires flexibility
in textual interpretation, he says. And this interpretation should not merely
reflect the risks posed by technology, but should examine "how imposing those
risks comports with the Constitution's fundamental values of freedom, privacy,
and equality."
As to the argument itself, Tribe delivered it in classical form in the
keynote, presenting axioms for the information age.
For example, axiom 1: There is a vital difference between government and
private action.
As computer networks grow large, mediate their members' differences, create
rules of behavior, and straddle national boundaries, do they become, as
Professor Eli Noam argues, political entities? Since the Constitution
regulates governmental actions rather than the behavior of individuals or
groups, such an interpretation has been mooted as a justification for
regulating electronic bulletin boards and networks as though they were
government bodies. It's an approach that has been applied to such
quasi-governmental entities as large shopping malls and company towns. Is it
valid in the case of today's bulletin boards and networks?
No, says Tribe. A BBS is not a mall, but something more like a bookstore (and
not a publisher, either). So the government has no more business regulating
the content of a BBS than of a bookstore; but this doesn't make the BBS
operator liable for the content of messages any more than a bookstore owner is
liable for the books on the shelves.

Another Tribe axiom argues that Constitutional principles should not vary with
accidents of technology. This axiom most directly motivates the Tribe
amendment. Tribe calls it the cyberspace corollary. Here it is, the proposed
27th amendment to the Constitution of the United States:
This Constitution's protections for the freedoms of speech, press, petition,
and assembly, and its protections against unreasonable searches and seizures
and the deprivation of life, liberty, or property without due process of law,
shall be construed as fully applicable without regard to the technological
method or medium through which the information content is generated, stored,
altered, transmitted, or controlled.


June, 1991
C PROGRAMMING


D-Flat Continued


This article contains the following executables: DFLAT3.ARC ARCE.COM
Note: Use ARCE.COM to extract .ARC files, including DFLAT3.ARC, on this disk.


Al Stevens


This is the second installment of D-Flat, the new "C Programming" column
project that I started last month. D-Flat is a C function library that
implements the SAA Common User Access interface design in an event-driven
programming model for MS-DOS text-mode applications. Last month we built the
hardware-dependent code that deals with the mouse, the keyboard, and the
screen. We also coded the compiler-specific stuff, the compile-time
conditional code that distinguishes Turbo C from Microsoft C. The code from
last month encapsulates most of the hardware and the compiler dependencies. If
you wanted to port D-Flat to a different compiler or computer, you would
modify that code. The code that appears from now on will be mostly independent
of the hardware and the compiler, although there is an occasional #ifdef MSC
to get around some compiler differences. This month we address the parts of
the library that manage the configuration of a D-Flat application, that manage
window classes, and that contain the low-level window drivers.
My approach to explaining D-Flat will use a tutorial format at first. As you
progress through the series in the months to come, I will explain the
different parts of the system as soon as the code being discussed uses them.
This is a bottom-up approach. I want to get the low-level stuff out of the way
so that we can concentrate on the implementation of the window classes and how
your applications programs will use them. There are a lot of functions and
macros in the API that will pass by without much in the way of explanation at
first. The dflat.h header file from last month has many such macros. As we use
them, I will explain them. The window.c file this month has code that supports
clipping, but we don't need to get into clipping until we get past the basics
of displaying and using windows.
When the series is completed, I will publish a complete programmer's reference
guide to the messages, functions, and macros that make up the D-Flat API. In
addition, I will publish a generic user's guide for D-Flat applications. You
will be able to use that guide for any documentation that accompanies your
application. Eventually these guides will get into the file of source code
that you can download from CompuServe or TelePath, as explained later in this
column.


Program Configuration


A D-Flat application program will be subject to some user-controlled
configuration items. The user might be able to specify colors, editor options,
and so on. The D-Flat library includes functions and data structures to
support maintenance of the configuration. Listing One, page 148, is config.h,
the header file that describes the configuration data structures. The first
entry is a #define for the DFLAT_APPLICATION global symbol. This string names
your application program and will be the name of the configuration file as
well. As shown in Listing One, the string is initialized to the "memopad"
string literal. The configuration file will, therefore, be MEMOPAD.CFG.
The colors structure in config.h describes the color configuration for each of
the D-Flat window classes. The values are in pairs consisting of foreground
and background colors for each class. Sometimes you will see two pairs for a
class, such as ButtonFG, ButtonBG, ButtonSelFG, and ButtonSelBG. The second
pair, which always has Sel in its identifier, specifies the colors for
highlighted data items in the window class. These items include menu bars,
listbox selectors, and marked blocks of text. Some window classes have
definitions for the window's frame colors. These identifiers include the value
Frame.
Some of the colors are for components of windows rather than for windows
themselves. The Title and InFocusTitle values are the colors for the titles of
all windows. The MenuBar values are for an application's menu bar. The
InactiveSelFG and ShortCutFG colors are for the shortcut letter values in menu
selections.
The CONFIG typedef in config.h defines the format for a configuration record.
If your D-Flat application needs more configuration items -- and most of them
will -- you add your custom configuration data objects to the CONFIG data
type. Initially, there are five fields. The mono field, when true, specifies
that the windows are to display with the monochrome colors regardless of the
user's video system. The InsertMode field specifies whether the text editor
for edit box windows starts out in insert or overwrite mode by default. The
Tabs field specifies the tab width for the text editor. The WordWrap field
specifies whether the editor wraps words at the right margin of multiple-line
edit box windows or scrolls horizontally until the user presses the Enter key.
The clr field contains the colors for the application.
Listing Two, page 148, is config.c, which contains the default initialized
values for the configuration record and functions to read and write a
configuration file. There are two sets of default colors, each an instance of
the colors structure. One is for color systems and one is for monochrome
systems. To change the defaults, you would change these structures. They are
initialized with the color values from the Turbo C conio.h header file. The
dflat.h header file, published last month, contains the same values for
Microsoft C users.
The cfg structure is the instance of the CONFIG data type that contains the
program's configuration values. If the application does not use custom
configuration values, the defaults apply.
The LoadConfig and SaveConfig functions load and save the current
configuration values from and to a file that is named according to the
DFLAT_APPLICATION macro.
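The published LoadConfig does not test the result of fread, so a short or damaged file could leave the record partly overwritten. What follows is a minimal sketch of a more defensive round trip, not the D-Flat code itself. The names APP_NAME, DEMO_CONFIG, DemoSaveConfig, and DemoLoadConfig are hypothetical stand-ins, and the record is reduced to four fields for brevity.

```c
#include <assert.h>
#include <stdio.h>

#define APP_NAME "memopad"   /* stands in for DFLAT_APPLICATION */

/* Hypothetical, reduced stand-in for the CONFIG record (no colors). */
typedef struct {
    char mono;
    int  InsertMode;
    int  Tabs;
    int  WordWrap;
} DEMO_CONFIG;

/* Write the record to APP_NAME ".cfg". Adjacent string literals are
   joined by the compiler, so the file name is "memopad.cfg".
   Returns 1 on success, 0 on failure. */
int DemoSaveConfig(const DEMO_CONFIG *cfg)
{
    FILE *fp;
    int ok = 0;

    assert(cfg != NULL);
    fp = fopen(APP_NAME ".cfg", "wb");
    if (fp != NULL) {
        ok = fwrite(cfg, sizeof *cfg, 1, fp) == 1;
        if (fclose(fp) != 0)
            ok = 0;
    }
    return ok;
}

/* Read the record back. The caller's copy is replaced only when one
   complete record was read, so the defaults survive a bad file.
   Returns 1 on success, 0 on failure. */
int DemoLoadConfig(DEMO_CONFIG *cfg)
{
    DEMO_CONFIG tmp;
    FILE *fp;
    int ok = 0;

    assert(cfg != NULL);
    fp = fopen(APP_NAME ".cfg", "rb");
    if (fp != NULL) {
        ok = fread(&tmp, sizeof tmp, 1, fp) == 1;
        fclose(fp);
        if (ok)
            *cfg = tmp;
    }
    return ok;
}
```

Reading into a temporary record costs one structure copy, but it keeps the in-memory defaults intact when the file is missing or truncated.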


Window Classes


D-Flat works with windows in a hierarchy of window classes. This model is
similar to the one used by the Microsoft Windows programming platform. When
you create a window, you specify its class, and D-Flat therefore knows how to
deal with the window. The class identifies the window's colors and certain
processing characteristics. An edit box window behaves differently than a list
box window, for example. Listing Three, page 148, is classdef.h, the header
file that defines the CLASSDEFS data type, which contains the description of a
window class. The classdefs array of CLASSDEFS data types contains an entry
for each class. Each entry specifies the window class that it describes, the
base class -- if any -- from which it is derived, the window class's colors,
the address of a window processing function to which all messages for the
window are sent, and a window attribute value.
The window class values are defined in the dflat.h header file, which I
published last month. If you want to add window classes to the hierarchy, you
add an entry to the enum window_class in dflat.h and an entry to the classdefs
array in Listing Four, page 148, classdef.c. This technique is different from
the way that Windows programmers do it. They put runtime code into their
programs to register new classes. A D-Flat programmer uses the C compiler to
register classes by coding the new classes into the class tables.
Each entry in the classdefs array defines the colors associated with the
class. There are three sets of colors -- one for the window itself, one for
highlighted selections in the window, and one for the window's frame. Observe
that the color entries are pointers to color values. This allows a program to
change the color configuration without changing this table. If a color entry
is a NULL pointer, which is valid for highlighted colors, the value assumes
the color of the window itself.
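The pointer indirection and the NULL fallback can be sketched as follows. This is an illustration under assumptions, not the library's code: demo_colors, demo_clr, demo_class, and SelectionFG are hypothetical names, and the table is cut down to the window and selection colors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the configurable colors. */
struct demo_colors { char TextFG, TextBG, TextSelFG, TextSelBG; };
struct demo_colors demo_clr = { 0, 7, 7, 0 };

/* A class entry stores the addresses of colors, not the colors
   themselves, so changing demo_clr changes what the class displays. */
struct demo_class {
    char *fg, *bg;      /* window colors */
    char *sfg, *sbg;    /* highlighted-selection colors; may be NULL */
};

/* Return the selection foreground color, falling back to the
   window's own foreground when the entry is a NULL pointer. */
char SelectionFG(const struct demo_class *c)
{
    assert(c != NULL && c->fg != NULL);
    return c->sfg != NULL ? *c->sfg : *c->fg;
}
```

Because the table holds addresses, a program that rewrites the configuration record at run time changes the colors of every class that points into it, without touching the table.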
Each class has an associated window processing function, and the address of
that function appears in the classdefs array entry for the class. When
something sends a message to a window, the window processing function assigned
to the window's class executes first. That function will then either process
the message and return or call the function of the class from which the
original class is derived.
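Here is a small sketch of that chaining, assuming a three-class hierarchy. All the names (demo_class, demo_msg, defs, BaseProc, DemoSendMessage, and the three procs) are hypothetical, and the integer return values exist only to show which class handled the message.

```c
#include <assert.h>
#include <stddef.h>

enum demo_class { D_NORMAL, D_TEXTBOX, D_LISTBOX, D_NONE = -1 };
enum demo_msg   { D_PAINT, D_KEYBOARD, D_SCROLL };

typedef int (*DEMO_PROC)(enum demo_class, enum demo_msg);

static int NormalProc(enum demo_class, enum demo_msg);
static int TextBoxProc(enum demo_class, enum demo_msg);
static int ListBoxProc(enum demo_class, enum demo_msg);

/* One entry per class, indexed by class: its base and its function. */
static const struct { enum demo_class base; DEMO_PROC proc; } defs[] = {
    { D_NONE,    NormalProc  },   /* D_NORMAL  */
    { D_NORMAL,  TextBoxProc },   /* D_TEXTBOX */
    { D_TEXTBOX, ListBoxProc }    /* D_LISTBOX */
};

/* Pass a message to the class from which cls is derived.
   Returns 0 when there is no base class left to ask. */
static int BaseProc(enum demo_class cls, enum demo_msg msg)
{
    enum demo_class base = defs[cls].base;
    return base == D_NONE ? 0 : defs[base].proc(base, msg);
}

static int NormalProc(enum demo_class cls, enum demo_msg msg)
{
    (void)cls;
    return msg == D_PAINT ? 1 : 0;      /* 1: handled by NORMAL */
}

static int TextBoxProc(enum demo_class cls, enum demo_msg msg)
{
    if (msg == D_SCROLL)
        return 2;                       /* 2: handled by TEXTBOX */
    return BaseProc(cls, msg);
}

static int ListBoxProc(enum demo_class cls, enum demo_msg msg)
{
    if (msg == D_KEYBOARD)
        return 3;                       /* 3: handled by LISTBOX */
    return BaseProc(cls, msg);
}

/* The function of the window's own class always runs first. */
int DemoSendMessage(enum demo_class cls, enum demo_msg msg)
{
    assert(cls >= D_NORMAL && cls <= D_LISTBOX);
    return defs[cls].proc(cls, msg);
}
```

In this sketch a D_SCROLL message sent to a list box falls through ListBoxProc and is handled one level down by TextBoxProc, while D_PAINT travels all the way to NormalProc.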
The original D-Flat classes as published in classdef.h are: NORMAL,
APPLICATION, TEXTBOX, LISTBOX, EDITBOX, MENUBAR, POPDOWNMENU, BUTTON, DIALOG,
ERRORBOX, MESSAGEBOX, HELPBOX, and DUMMY.
Most of the window classes are derived from the NORMAL window class--either
directly or by deriving from classes that derive from the NORMAL class. The
NORMAL class manages those processes--such as creating, closing, moving,
resizing, focusing, and so on--that can be common to all windows.
An APPLICATION window is the first window that a D-Flat application opens. It
is the parent of the MENUBAR window and all the document windows.
A TEXTBOX window is one into which you can write text and with which the user
can scroll and page through the text.
The LISTBOX and EDITBOX window classes are derived from the TEXTBOX class. A
LISTBOX window consists of lines in a list with a selection cursor that the
user can move up and down and with which the user can select from the entries
in the list. An EDITBOX window contains text that the user enters by using
text editor functions.
A MENUBAR window consists of a single line that appears on the first line of
an APPLICATION window and contains menu selections. When the user makes a menu
selection, the MENUBAR window opens a POPDOWNMENU window.
The POPDOWNMENU class is derived from the LISTBOX class. It is a single-page
listbox that will send command messages to its parent as the result of user
selections.
The BUTTON class defines a small static TEXTBOX window that reacts to user
selections by sending its parent a message.
The DIALOG window class defines a window that hosts a set of other window
types to implement a dialog box.
The ERRORBOX, MESSAGEBOX, and HELPBOX windows are DIALOG windows with defined
messages and user actions assigned to them.
The DUMMY window class defines a ghost window frame that appears when the user
is moving or sizing a window. You will learn all about these window classes in
the months to come when we discuss them and implement their window processing
modules.


Window Attributes


The classdefs array defines default window attributes for the window classes.
When you create a window, it automatically has the attributes of the class to
which it belongs, as well as all the classes from which it is derived. In
addition, you can specify additional attributes when you create the window, as
well as add and remove attributes from an existing window at any time. The
following attribute codes are defined in classdef.h: SHADOW, MOVEABLE,
SIZEABLE, HASMENUBAR, VSCROLLBAR, HSCROLLBAR, VISIBLE, SAVESELF, TITLEBAR,
CONTROLBOX, MINMAXBOX, NOCLIP, READONLY, MULTILINE, and HASBORDER.
Most of the attributes indicate that a window includes a particular property.
For example, the SHADOW attribute says that a window has a video shadow, and
the MOVEABLE attribute says that the user may move the window around the screen.
Other attributes are not so obvious. The VISIBLE attribute says that the
window is to be displayed when the program creates it. The NOCLIP attribute
tells a window that it may display itself in regions of the screen that are
outside of its parent. The READONLY attribute is for EDITBOX windows; the user
may not change the text in a READONLY EDITBOX window. An EDITBOX window that
does not have the MULTILINE attribute occupies one line only and does not wrap
words or react to the Enter key.
The SAVESELF attribute says that a window is to save and restore video memory
when it is created or displayed and closed or hidden. Windows that do not have
this attribute do not save video memory when they are displayed. Furthermore,
when they are hidden or closed, they send messages to all overlapping windows
to repaint the parts of themselves that overlap the closing window. The
SAVESELF attribute is an efficiency tactic for windows that keep the focus for
as long as they are open. The POPDOWNMENU and DIALOG window classes have the
SAVESELF attribute.
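Since each attribute code in classdef.h is a distinct bit, combining, adding, removing, and testing attributes reduces to bitwise operations on one integer. The sketch below illustrates this under assumptions. The four attribute values are copied from Listing Three, but demo_window and the Demo functions are hypothetical names rather than the D-Flat API.

```c
#include <assert.h>
#include <stddef.h>

/* Attribute values as published in classdef.h. */
#define SHADOW   0x0001
#define MOVEABLE 0x0002
#define VISIBLE  0x0040
#define SAVESELF 0x0080

/* Hypothetical, reduced window record: only the attribute word. */
struct demo_window { int attrib; };

/* A new window starts with its class's attributes, those of the
   classes it derives from, and any extras passed at creation. */
int DemoInitialAttributes(int class_attrib, int base_attrib, int extra)
{
    return class_attrib | base_attrib | extra;
}

/* Turn on one or more attribute bits. */
void DemoAddAttribute(struct demo_window *w, int a)
{
    assert(w != NULL);
    w->attrib |= a;
}

/* Turn off one or more attribute bits. */
void DemoClearAttribute(struct demo_window *w, int a)
{
    assert(w != NULL);
    w->attrib &= ~a;
}

/* Nonzero if any of the requested bits is set. */
int DemoTestAttribute(const struct demo_window *w, int a)
{
    assert(w != NULL);
    return (w->attrib & a) != 0;
}
```

With this scheme, a dialog class that carries SAVESELF and a caller that passes VISIBLE at creation yield a window whose attribute word has both bits set, and either can later be cleared without disturbing the other.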



The Window Driver


Listing Five, page 150, is window.c, the driver module for D-Flat windows. It
begins with the CreateWindow function. A D-Flat application calls this
function to create a window. You must pass this function the window class, the
text of the window's title, and the upper left screen coordinates relative to
zero where the window will first display, its height and width, a pointer --
usually NULL -- to data space that stores additional data about the window,
the WINDOW handle of the parent window, a pointer to a window processing
module, and an attribute value.
The window's title may be NULL, in which case the window will have no title.
If either of the window's upper left coordinates is -1, the window will be
centered on the associated axis. If you provide an address to a window
processing function, messages to that window will be sent to the specified
function first. A discussion of the window processing function's operation
appears later in this series when we get into example programs. The attribute
value that you pass to CreateWindow as its last parameter contains additional
attributes that the window will have over and above those assigned to its
class.
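The centering rule for a coordinate of -1 might be sketched like this. The text does not say whether the window is centered within the screen or within its parent, so the sketch assumes an 80-by-25 text screen. DEMO_SCREENWIDTH, DEMO_SCREENHEIGHT, and DemoResolveCoord are hypothetical names.

```c
#include <assert.h>

#define DEMO_SCREENWIDTH  80   /* assumed text-mode screen size */
#define DEMO_SCREENHEIGHT 25

/* Resolve one upper left coordinate. A value of -1 asks for a span
   of `size` cells to be centered within `extent` cells; any other
   value is returned unchanged. */
int DemoResolveCoord(int pos, int size, int extent)
{
    assert(size >= 0 && extent >= size);
    return pos == -1 ? (extent - size) / 2 : pos;
}
```

The two axes are resolved independently, so a caller can center a window horizontally while pinning its top row, for example DemoResolveCoord(-1, 40, DEMO_SCREENWIDTH) for the column and a fixed row of 2.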
The CreateWindow function allocates memory for the window structure and sets
the fields in the structure to their initial values as determined by the
function arguments or by default values. The structure's format appears in
dflat.h from last month. After the structure is initialized, the function
sends a CREATE_WINDOW message to the window. We will discuss the mechanism for
sending and processing window messages next month. If the window has the
VISIBLE attribute, the function sends the SHOW_WINDOW message to the window.
Finally, it returns the WINDOW handle to the function that called
CreateWindow.
The window.c source file contains several window-driver functions that outside
functions can call. Some of these support the D-Flat API and others are for
use by D-Flat functions themselves. The AddTitle function accepts a WINDOW
handle and a string title and adds the title to the window. The PutWindowChar
function writes a single character to a window at a specified x, y coordinate.
The clipbottom and clipline functions perform clipping of a window's display
to keep it inside the screen and parent window borders. The writeline function
writes a string to a window at a specified x, y coordinate, and pads to the
window's border with spaces if told to do so. The RepaintBorder function
displays the window's border including any scroll bars, the shadow, and the
title bar. The ClearWindow function clears the data space of a window to
spaces. The GetVideoBuffer and RestoreVideoBuffer functions manage the video
memory save operations for windows that have the SAVESELF attribute. The
LineLength function computes the logical length of a line of display data,
adjusting for any embedded color controls in the text.
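To illustrate the idea behind LineLength, here is a sketch that counts only the characters that would occupy screen cells. The control codes are assumptions made for the example, since the text does not give their encoding: DEMO_CHANGECOLOR is taken to be followed by two nonzero color bytes, and DEMO_RESETCOLOR to stand alone. DemoLineLength is likewise a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CHANGECOLOR '\x01'  /* assumed: followed by two color bytes */
#define DEMO_RESETCOLOR  '\x02'  /* assumed: a single control byte */

/* Return the number of displayable characters in the line that
   starts at s. The line ends at a newline or at the terminator. */
size_t DemoLineLength(const char *s)
{
    size_t len = 0;

    assert(s != NULL);
    while (*s != '\0' && *s != '\n') {
        if (*s == DEMO_CHANGECOLOR) {
            s++;                    /* skip the control byte */
            if (*s != '\0')
                s++;                /* skip the foreground byte */
            if (*s != '\0')
                s++;                /* skip the background byte */
        } else if (*s == DEMO_RESETCOLOR) {
            s++;                    /* control byte takes no cell */
        } else {
            len++;                  /* an ordinary character */
            s++;
        }
    }
    return len;
}
```

A routine like this lets padding and clipping code work in screen columns rather than in bytes, which matters as soon as a line carries color changes.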


How to Get D-Flat Now


The complete source code package for D-Flat is on CompuServe in Library 0 of
the DDJ Forum and on TelePath. Its name is D-FLAT.ARC. It is a preliminary
version but one that works. I will replace this file over the months as the
code changes. At present, everything compiles and works with Turbo C 2.0 and
Microsoft C 6.0. There is a makefile that the make utilities of both compilers
accept, and there is one example program, the MEMOPAD program, with which I
write some of these columns. If you want to discuss D-Flat with me, my
CompuServe ID is 71101,1262, and I monitor the DDJ Forum daily.
Next month we will get into the message system of D-Flat's event-driven
programming model.


The Programmer's Soap Box


Recent DDJ columns and articles have discussed the issue of software patents.
Opinions vary and are strong as to whether software algorithms should be
protected by patents. Given that some court decisions have ruled that they
are, I wonder now how effective such protection really is. It brings to mind
the PROMIS software system of the early '80s and the legal insanity that
surrounds it. A recent column by the syndicated columnist James Kilpatrick
recalled the case to my attention, and the account that follows comes mostly
from his column. The PROMIS case does not address specific software patent
issues, but it does raise questions about the ability of a developer to use
whatever protection the law provides to safeguard his or her ideas and rights.
Here's what happened.
A programmer named William Hamilton developed a system named PROMIS that
manages law enforcement caseload databases. The Justice Department awarded him
a contract so that federal lawyers could use PROMIS in the pursuit of their
noble duties. Then, guess what? Our government simply defaulted on payment.
They used the software, and they didn't pay for it. This wasn't shareware or
something that they downloaded from a BBS. This was a contract, and Uncle Sam
refused to honor its terms. There was no problem with the code. It worked
fine. The guardians of our trust, the protectors of the Constitution, the
nation's combined law firm and police department turned their back on an
iron-clad obligation to a citizen. They kept their PROMIS but they didn't keep
their promise, and Hamilton's company went belly-up.
Why did this happen? The government bureaucrat who was in charge of
authorizing payment was a former employee of Hamilton, and Hamilton had fired
him. Did the government guy's grudge against his former boss influence his
decision to withhold payment? Only he knows.
A federal bankruptcy judge ruled that the Justice Department was in default
and had in fact stolen the software. He ordered them to pay up. On appeal, the
government lost again. They appealed again, and Hamilton still hasn't seen any
money. Two separate Congressional investigations failed to shake the money
tree. Every federal bankruptcy judge who has since been assigned to the case
disqualifies himself for one vague reason or another. No one wants it because
the first judge who ruled against the government--his own employer--was not
reappointed when his term was up. To add insult to injury, the feds sold
copies of PROMIS to the Canadian government and have freely distributed it
among the U.S. intelligence community.
Given all this, how secure will you feel when your snappy new algorithm is
protected by a bona fide, U.S. grade-A patent? Who will go to bat for you when
someone uses your work? That, of course, will depend on how much money you
have for lawyers and who the pirate is. Such protection exists only when it is
in the interest of the guys with the big guns.

_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

/* ---------------- config.h -------------- */

#ifndef CONFIG_H
#define CONFIG_H

#define DFLAT_APPLICATION "memopad"

struct colors {
 /* ------------ colors ------------ */
 char ApplicationFG, ApplicationBG;
 char NormalFG, NormalBG;
 char ButtonFG, ButtonBG;
 char ButtonSelFG, ButtonSelBG;
 char DialogFG, DialogBG;
 char ErrorBoxFG, ErrorBoxBG;
 char MessageBoxFG, MessageBoxBG;
 char HelpBoxFG, HelpBoxBG;
 char InFocusTitleFG, InFocusTitleBG;
 char TitleFG, TitleBG;
 char DummyFG, DummyBG;
 char TextBoxFG, TextBoxBG;
 char TextBoxSelFG, TextBoxSelBG;
 char TextBoxFrameFG, TextBoxFrameBG;
 char ListBoxFG, ListBoxBG;
 char ListBoxSelFG, ListBoxSelBG;
 char ListBoxFrameFG, ListBoxFrameBG;
 char EditBoxFG, EditBoxBG;
 char EditBoxSelFG, EditBoxSelBG;
 char EditBoxFrameFG, EditBoxFrameBG;
 char MenuBarFG, MenuBarBG;
 char MenuBarSelFG, MenuBarSelBG;
 char PopDownFG, PopDownBG;
 char PopDownSelFG, PopDownSelBG;
 char InactiveSelFG;
 char ShortCutFG;
};

/* ----------- configuration parameters ----------- */
typedef struct config {
 char mono; /* True for B/W screens on any monitor */
 int InsertMode; /* Editor insert mode */
 int Tabs; /* Editor tab stops */
 int WordWrap; /* True to word wrap editor */
 struct colors clr; /* Colors */
} CONFIG;

extern CONFIG cfg;
extern struct colors color, bw;

void LoadConfig(void);
void SaveConfig(void);

#endif



[LISTING TWO]

/* ------------- config.c ------------- */

#include <stdio.h>
#include <conio.h>
#include "dflat.h"

/* ----- default colors for color video system ----- */
struct colors color = {
 LIGHTGRAY, BLUE, /* Application */
 LIGHTGRAY, BLACK, /* Normal */
 BLACK, CYAN, /* Button */
 WHITE, CYAN, /* ButtonSel */
 LIGHTGRAY, BLUE, /* Dialog */
 YELLOW, RED, /* ErrorBox */
 BLACK, LIGHTGRAY, /* MessageBox */
 BLACK, LIGHTGRAY, /* HelpBox */
 WHITE, CYAN, /* InFocusTitle */
 BLACK, CYAN, /* Title */
 GREEN, LIGHTGRAY, /* Dummy */
 BLACK, LIGHTGRAY, /* TextBox */
 LIGHTGRAY, BLACK, /* TextBoxSel */
 LIGHTGRAY, BLUE, /* TextBoxFrame */
 BLACK, LIGHTGRAY, /* ListBox */
 LIGHTGRAY, BLACK, /* ListBoxSel */
 LIGHTGRAY, BLUE, /* ListBoxFrame */
 BLACK, LIGHTGRAY, /* EditBox */
 LIGHTGRAY, BLACK, /* EditBoxSel */
 LIGHTGRAY, BLUE, /* EditBoxFrame */
 BLACK, LIGHTGRAY, /* MenuBar */
 BLACK, CYAN, /* MenuBarSel */
 BLACK, CYAN, /* PopDown */
 BLACK, LIGHTGRAY, /* PopDownSel */
 DARKGRAY, /* InactiveSelFG */
 RED /* ShortCutFG */
};

/* ----- default colors for mono video system ----- */
struct colors bw = {
 LIGHTGRAY, BLACK, /* Application */
 LIGHTGRAY, BLACK, /* Normal */
 BLACK, LIGHTGRAY, /* Button */
 WHITE, LIGHTGRAY, /* ButtonSel */
 LIGHTGRAY, BLACK, /* Dialog */
 LIGHTGRAY, BLACK, /* ErrorBox */
 LIGHTGRAY, BLACK, /* MessageBox */
 BLACK, LIGHTGRAY, /* HelpBox */
 BLACK, LIGHTGRAY, /* InFocusTitle */
 BLACK, LIGHTGRAY, /* Title */
 BLACK, LIGHTGRAY, /* Dummy */
 LIGHTGRAY, BLACK, /* TextBox */
 BLACK, LIGHTGRAY, /* TextBoxSel */
 LIGHTGRAY, BLACK, /* TextBoxFrame */
 LIGHTGRAY, BLACK, /* ListBox */
 BLACK, LIGHTGRAY, /* ListBoxSel */
 LIGHTGRAY, BLACK, /* ListBoxFrame */
 LIGHTGRAY, BLACK, /* EditBox */
 BLACK, LIGHTGRAY, /* EditBoxSel */
 LIGHTGRAY, BLACK, /* EditBoxFrame */
 LIGHTGRAY, BLACK, /* MenuBar */
 BLACK, LIGHTGRAY, /* MenuBarSel */
 BLACK, LIGHTGRAY, /* PopDown */
 LIGHTGRAY, BLACK, /* PopDownSel */
 DARKGRAY, /* InactiveSelFG */
 WHITE /* ShortCutFG */
};

/* ------ default configuration values ------- */
CONFIG cfg = {
 FALSE, /* mono */
 TRUE, /* Editor Insert Mode */
 4, /* Editor tab stops */
 TRUE /* Editor word wrap */
};

/* ------ load a configuration file from disk ------- */
void LoadConfig(void)
{
 FILE *fp = fopen(DFLAT_APPLICATION ".cfg", "rb");
 if (fp != NULL) {
 fread(&cfg, sizeof(CONFIG), 1, fp);
 fclose(fp);
 }
}

/* ------ save a configuration file to disk ------- */
void SaveConfig(void)
{
 FILE *fp = fopen(DFLAT_APPLICATION ".cfg", "wb");
 if (fp != NULL) {
 cfg.InsertMode = GetCommandToggle(ID_INSERT);
 cfg.WordWrap = GetCommandToggle(ID_WRAP);
 fwrite(&cfg, sizeof(CONFIG), 1, fp);
 fclose(fp);
 }
}




[LISTING THREE]

/* ---------------- classdef.h --------------- */

#ifndef CLASSDEF_H
#define CLASSDEF_H

typedef struct classdefs {
 CLASS class; /* window class */
 CLASS base; /* base window class */
 char *fg,*bg,*sfg,*sbg,*ffg,*fbg; /* colors */
 int (*wndproc)(struct window *,enum messages,PARAM,PARAM);
 int attrib;
} CLASSDEFS;

extern CLASSDEFS classdefs[];

#define SHADOW 0x0001
#define MOVEABLE 0x0002
#define SIZEABLE 0x0004
#define HASMENUBAR 0x0008
#define VSCROLLBAR 0x0010
#define HSCROLLBAR 0x0020
#define VISIBLE 0x0040
#define SAVESELF 0x0080
#define TITLEBAR 0x0100
#define CONTROLBOX 0x0200
#define MINMAXBOX 0x0400
#define NOCLIP 0x0800
#define READONLY 0x1000
#define MULTILINE 0x2000
#define HASBORDER 0x4000

int FindClass(CLASS);
#define DerivedClass(class) (classdefs[FindClass(class)].base)

#endif





[LISTING FOUR]

/* ---------------- classdef.c ---------------- */

#include <stdio.h>
#include "dflat.h"


/* Add class definitions to this table.
 * Add the class symbol to the CLASS list in dflat.h
 */

CLASSDEFS classdefs[] = {
 { /* ---------- NORMAL Window Class ----------- */
 NORMAL,
 -1,
 &cfg.clr.NormalFG, &cfg.clr.NormalBG,
 NULL, NULL,
 &cfg.clr.NormalFG, &cfg.clr.NormalBG,
 NormalProc
 },
 { /* ---------- APPLICATION Window Class ----------- */
 APPLICATION,
 NORMAL,
 &cfg.clr.ApplicationFG, &cfg.clr.ApplicationBG,
 NULL, NULL,
 &cfg.clr.ApplicationFG, &cfg.clr.ApplicationBG,
 ApplicationProc,
 VISIBLE | SAVESELF | CONTROLBOX | TITLEBAR | HASBORDER
 },
 { /* ------------ TEXTBOX Window Class -------------- */
 TEXTBOX,
 NORMAL,
 &cfg.clr.TextBoxFG, &cfg.clr.TextBoxBG,
 &cfg.clr.TextBoxSelFG, &cfg.clr.TextBoxSelBG,
 &cfg.clr.TextBoxFrameFG, &cfg.clr.TextBoxFrameBG,
 TextBoxProc
 },
 { /* ------------- LISTBOX Window class ------------- */
 LISTBOX,
 TEXTBOX,
 &cfg.clr.ListBoxFG, &cfg.clr.ListBoxBG,
 &cfg.clr.ListBoxSelFG, &cfg.clr.ListBoxSelBG,
 &cfg.clr.ListBoxFrameFG, &cfg.clr.ListBoxFrameBG,
 ListBoxProc
 },
 { /* ------------- EDITBOX Window Class -------------- */
 EDITBOX,
 TEXTBOX,
 &cfg.clr.EditBoxFG, &cfg.clr.EditBoxBG,
 &cfg.clr.EditBoxSelFG, &cfg.clr.EditBoxSelBG,
 &cfg.clr.EditBoxFrameFG, &cfg.clr.EditBoxFrameBG,
 EditBoxProc
 },
 { /* ------------- MENUBAR Window Class --------------- */
 MENUBAR,
 NORMAL,
 &cfg.clr.MenuBarFG, &cfg.clr.MenuBarBG,
 &cfg.clr.MenuBarSelFG, &cfg.clr.MenuBarSelBG,
 NULL, NULL,
 MenuBarProc,
 VISIBLE
 },
 { /* ------------- POPDOWNMENU Window Class ----------- */
 POPDOWNMENU,
 LISTBOX,
 &cfg.clr.PopDownFG, &cfg.clr.PopDownBG,
 &cfg.clr.PopDownSelFG, &cfg.clr.PopDownSelBG,
 NULL, NULL,
 PopDownProc,
 SAVESELF | NOCLIP | HASBORDER
 },
 { /* ----------- BUTTON Window Class --------------- */
 BUTTON,
 TEXTBOX,
 &cfg.clr.ButtonFG, &cfg.clr.ButtonBG,
 &cfg.clr.ButtonSelFG, &cfg.clr.ButtonSelBG,
 NULL, NULL,
 ButtonProc,
 SHADOW
 },
 { /* ------------- DIALOG Window Class -------------- */
 DIALOG,
 NORMAL,
 &cfg.clr.DialogFG, &cfg.clr.DialogBG,
 NULL, NULL,
 &cfg.clr.DialogFG, &cfg.clr.DialogBG,
 DialogProc,
 SHADOW | MOVEABLE | SAVESELF | CONTROLBOX | HASBORDER
 },
 { /* ------------ ERRORBOX Window Class ----------- */
 ERRORBOX,
 DIALOG,
 &cfg.clr.ErrorBoxFG, &cfg.clr.ErrorBoxBG,
 NULL, NULL,
 &cfg.clr.ErrorBoxFG, &cfg.clr.ErrorBoxBG,
 DialogProc,
 SHADOW | HASBORDER
 },
 { /* --------- MESSAGEBOX Window Class ------------- */
 MESSAGEBOX,
 DIALOG,
 &cfg.clr.MessageBoxFG, &cfg.clr.MessageBoxBG,
 NULL, NULL,
 &cfg.clr.MessageBoxFG, &cfg.clr.MessageBoxBG,
 DialogProc,
 SHADOW | HASBORDER
 },
 { /* ----------- HELPBOX Window Class --------------- */
 HELPBOX,
 DIALOG,
 &cfg.clr.HelpBoxFG, &cfg.clr.HelpBoxBG,
 NULL, NULL,
 &cfg.clr.HelpBoxFG, &cfg.clr.HelpBoxBG,
 DialogProc,
 SHADOW | HASBORDER
 },
 { /* -------------- DUMMY Window Class ---------------- */
 DUMMY,
 -1,
 &cfg.clr.DummyFG, &cfg.clr.DummyBG,
 NULL, NULL,
 &cfg.clr.DummyFG, &cfg.clr.DummyBG,
 NULL,
 HASBORDER
 }
};

/* ------- return the offset of a class into the class
 definition table ------ */
int FindClass(CLASS class)
{
 int i;
 for (i = 0; i < sizeof(classdefs) / sizeof(CLASSDEFS); i++)
 if (class == classdefs[i].class)
 return i;
 return 0;
}







[LISTING FIVE]

/* ---------- window.c ------------- */

#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
#include <string.h>
#include "dflat.h"

WINDOW inFocus = NULLWND;

int foreground, background; /* current video colors */

static void InsertTitle(WINDOW, char *);
static void DisplayTitle(WINDOW, RECT);

/* --------- create a window ------------ */
WINDOW CreateWindow(
 CLASS class, /* class of this window */
 char *ttl, /* title or NULL */
 int left, int top, /* upper left coordinates */
 int height, int width, /* dimensions */
 void *extension, /* pointer to additional data */
 WINDOW parent, /* parent of this window */
 int (*wndproc)(struct window *,enum messages,PARAM,PARAM),int attrib)
 /* window attribute */
{
 WINDOW wnd = malloc(sizeof(struct window));
 get_videomode();
 if (wnd != NULLWND) {
 int base;
 /* ----- coordinates -1, -1 = center the window ---- */
 if (left == -1)
 wnd->rc.lf = (SCREENWIDTH-width)/2;
 else
 wnd->rc.lf = left;
 if (top == -1)
 wnd->rc.tp = (SCREENHEIGHT-height)/2;
 else
 wnd->rc.tp = top;
 wnd->attrib = attrib;
 if (ttl != NULL)
 AddAttribute(wnd, TITLEBAR);
 if (wndproc == NULL)
 wnd->wndproc = classdefs[FindClass(class)].wndproc;
 else
 wnd->wndproc = wndproc;
 /* ---- derive attributes of base classes ---- */
 base = class;
 while (base != -1) {
 int tclass = FindClass(base);
 AddAttribute(wnd, classdefs[tclass].attrib);
 base = classdefs[tclass].base;
 }
 if (parent && !TestAttribute(wnd, NOCLIP)) {
 /* -- keep upper left within borders of parent -- */
 wnd->rc.lf = max(wnd->rc.lf, GetClientLeft(parent));
 wnd->rc.tp = max(wnd->rc.tp, GetClientTop(parent) +
 (TestAttribute(parent, HASMENUBAR) ? 1 : 0));
 }
 wnd->class = class;
 wnd->extension = extension;
 wnd->rc.rt = GetLeft(wnd)+width-1;
 wnd->rc.bt = GetTop(wnd)+height-1;
 wnd->ht = height;
 wnd->wd = width;
 wnd->title = ttl;
 if (ttl != NULL)
 InsertTitle(wnd, ttl);
 wnd->next = wnd->prev = wnd->dFocus = NULLWND;
 wnd->parent = parent;
 wnd->videosave = NULL;
 wnd->condition = ISRESTORED;
 wnd->RestoredRC = wnd->rc;
 wnd->PrevKeyboard = wnd->PrevMouse = NULL;
 wnd->DeletedText = NULL;
 SendMessage(wnd, CREATE_WINDOW, 0, 0);
 if (isVisible(wnd))
 SendMessage(wnd, SHOW_WINDOW, 0, 0);
 }
 return wnd;
}

/* -------- add a title to a window --------- */
void AddTitle(WINDOW wnd, char *ttl)
{
 InsertTitle(wnd, ttl);
 SendMessage(wnd, BORDER, 0, 0);
}

/* ----- insert a title into a window ---------- */
static void InsertTitle(WINDOW wnd, char *ttl)
{
 if ((wnd->title = malloc(strlen(ttl)+1)) != NULL)
 strcpy(wnd->title, ttl);
}

/* ------- write a character to a window at x,y ------- */
void PutWindowChar(WINDOW wnd, int x, int y, int c)
{
 int x1 = GetClientLeft(wnd)+x;
 int y1 = GetClientTop(wnd)+y;

 if (isVisible(wnd)) {
 if (!TestAttribute(wnd, NOCLIP)) {
 WINDOW wnd1 = GetParent(wnd);
 while (wnd1 != NULLWND) {
 /* --- clip character to parent's borders --- */
 if (x1 < GetClientLeft(wnd1) ||
 x1 > GetClientRight(wnd1) ||
 y1 > GetClientBottom(wnd1) ||
 y1 < GetClientTop(wnd1) ||
 (y1 < GetTop(wnd1)+2 &&
 TestAttribute(wnd1, HASMENUBAR)))
 return;
 wnd1 = GetParent(wnd1);
 }
 }
 if (x1 < SCREENWIDTH && y1 < SCREENHEIGHT)
 wputch(wnd, c, x, y);
 }
}

static char line[161];

/* ----- clip line if it extends below the bottom of the
 parent window ------ */
static int clipbottom(WINDOW wnd, int y)
{
 if (!TestAttribute(wnd, NOCLIP)) {
 WINDOW wnd1 = GetParent(wnd);
 while (wnd1 != NULLWND) {
 if (GetClientTop(wnd)+y > GetBottom(wnd1))
 return TRUE;
 wnd1 = GetParent(wnd1);
 }
 }
 return GetClientTop(wnd)+y > SCREENHEIGHT;
}

/* ------ clip the portion of a line that extends past the
 right margin of the parent window ----- */
void clipline(WINDOW wnd, int x, char *ln)
{
 WINDOW pwnd = GetParent(wnd);
 int x1 = strlen(ln);
 int i = 0;

 if (!TestAttribute(wnd, NOCLIP)) {
 while (pwnd != NULLWND) {
 x1 = GetRight(pwnd) - GetLeft(wnd) - x;
 pwnd = GetParent(pwnd);
 }
 }
 else if (GetLeft(wnd) + x > SCREENWIDTH)
 x1 = SCREENWIDTH-GetLeft(wnd) - x;
 /* --- adjust the clipping offset for color controls --- */
 if (x1 < 0)
 x1 = 0;
 while (i < x1) {
 if ((unsigned char) ln[i] == CHANGECOLOR)
 i += 3, x1 += 3;
 else if ((unsigned char) ln[i] == RESETCOLOR)
 i++, x1++;
 else
 i++;
 }
 ln[x1] = '\0';
}

/* ------ write a line to video window client area ------ */
void writeline(WINDOW wnd, char *str, int x, int y, int pad)
{
 char wline[120];

 if (TestAttribute(wnd, HASBORDER)) {
 x++;
 y++;
 }
 if (!clipbottom(wnd, y)) {
 char *cp;
 int len;
 int dif;

 memset(wline, 0, sizeof wline);
 len = LineLength(str);
 dif = strlen(str) - len;
 strncpy(wline, str, ClientWidth(wnd) + dif);
 if (pad) {
 cp = wline+strlen(wline);
 while (len++ < ClientWidth(wnd)-x)
 *cp++ = ' ';
 }
 clipline(wnd, x, wline);
 wputs(wnd, wline, x, y);
 }
}

/* -- write a line to video window (including the border) -- */
void writefull(WINDOW wnd, char *str, int y)
{
 if (!clipbottom(wnd, y)) {
 strcpy(line, str);
 clipline(wnd, 0, line);
 wputs(wnd, line, 0, y);
 }
}

/* -------- display a window's title --------- */
static void DisplayTitle(WINDOW wnd, RECT rc)
{
 int tlen = min(strlen(wnd->title), WindowWidth(wnd)-2);
 int tend = WindowWidth(wnd)-4;

 if (SendMessage(wnd, TITLE, 0, 0)) {
 if (wnd == inFocus) {
 foreground = cfg.clr.InFocusTitleFG;
 background = cfg.clr.InFocusTitleBG;
 }
 else {
 foreground = cfg.clr.TitleFG;
 background = cfg.clr.TitleBG;
 }
 memset(line,' ',WindowWidth(wnd)-2);
 if (wnd->condition != ISMINIMIZED)
 strncpy(line + ((WindowWidth(wnd)-2 - tlen) / 2),
 wnd->title, tlen);
 line[WindowWidth(wnd)-2] = '\0';
 if (TestAttribute(wnd, CONTROLBOX))
 line[1] = CONTROLBOXCHAR;
 if (TestAttribute(wnd, MINMAXBOX)) {
 switch (wnd->condition) {
 case ISRESTORED:
 line[tend+1] = MAXPOINTER;
 line[tend] = MINPOINTER;
 break;
 case ISMINIMIZED:
 line[tend+1] = MAXPOINTER;
 break;
 case ISMAXIMIZED:
 line[tend] = MINPOINTER;
 line[tend+1] = RESTOREPOINTER;
 break;
 default:
 break;
 }
 }
 line[RectRight(rc)+1] = '\0';
 writeline(wnd, line+RectLeft(rc),
 RectLeft(rc), -1, FALSE);
 }
}

/* --- display right border shadow character of a window --- */
static void near shadow_char(WINDOW wnd, int y)
{
 int fg = foreground;
 int bg = background;
 int x = WindowWidth(wnd);
 int c = videochar(GetLeft(wnd)+x, GetTop(wnd)+y+1);

 if (TestAttribute(wnd, SHADOW) == 0)
 return;
 foreground = SHADOWFG;
 background = BLACK;
 PutWindowChar(wnd, x-1, y, c);
 foreground = fg;
 background = bg;
}

/* --- display the bottom border shadow line for a window --- */
static void near shadowline(WINDOW wnd, RECT rc)
{
 int i;
 int y = GetBottom(wnd)+1;

 if ((TestAttribute(wnd, SHADOW)) == 0)
 return;
 if (!clipbottom(wnd, WindowHeight(wnd))) {
 int fg = foreground;
 int bg = background;
 for (i = 0; i < WindowWidth(wnd); i++)
 line[i] = videochar(GetLeft(wnd)+i+1, y);
 line[i] = '\0';
 foreground = SHADOWFG;
 background = BLACK;
 clipline(wnd, 1, line);
 line[RectRight(rc)+3] = '\0';
 wputs(wnd, line+RectLeft(rc), 1+RectLeft(rc),
 WindowHeight(wnd));
 foreground = fg;
 background = bg;
 }
}

/* ------- display a window's border ----- */
void RepaintBorder(WINDOW wnd, RECT *rcc)
{
 int y;
 int lin, side, ne, nw, se, sw;
 RECT rc, clrc;

 if (!TestAttribute(wnd, HASBORDER))
 return;
 if (rcc == NULL) {
 rc = SetRect(0, 0, WindowWidth(wnd)-1,
 WindowHeight(wnd)-1);
 if (TestAttribute(wnd, SHADOW)) {
 rc.rt++;
 rc.bt++;
 }
 }
 else
 rc = *rcc;
 clrc = rc;
 /* -------- adjust the client rectangle ------- */
 if (RectLeft(rc) == 0)
 --clrc.rt;
 else
 --clrc.lf;
 if (RectTop(rc) == 0)
 --clrc.bt;
 else
 --clrc.tp;
 RectRight(clrc) = min(RectRight(clrc), WindowWidth(wnd)-3);
 RectBottom(clrc) =
 min(RectBottom(clrc), WindowHeight(wnd)-3);
 if (wnd == inFocus) {
 lin = FOCUS_LINE;
 side = FOCUS_SIDE;
 ne = FOCUS_NE;
 nw = FOCUS_NW;
 se = FOCUS_SE;
 sw = FOCUS_SW;
 }
 else {
 lin = LINE;
 side = SIDE;
 ne = NE;
 nw = NW;
 se = SE;
 sw = SW;
 }
 line[WindowWidth(wnd)] = '\0';
 /* ---------- window title ------------ */
 if (RectTop(rc) == 0)
 if (TestAttribute(wnd, TITLEBAR))
 DisplayTitle(wnd, clrc);
 foreground = FrameForeground(wnd);
 background = FrameBackground(wnd);
 /* -------- top frame corners --------- */
 if (RectTop(rc) == 0) {
 if (RectLeft(rc) == 0)
 PutWindowChar(wnd, -1, -1, nw);
 if (RectRight(rc) >= WindowWidth(wnd)-1)
 PutWindowChar(wnd, WindowWidth(wnd)-2, -1, ne);

 if (TestAttribute(wnd, TITLEBAR) == 0) {
 /* ----------- top line ------------- */
 memset(line,lin,WindowWidth(wnd)-1);
 line[RectRight(clrc)+1] = '\0';
 if (strlen(line+RectLeft(clrc)) > 1 ||
 TestAttribute(wnd, SHADOW) == 0)
 writeline(wnd, line+RectLeft(clrc),
 RectLeft(clrc), -1, FALSE);
 }
 }
 /* ----------- window body ------------ */
 for (y = 0; y < ClientHeight(wnd); y++) {
 int ch;
 if (y >= RectTop(clrc) && y <= RectBottom(clrc)) {
 if (RectLeft(rc) == 0)
 PutWindowChar(wnd, -1, y, side);
 if (RectRight(rc) >= ClientWidth(wnd)) {
 if (TestAttribute(wnd, VSCROLLBAR))
 ch = ( y == 0 ? UPSCROLLBOX :
 y == WindowHeight(wnd)-3 ?
 DOWNSCROLLBOX :
 y == wnd->VScrollBox ?
 SCROLLBOXCHAR :
 SCROLLBARCHAR );
 else
 ch = side;
 PutWindowChar(wnd, WindowWidth(wnd)-2, y, ch);
 }
 if (RectRight(rc) == WindowWidth(wnd))
 shadow_char(wnd, y);
 }
 }
 if (RectBottom(rc) >= WindowHeight(wnd)-1) {
 /* -------- bottom frame corners ---------- */
 if (RectLeft(rc) == 0)
 PutWindowChar(wnd, -1, WindowHeight(wnd)-2, sw);
 if (RectRight(rc) >= WindowWidth(wnd)-1)
 PutWindowChar(wnd, WindowWidth(wnd)-2,
 WindowHeight(wnd)-2, se);
 /* ----------- bottom line ------------- */
 memset(line,lin,WindowWidth(wnd)-1);
 if (TestAttribute(wnd, HSCROLLBAR)) {
 line[0] = LEFTSCROLLBOX;
 line[WindowWidth(wnd)-3] = RIGHTSCROLLBOX;
 memset(line+1, SCROLLBARCHAR, WindowWidth(wnd)-4);
 line[wnd->HScrollBox] = SCROLLBOXCHAR;
 }
 line[RectRight(clrc)+1] = '\0';
 if (strlen(line+RectLeft(clrc)) > 1 ||
 TestAttribute(wnd, SHADOW) == 0)
 writeline(wnd,
 line+RectLeft(clrc),
 RectLeft(clrc),
 WindowHeight(wnd)-2,
 FALSE);
 if (RectRight(rc) == WindowWidth(wnd))
 shadow_char(wnd, WindowHeight(wnd)-2);
 }
 if (RectBottom(rc) == WindowHeight(wnd))
 /* ---------- bottom shadow ------------- */
 shadowline(wnd, clrc);
}

/* ------ clear the data space of a window -------- */
void ClearWindow(WINDOW wnd, RECT *rcc, int clrchar)
{
 if (isVisible(wnd)) {
 int y;
 RECT rc;

 if (rcc == NULL)
 rc = SetRect(0, 0, ClientWidth(wnd)-1,
 ClientHeight(wnd)-1);
 else
 rc = *rcc;
 SetStandardColor(wnd);
 memset(line, clrchar, RectWidth(rc));
 line[RectWidth(rc)] = '\0';
 for (y = RectTop(rc); y <= RectBottom(rc); y++)
 writeline(wnd, line, RectLeft(rc), y, FALSE);
 }
}

/* -- adjust a window's rectangle to clip it to its parent -- */
static RECT near AdjustRect(WINDOW wnd)
{
 RECT rc = wnd->rc;
 if (TestAttribute(wnd, SHADOW)) {
 RectBottom(rc)++;
 RectRight(rc)++;
 }
 if (!TestAttribute(wnd, NOCLIP)) {
 WINDOW pwnd = GetParent(wnd);
 if (pwnd != NULLWND) {
 RectTop(rc) = max(RectTop(rc),
 GetClientTop(pwnd));
 RectLeft(rc) = max(RectLeft(rc),
 GetClientLeft(pwnd));
 RectRight(rc) = min(RectRight(rc),
 GetClientRight(pwnd));
 RectBottom(rc) = min(RectBottom(rc),
 GetClientBottom(pwnd));
 }
 }
 RectRight(rc) = min(RectRight(rc), SCREENWIDTH-1);
 RectBottom(rc) = min(RectBottom(rc), SCREENHEIGHT-1);
 RectLeft(rc) = min(RectLeft(rc), SCREENWIDTH-1);
 RectTop(rc) = min(RectTop(rc), SCREENHEIGHT-1);
 return rc;
}

/* --- get the video memory that is to be used by a window -- */
void GetVideoBuffer(WINDOW wnd)
{
 RECT rc;
 int ht;
 int wd;

 rc = AdjustRect(wnd);
 ht = RectBottom(rc) - RectTop(rc) + 1;
 wd = RectRight(rc) - RectLeft(rc) + 1;
 wnd->videosave = realloc(wnd->videosave, (ht * wd * 2));
 get_videomode();
 if (wnd->videosave != NULL)
 getvideo(rc, wnd->videosave);
}

/* --- restore the video memory that was used by a window --- */
void RestoreVideoBuffer(WINDOW wnd)
{
 if (wnd->videosave != NULL) {
 RECT rc = AdjustRect(wnd);
 storevideo(rc, wnd->videosave);
 free(wnd->videosave);
 wnd->videosave = NULL;
 }
}

/* ------- compute the logical line length of a window ------ */
int LineLength(char *ln)
{
 int len = strlen(ln);
 char *cp = ln;
 while ((cp = strchr(cp, CHANGECOLOR)) != NULL) {
 cp++;
 len -= 3;
 }
 cp = ln;
 while ((cp = strchr(cp, RESETCOLOR)) != NULL) {
 cp++;
 --len;
 }
 return len;
}



June, 1991
STRUCTURED PROGRAMMING


Mr. Horny Goes to Town




Jeff Duntemann, KG7JF


Like Thoreau, I rejoice that there are owls. Part of the reason is that owls
eat mice, and thus make a certain number of cats unnecessary, which is always
a plus. But the better part of it is that owls seem to live by design, unlike
English sparrows and house finches, which hop madly around in dead bushes,
burning their calories in manic random motion to no good purpose.
A great horned owl has staked out our piece of desert as his turf, and he sits
on the gnarly saguaro outside the master bedroom window, hooting mournfully
all night long. It's not silence, but it beats drag-racing teenagers, and come
summer the air conditioner will doubtless drown him out. Actually, we've
become so fond of Mr. Horny that we held a party in his honor, and invited all
our friends to come by and see him emerge from hiding at dusk and begin
scanning the yard for mice.
So we had 20 people up on the sundeck, drinking beer and eating hors d'oeuvres
and telling Saddam Hussein jokes, and dusk came and went with no sign of Mr.
Horny. This was a surprise, since we had seen him every night for a very long
time, and seems to point up a sort of Heisenberg's Uncertainty Principle about
owls: Inviting 20 rowdy people to your sundeck to look for owls will almost
certainly affect the likelihood of actually seeing one.
The owl has been gone for a while, and we suspect Mr. Horny moved to Cave
Creek, having deemed our slice of Scottsdale uninhabitable. The lesson: Owls
only show up when you don't expect them. So don't expect them to appear--and
rejoice when they do.


Spotting OWL


There's some considerable rejoicing to be done: Borland has announced and is
shipping Turbo Pascal for Windows, and OWL. TPW (which is an entirely separate
product from text-only Turbo Pascal 6.0) is Windows-hosted (that is, the
product operates only under Windows) and generates only Windows applications.
So if you've already decided that you hate Windows, pass it by. On the other
hand, if you're a competent Turbo Pascal programmer and want a handhold on the
exploding Windows market, you must get this product. Contrary to even my own
expectations, it is good beyond imagining.
The reasons, like the product, are a little complex. Much of TPW's value lies
in something called the Object Windows Library (OWL), which is an application
framework for Windows, just as the Turbo Vision (TV) library (bundled with
Turbo Pascal 6.0) is an application framework for text mode under DOS. I
haven't had the chance to do much with Turbo Vision in this column (my list of
topics-to-be-covered now runs down the hall and into the garage) but that
might be just as well. TV and OWL are remarkably similar from a height, and
implement identical ideas for two very different platforms. Both provide the
underpinnings of an event-driven application, along with a rich collection of
software components for building user- and system-interface code.
Both OWL and TV are inescapably object oriented. The idea behind an
application framework is to inherit a boilerplate application that does
nothing on its own, and add to it the specific code that lets it do the work
you require of some particular application. The framework contains all the
hooks on which you hang an application, divided along the classic OOP axis of
generality versus specificity. The parent objects are so general that they
have the potential to do almost anything, but too general to perform any
useful task. So you define and implement child objects that do one specific
thing, using as much of the parent's general code as possible.
If you've ever implemented a menuing system you'll probably know what I mean.
Rather than hard-code specific menus into an application, the smart thing to
do is create a general-purpose menuing machine, to which you feed some sort of
menu-definition table that works with the menuing engine to create a
particular menu with a particular set of options. There are numerous ways to
do this (I presented one pre-OOP implementation in the Third Edition of
Complete Turbo Pascal) and the OOP notion of inheritance is tailor-made for
such things.
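
The table-driven menuing idea can be sketched in a few lines of C, the language of this issue's listings. This is only an illustration under stated assumptions: MenuItem, file_menu, and menu_dispatch are hypothetical names invented here, not part of Turbo Vision, OWL, or Complete Turbo Pascal.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* One row of a hypothetical menu-definition table. */
typedef struct {
    const char *label;       /* text shown to the user */
    char        hotkey;      /* uppercase key that selects the item */
    int       (*action)(void); /* code to run when the item is chosen */
} MenuItem;

/* Sample actions; each returns a code so a caller can tell which ran. */
static int do_open(void) { return 1; }
static int do_save(void) { return 2; }
static int do_quit(void) { return 3; }

/* A particular menu is just data fed to the general-purpose engine. */
const MenuItem file_menu[] = {
    { "Open", 'O', do_open },
    { "Save", 'S', do_save },
    { "Quit", 'Q', do_quit },
};
#define FILE_MENU_COUNT (sizeof file_menu / sizeof file_menu[0])

/* The "engine": find the item whose hotkey matches key (case ignored)
   and run its action. Returns the action's result, or -1 if no item
   in the table uses that key. */
int menu_dispatch(const MenuItem *menu, size_t count, int key)
{
    size_t i;

    assert(menu != NULL);
    for (i = 0; i < count; i++)
        if (toupper((unsigned char)key) == menu[i].hotkey)
            return menu[i].action();
    return -1;
}
```

Calling menu_dispatch(file_menu, FILE_MENU_COUNT, 's') runs do_save. Adding a menu option means adding a row to the table; the engine itself never changes.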
OWL and TV are both general-purpose "application engines" in that same sense.
You inherit what may in fact be thousands of lines of general-purpose code
from the application framework object and boilerplate window and control
objects, and can create a very polished-looking application in what might be
only a few hundred lines instead of many thousands. This is especially true of
OWL under Windows, where the stuff that you inherit encapsulates some of the
most violently difficult system-level code that one could imagine. (And I have
a legendary imagination.)


In the Belly of the Beast


Because let's face it: Microsoft Windows 3.0 was designed to do the impossible
and comes pretty damned close. It breaks the 640K DOS memory barrier, and it
adds multitasking to an operating system that isn't even reentrant. It isn't
perfect and never will be, but what minor problems I've had with it (once I
got it to run at all) are well worth the remarkable things it allows me to do.
Windows is an event-driven platform. Much or most of what Windows does, in
fact, is manage the keyboard, mouse, serial port, error, and other
system-generated events. Much or most of the work of writing a Windows
application is creating machinery that responds to the events that Windows
generates.
I could characterize Turbo Pascal for Windows programming, in fact, as the
process of attaching object methods to the events that Windows generates. It's
all very asynchronous: At unpredictable times, the user may press the left
mouse button. Windows detects this and sends an event bubbling up from the
depths to your application, saying, in effect, "Somebody pressed the left
mouse button. What are you going to do about it?"
The left-mouse-button-pressed event is a little package of information that
contains the location of the mouse cursor when the mouse button was pressed.
This allows you to respond to the event in different ways depending on where
the mouse cursor was when the user pressed the button.
In reality, Windows breaks down events into numerous special cases, and
dispatches what it calls messages to your application rather than whole mouse
events. (A rough count shows about 25 different messages relating to mouse
clicks alone.) A message is a code number rather than some sort of text
string, and parsing the message's code number is done automatically by OWL, as
I'll explain a little later.
You might as well think of messages as Windows events (in the sense that I
defined events in my December 1990 column) as long as you understand that the
word "event" in a Windows context actually stands for the physical occurrence
that gives rise to one or more messages. A quick example: Double-clicking on
the left mouse button (which is what Windows considers the "event") gives rise
to a down-click message, an up-click message, a double-click message, and a
second up-click message.
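
The distinction between one physical event and the several messages it produces can be made concrete with a small C sketch. The message and event names below are stand-ins invented for this illustration, not the constants Windows defines, and the single-click sequence is an assumption; only the double-click sequence comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in message codes (invented names and values). */
enum { MSG_LEFT_DOWN = 1, MSG_LEFT_UP, MSG_LEFT_DBLCLK };

/* Physical occurrences, as opposed to the messages they give rise to. */
enum { EVENT_LEFT_CLICK, EVENT_LEFT_DOUBLE_CLICK };

/* Copy the messages produced by one event into out[], which has room
   for max entries. Returns the number of messages the event generates
   (even if out[] was too small to hold them all), or 0 if the event
   is not one this sketch knows about. */
size_t expand_event(int event, int *out, size_t max)
{
    static const int click[]  = { MSG_LEFT_DOWN, MSG_LEFT_UP };
    static const int dclick[] = { MSG_LEFT_DOWN, MSG_LEFT_UP,
                                  MSG_LEFT_DBLCLK, MSG_LEFT_UP };
    const int *seq;
    size_t n, i;

    assert(out != NULL || max == 0);
    switch (event) {
    case EVENT_LEFT_CLICK:        seq = click;  n = 2; break;
    case EVENT_LEFT_DOUBLE_CLICK: seq = dclick; n = 4; break;
    default:                      return 0;
    }
    for (i = 0; i < n && i < max; i++)
        out[i] = seq[i];
    return n;
}
```

An application therefore sees four separate codes arrive for one double-click, and decides independently what, if anything, to do about each.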


Underground Code Rivers


If you're like me, one of the first things you'll find yourself wondering in
looking at an event-driven programming model is, what's the flow of control?
Where does execution start, and where does it end? One thing's for sure, using
OWL is not like ordinary Pascal programming for DOS. (This also applies to
Turbo Vision, as those who have used TV will readily agree.) We're used to
seeing statements flowing one after the other in front of our eyes, like the
Shenandoah River taking its curves and leaving its oxbows as it meanders
toward its marriage with the Potomac.
Instead, what we have now is a network of underground code rivers that only
occasionally spill out of a crack in the cliff, to run for a while and then
vanish again into what seems like a bottomless pit. Typically, you the
programmer only see the side streams that you create. The bulk of the river's
flow is far beneath your feet.
OWL provides an application framework class called TApplication. To create
your own TPW application, you define a child class of TApplication, and extend
the child class with the specific methods your application needs to do its
work. TApplication contains a message loop, which is in fact where execution
remains most of the time. This message loop is hidden from you, and you
inherit it whole and with no need to override or extend it. The loop runs in
circles, continually asking Windows if any events are pending. When they are,
the message loop parses a message, sees if you have defined a method to
respond to that particular message, and if so, calls the method attached to
the message.
This is how control is handled in an OWL application. The main message loop
looks for messages, and calls the methods you have written as appropriate.
Attaching a method to a Windows message involves a new extension to the Turbo
Pascal object syntax. The modifier VIRTUAL may be followed by a numeric
constant or literal. This value specifies a Windows message to which that
method is attached. Only virtual methods may be attached to messages in this
way.
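
The flow of control just described can be approximated in plain C. What follows is a sketch, not OWL's actual loop: the message codes, the App and Message structures, and the bindings table are all assumptions made for illustration, and an array stands in for the queue that the real loop would be asking Windows about. The bindings table plays the part of the numeric value that follows VIRTUAL, tying a message code to a method.

```c
#include <assert.h>
#include <stddef.h>

/* Invented message codes standing in for Windows' numeric messages. */
enum { MSG_QUIT = 0, MSG_LEFT_DOWN = 1, MSG_KEY = 2 };

typedef struct { int code; int x, y; } Message;

/* Stand-in for the child application object and its state. */
typedef struct { int clicks; int last_x, last_y; } App;

typedef void (*Handler)(App *app, const Message *msg);

/* A "method" the programmer writes for one specific message. */
static void on_left_down(App *app, const Message *msg)
{
    app->clicks++;
    app->last_x = msg->x;
    app->last_y = msg->y;
}

/* Binds a message code to a method, much as a numeric value
   after VIRTUAL does in the extended object syntax. */
typedef struct { int code; Handler method; } Binding;

static const Binding bindings[] = {
    { MSG_LEFT_DOWN, on_left_down },
};

/* Look up the method attached to a message code, if any. */
static Handler find_handler(int code)
{
    size_t i;
    for (i = 0; i < sizeof bindings / sizeof bindings[0]; i++)
        if (bindings[i].code == code)
            return bindings[i].method;
    return NULL;
}

/* The loop: take each pending message in turn, call the attached
   method if one exists, ignore the message otherwise, and stop at
   MSG_QUIT. Returns the number of messages that had a method. */
int run_message_loop(App *app, const Message *queue, size_t count)
{
    size_t i;
    int handled = 0;

    assert(app != NULL && queue != NULL);
    for (i = 0; i < count && queue[i].code != MSG_QUIT; i++) {
        Handler h = find_handler(queue[i].code);
        if (h != NULL) {
            h(app, &queue[i]);
            handled++;
        }
    }
    return handled;
}
```

The programmer writes only on_left_down and its one-line entry in the table; the loop and the lookup are the inherited part that normally stays out of sight.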


Resources


Something new that Windows brings to Pascal programming is the notion of
resources, a collective name for field-replaceable program elements that
include fonts, icons, bitmaps, menus, accelerator keys (which I generally call
"shortcut" keys), graphics cursors, dialog boxes, and ordinary text strings.
Resources are deliberately defined outside the code in a relatively
code-independent fashion so that applications may be made language- or even
alphabet-independent with relatively little fiddling in source code.
Ninety-five percent of moving your application from English to French lies in
recreating the application's resources in the French language. Resources are
stored in a program's .EXE image, but are not embedded in actual machine code.
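
The payoff of keeping text out of the code can be shown with a toy string table in C. This is a sketch under assumptions, not the Windows resource mechanism itself: the identifiers, the two-language table, and the load_string function are hypothetical, and a real application would keep the strings in the .EXE's resource section rather than in a C array.

```c
#include <assert.h>
#include <stddef.h>

/* String identifiers used by the code; the text itself lives elsewhere. */
enum { STR_FILE, STR_OPEN, STR_QUIT, STR_COUNT };

/* Languages this toy table happens to carry. */
enum { LANG_ENGLISH, LANG_FRENCH, LANG_COUNT };

/* Stand-in for string resources: one row per language. */
static const char *const strings[LANG_COUNT][STR_COUNT] = {
    { "File",    "Open",   "Quit"    },
    { "Fichier", "Ouvrir", "Quitter" },
};

/* Fetch a string by language and identifier. Returns an empty
   string if either index is out of range. */
const char *load_string(int lang, int id)
{
    if (lang < 0 || lang >= LANG_COUNT || id < 0 || id >= STR_COUNT)
        return "";
    /* Every language row must supply every string. */
    assert(strings[lang][id] != NULL);
    return strings[lang][id];
}
```

Code that asks for STR_OPEN never mentions "Open" or "Ouvrir", so translating the program means supplying another row of the table and leaving the logic alone.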
Turbo Pascal for Windows includes a separate utility for creating, browsing
and editing most Windows resources. This is the Whitewater Resource Toolkit,
licensed from the Actor folks. Working with resources is a lot of fun, and the
WRT is beautifully designed and highly intuitive.
Separating most program display elements off as resources applies a certain
design discipline to the TPW programming process. Creating a menu structure as
a resource requires that you design your menus before you start writing your
code, which generally means that you have to specify your feature set (which
is accessed through the menus) before you start pounding Pascal into the
keyboard. This is all to the good. I expect that in very short order,
third-party prototyping systems will appear for TPW, and you'll basically draw
your application interactively, then push a button and generate OWL code to
implement the bulk of the application, including resources. Such products
exist for both C and C++, and the appearance of TPW opens up whole new markets
for Windows prototyping and resource generation tools.


Finding the Front Door


Although the Turbo Pascal for Windows documentation is highly
detailed and four inches thick, my instincts tell me that I could write
several books on the product, and probably should. It's a very detailed
subject, and although OWL manages the complexity of the Windows API to an
amazing degree, there is a limit to how much complexity you can hide without
beginning to "dumb down" the available resources of Windows itself.
Confronting a product such as TPW from a dead stop presents a feeling of
nameless dread that I call "looking for the front door." There's so much
technology there that it's far from clear what a newcomer should do first. I'm
still burrowing through it myself, and will be for some time. However, let me
offer a strategy for getting to know this thing:
Learn Windows itself first. This is critical. Make very sure that you
understand the jargon and the shape of the platform itself. Put Windows up on
your machine and use it religiously for a couple of weeks before taking TPW
out of the box. If possible, buy some sort of Windows application and use it
heavily for a while, rather than simply using old DOS applications launched
from Windows. If you don't know what a combo box is, you're going to have a
hell of a time designing one.
Install Turbo Pascal for Windows and read Chapter 1 of the User's Guide. This
will help you get in touch with the Windows-based IDE. If you already use TP6
in text mode, you'll be much of the way there; the menu structure and
general-UI principles are very similar. The rest of the User's Guide contains
tutorials on fundamental Pascal and object-oriented programming -- read them
if you need them.
Sit down with a tall pitcher of iced tea and read Chapters 1 through 13 of the
Windows Programming Guide straight through. Follow along on your machine
during the ten-step tutorial. Don't go off on your own yet. Do just what the
tutorial tells you to do. If you don't read these 13 chapters, you are lost.
Read the Whitewater Resource Toolkit User Guide from cover to cover. It's a
thin volume and won't take much time, and it explains the concept of resources
very well.
Now you're ready to hack. Take one of the example programs, make sure it
compiles and runs for you, and start tweaking it. You might fiddle with
resources before you even begin writing code. I took the BONK.PAS program
(which is an amiable video version of the old Whack-a-Mole carnival game) and
edited the mole bitmap to look like Ralph Nader. This made the game much more
satisfying. Change only one thing at a time until you start to catch on. Plan
on doing plenty of thumbing through the cavernous Windows Reference Guide.
Plan on making a lot of dumb mistakes. It's all part of the game.
Once you get started, the force of accumulated experience builds quickly. The
manual set is very good, but I did find some notable lapses. The worst of
these is that while Windows' communications port support is documented
piecemeal, nothing tells you how to put the pieces together to access the
port. Windows apparently contains its own interrupt-driven serial port code,
and will generate messages corresponding to various changes in the state of
the port (including the appearance of an incoming character) but no matter how
I arranged the pieces, the port would not come alive for me. If any of you can
tell me how to access the serial port from within Windows, please tell me so I
can explain it to everybody else.


The Start of an Era


I had expected something a little different from Turbo Pascal for Windows when
I first heard that it was in the works. I expected something higher-level, a
little more insulated from the Windows API, and a little easier to swallow in
one gulp. I expected something, in short, to meet Actor nose-to-nose,
especially since I knew that the Whitewater Resource Toolkit would be part of
the deal.
On the other hand, there already is an Actor. Why make another one? What
Borland in fact did is way more ambitious: They created a language that can do
anything with Windows that C can do, and yet be only a little more difficult
to learn than DOS-based Pascal. TPW allows you to make any API call, and
responds to any Windows message, just as you can in C--yet it gives you the
OWL library to do as much of the gritty work for you as possible. It creates
DLLs. It supports the Multiple Document Interface. It does lots of things I
don't quite understand yet. My instincts tell me clearly, however, that
nothing is missing, and nothing has been hidden away irretrievably. If Windows
can do it, TPW can make it happen.
I've used Windows since 1986, when it was still in beta test, and I've seen a
lot of SDK versions come and go. I've seen the potential in Windows, and seen
it buried beneath a monolithic hodge-podge of unmanaged and undifferentiated
detail. Actor, when I discovered it, was a delight--but Actor never really
caught on, largely for reasons of price and its proprietary nature.
Turbo Pascal for Windows is the first mainstream language (by that I mean C,
Pascal, Basic, Modula-2, and Fortran) delivered in a form that runs under
Windows, for Windows. At the risk of blathering, let me say that it is the
second-finest product that Borland has ever introduced, surpassed only by their
groundbreaking Turbo Pascal 1.0. The compiler is fast, the environment
beautiful, but OWL is the key--and if you give a hoot at all about Windows,
you should rejoice that somebody finally made it happen.


The UART Registers Dissected


In last month's column, I presented the view of the UART's register set from a
height. This month, we'll take a closer and more detailed look at some of the
registers' various bit fields and little-known lore.
In the following paragraphs I'll be describing each of the UART registers in a
little more detail. Refer to the chart in Figure 1 in last month's column for
COM port addresses and offsets for each named register.
Receive Buffer Register (RBR). When the UART has finished assembling a
character out of serial bits arriving from a remote system, it places the
completed character in RBR. You can read the character from RBR more than
once, but don't bother reading it unless the Data Ready (DR) flag in the Line
Status register (LSR) has been raised to a 1-bit, indicating that a character
is complete and ready to read in RBR. What RBR contains when DR is 0 is
undefined, and you should consider it a garbage value.
The UART has the ability to generate a hardware interrupt when a complete
character is available in RBR. In professional-quality comm software, RBR is
read only by such an interrupt service routine. We'll get into those in a
later column.
Transmit Holding Register (THR). When you want to transmit a character to a
remote system through the UART, you place the outbound character in THR. The
UART then converts the character to a stream of bits placed on the serial
port's single data line.
It is possible to stuff characters into THR faster than the UART can convert
the characters to bits and move them out to the serial port. If this happens,
you'll send bits flying all over the place and mess over your transmission in
a serious fashion. Fortunately, there is a flag in LSR called Transmit Holding
Register Empty (THRE) that indicates, in a fashion similar to DR, that THR is
empty and that a new character may safely be written to it.
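Testing the two flags comes down to masking a single bit out of the byte read from LSR. The sketch below assumes the standard 8250 bit assignments (DR in bit 0, THRE in bit 5), which come from the chip's data sheet rather than from this column; the function names are my own inventions, and the actual port read is left out because it is compiler-specific.

```c
/* Bit masks for two Line Status Register flags on an 8250-family UART.
   The bit positions are taken from the 8250 data sheet (an assumption
   of this sketch; the column itself does not list them). */
#define LSR_DR   0x01   /* bit 0: Data Ready, a character waits in RBR */
#define LSR_THRE 0x20   /* bit 5: Transmit Holding Register Empty */

/* Returns nonzero if RBR holds a complete character that may be read.
   lsr is a byte previously read from the Line Status Register. */
int lsr_data_ready(unsigned char lsr)
{
    return (lsr & LSR_DR) != 0;
}

/* Returns nonzero if THR is empty and a new character may be written. */
int lsr_thr_empty(unsigned char lsr)
{
    return (lsr & LSR_THRE) != 0;
}
```

A polled transmit loop would read LSR repeatedly until lsr_thr_empty() returns nonzero, and only then write the next character to THR.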
The UART can also generate an interrupt when THR becomes empty, and this
feature allows you to write an interrupt routine that automatically stuffs
characters from a buffer into the UART as fast as the UART can accept them.
Interrupt Enable Register (IER). The UART can operate in either
interrupt-driven or polled mode. I demonstrated polled mode last month with the
POLLTERM.PAS program. Interrupt-driven mode is infinitely more useful, if
considerably more tangled in how it must be set up.
The UART can in fact generate an interrupt on any of four different
conditions: When an incoming character is ready to be read; when the UART is
ready to send another character out; when any of four error bits in the LSR go
to a value of 1; and when any of four status bits in the Modem Status
register (MSR) change state. These four interrupts can be enabled and disabled
independently of one another, by setting the appropriate bit to a 1 value and
leaving the others at 0. (Keep in mind that interrupts must also be turned on
for the adapter as a whole by setting yet another bit called OUT2 in the Modem
Control Register. No one ever promised that life would be simple.)
I'll explain how these bits affect the interrupts in more detail when we cover
communication interrupts in a future column.
Interrupt ID Register (IIR). Because the UART can generate any of four
different interrupts for different conditions, it's possible for more than one
interrupt to be "hanging fire" at one time. The CPU can eventually service
them all, but it has to know which ones are pending at what time. When
multiple interrupts are pending, the UART prioritizes them and lets the CPU
know what's up next through a 3-bit code in IIR. Once a given pending
interrupt has been serviced and cleared, the next one in priority is reflected
in the code in the IIR. When bit 0 of IIR finally goes to 1, no more
interrupts are pending.
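As a rough illustration of how an interrupt handler might interpret that 3-bit code, here is a small decoder. The encodings are those of the 8250 data sheet as I understand them (an assumption here, since the column does not list them), and the enum and function names are hypothetical.

```c
/* Causes the UART can report, listed from highest priority to lowest.
   The names are illustrative only. */
enum UartIntCause {
    UART_INT_NONE,          /* bit 0 of IIR is 1: nothing is pending */
    UART_INT_LINE_STATUS,   /* an error or break flag was raised in LSR */
    UART_INT_RX_DATA,       /* a character is ready in RBR */
    UART_INT_THR_EMPTY,     /* THR can accept another character */
    UART_INT_MODEM_STATUS   /* a modem status line changed state */
};

/* Decodes the low three bits of a byte read from IIR.
   Bit 0 is 0 while an interrupt is pending; bits 2-1 identify it. */
enum UartIntCause decode_iir(unsigned char iir)
{
    if (iir & 0x01)
        return UART_INT_NONE;
    switch ((iir >> 1) & 0x03) {
    case 3:  return UART_INT_LINE_STATUS;
    case 2:  return UART_INT_RX_DATA;
    case 1:  return UART_INT_THR_EMPTY;
    default: return UART_INT_MODEM_STATUS;
    }
}
```

A service routine would typically call something like this in a loop, handling each cause in turn until UART_INT_NONE comes back.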
Yes, this is confusing business. Again, we'll cover all registers connected
with interrupt generation in more detail in a future column.
FIFO Control Register (FCR). This register is available only on the UART chip
present in IBM PS/2s and PS/2 compatibles. A FIFO (First In First Out) is a
register that allows you to queue up data inside the UART chip itself on both
transmit and receive. It's like making both RBR and THR 16 characters deep.
This is very handy when your interrupt service routines are complex and take a
long time to execute, as they might in a protected-mode operating system. For
DOS applications on fast machines they simply aren't necessary. (If they are,
I suspect it means you don't know how to write a terse enough interrupt
service routine.)
Unfortunately, the bulk of the PCs out there don't have the advanced UART chip
containing the FIFOs, so it's unwise to rely on their being present in any
given machine, although you can test for them. I won't be covering use of the
FIFOs in this series; if you really need them, get the manufacturer's data
sheets on the 16550 UART chip.
Line Control Register (LCR). This is a very useful register, entirely divided
into bit fields, some of which represent 2- or 3-bit binary codes. I've
summarized the different fields in Figure 1.
Bits 0 (WLS0) and 1 (WLS1) represent a 2-bit code specifying the number of
bits in the "word length" (actually, the character length) to be used in data
transmission, not counting start, parity, or stop bits. The acronyms are Word
Length Select 0 and 1. The UART can send data using any of four different
character lengths: 5, 6, 7, or 8 bits per character. Lengths of 5 and 6 bits
are rarely used anymore, and are holdovers from the bad old days of Teletype.
The bit codes corresponding to the various word lengths are shown in Figure 1.
Bit 2 (STB) controls the number of stop bits. The UART can transmit either
one, two, or (again, in a throwback to Teletype days) one-and-a-half stop
bits. If STB=0, one stop bit is used. If STB=1, two stop bits are used.
However, if WLS0 and WLS1 specify that 5 data bits are to be used, a 1 bit in
STB will specify one-and-a-half stop bits.
Bit 3 (PEN) enables and disables parity checking. When PEN=0, parity is
disabled and no parity bit is sent or expected. When PEN=1, a parity scheme is
enabled, the nature of which is dictated by bits 4 and 5.
Bit 4 (EPS) specifies "even" or "odd" parity. (If parity is disabled by
setting PEN to 0, the state of EPS is ignored.) For lack of space, I won't
explain parity in detail here, but it's a limited, character-by-character form
of data validation. If you have parity enabled and a noise pulse blasts one of
the bits in a character you're transmitting, chances are good the parity
system will detect the damaged character and issue a parity error.
Bit 5 (STP) specifies whether or not "stick parity" is to be used. Stick
parity means that the parity bit is "stuck" to either 1 or 0, irrespective of
whether or not the character meets the parity-checking algorithm. Stick parity
is actually a way of sending a parity bit without doing any parity checking,
and is rarely used anymore.
Bits 3, 4, and 5 comprise a matrix that defines all the different parity modes
that the UART can support. The matrix is included in Figure 1.
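To make the bit fields concrete, the following sketch packs a word length, stop-bit choice, and parity mode into a single LCR byte. It assumes the 8250 data sheet encodings (word length code equal to the length minus 5, EPS=1 meaning even parity, and stick parity producing a "mark" or "space" parity bit); the names make_lcr and UartParity are hypothetical. BRK and DLAB are left at 0.

```c
/* Parity modes selected by LCR bits 3 (PEN), 4 (EPS), and 5 (STP).
   Each constant is the pattern those three bits take within LCR. */
enum UartParity {
    PARITY_NONE  = 0x00,   /* PEN=0: no parity bit sent or expected */
    PARITY_ODD   = 0x08,   /* PEN=1, EPS=0 */
    PARITY_EVEN  = 0x18,   /* PEN=1, EPS=1 */
    PARITY_MARK  = 0x28,   /* PEN=1, EPS=0, STP=1: parity bit stuck at 1 */
    PARITY_SPACE = 0x38    /* PEN=1, EPS=1, STP=1: parity bit stuck at 0 */
};

/* Builds an LCR value with BRK and DLAB clear.
   data_bits must be 5..8; two_stop_bits is nonzero to set STB.
   Returns the LCR byte, or -1 if data_bits is out of range. */
int make_lcr(int data_bits, int two_stop_bits, enum UartParity parity)
{
    if (data_bits < 5 || data_bits > 8)
        return -1;
    return (data_bits - 5)                /* WLS0/WLS1 in bits 0-1 */
         | (two_stop_bits ? 0x04 : 0)     /* STB in bit 2 */
         | (int)parity;                   /* PEN/EPS/STP in bits 3-5 */
}
```

The familiar "8N1" setting comes out as make_lcr(8, 0, PARITY_NONE), which is 0x03, and "7E1" as make_lcr(7, 0, PARITY_EVEN), which is 0x1A.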
Bit 6 (BRK) is used to carry a signal to the UART that a "break" condition is
to be created. To form a break condition, the data line is forced to the space
condition without changing for a period of time longer than one character.
Your program is responsible for setting the time; as long as BRK is set to 1,
a break condition will be maintained. You set it, and you clear it--or it
doesn't get cleared.
Bit 7 (DLAB) is the Divisor Latch Access Bit. This bit arbitrates between the
two uses of the registers at offsets 0 and 1 from the UART base address. Its
value defaults to 0. Normally, the register at offset 0 is RBR when read and
THR when written, and the register at offset 1 is IER. However, when DLAB is
set to 1, the register at offset 0 is used to access the low byte of the
divisor value, and the register at offset 1 is used to access the high byte of
the divisor value. The divisor value actually sets the baud rate; it is a
constant by which an internal clock is divided to produce the master series of
pulses that the UART modulates into serial characters.
I'll have more to say about baud rates and divisors next issue. I had hoped to
cover all of the registers in one column, but Turbo Pascal for Windows
intervened. Ahh, so much technology--so little time!


Products Mentioned


Turbo Pascal for Windows
Borland International
1800 Green Hills Road
Scotts Valley, CA 95066
408-438-8400
$249.95








June, 1991
GRAPHICS PROGRAMMING


Of Songs, Taxes, and the Simplicity of Complex Polygons




Michael Abrash


Every so often, my daughter asks me to sing her to sleep. (If you've ever
heard me sing, this may cause you concern about either her hearing or her
judgement, but love knows no bounds.) As any parent is well aware, singing a
young child to sleep can easily take several hours, or until sunrise,
whichever comes last. One night, running low on children's songs, I switched to a
Beatles medley, and at long last her breathing became slow and regular. At the
end, I softly sang "A Hard Day's Night," then quietly stood up to leave. As I
tiptoed out, she said, in a voice not even faintly tinged with sleep, "Dad,
what do they mean, 'working like a dog'? Chasing a stick? That doesn't make
sense; people don't chase sticks."
That led us into a discussion of idioms, which made about as much sense to her
as an explanation of quantum mechanics. Finally, I fell back on my standard
explanation of the Universe, which is that a lot of the time it doesn't make
sense.
As a general principle, that explanation holds up remarkably well. (In fact,
having just done my taxes, I think Earth is actually run by blob-creatures
from the planet Mrxx, who are helplessly doubled over with laughter at the
ridiculous things they can make us do. "Let's make them get Social Security
numbers for their pets next year!" they're saying right now, gasping for
breath.) Occasionally, however, one has the rare pleasure of finding a corner
of the Universe that makes sense, where everything fits together as if
preordained.
Filling arbitrary polygons is such a case.


Filling Arbitrary Polygons


In the February column, I described three types of polygons: convex,
nonconvex, and complex. The RenderMan Companion, which I mentioned last month,
has an intuitive definition of convex: If a rubber band stretched around a
polygon touches all vertices in the order they're defined, then the polygon is
convex. If a polygon has intersecting edges, it's complex. If a polygon
doesn't have intersecting edges but isn't convex, it's nonconvex. Nonconvex is
a special case of complex, and convex is a special case of nonconvex. (Which,
I'm well aware, makes nonconvex a lousy name--noncomplex would have been
better--but I'm following X Window System nomenclature here.)
The reason for distinguishing between these three types of polygons is that
the more specialized types can be filled with markedly faster approaches.
Complex polygons require the slowest approach; however, that approach will
serve to fill any polygon of any sort. Nonconvex polygons require less
sorting, because edges never cross. Convex polygons can be filled fastest of
all by simply scanning the two sides of the polygon, as we saw in March.
Before we dive into complex polygon filling, I'd like to point out that the
code in this article, like all polygon filling code I've ever seen, requires
that the caller describe the type of the polygon to be filled. Often, however,
the caller doesn't know what type of polygon it's passing, or specifies
complex for simplicity, because that will work for all polygons; in such a
case, the polygon filler will use the slow complex-fill code even if the
polygon is, in fact, a convex polygon.
Although I've never seen it mentioned anywhere, it is reasonably easy to
determine whether a polygon specified as complex or nonconvex is actually
convex. The best technique I've come up with is tracing around the polygon's
boundary, counting the number of times that the boundary reverses X and Y
directions. If the boundary reverses both directions no more than twice, the
polygon is convex. Whether the faster drawing of convex polygons justifies the
extra time required to count X and Y reversals depends on both the
implementation and the number of polygons in any particular application which
are specified as complex/nonconvex but are actually convex.


Active Edges


The basic premise of filling a complex polygon is that for a given scan line,
we determine all intersections between the polygon's edges and that scan line
and then fill the spans between the intersections, as shown in Figure 1.
(Section 3.6 of Computer Graphics, second edition, by Foley and van Dam
provides an overview of this and other aspects of polygon filling.) There are
several rules that might be used to determine which spans are drawn and which
aren't; we'll use the odd/even rule, which specifies that drawing turns on
after odd-numbered intersections (first, third, and so on) and off after
even-numbered intersections.
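Here is a minimal sketch of the odd/even rule for one scan line, assuming the intersections have already been found and sorted by X. The Span structure and the function name are hypothetical, and treating each span as covering left up to but not including right is my own assumption for the sketch, not a statement of how the listings handle boundaries.

```c
#include <stddef.h>

/* One filled run on a scan line, covering left <= X < right. */
struct Span {
    int left;
    int right;
};

/* Pairs up the n X-sorted intersections in x[] by the odd/even rule:
   drawing turns on at the 1st, 3rd, ... intersection and off at the
   2nd, 4th, and so on. Writes the non-empty spans to out[], which must
   have room for n/2 entries, and returns how many were written. */
size_t odd_even_spans(const int *x, size_t n, struct Span *out)
{
    size_t i, count = 0;

    for (i = 0; i + 1 < n; i += 2) {
        if (x[i + 1] > x[i]) {       /* skip zero-width spans */
            out[count].left = x[i];
            out[count].right = x[i + 1];
            count++;
        }
    }
    return count;
}
```

With intersections at 2, 5, 5, 9, 12, and 12, this yields two spans, 2 to 5 and 5 to 9; the last pair encloses nothing and is dropped.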
The question then becomes how we can most efficiently determine which edges
cross each scan line and where. As it happens, there is a great deal of
coherence from one scan line to the next in a polygon edge list, because each
edge starts at a given Y coordinate and continues unbroken until it ends. In
other words, edges don't leap about and stop and start randomly; the X
coordinate of an edge at one scan line is a consistent delta from that edge's
X coordinate at the last scan line, and that is consistent for the length of
the line.
This allows us to reduce the number of edges that must be checked for
intersection; on any given scan line, we only need to check for intersections
with the currently active edges--edges that start on that scan line, plus all
edges that start on earlier (above) scan lines and haven't ended yet--as shown
in Figure 2. This suggests that we can proceed from the top scan line of the
polygon to the bottom, keeping a running list of currently active
edges--called the Active Edge Table (AET)--with the edges sorted in order of
ascending X coordinate of intersection with the current scan line. Then we can
simply fill each scan line in turn according to the list of active edges at
that line.
Maintaining the AET from one scan line to the next involves three steps.
First, we must add to the AET any edges that start on the current scan line,
making sure to keep the AET X-sorted for efficient odd/even scanning. Second,
we must remove edges that end on the current scan line. Third, we must advance
the X coordinates of active edges with the same sort of error term-based,
Bresenham's-like approach we used for convex polygons, again ensuring that the
AET is X-sorted after advancing the edges.
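The third step, advancing an edge's X coordinate with an error term, can be sketched in isolation as follows. This is a simplified stand-in rather than the code in Listing One: the structure and function names are hypothetical, and it folds the Y-major and X-major cases into a single divide-and-remainder setup.

```c
#include <stdlib.h>

/* Hypothetical per-edge stepping state for this sketch. */
struct EdgeStep {
    int x;           /* X coordinate on the current scan line */
    int x_dir;       /* +1 if X increases going down the edge, else -1 */
    int whole_move;  /* whole pixels X moves on every scan line */
    int err;         /* running error term */
    int err_up;      /* added to err on every scan line */
    int err_down;    /* subtracted from err when X takes an extra step */
};

/* Sets up stepping for an edge from (x0,y0) down to (x1,y1).
   Requires y1 > y0, that is, a nonzero-height, top-to-bottom edge. */
void edge_init(struct EdgeStep *e, int x0, int y0, int x1, int y1)
{
    int dx = x1 - x0;
    int dy = y1 - y0;
    int width = abs(dx);

    e->x = x0;
    e->x_dir = (dx > 0) ? 1 : -1;
    e->whole_move = (width / dy) * e->x_dir;
    e->err_up = width % dy;
    e->err_down = dy;
    /* Right-to-left edges start with a biased error term */
    e->err = (dx >= 0) ? 0 : -dy + 1;
}

/* Advances the edge by exactly one scan line. */
void edge_advance(struct EdgeStep *e)
{
    e->x += e->whole_move;
    e->err += e->err_up;
    if (e->err > 0) {            /* fractional part has overflowed */
        e->x += e->x_dir;
        e->err -= e->err_down;
    }
}
```

For an edge from (0,0) to (3,6), six calls to edge_advance() give X values of 1, 1, 2, 2, 3, and 3. No multiplication or division is needed once the edge has been set up.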
Advancing the X coordinates is easy. For each edge, we'll store the current X
coordinate and all required error term information, and we'll use that to
advance the edge one scan line at a time; then we'll re-sort the AET by X
coordinate as needed. Removing edges as they end is also easy; we'll just
count down the length of each active edge on each scan line and remove an edge
when its count reaches zero. Adding edges as their tops are encountered is a
tad more complex. While there are a number of ways to do this, one
particularly efficient approach is to start out by putting all the edges of
the polygon, sorted by increasing Y coordinate, into a single list, called the
Global Edge Table (GET). Then, as each scan line is encountered, all edges at
the start of the GET that begin on the current scan line are moved to the AET;
because the GET is Y-sorted, there's no need to search the entire GET. For
still greater efficiency, edges in the GET that share common Y coordinates can
be sorted by increasing X coordinate; this ensures that no more than one pass
through the AET per scan line is ever needed when adding new edges from the
GET in such a way as to keep the AET sorted in ascending X order.
What form should the GET and AET take? Linked lists of edge structures, as
shown in Figure 3. With linked lists, all that's required to move edges from
the GET to the AET as they become active, sort the AET, and remove edges that
have been fully drawn is the exchanging of a few pointers.
In summary, we'll initially store all the polygon edges in
Y-primary/X-secondary sort order in the GET, complete with initial X and Y
coordinates, error terms and error term adjustments, lengths, and directions
of X movement for each edge. Once the GET is built, we'll do the following:
1. Set the current Y coordinate to the Y coordinate of the first edge in the
GET.
2. Move all edges with the current Y coordinate from the GET to the AET,
removing them from the GET and maintaining the X-sorted order of the AET.
3. Draw all odd-to-even spans in the AET at the current Y coordinate.
4. Count down the lengths of all edges in the AET, removing any edges that are
done, and advancing the X coordinates of all remaining edges in the AET by one
scan line.
5. Sort the AET in order of ascending X coordinate.
6. Advance the current Y coordinate by one scan line.
7. If either the AET or GET isn't empty, go to step 2.
That's really all there is to it. Compare Listing One (page 154) to the fast
convex polygon filling code from March, and you'll see that, contrary to
expectation, complex polygon filling is indeed one of the more sane and
sensible corners of the universe.


Complex Polygon Filling: An Implementation


Listing One shows a function, FillPolygon, that fills polygons of all shapes.
If CONVEX_CODE_LINKED is defined, then the fast convex fill code from March is
linked in and used to draw convex polygons. Otherwise, convex polygons are
handled as if they were complex. Nonconvex polygons are also handled as
complex, although this is not necessary, as discussed shortly.
Listing One is a faithful implementation of the complex polygon filling
approach just described, with separate functions corresponding to each of the
tasks, such as building the GET and X-sorting the AET. Listing Two (page 156)
provides the actual drawing code used to fill spans, built on a draw pixel
routine that is the only hardware dependency in the C code. Listing Three
(page 156) is the header file for the polygon filling code; note that it is an
expanded version of the header file used by the fast convex polygon fill code
from March. Listing Four (page 156) is a sample program that, when linked to
Listings One and Two, demonstrates drawing polygons of various sorts.
Listing Four illustrates several interesting aspects of polygon filling. The
first and third polygons drawn illustrate the operation of the odd/even fill
rule. The second polygon drawn illustrates how holes can be created in
seemingly solid objects; an edge runs from the outside of the rectangle to the
inside, the edges comprising the hole are defined, and then the same edge is
used to move back to the outside; because the edges join seamlessly, the
rectangle appears to form a solid boundary around the hole.
The set of V-shaped polygons drawn by Listing Four demonstrates that polygons
sharing common edges meet but do not overlap. This characteristic, which I
discussed at length several months back, is not a trivial matter; it allows
polygons to fit together without fear of overlapping or missed pixels. Listing
One reflects a fairly complex rule for drawing pixels on polygon boundaries
that I have devised. It's not essential that I detail that rule or its
implementation in Listing One (which is fortunate, for I lack the space to do
so), but it's important that you know that it exists, and that, as a result,
Listing One should always fill polygons so that common boundaries and vertices
are drawn once and only once. This has the side-effect for any individual
polygon of not drawing pixels that lie exactly on top or right boundaries or
at certain vertices.
By the way, I have not seen polygon boundary filling handled this way
elsewhere. The boundary filling approach in Foley and van Dam is similar, but
seems to me to not draw all boundary and vertex pixels once and only once.


More On Active Edges



Edges of zero height--horizontal edges and edges defined by two vertices at
the same location--never even make it into the GET in Listing One. A polygon
edge of zero height can never be an active edge, because it can never
intersect a scan line; it can only run along the scan line, and the span it
runs along is defined not by that edge but by the edges that connect to its
endpoints.


Performance Considerations


How fast is Listing One? When drawing triangles on a 20-MHz 386, it's less
than one-fifth the speed of the fast convex polygon fill code. However, most
of that time is spent drawing individual pixels; when Listing Two is replaced
with the fast assembler line segment drawing code in Listing Five (page 157),
performance improves by two and one-half times, to about half as fast as the
fast convex fill code. Even after conversion to assembly in Listing Five,
DrawHorizontalLineSeg still takes more than half of the total execution time,
and the remaining time is spread out fairly evenly over the various
subroutines in Listing One. Consequently, there's no single place in which
it's possible to greatly improve performance, and the maximum additional
improvement that's possible is clearly considerably less than two times; for
that reason, and because of space limitations, I'm not going to convert the
rest of the code to assembly. However, when filling a polygon with a great
many edges, and especially one with a great many active edges at one time,
relatively more time would be spent traversing the linked lists. Then
conversion to assembly (which actually lends itself very nicely to linked list
processing) could pay off reasonably well.
The algorithm used to X-sort the AET is an interesting performance
consideration. Listing One uses a bubble sort, usually a poor choice for
performance. However, bubble sorts perform well when the data are already
almost sorted, and because of the X coherence of edges from one scan line to
another, that's generally the case with the AET. An insertion sort might be
somewhat faster, depending on the state of the AET when any particular sort
occurs, but a bubble sort will generally do just fine.
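To see why nearly sorted data is kind to a bubble sort, consider this array-based sketch, which counts how many passes it makes before a pass completes with no swaps. It works on a plain array of X values rather than the linked list of Listing One, and the function name is hypothetical.

```c
/* Sorts x[0..n-1] into ascending order with an early-exit bubble sort.
   Returns the number of passes made over the array, including the
   final pass that finds nothing left to swap. */
int bubble_sort_passes(int *x, int n)
{
    int passes = 0;
    int swapped, i, temp;

    do {
        swapped = 0;
        for (i = 0; i + 1 < n; i++) {
            if (x[i] > x[i + 1]) {
                /* Adjacent pair is out of order; exchange it */
                temp = x[i];
                x[i] = x[i + 1];
                x[i + 1] = temp;
                swapped = 1;
            }
        }
        passes++;
    } while (swapped);
    return passes;
}
```

An array such as 10, 20, 40, 30, 50, in which two neighbors have traded places (much like two edges that crossed between scan lines), is finished after two passes. The same five values in fully reversed order take five.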
An insertion sort that scans backward through the AET from the current edge
rather than forward from the start of the AET could be quite a bit faster,
because edges rarely move more than one or two positions through the AET.
However, scanning backward requires a doubly linked list, rather than the
singly linked list used in Listing One. I've chosen to use a singly linked
list partly to minimize memory requirements (double-linking requires an extra
pointer field) and partly because supporting back links would complicate the
code a good bit. The main reason, though, is that the potential rewards for
the complications of back links and insertion sorting aren't great enough;
profiling a variety of polygons reveals that less than ten percent of total
time is spent sorting the AET. The potential 1-5 percent speedup gained by
optimizing AET sorting just isn't worth it in any but the most demanding
application -- a good example of the need to keep an overall perspective when
comparing the theoretical characteristics of various approaches.


Nonconvex Polygons


Nonconvex polygons can be filled somewhat faster than complex polygons.
Because edges never cross or switch positions with other edges once they're in
the AET, the AET for a nonconvex polygon needs to be sorted only when new
edges are added. In order for this to work, though, edges must be added to the
AET in strict left-to-right order. Complications arise when dealing with two
edges that start at the same point, because slopes must be compared to
determine which edge is leftmost. This is certainly doable, but because of
space limitations and limited performance returns, I haven't implemented this
in Listing One.
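For what it's worth, the slope comparison itself needs no division. Here is one way it might look, assuming both edges start at the same vertex and each is described by its X change (dx) over a positive Y extent (dy); the function name and parameters are hypothetical and do not appear in Listing One.

```c
/* Returns nonzero if edge A lies to the left of edge B on the scan
   lines just below their shared starting vertex. Each edge moves dx
   pixels in X over dy scan lines; both dy values must be positive.
   Compares dx_a/dy_a with dx_b/dy_b by cross-multiplying, using long
   arithmetic to reduce the risk of overflow with 16-bit ints. */
int edge_is_left_of(int dx_a, int dy_a, int dx_b, int dy_b)
{
    return (long)dx_a * dy_b < (long)dx_b * dy_a;
}
```

An edge that moves 3 pixels left over 5 scan lines is left of one that moves 2 pixels right over the same distance, so edge_is_left_of(-3, 5, 2, 5) returns nonzero. Edges with equal slopes compare as not-left in either order.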


Coming Up


Next time, we may do some 256-color animation. Or we may poke into the innards
of the new 15-bpp VGAs. Or perhaps we'll take a look at RenderMan. Who knows?
If you have any preferences, by all means drop me a line.

_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash


[LISTING ONE]

/* Color-fills an arbitrarily-shaped polygon described by VertexList.
If the first and last points in VertexList are not the same, the path
around the polygon is automatically closed. All vertices are offset
by (XOffset, YOffset). Returns 1 for success, 0 if memory allocation
failed. All C code tested with Turbo C++.
If the polygon shape is known in advance, speedier processing may be
enabled by specifying the shape as follows: "convex" - a rubber band
stretched around the polygon would touch every vertex in order;
"nonconvex" - the polygon is not self-intersecting, but need not be
convex; "complex" - the polygon may be self-intersecting, or, indeed,
any sort of polygon at all. Complex will work for all polygons; convex
is fastest. Undefined results will occur if convex is specified for a
nonconvex or complex polygon.
Define CONVEX_CODE_LINKED if the fast convex polygon filling code from
the February 1991 column is linked in. Otherwise, convex polygons are
handled by the complex polygon filling code.
Nonconvex is handled as complex in this implementation. See text for a
discussion of faster nonconvex handling */

#include <stdio.h>
#include <math.h>
#include <stdlib.h> /* abs() */
#ifdef __TURBOC__
#include <alloc.h>
#else /* MSC */
#include <malloc.h>
#endif
#include "polygon.h"

#define SWAP(a,b) {temp = a; a = b; b = temp;}


struct EdgeState {
 struct EdgeState *NextEdge;
 int X;
 int StartY;
 int WholePixelXMove;
 int XDirection;
 int ErrorTerm;
 int ErrorTermAdjUp;
 int ErrorTermAdjDown;
 int Count;
};

extern void DrawHorizontalLineSeg(int, int, int, int);
extern int FillConvexPolygon(struct PointListHeader *, int, int, int);
static void BuildGET(struct PointListHeader *, struct EdgeState *,
 int, int);
static void MoveXSortedToAET(int);
static void ScanOutAET(int, int);
static void AdvanceAET(void);
static void XSortAET(void);

/* Pointers to global edge table (GET) and active edge table (AET) */
static struct EdgeState *GETPtr, *AETPtr;

int FillPolygon(struct PointListHeader * VertexList, int Color,
 int PolygonShape, int XOffset, int YOffset)
{
 struct EdgeState *EdgeTableBuffer;
 int CurrentY;

#ifdef CONVEX_CODE_LINKED
 /* Pass convex polygons through to fast convex polygon filler */
 if (PolygonShape == CONVEX)
 return(FillConvexPolygon(VertexList, Color, XOffset, YOffset));
#endif

 /* It takes a minimum of 3 vertices to cause any pixels to be
 drawn; reject polygons that are guaranteed to be invisible */
 if (VertexList->Length < 3)
 return(1);
 /* Get enough memory to store the entire edge table */
 if ((EdgeTableBuffer =
 (struct EdgeState *) (malloc(sizeof(struct EdgeState) *
 VertexList->Length))) == NULL)
 return(0); /* couldn't get memory for the edge table */
 /* Build the global edge table */
 BuildGET(VertexList, EdgeTableBuffer, XOffset, YOffset);
 /* Scan down through the polygon edges, one scan line at a time,
 so long as at least one edge remains in either the GET or AET */
 AETPtr = NULL; /* initialize the active edge table to empty */
 if (GETPtr != NULL) /* GET is empty if every edge had 0 height */
 CurrentY = GETPtr->StartY; /* start at the top polygon vertex */
 while ((GETPtr != NULL) || (AETPtr != NULL)) {
 MoveXSortedToAET(CurrentY); /* update AET for this scan line */
 ScanOutAET(CurrentY, Color); /* draw this scan line from AET */
 AdvanceAET(); /* advance AET edges 1 scan line */
 XSortAET(); /* resort on X */
 CurrentY++; /* advance to the next scan line */
 }

 /* Release the memory we've allocated and we're done */
 free(EdgeTableBuffer);
 return(1);
}

/* Creates a GET in the buffer pointed to by NextFreeEdgeStruc from
 the vertex list. Edge endpoints are flipped, if necessary, to
 guarantee all edges go top to bottom. The GET is sorted primarily
 by ascending Y start coordinate, and secondarily by ascending X
 start coordinate within edges with common Y coordinates */
static void BuildGET(struct PointListHeader * VertexList,
 struct EdgeState * NextFreeEdgeStruc, int XOffset, int YOffset)
{
 int i, StartX, StartY, EndX, EndY, DeltaY, DeltaX, Width, temp;
 struct EdgeState *NewEdgePtr;
 struct EdgeState *FollowingEdge, **FollowingEdgeLink;
 struct Point *VertexPtr;

 /* Scan through the vertex list and put all non-0-height edges into
 the GET, sorted by increasing Y start coordinate */
 VertexPtr = VertexList->PointPtr; /* point to the vertex list */
 GETPtr = NULL; /* initialize the global edge table to empty */
 for (i = 0; i < VertexList->Length; i++) {
 /* Calculate the edge height and width */
 StartX = VertexPtr[i].X + XOffset;
 StartY = VertexPtr[i].Y + YOffset;
 /* The edge runs from the current point to the previous one */
 if (i == 0) {
 /* Wrap back around to the end of the list */
 EndX = VertexPtr[VertexList->Length-1].X + XOffset;
 EndY = VertexPtr[VertexList->Length-1].Y + YOffset;
 } else {
 EndX = VertexPtr[i-1].X + XOffset;
 EndY = VertexPtr[i-1].Y + YOffset;
 }
 /* Make sure the edge runs top to bottom */
 if (StartY > EndY) {
 SWAP(StartX, EndX);
 SWAP(StartY, EndY);
 }
 /* Skip if this can't ever be an active edge (has 0 height) */
 if ((DeltaY = EndY - StartY) != 0) {
 /* Allocate space for this edge's info, and fill in the
 structure */
 NewEdgePtr = NextFreeEdgeStruc++;
 NewEdgePtr->XDirection = /* direction in which X moves */
 ((DeltaX = EndX - StartX) > 0) ? 1 : -1;
 Width = abs(DeltaX);
 NewEdgePtr->X = StartX;
 NewEdgePtr->StartY = StartY;
 NewEdgePtr->Count = DeltaY;
 NewEdgePtr->ErrorTermAdjDown = DeltaY;
 if (DeltaX >= 0) /* initial error term going L->R */
 NewEdgePtr->ErrorTerm = 0;
 else /* initial error term going R->L */
 NewEdgePtr->ErrorTerm = -DeltaY + 1;
 if (DeltaY >= Width) { /* Y-major edge */
 NewEdgePtr->WholePixelXMove = 0;
 NewEdgePtr->ErrorTermAdjUp = Width;

 } else { /* X-major edge */
 NewEdgePtr->WholePixelXMove =
 (Width / DeltaY) * NewEdgePtr->XDirection;
 NewEdgePtr->ErrorTermAdjUp = Width % DeltaY;
 }
 /* Link the new edge into the GET so that the edge list is
 still sorted by Y coordinate, and by X coordinate for all
 edges with the same Y coordinate */
 FollowingEdgeLink = &GETPtr;
 for (;;) {
 FollowingEdge = *FollowingEdgeLink;
 if ((FollowingEdge == NULL) ||
 (FollowingEdge->StartY > StartY) ||
 ((FollowingEdge->StartY == StartY) &&
 (FollowingEdge->X >= StartX))) {
 NewEdgePtr->NextEdge = FollowingEdge;
 *FollowingEdgeLink = NewEdgePtr;
 break;
 }
 FollowingEdgeLink = &FollowingEdge->NextEdge;
 }
 }
 }
}

/* Sorts all edges currently in the active edge table into ascending
 order of current X coordinates */
static void XSortAET() {
 struct EdgeState *CurrentEdge, **CurrentEdgePtr, *TempEdge;
 int SwapOccurred;

 /* Scan through the AET and swap any adjacent edges for which the
 second edge is at a lower current X coord than the first edge.
 Repeat until no further swapping is needed */
 if (AETPtr != NULL) {
 do {
 SwapOccurred = 0;
 CurrentEdgePtr = &AETPtr;
 while ((CurrentEdge = *CurrentEdgePtr)->NextEdge != NULL) {
 if (CurrentEdge->X > CurrentEdge->NextEdge->X) {
 /* The second edge has a lower X than the first;
 swap them in the AET */
 TempEdge = CurrentEdge->NextEdge->NextEdge;
 *CurrentEdgePtr = CurrentEdge->NextEdge;
 CurrentEdge->NextEdge->NextEdge = CurrentEdge;
 CurrentEdge->NextEdge = TempEdge;
 SwapOccurred = 1;
 }
 CurrentEdgePtr = &(*CurrentEdgePtr)->NextEdge;
 }
 } while (SwapOccurred != 0);
 }
}

/* Advances each edge in the AET by one scan line.
 Removes edges that have been fully scanned. */
static void AdvanceAET() {
 struct EdgeState *CurrentEdge, **CurrentEdgePtr;


 /* Count down and remove or advance each edge in the AET */
 CurrentEdgePtr = &AETPtr;
 while ((CurrentEdge = *CurrentEdgePtr) != NULL) {
 /* Count off one scan line for this edge */
 if ((--(CurrentEdge->Count)) == 0) {
 /* This edge is finished, so remove it from the AET */
 *CurrentEdgePtr = CurrentEdge->NextEdge;
 } else {
 /* Advance the edge's X coordinate by minimum move */
 CurrentEdge->X += CurrentEdge->WholePixelXMove;
 /* Determine whether it's time for X to advance one extra */
 if ((CurrentEdge->ErrorTerm +=
 CurrentEdge->ErrorTermAdjUp) > 0) {
 CurrentEdge->X += CurrentEdge->XDirection;
 CurrentEdge->ErrorTerm -= CurrentEdge->ErrorTermAdjDown;
 }
 CurrentEdgePtr = &CurrentEdge->NextEdge;
 }
 }
}

/* Moves all edges that start at the specified Y coordinate from the
 GET to the AET, maintaining the X sorting of the AET. */
static void MoveXSortedToAET(int YToMove) {
 struct EdgeState *AETEdge, **AETEdgePtr, *TempEdge;
 int CurrentX;

 /* The GET is Y sorted. Any edges that start at the desired Y
 coordinate will be first in the GET, so we'll move edges from
 the GET to AET until the first edge left in the GET is no longer
 at the desired Y coordinate. Also, the GET is X sorted within
 each Y coordinate, so each successive edge we add to the AET is
 guaranteed to belong later in the AET than the one just added */
 AETEdgePtr = &AETPtr;
 while ((GETPtr != NULL) && (GETPtr->StartY == YToMove)) {
 CurrentX = GETPtr->X;
 /* Link the new edge into the AET so that the AET is still
 sorted by X coordinate */
 for (;;) {
 AETEdge = *AETEdgePtr;
 if ((AETEdge == NULL) || (AETEdge->X >= CurrentX)) {
 TempEdge = GETPtr->NextEdge;
 *AETEdgePtr = GETPtr; /* link the edge into the AET */
 GETPtr->NextEdge = AETEdge;
 AETEdgePtr = &GETPtr->NextEdge;
 GETPtr = TempEdge; /* unlink the edge from the GET */
 break;
 } else {
 AETEdgePtr = &AETEdge->NextEdge;
 }
 }
 }
}

/* Fills the scan line described by the current AET at the specified Y
 coordinate in the specified color, using the odd/even fill rule */
static void ScanOutAET(int YToScan, int Color) {
 int LeftX;
 struct EdgeState *CurrentEdge;


 /* Scan through the AET, drawing line segments as each pair of edge
 crossings is encountered. The nearest pixel on or to the right
 of left edges is drawn, and the nearest pixel to the left of but
 not on right edges is drawn */
 CurrentEdge = AETPtr;
 while (CurrentEdge != NULL) {
 LeftX = CurrentEdge->X;
 CurrentEdge = CurrentEdge->NextEdge;
 DrawHorizontalLineSeg(YToScan, LeftX, CurrentEdge->X-1, Color);
 CurrentEdge = CurrentEdge->NextEdge;
 }
}
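
The error-term arithmetic that BuildGET sets up and AdvanceAET applies can be
tried in isolation. What follows is only a minimal sketch, not part of the
original listing: struct Edge, InitEdge, and StepEdge are hypothetical names
for a pared-down EdgeState holding a single edge, and the sketch assumes the
edge has already been ordered top to bottom with a nonzero height.

```c
#include <stdlib.h>

/* Hypothetical, pared-down stand-in for the listing's EdgeState */
struct Edge {
    int X;                /* current X coordinate */
    int Count;            /* scan lines left to step */
    int XDirection;       /* 1 if X moves right, -1 otherwise */
    int WholePixelXMove;  /* minimum X move per scan line */
    int ErrorTerm;
    int ErrorTermAdjUp;
    int ErrorTermAdjDown;
};

/* Set up one edge the way BuildGET does. Assumes EndY > StartY. */
static void InitEdge(struct Edge *e, int StartX, int StartY,
                     int EndX, int EndY)
{
    int DeltaX = EndX - StartX;
    int DeltaY = EndY - StartY;
    int Width = abs(DeltaX);

    e->X = StartX;
    e->Count = DeltaY;
    e->XDirection = (DeltaX > 0) ? 1 : -1;
    e->ErrorTermAdjDown = DeltaY;
    e->ErrorTerm = (DeltaX >= 0) ? 0 : -DeltaY + 1;
    if (DeltaY >= Width) {          /* Y-major edge */
        e->WholePixelXMove = 0;
        e->ErrorTermAdjUp = Width;
    } else {                        /* X-major edge */
        e->WholePixelXMove = (Width / DeltaY) * e->XDirection;
        e->ErrorTermAdjUp = Width % DeltaY;
    }
}

/* Advance the edge by one scan line, as AdvanceAET does for each edge.
   Returns 0 once the edge has been fully scanned, 1 otherwise. */
static int StepEdge(struct Edge *e)
{
    if (--e->Count == 0)
        return 0;
    e->X += e->WholePixelXMove;
    if ((e->ErrorTerm += e->ErrorTermAdjUp) > 0) {
        e->X += e->XDirection;
        e->ErrorTerm -= e->ErrorTermAdjDown;
    }
    return 1;
}
```

Calling InitEdge for an edge from (0,0) to (10,4) and then StepEdge until it
returns 0 gives X values of 0, 3, 5, and 8 on the four scan lines the edge
covers. The bottom scan line is never reached, which matches the listing's
handling of Count.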






[LISTING TWO]

/* Draws all pixels in the horizontal line segment passed in, from
 (LeftX,Y) to (RightX,Y), in the specified color in mode 13h, the
 VGA's 320x200 256-color mode. Both LeftX and RightX are drawn. No
 drawing will take place if LeftX > RightX. */

#include <dos.h>
#include "polygon.h"

#define SCREEN_WIDTH 320
#define SCREEN_SEGMENT 0xA000

static void DrawPixel(int, int, int);

void DrawHorizontalLineSeg(int Y, int LeftX, int RightX, int Color) {
 int X;

 /* Draw each pixel in the horizontal line segment, starting with
 the leftmost one */
 for (X = LeftX; X <= RightX; X++)
 DrawPixel(X, Y, Color);
}

/* Draws the pixel at (X, Y) in color Color in VGA mode 13h */
static void DrawPixel(int X, int Y, int Color) {
 unsigned char far *ScreenPtr;

#ifdef __TURBOC__
 ScreenPtr = MK_FP(SCREEN_SEGMENT, Y * SCREEN_WIDTH + X);
#else /* MSC 5.0 */
 FP_SEG(ScreenPtr) = SCREEN_SEGMENT;
 FP_OFF(ScreenPtr) = Y * SCREEN_WIDTH + X;
#endif
 *ScreenPtr = (unsigned char) Color;
}







[LISTING THREE]

/* POLYGON.H: Header file for polygon-filling code */

#define CONVEX 0
#define NONCONVEX 1
#define COMPLEX 2

/* Describes a single point (used for a single vertex) */
struct Point {
 int X; /* X coordinate */
 int Y; /* Y coordinate */
};
/* Describes a series of points (used to store a list of vertices that
 describe a polygon; each vertex connects to the two adjacent
 vertices; the last vertex is assumed to connect to the first) */
struct PointListHeader {
 int Length; /* # of points */
 struct Point * PointPtr; /* pointer to list of points */
};
/* Describes the beginning and ending X coordinates of a single
 horizontal line (used only by fast polygon fill code) */
struct HLine {
 int XStart; /* X coordinate of leftmost pixel in line */
 int XEnd; /* X coordinate of rightmost pixel in line */
};
/* Describes a Length-long series of horizontal lines, all assumed to
 be on contiguous scan lines starting at YStart and proceeding
 downward (used to describe a scan-converted polygon to the
 low-level hardware-dependent drawing code) (used only by fast
 polygon fill code) */
struct HLineList {
 int Length; /* # of horizontal lines */
 int YStart; /* Y coordinate of topmost line */
 struct HLine * HLinePtr; /* pointer to list of horz lines */
};







[LISTING FOUR]

/* Sample program to exercise the polygon-filling routines */

#include <conio.h>
#include <dos.h>
#include "polygon.h"

#define DRAW_POLYGON(PointList,Color,Shape,X,Y) \
 Polygon.Length = sizeof(PointList)/sizeof(struct Point); \
 Polygon.PointPtr = PointList; \
 FillPolygon(&Polygon, Color, Shape, X, Y);

void main(void);

extern int FillPolygon(struct PointListHeader *, int, int, int, int);

void main() {
 int i, j;
 struct PointListHeader Polygon;
 static struct Point Polygon1[] =
 {{0,0},{100,150},{320,0},{0,200},{220,50},{320,200}};
 static struct Point Polygon2[] =
 {{0,0},{320,0},{320,200},{0,200},{0,0},{50,50},
 {270,50},{270,150},{50,150},{50,50}};
 static struct Point Polygon3[] =
 {{0,0},{10,0},{105,185},{260,30},{15,150},{5,150},{5,140},
 {260,5},{300,5},{300,15},{110,200},{100,200},{0,10}};
 static struct Point Polygon4[] =
 {{0,0},{30,-20},{30,0},{0,20},{-30,0},{-30,-20}};
 static struct Point Triangle1[] = {{30,0},{15,20},{0,0}};
 static struct Point Triangle2[] = {{30,20},{15,0},{0,20}};
 static struct Point Triangle3[] = {{0,20},{20,10},{0,0}};
 static struct Point Triangle4[] = {{20,20},{20,0},{0,10}};
 union REGS regset;

 /* Set the display to VGA mode 13h, 320x200 256-color mode */
 regset.x.ax = 0x0013;
 int86(0x10, &regset, &regset);

 /* Draw three complex polygons */
 DRAW_POLYGON(Polygon1, 15, COMPLEX, 0, 0);
 getch(); /* wait for a keypress */
 DRAW_POLYGON(Polygon2, 5, COMPLEX, 0, 0);
 getch(); /* wait for a keypress */
 DRAW_POLYGON(Polygon3, 3, COMPLEX, 0, 0);
 getch(); /* wait for a keypress */

 /* Draw some adjacent nonconvex polygons */
 for (i=0; i<5; i++) {
 for (j=0; j<8; j++) {
 DRAW_POLYGON(Polygon4, 16+i*8+j, NONCONVEX, 40+(i*60),
 30+(j*20));
 }
 }
 getch(); /* wait for a keypress */

 /* Draw adjacent triangles across the screen */
 for (j=0; j<=80; j+=20) {
 for (i=0; i<290; i += 30) {
 DRAW_POLYGON(Triangle1, 2, CONVEX, i, j);
 DRAW_POLYGON(Triangle2, 4, CONVEX, i+15, j);
 }
 }
 for (j=100; j<=170; j+=20) {
 /* Do a row of pointing-right triangles */
 for (i=0; i<290; i += 20) {
 DRAW_POLYGON(Triangle3, 40, CONVEX, i, j);
 }
 /* Do a row of pointing-left triangles halfway between one row
 of pointing-right triangles and the next, to fit between */
 for (i=0; i<290; i += 20) {
 DRAW_POLYGON(Triangle4, 1, CONVEX, i, j+10);
 }

 }
 getch(); /* wait for a keypress */

 /* Return to text mode and exit */
 regset.x.ax = 0x0003;
 int86(0x10, &regset, &regset);
}





[LISTING FIVE]


; Draws all pixels in the horizontal line segment passed in, from
; (LeftX,Y) to (RightX,Y), in the specified color in mode 13h, the
; VGA's 320x200 256-color mode. No drawing will take place if
; LeftX > RightX. Tested with TASM 2.0
; C near-callable as:
; void DrawHorizontalLineSeg(Y, LeftX, RightX, Color);

SCREEN_WIDTH equ 320
SCREEN_SEGMENT equ 0a000h

Parms struc
 dw 2 dup(?) ;return address & pushed BP
Y dw ? ;Y coordinate of line segment to draw
LeftX dw ? ;left endpoint of the line segment
RightX dw ? ;right endpoint of the line segment
Color dw ? ;color in which to draw the line segment
Parms ends

 .model small
 .code
 public _DrawHorizontalLineSeg
 align 2
_DrawHorizontalLineSeg proc
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to our stack frame
 push di ;preserve caller's register variable
 cld ;make string instructions inc pointers
 mov ax,SCREEN_SEGMENT
 mov es,ax ;point ES to display memory
 mov di,[bp+LeftX]
 mov cx,[bp+RightX]
 sub cx,di ;width of line
 jl DrawDone ;RightX < LeftX; no drawing to do
 inc cx ;include both endpoints
 mov ax,SCREEN_WIDTH
 mul [bp+Y] ;offset of scan line on which to draw
 add di,ax ;ES:DI points to start of line seg
 mov al,byte ptr [bp+Color] ;color in which to draw
 mov ah,al ;put color in AH for STOSW
 shr cx,1 ;# of words to fill
 rep stosw ;fill a word at a time
 adc cx,cx
 rep stosb ;draw the odd byte, if any
DrawDone:

 pop di ;restore caller's register variable
 pop bp ;restore caller's stack frame
 ret
_DrawHorizontalLineSeg endp
 end
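
For readers without an assembler to hand, the control flow of the routine
above can be sketched in portable C. This is an illustration under stated
assumptions rather than a replacement: FrameBuffer is a hypothetical
off-screen array standing in for display memory at segment A000h,
FillLineSeg is a hypothetical name, and memset() takes the place of the
REP STOSW/REP STOSB pair, leaving the choice of store width to the C library.

```c
#include <string.h>

#define SCREEN_WIDTH  320
#define SCREEN_HEIGHT 200

/* Hypothetical off-screen buffer standing in for VGA display memory */
static unsigned char FrameBuffer[SCREEN_WIDTH * SCREEN_HEIGHT];

/* Fills pixels (LeftX,Y) through (RightX,Y), both endpoints included.
   Nothing is drawn if LeftX > RightX. Like the assembly routine, this
   performs no clipping, so the caller must pass on-screen coordinates. */
static void FillLineSeg(int Y, int LeftX, int RightX, int Color)
{
    int Width = RightX - LeftX;     /* corresponds to SUB CX,DI */

    if (Width < 0)                  /* corresponds to JL DrawDone */
        return;
    Width++;                        /* include both endpoints */
    /* memset() stores the low byte of Color, as the assembly does */
    memset(&FrameBuffer[Y * SCREEN_WIDTH + LeftX], Color, (size_t)Width);
}
```

The early return mirrors the JL DrawDone test, so a segment whose right end
lies to the left of its left end leaves the buffer untouched.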






June, 1991
PROGRAMMER'S BOOKSHELF


Confessions of an Efficient Coder




Ray Duncan


The occupational risks of working with computers are a hot topic lately.
Glare, ambient noise, ELF, carpal tunnel syndrome -- all are proving to be
excellent fodder for journalists in search of an inflammatory headline, or
lawyers in pursuit of a down payment on a Maui condo. One of the less
glamorous (and therefore rarely mentioned) occupational risks is that, rather
than having your hands stiffen up like a mummy, or getting fried by CRT
radiation, you might simply disappear from view beneath the relentless
onslaught of junk mail. If your mailbox looks like mine, you receive upwards
of 20 catalogs a week from mail-order firms hawking everything from SIMMs to
cardboard diskette mailers, perhaps twice that many "special offers" from
magazines that you either already subscribe to or wouldn't be caught dead
reading, a wastebasketload of invitations to seminars on object-oriented
programming, and a fistful of promotional pieces for high-priced "insider"
newsletters.
After 20 years in the programming business, my resistance to all of these
marketing weapons is pretty well developed, even hypertrophied, perhaps, but I
must confess that I still occasionally succumb to the blandishments of the
newsletter tycoons. Their pitch is so seductive -- they rub shoulders with the
Captains of Industry, they have the experience and insight to put the Big
Picture together, they are armed with a word processor and a laser printer,
and for a limited time only you can join a select group of discerning
subscribers and receive priceless predigested wisdom by first-class mail.
Please fill in your credit card number and expiration date in the blanks
below, and be sure to include a phone number where we can reach you during
business hours in case you don't have enough room left on your credit limit.
Naturally, since there are no magic answers in an industry based on the laws
of physics, an endless supply of silicon, and the frail logical abilities of
carbon-based programmers, very few of these newsletters pan out to be worth
anywhere near their cost in unique perspectives or information. The two
conspicuous exceptions are Esther Dyson's Release 1.0, which is distinctive
for its droll humor and the cosmic character of its analyses, and Michael
Slater's Microprocessor Report, which is unexcelled among newsletters in its
timeliness, accuracy, depth of coverage, quality of writing, and production
values. As for the others, well ... suffice to say that Dyson's and Slater's
newsletters are the only ones that I've ever actually renewed.
Nevertheless, hope springs eternal, as the saying goes, and recently when I
was feeling particularly flush I threw caution to the winds (and my budget to
the wolves) and signed up for a year's worth of Ed Yourdon's American
Programmer. I'd been looking at Yourdon's flyers for at least a year or so,
but the promotional incentive that finally tipped the balance for me was
Yourdon's promise to throw in a selection of his technical reports, one of
which was called "The 68 Best Software Books." I'm a confirmed computer book
junkie -- I've never managed to walk out of the famous Computer Literacy
Bookstore in Sunnyvale without a tab in the three digits -- and I'm
exceptionally vulnerable to someone who promises me a highly specific list of
neat new books to buy.
As it turns out, Yourdon's newsletter might be more aptly named "American
Administrator" instead of American Programmer. What Yourdon views as
programming, you and I would consider the tedious paper-pushing of burned-out
middle managers; what you and I enjoy as the creative labor of programming,
Yourdon dismisses as an implementation detail that is better left to
zillion-dollar CASE tools and code generators. His monograph "The 68 Best
Software Books" reflects this bureaucratic orientation as well. He actually
comments, at one point, "Why would anyone want to read a programming book?
They tend to be deadly dry and boring, and the idea of actually reading one is
enough to instantly put programmers and non-programmers alike to sleep." This
follows naturally from Yourdon's conviction that programmers are merely
drudges, and the real thinking happens at the managerial level.
None of the dozen or so books that I insist on having within easy reach of my
keyboard at coding time are on Yourdon's list at all: Knuth's Art of Computer
Programming, Tanenbaum's Operating Systems, Sedgewick's Algorithms, Kernighan
and Ritchie's C Programming Language, Foley and Van Dam's Computer Graphics,
Hennessy and Patterson's Computer Architecture, and so on. Instead, Yourdon
endorses a collection of ethereal, academic works on software engineering and
structured design, leavened by a sprinkling of New Age titles with only the
vaguest relationship to programming: Sculley's Odyssey, Pirsig's Zen and the
Art of Motorcycle Maintenance, Minsky's Society of Mind, Weizenbaum's Computer
Power and Human Reason, and McCorduck's Machines Who Think. It's definitely
grounds for alarm when one's perspective is in such blatant conflict with that
of a famous guru such as Yourdon. So I was relieved to find that his list and
mine are not totally disjoint. He singles out Jon Bentley's More Programming
Pearls, one of my all-time favorites, for an especially warm commendation.
Bentley's Programming Pearls and More Programming Pearls books are anthologies
of his "Programming Pearls" columns in Communications of the ACM, 1983 through
1987. The columns-turned-chapters have been reworked to various extents with
bug fixes, additional material, and new exercises (don't cringe, even
Bentley's exercises and suggested solutions are entertaining reading), but
still retain their original chatty flavor. Although the chapters can be read
alone, they are broadly grouped into several categories -- programming
fundamentals, performance, tricks of the trade, I/O, and algorithms -- and
have a greater impact if read in sequence. I particularly recommend the
sections on code tuning, profilers, back-of-the-envelope calculations, "little
languages," and spelling checkers. Sample these, and you'll be a Bentley fan
for life.
The book Writing Efficient Programs is quite different. Although many of its
points are restated in the later Pearls books, it is more structured, more
cohesive, and somewhat more formal. It amounts, in fact, to a comprehensive
overview of optimization techniques -- from loop unrolling to lazy evaluation
to table lookups -- that can be assimilated in just a few hours, with an
extensive bibliography to direct you to further reading as needed. In the
"Preface," Bentley sets out his premises as follows:
The rules that we will study increase efficiency by making changes to a
program that often decrease program clarity, modularity, and robustness. When this coding
style is applied indiscriminately throughout a large system (as it often has
been), it usually increases efficiency slightly but leads to late software
that is full of bugs and is impossible to maintain. For these reasons,
techniques at this level have earned the name of "hacks." It is hard to argue
with this criticism: although solid work has been done in this domain, much
work at this level is pure and simple hacking in the most pejorative sense of
that term.
But writing efficient code need not remain the domain of hackers. The purpose
of this book is to present work at this level as a set of engineering
techniques. The following are some of the differences between hacks and
engineering techniques:
Hacks are applied indiscriminately; engineering techniques are applied in a
well-defined context. While the medicine man peddles snake oil on the street,
the physician treats a patient only in a well-defined relationship. The
extreme measures of the surgeon are taken in a well-staffed and well-equipped
hospital, and only when the patient needs surgery.
Hacks are described loosely and informally; engineering techniques are
described precisely.
Hacks are created after a moment's thought; engineering techniques are firmly
based on scientific principles and are well tested in both laboratory and
applied contexts.
Hacks are passed on by word of mouth, at best; engineering techniques are
described in a common literature and are referred to by name when applied.
Hacks are described out of context; engineering techniques are presented with
indicators and contra-indicators for their application.
Hacks are applied without certainty; engineering techniques can be tested for
their appropriateness and effectiveness.
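To make one of the techniques named earlier concrete, here is a minimal
sketch of a table lookup in C. It is not taken from Bentley's book; the names
BitCountTable, CountBitsSlow, InitBitCountTable, and CountBitsInBuffer are
hypothetical, and the example simply trades 256 bytes of storage for the
per-byte loop that would otherwise run on every call.

```c
#include <stddef.h>

/* Number of 1 bits in each possible byte value, filled in once */
static unsigned char BitCountTable[256];

/* Straightforward version: examines the bits of b one at a time */
static int CountBitsSlow(unsigned char b)
{
    int n = 0;

    while (b != 0) {
        n += b & 1;
        b >>= 1;
    }
    return n;
}

/* Precompute the answer for every byte value. Call once before use. */
static void InitBitCountTable(void)
{
    int i;

    for (i = 0; i < 256; i++)
        BitCountTable[i] = (unsigned char)CountBitsSlow((unsigned char)i);
}

/* Table-lookup version: one array access per byte instead of a loop */
static long CountBitsInBuffer(const unsigned char *buf, size_t len)
{
    long total = 0;
    size_t i;

    for (i = 0; i < len; i++)
        total += BitCountTable[buf[i]];
    return total;
}
```

Whether the table actually wins depends on how often the routine is called,
which is a question to settle by measurement rather than by assumption.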
Methodical design and testing are persistent themes in Writing Efficient
Programs. Bentley is a fervent believer in profiling, and offers many
amusing or horrifying anecdotes to make his point. For example:
Victor Vyssotsky enhanced a FORTRAN compiler in the early 1960s under the
design constraint that compilation time could not be noticeably slower. A
particular routine in his program was executed rarely (he estimated during
design that it would be called in about one percent of the compilations, and
just once in each of these) but was very slow, so Vyssotsky spent a week
squeezing every last unneeded cycle out of the routine. The modified compiler
was fast enough. After two years of extensive use the compiler reported an
internal error during compilation of a program. When Vyssotsky inspected the
code he found that the error occurred in the prologue of the "critical"
routine, and that the routine had contained this bug for its entire production
life. This implied that the routine had never been called during more than
100,000 compilations, so the week Vyssotsky put into prematurely optimizing it
was completely wasted.
By any ordinary mortal's standards, Bentley's everyday working environment is
a programming nirvana. He is a member of the staff at AT&T's Bell Labs in
Murray Hill, New Jersey, has immediate access to cutting-edge hardware and
software technology, and stands in line at the cafeteria with some of the most
brilliant software developers in the world. The roll call of manuscript
reviewers for his books is virtually a Who's Who of programming: Aho, Denning,
Gries, Kernighan, and Stroustrup, among others. Nevertheless, Bentley's style
is accessible, unpretentious, and determinedly practical, and is spiced with a
wit and warm understanding of human nature that puts a personal twist on even
the most arcane topic. Each of these three slim volumes is an authentic gem
... er ... pearl of technical writing. Once you have them on your bookshelves,
you'll want to reread them regularly.






June, 1991
OF INTEREST





Java 1.4, the video analysis software for IBM PCs and compatibles from Jandel
Scientific, allows you to capture, process, measure, and analyze video images.
Image arithmetic, snapshots, and bit-map transforms, along with enhanced macro
facilities and data worksheet formatting/printing comprise version 1.4's new
features. Image processing time has been reduced by automated area of interest
(AOI) features. AOIs are defined using the new AOI backspace editing feature
and can be saved as a template for use across multiple images. The image
arithmetic feature lets you add, subtract, and average images with the current
image in the frame grabber buffer; it also measures changes over time and can
improve the signal-to-noise ratio of an image. Java analyzes data using
look-up tables, linear regression, t-tests and descriptive statistics.
The program is priced at $1495. Updates from version 1.3 run $295. Reader
service no. 31.
Jandel Scientific 65 Koch Road Corte Madera, CA 94925 415-924-8640 or
800-874-1888
Grafmatic, Plotmatic, and Printmatic are the three Fortran libraries now
available from Jewell Technologies. Grafmatic enables you to create video
graphics of two- and three-dimensional drawings, plots, or models. Printmatic
and Plotmatic have identical capabilities; Printmatic executes them on various
dot-matrix, laser, and inkjet printers, while Plotmatic supports HP and
Houston Instruments plotters. The libraries also feature hidden line removal
and shading for three-dimensional objects, as well as sophisticated spline
algorithms.
DDJ spoke with Dr. Howard Garon of Aptek Inc. in Silver Springs, Md., who uses
the libraries in creating software related to underwater acoustic propagation.
Aptek chose the libraries because they are "comprehensive in terms of the
numbers and kinds of routines available." Dr. Garon also said that he has
found the libraries "well-documented and straightforward to use as part of
Microsoft Fortran, QuickBasic, and Pascal."
All three libraries support the Microsoft, Ryan-McFarland, and Lahey Fortran
compilers, run on IBM PCs and compatibles and require DOS 2.0 or later. Each
one can be used as a stand-alone package, or they can be bundled together.
The libraries cost $249.95 apiece; discounts are available for bundling.
Reader service no. 25.
Jewell Technologies 4740 44th Ave. SW #203 Seattle, WA 98116
Version 3.0 of the Professional Edition C compiler for the 8051
microcontroller family has been released by Franklin Software. The new
compiler offers reentrant code generation, pointers to speed up your code,
registers into which program variables may be loaded directly, and
optimizations such as loops and global and local subexpressions. The new
version allows you to mix different types of memory models and affords
multiple CPTR and on-chip ALU support when possible as well as compatibility
with code created with previous versions.
DDJ spoke with Nick Andrews of Byte-BOS Integrated Systems in San Francisco,
who commented, "We've adapted our operating system to work with Franklin's
large model and with the external stack. It's a very efficient compiler in
terms of execution and the code size is small."
The compiler alone sells for $1195; a complete development kit runs $1995.
Reader service no. 20.
Franklin Software Inc. 888 Saratoga Avenue, Suite #2 San Jose, CA 95129
408-296-8051
XREF, a multilanguage cross-reference lister for C, Pascal, and Modula-2, is
now available from Amtec. The program has the capacity to handle up to 2000
different identifiers, for each of which it will recognize the first 29
characters.
XREF sells for $130. Reader service no. 24.
Amtec Software-Vertrieb Hauptstrasse 184 CH-4565 Recherswil Switzerland
41-65-443578 or 41-65-354550
ECAL is the new assembly language development system from VST Inc. ECAL
generates code for all 4- through 32-bit microprocessors and features an open
architecture. The package comprises an assembler with full macro capabilities,
including parameter passing, local labels and conditional assembly; an
integrated linker with a date-stamping make that reassembles only modules
edited since the last assembly; a menu-driven editor that features
split-screen mode and a file marker system; and diagnostic and
context-dependent help facilities.
There is an optional EPROM emulator which replaces the EPROM in the target
microprocessor and facilitates fast linkage between host and target.
Mike Appleton of Logical Devices Inc. in Fort Lauderdale said of the product,
"ECAL allowed very rapid development of our product, an EPROM/PLD programmer.
With most systems like this, you're limited to one type of microprocessor, but
ECAL offers you full development potential across the board, and we plan on
using it for other projects, as well."
The ECAL package runs on IBM PCs and compatibles and includes a source level
debugger, ROM emulator, and control files for all microprocessors. It sells
for $1895. Reader service no. 23.
VST Inc. 6600 NW 12th Avenue, Suite 203 Fort Lauderdale, FL 33309 305-491-7443
Oasys has introduced the Green Hills C++ Compiler. The compiler includes
object-oriented features such as multiple inheritance, classes with scope, and
operator overloading. It is compatible with AT&T's C++, Version 2.1 but
supports cfront Versions 1.2 and 2.0, ANSI C, and Kernighan and Ritchie C as
well. The compiler offers optimizations such as procedural inlining, register
caching, and loop unrolling and provides pragma extensions to permit flexible
control over runtime error checking and virtual tables. The product can be
used as a native development tool or as a cross-compiler for cross or embedded
development.
Green Hills C++ will run on Sun-3, 386 Unix/System V, and 88open compliant
machines. The compiler comes with ANSI C header files, a single-step post-link
program and a name mangler/demangler utility. Prices begin at $1000. Reader
service no. 28.
Oasys One Cranberry Hill Lexington, MA 02173 617-862-2002
The Guide to Debugging Embedded Code, a reference for debugging code in 8- and
16-bit embedded systems, has been released by Softaid. The guide instructs you
on how to set up your debugging environment and describes solutions to
problems commonly encountered when using conventional tools. Also included are
design recommendations to forestall debugging problems.
The guide is provided free of charge. Reader service no. 26.
Softaid Inc 8930 Route 108 Columbia, MD 21045
GFA-Basic for Windows 3.0 and GFA-Basic for MS-DOS are now available from GFA
Software Technologies. The DOS product includes over 70 commands and functions
for specific graphic and operating system operations, allowing you to write
applications that contain menu bars, windows, alert boxes and pop-up menus,
and are portable to Windows 3.0, OS/2 and Unix. MDA, HGC, CGA, EGA, and VGA
graphics modes are supported; EMS can be used to allow access to more than
640K of memory; and 8087/287/387 math co-processors can be utilized, if
available.
GFA-Basic for Windows has over 400 Windows-specific commands and functions
that allow you to load bit-map files, run DLLs, and use the Multiple Document
Interface, Clipboard, and DDE. This reduces the amount of coding required and
renders the SDK unnecessary.
This product also permits use of EMS and runs in real or protected mode,
depending on the Windows installation. C and Assembler routines can be bound
into your GFA-Basic programs.
GFA-Basic for both Windows and MS-DOS costs $449 for the 286 version and $495
for the 386. Reader service no. 27.
GFA Software Technologies Inc. 27 Congress St. Salem, MA 01970 508-744-0201
Alpha RPL is the new development language just released by Alpha Software.
This resident programming language allows you to create complete
memory-resident applications, integrate DOS applications, automate tasks for
time sensitive applications, and update user interfaces to give them a modern
look and feel. For existing applications, the RPL can intercept keystrokes and
read screens so that you can add functions or change commands.
Multiple DOS applications can exchange data while integrated under a single
user interface. While running such applications, Alpha RPL can also read
records from .dbf files. The environment includes a symbolic debugger,
librarian, and editor, and is memory-resident, permitting you to debug and
test Alpha RPL programs on top of the application they control.
Alpha RPL for DOS costs $595. With extended or expanded memory, as little as
36 K of conventional memory will suffice. Reader service no. 29.
Alpha Software Corp. One North Avenue Burlington, MA 01803 617-272-4876
Linnaeus, Version 1.0 is the new stand-alone DOS C++ class browser and
application development tool from Zircel Software. It builds application
information databases from subsets of any C++ source and requires no third
party software. You can select classes or lists of classes based on regular
expression searches, class characteristics, or parent/child classes. Linnaeus
provides tree displays of selected classes, along with a window containing
such information as the class's data members, methods, comments, parents,
children, and source filespec. Clicking on a class name or icon in either the
tree, the window, or the source listing will change that class and display its
Detail window.
Also included are the ZLIST utility, which can list any information found in
the browser database in any order, and a utility to preload your text
processor with the proper source file when you indicate which class you want
to edit.
Linnaeus costs $195 and runs under MS-DOS with an EGA/VGA. Reader service no.
30.
Zircel Software 2285A S. Jasper Way Aurora, CO 80013 303-750-9543
Software Engineer, a Lisp implementation for Windows 3.0, is now available
from Raindrop Software Corp. Using Software Engineer's new ability to call
routines in Dynamic Link Libraries, you can call previously coded C or
assembly language code from within Lisp. This, plus the product's support for
all Windows-compatible sound boards, yields a high-level environment for
creating multimedia applications.
By supporting DDE, the clipboard, and GDI, Software Engineer brings Lisp
programming with Windows to a level identical to C.
The retail price for Software Engineer is $249.95. Reader service no. 22.
Raindrop Software Corp. 845 E. Arapaho, Suite 105 Richardson, TX 75081
214-234-2611










June, 1991
SWAINE'S FLAMES


The Rights You Left Behind




Michael Swaine


My cousin Corbett stopped me in the hall the other day with a gleam in his
eye. "Characters as objects," he said. "Think about it." I didn't want to
think about it. "We've had this argument before," I protested. "I still think
Pournelle does it deliberately. Science fiction is a genre of ideas, and you
have to downplay characterization in favor of --"
"Not that kind of characters," he snapped. "Object-oriented font technology.
OOF!"
If he wasn't doing an impression of Spy magazine's movie reviewer Walter
Monheit {TM} (whose name apparently includes that trademark symbol), and I
didn't think he was, then like so many others, Corbett had fallen victim to
the current OOPidemic.
Am I the only person who kept trying to figure out what TROOPS stood for in
those "support our troops" signs during the late war? Or the only person who
associates object-oriented programming (OOPS, WHOOPS, where's the SCOOP?) with
cleaning up after a puppy? I await with mixed feelings the announcement of the
next conference on parallel object-oriented programming.
Well, this is pretty close. There just curled out of the fax machine the
program for Transputing 91, a conference with at least one arguably parallel
object-oriented programming session, to wit "Extending C++ with Communicating
Sequential Processes." I guess it makes sense that a RISCy processor would
lead to some OOPS. I see there's also a session on the Mad Postman network
chip. Don't make it ring twice.
Gee, I'm sorry. Publishing time being what it is, you've already missed the
conference, and here I'm still mulling over whether or not to attend. OK, I'll
go and report on anything interesting next month, all right? I wonder if
Gilbert Hyatt will be there to announce that he patented parallel
architectures in 1973.
I was kidding about Hyatt, but who knows? He seems to have a patent on
everything else. Not that patent ownership guarantees anything in these
interesting times when the courts are mulling over what to make of computer
technology. Ask Paul Heckel, or read the latest edition of his book, The Art
of Friendly Software Design, with special attention to the new section
recounting Heckel's troubles getting IBM to pay attention to his patent.
Friendly?
Legal snags also await anyone rash enough to launch into what is being called
new media. I recently attended a conference on the subject, and saw many rash
people. The January issue of The Computer Lawyer ($325 for 12 issues;
800-223-0231) sounds the swamp they are sailing into. A typical new media
product is a compilation. You might start with the text of an existing book,
augmented by pictures (from the book and elsewhere) and some film clips and
incidental music, then put it on a CD and craft an access engine that lets the
user browse freely.
That part's easy; now try to get the rights. The author's contract probably
wasn't written with new media in mind, and it could take a judge to
distinguish the author's reserved movie rights from the electronic rights sold
to the publisher. Finding out who has the rights when reprinting photographs
is always tricky, and is complicated in new media by the difficulty of
defining multiple use for user-controlled media. Rights in music and film can
be very complex: When you acquire the right to use a film clip, the right to
use the background music in the clip, or the likenesses of the stars in it,
does not necessarily come along. And there are other legal issues: For
nonstars, you may need to get releases to avoid charges of invasion of
privacy, and you may face the problem of ensuring that people are not shown in
an unflattering context when you don't entirely control the context. What goes
for people goes for works: Don't assume that you can use a Country & Western
soundclip in a way that ridicules the genre, or that you can colorize or edit
someone else's work freely just because you have the right to use it. The
decision to develop a new media product is one one should mull over.
And there's still no answer to where you affix the credits or the copyright
symbol in a new media product, although Corbett thinks the latter problem will
be solved once we have OOF.
"Once characters are objects," he explained, "fonts will become code libraries
and trademark and copyright symbols will be able to enforce themselves. Any
electronic copying of a file containing the C symbol will trigger the code
bundled in the character, which could flash a warning message to the user or
dial the phone and report the copying for royalty billing. Or it could just go
ahead and complete the electronic funds transfer."
Corbett thinks the implementation is trivial. I think he's a nincompOOP.





































July, 1991
EDITORIAL


Helping Hands, Electronic Sleight-of-Hand, and Let's Give Them a Hand




Jonathan Erickson


This time last year, we announced our annual Kent Porter Scholarship, an award
for computer science majors enrolled in accredited colleges and universities.
For those of you new to the Dr. Dobb's family, Kent was a longtime DDJ
columnist, editor, and programmer who passed away in 1989. In the 1990-91
academic year, we awarded four grants in his memory.
The purpose of the scholarship is to recognize academic achievement and
potential, and to financially assist continuing students in the pursuit of
their educational goals. At the request of Kent's family, special
consideration is given (but not limited) to students raising children while
attending school. Scholarships will be awarded in increments of $500 for the
coming year.
To apply for a scholarship, request in writing an application from: The Kent
Porter Scholarship, Dr. Dobb's Journal, 501 Galveston Drive, Redwood City, CA
94063.
Similarly, Johns Hopkins University has launched its National Search for
Computing Applications to Assist Persons with Disabilities (CAPD). This
competition is seeking affordable, innovative ideas, systems, devices, and
computer programs designed to assist Americans with physical or learning
disabilities. More than 100 awards will be granted -- including a $10,000
first prize -- in December 1991 at the Smithsonian Institution in Washington, DC.
The categories for submissions are software, computer-based hardware devices
(invented or modified), and "paper design" (written descriptions of ideas not
yet implemented). The disabilities to address include those involving
mobility, communication, self-care, and self-direction.
The competition is open to professionals, students, and amateurs until August
23, 1991. For more information contact: Johns Hopkins CAPD, P.O. Box 1200,
Laurel, Maryland 20723.


I've Got A Secret


The Electronic Frontier Foundation (EFF) asserts that individuals who
communicate electronically "are entitled to the same First Amendment rights
enjoyed by other media....[and should not be] subject to unconstitutional,
overbroad searches and seizures of any of the contents of their systems,
including electronic mail."
Several recent events bring this bird home to roost.
For starters, the EFF and Steve Jackson Games have filed a precedent-setting
civil suit against the U.S. Secret Service charging violation of Jackson's
First and Fourth Amendment rights. The suit stems from a March 1990 raid on
SJG's Texas office when Secret Service agents seized manuscripts, hardware,
software, and private electronic mail merely on the suspicion that somewhere
in Jackson's office "might be" a document compromising the security of the 911
telephone system -- even though Jackson himself wasn't suspect. (It was a
Jackson employee who was suspected of releasing what turned out to be publicly
available 911 information.) The government has never filed charges against
Jackson, but neither has it returned all of his property.
About the time the EFF launched its suit, the Los Angeles district attorney
began investigating the Prodigy information service because of accusations
that Prodigy was "stealing" data from users' disks. The allegations center
upon a temporary file (created by Prodigy's software) on the user's PC that
supposedly contains information -- culled without the user's knowledge or
consent -- identifying resident applications. Conceivably, that information
can be surreptitiously uploaded to the Prodigy host. Although Prodigy
acknowledges the temp file exists, the service denies it's being gobbled up by
the host. Note that the LA DA hasn't made any formal charges.
Now throw in Senate bill S266, introduced by Senators Joseph Biden Jr.
(D-Del.) and Dennis DeConcini (D-Ariz.), and Swaine's Cousin Corbett begins
seeing bureaucratic bogeymen behind every nearby beechnut tree. S266 proposes
that electronic encryption vendors (both hardware and software for computers
and phones) make decoders available to the government so agents can more
easily undertake authorized searches. Here is a part of what S266 says:
"...providers of electronic communications services and manufacturers of
electronic communications service equipment shall insure that communications
systems permit the Government to obtain the plain text contents of voice,
data, and other communications...." At one fell swoop, your private data is no
longer private and encryption companies are effectively put out of business.
The EFF isn't alone in trying to check the erosion of electronic privacy. The
governing Council of the ACM also recently endorsed a resolution advocating
the establishment of governmental privacy protection mechanisms. Both
organizations need our support and deserve a hand for taking the initiative.
Contact the Electronic Frontier Foundation, Inc., 155 Second Street,
Cambridge, MA 02141 and the ACM, 11 West 42nd Street, New York, NY 10036.































July, 1991
LETTERS







Absolute vs. Relative Performance


Dear DDJ,
I'd like to thank Ken Kashmarek ("Letters," DDJ, February 1991) for taking the
time to write concerning my article, "Optimal Determination of Object Extents"
(DDJ, October 1990). I'm glad to hear he enjoyed it. I completely agree with
Ken about the column order reversal. Speedup, however, is defined as a
positive quantity; otherwise the "up" portion of the word wouldn't make much
sense. When I wrote the article, I thought computing percentages would be
obvious, so I didn't spell out the details. Anyway, what's really important is
speed improvement. Anybody can come up with a slower algorithm; it's the faster
ones that are difficult.
For the most part, Ken's comments detailed "absolute performance." My article,
however, was about "relative performance"--that is, one algorithm relative to
another. Whatever hardware, compiler, switches, or floating-point element size
you use, as long as you are consistent with comparisons the results will
always be about the same (the MIN/MAX algorithm will be faster, since it
performs less work to reach the conclusion, as long as you must perform
comparisons of items to find min and max!).
Also notice that the data sample used makes a difference. For consistency, I
used the same data sample over all machines. All times were the user CPU time
since that's how long the computer took to execute the program. I believe I
used an array of 1000 items and ran a loop finding min and max of these items
5000 times (this varied from one machine to another to get accurate time
value). For timing, I used a program I wrote that's similar to the Unix time
command.
As for Ken's comments about the Cray, remember that when you spend over $10
million on a computer, you get one of the best parallelizing compilers
available (otherwise the hardware is kind of hard to use). This compiler
completely unfolds the loops and runs the code straight. But even the Cray
vector processor can't avoid doing comparisons; therefore it, too, will show
relative performance improvement. Yes, the Cray does matrix math faster, but
that's because it can do an addition and a multiplication in one cycle (just
like Ken's IBM 6000), and because it runs on a 250-MHz clock (vs. the 25-MHz
IBM 6000).
As for Ken's other comments:
1. Math coprocessors. Most workstations come with math coprocessors; only
cost-sensitive PCs make FPUs optional anymore--for the time being. PCs based
on the 68040 and 80486 (and others) will have built-in math coprocessors.
Whether or not you have an FPU makes a difference in absolute measurement--but
not in relative timing.
2. Compilers. Compilers often have a mode to output assembly code so that you
can see exactly what's going on and how well the compiler is optimizing.
3. Array size. The science of algorithms splits algorithms into two groups:
one in which all code and data fit into main memory and another where they
don't. These two groups are compared with other algorithms within the group;
otherwise they don't always lose. But for relative purposes, it still doesn't
matter. Thanks to Ken for the data on the 3090 and Apple IIGS--and for caring
enough to write. His input is appreciated. For more information on these
topics, interested readers might refer to "Vector Pipelining, Chaining, and
Speed on the IBM 3090 and Cray X-MP" (IEEE Computer, September 1989). It's a
really good, down-and-dirty comparison of the two machines and their vector
capabilities.
Victor Duvanenko
Indianapolis, Indiana


Power/MS-Basic Dispute Continues


Dear DDJ,
This is a reply to Bruce Tonkin's response (DDJ, March 1991) to my original
letter concerning his review of Power Basic:
Where Bruce uses numbers in his reply, the numbers are essentially correct,
but his conclusions about the numbers are not. The main problem seems to be
that Bruce assumes: 1. That programmers will be willing to convert their
(possibly) large volume of source code to another format and syntax just to
gain some small advantage, when it is not known how long that particular
version of the language will be supported; where MS-DOS Basic will likely
dominate on PCs for years to come, and 2. that other programmers will mostly
use the same (inferior) methods of coding that Bruce describes in his reply.
To compare a few reply points:
1. a. Basic 7.x allows multiple 64-KByte string heaps.
 b. It is much faster to sort in a pre-allocated string (memory) buffer and
not in string arrays.
 c. A few large string buffers can be "concatenated" using simple algorithms
to simulate a very large memory block, much as one would do in C.
2. The argument about fixed-length strings in MS-Basic is essentially invalid,
since the larger string buffers need only be written to and searched using
dynamic pointers, and not actually shuffled around. Does Bruce think that the
two times slower speed of fixed-length strings vs. dynamic strings is
significant for MS-Basic, while the 3.4 times difference in Basics is not?
3. You'll have to write your own sort? Well, I'll give anyone who wants it a
super fast merge/insertion sort which uses a large string as a memory block
and avoids all of the problems using string arrays. The sort is generic and is
in a callable subprogram, and it will return all of the memory it uses when it
is finished running.
4. At my present job, we have 200 Basic programs comprising 700,000 lines of
code, and it has not been my experience that Basic slows down substantially
when it approaches maximum memory usage. I would suggest to programmers who
are looking for good performance as well as some measure of portability to
other modern languages, that they learn simple memory management and avoid the
anything-goes coding style which plagues many Basic programs today, and which
causes much embarrassment when the code is compared to C.
In conclusion, I believe most programmers will want to know how to get good
performance from whatever language they are now using or are committed to, and
not be led to believe that the only solution is to switch compilers and have
to convert a lot of code. I'm convinced that MS-Basic can run right up there
with Power Basic, no matter what the application, and when people who pay
programmers are no longer convinced of same, they will most likely jump to C
rather than switch to another Basic compiler.
Dale Thorn
Cleveland, Tennessee


Blipless Curves


Dear DDJ,
In the October 1990 issue of DDJ, I read a letter by Michael P. Lindner. It
was titled "Still Going in Circles" and was feedback on Tim Paterson's
article, "Circles and the Differential Analyzer" (July 1990). I was impressed
by Paterson's article. There are a few suggestions that I would like to make.
I implemented the algorithm as it was on DOS, Lindner's Example 3. It is
duplicated here as my Example 1.
Example 1: plot8 is the function which draws the point in the eight octants

 x = e = 0;
 y = r;
 while (x <= y)
 {
     plot8 (x,y);
     e += (x<<1)+1;          /* error term after stepping x to x+1 */
     if (e > 0)
     {
         e -= (y<<1)-1;      /* error term after stepping y to y-1 */
         y--;
     }
     x++;
 }


The resulting curve had a blip, as shown in my Figure 1, which is extremely
noticeable on large circles.
The reason for that is that if the curve passes through a pixel, the "e" for
that point is positive and so the pixel below it is chosen. What results is
the plotting of pixels just inside the circle, not on it. The first pixel is
the start and is on the circle -- hence the blip.
In this algorithm the "e" for all the points would be zero or negative. A
better algorithm would make sure that "e" is kept around zero. A measure of
this "closeness" could be the sum of "e" for all the points. I'll call it E.
My Table 1 shows E for the fragment in Example 1 for various radii.
Table 1

 Radius E
 150 -14705
 175 -19942
 200 -27301


I was trying to write a small fast version in assembler, so adding more
variables or complicating any expression was out of the question. By changing
the condition in the if statement, a better result can be obtained. I tried
two other simple conditions, (e>x) and (e>y). The corresponding E[1] and E[2]
are in Table 2 for the same radii.
Table 2

 Radius E[1] E[2]
 150 -10998 689
 175 -14895 -558
 200 -17244 1090


It's obvious from Table 2 that E[2], where E stays around zero, is the best
bet; that is, use if (e>y) on line 7 of Example 1.
I found out E for radii from 1 to 1000 using this new condition. E was
positive for 724 of the cases and negative for the remaining 276 cases. This
also eliminates the blip. The resulting circle is closer to the true circle
and smooth and continuous. Just for statistics, I'd like to mention that in
the assembly language version I had kept "e," "x," and "y" in registers.
Making this change meant a reduction of the code size by 1 byte. The resulting
code would be just a wee bit faster, as the immediate operand (0 in this case)
need not be fetched.
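
The comparison of the three conditions can be sketched in C. This is only a
minimal sketch, not the assembly language version described above: the names
circle_error_sum and enum threshold are invented for illustration, and it
assumes that E is accumulated from the value of "e" at each point just before
plot8 would be called. The letter does not spell out exactly when "e" is
sampled, so the sums need not match the tables.

```c
/* Which threshold the "if" on line 7 of Example 1 compares e against.
   These names are hypothetical; they are not from the original code. */
enum threshold { THRESH_ZERO, THRESH_X, THRESH_Y };

/* Walks one octant the way Example 1 does, but instead of plotting it
   adds up e at every point that would have been plotted.
   Throughout the loop e equals x*x + y*y - r*r for the current (x,y). */
long circle_error_sum(int r, enum threshold mode)
{
    int x = 0, y = r, e = 0;
    long sum = 0;

    while (x <= y) {
        int limit;

        sum += e;                     /* error at the point being plotted */
        e += (x << 1) + 1;            /* step x to x+1 */
        limit = (mode == THRESH_X) ? x : (mode == THRESH_Y) ? y : 0;
        if (e > limit) {
            e -= (y << 1) - 1;        /* step y to y-1 */
            y--;
        }
        x++;
    }
    return sum;
}
```

Calling circle_error_sum(r, THRESH_ZERO) never returns a positive value, which
is the "zero or negative" behavior behind the blip; THRESH_Y lets the error
swing to both sides of zero.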
V. Venkataraman
Pune, India


The Return of Big-O


Dear DDJ,
As many others probably already have, I regret to inform you of Edward
Allburn's misconclusions in his article entitled "Graph Decomposition:
Imposing order on chaos," in your January 1991 issue. Mr. Allburn claims --
because it is apparent -- that his algorithm is O(n^2) in the worst case.
Without going into details about graph theory, given a graph G with n
vertices, the maximum number of edges e is n(n-1)/2. This is common knowledge
in graph theory. A graph that has this many edges is called a complete graph.
It follows that the algorithm which Mr. Allburn claims is O(n^2) is actually
O(n^3). A complete graph is the worst possible case for Mr. Allburn's
algorithm. Given a complete graph G with n vertices and e edges, presented to
the algorithm in the order (1,2), (1,3), ... (1,n), (2,3), ... (n-1,n), the
algorithm must loop through step #1 in Mr. Allburn's algorithm n(n-1)/2 times.
This step alone is O(n^2). Now the loop in step #4.d.1 (see Allburn's Example
2) adds another degree of complexity, making the algorithm run in O(n^3).
I tested the algorithm to see hard results. I implemented the algorithm in
Turbo C++ (see Listing One, page 14) and timed its execution, providing
complete graphs from N = 4 to 1000 step 4. I then took the time T(N) it took
to run the program on N vertices and found its ratio to N^2, N^3, and N^4
and plotted them on a graph (see Figure 2). This graph shows how the algorithm
grows with respect to the three growth functions N^2, N^3, and N^4. From the
graph it can be seen that the ratio of N^2 to T(N) is decreasing. This says
that T(N) is growing faster than N^2. Likewise, the ratio of N^4 to T(N) is
increasing, which says that T(N) does not grow as fast as N^4. Finally, the
ratio of N^3 to T(N) is nearly constant (slightly increasing). This says that
T(N) is growing as fast as, but not faster than, N^3, which is the definition
of Big-Oh.
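
The ratio test described above can be sketched in a few lines of C. This is
not the Turbo C++ program from Listing One; the function names are invented
for illustration, and the timings passed in are whatever you measure yourself.

```c
/* Edges in a complete undirected graph on n vertices: n(n-1)/2. */
long complete_graph_edges(long n)
{
    return n * (n - 1) / 2;
}

/* Ratio N^k / T(N), the quantity plotted for k = 2, 3, and 4.
   t is a measured running time in any consistent unit. */
double growth_ratio(long n, int k, double t)
{
    double power = 1.0;
    int i;

    for (i = 0; i < k; i++)
        power *= (double)n;
    return power / t;
}
```

If the running time really grows like N^3, then growth_ratio(n, 2, t) shrinks
as n grows, growth_ratio(n, 4, t) climbs, and growth_ratio(n, 3, t) stays
roughly level.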
There are many ways in which general graph theory algorithms can be tailored
to individual problems to speed up algorithm execution. Knowing more
information than the general algorithm requires is always an asset. I did
enjoy reading Mr. Allburn's article and liked the development of the algorithm
using a new data structure. But I also wonder if Mr. Allburn really tested his
algorithm thoroughly before making his claim and if his understanding of
Big-Oh notation is such to make that claim.
Martin Schlapfer
Scotts Valley, California
Edward responds:
In his test, Mr. Schlapfer used a "complete" graph. For a directed graph that
translates to N*(N-1) edges, and half that number for undirected graphs
(where N = number of vertices). The graph I used in my article was a "sparse"
graph which contained the minimum number of edges needed to specify that all
of the vertices are connected. This works out to be N-1 edges for an
undirected graph.
In Big-O notation, "N" is intended to represent the amount of input into the
algorithm. For a graph, the units of input are traditionally either edges or
vertices. When the number of edges is about the same as the number of vertices
(as was the case in my example graph), then "N" can safely represent either
edges or vertices. When faced with "complete" graphs, however, the number of
edges is an order of magnitude greater than the number of vertices. Because of
this "N" must now represent the number of edges.
After reviewing my test results, I realized that this should be the case even
when the number of edges is less than the number of vertices. For example, in
the case where a graph comprises 1 million vertices and only a handful of
edges, the GAD algorithm is finished as soon as the last edge has been read.
Because GAD's performance is tied to the number of edges rather than
vertices, "N" should represent edges in the analysis of this algorithm's
performance. Thus, GAD is still O(N^2) in the worst case. I thank Mr.
Schlapfer for bringing this to my attention.


Rumanian Rumination


Dear DDJ,
I am the representative of the Consulting and Engineering Agency, a new
private Rumanian company of consulting engineers, and I am writing to ask your
readers for help.
After the Rumanian revolution in December 1989, we realized that it was vital
for Rumania to try and turn its economy around from the bankrupt socialist
system into a market-driven economy. One area which needs desperate and
particular attention is computers and computing techniques.
Twenty of our best software and hardware engineers from Constanta -- drawn
from computer centers, universities, and other enterprises -- set up this
agency. The first major problem we face is a severe lack of scientific books,
manuals, magazines, and catalogs.
We realize that one of the most effective means of ridding our country of the
terrible memory of Communism is not simply by appealing for clothes and food,
but by asking for help in improving our education. In this way we can improve
our standards in industry and commerce, and help our people to make the things
they need themselves.
We would be very grateful if any of your readers could donate books and
software, which is particularly expensive for us. Perhaps even donate a
subscription to their favorite magazine, or even send us old issues that they
no longer need.
Donations should be addressed to either of the following:
Aurel Cartu, Manager Consulting & Engineering Agency

Aleea Brindujelor Nr 2,
Bloc L9, Sc C,
Ap 45 RO-8700
Constanta
Rumania
Guilain Devillers
A.I.D.E.
P.O. Box 54
L-8001 Strassen Luxembourg





















































July, 1991
RECURSIVE IMAGES


Using simple recursion and iterated function systems to draw natural objects




Steven Janke


Steven is an associate professor of mathematics at Colorado College in
Colorado Springs, CO 80903. His interests include computer graphics and
probabilistic algorithms. Steven's Bitnet address is: SJANKE%CCNODE@COLORADO.


The world of images seems to divide up into those that we can draw easily on a
computer and those that seem almost impossible. Buildings, pie charts, and
cars are relatively straightforward, whereas clouds, trees, and mountains are
quite another matter. With scanners, of course, most images can be put on the
screen, but then storage requirements soar and manipulation routines are often
awkward. It is much more efficient and aesthetically pleasing when we can
write just a few lines of code to generate the images we want. Because there
are nice algorithms for lines and circles, anything that can be described with
our standard Euclidean geometry seems easy to draw. Yet standard geometry
falls short when describing trees (organic trees, not data structures!) and
other natural objects that are randomly bumpy, wiggly, and intricate. In the
last decade or so, a new geometry, called "fractal geometry," has emerged to
do a better job of describing natural objects.
The main idea behind the fractal geometry view is that some images look like
they are made up of small copies of themselves. A single branch of a real tree
often looks like the entire tree. A small piece of a cloud looks like the
entire cloud. These recursive descriptions (called "self-similarity") can be
worked into an algorithm and coded easily to draw some natural-looking objects
with relative ease.
Recursion is most often thought of in the context of a recursive procedure --
a procedure that calls itself. The output of a procedure becomes the input of
the same procedure. In mathematical terms, the output of some function becomes
the input of that same function. This process is then iterated an arbitrary
number of times. To see this in action, let's first look at the recursive
approach to drawing a tree and then turn to the slightly more complex approach
of using Iterated Function Systems (IFS). This second technique is the focus
of much current research in image compression and is proving to be a more
general technique than seems obvious at first glance.


Simple Trees


The observation that one branch of a tree often looks like the entire tree
leads almost immediately to an effective algorithm. The main procedure, called
TREE, must draw a branch of the tree that will then split into a few
subbranches. We can draw a piece of the tree by indicating the starting point,
a direction, and a length. Then we let TREE call itself with new starting
points, new directions, and new lengths in order to produce subbranches.
Listing One, page 74, gives a Tur
There are four parameters for procedure TREE: The first two give the starting
position, the third gives the direction (that is, angle), and the fourth is
the level of recursion. The length of a branch is not given as a parameter: It
is calculated by knowing the current position in the tree. The trunk is the
longest, and the branches at the top of the tree are the smallest. The LEVEL
parameter is used both to indicate the depth of recursion and to calculate the
length of the current branch.
After drawing a branch, TREE calls itself twice with new starting points and
directions. This means that each branch splits into two subbranches. Because
LEVEL is decremented on each recursive call, it can be tested to stop the
recursion. Ten levels of recursion seem to give a reasonable image.
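
The shape of this recursion can be sketched in C. This is not the program from
Listing One; it is a minimal sketch under several assumptions. The direction
is carried as a unit vector (dx,dy) instead of an angle so that no trig
library is needed, each branch splits 30 degrees to either side, a branch at
level L is 8*L units long, and the routine only counts the segments where a
real program would draw them. The names tree_stats, COS_SPLIT, SIN_SPLIT, and
UNIT_LEN are invented for illustration.

```c
#define COS_SPLIT 0.8660254037844386  /* cos(30 degrees), assumed split angle */
#define SIN_SPLIT 0.5                 /* sin(30 degrees) */
#define UNIT_LEN  8.0                 /* assumed length of a level-1 branch */

struct tree_stats {
    long   segments;  /* number of branches "drawn" so far */
    double top;       /* largest y reached by any branch tip */
};

/* Draws one branch from (x,y) in direction (dx,dy), then recurses twice. */
void tree(double x, double y, double dx, double dy, int level,
          struct tree_stats *stats)
{
    double len, x2, y2;

    if (level <= 0)              /* LEVEL doubles as the stopping test */
        return;

    len = UNIT_LEN * level;      /* trunk is longest, twigs are shortest */
    x2 = x + len * dx;
    y2 = y + len * dy;

    stats->segments++;           /* a real program draws (x,y)-(x2,y2) here */
    if (y2 > stats->top)
        stats->top = y2;

    /* Left subbranch: direction rotated counterclockwise by the split. */
    tree(x2, y2, dx * COS_SPLIT - dy * SIN_SPLIT,
                 dx * SIN_SPLIT + dy * COS_SPLIT, level - 1, stats);
    /* Right subbranch: direction rotated clockwise by the split. */
    tree(x2, y2, dx * COS_SPLIT + dy * SIN_SPLIT,
                 dy * COS_SPLIT - dx * SIN_SPLIT, level - 1, stats);
}
```

Starting with a zeroed tree_stats and calling tree(0.0, 0.0, 0.0, 1.0, 10,
&stats) grows the tree straight up from the origin. Because every branch
splits in two, ten levels produce 2^10 - 1 = 1023 segments.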
Now for the artistic refinements.... It is a simple matter to draw the main
branches in brown and the final branches in green to indicate leaves. Or you
may choose to write a subprocedure for drawing more recognizable leaves.
Perhaps the most realistic refinement is the addition of randomness. The
branching direction can be randomly distributed within a range, the length of
a branch can be random, and the number of branches can be randomly chosen to
be, say, two or three. RECURTRE uses a simple random element to govern
branching direction, so rather than producing symmetrical, uniform trees, it
produces trees such as the one shown in Figure 1. With just a little more
effort, the branches can be given thickness and support flowers or fruit.
It takes some practice, but by altering the range of branching angles, the
distribution of those angles, and the distribution of branch lengths, you can
draw a wide variety of tree shapes. The overall shape of real trees varies
from the triangular shape of fir trees to the more spherical shape of oak
trees to the somewhat cylindrical shape of poplar trees. For an intriguing
problem, try determining what parameters really govern the final shape of the
tree.


Iterated Function Systems


Recall again that recursion simply means that the output of a function becomes
the input for the same function. In the RECURTRE program (Listing One), the
inputs and outputs are branches. The TREE procedure takes the description of a
branch as an input and outputs descriptions of two branches. Each of the new
branches is then input into the procedure again. Rather than use branches as
inputs and outputs, it is also possible to focus on individual pixels. This
way, procedures (or functions) take a single pixel as input and output one
more pixel. Then the sequence of pixels generated should form an interesting
image.
(A note on terminology: Usually, a function that moves pixels around the
screen is referred to as a transformation. Because transformation is the term
used most often in the technical literature, it makes sense to use it here.)
Let's consider a transformation that takes a point on the screen with
coordinates (X,Y) and simply multiplies each coordinate by one-half. Then, if
the input to the transformation is (100,200) the output is (50,100); with
(50,100) as the input, the output is (25,50). This process, which always makes
the current output the next input, generates a sequence of points. In the
present case, it is easy to see that the sequence of points approaches (0,0).
Notice that if (0,0) is then made the input of the transformation, the output
is also (0,0). The point (0,0) is called a "fixed point" of the
transformation. Plotting the sequence of points on the screen gives pixels on
a straight line, which get closer together as they approach the origin.
Take another transformation that again multiplies the coordinates of a point
by one-half, but also adds 10 to the X coordinate. This time, the point (20,0)
is a fixed point because one-half of 20 plus 10 gives 20 back. The sequence of
plotted points bunches up as it approaches (20,0). Notice one important fact:
No matter which point you start with, the sequence always approaches the fixed
point. This is somewhat amazing, but true. You can start anywhere you like on
the screen and still be confident that the sequence will get closer and closer
to the fixed point.
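
A short C sketch makes the convergence concrete. The names point,
half_and_shift, and iterate are invented for illustration. With a shift of 0
the function is the first transformation above; with a shift of 10 it is the
second, whose fixed point solves x = x/2 + 10, giving x = 20.

```c
struct point { double x, y; };

/* Halves both coordinates, then adds shift to the X coordinate. */
struct point half_and_shift(struct point p, double shift)
{
    struct point q;

    q.x = 0.5 * p.x + shift;
    q.y = 0.5 * p.y;
    return q;
}

/* Feeds each output back in as the next input, n times over. */
struct point iterate(struct point p, double shift, int n)
{
    while (n-- > 0)
        p = half_and_shift(p, shift);
    return p;
}
```

Each application halves the distance to the fixed point, so after 60 rounds
any starting point on a typical screen is indistinguishable from (20,0) in
double precision.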
Now imagine that there are two transformations, say, T1 and T2. Start with
some point, P, and apply both functions to this point. The result will be two
new points: Call them A and B. Then take these new points and drop them into
each of the two transformations. T1 operates on A and B to produce two new
points, and T2 operates on A and B to give two new points. At this stage of
the game there are points P, A, B, and four new points. T1 and T2 can now
operate on the four new points to produce eight newer points. Continuing in
this way, the transformations build a set of points on the screen.
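
One stage of this process might be coded as follows. The text leaves T1 and T2
abstract, so this sketch assumes they are the two example transformations
described earlier (halve the coordinates; halve them and add 10 to X). The
names t1, t2, and next_stage are invented for illustration.

```c
struct point { double x, y; };

/* Assumed T1: halve both coordinates. */
struct point t1(struct point p)
{
    p.x *= 0.5;
    p.y *= 0.5;
    return p;
}

/* Assumed T2: halve both coordinates, then add 10 to X. */
struct point t2(struct point p)
{
    p.x = 0.5 * p.x + 10.0;
    p.y *= 0.5;
    return p;
}

/* Applies both transformations to each of the n points in "in" and
   writes the 2n results to "out", which must have room for them and
   must not overlap "in". Returns the number of points written. */
int next_stage(const struct point *in, int n, struct point *out)
{
    int i;

    for (i = 0; i < n; i++) {
        out[2 * i]     = t1(in[i]);
        out[2 * i + 1] = t2(in[i]);
    }
    return 2 * n;
}
```

Starting from a single point and swapping two buffers between calls gives 2,
then 4, then 8 new points, so the buffers must double in size at every stage.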
In this set of plotted points, there is a sequence of points that approaches
the fixed point of T1. This is because there is a sequence of points where
only T1 has been applied. Similarly, there is also a sequence that approaches
the fixed point of T2. Moreover, consider a new transformation, T3, which
moves points by first applying T1 and then applying T2. In the set of plotted
points, you can also find a sequence that approaches the fixed point of T3. In
fact, if you plot enough points, you will have sequences that approach the
fixed point of any finite combination of our original two transformations!
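The stage-by-stage construction can be sketched in a few lines of Python. This is an illustration only: it assumes T1 and T2 are the two example transformations described earlier (halving, and halving plus 10 added to X), and `build_stages` is a hypothetical helper name.

```python
def t1(p):
    """Halve both coordinates; fixed point (0, 0)."""
    return (p[0] / 2, p[1] / 2)


def t2(p):
    """Halve both coordinates and add 10 to X; fixed point (20, 0)."""
    return (p[0] / 2 + 10, p[1] / 2)


def build_stages(start, depth):
    """Apply both transformations to every point of the previous stage."""
    stages = [[start]]
    for _ in range(depth):
        stages.append([t(p) for p in stages[-1] for t in (t1, t2)])
    return stages


stages = build_stages((50, 100), 10)
print([len(s) for s in stages])  # stage sizes double each time

# T3 applies T1 and then T2; solving x = x/4 + 10 gives its fixed point (40/3, 0).
t3_fixed = (40 / 3, 0.0)
closest = min(stages[-1], key=lambda p: (p[0] - t3_fixed[0]) ** 2 + p[1] ** 2)
print(closest)
```

Under these assumptions, the last stage already contains a point close to the fixed point of T3, even though T3 itself was never applied as a single step.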
This is beginning to look interesting. One starting point and two simple
transformations give what could be a complicated set of points on the screen.
Maybe there is an interesting image forming here. With a little more
mathematics, it becomes clear that there is a unique set, call it "S," that
has a nice property. If you pick any arbitrary point of S and apply either of
the two transformations, you get another point of S. S is called an attractor
for the two transformations T1 and T2. The set of points plotted on the screen
is an approximation of S. If you are lucky enough to start with a point in S,
you will stay in S. If you start with a point outside S, some of the first
points plotted may not be very close to S, but points further along in the
sequence do get closer to S. The more points you plot, the more detail you
will get. Often the set S has intricate detail and is called a "fractal."
Before going much further, notice one unfortunate fact. At each stage of the
algorithm there are twice as many new points as at the previous stage! The
exponential nature of this generation makes the algorithm impractical. But
there is a secret door: It turns out that instead of applying each of the
transformations T1 and T2 to the new set of points, you can flip a coin and
choose one of them to apply. This way only one new point is generated at each
stage, and the algorithm is saved. The amazing fact is that by using this
"Random Algorithm" you still get a picture of S! Although the algorithm
incorporates randomness, the resulting image always looks the same.
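A minimal Python sketch of the coin-flipping idea follows. It again assumes T1 and T2 are the two example transformations from earlier in the text; the function name `random_algorithm` and the fixed seed are illustrative choices. For this particular pair, the attractor S works out to be the line segment from (0,0) to (20,0), so the plotted points should settle onto that segment.

```python
import random


def t1(p):
    """Halve both coordinates."""
    return (p[0] / 2, p[1] / 2)


def t2(p):
    """Halve both coordinates and add 10 to X."""
    return (p[0] / 2 + 10, p[1] / 2)


def random_algorithm(start, count, seed=0):
    """Generate one new point per step by flipping a coin between T1 and T2."""
    rng = random.Random(seed)  # seeded only so the run is repeatable
    point = start
    points = []
    for _ in range(count):
        transform = t1 if rng.random() < 0.5 else t2
        point = transform(point)
        points.append(point)
    return points


points = random_algorithm((50, 100), 2000)
settled = points[50:]  # skip the early points, which may lie far from S
print(min(x for x, _ in settled), max(x for x, _ in settled))
```

Only one point is produced per step, so the work grows linearly with the number of points instead of doubling at every stage.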
Let's see what we have so far. A couple of transformations plus a starting
point produce a rather complicated set of points. T1 and T2 are simple
transformations, but imagine transformations that move points in such a way
that they shrink images, or rotate images, or translate images, or skew
images, and so on. Choose some of them to form a special set. When applying
the Random Algorithm, you must randomly pick one of the transformations from
the special set to apply next. The selection can be made by giving equal
weight to each transformation in the set, or you can decide to pick some of
the transformations more often than the others. To implement a
general-selection technique, simply assign a probability to each
transformation to determine how often it is picked. With two transformations,
if the first is assigned a probability of two-thirds, it will be picked twice
as often as the second. The set of transformations plus the set of
probabilities is referred to as the IFS.
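One common way to pick with unequal probabilities is to compare a single random number against a running sum, which is also what Listing Two does. The Python sketch below is a hedged illustration of that idea; `running_sums` and `pick` are hypothetical names, and the two-thirds/one-third split is the example from the text.

```python
import random


def running_sums(probabilities):
    """Turn [p1, p2, ...] into cumulative sums, forcing the last entry to 1."""
    sums = []
    total = 0.0
    for p in probabilities:
        total += p
        sums.append(total)
    sums[-1] = 1.0  # guards against round-off, as Listing Two does
    return sums


def pick(sums, rng):
    """Return the index of the transformation selected by one random draw."""
    w = rng.random()  # uniform in [0, 1)
    i = 0
    while w > sums[i]:
        i += 1
    return i


rng = random.Random(42)
sums = running_sums([2 / 3, 1 / 3])
counts = [0, 0]
for _ in range(30000):
    counts[pick(sums, rng)] += 1
print(counts)  # the first count should be roughly twice the second
```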
A brief historical note: The Australian mathematician John Hutchinson developed
most of the theoretical basis for iterated function systems about eight years
ago. Since then, Michael Barnsley (Georgia Institute of Technology; Iterated
Systems) noticed the connection to graphics and developed the Random
Algorithm. Barnsley has been the most prolific contributor to the field in the
last several years.
The discussion so far would be of little interest if the attractors S were
just a boring set of pixels. So look at Figure 2. This is a common example
called the "Sierpinski triangle." The IFS that produced this image has three
transformations with probability one-third assigned to each of them. This
attractor is not very organic looking, but it is rather complicated -- and
only three transformations are needed to describe it.
There is a restriction on the transformations allowed in an IFS: They must be
what are called "contractions." If a transformation from an IFS is applied to
two pixels on the screen, the distance between the two new pixels must be less
than the distance between the original two pixels. This means that the
transformation shrinks things. It may also rotate or translate, but a little
shrinking is necessary. There are a lot of these contractions in the world,
but for practical purposes it suffices to restrict attention to
transformations that can be described as follows: Let the input point have
coordinates (X,Y), and suppose the output point has coordinates (Xnew, Ynew),
where
Xnew = a*X + b*Y + c
Ynew = d*X + e*Y + f
The six numbers a,b,c,d,e,f describe the transformation; an appropriate data
structure here might be a 2 x 3 array with a,b,c in the first row and d,e,f in
the second. The numbers a,b,d,e determine how the transformation shrinks and
rotates figures while the numbers c and f determine how figures are translated
around the screen. Because translation is dependent on the scale you choose,
the actual values of c and f are not important. If the unit distance is one
inch, then c and f will have smaller values than if the unit distance is one
pixel. The important factor is the ratio of c to f.
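As a small illustration of the 2 x 3 array, the Python sketch below applies one transformation to a point. The helper name `apply_affine` is an assumption; the coefficients are those of the third Sierpinski transformation in Table 1.

```python
def apply_affine(t, point):
    """Apply a 2 x 3 array ((a, b, c), (d, e, f)) to the point (X, Y)."""
    (a, b, c), (d, e, f) = t
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Third Sierpinski transformation from Table 1: shrink by half, then translate.
T3 = ((0.5, 0, 50),
      (0, 0.5, -100))

print(apply_affine(T3, (100, 200)))
print(apply_affine(T3, (100, -200)))  # (100, -200) is the fixed point of T3
```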
Note that an arbitrary selection of the six numbers will certainly describe a
transformation, but it may not be a contraction. In practice, it is usually
easier to look at the effect of the transformation to see if it shrinks
things, rather than to worry whether the particular selection of six numbers
is theoretically allowed.
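For readers who do want a numerical check, one standard test (not part of the article's listings) is to compute the largest factor by which the a,b,d,e part can stretch a distance; the map is a contraction for ordinary Euclidean distance exactly when that factor is below 1. The Python sketch below assumes distances are measured in the (X,Y) coordinates themselves, ignoring any screen aspect ratio, and the function names are hypothetical. The translation numbers c and f do not affect distances, so they are left out.

```python
import math


def contraction_factor(a, b, d, e):
    """Largest factor by which the matrix [[a, b], [d, e]] can stretch a distance.

    This is the largest singular value: the square root of the larger
    eigenvalue of the symmetric matrix M^T M.
    """
    p = a * a + d * d
    q = b * b + e * e
    r = a * b + d * e
    largest_eigenvalue = ((p + q) + math.sqrt((p - q) ** 2 + 4 * r * r)) / 2
    return math.sqrt(largest_eigenvalue)


def is_contraction(a, b, d, e):
    """True when every distance shrinks under the transformation."""
    return contraction_factor(a, b, d, e) < 1


print(contraction_factor(0.5, 0, 0, 0.5))        # a Sierpinski transformation
print(is_contraction(0.85, 0.04, -0.04, 0.85))   # fern T4 from Table 1
print(is_contraction(1.2, 0, 0, 0.5))            # stretches X, so not a contraction
```

The last example shows why shrinking the area is not enough: that map reduces area but still stretches horizontal distances.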
Listing Two, page 74, presents a program for implementing the Random Algorithm
for iterated function systems. For simplicity, the transformations are
designated in a constant declaration along with the probabilities. Starting
with the initial point (0,0), which is mapped to a pixel in the middle of the
screen, the program generates several thousand new points and plots them. Each
new point is determined by randomly selecting one of the functions in the IFS
using the given probabilities and then applying the chosen transformation to
the current point. Simply change the constant declaration to produce the
attractor for other IFSs.
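The same procedure can be sketched without a graphics library. The Python below is a rough counterpart, not a translation of Listing Two: it uses the standard library's `random.choices` for the weighted pick and draws the result on a character grid. The names `random_algorithm` and `render`, the grid size, and the seed are illustrative assumptions; the coefficients are the Sierpinski entries of Table 1.

```python
import random

# Rows are (a, b, c, d, e, f, probability), copied from Table 1.
SIERPINSKI = [
    (0.5, 0, 0, 0, 0.5, 0, 0.33),
    (0.5, 0, 100, 0, 0.5, 0, 0.33),
    (0.5, 0, 50, 0, 0.5, -100, 0.33),
]


def random_algorithm(ifs, count, seed=1):
    """Start at (0, 0) and apply one randomly chosen transformation per step."""
    rng = random.Random(seed)
    weights = [row[6] for row in ifs]
    x, y = 0.0, 0.0
    points = []
    for _ in range(count):
        a, b, c, d, e, f, _prob = rng.choices(ifs, weights=weights)[0]
        x, y = a * x + b * y + c, d * x + e * y + f
        points.append((x, y))
    return points


def render(points, cols=48, rows=24):
    """Scale the points onto a character grid; smaller Y is nearer the top."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x, span_y = max(xs) - min_x, max(ys) - min_y
    grid = [[" "] * cols for _ in range(rows)]
    for x, y in points:
        col = int((x - min_x) / span_x * (cols - 1))
        row = int((y - min_y) / span_y * (rows - 1))
        grid[row][col] = "*"
    return ["".join(line) for line in grid]


points = random_algorithm(SIERPINSKI, 5000)
picture = render(points)
print("\n".join(picture))
```

Swapping in a different list of rows is the analogue of changing the constant declaration in the Pascal program.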
Table 1 gives the transformations and probabilities for generating the
Sierpinski triangle, a fern leaf (which is the image most often associated
with IFSs), and a tree. When reading the table, keep in mind that it is the
ratio of c to f that is important, not their actual values. Figure 3 shows a
more organic looking tree, which is the attractor for the IFS given in the
table. It should be clear that the relatively small amount of space necessary
to save the transformations makes the IFS technique singularly important in
image compression applications.
Table 1: Transformations for a few IFSs

           a       b       c
           d       e       f    prob
------------------------------------
Sierpinski Triangle:
T1       0.5       0       0
           0     0.5       0    0.33
T2       0.5       0     100
           0     0.5       0    0.33
T3       0.5       0      50
           0     0.5    -100    0.33

Fern:
T1         0       0       0
           0    0.16       0    0.02
T2       0.2   -0.26       0
        0.23    0.22     -24   0.065
T3     -0.15    0.28       0
        0.26    0.24    -6.6   0.065
T4      0.85    0.04       0
       -0.04    0.85     -24    0.85

Tree:
T1      0.04       0       0
           0    0.36      21    0.02
T2      0.04       0       8
           0    0.36      21    0.02
T3       0.4       0       4
           0     0.4     -27    0.20
T4      0.54    0.09      -4
        0.11    0.44    -105    0.34
T5      0.34    0.29     -24
       -0.29    0.34     -45    0.21
T6      0.22   -0.45      36
        0.48    0.25     -60    0.21




A Development System


There is still an outstanding practical problem. How do you go from a desired
image to the appropriate IFS? This task is still an art, but there is an
extremely useful technique developed by Barnsley. To understand it, let's
consider again a transformation from an IFS. Because the transformation moves
pixels around the screen, it has an effect on shapes. For example, if you take
a triangle on the screen and apply the transformation to the points on the
triangle, you get a new triangle. Because the transformation is a contraction,
the new triangle is a little smaller than the original and it may be skewed,
rotated, or translated. It is useful to visualize transformations by plotting
their effects on simple shapes.
All transformations in the IFS for the Sierpinski triangle shrink things to
half their original size. Then two of the transformations translate the
resulting images either over to the right or over to the right and up. You can
guess what the transformations do by looking at the picture and noticing that
there are triangles inside triangles. Moreover, the smaller triangles have
sides that are half the size of the sides enclosing them.
Now here is Barnsley's technique. Start with a rough outline of the image you
want. Apply a transformation, and look at the result. Maybe it shrinks the
outline and rotates a little. Pick transformations so that their results
effectively cover the original outline with smaller copies of itself. This is
the self-similarity! If you cover your outline carefully enough, the resulting
IFS will produce the desired image. This fact is called the "Collage Theorem,"
because a collage of shapes makes up the final image (see Figure 2).
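The covering idea can be checked on the Sierpinski triangle. The Python sketch below is an illustration under stated assumptions: the outline is taken to be the triangle whose corners are the fixed points of the three transformations in Table 1, and `apply_affine` and `area` are hypothetical helper names. It applies each transformation to the outline and compares areas.

```python
def apply_affine(t, point):
    """Apply (a, b, c, d, e, f) to the point (X, Y)."""
    a, b, c, d, e, f = t
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def area(polygon):
    """Shoelace formula for the area of a polygon given as a vertex list."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Rough outline: the triangle spanned by the fixed points of the three
# Sierpinski transformations in Table 1.
outline = [(0, 0), (200, 0), (100, -200)]
sierpinski = [
    (0.5, 0, 0, 0, 0.5, 0),
    (0.5, 0, 100, 0, 0.5, 0),
    (0.5, 0, 50, 0, 0.5, -100),
]

copies = [[apply_affine(t, v) for v in outline] for t in sierpinski]
for copy in copies:
    print(copy, area(copy) / area(outline))
```

Each copy is a half-size triangle sitting in one corner of the outline, and the three copies meet at shared vertices, which is the collage that the Sierpinski IFS encodes.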
Listing Three, page 74, presents a development system for designing iterated
function systems by utilizing the Collage Theorem. The system has three
sections. In section 1, the user draws the outline of the desired image on the
screen. This is done simply by moving a cursor around and pressing the
appropriate key when you wish to set a vertex.
In section 2, the user actually builds the appropriate transformations by
observing their effect on the original outline. The array of coefficients
describing the transformation is displayed in the upper-left corner of the
screen. By shrinking, rotating, and translating in various ways, the user
tries to cover the original outline using as many transformations as
necessary. The more accurate the cover, the closer the IFS approximates the
intended image.
After fixing the various transformations, the user moves to section 3 where
the Random Algorithm is used to produce the attractor for the IFS. At this
point, the user may choose to have the original outline on the screen for
comparison. If color is selected, each pixel is plotted in a color
corresponding to the last transformation applied. For example, if
transformation 2 is picked randomly to apply, then the resulting pixel is
colored with color number 2. The coloring simply gives the user a better idea
of how the IFS is working.
In section 3, the program actually calculates a set of probabilities for the
IFS. This is done simply by determining how much each transformation shrinks
images and then assigning lower probability to those that shrink the most.
(For the mathematically inclined, the determinant of the transformation's
array is calculated.) This technique for assigning probabilities is by no
means unique or optimal. It is merely a practical way to get a more or less
uniform image. Other probability assignments can give interesting control over
the image, as discussed shortly. Once probabilities are calculated, the Random
Algorithm begins. Remember, you really can begin anywhere you wish. The
program selects the point (0,0) as the starting point. You may wish to try
other starting points to see if there is an effect on the image.
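The probability assignment can be sketched in Python as follows. This mirrors the approach described above (absolute determinant, a floor of 0.02 as in Listing Three, then normalization), but the function name `assign_probabilities` is an assumption and the fern coefficients are used only as sample input.

```python
def assign_probabilities(transformations, floor=0.02):
    """Weight each (a, b, c, d, e, f) by |a*e - b*d|, then normalize to sum to 1."""
    weights = []
    for a, b, c, d, e, f in transformations:
        det = abs(a * e - b * d)
        weights.append(max(det, floor))  # keep flattening maps from vanishing
    total = sum(weights)
    return [w / total for w in weights]


FERN = [
    (0, 0, 0, 0, 0.16, 0),
    (0.2, -0.26, 0, 0.23, 0.22, -24),
    (-0.15, 0.28, 0, 0.26, 0.24, -6.6),
    (0.85, 0.04, 0, -0.04, 0.85, -24),
]

probabilities = assign_probabilities(FERN)
print([round(p, 3) for p in probabilities])
```

The values produced this way differ somewhat from the hand-tuned probabilities in Table 1, which is consistent with the remark that the determinant rule is practical rather than optimal.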
The program in Listing Three is fairly straightforward. A simple user
interface was chosen to keep the complexity down. In the final image, 3000
points are plotted, but this can easily be increased or decreased by changing
the value of the variable NM in the main body of the GENERATE procedure. One
of the final options is to save the transformations. Each transformation is
stored as a 2 x 3 array on the data file.


Going Further


There are several generalizations of the IFS technique that lead to more
realistic images. Three of them are particularly interesting.
First, the selection of probabilities can give enough control over the image
to allow for color or gray-scale rendering. The idea is this: As the sequence
of points is generated, some areas of the screen are "hit" more often than
others. In fact, some pixels are plotted more than once. By setting the color
of a screen area according to how many times it is hit, you can produce a
color image. Then, by adjusting the probabilities and perhaps adding or
deleting transformations, you can shade your image in various ways.
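A hedged Python sketch of the hit-counting idea follows. The cell size, the 16 gray levels, the deliberately unequal weights, and the names `hit_counts` and `gray_level` are all assumptions made for illustration; only the Sierpinski coefficients come from Table 1.

```python
import random

SIERPINSKI = [
    (0.5, 0, 0, 0, 0.5, 0),
    (0.5, 0, 100, 0, 0.5, 0),
    (0.5, 0, 50, 0, 0.5, -100),
]
WEIGHTS = [0.6, 0.2, 0.2]  # deliberately unequal (an assumption, not Table 1)


def hit_counts(ifs, weights, count, cell=10, seed=3):
    """Count how many plotted points land in each cell-by-cell screen area."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    hits = {}
    for _ in range(count):
        a, b, c, d, e, f = rng.choices(ifs, weights=weights)[0]
        x, y = a * x + b * y + c, d * x + e * y + f
        key = (int(x // cell), int(y // cell))
        hits[key] = hits.get(key, 0) + 1
    return hits


def gray_level(hit_total, busiest, levels=16):
    """Map a hit count to a gray level from 1 to levels - 1 (0 is background)."""
    return 1 + (hit_total * (levels - 2)) // busiest


hits = hit_counts(SIERPINSKI, WEIGHTS, 20000)
busiest = max(hits.values())
shades = {key: gray_level(n, busiest) for key, n in hits.items()}
print(len(hits), busiest, sorted(set(shades.values())))
```

Raising the weight of one transformation makes the part of the image it produces denser, and therefore brighter under this mapping.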
For the very intrepid, a second generalization is useful. The transformations
talked about so far are transformations that move points in the plane. But
there is no reason to restrict the technique to two dimensions. If you take
transformations that move points in space, then you have a three-dimensional
attractor. With an appropriate rendering algorithm, this image could look more
realistic than the two-dimensional ones.
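As a small three-dimensional illustration (an example chosen here, not one from the article), the Python sketch below uses four transformations that each shrink space by one-half toward a corner of a tetrahedron. The corner coordinates and the names `toward` and `random_algorithm_3d` are assumptions; rendering is left out.

```python
import random

# Corners of a tetrahedron (an illustrative choice, not from the article).
CORNERS = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]


def toward(corner, point):
    """Contract space by one-half toward the given corner."""
    return tuple((p + c) / 2 for p, c in zip(point, corner))


def random_algorithm_3d(count, seed=5):
    """Run the Random Algorithm with one equally likely map per corner."""
    rng = random.Random(seed)
    point = (0.0, 0.0, 0.0)
    points = []
    for _ in range(count):
        point = toward(rng.choice(CORNERS), point)
        points.append(point)
    return points


points = random_algorithm_3d(5000)
print(points[:3])
```

The generated points stay inside the tetrahedron, just as the two-dimensional Sierpinski points stay inside their triangle.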
Finally, two IFSs can be mixed in interesting ways. Listing Four, page 78,
produces the forest of ferns shown in Figure 5 by mixing the transformations
for a fern with two more transformations that give the forest shape.
Basically, the fern is drawn using the Random Algorithm, but part of the time
(with probability 0.3 each in Listing Four) one of the two other
transformations is invoked instead. One key to this
particular mixing method is that the algorithm keeps track of where it was in
the fern before invoking the forest transformations. After using the forest
transformations, the algorithm returns to the previous point in the fern.
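The bookkeeping in this mixing method can be sketched in Python. The coefficient tables and running-sum probabilities below are copied from Listing Four; the function names, the seed, and the decision to return the points instead of plotting them are illustrative assumptions.

```python
import random

# Fern IFS (a, b, c, d, e, f) and forest transformations, copied from Listing Four.
CT = [
    (0, 0, 0, 0, 0.16, 0),
    (0.2, -0.26, 0, 0.23, 0.22, -24),
    (-0.15, 0.28, 0, 0.26, 0.24, -6.6),
    (0.85, 0.04, 0, -0.04, 0.85, -24),
]
PL = [
    (0.8, 0, 80, 0, 0.8, -65),
    (0.8, 0, -80, 0, 0.8, -60),
]
PROB = [0.008, 0.034, 0.06, 0.4, 0.7, 1.0]  # running sums over all six


def affine(t, x, y):
    """Apply (a, b, c, d, e, f) to the point (x, y)."""
    a, b, c, d, e, f = t
    return a * x + b * y + c, d * x + e * y + f


def forest_points(count, seed=7):
    """Return the plotted points and the number of forest steps taken."""
    rng = random.Random(seed)
    x = y = bx = by = 0.0  # (x, y): place in the fern; (bx, by): point to plot
    plotted = []
    forest_steps = 0
    for _ in range(count):
        w = rng.random()
        s = 0
        while w > PROB[s]:
            s += 1
        if s < 4:  # fern step: advance the fern and plot that point
            x, y = affine(CT[s], x, y)
            bx, by = x, y
        else:  # forest step: move the plotted point, leave (x, y) untouched
            bx, by = affine(PL[s - 4], bx, by)
            forest_steps += 1
        plotted.append((bx, by))
    return plotted, forest_steps


plotted, forest_steps = forest_points(20000)
print(len(plotted), forest_steps / len(plotted))
```

Keeping (x, y) separate from (bx, by) is what lets the next fern step resume from the fern itself rather than from a shifted copy.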
There are exciting possibilities with IFSs and several unsolved problems, both
theoretical and practical. Yet it is easy to get in the middle of this fractal
landscape of images by doing a little experimenting on your own.


Bibliography



Barnsley, Michael. Fractals Everywhere. San Diego, Calif.: Academic Press,
1988.
Mandelbrot, Benoit. The Fractal Geometry of Nature. San Francisco, Calif.:
W.H. Freeman, 1982.
Peitgen, Heinz-Otto and Dietmar Saupe, editors. The Science of Fractal
Images. New York, N.Y.: Springer-Verlag, 1988.

_RECURSIVE IMAGES_
by Steven Janke



[LISTING ONE]

PROGRAM RECURTRE;
 uses graph;
 var inc,firstdirection :real;
 gd,gm,depth,scale :integer;
 startx,starty :integer;
 xasp,yasp :word;
 asp :real;
 const pi:real=3.14159;
 procedure TREE(X,Y:integer; DIR:real; LEVEL:integer);
 var xnew,ynew:integer;
 begin
 if level>0 then {At level zero, recursion ends.}
 begin
 xnew:= round(level*scale*cos(dir))+x; {Multiplying by level }
 ynew:= round(asp*level*scale*sin(dir))+y; {varies the branch size.}
 if level<3 then setcolor(green) else setcolor(brown); {Green leaves}
 line(x,y,xnew,ynew);
 TREE(xnew,ynew,dir+random*inc,level-1); {Two recursive calls - one}
 TREE(xnew,ynew,dir-random*inc,level-1); {for each new branch.}
 end;
 end;
 procedure INIT;
 begin
 firstdirection:=-pi/2; {Negative since y increases down the screen.}
 inc:=pi/4;
 scale:=5;
 depth:=10;
 startx:=round(GETMAXX/2); starty:=round(0.75*GETMAXY);
 GETAspectRatio(xasp,yasp); asp:=xasp/yasp; {Find aspect ratio}
 end;
 BEGIN
 gd:=detect;
 initgraph(gd,gm,'\tp\units'); {Graphic drivers kept in \tp\units.}
 cleardevice; randomize;
 INIT;
 TREE(startx, starty, firstdirection, depth);
 readln;
 closegraph;
 END.





[LISTING TWO]

PROGRAM IFSDRAW; {Random Algorithm for drawing IFS attractor.}
 uses graph;

 var gd, gm :integer; {For graphics initialization}
 xoff, yoff :integer; {Offset to determine origin}
 xsc, ysc :real; {Scale variables}
 n, cl :integer; {Index variable, color variable}
 x,y,asp :real; {Starting point and aspect ratio}
 xasp,yasp :word; {Used to determine aspect ratio}
 const {Normally, these constants would be read from a data file. They
 are listed as constants here only for illustration. These
 particular transformations form an IFS for Sierpinski's triangle.}
 Totaltran:integer=3;
 CT:array[1..3,1..7] of real =
 {Format: a, b, c, d, e, f, probability}
 (( 0.5, 0, 0, 0, 0.5, 0, 0.33),
 ( 0.5, 0, 100, 0, 0.5, 0, 0.33),
 ( 0.5, 0, 50, 0, 0.5, -100, 0.33));
 procedure SETPROB;
 {To get a running sum of the probabilities for random number generation.}
 var i:integer;
 sum:real;
 begin
 sum:=0;
 for i:=1 to totaltran-1 do
 begin sum:=sum+CT[i,7]; CT[i,7]:=sum; end;
 CT[totaltran,7]:=1; {This is set to 1 to avoid any round-off problem.}
 end;
 procedure MAKETRAN;
 {Determine which transformation is next and then apply it.}
 var nx,ny:real;
 s:integer;
 function FINDTRAN:integer;
 {Return a random number between 1 and the number of transformations.}
 var i:integer;
 w:real;
 begin
 w:=random; i:=1;
 while w>CT[i,7] do i:=i+1;
 FINDTRAN:=i;
 end;
 begin
 S:=FINDTRAN;
 NX:=CT[S,1]*X + CT[S,2]*Y + CT[S,3];
 NY:=CT[S,4]*X + CT[S,5]*Y + CT[S,6];
 X:=NX; Y:=NY;
 end;
 procedure INIT;
 begin
 XSC:=1; YSC:=1; {Scale factors}
 XOFF:=round(GETMAXX/2); YOFF:=round(GETMAXY/2); {Determines origin}
 X:=0; Y:=0; {Starting point}
 cl:=white;
 GETAspectRatio(xasp,yasp); {BGI function for determining aspect ratio}
 asp:=xasp/yasp;
 end;
 BEGIN
 gd:=detect; initgraph(gd,gm,' '); cleardevice;
 INIT; SETPROB;
 for N:=1 to 5000 do
 begin
 MAKETRAN;

 putpixel(round(X*XSC)+XOFF, (round(asp*Y*YSC)+YOFF),cl);
 end;
 readln;
 closegraph;
 END.






[LISTING THREE]

PROGRAM IFS; {ITERATED FUNCTION SYSTEM DESIGNER}
 uses graph,crt;
 type matrix = array[1..2,1..3] of real;
 var points:array[1..100,1..2] of integer; {Points and Pts store vertices}
 pts:array[1..100,1..2] of real; {of main figure.}
 gd,gm: integer; {For graphics initialization.}
 cp:integer; {Total number of vertices in main figure.}
 xoff,yoff:integer; {Offset for main figure placement.}
 asp,xt,yt:real; {Aspect ratio and offsets for transformation.}
 select:boolean; {For menu selection.}
 tran:matrix; {Coefficients of current transformation.}
 tranlist: array[1..50] of matrix; {List of transformations}
 totaltran:integer; {Total number of transformations.}
 procedure APPLYTRAN; {--------------------------------------------}
 {Applies the current transformation to the vertices of main figure.}
 var i:integer;
 a:real;
 begin
 for i:=1 to cp do
 begin
 a:=tran[1,1]*pts[i,1]+tran[1,2]*pts[i,2];
 pts[i,2]:=tran[2,1]*pts[i,1]+tran[2,2]*pts[i,2];
 pts[i,1]:=a;
 end;
 end;
 procedure INIT; {-------------------------------------------------}
 var xasp,yasp:word;
 begin
 cp:=1;
 xoff:=round(GETMAXX/2); yoff:=round(GETMAXY/2);
 xt:=0; yt:=0;
 GETASPECTRATIO(Xasp,Yasp); asp:=xasp/yasp;
 totaltran:=0;
 end;
 procedure INITTRAN; {---------------------------------------------}
 begin
 tran[1,1]:=1; tran[1,2]:=0; tran[2,1]:=0; tran[2,2]:=1;
 end;
 procedure SAVETRAN(n:integer); {----------------------------------}
 begin
 tranlist[n]:=tran;
 tranlist[n,1,3]:=xt; tranlist[n,2,3]:=yt;
 xt:=0; yt:=0;
 end;
 procedure CONVPOINTS; {-------------------------------------------}
 {Converts screen coordinates in Points to world coordinates in Pts.}

 var i:integer;
 begin
 for i:=1 to cp do
 begin
 pts[i,1]:=points[i,1]-xoff;
 pts[i,2]:=(points[i,2]-yoff)/asp;
 end;
 end;
 procedure DRAWFIG(col:integer); {---------------------------------}
 var i,holdcol:integer;
 begin
 holdcol:=getcolor; setcolor(col);
 for i:=1 to cp-1 do
 line(round(pts[i,1]+xoff+xt),round(pts[i,2]*asp+yoff+yt*asp),
 round(pts[i+1,1]+xoff+xt),round(pts[i+1,2]*asp+yoff+yt*asp));
 setcolor(holdcol);
 end;
 procedure REDRAW(N:integer); {-------------------------------------}
 {Redraws original figure plus the results of each transformation.}
 {Transformation number N is not drawn.}
 var i:integer;
 begin
 xt:=0; yt:=0;
 cleardevice; CONVPOINTS; DRAWFIG(blue);
 for i:=1 to totaltran do
 if i<>n then
 begin
 CONVPOINTS; tran:=tranlist[i];
 xt:=tranlist[i,1,3]; yt:=tranlist[i,2,3];
 APPLYTRAN;
 DRAWFIG(red);
 end;
 xt:=0; yt:=0;
 end;
 procedure SCALE(xsize,ysize:real); {-------------------------------}
 {Changes the size of a figure.}
 var i,j:integer;
 begin
 for i:=1 to cp do
 begin pts[i,1]:=xsize*pts[i,1];
 pts[i,2]:=ysize*pts[i,2];
 end;
 for i:=1 to 2 do tran[1,i]:=xsize*tran[1,i];
 for i:=1 to 2 do tran[2,i]:=ysize*tran[2,i];
 end;
 procedure POSITION; {---------------------------------------------}
 {Positions figure as a new transformation is constructed.}
 var k:char;
 xx,yy:integer;

 procedure DIRECTIONS; {....................................}
 begin
 gotoxy(1,16); writeln('SCALE (S/W)');
 writeln('SCALEX (A/Q)');
 writeln('SCALEY (D/E)');
 writeln('ROTATE (R/F)');
 writeln('ROTATEX (T/G)');
 writeln('ROTATEY (Y/H)');
 writeln('REFLECT (X)');

 writeln('Use ARROWS to translate.');
 gotoxy(1,25); write('... Press Enter when finished ...');
 end;
 procedure REFLECT; {......................................}
 {Flips the figure around the line x=y.}
 var i:integer;
 xx:real;
 begin
 for i:=1 to cp do
 begin xx:=pts[i,1]; pts[i,1]:=pts[i,2]; pts[i,2]:=xx; end;
 xx:=tran[1,1]; tran[1,1]:=tran[2,1]; tran[2,1]:=xx;
 xx:=tran[1,2]; tran[1,2]:=tran[2,2]; tran[2,2]:=xx;
 end;
 procedure ROTATE(xangle,yangle:real); {...................}
 {Rotates the figure. If xangle and yangle are unequal, rotation}
 {is skewed.}
 var i,j:integer;
 a,b,xca,xsa,yca,ysa:real;
 begin
 xca:=cos(xangle); xsa:=sin(xangle);
 yca:=cos(yangle); ysa:=sin(yangle);
 for i:=1 to cp do
 begin
 a:=pts[i,1]*xca-pts[i,2]*ysa;
 pts[i,2]:=pts[i,1]*xsa+pts[i,2]*yca;
 pts[i,1]:=a;
 end;
 a:=tran[1,1]*xca-tran[2,1]*ysa;
 b:=tran[1,2]*xca-tran[2,2]*ysa;
 tran[2,1]:=tran[1,1]*xsa+tran[2,1]*yca;
 tran[2,2]:=tran[1,2]*xsa+tran[2,2]*yca;
 tran[1,1]:=a; tran[1,2]:=b;
 end;
 procedure WRITETRAN; {......................................}
 var i,j:integer;
 begin
 gotoxy(1,3); writeln('Current Transformation: ');
 for i:=1 to 2 do
 begin
 for j:=1 to 2 do
 begin
 gotoxy(1+(j-1)*10, 5+(i-1));
 writeln(tran[i,j]:7:2);
 end;
 gotoxy(21, 5+(i-1));
 if i=1 then writeln(xt:7:2) else writeln(yt:7:2);
 end;
 end;
 begin
 xx:=round(xt); yy:=round(asp*yt);
 WRITETRAN; DIRECTIONS;
 k:=readkey;
 while ord(k)<>13 do
 begin
 DRAWFIG(green);
 case ord(k) of
 0: begin
 k:=readkey;
 case ord(k) of

 72: yy:=yy-3;
 77: xx:=xx+4;
 80: yy:=yy+3;
 75: xx:=xx-4;
 end;
 end;
 83,115: scale(0.9,0.9); { S for decrease }
 87,119: scale(1.1,1.1); { W for increase }
 65,97 : scale(0.9,1); { A for x decrease }
 68,100: scale(1,0.9); { D for y decrease }
 81,113: scale(1.1,1); { Q for x increase }
 69,101: scale(1,1.1); { E for y increase }
 82,114: rotate(0.1,0.1); { R for rotate cw }
 70,102: rotate(-0.1,-0.1); { F for rotate ccw }
 84,116: rotate(-0.1,0); { T for x rotate cw }
 71,103: rotate(0.1,0); { G for x rotate ccw }
 89,121: rotate(0,-0.1); { Y for y rotate cw }
 72,104: rotate(0,0.1); { H for y rotate ccw }
 88,120: reflect; { X to reflect in x=y }
 end;
 xt:=xx; yt:=yy/asp; DRAWFIG(green);
 WRITETRAN;
 k:=readkey;
 end;
 end;
 procedure SHAPE; {-------- SECTION I ------------------------------}
 {Sets up the main figure.}
 var i,j,er:integer;
 k:char;
 procedure BOX(x,y,col:integer); {..........................}
 var vs,hs,holdcol:integer;
 begin
 hs:=3; vs:=2; holdcol:=getcolor; setcolor(col);
 line(x-hs,y-vs,x+hs,y-vs);
 line(x+hs,y-vs,x+hs,y+vs);
 line(x+hs,y+vs,x-hs,y+vs);
 line(x-hs,y+vs,x-hs,y-vs);
 setcolor(holdcol);
 end;
 begin
 gotoxy(1,1); writeln('ITERATED FUNCTION SYSTEM DESIGNER');
 writeln('Section I: Draw outline of desired figure ....');
 gotoxy(1,23); writeln('Use arrows to position cursor.');
 writeln('Press P to place a vertex.');
 write('Press Enter when finished.');
 i:=xoff; j:=yoff; setwritemode(xorput);
 BOX(i,j,white);
 k:=readkey; er:=1; {Variable er used to determine when to draw box.}
 while ord(k)<>13 do
 begin
 case ord(k) of
 0: begin if er=1 then BOX(i,j,white); er:=1;
 k:=readkey;
 case ord(k) of
 72: j:=j-6;
 77: i:=i+8;
 80: j:=j+6;
 75: i:=i-8;
 end;

 BOX(i,j,white);
 end;
 80,112: begin er:=0; points[cp,1]:=i; points[cp,2]:=j;
 if cp>1 then begin setcolor(blue);
 line(points[cp-1,1],points[cp-1,2],
 points[cp,1], points[cp,2]);
 setcolor(white); end;
 cp:=cp+1;
 end;
 end;
 k:=readkey;
 end;
 points[cp,1]:=points[1,1]; points[cp,2]:=points[1,2];
 setcolor(blue);
 line(points[cp-1,1],points[cp-1,2],points[1,1],points[1,2]);
 setcolor(white); setwritemode(copyput);
 end;
 procedure MAKETRAN; {---------- SECTION II ------------------------}
 {Allows construction and alteration of transformations.}
 var nt,choice:integer;
 s,me:char;
 function MENUII:integer; {........................................}
 var xn:integer;
 begin
 gotoxy(1,1); writeln('1. Another Transformation');
 writeln('2. Next Transformation');
 writeln('3. Prepare to Draw');
 gotoxy(1,5); writeln('Select Number: '); me:=readkey;
 while (ord(me)<49) or (ord(me)>51) do me:=readkey;
 MENUII:=ord(me)-48;
 gotoxy(1,1);
 for xn:=1 to 5 do writeln(' ');
 end;
 begin
 gotoxy(1,1); writeln('Section II: Build Transformations ...');
 choice:=1; nt:=0;
 if totaltran<>0 then choice:=2;
 while choice<>3 do
 begin
 if choice=2 then
 begin nt:=nt+1;
 if nt>totaltran then nt:=1;
 REDRAW(nt);
 tran:=tranlist[nt];
 xt:=tranlist[nt,1,3]; yt:=tranlist[nt,2,3];
 end
 else begin INITTRAN; totaltran:=totaltran+1;
 nt:=totaltran;end;
 CONVPOINTS;
 if choice=2 then APPLYTRAN else SCALE(0.5,0.5);
 setwritemode(xorput);
 DRAWFIG(green);
 POSITION;
 setwritemode(copyput);
 SAVETRAN(NT);
 REDRAW(0);
 CHOICE:=MENUII;
 end;
 cleardevice;

 end;
 procedure GENERATE; {------------ SECTION III ---------------------}
 {Draw the resulting picture by applying transformations at random.}
 var xx,nm,wh,bd,cl,choice:integer;
 x,y:real;
 me:char;
 probs:array[1..50] of real;

 procedure ASSIGNPROB; {....................................}
 {Determines probability of each transformation.}
 var i:integer;
 s:real;
 begin
 for i:=1 to totaltran do
 begin
 tran:=tranlist[i];
 probs[i]:=abs(tran[1,1]*tran[2,2] - tran[1,2]*tran[2,1]);
 if probs[i]<0.02 then probs[i]:=0.02;
 end;
 s:=0; for i:=1 to totaltran do s:=s+probs[i];
 for i:=1 to totaltran do probs[i]:=probs[i]/s;
 s:=0; for i:=1 to totaltran do begin s:=s+probs[i]; probs[i]:=s; end;
 probs[totaltran]:=1; {Loop variable i is undefined after the for loop.}
 end;
 function PICK:integer; {..................................}
 {Picks a transformation with designated probability distribution.}
 var j:integer;
 p:real;
 begin
 p:=random; j:=1;
 while p>probs[j] do j:=j+1;
 PICK:=j;
 end;
 procedure APPLY(w:integer); {..............................}
 {Applies chosen transformation to current point X,Y.}
 var z:real;
 begin
 tran:=tranlist[w];
 z:=tran[1,1]*X+tran[1,2]*Y;
 Y:=tran[2,1]*X+tran[2,2]*Y;
 X:=z+tran[1,3];
 Y:=Y+tran[2,3];
 end;
 procedure PUTIT(cc:integer); {.............................}
 begin
 if cl=0 then cc:=white;
 putpixel(round(X+xoff),round(Y*asp+yoff),cc);
 end;
 procedure MENUIII; {.......................................}
 var s:string;
 xx:integer;
 begin
 bd:=0;cl:=0;
 gotoxy(1,3); write('1. Border (Toggles)');
 gotoxy(25,3); writeln('Excluded');
 write('2. Color (Toggles)');
 gotoxy(25,4); writeln('No');
 writeln('3. Draw Image');
 writeln;writeln('Select Number: ');

 me:='5';
 while (ord(me)<>51) do
 begin
 me:=readkey;
 while (ord(me)<49) or (ord(me)>51) do me:=readkey;
 case ord(me) of
 49: begin if bd=0 then begin bd:=1; s:='Included'; end
 else begin bd:=0; s:='Excluded'; end;
 gotoxy(25,3);write(s);
 end;
 50: begin if cl=0 then begin cl:=1; s:='Yes';end
 else begin cl:=0; s:='No ';end;
 gotoxy(25,4);write(s);
 end;
 end;
 end;
 gotoxy(1,3);
 for xx:=1 to 5 do writeln(' ');
 end;
 begin
 cleardevice; ASSIGNPROB; randomize;
 gotoxy(1,1); writeln('Section III: Draw Image ... ');
 MENUIII;
 if bd=1 then begin CONVPOINTS; DRAWFIG(blue); end;
 nm:=3000; {Number of points to be plotted in final image.}
 X:=0;Y:=0; {Initial point drawn.}
 PUTIT(7);
 for xx:=1 to nm do
 begin
 wh:=PICK; APPLY(wh); PUTIT((wh mod 7)+1);
 end;
 end;
 procedure FILESAVE;
 {To save transformations on disk.}
 var i:integer;
 tranfile:file of matrix;
 begin
 assign(tranfile, 'IFS.DAT');
 rewrite(tranfile);
 for i:=1 to totaltran do write(tranfile, tranlist[i]);
 close(tranfile);
 end;
 function MENUIV:boolean; {.......................................}
 var s:string;
 me:char;
 begin
 gotoxy(1,3); writeln('1. Return to Section II');
 writeln('2. Save transformations on file');
 writeln('3. Quit');
 writeln;writeln('Select Number: ');
 me:='2';
 while me='2' do
 begin
 me:=readkey;
 while (ord(me)<49) or (ord(me)>51) do me:=readkey;
 if me='2' then begin FILESAVE;
 gotoxy(1,9); writeln('DATA SAVED');
 end;
 end;

 if me='1' then MENUIV:=true else MENUIV:=false;
 end;
 BEGIN {----------------- Main Body ------------------------------}
 gd:=detect; initgraph(gd,gm,'');
 directvideo:=false; {Allows text using WRITE statements.}
 INIT; cleardevice;
 SHAPE; {... Section I ...}
 select:=true;
 while select do
 begin
 REDRAW(0);
 MAKETRAN; {... Section II ...}
 GENERATE; {... Section III ...}
 select:=MENUIV;
 end;
 cleardevice; closegraph;
 END.





[LISTING FOUR]

PROGRAM FOREST; {A mixture of two systems to produce a forest of ferns}
 uses graph;
 var n,xoff,yoff,gd,gm,cl: integer;
 xsc,ysc,x,y,bx,by,asp:real;
 xasp,yasp:word;
 const
 {CT holds the IFS for a fern}
 CT:array[1..4,1..7] of real =
 (( 0, 0, 0, 0, 0.16, 0, 0.02),
 ( 0.2,-0.26, 0, 0.23, 0.22, -24, 0.065),
 (-0.15, 0.28, 0, 0.26, 0.24, -6.6, 0.065),
 ( 0.85, 0.04, 0,-0.04, 0.85, -24, 0.85));
 {PL holds additional IFS functions to produce the forest}
 PL:array[1..2,1..6] of real =
 (( 0.8, 0, 80, 0, 0.8, -65),
 ( 0.8, 0, -80, 0, 0.8, -60));
 PROB:array[1..6] of real = (0.008, 0.034, 0.06, 0.4, 0.7, 1.0);
 procedure MAKETRAN;
 var nx,ny:real;
 s:integer;
 function FINDTRAN:integer;
 var i:integer;
 w:real;
 begin
 w:=random; I:=1;
 while w>PROB[i] do i:=i+1;
 FINDTRAN:=i;
 end;
 begin
 s:=FINDTRAN;
 if s<5 then {Generate another point in the fern.}
 begin
 nx:=CT[s,1]*x + CT[s,2]*y + CT[s,3];
 ny:=CT[s,4]*x + CT[S,5]*y + CT[s,6];
 x:=nx; y:=ny; bx:=x; by:=y;

 end
 else {Generate another point in the forest.}
 begin
 s:=s-4;
 nx:=PL[s,1]*bx + PL[s,2]*by + PL[s,3];
 ny:=PL[s,4]*bx + PL[s,5]*by + PL[s,6];
 bx:=nx; by:=ny;
 end;
 end;
 procedure INIT;
 begin
 xsc:=1.3; ysc:=1;
 xoff:=round(GETMAXX/2); yoff:=GETMAXY-50;
 x:=0; y:=0;
 bx:=0; by:=0;
 GETAspectRatio(xasp,yasp); asp:=xasp/yasp;
 end;
 BEGIN
 gd:=detect; initgraph(gd,gm,' ');
 INIT; cleardevice;
 for N:=1 to 32000 do
 begin
 MAKETRAN;
 putpixel(round(bx*xsc)+xoff,(round(asp*by*ysc)+yoff),green);
 end;
 readln; cleardevice; closegraph;
 END.



































July, 1991
SAVING AND RESTORING VGA SCREENS


Register programming without pain




Ben Myers


Ben is a founder and partner in Spirit of Performance, a Harvard, Mass. firm
that publishes Personal Measure, a package that measures application
performance and resource utilization. He also designs and programs custom
benchmarks of hardware and software. He can be reached at MCI Mail ID
357-1400.


When IBM announced the PS/2s with Video Graphics Array (VGA) controllers, software
developers were cautioned to program only to VGA BIOS specifications. Those
who heeded IBM's warning were rewarded with painfully slow graphics routines.
Today, many VGA cards conform both to IBM's BIOS specification and to the
lower-level VGA register specification. This means that neither performance
nor standardization needs to be sacrificed.
VGA is fundamentally an extension of the older Enhanced Graphics Adapter (EGA)
specification. However, VGA's square pixels allow squares to be truly square
and circles to be truly round. New BIOS calls are also defined for
configuration data, status information, and extra graphics modes. VGA adds an
additional write mode for register-level operations, a palette of 256K colors,
a 4:3 screen aspect ratio, and improved split-screen and panning capabilities.
Finally, in common with the less successful MCGA on the IBM PS/2 Models 25 and
30, VGA offers a 256-color mode, 320 x 200 pixels in size.
In finding out how VGA adapters really work, I turned to Richard Wilton's fine
Programmer's Guide to PC & PS/2 Video Systems (see bibliography) and worked
out how to fill an entire screen with a single color using repeated STOSB
instructions. The result was an honest benchmark of adapter performance whose
timings differed by no more than a few percent for the same adapter on an
8-MHz 80286 and a 33-MHz 80386. It was interesting to discover that some
brands of adapters left random unfilled dots on the screen when subjected to
the high data rate generated by STOSB on the faster PC, apparently due to
occasional misses on critical bus timings. This meant I had to write an
adapter integrity test that filled the screen with one color and read it all
back, to make sure that the color really had gotten into the video adapter
memory.
After finishing the adapter integrity test, I realized I had almost everything
necessary to save an entire screen and restore it again, so I tinkered some
more and came up with Read_VGA_Plane and Write_VGA_Plane (see Listing One,
page 79), both of which operate on one plane at a time. Wilton's book provides
little help for figuring out how to read individual VGA bit planes, but
Bradley Dyck Kliewer's otherwise unremarkable EGA/VGA, A Programmer's
Reference Guide contained some reference information on all the VGA registers,
including a good description of the important Read Map Select Register.


Quick Overview of VGA


Saving and subsequently restoring a VGA screen image exploits a small fraction
of the registers and capabilities built into the adapter.
For VGA mode 12h (640x480, 16 colors), the memory inside a VGA adapter is
organized into four planes of bits (see Figure 1). The VGA buffer begins at
address A000:0000 in PC memory and is organized horizontally. That is, the
contents of a single byte are displayed horizontally, followed by the next
byte, and the next, all on a one-pixel-high scan line. The last byte of a scan
line (pixels 632 - 639, numbering from zero) is followed in memory by the
first byte (pixels 0 - 7) of the next scan line. Each single byte in PC VGA
memory corresponds to 4 (!) bytes on the VGA card, one in each plane. To allow
16 colors to be displayed through a single range of PC memory addresses, VGA
registers must be programmed to write into some or all bit planes on the
adapter. This allows a single byte of data to be used to manipulate bit values
in any of the four planes.
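The layout just described can be put into arithmetic. The following C sketch is an illustration only, not code from the listings; the names pixel_offset, pixel_mask, and BYTES_PER_LINE are made up for the example, and it assumes mode 12h with 80 bytes per scan line and the leftmost pixel of each byte in the most significant bit.

```c
#include <stdint.h>

enum { BYTES_PER_LINE = 640 / 8 };   /* 80 bytes per scan line in mode 12h */

/* Offset from A000:0000 of the byte that holds pixel (x, y). */
uint16_t pixel_offset(unsigned x, unsigned y)
{
    return (uint16_t)(y * BYTES_PER_LINE + x / 8);
}

/* Bit inside that byte: the leftmost pixel is the most significant bit. */
uint8_t pixel_mask(unsigned x)
{
    return (uint8_t)(0x80u >> (x & 7u));
}
```

Pixel (639, 0) lands in byte 79 and pixel (0, 1) in byte 80, which is the wrap from one scan line to the next described in the text.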
When software wants to write data in a given color, the VGA treats the color
number as a set of four binary bits, one for each plane. Plane 0 is the "blue"
plane, plane 1 is "green," and plane 2 is "red." Finally, plane 3 controls the
intensity of the bits displayed. Mixing red, green, blue, and intensity data
together gives the 16 colors for VGA graphics. For example, the color yellow,
value 0Eh, is derived from an intensity bit, a red bit, and a green bit. The
VGA palette registers provide a level of indirection for color. They allow for
remapping of the 16 colors into any one of 256 possible hues.
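To make the color-to-plane mapping concrete, here is a small C sketch that splits a color number across four bytes, one per plane, and reassembles it. It is a software model under stated assumptions, not adapter code: the four-element array stands in for the four plane bytes that share one PC address, and put_pixel and get_pixel are hypothetical names.

```c
#include <stdint.h>

enum { PLANES = 4 };   /* 0 = blue, 1 = green, 2 = red, 3 = intensity */

/* Set the pixel selected by `mask` to `color`: bit p of the color number
   becomes the pixel's bit in plane p. */
void put_pixel(uint8_t plane_byte[PLANES], uint8_t mask, unsigned color)
{
    for (int p = 0; p < PLANES; p++) {
        if (color & (1u << p))
            plane_byte[p] |= mask;
        else
            plane_byte[p] &= (uint8_t)~mask;
    }
}

/* Rebuild the color number of the pixel selected by `mask`. */
unsigned get_pixel(const uint8_t plane_byte[PLANES], uint8_t mask)
{
    unsigned color = 0;
    for (int p = 0; p < PLANES; p++)
        if (plane_byte[p] & mask)
            color |= 1u << p;
    return color;
}
```

Writing color 0Eh sets the pixel's bit in planes 1, 2, and 3 (green, red, intensity) and clears it in plane 0 (blue).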
When working at the BIOS level, use interrupt 10h to write text and to set
individual pixels in any one of the 16 colors in the current palette. This
method works, albeit slowly.
Working at the register level requires "programming" the VGA registers, which
is nothing more than putting values there to tell the adapter how to operate
on the bit patterns that will be written into PC memory later on. The
primitive VGA operations read, write, and update data in the adapter planes.
Updates are further subdivided into AND, OR, and XOR that combine a "latched"
VGA planar byte with a byte in memory.
Updating data in the VGA adapter planes is more complex than simply writing or
reading it. It requires that the data in the adapter planes be latched one
byte at a time, updated, and written back. A typical latching operation merely
moves data from PC VGA memory to a PC register, for example, mov al,ds:[si],
where ds = A000h and si points to a byte in VGA memory. The simple step of
reading memory in the PC's VGA buffer area forces the VGA controller to
respond by loading the corresponding byte from each of its four planes into
its internal latches.
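The latch-then-update cycle is easier to follow in a software model. The C sketch below is a simplification, not a description of any particular card: it assumes write mode 0, no data rotation, and an Enable Set/Reset value of 0, and the VgaModel structure and function names are invented for the illustration. The `function` field holds the two function-select bits already shifted down to the range 0 to 3.

```c
#include <stdint.h>

enum { PLANES = 4, MODEL_BYTES = 16 };   /* tiny stand-in for adapter memory */
enum { FN_REPLACE = 0, FN_AND = 1, FN_OR = 2, FN_XOR = 3 };

typedef struct {
    uint8_t plane[PLANES][MODEL_BYTES];
    uint8_t latch[PLANES];   /* one latch per plane */
    uint8_t read_map;        /* graphics register 4 */
    uint8_t function;        /* graphics register 3, bits 3-4, as 0..3 */
    uint8_t bit_mask;        /* graphics register 8 */
    uint8_t map_mask;        /* sequencer register 2 */
} VgaModel;

/* A CPU read loads all four latches and returns the selected plane's byte. */
uint8_t vga_read(VgaModel *v, unsigned offset)
{
    for (int p = 0; p < PLANES; p++)
        v->latch[p] = v->plane[p][offset];
    return v->latch[v->read_map & 3u];
}

/* A CPU write: combine CPU data with each enabled plane's latch. */
void vga_write(VgaModel *v, unsigned offset, uint8_t cpu)
{
    for (int p = 0; p < PLANES; p++) {
        uint8_t result;
        if (!(v->map_mask & (1u << p)))
            continue;                      /* plane not enabled for writing */
        switch (v->function & 3u) {
        case FN_AND: result = cpu & v->latch[p]; break;
        case FN_OR:  result = cpu | v->latch[p]; break;
        case FN_XOR: result = cpu ^ v->latch[p]; break;
        default:     result = cpu;               break;
        }
        /* Bit Mask: 1 bits take the new value, 0 bits keep the latched one. */
        v->plane[p][offset] = (uint8_t)((result & v->bit_mask) |
                                        (v->latch[p] & ~v->bit_mask));
    }
}
```

In this model an XOR update is a vga_read of the byte followed by a vga_write to the same offset, which mirrors the read-then-write pair of instructions used on the real adapter.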


Saving and Restoring Screen Images


Of all the VGA registers, only the Graphics Controller registers and the
Sequencer Address registers are used to save and restore a VGA graphic screen.
The Graphics Controller registers are accessed through port 03CEh. There are
nine of them, numbered from 0 to 8. To change the value of any one of them,
write its index number to port 03CEh, then write the new value to the data
port that follows it at 03CFh. A 16-bit OUT to port 03CEh sends the low byte
(AL) to 03CEh and the high byte (AH) to 03CFh, so often no more than three
instructions are needed to set up a register. The code fragment in Example 1
programs register 5, the mode register, with a 0 value.
Example 1: Programming register 5

 mov dx,3CEh    ; Graphics controller I/O port
 mov ax,0005h   ; Register 5, value of zero
 out dx,ax      ; Write the value out to register 5
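The word loaded into AX packs the register index into the low byte and the value into the high byte. As a rough sketch in C (the helper names index_value and out_word and the PortWrite record are hypothetical, and out_word only records what a 16-bit OUT would put on the bus rather than touching any hardware):

```c
#include <stdint.h>

enum { GC_INDEX_PORT = 0x3CE, SEQ_INDEX_PORT = 0x3C4 };

/* Build the AX value: AL = register index, AH = value for that register. */
uint16_t index_value(uint8_t index, uint8_t value)
{
    return (uint16_t)(((unsigned)value << 8) | index);
}

/* One byte-wide port access, recorded instead of performed. */
typedef struct {
    uint16_t port;
    uint8_t  data;
} PortWrite;

/* Model of a 16-bit OUT: low byte goes to `port`, high byte to `port + 1`. */
void out_word(uint16_t port, uint16_t ax, PortWrite w[2])
{
    w[0].port = port;
    w[0].data = (uint8_t)(ax & 0xFFu);
    w[1].port = (uint16_t)(port + 1u);
    w[1].data = (uint8_t)(ax >> 8);
}
```

With these helpers, index_value(5, 0) gives 0005h, the value used in Example 1, and index_value(8, 0xFF) gives FF08h, the Bit Mask setting used in Listing One.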


There are five Sequencer Address Registers at port 03C4h. They are accessed
just like the Graphics Controller registers. See Table 1 for a summary of the
registers and values needed to manage screen images.
Table 1: VGA registers used to save and restore screens

 Port  Index  Bits  Name/Purpose
 ---------------------------------------------------------------------
 03C4               Sequencer Address Registers
       2            Map Mask Register, determines which adapter planes
                    will be affected by subsequent operations
              0     1 = operate on plane 0
              1     1 = operate on plane 1
              2     1 = operate on plane 2
              3     1 = operate on plane 3
 03CE               Graphics Controller Registers
       1            Enable Set/Reset Register, chooses planes that take
                    their data from the Set/Reset Register (index 0)
                    instead of from the CPU in write mode 0
              0-3   1 in bit position fills the corresponding memory
                    plane from the Set/Reset Register; 0 passes CPU
                    data through to that plane
       3            Data Rotate/Function Select Register, rotates data
                    written by CPU, then selects function for
                    combining CPU data with planar data
              0-2   rotate count
              3-4   function select
                    00 - overwrite with CPU data
                    01 - AND data with latch contents
                    10 - OR data with latch contents
                    11 - XOR data with latch contents
       4            Read Map Select Register, determines which bit plane
                    will be read. Note that unlike the Enable
                    Set/Reset Register, this is a raw value that
                    designates a SINGLE plane in the range 0 to 3.
       5            Mode Register
              0-1   Write Mode (0, 1, 2, or 3)
              2     Used for diagnostics
              3     Read Mode (0 or 1)
              4-5   Control mapping of CPU data to planes
              6     Controls 256 color mode




The savedemo Program


The GRFSAVE1 unit (Listing Two, page 79) is derived from software built into
the package my company developed, with functions we use in place of both
Borland's and Microsoft's graphics libraries.
GRFSAVE1 uses the same manifest constants as the QuickPascal MSGraph unit,
but it could rely on the Turbo Pascal BGI constants just as easily. The
program savedemo (Listing Three, page 80) fills the screen with rectangles,
saves the contents of the screen, fills the screen with circles, then
restores the previous screen. Between visible screen operations, savedemo
pauses for up to five seconds so you can see what it has done.


How Read_VGA_Plane Works


The Read_VGA_Plane procedure is by far the simpler of the matched pair of MASM
video plane handling routines. It accepts a plane number and byte count from
the calling program and fills an externally defined array with the bits from
the adapter plane. The plane number, in the range 0 to 3, corresponds to the
number of the bit plane on the VGA adapter. The byte count is simply the total
number of bytes in the adapter plane. Since register programming is the same
for VGA and EGA graphics modes, varying the byte count allows Read_VGA_Plane
to work with any EGA or VGA color mode.
The VGA Graphics Controller I/O port, 03CEh, provides access to a set of
registers that determine how the VGA card operates. It is necessary to program
only two graphics controller registers to read from a given video plane. The
Mode Register (graphics controller register number 5) must be set to read mode
0 to read bytes from a given individual plane. Then, the Read Map Select
Register (register 4) must be set up with the number of the bit plane to be
read. A single out instruction will do in either case, feeding a pair of bytes
that select the register number and pass the value to be loaded into the
register. Once the byte count (CX register) and the array address (ES:DI) have
been set up, a repeated byte move (REP MOVSB) reads each byte from the bit
plane. If you encounter the VGA bus timing problems described earlier, use a
slower byte-at-a-time loop instead. Once the bit plane has been moved, the
Mode Register and the Read Map Select Register are reset to their default
values.
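The read sequence can be modeled in a few lines of C. This is a simulation under assumptions, not a port of the MASM routine: the adapter is represented by four small arrays, read mode 0 is taken for granted, and VgaModel, vga_read_byte, and read_vga_plane are names chosen for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

enum { PLANES = 4, MODEL_BYTES = 64 };   /* small stand-in for a bit plane */

typedef struct {
    uint8_t plane[PLANES][MODEL_BYTES];
    uint8_t read_map;                    /* graphics register 4 */
} VgaModel;

/* Read mode 0: a CPU read returns the byte from the selected plane only. */
uint8_t vga_read_byte(const VgaModel *v, size_t offset)
{
    return v->plane[v->read_map & 3u][offset];
}

/* Copy `count` bytes of one plane into the caller's buffer. */
void read_vga_plane(VgaModel *v, unsigned plane, size_t count, uint8_t *dest)
{
    v->read_map = (uint8_t)(plane & 3u);      /* select the plane to read */
    for (size_t i = 0; i < count; i++)        /* stands in for REP MOVSB */
        dest[i] = vga_read_byte(v, i);
    v->read_map = 0;                          /* back to the default plane */
}
```

The same PC addresses are read on every call; only the Read Map Select value decides which plane's bytes come back.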


How Write_VGA_Plane Works


Reading and writing VGA adapter planes are not symmetrical operations, and
Write_VGA_Plane does more work than its counterpart. It accepts the same
calling parameters as Read_VGA_Plane, but the array is on the sending side of
a data move, and the video adapter plane is the destination.
Four Graphics Controller I/O registers must be set up, along with the Map Mask
Register in the Sequencer register (port 03C4h) group. The Mode Register is
again set to 0, for write mode 0. The Enable Set/Reset Register (graphics
register 1) must contain a 0 mask value so that every bit plane takes its data
from the CPU rather than from the Set/Reset Register. The Data Rotate/Function
Select Register (graphics register 3) must be initialized to replace bits in
the selected bit plane with data from memory. Then the Bit
Mask Register (graphics register 8) is filled with bits to allow all bits in
memory to replace the corresponding bits in the bit plane. Finally, the Map
Mask Register (sequencer I/O register 2) must contain a bit that tells which
plane to write. Bit 0 indicates plane 0, bit 1 is for plane 1, and so on.
With the adapter registers properly initialized, Write_VGA_Plane initializes
CX, ES, and DI for yet another repeated move of data from the saved array to
adapter memory in the PC. Since all of the necessary VGA registers have been
programmed, the VGA adapter properly disposes of each byte of data within the
proper bit plane. Write_VGA_Plane then resets the registers it used back to
the normal default values.
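The role of the Map Mask Register in the write path can be sketched the same way. The C model below is illustrative only: it assumes write mode 0 with the replace function and a Bit Mask of FFh, so a CPU write simply stores the byte in every enabled plane, and the structure and function names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

enum { PLANES = 4, MODEL_BYTES = 64 };   /* small stand-in for a bit plane */

typedef struct {
    uint8_t plane[PLANES][MODEL_BYTES];
    uint8_t map_mask;                    /* sequencer register 2 */
} VgaModel;

/* A CPU write lands in every plane whose Map Mask bit is 1. */
void vga_write_byte(VgaModel *v, size_t offset, uint8_t data)
{
    for (int p = 0; p < PLANES; p++)
        if (v->map_mask & (1u << p))
            v->plane[p][offset] = data;
}

/* Copy `count` bytes from the caller's buffer into one plane. */
void write_vga_plane(VgaModel *v, unsigned plane, size_t count,
                     const uint8_t *src)
{
    v->map_mask = (uint8_t)(1u << (plane & 3u));   /* bit n = plane n */
    for (size_t i = 0; i < count; i++)             /* stands in for REP MOVSB */
        vga_write_byte(v, i, src[i]);
    v->map_mask = 0x0F;                            /* default: all planes */
}
```

Note the asymmetry with reading: the read side takes a plane number, while the write side takes a one-bit-per-plane mask, which is why the MASM routine shifts a 1 left by the plane number.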


A Monochrome Bonus


The GRFSAVE1 unit consists of six functions and procedures. HeapFunc simply
overrides the runtime error that results when the requested amount of heap
space is not available to the calling program after a GetMem call.
Init_Screen_Save is a simple bridge between the calling program and the unit.
It takes the plane size and number of planes for the current graphics mode
from the calling program. These calculations could also be done more tidily
within the GRFSAVE1 unit.
The Save_Screen function saves the entire screen on behalf of the calling
program and returns a count of the number of planes saved. If the plane count
is 0, this means that no planes were saved due to lack of heap space. The
calling program can then take appropriate action. Restore_Screen restores the
complete screen image only if it has been saved completely by Save_Screen.
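The all-or-nothing contract of Save_Screen can be expressed compactly. The following C sketch is a variation on the idea, not a translation of the unit: unlike the Pascal code it also releases any planes already allocated when a later allocation fails, it copies from an ordinary buffer rather than from the adapter, and save_planes is a hypothetical name. It assumes nplanes does not exceed MAX_PLANES.

```c
#include <stdlib.h>
#include <string.h>

enum { MAX_PLANES = 4 };

/* Copy nplanes consecutive planes of plane_bytes each out of `screen`.
   Returns nplanes on success, or 0 if any allocation failed, in which
   case nothing stays allocated. */
int save_planes(unsigned char *saved[MAX_PLANES], int nplanes,
                size_t plane_bytes, const unsigned char *screen)
{
    int p;
    for (p = 0; p < nplanes; p++) {
        saved[p] = malloc(plane_bytes);
        if (saved[p] == NULL)
            break;                         /* out of heap space */
        memcpy(saved[p], screen + (size_t)p * plane_bytes, plane_bytes);
    }
    if (p == nplanes)
        return nplanes;                    /* complete image saved */
    while (p-- > 0) {                      /* undo the partial save */
        free(saved[p]);
        saved[p] = NULL;
    }
    return 0;
}
```

A caller treats a return value of 0 the way savedemo would treat a zero plane count: the screen was not saved, so a later restore must be skipped.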
Saving and restoring a monochrome graphics screen image is much less intricate
than operating on a multiplanar color screen. This can be accomplished without
any register-level programming. GRFSAVE1 provides a bonus in that it also
handles popular monochrome graphics formats, such as CGA monochrome, EGA
monochrome, VGA monochrome, and Hercules graphics mode.
savedemo does little except generate a pair of VGA graphic images to be
handled with the screen save and restore operations. But, it lets you see
Save_Screen and Restore_Screen do the job. Presently, savedemo does a slow
fade from the image being replaced to the image being restored, because
Restore_Screen replaces one video adapter plane at a time. To restore with an
overall top-to-bottom effect, it would be necessary to redesign the logic of
Restore_Screen to operate on each full scan line in sequence, restoring each
of its four planes.
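The reordered loop might look like the following C sketch, which restores all four planes of one scan line before moving to the next. It is a model under assumptions rather than working adapter code: the planes are ordinary arrays, writes follow the Map Mask as in write mode 0 with the replace function, and the names VgaModel, write_bytes, and restore_by_scan_line are invented here.

```c
#include <stddef.h>
#include <stdint.h>

enum { PLANES = 4 };

typedef struct {
    uint8_t *plane[PLANES];   /* simulated adapter planes */
    uint8_t  map_mask;        /* sequencer register 2 */
} VgaModel;

/* Write n bytes at `offset` into every plane enabled by the Map Mask. */
static void write_bytes(VgaModel *v, size_t offset, const uint8_t *src,
                        size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (int p = 0; p < PLANES; p++)
            if (v->map_mask & (1u << p))
                v->plane[p][offset + i] = src[i];
}

/* Outer loop over scan lines, inner loop over planes. */
void restore_by_scan_line(VgaModel *v, uint8_t *const saved[PLANES],
                          size_t line_bytes, size_t lines)
{
    for (size_t y = 0; y < lines; y++) {
        size_t offset = y * line_bytes;
        for (int p = 0; p < PLANES; p++) {
            v->map_mask = (uint8_t)(1u << p);     /* one plane at a time */
            write_bytes(v, offset, saved[p] + offset, line_bytes);
        }
    }
    v->map_mask = 0x0F;                           /* default: all planes */
}
```

The trade-off is four Map Mask changes per scan line instead of four per screen, so the restore does more register programming in exchange for the top-to-bottom appearance.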


What Next?



You can readily modify the procedures here to allow calling by C programs.
Just adjust the stack frame handling to handle the C calling convention.
The Super VGA modes (800 x 600 pixels) built into most manufacturers' VGA
cards work much like VGA except that more bits of data must be saved. For 800
x 600 screens, each plane occupies 60,000 bytes of data, or 240,000 bytes
total. The 1024 x 768 resolution beyond Super VGA is implemented
inconsistently among various VGA cards, but a graphic screen requires 98,304
bytes to save each bit plane! With this much data, or even the mere 38,400
bytes per bit plane for plain vanilla VGA, one must consider how to conserve
memory. One possibility is to compress data prior to saving the bit plane and
reconstitute it before restoring it.
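Each plane holds one bit per pixel, so its size is the screen width divided by 8, times the height. The C sketch below works that out and pairs it with one possible compression scheme, a simple run-length encoding of (count, value) byte pairs. The choice of run-length encoding is an assumption made for illustration, as are the function names; the article does not prescribe a method.

```c
#include <stddef.h>

/* Bytes in one bit plane: one bit per pixel. */
unsigned long plane_bytes(unsigned width, unsigned height)
{
    return (unsigned long)(width / 8) * height;
}

/* Pack `n` bytes as (count, value) pairs with counts of 1 to 255.
   Returns the packed length, or 0 if `out` is too small. */
size_t rle_pack(const unsigned char *in, size_t n,
                unsigned char *out, size_t cap)
{
    size_t o = 0, i = 0;
    while (i < n) {
        unsigned char value = in[i];
        size_t run = 1;
        while (i + run < n && in[i + run] == value && run < 255)
            run++;
        if (o + 2 > cap)
            return 0;
        out[o++] = (unsigned char)run;
        out[o++] = value;
        i += run;
    }
    return o;
}

/* Expand (count, value) pairs. Returns the unpacked length, or 0 if
   `out` is too small. */
size_t rle_unpack(const unsigned char *in, size_t n,
                  unsigned char *out, size_t cap)
{
    size_t o = 0;
    for (size_t i = 0; i + 1 < n; i += 2) {
        size_t run = in[i];
        if (o + run > cap)
            return 0;
        for (size_t k = 0; k < run; k++)
            out[o++] = in[i + 1];
    }
    return o;
}
```

Run-length encoding pays off on planes with long stretches of identical bytes, such as a blank background, and can expand data that changes at every byte, so a real implementation would want a fallback to storing the plane uncompressed.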
Another tactic, which may or may not be combined with image compression, is to
save a screen image to expanded memory if it is available. Alternatively, you
can copy a VGA plane to a file and then read it back when restoring the image.
Or add the logic to save and restore only a part of the screen. This comes in
handy when doing drop-down or pop-up menus over a graphic image. Partial
screen saves are generally much quicker than regenerating that part of the
screen image from scratch.
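A partial save only needs the byte columns that cover the rectangle on each affected scan line. The C sketch below shows the bookkeeping for one plane, working on an ordinary buffer; on a real adapter the same loop would run once per plane with the read and write register setup described earlier. The function names and the inclusive pixel coordinates are assumptions made for the example.

```c
#include <stddef.h>

/* Copy the byte columns covering pixels x1..x2 on rows y1..y2 of one
   plane into `out`. Returns the number of bytes copied. */
size_t save_region(const unsigned char *plane, size_t line_bytes,
                   unsigned x1, unsigned y1, unsigned x2, unsigned y2,
                   unsigned char *out)
{
    size_t first = x1 / 8, last = x2 / 8;
    size_t width = last - first + 1;       /* bytes per row in the region */
    size_t n = 0;
    for (unsigned y = y1; y <= y2; y++) {
        const unsigned char *row = plane + (size_t)y * line_bytes + first;
        for (size_t i = 0; i < width; i++)
            out[n++] = row[i];
    }
    return n;
}

/* Put a region saved by save_region back into the plane. */
void restore_region(unsigned char *plane, size_t line_bytes,
                    unsigned x1, unsigned y1, unsigned x2, unsigned y2,
                    const unsigned char *in)
{
    size_t first = x1 / 8, last = x2 / 8;
    size_t width = last - first + 1;
    size_t n = 0;
    for (unsigned y = y1; y <= y2; y++) {
        unsigned char *row = plane + (size_t)y * line_bytes + first;
        for (size_t i = 0; i < width; i++)
            row[i] = in[n++];
    }
}
```

Because the region is rounded out to whole bytes, up to seven pixels on either side of the rectangle are saved and restored along with it, which is harmless as long as nothing else changes those pixels in between.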
Microsoft's QuickPascal provides the _GetImage and _PutImage procedures to
save and restore a screen image. Borland's BGI offers the similarly named
GetImage and PutImage. Even if you end up using those library functions
instead of the ones in this article, you can now see that there is nothing
magical about programming VGA registers.


Bibliography


Kliewer, Bradley Dyck. EGA/VGA, A Programmer's Reference Guide. New York:
McGraw-Hill, 1988.
Personal System/2 and Personal Computer BIOS Interface Technical Reference
(IBM Publication 84X1514). International Business Machines Corporation, 1987
(since superseded by another publication).
Wilton, Richard. Programmer's Guide to PC and PS/2 Video Systems. Redmond,
Wash.: Microsoft Press, 1987.

_SAVING AND RESTORING VGA SCREENS_
by Ben Myers


[LISTING ONE]

 PAGE 80,132
 TITLE EGA/VGA screen save/restore (Turbo Pascal 4.0+ or Quick Pascal 1.0)
; GRFSAVE.ASM -
; (C)Copyright 1989-1990 Spirit of Performance, Inc.
; All rights reserved. Unauthorized use or copying prohibited by law.

CODE SEGMENT WORD PUBLIC
 ASSUME CS:CODE
 PUBLIC Write_VGA_Plane ; Write video plane from caller's memory.
 PUBLIC Read_VGA_Plane ; Read video plane, move it to caller's memory.

; procedure Write_VGA_Plane (Plane, Count : word; var Plane_Array );
; procedure Read_VGA_Plane (Plane, Count : word; var Plane_Array );
; Parameters: Plane - Graphics plane number to move ( range 0-3 )
; Count - Byte count to move
; Plane_Array - Array for video plane values

Plane_Array EQU DWORD PTR [bp+06h]
Count EQU WORD PTR [bp+0Ah]
Plane EQU WORD PTR [bp+0Ch]

Write_VGA_Plane PROC FAR
 push bp ; Save Turbo's BP
 mov bp,sp ; Set up stack frame
 mov bx,ds ; Save Turbo's DS
 mov di,0A000h ; EGA/VGA buffer segment:offset, A000:0000
 mov es,di
 xor di,di ; ES:DI is start of video buffer
 mov dx,3CEh ; DX = Graphics Controller I/O Port
 mov ax,0005h ; AH = 00h (Read mode 0, write mode 0)
 ; AL = Mode register number (5)
 out dx,ax ; load Mode register
 mov ax,0001h ; AH = 00h (mask for Enable Set/Reset),
 ; also the default for modes 12h and 10h
 ; AL = Enable Set/Reset register number (1)
 out dx,ax ; load Enable Set/Reset register
 mov ax,0003h ; AH = Replace bit planes with memory,
 ; no bit rotation, also the default
 ; AL = Data Rotate/Function Select register
 ; number (3)
 out dx,ax ; load Data Rotate/Function Select register
 mov ax,0FF08h ; AH = bit mask
 ; AL = Bit Mask register number (8)
 out dx,ax ; Set bit mask register for all bits
 mov dx,3C4h ; DX = Sequencer I/O Port
 mov cx,ss:Plane ; Get Plane number from caller
 and cl,03h ; Force it to range 0 to 3
 mov ah,1 ; Set up AH with bit number of plane to restore
 shl ah,cl ; where bit 0 = plane 0, etc.
 mov al,02h ; AL = Map Mask Register number (2)
 out dx,ax ; load Map Mask register with plane number
 mov cx,ss:Count ; byte count to move (size of plane for
 ; EGA/VGA card in current mode )
 lds si,ss:Plane_Array ; Addr of array to restore plane values from
 cld ; Make sure the string move counts upward
 rep movsb ; Move the data
; Or replace the above instruction by the slower but equivalent loop construct
; below in the event that your VGA card doesn't respond properly.
@@:
; lodsb ; Get a byte from save area plane
; stosb ; Form a byte for current plane
; loop @B ; Do next byte, until all have been done.

; Now reset the VGA registers used back to the defaults expected.
 mov dx,3CEh ; DX = Graphics Controller I/O Port
 mov ax,0001 ; AH = 0 (default Enable Set/Reset Value)
 ; AL = Enable Set/Reset register number (1)
 out dx,ax ; restore default Enable Set/Reset register
 mov dx,3C4h ; DX = Sequencer I/O port
 ; AH = all planes enabled
 mov ax,0F02h ; AL = Map Mask Register number (2)
 out dx,ax ; restore Map Mask register to do all planes.
 mov ds,bx ; Restore Turbo's DS
 pop bp ; Restore Turbo's BP
 ret 8 ; Remove params & return to call
Write_VGA_Plane ENDP

; procedure Read_VGA_Plane (Plane, Count : word; var Plane_Array );

Read_VGA_Plane PROC FAR
 push bp ; Save Turbo's BP
 mov bp,sp ; Set up stack frame
 mov bx,ds ; Save Turbo's DS
 mov si,0A000h ; EGA/VGA buffer segment:offset, A000:0000
 mov ds,si
 xor si,si ; DS:SI is start of video buffer
 mov dx,3CEh ; DX = Graphics Controller I/O Port
 mov ax,0005h ; AH = 00h (Read mode 0, write mode 0)
 ; AL = 5 (Mode register number)
 out dx,ax ; load Mode register
;; int 3 ; Enable breakpoint for debugging.
 mov ax,ss:Plane ; Get Plane number
 mov ah,al ; AH = color plane to get
 mov al,04h ; AL = Read Map Select Register number (4)
 out dx,ax ; load Read Map Select register
 mov cx,ss:Count ; byte count to move (size of plane for
 ; EGA/VGA card in current mode )
 les di,ss:Plane_Array ; Address of array to store plane values.
 cld ; Make sure the string move counts upward
 rep movsb ; Move the data from video buffer to save area
; Or replace the above instruction by the slower but equivalent loop construct
; below in the event that your VGA card doesn't respond properly.
@@:
; lodsb ; Get a byte from plane of video buffer
; stosb ; Save it.
; loop @B ; Do next byte, until all have been done.

; Now reset the VGA registers used back to the defaults expected.
 mov ax,0005h ; AH = 00h (read mode 0, write mode 0), the
 ; default for graphics modes 12h and 10h
 ; AL = Mode register number (5)
 out dx,ax ; restore default mode register
 mov ax,0004h ; AL = Read Map Select Register number (4)
 out dx,ax ; load Read Map Select register default value
 mov ds,bx ; Restore Turbo's DS
 pop bp ; Restore Turbo's BP
 ret 8 ; Remove params & return to call
Read_VGA_Plane ENDP
CODE ENDS
 END





[LISTING TWO]

{
 GRFSAVE1.PAS - Unit to save and restore graphics screens
 Version 1.40 (03-19-90) --depends on MSgraph Unit for
 manifest constants identifying graphics modes.
 procedure Write_VGA_Plane (Plane, Count : word; var Plane_Array );
 procedure Read_VGA_Plane (Plane, Count : word; var Plane_Array );
 Parameters: Plane - Graphics plane number to move ( range 0-3 )
 Count - Byte count to move
 Plane_Array - Array for video plane values
 (C)Copyright 1989-1990 Spirit of Performance, Inc.
 All rights reserved. Unauthorized use or copying prohibited by law.
}

{$R-,S-,I-,D+,F+,V-,B-,N-,L+ }

UNIT GRFSAVE1;

INTERFACE
USES MsGraph;

const
 Max_Planes = 4;

procedure Init_Screen_Save ( Plane_Size : longint; Number_Of_Planes : word );
Function HeapFunc ( Size: word ) : integer;
function Save_Screen ( Mode : integer ) : integer;
procedure Restore_Screen;
procedure Write_VGA_Plane (Plane, Count : word; var Plane_Array );
procedure Read_VGA_Plane (Plane, Count : word; var Plane_Array );

IMPLEMENTATION

var
{ Video plane size and number of planes in bytes }

{ ****** When video plane size gets above 64K, need to change below }
 Video_Plane_Size : word; { in bytes }
 Number_GPlanes : word;
 Plane_Counter : word;
 Plane_Ptrs : array [1..Max_Planes] of pointer;
 Planes_Saved : integer;
 Saved_Graphics_Mode : integer;
 Monochrome_Buffer : ARRAY[ 0..$7FFF ] OF byte ABSOLUTE $B800:$0000;
 VGA_Buffer : ARRAY[ 0..$7FFF ] OF byte ABSOLUTE $A000:$0000;

procedure Init_Screen_Save;
 { Initialize unit with parameters }
 begin
 Video_Plane_Size := Plane_Size;
 Number_GPlanes := Number_Of_Planes;
 end;

Function HeapFunc;
{ Simple heap error function overrides run time error to avoid program
 abort. }
begin
 HeapFunc := 1; { 1 = have GetMem return nil instead of aborting }
end;

{$L d:\tpsource\GRFSAVE.obj }
procedure Write_VGA_Plane; external;
procedure Read_VGA_Plane; external;

function Save_Screen;
{ Saves graphics planes for current graphics mode.
 Returns number of planes saved, in case caller cares.
}

 begin
 Saved_Graphics_Mode := Mode;
 Planes_Saved := 0;
 HeapError := @HeapFunc;
 case Saved_Graphics_Mode of
 _HRes16Color, { 640 x 200, 16 color }
 _EResColor , { 640 x 350, 4 or 16 color }
 _VRes16Color: { 640 x 480, 16 color }
 for Plane_Counter := 1 to Number_GPlanes do
 begin
 { Get memory to save a plane }
 GetMem( Plane_Ptrs[Plane_Counter], Video_Plane_Size);
 if Plane_Ptrs[Plane_Counter] <> nil then
 { Move the plane if GetMem succeeded }
 begin
 Read_VGA_Plane (Plane_Counter-1, Video_Plane_Size,
 Plane_Ptrs[Plane_Counter]^ );
 inc ( Planes_Saved );
 end;
 end;
 _HResBW , { 640 x 200, BW }
 _HercMono: { 720 x 348, BW for HGC }
 begin
 GetMem( Plane_Ptrs[1], Video_Plane_Size);
 if Plane_Ptrs[1] <> nil then
 begin
 Move ( Monochrome_Buffer, Plane_Ptrs[1]^, Video_Plane_Size );
 Planes_Saved := 1;
 end;
 end;
 _EResNoColor, { 640 x 350, BW }
 _VRes2Color : { 640 x 480, BW }
 begin
 GetMem( Plane_Ptrs[1], Video_Plane_Size);
 if Plane_Ptrs[1] <> nil then
 begin
 Move ( VGA_Buffer, Plane_Ptrs[1]^, Video_Plane_Size );
 Planes_Saved := 1;
 end;
 end;
 end; {case Saved_Graphics_Mode}
 if Planes_Saved <> Number_GPlanes then
 { Unsuccessful, so reset count of planes saved }
 Planes_Saved := 0;
 Save_Screen := Planes_Saved;
 end;

procedure Restore_Screen;
 begin
 if Planes_Saved <> 0 then
 case Saved_Graphics_Mode of
 _HRes16Color, { 640 x 200, 16 color }
 _EResColor , { 640 x 350, 4 or 16 color }
 _VRes16Color: { 640 x 480, 16 color }
 for Plane_Counter := 1 to Number_GPlanes do
 if Plane_Ptrs[Plane_Counter] <> nil then
 Write_VGA_Plane (Plane_Counter-1, Video_Plane_Size,
 Plane_Ptrs[Plane_Counter]^ );
 _HResBW , { 640 x 200, BW }
 _HercMono: { 720 x 348, BW for HGC }
 Move ( Plane_Ptrs[1]^, Monochrome_Buffer, Video_Plane_Size );
 _EResNoColor, { 640 x 350, BW }
 _VRes2Color : { 640 x 480, BW }
 Move ( Plane_Ptrs[1]^, VGA_Buffer, Video_Plane_Size );
 end; {case Saved_Graphics_Mode}
 end;
END.






[LISTING THREE]

PROGRAM savedemo;
{ savedemo.PAS - Demonstrate EGA/VGA graphics screen save/restore
 Version 1.00, 19 Mar 1990
 Uses the Microsoft Graphics Interface and selected MASM functions.
 (c)Copyright 1989-1990 Spirit of Performance, Inc.
}

USES
 DOS, MSGraph, Crt, GrfSave1;
type

 TimeRec = record
 Hour : word;
 Minute : word;
 Second : word;
 FracSec : word;
 Floating_Time : real;
end;

var
 Start_Time : TimeRec;
 Stop_Time : TimeRec;

Function Elapsed_Time ( Stop_Time, Start_Time : TimeRec ) : real;

const
 R3600 : real = 3600.0;
 R60 : real = 60.0;
 R100 : real = 100.0;

begin { Elapsed_Time }
 with Start_Time do
 begin
 Floating_Time := (Hour * R3600) + (Minute * R60) + Second
 + (FracSec / R100);
 end;
 if Stop_Time.Hour < Start_Time.Hour then inc(Stop_Time.Hour, 24);
 with Stop_Time do
 begin
 Floating_Time := (Hour * R3600) + (Minute * R60) + Second
 + (FracSec / R100);
 end;
 Elapsed_Time := Stop_Time.Floating_Time - Start_Time.Floating_Time;
end; { Elapsed_Time }

TYPE
 ViewPortType = record
 x1, y1, x2, y2 : word;
 end;
VAR
 errorcode : Integer;
 x,y : Integer;
 maxx, maxy : Integer; { Maximum addressable pixels }
 c : Char;
 vc : _VideoConfig;
 CurrentView : ViewPortType;
 lCount : longint;
 OldExitProc : Pointer; { Saves exit procedure address }
 Plane_Count : integer;
var
 Video_Plane_Size : longint;
 Number_GPlanes : word;
CONST
 Version_ID : string = ( 'Version 1.00, 19 Mar 1990' );
 Patterns : Array [0..11] of _FillMask =
 (
 (0,0,0,0,0,0,0,0),
 ($FF,$FF,$FF,$FF,$FF,$FF,$FF,$FF),
 ($FF, 0, $FF, 0, $FF, 0, $FF, 0),
 ($44, $88, $11, $22, $44, $88, $11, $22),

 ($77, $EE, $DD, $BB, $77, $EE, $DD, $BB),
 ($77, $BB, $DD, $EE, $77, $BB, $DD, $EE),
 ($88, $44, $22, $11, $88, $44, $22, $11),
 ($11, $AA, $44, $AA, $11, $AA, $44, $AA),
 ($55, $AA, $55, $AA, $55, $AA, $55, $AA),
 ($F0, $0F, $F0, $0F, $F0, $0F, $F0, $0F),
 (1, 0, 0, 0, 1, 0, 0, 0),
 (5, 0, 5, 0, 5, 0, 5, 0));

 CleanUp_Reqd : Boolean = TRUE;

{$F+}
procedure MyExitProc;
{ Procedure to clean up on early program termination }
begin
 ExitProc := OldExitProc; { Restore exit procedure address }
 if CleanUp_Reqd then
 begin { Restore original video mode. }
 errorcode := _SetVideoMode( _DefaultMode );
 end;
end; { MyExitProc }
{$F-}

Procedure GetViewSettings (var ReqView : ViewPortType);
begin
 ReqView := CurrentView;
end;

Procedure SetView (xa, ya, xb, yb : word);
begin
 _SetViewPort(xa, ya, xb, yb);
 _SetClipRgn (xa, ya, xb, yb);
 with CurrentView do
 begin
 x1 := xa; y1 := ya;
 x2 := xb; y2 := yb;
 end;
end;

procedure FullPort;
{ Set the view port to the entire screen }
begin
 SetView(0, 0, maxx, maxy);
end; { FullPort }

procedure MainWindow(Header : string);

{ Make a default window and view port for demos }
begin
 _SetTextColor(vc.numcolors-1); { Reset the colors }
 _SetBkColor(0);
 _SetColor(vc.numcolors-1);
 _ClearScreen(_GClearScreen); { Clear the screen }
 FullPort; { Full screen view port }
 _SetTextPosition( 1, (vc.NumTextCols - length(Header)) div 2);
 _OutText(Header); { Draw the header text }
 { Move the edges in to leave room for text at top and bottom }
 SetView(0, vc.NumYPixels div vc.NumTextRows + 1 , maxx,
 maxy-(vc.NumYPixels div vc.NumTextRows)-1);

end; { MainWindow }

procedure StatusLine(Msg : string);
{ Display a status line at the bottom of the screen }
begin
 FullPort;
 _SetLineStyle($FFFF);
 _SetFillMask(Patterns[0]);
 _SetColor(0); { Set the drawing color to black }
 _Rectangle(_GFillInterior,
 0, vc.NumYPixels-(vc.NumYPixels div vc.NumTextRows+1),
 maxx, maxy); { Erase old status line }
 _SetTextPosition( vc.NumTextRows,
 (vc.NumTextCols - length(Msg)) div 2);
 _SetTextColor(vc.numcolors-1); { Set the color for header }
 _SetBkColor(0);
 _OutText(Msg); { Write the status message }
 { Go back to the main window }
 SetView(0, vc.NumYPixels div vc.NumTextRows +1 , vc.NumXPixels,
 vc.NumYPixels-(vc.NumYPixels div vc.NumTextRows+1));
 _SetTextPosition( 1, 1 );
end; { StatusLine }

procedure WaitToGo; { Wait for user to abort program or continue }
const
 Esc = #27;
var
 Ch : char;
begin
 StatusLine('Esc aborts or press a key...');
 with Start_Time do GetTime ( Hour, Minute, Second, FracSec );
 repeat
 with Stop_Time do GetTime ( Hour, Minute, Second, FracSec );
 { Wait for keypress no more than 5 seconds, then go on without it }
 until KeyPressed or (Elapsed_Time ( Stop_Time, Start_Time ) > 5.0);
 if Keypressed then
 begin
 Ch := ReadKey;
 if Ch = #0 then Ch := readkey; { trap function keys }
 if Ch = Esc then
 Halt(0); { terminate program }
 end;
end; { WaitToGo }

procedure DrawRectangles;
{ Draw rectangles on the screen }
var
 MaxSize : word;
 XCenter, YCenter : word;
 ViewInfo : ViewPortType;
 YMax, XMax : word;
 jCount : word;

begin { DrawRectangles }
 MainWindow('Draw Rectangles');
 StatusLine('');
 GetViewSettings(ViewInfo);
 with ViewInfo do
 begin

 XMax := (x2-x1-1);
 YMax := (y2-y1-1);
 end;
 MaxSize := XMax shr 1;
 XCenter := XMax shr 1;
 YCenter := YMax shr 1;
 for lCount := 1 to MaxSize do
 begin
 _SetColor(lCount mod vc.numcolors);
 _Rectangle(_GBorder, XCenter+lCount, YCenter+lCount,
 XCenter-LCount, YCenter-LCount);
 end;
 WaitToGo;
end; { DrawRectangles }

procedure DrawCircles;
{ Draw concentric circles on the screen }
var
 MaxRadius : word;
 XCenter, YCenter : word;
 ViewInfo : ViewPortType;
 YMax, XMax : word;

begin { DrawCircles }
 MainWindow('Draw Circles');
 StatusLine('');
 GetViewSettings(ViewInfo);
 with ViewInfo do
 begin
 XMax := (x2-x1-1);
 YMax := (y2-y1-1);
 end;
 MaxRadius := XMax shr 1;
 XCenter := XMax shr 1;
 YCenter := YMax shr 1;
 for lCount := 1 to MaxRadius do
 begin
 _SetColor(lCount mod vc.numcolors);
 _Ellipse(_GBorder, XCenter + lCount, YCenter + lCount,
 XCenter - lCount, YCenter - lCount);
 end;
 WaitToGo;
end; { DrawCircles }

BEGIN
 OldExitProc := ExitProc; { save previous exit proc }
 ExitProc := @MyExitProc; { insert our exit proc in chain }
 _GetVideoConfig( vc );
 DirectVideo := FALSE; { No direct writes allowed in graphics modes }

 { Set graphics mode with highest resolution. }
 if (_SetVideoMode( _MaxResMode) = 0) then
 Halt( 1 );
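 _GetVideoConfig( vc );
 maxx := vc.numxpixels - 1; { Maximum addressable pixels, used by FullPort }
 maxy := vc.numypixels - 1;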
 if vc.mode <> _HercMono then
 begin
 Video_Plane_Size := vc.numxpixels div 8;
 Video_Plane_Size := Video_Plane_Size * vc.numypixels;
 end

 else
 Video_Plane_Size := 32768;
 if vc.numcolors = 2 then
 Number_GPlanes := 1 { B&W modes have 1 plane only }
 else
 Number_GPlanes := 4; { Assume that color modes have 4 planes }
 Init_Screen_Save (Video_Plane_Size, Number_GPlanes);

 _SetColor( vc.numcolors-1 );
 DrawRectangles;
 Plane_Count := Save_Screen (vc.mode); { Save the first screen }
 DrawCircles;
 Restore_Screen; { Restore the rectangles }
 WaitToGo; { Wait before terminating }

 { Restore original video mode. }
 errorcode := _SetVideoMode( _DefaultMode );
 CleanUp_Reqd := FALSE;
END.



July, 1991
PORTING UNIX TO THE 386: A STRIPPED-DOWN KERNEL


Onto the initial utilities


 This article contains the following executables: 386BSD.791


William Frederick Jolitz and Lynne Greer Jolitz


Bill was the principal developer of 2.8 and 2.9BSD and was the chief architect
of National Semiconductor's GENIX project, the first virtual memory
microprocessor-based UNIX system. Prior to establishing TeleMuse, a market
research firm, Lynne was vice president of marketing at Symmetric Computer
Systems. They conduct seminars on BSD, ISDN, and TCP/IP. Send e-mail questions
or comments to lynne@berkeley.edu. Copyright (c) 1991 TeleMuse.


Much has been made of the preparations we have required before we could embark
on our present project. While that's all well and good, at some point we
really would like to get on with our adventure and start the main assault --
the kernel itself. Our roundabout development of tools and equipment allowed
us to scope out the weak points in the 386BSD specification, with the added
bonus of enhancing our experience and confidence. By following a disciplined
set of guidelines and procedures, we minimized one of the most demoralizing
activities of all -- trying to build our system without any idea as to where
the bugs (or failure modes) lie, especially those enormously irritating
compiler bugs induced by driver implementation bugs.
Now we arrive at the point at which we would like to create a "stripped-down"
kernel. At this stage of our work, our primary concern is with the
machine-dependent portions of the kernel that install it into the position to
execute processes (via the bootstrap procedure) and prepare the system for
initialization of the minimum machine-independent portions of the kernel
(processes, files, and pertinent tables).
Our 386BSD kernel is a kind of "virtual machine" (not to be confused with the
"virtual" in "virtual memory"), where functions underlie other functions
transparently. When the system is initialized, it can use portions that
require little direction to initialize even larger portions. Thus, this
virtual machine assembles itself tool by tool, much like a set of Russian
dolls. The machine-dependent kernel initialization is the innermost of the
dolls -- the kernel of the kernel around which all is built. The next outer
layer will then be built by the kernel's main( ) procedure (to be discussed
later), which in turn initializes higher-level portions of the kernel.
While our basic approach toward "wiring" the 386 for operation with the
machine-independent BSD kernel is similar to that of our standalone system
(see DDJ March 1991), the details are now very important. In fact, we've
changed so much since our discussion of the 386BSD specification (DDJ January
1991) that even the specification needs to be revised in several key areas
such as the virtual memory system and per-process data structure. In addition,
the most recent versions of 386BSD (less than a month old) exploit a unique
feature of the 386 architecture in the form of "recursive" paging, which
not only leverages resources to the hilt, but also reduces complexity
enormously. (See text box "Brief Notes: 386BSD Recursive Paging.")


The Basic Structure of the UNIX Kernel


The structure of the BSD UNIX system is akin to that of an onion. Consisting
of layer upon layer, the outside layers of the BSD onion are those processes
visible to the computer "user," while the inner layers hide processes the user
needn't see, such as those relating to the hardware. (This can also be called
the "Almond Roca" kernel, if you prefer sweets.)
The operating system kernel lies in the innermost layer. Its primary
responsibility is to provide the appropriate level of utility services upon
which other programs and facilities are built. The kernel itself consists of
an inner "machine-dependent" portion and an outer "machine-independent"
portion. The center of the onion could be considered the raw hardware itself.
In UNIX parlance, the kernel is typically divided into the "high kernel" and
the "low kernel." The high kernel is concerned with UNIX abstractions, such as
files, processes, and other related objects. The low kernel, in contrast, is
concerned with the functionality of the kernel -- how to implement the
abstractions with machine-dependent mechanisms for operation.
To some degree, all operating systems are designed with this basic "onion" model
in mind. However, the designers of competing systems spend a great deal of
time determining what items belong in a given layer. Unlike the ISO OSI layer
model, which covers computer systems networking, no agreement yet exists on
the ideal model for operating systems design.
Many operating systems prior to UNIX did not precisely delineate the boundary
between the operating system and the user programs, which resulted in quite a
wide variation in layering. Some operating systems (such as VMS, RSX, and OS/370) have thousands
of different entry points and functions -- many chosen on an ad hoc basis. For
example, some user programs would call directly into the operating system at a
point known to be past a register-save sequence, because the writer of the
program would assume that it didn't cause a problem and might even speed up
the program slightly. Even nonuniformity within operating systems can occur,
such as when a devotee of one particular system adds a facility which relies
on a system call differing radically from the rest of the system. In these
cases, the layering is blurred between the user application program and the
given operating system -- not surprising considering the various ways that the
same effect can be achieved.
UNIX, with its fundamentalist "return to the basics" approach, was a
philosophical statement as well as a design. Unlike the other systems mentioned, UNIX has a very
small number of system calls (typically fewer than a couple hundred), and, as
such, must leverage them for maximal operation. This "simplicity" of design
can be found throughout its structure. In fact, a suspect subsystem within
UNIX itself is often branded as "unlike UNIX" due to nonmodular or clumsy
design. Ironically, this has been the case with software that has been part of
UNIX for years and widely used.
Part of the reason UNIX adherents (and its designers) appear to be "zealots"
of the minimalist view is that the pressure to add "just one more" system call
is quite great, and this one area alone has become a point of highly charged
and subjective debate as to where to draw the line. This is one reason why a
single UNIX "standard" has yet to emerge -- the lack of consensus on this and
other crucial issues.


Incremental Strategy


Despite the "purity of essence" debates, UNIX has grown like a weed. (Any
undesired plant is a weed, and one could say the same about UNIX, at least
initially -- just ask DEC or IBM or Apple.) It has grown because the hunger
for applications that simplify or enhance work, and for the functional
infrastructure needed to support them, is insatiable.
Doubtless, UNIX will continue to grow in size and popularity (although some
of us would prefer it grow in a graceful and planned manner). However, there
are times when the "essence" of UNIX must be examined and understood, such as
when a native port is conducted. By restricting UNIX functions via conditional
compilation, we can work on making the core of the kernel functional. Once the
core is functional, remaining portions can be added incrementally. This
incremental methodology allows us to backtrack when errors or malfunctions
occur. In addition, we always have recourse to the previous version if
necessary.


Composing the Basic Minimal UNIX Kernel


What constitutes a minimal UNIX kernel? This varies according to the kind of
port desired and resources available. For example, one alternative plan we
almost selected involved using the Network FileSystem (NFS) instead of working
with the hard disk. If we had chosen that approach, code for implementing an
NFS client, along with the networking code, would have been a mandatory
component of our minimal port, while the disk driver and related support would
have been relegated to a less-important role.
Since we are concentrating on the machine-dependent portions of the minimal
kernel, we must pare down considerably what is required. For 386BSD, we opted
for a traditional port (see DDJ March 1991) that relied on a hard disk, a
console interface (via the keyboard and display) and the process reschedule
clock (via the interval timer). All network protocol and related system
services (interprocess communications) were removed through conditional
compilation. Any extended functionality in the main body of the kernel meant
to accelerate operations (for example, macros, hash lookups, short circuit
evaluation) was also avoided -- after all, it makes no sense to improve the
speed of something that does not even run to begin with. Also, algorithm
improvement is not always a machine-independent phenomenon.
The point in generating the "tiniest" kernel imaginable is to simplify the
port. At this stage, we never expected to run something this small as a
complete "production" system. As we incrementally added subsystems to our
minimal kernel, we got a clearer understanding of the impact of each on the
kernel. Even within this small system, redundant code and interfaces occurred.
As such, a small amount of patience in this area always pays a handsome
dividend later.
Our minimal kernel was created by adding conditional compilation statements
(#ifdefs) to the BSD kernel source code to defeat the subsystems for
networking, TCP/IP protocols, routing, NFS, interprocess communications (other
than pipe), user process debugging, and the related services on which they
depend. In addition, since we only needed drivers for disk, display, keyboard,
and process scheduling clock, we could scale down the drivers and omit
autoconfiguration code. Once this was operational, fleshing out the
drivers and adding back support for running debuggers caused the kernel to grow
considerably.
With all the concern these days over "bloated" kernels, with the consequent
support, extensibility, and other problems, it is instructive to examine a
sample listing of what can be considered a "stripped-down" 150-Kbyte kernel;
see Figure 1(a), page 85. (By the way, by abiding by the rules outlined
earlier and by using only the drivers necessary for basic functionality, our
initial 386BSD kernel was less than 100 Kbytes in size -- and was both
debuggable and extensible.) As an example of how this differs from a
production system, Figure 1(b), page 85, contains the same breakdown for a
more recent system (using a derived MACH virtual memory system, NFS, TCP/IP,
multiple disk and Ethernet controllers, and other features added).


How Can You Be in Two Places at Once...?


By design (DDJ January 1991), we want our operating system kernel to run at
the top of the virtual address space (currently, location 0xfe000000) as in
Figure 2. However, our PC memory is mapped into the lower portion of the
address space before memory management is turned on. Thus, our bootstrap
program must load the kernel program into low physical memory to run, even
though the kernel has all of its absolute addresses directed to the top, where
no physical memory is present!
For the short run, the kernel executes code which manually compensates for
this problem, especially in the case of the data operands. Code operands are
stored as relative offsets that work regardless of location (so-called "PIC"
or Position Independent Code). PIC coding can be quite cumbersome (see Listing
One, page 85, from "start" to "begin"). Fortunately, the actual amount of code
required to operate in this fashion is small -- just enough to enable our
memory relocation hardware (the MMU).
As we recall (see DDJ January 1991), the 386 MMU utilizes a "two-level" paging
scheme in order to determine the physical page frame number -- the actual
address of physical memory underneath the virtual address. This mechanism
works by splitting the incoming virtual address into three parts: 10 bits of
page table directory index, 10 bits of page table index, and 12 bits of offset
within a page. The page table directory is a single page of physical memory
that facilitates allocation of page table space by breaking it up into 4-Mbyte
chunks of linear address space per each of its 1024 PDEs (Page Directory
Entries), which determine the location of underlying page tables in physical
memory. Each PDE-addressed page of a page table contains 1024 PTEs (Page Table
Entries). A PTE is similar in form and function to a PDE. The major difference
between a PDE and a PTE is one of hierarchy: A PDE selects the physical page
frame of PTEs while a PTE selects the physical page frame for the desired
reference. Once the page frame is obtained, the 12 least-significant offset
bits are appended and the final address is determined. This two-level mechanism is quite elaborate,
but it elegantly allows for the sparse allocation of address space, so that
the whole address space or even all of the address space mapping information
need not be present. In contrast, a one-level mapping scheme would require 4
Mbytes of real memory per task for mapping alone -- too much even for many
modern systems.
To run our kernel program with the MMU enabled, we must build page tables that
describe the physical location of memory storing the program, as well as the
mode of access allowed to each "page" (otherwise known as the "allocation
granularity" of the MMU) as in Figure 3. In addition, the MMU must have a page
directory table describing where it can find all of the possible 1024 page
table pages which allow it to access any part of its 4-gigabyte address space.
In a way, the 386 MMU acts almost as a "coprocessor" to the 386 CPU,
interpreting two data structures (page directories and page tables) on behalf
of the CPU and translating virtual addresses into physical ones. (The MIPS
RISC MMU is actually referred to as a coprocessor.)

While our code dutifully builds our page tables and page directory table to
make the above mapping work (see Listing One, near comment "build page
tables"), we are still left with a dilemma: How do we turn on mapping and run
at a "high" virtual address when we are still running at a "low" address? In
other words, how do we make the CPU switch from one address to another? Well,
the answer could depend on an understanding of many hardware-related issues
(such as the size of instruction prefetch queue, instruction pipelining,
address translation overlap, multiprocessor arbitration, and so forth), while
avoiding irregularities or non-standard approaches. For example, some systems
programmers have gotten away with murder over the years by assuming in the
software that the processor already has the instruction after the MMU is
enabled (not always true, mind you), instead of verifying it as they should in
all cases. This situation is analogous to people who dive over three lanes of
traffic at the last second just to make a freeway exit. Most of the time it
works, but occasionally it doesn't. In this case, Superposition of Matter
(unlike radio waves) doesn't hold (although Total Conservation of Mass does
hold). A disaster, possibly a crash, occurs -- so it goes with systems as
well.
Not that this area will get any easier, either, what with the even more
esoteric versions of the "N"86 on the drawing board. (By the way, has anyone
trademarked "N"86 yet?) One must anticipate where the technology will be
taken. For example, one might need to assume that the instruction queue always
consists of at least one instruction. As technology shifts, features which are
relied upon by even the most careful of programmers can be abandoned for
better ones (for example, a fully pipelined instruction execution with
pipelined MMUs that update address space state for branch prediction use).
In this case, the appropriate path around all of these hazards is simply to
map the bottom of address space to the same location -- or "double-map" the
same program. This way, it will work regardless of what the hardware designers
do. We could also have replicated the page tables to map the bottom of address
space where the kernel program begins, but we would end up duplicating the
same page tables used at the top of the address space, and that would be very
wasteful. Instead, we just double-map the bottom page directory entry (the one
that maps the bottom 4 Mbytes of address space) to the page directory entry
that maps the kernel. Once accomplished, the MMU can now be enabled.
Now that we are "running virtual," we need to leave the "bottom" of address
space by jmping from low to high (see DDJ March 1991). However, we must avoid
PIC from the jmp instruction, or else "jmp high" will keep us low. Because
"clever" assemblers and loaders transparently assume that PIC code is desired,
a quick solution is to push a constant on the stack and execute a return (ret)
instruction.


UNIX as a Subroutine Call


Once you are running "high," you need to install a stack. The stack should be
placed in the process's portion of address space. This way, it can be easily
changed when we move from a process or task to another, because each process
must have its own kernel stack. In a way, 386BSD functions like a subroutine
call for a user process, with its own internal calls stacking on this separate
stack, unlike the "jump to system" program approach used on systems such as
TOPS-10.
Keeping each process's kernel stack at the same virtual location works well
for processes with a single thread of execution, but is not advisable for
multithreaded execution. For the multithreaded version of 386BSD,
"lightweight" processes will require multiple kernel stacks and will be
allocated out of kernel global virtual memory as needed.


Configuring the 386 for UNIX Operation


The kernel program's address space established, we must "wire" the 386
processor hardware to the kernel interfaces and set initial conditions for the
system, including interrupt and exception processing, user process address
space definition, and preparation for context switching. All of the facilities
must be set from the earliest point possible, because before we leave the
kernel to execute a single user process instruction, we are already running
multitasking. In fact, we will even use multitasking and exception processing
as we initialize the system! This really should come as no surprise, as
software aficionados can never resist the temptation to use double-duty or
recursive code -- or even inscrutable self-referential code. As a result, we
page-fault the page tables to allocate them to be used in paging the first
process.


Segments Revisited


There is an old expression that says: "If one is used to using a hammer,
everything else becomes a nail." The 386 hammer of choice, segments, must be
used to pound together the rest of the architecture no matter what. In other
words, even if segments are not desired, they must be allocated and
initialized. Because we have chosen to achieve most of our functionality via
the paging mechanism (see DDJ January 1991), we try to minimize the need for
the segmentation mechanism, but allow for future extensibility in such areas
as dynamically growable tables (for example, ldt, gdt, ...). Currently,
init386( ) relies only on a constant table (see Listing Two, page 86). In
addition, separate descriptors for data and code segments are required as they
use different attribute sets, even though they exactly alias each other. This
allows for some interesting effects, such as allowing code to be executed out
of the stack!
The approach outlined thus far has permitted coding to proceed in C. This is
actually quite important, as the descriptor code and bitfields are obscure
enough without any additional complications, such as additional coding in
assembler. We actually could have worked in the reverse manner (invoking
segments and then paging) by using the descriptors to relocate user space and
run the kernel "low," but this would have significantly increased bookkeeping
overhead when going between user and kernel without offering any clear
advantage. Also, the construction of segment descriptors in assembly language
can be quite tedious if done in this manner.


Interrupts and Exceptions


In the standalone system (see DDJ March 1991) we built a Global Descriptor
Table to reinitialize segmentation. Now, we must follow the same techniques
developed for the standalone system to build a Global Descriptor Table for
descriptors used primarily by the kernel, and a Local Descriptor Table used
primarily by the user tasks. (The local table can later be made "relative" per
task if desired.) An Interrupt Descriptor Table must also be built that
instructs the processor to execute special assembly code stubs (located at
IDTVEC(XXX) entry points) within the kernel (SEL_KPL) when any exceptions are
triggered. Low-level code in each bus adapter's support code, used to
wire-down all possible interrupts, is called to catch unintended interrupts
prior to the configuring of devices by the kernel. And finally, through the
use of special assembly language entry points (see DDJ March 1991), the
descriptor tables are loaded, with any user or kernel exceptions caught on the
fly.
Up to this point, we've just assumed that sufficient memory would be present
to satisfy our needs, but this should not continue. Instead, we must probe and
check the amount of memory present against recorded values in the system's
configuration memory. If a value appears unusually large, we choose the lesser
of the two. If both seem questionable, however, we revert to our minimum
assumption -- 640 Kbytes of base memory only.
Memory in hand, we next initialize the virtual memory system that will manage
both physical memory and virtual address space. The routine pmap_bootstrap( )
scales resources and assumptions based on available physical memory, and
synchronizes the arrangement of the early "pmap" or physical map of the system
to its internal data structures. The Mach virtual memory system, portions of
which are incorporated into BSD, is split into machine-dependent (pmap) and
machine-independent (vm_map) parts.
The remaining portion of init386( ) creates a way for a user process to enter
the kernel and an initial process state through which a user process can be
run for the first time. Because processes inherit these characteristics, this
"zeroth" process state in effect initializes all subsequent sibling processes!
Upon executing init386( ) and main( ) (which initializes the kernel), the
system is prepared for running the user process. Listing One (near the end)
contains code which moves us into user space to execute the very first
process. Little work is done to the user process itself -- instead, the
exception mechanisms are relied upon to supply memory and instructions to it.
This occurs from the point of initialization, because the init process that
starts the system itself is faulted in incrementally.


Summary: What Do We Have Now?


As you may have noted, over the course of this series we have been building
upon our previous work as we head toward our goal, and increasingly we are
relying on an understanding of our growing set of tools. And, at the same
time, we have recently changed some of our code to accommodate some of the
exciting new developments at Berkeley. With all the changes occurring, even
those very familiar with this software can become somewhat "lost."
At this stage, it is important to go back and recall the perspective we tried
to establish on this project. We compared it to that of climbing a mountain,
and we carefully outlined and prepared for all the problems we thought we
would encounter. However, even with all the preparation we could muster, we've
still had to be fast on our feet. Paths which we had carefully mapped out just
six months ago are wiped away by an avalanche -- removed forever by the force
of innovation. Work and time and effort have been tossed aside as we've been
forced to adopt new approaches, not only to keep up with the group, but
occasionally to set the pace (as in recursive paging). And finally, as our
system grows, the complexity grows as well, and with it the blizzards of bugs
and incompatibilities that occasionally blind and dispirit.
And now, after months of effort, we have developed the barest of kernels. We
will continue on with our kernel development, but we now have the makings of
the "Basic Kernel." Key elements of our Basic Kernel (multitasking, processes,
device drivers, executing the first process, games tests, paging, and
swapping) are crucial to establish a working understanding of 386BSD and
Berkeley UNIX. We look forward to seeing you on the trail with us.


Brief Notes: 386BSD Recursive Paging


When we began this project, many of our notions were based on prior experience
in that we emphasized the similarities of the 386 to other machines while
discounting its idiosyncrasies. Like a new car owner fumbling in the wrong
place for the headlight switch and cursing it for having moved from the
dashboard to the steering column, we mainly tried to just get 386BSD running.
However, once we felt "settled in," we decided to see if we could take it to
the limit. Consequently, the last few months have been like motoring with Mr.
Toad -- with the onslaught of software changes, "wild" seems too weak a word.
In keeping with the CSRG goals for the upcoming 4.4BSD release, one major task
was to migrate 386BSD to a virtual memory system derived from CMU's MACH
operating system. While this decision was appropriate, a major problem
relating to the 386 arose almost immediately; when implemented as designed by
CMU on the 386, the virtual memory system swallowed copious quantities of
virtual address space for the operating system -- space which is needed for
user processes. Most of this space was gobbled up maintaining address maps of
all in-memory processes' page tables, so that the system could maintain access
to them at all times should they become active.
At the same time, we had been getting somewhat tired of how process page table
mapping was handled in the traditional BSD virtual memory system; since the
page tables themselves were in physical memory (for use by the 386's MMU), we
needed pages of page tables to map the page tables themselves before we could
modify them. As you can guess, this increased the amount of "bookkeeping"
overhead considerably, especially when interacting features are added (such as
shared libraries and shared memory). We hoped there would be a better way.
An ideal virtual memory system design gives access to information on the
virtual-to-physical translation process (and the converse) very quickly.
However, while the information is there, right on the same piece of silicon
and working at warp speed doing just this, there is no way via software to
invoke the mechanism other than through transparent processes -- creating the
"virtual memory" effect. (Don't expect any change in this area any time soon,
either, because for many hardware design reasons this is a nontrivial
addition.) As a consequence, the systems programmer must code a tedious
subroutine whose sole purpose is to emulate in software the same translation
process that the hardware performs in a fraction of the time of a single
instruction.
On the 386, page tables and page directories appear very similar -- in fact,
they're identical in contents (see top of Listing Four, page 90). Turning the
usual paging paradigms upside down, we examined what would occur if the page
tables and page directories were viewed as if they were software data
structures that could be connected in different ways. For example, frequently
we want to find the page table entry associated with a given page. Obviously,
the MMU does just this as it processes an ordinary reference to a page and
continues on to "indirect" through the PTE to get to a page. Upon reflection,
we noticed that if we arranged it so that the MMU goes through the same entry
twice, we could get it to "use up" an indirection. This would allow us to
reference the PTE itself instead of the underlying page. This approach, while
unorthodox and confusing to the uninitiated, turned out to be quite feasible.
Thus was born the "recursive" page map technique -- one guaranteed to annoy
the zealot and amaze the skeptic. Based on the "self-referential" model, the
386BSD recursive page table mechanism undergoes two iterations in the process
of obtaining the PTE itself. In the first iteration, see Figure 4(a), a
reference is made to the PTE of the page table directory. In the second
iteration, see Figure 4(b), a reference is made to the PDE that maps the page
directory itself. In other words, by "pointing" a page directory entry at the
page directory itself, we have created a window in our virtual address map
that consecutively maps all of the address space's page tables (in
corresponding order as well) without the need for another page of memory. In
addition, this technique also maps the page directory itself, as a consequence
of the second indirection, through the "recursive" page directory element.
To return to the previously mentioned example, we can find a PTE for a page
with the macro vtopte( ) as seen in Listing Four, which consists of just a
shift and an add. Additional macros here demonstrate the simplicity this
method gives the virtual memory system.
The benefits of this technique are compelling:
1. We were able to reuse an existing data structure -- the page directories
and page tables (contrary to the intentions of the hardware designers, by the
way) -- thus reducing the memory cost of a process.
2. We were able to reduce the number of items we need to track per process,
thus reducing bookkeeping overhead.
3. This method allowed us to conveniently mediate the cost of process page
tables. (The process page tables belong to and don't clutter up the operating
system kernel space.)
4. We were able to increase the locality of reference, such that the processor
cache performance is enhanced.
5. We were able to provide a more convenient model of memory for the operating
system to exploit.
Particularly relevant to items 2 and 5, by writing the 386 machine-dependent
support routines in a recursive manner, we were able to make the code perform
double-duty in a module a fraction of the size of previous 386 versions. In
addition, the multiprocessor version of 386BSD may derive some benefit from
this technique when used to hierarchically share page directory regions. It is
rare when you find a method that conceptually fits so well and as a
side-effect improves performance.
This technique is not limited to the 386 by any means; other two-level paging
MMU microprocessors (68030, Clipper, 32532, 88000, ...) theoretically can
leverage this technique, though probably with less benefit. Because most of
these processors have separate address spaces for kernel and user, waste in
the kernel does not rob memory from the user process as it does on the 386.
-- B.J. and L.J.

Figure 1(a):

 Minimal Kernel Breakdown (by module)
vmunix: text data bss module name

 1152 32 0 clock.o
 0 500 0 conf.o
 4548 740 32 cons.o
 1508 24 0 init_main.o
 0 1212 0 init_sysent.o
 1588 28 0 kern_clock.o
 2044 12 0 kern_descrip.o
 3296 80 0 kern_exec.o
 1840 48 0 kern_exit.o
 1600 36 0 kern_fork.o
 956 0 0 kern_mman.o
 312 0 0 kern_proc.o
 1280 0 0 kern_prot.o
 1216 0 0 kern_resource.o
 3564 32 0 kern_sig.o
 684 16 0 kern_subr.o
 1808 24 0 kern_synch.o
 1864 4 0 kern_time.o
 248 0 0 kern_xxx.o
 6176 20508 0 locore.o
 5596 596 0 machdep.o
 0 148 0 param.o
 2184 84 8 subr_prf.o
 1092 72 0 subr_rmap.o
 244 0 0 subr_xxx.o
 184 72 0 swapgeneric.o
 3340 0 0 sys_generic.o
 4156 68 0 sys_inode.o
 1096 56 0 sys_process.o
 784 16 0 sys_socket.o
 2260 224 0 trap.o
 9480 516 0 tty.o
 12 204 0 tty_conf.o
 3928 4 0 tty_pty.o
 1924 0 0 tty_subr.o
 8680 1220 0 ufs_alloc.o
 3312 116 0 ufs_bio.o
 1668 0 0 ufs_bmap.o
 1248 48 0 ufs_disksubr.o
 416 0 0 ufs_fio.o
 3968 68 0 ufs_inode.o
 436 0 0 ufs_machdep.o
 2048 0 0 ufs_mount.o
 6020 220 0 ufs_namei.o
 2288 208 0 ufs_subr.o
 7100 112 0 ufs_syscalls.o
 0 620 0 ufs_tables.o
 0 152 0 vers.o
 2280 48 0 vm_drum.o
 2964 52 0 vm_machdep.o
 4364 180 0 vm_mem.o
 8280 188 0 vm_page.o
 2056 20 0 vm_proc.o
 3060 24 0 vm_pt.o
 2788 72 0 vm_sched.o
 528 0 0 vm_subr.o
 1052 32 0 vm_sw.o
 1836 44 0 vm_swap.o
 1536 152 0 vm_swp.o
 2048 68 0 vm_text.o
 3768 1492 1024 wd.o
totals: 145708 30492 1064

Figure 1(b):

 Fully Loaded Kernel Breakdown (by module)
vmunix: text data bss module
 0 4 0 af.o
 592 16 0 autoconf.o
 844 0 0 clock.o
 2584 168 0 com.o
 0 640 0 conf.o
 4096 676 40 cons.o
 540 132 0 dead_vnops.o
 1440 28 0 device_pager.o
 3180 152 48 fd.o
 1264 140 0 fifo_vnops.o
 2812 12 0 if.o
 2600 12 0 if_ether.o
 1056 24 18 if_ethersubr.o
 464 0 0 if_loop.o
 5044 12 12 if_ne.o
 3184 16 0 if_sl.o
 3852 12 4 if_we.o
 2844 4 0 in.o
 356 0 0 in_cksum.o
 1684 0 0 in_pcb.o
 12 320 0 in_proto.o
 1496 12 0 init_main.o
 0 1532 0 init_sysent.o
 0 468 0 ioconf.o
 2056 68 0 ip_icmp.o
 4564 60 48 ip_input.o
 2616 0 0 ip_output.o
 1372 4 0 isa.o
 1204 16 0 kern_acct.o
 1280 4 0 kern_clock.o
 3184 0 0 kern_descrip.o
 3176 0 0 kern_exec.o
 1424 0 0 kern_exit.o
 996 8 4 kern_fork.o
 1204 0 0 kern_kinfo.o
 1772 0 0 kern_ktrace.o
 1028 4 0 kern_lock.o
 1892 268 0 kern_malloc.o
 796 0 0 kern_physio.o
 1180 0 0 kern_proc.o
 1844 0 0 kern_prot.o
 1140 0 0 kern_resource.o
 4172 132 0 kern_sig.o
 684 0 0 kern_subr.o
 1988 4 0 kern_synch.o
 1408 4 0 kern_time.o
 264 0 0 kern_xxx.o
 7076 684 0 locore.o
 4684 192 0 machdep.o
 552 0 0 mem.o
 708 44 4 mfs_vfsops.o
 656 132 0 mfs_vnops.o
 1600 0 0 nfs_bio.o
 1020 0 0 nfs_node.o
 21700 36 0 nfs_serv.o
 7748 152 0 nfs_socket.o
 1040 144 21672 nfs_srvcache.o
 10284 40 4 nfs_subs.o
 1956 72 80 nfs_syscalls.o
 2996 40 1 nfs_vfsops.o
 21304 424 0 nfs_vnops.o
 348 12 16 npx.o
 0 152 0 param.o
 6308 16 0 pmap.o
 2908 4 36 radix.o
 164 8 0 raw_cb.o
 1072 36 0 raw_ip.o
 812 0 0 raw_usrreq.o
 2304 8 0 route.o
 4552 116 0 rtsock.o
 2584 60 0 slcompress.o
 2296 180 0 spec_vnops.o
 716 0 0 subr_log.o
 1764 8 0 subr_prf.o
 888 0 0 subr_rmap.o
 340 0 0 subr_xxx.o
 5456 28 0 swap_pager.o
 0 40 0 swapvmunix.o
 3344 0 0 sys_generic.o
 0 0 0 sys_machdep.o
 904 56 0 sys_process.o
 604 20 0 sys_socket.o
 228 0 0 tcp_debug.o
 5820 8 0 tcp_input.o
 1896 16 0 tcp_output.o
 1504 12 0 tcp_subr.o
 832 60 0 tcp_timer.o
 1620 8 0 tcp_usrreq.o
 2912 0 0 trap.o
 9488 316 0 tty.o
 1864 204 0 tty_compat.o
 12 204 0 tty_conf.o
 3452 4 0 tty_pty.o
 1988 0 0 tty_subr.o
 504 0 0 tty_tty.o
 1980 36 0 udp_usrreq.o
 9644 0 0 ufs_alloc.o
 2012 0 0 ufs_bmap.o
 1424 0 0 ufs_disksubr.o
 3756 0 0 ufs_inode.o
 1668 12 0 ufs_lockf.o
 3832 4 0 ufs_lookup.o
 4572 20 0 ufs_quota.o
 732 0 0 ufs_subr.o
 0 620 0 ufs_tables.o
 3948 64 0 ufs_vfsops.o
 8264 524 0 ufs_vnops.o
 620 0 0 uipc_domain.o
 2672 64 4 uipc_mbuf.o
 8 176 0 uipc_proto.o
 6164 0 0 uipc_socket.o
 3184 24 0 uipc_socket2.o
 5520 0 0 uipc_syscalls.o
 3320 32 0 uipc_usrreq.o
 0 232 0 vers.o
 3644 0 0 vfs_bio.o
 1108 4 0 vfs_cache.o
 0 24 0 vfs_conf.o
 1776 0 0 vfs_lookup.o
 3940 44 0 vfs_subr.o
 7544 0 0 vfs_syscalls.o
 1684 20 0 vfs_vnops.o
 3524 0 0 vm_fault.o
 1964 20 0 vm_glue.o
 84 0 0 vm_init.o
 1848 0 0 vm_kern.o
 944 0 308 vm_machdep.o
 7624 16 0 vm_map.o
 384 20 0 vm_meter.o
 3196 4 0 vm_mmap.o
 3588 16 0 vm_object.o
 2500 32 0 vm_page.o
 824 8 0 vm_pageout.o
 636 20 0 vm_pager.o
 1160 0 0 vm_swap.o
 416 0 0 vm_unix.o
 304 0 0 vm_user.o
 2200 28 0 vnode_pager.o
 6176 1648 524 wd.o
 5252 48 9 wt.o
totals: 359636 12248 22832


_PORTING UNIX TO THE 386: A STRIPPED-DOWN KERNEL_
by William Frederick Jolitz and Lynne Greer Jolitz


[LISTING ONE]

/* locore.s: Copyright (c) 1990,1991 William Jolitz. All rights reserved.
 * Written by William Jolitz 1/90
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 */


/* [Excerpted from i386/locore.s] */
#define R(s) s - KERNEL_BASE /* relocate references until mapping enabled */

/* Per-process region virtual address space is located at the top of user
 * space, growing down to the top of the user stack [set in the "high"
 * kernel]. At kernel startup time, the only per-process data we need is a
 * kernel stack, so we allocate SPAGES of stack pages for the purpose before
 * calling the kernel initialization code. */
 .data
 .globl _boothowto, _bootdev, _cyloffset

 /* Temporary stack */
 .space 128
tmpstk:
_boothowto: .long 0 /* bootstrap options */
_bootdev: .long 0 /* bootstrap device */
_cyloffset: .long 0 /* cylinder offset of bootstrap partition */
 .text
 .globl start
start:
 /* arrange for a warm boot from the BIOS at some point in the future */
 movw $0x1234, 0x472
 jmp 1f
 .space 0x500 # skip over BIOS data areas

 /* pass parameters on stack (howto, bootdev, cyloffset)
 * note: 0(%esp) is return address of bootstrap that loaded this kernel. */
1: movl 4(%esp), %eax
 movl %eax, R(_boothowto)
 movl 8(%esp), %eax
 movl %eax, R(_bootdev)
 movl 12(%esp), %eax
 movl %eax, R(_cyloffset)

 /* use temporary stack till mapping enabled to ensure it falls within map */
 movl $R(tmpstk), %esp

 /* find end of kernel image */
 movl $R(_end), %ecx
 addl $NBPG-1, %ecx
 andl $~(NBPG-1), %ecx
 movl %ecx, %esi

 /* clear bss and memory for bootstrap page tables. */
 movl $R(_edata), %edi
 subl %edi, %ecx
 addl $(SPAGES+1+1+1)*NBPG, %ecx
 # stack + page directory + kernel page table + stack page table
 xorl %eax, %eax # pattern
 cld
 rep
 stosb

 /* Map Kernel--N.B. don't bother with making kernel text RO, as 386
 * ignores R/W AND U/S bits on kernel access (only valid bit works) !
 * First step - build page tables */
 movl %esi, %ecx # this much memory,
 shrl $PGSHIFT, %ecx # for this many ptes
 movl $PG_V, %eax # having these bits set,

 leal (2+SPAGES)*NBPG(%esi), %ebx # physical address of Sysmap
 movl %ebx, R(_KPTphys) # in the kernel page table,
 call fillpt

 /* map proc 0's kernel stack into user page table page */
 movl $SPAGES, %ecx # for this many ptes,
 leal 1*NBPG(%esi), %eax # physical address of stack in proc 0
 orl $PG_V|PG_URKW, %eax # having these bits set,
 leal (1+SPAGES)*NBPG(%esi), %ebx # physical address of stack pt
 addl $(ptei(_PTmap)-1)*4, %ebx
 call fillpt

 /* Construct an initial page table directory */
 /* install a pde for temporary double map of bottom of VA */
 leal (SPAGES+2)*NBPG(%esi), %eax # physical address of kernel pt
 orl $PG_V, %eax
 movl %eax, (%esi)

 /* kernel pde - same contents */
 leal pdei(KERNEL_BASE)*4(%esi), %ebx # offset of pde for kernel
 movl %eax, (%ebx)

 /* install a pde recursively mapping page directory as a page table! */
 movl %esi, %eax # phys address of ptd in proc 0
 orl $PG_V, %eax
 movl %eax, pdei(_PTD)*4(%esi)

 /* install a pde to map stack for proc 0 */
 leal (SPAGES+1)*NBPG(%esi), %eax # physical address of pt in proc 0
 orl $PG_V, %eax
 movl %eax, (pdei(_PTD)-1)*4(%esi) # which is where per-process maps!

 /* load base of page directory, and enable mapping */
 movl %esi, %eax # phys address of ptd in proc 0
 orl $I386_CR3PAT, %eax
 movl %eax, %cr3 # load ptd addr into mmu
 movl %cr0, %eax # get control word
 orl $0x80000001, %eax # and let's page!
 movl %eax, %cr0 # NOW!

 /* now running mapped */
 pushl $begin # jump to high mem!
 ret

 /* now running relocated at SYSTEM where the system is linked to run */
begin:
 /* set up bootstrap stack */
 movl $_PTD-SPAGES*NBPG, %esp # kernel stack virtual address top
 xorl %eax, %eax # mark end of frames with a sentinel
 movl %eax, %ebp
 movl %eax, _PTD # clear lower address space mapping
 leal (SPAGES+3)*NBPG(%esi), %esi # skip past stack + page tables.
 pushl %esi

 /* init386(startphys) main(startphys) */
 call _init386 # wire 386 chip for unix operation
 call _main
 popl %eax


 /* find process (proc 0) to be run */
 movl _curproc, %eax
 movl P_PCB(%eax), %eax

 /* build outer stack frame */
 pushl PCB_SS(%eax) # user ss
 pushl PCB_ESP(%eax) # user esp
 pushl PCB_CS(%eax) # user cs
 pushl PCB_EIP(%eax) # user pc
 movw PCB_DS(%eax), %ds
 movw PCB_ES(%eax), %es
 lret # goto user!

/* fill in pte/pde tables */
fillpt:
 movl %eax, (%ebx) /* stuff pte */
 addl $NBPG, %eax /* increment physical address */
 addl $4, %ebx /* next pte */
 loop fillpt
 ret
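
The startup code above carves its page directory, the proc 0 kernel stack, and two page tables out of the memory just past the kernel image. As a rough sketch, the following C fragment recomputes that physical layout from the offsets applied to %esi in the listing. The struct and function names are hypothetical, and NBPG and SPAGES are assumed to have the values shown in Listing Five.

```c
#include <assert.h>
#include <stdint.h>

#define NBPG   4096u  /* bytes per page (Listing Five) */
#define SPAGES 2u     /* pages of kernel stack (Listing Five) */

/* Hypothetical summary of the physical layout built by the startup code. */
struct boot_layout {
    uint32_t page_dir;   /* %esi: page directory, later loaded into %cr3 */
    uint32_t stack;      /* 1*NBPG(%esi): proc 0 kernel stack, SPAGES pages */
    uint32_t stack_pt;   /* (1+SPAGES)*NBPG(%esi): page table for the stack */
    uint32_t kernel_pt;  /* (2+SPAGES)*NBPG(%esi): kernel page table */
    uint32_t first_free; /* (3+SPAGES)*NBPG(%esi): value pushed for init386() */
};

struct boot_layout boot_layout_from_end(uint32_t end_phys)
{
    struct boot_layout l;

    /* round the end of the kernel image up to a page boundary */
    l.page_dir   = (end_phys + NBPG - 1u) & ~(NBPG - 1u);
    l.stack      = l.page_dir + 1u * NBPG;
    l.stack_pt   = l.page_dir + (1u + SPAGES) * NBPG;
    l.kernel_pt  = l.page_dir + (2u + SPAGES) * NBPG;
    l.first_free = l.page_dir + (3u + SPAGES) * NBPG;
    return l;
}
```

For a kernel image ending at physical address 0x123456, the page directory lands at 0x124000 and the first free page at 0x129000.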






[LISTING TWO]

/* machdep.c: Copyright (c) 1989,1991 William Jolitz. All rights reserved.
 * Written by William Jolitz 7/89
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 */
/* [excerpted from i386/i386/machdep.c] */
/* Initialize segments & interrupt table */

#define GNULL_SEL 0 /* Null Descriptor */
#define GCODE_SEL 1 /* Kernel Code Descriptor */
#define GDATA_SEL 2 /* Kernel Data Descriptor */
#define GLDT_SEL 3 /* LDT - eventually one per process */
#define GTGATE_SEL 4 /* Process task switch gate */
#define GPANIC_SEL 5 /* Task state to consider panic from */
#define GPROC0_SEL 6 /* Task state process slot zero and up */
#define NGDT GPROC0_SEL+1

union descriptor gdt[GPROC0_SEL+1];

/* interrupt descriptor table */
struct gate_descriptor idt[32+16];

/* local descriptor table */
union descriptor ldt[5];
#define LSYS5CALLS_SEL 0 /* forced by intel BCS */
#define LSYS5SIGR_SEL 1
#define L43BSDCALLS_SEL 2 /* notyet */

#define LUCODE_SEL 3
#define LUDATA_SEL 4

/* #define LPOSIXCALLS_SEL 5 /* notyet */
struct i386tss tss, panic_tss;

/* software prototypes -- in more palatable form */
struct soft_segment_descriptor gdt_segs[] = {
 /* Null Descriptor */
{ 0x0, /* segment base address */
 0x0, /* length - all address space */
 0, /* segment type */
 0, /* segment descriptor priority level */
 0, /* segment descriptor present */
 0,0,
 0, /* default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Code Descriptor for kernel */
{ 0x0, /* segment base address */
 0xfffff, /* length - all address space */
 SDT_MEMERA, /* segment type */
 0, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 1, /* default 32 vs 16 bit size */
 1 /* limit granularity (byte/page units)*/ },
 /* Data Descriptor for kernel */
{ 0x0, /* segment base address */
 0xfffff, /* length - all address space */
 SDT_MEMRWA, /* segment type */
 0, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 1, /* default 32 vs 16 bit size */
 1 /* limit granularity (byte/page units)*/ },
 /* LDT Descriptor */
{ (int) ldt, /* segment base address */
 sizeof(ldt)-1, /* length - all address space */
 SDT_SYSLDT, /* segment type */
 0, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 0, /* unused - default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Null Descriptor - Placeholder */
{ 0x0, /* segment base address */
 0x0, /* length - all address space */
 0, /* segment type */
 0, /* segment descriptor priority level */
 0, /* segment descriptor present */
 0,0,
 0, /* default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Panic Tss Descriptor */
{ (int) &panic_tss, /* segment base address */
 sizeof(tss)-1, /* length - all address space */
 SDT_SYS386TSS, /* segment type */
 0, /* segment descriptor priority level */
 1, /* segment descriptor present */

 0,0,
 0, /* unused - default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Proc 0 Tss Descriptor */
{ 0, /* segment base address */
 sizeof(tss)-1, /* length - all address space */
 SDT_SYS386TSS, /* segment type */
 0, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 0, /* unused - default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ }};
struct soft_segment_descriptor ldt_segs[] = {
 /* Null Descriptor - overwritten by call gate */
{ 0x0, /* segment base address */
 0x0, /* length - all address space */
 0, /* segment type */
 0, /* segment descriptor priority level */
 0, /* segment descriptor present */
 0,0,
 0, /* default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Null Descriptor - overwritten by call gate */
{ 0x0, /* segment base address */
 0x0, /* length - all address space */
 0, /* segment type */
 0, /* segment descriptor priority level */
 0, /* segment descriptor present */
 0,0,
 0, /* default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Null Descriptor - overwritten by call gate */
{ 0x0, /* segment base address */
 0x0, /* length - all address space */
 0, /* segment type */
 0, /* segment descriptor priority level */
 0, /* segment descriptor present */
 0,0,
 0, /* default 32 vs 16 bit size */
 0 /* limit granularity (byte/page units)*/ },
 /* Code Descriptor for user */
{ 0x0, /* segment base address */
 0xfffff, /* length - all address space */
 SDT_MEMERA, /* segment type */
 SEL_UPL, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 1, /* default 32 vs 16 bit size */
 1 /* limit granularity (byte/page units)*/ },
 /* Data Descriptor for user */
{ 0x0, /* segment base address */
 0xfffff, /* length - all address space */
 SDT_MEMRWA, /* segment type */
 SEL_UPL, /* segment descriptor priority level */
 1, /* segment descriptor present */
 0,0,
 1, /* default 32 vs 16 bit size */
 1 /* limit granularity (byte/page units)*/ } };
/* table descriptors - used to load tables by microp */

struct region_descriptor r_gdt = {
 sizeof(gdt)-1,(char *)gdt
};
struct region_descriptor r_idt = {
 sizeof(idt)-1,(char *)idt
};
setidt(idx, func, typ, dpl) char *func; {
 struct gate_descriptor *ip = idt + idx;
 ip->gd_looffset = (int)func;
 ip->gd_selector = GSEL(GCODE_SEL,SEL_KPL);
 ip->gd_stkcpy = 0;
 ip->gd_xx = 0;
 ip->gd_type = typ;
 ip->gd_dpl = dpl;
 ip->gd_p = 1;
 ip->gd_hioffset = ((int)func)>>16 ;
}
#define IDTVEC(name) X/**/name
extern IDTVEC(div), IDTVEC(dbg), IDTVEC(nmi), IDTVEC(bpt), IDTVEC(ofl),
 IDTVEC(bnd), IDTVEC(ill), IDTVEC(dna), IDTVEC(dble), IDTVEC(fpusegm),
 IDTVEC(tss), IDTVEC(missing), IDTVEC(stk), IDTVEC(prot),
 IDTVEC(page), IDTVEC(rsvd), IDTVEC(fpu), IDTVEC(rsvd0),
 IDTVEC(rsvd1), IDTVEC(rsvd2), IDTVEC(rsvd3), IDTVEC(rsvd4),
 IDTVEC(rsvd5), IDTVEC(rsvd6), IDTVEC(rsvd7), IDTVEC(rsvd8),
 IDTVEC(rsvd9), IDTVEC(rsvd10), IDTVEC(rsvd11), IDTVEC(rsvd12),
 IDTVEC(rsvd13), IDTVEC(rsvd14), IDTVEC(rsvd14), IDTVEC(syscall);
int lcr0(), lcr3(), rcr0(), rcr2();
int _udatasel, _ucodesel, _gsel_tss;
init386() { extern ssdtosd(), lgdt(), lidt(), lldt(), etext;
 int x;
 unsigned biosbasemem, biosextmem;
 struct gate_descriptor *gdp;
 extern int sigcode,szsigcode;
 struct pcb *pb = proc0.p_addr;
 /* initialize console */
 cninit ();
 /* make gdt memory segments */
 gdt_segs[GCODE_SEL].ssd_limit = btoc((int) &etext + NBPG);
 gdt_segs[GPROC0_SEL].ssd_base = pb;
 for (x=0; x < NGDT; x++) ssdtosd(gdt_segs+x, gdt+x);
 /* make ldt memory segments */
 ldt_segs[LUCODE_SEL].ssd_limit = btoc(UPT_MIN_ADDRESS);
 ldt_segs[LUDATA_SEL].ssd_limit = btoc(UPT_MIN_ADDRESS);
 /* Note. eventually want private ldts per process */
 for (x=0; x < 5; x++) ssdtosd(ldt_segs+x, ldt+x);
 /* exceptions */
 setidt(0, &IDTVEC(div), SDT_SYS386TGT, SEL_KPL);
 setidt(1, &IDTVEC(dbg), SDT_SYS386TGT, SEL_KPL);
 setidt(2, &IDTVEC(nmi), SDT_SYS386TGT, SEL_KPL);
 setidt(3, &IDTVEC(bpt), SDT_SYS386TGT, SEL_UPL);
 setidt(4, &IDTVEC(ofl), SDT_SYS386TGT, SEL_KPL);
 setidt(5, &IDTVEC(bnd), SDT_SYS386TGT, SEL_KPL);
 setidt(6, &IDTVEC(ill), SDT_SYS386TGT, SEL_KPL);
 setidt(7, &IDTVEC(dna), SDT_SYS386TGT, SEL_KPL);
 setidt(8, &IDTVEC(dble), SDT_SYS386TGT, SEL_KPL);
 setidt(9, &IDTVEC(fpusegm), SDT_SYS386TGT, SEL_KPL);
 setidt(10, &IDTVEC(tss), SDT_SYS386TGT, SEL_KPL);
 setidt(11, &IDTVEC(missing), SDT_SYS386TGT, SEL_KPL);
 setidt(12, &IDTVEC(stk), SDT_SYS386TGT, SEL_KPL);

 setidt(13, &IDTVEC(prot), SDT_SYS386TGT, SEL_KPL);
 setidt(14, &IDTVEC(page), SDT_SYS386TGT, SEL_KPL);
 setidt(15, &IDTVEC(rsvd), SDT_SYS386TGT, SEL_KPL);
 setidt(16, &IDTVEC(fpu), SDT_SYS386TGT, SEL_KPL);
 setidt(17, &IDTVEC(rsvd0), SDT_SYS386TGT, SEL_KPL);
 setidt(18, &IDTVEC(rsvd1), SDT_SYS386TGT, SEL_KPL);
 setidt(19, &IDTVEC(rsvd2), SDT_SYS386TGT, SEL_KPL);
 setidt(20, &IDTVEC(rsvd3), SDT_SYS386TGT, SEL_KPL);
 setidt(21, &IDTVEC(rsvd4), SDT_SYS386TGT, SEL_KPL);
 setidt(22, &IDTVEC(rsvd5), SDT_SYS386TGT, SEL_KPL);
 setidt(23, &IDTVEC(rsvd6), SDT_SYS386TGT, SEL_KPL);
 setidt(24, &IDTVEC(rsvd7), SDT_SYS386TGT, SEL_KPL);
 setidt(25, &IDTVEC(rsvd8), SDT_SYS386TGT, SEL_KPL);
 setidt(26, &IDTVEC(rsvd9), SDT_SYS386TGT, SEL_KPL);
 setidt(27, &IDTVEC(rsvd10), SDT_SYS386TGT, SEL_KPL);
 setidt(28, &IDTVEC(rsvd11), SDT_SYS386TGT, SEL_KPL);
 setidt(29, &IDTVEC(rsvd12), SDT_SYS386TGT, SEL_KPL);
 setidt(30, &IDTVEC(rsvd13), SDT_SYS386TGT, SEL_KPL);
 setidt(31, &IDTVEC(rsvd14), SDT_SYS386TGT, SEL_KPL);
#include "isa.h"
#if NISA >0
 isa_defaultirq();
#endif
 /* load descriptor tables into 386 */
 lgdt(gdt, sizeof(gdt)-1);
 lidt(idt, sizeof(idt)-1);
 lldt(GSEL(GLDT_SEL, SEL_KPL));
 /* resolve amount of memory present so we can scale kernel PT */
 maxmem = probemem();
 biosbasemem = rtcin(RTC_BASELO)+ (rtcin(RTC_BASEHI)<<8);
 biosextmem = rtcin(RTC_EXTLO)+ (rtcin(RTC_EXTHI)<<8);
 if (biosbasemem == 0xffff || biosextmem == 0xffff) {
 if (biosbasemem == 0xffff && maxmem > RAM_END)
 maxmem = IOM_BEGIN;
 if (biosextmem == 0xffff && maxmem > RAM_END)
 maxmem = IOM_BEGIN;
 } else if (biosextmem > 0 && biosbasemem == IOM_BEGIN/1024) {
 int totbios = (biosbasemem + 0x60000 + biosextmem);
 if (totbios < maxmem) maxmem = totbios;
 } else maxmem = IOM_BEGIN;
 /* call pmap initialization to make new kernel address space */
 pmap_bootstrap ();
 /* now running on new page tables, configured,and u/iom is accessible */
 /* make a initial tss so microp can get interrupt stack on syscall! */
 pb->pcbtss.tss_esp0 = UPT_MIN_ADDRESS;
 pb->pcbtss.tss_ss0 = GSEL(GDATA_SEL, SEL_KPL) ;
 _gsel_tss = GSEL(GPROC0_SEL, SEL_KPL);
 ltr(_gsel_tss);
 /* make a call gate to reenter kernel with */
 gdp = &ldt[LSYS5CALLS_SEL].gd;
 gdp->gd_looffset = (int) &IDTVEC(syscall);
 gdp->gd_selector = GSEL(GCODE_SEL,SEL_KPL);
 gdp->gd_stkcpy = 0;
 gdp->gd_type = SDT_SYS386CGT;
 gdp->gd_dpl = SEL_UPL;
 gdp->gd_p = 1;
 gdp->gd_hioffset = ((int) &IDTVEC(syscall)) >>16;
 /* transfer to user mode */
 _ucodesel = LSEL(LUCODE_SEL, SEL_UPL);

 _udatasel = LSEL(LUDATA_SEL, SEL_UPL);
 /* setup per-process */
 bcopy(&sigcode, pb->pcb_sigc, szsigcode);
 pb->pcb_flags = 0;
 pb->pcb_ptd = IdlePTD;
}
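
The "software prototype" tables above are turned into hardware descriptors by ssdtosd(), whose body is not part of this excerpt. The following is only a sketch of what such a conversion involves, based on the 386 segment descriptor layout. The struct is a hypothetical stand-in for struct soft_segment_descriptor, and the numeric type codes used in the example (27 for SDT_MEMERA, 19 for SDT_MEMRWA) are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the listing's struct soft_segment_descriptor. */
struct soft_desc {
    uint32_t base;    /* segment base address */
    uint32_t limit;   /* 20-bit limit, in bytes or in pages */
    unsigned type;    /* 5-bit segment type */
    unsigned dpl;     /* descriptor privilege level, 0..3 */
    unsigned present; /* 1 if the segment is present */
    unsigned def32;   /* 1 selects 32-bit default size */
    unsigned gran;    /* 1 means the limit is counted in pages */
};

/* Pack the fields into the 8-byte layout the 386 expects. */
uint64_t pack_descriptor(const struct soft_desc *s)
{
    uint64_t d = 0;

    d |= (uint64_t)(s->limit & 0xFFFFu);             /* limit 15..0  */
    d |= (uint64_t)(s->base & 0xFFFFFFu) << 16;      /* base 23..0   */
    d |= (uint64_t)(s->type & 0x1Fu) << 40;          /* type         */
    d |= (uint64_t)(s->dpl & 0x3u) << 45;            /* privilege    */
    d |= (uint64_t)(s->present & 1u) << 47;          /* present      */
    d |= (uint64_t)((s->limit >> 16) & 0xFu) << 48;  /* limit 19..16 */
    d |= (uint64_t)(s->def32 & 1u) << 54;            /* default size */
    d |= (uint64_t)(s->gran & 1u) << 55;             /* granularity  */
    d |= (uint64_t)((s->base >> 24) & 0xFFu) << 56;  /* base 31..24  */
    return d;
}
```

Under these assumptions, the kernel code prototype (base 0, limit 0xfffff, present, 32-bit, page granular) packs to 0x00CF9B000000FFFF, and the kernel data prototype to 0x00CF93000000FFFF.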






[LISTING THREE]

/* Machine dependent constants for 386. */

/* user map constants */
#define VM_MIN_ADDRESS ((vm_offset_t)0)
#define UPT_MIN_ADDRESS ((vm_offset_t)0xFDC00000)
#define UPT_MAX_ADDRESS ((vm_offset_t)0xFDFF7000)
#define VM_MAX_ADDRESS UPT_MAX_ADDRESS

/* kernel map constants */
#define VM_MIN_KERNEL_ADDRESS ((vm_offset_t)0xFDFF7000)
#define KPT_MIN_ADDRESS ((vm_offset_t)0xFDFF8000)
#define KPT_MAX_ADDRESS ((vm_offset_t)0xFDFFF000)
#define KERNEL_BASE 0xFE000000
#define VM_MAX_KERNEL_ADDRESS ((vm_offset_t)0xFF7FF000)

/* # of kernel PT pages (initial only, can grow dynamically) */
#define VM_KERNEL_PT_PAGES ((vm_size_t)1)





[LISTING FOUR]

/*
 * pmap.h: Copyright (c) 1990,1991 William Jolitz. All rights reserved.
 * Written by William Jolitz 12/90
 *
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 *
 */
/*
 * [excerpted from i386/pmap.h]
 * Recursive map version by W. Jolitz
 */

/* page directory element */
struct pde
{
unsigned int

 pd_v:1, /* valid bit */
 pd_prot:2, /* access control */
 pd_mbz1:2, /* reserved, must be zero */
 pd_u:1, /* hardware maintained 'used' bit */
 :1, /* not used */
 pd_mbz2:2, /* reserved, must be zero */
 :3, /* reserved for software */
 pd_pfnum:20; /* physical page frame number of pte's*/
};

#define PD_MASK 0xffc00000 /* page directory address bits */
#define PT_MASK 0x003ff000 /* page table address bits */
#define PD_SHIFT 22 /* page directory address shift */
#define PG_SHIFT 12 /* page table address shift */

/* page table element */
struct pte
{
unsigned int
 pg_v:1, /* valid bit */
 pg_prot:2, /* access control */
 pg_mbz1:2, /* reserved, must be zero */
 pg_u:1, /* hardware maintained 'used' bit */
 pg_m:1, /* hardware maintained modified bit */
 pg_mbz2:2, /* reserved, must be zero */
 pg_w:1, /* software, wired down page */
 :1, /* software (unused) */
 pg_nc:1, /* 'uncacheable page' bit */
 pg_pfnum:20; /* physical page frame number */
};

#define PG_V 0x00000001
#define PG_RO 0x00000000
#define PG_RW 0x00000002
#define PG_u 0x00000004
#define PG_PROT 0x00000006 /* all protection bits . */
#define PG_W 0x00000200
#define PG_N 0x00000800 /* Non-cacheable */
#define PG_M 0x00000040
#define PG_U 0x00000020
#define PG_FRAME 0xfffff000

#define PG_NOACC 0
#define PG_KR 0x00000000
#define PG_KW 0x00000002
#define PG_URKR 0x00000004
#define PG_URKW 0x00000004
#define PG_UW 0x00000006

/*
 * Page Protection Exception bits
 */

#define PGEX_P 0x01 /* Protection violation vs. not present */
#define PGEX_W 0x02 /* during a Write cycle */
#define PGEX_U 0x04 /* access from User mode (UPL) */

/*
 * Address of current address space page table maps

 * and directories.
 */
extern struct pte PTmap[], Sysmap[];
extern struct pde PTD[], PTDpde;

/*
 * virtual address to page table entry and to physical address.
 * Note: these work recursively, thus vtopte of a pte will give
 * the corresponding pde that it in turn maps into.
 */
#define vtopte(va) (PTmap + i386_btop(va))
#define ptetov(pt) (i386_ptob(pt - PTmap))
#define vtophys(va) (i386_ptob(vtopte(va)->pg_pfnum) | ((int)(va) & PGOFSET))
#define ispt(va) ((va) >= UPT_MIN_ADDRESS && (va) <= KPT_MAX_ADDRESS)

/*
 * macros to generate page directory/table indices
 */

#define pdei(va) (((va)&PD_MASK)>>PD_SHIFT)
#define ptei(va) (((va)&PT_MASK)>>PG_SHIFT)
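
The recursive mapping described in the comments above can be checked with ordinary integer arithmetic. The sketch below assumes that PTmap sits at UPT_MIN_ADDRESS (0xFDC00000 in Listing Three) and that each pte occupies 4 bytes. The helper names are hypothetical stand-ins for pdei(), ptei(), and vtopte(), working on plain addresses instead of pointers.

```c
#include <assert.h>
#include <stdint.h>

#define PGSHIFT  12
#define PD_SHIFT 22
#define PTMAP_VA 0xFDC00000u  /* assumed location of PTmap */

/* Index of the page directory entry that covers va. */
uint32_t pde_index(uint32_t va)
{
    return va >> PD_SHIFT;
}

/* Index of the entry for va within its page table. */
uint32_t pte_index(uint32_t va)
{
    return (va >> PGSHIFT) & 0x3FFu;
}

/* Virtual address of the pte that maps va: the arithmetic behind vtopte(). */
uint32_t pte_vaddr(uint32_t va)
{
    return PTMAP_VA + (va >> PGSHIFT) * 4u;
}
```

With these constants, pte_vaddr(0xFDC00000) comes out as 0xFDFF7000, so the page directory shows up as one page of PTmap, and pte_vaddr(0xFE000000) comes out as 0xFDFF8000, which matches KPT_MIN_ADDRESS in Listing Three. Applying pte_vaddr() twice to a kernel address yields the address of its page directory entry.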






[LISTING FIVE]

/* param.h: Copyright (c) 1989,1990,1991 William Jolitz. All rights reserved.
 * Written by William Jolitz 6/89
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 */
/* Machine dependent constants for Intel 386. */

#define MACHINE "i386"
#define NBPG 4096 /* bytes/page */
#define PGOFSET (NBPG-1) /* byte offset into page */
#define PGSHIFT 12 /* LOG2(NBPG) */
#define NPTEPG (NBPG/(sizeof (struct pte)))
#define NBPDR (1024*NBPG) /* bytes/page dir */
#define PDROFSET (NBPDR-1) /* byte offset into page dir */
#define PDRSHIFT 22 /* LOG2(NBPDR) */
#define KERNBASE 0xFE000000 /* start of kernel virtual */
#define DEV_BSIZE 512
#define DEV_BSHIFT 9 /* log2(DEV_BSIZE) */
#define CLSIZE 1
#define CLSIZELOG2 0
#define SSIZE 1 /* initial stack size/NBPG */
#define SINCR 1 /* increment of stack/NBPG */
#define SPAGES 2 /* pages of kernel stack area */

/* clicks to bytes */
#define ctob(x) ((x)<<PGSHIFT)


/* bytes to clicks */
#define btoc(x) (((unsigned)(x)+(NBPG-1))>>PGSHIFT)
#define btodb(bytes) /* calculates (bytes / DEV_BSIZE) */ \
 ((unsigned)(bytes) >> DEV_BSHIFT)
#define dbtob(db) /* calculates (db * DEV_BSIZE) */ \
 ((unsigned)(db) << DEV_BSHIFT)

/* Map a ``block device block'' to a file system block. This should be device
 * dependent, and will be if we add an entry to cdevsw/bdevsw for that
 * purpose. For now though just use DEV_BSIZE. */
#define bdbtofsb(bn) ((bn) / (BLKDEV_IOSIZE/DEV_BSIZE))

/* Mach derived conversion macros */
#define i386_round_pdr(x) ((((unsigned)(x)) + NBPDR - 1) & ~(NBPDR-1))
#define i386_trunc_pdr(x) ((unsigned)(x) & ~(NBPDR-1))
#define i386_round_page(x) ((((unsigned)(x)) + NBPG - 1) & ~(NBPG-1))
#define i386_trunc_page(x) ((unsigned)(x) & ~(NBPG-1))
#define i386_btod(x) ((unsigned)(x) >> PDRSHIFT)
#define i386_dtob(x) ((unsigned)(x) << PDRSHIFT)
#define i386_btop(x) ((unsigned)(x) >> PGSHIFT)
#define i386_ptob(x) ((unsigned)(x) << PGSHIFT)
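
The conversion macros above differ mainly in whether they round up or truncate. As a small illustration, here are function equivalents of btoc(), ctob(), i386_round_page(), and i386_trunc_page(). The function names are hypothetical; the arithmetic is taken directly from the macros.

```c
#include <assert.h>

#define NBPG    4096u
#define PGSHIFT 12

/* btoc: bytes to clicks (pages), rounding up */
unsigned bytes_to_clicks(unsigned x)
{
    return (x + (NBPG - 1u)) >> PGSHIFT;
}

/* ctob: clicks (pages) to bytes */
unsigned clicks_to_bytes(unsigned x)
{
    return x << PGSHIFT;
}

/* i386_round_page: round a byte address up to a page boundary */
unsigned round_page(unsigned x)
{
    return (x + NBPG - 1u) & ~(NBPG - 1u);
}

/* i386_trunc_page: round a byte address down to a page boundary */
unsigned trunc_page(unsigned x)
{
    return x & ~(NBPG - 1u);
}
```

A 4097-byte object therefore needs two clicks, while address 0x1234 rounds up to 0x2000 and truncates to 0x1000.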






July, 1991
A COFF FILE LOADER FOR THE 34010


When your target system is RAM-based




Don Morgan


Don is a consulting engineer in the area of embedded systems and automation
and can be contacted care of Don Morgan Electronics, 2669 N. Wanda, Simi
Valley, CA 93065.


The Texas Instruments 34010 is one of the most widely used graphics processors
around. It is also one of a series of TI processors whose development tools
produce files conforming to the Common Object File Format (COFF)
definition.
A COFF file is like an EXE file in that it contains code and data -- and the
information necessary to load both into RAM. Depending on the development
language, the COFF file may also contain variables that must be initialized
during the load cycle.
For a ROM-based target, the COFF file is converted to a HEX file for PROM
programmers. However, if your target is RAM-based -- and not a Texas
Instruments Graphics Architecture (TIGA) application -- you must write a COFF
file loader that suits your target and application software.
I recently had to develop software using the 34010. A RAM-based target was to
be used in an 80x86 environment and the software didn't use the TIGA
interface. I needed a COFF file loader small enough to be embedded in an
application and robust enough to download fully linked C and assembly language
programs. I developed a function, presented here as a stand-alone program,
that allows me to load COFF files whose source was either assembly or C. The
function can also initialize the startup variables for a C program or load
them for initialization at boot time. (The COFF file loader was originally
written for Pacific Precision Laboratories in Chatsworth, Calif.)


The Host Interface


Besides parsing the COFF file, there's one important point to consider about
loading a program into a target's local memory: Before anything is written to
the target, the processor needs to be halted and the system prepared, so that
once the download is complete, it can be restarted correctly. The host
interface takes care of this.
The host interface (Listing One) consists of four 16-bit registers
accessible to the host system through the HFS0 and HFS1 address bits
and HCS\, the host chip select. These bits are normally decoded from either
the memory map or I/O map.
HSTADRH and HSTADRL, the two halves of the address pointer, allow access to
any location on the 32-bit local bus, including the on-board 34010 registers.
HSTDATA is the data buffer register: The host places data here when writing to
local RAM and picks it up here when reading from local RAM. In addition, the
control register, HSTCTL, provides for autoincrementing the address pointers on
both reads and writes. Here, too, are the bits that control the cache, the
nonmaskable interrupt, halt, the input and output interrupts, and message
passing.
By setting the address pointers to the beginning address for your download,
and setting the bit in HSTCTL that causes the address pointers to be
automatically incremented with each write, you can write a block of code
simply by placing word after word (or byte after byte for an 8-bit interface)
in HSTDATA.
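
To make the autoincrement mechanism concrete, here is a minimal sketch that models the host port in ordinary memory rather than touching hardware. Everything in it is an assumption for illustration: the struct, the function names, the INCW bit position, and the idea that the pointer advances by 16 per word because the 34010 uses bit addresses. On a real card, the register accesses would go through the addresses defined in COFF.H.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HSTCTL_INCW 0x0800u  /* assumed position of the increment-on-write bit */

/* Simulated host port: the address and control registers plus a small
 * window of 34010 local memory, indexed by 16-bit word. */
struct host_port {
    uint16_t hstadrl, hstadrh, hstctl;
    uint16_t local_mem[256];
};

static uint32_t host_addr(const struct host_port *p)
{
    return ((uint32_t)p->hstadrh << 16) | p->hstadrl;
}

/* Load HSTADRH:HSTADRL with a (bit) address. */
void host_set_addr(struct host_port *p, uint32_t bit_addr)
{
    p->hstadrl = (uint16_t)(bit_addr & 0xFFF0u);
    p->hstadrh = (uint16_t)(bit_addr >> 16);
}

/* Model one write to HSTDATA, including the optional autoincrement. */
void host_write_data(struct host_port *p, uint16_t word)
{
    uint32_t a = host_addr(p);

    p->local_mem[(a >> 4) % 256u] = word;
    if (p->hstctl & HSTCTL_INCW)
        host_set_addr(p, a + 16u);
}

/* Set the pointer once, then stream the block word after word. */
void host_write_block(struct host_port *p, uint32_t bit_addr,
                      const uint16_t *src, size_t nwords)
{
    size_t i;

    p->hstctl |= HSTCTL_INCW;
    host_set_addr(p, bit_addr);
    for (i = 0; i < nwords; i++)
        host_write_data(p, src[i]);
}
```

The point of the design is that the host sets the address once per block; after that, each store to HSTDATA lands one word further along.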
A typical procedure for downloading code to a target card might be as follows:
1. Set both the HLT (halt) and NMI (nonmaskable interrupt) bits in HSTCTL,
shortening the latency of the halt to that of the NMI.
2. Download the code, using the facilities of the host interface.
3. Flush the cache, so that no old code can be executed after the processor is
restarted.
4. Write the nonmaskable interrupt vector to point to the new code to be
executed after restart.
5. Set the NMI and NMIM bits in HSTCTL to abandon the current context and to
see that the restart is uniform.
6. Restart the 34010 by resetting the HLT bit in the HSTCTL.
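
The six steps above can be sketched in C against a simulated target. This is not the code from the listings: The struct, the function names, and the HSTCTL bit positions are assumptions, and a real loader would take the bit definitions from COFF.H and the vector address from the TMS34010 User's Guide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HSTCTL bits; positions are assumed for this sketch. */
#define HLT  0x8000u  /* halt the 34010 */
#define CF   0x4000u  /* cache flush */
#define INCW 0x0800u  /* autoincrement pointer on write */
#define NMIM 0x0200u  /* NMI mode */
#define NMI  0x0100u  /* nonmaskable interrupt request */

/* Simulated target standing in for the real host interface. */
struct sim34010 {
    uint16_t hstctl;
    uint32_t ptr;       /* HSTADRH:HSTADRL, treated as a bit address */
    uint16_t mem[128];  /* window of local RAM, word index = address / 16 */
};

/* Model one write to HSTDATA. */
static void put_word(struct sim34010 *t, uint16_t w)
{
    t->mem[(t->ptr >> 4) % 128u] = w;
    if (t->hstctl & INCW)
        t->ptr += 16u;
}

void download(struct sim34010 *t, uint32_t load_addr, const uint16_t *code,
              size_t nwords, uint32_t nmi_vector, uint32_t entry)
{
    size_t i;

    t->hstctl |= HLT | NMI;                    /* 1: halt, with NMI latency */
    t->hstctl |= INCW;                         /* 2: download the code */
    t->ptr = load_addr;
    for (i = 0; i < nwords; i++)
        put_word(t, code[i]);
    t->hstctl |= CF;                           /* 3: flush the cache */
    t->ptr = nmi_vector;                       /* 4: point NMI vector at code */
    put_word(t, (uint16_t)(entry & 0xFFFFu));
    put_word(t, (uint16_t)(entry >> 16));
    t->hstctl |= NMI | NMIM;                   /* 5: abandon current context */
    t->hstctl &= (uint16_t)~HLT;               /* 6: restart the processor */
}
```

How long CF must stay set, and whether the hardware clears NMI by itself, are details this simulation does not model.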


COFF.H


The target I was writing for was memory mapped, but it could just as easily
have been I/O mapped. This is an important consideration in the design of the
system. For one thing, memory mapping requires that all memory accesses within
the decoded segment(s) be the same width. This can cause conflicts between
8-bit and 16-bit cards mapped to the same area. If the target is I/O mapped,
programmers are allowed the very fast string instructions available on the
80x86, and can also use other video cards without fear of incompatibility.
Whichever you use, the loader must know where to find the host interface. In
part A of the COFF.H include file (Listing Two) there is a define
called MM that differentiates between I/O-mapped and memory-mapped targets. The
appropriate host interface registers are presented for each mode, and their
addresses must be filled in before compilation. In this example, the code is
defined for memory mapped, and the addresses are given in based pointer
notation. If the target is I/O mapped, MM is made false and the appropriate
addresses are placed in the section marked "I/O mapped addresses."
Part B of Listing Two contains the defines for section offsets, bit names for
the HSTCTL register, flag definitions, and interrupt vectors. Finally, in part
C you'll find the structure definitions used to access information about the
program being loaded.


The Loader


The first part of the program (part A) looks for a named file and loads it if
possible.
The second part (part B) performs whatever initialization the TMS34010
requires. Instructions for setting up refresh timing and rates can be placed
here if necessary. (If, however, the loader is implemented as a function, that
may be taken care of elsewhere.) This is also where I prepare the target for
the download by halting it and flushing the cache.
In part C, the program determines whether or not it has a valid COFF file
and uses the offsets to initialize pointers to the various headers.
The main header is the first data block in the COFF file, so the address of
file_buffer[0] is cast to a pointer to the main header structure, m_hdr. The
first variable in this structure contains the magic number (90h), which, if
present, tells the program that the file can be executed on a 34010. This is
followed by the number of sections in the file, the date and time, a pointer
to the symbol table, the number of entries in the symbol table, a field that
indicates whether or not there is an optional header (included to perform
relocation at download time), and a flags byte. The program checks the magic
number to see that it is a good file, retrieves the number of sections, checks
to see if there is an optional header, and gets the flags.
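
As an illustration of the main header check, the sketch below reads the fields byte by byte instead of casting the buffer to a structure, which sidesteps structure padding and byte-order differences between compilers. The field names, the 20-byte size, and the field offsets are assumptions based on the conventional COFF file header; the definitions in COFF.H are the authority for the real loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COFF_MAGIC_34010 0x0090u  /* magic number quoted in the text */
#define MAIN_HDR_SIZE    20u      /* assumed size of the main header */

/* Hypothetical in-memory copy of the main header fields. */
struct main_hdr_fields {
    uint16_t magic;          /* identifies the target processor */
    uint16_t nsections;      /* number of section headers */
    uint32_t timestamp;      /* date and time */
    uint32_t symtab_offset;  /* file pointer to the symbol table */
    uint32_t nsymbols;       /* entries in the symbol table */
    uint16_t opt_hdr_size;   /* zero when there is no optional header */
    uint16_t flags;
};

/* Read little-endian values from the file image. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
int parse_main_header(const uint8_t *buf, size_t len,
                      struct main_hdr_fields *h)
{
    if (len < MAIN_HDR_SIZE)
        return -1;
    h->magic         = rd16(buf + 0);
    h->nsections     = rd16(buf + 2);
    h->timestamp     = rd32(buf + 4);
    h->symtab_offset = rd32(buf + 8);
    h->nsymbols      = rd32(buf + 12);
    h->opt_hdr_size  = rd16(buf + 16);
    h->flags         = rd16(buf + 18);
    return h->magic == COFF_MAGIC_34010 ? 0 : -1;
}
```

A caller would pass the start of file_buffer and the number of bytes read, and stop with an error if the function returns -1.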
A pointer is also initialized for the optional header, and the first two bytes
are checked to see that they hold the magic number for the 34010. If so,
parsing continues; otherwise, the program aborts. The optional header also
supplies the entry point, an important bit of information because the reset
vector in the 34010's trap table will have to be set to this address. If it is
C code we are loading, it is the address of the beginning of the startup code
(Boot.obj).
In part D, the section headers are examined one at a time to find the BSS
section. Although all the sections are handled in basically the same way, the
BSS section is loaded first. Even though it is technically "uninitialized," it
may be forced with fill data and, as such, could overwrite initialization
variables if loaded out of order.
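
The BSS-first rule amounts to a two-pass walk over the section flags. The following sketch shows one way to express it. The function name is hypothetical, and the STYP_BSS value is the conventional COFF one, assumed here rather than taken from COFF.H.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STYP_BSS 0x0080u  /* assumed flag value for a BSS section */

/* Fill order[] with section indices: BSS sections first, then all the
 * others, keeping file order within each group. Returns the count. */
size_t bss_first_order(const uint16_t *flags, size_t nsect, size_t *order)
{
    size_t i, k = 0;

    for (i = 0; i < nsect; i++)       /* pass 1: BSS sections */
        if (flags[i] & STYP_BSS)
            order[k++] = i;
    for (i = 0; i < nsect; i++)       /* pass 2: everything else */
        if (!(flags[i] & STYP_BSS))
            order[k++] = i;
    return k;
}
```

Loading in this order means any fill data written for BSS is laid down before the initialized sections, so it cannot clobber them afterward.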
There is only one structure definition for all the sections (refer to COFF.H).
Each section is handled in the same manner: Its offset within the file_buffer
is used to initialize a pointer to a structure of type sect_header. This is
done in a function called get_sect, after the new section header has been
identified and accepted.
Next, the section headers are loaded one at a time into another structure, and
that information is used to find and load the raw data associated with each
one. As you will see from the table showing the information in the section
header, the section header contains an offset to the raw data and 2 bytes
containing the flags describing that data. These flags are listed in COFF.H
under section header flags and indicate whether the raw data is code, data, or
bss, and how it is to be treated.
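
Here is a hedged sketch of pulling those fields out of one section header. The 40-byte size, the field offsets, and the flag values follow the conventional COFF section header with a 2-byte flags field and are assumptions; the struct and function names are hypothetical, and COFF.H remains the reference for the actual loader.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECT_HDR_SIZE 40u     /* assumed size of one section header */
#define STYP_TEXT     0x0020u /* assumed flag values */
#define STYP_DATA     0x0040u
#define STYP_BSS      0x0080u

/* Hypothetical subset of the fields the loader needs. */
struct sect_info {
    char     name[9];     /* 8-character name plus terminator */
    uint32_t load_addr;   /* where the section goes in target memory */
    uint32_t size;        /* section size as recorded in the file */
    uint32_t raw_offset;  /* file offset of the section's raw data */
    uint16_t flags;       /* the 2 bytes of section flags */
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Copy the interesting fields out of a raw section header. */
void parse_section_header(const uint8_t *h, struct sect_info *s)
{
    memcpy(s->name, h, 8);
    s->name[8] = '\0';
    s->load_addr  = rd32(h + 8);
    s->size       = rd32(h + 16);
    s->raw_offset = rd32(h + 20);
    s->flags      = rd16(h + 36);
}

/* Classify the raw data from the flags. */
const char *section_kind(const struct sect_info *s)
{
    if (s->flags & STYP_TEXT) return "code";
    if (s->flags & STYP_DATA) return "data";
    if (s->flags & STYP_BSS)  return "bss";
    return "other";
}
```

Given the start of the section header table, header number n would sit at offset n * SECT_HDR_SIZE from it under these assumptions.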
Part E contains the functions that actually load and verify the data. As far
as C program loading is concerned, one section is treated as special: the
CINIT section, which contains initialization tables for the C compiler. When
this program is loaded into a RAM-based target, the data is used to initialize
predetermined locations in memory to speed boot time; this is done with the
function put_data. In a ROM-based target, these variables may be located in
ROM and used by the boot code to reinitialize those same variables each time
the unit is reset; this is accomplished through the regular function, load
block.
Finally, in part D, the proper reset vector is set on the target and the 34010
is allowed to restart.


Conclusion


The information presented here is fairly rudimentary, but it could easily be
extended to get the symbol table and line numbers for debugging purposes, or
to do relocation on partially linked files.


Bibliography


TMS34010 User's Guide. Texas Instruments, 1988.
TMS34010 Assembly Language Tools User's Guide. Texas Instruments, 1987.
TMS34010 C Compiler Reference Guide. Texas Instruments, 1988.


The Common Object File Format


The Common Object File Format (COFF) was originally developed by AT&T for
Unix. The assembler supplied by Texas Instruments for use with the 34010
produces a COFF file, which when linked contains code, data, and symbolic
debugging information.
COFF is said to encourage modular programming because it operates in blocks of
code called sections. There are two basic types of sections, initialized and
uninitialized. Sections are relocatable and will eventually occupy contiguous
space in memory. Assembler directives allow you to create any number of these
sections.
A COFF file has at least three sections: TEXT (usually executable code),
DATA (initialized data), and BSS (uninitialized variables). The compiler
produces an assembly language file which can be further modified or
immediately assembled into an object file. The object file is then linked with
libraries or other files according to directives on the command line or in a
CMD file. With these directives, the linker determines where to place each of
the sections found in the COFF file.
As Figure 1 shows, a COFF .OUT file has three kinds of headers that provide
information about the kind of file it is, the positioning of the code and
data, and whether or not it is to be loaded.
The main header indicates what kind of system the code can be executed on, how
many section headers there are, whether there is an optional header available,
and other data (the date/time stamp, a flags byte, and so on).
The optional header is always included in a fully linked file. It describes
the size of the executable code and data, as well as providing the entry point
for the code and the addresses of TEXT and DATA.
There is a section header for each named section in the program. There will always be a
TEXT, a DATA, and a BSS section, and there may be more. There are assembly
language directives that allow you to create as many sections as you like, and
if the program was originally written in C, it will have a CINIT section
containing initialization data.
After the section headers comes the raw data: the code or data associated
with each section. Finally, line number entries, a symbol table, and a string
table (for labels over eight characters) provide debugging information.
--D.M.
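
The header layout described above can be read field by field rather than by
overlaying a structure on the file buffer. The following is a minimal sketch
under stated assumptions: the field order and the magic number 0x90 are taken
from COFF.H, the file is assumed to store multibyte values low byte first,
and the names coff_main, rd16, rd32, and parse_main_header are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FILE_MAGIC 0x90   /* 34010 magic number, from COFF.H */
#define MAIN_HDR_SIZE 20  /* size of the main header in bytes */

/* Hypothetical portable view of the main header fields. */
struct coff_main {
    uint16_t magic_num;
    uint16_t num_sects;
    uint32_t sym_table;  /* file offset of the symbol table */
    uint32_t entries;    /* number of symbol table entries */
    uint16_t opt_head;   /* size of the optional header, 0 if absent */
    uint16_t flags;
};

/* Read a 16-bit value stored low byte first (an assumption, see above). */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit value stored low word first. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)rd16(p) | ((uint32_t)rd16(p + 2) << 16);
}

/* Returns 0 on success, -1 if the buffer is too short or the magic
   number does not match. */
int parse_main_header(const unsigned char *buf, size_t len,
                      struct coff_main *out)
{
    if (len < MAIN_HDR_SIZE)
        return -1;
    out->magic_num = rd16(buf + 0);
    out->num_sects = rd16(buf + 2);
    /* bytes 4..7 hold the date/time stamp, not needed here */
    out->sym_table = rd32(buf + 8);
    out->entries   = rd32(buf + 12);
    out->opt_head  = rd16(buf + 16);
    out->flags     = rd16(buf + 18);
    return out->magic_num == FILE_MAGIC ? 0 : -1;
}
```

Reading through byte offsets avoids depending on the compiler's structure
padding and on the size of int, at the cost of a little more code.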

_A COFF FILE LOADER FOR THE 34010_
by Don Morgan



[LISTING ONE]

#include <fcntl.h>
#include <sys\types.h>
#include <sys\stat.h>
#include <io.h>
#include <conio.h>
#include <string.h>
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include "coff.h"

struct main_header *m_hdr;
struct opt_header *opt;
struct sect_header section;

void main(int argc, char **argv)
{
/**************** PART A *****************/
 int module, size, header, sect, j;
 unsigned long result;
 int bss_done = FALSE;


 unsigned char *receive_buffer;
 char tmp_buf[100];
 int data, i;
 if(argc != 2) {
 printf("\nno file given!");
 exit(-1);
 }
 module = open(argv[1], O_BINARY | O_RDONLY);
 if(module == -1){
 perror("\nopen failed!");
 exit(-1);
 }
 /**************************/
 /*read coff file into buffer*/
 size = filelength(module);
 file_buffer = (char *)malloc(size);
 if(file_buffer == NULL) {
 perror("\nnot enough memory!");
 exit (-1);
 }
 if((result = (long)read(module, file_buffer, size)) <= 0) {
 perror("\ncan't find file!\n");
 exit(-1);
 }
 close(module);

/************ PART B *************/
/*set up 34010 for loading*/
/**************************/
/*it is the user's responsibility to set up the cntrl register*/
/*please note that some of the register settings, such as the following*/
/*are application dependent, this code is included only to show an*/
/*example of low level setup for the 34010*/
 gsp_poke(cntrl, 0x4); /*sets cas before ras refresh*/

/**************************/
/*set up 34010 to restart correctly after loading program*/
 put_hstctl(hlt | cf | incr | incw | nmim | nmi_flg);
 data = get_hstctl();
 if(data != (hlt | cf | incr | incw | nmim | nmi_flg)) {
 printf("\nerror writing to hstctl!");
 exit(-1);
 }

/************ PART C *************/
/**************************/
/*get contents of main header*/
 m_hdr = (struct main_header *) file_buffer;
/*see if the file has a magic number*/
 if(m_hdr->magic_num !=FILE_MAGIC) {
 printf("\nnot a standard coff .out file!");
 exit(-1);
 }
/*check to see whether there is an optional header*/
 if((m_hdr->opt_head != OPT_XST)){
 printf("file is not fully linked!");
 exit(-1);
 }
/*get contents of optional header*/

 opt = (struct opt_header *) &file_buffer[OPT_OFST];
/*see if the optional header has a magic number*/
 if(opt->magic_num !=OPT_MAGIC) {
 printf("\nnon standard file!");
 exit(-1);
 }

/*************************************************/
/*begin searching for and loading section headers find bss section first! */
 i = FIRST_HDR;
 for (j=0;((j<m_hdr->num_sects) && !bss_done) ;j++){
 strcpy(tmp_buf,&file_buffer[i]);
 if(!bss_done) {
 if(!(strcmp(tmp_buf, ".bss"))) {
 strcpy(section.name, tmp_buf);
 header = i;
 get_sect(header);
 bss_done = TRUE;
 }
 }
 i += SEC_OFST;
 }
 /*now load the other sections*/
 i = FIRST_HDR;
 for (j=0; j<m_hdr->num_sects;j++){
 strcpy(tmp_buf,&file_buffer[i]);
 if(strcmp(tmp_buf, ".bss")) {
 strcpy(section.name, tmp_buf);
 header = i;
 get_sect(header);
 }
 i += SEC_OFST;
 }

 /*release memory for file buffer*/
 free(file_buffer);

/************ PART D *************/
/*set up reset and interrupt vectors for the 34010 */
/*usually, both the nmi and halt bits are set and then released */
/*this code may differ depending upon the desires of the programmer */
 gsp_poke(intenb,0x0); /*no interrupts*/
 gsp_poke(nmi_vect, opt->entry_point);
 gsp_poke(nmi_vect+0x10, opt->entry_point >> 0x10);
 gsp_poke(reset,opt->entry_point);
 gsp_poke(reset+0x10,opt->entry_point >> 0x10);
 put_hstctl(hlt | incr | nmi_flg | incw | nmim);
 /*toggle the halt bit and go*/
 data = get_hstctl();
 data &= ~hlt;
 put_hstctl( data );
}

/*********** PART E *************/
void get_sect(header)
int header;
{
 struct sect_header * ptr = (struct sect_header *)&file_buffer[header];
 load(ptr);

}
void load(ptr)
struct sect_header *ptr;
{
/*here the flags are checked to determine whether the section is to be loaded
or copied or ignored*/
 if((ptr->sect_size) && !(ptr->flags & STYP_DSECT) &&
 !(ptr->flags & STYP_NOLOAD)) {
 if(!(strcmp(ptr->name,".cinit"))
 && (ptr->flags & STYP_COPY))
 put_data(ptr);
 else
 if(((ptr->flags & STYP_TEXT) ||
 (ptr->flags & STYP_DATA) ||
 (!(strcmp(ptr->name,".cinit")
 && !(ptr->flags & STYP_COPY)))))
 load_block(ptr);
 }
}
void load_block(ptr)
struct sect_header *ptr;
{
 int data, temporary, hldr, limit;
 long i, j, file_pointer;
 file_pointer = ptr->raw_data;
/*set the host interface up to point at the correct address*/
#if MM
 *(gsp:>hstadrl) = ptr->virt_addr;
 *(gsp:>hstadrh) = ptr->virt_addr >> 0x10;
#else
 outpw(io_hstadrl, (unsigned int)ptr->virt_addr);
 outpw(io_hstadrh, (unsigned int)(ptr->virt_addr >> 0x10));
#endif
 limit = (ptr->sect_size/0x10)-1;
 j=0;
 /*write each word to host interface*/
 for(i=0; i<=limit; i++){
/*get the data from the file buffer and get it in the correct order before
writing to the host interface*/
 data = (file_buffer[file_pointer+j++]&0xff);
 data += ((file_buffer[file_pointer+j++]&0xff)*0x100);
#if MM
 *(gsp:>hstdata) = data;
#else
 outpw(io_hstdata,data);
#endif
 }
 /*compare data*/
 /*point at the correct address*/
#if MM
 *(gsp:>hstadrl) = ptr->virt_addr;
 *(gsp:>hstadrh) = ptr->virt_addr >> 0x10;
#else
 outpw(io_hstadrl, (unsigned int)ptr->virt_addr);
 outpw(io_hstadrh, (unsigned int)(ptr->virt_addr >> 0x10));
#endif
 limit = (ptr->sect_size/0x10)-1;
 j=0;
 for(i=0; i<=limit; i++){
 /*get the data*/
 data = (file_buffer[file_pointer+j++]&0xff);

 data += ((file_buffer[file_pointer+j++]&0xff)*0x100);
#if MM
 hldr = *(gsp:>hstdata);
#else
 hldr = inpw(io_hstdata);
#endif
 if(hldr != data)
 printf("\ncompare error!");
 }
}
void put_data(ptr)
struct sect_header *ptr;
{
 int data, temporary, hldr, limit, num_words, num;
 long i, j, reloc_address, file_pointer;
 struct init_table * init;
 file_pointer = ptr->raw_data;
 do{
 init = (struct init_table *)&file_buffer[file_pointer];
 reloc_address = init->ptr_to_var;
 file_pointer += 6;

/*point at relocation address*/
#if MM
 *(gsp:>hstadrl) = reloc_address;
 *(gsp:>hstadrh) = reloc_address >> 0x10;
#else
 outpw(io_hstadrl, (unsigned int)reloc_address);
 outpw(io_hstadrh, (unsigned int)(reloc_address >> 0x10));
#endif

/*determine the amount of data to transfer and do it*/
 num_words = init->num_words;
 limit = --num_words;
 j=0;
 for(i=0; i<=limit; i++){
 data = (file_buffer[file_pointer+j++]&0xff);
 data += ((file_buffer[file_pointer+j++]&0xff)*0x100);
#if MM
 *(gsp:>hstdata) = data;
#else
 outpw(io_hstdata,data);
#endif
 }
 /*now, do a data compare*/
#if MM
 *(gsp:>hstadrl) = reloc_address;
 *(gsp:>hstadrh) = reloc_address >> 0x10;
#else
 outpw(io_hstadrl, (unsigned int)reloc_address);
 outpw(io_hstadrh, (unsigned int)(reloc_address >> 0x10));
#endif
 num_words = init->num_words;
 limit = --num_words;
 j=0;
 for(i=0; i<=limit; i++){
 data = (file_buffer[file_pointer+j++]&0xff);
 data += ((file_buffer[file_pointer+j++]&0xff)*0x100);
#if MM

 hldr = *(gsp:>hstdata);
#else
 hldr = inpw(io_hstdata);
#endif
 if(hldr != data)
 printf("\ndata compare error!");
 }
 file_pointer += j;
 }while(((int)file_buffer[file_pointer]) != 0x0);
}
/*set up the hstctl*/
void put_hstctl(unsigned int value)
{
#if MM
 *(gsp:>hstctl) = value;
#else
 outpw(io_hstctl,value);
#endif
}
/*get current hstctl setting*/
unsigned int get_hstctl()
{
 int value;
#if MM
 value = *(gsp:>hstctl);
#else
 value = inpw(io_hstctl);
#endif
 return value;
}
/*set host interface to point at correct address*/
void set_addr(unsigned long address)
{
#if MM
 *(gsp:>hstadrl) = address;
 *(gsp:>hstadrh) = address >> 0x10;
#else
 outpw(io_hstadrl,(unsigned int)address);
 outpw(io_hstadrh,(unsigned int)(address >> 0x10));
#endif
}
void gsp_poke(unsigned long address, unsigned long value)
{
 set_addr(address);
#if MM
 *(gsp:>hstdata) = value;
#else
 outpw(io_hstdata,(unsigned int)value);
#endif
}
unsigned int gsp_peek(unsigned long address)
{
 int value;

 set_addr(address);
#if MM
 value = *(gsp:>hstdata);
#else
 value = inpw(io_hstdata);

#endif
 return value;
}





[LISTING TWO]

#define FALSE 0
#define TRUE 0x1

#define MM TRUE

/*physical addresses of memory mapped host interface*/
#if MM
_segment gsp = 0xc700;
int _based(void) *hstctl = (int _based(void)*)0xe00;
int _based(void) *hstdata = (int _based(void)*)0xf00;
int _based(void) *hstadrh = (int _based(void)*)0xc00;
int _based(void) *hstadrl = (int _based(void)*)0xd00;
#else
/*io mapped addresses of host interface*/
io_hstctl = 0;
io_hstdata = 0;
io_hstadrh = 0;
io_hstadrl = 0;
#endif

#define FILE_MAGIC 0x90
#define OPT_MAGIC 0x108
#define OPT_XST 0x1c
#define OPT_OFST 0x14
#define SEC_OFST 0x28
#define FIRST_HDR 0x30

#define cf 0x4000
#define hlt 0x8000
#define nmi_flg 0x100
#define nmim 0x200
#define incw 0x800
#define incr 0x1000

/*definitions*/
/*file header flags*/
#define F_RELFLG 0x1 /*relocation information stripped*/
#define F_EXEC 0x2 /*file is executable*/
#define F_LNNO 0x4 /*line numbers stripped*/
#define F_LSYMS 0x10 /*local symbols stripped*/
#define F_QR32WR 0x40 /*34010 byte ordering*/

/*section header flags*/
#define STYP_REG 0x0 /*regular section*/
#define STYP_DSECT 0x1 /*dummy section*/
#define STYP_NOLOAD 0x2 /*noload section*/
#define STYP_GROUP 0x4 /*grouped section*/
#define STYP_PAD 0x8 /*padding section*/
#define STYP_COPY 0x10 /*copy section, important for .cinit*/

#define STYP_TEXT 0x20 /*executable code*/
#define STYP_DATA 0x40 /*initialized data*/
#define STYP_BSS 0x80 /*uninitialized data*/
#define STYP_ALIGN 0x100 /*aligned on cache boundary*/

/*******************************************************/
/*interrupt vector:*/
#define reset 0xffffffe0
#define nmi_vect 0xfffffee0
/*******************************************************/
/*i/o registers:*/
#define refcnt 0xc00001f0
#define dpyadr 0xc00001e0
#define vcount 0xc00001d0
#define hcount 0xc00001c0
#define dpytap 0xc00001b0
#define pmask 0xc0000160
#define psize 0xc0000150
#define convdp 0xc0000140
#define convsp 0xc0000130
#define intpend 0xc0000120
#define intenb 0xc0000110
#define hstctlh 0xc0000100
#define hstctll 0xc00000f0
#define hst_adrh 0xc00000d0
#define hst_adrl 0xc00000e0
#define hst_data 0xc00000c0
#define cntrl 0xc00000b0
#define dpyint 0xc00000a0
#define dpystrt 0xc0000090
#define dpyctl 0xc0000080
#define vtotal 0xc0000070
#define vsblnk 0xc0000060
#define veblnk 0xc0000050
#define vesync 0xc0000040
#define htotal 0xc0000030
#define hsblnk 0xc0000020
#define heblnk 0xc0000010
#define hesync 0xc0000000
#define dac_wr 0xc7800
#define ppl_rd 0xc7000

/*header structures*/
struct main_header {
 unsigned short int magic_num;
 unsigned short int num_sects;
 long int date_stamp;
 long int sym_table;
 long int entries;
 unsigned short int opt_head;
 unsigned short int flags;
 };
struct opt_header {
 short int magic_num;
 short int version;
 unsigned long code_size;
 unsigned long init_size;
 unsigned long uninit_size;
 unsigned long entry_point;

 unsigned long start_text;
 unsigned long start_data;
 };
struct sect_header {
 unsigned char name[8];
 unsigned long phys_addr;
 unsigned long virt_addr;
 unsigned long sect_size;
 unsigned long raw_data;
 unsigned long reloc;
 unsigned long num_entries;
 unsigned short int reloc_entries;
 unsigned short int line_entries;
 unsigned short int flags;
 unsigned char ch1;
 unsigned char page;
 };
struct init_table {
 int num_words;
 long ptr_to_var;
 };
/*video pointer*/
char far *vid_mem;
/*variable declarations*/
int len, text_ptr, debug, coff_debug, fake;
unsigned char *file_buffer;
/*function prototypes*/
long getint(int, int);
void load(struct sect_header *);
void load_block(struct sect_header *);
void put_data(struct sect_header *);
void put_hstctl(unsigned int);
unsigned int get_hstctl(void);
void set_addr(unsigned long int);
void gsp_poke(unsigned long int, unsigned long int);
unsigned int gsp_peek(unsigned long int);
void get_sect(int);



July, 1991
MASM'S CHANGING FACE


Looks like assembly, tastes like C




Mike Schmit


Mike is the president of Quantasm Corp., the developers and publishers of the
assembly language tools ASMFLOW, Quantasm Power Lib, and Magic TSR Toolkit.
Mike can be reached at 800-765-8086.


Microsoft's recently released version of Macro Assembler -- MASM 6.0 --
embodies the most ambitious changes in the life of the product. The most
noticeable change is that MASM is now fundamentally intended to support C
programmers. Consequently, the language is more C-like, easier for C
programmers to learn, and easier for programmers who code in both MASM and C
to switch from one to the other. For example, the EXTRN and STRUC directives
have new alias spellings to match C's extern and struct. Also, a new utility
is provided to convert C header (.H) files into MASM compatible include (.INC)
files.
Another view is that MASM has changed to make programming in assembly language
more convenient, allowing programmers to concentrate on the structure of
programs and on choosing the best instructions for the problem at hand.
MASM 6.0 includes a number of other updates, such as the CodeView debugger,
Programmer's WorkBench 1.1, and a new make facility (NMAKE). This article,
however, will primarily discuss the changes to the language itself.


A Look at the New Look


The first change you'll notice when switching to MASM 6.0 is that the program
name has been changed! The new program is ML.EXE and works in a fashion
similar to the MSC compiler's CL command. ML assembles and links multiple
modules. Most of the command line options have also changed. Fortunately, a
small driver command named MASM.EXE is supplied that accepts most of the old
command line options, converts them to their equivalent MASM 6.0 options, and
then automatically runs the new ML.EXE program. This allows old batch and make
files to work as before. Note, too, that the new command line interface (ML)
does not prompt for parameters. Another new feature is the addition of
MLX.EXE, a DOS-extended front end to ML. MLX will take advantage of DPMI,
VCPI, or XMS (in that order). You should only use MLX if you are having
capacity problems, because it runs slower.


Segmentation Control


Simplified segmentation directives were introduced in MASM 5.0. These
directives handle all the details of setting up segments with naming
conventions that match the segments generated by many high-level language
compilers. MASM 6.0 adds several new enhancements that support 32-bit segments
and flat model (for OS/2 2.0), additional calling conventions (SYSCALL and
STDCALL), and startup and exit code.
Table 1 shows the memory models supported by MASM 6.0. Note here that the tiny
model is now fully supported. The syntax for the .MODEL directive has also
changed, adding options to specify the language (for calling and naming
conventions), the operating system (DOS or OS/2), and the stack distance (near
or far). Listing One (page 96) shows a complete "hello world" program using
the simplified segmentation directives and related model-independent
directives.
Table 1: MASM 6.0 memory models

 Model    Code  Data  Operating       Notes
                      System(s)
 ------------------------------------------------------------------------
 Tiny     Near  Near  DOS             code & data combined
 Small    Near  Near  DOS, OS/2 1.x
 Medium   Far   Near  DOS, OS/2 1.x
 Compact  Near  Far   DOS, OS/2 1.x
 Large    Far   Far   DOS, OS/2 1.x
 Huge     Far   Far   DOS, OS/2 1.x
 Flat     Near  Near  OS/2 2.x        code & data combined, 32-bit offsets




Loop and Decision Structures


One feature that is likely to be popular is the addition of directives that
generate loops and decision structures in much the same way as high-level
language compilers. For instance, the .IF/.ELSE block in Figure 1 is translated
to its corresponding assembly language instructions (shown at the bottom of
Figure 1). Of course, the generated labels (such as @C0001) are always unique.
The code generated by the decision and loop structures (such as that shown in
Figure 1) can be seen in the listing file by specifying the /Sa option
(maximize source listing) in conjunction with the /Fl option (generate listing
file).
Figure 1: MASM 6.0 contains decision and loop directives (in this case, an
.IF/.ELSE block) that are translated to their corresponding instructions at
assembly time.

 .IF ax < mem_word1
 mov mem_word2, 2
 .ELSE
 mov mem_word2, 3

 .ENDIF

 The above code is translated to the
 following:

 cmp ax, mem_word1
 jnb @C0001
 mov mem_word2, 2
 jmp @C0003
 @C0001:
 mov mem_word2, 3
 @C0003:


A .WHILE directive and a .REPEAT/.UNTIL construct are also available. The
.BREAK and .CONTINUE directives can be used to terminate a .REPEAT or .WHILE
loop prematurely. (Note that all of these directives begin with a period [.],
to differentiate them from conditional assembly directives.)
The range of allowed conditional expressions is quite complete. The relational
operators are the same as those used in C (see Table 2). However, you may know
from working with the 80x86 instruction set that there are separate
conditional jumps for signed and unsigned values, while the C syntax is the
same for each data type. C compilers generate the proper type of conditional
jumps based on the declared data types of the variables involved.
Table 2: MASM 6.0 Relational Operators

 Operator Meaning

 == equal
 != not equal
 > greater than
 < less than
 >= greater than or equal to
 <= less than or equal to
 & bit test
 ! logical NOT
 && logical AND
 || logical OR


Until now, the concept of data as signed or unsigned in assembly language has
been all in the programmer's mind (and on occasion, in some comments). Signed
and unsigned data declarations have been added in MASM 6.0 (discussion
follows), as well as the ability to override any declaration. Because the set
of relational operators in C does not cover the full range available in
assembly language, conditional expressions may also use flag names as operands
(ZERO?, CARRY?, OVERFLOW?, SIGN?, and PARITY?).
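
The signed/unsigned distinction can be seen from the C side. The following is
a minimal sketch, not compiler output: the function names are hypothetical,
and the jump mnemonics in the comments describe what a 16-bit 80x86 compiler
would typically choose for each declared type.

```c
#include <stdint.h>

/* The same 16-bit pattern, 0xFFFF, is 65535 as unsigned and -1 as signed. */

/* Unsigned compare: typically an unsigned jump such as JB or JAE. */
int below_unsigned(uint16_t a, uint16_t b)
{
    return a < b;
}

/* Signed compare: typically a signed jump such as JL or JGE. */
int below_signed(int16_t a, int16_t b)
{
    return a < b;
}
```

The source text of the two comparisons is identical; only the declared types
differ, which is the information the new signed data declarations give to
the assembler.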


Data Declarations


The architects of MASM 6.0 seemed to consider nothing off limits. Directives
used to declare data, such as DB and DW (define byte and define word), have
all been changed. You now can use BYTE, WORD, and DWORD to declare data
instead of DB,
DW, and DD. The old directives are still available, so this is an optional
change.
At this point, you may be asking yourself why Microsoft would change something
as "unbroken" as DB. There are a number of features (such as the conditional
expressions) that require the assembler to "know" whether a byte (or word, and
so on) is to be treated as a signed or unsigned value. So there are also
SBYTE, SWORD, and SDWORD directives for declaring signed data values. A
pleasant side effect is that these directives make the language (somewhat)
more self-documenting.
Additionally, there are new directives for declaring floating point data:
REAL4, REAL8, and REAL10. Previously, you could declare a 32-bit IEEE floating
point number with DD (and you still can). But when using the new (preferred)
directives, an error will be generated if you try to declare floating point
data with DWORD. Even without using the MASM 5.1 compatibility options, the
older directives (DB, DW, DD, and so on) can still be used in exactly the same
way as before.


Jump Extending


About three years ago, SLR Systems' OPTASM was introduced, and one of its main
selling points (besides speed) was that it automatically generated the
shortest and fastest code for short and near unconditional jumps. In addition,
it would automatically generate the two-jump sequence required when a
conditional jump exceeded the 1-byte range. Later, Borland's Turbo Assembler
(TASM) introduced similar capabilities. Now, MASM 6.0 has almost caught up in
this category.
Probably the most annoying aspect of assembly-language programming for the
80x86 is the restriction of a 1-byte offset (+127, -128) for conditional
jumps. When this limit is exceeded, previous versions of MASM (including 5.1)
would generate a "jump out of range" error message. MASM 6.0 automatically
translates this code for the programmer. As an example, consider the code
fragment in Figure 2 along with its translated version. The only noticeable
change (for a jump out of range) is that the generated code is 5 bytes long
instead of 2. (There are no new labels or expanded code.) The 5 bytes are a
2-byte conditional jump (an inverse of the original) and a 3-byte
unconditional jump to the intended destination.
Figure 2: MASM 6.0 automatically generates a jump fixup when there is a jump
out of range. Notice that in this example the generated code is 5 bytes long
instead of 2.

 cmp ax, error_code
 je exit_error
 db 128 dup(90h) ; (128 bytes of code, NOP's here)
 exit_error:

 MASM 6.0 translates the above code to the following:

 cmp ax, error_code
 jne $+5 ; Note: $+5 skips the 3-byte
 ; jmp ($ is the address of the jne)
 jmp exit_error
 db 128 dup (90h)
 exit_error:


If you are attempting to craft very compact code and you don't want this
automatic action to take place, you can use the OPTION NOLJMP directive. In
addition, a level 2 warning is issued when a jump is extended. Note, however,
that MASM 6.0 does not generate the required jump fixups when a loop
instruction is out of range (while OPTASM and TASM do).
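
The size rule behind jump extending can be written down directly. The
following is a minimal sketch for a 16-bit segment, using the byte counts
given above; the function name cond_jump_size is hypothetical, and the
displacement is measured from the end of the 2-byte short jump.

```c
/* Size in bytes of a conditional jump in a 16-bit segment, given the
   distance from the end of the 2-byte short jump to the target.
   Hypothetical helper for illustration only. */
int cond_jump_size(long displacement)
{
    /* A short conditional jump holds a signed 8-bit displacement. */
    if (displacement >= -128 && displacement <= 127)
        return 2;   /* opcode + 8-bit displacement */
    /* Otherwise: inverted 2-byte conditional jump over a 3-byte near JMP. */
    return 2 + 3;
}
```

In Figure 2, the target lies 128 bytes past the end of the je instruction,
one byte beyond the short range, so the extended 5-byte form is required.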


HLL Interfacing


MASM 5.1 introduced several features that simplified the writing of assembly
language routines for use by high-level language (HLL) programs. An important
aspect of these improvements is that it's easy to use the same code with more
than one high-level language. A number of improvements added by MASM 6.0 make
programming more convenient, while others appear to be directly related to
making code easier to port to future versions of Windows and OS/2.
Two new calling conventions, SYSCALL and STDCALL, have been introduced for
OS/2 2.0. SYSCALL is similar to the C calling convention except that no
leading underscore is placed on the label and the called routine always
restores the stack. STDCALL is likewise similar except that the called routine
is responsible for restoring the stack unless a variable number of arguments
are specified (using VARARG); in that case, the C convention is used exactly.


The PROC Directive


The syntax for the PROC directive has been expanded to include a number of new
capabilities as shown in Figure 3. Note that the stack frame is automatically
set up based upon the various arguments in the PROC directive and defaults
based on the .MODEL directive. The concepts are similar to MASM 5.1, but
options have been added for overriding the defaults.
Figure 3: Syntax for the PROC directive

 label PROC [attributes] [USES reglist] [parameters...]

 where:
 label The name of the procedure
 attributes Any of distance, langtype, visibility and prologuearg
 reglist A list of registers following the USES keyword to be
 pushed by the prologue code and popped by the epilogue
 code. Each register must be separated by a space or tab.
 parameters A list of one or more parameters passed to the procedure on
 the stack. Each parameter consists of a parameter name,
 optionally followed by a colon and the parameter's data
 type. The data type of the last parameter may be VARARG,
 designating a variable number of remaining arguments. Each
 parameter must be separated by a comma.

 Attributes:
 distance Any of NEAR, FAR (also NEAR or FAR with 16 or 32,
 overriding the default segment size for 386, 486)
 langtype Determines the calling convention
 visibility PRIVATE, PUBLIC, or EXPORT
 prologuearg Lists the arguments required for prologue and epilogue code
 generation (for user-defined prologue/epilogue)


When the PROC directive is used in its extended form, the assembler
automatically generates code that sets up the stack frame, pushes and pops
registers that must be preserved, and properly cleans up the stack when a RET
instruction is encountered. In MASM 5.1, this prologue and epilogue code is
fixed based on the model and language. In MASM 6.0, you have the same fixed
options, some new options, and the capability to completely define your own
prologue and epilogue.
At this point, you may be wondering why you'd want to define your own prologue
or epilogue. One example is in stack size checking while debugging; you must
add something to every procedure, but remove it later. This makes coding and
switching back a simple matter. A code coverage analyzer, for example, could
insert itself into the code via user-defined prologues and epilogues.


Invoke and Procedure Prototypes


The new extended PROC directive allows one procedure to be assembled for any
memory model and calling convention. The inverse of this is the capability to
assemble programs that call procedures having different memory models and
calling conventions. This is done with the new INVOKE directive. Instead of
pushing arguments on the stack and using the CALL instruction, use the INVOKE
directive followed by the list of arguments. This is especially useful when
writing code that will be linked to commercial libraries or operating system
APIs (such as OS/2 and Windows). The libraries can change models and/or
calling conventions and your source only needs to be reassembled and linked.
(This is a good idea because OS/2 will be changing calling conventions.)
One problem that arises is that the assembler doesn't know the type of each
argument in a procedure. MASM 6.0 rectifies this problem with the new PROTO
directive, which defines a procedure prototype. A procedure prototype informs
the assembler of the number and type of each argument so it can generate the
proper code and check for errors. Listings Two and Three (page 96) demonstrate
the differences between the old and new methods when programming for Windows.
In examining Listings Two and Three, you may think that all of this could be
done with macros and conditional assembly. And you're right -- many of you
have done this in the past. This mechanism, however, is now a well-defined
standard that reduces code clutter, improves readability, and can be easily
published in magazine articles without the necessity for printing the macros.
Also, the code generated by the INVOKE directive takes into account pushing
constants (a two-step process on the 8088) and pointers onto the stack.
Finally, if you use indirect calls (CALL tbl[BX]) and still want to use
prototypes for error checking and documentation, there is a mechanism to
define a pointer to a prototype.



New Instructions and Directives


MASM 6.0 also adds new instructions to support the 80486 processor. Of course,
you must use these with the caveat that these instructions make your program
processor-specific. Programmers designing 486-specific utilities (or special
versions of 386 utilities), operating systems (OS/2), and BIOSs on 486 systems
will surely have use for these instructions. The new instructions are listed
in Table 3.
Table 3: Instructions new to the 80486

 BSWAP    byte swap
 CMPXCHG  compare and exchange
 INVD     invalidate data cache
 INVLPG   invalidate TLB (Translation Lookaside Buffer) entry
 WBINVD   write back and invalidate data cache
 XADD     exchange and add
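
For readers more at home in C, the effect of three of these instructions can
be modeled as ordinary functions. This is a minimal sketch of the 32-bit
forms only; the function names are hypothetical, and unlike the real
instructions these C models are neither atomic nor do they set the full
range of processor flags.

```c
#include <stdint.h>

/* BSWAP: reverse the byte order of a 32-bit value. */
uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* XADD dest, src: dest receives the sum, src receives the old dest. */
void xadd32(uint32_t *dest, uint32_t *src)
{
    uint32_t old = *dest;
    *dest = old + *src;
    *src = old;
}

/* CMPXCHG dest, src: if the accumulator equals dest, store src in dest;
   otherwise load dest into the accumulator. Returns the resulting
   zero flag (1 if the exchange took place). */
int cmpxchg32(uint32_t *accumulator, uint32_t *dest, uint32_t src)
{
    if (*accumulator == *dest) {
        *dest = src;
        return 1;
    }
    *accumulator = *dest;
    return 0;
}
```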


In addition, many of the more cryptic directives have been changed in MASM 6.0
to have more meaningful names. For example, the .XALL list control directive
is now .LISTMACRO. Both the old and new directives are accepted, so old code
does not need to be changed, even when the MASM 5.1 compatibility options are
used.


Macros


Some programmers use macros extensively, having created their own language
with macro libraries. Others never use macros because they tend to hide some
of the details of assembly language, possibly causing bugs or inefficient
code. The changes in MASM 6.0 will please both groups and make it easier for
beginners to learn macros. The changes are so substantial that there is an
option to use the old macros (OPTION OLDMACROS).
The most interesting new feature is the ability to designate macro parameters
as required, or to specify a default value if the parameter is missing.
Consider, for instance, the code fragment in Example 1. The REQ keyword
specifies that a parameter is required. Its only effect is that of better
error reporting. In this case a syntax error would have been generated if a
parameter was missing, but in more complex macros these types of errors can be
difficult to track down. Also note in Example 1 that any parameter followed by
:= designates a default value. The default value should be enclosed in angle
brackets for proper recognition as a text value.
Example 1: Macro parameters can either be required as designated by the REQ
keyword or specify a default value

 set_cursor_pos MACRO row:REQ, col:REQ, page:=<0>
 mov dh, row
 mov dl, col
 mov bh, page
 int 10h
 ENDM
 ...
 set_cursor_pos 5, 10, 1 ; all parameters supplied
 ...
 set_cursor_pos 7, 15 ; page parameter takes default value
 ...
 set_cursor_pos ; ERROR: required parameters missing




Text Macros and Macro Functions


With EQU, an expression that can be evaluated immediately defines a
permanent numeric equate; otherwise, it is treated as a redefinable text
equate. The = directive, on the other hand, assigns a numeric value that may
be redefined later. But, to achieve a desired result, programmers are often
forced to use the two interchangeably. The new TEXTEQU defines a text macro
that is evaluated in the same manner as redefinable numeric equates.
Macro functions provide a mechanism to perform complex text processing at
assembly time. A macro function is defined in the same manner as a regular
macro (now called a macro procedure), but must return a text value with the
EXITM directive. Text values can be returned as numeric or text constants by
enclosing the text in angle brackets (<-2> or <mov>, for example), or by
prefixing a text equate or numeric expression with the expansion operator (%).
Listing Four (page 96) shows a macro function to calculate a factorial.


MASM 5.1 Compatibility


MASM 6.0 supplies both a command line option and an OPTION directive to
provide compatibility with code written in MASM 5.1 (and earlier versions).
The /Zm command line option sets all features to be compatible with MASM 5.1.
Alternatively, the OPTION M510 statement can be placed at the beginning of
your code. If you need to mix new and old features in the same code, use the
OPTION directive and selectively enable or disable specific features. Note
that the OPTION directive overrides any command line options.


Local and Global Labels



MASM 5.1 introduced the concept of labels being local to a given procedure.
Each label in a procedure can be local to just that procedure and cannot be
referenced elsewhere. Under 6.0, the default behavior is that all labels are
considered local. If you need to jump from one procedure to another, you can
declare any label as global in scope by declaring it with two colons instead
of one. This allows your code to be more readable since you can reuse the same
label names from one procedure to the next. And any label intended to be
accessed globally now stands out.
MASM 5.1 worked this way, but only if the .MODEL directive was used with a
language type. Otherwise, the operation in MASM 5.1 was the same as
OPTION:NOSCOPE. Although OPTION:SCOPE will help produce better and more
readable code, it will also restrict your source code to use with MASM 6.0 (or
MASM 5.1 if it uses .MODEL with a language specified).


Structures and Unions


A structure is a group of related but dissimilar data types. Fields within a
structure can have different data types and sizes. An annoying restriction in
MASM 5.1 is that field names in a given structure can't be used in any other
context. One standard way to get around this is to prefix all field names with
the structure name or an abbreviation of the structure name.
MASM 6.0 now allows nested structures and unions. The directive STRUCT is now
a synonym for STRUC (to be more like C). Field names do not need to be unique
among all identifiers but must be unique within a given nesting level for a
particular structure or union. A restriction is that a field name and a text
macro may not have the same name. This behavior is so different from previous
versions of MASM that OPTION:M510 and OPTION:OLDSTRUCTS (or the /Zm
command-line option) cause the old structure behavior to be in effect.
The STRUCT directive provides two new options, an alignment option and the
NONUNIQUE keyword. The alignment can be 1, 2, or 4 with the default being 1.
The alignment value can be used to align individual fields on a particular
boundary for performance. Care must be taken, however, to align the start of
each structure on the same boundary. The command line option /Zp[n] (where n =
1, 2, or 4) causes structures to be aligned as specified in the structure
directive, but does not specify an alignment. The NONUNIQUE keyword requires
all field names of the structure or union to be fully qualified every time
they are used, regardless of the compatibility options in effect (M510,
OLDSTRUCTS, or /Zm).
Unions are new to MASM 6.0. Unions are similar to unions in C, variant records
in Pascal, or the EQUIVALENCE statement in Fortran. Another change is that the
dot operator is reserved for use by field names and cannot be used as an
alternative for the + operator. This is to allow the assembler to check fields
and make sure that they match with the declared structures. This makes the
code more readable, in that use of the dot operator implies the use of a
structure.


Typedefs


A pointer is a combination of a segment and an offset that is the address of,
for example, a variable in memory. In various memory models pointers may be
thought of as near or far, but all pointers are actually far. Near pointers
just have an assumed segment in one of the segment registers. For example, in
small model you would normally store only the offset portion of a pointer in
memory variables. The segment portion is assumed to be in a segment register
(normally DS for data). In a HLL, such as C, it is fairly easy to switch to a
new model because the compiler handles all the details for you. Writing
assembly-language code that is model-independent tends to be quite
complicated, especially when the assembly language code is more than just a
few subroutines called from a HLL.
MASM 6.0 introduces the ability to define types for pointer variables using
the TYPEDEF directive. Pointer types can simply be NEAR or FAR, or they can be
defined as NEAR16, NEAR32, FAR16, or FAR32 to override the current segment
size. If not specified, then it defaults based on the .MODEL directive.
Pointer types can also be defined in terms of a qualified type, which is any
type previously defined with TYPEDEF, a structure, or any intrinsic type (such
as BYTE or WORD).
The use of this new feature makes declaring model-independent data with
pointers much easier and more readable. However, writing the code that
accesses this data requires coding the in-line conditional assembly
directives. These conditional directives can be eliminated by using
traditional macros or the new text macros.


Products Mentioned


Macro Assembler 6.0
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399
206-882-8080
Price: $150 (upgrades to registered users $75)
System requirements: DOS 3.0 or later or OS/2, Version 1.1 or later

The ASSUME directive has always been misunderstood by a large number of
programmers. Using the simplified segment directives alleviates the need for
the ASSUME directive, at least for straightforward code. In the past, the
ASSUME directive allowed the program to inform the assembler what assumptions
to make about the contents of a segment register. Now you can specify an
assumption for a general register. This allows better error detection and
allows pointer data types to be assumed.


Speed


Besides being faster than its predecessor in raw performance, MASM 6.0 now
allows wildcards to be specified on the command line, which speeds the
assembly of many files. I found MASM 6.0 to be 20 to 40 percent faster than
MASM 5.1 in assembling source files ranging in size up to 100K. This is still
not as fast as Borland's Turbo Assembler (TASM) and SLR Systems' OPTASM.
(Note: OPTASM is compatible with MASM 5.0 and earlier and does not assemble
80386 instructions. TASM is compatible with MASM 5.1 and earlier and contains
a number of minor extensions and other features.) See Table 4 for a speed
comparison.
Table 4: MASM 6.0 Speed Tests

 Assembler      Test1   Test2
 ----------------------------
 MASM 6.0        56      51
 MASM 5.10       69      --
 TASM 2.0        46      31
 OPTASM 1.72     33      18*

Test1: Assemble 20 files (20K to 100K in size, 900K total).
Test2: Make or wildcard assembly of same files.
*Used OPTASM's built-in make file. All times in seconds. All tests run on a
25MHz 80386.



Closing Comments


A number of previous MASM updates forced old code to be modified. But this
time, some of the changes are so major that Microsoft has added the capability
to support MASM 5.1 features selectively, or all at once. But overall, this is
an excellent upgrade, primarily because most of the new features help in
writing code that is easier to read and maintain.
The upgrade to MASM includes major changes to the internal operation of the
assembler as well as a complete facelift to the command line options and many
of the assembler directives. MASM can now assemble and link multiple files
from the command line, fix up conditional jumps that are out of range, and
generate code for looping and decision structures. However, with all these
changes, you still must deal with the 80x86 instruction set, just as before,
and that is what assembly language programming is really all about.

_MASM'S CHANGING FACE_
by Mike Schmit




[LISTING ONE]

 .MODEL small
 .STACK 100 ; reserves 100 bytes for the stack
 .CODE ; start of code segment
 main PROC
 .STARTUP ; generates startup code
 mov bx, 1 ; stdout
 mov cx, msg_len
 mov dx, offset DGROUP:msg
 mov ah, 40h ; write to handle
 int 21h ; call DOS to write msg
 .EXIT ; generates exit code
 main ENDP
 .DATA ; start of data segment
 msg BYTE 'Hello world.'
 msg_len equ $ - msg
 END main ; end, specify starting address







[LISTING TWO]


 EXTRN GetDC : far
 EXTRN MoveTo : far
 EXTRN LineTo : far
 EXTRN ReleaseDC : far

 point_list struc
 x1 dw ?
 y1 dw ?
 x2 dw ?
 y2 dw ?
 point_list ends
 .
 . (assume bx = hWnd)
 .
 push bx
 call GetDC ; returns hDC
 mov di, ax

 push di
 push [si].x1
 push [si].y1
 call MoveTo

 push di
 push [si].x2
 push [si].y2
 call LineTo

 push bx
 push di
 call ReleaseDC

 .
 .

 .



[LISTING THREE]

 GetDC PROTO FAR PASCAL hWnd:WORD
 MoveTo PROTO FAR PASCAL hDC:WORD, nX:WORD, nY:WORD
 LineTo PROTO FAR PASCAL hDC:WORD, nX:WORD, nY:WORD
 ReleaseDC PROTO FAR PASCAL hWnd:WORD, hDC:WORD

 option oldstructs
 point_list struct
 x1 word ?
 y1 word ?
 x2 word ?
 y2 word ?
 point_list ends
 .
 . (assume bx = hWnd)
 .
 invoke GetDC, bx ; returns hDC
 mov di, ax
 invoke MoveTo, di, [si].x1, [si].y1
 invoke LineTo, di, [si].x2, [si].y2
 invoke ReleaseDC, bx, di
 .
 .
 .






[LISTING FOUR]

 factorial MACRO num
 LOCAL result, factor
 IF num LE 0
 %error factorial parameter out of bounds
 ENDIF
 result = 1
 factor = num
 WHILE factor GT 0
 result = result * factor
 factor = factor - 1
 ENDM
 EXITM %result
 ENDM
 i = 1
 REPEAT 20 ; repeat block macro
 DWORD factorial(i) ; to generate a table of
 i = i + 1 ; the first 20 factorials
 ENDM
 DWORD factorial(-33) ; error





Figure 1: MASM 6.0 contains decision and loop directives (in this case, an
.IF/.ELSE decision structure) that are translated to their corresponding
instructions at assembly time.

 .IF ax < mem_word1
 mov mem_word2, 2
 .ELSE
 mov mem_word2, 3
 .ENDIF

The above code is translated to the following:

 cmp ax, mem_word1
 jnb @C0001
 mov mem_word2, 2
 jmp @C0003
 @C0001:
 mov mem_word2, 3
 @C0003:




Figure 2: MASM 6.0 automatically generates a jump fixup when there is a jump
out of range. Notice in this example that the generated code is five bytes
long instead of two.


 cmp ax, error_code
 je exit_error
 db 128 dup(90h) ; (128 bytes of code, NOP's here)
 exit_error:

MASM 6.0 translates this to the following:


 cmp ax, error_code
 jne $+3 ; Note: $+3 is a relative
 ; jump 3 bytes ahead
 jmp exit_error
 db 128 dup(90h)

 exit_error:




July, 1991
A C++ PCX FILE VIEWER FOR WINDOWS 3


Paul Chui


Paul develops computer applications for the aviation industry at KPMG Peat
Marwick in San Mateo, Calif., and is the coauthor of the Turbo C++ Disktutor.
He can be reached on CompuServe at 76077,3162.


Some years ago, ZSoft developed its PC Paintbrush program as the PC's answer
to the Macintosh's MacDraw. Although many graphics programs rival PC
Paintbrush in features and popularity, its PCX file format has transcended the
program. With each successive version, the PCX file format has adapted to new
standards in display hardware to the point that many DOS programs now use PCX
files as a standard medium for exchanging graphics.
Windows provides excellent support for bitmap graphics. The object-oriented
system Windows uses to manage bitmaps makes them especially easy to work with.
To use a PCX bitmap in Windows, however, you must first translate it into the
Windows bitmap format. In this article, I present a PCX file viewer
implemented as a C++ class that creates a Windows bitmap object from a PCX
file. The code was developed under Borland C++ 2.0, a logical step forward for
Windows C programmers.


Run Length Encoding


Run Length Encoding (RLE) is a data compression algorithm that takes advantage
of repeated patterns. RLE has been discussed previously in DDJ (see "Run
Length Encoding" by Robert Zigon, February 1989, and "RLE Revisited" by Phil
Daley, May 1989); I'll use a simple example to demonstrate the algorithm.
Consider a 640 x 480 monochrome picture that is to be copied to disk. Such an
image would take 307,200 (640 * 480) bits, or 38,400 bytes. Now suppose that
the image is composed entirely of white pixels. This file would contain 38,400
bytes with every bit set to 1. This image, however, can be compactly
represented by just two values: a repeat count of 38,400, followed by the
repeated data byte (in this case, all 1s).
Because RLE takes advantage of repeated patterns, compression of random data
is not nearly so effective. Therefore, PCX images are encoded using a variant
of the RLE scheme. In PCX files, if a byte has the two high bits set, then it
is a repeat byte. Otherwise, the byte is a single data byte. The general
algorithm is:
1. Read a byte;
2. If the two high bits are set, then N = the lower six bits, else Write this
byte and Go to step 1;
3. Read the next byte and Write it N times;
4. Go to step 1.
Note that since only six bits are used for the repeat count, the largest
possible repeat count is 63. Consequently, at least 4 bytes are needed to
encode a scanline that's 640 pixels wide.
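The four steps above can be sketched in portable C++. This is a minimal illustration, not the read_pcx_line method from Listing One: the function name pcx_decode, the in-memory vectors, and the expected-length parameter are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode PCX run-length data until `expected` bytes have been produced
// or the input is exhausted. (Hypothetical helper, for illustration only.)
std::vector<std::uint8_t> pcx_decode(const std::vector<std::uint8_t>& in,
                                     std::size_t expected)
{
    std::vector<std::uint8_t> out;
    out.reserve(expected);
    std::size_t i = 0;
    while (i < in.size() && out.size() < expected) {
        const std::uint8_t b = in[i++];            // step 1: read a byte
        if ((b & 0xC0) == 0xC0) {                  // step 2: two high bits set?
            const std::size_t n = b & 0x3F;        // repeat count, 0 to 63
            if (i >= in.size()) break;             // truncated input: stop
            const std::uint8_t data = in[i++];     // step 3: read the data byte
            for (std::size_t k = 0; k < n && out.size() < expected; ++k)
                out.push_back(data);               // ...and write it n times
        } else {
            out.push_back(b);                      // plain data byte
        }
    }                                              // step 4: loop back to step 1
    return out;
}
```

Feeding it the 4 bytes FF 00 D1 00 with an expected length of 80 yields the 80 zero bytes of a 640-pixel black scanline.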
Figure 1 shows how a single scanline is compressed in PCX. Because the two
high bits are set in the first byte, the lower six bits hold a repeat count.
The repeat count is 63 (111111b). A data byte of 0s follows. The third byte
is a repeat count of 17 (010001b) followed by another data byte of 0s. The
total is 80 bytes of 0s or 640 black pixels.
Figure 1: Compression of a single scanline of 640 black pixels in PCX

 Hex: FF 00 D1 00
 Binary: 11 111111 00000000 11 010001 00000000


There is another quirk in PCX encoding. To encode a data byte with the two
high bits set, PCX uses a repeat count of one followed by the data. This could
lead to encoded files two times larger than the raw bitmap! For most
nonpathological cases, however, PCX compression works well.
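The quirk is easiest to see from the encoder's side. The following is a rough sketch of a PCX-style encoder, assuming in-memory buffers and a hypothetical function name pcx_encode; a real encoder would also restart runs at the end of each scanline, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode raw bytes using the PCX run-length scheme.
// (Hypothetical helper; it ignores scanline boundaries.)
std::vector<std::uint8_t> pcx_encode(const std::vector<std::uint8_t>& raw)
{
    std::vector<std::uint8_t> out;
    std::size_t i = 0;
    while (i < raw.size()) {
        const std::uint8_t data = raw[i];
        std::size_t run = 1;                       // length of the current run
        while (i + run < raw.size() && raw[i + run] == data && run < 63)
            ++run;                                 // a count must fit in 6 bits
        if (run > 1 || (data & 0xC0) == 0xC0) {
            // A run, or a lone byte that would be mistaken for a count:
            // emit a count byte (two high bits set) and then the data.
            out.push_back(static_cast<std::uint8_t>(0xC0 | run));
            out.push_back(data);
        } else {
            out.push_back(data);                   // safe to store as-is
        }
        i += run;
    }
    return out;
}
```

Encoding the two bytes C1 C2 produces C1 C1 C1 C2, which is the worst case described above: every data byte costs 2 bytes in the file.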


Windows Device-Independent Bitmaps


The Windows Device-Independent Bitmap (DIB) format, introduced with Windows 3,
overcomes the limitations of the old Windows bitmap format (see the text box
entitled "Windows Device-Dependent Bitmaps," which accompanies this article),
but you pay a penalty for device independence -- DIBs are more complex. You
must set up at least three structures to tell Windows about a new DIB.
The BITMAPINFO, RGBQUAD, and BITMAPINFOHEADER structures, which are defined in
WINDOWS.H, are shown in Figure 2. BITMAPINFO has two fields: A header
BITMAPINFOHEADER and a color table RGBQUAD[ ]. BITMAPINFOHEADER contains
information about the dimensions and format of the image. Each RGBQUAD
structure in BITMAPINFO defines a single color. The size of the color table is
determined by the number of colors in the image.
Figure 2: The BITMAPINFO, RGBQUAD, and BITMAPINFOHEADER structures as defined
in WINDOWS.H

 BITMAPINFO structure:
   BITMAPINFOHEADER  bmiHeader
   RGBQUAD           bmiColors[]

 RGBQUAD structure:
   BYTE   rgbBlue           Blue intensity
   BYTE   rgbGreen          Green intensity
   BYTE   rgbRed            Red intensity
   BYTE   rgbReserved       Not used, set to 0

 BITMAPINFOHEADER structure:
   DWORD  biSize            Size of BITMAPINFOHEADER (bytes)
   DWORD  biWidth           Width of bitmap (pixels)
   DWORD  biHeight          Height of bitmap (pixels)
   WORD   biPlanes          Always 1
   WORD   biBitCount        Bits per pixel
                               1 = monochrome
                               4 = 16 color
                               8 = 256 color
                              24 = 16 million color
   DWORD  biCompression     Type of compression
                               0 = no compression
                               1 = 8-bits/pixel RLE
                               2 = 4-bits/pixel RLE
   DWORD  biSizeImage       Size of bitmap bits (bytes)
                            (required if compression used)
   DWORD  biXPelsPerMeter   Horizontal resolution (pixels/meter)
   DWORD  biYPelsPerMeter   Vertical resolution (pixels/meter)
   DWORD  biClrUsed         Colors used
   DWORD  biClrImportant    Number of important colors




The PCX Class


The PCX class, defined in SHOWPCX.CPP (see Listing One, page 97), decodes
and displays PCX files in Windows. It has three main user methods: Read,
Display, and Handle. PCX::Read opens the PCX file and reads its 128-byte
header (see the text box entitled "The PCX Header"). From the header,
PCX::Read decides if it should create a monochrome, 16-color, or 256-color
decoder. The decoder creates a device-independent bitmap and stores the bitmap
handle in a private variable, hBitmap. A call to PCX::Display will put the DIB
into a Windows display context. If you need to obtain the handle for the DIB,
use PCX::Handle.
The PCX class knows little about the format of PCX images. It leaves the hard
work to the DecodePCX class. DecodePCX defines the read_pcx_line method, which
decompresses a scanline from the PCX file. But you can't make a DecodePCX
object because it is an abstract class. That is, DecodePCX declares but does
not define a MakeDIB method. DecodePCX is used by deriving child classes that
each define a MakeDIB method. These children are DecodeMonoPCX, Decode16PCX,
and Decode256PCX. Each MakeDIB method decodes and creates a different type of
bitmap.


Decoding Monochrome Images


Creating a monochrome DIB from a PCX file is straightforward. The method
DecodeMonoPCX::MakeDIB in Listing One creates the monochrome DIB. Most of the
code is dedicated to setting up the header. To create the DIB header, first
create a BITMAPINFO structure. This structure contains a variable size color
table. The size of this color table depends on the image type. For a
monochrome bitmap, the size of the BITMAPINFO header is:
sizeof(BITMAPINFOHEADER) + 2 * sizeof(RGBQUAD). There are two RGBQUAD entries
in the color table, one for white and another for black.
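The same arithmetic extends to the other image types. As a rough sketch, assuming the Windows 3 structure sizes of 40 bytes for BITMAPINFOHEADER and 4 bytes for RGBQUAD, and a color table that holds every possible color (the function name bitmapinfo_bytes is made up for this example):

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kInfoHeaderBytes = 40;  // size of BITMAPINFOHEADER
constexpr std::size_t kRgbQuadBytes    = 4;   // size of one RGBQUAD

// Bytes needed for a BITMAPINFO whose color table is fully populated.
// (Hypothetical helper, for illustration only.)
std::size_t bitmapinfo_bytes(unsigned bits_per_pixel)
{
    assert(bits_per_pixel == 1 || bits_per_pixel == 4 ||
           bits_per_pixel == 8 || bits_per_pixel == 24);
    // 24-bit DIBs store RGB values directly and need no color table.
    const std::size_t colors =
        (bits_per_pixel == 24) ? 0 : (std::size_t(1) << bits_per_pixel);
    return kInfoHeaderBytes + colors * kRgbQuadBytes;
}
```

For a monochrome bitmap this gives 40 + 2 * 4 = 48 bytes; for a 256-color bitmap, 40 + 256 * 4 = 1064 bytes.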
Some small but important differences exist between DIB and PCX images. The
length of each DIB scanline is always a multiple of 4 bytes, but the length of
a PCX scanline can be a multiple of 2 bytes. The ALIGN_DWORD macro compensates
for this difference. This macro is used to calculate the size of the image
buffer needed by the DIB: image_size = ALIGN_DWORD (BytesLine) * height;. In
this statement, BytesLine is the number of bytes in each PCX scanline, and
height is the number of scanlines in the image.
After allocating memory for the image buffer, a loop is used to read each line
of the PCX file until the image is complete. The origin of DIB starts at the
lower-left corner of the image, not the upper-left corner as in PCX files. So
the first line read from the PCX file is copied into the last line of the DIB.
Once the bitmap image and header information is filled in, a call to Windows'
CreateDIBitmap creates the bitmap.
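The padding and the bottom-up ordering can be sketched together. The code below is an illustration only, not the MakeDIB method from Listing One: it assumes the decoded PCX scanlines are already in memory, top line first, and the names align_dword and to_bottom_up are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Round a byte count up to the next multiple of 4, as ALIGN_DWORD does.
inline std::size_t align_dword(std::size_t x) { return (x + 3) / 4 * 4; }

// Copy top-down scanlines of `bytes_line` bytes each into a bottom-up
// buffer whose rows are padded to a multiple of 4 bytes.
// (Hypothetical helper, for illustration only.)
std::vector<std::uint8_t> to_bottom_up(const std::vector<std::uint8_t>& top_down,
                                       std::size_t bytes_line,
                                       std::size_t height)
{
    assert(top_down.size() >= bytes_line * height);
    const std::size_t stride = align_dword(bytes_line);
    std::vector<std::uint8_t> dib(stride * height, 0);   // padding stays 0
    if (bytes_line == 0) return dib;                     // nothing to copy
    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* src = top_down.data() + y * bytes_line;
        // The first source line lands in the last row of the DIB buffer.
        std::uint8_t* dst = dib.data() + (height - 1 - y) * stride;
        std::memcpy(dst, src, bytes_line);
    }
    return dib;
}
```

An 80-byte monochrome scanline needs no padding, while a 2-byte scanline grows to 4 bytes per row in the DIB buffer.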


Decoding 16-Color Images


Translating a monochrome PCX file to a Windows DIB is relatively clean because
their image formats are similar. Unfortunately, translating 16-color images
requires some bit twiddling. Four bits per pixel are used for 16-color DIBs,
but PCX files are arranged with 1 bit per pixel and four interleaved color
scanlines. Decode16PCX::MakeDIB reads groups of four PCX scanlines, then
tests a bit from each scanline to create the appropriate 4-bit representation
of a DIB pixel. This requires significantly more processing than the
monochrome decoder.
There are 16 entries in the color table. This table is filled with the palette
copied from the PCX header. If the PCX file has no palette information, the
decoder uses literal RGB values.
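The bit twiddling can be sketched as follows. This is not the code from Listing One; the function name planar_to_packed4 is hypothetical, and the sketch assumes that plane 0 supplies the least significant bit of each pixel's color index. The DIB side follows the 4-bits-per-pixel layout, in which the high nibble of each byte is the leftmost pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert one decoded 16-color PCX scanline group into packed 4-bit pixels.
// `planes` holds four plane scanlines back to back (plane 0 first), each
// `bytes_line` bytes long. (Hypothetical helper, for illustration only.)
std::vector<std::uint8_t> planar_to_packed4(const std::vector<std::uint8_t>& planes,
                                            std::size_t bytes_line,
                                            std::size_t width)
{
    assert(planes.size() >= 4 * bytes_line);
    assert(width <= bytes_line * 8);
    std::vector<std::uint8_t> packed((width + 1) / 2, 0);
    for (std::size_t x = 0; x < width; ++x) {
        const std::size_t byte = x / 8;
        const unsigned shift = 7 - static_cast<unsigned>(x % 8);  // MSB = leftmost pixel
        unsigned index = 0;
        for (std::size_t p = 0; p < 4; ++p) {
            const unsigned bit = (planes[p * bytes_line + byte] >> shift) & 1u;
            index |= bit << p;               // assumption: plane p supplies bit p
        }
        if (x % 2 == 0)
            packed[x / 2] |= static_cast<std::uint8_t>(index << 4);  // high nibble
        else
            packed[x / 2] |= static_cast<std::uint8_t>(index);       // low nibble
    }
    return packed;
}
```

Each output byte holds two pixels, so a 640-pixel scanline that occupies 4 * 80 bytes in the PCX planes becomes 320 bytes in the DIB.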


Decoding 256-Color Images


The MCGA/VGA can display 256 colors out of a palette of 16 million at 320 x
200 resolution. However, Windows does not come with a display driver for this
mode. If you want to see Windows in 256 colors, you'll need more capable
graphics hardware (such as a Super VGA or 8514/A).
As with the monochrome images, translating 256-color PCX images into Windows
DIB bits is elementary. Both images use an 8-bits-per-pixel format. Therefore,
the decompressed bits from the PCX file are simply copied into the DIB image
buffer.


Color Palettes


A PCX file uses a maximum of 256 colors out of a palette of approximately 16
million colors. And as just mentioned, display adapters such as the 8514/A can
display 256 colors simultaneously out of a possible 16 million. When you
display a 256-color image, you want the system to use the palette specified by
your image. In a multitasking environment like Windows, this can be a
detriment to other programs that have their own palettes. Fortunately,
Windows' palette manager can be employed to settle these conflicts.
In order to accommodate programs that have important color information,
Windows allows each program to create its own logical palette. Logical
palettes allow an application to use as many colors as required, with minimal
interference to colors used by other applications. When an application is
active, Windows will satisfy its palette requests by exactly matching the
system palette to the logical palette. Windows can also approximate logical
palettes for inactive windows by mapping those colors to the closest colors in
the current system palette.
Decode256PCX creates a Windows logical palette out of the PCX color palette.
Unlike the 16-color palette, PCX does not keep 256-color palettes in the
header. This palette is stored as 256 red, green, and blue byte triples at the
end of the PCX file. To get the extended palette, go to the 769th byte from
the end of the PCX file. This byte should contain 0Ch to verify the start of
the palette. The actual palette entries follow.
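Locating the palette can be sketched like this. The sketch assumes the entire PCX file has been read into memory; the Rgb structure and the function name read_vga_palette are invented for the example and do not appear in Listing One.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };   // one PCX palette entry

// Extract the 256-color palette from the end of a PCX file image.
// Returns false if the file is too short or the 0Ch marker is missing.
// (Hypothetical helper, for illustration only.)
bool read_vga_palette(const std::vector<std::uint8_t>& file,
                      std::array<Rgb, 256>& palette)
{
    const std::size_t kTrailer = 769;                 // marker + 256 * 3 bytes
    if (file.size() < 128 + kTrailer) return false;   // header plus palette
    const std::size_t start = file.size() - kTrailer;
    if (file[start] != 0x0C) return false;            // palette marker byte
    for (std::size_t i = 0; i < 256; ++i) {
        const std::size_t p = start + 1 + i * 3;      // red, green, blue order
        palette[i] = Rgb{ file[p], file[p + 1], file[p + 2] };
    }
    return true;
}
```

Note that a Windows RGBQUAD stores its bytes in blue, green, red order, so the entries have to be reordered when they are copied into a DIB color table.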
Decode256PCX::MakeDIB passes the PCX palette to make_palette, which creates a
Windows logical palette. Before creating the bitmap, you must notify Windows
of the new palette. A call to SelectPalette identifies the logical palette for
the device context. RealizePalette tells Windows to map the logical palette
into the system palette.



The File Viewer


After the decoder is implemented, we need only to add a few lines of code to
our main Windows template to complete the file viewer (see PCXWIN.CPP in
Listing Two, page 100 and PCXWIN.H in Listing Three, page 101). The main C++
Windows module is written similar to a main C Windows module. I did not
encapsulate WinMain or WndProc into a C++ class because this does not simplify
the main Windows message loop for me. A notable C++ feature used in the main
module is the Scroller class (SCROLLER.CPP, Listing Four, page 101), which
encapsulates all scrollbar-related functions and variables so that the main
module is not littered with global variables. Scroller::Size notifies Scroller
when the size of the window has changed. The methods Scroller::Horz and
Scroller::Vert handle the window's horizontal and vertical scrolling,
respectively.
The main module also handles a Windows palette message. The
WM_QUERYNEWPALETTE message is received from Windows just before the
application becomes active. This allows it to realize its logical palette.
Windows also sends another message, WM_PALETTECHANGED, just after the system
palette has changed. This message should be processed if you want Windows to
approximate your palette colors when the window is in the background.
WM_PALETTECHANGED can be ignored, however, if you care about the appearance of
your window only when it is active.


A Word about Tools


The PCXWIN.MAK make file (see Listing Seven, page 102) was created from a
Borland Project file using the PRJ2MAK utility. If you prefer to compile the
viewer under the Borland Programmer's Platform, just create a project
containing the CPP, DEF, and RC files, select the Window App option from the
Options/Applications menu, and compile. (The DEF and RC files for this project
are presented in Listings Five and Six, page 101, respectively.) The
-H=PCXWIN.SYM option tells Borland C++ to precompile the header files to a
binary symbol file. You can remove this from the make file, but I strongly
recommend you use it for your own projects. Much of the work in compiling a
Windows program is recompiling the WINDOWS.H header file.
Most C++ programs include numerous class declarations in header files. In
addition, C++ syntax is more difficult to parse than C syntax, heavily
penalizing compiler time. When changes are made to a source module, Borland
C++ checks the time stamp on the header files against the precompiled header
symbol file. If the headers have not been changed, the symbol file is loaded
without the need to parse the header file source. This brings the lines
necessary to recompile PCXWIN.CPP (Listing Two) from 3800 lines down to around
250 lines, which just about eliminates the programmer's coffee break. Notice,
too, that PCXWIN.DEF (Listing Five) does not contain any EXPORT functions. C++
function name mangling makes exporting linker names difficult. The solution is
to use the _export type qualifier in the function definitions.
Also note that the file viewer works only in Windows standard or enhanced mode
because Borland C++ does not support real-mode programs. The PROTMODE
statement in PCXWIN.DEF tells the linker to tag the file viewer as a
protected-mode program. Windows real-mode compatibility may be a critical
issue for some, but there is a Windows programmers' movement to do away with
real mode completely. The debate continues on the importance of Windows
real-mode compatibility.


Possible Enhancements


I took a minimalist approach to the file viewer. It's really only a test
program for the PCX class. But adding features to this application could be
done very cheaply. Because the PCX class creates a Windows bitmap, clipboard
support was added simply by using SetClipboardData in the main module. With
minimum effort, you can add a file dialog box, printing capability, and so on.
If you have a little more time, you can also add color dithering to allow
256-color images to be viewed on standard VGA displays. But as one of my
college professors was fond of saying, I'll leave the rest to you as an
exercise.


References


Microsoft Windows Software Development Kit Reference Manual. Redmond, Wash.:
Microsoft Press.
PC Paintbrush Technical Reference Manual. Marietta, Ga.: ZSoft Corp.
Petzold, Charles. Programming Windows. Redmond, Wash.: Microsoft Press, 1990.


The PCX Header


PCX files start with a 128-byte header. The PCXHEADER structure is defined in
Listing One. The pcxManufacturer field is always \xA. This is a "magic number"
used to identify a PCX file. The pcxVersion identifies what version of PC
Paintbrush or compatible program created this file. The version ID is shown in
Table 1.

Table 1: PCX version ID

 ID   PC Paintbrush Version
 --------------------------
 0    2.5
 2    2.8 with palette
 3    2.8 without palette
 5    3.0


The pcxEncoding field should have the value 1. At this time, run length
encoding is the only scheme PCX files use. The pcxBitsPixel field tells you
how many consecutive bits in the file represent one pixel on the screen (see
Table 2).
Table 2: Bits per pixel

 Bits/Pixel   Colors (Display Mode)
 ----------------------------------
 1            Monochrome, EGA/VGA 16 color
 2            CGA 4 color
 8            MCGA/VGA 256 color


The next four fields (pcxXmin, pcxYmin, pcxXmax, and pcxYmax) define the
limits of the image. The dimensions of the image will be pcxXmax-pcxXmin+1 by
pcxYmax-pcxYmin+1. The lower limits of pcxXmin and pcxYmin are usually both 0.
The pcxHres and pcxVres fields specify the resolution of the screen that was
used to create the file. These fields can be ignored. The pcxPalette field
contains the color palette. The pcxReserved field usually contains the BIOS
video mode but is not documented as such. It should be ignored.
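Reading the image limits from the header can be sketched as below. The byte offsets follow from the PCXHEADER layout in Listing One, assuming 16-bit integers stored low byte first; the PcxSize structure and the function names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct PcxSize { unsigned width, height; };   // hypothetical result type

// Read a 16-bit little-endian value from two consecutive bytes.
inline unsigned read_le16(const std::uint8_t* p)
{
    return static_cast<unsigned>(p[0]) | (static_cast<unsigned>(p[1]) << 8);
}

// Compute the image dimensions from a 128-byte PCX header.
// Returns false if the header is too short, lacks the magic number,
// or has inverted limits. (Hypothetical helper, for illustration only.)
bool pcx_dimensions(const std::vector<std::uint8_t>& header, PcxSize& size)
{
    if (header.size() < 128) return false;
    if (header[0] != 0x0A) return false;          // pcxManufacturer
    const unsigned xmin = read_le16(&header[4]);  // pcxXmin
    const unsigned ymin = read_le16(&header[6]);  // pcxYmin
    const unsigned xmax = read_le16(&header[8]);  // pcxXmax
    const unsigned ymax = read_le16(&header[10]); // pcxYmax
    if (xmax < xmin || ymax < ymin) return false;
    size.width  = xmax - xmin + 1;
    size.height = ymax - ymin + 1;
    return true;
}
```

A full-screen VGA image with limits (0, 0) and (639, 479) comes out as 640 by 480.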

The pcxPlanes field says how many hardware color planes the target video mode
has. PCX files store this information by interleaving color scanlines. For
example, the beginning of a PCX image in EGA 16-color mode appears in Figure
3. The image has two blue pixels, three green pixels, two red pixels, and one
magenta (red+blue) pixel. The image is 640 pixels wide (80 bytes = 640 pixels
/ 8 pixels per byte). The actual number of bytes in the file per scanline is
probably less than 80 bytes because of compression.
The pcxBytesLine field tells you how many bytes are in each scanline. This
value will depend on the bits per pixel and the x dimensions of the image.
This field is always even. The pcxPaletteInfo field is for VGA images. The
value is 1 for grayscale and 2 for color images.
The rest of the 128-byte header is reserved for future additions to the PCX
format. The actual bits of the PCX bitmap immediately follow the 128-byte
header.
--P.C.


Windows Device-Dependent Bitmaps


Unlike GIF, which was designed as a generic graphics format, PCX file formats
have evolved by closely paralleling changing PC graphics hardware.
Decompressed PCX images are basically copies of the video display buffer. This
is also true of Windows Device-Dependent Bitmaps (DDBs).
For monochrome images, device dependence is not a problem. All monochrome
bitmap devices use a single bit to represent each pixel on the image. Unlike
the obvious solution used for monochrome images, the structure of color images
is not a simple matter of black and white (excuse the pun). Video hardware
designers have come up with ways to stuff as many colors into as few bytes as
possible. Pity the poor programmer. For example, the EGA 16-color video mode
uses 1 bit per pixel and four hardware "color planes" (red, green, blue,
intensity). Combinations of the bits from each plane are used to create 16
possible colors. PCX files and Windows DDBs emulate color planes by storing
them as interleaving color scanlines. MCGA/VGA 256-color modes don't have
color planes, but use 8 bits per pixel. Not by coincidence, PCX files use 8
bits per pixel for 256-color images.
DDBs and PCX files have the same basic image format. EGA 16-color DDBs have
the same color scanline interlacing as the PCX format. This makes translating
from a PCX file directly to a DDB undeservedly easy. The process involves
unpacking the RLE for each line in the PCX file and then setting the unpacked
bits directly to the DDB; see Example 1. The function read_pcx_line decodes a
single scanline from the PCX file. The same code constructs both monochrome
and 16-color images. For 16-color images, the number of color planes
(byPlanes) is 4. The image is monochrome if the number of color planes is 1.

Example 1: Setting unpacked bits directly to the DDB

 BYTE huge* lpImage = new BYTE[lImageSize];
 int h, line, plane;
 for (h=0, line=0; h<wHeight; ++h, line+=byPlanes)
     for (plane=0; plane<byPlanes; ++plane)
         read_pcx_line(lpImage+(lBmpBytesLine*(line+plane)));
 HBITMAP hBitmap = CreateBitmap(wWidth, wHeight, byPlanes, 1, lpImage);

This seems too good to be true. Here's the curve: DDBs interlace color
scanlines for small bitmaps, but if they're greater than 65,535 bytes, color
planes are interlaced. So the revised code fragment should look something like
Example 2.
--P.C.

Example 2: Revising the code in Example 1 for bitmaps greater than 65,535
bytes

 BYTE huge* lpImage = new BYTE[lImageSize];
 int h, line, plane;
 if (lImageSize < 65535L) {
     // Interlaced color scanlines
     for (h=0, line=0; h<wHeight; ++h, line+=byPlanes)
         for (plane=0; plane<byPlanes; ++plane)
             read_pcx_line(lpImage+(lBmpBytesLine*(line+plane)));
 } else {
     // Interlaced color planes
     for (h=0; h<wHeight; ++h)
         for (plane=0; plane<byPlanes; ++plane)
             read_pcx_line(lpImage+(lBmpBytesLine*(plane*wHeight+h)));
 }
 HBITMAP hBitmap = CreateBitmap(wWidth, wHeight, byPlanes, 1, lpImage);



_A C++ PCX FILE VIEWER FOR WINDOWS 3_
by Paul Chui




[LISTING ONE]

#include <windows.h>
#include "pcxwin.h"

#include <io.h>

#define ALIGN_DWORD(x) (((x)+3)/4 * 4)


struct PCXRGB { BYTE r, g, b; };

struct PCXHEADER {
 BYTE pcxManufacturer;
 BYTE pcxVersion;
 BYTE pcxEncoding;
 BYTE pcxBitsPixel;
 int pcxXmin, pcxYmin;
 int pcxXmax, pcxYmax;
 int pcxHres, pcxVres;
 PCXRGB pcxPalette[16];
 BYTE pcxReserved;
 BYTE pcxPlanes;
 int pcxBytesLine;
 int pcxPaletteInfo;
 BYTE pcxFiller[58];
};

///////////////////////////////////////////////////////////////////////////
// NOTES: Decoder creates a DIB and possibly a PALETTE, but does not delete
// either. It is the responsibility of the Decoder's user to delete them.
///////////////////////////////////////////////////////////////////////////
class DecodePCX {
public:
 DecodePCX(int hfile, PCXHEADER& pcxHeader);
virtual HBITMAP MakeDIB(HDC hdc) = 0;
 HPALETTE Palette();
protected:
 WORD read_pcx_line(BYTE huge* pLine);
 BOOL NEAR PASCAL next_data();

 int hFile; // Handle to the open PCX file
 HPALETTE hPalette; // Handle to Palette

 PCXHEADER header;
 int BytesLine; // Bytes/Line in PCX file
 WORD width; // width in pixels
 WORD height; // height in scan lines

 BYTE byData; // Current data byte
 int iDataBytes; // Current unread data buffer size
};

HPALETTE DecodePCX::Palette() { return hPalette; }
class DecodeMonoPCX : public DecodePCX {
public:
 DecodeMonoPCX(int hfile, PCXHEADER& pcxHeader) :
 DecodePCX(hfile, pcxHeader) { }
 HBITMAP MakeDIB(HDC hdc);
};
class Decode16PCX: public DecodePCX {
public:
 Decode16PCX(int hfile, PCXHEADER& pcxHeader) :
 DecodePCX(hfile, pcxHeader) { }
 HBITMAP MakeDIB(HDC hdc);
};
class Decode256PCX: public DecodePCX {
public:
 Decode256PCX(int hfile, PCXHEADER& pcxHeader) :

 DecodePCX(hfile, pcxHeader) { }
 HBITMAP MakeDIB(HDC hdc);
private:
 HPALETTE make_palette(RGBQUAD* pColors);
};

///////////////////////////////////////////////////////////////////////////
// PCX Methods
///////////////////////////////////////////////////////////////////////////
PCX::PCX()
{
 hBitmap = 0;
 hPalette = 0;
 hFile = 0;

 wWidth = 0;
 wHeight = 0;
}
PCX::~PCX()
{
 if (hBitmap) DeleteObject(hBitmap);
 if (hPalette) DeleteObject(hPalette);
}

/****************************************************************************
 METHOD: BOOL PCX::Read(LPSTR lpszFileName, HDC hdc)
 PURPOSE: Creates a DIB from a PCX file
 PARAMETERS: LPSTR lpszFileName PCX file name
 HDC hdc A compatible DC for the DIB
 RETURN: TRUE if DIB was created, otherwise FALSE
 NOTES: ZSoft documents a CGA palette type that is not supported here.
****************************************************************************/
BOOL PCX::Read(LPSTR lpszFileName, HDC hdc)
{
 // Delete the bitmap and reset variables
 if (hBitmap)
 {
 DeleteObject(hBitmap);
 hBitmap = 0; // So we know the bitmap has been deleted
 }
 if (hPalette)
 {
 DeleteObject(hPalette);
 hPalette = 0; // So we know the palette has been deleted
 }
 wWidth = 0;
 wHeight = 0;
 OFSTRUCT OfStruct;
 if ((hFile=OpenFile(lpszFileName, &OfStruct, OF_READ)) == -1)
 {
 ErrorMessage("Unable to open file.");
 return FALSE;
 }
 PCXHEADER header;
 if (_lread(hFile,(LPSTR)&header,sizeof(PCXHEADER)) != sizeof(PCXHEADER))
 {
 _lclose(hFile);
 ErrorMessage("Error reading PCX file header.");
 return FALSE;
 }

 if(header.pcxManufacturer != 0x0a)
 {
 _lclose(hFile);
 ErrorMessage("Not a PCX file.");
 return FALSE;
 }
 wWidth = header.pcxXmax - header.pcxXmin + 1;
 wHeight = header.pcxYmax - header.pcxYmin + 1;

 DecodePCX* Decoder = NULL; // Stays NULL if the format is unsupported

 /* Determine PCX file type and create a decoder */

 // 256-color file
 if (header.pcxBitsPixel == 8 && header.pcxPlanes == 1)
 Decoder = new Decode256PCX(hFile, header);
 else
 // 16-color file
 if (header.pcxBitsPixel == 1 && header.pcxPlanes == 4)
 Decoder = new Decode16PCX(hFile, header);
 else
 // monochrome file
 if (header.pcxBitsPixel == 1 && header.pcxPlanes == 1)
 Decoder = new DecodeMonoPCX(hFile, header);
 else
 ErrorMessage("Unsupported PCX format.");

 if (!Decoder)
 {
 ErrorMessage("Cannot create PCX decoder.");
 _lclose(hFile);
 return FALSE;
 }
 SetCursor( LoadCursor(NULL,IDC_WAIT) );
 // Create the bitmap
 hBitmap = Decoder->MakeDIB(hdc);
 hPalette = Decoder->Palette();
 SetCursor( LoadCursor(NULL,IDC_ARROW) );
 delete Decoder;
 _lclose(hFile);
 return (hBitmap) ? TRUE : FALSE;
}

/****************************************************************************
 METHOD: BOOL PCX::Display(HDC hdc, POINT& pos, RECT& rect)
 PURPOSE: Displays the DIB
 PARAMETERS: HDC hdc DC on which DIB is displayed
 POINT pos Destination positions
 RECT rect Clipping rectangle on source
 RETURN: TRUE if DIB was displayed, otherwise FALSE
 NOTES: Works for MM_TEXT mode only
****************************************************************************/
BOOL PCX::Display(HDC hdc, POINT& pos, RECT& rect)
{
 BOOL bltOk = FALSE;
 if (hBitmap)
 {
 HDC hdcBitmap = CreateCompatibleDC(hdc);
 HBITMAP hOldBitmap = SelectObject(hdcBitmap, hBitmap);

 bltOk = BitBlt(hdc, rect.left,rect.top,rect.right,rect.bottom,
 hdcBitmap,pos.x,pos.y, SRCCOPY);
 SelectObject(hdcBitmap, hOldBitmap);
 DeleteDC(hdcBitmap);
 }
 return bltOk;
}

///////////////////////////////////////////////////////////////////////////
// DecodePCX Methods
///////////////////////////////////////////////////////////////////////////

/****************************************************************************
 METHOD: DecodePCX::DecodePCX(int hfile, PCXHEADER& pcxHeader)
 PURPOSE: Constructor
 PARAMETERS: int hfile Handle to open PCX file
 PCXHEADER pcxHeader PCX header
****************************************************************************/
DecodePCX::DecodePCX(int hfile, PCXHEADER& pcxHeader)
{
 hFile = hfile;
 // Reset file pointer
 if (_llseek(hFile, sizeof(PCXHEADER), 0) == -1)
 ErrorMessage("Error positioning past header.");
 header = pcxHeader;
 hPalette = 0;
 BytesLine = header.pcxBytesLine;
 width = header.pcxXmax - header.pcxXmin + 1;
 height = header.pcxYmax - header.pcxYmin + 1;
 byData = 0;
 iDataBytes = 0;
}

/****************************************************************************
 METHOD: WORD DecodePCX::read_pcx_line(BYTE huge* lpLineImage)
 PURPOSE: Decode a PCX RLE scanline
 PARAMETERS: BYTE huge* lpLineImage Destination of decoded scanline
 RETURN: Number of bytes decoded
****************************************************************************/
WORD DecodePCX::read_pcx_line(BYTE huge* lpLineImage)
{
 for (WORD n=0; n<BytesLine; )
 {
 if (!next_data()) return n;
 // If the two high bits are set...
 if (byData >= 0xc0)
 {
 // Get duplication count from lower bits
 BYTE run_len = byData & 0x3f;
 // Set run_len bytes
 if (!next_data()) return n;
 while(run_len--) lpLineImage[n++]=byData;
 }
 else
 // Set this byte
 lpLineImage[n++] = byData;
 }
 if (n != BytesLine)
 ErrorMessage("PCX Read Error.");

 return n;
}

/****************************************************************************
 METHOD: BOOL NEAR PASCAL DecodePCX::next_data()
 PURPOSE: Read a byte from the file and set to byData
 RETURN: FALSE on read error
 NOTES: The output byte is written to byData
****************************************************************************/
BOOL NEAR PASCAL DecodePCX::next_data()
{
 static BYTE fileBuf[5120];
 static int index = 0;
 if (iDataBytes == 0)
 {
 if ((iDataBytes = _read(hFile, fileBuf, sizeof(fileBuf))) <= 0)
 return FALSE;
 index = 0;
 }
 --iDataBytes;
 byData = *(fileBuf+(index++));
 return TRUE;
}

///////////////////////////////////////////////////////////////////////////
// DecodeMonoPCX Methods
///////////////////////////////////////////////////////////////////////////

/****************************************************************************
 METHOD: HBITMAP DecodeMonoPCX::MakeDIB(HDC hdc)
 PURPOSE: Make monochrome DIB
 PARAMETERS: HDC hdc Handle to compatible DC
 RETURNS: Handle to DIB, NULL on error
****************************************************************************/
HBITMAP DecodeMonoPCX::MakeDIB(HDC hdc)
{
 int h;
 LONG lDIBBytesLine = ALIGN_DWORD(BytesLine);
 LONG image_size = lDIBBytesLine*height;
 // Allocate memory for the image buffer
 GlobalCompact(image_size);
 HANDLE hImageMem = GlobalAlloc(GMEM_MOVEABLE | GMEM_ZEROINIT, image_size);
 if (!hImageMem)
 {
 ErrorMessage("Out of memory."); return NULL;
 }
 BYTE huge* lpImage = (BYTE huge*) GlobalLock(hImageMem);
 if (!lpImage)
 {
 ErrorMessage("Memory error."); return NULL;
 }
 for (h=height-1; h>=0; --h)
 read_pcx_line(lpImage+(lDIBBytesLine*h));
 // Create the DIB header
 PBITMAPINFO pBmi = (PBITMAPINFO)
 new BYTE[ sizeof(BITMAPINFOHEADER)+2*sizeof(RGBQUAD) ];
 if (!pBmi)
 {
 ErrorMessage("Out of memory.");

 GlobalUnlock(hImageMem);
 GlobalFree(hImageMem);
 return NULL;
 }
 PBITMAPINFOHEADER pBi = &pBmi->bmiHeader;
 pBi->biSize = sizeof(BITMAPINFOHEADER);
 pBi->biWidth = width;
 pBi->biHeight = height;
 pBi->biPlanes = 1;
 pBi->biBitCount = 1;
 pBi->biCompression = 0L;
 pBi->biSizeImage = 0L;
 pBi->biXPelsPerMeter = 0L;
 pBi->biYPelsPerMeter = 0L;
 pBi->biClrUsed = 0L;
 pBi->biClrImportant = 0L;
 // Copy PCX Palette into DIB color table
 pBmi->bmiColors[0].rgbBlue = header.pcxPalette[0].b;
 pBmi->bmiColors[0].rgbGreen = header.pcxPalette[0].g;
 pBmi->bmiColors[0].rgbRed = header.pcxPalette[0].r;
 pBmi->bmiColors[1].rgbBlue = header.pcxPalette[1].b;
 pBmi->bmiColors[1].rgbGreen = header.pcxPalette[1].g;
 pBmi->bmiColors[1].rgbRed = header.pcxPalette[1].r;
 HBITMAP hBitmap = CreateDIBitmap(hdc, pBi, CBM_INIT,
 (LPSTR)lpImage, pBmi, DIB_RGB_COLORS);
 delete pBmi;
 // Free image buffer
 GlobalUnlock(hImageMem);
 GlobalFree(hImageMem);
 return hBitmap;
}

///////////////////////////////////////////////////////////////////////////
// Decode16PCX Methods
///////////////////////////////////////////////////////////////////////////

/****************************************************************************
 METHOD: HBITMAP Decode16PCX::MakeDIB(HDC hdc)
 PURPOSE: Make 16-color DIB
 PARAMETERS: HDC hdc Handle to compatible DC
 RETURNS: Handle to DIB, NULL on error
****************************************************************************/
HBITMAP Decode16PCX::MakeDIB(HDC hdc)
{
 LONG lDIBBytesLine = ALIGN_DWORD( (width+1)/2 );
 LONG image_size = lDIBBytesLine*height;
 // Allocate memory for the image buffer
 GlobalCompact(image_size);
 HANDLE hImageMem = GlobalAlloc(GMEM_MOVEABLE | GMEM_ZEROINIT, image_size);
 if (!hImageMem)
 {
 ErrorMessage("Out of memory."); return NULL;
 }
 BYTE huge* lpImage = (BYTE huge*) GlobalLock(hImageMem);
 if (!lpImage)
 {
 ErrorMessage("Memory error."); return NULL;
 }
 // 16 color PCX files interleave scanlines for each color

 BYTE *npPlane[4];
 for (int h=0; h<4; ++h)
 npPlane[h] = new BYTE[BytesLine];
 if (!npPlane[0] || !npPlane[1] || !npPlane[2] || !npPlane[3])
 {
 GlobalUnlock(hImageMem);
 GlobalFree(hImageMem);
 return NULL;
 }
 // 16 color DIB bitmaps have 4 bits per pixel
 for (h=height-1; h>=0; --h)
 {
 read_pcx_line(npPlane[0]);
 read_pcx_line(npPlane[1]);
 read_pcx_line(npPlane[2]);
 read_pcx_line(npPlane[3]);
 LONG l = (LONG) h * lDIBBytesLine;
 for (int m=0; m<BytesLine; ++m)
 {
 BYTE r = npPlane[0][m];
 BYTE g = npPlane[1][m];
 BYTE b = npPlane[2][m];
 BYTE i = npPlane[3][m];
 // Combine a bit from each 4 scan lines into a 4-bit nibble
 BYTE nibbles = 0;
 for (int k=0; k<4; ++k)
 {
 nibbles = 0;
 // If the most significant bit is set...
 // Set the appropriate bit in the higher order nibble
 if (r & '\x80') nibbles |= 0x10;
 if (g & '\x80') nibbles |= 0x20;
 if (b & '\x80') nibbles |= 0x40;
 if (i & '\x80') nibbles |= 0x80;
 r<<=1; g<<=1; b<<=1; i<<=1;
 // Repeat for the lower order nibble
 if (r & '\x80') nibbles |= 0x01;
 if (g & '\x80') nibbles |= 0x02;
 if (b & '\x80') nibbles |= 0x04;
 if (i & '\x80') nibbles |= 0x08;
 r<<=1; g<<=1; b<<=1; i<<=1;
 *(lpImage + l++) = nibbles;
 }
 }
 }
 for (h=0; h<4; ++h)
 delete npPlane[h];
 // Create the DIB header
 PBITMAPINFO pBmi = (PBITMAPINFO)
 new BYTE[ sizeof(BITMAPINFOHEADER)+16*sizeof(RGBQUAD) ];
 if (!pBmi)
 {
 ErrorMessage("Out of memory.");
 GlobalUnlock(hImageMem);
 GlobalFree(hImageMem);
 return NULL;
 }
 PBITMAPINFOHEADER pBi = &pBmi->bmiHeader;
 pBi->biSize = sizeof(BITMAPINFOHEADER);

 pBi->biWidth = width;
 pBi->biHeight = height;
 pBi->biPlanes = 1;
 pBi->biBitCount = 4;
 pBi->biCompression = 0L;
 pBi->biSizeImage = 0L;
 pBi->biXPelsPerMeter = 0L;
 pBi->biYPelsPerMeter = 0L;
 pBi->biClrUsed = 0L;
 pBi->biClrImportant = 0L;
 if (header.pcxVersion == 3)
 // No PCX palette, use literal color values
 {
 DWORD* clrTab = (DWORD*)pBmi->bmiColors;
 clrTab[0] = 0x000000L;
 clrTab[1] = 0x000080L;
 clrTab[2] = 0x008000L;
 clrTab[3] = 0x008080L;
 clrTab[4] = 0x800000L;
 clrTab[5] = 0x800080L;
 clrTab[6] = 0x808000L;
 clrTab[7] = 0x808080L;
 clrTab[8] = 0xc0c0c0L;
 clrTab[9] = 0x0000ffL;
 clrTab[10] = 0x00ff00L;
 clrTab[11] = 0x00ffffL;
 clrTab[12] = 0xff0000L;
 clrTab[13] = 0xff00ffL;
 clrTab[14] = 0xffff00L;
 clrTab[15] = 0xffffffL;
 }
 else
 // Copy PCX palette to DIB color table
 {
 for (int i=0; i<16; ++i)
 {
 pBmi->bmiColors[i].rgbBlue = header.pcxPalette[i].b;
 pBmi->bmiColors[i].rgbGreen = header.pcxPalette[i].g;
 pBmi->bmiColors[i].rgbRed = header.pcxPalette[i].r;
 pBmi->bmiColors[i].rgbReserved = 0;
 }
 }
 HBITMAP hBitmap = CreateDIBitmap(hdc, pBi, CBM_INIT,
 (LPSTR)lpImage, pBmi, DIB_RGB_COLORS);
 delete pBmi;
 // Free image buffer
 GlobalUnlock(hImageMem);
 GlobalFree(hImageMem);
 return hBitmap;
}

///////////////////////////////////////////////////////////////////////////
// Decode256PCX Methods
///////////////////////////////////////////////////////////////////////////

/****************************************************************************
 METHOD: HBITMAP Decode256PCX::MakeDIB(HDC hdc)
 PURPOSE: Make 256-color DIB
 PARAMETERS: HDC hdc Handle to compatible DC

 RETURNS: Handle to DIB, NULL on error
****************************************************************************/
HBITMAP Decode256PCX::MakeDIB(HDC hdc)
{
 LONG lDIBBytesLine = ALIGN_DWORD(BytesLine);
 LONG image_size = lDIBBytesLine*height;
 // Allocate memory for the image buffer
 GlobalCompact(image_size);
 HANDLE hImageMem = GlobalAlloc(GMEM_MOVEABLE | GMEM_ZEROINIT, image_size);
 if (!hImageMem)
 {
 ErrorMessage("Out of memory."); return NULL;
 }
 BYTE huge* lpImage = (BYTE huge*) GlobalLock(hImageMem);
 if (!lpImage)
 {
 ErrorMessage("Memory error."); return NULL;
 }
 for (int h=height-1; h>=0; --h)
 read_pcx_line(lpImage+(lDIBBytesLine*h));
 // Create the DIB header
 PBITMAPINFO pBmi = (PBITMAPINFO)
 new BYTE[ sizeof(BITMAPINFOHEADER)+256*sizeof(RGBQUAD) ];
 if (!pBmi)
 {
 ErrorMessage("Out of memory.");
 GlobalUnlock(hImageMem);
 GlobalFree(hImageMem);
 return NULL;
 }
 PBITMAPINFOHEADER pBi = &pBmi->bmiHeader;
 pBi->biSize = sizeof(BITMAPINFOHEADER);
 pBi->biWidth = width;
 pBi->biHeight = height;
 pBi->biPlanes = 1;
 pBi->biBitCount = 8;
 pBi->biCompression = 0L;
 pBi->biSizeImage = 0L;
 pBi->biXPelsPerMeter = 0L;
 pBi->biYPelsPerMeter = 0L;
 pBi->biClrUsed = 0L;
 pBi->biClrImportant = 0L;
 // Look for the palette at the end of the file
 if (_llseek(hFile, -769L, 2) == -1)
 ErrorMessage("Error seeking palette.");
 // It should start with a 0Ch byte
 BYTE Id256Pal;
 if (!(_read(hFile, &Id256Pal, 1) == 1 && Id256Pal == '\xc'))
 ErrorMessage("No palette found.");
 PCXRGB* PalPCX = new PCXRGB[256];
 if (_read(hFile, PalPCX, 768) != 768)
 ErrorMessage("Error reading palette.");
 // Copy PCX palette to DIB color table
 for (int i=0; i<256; ++i)
 {
 pBmi->bmiColors[i].rgbBlue = PalPCX[i].b;
 pBmi->bmiColors[i].rgbGreen = PalPCX[i].g;
 pBmi->bmiColors[i].rgbRed = PalPCX[i].r;
 pBmi->bmiColors[i].rgbReserved = 0;

 }
 delete PalPCX;
 if (hPalette)
 DeleteObject(hPalette);
 // Create and set logical palette
 if ((hPalette = make_palette(pBmi->bmiColors)) != NULL)
 {
 SelectPalette(hdc, hPalette, 0);
 RealizePalette(hdc);
 }
 else
 {
 ErrorMessage("Cannot create palette");
 }
 HBITMAP hBitmap = CreateDIBitmap(hdc, pBi, CBM_INIT,
 (LPSTR)lpImage, pBmi, DIB_RGB_COLORS);
 delete pBmi;
 // Free image buffer
 GlobalUnlock(hImageMem);
 GlobalFree(hImageMem);
 return hBitmap;
}

/****************************************************************************
 METHOD: HPALETTE Decode256PCX::make_palette(RGBQUAD* pColors)
 PURPOSE: Make 256-color Logical Palette
 PARAMETERS: RGBQUAD[256] pColors Palette colors
 RETURNS: Handle to Palette, NULL on error
****************************************************************************/
HPALETTE Decode256PCX::make_palette(RGBQUAD* pColors)
{
 if (!pColors)
 return NULL;
 PLOGPALETTE pPal = (PLOGPALETTE)
 new BYTE[ sizeof(LOGPALETTE) + 256 * sizeof(PALETTEENTRY)];
 if (!pPal)
 return NULL;
 pPal->palNumEntries = 256;
 pPal->palVersion = 0x300;
 for (int i=0; i<256; ++i)
 {
 pPal->palPalEntry[i].peRed = pColors[i].rgbRed;
 pPal->palPalEntry[i].peGreen = pColors[i].rgbGreen;
 pPal->palPalEntry[i].peBlue = pColors[i].rgbBlue;
 pPal->palPalEntry[i].peFlags = 0;
 }
 HPALETTE hPal = CreatePalette(pPal);
 delete pPal;
 return hPal;
}
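
The two steps that do most of the work in Listing One, expanding a
run-length-encoded scanline and merging four bit planes into packed 4-bit
pixels, can be tried outside Windows. What follows is a sketch in portable C++
under stated assumptions: decode_pcx_line and pack_planes are hypothetical
names, the input is an in-memory byte vector rather than a file, and, unlike
the listing, the decoder clips a run that would overrun the scanline.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-alone helpers; the names are not from the listings.

// Decode one run-length-encoded scanline of bytesPerLine bytes from src,
// starting at pos. A byte with both high bits set (>= 0xC0) carries a
// repeat count in its low six bits, and the next byte is the value to
// repeat. Any other byte is literal data. Stops early if input runs out.
std::vector<std::uint8_t> decode_pcx_line(const std::vector<std::uint8_t>& src,
                                          std::size_t& pos,
                                          std::size_t bytesPerLine)
{
    std::vector<std::uint8_t> line;
    line.reserve(bytesPerLine);
    while (line.size() < bytesPerLine && pos < src.size()) {
        std::uint8_t b = src[pos++];
        if (b >= 0xC0) {
            std::size_t run = b & 0x3F;
            if (pos >= src.size()) break;        // truncated input
            std::uint8_t value = src[pos++];
            // Assumption: clip the run at the end of the scanline.
            while (run-- > 0 && line.size() < bytesPerLine)
                line.push_back(value);
        } else {
            line.push_back(b);
        }
    }
    return line;
}

// Combine one byte from each of four bit planes (eight pixels) into four
// bytes of packed 4-bit pixels, leftmost pixel in the high nibble.
// Plane 0 supplies bit 0 of each pixel, plane 3 supplies bit 3.
void pack_planes(const std::uint8_t plane[4], std::uint8_t out[4])
{
    assert(plane != nullptr && out != nullptr);
    for (int k = 0; k < 4; ++k) {
        unsigned packed = 0;
        for (int half = 0; half < 2; ++half) {
            int bit = 7 - (2 * k + half);        // this pixel's bit in each plane byte
            unsigned pixel = 0;
            for (int p = 0; p < 4; ++p)
                pixel |= ((plane[p] >> bit) & 1u) << p;
            packed |= (half == 0) ? (pixel << 4) : pixel;
        }
        out[k] = static_cast<std::uint8_t>(packed);
    }
}
```

Feeding the decoder the bytes C3 AA 05 C1 C0 with a five-byte scanline yields
AA AA AA 05 C0: a run of three, a literal, and a one-byte run used to escape a
data value that itself has both high bits set.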






[LISTING TWO]

#include <windows.h>

#include "pcxwin.h"

#include <stdlib.h>

static char szAppName[] = "PCXWIN";

#define MAXPATH 80
static char szFileName[MAXPATH+1] = "";
static char szUntitled[] = "PCXWIN - (Untitled)";

// Function Prototypes
int DoKeyDown(HWND hwnd, WORD wVkey);
int DoFileOpenDlg(HANDLE hInst, HWND hwnd);

LONG FAR PASCAL _export WndProc(HWND hwnd, WORD message,
 WORD wParam, LONG lParam);
BOOL FAR PASCAL _export FileOpenDlgProc(HWND hDlg, WORD message,
 WORD wParam, LONG lParam);
int PASCAL WinMain(HANDLE hInstance, HANDLE hPrevInstance, LPSTR,
 int nCmdShow)
{
 if (!hPrevInstance)
 {
 WNDCLASS wndclass;
 wndclass.style = CS_HREDRAW | CS_VREDRAW;
 wndclass.lpfnWndProc = WndProc;
 wndclass.cbClsExtra = 0;
 wndclass.cbWndExtra = 0;
 wndclass.hInstance = hInstance;
 wndclass.hIcon = LoadIcon(hInstance, "PCXWIN");
 wndclass.hCursor = LoadCursor(NULL, IDC_ARROW);
 wndclass.hbrBackground = GetStockObject(WHITE_BRUSH);
 wndclass.lpszMenuName = "PCXWIN";
 wndclass.lpszClassName = szAppName;
 RegisterClass(&wndclass);
 }
 HWND hwnd = CreateWindow(
 szAppName, szUntitled,
 WS_OVERLAPPEDWINDOW | WS_VSCROLL | WS_HSCROLL,
 CW_USEDEFAULT, CW_USEDEFAULT,
 CW_USEDEFAULT, CW_USEDEFAULT,
 NULL, NULL, hInstance, NULL
 );
 ShowWindow(hwnd, nCmdShow);
 UpdateWindow(hwnd);
 HANDLE hAccel = LoadAccelerators(hInstance, szAppName);
 MSG msg;
 while(GetMessage(&msg, NULL, 0, 0))
 {
 if (!TranslateAccelerator(hwnd, hAccel, &msg))
 {
 TranslateMessage(&msg);
 DispatchMessage(&msg);
 }
 }
 return msg.wParam;
}
LONG FAR PASCAL _export WndProc(HWND hwnd, WORD message,
 WORD wParam, LONG lParam)
{

 static HANDLE hInst;
 static PCX* pcx;
 static Scroller* scroll;
 HDC hdc;
 switch(message)
 {
 case WM_CREATE :
 hInst = ((LPCREATESTRUCT) lParam)->hInstance;
 pcx = new PCX;
 scroll = new Scroller(hwnd);
 return 0L;
 case WM_DESTROY :
 delete pcx;
 delete scroll;
 PostQuitMessage(0);
 return 0L;
 case WM_PAINT :
 PAINTSTRUCT ps;
 hdc = BeginPaint(hwnd, &ps);
 RECT rcClient;
 GetClientRect(hwnd, &rcClient);
 pcx->Display(hdc, scroll->Pos(), rcClient);
 EndPaint(hwnd, &ps);
 return 0L;
 case WM_QUERYNEWPALETTE:
 if (pcx->Palette())
 {
 hdc = GetDC(hwnd);
 SelectPalette(hdc, pcx->Palette(), 0);
 BOOL b = RealizePalette(hdc);
 ReleaseDC(hwnd, hdc);
 if (b)
 {
 InvalidateRect(hwnd, NULL, 1);
 return 1L;
 }
 }
 return 0L;
 case WM_SIZE :
 scroll->Size(pcx->Size());
 return 0L;
 case WM_VSCROLL :
 scroll->Vert(wParam, LOWORD(lParam));
 return 0L;
 case WM_HSCROLL :
 scroll->Horz(wParam, LOWORD(lParam));
 return 0L;
 case WM_KEYDOWN :
 return DoKeyDown(hwnd, wParam);
 case WM_COMMAND :
 switch (wParam)
 {
 case IDM_OPEN :
 if (DoFileOpenDlg(hInst, hwnd))
 {
 hdc = GetDC(hwnd);
 if (pcx->Read(szFileName, hdc))
 {
 char wtext[70];

 wsprintf(wtext, "PcxWin - %.40s (%u x %u)",
 AnsiUpper(szFileName),pcx->Width(), pcx->Height());
 SetWindowText(hwnd, wtext);
 }
 else
 {
 SetWindowText(hwnd, szUntitled);
 }
 ReleaseDC(hwnd, hdc);
 POINT ptNewPos = {0,0};
 scroll->Pos(ptNewPos);
 scroll->Size(pcx->Size());
 }
 InvalidateRect(hwnd, NULL, TRUE);
 break;
 case IDM_ABOUT:
 MessageBox(hwnd, "PCXWIN (c) Paul Chui, 1991",
 "About PCXWIN...", MB_OK | MB_ICONINFORMATION);
 break;
 case IDM_EXIT :
 DestroyWindow(hwnd);
 break;
 case IDM_COPY :
 OpenClipboard(hwnd);
 EmptyClipboard();
 SetClipboardData(CF_BITMAP, pcx->Bitmap());
 CloseClipboard();
 }
 return 0L;
 }
 return DefWindowProc(hwnd, message, wParam, lParam);
}
int DoKeyDown(HWND hwnd, WORD wVkey)
{
 switch (wVkey)
 {
 case VK_HOME : SendMessage(hwnd, WM_VSCROLL, SB_TOP, 0L); break;
 case VK_END : SendMessage(hwnd, WM_VSCROLL, SB_BOTTOM, 0L); break;
 case VK_PRIOR : SendMessage(hwnd, WM_VSCROLL, SB_PAGEUP, 0L); break;
 case VK_NEXT : SendMessage(hwnd, WM_VSCROLL, SB_PAGEDOWN, 0L); break;
 case VK_UP : SendMessage(hwnd, WM_VSCROLL, SB_LINEUP, 0L); break;
 case VK_DOWN : SendMessage(hwnd, WM_VSCROLL, SB_LINEDOWN, 0L); break;
 case VK_LEFT : SendMessage(hwnd, WM_HSCROLL, SB_PAGEUP, 0L); break;
 case VK_RIGHT : SendMessage(hwnd, WM_HSCROLL, SB_PAGEDOWN, 0L); break;
 }
 return 0;
}
BOOL DoFileOpenDlg(HANDLE hInst, HWND hwnd)
{
 FARPROC lpfnFileOpenDlgProc = MakeProcInstance((FARPROC)FileOpenDlgProc,
 hInst);
 BOOL bReturn = DialogBox(hInst, "FileOpen", hwnd, lpfnFileOpenDlgProc);
 FreeProcInstance(lpfnFileOpenDlgProc);
 return bReturn;
}
BOOL FAR PASCAL FileOpenDlgProc(HWND hDlg, WORD message, WORD wParam, LONG)
{
 switch(message)
 {

 case WM_INITDIALOG :
 SendDlgItemMessage(hDlg, IDD_FNAME, EM_LIMITTEXT, MAXPATH, 0L);
 SetDlgItemText(hDlg, IDD_FNAME, szFileName);
 return TRUE;
 case WM_COMMAND :
 switch(wParam)
 {
 case IDOK :
 GetDlgItemText(hDlg, IDD_FNAME, szFileName, MAXPATH);
 EndDialog(hDlg, TRUE);
 return TRUE;
 case IDCANCEL :
 szFileName[0] = '\0'; // erase the string
 EndDialog(hDlg, FALSE);
 return TRUE;
 }
 }
 return FALSE;
}






[LISTING THREE]

#ifndef PCXWIN_H
#define PCXWIN_H

#define IDM_OPEN 0x10
#define IDM_EXIT 0x11
#define IDM_ABOUT 0x12
#define IDM_COPY 0x20
#define IDD_FNAME 0x20

class PCX {
public:
 PCX();
 ~PCX();
virtual BOOL Read(LPSTR lpszFileName, HDC theHdc);
virtual BOOL Display(HDC hdc, POINT& pos, RECT& rect);
 POINT Size();
 WORD Width();
 WORD Height();
 HBITMAP Bitmap();
 HPALETTE Palette();
private:
 WORD wWidth, wHeight;
 HBITMAP hBitmap;
 HPALETTE hPalette;
 int hFile; // Input file handle
};
inline POINT PCX::Size() { POINT p = {wWidth,wHeight}; return p; }
inline WORD PCX::Width() { return wWidth; }
inline WORD PCX::Height() { return wHeight; }
inline HBITMAP PCX::Bitmap() { return hBitmap; }
inline HPALETTE PCX::Palette() { return hPalette; }


class Scroller {
public:
 Scroller(HWND hwnd);
 int Size(POINT& ptImgSize);
 int Vert(WORD wSBcode, WORD wSBPos);
 int Horz(WORD wSBcode, WORD wSBPos);
 POINT Pos();
 POINT Pos(POINT& ptNewPos);
private:
 HWND hClientWnd;
 POINT ptPos; // Current Scroll position
 POINT ptMax; // Max Scroll range
 POINT ptInc; // Scroll increment
 POINT ptClient; // Size of client window
};
inline POINT Scroller::Pos() { return ptPos; }
inline POINT Scroller::Pos(POINT& ptNewPos) { return ptPos = ptNewPos; }
inline void ErrorMessage(PSTR message)
{
 MessageBox(NULL, (LPSTR) message, (LPSTR) "Error", MB_OK | MB_ICONEXCLAMATION);
}
/* The standard max and min macros are undefined by BC++ because
 they may conflict with class-defined macros with the same names. */
#define MAX(a,b) (((a) > (b)) ? (a) : (b))
#define MIN(a,b) (((a) < (b)) ? (a) : (b))
#endif





[LISTING FOUR]

#include <windows.h>
#include "pcxwin.h"

//////////////////////// Class Scroller ////////////////////////////////
Scroller::Scroller(HWND hwnd)
{
 ptPos.x = 0; ptPos.y = 0;
 ptMax.x = 0; ptMax.y = 0;
 ptInc.x = 0; ptInc.y = 0;

 RECT rect;
 GetClientRect(hwnd, &rect);
 ptClient.x = rect.right; ptClient.y = rect.bottom;
 hClientWnd = hwnd;
}
int Scroller::Size(POINT& ptImgSize)
{
 RECT rect;
 GetClientRect(hClientWnd, &rect);
 ptClient.x = rect.right; ptClient.y = rect.bottom;
 ptMax.x = MAX(0, ptImgSize.x - ptClient.x);
 ptPos.x = MIN(ptPos.x, ptMax.x);
 SetScrollRange(hClientWnd, SB_HORZ, 0, ptMax.x, FALSE);
 SetScrollPos(hClientWnd, SB_HORZ, ptPos.x, TRUE);
 ptMax.y = MAX(0, ptImgSize.y - ptClient.y);
 ptPos.y = MIN(ptPos.y, ptMax.y);

 SetScrollRange(hClientWnd, SB_VERT, 0, ptMax.y, FALSE);
 SetScrollPos(hClientWnd, SB_VERT, ptPos.y, TRUE);
 return 0;
}
int Scroller::Vert(WORD wSBcode, WORD wSBPos)
{
 switch (wSBcode)
 {
 case SB_LINEUP :
 ptInc.y = -1;
 break;
 case SB_LINEDOWN :
 ptInc.y = 1;
 break;
 case SB_PAGEUP :
 ptInc.y = MIN(-1, -ptClient.y/4);
 break;
 case SB_PAGEDOWN :
 ptInc.y = MAX(1, ptClient.y/4);
 break;
 case SB_TOP :
 ptInc.y = -ptPos.y;
 break;
 case SB_BOTTOM :
 ptInc.y = ptMax.y - ptPos.y;
 break;
 case SB_THUMBPOSITION :
 ptInc.y = wSBPos - ptPos.y;
 break;
 default :
 ptInc.y = 0;
 }
 if (( ptInc.y = MAX(-ptPos.y, MIN(ptInc.y, ptMax.y - ptPos.y)) ) != 0)
 {
 ptPos.y += ptInc.y;
 ScrollWindow(hClientWnd, 0, -ptInc.y, NULL, NULL);
 SetScrollPos(hClientWnd, SB_VERT, ptPos.y, TRUE);
 UpdateWindow(hClientWnd);
 }
 return 0;
}
int Scroller::Horz(WORD wSBcode, WORD wSBPos)
{
 switch (wSBcode)
 {
 case SB_LINEUP :
 ptInc.x = -1;
 break;
 case SB_LINEDOWN :
 ptInc.x = 1;
 break;
 case SB_PAGEUP :
 ptInc.x = MIN(-1, -ptClient.x/4);
 break;
 case SB_PAGEDOWN :
 ptInc.x = MAX(1, ptClient.x/4);
 break;
 case SB_THUMBPOSITION :
 ptInc.x = wSBPos - ptPos.x;

 break;
 default :
 ptInc.x = 0;
 }
 if (( ptInc.x = MAX(-ptPos.x, MIN(ptInc.x, ptMax.x - ptPos.x)) ) != 0)
 {
 ptPos.x += ptInc.x;
 ScrollWindow(hClientWnd, -ptInc.x, 0, NULL, NULL);
 SetScrollPos(hClientWnd, SB_HORZ, ptPos.x, TRUE);
 UpdateWindow(hClientWnd);
 }
 return 0;
}
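
Both Vert and Horz end by clamping the requested increment so that the new
position stays between zero and the maximum scroll range. A minimal sketch of
that clamp, written as a free function in portable C++ (the name
clamp_scroll_step is hypothetical and does not appear in the listing):

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not part of the listings: limit a requested scroll
// step so that pos + step stays inside the range [0, maxPos].
int clamp_scroll_step(int pos, int maxPos, int step)
{
    assert(0 <= pos && pos <= maxPos);
    // Never move back past zero, never move forward past maxPos.
    return std::max(-pos, std::min(step, maxPos - pos));
}
```

A result of zero means the view is already at the limit in that direction,
which is why the listing skips ScrollWindow and SetScrollPos in that case.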





[LISTING FIVE]

NAME PCXWIN

DESCRIPTION 'PCX Viewer (c) Paul Chui, 1991'
EXETYPE WINDOWS
STUB 'WINSTUB.EXE'
CODE PRELOAD MOVEABLE DISCARDABLE
DATA PRELOAD MOVEABLE MULTIPLE
HEAPSIZE 1046
STACKSIZE 8192
PROTMODE





[LISTING SIX]

#include <windows.h>
#include "pcxwin.h"

PCXWin MENU
BEGIN
 POPUP "&File"
 BEGIN
 MENUITEM "&Open" IDM_OPEN
 MENUITEM SEPARATOR
 MENUITEM "E&xit" IDM_EXIT
 MENUITEM "A&bout PCXWIN..." IDM_ABOUT
 END
 POPUP "&Edit"
 BEGIN
 MENUITEM "&Copy\tCtrl+Ins" IDM_COPY
 END
END
FILEOPEN DIALOG DISCARDABLE LOADONCALL PURE MOVEABLE 10, 35, 129, 56
STYLE WS_POPUP | WS_CAPTION | WS_SYSMENU | 0x80L
CAPTION "Open File"
BEGIN
 CONTROL "File &name:" -1, "STATIC", WS_CHILD | WS_VISIBLE, 8, 7, 56, 12
 CONTROL "" IDD_FNAME, "EDIT", WS_CHILD | WS_VISIBLE | WS_BORDER |
 WS_TABSTOP | 0x80L, 7, 15, 116, 12
 CONTROL "OK" IDOK, "BUTTON", WS_CHILD | WS_VISIBLE |
 WS_TABSTOP, 15, 36, 40, 12
 CONTROL "Cancel" IDCANCEL, "BUTTON", WS_CHILD | WS_VISIBLE |
 WS_TABSTOP, 69, 36, 40, 12
END
PCXWin ACCELERATORS
{
 VK_INSERT, IDM_COPY, VIRTKEY, CONTROL
}





[LISTING SEVEN]

.AUTODEPEND
# *Translator Definitions*
CC = bccx +PCXWIN.CFG
TASM = TASM
TLINK = tlink
# *Implicit Rules*
.c.obj:
 $(CC) -c {$< }
.cpp.obj:
 $(CC) -c {$< }
# *List Macros*
Link_Exclude = \
 pcxwin.res
Link_Include = \
 pcxwin.obj \
 showpcx.obj \
 scroller.obj \
 pcxwin.def
# *Explicit Rules*
pcxwin.exe: pcxwin.cfg $(Link_Include) $(Link_Exclude)
 $(TLINK) /v/x/n/c/Twe/P-/LC:\CPP\LIB @&&|
c0ws.obj+
pcxwin.obj+
showpcx.obj+
scroller.obj
pcxwin
 # no map file
cwins.lib+
import.lib+
maths.lib+
cs.lib
pcxwin.def
|
 RC -T pcxwin.res pcxwin.exe
# *Individual File Dependencies*
pcxwin.obj: pcxwin.cpp
showpcx.obj: showpcx.cpp
scroller.obj: scroller.cpp
pcxwin.res: pcxwin.rc
 RC -R -IC:\CPP\INCLUDE -FO pcxwin.res PCXWIN.RC
# *Compiler Configuration File*

pcxwin.cfg: pcxwin.mak
 copy &&|
-v
-W
-H=PCXWIN.SYM
-IC:\CPP\INCLUDE
-LC:\CPP\LIB
| pcxwin.cfg




























July, 1991
PROGRAMMING PARADIGMS


A Language Without a Name: Part I




MICHAEL SWAINE


Back in 1983, Bob Jarvis founded a company named Wizard Systems. Wizard's main
product was Wizard C, a C compiler that Bob wrote and that later turned into
Borland's Turbo C. Since leaving Borland a year ago, Bob has been working on a
new programming language. At this writing, the language doesn't have a name,
its original name having failed to make it over the trademark search hurdle,
so in this two-part interview I've referred to it as The Language.
Since the commercial success rate for new languages developed by individuals
not employed by AT&T is less than stunning, I was curious about Bob's motives.
I was pretty sure he wasn't putting in all this effort strictly for the
experience. When he immediately pronounced the prospects for any new language
developed today "dim," I got more curious. Eventually he told me what he has
in mind for The Language, but not until he had given me his views on the
aesthetics and maturity of C++, criticized Borland's and Microsoft's corporate
strategies, reported on second thoughts on Ada at the DoD, and soul-searching
about the right kind of parallelism at Cray Research, and shared some
calculated guesses about the programming languages of the year 2001. Strangely
enough, all of these things proved to be eminently relevant to my original
question: Why is Bob Jarvis writing a new programming language?
DDJ: It seems daring to be writing a new language at this time. What do you
think the prospects are for a new language developed today?
BJ: I'd have to say they are dim.
DDJ: Dim, eh?
BJ: Let's put it this way: Almost all of the languages that are popular today
started out with very modest goals, usually from a very small number of people
tinkering in their basement. Some of them, like Niklaus Wirth, are obviously
very respected designers, but Dennis Ritchie was at the time an unknown. And
it took years and years for the languages to gain international popularity.
They built by word of mouth. C, for example: Enough people liked it that it
developed a kind of underground support network.
So I have very realistic views about the potential for new languages. I put
together something that I personally find fun to program in, that solves some
problems that I had, and hopefully other people will like it and want to
program in it, too. Who knows where it goes from there?
DDJ: Still, I imagine you'd like it if your language inspired the kind of
grass-roots support that C had. You were part of that support network for C
when you wrote Wizard C, which has since become Turbo C.
BJ: And now of course it's modified into C++, but that's not my fault.
DDJ: You're not a fan of C++, I take it?
BJ: I am on record as being an anti-C++ guy. That should not be taken as a
comment about Borland; I think that they made a very sound business decision
in doing C++. I think they quite rightly decided that they wanted to satisfy
that large demand. My objections to C++ are technical and aesthetic.
DDJ: Such as?
BJ: Well, it's much too large a language. There are too many features. There
are statements that you can't tell whether they are expressions or
declarations, and you may not be able to tell by parsing, even by the time you
get to a semicolon. The same sequence of characters may be validly
interpretable as a declaration or as an expression. There is a resolution rule
which you're supposed to choose if you can't decide, but that kind of parsing
rule is pretty objectionable. It's certainly not the way I would define a
language.
And some of the new stuff that they're doing with parameterized types falls in
that same category. For example, the way that you define a parameterized type
is you enclose it in angle brackets. Well, angle brackets are greater than and
less than, and there are circumstances where you cannot tell whether a greater
than is the ending angle bracket for a parameterized type or an operator in an
expression in one of the parameters for that type. Worse, if you have a
parameterized type that takes another parameterized type as its parameter, you
end up with two closing angle brackets. And if you leave the space out, then
you've created a right-shift operator, and you get a syntax error.
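
The nested-bracket problem described here is easy to reproduce. The sketch
below is illustrative only and is not from the interview: it uses std::vector
from the later standard library as a stand-in for any parameterized type, and
Grid, make_grid, and Choice are hypothetical names. Compilers that follow
C++11 or later accept ">>" as two closing brackets, so the space matters only
for older compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A parameterized type whose parameter is itself parameterized.
// The space in "> >" keeps the two closing brackets from being read as
// the right-shift operator by compilers that predate C++11.
typedef std::vector<std::vector<int> > Grid;

// Hypothetical helper: build a rows x cols grid filled with zeros.
Grid make_grid(std::size_t rows, std::size_t cols)
{
    assert(rows > 0 && cols > 0);
    return Grid(rows, std::vector<int>(cols, 0));
}

// Inside a template argument list, a bare '>' would end the list,
// so a greater-than comparison has to be parenthesized.
template <bool Flag>
struct Choice {
    static const bool value = Flag;
};
typedef Choice<(3 > 2)> ThreeIsGreater;
```

Without the parentheses, Choice<3 > 2> would be parsed as Choice<3> followed by
a stray "2>", which is the ambiguity between a closing bracket and an operator.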
DDJ: Be that as it may, C++ is certainly entrenched.
BJ: Yeah. That's one of the reasons why, when the subject came up at Borland
and there was some discussion about whether they should do a subset or the
full language, I said, as much as I may not have liked the language, that we
really needed to do the full language. Of course I wasn't alone in that view.
And I think Borland has been proven right: They've sold a ton of compilers.
But I think C++ has succeeded as much as it has because AT&T is very solidly
behind it and is pushing a whole lot of people into using it.
I was talking to the people at Cray and, although I'm sure this is not the
official opinion of Cray, one of their engineers said, "We don't have that
much demand from our customers. We have a little, but not enough to do our own
compiler yet." But AT&T is going to put more and more of the Unix operating
system into C++; there will be additional utilities and additional
functionality; X Windows, for example. So from Cray's standpoint, they're
eventually going to have to do a C++ compiler for the simple reason that they
have to run Unix on their machines.
DDJ: C++ has more than word of mouth behind it. Like Ada.
BJ: There have been a few very well publicized efforts like Ada of major teams
getting together to design languages and then ramming them down people's
throats. Ada is obviously still with us, but I was talking with someone from
the Department of Defense about Ada programming on Crays, and it turns out
that the only people that are doing Ada programming on Crays happen to be oil
companies. Don't ask me why. The Department of Defense guy was saying, "Yeah,
we don't do any Ada programming." And I said, "Wait a minute. Isn't it your
language?" And he said, "Yeah, but all our people got exceptions, so we don't
have to." Nobody wants to use it inside the DoD.
DDJ: But as you say, Ada is still with us, and C++ is all over us. Although it
may not be much consolation to independent language developers among Dr.
Dobb's readers, it seems that, if that big institution is in there pushing, a
language can become well entrenched pretty quickly.
BJ: If you've got a company pushing a language, that can make for some very
rapid success, and I think in the case of C++ that may have hurt the language,
because it's very immature. It's changing very rapidly. For it to be as
popular as it is and to still be changing so quickly, where virtually every
six months to a year you're getting a major new release with massive new
features, and where the standards committee is years away from completing a
standard -- the fact there are all these people using it means that it's going
to be around, but I think it's going to be a very unstable picture for quite a
few years to come.
DDJ: I take it The Language avoids some of the problems you have with C++. Did
you conceive it as an alternative to C++?
BJ: I started working on it as a sort of thought process while at Borland. We
were doing all this C++, and although I didn't like C++, I also didn't really
know it. I thought I should really learn the issues of object-oriented
programming. Then there were also some techniques in incremental compilation
that I wanted to explore.
DDJ: How did you start? How do you begin writing a new programming language?
BJ: I came up with three really strong ideas that I wanted. One, I wanted a
systems programming language that you could use to write efficient systems
programs, like operating systems and compilers. And I wanted to have OOP, but
fairly modest OOP; not anything as elaborate as C++ has. And the third thing
was that I wanted to make it easy to do incremental compilation, because I
wanted to explore that area of program development. So I made a number of
changes to C. I added some features and made some syntactic changes and took
out a few things and came up with The Language.
DDJ: You make it sound easy. But there's more to it than that; what are you
doing these days?
BJ: I've been writing a 32-bit operating system for the 80386 completely in
The Language. So right now I have -- I got everything running fragilely about
last July and have been refining it and extending it ever since, and am now
looking into issues like networking and all that sort of stuff. But I can
actually do development on the system, and it's actually a lot more stable
than developing under DOS because it's a protected-mode operating system. It
supports the DOS file system, has a command-line shell with long command lines
and I/O redirection, and you can spawn processes off into the background;
there's full multitasking. All the device drivers are written in The Language.
I don't use any BIOS calls at all. I never switch it to 16-bit mode. And
there's an integrated development environment with, so far, a very primitive
source-level debugger. And I'm just having fun writing all this neat software.
DDJ: Assuming for the moment that you're doing all this for your own amusement
and edification, what have you learned?
BJ: I've found out a few things. One is that the 80386 is a whole lot easier
to generate code for than the 8086. In the space between February and July of
1990, I went from not having any 32-bit software at all to having a complete
self-sustaining operating system. And that's including not having line one of
a 32-bit code generator; I had a 16-bit code generator and a parser written.
It took me about a month and a half to get a fairly stable 386 code generator
running.
DDJ: What about performance?
BJ: I benchmarked it against the same source code. It turns out that
translating C code into The Language involves a fair amount of manual
rewriting of the declarations, but functionally you can take C code and it
pretty much translates; and benchmarks are especially easy because they're all
one module. To verify the numbers I even went back to my 16-bit code generator
and reran the benchmarks, so that it was the same language and just the two
different modes. And I consistently found that things like the Sieve ran five
to ten percent faster in 32-bit mode, and the Dhrystone ran 25 percent faster.
That's with a very simple-minded, not very finely tuned 32-bit code generator
against Turbo C, with a code generator that has had four or five years or six
years worth of hand tuning and optimization work. And this was on the same
processor, so it was as close as you could get to an apples-to-apples code
generation test between 16- and 32-bit models. I was very surprised by that. I
did not expect to have such an improvement in speed between the two modes.
That was very pleasing. So, for example, the disk performance in my operating
system is actually better than DOS disk performance in several ways. It took
me less effort than I thought it would to get high-performance systems
programming. So the bottom line is that I feel like I've succeeded in that
area: I think I've got a language that can be used to write fairly efficient
systems programming.
DDJ: What about the object-oriented features of The Language?
BJ: If you compare the OOP features of The Language, they're very much like
the Turbo Pascal OOP extensions: simple single inheritance, none of this stuff
of friends and multiple inheritance. When you define a class, the methods have
to be defined within the body of that class. You have all the control of
public and private visibility and inheritability of objects and you have
static and dynamic binding of methods, very much like virtual functions in
C++.
DDJ: How was your experience in getting into object-oriented programming?
BJ: So far as OOP goes, my experience is that OOP isn't quite the panacea that
everyone makes it. It's got some very interesting aspects, and for certain
kinds of programming, like for developing windows kinds of environments, I
think there's some real value to it. But we're not going to have software ICs
and completely plug-compatible libraries. That's a lot of wishful thinking.
DDJ: Because?
BJ: Object-oriented programming makes it easier to design good library
interfaces and good generic classes, but it doesn't make it mandatory. You
still have the problem that, if you look at the problem that you have at hand
today and design your libraries and your support classes to fit that problem,
there's no guarantee that you've anticipated all the ways that the next
problem could be different. It's only after you've gone through the task of
trying to adapt this library to several different problems that you can be
reasonably sure that you've got a fairly general solution. You still have to
work fairly hard to get reusable software.
For example, in the integrated development environment in one of its fairly
early versions, I had the editor bound together with the compiler, very much
like Turbo C does it. But when the compiler spit out error messages it would
write all over the editor as it was currently running. I said this is silly;
it looks really terrible; I'm going to have to put in a message window like
Turbo C has.
Well, I was able to take the editor class for text editors, inherit a new
subclass for it for message windows, turn off all of the text-modification
events, add maybe two or three subroutines, and it took me all of three hours
to get a message window that allowed full horizontal and vertical scrolling,
marking a block of text in the message window and cutting it and pasting it
into another window, and positioning on a message and hitting return to put
you on the source alongside the message. All of that functionality took a very
short amount of time and a very small amount of code.
But the point is that I did have to go back and reengineer the editor class a
little bit to make that easier. I find that the adaptability of software with
OOP is very good, but occasionally you do have to go back to your base classes
and make some modifications, because uses come along that you didn't
anticipate.
DDJ: So I guess you're not saying that the emperor has no clothes, but that he
may not be clad appropriately for unexpected changes in the weather.
BJ: There's a tremendous amount of hype out there about OOP. It's been very
heavily oversold by some people. But I thought at the time I started this
project that it had value and I still do. It's not quite what I thought it was
going to be; you do have to change the way you think about your programs.
DDJ: What about the third aspect of The Language, which I think you said had
to do with making incremental compilation easier?
BJ: Well, one of the changes I made was I got rid of include files and went to
a binary module format very much like Modula-2 or Turbo Pascal has.
The early version didn't allow mutual references between modules. Things had
to be in a very strict hierarchy of references. And one of the things I found,
as I was adapting old C code that I had lying around and converting things
over to use this new format, was that the references in my C code were pretty
much spaghetti. I had very low-level modules with references back to top-level
routines buried in them and so everything was referring to everything else,
and all of that had not been visible because the linker was doing all that
magic for me, and it didn't care about any circular references; it just
resolved them.
So all of a sudden I was finding that the hierarchical structure of the
modules in the program really didn't make any sense, because there was no such
structure. And the process of building that structure to make these modules
and to make the import/export of information between modules work better was
quite difficult. I eventually realized that there are times when you really
need to do recursion from a low-level routine back up to a high-level routine.
So I put in the ability to do cross-module references. I talked to the people
who did Turbo Pascal and they had the same problem with their modules: They
started out having a strict hierarchy and they had to give that up.
DDJ: The binary module format you mentioned; can you expand on that? What does
that buy you?

BJ: Going to binary headers massively improves compile speed. I think the
latest version of Turbo C++ also supports some form of binary headers, or
precompiled headers, and I think that they found the same sort of massive
speedup in processing things like Windows.h. You just precompile it once and
then use the binary. When you build a large program in [the version of Turbo C
I'm testing against], you end up recompiling all these headers over and over
again, so to compile a large project you might actually compile 120,000 lines
of source even though that consisted of 2000 lines of headers and 20,000 lines
of C code. In this new model, when you have 22,000 lines of source, you
compile 22,000 lines. So the speedup is twofold: First of all, the individual
compilation units are faster to compile, but you're also visiting a lot fewer
lines in the total build.
DDJ: What kinds of results are you getting?
BJ: The effective compile speed for a large application in The Language is not
as fast as Turbo Pascal. I think Turbo Pascal is still up in the stratosphere
of like 70,000 lines a minute on a fast machine. I'm only around 30,000 lines
a minute on a fast machine, but that's against Turbo C, which clocks out at
about 10-12,000 lines per minute on a fast machine without the binary headers.
I'm sure [the new version's binary header capability] boosts that quite a bit.
Going to binary headers helps in a whole lot of ways. It simplifies the
language. You don't have to do forward declarations in The Language. You have
one place where you define it; anybody else that wants to use the symbol that
you defined has to import the module that it's in, and that's it. Modula-2 and
Pascal have a dual, two-part definition, so you have the interface definition
and then you have the implementation. I just merge them together, and the
public information gets sort of filtered out as it does the compile.
Editor's Note: Michael and Bob continue their conversation next month,
speculating about the nature of development tools and computer systems in the
years to come.


July, 1991
C PROGRAMMING


D-Flat Message Processing


 This article contains the following executables: DFLAT.791 DFLAT3.ARC


AL STEVENS


This is the third installment in the continuing saga of D-Flat. Many of you
have responded to this project. Many have downloaded the preliminary source
code library from CompuServe and TelePath, have compiled it, and are sending
me bug reports already.
One of the problems with the preliminary publication of the code is that no
documentation exists yet for the D-Flat API's messages, macros, and functions.
I'll be producing and posting the API and user's documentation Real Soon Now.
In the meantime, you must use your hacking talents to ferret out what you need
to know from the code. If you get stumped, send me questions on CompuServe.


Events, Messages, and Windows


This month's chapter addresses the event and message mechanisms that D-Flat
uses. I devoted a column to event-driven architectures in the March 1991
issue. Following is a recap of how it works.
A program that uses the event-driven architecture reacts to events from the
outside -- keystrokes, mouse actions, the clock. The system's software
converts these hardware and user events into messages that it sends to the
applications software. The applications software lies dormant waiting for a
message to come along. When one does, the software processes it, and can send
other messages to other parts of itself and to the systems software.
A window-based event-driven architecture uses the video window as the
applications object to and from which messages pass. Windows receive and
process messages, and they send messages to other windows and to the system.
If you are not accustomed to this paradigm, it will not seem to make sense at
first.
In the traditional function-oriented program, data items do not do such things
as send and receive messages. They simply exist, and functions do things to
them. In an event-driven program, events happen, and the system sends messages
to windows. A window is a data item, an object. It receives and sends
messages. How can that be? It isn't simple, but it's not much different from
what you are used to. Instead of writing a function that does things to a
window, you write a function that represents the window itself and that
receives, processes, and sends messages.
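As a rough sketch of that idea, and not D-Flat's own code, a window can be written as a function that takes a message and two parameters. The message codes, the DEMOWINDOW structure, and the DemoWindowProc name are all assumptions made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical message codes, for illustration only. */
typedef enum { MSG_KEYBOARD, MSG_PAINT, MSG_CLOSE } MSG;

/* State that belongs to one window. */
typedef struct {
    char text[32];   /* characters typed so far */
    int  length;     /* number of characters in text */
    int  open;       /* nonzero while the window exists */
} DEMOWINDOW;

/* The function that represents the window: it receives each message,
   processes it, and returns a result to the sender. */
int DemoWindowProc(DEMOWINDOW *wnd, MSG msg, long p1, long p2)
{
    (void)p2;                        /* unused in this sketch */
    switch (msg) {
    case MSG_KEYBOARD:               /* p1 carries the key code */
        if (wnd->length < (int)sizeof wnd->text - 1) {
            wnd->text[wnd->length++] = (char)p1;
            wnd->text[wnd->length] = '\0';
        }
        return 1;
    case MSG_PAINT:                  /* show the window's contents */
        printf("[%s]\n", wnd->text);
        return 1;
    case MSG_CLOSE:
        wnd->open = 0;
        return 1;
    }
    return 0;                        /* message not recognized */
}
```

Nothing outside the function touches the window's data directly. Callers only send messages, which is the shift in thinking the paragraph above describes.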
This behavior of passing messages among windows and between windows and the
system is the foundation of object-oriented programming, a paradigm that
embodies other unique characteristics besides message passing. It is no wonder
that most of the program development environments for Windows and similar
event-driven platforms are based in object-oriented programming.
Here's a simple scenario. The application software establishes a window on the
screen and then waits for something to happen. The systems software is in a
loop watching the hardware. The user presses a key. That keystroke is an
event. The systems software translates the keystroke into a message and sends
the message to the window.
The window has an event processing function that receives and processes all
its messages. It receives the keystroke message and does whatever a window of
its class does with a keystroke. Perhaps it sends the system a message that
says, "Where is the cursor positioned?" The system answers such messages.
Perhaps the window establishes another window and sends it a message that
says, "Take this text and display it."
There are three basic message types -- event messages, messages from windows
to the system, and messages to windows from other windows or the system. Event
messages report mouse, keyboard, and clock events to the windows. Messages
from windows to the system include such actions as positioning and requesting
the position of the mouse and keyboard cursor. Messages to windows tell the
window to do something, such as repaint its data space or change its size.
Messages to windows can also ask the window to tell the message sender
something about the message receiver.
D-Flat, like other window-based systems, supports a set of standard window
classes. Last month's column published the class definition code. You can
build an entire application by using the standard menus, dialog boxes, list
boxes, buttons, and so on. You can also derive new window classes from the
existing ones and write window processing functions that manage the unique
qualities of your new window class. A derived window class can retain some or
all of the behavior of its base class. A window's behavior is a function of
its reaction to messages. A derived window class can introduce and process new
messages, and it can intercept and process existing messages that its base
class would have processed in other ways. This ability to derive new window
classes that override the behavior of their base class is another area where
this architecture resembles object-oriented programming.
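One way to picture a derived class overriding its base is a window function that handles the messages it cares about and passes everything else along. This is only a sketch under assumed names: DEMO_PAINT, DEMO_KEYBOARD, BaseProc, and DerivedProc are hypothetical, and the return codes exist only to show which function handled the message.

```c
/* Hypothetical message codes; D-Flat's real list is in message.h. */
enum { DEMO_PAINT = 1, DEMO_KEYBOARD = 2 };

/* Base class behavior: the return value identifies what was done. */
int BaseProc(int msg, long p1, long p2)
{
    (void)p1; (void)p2;              /* unused in this sketch */
    switch (msg) {
    case DEMO_PAINT:    return 100;  /* base class painting */
    case DEMO_KEYBOARD: return 200;  /* base class keyboard handling */
    }
    return 0;                        /* unknown message */
}

/* Derived class: overrides keyboard handling, inherits the rest. */
int DerivedProc(int msg, long p1, long p2)
{
    if (msg == DEMO_KEYBOARD)
        return 201;                  /* new behavior replaces the base's */
    return BaseProc(msg, p1, p2);    /* defer everything else to the base */
}
```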


The Message Software


Listing One, page 146, is message.h. It begins with some message-related
global definitions. MAXMESSAGES is the maximum number of messages that can be
queued. At 50, it should be enough. FIRSTDELAY and DELAYTICKS control the
typematic-like behavior of the mouse when you hold the button down for
dragging or scrolling actions. The values in these globals are clock ticks of
1/18 second. The system waits FIRSTDELAY ticks after you first press the
button before it begins to repeat the button event. Then it waits DELAYTICKS
ticks between each repetition of the event. The DOUBLETICKS value is the
maximum number of ticks that may elapse between two clicks at the same screen
position before the system declares a double-click event. Some mouse-driven
programs provide for the user to modify these values. D-Flat does not do that,
but if you decide that you want it to, you should change these global symbols
into variables that your user can change.
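If you did want user-adjustable timing, the change might look something like the following sketch. The default values shown are placeholders, not the ones in Listing One, and the function names are invented for this example. The conversion takes one tick as 1/18 second, as described above.

```c
/* Assumed default values, for illustration only; see message.h for the
   values D-Flat really uses. */
int firstdelay  = 7;   /* ticks before the first repeat  */
int delayticks  = 1;   /* ticks between later repeats    */
int doubleticks = 5;   /* longest gap for a double-click */

/* Convert clock ticks to milliseconds, taking one tick as 1/18 second. */
long TicksToMilliseconds(int ticks)
{
    return ticks * 1000L / 18;
}

/* Let the user change the timing. Returns 1 on success, or 0 (leaving
   the current values alone) if any value is not positive. */
int SetMouseTiming(int first, int repeat, int dbl)
{
    if (first <= 0 || repeat <= 0 || dbl <= 0)
        return 0;
    firstdelay  = first;
    delayticks  = repeat;
    doubleticks = dbl;
    return 1;
}
```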


The Messages


Message.h is the header file that defines the messages in a D-Flat
application. If you derive new window classes and need to add messages, you
would add them to this source file. It is not apparent from the message names
and their comments whether the messages are event-based, messages from the
windows to the system, or messages between windows. This distinction should
become apparent as you use the messages in the programs that follow.
The first message is the START message. I haven't found a use for it yet, but
it's in there in case I do. If it disappears from the code someday, I've given
up on it. The messages are divided into process communication messages; window
management messages; clock, keyboard, and mouse messages; messages for the
various standard window classes; and dialog box messages.
Windows can send messages to the system and send or post messages to other
windows including themselves. Last month's column described the CreateWindow
function, which creates a window and returns its handle, a variable of type
WINDOW. That handle is how a window identifies itself and the way that a
message sender addresses the message to a window. If the message is for the
system, the WINDOW address is the NULLWND constant value.
Every message includes a message identifier taken from the list in message.h.
Besides the message identifier, every message has two long integer parameters
that pass arguments along with the message. The content and meaning of the
parameters depends on the message itself. Sometimes they are empty. Some
messages use only the first parameter, others use both. Sometimes the
parameters are pointers; other times they are integer values; still other
times they are simple on/off indicators.
Listing Two, page 146, is message.c, the source code for event and message
processing. It maintains an event queue and a message queue. The event queue
collects mouse, keyboard, and clock events. The message queue collects
messages. The message.c file has two interrupt functions. The first one,
newtimer, is for the timer interrupt. It counts down the three software
timers that support the events. One timer, the doubletimer counter, counts the
ticks between mouse clicks to see if a double-click event has occurred. The
delaytimer counter controls the delays between repeats of the button event
when the user does not release the button. The clocktimer counter is for the
one-second clock event.
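The countdown idea can be sketched in a few lines. This is a simplified illustration, not the code in Listing Two: the TIMER_OFF marker and the OnClockTick and TimedOut names are assumptions, and a real program would call the tick function from its timer interrupt handler.

```c
#define TIMER_OFF (-1)   /* assumed marker for a timer that is not running */

int doubletimer = TIMER_OFF;
int delaytimer  = TIMER_OFF;
int clocktimer  = TIMER_OFF;

/* Decrement a running timer; leave stopped or expired timers alone. */
static void CountDown(int *timer)
{
    if (*timer > 0)
        --*timer;
}

/* Work done on each clock tick. */
void OnClockTick(void)
{
    CountDown(&doubletimer);
    CountDown(&delaytimer);
    CountDown(&clocktimer);
}

/* A timer has run down when it reaches zero. */
int TimedOut(int timer)
{
    return timer == 0;
}
```

Starting a timer is just an assignment of a tick count, and stopping one is an assignment of TIMER_OFF, so the event collector can test and reset the counters without any locking beyond what the interrupt itself requires.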
The second interrupt function is newcrit. It intercepts critical error
interrupts, posts the error, and posts the disk drive letter of the critical
error into an error message. The TestCriticalError function displays that
error message in a dialog box and retrieves the user's ignore or retry option.
This technique prevents DOS from splashing its rude "Abort, Retry, or Ignore"
message onto your orderly D-Flat screen.
The init_messages function initializes message processing. An applications
program must call this function before it begins to wait for messages. The
function initializes all the variables, attaches the interrupt vectors for the
timer and the critical error handler, and posts the START message to the
system.


Waiting for a Message


After a program has created a window and called init_messages, it can enter a
loop that waits for messages. The loop looks like this:
 while (dispatch_message( )) ;
The dispatch_message function is at the bottom of message.c. As long as
nothing sends the STOP message to the system, the dispatch_message function
will return a true value, and your program will stay in the loop. When the
loop breaks, message processing is done. You would have to call
init_messages again to do any further message processing with dispatch_message.


Collecting Events



The dispatch_message function calls the collect_events function to collect any
pending events and queue them. The collect_events function watches the
hardware, interprets the events, and queues them. Queued events consist of a
MESSAGE code that identifies the event and two integer parameters.
If the clocktimer counter has run down, the collect_events function reads the
time-of-day clock and posts a CLOCKTICK event into the event queue. The two
parameters are the segment and offset of a string variable that contains a
displayable copy of the time. The time display alternates between putting a
colon and a space between the hour and minute so that the clock will appear to
be ticking when you display the string.
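A minimal sketch of the alternating separator follows. The FormatClock name and the choice of even seconds for the colon are assumptions. The text above says only that the display alternates.

```c
#include <stdio.h>

/* Build an hh:mm string, swapping the colon for a space on odd seconds
   so that the display appears to tick once a second. */
void FormatClock(char *buf, size_t size, int hour, int minute, int second)
{
    char sep = (second % 2 == 0) ? ':' : ' ';
    snprintf(buf, size, "%2d%c%02d", hour, sep, minute);
}
```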
If the keyboard shift key status has changed, the collect_events function
queues a SHIFT_CHANGED event with the new shift status value as the first
parameter. If the user typed a key, the collect_events function translates the
keystroke into an event and queues it.
Mouse events are tricky. The program remembers the most recent mouse location.
If you move the mouse, the collect_events function queues a MOUSE_MOVED event
with the new mouse coordinates in the parameters. If the right button is down,
the function posts a RIGHT_BUTTON event which also contains the screen
coordinates where the mouse was when you pressed the button. If the left
button is down and the mouse has moved, this is a new button press as far as
the software is concerned, and it queues a LEFT_BUTTON event with the screen
coordinates. It also turns off the doubletimer and sets the delaytimer.
If you released the mouse button, the function kills the delaytimer and starts
the doubletimer running. If you press the left button in the same place again
before the timer runs down, that is a double-click.
If the left button is down and the mouse has not moved, one of two things
might have happened. You might just be holding the button down. If you do and
the delaytimer runs down, the function posts another LEFT_BUTTON event and
restarts the delaytimer. If the doubletimer is running, however, there have
been two clicks in the span of the timer's life, and the function disables the
timer and queues the DOUBLE_CLICK event.


Turning Events into Messages


After calling the collect_events function, the dispatch_message function looks
to see if there are any events in the event queue. As long as there are, it
dequeues them, translates them into messages, and sends the messages to the
windows that should get them. Mouse messages usually go to the window in which
the mouse event occurred as determined by the screen coordinates of the event.
The exception is when a different window has captured all mouse events
regardless of where they hit.
Keyboard messages go to the window that presently has the "focus," which is
the window into which the user is typing data. The in-focus window is the one
that gets the user's attention. It is displayed on top of any other windows
and, if it is a window that accepts typing, the keyboard cursor is in it. If
the in-focus window does not accept typing, the keyboard events still go to
it, and it is that window's responsibility to see that the
keyboard input gets processed or ignored, whichever is appropriate.
If the LEFT_BUTTON mouse event is going to a window that does not have the
focus, the dispatch_message function sends the window a SETFOCUS message to
tell it to take the focus. This procedure is what allows you to bring a window
to the top and give it the input focus by clicking anywhere within its space.
The dispatch_message function sends CLOCKTICK messages to whichever window has
captured the clock by sending itself the CAPTURE_CLOCK message.


Posting and Sending Messages


The message queue is where the system and windows post messages to go to
windows. They do that by calling the PostMessage function, which accepts a
WINDOW handle, the MESSAGE identifier, and the two message parameters. The
dispatch_message function dequeues those messages and sends them through the
SendMessage function, which takes the same parameters. If the message is a
STOP or ENDDIALOG message, the dispatch_message returns a false value to tell
the message processing loop of the application or the dialog box manager to
stop.
The system and windows send messages to each other with the SendMessage
function. It differs from the PostMessage function in that the messages get
sent right away. If you call SendMessage the system will not return to the
calling function until the message has been sent and processed. If you call
PostMessage, the message will go into the message queue and will not be sent
until the next time through the message-dispatching loop. If a message's
function is to return a value, you must use SendMessage. A posted message
returns nothing to the window that posts it.
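The difference between posting and sending can be sketched as follows. This is an illustration under assumptions, not the code in Listing Two: the Sketch-prefixed functions, the DEMO_ messages, and the idea of representing a window by its function pointer are all invented for the example. Only the queue size of 50 comes from the text.

```c
#include <stddef.h>

#define MAXMESSAGES 50                         /* queue capacity, as in the text */

typedef enum { DEMO_ADD, DEMO_STOP } DEMOMSG;  /* hypothetical messages */

/* In this sketch a window is just its message-processing function. */
typedef long (*DEMOWND)(DEMOMSG msg, long p1, long p2);

typedef struct { DEMOWND wnd; DEMOMSG msg; long p1, p2; } QUEUED;

static QUEUED queue[MAXMESSAGES];
static int qhead, qcount;

/* Send: call the window now and hand its result back to the sender. */
long SketchSend(DEMOWND wnd, DEMOMSG msg, long p1, long p2)
{
    return wnd != NULL ? wnd(msg, p1, p2) : 0;
}

/* Post: queue the message for later; the window's result is never seen.
   Returns 0 only if the queue is full. */
int SketchPost(DEMOWND wnd, DEMOMSG msg, long p1, long p2)
{
    QUEUED *slot;
    if (qcount == MAXMESSAGES)
        return 0;
    slot = &queue[(qhead + qcount) % MAXMESSAGES];
    slot->wnd = wnd; slot->msg = msg; slot->p1 = p1; slot->p2 = p2;
    qcount++;
    return 1;
}

/* Dispatch: deliver what is queued; return 0 once a stop message is seen. */
int SketchDispatch(void)
{
    while (qcount > 0) {
        QUEUED m = queue[qhead];
        qhead = (qhead + 1) % MAXMESSAGES;
        qcount--;
        SketchSend(m.wnd, m.msg, m.p1, m.p2);
        if (m.msg == DEMO_STOP)
            return 0;
    }
    return 1;
}

/* Example window: keeps a running total of DEMO_ADD parameters. */
long total;
long TotalProc(DEMOMSG msg, long p1, long p2)
{
    (void)p2;
    if (msg == DEMO_ADD)
        total += p1;
    return total;
}
```

Sending DEMO_ADD to TotalProc changes the total before the call returns, while posting the same message changes nothing until the next call to SketchDispatch. The zero return on the stop message plays the part that ends a loop of the form while (dispatch_message( )) ;.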
The SendMessage function processes the messages that are for the system rather
than other windows. These include the STOP, CAPTURE_CLOCK, RELEASE_CLOCK,
KEYBOARD_CURSOR, CAPTURE_KEYBOARD, RELEASE_KEYBOARD, CURRENT_KEYBOARD_CURSOR,
SAVE_CURSOR, RESTORE_CURSOR, HIDE_CURSOR, SHOW_CURSOR, MOUSE_INSTALLED,
SHOW_MOUSE, HIDE_MOUSE, MOUSE_CURSOR, CURRENT_MOUSE_CURSOR, WAITMOUSE,
TESTMOUSE, CAPTURE_MOUSE, and RELEASE_MOUSE messages. These messages either
tell the system to change something in the hardware, ask for some
hardware-related values, or capture and release input devices.


Changes to D-Flat Code


It is inevitable. A project of this size grows and changes, and sometimes you
have to backtrack. I made one small addition to the window structure in
dflat.h, published in the May, 1991 issue. Insert this variable immediately
following the condition integer.
int restored_attrib; /* attributes when restored */
Add an assignment of zero to the variable in the CreateWindow function in
window.c, too. This variable saves a window's attributes word when the window
is minimized or maximized. The minimize and maximize procedures can then turn
off some attributes that are inappropriate in a minimized or maximized window.
The restore procedure restores the attribute word when it restores the window.
Several readers asked why the screen height was stuck at 25 lines when the PC
supports several other configurations. Change this global definition in
system.h from two months ago.
#define SCREENHEIGHT (peekb(0x40, 0x84)+1)
The former version of the global was set to 25. This version gets the current
screen height from the BIOS RAM data area. One reader reports that this change
delivers a screen height of 8 on his Hercules video system. If you have this
problem, return the statement to its earlier value.
Of course, all the changes I mention in the column are in the D-Flat source
code package that you can download.


What Will it Look Like?


Even though we are in the third month of code, you can't do anything with the
programs that have been published so far. There are dependencies in the source
modules for several months to come. You can download the full package, use the
example memopad program, and perhaps branch out on your own. To help you
decide to do that, Figure 1 shows a memopad screen with two open editor
windows and a menu popped down.


How to Get D-Flat Now


The complete source code package for D-Flat is on CompuServe in Library 0 of
the DDJ Forum and on TelePath. Its name is DFLATn.ARC, where n is an integer
that represents a loosely assigned version number. The library is a
preliminary version of the package but one that works. I will replace this
file over the months as the code changes. At present, everything compiles and
works with Turbo C 2.0 and Microsoft C 6.0. There is a makefile that the make
utilities of both compilers accept, and there is one example program, the
MEMOPAD program. If you want to discuss D-Flat with me, my CompuServe ID is
71101,1262, and I monitor the DDJ Forum daily.


Them That Can, Does...


My pal Aubrey Sears works at NASA's Goddard Space Flight Center and has the
run of their computer rooms. He gets into all kinds of high-tech stuff. One
time he showed me a roomful of water-cooled mainframe. He said the service
technician had to have a plumber's license as well as being certified on the
behemoth. Aubrey's been learning C. He uses a mainframe C training manual,
which is chock full of examples such as the one in Example 1. The manual says
that this exercise demonstrates the left-to-right evaluation behavior of the
comma-separated expression. This example does more than that.
Example 1: A coding example from a mainframe C training manual

 #include <stdio.h>
 main()

 {

 int x=1,y=1;
 x = (y++,y+3);
 printf("x=%d,y=%d\n",x,y);
 printf("++y=%d,x=%d\n",(--x,++y), x);

 }


Aubrey runs these programs on every computer he can get his hands on, which is
an impressive suite of hardware. He wondered why C compilers for the Cray, the
Sun, the IBM 3081, the VAX, and the PC disagree about the value printed for x
in the second printf. Some display 4, the others display 5.
The C language has never mandated the order in which a function call's
arguments are evaluated. The second argument to the printf statement
decrements x. The third argument is x. If the compiler evaluates arguments
right-to-left, printf will display 5 because that's what x was before the
statement. If the compiler evaluates left-to-right, the value will be 4
because x decrements in the second argument before it gets passed in the
third.
"Which value is right?" asks Aubrey. Both, of course. What's wrong? The
example is wrong. Programmers should not write code like that example. It is a
solid demonstration of nonportable code. Writers of so-called C training
manuals should know better.
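For comparison, here is one way to get the same answer from every compiler. It is a sketch that assumes the manual's author intended the left-to-right result. The side effects move out of the argument list into statements of their own, so no argument depends on another's evaluation. The function name and the buffer interface are mine, chosen so the result can be checked, and snprintf assumes a compiler newer than the ones named above.

```c
#include <stdio.h>

/* Writes both output lines of the reworked Example 1 into buf.
   Returns 0 on success, -1 if buf is too small. */
int example1_portable(char *buf, size_t size)
{
    int x = 1, y = 1;
    int n1, n2;

    /* The comma operator is well defined: y becomes 2, then x = 2 + 3 */
    x = (y++, y + 3);
    n1 = snprintf(buf, size, "x=%d,y=%d\n", x, y);
    if (n1 < 0 || (size_t)n1 >= size)
        return -1;

    /* Do the decrement and increment before the call, not inside it */
    --x;
    ++y;
    n2 = snprintf(buf + n1, size - (size_t)n1, "++y=%d,x=%d\n", y, x);
    if (n2 < 0 || (size_t)n2 >= size - (size_t)n1)
        return -1;
    return 0;
}
```

The second line now reads ++y=3,x=4 no matter which direction the compiler evaluates arguments.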


The Programmer's Soap Box: Drug Testing and Patent Medicine


At the tender age of 18, I took a polygraph test to get my first job. They
wanted to make sure that I was in no way a threat to the national interest. In
the 1950s, teenagers weren't as hip and experienced as they are today, and I
didn't have enough sinning under my belt -- or over my belt, for that matter
-- to jiggle a polygraph pen or titillate an examiner. I passed that test and
became a computer programmer. But I left that examination room a changed
person. I'm not sure that the machine really worked beyond its ability to
intimidate the examinee, but I am convinced that those people violated my
privacy by rummaging around in my brain.
Employers of yore insisted on good grooming, too. If your hair grew too long,
your boss would suggest a trip to the barbershop. Seems odd today, but that's
how things were, and nobody objected.
In the years since, the nation has perceived a growing drug problem, and the
same kind of hysteria that gave employers unregulated access to your thoughts
now lets them peek into other parts of your body as well. These invasions
include tests of your bodily extracts--tests that they conduct for the
purposes of granting or continuing your employment. Over the years, I missed
out on several choice jobs simply because I would not submit to polygraphs and
drug testing. But I never lost one because I wouldn't get a haircut.
Now those two conditions of employment have converged. You can wear your hair
any way you like, but you must give a lock of it to your boss. No, not for
your CEO to wear in a locket under his vest next to his cold, cold heart. Not
for any reason like that. They want your hair for drug testing.
Kirchman Corporation of Altamonte Springs, Florida fired a programmer who
refused to allow them to take a lock of her hair. Anita Nabors protested the
hair drug test on two grounds. She objected to them messing with her looks by
chopping off part of her coiffure, and she objected because the hair test is
not universally endorsed as a reliable drug-screening technique. She was
afraid that a faulty test result would jeopardize her job and reputation. She
offered to submit to a blood or urine test, equally intrusive in my view, but
the company said no and sent the six-year, "exemplary" employee packing.
Did they think she'd been hitting the drugs? No, it was just a new policy that
everyone had to comply with or be canned. "Cower to" would be a better way to
put it. Clip your bob or lose your job. Kirchman develops banking software,
and one day they just up and decided that their programs had to be drug-free.
All those banks feel better knowing that the programmers who write the
software just say no, don't you see?
Anita Nabors sued Kirchman. I'm rooting for her. She tells me that since she
left, they dropped the drug-screening program, even for new employees. That's
how well it was working. But Anita hasn't found work yet, and they haven't
offered to reinstate her. If you need a top Cobol programmer who is learning
about PCs and whose courage and character are as intact as her hairdo, send an
inquiry to DDJ and I'll forward it to her.
Before you follow Anita's brave example, however, be aware that the same
mean-spirited mentality that claims rights to your hair, blood, urine, and
spit will, if you object, brand you a probable drug addict or AIDS carrier who
doesn't want to get caught. Why else would you mind if they poke around in
your juices? You must have something to hide.
On to another plank. Last week I read that AT&T is trying to enforce a patent
on a software algorithm that X-Window uses. AT&T claims rights to an algorithm
that saves bitmapped images in memory so that a new window can pop up and pop
down later. They call it "backing store." They are notifying developers of
X-Window systems that there is a license fee for using this technology. I
wonder if other programs -- Windows, GEM, and NewWave, for example -- that use
similar algorithms will fall before the same sword? Could be. But beware,
AT&T. Your blade is two-edged. This is going to cost you. I'm switching to US
Sprint. Not only do I protest your use of software patents, but Murphy Brown
is a sight better looking than Dennis Ritchie.
_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

/* ----------- message.h ------------ */

#ifndef MESSAGES_H
#define MESSAGES_H

#define MAXMESSAGES 50
#define DELAYTICKS 1
#define FIRSTDELAY 7
#define DOUBLETICKS 5

typedef enum messages {
 /* ------------- process communication messages --------- */
 START, /* start message processing */
 STOP, /* stop message processing */
 COMMAND, /* send a command to a window */
 /* ------------- window management messages ------------- */
 CREATE_WINDOW, /* create a window */
 SHOW_WINDOW, /* show a window */
 HIDE_WINDOW, /* hide a window */
 CLOSE_WINDOW, /* delete a window */
 SETFOCUS, /* set and clear the focus */
 PAINT, /* paint the window's data space */
 BORDER, /* paint the window's border */
 TITLE, /* display the window's title */
 MOVE, /* move the window */
 SIZE, /* change the window's size */

 MAXIMIZE, /* maximize the window */
 MINIMIZE, /* minimize the window */
 RESTORE, /* restore the window */
 INSIDE_WINDOW, /* test x/y inside a window */
 /* ------------- clock messages ------------------------- */
 CLOCKTICK, /* the clock ticked */
 CAPTURE_CLOCK, /* capture clock into a window */
 RELEASE_CLOCK, /* release clock to the system */
 /* ------------- keyboard and screen messages ----------- */
 KEYBOARD, /* key was pressed */
 CAPTURE_KEYBOARD, /* capture keyboard into a window */
 RELEASE_KEYBOARD, /* release keyboard to system */
 KEYBOARD_CURSOR, /* position the keyboard cursor */
 CURRENT_KEYBOARD_CURSOR,/* read the cursor position */
 HIDE_CURSOR, /* hide the keyboard cursor */
 SHOW_CURSOR, /* display the keyboard cursor */
 SAVE_CURSOR, /* save the cursor's configuration*/
 RESTORE_CURSOR, /* restore the saved cursor */
 SHIFT_CHANGED, /* the shift status changed */
 /* ------------- mouse messages ------------------------- */
 MOUSE_INSTALLED, /* test for mouse installed */
 RIGHT_BUTTON, /* right button pressed */
 LEFT_BUTTON, /* left button pressed */
 DOUBLE_CLICK, /* left button double-clicked */
 MOUSE_MOVED, /* mouse changed position */
 BUTTON_RELEASED, /* mouse button released */
 CURRENT_MOUSE_CURSOR, /* get mouse position */
 MOUSE_CURSOR, /* set mouse position */
 SHOW_MOUSE, /* make mouse cursor visible */
 HIDE_MOUSE, /* hide mouse cursor */
 WAITMOUSE, /* wait until button released */
 TESTMOUSE, /* test any mouse button pressed */
 CAPTURE_MOUSE, /* capture mouse into a window */
 RELEASE_MOUSE, /* release the mouse to system */
 /* ------------- text box messages ---------------------- */
 ADDTEXT, /* add text to the text box */
 CLEARTEXT, /* clear the edit box */
 SETTEXT, /* set address of text buffer */
 SCROLL, /* vertical scroll of text box */
 HORIZSCROLL, /* horizontal scroll of text box */
 /* ------------- edit box messages ---------------------- */
 EB_GETTEXT, /* get text from an edit box */
 EB_PUTTEXT, /* put text into an edit box */
 /* ------------- menubar messages ----------------------- */
 BUILDMENU, /* build the menu display */
 SELECTION, /* menubar selection */
 /* ------------- popdown messages ----------------------- */
 BUILD_SELECTIONS, /* build the menu display */
 CLOSE_POPDOWN, /* tell parent popdown is closing */
 /* ------------- list box messages ---------------------- */
 LB_SELECTION, /* sent to parent on selection */
 LB_CHOOSE, /* sent when user chooses */
 LB_CURRENTSELECTION, /* return the current selection */
 LB_GETTEXT, /* return the text of selection */
 LB_SETSELECTION, /* sets the listbox selection */
 /* ------------- dialog box messages -------------------- */
 INITIATE_DIALOG, /* begin a dialog */
 ENTERFOCUS, /* tell DB control got focus */
 LEAVEFOCUS, /* tell DB control lost focus */

 ENDDIALOG /* end a dialog */
} MESSAGE;

/* --------- message prototypes ----------- */
void init_messages(void);
void PostMessage(WINDOW, MESSAGE, PARAM, PARAM);
int SendMessage(WINDOW, MESSAGE, PARAM, PARAM);
int dispatch_message(void);
int TestCriticalError(void);

#endif





[LISTING TWO]

/* --------- message.c ---------- */

#include <stdio.h>
#include <dos.h>
#include <conio.h>
#include <string.h>
#include <time.h>
#include "dflat.h"

static int px = -1, py = -1;
static int pmx = -1, pmy = -1;
static int mx, my;

static int CriticalError;

/* ---------- event queue ---------- */
static struct events {
 MESSAGE event;
 int mx;
 int my;
} EventQueue[MAXMESSAGES];

/* ---------- message queue --------- */
static struct msgs {
 WINDOW wnd;
 MESSAGE msg;
 PARAM p1;
 PARAM p2;
} MsgQueue[MAXMESSAGES];

static int EventQueueOnCtr;
static int EventQueueOffCtr;
static int EventQueueCtr;

static int MsgQueueOnCtr;
static int MsgQueueOffCtr;
static int MsgQueueCtr;

static int lagdelay = FIRSTDELAY;

static void (interrupt far *oldtimer)(void) = NULL;

WINDOW CaptureMouse = NULLWND;
WINDOW CaptureKeyboard = NULLWND;
static int NoChildCaptureMouse = FALSE;
static int NoChildCaptureKeyboard = FALSE;

static int doubletimer = -1;
static int delaytimer = -1;
static int clocktimer = -1;

WINDOW Cwnd = NULLWND;

/* ------- timer interrupt service routine ------- */
static void interrupt far newtimer(void)
{
 if (timer_running(doubletimer))
 countdown(doubletimer);
 if (timer_running(delaytimer))
 countdown(delaytimer);
 if (timer_running(clocktimer))
 countdown(clocktimer);
 oldtimer();
}

static char ermsg[] = "Error accessing drive x";

/* -------- test for critical errors --------- */
int TestCriticalError(void)
{
 int rtn = 0;
 if (CriticalError) {
 rtn = 1;
 CriticalError = FALSE;
 if (TestErrorMessage(ermsg) == FALSE)
 rtn = 2;
 }
 return rtn;
}

/* ------ critical error interrupt service routine ------ */
static void interrupt far newcrit(IREGS ir)
{
 if (!(ir.ax & 0x8000)) {
 ermsg[sizeof(ermsg) - 2] = (ir.ax & 0xff) + 'A';
 CriticalError = TRUE;
 }
 ir.ax = 0;
}

/* ------------ initialize the message system --------- */
void init_messages(void)
{
 resetmouse();
 show_mousecursor();
 px = py = -1;
 pmx = pmy = -1;
 mx = my = 0;
 CaptureMouse = CaptureKeyboard = NULLWND;
 NoChildCaptureMouse = FALSE;
 NoChildCaptureKeyboard = FALSE;

 MsgQueueOnCtr = MsgQueueOffCtr = MsgQueueCtr = 0;
 EventQueueOnCtr = EventQueueOffCtr = EventQueueCtr = 0;
 if (oldtimer == NULL) {
 oldtimer = getvect(TIMER);
 setvect(TIMER, newtimer);
 }
 setvect(CRIT, newcrit);
 PostMessage(NULLWND,START,0,0);
 lagdelay = FIRSTDELAY;
}

/* ----- post an event and parameters to event queue ---- */
static void PostEvent(MESSAGE event, int p1, int p2)
{
 if (EventQueueCtr != MAXMESSAGES) {
 EventQueue[EventQueueOnCtr].event = event;
 EventQueue[EventQueueOnCtr].mx = p1;
 EventQueue[EventQueueOnCtr].my = p2;
 if (++EventQueueOnCtr == MAXMESSAGES)
 EventQueueOnCtr = 0;
 EventQueueCtr++;
 }
}

/* ------ collect mouse, clock, and keyboard events ----- */
static void near collect_events(void)
{
 struct tm *now;
 static int flipflop = FALSE;
 static char timestr[8];
 int hr, sk;
 static int ShiftKeys = 0;

 /* -------- test for a clock event (one/second) ------- */
 if (timed_out(clocktimer)) {
 /* ----- get the current time ----- */
 time_t t = time(NULL);
 now = localtime(&t);
 hr = now->tm_hour > 12 ?
 now->tm_hour - 12 :
 now->tm_hour;
 if (hr == 0)
 hr = 12;
 sprintf(timestr, "%2.2d:%02d", hr, now->tm_min);
 strcpy(timestr+5, now->tm_hour > 11 ? "pm" : "am");
 /* ------- blink the : at one-second intervals ----- */
 if (flipflop)
 *(timestr+2) = ' ';
 flipflop ^= TRUE;
 /* -------- reset the timer -------- */
 set_timer(clocktimer, 1);
 /* -------- post the clock event -------- */
 PostEvent(CLOCKTICK, FP_SEG(timestr), FP_OFF(timestr));
 }

 /* --------- keyboard events ---------- */
 if ((sk = getshift()) != ShiftKeys) {
 ShiftKeys = sk;
 /* ---- the shift status changed ---- */

 PostEvent(SHIFT_CHANGED, sk, 0);
 }

 /* ---- build keyboard events for key combinations that
 BIOS doesn't report --------- */
 if (sk & ALTKEY)
 if (inp(0x60) == 14) {
 while (!(inp(0x60) & 0x80))
 ;
 PostEvent(KEYBOARD, ALT_BS, sk);
 }
 if (sk & CTRLKEY)
 if (inp(0x60) == 82) {
 while (!(inp(0x60) & 0x80))
 ;
 PostEvent(KEYBOARD, CTRL_INS, sk);
 }

 /* ----------- test for keystroke ------- */
 if (keyhit()) {
 static int cvt[] = {SHIFT_INS,END,DN,PGDN,BS,'5',
 FWD,HOME,UP,PGUP};
 int c = getkey();

 /* -------- convert numeric pad keys ------- */
 if (sk & (LEFTSHIFT | RIGHTSHIFT)) {
 if (c >= '0' && c <= '9')
 c = cvt[c-'0'];
 else if (c == '.' || c == DEL)
 c = SHIFT_DEL;
 else if (c == INS)
 c = SHIFT_INS;
 }
 /* -------- clear the BIOS readahead buffer -------- */
 *(int far *)(MK_FP(0x40,0x1a)) =
 *(int far *)(MK_FP(0x40,0x1c));
 /* ---- if help key call generic help function ---- */
 if (c == F1)
 HelpFunction();
 else
 /* ------ post the keyboard event ------ */
 PostEvent(KEYBOARD, c, sk);
 }

 /* ------------ test for mouse events --------- */
 get_mouseposition(&mx, &my);
 if (mx != px || my != py) {
 px = mx;
 py = my;
 PostEvent(MOUSE_MOVED, mx, my);
 }
 if (rightbutton())
 PostEvent(RIGHT_BUTTON, mx, my);
 if (leftbutton()) {
 if (mx == pmx && my == pmy) {
 /* ---- same position as last left button ---- */
 if (timer_running(doubletimer)) {
 /* -- second click before double timeout -- */
 disable_timer(doubletimer);

 PostEvent(DOUBLE_CLICK, mx, my);
 }
 else if (!timer_running(delaytimer)) {
 /* ---- button held down a while ---- */
 delaytimer = lagdelay;
 lagdelay = DELAYTICKS;
 /* ---- post a typematic-like button ---- */
 PostEvent(LEFT_BUTTON, mx, my);
 }
 }
 else {
 /* --------- new button press ------- */
 disable_timer(doubletimer);
 delaytimer = FIRSTDELAY;
 lagdelay = DELAYTICKS;
 PostEvent(LEFT_BUTTON, mx, my);
 pmx = mx;
 pmy = my;
 }
 }
 else
 lagdelay = FIRSTDELAY;
 if (button_releases()) {
 /* ------- the button was released -------- */
 doubletimer = DOUBLETICKS;
 PostEvent(BUTTON_RELEASED, mx, my);
 disable_timer(delaytimer);
 }
}

/* ----- post a message and parameters to msg queue ---- */
void PostMessage(WINDOW wnd, MESSAGE msg, PARAM p1, PARAM p2)
{
 if (MsgQueueCtr != MAXMESSAGES) {
 MsgQueue[MsgQueueOnCtr].wnd = wnd;
 MsgQueue[MsgQueueOnCtr].msg = msg;
 MsgQueue[MsgQueueOnCtr].p1 = p1;
 MsgQueue[MsgQueueOnCtr].p2 = p2;
 if (++MsgQueueOnCtr == MAXMESSAGES)
 MsgQueueOnCtr = 0;
 MsgQueueCtr++;
 }
}

/* --------- send a message to a window ----------- */
int SendMessage(WINDOW wnd, MESSAGE msg, PARAM p1, PARAM p2)
{
 int rtn = TRUE, x, y;
 if (wnd != NULLWND)
 switch (msg) {
 case PAINT:
 case BORDER:
 case RIGHT_BUTTON:
 case LEFT_BUTTON:
 case DOUBLE_CLICK:
 case BUTTON_RELEASED:
 case KEYBOARD:
 case SHIFT_CHANGED:
 /* ------- don't send these messages unless the

 window is visible -------- */
 if (!isVisible(wnd))
 break;
 default:
 rtn = wnd->wndproc(wnd, msg, p1, p2);
 break;
 }
 /* ----- window processor returned or the message was sent
 to no window at all (NULLWND) ----- */
 if (rtn != FALSE) {
 /* --------- process messages that a window sends to the
 system itself ---------- */
 switch (msg) {
 case STOP:
 hide_mousecursor();
 if (oldtimer != NULL) {
 setvect(TIMER, oldtimer);
 oldtimer = NULL;
 }
 break;
 /* ------- clock messages --------- */
 case CAPTURE_CLOCK:
 Cwnd = wnd;
 set_timer(clocktimer, 0);
 break;
 case RELEASE_CLOCK:
 Cwnd = NULLWND;
 disable_timer(clocktimer);
 break;
 /* -------- keyboard messages ------- */
 case KEYBOARD_CURSOR:
 if (wnd == NULLWND)
 cursor((int)p1, (int)p2);
 else
 cursor(GetClientLeft(wnd)+(int)p1, GetClientTop(wnd)+(int)p2);
 break;
 case CAPTURE_KEYBOARD:
 if (p2)
 ((WINDOW)p2)->PrevKeyboard=CaptureKeyboard;
 else
 wnd->PrevKeyboard = CaptureKeyboard;
 CaptureKeyboard = wnd;
 NoChildCaptureKeyboard = (int)p1;
 break;
 case RELEASE_KEYBOARD:
 CaptureKeyboard = wnd->PrevKeyboard;
 NoChildCaptureKeyboard = FALSE;
 break;
 case CURRENT_KEYBOARD_CURSOR:
 curr_cursor(&x, &y);
 *(int*)p1 = x;
 *(int*)p2 = y;
 break;
 case SAVE_CURSOR:
 savecursor();
 break;
 case RESTORE_CURSOR:
 restorecursor();
 break;

 case HIDE_CURSOR:
 normalcursor();
 hidecursor();
 break;
 case SHOW_CURSOR:
 if (p1)
 set_cursor_type(0x0106);
 else
 set_cursor_type(0x0607);
 unhidecursor();
 break;
 /* -------- mouse messages -------- */
 case MOUSE_INSTALLED:
 rtn = mouse_installed();
 break;
 case SHOW_MOUSE:
 show_mousecursor();
 break;
 case HIDE_MOUSE:
 hide_mousecursor();
 break;
 case MOUSE_CURSOR:
 set_mouseposition((int)p1, (int)p2);
 break;
 case CURRENT_MOUSE_CURSOR:
 get_mouseposition((int*)p1,(int*)p2);
 break;
 case WAITMOUSE:
 waitformouse();
 break;
 case TESTMOUSE:
 rtn = mousebuttons();
 break;
 case CAPTURE_MOUSE:
 if (p2)
 ((WINDOW)p2)->PrevMouse = CaptureMouse;
 else
 wnd->PrevMouse = CaptureMouse;
 CaptureMouse = wnd;
 NoChildCaptureMouse = (int)p1;
 break;
 case RELEASE_MOUSE:
 CaptureMouse = wnd->PrevMouse;
 NoChildCaptureMouse = FALSE;
 break;
 default:
 break;
 }
 }
 return rtn;
}

/* ---- dispatch messages to the message proc function ---- */
int dispatch_message(void)
{
 WINDOW Mwnd, Kwnd;
 /* -------- collect mouse and keyboard events ------- */
 collect_events();
 /* --------- dequeue and process events -------- */

 while (EventQueueCtr > 0) {
 struct events ev = EventQueue[EventQueueOffCtr];

 if (++EventQueueOffCtr == MAXMESSAGES)
 EventQueueOffCtr = 0;
 --EventQueueCtr;

 /* ------ get the window in which a
 mouse event occurred ------ */
 Mwnd = inWindow(ev.mx, ev.my);

 /* ---- process mouse captures ----- */
 if (CaptureMouse != NULLWND)
 if (Mwnd == NULLWND ||
 NoChildCaptureMouse ||
 GetParent(Mwnd) != CaptureMouse)
 Mwnd = CaptureMouse;

 /* ------ get the window in which a
 keyboard event occurred ------ */
 Kwnd = inFocus;

 /* ---- process keyboard captures ----- */
 if (CaptureKeyboard != NULLWND)
 if (Kwnd == NULLWND ||
 NoChildCaptureKeyboard ||
 GetParent(Kwnd) != CaptureKeyboard)
 Kwnd = CaptureKeyboard;

 /* -------- send mouse and keyboard messages to the
 window that should get them -------- */
 switch (ev.event) {
 case SHIFT_CHANGED:
 case KEYBOARD:
 SendMessage(Kwnd, ev.event, ev.mx, ev.my);
 break;
 case LEFT_BUTTON:
 if (!CaptureMouse ||
 (!NoChildCaptureMouse &&
 GetParent(Mwnd) == CaptureMouse))
 if (Mwnd != inFocus)
 SendMessage(Mwnd, SETFOCUS, TRUE, 0);
 case BUTTON_RELEASED:
 case DOUBLE_CLICK:
 case RIGHT_BUTTON:
 case MOUSE_MOVED:
 SendMessage(Mwnd, ev.event, ev.mx, ev.my);
 break;
 case CLOCKTICK:
 SendMessage(Cwnd, ev.event,
 (PARAM) MK_FP(ev.mx, ev.my), 0);
 default:
 break;
 }
 }
 /* ------ dequeue and process messages ----- */
 while (MsgQueueCtr > 0) {
 struct msgs mq = MsgQueue[MsgQueueOffCtr];
 if (++MsgQueueOffCtr == MAXMESSAGES)

 MsgQueueOffCtr = 0;
 --MsgQueueCtr;
 SendMessage(mq.wnd, mq.msg, mq.p1, mq.p2);
 if (mq.msg == STOP || mq.msg == ENDDIALOG)
 return FALSE;
 }
 return TRUE;
}






[LISTING THREE]

/* ------- display a window's border ----- */
void RepaintBorder(WINDOW wnd, RECT *rcc)
{
 int y;
 int lin, side, ne, nw, se, sw;
 RECT rc, clrc;

 if (!TestAttribute(wnd, HASBORDER))
 return;
 if (rcc == NULL) {
 rc = SetRect(0, 0, WindowWidth(wnd)-1,
 WindowHeight(wnd)-1);
 if (TestAttribute(wnd, SHADOW)) {
 rc.rt++;
 rc.bt++;
 }
 }
 else
 rc = *rcc;
 clrc = rc;
 /* -------- adjust the client rectangle ------- */
 if (RectLeft(rc) == 0)
 --clrc.rt;
 else
 --clrc.lf;
 if (RectTop(rc) == 0)
 --clrc.bt;
 else
 --clrc.tp;
 RectRight(clrc) = min(RectRight(clrc), WindowWidth(wnd)-3);
 RectBottom(clrc) = min(RectBottom(clrc), WindowHeight(wnd)-3);
 if (wnd == inFocus) {
 lin = FOCUS_LINE;
 side = FOCUS_SIDE;
 ne = FOCUS_NE;
 nw = FOCUS_NW;
 se = FOCUS_SE;
 sw = FOCUS_SW;
 }
 else {
 lin = LINE;
 side = SIDE;
 ne = NE;

 nw = NW;
 se = SE;
 sw = SW;
 }
 line[WindowWidth(wnd)] = '\0';
 /* ---------- window title ------------ */
 if (RectTop(rc) == 0)
 if (RectLeft(rc) < WindowWidth(wnd))
 if (TestAttribute(wnd, TITLEBAR))
 DisplayTitle(wnd, clrc);
 foreground = FrameForeground(wnd);
 background = FrameBackground(wnd);
 /* -------- top frame corners --------- */
 if (RectTop(rc) == 0) {
 if (RectLeft(rc) == 0)
 PutWindowChar(wnd, -1, -1, nw);
 if (RectLeft(rc) < RectRight(rc)) {
 if (RectRight(rc) >= WindowWidth(wnd)-1)
 PutWindowChar(wnd, WindowWidth(wnd)-2, -1, ne);

 if (TestAttribute(wnd, TITLEBAR) == 0) {
 /* ----------- top line ------------- */
 memset(line,lin,WindowWidth(wnd)-1);
 line[RectRight(clrc)+1] = '\0';
 if (strlen(line+RectLeft(clrc)) > 1 ||
 TestAttribute(wnd, SHADOW) == 0)
 writeline(wnd, line+RectLeft(clrc),
 RectLeft(clrc), -1, FALSE);
 }
 }
 }
 /* ----------- window body ------------ */
 for (y = 0; y < ClientHeight(wnd); y++) {
 int ch;
 if (y >= RectTop(clrc) && y <= RectBottom(clrc)) {
 if (RectLeft(rc) == 0)
 PutWindowChar(wnd, -1, y, side);
 if (RectLeft(rc) < RectRight(rc)) {
 if (RectRight(rc) >= ClientWidth(wnd)) {
 if (TestAttribute(wnd, VSCROLLBAR))
 ch = ( y == 0 ? UPSCROLLBOX :
 y == WindowHeight(wnd)-3 ?
 DOWNSCROLLBOX :
 y == wnd->VScrollBox ?
 SCROLLBOXCHAR :
 SCROLLBARCHAR );
 else
 ch = side;
 PutWindowChar(wnd,WindowWidth(wnd)-2,y,ch);
 }
 }
 if (RectRight(rc) == WindowWidth(wnd))
 shadow_char(wnd, y);
 }
 }
 if (RectTop(rc) < RectBottom(rc) &&
 RectBottom(rc) >= WindowHeight(wnd)-1) {
 /* -------- bottom frame corners ---------- */
 if (RectLeft(rc) == 0)

 PutWindowChar(wnd, -1, WindowHeight(wnd)-2, sw);
 if (RectRight(rc) >= WindowWidth(wnd)-1)
 PutWindowChar(wnd, WindowWidth(wnd)-2,
 WindowHeight(wnd)-2, se);
 /* ----------- bottom line ------------- */
 memset(line,lin,WindowWidth(wnd)-1);
 if (TestAttribute(wnd, HSCROLLBAR)) {
 line[0] = LEFTSCROLLBOX;
 line[WindowWidth(wnd)-3] = RIGHTSCROLLBOX;
 memset(line+1, SCROLLBARCHAR, WindowWidth(wnd)-4);
 line[wnd->HScrollBox] = SCROLLBOXCHAR;
 }
 line[RectRight(clrc)+1] = '\0';
 if (strlen(line+RectLeft(clrc)) > 1 || TestAttribute(wnd, SHADOW) == 0)
 writeline(wnd,
 line+RectLeft(clrc),
 RectLeft(clrc),
 WindowHeight(wnd)-2,
 FALSE);
 if (RectRight(rc) == WindowWidth(wnd))
 shadow_char(wnd, WindowHeight(wnd)-2);
 }
 if (RectBottom(rc) == WindowHeight(wnd))
 /* ---------- bottom shadow ------------- */
 shadowline(wnd, clrc);
}


























July, 1991
STRUCTURED PROGRAMMING


The Chip Is Bad!




Jeff Duntemann, KG7JF


Given the nature and quality of our software, it's not especially surprising
that we put such faith in the quality of our hardware. When things go sour, we
begin with a very strong assumption that we futzed a pointer dereference
somewhere rather than having been the innocent victim of a sizzled memory
chip.
Part of this is the mass-production commodity nature of PC components, which
makes them tremendously reliable relative to how things were 15 years ago.
Besides (as we can always fall back to insisting) software is an art, whereas
hardware is simply a science.
Hardware used to be an art too. I wire-wrapped my first computer on a piece of
surplus perfboard that had been pulled out of a military god-knows-what and
sold by the pound at the Santa Fe Hamfest. It was based on a bizarre CPU from
RCA called the CDP 1802. I used surplus toggle switches for input and surplus
LEDs for a display, and I will admit right here that I did it wrong, in a
truly spectacular fashion. For while the machine seemed to be fully functional
(that is, nothing caught fire when I turned power on) it wouldn't execute the
little 40-byte machine-code programs I laboriously toggled in.
I toggled in progressively shorter test programs, none of which worked, until
I was down to a program consisting of a single opcode (7BH) which on the 1802
switched a dedicated serial output pin high. No deal. The output pin stayed
low. The CPU chip was obviously bad. I bought another, swapped CPUs, and tried
again. No dice. I even bought a third (silly boy!) before I realized that
something else was to blame.
All chips in the minimachine were, in fact, good. What I had done was this: I
wired the toggle switches upside down, so that an up switch sent a 0-bit to
the CPU rather than a 1-bit. However, the LEDs were showing correct binary
patterns because I had mistakenly used hex inverter chips rather than hex
driver chips to drive the LEDs. The switches were inverting bits sent to the
CPU, and the inverters were deinverting the bits before sending them to the
LEDs. So while the toggles and the LEDs said 7BH, what the CPU was getting was
84H, which (if I remember correctly) was the second half of an ADD
instruction.
Aren't you glad that hardware isn't an art anymore? Lordy, I sure am.


When Chips Really are Bad


I was pretty surprised to hear that many thousands (no one will admit to quite
how many) of IBM PS/2 machines were built with a defective UART chip. The chip
in question is National's 16550, which supposedly had FIFO buffers built in on
both transmit and receive. Trouble was, the FIFOs didn't work. We're not
talking static damage here, nor bad fabrication. The bug was right in there at
the mask level. National soon issued the 16550A, which had fully functional
FIFOs (sorta sings, doesn't it?) but that left large numbers of 16550s in
unsuspecting hands.
The fallout from this little embarrassment is that nobody really incorporates
FIFO support in their comm software for fear of running on machines that
contain the bum UART. This is dumb. Software should be soft -- and if there's
a way of detecting a hardware feature reliably, you're giving up a competitive
edge if you don't use it.
This is yet another reason to heed the lesson of the fallen Viking: Because if
you know the hardware intimately, you can tell missing features from existing
features, and good chips from the bad. The C guys have known this for years. I
have some hope that the Pascal gang will come along in time.
A little later in this column I'll explain how to detect the presence of a
serial port, and determine exactly which UART chip is installed at any given
port. It's remarkably easy -- but ya gotta know the hardware!
For now, it's back to the UART registers.


UART Registers


Modem Control Register (MCR) -- this register gathers several miscellaneous
modem control functions into one place. (See Figure 1.) There are two output
pins (apart from the data pin) on the RS-232C port. Data Terminal Ready (DTR)
is asserted by setting MCR bit 0 to 1. DTR basically means that the modem is
ready to talk to whatever is on the other end of the line, and once the
conversation has started, DTR must remain asserted. Drop DTR (that is, set bit
0 of MCR to 0) and the connection between the modem and its opposite number
will be broken.
A 1-bit placed in MCR bit 1 asserts the Request To Send (RTS) line, which is a
holdover from the days when modems were "half-duplex" and data could move in
only one direction at a time down the line. The line had to be "turned around"
after each end finished talking, and RTS was one of two RS-232C pins that
handled the turnaround protocol. (The other, an input pin, is Clear To Send.)
In these full-duplex days, RTS is typically asserted when you begin
communication and simply left asserted until the connection is broken. You
really only need to fool with it intelligently if you must support half-duplex
operation.
There are two undedicated output pins on the PC's UART chip. These two pins
allow hardware designers to build special functions into the hardware that are
controllable from software through the MCR. Writing a 0 or 1 to the OUT1 and
OUT2 bits just passes those logical states to those general-purpose output
pins.
OUT1 has no standard use on the PC, but keep in mind that any given serial
board or internal modem board may make use of it, and those uses are probably
different from board to board. The Hayes internal modems, for example, use the
OUT1 bit (and pin) to do a hardware reset on the entire modem board. In
IBM-standard PC serial adapters, OUT2 is used to enable and disable the
interrupt machinery on the serial adapter as a whole. OUT2 may be thought of
as a gate placed between the UART and the interrupt machinery inside the PC.
In other words, before you can do anything with interrupts at all, you must
set OUT2 to 1. Also, when the UART is reset (a hardware operation through a
pin on the UART chip), OUT2 reverts to its default state of 0, disabling
adapter interrupts.
Bit 4 initiates a sort of UART self-test called "loopback," in which UART
inputs are connected to UART outputs and software can determine whether the
UART is working internally. Attempting the loopback test can also detect with
great reliability whether a UART is present at all. More on this a little
later.
The MCR bit fields are summarized in Figure 1.
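In C, the MCR fields reduce to a handful of bit masks. The sketch below is an illustration rather than code from this column: the helper names are hypothetical, and the positions of OUT1 and OUT2 (bits 2 and 3) come from the standard 8250-family register layout, not from the text above.

```c
/* MCR bit masks for the 8250-family UARTs */
#define MCR_DTR  0x01  /* bit 0: Data Terminal Ready */
#define MCR_RTS  0x02  /* bit 1: Request To Send */
#define MCR_OUT1 0x04  /* bit 2: general-purpose output, board-specific */
#define MCR_OUT2 0x08  /* bit 3: gates adapter interrupts on IBM-standard boards */
#define MCR_LOOP 0x10  /* bit 4: loopback self-test */

/* Hypothetical helper: the value to write to MCR when opening a
   full-duplex connection. OUT2 is set only if interrupts are wanted. */
unsigned char mcr_for_connection(int want_interrupts)
{
    unsigned char mcr = MCR_DTR | MCR_RTS;

    if (want_interrupts)
        mcr |= MCR_OUT2;
    return mcr;
}

/* Hypothetical helper: drop DTR to break the connection,
   leaving the other MCR bits as they were. */
unsigned char mcr_hang_up(unsigned char mcr)
{
    return (unsigned char)(mcr & ~MCR_DTR);
}
```

On a PC serial adapter the MCR sits at the port's base address plus 4 (3FCH for COM1), so the value returned here is what you would write to that port. OUT1 is deliberately left alone, because its meaning differs from board to board.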
Line Status Register (LSR) -- the name of LSR is a little misleading--we're not
really talking about the line here. The status reflected in LSR is the status
of the UART machinery that converts bytes to outbound streams of serial bits
and inbound streams of serial bits to whole bytes. Both successes and failures
are reflected in LSR. See Figure 2.
The Data Ready (DR) field, bit 0, is useful mostly for polled communications,
as I presented in the POLL-TERM.PAS program two columns ago. DR goes to 1 when
a completed incoming character is available in the Receive Buffer Register,
RBR. Polling DR isn't how you do things in an interrupt-driven environment, as
I'll explain in time.
The Overrun Error (OE) field, bit 1, goes to 1 if the CPU fails to read a
completed character from RBR before the next character comes in and overwrites
RBR. One character "overruns" the next, destroying it -- and this flag is the
only way you'll know.
The Parity Error (PE) bit, bit 2, goes to 1 if the incoming character does not
reflect the parity bit requirements set up in bits 3-5 of the Line Control
Register. This is part of the UART's
primitive built-in character-by-character error detection, and is not used
much anymore. A noise pulse blasting one of the bits of an incoming character
will also change that character's parity, causing an error to be flagged in
PE.
Block-by-block error detection (through protocols such as XModem or KERMIT) is
by far the better way to go.
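To see why parity is such a weak check, it helps to model it in a few lines of C. This is a sketch of even parity only, with hypothetical function names. The UART does all of this in hardware.

```c
/* Returns 1 if the low `bits` bits of ch hold an odd number of 1-bits */
int odd_ones(unsigned char ch, int bits)
{
    int count = 0, i;

    for (i = 0; i < bits; i++)
        if (ch & (1u << i))
            count++;
    return count & 1;
}

/* Even parity: the sender picks the parity bit that makes the
   total number of 1-bits (data plus parity) even. */
int even_parity_bit(unsigned char ch, int bits)
{
    return odd_ones(ch, bits);
}

/* What the receiver checks: returns 1 (an error) when the data bits
   and the received parity bit do not add up to an even count. */
int parity_error_even(unsigned char ch, int bits, int parity_bit)
{
    return (odd_ones(ch, bits) ^ (parity_bit & 1)) != 0;
}
```

A 7-bit 'A' (41H) has two 1-bits, so its even parity bit is 0. Flip one bit in transit (43H) and the check fails, as it should. Flip two bits (47H) and the check passes again, which is why block-by-block protocols are the better bet.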
Bit 3, the Framing Error (FE) bit, goes to 1 if the UART does not correctly
receive the incoming character's stop bit. Framing errors happen when
something stops a character halfway through. The UART loses track of the
number of incoming bits when it can't frame the incoming character between
start and stop bits.
The most common way framing errors happen is for the remote system to send a
break signal down the line. A break signal is simply holding the
communications line to a steady space state for longer than the duration of a
single character at the current baud rate. The remote system can send a break
signal as a way to get your system's attention right now. Conceptually, it's
an interrupt that works across the communications line.
The Break Interrupt (BI) bit of LSR (bit 4) goes to 1 when the UART detects
that the remote system has held the line to a space for longer than one
character time.
Oddly, although DR, OE, PE, and FE can trigger a system interrupt when they go
to 1, BI cannot. An incoming break signal, however, will cause a framing
error, and if line status interrupts are enabled, a framing error will
trigger an interrupt, and an interrupt service routine can check the BI flag
to see if the framing error was in fact caused by a break signal.
You should keep in mind that not all framing errors are necessarily caused by
an incoming break signal. Line connection problems or noise pulses can cause
framing errors as well. To be sure that a framing error is caused by a break
signal, you must check BI!
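The LSR checks described above can be sketched in C as a small decoder. This is only an illustration: the mask names, the rx_status type, and the classify_lsr function are made up for this sketch, and the LSR value is passed in as a parameter rather than read from a real port.

```c
#include <stdint.h>

/* LSR bit masks, numbered as in the text (bit 0 = DR ... bit 4 = BI). */
#define LSR_DR 0x01u  /* Data Ready */
#define LSR_OE 0x02u  /* Overrun Error */
#define LSR_PE 0x04u  /* Parity Error */
#define LSR_FE 0x08u  /* Framing Error */
#define LSR_BI 0x10u  /* Break Interrupt */

typedef enum { RX_OK, RX_BREAK, RX_FRAMING, RX_OVERRUN, RX_PARITY } rx_status;

/* Hypothetical helper: reduce one LSR reading to a single receive status.
   BI is tested before FE so that a break is never reported as a plain
   framing error caused by noise or a bad connection. */
static rx_status classify_lsr(uint8_t lsr)
{
    if (lsr & LSR_BI) return RX_BREAK;
    if (lsr & LSR_FE) return RX_FRAMING;
    if (lsr & LSR_OE) return RX_OVERRUN;
    if (lsr & LSR_PE) return RX_PARITY;
    return RX_OK;
}
```

A reading of $18 (FE and BI both set) comes back as RX_BREAK, while $08 (FE alone) comes back as RX_FRAMING.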
I'll go into more detail on the enabling of line status interrupts when we
take up interrupts in a future column.
Bit 5 of LSR, Transmit Holding Register Empty (THRE), goes to 1 when the
UART has emptied the Transmit Holding Register by starting to transmit the
byte previously in the register. This bit in a one-state means that you may
load the next byte into the UART for transmission.
Don't confuse THRE with bit 6 of LSR, the Transmitter Empty bit, TE. THRE and
TE are related, but there is a critical difference: THRE indicates when it is
safe to load the next character into the UART. TE goes to 1 when the character
being transmitted has been fully serialized and sent out of the UART.
THRE and TE report the status of two different components of the UART. The
transmit holding register monitored by THRE is just that: A place to hold a
character pending the start of its transmission. The TE bit reports the
condition of the UART's internal shift register that bumps bits off the end of
the transmitted byte to send them down the line, nose-to-tail. A subtlety to
be remembered is that TE also keeps an eye on the transmit holding register.
When the shift register and the transmit holding register are empty, TE goes
to 1.
In other words, THRE tells you when it is safe to load the next character in
for transmission. TE tells you when the last character is out the door and
that the UART is truly and fully idle. Check this bit before shutting down the
UART -- you don't want to leave any loose bits rattling around inside the
serial port!
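As a rough C illustration of the THRE/TE distinction, the two questions a transmit routine asks can be written as separate tests on an LSR value. The function names are invented for this sketch, and the LSR value is again passed in rather than read from hardware.

```c
#include <stdint.h>

#define LSR_THRE 0x20u  /* bit 5: Transmit Holding Register Empty */
#define LSR_TE   0x40u  /* bit 6: Transmitter Empty */

/* Safe to write the next byte to the transmit holding register? */
static int can_load_next_byte(uint8_t lsr)
{
    return (lsr & LSR_THRE) != 0;
}

/* Holding register and shift register both empty, so the UART is idle? */
static int transmitter_idle(uint8_t lsr)
{
    return (lsr & LSR_TE) != 0;
}
```

With LSR at $20 the holding register is free but a character is still being shifted out, so only the first test succeeds; at $60 both do.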
Modem Status Register (MSR) reports the status of four of the RS-232C input
signal lines present at the business end of the serial port. These are the
Clear To Send (CTS) and the Data Set Ready (DSR) lines mentioned earlier; the
Ring Indicator (RI) line, which goes to a one-state when the line is being
energized with the phone company's ring signal; and the Data Carrier Detect
(DCD) line, which goes to a one-state when the serial port detects a data
carrier coming in from the remote system.
MSR tells you two things: The current state of these four lines right now, and
whether or not any of the four lines have changed since the last time you read
MSR. Figure 3 should make this clearer. The four high bits of MSR report the
current state of the modem input pins, and the low four bits report whether
those pins have changed since the last time MSR was read.
The "delta" names shown in Figure 3 just indicate that something has changed.
(From delta, the engineer's symbol for change.) Only Trailing Edge of Ring
Indicator (TERI) bears further explanation: The TERI bit goes high when the
ring signal has stopped. (That is, when the serial port has detected the
"trailing edge" of the ring signal, looking upon the signal as a trace on an
oscilloscope.) RI, however, reflects the instantaneous state of the ring
signal: 1 if present, 0 if not. TERI indicates that a full ring signal has
come and gone since the last poll of MSR. This allows you to count full rings
without having to constantly poll RI to determine when a ring has completed.
Reading MSR clears bits 0-3, and the bits will stay cleared until one of the
signal lines changes. You can poll MSR to see if the input signals have
changed, but you can also configure the UART to generate an interrupt when any
of the "delta" bits change. This makes it possible to let an interrupt service
routine count incoming rings and update a counter for you, allowing your
software to answer an incoming call only after a preset number of rings.
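The ring-counting idea can be sketched in C as follows. The mask names follow the usual National data sheet layout (delta bits in the low nybble, line states in the high nybble); the ring_counter type and ring_counter_update function are hypothetical, and each MSR reading is handed in by the caller, whether it came from a polling loop or an interrupt service routine.

```c
#include <stdint.h>

/* MSR bit masks: low nybble = "delta" flags, high nybble = current states. */
#define MSR_DCTS 0x01u  /* delta Clear To Send */
#define MSR_DDSR 0x02u  /* delta Data Set Ready */
#define MSR_TERI 0x04u  /* Trailing Edge of Ring Indicator */
#define MSR_DDCD 0x08u  /* delta Data Carrier Detect */
#define MSR_CTS  0x10u
#define MSR_DSR  0x20u
#define MSR_RI   0x40u
#define MSR_DCD  0x80u

typedef struct {
    unsigned rings;         /* completed rings seen so far */
    unsigned answer_after;  /* preset number of rings before answering */
} ring_counter;

/* Feed one MSR reading to the counter. A set TERI bit means one full ring
   has come and gone since the previous read. Returns 1 once the preset
   number of rings has been reached, 0 otherwise. */
static int ring_counter_update(ring_counter *rc, uint8_t msr)
{
    if (msr & MSR_TERI)
        rc->rings++;
    return rc->rings >= rc->answer_after;
}
```

Because reading MSR clears the delta bits, each completed ring is counted once, provided every reading is passed through the counter.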

Again, I'll have more to say about MSR once we get into the interrupt side of
things.
Scratch Register (SCR) at offset 7 from the port base is just that: A 1-byte
holding register that will hold any byte-sized value as long as you leave it
there, ready to be read back any time. SCR doesn't control anything. It's just
a handy pocket to tuck something in for the time being.
Actually, what SCR does best is indicate whether or not the simplest PC UART
chip, the 8250, is present in a machine, as I'll explain shortly.


Detecting Serial Ports


That's the register set in some detail. I haven't explained the
interrupt-related bits thoroughly, nor the FIFO status/control bits. We'll
get to those in more detail in future columns, when I explain how to create
serial port interrupt support and how to use the FIFOs. (One can't explain
everything in 3500 words!)
Knowing the register set thoroughly suggests certain highly useful tricks.
Perhaps the most useful of any is simply a means of knowing whether there
actually is a UART installed at either of the two standard port locations
(COM1 and COM2) or the pseudo-standard port locations (COM3 or COM4.)
Detecting a device that contains I/O ports is usually a matter of reading some
predictable value from one of the ports. There are devices that identify
themselves when polled through their I/O ports, but UARTs are not among them.
What you should then do is see if there's any sort of register in the device
that can be written to through an I/O port and then read back. If what you
read back from the port is what you had written out, you've got a device on
the end of your string.
What happens if you read an I/O port that doesn't exist? Nothing ugly, at
least. Typically, in modern PC designs, you'll get $FF. There are no
guarantees, however. I've seen at least one (an ancient clone long extinct)
that returned $00.
Reading back a value written to a register is a reliable indicator that
something is out there. But how do you tell if that something is indeed a UART
and not some weird, early-vintage or custom infrared coffee pot controller
interface board? "Standards" are scant promise that nothing other than a UART
will ever be found at port $03E8. Far better than simply writing out and
reading back a value is to trigger some sort of response that could only be
attributed to the exact device you're expecting.
That's what I went looking for through the forest of registers in the standard
PC UART chips. I found it in the UART loopback test.


Looping Back


The loopback test is an intriguing built-in feature present in all National
Semiconductor UARTs found in PCs. It's a self-diagnostic that, when enabled,
throws internal switches that route out-bound characters back into the RBR
without ever leaving the chip. Write a character to the THR and that character
will immediately appear in the RBR... if the UART is correctly connected and
fully functional.
One way to look for a UART, then, is to throw the UART into loopback test
mode, and then write a character to the THR. If the character can then be read
back from RBR, a UART is present.
This works. The trouble is, RBR and THR exist at the same I/O port address,
offset 0 from the serial port base I/O address. Writing a value to THR and
reading it back through RBR is thus no different from writing a value to any
read/write register and getting it back again.
I wanted something a little more specific to UARTs. Fortunately, the loopback
test does more than just short-circuit the UART's internal data path. With
loopback enabled, the UART takes the four RS-232C input control signals and
connects them internally to the four RS-232C output signals, in this fashion:
   CTS to RTS
   DSR to DTR
   RI  to OUT1
   DCD to OUT2
If you write a bit pattern to the output signal bits, that bit pattern will be
turned around and will be placed on the input signal bits. And this time, the
inputs and outputs are in different registers. The output signal bits are in
the Modem Control Register (MCR), and the input signal bits are in the Modem
Status Register (MSR). Writing a value to one register and getting that value
back from an entirely different register is much more characteristic of a
specific device.
But wait... it gets better. In the process, the loopback test reverses two of
the signal bits, so that a binary pattern written to MCR will be subtly but
predictably changed in passing through the UART to emerge at MSR.
I've laid it out in Figure 4. The full bit-field layouts of MCR and MSR are
shown in Figure 1 and Figure 3. After putting the UART into loopback test mode,
you can write a distinctive nybble (I've chosen $A) to the low four bits of
MCR. The four bits will be passed through to the high four bits of MSR, except
that the least significant two bits will be interchanged, as shown in the
figure.
The total effect is that a $0A byte written to the port at offset 4 can
immediately be read back from the port at offset 6, but as the value $90. This
is mighty distinctive behavior, and I seriously doubt that any other device
than a National-based UART will satisfy this kind of test.
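The bit shuffle can be modeled in a few lines of C. This is a paper model of the loopback wiring, not code that touches a port: loopback_msr_high is a made-up name, and the mapping assumes the standard MCR layout (DTR, RTS, OUT1, OUT2 in bits 0-3) and MSR layout (CTS, DSR, RI, DCD in bits 4-7).

```c
#include <stdint.h>

/* Model of the loopback wiring: given the value written to MCR, return
   what the high nybble of MSR should read back as. Bit 4 of MCR (the
   loopback enable bit) and anything above it are ignored. */
static uint8_t loopback_msr_high(uint8_t mcr)
{
    uint8_t msr = 0;
    if (mcr & 0x01u) msr |= 0x20u;  /* DTR  (MCR bit 0) -> DSR (MSR bit 5) */
    if (mcr & 0x02u) msr |= 0x10u;  /* RTS  (MCR bit 1) -> CTS (MSR bit 4) */
    if (mcr & 0x04u) msr |= 0x40u;  /* OUT1 (MCR bit 2) -> RI  (MSR bit 6) */
    if (mcr & 0x08u) msr |= 0x80u;  /* OUT2 (MCR bit 3) -> DCD (MSR bit 7) */
    return msr;
}
```

Feeding $0A through the model yields $90, the value the detection test looks for; a symmetric pattern such as $0F comes back as $F0 and would not reveal the swap.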
The code implementing the test algorithm is shown in the DetectComPort
function in Listing One (page 153). There's not much to it: Throw the UART
into loopback mode, write $0A to MCR, read the value from MSR, mask off the
low four bits, and see if what's left amounts to $90. If so you have a UART at
the specified port base. If not, no UART. (The rest is housekeeping.) I've run
this test on a couple of dozen PCs over the last month and it pegged every
port, every time.


Looking for Mr. Badchip


I mentioned earlier that not all UARTs were created equal. Some were created
rather badly, in fact. There are primitive UARTs, middling UARTs, and advanced
UARTs. To make the most of the hardware on which your code is running, you
need to be able to tell which is which. Again, understanding the hardware at a
very detailed level is the key.
The original 8-bit IBM PC and XT machines supported an add-in serial port
board containing National's 8250B UART. The 8250B works well, but it has
neither the scratch register at offset 7 from the base I/O address nor the
internal FIFO buffers of its successor UART models.
The AT-class PCs used a 16-bit add-in serial port board built around
National's next-generation UART, the 16450. Its major differences over the
8250B are the scratch register and the ability to work in faster systems and
support higher baud rates.
IBM's PS/2 machines originally appeared with National's subsequent UART, the
16550. The 16550, as I mentioned at the start of this column, was supposed to
contain 16-character deep FIFO buffers for both the transmit and receive
sides. The FIFOs were in the chip somewhere, but they didn't work correctly.
National quietly retired the 16550 for the corrected 16550A, but not before an
untold number of 16550 chips went out in early PS/2 models like the 50, 60,
and the 16MHz 80. (The 20MHz Model 80 contains the 16550A.)
Actually, telling the UART models apart (once you know a National UART exists
at a specific port base address) is easier than detecting a UART's presence to
begin with.
You begin by writing a distinctive value to the scratch register at base
offset 7, and attempting to read it back. If you don't get back what you
wrote, you have an 8250B, period. The rest of the test involves whether the
FIFO registers exist in the UART, and whether or not the FIFOs work correctly.
Enabling the FIFOs is done by setting bit 0 of the FIFO Control Register (FCR)
at offset 2 from the port base. Only the 16550 and the 16550A even have a FIFO
Control Register. Writing to the FCR on a 16450 has no effect whatsoever.
However, on the 16550 and 16550A, enabling the FIFOs affects two status bits
at bits 6 and 7 of the Interrupt Identification Register, IIR. These two bits
reflect the status of the send and receive FIFO buffers. If both FIFOs are
enabled, both FIFO status bits should be set to 1.
In the 16550A, this is the case. Enable FIFOs on the 16550A, and bits 6 and 7
of IIR immediately go to 1. However, part of the nature of the 16550's
internal bug is that only bit 7 goes to 1 when the FIFOs are enabled. (It was
a relatively cooperative bug, as bugs go.)
So telling the 16450, 16550, and 16550A apart devolves to a CASE statement on
bits 6 and 7 of IIR. If both bits are 0, you have a 16450. If bit 6 is 0 and
bit 7 is 1, you have a 16550. If bits 6 and 7 are both 1, you have a 16550A.
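The decision logic alone can be sketched in C, separated from the port accesses. The classify_uart function and its parameters are hypothetical: the caller is assumed to have already tried the scratch register and to have read IIR after setting bit 0 of FCR. The numeric codes match the 0-4 codes used by the listing.

```c
#include <stdint.h>

typedef enum {
    UART_FAULTY = 0,  /* unexpected FIFO status pattern */
    UART_8250   = 1,  /* no scratch register */
    UART_16450  = 2,  /* scratch register, no FIFOs */
    UART_16550  = 3,  /* FIFOs present but broken */
    UART_16550A = 4   /* FIFOs fully operative */
} uart_type;

/* scratch_ok: nonzero if a value written to the scratch register read back
   intact. iir_after_enable: IIR contents read after enabling the FIFOs. */
static uart_type classify_uart(int scratch_ok, uint8_t iir_after_enable)
{
    if (!scratch_ok)
        return UART_8250;
    switch (iir_after_enable & 0xC0u) {  /* keep only bits 6 and 7 */
    case 0xC0u: return UART_16550A;
    case 0x80u: return UART_16550;
    case 0x00u: return UART_16450;
    default:    return UART_FAULTY;      /* bit 6 set without bit 7 */
    }
}
```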


The Example Program


Listing One, LOOPTEST.PAS, is a simple UART detector utility containing two
functions. DetectComPort returns a Boolean value indicating whether or not a
UART is present at a given COM port (1-4). Keep in mind that COM3 and COM4 were not
originally defined by IBM and are not strong standards. They may exist at I/O
base addresses other than $03E8 and $02E8. If LOOPTEST doesn't detect your
COM3 or COM4 ports, this is probably the reason.
The other function, DetectUARTType, returns a code from 0 to 4 depending on
what National Semiconductor UART model is found. Again, you must pass the port
number (1-4) to DetectUARTType, and the function assumes that a UART is
present at that port number. Turn DetectUARTType loose on a port number at
which no UART is installed, and it may tell you an 8250B is there. Check
first!
So far, LOOPTEST has reported correctly on all machines available to me for
testing. If you come across a serial port setup that fools it, do let me know.
And next time, it's interrupts fersure.
_STRUCTURED PROGRAMMING COLUMN_
by Jeff Duntemann


[LISTING ONE]

PROGRAM LoopTest; { From "Structured Programming" DDJ 7/91 }

CONST
 UART : ARRAY[0..4] OF STRING =
 (' faulty','n 8250',' 16450',' 16550',' 16550A');
VAR
 PortNum : Byte;
{-----------------------------------------------------------------}
{ FUNCTION DetectComPort by Jeff Duntemann }
{ This function returns a Boolean value indicating whether or not }
{ a National Semiconductor UART (one of 8250B, 16450, 16550, or }
{ 16550A) is present at the COM port passed in the PortNumber }
{ parameter. Don't run this function on serial ports that may be }
{ operating in some sort of background mode; it will probably }
{ disrupt any communication stream currently in progress. }
{-----------------------------------------------------------------}
FUNCTION DetectComPort(PortNumber : Integer) : Boolean;
CONST
 LOOPBIT = $10;
 PortBases : ARRAY[1..4] OF Integer = ($03F8,$02F8,$03E8,$02E8);
 { COM1 COM2 COM3 COM4 }
VAR
 Holder,HoldMCR,HoldMSR : Byte;
 MCRPort,MSRPort,THRPort,RBRPort : Integer;
BEGIN
  { Calculate port numbers for the port being looked for: }
  RBRPort := PortBases[PortNumber];    { RBR is at the base address }
  THRPort := RBRPort;                  { RBR and THR have same I/O address }
  MCRPort := RBRPort + 4;              { MCR is at offset 4 }
  MSRPort := RBRPort + 6;              { MSR is at offset 6 }
  { Put the UART into loopback test mode: }
  HoldMCR := Port[MCRPort];            { Save existing value of MCR }
  Port[MCRPort] := HoldMCR OR LOOPBIT; { Turn on loopback test mode }
  HoldMSR := Port[MSRPort];
  Port[MCRPort] := $0A OR LOOPBIT;     { Put pattern to low 4 bits of MCR }
                                       { without disabling loopback mode }
  Holder := Port[MSRPort] AND $F0;     { Read pattern from hi 4 bits of MSR }
  IF Holder = $90 THEN      { The $A pattern is changed to $9 inside UART }
    DetectComPort := True
  ELSE
    DetectComPort := False;
  { Restore previous contents of MSR: }
  Port[MSRPort] := HoldMSR;
  { Take the UART out of loopback mode & restore old state of MCR: }
  Port[MCRPort] := HoldMCR AND (NOT LOOPBIT);
END;
{-----------------------------------------------------------------}
{ FUNCTION DetectUARTType by Jeff Duntemann }
{ This function returns a numeric code indicating which UART chip }
{ is present at the selected PortNumber (1-4.) The UART codes }
{ returned are as follows: }
{ 0 : Error; bad UART or no UART at COMn, where n=PortNumber }
{ 1 : 8250; generally (but not always!) present in PC or XT }
{ 2 : 16450; generally present in AT-class machines }
{ 3 : 16550; in PS/2 mod 50/60/early 80. FIFOs don't work! }
{ 4 : 16550A; in later PS/2's. FIFOs fully operative. }
{ NOTE: This routine assumes a UART is "out there" at the port # }
{ specified. Run DetectComPort first to make sure port is there! }
{-----------------------------------------------------------------}
FUNCTION DetectUARTType(PortNumber : Integer) : Integer;
CONST
  PortBases : ARRAY[1..4] OF Integer = ($03F8,$02F8,$03E8,$02E8);
                                     { COM1   COM2   COM3   COM4 }
VAR
  ScratchPort,IIRPort,FCRPort : Integer;
  Holder : Byte;
BEGIN
  { The scratch register is at offset 7 from the comm port base: }
  ScratchPort := PortBases[PortNumber] + 7;
  FCRPort := PortBases[PortNumber] + 2;
  IIRPort := FCRPort;       { IIR and FCR are at same offset }
  Port[ScratchPort] := $AA; { Write pattern to the scratch register }
  IF Port[ScratchPort] <> $AA THEN { Attempt to read it back... }
    DetectUARTType := 1 { A UART without a scratch register is an 8250 }
  ELSE
    BEGIN { Now we have to test among the 16450, 16550, and 16550A }
      Port[FCRPort] := $01; { Setting FCR bit 0 on 16550 enables FIFOs }
      Holder := Port[IIRPort] AND $C0; { Read back the FIFO status bits }
      CASE Holder OF
        $C0 : DetectUARTType := 4; { Bits 6 & 7 both set = 16550A }
        $80 : DetectUARTType := 3; { Bit 7 set & bit 6 cleared = 16550 }
        $00 : DetectUARTType := 2; { Neither bit set = 16450 }
        ELSE DetectUARTType := 0;  { Error condition }
      END; {CASE}
      Port[FCRPort] := $00; { Don't leave the FIFOs enabled! }
    END;
END;

BEGIN
  FOR PortNum := 1 TO 4 DO
    BEGIN
      IF DetectComPort(PortNum) THEN
        BEGIN
          Write ('Port COM',PortNum,' is present,');
          Writeln(' using a',UART[DetectUARTType(PortNum)],' UART.');
        END
      ELSE
        Writeln('Port COM',Portnum,' is not present.');
    END;
END.























July, 1991
GRAPHICS PROGRAMMING


Mode X: 256-Color VGA Magic




Michael Abrash


There's a well-known Latin saying, in complexitate est opportunitas ("in
complexity there is opportunity"), that must have been invented with the VGA
in mind. Well, actually, it's not exactly well-known (I just thought of it
this afternoon), but it should be. As evidence, witness the strange case of
the VGA's 320 x 240 256-color mode, which is undeniably complex to program and
isn't even documented by IBM -- but which is, nonetheless, perhaps the single
best mode the VGA has to offer, especially for animation.


What Makes 320 x 240 Special?


Five features set the 320 x 240 256-color mode (which I'll call "mode X,"
befitting its mystery status in IBM's documentation) apart from other VGA
modes. First, it has a 1:1 aspect ratio, resulting in equal pixel spacing
horizontally and vertically (square pixels). Square pixels make for the most
attractive displays, and avoid considerable programming effort that would
otherwise be necessary to adjust graphics primitives and images to match the
screen's pixel spacing. (For example, with square pixels, a circle can be
drawn as a circle; otherwise, it must be drawn as an ellipse that corrects for
the aspect ratio -- a slower, more complicated process.) In contrast, mode
13h, the only documented 256-color mode, provides a nonsquare 320 x 200
resolution.
Second, mode X allows page flipping, a prerequisite for the smoothest possible
animation. Mode 13h does not allow page flipping, nor does mode 12h, the VGA's
high-resolution 640 x 480 16-color mode.
Third, mode X allows the VGA's plane-oriented hardware to be used to process
pixels in parallel, improving performance by up to four times over mode 13h.
Fourth, like mode 13h but unlike all other VGA modes, mode X is a
byte-per-pixel mode (each pixel is controlled by one byte in display memory),
eliminating the slow read-before-write and bit-masking operations often
required in 16-color modes. In addition to cutting the number of memory
accesses in half, this is important because the memory caching schemes used by
many VGA clones speed up writes more than reads.
Fifth, unlike mode 13h, mode X has plenty of offscreen memory free for image
storage. This is particularly effective in conjunction with the use of the
VGA's latches; together, the latches and the off-screen memory allow images to
be copied to the screen four pixels at a time.
There's a sixth feature of mode X that's not so terrific: It's hard to program
efficiently. If you've ever programmed a VGA 16-color mode directly, you know
that VGA programming can be demanding; mode X is often as demanding as
16-color programming, and operates by a set of rules that turns everything
you've learned in 16-color mode sideways. Programming mode X is nothing like
programming the nice, flat bitmap of mode 13h, or, for that matter, the flat,
linear (albeit banked) bitmap used by 256-color SuperVGA modes. (I'd like to
emphasize that mode X works on all VGAs, not just SuperVGAs.) Many programmers
I talk to love the flat bitmap model, and think that it's the ideal
organization for display memory because it's so straightforward to program.
Remember the saying I started this column with, though; the complexity of mode
X truly is opportunity -- opportunity for the best combination of performance
and appearance the VGA has to offer. If you do 256-color programming,
especially if you use animation, you're missing the boat if you're not using
mode X.
Although some developers have taken advantage of mode X, its use is certainly
not widespread, being entirely undocumented; only an experienced VGA
programmer would have the slightest inkling that it exists, and figuring out
how to make it perform beyond the write pixel/read pixel level is no mean
feat. I've never seen anything in print about it, and, in fact, the only
articles I've seen about any of the undocumented 256-color modes were my own
articles about the 320 x 200, 320 x 400, and 360 x 480 256-color modes in
Programmer's Journal (January and September, 1989). (However, John Bridges has
put code for a number of undocumented 256-color resolutions into the public
domain, and I'd like to acknowledge the influence of his code on the mode set
routine presented in this article.)
Given the tremendous advantages of 320 x 240 over the documented mode 13h, I'd
very much like to get it into the hands of as many developers as possible, so
I'm going to spend the next few columns exploring this odd but worthy mode.
I'll provide mode set code, delineate the bitmap organization, and show how
the basic write pixel and read pixel operations work. Then I'll move on to the
magic stuff: rectangle fills, screen clears, scrolls, image copies, pixel
inversion, and, yes, polygon fills (just a different driver), all blurry fast;
hardware raster ops; and page flipping. In the end, I'll build a working
animation program that shows many of the features of mode X in action.
The mode set code is the logical place to begin.


Selecting 320 x 240 256-Color Mode


We could, if we wished, write our own mode set code for mode X from scratch --
but why bother? Instead, we'll let the BIOS do most of the work by having it
set up mode 13h, which we'll then turn into mode X by changing a few
registers. Listing One (page 154) does exactly that.
After setting up mode 13h, Listing One alters the vertical counts and timings
to select 480 visible scan lines. (There's no need to alter any horizontal
values, because mode 13h and mode X both have 320-pixel horizontal
resolutions.) The Maximum Scan Line register is programmed to double scan each
line (that is, repeat each scan line twice), however, so we get an effective
vertical resolution of 240 scan lines. It is, in fact, possible to get 400 or
480 independent scan lines in 256-color mode (see the aforementioned articles
for details); however, 400-scan-line modes lack square pixels and can't
support simultaneous offscreen memory and page flipping, and 480-scan-line
modes lack page flipping altogether, due to memory constraints.
At the same time, Listing One programs the VGA's bitmap to a planar
organization that is similar to that used by the 16-color modes, and utterly
different from the linear bitmap of mode 13h. The bizarre bitmap organization
of mode X is shown in Figure 1. The first pixel (the pixel at the upper left
corner of the screen) is controlled by the byte at offset 0 in plane 0. (The
one thing that mode X blessedly has in common with mode 13h is that each pixel
is controlled by a single byte, eliminating the need to mask out individual
bits of display memory.) The second pixel, immediately to the right of the
first pixel, is controlled by the byte at offset 0 in plane 1. The third pixel
comes from offset 0 in plane 2, and the fourth pixel from offset 0 in plane 3.
Then the fifth pixel is controlled by the byte at offset 1 in plane 0, and
that cycle continues, with each group of four pixels spread across the four
planes at the same address. The offset M of pixel N in display memory is M =
N/4, and the plane P of pixel N is P = N mod 4. For display memory writes, the
plane is selected by setting bit P of the Map Mask register (Sequence
Controller register 2) to 1 and all other bits to 0; for display memory reads,
the plane is selected by setting the Read Map register (Graphics Controller
register 4) to P.
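The addressing rule above can be sketched in C. The names here (modex_locate, modex_addr, MODEX_WIDTH) are invented for illustration and do not come from the listings; the sketch assumes a 320-pixel-wide bitmap starting at offset 0 and only does the arithmetic, leaving the register writes to the caller.

```c
#include <stdint.h>

#define MODEX_WIDTH 320u  /* pixels per scan line */

typedef struct {
    unsigned offset;    /* byte offset M within each plane */
    unsigned plane;     /* plane P, 0..3; also the Read Map value */
    uint8_t  map_mask;  /* Map Mask value with only bit P set */
} modex_addr;

/* Locate pixel (x, y): N = y * 320 + x, M = N / 4, P = N mod 4. */
static modex_addr modex_locate(unsigned x, unsigned y)
{
    unsigned long n = (unsigned long)y * MODEX_WIDTH + x;
    modex_addr a;
    a.offset   = (unsigned)(n / 4);
    a.plane    = (unsigned)(n % 4);
    a.map_mask = (uint8_t)(1u << a.plane);
    return a;
}
```

Because 320 is a multiple of four, each scan line occupies exactly 80 addresses, so the offset can equally be computed as y * 80 + x / 4 and the plane as x mod 4.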
It goes without saying that this is one ugly bitmap organization, requiring a
lot of overhead to manipulate a single pixel. The write pixel code shown in
Listing Two (page 154) must determine the appropriate plane and perform a
16-bit OUT to select that plane for each pixel written, and likewise for the
read pixel code shown in Listing Three (page 154). Calculating and mapping in
a plane once for each pixel written is scarcely a recipe for performance.
That's all right, though, because most graphics software spends little time
drawing individual pixels. I've provided the write and read pixel routines as
basic primitives, and so you'll understand how the bitmap is organized, but
the building blocks of high-performance graphics software are fills, copies,
and bitblts, and it's here that mode X shines.


Designing From a Mode X Perspective


Listing Four (page 154) shows mode X rectangle fill code. The plane is
selected for each pixel in turn, with drawing cycling from plane 0 to plane 3
then wrapping back to plane 0. This is the sort of code that stems from a
write-pixel line of thinking; it reflects not a whit of the unique perspective
that mode X demands, and although it looks reasonably efficient, it is in fact
some of the slowest graphics code you will ever see. I've provided Listing
Four partly for illustrative purposes, but mostly so we'll have a point of
reference for the substantial speed-up that's possible with code that's
designed from a mode X perspective.
The two major weaknesses of Listing Four both result from selecting the plane
on a pixel by pixel basis. First, endless OUTs (which are particularly slow on
386s and 486s, often much slower than accesses to display memory) must be
performed, and, second, REP STOS can't be used. Listing Five (page 156)
overcomes both these problems by tailoring the fill technique to the
organization of display memory. Each plane is filled in its entirety in one
burst before the next plane is processed, so only five OUTs are required in
all, and REP STOS can indeed be used. (I've used REP STOSB in Listings Five
and Six (page 156). REP STOSW could be used and would improve performance on
some 16-bit VGAs; however, REP STOSW requires extra overhead to set up, so it
can be slower for small rectangles, especially on 8-bit VGAs.) Doing an entire
plane at a time can produce a "fading-in" effect for large images, because all
columns for one plane are drawn before any columns for the next; if this is a
problem, the four planes can be cycled through once for each scan line, rather
than once for the entire rectangle.
Listing Five is 2.5 times faster than Listing Four at clearing the screen on a
20-MHz cached 386 with a Paradise VGA. Although Listing Five is slightly
slower than an equivalent mode 13h fill routine would be, it's not grievously
so. In general, performing plane-at-a-time operations can make almost any mode
X operation, at the worst, nearly as fast as the same operation in mode 13h
(although this sort of mode X programming is admittedly fairly complex). In
this pursuit, it can help to organize data structures with mode X in mind. For
example, icons could be prearranged in system memory with the pixels organized
into four plane-oriented sets (or, again, in four sets per scan line to avoid
a fading-in effect) to facilitate copying to the screen a plane at a time with
REP MOVS.
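One way to prearrange an icon is sketched below in C. The planarize function is hypothetical, and it makes two simplifying assumptions that the text does not: the icon width is a multiple of four, and the icon will be drawn at an x coordinate that is also a multiple of four, so that icon column 0 lands in plane 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Rearrange a linear byte-per-pixel image into four consecutive
   plane-oriented sets. Set p holds the pixels of columns p, p+4, p+8, ...
   row by row. 'out' must have room for width * height bytes, and width
   is assumed to be a multiple of 4. */
static void planarize(const uint8_t *linear, size_t width, size_t height,
                      uint8_t *out)
{
    size_t bytes_per_row = width / 4;
    size_t plane_size = bytes_per_row * height;
    size_t x, y;

    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
            size_t plane = x % 4;
            out[plane * plane_size + y * bytes_per_row + x / 4] =
                linear[y * width + x];
        }
    }
}
```

Each of the four sets can then be copied to display memory with one Map Mask setting and one block move per scan line (or one for the whole set, if the destination pitch is handled separately).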


Hardware Assist from an Unexpected Quarter


Listing Five illustrates the benefits of designing code from a mode X
perspective; this is the software aspect of mode X optimization, which
suffices to make mode X about as fast as mode 13h. That alone makes mode X an
attractive mode, given its square pixels, page flipping, and offscreen memory,
but superior performance would nonetheless be a pleasant addition to that
list. Superior performance is indeed possible in mode X, although, oddly
enough, it comes courtesy of the VGA's hardware, which was never designed to
be used in 256-color modes.
All of the VGA's hardware assist features are available in mode X, although
some are not particularly useful. The VGA hardware feature that's truly the
key to mode X performance is the ability to process four planes' worth of data
in parallel; this includes both the latches and the capability to fan data out
to any or all planes. For rectangular fills, we'll just need to fan the data
out to various planes, so I'll defer a discussion of other hardware features
until another column. (By the way, the ALUs, bit mask, and most other VGA
hardware features are also available in mode 13h -- but parallel data
processing is not.)
In planar modes, such as mode X, a byte written by the CPU to display memory
may actually go to anywhere between zero and four planes, as shown in Figure
2. Each plane for which the setting of the corresponding bit in the Map Mask
register is 1 receives the CPU data, and each plane for which the
corresponding bit is 0 is not modified.
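The fan-out can be modeled in C with four small arrays standing in for the planes. Everything here (planar_memory, planar_write, the 16-byte plane size) is a toy model invented for illustration; on real hardware the Map Mask is a register setting and the fan-out happens inside the VGA.

```c
#include <stdint.h>

#define PLANE_BYTES 16u  /* toy size; real planes are 64K each */

typedef struct {
    uint8_t plane[4][PLANE_BYTES];
} planar_memory;

/* Model of one CPU byte write: every plane whose Map Mask bit is 1
   receives the value, and every other plane is left untouched. */
static void planar_write(planar_memory *vram, unsigned offset,
                         uint8_t value, uint8_t map_mask)
{
    unsigned p;
    for (p = 0; p < 4; p++) {
        if (map_mask & (1u << p))
            vram->plane[p][offset] = value;
    }
}
```

A mask of $0F updates all four planes with a single write, $05 updates planes 0 and 2 only, and $00 updates none.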
In 16-color modes, each plane contains one-quarter of each of eight pixels,
with the 4 bits of each pixel spanning all four planes. Not so in mode X. Look
at Figure 1 again; each plane contains one pixel in its entirety, with four
pixels at any given address, one per plane. Still, the Map Mask register does
the same job in mode X as in 16-color modes; set it to 0Fh (all 1-bits), and
all four planes will be written to by each CPU access. Thus, it would seem
that up to four pixels could be set by a single mode X byte-sized write to
display memory, potentially speeding up operations like rectangle fills by
four times.
And, as it turns out, four-plane parallelism works quite nicely indeed.
Listing Six is yet another rectangle-fill routine, this time using the Map
Mask to set up to four pixels per STOS. The only trick to Listing Six is that
any left or right edge that isn't aligned to a multiple-of-four pixel column
(that is, a column at which one four-pixel set ends and the next begins) must
be clipped via the Map Mask register, because not all pixels at the address
containing the edge are modified. Performance is as expected; Listing Six is
nearly ten times faster at clearing the screen than Listing Four and just
about four times faster than Listing Five--and also about four times faster
than the same rectangle fill in mode 13h. Understanding the bitmap
organization and display hardware of mode X does indeed pay.
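The edge clipping is the only subtle part, so here is a sketch of it in C. The two tables are the ones in Listing Six; FillSpan and plan_fill_span() are hypothetical names introduced for illustration, and the function only plans the writes for one scan line rather than touching the hardware.

```c
#include <assert.h>

/* Same tables as in Listing Six, indexed by (X & 3). */
static const unsigned char left_clip_mask[4]  = { 0x0F, 0x0E, 0x0C, 0x08 };
static const unsigned char right_clip_mask[4] = { 0x0F, 0x01, 0x03, 0x07 };

typedef struct {
    unsigned addresses;        /* bytes written per scan line */
    unsigned char left_mask;   /* Map Mask for the first address */
    unsigned char right_mask;  /* Map Mask for the last address */
} FillSpan;

/* Plan the fill of pixels start_x up to, but not including, end_x.
   Addresses between the first and last use a Map Mask of 0Fh. */
static FillSpan plan_fill_span(int start_x, int end_x)
{
    FillSpan span;

    assert(start_x >= 0 && end_x > start_x);
    span.left_mask = left_clip_mask[start_x & 3];
    span.right_mask = right_clip_mask[end_x & 3];
    span.addresses = (unsigned)(((end_x - 1) - (start_x & ~3)) >> 2) + 1;
    if (span.addresses == 1) {
        /* Both edges fall in the same byte, so combine the masks.
           (Listing Six draws only the left edge in this case; the
           right mask is set equal here just to keep the result tidy.) */
        span.left_mask &= span.right_mask;
        span.right_mask = span.left_mask;
    }
    return span;
}
```

A full-width fill from 0 to 320 plans 80 addresses with both masks at 0Fh, while a fill from 5 to 7 plans a single address with a mask of 06h, touching only planes 1 and 2.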
Just so you can see mode X in action, Listing Seven (page 158) is a sample
program that selects mode X and draws a number of rectangles. Listing Seven
links to any of the rectangle fill routines I've presented.
And now, I hope, you begin to see why I'm so fond of mode X. Next month, we'll
continue with mode X by exploring the wonders that the latches and parallel
plane hardware can work on scrolls, copies, blits, and pattern fills.



Notes From the Edsun Front


Comments coming my way indicate a great deal of programmer interest in the
Edsun CEG/DAC, of which I wrote in April and May. However, everyone who has
actually programmed the CEG/DAC complains about how hard it is; the results
are nice, but the process of getting there is anything but. Nonetheless,
programming the CEG/DAC is certainly a solvable problem, and whoever solves it
best will come out looking mighty good. A fair analogy is writing active TSRs.
Six years ago, TSR-writing was black magic, and Sidekick, primitive by today's
standards, made a fortune. Today, any dope can choose from dozens of books and
toolkits and make a rock-solid TSR in a few hours. As programmers develop
better tools and a better understanding of the CEG/DAC, the grumbling will
subside, and the software will take off. Another case of complexity providing
opportunity.


Book of the Month


This month's book is Advanced Programmer's Guide to SuperVGAs, by Sutty and
Blair (Brady, 1990, ISBN 0-13-010455-8; $44.95). Pricey for softcover, but
included in that price is a diskette of SuperVGA assembly code (which I have
not tried out). This book is the single best guide I've seen to the Byzantine
world of SuperVGA programming, where every one of dozens of VGA models has
different mode numbers and banking schemes. Take it from someone who's waded
through a slew of chip databooks and application notes--this book will save
you a lot of time and aggravation if you have to program SuperVGAs directly.
Still, not everything I'd like to see is in there. For example, they cover
only the Tseng Labs ET3000 chip, not the now widely used ET4000 that supports
15-bpp graphics. That's not the authors' fault, of course; it's a reflection
of the incredible diversity and rate of change in the SuperVGA arena.
Mode X. The Edsun CEG/DAC. SuperVGA programming. In complexitate est
opportunitas. Q.E.D.
_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash


[LISTING ONE]

; Mode X (320x240, 256 colors) mode set routine. Works on all VGAs.
; ****************************************************************
; * Revised 6/19/91 to select correct clock; fixes vertical roll *
; * problems on fixed-frequency (IBM 851X-type) monitors. *
; ****************************************************************
; C near-callable as:
; void Set320x240Mode(void);
; Tested with TASM 2.0.
; Modified from public-domain mode set code by John Bridges.

SC_INDEX equ 03c4h ;Sequence Controller Index
CRTC_INDEX equ 03d4h ;CRT Controller Index
MISC_OUTPUT equ 03c2h ;Miscellaneous Output register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X

 .model small
 .data
; Index/data pairs for CRT Controller registers that differ between
; mode 13h and mode X.
CRTParms label word
 dw 00d06h ;vertical total
 dw 03e07h ;overflow (bit 8 of vertical counts)
 dw 04109h ;cell height (2 to double-scan)
 dw 0ea10h ;v sync start
 dw 0ac11h ;v sync end and protect cr0-cr7
 dw 0df12h ;vertical displayed
 dw 00014h ;turn off dword mode
 dw 0e715h ;v blank start
 dw 00616h ;v blank end
 dw 0e317h ;turn on byte mode
CRT_PARM_LENGTH equ (($-CRTParms)/2)

 .code
 public _Set320x240Mode
_Set320x240Mode proc near
 push bp ;preserve caller's stack frame
 push si ;preserve C register vars
 push di ; (don't count on BIOS preserving anything)

 mov ax,13h ;let the BIOS set standard 256-color
 int 10h ; mode (320x200 linear)

 mov dx,SC_INDEX
 mov ax,0604h
 out dx,ax ;disable chain4 mode
 mov ax,0100h
 out dx,ax ;synchronous reset while setting Misc Output
 ; for safety, even though clock unchanged
 mov dx,MISC_OUTPUT
 mov al,0e3h
 out dx,al ;select 25 MHz dot clock & 60 Hz scanning rate

 mov dx,SC_INDEX
 mov ax,0300h
 out dx,ax ;undo reset (restart sequencer)

 mov dx,CRTC_INDEX ;reprogram the CRT Controller
 mov al,11h ;VSync End reg contains register write
 out dx,al ; protect bit
 inc dx ;CRT Controller Data register
 in al,dx ;get current VSync End register setting
 and al,7fh ;remove write protect on various
 out dx,al ; CRTC registers
 dec dx ;CRT Controller Index
 cld
 mov si,offset CRTParms ;point to CRT parameter table
 mov cx,CRT_PARM_LENGTH ;# of table entries
SetCRTParmsLoop:
 lodsw ;get the next CRT Index/Data pair
 out dx,ax ;set the next CRT Index/Data pair
 loop SetCRTParmsLoop

 mov dx,SC_INDEX
 mov ax,0f02h
 out dx,ax ;enable writes to all four planes
 mov ax,SCREEN_SEG ;now clear all display memory, 8 pixels
 mov es,ax ; at a time
 sub di,di ;point ES:DI to display memory
 sub ax,ax ;clear to zero-value pixels
 mov cx,8000h ;# of words in display memory
 rep stosw ;clear all of display memory

 pop di ;restore C register vars
 pop si
 pop bp ;restore caller's stack frame
 ret
_Set320x240Mode endp
 end





[LISTING TWO]

; Mode X (320x240, 256 colors) write pixel routine. Works on all VGAs.
; No clipping is performed.
; C near-callable as:
; void WritePixelX(int X, int Y, unsigned int PageBase, int Color);


SC_INDEX equ 03c4h ;Sequence Controller Index
MAP_MASK equ 02h ;index in SC of Map Mask register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X
SCREEN_WIDTH equ 80 ;width of screen in bytes from one scan line
 ; to the next

parms struc
 dw 2 dup (?) ;pushed BP and return address
X dw ? ;X coordinate of pixel to draw
Y dw ? ;Y coordinate of pixel to draw
PageBase dw ? ;base offset in display memory of page in
 ; which to draw pixel
Color dw ? ;color in which to draw pixel
parms ends

 .model small
 .code
 public _WritePixelX
_WritePixelX proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame

 mov ax,SCREEN_WIDTH
 mul [bp+Y] ;offset of pixel's scan line in page
 mov bx,[bp+X]
 shr bx,1
 shr bx,1 ;X/4 = offset of pixel in scan line
 add bx,ax ;offset of pixel in page
 add bx,[bp+PageBase] ;offset of pixel in display memory
 mov ax,SCREEN_SEG
 mov es,ax ;point ES:BX to the pixel's address

 mov cl,byte ptr [bp+X]
 and cl,011b ;CL = pixel's plane
 mov ax,0100h + MAP_MASK ;AL = index in SC of Map Mask reg
 shl ah,cl ;set only the bit for the pixel's plane to 1
 mov dx,SC_INDEX ;set the Map Mask to enable only the
 out dx,ax ; pixel's plane

 mov al,byte ptr [bp+Color]
 mov es:[bx],al ;draw the pixel in the desired color

 pop bp ;restore caller's stack frame
 ret
_WritePixelX endp
 end






[LISTING THREE]

; Mode X (320x240, 256 colors) read pixel routine. Works on all VGAs.
; No clipping is performed.
; C near-callable as:
; unsigned int ReadPixelX(int X, int Y, unsigned int PageBase);


GC_INDEX equ 03ceh ;Graphics Controller Index
READ_MAP equ 04h ;index in GC of the Read Map register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X
SCREEN_WIDTH equ 80 ;width of screen in bytes from one scan line
 ; to the next
parms struc
 dw 2 dup (?) ;pushed BP and return address
X dw ? ;X coordinate of pixel to read
Y dw ? ;Y coordinate of pixel to read
PageBase dw ? ;base offset in display memory of page from
 ; which to read pixel
parms ends

 .model small
 .code
 public _ReadPixelX
_ReadPixelX proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame

 mov ax,SCREEN_WIDTH
 mul [bp+Y] ;offset of pixel's scan line in page
 mov bx,[bp+X]
 shr bx,1
 shr bx,1 ;X/4 = offset of pixel in scan line
 add bx,ax ;offset of pixel in page
 add bx,[bp+PageBase] ;offset of pixel in display memory
 mov ax,SCREEN_SEG
 mov es,ax ;point ES:BX to the pixel's address

 mov ah,byte ptr [bp+X]
 and ah,011b ;AH = pixel's plane
 mov al,READ_MAP ;AL = index in GC of the Read Map reg
 mov dx,GC_INDEX ;set the Read Map to read the pixel's
 out dx,ax ; plane

 mov al,es:[bx] ;read the pixel's color
 sub ah,ah ;convert it to an unsigned int

 pop bp ;restore caller's stack frame
 ret
_ReadPixelX endp
 end





[LISTING FOUR]

; Mode X (320x240, 256 colors) rectangle fill routine. Works on all
; VGAs. Uses slow approach that selects the plane explicitly for each
; pixel. Fills up to but not including the column at EndX and the row
; at EndY. No clipping is performed.
; C near-callable as:
; void FillRectangleX(int StartX, int StartY, int EndX, int EndY,
; unsigned int PageBase, int Color);


SC_INDEX equ 03c4h ;Sequence Controller Index
MAP_MASK equ 02h ;index in SC of Map Mask register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X
SCREEN_WIDTH equ 80 ;width of screen in bytes from one scan line
 ; to the next
parms struc
 dw 2 dup (?) ;pushed BP and return address
StartX dw ? ;X coordinate of upper left corner of rect
StartY dw ? ;Y coordinate of upper left corner of rect
EndX dw ? ;X coordinate of lower right corner of rect
 ; (the column at EndX is not filled)
EndY dw ? ;Y coordinate of lower right corner of rect
 ; (the row at EndY is not filled)
PageBase dw ? ;base offset in display memory of page in
 ; which to fill rectangle
Color dw ? ;color in which to draw pixel
parms ends

 .model small
 .code
 public _FillRectangleX
_FillRectangleX proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame
 push si ;preserve caller's register variables
 push di

 mov ax,SCREEN_WIDTH
 mul [bp+StartY] ;offset in page of top rectangle scan line
 mov di,[bp+StartX]
 shr di,1
 shr di,1 ;X/4 = offset of first rectangle pixel in scan
 ; line
 add di,ax ;offset of first rectangle pixel in page
 add di,[bp+PageBase] ;offset of first rectangle pixel in
 ; display memory
 mov ax,SCREEN_SEG
 mov es,ax ;point ES:DI to the first rectangle pixel's
 ; address
 mov dx,SC_INDEX ;set the Sequence Controller Index to
 mov al,MAP_MASK ; point to the Map Mask register
 out dx,al
 inc dx ;point DX to the SC Data register
 mov cl,byte ptr [bp+StartX]
 and cl,011b ;CL = first rectangle pixel's plane
 mov al,01h
 shl al,cl ;set only the bit for the pixel's plane to 1
 mov ah,byte ptr [bp+Color] ;color with which to fill
 mov bx,[bp+EndY]
 sub bx,[bp+StartY] ;BX = height of rectangle
 jle FillDone ;skip if 0 or negative height
 mov si,[bp+EndX]
 sub si,[bp+StartX] ;SI = width of rectangle
 jle FillDone ;skip if 0 or negative width
FillRowsLoop:
 push ax ;remember the plane mask for the left edge
 push di ;remember the start offset of the scan line
 mov cx,si ;set count of pixels in this scan line
FillScanLineLoop:
 out dx,al ;set the plane for this pixel
 mov es:[di],ah ;draw the pixel
 shl al,1 ;adjust the plane mask for the next pixel's
 and al,01111b ; bit, modulo 4
 jnz AddressSet ;advance address if we turned over from
 inc di ; plane 3 to plane 0
 mov al,00001b ;set plane mask bit for plane 0
AddressSet:
 loop FillScanLineLoop
 pop di ;retrieve the start offset of the scan line
 add di,SCREEN_WIDTH ;point to the start of the next scan
 ; line of the rectangle
 pop ax ;retrieve the plane mask for the left edge
 dec bx ;count down scan lines
 jnz FillRowsLoop
FillDone:
 pop di ;restore caller's register variables
 pop si
 pop bp ;restore caller's stack frame
 ret
_FillRectangleX endp
 end






[LISTING FIVE]

; Mode X (320x240, 256 colors) rectangle fill routine. Works on all
; VGAs. Uses medium-speed approach that selects each plane only once
; per rectangle; this results in a fade-in effect for large
; rectangles. Fills up to but not including the column at EndX and the
; row at EndY. No clipping is performed.
; C near-callable as:
; void FillRectangleX(int StartX, int StartY, int EndX, int EndY,
; unsigned int PageBase, int Color);

SC_INDEX equ 03c4h ;Sequence Controller Index
MAP_MASK equ 02h ;index in SC of Map Mask register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X
SCREEN_WIDTH equ 80 ;width of screen in bytes from one scan line
 ; to the next
parms struc
 dw 2 dup (?) ;pushed BP and return address
StartX dw ? ;X coordinate of upper left corner of rect
StartY dw ? ;Y coordinate of upper left corner of rect
EndX dw ? ;X coordinate of lower right corner of rect
 ; (the column at EndX is not filled)
EndY dw ? ;Y coordinate of lower right corner of rect
 ; (the row at EndY is not filled)
PageBase dw ? ;base offset in display memory of page in
 ; which to fill rectangle
Color dw ? ;color in which to draw pixel
parms ends

StartOffset equ -2 ;local storage for start offset of rectangle
Width equ -4 ;local storage for address width of rectangle
Height equ -6 ;local storage for height of rectangle
PlaneInfo equ -8 ;local storage for plane # and plane mask
STACK_FRAME_SIZE equ 8

 .model small
 .code
 public _FillRectangleX
_FillRectangleX proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame
 sub sp,STACK_FRAME_SIZE ;allocate space for local vars
 push si ;preserve caller's register variables
 push di

 cld
 mov ax,SCREEN_WIDTH
 mul [bp+StartY] ;offset in page of top rectangle scan line
 mov di,[bp+StartX]
 shr di,1
 shr di,1 ;X/4 = offset of first rectangle pixel in scan
 ; line
 add di,ax ;offset of first rectangle pixel in page
 add di,[bp+PageBase] ;offset of first rectangle pixel in
 ; display memory
 mov ax,SCREEN_SEG
 mov es,ax ;point ES:DI to the first rectangle pixel's
 mov [bp+StartOffset],di ; address
 mov dx,SC_INDEX ;set the Sequence Controller Index to
 mov al,MAP_MASK ; point to the Map Mask register
 out dx,al
 mov bx,[bp+EndY]
 sub bx,[bp+StartY] ;BX = height of rectangle
 jle FillDone ;skip if 0 or negative height
 mov [bp+Height],bx
 mov dx,[bp+EndX]
 mov cx,[bp+StartX]
 cmp dx,cx
 jle FillDone ;skip if 0 or negative width
 dec dx
 and cx,not 011b
 sub dx,cx
 shr dx,1
 shr dx,1
 inc dx ;# of addresses across rectangle to fill
 mov [bp+Width],dx
 mov word ptr [bp+PlaneInfo],0001h
 ;lower byte = plane mask for plane 0,
 ; upper byte = plane # for plane 0
FillPlanesLoop:
 mov ax,word ptr [bp+PlaneInfo]
 mov dx,SC_INDEX+1 ;point DX to the SC Data register
 out dx,al ;set the plane for this pixel
 mov di,[bp+StartOffset] ;point ES:DI to rectangle start
 mov dx,[bp+Width]
 mov cl,byte ptr [bp+StartX]
 and cl,011b ;plane # of first pixel in initial byte
 cmp ah,cl ;do we draw this plane in the initial byte?
 jae InitAddrSet ;yes
 dec dx ;no, so skip the initial byte
 jz FillLoopBottom ;skip this plane if no pixels in it
 inc di
InitAddrSet:
 mov cl,byte ptr [bp+EndX]
 dec cl
 and cl,011b ;plane # of last pixel in final byte
 cmp ah,cl ;do we draw this plane in the final byte?
 jbe WidthSet ;yes
 dec dx ;no, so skip the final byte
 jz FillLoopBottom ;skip this plane if no pixels in it
WidthSet:
 mov si,SCREEN_WIDTH
 sub si,dx ;distance from end of one scan line to start
 ; of next
 mov bx,[bp+Height] ;# of lines to fill
 mov al,byte ptr [bp+Color] ;color with which to fill
FillRowsLoop:
 mov cx,dx ;# of bytes across scan line
 rep stosb ;fill the scan line in this plane
 add di,si ;point to the start of the next scan
 ; line of the rectangle
 dec bx ;count down scan lines
 jnz FillRowsLoop
FillLoopBottom:
 mov ax,word ptr [bp+PlaneInfo]
 shl al,1 ;set the plane bit to the next plane
 inc ah ;increment the plane #
 mov word ptr [bp+PlaneInfo],ax
 cmp ah,4 ;have we done all planes?
 jnz FillPlanesLoop ;continue if any more planes
FillDone:
 pop di ;restore caller's register variables
 pop si
 mov sp,bp ;discard storage for local variables
 pop bp ;restore caller's stack frame
 ret
_FillRectangleX endp
 end






[LISTING SIX]

; Mode X (320x240, 256 colors) rectangle fill routine. Works on all
; VGAs. Uses fast approach that fans data out to up to four planes at
; once to draw up to four pixels at once. Fills up to but not
; including the column at EndX and the row at EndY. No clipping is
; performed.
; C near-callable as:
; void FillRectangleX(int StartX, int StartY, int EndX, int EndY,
; unsigned int PageBase, int Color);

SC_INDEX equ 03c4h ;Sequence Controller Index
MAP_MASK equ 02h ;index in SC of Map Mask register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X
SCREEN_WIDTH equ 80 ;width of screen in bytes from one scan line
 ; to the next
parms struc
 dw 2 dup (?) ;pushed BP and return address
StartX dw ? ;X coordinate of upper left corner of rect
StartY dw ? ;Y coordinate of upper left corner of rect
EndX dw ? ;X coordinate of lower right corner of rect
 ; (the column at EndX is not filled)
EndY dw ? ;Y coordinate of lower right corner of rect
 ; (the row at EndY is not filled)
PageBase dw ? ;base offset in display memory of page in
 ; which to fill rectangle
Color dw ? ;color in which to draw pixel
parms ends

 .model small
 .data
; Plane masks for clipping left and right edges of rectangle.
LeftClipPlaneMask db 00fh,00eh,00ch,008h
RightClipPlaneMask db 00fh,001h,003h,007h
 .code
 public _FillRectangleX
_FillRectangleX proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame
 push si ;preserve caller's register variables
 push di

 cld
 mov ax,SCREEN_WIDTH
 mul [bp+StartY] ;offset in page of top rectangle scan line
 mov di,[bp+StartX]
 shr di,1 ;X/4 = offset of first rectangle pixel in scan
 shr di,1 ; line
 add di,ax ;offset of first rectangle pixel in page
 add di,[bp+PageBase] ;offset of first rectangle pixel in
 ; display memory
 mov ax,SCREEN_SEG ;point ES:DI to the first rectangle
 mov es,ax ; pixel's address
 mov dx,SC_INDEX ;set the Sequence Controller Index to
 mov al,MAP_MASK ; point to the Map Mask register
 out dx,al
 inc dx ;point DX to the SC Data register
 mov si,[bp+StartX]
 and si,0003h ;look up left edge plane mask
 mov bh,LeftClipPlaneMask[si] ; to clip & put in BH
 mov si,[bp+EndX]
 and si,0003h ;look up right edge plane
 mov bl,RightClipPlaneMask[si] ; mask to clip & put in BL

 mov cx,[bp+EndX] ;calculate # of addresses across rect
 mov si,[bp+StartX]
 cmp cx,si
 jle FillDone ;skip if 0 or negative width
 dec cx
 and si,not 011b
 sub cx,si
 shr cx,1
 shr cx,1 ;# of addresses across rectangle to fill - 1
 jnz MasksSet ;there's more than one byte to draw
 and bh,bl ;there's only one byte, so combine the left
 ; and right edge clip masks
MasksSet:
 mov si,[bp+EndY]
 sub si,[bp+StartY] ;SI = height of rectangle
 jle FillDone ;skip if 0 or negative height
 mov ah,byte ptr [bp+Color] ;color with which to fill
 mov bp,SCREEN_WIDTH ;stack frame isn't needed any more
 sub bp,cx ;distance from end of one scan line to start
 dec bp ; of next
FillRowsLoop:
 push cx ;remember width in addresses - 1
 mov al,bh ;put left-edge clip mask in AL
 out dx,al ;set the left-edge plane (clip) mask
 mov al,ah ;put color in AL
 stosb ;draw the left edge
 dec cx ;count off left edge byte
 js FillLoopBottom ;that's the only byte
 jz DoRightEdge ;there are only two bytes
 mov al,00fh ;middle addresses are drawn 4 pixels at a pop
 out dx,al ;set the middle pixel mask to no clip
 mov al,ah ;put color in AL
 rep stosb ;draw the middle addresses four pixels apiece
DoRightEdge:
 mov al,bl ;put right-edge clip mask in AL
 out dx,al ;set the right-edge plane (clip) mask
 mov al,ah ;put color in AL
 stosb ;draw the right edge
FillLoopBottom:
 add di,bp ;point to the start of the next scan line of
 ; the rectangle
 pop cx ;retrieve width in addresses - 1
 dec si ;count down scan lines
 jnz FillRowsLoop
FillDone:
 pop di ;restore caller's register variables
 pop si
 pop bp ;restore caller's stack frame
 ret
_FillRectangleX endp
 end






[LISTING SEVEN]

/* Program to demonstrate mode X (320x240, 256-colors) rectangle
 fill by drawing adjacent 20x20 rectangles in successive colors from
 0 on up across and down the screen */
#include <conio.h>
#include <dos.h>

void Set320x240Mode(void);
void FillRectangleX(int, int, int, int, unsigned int, int);

void main() {
    int i,j;
    union REGS regset;

    Set320x240Mode();
    FillRectangleX(0,0,320,240,0,0); /* clear the screen to black */
    for (j = 1; j < 220; j += 21) {
        for (i = 1; i < 300; i += 21) {
            FillRectangleX(i, j, i+20, j+20, 0, ((j/21*15)+i/21) & 0xFF);
        }
    }
    getch();
    regset.x.ax = 0x0003; /* switch back to text mode and done */
    int86(0x10, &regset, &regset);
}
















































July, 1991
PROGRAMMER'S BOOKSHELF


Ten Pounds of Windows Books




Andrew Schulman


This month we will look at four new books on Microsoft Windows 3.0. When I
stepped on the bathroom scale holding these books and subtracted my own
weight, they turned out to weigh ten pounds. I guess I'm not the only one who
needs to go on a diet.
For the vast majority of programmers, graphics programming is, for better or
worse, going to increasingly entail mastering -- not VGA palette registers or
Bresenham's line-drawing algorithm -- but the ins-and-outs of Application
Program Interfaces (APIs) such as the Macintosh Toolbox, Xt Intrinsics, or the
Microsoft Windows API. What I'll emphasize here is what these books can teach
us about the possibility of simplified programming for the notoriously complex
Windows API.


Programming Windows


The definitive book on Windows programming is, of course, Charles Petzold's
Programming Windows. Microsoft labels this "the authorized edition." I've also
heard Charles's book called "thunk," not because of the essential discussion
of "reload thunks" that appears on pp. 282-4, but because thunk is the sound
this massive book makes if you drop it on your foot!
In addition to Petzold's wonderfully smooth writing style (of which I am
completely jealous) and his ability to take even some of the more twisted
aspects of the Windows API and make them sound almost "programmer-friendly," I
like this book for the large number of buried treasures it contains. While
Petzold lets you know that Windows programming is hard to do -- and that this
difficulty is good for you -- the book also contains many ingenious ideas that
irresistibly appeal to the lazy, good-for-nothing bum in all of us.
For example: "Perhaps the epitome of lazy programming is the HEXCALC program
.... This program doesn't call CreateWindow at all, never processes WM_PAINT
messages, never obtains a device context, and never processes mouse messages.
Yet it manages to incorporate a ten-function hexadecimal calculator with a
full keyboard and mouse interface in fewer than 150 lines of source code" (p.
479). What a great idea!
Likewise, after a 40-page examination of Windows memory management, Charles
presents a two-paragraph note titled "If You Know You're Running in Protected
Mode" (p. 302). This is an amazing note because it says, in essence, that you
can ignore most of the preceding 40 pages.
It turns out that, by punting Windows Real mode, you can vastly simplify your
memory management: "When allocating moveable global memory, you can lock the
memory block immediately to obtain the pointer. Even though the block is
locked, Windows can still move it in memory -- and the pointer will remain
valid. You don't even have to save the global memory handle." Because Windows
3 Real mode is a complete joke which hardly anyone uses, and which probably
ought to have been left out of the product in the first place, you lose
nothing by developing only for the Windows protected (Standard and Enhanced)
modes. What you gain is some sanity. You can use the RC -T switch to mark a
Windows application as "protected mode only."
Another gem in Petzold's book is the RANDRECT program on pp. 598-604, which
shows how to use the PeekMessage( ) function instead of the more typical
GetMessage( ). PeekMessage( ) is the key, not only to using Windows "dead
time," but also to simulating a typical DOS application-driven approach to
programming on top of Windows's event-driven architecture.
I am particularly fond of two small sections of this great book. Like many
programmers, I was completely hopeless with Windows programming, and despaired
of ever getting the hang of it, until I read "The Message Box" (pp. 439-441)
and the MFCREATE.C program (p. 640). These two small sections are the key to
truly simple Windows programming in which, for example, "hello world" takes
five lines of code, not the 80 lines you're hit with in Chapter 1 of this
book.
Basically, MFCREATE.C shows that Windows programs don't require a window, a
window class, a window procedure, a message loop, or any of the other Windows
paraphernalia that makes the API so daunting to a newcomer who just wants to
display three lines of output on the screen. In fact, Petzold notes that
MFCREATE.C, buried on p. 640, is the shortest program in the book. It happens
to create a disk-based metafile, which may not be what you're interested in,
but the key point is that this little program may open your eyes to a new way
of writing simple Windows utilities, or of just plain getting started with
Windows programming.
That brings us to discussion of the Windows MessageBox( ) function on pp.
439-441. If, as MFCREATE.C shows, there's no rule stating that WinMain( ) must
register a window class, create a window, install a window procedure, and
start a message loop, then all we need to write "hello world!", for example,
is a call to the handy Windows MessageBox( ) function.
Microsoft's documentation states that the first parameter to MessageBox( )
must be a window handle (an HWND) that identifies the window that owns the
message box, so you might think that you still have to go through the Windows
rigamarole just to create a window that can "own" the message box. However,
Petzold shows that "if you don't have a window handle available," you can use
NULL for the handle. When you're new to Windows programming, or when you just
want to write a simple utility for use by your coworkers, the basic problem is
precisely that you don't have a window handle available, and Charles's NULL
trick lets you avoid writing the standard 80 lines of Windows boilerplate code
in order to get one.
Example 1 shows the simple code you can write by applying some of the ideas
buried in Petzold's book. This program displays information on what the
Windows Program Manager's About box likes to call "System Resources." No one
knows what these system resources are, but at least we know what percentage of
them is free! I had read on BIX that "System Resources" are in fact nothing
more than the local heaps of the USER and GDI modules, and that Program
Manager gets this information with an undocumented Windows call,
GetHeapSpaces( ). I wanted to write a simple test program to see if this was
true.
Example 1: A simple Windows program, SYSTRES.C, consists of little more than
the OkMsgBox( ) function from Petzold's Programming Windows.

 /* SYSTRES.C -- System Resources
 Borland C++:
 bcc -W systres.c
 rc systres.exe */

 #include <windows.h>

 /* undocumented Windows call: KERNEL.138 */
 extern DWORD FAR PASCAL GetHeapSpaces (WORD hModule);
 void heap_info (char *module, WORD *pfree, WORD *ptotal, WORD *ppercent)
 {
     DWORD info = GetHeapSpaces (GetModuleHandle(module));
     *pfree = LOWORD (info);
     *ptotal = HIWORD (info);
     *ppercent = (WORD) ((((DWORD) *pfree) * 100L) / ((DWORD) *ptotal));
 }

 #define PROGMAN_EXE "progman.exe"
 #define PROGMAN_CLASS "Progman"
 #define HELP_ABOUT 0x387 //subject to change!!!

 //run the Program Manager Help menu About... box
 void progmgr_aboutbox(void)
 {
     WORD progmgr;
     // use class name ("Progman"), not window title ("Program Manager"):
     // see Richter, Windows 3: A Developer's Guide, p. 80

     if (! (progmgr = FindWindow(PROGMAN_CLASS, 0)))
     {
         // WinExec () is async: equivalent of spawn () P_NOWAIT
         WinExec (PROGMAN_EXE, SW_SHOWMINIMIZED);
         if (! (progmgr = FindWindow (PROGMAN_CLASS, 0)))
             return; // couldn't find or load Program Manager
     }
     // use PostMessage instead of SendMessage so we run async
     PostMessage (progmgr, WM_COMMAND, HELP_ABOUT, 0L);
 }

 // from Petzold, Programming Windows, p. 441
 void OkMsgBox (char *szCaption, char *szFormat, ...)
 {
     char szBuffer [256] ;
     char *pArguments ;

     pArguments = (char *) &szFormat + sizeof szFormat ;
     wvsprintf (szBuffer, szFormat, pArguments) ; // changed from vsprintf
     MessageBox (NULL, szBuffer, szCaption, MB_OK) ;
 }

 int PASCAL WinMain (HANDLE hInstance, HANDLE hPrevInstance,
                     LPSTR lpszCmdLine, int nCmdShow)
 {
     WORD user_free, user_total, user_percent;
     WORD gdi_free, gdi_total, gdi_percent;

     heap_info ("USER", &user_free, &user_total, &user_percent);
     heap_info ("GDI", &gdi_free, &gdi_total, &gdi_percent);
     progmgr_aboutbox ();
     OkMsgBox ("System Resources",
         "USER heap: %u bytes free out of %u (%u%% free)\n"
         "GDI heap: %u bytes free out of %u (%u%% free)\n"
         "Free system resources: %u%%\n",
         user_free, user_total, user_percent,
         gdi_free, gdi_total, gdi_percent,
         min(user_percent, gdi_percent));
     return 0;
 }


The resulting program, SYSTRES.C, is built around the OkMsgBox( ) function
from p. 441 of Petzold's book. I made one change to this function: Rather than
call vsprintf( ), it calls wvsprintf( ), which is declared in WINDOWS.H and
built into Windows. This partially explains why the resulting executable,
SYSTRES.EXE, is less than 3 Kbytes: much smaller than the typical
character-mode DOS utility, and a lot nicer to look at.
To see if the GetHeapSpaces( ) information really is equivalent to the numbers
reported by Program Manager, we of course need to pull down the Program
Manager Help menu and look at the About box. But there's really no reason why
the program can't do this for us, so SYSTRES.C also contains a function,
progmgr_aboutbox( ), which will do just that. It first sees if Program Manager
is already running; if it's not, it uses WinExec( ) to launch it. It then
sends Program Manager a message -- the same message that gets sent if you
manually click on the About ... menu item. It's amazing that we can perform
this type of interprocess communication with so few lines of code; the ability
to post messages from one program to another shows, by the way, that the
message-based architecture of Windows that Petzold describes is not just
arbitrary object-oriented terminology, but a true description of how Windows
works.
GetHeapSpaces( ) does turn out to provide the same numbers for System
Resources as Program Manager. The discrepancy of a percentage point or two
seems to be due to differences in how the two programs do integer-based
percentages. Incidentally, Windows programmers will soon not have to rely on
this undocumented function. A forthcoming Microsoft library called Toolhelp,
to be provided as part of its "Open Tools" strategy, will provide the
GDIHeapInfo( ) and UserHeapInfo( ) functions, along with many other
much-needed additions to the Windows API. These functions will work in Windows
3.0, and will also be part of the retail release of Windows 3.1.
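To see how two integer-based percentages can disagree by a point, compare truncation with rounding. The sketch below is only an illustration: percent_truncated() follows the arithmetic in heap_info() from SYSTRES.C, while percent_rounded() is a hypothetical alternative. How Program Manager actually computes its figure is an open question here, so treat the rounded variant as an assumption.

```c
#include <assert.h>

/* Truncating percentage, the same arithmetic heap_info() uses. */
static unsigned percent_truncated(unsigned long free_bytes,
                                  unsigned long total_bytes)
{
    assert(total_bytes != 0);
    return (unsigned)((free_bytes * 100UL) / total_bytes);
}

/* Round-to-nearest percentage: one assumed way another program
   might arrive at a slightly different number. */
static unsigned percent_rounded(unsigned long free_bytes,
                                unsigned long total_bytes)
{
    assert(total_bytes != 0);
    return (unsigned)((free_bytes * 100UL + total_bytes / 2) / total_bytes);
}
```

For 40,000 bytes free out of 65,000, the first function yields 61 and the second 62; for an exact half (32,768 out of 65,536) both yield 50.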
It may seem as if we've strayed far afield from Petzold's Programming Windows.
The point is simply that this book has so much useful information and so many
good ideas, that you can even use the book to adopt a style of Windows
programming with which Petzold would probably disagree. This is some book.
Thunk! Ouch!


Windows 3.0 for BASIC Programmers


Our next book has the unlikely title Windows 3.0 for BASIC Programmers. If you
are not a Basic programmer, and even if you are one of the many programmers
who despise this language, please keep reading anyway. The title of this book
is quite misleading. Let me quote the book's introduction:
This book will teach even the most inexperienced programmer how to write
Windows applications. It is unlike any other Windows book available today. You
don't need to learn hundreds of complex function calls, new styles of
programming, or the C language. Rather, through the use of a powerful
development package called Realizer Limited, included with this book, you will
learn how to create sophisticated Windows programs, quickly and painlessly. If
you can write a Basic program, you will be able to write a Windows program.
It's that simple.
This sounds like a piece of marketing literature but, oddly enough, it's all
true. For the price of a book, you get a "limited" version of Within
Technologies' Realizer Basic development environment for Windows. This is an
incredible bargain! You really will be able to write simple Windows
applications, and learn a lot about Windows, with this book/disk package.
Hyman had the excellent idea of trying to teach, not only how to use Realizer,
but also how Windows works. Several chapters contain "Behind the Scenes"
sections. In a few places these are somewhat confused (for example, the
explanation of Windows memory management), but the basic idea works well.
The key point is the presentation of a "higher level" on top of the Windows
API. For example, where Petzold's book has the reader sweat out 40 pages on
Dynamic Data Exchange (DDE), which end in the now somewhat famous statement
"The only advice I can offer here is simply to do the best you can," Hyman's
book and the Realizer software, like several other Basics for Windows, simply
provide a higher-level DDE interface.
Likewise, where most Windows programming books have to devote pages and pages
to reinventing the File Open ... dialog box found in every sizeable Windows
application (see Petzold, pp. 447-457, for example), the Realizer software
simply provides all the standard dialog boxes, built right into the language.
Interestingly, Microsoft is finally coming around to the novel idea of
building high-level interfaces. By the time Windows 3.1 is released,
developers will have two new dynamic link libraries (DLLs): one called
COMMDLG, which provides all the standard dialog boxes, and another called
DDEML, which provides a high-level DDE interface which looks surprisingly like
that found in WordBasic, the Basic development environment embedded in
Microsoft Word for Windows.
The demo programs that come with Hyman's book are impressive: two different
charting programs (Realizer specializes in forms, charts, and data analysis),
Blackjack and Poker simulations, a DDE "stock watcher" that communicates with
Excel, and so on.
Example 2 shows what Basic for Windows looks like. This is the SYSTRES utility
again, this time written with Realizer. Note that the Microsoft Software
Development Kit (SDK) is not needed for this program -- just Windows itself
and a $29.95 book/disk package. This is an interpreted, not a compiled
program, so we need some way to link to the Windows API routines. There's no
WINDOWS.H here; instead, Realizer lets you link dynamically at run time to anything
in a DLL. That is the purpose of the EXTERNAL FUNC and EXTERNAL PROC
statements at the top of Example 2.
Example 2: The same utility as in Example 1, but this time written with the
Realizer Basic development environment for Windows.


 ' SYSTRES.RLZ -- System Resources
 ' run-time dynamic linking to Windows API
 external "kernel" func GetHeapSpaces (word) as dword
 external "kernel" func GetModuleHandle (pointer) as word
 external "user" func FindWindow (pointer, pointer) as word
 external "user" proc PostMessage (word, word, word, dword)

 func get_free (modname)
 heap_space = GetHeapSpaces (GetModuleHandle (modname))
 free = heap_space mod 65536 ' LOWORD
 total = Floor (heap_space / 65536) ' HIWORD
 percent = 100 * free / total
 Print #1; modname, "heap: ", free, " bytes free out of ", total;
 Print #1; " (", percent, "%)"
 return percent
 end func

 WM_COMMAND = 273 ' 111h (yuk! no hex!)
 HELP_ABOUT = 903 ' 387h

 ' pull down Program Manager Help About... box
 progmgr = FindWindow ("Progman", Pointer (0))
 if progmgr = 0 then
 Shell "progman", _Minimize
 progmgr = FindWindow ("Progman", Pointer (0))
 end if
 PostMessage (progmgr, WM_COMMAND, HELP_ABOUT, 0)

 LogNew (1; "System Resources")
 user_percent = get_free ("USER")
 gdi_percent = get_free ("GDI")
 Print #1; "System Resources: ", min (user_percent, gdi_percent), "% free"
 LogControl (_Show)


This brings out an important point about Windows: If we are linking to these
functions from a Basic interpreter, with the Windows SDK nowhere in sight,
then all the Windows API routines must be part of Windows itself, and not
something that comes with the SDK. All the functionality is built into every
copy of Windows that someone buys off the shelf at Egghead; the SDK brings
remarkably little to the party.
Actually, I did have to refer to WINDOWS.H once in coding up this example,
because I needed to know the "magic number" for the WM_COMMAND message. I got
the HELP_ABOUT magic number by watching the behavior of Program Manager with
Microsoft's SPY utility, and by examining PROGMAN.EXE with the excruciatingly
slow but nonetheless useful Whitewater Resource Toolkit (WRT) included with
Borland C++ (WRT was apparently built with the Actor language, and its poor
performance is about the worst possible form of advertising for Actor one
could imagine; this is a shame, because WRT would otherwise be addictive). In
both cases, I had to convert hexadecimal numbers to base ten because,
incredibly, Realizer does not accept hex numbers.
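For anyone repeating the hex-to-decimal conversion, a few lines of C will do it. The sketch below assumes only the two values quoted in this column (111h and 387h); the function name and the output layout are invented for illustration and are not part of Realizer or the SDK.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values quoted in the column: WM_COMMAND is 111h, and the Program
   Manager Help About... menu ID was observed to be 387h. */
enum { WM_COMMAND_VALUE = 0x111, HELP_ABOUT_VALUE = 0x387 };

/* Write a Basic-style assignment with the hex value kept as a comment,
   for example: NAME = <decimal> ' <hex>h
   Returns the number of characters written, or -1 if buf is too small.
   Hypothetical helper, for illustration only. */
static int format_basic_constant(char *buf, size_t size,
                                 const char *name, unsigned value)
{
    int n;
    assert(buf != NULL && name != NULL);
    n = snprintf(buf, size, "%s = %u ' %Xh", name, value, value);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Called with "WM_COMMAND" and WM_COMMAND_VALUE, the helper produces a line in the same form as the constant definitions in Example 2.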
One final point about the Realizer program in Example 2: Note the SHELL
statement which is used to launch Program Manager (if it isn't already
running). In Basic for Windows (and remember, there are several other such
systems besides Realizer), SHELL is built on top of the Windows WinExec( )
function and is, therefore, fully asynchronous. The program that issues the
SHELL continues to run, even while the program it launched runs. This even
happens for DOS (non-Windows) applications in Enhanced mode, if the Background
bit is set in the DOS program's Settings. This is one of several nice
multitasking changes that Basic (and other languages) undergo under Windows.
I generally prefer Microsoft's WordBasic to Realizer; WordBasic has a more
intuitive syntax. However, Realizer and Hyman's book try to do several things
that WordBasic avoids. For one thing, dynamic linking works better with the
Realizer EXTERNAL statement than with the WordBasic DECLARE statement, because
Realizer distinguishes between the WORD/DWORD types and the INTEGER/LONG
types: WordBasic insists on using signed arithmetic for everything (this makes
it difficult to extract the low word and high word from the GetHeapSpaces( )
return value, for example).
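The signed-arithmetic problem is easy to reproduce in C. The sketch below assumes the 32-bit return value has landed in a signed variable, as it would in a language with only INTEGER/LONG types; the function names are hypothetical, and the workaround is one possible approach rather than anything WordBasic or Realizer prescribes.

```c
#include <assert.h>
#include <stdint.h>

/* Naive extraction, mirroring "mod 65536" and "/ 65536" on a signed
   32-bit value. Both results go negative once the top bit is set. */
static int32_t naive_low(int32_t v)  { return v % 65536; }
static int32_t naive_high(int32_t v) { return v / 65536; }

/* Workaround sketch: fold the remainder back into the range 0..65535. */
static int32_t fixed_low(int32_t v)
{
    int32_t r = v % 65536;      /* in C99 the sign follows the dividend */
    return r < 0 ? r + 65536 : r;
}

/* With the corrected low word removed, the division is exact; a
   negative quotient is then folded back into 0..65535 as well. */
static int32_t fixed_high(int32_t v)
{
    int32_t q = (v - fixed_low(v)) / 65536;
    return q < 0 ? q + 65536 : q;
}
```

Any heap whose total size is 32K or more sets the top bit of the packed value, so the naive version fails in exactly the cases that matter for the USER and GDI heaps.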
Realizer also allows you to write event-driven programs and DDE servers (not
just DDE clients). The designers of Realizer thought through the implications
of this for the built-in Basic debugger: A menu selection "Full Stop" is
sometimes necessary when your program is continuing to be bombarded with DDE
messages, even though you think you've halted it in the debugger. This is one
of several interesting "Concurrent Basic" issues.
Hyman's chapter on event-driven programming is quite interesting: "Realizer
hides most events from the programmer. Realizer programs appear to be
state-based, or linear .... Behind the scenes, as each Realizer line is run,
Realizer handles a variety of messages and events .... There are some
situations in which the event-driven nature shows through. For example,
Realizer programs can have their own menus. The user can select a menu item
from these menus at any time. This causes an event that the Realizer program
needs to handle. Likewise for modeless forms" (p. 188).
Even if you haven't the slightest interest in Basic, definitely check out
Hyman's book if you are at all interested in Windows programming. At the very
least, you will get a taste of the sort of development environments and
languages that can be built on top of Windows, and of the great flexibility
which Windows is capable of in the hands of thoughtful third-party developers.


Windows 3: A Developer's Guide


Jeffrey Richter's is the first book to appear on advanced Windows programming.
That is, it assumes you've read Petzold's book, and makes no attempt to teach
introductory Windows programming. The book has eight lengthy chapters on
various aspects of Windows that haven't been properly covered elsewhere. For
example, there is an entire 70-page chapter titled "Installing Commercial
Applications," an 80-page chapter on "Setting Up Printers," and a 70-page
chapter on "Subclassing and Superclassing Windows."
Two chapters present excellent in-depth examinations of Windows internals: the
opening chapter "Anatomy of a Window," and the chapter on "Tasks, Queues, and
Hooks." Together, these provide a coherent explanation of the
interrelationships of a program's Instance handle, Task handle, Module handle,
PSP, and so on.
I found many useful tips and suggestions here. For example, even the tiny
SYSTRES program shown in Examples 1 and 2 tries to find Program Manager's window
handle by passing a class name, not a window title, to FindWindow( ). This
comes directly from p. 80 of Richter's book. I've also been using the VOYEUR
utility that comes with the book.
Speaking of utilities, one of the most striking things about Richter's book is
the install program for the accompanying disks. It's simply a standard Windows
install program (which he describes in chapter 8); nothing very exciting. But
if you think about it, here is a $39.95 book, and the accompanying disks have
a better, more professional, install program than most commercial
character-mode DOS applications. This is typical of the Windows marketplace:
Windows raises the level of quality for software, even for the sample programs
that accompany computer books.


Windows 3 Power Tools


Okay, so Windows 3 Power Tools isn't a book for programmers, and we all know
how bad "user" books can be. (You should see the junk stacked up in my
basement: Using Fastback, Learning Fastback, and An Advanced User's Guide to
Installing Fastback are a few of the titles that come to mind.)
But this really is a pretty good book on the intricacies of using and setting
up Windows. Setting up Windows so that it coexists properly with, say, a
network and an expanded memory manager, makes programming the Windows API look
like child's play. This book has good advice for such seemingly intractable
problems as getting QEMM 5.11 and Windows 3.0 Enhanced mode to coexist on a
Compaq computer. On the other hand, I saw nothing about the "out of
environment space" problem for the DOS box, nothing on the "instancing"
problems you can get when you run TSRs (such as the CED command-line editor)
before running multiple virtual machines in Windows Enhanced mode, and no
mention of the various bugs that plague the Windows 3.0 DOS box. This is
simply a decent user's book, by no means a great one.
The disks are the real reason to get this book/disk package. The centerpiece
is a graphical scripting language for Windows called "Oriel;" ORIEL.EXE itself
is only 18K, but it provides access to most of the Windows graphical device
interface (GDI) via an extremely reasonable-looking graphical batch language.
The output goes to a resizable, "persistent" window (that is, Oriel takes care
of repairing the window whenever it receives a WM_PAINT message). A number of
demo scripts show off the language's capabilities; as an example of what the
script language looks like, see the "hello world!" program shown in Example 3.
Oriel stands as a further example of how Windows really can be
made easier for programmers as well as for users.

Example 3: "Hello world!" using the Oriel graphical batch language

 { HELLO.ORL -- "hello world" for Oriel }

 UseCaption ("Hello")
 SetMenu ("&File", IGNORE, "E&xit", Do_Exit, ENDPOPUP)
 UseFont ("Roman", 0, 24, BOLD, NOITALIC, NOUNDERLINE, 64, 0, 0)
 DrawText (10, 10, "Hello world!")
 WaitInput ()
 Do_Exit:
 End


Another programmable utility that comes on the disks is Morrie Wilson's
CommandPost. This is a replacement for Program Manager. Visually, it resembles
the old MS-DOS Executive program for Windows; the difference is the
comprehensive scripting language, CommandPost Menu Language (CPML), that lets
you customize menus, manipulate windows, create dialogs, run programs, manage
files and directories, and so on. CommandPost could use a spiffier interface,
now that Program Manager is available, but it's hard to complain since so much
of CommandPost is changeable via the script language anyhow.
Just like Richter's book, Windows 3 Power Tools comes with an install program
that would put most commercial character-based DOS software to shame. Little
details like these convince me that Windows is going to succeed.



_PROGRAMMER'S BOOKSHELF COLUMN_
by Andrew Schulman


Example 1: A simple Windows program, SYSTRES.C, consists of
little more than the OkMsgBox() function from Petzold's Programming
Windows.

/* SYSTRES.C -- System Resources
Borland C++:
bcc -W systres.c
rc systres.exe */

#include <windows.h>

/* undocumented Windows call: KERNEL.138 */
extern DWORD FAR PASCAL GetHeapSpaces(WORD hModule);

void heap_info(char *module, WORD *pfree, WORD *ptotal, WORD *ppercent)
{
    DWORD info = GetHeapSpaces(GetModuleHandle(module));
    *pfree = LOWORD(info);
    *ptotal = HIWORD(info);
    *ppercent = (WORD) ((((DWORD) *pfree) * 100L) / ((DWORD) *ptotal));
}

#define PROGMAN_EXE "progman.exe"
#define PROGMAN_CLASS "Progman"
#define HELP_ABOUT 0x387 // subject to change!!!

// run the Program Manager Help menu About... box
void progmgr_aboutbox(void)
{
    WORD progmgr;
    // use class name ("Progman"), not window title ("Program Manager"):
    // see Richter, Windows 3: A Developer's Guide, p.80
    if (! (progmgr = FindWindow(PROGMAN_CLASS, 0)))
    {
        // WinExec() is async: equivalent of spawn() P_NOWAIT
        WinExec(PROGMAN_EXE, SW_SHOWMINIMIZED);
        if (! (progmgr = FindWindow(PROGMAN_CLASS, 0)))
            return; // couldn't find or load Program Manager
    }
    // use PostMessage instead of SendMessage so we run async
    PostMessage(progmgr, WM_COMMAND, HELP_ABOUT, 0L);
}

// from Petzold, Programming Windows, p. 441
void OkMsgBox(char *szCaption, char *szFormat, ...)
{
    char szBuffer[256] ;
    char *pArguments ;
    pArguments = (char *) &szFormat + sizeof szFormat ;
    wvsprintf(szBuffer, szFormat, pArguments) ; // changed from vsprintf
    MessageBox(NULL, szBuffer, szCaption, MB_OK) ;
}

int PASCAL WinMain(HANDLE hInstance, HANDLE hPrevInstance, LPSTR lpszCmdLine,
                   int nCmdShow)
{
    WORD user_free, user_total, user_percent;
    WORD gdi_free, gdi_total, gdi_percent;
    heap_info("USER", &user_free, &user_total, &user_percent);
    heap_info("GDI", &gdi_free, &gdi_total, &gdi_percent);
    progmgr_aboutbox();
    OkMsgBox("System Resources",
        "USER heap: %u bytes free out of %u (%u%% free)\n"
        "GDI heap: %u bytes free out of %u (%u%% free)\n"
        "Free system resources: %u%%\n",
        user_free, user_total, user_percent,
        gdi_free, gdi_total, gdi_percent,
        min(user_percent, gdi_percent));
    return 0; // WinMain is declared int, so return an exit code
}





Example 2: The same utility as in Example 1, but this time written with
the Realizer BASIC development environment for Windows.

' SYSTRES.RLZ -- System Resources
' run-time dynamic linking to Windows API
external "kernel" func GetHeapSpaces (word) as dword
external "kernel" func GetModuleHandle(pointer) as word
external "user" func FindWindow(pointer, pointer) as word
external "user" proc PostMessage(word, word, word, dword)

func get_free(modname)
 heap_space = GetHeapSpaces(GetModuleHandle(modname))
 free = heap_space mod 65536 ' LOWORD
 total = Floor(heap_space / 65536) ' HIWORD
 percent = 100 * free / total
 Print #1; modname, "heap: ", free, " bytes free out of ", total;
 Print #1; " (", percent, "%)"
 return percent
end func

WM_COMMAND = 273 ' 111h (yuk! no hex!)
HELP_ABOUT = 903 ' 387h

' pull down Program Manager Help About... box
progmgr = FindWindow("Progman", Pointer(0))

if progmgr = 0 then
 Shell "progman", _Minimize
 progmgr = FindWindow("Progman", Pointer(0))
end if
PostMessage(progmgr, WM_COMMAND, HELP_ABOUT, 0)

LogNew(1; "System Resources")
user_percent = get_free("USER")
gdi_percent = get_free("GDI")
Print #1; "System Resources: ", min(user_percent, gdi_percent), "% free"
LogControl(_Show)




Example 3: "hello world!" using the Oriel graphical batch language

{ HELLO.ORL -- "hello world" for Oriel }

UseCaption("Hello")
SetMenu("&File", IGNORE, "E&xit", Do_Exit, ENDPOPUP)
UseFont("Roman", 0, 24, BOLD, NOITALIC, NOUNDERLINE, 64, 0, 0)
DrawText(10, 10, "Hello world!")
WaitInput()
Do_Exit:
 End




































July, 1991
OF INTEREST





Zortech C++, Version 3.0, a development package for DOS, MS Windows, OS/2, and
DOS 386, is now available from Zortech. The new version includes built-in
Windows development tools and documentation, eliminating the need for the
Windows SDK. Separate Windows, DOS, OS/2, and DOS 386 compilers have also been
rendered superfluous.
Version 3.0 includes a 386 DOS extender which allows you to develop
applications of up to 4 gigabytes. The Zortech DOS extender is royalty free
and requires much less memory than other DOS extenders. Also included is WINC,
a library for converting DOS command-line programs to Windows apps, and the
M++ library, which adds complete multidimensional array classes for C++.
The price for Zortech C++ for Windows is listed at $399.95; the developer's
edition runs $699.95; and the science and engineering edition costs $999.95.
Reader service no. 21
Zortech Inc. 4-C Gill Street Woburn, MA 01801 617-937-0696
Multiscope Debuggers for Windows, now shipping from Multiscope, provide
Microsoft Windows, Version 3.0 and character-mode interfaces for debugging
Windows and DOS applications written in CodeView information-compatible
languages. Included are the Run-Time debugger for controlling the execution of
a program and the Post-Mortem debugger for analyzing the cause of a program
crash. The latter has a Monitor Execution and Dump (MED) system which allows
Windows developers to find the reason for an unrecoverable application error,
or a hang or crash. When MED detects an upcoming crash, it puts the computer
memory contents into a file. The user can then send this to the developer, who
can analyze the problem using the 12 different views of the program (including
exact location of the crash and variable contents at crash time) provided by
the Post-Mortem Debugger.
Also new to the package is remote debugging, whereby two machines are
connected using a serial line or network, with the debugger running on the
host computer and the application on the remote computer. Remote debugging
allows the programmer to control the execution of a program from the machine
on which it is running.
The Multiscope Debuggers for Windows cost $379. Reader service no. 22.
Multiscope Inc. 1235 Pear Ave., Suite 111 Mountain View, CA 94043 415-968-4892
UniPress Software has announced PC-Connect 6.01, a Microsoft Windows 3.0-based
interface to Unix. The PC-Connect desktop is displayed in a window and split
into two parts: the top portion, which shows up to 20 icons for networked
Unix computers; and the lower portion, which shows up to 64 icons for both
local and host-based applications. To start an application, you simply click
on an icon; PC-Connect automatically locates both local and remote
applications and launches them. You can print to either remote or local
printers and make terminal emulation connections to non-Unix hosts, as well.
The package contains a set of Unix utilities with the same look and feel as
their Windows counterparts: Unix Exec helps you manage files and directories,
and Unix Notepad aids in text editing.
PC-Connect supports Microsoft's DDE, allowing Unix and DOS applications to be
hot-linked for automatic updating. An optional SQL link enables DOS apps to be
automatically updated from SQL databases on the Unix system. XVision, a
Windows-based X-Window server for PCs, provides access to X applications. Also
available is the PC-Connect development system, which allows Unix developers
to access Windows API facilities so that Unix apps are displayed as Windows
applications on PCs.
Patrick Montas of Unisys in McLean, Va. used PC-Connect while creating an
e-mail package. One requirement was to notify users when e-mail was received,
and PC-Connect did this very well by popping up a dialog box. Basically, says
Montas, "the best thing about the product is that you can create icons for
Unix applications, which saves a lot of time."
PC-Connect can be used with RS-232C serial lines, Ethernet, or Token Ring as
well as TCP/IP networks. Prices start at $1140, depending on the Unix host
system and the number of PC users. XVision costs $449 per PC. Discounts are
available for multiple copies. Reader service no. 23.
UniPress Software Inc. 2025 Lincoln Hwy. Edison, NJ 08817 201-985-8000
EDIT*2000 is the new programmer's editor for the QNX operating system from
Computer Innovations. EDIT*2000 provides an advanced windowing interface that
supports concurrent editing of multiple files and multiple views into a single
file. All basic editing capabilities are performed in a single keystroke; you
can undo and redo; and redefinable keys facilitate personalization of the
system. EDIT*2000 is easy to learn and use and offers online help.
DDJ spoke with Phil Blecker of INX Services in Glendale, Calif., who writes
custom applications based on an object-oriented database. "We do all our
developing in C++," said Blecker, "and this is the only editor that works with
C++ in its native format and isn't proprietary." Phil further commented that
Computer Innovations has excellent [technical support] and often makes changes
to suit developers' requests.
The single-machine version costs $350; multinode discounts are available.
Reader service no. 24.
Computer Innovations Inc. 980 Shrewsbury Ave. Tinton Falls, NJ 07724
201-542-5920
The Paged Memory Manager (PMM) development tool for Windows and OS/2
Presentation Manager applications has been released by Emerging Technology
Consultants. PMM uses Windows or OS/2 PM to allocate fixed-size "pages" of
memory, then manages the suballocation of memory within those pages, bypassing
the native system function calls, and thus eliminating memory management
problems. All this is done in a completely transparent fashion.
PMM features multiple locking at several levels. In high-level locking, for
example, a page is automatically locked whenever memory is allocated on it,
while at the heap level, every memory block allocated within a page can be
locked multiple times. This is useful for debugging and allows you to pass
parameters via handles without worrying about locking conflicts.
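The idea of suballocating within a fixed-size page and counting locks per block can be sketched in a few lines of C. This is not PMM's interface, which the announcement does not describe; every name, the page size, and the table size below are assumptions, and the sketch never frees or compacts blocks.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE  4096   /* assumed page size */
#define MAX_BLOCKS 64     /* assumed handle-table capacity */

typedef struct {
    size_t offset;      /* start of the block within the page */
    size_t size;        /* block size in bytes */
    int    lock_count;  /* how many times the block is currently locked */
    int    in_use;      /* nonzero once the handle has been handed out */
} Block;

typedef struct {
    unsigned char bytes[PAGE_SIZE];
    size_t used;                  /* bytes handed out so far */
    Block  blocks[MAX_BLOCKS];    /* handle table: index == handle */
} Page;

/* Carve size bytes out of the page; return a handle, or -1 on failure.
   Blocks are byte-aligned only, which is enough for this sketch. */
static int page_alloc(Page *p, size_t size)
{
    int h;
    assert(p != NULL);
    if (size == 0 || size > PAGE_SIZE - p->used)
        return -1;
    for (h = 0; h < MAX_BLOCKS; h++) {
        if (!p->blocks[h].in_use) {
            p->blocks[h].offset = p->used;
            p->blocks[h].size = size;
            p->blocks[h].lock_count = 0;
            p->blocks[h].in_use = 1;
            p->used += size;
            return h;
        }
    }
    return -1;  /* handle table is full */
}

/* Lock a block and return its address; locks nest. */
static void *page_lock(Page *p, int h)
{
    assert(p != NULL);
    if (h < 0 || h >= MAX_BLOCKS || !p->blocks[h].in_use)
        return NULL;
    p->blocks[h].lock_count++;
    return p->bytes + p->blocks[h].offset;
}

/* Undo one lock; return the remaining count, or -1 if not locked. */
static int page_unlock(Page *p, int h)
{
    assert(p != NULL);
    if (h < 0 || h >= MAX_BLOCKS || !p->blocks[h].in_use
        || p->blocks[h].lock_count == 0)
        return -1;
    return --p->blocks[h].lock_count;
}
```

Because the count nests, a routine that receives a handle can lock and unlock it without knowing whether its caller already holds a lock, which is the kind of conflict the announcement says multiple locking avoids.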
Also supported is memory grouping, which specifies groups of pages containing
information you wish to isolate, thus improving program design and
implementation. There are debugging and statistics functions as well as free
technical support via phone, fax, and e-mail.
PMM for Windows and PMM for OS/2 PM cost $295 each; a developer's kit
containing both runs $495; and source code license, $2500. A special version
for XVT toolkit users is available from XVT Software Inc. Reader service no.
25.
Emerging Technology Consultants Inc. 3405 Penrose Place Boulder, CO 80301
303-447-9495
Easel Corp. has released EASEL Workbench, an OS/2-based set of development
tools for building graphical applications for OS/2, Windows, or DOS. The kit
consists of the high-level EASEL language, production compilers for OS/2,
Windows and DOS, and EASEL application runtime software. EASEL's graphical
user interface helps you develop PC-based client/server and cooperative
processing applications, as well as enhance existing terminal-based business
applications.
The tools available for designing GUIs are: A layout editor, with which you
can create, place, and size windows, dialog boxes, and dialog controls
(supports the SAA CUA guidelines); a drawing editor for designing icons and
application-specific graphics; a menu editor to easily build action bars and
pull-down and cascading menus; an attribute editor for designing and editing
an application's individual component attributes; a built-in text editor; and
a librarian.
Also included are an incremental compiler which allows you to compile specific
parts of an application; a source-level debugger, which executes the
application continuously or statement-by-statement; and trace facilities that
allow visual monitoring of an application's execution. The organizational
tools offered are a parts catalog and a project view manager.
EASEL Workbench provides support for Microsoft's SQL Server, IBM's Database
Manager, various mainframe communications protocols, advanced
program-to-program communications, and the Dynamic Data Exchange.
EASEL Workbench for either OS/2, Windows, or DOS costs $11,900. Additional
production compilers run $5,900. Reader service no. 29.
Easel Corporation 25 Corporate Drive Burlington, MA 01803 617-221-2100
The Waite Group's Fractal Creations is a new book from Waite Group Press. The
book comes with a disk containing FRACTINT, a program that allows you to
explore fractal images on the PC. FRACTINT can convert fractals to 3-D images
(which you can view with the accompanying 3-D glasses), as well as zoom in on,
rotate, colorize, or color cycle them. Using the 3-D mode, you can create
mountain ranges, clouds, ferns, or abstracted landscapes.
The book includes a tutorial on fractals, reference material for FRACTINT, and
explanations of all the built-in fractals and how they work.
Priced at $34.95, ISBN #1-878739-05-0. Reader service no. 26.
Publishers Group West 800-365-3453
ENVY/Developer 1.0 is the new object-oriented team programming environment for
Smalltalk from Object Technology International. The developer supports the
full software manufacturing life cycle: prototyping, development, interactive
debugging, performance analysis, packaging, and application and embedded
systems maintenance. It allows you to improve productivity and reliability,
reuse existing components, and reduce maintenance costs. The graphical user
interface provides browsing tools for navigation, coding, testing, and
debugging.
All source and object code is maintained online in a repository and
configuration management and team support are seamlessly integrated with
Smalltalk's OOP environment. Thus, all team members have instant access to
updates and the complete version history of all components.
ENVY/Developer supports Smalltalk/V 286, Smalltalk/V PM, Smalltalk/V Apollo,
Smalltalk/V Sun-3, and an embedded version for MC680x0-based target
processors. A Windows version is in the works. Networks supported are Unix
TCP/IP, Novell Netware, Sun PC-NFS, Banyan Vines, Digital PCSA, IBM PC-LAN,
and LAN Manager.
Single-user versions run $4000, with considerable discounts available for
volume purchases. Reader service no. 27.
Object Technology International Inc. 1785 Woodward Drive Ottawa, Ontario
Canada K2C 0P9 613-228-3535
CodePad is the new programmer's editor from Cognetic Systems that runs as a
Windows 3.0 application. While using CodePad to develop either Windows or DOS
applications, you no longer have to switch back and forth between DOS and
Windows: You can simultaneously view and edit multiple source files of all
sizes in overlapping windows.
There is a graphical mouse/keyboard interface and a goto line command useful
for locating compiler errors by line number; five different screen fonts; and
fast dynamic scrolling for browsing through source files. The package comes
with online hypertext help and a user's manual; if you have the Windows SDK
you can mark Windows system calls in your source code, then press a hot key to
get SDK hypertext help for that topic.
CodePad requires a 386 PC or higher and Microsoft Windows 3.0 running in 386
enhanced or standard mode. Development under Windows requires at least 3.5
Mbytes of system memory and 386 enhanced mode. CodePad sells for $99. Reader
service no. 20.
Cognetic Systems Inc. 12534 Pinecrest Rd. Herndon, VA 22071 703-476-7154
The Design and Implementation of the 4.3BSD UNIX Operating System Answer Book
is available from Addison-Wesley. Authors Samuel J. Leffler and Marshall Kirk
McKusick provide the answers to the end-of-chapter exercises that appear in
The Design and Implementation of the 4.3BSD UNIX Operating System, which
describes the internal structure of the 4.3BSD system. The book includes
concepts, data structures, and algorithms used in implementing the system's
facilities. Reader service no. 28.
Addison-Wesley Publishing Company 1 Jacob Way Reading, MA 01867 617-944-3700
Ext. 2762 ISBN 0-201-54629-9
The Lucid C compiler for SPARC-based workstations has been released by Lucid.
Highly-optimized and fully tested, the compiler supports both Kernighan and
Ritchie and ANSI C, and is fully compatible with Sun-provided libraries and
tools.
Lucid C features parameter passing in registers and branch scheduling, as well
as the following optimizations: global register allocation, global common
sub-expression elimination, global constant folding, loop induction variable
elimination, strength reduction, code hoisting, tail recursion removal,
instruction scheduling, and partial and total redundancy elimination.
The cost for Lucid C, including documentation and a 90-day guarantee, is $495.
Site licenses and customer support services are available. Reader service no.
30.
Lucid Inc. 707 Laurel Street Menlo Park, CA 94025 415-329-8400






July, 1991
SWAINE'S FLAMES


System Gestures, or the Imperiled State




Michael Swaine


In FileMaker Pro, the same action will delete a data record in Browse mode, or
an entire data layout in Layout mode. I think that this is appropriate: In
Browse mode the primary unit on which one operates is the record, while in
Layout mode it is the layout. Nevertheless, it is possible to get confused and
delete a layout when you are trying to delete a record. I have a possible
solution to this problem, employing a user interface technique that I think is
being underused.
I don't mean to imply that Claris overlooked this danger in designing
FileMaker Pro. It employs all the common techniques by which an application
program protects its users from their own mistakes. There are cues that tell
you what mode you're in at any given time. There are warning messages that
tell you what you are about to delete, and that let you reconsider. And if you
do delete something accidentally, the application gives you visual feedback
that you did so, and lets you undo the deletion.
None of these techniques quite solves the FileMaker Pro deletion problem.
Knowing what mode you are in is helpful, but modes can overlap, as in this
case. There are actions more commonly used in Browse mode that are
nevertheless also available in Layout mode. Execute a sequence of these
actions while in Layout mode and your fingers will be telling you that you are
in Browse mode, despite what the static visual cues tell you. If the next step
in the sequence is to delete a record, you will be pulled toward that
overloaded delete action with a force proportional to the inertia of the
action sequence. You'll do it.
At that point, of course, you'll get a warning message. Unfortunately, you'll
be expecting one. FileMaker Pro warns users before deleting records in Browse
mode as well as before deleting layouts in Layout mode. So you won't read the
message. The information conveyed by a warning message when a warning message
is expected is the expected warning message. So that won't stop you.
Once you've deleted the layout, the screen will probably change radically and
you will probably recognize your error and you will probably have the sense to
select Undo before you do anything else, and if you come out on the right side
of all those probabilities, you'll be saved. But having the Undo operation be
the only functioning safeguard against a common error is not good.
Things could be improved by calling the user's attention to what is about to
happen as forcefully as the actual deletion calls attention to itself. I think
a system gesture would do that. What I mean by a system gesture is a visual
movement or change that carries a message and also calls attention to that
message. For example, when a window shrinks to an icon, animating that
shrinking not only shows where the icon went, but it draws the eye to the
point. Graphical user interfaces have a lot of user gestures, but not enough
system gestures.
A good system gesture can communicate on several levels at once, like a
gesture in the John Schlesinger movie Pacific Heights.
Melanie Griffith has tracked homicidal psychopath Michael Keaton to his hotel
lobby, where he now approaches the desk near which she sits in a chair, her
back to him, sipping a drink. It's a tense scene: She must stay long enough to
overhear his room number, but get away before he gets his message, which will
tell him she's there. We hear him say his room number, see him get his
message, read it, and turn. Then we get the gesture.
In her glass, sitting on a table, the top ice cube slips off the bottom one
with a clink.
Objectively all this gesture says is that she's no longer there to hold the
glass. On another level, the falling ice cube in tight focus for less than a
second says something about critical instants and events outside our control,
and it also conveys a time sense and defines a time interval.
I have nothing this artful in mind, but a simple color or shading change might
be an adequate gesture for deletion in FileMaker Pro. The point would be to
replace the textual warning message with a gesture that shows the
about-to-be-deleted object in a state viscerally recognizable as imperiled.
Color it red, fade it to gray, burn a (visual) hole in it, tear off a corner:
Some of these would be too expensive, but some such gesture ought to work, and
it could be useful in any deletion operation. We already have conventions for
showing if an object is selected or deselected, open or closed. Why not add an
imperiled state?





































August, 1991
EDITORIAL


Dr. Dobb's in '92, Encryption and Patents Now




Jonathan Erickson


Even though there's plenty of life left in 1991 -- including cold nights
aplenty for many of you and months of rain (we hope) for us in drought-weary
California -- we're anxious to get on with the coming year. No, it isn't that
we're eager to file next year's tax returns. (After all, say statisticians, we
just finished paying off the government this year.) It's just that the topics
DDJ will be covering in 1992 have us champing at the proverbial bit (or is
that byte?). Here's the 1992 DDJ Editorial Calendar:
January     Programming Advanced Architectures
February    Protected-Mode Programming
March       Assembly Language Programming
April       Advanced Algorithms
May         Data Communications
June        Scientific and Engineering Programming
July        Graphics Programming
August      C Programming
September   Debugging Tools and Techniques
October     Object-Oriented Programming
November    User Interfaces
December    New Dimensions in Data
These aren't the only topics we'll be covering, of course. You'll also find
embedded systems and real-time programming, encryption, memory management,
data structures, biocomputing, and dozens of other articles presenting useful,
interesting tools and techniques. Our fundamental approach remains the same:
one programmer talking to -- and sharing ideas and techniques with -- other
programmers. And you can be assured there will be lots of source code.
For questions about article submissions and author guidelines, contact Tami
Zemel.


C Language Q&A


This being our annual C issue, it seems the perfect time to introduce a new,
ongoing feature we call "C Language Q&A" that answers some of the most
frequently asked questions that arise in the comp.lang.c newsgroup on the
Usenet distributed conferencing system. Steve Summit compiled the questions
and wrote up the answers.
A more complete version of this series (with references) is available from
Steve at scs@adam.mit.edu or on Usenet. The questions here aren't in any
particular order, nor will you find them on a regular page in this or
subsequent issues -- they're scattered throughout, starting with page 78 this
month.
If you have questions about C or about Steve's answers, drop us a note here,
or contact Steve via net mail. Eventually, we hope to provide similar Q&As on
other topics.


Encryption Update


Since last month, there have been some developments concerning the Electronic
Frontier Foundation, privacy, data encryption, and Senate Bill 266. To recap:
SB266 was a Biden-backed bill proposing that government agents be provided a
backdoor to encryption engines used for voice and data: "...providers of
electronic communications services and manufacturers of electronic
communications service equipment shall insure that communications systems
permit the Government to obtain the plain text contents of voice, data, and
other communications...."
After comments from the EFF and others, Sen. Patrick Leahy (D-Vt.), chair of
the subcommittee on technology and the law, shelved 266. It was then
resubmitted as Omnibus Crime Bill SB1241 -- without the onerous passage.
That's not to say the issue is dead. The FBI is still pushing for the
proposal, and opponents worry it will find its way into law through conference
committee, riders, or other laws. If you agree that the sentiments behind
SB266 are villainous, let your elected representative know.
Coincidentally, the ink was hardly dry on last month's edition before
Microsoft threw its hat into the encryption ring, announcing licensing of RSA
Data Security's patented public-key encryption technology. Bill Gates himself
attended the tiny press conference, noting that the most significant
announcements sometimes come in the smallest packages. Hyperbole aside, it was
an important turn of events -- for Microsoft, RSA, and millions of computer
users.
RSA is the immediate winner. The small company got a big endorsement for
technology that's on its way to becoming a de facto standard. (Lotus, DEC,
Novell, and others also license RSA toolkits.) Although Gates didn't give a
time frame, he did say Microsoft will build encryption technology and security
features into future versions of its operating systems. The Bespectacled One
added that encryption is central -- perhaps critical -- to Redmond's plans for
the future.
Not that Microsoft has anything to hide. The company's interest is as much
with authentication as privacy. With public-key encryption, you know the
person who says he sent you a document did in fact send it -- not someone
else. This is accomplished via a digital signature and public and private
keys. In short, authentication is absolutely vital to the adoption and success
of electronic messaging. Additionally, public-key encryption can be used for
electronic software distribution.
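To make the signature idea concrete, here is a toy sketch in Python (3.8 or later) of textbook RSA signing and verification. The tiny primes, the exponent, and the function names are illustrative assumptions, not anything taken from RSA Data Security's toolkits. A real system signs a padded hash of the document and uses keys hundreds of digits long.

```python
# Toy textbook RSA signature; illustrative only, NOT secure.
# Assumed parameters: p = 61, q = 53 (far too small for real use).
P, Q = 61, 53
N = P * Q                            # public modulus
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent: inverse of E mod (p-1)(q-1)

def sign(digest, d=D, n=N):
    """The signer raises the message digest to the private exponent."""
    return pow(digest, d, n)

def verify(digest, signature, e=E, n=N):
    """Anyone holding the public key (e, n) can check the signature."""
    return pow(signature, e, n) == digest

digest = 65                          # stand-in for a hash of the document (must be < N)
signature = sign(digest)
print(verify(digest, signature))       # check against the genuine document
print(verify(digest + 1, signature))   # check against an altered document
```

The first check should succeed and the second should fail: only the holder of the private exponent can produce a value that the public key maps back to the digest.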
All of this talk about de facto encryption standards may go up in smoke,
however, because of the National Institute of Standards and Technology's
recent announcement to develop a proposal for a standard based on the
NSA-backed ElGamal public-key encryption algorithm. While it isn't compulsory
for private business or government agencies to adopt ElGamal, many likely will
if the proposal ever sees the light of day. There are holes galore in the NIST
proposal, ranging from the lack of a prerequisite hashing function to the
possibility of an open door for a backdoor. Public hearings before the House
Subcommittee on Technology and Competitiveness are in the offing and I bet
comments will be vociferous.


Patent Update


Because we haven't discussed software patents for a while doesn't mean the
problem has gone away. The good news is that the Secretary of Commerce has
convened a commission to investigate patent reforms. To this end, the Patent
and Trademark Office published in the Federal Register on May 15, 1991, a call
for public input on a variety of patent issues, including software. The bad
news is that comments were accepted only until July 15, 1991. (Two months
hardly seems like enough time.) After July 15th, contact either the U.S.
Patent and Trademark Office, Box 15, Washington, D.C. or the Senate
Subcommittee on Patents, Trademarks, and Copyrights. The Secretary of Commerce
is due to receive the final report in August, 1992.
In an effort to get more software expertise into the patent process, the PTO
has loosened the requirements for patent agents -- those folks who present
patents to the PTO (with or without the help of patent lawyers). Until
recently, you had to have an EE or a similar background to be considered
qualified to take the PTO patent exam that tests mastery of patent laws. Now a
computer science degree, along with a passing score on the test, gets you into
the thick of things.
More good news on the software patent front came from U.S. District Judge
Michael Mukasey, who ruled that in New York, you can't beg, borrow, buy, or
steal an interest in a patent merely to launch a patent-infringement lawsuit.
The case revolved around Refac Development's 1989 suit charging that Lotus,
Ashton-Tate, Borland, CAI, Microsoft, and Informix infringed upon Forward
Reference Systems Ltd.'s patent. In 1989, Forward Reference turned over five
percent of the farm to Refac on the condition that Refac sue the above
mentioned vendors. That's a no-no in New York (and several other states), says
Judge Mukasey. (New York's "champerty" laws say you can't help foot the bill
for someone else's lawsuit simply for a slice of the pie.) There's nothing to
prevent Forward Reference from coming back and launching its own lawsuit,
however.
This isn't to say that software patents are a good thing -- I don't think they
are. Copyright is a better way of protecting intellectual property. I'll root
for the technologically astute David who develops and protects a unique
technique, then fights and wins against a corporate Goliath who steals it
away. But I suspect most software patents are being granted to large
corporations that already hold hundreds of patents -- not to individuals
working on their own.



PGP, PKP, and the Tie that Binds


It should come as no surprise that contentious, yet seemingly unrelated topics
such as software patents and data encryption are at times related. If you
don't think so, ask Phil Zimmermann, a software engineer and head honcho of
Phil's Pretty Good Software in Boulder, Colorado, who wrote a freeware
encryption program called Pretty Good Privacy (PGP) and distributed it across
the networks. The rub is that Phil knowingly implemented the patented RSA
encryption algorithm (U.S. #4,405,829) without first licensing it from Public
Key Partners (PKP), a sister company of RSA Data Security. (MIT apparently
holds the actual patent, while PKP holds the rights to license the patent.
Interestingly, the algorithm is not patented in Europe, where it enjoys
widespread use.) Phil states in his documentation that it is the
responsibility of users to obtain proper licenses from RSA or PKP, even though
PKP doesn't currently license to end users.
Network messaging was fast and often furious, as people -- including members
of the EFF -- argued about encryption, patents, how they interact, and whether
or not Zimmermann had the right to do what he did. Ultimately, at least one
online system pulled PGP from its library. About then, PKP and Zimmermann began
exploring ways of resolving the problem. The upshot is that Phil has agreed
not to distribute updated versions of his software, and PKP has agreed not to
sue.
RSA Data Security acknowledges there's a need for more information --
particularly source code -- about the RSA algorithm. Consequently, within the
next few weeks the company will distribute, free of charge across the Internet,
C code for PEM (Privacy Enhanced Mail), a program that implements
RSA public-key encryption. We'll provide more information on PEM as it becomes
available.























































August, 1991
LETTERS







Interpreting the Constitution


Dear DDJ,
According to Michael Swaine in his June 1991 "Programming Paradigms," Laurence
Tribe argues that in order to avoid errors of the past in dealing with
constitutional issues, one must "remain true to the values represented in the
Constitution" and that "fidelity to the values requires flexibility in textual
interpretation."
The entire nation should find it of great comfort that, at long last, someone
is going to fully define the values represented in the Constitution for "the
rest of us." Tribe's "flexibility" in interpreting the text will, undoubtedly,
produce a translation that is uncannily in line with his own personal opinions
and political leanings.
As I see it, there are four ways of interpreting the Constitution, none of
them fully acceptable:
1. What did they write? That "the right of the people to keep and bear arms
shall not be infringed." Look up infringed in the dictionary. Any regulation,
including all restrictions on carrying concealed weapons, should not be
allowed. Strike one.
2. What did they mean? They meant that only white, Protestant, land-owning
males should be allowed to vote. Strike two.
3. What would they write today? The drafters of the Constitution have been
dead for more than 150 years. I guess we'll have to consult our crystal balls
to find out. Strike three.
4. What do I want it to say? This is the method used by most people,
apparently Tribe included. Actually, judging by the number of times I've heard
people say, "that's unconstitutional!" I think that very few people have ever
read the Constitution. Strike four. (You say there are only three strikes?
Where does it say that in the Constitution?)
Technological advances and their application in the judicial process should be
weighed by the degree to which they advance the cause of justice and provide
protection of the innocent.
David Rago
Livonia, Michigan


Compressing to the Minimum


Dear DDJ,
Reading Mark Nelson's article on data compression (DDJ, February 1991)
inspired me to experiment with text compression. In particular, I hoped that I
could find an algorithm that compressed text data to near its theoretical
minimum (if such a thing exists) without the bad-order time and memory
requirements Mark spoke about with the high-order modeling.
I started with the assumption that text data can be characterized as a stream
of symbols in which some symbols are more frequent than others, certain pairs
of symbols are more frequent than others, and certain words and phrases are
more frequent than others. I searched for an algorithm which exploits this
"repetition" of strings or "frequency differentials" on all scales with one
mechanism.
I came up with something which performed comparably to PKZIP. And I found the
simplicity of the program quite surprising.
This algorithm sounds a little like Mark's Order-1 modeling, yet I found no
need to go to higher orders. I had an array of "symbols," which was
initialized to have one symbol for each character encountered in the text, but
room for many more elements. Then I went through the file and constructed a
table of correlations between each possible pair of symbols. For every pair of
symbols occurring frequently enough above some threshold, I established a new
symbol to replace the pair. Then I repeated the process until I failed to
generate new symbols, at which time I reasoned that I had "removed" all
context regularities from the file, and all that remained was to apply the
Arithmetic Coding method to code all these symbols.
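As a rough illustration of the procedure just described, here is a minimal sketch in Python. It is not Mr. Cooper's program: the function name, the threshold value, and the choice to represent each new symbol as the joined string of its two parts are all assumptions made for the example.

```python
from collections import Counter

def merge_pairs(text, threshold=3):
    """Repeatedly replace frequent adjacent symbol pairs with new symbols.

    Returns the final list of symbols; joining them reproduces the text.
    """
    seq = list(text)                       # start with one symbol per character
    while True:
        # Count every adjacent pair in the current symbol sequence.
        counts = Counter(zip(seq, seq[1:]))
        frequent = {pair for pair, n in counts.items() if n >= threshold}
        if not frequent:                   # no new symbols: stop
            return seq
        out, i = [], 0
        while i < len(seq):                # greedy left-to-right rewrite
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) in frequent:
                out.append(seq[i] + seq[i + 1])   # new symbol replaces the pair
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out

symbols = merge_pairs("while(true) { break; } " * 5)
print(len(symbols), symbols[:4])
```

Each pass with at least one frequent pair shortens the sequence, so the loop terminates. The resulting symbols would then be handed to an arithmetic coder, which this sketch leaves out.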
I found that, as I expected, most words coagulated together to be represented
by a single symbol, and even uncommon words were represented by only two or
three symbols, so that in a sense it was working like a dictionary model. If
text was repetitious on all scales, like a fractal, I'd be able to compress
the file to just a few symbols, with all the information of the file contained
in the array of symbols, but as it is, the array of symbols is always small
compared with the processed file.
For example,
 while(true) { break;
was encoded by four symbols:
 while( true ) {\n\r\t\t break;
Now, of course, I ended up with a large number of "symbols," up to several
thousand for a large English file. To construct the table of correlations, I
couldn't very well have an array: int c[2000][2000]. But I accomplished the
same effect by splitting the theoretical table into smaller squares of some
smaller fixed size, so that for each square I constructed part of the
hypothetical 2000 x 2000 correlation table. It slowed things down, but I was
not particularly concerned about speed. Firstly, I thought that the primary
concerns were compression ratios and reconstruction times. Secondly, as it
turned out, I was never waiting too long for files to be compressed.
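The tiling trick can be sketched as follows, again in Python and again only as an assumption about how such a scheme might look. Symbols are taken to be integer ids from 0 to n_symbols - 1, and the names and the tile size are invented for the example. Only one small square of the full table is held in memory at a time, at the price of one pass over the data per square.

```python
def frequent_pairs_tiled(seq, n_symbols, tile=64, threshold=3):
    """Return (a, b, count) for adjacent pairs seen at least `threshold` times,
    building the pair-count table one tile x tile square at a time."""
    found = []
    for row0 in range(0, n_symbols, tile):
        for col0 in range(0, n_symbols, tile):
            block = [[0] * tile for _ in range(tile)]   # one square of the table
            for a, b in zip(seq, seq[1:]):              # full pass over the data
                if row0 <= a < row0 + tile and col0 <= b < col0 + tile:
                    block[a - row0][b - col0] += 1
            for i, row in enumerate(block):
                for j, n in enumerate(row):
                    if n >= threshold:
                        found.append((row0 + i, col0 + j, n))
    return found

print(frequent_pairs_tiled([0, 1, 0, 1, 0, 1, 5, 5], n_symbols=6, tile=2))
```

The result does not depend on the tile size; a smaller tile simply trades more passes over the file for less memory.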
Can I conclude that I may have compressed the file to its theoretical minimum?
Something I believe, which someone may someday prove, is that Arithmetic
Coding is the theoretically optimal way of encoding a sequence of symbols,
provided that the sequence has no context regularities. A sequence of symbols
without context regularities can be permuted in any way and be statistically
equivalent to the original sequence. If my processed file of symbols has this
property, I should be able to permute it in any way and get something
completely comprehensible. This is not the case. I haven't tackled the problem
that symbols occur with different frequencies in different regions of the
text (more so with programs than English), because I expected only modest
gains with that optimization. Also, more pertinently, I know the rules of C
syntax (and layout) and English grammar, and so all the "comprehensible" files
share some regularities this method does not employ. Some of these more
superficial regularities could be programmed into a compression program as
culture-specific information, but there are also deeper regularities, e.g.,
meaning and function, which would be difficult to exploit.
Tim Cooper
Eastwood, Australia
Mark responds: Mr. Cooper has come up with an interesting algorithm for
tokenizing a file. Whether it can be effectively used to compress a file is
another question. By developing a comprehensive dictionary full of strings, it
is possible to drastically reduce the size of a file well beyond the ratios
achievable by compression programs most of us use. The problem is, in order to
decompress the file, a copy of the dictionary has to be passed to the
decompression program along with the data. So the critical factor is how much
storage space is taken up by the dictionary and the compressed data together,
not just the data. Mr. Cooper seems to be neglecting that factor in his
presentation.
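The accounting described above can be sketched in a few lines of Python. The sketch rests on stated assumptions: the coded data is charged at its zero-order entropy (about what an ideal arithmetic coder would need for a stream with no context regularities), and the dictionary is charged as raw 8-bit characters, one copy per distinct symbol. The function names are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Zero-order entropy of the symbol stream, in bits for the whole stream."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(n * math.log2(n / total) for n in counts.values())

def dictionary_bits(symbols):
    """Assumed dictionary cost: each distinct symbol stored once as 8-bit characters."""
    return 8 * sum(len(s) for s in set(symbols))

def total_bits(symbols):
    """What must actually be shipped: coded data plus the dictionary."""
    return entropy_bits(symbols) + dictionary_bits(symbols)

stream = ["while(", "true", ") {", "break;"] * 10
print(entropy_bits(stream), dictionary_bits(stream), total_bits(stream))
```

Comparing total_bits for a character-level stream and for a heavily merged one shows the trade-off: merging lowers the data term but raises the dictionary term.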
In the event that you are going to be compressing the same type of files on a
regular basis, it makes sense to build a dictionary and keep it online for
both the compressor and the decompressor to use. For the most part, today's
users seem to prefer algorithms such as LZW that create dictionaries
on-the-fly. It seems possible that applications involving the storage of
massive amounts of homogeneous data, such as reference works on CD-ROM, could
benefit from the Cooper approach.


Until Proven Otherwise


Dear DDJ,
I am writing in response to the letters from Steve Medvedoff and Charles Pine
("Letters," May 1991) regarding Michael Swaine's February 1991 "Programming
Paradigms" column ("A Programmer Over Your Shoulder"). These letters present a
wonderful opportunity to reinforce the points he made in his article about On
the Shape of Mathematical Arguments by Antonetta van Gasteren.
The special irony is that the second writer disproves his own point. Briefly,
he claims that he has a "hardly" messy proof of the natural number
pair-matching problem. His proof is clear. It is also incorrect. The first
part of this letter attempts to find out why.
It is hard to show where the "misproof" breaks down, because the conclusion,
after all, is the correct one! This problem is very similar to one often faced
by programmers. Your program has been getting the correct answer, but suddenly
it's not. By looking at the code, you eventually discover that it was getting
the correct answer purely by accident. This incorrect proof manifests a
similar bug. Its technique appears to work in the present situation. In other
circumstances, it will actually prove incorrect statements.
One way to find this kind of bug is to represent the program, or proof,
abstractly. In this case, then, the argument goes as follows: Begin with the
pairing that you believe is optimal. Show that every pairing derivable from
yours in a certain way is not better than yours. Conclude that yours is best.
The missing piece emerges in the phrase "in a certain way." The pairings using
the procedure in the misproof include only those achievable by "exchang[ing]
the members of two pairs." These are not all possible pairings (when the
number of all pairs exceeds two, anyway), and so it does not necessarily
follow that the putative optimum is best. In other situations (where the
function to maximize is different, for example) this technique can "prove"
incorrect conclusions.
Perhaps the "messy" proofs are messy because they are correct? The second
point of this letter is to expound on the similarities between the spirit of
modern mathematical thinking and code reuse. This is mentioned by the first
letter writer, Steve Medvedoff, where he notes that "in mathematics, looking
at a problem from a different perspective...is a common, often enlightening
practice."
The same example will do nicely to illustrate. The sum of products appears in
many branches of mathematics: as a dot product of vectors, for example, or as
a line integral over a polygon, or the square of the diagonal of some
hyperbox. In this case it is especially fruitful to look a little farther
afield, to probability theory.
You should recognize the sum of products expression as the covariance of a
bivariate population (the set of pairs). Now for the "code reuse": Those
familiar with the relationship between covariance and correlation will see
immediately that maximizing the covariance (while keeping the coordinate sets
fixed) is achieved by maximizing the correlation coefficient. (By the way,
this is where I hide most of the algebra and hand-waving that appears
"in-line" in other proofs.)

The lights begin to flash. To wit:
1. The problem needn't be restricted to natural numbers, or even positive
numbers, at all; the coordinates can be allowed to be any sets of real numbers.
2. The correlation coefficient is invariant under translations and rescalings
of the coordinates (i.e., of all Xs at once, or of all Ys at once). (I can't
resist pointing out that this extends the useful symmetries in the problem from
a small, finite permutation group to an (infinite) four-dimensional group.)
3. The problem now has a direct geometrical interpretation: How can we match
the points so that their scatterplot most closely approximates a straight line?
This is an especially nice bonus, because it lets us visualize a solution to
the problem.
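The link between the sum of products and the correlation coefficient can be written out explicitly. The notation here is introduced for illustration and is not the letter writer's: a pairing is a permutation \pi of the Y-values, and means, standard deviations, and covariance are the population (divide-by-n) versions, with both standard deviations assumed to be nonzero.

```latex
\[
\sum_{i=1}^{n} x_i \, y_{\pi(i)}
  \;=\; n \,\mathrm{Cov}_{\pi}(X,Y) \;+\; n \,\bar{x}\,\bar{y}
  \;=\; n \, r_{\pi} \,\sigma_X \,\sigma_Y \;+\; n \,\bar{x}\,\bar{y}
\]
```

Because \bar{x}, \bar{y}, \sigma_X, and \sigma_Y depend only on the two coordinate sets and not on how they are matched, the pairing that maximizes the sum of products is the one that maximizes r_\pi.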
Bringing in all these results from elementary probability theory is akin to
reusing old code. "Why reinvent the wheel?" is the cry of today's programmer,
seeking the Holy Grail of perfect code in no time with no work. It is a
traditional technique of mathematics, which derives much of its power from
building on previous results.
"Reusing" these mathematical results, let's translate the X-values so that the
smallest is 0, and the Y-values so that their smallest also is 0. (In other
words, shift the geometric picture so that the bottommost, leftmost coordinate
is the origin. We can do this because of observation 2 above.) Now it is
algebraically trivial (when you see this word "trivial," you know you're not
reading a real proof!) to see that the smallest X must be paired with the
smallest Y in order to maximize the sum of products. Any other possibility can
be equalled or improved by using this matching instead. The key is that the
zeros simplify the computation.
This is where the beauty of the recursive formulation mentioned by Swaine
comes in. We're done! The problem now is "one size" smaller, since we are left
to pair the remaining points among themselves. By recursion, we see
immediately that the ordered X-values must be paired with the ordered
Y-values, smallest with smallest, through largest with largest.
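The conclusion is easy to spot-check by brute force. The Python sketch below (function names are invented for the example) compares the sorted pairing against every possible pairing of a small data set. It is a sanity check on a few numbers, not a proof.

```python
from itertools import permutations

def sum_of_products(xs, ys):
    """Sum of x*y over pairs matched position by position."""
    return sum(x * y for x, y in zip(xs, ys))

def best_by_brute_force(xs, ys):
    """Largest sum of products over all n! ways of matching ys to xs."""
    return max(sum_of_products(xs, p) for p in permutations(ys))

def sorted_pairing(xs, ys):
    """Smallest with smallest, through largest with largest."""
    return sum_of_products(sorted(xs), sorted(ys))

xs = [3, -1, 4, 1]          # negative values are allowed, per the letter
ys = [2, 7, -1, 8]
print(sorted_pairing(xs, ys), best_by_brute_force(xs, ys))
```

Both numbers should agree, including for data with negative values, in line with the observation that the problem needn't be restricted to natural numbers.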
The same kind of magic instantaneous solution arises in well-written recursive
code. Recursive-descent parsers, for example, can appear just as mysteriously
effective.
One point, worth a lot of reflection (I think), is that the hard part was over
as soon as the connection with probability theory was made. In other words,
the achievement lies not in the reuse of techniques (code) per se; it lies in
discovering which code is relevant, and using it ("interfacing" with it)
appropriately.
I have been able only to touch upon parallels between the disciplines of
programming and "proof-making" here. Proofs and programs can have bugs, but
sometimes appear to work correctly. Reusable code and insights from other
mathematical/scientific disciplines can be remarkably effective. Visualizing a
problem can be enlightening. These points, and the spirit of this discussion,
are worthy of further musing, further pursuit. I hope that Swaine keeps on
track.
William A. Huber, Ph.D.
Philadelphia, Pennsylvania





















































August, 1991
STANDARD C: A STATUS REPORT


Where does the language go from here?




Rex Jaeschke


Rex is editor of The Journal of C Language Translation, a quarterly
publication aimed at implementors of C language translation tools. He also
serves on the ANSI C committee X3J11, is the US international representative
to ISO C, and is the convenor of the Numerical C Extensions Group. Rex can be
contacted at 2051 Swans Neck Way, Reston, VA 22091, 703-860-0091, or via the
Internet at rex@aussie.com.


We in the C industry are at a major milestone. We have our first language
standard. We have many mainstream compiler vendors working hard to conform to
that standard with quite a few of them probably already there. And we soon
hope to have in operation a system for formal validation against that
standard. Our industry is maturing. Some say that because of that, it is
starting to die or at least fade. With everything supposedly nailed down by a
formal definition, there's no room left for the entrepreneurial freedom that
launched this language to such great heights. The media, of course, has lost
quite a bit of its interest in this (dare I say) commonplace language and is
instead focusing on the new darling, C++.
If you like to be "on the leading edge," okay. From my perspective, however,
that's the last place I want to be if I have real work to do. With
standardization comes stability. And while our very being is helped by change,
it is just as much hindered by it. Creeping featurism we can do without, for
the most part. Now that the hype seems to have gone from our beloved language
we can get on with the job we were presumably hired to do. That is, deliver
usable results using real people and a small, finite budget. And if we need to
write portable code, at least we now have a much better chance of doing it
than we ever had before.
At this point in C's life, it is worth looking back at how we got to where we
are -- to what shaped the standard and why, and where we can and will go from
here.


The ANSI C Standard


The first standard for C was that endorsed by the American National Standards
Institute (ANSI) in December of 1989 (known formally as X3.159-1989). This
standard was produced by the X3J11 committee under the auspices of the X3
Secretariat which, in turn, is managed by the Computer and Business Equipment
Manufacturers Association (CBEMA) in Washington, D.C. Strictly speaking,
X3J11 is not an ANSI committee; however, it is generally known as the ANSI C
committee.
How did X3J11 get started? For that we go back to early 1983 when Jim Brodie
of Motorola was charged with implementing a C compiler. After having searched
for a complete definition of the language, preprocessor, and library he
concluded no such thing existed. What did exist was a book called The C
Programming Language by Kernighan and Ritchie (now affectionately known as
K&R); various papers published both within and outside AT&T's Bell Labs; and a
myriad of compilers, most of which were either derived from, or tried to
emulate, the Unix portable C compiler (pcc). If an implementor wanted to know
what a C compiler was supposed to do in a particular situation, the answer was
often "Why don't you run a test through pcc and see what it does?"
Brodie reported to his boss that no complete description existed. His boss,
who had previously been involved in language standards, suggested Brodie
should make a standard because none currently existed. Brodie took the advice;
after all, how long could it possibly take to pin down the few loose ends
needed to make the definition really complete? (The answer was, about seven
years.)
So Brodie became the convener of a proposed ANSI committee. His first big job
was to write a project proposal outlining the goals and purposes of such a
committee as well as the justifications and estimations of the resources it
would need and where they would come from. This proposal was then submitted to
SPARC (ANSI loves acronyms) for its consideration. Once they approved, the new
committee was officially formed and named X3J11. (X3 is the ANSI subgroup that
deals with computing standards, J is the language category within X3, and
there were already ten other committees in existence. Other examples are X3J3
-- Fortran and X3J16 -- C++.)
The first official meeting was chaired by convener Brodie. A slate of officers
was nominated and, eventually, elected. The original officers were: Chair,
Brodie; Vice-Chair, Tom Plum; Secretary, P.J. Plauger; and Vocabulary Rep,
Andy Johnson.
Larry Rosler volunteered to serve as editor of the draft standard. (Although
the editor is not an officer, he obviously plays a very important role in the
committee's work.) These people continue to serve in those capacities today
except for Rosler who has been replaced by David Prosser. A Rationale editor
(Randy Hudson) was added later, as was an International Representative (first
Steve Hersee, then Plauger, and now myself).


Appointments to the C High Court


The general impression users seem to have of standards is that they have been
handed down from some divine authority that supposedly knows what is best for
"mere mortals." In fact, X3J11 is made up of real people, many of whom
actually program in C. If they are just "regular" C programmers, how do they
get appointed to the high court of C? Well, they pretty much appoint
themselves. Or at least their employers appoint them.
Anyone can attend and informally participate in an ANSI standards meeting even
if they aren't registered members. However, to vote on issues of policy they
must meet certain requirements. For example:
They must be fully paid members registered with CBEMA. (This requires they
fill out a form and pay $250/year.) And while companies can register and send
as many employees as they wish, one employee becomes their primary delegate
with any others being alternates. Only one representative from each company
may vote at a time. A member need not be a US citizen nor even a permanent
resident. (X3J11 had active participation from numerous nonresident
non-Americans.)
To actually benefit from committee deliberations, they really need to attend
meetings. In most years this meant four-and-a-half-day meetings four times a
year, generally somewhere in the US.
To maintain their voting privileges, they must attend two of every three
meetings and be signed in on the attendance list for a certain number of days
each meeting. Each individual company gets one vote. Fred Smith, an
independent consultant, gets one vote, as do IBM, AT&T, and DEC. Obviously,
Fred's vote can be very significant.
In summary, you can have a significant impact on an ANSI standard if you have
$250 and the time and budget to attend meetings and prepare and read papers
between meetings. You can also have a real impact if you never attend
meetings, simply by being an observing member and corresponding by post or
electronic mail.
X3J11 has been dominated by implementors of C language translation tools.
(This is not to suggest the resulting standard is somehow inadequate or
biased, however. On the contrary, I think the standard is of very high
quality.) Of course, each vendor also indirectly represented their own
customer base. But what the vendor sees as being in his best interest may not
be what his customer wants or needs. As C matures even further, it will be
interesting to see if the composition of X3J11 membership changes. I suspect
it will, but it's a slow process and few end users perceive the need to get
involved at this level or can justify the investment in time and expense.
It is important to realize that X3J11 members must pay their own way in all
standards-related activities and that membership is totally on a volunteer
basis. Neither ANSI nor the X3 Secretariat provides any financial assistance. A
gracious host must be found for each meeting. (Until recently, hosts had to
duplicate and mail up to 200 sets of several hundred pages before and after
each meeting.) ANSI's only revenue comes from committee membership fees and
the sale of standards documents.


The Committee's Aims


The aims are best stated by quoting the Rationale document that accompanies
the C Standard:
The Committee's overall goal was to develop a clear, consistent, and
unambiguous Standard for the C programming language which codifies the common,
existing definition of C and which promotes the portability of user programs
across C language environments.
The X3J11 charter clearly mandates the Committee to codify common existing
practice. The Committee has held fast to precedent wherever this was clear and
unambiguous. The vast majority of the language defined by the Standard is
precisely the same as is defined in Appendix A of The C Programming Language
by Brian Kernighan and Dennis Ritchie, and as is implemented in almost all C
translators. (This document is hereinafter referred to as K&R.)
K&R is not the only source of existing practice. Much work has been done over
the years to improve the C language by addressing its weaknesses. The
Committee has formalized enhancements of proven value which have become part
of the various dialects of C.
More Details.
Existing practice, however, has not always been consistent. Various dialects
of C have approached problems in different and sometimes diametrically opposed
ways. This divergence has happened for several reasons. First, K&R, which has
served as the language specification for almost all C translators, is
imprecise in some areas (thereby allowing divergent interpretations), and it
does not address some issues (such as a complete specification of a library)
important for code portability. Second, as the language has matured over the
years, various extensions have been added in different dialects to address
limitations and weaknesses of the language; these extensions have not been
consistent across dialects.
One of the Committee's goals was to consider such areas of divergence and to
establish a set of clear, unambiguous rules consistent with the rest of the
language. This effort included the consideration of extensions made in various
C dialects, the specification of a complete set of required library functions,
and the development of a complete, correct syntax for C.
The work of the Committee was in large part a balancing act. The Committee has
tried to improve portability while retaining the definition of certain
features of C as machine-dependent. It attempted to incorporate valuable new
ideas without disrupting the basic structure and fabric of the language. It
tried to develop a clear and consistent language without invalidating existing
programs. All of the goals were important and each decision was weighed in the
light of sometimes contradictory requirements in an attempt to reach a
workable compromise.
In specifying a standard language, the Committee used several guiding
principles, the most important of which are:
Existing code is important, existing implementations are not.
C code can be portable.
C code can be non-portable.
Avoid "quiet changes." [These are changes to widespread practice that alter
the meaning of existing code.]
A standard is a treaty between implementor and programmer.
Keep the spirit of C. There are many facets of the spirit of C, but the
essence is a community sentiment of the underlying principles upon which the C
language is based. Some of the facets of the spirit of C can be summarized in
phrases like
Trust the programmer.
Don't prevent the programmer from doing what needs to be done.
Keep the language small and simple.
Provide only one way to do an operation.
One of the goals of the Committee was to avoid interfering with the ability of
translators to generate compact, efficient code. In several cases the
Committee has introduced features to improve the possible efficiency of the
generated code; for instance, floating point operations may be performed in
single-precision if both operands are float rather than double.


Milestones


Because a standards committee tends to work in isolation during much of its
deliberations, it was deemed important to get outside input once X3J11 had
progressed far enough. As such, an unofficial public review period was
announced and copies of the working draft were made available for purchase
through CBEMA. Considerable time was then spent by X3J11 in analyzing the many
criticisms and suggestions submitted.
The next stage was to have a formal public review. (From this point on, X3J11
required a two-thirds majority to change the draft instead of the previous
simple majority.) In this case, the draft standard was issued for four months.
All public comments had to be answered according to a detailed set of rules.
In each instance where X3J11 decided to not adopt a suggestion it had to give
some rationale as to why, and a dissatisfied public could appeal its
treatment. Once all issues had been resolved, the cycle was repeated, but only
for a two-month period. After three public reviews, the draft standard had
reached its final form. This process continued until X3J11 converged on
a standard that was acceptable to most (but not necessarily all) voting
members and ANSI.
For all intents and purposes, the standard was complete by late 1988. However,
sometime after the final public comment period ended, a letter from a member
of the public was discovered by ANSI. It had been inadvertently mislaid.
Despite the fact that the final draft included perhaps half of his suggestions
already, the author of this letter wanted his pound of flesh. After X3J11
considered his comments (many of which requested support for real-time
applications), they chose to not implement his remaining suggestions. Unhappy
with that outcome, he chose the appeals route, which ultimately resulted in no
further technical changes to the draft. After about a one year delay, an ANSI
C standard was finally accepted late in 1989. For some, that might seem like
the end of the line. However, it's really just the end of one cycle in a
never-ending carousel ride. X3J11 not only has to interpret requests from the
public but, eventually, must look at revising that standard.


The Interpretations Phase


Under ANSI rules, a standards committee must rule on interpretations of its
standard. That is the main business of X3J11 at this time and for the
foreseeable future. Because interpreting the standard requires much less time
and effort than producing the initial standard, X3J11 went to a two-day
meeting schedule twice a year. However, as the workload increased, meetings
have been extended to two-and-a-half days.
Examples of the interpretation requests processed thus far are:
1. Given the macro definitions #define lp ( and #define fm(a) a, to what does
fm lp "abc" ) expand? (X3J11's answer was fm ("abc").)
2. What is the resultant output from printf ("%#.4o",345)? Is it 0531 or is it
00531? (X3J11's answer was 0531.)
3. Do functions return structure values by copying? (X3J11 finally decided
that the function return must be done as if a copy was being performed.)
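The three requests above can be tried against any conforming compiler. What
follows is a minimal sketch, not anything taken from X3J11's papers: the names
expand_demo, format_demo, struct_copy_demo, and struct pair are invented for
illustration, the function fm exists only so that the text the preprocessor
leaves behind has something to call, and snprintf is a later (C99) addition to
the library.

```c
#include <stdio.h>

static int fm_calls = 0;

/* An ordinary function named fm, defined before the macro of the same
   name, so that the leftover text  fm ("abc")  is a valid call. */
static const char *fm(const char *s)
{
    fm_calls++;
    return s;
}

#define lp (
#define fm(a) a

/* Request 1: when fm is scanned, the next token is lp rather than '(',
   so the macro fm is not invoked.  lp then becomes '(' and the statement
   reads  fm ("abc"),  which calls the function.  Returns the call count. */
int expand_demo(void)
{
    fm_calls = 0;
    (void) fm lp "abc" );
    return fm_calls;
}

/* Request 2: with 'o', the '#' flag raises the precision only when that
   is needed to force a leading zero.  Precision 4 already pads 531
   (octal for 345) with one zero.  Returns the number of characters. */
int format_demo(char *buf, size_t size)
{
    return snprintf(buf, size, "%#.4o", 345u);
}

struct pair { int a[2]; };

static struct pair original = { { 1, 2 } };

static struct pair get_pair(void)
{
    return original;             /* returned as if by copy */
}

/* Request 3: changing the returned value must leave the original alone.
   Returns 1 when the original is untouched and the copy was changed. */
int struct_copy_demo(void)
{
    struct pair copy = get_pair();

    copy.a[0] = 99;
    return original.a[0] == 1 && copy.a[0] == 99;
}
```

Called from a main of your own, expand_demo() should return 1, format_demo()
should leave 0531 in the buffer, and struct_copy_demo() should return 1 on an
implementation that follows X3J11's answers.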
While most requests can be dispensed with quite readily, a few require
detailed reading of many related sections of the standard. As a result, the
formal answer can be quite lengthy and contain numerous quotes from the
standard. Some requests are quite educational for members of X3J11 who thought
the standard said something quite different or who disagree with the final
outcome.
An interpretation made by X3J11 does not have the weight of the standard and
it can, in fact, be overturned by X3J11 or ANSI at some later date. However,
it is expected that almost all interpretations will eventually find their way
into a future revision of the standard. To help communicate interpretations to
the outside world, X3J11 is in the process of publishing a technical bulletin
containing all requests resolved to date.


The ISO C Standard


In 1986 an ISO C standards committee was formed called SC22/WG14. At the same
time, the draft ANSI standard was converging. However, due to input from
US-based vendors wishing to sell compilers to non-English speaking countries,
and from customers and vendors in those countries, considerable effort was put
into enhancing the draft ANSI standard in two different directions: To better
handle the western European single-byte characters containing diacritical
marks (such as German's ä and Spanish's ñ) and to better handle the multibyte
character sets used in Asia.
Although this effort delayed the ANSI C standard for more than a year, it did
pave the way for the ANSI standard to be adopted as the ISO standard without
technical change. That is, except for reformatting changes, the ISO and ANSI C
standards became equivalent. (The ISO standard is known as ISO/IEC 9899:1990
(E).)
While X3J11 is busy handling interpretations, WG14 is blazing new trails on
several fronts and is producing a normative addendum. The main parts of that
addendum are:
The UK delegation is busy working on a clarification document that further
explains numerous "dark corners" of the standard. No substantive changes will
result, only explanatory material.
The Japanese delegation is defining numerous extensions to the library to
provide more support for multibyte character handling. This will involve a new
header, functions, typedefs, and macros.
The Danish delegation is working on a more readable alternative to the
trigraph solution already contained in Standard C. (A trigraph is a
three-character sequence of the form ??x that can be used to write the nine
characters, such as {, }, and \, missing from most systems using the ISO-646
character set.) This has been by far the most controversial activity amongst
member nations.
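The nine trigraph replacements are fixed by Standard C, so they can be
tabulated directly. The sketch below is illustrative only (the function name
trigraph_char is made up); it maps the third character of a ??x sequence to
the character it stands for, and it deliberately avoids spelling out any real
trigraph in the source so that it reads the same whether or not the compiler
has trigraph replacement enabled.

```c
#include <stddef.h>

/* Map the third character of a trigraph (the x in ??x) to the character
   it stands for, or return 0 if x does not complete one of the nine
   trigraphs defined by Standard C. */
char trigraph_char(char third)
{
    static const char from[] = "=(/)'<!>-";
    static const char to[]   = "#[\\]^{|}~";
    size_t i;

    for (i = 0; from[i] != '\0'; i++) {
        if (from[i] == third)
            return to[i];
    }
    return 0;
}
```

For instance, trigraph_char('<') yields '{' and trigraph_char('/') yields the
backslash, while a character such as 'x' yields 0.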
When published, this normative addendum will have the full weight of a
standard.
Currently, WG14 meets twice a year for two or three days. Last November they
met in Copenhagen. In May it was Tokyo and in December it will be Milan.
Meetings are usually attended by representatives from four or five countries
with each delegation having one to four members. No formal votes are held but,
rather, decisions are made by consensus. Of course, getting work done at the
ISO level is slower because delegates must get input from their national
standards bodies, and the timing of X3J11 and WG14 can impact the ability of
the US to quickly respond to proposals at the ISO level. Fortunately,
considerable debate occurs by electronic mail both at the ISO and X3J11 level.


Extension Efforts


Numerous efforts are underway to extend the C language, its environment, or to
define add-on libraries. Some of these are being pursued by groups affiliated
with standards bodies while others are simply extensions to commercially
available compilers. A few of the more notable ones are discussed here.
The Numerical C Extensions Group (NCEG)
I convened this group in March 1989 as
an ad hoc working group. Because ANSI C clearly was not going to address
numerous issues of importance to the numerical community, I decided to try and
coordinate the extensions that were already evolving in that direction. My
intent was not to replace Fortran with a numerically extended C, however.
At the initial meeting of NCEG the following issues were identified as needing
the most attention:
aliasing
vectorization and array syntax
complex arithmetic
variably dimensioned arrays
IEEE floating point support
exceptions and errno
parallelization
aggregate initializers
Not long after, the parallel extension topic was removed because another ANSI
committee (X3H5) had this as part of its project proposal. Work has progressed
on all the other topics and a new topic was recently added. This involves
extending the type facility to handle larger integers. Other extensions to
initialization syntax are also being considered.
NCEG has met regularly since mid-1989, generally along with (in the two days
that precede or follow) X3J11 meetings. In March of 1991, SPARC finally
accepted NCEG's project proposal. NCEG has since become a working group within
the ANSI umbrella and is now formally known as X3J11.1.
NCEG's mission is to produce a technical report, not a standard. Public
comment drafts of parts of this report will likely become available early in
1992. The primary purpose of NCEG is to define approaches to solving numerical
issues and to promote adoption of its solutions among the vendor community.
This will hopefully play a major role in establishing prior art for future C
standard revisions.
Others
The ANSI committee X3H5 is defining a parallel processing model along
with Fortran and C language bindings. Its first draft of a C binding appeared
in December 1990 and is currently being revised based on input from numerous
sources.
Various ISO committees are working on, or have completed, C language bindings
in areas such as 2-D and 3-D GKS graphics.
AT&T has long had a research project on parallelism called "Concurrent C,"
while Thinking Machines has their language C*. There is also C++ and
Stepstone's Objective C. The GNU C compiler from the Free Software Foundation
has already played a significant role in making C available to numerous
systems. That compiler also has numerous interesting experimental extensions.
Another interesting project is being developed by the initial author of Turbo
C. His language is tentatively called OPAL and is derived from C. Many other
projects are underway, including C-cured, a version of C with a supposedly
more rational way of declaring types.


The C++/C Compatibility Question


When it was proposed that X3J11 take on the task of standardizing C++, it
declined. Not only was this outside the original charter of X3J11, but members
had spent more than six years working on C and also had plenty of
interpretation work to keep them busy. In any event, there were plenty of
arguments as to why C++ was a different language despite its C roots.
Eventually, a new committee (X3J16) was formed to work on C++, and within
little more than a year, an ISO C++ working group was also formed.
There has been much discussion of X3J16's now well-known statement that C++
should be "as close as possible to ANSI C but no closer." In any event, there
are moves afoot within X3J16 to keep C++ as upwards-compatible with C as
possible and to diverge only when absolutely necessary. Such compatibility
issues are being handled by X3J16's C Compatibility subgroup. Some of the main
issues this group faces are:
Programs containing both C and C++ code
Whether Standard C code can and should be correctly compiled by a C++
translator
C++ code that calls Standard C library routines and vice versa


Compiler Validation


Any vendor can claim conformance to Standard C, and many currently do.
However, at the time of writing (May 1991) the C validation service within the
US was not yet operational. The National Institute of Standards and Technology
(NIST, formerly the National Bureau of Standards) has, however, selected a
validation suite from Perennial. This suite has been under technical review
for several months now and is being revised to meet NIST requirements. It
should be noted that NIST will validate against the FIPS C standard. (FIPS is
an acronym for "Federal Information Processing Standard.") Currently, FIPS C
is ANSI C with a few extra requirements.
The validation suite chosen by the British Standards Institution (BSI) and
several other European standards bodies is from a different vendor, Plum Hall.
Since BSI's conformance facility is operational, it has already begun issuing
validation certificates for several compilers. BSI validates against ANSI/ISO
C.
One interesting issue is that a conforming implementation must document what
it does for all cases of implementation-defined behavior. Documentation cannot
be automatically checked by a suite, so it remains to be seen just how this
aspect will be handled.


The Future of Standard C


Except for reformatting changes, the ISO and ANSI C standards are currently
equivalent. As such, X3J11 and WG14 encourage all those concerned to talk
about Standard C rather than ANSI or ISO C. We already talk of a global economy,
and multinational companies abound. With the advent of real-time electronic
communications the world is "shrinking." National boundaries are easily
transcended and, for many issues, irrelevant. As such, it makes economic sense
for us to work towards international standards rather than on isolated
national fronts.
All indications are that future evolution of Standard C will be determined at
the international level. National groups such as X3J11 will still continue to
play a very important role and may even do development on behalf of WG14.
However, the focus will be more and more on international standards. To that
end, X3J11 is currently completing a letter ballot to withdraw X3.159-1989 and
to replace it with ISO/IEC 9899:1990 (E). That is, it is planning to make the
ANSI C standard exactly the same as that from ISO, including the ISO format.
This paves the way for X3J11 to officially track the work of WG14; as ISO C is
revised, ANSI C can be revised in the same manner.
The technical report produced by NCEG will likely have some impact on future
directions of C because many of the members of X3J11 are also active within
X3J11.1.
As to C++, at this time I do not see it as the logical successor to C. For
certain classes (no pun intended) of programs, C++ may indeed be more
appropriate. However, C still has its own considerable niche. And with C
leading the race to support multibyte and international software character
sets via C's locales, I suspect C will be used in an even broader range of
applications in the future.


C Language Resource Guide


To get further information about topics discussed in this article, contact the
appropriate person or organization below:
The ANSI C Standard's official designation is ANSI X3.159-1989. The price of
the standard is $65. Note that it is available in printed form only. To obtain
a copy, contact:

American National Standards Institute
Sales Department
11 West 42nd Street
New York, NY 10036
212-642-4900; Fax: 212-398-0023
Sales Fax: 212-302-1286

Convenor of X3J11 (ANSI C)
Jim Brodie
Motorola Inc.
Tempe, AZ 85284
602-897-4390
Internet: brodie@ssdt-tempe.sps.mot.com

If you would like to formally submit a request for interpretation of the C
standard, contact:

Manager of Standards Processing
X3 Secretariat, CBEMA
311 First Street N.W., Suite 500
Washington, DC 20001-2178
202-737-8888; Fax: 202-638-4922

Convenor of X3J11.1 (NCEG)
Rex Jaeschke
Journal of C Language Translation
2051 Swans Neck Way
Reston, VA 22091
703-860-0091
Internet: rex@aussie.com

Chairman of X3J16 (ANSI C++)
Dmitry Lenkov
Hewlett-Packard Company
19447 Pruneridge Avenue, MS 47LE
Cupertino, CA 95014
408-447-5279
Internet: dmitry%hpda@hplabs.hp.com

The ISO C Standard's official designation is ISO/IEC 9899:1990 (E). Copies are
available from ANSI in the US and from affiliated national standards
organizations in other countries.

WG14 (US international rep.)
Rex Jaeschke (see contact information)

Vice-Chair of X3H5 (Parallel processing)
Walter G. Rudd
Department of Computer Science
Oregon State University
Corvallis, OR 97331-3902
503-737-5553; Fax: 503-737-3014
Internet: rudd@cs.orst.edu

NIST-US validation issues and policy:
National Institute of Standards and Technology
Software Standards Validation Group
Gaithersburg, MD 20899
301-975-3247

US NIST-approved validation suite:
Perennial
4699 Old Ironsides Drive, Suite 210
Santa Clara, CA 95054
408-727-2255; Fax: 408-748-2909
Internet: uunet!peren!acvs

BSI/European Validation Service
British Standards Institution Quality Assurance (BSI QA)
PO Box 375
Milton Keynes, MK14 6LL
United Kingdom
+44-0908-220908
Fax: +44-0908-226071
Internet: neil@bsiqa.uucp

BSI/European-approved validation suite:
Plum Hall Inc.
1 Spruce Avenue
Cardiff, NJ 08232
609-927-3770; Fax: 609-653-1903
Internet: plum@plumhall.com


August, 1991
A SOURCE CODE GENERATOR FOR C


A language-independent means of building programs that are consistent,
elegant, and fast




Karl Vogel


Karl is a programmer for Control Data Corporation. Contact him at 2970
Presidential Dr., Suite 200, Fairborn, OH 45324.


I have been writing code for about nine years, and one thing I've noticed is
that programming can usually be divided into three phases. Phase 1 (getting
the idea) takes all of a millisecond and usually occurs somewhere really
convenient, like in the shower. Phase 3 (running the finished program on the
machine) takes a few seconds or minutes for the type of applications I usually
deal with. Phase 2 (actually writing the bloody thing) usually takes a few
orders of magnitude longer than Phases 1 and 3 put together. I know life's not
supposed to be fair, but this is ridiculous.
Ever since I started working in a Unix shop, I've always wanted a
"programmer's assistant" for C, kind of an invisible HAL 9000 sitting behind
me and whispering things like "I'm sorry, Dave, but I don't think that you
should cast that pointer. Malloc will barf, and your program will go west,
taking me with it." I don't have a clue about how to write something like
that, so the next best thing is to take the 80 percent trashwork of coding
away and leave me with the 20 percent that's fun. I needed a code generator
and my requirements were simple:
It had to be very simple to use, with a minimum amount of nonsense to
remember.
It had to run from the command line like any other program, with a minimum of
options.
It had to be extensible, handling different coding conventions and languages
if needed.
It had to write the outline of a program and then put me right into the editor
of my choice.
It had to be simple enough to put together in a few hours, because a good idea
now beats a great one on the drawing board any day.
The program I came up with is called "new." The source code for "new" is shown
in Listings One through Ten. Listing One (page 102) is new.h, which holds the
data structure for the command line options, plus general definitions. Listing
Two (page 102) is macro.h, the header file for macro substitutions. Listing
Three (page 102) shows new.c, which creates and optionally edits one or more
new C programs. Listing Four (page 102) is CreateCode.c, which accepts the
substitution structure, the input file, and the new C file to be created. It
also reads the template file, fills it in with the contents of the
substitution structure, and writes the results to the new C file.
Listing Five (page 103) lists EditCode.c, which edits the new C file using
either vi or whatever you have in your EDITOR environment variable. Listing
Six (page 103) is GetOptions.c, which accepts the command line arguments and a
pointer to a structure that holds the options. It also parses the arguments
into the structure. Listing Seven (page 104) shows Help.c, which prints help
information to stdout. Listing Eight (page 106) lists OutString.c, which
decides which macro string to return based on the current token character.
Listing Nine (page 108), ReplaceMacros.c, accepts the macro structure holding
the variables to be replaced, an input filepointer, and an output filepointer.
It also does token substitution from input to output. Finally, Listing Ten
(page 108) is SetMacros.c, which sets up the MACRO structure used for token
replacement.
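The editor choice described for EditCode.c amounts to a one-line fallback.
Here is a minimal sketch that assumes nothing about the actual listing: the
function name choose_editor is hypothetical, and the value of the environment
variable is passed in as an argument so the logic can be exercised without
touching the environment.

```c
#include <stddef.h>

/* Return the editor to launch: the value taken from the EDITOR
   environment variable if it is set and non-empty, otherwise vi. */
const char *choose_editor(const char *editor_env)
{
    if (editor_env != NULL && editor_env[0] != '\0')
        return editor_env;
    return "vi";
}
```

A caller would typically write choose_editor(getenv("EDITOR")), with getenv
coming from <stdlib.h>, and hand the result, together with the new file name,
to whatever launches the editor.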
The "new" program works very much like the Unix program "nroff." The command
nroff -mx file formats a file using a list of text formatting macros
specified by the -mx flag. In a similar fashion, new -mx file creates a
source code file using a template which is specified by the -mx flag. In both
cases, x is simply a file extension which identifies a given template file in
a given directory.
There is no reason for x to be restricted to a single character, although to
save typing I've chosen single-character names for my programming templates.
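To make the -mx convention concrete, here is a hedged sketch of how a flag
might be turned into a template file name. The directory, the base name, and
the function name template_path are all assumptions made for illustration; the
article does not say where "new" keeps its templates or what they are called.

```c
#include <stdio.h>
#include <string.h>

/* Build "<dir>/<base>.<x>" from an option of the form "-mx", where x may
   be one or more characters.  Returns 0 on success, or -1 if the option
   is malformed or the result does not fit in out. */
int template_path(const char *dir, const char *base, const char *opt,
                  char *out, size_t size)
{
    const char *ext;
    int n;

    if (strncmp(opt, "-m", 2) != 0 || opt[2] == '\0')
        return -1;
    ext = opt + 2;                       /* everything after -m */
    n = snprintf(out, size, "%s/%s.%s", dir, base, ext);
    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

With a hypothetical directory of /usr/local/lib/new and a base name of
template, the option -mf would select /usr/local/lib/new/template.f.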
I have six C programming templates which manage to cover most of my needs:
-mf writes a generic C function other than main( ). This is the default,
because I use it most often. Figure 1 shows the template itself, while Listing
Four shows an example of the expanded template put to use.
Figure 1: Generic function template

 #ifndef lint
 static char * $n_$e_rcsid =
 "$Header: $n.$e,v $r $y/$c/$d $h:$m:$s $w Exp $";

 static char * $n_$e_source =
 "$Source$";
 #endif

 /*
 * NAME:
 * $n
 *
 * SYNOPSIS:
 * $t $n ($l)
 *
 * DESCRIPTION:
 * $u
 *
 * ARGUMENTS:
 * Describe any function arguments.
 *
 * AUTHOR:
 * $x
 *
 * BUGS:
 * None noticed.
 *
 * REVISIONS:
 *
 * $Log$
 */

 $i
 $p
 $g

 $t $n ($l)
 $a{
 /*
 * Functions.
 */

 $f
 /*
 * Variables.
 */

 $v
 /*
 * Processing.
 */

 $b
 return (0);
 }


* -mm writes a generic main routine with the standard arguments argc and argv.
* -mh writes the comment section of a basic C include file.
* -md writes a short main routine which is nothing but a driver/testbed for a
function.
* -ms writes a short stub function which is meant to serve as a placeholder.
* -mi writes a simple I/O routine which reads from stdin (standard input) and
writes to stdout (standard output).
Due to space constraints, all of the templates and fully commented versions of
the source files are available electronically; see "Availability" on page 3.
The source code listings presented here are complete, except for comments.
In order to make this as flexible as possible, I used a simple
token-substitution scheme in each template. A template consists of straight
text plus reserved tokens of the form $a, $b, and so on. The dollar sign
specifies the (possible) beginning of a token, and the first character after
it tells the code generator what is to be inserted in place of the token. If a
dollar sign is followed by any character other than one of the ones recognized
by the code generator, then the dollar sign and the character are simply
passed along to the program file being generated.
You don't have to live with the dollar sign as the start of a token; the code
generator allows the token delimiter to be specified in one of the header
files. I use the dollar sign as the token delimiter throughout this article.
Each token has a specific meaning: $a will be replaced by the argument list to
a function, $d by the current day of the month, and so on. A complete list of
the recognized tokens and the substituted text for each one is provided in
Table 1. I did my best to keep them reasonably mnemonic.
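The substitution rule just described is small enough to sketch in a few lines.
The version below works on strings rather than file pointers and is not the
ReplaceMacros.c listing; struct tokens, expand_tokens, and TOKEN_DELIM are
names made up for this illustration, and the letter test assumes a character
set such as ASCII in which a through z are contiguous.

```c
#include <stddef.h>
#include <string.h>

#define TOKEN_DELIM '$'            /* the delimiter used in this article */

/* One replacement string per lowercase token letter; NULL means the
   letter is not a recognized token.  This layout is illustrative and is
   not the MACRO structure from the listings. */
struct tokens {
    const char *value[26];
};

/* Copy in to out, replacing each delimiter-plus-letter pair that names a
   recognized token.  Anything else, including a delimiter followed by an
   unrecognized character, is copied through unchanged.
   Returns 0 on success, -1 if out is too small. */
int expand_tokens(const char *in, const struct tokens *t,
                  char *out, size_t size)
{
    size_t used = 0;

    while (*in != '\0') {
        const char *piece = in;      /* text to append this round */
        size_t len = 1;

        if (in[0] == TOKEN_DELIM && in[1] >= 'a' && in[1] <= 'z'
            && t->value[in[1] - 'a'] != NULL) {
            piece = t->value[in[1] - 'a'];
            len = strlen(piece);
            in += 2;                 /* skip delimiter and token letter */
        } else {
            in += 1;                 /* ordinary character */
        }
        if (used + len >= size)      /* keep room for the terminator */
            return -1;
        memcpy(out + used, piece, len);
        used += len;
    }
    if (size == 0)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Because $H and $L are not recognized tokens, strings such as $Header: and
$Log$ pass through untouched, which is what lets RCS keywords share a template
with the generator's own tokens.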
Example 1(a) shows a template for a header file, 1(b) some sample values for
the reserved tokens, and 1(c) the resulting generated source code. The line
numbers in the example are for your convenience; they are not included in the
template or the generated code. Making the templates was easy; in each case, I
picked a coding standard I could live with, copied a C program that reflected
it into the template file, and replaced the guts of the program with tokens.
The main data structure in "new" consists of a C structure which has one entry
for each reserved token. Some of the entries in the structure are filled using
information from the command line, but most are not. You may notice that a lot
of the entries in the C token structure are not filled from within the code
generator at all. This is because someday when I have time (translation: When
I get an initiative attack I can't talk myself out of), I want to write a
screen-oriented version of the source code generator which would allow someone
to fill some or all of the entries in the token structure interactively.
Figure 2 shows pseudocode which lays out the structure of the code generator
as it currently exists.
Table 1: Token substitution list

 Token  Replacement string for the token

 a      Arguments for the function. Full declarations separated by
        semicolons and newlines.
 b      Body of the function. Comes under the first comment after the
        variable declarations. Can include tabs and newlines.
 c      Current month
 d      Current day
 e      File extension
 f      Functions declared internally
 g      Global variables separated by semicolons
 h      Current hour
 i      "Include" commands, #include <stdio.h>
 l      Function argument list. Variable names separated by commas
 m      Current minute
 n      Function name
 p      Preprocessor macros -- #define commands separated by newlines
 r      Revision level for a Revision Control System, the system I
        used in this case
 s      Current second
 t      Type of function, such as char * or int
 u      Function usage. This comes right after the first use of the
        name, and tells what the function is supposed to do. Not
        to be confused with the "Usage" section in the intro
        comments. This is the only "smart" macro: It won't let
        you put too many words on a line and starts each line with
        a comment indicator. The comment indicator is a space
        followed by an asterisk; "new" assumes that there is both a
        leading /* and a trailing */ on some other line.
 v      Internal variables separated by semicolons, tabs, and newlines
 w      Who wrote the function
 x      Full name of the function author, extracted from the comment
        field in the /etc/passwd file
 y      Current year
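The behavior described for the u token, wrapping words and starting each line
with a comment indicator, can be sketched as follows. This is one possible
approach, not the code from OutString.c: the names wrap_comment and put are
invented, and the line width is left to the caller because the article does
not say what limit "new" uses.

```c
#include <stddef.h>
#include <string.h>

/* Append n bytes of s to out at *used, keeping out terminated.
   Returns 0 on success, -1 if the bytes will not fit. */
static int put(char *out, size_t size, size_t *used, const char *s, size_t n)
{
    if (*used + n >= size)
        return -1;
    memcpy(out + *used, s, n);
    *used += n;
    out[*used] = '\0';
    return 0;
}

/* Break text into words and emit them as comment lines.  Each line
   starts with a space, an asterisk, and a space, and holds as many
   words as fit within width columns (a word longer than that gets a
   line to itself).  Returns 0 on success, -1 if out is too small. */
int wrap_comment(const char *text, size_t width, char *out, size_t size)
{
    static const char prefix[] = " * ";
    size_t used = 0, col = 0;

    if (size == 0)
        return -1;
    out[0] = '\0';
    for (;;) {
        size_t len;

        text += strspn(text, " \t\n");            /* skip blanks */
        len = strcspn(text, " \t\n");             /* length of next word */
        if (len == 0)
            break;
        if (col != 0 && col + 1 + len > width) {  /* word will not fit */
            if (put(out, size, &used, "\n", 1) != 0)
                return -1;
            col = 0;
        }
        if (col == 0) {                           /* start a new line */
            if (put(out, size, &used, prefix, sizeof prefix - 1) != 0)
                return -1;
            col = sizeof prefix - 1;
        } else {                                  /* separate from last word */
            if (put(out, size, &used, " ", 1) != 0)
                return -1;
            col += 1;
        }
        if (put(out, size, &used, text, len) != 0)
            return -1;
        col += len;
        text += len;
    }
    return 0;
}
```

The output has no leading /* or trailing */, matching the article's note that
"new" expects those on some other line of the template.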

Example 1: (a) Template file; (b) token replacement values; (c) generated
code.

 (a) 1 #ifndef lint
 2 static char * $n_$e_rcsid =
 3 "$Header: $n.$e,v $r $y/$c/$d $h:$m:$s $w Exp $";
 4
 5 static char * $n_$e_source =
 6 "$Sources$";
 7 #endif
 8
 9 /*
 10 * Header file for "$n".
 11 *
 12 * $Log$
 13 */
 14
 15 #include <stdio.h>

 (b) $c "09"
 $d "23"
 $e "h"
 $h "00"
 $m "25"
 $n "program"
 $r "1.1"
 $s "44"
 $w "vogel"
 $y "90"

 (c) 1 #ifndef lint
 2 static char * program_h_rcsid =
 3 "$Header: program.h,v 1.1 90/09/23 00:25:44 vogel Exp $";
 4
 5 static char * program_h_source =
 6 "$Sources$";
 7 #endif
 8
 9 /*
 10 * Header file for "program".
 11 *
 12 * $Log$
 13 */
 14
 15 #include <stdio.h>
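The date and time values in Example 1(b) are all two-digit fields, which is
what strftime produces for its %y, %m, %d, %H, %M, and %S conversions. The
sketch below is one way such strings might be filled in; it is not taken from
SetMacros.c, and the struct date_tokens type and fill_date_tokens function are
names invented here.

```c
#include <stddef.h>
#include <time.h>

/* Two-digit strings for the date and time tokens. */
struct date_tokens {
    char year[3];    /* $y */
    char month[3];   /* $c */
    char day[3];     /* $d */
    char hour[3];    /* $h */
    char minute[3];  /* $m */
    char second[3];  /* $s */
};

/* Fill dt from the broken-down time tm.
   Returns 0 on success, -1 if any field could not be formatted. */
int fill_date_tokens(const struct tm *tm, struct date_tokens *dt)
{
    if (tm == NULL || dt == NULL)
        return -1;
    if (strftime(dt->year,   sizeof dt->year,   "%y", tm) != 2 ||
        strftime(dt->month,  sizeof dt->month,  "%m", tm) != 2 ||
        strftime(dt->day,    sizeof dt->day,    "%d", tm) != 2 ||
        strftime(dt->hour,   sizeof dt->hour,   "%H", tm) != 2 ||
        strftime(dt->minute, sizeof dt->minute, "%M", tm) != 2 ||
        strftime(dt->second, sizeof dt->second, "%S", tm) != 2)
        return -1;
    return 0;
}
```

For the current time, a caller would obtain a time_t from time(NULL) and pass
the result of localtime to fill_date_tokens.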

Figure 2: Pseudocode that illustrates the structure of the code generator

 begin
     look for some sensible command line options and one or more files to
         create
     if (none found)
     then
         print help
         bail out
     endif
     for (each file on the command line)
     do
         set up the structure holding the tokens to be substituted
         filter text from the template file to the output program,
             doing appropriate token replacement
         edit the output program
     done
 end

I'm a big believer in separating the user interface part of a program from the
part that does the actual work. Keeping all of the information needed to
generate a C function within a single data structure makes a special-purpose
code generator much easier to write, because it doesn't matter where the
information stored in the token structure comes from: an input screen, another
program, a file, or any combination thereof. The code generator needs only a
filled token structure.
If I ever had to write a "mass-code" generator to read a bunch of input files
and generate a bunch of functions and programs, the token structure would make
it possible for me to write code that looks like Example 2.

Example 2: A flexible data structure leads to code that looks like this.

 while (getinfo (input file, pointer to the C token structure))
 do
 putfunction (output file, pointer to the C token structure)
 done

In this case, getinfo would read a file and fill the token structure, and
putfunction would read the stuff in the token structure and write the C code.
The input file format has no necessary connection to the output file format,
and this is a case of two totally dissimilar data structures which have to
peacefully coexist in the same program. Whenever you have a case like this,
either marry the data structures or divorce them, but don't let them live in
sin. A divorce seemed the cleanest solution; input and output functions only
communicate through the token structure, and otherwise know nothing about each
other.
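
A minimal ANSI C sketch of that divorce might look like the following. The struct ctoken type, the "type name" input record format, and the generate driver are assumptions made for illustration; only the names getinfo and putfunction come from Example 2, and the real MACRO structure in Listing Two carries many more fields. The sketch also assumes single-word return types, so a type such as char * would need a richer input format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, cut-down token structure: just enough fields to show
 * the idea. */
struct ctoken {
    char type[32];      /* return type, e.g. "int"   */
    char name[64];      /* function name, e.g. "foo" */
};

/* Input side: fill the token structure from one "type name" record
 * (an assumed format).  Returns 1 if a record was read, 0 at end of
 * input. */
int getinfo(FILE *in, struct ctoken *tok)
{
    assert(in != NULL && tok != NULL);
    return fscanf(in, "%31s %63s", tok->type, tok->name) == 2;
}

/* Output side: write a stub C function from the token structure.  It
 * knows nothing about how the structure was filled.  Returns 0 on
 * success, -1 on an empty name or a write error. */
int putfunction(FILE *out, const struct ctoken *tok)
{
    assert(out != NULL && tok != NULL);

    if (strlen(tok->name) == 0)
        return -1;
    if (fprintf(out, "%s %s (void)\n{\n    return 0;\n}\n\n",
                tok->type, tok->name) < 0)
        return -1;
    return 0;
}

/* The driver loop from Example 2.  Returns the number of functions
 * written. */
int generate(FILE *in, FILE *out)
{
    struct ctoken tok;
    int count = 0;

    while (getinfo(in, &tok)) {
        if (putfunction(out, &tok) != 0)
            break;
        ++count;
    }
    return count;
}
```

Swapping in a different getinfo (one that reads an input screen, say) would leave putfunction and the driver loop unchanged, which is the point of routing everything through the token structure.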


Why a Code Generator?


Whether you use this code generator, buy a commercial one, or gird up your
loins and roll your own doesn't matter. What does matter is getting one and
using it, and here's why:
Consistency: This is one reason I can't emphasize enough. With so many projects
involving multiple programmers and even multiple teams, anything that helps
communication between the worker bees isn't to be sneered at. Consistent code
appearance doesn't guarantee understandable code, but it doesn't hurt, and a
code generator makes enforcement of coding standards easier.
As I see it, the main problem with coding standards is that they don't take
into account the fact that coding usually boils down to some poor schlub
typing at a terminal. If you don't at least try to make that person's life
easier when imposing your standard, forget it, you'll get lip service at best
and outright rebellion at worst. In the area of software, a poor solution is
actually worse than none at all, because it lulls people into thinking that
the "problem" is being "solved." (Isn't this how we wound up with Ada?)
Beauty: There's something about a well-crafted piece of code that makes it very
easy on the eyes, but occasionally appearance suffers when deadlines loom.
Instead of sprucing up every function by hand, spruce up your templates and
let the code generator do the boring stuff.
Generality: The templates and code here are presented in C, but there is no
reason for it to be that way. One code generator can handle C, Pascal,
Modula-2, or whatever you have the patience to write a template for. I used
one-letter template names because I'm not a typing fan, not because it's a
requirement of the code generator.
Economy: One of the things I admire most about Unix is what they left out. Unix
handles files, processes, and nothing else. This type of code generator
encourages you to develop your code one function at a time, but Unix doesn't
"know" anything about functions. To teach Unix about functions, you can use a
tool such as "ctags" to make editing source files easier, but "ctags" must be
kept up to date every time you make a significant change to your source. The
code generator encourages the idea of "one function, one file," and if you
make a one-to-one correspondence between functions and files, then Unix knows
about functions too. This enables you to use the power of the Unix toolset
(and mindset) on each separate compilable portion of a C program, and for me
that equates to finer control over the whole process.
I certainly respect companies that care enough about their programmers to buy
CASE tools, but I've noticed that the current products have a bit of an
attitude: "Be reasonable, write your code, and design your products OUR way."
Instead of buying a CASE tool and changing your methods to suit the machine
(or another vendor), why not expand the capacities of the operating system you
have now?
Speed: Placing one function in one file puts a permanent end to the game of
"let's find the function," and in combination with a decent "make" tool for
generating executables from a list of dependencies, it also ensures that you
are doing the minimum amount of recompilation when you modify source during
development. Goodbye "ctags"!
Program Tracing: If I write any program which is going to receive widespread
use by a lot of different people, I like to impress the customer by being
aware of problems before I get the irate phone call, not after. One easy way
to do this is to build an ErrorMsg function which accepts the arguments
ErrorMsg (file, line, format, arg1, arg2, ...), where file is the name of the
source file (taken from the __FILE__ preprocessor macro; with one function per
file, it also names the function), line is the line number at which the
problem occurred (taken from the __LINE__ preprocessor macro), format is the
output format used by functions in the printf family,
and the remaining arguments are variables which get referenced from within the
output format. The ErrorMsg function appends an appropriate message to an
error file, and just before the main routine exits, another function,
AnyErrors, checks the error file to see if it contains anything. If AnyErrors
returns TRUE, a third function, MailMsg, mails the error file to me.
The important point here is that with one function per file, the __FILE__
preprocessor macro always tells me exactly what function is messing up, so
I don't have to look at a source listing somewhere and mutter "Let's see, line
600 is function foobar, I think, but what's at line 700?" I know immediately
what functions were called in what order, and if the ErrorMsg function was
called enough times with enough information, I may have everything I need to
fix the problem without having to use a debugger. I've always believed that if
you have to fire up a debugger to fix a problem, then you have a much more
basic problem with your code that a debugger can't fix.
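
The article does not list ErrorMsg or AnyErrors, so the following is only a sketch of how they might be written in ANSI C. The log file name errors.log is an assumption, as is the exact message layout; MailMsg is left out because it depends on the local mail setup.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical error-log location; the article does not name one. */
static const char *error_log = "errors.log";

/* Append "file, line N: message" to the error log.
 * Returns 0 on success, -1 if the log could not be written. */
int ErrorMsg(const char *file, int line, const char *format, ...)
{
    va_list ap;
    FILE *fp;

    assert(file != NULL && format != NULL);

    if ((fp = fopen(error_log, "a")) == NULL)
        return -1;

    fprintf(fp, "%s, line %d: ", file, line);
    va_start(ap, format);
    vfprintf(fp, format, ap);
    va_end(ap);
    fputc('\n', fp);

    return fclose(fp) == 0 ? 0 : -1;
}

/* Return 1 (TRUE) if the error log exists and is non-empty, else 0. */
int AnyErrors(void)
{
    FILE *fp = fopen(error_log, "r");
    int c;

    if (fp == NULL)
        return 0;
    c = getc(fp);
    fclose(fp);
    return c != EOF;
}
```

A call site would then read something like ErrorMsg (__FILE__, __LINE__, "cannot open %s", path), where path is whatever variable the calling function has on hand.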
Flexibility: Having one function per file makes writing all kinds of neat tools
much easier. For example, I have another program called "doc" which looks
through one or more directories, and prints the comment header from a given
function, in case I forget the order of the arguments or something. It won't
cure cancer but it does save time, and I didn't have to put in too many late
evenings to write it, either. doc reads a function name from the argument
list, appends .c to it if it's not there already, and uses the Unix access
call to see if a file by that name exists in one of several directories. If
such a file is found, the comment header is written to stdout.
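
The source for doc is not listed, so here is a rough ANSI C sketch of its two core steps under stated assumptions: find_source and print_header are hypothetical names, and the comment header is assumed to be the first comment that begins at the start of a line, as in files built from the templates shown earlier.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>     /* access(), F_OK */

/* Build "dir/name.c" in path (adding ".c" only if it is missing) and
 * report whether that file exists.  Returns 1 if found, 0 otherwise. */
int find_source(const char *dir, const char *name, char *path, size_t size)
{
    size_t len;
    int has_ext;
    int n;

    assert(dir != NULL && name != NULL && path != NULL && size > 0);

    len = strlen(name);
    has_ext = len >= 2 && strcmp(name + len - 2, ".c") == 0;

    n = snprintf(path, size, "%s/%s%s", dir, name, has_ext ? "" : ".c");
    if (n < 0 || (size_t) n >= size)
        return 0;               /* path would not fit in the buffer */
    return access(path, F_OK) == 0;
}

/* Copy the first comment block of the file to out: every line from the
 * first one that starts with a comment opener through the first one
 * that contains a comment closer.  Returns 0 on success, -1 if the
 * file cannot be opened. */
int print_header(const char *path, FILE *out)
{
    char line[512];
    int inside = 0;
    FILE *in = fopen(path, "r");

    if (in == NULL)
        return -1;

    while (fgets(line, sizeof line, in)) {
        if (!inside && strncmp(line, "/*", 2) == 0)
            inside = 1;
        if (inside) {
            fputs(line, out);
            if (strstr(line, "*/"))
                break;
        }
    }
    fclose(in);
    return 0;
}
```

A driver would call find_source once per directory in its search list and pass the first hit to print_header with stdout as the output stream.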


Conclusion


Of course, you can make just as big a mess with a code generator as without,
and you can do it faster; a fool with a tool is just a well-equipped fool.
Like any other tool, a code generator can save beaucoup time only if it's used
with some common sense. I tend to be wary of any CASE product that claims to
let you write code "without thinking about it"; anyone who writes code for
someone else's use without thinking is dangerous.

_A SOURCE CODE GENERATOR FOR C_
by Karl Vogel


[LISTING ONE]

/* File "new.h" */

#ifndef lint
static char *new_h_rcsid =
"$Header: new.h,v 1.5 91/03/29 19:23:08 vogel Exp $";

static char *new_h_source =
"$Source: /d/cdc/vogel/source/new/RCS/new.h,v $";

#endif

/*
 * NAME:
 * new.h
 *
 * SYNOPSIS:
 * #include "new.h"
 *
 * DESCRIPTION:
 * Holds the data structure for the command line options, plus
 * general definitions.
 *
 * AUTHOR:
 * Karl Vogel
 *
 * BUGS:
 * None noticed.
 *
 * REVISIONS:
 *
 * $Log: new.h,v $
 * Revision 1.5 91/03/29 19:23:08 vogel
 * 1. Renamed the directory holding the templates.
 *
 * Revision 1.4 91/03/15 16:56:33 vogel
 * 1. Ran indent on the code and reformatted the comment header.
 *
 * Revision 1.3 90/07/03 14:21:20 vogel
 * 1. Added a new field: extension of the program to be created. This is also
 * in "macro.h", but it can be most easily set from examining the command
 * line options.
 *
 * Revision 1.2 90/07/03 12:07:06 vogel
 * 1. Added a flag to the OPTIONS structure which is TRUE if the created file
 * is to be edited, FALSE otherwise.
 *
 * Revision 1.1 90/06/29 15:32:04 vogel
 * Initial revision
 *
 */

#include <stdio.h>


#define YES 1
#define NO 0

/* Error codes. */
#define OK 0
#define EMALLOC 1

/* Set directory holding templates, and prefix of each template file. */
#define DIRECTORY "/d/cdc/vogel/source/new/"
#define PREFIX "template."

/* Set up the options structure. */
struct options_data
{
 char **files; /* NULL-terminated array of C files
 * to be created. */
 char *template; /* Full name of the desired
 * template file. */
 char exten[5]; /* Extension of the program to be
 * created. */
 int edit; /* True if the created program is to
 * be edited. */
 int help; /* True if help is needed. */
};

typedef struct options_data OPTIONS;





[LISTING TWO]


/* File "macro.h" */

#ifndef lint
static char *macro_h_rcsid =
"$Header: macro.h,v 1.2 91/03/15 16:56:28 vogel Exp $";

static char *macro_h_source =
"$Source: /d/cdc/vogel/source/new/RCS/macro.h,v $";

#endif

/*
 * NAME:
 * macro.h
 *
 * SYNOPSIS:
 * #include "macro.h"
 *
 * DESCRIPTION:
 * Header file for macro substitutions.
 *
 * AUTHOR:
 * Karl Vogel
 *
 * BUGS:
 * Any problems or suggestions.
 *
 * REVISIONS:
 *
 * $Log: macro.h,v $
 * Revision 1.2 91/03/15 16:56:28 vogel
 * 1. Ran indent on the code and reformatted the comment header.
 *
 * Revision 1.1 90/06/29 17:40:15 vogel
 * Initial revision
 *
 */

#define BUFFER 512

struct macro_data
{
 char args[BUFFER]; /* Arguments for the function: full
 * declarations separated by
 * semicolons and newlines. Replaces
 * $a in a template. */
 char body[BUFFER]; /* Body of the function. Comes under
 * the first comment after the
 * variable declarations. Includes
 * tabs and newlines. Replaces $b in
 * a template. */
 char exten[5]; /* Extension of the created file,
 * usually ".c". Replaces $e in a
 * template. */
 char functions[BUFFER]; /* Internal function
 * declarations. Comes just
 * after the start of the
 * function itself. Includes
 * tabs and newlines.
 * Replaces $f in a template. */
 char globals[BUFFER];/* External global variables. Comes
 * just before the start of the
 * function itself. Includes tabs
 * and newlines. Replaces $g in a
 * template. */
 char include[BUFFER];/* External include files. Comes
 * just before the start of the
 * function itself. Includes tabs
 * and newlines. Replaces $i in a
 * template. */
 char alist[BUFFER]; /* Short form of the argument list,
 * consisting of just the variable
 * names separated by commas.
 * Replaces $l in a template. */
 char name[64]; /* Name of the function being
 * created. May or may not be the
 * same as the filename. Replaces $n
 * in a template. */
 char define[BUFFER]; /* Holds preprocessor #define
 * statements separated by newlines.
 * Replaces $p in a template. */
 char rev[16]; /* Current revision level of the
 * created function. Replaces $r in
 * a template. */
 char type[32]; /* Type of function created, i.e.
 * char * or int. Replaces $t in a
 * template. */
 char usage[BUFFER]; /* Comes right after the first use of
 * the function name, and tells you
 * what the function is supposed to
 * do. Just above the Usage section
 * in the intro comments. This is
 * the only "smart" macro: won't let
 * you put too many words on a line,
 * and starts each line with a
 * comment indicator. Replaces $u in
 * a template. */
 char vars[BUFFER]; /* Holds internally-declared
 * variables, separated by tabs,
 * semicolons, and newlines. Replaces
 * $v in a template. */
 char userid[16]; /* Holds the author's userid.
 * Replaces $w in a template. */
 char username[BUFFER]; /* Holds the author's full
 * name. Replaces $x in a
 * template. */
 char token; /* Holds the character to be used as
 * a token indicator. Shown as a
 * dollar sign in this file. */
 int year; /* Current year. Replaces $y in a
 * template. */
 int month; /* Current month. Replaces $c in a
 * template. */
 int day; /* Current day. Replaces $d in a
 * template. */
 int hour; /* Current hour. Replaces $h in a
 * template. */
 int minute; /* Current minute. Replaces $m in a
 * template. */
 int second; /* Current second. Replaces $s in a
 * template. */
};
typedef struct macro_data MACRO;





[LISTING THREE]

/* File "new.c" */

#ifndef lint
static char *new_c_rcsid =
"$Header: new.c,v 1.6 91/03/15 16:55:42 vogel Exp $";

static char *new_c_source =
"$Source: /d/cdc/vogel/source/new/RCS/new.c,v $";

#endif

/*
 * NAME:
 * new
 *
 * SYNOPSIS:
 * new [-c] [-h] [-mx] file [file ...]
 *
 * DESCRIPTION:
 * Creates and optionally edits one or more new C programs.
 *
 * "new" makes extensive use of templates to create new C files.
 * A template file has the name "template.x", where 'x' is the string
 * which follows '-m' in the program arguments.
 *
 * You can create other templates by adding template files to the
 * template directory specified in the header file "new.h".
 *
 * OPTIONS:
 * "-c" creates a new program but does NOT edit it.
 *
 * "-h" prints help information. Help is also printed if improper or no
 * arguments at all are given.
 *
 * "-mx" creates a program of type 'x', where x is one of the following:
 * d - function driver.
 * f - normal function (default).
 * h - header file.
 * i - program to handle simple I/O.
 * m - main routine with arguments.
 * o - function to handle command line arguments.
 * s - stub (placeholder) function.
 *
 * "file" is one or more C files to be created. A suffix of ".c" will
 * be appended if it isn't there already. If the file already
 * exists, you will be asked if you want to overwrite it.
 *
 * AUTHOR:
 * Karl Vogel
 *
 * BUGS:
 * None noticed.
 *
 * REVISIONS:
 *
 * $Log: new.c,v $
 * Revision 1.6 91/03/15 16:55:42 vogel
 * 1. Ran indent on the code and reformatted the comment header.
 *
 * Revision 1.5 90/09/24 16:42:03 vogel
 * 1. Corrected a typo in the comments.
 *
 * Revision 1.4 90/07/03 14:20:23 vogel
 * 1. Copied the desired file extension into the MACRO structure.
 *
 * Revision 1.3 90/07/03 12:05:23 vogel
 * 1. Added "nextfile" variable, to hold the next file to be created.
 * 2. Got rid of debugging print statements for the macros structure.
 * 3. Added code to create and optionally edit each file in turn.
 *
 * Revision 1.2 90/06/29 17:40:04 vogel
 * 1. Added header file and definitions for the macro substitution strings.
 * 2. Added code to print help if needed.
 *
 * Revision 1.1 90/06/29 15:27:52 vogel
 * Initial revision
 *
 */

#include "new.h"
#include "macro.h"
#include <strings.h>
#include <sys/param.h>

main (argc, argv)
int argc;
char **argv;
{

/* Variables. */
 MACRO *macros;
 MACRO macbuffer;

 OPTIONS *options;
 OPTIONS opbuffer;

 char nextfile[MAXPATHLEN];

 int k;
 int status;

/* Process command line options. If they aren't OK, offer help and exit. */
 options = &opbuffer;
 status = GetOptions (argc, argv, options);

 if (status == EMALLOC)
 {
 fprintf (stderr, "Severe memory error -- please call the ");
 fprintf (stderr, "System Administrator.\n");
 exit (1);
 }

 if (options->help)
 {
 (void) Help ();
 exit (1);
 }

/* Set up the substitution macros. */
 macros = &macbuffer;
 SetMacros (macros);
 (void) strcpy (macros->exten, options->exten);

/* Create and optionally edit each file in turn. */
 for (k = 0; options->files[k]; ++k)
 {
 (void) strcpy (nextfile, options->files[k]);

 if (CreateCode (macros, options->template, nextfile) == 0)
 if (options->edit)
 EditCode (nextfile);

 }

 exit (0);
}






[LISTING FOUR]

/* File "CreateCode.c" */

#ifndef lint
static char *CreateCode_c_rcsid =
"$Header: CreateCode.c,v 1.2 91/03/15 16:55:25 vogel Exp $";

static char *CreateCode_c_source =
"$Source: /d/cdc/vogel/source/new/RCS/CreateCode.c,v $";

#endif

/*
 * NAME:
 * CreateCode
 *
 * SYNOPSIS:
 * #include "macro.h"
 *
 * int CreateCode (macros, template, nextfile)
 * MACRO * macros;
 * char * template;
 * char * nextfile;
 *
 * DESCRIPTION:
 * Accepts the substitution structure, the input file, and the new C
 * file to be created. Reads the template file, fills it in with the
 * contents of the substitution structure, and writes the results to
 * the new C file.
 *
 * A function value of 0 is returned if everything is OK.
 * A function value of 1 is returned if the template could not be read.
 * A function value of 2 is returned if the output could not be written.
 *
 * ARGUMENTS:
 * "macros" is the structure holding the macro substitution values.
 * "template" is the name of the input template file.
 * "nextfile" is the name of the next output file.
 *
 * AUTHOR:
 * Karl Vogel
 *
 * BUGS:
 * None noticed.
 *
 * REVISIONS:
 *
 * $Log: CreateCode.c,v $
 * Revision 1.2 91/03/15 16:55:25 vogel
 * 1. Ran indent on the code and reformatted the comment header.
 *
 * Revision 1.1 90/07/03 11:59:45 vogel
 * Initial revision
 *
 */

#include "macro.h"
#include <stdio.h>
#include <strings.h>
#include <sys/file.h>

#define SMALLBUF 20

int CreateCode (macros, template, nextfile)
MACRO *macros;
char *template;
char *nextfile;
{

/* Variables. */

 FILE *in;
 FILE *out;

 char *base;
 char *dot;
 char last[SMALLBUF];
 char prompt[SMALLBUF];

 int length;
 int status;

/* If filename already has default extension specified in MACRO structure, zap
 * it so we can store the function name in the MACRO structure. */
 status = 0;
 in = (FILE *) NULL;
 out = (FILE *) NULL;

 last[0] = '.';
 last[1] = '\0';

 (void) strcat (last, macros->exten);
 length = strlen (last);

 if (dot = rindex (nextfile, '.'))
 if (strcmp (dot, last) == 0)
 *dot = '\0';

 if (base = rindex (nextfile, '/'))
 base++;
 else
 base = nextfile;

 (void) strcpy (macros->name, base);
 (void) strcat (nextfile, last);

/* If output file already exists, make sure that user wants to overwrite it. */

 if (access (nextfile, F_OK) == 0)
 {
 printf ("File \"%s\" exists. Overwrite? ", nextfile);
 fflush (stdout);
 fgets (prompt, SMALLBUF, stdin);

 if (prompt[0] != 'y' && prompt[0] != 'Y')
 {
 printf ("\"%s\" not overwritten.\n", nextfile);
 goto done;
 }
 }

/* Open the input template file. */
 if ((in = fopen (template, "r")) == (FILE *) NULL)
 {
 printf ("Unable to open template file \"%s\"\n", template);
 status = 1;
 goto done;
 }

/* Open the output C file. */
 if ((out = fopen (nextfile, "w")) == (FILE *) NULL)
 {
 printf ("Unable to open C file \"%s\"\n", nextfile);
 status = 2;
 goto done;
 }

/* Do the macro substitution, clean up, and return. */
 ReplaceMacros (in, macros, out);
done:
 if (in)
 (void) fclose (in);
 if (out)
 (void) fclose (out);
 return (status);
}




[LISTING FIVE]

/* File "EditCode.c" */

#ifndef lint
static char *EditCode_c_rcsid =
"$Header: EditCode.c,v 1.2 91/03/15 16:55:28 vogel Exp $";

static char *EditCode_c_source =
"$Source: /d/cdc/vogel/source/new/RCS/EditCode.c,v $";

#endif

/*
 * NAME:
 * EditCode
 *
 * SYNOPSIS:
 * int EditCode (nextfile)
 * char * nextfile;
 *
 * DESCRIPTION:
 * Edits the new C file using either "cvi" or whatever the user has in
 * his "EDITOR" environment variable.
 *
 * ARGUMENTS:
 * "nextfile" is the new C file.
 *
 * AUTHOR:
 * Karl Vogel
 *
 * BUGS:
 * None noticed.
 *
 * REVISIONS:
 *
 * $Log: EditCode.c,v $
 * Revision 1.2 91/03/15 16:55:28 vogel
 * 1. Ran indent on the code and reformatted the comment header.
 *
 * Revision 1.1 90/07/03 12:00:13 vogel
 * Initial revision
 */

#include <stdio.h>

int EditCode (nextfile)
char *nextfile;
{

/* Functions. */
 char *getenv ();

/* Variables. */
 char *editor;
 char cmd[BUFSIZ];

 int status;

/* Create and execute the edit command. */
 if (editor = getenv ("EDITOR"))
 (void) strcpy (cmd, editor);
 else
 (void) strcpy (cmd, "cvi");
 (void) strcat (cmd, " ");
 (void) strcat (cmd, nextfile);
 (void) system (cmd);

 return (0);
}





[LISTING SIX]


/* File "GetOptions.c" */

#ifndef lint
static char *GetOptions_c_rcsid =
"$Header: GetOptions.c,v 1.4 91/03/15 16:55:31 vogel Exp $";

static char *GetOptions_c_source =
"$Source: /d/cdc/vogel/source/new/RCS/GetOptions.c,v $";

#endif

/*
 * NAME:
 * GetOptions
 *
 * SYNOPSIS:
 * #include "new.h"
 *
 * int GetOptions (argc, argv, options)
 * int argc;
 * char ** argv;
 * OPTIONS * options;
 *
 * DESCRIPTION:
 * Accepts the command line arguments and a pointer to a structure which
 * holds the options, and parses the arguments into the structure.
 *
 * ARGUMENTS:
 * "argc" and "argv" are the normal C program variables holding the
 * argument list.
 *
 * "options" is the structure meant to hold the options.
 *
 * AUTHOR:
 * Karl Vogel
 *
 * BUGS:
 * Any problems or suggestions.
 *
 * REVISIONS:
 *
 * $Log: GetOptions.c,v $
 * Revision 1.4 91/03/15 16:55:31 vogel
 * 1. Ran indent on the code and reformatted the comment header.
 *
 * Revision 1.3 90/07/03 14:19:49 vogel
 * 1. Added code to set up the file extension based on the command line
 * options.
 * "new -mh" gives you a file with name ending in ".h", anything else gives
 * you a file with name ending in ".c".
 *
 * Revision 1.2 90/07/03 12:01:01 vogel
 * 1. Added new option '-c'.
 * 2. Added code to make sure that default template is properly set up.
 *
 * Revision 1.1 90/06/29 15:28:13 vogel
 * Initial revision
 *
 */


#include "new.h"
#include <strings.h>

#define SMALL 10

int GetOptions (argc, argv, options)
int argc;
char **argv;
OPTIONS *options;
{

/* Functions. */
 char *malloc ();

/* Variables. */
 char *next;
 char extension[SMALL + 1];

 int count;
 int flags;
 int k;
 int status;

 unsigned int length;

/* Initialize options to defaults. */
 options->files = (char **) NULL;
 options->template = (char *) NULL;
 options->edit = YES;
 options->help = NO;

 (void) strcpy (extension, "f");

 count = 0;
 status = OK;

/* Handle the flags, and count the files. */
 for (k = 1; k < argc; ++k)
 {
 next = argv[k];

 if (*next == '-')
 {
 next++;

 if (*next == 'm')
 {
 (void) strncpy (extension, next + 1, SMALL);
 extension[SMALL] = '\0';
 }

 else
 if (*next == 'c')
 options->edit = NO;

 else
 {
 options->help = YES;

 goto done;
 }
 }

 else /* option does not start with a dash */
 count++;
 }

/* If no files were specified, set help flag. Otherwise, store filenames. */
 if (count)
 {
 length = (unsigned) ((count + 1) * (sizeof (char *)));
 if (options->files = (char **) malloc (length))
 {
 for (k = 1, count = 0; k < argc; ++k)
 if (*argv[k] != '-')
 options->files[count++] = argv[k];
 options->files[count] = (char *) NULL;
 }
 else
 {
 status = EMALLOC;
 goto done;
 }
 }
 else
 {
 options->help = YES;
 goto done;
 }

/* Set up the file extension. */
 if (strcmp (extension, "h") == 0)
 options->exten[0] = 'h';
 else
 options->exten[0] = 'c';

 options->exten[1] = '\0';

/* Set up the template pathname. */
 length = (unsigned) (strlen (DIRECTORY) + strlen (PREFIX) +
 strlen (extension) + 1);
 if (options->template = malloc (length))
 (void) sprintf (options->template, "%s%s%s", DIRECTORY,
 PREFIX, extension);
 else
 status = EMALLOC;
done:
 return (status);
}





[LISTING SEVEN]

/* File "Help.c" */


#ifndef lint
static char *Help_c_rcsid =
"$Header: Help.c,v 1.4 91/03/15 16:55:33 vogel Exp $";

static char *Help_c_source =
"$Source: /d/cdc/vogel/source/new/RCS/Help.c,v $";

#endif

/*
 * NAME:
 * Help
 *
 * SYNOPSIS:
 * int Help ()
 *
 * DESCRIPTION:
 * Prints help information to stdout.
 *
 * ARGUMENTS:
 * None.
 *
 * AUTHOR:
 * Karl Vogel
 *
 * BUGS:
 * None noticed.
 *
 * REVISIONS:
 *
 * $Log: Help.c,v $
 * Revision 1.4 91/03/15 16:55:33 vogel
 * 1. Ran indent on the code and reformatted the comment header.
 *
 * Revision 1.3 90/09/24 16:40:47 vogel
 * 1. Corrected a typo.
 *
 * Revision 1.2 90/07/03 12:01:42 vogel
 * 1. Changed print strings to reflect new '-c' option.
 *
 * Revision 1.1 90/06/29 17:40:46 vogel
 * Initial revision
 *
 */

#include <stdio.h>

int Help ()
{

/* Variables. */
 int k;

 static char *array[] =
 {
 "",
 "\tnew:\t\tcreates and optionally edits one or more new C",
 "\t\t\tprograms.",
 "",

 "\t\tUsage:\tnew [-c] [-h] [-mx] file [file ...]",
 "",
 "\t\tWhere:\t\"-c\"\tcreates a new program but does NOT edit it.",
 "",
 "\t\t\t\"-h\"\tprints a help file. Help is also given if",
 "\t\t\t\tno arguments at all are given.",
 "",
 "\t\t\t\"-mx\"\tcreates a program of type 'x', where x is",
 "\t\t\t\tone of the following:",
 "\t",
 "\t\t\t\td - function driver.",
 "\t\t\t\tf - normal function (default).",
 "\t\t\t\th - header file.",
 "\t\t\t\ti - program to handle simple I/O.",
 "\t\t\t\tm - main routine with arguments.",
 "\t\t\t\to - function to handle command line arguments.",
 "\t\t\t\ts - stub (placeholder) function.",
 "",
 "\t\t\t\"file\"\tis one or more C files to be created. A",
 "\t\t\t\tsuffix of \".c\" will be appended if it isn't",
 "\t\t\t\tthere already.",
 "\t",
 "\t\t\t\tIf the file already exists, you will be asked",
 "\t\t\t\tif you want to overwrite it.",
 "",
 "",
 "\t\t\"new\" makes extensive use of templates to create new C files.",
 "\t\tA template file has the name \"template.x\", where 'x' is the",
 "\t\tstring which follows '-m' in the program arguments.",
 "\t",
 "\t\tYou can create other templates by adding template files to",
 "\t\tthe template directory specified in the header file \"new.h\".",
 (char *) NULL
 };

/* Write the help information. */
 for (k = 0; array[k]; ++k)
 puts (array[k]);
 return (0);
}





[LISTING EIGHT]

/* File "OutString.c" */

#ifndef lint
static char *OutString_c_rcsid =
"$Header: OutString.c,v 1.2 91/03/15 16:55:35 vogel Exp $";

static char *OutString_c_source =
"$Source: /d/cdc/vogel/source/new/RCS/OutString.c,v $";

#endif

/*
 * NAME:
 * OutString
 *
 * SYNOPSIS:
 * #include "new.h"
 * #include "macro.h"
 *
 * int OutString (macros, current, string)
 * MACRO * macros;
 * char current;
 * char * string;
 *
 * DESCRIPTION:
 * Decides which macro string to return based on the current token
 * character.
 *
 *
 * ARGUMENTS:
 * "macros" holds the current macro replacement values.
 *
 * "current" is the value of the token character following a dollar sign.
 *
 * "string" is the replacement value of that token.
 *
 * AUTHOR:
 * Karl Vogel
 *
 * BUGS:
 * None noticed.
 *
 * REVISIONS:
 *
 * $Log: OutString.c,v $
 * Revision 1.2 91/03/15 16:55:35 vogel
 * 1. Ran indent on the code and reformatted the comment header.
 *
 * Revision 1.1 90/07/03 12:02:28 vogel
 * Initial revision
 *
 */

#include "new.h"
#include "macro.h"
#include <stdio.h>
#include <ctype.h>
#include <strings.h>

int OutString (macros, current, string)
MACRO *macros;
char current;
char *string;
{

/* Variables. */
 char *blank;
 char *s;
 char *t;
 char temp[BUFFER];


 int col;
 int first;
 int length;
 int status;

/* Do the simple string replacements, and numeric conversion. */
 switch (current)
 {
 case 'a':
 (void) strcpy (string, macros->args);
 break;
 case 'b':
 (void) strcpy (string, macros->body);
 break;
 case 'c':
 (void) sprintf (string, "%2.2d", macros->month);
 break;
 case 'd':
 (void) sprintf (string, "%2.2d", macros->day);
 break;
 case 'e':
 (void) strcpy (string, macros->exten);
 break;
 case 'f':
 (void) strcpy (string, macros->functions);
 break;
 case 'g':
 (void) strcpy (string, macros->globals);
 break;
 case 'h':
 (void) sprintf (string, "%2.2d", macros->hour);
 break;
 case 'i':
 break;
 case 'l':
 (void) strcpy (string, macros->alist);
 break;
 case 'm':
 (void) sprintf (string, "%2.2d", macros->minute);
 break;
 case 'n':
 (void) strcpy (string, macros->name);
 break;
 case 'p':
 (void) strcpy (string, macros->define);
 break;
 case 'r':
 (void) strcpy (string, macros->rev);
 break;
 case 's':
 (void) sprintf (string, "%2.2d", macros->second);
 break;
 case 't':
 (void) strcpy (string, macros->type);
 break;
 case 'u':
 break;
 case 'v':
 (void) strcpy (string, macros->vars);
 break;
 case 'w':
 (void) strcpy (string, macros->userid);
 break;
 case 'x':
 (void) strcpy (string, macros->username);
 break;
 case 'y':
 (void) sprintf (string, "%2.2d", macros->year);
 break;
 default:
 *string = '\0';
 break;
 }

/* Handle "Usage" string. Write no more than 50 characters/line, indented
 * 3 tab spaces in. */
 if (current == 'u')
 {
 (void) strcpy (temp, macros->usage);
 t = temp;
 *string = '\0';
 first = YES;
 while (length = strlen (t))
 {
 if (!first)
 (void) strcat (string, "\n * \t\t\t");

 first = NO;
 if (length <= 50)
 {
 (void) strcat (string, t);
 break;
 }
 else
 {
 blank = (char *) NULL;
 col = 25;
 for (s = t; *s && col < 75; ++s, ++col)
 if (isspace (*s))
 blank = s;
 if (blank)
 {
 *blank = '\0';
 (void) strcat (string, t);
 t = blank + 1;
 }
 else
 {
 (void) strcat (string, t);
 break;
 }
 }
 }
 }

/* Handle the "include" strings. */
 if (current == 'i')
 {
 for (s = string, t = macros->include; *t; ++s, ++t)
 {
 if (*t == ' ')
 *s = '\t';
 else
 *s = *t;
 }
 *s = '\0';
 }
 return (0);
}





[LISTING NINE]

/* File "ReplaceMacros.c" */

#ifndef lint
static char *ReplaceMacros_c_rcsid =
"$Header: ReplaceMacros.c,v 1.2 91/03/15 16:55:37 vogel Exp $";

static char *ReplaceMacros_c_source =
"$Source: /d/cdc/vogel/source/new/RCS/ReplaceMacros.c,v $";

#endif

/*
 * NAME:
 * ReplaceMacros
 *
 * SYNOPSIS:
 * #include "macro.h"
 *
 * int ReplaceMacros (in, macros, out)
 * FILE * in;
 * MACRO * macros;
 * FILE * out;
 *
 * DESCRIPTION:
 * Accepts the macro structure holding the variables to be replaced,
 * an input filepointer, and an output filepointer. Does token
 * substitution from input to output.
 *
 * ARGUMENTS:
 * Described above.
 *
 * AUTHOR:
 * Karl Vogel
 *
 * BUGS:
 * None noticed.
 *
 * REVISIONS:
 *
 * $Log: ReplaceMacros.c,v $
 * Revision 1.2 91/03/15 16:55:37 vogel
 * 1. Ran indent on the code and reformatted the comment header.
 *
 * Revision 1.1 90/07/03 12:02:42 vogel
 * Initial revision
 *
 */

#include "macro.h"
#include <stdio.h>
#include <ctype.h>
#include <strings.h>

int ReplaceMacros (in, macros, out)
FILE *in;
MACRO *macros;
FILE *out;
{

/* Variables. */
 char *s;
 char string[BUFFER];
 int current; /* int, not char, so that EOF can be told apart from data */
 int previous;

/* Start the main loop which looks ahead one character. */
 previous = getc (in);
 if (previous == EOF)
 return (0);

/* Decide what to do if we get a token character. */
 while ((current = getc (in)) != EOF)
 {
 if (previous == macros->token)
 {
 if (index ("abcdefghilmnprstuvwxy", current))
 {
 OutString (macros, current, string);
 for (s = string; *s; ++s)
 putc (*s, out);
 previous = getc (in);
 }
 else
 {
 putc (previous, out);
 previous = current;
 }
 }
 else
 {
 putc (previous, out);
 previous = current;
 }
 }

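/* Don't forget to write the last character, unless the input ended right
 * after a replaced token and there is nothing left to write. */
 if (previous != EOF)
 putc (previous, out);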
 return (0);
}






[LISTING TEN]

/* File "SetMacros.c" */

#ifndef lint
static char *SetMacros_c_rcsid =
"$Header: SetMacros.c,v 1.4 91/03/15 16:55:40 vogel Exp $";

static char *SetMacros_c_source =
"$Source: /d/cdc/vogel/source/new/RCS/SetMacros.c,v $";

#endif

#include "macro.h"
#include <stdio.h>
#include <strings.h>
#include <pwd.h>
#include <sys/time.h>

int SetMacros (macros)
MACRO *macros;
{

/* Functions. */
    char *getenv ();
    struct passwd *getpwuid ();

/* Variables. */
    char *s;
    int offset;
    long clock;
    struct passwd *ptr;
    struct tm *now;

/* Set up the internal C stuff. The "name" entry will be set later. */
    macros->args[0] = '\0';
    macros->body[0] = '\0';
    macros->functions[0] = '\0';
    macros->globals[0] = '\0';
    macros->alist[0] = '\0';
    macros->name[0] = '\0';
    macros->define[0] = '\0';
    macros->usage[0] = '\0';
    macros->vars[0] = '\0';

    (void) strcpy (macros->exten, "c");
    (void) strcpy (macros->rev, "1.1");
    (void) strcpy (macros->type, "int");

    (void) strcpy (macros->include,
                   "#include <stdio.h>\n#include <ctype.h>");

/* Set up userid and full name of program author. See if we can get the
 * information from the environment. If that fails, look it up from userid
 * and passwd file. */

    macros->userid[0] = '\0';
    macros->username[0] = '\0';

    if (s = getenv ("USER"))
        (void) strcpy (macros->userid, s);
    if (s = getenv ("USERNAME"))
        (void) strcpy (macros->username, s);
    if (strlen (macros->userid) == 0 || strlen (macros->username) == 0)
    {
        if (ptr = getpwuid (getuid ()))
        {
            (void) strcpy (macros->userid, ptr->pw_name);
            if (strncmp (ptr->pw_gecos, "Civ ", 4) == 0)
                offset = 4;
            else
                offset = 0;
            (void) strcpy (macros->username,
                           ptr->pw_gecos + offset);
            if (s = index (macros->username, ';'))
                *s = '\0';
        }
        else
        {
            (void) strcpy (macros->userid, "unknown");
            (void) strcpy (macros->username, "unknown");
        }
    }

/* Set the character which we will recognize as the start of a token. */
    macros->token = '$';

/* Set up the current date and time. */
    (void) time (&clock);
    now = localtime (&clock);
    macros->second = now->tm_sec;
    macros->minute = now->tm_min;
    macros->hour = now->tm_hour;
    macros->day = now->tm_mday;
    macros->month = now->tm_mon + 1;
    macros->year = now->tm_year;

    return (0);
}



















August, 1991
A LISP-STYLE LIBRARY FOR C


Adding essential features of Lisp to C simplifies handling of complex data
objects




Daniel N. Ozick


Daniel is an independent consultant who designs software tools, builds
pattern-matching systems, and does software-engineering research. His book,
Structured Assembly Language Programming for the Z80, was published in 1985.
Contact him at 1 Jackie Court, Chester, NY 10918.


C has become one of the most popular and widely used programming languages,
and for good reasons: You can compile the same C program to run on a wide
variety of machines; C is a relatively small and easy-to-learn language; it
has a rich set of operators and a concise notation; and it lets you deal
directly with primitive machine objects, such as characters, numbers, and
addresses.
On the other hand, C provides only limited support for composite data objects,
having only fixed-size arrays and structures in its repertoire of built-in
composite types. The language lacks general mechanisms for:
Creating and operating on arrays or lists that vary in size and/or whose
elements may not all be of the same type
Automatically allocating and deallocating the memory used by dynamically
created objects
Reading and writing external textual representations of internal program
objects and symbols
Because of these deficiencies, if you want to write a C program that involves
complex data objects of varying sizes and types, you have to build your own ad
hoc mechanisms for manipulating those objects. You often end up "reinventing
the wheel" and obfuscating the real work of the program.
Lisp and similar problem-oriented languages excel where the machine-oriented C
is deficient. These languages have powerful features for dealing with symbolic
and dynamically created data structures at a high level. Having enjoyed and
made productive use of the expressive power of several dialects of Lisp, I set
out to save myself from the tar pits of C by creating a package of C macros,
functions, and data types--a Lisp-Style Library for C--that could imitate some
of Lisp's capabilities in a C environment. These capabilities include:
Variable-length lists containing elements of arbitrary type ("Lisp" is an
acronym for "LISt Processing").
Self-identified data objects
Symbols, symbolic data, and symbolic input/output
Systematic memory recycling
I've successfully used the Lisp-Style Library in my own work, most recently in
the construction of a general-purpose compiler (a kind of interpretive version
of Unix's lex and yacc tools). The availability of the Lisp-like language
features lets me put more effort into effective design and less into details
of implementation. At the same time, the extended capabilities do not require
a special language preprocessor or changes in syntax--standard C mechanisms
are used--nor do they impose a general sacrifice of C's space and time
efficiency.


Lists


Consider a program called process whose command line requires a filename
argument that may include wildcard characters. For example, process *.c is a
request to process every file in the current directory having the extension
".c." Inside process, you need to have a list of unambiguous filenames.
Assuming your operating system doesn't automatically handle wildcard expansion
on the command line, you might also like to be able to write something like
the code of Example 1, where source_file_names is a variable whose value is a
list (of any length) of filename strings; process_file is a function whose
argument is a filename string; and for_each is a function that causes
process_file to be applied sequentially to each element of the
source_file_names list.
Example 1: A simple assignment statement and function call create and process
an arbitrarily long list of strings.

 source_file_names = expand_wildcards (argv [i]);
 for_each (process_file, source_file_names);

You can achieve the ability to manipulate variable-length lists in this
simple, direct way by constructing them as singly linked chains of pointer
pairs--what Lisp calls cons cells. In the Lisp-Style Library, a Pair is a
two-component struct, as shown in Figure 1. The component names car and
cdr (pronounced "car" and "could-er") long ago lost their original meanings,
but have been retained by the Lisp-using community to refer to the first and
second halves of pairs, respectively. Object is a C pointer to a
self-identified data object.
Figure 1: The Lisp-Style Library defines Pair, a two-component struct that is
the basis for all linked lists. Object is a C pointer to a self-identified
data object.

 /* Pair -- a Lisp 'cons' cell for creating linked lists */
 typedef struct
 {
     Object car; /* any Object */
     Object cdr; /* PAIR Object or NULL (usually) */
 } Pair;

To make the discussion of list building easier, let's use Lisp's standard
notation for lists. Elements enclosed by parentheses and separated by white
space represent a list. For example, ("prog-1.c", "prog-2.c", "prog-3.c") is a
list of three items, each of which is a string. This list could be the value
of source_file_names. An empty pair of parentheses () represents a sequence
containing no elements--the empty list.
Let's also use a common schematic representation for Lisp data called
box-and-pointer notation. In this representation, each Object is shown as a
pointer to a box, with the box containing a representation of the object's
data. The box for a pair is actually a double box, containing a car pointer in
its left half and a cdr pointer in its right half.
Figure 2 shows the box-and-pointer notation for the sample source_file_names
list of three strings. Notice that the chain of pairs is terminated by having
the last pair contain a cdr whose value is NULL, indicated by the slashed box.
In addition to its function as a list terminator, NULL (called nil in Lisp)
can also be considered equivalent to the empty list: NULL and () represent the
same entity.


List Constructors


Functions for building lists are called constructors, the most fundamental of
which, called first_put, adds an Object to the front of a list by allocating a
new pair and setting its car to point to the new Object and its cdr to point
to the rest of the list. Lisp calls this function cons (for "construct").
In Example 2, three nested calls to first_put build the source_file_names
list, back to front, starting with the list terminator, NULL. The calls to
make_string are required to convert standard C strings into self-identified
Objects of type STRING.

Example 2: Three nested calls to first_put build a three-element list, back to
front, starting with the list terminator NULL.

 source_file_names =
     first_put (make_string ("prog-1.c"),
         first_put (make_string ("prog-2.c"),
             first_put (make_string ("prog-3.c"), NULL)));
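To make the mechanics of list construction concrete, here is a minimal sketch of what a constructor like first_put might do. It is not the code of Listing Three: Object is reduced to a plain void pointer with no type tag, the cdr is typed as a Pair pointer rather than as an Object, and allocation failure simply aborts through assert. The length function is likewise a simplified stand-in for the library function of the same name.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins (assumptions, not the library's definitions):
   an Object is just a void pointer, and a list is a chain of Pairs
   that ends with a NULL cdr. */
typedef void *Object;

typedef struct Pair {
    Object car;          /* the element stored in this cell */
    struct Pair *cdr;    /* the rest of the list, or NULL */
} Pair;

/* first_put -- allocate a new cell whose car is 'item' and whose cdr
   is the existing list; the new cell becomes the front of the list */
Pair *first_put(Object item, Pair *list)
{
    Pair *cell = malloc(sizeof *cell);
    assert(cell != NULL);    /* sketch only: abort if out of memory */
    cell->car = item;
    cell->cdr = list;
    return cell;
}

/* length -- count the cells in a list */
int length(Pair *list)
{
    int n = 0;
    for (; list != NULL; list = list->cdr)
        n++;
    return n;
}
```

Because each call only allocates one cell and stores two pointers, adding to the front of a list takes the same time no matter how long the list already is.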

Alternatively, the code in Example 3 accomplishes the same piece of list
building with a single invocation of list. This function uses ANSI C's
mechanism for handling a variable number of arguments, with a NULL argument
terminating the argument list. In true Lisp systems, functions can determine
the number of arguments supplied to them at runtime without resorting to a
special terminator argument.
Example 3: A single invocation of the list function builds a three-element
list. NULL terminates the function's variable-length argument list.

 source_file_names = list (make_string ("prog-1.c"),
                           make_string ("prog-2.c"),
                           make_string ("prog-3.c"),
                           NULL);

If you build a list one item at a time with successive, nonnested applications
of first_put, you end up with the first item added as the last item on the
list. To avoid this reversal, you can use last_put, which adds an item to the
end of a list. For example, the code fragment in Example 4, which might be
part of expand_wildcards, constructs a filename list in the same order as
filenames are retrieved by get_first_name and get_next_name. Note that the
filename list, names, starts out empty as a result of the first assignment
statement, names = NULL;.
Example 4: This code fragment, which might be part of expand_wildcards,
constructs the filename list names in order (front to back) using last_put.
The list starts out empty as a result of the first assignment statement.

 names = NULL;
 name = get_first_name (filespec);
 while (name != NULL)
 {
     names = last_put (name, names);
     name = get_next_name (filespec);
 }
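As an illustration of how adding at the end differs from adding at the front, the following sketch shows one plausible way to write a function like last_put. It assumes the same simplified representation as before (Object as a void pointer, a Pair-typed cdr) and is not taken from Listing Three.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *Object;            /* simplified: no type tag */

typedef struct Pair {
    Object car;
    struct Pair *cdr;
} Pair;

/* last_put -- add 'item' at the end of 'list' and return the list.
   The caller must use the return value, because an empty list (NULL)
   has no cell to modify. */
Pair *last_put(Object item, Pair *list)
{
    Pair *cell = malloc(sizeof *cell);
    Pair *p;

    assert(cell != NULL);        /* sketch only: abort if out of memory */
    cell->car = item;
    cell->cdr = NULL;            /* the new cell terminates the list */

    if (list == NULL)
        return cell;             /* the new cell is the whole list */

    for (p = list; p->cdr != NULL; p = p->cdr)
        ;                        /* walk to the current last cell */
    p->cdr = cell;
    return list;
}
```

Written this way, every call walks the entire list, so building a list of n items takes time proportional to n squared. For long lists it may be cheaper to build with first_put and reverse once at the end.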

If you do need to reverse an already constructed list, use reverse. This
function returns a new list (comprising newly allocated pairs distinct from
those making up the old list) whose elements are listed in reverse order of
those in the old list.
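A non-destructive reverse falls out of first_put almost for free, because pushing elements onto the front of a new list naturally reverses their order. The sketch below assumes the simplified Pair representation (Object as a void pointer); the library's own reverse may be written differently.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *Object;            /* simplified: no type tag */

typedef struct Pair {
    Object car;
    struct Pair *cdr;
} Pair;

/* first_put -- add 'item' to the front of 'list' */
Pair *first_put(Object item, Pair *list)
{
    Pair *cell = malloc(sizeof *cell);
    assert(cell != NULL);        /* sketch only: abort if out of memory */
    cell->car = item;
    cell->cdr = list;
    return cell;
}

/* reverse -- return a new list with the elements of 'list' in reverse
   order; the cells of the old list are left untouched, but the
   elements themselves are shared between the two lists */
Pair *reverse(Pair *list)
{
    Pair *result = NULL;
    for (; list != NULL; list = list->cdr)
        result = first_put(list->car, result);
    return result;
}
```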
The last constructor function is append, which splices together two lists. (In
true Lisp, append can take any number of lists as arguments.) Example 5
displays a fragment of code that uses append to construct a list of
unambiguous source_file_names from a list of command-line arguments, each of
which gets its wildcards expanded.
Example 5: This code fragment uses append to construct a list of unambiguous
source_file_names from a list of command-line arguments, each of which gets
its wildcards expanded.

 /* collect remainder of args (wildcards expanded) as 'source_file_names' */
 source_file_names = NULL;
 for (i = i; i < argc; i++)
     source_file_names = append (source_file_names,
                                 expand_wildcards (argv [i]));



List Selectors


Functions for taking lists apart or getting access to their elements are
called "selectors." The definition of the Lisp-Style Library function
for_each (Figure 3) illustrates the two fundamental selectors, first and
but_first.
Figure 3: The Lisp-Style Library mapping function for_each demonstrates the
classic method for walking down a list by successive applications of but_first
while fetching each element for processing with first. (Function_1 is a
pointer to a function of one Object argument, returning an Object result.)

 /* for_each -- apply a function 'f' to each element of a list */
 void for_each (Function_1 f, Object list)
 {
     while (list != NULL)
     {
         (*f) (first (list));
         list = but_first (list);
     }
 }

first (called car in Lisp) returns the first element of its argument list, and
but_first (called cdr in Lisp) returns the list that begins with the second
element of its argument list--that is, everything but the first element. The
while loop in for_each implements the general scheme for traversing a list by
successive applications of but_first. You can, by the way, determine the
number of elements in a list before attempting to process them all by using
the length function.
nth provides nonsequential access to the elements of a list, as if it were an
array. Indexing begins at zero, so nth (list, 0) is equivalent to first
(list).
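The selectors are simple enough to sketch in a few lines. The versions below assume the simplified Pair representation (Object as a void pointer, a Pair-typed cdr) and use assert to reject an empty list or an out-of-range index; how the library itself handles those cases is not shown here.

```c
#include <assert.h>
#include <stddef.h>

typedef void *Object;            /* simplified: no type tag */

typedef struct Pair {
    Object car;
    struct Pair *cdr;
} Pair;

/* first -- the first element of a non-empty list */
Object first(Pair *list)
{
    assert(list != NULL);
    return list->car;
}

/* but_first -- the list that starts at the second element */
Pair *but_first(Pair *list)
{
    assert(list != NULL);
    return list->cdr;
}

/* nth -- element number n, counting from zero; follows n cdr links */
Object nth(Pair *list, int n)
{
    assert(n >= 0);
    while (n-- > 0)
        list = but_first(list);
    return first(list);
}
```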
Obviously, this use of lists is inefficient, since nth must traverse n pointer
links before arriving at the desired element. For more efficient nonsequential
access to sequential data, the Lisp-Style Library provides an Object of type
VECTOR. list_to_vector (list) returns a dynamically allocated VECTOR big
enough to hold all the elements of list; the macro invocation vector (vec) [n]
returns the nth element of vec; and the macro invocation vector_length (vec)
returns the number of elements in vec.
Remember, however, that unlike VECTOR objects (or C arrays for that matter),
lists can grow and shrink; they do not require an index value for access; and
elements can be inserted or deleted anywhere. With a few simple, indexless
access procedures, lists can serve as sets, stacks, queues, and associative
arrays (tables). The Lisp-Style Library functions is_member, assoc, and index
(Listing Three, page 113), and the macros push and pop (Listing One, page 112)
constitute a start in this direction.

Of course you pay for all this flexibility: PAIRs occupy memory space and must
be recycled when they become obsolete; access to list elements requires an
additional level of indirection (when compared to an array of fixed-size data
objects); and the time to access a list element is proportional to its
position in the list.


Mapping Functions


Mapping functions transform one list into another, just as programs in a Unix
pipeline transform one file into another. They promote a "signal-processing"
view of programs that can be a powerful organizational method.
Suppose, for example, that the function integers generates the sequential list
of integers in the range specified by its input arguments, that square returns
the square of its input, and that sum returns the sum of the integers in its
input list. Then sum (integers (1, 10)) returns the sum of the integers from 1
through 10; map (square, integers (1, 10)) returns a list of the squares of
the integers from 1 through 10; and, as shown in the signal-flow diagram of
Figure 4, sum (map (square, integers (1, 10))) returns the sum of the squares
of the integers from 1 through 10. (You can run these examples, using Lisp's
comma-free prefix syntax, in the Tiny Lisp Interpreter demonstration program
of Listing Seven, page 123. See the comments in that listing for detailed
instructions.)
In other words, map applies a function to each element of an input list, and
collects the results of that application in a new output list of the same
length, thus transforming the list. map_no_nils works like map, except that it
discards NULL results, possibly resulting in an output list that is shorter
than the function's input list. This behavior makes map_no_nils useful for
filtering a list (by removing some elements) in addition to transforming it.
The list-related functions are implemented in Listing Three.
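The sum-of-squares pipeline can be sketched in plain C as follows. This is an illustration under stated assumptions rather than the code of Listing Three: Object is a bare void pointer, integers live in heap cells managed by the hypothetical helpers box and unbox, map_impl and keep_even are names invented for this sketch, and nothing is ever freed.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *Object;                    /* simplified: no type tag */
typedef Object (*Function_1) (Object);   /* one Object in, one Object out */

typedef struct Pair {
    Object car;
    struct Pair *cdr;
} Pair;

static Pair *first_put(Object item, Pair *list)
{
    Pair *cell = malloc(sizeof *cell);
    assert(cell != NULL);                /* sketch only */
    cell->car = item;
    cell->cdr = list;
    return cell;
}

/* box / unbox -- hypothetical helpers that keep an int on the heap */
static Object box(int n)
{
    int *p = malloc(sizeof *p);
    assert(p != NULL);
    *p = n;
    return p;
}

static int unbox(Object obj) { return *(int *) obj; }

/* integers -- the list (lo lo+1 ... hi), built back to front */
Pair *integers(int lo, int hi)
{
    Pair *list = NULL;
    for (; hi >= lo; hi--)
        list = first_put(box(hi), list);
    return list;
}

/* map_impl -- apply f to each element, collecting results in order;
   NULL results are kept only when keep_nulls is nonzero */
static Pair *map_impl(Function_1 f, Pair *list, int keep_nulls)
{
    Pair *head = NULL;
    Pair **tail = &head;                 /* where the next cell is linked */
    for (; list != NULL; list = list->cdr) {
        Object result = f(list->car);
        if (result != NULL || keep_nulls) {
            *tail = first_put(result, NULL);
            tail = &(*tail)->cdr;
        }
    }
    return head;
}

Pair *map(Function_1 f, Pair *list)         { return map_impl(f, list, 1); }
Pair *map_no_nils(Function_1 f, Pair *list) { return map_impl(f, list, 0); }

/* sum -- add up a list of boxed integers */
int sum(Pair *list)
{
    int total = 0;
    for (; list != NULL; list = list->cdr)
        total += unbox(list->car);
    return total;
}

Object square(Object obj)    { int n = unbox(obj); return box(n * n); }
Object keep_even(Object obj) { return unbox(obj) % 2 == 0 ? obj : NULL; }
```

With these definitions, sum (map (square, integers (1, 10))) evaluates to 385, and map_no_nils (keep_even, integers (1, 10)) yields a five-element list of the even numbers, which shows the filtering use described above.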


Objects


Up to this point, we've been assuming that the elements of a list are all of
the same type. But suppose we want to create a list to represent the
expression (A * (B + 3)).
If we treat the written form of that expression directly as standard list
notation, then we have the list whose three elements are A, *, and a nested
list also containing three elements (B, +, and 3). Allowing the third element
of the list to be itself a list results in the box-and-pointer notation of
Figure 5. The nested lists can also be viewed as the tree of Figure 6.
For a C program to interpret correctly the data structure diagrammed in Figure
5 and Figure 6, list-traversal procedures must be able to determine what each
pointer points to. Is this a pointer to a pair or to a primitive object? If
several types of primitive objects may be encountered--and we've already seen
STRINGs and VECTORs--what type of primitive object is this a pointer to?
The need to determine the type of data objects at runtime is the reason for
the creation of the Object type, which is defined as a pointer to a
self-identified data object. To avoid the awkwardness of excessive precision,
I will also informally refer to the object itself as an Object.
Objects have two components: type and value. The type of an Object serves as
an identifying "tag" for the value that follows. The type tag is a small
integer (or C enum constant) with the possible values UNDEFINED, SYMBOL,
STRING, INTEGER, FUNCTION, PAIR, VECTOR, or TOKEN (see Listing One). Following
the tag is the value part of the Object, whose contents depend on the Object's
type. Figure 7 shows, for each type of Object, the make_ function that creates
the Object by dynamic allocation from the C heap, the macros that access the
components of the Object, and the Object's memory layout.
To see how the make_ functions and access macros of Figure 7 are used, let's
look at two short examples. The statement s = make_string ("string 1");
results in the creation of a new dynamically allocated Object of type STRING.
Consequently, the macro call type (s) returns STRING, while the macro call
string (s) returns a (char *) pointer that can be used in regular C
expressions such as strcmp (string (s), "string 2"). Similarly, if i =
make_integer (99); then type (i) returns INTEGER and integer (i) returns an
int that can be used in regular C expressions such as if (integer (i) < 0).
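One way to picture a self-identified object is as a struct that starts with a type tag and continues with a union of possible values. The sketch below follows that idea for two of the types only. The layout, the Object_Type and Object_Data names, the make_object helper, and the decision to copy the string are all assumptions made for illustration; the real definitions are in Listing One and Figure 7.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A subset of the type tags; UNDEFINED is zero, as in the article */
typedef enum { UNDEFINED, STRING, INTEGER } Object_Type;

typedef struct {
    Object_Type type;            /* identifying tag */
    union {
        char *string;
        int integer;
    } value;                     /* contents depend on the tag */
} Object_Data;

typedef Object_Data *Object;     /* an Object is a pointer to tagged data */

/* access macros in the style described in the text */
#define type(obj)    ((obj)->type)
#define string(obj)  ((obj)->value.string)
#define integer(obj) ((obj)->value.integer)

/* make_object -- hypothetical helper: allocate and tag a new Object */
static Object make_object(Object_Type t)
{
    Object obj = malloc(sizeof *obj);
    assert(obj != NULL);         /* sketch only: abort if out of memory */
    obj->type = t;
    return obj;
}

/* make_string -- new STRING Object holding a private copy of 's' */
Object make_string(const char *s)
{
    Object obj = make_object(STRING);
    obj->value.string = malloc(strlen(s) + 1);
    assert(obj->value.string != NULL);
    strcpy(obj->value.string, s);
    return obj;
}

/* make_integer -- new INTEGER Object holding 'n' */
Object make_integer(int n)
{
    Object obj = make_object(INTEGER);
    obj->value.integer = n;
    return obj;
}
```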


Polymorphic Functions


Because the type of an Object can be determined at runtime, you can write
polymorphic functions -- that is, functions that can be applied to different
types of input and whose precise action is determined by the type of that
input. For example, the Lisp-Style Library's pp_object ("prettyprint object"),
Example 6, fetches the type of obj using the type access macro and dispatches
according to the fetched value.
Example 6: This code fragment from pp_object ("prettyprint object")
demonstrates dispatching according to the type of an Object, determined at
runtime.

 switch (type (obj))
 {
     case SYMBOL:
         write_symbol (obj);
         break;
     case STRING:
         write_string (obj);
         break;
     ...
 }

Sometimes it's simpler to use a type predicate (a macro that returns TRUE if
its argument is of a particular type and FALSE otherwise) to determine the
type of an Object. Table 1 lists the available predicates, and the eval
("evaluate") function of Listing Seven provides several examples of their use.
Incidentally, as in true Lisp, a NULL object or any non-NULL object that is
not a list is considered to be primitive and called an atom.
Table 1: Type predicates, implemented as macros, in the Lisp-Style Library for
C (Listing One). A type predicate returns TRUE if its argument is of the type
specified in its name and FALSE otherwise. As in true Lisp, an atom is the
NULL object or any non-NULL object that is not a list.

 is_null
 is_symbol
 is_pair
 is_atom
 is_list
 is_vector
 is_string
 is_integer
 is_function
 is_token
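Predicates like these can be written as one-line macros over the type tag. The following sketch shows a few of them; it is a guess at their shape, not a copy of Listing One. In particular, has_type and describe are hypothetical helpers, the value part of the Object is omitted, and is_list is assumed to accept both NULL and PAIR objects, which makes NULL count as both an atom and a list.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { UNDEFINED, SYMBOL, STRING, INTEGER,
               FUNCTION, PAIR, VECTOR, TOKEN } Object_Type;

typedef struct {
    Object_Type type;            /* value part omitted in this sketch */
} Object_Data;

typedef Object_Data *Object;

/* Each macro may evaluate its argument more than once, so arguments
   should be free of side effects. */
#define is_null(obj)     ((obj) == NULL)
#define has_type(obj, t) (!is_null (obj) && (obj)->type == (t))
#define is_symbol(obj)   has_type (obj, SYMBOL)
#define is_string(obj)   has_type (obj, STRING)
#define is_integer(obj)  has_type (obj, INTEGER)
#define is_pair(obj)     has_type (obj, PAIR)
#define is_list(obj)     (is_null (obj) || is_pair (obj))
#define is_atom(obj)     (!is_pair (obj))

/* describe -- classify an Object with predicates instead of a switch */
const char *describe(Object obj)
{
    if (is_null(obj))
        return "empty list";
    assert(obj->type != UNDEFINED);   /* a zero tag is never legal */
    if (is_pair(obj))
        return "list";
    return "atom";
}
```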



Symbols



So far I've described the use of PAIRs as a general method for creating lists
and trees and the advantages of using data objects that carry their own type
tags. Now let's look at the Lisp-Style Library's adaptations of two of the
symbol-manipulation features of Lisp that make that language so powerfully
expressive: SYMBOL objects and symbolic input/output of data.
An Object of type SYMBOL has two essential properties: a unique print_name
string and a unique address. In addition, a SYMBOL can be associated with a
value (of type Object).
The read_object function (Listing Five, page 119) efficiently converts
character strings -- the external textual representations of SYMBOLs -- into
unique internal SYMBOL objects, while write_object (Listing Three) converts
internal SYMBOL objects into their external textual representations as
strings.
The properties of symbols make them useful as identifiers of all sorts.
Anywhere you might use a #define or enum constant, you can probably use a
symbol and gain the advantage of symbolic textual input and output. This is
particularly handy for testing and debugging. Listing Two (page 112) contains
the declarations of symbols used inside the Lisp-Style Library itself, mainly
to represent the character and token types recognized by the Lisp input reader
(see Listing Five and "The Reader" section that follows).
I have used symbols as reserved words and identifiers in several "little
languages" including one for defining menus, one for defining rules in an
expert-system knowledge base, one for defining text-formatter output, and even
one for defining the grammar of another little language. In fact, if you're
willing to use standard list notation as the syntax of your little language,
you can create a program to read an input file in the language and convert the
information into your program's internal data structures with very little
effort (as befits a little language). The Tiny Lisp Interpreter
(read-evaluate-print loop) of Listing Seven is yet another example of the
utility of symbolic I/O.
Listing Four (page 118) shows the simple hashed symbol-table scheme that
allows read_object to efficiently look up print_name strings. The symbol table
is an array of hash buckets, where each bucket is a list (implemented, of
course, with the list constructors I described earlier). lookup finds a symbol
in the table by first calculating its hash index (with hash) and then walking
down the corresponding hash-bucket list (with but_first) until it finds a
matching string. install adds a new symbol by inserting it (with first_put)
at the front of the calculated hash-bucket list. intern ("internalize") always
returns a unique SYMBOL object, either one found with lookup or one just
created and added to the symbol table by install.
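The lookup, install, and intern cooperation can be sketched as follows. This is a simplified illustration, not Listing Four: the Symbol struct chains bucket entries directly through a next pointer instead of using PAIR lists, and the hash function is an arbitrary choice made for the example. Only the table size, HASH_TABLE_SZ, is taken from Listing One.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HASH_TABLE_SZ 211        /* prime, as in Listing One */

typedef struct Symbol {
    char *print_name;            /* unique name of this symbol */
    struct Symbol *next;         /* next symbol in the same bucket */
} Symbol;

static Symbol *hash_table[HASH_TABLE_SZ];   /* all buckets start empty */

/* hash -- map a name to a bucket index (assumed formula) */
static unsigned hash(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = h * 31 + (unsigned char) *s++;
    return h % HASH_TABLE_SZ;
}

/* lookup -- return the symbol named 'name', or NULL if there is none */
Symbol *lookup(const char *name)
{
    Symbol *sym;
    for (sym = hash_table[hash(name)]; sym != NULL; sym = sym->next)
        if (strcmp(sym->print_name, name) == 0)
            return sym;
    return NULL;
}

/* install -- create a symbol and put it at the front of its bucket */
Symbol *install(const char *name)
{
    unsigned h = hash(name);
    Symbol *sym = malloc(sizeof *sym);
    assert(sym != NULL);         /* sketch only: abort if out of memory */
    sym->print_name = malloc(strlen(name) + 1);
    assert(sym->print_name != NULL);
    strcpy(sym->print_name, name);
    sym->next = hash_table[h];
    hash_table[h] = sym;
    return sym;
}

/* intern -- always return the one symbol that has this name */
Symbol *intern(const char *name)
{
    Symbol *sym = lookup(name);
    return sym != NULL ? sym : install(name);
}
```

Because intern never creates a second symbol with the same name, two symbols can be compared for equality by comparing their addresses, with no string comparison needed.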


The Reader


Like the reader in many Lisp systems, the reader (or "lexer") for the
Lisp-Style Library uses a general-purpose mechanism based on a read-table. A
read-table specifies, for each character that may be encountered in the input,
the type of that character. In this case, the allowed types are WHITESPACE,
COMMENT_MARKER, SPECIAL, STRING_MARKER, and ENDFILE_MARKER. The core function
of the reader, get_token, dispatches on the type of the current character
(fetched from read_table via the char_type macro). Several of get_token's
auxiliary procedures (get_whitespace, get_string, and get_word) also
reference read_table.
As implemented in Listing Five, the reader:
Ignores white space, including spaces, tabs, newlines, and formfeeds.
Ignores everything between a semicolon (;) and the following newline, unless
the semicolon is part of a quoted string
Recognizes parentheses ( ) as list delimiters, and converts the external
representation of lists into an internal representation consisting of chains
of PAIR objects
Converts characters between pairs of double quotes ("") into STRING objects,
with the following subset of C's backslash (\) escapes correctly interpreted:
\n, \f, \\, \"
Converts sequences of constituent characters (everything but white space,
semicolon, parentheses, double quotes, backslash, and end-of-file) into SYMBOL
objects, unless the sequence begins with a decimal digit
Converts sequences of decimal digits into nonnegative INTEGER objects
(Notice that SYMBOLs, unlike C identifiers, can include characters other than
letters, digits, and underscores. This means that a little language based on
the Lisp-Style Library could include symbols such as +, pair?, or
list->vector.)
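The read-table idea can be shown with a small tokenizer. The sketch below is not Listing Five: it reads from a C string rather than a file, treats the terminating zero byte as the end-of-file marker, ignores backslash escapes, and returns the character type of the token instead of building Objects. CONSTITUENT is a name assumed here for the default character type, and the signature of get_token is invented for the example.

```c
#include <assert.h>
#include <string.h>

typedef enum { CONSTITUENT, WHITESPACE, COMMENT_MARKER, SPECIAL,
               STRING_MARKER, ENDFILE_MARKER } Char_Type;

static Char_Type read_table[256];    /* zero-filled: all CONSTITUENT */

#define char_type(c) (read_table[(unsigned char) (c)])

/* init_read_table -- classify the characters that are not constituents */
void init_read_table(void)
{
    const char *p;
    for (p = " \t\n\f"; *p != '\0'; p++)
        read_table[(unsigned char) *p] = WHITESPACE;
    read_table[';'] = COMMENT_MARKER;
    read_table['('] = SPECIAL;
    read_table[')'] = SPECIAL;
    read_table['"'] = STRING_MARKER;
    read_table['\0'] = ENDFILE_MARKER;   /* input is a C string here */
}

/* get_token -- copy the next token from *src into buf, advance *src,
   and return the type of the token's first character */
Char_Type get_token(const char **src, char *buf, size_t size)
{
    const char *s = *src;
    size_t n = 0;
    Char_Type t;

    assert(size >= 2);
    for (;;) {                       /* skip white space and comments */
        while (char_type(*s) == WHITESPACE)
            s++;
        if (char_type(*s) != COMMENT_MARKER)
            break;
        while (*s != '\n' && *s != '\0')
            s++;
    }
    t = char_type(*s);
    switch (t) {
    case ENDFILE_MARKER:
        break;
    case SPECIAL:                    /* a parenthesis is a token by itself */
        buf[n++] = *s++;
        break;
    case STRING_MARKER:              /* copy up to the closing quote */
        s++;
        while (*s != '\0' && char_type(*s) != STRING_MARKER) {
            if (n < size - 1)
                buf[n++] = *s;
            s++;
        }
        if (*s != '\0')
            s++;
        break;
    default:                         /* a run of constituent characters */
        while (char_type(*s) == CONSTITUENT) {
            if (n < size - 1)
                buf[n++] = *s;
            s++;
        }
        break;
    }
    buf[n] = '\0';
    *src = s;
    return t;
}
```

Changing the syntax of the input language is then mostly a matter of changing entries in read_table, which is the main attraction of the table-driven approach.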
On the output side, write_object produces textual representations of Objects
that conform to read_object's input format. In other words, you can write an
Object to a file with write_object and read back an equivalent Object with
read_object. (If the Object in question is a SYMBOL, then the two Objects
will be not only equivalent but also identical.)
write_object also produces textual representations for VECTORs and TOKENs
using Lisp-style #() and #S() notation, respectively. These cannot be read
back by the current version of read_object, however.


Systematic Memory Recycling


The flexibility of variable-length lists and data objects that are created as
they're needed comes at a price. Memory for Objects must be allocated
dynamically from the C heap, and -- unless the supply of heap memory is
inexhaustible -- the memory for Objects that are no longer needed (and only
for those Objects) must be freed.
True Lisp systems provide automatic garbage collection, pausing briefly to
accomplish this process whenever the program runs out of heap memory. Garbage
collection consists of tracing the entire network of active pointers, starting
with those in the machine registers (and including values on the stack), and
discarding those objects that are unreachable.
Unfortunately, I have not yet found a way to implement automatic garbage
collection using standard C mechanisms: It seems to require the creation of an
entirely new language (or at least a new type of compiler). Instead, the
Lisp-Style Library provides two methods for accomplishing what you might call
"planned recycling": free_object and mark / free_to_mark (Listing Six, page
121).
Using free_object requires you to keep track of what Objects you have created,
although you don't have to worry about Objects that are components of other
Objects. For example, if you create a list and hand it as an argument to
free_object, the function will free every PAIR in the list and every Object
pointed to by a PAIR in the list, including other lists (recursively).
On the other hand, using the mark / free_to_mark technique, which I adapted
from the PostScript language, does not require you to keep track of individual
Objects. Instead, you discard all Objects created between the invocation of
free_to_mark and the most recent mark, as in Example 7. Calls to mark and
free_to_mark may be properly nested.
Example 7: The functions mark and free_to_mark provide a form of planned
Object recycling. free_to_mark discards -- that is, recycles the memory used
by -- all Objects created since the most recent mark.

 mark ();
 <create and use Objects, all of which become garbage>
 free_to_mark ();


Objects created in a subregion bounded by calls to mark_persistent and
unmark_persistent will not be freed by a subsequent call to free_to_mark. This
allows some objects to persist beyond the mark/free region, as in Example 8.
copy_object duplicates an Object and all its components recursively (except
for SYMBOLs, which are unique) in much the same way that free_object frees an
Object and all its components recursively.
The function persistent_copy_object allows the statement important_object =
persistent_copy_object (important_object); to be substituted for the last three
statements of Example 8. In other words, persistent_copy_object is
copy_object wrapped inside a "persistent" region.
Example 8: Objects created in a subregion bounded by calls to mark_persistent
and unmark_persistent will not be freed by a subsequent call to free_to_mark.
This allows some objects -- in this case, the copy of important_object -- to
persist beyond the mark/free region.

 mark ();
 <create and use Objects, most of which become garbage>
 mark_persistent ();
 important_object = copy_object (important_object);
 unmark_persistent ();
 free_to_mark ();
 <use important_object>

To understand how the mark/free scheme works, first note that all of the
functions that allocate memory for Objects -- the make_ series (Listing Three)
-- do so through safe_malloc, which in turn calls C's native malloc. As
currently implemented, safe_malloc simply aborts the program with an error
message (using the error function of Listing Three) if sufficient memory to
satisfy the allocation request is not available.
Similarly, the functions that free memory, free_object and free_to_mark
(Listing Six), do so through safe_free, which in turn calls C's native
free. safe_free sets the first byte of the deallocated memory, which will
usually be an Object's type tag, to zero. Because zero is not a legal type
value -- in fact, it has the name UNDEFINED -- a function that attempts to
reference a freed Object will fail, assuming it does some type checking.
Setting the entire block of deallocated memory to zero in safe_free would
further increase the probability of catching references through "stale"
pointers. Many Objects have components that are themselves pointers to other
data, so a reference to a discarded Object would likely lead to an attempt to
dereference a zero-valued (NULL) pointer. That action can be trapped by the
compiler's runtime error checking or by the operating system. Of course, the
extra zero-setting would come at the expense of some execution time.
At the beginning of Listing Six, several variables tell the next part of the
mark/free story:
The variable alloc_persistent, initially TRUE, determines whether allocated
blocks should "persist" beyond any invocation of free_to_mark. It is set TRUE
by mark_persistent and FALSE by mark. It is also set according to the value at
the top of mark_stack (see below) by unmark_persistent and free_to_mark.
The variable marked_block_list points to the front of a linked list containing
every block that was allocated when alloc_persistent was FALSE. (These are
the blocks that can be freed by some invocation of free_to_mark.) safe_malloc
allocates the space for the links in this list by adding space for a Pointer
at the beginning of every memory block it allocates. It adds the current
memory block to the front of the list if alloc_persistent is FALSE.

The variable mark_stack, together with mark_stack_index, constitutes a stack
of Marks, each of which is identified as TEMPORARY (pushed by mark and popped
by free_to_mark) or PERSISTENT (pushed by mark_persistent and popped by
unmark_persistent), and each of which also contains an index pointer into
marked_block_list.
To summarize the process: safe_malloc maintains a list of marked blocks; mark
saves a pointer into the list of marked blocks; and free_to_mark walks down
the marked block list freeing blocks until it reaches the block referenced by
the saved pointer. mark_persistent and unmark_persistent allow the
definition of a subregion where memory blocks do not get added to the marked
block list within a region where they do.
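The whole scheme fits in a short sketch. The code below follows the description above but is not Listing Six. The Block header union, the fixed-size mark stack (MAX_MARKS), the helpers push_mark and restore_mode, and the live_blocks counter are all assumptions introduced for illustration. It also assumes that allocation returns to persistent mode when the mark stack is empty, and it requires a C11 compiler for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_MARKS 32             /* arbitrary limit for this sketch */

/* Header placed in front of every block; the union keeps the payload
   that follows it suitably aligned for any type. */
typedef union Block {
    union Block *next;           /* link in marked_block_list */
    max_align_t align;
} Block;

typedef enum { TEMPORARY, PERSISTENT } Mark_Kind;
typedef struct { Mark_Kind kind; Block *saved; } Mark;

static int alloc_persistent = 1;         /* TRUE until the first mark */
static Block *marked_block_list = NULL;  /* blocks that may be freed */
static Mark mark_stack[MAX_MARKS];
static int mark_stack_index = 0;
static int live_blocks = 0;              /* demonstration only */

/* safe_malloc -- allocate; link the block in unless it must persist */
void *safe_malloc(size_t size)
{
    Block *b = malloc(sizeof (Block) + size);
    if (b == NULL)
        abort();                 /* sketch: no error message */
    b->next = NULL;
    if (!alloc_persistent) {
        b->next = marked_block_list;
        marked_block_list = b;
    }
    live_blocks++;
    return b + 1;                /* payload starts after the header */
}

static void push_mark(Mark_Kind kind)
{
    assert(mark_stack_index < MAX_MARKS);
    mark_stack[mark_stack_index].kind = kind;
    mark_stack[mark_stack_index].saved = marked_block_list;
    mark_stack_index++;
}

/* restore_mode -- set alloc_persistent from the mark now on top */
static void restore_mode(void)
{
    alloc_persistent = mark_stack_index == 0 ||
        mark_stack[mark_stack_index - 1].kind == PERSISTENT;
}

void mark(void)            { push_mark(TEMPORARY);  alloc_persistent = 0; }
void mark_persistent(void) { push_mark(PERSISTENT); alloc_persistent = 1; }

void unmark_persistent(void)
{
    assert(mark_stack_index > 0);
    assert(mark_stack[mark_stack_index - 1].kind == PERSISTENT);
    mark_stack_index--;
    restore_mode();
}

/* free_to_mark -- free every block linked in since the matching mark */
void free_to_mark(void)
{
    Block *stop;
    assert(mark_stack_index > 0);
    assert(mark_stack[mark_stack_index - 1].kind == TEMPORARY);
    stop = mark_stack[--mark_stack_index].saved;
    while (marked_block_list != stop) {
        Block *next = marked_block_list->next;
        free(marked_block_list);
        live_blocks--;
        marked_block_list = next;
    }
    restore_mode();
}
```

Because new blocks are always added at the front of marked_block_list, the blocks allocated since a mark are exactly those ahead of the saved pointer, so free_to_mark needs no per-block bookkeeping beyond the link itself.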


Further Development


I have been building the Lisp-Style Library for C incrementally, over time,
adding new features as I've needed them for particular projects. The macros,
functions, and types form a reasonably coherent set, but the library is far
from a finished product. Without making the mistake of trying to extend the
library to the point where it becomes an inefficient interpreter of true Lisp
programs instead of an efficient library for C, here are a few of my ideas for
further development:
Define a type-checking macro and invoke it in all functions that expect (and
now assume) a particular type of input Object. This would add safety at the
expense of execution time.
Extend printf to include the capability to print Objects. Include an option to
print STRINGs without double quotes and backslash escapes.
Improve the prettyprinting capabilities of pp_object to make the most use of
the available line width.
Add more primitive Object types, such as FLOAT and CHAR. Extend the reader and
writer functions to handle these types, as well as negative INTEGERs.
Add a facility for defining STRUCTURE objects and for automatically creating
constructor (make_), selector, and type-predicate (is_) functions or macros.
As an option, make it possible to use SYMBOLs to name the structure's
components (again, at the expense of efficiency). Extend the reader and writer
functions appropriately. (In the current implementation, any structure that
needs to be handled as an Object -- TOKEN is an example -- must be hand-coded,
including its make_ function and is_ type predicate.)


Conclusion


I have described an effective approach to representing and manipulating
variable-length, heterogeneous lists, self-identified data objects, and
symbolic data in C. The simple examples in this article only hint at the
potential usefulness of the Lisp-Style Library for C: Its real power is
revealed in projects involving more complex data.
It often seems to me that almost everything in the world -- no matter how
simple or complex -- can be represented as a list of something. Moreover,
written language is so fundamental to our ability to represent and talk about
the world that it seems essential for a computer language to be able to
manipulate word-like symbols as easily as it can manipulate numbers and
operators. The techniques I've described here bring these powerful ideas to
the practical world of C programming.


References


Abelson, H. and G.J. Sussman, with J. Sussman. Structure and Interpretation of
Computer Programs. Cambridge, Mass.: MIT Press, 1985.
Adobe Systems Inc. PostScript Language Reference Manual. Reading, Mass.:
Addison-Wesley, 1985.
Aho, A.V., R. Sethi, and J.D. Ullman. Compilers: Principles, Techniques, and
Tools. Reading, Mass.: Addison-Wesley, 1986.
Bentley, J. "Programming Pearls: Little Languages." Communications of the ACM
(August 1986).
Schimandle, J. "Encapsulating C Memory Allocation." DDJ (August 1990).
Steele, G.L., Jr. Common LISP: The Language. Bedford, Mass.: Digital Press,
1984.
Winston, P.H. and B.K.P. Horn. LISP. Second Edition. Reading, Mass.:
Addison-Wesley, 1984.
Woodruff, B. "PostScript as a Programming Language," in Real World PostScript,
edited by S.F. Roth. Reading, Mass.: Addison-Wesley, 1988.
_A LISP-STYLE LIBRARY FOR C_
by Daniel N. Ozick


[LISTING ONE]

/* file LISP.H of 6-Feb-91 / Copyright (C) 1990 by Daniel N. Ozick */
/* Lisp-Style Library for C (Main Header File) */

/* Constants */
/* Array Sizes */
#define MAXSTRING 128 /* size of standard character array */
#define MAXLINE 256 /* size of text line character array */
#define HASH_TABLE_SZ 211 /* size of HASH_TABLE -- should be prime */

/* Characters */
#define EOS '\0' /* end of string */
#define TAB '\t'
#define NEWLINE '\n'
#define FORMFEED '\f'
#define SPACE 32
#define BELL 7
#define BACKSPACE 8
#define RETURN 13
#define LINEFEED 10
#define ESCAPE 27

#define DOT '.'
#define PERIOD '.'
#define DOS_EOF 26
#define BACKSLASH '\\'
#define SINGLE_QUOTE '\''
#define DOUBLE_QUOTE '\"'
#define LEFT_PAREN '('
#define RIGHT_PAREN ')'
#define LINE_SPLICE (-2)

/* Strings */
#define NULLSTR ""
#define NEWLINESTR "\n"
/** Types **/
/* Boolean -- standard truth values */
typedef enum
 {
 FALSE,
 TRUE
 } Boolean;
#if 0
/* Note: The following 'enum' version of Object_Type uses an 'int' (16 bits)
 of storage under Microsoft C 6.0! */
/* Object_Type -- values for first component of 'Object' (self-id tag) */
typedef enum
 {
 /* General Types */
 UNDEFINED,
 SYMBOL,
 STRING,
 INTEGER,
 FUNCTION,
 PAIR,
 VECTOR,
 /* Built-in C Structures */
 TOKEN,
 } Object_Type;
#endif
/* Note: The following version of Object_Type is guaranteed to use only one
'char' of storage. (Contrast with 'enum' version, above.) */
/* Object_Type -- values for first component of 'Object' (self-id tag) */
typedef char Object_Type;
/* General Types */
#define UNDEFINED 0
#define SYMBOL 1
#define STRING 2
#define INTEGER 3
#define FUNCTION 4
#define PAIR 5
#define VECTOR 6
/* Built-in C Structures */
#define TOKEN 7
/* Pointer -- 'Generic *' : what's pointed to is unknown at compile time */
typedef void *Pointer;
/* Object -- pointer to self-identified object (starts with Object_Type) */
typedef Object_Type *Object;
/* Function -- pointer to function of ? arguments returning Object */
typedef Object (*Function)(Object, ...);
/* Function_0 -- pointer to function of 0 arguments returning Object */

typedef Object (*Function_0)(void);
/* Function_1 -- pointer to function of 1 Object returning Object */
typedef Object (*Function_1)(Object);
/* Symbol_Entry -- the attributes of a symbol (entered into Symbol_Table) */
typedef struct
 {
 char *print_name; /* printed representation and lookup key */
 Object value; /* value of global variable named by symbol */
 } Symbol_Entry;
/* Pair -- a Lisp 'cons' cell for creating linked lists */
typedef struct
 {
 Object car; /* any Object */
 Object cdr; /* PAIR Object or NULL (usually) */
 } Pair;
/* Token -- structure Object stores token type and lexeme string */
typedef struct
 {
 Object type; /* SYMBOL */
 char *lexeme; /* string as it appeared in external file */
 } Token;
/* Hash_Table -- an array of hash-bucket lists used for symbol tables */
typedef Object Hash_Table [HASH_TABLE_SZ];
/** Macros **/
/* Standard Input and Output */
#define ungetchar(c) ungetc (c, stdin)
#define peekchar() ungetc (getchar(), stdin)
/** Object Components **/
/* SOT -- size of 'Object_Type' (bytes used by type tag) */
#define SOT sizeof (Object_Type)
/* type -- return the object's self-identification (Object_Type) */
#define type(object) *((Object_Type *) object)
/* symbol -- return the address of symbol's name and value (Symbol_Entry) */
#define symbol(object) ((Symbol_Entry *) (object + SOT))
/* symbol_value -- return the value assigned to a symbol */
#define symbol_value(object) (symbol(object)->value)
/* string -- return the address of (the first char of) standard C string */
#define string(object) ((char *) (object + SOT))
/* integer -- return an 'int' */
#define integer(object) *((int *) (object + SOT))
/* function -- return the address of a function that returns Object */
#define function(object) *((Function *) (object + SOT))
/* pair -- return the address of a Lisp-style CONS cell */
#define pair(object) ((Pair *) (object + SOT))
/* first -- return first element of a list (Lisp CAR) */
#define first(object) (pair(object)->car)
/* but_first -- return list less its first element (Lisp CDR) */
#define but_first(object) (pair(object)->cdr)
/* vector -- return the base address of a 1-dimensional array of Object */
#define vector(object) ((Object *) (object + SOT + sizeof (int)))
/* vector_length -- return length of a VECTOR Object (also an lvalue) */
#define vector_length(object) *((int *) (object + SOT))
/* token -- return the address of a Token structure */
#define token(object) ((Token *) (object + SOT))
/* Type Predicates */
#define is_null(object) (object == NULL)
#define is_symbol(object) (type(object) == SYMBOL)
#define is_pair(object) (type(object) == PAIR)
#define is_atom(object) (is_null(object) || (type(object) != PAIR))
#define is_list(object) (is_null(object) || is_pair(object))
#define is_vector(object) (type(object) == VECTOR)
#define is_string(object) (type(object) == STRING)
#define is_integer(object) (type(object) == INTEGER)
#define is_function(object) (type(object) == FUNCTION)
#define is_token(object) (type(object) == TOKEN)
/* declare_symbol -- declare extern var with same name as interned sym */
#define declare_symbol(name,type) extern Object name
/* List-Based Stacks */
/* push -- push an object on to a (list-based) stack */
#define push(location,object) \
 location = first_put (object, location)
/* pop -- pop an object off of a (list-based) stack, NULL if stack empty */
#define pop(location) \
 ( (location != NULL) ? \
 pop_f (&location) : NULL )
/* Function Prototypes */
void error (char *fstr, ...);
Object first_put (Object item, Object list);
Object last_put (Object item, Object list);
Object list (Object item, ...);
Object append (Object list_1, Object list_2);
Object reverse (Object list);
Object flatten (Object obj);
Object flatten_no_nils (Object obj);
void for_each (Function_1 f, Object list);
Object map (Function_1 f, Object list);
Object map_no_nils (Function_1 f, Object list);
Object nth (Object list, int n);
Object assoc (Object key, Object a_list);
Object pop_f (Object *location);
int length (Object list);
Object is_member (Object obj, Object list);
int index (Object element, Object list);
char *make_c_string (char *str);
Object make_symbol (char *name);
Object make_string (char *s);
Object make_integer (int n);
Object make_function (Function f);
Object make_token (Object type, char *lexeme);
Object make_vector (int length);
Object list_to_vector (Object list);
void write_object (Object obj);
Object read_object (void);
Object lookup (char *str);
Object intern (char *str);
Object install_with_value (char *str, Object val);
Object set_symbol_value (Object sym, Object val);
void install_internal_symbols (void);
void mark (void);
void free_to_mark (void);
void mark_persistent (void);
void unmark_persistent (void);
Pointer safe_malloc (size_t size);
void safe_free (void *p);
void free_object (Object obj);
Object copy_object (Object obj);
Object persistent_copy_object (Object obj);
void init_internal_read_table (void);

void set_internal_reader (void);





[LISTING TWO]

/* File I-SYMS.H of 28-Jan-91 / Copyright (C) 1990 by Daniel N. Ozick */

/** Declaration of Symbols in Internal Symbol Table **/
/* Symbol Types */
declare_symbol (SYMBOL_TYPE, SYMBOL_TYPE);
declare_symbol (RESERVED, SYMBOL_TYPE);
declare_symbol (CHAR_TYPE, SYMBOL_TYPE);
declare_symbol (TOKEN_TYPE, SYMBOL_TYPE);
/* Reserved "Lisp" Symbols */
declare_symbol (_UNDEFINED, RESERVED);
declare_symbol (NIL, RESERVED);
declare_symbol (T, RESERVED);
declare_symbol (EOF_OBJECT, RESERVED);
/* Character Types */
declare_symbol (ILLEGAL, CHAR_TYPE);
declare_symbol (WHITESPACE, CHAR_TYPE);
declare_symbol (STRING_MARKER, CHAR_TYPE);
declare_symbol (COMMENT_MARKER, CHAR_TYPE);
declare_symbol (SPECIAL, CHAR_TYPE);
declare_symbol (CONSTITUENT, CHAR_TYPE);
declare_symbol (ESCAPE_MARKER, CHAR_TYPE);
declare_symbol (ENDFILE_MARKER, CHAR_TYPE);
/** Token Types **/
/* For Internal Diagnostics */
declare_symbol (T_ERROR, TOKEN_TYPE);
/* Internal Special Symbols (Lisp IO) */
declare_symbol (T_LPAREN, TOKEN_TYPE);
declare_symbol (T_RPAREN, TOKEN_TYPE);
/* Others */
declare_symbol (T_NEWLINE, TOKEN_TYPE);
declare_symbol (T_WHITESPACE, TOKEN_TYPE);
declare_symbol (T_WORD, TOKEN_TYPE);
declare_symbol (T_STRING, TOKEN_TYPE);
declare_symbol (T_EOF, TOKEN_TYPE);





[LISTING THREE]

/* File LISP.C of 6-Feb-91 / Copyright (C) 1990 by Daniel N. Ozick */

/** Lisp-Style Library for C (Main File of User Functions) **/
/* Include Files */
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <stdarg.h>
#include "lisp.h"
#include "i-syms.h"

/** Functions **/
/* error -- write string (args like 'printf') to 'stdout' and exit */
void error (char *fstr, ...)
 {
 va_list ap;
 va_start (ap, fstr);
 vfprintf (stdout, fstr, ap);
 fputc (NEWLINE, stdout);
 /* write DOS_EOF to 'stdout' for compatibility */
 fputc (DOS_EOF, stdout);
 va_end (ap);
 exit (1);
 }
/** List Constructors **/
/* first_put -- add an Object to the front of a list (Lisp CONS) */
Object first_put (Object item, Object list)
 {
 Object new_list;
 new_list = (Object) safe_malloc (sizeof (Object_Type) + sizeof (Pair));
 type (new_list) = PAIR;
 pair (new_list) -> car = item;
 pair (new_list) -> cdr = list;
 return (new_list);
 }
/* last_put -- add an Object to the end of a list (Destructive!) */
Object last_put (Object item, Object list)
 {
 Object old_list, new_list;
 new_list = first_put (item, NULL);
 if (list == NULL)
 return (new_list);
 else
 {
 old_list = list;
 while (but_first (list) != NULL)
 list = but_first (list);
 pair (list) -> cdr = new_list;
 return (old_list);
 }
 }
/* list -- return a new list of given arguments (last arg must be NULL) */
Object list (Object item, ...)
 {
 va_list ap;
 Object result;
 result = NULL;
 va_start (ap, item);
 while (item != NULL)
 {
 result = last_put (item, result);
 item = va_arg (ap, Object);
 }
 va_end (ap);
 return (result);
 }
/* append -- concatenate two lists (destructive (!) Lisp equivalent) */
Object append (Object list_1, Object list_2)
 {
 Object list;

 if (list_1 == NULL)
 return (list_2);
 else
 if (list_2 == NULL)
 return (list_1);
 else
 {
 list = list_1;
 while (but_first (list) != NULL)
 list = but_first (list);
 pair (list) -> cdr = list_2;
 return (list_1);
 }
 }
/** List Modifiers **/
/* reverse -- return a new list in reverse order (Lisp equivalent) */
Object reverse (Object list)
 {
 Object new_list;
 new_list = NULL;
 while (list != NULL)
 {
 new_list = first_put (first (list), new_list);
 list = but_first (list);
 }
 return (new_list);
 }
/* flatten -- return the leaves of a tree (atoms of nested lists) */
Object flatten (Object obj)
 {
 if (is_null (obj))
 return (first_put (NULL, NULL));
 else if (is_atom (obj))
 return (list (obj, NULL));
 else if (is_null (but_first (obj)))
 return (flatten (first (obj)));
 else
 return (append (flatten (first (obj)),
 flatten (but_first (obj)) ));
 }
/* flatten_no_nils -- 'flatten' a tree, discarding NULL atoms */
Object flatten_no_nils (Object obj)
 {
 if (is_null (obj))
 return (NULL);
 else if (is_atom (obj))
 return (list (obj, NULL));
 else
 return (append (flatten_no_nils (first (obj)),
 flatten_no_nils (but_first (obj)) ));
 }
/** Mapping Functions **/
/* for_each -- apply a function 'f' to each element of a list */
void for_each (Function_1 f, Object list)
 {
 while (list != NULL)
 {
 (*f) (first (list));
 list = but_first (list);

 }
 }
/* map -- apply a function 'f' to each element of list, put results in list */
Object map (Function_1 f, Object list)
 {
 Object output;
 output = NULL;
 while (list != NULL)
 {
 output = first_put ((*f) (first (list)), output);
 list = but_first (list);
 }
 return (reverse (output));
 }
/* map_no_nils -- like 'map', but collect only non-NULL results */
Object map_no_nils (Function_1 f, Object list)
 {
 Object result;
 Object output;
 output = NULL;
 while (list != NULL)
 {
 result = (*f) (first (list));
 if (result != NULL)
 output = first_put (result, output);
 list = but_first (list);
 }
 return (reverse (output));
 }
/** List Selectors **/
/* nth -- return nth element of a list or NULL (Lisp equivalent) */
Object nth (Object list, int n)
 {
 while ((list != NULL) && (n > 0))
 {
 list = but_first (list);
 n--;
 }
 if (list != NULL)
 return (first (list));
 else
 return (NULL);
 }
/* assoc -- association-list lookup returns PAIR whose 'first' matches key */
Object assoc (Object key, Object a_list)
 {
 Object pair;
 while (a_list != NULL)
 {
 pair = first (a_list);
 if (first (pair) == key)
 return (pair);
 else
 a_list = but_first (a_list);
 }
 return (NULL);
 }
/* pop_f -- pop an object off of a (list-based) stack: 'pop' macro helper */
Object pop_f (Object *location)

 {
 Object item;
 item = first (*location);
 *location = but_first (*location);
 return (item);
 }

/* List Properties */
/* length -- return the integer length of a list (Lisp equivalent) */
int length (Object list)
 {
 int n;
 n = 0;
 while (list != NULL)
 {
 list = but_first (list);
 n++;
 }
 return (n);
 }
/* is_member -- T if 'obj' is identical to element of 'list', else NULL */
Object is_member (Object obj, Object list)
 {
 while (list != NULL)
 {
 if (first (list) == obj)
 return (T);
 else
 list = but_first (list);
 }
 return (NULL);
 }
/* index -- return index of first occurrence of 'element' in 'list' */
int index (Object element, Object list)
 {
 int n;
 n = 0;
 while ((list != NULL) &&
 (first (list) != element) )
 {
 list = but_first (list);
 n++;
 }
 if (list != NULL)
 return (n);
 else
 return (-1);
 }
/** Object Constructors **/
/* make_c_string -- make new copy of argument string in free memory */
char *make_c_string (char *str)
 {
 char *new_string;
 new_string = (char *) safe_malloc (strlen (str) + 1);
 strcpy (new_string, str);
 return (new_string);
 }
/* make_symbol -- return a new symbol of given name (no table lookup) */
Object make_symbol (char *name)

 {
 Object new_symbol;
 new_symbol = (Object) safe_malloc (sizeof (Object_Type) +
 sizeof (Symbol_Entry) );
 type (new_symbol) = SYMBOL;
 symbol (new_symbol) -> print_name = make_c_string (name);
 symbol (new_symbol) -> value = _UNDEFINED;
 return (new_symbol);
 }
/* make_string -- return a new STRING Object with value of given string */
Object make_string (char *s)
 {
 Object new_string;
 new_string = (Object) safe_malloc (sizeof (Object_Type) + strlen (s) + 1 );
 type (new_string) = STRING;
 strcpy (string (new_string), s);
 return (new_string);
 }
/* make_integer -- return a new INTEGER Object of specified value */
Object make_integer (int n)
 {
 Object new_integer;
 new_integer = (Object) safe_malloc (sizeof (Object_Type) + sizeof (int) );
 type (new_integer) = INTEGER;
 integer (new_integer) = n;
 return (new_integer);
 }
/* make_function -- return a new FUNCTION Object of specified value */
Object make_function (Function f)
 {
 Object new_function;
 new_function = (Object) safe_malloc (sizeof (Object_Type) +
 sizeof (Function) );
 type (new_function) = FUNCTION;
 function (new_function) = f;
 return (new_function);
 }
/* make_token -- return a new TOKEN Object of specified type and lexeme */
Object make_token (Object type, char *lexeme)
 {
 Object new_token;
 new_token = (Object) safe_malloc (sizeof (Object_Type) + sizeof (Token));
 type (new_token) = TOKEN;
 token (new_token) -> type = type;
 token (new_token) -> lexeme = make_c_string (lexeme);
 return (new_token);
 }
/** Vectors **/
/* make_vector -- return a new VECTOR object of specified 'length' */
Object make_vector (int length)
 {
 Object new_vector;
 int i;
 new_vector = (Object) safe_malloc (sizeof (Object_Type) + sizeof (int) +
 length * sizeof (Object) );
 type (new_vector) = VECTOR;
 vector_length (new_vector) = length;
 for (i = 0; i < length; i++)
 vector(new_vector) [i] = NULL;

 return (new_vector);
 }
/* list_to_vector -- given a (proper) list, return a new VECTOR Object */
Object list_to_vector (Object list)
 {
 Object new_vector;
 Object *element;
 new_vector = make_vector (length (list));
 element = vector(new_vector);
 while (list != NULL)
 {
 *element = first (list);
 list = but_first (list);
 element++;
 }
 return (new_vector);
 }
/** Symbolic Output **/
/* write_spaces -- write 'n' spaces to 'stdout' */
void write_spaces (int n)
 {
 int i;
 for (i = 0; i < n; i++)
 putchar (SPACE);
 }
/* write_c_string -- write standard C string with double-quotes and escapes */
void write_c_string (char *s)
 {
 putchar (DOUBLE_QUOTE);
 while (*s != EOS)
 {
 switch (*s)
 {
 case NEWLINE:
 putchar (BACKSLASH);
 putchar ('n');
 break;
 case TAB:
 putchar (BACKSLASH);
 putchar ('t');
 break;
 case FORMFEED:
 putchar (BACKSLASH);
 putchar ('f');
 break;
 case BACKSLASH:
 putchar (BACKSLASH);
 putchar (BACKSLASH);
 break;
 case DOUBLE_QUOTE:
 putchar (BACKSLASH);
 putchar (DOUBLE_QUOTE);
 break;
 default:
 putchar (*s);
 break;
 }
 s++;
 }

 putchar (DOUBLE_QUOTE);
 }
/* write_symbol -- write printed representation of SYMBOL Object */
void write_symbol (Object obj)
 {
 printf ("%s", symbol(obj)->print_name);
 }
/* write_string -- write printed representation of STRING Object */
void write_string (Object obj)
 {
 write_c_string (string(obj));
 }
/* pp_object -- pretty-print an Object starting at 'col', output at 'hpos' */
void pp_object (Object obj, int col, int hpos)
 {
 int i;
 write_spaces (col - hpos); hpos = col;
 if (obj == NULL)
 printf ("()");
 else
 switch (type(obj))
 {
 case SYMBOL:
 write_symbol (obj);
 break;
 case STRING:
 write_string (obj);
 break;
 case INTEGER:
 printf ("%d", integer(obj));
 break;
 case PAIR:
 /* for now, assume proper list (ending in NULL 'but_first') */
 putchar (LEFT_PAREN); hpos++;
 while (obj != NULL)
 {
 if (! is_pair (obj))
 error ("pp_object: not proper list");
 pp_object (first (obj), col+1, hpos);
 obj = but_first (obj);
 if (obj != NULL)
 {
 putchar (NEWLINE); hpos = 0;
 }
 }
 putchar (RIGHT_PAREN);
 break;
 case VECTOR:
 putchar ('#'); hpos++;
 putchar (LEFT_PAREN); hpos++;
 for (i = 0; i < vector_length(obj); i++)
 {
 pp_object (vector(obj) [i], col+2, hpos);
 if (i < vector_length(obj)-1)
 {
 putchar (NEWLINE); hpos = 0;
 }
 }
 putchar (RIGHT_PAREN);

 break;
 case FUNCTION:
 printf ("#<function>");
 break;
 case TOKEN:
 printf ("#S(TOKEN ");
 write_symbol (token(obj)->type);
 putchar (SPACE);
 write_c_string (token(obj)->lexeme);
 putchar (RIGHT_PAREN);
 break;
 default:
 error ("pp_object: not standard object");
 break;
 }
 }
/* write_object -- write (re-readable) printed representation of Object */
void write_object (Object obj)
 {
 /* for now (simple version), assume 'hpos' initially 0 */
 pp_object (obj, 0, 0);
 }





[LISTING FOUR]

/* File SYMBOLS.C of 5-Feb-91 / Copyright (C) 1990 by Daniel N. Ozick */

/** Symbol Tables and Installed Symbols **/
/* Include Files */
#include <stdio.h>
#include <string.h>
#include "lisp.h"
/** Variables **/
/* internal_symbols -- the symbol table for "Lisp" */
Hash_Table internal_symbols;
/* symbol_table -- pointer to the current symbol table */
Object *symbol_table;
/* Predefined Internal Symbols */
#undef declare_symbol
#define declare_symbol(name,type) \
 Object name
#include "i-syms.h"
/** Functions **/
/* init_hash_table -- set all hash buckets in a table to the empty list */
void init_hash_table (Hash_Table table)
 {
 int i;
 for (i = 0; i < HASH_TABLE_SZ; i++)
 table [i] = NULL;
 }
/* hash -- given a character string, return a hash code (from Aho, p. 436) */
int hash (char *str)
 {
 char *p;
 unsigned long g, h;

 /* from the book "Compilers" by Aho, Sethi, and Ullman, p. 436 */
 h = 0;
 for (p = str; *p != EOS; p++)
 {
 h = (h << 4) + (*p);
 g = h & 0xF0000000;
 if (g)
 {
 h = h ^ (g >> 24);
 h = h ^ g;
 }
 }
 return ( (int) (h % HASH_TABLE_SZ));
 }
/* lookup -- given a string, return symbol from 'symbol_table' or NULL */
Object lookup (char *str)
 {
 Object hash_bucket; /* list */
 Object sym; /* symbol */
 hash_bucket = symbol_table [hash (str)];
 /* walk linearly down 'hash_bucket' list looking for input string */
 while (hash_bucket != NULL)
 {
 sym = first (hash_bucket);
 if (strcmp (symbol (sym) -> print_name, str) == 0)
 return (sym);
 else
 hash_bucket = but_first (hash_bucket);
 }
 return (NULL);
 }
/* install -- add a new symbol with given print string to 'symbol_table' */
Object install (char *str)
 {
 Object new_sym;
 int hash_index;
 new_sym = make_symbol (str);
 /* insert new symbol object at the front of appropriate hash bucket list */
 hash_index = hash (str);
 symbol_table [hash_index] = first_put (new_sym, symbol_table [hash_index]);
 return (new_sym);
 }
/* intern -- return (possibly new and just installed) symbol of given name */
Object intern (char *str)
 {
 Object sym; /* symbol */
 sym = lookup (str);
 if (sym == NULL)
 sym = install (str);
 return (sym);
 }
/* set_symbol_value -- set the value of an already installed symbol */
Object set_symbol_value (Object sym, Object val)
 {
 symbol (sym) -> value = val;
 return (val);
 }
/* install_with_value -- add a new symbol and its value to 'symbol_table' */
Object install_with_value (char *str, Object val)

 {
 Object new_sym;
 new_sym = install (str);
 set_symbol_value (new_sym, val);
 return (new_sym);
 }
/* install_internal_symbols -- set internal symbols known at compile time */
void install_internal_symbols (void)
 {
 symbol_table = internal_symbols;
 #undef declare_symbol
 #define declare_symbol(name,type) \
 name = install_with_value (#name, type)
 #include "i-syms.h"
 install_with_value ("(", T_LPAREN);
 install_with_value (")", T_RPAREN);
 }





[LISTING FIVE]

/* File LEXER.C of 6-Feb-91 / Copyright (C) 1990 by Daniel N. Ozick */

/** Lexical Analyzer (a.k.a. Lexer, Scanner, or Reader) **/
/* Include Files */
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
#include "lisp.h"
#include "i-syms.h"
/* External Variables */
extern Object *symbol_table;
extern Hash_Table internal_symbols;
/* Internal Function Prototypes */
Object read_list (Object first_atom);
/* Constants */
#define CHAR_SET_SZ 256
/** Types **/
/* Read_Table -- array giving CHAR_TYPE SYMBOL for every char and EOF */
typedef Object Read_Table [CHAR_SET_SZ+1];
/** Variables **/
/* internal_read_table -- read table for "Lisp" reader */
Read_Table internal_read_table;
/* read_table -- pointer to the current read table */
Object *read_table;
/* eof_seen -- 'get_token' (EOF) sets TRUE */
Boolean eof_seen = FALSE;
/** Macros **/
/* char_type -- return char type of char or EOF from current read table */
#define char_type(c) read_table[(c)+1]
/** Functions **/
/* set_read_table_entries -- set a list of read-table entries to Char_Type */
void set_read_table_entries (char *s, Object t)
 {
 while (*s != EOS)

 char_type (*s++) = t;
 }
/* init_read_table -- initialize 'read_table' with CONSTITUENT and EOF */
void init_read_table (void)
 {
 int c;
 for (c = 0; c < CHAR_SET_SZ; c++)
 char_type (c) = CONSTITUENT;
 char_type (EOF) = ENDFILE_MARKER;
 }
/* init_internal_read_table -- initialize 'internal_read_table' */
void init_internal_read_table (void)
 {
 read_table = internal_read_table;
 init_read_table ();
 set_read_table_entries (" \t\f\n", WHITESPACE);
 set_read_table_entries (";", COMMENT_MARKER);
 set_read_table_entries ("()", SPECIAL);
 char_type (DOUBLE_QUOTE) = STRING_MARKER;
 char_type (BACKSLASH) = ESCAPE_MARKER;
 }
/* set_internal_reader -- set 'read_table' and 'symbol_table' for Lisp I/O */
void set_internal_reader (void)
 {
 read_table = internal_read_table;
 symbol_table = internal_symbols;
 }
/* get_whitespace -- return TOKEN Object of type T_WHITESPACE */
Object get_whitespace (void)
 {
 char lexeme [MAXSTRING];
 int index;
 int current_char;
 /* collect characters up to next non-whitespace */
 index = 0;
 while (current_char = getchar (),
 (char_type (current_char) == WHITESPACE) &&
 (index < MAXSTRING-1) )
 lexeme [index++] = (char) current_char;
 lexeme [index] = EOS;
 ungetchar (current_char);
 return (make_token (T_WHITESPACE, lexeme));
 }
/* get_escaped_char -- return single character value, line splice ==> EOS */
int get_escaped_char (void)
 {
 int c;
 /* discard ESCAPE_MARKER */
 getchar ();
 switch (c = getchar ())
 {
 case 'n':
 return (NEWLINE);
 break;
 case 't':
 return (TAB);
 break;
 case 'f':
 return (FORMFEED);

 break;
 case BACKSLASH:
 return (BACKSLASH);
 break;
 case DOUBLE_QUOTE:
 return (DOUBLE_QUOTE);
 break;
 /* Note: LINE_SPLICE should really be discarded */
 case NEWLINE:
 return (LINE_SPLICE);
 break;
 default:
 return (c);
 break;
 }
 }
/* get_string -- return TOKEN Object of type T_STRING */
Object get_string (void)
 {
 char lexeme [MAXSTRING];
 int index;
 int current_char;
 /* discard starting STRING_MARKER */
 getchar ();
 /* collect characters until next (unescaped) STRING_MARKER */
 index = 0;
 while (current_char = getchar (),
 (char_type (current_char) != STRING_MARKER) &&
 (index < MAXSTRING-1) )
 {
 if (char_type (current_char) != ESCAPE_MARKER)
 lexeme [index++] = (char) current_char;
 else
 {
 ungetchar (current_char);
 lexeme [index++] = (char) get_escaped_char ();
 }
 }
 lexeme [index] = EOS;
 return (make_token (T_STRING, lexeme));
 }
/* skip_comment -- discard characters of a 'get_token' (line) comment */
void skip_comment (void)
 {
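 int c;
 /* stop at end of line or end of file (a bare NEWLINE test never ends on EOF) */
 while ((c = getchar ()) != NEWLINE && c != EOF)
 ;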
 }
/* get_special_sym -- return one of the special-symbol TOKEN Objects */
Object get_special_sym (void)
 {
 int current_char;
 char lexeme [3];
 Object sym;
 current_char = getchar ();
 lexeme [0] = (char) current_char;
 /* check for two-character special symbol */
 current_char = getchar ();
 lexeme [1] = (char) current_char;
 lexeme [2] = EOS;

 sym = lookup (lexeme);
 if (sym != NULL)
 return (make_token (symbol_value (sym), lexeme));
 /* check for one-character special symbol */
 else
 {
 ungetchar (current_char);
 lexeme [1] = EOS;
 sym = lookup (lexeme);
 if (sym != NULL)
 return (make_token (symbol_value (sym), lexeme));
 /* else error */
 else
 error ("get_special_sym: no token type for '%s' ", lexeme);
 }
 }
/* get_word -- return TOKEN Object of type T_WORD */
Object get_word (void)
 {
 char lexeme [MAXSTRING];
 int index;
 int current_char;
 /* collect characters up to next non-constituent */
 index = 0;
 while (current_char = getchar (),
 (char_type (current_char) == CONSTITUENT) &&
 (index < MAXSTRING-1) )
 lexeme [index++] = (char) current_char;
 lexeme [index] = EOS;
 ungetchar (current_char);
 return (make_token (T_WORD, lexeme));
 }
/* get_token -- return a single TOKEN Object (raw version) */
Object get_token (void)
 {
 int current_char;
 Object ct;
 if (eof_seen)
 error ("get_token: attempt to read past end of file");
 current_char = peekchar ();
 ct = char_type (current_char);
 if (ct == CONSTITUENT)
 return (get_word ());
 else if (ct == WHITESPACE)
 return (get_whitespace ());
 else if (ct == SPECIAL)
 return (get_special_sym ());
 else if (ct == STRING_MARKER)
 return (get_string ());
 else if (ct == COMMENT_MARKER)
 {
 skip_comment ();
 return (get_token ());
 }
 else if (ct == ESCAPE_MARKER)
 {
 /* discard anything but LINE_SPLICE */
 if (get_escaped_char () == LINE_SPLICE)
 return (make_token (T_WHITESPACE, NEWLINESTR));

 else
 return (get_token ());
 }
 else if (ct == ENDFILE_MARKER)
 {
 /* set end-of-file flag (see 'with_current_files') */
 eof_seen = TRUE;
 return (make_token (T_EOF, NULLSTR));
 }
 else
 error ("get_token: bad char type for '%c' ", current_char);
 }
/* symbol_or_number -- interpret string as SYMBOL or INTEGER Object */
Object symbol_or_number (char *s)
 {
 if (isdigit (*s))
 return (make_integer (atoi (s)));
 else
 return (intern (s));
 }
/* read_atom -- return an atomic Object or list-syntax TOKEN Object */
Object read_atom (void)
 {
 Object t, tt;
 t = get_token ();
 tt = token(t)->type;
 if (tt == T_WHITESPACE)
 return (read_atom ());
 else
 if (tt == T_WORD)
 return (symbol_or_number (token(t)->lexeme));
 else
 if (tt == T_STRING)
 return (make_string (token(t)->lexeme));
 else
 if (tt == T_EOF)
 return (EOF_OBJECT);
 else
 if ((tt == T_LPAREN) || (tt == T_RPAREN))
 return (t);
 else
 error ("read_atom: bad token type on input");
 }
/* read_object_1 -- 'read_object' with first input atom supplied */
Object read_object_1 (Object first_atom)
 {
 Object_Type ot;
 Object tt;
 ot = type(first_atom);
 if (ot == TOKEN)
 tt = token(first_atom)->type;
 if ((ot == TOKEN) && (tt == T_LPAREN))
 return (read_list (read_atom ()));
 else
 if ((ot == TOKEN) && (tt == T_RPAREN))
 error ("read_object_1: right paren without matching left paren");
 else
 return (first_atom);
 }

/* read_list -- read paren-delimited list (helper for 'read_object') */
Object read_list (Object first_atom)
 {
 Object_Type ot;
 Object tt;
 Object first, rest;
 ot = type(first_atom);
 if (ot == TOKEN)
 tt = token(first_atom)->type;
 if ((ot == TOKEN) && (tt == T_RPAREN))
 return (NULL);
 else
 if ((ot == TOKEN) && (tt == T_EOF))
 error ("read_list: EOF encountered before closing right paren");
 else
 {
 first = read_object_1 (first_atom);
 rest = read_list (read_atom ());
 return (first_put (first, rest));
 }
 }
/* read_object -- read complete Object, including paren-delimited list */
Object read_object (void)
 {
 return (read_object_1 (read_atom ()));
 }




[LISTING SIX]

/* File MEMORY.C of 6-Feb-91 / Copyright (C) 1990 by Daniel N. Ozick */

/** Memory Allocation and Deallocation Functions **/
/* Include Files */
#include <stdio.h>
#include <stdlib.h>
#include "lisp.h"
/* Constants */
#define MAX_MARK_LEVELS 16
/** Types **/
/* Mark_Type */
typedef enum
 {
 TEMPORARY,
 PERSISTENT
 } Mark_Type;
/* Mark -- an element of 'mark_stack' */
typedef struct
 {
 Mark_Type type;
 Pointer index;
 } Mark;
/** Variables **/
/* marked_block_list -- pointer to linked list of marked allocated blocks */
Pointer marked_block_list = NULL;
/* mark_stack -- stack of 'Mark' and stack index */
Mark mark_stack [MAX_MARK_LEVELS];

int mark_stack_index = 0;
/* alloc_persistent -- FALSE means stack pointers to freeable memory blocks */
Boolean alloc_persistent = TRUE;
/** Functions **/
/* push_marked_block -- push pointer to block on 'marked_block_list' */
void push_marked_block (Pointer p)
 {
 * (Pointer *) p = marked_block_list;
 marked_block_list = p;
 }
/* pop_marked_block -- pop pointer to block from 'marked_block_list' */
Pointer pop_marked_block (void)
 {
 Pointer p;
 p = marked_block_list;
 if (p != NULL)
 {
 marked_block_list = * (Pointer *) p;
 return (p);
 }
 else
 error ("pop_marked_block: 'marked_block_list' is empty");
 }
/* push_mark_stack -- push a Mark on top of 'mark_stack' */
void push_mark_stack (Mark m)
 {
 if (mark_stack_index < MAX_MARK_LEVELS)
 mark_stack [mark_stack_index++] = m;
 else
 error ("push_mark_stack: exceeded MAX_MARK_LEVELS");
 }
/* pop_mark_stack -- pop a Mark from 'mark_stack' */
Mark pop_mark_stack (void)
 {
 if (mark_stack_index > 0)
 return (mark_stack [--mark_stack_index]);
 else
 error ("pop_mark_stack: stack empty");
 }
/* top_mark_stack -- return top of 'mark_stack' or PERSISTENT Mark if empty */
Mark top_mark_stack (void)
 {
 Mark m;
 if (mark_stack_index > 0)
 return (mark_stack [mark_stack_index-1]);
 else
 {
 m.type = PERSISTENT;
 m.index = marked_block_list;
 return (m);
 }
 }
/* mark -- push TEMPORARY Mark (with 'marked_block_list') on 'mark_stack' */
void mark (void)
 {
 Mark m;
 m.type = TEMPORARY;
 m.index = marked_block_list;
 push_mark_stack (m);

 alloc_persistent = FALSE;
 }
/* free_to_mark -- 'safe_free' all memory blocks alloc'ed since last 'mark' */
void free_to_mark (void)
 {
 Mark m;
 m = pop_mark_stack ();
 if (m.type == TEMPORARY)
 {
 while (marked_block_list != m.index)
 safe_free ((char *) pop_marked_block () + sizeof (Pointer));
 alloc_persistent = (top_mark_stack().type == PERSISTENT);
 }
 else
 error ("free_to_mark: wrong mark type on 'mark_stack'");
 }
/* mark_persistent -- disable stacking of freeable memory block pointers */
void mark_persistent (void)
 {
 Mark m;
 m.type = PERSISTENT;
 m.index = marked_block_list;
 push_mark_stack (m);
 alloc_persistent = TRUE;
 }
/* unmark_persistent -- pop a PERSISTENT Mark off the 'mark_stack' */
void unmark_persistent (void)
 {
 Mark m;
 m = pop_mark_stack ();
 if (m.type == PERSISTENT)
 alloc_persistent = (top_mark_stack().type == PERSISTENT);
 else
 error ("unmark_persistent: wrong mark type on 'mark_stack'");
 }
/* safe_malloc -- Unix 'malloc' wrapped inside test for sufficient memory */
Pointer safe_malloc (size_t size)
 {
 Pointer memory;
 static long num_blocks = 0;
 static long total_space = 0;
 /* allocate block, including header for link in 'marked_block_list' */
 memory = malloc (size + sizeof (Pointer));
 num_blocks++;
 total_space += size;
 if (memory != NULL)
 {
 if (! alloc_persistent)
 push_marked_block (memory);
 /* return beginning of user data block */
 return ((char *) memory + sizeof (Pointer));
 }
 else
 error ("safe_malloc: out of memory"
 " (num_blocks = %ld, total_space = %ld) \n",
 num_blocks, total_space );
 }
/* safe_free -- Unix 'free' with first byte of block set to zero */
void safe_free (void *p)

 {
 * (char *) p = (char) 0;
 /* free block, including header for link in 'marked_block_list' */
 free ((char* ) p - sizeof (Pointer));
 }
/* free_object -- free memory for Object and recursively for its components */
void free_object (Object obj)
 {
 if (marked_block_list != NULL)
 error ("free_object: can't free if 'marked_block_list' not empty");
 if (obj == NULL)
 return;
 else
 switch (type(obj))
 {
 case SYMBOL:
 return;
 break;
 case STRING:
 case INTEGER:
 case FUNCTION:
 break;
 case PAIR:
 free_object (first(obj));
 free_object (but_first(obj));
 break;
 case VECTOR:
 error ("free_object: VECTOR objects not implemented yet");
 break;
 case TOKEN:
 safe_free (token(obj)->lexeme);
 break;
 default:
 error ("free_object: not standard object");
 break;
 }
 safe_free (obj);
 }
/* copy_object -- copy Object and its components recursively */
Object copy_object (Object obj)
 {
 if (obj == NULL)
 return (NULL);
 switch (type(obj))
 {
 case SYMBOL:
 return (obj);
 case STRING:
 return (make_string (string(obj)));
 case INTEGER:
 return (make_integer (integer(obj)));
 case FUNCTION:
 return (make_function (function(obj)));
 case PAIR:
 return (first_put (copy_object (first(obj)),
 copy_object (but_first(obj)) ));
 case VECTOR:
 error ("copy_object: VECTOR objects not implemented yet");
 case TOKEN:

 return (make_token (token(obj)->type, token(obj)->lexeme ));
 default:
 error ("copy_object: not standard object");
 }
 }
/* persistent_copy_object -- 'copy_object' wrapped in 'mark_persistent' */
Object persistent_copy_object (Object obj)
 {
 Object result;
 mark_persistent ();
 result = copy_object (obj);
 unmark_persistent ();
 return (result);
 }






[LISTING SEVEN]

/* File REPL.C of 11-Feb-91 / Copyright (C) 1991 by Daniel N. Ozick */
/* REPL: A Simplified Lisp-Style Read-Evaluate-Print Loop or
A Tiny Lisp Interpreter
REPL is a simple interactive program intended to demonstrate some of the
features of The Lisp-Style Library for C. At the DOS > prompt, it READs user
input and attempts to convert that input into an internal Object. Then it
EVALuates the input Object as a Lisp expression according to the rules below.
Finally, it PRINTs the external representation of the result of evaluating the
input Object, and prompts for more input. This LOOP continues until either an
error occurs or the user interrupts it with control-C or control-Break.
Lisp expressions are evaluated as follows: 1. The empty list evaluates to
itself. 2. A symbol evaluates to its symbol_value. 3. Strings and integers
evaluate to themselves. 4. A list whose first element is the symbol quote
evaluates to the second element of the list. 5. A list whose first element is
a symbol whose symbol_value is a function evaluates to the result of applying
that function to the (recursively) evaluated elements of the rest of the list.
"Impure" Lisp-style functions--those that have non-Object inputs or output--
cannot be used in the Tiny Lisp Interpreter. These functions are for_each, map
(for which pmap is the equivalent "pure" version), map_no_nils, nth, length,
and index. In addition, the interpreter cannot handle macros such as first,
but_first, push, pop and the is_ type predicates. To create the REPL
executable file, link the compiled versions of LISP.C, SYMBOLS.C, LEXER.C,
MEMORY.C, and REPL.C. The required header files are LISP.H and I-SYMS.H. The
Lisp-Style Library and this program have been compiled and tested using
Microsoft C 6.0 under PC-DOS 3.3. */

/* Include Files */
#include <stdio.h>
#include "lisp.h"
/** Variables **/
/* quote -- marker SYMBOL for quoted-expression special form in 'eval' */
Object quote;
/** Macros **/
/* declare_function -- set up a SYMBOL whose value is FUNCTION (same name) */
#define declare_function(name) \
 install_with_value (#name, make_function ((Function) name))
/** Functions **/

/* integers -- return the list of INTEGERs 'n1' through 'n2' inclusive */
Object integers (Object n1, Object n2)
 {
 int i;
 Object result;
 result = NULL;
 for (i = integer (n1); i <= integer (n2); i++)
 result = first_put (make_integer (i), result);
 return (reverse (result));
 }
/* sum -- return (as an INTEGER) the sum of a list of INTEGERs */
Object sum (Object list)
 {
 int sum;
 sum = 0;
 while (list != NULL)
 {
 sum += integer (first (list));
 list = but_first (list);
 }
 return (make_integer (sum));
 }
/* square -- return (as an INTEGER) the square of an INTEGER */
Object square (Object n)
 {
 return (make_integer (integer (n) * integer (n)));
 }
/* The following function is the "purified" version of map. The library's map
has a non-Object input and can't be used in the Tiny Lisp Interpreter. Similar
purifications can be made for other impure functions in The Lisp-Style Library
for C. */
/* pmap -- apply a function to each element of a list, put results in list */
Object pmap (Object f, Object list)
 {
 Object output;
 output = NULL;
 while (list != NULL)
 {
 output = first_put ((*function(f)) (first (list)), output);
 list = but_first (list);
 }
 return (reverse (output));
 }
/* install_function_symbols -- set up some symbols for read-eval-print loop */
void install_function_symbols (void)
 {
 /* pure Object functions from LISP.C */
 declare_function (first_put);
 declare_function (last_put);
 declare_function (reverse);
 declare_function (list);
 declare_function (append);
 declare_function (flatten);
 declare_function (flatten_no_nils);
 declare_function (is_member);
 declare_function (assoc);
 /* pure Object functions from REPL.C (examples for Tiny Interpreter) */
 declare_function (integers);
 declare_function (sum);
 declare_function (square);

 declare_function (pmap);
 }
/* apply -- apply a ("pure" Object) FUNCTION to a list of args (max of 8) */
Object apply (Object f, Object args)
 {
 return ((*function (f)) (nth (args, 0), nth (args, 1),
 nth (args, 2), nth (args, 3),
 nth (args, 4), nth (args, 5),
 nth (args, 6), nth (args, 7) ));
 }
/* eval -- evaluate a Lisp-syntax expression (see notes above) */
Object eval (Object expr)
 {
 Object first_element, f;
 /* () is self-evaluating */
 if (is_null (expr))
 return (expr);
 /* symbol ==> symbol's value, other atoms are self-evaluating */
 else if (is_atom (expr))
 {
 if (is_symbol (expr))
 return (symbol_value (expr));
 else
 return (expr);
 }
 /* lists are function applications or quoted expressions */
 else if (is_pair (expr))
 {
 first_element = first (expr);
 if (first_element == quote)
 return (first (but_first (expr)));
 if (is_symbol (first_element))
 f = symbol_value (first_element);
 else
 error ("eval: first element of list is not a symbol");
 if (is_function (f))
 return (apply (f, map (eval, but_first (expr))));
 else
 error ("eval: symbol value is not a function");
 }
 }
/* main (REPL) -- interactive read-eval-print loop (Tiny Lisp Interpreter) */
int main (int argc, char *argv[])
 {
 printf ("A Tiny Lisp Interpreter using the Lisp-Style Library for C \n");
 printf ("Copyright (C) 1991 by Daniel N. Ozick \n\n");
 /* initialize internal symbol tables and read-tables */
 mark_persistent ();
 install_internal_symbols ();
 init_internal_read_table ();
 set_internal_reader ();
 install_function_symbols ();
 quote = intern ("quote");
 unmark_persistent ();
 /* do read-eval-print loop until user interrupt */
 while (TRUE)
 {
 mark ();
 printf ("\n> ");

 write_object (eval (read_object ()));
 free_to_mark ();
 }
 /* return "no errors" */
 return (0);
 }
























































August, 1991
GENERIC CONTAINERS IN C++


Implementing generic packages without parameterized types -- difficult, but
not impossible




Andrew Davidson


Andrew received his M.S. degree in System Science from SUNY in Binghamton,
N.Y. He has been programming for over 11 years and is interested in
mathematical modeling, optimization, and computer simulations. He can be
reached at 115 Filbert Road, Woodside, CA 94062, or at aed@netcom.UUCP.


A common problem in object-oriented design is creating and controlling
collections, sets, or groups of objects. As object class designers, we'd
ideally like to focus on the atomic aspects of the object abstraction,
ignoring the secondary problem of maintaining and controlling collections of
these objects. More importantly, we'd like to implement this control structure
in a generic and reusable fashion.
This article presents a method for creating generic lists of objects in C++
with code developed on an Intel 80486, using GNU g++ 1.37 under SCO Unix
System V, Release 3.2.


Generic Objects


Most algorithms described in books on data structures might be considered
"generic" because they work for many types of data -- they're
type-independent. Unfortunately, these books are more often than not written
with procedural languages in mind, and we miss the generic (and code
reusability) aspects of these algorithms. When we need one of these data
structures, we usually have to reimplement the algorithm and data structure to
fit our particular need.
To write a generic container class in any language, the language must provide
some means of writing a general type-independent algorithm. One common
complaint about C++ is that it does not support parameterized types, thereby
making it difficult to create generic classes.
Parameterized types would allow you to easily write generic functions such as
Max( ), which returns the greater of two values. Max( ) would contain the
statement return( a > b ? a : b) and would work for any class or data type.
(Future versions of C++ are supposed to support parameterized types.)
Currently, you can use overloading to create a Max( ) function for any class
you may need, but that means rewriting the Max( ) function for each new class.
The basic logic does not change, only the parameter list of the function.
Using function overloading in this example does not give us the code
reusability or genericity we are after, though it does make the calling code
lexically simpler. Even if code reuse were not an issue, overloading
still would not help, as it works only for functions, not classes.
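
To make the duplication concrete, here is a minimal sketch of the overloading approach. The Money class and its members are hypothetical, invented only for this illustration; the point is that every Max( ) body repeats the same expression and only the parameter list changes.

```cpp
#include <cassert>

// Hypothetical class used only for illustration; not part of the article's code.
class Money {
    long cents;
public:
    explicit Money(long c) : cents(c) {}
    long value() const { return cents; }
    bool operator>(const Money &other) const { return cents > other.cents; }
};

// One overload per type: the bodies are textually identical.
int Max(int a, int b) { return a > b ? a : b; }
double Max(double a, double b) { return a > b ? a : b; }
Money Max(const Money &a, const Money &b) { return a > b ? a : b; }

// Small self-check that exercises each overload.
void checkMax() {
    assert(Max(3, 7) == 7);
    assert(Max(2.5, 1.5) == 2.5);
    assert(Max(Money(10), Money(25)).value() == 25);
}
```

Adding a fourth type means writing a fourth copy of the same one-line body, which is the cost the text describes.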


Inheriting the Problem


The $100 question, then, is how do we create a generic container class without
parameterized types? In his book A C++ Toolkit, Jonathan S. Shapiro presents a
design for a generic list based on inheritance. The only objection I have to
this design is that it requires the class designer to decide what sort of
containers the class will work with. Rather than place the burden of
clairvoyance on the class designer, I would rather defer this decision to the
user of the class. This frees the class designer to focus on the atomic
aspects of object abstraction and ignore the secondary problem of how to
maintain and control collections of these objects. It is very difficult to
know how a class will be used in the future.
There is also another problem with creating container classes based on
inheritance in C++. As a consequence of C++'s strong type checking, the
container class will return a pointer or reference to a container, instead of
to the object the container class is supposed to control. To get around this
problem, you can cast the returned pointer to the class of the object, but you
should not have to do this.
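
The cast problem can be sketched as follows. This is not Shapiro's design; Link, LinkList, Coin, and firstAmount are hypothetical names chosen to show the general shape of an inheritance-based container and the cast its users end up writing.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical base class: anything stored in the list must derive from it.
class Link {
public:
    Link *next;
    Link() : next(NULL) {}
    virtual ~Link() {}
};

// Hypothetical container that knows only about Link, not about derived types.
class LinkList {
    Link *hd;
public:
    LinkList() : hd(NULL) {}
    void insert(Link &l) { l.next = hd; hd = &l; }
    Link *head() const { return hd; }   // returns the base type
};

// The class designer had to decide up front that coins live in a LinkList.
class Coin : public Link {
public:
    double amount;
    explicit Coin(double a) : amount(a) {}
};

double firstAmount(const LinkList &list) {
    // The user must cast back to the real type to reach Coin members.
    Coin *c = static_cast<Coin *>(list.head());
    return c ? c->amount : 0.0;
}
```

Both objections from the text show up here: Coin is tied to one container through its base class, and every retrieval needs a cast that the compiler cannot check.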
Stephen Dewhurst and Kathy Stark (Programming in C++) also present an almost
fully generic design for linked lists using typedefs to support various data
types. The problem with this approach is that their generic list will support
only one type of class in any given program. For example, assume we have two
classes -- one representing coins, and the other representing colors. It
should be possible to have a list of coins and a separate list of colors in
the same file scope, without the coin class and the color class having to
share a common base class.


Solving the Generic Problem


The design I present here is a more generic version of Dewhurst and Stark's
generic list and gets around the restriction of one list type per file scope,
so that separate lists of different classes can coexist. Dewhurst and Stark
implement the linked list using item,
list, and iter classes. The item class contains a pointer to the next item and
a reference to the object to be controlled using the list. The list class
contains a pointer to the head of the list and member functions for clearing,
appending, and inserting objects into the list. The iter class is used to
provide a control abstraction, which retains the state of the iteration from
invocation to invocation. The test programs described later illustrate how to
use these three classes.
Generic.h, which is included in C++ 2.0, contains a set of macro functions
that can be expanded to create a unique set of container class definitions for
each class to be controlled. The macro functions will entirely encapsulate and
hide the implementation of the container class. Specifically, Generic.h
contains a name2() macro function that takes two arguments and concatenates
them. For example, name2(foo, blort) expands to be fooblort. name2() is used
to create a unique name for each type of list required. For each generic class
needed, define a macro function that will take one argument representing the
class to be controlled. Listing One (page 124) #defines the GenList class.
Once defined, the GenList macro can create a generic list of coins using
GenList(coin) listOfCoins;. The C preprocessor (CPP) will convert this to
coinGenList listOfCoins;.
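
The naming step can be tried in isolation. Because generic.h may not be available with every compiler, the sketch below assumes a stand-in macro, NAME2, built on the preprocessor's token-pasting operator; NAME2, STRINGIZE, the placeholder struct body, and expandedName are assumptions made for illustration, not the contents of the real header.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for generic.h's name2(): paste two tokens into one identifier.
// NAME2 is a hypothetical name; the real header's definition may differ.
#define NAME2(a, b) a##b

// Helpers that turn the expanded identifier into a string for inspection.
#define STRINGIZE_(x) #x
#define STRINGIZE(x) STRINGIZE_(x)

// Same shape as the article's naming macro.
#define GenList(CLASS_TAG) NAME2(CLASS_TAG, GenList)

// Declaring a type through the macro really declares 'coinGenList'.
struct GenList(coin) { int count; };

// Returns the identifier that GenList(coin) expands to, as text.
const char *expandedName() { return STRINGIZE(GenList(coin)); }
```

After this, GenList(coin) and coinGenList are interchangeable spellings of the same type name.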
With the name part of the problem solved, all that's left is defining the data
members and member functions for GenList(CLASS_TAG). Generic.h provides
another macro function, declare( ), to help with this problem. To declare the
class definition for the generic list class of coins, use
declare(GenList,coin);. CPP will expand this into GenListdeclare(coin);, a
programmer-defined macro function. In fact, this is where you'll write all
generic algorithms pertaining to the list class. GenListdeclare( ) is typical
of the macro function names generated by declare( ); its form is shown in
Figure 1. In examining Figure 1, recall that the backslash character (\) is
used to continue the macro on the next line. Anytime I need the name of the
generic class, I use the corresponding name macro function.
Figure 1: The GenListdeclare macro function

 #define GenListdeclare(CLASS_TAG) \
 class GenList(CLASS_TAG) \
 { \
 private: \
 public: \
 /* the default constructor */ \
 GenList(CLASS_TAG)() \
 { \
 ... \
 } \
 }
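
The dispatch from declare( ) to a programmer-defined macro can be sketched on a smaller class than the list. NAME2 and DECLARE below are hypothetical stand-ins for the generic.h macros, and the fixed-size Stack class is invented for this example; only the mechanism (paste the generic name onto "declare", then invoke the result) follows the text.

```cpp
#include <cassert>

// Hypothetical stand-ins for generic.h's name2() and declare().
#define NAME2(a, b) a##b
#define DECLARE(GENERIC, CLASS_TAG) NAME2(GENERIC, declare)(CLASS_TAG)

// Name macro: Stack(int) becomes intStack.
#define Stack(CLASS_TAG) NAME2(CLASS_TAG, Stack)

// Programmer-defined macro that DECLARE(Stack, T) expands into.
// Every use of the class name goes through the Stack() name macro.
#define Stackdeclare(CLASS_TAG) \
class Stack(CLASS_TAG) \
{ \
    CLASS_TAG data[8]; \
    int top; \
public: \
    Stack(CLASS_TAG)() : top(0) {} \
    void push(CLASS_TAG v) { if (top < 8) data[top++] = v; } \
    CLASS_TAG pop() { return data[--top]; } \
    int size() const { return top; } \
}

// Two independent classes, intStack and doubleStack, in the same file scope.
DECLARE(Stack, int);
DECLARE(Stack, double);
```

Each DECLARE line produces a complete, separately named class, which is what lets lists of unrelated classes coexist in one file.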

At this point, you should have enough background to browse Listing One. I have
also provided test1.cc (see Listing Two, page 124), coin.hh (Listing Three,
page 124), and coin.cc ( Listing Four, page 125) to provide an example of
actual use. Also provided is preproc.cc (Listing Five, page 125), which shows
how the actual macro function is expanded for the coin class.



Improvements


I implemented the generic list using references. You may choose to keep a copy
of the object in your version of the item class. I decided to use references
because my application controls static objects. By using references, I save
memory. (I have only one copy of the object instead of possibly two.) The code
should also run faster because there will be fewer calls to the object's
constructors and destructors. Keep in mind that every time you put an object
on the stack, the compiler automatically makes a call to the object's copy
constructor. If the object is derived, this can add significantly to the
number of function calls made.
A disadvantage of using references instead of maintaining a copy is that it's
possible to have dangling references. Before an object in the list is
destroyed, it must be removed from the list. Failure to do so will render the
list useless. The test program gives an example of this. This can easily occur
if, for example, two lists are set equal to one another and then passed to
different sub-systems.
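
The remove-before-destroy discipline can be sketched as follows. RefList is a hypothetical, array-based stand-in that stores addresses rather than references; it has the same lifetime hazard as the list in this article, but it is not the GenList code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical element type for illustration.
struct Coin { double amount; };

// Minimal stand-in for a list that refers to objects it does not own.
class RefList {
    Coin *items[4];
    int n;
public:
    RefList() : n(0) {}
    void append(Coin &c) { if (n < 4) items[n++] = &c; }
    void remove(Coin &c) {
        for (int i = 0; i < n; i++)
            if (items[i] == &c) {          // match by identity
                items[i] = items[--n];     // fill the hole with the last entry
                return;
            }
    }
    double total() const {
        double t = 0.0;
        for (int i = 0; i < n; i++) t += items[i]->amount;
        return t;
    }
    int size() const { return n; }
};

double totalAfterSafeDelete() {
    Coin penny = {0.01};
    Coin *quarter = new Coin;
    quarter->amount = 0.25;
    RefList list;
    list.append(penny);
    list.append(*quarter);
    list.remove(*quarter);   // detach first ...
    delete quarter;          // ... then destroy
    return list.total();     // only live objects are visited
}
```

Swapping the remove and delete lines would leave the list pointing at freed memory, which is the failure the text warns about.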
Users of the generic list may choose to split the declare macro into two. One
macro would define the generic class data members and member functions. The
other macro function would implement the actual member functions. There are
two advantages to this approach. The first advantage is that this version will
work with C++ compilers that are based on cfront. The second advantage is that
you will eliminate link errors that occur when two modules try to use generic
lists on the same class of objects.
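
A minimal sketch of the two-macro split follows. It assumes stand-in macros NAME2, DECLARE, and IMPLEMENT (generic.h is usually described as offering a matching implement( ) macro, but treat that as an assumption here), and it uses a small hypothetical Accum class instead of the list.

```cpp
#include <cassert>

// Hypothetical stand-ins for the generic.h macros.
#define NAME2(a, b) a##b
#define DECLARE(GENERIC, CLASS_TAG) NAME2(GENERIC, declare)(CLASS_TAG)
#define IMPLEMENT(GENERIC, CLASS_TAG) NAME2(GENERIC, implement)(CLASS_TAG)

// Name macro: Accum(int) becomes intAccum.
#define Accum(CLASS_TAG) NAME2(CLASS_TAG, Accum)

// Goes in a header: class definition with member declarations only.
#define Accumdeclare(CLASS_TAG) \
class Accum(CLASS_TAG) \
{ \
    CLASS_TAG sum; \
public: \
    Accum(CLASS_TAG)(); \
    void add(CLASS_TAG v); \
    CLASS_TAG total() const; \
}

// Goes in exactly one source file: the member function bodies.
#define Accumimplement(CLASS_TAG) \
Accum(CLASS_TAG)::Accum(CLASS_TAG)() : sum(0) {} \
void Accum(CLASS_TAG)::add(CLASS_TAG v) { sum += v; } \
CLASS_TAG Accum(CLASS_TAG)::total() const { return sum; }

// Every module that uses intAccum repeats the DECLARE line;
// only one module contains the IMPLEMENT line.
DECLARE(Accum, int);
IMPLEMENT(Accum, int)
```

Because the function bodies are emitted once, two modules can both declare the class without producing duplicate definitions at link time.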
There are several member functions that you could add to the generic list
family of classes. Dewhurst and Stark mention the possible addition of an
apply function in which there is a member or friend function that you wish to
"apply" to all of the objects in the list. The user of the list can easily
implement this in a few lines of code using the iter class.
I found that adding the apply function to the list class can become unruly.
For example, I had to overload the apply function to handle member and friend
functions that take a variable number of arguments. If you plan to implement
this feature, declare the pointers to the functions as taking a variable
number of arguments and provide two versions of the apply function: one for
the member functions, the other for the friend functions.
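
For the simplest case, a member function that takes no arguments, a user-level apply can be sketched in a few lines. Coin, CoinIter, CoinAction, and apply below are hypothetical names; the iterator only imitates the calling convention of the iter class (operator( ) returns the next element, or NULL at the end) and walks a plain array rather than a GenList.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical element class; polish() stands in for any member function.
class Coin {
    int shine;
public:
    Coin() : shine(0) {}
    void polish() { shine++; }
    int level() const { return shine; }
};

// Minimal iterator in the style of the article's iter class:
// operator() returns the next element, or NULL at the end.
class CoinIter {
    Coin *cur;
    Coin *end;
public:
    CoinIter(Coin *first, int count) : cur(first), end(first + count) {}
    Coin *operator()() { return cur != end ? cur++ : NULL; }
};

// A typedef keeps the pointer-to-member-function parameter readable.
typedef void (Coin::*CoinAction)();

// apply: call one member function on every element the iterator yields.
void apply(CoinIter next, CoinAction action) {
    Coin *c;
    while ((c = next()) != NULL)
        (c->*action)();
}
```

A call such as apply(CoinIter(coins, 3), &Coin::polish) visits each element once. Supporting friend functions or extra arguments needs additional overloads, which is where the approach starts to become unruly.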
Of course, there's always a flip side to the coin, and in this case the
disadvantage is that you lose the benefits of type checking. Nevertheless, the
generic list container class presented in this article should provide a good
starting point -- and may this wheel never be invented again.


Endnotes


Jonathan S. Shapiro, A C++ Toolkit. Englewood Cliffs, N.J.: Prentice-Hall,
1991.
Stephen C. Dewhurst and Kathy T. Stark, Programming in C++. Englewood Cliffs,
N.J.: Prentice Hall, 1989.


References


Lippman, Stanley B. C++ Primer. Reading, Mass.: Addison-Wesley, 1989.
Stubbs, Daniel F. and Webre, Neil W. Data Structures with Abstract Data Types
and Pascal. Monterey, Calif.: Brooks/ Cole Publishing Company, 1985.
_GENERIC CONTAINERS IN C++_
by Andrew Davidson


[LISTING ONE]

/*
 * Generic list
 *
 * by A. E. Davidson 12/90
 *
 * Overview
 *
 * USES
 * to declare a generic list of class
 * foo items and a generic list of class
 * bar items
 *
 * #include "genericlt.hh"
 * DeclareList(foo);
 * DeclareList(bar);
 *
 * main()
 * {
 * class foo;
 * class bar;
 * GenList(foo) genListOfFoo;
 * GenList(bar) genListOfBar;
 * GenIter
 *
 * REQUIREMENTS / ASSUMPTIONS
 * The generic list does not manage
 * the items memory. It is up to the
 * user to free them
 *

 * the function GenList::removeItem()
 * takes a pointer to a member function.
 * this function should return true if found,
 * else false. GenList::removeItem() will not
 * compile on my machine if the member
 * function is inlined. it gives a signal 6 error message
 *
 * NOTES
 * never use new to create a list or an iter!
 *
 *
 */

#include <generic.h>

#ifndef GENERIC_LIST_HH
#define GENERIC_LIST_HH

/*
 * these macros should be used
 * anywhere you need to refer to
 * the generic list, item, or iter classes
 *
 * PTAMF: Pointer to a Member Function
 */
#define PTAMF(CLASS_TAG) name2(CLASS_TAG,PTAMF)
#define GenItem(CLASS_TAG) name2(CLASS_TAG,GenItem)
#define GenList(CLASS_TAG) name2(CLASS_TAG,GenList)
#define GenIter(CLASS_TAG) name2(CLASS_TAG,GenIter)

/*----------------------------- class Item ---------------------------*/
/*
 * GenItem(CLASS_TAG) is a private
 * class. It can only be created by
 * the member functions of GenList(CLASS_TAG)
 * and GenIter(CLASS_TAG)
 */

#define GenItemdeclare(CLASS_TAG) \
class GenItem(CLASS_TAG) \
{ \
 GenItem(CLASS_TAG) *next; \
 CLASS_TAG &item; \
 GenItem(CLASS_TAG)(CLASS_TAG &i) : item(i) \
 {next = NULL;} \
 GenItem(CLASS_TAG)(CLASS_TAG &i, GenItem(CLASS_TAG) *n) : item(i) \
 {next = n; } \
 ~GenItem(CLASS_TAG)() \
 {;} \
 friend GenList(CLASS_TAG); \
 friend GenIter(CLASS_TAG); \
}

/*---------------------------- class List ---------------------------*/
#define GenListdeclare(CLASS_TAG) \
class GenList(CLASS_TAG) \
{ \
 GenItem(CLASS_TAG) *hd; \
 public: \

 GenList(CLASS_TAG)(GenItem(CLASS_TAG) *n = NULL) \
 { hd = n;} \
 GenList(CLASS_TAG)(const CLASS_TAG &i) \
 {hd = NULL; insert(i);} \
 ~GenList(CLASS_TAG)() \
 {Clear();} \
 void Clear(void) \
 { \
 GenItem(CLASS_TAG) *pt; \
 while (hd) \
 { \
 pt = hd; \
 hd = hd->next; \
 delete pt; \
 } \
 } \
 GenList(CLASS_TAG)(const GenList(CLASS_TAG) &seq) {hd = seq.hd;} \
 GenList(CLASS_TAG) operator = (const GenList(CLASS_TAG) &other) \
 { \
 hd = other.hd; \
 return *this; \
 } \
 void insert(CLASS_TAG &i){ hd = new GenItem(CLASS_TAG)(i, hd); } \
 void append(CLASS_TAG &i) \
 { \
 for (GenItem(CLASS_TAG) *pt = hd; pt && pt ->next; pt = pt->next) \
 ; \
 if (pt) \
 { \
 GenItem(CLASS_TAG) *tmp = new GenItem(CLASS_TAG)(i); \
 pt->next = tmp; \
 } \
 else insert(i); \
 } \
 \
 void removeItem(PTAMF(CLASS_TAG) found, CLASS_TAG &obj) \
 { \
 GenItem(CLASS_TAG) *prev, *rem; \
 \
 prev = NULL; \
 rem = hd; \
 while( rem && !(obj.*found)(rem->item) ) \
 { \
 prev = rem; \
 rem = rem->next; \
 } \
 if (rem) \
 { \
 if (prev) \
 prev->next = rem->next; \
 else \
 hd = rem->next; \
 delete rem; \
 } \
 } \
 friend GenIter(CLASS_TAG); \
}




/*------------------------------- class Iter ---------------------------*/
 /*
 * iterate over entire list
 * CLASS_TAG *operator()()
 */

#define GenIterdeclare(CLASS_TAG) \
class GenIter(CLASS_TAG) \
{ \
 GenItem(CLASS_TAG) *current; \
 public: \
 GenIter(CLASS_TAG)(GenList(CLASS_TAG) &ilist) \
 { current = ilist.hd; } \
 GenIter(CLASS_TAG)(GenIter(CLASS_TAG) &other) \
 { current = other.current; } \
 ~GenIter(CLASS_TAG)() \
 {;} \
 GenIter(CLASS_TAG) operator = (GenList(CLASS_TAG) &ilist) \
 { current = ilist.hd; return *this; } \
 CLASS_TAG *operator()() \
 { \
 if (current) \
 { \
 GenItem(CLASS_TAG) *tmp = current; \
 current = current->next; \
 return &tmp->item; \
 } \
 return NULL; \
 } \
}


/*
 * macro that creates all the generic types
 * provided for ease of use
 *
 * for some unknown reason my compiler can't handle a
 * function parameter that is a pointer to a member function
 * It can deal with it if the pointer is declared using
 * a typedef
 */
#define DeclareList(CLASS_TAG) \
 typedef int (CLASS_TAG::*PTAMF(CLASS_TAG))(CLASS_TAG &); \
 class GenList(CLASS_TAG); \
 class GenIter(CLASS_TAG); \
 declare(GenItem,CLASS_TAG); \
 declare(GenList,CLASS_TAG); \
 declare(GenIter,CLASS_TAG)

#endif





[LISTING TWO]



/*
 * test1.cc
 *
 * by A. E. Davidson
 *
 * Provides a driver to test the generic list
 */

#include <stream.h>
#include "coin.hh"
#include "genericlt.hh"

/*
 * use the declare macros to
 * allow creation of list of desired types
 */

DeclareList(coin);

/*
 * prototypes of functions that use the generic list
 * types must come after DeclareList()
 */
void displayAndTotal(GenIter(coin) next_coin);

main()
{
 /*--- create some coins -------*/
 coin c1 = penny;
 coin c2 = nickel;
 coin c3 = dime;
 coin c4 = quarter;

 /*------ create a list of coins -----*/
 GenList(coin) list_of_coins;
 list_of_coins.append(c1);
 list_of_coins.append(c2);
 list_of_coins.append(c3);
 list_of_coins.append(c4);

 /*------- display the list of coins and their total ------*/
 displayAndTotal(list_of_coins);

 /*-------------- remove one of the coins --------------*/
 cout << "\n\n list after removing coin c2 \n";
 list_of_coins.removeItem(&coin::found, c2);
 displayAndTotal(list_of_coins);


 /*
 * remember: c2 has been removed from the list but it still exists
 */
 cout << "\n\n coin c2 still exists, it was only removed from the list \n";
 cout << "coin: " << c2;


#ifdef NEVER

 /*

 * this example shows a design flaw
 * with the generic list assignment operator
 *
 * if you delete an object but do not remove it
 * from the list first you will get a core dump.
 * The list will contain a dangling reference
 *
 * this is because I chose to implement
 * the list using references instead of copying
 * the objects. See the discussion at the end of the
 * article
 */
 coin *c5 = new coin(quarter);

 list_of_coins.append(*c5);
 delete c5;
 displayAndTotal(list_of_coins);

#endif
}

/*
 * this function illustrates
 * how to use the GenIter class
 *
 * notice that the parameter list expects
 * an iter object, but I always pass a list
 * object
 */
void displayAndTotal(GenIter(coin) next_coin)
{
 double total = 0.0;
 coin *tmp;

 while ((tmp = next_coin()))
 {
 /*
 * coins know how to convert themselves to doubles
 */
 total += *tmp;
 /*
 * coins also know how to display themselves
 */
 cout << "coin: " << *tmp << "\ttotal: " << total <<"\n";
 }
}





[LISTING THREE]

/*
 * coin class
 *
 * by A. E. Davidson
 *
 * USES

 * provides a simple class that can be
 * used to illustrate the operation of
 * the generic list
 *
 * a coin may be a penny, nickel, dime, or quarter
 */

#ifndef COIN_HH
#define COIN_HH

enum coin_type {penny, nickel, dime, quarter};


class coin
{
 coin_type unit;
 double amount;

 public:
 coin()
 {unit = penny; amount = 0.01;}
 coin( coin_type type);
 coin(const coin &other)
 {unit = other.unit; amount = other.amount;}
 ~coin(){;}
 coin& operator = (const coin &other)
 { unit = other.unit; amount = other.amount; return *this;}
 friend ostream& operator << (ostream &os, coin &c);
 operator double ()
 {return amount;}

 /*
 * this function is intended to be
 * used with GenList(CLASS_TAG)::removeItem()
 * I get a compile error if I try to inline this
 * function
 */
 int found(const coin &other);
};


#endif





[LISTING FOUR]


#include "stream.h"
#include "coin.hh"

char *coin_name[] = {"penny", "nickel", "dime", "quarter"} ;

/*
 * convenient look up table
 * keeps from having to duplicate case statements any
 * time you need to work with unit data member

 */
static struct
{
 coin_type kind;
 double amount;
} table [] =
 {
 {penny, 0.01},
 {nickel, 0.05},
 {dime, 0.10},
 {quarter, 0.25},
 {quarter, 0.0}, /* end of the table */
 };

coin::coin(coin_type type)
{
 unit = type;
 int i;
 /* stop at the matching entry or at the 0.0 sentinel */
 for (i = 0; table[i].amount != 0.0 && unit != table[i].kind; i++)
 ;
 amount = table[i].amount;
}


ostream& operator << (ostream &os, coin &c)
{
 os << coin_name[c.unit];
 return os;
}


int coin::found(const coin &other)
{
 return (unit == other.unit);
}





[LISTING FIVE]


/*
 * this is the output from CPP
 * g++ -E test1.cc
 *
 * I reformatted the output to make it
 * easier to read
 */

typedef int (coin::* coinPTAMF )(coin &);
class coinGenList ;
class coinGenIter ;

class coinGenItem
{
 coinGenItem *next;
 coin &item;


 coinGenItem (coin &i) : item(i)
 {next = 0 ;}
 coinGenItem (coin &i, coinGenItem *n) : item(i)
 {next = n; } ~coinGenItem ()
 {;}
 friend coinGenList ;
 friend coinGenIter ;
} ;

class coinGenList
{
 coinGenItem *hd;

 public:
 coinGenList (coinGenItem *n = 0 )
 { hd = n;}
 coinGenList (const coin &i)
 {hd = 0 ; insert(i);}
 ~coinGenList ()
 {Clear();}
 void Clear(void)
 {
 coinGenItem *pt;

 while (hd)
 {
 pt = hd;
 hd = hd->next;
 delete pt;
 }
 }
 coinGenList (const coinGenList &seq)
 {hd = seq.hd;}
 coinGenList operator = (const coinGenList &other)
 { hd = other.hd; return *this; }
 void insert(coin &i)
 { hd = new coinGenItem (i, hd); }
 void append(coin &i)
 {
 for (coinGenItem *pt = hd; pt && pt ->next; pt = pt->next)
 ;
 if (pt)
 {
 coinGenItem *tmp = new coinGenItem (i);
 pt->next = tmp;
 }
 else
 insert(i);
 }
 void removeItem( coinPTAMF found, coin &obj)
 {
 coinGenItem *prev, *rem;

 prev = 0 ;
 rem = hd;
 while( rem && !(obj.*found)(rem->item) )
 {
 prev = rem;
 rem = rem->next;

 }
 if (rem)
 {
 if (prev)
 prev->next = rem->next;
 else
 hd = rem->next;
 delete rem;
 }
 }
 friend coinGenIter ;
} ;

class coinGenIter
{
 coinGenItem *current;

 public:
 coinGenIter (coinGenList &ilist)
 { current = ilist.hd; }
 coinGenIter (coinGenIter &other)
 { current = other.current; }
 ~coinGenIter () {;}
 coinGenIter operator = (coinGenList &ilist)
 { current = ilist.hd; return *this; }
 coin *operator()()
 {
 if (current)
 {
 coinGenItem *tmp = current;
 current = current->next;
 return &tmp->item;
 } return 0 ;
 }
} ;





void displayAndTotal(coinGenIter next_coin);

main()
{

 coin c1 = penny;
 coin c2 = nickel;
 coin c3 = dime;
 coin c4 = quarter;


 coinGenList list_of_coins;
 list_of_coins.append(c1);
 list_of_coins.append(c2);
 list_of_coins.append(c3);
 list_of_coins.append(c4);


 displayAndTotal(list_of_coins);



 cout << "\n\n list after removing coin c2 \n";
 list_of_coins.removeItem(&coin::found, c2);
 displayAndTotal(list_of_coins);





 cout << "\n\n coin c2 still exists, it was only removed from the list \n";
 cout << "coin: " << c2;


# 78 "test1.cc"

}









void displayAndTotal(coinGenIter next_coin)
{
 double total = 0.0;
 coin *tmp;

 while ((tmp = next_coin()))
 {



 total += *tmp;



 cout << "coin: " << *tmp << "\ttotal: " << total <<"\n";
 }
}







[MAKEFILE]

CC= g++
OBJS= coin.o test1.o
SRCS = coin.cc test1.cc
LIBS= -lg++ -lm
INCLUDE= -I/n/catserv/usr1/tools/sun4/usr/local/lib/g++-include

.SUFFIXES: .cc


.cc.o:
 $(CC) -c -g $<

coinTest : $(OBJS) coin.hh genericlt.hh
 $(CC) -o $@ -g $(OBJS) $(LIBS)

clean :
 rm -f coinTest *.o


#
# notes
#

#
# $@ the name of the current target
#

#
# $< the name of a dependency file, derived as if selected
# for use with an implicit rule
#

depend :
 makedepend -- $(CFLAGS) -- $(INCLUDE) -- $(SRCS)
# DO NOT DELETE THIS LINE -- make depend depends on it.

coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/stream.h
coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/ostream.h
coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/File.h
coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/builtin.h
coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/stddef.h
coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/std.h
coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/stdio.h
coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/math.h
coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/values.h
coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/streambuf.h
coin.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/istream.h
coin.o: coin.hh
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/stream.h
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/ostream.h
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/File.h
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/builtin.h
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/stddef.h
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/std.h
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/stdio.h
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/math.h
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/values.h
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/streambuf.h
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/istream.h
test1.o: coin.hh genericlt.hh
test1.o: /n/catserv/usr1/tools/sun4/usr/local/lib/g++-include/generic.h









August, 1991
PORTING UNIX TO THE 386: THE BASIC KERNEL


Overview and initialization


 This article contains the following executables: 386BSD.891


William Frederick Jolitz and Lynne Greer Jolitz


Bill was the principal developer of 2.8 and 2.9BSD and was the chief architect
of National Semiconductor's GENIX project, the first virtual memory
microprocessor-based UNIX system. Prior to establishing TeleMuse, a market
research firm, Lynne was vice president of marketing at Symmetric Computer
Systems. They conduct seminars on BSD, ISDN, and TCP/IP. Send e-mail questions
or comments to lynne@berkeley.edu. Copyright (c) 1991 TeleMuse.


In the previous article we examined the machine-dependent layer initialization
of the "stripped-down" kernel -- the machine-dependent portion of the kernel
which installs the kernel into the position to execute processes (via the
bootstrap procedure) and prepares the system for initialization of the minimum
machine-independent portions of the kernel (processes, files, and pertinent
tables). We viewed our 386BSD kernel as a kind of "virtual machine" (not to be
confused with the "virtual" in "virtual memory"), where functions underlie
other functions transparently. When initialized, the system can use portions
that require little direction to initialize even larger portions. Thus, this
virtual machine assembles itself tool by tool, much like a set of Russian
dolls. The machine-dependent kernel initialization is the innermost of the
dolls -- the kernel of the kernel around which all is built.
We now extend the layered model further, by incrementally turning on all its
internal services using the kernel's main( ) procedure. In other words, this
next outer layer will be built by the kernel's main( ) procedure, which in
turn initializes higher-level portions of the kernel. This is the second major
milestone of our UNIX port -- the halcyon point where most of the kernel
services and data structures are initialized.
At this stage, we'll review key elements of the BSD kernel which will be
invoked in future articles. We'll briefly examine the interrelationships
between some of these elements, in order to delineate the broader picture a
bit more and illuminate some important ideas in UNIX system design.


Layered Modeling: Achieving a Well-Stacked System


A basic understanding of the entire system demands a return to the layered
model described last month. In brief, the kernel is a program which runs in
supervisor mode on the 386 (or "Ring 0" -- for a review on rings, see "The
Standalone System" DDJ March 1991). The kernel implements the primitives,
called "system calls," of UNIX and manages the environment and other
characteristics of the many user "processes" run to provide functionality to
the system. (Each user process runs in a separate "Ring 3" address space.)
Only processes running in their protected address spaces are truly visible to
the UNIX user, as they provide the requested functionality (a command
processor or "shell," a compiler, an editor, and so on). These processes
constitute the outermost layer of our layered operating system model. System
calls and various processor exceptions (page fault, device interrupt,
overflow, and so on) are methods by which a process either directly or
indirectly enters the UNIX kernel to request services. In this way, the
kernel acts as a
transparent (not statically-linked) subroutine library that functions as a
kind of virtual machine. It's as if the microprocessor hardware itself
actually did a whole read( ) system call request in the single lcall
instruction used to signal a system call. (For further information, see
Leffler et al., The Design and Implementation of the 4.3BSD UNIX Operating
System, Chapter 3: Kernel Services, pages 43-45, Addison-Wesley, 1989.)
Layers within the kernel are split into the (mostly) machine-independent "top"
half, and the (mostly) machine-dependent "bottom" half. The top half
synchronously processes exceptions and system calls and blocks a currently
executing process when an event causes a delay, such as a temporary memory
shortage, or a device input/output operation. Blocking a process permits the
kernel to run another delayed process, allowing multiple processes to appear
to run concurrently on a single processor. The bottom half, in contrast,
asynchronously processes device interrupts, whose handlers are never allowed
to block. Device interrupts can be viewed, therefore, as high-priority,
real-time tasks brought to life by a hardware interrupt to render the
necessary effect and then exit the stage. They then return the kernel from the
interrupt back to whatever code was running before the interrupt occurred. In
a way, device interrupts are practically stateless and serve primarily to
signal the occurrence of an external event to the synchronous "top" layers.
Note, however, that such notifications will only take effect when the top
layers allow preemption -- in the UNIX model, this is only allowed at certain
points when operating in the "top" layers (generally, when returning to the
user process from the kernel, or when blocking for an event).
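The deferral rule just described can be pictured with a small model. This is
a sketch only, not 386BSD code: the names intr_sketch and
return_to_user_sketch, and the want_resched flag as used here, are
assumptions made for illustration. The interrupt routine merely records that
something happened; the switch itself is taken later, at one of the points
where the top half permits preemption.

```c
#include <assert.h>

static volatile int want_resched;  /* set by the bottom half, cleared by the top */
static int switches;               /* counts context switches actually taken */

/* Bottom half: runs at interrupt time, must not block, only posts a note. */
void intr_sketch(void)
{
    want_resched = 1;
}

/* Top half: called at a permitted preemption point, such as the return to
 * user mode. Returns 1 if a switch was taken, 0 otherwise. */
int return_to_user_sketch(void)
{
    if (!want_resched)
        return 0;
    assert(want_resched == 1);  /* the flag is only ever 0 or 1 */
    want_resched = 0;
    switches++;                 /* a real kernel would switch processes here */
    return 1;
}
```

Note that two interrupts arriving before the next preemption point collapse
into a single switch in this model, since the flag carries no count.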
The impact of the layered model on our 386BSD port cannot be overstated. Our
386BSD system can be broken up into modular subsystems that have neat
boundaries and work by a handful of simple rules. One rule which we live by is
the aforementioned low-level asynchronous, top-level synchronous arrangement.
For example, if we describe some code that must block for a resource, we are
already restricted to a discussion of the top layer. Likewise, if we describe
an event that occurs as a result of a peripheral completing an operation, rest
assured it resides within the lower layers. This organization allows us to
break the whole into parts we can handle; otherwise we quickly become mired in
complexity.
Understanding and following the rules inherent in our layered model greatly
simplifies UNIX kernel design. Without these rules, we would need a lot more
"critical region" code dealing with arbitrary preemption. These same rules,
however, also limit our ability to easily implement UNIX in real-time
environments. (For example, Ethernet delays are sometimes unpredictable.) Some
versions of UNIX attempt to improve real-time performance by minimizing
worst-case delays through the judicious addition of blocks to allow
high-priority processes to run, but this is not a simple fix.
There is reason to believe that the synchronous design of UNIX limits its
performance with disk file writes. In a paper presented at the Summer 1990
USENIX Technical Conference, "Why Aren't Operating Systems Getting Faster as
Fast as Hardware?" (USENIX Technical Proceedings, page 247) John Ousterhout of
the University of California at Berkeley discusses the need to "decouple
filesystem...operations that modify files," as synchronous writes are required
for filesystem stability, consistency, and crash recovery. This is currently
not a great problem, because filesystem reads (the majority of operations) can
be elegantly cached and anticipated. Still, this raises questions about the
current UNIX model and may result in its revision.


Top-Level Layers


Last month, we discussed bottom-level initialization, where the Interrupt
Descriptor Table (IDT), a 386 hardware interface to interrupts and exceptions,
was wired into code entry points (IDTVEC (XXX)). This is how 386BSD glues the
hardware interrupts onto the bottom layer. In addition, some of the top-layer
interfaces were also established. Now we need to build and initialize the
other top-layer kernel functions in our BSD kernel main( ). With the kernel
initialized, we must get the ball rolling by bootstrapping user processes to
add functionality in the form of services (such as a command processor that
will allow useful work to be done with our 386BSD system).
To implement the UNIX model, we refer to many items -- all of which are
managed by the top layers and referenced by lower layers. These include
processes, address spaces, files, filesystems, buffers, messages, signals,
credentials, and others. They are grouped into a global set managed by the
kernel on behalf of all processes, and a private set managed by the process
for the benefit of the program running within the process.


The Global Kernel Set


The global set of objects is split into a shared database (proc, inode, buffer
cache, and file structures), as well as a group of consumable resources
(memory pages, data buffers, and heap storage). The shared database objects
use methods by which items are searched, contended for, modified, allocated,
deleted, and linked together; all in a preemptible fashion, since many
processes may attempt simultaneous access to these objects during multitasking
operation. These databases must be initialized, allocated the appropriate
minimum requirements for operation, and linked as the system requires.
The UNIX paradigm is "process-centric," that is, most of the activity is built
around the current running process. With the exception of necessary functions
such as scheduling or interprocess communications (which require knowledge of
multiple processes), most of the kernel is written without any explicit
knowledge of any process save the running one. Thus, an understanding of how a
process is provided services tells you the bulk of what the kernel does. Very
little of the code and data structures are explicitly aimed at this "global"
view.
The focus of kernel activity is the list of processes (the "proc table").
Processes are linked into various lists so they can locate other processes
through various relationships. As the kernel operates, processes migrate onto
and off of different lists. While the process structure is not globally
allocated, the list of entries is itself a global resource. Each system call
and exception is directed to operate on a given process. As such, the BSD
kernel uses the struct proc entry of each process as the key data structure
that indexes all related kernel entities of a process.


Process Private Set


Each process possesses a number of data structures which can be leveraged to
properly implement the UNIX model. So many of these are required that we
reduce them, for simplicity's sake, to a given set upon which we draw in
our discussions. All these properties of the process are rooted in the
"per-process data structure," also known as struct proc or "proc slot" (see
Listing One, page 126). This is just one element of the previously mentioned
list of processes which defines just what a process is.


The Proc Slot


In Figure 1, 386BSD uses a proc slot as the nexus of information for a
process. Many different structures, most of them dynamically allocated, hang
off this single proc entry (see Figure 2). These may be, in different cases,
shared by processes, dynamically grown, or externalized to special
applications. Among the auxiliary structures are:
p_cred This structure is the process's credentials, that is, the information
(such as user ID number and group membership) used to regulate access to
system resources by the process. This information is managed in a generic
fashion by most of the kernel and is consulted by a tiny, centralized portion
of the kernel (so that additional security control mechanisms can be added or
substituted). It is shared by sibling processes of like ownership.

p_fd Each process has a private file descriptor table: a dynamically
allocated, growable structure used to store information on files currently
open by the process. (Older versions of UNIX had a static limit on the number
of open files, usually 20.)
p_stats Statistics on the use of various resources consumed by the process.
For example, the amount of time the processor used, the memory used, and
various other details are tallied by this structure.
p_limits Analogous to statistic recording on the process, this auxiliary
structure is used to put administrative limits on resource utilization.
p_vmspace Another critical resource for the process is contained in the
virtual address space, details of which can be found within each process's
p_vmspace data structure. Among the data available is the virtual memory
system's address map (vm_map) which heads a table of address map entries. Each
entry, in turn, manages allocated regions of virtual address space. Also, each
process contains a physical map (vm_pmap) structure, managed by the pmap
layer, containing current address translation state information (discussed
later in the "Virtual Memory Subsystem" section).
p_sigacts POSIX process signals, a kind of software interrupt for user
processes, are implemented with the signal action state information in this
structure.
p_pgrp POSIX provides for the concept of "sessions" as a method of organizing
process groups. Process groups are a set of processes operating together (for
example, a pipeline such as "foo | bar | bletch"). Sessions utilize a session
leader (usually a command processor or shell) that manages process groups. It
has the ability to suspend or resume process groups run connected to a
terminal (in the "foreground") or detached from the terminal (in the
"background"). The data structures used to manage this feature reside in this
shared data structure.
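As a rough picture of how these auxiliary structures hang off a proc slot,
consider the following sketch. It is not the structure from Listing One: the
member types are hypothetical, much-reduced stand-ins, and
share_cred_sketch is an invented helper that shows only how one auxiliary
structure (the credentials) might be shared between two processes by
reference counting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-ins for the auxiliary structures. */
struct pcred    { int uid, gid; int refcnt; };  /* credentials, may be shared */
struct filedesc { int nfiles; };                /* open-file table */
struct pstats   { long cpu_ticks; };            /* resource usage tallies */
struct plimit   { long max_cpu_ticks; };        /* administrative limits */
struct vmspace  { int placeholder; };           /* virtual address space */
struct sigacts  { int placeholder; };           /* signal action state */
struct pgrp     { int pgid; };                  /* process group and session */

struct proc {
    int              p_pid;
    const void      *p_wchan;    /* event being waited on, if any */
    struct pcred    *p_cred;
    struct filedesc *p_fd;
    struct pstats   *p_stats;
    struct plimit   *p_limits;
    struct vmspace  *p_vmspace;
    struct sigacts  *p_sigacts;
    struct pgrp     *p_pgrp;
};

/* Let a child share its parent's credentials rather than copy them. */
void share_cred_sketch(struct proc *child, struct proc *parent)
{
    assert(parent->p_cred != NULL);
    child->p_cred = parent->p_cred;
    child->p_cred->refcnt++;
}
```

Because each facility is reached through its own pointer, any one of them
can be replaced or grown without touching the others, which is the modularity
the proc slot is meant to provide.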
The proc structure in 386BSD highlights the modularity of function present in
the BSD design. Although BSD is currently implemented as a monolithic kernel,
it can be arranged so that multithreaded distributed kernel operation can be
achieved. In general, BSD kernel development has been focused around the
revision and examination of the monolithic operating system kernel prior to
implementation in a multithreaded kernel. This approach seeks to avoid putting
the cart before the horse, so to speak, and avoids vacuous "modularity"
modifications which purport to work only in a multiprocessor environment. This
is not reticence in design -- merely caution.
Multiprocessor systems are desirable, so the pent-up enthusiasm to take
advantage of them can overwhelm the many research directions available and
result in the canonization of inappropriate or short-sighted standards.
Current standards efforts are making headway, although the overall
multiprocessor architecture is still unknown. (For example, some POSIX groups
are attempting to define a standard for thread programming, and currently the
most popular standard is one contrary to UNIX primitives, because the group
touting this standard would rather ignore UNIX. This will result in another
pointless standard taking its place alongside the dodo and other dead-end
events of history.)
Due to the way this arrangement results in "data hiding," the facilities of
filesystems, accounting, administration, virtual memory, and POSIX signal
processing are each separated from the inner part of the operating system's
kernel. Each can be evolved separately or redefined with minimal interaction,
as befits a modular design.


Kernel Events


Processes operate synchronously, processing a system call item by item. If
they need to wait for either a resource or an external event, they must block
with a sleep( ) function call to await changes and give up the processor.
Elsewhere in the kernel, a corresponding wakeup( ) function call will awaken
the snoozing process, preparing the process to run when next possible. While
sleep gives up the processor, wakeup schedules processes to run -- it does not
transfer to processes nor even ensure that the process will ever run. Wakeup
calls are idempotent. Many can occur before the process actually starts to
run.
A process can only wait on a single event at a time, and that event is usually
uniquely identified by the kernel address of the object for which the process
waits. This event
is stored in the p_wchan field of the process's proc slot. Events themselves
don't require additional space when active, so secondary or recursive effects
(as might happen in the case of a block on memory starvation) don't occur.
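The bookkeeping behind this event mechanism can be modeled in a few lines.
The sketch below is an illustration under stated assumptions, not the kernel's
sleep( ) and wakeup( ): the names sketch_sleep and sketch_wakeup, the
four-entry table, and the two-state enum are all invented here, and no real
context switch takes place. It shows only that an event is nothing more than
an address stored in p_wchan, and that a repeated wakeup is harmless.

```c
#include <assert.h>
#include <stddef.h>

enum pstate { P_RUN, P_SLEEP };

struct proc {
    enum pstate p_stat;
    const void *p_wchan;   /* address of the awaited object, or NULL */
};

#define NPROC 4
static struct proc proc_table[NPROC];   /* zero-initialized: all runnable */

/* Record the event and mark the process asleep. A real kernel would also
 * give up the processor here; this model only tracks state. */
void sketch_sleep(struct proc *p, const void *chan)
{
    assert(chan != NULL);
    p->p_wchan = chan;
    p->p_stat = P_SLEEP;
}

/* Make every process sleeping on chan runnable again and return how many
 * were awakened. It schedules only; it does not run anything. */
int sketch_wakeup(const void *chan)
{
    int i, n = 0;

    for (i = 0; i < NPROC; i++) {
        struct proc *p = &proc_table[i];
        if (p->p_stat == P_SLEEP && p->p_wchan == chan) {
            p->p_wchan = NULL;
            p->p_stat = P_RUN;
            n++;
        }
    }
    return n;
}
```

Since the event is identified by an address that already exists, posting or
awaiting it allocates nothing, which is why the mechanism remains usable even
when memory is scarce.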
As the 386BSD system and its drivers are all written with these event
mechanisms in place, we are potentially multitasking from the start, although
until we replicate (or "fork") to create multiple processes, no actual context
switches occur to different processes. (There are no other processes to switch
to.) Instead, the processor is allowed to idle, waiting for events. In the
UNIX perspective, we always try to organize the general case so that it
functions seamlessly on initialization, in order to leverage it early. An
example of this approach can be observed in the mechanisms that provide
diskless operation, where we must provide a root filesystem over a network
connection before we have a filesystem to run the programs that normally
initialize the network and locate the filesystem on the network. (Got it?
Good.)


Machine-Independent Initialization


Machine-independent initialization is begun by wiring up a "process zero." In
the previous article, we took care in the assembly language initialization to
craft a separate region for the kernel stack -- this will be our "0th" process
kernel stack. We then commence the creation and attachment of the necessary
auxiliary data structures that process 0 will use during the lifetime of our
system. No process is specially considered, so all must have these structures
present and consistent with other structures in the kernel. To avoid recursive
problems with the "virgin" birth, the first process must be hand-wired with
the barest of necessities, and space for the auxiliary structures must be
allocated statically. In fact, we will find that process 0 will attempt to
become eternal, so it's actually more costly to dynamically allocate space for
it than to do so statically!
Having made a 0th process, we now must create a process list to which the
system can refer in a global fashion, to locate, add, delete, and modify
processes. This is not really complicated, because at this point all of the
queue pointers point either at our just-born process 0 or at "nil." At this
stage, all process-related operations can now be activated, although only for
statically-allocated processes (which is not very interesting -- we need to
turn on the virtual memory and storage allocation functions for something more
useful).
UNIX likes to have access to herds of processes, many appearing to run
simultaneously, to do its bidding. As a result, we need to rapidly flit between
processes running for a brief slice of time before blocking. To make this a
low-cost operation, we use a priority-ordered run queue of process pointers to
rapidly select the next process to run when it's time to switch. This is now
initialized to permit the context switch code to be run (as it will be called
when we block for I/O operations).
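One simple way to picture a priority-ordered run queue is a linked list kept
sorted on insertion, so that choosing the next process is just a matter of
taking the head. The sketch below assumes that arrangement; the names
setrq_sketch and nextproc_sketch, the p_pri and p_link fields as used here,
and the convention that a lower number means a more urgent process are
assumptions for illustration, and the real kernel's queue layout may differ.

```c
#include <assert.h>
#include <stddef.h>

struct proc {
    int          p_pid;
    int          p_pri;    /* lower value = more urgent (assumed convention) */
    struct proc *p_link;   /* next process on the run queue */
};

static struct proc *runq;  /* head of the queue: the most urgent process */

/* Insert p in priority order. Entries of equal priority stay in arrival
 * order, so equally urgent processes take turns. */
void setrq_sketch(struct proc *p)
{
    struct proc **pp = &runq;

    assert(p != NULL);
    while (*pp != NULL && (*pp)->p_pri <= p->p_pri)
        pp = &(*pp)->p_link;
    p->p_link = *pp;
    *pp = p;
}

/* Remove and return the most urgent process, or NULL if none is runnable. */
struct proc *nextproc_sketch(void)
{
    struct proc *p = runq;

    if (p != NULL) {
        runq = p->p_link;
        p->p_link = NULL;
    }
    return p;
}
```

The cost of ordering is paid at insertion time, which keeps the selection
made during a context switch as cheap as possible.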


Virtual Memory Subsystem


As mentioned in the previous article, 386BSD has been rewritten to use a new
virtual memory system with greater capabilities. This new package, derived
from MACH version 2, possesses generalized mechanisms which allow management
of multiple regions of virtual memory within the user processes and kernel
itself, thus avoiding the arbitrary and idiosyncratic methods used in earlier
Berkeley UNIX virtual memory systems. This "new vm" is composed of
machine-dependent (physical map) and machine-independent (virtual map)
portions.
This new virtual memory system was originally conceived in 1985 at
Carnegie-Mellon University by Avadis Tevanian (now at NeXT) and Michael Wayne
Young to provide an easily retargetable virtual memory system with the modern
functionality required by the MACH operating system implementation. It serves
as the basis for virtual memory systems in current MACH implementations,
OSF/1, and Berkeley UNIX.
To initialize the virtual memory system, all remaining pages of physical
memory (not occupied by the kernel program itself) are each first allocated a
resident page data structure (vm_page). Queues of free pages are created so
that pages can be allocated from them.
Next, virtual memory objects are created to provide an abstraction on which to
hang collections of physical pages. To allocate virtual address space, virtual
memory maps are also created to identify valid regions of virtual memory and
the characteristics of these regions. The virtual memory system will associate
virtual memory objects containing physical pages of memory with portions of
address space mapped by a virtual memory map, as needed.
We then initialize the kernel's virtual address map and provide a mechanism to
allocate portions of "wired down" memory to the kernel's address space with a
function called kmem_alloc. This function, the most primitive of storage
allocators, allows us to allocate pages of memory dynamically in the
granularity of pages at a time.
With a memory allocator present, the initialization of the physical map (pmap)
portion of the system is completed, allocating tables that will be used by the
physical map module to track the association of physical pages of memory with
the hardware address translation mechanism's data structures (Page Directory
Table and Page Table Pages on the 386). At this point, the virtual memory
system can allocate multiple address spaces and, on a fault, supply physical
pages for legitimate references to previously mapped virtual map regions.
In designing a virtual memory system, the common drawback is the inherent
complexity required. Not only does the system have to allocate virtual address
space, but it also needs to allocate pages of memory to "back up" the virtual
space. On some systems, the virtual memory system allocates space, grabs some
pages, and manually wires them into the address translation map. With the new
386BSD virtual memory system, when you ask for memory from a memory allocator,
both virtual and physical memory are allocated. In other words, you always get
the memory in an address space.
Another point to consider when designing virtual memory systems: Suppose we
share the same pages in different processes. We may wish to "back up" shared
pages that might be modified incrementally -- thus, unique pages need be
created only when the contents of a page change. This mechanism, called "copy
on write," allows us to postpone or avoid entirely modifying a process's
memory. Only a mechanism to track changes is required. This is accomplished by
copying virtual memory objects that shadow the original object.
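The shadowing idea can be sketched with a toy object that holds a handful of
tiny pages. Everything here is an assumption for illustration (the
vm_object_sketch type, the obj_read and obj_write helpers, the 16-byte page
size); the real virtual memory objects are far richer. A read walks down the
shadow chain until it finds the page, while the first write to a page copies
it into the topmost object and leaves the original untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NPAGES    4
#define PAGE_SIZE 16   /* tiny pages keep the example small */

struct vm_object_sketch {
    char *pages[NPAGES];               /* NULL = page not held by this object */
    struct vm_object_sketch *shadow;   /* object being shadowed, or NULL */
};

/* Find a page by walking the shadow chain; NULL if nobody holds it. */
const char *obj_read(const struct vm_object_sketch *o, int idx)
{
    assert(idx >= 0 && idx < NPAGES);
    for (; o != NULL; o = o->shadow)
        if (o->pages[idx] != NULL)
            return o->pages[idx];
    return NULL;
}

/* Return a writable page, making a private copy on the first write. */
char *obj_write(struct vm_object_sketch *o, int idx)
{
    assert(idx >= 0 && idx < NPAGES);
    if (o->pages[idx] == NULL) {
        const char *src = obj_read(o->shadow, idx);
        char *copy = calloc(1, PAGE_SIZE);

        if (copy == NULL)
            return NULL;
        if (src != NULL)
            memcpy(copy, src, PAGE_SIZE);
        o->pages[idx] = copy;
    }
    return o->pages[idx];
}
```

Pages that are never written are never copied, so two processes sharing an
object pay only for the pages on which they actually diverge.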
To complete the initialization of the virtual memory system, we must now
initialize and activate "pagers," the software that reads in the contents of
pages from the filesystem and stages pages in and out of processor memory to
disk when we run short of "fast" storage. Pagers interface with external forms
of information, such as local filesystems, disk swap partitions, disk drives,
network filesystems, and the like.


Kernel Memory Allocator


Besides allocating pages of memory from the virtual memory system, we need a
means of allocating smaller granularity objects. Many data structures,
with both short and long lifetimes and generally on the order of 32 bytes in
size, are allocated by the kernel on an "as needed" basis. UNIX provides user
processes with a malloc( ) memory allocator for general-purpose memory
allocation; the same type of function resides in the BSD kernel. This provides
for a global heap store -- so called because everything is kept in a heap, all
piled together!
Kernel malloc( ) uses the virtual memory system to obtain actual storage to
manage (called an "arena"). This storage area encompasses the heap itself.
After the vm system has been activated, we initialize our allocator. From this
point on, we can dynamically allocate data structures. Older versions of BSD
used statically allocated tables that minimally required the system to be
patched and rebooted if a resource was overutilized -- sometimes the system
even had to be recompiled from its source code. With dynamic allocation, the
configuration can be changed on a live system and the effect observed
immediately.
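A small-object allocator of this general kind can be sketched as follows.
The design shown, with power-of-two size buckets and a free list per bucket,
is an assumption chosen for illustration and is not taken from the 386BSD
sources; likewise the names kmalloc_sketch and kfree_sketch, the 16-byte
minimum, and the static 8K arena standing in for storage obtained from the
virtual memory system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MINSHIFT   4      /* smallest bucket is 16 bytes (assumed) */
#define NBUCKETS   8      /* buckets of 16, 32, ... 2048 bytes */
#define ARENA_SIZE 8192   /* stand-in for pages from the vm system */

static union {
    unsigned char bytes[ARENA_SIZE];
    long double   align_ld;   /* forces generous alignment of the arena */
    void         *align_p;
} arena;
static size_t arena_used;
static void  *freelist[NBUCKETS];

/* Index of the smallest bucket whose blocks can hold size bytes. */
static int bucket_index(size_t size)
{
    int i = 0;
    size_t cap = (size_t)1 << MINSHIFT;

    while (cap < size) {
        cap <<= 1;
        i++;
    }
    return i;
}

void *kmalloc_sketch(size_t size)
{
    size_t cap;
    void *p;
    int b;

    if (size == 0 || size > ((size_t)1 << (MINSHIFT + NBUCKETS - 1)))
        return NULL;
    b = bucket_index(size);
    if (freelist[b] != NULL) {                    /* reuse a freed block */
        p = freelist[b];
        memcpy(&freelist[b], p, sizeof(void *));  /* pop: head = stored link */
        return p;
    }
    cap = (size_t)1 << (MINSHIFT + b);            /* else carve from the arena */
    if (arena_used + cap > ARENA_SIZE)
        return NULL;
    p = &arena.bytes[arena_used];
    arena_used += cap;
    return p;
}

void kfree_sketch(void *p, size_t size)
{
    int b = bucket_index(size);

    assert(p != NULL && b < NBUCKETS);
    memcpy(p, &freelist[b], sizeof(void *));      /* push: store old head */
    freelist[b] = p;
}
```

In this sketch a 25-byte request and a 32-byte request land in the same
bucket, so a block freed by one can be handed straight to the other without
any searching.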


Device Startup


Once enough of the system services are established, we can proceed to scale
and configure tables appropriate for operation, among them the buffer cache
and character list (clist) structures. While these are usually similar on most
systems, a few have private buffer memory pools associated with devices (such
as a disk array with onboard RAM) that should be specially arranged prior to
system operation. Currently, the amount of disk buffering memory is chosen as
a fixed percentage of memory at boot time, but work is underway to allow a
more dynamic allocation scheme.
Next, we configure( ) devices in the system by walking a table of devices --
calling each device driver's probe routine with the parameters for each device
and testing for the presence of each recorded device. Not all devices need be
present. In fact, alternative addresses may be recorded for the same device.
If the device is present, a probe routine will return true, with a subsequent
call to the corresponding attach( ) routine to allocate resources (memory,
interrupts, and so on) for the device and wire it into 386BSD. (In future
articles, we will discuss how 386BSD dynamically structures the interrupt
control devices on-the-fly.)
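The table walk performed at configuration time can be outlined as below.
This is a sketch under assumptions rather than the 386BSD configure( ) code:
the device_sketch structure, its fields, and the stub probe and attach
routines are hypothetical, and real probe routines would touch the hardware
at the recorded address instead of returning a fixed answer.

```c
#include <assert.h>
#include <stddef.h>

struct device_sketch {
    const char *name;      /* driver name (hypothetical) */
    unsigned    ioaddr;    /* recorded address to test */
    int       (*probe)(const struct device_sketch *);
    void      (*attach)(struct device_sketch *);
    int         alive;     /* set once the device has been attached */
};

/* Stub routines standing in for real hardware tests. */
static int probe_present(const struct device_sketch *d) { (void)d; return 1; }
static int probe_absent(const struct device_sketch *d)  { (void)d; return 0; }
static void attach_generic(struct device_sketch *d)     { d->alive = 1; }

/* Walk the table: probe each recorded device and attach the ones that
 * answer. Returns the number of devices found. */
int configure_sketch(struct device_sketch *tab, size_t n)
{
    size_t i;
    int found = 0;

    assert(tab != NULL);
    for (i = 0; i < n; i++) {
        if (tab[i].probe(&tab[i])) {
            tab[i].attach(&tab[i]);
            found++;
        }
    }
    return found;
}
```

A table may list the same driver at several alternative addresses; entries
whose probe fails are simply skipped, so absent devices cost nothing beyond
the probe itself.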
After cpu_startup( ), the system begins to schedule processes. We allow for
this by enabling the rescheduling clock. This clock periodically interrupts
the kernel and adjusts the priority of other processes that might compete for
use of the processor.


Mounting the Root



We next initialize the virtual filesystem layer. We make our first reference
to it by mounting the root filesystem and marking it as the top-level point
from which to resolve filename references. The root filesystem, like other
filesystems, can be of many different types. However, as this request is
honored by code that calls successively lower-layer functions, we ultimately
get to the bottom layers in the form of a device driver that extracts from the
disk or network the external information of the filesystem on which all files
are stored. If the root filesystem cannot be located, 386BSD abruptly
terminates.


Final Machine Initialization


Our final machine initialization step is to split process 0 into three
processes (see Figure 1). This is done by creating separate copies of initial
process 0 with the fork1( ) kernel service. fork1( ) implements the "replicate
process" functionality used by the UNIX fork( ) system call. After being
copied twice (creating process 1 and 2 -- both blocked), process 0 will call
the scheduling function sched( ), which endlessly selects processes to shuttle
in and out of secondary storage. In essence, it also manages to enforce a
"fairness" policy on running executable processes present in RAM. If
sched( ) finds nothing to do (as it will at this stage of the system's life),
it will block, waiting to wake up when things need to be shuffled again.
When process 0 blocks, process 1, which has been patiently waiting since
fork1( ) was invoked, can be run. Process 1 is then furnished a user address
space with a tiny user program inserted into it. Control is then transferred
to this user process, whose first action is to execute a file on the root
filesystem (/sbin/init). Thus, our tiny bootstrap program, wired into the
kernel, pulls
in a much larger UNIX program located in the root. Even better, the init
program is created with the same tools, operates in the same protected
fashion, and functions with the same system calls as any UNIX program. This
means we can use the richness of the program environment to build a more
elaborate degree of functionality as the system boots itself up. At some
point, however, process 1 will block (perhaps waiting for the disk to find a
block of data for init). At this point, another process can be run.
Process 2, yet another copy of process 0, is given the chance to run at this
point. It will immediately call the pageout( ) function, the sole purpose of
which is to scout out pages of underutilized memory (that is, held by some
process, but not being used). This compulsive little function varies its
activity depending on the amount of unused memory available. If little memory
is available, it rapidly bails water, forcing pages of processes out to
secondary storage (swap space) to prevent the system from becoming constipated
due to lack of memory. If plenty of memory exists (as it does at the start of
system boot up), it blocks waiting for a more desperate time.
Processes 0 and 2 are system processes that only run in the kernel -- as
endlessly looping functions, they provide a special service when awakened.
Process 1, on the other hand, is an ordinary user process running code loaded
from the root filesystem. Among other niceties that our init program provides,
it offers a command interpreter through the use of the UNIX fork( ) and
exec( ) primitives. In Figure 2, for example, fork( ) and execve( ) system
calls are
successively used to replicate a new process (process 3) and execute the
default command interpreter (or shell) /bin/sh. In turn, the shell will follow
the same mechanism to create more processes and fill them with programs the
user requests. The thick grey line in Figure 2 delineates the state of the
world by the end of main( ) in the kernel. The asterisk represents the point
where the first user instruction is executed, while below the line all
remaining initialization, done by user processes, occurs.
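Seen from user level, the replicate-then-execute pattern looks roughly like
the following. This is a generic POSIX sketch, not code from init: the helper
name spawn_and_wait is an assumption, and waiting for the child to finish is
done here only so that the result can be observed.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Replicate the calling process, run the program at path in the copy, and
 * return the child's exit status (or -1 on failure). */
int spawn_and_wait(const char *path, char *const argv[])
{
    int status;
    pid_t pid;

    assert(path != NULL && argv != NULL);
    pid = fork();
    if (pid < 0)
        return -1;                     /* could not replicate */
    if (pid == 0) {
        execve(path, argv, environ);   /* replaces the child's program */
        _exit(127);                    /* reached only if execve fails */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling it with the path /bin/sh and the arguments "sh", "-c",
"exit 7" yields 7 on a system where that shell is installed. The shell uses
the same two primitives to start each command it is given.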
The two system processes provide a synchronous mechanism (remember, the high
layers are synchronous) to rectify resource imbalances. By possessing the
complete resources of a process, each can use the kernel's versatility,
including blocking operations that the asynchronous lower layer routines are
forbidden to use (such as requesting disk I/O).


Summary


In this article, we have just touched on the layout of our generic 386BSD
system (4.3 < x < 4.4), and introduced many of the mechanisms, data
structures, and relationships between them. Our point is not to provide
exhaustive descriptions of the operation of BSD in general, but to provide
enough background to understand the operation of 386-related code, as well as
design choices.
To accomplish this task, we've purposely not described much of the detail of
the various BSD subsystems; it is sufficient at this point if you have
obtained some notion of what they are and why we need to turn them on in the
order that we do. In conducting a port, one actually makes it through this
body of code pretty quickly. It is the ticklish operations of fork, exec, and
process context switching that get the first shakedown journey and surprises.
Also, when the kernel design has been refined, and much of this code revised,
this area continues to present challenges.
In the next article, we will leave the hand-waving descriptions of process
switching behind and dig into some actual code. In particular, we shall
examine sleep( ), wakeup( ), and swtch( ), and how the three of these bring
off the illusion of multiple simultaneous process execution on a sole
processor. We will also delve into why the UNIX paradigm shifts comparatively
easily when it comes to multitasking, and why it's been such a long uphill
climb to move others (notably MS-DOS and Finder) into preemptible
multitasking. Finally, we will discuss some of the requirements for the
extensions needed to support multiprocessor and multithreaded operation in the
monolithic 386BSD kernel.


386BSD Availability


The Computer Systems Research Group at the University of California Berkeley
has announced that the BSD Networking Software Release 2--which includes
386BSD--is now available for licensing. The distribution is a source
distribution only, and does not contain program binaries for any architecture.
Thus it is not possible to compile or run this software without a preexisting
system that is installed and running. In addition, the distribution does not
include sources for a complete system. It includes source code and manual
pages for the C library and approximately three-fourths of the utilities
distributed as part of 4.3BSD-Reno. The software distribution is provided on
1/2-inch 9-track tape and 8mm cassette only. For specific information, contact
the Distribution Coordinator, CSRG, Computer Science Division, EECS,
University of California, Berkeley, CA 94720 or bsd-dist@CS.Berkeley.EDU or
uunet!bsd-dist@CS.Berkeley.EDU.


_PORTING UNIX TO THE 386: THE BASIC KERNEL_
by William Frederick Jolitz and Lynne Greer Jolitz


[LISTING ONE]

/* Copyright (c) 1986, 1989, 1991 The Regents of the University of California.
 * All rights reserved.
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 * notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 * notice, this list of conditions and the following disclaimer in the
 * documentation and/or other materials provided with the distribution.
 * 3. All advertising materials mentioning features or use of this software
 * must display the following acknowledgement:
 * This product includes software developed by the University of
 * California, Berkeley and its contributors.
 * 4. Neither the name of the University nor the names of its contributors
 * may be used to endorse or promote products derived from this software
 * without specific prior written permission.
 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 * @(#)proc.h 7.28 (Berkeley) 5/30/91
 */

#ifndef _PROC_H_
#define _PROC_H_

#include <machine/proc.h> /* machine-dependent proc substruct */

/* One structure allocated per session. */
struct session {
 int s_count; /* ref cnt; pgrps in session */
 struct proc *s_leader; /* session leader */
 struct vnode *s_ttyvp; /* vnode of controlling terminal */
 struct tty *s_ttyp; /* controlling terminal */
 char s_login[MAXLOGNAME]; /* setlogin() name */
};
/* One structure allocated per process group. */
struct pgrp {
 struct pgrp *pg_hforw; /* forward link in hash bucket */
 struct proc *pg_mem; /* pointer to pgrp members */
 struct session *pg_session; /* pointer to session */
 pid_t pg_id; /* pgrp id */
 int pg_jobc; /* # procs qualifying pgrp for job control */
};
/* Description of a process. This structure contains information needed to
 * manage a thread of control, known in UNIX as a process; it has references
 * to substructures containing descriptions of things that process uses, but
 * may share with related processes. Process structure and substructures are
 * always addressible except for those marked "(PROC ONLY)" below, which might
 * be addressible only on a processor on which the process is running. */
struct proc {
 struct proc *p_link; /* doubly-linked run/sleep queue */
 struct proc *p_rlink;
 struct proc *p_nxt; /* linked list of active procs */
 struct proc **p_prev; /* and zombies */
 /* substructures: */
 struct pcred *p_cred; /* process owner's identity */
 struct filedesc *p_fd; /* ptr to open files structure */
 struct pstats *p_stats; /* accounting/statistics (PROC ONLY) */
 struct plimit *p_limit; /* process limits */
 struct vmspace *p_vmspace; /* address space */
 struct sigacts *p_sigacts; /* signal actions, state (PROC ONLY) */
#define p_ucred p_cred->pc_ucred
#define p_rlimit p_limit->pl_rlimit
 int p_flag;
 char p_stat;
 pid_t p_pid; /* unique process id */
 struct proc *p_hash; /* hashed based on p_pid for kill+exit+... */
 struct proc *p_pgrpnxt; /* pointer to next process in process group */
 struct proc *p_pptr; /* pointer to process structure of parent */
 struct proc *p_osptr; /* pointer to older sibling processes */
/* The following fields are all zeroed upon creation in fork */
#define p_startzero p_ysptr
 struct proc *p_ysptr; /* pointer to younger siblings */
 struct proc *p_cptr; /* pointer to youngest living child */
 /* scheduling */
 u_int p_cpu; /* cpu usage for scheduling */
 int p_cpticks; /* ticks of cpu time */
 fixpt_t p_pctcpu; /* %cpu for this process during p_time */
 caddr_t p_wchan; /* event process is awaiting */
 u_int p_time; /* resident/nonresident time for swapping */
 u_int p_slptime; /* time since last block */
 struct itimerval p_realtimer; /* alarm timer */
 struct timeval p_utime; /* user time */
 struct timeval p_stime; /* system time */
 int p_traceflag; /* kernel trace points */
 struct vnode *p_tracep;/* trace to vnode */
 int p_sig; /* signals pending to this process */
/* end area that is zeroed on creation */
#define p_endzero p_startcopy
/* The following fields are all copied upon creation in fork */
 sigset_t p_sigmask; /* current signal mask */
#define p_startcopy p_sigmask
 sigset_t p_sigignore; /* signals being ignored */
 sigset_t p_sigcatch; /* signals being caught by user */
 u_char p_pri; /* priority, negative is high */
 u_char p_usrpri; /* user-priority based on p_cpu and p_nice */
 char p_nice; /* nice for cpu usage */
 struct pgrp *p_pgrp; /* pointer to process group */
 char p_comm[MAXCOMLEN+1];
/* end area that is copied on creation */
#define p_endcopy p_wmesg
 char *p_wmesg; /* reason for sleep */
 struct user *p_addr; /* kernel virtual addr of u-area (PROC ONLY) */
 swblk_t p_swaddr; /* disk address of u area when swapped */
 int *p_regs; /* saved registers during syscall/trap */
 struct mdproc p_md; /* any machine-dependent fields */
 u_short p_xstat; /* Exit status for wait; also stop signal */
 u_short p_acflag; /* accounting flags */
};
#define p_session p_pgrp->pg_session
#define p_pgid p_pgrp->pg_id
/* Shareable process credentials (always resident). Includes a reference to
 * current user credentials as well as real and saved ids that may be used to
 * change ids. */
struct pcred {
 struct ucred *pc_ucred; /* current credentials */
 uid_t p_ruid; /* real user id */
 uid_t p_svuid; /* saved effective user id */
 gid_t p_rgid; /* real group id */
 gid_t p_svgid; /* saved effective group id */
 int p_refcnt; /* number of references */
};
/* stat codes */
#define SSLEEP 1 /* awaiting an event */
#define SWAIT 2 /* (abandoned state) */
#define SRUN 3 /* running */
#define SIDL 4 /* intermediate state in process creation */
#define SZOMB 5 /* intermediate state in process termination */
#define SSTOP 6 /* process being traced */
/* flag codes */
#define SLOAD 0x0000001 /* in core */
#define SSYS 0x0000002 /* swapper or pager process */
#define SSINTR 0x0000004 /* sleep is interruptible */
#define SCTTY 0x0000008 /* has a controlling terminal */
#define SPPWAIT 0x0000010 /* parent is waiting for child to exec/exit */
#define SEXEC 0x0000020 /* process called exec */
#define STIMO 0x0000040 /* timing out during sleep */
#define SSEL 0x0000080 /* selecting; wakeup/waiting danger */
#define SWEXIT 0x0000100 /* working on exiting */
#define SNOCLDSTOP 0x0000200 /* no SIGCHLD when children stop */
#define STRC 0x0004000 /* process is being traced */
#define SWTED 0x0008000 /* another tracing flag */
#define SADVLCK 0x0040000 /* process may hold a POSIX advisory lock */

#ifdef KERNEL
/* We use process IDs <= PID_MAX; PID_MAX + 1 must also fit in a pid_t
 * (used to represent "no process group"). */
#define PID_MAX 30000
#define NO_PID 30001
#define PIDHASH(pid) ((pid) & pidhashmask)
#define SESS_LEADER(p) ((p)->p_session->s_leader == (p))
#define SESSHOLD(s) ((s)->s_count++)
#define SESSRELE(s) { \
 if (--(s)->s_count == 0) \
 FREE(s, M_SESSION); \
 }
extern int pidhashmask; /* in param.c */
extern struct proc *pidhash[]; /* in param.c */
struct proc *pfind(); /* find process by id */
extern struct pgrp *pgrphash[]; /* in param.c */
struct pgrp *pgfind(); /* find process group by id */
struct proc *zombproc, *allproc; /* lists of procs in various states */
extern struct proc proc0; /* process slot for swapper */
struct proc *initproc, *pageproc; /* process slots for init, pager */
extern struct proc *curproc; /* current running proc */
extern int nprocs, maxproc; /* current and max number of procs */

#define NQS 32 /* 32 run queues */
struct prochd {
 struct proc *ph_link; /* linked list of running processes */
 struct proc *ph_rlink;
} qs[NQS];
int whichqs; /* bit mask summarizing non-empty qs's */
#endif /* KERNEL */
#endif /* !_PROC_H_ */






















August, 1991
C PROGRAMMING FOR THE 68HC05 MICROCONTROLLER


High-level languages for embedded systems




Truman T. Van Sickle


Ted is an applications engineer in Motorola's Semiconductor Products Sector.
You can reach him c/o Motorola Inc., 12254 Hancock St., Carmel, IN 46032.


Whenever the topic of high-level languages and microcontrollers comes up, the
response is usually something like, "The microcontroller doesn't have enough
RAM," or "There's never enough ROM," or "Compilers don't create the tight code
needed for microcontroller operation." Of course, these statements most often
come from assembly language programmers who believe that compiler writers are
more concerned with compilers than with the final performance of the code
generated by the compiler.
Nevertheless, there are advantages to using a high-level language and compiler
for programming microcontrollers. One is comprehensibility. Code written in a
high-level language has a format derived from the problem being solved. The
program comprises a series of statements, each statement solving a small
portion of the problem without concern for the computer. Almost anyone with
some understanding of the problem can examine a high-level language program
that solves the problem and understand the intent of the program. This is
unlike assembly language, where code written by one assembly language
programmer often cannot be understood by another.
Furthermore, high-level programs can usually be written faster than assembly
language programs because high-level language programmers work with the
problem, focusing their efforts on solving the problem. They don't need to
worry about the available computer resources required to resolve the problem.
The compiler writer, on the other hand, can use all available computer
resources properly in the implementation of any program.
Portability is another advantage of high-level languages. Different machines
have completely dissimilar assembly languages. Even machines within the same
family of parts have differing assembly-level features. Compiler writers must
take great care to mask these differences so that the language will be the
same from machine to machine. Programs written in a high-level language for
one part in a family of parts should require little rework to be moved to
another family member.
In this article, I'll discuss high-level microcontroller programming, using
Motorola's 68HC05 and Byte Craft's C6805 C compiler. As an example, I'll add
time-of-day functionality to two members of the 68HC05 family, the MC68HC05J1
and MC68HC05C8.


Internal Time-of-Day Clock for the MC68HC05J1


The MC68HC05J1 is the simplest of the HC05 family of parts. Its clock is
primitive, and the time intervals at which a periodic interrupt can be
generated are not very flexible.
C code for the clock portion of such a system is shown in Listing Two (page
128). Two header files are included: HC05J1.H, which contains the
component-specific pragmas and the definitions of all bits in the timer
control status register; and GENERAL.H, which contains several macro
definitions that are useful in writing code. For example, one of the macros in
this file, #define FOREVER while(TRUE), is used to create a loop that executes
forever in the main program.
All of the variables used for this program are declared as global. Inside the
main program, several initialization statements are executed, and the
interrupts on the part are enabled. The comments in Listing Two (page 128)
explain these statements. Following the initialization is a FOREVER statement,
a loop that executes while the microcontroller is running.
Inside this loop, the variables sec, mts, and hrs are tested, incremented, and
reset. The value of sec is incremented in the interrupt service routine
__TIMER(). In the main loop, sec is tested to determine whether it has reached
60. When it equals 60, sec is reset to 0, and mts is incremented and tested to
determine whether its new value is 60. When mts is 60, it is reset to 0,
and hrs is incremented. When the incremented value of hrs is 13, hrs is reset
to 1.
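The rollover test just described can be isolated as a small function and
tried on a desktop compiler. This is only a sketch: the struct tod type and
the name tod_rollover are hypothetical, and the listing itself keeps hrs, mts,
and sec as separate global variables.

```c
/* Hypothetical container for the three clock variables. */
struct tod {
    int hrs;    /* 1..12 */
    int mts;    /* 0..59 */
    int sec;    /* 0..60; 60 means "a minute has just elapsed" */
};

/* Apply the main-loop test once: carry seconds into minutes,
 * minutes into hours, and wrap a 12-hour clock from 12 back to 1. */
void tod_rollover(struct tod *t)
{
    if (t->sec == 60) {
        t->sec = 0;
        if (++t->mts == 60) {
            t->mts = 0;
            if (++t->hrs == 13)
                t->hrs = 1;
        }
    }
}
```

Starting from 12:59:60, one call leaves the clock at 1:00:00; at any time
where sec is below 60, the call changes nothing.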
This simple clock is followed by a WAIT() statement, which places the
MC68HC05J1 into the wait mode until another interrupt occurs. The variable
locations hrs, mts, and sec contain the current time. Other routines are
required to display the time on an external device or set the time with some
type of push-button arrangement.
The timer is set up to cause a timer interrupt to occur at 8.192-millisecond
intervals. When the interrupt occurs, the Timer Overflow Flag and the
Real-Time Interrupt (RTI) Flag are both reset. Because the interrupt time
interval is fixed at a rather odd value, a simple integer count of the number
of interrupts will not provide an accurate one-second interval. A count of 122
interrupts is about one second. The error is large enough that it must be
corrected if this unit is to be used as a clock. The correction algorithm is
as follows:
1. Count 122 8.192-ms ticks per second for 13 seconds. On the 14th second
count 123 ticks. This routine provides 14.000128 seconds per indicated
14-second period.
2. Repeat the above cycle 79 times and on the 80th cycle use a cycle of 14
seconds with 122 ticks in each second. The elapsed interval of this sequence
is 1120.002048 seconds with an indicated time of 1120 seconds.
3. Finally, repeat cycle 2 three times, and on the fourth cycle drop one
8.192-ms tick to provide an indicated and elapsed time of exactly 4480
seconds.
The variables corr1, corr2, and corr3 are used to keep track of the cycles
just mentioned. The C code to implement this correction scheme is contained in
the timer interrupt service routine of Listing Two.
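A host-side model can check the three-stage correction described above. The
sketch below is written for a desktop compiler, not for C6805. The variable
names mirror the listing, but the helper ticks_until and the running total in
seconds are hypothetical additions. The model assumes that the tick counter is
preset to -1 or 1 at the end of a second, so that the following second runs
for 123 or 121 ticks.

```c
/* Model of the tick counter and the three correction counters. */
static int count, corr1, corr2, corr3;
static long seconds;              /* total indicated seconds, never wrapped */

/* One 8.192-ms tick. */
static void timer_tick(void)
{
    if (++count == 122) {
        seconds++;
        count = 0;                /* default: next second is 122 ticks */
        if (++corr1 == 14) {      /* end of a 14-second cycle */
            corr1 = 0;
            if (++corr2 == 80) {  /* 80th cycle: no extra tick */
                corr2 = 0;
                if (++corr3 == 4) {
                    corr3 = 0;
                    count = 1;    /* next second is 121 ticks */
                }
            } else {
                count = -1;       /* next second is 123 ticks */
            }
        }
    }
}

/* Hypothetical helper: ticks from reset until `target` seconds are indicated. */
long ticks_until(long target)
{
    long ticks = 0;

    count = corr1 = corr2 = corr3 = 0;
    seconds = 0;
    while (seconds < target) {
        timer_tick();
        ticks++;
    }
    return ticks;
}
```

Measured once the pattern is established, from the 4480th to the 8960th
indicated second, the model counts 546,875 ticks. At 8.192 ms per tick that is
4,480,000,000 microseconds, or exactly 4480 seconds.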
The compiled version of this code is shown in Listing Three (page 128) where
the files HC05J1.H and GENERAL.H are expanded. The global variables are placed
in RAM beginning at the address 0xc0. The first compound statement
in main() is compiled as three CLR memory instructions, and the initialization
of the timer control status register is accomplished by five bit-clear or
bit-set instructions. The interrupts are turned on with the CLI() instruction.
The code to implement the FOREVER loop is the branch at the address 0x330 that
returns the execution to the address 0x311. Thirty-one bytes of code are used
to execute the complete clock operation in the main() program. The interrupt
service routine follows the main program and requires 70 bytes of code.
Assembly language programmers should examine this code carefully and determine
if they could do any better than the compiler has done here. My immediate
reaction was that the bit manipulation instructions could have been replaced
by byte operations that require less code. On reflection, however, I concluded
that the bit manipulations were coded in C by me, and I could easily have used
byte-type operations and saved the same amount of code space within the C
program. Otherwise, the single RTS instruction at the end of the main program
is the only wasted byte in the program.
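The byte-type alternative mentioned above can be sketched in portable C. The
bit numbers are those defined for the timer control status register in the
expanded header of Listing Three; the BIT macro and the name tcst_setup are
hypothetical. The function only models the arithmetic on an ordinary
variable. Whether a plain byte store clears the flag bits on the real part
depends on the register's hardware behavior, which this sketch does not
attempt to model.

```c
#include <stdint.h>

/* Bit numbers within the timer control status register. */
#define RT0  0
#define RT1  1
#define RTIE 4
#define RTIF 6
#define TOF  7

#define BIT(n) ((uint8_t)(1u << (n)))   /* hypothetical helper macro */

/* Return the register value that results from clearing RT0, RT1, RTIF,
 * and TOF and setting RTIE in one step, leaving other bits unchanged. */
uint8_t tcst_setup(uint8_t tcst)
{
    uint8_t clear = (uint8_t)(BIT(RT0) | BIT(RT1) | BIT(RTIF) | BIT(TOF));

    return (uint8_t)((tcst & (uint8_t)~clear) | BIT(RTIE));
}
```

The five single-bit assignments in main() collapse into one read, one mask,
and one write, which is the kind of saving the paragraph above has in mind.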


Internal Time-of-Day Clock for the MC68HC05C8


As the C program in Listing Four (page 134) illustrates, the timer of the
MC68HC05C8 is much more flexible than that of the MC68HC05J1. While the
executing portion of this program is a few lines shorter than the equivalent
program for the MC68HC05J1, the setup portion of the program is significantly
longer because the MC68HC05C8 is a much bigger part.
Examine the code in Listing Four. The MC68HC05C8 has several registers that
are 16 bits. The machine must handle these registers as two 8-bit registers.
The three declarations at the beginning of this program provide a foolproof
means of dealing with the two 8-bit parts of a 16-bit register. The first
declaration is that of a structure, as shown in Example 1(a), page 74, and the
second is for a union, as in Example 1(b). A union is compiled to provide
enough space for the storage of the largest element in its argument list. In
this case, the union contains two 16-bit items, so the declaration in Example
1(c) provides 16 bits of storage. It is possible to deal with the 16-bit
location as either a long or 2 bytes, and there is no question as to where the
bytes will be stored. Note that in the interrupt service routine __TIMER(),
this union is used to move the contents of the timer count register into
memory with two 1-byte moves, and then 500 is added to the long word of the
union.
Example 1: A union is compiled to provide enough space for the storage of the
largest element in its argument list. (a) The first declaration is that of a
structure; (b) the second is for a union; (c) the union contains two 16-bit
items, so the declaration provides 16 bits of storage.

 (a) struct bothbytes
     {
         int hi;
         int lo;
     };

 (b) union isboth
     {
         long l;
         struct bothbytes b;
     };

 (c) union isboth time_comp_count;
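The same idea can be tried on a desktop compiler, with two caveats that make
this a sketch and not a copy of the C6805 code. First, the fixed-width
types uint8_t and uint16_t stand in for the compiler's 8-bit int and 16-bit
long. Second, the order of the two byte fields is chosen with the
__BYTE_ORDER__ macro predefined by GCC and Clang, which is an assumption about
the host toolchain; on a compiler without that macro, the sketch falls back to
high byte first, the field order used in Example 1. The helper name
advance_compare is hypothetical.

```c
#include <stdint.h>

struct bothbytes {
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    uint8_t lo;                   /* little-endian host: low byte first */
    uint8_t hi;
#else
    uint8_t hi;                   /* high byte first, as in Example 1 */
    uint8_t lo;
#endif
};

union isboth {
    uint16_t l;                   /* the register viewed as one 16-bit value */
    struct bothbytes b;           /* the same storage viewed as two bytes */
};

/* Load a 16-bit value with two 1-byte moves, add `step` to it as a
 * single 16-bit quantity, and return the result (wrapping at 65536). */
uint16_t advance_compare(uint8_t hi, uint8_t lo, uint16_t step)
{
    union isboth u;

    u.b.hi = hi;
    u.b.lo = lo;
    u.l = (uint16_t)(u.l + step);
    return u.l;
}
```

For example, loading the bytes 0x01 and 0xF4 (500) and adding 500 yields 1000,
and loading 0xFF and 0xFF and adding 500 wraps around to 499.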

The main() routine of this program has a different setup because it is
necessary only to enable the output compare interrupt and the interrupts for
the processor. The remainder of the main program is identical to that of the
earlier version for the MC68HC05J1.
The timer interrupt service routine is significantly different in this
program. After an interrupt occurs, it is necessary to clear the timer
overflow and the output compare flag bits in the timer status register. The
timer overflow bit is cleared by reading the timer status register prior to
reading the low byte of the timer count register. The output compare flag is
reset by writing to the output compare low byte after reading the timer status
register. These operations are accomplished in the first seven lines of code
in the interrupt service routine. Also, in this portion of the code, the
contents of the timer compare register are incremented by 500 to prepare for
the next interrupt time. This processor -- when running with a 4-MHz
oscillator -- will have the internal clock increment every two microseconds.
Therefore, adding 500 to the output compare register will cause the processor
to be interrupted by the timer once every millisecond.
The remainder of the interrupt service routine is quite simple. The interrupt
service routine is entered once each millisecond. Therefore, when the value of
count, which is initialized to 1000, is decremented to 0, exactly one second
has passed. If the decremented value is not zero, then the program returns to
the main program. When the decremented value is zero, the location sec is
incremented, and count is reset to 1000. Remember that sec is processed in the
main program loop, so nothing more is needed in the interrupt service routine.
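The countdown can be modeled on a desktop compiler to confirm the arithmetic.
This is a sketch only: the constant names and the function
on_compare_interrupt are hypothetical stand-ins for the corresponding part of
__TIMER(), and the flag clearing and compare-register update are omitted.

```c
#define TICK_US       2      /* timer increments every 2 us (per the text) */
#define COMPARE_STEP  500    /* counts added to the output compare register */
#define TICKS_PER_SEC 1000   /* interrupts per second */

static int count = TICKS_PER_SEC;   /* millisecond countdown */
static int sec;                     /* seconds, consumed by the main loop */

/* Model of the tail of the interrupt service routine: count down one
 * millisecond and bump sec once the countdown reaches zero. */
void on_compare_interrupt(void)
{
    if (--count != 0)
        return;                     /* not a full second yet */
    sec++;
    count = TICKS_PER_SEC;
}
```

With these figures, 500 counts of 2 microseconds give one interrupt per
millisecond, and 1000 interrupts give 1,000,000 microseconds, so sec advances
exactly once per second.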
In the compiled listing version of Listing Four, the file HC05C8.H is much
longer than the corresponding file for the MC68HC05J1. (Due to space
constraints the compiled version is not shown here, but it is available
electronically; see "Availability," page 3.) The MC68HC05C8 has many more
registers than the MC68HC05J1, and the individual bits within these registers
are each named in the HC05C8.H file.
The declaration of bothbytes and isboth does not cause memory allocation. The
declaration of the union time_comp_count causes the allocation of the memory.
With the exception of the initialization, the compiled version is functionally
the same as that shown in Listing Three. The interrupt service routine
requires 56 bytes in this case. There are two returns from the interrupt
service routine, and in both cases the compiler inserted an RTI instruction.


Summary


The C6805 compiler was written for a microcontroller that has many limitations
when compared with the typical computer. All the unique features of the
microcontroller can be placed in a header file; inclusion of this header in
the program will assure that the proper features of the microcontroller will
be made available to the compiler. C6805 adheres to the ANSI standard for the
C language as far as is practical when the computer is considered.


Products Mentioned


C6805 Code Development System Byte Craft Limited 421 King Street North
Waterloo, Ontario N2J 4E4 Canada 519-888-6911


C6805 Specifics


The C6805 compiler was written to support the MC68HC05 family of parts.
Because some MC68HC05 microcontroller instructions have no counterpart in C,
special directives identify unique microcontroller characteristics to the
compiler.
Table 1 lists nine assembly instructions available to the 68HC05 that have no
equivalent C call. They can be accessed as either a single instruction (all
uppercase) or as a function call, as shown. The function call requires a pair
of closed parentheses to follow the name of the instruction.
Table 1: Assembly codes directly callable by C6805

 Function Operation
 ------------------------------------------------------------

 CLC or CLC() Clear Carry Bit
 SEC or SEC() Set Carry Bit
 CLI or CLI() Clear Interrupt Flag (turn interrupts on)
 SEI or SEI() Set Interrupt Flag (turn interrupts off)
 NOP or NOP() No Operation
 RSP or RSP() Reset Stack Pointer
 STOP or STOP() STOP Instruction
 SWI or SWI() Software Interrupt
 WAIT or WAIT() WAIT Instruction

A pragma is a C preprocessor command not defined by the language. As such, the
compiler writer can use the #pragma command to satisfy a need not specifically
identified by the language. C6805 uses pragmas to identify
microcontroller-specific characteristics. Table 2 contains a list of pragmas
used by C6805.
Table 2: C6805 pragma directives

 pragma Function
 -----------------------------------------------

 #pragma portxy I/O port definition
 #pragma memory RAM/ROM definition
 #pragma mor Mask Option Register
 #pragma has Instruction set options
 #pragma options Compiler directives
 #pragma vector Interrupt vector definitions

Listing One (page 128) shows part of the file HC05J1.H, a header file used by
C6805 when compiling a file for the MC68HC05J1. The first six entries in this
file define the fixed port locations used with this part. The format of a
pragma directive here is #pragma portxx portname @ address where portxx can be
portr, portw, or portrw, which shows whether the port is read, write, or both.
portname is the name used in the program for the port. The at symbol (@)
identifies a memory address. The first line in Listing One tells the compiler
that porta is a read/write port located at address 0x0 in the computer memory
space. Note that the timer_count register at memory location 0x9 is a
read-only register.
The location and amounts of RAM and ROM are identified in the two memory
directives. This part has a STOP, a WAIT, and a MUL instruction. Finally, the
vector entries identify the locations of all interrupt vectors and the names
of the interrupt service routines associated with each vector. A vector pragma
causes the address of the function with the given name to be placed in the
specified vector location.

The header files that contain the pragmas for each microcomputer are not part
of the compiler and must be written by the programmer. Programmers will want
to identify in the header file each of the bits in the fixed registers, such
as the timer control register. That way, the main program will not be
cluttered with defines more than necessary.
--T.V.S


Knowing Your Microprocessor Reduces Code Size and Execution Time


One of the major advantages of high-level languages is that they isolate the
programmer from the target environment. The application can be abstracted to a
target-independent description; that is, the programmer describes the
application algorithmically and lets the high-level language compiler do its
best job of implementing it on a particular target. Little has to be done to
change the application from one target to another.
For smaller code size and better execution speed, the design engineer can use
fixed arrays rather than pointer arithmetic, allocate variables to Page 0 for
shorter instructions, and keep control statements short for branches.
High-level language compilers usually generate highly optimized code. Where
can design engineers gain their next advantage? From the processor. By knowing
the strengths and weaknesses of the target processor, the design engineer
creates better implementations, even using good tools.
This is true of all processors, but especially the 8-bit ones. Experience
shows, even with an extremely good C compiler, there are still ways the
programmer can influence both code size and execution speed as evidenced by
the following three code examples.


Array Vs. Pointer Arithmetic


Consider for a moment the differences between pointer and array arithmetic. An
array access consists of a base address indexed by a constant or variable.
Pointer access is done by a variable indexed by a constant or variable. In the
6805 instruction set, 56 percent of the opcode map is dedicated to accessing a
variable, yet only 12 percent of the opcode map can be used to access an array
indexed by a constant or variable. The 6805 has no way to use a 16-bit
variable to access memory without doing some form of self-modifying code.
Knowing this, the developer using fixed arrays rather than pointer arithmetic
for the storage and retrieval of data will get better code from the compiler.
Example 1 shows the differences between the two access methods. Data in an
array is accessed first by array indexing and then through a pointer. The lack
of a 16-bit index register causes the compiler to generate considerably longer
code for the pointer access.
Example 1: Differences between accessing data with pointers or arrays on the
6805

 0010 0011 unsigned int i,j;
 0C1C 0014 char a[20];
 0029 int *ptr;

 0100 BE 11 LDX $11 i = a[j]; /* Getting data from an
 0102 D6 0C 1C LDA $0C1C,X array in the 6805 */
 0105 B7 10 STA $10

 0107 A6 0C LDA #$0C ptr = &a; /* Setting a pointer to
 0109 B7 29 STA $29 the beginning of the
 010B A6 1C LDA #$1C array */
 010D B7 2A STA $2A

 010F BB 12 ADD $12 i = *(ptr + j);
 0111 B7 19 STA $19 /* Accessing the array
 0113 B6 29 LDA $29 with pointers using
 0115 A9 00 ADC #$00 self modifying code
 0117 B7 18 STA $18 to handle 16 bit
 0119 AE C6 LDX #$C6 pointers. */
 011B BF 17 STX $17
 011D AE 81 LDX #$81
 011F BF 1A STX $1A
 0121 BD 17 JSR $17
 0123 B7 10 STA $10
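At the source level, the two styles compared in Example 1 look like the
following pair of functions. This is a portable sketch with hypothetical names
(buf, sum_indexed, sum_pointer); it shows only that the two forms compute the
same result, and makes no claim about the code any particular compiler emits
for them.

```c
#include <stdint.h>

#define BUF_LEN 20

uint8_t buf[BUF_LEN];            /* fixed array at a link-time address */

/* Indexed form: constant base address plus an 8-bit index. */
uint8_t sum_indexed(void)
{
    uint8_t i, total = 0;

    for (i = 0; i < BUF_LEN; i++)
        total = (uint8_t)(total + buf[i]);
    return total;
}

/* Pointer form: the address itself is a run-time variable. */
uint8_t sum_pointer(const uint8_t *p, uint8_t n)
{
    uint8_t total = 0;

    while (n--)
        total = (uint8_t)(total + *p++);
    return total;
}
```

Where the data lives in a fixed array, the indexed form is the one that
matches the addressing style the sidebar recommends for the 6805.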



Page 0 Variables


The 6805, like many single-chip microprocessors, has dedicated some of its
instructions to accessing data in the first 256 locations of memory. In the
6805, this area of memory is dedicated to memory-mapped I/O ports, scratch-pad
RAM, and often some ROM. Almost half the 6805 instruction set is Page 0 and
accesses this area directly. A good compiler knows where the target data of
its instructions is and emits appropriate code. The programmer can allocate
frequently used variables in the first 256 bytes of memory to substantial
advantage. Example 2 shows two syntactically identical code fragments
generating 7 bytes or 2 bytes depending on variable location.
Example 2: Syntactically identical code for two variables generates tighter
code in the first 256 bytes of memory.

 0125 C6 0B B8 LDA $0BB8 y = -y;
 0128 40 NEGA
 0129 C7 0B B8 STA $0BB8

 012C 30 14 NEG $14 x = -x;




Branching


The instruction set in the 6805 restricts conditional branch offsets to a
single byte (129 bytes forward and 126 bytes back from the branch
instruction). To conditionally branch beyond this range, a branch with the
complement condition branches around a jump instruction. This requires 5 bytes
of generated code and on average increases the execution time. When control
statements (if, while, and for) are kept short, conditional branches will be
generated with 2 bytes of code. This is only a 3-byte savings, but over a large
program it can be substantial. See Example 3.
Example 3: A conditional branch whose target is within single-byte range takes
2 bytes of code; one whose target is beyond that range takes 5 bytes.

 012E B6 11 LDA $11 if (j < 25) {
 0130 A1 19 CMP #$19
 0132 24 01 BCC $0135
 0134 9D NOP NOP (); }

 0135 B6 10 LDA $10 if (i < 25) {
 0137 A1 19 CMP #$19
 0139 25 03 BCS $013E
 013B CC 01 D7 JMP $01D7
 013E 9D NOP NOP (); NOP ();
 013F 9D NOP
 . . . . . .
 01D4 9D NOP
 01D5 9D NOP NOP (); NOP (); }
 01D6 9D NOP
 01D7 81 RTS }
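The size rule can be written down as a small calculation. The sketch below
assumes that the signed single-byte offset is measured from the address
following the 2-byte branch instruction, which is consistent with the
129-forward, 126-back figures quoted above. The function name
cond_branch_bytes is hypothetical.

```c
/* Bytes of code needed for a conditional branch located at `branch_addr`
 * whose destination is `target`: 2 if the offset fits in a signed byte,
 * otherwise 5 (a 2-byte complement branch around a 3-byte JMP). */
int cond_branch_bytes(unsigned branch_addr, unsigned target)
{
    long offset = (long)target - ((long)branch_addr + 2);

    return (offset >= -128 && offset <= 127) ? 2 : 5;
}
```

Applied to Example 3, the branch at $0132 to $0135 needs 2 bytes, while the
branch at $0139 to $01D7 is 156 bytes past the following instruction and
needs 5.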

Walter is president of Byte Craft Limited and has been programming embedded
systems on single-chip microcomputers for over 20 years. He can be reached at
421 King Street North, Waterloo, Ontario N2J 4E4 Canada.
_C PROGRAMMING FOR THE 68HC05 MICROCONTROLLER_
by Truman T. Van Sickle



[LISTING ONE]

#pragma portrw PORTA @ 0x00;
#pragma portrw PORTB @ 0x01;
#pragma portrw DDRA @ 0x04;
#pragma portrw DDRB @ 0x05;
#pragma portrw TCST @ 0x08;
#pragma portr TCNT @ 0x09;

#pragma memory RAMPAGE0 [64] @ 0xc0;
#pragma memory ROMPROG [1024] @ 0x300;

#pragma has STOP ;
#pragma has WAIT ;
#pragma has MUL ;

#pragma vector __TIMER @ 0x07f8;
#pragma vector __IRQ @ 0x07fa;
#pragma vector __SWI @ 0x07fc ;
#pragma vector __RESET @ 0x07fe;





[LISTING TWO]

#pragma option v

#include "hc05j1.h"
#include "general.h"

/* define the global variables */
int hrs,mts,sec;
int count;
int corr1,corr2,corr3; /* used to correct the time errors */

main(void)
{
    corr1=corr2=corr3=0;    /* time corrections */
    TCST.RT0=0;             /* 57.3 ms cop timer */
    TCST.RT1=0;             /* 8.192 ms RTI */
    TCST.RTIE=1;            /* Turn on the RTI */
    TCST.RTIF=0;            /* Reset interrupt */
    TCST.TOF=0;             /* flags */
    CLI();                  /* turn on interrupt */
    FOREVER
    {
        if(sec==60)         /* do clock things each minute */
        {
            sec=0;
            if(++mts==60)
            {
                mts=0;
                if(++hrs==13)
                    hrs=1;
            }
        }
        WAIT();             /* wait here to save the energy */
    }
}
void __TIMER(void)          /* routine executed every RTI (8.192 ms) */
{
    TCST.TOF=0;             /* reset the interrupt */
    TCST.RTIF=0;            /* flags */
    if (++count==122)
    {
        sec++;                  /* increment seconds */
        if(++corr1==14)         /* To correct for 8.192 ms per tick */
        {
            corr1=0;            /* run 122 ticks per second for 13 */
            if(++corr2==80)     /* seconds, and 123 for the 14th second */
            {                   /* With this algorithm there are 14.000128 */
                corr2=0;        /* actual seconds per 14 indicated. Then run */
                if(++corr3==4)
                {
                    count=1;
                    corr3=0;
                }
                else
                    count=0;    /* 79 of these cycles followed by 1 cycle of */
            }                   /* 14 seconds with 122 ticks per second. The */
            else                /* elapsed time for this cycle = 1120.002048 */
                count=(-1);     /* seconds for an indicated time of 1120 */
        }                       /* seconds. Repeat this cycle 4 times; on */
        else                    /* last cycle drop 1 tick makes indicated & */
            count=0;            /* elapsed time exactly 4480 seconds.*/
    }
}
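
The arithmetic in the comments of Listing Two can be checked on a host
machine. The following is a sketch under stated assumptions, not part of the
original program: it models the interrupt routine with ordinary ints, counts
total seconds in a hypothetical variable total_sec instead of resetting at
60, and assumes that corr3 is reset by assignment (corr3 = 0). The function
names timer_tick and ticks_per_cycle are made up for the illustration.

```c
#include <assert.h>

/* Host-side model of the Listing Two counters (plain ints here). */
static int total_sec, count, corr1, corr2, corr3;

/* One RTI tick: the body of __TIMER without the hardware flag resets.
   Assumption: corr3 is reset by assignment. */
static void timer_tick(void)
{
    if (++count == 122) {
        total_sec++;
        if (++corr1 == 14) {
            corr1 = 0;
            if (++corr2 == 80) {
                corr2 = 0;
                if (++corr3 == 4) {
                    count = 1;      /* next second lasts 121 ticks */
                    corr3 = 0;
                } else {
                    count = 0;      /* next second lasts 122 ticks */
                }
            } else {
                count = -1;         /* next second lasts 123 ticks */
            }
        } else {
            count = 0;              /* next second lasts 122 ticks */
        }
    }
}

/* Ticks consumed by one full 4480-second correction cycle, measured
   after the first cycle so that the start-up second is excluded. */
long ticks_per_cycle(void)
{
    long ticks = 0, start;

    total_sec = count = corr1 = corr2 = corr3 = 0;
    while (total_sec < 4480) { timer_tick(); ticks++; }
    assert(count == 1);             /* the dropped tick is pending */
    start = ticks;
    while (total_sec < 8960) { timer_tick(); ticks++; }
    return ticks - start;
}
```

At 8.192 ms per tick, a cycle of 546,875 ticks lasts exactly 4480 seconds,
which is the figure the listing's comments arrive at.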





[LISTING THREE]

 #pragma option v
 #include "hc05j1.h"
0000 #pragma portrw PORTA @ 0x00;
0001 #pragma portrw PORTB @ 0x01;
0003 #pragma portr PORTD @ 0x03;
0004 #pragma portrw DDRA @ 0x04;
0005 #pragma portrw DDRB @ 0x05;
0008 #pragma portrw TCST @ 0x08;
0009 #pragma portrw TCNT @ 0x09;
07F0 #pragma portrw __COPSVS @ 0x7f0;
07F8 #pragma vector __TIMER @ 0x07f8;
07FA #pragma vector __IRQ @ 0x07fa;
07FC #pragma vector __SWI @ 0x07fc ;
07FE #pragma vector __RESET @ 0x07fe;
 #pragma has STOP ;
 #pragma has WAIT ;
 #pragma has MUL ;
00C0 0040 #pragma memory RAMPAGE0 [64] @ 0xc0;
0300 0400 #pragma memory ROMPROG [1024] @ 0x300;
0000 #define RT0 0 /* timer_cont_stat */
0001 #define RT1 1
0004 #define RTIE 4
0005 #define TOFE 5
0006 #define RTIF 6
0007 #define TOF 7
 #include "general.h"
0001 #define TRUE 1
0000 #define FALSE 0
0001 #define FOREVER while(TRUE)
0002 #define max(a,b) (a) > (b) ? (a) : (b)
0003 #define min(a,b) (a) < (b) ? (a) : (b)
0004 #define abs(a) (a) >= 0 ? (a) : -(a)
 /* define the global variables */
00C0 00C1 00C2 int hrs,mts,sec;
00C3 int count;
00C4 00C5 00C6 int corr1,corr2,corr3; /* time corrections */
 main(void)
 {
0300 3F C6 CLR $C6 corr1=corr2=corr3=0; /* time corrections */
0302 3F C5 CLR $C5
0304 3F C4 CLR $C4
0306 11 08 BCLR 0,$08 TCST.RT0=0; /* 57.3 ms cop timer */
0308 13 08 BCLR 1,$08 TCST.RT1=0; /* 8.192 ms RTI */
030A 18 08 BSET 4,$08 TCST.RTIE=1; /* Turn on the RTI */
030C 1D 08 BCLR 6,$08 TCST.RTIF=0; /* Reset interrupt */
030E 1F 08 BCLR 7,$08 TCST.TOF=0; /* flags */
0310 9A CLI CLI(); /* turn on interrupt */
 FOREVER
 {
0311 B6 C2 LDA $C2 if(sec==60) /* do clock things */
0313 A1 3C CMP #$3C
0315 26 18 BNE $032F
0317 3F C2 CLR $C2 sec=0;
0319 3C C1 INC $C1 if(++mts==60)
031B B6 C1 LDA $C1
031D A1 3C CMP #$3C
031F 26 0E BNE $032F
 {
0321 3F C1 CLR $C1 mts=0;
0323 3C C0 INC $C0 if(++hrs==13)
0325 B6 C0 LDA $C0
0327 A1 0D CMP #$0D
0329 26 04 BNE $032F
032B A6 01 LDA #$01 hrs=1;
032D B7 C0 STA $C0
 }
 }
032F 8F WAIT WAIT(); /* wait here to save energy */
0330 20 DF BRA $0311 }
0332 81 RTS }
 void __TIMER(void)
07F8 03 33 {
0333 1F 08 BCLR 7,$08 TCST.TOF=0; /* reset the interrupt */
0335 1D 08 BCLR 6,$08 TCST.RTIF=0; /* flags */
0337 3C C3 INC $C3 if (++count==122)
0339 B6 C3 LDA $C3
033B A1 7A CMP #$7A
033D 26 39 BNE $0378
 {
033F 3C C2 INC $C2 sec++; /* increment seconds */
0341 3C C4 INC $C4 if(++corr1==14)
0343 B6 C4 LDA $C4
0345 A1 0E CMP #$0E
0347 26 2D BNE $0376
 {
0349 3F C4 CLR $C4 corr1=0;
034B 3C C5 INC $C5 if(++corr2==80)
034D B6 C5 LDA $C5
034F A1 50 CMP #$50
0351 26 1D BNE $0370
 {
0353 3F C5 CLR $C5 corr2=0;
0355 3C C6 INC $C6 if(++corr3==4)
0357 B6 C6 LDA $C6
0359 A1 04 CMP #$04
035B 26 0F BNE $036C
 {
035D A6 01 LDA #$01 count=1;
035F B7 C3 STA $C3
0361 B6 C6 LDA $C6 corr3==0;
0363 26 04 BNE $0369
0365 A6 01 LDA #$01
0367 20 01 BRA $036A
0369 4F CLRA
 }
036A 20 02 BRA $036E else
036C 3F C3 CLR $C3 count=0;
 }
036E 20 04 BRA $0374 else
0370 A6 FF LDA #$FF count=(-1);
0372 B7 C3 STA $C3
 }
0374 20 02 BRA $0378 else
0376 3F C3 CLR $C3 count=0;
 }
0378 80 RTI }
07FE 03 00

SYMBOL TABLE
LABEL VALUE LABEL VALUE LABEL VALUE LABEL VALUE

DDRA 0004 DDRB 0005 FALSE 0000 PORTA 0000
PORTB 0001 PORTD 0003 RT0 0000 RT1 0001
RTIE 0004 RTIF 0006 TCNT 0009 TCST 0008
TOF 0007 TOFE 0005 TRUE 0001 __COPSVS 07F0
__IRQ 07FA __RESET 07FE __STARTUP 0000 __SWI 07FC
__TIMER 0333 corr1 00C4 corr2 00C5 corr3 00C6
count 00C3 hrs 00C0 main 0300 mts 00C1
sec 00C2 

MEMORY USAGE MAP ('X' = Used, '-' = Unused)
0300 : XXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXX
0340 : XXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXX-------
0380 : ---------------- ---------------- ---------------- ----------------
03C0 : ---------------- ---------------- ---------------- ----------------
0700 : ---------------- ---------------- ---------------- ----------------
0740 : ---------------- ---------------- ---------------- ----------------
0780 : ---------------- ---------------- ---------------- ----------------
07C0 : ---------------- ---------------- ---------------- --------XX----XX

All other memory blocks unused.

Errors : 0
Warnings : 0






[LISTING FOUR]

#include "hc05c8.h"
#include "general.h"

int hrs, mts, sec; /* global variables */
long count=1000;

struct bothbytes /* 16 bit int structure */
{
 int hi;
 int lo;
};
union isboth /* and union */
{
 long l;
 struct bothbytes b;
};
union isboth time_comp_count;

registera ac;
void main(void)
{
    int key;
    TCR.OCIE = 1;   /* enable output compare interrupt */
    CLI();          /* enable all interrupts */
    FOREVER
    {
        if(sec==60) /* do clock things each minute */
        {
            sec=0;
            if(++mts==60)
            {
                mts=0;
                if(++hrs==13)
                    hrs=1;
            }
        }
        WAIT();     /* wait here to save the energy */
    }
}
void __TIMER(void)  /* time interrupt service routine */
{
    /* the program gets here every millisecond */
    time_comp_count.b.hi = TCHI;
    ac = TSR;                       /* Clear the tof bit */
    time_comp_count.b.lo = TCLO;
    time_comp_count.l += 500;       /* 500 counts per millisecond */
    OCHI = time_comp_count.b.hi;
    ac = TSR;                       /* Clear ocf bit */
    OCLO = time_comp_count.b.lo;
    if(--count)
        return ;
    else
    {
        sec++;                      /* here every second */
        count=1000;                 /* reset count to 1 second */
    }
}


August, 1991
DECIMAL FRACTIONAL CONVERSION


Converting decimal fractions to binary in assembly language




Don Morgan


Don is a consulting engineer specializing in embedded systems. He can be
contacted at Pacific Precision Laboratories in Chatsworth, Calif., or at Don
Morgan Electronics, 2669 N. Wanda, Simi Valley, CA 93065.


More often than not, writing code for an embedded system means looking for a
balance of speed and storage. When the system must process numerical data,
this can become critical, because arithmetic takes time. There are any number
of solutions that might be implemented, such as sticking to integer-only
arithmetic, or going to fixed point if fractional calculations are necessary.
Even then, systems that must relate to humans for numerical data often receive
it in decimal form, and because decimal arithmetic is not native to most
processors, what follows is some form of radix conversion. Good front-end
routines can make a difference in speed, storage requirements, and accuracy,
and the 8086 provides some tools to aid in this effort.


Common Conversion Methods


Tables are commonly used for conversions from one "kind" of thing to
another, such as rpm to cycles, pounds to stone, feet to millimeters. They are
useful for converting between bases, too. However, they require more space and
are slower than necessary, and they do not accommodate fractional parts
particularly well.
Another common method is to perform a straightforward integer conversion,
keeping track of the decimal places for fractional parts so that a final
divide can be done to bring the result back into range. This is a common
solution but requires the program to keep track of the radix point and does
not readily produce fractional parts native to the hardware of an embedded
system, such as D/A converters.
There is another approach that works well on processors that provide
instructions for decimal adjustment (the 8086, for instance). It is faster
than the table-driven method and produces true binary fractions for either
fast fixed-point routines or hardware.


Fractional Parts


The code fragment in Example 1 provides an interesting example of radix
conversion performed by multiplying the input value by the target base, using
the arithmetic of the base being converted from. In this case, where the
target is binary and the convert-from base is decimal, the conversion is
accomplished using simple additions and the DAA instruction.
After the input has been buffered, the digits following the radix point are
packed and placed in an accumulator concatenated with a result variable that
holds the converted mantissa. The number in the accumulator is added to
itself, which multiplies it by two and makes use of the auxiliary carry flag
(AF). All standard carries are allowed to occur and overflow into the
concatenated register meant to hold the mantissa. Each addition is followed by
the DAA instruction to handle the additional interdigital carries that can
occur with a decimal overflow. This conversion can continue until the
accumulator is exhausted or the desired precision is reached. In addition to
being fast, this method can produce higher accuracy than the table method just
mentioned.


The Conversion Routines


The code in Example 1 was written for an embedded system using fixed point
arithmetic to perform high-speed calculations in a real-time control
application. It expected values ranging from .00002 to 50,000, a range that
allowed both integer and mantissa to share a doubleword (32 bits) with the
radix point between the two words. This routine can be rewritten, however, to
accommodate data of any size.
Before calling the subroutine, the fractional portion is packed and stored in
DEC_FRAC. The FRAC subroutine is called with AX already loaded with the packed
fraction still in decimal notation. Upon entry, CX is set to the number of
bits in the resulting mantissa; this tells the routine when it is at the
needed precision. Each byte is multiplied by two through addition (NOT
SHIFTED) in order to use the Auxiliary Flag as well as the standard Carry.
Following each addition, the DAA instruction is executed to handle decimal
overflows, with the carry flag being checked, as well, to handle normal
carries. Both types of overflow are handled to maintain decimal arithmetic. At
the end of this routine, the remainder still in AX is checked and the fraction
in DX is rounded up if necessary. The result is placed in MANTISSA.
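
The doubling loop just described can be modeled in C for checking results on
a host machine. This is a sketch, not a translation of the author's code: the
names bcd_double and bcd_frac_to_binary are hypothetical, and the decimal
adjustment that DAA performs in hardware is imitated here with ordinary
integer arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Double a two-digit packed-BCD byte and add carry_in (0 or 1).
   *carry_out receives the decimal carry out of the byte. */
static uint8_t bcd_double(uint8_t bcd, unsigned carry_in,
                          unsigned *carry_out)
{
    unsigned tens = bcd >> 4, ones = bcd & 0x0Fu, doubled;

    assert(tens <= 9 && ones <= 9);     /* input must be valid BCD */
    doubled = 2u * (tens * 10u + ones) + carry_in;
    *carry_out = doubled >= 100u;
    doubled %= 100u;
    return (uint8_t)(((doubled / 10u) << 4) | (doubled % 10u));
}

/* Convert a four-digit packed-BCD fraction (0x1234 means .1234) to a
   16-bit binary fraction, rounding on the remaining decimal value. */
uint16_t bcd_frac_to_binary(uint16_t packed)
{
    uint8_t lo = (uint8_t)(packed & 0xFFu);
    uint8_t hi = (uint8_t)(packed >> 8);
    uint16_t mantissa = 0;
    int bit;

    for (bit = 0; bit < 16; bit++) {
        unsigned carry_lo, carry_hi;

        lo = bcd_double(lo, 0u, &carry_lo);
        hi = bcd_double(hi, carry_lo, &carry_hi);
        /* The decimal carry out of the top digit is the next bit. */
        mantissa = (uint16_t)((mantissa << 1) | carry_hi);
    }
    if (hi >= 0x50u)        /* remainder of .5000 or more: round up */
        mantissa++;
    return mantissa;
}
```

For instance, a packed input of 0x5000 (.5000) yields 0x8000, and 0x1000
(.1000) yields 0x199A after rounding.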


Conclusion


A routine of this sort can be useful, as it does without the ROM or RAM
storage required of a table-driven routine. Even more interesting, though, is
its ability to produce a binary fractional conversion with accuracy and speed.
Example 1: Fractional conversion routine

 mantissa word ?
 dec_frac word ?
 ;
 ; frac - conversion of decimal fractional part to hex
 ; enter with packed decimal word in ax
 ; returns with result in dx
 ; DS is assumed to point into the Data Segment
 ;
 frac proc
     mov cx,10h      ;number of bits in resulting mantissa
 cnvt:
     add al,al       ;double the low two digits
     daa             ;decimal adjust; carry flag = decimal carry
     mov bl,al       ;save the low byte (mov leaves the flags alone)
     mov al,ah
     jnc nc1
     add al,al       ;double the high two digits
     daa
     inc al          ;add the carry from the low byte
     jmp short nc2
 nc1:
     add al,al       ;double the high two digits
     daa
 nc2:
     mov ah,al
     mov al,bl
     rcl dx,1        ;rotate the decimal carry into the mantissa
     loop cnvt
     sub ax, 5000h   ;is the remainder .5 or more?
     jc end_frac
     inc dx          ;for round off
 end_frac:
     mov word ptr mantissa, dx
     ret
 frac endp


August, 1991
TESTING C COMPILER PERFORMANCE


XScheme puts these nine compilers to the test


 This article contains the following executables: XSCHEME.ARC CTEST.ARC


David Betz


David is a technical editor for DDJ and the author of XLisp, XScheme, and the
TelePath conferencing system. He can be reached at DDJ, 501 Galveston Drive,
Redwood City, CA 94063.


Whenever the question of compiler benchmarks raises its ugly head, the axiom
"lies, damned lies, and benchmarks" immediately jumps to mind. Nevertheless,
benchmarks can be a valid means of comparison, especially when you have the
opportunity to benchmark the actual application program you're developing.
This leads to a second commonly spouted maxim about benchmarks: "The best
benchmark is the program you are actually going to use." Unless you benchmark
your actual application program, your experience of a particular compiler's
performance will vary.
With this in mind, I recently measured the performance of a program I use all
the time -- XScheme. This choice of a test program was not made because it
happened to fit my personal benchmarking needs, but because it presents a
general-purpose, nontrivial test of a compiler's functionality and
performance. While XScheme is by no means an exhaustive testbed for all
compiler features or application requirements, it did suit my immediate needs,
and I'm sharing my findings with you.
Consequently, I examine in this article nine different C compilers for MS-DOS:
Four protected-mode compilers -- Watcom C/386, Metaware High C 386/486,
Intel's 386/486 C Code Builder, and Microway NDP C-386; and five real-mode
compilers -- Microsoft C, Borland C++, JPI TopSpeed C, Zortech C++, and MIX
Power C.


The Playing Field


My tests were performed on an Everex Step 386/20 with 4 Mbytes of memory, a
64K cache, an 80387 floating point coprocessor, and a Seagate ST4096 80-Mbyte
MFM hard disk drive under MS-DOS, Version 5.0. I used DOS 5.0 partly because
it supports disk partitions larger than 32 Mbytes, making it possible to
install all of the compilers at the same time. In addition, this was an
excellent opportunity to test these compilers under the new operating system.
All of the timings in this article were generated with FILES = 30 and BUFFERS
= 10 in CONFIG.SYS. I mention this because the installation procedure for
Intel's C Code Builder suggests that BUFFERS be set to no less than 30. The
tests for each compiler should be on equal footing, so I used BUFFERS = 10 for
C Code Builder as well.
Just for comparison purposes, I tried the XScheme compile with BUFFERS = 30.
As you might expect, C Code Builder compiles somewhat faster. Of course, so
does Borland C++, and I imagine the others do, too. In spite of the
recommendation in the Code Builder installation procedure, I think the results
presented here give an accurate representation of the relative performance of
all the compilers, and that the performance of any of them could be improved
by increasing the number of buffers.
Both the Metaware and Watcom compilers I tested generate code that runs in
protected mode on the 386. This requires the use of a DOS extender. I used the
Phar Lap 386 DOS-Extender SDK, which provides an environment for running
protected-mode programs under DOS and includes an assembler, linker, and
debugger.
I've also taken a brief look at Rational Systems' Oxygen program. Oxygen is a
product designed to be used with Microsoft C to allow the protected mode
copies of the Microsoft compiler and linker to be run under DOS. This allows
for compiling much larger programs, as all of extended memory can be used by
both the compiler and linker. It also seems to considerably speed up compile
and link times.


The Yardstick


There are two important aspects of a compiler's performance: generating
compact and efficient code and compiling and linking efficiently. The latter
is particularly important during program development when large portions of
the program are recompiled often, as changes are made to global header files
or large modules.
To test compile and link speeds, I chose a program I wrote called "XScheme,"
an interpreter for the Scheme dialect of Lisp. XScheme is a relative of XLisp,
which I wrote before creating XScheme. (For more information on XScheme, see
the accompanying text box "XScheme's Timely Role.") The entire program
consists of about 10,000 lines of C code (including blank lines and comments).
Lisp-like languages are not known for parsimonious use of a machine's
instruction cycles; therefore, timing the speed of XScheme execution is a fair
test of the binaries produced by a given compiler. Although there's far more
source code to XScheme than could be printed here, it's available
electronically for those of you who are interested in it as a language
implementation, as well as those who wish to confirm the results presented in
this article; see "Availability" on page 3.
Because all of the compilers tested here claim ANSI conformance, I spent some
time adding ANSI prototypes to XScheme so that I could test the compilers with
their ANSI compatibility functions enabled. (Editor's note: Adding ANSI
prototypes didn't uncover a single problem in the XScheme code!) I have
measured the compilation speed both with and without optimization enabled. My
assumption is that optimization is usually not necessary during development
and that disabling it usually leads to faster compile times.
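
As an illustration of what adding ANSI prototypes involves, here is a small
sketch. The function scale is hypothetical and is not taken from XScheme; it
only contrasts the older declaration style, shown in a comment, with a
prototype that lets the compiler check and convert arguments.

```c
#include <assert.h>

/* Pre-ANSI style declared only the return type:
 *     long scale();
 * so the compiler could not check the arguments at a call. */

/* ANSI prototype: the parameter types are part of the declaration. */
long scale(int count, long factor);

long scale(int count, long factor)
{
    assert(count >= 0);
    return count * factor;
}
```

With the prototype in scope, a call such as scale(3, 4) converts the int
argument 4 to long before the call is made.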
In addition to measuring the time required to compile XScheme, I have also
measured the performance of the resulting executables. I've done this by
timing the execution of an implementation of the Fibonacci function written in
Scheme and run under each of the interpreters generated during the compile
tests.
As a second performance measure, I've written a program called "CTest" that
contains two separate tests. To test floating-point performance, I've used a
modified version of a fractal tree program published in the article "Fractals
in the Real World" by Dick Oliver (DDJ, April 1991). It does a large number of
floating-point operations in the process of generating the fractal tree and
seems like a better test than some of the do-nothing sequences of
floating-point operations that have been used in the past. So that only
floating-point performance is measured, I've written dummy versions of the
graphics routines that actually draw the tree. CTest is also available
electronically.
The second test in CTest is a C version of the Fibonacci test that I ran under
XScheme. This is really a subroutine call test because it performs relatively
few operations other than calls.
CTest reports its results in terms of iterations per second. In the case of
the floating-point test, this is the number of trees generated per second. To
compute this, 100 trees were generated and 100 was divided by the elapsed time
in seconds. In the case of the Fibonacci test, this is the number of
computations of fib(25) per second. To compute this, fib(25) was computed 100
times and 100 was divided by the elapsed time. The only exception is the Power
C compiler. Its performance was so slow that only 50 iterations of the tree
test were performed, and 50 was divided by the elapsed time to get the number
of trees per second.
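
The Fibonacci test and the rate calculation can be sketched as follows. This
is not CTest's source: the function names are hypothetical, and the base
cases fib(0) = 0 and fib(1) = 1 are an assumption about how the test defines
the function.

```c
#include <assert.h>

/* Naive recursive Fibonacci: nearly all of the work is subroutine
   calls, which is what makes it a call-overhead test. */
unsigned long fib(unsigned n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

/* Iterations per second, given an iteration count and the elapsed
   time in seconds. */
double iterations_per_second(unsigned iterations, double elapsed_seconds)
{
    assert(elapsed_seconds > 0.0);
    return iterations / elapsed_seconds;
}
```

Under this calculation, 100 iterations completed in 81 seconds would be
reported as about 1.23457 iterations per second.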


Switch Settings


For the XScheme test, I used the large memory model for all real-mode
compilers and the flat memory model for the protected-mode compilers. I used
software floating point (although the tests I ran didn't contain any
floating-point code). For the optimized versions, I optimized for speed rather
than size.
For the CTest test, I used the small memory model and tried to use the best
optimization switches to get good floating-point performance. I used switches
that allowed the compilers to generate inline floating point instructions
where possible, and I forced the use of the 387 coprocessor chip. Table 1
provides descriptions of the various switches used.
Table 1: Description of switches used in compiling the test programs

 Microsoft C

   -AL      Use the large memory model.
   -AS      Use the small memory model.
   -FPi     Generate inline floating-point instructions, with emulation
            if an 80x87 doesn't exist at runtime.
   -FP87    Generate inline floating-point instructions; an 80x87 is
            required at runtime.
   -Gs      Disable stack overflow checking.
   -Za      Enforce ANSI compatibility.
   -Ox      Maximize optimization.

 JPI TopSpeed C

   option(ansi=>on)       Check for ANSI compatibility.
   optimize(speed=>on)    Optimize for speed over size.
   optimize(cpu=>386)     Generate code for the 80386.
   optimize(copro=>387)   Generate code for the 80387.

 Zortech C

   -ml      Use the large memory model.
   -ms      Use the small memory model.
   -f       Generate inline floating point instructions.
   -o       Optimize.
   -A       Check for ANSI compatibility.

 Power C

   /ms      Use the small memory model.
   /f8      Use hardware-only floating-point library.

 Watcom C

   -fpi     Generate inline floating-point instructions, with emulation
            if an 80387 doesn't exist at runtime.
   -fpi87   Generate inline floating-point instructions; an 80387 is
            required at runtime.
   -za      Disable non-ANSI extensions.
   -s       Disable stack overflow checking.
   -oatx    Optimize:
              a  Alias checking relaxed
              t  Speed favored over size
              x  Generate some library functions inline, enable loop
                 optimizations, no stack overflow checking.

 Metaware High C

   -Hansi   Check for ANSI compatibility.
   -O       Turn on full optimization.
   -f387    Generate code for the 80387 coprocessor.

 Microway NDP C

   -n2      Generate inline 80387 code.
   -n3      Advanced 80387 stack utilization.
   -ansi    Check for ANSI compatibility.
   -O       Perform code hoisting and peephole optimizations.
   -on      Turn on all optimizations.
   -OLM     Additional memory optimizations, add speed optimizations
            related to moving code out of loops and speeding up loops.

 C Code Builder

   -n       Include 80387 emulation library (link switch).
   -t       Include information for link-time type checking.
   -O0      Select the lowest level of optimization.
   -O3      Select the highest level of optimization.



The Players


The benchmark results that appear in Tables 2 and 3 speak for themselves, and
you can draw your own conclusions. But, in looking over the various packages,
you may want to consider more than the raw data. Therefore, I'll briefly
describe the features of these packages and note any problems I encountered
during the tests.
Table 2: (a) XScheme compile and execution times (fib 25), and file sizes
generated; (b) Switch settings used.

(a)
                    Without optimization        With optimization
                  Compile    Size  Execute   Compile    Size  Execute
------------------------------------------------------------------------
 Real-mode Compilers

  Microsoft C        1103  137638       94      1344  139190       93
    w/Oxygen          664                        802
  MSC with -qc        160  183922      141
  Borland C++         144  136434       99       146  132130       89
  Zortech C           151  132382       90       310  133470       88
  TopSpeed C          338  130724      103       377  123156       99
  Power C          (wouldn't compile)         (wouldn't compile)

 Protected-mode Compilers

  Watcom C            304   98132      132       334   96320      132
  High C              666  104456       64       719  102104       62
  C Code Builder      594  239841*      80       635  213597*      70
  NDP C            (wouldn't compile)

* The size includes a bound in DOS extender

(b)

  Compiler        Without optimization      With optimization
----------------------------------------------------------------------
  Microsoft C     -AL -FPi -Gs -Za          -AL -FPi -Gs -Za -Ox
  MSC with -qc    -AL -FPi -Gs -Za -qc
  Borland C++     -ml -f -A -H=xs_bc.sym    -ml -f -A -G -A -H=xs_bc.sym
  Zortech C       -ml -b -A                 -ml -b -A -o
  TopSpeed C      option(ansi=>on)          option(ansi=>on)
                                            pragma(speed=>on)
  Power C         (wouldn't compile)        (wouldn't compile)
  Watcom C        -fpi -za -s               -fpi -za -s -oatx
  High C          -Hansi                    -Hansi -O
  C Code Builder  -n -t -O0                 -n -t -O3
  NDP C           (wouldn't compile)

Table 3: CTest benchmark results. (a) Execution rates, in iterations per
second; (b) Switches used in the CTest benchmark.

(a)
                     fib(25)       tree     size
-------------------------------------------------
 Real-mode Compilers

  Microsoft C        1.23457   0.308642    20838
  Zortech C          1.21951   0.284091    16480
  TopSpeed C         1.5625    0.320513    22562
  Borland C++        1.21951   0.306748    27470
  Power C            1.190476  0.082372    21360

 Protected-mode Compilers

  Watcom C           0.98039   0.34364     15320
  High C             1.23457   0.469484    23969
  C Code Builder     1.49254   0.571429    87452*
  NDP C              1.42857   0.425532    32129

* The size includes a bound in DOS extender

(b)

  Compiler        Switches
-------------------------------------------------
  Microsoft C     -AS -FPi87 -Gs -Ox -Za
  Zortech C       -ms -f -o -A
  TopSpeed C      option(ansi=>on),
                  optimize(speed=>on,cpu=>386,
                  copro=>387)
  Borland C++     -ms -f287 -G -O -A
  Power C         /f8 /ms
  Watcom C        -fpi87 -za -s -oatx
  High C          -f387 -Hansi -O
  C Code Builder  -n -t -O3
  NDP C           -n2 -n3 -ansi -O -on -OLM



Protected-Mode Compilers


Watcom C/386 8.0: Watcom C/386 Standard Edition comes with a compiler as well
as a linker, a make utility, a librarian, an editor, and a number of other
utilities. The Professional Edition, which I used for this article, adds a
protected-mode version of the compiler, a profiler, and a source-level
debugger. The Watcom compiler generates executable files that work with Phar
Lap's 386DOS-Extender.
One noteworthy problem is that the protected-mode compiler is not compatible
with HIMEM.SYS that comes with DOS 5.0. Watcom says that the next version will
support HIMEM, but until then you must remove HIMEM from your CONFIG.SYS file
in order to use the protected-mode version of the compiler.
The only trouble I ran into while compiling XScheme under Watcom C was that I
overlooked the section in the manual describing how continuation lines work in
their make program. Traditional Unix make programs use the backslash character
at the end of a line to indicate that the line is to be continued on the next
line. Watcom decided to change this since the backslash (\) character is used
by DOS as a separator in path specifications. Instead, they use the ampersand
character (&) to indicate a continuation line. Once I modified my makefile,
XScheme compiled without any further problems.
Metaware High C 2.32: Metaware High C 386 comes with a protected-mode compiler,
an editor, a source-level debugger, and a profiler as well as a number of
utility programs. Oddly enough, it does not include a make program, so I used
the Borland make utility to build the test programs. High C produces
executables that work with the Phar Lap 386DOS-Extender.
When it comes to problems, there's little to say -- Metaware High C compiled
XScheme without a hitch.
Microway NDP C 3.1: Microway NDP C comes with two versions of the compiler --
one for their NDP Tool Kit and the other for the Phar Lap 386DOS-Extender. For
this article, I used the NDP Tools version of the compiler. In addition to the
compiler, NDP C comes with a librarian, an editor, and a number of utilities.
NDP C doesn't come with a make facility, so I used the Borland make utility to
build the test programs. Unfortunately, I was unable to get XScheme to compile
under NDP C. It seems that NDP C does not allow the name of a macro to be
passed as a parameter to another macro. Because this feature is used by
XScheme to parse arguments to internal functions, I was unable to perform the
tests that involved XScheme compilation and execution. I was able to compile
the CTest program and I've included those results.
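
The construct in question, passing the name of one macro as a parameter to
another, looks like the following in standard C. This is a sketch of the
general technique only: the macro names are hypothetical and are not
XScheme's argument-parsing macros.

```c
#include <assert.h>

/* Hypothetical function-like macros that stand in for handlers. */
#define AS_DOUBLE(x)  ((x) * 2)
#define AS_NEGATED(x) (-(x))

/* A macro that receives the NAME of another macro and invokes it.
   After substitution the result is rescanned, so handler(value)
   expands as an ordinary macro call. */
#define APPLY(handler, value) handler(value)

int apply_both(int value)
{
    assert(value > -10000 && value < 10000);
    return APPLY(AS_DOUBLE, value) + APPLY(AS_NEGATED, value);
}
```

A compiler that rejects a macro name in the argument list cannot expand
APPLY(AS_DOUBLE, value) here.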
Intel 386/486 C Code Builder Kit: The C Code Builder Kit is one-stop shopping
for 32-bit developers and includes a 32-bit C compiler, a DOS extender
compatible with the 0.9 DPMI specification, a source-level debugger, a linker,
a librarian, and a make utility.
I was particularly excited about C Code Builder because I've had many requests
for protected-mode versions of programs I've developed and that I distribute
free of charge to anyone who wants to use them noncommercially. I've been
unable to comply due to the licensing fees for the DOS extenders
required to run them in protected mode. With C Code Builder, I'll finally be
able to build protected-mode executables and distribute them free to anyone
who is interested. This royalty-free policy should be of interest to
commercial developers as well.
The C Code Builder Kit was a late arrival, so I didn't have much time to work
with it. Luckily, I didn't need much time to get both test programs compiled
and running. I ran into only a minor problem in compiling XScheme, which
involved a redundant function declaration not required for an ANSI compiler.
Once I got rid of the offending declaration, everything compiled without a
hitch.


Real-Mode Compilers


Microsoft C 6.0: Microsoft C comes with both DOS and OS/2 versions of the
compiler as well as a librarian, a linker, a debugger, a make program (nmake),
and various utilities. It also includes an integrated development environment,
but I used the command-line version for this article. In addition, Microsoft C
supports developing applications for Windows 3.0 (using the Windows SDK). This
feature was, of course, not addressed by our test suite.
Because I was running a number of compilers that didn't get along with
HIMEM.SYS, I had it disabled. This didn't cause any problems until I tried to
time the compilation of XScheme. To do this, I used a simple timer program
that I wrote for this article. The program takes a DOS command as an argument
and times how long it takes to execute that command. Unfortunately, Microsoft
C ran out of memory when run from this timer program. To get XScheme to
compile, I had to reenable HIMEM to free up some additional memory for
Microsoft C. Under DOS 5.0, doing so enables the operating system to "load
high," freeing up memory below 640K for the compiler's use.
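
A timer of the kind described above can be sketched in a few lines of
standard C. This is not the program used for the article, whose source is not
shown here: the function name time_command is hypothetical, and the sketch
relies on time(), which typically resolves only to whole seconds.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Run a command through the system's command processor and return
   the elapsed wall-clock time in seconds. The value returned by
   system() is stored in *status. */
double time_command(const char *command, int *status)
{
    time_t start, finish;

    assert(command != NULL && status != NULL);

    start = time(NULL);
    *status = system(command);
    finish = time(NULL);

    return difftime(finish, start);
}
```

Because the timed command runs as a child of the timer, the timer itself
occupies memory while the command runs, which is consistent with the
out-of-memory problem described above.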
This points out a potential problem with using Microsoft C in real mode: It
sometimes runs out of memory doing large compiles. Rational Systems' Oxygen
solves this problem, as described above. Oxygen allows you to run under DOS
the protected-mode version of Microsoft C normally used only under OS/2. For
systems with extended memory, this makes much more memory available to
Microsoft C and the Microsoft linker, and makes compiles faster (as you can
see in Table 2).
Borland C++ 2.0: Borland C++ comes with the compiler itself, a librarian, a
linker, a source-level debugger, a profiler, an assembler, a make program, and
various utilities. It also includes an integrated development environment, but
I used the command-line compiler for this article. Another nice feature of
Borland C++ is that it includes the Whitewater Resource Toolkit to allow you
to develop Windows 3.0 applications without any additional software. This can
save you considerable money over buying both Microsoft C and the Windows 3.0
SDK. Note that even though this is a C++ compiler, I only tested its ANSI C
features.
To speed up compilation under Borland C++ (although it was hardly slow to
begin with), I used their precompiled header option. This creates a symbol
table file the first time it encounters a header file. It then uses the
information from that file instead of reparsing the header the next time it is
encountered. This improved compile times by about six percent.
Borland provides a program called tkernel that allows their compiler to run in
protected mode. This makes it possible to compile much larger source files
without running out of memory on machines with extended memory. I tried using
this for the compile time tests but it didn't speed up compilation, so the
results reported here are for the real-mode compiler.


Products Mentioned



Watcom C/386, Version 8.0 Watcom 415 Philip Street Waterloo, Ontario Canada
N2L 3X2 519-886-3700 $1295
Metaware High C, Version 2.32 MetaWare Inc. 2161 Delaware Ave. Santa Cruz, CA
95060-5706 408-429-META $995
Microway NDP C, Version 3.1.0 MicroWay P.O. Box 79 Kingston, MA 02364
508-746-7341 $895 (for 386), $1195 (for 486)
Intel 386/486 C Code Builder Kit Intel 5200 NE Elam Young Parkway Hillsboro,
OR 97124-5961 503-696-8080 $695
Microsoft C, Version 6.00A Microsoft Corp. One Microsoft Way Redmond, WA
98052-6399 206-882-8080 $495
Borland C++, Version 2.0 Borland International 1800 Green Hills Road P.O. Box
660001 Scotts Valley, CA 95066-0001 408-438-8400 $495
Zortech C++, Version 2.1 Zortech, Inc. 4-C Gill Street Woburn, MA 01801
617-937-0696 $450
JPI TopSpeed C, Version 3.00 Jensen & Partners International 1101 San Antonio
Road Mountain View, CA 94043 415-967-3200 $396
MIX Power C, Version 2.0.1 MIX Software 1132 Commerce Drive Richardson, TX
75081 800-333-0330 $19.95
Phar Lap 386DOS-Extender SDK Version 3.0 Phar Lap Software 60 Aberdeen Avenue
Cambridge, MA 02138 617-661-1510 $495
Rational Systems Oxygen, Version 1.1a Rational Systems 220 North Main Street
Natick, MA 01760 508-653-6006 $199
JPI TopSpeed C 3.0 JPI TopSpeed is a multilanguage development environment
that supports C, C++, Modula-2, and Pascal. Any or all of these languages can
be used from a single environment to develop applications. For this article,
however, I used the command-line version of their C compiler. The TopSpeed
package also includes a source-level debugger, a linker, and a profiler.
TopSpeed doesn't include a standard make facility but does support a project
facility for building programs with multiple modules. Actually, the TopSpeed
project system is much more powerful than Unix-style make facilities. It
includes a macro language with conditionals and a very flexible method for
adding language processors. It also keeps track of the options used to compile
source files and will recompile a file when the options that affect it change.
TopSpeed C also compiled XScheme without a problem.
Zortech C++ 2.1 Zortech C++ comes with a compiler, a linker, a librarian, and
a source-level debugger, as well as various other utilities. It also includes
an integrated development environment. One advantage of Zortech C++ is its
availability for a large number of platforms. In addition to the DOS compiler,
Zortech produces OS/2, Unix, and Macintosh compilers. This could make
cross-platform development much easier. Zortech C++ also supports developing
applications for Windows 3.0 using the Microsoft Windows SDK.
Zortech C++ provides a protected-mode version of its compiler that allows the
compilation of very large source files. I tried using this version in the
compilation tests, but because it didn't improve compilation times, I've
reported the real-mode compiler results here.
Finally, I should mention that even though this is a C++ compiler, I only
tested its ANSI C features.
MIX Power C 2.0.1 MIX Power C includes a compiler and a linker. A source-level
debugger and a profiler are available as a separate package. It doesn't
include a standard make program but does support a simple project facility for
building programs with multiple modules. Unfortunately, I couldn't compile
XScheme because Power C can't handle nested calls to the same macro. For
instance, the following pair of macro definitions won't work:
 #define car(x) ((x)->n_car)
 #define foo(x) car(car(x))
I was able to compile the CTest program and I've included those results. I
should note that Power C is a very inexpensive package and perhaps shouldn't
be compared directly with these higher-priced compilers.
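For reference, an ANSI-conforming preprocessor accepts this construct because macro arguments are fully expanded before they are substituted into the outer call. The following is a minimal sketch of the pattern; the NODE type, its value field, and second_car_value( ) are hypothetical and exist only to make the two macros compilable on their own. Only the n_car field name comes from the definitions above.

```c
#include <stddef.h>

/* Hypothetical node type; only the n_car field name comes from the
   macro definitions quoted in the article. */
typedef struct node {
    struct node *n_car;
    int value;
} NODE;

#define car(x) ((x)->n_car)
#define foo(x) car(car(x))   /* expands to ((((x)->n_car))->n_car) */

/* Returns the value stored two car links away from p. */
int second_car_value(NODE *p)
{
    return foo(p)->value;
}
```

A compiler that rejects foo( ) here is most likely refusing to expand car( ) inside its own argument list, which the ANSI rescanning rules permit.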


Conclusion


Be mindful when comparing the execution performance of the protected-mode
compilers with the real-mode compilers. The environments in which they run are
considerably different. Also, their integer performance will vary because the
protected-mode compilers use 32-bit integers and the real-mode compilers use
16-bit integers. Finally, an obvious advantage to running in protected mode is
having access to much larger amounts of memory. These tests do not show that
difference but you should keep it in mind when deciding whether to use a
protected-mode compiler.


XScheme's Timely Role


A few years ago, I received a call from a graduate student at MIT who was
looking for a way for computer science students to use the Macintosh version
of XLisp for homework assignments. The problem was that XLisp is based on
Common Lisp and the course materials assumed Scheme, a dialect of Lisp
invented at MIT by Gerald Sussman and Guy Steele.
One of my original goals in developing XLisp was to build a small, yet
powerful subset of Lisp for small computers. Originally, I based XLisp on
MacLisp. Later, I converted it to be more or less compatible with Common Lisp
when that standard came out, but I've always been uncomfortable with that
shift. Common Lisp is a huge language and implementing even a subset of it
resulted in a language that was much larger than what I had originally planned
for XLisp. Scheme seemed to be a way to solve this problem: It's an elegant,
powerful language without a lot of extra baggage.
So with a few minor edits, XLisp became XScheme. At first, XScheme was a
straight interpreter like XLisp. However, when I added the proper handling of
tail recursion (a feature required of every implementation of Scheme), I
converted it to its present form, a bytecode compiler/interpreter. The version
I used for the C compiler article is 0.28. I've assigned it a version number
of less than 1.0 because I want to add support for debugging before
considering XScheme complete.
One notable change from previous versions of XScheme is the addition of
prototypes to test the ANSI C features of the compilers in the accompanying
article. The use of prototypes is controlled by the __STDC__ preprocessor
symbol, so it is still possible to compile XScheme with non-ANSI compilers.
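One common way to make prototypes conditional on __STDC__ is sketched below. This is not taken from the XScheme source; the PROTO macro and the add_fixnums( ) function are hypothetical names used only to illustrate the technique.

```c
/* Hypothetical names throughout; this is one common idiom, not
   necessarily the one XScheme uses. */
#ifdef __STDC__
#define PROTO(args) args      /* ANSI compiler: keep the parameter list */
#else
#define PROTO(args) ()        /* pre-ANSI compiler: empty parentheses */
#endif

long add_fixnums PROTO((long a, long b));

#ifdef __STDC__
long add_fixnums(long a, long b)
#else
long add_fixnums(a, b)
long a, b;
#endif
{
    return a + b;
}
```

The doubled parentheses in the declaration let a single macro argument carry the whole parameter list, so the same header can serve both kinds of compiler.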
I've written a bench function in XScheme to test the executables generated by
a given compiler and have included it in Listing One (page 93) as an example
of XScheme syntax. bench times the evaluation of an arbitrary expression. To
implement this function, I've added two functions to XScheme. The time
function returns the current time in seconds. The diff-time function takes two
times returned by time and computes the difference in seconds. bench also
reports the number of calls to the garbage collector that occur during the
evaluation of the expression.
Finally, the fib function in Listing One computes Fibonacci numbers. There is
a call to the bench function to time the execution of a call to fib to compute
the 25th Fibonacci number. Keep in mind, however, that XScheme is not a
benchmark in the traditional sense, but provides the opportunity to test a
compiler's performance under real world conditions.
-- D.B.


SEE EXECUTABLES:

XSCHEME.ARC
CTEST.ARC
















August, 1991
SCALING AND PRINTING FAXES FASTER


Some C code, a bit of assembler, and a couple of tricks


 This article contains the following executables: FAX.ARC


Greg Pickles


Greg's 22 years of computer experience ranges from applications programming to
embedded systems design. Currently he is developing custom PC applications as
a consultant, and he can be reached by fax at Elegant Technology Associates,
206-747-9447, or on CompuServe 70303,2435.


After purchasing a fax card, I needed a way to print the images on my laser
printer. Because the printer was on another machine across the network, I
couldn't use the fax card's built-in software.
The resolution of fax images is about 200 x 100 dots per inch (dpi) in
standard mode, and 200 x 200 in fine mode. My fax board stores all images at
200 x 200 -- if necessary, duplicating each horizontal line (for standard
resolution images). So, in order to print images at full size using the 300 x
300 dpi resolution of my laser printer, the images must be scaled at a 2:3
ratio.
I first tried printing the images in PostScript mode, using the PostScript
image scaling operator. This worked, but it took more than five minutes per
page. I then decided to use the PCL4 printer language native to HP's
LaserJet II (and emulated by my printer). PCL4 doesn't support general scaling
of raster images (there are fixed ratios of 1:2, 1:3, and 1:4), so I had to do
this part myself. I wrote the scaling routine in assembler and the bulk of the
program in C, and, after a couple of optimization tricks, I can now print many
fax pages in about a minute.


Scaling Algorithm


My scaling technique is conceptually very simple: Take every 2 x 2 pixel group
from the original image and map it to a 3 x 3 pixel group in the scaled image
(see Figure 1). The quality of fax images as received is poor enough that such
a straightforward algorithm is perfectly adequate.
Because the scaling ratio is not integral, there is room for artistic license
in mapping 2 x 2 groups to 3 x 3 groups. Figure 1 shows the mapping I used.
Because I prefer a bolder look for ease of reading, this mapping favors black
pixels. You can easily change it to one you prefer. Each input group has a code
based on the four pixels it contains (the top two pixels are bits 0 and 1, the
bottom two are bits 2 and 3).
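The group code can be sketched in C as follows. This is an illustration of the bit layout just described, not the author's routine (which does the same job in assembler); the function name group_code( ) is hypothetical, and it assumes eight pixels per byte with the high-order bit as the left-most pixel.

```c
/* Returns the 4-bit code for 2 x 2 group g (0 = left-most) taken from a
   pair of bytes on adjacent raster lines. The high-order bit of each
   byte is assumed to be the left-most pixel. */
unsigned group_code(unsigned char top, unsigned char bottom, int g)
{
    int shift = 6 - 2 * g;               /* groups sit in bits 7-6, 5-4, 3-2, 1-0 */
    unsigned t = (top >> shift) & 3u;    /* top pair    -> code bits 0 and 1 */
    unsigned b = (bottom >> shift) & 3u; /* bottom pair -> code bits 2 and 3 */
    return (b << 2) | t;
}
```

For example, a top byte of 0xC0 over a bottom byte of 0x00 gives code 3 for the left-most group (both top pixels set) and code 0 for the other three groups.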
PCX and PCL4 represent monochrome raster image data in a similar manner. A
raster line is stored as a series of bytes each representing eight pixels on a
raster line. The high-order bit in a byte corresponds to the left-most pixel.
One major difference is that PCX files represent a white pixel by 1 and PCL4
represents it by 0.
Figure 2 shows how two input data bytes from adjacent raster lines are divided
into four input group codes. The scale ratio dictates handling the output
data in groups of 3 bits. Because an integral number of these groups does
not fit in a byte, I had to cover the cases where a group spans a byte
boundary. My scaling routine handles this problem by accessing output data in
words, using word addresses on both odd- and even-byte boundaries so that the
current output group always falls entirely within a word.
Figure 3 shows how eight output groups correspond to bytes of output data. For
groups 0 through 4 the data is accessed as word 0. For groups 5 through 7 it
is accessed as word 1. The ninth group is treated exactly like the first, the
tenth like the second, and so forth.
The mapping from 2 x 2 to 3 x 3 groups is done with a translation table. Each
table entry consists of four words; the first three words are data for the
rows of the output group, the fourth word is a dummy placeholder so that a
simple shift instruction can derive table offsets. Figure 1 lists the output
data for each kind of group. Figure 3 shows 3 bytes of output data. Note that,
due to Intel byte-ordering conventions, the bytes get swapped when accessed as
a word.
The 3 bits of output data are stored in bits 8 through 10 of the words in the
mapping table. For each output group, these 3 bits are shifted by an
appropriate amount to position them properly before ORing them into the
output. In my routine, the shifting is always performed by a ROtate Left (ROL)
instruction.
The rotations for output groups 0 through 7, respectively, are as follows:
13, 10, 7, 4, 1, 6, 3, and 0. The rotation value starts at 13 and decreases by
3 for
each group until 1 has been reached. These values correspond to output groups
0 through 4 shown in Figure 3. After a rotation of 1 is used, the output
address must be incremented and the rotation count set to 6 to process the
groups in the next word (which overlaps the previous word by 1 byte). After
the step with a rotation count of 0, the whole process starts over.
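The rotation schedule can be checked with a small C model. The sketch below is an assumption-laden illustration rather than a translation of Listing One: rol16( ) and place_groups( ) are hypothetical names, and the little-endian word access is emulated with two byte accesses so that the code behaves the same on any host.

```c
#include <stddef.h>

/* Rotate a 16-bit value left by n (0..15). */
static unsigned rol16(unsigned v, unsigned n)
{
    v &= 0xFFFFu;
    return n ? ((v << n) | (v >> (16 - n))) & 0xFFFFu : v;
}

/* ORs one row of 3-bit groups into out[]. bits[i] holds group i's three
   pixels in its low 3 bits (bit 2 = left-most). out[] must be zeroed
   beforehand and hold at least (3*count+7)/8 + 1 bytes. */
void place_groups(const unsigned char *bits, size_t count, unsigned char *out)
{
    unsigned rot = 13;                 /* rotation for the first group */
    size_t i;

    for (i = 0; i < count; i++) {
        /* table-style word: the 3 pixels start out in bits 8..10 */
        unsigned w = rol16((unsigned)(bits[i] & 7u) << 8, rot);
        out[0] |= (unsigned char)(w & 0xFFu);  /* low byte = lower address */
        out[1] |= (unsigned char)(w >> 8);     /* high byte = next address */

        if (rot > 1) {
            rot -= 3;                  /* 13,10,7,4 and later 6,3 */
        } else if (rot == 1) {
            rot = 6;                   /* switch to the overlapping word */
            out += 1;
        } else {
            rot = 13;                  /* rot == 0: pattern starts over */
            out += 2;
        }
    }
}
```

Feeding it eight groups that alternate between all black (7) and all white (0) should produce the bytes E3, 8E, and 38 hex, which is the 24-pixel pattern 111000 repeated four times.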


Scaling Function


Because the scaling requires a lot of bit manipulation, I wrote the key
routine in assembly language (Listing One, page 136). It is written for MASM
5.1 and uses MASM's high-level language support. The C-callable function takes
four parameters: a far pointer to two raster lines of input, a far pointer to
a buffer to receive the scaled output, the number of bytes in each input line,
and a flag indicating whether to invert the data or not. The second line of
input data is expected to immediately follow the first in memory. Likewise,
the output data lines will be created in consecutive memory locations.
First, Scale2to3 sets the outer loop count to the number of bytes in an input
line and computes the number of bytes in an output line. The output line size
is saved in DX for use in accessing the three lines in the output buffer.
Next the function computes the size of the output buffer and fills it with 0s.
The code assumes that the buffer will be word-aligned and the caller must
ensure that it is. Most variables in C end up word-aligned anyway. (In
Microsoft C, dynamic memory allocation will always be word-aligned.)
At the label S23_15, DS:SI is loaded with the segment:offset of the mapping
table. CL is loaded with the initial rotate amount. Label S23_30 starts the
outer loop that processes bytes of input data. A byte from the first line is
loaded into AH and from the second line into AL. If the inversion parameter is
set, the data in AX is complemented bit by bit (with a NOT instruction).
Label S23_40 starts the inner loop that processes the four input groups
contained in AX. Every time the routine comes to S23_40, the bits to use for
the input group code are the high bits in AH and AL. The next seven
instructions isolate these bits and convert them into an input group number in
AX. In the process, the input data is saved on the stack so that it will be
available the next time around the inner loop. The group number is then
shifted left three to get an offset into the mapping table. Having the offset
into the table, it is merely a matter of loading three words, rotating them as
necessary, and ORing them into the output buffer at the correct address.
At the bottom of the inner loop, the bookkeeping to manage the rotate count
and output address is done. If CL is above 1, 3 is subtracted from it. If CL
is equal to 1, it is set to 9 and the output address is incremented. This is
the point at which access to the output data switches from one word address to
the subsequent, overlapping word. If CL is 0, it is set to 16 and the output
address is incremented twice. The new values loaded into CL are 3 more than
actually needed (6 and 13) because 3 will be subtracted at label S23_60.
After the inner loop has been traversed four times, the outer loop count is
decremented and the input data pointer is incremented. This continues until
all of the input data has been processed.
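When modifying the assembler routine, it can help to have a slow but obvious reference to compare against. The following is a sketch of such a reference in portable C, under these assumptions: the names kMap, scaled_line_bytes( ), set_pixel( ), and scale2to3_ref( ) are hypothetical, the table rows are transcribed from MapTbl in Listing One as 3-bit values, and pixels are placed by plain bit indexing rather than by the word-rotation technique described above.

```c
#include <string.h>

/* Rows of each 3 x 3 output group, transcribed from MapTbl in Listing One
   (3 bits per row, bit 2 = left-most pixel). */
static const unsigned char kMap[16][3] = {
    {0,0,0}, {3,3,0}, {6,6,0}, {7,7,0}, {0,3,3}, {3,3,3}, {6,7,3}, {7,7,3},
    {0,6,6}, {3,7,6}, {6,6,6}, {7,7,6}, {0,7,7}, {3,7,7}, {6,7,7}, {7,7,7}
};

/* Bytes in one scaled line: nBytes * 3/2, rounded up. */
int scaled_line_bytes(int nBytes)
{
    return nBytes + (nBytes + 1) / 2;
}

/* Sets pixel x (0 = left-most) in a raster line. */
static void set_pixel(unsigned char *line, long x)
{
    line[x >> 3] |= (unsigned char)(0x80u >> (x & 7));
}

/* in:  two consecutive input lines of nBytes each
   out: three consecutive output lines of scaled_line_bytes(nBytes) each */
void scale2to3_ref(const unsigned char *in, unsigned char *out,
                   int nBytes, int invert)
{
    int outBytes = scaled_line_bytes(nBytes);
    int i, g, row, col;

    memset(out, 0, (size_t)outBytes * 3);
    for (i = 0; i < nBytes; i++) {
        unsigned char top = in[i];
        unsigned char bottom = in[nBytes + i];
        if (invert) {
            top = (unsigned char)~top;
            bottom = (unsigned char)~bottom;
        }
        for (g = 0; g < 4; g++) {
            int shift = 6 - 2 * g;
            unsigned code = ((top >> shift) & 3u) |
                            (((bottom >> shift) & 3u) << 2);
            long x0 = ((long)i * 4 + g) * 3;  /* first output pixel of group */
            for (row = 0; row < 3; row++)
                for (col = 0; col < 3; col++)
                    if (kMap[code][row] & (4u >> col))
                        set_pixel(out + (size_t)row * outBytes, x0 + col);
        }
    }
}
```

Running both routines over the same pair of input lines and comparing the three output lines byte for byte is a quick way to catch a mistake in the table or in the rotate bookkeeping.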


Putting it to Use


Having a fast function to perform the scaling was only half the battle; I
needed an efficient program to use it that wouldn't destroy all my hard-won
speed. Listings Two (page 136) and Three (page 138) present the C code for a
program that reads a PCX file, applies the 2:3 scaling, and outputs PCL4
commands for a printer. To improve performance, I used buffered I/O. C
provides buffered I/O but for some reason (using MSC 5.1), output characters
would on occasion disappear.
Rather than fix the problem, I wrote my own I/O routines. These are not shown
in the printed listings here, but are provided in the electronic version of
the listings (see "Availability," page 3). Five functions provide the needed
operations, using DOS "handle I/O." GetBufCh() returns the next byte from a
file, reading data into the buffer as needed. (PCX files must be processed 1
byte at a time, so this is all that's needed.) Output, however, is usually
performed on blocks of data (an entire raster line, for example). BufWrite( )
copies blocks of data into the output buffer until it fills up; then BufFlush(
) writes out the buffer. FileOpen( ) and FileClose( ) complete the function
set.
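Since the buffered I/O routines are not printed, here is a rough sketch of the block-output idea. It is not the author's code: the OUTBUF structure and the out_write( ) and out_flush( ) functions are hypothetical, the buffer size is arbitrary, and the sketch writes through a standard FILE pointer where the original uses DOS handle I/O.

```c
#include <stdio.h>
#include <string.h>

#define OUTBUF_SIZE 4096          /* arbitrary size chosen for the sketch */

typedef struct {                  /* hypothetical; not the article's FILEBUFFER */
    FILE *fp;
    size_t used;
    unsigned char data[OUTBUF_SIZE];
} OUTBUF;

/* Writes out whatever is buffered. Returns 0 on success, -1 on error. */
int out_flush(OUTBUF *b)
{
    size_t n = b->used;
    b->used = 0;
    return (n == 0 || fwrite(b->data, 1, n, b->fp) == n) ? 0 : -1;
}

/* Copies a block into the buffer, flushing each time the buffer fills. */
int out_write(OUTBUF *b, const void *src, size_t len)
{
    const unsigned char *p = (const unsigned char *)src;

    while (len > 0) {
        size_t room = OUTBUF_SIZE - b->used;
        size_t n = len < room ? len : room;
        memcpy(b->data + b->used, p, n);
        b->used += n;
        p += n;
        len -= n;
        if (b->used == OUTBUF_SIZE && out_flush(b) != 0)
            return -1;
    }
    return 0;
}
```

The caller remains responsible for one final out_flush( ) before closing the file, since a partly filled buffer is never written automatically.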


Reading PCX Files


Listings Two and Three give the source code for the PCX- and LaserJet-related
parts of the program. The header file in Listing Two declares a structure that
passes options and control information from main( ) to PCXToHP( ) and a
structure for reading the PCX file header. The PCX file reading functions are
loosely based on Kent Quirk's PCX-to-Postscript program (DDJ, August 1989) so
I won't discuss them in detail.
The main( ) routine, after some initialization and option parsing, opens the
output file. The output file/device name can be specified on the command line
or by an environment variable named PCXHP. If neither is present, the program
defaults to PRN.

In monochrome PCX file format, a bit value of 1 means a white pixel, and 0
means black -- the opposite of how PCL4 interprets things. Consequently the
OPTIONS structure is initialized with inversion enabled to get a normal (black
on white) printed page.
PcxReadLines( ) reads input data and processes it to form a requested number
of raster lines. If it runs out of data at the beginning of the first line, it
returns FALSE to indicate the end of input data. If it runs out of data after
the beginning of the first line, it fills the remainder of the requested lines
with 1 bits, making all lines past the end of the PCX data white.
Some programs generate a PCX file with more bytes of data in a raster line
than are actually required for the number of pixels. The Windows Paintbrush
program is a notable example, filling any bits and/or bytes beyond the end of
the raster image with 0s which, if interpreted as image data, print as black.
PcxReadLines( ) sets any bits beyond the end of the image data to 1 so that
they will print as white. Three members of the OPTIONS structure allow
PcxReadLines( ) to deal with extra bits efficiently. sEndAdjust is set to the
number of bytes at the end of a raster line requiring adjustment. sAdjOffset
is set to the offset of the first byte needing adjustment. ucMask has bits set
which are ORed into the first adjusted byte to accommodate images ending in
the middle of a byte.
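The end-of-line adjustment might look something like the sketch below. The three local variables are computed with the same expressions PCXToHP( ) uses in Listing Three to fill in sAdjOffset, sEndAdjust, and ucMask; the function whiten_line_end( ) itself is hypothetical and is not the code inside PcxReadLines( ).

```c
/* Forces every bit beyond the image width to 1 (white in PCX terms).
   line holds lineBytes bytes; widthPixels is the true image width. */
void whiten_line_end(unsigned char *line, int lineBytes, int widthPixels)
{
    int adjOffset = widthPixels / 8;          /* first byte needing adjustment */
    int endAdjust = lineBytes - adjOffset;    /* number of bytes to adjust */
    unsigned char mask = (unsigned char)(0xFF >> (widthPixels % 8));
    int i;

    if (endAdjust <= 0)
        return;                               /* image fills the whole line */
    line[adjOffset] |= mask;                  /* image may end mid-byte */
    for (i = adjOffset + 1; i < lineBytes; i++)
        line[i] = 0xFF;                       /* bytes wholly past the image */
}
```

For a 10-pixel-wide image stored in 4-byte lines, the second byte keeps its top two bits and has the remaining six set, and the third and fourth bytes become FF hex.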


Optimizing Output


A fax image for a letter-size page scaled for a laser printer contains over
one million bytes of data. Just transferring the data to my printer (using
COPY /B in DOS) took over 3.5 minutes. The amount of data sent can be reduced
if the printer understands some form of data compression. Unlike PCL5, PCL4
does not support compressed data, so I had to do something else.
In PCL4, raster graphics are printed on a page by "painting" the black dots;
there is no need to print the white ones (all dots are white by default).
Because a typical fax image consists of a lot of white space, PCXToHP( )
exploits this to reduce the amount of data sent to the printer. By skipping
the bytes that are blank at the beginning and end of a line (and by skipping
blank lines entirely), PCXToHP( ) may have to send less than 20 percent of the
original data. (The savings, of course, depend strongly on the contents of the
image.)
Another optimization, not implemented here, is to search for runs of
all-white bytes within a line and skip them if they are longer than about 20
bytes (the number of bytes required by the additional cursor positioning
commands).
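A starting point for that unimplemented optimization could be a routine that locates long runs of white bytes. The sketch below is purely illustrative: find_blank_run( ) is a hypothetical name, white is taken to be a 0 byte (as in the scaled PCL4 data), and the threshold is left to the caller.

```c
/* Finds the first run of at least minRun zero bytes in buf[start..len-1].
   Returns the index where the run begins and stores its length in
   *runLen, or returns -1 if no such run exists. */
int find_blank_run(const unsigned char *buf, int len, int start,
                   int minRun, int *runLen)
{
    int i = start;

    while (i < len) {
        int j;
        if (buf[i] != 0) {
            i++;
            continue;
        }
        j = i;
        while (j < len && buf[j] == 0)   /* measure this run of zeros */
            j++;
        if (j - i >= minRun) {
            *runLen = j - i;
            return i;
        }
        i = j;                           /* run too short; keep looking */
    }
    return -1;
}
```

A caller would send the bytes before the run, issue a new X-Y positioning command for the data after it, and then resume the search from the end of the run.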
To print raster graphics using PCL4, the program positions the print cursor
and then issues a start raster graphics command, which sets the left margin (X
position) at the current location. Each raster line moves the print cursor one
dot down the page (Y position) but keeps the same left margin. Consecutive
lines at the same X position need only be sent to the printer one after the
other; the Y position for each line is automatically adjusted. White space can
be skipped with a Y positioning command. Whenever the X position changes, a
full X-Y position command must be sent, along with a start graphics.
The first part of PCXToHP( ) opens the PCX file, reads its header, and
verifies that it is a monochrome file. It then determines if there are any
bits at the end of each raster line that must be masked because they are not
part of the image, and sets the members of the OPTIONS structure
appropriately. Next it computes the number of bytes in a scaled line and
allocates a buffer to hold three of them. Then it outputs a command to set
graphics resolution to 300 dpi.
The key variables are iCurX, iCurY, iNewX, and iNewY. iCurX and iCurY hold the
position at which the next graphics data will be printed if no positioning
commands are issued. They are initialized to -1 to ensure that an X-Y
positioning command is sent before the first line of data. iNewX and iNewY are
set to the position at which the next graphics data must be printed. They are
initialized to the X-Y position specified by the OPTIONS structure for the
image. If iNewX differs from iCurX, a full X-Y positioning command is
required; if only iNewY differs from iCurY, a Y positioning command suffices.
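The choice of positioning command can be isolated in a small helper, sketched here. The escape sequences match the format strings in Listing Three, but build_position( ) is a hypothetical function written for illustration; the listing performs these steps inline. The escape character is written as the octal escape \033 so that the letters following it cannot be absorbed into a hexadecimal escape.

```c
#include <stdio.h>

/* Builds the positioning prefix needed before the next raster line and
   updates the current position. Returns the number of characters placed
   in dst (0 if no command is needed). dst should hold at least 40 bytes. */
int build_position(char *dst, int *curX, int *curY, int newX, int newY)
{
    int n = 0;

    dst[0] = '\0';
    if (newX != *curX) {
        /* X changed: full X-Y move, then start raster graphics there */
        n = sprintf(dst, "\033*p%dx%dY\033*r1A", newX, newY);
        *curX = newX;
        *curY = newY;
    } else if (newY != *curY) {
        /* same left margin: a Y move skips the blank lines */
        n = sprintf(dst, "\033*p%dY", newY);
        *curY = newY;
    }
    return n;
}
```

After each raster line is sent, the caller would add 1 to the current Y value, since the printer advances one dot row on its own.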
The outer loop reads through the input data two lines at a time, scales it to
three lines of output data, determines when it is appropriate to skip data
because it is blank, and outputs the data with appropriate positioning
commands. The loop continues until all lines in the image are printed or it
comes to the end of the PCX file.
The inner loop deals with each of the three output lines. IndexNE( ) is a
function returning the index of the first byte in a buffer not equaling a
value, or -1 if all bytes in the buffer equal the value. It is used here to
find the index (stored in iFront) of the first nonzero byte in a line. If all
bytes are 0, nothing happens except that iNewY is incremented. If there is
something to print, iFront is used to compute the value of iNewX for the line.
CntREQ( ) is a function that counts the number of bytes equal to a value
working backwards (decrementing) from an address. It is used here to determine
the number of bytes (stored in iBack) of 0 at the end of the line.
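The two helper functions are simple enough to sketch from their descriptions. The versions below are assumptions based on the prose and on the prototypes in Listing Two, not the author's implementations; they are given the lowercase names index_ne( ) and cnt_req( ) to keep them distinct, and trim_blank( ) is a hypothetical wrapper showing how the two results combine.

```c
/* Index of the first byte in buf[0..len-1] not equal to val,
   or -1 if every byte equals val. */
int index_ne(const unsigned char *buf, unsigned char val, int len)
{
    int i;
    for (i = 0; i < len; i++)
        if (buf[i] != val)
            return i;
    return -1;
}

/* Counts bytes equal to val, scanning backwards from the address 'last'
   for at most len bytes. */
int cnt_req(const unsigned char *last, unsigned char val, int len)
{
    int n = 0;
    while (n < len && *(last - n) == val)
        n++;
    return n;
}

/* Returns the number of bytes worth sending for one line and stores the
   index of the first of them in *front. Returns 0 for a blank line. */
int trim_blank(const unsigned char *line, int len, int *front)
{
    int f = index_ne(line, 0, len);
    if (f < 0)
        return 0;
    *front = f;
    return len - f - cnt_req(line + len - 1, 0, len);
}
```

For an 8-byte line containing 00 00 10 FF 00 00 00 00 hex, this reports 2 bytes to send, starting at index 2.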
Based on the values of iCurX, iCurY, iNewX, and iNewY a positioning command is
sent, if necessary. A start graphics command is then sent, followed by the
graphics data. iFront is used to compute the address of the first byte of
graphics data to be output and iFront and iBack are used to compute the actual
number of bytes to send. After the last line has been processed, an end
graphics command is sent, and a form feed causes the page to be printed.
With PCXHP, I am now happily printing faxes on my laser printer in a
reasonable amount of time.


[LISTING ONE]

;-----------------------------------------------------------------------------
; Scale2to3 -- by Greg Pickles -- C callable assembly language routine to
; expand 2 lines of 200 DPI bitmap to 3 lines of 300 DPI bitmap. Assumes that
; all memory for storing the lines is allocated outside this routine.
;-----------------------------------------------------------------------------
.model large,c
.286
 .data
 ;----------------------------------------------------------------
 ; MapTbl contains the output bit map for each 2x2 input section
 ; Entries are groups of 4 words, with the 4th word a placeholder.
 ;----------------------------------------------------------------
MapTbl dw 0000000000000000b ;0
 dw 0000000000000000b
 dw 0000000000000000b
 dw 0
 dw 0000001100000000b ;1
 dw 0000001100000000b
 dw 0000000000000000b
 dw 0
 dw 0000011000000000b ;2
 dw 0000011000000000b
 dw 0000000000000000b
 dw 0
 dw 0000011100000000b ;3
 dw 0000011100000000b
 dw 0000000000000000b
 dw 0
 dw 0000000000000000b ;4
 dw 0000001100000000b
 dw 0000001100000000b
 dw 0
 dw 0000001100000000b ;5
 dw 0000001100000000b

 dw 0000001100000000b
 dw 0
 dw 0000011000000000b ;6
 dw 0000011100000000b
 dw 0000001100000000b
 dw 0
 dw 0000011100000000b ;7
 dw 0000011100000000b
 dw 0000001100000000b
 dw 0
 dw 0000000000000000b ;8
 dw 0000011000000000b
 dw 0000011000000000b
 dw 0
 dw 0000001100000000b ;9
 dw 0000011100000000b
 dw 0000011000000000b
 dw 0
 dw 0000011000000000b ;a
 dw 0000011000000000b
 dw 0000011000000000b
 dw 0
 dw 0000011100000000b ;b
 dw 0000011100000000b
 dw 0000011000000000b
 dw 0
 dw 0000000000000000b ;c
 dw 0000011100000000b
 dw 0000011100000000b
 dw 0
 dw 0000001100000000b ;d
 dw 0000011100000000b
 dw 0000011100000000b
 dw 0
 dw 0000011000000000b ;e
 dw 0000011100000000b
 dw 0000011100000000b
 dw 0
 dw 0000011100000000b ;f
 dw 0000011100000000b
 dw 0000011100000000b
 dw 0

 .code
;--------------------------------------------------------------------
; void Scale2to3(char far*,char far*,short,short);
;--------------------------------------------------------------------
 public Scale2to3
Scale2to3 proc uses si di ds,pIn:PTR,pOut:PTR,nBytes:WORD,InvFlg:Word
LOCAL OuterCnt:WORD
S23_0:
 mov ax,nBytes ;get number of bytes in input line
 mov OuterCnt,ax ;set outer loop count
 mov dx,ax ;mult AX by 3/2 and put result in DX
 shr ax,1 ;div AX by 2
 jnc S23_10 ;if carry, then there is a fractional
 inc ax ; byte in the result, so inc for it
S23_10: add dx,ax ;save # of bytes in output line in dx
 ;fill the output buffer with 0

 mov cx,dx ;multiply output line size by 3
 shl cx,1
 add cx,dx
 mov bx,cx ;save CX to test for odd value later
 shr cx,1 ;divide CX by 2 to get word count
 les di,pOut ;get pointer to output buffer
 sub ax,ax ;get fill value
 rep stosw
 test bl,1 ;see if extra byte to fill
 jz S23_15
 stosb
S23_15: mov ax,@data
 mov ds,ax
 mov si,offset MapTbl
 mov cl,13 ;amount to shift initial output value
 ;top of loop that processes a byte of input
 ; register usage:
 ; AX = input data and output data
 ; BX = offset into map table
 ; CH = inner loop counter
 ; CL = shift count for this output group
 ; DX = size of output line
 ; DS:SI pointer to map table
 ; ES:DI pointer to output word
S23_30: les di,pIn ;get pointer to 1st input line
 mov ah,byte ptr es:[di] ;get data from line 1
 add di,nBytes
 mov al,byte ptr es:[di] ;get data from line 2

 test InvFlg,0ffffh ;see if we need to invert
 jz S23_35
 not ax
S23_35: mov ch,4 ;do 4 2-bit segments in next loop
 mov es,word ptr pOut+2 ;get segment address of output
S23_40: rol ax,2 ;bits we want are in 0,1,8,9
 push ax
 and ax,303h ;mask out other bits
 shl ah,2 ;move bits from 8,9 to 10,11
 or al,ah ;or them into al
 sub ah,ah ;clear ah
 shl ax,3 ;ax now has offset into enlarge table
 mov bx,ax

 mov ax,[si+bx] ;get output value
 rol ax,cl ;shift it
 mov di,word ptr pOut
 or es:[di],ax ;or into output

 add di,dx ;get pointer to line 2 of output
 mov ax,[si+bx+2] ;get output value
 rol ax,cl ;shift it
 or es:[di],ax ;or into output

 add di,dx ;get pointer to line 3 of output
 mov ax,[si+bx+4] ;get output value
 rol ax,cl ;shift it
 or es:[di],ax ;or into output

 pop ax

 ;adjust the shift count for the output data
 ;and the output pointer, if necessary
 cmp cl,1 ;see if we need to bump output pointer
 ja S23_60 ;jump if just need to adjust count
 ;CL is either 0 or 1 so it must become
 ; either 13 or 6, respectively
 ; NOTE: we are later going to sub 3
 ; so set CL what we want + 3
 mov cl,9 ;assume CL was 1
 je S23_50 ;jump if CL=1
 mov cl,16 ;oops! guessed wrong, so make it 13 (+3)
 inc word ptr pOut ;when CL goes from 0 to 13, need to
 ; bump the output pointer by 2
S23_50: inc word ptr pOut ;increment output pointer
S23_60: sub cl,3
 dec ch ;decrement inner loop counter
 jnz S23_40 ;jump to inner loop if not 0

 inc word ptr pIn ;increment input pointer
 dec OuterCnt ;decrement outer loop counter
 jnz S23_30

 ret
Scale2to3 endp
 end





[LISTING TWO]

/*******************************************************************
 * PCXHP.H by Greg Pickles -- Header file for FAX image print program.
 *******************************************************************/

typedef struct { // This struct passes control information.
 SHORT sXpos; // x pos for image on page in pixels
 SHORT sYpos; // y pos for image on page in pixels
 SHORT sInv; // TRUE to invert image
 SHORT sEndAdjust; // number of bytes at end of a raster line to adjust
 // because they are beyond the end of the actual image
 SHORT sAdjOffset; // offset in line of first byte to adjust
 UCHAR ucMask; // mask to OR in to the first byte that is
 // adjusted (image may end in middle of byte)
} OPTIONS;
typedef struct { // This struct is the PCX file header.
 UCHAR ucPcxId; // PCX ID, always 0x0a
 UCHAR ucVer; // PCX version
 UCHAR ucEncMeth; // 1 = run length
 UCHAR ucBPP; // bits per pixel
 USHORT usUpLeftX, usUpLeftY; // position of upper left corner
 USHORT usLoRightX, usLoRightY; // position of lower right corner
 USHORT usDispXRes, usDispYRes; // resolution of display
 UCHAR aucPalette[48]; // palette data
 UCHAR ucRes;
 UCHAR ucNumPlanes; // number of bit planes of data
 USHORT usBytePerLine; // # bytes in an raster line
 UCHAR ucRes2[60];

} PCX_HDR;

/*-------------- Function prototypes-------------------------------------*/
int PCXToHP(char*, FILEBUFFER*, OPTIONS*);
void usage(void);
PCX_HDR *PcxReadHeader(PCX_HDR*, FILEBUFFER*);
void pcx_print_header(PCX_HDR*);
UCHAR *pcx_alloc_line(PCX_HDR*, SHORT);
int PcxReadLines(PCX_HDR*, UCHAR*, FILEBUFFER*, SHORT, OPTIONS*);
UCHAR *pcx_test_line(PCX_HDR*, UCHAR*, SHORT, SHORT);
UCHAR *ScanNE(UCHAR*, UCHAR, int);
int IndexNE(UCHAR*, UCHAR, int);
int CntREQ(UCHAR*, UCHAR, int);
int main(int, char**);

void Scale2to3(char far*,char far*,short,short);





[LISTING THREE]

/*******************************************************************
 * PCXHP.C by Greg Pickles --- FAX to LaserJetII image print program.
 *******************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <memory.h>
#include <malloc.h>
#include <fcntl.h>
#include <io.h>
#include <sys\types.h>
#include <sys\stat.h>
#include <conio.h>

#include "bufio.h"
#include "pcxhp.h"

/*******************************************************************
 * PCXToHP -- Process a PCX file. Returns 0 if successful, non-0 if error
 * pszFileName : pointer to the input file name
 * pstrcOutFile : pointer to output file buffer structure
 * pstrcOpt : pointer to OPTIONS struc for processing this file
 *******************************************************************/
int PCXToHP(char *pszFileName, FILEBUFFER *pstrcOutFile, OPTIONS *pstrcOpt)
{
FILEBUFFER *pstrcInFile;
PCX_HDR strcHdr;
SHORT i, k, sXsize, sYsize, sLineBytes, sExpLineSize;
UCHAR *pucLine, *pucExpBuf;
int iCurX = -1, iCurY = -1;
int iNewX, iNewY, iFront, iBack;
char szFileName[80];
char szHpLineLead[20];
char szHpPos[50];
 // these strings are for building PCL4 commands
static char szPosFmtFXYO [] = "\x01b*p%dx%dY\x01b*r1A";

static char szPosFmtY [] = "\x01b*p%dY";
static char szGrDataFmt [] = "\x01b*b%dW";
static char szEndGr [] = "\x01b*rB";
static char szRes300 [] = "\x01b*t300R";

 strcpy(szFileName, pszFileName); // make local copy of name
 if (strchr(szFileName,'.') == NULL) // add .PCX if needed
 strcat(szFileName,".pcx");
 // allocate input file buffer and open file, using my buffered I/O routines
 if ( (pstrcInFile=FileOpen(szFileName,0)) == NULL )
 return 1;
 if (PcxReadHeader(&strcHdr,pstrcInFile) == NULL) {
 fprintf(stderr,"Error reading PCX header in file '%s'\n",
 szFileName);
 close(pstrcInFile->hFile);
 return 3;
 }
 if (strcHdr.ucNumPlanes != 1) {
 fprintf(stderr,"Error: Not a monochrome PCX file.\n");
 close(pstrcInFile->hFile);
 return 5;
 }
 // extract size of line, compute number of rows,
 // allocate buffer for 2 input lines
 sLineBytes = strcHdr.usBytePerLine;
 sYsize = strcHdr.usLoRightY - strcHdr.usUpLeftY + 1;
 pucLine = malloc(2*sLineBytes);
 // determine whether any bits/bytes need to be masked
 // at the end of a decompressed line
 sXsize = (strcHdr.usLoRightX - strcHdr.usUpLeftX + 1);
 if ( sXsize/8 < sLineBytes ) {
 pstrcOpt->sEndAdjust = sLineBytes - sXsize/8;
 pstrcOpt->sAdjOffset = sXsize/8;
 pstrcOpt->ucMask = (UCHAR) (0xff >> sXsize%8);
 }
 // compute length of scaled line and allocate buffer
 sExpLineSize = strcHdr.usBytePerLine + (strcHdr.usBytePerLine/2) +
 ((strcHdr.usBytePerLine & 1) ? 1 : 0);
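 pucExpBuf = malloc(sExpLineSize*3);
 if (pucLine == NULL || pucExpBuf == NULL) { // added check: out of memory
 fprintf(stderr,"Error: out of memory.\n");
 free(pucLine);
 free(pucExpBuf);
 FileClose(pstrcInFile);
 return 7;
 }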
 // set HP graphics resolution to 300 DPI
 BufWrite(pstrcOutFile,szRes300, strlen(szRes300));
 // init position on page
 iNewX = pstrcOpt->sXpos;
 iNewY = pstrcOpt->sYpos;

 for (i=0; i<sYsize; i+=2) {
 if ( !PcxReadLines(&strcHdr, pucLine, pstrcInFile, 2, pstrcOpt) )
 break;
 Scale2to3(pucLine,pucExpBuf,sLineBytes,pstrcOpt->sInv);

 for (k=0; k<3; k++) {
 if ((iFront=IndexNE(pucExpBuf+k*sExpLineSize,0,sExpLineSize)) >= 0) {
 iNewX = pstrcOpt->sXpos + iFront*8;
 iBack=CntREQ(pucExpBuf+(k+1)*sExpLineSize-1,0,sExpLineSize);
 if (iNewX != iCurX) {
 sprintf(szHpPos,szPosFmtFXYO,iNewX,iNewY);
 BufWrite(pstrcOutFile,szHpPos,strlen(szHpPos));
 iCurX = iNewX;
 iCurY = iNewY;

 }
 else if (iNewY != iCurY) {
 sprintf(szHpPos,szPosFmtY,iNewY);
 BufWrite(pstrcOutFile,szHpPos,strlen(szHpPos));
 iCurY = iNewY;
 }
 // note: a possible optimization is to remember the previous
 // leadin string, value of iFront, and leadin string length
 // and only create them when they change
 sprintf(szHpLineLead,szGrDataFmt, sExpLineSize-iFront-iBack);

 BufWrite(pstrcOutFile,szHpLineLead,strlen(szHpLineLead));
 BufWrite(pstrcOutFile,pucExpBuf+k*sExpLineSize+iFront,
 sExpLineSize-iFront-iBack);
 iCurY++;
 }
 iNewY++;
 }
 }
 BufWrite(pstrcOutFile,szEndGr,strlen(szEndGr));
 BufWrite(pstrcOutFile,"\f",1);
 free(pucLine);
 free(pucExpBuf);
 FileClose(pstrcInFile);
 return 0;
}
/*******************************************************************/
void usage(void) // displays help info about usage, return to system
{
 printf("PCXHP\n");
 printf(" Given a .PCX file, this program creates a file which can be\n");
 printf(" sent to an HP LaserJet printer to print the image.\n");
 printf(" PCXHP [-xX] [-yY] [-d] [-i] [-ofilename] filename\n");
 printf(" Options include: (units) [default]\n");
 printf(" -xPOS set horizontal position (pixels from left) [0]\n");
 printf(" -yPOS set vertical position (pixels from bottom) [0]\n");
 printf(" -d dump PCX file info to stdout [off]\n");
 printf(" -i invert image [off]\n");
 printf(" -oFIL set output filename, or use SET PCXHP=filename\n");
 exit(1);
}
/*******************************************************************
 * PcxReadHeader -- Reads the header of a PCX file. Returns NULL if can't.
 *******************************************************************/
PCX_HDR *PcxReadHeader(PCX_HDR *pstrcHdr, FILEBUFFER *pstrcF)
{ lseek(pstrcF->hFile,0L,SEEK_SET);
 if (read(pstrcF->hFile,(char*)pstrcHdr,sizeof(PCX_HDR))
 != sizeof(*pstrcHdr))
 return NULL;
 else return pstrcHdr;
}
/*******************************************************************
 * PcxReadLines --- returns TRUE if success, FALSE if out of data
 * Reads and expands the next N lines from the PCX file. Assumes
 * that the file pointer is positioned at the point in the file at
 * which to begin reading. Performs data expansion as necessary.
 * pstrcHdr : pointer to PCX header struct for the file
 * pucLine : pointer to the buffer in which to put lines
 * pstrcF : pointer to the opened FILEBUFFER for the file
 * sLines : number of lines to read and expand
 * pstrcOpt : pointer to the OPTIONS struct (end-of-line masking)
 *******************************************************************/
int PcxReadLines(PCX_HDR *pstrcHdr, UCHAR *pucLine, FILEBUFFER *pstrcF,
 SHORT sLines, OPTIONS *pstrcOpt)
{
int iData, iData2;
UCHAR *pucDst, *pucLineStart;
USHORT usLSize = pstrcHdr->usBytePerLine;
int i, j;
 for (j=0, pucDst=pucLine; j<sLines; j++) {
 for (i=0, pucLineStart=pucDst; i<usLSize; ) {
 // if we get EOF on the first line, return FALSE
 // to indicate we're done, otherwise fill the
 // rest of the lines with 0xff (i.e. blank)
 if ((iData=GetBufCh(pstrcF)) == EOF) {
 if ( (j > 0) || (i > 0) ) {
 memset(pucDst,0xff,usLSize-i);
 pucDst += (usLSize-i);
 i = usLSize;
 }
 else
 return FALSE;
 }
 else {
 if ((iData & 0xc0) == 0xc0) { // check for run length
 // read data to be repeated; if EOF, return FALSE
 if ((iData2=GetBufCh(pstrcF)) == EOF)
 return FALSE;
 memset(pucDst, (UCHAR)iData2, iData & 0x3f);
 pucDst += iData & 0x3f;
 i += iData & 0x3f;
 }
 else {
 *pucDst++ = (UCHAR)iData;
 i++;
 }
 }
 }
 if (i=pstrcOpt->sEndAdjust) {
 pucLineStart += pstrcOpt->sAdjOffset;
 *pucLineStart = pstrcOpt->ucMask;
 while (--i)
 *(++pucLineStart) = 0xff;
 }
 }
 return TRUE;
}
/*******************************************************************
 * IndexNE -- Scans a buffer for the first byte not equal to a specified byte.
 * pbBuf pointer to the buffer to test
 * bVal value to test for
 * iCount number of bytes to test
 * Returns: -1 if the buffer contains only the specified byte
 * otherwise offset of the first byte that is not the specified byte
 *******************************************************************/
int IndexNE(UCHAR *bBuf, UCHAR bVal, int iCount)
{ int iOrig = iCount;
 while (iCount && (*bBuf == bVal)) { iCount--; bBuf++; }
 if (iCount) return iOrig-iCount;

 return -1;
}
/*******************************************************************
 * CntREQ -- Counts the number of bytes equal to a specified byte from a
 * starting point in memory backwards.
 * pbBuf : pointer to the (end of the) buffer to test
 * bVal : value to test for
 * iCount : number of bytes to test
 * Returns number of bytes found that are equal to the specified byte
 *******************************************************************/
int CntREQ(UCHAR *bBuf, UCHAR bVal, int iCount)
{ int iOrig = iCount;
 while (iCount && (*bBuf == bVal)) { iCount--; bBuf--; }
 return iOrig-iCount;
}
/*******************************************************************/
int main(int argc, char *argv[])
{
int i;
FILEBUFFER *OutFile;
char *outfname = NULL;
char *filename = NULL;
static OPTIONS Opt = {0,0,TRUE,0,0,0};
 if (argc < 2) usage();
 for (i=1; i<argc; i++) {
 if (argv[i][0] == '-' || argv[i][0] == '/')
 switch (toupper(argv[i][1]))
 {
 case 'X': Opt.sXpos = atoi(argv[i]+2); break;
 case 'Y': Opt.sYpos = atoi(argv[i]+2); break;
 case 'I': Opt.sInv = !Opt.sInv; break;
 case 'O': outfname=argv[i]+2; break;
 case '?': usage(); break;
 default: fprintf(stderr, "Unknown option %s\n",argv[i]);
 usage(); break;
 }
 else
 filename = argv[i];
 }
 if ( (outfname == NULL) && ((outfname = getenv("PCXHP")) == NULL) )
 outfname = "prn";
 if ( (OutFile=FileOpen(outfname,1)) == NULL ) exit(1);
 i = PCXToHP(filename, OutFile, &Opt);
 FileClose(OutFile);
 return i;
}
















August, 1991
PROGRAMMING PARADIGMS


A Language Without a Name: Part II




MICHAEL SWAINE


Bob Jervis is the author of Wizard C, which later became Borland's Turbo C.
Recently, Bob has been working on a new programming language, a seemingly
Quixotic gesture in a time when, by his own admission, the prospects for a
new, independently developed language are dim. In last month's installment of
this two-part interview, he talked about C++, OOP, Ada and the DoD, and the
approach he is taking with his new language.
This month, the interview moves on to the future of software development, a
future in which multiprocessor machines will sit on the average desktop and
parallel algorithms will be the norm. Ultimately, Bob brings it all around to
an answer to the question that inspired the interview in the first place: Why
is Bob Jervis writing a new programming language?
DDJ: You've been doing language development for some time. Any thoughts on the
quality of development tools, or the direction you see software development
tools headed in the future?
BJ: As far as development tools go, I haven't seen anything that contradicts
the basic model that Turbo C and Turbo Pascal established. More and more
integration. More and more, you're sitting at your screen with this
ever-larger array of support tools there at your fingertips. The 640K barrier
of DOS has been kind of a throttle on how much tools you get for the simple
reason that having all those tools costs memory and costs CPU cycles. 640K
[makes it] sort of hard to be squeezing 100K compilers and 100K debuggers and
100K code analyzers in and out and get any performance out of the machine. But
I think that once you get into the 32-bit world, things are a little more
friendly and you'll see some really powerful development environments.
I think the big difference between the way that Turbo Pascal and Turbo C have
done things and the way the 32-bit environments are going to work is more
along the lines of Microsoft's Workbench, where rather than having everything
compiled into one monolithic .exe, you're going to have a much more loosely
connected collection of tools. You're going to be able to plug in your own
editor and you're going to be able to plug in a third-party debugger if they
have a better one.
I hate terms like software bus, because that sort of term is very misleading
about what is really happening, but I think you're going to get some simple
publicly documented interfaces that say: The editor supports these features,
and here's how you get a list of error messages into the editor, and here's
how the debugger works, and here's how you find out where you are in the
source and how you set a breakpoint, and here's how the compiler works, and
here's how you feed it the source, and here's how you get information out, and
so on. It's clearly in the object-oriented paradigm that any particular
company's compiler or editor or debugger will inherit that basic class and put
in their specific implementation. It will look to the user like one set of
windows or related buttons under the editor, but it will be a much more robust
set of tools.
I think you're going to see more multivendor solutions. In the good old
mainframe days, if you went to IBM you got your compiler and you got your
libraries and you got your operating system, and only the very marginal niche
languages that were too small for your hardware vendor to care about did you
get from third parties. As time has gone on, people have become much more
comfortable with building solutions out of pieces from several different
sources.
I recently got a copy of the DPMI stuff from Microsoft, and I was very
impressed that they've actually documented an interface that's important and
that lets third-party vendors interact with Windows in a fundamental way.
That's really exciting. Microsoft has, to this day, undocumented features in
DOS that they refuse to talk about.
DDJ: Last fall I heard Bill Gates questioned about the undocumented features
of DOS. He maintained that there were none.
BJ: That's very interesting. Well, if it's undocumented, it's not a feature,
right? Too bad it's things that their own utilities use. It blows my mind that
the only way you can make a debugger work with a standard .exe and actually
work through the operating system is if you use undocumented features. Which
is one of the things that I find impressive about DPMI: It really looks like
you could write a DOS extender that supports DPMI, and that therefore runs
Windows under a non-DOS environment, or not a pure DOS environment. I think
that's very encouraging. That's going to go a very long way toward helping a
lot of different developers -- and users -- out. I see it as a real positive
step.
Of course it'll probably turn out that there's some hidden feature they
haven't told you about that'll make Windows run two times faster on their DPMI
implementation.
DDJ: Getting back to your project, you are planning for The Language to be a
product, right? What are your plans?
BJ: It's been a very interesting project. This spring I sat back after having
done all this work and having made some presentations last fall and said: I
don't have AT&T or IBM or Microsoft or Borland pushing my language. How am I
going to find a role for it? And I said: Well, C didn't become popular for ten
years after its original invention. So if we assume ten years out, maybe what
I should do is make sure The Language has features that people are going to
want ten years from now.
DDJ: I suppose that depends most on what kind of hardware is on the desktop in
2001.
BJ: That's what I looked at: What is the hardware going to look like ten years
out? And the answer that came back is that everything is going to be
multiprocessor. There may conceivably still be a $10 8088 that you can still
buy in the year 2000, but realistically people are going to have two, three,
four, five processors sitting on their desks.
So what I concluded is that the way to make this language have a real future
is to stop now before I build -- or try to build -- a large customer base and
lock myself in, and go back and look at how you program a multiprocessor.
So I started looking at the research materials and realized that there's a lot
of room. The state of the research on multiprocessing languages and how to
program them is very limited. I just came back from Cray, which has obviously
been working on multiprocessors for a number of years. It blew my mind when
they talked about performance on the order of four gigaflops. Four gigaflops!
You gotta be kidding me. But they said: That's just benchmarks; realistically
you can't expect more than about one gigaflop. So they're in a different
league. But ten years from now that league is going to be on your desk.
DDJ: I've dug into the research on multiprocessor architectures and
programming for this column in the past. The impression that I'm left with is
that there are several different models, depending on the number of
independent processors and whether each processor has its own memory and so
on, and that each of these models is as complex as the familiar sequential
model. The parallel universe is bigger than the sequential universe.
BJ: One of the things that Cray has stated [is] that they are going to be
going to massively parallel machines. Their [current] project is a
1000-processor multiprocessor.
When you look at those kinds of architectures, you need a different kind of
language to talk about programming those kinds of beasts. Cray's experience is
that it's next to impossible to take any old random piece of C code or Fortran
and to decipher what the hell you're supposed to do with your other half-dozen
processors that are sitting around idle. The work they've been doing is how to
get these multiprocessors to share a common memory where there aren't too many
multiprocessors. They're finding A, that it's very hard, and B, that you have
to extend the language to support extra features so that the programmer can
help the compiler find the parallelism in the program.
[But now] even Cray is admitting that maybe this [model of] one monolithic
memory and a bunch of processors snaking through it is not going to scale up
to 1000 processor machines very easily. They haven't completely figured out
what they're going to do, but a lot of other people like N Cubed and Hypercube
are much more aggressively saying: What we're going to have is x many thousand
processors and each one's going to have their own dedicated memory and we're
going to be sending messages back and forth over some high-speed bus, or some
multitude of buses and communications lines, to share information.
DDJ: I assume you've looked at the existing languages. There are a number of
languages that have been developed for doing parallel programming. Occam.
Parlog.
BJ: There's a lot of research effort, but commercially there hasn't been a lot
that really works well with that kind of hardware. There are basically two
strategic evolutionary directions that people are taking in trying to program
these multiprocessors.
One is what you might call the Prolog camp that says: We're just going to run
Prolog or some language related to Prolog, and what you're going to do is
distribute your productions and your solution resolution across this multitude
of machines, and they're going to execute the productions and explore the
alternatives in parallel. The problem with that as I see it is that -- and it
may turn out that when you have enough horsepower and true parallelism that
it may work out -- but those kinds of languages haven't proved to be too
successful in the marketplace. People have a hard time programming in them.
DDJ: And the other direction?
BJ: It dovetails very nicely with what I was doing with The Language. It's the
area of research called Concurrent OOP. What that does is say: To get your
parallelism, you take all the objects in your program, and all of a sudden all
of those objects become parallel processes. So each object, instead of being a
passive patch of memory with some functions somehow associated with it, is an
active processing beast. And when you use the Smalltalk terminology of sending
messages, you really are sending messages.
I've run across at least half a dozen different projects that have worked out
languages to express parallelism using this stuff, and it looks very
interesting, so I'm working on putting those kinds of features into The
Language.
There's going to be a large [range of numbers] of processors available in the
machine of the year 2000. Low-end machines like the [Intel] Micro 2000 will
have a mere half dozen processors, whereas your high-end Cray machines are
going to have 1000 to 10,000 processors, and we're going to need programming
languages that work on that range of hardware. So what I want to do is [have]
the compiler know that [when it's running on] a fairly low-end machine with
not many processors, you use less parallelism and compile things more into
regular procedure calls between objects, and bind more objects into a single
executable. And then when you've got the processors, you split everything up
and ship everything across to all the processors.
I think that we're probably still five years away from a whole lot of people
having machines that can really take advantage of that kind of language,
because I wouldn't expect a real low-end multiprocessor from Intel, for
example, until the 686 or 786. Before they get the microchips out that have a
dozen processors, you're going to have chips with two or three processors on
them. So we're still a few years away before these things become advantages.
DDJ: So you are writing a language for a market that won't begin to emerge for
another five years?
BJ: Well, there is one thing that is happening today that fundamentally is
dealing with a very similar sort of problem, and that's distributed
[computing]. I went to a presentation recently by one of the guys at Next.
They have this thing where you go home at night and it comes in and uses your
desktop processor as extra computing horsepower. So I think that there's an
immediate use for a language like this. If you can write a single piece of
source or write an application that thinks that it's calling the library, and
have the compiler and the operating system supporting it turn that into a
client-server application operating over a network, you've got an immediate
application for these features. And then when the multiprocessors come along,
you've got even more applications for them.
So that's the technical context in which, if you will, I'm doing version two
of The Language. Version one is still pretty much a single-processor,
conventional C-like language with OOP extensions. The next generation is going
to be much more ambitious in terms of the kinds of technical scope.
DDJ: Ambitious seems to be an appropriate description. Aren't there some
fundamental problems that have to be solved before a language for these kinds
of architectures can be commercially viable?
BJ: There are a lot of things that have to be solved. Clearly, if you've got a
distributed network where you've got two processors, then a client-server
exchanging messages is a sensible way of doing things. To get both processors
involved, you have to ship information in the form of a message across the
network. But if those two pieces of the application happen to be residing on
one machine, then it's less clear what the best way of doing it is. Because
now all of a sudden you're paying all this internal overhead to do a
generalized message send when all you've got is one little processor ticking
away.
So it's not at all clear how you balance the efficiency for conventional
single-processor architecture with the flexibility of distributed computing.
Most of the research I've seen has not really addressed that very well. For
example, message-based operating systems, while they have great flexibility
for working across networks, tend not to perform very well compared to more
conventional operating systems. So if your programming language makes heavy
use of messages, then you're going to be in the same boat.
So I'm trying to explore ways to at least make the more common operations of
basic disk I/O flexible enough that if you have a network there, you'll be
able to take advantage of it, but if you don't have a network it'll still be
relatively efficient. For today, that's probably the single biggest challenge
that I face. It's not proving to be real easy. But then again, that's why
there aren't ten other people out there selling products like this.
DDJ: That may change, after this interview appears. I have to say, you're
exploring some issues that I find fascinating, as well as challenging.
BJ: It's an exciting end of the business. The more I talk to people the more I
realize that this is really cutting-edge technology. There's a very good
chance that in another five to ten years this kind of work is going to be
being done all over the world, just because the demand will be there.
That, in a nutshell, is where I think software development is going, too. OOP
came along at the right time to help us write windows kinds of applications,
and I think these sorts of languages are going to take OOP to the next step.
It's the most promising avenue I've run across in the literature for the kind
of language in which you still write algorithms most of the time, so you don't
have to constantly be thinking, as in Prolog: How do I get it to do A before
B? If you want it to do A before B you just write A ; B and it does it. And if
you want it to do it in parallel, you have to think about it. You pay for the
complexity you need.










August, 1991
C PROGRAMMING


Repairs to D-Flat, Power C, and C++ Compilers


 This article contains the following executables: DFLAT5.ARC


Al Stevens


This, the annual C issue, marks the beginning of my fourth year as Dr. Dobb's
Journal's C Programming Columnist. Pardon me if I puff out my chest when I say
that. Every month I get to write some C code, read some books, drag out my
soapbox, and write for the leading programmer's magazine. All during each
month I talk to other programmers on CompuServe. Every now and then I go some
place such as Boston, New Orleans, or San Francisco for a computer show or a
programming symposium. And they call this work.
This month we will repair some of the D-Flat files published in past columns,
take a look at Power C, the bargain-basement ANSI C compiler, and discuss the
latest C++ compilers for the PC. The discussion about C++ is a sneaky way for
me to plug the second edition of my C++ book. It's traditional for computer
columnists to use their columns to plug their own books. The practice is
called the "Pournellian Imperative."


Fixing D-Flat


First, you jack up de car...
We're into the fourth installment of D-Flat, and already I have some
corrections to make to code previously published. The three listings this
month are new versions of dflat.h (Listing One, page 174), config.h (Listing
Two, page 175), and window.c (Listing Three, page 177). These are some of the
files that have changed. I will publish the others along with some new code
next month. This time I'll explain what has changed in the system that affects
existing code.
One of my complaints about many windows and user interface libraries is that
small programs often must carry the weight of most of the library in the .EXE
file, even though the program does not need most of the features. I found the
same thing happening to D-Flat. As I added features to the user interface, the
size of an applications program grew, even though the applications code had
not changed. That was contrary to my original intentions for D-Flat, so I had
to do something about it. I decided to let the C preprocessor be the program
configuration manager. If an application does not need a D-Flat feature, you
do not need to compile the code that supports the feature. Inasmuch as code to
support discrete features might be found anywhere in the many D-Flat source
files, the best way to suppress features is with compile-time conditionals.
Listing Two, config.h, contains some new global definitions. They are named
INCLUDE_SYSTEM_MENUS, INCLUDE_DIALOG_BOXES, and so on. If you do not want
system menus in your program, for example, you remove the INCLUDE_SYSTEM_MENUS
global definition from config.h and recompile the D-Flat library. The effect
will be that you will not be able to move, resize, minimize, and maximize the
windows, but the program will be a lot smaller. A simple CUA program that uses
none of the configurable features will be as much as 60K smaller than one that
uses everything.
By selectively including and excluding the global definitions, you can
generate or suppress system menus, the time-of-day clock display, the multiple
document interface, scroll bars, shadows, dialog boxes, the clipboard,
multiple-line listboxes and editboxes, and message logging. This approach
means that D-Flat is not necessarily a static library that you use unchanged
for all your projects. Instead, you modify the library functions to suit the
needs of the specific application. This practice is consistent with my
original design goals, which include using the compiler rather than additional
runtime code to add window classes, menus, and dialog boxes to an application.
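Here is a minimal sketch of how the preprocessor can act as the configuration manager. The INCLUDE_ symbols are the ones named above; the ListFeatures function and the strings it builds are hypothetical and are not part of the D-Flat source.

```c
#include <assert.h>
#include <string.h>

/* In D-Flat these definitions live in config.h. Remove or comment
   out a line and rebuild the library to drop that feature. */
#define INCLUDE_SYSTEM_MENUS
/* #define INCLUDE_DIALOG_BOXES */   /* feature switched off */

/* Hypothetical helper: appends the name of each compiled-in feature
   to buf and returns how many features are present. Code between
   #ifdef and #endif for an undefined symbol is never compiled. */
int ListFeatures(char *buf, size_t size)
{
    int count = 0;

    assert(size > 0);
    buf[0] = '\0';
#ifdef INCLUDE_SYSTEM_MENUS
    strncat(buf, "menus ", size - strlen(buf) - 1);
    count++;
#endif
#ifdef INCLUDE_DIALOG_BOXES
    strncat(buf, "dialogs ", size - strlen(buf) - 1);
    count++;
#endif
    return count;
}
```

With the definitions as shown, only the system menu code reaches the compiler, so the dialog box support adds nothing to the size of the .EXE file.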
There are a number of other changes to window.c, which reflect problems I've
had with clipping and repainting when the application has a lot of document
windows. The new dflat.h and config.h files use the new INCLUDE_ global
symbols. dflat.h adds some members to the window structure; they support
features in areas of D-Flat that we haven't discussed yet and that need more
data in the structure. dflat.h has most of the prototypes and macros for the
window class methods, so they too have changed and grown. Instead of
specifying the names of the CLASS enum, dflat.h now includes a file named
class.h, which you will see in a later column. This file localizes the
definition of classes. A similar file named dflatmsg.h defines the message
codes. I had to split these definitions into other header files because a new
feature needs the class and message names in a displayable string format. I
did not want to maintain redundant lists of classes and messages.
The new feature that uses strings of class and message names is a debug aid
that logs selected D-Flat messages into a text file. A later installment
describes the feature.
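One common way to get an enum and its displayable names from a single list is to expand that list twice with different macro definitions. The sketch below illustrates the general technique and is not the contents of class.h. A list macro stands in for the separate header file, and the class names, AS_ENUM, AS_STRING, and ClassName are all assumptions made for the example.

```c
#include <assert.h>

/* The single list of class names. In D-Flat the list lives in its
   own header; a list macro plays that role here. */
#define CLASS_LIST(X) \
    X(NORMAL)         \
    X(APPLICATION)    \
    X(TEXTBOX)        \
    X(LISTBOX)

/* First expansion: the enum members. */
#define AS_ENUM(name) name,
typedef enum { CLASS_LIST(AS_ENUM) CLASSCOUNT } CLASS;

/* Second expansion: the same names as strings, in the same order. */
#define AS_STRING(name) #name,
static const char *ClassNames[] = { CLASS_LIST(AS_STRING) };

/* Hypothetical lookup that a message logger might call. */
const char *ClassName(CLASS cls)
{
    assert((int)cls >= 0 && (int)cls < (int)CLASSCOUNT);
    return ClassNames[cls];
}
```

Adding a class means adding one line to the list. The enum and the string table cannot drift apart because both come from the same text.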
Some of the other source files have changed, and I am not going to republish
them because the changes are minor and should have little effect on the
operation of the program. Anyone who wants to use D-Flat should download it
from CompuServe or TelePath. The latest version includes a sparse text file
that documents the D-Flat functions, classes, and messages.


How to Get D-Flat Now


The complete source code package for D-Flat is on CompuServe in Library 0 of
the DDJ Forum and on TelePath. Its name is something like DFLATn.ARC, where n
is an integer that represents a loosely-assigned version number. Do not trust
me to always tell you the correct name. The sysops might change its format.
The D-Flat library that you download is a preliminary version of the package,
but it works with most features in place. The help system is not there,
however. I have not designed it yet. At present, everything compiles and works
with Turbo C 2.0, Microsoft C 6.0, and Power C 2.0.2. There is a makefile for
the TC and MSC compilers and a project file for Power C. There is one example
program, the MEMOPAD program. Some readers have ported D-Flat to Watcom C and
Zortech C++. They report few problems doing that. Read further to see what I
ran into when porting D-Flat to Power C. If for some reason you cannot get to
either online service, send me a formatted diskette -- any PC format -- and an
addressed, stamped diskette mailer. A regular envelope works for 3.5-inch
diskettes. Send it to me at Dr. Dobb's. I'll send you the latest copy of the
library. The software is free, but if you care to, stick a dollar bill in the
mailer. I'll give the dollar to the local food bank or other charity that
takes care of homeless or hungry children, and you and I will feel good about
it. The dollar is, of course, optional, as all charitable contributions should
be. Call it "careware."
If you want to discuss D-Flat with me, my CompuServe ID is 71101,1262, and I
monitor the DDJ Forum daily.


Power C


If you are looking for a bargain in C compilers, check out Power C from MIX
Software (Richardson, Tex.). For twenty bucks -- the typical cost of a C book
-- you get a respectable book about C, but with an ANSI C compiler thrown in.
Well, actually it's the other way around. The twenty bucks buys the compiler.
Besides being the reference guide for the compiler and library, the user's
manual is also a decent C language tutorial. So, take that twenty clams you
budgeted for a book on C and buy a book and a compiler. But wait, there's
more. For another ten bucks you can get the library source code. Add twenty
bucks more for Power Ctrace, a source code debugger. MIX has other packages as
well. A BCD math package and a database toolbox are twenty bucks each. These
are garage sale prices.
I wanted to get D-Flat compiled with Power C in time for this issue. What
better recession-bashing combination than a $20 compiler and a free CUA
library? I uncovered a few twists in the way Power C does things, however, and
I'll tell you about them here.
ANSI C allows you to call a function through a pointer without using the
pointer dereferencing notation. The call looks like any other function call.
Power C does not allow this convention if the pointer is in a structure. No
problem; I put the parentheses and the asterisk into the call.
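The two call notations look like this. The structure and function names are invented for the example and do not come from D-Flat.

```c
/* Hypothetical window structure holding a pointer to its handler. */
struct window {
    int (*wndproc)(int msg);
};

/* Stand-in handler: returns the message code plus one. */
static int EchoProc(int msg)
{
    return msg + 1;
}

/* Calls the handler through the structure member both ways and
   returns the sum of the two results. */
int SendBoth(int msg)
{
    struct window w;
    int plain, explicit_deref;

    w.wndproc = EchoProc;
    plain = w.wndproc(msg);             /* ANSI C shorthand */
    explicit_deref = (*w.wndproc)(msg); /* explicit dereference, the
                                           form the article says
                                           Power C required */
    return plain + explicit_deref;
}
```

Both statements make the same call under an ANSI C compiler, so the explicit form costs nothing in portability.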
Power C does not allow you to initialize a structure with the assignment of
another structure such as this:
 struct rect rc1 = rc2;
D-Flat does that all over the place, so I changed the initializations to
assignments like this:
 struct rect rc1;
 rc1 = rc2;
The interrupt keyword in a Power C function pointer must be exactly the way
Turbo C defined it and not the way that Microsoft C requires it. That required
a compile-time conditional.
I had some problems with the movedata function, replaced calls to it with a
memcpy function call, and got past it. The Power C FP_SEG and FP_OFF macros
are fashioned after the Microsoft C convention rather than the Turbo C
convention, which D-Flat uses, so an adjustment fixed that.
At last the D-Flat example MEMOPAD program was loading and displaying its
application window. The menus would pop up, but had no text in them, and the
program did not properly load text files that I named on the command line.
When I exited the program, DOS was dinged up. Time to fire up Power Ctrace,
the MIX debugger.
First, I had to recompile everything with debugging turned on. After the
compile, about 3 Mbytes of .TRC files were left on my disk. They are built by
the compiler and used by the linker to generate a .SYM file for the Ctrace
debugger. You can delete them after the compile, but be forewarned that you
will need room for them while the compile is underway. I kept running out of
disk space until I found out what was going on.
I run DOS 5.0 with everything in upper memory, and there were 605,920 bytes
free. The MEMOPAD.EXE file compiled by Power C with debug information is
184,672 bytes long. The Power Ctrace program would not load the MEMOPAD
program, saying that there was not enough memory. Earlier versions of DOS used
a bunch of lower memory, and applications programs loaded well above the 64K
boundary. Many programs, it turned out, would not run well if they started
below 64K. I thought that Ctrace might be one of them, so I rebooted with DOS
loaded low. Then there were a paltry 554,848 bytes available for applications.
Ctrace still said there was not enough memory. If that ain't enough memory,
Hoss, then Ctrace is one hungry memory hog.
A call to MIX's tech support revealed that Ctrace limits the symbol table to
64K. Apparently D-Flat has more symbols than will fit in the table. It's not
that I don't have enough memory -- Ctrace just doesn't use enough of it. The
MIX tech support person told me about a feature that disables the tracing of
selected symbols, but to do that you need to get the program loaded in the
first place, which I cannot do. The alternative is to compile most of the
program without debugging information and compile the suspected modules with
it, keeping the symbol table size down. That is of little help with a library
such as D-Flat, which is an integrated whole. In a message-driven architecture
with inherited window
classes, every message passes through most of the code. A bug could be
anywhere. In desperation, I reverted to the tried and true printf debugging
technique, which uses printf calls in the code to tell you where you are going
and what the variables look like.
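A minimal sketch of the printf technique follows. It is not taken from the D-Flat source: the trace_line function and the TRACE macro are hypothetical names, and the sketch assumes a present-day C library that provides snprintf (with a compiler of the period, sprintf would stand in). Each trace line says where the program is and what one variable holds.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of D-Flat: format one trace line of the
   form "where: name=value" into buf and return what snprintf reports. */
static int trace_line(char *buf, size_t size,
                      const char *where, const char *name, long value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s: %s=%ld", where, name, value);
}

/* Write the trace line to stderr, one line per call. */
#define TRACE(where, name, value)                       \
    do {                                                \
        char trace_buf_[128];                           \
        trace_line(trace_buf_, sizeof trace_buf_,       \
                   (where), (name), (long)(value));     \
        fprintf(stderr, "%s\n", trace_buf_);            \
    } while (0)
```

A call such as TRACE("CreateWindow", "attrib", attrib) placed at the top of a suspect function shows both that the function was reached and what it was given.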
The next thing I learned was that the Power C fnsplit and fnmerge functions do
not work the way the Turbo C ones do. You pass the Turbo C functions NULL
pointers to tell them not to split or merge a component of the file path and
name. The Power C functions do damage to memory when you do that. You must
pass pointers to zero-length strings to fnmerge and pass fnsplit pointers to
character arrays that you are not going to use.
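One way to keep Turbo C-style callers working is to substitute a zero-length string for every NULL component before the merge. The sketch below illustrates only that substitution. The names or_empty, merge_stub, and safe_merge are hypothetical, and merge_stub is a portable stand-in for the compiler's fnmerge, assumed here so that the example compiles anywhere; it is not the Power C or Turbo C library routine.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return s, or a zero-length string when s is NULL. */
static const char *or_empty(const char *s)
{
    return s != NULL ? s : "";
}

/* Stand-in for fnmerge (an assumption for this sketch): joins the four
   components in order.  Like the Power C behavior described above, it
   must never be handed a NULL pointer. */
static void merge_stub(char *path, size_t size, const char *drive,
                       const char *dir, const char *name, const char *ext)
{
    assert(drive != NULL && dir != NULL && name != NULL && ext != NULL);
    snprintf(path, size, "%s%s%s%s", drive, dir, name, ext);
}

/* Hypothetical wrapper: callers may pass NULL for a component they do
   not want; the wrapper replaces each NULL with "" before merging. */
static void safe_merge(char *path, size_t size, const char *drive,
                       const char *dir, const char *name, const char *ext)
{
    merge_stub(path, size, or_empty(drive), or_empty(dir),
               or_empty(name), or_empty(ext));
}
```

The same idea applies in the other direction: a wrapper around fnsplit would pass scratch character arrays for the components the caller does not want.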
Power C emulates Microsoft C in some areas and Turbo C in others. Both of the
big guys have unique extensions to the language and library to support the PC
and MS-DOS platform. Power C mixes the two sets of extensions. In some cases,
the emulated extensions do not work exactly like the borrowed ones.
The last bug was tough, and it points to a serious departure from convention
in the code compiled by Power C.
Power C casts far pointers to longs differently than the other compilers. The
D-Flat message-passing protocols resemble Windows' parameter conventions in
that generic long integers carry the parameters for the messages. To pass a
pointer as an argument, I cast it to a PARAM, which is a typedef for a long.
Power C doesn't like that cast, though. It converts the cast value into an
integer that has the value of the pointer's offset, but does not include its
segment. The most significant 16 bits of the cast value are zero. I had to
change all those pointer casts to calls to a function that converts the
pointer correctly to a long.
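The conversion amounts to placing the pointer's segment in the upper 16 bits of the long and its offset in the lower 16. The sketch below shows that arithmetic in portable C, with the segment and offset passed as plain integers. The function names are hypothetical and are not the ones used in D-Flat. On a DOS compiler the two halves would typically come from the FP_SEG and FP_OFF macros in dos.h.

```c
#include <assert.h>

typedef long PARAM;   /* same typedef as in dflat.h */

/* Hypothetical helper: pack a 16-bit segment and a 16-bit offset into a
   PARAM, segment in the upper 16 bits, offset in the lower 16. */
static PARAM SegOffToParam(unsigned segment, unsigned offset)
{
    assert(segment <= 0xFFFFu && offset <= 0xFFFFu);
    return (PARAM)(((unsigned long)segment << 16) | (unsigned long)offset);
}

/* Recover the two halves on the receiving side of the message. */
static unsigned ParamSegment(PARAM p)
{
    return (unsigned)(((unsigned long)p >> 16) & 0xFFFFu);
}

static unsigned ParamOffset(PARAM p)
{
    return (unsigned)((unsigned long)p & 0xFFFFu);
}
```

For a pointer at segment 0x1234 and offset 0x5678, the packed value is 0x12345678. A cast that keeps only the offset, as described above, would yield 0x00005678 and lose the segment.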

It took me about three hours to port the original D-Flat code from Turbo C to
Microsoft C. I worked on the Power C port for three days before I finally got
it working, primarily because I could not use the debugger. Although Power C
is a good entry-level ANSI C compiler, I do not consider it to be serious
competition as a software development tool because you must limit it to the
development of small programs. However, there is no better bargain for the
budget-minded student of C.


The State of C++


Last year I wrote a book called Teach Yourself C++, a C programmer's C++
tutorial with about 130 example programs. The code in the examples used C++
2.0 conventions, and I used Zortech C++ and a beta of Turbo C++ to get them
running. Before the book went to press, I ran the exercises through Comeau C++
and Guidelines C++, which are ports of the AT&T CFRONT C++ translator program.
Some of the examples compiled with Intek C++, a 386-only port of the CFRONT
program, but 1.2 was the only version I had of that compiler.
Since the book came out, AT&T released its C++, Version 2.1 specification. In
addition, I got a lot of letters with comments and questions about the book.
To a writer, such events call for a second edition. I have completed it, and
the book should be in the stores by the time you read this column. The book is
about generic C++, but because I use a 386 MS-DOS machine, I decided to borrow
from the experience to report here on contemporary C++ compilers for the PC.
These are not reviews of the compilers. Although I used all of them to compile
130 small programs, that experience alone is not sufficient to allow me to
compare their true performance. I will tell you about some problems I had, but
these problems are not necessarily typical. They spring from my need to be as
compatible with as many compilers as possible. C++ is not as well-defined or
well-understood as C. You can expect the compilers to depart from one another
and from what is thought to be the accepted definition. ANSI is working on a
standard definition, but that will take a while.


Dropouts


Guidelines went out of the C++ compiler business, so there is no longer a
Guidelines C++. Intek did not return my call, so I could not upgrade to their
latest offering -- or even tell you if they still have one.


Borland C++ 2.0


Borland (Scotts Valley, Calif.) has repackaged their C++ with support for
Windows 3 programming, but the compiler supports only the 2.0 specification.
Borland C++ is a compiler implementation of C++ 2.0 that was originally
packaged as Turbo C++ 1.0, but Borland has since modified their C++ to align
their version number with the AT&T release and to add support for Windows
programming.
I had no problems with Borland C++, primarily because I used a beta of the
predecessor, Turbo C++, to develop the original 130 programs. The only
exceptions were with the new features that C++ 2.1 added to the language.
There are several exercises in the book that demonstrate 2.1 behavior and that
do not compile with Borland C++ because the compiler is for version 2.0.
That's not a problem; that's just how things are.


Comeau C++


Comeau C++ (Comeau Computing, Richmond Hill, N.Y.) upgraded to version 2.1
some time ago. Comeau C++ is a CFRONT port, which means that it is an
adaptation of the AT&T C++ 2.1 translator. The compiler comes in versions that
run under MS-DOS and Unix. The CFRONT program reads your C++ source code and
translates it into C code that must be compiled by a C compiler. You will need
a copy of Microsoft C to compile the translated output from Comeau C++.
The close association of Comeau C++ with the Microsoft C compiler caused some
of my problems. The CFRONT program has a command line switch to indicate that
the code uses ANSI conventions rather than K&R conventions. When you use that
switch, CFRONT emits some code that Microsoft C does not compile. When you do
not use the switch, the ANSI code does not pass the CFRONT syntax check. The
code that CFRONT emits is correct ANSI C, but the Microsoft compiler doesn't
accept it. This problem was in MSC prior to the formal acceptance of the ANSI
C specification. Microsoft released MSC 6.0 after ANSI published the standard
C definition, but Microsoft did not correct the problem.
If you want to use your PC to develop C++ code that you will port to a Unix
CFRONT environment, or if you want to port some Unix C++ programs to MS-DOS,
this is the compiler to get.


TopSpeed C++


TopSpeed C++ (Jensen and Partners, Mountain View, Calif.) is a compiler that
implements C++, Version 2.1. The compiler is one part of a multiple-language
product that includes C, C++, Pascal, and Modula-2. The languages share a
common code generator and an integrated development environment. I found a few
differences in the defaults for formatted stream output between TopSpeed C++
and the others. Nonetheless, most of the exercises compiled and ran without
problems. I hit a few snags in the features that were in the C++ 2.0
specification and that 2.1 does not include. TopSpeed C++ does not have a
_new_handler pointer that you can initialize, for example. It also does not
permit the C++ syntax that creates anonymous, hidden variables such as when
you initialize a reference with a constant.


Zortech C++


Zortech C++ (Zortech, Arlington, Mass.) is version 2.1, but the version number
is their own, not an attempt to align with AT&T's 2.1. The compiler is closer
to AT&T's 2.0 specification, although there are some differences even between
AT&T 2.0 and Zortech 2.1. For example, Zortech's input/output stream classes
and libraries are different. Rather than use the AT&T iostream library that
all the other compilers use, Zortech decided to design their own improvements
to the old C++ 1.2 iostreams. There is nothing wrong with that, I suppose, but
I had to tell my readers to skip the entire chapter on iostreams if they use
Zortech. It was either that or write a Zortech-specific chapter on iostreams,
and I did not want to do that.


Undocumented DOS


If you write programs that extract the last ounce of performance from the
MS-DOS platform, you've learned that you need more functions and internal data
structures than the ones that Microsoft specifies in their Technical Reference
Manual and in the officially-sanctioned books from Microsoft Press. There is a
body of knowledge, collected in the underground and growing daily, that
identifies the undocumented characteristics of DOS that you can use to do
things that DOS does not support. The classic example is found in all the
undocumented Mouseketeer machinations of the TSR memory-resident program.
Undocumented DOS (Addison-Wesley, Reading, Mass.), by Andrew Schulman, Raymond
Michels, Jim Kyle, Tim Paterson, David Maxey, Ralf Brown, and the entire cast
of "Hello, Dolly," is the programmer's equivalent of a Kitty Kelley
unauthorized biography. (Celebrity trivia: Ms. Kelley was a drummer in a rock
band in the '60s.) Undocumented DOS tells you all the good stuff about how
Microsoft programmers get things to work -- when they do -- because they know
what DOS's insides look like. Insider coding. This is an essential book for a
PC programmer. There is a lot of C and assembly language code, and the book
comes with a diskette that has all the code and the compiled programs.
Undocumented DOS is the most informative DOS programming book I have read.
_C PROGRAMMING COLUMN_
by Al Stevens



[LISTING ONE]

/* ------------- dflat.h ----------- */
#ifndef WINDOW_H
#define WINDOW_H


#define VERSION "Version 3 Beta"
#define TRUE 1
#define FALSE 0

#include "system.h"
#include "config.h"
#include "rect.h"
#include "menu.h"
#include "keys.h"
#include "commands.h"
#include "config.h"
#include "dialbox.h"

/* ------ integer type for message parameters ----- */
typedef long PARAM;
#define TE(m) m
typedef enum window_class {
#include "classes.h"
} CLASS;
typedef struct window {
 CLASS class; /* window class */
 char *title; /* window title */
 struct window *parent; /* parent window */
 int (*wndproc)
 (struct window *, enum messages, PARAM, PARAM);
 /* ---------------- window dimensions ----------------- */
 RECT rc; /* window coordinates (0/0 to 79/24) */
 int ht, wd; /* window height and width */
 RECT RestoredRC; /* restored condition rect */
 /* -------------- linked list pointers ---------------- */
 struct window *next; /* next window on screen */
 struct window *prev; /* previous window on screen*/
 struct window *nextbuilt; /* next window built */
 struct window *prevbuilt; /* previous window built */
 int attrib; /* Window attributes */
 char *videosave; /* video save buffer */
 int condition; /* Restored, Maximized, Minimized */
 int restored_attrib; /* attributes when restored */
 void *extension; /* -> menus, dialog box, etc*/
 struct window *PrevMouse;
 struct window *PrevKeyboard;
 /* ----------------- text box fields ------------------ */
 int wlines; /* number of lines of text */
 int wtop; /* text line that is on the top display */
 char *text; /* window text */
 int textlen; /* text length */
 int wleft; /* left position in window viewport */
 int textwidth; /* width of longest line in textbox */
 int BlkBegLine; /* beginning line of marked block */
 int BlkBegCol; /* beginning column of marked block */
 int BlkEndLine; /* ending line of marked block */
 int BlkEndCol; /* ending column of marked block */
 int HScrollBox; /* position of horizontal scroll box */
 int VScrollBox; /* position of vertical scroll box */
 /* ----------------- list box fields ------------------ */
 int selection; /* current selection */
 int AddMode; /* adding extended selections mode */
 int AnchorPoint;/* anchor point for extended selections */

 int SelectCount;/* count of selected items */
 /* ----------------- edit box fields ------------------ */
 int CurrCol; /* Current column */
 int CurrLine; /* Current line */
 int WndRow; /* Current window row */
 int TextChanged; /* TRUE if text has changed */
 char *DeletedText; /* for undo */
 int DeletedLength; /* " " */
 /* ---------------- dialog box fields ----------------- */
 struct window *dFocus; /* control that has the focus */
 int ReturnCode; /* return code from a dialog box */
} * WINDOW;
#include "message.h"
#include "classdef.h"
#include "video.h"
enum Condition {
 ISRESTORED, ISMINIMIZED, ISMAXIMIZED
};
void LogMessages (WINDOW, MESSAGE, PARAM, PARAM);
void MessageLog(WINDOW);
/* ------- window methods ----------- */
#define WindowHeight(w) ((w)->ht)
#define WindowWidth(w) ((w)->wd)
#define BorderAdj(w) (TestAttribute(w,HASBORDER)?1:0)
#define TopBorderAdj(w) ((TestAttribute(w,TITLEBAR) && \
 TestAttribute(w,HASMENUBAR)) ? \
 2 : (TestAttribute(w,TITLEBAR | \
 HASMENUBAR | HASBORDER) ? 1 : 0))
#define ClientWidth(w) (WindowWidth(w)-BorderAdj(w)*2)
#define ClientHeight(w) (WindowHeight(w)-TopBorderAdj(w)-\
 BorderAdj(w))
#define WindowRect(w) ((w)->rc)
#define GetTop(w) (RectTop(WindowRect(w)))
#define GetBottom(w) (RectBottom(WindowRect(w)))
#define GetLeft(w) (RectLeft(WindowRect(w)))
#define GetRight(w) (RectRight(WindowRect(w)))
#define GetClientTop(w) (GetTop(w)+TopBorderAdj(w))
#define GetClientBottom(w) (GetBottom(w)-BorderAdj(w))
#define GetClientLeft(w) (GetLeft(w)+BorderAdj(w))
#define GetClientRight(w) (GetRight(w)-BorderAdj(w))
#define GetParent(w) ((w)->parent)
#define GetTitle(w) ((w)->title)
#define NextWindow(w) ((w)->next)
#define PrevWindow(w) ((w)->prev)
#define NextWindowBuilt(w) ((w)->nextbuilt)
#define PrevWindowBuilt(w) ((w)->prevbuilt)
#define GetClass(w) ((w)->class)
#define GetAttribute(w) ((w)->attrib)
#define AddAttribute(w,a) (GetAttribute(w) |= (a))
#define ClearAttribute(w,a) (GetAttribute(w) &= ~(a))
#define TestAttribute(w,a) (GetAttribute(w) & (a))
#define isVisible(w) (GetAttribute(w) & VISIBLE)
#define SetVisible(w) (GetAttribute(w) |= VISIBLE)
#define ClearVisible(w) (GetAttribute(w) &= ~VISIBLE)
#define gotoxy(w,x,y) cursor(w->rc.lf+(x)+1,w->rc.tp+(y)+1)
WINDOW CreateWindow(CLASS,char *,int,int,int,int,void*,WINDOW,
 int (*)(struct window *,enum messages,PARAM,PARAM),int);
void AddTitle(WINDOW, char *);
void InsertTitle(WINDOW, char *);

void DisplayTitle(WINDOW, RECT *);
void RepaintBorder(WINDOW, RECT *);
void ClearWindow(WINDOW, RECT *, int);
#ifdef INCLUDE_SYSTEM_MENUS
void clipline(WINDOW, int, char *);
#else
#define clipline(w,x,c) /**/
#endif
void writeline(WINDOW, char *, int, int, int);
void writefull(WINDOW, char *, int);
void SetNextFocus(WINDOW,int);
void SetPrevFocus(WINDOW,int);
void PutWindowChar(WINDOW, int, int, int);
void GetVideoBuffer(WINDOW);
void RestoreVideoBuffer(WINDOW);
void CreatePath(char *, char *, int, int);
int LineLength(char *);
RECT AdjustRectangle(WINDOW, RECT);
#define DisplayBorder(wnd) RepaintBorder(wnd, NULL)
#define DefaultWndProc(wnd,msg,p1,p2) \
 (*classdefs[FindClass(wnd->class)].wndproc)(wnd,msg,p1,p2)
#define BaseWndProc(class,wnd,msg,p1,p2) \
 (*classdefs[DerivedClass(class)].wndproc)(wnd,msg,p1,p2)
#define NULLWND ((WINDOW) 0)
struct LinkedList {
 WINDOW FirstWindow;
 WINDOW LastWindow;
};
extern struct LinkedList Focus;
extern struct LinkedList Built;
extern WINDOW inFocus;
extern WINDOW CaptureMouse;
extern WINDOW CaptureKeyboard;
extern int foreground, background;
extern int WindowMoving;
extern int WindowSizing;
extern int TextMarking;
extern char *Clipboard;
extern WINDOW SystemMenuWnd;
/* --------------- border characters ------------- */
#define FOCUS_NW '\xc9'
#define FOCUS_NE '\xbb'
#define FOCUS_SE '\xbc'
#define FOCUS_SW '\xc8'
#define FOCUS_SIDE '\xba'
#define FOCUS_LINE '\xcd'
#define NW '\xda'
#define NE '\xbf'
#define SE '\xd9'
#define SW '\xc0'
#define SIDE '\xb3'
#define LINE '\xc4'
#define LEDGE '\xc3'
#define REDGE '\xb4'
/* ------------- scroll bar characters ------------ */
#define UPSCROLLBOX '\x1e'
#define DOWNSCROLLBOX '\x1f'
#define LEFTSCROLLBOX '\x11'
#define RIGHTSCROLLBOX '\x10'

#define SCROLLBARCHAR 176
#define SCROLLBOXCHAR 178
#define CHECKMARK 251 /* menu item toggle */
/* ----------------- title bar characters ----------------- */
#define CONTROLBOXCHAR '\xf0'
#define MAXPOINTER 24 /* maximize token */
#define MINPOINTER 25 /* minimize token */
#define RESTOREPOINTER 18 /* restore token */
/* --------------- text control characters ---------------- */
#define APPLCHAR 176 /* fills application window */
#define SHORTCUTCHAR '~' /* prefix: shortcut key display */
#define CHANGECOLOR 174 /* prefix to change colors */
#define RESETCOLOR 175 /* reset colors to default */
#define LISTSELECTOR 4 /* selected list box entry */
/* ---- standard window message processing prototypes ----- */
int ApplicationProc(WINDOW, MESSAGE, PARAM, PARAM);
int NormalProc(WINDOW, MESSAGE, PARAM, PARAM);
int TextBoxProc(WINDOW, MESSAGE, PARAM, PARAM);
int ListBoxProc(WINDOW, MESSAGE, PARAM, PARAM);
int EditBoxProc(WINDOW, MESSAGE, PARAM, PARAM);
int MenuBarProc(WINDOW, MESSAGE, PARAM, PARAM);
int PopDownProc(WINDOW, MESSAGE, PARAM, PARAM);
int ButtonProc(WINDOW, MESSAGE, PARAM, PARAM);
int DialogProc(WINDOW, MESSAGE, PARAM, PARAM);
int SystemMenuProc(WINDOW, MESSAGE, PARAM, PARAM);
int HelpBoxProc(WINDOW, MESSAGE, PARAM, PARAM);
int MessageBoxProc(WINDOW, MESSAGE, PARAM, PARAM);
/* ------------- normal box prototypes ------------- */
int isWindow(WINDOW);
WINDOW inWindow(int, int);
int WndForeground(WINDOW);
int WndBackground(WINDOW);
int FrameForeground(WINDOW);
int FrameBackground(WINDOW);
int SelectForeground(WINDOW);
int SelectBackground(WINDOW);
void SetStandardColor(WINDOW);
void SetReverseColor(WINDOW);
void SetClassColors(CLASS);
WINDOW GetFirstChild(WINDOW);
WINDOW GetNextChild(WINDOW);
WINDOW GetLastChild(WINDOW);
WINDOW GetPrevChild(WINDOW);
#define HitControlBox(wnd, p1, p2) \
 (TestAttribute(wnd, TITLEBAR) && \
 TestAttribute(wnd, CONTROLBOX) && \
 p1 == 2 && p2 == 0)
/* -------- text box prototypes ---------- */
#define TextLine(wnd, sel) \
 (wnd->text + *((int *)(wnd->extension) + sel))
void WriteTextLine(WINDOW, RECT *, int, int);
void SetAnchor(WINDOW, int, int);
#define BlockMarked(wnd) ( wnd->BlkBegLine || \
 wnd->BlkEndLine || \
 wnd->BlkBegCol || \
 wnd->BlkEndCol)
#define ClearBlock(wnd) wnd->BlkBegLine = wnd->BlkEndLine = \
 wnd->BlkBegCol = wnd->BlkEndCol = 0;
#define GetText(w) ((w)->text)

void ClearTextPointers(WINDOW);
void BuildTextPointers(WINDOW);
/* --------- menu prototypes ---------- */
int CopyCommand(char *, char *, int, int);
void PrepOptionsMenu(void *, struct Menu *);
void PrepEditMenu(void *, struct Menu *);
void PrepWindowMenu(void *, struct Menu *);
void BuildSystemMenu(WINDOW);
int isActive(MENU *, int);
void ActivateCommand(MENU *,int);
void DeactivateCommand(MENU *,int);
int GetCommandToggle(MENU *,int);
void SetCommandToggle(MENU *,int);
void ClearCommandToggle(MENU *,int);
void InvertCommandToggle(MENU *,int);
/* ------------- list box prototypes -------------- */
int ItemSelected(WINDOW, int);
/* ------------- edit box prototypes ----------- */
#ifdef INCLUDE_MULTILINE
#define isMultiLine(wnd) TestAttribute(wnd, MULTILINE)
#else
#define isMultiLine(wnd) FALSE
#endif
/* --------- message box prototypes -------- */
void MessageBox(char *, char *);
void ErrorMessage(char *);
int TestErrorMessage(char *);
int YesNoBox(char *);
int MsgHeight(char *);
int MsgWidth(char *);

#ifdef INCLUDE_DIALOG_BOXES
/* ------------- dialog box prototypes -------------- */
int DialogBox(WINDOW,DBOX *,
 int (*)(struct window *,enum messages,PARAM,PARAM));
int DlgOpenFile(char *, char *);
int DlgSaveAs(char *);
void GetDlgListText(WINDOW, char *, enum commands);
int DlgDirList(WINDOW, char *, enum commands, enum commands, unsigned);
int RadioButtonSetting(DBOX *, enum commands);
void PushRadioButton(DBOX *, enum commands);
void PutItemText(WINDOW, enum commands, char *);
void GetItemText(WINDOW, enum commands, char *, int);
void SetCheckBox(DBOX *, enum commands);
void ClearCheckBox(DBOX *, enum commands);
int CheckBoxSetting(DBOX *, enum commands);
WINDOW ControlWindow(DBOX *, enum commands);
CTLWINDOW *ControlBox(DBOX *, WINDOW);
#endif
/* ------------- help box prototypes ------------- */
void HelpFunction(void);
void LoadHelpFile(void);
#define swap(a,b){int x=a;a=b;b=x;}

#endif






[LISTING TWO]

/* ---------------- config.h -------------- */

#ifndef CONFIG_H
#define CONFIG_H

#define DFLAT_APPLICATION "MEMOPAD"

#ifdef BUILD_FULL_DFLAT
#define INCLUDE_SYSTEM_MENUS
#define INCLUDE_CLOCK
#define INCLUDE_MULTIDOCS
#define INCLUDE_SCROLLBARS
#define INCLUDE_SHADOWS
#define INCLUDE_DIALOG_BOXES
#define INCLUDE_CLIPBOARD
#define INCLUDE_MULTILINE
#define INCLUDE_LOGGING
#endif

struct colors {
 /* ------------ colors ------------ */
 char ApplicationFG, ApplicationBG;
 char NormalFG, NormalBG;
 char ButtonFG, ButtonBG;
 char ButtonSelFG, ButtonSelBG;
 char DialogFG, DialogBG;
 char ErrorBoxFG, ErrorBoxBG;
 char MessageBoxFG, MessageBoxBG;
 char HelpBoxFG, HelpBoxBG;
 char InFocusTitleFG, InFocusTitleBG;
 char TitleFG, TitleBG;
 char DummyFG, DummyBG;
 char TextBoxFG, TextBoxBG;
 char TextBoxSelFG, TextBoxSelBG;
 char TextBoxFrameFG, TextBoxFrameBG;
 char ListBoxFG, ListBoxBG;
 char ListBoxSelFG, ListBoxSelBG;
 char ListBoxFrameFG, ListBoxFrameBG;
 char EditBoxFG, EditBoxBG;
 char EditBoxSelFG, EditBoxSelBG;
 char EditBoxFrameFG, EditBoxFrameBG;
 char MenuBarFG, MenuBarBG;
 char MenuBarSelFG, MenuBarSelBG;
 char PopDownFG, PopDownBG;
 char PopDownSelFG, PopDownSelBG;
 char InactiveSelFG;
 char ShortCutFG;
};
/* ----------- configuration parameters ----------- */
typedef struct config {
 char version[sizeof DFLAT_APPLICATION + sizeof VERSION];
 char mono; /* 0=color, 1=mono, 2=reverse mono */
 int InsertMode; /* Editor insert mode */
 int Tabs; /* Editor tab stops */
 int WordWrap; /* True to word wrap editor */
 int Border; /* True for application window border */
 int Title; /* True for application window title */

 int Texture; /* True for textured appl window */
 int ScreenLines; /* Number of screen lines (25/43/50) */
 struct colors clr; /* Colors */
} CONFIG;
extern CONFIG cfg;
extern struct colors color, bw, reverse;
int LoadConfig(void);
void SaveConfig(void);

#endif






[LISTING THREE]

/* ---------- window.c ------------- */
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
#include <string.h>
#include <dos.h>
#include "dflat.h"

WINDOW inFocus = NULLWND;

int foreground, background; /* current video colors */
static void TopLine(WINDOW, int, RECT);

/* --------- create a window ------------ */
WINDOW CreateWindow(
 CLASS class, /* class of this window */
 char *ttl, /* title or NULL */
 int left, int top, /* upper left coordinates */
 int height, int width, /* dimensions */
 void *extension, /* pointer to additional data */
 WINDOW parent, /* parent of this window */
 int (*wndproc)(struct window *,enum messages,PARAM,PARAM),
 int attrib) /* window attribute */
{
 WINDOW wnd = malloc(sizeof(struct window));
 get_videomode();
 if (wnd != NULLWND) {
 int base;
 /* ----- height, width = -1: fill the screen ------- */
 if (height == -1)
 height = SCREENHEIGHT;
 if (width == -1)
 width = SCREENWIDTH;
 /* ----- coordinates -1, -1 = center the window ---- */
 if (left == -1)
 wnd->rc.lf = (SCREENWIDTH-width)/2;
 else
 wnd->rc.lf = left;
 if (top == -1)
 wnd->rc.tp = (SCREENHEIGHT-height)/2;
 else
 wnd->rc.tp = top;
 wnd->attrib = attrib;
 if (ttl != NULL)
 AddAttribute(wnd, TITLEBAR);
 if (wndproc == NULL)
 wnd->wndproc = classdefs[FindClass(class)].wndproc;
 else
 wnd->wndproc = wndproc;
 /* ---- derive attributes of base classes ---- */
 base = class;
 while (base != -1) {
 int tclass = FindClass(base);
 AddAttribute(wnd, classdefs[tclass].attrib);
 base = classdefs[tclass].base;
 }
 if (parent && !TestAttribute(wnd, NOCLIP)) {
 /* -- keep upper left within borders of parent - */
 wnd->rc.lf = max(wnd->rc.lf,GetClientLeft(parent));
 wnd->rc.tp = max(wnd->rc.tp,GetClientTop(parent));
 }
 wnd->class = class;
 wnd->extension = extension;
 wnd->rc.rt = GetLeft(wnd)+width-1;
 wnd->rc.bt = GetTop(wnd)+height-1;
 wnd->ht = height;
 wnd->wd = width;
 wnd->title = NULL;
 if (ttl != NULL)
 InsertTitle(wnd, ttl);
 wnd->next = wnd->prev = wnd->dFocus = NULLWND;
 wnd->parent = parent;
 wnd->videosave = NULL;
 wnd->condition = ISRESTORED;
 wnd->restored_attrib = 0;
 wnd->RestoredRC = wnd->rc;
 wnd->PrevKeyboard = wnd->PrevMouse = NULL;
 wnd->DeletedText = NULL;
 SendMessage(wnd, CREATE_WINDOW, 0, 0);
 if (isVisible(wnd))
 SendMessage(wnd, SHOW_WINDOW, 0, 0);
 }
 return wnd;
}
/* -------- add a title to a window --------- */
void AddTitle(WINDOW wnd, char *ttl)
{
 InsertTitle(wnd, ttl);
 SendMessage(wnd, BORDER, 0, 0);
}
/* ----- insert a title into a window ---------- */
void InsertTitle(WINDOW wnd, char *ttl)
{
 if ((wnd->title=realloc(wnd->title,strlen(ttl)+1)) != NULL)
 strcpy(wnd->title, ttl);
}
/* ------- write a character to a window area at x,y ------- */
void PutWindowChar(WINDOW wnd, int x, int y, int c)
{
 int x1 = GetLeft(wnd)+x;

 int y1 = GetTop(wnd)+y;
 if (isVisible(wnd)) {
 if (!TestAttribute(wnd, NOCLIP)) {
 WINDOW wnd1 = GetParent(wnd);
 while (wnd1 != NULLWND) {
 /* --- clip character to parent's borders -- */
 if (x1 < GetClientLeft(wnd1) ||
 x1 > GetClientRight(wnd1) ||
 y1 > GetClientBottom(wnd1) ||
 y1 < GetClientTop(wnd1))
 return;
 wnd1 = GetParent(wnd1);
 }
 }
 if (x1 < SCREENWIDTH && y1 < SCREENHEIGHT)
 wputch(wnd, c, x, y);
 }
}
static char line[161];
#ifdef INCLUDE_SYSTEM_MENUS
/* ----- clip line if it extends below the bottom of parent window ------ */
static int clipbottom(WINDOW wnd, int y)
{
 if (!TestAttribute(wnd, NOCLIP)) {
 WINDOW wnd1 = GetParent(wnd);
 while (wnd1 != NULLWND) {
 if (GetClientTop(wnd)+y > GetClientBottom(wnd1)+1)
 return TRUE;
 wnd1 = GetParent(wnd1);
 }
 }
 return GetTop(wnd)+y > SCREENHEIGHT;
}
/* - clip portion of line that extends past right margin of parent window - */
void clipline(WINDOW wnd, int x, char *ln)
{
 WINDOW pwnd = GetParent(wnd);
 int x1 = strlen(ln);
 int i = 0;
 if (!TestAttribute(wnd, NOCLIP)) {
 while (pwnd != NULLWND) {
 x1 = GetClientRight(pwnd) - GetLeft(wnd) - x + 1;
 pwnd = GetParent(pwnd);
 }
 }
 else if (GetLeft(wnd) + x > SCREENWIDTH)
 x1 = SCREENWIDTH-GetLeft(wnd) - x;
 /* --- adjust the clipping offset for color controls --- */
 if (x1 < 0)
 x1 = 0;
 while (i < x1) {
 if ((unsigned char) ln[i] == CHANGECOLOR)
 i += 3, x1 += 3;
 else if ((unsigned char) ln[i] == RESETCOLOR)
 i++, x1++;
 else
 i++;
 }
 ln[x1] = '\0';

}
#else
#define clipbottom(w,y) FALSE
#endif
/* ------ write a line to video window client area ------ */
void writeline(WINDOW wnd, char *str, int x, int y, int pad)
{
 static char wline[120];
 if (!clipbottom(wnd, y))
 {
 char *cp;
 int len;
 int dif;
 memset(wline, 0, sizeof wline);
 len = LineLength(str);
 dif = strlen(str) - len;
 strncpy(wline, str, ClientWidth(wnd) + dif);
 if (pad) {
 cp = wline+strlen(wline);
 while (len++ < ClientWidth(wnd)-x)
 *cp++ = ' ';
 }
 clipline(wnd, x, wline);
 wputs(wnd, wline, x, y);
 }
}
/* -- write a line to video window (including the border) -- */
void writefull(WINDOW wnd, char *str, int y)
{
 if (!clipbottom(wnd, y)) {
 strcpy(line, str);
 clipline(wnd, 0, line);
 wputs(wnd, line, 0, y);
 }
}
RECT AdjustRectangle(WINDOW wnd, RECT rc)
{
 /* -------- adjust the rectangle ------- */
 if (TestAttribute(wnd, HASBORDER)) {
 if (RectLeft(rc) == 0)
 --rc.rt;
 else if (RectLeft(rc) < RectRight(rc) &&
 RectLeft(rc) < WindowWidth(wnd)+1)
 --rc.lf;
 }
 if (TestAttribute(wnd, HASBORDER | TITLEBAR)) {
 if (RectTop(rc) == 0)
 --rc.bt;
 else if (RectTop(rc) < RectBottom(rc) &&
 RectTop(rc) < WindowHeight(wnd)+1)
 --rc.tp;
 }
 RectRight(rc) = max(RectLeft(rc),min(RectRight(rc),WindowWidth(wnd)));
 RectBottom(rc) = max(RectTop(rc),min(RectBottom(rc),WindowHeight(wnd)));
 return rc;
}
/* -------- display a window's title --------- */
void DisplayTitle(WINDOW wnd, RECT *rcc)
{

 int tlen = min(strlen(wnd->title), WindowWidth(wnd)-2);
 int tend = WindowWidth(wnd)-3-BorderAdj(wnd);
 RECT rc;
 if (rcc == NULL)
 rc = RelativeWindowRect(wnd, WindowRect(wnd));
 else
 rc = *rcc;
 rc = AdjustRectangle(wnd, rc);
 if (SendMessage(wnd, TITLE, LPARAM(rcc), 0)) {
 if (wnd == inFocus) {
 foreground = cfg.clr.InFocusTitleFG;
 background = cfg.clr.InFocusTitleBG;
 }
 else {
 foreground = cfg.clr.TitleFG;
 background = cfg.clr.TitleBG;
 }
 memset(line,' ',WindowWidth(wnd));
 if (wnd->condition != ISMINIMIZED)
 strncpy(line + ((WindowWidth(wnd)-2 - tlen) / 2),
 wnd->title, tlen);
 if (TestAttribute(wnd, CONTROLBOX))
 line[2-BorderAdj(wnd)] = CONTROLBOXCHAR;
#ifdef INCLUDE_SYSTEM_MENUS
 if (TestAttribute(wnd, MINMAXBOX)) {
 switch (wnd->condition) {
 case ISRESTORED:
 line[tend+1] = MAXPOINTER;
 line[tend] = MINPOINTER;
 break;
 case ISMINIMIZED:
 line[tend+1] = MAXPOINTER;
 break;
 case ISMAXIMIZED:
 line[tend] = MINPOINTER;
 line[tend+1] = RESTOREPOINTER;
 break;
 default:
 break;
 }
 }
#endif
 line[RectRight(rc)+1] = line[tend+3] = '\0';
 writeline(wnd, line+RectLeft(rc),
 RectLeft(rc)+BorderAdj(wnd),
 0,
 FALSE);
 }
}
#ifdef INCLUDE_SHADOWS
/* --- display right border shadow character of a window --- */
static void near shadow_char(WINDOW wnd, int y)
{
 int fg = foreground;
 int bg = background;
 int x = WindowWidth(wnd);
 int c = videochar(GetLeft(wnd)+x, GetTop(wnd)+y);

 if (TestAttribute(wnd, SHADOW) == 0)
 return;
 foreground = DARKGRAY;
 background = BLACK;
 PutWindowChar(wnd, x, y, c);
 foreground = fg;
 background = bg;
}
/* --- display the bottom border shadow line for a window -- */
static void near shadowline(WINDOW wnd, RECT rc)
{
 int i;
 int y = GetBottom(wnd)+1;

 if ((TestAttribute(wnd, SHADOW)) == 0)
 return;
 if (!clipbottom(wnd, WindowHeight(wnd))) {
 int fg = foreground;
 int bg = background;
 for (i = 0; i < WindowWidth(wnd)+1; i++)
 line[i] = videochar(GetLeft(wnd)+i, y);
 line[i] = '\0';
 foreground = DARKGRAY;
 background = BLACK;
 clipline(wnd, 1, line);
 line[RectRight(rc)+1] = '\0';
 if (RectLeft(rc) == 0)
 rc.lf++;
 wputs(wnd, line+RectLeft(rc), RectLeft(rc),WindowHeight(wnd));
 foreground = fg;
 background = bg;
 }
}
#endif
/* ------- display a window's border ----- */
void RepaintBorder(WINDOW wnd, RECT *rcc)
{
 int y;
 int lin, side, ne, nw, se, sw;
 RECT rc, clrc;
 if (!TestAttribute(wnd, HASBORDER))
 return;
 if (rcc == NULL) {
 rc = RelativeWindowRect(wnd, WindowRect(wnd));
#ifdef INCLUDE_SHADOWS
 if (TestAttribute(wnd, SHADOW)) {
 rc.rt++;
 rc.bt++;
 }
#endif
 }
 else
 rc = *rcc;
 clrc = AdjustRectangle(wnd, rc);
 if (wnd == inFocus) {
 lin = FOCUS_LINE;
 side = FOCUS_SIDE;
 ne = FOCUS_NE;
 nw = FOCUS_NW;
 se = FOCUS_SE;

 sw = FOCUS_SW;
 }
 else {
 lin = LINE;
 side = SIDE;
 ne = NE;
 nw = NW;
 se = SE;
 sw = SW;
 }
 line[WindowWidth(wnd)] = '\0';
 /* ---------- window title ------------ */
 if (TestAttribute(wnd, TITLEBAR))
 if (RectTop(rc) == 0)
 if (RectLeft(rc) < WindowWidth(wnd)-BorderAdj(wnd))
 DisplayTitle(wnd, &rc);
 foreground = FrameForeground(wnd);
 background = FrameBackground(wnd);
 /* -------- top frame corners --------- */
 if (RectTop(rc) == 0) {
 if (RectLeft(rc) == 0)
 PutWindowChar(wnd, 0, 0, nw);
 if (RectLeft(rc) < WindowWidth(wnd)) {
 if (RectRight(rc) >= WindowWidth(wnd)-1)
 PutWindowChar(wnd, WindowWidth(wnd)-1, 0, ne);
 TopLine(wnd, lin, rc);
 }
 }
 /* ----------- window body ------------ */
 for (y = RectTop(rc); y <= RectBottom(rc); y++) {
 int ch;
 if (y == 0 || y >= WindowHeight(wnd)-1)
 continue;
 if (RectLeft(rc) == 0)
 PutWindowChar(wnd, 0, y, side);
 if (RectLeft(rc) < WindowWidth(wnd) &&
 RectRight(rc) >= WindowWidth(wnd)-1) {
#ifdef INCLUDE_SCROLLBARS
 if (TestAttribute(wnd, VSCROLLBAR))
 ch = ( y == 1 ? UPSCROLLBOX :
 y == WindowHeight(wnd)-2 ?
 DOWNSCROLLBOX :
 y-1 == wnd->VScrollBox ?
 SCROLLBOXCHAR :
 SCROLLBARCHAR );
 else
#endif
 ch = side;
 PutWindowChar(wnd, WindowWidth(wnd)-1, y, ch);
 }
#ifdef INCLUDE_SHADOWS
 if (RectRight(rc) == WindowWidth(wnd))
 shadow_char(wnd, y);
#endif
 }
 if (RectTop(rc) <= WindowHeight(wnd)-1 &&
 RectBottom(rc) >= WindowHeight(wnd)-1) {
 /* -------- bottom frame corners ---------- */
 if (RectLeft(rc) == 0)
 PutWindowChar(wnd, 0, WindowHeight(wnd)-1, sw);
 if (RectLeft(rc) < WindowWidth(wnd) &&
 RectRight(rc) >= WindowWidth(wnd)-1)
 PutWindowChar(wnd, WindowWidth(wnd)-1,
 WindowHeight(wnd)-1, se);
 /* ----------- bottom line ------------- */
 memset(line,lin,WindowWidth(wnd)-1);
#ifdef INCLUDE_SCROLLBARS
 if (TestAttribute(wnd, HSCROLLBAR)) {
 line[0] = LEFTSCROLLBOX;
 line[WindowWidth(wnd)-3] = RIGHTSCROLLBOX;
 memset(line+1, SCROLLBARCHAR, WindowWidth(wnd)-4);
 line[wnd->HScrollBox] = SCROLLBOXCHAR;
 }
#endif
 line[WindowWidth(wnd)-2] = line[RectRight(rc)] = '\0';
 if (RectLeft(rc) != RectRight(rc) ||
 (RectLeft(rc) && RectLeft(rc) < WindowWidth(wnd)-1))
 writeline(wnd,
 line+(RectLeft(clrc)),
 RectLeft(clrc)+1,
 WindowHeight(wnd)-1,
 FALSE);
#ifdef INCLUDE_SHADOWS
 if (RectRight(rc) == WindowWidth(wnd))
 shadow_char(wnd, WindowHeight(wnd)-1);
#endif
 }
#ifdef INCLUDE_SHADOWS
 if (RectBottom(rc) == WindowHeight(wnd))
 /* ---------- bottom shadow ------------- */
 shadowline(wnd, rc);
#endif
}
static void TopLine(WINDOW wnd, int lin, RECT rc)
{
 if (TestAttribute(wnd, TITLEBAR | HASMENUBAR))
 return;
 if (RectLeft(rc) < RectRight(rc)) {
 /* ----------- top line ------------- */
 memset(line,lin,WindowWidth(wnd)-1);
 line[RectRight(rc)] = '\0';
 writeline(wnd, line+RectLeft(rc),
 RectLeft(rc)+1, 0, FALSE);
 }
}
/* ------ clear the data space of a window -------- */
void ClearWindow(WINDOW wnd, RECT *rcc, int clrchar)
{
 if (isVisible(wnd)) {
 int y;
 RECT rc;
 if (rcc == NULL)
 rc = RelativeWindowRect(wnd, WindowRect(wnd));
 else
 rc = *rcc;
 if (RectLeft(rc) == 0)
 RectLeft(rc) = BorderAdj(wnd);
 if (RectRight(rc) > WindowWidth(wnd)-1)
 RectRight(rc) = WindowWidth(wnd)-1;
 SetStandardColor(wnd);
 memset(line, clrchar, sizeof line);
 line[RectRight(rc)+1] = '\0';
 for (y = RectTop(rc); y <= RectBottom(rc); y++) {
 if (y < TopBorderAdj(wnd) ||
 y >= WindowHeight(wnd)-1)
 continue;
 writeline(wnd,
 line+(RectLeft(rc)),
 RectLeft(rc),
 y,
 FALSE);
 }
 }
}
/* -- adjust a window's rectangle to clip it to its parent -- */
static RECT near ClipRect(WINDOW wnd)
{
 RECT rc;
 rc = wnd->rc;
#ifdef INCLUDE_SHADOWS
 if (TestAttribute(wnd, SHADOW)) {
 RectBottom(rc)++;
 RectRight(rc)++;
 }
#endif
 if (!TestAttribute(wnd, NOCLIP)) {
 WINDOW pwnd = GetParent(wnd);
 if (pwnd != NULLWND) {
 RectTop(rc) = max(RectTop(rc),
 GetClientTop(pwnd));
 RectLeft(rc) = max(RectLeft(rc),
 GetClientLeft(pwnd));
 RectRight(rc) = min(RectRight(rc),
 GetClientRight(pwnd));
 RectBottom(rc) = min(RectBottom(rc),
 GetClientBottom(pwnd));
 }
 }
 RectRight(rc) = min(RectRight(rc), SCREENWIDTH-1);
 RectBottom(rc) = min(RectBottom(rc), SCREENHEIGHT-1);
 RectLeft(rc) = min(RectLeft(rc), SCREENWIDTH-1);
 RectTop(rc) = min(RectTop(rc), SCREENHEIGHT-1);
 return rc;
}
/* -- get the video memory that is to be used by a window -- */
void GetVideoBuffer(WINDOW wnd)
{
 RECT rc;
 int ht;
 int wd;
 rc = ClipRect(wnd);
 ht = RectBottom(rc) - RectTop(rc) + 1;
 wd = RectRight(rc) - RectLeft(rc) + 1;
 wnd->videosave = realloc(wnd->videosave, (ht * wd * 2));
 get_videomode();
 if (wnd->videosave != NULL)
 getvideo(rc, wnd->videosave);
}
/* --- restore the video memory that was used by a window -- */
void RestoreVideoBuffer(WINDOW wnd)
{
 if (wnd->videosave != NULL) {
 RECT rc;
 rc = ClipRect(wnd);
 storevideo(rc, wnd->videosave);
 free(wnd->videosave);
 wnd->videosave = NULL;
 }
}
/* ------ compute the logical line length of a window ------ */
int LineLength(char *ln)
{
 int len = strlen(ln);
 char *cp = ln;
 while ((cp = strchr(cp, CHANGECOLOR)) != NULL) {
 cp++;
 len -= 3;
 }
 cp = ln;
 while ((cp = strchr(cp, RESETCOLOR)) != NULL) {
 cp++;
 --len;
 }
 return len;
}


































August, 1991
STRUCTURED PROGRAMMING


Chimney-Pipe Interruptions




Jeff Duntemann KG7JF


Mr. Horny is back. He announced his return in spectacular style one recent
weekday night at 3 a.m. or so, by landing on the perforated metal
spark-catcher cap that encloses the top of the master-bedroom fireplace
chimney pipe, and proceeding to do what owls do for reasons best known to
themselves: HOO-HOO! HOOOOOOOOOOOOOO!
It's not for nothing that owls can be heard a long way off, and a chimney pipe
can do wonders in conducting sound waves. Carol and I sat bolt upright in bed,
sure the Martians were invading. Mr. Byte sailed off his spot at the foot of
the bed and ran to the fireplace, determined to defend his home from hostile
aliens, and made his most fearsome noises right into the hearth until I hauled
him bodily back to bed.
That was that. Sound amplifies both ways through a chimney pipe. We've heard
Mr. Horny since then, at considerably greater distance.


Living Better Asynchronously


Sometimes I think I would define life as a series of interruptions, from owls
and other things. We set up a sequential itinerary for ourselves, begin to
pursue it, and then the phone rings. Or the oven timer beeps. Or the dog
throws up on the brand new living room rug. We trudge through life, answering
phones, burning roasts, and wiping up dog urp, thinking sequentially while
struggling against the universe's insistence on operating asynchronously.
Should we expect our machines to operate any differently?
I sometimes see us, the structured programming gang, as living in a fool's
paradise. We start our programs at the top, run them through to the bottom,
and assume nothing untoward happens along the way. But in fact, the C and
assembler crazies have almost totally masked reality for us: Interruptions are
happening all the time, from clocks and disks and printers and modems and
network controllers and numerous other things. The BIOS, operating system, and
installable device drivers do their work well, so well that we can sometimes
squint a little and forget that such things as machine interrupts even exist.
Nonetheless, they do. They're essential. I think, moreover, that we should
understand how interrupts work, and be ready to write our own interrupt
handlers when the occasion arises.


A Tap on the Shoulder


The nature of The Box That Follows a Plan is to begin at the top of a sequence
of machine instructions and follow them sequentially to their end, making
branches and jumps in a rational manner. An interrupt is nothing more than a
tap on the CPU's shoulder, with a directive to hold that thought, and duck
over here for a second to take care of something else right now.
Some interrupts happen at predictable times (the clock tick interrupt being
the best example), but the real hallmark of interrupts is that they happen
when they happen, generally on no schedule and without warning. I can sit here
and stare at the screen doing nothing for as long as I choose. But at some
point (at least if I ever expect to make a nickel writing again) I have to
reach out and press a key. Bang! There's an interrupt. The machine must set
aside what it's doing for a moment and go fetch the code for the key I just
pressed. It does some necessary processing on that key (more than you might
imagine, although that's another story) before putting a key code in the
keyboard buffer and taking up its previous work once more.
Like everything else connected with the 86-family Intel CPU product line,
interrupts have evolved over time. What I'm going to describe here are things
as they exist in the 8086/8088 CPU itself, without getting into the
enhancements specific to the 286, 386, and 486. Perhaps another time.
The 8086 has the machinery to handle as many as 256 different interrupts. A
handful of them perform special services baked right into the CPU chip, and a
few more serve the PC hardware and operating environment, but most lie simply
unused. Each of those 256 different interrupts represents a possible "something
else" for the CPU to do when it receives its tap on the shoulder. We'll come
back to the machinery of the shoulder tap itself. It's easier to begin by
understanding what happens when an interrupt is received by the CPU.


The Interrupt Vector Table


Interrupts are numbered from 0 to 255. Regardless of where an interrupt comes
from, it has a number in that range. Down in the very lowest area of the 8086
memory address space is a table of 256 4-byte slots for containing addresses,
one address for each of the 256 possible interrupts. This table is called the
interrupt vector table, and is 1024 bytes in size, located in the very first
1024 bytes of memory. Most of the slots in the interrupt vector table are
empty and consist of 4 bytes of zeros. A valid address in the table is called
an interrupt vector.
The vectors in the table are full 32-bit addresses, consisting of a 16-bit
segment and a 16-bit offset. The offset portion is first (lower) in memory
followed by the segment portion. I've sketched out the order of the addresses
and their component parts in Figure 1. You don't need to memorize these
things; most of the time, you'll be dealing with interrupt vectors as
indivisible wholes.
As I said, most of the slots in the interrupt vector table are zeroed out and
considered empty. At power-up time and occasionally later, DOS, the BIOS, a
driver, or an application will place a valid vector in the vector table.
"Vector" really means "pointer," and that's a good way to conceptualize the
vectors placed in the interrupt vector table. They are pointers to little code
sequences located somewhere else in memory. These code sequences are called
interrupt service routines (ISRs), and are the "something else" that the CPU
must do when an interrupt occurs.
Something to keep in mind is that any interrupt service routine can always be
located, no matter where it actually is in memory, simply by knowing the
interrupt number it serves. The address of the ISR that serves interrupt 6
exists in segment 0, at an offset of 6 x 4, or 24 (hexadecimal $18). The ISR
itself is not there, but the ISR's address is. The CPU simply has to multiply
an interrupt's number by four (which is an easy thing for the CPU to do, since
multiplies by powers of two are simply bit-shifts) and jump to the address it
finds at the resulting offset from 0. It will then be executing the
interrupt's ISR.
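The slot arithmetic can be sketched in a few lines of C. This is only an illustration, not code from the PC itself: the low_memory array is a hypothetical stand-in for the first 1024 bytes of real memory, and the names Vector, vector_slot, set_vector, and get_vector are invented for the sketch. It assumes the 8086's habit of storing the low byte of each 16-bit word first.

```c
#include <stdint.h>

/* Hypothetical stand-in for the first 1024 bytes of memory (segment 0). */
static uint8_t low_memory[1024];

/* One interrupt vector: a 16-bit offset and a 16-bit segment. */
typedef struct {
    uint16_t offset;   /* stored first (lower) in memory */
    uint16_t segment;  /* stored after the offset */
} Vector;

/* Offset of interrupt n's slot from segment 0: n times 4, done as a shift. */
static uint16_t vector_slot(uint8_t n)
{
    return (uint16_t)(n << 2);
}

/* Place a vector in the table: offset word, then segment word,
   each with its low byte first. */
static void set_vector(uint8_t n, Vector v)
{
    uint16_t slot = vector_slot(n);
    low_memory[slot]     = (uint8_t)(v.offset & 0xFF);
    low_memory[slot + 1] = (uint8_t)(v.offset >> 8);
    low_memory[slot + 2] = (uint8_t)(v.segment & 0xFF);
    low_memory[slot + 3] = (uint8_t)(v.segment >> 8);
}

/* Read a vector back out of the table. */
static Vector get_vector(uint8_t n)
{
    uint16_t slot = vector_slot(n);
    Vector v;
    v.offset  = (uint16_t)(low_memory[slot]     | (low_memory[slot + 1] << 8));
    v.segment = (uint16_t)(low_memory[slot + 2] | (low_memory[slot + 3] << 8));
    return v;
}
```

Calling vector_slot(6) gives 24 ($18), the same figure worked out above for interrupt 6. A slot that still holds four zero bytes comes back from get_vector as segment 0, offset 0, which is the "empty" case.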


Hold Everything


From a height then, what happens when the CPU receives interrupt N is this: It
saves the bare essentials of what it is currently executing, locates the
address of interrupt N, and then branches to the code existing at that
address. At that point it is executing the interrupt's ISR.
What gets saved, and how? Remarkably, the CPU only saves two things when an
interrupt happens: The machine flags and the 32-bit address of the next
instruction. The machine flags comprise a 16-bit word containing information
about actions in progress, such as whether the last arithmetic operation
resulted in a carry or a borrow, whether the last operation produced a result
of 0, and so on. In a sense, the flags retain the essential what of the CPU's
previous work. Next, the CPU saves the where of its previous work, by saving
the address of the instruction it was about to execute when the interrupt came
in. This address consists of the Code Segment register (CS) and the
Instruction Pointer register (IP). The CPU does not automatically save the
contents of the machine registers like AX, DX, BP, or SI. If the registers are
to be saved, the ISR itself must save them. Obviously, if the ISR leaves the
registers alone, it needn't save them. However, if it intends to reuse them or
otherwise change values that exist in them, it had better save them, and in
most cases ISRs do save one or more registers that they intend to use.
The CPU saves what it saves by pushing it on the system stack. The stack is
nothing more than an area of memory addressed by two registers, SS and SP. SS
and SP are initially set up by DOS, and as Pascal programmers you should only
change them in dire need. Altering the stack incorrectly (or unsuccessfully)
is the fastest road I can think of to a Big Red Switch crash.
The stack is an interesting creature that I won't fully describe here. The
essence of a stack is that it is a last-in, first-out mechanism. The last
thing pushed onto the stack is the first thing popped off. In other words,
things come off the stack in the reverse order that they go on. For example,
the CPU pushes the flags on the stack first, followed by CS, and then IP. When
it retrieves them later on, it will first pop IP, then CS, and finally the
flags.
To recap: When interrupt N taps the CPU on the shoulder, the CPU first pushes
the flags on the stack, then pushes CS, followed by IP. The CPU then
calculates the address of interrupt vector N, reads the interrupt vector from
low memory, and places the vector into CS and IP.
As a result, the next instruction the CPU fetches for execution is the first
instruction of interrupt N's interrupt service routine. The CPU is then off
and running on the interrupt.


Coming Home Again



As I mentioned earlier, if the ISR intends to use any of the registers, it
must push their current values onto the stack, so it can restore those values
before returning control to what the CPU was doing before the interrupt.
After saving any registers it intends to use, the ISR does what it must. As
I'll say again and again, it had better be quick. Creating complex ISRs that
take a long time to execute is asking for trouble. There are also special
considerations you have to keep in mind when writing ISRs in order to stay out
of various kinds of trouble. I'll get into these later on. (Can you say,
"reentrancy?")
When the ISR is finished with its specific tasks, it must return control
gracefully to whatever work was in progress when the interrupt happened. The
advice, "pop whatever you push" is applicable here. Anything the ISR pushed
onto the stack must be popped off again. If the ISR pushed three machine
registers onto the stack, it had better pop three registers back off again, or
you'll hear the crash in the next county.
The final switch from ISR back to ordinary application code is handled by a
special machine instruction called an interrupt return instruction, or IRET.
The IRET pops the IP value from the stack back into the CPU's internal
instruction pointer register, pops the CS value back into the code segment
register, and finally restores the prior state of the various machine flags by
popping the flags' values from the stack into the flags themselves.
At this point (assuming the ISR didn't "trash" any registers or memory that
the code-in-progress was using) things should be just as they were before the
interrupt happened, and work (like life) goes on -- at least until the next
interrupt.
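The round trip from interrupt to IRET can be modeled as a toy in C. This is a sketch under stated assumptions, not a description of the silicon: the Cpu structure, its 64-word stack, and the function names are all invented for illustration, and the stack pointer here is simply an index into an array rather than a real SS:SP pair. The one detail borrowed from later in this column is that the CPU clears the Interrupt Flag on the way in.

```c
#include <stdint.h>

#define IF_BIT 0x0200u      /* Interrupt Flag, bit 9 of the flags word */
#define STACK_WORDS 64

/* Toy model of the pieces of CPU state an interrupt touches. */
typedef struct {
    uint16_t flags;
    uint16_t cs;
    uint16_t ip;
    uint16_t sp;                    /* index into stack[]; grows downward */
    uint16_t stack[STACK_WORDS];
} Cpu;

static void push(Cpu *c, uint16_t value)
{
    c->stack[--c->sp] = value;
}

static uint16_t pop(Cpu *c)
{
    return c->stack[c->sp++];
}

/* Interrupt entry: push flags, then CS, then IP, and load CS:IP
   from the vector that was read out of the table. */
static void interrupt_enter(Cpu *c, uint16_t vec_segment, uint16_t vec_offset)
{
    push(c, c->flags);
    push(c, c->cs);
    push(c, c->ip);
    c->flags &= (uint16_t)~IF_BIT;  /* hold off further maskable interrupts */
    c->cs = vec_segment;
    c->ip = vec_offset;
}

/* IRET: pop in the reverse order, IP first, then CS, then the flags. */
static void interrupt_return(Cpu *c)
{
    c->ip = pop(c);
    c->cs = pop(c);
    c->flags = pop(c);
}
```

Because the flags are popped last and whole, the Interrupt Flag comes back in whatever state it was in before the interrupt. An ISR that pushes registers of its own has to pop them before interrupt_return runs, or the three pops will pick up the wrong words.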


Software Interrupts


There are some minor details that we'll come back to, but in general, all
interrupts are handled pretty much that way. And so we return to the question
of where interrupts actually come from; that is, who taps the CPU's shoulder
to kick off an interrupt?
Interrupts can come from two different places: software and hardware. Software
interrupts are intriguing and I'll spend some time on them in a future column.
But quite briefly, you can kick off an interrupt just by using a special
machine instruction created for that purpose. Executing an INT N instruction
forces the CPU to go through the interrupt process just described for
interrupt N.
Far trickier, but more useful in many ways, are interrupts generated by the
hardware. The CPU chip has a pin dedicated to interrupt generation.
Ordinarily, this pin is held idle at a logic 0. A hardware gadget of some sort
may be attached to the interrupt pin, and when that gadget even momentarily
raises the level on the interrupt pin to logic 1, a hardware interrupt occurs,
and once again the sequence described above happens.


Sharing a Pin


So conceptually, interrupts are pretty simple. You can almost consider them
subroutines whose addresses can be found in a table at a predictable location,
and for software interrupts that's pretty close to the whole truth.
Hardware interrupts, however, get complicated for this reason: There is only
one general-purpose interrupt pin on the CPU chip. As soon as you want to
connect more than one interrupt-capable peripheral to the PC, you have to
consider how to keep the peripherals from fighting over that one pin.
It's not enough to put some sort of eight-input OR-gate on the interrupt pin
and then give everybody an input to the OR-gate. That allows up to eight
people to knock, but the CPU still has no way of knowing who's there. (Not to
mention the problem of what to do when two or three people knock at once....)
This problem requires another chip to solve, and that chip is the 8259
Programmable Interrupt Controller (PIC) device manufactured by Intel, National
Semiconductor, and other firms. The 8259 has three jobs to do:
1. It allows up to eight devices to access the CPU's interrupt pin and it
tells the CPU which device is interrupting.
2. It allows the programmer to "mask out" any of those eight interrupts, so
that when desired, the masked interrupts will not be passed through to the
CPU.
3. It handles the problem of what to do when another interrupt request comes
in while a prior interrupt request is still being serviced.
Understand the three tasks performed by the 8259, and you've got PC interrupts
in your hip pocket.
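The three jobs can be sketched as a small decision routine in C. Treat it as a rough model only: the Pic structure and pic_next are hypothetical names, and the priority rule used here (the lowest-numbered input wins, and a request waits while an equal or higher priority input is still being serviced) is an assumption about the chip's default arrangement that this column has not yet covered.

```c
#include <stdint.h>

/* Toy model of one 8259: each field holds one bit per input, 0 through 7. */
typedef struct {
    uint8_t request;     /* bit n set: the device on input n wants service */
    uint8_t mask;        /* bit n set: input n is masked out */
    uint8_t in_service;  /* bit n set: input n is still being serviced */
} Pic;

/* Decide which input, if any, should be passed on to the CPU now.
   Returns the input number 0..7, or -1 if nothing may interrupt. */
static int pic_next(const Pic *p)
{
    uint8_t ready = (uint8_t)(p->request & (uint8_t)~p->mask);
    for (int n = 0; n < 8; n++) {
        uint8_t bit = (uint8_t)(1u << n);
        if (p->in_service & bit)
            return -1;   /* an equal or higher priority input is busy */
        if (ready & bit)
            return n;    /* job 1: report which device is interrupting */
    }
    return -1;
}
```

With requests pending on inputs 3 and 4 and nothing masked, pic_next reports 3. Mask out input 3 and it reports 4 instead. Mark input 1 as in service and both requests wait their turn.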


Those IRQ Numbers


At one point or another you've run into a serial port problem (no probablies
about it; serial port problems are as common as corrupt congressmen) and had
someone ask, "Well, is the port set up for IRQ3 or IRQ4?" Perhaps you peeked
at the DIP switches and were able to report the truth, but you might also have
wondered just what that meant.
The IRQ numbers are the identifiers of the eight inputs to the 8259 PIC chip.
They run from IRQ0 to IRQ7, and they represent literal input pins on the
physical 8259 chip as well as the names of signals passing through the chip.
It's important to remember that the IRQ numbers do not correspond to interrupt
vector numbers. IRQ0, for example, does not make use of interrupt vector 0,
nor does IRQ1 make use of vector 1, and so on. In truth, the IRQ interrupts
are "mapped onto" interrupt vectors 8 through 15, where IRQ0 uses vector 8,
IRQ1 uses vector 9, and so on. I've summarized the first 16 PC interrupt
vectors in Table 1, including their memory addresses, applicable IRQ numbers,
and standard uses, if any.
Table 1: The first 16 interrupt vectors and their uses

 Vector#  Vector offset from segment 0  IRQ   Standard use
----------------------------------------------------------------
 0        $0000                               Divide by 0 (internal)
 1        $0004                               Single step (internal)
 2        $0008                               Non-Maskable Interrupt
 3        $000C                               Breakpoint interrupt
 4        $0010                               Overflow (INTO instruction)
 5        $0014                               Print screen
 6        $0018                               IBM Reserved
 7        $001C                               IBM Reserved
 8        $0020                         IRQ0  Timer tick
 9        $0024                         IRQ1  Keyboard
 A        $0028                         IRQ2  AT 8259 pass-through
 B        $002C                         IRQ3  COM2:
 C        $0030                         IRQ4  COM1:
 D        $0034                         IRQ5  Hard disk controller
 E        $0038                         IRQ6  Diskette controller
 F        $003C                         IRQ7  Parallel port
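
The mapping in Table 1 comes down to one addition and one multiplication. Here is a minimal sketch in C, covering only the eight IRQs of the first 8259; the function names and the use of -1 for an out-of-range IRQ are choices made for this illustration.

```c
#include <stdint.h>

enum { IRQ_BASE_VECTOR = 8 };   /* IRQ0 uses vector 8, per Table 1 */

/* Interrupt vector number used by IRQ0..IRQ7, or -1 if out of range. */
static int irq_to_vector(int irq)
{
    if (irq < 0 || irq > 7)
        return -1;
    return IRQ_BASE_VECTOR + irq;
}

/* Offset from segment 0 of the table slot for that IRQ, or -1. */
static int irq_vector_offset(int irq)
{
    int vector = irq_to_vector(irq);
    return (vector < 0) ? -1 : vector * 4;
}
```

For COM1: on IRQ4, irq_to_vector gives 12 (hex C) and irq_vector_offset gives 48 ($0030), matching the row in the table.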

The first several interrupts are special-purpose in nature. Some of them are
built into the CPU. A divide by 0 operation, for example, will automatically
trigger an interrupt to vector 0. You don't have to code it up, and the
interrupt pin on the CPU is not involved. If the DIV instruction microcode
detects a divide by 0, it does what amounts to a software interrupt to vector
0.
The Non-Maskable Interrupt (NMI) uses vector 2. NMI has its own dedicated pin
on the CPU, and generally is used to report catastrophic hardware failure. On
those occasions when you've seen PARITY ERROR on your screen just before the
system locked up, you've witnessed a nonmaskable interrupt in action,
reporting a bad memory location somewhere. The NMI is not something
programmers ordinarily mess with, so I won't describe it further.


Cascading Controllers



What I've described so far is pretty much the way things exist on the older PC
and XT machines based on the 8088. Starting with the AT in 1984, however, IBM
added a second 8259 PIC chip to the motherboard. This added eight interrupt
lines to the system, for a total of 15. (One line is taken in connecting the
two 8259 chips to one another, or there would be 16.)
The second 8259's output pin is connected to the IRQ2 input of the original
8259. This prevents IRQ2 from being used for any specific hardware device, but
it adds the eight inputs from the second 8259. The second set of interrupt
inputs are known as IRQ8-IRQ15. IRQ8 is used by the AT's real-time clock chip,
and IRQ9 is used by local area network adapter boards. Most of the other IRQs
are undedicated or reserved.
When an interrupt comes in from one of the second set of IRQs, the second 8259
enters an interrupt to IRQ2 of the first 8259. Then some additional protocols
must be followed to inform the CPU which of the second set of IRQs was the
ultimate source of the interrupt. Yes, it does get hairy, but the second eight
IRQs don't really involve serial communications in any way, and I won't be
discussing them further.


Masking Out Interrupts


Apart from the NMI and two internal interrupts, all of the interrupts in the
PC architecture are maskable, meaning that the CPU can be made to ignore them,
even when an external hardware device attempts to trigger them.
The CPU can mask out interrupts generally through the use of the Interrupt
Flag (IF) and the two machine instructions that toggle it. The STI instruction
sets IF, and the CLI instruction clears IF. When IF is cleared, all maskable
interrupts will be ignored by the CPU. (Software interrupts are not affected
by IF, and again, neither is the NMI.) IF is automatically cleared when an
interrupt is
recognized by the CPU to prevent a second interrupt from happening until the
CPU is ready to deal with it. We'll come back to this shortly.
Masking out individual interrupts while allowing others to go through is done
by way of the machinery inside the 8259 chip. Inside the 8259 is an 8-bit
register cleverly named Operation Control Word 1, or OCW1 for short. Don't
forget that, despite its name, OCW1 is a byte in size, and not a word. Each of
the 8 bits in OCW1 masks or enables one of the eight IRQ interrupt signals
controlled by the 8259. It's a simple relationship: Bit 0 controls IRQ0, bit 1
controls IRQ1, and so on, through all eight IRQs.
When a bit in OCW1 is 1, the corresponding interrupt is disabled. (We say
"masked.") When a bit in OCW1 is 0, the corresponding interrupt is enabled.
Enabling COM1: means setting the bit for IRQ4 to 0; enabling COM2: means
setting the bit for IRQ3 to 0. (See Table 1 for IRQ numbers and what they do.)


Working With OCW1


OCW1 is a read/write register accessed through I/O port number $21. I've
summarized the bit numbers, IRQ numbers, and mask values associated with OCW1
in Figure 2. One thing never to forget in programming OCW1 is that you can't
just write a mask value to it. Writing $10 directly to OCW1 will disable IRQ4
-- and enable all other IRQs, regardless of whether they were enabled before!
You must make sure that what you write affects only the mask bits and thus
IRQs that you wish to affect. This is best done by reading OCW1, ANDing or
ORing the OCW1 contents with the mask bit you wish to change, and then writing
the whole value back to OCW1. For example, to disable interrupts at IRQ4 (to
turn off COM1: interrupts) you might use the Pascal statements in Example
1(a). The OR operator writes the single 1-bit in IRQ4Mask to OCW1 without
affecting any of the other bits either way. Enabling interrupts at IRQ4 would
be done as in Example 1(b).
Example 1: (a) Disabling interrupts at IRQ4; (b) enabling interrupts at IRQ4

 (a) OCW1 := $21;
     IRQ4Mask := $10;
     Port [OCW1] := Port [OCW1] OR IRQ4Mask;

 (b) Port [OCW1] := Port [OCW1] AND (NOT IRQ4Mask);

Note that you must first invert the mask value via NOT. The idea in enabling
an interrupt is to force the OCW1 mask bit in question to 0, and the
significant bits in the mask values are all 1-bits. The AND operator will then
force the mask bit in question to a 0, since the significant bit in the
inverted mask value is the only 0-bit in the inverted mask value.
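For readers who think in C, the same read-modify-write idea can be sketched as follows. This is an illustration only: the ocw1 variable is a hypothetical stand-in for the register behind I/O port $21, so nothing here touches real hardware, and the function names are invented for the sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 8259's mask register at I/O port $21. */
static uint8_t ocw1;

/* Mask value for a given IRQ: bit n controls IRQn, so IRQ4 is $10. */
static uint8_t irq_mask_bit(int irq)
{
    return (uint8_t)(1u << irq);
}

/* Disable (mask) one IRQ: OR forces its bit to 1, others are untouched. */
static void disable_irq(int irq)
{
    ocw1 |= irq_mask_bit(irq);
}

/* Enable one IRQ: AND with the inverted mask forces its bit to 0,
   others are untouched. */
static void enable_irq(int irq)
{
    ocw1 &= (uint8_t)~irq_mask_bit(irq);
}
```

Starting from a register value of $A8, disable_irq(4) yields $B8 and enable_irq(4) brings it back to $A8. Assigning $10 to ocw1 directly would instead wipe out the three mask bits that were already set, which is exactly the mistake warned about above.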


Hold That Thought


Once again, the subject at hand far outweighs a single column's worth of
magazine pages. There's a lot more to be said about the 8259 before we can
write a simple interrupt-driven version of the POLLTERM program I presented a
few columns back. I'd almost say that in regard to interrupt-driven serial
port I/O, the 8259 is a more significant challenge than the CPU itself.
More coming. Stay tuned.

























August, 1991
GRAPHICS PROGRAMMING


More Undocumented 256-Color VGA Magic




MICHAEL ABRASH


Every so often, a programming demon that I'd thought I'd forever laid to rest
arises to haunt me once again. A minor example of this -- an imp, if you will
-- is the use of " = " when I mean " == ," which I've done all too often in
the past, and am sure I'll do again. That's minor deviltry, though, compared
to the considerably greater evils of one of my personal scourges, of which I
was recently reminded anew: too-close attention to detail. Not seeing the
forest for the trees. Looking low when I should have looked high. Missing the
big picture, if you catch my drift.
Thoreau said it best: "Our life is frittered away by detail....Simplify,
simplify." That quote sprang to mind when I received a letter from Anton
Treuenfels of Fridley, Minnesota, thanking me for clarifying the principles of
filling adjacent convex polygons, as discussed in this column in February and
March. Anton then went on to describe his own method for filling convex
polygons.
Anton's approach had its virtues and drawbacks, foremost among the virtues
being a simplicity Thoreau would have admired. For instance, in writing my
polygon-filling code, I had spent quite some time trying to figure out the
best way to identify which edge was the left edge and which the right, finally
settling on comparing the slopes of the edges if the top of the polygon wasn't
flat, and comparing the starting points of the edges if the top was flat.
Anton simplified this tremendously by not bothering to figure out ahead of
time which was the right edge of the polygon and which the left, instead
scanning out the two edges in whatever order he found them and letting the
low-level drawing code test, and if necessary swap, the endpoints of each
horizontal line of the fill, so that filling started at the leftmost edge.
This is a little slower than my approach (although the difference is almost
surely negligible), but it also makes quite a bit of code go away.
What that example, and others like it in Anton's letter, did was kick my mind
into a mode that it hadn't -- but should have -- been in when I wrote the
code, a mode in which I began to wonder, "How else can I simplify this code?";
what you might call Occam's Razor mode. You see, I created the convex
polygon-drawing code by first writing pseudocode, then writing C code, and
finally writing assembly code, and once the pseudocode was finished, I stopped
thinking about the interactions of the various portions of the program. In
other words, I became so absorbed in individual details that I forgot to
consider the code as a whole. That was a mistake, and an embarrassing one for
someone who constantly preaches that programmers should look at their code
from a variety of perspectives. May my embarrassment be your enlightenment.
The point is not whether, in the final analysis, my code or Anton's code is
better; both have their advantages. The point is that I was programming with
half a deck because I was so fixated on the details of a single sort of
implementation; I ended up with relatively hard-to-write, complex code, and
missed out on many potentially useful optimizations by being so focused. It's
a big world out there, and there are many subtle approaches to any problem, so
relax and keep the big picture in mind as you implement your programs. Your
code will likely be not only better, but also simpler. And whenever you see me
walking across hot coals in this column when there's an easier way to go,
please, let me know!
Thanks, Anton.


Mode X Continued


Last month, I introduced you to what I call mode X, an undocumented 320 x 240
256-color mode of the VGA. Mode X is distinguished from mode 13h, the
documented 320 x 200 256-color VGA mode, in that it supports page flipping,
makes off-screen memory available, has square pixels, and, above all, lets you
use the VGA's hardware to increase performance by as much as four times (at
the cost of more complex and demanding programming, to be sure -- but end
users care about results, not how hard the code was to write, and mode X
delivers results in a big way). Last month we saw how the VGA's plane-oriented
hardware can be used to speed solid fills. That's a nice technique, but this
month we're going to move up to the big guns -- the latches.
The VGA has four latches, one for each plane of display memory. Each latch
stores exactly one byte, and that byte is always the last byte read from the
corresponding plane of display memory, as shown in Figure 1. Furthermore,
whenever a given address in display memory is read, all four planes' bytes at
that address are read and stored in the corresponding latches, regardless of
which plane supplied the byte returned to the CPU (as determined by the Read
Map register). As with so much else about the VGA, the above will make little
sense to VGA neophytes, but the important point is this: By reading one
display memory byte, 4 bytes -- one from each plane -- can be loaded into the
latches at once. Any or all of those 4 bytes can then be written anywhere in
display memory with a single byte-sized write, as shown in Figure 2. The
upshot is that the latches make it possible to copy data around from one part
of display memory to another, 32 bits (four pixels) at a time -- four times as
fast as normal. (Recall from last month that in mode X, pixels are stored one
per byte, with four pixels in a row stored in successive planes at the same
address, one pixel per plane.) However, any one latch can only be loaded from
and written to the corresponding plane, so an individual latch can only work
with every fourth pixel on the screen; the latch for plane 0 can work with
pixels 0, 4, 8, ..., the latch for plane 1 with pixels 1, 5, 9, ..., and so
on.
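The latch behavior described above can be modeled in ordinary C, with arrays standing in for the four planes. This is a simulation for illustration, not VGA programming: plane_mem, latch, and every function name are hypothetical, the plane_mask parameter only loosely imitates the hardware's plane-select register, and the sketch assumes writes take all of their bits from the latches. The address arithmetic follows from the mode X layout: 320 pixels per row at four pixels per address gives 80 addresses per row.

```c
#include <stdint.h>

#define PLANES        4
#define ADDRESSES     65536u  /* addresses per plane in this model */
#define ROW_ADDRESSES 80u     /* 320 pixels per row / 4 pixels per address */

static uint8_t plane_mem[PLANES][ADDRESSES];  /* simulated display memory */
static uint8_t latch[PLANES];                 /* one latch per plane */

/* Display memory address holding pixel (x, y) in mode X. */
static unsigned mode_x_address(unsigned x, unsigned y)
{
    return y * ROW_ADDRESSES + x / 4;
}

/* Plane holding pixel (x, y): successive pixels go to successive planes. */
static unsigned mode_x_plane(unsigned x)
{
    return x & 3u;
}

/* A read loads all four latches, whichever plane's byte is returned. */
static uint8_t vga_read(unsigned addr, unsigned read_plane)
{
    for (unsigned p = 0; p < PLANES; p++)
        latch[p] = plane_mem[p][addr];
    return latch[read_plane];
}

/* A write that takes every bit from the latches. plane_mask picks
   which planes are written (bit p set: plane p is written). */
static void vga_write_latches(unsigned addr, unsigned plane_mask)
{
    for (unsigned p = 0; p < PLANES; p++)
        if (plane_mask & (1u << p))
            plane_mem[p][addr] = latch[p];
}

/* Copy four pixels from one address to another: one read, one write. */
static void latch_copy(unsigned src, unsigned dst)
{
    (void)vga_read(src, 0);
    vga_write_latches(dst, 0x0Fu);
}
```

A single latch_copy call moves all four pixels at the source address, which is where the fourfold speedup comes from. Note that each latch[p] only ever travels between plane p and itself, so the copy cannot shift pixels sideways by one, two, or three positions.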
The latches aren't intended for use in 256-color mode -- they were designed to
allow individual bits of display memory to be modified in 16-color mode -- but
they are nonetheless very useful in mode X, particularly for patterned fills
and screen-to-screen copies, including scrolls. Patterned filling is a good
place to start, because patterns are widely used in windowing environments for
desktops, window backgrounds, and scroll bars, and for textures and color
dithering in drawing and game software.
Fast mode X fills with patterns that are four pixels in width can be performed
by drawing the pattern once to the four pixels at any one address in display
memory, reading that address to load the pattern into the latches, setting the
Bit Mask register to 0 to specify that all bits drawn to display memory should
come from the latches, and then performing the fill pretty much as we did last
month, except that each line of the pattern must be loaded into the latches
before the corresponding scan line on the screen is filled. Listings One and
Two (page 181) together demonstrate a variety of fast mode X four-by-four
pattern fills. (The mode set function called by Listing One is from last
month's column.)
Four-pixel-wide patterns are more useful than you might imagine. There are
actually 2^128 possible patterns (16 pixels, each with 2^8 possible colors);
that set is certainly large enough for most color-dithering purposes, and
includes many often-used patterns, such as halftones, diagonal stripes, and
crosshatches.
Furthermore, eight-wide patterns, which are widely used, can be drawn with two
passes, one for each half of the pattern; this principle can in fact be
extended to patterns of arbitrary multiple-of-four widths. (Widths that aren't
multiples of four are considerably more difficult to handle, because the
latches are four pixels wide.)


Allocating Memory in Mode X


Listing Two raises some interesting questions about the allocation of display
memory in mode X. In Listing Two, whenever a pattern is to be drawn, that
pattern is first drawn in its entirety at the very end of display memory; the
latches are then loaded from that copy of the pattern before each scan line of
the actual fill is drawn. Why this double copying process, and why is the
pattern stored in that particular area of display memory?
The double copying process is used because it's the easiest way to load the
latches. Remember, there's no way to get information directly from the CPU to
the latches; the information must first be written to some location in display
memory, because the latches can be loaded only from display memory. By writing
the pattern to off-screen memory, we don't have to worry about interfering
with whatever is currently displayed on the screen.
As for why the pattern is stored exactly where it is, that's part of a master
memory allocation plan that will come to fruition next month when I implement
a mode X animation program. Figure 3 shows this master plan; the first two
pages of memory (each 76,800 pixels long, spanning 19,200 addresses -- that
is, 19,200 pixel quadruplets -- in display memory) are reserved for page
flipping, the next page of memory (also 76,800 pixels long) is reserved for
storing the background (this is used to restore the holes left after images
move), the last 16 pixels (four addresses) of display memory are reserved for
the pattern buffer, and the remaining 31,728 pixels (7932 addresses) of
display memory are free for storage of icons, images, temporary buffers, or
whatever. This is an efficient organization for animation, but there are
certainly many other possible setups. For example, you might choose to have a
solidly-colored background, in which case you could dispense with the
background page (instead using the solid rectangle fill routine to replace the
background after images move), freeing up another 76,800 pixels of off-screen
storage for images and buffers. You could even eliminate page-flipping
altogether if you needed to free up a great deal of display memory. For
example, with enough free display memory it is possible in mode X to create a
virtual bitmap three times larger than the screen, with the screen becoming a
scrolling window onto that larger bitmap. This technique has been used to good
effect in a number of games, although I don't know if any of those games use
mode X.
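
The arithmetic behind that layout can be checked with a few constants. This is only a sketch in C: the macro names are invented for illustration, and the ordering (two pages, then the background, then free space, with the pattern buffer at the very end) follows the description in the text.

```c
#include <assert.h>

#define PLANE_ADDRS       65536UL                /* addresses per plane (64K) */
#define PAGE_ADDRS        (320UL * 240UL / 4UL)  /* 19,200 addresses per page */

#define PAGE0_OFFSET      0UL
#define PAGE1_OFFSET      (PAGE0_OFFSET + PAGE_ADDRS)
#define BACKGROUND_OFFSET (PAGE1_OFFSET + PAGE_ADDRS)
#define FREE_OFFSET       (BACKGROUND_OFFSET + PAGE_ADDRS)

#define PATTERN_ADDRS     4UL
#define PATTERN_OFFSET    (PLANE_ADDRS - PATTERN_ADDRS)  /* 0xFFFC, as in Listing Two */
#define FREE_ADDRS        (PATTERN_OFFSET - FREE_OFFSET)

/* Number of pixels left over for icons, images, and temporary buffers. */
static unsigned long free_pixels(void)
{
    assert(FREE_OFFSET < PATTERN_OFFSET);
    return FREE_ADDRS * 4UL;
}
```

Dropping the background page in this scheme would simply move FREE_OFFSET down by PAGE_ADDRS, which is the extra 76,800 pixels mentioned above.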


Copying Pixel Blocks Within Display Memory


Another fine use for the latches is copying pixels from one place in display
memory to another. Whenever both the source and the destination share the same
nibble alignment (that is, their start addresses modulo four are the same), it
is not only possible but quite easy to use the latches to perform the copy
four pixels at a time. Listing Three (page 182) shows a routine that copies
via the latches. (When the source and destination do not share the same nibble
alignment, the latches cannot be used, because the source and destination
planes for any given pixel differ; in that case, you can set the Read Map
register to select a source plane and the Map Mask register to select the
corresponding destination plane, then copy all pixels in that plane; repeat
for all four planes.)
Listing Three has an important limitation: It does not guarantee proper
handling when the source and destination overlap, as in the case of a downward
scroll, for example. Listing Three performs top-to-bottom, left-to-right
copying. Downward scrolls require bottom-to-top copying; likewise, rightward
horizontal scrolls require right-to-left copying. As it happens, my intended
use for Listing Three is to copy images between off-screen memory and onscreen
memory, and to save areas under pop-up menus and the like, so I don't really
need overlap handling -- and I do really need to keep the size of this column
down. However, you will surely want to add overlap handling if you plan to
perform arbitrary scrolling and copying in display memory.
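
For readers who do need overlap handling, one possible approach is sketched below in C. It works on a plain byte array standing in for display memory addresses rather than on the VGA itself, and the function name and parameters are hypothetical; it is not a modification of Listing Three.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a width x height block of addresses within one buffer whose scan
   lines are `pitch` addresses apart. The copy direction is chosen so that
   an overlapping source is never overwritten before it has been read. */
static void copy_block_overlap_safe(unsigned char *mem, size_t pitch,
                                    size_t src, size_t dst,
                                    size_t width, size_t height)
{
    size_t row, col;
    assert(width <= pitch);
    if (dst <= src) {
        /* Destination starts earlier in memory: top-to-bottom,
           left-to-right, the order Listing Three already uses. */
        for (row = 0; row < height; row++)
            for (col = 0; col < width; col++)
                mem[dst + row * pitch + col] = mem[src + row * pitch + col];
    } else {
        /* Destination starts later in memory (downward or rightward
           scroll): bottom-to-top, right-to-left. */
        for (row = height; row-- > 0; )
            for (col = width; col-- > 0; )
                mem[dst + row * pitch + col] = mem[src + row * pitch + col];
    }
}
```

Comparing the start offsets is the same test memmove-style routines use; an assembly version would presumably set the direction flag and start from the far corner for the backward case.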
Now that we have a fast way to copy images around in display memory, we can
draw icons and other images between two and four times faster than in mode
13h, depending on the speed of the VGA's display memory. (In case you're
worried about the nibble-alignment limitation on fast copies, don't be; I'll
address that fully next time, but the secret is to store all four possible
rotations in off-screen memory, then select the correct one for each copy.)
However, before our fast display memory-to-display memory copy routine can do
us any good, we must have a way to get pixel patterns from system memory into
display memory, so that they can then be copied with the fast copy routine.


Copying to Display Memory


The final piece of the puzzle is the system memory-to-display memory copy
routine shown in Listing Four (page 182). This routine
assumes that pixels are stored in system memory in exactly the order in which
they will ultimately appear on the screen; that is, in the same linear order
that mode 13h uses. It would be more efficient to store all the pixels for one
plane first, then all the pixels for the next plane, and so on for all four
planes, because many OUTs could be avoided, but that would make images rather
hard to create. And, while it is true that the speed of drawing images is, in
general, often a critical performance factor, the speed of copying images from
system memory to display memory is not particularly critical in mode X.
Important images can be stored in off-screen memory and copied to the screen
via the latches much faster than even the speediest system memory-to-display
memory copy routine could manage.
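
To make the planar alternative concrete, here is a rough C sketch of the reordering it would require. It assumes the image width is a multiple of four and that the image will be drawn starting at a multiple-of-four column; the function name is illustrative and Listing Four does not use this format.

```c
#include <assert.h>
#include <stddef.h>

/* Reorder a linear (mode 13h-style) bitmap into four consecutive plane
   blocks: all plane 0 pixels, then all plane 1 pixels, and so on.
   width must be a multiple of 4; dst must hold width * height bytes. */
static void linear_to_planar(const unsigned char *src, unsigned char *dst,
                             size_t width, size_t height)
{
    size_t plane, x, y, out = 0;
    assert(width % 4 == 0);
    for (plane = 0; plane < 4; plane++)
        for (y = 0; y < height; y++)
            for (x = plane; x < width; x += 4)
                dst[out++] = src[y * width + x];
}
```

With images stored this way, a copy routine would need only one Map Mask OUT per plane instead of one per pixel, at the cost of the awkward image format the text describes.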
I'm not going to present a routine to perform mode X copies from display
memory to system memory, but such a routine would be a straightforward inverse
of Listing Four.


Coming Up: Our Hero Risks Life, Limb, and Word Count in a Thrilling Conclusion


Next month, I'll take all the mode X tools we've developed, together with one
more tool -- masked image copying -- and the remaining unexplored feature of
mode X, page flipping, and build an animation application. I hope that when
I'm done, you'll agree with me that mode X is the way to animate on the PC. I
also hope that I can fit everything into one column; there are always so many
interesting things to say that I have trouble keeping the size of these
columns down, and mode X animation covers even more fertile ground than usual.

But, hey -- you've already heard about my programming demons; I'll spare you
the writing demons. Besides, as I'm fond of saying, end users care about
results, not how you produced them. For my writing, you folks are the end
users -- and notice how remarkably little you care about how this magazine gets
written and produced. You care that it shows up in your mailbox every month,
and you care about the contents, but you sure don't care about how it got
there. When you're a creator, the process matters. When you're a buyer,
results are everything. All important. Sine qua non. The whole enchilada.
If you catch my drift.


Late Flash!


The Mode X mode set code in my July '91 column (Listing One, page 154) has a
small -- but critical -- bug. On line 46, the value loaded into AL should be
0E3h, not 0E7h. Without this correction, the screen will roll on
fixed-frequency (IBM 851X-style) monitors.
_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash




[LISTING ONE]

/* Program to demonstrate mode X (320x240, 256 colors) patterned
 rectangle fills by filling the screen with adjacent 80x60
 rectangles in a variety of patterns. Tested with Borland C++
 2.0 in C compilation mode and the small model */
#include <conio.h>
#include <dos.h>

void Set320x240Mode(void);
void FillPatternX(int, int, int, int, unsigned int, char*);

/* 16 4x4 patterns */
static char Patt0[]={10,0,10,0,0,10,0,10,10,0,10,0,0,10,0,10};
static char Patt1[]={9,0,0,0,0,9,0,0,0,0,9,0,0,0,0,9};
static char Patt2[]={5,0,0,0,0,0,5,0,5,0,0,0,0,0,5,0};
static char Patt3[]={14,0,0,14,0,14,14,0,0,14,14,0,14,0,0,14};
static char Patt4[]={15,15,15,1,15,15,1,1,15,1,1,1,1,1,1,1};
static char Patt5[]={12,12,12,12,6,6,6,12,6,6,6,12,6,6,6,12};
static char Patt6[]={80,80,80,80,80,80,80,80,80,80,80,80,80,80,80,15};
static char Patt7[]={78,78,78,78,80,80,80,80,82,82,82,82,84,84,84,84};
static char Patt8[]={78,80,82,84,80,82,84,78,82,84,78,80,84,78,80,82};
static char Patt9[]={78,80,82,84,78,80,82,84,78,80,82,84,78,80,82,84};
static char Patt10[]={0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
static char Patt11[]={0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3};
static char Patt12[]={14,14,9,9,14,9,9,14,9,9,14,14,9,14,14,9};
static char Patt13[]={15,8,8,8,15,15,15,8,15,15,15,8,15,8,8,8};
static char Patt14[]={3,3,3,3,3,7,7,3,3,7,7,3,3,3,3,3};
static char Patt15[]={0,0,0,0,0,64,0,0,0,0,0,0,0,0,0,89};
/* Table of pointers to the 16 4x4 patterns with which to draw */
static char* PattTable[] = {Patt0,Patt1,Patt2,Patt3,Patt4,Patt5,Patt6,
 Patt7,Patt8,Patt9,Patt10,Patt11,Patt12,Patt13,Patt14,Patt15};
void main() {
 int i,j;
 union REGS regset;

 Set320x240Mode();
 for (j = 0; j < 4; j++) {
 for (i = 0; i < 4; i++) {
 FillPatternX(i*80,j*60,i*80+80,j*60+60,0,PattTable[j*4+i]);
 }
 }
 getch();
 regset.x.ax = 0x0003; /* switch back to text mode and done */
 int86(0x10, &regset, &regset);

}







[LISTING TWO]

; Mode X (320x240, 256 colors) rectangle 4x4 pattern fill routine.
; Upper left corner of pattern is always aligned to a multiple-of-4
; row and column. Works on all VGAs. Uses approach of copying the
; pattern to off-screen display memory, then loading the latches with
; the pattern for each scan line and filling each scan line four
; pixels at a time. Fills up to but not including the column at EndX
; and the row at EndY. No clipping is performed. All ASM code tested
; with TASM 2. C near-callable as:
; void FillPatternX(int StartX, int StartY, int EndX, int EndY,
; unsigned int PageBase, char* Pattern);

SC_INDEX equ 03c4h ;Sequence Controller Index register port
MAP_MASK equ 02h ;index in SC of Map Mask register
GC_INDEX equ 03ceh ;Graphics Controller Index register port
BIT_MASK equ 08h ;index in GC of Bit Mask register
PATTERN_BUFFER equ 0fffch ;offset in screen memory of the buffer used
 ; to store each pattern during drawing
SCREEN_SEG equ 0a000h ;segment of display memory in mode X
SCREEN_WIDTH equ 80 ;width of screen in addresses from one scan
 ; line to the next
parms struc
 dw 2 dup (?) ;pushed BP and return address
StartX dw ? ;X coordinate of upper left corner of rect
StartY dw ? ;Y coordinate of upper left corner of rect
EndX dw ? ;X coordinate of lower right corner of rect
 ; (the column at EndX is not filled)
EndY dw ? ;Y coordinate of lower right corner of rect
 ; (the row at EndY is not filled)
PageBase dw ? ;base offset in display memory of page in
 ; which to fill rectangle
Pattern dw ? ;4x4 pattern with which to fill rectangle
parms ends

NextScanOffset equ -2 ;local storage for distance from end of one
 ; scan line to start of next
RectAddrWidth equ -4 ;local storage for address width of rectangle
Height equ -6 ;local storage for height of rectangle
STACK_FRAME_SIZE equ 6

 .model small
 .data
; Plane masks for clipping left and right edges of rectangle.
LeftClipPlaneMask db 00fh,00eh,00ch,008h
RightClipPlaneMask db 00fh,001h,003h,007h
 .code
 public _FillPatternX
_FillPatternX proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame

 sub sp,STACK_FRAME_SIZE ;allocate space for local vars
 push si ;preserve caller's register variables
 push di

 cld
 mov ax,SCREEN_SEG ;point ES to display memory
 mov es,ax
 ;copy pattern to display memory buffer
 mov si,[bp+Pattern] ;point to pattern to fill with
 mov di,PATTERN_BUFFER ;point ES:DI to pattern buffer
 mov dx,SC_INDEX ;point Sequence Controller Index to
 mov al,MAP_MASK ; Map Mask
 out dx,al
 inc dx ;point to SC Data register
 mov cx,4 ;4 pixel quadruplets in pattern
DownloadPatternLoop:
 mov al,1 ;
 out dx,al ;select plane 0 for writes
 movsb ;copy over next plane 0 pattern pixel
 dec di ;stay at same address for next plane
 mov al,2 ;
 out dx,al ;select plane 1 for writes
 movsb ;copy over next plane 1 pattern pixel
 dec di ;stay at same address for next plane
 mov al,4 ;
 out dx,al ;select plane 2 for writes
 movsb ;copy over next plane 2 pattern pixel
 dec di ;stay at same address for next plane
 mov al,8 ;
 out dx,al ;select plane 3 for writes
 movsb ;copy over next plane 3 pattern pixel
 ; and advance address
 loop DownloadPatternLoop

 mov dx,GC_INDEX ;set the bit mask to select all bits
 mov ax,00000h+BIT_MASK ; from the latches and none from
 out dx,ax ; the CPU, so that we can write the
 ; latch contents directly to memory
 mov ax,[bp+StartY] ;top rectangle scan line
 mov si,ax
 and si,011b ;top rect scan line modulo 4
 add si,PATTERN_BUFFER ;point to pattern scan line that
 ; maps to top line of rect to draw
 mov dx,SCREEN_WIDTH
 mul dx ;offset in page of top rectangle scan line
 mov di,[bp+StartX]
 mov bx,di
 shr di,1 ;X/4 = offset of first rectangle pixel in scan
 shr di,1 ; line
 add di,ax ;offset of first rectangle pixel in page
 add di,[bp+PageBase] ;offset of first rectangle pixel in
 ; display memory
 and bx,0003h ;look up left edge plane mask
 mov ah,LeftClipPlaneMask[bx] ; to clip
 mov bx,[bp+EndX]
 and bx,0003h ;look up right edge plane
 mov al,RightClipPlaneMask[bx] ; mask to clip
 mov bx,ax ;put the masks in BX


 mov cx,[bp+EndX] ;calculate # of addresses across rect
 mov ax,[bp+StartX]
 cmp cx,ax
 jle FillDone ;skip if 0 or negative width
 dec cx
 and ax,not 011b
 sub cx,ax
 shr cx,1
 shr cx,1 ;# of addresses across rectangle to fill - 1
 jnz MasksSet ;there's more than one address to draw
 and bh,bl ;there's only one address, so combine the left
 ; and right edge clip masks
MasksSet:
 mov ax,[bp+EndY]
 sub ax,[bp+StartY] ;AX = height of rectangle
 jle FillDone ;skip if 0 or negative height
 mov [bp+Height],ax
 mov ax,SCREEN_WIDTH
 sub ax,cx ;distance from end of one scan line to start
 dec ax ; of next
 mov [bp+NextScanOffset],ax
 mov [bp+RectAddrWidth],cx ;remember width in addresses - 1
 mov dx,SC_INDEX+1 ;point to Sequence Controller Data reg
 ; (SC Index still points to Map Mask)
FillRowsLoop:
 mov cx,[bp+RectAddrWidth] ;width across - 1
 mov al,es:[si] ;read display memory to latch this scan
 ; line's pattern
 inc si ;point to the next pattern scan line, wrapping
 jnz short NoWrap ; back to the start of the pattern if
 sub si,4 ; we've run off the end
NoWrap:
 mov al,bh ;put left-edge clip mask in AL
 out dx,al ;set the left-edge plane (clip) mask
 stosb ;draw the left edge (pixels come from latches;
 ; value written by CPU doesn't matter)
 dec cx ;count off left edge address
 js FillLoopBottom ;that's the only address
 jz DoRightEdge ;there are only two addresses
 mov al,00fh ;middle addresses are drawn 4 pixels at a pop
 out dx,al ;set the middle pixel mask to no clip
 rep stosb ;draw the middle addresses four pixels apiece
 ; (from latches; value written doesn't matter)
DoRightEdge:
 mov al,bl ;put right-edge clip mask in AL
 out dx,al ;set the right-edge plane (clip) mask
 stosb ;draw the right edge (from latches; value
 ; written doesn't matter)
FillLoopBottom:
 add di,[bp+NextScanOffset] ;point to the start of the next scan
 ; line of the rectangle
 dec word ptr [bp+Height] ;count down scan lines
 jnz FillRowsLoop
FillDone:
 mov dx,GC_INDEX+1 ;restore the bit mask to its default,
 mov al,0ffh ; which selects all bits from the CPU
 out dx,al ; and none from the latches (the GC
 ; Index still points to Bit Mask)
 pop di ;restore caller's register variables

 pop si
 mov sp,bp ;discard storage for local variables
 pop bp ;restore caller's stack frame
 ret
_FillPatternX endp
 end






[LISTING THREE]

; Mode X (320x240, 256 colors) display memory to display memory copy
; routine. Left edge of source rectangle modulo 4 must equal left edge
; of destination rectangle modulo 4. Works on all VGAs. Uses approach
; of reading 4 pixels at a time from the source into the latches, then
; writing the latches to the destination. Copies up to but not
; including the column at SourceEndX and the row at SourceEndY. No
; clipping is performed. Results are not guaranteed if the source and
; destination overlap. C near-callable as:
; void CopyScreenToScreenX(int SourceStartX, int SourceStartY,
; int SourceEndX, int SourceEndY, int DestStartX,
; int DestStartY, unsigned int SourcePageBase,
; unsigned int DestPageBase, int SourceBitmapWidth,
; int DestBitmapWidth);

SC_INDEX equ 03c4h ;Sequence Controller Index register port
MAP_MASK equ 02h ;index in SC of Map Mask register
GC_INDEX equ 03ceh ;Graphics Controller Index register port
BIT_MASK equ 08h ;index in GC of Bit Mask register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X

parms struc
 dw 2 dup (?) ;pushed BP and return address
SourceStartX dw ? ;X coordinate of upper left corner of source
SourceStartY dw ? ;Y coordinate of upper left corner of source
SourceEndX dw ? ;X coordinate of lower right corner of source
 ; (the column at SourceEndX is not copied)
SourceEndY dw ? ;Y coordinate of lower right corner of source
 ; (the row at SourceEndY is not copied)
DestStartX dw ? ;X coordinate of upper left corner of dest
DestStartY dw ? ;Y coordinate of upper left corner of dest
SourcePageBase dw ? ;base offset in display memory of page in
 ; which source resides
DestPageBase dw ? ;base offset in display memory of page in
 ; which dest resides
SourceBitmapWidth dw ? ;# of pixels across source bitmap
 ; (must be a multiple of 4)
DestBitmapWidth dw ? ;# of pixels across dest bitmap
 ; (must be a multiple of 4)
parms ends

SourceNextScanOffset equ -2 ;local storage for distance from end of
 ; one source scan line to start of next
DestNextScanOffset equ -4 ;local storage for distance from end of
 ; one dest scan line to start of next
RectAddrWidth equ -6 ;local storage for address width of rectangle

Height equ -8 ;local storage for height of rectangle
STACK_FRAME_SIZE equ 8

 .model small
 .data
; Plane masks for clipping left and right edges of rectangle.
LeftClipPlaneMask db 00fh,00eh,00ch,008h
RightClipPlaneMask db 00fh,001h,003h,007h
 .code
 public _CopyScreenToScreenX
_CopyScreenToScreenX proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame
 sub sp,STACK_FRAME_SIZE ;allocate space for local vars
 push si ;preserve caller's register variables
 push di
 push ds

 cld
 mov dx,GC_INDEX ;set the bit mask to select all bits
 mov ax,00000h+BIT_MASK ; from the latches and none from
 out dx,ax ; the CPU, so that we can write the
 ; latch contents directly to memory
 mov ax,SCREEN_SEG ;point ES to display memory
 mov es,ax
 mov ax,[bp+DestBitmapWidth]
 shr ax,1 ;convert to width in addresses
 shr ax,1
 mul [bp+DestStartY] ;top dest rect scan line
 mov di,[bp+DestStartX]
 shr di,1 ;X/4 = offset of first dest rect pixel in
 shr di,1 ; scan line
 add di,ax ;offset of first dest rect pixel in page
 add di,[bp+DestPageBase] ;offset of first dest rect pixel
 ; in display memory
 mov ax,[bp+SourceBitmapWidth]
 shr ax,1 ;convert to width in addresses
 shr ax,1
 mul [bp+SourceStartY] ;top source rect scan line
 mov si,[bp+SourceStartX]
 mov bx,si
 shr si,1 ;X/4 = offset of first source rect pixel in
 shr si,1 ; scan line
 add si,ax ;offset of first source rect pixel in page
 add si,[bp+SourcePageBase] ;offset of first source rect
 ; pixel in display memory
 and bx,0003h ;look up left edge plane mask
 mov ah,LeftClipPlaneMask[bx] ; to clip
 mov bx,[bp+SourceEndX]
 and bx,0003h ;look up right edge plane
 mov al,RightClipPlaneMask[bx] ; mask to clip
 mov bx,ax ;put the masks in BX

 mov cx,[bp+SourceEndX] ;calculate # of addresses across
 mov ax,[bp+SourceStartX] ; rect
 cmp cx,ax
 jle CopyDone ;skip if 0 or negative width
 dec cx
 and ax,not 011b

 sub cx,ax
 shr cx,1
 shr cx,1 ;# of addresses across rectangle to copy - 1
 jnz MasksSet ;there's more than one address to draw
 and bh,bl ;there's only one address, so combine the left
 ; and right edge clip masks
MasksSet:
 mov ax,[bp+SourceEndY]
 sub ax,[bp+SourceStartY] ;AX = height of rectangle
 jle CopyDone ;skip if 0 or negative height
 mov [bp+Height],ax
 mov ax,[bp+DestBitmapWidth]
 shr ax,1 ;convert to width in addresses
 shr ax,1
 sub ax,cx ;distance from end of one dest scan line to
 dec ax ; start of next
 mov [bp+DestNextScanOffset],ax
 mov ax,[bp+SourceBitmapWidth]
 shr ax,1 ;convert to width in addresses
 shr ax,1
 sub ax,cx ;distance from end of one source scan line to
 dec ax ; start of next
 mov [bp+SourceNextScanOffset],ax
 mov [bp+RectAddrWidth],cx ;remember width in addresses - 1
 mov dx,SC_INDEX+1 ;point to Sequence Controller Data reg
 ; (SC Index still points to Map Mask)
 mov ax,es ;DS=ES=screen segment for MOVS
 mov ds,ax
CopyRowsLoop:
 mov cx,[bp+RectAddrWidth] ;width across - 1
 mov al,bh ;put left-edge clip mask in AL
 out dx,al ;set the left-edge plane (clip) mask
 movsb ;copy the left edge (pixels go through
 ; latches)
 dec cx ;count off left edge address
 js CopyLoopBottom ;that's the only address
 jz DoRightEdge ;there are only two addresses
 mov al,00fh ;middle addresses are drawn 4 pixels at a pop
 out dx,al ;set the middle pixel mask to no clip
 rep movsb ;draw the middle addresses four pixels apiece
 ; (pixels copied through latches)
DoRightEdge:
 mov al,bl ;put right-edge clip mask in AL
 out dx,al ;set the right-edge plane (clip) mask
 movsb ;draw the right edge (pixels copied through
 ; latches)
CopyLoopBottom:
 add si,[bp+SourceNextScanOffset] ;point to the start of
 add di,[bp+DestNextScanOffset] ; next source & dest lines
 dec word ptr [bp+Height] ;count down scan lines
 jnz CopyRowsLoop
CopyDone:
 mov dx,GC_INDEX+1 ;restore the bit mask to its default,
 mov al,0ffh ; which selects all bits from the CPU
 out dx,al ; and none from the latches (the GC
 ; Index still points to Bit Mask)
 pop ds
 pop di ;restore caller's register variables
 pop si

 mov sp,bp ;discard storage for local variables
 pop bp ;restore caller's stack frame
 ret
_CopyScreenToScreenX endp
 end






[LISTING FOUR]

; Mode X (320x240, 256 colors) system memory to display memory copy
; routine. Uses approach of changing the plane for each pixel copied;
; this is slower than copying all pixels in one plane, then all pixels
; in the next plane, and so on, but it is simpler; besides, images for
; which performance is critical should be stored in off-screen memory
; and copied to the screen via the latches. Copies up to but not
; including the column at SourceEndX and the row at SourceEndY. No
; clipping is performed. C near-callable as:
; void CopySystemToScreenX(int SourceStartX, int SourceStartY,
; int SourceEndX, int SourceEndY, int DestStartX,
; int DestStartY, char* SourcePtr, unsigned int DestPageBase,
; int SourceBitmapWidth, int DestBitmapWidth);

SC_INDEX equ 03c4h ;Sequence Controller Index register port
MAP_MASK equ 02h ;index in SC of Map Mask register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X

parms struc
 dw 2 dup (?) ;pushed BP and return address
SourceStartX dw ? ;X coordinate of upper left corner of source
SourceStartY dw ? ;Y coordinate of upper left corner of source
SourceEndX dw ? ;X coordinate of lower right corner of source
 ; (the column at SourceEndX is not copied)
SourceEndY dw ? ;Y coordinate of lower right corner of source
 ; (the row at SourceEndY is not copied)
DestStartX dw ? ;X coordinate of upper left corner of dest
DestStartY dw ? ;Y coordinate of upper left corner of dest
SourcePtr dw ? ;pointer in DS to start of bitmap in which
 ; source resides
DestPageBase dw ? ;base offset in display memory of page in
 ; which dest resides
SourceBitmapWidth dw ? ;# of pixels across source bitmap
DestBitmapWidth dw ? ;# of pixels across dest bitmap
 ; (must be a multiple of 4)
parms ends

RectWidth equ -2 ;local storage for width of rectangle
LeftMask equ -4 ;local storage for left rect edge plane mask
STACK_FRAME_SIZE equ 4

 .model small
 .code
 public _CopySystemToScreenX
_CopySystemToScreenX proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame

 sub sp,STACK_FRAME_SIZE ;allocate space for local vars
 push si ;preserve caller's register variables
 push di

 cld
 mov ax,SCREEN_SEG ;point ES to display memory
 mov es,ax
 mov ax,[bp+SourceBitmapWidth]
 mul [bp+SourceStartY] ;top source rect scan line
 add ax,[bp+SourceStartX]
 add ax,[bp+SourcePtr] ;offset of first source rect pixel
 mov si,ax ; in DS

 mov ax,[bp+DestBitmapWidth]
 shr ax,1 ;convert to width in addresses
 shr ax,1
 mov [bp+DestBitmapWidth],ax ;remember address width
 mul [bp+DestStartY] ;top dest rect scan line
 mov di,[bp+DestStartX]
 mov cx,di
 shr di,1 ;X/4 = offset of first dest rect pixel in
 shr di,1 ; scan line
 add di,ax ;offset of first dest rect pixel in page
 add di,[bp+DestPageBase] ;offset of first dest rect pixel
 ; in display memory
 and cl,011b ;CL = first dest pixel's plane
 mov al,11h ;upper nibble comes into play when plane wraps
 ; from 3 back to 0
 shl al,cl ;set the bit for the first dest pixel's plane
 mov [bp+LeftMask],al ; in each nibble to 1

 mov cx,[bp+SourceEndX] ;calculate # of pixels across
 sub cx,[bp+SourceStartX] ; rect
 jle CopyDone ;skip if 0 or negative width
 mov [bp+RectWidth],cx
 mov bx,[bp+SourceEndY]
 sub bx,[bp+SourceStartY] ;BX = height of rectangle
 jle CopyDone ;skip if 0 or negative height
 mov dx,SC_INDEX ;point to SC Index register
 mov al,MAP_MASK
 out dx,al ;point SC Index reg to the Map Mask
 inc dx ;point DX to SC Data reg
CopyRowsLoop:
 mov ax,[bp+LeftMask]
 mov cx,[bp+RectWidth]
 push si ;remember the start offset in the source
 push di ;remember the start offset in the dest
CopyScanLineLoop:
 out dx,al ;set the plane for this pixel
 movsb ;copy the pixel to the screen
 rol al,1 ;set mask for next pixel's plane
 cmc ;advance destination address only when
 sbb di,0 ; wrapping from plane 3 to plane 0
 ; (else undo INC DI done by MOVSB)
 loop CopyScanLineLoop
 pop di ;retrieve the dest start offset
 add di,[bp+DestBitmapWidth] ;point to the start of the
 ; next scan line of the dest
 pop si ;retrieve the source start offset

 add si,[bp+SourceBitmapWidth] ;point to the start of the
 ; next scan line of the source
 dec bx ;count down scan lines
 jnz CopyRowsLoop
CopyDone:
 pop di ;restore caller's register variables
 pop si
 mov sp,bp ;discard storage for local variables
 pop bp ;restore caller's stack frame
 ret
_CopySystemToScreenX endp
 end






August, 1991
PROGRAMMER'S BOOKSHELF


Computer Graphics: Principles and Practice




Ray Duncan


The subspeciality of computer graphics programming is a craft, an art, a
science, and (for some) a bit of a religious avocation as well. It's hard to
think of any other niche in our industry that can appeal to practitioners with
such a wide range of backgrounds, levels of training, or native talent.
Computer scientists and mathematicians develop ever-bigger-and-better
algorithms for transformations, ray-tracing, and solid-modeling; the Silicon
Valley engineers design ever-faster graphics coprocessors; the Far-Eastern
manufacturers crank out ever-cheaper color CRTs with ever-higher resolutions;
the systems software people develop graphical user interfaces to soak up the
pixels and CPU cycles as fast as they appear; and self-taught teenagers can
write video games that make some of them wealthy before they are old enough to
drive.
When I first became involved with personal computers, the computer graphics
market was defined by incredibly expensive workstations on the high end and
the Apple II on the low end. Most of the readily available graphics terminals
were vector-based, because serial lines simply didn't have the bandwidth to
support raster graphics, and the generation of a few minutes of
high-resolution raster animation required days of mainframe computer time.
There was little in the way of hardware between these extremes that the
average programmer could play with, and the trade press reflected this
situation. Most books on computer graphics that I could lay my hands on were
stiflingly formal, academic texts with a heavy emphasis on mathematical proofs
and floating point, and a surprising paucity of quality illustrations.


Computer Graphics: Principles and Practice, Second Edition


James D. Foley, Andries van Dam, Steven K. Feiner, and John F. Hughes
Reading, Mass.: Addison-Wesley, 1990, 1174 pp., $64.50
ISBN 0-201-12110-7
Both the release of the original IBM PC in 1981 and the appearance of Foley
and van Dam's Fundamentals of Interactive Computer Graphics in 1982
foreshadowed the enormous changes that were about to sweep the field of
computing. The PC was the first representative of a standardized hardware
platform that would put 80386-based, 640 x 480, 16-color graphics capabilities
(or better) and sophisticated tools within the financial reach of nearly every
programmer by the end of the decade. Similarly, Foley and van Dam's book made
graphics software technology accessible to the programming masses. The book
was of manageable length and highly readable, yet it covered an enormous range
of graphical issues -- from plotting bar graphs to three-dimensional
transformations and chromatic color models -- and discussed them in sufficient
detail for the needs of all but the most advanced practitioners. My own copy,
which is always within easy reach of the keyboard, has become tattered and
stained from use.
But although the first edition of Foley and van Dam achieved "classic" status
almost immediately, the passing years and the inexorable march of technology
have not treated the book kindly. The amount of space it devoted to raster
graphics seems hopelessly inadequate now. (The description of an important
raster flood-fill algorithm, for example, is limited to a single diagram and
caption.) The book's treatment of menus and user interactions has long since
been outstripped by evolution in modern graphical user interfaces such as the
Macintosh's System 7, Unix's Motif, and DOS's Windows. And its selection of
color plates, many of which were based on Atari game computers or Evans and
Sutherland flight simulators, appears severely dated in our era of inexpensive
RISC workstations, photorealistic rendering, and shrink-wrapped mass-market
software like Autodesk Animator.
Fortunately, 1990 brought us a new edition of Foley and van Dam, which seems
likely to be even more important than the original volume. The book now has
four authors instead of two (Feiner and Hughes joined the team), a different
title (Computer Graphics: Principles and Practice), and a massive amount of
new material -- the number of pages has almost doubled. The printed material
has been fortified with an impressive number of dazzling full-color
illustrations -- several times as many as in the first edition -- that reflect
the very latest presentation technology. The subject matter has been
reoriented toward raster graphics and gives equal time to both integer and
floating-point graphics toolboxes. The algorithms have been recast in a
Pascal-like structured pseudocode. In short, the book has been revamped so
drastically that the most puzzling aspect is why Addison-Wesley chose to call
it a 2nd Edition.
In its new incarnation, Computer Graphics: Principles and Practice is divided
into five main sections or groups of chapters. The first section includes some
historical perspective, basic graphics hardware concepts, a simple integer
graphics package that resembles QuickDraw, 2-D and 3-D transformations, and a
3-D floating-point graphics package that supports hierarchies of graphic
objects. The second section is devoted to user interfaces and is completely up
to date with authoritative discussions of Open Look, Motif, the Macintosh,
NextStep, the latest arcade games, and even the cutting-edge experiments in
virtual reality. The third group embraces the topics of curves, surfaces,
modeling, and color systems. The fourth set of chapters talks about image
synthesis and surveys a gamut of complex issues, ranging from ray tracing to
the projection of textures and reflections onto contoured surfaces. The last
few chapters are concerned with page description languages, animation, and
finally with state-of-the-art products and research -- the most sophisticated
graphics hardware and software that man has yet devised and/or that money can
buy.
I should warn you that Foley and van Dam's new opus has a high intimidation
factor compared to its predecessor. Whereas the first edition was just about
the right length and had the right tone to be browsed from beginning to end,
the second edition is encyclopedic in its appearance and considerably more
demanding in its nature. Although the writing is excellent, and the
explanations are admirably lucid, you would hardly be any more inclined to
read this book straight through than you would be tempted to make a project of
devouring Knuth's three-volume Art of Computer Programming during your summer
vacation. The depth and breadth are simply too great, even if you've got the
mathematical background to handle the most advanced material (as it happens, I
don't). Nevertheless, this is one of the finest reference works on the
bookstore shelves today, and sooner or later you're going to need one of the
pearls this book has to offer. Buy it now to read the delightful chapters on
design and implementation of user interfaces, and keep it nearby for those
graphical emergencies of the future.




































August, 1991
OF INTEREST





Visual Basic, a graphical application development system for Microsoft
Windows, is now available from Microsoft. The new programming system provides
visual user-interface design capabilities (for creating command buttons, text
fields, list boxes, pictures, drop-down menus, and file system controls) with
general-purpose programming tools, allowing you to create compiled Windows
.exe files that can be freely distributed without runtime fees or royalties.
Visual Basic can be used to develop any Windows-based application and is also
useful for integrating multiple Windows-based applications and for automating
software testing through Dynamic Data Exchange (DDE).
Based on Microsoft QuickBasic, Visual Basic has been modified for the
graphical environment and the event-driven programming language. It uses a
threaded p-code incremental compiler and source-level debugging tools,
including an interactive immediate window, in an integrated system.
In addition to support for DLLs and the DDE, the control set itself can be
extended by developers using C and the Windows SDK and the Visual Basic
Control Development Kit (available separately) to provide the ability to
integrate new user-interface components into the graphical design and code
development environment.
From what we've seen and used, Visual Basic is a powerful, but relatively
easy-to-grasp system for developing Windows apps. Microsoft's most difficult
sales pitch might be convincing novice Windows programmers that Visual Basic
is an easy entry into Windows development, while convincing experienced
Windows developers that it is powerful enough to suit their needs. Visual
Basic fills both bills. It's a milestone development environment that all
Windows programmers should examine.
Runs in Windows 3.0's standard or enhanced modes; requires an 80286 processor
or higher; hard disk; mouse; CGA, EGA, VGA, 8514, Hercules or compatible
display; MS-DOS 3.1 or later; and one or more megabytes of memory. Online
help, an icon library, and an icon editor are included. The price is $199.
Reader service no. 20.
Microsoft Corporation One Microsoft Way Redmond, WA 98052-6399 206-882-8080
The Network C Library for NetWare is new from Automation Software Consultants.
The library includes 300 functions to access NetWare system services and
statistics such as file and directory management, locking and synchronization,
bindery management, accounting, messaging, printing, connection and
workstation services, and queue management and transaction tracking. Also
featured are diagnostic and performance statistical reporting for the file
server and individual workstations. Network C Library's documentation gives
you an overview of each set of services with implementation suggestions and a
description of each function. There are 100 working sample programs that
provide source code for most of the NetWare command line utilities and for
reports generated from the "fconsole" and "pconsole" programs. Other utilities
are a sample client/server application and bindery maintenance utilities.
Supports the Microsoft C and Turbo C compilers and costs $225, $450 with
source code. Reader service no. 21.
Automation Software Consultants Inc. 124 Venice Ave. Cincinnati, OH 45140
513-677-0842
Ready Systems has released VRTXvelocity for DOS, an integrated, PC-hosted
cross-development and runtime environment that allows embedded systems
designers to develop applications for Motorola 680x0 microprocessors. The
package includes a complete set of Microtec 4.1E ANSI C tools including an
ANSI cross-compiler, C cross-reference utilities, and Motorola compatible
assemblers, linkers, and librarians.
Cross-development tools are included to allow DOS host computers to develop
VRTX32 applications which are downloaded to the target 680x0 processor. The
RTsource debugger allows you to remotely debug C and assembly code running on
the 68K target processor. Communication between the host and target systems is
established through a serial link.
VRTXvelocity includes RTscope, which resides on the target and serves as both
a board-level monitor and a system-level debugger. It handles all the
operations specific to the processor board, such as memory and register
functions, instruction breakpoints, and communications with the host. It also
allows you to examine VRTX32 system objects, such as tasks, queues,
semaphores, and event flags and issue VRTX32 system calls.
The VRTX Environment System also allows you to build BSPs for custom CPU
boards, in addition to supporting CPU and controller boards manufactured by
Force, Radstone, Motorola (133, 133a, and 147), Heurikon, Pep, and Tadpole.
VRTXvelocity costs $12,500. Reader service no. 30.
Ready Systems 470 Potrero Ave. Sunnyvale, CA 94086 408-736-2600
C// for Turbo C, a C extension program, has been released by Subtlesoft. The
program is comprised of hundreds of runtime-created, separable processes
which, through dynamic priorities and scheduling, are able to share common
resources, self-parallel functions, queues, lists, events, and timeouts. In
addition to the C communications and classical synchronization mechanisms, C//
makes available a new class of semiautomatic variables, runtime control
variables, double access to process arguments, stack monitoring, private
stacks, and offsets. External events are handled through the C// driver,
user-programmed ISRs, and urgent process executions.
C// costs $333; demo kits sell for $33. Reader service no. 22.
Subtlesoft International 4344 Bristol Street Pittsburgh, PA 15207 412-521-1158
W5086 is the new single-chip, user-interface controller from Weitek. Two key
functions of Microsoft Windows' Graphics Device Interface (GDI) are
incorporated in hardware, thus improving performance of the graphical
environment and its applications. These functions are the Bit Block Transfer
(BitBLT), which copies a bitmapped image of a rectangular array of bits that
correspond to the pixels of a graphic image from a source device to a
destination device; and the LineDrawing function, which draws horizontal and
vertical lines on screen one at a time. With the W5086, the BitBLT is 26 times
better than that performed by the system CPU, while the LineDrawing function
is five times better.
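For readers unfamiliar with the operation, a bit block transfer can be sketched in a few lines. The Python below is an illustration only and says nothing about how the W5086 or Windows implements it: the function name bitblt, its parameter order, and the list-of-rows bitmaps are assumptions made for this example, and the sketch performs a plain copy with no clipping and no raster operations.

```python
def bitblt(src, sx, sy, dst, dx, dy, width, height):
    """Copy a width x height rectangle of pixels from src to dst.

    src and dst are bitmaps stored as lists of rows (hypothetical layout).
    (sx, sy) is the top-left corner in the source, (dx, dy) in the destination.
    The caller must keep both rectangles inside their bitmaps.
    """
    for row in range(height):
        # Slice assignment copies one scan line of the rectangle at a time.
        dst[dy + row][dx:dx + width] = src[sy + row][sx:sx + width]


# Example: copy a 2 x 2 block out of a 4 x 3 source into a 5 x 4 destination.
src = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
dst = [[0] * 5 for _ in range(4)]
bitblt(src, 1, 0, dst, 2, 1, 2, 2)
```

A software loop like this touches every pixel through the CPU, which is the work a hardware BitBLT engine is meant to take over.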
Also compatible with IBM VGA, the W5086 is suitable for 16- or 32-bit systems and
offers up to 2048 x 1024 resolution in monochrome; 1024 x 768 with 16 colors;
and 800 x 600 and 640 x 480 with 256 colors. All high resolution modes support
non-interlaced or interlaced monitors.
The W5086 costs $30 for 1000 units at 70 MHz in a 100-pin quad-flat-package.
The W5186 for 80 MHz in 144- and 160-pin quad-flat-packages will be released
shortly. Reader service no. 29.
Weitek Corporation 1060 E. Arques Sunnyvale, CA 94086 408-738-8400
Screen Manager Professional (SMP) is the new interface design toolbox for C
programmers from Magee Enterprises. Using SMP, you can develop windowing,
menuing, context sensitive help functions, data entry and keyboard and mouse
support. SMP's unique features are: an unlimited number of windows,
event-driven mouse support, context sensitive help, linkable libraries, and a
tutorial with over 100 code examples.
Additional features include a comprehensive menu system and shadowing and
overlapping of windows. All windowing libraries are written entirely in
hand-optimized assembly language, providing speed and low RAM overhead.
DDJ spoke with Russ Beardall, a systems analyst at Duke University. Said
Beardall, "I like the way the function calls are laid out. You can define all
the attributes and windows at the beginning and simply turn them on instead of
using many calls. On the other hand, you do have the option of using calls to
set specific window attributes."
SMP costs $349.95, $499.95 with source code. Reader service no. 24.
Magee Enterprises Inc. 2909 Langford Road, Suite A600 Norcross, GA 30071-1506
404-446-6611
Tools.h++ 4.0, a foundation class library compatible with Microsoft Windows
3.0, is now available from Rogue Wave. The new version includes over 60
classes that are not derived from a single root class, making it easy to
combine them with other C++ classes from CNS, Glockenspiel, Zinc, and others.
Classes are provided to handle strings, dates, times, files, B-trees,
Smalltalk-like collection classes, linked lists, queues, stacks, and so on.
There are several features new to the version: It can be run as a DLL,
allowing smaller executables, code sharing, and easier maintenance; DDE and
Clipboard stream buffer classes make for easy exchange of data with other
applications while using stream I/O; a regular expression class makes it
possible to search for matches against a regular expression pattern; a tokenizer class
makes string parsing easy; expanded virtual I/O streams allow saving and
restoring any object into memory, disk, or through the Windows DDE or
Clipboard; and error checking allows for structured error handling.
All classes are optimized for speed and size and have an isomorphic persistent
store facility that allows complex objects to be stored and restored on
heterogeneous networks or through the DDE.
With source code, Tools.h++ for MS-DOS costs $199; $499 for Unix workstation
or DOS network. DOS object code only runs $99. Reader service no. 26.
Rogue Wave P.O. Box 2328 Corvallis, OR 97339 503-745-5908
MetaWare has released its Windows Application Development Kit (ADK) for 32-bit
Application Development. The new ADK enables you to develop, debug, and run
true 32-bit Windows 3.0 applications using the MetaWare High C compilers. The
ADK also supports MetaWare's full 32-bit Run-Time Library.
Applications developed for 16-bit Windows or 32-bit Extended DOS can be ported
with minimal changes, allowing utilization of 32-bit protected memory without
using a DOS extender.
DDJ spoke with Joe Chien, software engineering manager at Silicon Graphics,
Mountain View, Calif. "Everything in Unix is running on 32-bit, so the ADK
makes it much easier to port, and calculations are faster," said Chien.
Furthermore, "you can use flat memory, and MetaWare's tech support is very
good."
License fees run $795; the introductory fee is $495. Technical support is
free. Reader service no. 28.
MetaWare Inc. 2161 Delaware Ave. Santa Cruz, CA 95060-5706 408-429-META
Scientific Software Tools has released DriverLINX, a series of real-time
data-acquisition drivers for third-party high-speed analog and digital I/O
boards for Windows 3.0. DriverLINX consists of language- and
hardware-independent Dynamic Link Libraries designed to support
data-acquisition boards from Keithley Metrabyte, Advantech, Computer Boards,
and Sotec under real, standard, and enhanced modes. The DLLs include services
to display dialog boxes for configuration management and service request
entry, context-sensitive online help, and extensive error-checking and
reporting capabilities.
The high-level interface to PC data-acquisition hardware reduces the effort
involved in porting data collection, instrumentation, monitoring, and control
applications into the Windows environment.
Applications communicate with DriverLINX by passing a "service request"
containing the specifications for a data-acquisition task. DriverLINX
acknowledges the request and notifies the application as the task is
completed.
There are over 70 services for creating foreground and background tasks to
perform analog and digital input and output, time and frequency measurement,
event counting, pulse output, and period measurements. The most common
data-acquisition protocols are implemented without sacrificing the hardware's
high-speed data-acquisition capabilities.
DriverLINX includes multitasking and multiuser capabilities; it can manage up
to six data-acquisition boards and ten concurrent tasks.
DriverLINX costs $400 and requires Windows 3.0, DOS 3.1 or later, 640K of
memory, a hard disk, and a 286, 386, or 486 machine. Reader service no. 27.
Scientific Software Tools Inc. Penn State Tech. Development Center 30 East
Swedesford Road Malvern, PA 19355 215-889-1354
Xionics has released the ImageSpeed software accelerator, an engine that
provides low-cost, high-performance Document Image Processing (DIP) technology
for image retrieval applications. Working in a PC LAN environment, ImageSpeed
increases image clarity on standard VGA monitors using Scale-to-Gray
technology, an algorithm that converts scanned monochrome images to gray-scale
prior to display, thus enhancing visible resolution.
ImageSpeed features convolution scaling and rotation. It can retrieve,
decompress, and display an image scanned at 300 dpi in as little as one
second. Using Scale-to-Gray adds two to four seconds.
The engine can be driven by ImageSoft Libraries running under DOS, Microsoft
Windows 3.0, and OS/2. This set of C callable routines controls all imaging
operations, including scanning, compression, decompression, scaling, rotation,
display, and printing.
Twenty ImageSpeed licenses sell for $3000; 250 run $10,000. Demo copies are
available. Reader service no. 25.
Xionics Inc. 765 The City Drive, Suite 340 Orange, CA 92668
































































August, 1991
SWAINE'S FLAMES


If You Build It, Will They Come?







In the first place, the name is wrong. It's not multimedia. Multimedia means
combining media, like pasting a picture in a word processing document, or like
USA Today. Words and pictures together, that's one medium plus another, voila,
multimedia. Other media can get into the act, but words and pictures are
enough. And words and pictures together aren't new.
This new thing that is being called multimedia, this convergence of television
and video and sound and audio CD technology with computer technology and
CD-ROM, is not about just any media. It's specifically about dynamic media
like sound and video: media that include a time dimension. Even the "multi"
part has to do with time: For these new media, integration really means
synchronization.
Apple has recognized this time-dependence in naming its new media product,
QuickTime. QuickTime is a set of tools for compressing, storing, and
synchronizing tracks of dynamic data. As part of QuickTime, Apple has defined
an extensible, cross-platform dynamic data file format called "movie." Movies
can have tracks of time-dependent data of various types. Currently, the data
type includes two kinds of tracks: video and sound. More are planned.
Maybe the term multimedia will go away. Apple's press releases for QuickTime
show a reticence with respect to the term, and there's a lot of talk at shows
and in the press about Desktop Video (DTV?) and Desktop Presentations (DTP2?
-- hmm).
In the second place, the name is still wrong.
It's probably not going to be desktop anything. The adjective desktop implies
something about the market for dynamic media products that just ain't so. Look
at the machines: At the low end, Commodore's CDTV is only the precursor to
low-cost home media machines from Apple and PC vendors, while the high end
includes relatively expensive studio-quality MIDI and video editing equipment.
These are tools for the home or for the studio, not for the desktop. Look at
the definition Microsoft has put forth for a "multimedia PC": a 10-MHz 286 or
better, 2-Mbyte RAM, 30-Mbyte hard disk, VGA, internal CD-ROM drive, and an
audio card that meets certain specifications. A 286? These are not the specs
for a desktop computer for the 1990s; this is a definition designed to cover
home media machines and to scale up to studio-quality equipment. Desktop
business machines are in there somewhere, but I don't believe that business
presentation software was uppermost in the minds of the authors of this
definition.
Even if board-meeting presentations of the 1990s are spiced up with video and
sound, the real market for dynamic media is probably not some extrapolation
from desktop publishing, but rather an extrapolation from the current video
market. Some people are even starting to talk about dynamic media development
as The New Hollywood.
In the third place, if that's the right name, God help us.
What would The New Hollywood produce? The old Hollywood produces movies,
entertainment, mind candy. Can this really be where the dynamic media trend in
computer technology is heading? Will content producers dominate and tool
producers become something like special effects artists? Will productivity
give way to what Apple calls "user experience" in judging products?
Yeah, probably to some extent all of this will happen, and would happen even
without dynamic media, simply because content-based products are now viable.
CD-ROM is the first economical delivery medium for content producers. Lotus
Marketplace didn't fail in the marketplace; there were a lot of small
businesses that wanted that product, and there are lots of customers for other
content-based products. Data is eminently marketable.
Sure, databases, but entertainment? Yeah, that too. It's no deep insight that
the way to get the computer into the home was to connect it to the television
set, and now that's happening. It shouldn't be surprising if what gets
produced is somehow connected to TV fare, too. But while the public's need for
entertainment products seems to exceed its need for spreadsheets, the failure
rate of big-budget movies suggests that it's difficult to gauge the public's
need for any particular entertainment product in advance. It's a highly
subjective field of dreams.
Developers of business software have had the luxury of knowing that their
products can increase the user's productivity. Ditto for compiler vendors.
Increased productivity is a legitimate pitch that speaks to the bottom line.
But developers working in The New Hollywood may not have that luxury. Buying
into that market may be like hearing voices in a cornfield and building a
ballpark.
If you build it, they will come. Well, maybe. Now it looks like we're building
The New Hollywood. Who will come to the opening?
Michael Swaine editor-at-large



































September, 1991
September, 1991
EDITORIAL


Radio Days, or Making Waves on the Airways




Jonathan Erickson


It never fails. The more things change, the more they seem to stay the same.
In the late 1800s, for instance, Oklahoma weathered Boomers, Sooners, and
scoundrels in one of the biggest Federally-okayed land grabs in history. In
the late 1900s, yesteryear's Boomers may become tomorrow's Tuners as the
government once again gets knee-deep in deregulation dilemmas. This time
around it's not grazing land or oil patches in the eye of the storm, but the
crowded radio spectrum. In a bill before Congress, up to 200 MHz of the
spectrum will eventually move from governmental control (the Pentagon, Energy
Department, and the like) to private industry. At issue are the mechanics of
how the Federal Communications Commission will make the move. One proposed
method of reassignment -- backed by the FCC and White House -- is to auction
off the spectrum to the highest bidder. Best guesses put the future market
value of the available frequencies upwards of $200 billion.
It's not that paying off the national debt or unburdening the FCC of
regulatory responsibilities is altogether bad. But I agree with radiophile
Harry Helms (author of The Underground Frequency Guide and Shortwave Listening
Guidebook), who compares auctioning the spectrum to selling off the
national parks. Moreover, the auction provision means that available
frequencies will likely go to large corporations to the exclusion of small
innovative companies. The alternative is to assign frequencies by lottery or
hearings, both presumably run by the FCC.
By now you're probably wondering if you mistakenly picked up Radio and
Electronics instead of Dr. Dobb's, and questioning what the radio spectrum has
to do with computer programming. The answer is, "everything." Over the coming
years, data communications will increasingly be handled by wireless
technologies -- RF, spread spectrum, infrared, and cellular. Terms such as
PCN, SMR, WIN, CT3, and "10-4 Good Buddy" will be as common to programmers as
to radio hams. At a recent conference, Bill Frezza of Ericsson GE Mobile Data
summed up the relationship between PCs and radio communications: "We're not
out to put a keyboard on every radio. We just want an antenna on every PC."
The infrastructure for the burgeoning data radio industry is already in place.
Nationwide networks such as the IBM/Motorola-sponsored ARDIS (Advanced Radio
Data Information Service) system and RAM Mobile Data's packet-switched mobile
data service are now online (well, on the air), enabling you to send short
blasts of data at 8 Kbps from one end of the country to the other for as
little as 25 cents/Kbyte. Likewise, personal wireless networks will sprout
inside buildings. (Ericsson Business Communications has an experimental system
operating at 800-1000 MHz that converts analog PBX transmission and digital
radio signals.) Of course, there's nothing new about radio/PC communications;
Jim Warren proposed such a system in "The Digicast Project," DDJ, October
1978.
Software developers are already getting on the wireless wagon. Motorola's
Mobile Communication group will soon release a DOS-based C toolkit for RF
modem software to support its 400i portable RF modem (the one used with the
NCR 3125 pen-based computer). Traveling Software will deliver a communications
engine designed to accommodate drivers for RF, cellular, and other transports.
NCR and Persoft are providing comm software to connect NCR wireless networks
to Ethernet networks. And Apple is pushing hard for its proposed Data-PCS
("Personal Communications Service") for transmitting data wirelessly within a
radius of 50 meters indoors.
Central to Data-PCS (endorsed by NCR, IBM, Tandy, and Grid) is the assumption
that computer users have the same right to communicate wirelessly as with
wires. To this end, Apple is asking that "a small part of airwaves be made
available to computer manufacturers and users, without requiring radio
licenses or having to pay for using the airwaves." The burr under the FCC
saddle is that the frequencies Data-PCS requests -- 40 MHz in the 1850-1990
MHz band -- are currently allocated to others; hence the need for privatization
and reassignment. Granted, digital communication without a license already
exists, down around 49 MHz. But this area is reserved for low-power units (toy
walkie-talkies, cordless phones, garage door openers, and so on) and is too
noisy for PC use. What's needed is nothing less than an open access digital
radio service for PC users.
Let's not forget that better ways to use scarce spectrum resources will
evolve. Faster communication and advanced compression will make more efficient
use of radio frequencies, not to mention freeing up those portions currently
being used inefficiently. (The rapid move to cable TV, for instance, may free
up some frequencies used by broadcast TV.) But as we know, technology is a
two-edged butterknife. For example, the greater a signal's baud rate, the more
frequency space it needs. Yet problems like this pave the way for new
solutions -- and new opportunities. (RF Data Network Systems, for instance, is
a small company I ran across that's addressing these emerging needs by
developing data compression routines for radio communications.)
To learn more about such topics, attend the 10th ARRL Amateur Radio Computer
Networking Conference (September 27-29) in San Jose, Calif. For details,
contact the ARRL, 225 Main Street, Newington, CT 06111, 203-666-1541. Over and
out.








































September, 1991
LETTERS







Cordic Connections


Dear DDJ,
I just read Pitts Jarvis's article, "Implementing Cordic Algorithms" (October
1990), which was brought to my attention by a friend who knew I was looking
for such an algorithm. Unfortunately, that was a year ago. I did solve my
problem, however, using this very same algorithm. I found the article very
easy to understand and it certainly made the algorithm a lot clearer to me
than it had been. I would like to add a couple of comments, if I may.
I had actually found these algorithms through Knuth's Fundamental Algorithms,
specifically, exercise 28 in section 1.2.2 (and its answer, of course), which
is concerned with performing exponentiation using only shifts, adds, and
subtracts. The note at the end of the answer to the exercise says that there
are similar algorithms for trig functions and gives references. Somewhere in
the paper trail of references I read that the algorithm actually dates back to
Briggs, 300 years ago. My friend told me that Briggs actually computed some of
the original log tables. As I understand it, the logarithm algorithm may have
developed into the trig algorithm by way of Euler's formula (which relates the
two in the field of complex numbers).
Also, Pitts suggested these algorithms might be useful for graphics (which was
my application) but I don't believe you indicated that the circular routines
are extremely well suited for conversions between polar and rectangular
coordinates. For example, I believe if you put r * K into the circular
routine, you will get r * cos(a) and r * sin(a). And in the inverse routine,
you unavoidably get sqrt(x * x + y * y)/K. In other words, these routines
unavoidably do coordinate conversion. This, when I saw it, reminded me of HP
calculators, one of which you say uses these algorithms.
I had also noticed (although I would not recommend them) that Taylor's series
for the sine or cosine of a particular number requires all the same
multiplications. My conclusion was that it would be nice if there were
coordinate conversion routines in software libraries and available from
floating point processors because they are often what is needed, and there is
probably a lot of waste in computing sine and cosine independently. Of course,
these (CORDIC) functions would require an intrinsic multiply or divide by K.
Now I see that Intel '87s use this algorithm, and while simultaneous sine and
cosine are available, software libraries (that I know of) don't make it
available for high-level languages. So how about it? Will we see polar and
rectangular coordinate conversion routines in standard libraries?
Also, the units or scale of r, x, and y need only be consistent; the algorithm
is independent of this. The units or scale of the angle need only match the
atan table, which theoretically could be a parameter of the function.
Anyway, yours was a very nice article, but where were you when I needed you?
Lawrence Leinweber
Cleveland Heights, Ohio
Pitts responds: Knuth attributes exercise 28 in section 1.2.2 (a method for
exponentiation using only shift and add) to Richard Feynman, the Nobel prize
winning physicist. Chapter 22 of The Feynman Lectures on Physics describes the
method Henry Briggs used in 1620 to calculate the original table of
logarithms. This chapter, entitled "Algebra," in the space of ten pages
relates the trigonometric functions, logarithms, exponentials, complex
numbers, e, and pi using only the basic notions of integers and counting. It
is fascinating reading to see how the consequences of a few simple definitions
and some algebraic manipulation can lead to Euler's formula.
The CORDIC algorithms perform rectangular/polar conversions by their very
nature; however, you have to scale r carefully in order to avoid overflow as
the algorithm runs.
For general-purpose floating calculations, it's probably more efficient to
perform polar/rectangular conversions by computing sine and cosine separately,
using methods based on approximation theory. Most floating point libraries use
this method. This class of algorithms is tabulated in Computer Approximations,
by J. F. Hart, et al. The CORDIC algorithms are useful in more specialized
circumstances.
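The coordinate conversion discussed in this exchange can be made concrete with a small sketch. The Python below is an illustration only, not the code published with the original article: it uses floating-point multiplications by powers of two where a fixed-point version would use shifts, and the names N, ATAN_TABLE, polar_to_rect, and rect_to_polar are assumptions chosen for this example. It shows that feeding r * K into the circular rotation yields r * cos(a) and r * sin(a), and that the inverse routine produces sqrt(x * x + y * y)/K, which the sketch multiplies by K to recover r.

```python
import math

N = 32  # iteration count (assumed); each iteration adds roughly one bit

# Table of atan(2^-i); its units fix the units of the angle argument.
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N)]

# K is the reciprocal of the gain accumulated over N iterations (about 0.60725).
K = 1.0
for i in range(N):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))


def polar_to_rect(r, a):
    """Rotation mode: returns (r*cos(a), r*sin(a)) for |a| up to about 1.74 rad."""
    x, y, z = r * K, 0.0, a          # pre-scale by K to cancel the gain
    for i in range(N):
        d = 1.0 if z >= 0 else -1.0  # rotate so the residual angle z goes to 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ATAN_TABLE[i]
    return x, y


def rect_to_polar(x, y):
    """Vectoring mode: returns (r, a) for x > 0."""
    z = 0.0
    for i in range(N):
        d = -1.0 if y >= 0 else 1.0  # rotate so y goes to 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ATAN_TABLE[i]
    return x * K, z                  # x has grown to sqrt(x*x + y*y)/K


x, y = polar_to_rect(2.0, math.pi / 6)
r, a = rect_to_polar(3.0, 4.0)
```

Because the intermediate x and y grow by a factor of 1/K (about 1.65) during the iterations, a fixed-point version must leave headroom in r, which is the overflow concern raised in the response above. Angles outside the convergence range would need a quadrant reduction step that this sketch omits.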


The Mandarin Middle Management Conspiracy


Dear DDJ,
In the June 1991 Programmer's Bookshelf, Ray Duncan takes Ed Yourdon to task:
"What Yourdon views as programming, you and I would consider the tedious
paper-pushing of burned-out middle managers."
I am sorry to have to tell Ray that Yourdon's view is the standard in large
organizations. Still the standard. A full generation after people started to
notice that good programmers frequently make poor managers, few big companies
-- none at all, actually, in my 22 years of experience -- have any proper
career structure for people whose natural talent is for what Ray correctly
calls "the creative labor of programming." The assumption everywhere is that
after three or four years cutting code, you have paid your dues and are
entitled to be rewarded with a real job: issuing memos, drafting proposals,
and attending meetings, meetings, meetings.
As a matter of fact, there is some justice in the Yourdon view. Most of those
people who did their three years coding were not much good at it (ask the
people who have to maintain their code!) and lapsed gratefully into the Unit
Manager slot. Like most human beings, they had no talent or passion for
anything outside their private lives, and so were best employed in a job
requiring no talent or passion. The true casualties are those of us who love
making code and are good at it, yet want to arrive at middle age earning a
respectable salary. We are the losers in a culture dominated by the idea, once
confined to empires of the bureaucratic-despotic type, that the only
worthwhile form of human activity is directing the work of others: to say to
this one, "Come," and he cometh, to say to that one, "Go," and he goeth. The
goal of all endeavor is the Mandarin's cap. Those of us who don't want to be
Mandarins, but who would like to be properly rewarded for useful work done
with loving attention, are thought of as eccentrics. Try turning down that
promotion to Unit Manager: They look at you as if you'd asked for a transfer
to the mail room.
Between the middle management drone and the nerd hacker who never takes his
eyes from the screen (all you ever see of him is his ponytail), there is
another breed: the true programmer/analyst. We dress for business, and we know
business. Years of working with accountants, engineers, executives, field
engineers, and warehouse managers have taught us how they think and what their
needs are. We can figure out their requirements, then we can go back to our
cubes and turn those requirements into fast, maintainable, waterproof code.
Not total loners, we can play the noncom when necessary, taking on a couple of
support programmers and guiding their work to good effect. We understand the
need for CASE methods and can operate perfectly well within them (pace your
correspondent Andy Bender, same issue), though we'll never be enthusiastic
about "functional specifications," "dataflow diagrams," and all those other
submanagerial exercises in applied boredom, eating into time that could be
spent coding or fishing around in users' minds. We don't mind turning to a 4GL
for the occasional emergency enhancement, but will never believe that this is
real programming. ("Oracle is for people who don't like computers," a
colleague commented to me recently. We all know what he means: It's for the
three-year people.)
Sick of the Mandarin ethos of large companies, we've tried to go into business
for ourselves as code producers, but found we had no skill at marketing. We
end up as "consultants," resented by permanent staff, harassed by the IRS, cut
off from the life of the companies we work at.
When programmers first started to appear in business organizations, they were
regarded with fear and suspicion, the possessors of arcane knowledge:
anarchic, disruptive elements in the stately world of business management. The
paper-pushers saw early on that computers were a threat to their status. They
acted to contain and neutralize that threat. Now the Yourdons have taken over,
and Harmony has been restored. Programmers -- like engineers in a previous
generation (why do you think everything's now made in Japan?) -- have been
marginalized and demoralized. Andy Bender can talk as much as he likes about
the "professionalization" of software engineering and the need for a "standard
university curriculum;" but if he thinks graduates of that curriculum will
ever have the status of MBAs, he's dreaming.
There are consolations, though. Programming may, as a Mandarin recently told
me, be "nothing but a glorified clerical function," but we still have our
skill and our passion. And, of course, we still have Dr. Dobb's.
John Derbyshire
London, England


Raising Questions on Raising Matrices


Dear DDJ,
Victor J. Duvanenko's "Efficiently Raising Matrices to an Integer Power" (June
1991) showed me that the method I have been using for raising real numbers to
integer powers can also be used for matrices. I call this the "Russian peasant
method" for exponentiation. It is closely related to a method of that name for
multiplication which only involves the simple operations of halving, doubling,
and adding.
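The halving-and-squaring idea can be sketched for real numbers as follows. This is only an illustration: the name peasant_pow is hypothetical and is not taken from the article or from the program described in this letter.

```c
/* Raise x to the non-negative integer power n by repeated halving
   of the exponent and squaring of the base.
   peasant_pow is a hypothetical name used for illustration only. */
double peasant_pow(double x, unsigned int n)
{
    double result = 1.0;

    while (n > 0) {
        if (n & 1u)      /* exponent is odd: fold the current square in */
            result *= x;
        x *= x;          /* square the base */
        n >>= 1;         /* halve the exponent */
    }
    return result;
}
```

A call such as peasant_pow(2.0, 10) needs only four trips through the loop, one per bit of the exponent, rather than ten multiplications by 2.0.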
While this is an efficient method of raising a matrix to an integer power, it
is not an efficient method of computing Fibonacci numbers, as was implied in
the article. There is an exact equation for Fibonacci numbers due to Jacques
Philippe Marie Binet, published in 1843:
 F(n) = (b^n - c^n)/a
 where a = sqrt(5),
 b = (1 + a)/2 = Golden Ratio,
 c = (1 - a)/2 = 1 - b = -1/b
Since the magnitude of c^n/a is less than 1/2 for n >= 0, an efficient method
of computing Fibonacci numbers is the evaluation of the equation:
 F(n) = Round(b^n/a).
In C this can be programmed as:
 #include <math.h>
 fib = floor(0.5 + pow(b,n)/a);
A program based on this method runs 15 to 20 times faster than Victor's matrix
method.
Harry J. Smith
Saratoga, California

Dear DDJ,
I read "Efficiently Raising Matrices to an Integer Power" with great interest.
Mr. Duvanenko's points are very well considered and illustrated.
Although it is not the major point of the article, I would like to add another
method to compute Fibonacci numbers as an illustration of a more general
point. The point is that there are frequently direct analytic methods that
yield closed-form solutions. In fact, the Golden Ratio is not just the limit
of the ratio of F(n) to F(n - 1) as n approaches infinity, but can be
expressed exactly as (sqrt(5) + 1)/2. Frequently, the Golden Ratio is referred
to as the Golden Number and is expressed as a nonrepeating decimal with an
approximate value of 1.6180339.
You may find it interesting to note that the GN has some other interesting
properties. For example, (GN - 1) = (1/GN).
A source of the closed form comes from discrete difference equations. The
following is a short form solution:
 F(n) = F(n - 1) + F(n - 2)
 F(n - 2) + F(n - 1) - F(n) = 0
When the del operator is defined as del(F(n)) = F(n - 1), the general
solution has the form a * b^n. Using the quadratic formula on the
characteristic equation:
 del del + del - 1 = 0
yields a general solution of
 F(n) = 1/sqrt(5) * ((1 + sqrt(5))/2)^n
 - 1/sqrt(5) * ((1 - sqrt(5))/2)^n
Using the closed form solution implemented with static constants yields a
solution that computes in fixed time. This approach can be used with a variety
of other functions, as shown in Example 1.
Example 1

 #include <stdio.h>
 #include <stdlib.h>
 #include <math.h>
 #include <string.h>

 static double Invsqrt5;    /* 1/sqrt(5) */
 static double Sqrt5plus;   /* (1 + sqrt(5))/2, the Golden Ratio */
 static double Sqrt5minus;  /* (1 - sqrt(5))/2 */

 double fiber(int n);

 int main (void)
 {
     int f_num;
     char f_str[80];

     Invsqrt5 = 1.0 / sqrt(5.0);
     Sqrt5plus = (1.0 + sqrt(5.0)) / 2.0;
     Sqrt5minus = (1.0 - sqrt(5.0)) / 2.0;

     while (1)
     {
         fputs ("\nnumber: " , stdout);
         f_str[0] = '\0';
         if (fgets(f_str, sizeof(f_str) , stdin) == NULL)
             break;                 /* end of input: leave the loop */
         if (strlen(f_str) > 0)
         {
             f_num = atoi(f_str);
             printf("F(%5d) is %20.0f\n" ,
                    f_num , fiber(f_num));
         }
     }
     return (0);
 }

 /* Closed-form evaluation of the nth Fibonacci number */
 double fiber(int n)
 {
     return((pow(Sqrt5plus , (double) n)
           - pow(Sqrt5minus , (double) n))
           * Invsqrt5);
 }

Tom Carrington
Plano, Texas

Dear DDJ,
While Mr. Duvanenko's article was interesting, I thought it was
inappropriately applied. If individual Fibonacci numbers are required, they
can be more economically computed using the following formula:

 F(n) = Round(p^n/sqrt(5)),

where p = (1 + sqrt(5))/2 is the Golden Ratio.
p^n can be evaluated using the repeated squaring method described in the
article. Or even better, F(n) = trunc (0.5 + exp(n * ln(p) - 0.5 * ln(5))).
Both formulae are derived from the solution to the recursive definition of
Fibonacci numbers, i.e.,

 F(n) = 1/sqrt(5) * [((1 + sqrt(5))/2)^n -
 ((1 - sqrt(5))/2)^n]
As the absolute value of the second term is less than 0.5, we can neglect it
and round to the nearest integer.
Jeremy Ottenstein
Elizabeth, New Jersey


Are Your Windows Running?


Dear DDJ,
Ben Myers's WINTHERE program ("Winthere," January 1991) is very useful if you
need to know if Windows is running in real, standard, or DPMI mode. But if you
only need to know if Windows is running or not, there is an easier way.
Windows adds a new environment variable, "windir," to keep track of its home
directory. The variable is added by Windows at run time and not, as usual, in
AUTOEXEC.BAT. Since any app spawned by DOS EXEC (INT 21h, function 4Bh), as
programs normally are, inherits the environment block from
its parent, windir will be available in all win- and oldapps started under
Windows. Note that this is only available in Windows 3.0. Example 2 shows my
environment in a DOS-box when running Windows.
Example 2: The DOS-box environment with Windows running

 C:\DOS>set
 PROMPT=$p$g
 PATH=C:\WINDOWS;C:\DOS;C:\PCT;C:\BORLANDC\BIN;C:\WINDEV;C:\TD;
 TEMP=C:\WINDOWS\TEMP
 COMSPEC=C:\DOS\COMMAND.COM
 windir=C:\WINDOWS

The program in Example 3 checks if the "windir" environment variable is
present and if it is, the program exits with error code 0, otherwise with
error code 255.
Example 3: Program to check for windir

 // CheckWin check if the environment variable "windir"
 // is present.
 #include <stdlib.h>

 int main(void)
 {
 if(getenv("windir") == NULL) return 255; // windir didn't exist
 else return 0; // it did
 }

The CheckWin program is very useful in .BAT files, as in Example 4, for
checking if they are running with or without Windows present. This avoids the
common problem when
you forget that you are in a DOS-box and try to start Windows again, often
ending up in a general protection fault destroying whatever was running under
Windows. (I renamed WIN.COM to WINDOWS.COM.)
Example 4: Program to check whether .BAT files are running with or without
Windows

 WIN.BAT
 @ECHO OFF
 CheckWin
 IF ERRORLEVEL 255 GOTO W
 ECHO Windows is already running (Press Ctrl+Esc or Alt+Tab)
 GOTO END
 :W
 WINDOWS
 :END

Torgil Johnsson
Kista, Sweden

Ben responds: Mr. Johnsson has indeed found a simple way to determine whether
Windows 3.0 is running. Just like my original and incomplete WINTHERE solution
to the problem, Johnsson's CHECKWIN has advantages and disadvantages.
1. CHECKWIN relies on an apparently undocumented feature of Windows 3.0,
though one that can readily be verified without resorting to reverse
engineering methods. Since the windir environment variable is undocumented,
Microsoft has no obligation to carry it forward to new releases of Windows.
When I first broached this problem to Microsoft via their online service over
a year ago, the Microsoft response suggested something similar to Mr.
Johnsson's approach, but it was vaguely stated, and there was no mention of
the windir environment variable.
2. The method in CHECKWIN is not robust. It does not guarantee that Windows is
running. For example, I could set the windir variable with a C setenv function
call made from a non-Windows program, and this would mislead any program that
used the CHECKWIN method. In passing, note that the DOS SET command cannot be
used to set an environment variable which contains any lower case characters,
because SET converts both environment variable and environment string to upper
case.
3. On the plus side, the CHECKWIN approach works properly from both Windows and
DOS apps. I put a similar getenv function call into one of my Windows apps,
and it returned the path from which Windows was launched. This implies that it
would also work when used within device drivers and other system software that
needs to know whether Windows is running or not. Also in passing, setting an
environment variable is not a foolproof method for making known the path from
which Windows is launched. Environment space in a PC is limited, and any
software that uses environment variables always runs the risk of exceeding
available environment space. Many Microsoft products use one or more
environment variables. From a human factors standpoint this is not desirable,
especially for novice users who may have real difficulty figuring out what did
not work and why. There are several good techniques for keeping track of this
sort of information without using environment variables.
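Point 2 above can be illustrated with a short sketch. It assumes a C library that provides putenv (POSIX does, and the letter does not name a particular compiler); the function names and the path are hypothetical.

```c
#define _XOPEN_SOURCE 700   /* exposes putenv on strict POSIX compilers */
#include <stdlib.h>

/* The CHECKWIN test: is "windir" present in the environment? */
int windir_present(void)
{
    return getenv("windir") != NULL;
}

/* What a non-Windows program could do to mislead that test.
   The path is made up. putenv keeps a pointer to this buffer,
   so the buffer must stay valid after the call (hence static). */
int fake_windir(void)
{
    static char entry[] = "windir=C:\\NOWHERE";
    return putenv(entry);
}
```

After fake_windir() succeeds, windir_present() reports true in that process and in any child that inherits its environment, whether or not Windows is running.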
I have reservations about using the CHECKWIN approach in commercial software.
Until Microsoft defines a robust API for detecting the presence of Windows, I
cannot offer a better solution other than combining my original WINTHERE code
with the getenv function as used by CHECKWIN. The combined result still has
holes in it, but it is better than nothing. Let's hope that Microsoft will
address this issue with Windows 3.1, which is expected to be in beta testing
in late June or early July 1991.


September, 1991
LITTLE LANGUAGES, BIG QUESTIONS


An informal history of a good idea




Ray Valdes


Ray is a technical editor at DDJ. He can be reached at 501 Galveston Drive,
Redwood City, CA 94063.


What do you think is the most widely used programming language today? Much as
we'd like to think they have, neither C nor Pascal has taken over the world.
It's too late for Ada and ASM, and too early for C++. So it must be Fortran or
Cobol, right? The correct answer, however, is none of the above. The most
widely used programming language is, in fact, Lotus 1-2-3 Macro Language.
Yes, we know "real programmers" don't use Lotus Macro Language, but a very
large number of real people do use it, and get worthwhile results every day.
Why such a puny language claims so large an audience might be a mystery to
those of us ensconced in the flame wars of whether multiple inheritance in C++
is a Good Thing. Nevertheless, it's useful to look at how little languages
were used in the past, and how they are used in the software field today. Some
of this may cause you to rethink your past allegiances.


The History of Managing Complexity


In some sense, the entire history of computer programming is the evolution of
strategies for managing complexity. A principal strategy for dealing with
complexity has been computer language design.
Computer system designer Glenford Myers, in his classic work on computer
architecture (see "References"), talks about the "semantic gap" between a
machine's instruction set and the requirements of an application program. In
his view, a language compiler is a technique for bridging the semantic gap
between two distinct ways of representing a computational solution. Twenty
years ago, when Myers wrote his book, application requirements were such that
the gap could be spanned with a single step -- by writing an application in
Fortran or Cobol and translating this solution into its machine-language
representation. Now that requirements are more complex, the semantic gap has
become so wide that it's sometimes preferable to bridge it in several steps,
using a little language as a stepping-stone.
So over the years, the architecture of complex applications has undergone
structural change, from a single module built out of distinct subcomponents,
to a nested architecture. Designers often use an image of two Russian dolls,
one placed inside the other, to describe this kind of structure.
Complex application problems become easier to solve if the application is
partitioned into two (or more) nested components: a core module that provides
a primitive set of services for an application area (the "engine"), and a
surrounding module that provides programmatic access to these services. The
surrounding module is typically a language interpreter for a simple, easily
parsed computer language -- a "little language."
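The two-layer structure can be sketched in a few lines of C. Everything here is an assumption made for illustration: the "engine" is a toy pen that can only move, and the "little language" is a string of command letters (U, D, L, R), each with an optional repeat count.

```c
#include <ctype.h>
#include <stdlib.h>

/* The engine: a core module offering one primitive service. */
typedef struct { long x, y; } Pen;

static void pen_move(Pen *p, long dx, long dy)
{
    p->x += dx;
    p->y += dy;
}

/* The little language: a command letter followed by an optional
   count, for example "R3 U2 L1". Returns 0 on success, or -1 when
   an unknown command letter is found. */
int run_script(Pen *p, const char *s)
{
    while (*s) {
        char cmd;
        char *end;
        long n;

        if (isspace((unsigned char)*s)) { s++; continue; }
        cmd = (char)toupper((unsigned char)*s++);
        n = strtol(s, &end, 10);   /* parse the count, if any */
        if (end == s)
            n = 1;                 /* no digits: default count of 1 */
        s = end;

        switch (cmd) {
        case 'U': pen_move(p, 0, n);  break;
        case 'D': pen_move(p, 0, -n); break;
        case 'L': pen_move(p, -n, 0); break;
        case 'R': pen_move(p, n, 0);  break;
        default:  return -1;
        }
    }
    return 0;
}
```

Starting from a Pen at (0, 0), run_script(&pen, "R3 U2 L1") leaves it at (2, 2). The interpreter never touches the pen's fields directly; it only calls the engine's primitive, which is the separation the nested architecture relies on.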
Some of these languages are designed on an ad hoc basis and bear no
resemblance to any other language (Lotus Macro Language, for example). Other
languages are carefully crafted subsets of existing programming languages such
as C or Lisp. These days, you can find application engines for handling
databases (Paradox), publication graphics (Display PostScript), CAD (AutoCad),
symbolic mathematics (Mathematica), 3-D rendering, text-editing,
spellchecking, hypertext, and many other tasks.
Using little languages with big engines is not a recent invention, but a
technique that has been reinvented and rediscovered many times since we
evolved from the lower life-form of assembly language and began to walk
upright, using higher-level languages.
The first little languages were simple extension languages. For example, in
1964, Ivan Sutherland's landmark graphics program, Sketchpad, incorporated a
sophisticated, constraint-based system for describing relationships between
graphical elements; these constraints were described using a simple language.
The CoreWars computer game, popular with hackers on mainframe machines during
the mid-sixties, also used a simple language to describe competing processes.
No doubt there were many other such instances, but the written record before
1970 is rather sketchy (at least as found in my library).


The Spawn of lex and yacc


Constructing a language that is more than a toy, while not supremely
difficult, is nevertheless a laborious task that happens faster with proper
tools. It was during the '70s that language construction tools came into full
bloom. During those years, the pragmatic, tool-oriented company culture at
AT&T's Bell Labs -- as engendered by Kernighan, Ritchie, Thompson, Bentley,
and other researchers -- placed a high value on compiler-building utilities.
Unix's lex and yacc are now showing their age, but at the time were a
reasonably painless way to build the scanning and parsing components of a
compiler or interpreter. The Unix pcc (Portable C Compiler) was built with
these tools and then ported to many hardware platforms, and for many years
represented the de facto definition of the C language.
To my knowledge, some little languages built at Bell Labs using lex and yacc
include pic (for diagrams), tbl (table-processing), eqn (equations),
nroff/troff (document processing), awk (text manipulation), the various
operating system shells (Bourne, Korn), bc (a Basic-like calculator), Ideal
(graphics), Grap (mathematical graphs), and pico (image processing). These
programs were easy to construct, and proliferated like rabbits in the /usr/bin
directory on Unix systems.
Compared to bulky present-day software packages such as dBase, ToolBook,
HyperCard, or contemporary C compilers, the Bell Labs languages are pretty
lean and mean. On my PC, the size of the Microsoft C compiler (three passes)
is three-fourths of a Megabyte; the dBase IV binary is 1.8 Megabytes. By
contrast, the size of the awk executable is less than 60 Kbytes. The
application-specific component of a 1970s-vintage Bell Labs program -- the
engine -- is more like a small, two-stroke motor than a gas-guzzling V8.
A couple of years ago, Jon Bentley of Bell Labs wrote a CACM column entitled
"Little Languages," in which he discussed this idea of using little languages
to address different application areas. His brief but thought-provoking
article focused on the problem of creating complex graphics within a
character-mode document processor, and how the pic language at Bell Labs was
useful in this regard. Bentley, unfortunately, does not give us a complete
run-down of the many little languages that have emerged from Bell Labs. Such
an account would be most interesting. But the title remains memorable.
It is some kind of indicator that the most recent language creation to emerge
from AT&T -- C++, Version 2.1 -- is far from "little" (in fact, some critics
claim it is monstrously large). I think the language translation requirements
of C++ may have transcended the technology present in lex and yacc, given the
large number of potentially ambiguous constructs.
Over the years, the little languages from Bell Labs have been used and abused
in different ways. For example, some odd soul used the troff typesetting
language to write a Basic interpreter. (Notwithstanding, I've written a
troff-like interpreter in PostScript.)
Moving to another place and time, a momentous event in the history of software
technology occurred when Richard Stallman abused a little language at MIT.


Stallman's Inspired Hack


The text editor I'm using to write this is a descendant of an inspired hack
using the TECO macro language. TECO is a line-oriented text editing facility
that ran on various flavors of DEC computers, from PDP8s to 11s to 10s. It is
similar to DOS's EDLIN or Unix's ed, in that single-letter commands are used
to carry out editing functions on text that has been read into a buffer.
TECO, however, provides additional commands for putting data into local
variables, and for testing and branching, and therefore becomes Turing-machine
equivalent. A printout of TECO macros pushes the envelope of inscrutability,
making the average Unix or DOS command file seem as clear as a blue sky over
Montana.
In the mid-1970s, a friend and I implemented a menu-driven mailing list
application written in TECO macros. Although the entire source code was only
four pages long, we spent many hours debugging and enhancing the code. This
experience immensely increased my respect for Stallman's feat, which was to
build an extensible, screen-oriented text editor, EMACS, on the uneven,
prickly foundation of TECO macros.
The EMACS editor, of course, has since been reimplemented by Stallman and many
other independent authors. Its feature list has grown to include directory
editing, e-mail handling, and functions not usually associated with text
editors (but since adopted by mainstream commercial products). The most recent
incarnation of Stallman's editor is GnuEmacs, available from the Free Software
Foundation as well as many other sources. On my Unix system, the GnuEmacs
executable is about 650 Kbytes.
GnuEmacs is written using the two-step language/engine combo. The base layer
is about 80,000 lines of C code that implement both basic editing and
functions that support "MockLisp," a Lisp-like language in which the
higher-level features of the editor are implemented.
This two-layer structure has been followed by EMACS's spiritual descendants on
PCs, such as Epsilon and BRIEF. (For more on the BRIEF Macro Language, see
"Programmer's Workbench" in this issue.) Although BRIEF was not designed to be
an Emacs clone, both the original version of BRIEF and GnuEmacs use Lisp-like
macro languages. The new version of BRIEF uses a C-like language, similar to
that in Epsilon. Interestingly, BRIEF translates the C-like syntax to the
older Lisp form before compiling it into bytecodes.
The Lisp syntax has been chosen as an extension language model by numerous
other application programs, such as AutoDesk's AutoCad. It's not just
microcomputer applications -- for example, the design automation systems from
Cadence and Mentor, which run on workstations, have Lisp-like extension
languages. Recently, the CAD Framework Initiative (an industry standardization
group) chose the Lisp-like Scheme language as its standard extension language.
The principal reasons are that Lisp syntax is computationally complete, easy
to learn, and, most importantly, easy to parse. David Betz's XLisp provides a
respectable amount of Lisp functionality in less than 10,000 lines of C code.
Perhaps it was for this reason that XLisp was grafted onto AutoCad to become
AutoLisp. (For more information on XLisp and XScheme, a dialect of Lisp, see
"Testing C Compiler Performance" by David Betz, DDJ, August 1991.)


The Ultimate EMACS



Stallman's design approach has been echoed in text editor implementations far
and wide. Some of this influence has been indirect, by intellectual osmosis,
leading to inadvertent rediscovery. The ME text editor from Magma Systems may
fall into this category. In other cases, the lineage can be traced more
directly, as with Borland's word processor, Sprint, which evolved from Final
Word, which came from Mince, which came out of MIT.
Perhaps the most elaborate incarnation of a text editor -- the "ultimate
EMACS," so to speak -- is the technical publication system by Interleaf. The
Interleaf TPS package is a high-end system for writing, editing, and producing
technical documents. The software runs on workstations, X-terminals, IBM
mainframes, PCs, and Macs. If you need to produce thousands of pages of
specifications for an aerospace project, this is the tool of choice. Version
1.0 of the Interleaf software came out of the Scribe research project at MIT
in the early '80s. It consisted of 100,000 lines of C code. The current
version, Interleaf TPS 5.0, weighs in at over 1.5 million lines of C code.
Given these proportions, one can hardly expect Interleaf's macro language to
be a puny runt, and it is not, representing about 90 percent of a full Common
Lisp implementation (Common Lisp is the Ada of Lisp-like languages), plus
object-oriented extensions.
Unlike AutoCad, which keeps its AutoLisp extension language at arm's length
(by implementing most of the application's functionality in C), Interleaf Lisp
is not just an "extension" language, but also an implementation language.
About 25 percent of the functionality of the Interleaf system is implemented
in Lisp code, rather than C. This requires about 250,000 lines of Lisp, a
large amount, but much less than if it were done in raw C. The boundary
between the two languages is transparent enough that a C function can call a
Lisp function which calls another C function, without any function being aware
of what language is being used by its caller. Most of the data structures that
represent a document are packaged as objects visible to both Lisp and C code.
Despite the highly evolved structure of Interleaf TPS, it is gratifying to
note that remnants of the EMACS pedigree remain, in the form of the
traditional arcane key bindings -- present, but unknown to most of Interleaf's
customers and even its most senior technical staff.


How to Ask Hard Questions


As we have seen here, a small language can be used to implement large parts of
a complex application. But there are other uses for this technique.
Often, the most important step in solving a difficult problem is to ask the
right question. Many key developments in science have occurred as a result of
properly formulating a key question -- as happened after the apple fell on
Newton's head. Sometimes, however, before you can ask the right question, you
need a language in which to express that question.
The accompanying text box touches on the ability of Lisp-like languages to
express fundamental problems in computation. Beyond this, a number of AI
research projects leave "pure Lisp" way behind by defining languages on top of
Lisp in which ideas about knowledge and reasoning can be easily explored. The
first well-known example of this is Planner (designed by Carl Hewitt and used,
as Micro-Planner, in Terry Winograd's SHRDLU), since followed
by many others.


Alternative Syntaxes


Most of the little languages mentioned earlier use the syntax of either Lisp
or C. Implementors choose Lisp for its simplicity and its strong theoretical
underpinnings. C is used because of its familiarity. Alternative models
include Forth and Basic.
Forth is considered by its advocates to be the archetypal little language. Not
only does it have a byte-sized syntax, but its philosophy of use is congruent
to the language/engine idea. The process of writing a Forth program consists
of defining Forth words which are defined in terms of other words, which are
ultimately defined by calls to the underlying engine (in the case of
a language/engine duo), or by machine language fragments (in the case of a
stand-alone system). The most widespread example of a Forth-like little
language is PostScript. A less well-known example is the macro language for
the Final Word editor.
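The layering of words on words can be sketched with a toy stack interpreter in C. This is not Forth itself: the function names, the fixed-size tables, and the define_word call (standing in for Forth's colon definitions) are all assumptions made for illustration.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define STACK_MAX 64
#define WORDS_MAX 16

static long stack[STACK_MAX];
static int sp;                        /* number of items on the stack */

static struct { const char *name, *body; } words[WORDS_MAX];
static int nwords;

void push(long v) { if (sp < STACK_MAX) stack[sp++] = v; }
long pop(void)    { return sp > 0 ? stack[--sp] : 0; }

int eval(const char *src);

/* Define a new word in terms of existing words. */
int define_word(const char *name, const char *body)
{
    if (nwords == WORDS_MAX) return -1;
    words[nwords].name = name;
    words[nwords].body = body;
    nwords++;
    return 0;
}

/* Run one token: a number, an engine primitive, or a defined word. */
static int run_token(const char *tok)
{
    char *end;
    long v = strtol(tok, &end, 10);
    int i;

    if (end != tok && *end == '\0') { push(v); return 0; }
    if (strcmp(tok, "+") == 0) { long b = pop(); long a = pop(); push(a + b); return 0; }
    if (strcmp(tok, "*") == 0) { long b = pop(); long a = pop(); push(a * b); return 0; }
    if (strcmp(tok, "dup") == 0) { long a = pop(); push(a); push(a); return 0; }
    for (i = nwords - 1; i >= 0; i--)      /* newest definition wins */
        if (strcmp(tok, words[i].name) == 0)
            return eval(words[i].body);    /* words expand into words */
    return -1;                             /* unknown word */
}

/* Split the source on white space and run each token in turn. */
int eval(const char *src)
{
    char tok[32];

    while (*src) {
        size_t n = 0;
        while (isspace((unsigned char)*src)) src++;
        while (*src && !isspace((unsigned char)*src)) {
            if (n < sizeof tok - 1) tok[n++] = *src;
            src++;
        }
        tok[n] = '\0';
        if (n > 0 && run_token(tok) != 0) return -1;
    }
    return 0;
}
```

With define_word("square", "dup *") and define_word("cube", "dup square *") in place, eval("3 cube") leaves 27 on the stack. Only "+", "*", and "dup" reach the underlying C code; everything else bottoms out in those primitives. A word defined in terms of itself would recurse without limit in this sketch.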
Basic has not been used all that much, until recently -- but this will change.
Bill Gates's longstanding affection for Basic drives Microsoft's campaign to
make Basic the primary macro language for the DOS and Windows platforms. The
extension language in Word for Windows is the first step in the campaign.
Visual Basic (see next section) represents another volley. The principal
contender has yet to be introduced by Microsoft, however.
An older use of Basic-like syntax is found in the Mumps programming language
(see the code example in the sidebar). Mumps folds a rich mixture of
sophisticated database functionality onto a bare-bones syntax, consisting of
26 single-letter commands (one for each letter of the alphabet, including "G"
for "goto"). This strange brew of a language is the basis for one of the
largest software development projects ever, the $1.2 billion government
contract, now ongoing, to computerize the nation's VA hospitals.


Access to Environments


Another popular use of little languages is to interface to software
environments. The environments range from rudimentary operating systems such
as MS-DOS, to more complex graphical environments, such as Microsoft Windows
or the Mac Finder/System.
Access to operating system functionality has always been available with DOS
batch files (and on Unix, with the Bourne/C/Korn shell languages). By
contrast, easy programmatic access to windowing environments has been a long
time coming.
A recent example is Microsoft's Visual Basic, which deftly skirts the morass
of the 600-plus functions in the Microsoft Windows API, and provides
high-level access to a usable core set, plus low-level access to the rest. The
classic "hello world" program in Visual Basic is only one line long, compared
to the 100-line version using the raw Windows API.
On the Mac, a well-known example is the HyperTalk language found in Apple's
HyperCard program. As originally conceived by Bill Atkinson, HyperCard was
purely an end-user application, free from the complications of
programmability. As the program evolved during development, an English-like
interpreted language was added by Dan Winkler, in order to take full advantage
of HyperCard's rich set of features. A more recent example on the Mac,
containing functions closer to an operating system batch language, is
Userland's Frontier. (Refer to Michael Swaine's "Programming Paradigms" in
this issue for additional details on Frontier.)


Conclusion


In past product advertising, neither Apple nor Microsoft has been known for
acknowledging intellectual debts to prior innovators. I haven't checked the
ads recently, but I doubt things are different with Visual Basic and HyperCard
2.0. This is no great sin, because good marketing demands a focus on
here-and-now rather than on there-and-then. The history of little languages,
however, is too interesting -- and their contribution too important -- to be
conveniently forgotten.


References


Bentley, Jon. "Programming Pearls: Little Languages." Communications of the
ACM (August, 1986).
Myers, Glenford. Advances in Computer Architecture. Wiley, 1982.
Stark, Richard. LISP, Lore, and Logic. Springer-Verlag, 1990.
Thompson, Ken. "Reflections on Trusting Trust," in ACM Turing Award Lectures.
ACM Press/Addison-Wesley, 1987.
Unix Research System Papers, vol.2, Murray Hill, N.J.: AT&T Bell Labs.


How Strong Is Your Little Language?


Computationally speaking, some little languages are 98-pound weaklings, others
are bantam-weight powerhouses.
Although benchmarks exist to evaluate a compiler's code generation capability
and compilation speed, there aren't any formal benchmarks that can evaluate a
programming language's computational ability. In a broad sense, all but the
most rudimentary languages are equivalent, because all are capable of
Turing-machine computation. Looking at the details, however, languages are
often different enough that you're comparing apples and oranges.
Despite this observation, one can measure a computer language's expressiveness
by the ease with which fundamental problems such as the Halting Problem can be
expressed. Richard Stark, in LISP, Lore, and Logic, shows how a syntactically
small language like Lisp is actually a powerhouse of symbolic processing. He
presents an elegant proof of the unsolvability of the Halting Problem in about
20 lines of pure Lisp, by showing how certain clearly stated initial
conditions lead to a contradictory statement in Lisp.
He concludes his proof by saying, "When a computational system such as Lisp is
a source of problems that can't be solved within the system, there are two
[interpretations]. One is that the system is too weak to solve its own
problems. The other is that it presents such a strong notion of computability
that deep and unsolvable questions can arise. Either interpretation could be
true....In this case, we believe that the strong interpretation is correct."
A different, much more informal, benchmark of a language's computational power
is the programming exercise that Ken Thompson (coauthor of Unix) used to pass
the time in college. According to Thompson -- in his 1983 ACM Turing Award
lecture -- the following exercise was popular "...for the same reason that
three-legged races are popular." The goal is to write the shortest
self-reproducing program: "More precisely stated...to write a source program
that, when compiled and executed, will produce as output an exact copy of its
source."
Thompson adds, "If you have never done this, I urge you to try it on your own.
The discovery of how to do it is a revelation that far surpasses any benefit
obtained by being told how to do it." Thompson's program, in C, is shown in
Example 1(a). Although he considers it the starting point for "the cutest
program I ever wrote," the code itself is 233 lines of somewhat homely C.

Example 1: Writing the shortest self-reproducing program: (a) Ken Thompson's C
code; (b) a pure Lisp version; (c) a Mumps implementation.

 (a) char s [] = {
 '\t',
 '0',
 '\n',
 '}',
 ';',
 '\n',
 '\n',
 '/',
 '*',
 '\n',
 ...213 lines deleted...
 0
 };
 /*
 * The string s is a representation of the body
 * of this program from '0' to the end.
 */

 main()
 {
 int i;

 printf("char\ts[]={\n");
 for (i=0 ; s [i]; i++)
 printf("\t%d,\n",s[i]);
 printf("%s",s);
 }

 (b) ( (lambda (x) (list x (list (quote quote) x)))
 (quote
 (lambda (x) (list x (list (quote quote) x))) ))

 (c) S N=$C(13)_$C(10)_$C(9) S Q=$C(34) S S= "W N_""S
 N=$C(13)_$C(10)_$C(9) S Q=$C(34) S S=""_Q;F I=1:1:$L(
 S) S C=$E(S,I) W $S(C=$C(34) :Q_Q,1:C);W Q_N:F I=1:1:$
 L(S) S C=$E(S,I) W $S(C=$C(59) :N,1:C);Q;"
 W N_ "S N=$C(13)_$C(10)_$C(9) S Q=$C(34) S S="_Q
 F I=1:1:$L(S) S C=$E(S,I) W $S(C=$C(34) :Q_Q,1:C)
 W Q_N
 F I=1:1:$L(S) S C=$E(S,I) W $S(C=$C(59) :N,1:C)
 Q

By comparison, the equivalent can be expressed in pure Lisp in only three
lines of very elegant code; see Example 1(b). ("Pure Lisp" is the subset of the
Lisp language consisting of functions that have no side effects, that is,
functions that are mathematically "pure." The code in Example 1(b) is from
John McCarthy and Carolyn Talcott, and is quoted in Stark's book.)
For fun, I wrote the equivalent program in the Mumps programming language,
which is a Basic-like little language that accesses a sophisticated
associative database engine. Although the listing is only a couple of lines,
the code is extremely grubby and cryptic; see Example 1(c).
Although each of these programs has the same underlying logic, the steps
required to reach the same goal in each language are so different that you can
make a case that each program implements a distinct algorithm.
How does your favorite (or most despised) language stack up? Send us your
solution to this exercise and we'll publish the most interesting, elegant,
and/or bizarre listings. A special prize will go to the first entry written in
Lotus Macro Language.
-- R.V.












September, 1991
YOUR OWN TINY OBJECT-ORIENTED LANGUAGE


C++? Smalltalk? What About Bob?


 This article contains the following executables: BOB12.ARC


David Betz


David is a technical editor for DDJ, and the author of XLisp, XScheme, and the
TelePath conferencing system. He can be reached at DDJ, 501 Galveston Drive,
Redwood City, CA 94063.


When I first started reading Dr. Dobb's back in the '70s, the articles I
looked forward to the most were those describing tiny programming languages.
First there were tiny implementations of Basic for various microprocessors,
then small implementations of C and Forth, and even a tiny language for the
control of robots. These articles intrigued me because they not only described
a language, but also included complete source code for its implementation.
I've always been interested in how programming languages are constructed, and
this gave me an opportunity to look inside and see how things worked.
Eventually, I decided to try my own hand at building languages.
Since then, I've built many different types of languages, ranging from simple
assemblers to complete Lisp systems. This article describes my latest
creation, a C-like, object-oriented language I call "Bob." Unlike the popular
Small C compiler by Ron Cain, it isn't a strict subset of C or C++; hence it
isn't possible to compile Bob programs with a standard C or C++ compiler.
Instead, Bob is an interpreter for a language with C-like syntax and a class
system similar to C++, but without variable typing and mostly without
declarations. In a sense, Bob is a combination of C++ and Lisp.


Writing a Bob Program


Before I begin describing Bob in detail, let's discuss how you go about
writing Bob programs. Example 1(a) presents a simple example program -- a
function for computing factorials -- written in Bob.
This function definition looks a lot like its C counterpart. The only
noticeable difference is the lack of a declaration for the type of the
parameter n and for the return type of the function. Variable types do not
need to be declared in Bob; any variable can take on a value of any type.
To take this further, Example 1(b) shows a program that uses the factorial
function above to display the factorials of the numbers from 1 to 10. Again,
this program looks a lot like a similar program written in C. The main
difference is in the first line. In a function definition's formal parameter
list, the semicolon character introduces a list of variables local to the
function. In this case, the variable i is local to the function main. Also,
notice that I've used the print function to display the results instead of the
C printf function. The print function in Bob prints each of its arguments in
succession. It is capable of printing arguments of any type and automatically
formats them appropriately.
Example 1: (a) A Bob program for computing factorials; (b) a program that uses
the factorial function to display the factorials of the numbers from 1 to 10.

 (a) factorial(n)
 {
 return n == 1 ? 1 : n * factorial(n-1);

 }

 (b) main(; i)
 {
 for (i = 1; i <= 10; ++i)
 print(i," factorial is ",factorial(i),"\n");
 }

In addition to supporting C-like expressions and control constructs, Bob also
supports C++-like classes. Again, Bob is a typeless language, so the syntax
for class definitions is somewhat different from C++, but it is similar enough
that it should be easy to move from one to the other. Example 2(a) shows a
simple class definition.
Example 2: (a) A simple class definition; (b) a constructor function for the
class foo.

 (a) class foo
 {
 a,b;
 static last;
 static get_last();
 }
 (b) foo::foo(aa,bb)
 {
 a = aa; b = bb;
 last = this;
 return this;
 }

This fragment defines a class called foo with members a and b, a static member
last, and a static member function get_last. Unlike in C++, it is not
necessary to declare all member functions within the class definition; only
the static member functions need be declared. It is necessary, however, to
declare all data members in the class definition.
As in C++, new objects of a class are initialized using a constructor
function, which has the same name as the class itself. Example 2(b) is the
constructor function for the foo class. This constructor takes two arguments,
which are the initial values for the member variables a and b. It also
remembers the last object created in the static member variable last. Lastly,
it returns the new object. For those of you not familiar with C++, the
variable this refers to the object for which the member function is being
called. It is an implicit parameter passed to every nonstatic member function.
In this case, it is the new object just created.

In Bob, all data members are implicitly protected: The only way to access or
modify the value of a member variable is through a member function. If you
need access to a member variable outside a member function, you must
provide member functions for that purpose; see Example 3(a). To set the
value of a member variable, see Example 3(b). Finally, Example 3(c) is a
member function that displays the numbers between a and b for any object of
the foo class, and a main function that creates some objects and manipulates
them. The new operator creates a new object of the class whose name follows
it. The expressions in parentheses after the class name are the arguments to
be passed to the constructor function.
Example 3: (a) Providing access to a member variable outside a member
function; (b) setting the value of a member variable; (c) a member function
that displays the numbers between a and b for any object of the foo class, and
a main function that creates some objects and manipulates them.

 (a) foo::get_a()
 {
 return a;
 }

 (b) foo::set_a(aa)
 {
 a = aa;
 }

 (c) foo::count (; i)
 {
 for (i = a; i <= b; ++i)
 print (i, "\n");
 }

 main(; foo1, foo2)
 {

 foo1 = new foo (1, 2); // create an object of class foo
 foo2 = new foo (11, 22); // and another
 print ("foo1 counting\n"); // ask the first to count
 foo1 ->count ();
 print ("foo2 counting\n"); // ask the second to count
 foo2 ->count ();
 }

Bob also allows one class to be derived from another. The derived class will
inherit the behavior of the base class and possibly add some behavior of its
own. Bob only supports single inheritance; therefore, each class can have at
most one base class. The code in Example 4(a) defines a class bar derived from
the base class foo, defined earlier.
Example 4: (a) Defining a class bar derived from the base class foo; (b) the
constructor for bar, which initializes the new member variable c and performs
the initialization normally done for objects of class foo.

 (a) class bar : foo
 // a class derived from foo
 {
 c;
 }

 (b) bar::bar (aa,bb,cc)
 {

 this->foo (aa,bb);
 c = cc;
 return this;
 }

The class bar will have member variables a and b inherited from foo as well as
the additional member variable c. The constructor for bar needs to initialize
this new member variable and do the initialization normally done for objects
of class foo; see Example 4(b).
This definition points out another difference between Bob and C++. In C++,
constructor functions cannot be called to initialize already existing objects.
This is allowed in Bob, so the foo constructor can be used to do the common
initialization of the foo and bar classes. In C++, it would be necessary to
define an init function for foo and call it from both constructors.
That's a brief walk through the features of Bob. Table 1 details Bob's
complete syntax.
Table 1: Bob syntax

 Class Definition

 class<class-name>[:<base-class-name>]
 {<member-definition>...}

 Member Definition


 <variable-name>...;
 static<variable-name>...;
 <function-name>([<formal-argument-list>]);
 static <function-name>([<formal-argument-list>]);

 Function Definition

 [<class-name>::]<function-name>
 ([<formal-argument-list>][;<temporary-list>])
 {<statement>...}

 Statement

 if (<test-expression>)<then-statement>[else<else-statement>];
 while (<test-expression>)<body-statement>
 do<body-statement>while(<test-expression>);
 for(<init-expression>;<test-expression>;<increment-expression>)
 <body-statement>
 break;
 continue;
 return[<result-expression>];
 [<expression>];
 {<statement>...}

 Expression
 <expression>,<expression>
 <lvalue>=<expression>
 <lvalue>+=<expression>
 <lvalue>-=<expression>
 <lvalue>*=<expression>
 <lvalue>/=<expression>
 <test-expression>?<true-expression>:<false-expression>
 <expression>||<expression>
 <expression>&&<expression>
 <expression>|<expression>
 <expression>^<expression>
 <expression>&<expression>
 <expression>==<expression>
 <expression>!=<expression>
 <expression><<expression>
 <expression><=<expression>
 <expression>>=<expression>
 <expression>><expression>
 <expression><<<expression>
 <expression>>><expression>

 <expression>+<expression>
 <expression>-<expression>
 <expression>*<expression>
 <expression>/<expression>
 <expression>%<expression>
 -<expression>
 !<expression>
 ~<expression>
 ++<lvalue>
 --<lvalue>
 <lvalue>++
 <lvalue>--

 new <class-name>([<constructor-arguments>])
 <expression>([<arguments>])
 <expression>-><function-name>([<arguments>])
 (<expression>)
 <variable-name>
 <number>
 <string>
 nil



How Does it All Work?


I've implemented Bob as a hybrid of a compiler and an interpreter. When a
function is defined, it is compiled into instructions for a stack-oriented
bytecode machine. When the function is invoked, those bytecode instructions
are interpreted. The advantage of this approach over a straight interpreter is
that syntax analysis is done only once, at compile time. This speeds up
function execution considerably and opens up the possibility of building a
runtime-only system that doesn't include the compiler at all.
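
The compile-once, interpret-many idea can be sketched with a very small stack machine in C. This is a minimal sketch under stated assumptions, not Bob's interpreter: the opcode names (OP_PUSHI, OP_ADD, OP_MUL, OP_HALT), the function run_bytecode, and the array demo_code are invented for illustration, and stack-overflow checks are omitted.

```c
#include <stddef.h>

/* Hypothetical opcodes for illustration only; not Bob's instruction set. */
enum { OP_HALT, OP_PUSHI, OP_ADD, OP_MUL };

/* Interpret a bytecode array and return the value left on top of the stack.
   No overflow or underflow checks are made in this sketch. */
long run_bytecode(const unsigned char *code)
{
    long stack[64];
    size_t sp = 0;                      /* number of values on the stack */
    const unsigned char *pc = code;     /* next bytecode to fetch */

    for (;;) {
        switch (*pc++) {
        case OP_PUSHI:                  /* one-byte literal operand follows */
            stack[sp++] = *pc++;
            break;
        case OP_ADD:                    /* pop two values, push their sum */
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:                    /* pop two values, push their product */
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
        default:
            return sp ? stack[sp - 1] : 0;
        }
    }
}

/* Bytecode a compiler might emit once for (2 + 3) * 4. */
const unsigned char demo_code[] = {
    OP_PUSHI, 2, OP_PUSHI, 3, OP_ADD, OP_PUSHI, 4, OP_MUL, OP_HALT
};
```

Calling run_bytecode(demo_code) repeatedly re-executes the same compiled bytes; no source text is parsed again, which is the saving described above.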


Runtime Organization


First, I'll describe the runtime environment of Bob programs. The virtual
machine that executes the bytecodes generated by the Bob compiler has a set of
registers, a stack, and a heap. The register set is shown in Table 2. All
instructions get their arguments from and return their results to the stack.
Literals are stored in the code object itself and are referred to by offset.
Branch instructions test the value on the top of the stack (without popping
the stack) and branch accordingly. Function arguments are passed on the stack,
and function values are returned on top of the stack.
Table 2: Registers used by the virtual machine

 Register Description
 ---------------------------------------

 code currently executing code
 object

 cbase base of bytecode array for
 the current code object

 pc address of the next
 bytecode to fetch

 sp top of the stack
 fp stack frame for the current
 call

 stkbase bottom stack limit
 stktop top stack limit
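
The branch behavior described above can be sketched in C as follows. The operand layout (two bytes, low byte first) follows getwoperand in Listing One; the function names get_word_operand and branch_if_true are hypothetical, and truth is simplified here to "nonzero integer", whereas Bob also treats nil as false.

```c
/* Read a 16-bit operand stored low byte first, as getwoperand in
   Listing One does. */
unsigned get_word_operand(const unsigned char *pc)
{
    return ((unsigned)pc[1] << 8) | pc[0];
}

/* Return the code offset of the next instruction after a branch-if-true
   whose two-byte operand starts at code[operand_off]. The tested value is
   only examined; nothing is popped from the caller's stack. */
unsigned branch_if_true(const unsigned char *code, unsigned operand_off,
                        long top_of_stack)
{
    if (top_of_stack != 0)
        return get_word_operand(code + operand_off);  /* taken: go to target */
    return operand_off + 2;                           /* not taken: skip operand */
}
```

Because the tested value stays on the stack, the compiler decides separately whether the code at the branch target keeps or discards it.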

In Bob, all member functions are virtual. This means that when a member
function is invoked, the interpreter must determine which implementation of
the member function to invoke. This is done by the SEND opcode, which uses a
selector from the stack (actually, just a string containing the name of the
member function) with the method dictionary associated with the object's class
to determine which member function to use. If the lookup fails, the dictionary
from the base class is examined. This continues, following the base class
chain until either a member function is found or there is no base class. If a
member function is found to correspond to the selector, it replaces the
selector on the stack and control is transferred to the member function, just
as it would have been for a regular function. If no member function is found,
an error is reported and the interpreter aborts.
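
The lookup along the base class chain can be sketched in C like this. The types struct klass and struct method and the function find_method are simplified, hypothetical stand-ins; Bob's own version is the send function in Listing One, which works on its dictionary and VALUE structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for Bob's class and dictionary types. */
typedef int (*method_fn)(void);

struct method {
    const char *selector;           /* member function name */
    method_fn   fn;
};

struct klass {
    const struct klass  *base;      /* NULL when there is no base class */
    const struct method *methods;   /* method dictionary */
    size_t               nmethods;
};

/* Search the class, then each base class in turn; NULL means "no method". */
method_fn find_method(const struct klass *cls, const char *selector)
{
    for (; cls != NULL; cls = cls->base) {
        size_t i;
        for (i = 0; i < cls->nmethods; i++)
            if (strcmp(cls->methods[i].selector, selector) == 0)
                return cls->methods[i].fn;
    }
    return NULL;
}

/* Demo hierarchy: bar derives from foo and overrides count. */
static int foo_count(void) { return 1; }
static int foo_get_a(void) { return 2; }
static int bar_count(void) { return 3; }

static const struct method foo_methods[] = {
    { "count", foo_count }, { "get_a", foo_get_a }
};
static const struct method bar_methods[] = { { "count", bar_count } };

const struct klass foo_class = { NULL, foo_methods, 2 };
const struct klass bar_class = { &foo_class, bar_methods, 1 };
```

Looking up "count" on bar_class finds the override in bar, while "get_a" is not in bar's dictionary and is found one step up the chain in foo.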
Bob supports five basic data types: integers, strings, vectors, objects, and
nil. Internally, the interpreter uses four more types: classes, compiled
bytecode functions, built-in function headers, and variables. Wherever a value
can be stored, a tag indicates the type of value presently stored there. The
structure for Bob values is shown in Example 5.
Example 5: The structure for Bob values

 typedef struct value {
 int v_type; /* data type */
 union { /* value */
 struct class *v_class;
 struct value *v_object;
 struct value *v_vector;
 struct string *v_string;

 struct value *v_bytecode;
 struct dict_entry *v_var;
 int (*v_code) ();
 long v_integer;
 } v;

 } VALUE;
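
As a minimal sketch of how such a tagged structure is used, the following C fragment checks the tag before touching the union. The reduced VALUE layout and the numeric DT_* values are assumptions made for this example (the real constants live in bob.h, which is not reproduced here); the truth rule matches the istrue macro in Listing One.

```c
/* Hypothetical tag values; the real DT_* constants are defined in bob.h. */
enum { DT_NIL, DT_INTEGER, DT_STRING };

/* A reduced version of the value structure, for illustration only. */
typedef struct value {
    int v_type;                     /* data type tag */
    union {
        const char *v_string;
        long        v_integer;
    } v;
} VALUE;

/* Same rule as the istrue macro in Listing One: nil and integer 0 are false,
   everything else is true. */
int value_is_true(const VALUE *x)
{
    if (x->v_type == DT_NIL)
        return 0;
    if (x->v_type == DT_INTEGER && x->v.v_integer == 0)
        return 0;
    return 1;
}

/* Storing a value means updating the tag and the union member together. */
void value_set_integer(VALUE *x, long n)
{
    x->v_type = DT_INTEGER;
    x->v.v_integer = n;
}
```

Note that under this rule an empty string still counts as true, because only the tag and the integer value are examined.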

Objects, vectors, and bytecode objects are all represented by an array of
value structures. In the case of bytecode objects, the first element in the
vector is a pointer to the string of bytecodes for the function, and the rest
are the literals referred to by the bytecode instructions. Objects (instances
of classes) are vectors, where the first element is a pointer to the class object and the
remaining elements are the values of the nonstatic member variables for the
object. Built-in functions are just pointers to the C functions that implement
the built-in function. Variables are pointers to dictionary entries for the
variable. There is a dictionary for global symbols and one for classes. Each
class also has a dictionary for data members and member functions.
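
The object layout just described (slot 0 for the class, the remaining slots for member values) can be sketched in C as follows. The SLOT union, struct klass, and the three functions are hypothetical simplifications; Bob stores full VALUE structures in each slot and accesses them by slot index in the OP_MREF and OP_MSET cases of Listing One.

```c
#include <stdlib.h>

/* Hypothetical, simplified class record: only the member count is kept. */
struct klass {
    int nmembers;                   /* number of nonstatic data members */
};

/* One slot of an object; Bob uses a tagged VALUE here instead. */
typedef union slot {
    const struct klass *cls;        /* slot 0: the object's class */
    long                integer;    /* slots 1..n: member values */
} SLOT;

/* Allocate an object with one slot for the class plus one per member.
   Returns NULL if allocation fails; the caller frees the result. */
SLOT *new_object(const struct klass *cls)
{
    SLOT *obj = calloc((size_t)cls->nmembers + 1, sizeof *obj);
    if (obj != NULL)
        obj[0].cls = cls;
    return obj;
}

/* Member access by slot index; slot numbers start at 1. */
long get_member(const SLOT *obj, int slot_index)
{
    return obj[slot_index].integer;
}

void set_member(SLOT *obj, int slot_index, long n)
{
    obj[slot_index].integer = n;
}
```

With this layout the compiler can resolve a member name to a fixed slot number once, so no name lookup is needed when the member is read or written at run time.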
In addition to the stack, Bob uses a heap to store objects, vectors, and
strings. The current implementation of Bob uses the C heap and the C functions
malloc and free to manage heap space and uses a compacting memory manager.


The Source Code


The Bob bytecode compiler is a fairly straightforward recursive descent
compiler. At the moment, it uses a set of heavily recursive functions to parse
expressions. I intend to replace that with a table-driven expression parser.
The bytecode interpreter (Listing One, page 86) is really just a giant switch
statement with one case for each bytecode.
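
The general shape of a recursive descent compiler that emits stack bytecode can be sketched in C as follows. This is a minimal sketch, not Bob's compiler: the three-rule grammar (single digits, +, *, and parentheses), the opcode names, and every identifier below are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical opcodes for illustration only; not Bob's instruction set. */
enum { OP_HALT, OP_PUSHI, OP_ADD, OP_MUL };

struct compiler {
    const char    *src;             /* next unread source character */
    unsigned char *out;             /* bytecode output buffer */
    size_t         len;             /* bytes emitted so far */
    size_t         cap;             /* capacity of out */
    int            error;           /* set on syntax error or overflow */
};

static void emit(struct compiler *c, unsigned char byte)
{
    if (c->len < c->cap)
        c->out[c->len++] = byte;
    else
        c->error = 1;
}

static void parse_expr(struct compiler *c);

/* factor := digit | '(' expr ')' */
static void parse_factor(struct compiler *c)
{
    if (*c->src >= '0' && *c->src <= '9') {
        emit(c, OP_PUSHI);
        emit(c, (unsigned char)(*c->src++ - '0'));
    } else if (*c->src == '(') {
        c->src++;
        parse_expr(c);
        if (*c->src == ')')
            c->src++;
        else
            c->error = 1;
    } else {
        c->error = 1;
    }
}

/* term := factor { '*' factor } */
static void parse_term(struct compiler *c)
{
    parse_factor(c);
    while (!c->error && *c->src == '*') {
        c->src++;
        parse_factor(c);
        emit(c, OP_MUL);            /* operator goes after both operands */
    }
}

/* expr := term { '+' term } */
static void parse_expr(struct compiler *c)
{
    parse_term(c);
    while (!c->error && *c->src == '+') {
        c->src++;
        parse_term(c);
        emit(c, OP_ADD);
    }
}

/* Compile src into out; returns the number of bytes emitted, or 0 on error. */
size_t compile_expr(const char *src, unsigned char *out, size_t cap)
{
    struct compiler c;
    c.src = src; c.out = out; c.len = 0; c.cap = cap; c.error = 0;
    parse_expr(&c);
    if (*c.src != '\0')
        c.error = 1;                /* trailing input that was not parsed */
    emit(&c, OP_HALT);
    return c.error ? 0 : c.len;
}
```

Each grammar rule becomes one function, and operator precedence falls out of which function calls which. A giant switch statement of the kind described above would then execute the emitted bytes.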
The source code for Bob is too large (more than 3000 lines) to be included in
this issue. Consequently, it's available electronically; see "Availability" on
page 3. I am including significant portions of the code here, though, and I
hope this will give you a taste of the implementation of Bob.


Conclusions


Well, there it is -- a complete, if simple, object-oriented language. I don't
think I'd want to throw away my C or C++ compiler in favor of programming in
Bob, but it could serve as a good basis for building a macro language for an
application program or just as a tool for experimenting with language design
and implementation. It should be fairly easy to extend Bob with more built-in
functions and classes or to build application-specific versions with functions
tailored to your own application. I'm designing a computerized system for
controlling theater lighting and will probably use Bob as a macro facility in
that system. Anyway, have fun playing with Bob and please let me know if you
come up with an interesting application for it.
_YOUR OWN TINY OBJECT-ORIENTED LANGUAGE_
by David Betz


[LISTING ONE]

/* bobint.c - bytecode interpreter */
/*
 Copyright (c) 1991, by David Michael Betz
 All rights reserved
*/

#include <setjmp.h>
#include "bob.h"

#define iszero(x) ((x)->v_type == DT_INTEGER && (x)->v.v_integer == 0)
#define istrue(x) ((x)->v_type != DT_NIL && !iszero(x))

/* global variables */
VALUE *stkbase; /* the runtime stack */
VALUE *stktop; /* the top of the stack */
VALUE *sp; /* the stack pointer */
VALUE *fp; /* the frame pointer */
int trace=0; /* variable to control tracing */

/* external variables */
extern DICTIONARY *symbols;
extern jmp_buf error_trap;

/* local variables */
static unsigned char *cbase; /* the base code address */
static unsigned char *pc; /* the program counter */
static VALUE *code; /* the current code vector */

/* forward declarations */
char *typename();

/* execute - execute a bytecode function */
int execute(name)
 char *name;
{
 DICT_ENTRY *sym;


 /* setup an error trap handler */
 if (setjmp(error_trap) != 0)
 return (FALSE);

 /* lookup the symbol */
 if ((sym = findentry(symbols,name)) == NULL)
 return (FALSE);

 /* dispatch on its data type */
 switch (sym->de_value.v_type) {
 case DT_CODE:
 (*sym->de_value.v.v_code)(0);
 break;
 case DT_BYTECODE:
 interpret(sym->de_value.v.v_bytecode);
 break;
 }
 return (TRUE);
}

/* interpret - interpret bytecode instructions */
int interpret(fcn)
 VALUE *fcn;
{
 register int pcoff,n;
 register VALUE *obj;
 VALUE *topframe,val;
 STRING *s1,*s2,*sn;

 /* initialize */
 sp = fp = stktop;
 cbase = pc = fcn[1].v.v_string->s_data;
 code = fcn;

 /* make a dummy call frame */
 check(4);
 push_bytecode(code);
 push_integer(0);
 push_integer(0);
 push_integer(0);
 fp = topframe = sp;

 /* execute each instruction */
 for (;;) {
 if (trace)
 decode_instruction(code,pc-code[1].v.v_string->s_data);
 switch (*pc++) {
 case OP_CALL:
 n = *pc++;
 switch (sp[n].v_type) {
 case DT_CODE:
 (*sp[n].v.v_code)(n);
 break;
 case DT_BYTECODE:
 check(3);
 code = sp[n].v.v_bytecode;
 push_integer(n);
 push_integer(stktop - fp);

 push_integer(pc - cbase);
 cbase = pc = code[1].v.v_string->s_data;
 fp = sp;
 break;
 default:
 error("Call to non-procedure, Type %s",
 typename(sp[n].v_type));
 return;
 }
 break;
 case OP_RETURN:
 if (fp == topframe) return;
 val = *sp;
 sp = fp;
 pcoff = fp[0].v.v_integer;
 n = fp[2].v.v_integer;
 fp = stktop - fp[1].v.v_integer;
 code = fp[fp[2].v.v_integer+3].v.v_bytecode;
 cbase = code[1].v.v_string->s_data;
 pc = cbase + pcoff;
 sp += n + 3;
 *sp = val;
 break;
 case OP_REF:
 *sp = code[*pc++].v.v_var->de_value;
 break;
 case OP_SET:
 code[*pc++].v.v_var->de_value = *sp;
 break;
 case OP_VREF:
 chktype(0,DT_INTEGER);
 switch (sp[1].v_type) {
 case DT_VECTOR: vectorref(); break;
 case DT_STRING: stringref(); break;
 default: badtype(1,DT_VECTOR); break;
 }
 break;
 case OP_VSET:
 chktype(1,DT_INTEGER);
 switch (sp[2].v_type) {
 case DT_VECTOR: vectorset(); break;
 case DT_STRING: stringset(); break;
 default: badtype(1,DT_VECTOR); break;
 }
 break;
 case OP_MREF:
 obj = fp[fp[2].v.v_integer+2].v.v_object;
 *sp = obj[*pc++];
 break;
 case OP_MSET:
 obj = fp[fp[2].v.v_integer+2].v.v_object;
 obj[*pc++] = *sp;
 break;
 case OP_AREF:
 n = *pc++;
 if (n >= fp[2].v.v_integer)
 error("Too few arguments");
 *sp = fp[n+3];
 break;

 case OP_ASET:
 n = *pc++;
 if (n >= fp[2].v.v_integer)
 error("Too few arguments");
 fp[n+3] = *sp;
 break;
 case OP_TREF:
 n = *pc++;
 *sp = fp[-n-1];
 break;
 case OP_TSET:
 n = *pc++;
 fp[-n-1] = *sp;
 break;
 case OP_TSPACE:
 n = *pc++;
 check(n);
 while (--n >= 0) {
 --sp;
 set_nil(sp);
 }
 break;
 case OP_BRT:
 if (istrue(sp))
 pc = cbase + getwoperand();
 else
 pc += 2;
 break;
 case OP_BRF:
 if (istrue(sp))
 pc += 2;
 else
 pc = cbase + getwoperand();
 break;
 case OP_BR:
 pc = cbase + getwoperand();
 break;
 case OP_NIL:
 set_nil(sp);
 break;
 case OP_PUSH:
 check(1);
 push_integer(FALSE);
 break;
 case OP_NOT:
 if (istrue(sp))
 set_integer(sp,FALSE);
 else
 set_integer(sp,TRUE);
 break;
 case OP_NEG:
 chktype(0,DT_INTEGER);
 sp->v.v_integer = -sp->v.v_integer;
 break;
 case OP_ADD:
 switch (sp[1].v_type) {
 case DT_INTEGER:
 switch (sp[0].v_type) {
 case DT_INTEGER:

 sp[1].v.v_integer += sp->v.v_integer;
 break;
 case DT_STRING:
 s2 = sp[0].v.v_string;
 sn = newstring(1 + s2->s_length);
 sn->s_data[0] = sp[1].v.v_integer;
 memcpy(&sn->s_data[1],
 s2->s_data,
 s2->s_length);
 set_string(&sp[1],sn);
 break;
 default:
 break;
 }
 break;
 case DT_STRING:
 s1 = sp[1].v.v_string;
 switch (sp[0].v_type) {
 case DT_INTEGER:
 sn = newstring(s1->s_length + 1);
 memcpy(sn->s_data,
 s1->s_data,
 s1->s_length);
 sn->s_data[s1->s_length] = sp[0].v.v_integer;
 set_string(&sp[1],sn);
 break;
 case DT_STRING:
 s2 = sp[0].v.v_string;
 sn = newstring(s1->s_length + s2->s_length);
 memcpy(sn->s_data,
 s1->s_data,s1->s_length);
 memcpy(&sn->s_data[s1->s_length],
 s2->s_data,s2->s_length);
 set_string(&sp[1],sn);
 break;
 default:
 break;
 }
 break;
 default:
 badtype(1,DT_VECTOR);
 break;
 }
 ++sp;
 break;
 case OP_SUB:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 sp[1].v.v_integer -= sp->v.v_integer;
 ++sp;
 break;
 case OP_MUL:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 sp[1].v.v_integer *= sp->v.v_integer;
 ++sp;
 break;
 case OP_DIV:
 chktype(0,DT_INTEGER);

 chktype(1,DT_INTEGER);
 if (sp->v.v_integer != 0) {
 int x=sp->v.v_integer;
 sp[1].v.v_integer /= x;
 }
 else
 sp[1].v.v_integer = 0;
 ++sp;
 break;
 case OP_REM:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 if (sp->v.v_integer != 0) {
 int x=sp->v.v_integer;
 sp[1].v.v_integer %= x;
 }
 else
 sp[1].v.v_integer = 0;
 ++sp;
 break;
 case OP_INC:
 chktype(0,DT_INTEGER);
 ++sp->v.v_integer;
 break;
 case OP_DEC:
 chktype(0,DT_INTEGER);
 --sp->v.v_integer;
 break;
 case OP_BAND:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 sp[1].v.v_integer &= sp->v.v_integer;
 ++sp;
 break;
 case OP_BOR:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 sp[1].v.v_integer |= sp->v.v_integer;
 ++sp;
 break;
 case OP_XOR:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 sp[1].v.v_integer ^= sp->v.v_integer;
 ++sp;
 break;
 case OP_BNOT:
 chktype(0,DT_INTEGER);
 sp->v.v_integer = ~sp->v.v_integer;
 break;
 case OP_SHL:
 switch (sp[1].v_type) {
 case DT_INTEGER:
 chktype(0,DT_INTEGER);
 sp[1].v.v_integer <<= sp->v.v_integer;
 break;
 case DT_FILE:
 print1(sp[1].v.v_fp,FALSE,&sp[0]);
 break;

 default:
 break;
 }
 ++sp;
 break;
 case OP_SHR:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 sp[1].v.v_integer >>= sp->v.v_integer;
 ++sp;
 break;
 case OP_LT:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 n = sp[1].v.v_integer < sp->v.v_integer;
 ++sp;
 set_integer(sp,n ? TRUE : FALSE);
 break;
 case OP_LE:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 n = sp[1].v.v_integer <= sp->v.v_integer;
 ++sp;
 set_integer(sp,n ? TRUE : FALSE);
 break;
 case OP_EQ:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 n = sp[1].v.v_integer == sp->v.v_integer;
 ++sp;
 set_integer(sp,n ? TRUE : FALSE);
 break;
 case OP_NE:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 n = sp[1].v.v_integer != sp->v.v_integer;
 ++sp;
 set_integer(sp,n ? TRUE : FALSE);
 break;
 case OP_GE:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 n = sp[1].v.v_integer >= sp->v.v_integer;
 ++sp;
 set_integer(sp,n ? TRUE : FALSE);
 break;
 case OP_GT:
 chktype(0,DT_INTEGER);
 chktype(1,DT_INTEGER);
 n = sp[1].v.v_integer > sp->v.v_integer;
 ++sp;
 set_integer(sp,n ? TRUE : FALSE);
 break;
 case OP_LIT:
 *sp = code[*pc++];
 break;
 case OP_SEND:
 n = *pc++;
 chktype(n,DT_OBJECT);

 send(n);
 break;
 case OP_DUP2:
 check(2);
 sp -= 2;
 *sp = sp[2];
 sp[1] = sp[3];
 break;
 case OP_NEW:
 chktype(0,DT_CLASS);
 set_object(sp,newobject(sp->v.v_class));
 break;
 default:
 info("Bad opcode %02x",pc[-1]);
 break;
 }
 }
}

/* send - send a message to an object */
static send(n)
 int n;
{
 char selector[TKNSIZE+1];
 DICT_ENTRY *de;
 CLASS *class;
 class = sp[n].v.v_object[OB_CLASS].v.v_class;
 getcstring(selector,sizeof(selector),sp[n-1].v.v_string);
 sp[n-1] = sp[n];
 do {
 if ((de = findentry(class->cl_functions,selector)) != NULL) {
 switch (de->de_value.v_type) {
 case DT_CODE:
 (*de->de_value.v.v_code)(n);
 return;
 case DT_BYTECODE:
 check(3);
 code = de->de_value.v.v_bytecode;
 set_bytecode(&sp[n],code);
 push_integer(n);
 push_integer(stktop - fp);
 push_integer(pc - cbase);
 cbase = pc = code[1].v.v_string->s_data;
 fp = sp;
 return;
 default:
 error("Bad method, Selector '%s', Type %d",
 selector,
 de->de_value.v_type);
 }
 }
 } while ((class = class->cl_base) != NULL);
 nomethod(selector);
}

/* vectorref - load a vector element */
static vectorref()
{
 VALUE *vect;

 int i;
 vect = sp[1].v.v_vector;
 i = sp[0].v.v_integer;
 if (i < 0 || i >= vect[0].v.v_integer)
 error("subscript out of bounds");
 sp[1] = vect[i+1];
 ++sp;
}

/* vectorset - set a vector element */
static vectorset()
{
 VALUE *vect;
 int i;
 vect = sp[2].v.v_vector;
 i = sp[1].v.v_integer;
 if (i < 0 || i >= vect[0].v.v_integer)
 error("subscript out of bounds");
 vect[i+1] = sp[2] = *sp;
 sp += 2;
}

/* stringref - load a string element */
static stringref()
{
 STRING *str;
 int i;
 str = sp[1].v.v_string;
 i = sp[0].v.v_integer;
 if (i < 0 || i >= str->s_length)
 error("subscript out of bounds");
 set_integer(&sp[1],str->s_data[i]);
 ++sp;
}

/* stringset - set a string element */
static stringset()
{
 STRING *str;
 int i;
 chktype(0,DT_INTEGER);
 str = sp[2].v.v_string;
 i = sp[1].v.v_integer;
 if (i < 0 || i >= str->s_length)
 error("subscript out of bounds");
 str->s_data[i] = sp[0].v.v_integer;
 set_integer(&sp[2],str->s_data[i]);
 sp += 2;
}

/* getwoperand - get data word */
static int getwoperand()
{
 int b;
 b = *pc++;
 return ((*pc++ << 8) | b);
}

/* type names */

static char *tnames[] = {
"NIL","CLASS","OBJECT","VECTOR","INTEGER","STRING","BYTECODE",
"CODE","VAR","FILE"
};

/* typename - get the name of a type */
static char *typename(type)
 int type;
{
 static char buf[20];
 if (type >= _DTMIN && type <= _DTMAX)
 return (tnames[type]);
 sprintf(buf,"(%d)",type);
 return (buf);
}

/* badtype - report a bad operand type */
badtype(off,type)
 int off,type;
{
 char tn1[20];
 strcpy(tn1,typename(sp[off].v_type));
 info("PC: %04x, Offset %d, Type %s, Expected %s",
 pc-cbase,off,tn1,typename(type));
 error("Bad argument type");
}

/* nomethod - report a failure to find a method for a selector */
static nomethod(selector)
 char *selector;
{
 error("No method for selector '%s'",selector);
}

/* stackover - report a stack overflow error */
stackover()
{
 error("Stack overflow");
}




Example 1:

(a)

 factorial(n)
 {
 return n == 1 ? 1 : n * factorial(n-1);

 }



(b)


 main(; i)

 {
 for (i = 1; i <= 10; ++i)
 print(i," factorial is ",factorial(i),"\n");
 }



Example 2:

(a) A Bob class definition

 class foo
 {
 a,b;
 static last;
 static get_last();
 }


(b)

 foo::foo(aa,bb)
 {
 a = aa; b = bb;
 last = this;
 return this;
 }





Example 3:

(a)
 foo::get_a()
 {
 return a;
 }



(b)

 foo::set_a(aa)
 {
 a = aa;
 }


(c)


 foo::count(; i)
 {
 for (i = a; i <= b; ++i)
 print(i,"\n");
 }


 main(; foo1,foo2)
 {

 foo1 = new foo(1,2); // create an object of class foo
 foo2 = new foo(11,22); // and another
 print("foo1 counting\n"); // ask the first to count
 foo1->count();
 print("foo2 counting\n"); // ask the second to count
 foo2->count();
 }


Example 4:

(a)

 class bar : foo // a class derived from foo
 {
 c;
 }


(b)

 bar::bar(aa,bb,cc)
 {
 this->foo(aa,bb);
 c = cc;
 return this;
 }



Example 5

typedef struct value {
 int v_type; /* data type */
 union { /* value */
 struct class *v_class;
 struct value *v_object;
 struct value *v_vector;
 struct string *v_string;

 struct value *v_bytecode;
 struct dict_entry *v_var;
 int (*v_code)();
 long v_integer;
 } v;
} VALUE;














September, 1991
ADDING AN EXTENSION LANGUAGE TO YOUR SOFTWARE


The little language/application interface




Neville Franks


Neville is the owner of Soft As It Gets, a company specializing in the
development of programmer's tools, including a programmer's editor that uses
the extension language described in this article. Neville can be contacted at
3 Pullman Crt., East St. Kilda Victoria 3183, Australia. Phone +613-523-0557,
fax +613-528-6836.


How often do you wish for a few more thousand bytes of code space, or that you
could avoid yet another full edit/compile/link cycle of your complex
application?
One way of solving problems such as these is by adding a general-purpose
extension language to your software. An extension language is a programming
language whose compiled executable code works without actually being linked
into your software. In some ways, this is similar to Dynamic Link Libraries
(DLLs) in operating systems such as OS/2. Because the linkage step is
bypassed, program development time is dramatically cut. Extension languages also
open up the possibility of end users writing their own programs to extend or
modify your software--all you need to do is clearly document the functions in
your application that are callable from the extension language, and provide
your end users with the documentation and the extension language compiler. The
end user does not need megabytes of disk space for a full-blown compiler, nor
does he need to be concerned with the complexities of such an environment.
There are many areas where extension languages can be very useful. In a data
entry application, for example, it is common to validate fields as they are
entered. In some applications these validation rules can be quite complex and
may even vary slightly from one customer to the next. If you write the rules
using an extension language, they can be changed quickly and easily, even by
the customer if necessary. You could even create a library of programs and
select from it on-the-fly. Programmers demand flexibility and extensibility in
their program editors. A well-designed extension language fits perfectly into
an editor, and offers seamless program addition.
For several reasons, extension languages aren't suited to all programming
areas. Firstly, they usually aren't full-blown languages, so some programming
tasks may not be possible. Secondly, they are interpreted in one way or
another, and are therefore not fast enough for many time-critical areas.
Finally, the executable code is typically loaded from disk on an as-needed
basis; this also takes time.
This article describes the steps to add extension language capabilities to
your software.


Choosing a Language


Like any other computer language, an extension language must be carefully
designed. You need to consider what type of operations your extension language
programs will perform: Are they going to be used in AI-style programs and be
Prolog- or Lisp-based? Are they going to work with strings, bits, and bytes
and be more like C or Pascal? Or are they so unique that they need a
customized language?
There is a lot to be said for basing your language on one programmers are
already familiar and comfortable with. The extension language used as an
example in this article is a subset of C.


The Compiler


C is a fairly minimal language. Some people, in fact, even refer to it as a
high-level assembler. This aside, C is a simple, yet powerful language widely
used in commercial application development. C's minimality lends itself well
to being an extension language. The task of compiler writing is simplified and
the code needed for the runtime environment is not especially overwhelming.
The theory and design of simple C compilers -- in particular Small-C by Ron
Cain and James Hendrix and its many derivatives -- has been covered in detail
over the years. A compiler such as Small-C can be put into use as an extension
language compiler by changing its code generator to one suited for an
extension language environment. This is possible because a compiler performs
the following sequence of events on a program: lexical analysis, parsing, code
generation, and optimization. The main change needed for use with an extension
language is in the area of code generation.
Instead of generating machine code for a particular CPU, you generate code for
an intermediate language, a bit like the early Pascal P-code system. This
intermediate language is then executed by a runtime machine or interpreter.
You could try to execute the extension language source code directly, but this
has its limitations. In particular, the program might not be syntactically
correct, would probably run slowly, and would most likely have a much larger
runtime code overhead.
By compiling into an intermediate form, these problems disappear. Thorough
checking is performed during normal lexical analysis and parsing, and the
intermediate code is much more compact, so it will execute faster than the
original program representation.


Executing Compiled Code


After the source code has been successfully compiled into its intermediate
form, it can be executed using a runtime interpreter. Apart from code to load
executable programs from disk, the interpreter is all you need to add to your
application. All in all, you need less than 2 Kbytes of code space.
The runtime interpreter is implemented as a simple stack machine. The stack is
used to hold function call stack frames, automatic variables, and intermediate
values during expression evaluation. The stack is represented by an int array,
and we assume that a pointer is the same size as an int; see Figure 1. Each
time a function call is made, a new stack frame is added. A stack frame
occupies four stack entries:
RA: the Return Address of the caller
SFL: the Stack Frame Link that links stack frames together
NA: the Number of Arguments passed to the called function
ES: the pointer to the environment's Expression Stack
The design of the intermediate language is fairly straightforward. You start
by working out what sort of instructions are needed -- in other words, the
machine's instruction set. If the instruction set mimics some of the
higher-level extension language semantics, then the compiler's code generator
will be easier to design and implement. Execution speed is an important
consideration, and with careful design of the intermediate language, a
reasonably fast interpreter can be constructed. The intermediate language
instruction set for the C extension language falls into the broad categories
listed in Table 1. These instructions fit in well with C and are easy to
relate during code generation. If you wanted the interpreter to work with
different languages, this close cooperation would most likely be a hindrance
and a more generic instruction set should be chosen instead.
Table 1: The intermediate language instruction set categories for a C
extension language

 Load stack from memory
 Load memory from stack
 Load stack with memory address
 Load stack with immediate value

 Increment and decrement instructions


 Unconditional and Conditional Jumps

 Call and Return from function
 Call external function

 Logical instructions such as:
 Not, And, Or, Xor

 Relational Instructions such as:
 Gt, Lt, Eq, Ne

 Arithmetic instructions such as:
 Add, Sub, Mult, Div, Mod, Shift

 Bitwise instructions such as:
 And, Or, Xor



Stack Machine Operation


The stack machine is passed a pointer to a block of memory containing the
compiled executable code. This block of memory is seen as a stream of bytes
containing instructions and their operands. In A Little Smalltalk, Timothy
Budd calls these "bytecodes." This book provides detailed information for a
Smalltalk interpreter written in C and is recommended reading. Similarities
also exist with the threaded code used in Forth interpreters.
To make the intermediate code as compact as possible, instructions vary in
length from 1 byte for RET and SUB to 4 bytes for a CALL. Most instructions
are either 1 or 3 bytes. Using varying-sized instructions complicates movement
of the instruction pointer, but this is warranted, considering the space
savings obtained.
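As a rough sketch of what stepping through varying-sized instructions involves, the fragment below walks a byte stream by looking up each instruction's length. The opcode values, the LODI length, the layout of the CALL operands, and the low-byte-first operand order are all assumptions made for illustration; the article gives only the 1-byte and 4-byte extremes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode values; the article names RET, SUB and CALL but
   does not give their encodings. */
enum { OP_RET = 0, OP_SUB = 1, OP_LODI = 2, OP_CALL = 3 };

/* Total length in bytes of the instruction that starts at code[0]. */
size_t instr_length(const unsigned char *code)
{
    switch (code[0]) {
    case OP_RET:
    case OP_SUB:  return 1;  /* opcode only */
    case OP_LODI: return 3;  /* opcode + 16-bit immediate (assumed) */
    case OP_CALL: return 4;  /* opcode + 16-bit address + argument count (assumed) */
    default:      assert(!"unknown opcode"); return 1;
    }
}

/* Fetch a 16-bit operand stored low byte first (an assumed byte order). */
int fetch16(const unsigned char *p)
{
    return (int)(p[0] | (p[1] << 8));
}

/* Count the instructions in a block by advancing past each one in turn. */
int count_instructions(const unsigned char *code, size_t size)
{
    size_t ip = 0;
    int n = 0;

    while (ip < size) {
        ip += instr_length(code + ip);
        ++n;
    }
    return n;
}
```

The extra lookup on every step is the complication mentioned above; in exchange, the common 1-byte instructions carry no padding.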
The interpreter has four registers: an Instruction Pointer (IP), a Stack
Pointer (SP), a Base Pointer (BP), and an Expression Stack pointer (ES).
The instruction pointer moves sequentially through the code until a Jump,
Call, or Ret instruction is reached. After changing to its new location, this
sequential process continues once more. When the instruction pointer finally
gets back to its starting point, program execution ends. The other three
registers are used as pointers into the stack.
The Base Pointer register points to the end of the current stack frame and is
used to access variables local to the current function. It is also used by the
Ret instruction. When a Ret instruction is reached, the sequence in Figure
2(a) occurs. For a Call, the sequence in Figure 2(b) takes place.
Figure 2: (a) When a Ret instruction is reached, this sequence occurs; (b) the
sequence for a Call instruction.

 (a) sp = bp - size of stack frame;
     bp = sp[ stack frame link ];
     es = sp[ expression stack ];
     ip = sp[ return address ];
     sp = sp - sp[ argument count ];

 (b) sp[ return address ] = ip;
     sp[ stack frame link ] = bp;
     sp[ expression stack ] = es;
     sp[ argument count ] = arg_count;
     sp = sp + size of stack frame;
     bp = es = sp;
     ip = function address;

The argument count is determined at compile time and included as part of the
Call instruction. Prior to performing a call, arguments to be passed to the
called function are pushed onto the stack as part of the normal expression
evaluation. The called function then has access to these arguments by using
negative offsets from the base pointer, taking the stack frame size into
account.
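The Call and Ret sequences of Figure 2 might be written out in C roughly as follows. This is a sketch under stated assumptions, not the article's actual source: the registers are kept as array indices rather than pointers, the frame slots are numbered 1 to 4 on the assumption that sp points at the topmost used entry, and the names vm_call, vm_ret, and vm_arg are invented for the example.

```c
#include <assert.h>

#define STACK_SIZE 256
#define FRAME_SIZE 4    /* RA, SFL, NA, ES */

/* Slot offsets relative to sp at the moment of the call. They start at 1
   because sp is assumed to point at the topmost used entry (the last
   argument pushed). */
enum { RA = 1, SFL = 2, NA = 3, ES = 4 };

struct vm {
    int stack[STACK_SIZE];
    int ip, sp, bp, es;     /* indices rather than raw pointers */
};

/* Figure 2(b): build a frame above the arguments and enter the function. */
void vm_call(struct vm *m, int func_addr, int arg_count)
{
    assert(m->sp + FRAME_SIZE < STACK_SIZE);
    m->stack[m->sp + RA]  = m->ip;
    m->stack[m->sp + SFL] = m->bp;
    m->stack[m->sp + ES]  = m->es;
    m->stack[m->sp + NA]  = arg_count;
    m->sp += FRAME_SIZE;
    m->bp = m->es = m->sp;
    m->ip = func_addr;
}

/* Figure 2(a): restore the caller's registers and discard the arguments. */
void vm_ret(struct vm *m)
{
    m->sp = m->bp - FRAME_SIZE;
    m->bp = m->stack[m->sp + SFL];
    m->es = m->stack[m->sp + ES];
    m->ip = m->stack[m->sp + RA];
    m->sp -= m->stack[m->sp + NA];
}

/* Argument i (0-based) lies below the frame, at a negative offset from bp. */
int vm_arg(const struct vm *m, int i)
{
    int n = m->stack[m->bp - FRAME_SIZE + NA];

    return m->stack[m->bp - FRAME_SIZE - (n - 1 - i)];
}
```

Because the argument count is stored in the frame, vm_ret can discard the arguments without the caller's help, and vm_arg can locate the first argument however many were passed.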
The last of the registers is the expression stack pointer which points to the
base of the expression stack for the current function. The expression stack
starts directly after the area reserved for local variables and is used to
hold intermediate expression results; see Figure 3.
The interpreter executes the code by moving the instruction pointer through
it, as just mentioned. Each instruction is decoded by case statements inside a
big switch. The sample code in Figure 4 gives you the general idea. In the
interest of simplicity, the code fragment assumes fixed length instructions.
This example clearly shows the use of the ip, bp, and sp registers. The es
register is not directly used by any instructions, except the Call and Ret
instructions as shown earlier. Its primary purpose is for resetting the stack
pointer back to the base of the expression stack once an expression has been
completely evaluated.
Figure 4: The interpreter executes the code by moving the instruction pointer
through it. Each instruction is decoded by case statements inside a big
switch.

 switch( ip->instruction )
 {
     case LODI :     /* load immediate -> ++sp */
         *++sp = ip->immed_val;
         ++ip;
         break;

     case STOR :     /* store sp -> local variable */
         *(bp + ip->addr) = *sp;
         ++ip;
         break;

     case OPR_ADD :  /* add values on stack */
         --sp;
         *sp += *( sp + 1 );
         ++ip;
         break;

     case JPT :      /* conditional jump - true */
         if ( *sp == 0 )
         {
             ++ip;   /* next instr */
             break;
         }           /* note: fall through */

     case JMP :      /* unconditional jump */
         ip = ip->addr;
         break;

     case ....
 }
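To see the fragment in Figure 4 run, it can be wrapped in a loop and given a way to stop. The version below is a minimal sketch rather than the real interpreter: the HALT opcode, the struct instr layout, the run() function, and the use of instruction indices as jump targets are all assumptions added for the example.

```c
#include <assert.h>

/* HALT is a hypothetical addition so that the sketch can stop. */
enum opcode { LODI, STOR, OPR_ADD, JPT, JMP, HALT };

/* Fixed-length instruction: an opcode plus one operand. */
struct instr {
    enum opcode op;
    int arg;    /* immediate value, local offset, or jump target index */
};

/* Execute code until HALT and return the value on top of the stack.
   Locals occupy stack[1] .. stack[nlocals]; the expression stack grows
   above them. */
int run(const struct instr *code, int *stack, int nlocals)
{
    const struct instr *ip = code;
    int *bp = stack;
    int *sp = stack + nlocals;

    for (;;) {
        switch (ip->op) {
        case LODI:      /* load immediate -> ++sp */
            *++sp = ip->arg;
            ++ip;
            break;
        case STOR:      /* store top of stack -> local variable */
            *(bp + ip->arg) = *sp;
            ++ip;
            break;
        case OPR_ADD:   /* add the two topmost values */
            --sp;
            *sp += *(sp + 1);
            ++ip;
            break;
        case JPT:       /* jump if top of stack is non-zero */
            if (*sp == 0) {
                ++ip;
                break;
            }
            /* fall through */
        case JMP:       /* unconditional jump */
            ip = code + ip->arg;
            break;
        case HALT:
            return *sp;
        }
    }
}
```

A seven-instruction program (load 23, load 69, add, store to local 1, jump-if-true past a "load -1", halt) leaves the sum in local 1 and on top of the stack.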

Apart from local or automatic variables, global variables are also supported.
The executable program has a header that carries information such as the size
of the global data area, as well as an image of the global data itself. Memory
is allocated on the heap, and the global data is copied into it. The heap
address is subsequently passed on to the interpreter. Global data includes
string constants as well as globally declared variables.
During compilation, variables are determined to be either local or global and
the appropriate instructions are generated. At runtime, global variables are
accessed much like local variables by using an offset from the base of the
global data area, instead of the base pointer register. The offset is encoded
into the instruction at compile time.
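The two addressing schemes can be illustrated with a few lines of C. The header layout and the function names here are hypothetical; the article says only that the header records the size of the global data area and carries an image of it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical program header. */
struct pgm_header {
    int global_size;           /* number of ints in the global area */
    const int *global_image;   /* initial values to copy */
};

/* Allocate the global data area on the heap and copy the image into it.
   Returns NULL if the allocation fails. */
int *load_globals(const struct pgm_header *h)
{
    int *g = malloc((size_t)h->global_size * sizeof *g);

    if (g != NULL)
        memcpy(g, h->global_image, (size_t)h->global_size * sizeof *g);
    return g;
}

/* A local is addressed relative to bp; a global is addressed relative to
   the base of the global data area. The offset comes from the instruction. */
int load_local(const int *bp, int offset)
{
    return bp[offset];
}

int load_global(const int *globals, int offset)
{
    return globals[offset];
}
```

Copying the image rather than using it in place means each run of a program starts with freshly initialized globals.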


Calling External Functions (Accessing Application Program Functions)


An important feature is the ability of the extension language to call not just
functions within the program itself, but external functions residing in the
host application program. This gives extension language programs access to
each and every function in an application program.
This capability is accomplished by providing both the extension language
compiler and the interpreter with access to the application's symbol table.
The symbol table is produced during linking and contains the name of every
function, along with its address. In Microsoft-compatible languages, the
symbol table is in a file with the extension .MAP.
During compilation, the symbol table is used to resolve the addresses of
functions that are not in the program itself or have not yet been encountered
(forward references). The symbol table (map file) is scanned, and if the
function name is found, its address is returned. The compiler then generates
the code for an external function call based on this address. The compiler
also checks for name clashes between external functions and functions local to
the program. If the same name is found in both areas an error is produced.
If you wish to restrict the range of functions accessible by extension
language programs, simply strip down the .MAP file accordingly.
During program execution, the interpreter handles these external calls by
pushing the parameters from the interpreter stack onto the application
program's stack. It then does an indirect function call using the address
obtained during compilation as a function pointer.
When the external function returns, its return value is placed on the
interpreter stack for use as required.
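The lookup and the indirect call can be sketched in portable C as shown below. This is a simplification under stated assumptions: the symbol table is an in-memory array rather than a parsed .MAP file, arguments are passed by switching on the argument count instead of being pushed directly onto the application's stack, and host_add and host_neg are invented stand-ins for application functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);

/* One symbol table entry: a function name and its address. */
struct symbol {
    const char *name;
    generic_fn addr;
};

/* Stand-ins for functions that live in the host application. */
int host_add(int a, int b) { return a + b; }
int host_neg(int a)        { return -a; }

/* Scan the table for name; returns NULL if the function is not listed. */
const struct symbol *find_symbol(const struct symbol *tab, size_t n,
                                 const char *name)
{
    size_t i;

    for (i = 0; i < n; ++i)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Call fn with nargs int arguments taken from the interpreter stack.
   The result would then be pushed back onto the interpreter stack. */
int call_external(generic_fn fn, const int *args, int nargs)
{
    switch (nargs) {
    case 0:  return ((int (*)(void))fn)();
    case 1:  return ((int (*)(int))fn)(args[0]);
    case 2:  return ((int (*)(int, int))fn)(args[0], args[1]);
    default: assert(!"too many arguments for this sketch"); return 0;
    }
}
```

Leaving a function out of the table has the same effect as stripping it from the .MAP file: find_symbol() fails and the compiler can report the call as an error.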


Programs Can Call Each Other


The interpreter not only allows the use of recursion within a program, but
also allows interpreter programs to call other interpreter programs. A program
simply calls an external routine which loads the new program from disk, and
then calls the interpreter all over again to start it running. This results in
nested invocations of the interpreter. When the current program finishes
executing, the interpreter returns to the loader routine, which in turn
returns to the previous invocation of the interpreter; see Figure 5.
Figure 5: When the current program finishes executing, the interpreter returns
to the loader routine, which in turn returns to the previous invocation of the
interpreter.

 interpreter(code, global_data, stack)
 
 -->external call to program loader
 
 -->interpreter(code, global_data, stack)

Parameters can be passed to the nested interpreter, just as they can be passed
to the original invocation. Furthermore, they can be passed either by value or
by reference. Passing by reference gives the new program access to variables
on the caller's stack, or the global data area.


Calling Extension Language Programs


As described here, extension language programs have a single entry point
similar to main() in normal C programs. There is a difference, however, in
that you can pass a variable number and type of arguments to a program.
The entry point is named main(), and it is up to the caller and callee to make
sure the arguments match in type and number. Included in the runtime system is
a program loader, which loads a specified program from disk and then calls the
interpreter to run it. The C function prototype for the loader is: pgm_run
(char *prog_name, int num_parms, void *parms[]); where prog_name is the name
of the compiled program to run; num_parms is the number of parameters being
passed to it; and parms[] is a combination of pointers to parameters and
possibly parameters themselves.
Let's say you have an extension language program that adds up a set of
numbers, and you want to run it from your application. The application code
would look like the code in Figure 6(a), while the extension language
program would look like Figure 6(b). This simple example demonstrates
parameters being passed by reference and by value.
Figure 6: (a) Sample application code for running an extension language
program; (b) sample extension language program executed from an application
program.


 (a) add_up_numbers()
     {
         void *parms[4];
         int result;

         parms[0] = &result;    /* result passed by reference */
                                /* numbers passed by value; the casts assume
                                   an int fits in a pointer */
         parms[1] = (void *)23; parms[2] = (void *)69; parms[3] = (void *)-78;
         pgm_run( "add_num", 4, parms );
         printf( "Result = %d", result );
     }

 (b) main( result, num1, num2, num3 )
     int *result, num1, num2, num3;
     {
         *result = num1 + num2 + num3;
     }

The num_parms parameter is used by the interpreter to push the correct number
of arguments from the parms array onto the interpreter stack before calling
main().
The interpreter has a function that allows any extension language function,
including main(), to find out exactly how many parameters it was passed. This
is particularly useful in main(), allowing different parts of the application
to call the same extension language program with different numbers of
arguments.
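One way to picture a program that accepts a varying number of arguments is the native C stand-in below. It is only an illustration: add_num_main plays the part of the extension program's main(), the parameter count is passed in explicitly rather than obtained from the interpreter, and the intptr_t conversions are an assumed way of packing a small integer into a void * slot (the round trip is implementation-defined, though it behaves as expected on common platforms).

```c
#include <assert.h>
#include <stdint.h>

/* Pack a small integer into a pointer-sized slot, and unpack it again. */
void *parm_from_int(int v)
{
    return (void *)(intptr_t)v;
}

int parm_to_int(void *p)
{
    return (int)(intptr_t)p;
}

/* Stand-in for the extension program's main(): slot 0 is a pointer to the
   result (passed by reference); every remaining slot is a value. */
void add_num_main(void *parms[], int num_parms)
{
    int *result = parms[0];
    int sum = 0;
    int i;

    assert(num_parms >= 1);
    for (i = 1; i < num_parms; ++i)
        sum += parm_to_int(parms[i]);
    *result = sum;
}
```

Because the loop is driven by the parameter count, the same routine serves a caller passing three numbers and another passing one.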
The example above is simple in the extreme; in practice, considerably more
complex programs are used. It is not uncommon to write programs in excess of
1000 lines of code.


What's Missing?


So far, the compiler supports a fairly complete subset of the C language.
Elements that are missing include structures, unions, bit fields, typedef,
casts, enum, extern, static, unsigned types, and data types float, double, and
long. For an extension language, I don't think their loss incurs any great
deficiencies. Several of these could be added if necessary, but of course they
would make the compiler and interpreter larger and more complex. One of the
aims of this project was to keep it as simple as possible, yet provide enough
power to handle most tasks.


Possible Extensions


It is possible to virtualize the stack machine so programs run in virtual
memory. In a virtual memory environment, code is paged in and out of real
memory as necessary, enabling quite large and complex programs to be executed.
Virtual memory, along with a least recently used caching system, allows many
programs to be loaded concurrently and call each other with minimum overhead.


References


Aho, Alfred V. and Jeffrey D. Ullman. Principles of Compiler Design. First
edition. Reading, Mass.: Addison-Wesley, 1977.
Betz, David. "Embedded Languages." Byte (November, 1988).
Bornat, Richard. Understanding and Writing Compilers. Indianapolis, Ind.:
Macmillan, 1985.
Brinch Hansen, Per. Brinch Hansen on Pascal Compiler. Englewood Cliffs, N.J.:
Prentice Hall, 1985.
Budd, Timothy. A Little Smalltalk. Reading, Mass.: Addison-Wesley, 1987.
Darnell, Peter A. and Philip E. Margolis. Software Engineering In C. New York,
N.Y.: Springer-Verlag, 1988.
Hendrix, James E. The Small-C Handbook. Englewood Cliffs, N.J.: Prentice Hall,
1984.
Schildt, Herb. "Building Your Own C Interpreter." Dr. Dobb's Journal (August,
1989).
Tremblay, J. and P.G. Sorenson. The Theory and Practice of Compiler Writing.
New York, N.Y.: McGraw-Hill, 1985.
Wirth, Niklaus. Algorithms + Data Structure=Programs. Englewood Cliffs, N.J.:
Prentice Hall, 1976.
_ADDING AN EXTENSION LANGUAGE TO YOUR SOFTWARE_
by Neville Franks




September, 1991
PORTING UNIX TO THE 386: THE BASIC KERNEL


Multiprogramming and Multitasking, Part One




William Frederick Jolitz and Lynne Greer Jolitz


Bill was the principal developer of 2.8 and 2.9BSD and was the chief architect
of National Semiconductor's GENIX project, the first virtual memory
micro-processor-based UNIX system. Prior to establishing TeleMuse, a market
research firm, Lynne was vice president of marketing at Symmetric Computer
Systems. They conduct seminars on BSD, ISDN, and TCP/IP. Send e-mail questions
or comments to lynne@berkeley.edu. (c) 1991 TeleMuse.


Last month, we began our preliminary discussion of the main() procedure in the
386BSD kernel. This procedure is critical to UNIX, and we will be referring to
it again and again as we introduce more functionality in our 386BSD kernel and
incrementally turn on all of the kernel's internal services. We examined the
validation and debugging of kernel functions up to executing the main()
procedure, processor initialization, turning on paging, and sizing memory,
among other things. We then went on to create our initial page fault and
context generation for the scheduler, paging, and first user processes. In
other words, by building much of the main() procedure, we also assembled the
framework for the next outer layer of our UNIX operating system -- which, in
turn, will initialize higher-level portions of the kernel.
This month and next, we turn to a seemingly unrelated area: multiprogramming
and multitasking. Their relevance lies in the fact that, beginning with this stage of
our port, the code grows more complex as more items interact with each other;
as such, we are often working simultaneously on many key portions. This is one
reason why ports are often begun with great hopes and aspirations, but
abandoned as the complexity increases.
Multiprogramming and multitasking, two important elements which help make UNIX
"UNIX," are also areas of the basic design that the creators did particularly
well; hence, they are instructive to anyone interested in operating systems
design. Many other popular operating systems have only recently allowed for
multiprogramming -- at great cost and incompletely -- even though it was
"planned for" in their early designs.
This month, we will reexamine our understanding of the conventions--the style
of programming portions of the operating system--inherent in the design of
UNIX. These conventions are the conceptual framework upon which a
multiprogramming operating system is built. Once these conventions are
understood, the ease and simplicity of the UNIX multiprogramming environment
contrasts markedly with the more convoluted attempts at multiprogramming in
later operating systems.
Next month we'll examine some actual code; in particular, sleep(), wakeup(),
and swtch(), and how they create the illusion of multiple, simultaneous
process execution on a single processor. We'll discuss some of the requirements
for the extensions needed to support multiprocessor and multithreaded
operation in the monolithic 386BSD kernel and reexamine the multiprogramming
attempts of some other operating systems in light of what we have learned.


What is Multiprogramming?


Occasionally, an area is so misunderstood or shrouded in mysticism that the
simple and elegant explanation is ignored in favor of an obtuse or complicated
one. Multiprogramming is so elemental to the design of any operating system
that ignorance of its development and structure precludes any real
understanding of UNIX. It is a concept which appears obvious to the typical
user (and for good reason): Through multiprogramming we can leverage or work
on several tasks at once. Obviously, concurrently running several editors,
formatters, and output devices would be as important to a writer as
simultaneously running a compiler, debugger, and editor is to a programmer.
The term "multiprogramming" implies multiple applications active at any time.
In other words, it is the effect we can see of having access to and working
with these programs. A UNIX system generally allows a fair number of
simultaneous applications programs to be present at a given time, a
convenience to which the average UNIX user quickly becomes accustomed. (While
we wrote this article, there were four editor processes, two command
processors or shells -- one of which was on a different host on the network --
and a contact management program present. This is a typical level of activity
for one or two people doing a little writing.) While the computer can cause
them all to appear active in a wink of an eye, we can't use them nearly as
fast, so we just hop between programs, calling more as needed.


Attempts at Multiprogramming: MS-DOS and Finder


UNIX users become so spoiled with the ability to multiprogram that they don't
fare well when migrated back to a non-multiprogramming environment such as
MS-DOS and Finder. Not surprisingly, later versions of these have attempted to
compensate for this lack with limited extensions such as line printer spooling
(via a TSR in MS-DOS for example). These extensions are by no means perfect,
however. For example, certain applications programs invoked from within nested
command processors invariably fail, sometimes with catastrophic results.
Windows 3.0 for the 386 attempts to increase multiprogramming capabilities
beyond earlier MS-DOS extensions by running a protected-mode,
preemptive-scheduling operating system with multiple MS-DOS partitions. It
gets around the problem by simulating a separate complete MS-DOS environment
for each task using the virtual 8086 mode on the processor itself. In other
words, it effectively places a hard shell around each MS-DOS "task," as if it
were running on a separate processor.
Internally, the software looks somewhat like UNIX. Through the 8086 mode
feature of the 386 processor, Windows/386 can leverage some UNIX approaches to
multiprogramming. The MS-DOS sessions end up running in "almost" real mode
while the rest of the operating system, like 386BSD, runs in protected mode in
a completely different part of the system. Even some of the bugs encountered
(and fixed) in porting 386BSD to the PC are similar to those in extant
protected-mode systems like Windows/386.
On the Macintosh, applications can be launched and switched, but again the
focus is more intraapplication than interapplication. System 7.0 with
Multifinder is supposed to deal with this area more comprehensively. (We shall
see.)
The question as to why multiprogramming was more difficult to achieve in
MS-DOS and Finder on the Macintosh, for example, can be turned around and
phrased in a more direct manner: "Where in UNIX is the concept of
multiprogramming implemented (if there is such a place), and what elements of
multiprogramming are missing from other operating systems that made extensions
for multiprogramming more difficult later on?"
In UNIX, the concept of multiprogramming was extant from the beginning, so
let's examine what the designers of UNIX did right.


Conventions and Definitions


To the user, multiprogramming is the ability to use and access many programs
at once. Organick, however, defines the term more formally: "Systems for 'passing
the processor around' among several processes, so as to prevent the idling of
a CPU during I/O waits or other delays, are known as multiprogramming
systems." While utilitarian and concise, this definition yields little insight
to the layman.
To make the concept of multiprogramming more obvious, we must review some
fundamental terms: processes and tasks, context switching, preemption and
multitasking, and time slicing. We'll also contrast our understanding of these
terms with those of other designers who helped develop the functional
definitions.


Processes and Tasks


Programming on UNIX is predicated upon the existence of processes. Dennis and
Van Horn's is a good basic definition: "A process is a locus of control within
an instruction sequence. That is, a process is that abstract entity which
moves through the instructions of a procedure as the procedure is executed by
a processor." This definition describes a dynamic entity practically
"swimming" through the code but does not tie us down by saying exactly what a
process consists of or how it works through the code.
Organick gives a more functional definition, inherited from the precursor to
UNIX, MULTICS: "Process (Lay Definition): A set of related procedures and data
undergoing execution and manipulation, respectively, by one of possibly
several processors of a computer." Again, our definition is not "absolute." We
run into this problem with a few other terms, in particular, lightweight
processes and threads. (See the sidebar entitled "Brief Notes: Lightweight
Processes and Threads.") Modern UNIX systems utilize lightweight processes, as
opposed to the heavyweight, let's-do-everything MULTICS processes.
Processes, by their nature, are isolated entities. Tasks, on the other hand,
share resources such as variables and memory. A task can be a subset of a
process (such as a thread) or live outside of a process (in a portion of
memory on the system, for example). A task is really just some register state
and a bag of memory that can be accessed by other tasks. Keep in mind that
this concept is more primitive than that of a process.
Tasks are "well-behaved" when they process the event that activated them
without locking out other cooperating tasks (that is, not doing time-consuming
operations) and when they relinquish execution to give other lower-priority
tasks a chance to run. Tasks must be careful to arbitrate for shared resources
before they use them.
Real-time tasks perform many operations usually done within the unseen
internals of a time-sharing system. Thus, the transparency so touted in a
timesharing system hinders the external visibility needed for real-time
operation. (See the sidebar entitled "Brief Notes: Is UNIX Real-Time Enough?")



Context Switching


The actual operation of going from one process to another is called a context
switch. When this occurs, the state information of the computer's processor
(registers, mode, memory mapping information, coprocessor registers, and so
on) or "context," is saved away in a location where it can be later restored.
Then, a new process which must be run is found. Finally, the state information
of the new process must be loaded and run.
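The three steps can be mimicked in ordinary C if we treat the processor state as a plain structure. This is a toy model, not kernel code: a real switch saves and restores actual machine registers in assembly language, and the structure fields, NPROC, and context_switch() below are invented for illustration.

```c
#include <assert.h>

#define NPROC 4

/* Hypothetical, much-reduced processor state. */
struct context {
    int pc;         /* program counter */
    int sp;         /* stack pointer */
    int regs[4];    /* general registers */
};

struct cpu {
    struct context live;    /* what the processor currently holds */
    int current;            /* index of the running process */
};

/* Switch from the running process to process `next`. */
void context_switch(struct cpu *cpu, struct context saved[], int next)
{
    assert(next >= 0 && next < NPROC);
    saved[cpu->current] = cpu->live;    /* 1. save the outgoing state */
    cpu->current = next;                /* 2. the process chosen to run */
    cpu->live = saved[next];            /* 3. load the incoming state */
}
```

Even in this toy, the cost grows with the size of struct context, which is the point made below about processors with many registers.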
The word "context" is used to refer to different, but related things. When a
processor gets an interrupt, or a procedure is called, the state information
of "what the processor was doing before" is always recorded (somewhere -- it
depends on the computer and system). This might be termed an "interrupt
context switch" or a "procedure context switch." These are different from a
process (or task) context switch.
A process context switch can be a costly operation, especially if a processor
has a large number of registers. The additional overhead of this operation is
just another unwelcome burden to a nonmultitasking operating system -- the
price for doing n things at once. This overhead is created in proportion to
how much multitasking is done at a time. Remember, many active processes
result in many context switches.


Preemption and Multitasking


Simple multitasking systems can be written that run one task at a time and
switch to another task when idling. These systems are nonpreemptible, because
a process is not allowed to preempt or run ahead of the currently active
process. This mechanism does not allow for much flexibility. Early examples of
nonpreemptible multitasking systems included MPM and UCSD Pascal.
MS-DOS, in contrast, is a single-tasking system. To illustrate the difference,
let's assume we are running a program, such as a number cruncher, on both our
nonpreemptible multitasking system and our single-tasking MS-DOS system. If
the program allows for no interruptions, it will run to completion on both
systems. Let's assume, however, that the programmer who wrote this program
inserted a request for input midway through the program. At the point the
request for input appears and the computation is stopped, we can "put aside"
the nonpreemptible multitasking system's program, run another task, and then
return to input the data and continue the program. On our single-tasking
MS-DOS system, however, we cannot put aside the program midway through; we
must either input the requested data and complete the computation or abort it.
A preemptible multitasking system is far more interesting for our purposes.
Unlike our nonpreemptible multitasking system, we can not only run one task at
a time and then switch to another task when idling, but one task can also
preempt automatically, without manual intervention. This concept is quite
powerful in practice, but adds its own complications, as we shall see.
Obviously, preemptive systems must have a way of grabbing control and
preempting the current process. Therefore, preemption mechanisms are either
"immediate" or "casual" in action, depending on the amount of time available
to activate a process to run. In a real-time system, an immediate guaranteed
response time is crucial; with a time sharing system, even a ponderous one
fourth of a second (approximately 500,000 386 instructions) response time is
unnoticeable.
Preemption adds two major costs: First, it implies more context switches
(hence, increased overhead); and second, nonpreemptible sections must be
carefully coded to remain nonpreemptible. Additional code is usually required
to surround nonpreemptible sections and when contending for shared resources
that may become active. (These nonpreemptible sections of code are called
"critical" because they must execute without preemptions, else the integrity
of the system is impacted.) These costs can increase, depending on the degree
of preemption allowed.
Unlike a real-time system, UNIX possesses no ability to preempt a higher
priority process when the kernel is already processing something that cannot
be "blocked." "Wakeup" calls can schedule a higher priority process that wants
to be run, but up to an entire rescheduling clock tick can pass before this
shuffling of the schedule is noticed.


Time Slicing


An important consideration with timesharing systems in general is to ensure
that each process receives a period of time to run on the processor when it is
required, and that, when many processes are ready, the system's scheduler
"round-robins" among them and tenders the appropriate time to each. With our
preemptible multitasking system, we can afford to give processes a period of
activity, called a timeslice, to switch them back and forth. (A UNIX timeslice
has a lifetime of one-tenth of a second.) A process can run up to the lifetime
of a timeslice unless some other process intrudes (preempts). In contrast, a
real-time system will only activate a well-behaved task per a given event.
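As a rough illustration of the round-robin idea, the following sketch simulates a scheduler handing out timeslices in turn. It is not 386BSD code: the proc structure, the run_round_robin() name, and the 10-tick timeslice (standing in for one-tenth of a second at 100 clock ticks per second) are assumptions made for the example, and blocking, priorities, and preemption by other processes are ignored.

```c
#include <stddef.h>

#define TIMESLICE_TICKS 10  /* assumed: 10 clock ticks = one-tenth of a second */

struct proc {
    int pid;
    int remaining;    /* clock ticks of work still needed */
    int finished_at;  /* tick at which the process completed, 0 if not yet */
};

/* Give each ready process one timeslice in turn until all are done.
   Returns the total number of ticks consumed. */
int run_round_robin(struct proc *tab, size_t n)
{
    int clock = 0;
    size_t left = 0;

    for (size_t i = 0; i < n; i++)
        if (tab[i].remaining > 0)
            left++;

    for (size_t i = 0; left > 0; i = (i + 1) % n) {
        struct proc *p = &tab[i];
        if (p->remaining <= 0)
            continue;               /* already finished; skip it */
        int run = p->remaining < TIMESLICE_TICKS ? p->remaining
                                                 : TIMESLICE_TICKS;
        clock += run;               /* process uses all or part of its slice */
        p->remaining -= run;
        if (p->remaining == 0) {
            p->finished_at = clock;
            left--;
        }
    }
    return clock;
}
```

With three processes needing 25, 5, and 12 ticks, the 5-tick job completes at tick 15 instead of waiting behind the 25-tick job, which is the effect time slicing is meant to achieve.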
The system's scheduler determines the rules detailing which process is allowed
to run. Usually, the policies the scheduler uses to manage resources
(processor time, RAM, I/O bandwidth, preferential use) have a root goal in
mind. With UNIX, the concept of "fairness" is invoked, in that processes from
multiple users are allowed to compete for resources on an even basis. While
this is appropriate for a time-sharing system, a real-time system would
require us to have an intentionally "unfair" scheduling policy in place -- one
that would award resources to a task solely on the basis of its runtime
priority and event occurrence.


UNIX Organization for Multiprogramming


From the very beginning, UNIX was conceived of as a "preemptible multitasking
system," on top of which lightweight processes are built. Preemptible
multitasking occurs at the internal kernel level of the system and its
mechanics are transparent to the user. Multiprogramming is built upon these
mechanisms and is what is observed at the user level. It is more concerned
with the effects of our preemptible multitasking system on the user (in other
words, how we interact with the system). By the way, if we have more than one
processor, we are also doing multiprocessing. Got all that? Good, because
these terms are thrown about all the time with little distinction between
them, and they are distinct.
UNIX utilizes a limited-preemption mechanism that provides each process a
timeslice which it can consume. Because its goals are oriented towards
timesharing, such timeslices are made imperceptibly short to maintain the
illusion of timesharing. The mechanism for providing this is simple and
elegant, avoiding "ultimate" mechanisms that impose complexity throughout the
kernel. The trade-off for this simple approach is a minimal, submillisecond
response time delay, event-to-process -- not a great loss for a time-sharing
system.
Having obtained an overview of multiprogramming and the related terminology,
we can now delve inward to the actual mechanisms by which processes are
created, switched around, and terminated.


A UNIX Process's Double Life


A UNIX process possesses both a user-mode program and a supervisor-mode "alter
ego." The user-mode applications program runs code and obtains service from
the operating system via system calls. This in turn causes the computer to
enter supervisor mode and run the operating system's code, which processes the
exception (system call) for this process. During the processing of a system
call, or other exception, multitasking code comes into play. Because the
computer system's periodic clock interrupt forces entry into the operating
system (usually 100 times a second), even a process that never calls the
system on its own (one doing a heavy calculation, for example) is forced to
enter the system in this manner.


Blocking and Unblocking Processes


A process waiting for an event can give up the processor by calling the
tsleep() routine to "block" itself out from the processor until the expected
event occurs. It then frees up the processor to run another process. tsleep()
records the priority and other details of its slumber. When the process
resumes, its code checks to see if the event for which it was waiting has
occurred. If not, another tsleep() is issued.
When the event occurs, processes waiting for it are "unblocked" by a wakeup()
call. The wakeup() call reschedules the previously blocked process, allowing
it to run when it next has priority. wakeup() and tsleep() are not necessarily
discretely paired; one wakeup() call can awaken many tsleep()ing processes
waiting for the same event. This explains why a process is not guaranteed that
the condition waited for is true; if one buffer becomes free, a dozen
processes may wake up for it. Only one is satisfied, so the others return to
their slumbers.
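The "wake everyone, let each recheck" behavior can be modeled in a few lines. The sketch below is a single-threaded simulation, not kernel code: sim_sleep(), sim_wakeup(), and sim_run() are hypothetical stand-ins for tsleep(), wakeup(), and the scheduler, and the process table is reduced to a state and a wait channel.

```c
#include <stddef.h>

#define NPROC 12

enum pstate { RUNNABLE, SLEEPING, SATISFIED };

struct proc {
    enum pstate st;
    const void *wchan;  /* event ("wait channel") the process sleeps on */
};

int free_buffers;       /* the shared resource being waited for */

/* Block a process on an event (stand-in for tsleep()). */
void sim_sleep(struct proc *p, const void *chan)
{
    p->st = SLEEPING;
    p->wchan = chan;
}

/* Make every process sleeping on chan runnable (stand-in for wakeup()).
   Returns the number of processes awakened. */
int sim_wakeup(struct proc *tab, int n, const void *chan)
{
    int woken = 0;
    for (int i = 0; i < n; i++) {
        if (tab[i].st == SLEEPING && tab[i].wchan == chan) {
            tab[i].st = RUNNABLE;
            tab[i].wchan = NULL;
            woken++;
        }
    }
    return woken;
}

/* Let each runnable process recheck the condition it slept for.
   Winners take a buffer; losers go back to sleep on the same event.
   Returns the number of processes that got a buffer. */
int sim_run(struct proc *tab, int n, const void *chan)
{
    int satisfied = 0;
    for (int i = 0; i < n; i++) {
        if (tab[i].st != RUNNABLE)
            continue;
        if (free_buffers > 0) {
            free_buffers--;
            tab[i].st = SATISFIED;
            satisfied++;
        } else {
            sim_sleep(&tab[i], chan);  /* condition false: sleep again */
        }
    }
    return satisfied;
}
```

If a dozen processes sleep on the address of free_buffers and one buffer is released, sim_wakeup() makes all twelve runnable, but sim_run() satisfies only one and returns the other eleven to sleep.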


Process Context-Switching Mechanism


In order to provide multiprogramming capability, we must be able to exchange
the currently running program with the next program to be run whenever the
current process blocks for an event or the currently allocated timeslice of
the process is consumed. This switch from one process to another, or process
context switch, is a very critical piece of code, and is the pivotal mechanism
for multiple execution.
Interestingly enough, a process context switch is similar to a subroutine call
mechanism called a "coroutine." Coroutines are not nested within a hierarchy
like subroutines, but instead reside at the same level. When a coroutine
pauses, it returns execution to its caller, knowing that it will be reentered
at some future date when its caller suspends.


Next Month


Our next step is to discuss in depth the 386BSD switch() routine, and how it
imparts multiprocessing capabilities to 386BSD. We'll embark upon those
subjects next month.



References


Organick, Elliot I. The Multics System. Cambridge, Mass.: MIT Press, 1972.
Dennis, J.B. and Earl C. Van Horn. "Programming Semantics for Multiprogrammed
Computations." CACM, vol. 9, no. 3 (1966).


Brief Notes: Lightweight Processes and Threads


On MULTICS, processes were so "heavy-weight" that a fair amount of work was
done to reuse them, rather than create and destroy them constantly. In a
similar vein, VMS processes are "precreated" before use to minimize activation
time, "cleaned and pressed" after use, in preparation for reuse, and then
terminated. UNIX, on the other hand, relies on lightweight processes because
it uses one (or many) for each command executed through the shell. In other
words, processes are so convenient to work with (each has a completely
independent and isolated program within) that we want to use them commonly.
This is why we make them lightweight in UNIX -- so we can cheaply create and
destroy them as needed.
During the mid-1980s, research versions of UNIX found that a limiting factor
in the speed and efficiency of UNIX might be, in part, that lightweight
processes were not lightweight enough. For example, to perform a UNIX fork()
operation to clone a process, the early versions of UNIX were required to copy
an entire process, frequently unnecessarily (such as copying a 200-Kbyte text
editor only to execute a 10-Kbyte command interpreter). In order to preserve
the lightweight nature of UNIX processes, research systems implemented
"copy-on-write" fork(), where only the pages actually modified would be
copied. (All pages in the copy are marked "read-only"; as write operations are
detected and faulted, the affected pages are copied and marked "read-write.")
Copy-on-write is now a de facto standard in UNIX systems. Alas, copy-on-write
required duplication of page table and other bookkeeping information, so it
was not considered "lightweight enough" for some.
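Whether a kernel copies eagerly or on write, the semantics a program sees from fork() must be the same: the child's writes never reach the parent. The sketch below demonstrates only that visible behavior (the copying strategy itself cannot be observed from user code). fork(), waitpid(), and _exit() are standard POSIX calls; the function name fork_isolation_demo() and the values used are made up for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int value = 1;   /* lives in a data page shared until written */

/* Fork a child that modifies its copy of value. Returns the parent's
   view of value afterward, or -1 on error. The child's exit code is
   stored in *child_status. */
int fork_isolation_demo(int *child_status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: on a copy-on-write system this store faults, and only
           the page holding value is copied. */
        value = 99;
        _exit(value == 99 ? 7 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    *child_status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return value;       /* parent's copy is untouched */
}
```

The parent still sees 1 after the child has stored 99 in its own copy, regardless of how many pages the kernel actually duplicated.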
One extreme solution was a mechanism called vfork() -- it literally stole the
parent process's address space en masse and used it temporarily until it
executed another program, and thus avoided copying page tables and bookkeeping
information. This sounds wonderful until you realize that the program using
vfork() has to be carefully written to avoid inadvertently modifying the
parent process (because both child and parent process run using exactly the
same memory). Also, due to the weird semantics of vfork() (you must clean up
after yourself, the child process always runs first, and so on), it's not a
general-purpose replacement for fork(). (In fact, in many cases you can't
employ it.) Finally, it's not easy to debug, because a single program is
"running" in two places. However, it still remains the cheapest way to spawn
processes, even if it is a bit ugly and cumbersome.
In the never-ending search for lighter-weight mechanisms, "threads" next come
into view. Exactly what a thread is, however, varies from one system to
another. Some view threads simply as tasks within a process. Others view them
as Lightweight Processes (LWPs) that may share address space. Minimally, a
thread must have a separately executing PC (program counter, or instruction
pointer on the 386), although to be practical, we also suggest a stack.
Hopefully, the cost of creation and context switching would then be so low
that we could program in terms of thousands of threads. Typically, they would
be used as ways for normally dormant functions to become active (by being
scheduled to run), such as when an exception needs to be processed. Threads
are also blockable.
Threads seem at first to provide a natural way to explore parallel processing
or multiprocessing within a process, because you can now have a thread per
processor. On second glance, however, threads aren't an answer to the hard
questions of parallel processing. In particular, for the past 30-40 years we
have been working with sequential mechanisms. Now how do we delegate work to
parallel instruction streams? In other words, how do we "think parallel"?
Threads may be part of the answer, but, contrary to the popular literature,
they are far from all of it.
Leveraging the early idea that programs were in processes and could be used in
each command, conventional UNIX multiprogramming permitted you to build
complicated commands easily and tie them together in scripts and pipelines;
this was valuable.
However, at the risk of sounding skeptical, it is not clear that threads offer
any immediate advantage on either a uniprocessor or multiprocessor system over
the conventional UNIX multiprogramming with processes. To truly make use of
threads, our program development tools (compilers, linkers, debuggers) must
provide even more functionality than before. This is especially true with the
debugger, as you need to know just what thread modified which global variable
that caused another to generate an exception. Otherwise, multithread
programming might resemble an impossible can of worms, subject more to the
rules of black magic than professional practice.
Watching this "slimming down" of processes, we can only speculate about what
will occur next if this trend continues. Can we next expect subinstruction
parallelization, or multi-threaded microcode? If so, it might fit in with the
trend toward superscalar processors, where multiple operations are performed
per clock tick (that is, a 50-MHz processor that does hundreds of
millions of operations per second). Perhaps programs will then be written in a
two-dimensional hierarchical arrangement, and separately address parallel
iteration in hardware and sequential iteration in time. We might end up with
the neural net arrangement, itself an example of both time and space
iteration. Whatever else occurs, the trend toward parallel execution will
definitely shape the programming environments of the future.
-- B.J. and L.J.


Brief Notes: Is UNIX Real-Time Enough?


Our UNIX system model, with its 1 to 100 millisecond response time, seems more
than adequate for its time-sharing role. However, we seem to be moving into a
world where the elements of what is loosely described as "multimedia," namely
voice, imagery, and other sensory information, are becoming more commonplace.
These sources of information not only require a lot of I/O bandwidth to make it
onto and off of our system, but they also may require microsecond or perhaps
even nanosecond response time (as for video). Does UNIX work well enough for
the multimedia and networking of the future, or is it showing its age?
On the surface, there are a number of problems. By the time a process is
activated, a voice response event of brief duration (such as saying the words
"yes" or "no") may have come and gone unnoticed. Software delays can amount to
substantial portions of time, so we lean towards customized hardware solutions
as the only predictable way of guaranteed response.
This is not the end of the story, or the end of UNIX, however, as many
elements of multimedia have already been demonstrated on extant UNIX systems
sans customized hardware. Clever software can "buffer ahead" in order to make
up for the occasional delay that mars the synchronous "playback" of
information. The 1991 Summer USENIX presented several examples in the areas of
music and video which were really quite fun to watch and work with.
However, other problems remain, especially that of bandwidth. One (relatively)
quick way around this is to develop new protocols that synchronize and reserve
bandwidth on a network. For casual near-term needs, UNIX can function
adequately in this way.
Video, however, presents an even greater problem. Even when we utilize modern
data compression techniques, current data rates push hardware and software to
their limit. Video grows even more intractable when we start to consider
future video (HDTV) bandwidths. Finally, if we want our software to transform
the video in real time, we can't work with a compressed signal. For such
applications, real-time systems may be required. The question now becomes, can
our workstations serve all needs well?
Real-time systems differ from time-sharing systems in that they tend to have
their controls "exposed" instead of hidden. Scheduling is often controlled by
a group of tasks that cooperate, and arbitrary preemption is the rule, not the
exception. UNIX primitives don't always fit nicely into this world, and most
existing real-time systems have different proprietary interfaces, complicating
matters even further. POSIX has people working on real-time standards, so
there may be hope for a standard interface yet, but not soon. As a way around
the problem, a system could conceivably work both interfaces at once, by
building mechanisms on top of UNIX mechanisms (this has been done by some
manufacturers), or we could rewrite UNIX to provide guaranteed real-time
response (as has been done by other manufacturers).
In the long run, the conflict over real-time operation versus transparent
time-sharing may be an intractable problem which only a successor to UNIX may
correct. For the short term, systems which appear to "mix" real-time and
time-sharing UNIX (like those systems with separate dedicated processors) will
probably suffice, but the economics of the marketplace will eventually demand
a less costly or less complicated solution.
-- B.J. and L.J.


September, 1991
OBJ LIBRARY MANAGEMENT


Building your own object module manager




Thomas Siering


Thomas, a graphics software engineer for Software Publishing Corporation, is
interested in graphics, systems programming, and software tools. He encourages
responses and enhancements to this article and can be reached through DDJ.


With just about any popular programming environment, library management
utilities appear to be little more than an afterthought on the designer's
part. Typically, the user interface is primitive, the options limited, and
report generation unwieldy. The typical utility offers little information
about the object modules contained, is not extendable by macros or hooks to
user code, and does very little to track intermodule dependencies. In short,
most object (OBJ)
module library tools have evolved very little from the original LIB utility
offered by Microsoft in the earliest versions of DOS and DOS-development
tools.
While the less than satisfied LIB user can buy third-party OBJ library
managers, there is one solution that will always fit -- roll your own. This
approach, however, has one prerequisite: The OBJ library manager internals
must emulate Microsoft's Library Manager. Short of that requirement, no linker
or other library manager will be able to interpret a library generated with a
custom-made product, and it will lead a very lonely and unproductive life!
I don't pretend to know what the ultimate library manager should look like,
nor do I claim to have found an exhaustive feature list that will please
everybody. Instead, I'll discuss the technology underlying the Microsoft
standard and present code that implements these concepts. Armed with this
know-how and a Microsoft-compatible library of functions, you can then focus
on designing the ultimate additional capabilities, the snazziest user
interface, or the most enlightening library analysis reports.


Why OBJ Libraries?


OBJ module libraries are an essential part of software development. Even if a
programmer chooses not to organize code in a library and to maintain OBJ
modules as separate single entities, most programming environments require at
least a couple of libraries to be linked in for handling floating-point
support, standard runtime functions, graphics, and the like. Ignorance about
their internals may be acceptable until problems arise -- missing symbols,
multiple definitions, library overflows -- or library interdependencies have
to be analyzed.
Although the LIB format is an industry-wide de facto standard, existing
libraries can lead to compatibility problems that may require analytic tools.
These problems may arise when it is unclear which Microsoft language tool a
library was created with. Worse yet, a library may be constructed with a
third-party librarian, and module extraction may work in some cases when a
"compatible" tool is applied while mysteriously failing for other member
modules. Our emphasis will, therefore, be on analytical tools. Any available
librarian can add modules to a library (most of the time), and will
(hopefully) extract them successfully. While the code presented here covers
all the bases required for constructing a complete librarian, it seemed
prudent to avoid the clutter inherent in a larger program and to emphasize and
demonstrate the building blocks common to any library-related tool instead.


OBJ Modules


While a library is composed of a number of object modules, each of these is in
turn a collection of object records. A great variety of object record types
exist, all of which must be understood by linkers. A librarian will only
concern itself with a select few, as shown in Figure 1.
All object record types share the same basic format. They are identified by a
1-byte record type, followed by a 2-byte length field (whose value excludes
the first 3 bytes). The subsequent record layout varies by record type, and
concludes with a checksum byte. As a result, system tools can traverse object
modules efficiently, considering only the record types of interest to the
purpose at hand.
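The common record framing makes a generic walker easy to write. The sketch below steps through the records of one module and verifies each checksum. It is not taken from the article's listings: the function name is hypothetical, the record type values (80h for THEADR, 8Ah for MODEND) and the rule that all bytes of a record sum to zero modulo 256 come from the published OMF documentation, and the length field is assumed to be stored low byte first, as is usual on Intel systems. Some tools are known to write a zero checksum instead of computing one; this sketch does not allow for that.

```c
#include <stddef.h>
#include <stdint.h>

#define OMF_THEADR 0x80  /* module header record */
#define OMF_MODEND 0x8A  /* module end record */

/* Walk the object records in buf. Returns the number of records up to
   and including MODEND, or -1 if the data is truncated, malformed, or
   a checksum does not verify. */
int omf_count_records(const uint8_t *buf, size_t size)
{
    size_t pos = 0;
    int count = 0;

    while (size - pos >= 3) {
        uint8_t type = buf[pos];
        /* 2-byte length; excludes the 3 bytes of type and length */
        size_t len = (size_t)buf[pos + 1] | ((size_t)buf[pos + 2] << 8);
        size_t total = len + 3;

        if (len < 1 || total > size - pos)
            return -1;          /* no room for checksum, or truncated */

        uint8_t sum = 0;
        for (size_t i = 0; i < total; i++)
            sum = (uint8_t)(sum + buf[pos + i]);
        if (sum != 0)
            return -1;          /* checksum byte does not balance */

        count++;
        if (type == OMF_MODEND)
            return count;       /* end of this module */
        pos += total;           /* skip to the next record */
    }
    return -1;                  /* ran out of data before MODEND */
}
```

A tool interested only in, say, PUBDEF records would test the type byte inside the loop and skip everything else by the same length arithmetic.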
Every object module starts off with a THEADR (header) and is terminated by a
MODEND (module end) record. The chief purpose of the header record is to list
either the module's name or the name of the source file used to compile it.
Subtle differences exist between vendors. Borland C++, for example, uses the
header record for the object module's name. If the module's filename changes,
so will the THEADR record, thus losing its connection to the source code's
filename. Microsoft C, on the other hand, will always maintain the source
filename (including even the source file extension, and possibly the full path
name). To provide the object module name, which may differ from the source's,
an additional record type, LIBMOD, has been added. Interestingly enough, this
object record type will be present only while the module resides in the
library. When it is extracted, the LIBMOD will be stripped out. (LIBMOD
records are actually a subclass of a more general type, namely the COMENT,
designed for vendor-specific comments and extensions.)
Object module libraries are collections of code and data shared by client
code, utilizing only the members needed. PUBDEF records contain the names of
publicly defined or global symbols. Public definitions are of such fundamental
importance that the library manager creates an additional dictionary for
speedy lookup.
Library managers can optionally use numerous other record types to enhance
their analytic capabilities. For example, the EXTDEF record complements the
PUBDEF type. It lists external definitions, meaning symbols referenced in a
given module but expected to be defined in another. Knowledge of external
definitions can be useful in determining whether a library still needs to
retain a member. If no other library member contains EXTDEFs for a module's
PUBDEFs, it may be safely removed (unless of course client code outside the
library requires it).
The Object Module Format (OMF) constitutes a comprehensive subject in its own
right, so details are beyond the scope of this discussion. For further
information and analytical software, consult Siering or Wilton (see
"References").


The OBJ Library Standard


OBJ module libraries consist of various individual records that resemble
object module records in appearance: Library records are identified by a 1-byte
record type, followed by a 2-byte length field and variable-length,
record-specific data. Unlike object modules, no checksum byte is appended.
Every library follows the same simple general layout: The first record must be
the Library Header. The header is immediately followed by the Object Modules.
Both appear in the same form as they would individually. The Marker follows,
its only purpose being to pad between the modules and the following library
record type. Next, the Symbol Dictionary serves as an index to the object
modules' public symbols. Finally, an optional Extended Dictionary may follow.


The Library Header Record


Every OBJ module library starts with a Library Header (record type F0h), shown
in Figure 2. After the 2-byte length field (as usual, this value does not
include the first 3 bytes already traversed), a 4-byte longword provides the
file position of the Symbol Dictionary, and a 2-byte field gives the
dictionary's size in blocks. Finally, a 1-byte flag field may contain
specifics about the library. Currently, the only nonzero value defined is 01h,
indicating case sensitivity.
Besides identifying a file as a library, the header maps out the basic
structure of the OBJ module library. The header's end marks the beginning of
the modules themselves. Their end is implied in the file position of the
Symbol Dictionary, which resides after the modules (although a Marker record
may be present, as we will see).
Another crucial piece of structural information is implied in the header. OBJ
modules are allocated file space in "pages." Page size is user-definable as 2
raised to a power between 4 and 15, allowing for pages as small as 16 bytes (the
default) and as large as 32,768. A library's actual page size is simply
inferred from the header record size because the header always fills (with
padding) one page.
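The header layout and the page-size inference can be expressed directly in code. The following is a minimal sketch based only on the field order given above, not the article's Object Library Engine: the structure and function names are hypothetical, and multibyte fields are assumed to be stored low byte first.

```c
#include <stddef.h>
#include <stdint.h>

#define LIB_HEADER_TYPE 0xF0
#define LIB_FLAG_CASE   0x01   /* case-sensitive library */

struct lib_header {
    uint32_t page_size;    /* inferred: record length + 3 */
    uint32_t dict_offset;  /* file position of the Symbol Dictionary */
    uint16_t dict_blocks;  /* dictionary size in 512-byte blocks */
    uint8_t  flags;
};

/* Parse the first bytes of a library file. Returns 0 on success,
   -1 if buf does not look like a Library Header. */
int lib_parse_header(const uint8_t *buf, size_t size, struct lib_header *h)
{
    if (size < 10 || buf[0] != LIB_HEADER_TYPE)
        return -1;

    /* The header fills exactly one page, so its total size
       (length field plus the 3 bytes before the data) is the page size. */
    uint32_t page = ((uint32_t)buf[1] | ((uint32_t)buf[2] << 8)) + 3;
    if (page < 16 || page > 32768 || (page & (page - 1)) != 0)
        return -1;              /* not a power of 2 from 2^4 to 2^15 */

    h->page_size   = page;
    h->dict_offset = (uint32_t)buf[3]
                   | ((uint32_t)buf[4] << 8)
                   | ((uint32_t)buf[5] << 16)
                   | ((uint32_t)buf[6] << 24);
    h->dict_blocks = (uint16_t)(buf[7] | (buf[8] << 8));
    h->flags       = buf[9];
    return 0;
}
```

For a default library the length field holds 13, giving the 16-byte page; the modules begin at file offset page_size.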


OBJ Module Records


Immediately following the library header are the object modules. They appear
as they would in standalone object format, although padded with 0s to the next
page boundary. One other possible difference is that the Microsoft LIB program
might have added a LIBMOD comment record containing the object module's name
(as opposed to the source file's name which appears in THEADR, the header
record). Borland's TLIB will change the THEADR to reflect the OBJ module's
name if it has been renamed since it was compiled.
The vast majority of libraries will employ a page size of 16. (Borland TLIB
does not even allow for any user-specified page sizes, although it will handle
libraries created with alternative page sizes.) Page size is significant
because in libraries, reference is made to modules not by their actual file
position (which would require a longword) but by their page number. As a
result, modules always begin on a new page. Page numbers are 16-bit unsigned
integers, allowing the modules to occupy a maximum of 65,536 pages. Increasing
the page size will allow for more modules to reside in a library but will lead
to more wasted space because, on average, the last one-half page will be
unused by the module and zero-padded.
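The page arithmetic is simple enough to show in full. In this sketch the helper names are hypothetical; the calculations follow directly from the description above.

```c
#include <stdint.h>

/* File position of the module that starts on the given page. */
uint32_t page_to_offset(uint16_t page, uint32_t page_size)
{
    return (uint32_t)page * page_size;
}

/* Number of zero bytes needed after a module of module_len bytes so
   that the next module starts on a page boundary. */
uint32_t pad_to_page(uint32_t module_len, uint32_t page_size)
{
    return (page_size - module_len % page_size) % page_size;
}
```

With 16-byte pages, page 3 begins at offset 48, and a 100-byte module is followed by 12 bytes of padding. The same page number with 512-byte pages would refer to offset 1,536, which is how a larger page size extends the reach of a 16-bit page number.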



The Marker Record


As mentioned previously, object modules are zero-padded to the next page unit,
typically 16 bytes. Symbol Dictionary blocks which immediately follow the
object modules, however, are always aligned to a block size of 512 bytes.
Straddling the netherworlds between these two allocation units is the Marker
record. Its only purpose is to fill up space between the last object module
and the dictionary, as illustrated in Figure 3.
Marker records are of trivial structure. They are identified by a library
record type of F1h, followed by the length word. The remainder of their "data"
is simply zero-padding.
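A librarian writing a library has to size the Marker so that the dictionary lands on a 512-byte boundary. The sketch below shows one way to do that; the function names are hypothetical, the length word is assumed to be stored low byte first, and the choice to emit no Marker at all when the modules already end on a block boundary is an assumption, not something the format description settles.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DICT_BLOCK_SIZE 512u
#define LIB_MARKER_TYPE 0xF1

/* Total size (including the 3-byte type and length prefix) of the Marker
   needed after modules ending at file offset modules_end. Because that
   offset is page aligned, the result is 0 or at least one page
   (16 bytes or more), so the prefix always fits. */
uint32_t marker_size(uint32_t modules_end)
{
    return (DICT_BLOCK_SIZE - modules_end % DICT_BLOCK_SIZE) % DICT_BLOCK_SIZE;
}

/* Build a Marker record of the given total size in out.
   Returns the number of bytes written, or 0 if total is unusable. */
size_t marker_build(uint8_t *out, uint32_t total)
{
    if (total < 3 || total > DICT_BLOCK_SIZE)
        return 0;

    uint32_t len = total - 3;       /* length excludes type and length word */
    out[0] = LIB_MARKER_TYPE;
    out[1] = (uint8_t)(len & 0xFF);
    out[2] = (uint8_t)(len >> 8);
    memset(out + 3, 0, len);        /* the "data" is just zero padding */
    return total;
}
```

For modules ending at offset 1,040, the Marker occupies 496 bytes and the dictionary starts at offset 1,536.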


The Symbol Dictionary


The Symbol Dictionary follows the object modules (and the Marker, if any). It
is located via the Library Header's dictionary offset entry, and consists of
512-byte blocks. The number of blocks, recorded in the Library Header's
dictionary size field, is a prime value between 2 and 251. A prime number is
chosen for the benefit of the symbol hashing algorithm (although Borland's
TLIB conserves space by not limiting its block counts to primes).
While building an executable, one of the primary tasks of the linker is to
resolve all external references. In other words, if code is referencing global
data or calling an external function, the linker will first search other
object modules for this symbol's defining instance. After that, it will search
through libraries, deciding whether to pull any of their members into the
final executable. This could conceivably be done by searching through every
module in a library and scanning every public definition (PUBDEF) record for
the symbol definition. Instead, however, a library's Symbol Dictionary makes
these names directly accessible: For every symbol lookup, two hash values are
computed. One value determines which dictionary block the symbol entry resides
in, and the other yields the entry number within the block, the bucket. Every
block contains 37 buckets.
Symbol Dictionary blocks are laid out as follows. The first 37 bytes
constitute the hash table, one byte per bucket. If a bucket is vacant, the
value is 0. Otherwise, it contains the dictionary entry's word offset (relative
to the start of the dictionary block). The 38th byte contains the offset of the
next available word for dictionary entries. If the block's entry space is full
(which may be the case, even if not all buckets are utilized), this byte will
be set to FFh.
Dictionary entries immediately follow the hash table. They are variable-length,
containing the following data. The symbol name is in typical object module
format: a 1-byte length field and a character array without a null-terminator.
A 2-byte page number follows, indicating the location of the object module
defining the symbol. Finally, an alignment byte may appear, because dictionary
entries have to be word aligned.
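Given this layout, a single dictionary block can be dumped without computing any hash values by visiting each bucket in turn. The sketch below is written from the description above rather than from the article's listings; the structure and function names are hypothetical, and the page number is assumed to be stored low byte first.

```c
#include <stdint.h>
#include <string.h>

#define DICT_BLOCK_SIZE    512u
#define DICT_BUCKETS       37
#define DICT_ENTRIES_START 38u  /* 37 bucket bytes plus the free-space byte */

struct dict_sym {
    char     name[256];  /* the length byte limits names to 255 characters */
    uint16_t page;       /* page number of the defining module */
};

/* Collect the symbols of one 512-byte dictionary block into out, which
   must have room for DICT_BUCKETS entries. Returns the number of
   symbols found, or -1 if a bucket points outside the entry area. */
int dict_scan_block(const uint8_t *block, struct dict_sym *out)
{
    int found = 0;

    for (int b = 0; b < DICT_BUCKETS; b++) {
        unsigned off = block[b] * 2u;   /* word offset to byte offset */
        if (off == 0)
            continue;                   /* vacant bucket */
        if (off < DICT_ENTRIES_START)
            return -1;                  /* would point into the hash table */

        unsigned len = block[off];      /* 1-byte name length */
        if (off + 1 + len + 2 > DICT_BLOCK_SIZE)
            return -1;                  /* entry runs past the block */

        memcpy(out[found].name, block + off + 1, len);
        out[found].name[len] = '\0';
        out[found].page = (uint16_t)(block[off + 1 + len]
                                   | (block[off + 2 + len] << 8));
        found++;
    }
    return found;
}
```

Calling this for each block, starting at the dictionary offset recorded in the Library Header, yields every public symbol together with the page of the module that defines it.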


The Dictionary Hashing Algorithm


Despite recent publication of parts of Microsoft's de facto library standard
(see "References"), no information about the hashing algorithm has ever
appeared in print before this article. Once understood, the hashing algorithm
is fairly simple. This simplicity is obviously required for efficient
implementation.
The hash function requires as input the string to be hashed and the number of
blocks over which the hash is to be distributed. It computes hash values for
the dictionary block, as well as the hash table bucket. In anticipation of
hash clashes (that is, two strings hashing to the same value), two more values
will be computed: the block overflow and the bucket overflow. These values can
then be added repeatedly to the block and the bucket values to find an
available hash table position.
The hash function sets one pointer to the first character of the symbol string
and another to its end (the null terminator). Disregarding each character's
case, hash values are calculated by exclusive-ORing the current character with
the previous hash value rotated by 2 bits. This process is repeated once for every
character in the string. By letting the two pointers traverse the string in
opposite directions and performing the rotations to the left as well as right,
four values are generated.
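To make the shape of the computation concrete, here is a sketch of a hash of this kind. It should be read as an illustration of the description above and not as a bit-exact copy of the linker's function: the 16-bit width of the rotates, the use of OR 20h to fold case, which pointer and rotation direction feed which of the four values, the reduction modulo the block count and modulo 37, and the forcing of zero overflow values to 1 are all assumptions, and every name is hypothetical. A real tool must match the actual algorithm exactly, or its dictionaries will be unreadable to linkers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DICT_BUCKETS 37

struct lib_hash {
    uint16_t block;         /* starting dictionary block */
    uint16_t block_delta;   /* block overflow step */
    uint16_t bucket;        /* starting bucket within the block */
    uint16_t bucket_delta;  /* bucket overflow step */
};

static uint16_t rol2(uint16_t v) { return (uint16_t)((v << 2) | (v >> 14)); }
static uint16_t ror2(uint16_t v) { return (uint16_t)((v >> 2) | (v << 14)); }

/* Hash a null-terminated symbol name over nblocks dictionary blocks. */
struct lib_hash lib_hash_symbol(const char *name, uint16_t nblocks)
{
    const unsigned char *fwd = (const unsigned char *)name;
    const unsigned char *bwd = fwd + strlen(name);  /* at the terminator */
    uint16_t blk = 0, blk_d = 0, bkt = 0, bkt_d = 0;
    struct lib_hash h;

    if (nblocks == 0)
        nblocks = 1;                    /* avoid dividing by zero */

    for (size_t n = strlen(name); n > 0; n--) {
        uint16_t c = (uint16_t)(*fwd++ | 0x20);  /* forward, case folded */
        blk   = (uint16_t)(rol2(blk) ^ c);
        bkt_d = (uint16_t)(ror2(bkt_d) ^ c);

        c = (uint16_t)(*--bwd | 0x20);           /* backward, case folded */
        bkt   = (uint16_t)(ror2(bkt) ^ c);
        blk_d = (uint16_t)(rol2(blk_d) ^ c);
    }

    h.block        = (uint16_t)(blk % nblocks);
    h.block_delta  = (uint16_t)(blk_d % nblocks);
    h.bucket       = (uint16_t)(bkt % DICT_BUCKETS);
    h.bucket_delta = (uint16_t)(bkt_d % DICT_BUCKETS);
    if (h.block_delta == 0)  h.block_delta = 1;   /* a zero step would never */
    if (h.bucket_delta == 0) h.bucket_delta = 1;  /* reach another position  */
    return h;
}
```

On a clash, the overflow values are added to the bucket and block values, wrapping at 37 and at the block count respectively, until a vacant position is found. A prime block count helps such a stepping sequence visit every block.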


The Extended Dictionary


Library managers may choose to implement one additional library element, the
Extended Dictionary. This implementation-specific record will appear after the
Symbol Dictionary; its contents vary and are typically proprietary. The
purpose of an Extended Dictionary is to provide information on the object
modules designed to speed up the linker's tasks. The presence of this record
is optional, and all its information content can be obtained from the
generally-supported record types. As an aside, linker vendors may choose to
rely on the Extended Dictionary instead of the "regular" Symbol Dictionary.
This becomes obvious from a bug in Borland's TLIB, Versions 3.01 and earlier,
which will occasionally result in a corrupt dictionary. While library managers
depending on a correct Symbol Dictionary (such as our utility) will fail,
Borland's linker doesn't notice the problem and builds a correct executable by
either ignoring the Symbol Dictionary or exclusively using the Extended
Dictionary. (I brought the problem to Borland's attention, and an updated,
corrected version is now available from their technical support group.)


Applications of Library Manager Internals


Having endured the details of the object module library format thus far, it
seems perfectly appropriate to look for practical applications of this insider
knowledge. The code presented in this article handles all low-level details of
library management, making the construction of new utilities a simple matter
of higher-level design.
Let's briefly consider a few areas where existing library management falls
short. For one, quality assurance could benefit from advanced library tools.
For developers of function libraries, verification of the finished product is
of utmost importance. For example, a simple linear scan of the Symbol
Dictionary blocks gave away the previously discussed TLIB bug. Also, once an
object module is in the library, many internal forms of corruption may never
be detected. A utility could easily traverse the object modules and verify
their checksums. Equally simple, a group of object modules can be compared to
their versions as library members, and thus can be verified as current and
correct.
A similar area is dependency management. In order to ensure the completeness
of a library, yet prevent the inclusion of unneeded modules, both public
definitions (PUBDEF) and external definitions (EXTDEF) are of interest.
Current library managers are not prepared to generate this information, which
can be critical when considering removal of a module from a library.
Dependency management also enables you to generate reports such as call trees.
Even an extension of a MAKE-like utility can be created to replace library
modules by current stand-alone versions of the same object files.
Dealing with a library without owning its source code can also be made much
more manageable. How about a list of all occurrences of a given symbol, not
only the symbols included in the Symbol Dictionary? Or an "explode" option
that extracts all object modules from a library without the user having to
know or enter them by name? (As simple as this option is, it does not exist
in popular library managers.) The latter option can also be used in situations
where a library is best "decomposed" for overlay design. A "rename" facility
that allows the changing of a symbol, its dependent entries in other modules,
and the update of the Symbol Dictionary can be useful in case of a name clash.
Or a symbol could be hidden by some easily reversible bit manipulation to
allow for a temporary dual definition.
Another area not adequately supported in existing products is performance
tweaking. When a hash table becomes densely packed, more hash clashes occur,
and link time slows. If the number of Symbol Dictionary blocks is increased
above their required minimum, access speed will increase at the expense of
space. (This requires that the linker not ignore the dictionary; this can be
tested by purposefully corrupting a dictionary entry and attempting a link
involving this symbol.) Where space is at a premium (especially where a
library overflow is imminent), the problem can be helped by removing unneeded
COMENT object records from modules.
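The space-for-speed trade can be made concrete with a small calculation. The following is a rough sketch under stated assumptions: 37 buckets per block and a prime block count of at most 251, as in Listing Four; the fill percentage is a tuning knob introduced here for illustration; and SuggestDictBlocks is a hypothetical name. It deliberately ignores the second limit on a block, the 512 bytes available for symbol text, so long symbol names may require more blocks than it suggests.

```c
#include <assert.h>

#define NUMBUCKETS 37        /* buckets per Symbol Dictionary block */
#define MAXBLOCKS  251       /* largest permitted (prime) block count */

static int IsPrime(int n)
{
    int d;

    if (n < 2)
        return 0;
    for (d = 2; d * d <= n; d++)
        if (n % d == 0)
            return 0;
    return 1;
}

/* Hypothetical helper: smallest prime block count that keeps bucket
 * occupancy at or below FillPercent (1..100). NumEntries must count
 * every dictionary entry, module names included. Returns 0 if even
 * MAXBLOCKS blocks are not enough. */
int SuggestDictBlocks(long NumEntries, int FillPercent)
{
    int Blocks;

    assert(NumEntries >= 0 && FillPercent >= 1 && FillPercent <= 100);
    for (Blocks = 2; Blocks <= MAXBLOCKS; Blocks++) {
        if (!IsPrime(Blocks))
            continue;
        /* occupancy <= FillPercent%  <=>  entries*100 <= buckets*percent */
        if (NumEntries * 100L <= (long) Blocks * NUMBUCKETS * FillPercent)
            return Blocks;
    }
    return 0;
}
```

For 100 entries, a fully packed dictionary fits in 3 blocks, while a 50 percent fill target calls for 7 blocks: about 2 Kbytes more library space in exchange for fewer rehashes at link time.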
Finally, library manager user interfaces can be made more appealing and
convenient. For example, a full-screen symbol browser may allow for the
tracing of dependencies by clicking on symbols. There is virtually no limit to
the designs which may be based on this library management code.


The OBJ Library Manager Source Code


The source code for my OBJ library manager is divided into three categories.
Listings One and Two (page 90) dispense with the unexciting stuff first,
namely, the service functions used for routine purposes unrelated to our task.
The second category is the Object Library Engine (OLE), again divided between
Listings Three (page 90) and Four (page 91). Finally, Listings Five and Six
(page 94) provide sample applications. Listing Five dumps an entire Symbol
Dictionary by sequential scan, while Listing Six performs either a library
explode or selective library member extractions.
This code is meant to be self-documenting and will build on the concepts
discussed without further ado.
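One caveat for readers building with a newer compiler: Hash() in Listing Four relies on unsigned int being exactly 16 bits wide, so its rotations misbehave where int is 32 bits. The sketch below restates the same algorithm with explicit 16-bit arithmetic. It is an illustration, not a replacement supplied with the listings; PortableHash, Rol16, and Ror16 are names introduced here, and the HashT structure is repeated only so that the fragment stands alone.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUMBUCKETS 37

typedef struct {
    int BlockHash, BlockOvfl, BucketHash, BucketOvfl;
} HashT;

/* 16-bit rotates with an explicit width; n must be in the range 1..15 */
static uint16_t Rol16(uint16_t w, int n)
{
    return (uint16_t) ((w << n) | (w >> (16 - n)));
}

static uint16_t Ror16(uint16_t w, int n)
{
    return (uint16_t) ((w >> n) | (w << (16 - n)));
}

/* Same algorithm as Hash() in Listing Four. The forward scan starts at
 * the length byte of the counted string; the backward scan starts at
 * the last character. Both fold case by ORing in 20h. */
HashT PortableHash(const char *Symbol, int NumBlocks)
{
    HashT h;
    size_t Len = strlen(Symbol);
    uint16_t BlockH = 0, BlockD = 0, BucketH = 0, BucketD = 0;
    size_t i;

    assert(Len >= 1 && Len <= 255 && NumBlocks > 0);
    for (i = 0; i < Len; i++) {
        /* Forward byte i: the length byte first, then the text */
        unsigned Fwd = (i == 0 ? (unsigned char) Len
                               : (unsigned char) Symbol[i - 1]) | 0x20u;
        /* Backward byte i: last character first */
        unsigned Bwd = (unsigned char) Symbol[Len - 1 - i] | 0x20u;

        BlockH  = (uint16_t) (Fwd ^ Rol16(BlockH, 2));
        BlockD  = (uint16_t) (Bwd ^ Rol16(BlockD, 2));
        BucketH = (uint16_t) (Bwd ^ Ror16(BucketH, 2));
        BucketD = (uint16_t) (Fwd ^ Ror16(BucketD, 2));
    }
    h.BlockHash  = BlockH % NumBlocks;
    h.BucketHash = BucketH % NUMBUCKETS;
    /* A rehash delta of zero would never leave the starting slot */
    h.BlockOvfl  = BlockD % NumBlocks;
    if (h.BlockOvfl == 0)
        h.BlockOvfl = 1;
    h.BucketOvfl = BucketD % NUMBUCKETS;
    if (h.BucketOvfl == 0)
        h.BucketOvfl = 1;
    return h;
}
```

As a hand check, the one-character symbol "a" in a two-block dictionary hashes to block 1, bucket 23, with a block delta of 1 and a bucket delta of 33; "A" yields the same values because of the case folding.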


Acknowledgments


I am indebted to Greg Lobdell of Microsoft as well as David Intersimone and
Eli Boling of Borland International for providing documents, software, and
discussions of their respective library management products.


References



Siering, Thomas. "Understanding and Using .obj Files." The C Gazette (Spring
1991).
Wilton, Richard. "Object Modules." The MS-DOS Encyclopedia. Redmond, Wash.:
Microsoft Press, 1988.
Wilton, Richard. "The Microsoft Object Linker." The MS-DOS Encyclopedia.
Redmond, Wash.: Microsoft Press, 1988.
Microsoft C Developer's Toolkit Reference. Redmond, Wash.: Microsoft Corp.,
1990.
_OBJ LIBRARY MANAGEMENT_
by Thomas Siering


[LISTING ONE]

//****** svc.h -- Service functions *******

#define NOFILE NULL // no error log file
typedef enum {
 Message,
 Warning,
 Error
} MESSAGETYPE;

char *MakeASCIIZ(unsigned char *LString);
void Output(MESSAGETYPE MsgType, FILE *Stream, char *OutputFormat, ...);





[LISTING TWO]

//****** svc.c -- Service functions *******
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include "svc.h"

// MakeASCIIZ - Take a string of 1-byte length/data format, and make it
// ASCIIZ.
char *MakeASCIIZ(unsigned char *LString)
{
 char *ASCIIZString;
 unsigned char StringLength;

 StringLength = *LString++;
 if ((ASCIIZString = malloc((int) StringLength + 1)) == NULL)
 return (NULL);
 strncpy(ASCIIZString, (signed char *) LString, StringLength);
 ASCIIZString[StringLength] = '\0';
 return (ASCIIZString);
}

// Output -- Write to the output stream. This function adds an exception-
// handling layer to disk IO. It handles abnormal program termination, and
// warnings to both stderr and output. Three types of message can be handled:
// Message, simply printed to a file; Warning, print to file AND stderr;
// Error, same as warning, but terminate with abnormal exit code.
void Output(MESSAGETYPE MsgType, FILE *Stream, char *OutputFormat, ...)
{
 char OutputBuffer[133];
 va_list VarArgP;


 va_start(VarArgP, OutputFormat);
 vsprintf(OutputBuffer, OutputFormat, VarArgP);
 // If this is (non-fatal) warning or (fatal) error, also send it to stderr
 if (MsgType != Message)
 fprintf(stderr, "\a%s", OutputBuffer);
 // In any case: attempt to print message to output file. Exception check.
 if (Stream != NOFILE)
 if ((size_t) fprintf(Stream, "%s", OutputBuffer) != strlen(OutputBuffer)) {
 fprintf(stderr, "\aDisk Write Failure!\n");
 abort();
 }
 /* If this was (fatal) error message, abort on the spot */
 if (MsgType == Error) {
 flushall();
 fcloseall();
 abort();
 }
 va_end(VarArgP);
}





[LISTING THREE]

//***** ole.h -- Global include info for Object Library Engine (ole.c) ******

#define THEADR 0x80 // OMF module header
#define COMENT 0x88 // OMF comment record
#define MODEND 0x8A // OMF module end record
#define LIBMOD 0xA3 // library module name comment class
#define LIBHEADER 0xF0 // LIB file header
#define MARKER_RECORD 0xF1 // marker between modules & dictionary
#define NUMBUCKETS 37 // number of buckets/block
#define DICTBLOCKSIZE 512 // bytes/symbol dictionary block
#define DICTBLKFULL 0xFF // Symbol dictionary block full

#define UNDEFINED -1 // to indicate non-initialized data
#define STR_EQUAL 0 // string equality

// These two macros will rotate word operand opw by nbits bits (0 - 16)
#define WORDBITS 16
#define ROL(opw, nbits) (((opw) << (nbits)) | ((opw) >> (WORDBITS - (nbits))))
#define ROR(opw, nbits) (((opw) >> (nbits)) | ((opw) << (WORDBITS - (nbits))))

typedef enum {
 false,
 true
} bool;

#pragma pack(1)

typedef struct {
 unsigned char RecType;
 int RecLength;
} OMFHEADER;

typedef struct {

 unsigned char RecType;
 int RecLength;
 unsigned char Attrib;
 unsigned char CommentClass;
} COMENTHEADER;

typedef struct { // Record Type F0h
 int PageSize; // Header length (excl. first 3 bytes)
 // == page size (module at page boundary)
 // page size == 2 ** n, 4 <= n <= 15
 long DictionaryOffset; // file offset of Symbol Dictionary
 int NumDictBlocks; // number of Symbol Dictionary blocks
 // <= 251 512-byte dictionary pages
 unsigned char Flags; // only valid flag: 01h => case-sensitive
 bool IsCaseSensitive;
 bool IsLIBMODFormat; // is MS extension type LIBMOD present?
} LIBHDR;

typedef struct {
 unsigned char MarkerType; // This had better be F1h
 int MarkerLength; // filler to dictionary's 512-byte alignment
} DICTMARKER;

typedef struct {
 int BlockNumber;
 int BucketNumber;
 unsigned char *SymbolP;
 long ModuleFilePos;
 bool IsFound;
} DICTENTRY;

typedef struct {
 int BlockHash;
 int BlockOvfl;
 int BucketHash;
 int BucketOvfl;
} HashT;

void GetLibHeader(LIBHDR *LibHeader, FILE *InLibFH);
HashT Hash(char SymbolZ[], int NumHashBlocks);
DICTENTRY FindSymbol(char *SymbolZ, LIBHDR *LibHeader, FILE *InLibFH);
void GetSymDictBlock(int BlockNumber, LIBHDR *LibHeader,
 FILE *InLibFH);
long FindModule(char *ModuleName, LIBHDR *LibHeader, FILE *InLibFH);
DICTENTRY GetSymDictEntry(int BlockNumber, int BucketNumber,
 LIBHDR *LibHeader, FILE *InLibFH);
char *GetModuleName(long ModuleFilePos, LIBHDR *LibHeader, FILE *InLibFH);
bool FindLIBMOD(FILE *InLibFH);
bool FindObjRecord(FILE *ObjFH, unsigned char RecType);
bool ExtractModule(char *ModuleName, char *NewModuleName, LIBHDR *LibHeader,
 FILE *InLibFH);
void CopyObjModule(FILE *NewObjFH, long FilePos, FILE *InLibFH);





[LISTING FOUR]


//***** ole.c -- Object Library Engine ******

#include <stdio.h>
#include <io.h>
#include <stdlib.h>
#include <string.h>
#include "ole.h"
#include "svc.h"

typedef struct {
 unsigned char SymbolDictBlock[DICTBLOCKSIZE]; // symbol dictionary block
 int FreeSpaceIdx; // cursor to next free symbol space slot
 bool IsFull; // is this sym. dict. block full?
 int BlockNumber; // current block number
} DICTBLOCK;

// The number of pages in the Symbol Dictionary has to be a prime <= 251.
// NOTE: The smallest page number in MS LIB is 2, in Borland TLIB it's 1.
static int Primes[] = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43,
 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113,
 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193,
 197, 199, 211, 223, 227, 229, 233, 239, 241, 251 };
// Symbol Dictionary Block
static DICTBLOCK DictBlock;
// GetLibHeader -- Get header of an object module library. The library
// header's (record type F0h) main purpose is to identify this data file as a
// library, give page size, and size and location of Symbol Dictionary.
void GetLibHeader(LIBHDR *LibHeader, FILE *InLibFH)
{
 if (fgetc(InLibFH) != LIBHEADER)
 Output(Error, NOFILE, "Bogus Library Header\n");
 // NOTE: The LIBHDR data structure has been enlarged to include more
 // info than the actual LIB header contains. As a result, a few more bytes
 // are read in past the actual header when we take sizeof(LIBHDR). This
 // is no problem since there's plenty to read after the header, anyway!
 if (fread(LibHeader, sizeof(LIBHDR), 1, InLibFH) != 1)
 Output(Error, NOFILE, "Couldn't Read Library Header\n");
 // Add in Header length word & checksum byte
 LibHeader->PageSize += 3;
 // Determine if LIB includes Microsoft's LIBMOD extension
 // Find the first OBJ module in the LIB file
 if (fseek(InLibFH, (long) LibHeader->PageSize, SEEK_SET) != 0)
 Output(Error, NOFILE, "Seek for first object module failed\n");
 LibHeader->IsLIBMODFormat = FindLIBMOD(InLibFH);
 LibHeader->IsCaseSensitive = LibHeader->Flags == 0x01 ? true : false;
 // Make it clear that we haven't read Symbol Dictionary yet
 DictBlock.BlockNumber = UNDEFINED;
 }
// FindModule -- Find a module in Symbol Dictionary and return its file
// position. If not found, return -1L.
long FindModule(char *ModuleName, LIBHDR *LibHeader, FILE *InLibFH)
{
 char *ObjName;
 DICTENTRY DictEntry;
 char *ExtP;
 // Allow extra space for terminating "!\0"
 if ((ObjName = malloc(strlen(ModuleName) + 2)) == NULL)
 Output(Error, NOFILE, "OBJ Name Memory Allocation Failed\n");
 strcpy(ObjName, ModuleName);

 // Allow search for module name xxx.obj
 if ((ExtP = strrchr(ObjName, '.')) != NULL)
 *ExtP = '\0';
 // NOTE: Module names are stored in LIB's with terminating '!'
 strcat(ObjName, "!");
 DictEntry = FindSymbol(ObjName, LibHeader, InLibFH);

 free(ObjName);
 return (DictEntry.IsFound == true ? DictEntry.ModuleFilePos : -1L);
}
// FindSymbol -- Find a symbol in Symbol Dictionary by (repeatedly, if
// necessary) hashing the symbol and doing dictionary lookup.
DICTENTRY FindSymbol(char *SymbolZ, LIBHDR *LibHeader, FILE *InLibFH)
{
 DICTENTRY DictEntry;
 char *SymbolP;
 HashT HashVal;
 int MaxTries;
 int Block, Bucket;

 HashVal = Hash(SymbolZ, LibHeader->NumDictBlocks);
 Block = HashVal.BlockHash;
 Bucket = HashVal.BucketHash;
 MaxTries = LibHeader->NumDictBlocks * NUMBUCKETS;
 DictEntry.IsFound = false;

 while (MaxTries--) {
 DictEntry = GetSymDictEntry(Block, Bucket, LibHeader, InLibFH);
 // Three alternatives to check after Symbol Dictionary lookup:
 // 1. If the entry is zero, but the dictionary block is NOT full,
 // the symbol is not present:
 if (DictEntry.IsFound == false && DictBlock.IsFull == false)
 return (DictEntry);
 // 2. If the entry is zero, and the dictionary block is full, the
 // symbol may have been rehashed to another block; keep looking:
 // 3. If the entry is non-zero, we still have to verify the symbol.
 // If it's the wrong one (hash clash), keep looking:
 if (DictEntry.IsFound == true) {
 // Get the symbol name
 SymbolP = MakeASCIIZ(DictEntry.SymbolP);
 // Choose case-sensitive or insensitive comparison as appropriate
 if ((LibHeader->IsCaseSensitive == true ? strcmp(SymbolZ, SymbolP) :
 stricmp(SymbolZ, SymbolP)) == STR_EQUAL) {
 free(SymbolP);
 return (DictEntry);
 }
 free(SymbolP);
 }
 // Cases 2 and 3 (w/o a symbol match) require re-hash:
 Block += HashVal.BlockOvfl;
 Bucket += HashVal.BucketOvfl;
 Block %= LibHeader->NumDictBlocks;
 Bucket %= NUMBUCKETS;
 }
 // We never found the entry!
 DictEntry.IsFound = false;
 return (DictEntry);
}
// Hash -- Hash a symbol for Symbol Dictionary entry

// Inputs: SymbolZ - Symbol in ASCIIZ form; NumHashBlocks - current number of
// Symbol Dictionary blocks (MS LIB max. 251 blocks)
// Outputs: Hash data structure, containing: BlockHash, index of block
// containing symbol; BlockOvfl, block index's rehash delta; BucketHash,
// index of symbol's bucket (position) on page; BucketOvfl, bucket index's
// rehash delta
// Algorithm: Determine block index, i.e. page number in Symbol Dictionary
// where the symbol is to reside, and the bucket index, i.e. the position
// within that page (0-36). If this leads to collision, retry with bucket
// delta until entire block has turned out to be full. Then, apply block
// delta, and start over with original bucket index.
HashT Hash(char SymbolZ[], int NumHashBlocks)
{
 HashT SymHash; // the resulting aggregate hash values
 unsigned char *SymbolC; // symbol with prepended count
 int SymLength; // length of symbol to be hashed
 unsigned char *FwdP, *BwdP; // temp. pts's to string: forward/back.
 unsigned int FwdC, BwdC; // current char's at fwd/backw. pointers
 unsigned int BlockH, BlockD, BucketH, BucketD; // temporary values
 int i;
 SymLength = strlen(SymbolZ);
 // Make symbol string in Length byte/ASCII string format
 if ((SymbolC = malloc(SymLength + 2)) == NULL)
 Output(Error, NOFILE, "Memory Allocation Failed\n");
 SymbolC[0] = (unsigned char) SymLength;
 // copy the symbol text (the EOS comes along but is never hashed)
 strncpy((signed char *) &SymbolC[1], SymbolZ, SymLength + 1);
 FwdP = &SymbolC[0];
 BwdP = &SymbolC[SymLength];
 BlockH = BlockD = BucketH = BucketD = 0;
 for (i = 0; i < SymLength; i++) {
 // Hashing is done case-insensitive, incl. length byte
 FwdC = (unsigned int) *FwdP++ | 0x20;
 BwdC = (unsigned int) *BwdP-- | 0x20;
 // XOR the current character (moving forward or reverse, depending
 // on variable calculated) with the intermediate result rotated
 // by 2 bits (again, left or right, depending on variable).
 // Block Hash: traverse forward, rotate left
 BlockH = FwdC ^ ROL(BlockH, 2);
 // Block Overflow delta: traverse reverse, rotate left
 BlockD = BwdC ^ ROL(BlockD, 2);
 // Bucket Hash: traverse reverse, rotate right
 BucketH = BwdC ^ ROR(BucketH, 2);
 // Bucket Overflow delta: traverse forward, rotate right
 BucketD = FwdC ^ ROR(BucketD, 2);
 }
 // NOTE: Results are zero-based
 SymHash.BlockHash = BlockH % NumHashBlocks;
 SymHash.BucketHash = BucketH % NUMBUCKETS;
 // Obviously, hash deltas of 0 would be nonsense!
 SymHash.BlockOvfl = max(BlockD % NumHashBlocks, 1);
 SymHash.BucketOvfl = max(BucketD % NUMBUCKETS, 1);

 free(SymbolC);
 return (SymHash);
}
// GetSymDictBlock -- Read and pre-process a Symbol Dictionary block
void GetSymDictBlock(int BlockNumber, LIBHDR *LibHeader, FILE *InLibFH)
{

 // Find and read the whole Symbol Dictionary block
 if (fseek(InLibFH, LibHeader->DictionaryOffset + (long) BlockNumber *
 (long) DICTBLOCKSIZE, SEEK_SET) != 0)
 Output(Error, NOFILE, "Could Not Find Symbol Dictionary\n");
 if (fread(DictBlock.SymbolDictBlock, DICTBLOCKSIZE, 1, InLibFH) != 1)
 Output(Error, NOFILE, "Couldn't Read Symbol Dictionary Block\n");
 // Is this block all used up?
 DictBlock.FreeSpaceIdx = DictBlock.SymbolDictBlock[NUMBUCKETS];
 DictBlock.IsFull = (DictBlock.FreeSpaceIdx == DICTBLKFULL) ? true : false;
 // For future reference, remember block number
 DictBlock.BlockNumber = BlockNumber;
}
// GetSymDictEntry -- Look up and process a Symbol Dictionary block entry
DICTENTRY GetSymDictEntry(int BlockNumber, int BucketNumber,
 LIBHDR *LibHeader, FILE *InLibFH)
{
 DICTENTRY DictEntry;
 unsigned char SymbolOffset;
 unsigned char SymbolLength;
 int PageNumber;
 // Remember entry's block/bucket and init. to no (NULL) entry
 DictEntry.BlockNumber = BlockNumber;
 DictEntry.BucketNumber = BucketNumber;
 DictEntry.SymbolP = NULL;
 DictEntry.IsFound = false;
 // Make sure the appropriate block was already read from obj. mod. library
 if (DictBlock.BlockNumber != BlockNumber)
 GetSymDictBlock(BlockNumber, LibHeader, InLibFH);
 // WORD offset of symbol in dictionary block: 0 means no entry
 SymbolOffset = DictBlock.SymbolDictBlock[BucketNumber];
 if (SymbolOffset != 0) {
 // Since it's word offset, need to multiply by two
 DictEntry.SymbolP = &DictBlock.SymbolDictBlock[SymbolOffset * 2];
 // Get the symbol's object module offset in LIB
 SymbolLength = *DictEntry.SymbolP;
 // Object module's LIB page number is right after symbol string
 PageNumber = *(int *) (DictEntry.SymbolP + SymbolLength + 1);
 DictEntry.ModuleFilePos = (long) PageNumber *
 (long) LibHeader->PageSize;
 DictEntry.IsFound = true;
 }
 return (DictEntry);
}
// GetModuleName -- Read the OMF module header record (THEADR - 80h) or, if
// present, MS's LIBMOD extension record type. NOTE: For Microsoft C,
// THEADR reflects the source code file name at compilation time. OBJ name
// may differ from this; the LIBMOD record will contain its name. For
// Borland C++, THEADR is the only pertinent record and will contain OBJ
// module's name rather than the source's.
char *GetModuleName(long ModuleFilePos, LIBHDR *LibHeader, FILE *InLibFH)
{
 int SymbolLength;
 char *ModuleName;
 OMFHEADER OmfHeader;
 // Position at beginning of pertinent object module
 if (fseek(InLibFH, ModuleFilePos, SEEK_SET) != 0)
 Output(Error, NOFILE, "Seek for object module at %lx failed\n",
 ModuleFilePos);
 if (LibHeader->IsLIBMODFormat == false) {

 if (fread(&OmfHeader, sizeof(OmfHeader), 1, InLibFH) != 1)
 Output(Error, NOFILE, "Couldn't Read THEADR at %lx\n",
 ModuleFilePos);
 if (OmfHeader.RecType != THEADR)
 Output(Error, NOFILE, "Bogus THEADR OMF record at %lx\n",
 ModuleFilePos);
 }
 else
 if (FindLIBMOD(InLibFH) == false) {
 Output(Warning, NOFILE, "No LIBMOD record found at %lx\n",
 ModuleFilePos);
 return (NULL);
 }
 SymbolLength = fgetc(InLibFH);
 if ((ModuleName = malloc(SymbolLength + 1)) == NULL)
 Output(Error, NOFILE, "Malloc failure Reading module name\n");
 if (fread(ModuleName, SymbolLength, 1, InLibFH) != 1)
 Output(Error, NOFILE, "Couldn't Read THEADR\n");
 ModuleName[SymbolLength] = '\0';
 return(ModuleName);
}
// FindLIBMOD -- Get a LIBMOD (A3) comment record, if present.
// NOTE: This is a special OMF COMENT (88h) record comment class used by
// Microsoft only. It provides the name of the object module, which may
// differ from the source (contained in THEADR). This record is added when an
// object module is put into library, and stripped out when it's extracted.
// This routine will leave file pointer at the LIBMOD name field.
bool FindLIBMOD(FILE *InLibFH)
{
 COMENTHEADER CommentHdr;
 // Search (up to) all COMENT records in OBJ module
 while (FindObjRecord(InLibFH, COMENT) == true) {
 if (fread(&CommentHdr, sizeof(CommentHdr), 1, InLibFH) != 1)
 Output(Error, NOFILE, "Couldn't Read OBJ\n");
 if (CommentHdr.CommentClass == LIBMOD)
 return (true);
 else
 // if not found: forward to next record, and retry
 if (fseek(InLibFH, (long) CommentHdr.RecLength -
 sizeof(CommentHdr) + sizeof(OMFHEADER), SEEK_CUR) != 0)
 Output(Error, NOFILE, "Seek retry for LIBMOD failed\n");
 }
 // We got here only if COMENT of class LIBMOD was never found
 return (false);
}
// FindObjRecord -- Find an object module record in one given module.
// On call, file pointer must be set to an object record. Search will
// quit at the end of current module (or when record found).
bool FindObjRecord(FILE *ObjFH, unsigned char RecType)
{
 OMFHEADER ObjHeader;
 while (fread(&ObjHeader, sizeof(ObjHeader), 1, ObjFH) == 1) {
 // If it's the record type we're looking for, we're done
 if (ObjHeader.RecType == RecType) {
 // Return with obj module set to record requested
 if (fseek(ObjFH, -(long) sizeof(ObjHeader), SEEK_CUR) != 0)
 Output(Error, NOFILE, "Seek for Record Type %02x failed\n",
 RecType & 0xFF);
 return (true);

 }
 // End of object module, record type NEVER found
 if (ObjHeader.RecType == MODEND)
 return (false);
 // Forward file pointer to next object module record
 if (fseek(ObjFH, (long) ObjHeader.RecLength, SEEK_CUR) != 0)
 Output(Error, NOFILE, "Seek retry for Record Type %02x failed\n",
 RecType & 0xFF);
 }
 // If this quit due to I/O condition, it's either EOF or I/O error
 if (feof(ObjFH) == 0)
 Output(Error, NOFILE, "Couldn't Read OBJ\n");
 // we completed w/o error and w/o finding the record (should NEVER happen)
 return (false);
}
// ExtractModule -- Find an object module in a library and extract it into
// "stand-alone" object file. Return true if ok, else false.
// Optional: Can specify a new name for the module.
bool ExtractModule(char *ModuleName, char *NewModuleName, LIBHDR *LibHeader,
 FILE *InLibFH)
{
 long FilePos;
 char *NewObjP;
 char *NewObjName;
 FILE *NewObjFH;
 // Find the object module's position in the library file
 FilePos = FindModule(ModuleName, LibHeader, InLibFH);
 if (FilePos == -1L)
 return (false);
 // Determine name for new .obj, and set it up
 NewObjP = NewModuleName != NULL ? NewModuleName : ModuleName;
 if ((NewObjName = malloc(strlen(NewObjP) + 5)) == NULL)
 Output(Error, NOFILE, "Malloc failure Making module name %s\n",
 NewObjP);
 strcpy(NewObjName, NewObjP);
 // Open the new .obj file, and pass everything off to low-level routine
 if ((NewObjFH = fopen(NewObjName, "wb")) == NULL)
 Output(Error, NOFILE, "Open failure new module %s\n", NewObjName);
 CopyObjModule(NewObjFH, FilePos, InLibFH);
 fclose(NewObjFH);
 free(NewObjName);
 return (true);
}
// CopyObjModule -- Low-level copy of LIB member to OBJ file.
void CopyObjModule(FILE *NewObjFH, long FilePos, FILE *InLibFH)
{
 OMFHEADER RecHdr;
 // Get to the object module in LIB
 if (fseek(InLibFH, FilePos, SEEK_SET) != 0)
 Output(Error, NOFILE, "Seek failure to file position %ld\n", FilePos);
 // Write module from LIB to separate obj file
 do {
 // Read OMF header record, this will give record type and length
 if (fread(&RecHdr, sizeof(RecHdr), 1, InLibFH) != 1)
 Output(Error, NOFILE, "Couldn't Read OBJ\n");
 // Need to check every COMENT record to make sure to strip LIBMOD out
 if (RecHdr.RecType == COMENT) {
 // Throw away next byte (Attrib COMENT byte) for now
 fgetc(InLibFH);

 // Check COMENT's Comment Class
 // If it's a LIBMOD, set file pointer to next record and continue
 if (fgetc(InLibFH) == LIBMOD) {
 if (fseek(InLibFH, (long) RecHdr.RecLength - 2L, SEEK_CUR) != 0)
 Output(Error, NOFILE, "Seek error on COMENT\n");
 continue;
 }
 else
 // Wasn't a LIBMOD: reset file pointer to continue normally
 if (fseek(InLibFH, -2L, SEEK_CUR) != 0)
 Output(Error, NOFILE, "Seek error on COMENT\n");
 }
 if (fwrite(&RecHdr, sizeof(RecHdr), 1, NewObjFH) != 1)
 Output(Error, NOFILE, "Couldn't Write new OBJ\n");
 while (RecHdr.RecLength--)
 fputc(fgetc(InLibFH), NewObjFH);
 } while (RecHdr.RecType != MODEND);
}





[LISTING FIVE]

//***** olu1.c -- Object Library Utility, Sample Application 1. *****
// This utility performs a linear scan and dump of an object module library's
// Symbol Dictionary. NOTE: Due to Borland TLIB bug, this utility may NOT work
// with libraries generated with versions 3.01 or less.
//*****************************************************************************
#include <stdio.h>
#include <stdlib.h>
#include "ole.h"
#include "svc.h"

static void DumpSymbolDictionary(LIBHDR *LibHeader, FILE *InLibFH);
void main(int argc, char *argv[]);

// main -- Surprise!
void main(int argc, char *argv[])
{
 FILE *InLibFH;
 LIBHDR LibHeader;
 long ModFilePos;
 if (argc != 2)
 Output(Error, NOFILE, "Usage: %s file.lib\n", argv[0]);
 if ((InLibFH = fopen(argv[1], "rb")) == NULL)
 Output(Error, NOFILE, "Couldn't Open %s\n", argv[1]);
 GetLibHeader(&LibHeader, InLibFH);
 DumpSymbolDictionary(&LibHeader, InLibFH);
}
// DumpSymbolDictionary -- Print out an entire Symbol Dictionary
static void DumpSymbolDictionary(LIBHDR *LibHeader, FILE *InLibFH)
{
 int BlockIdx, BucketIdx;
 DICTENTRY DictEntry;
 char *ModuleName;
 char *SymbolP;
 for (BlockIdx = 0; BlockIdx < LibHeader->NumDictBlocks; BlockIdx++)

 for (BucketIdx = 0; BucketIdx < NUMBUCKETS; BucketIdx++) {
 DictEntry = GetSymDictEntry(BlockIdx, BucketIdx, LibHeader,
 InLibFH);
 if (DictEntry.IsFound == false)
 continue;
 // Get the symbol name
 SymbolP = MakeASCIIZ(DictEntry.SymbolP);
 // Get to the corresponding module name record (THEADR or LIBMOD)
 ModuleName = GetModuleName(DictEntry.ModuleFilePos, LibHeader,
 InLibFH);
 printf("%s -- Module %s (%08lxh)\n", SymbolP,
 ModuleName, DictEntry.ModuleFilePos);
 printf("Hash: Block %d , Bucket %d\n", BlockIdx, BucketIdx);
 free(SymbolP);
 free(ModuleName);
 }
}





[LISTING SIX]

//***** olu2.c -- Object Library Utility, Sample Application 2. *****
// This utility explodes an object module library, i.e. move all its members
// out into .obj form. This functionality is absent from popular library
// managers; useful for libraries the user is unfamiliar with. Optionally,
// named single members can be copied, as well. NOTE: Modules can be
// extracted in sequence efficiently, but we're showing off functions!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "ole.h"
#include "svc.h"

static void ExplodeLibrary(LIBHDR *LibHeader, FILE *InLibFH);
void main(int argc, char *argv[]);

// main -- Surprise!
void main(int argc, char *argv[])
{
 FILE *InLibFH;
 LIBHDR LibHeader;

 if (argc != 2 && argc!= 3)
 Output(Error, NOFILE, "Usage: %s file.lib [file.obj]\n", argv[0]);
 if ((InLibFH = fopen(argv[1], "rb")) == NULL)
 Output(Error, NOFILE, "Couldn't Open %s\n", argv[1]);
 GetLibHeader(&LibHeader, InLibFH);
 if (argc == 3) {
 if (ExtractModule(argv[2], NULL, &LibHeader, InLibFH) == false)
 Output(Error, NOFILE, "Extraction of Module %s failed\n",
 argv[2]);
 }
 else
 ExplodeLibrary(&LibHeader, InLibFH);
}
// Explode Library -- Extract all (or one specific) library member(s).

// NOTE: This is done in a contrived way just to show off some of functions.
// We go through entire Symbol Dict., determine if a symbol is a module name
// by comparing its entry to the module name that entry is leading to, and
// then extract the module.
static void ExplodeLibrary(LIBHDR *LibHeader, FILE *InLibFH)
{
 int BlockIdx, BucketIdx;
 DICTENTRY DictEntry;
 char *ModuleName;
 char *SymbolP;
 char *ModuleFN;
 for (BlockIdx = 0; BlockIdx < LibHeader->NumDictBlocks; BlockIdx++)
 for (BucketIdx = 0; BucketIdx < NUMBUCKETS; BucketIdx++) {
 DictEntry = GetSymDictEntry(BlockIdx, BucketIdx, LibHeader,
 InLibFH);
 if (DictEntry.IsFound == false)
 continue;
 // Get the symbol name
 SymbolP = MakeASCIIZ(DictEntry.SymbolP);
 ModuleName = GetModuleName(DictEntry.ModuleFilePos, LibHeader,
 InLibFH);
 // If it compares, it's a module name
 if (strnicmp(SymbolP, ModuleName, strlen(ModuleName)) ==
 STR_EQUAL) {
 if ((ModuleFN = malloc(strlen(ModuleName) + 5)) == NULL)
 Output(Error, NOFILE, "Couldn't malloc file name %s\n",
 ModuleName);
 strcpy(ModuleFN, ModuleName);
 strcat(ModuleFN, ".obj");
 if (ExtractModule(ModuleFN, NULL, LibHeader, InLibFH) ==
 false)
 Output(Error, NOFILE, "Extraction of Module %s failed\n",
 ModuleFN);
 free(ModuleFN);
 }
 free(SymbolP);
 free(ModuleName);
 }
}























September, 1991
 SOFTWARE PARTITIONING FOR MULTITASKING COMMUNICATION


The key to high performance from any hardware




David McCracken


David is a consulting engineer in the embedded systems field and can be
contacted at 6850 Freedom Blvd., Aptos, CA 95003.


Embedded applications are increasingly demanding concurrent functions. Users
no longer tolerate, for example, a machine that shuts down the user interface
while printing. Nowhere is this more evident than in a rapidly emerging
application class in which the computer allows its operator to communicate
transparently with other machines through a variety of media. Especially where
the computer acts as a turn-key controller, the user does not interact with it
as a computer and is unforgiving of behavioral restrictions dictated by the
deficiencies of a hidden entity.
It has been suggested that general purpose multitasking operating systems will
provide the basis for such applications. But to achieve generality, these
operating systems suffer an enormous context-switch time penalty. Naive
programmers think that hardware manufacturers will simply make machines that
are fast enough to solve any performance problems. But embedded applications
are often cost-sensitive. Additionally, at any given time there is a large body
of computing hardware that provides the best cost/performance, has the most
alternate sources, and is well understood. If software can be crafted to keep
an application within the limits of this hardware, the resulting product is
easier to manufacture and maintain. Proper software partitioning is a key
element in extracting the highest performance from any hardware. This is
particularly true in the design of multitasking applications.
Multitasking generally serves two purposes: to simplify complex programming
tasks and to improve performance. A general-purpose operating system such as
Unix provides an example of the former. Independently written programs, both
cooperative and stand-alone, can peacefully coexist in Unix. The easiest way
to develop a large program is to divide it into processes that can be
developed in relative isolation. But this doesn't afford any improvement in
performance. In fact, the overhead of task swapping lowers performance
compared to the same job being accomplished by a single-task process. To
improve performance, event-driven multitasking is needed. Assume that a given
application includes a job that can't complete without communicating with an
external entity, and that this communication doesn't require the CPU's total
processing bandwidth. By using an interrupt dedicated to the specific entity
to grab CPU time, we not only tell the CPU when to work but also what to work
on.


Machine Hierarchy


To simplify application programming, some operating systems use a single
unified task-switching mechanism. The mechanism provides time slices for
separate programs and responds to external events by analyzing their
priorities and launching appropriate tasks. Thus, tasks that communicate with
external entities are full-fledged programs. The principal problem with this
is the enormous overhead associated with dispatching a task that has all the
rights of a high-level program. External communication usually involves moving
many individual bytes -- a very simple task -- until a critical amount of
information has moved, allowing the program to advance to its next major
state, at which point it probably requires high-level privileges.
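A minimal sketch of this division of labor follows. It is illustrative only: the names (FrameCollector, CollectorPutByte, CollectorTake) are hypothetical, the choice of a line feed as the frame terminator is an assumption made for the example, and a real interrupt handler would also have to acknowledge the hardware and guard against reentrancy. The low-level routine does nothing but store bytes; the high-level task is disturbed only when a whole frame is waiting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_MAX 64

typedef struct {
    char Buf[FRAME_MAX];     /* frame text, NUL-terminated once Ready */
    size_t Len;              /* bytes collected so far */
    volatile int Ready;      /* set by the low level, cleared by the high */
    unsigned Dropped;        /* bytes discarded (buffer full or frame pending) */
} FrameCollector;

void CollectorInit(FrameCollector *c)
{
    assert(c != NULL);
    c->Len = 0;
    c->Ready = 0;
    c->Dropped = 0;
}

/* Low level: call once per received byte, e.g. from an interrupt
 * handler. Returns 1 when this byte completed a frame, else 0. */
int CollectorPutByte(FrameCollector *c, unsigned char b)
{
    if (c->Ready) {                  /* previous frame not yet consumed */
        c->Dropped++;
        return 0;
    }
    if (b == '\n') {                 /* terminator: hand the frame upward */
        c->Buf[c->Len] = '\0';
        c->Ready = 1;
        return 1;
    }
    if (c->Len < FRAME_MAX - 1)
        c->Buf[c->Len++] = (char) b;
    else
        c->Dropped++;                /* frame too long: discard the excess */
    return 0;
}

/* High level: copy out a finished frame and re-arm the collector.
 * Returns 1 if a frame was taken, 0 if none was waiting. */
int CollectorTake(FrameCollector *c, char *Out, size_t OutSize)
{
    if (!c->Ready || OutSize == 0)
        return 0;
    strncpy(Out, c->Buf, OutSize - 1);
    Out[OutSize - 1] = '\0';
    c->Len = 0;
    c->Ready = 0;
    return 1;
}
```

The per-byte path touches only a few words of memory and needs none of the privileges of a full task, which is the point of keeping it out of the general-purpose dispatcher.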
An analogous situation exists in translators, such as compilers. Input text is
analyzed by grouping individual characters into tokens and then parsing the
tokens into statements. Tokenizing requires only a simple state machine, a
Deterministic Finite Automaton (DFA). Parsing requires a more complex PushDown
Automaton (PDA), which is essentially a state machine with a stack. A PDA
certainly has the power to do anything that the DFA can, including tokenizing,
but at a higher computational price, because it's a more complex machine.
Similarly, the higher levels of translation are too complex for the PDA,
requiring the power (and price) of a Turing machine.
Giving all levels of a communication task equally powerful computing
facilities is equivalent to a compiler using a Turing machine for tokenizing
(scanning), parsing, and code generation. But to achieve higher performance,
most compilers use (simulated) DFA and PDA for the first two phases, reserving
the powerful and expensive Turing capability for use only where it is
essential.
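To make the analogy tangible, here is a hedged sketch of a scanner built as a DFA. The token classes, state names, and the NextToken function are invented for this illustration and do not come from any particular compiler. The machine's only memory is its current state: it needs no stack, which is exactly why it is cheaper than the PDA that parses the resulting tokens.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { T_END, T_IDENT, T_NUMBER, T_PUNCT } TokenType;
typedef enum { S_START, S_IDENT, S_NUMBER } State;

/* Scan one token starting at *Src, copy its text into Text (at most
 * TextSize - 1 characters), and advance *Src past the token. */
TokenType NextToken(const char **Src, char *Text, size_t TextSize)
{
    State St = S_START;
    TokenType Result;
    const char *p;
    size_t n = 0;

    assert(Src != NULL && *Src != NULL && Text != NULL && TextSize > 0);
    p = *Src;
    while (isspace((unsigned char) *p))      /* skip leading blanks */
        p++;
    for (;;) {
        unsigned char c = (unsigned char) *p;

        if (St == S_START) {
            if (c == '\0') {
                Result = T_END;
                break;
            } else if (isalpha(c) || c == '_') {
                St = S_IDENT;
            } else if (isdigit(c)) {
                St = S_NUMBER;
            } else {                         /* any other single character */
                if (n + 1 < TextSize)
                    Text[n++] = (char) c;
                p++;
                Result = T_PUNCT;
                break;
            }
        } else if (St == S_IDENT) {
            if (!(isalnum(c) || c == '_')) {
                Result = T_IDENT;
                break;
            }
        } else {                             /* S_NUMBER */
            if (!isdigit(c)) {
                Result = T_NUMBER;
                break;
            }
        }
        if (n + 1 < TextSize)                /* character belongs to token */
            Text[n++] = (char) c;
        p++;
    }
    Text[n] = '\0';
    *Src = p;
    return Result;
}
```

Calling NextToken repeatedly on the text "x1 = 42;" yields an identifier, a punctuation mark, a number, another punctuation mark, and then T_END.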


Partitioning


To better utilize our machine's power, we need to identify its computing
mechanisms of varying power. We then need to partition any communication
function into layers that are distinguished by computing complexity and that
align with the available mechanisms. Note that the mechanisms are machines
only in a theoretical sense: They are not specific hardware. The same basic
hardware can, for example, support two different kinds of task switchers, one
that apportions CPU time to fully privileged high-level tasks, and another
that provides reduced computing capability in response to external events. In
this scheme, the typical application is partitioned horizontally by functions,
such as user interface, LAN interface, printer, and so on, and vertically by
computing complexity. Figure 1 illustrates an application consisting of four
tasks, A through D, operating in a system with three computing levels, 1
through 3. Each task may execute at all three levels or at only one or two of
them.
The horizontal task partitioning can usually be done without considering the
vertical computing levels. Obviously, though, we can't do an adequate vertical
partitioning without knowing just what kind of computing power is available at
each level. To a large extent, this depends on the computing platform -- the
base hardware, added hardware, the native operating system (if there is one),
and any operating system support that we might provide in the application
program itself. Many application designers overlook the latter, for example,
believing that MS-DOS can't be used for a multitasking application.
A program designed for reliability and ease of maintenance minimizes the
connections between the different task/level boxes. An extreme approach to
task encapsulation would demand, for example, that taskA/level1 know nothing
of the organization of data in taskA/level2 or in taskB/level1. In Unix, a
pipe could be used to allow taskA to communicate with taskB while enforcing
their complete separation. But performance can be improved by sharing data,
either through a virtual pipe as in the Mach operating system, or through
explicitly shared memory.
The job of partitioning a design is difficult because it entails many
trade-offs and must be done before we understand the application fully. It
must be done even when the scheme I've outlined here is not followed. This
scheme, however, provides a rational approach to the process.


A Concrete Example


In many applications, a single computer is called upon to serve as a
communication hub, connecting a variety of external devices. For example, I
recently finished the design of a controller for a complex medical chemical
analyzer. An ISA (Industry Standard Architecture, aka AT-compatible) computer
was programmed to communicate simultaneously with the analyzer via GPIB
(General-Purpose Interface Bus, IEEE488) and with a host computer via RS-232
while printing results and interacting with the user via a windowed interface.
During all this overt activity, the application called for independent,
concurrent long-range quality control related to both the test results and the
controller's operation. I chose MS-DOS for the native operating system because
it is compact, stable, and has a good supply of support applications, such as
compilers and window libraries, and because most of the programmers on the
project were familiar with it.
MS-DOS doesn't provide multitasking, and its functions are not reentrant.
Confronted by these limitations, many programmers are surprised to see it
providing the base for this sort of application. Identifying (and creating)
the levels of computing machinery and partitioning the tasks to match are what
make this design work.
The obviously discrete functions of the application determine the program's
horizontal partitioning. DMA (Direct Memory Access) provides the lowest level
computing mechanism. DMA represents a partial state machine, where each
transition is based solely on the current state, regardless of the input.
Smarter DMA devices that can look for defined inputs do exist, like Zilog's
Z8410, but the Intel 8237 used in most ISA machines is not one of them. In
fact, as used in ISA machines, the 8237 can't even provide complete DMA
capability without some higher-level assistance because its 16-bit counters
don't cover the 24-bit address range of the machine.
Interrupt SubRoutines (ISR) provide the next level of computation. Many
operating systems use external event-driven interrupts only to trigger the
dispatcher, which then launches the corresponding high-level task, effectively
eliminating the ISR as a distinct computing level. In our design, the
interrupts trigger an immediate and direct response, thereby automatically
telling the CPU not only when to do a context switch but also what context to
switch to, with very little overhead. However, the price we pay is that the
ISRs cannot provide the highest level of computation. Interrupts can occur
even while a DOS function is executing. Because DOS is not reentrant, the ISRs
must execute without calling DOS. This doesn't mean that we have to, for
example, write our own disk access functions for use in an ISR, but that we
partition each task so that disk access is needed only at a higher level. Any
data processed in an ISR must reside in memory.
The highest computing level is provided by simple, non-preemptive round-robin
multitasking. Each task checks for any work to be done and then releases its
time slot by calling a function whose sole purpose is to record the task's
current context and restore that of the next task in the cycle. Each task in
turn is given the opportunity to function as a full DOS program until it gives
up its time. These are clearly cooperative tasks, in the usual sense that they
operate on common data and toward common goals, and also in the sense that
they must voluntarily give up their time.
The high-level partition of each task must use polling to determine the work
to be done because its time slice is not synchronized to any events. This
degrades performance only slightly because for each high-level event there can
be thousands of low-level ones efficiently processed by the task's ISR and/or
DMA partitions. For example, the GPIB ISR reduces all events to a single
doubleword (unsigned long) bit array that the high level can test in a few CPU
cycles.


Timing


In this application, the GPIB, RS-232, and printer tasks all have dedicated
ISRs. Only the GPIB has a data flow volume--as much as 17K in one
block--sufficient to demand using the DMA. The attached device limits the
maximum data rate to 75K/sec, and often the rate is much lower. Consequently,
I chose to operate the DMA in its CPU cycle-stealing mode, even though burst
mode uses a more efficient memory cycle that effects a memory/input/output
transfer in nearly half the time. When stealing cycles, the 8237 requires six
clocks at five MHz (its clock is independent of the microprocessor's) or 1.2
microseconds, to perform a transfer. Thus, even at the maximum data rate, DMA
consumes only 1.2/13.3, or nine percent of the bus bandwidth. Further, on many
DMA hits, the CPU won't be slowed down at all because its multicycle
instructions and prefetch queue elastically couple execution to bus access. In
contrast, burst mode would last long enough for the CPU to deplete its
available instructions and then be forced to remain idle.
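The bus-load estimate above can be checked with a few lines of arithmetic. The sketch below only restates the figures quoted in the text (six clocks at 5 MHz, and 75K/sec taken as 75,000 bytes per second); the function names are illustrative and not part of the original program.

```c
#include <assert.h>

/* Time for one cycle-stealing DMA transfer, in microseconds:
   clocks per transfer divided by the DMA clock in MHz. */
double dma_transfer_us(double clocks, double dma_clock_mhz)
{
    assert(dma_clock_mhz > 0.0);
    return clocks / dma_clock_mhz;
}

/* Fraction of bus time consumed when bytes arrive at the given rate. */
double dma_bus_fraction(double transfer_us, double bytes_per_sec)
{
    assert(bytes_per_sec > 0.0);
    double byte_period_us = 1.0e6 / bytes_per_sec;  /* about 13.3 us at 75,000/s */
    return transfer_us / byte_period_us;
}
```

Calling dma_transfer_us(6.0, 5.0) gives the 1.2 microseconds per transfer cited above, and passing that result to dma_bus_fraction with 75000.0 gives roughly 0.09, the nine percent figure.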
That cycle-stealing DMA is essentially free (in CPU time) should be enough to
convince anyone to try to find the portions of a task that such a dumb
mechanism can compute. Unfortunately, ISA machines don't provide a means to
connect the serial and parallel (printer) ports to DMA. GPIB capability is
provided by a plug-in card that supports DMA.
The (CPU time) expense of an ISR is largely determined by the amount of data
processing we want it to do. As expected, greater computing capability is more
expensive. For example, most commercial serial communication libraries afford
ISRs that provide little more than DMA does. Typically, the input function
simply transfers bytes to a memory buffer and sets a flag when it sees a
particular value. ISR context switching requires only the time needed for INT
and IRET, 40 clocks (23+17) plus stacking and unstacking (say AX, DX, DS, and
SI) 32 clocks. With a 20MHz CPU clock, this consumes 3.6 microseconds. The
real work of the ISR probably consumes another 5 to 15 microseconds. But data
processing is deferred. A high-level function is expected to poll the flag and
process all data itself. This results in slow responses to external events. It
also is not very efficient because the high level must poll for many
communication situations that could have been handled by the ISR, and because
the contents of the input buffer must usually be moved to memory locations
determined by the application. It is often not possible to anticipate the
ultimate destination of input before seeing some portion of it, such as a
control header.
The ISR can be turned into a more powerful computing mechanism by giving it
explicit state memory. Even the simple ISR has a little state awareness
implied by the counter used to access memory buffers. One way to convert a
simple ISR into an interrupt-driven state machine is to explicitly encode the
address of the next state process in a table indexed by the current state
either alone or in combination with an input value. At each interrupt, the
appropriate process is invoked by calling (or jumping to) the address found in
the table.
Most states persist for more than one interrupt, so it would be unreasonable
to advance a state counter automatically. Instead, each state would
independently decide when to advance. Figure 2 diagrams a hypothetical case in
which state 4 persists for nine interrupts and then advances to 4a where it
increments the state counter to 5.
An explicit state table is not the best method of encoding the transition
function. It is inefficient for each state to decide when to advance the
counter and then simply increment a value that will subsequently be used to
look up an address in the table. At lower cost, each state can put the address
of the next state into the counter, which is not really a counter anymore,
although it serves the same purpose. At the next interrupt, the ISR vectors
directly through the counter. Thus, in Figure 2, the counter would contain an
address, the table would be eliminated, and at 4a the address of state 5 would
be put into the counter. Not only does this approach speed up the state
machine overhead and reduce memory requirements, but it also simplifies
modifying the transition function.
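The table-free scheme can be sketched in C with a function pointer standing in for the state counter. This is only an illustration of the idea under assumed names (state4, state5, isr_body, and the hit counters are all hypothetical); the actual ISRs described here are written in assembly language.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; shown in C only for readability. */
typedef void (*state_fn)(unsigned char input);

static void state4(unsigned char input);
static void state5(unsigned char input);

/* The "state counter": it holds the address of the next state's code. */
static state_fn state_counter = state4;

static int state4_hits;   /* interrupts handled while in state 4 */
static int state5_hits;   /* interrupts handled while in state 5 */

/* State 4 persists for nine interrupts, then installs state 5. */
static void state4(unsigned char input)
{
    (void)input;
    if (++state4_hits == 9)
        state_counter = state5;   /* each state names its own successor */
}

static void state5(unsigned char input)
{
    (void)input;
    ++state5_hits;
}

/* Body of the ISR once registers are saved: vector through the counter. */
void isr_body(unsigned char input)
{
    assert(state_counter != NULL);
    state_counter(input);
}
```

Changing the transition function means changing only the assignment inside a state; no table has to be kept in step with the code.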
The basic overhead of the interrupt-driven state machine is slightly more than
that of the simple ISR, adding two instructions. After pushing registers we
add a jump through the state counter in data memory, JMP [state_counter], and
at some point in each state we add a load state counter literal, MOV [
state_counter ], OFFSET next_state. These add 11 and three clocks or 0.7
microseconds to each interrupt. For negligible cost we buy a substantially
more powerful mechanism.

Up to this point, partitioning decisions have been compelled by simple logic.
The remaining decisions become increasingly arbitrary. For one, assembly
language was chosen for all ISRs. Few programmers dispute that it can deliver
higher performance than even C or Forth, and likewise that it can encourage
software chaos. Consider that much of the usefulness of HLLs comes from their
libraries, which are not safe to use in this application because they can call
DOS without our knowing it. Also, most ISRs contain a substantial amount of
low-level hardware manipulation, which C allows but doesn't facilitate.
Perhaps the most important argument for using assembly language is that
because interrupts can occur at any time, we are willing to pay a lot to be
able to dispose of them as quickly as possible. The CPU time taken by the ISRs
varies considerably over the different states, the quickest executing in about
10 microseconds (including context switch) and the longest in about 100
microseconds. The simple round-robin task dispatcher shown in Listing One
(page 96) allows a high-level context switch to occur in 5.3 microseconds.
This obviously affords more efficient use of CPU time than more complex
preemptive dispatchers, many of which take 1000 times longer to perform a
context switch. However, the dispatcher doesn't assume responsibility for
ensuring that tasks actually give up their time. The individual task programs
must be designed to do their work in discrete chunks. Typically, they scan a
prioritized work list, execute the first ready item and then give up their
time by calling the release function shown in Listing One.
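The task side of this arrangement might look like the following C sketch. It is a simplification under stated assumptions: the work_item layout and all function names are hypothetical, and returning from run_first_ready stands in for the call to release(), which in the real program switches stacks as shown in Listing One.

```c
#include <assert.h>
#include <stddef.h>

/* One entry in a task's prioritized work list (hypothetical layout). */
struct work_item {
    int  (*ready)(void);   /* nonzero when this item has work to do */
    void (*run)(void);     /* performs one discrete chunk of work   */
};

/* Execute the first ready item, highest priority first, then return.
   Returns the index of the item that ran, or -1 if nothing was ready. */
int run_first_ready(const struct work_item *list, size_t count)
{
    assert(list != NULL);
    for (size_t i = 0; i < count; i++) {
        if (list[i].ready()) {
            list[i].run();
            return (int)i;
        }
    }
    return -1;
}

/* Two illustrative work items driven by simple pending counters. */
static int urgent_pending, routine_pending;
static int  urgent_ready(void)  { return urgent_pending  > 0; }
static int  routine_ready(void) { return routine_pending > 0; }
static void urgent_run(void)    { urgent_pending--;  }
static void routine_run(void)   { routine_pending--; }

static const struct work_item demo_list[] = {
    { urgent_ready,  urgent_run  },   /* highest priority first */
    { routine_ready, routine_run },
};
```

Because only one item runs per time slot, a long backlog in one task cannot starve the others; it simply takes more dispatch cycles to clear.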
If the high-level portion of a task has no work ready to perform when it is
dispatched, then it immediately releases, consuming less than 8 microseconds.
If, on the other hand, it has work that involves disk access, its time slot
may stretch out to 200 milliseconds or more. Consequently, the round-robin
dispatch cycle time is nondeterministic. But we can determine the worst case
maximum time. Any portion of a task that requires a faster response than the
total cycle time must be partitioned into the task's ISR. Unfortunately,
because the final program itself determines the cycle time, we can only
estimate the timing threshold when we do the partitioning.
The ISRs are by nature more difficult to write, so we want to place most of a
task in the high level. The worst result of overestimating the round-robin
cycle time is that some of the ISRs may be doing more work than is dictated by
timing requirements. Underestimating, however, may allow the threshold to push
beyond a time-sensitive function that has been apportioned into the high
level. The only way to guarantee reliable operation in this case is to move
that function into the ISR. The actual design did not experience this problem
because I guessed a very conservative 4-second cycle time. The actual worst
case time is one second. The average time of about 80 milliseconds is fast
enough that the user interface slows noticeably only during heavy window
banging.
While the high-level portion of each task executes in a predetermined time
slot (of varying width), the lower levels are distributed in time. Ideally,
interrupts and DMA would be uniformly distributed in order to avoid
time-demanding hot spots. The ISR and DMA portions of all tasks can steal CPU
time from the high level of any task. Figure 3 illustrates a typical time
slice in which ISRs steal cycles from the high level and DMA steals from ISRs
as well as from the high level.


Data Connection


Having decided, for performance, to use directly shared memory for
communication between the partitions, we have two major concerns: how to
achieve data encapsulation and concurrency. It does seem as if shared memory
and data encapsulation are incompatible. Consider communication between the
GPIB high level and ISR. The obvious way to separate them is to provide one
input pipe and one output pipe, thus establishing a very restricted
connection. This incurs several severe performance penalties. One is that the
high level has to move data in and out of the pipes. Small amounts could be
moved reasonably quickly, but the GPIB data blocks are as large as 17Kbytes. A
second problem is the memory wasted on pipes that have to be long enough to
hold all of the data that might be transferred during a worst case round-robin
dispatching cycle. A more subtle but equally severe problem is that inputs
cannot be effectively prioritized, because the high level has to process the
input in the order received. Even a virtual pipe, scanned by the high-level
program, must be treated as a FIFO at least until the various input blocks
have been separated and identified. As mentioned earlier, these problems also
exist with most commercial serial communication libraries, but the greater
speed and data volume of the GPIB amplify their effect.
The only way to avoid handling all the data twice and at inopportune times is
to give the ISR some understanding of the contents, which violates
encapsulation and moves some of the sophistication from the high level to the
ISR. The key to resolving this dilemma is, not surprisingly, proper
partitioning to match task functions to mechanisms of appropriate power. Each
GPIB interchange consists of an initial transmission by the external device
(the chemical analyzer) followed by a response from our controller. Each
analyzer transmission is from 1 of 16 categories, as indicated in a 10-byte
header. The controller's response, which must occur in about 50 milliseconds,
is based on the transmission category. The relatively rapid response time
requires the ISR to handle the interchange without timely help from the high
level.
To enable the ISR to respond appropriately to the input category while
limiting its understanding of the data, the ISR is allowed to know only a
generic version of this transaction. It uses the category listed in an input's
header to index a table of transaction descriptors, which is maintained by the
high level. Each descriptor is a structure that tells the destination address
for this type of input, the source address of the response, and several pieces
of information used to assure the integrity of the transaction. The ISR code is
oblivious to the specific categories. The high-level program can modify the
transactions simply by modifying the table (statically or on-the-fly) without
affecting ISR code.
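A C rendering of the descriptor idea may make the division of knowledge clearer. The field names, the trans_table array, and both functions are hypothetical; the real table is an assembly-language array of eight-word entries. Rejecting a descriptor whose destination has not been filled in is an assumption of this sketch, not something the original design is stated to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GPIB_CATEGORIES 16

/* Hypothetical C view of one transaction descriptor. */
struct gpib_trans {
    uint8_t       *dest;        /* where input of this category is stored   */
    uint16_t       max_in_len;  /* longest input the destination can accept */
    uint16_t       type;        /* response type (concurrency control)      */
    uint16_t       chkstat;     /* checksum status of the response          */
    const uint8_t *resp;        /* response source                          */
    uint16_t       resp_len;    /* response length in bytes                 */
};

/* Maintained by the high level; the ISR only reads it. */
struct gpib_trans trans_table[GPIB_CATEGORIES];

/* Generic lookup: the ISR knows only that a category selects a descriptor.
   Returns NULL for an out-of-range category or an unset destination. */
const struct gpib_trans *lookup_trans(unsigned category)
{
    if (category >= GPIB_CATEGORIES)
        return NULL;
    if (trans_table[category].dest == NULL)
        return NULL;
    return &trans_table[category];
}

/* Never accept more input than the destination can hold, whatever the
   input header claims. */
uint16_t clamp_input_len(const struct gpib_trans *t, uint16_t header_len)
{
    assert(t != NULL);
    return header_len > t->max_in_len ? t->max_in_len : header_len;
}
```

Nothing in lookup_trans or clamp_input_len mentions a specific category, which is the point: the high level can redirect or resize a transaction by editing the table alone.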
Listing Two (page 96) shows the first three of 16 transaction descriptors.
From the assembler's point of view each structure is just a group of eight
words (16 bytes), but comments and initializing statements are arranged to
portray an array of structures. The first structure describes the response to
the "status" input. The input destination is a far global pointer, OFFSET
_status, SEG _status. The input length element tells the ISR that this address
is the beginning of a block of 12,040 bytes. The ISR will not allow input data
to fill beyond this point. The input header is supposed to list a data length
no greater than the available space, but defensive programming is essential
whenever dealing with inputs from the outside world.
The response half of the descriptor lists source address, OFFSET ack, SEG ack
and its 11-byte length. This unvarying "acknowledge" response to several
inputs, including status, doesn't need to be controlled by the high level. But
the descriptor table treats all transactions uniformly so that the ISR code
understands only the generic form. The 00h Most Significant Byte (MSB) of
CHKSTAT/chkLB tells the ISR that the last word (2 bytes) of the source
contains a valid checksum. The acknowledge response never changes, and
recomputing the checksum at every transmission would be wasted effort.
In the second descriptor, the command_request input has an uninitialized
destination so the high level must fill this in before the first
command_request transmission. The CHKSTAT byte in the "command" response is
01h, not 00h as in the case of ack. This indicates that the ISR must compute
the checksum before transmitting "command." Giving the ISR this responsibility
allows the high level to change data in the "command" source at any time
without having to recompute the checksum after each change. Whenever the high
level changes data contents, it sets CHKSTAT to 1. The ISR computes the
checksum only once, just before transmitting the data. It then changes CHKSTAT
to 0 so that, unless the high level makes additional changes, the ISR knows
that the checksum remains valid. Any response source that is in constant flux
has CHKSTAT value 02h, which tells the ISR to always recompute the checksum.
In this case, the high level can ignore CHKSTAT when changing data.
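The three CHKSTAT values amount to a small protocol between the high level and the ISR, which might be sketched in C as follows. The enum names and prepare_response are hypothetical, and the checksum itself is a placeholder: a 16-bit byte sum stored low byte first is assumed here only so the example is complete, since the analyzer's actual algorithm is not given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum chkstat {
    CHK_VALID  = 0,   /* stored checksum is current; transmit as is    */
    CHK_STALE  = 1,   /* data changed; recompute once, then mark valid */
    CHK_ALWAYS = 2    /* data in constant flux; recompute every time   */
};

/* Placeholder checksum (an assumption): 16-bit sum of the payload bytes. */
static uint16_t sum16(const uint8_t *p, size_t n)
{
    uint16_t s = 0;
    while (n--)
        s = (uint16_t)(s + *p++);
    return s;
}

/* Called just before transmitting. The last two bytes of buf hold the
   checksum. Returns 1 if it was recomputed, 0 if the stored value was
   trusted. */
int prepare_response(uint8_t *buf, size_t len, uint8_t *chkstat)
{
    assert(buf != NULL && chkstat != NULL && len >= 2);
    if (*chkstat == CHK_VALID)
        return 0;
    uint16_t s = sum16(buf, len - 2);
    buf[len - 2] = (uint8_t)(s & 0xFF);
    buf[len - 1] = (uint8_t)(s >> 8);
    if (*chkstat == CHK_STALE)
        *chkstat = CHK_VALID;     /* CHK_ALWAYS is left unchanged */
    return 1;
}
```

The high level's only obligation is to write CHK_STALE after changing a buffer; it never has to know when, or whether, the data will actually be sent.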
The third descriptor lists the same "command" response for the phase_request
input as for the command_request input. Such redundancies are a small price to
pay for generic ISR code. Another small price is that performance degrades
slightly because the program has to access variables indirectly. A
spaghetti-coded ISR could embed the values as literals.
Listing Three (page 96) shows the first three elements of a second table which
the ISR consults to determine which flag to set to tell the high level that a
particular input has been received. The high level, written in C, considers
flags (gpib_stat in Listing Three) as a single unsigned long, two unsigned
shorts, or four unsigned chars. The ISR treats them as an array of 4 bytes.
Each entry in the table, which is indexed by the input category, indicates the
byte and bit to set. This arbitrary mapping allows the high-level program to
determine the optimum congregation of flags to minimize its testing effort.
Note that the declaration extern unsigned char gpib_stat [4] supports simpler
byte selection than if defined as a larger object, such as the unsigned long
that it really is. For example, the second 2 bytes can be accessed as
*(unsigned short*) (gpib_stat+2).
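Both sides of the flag mechanism can be sketched in C. The selector layout (byte index in the high byte, bit mask in the low byte) follows the comments in Listing Three; the function names are hypothetical, and memcpy is used here as an alias-safe stand-in for the pointer casts a 1991 compiler would have accepted directly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

unsigned char gpib_stat[4];          /* the four flag bytes */

/* One selector per input category: the high byte picks the flag byte,
   the low byte is the bit mask (first three entries only). */
static const uint16_t flag_select[] = { 0x0001, 0x0002, 0x0004 };

/* What the ISR does when an input of the given category has arrived. */
void set_input_flag(unsigned category)
{
    assert(category < sizeof flag_select / sizeof flag_select[0]);
    uint16_t sel = flag_select[category];
    unsigned byte_index = sel >> 8;
    assert(byte_index < sizeof gpib_stat);
    gpib_stat[byte_index] |= (unsigned char)(sel & 0xFF);
}

/* High-level quick test: is any flag set in any of the four bytes? */
int any_input(void)
{
    uint32_t all;
    memcpy(&all, gpib_stat, sizeof all);
    return all != 0;
}

/* Is any flag set in the two bytes starting at byte offset first? */
int any_input_in_pair(unsigned first)
{
    uint16_t pair;
    assert(first <= 2);
    memcpy(&pair, gpib_stat + first, sizeof pair);
    return pair != 0;
}
```

Grouping related categories into the same byte or pair lets the high level dismiss a whole class of inputs with one comparison, for example any_input_in_pair(2) for the last two bytes.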
The GPIB DMA/ISR connection is largely determined by the decisions regarding
the high level/ISR relationship. The high level has essentially nothing to do
with the mechanics of communication. Once set in motion, the ISR/DMA runs
freely in a cycle that begins with the DMA being set to input the 10-byte
header. DMA is incapable of intelligent processing, so it is set up to issue
the GPIB interrupt after receiving the last header byte. The external device
(analyzer) treats the header and data as an uninterrupted transmission.
Fortunately, GPIB specifies a transmission synchronized by handshaking on
every byte. This gives the ISR time to inspect the header and determine the
appropriate response from the transaction descriptor array.
The ISR sets up the DMA again, this time to receive the remainder of the
input, whose length was listed in the header. When the ISR is invoked at the
end of this, it is automatically vectored to its next state (as described
earlier). DMA is set to transmit the entire response with an interrupt
occurring only at the end, at which point the cycle repeats.
This is an abbreviated description. The actual ISR contains several additional
states for dealing with various GPIB peculiarities. One more state is worth
discussing. As mentioned earlier, the DMA controller, I8237, doesn't support
the full address range of the computer, or even the reduced range (640K-bytes)
of MS-DOS. It provides only the lowest 16 address bits, the remainder being
provided by a simple latch which is mapped into the computer's I/O space
independently of the DMA chip. Thus, DMA can't handle any input or output
buffer that crosses a 64K boundary: 65536, 131072, 196608, and so on. The
operating system and C libraries can allocate memory blocks that don't cross
logical segment boundaries, but they are oblivious to these hardware
boundaries. There is no reasonable way to guarantee that the application will
not produce such blocks. We have to move the boundary problem to a mechanism
of more power than the DMA. The ISR checks each input or output for boundary
crossing before setting up the DMA transfer. If a breach would occur, then the
transmission is split into two parts separated by an interrupt, where the ISR
adjusts the latched address. It is best to try to avoid such untidy
partitioning, but sometimes we have little choice.
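The boundary check itself is simple arithmetic on the physical address. The following C sketch shows one way to compute the split; the function names are hypothetical, real-mode segment:offset addressing is assumed, and the transfer is assumed to be no longer than 64K so that it can cross at most one boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode physical address from a segment:offset pair. */
uint32_t phys_addr(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Bytes that can be transferred starting at phys before the 16-bit DMA
   address wraps at the next 64K boundary. */
uint32_t bytes_to_boundary(uint32_t phys)
{
    return 0x10000UL - (phys & 0xFFFFUL);
}

/* Split a transfer of len bytes at phys into a part that stays inside the
   current 64K page and a remainder (0 if no boundary is crossed). */
void split_at_64k(uint32_t phys, uint32_t len,
                  uint32_t *first_len, uint32_t *second_len)
{
    assert(first_len != 0 && second_len != 0);
    assert(len <= 0x10000UL);           /* at most one crossing */
    uint32_t room = bytes_to_boundary(phys);
    if (len <= room) {
        *first_len  = len;
        *second_len = 0;
    } else {
        *first_len  = room;
        *second_len = len - room;
    }
}
```

When second_len is nonzero, the second transfer starts at offset zero of the next page, which is where the latched upper address bits have to be advanced.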
A final data connection issue that must be addressed by all multitasking and
multiprocessing systems is concurrency of nonatomic data. An isolated datum
that can be read or written without interruption usually presents no problem
(semaphores require a more demanding atomic test and set). In the C library,
signal.h defines sig_atomic_t, the largest integer type the processor can load
or store atomically in the presence of asynchronous interrupts. This is a
short integer when the compiler is not generating 80386-specific
instructions.
Even if all integer types could be atomically accessed, structures, arrays,
and isolated data related by application are not atomic. For example, one of
the controller's functions is to send to the chemical analyzer a list of the
tests to be performed. The number of tests can vary, so the transmitted data
specifies the count as well as the test types. If the count doesn't match the
actual number of tests, the analyzer will misinterpret the entire
transmission. When the analyzer asks for the test list, the ISR/DMA responds
immediately. If the high-level process happens to be writing into the list,
the count could be new and the tests old data, or vice versa. The only way to
guarantee concurrency of tests and count is to make the group atomic.
Any input or output may contain atomic data groups. Obviously, there is no
native (hardware or operating system) mechanism to provide atomic access; the
application must provide it. One possibility is already available in the
transaction descriptor array. Every buffer could be duplicated. The high level
could ping-pong between two duplicates by changing the address in the
appropriate descriptor to point to the buffer not being accessed. This
approach presents some problems. One is that the buffers already consume about
60K of memory. A more subtle problem is that the output data is updated
piecemeal by asynchronous processes. We want to transmit the most recent data
available. In a sense, output preparation is never completed, and the only
reasonable time to switch buffers is when the analyzer asks the controller to
transmit. But we have already established that this point is not synchronized
to the high level. Therefore, double-buffering does not solve the concurrency
problem.
The most general solution is to block high-level access to communication
buffers whenever the GPIB ISR or DMA is active. Such a draconian solution
would give the high level very few opportunities to access data. A more
reasonable approach is to realize that data is atomic relative to specific
asynchronous events. By identifying the blocking events more specifically, we
create larger access windows. For example, given ten input and ten output
buffers, blocking a particular access only when the ISR and DMA are processing
a particular buffer makes the window 20 times larger.
To access a group of data atomically, the high-level process first sets up a
"load list" of copy descriptors, each of which contains the source,
destination, and length of an item to be copied. For example, an array of
integers, regardless of its length, is specified by a single descriptor, while
noncontiguous integers must be specified by separate descriptors. The load
list also identifies the one GPIB event that blocks access. The load list is
passed to an access control function that disables interrupts and then checks
the blocking event against current GPIB activity. If they don't match, the
function executes all of the moves specified by the load list and then
reenables interrupts. If they do match, the function reenables interrupts and
either releases or returns to the caller, depending upon which action the
caller has requested. If the access function has released, then each time it
is subsequently redispatched, it checks again for unblocked access, and
finally completes the transaction.
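The access control function can be outlined in C as below. This is a sketch under assumptions: the structure layouts and names are hypothetical, the interrupt enable and disable calls are stand-ins for CLI and STI, and current_gpib_event stands in for whatever record the ISR keeps of the buffer it is working on. Only the "return to the caller" variant is shown; the releasing variant would loop on the same test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One copy descriptor: a single contiguous item to be moved. */
struct copy_desc {
    void       *dest;
    const void *src;
    size_t      len;
};

/* A load list: the descriptors plus the one GPIB event that blocks them. */
struct load_list {
    int                     blocking_event;
    const struct copy_desc *items;
    size_t                  count;
};

/* Hypothetical stand-ins for CLI/STI and for the ISR's activity record. */
static int interrupts_enabled = 1;
static int current_gpib_event = -1;     /* -1: no GPIB activity */
static void disable_interrupts(void) { interrupts_enabled = 0; }
static void enable_interrupts(void)  { interrupts_enabled = 1; }

/* Try to perform every copy in the list as one atomic group.
   Returns 1 on success, 0 if the blocking event is in progress. */
int atomic_access(const struct load_list *ll)
{
    assert(ll != NULL);
    disable_interrupts();
    if (ll->blocking_event == current_gpib_event) {
        enable_interrupts();
        return 0;                /* caller may release and retry later */
    }
    for (size_t i = 0; i < ll->count; i++)
        memcpy(ll->items[i].dest, ll->items[i].src, ll->items[i].len);
    enable_interrupts();
    return 1;
}
```

For the test-list example, one descriptor would cover the count and another the array of tests, so the analyzer can never be sent a new count with an old list.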


Performance


In the application I've described, the GPIB communication link has certain
critical time periods established by the external device. Because there is no
combination of standard hardware and general-purpose MTOS (MultiTasking
Operating System) that could meet these requirements, the general opinion (but
not mine) was that the goals could not be met without multiprocessing. We
considered a design in which each communication link would be handled by a
separate processor. Obviously, the extra hardware is more expensive, reduces
MTBF (Mean Time Between Failures), and presents a very complicated debugging
situation (even getting the program-under-development into the independent
processors for testing is complicated). Furthermore, this approach does not
eliminate the difficulties of controlling and synchronizing data.
Multitasking and multiprocessing have essentially the same intertask
communication overhead per CPU. Therefore, the only legitimate excuse for
solving a problem with multiple processors would be if the sum of the data
processing in all tasks is more than one processor can handle in the time
available. There is no simple formula to tell you whether this is the case
before you actually implement your solution. Something to keep in mind is that
in embedded applications, "too slow" means lost data, while "too fast" means
too expensive and probably less reliable. "Fast enough" is exactly what we
want, and proper partitioning is essential to achieving a design that is fast
enough.
_SOFTWARE PARTITIONING FOR MULTITASKING COMMUNICATIONS_
by David McCracken


[LISTING ONE]

COMMENT $ ------------------ _RELEASE --------------------------------------
 A task can call this from any point to release its time. However, better
heap utilization will result from freeing memory allocated during a time slot
before releasing. To caller, release looks like a simple function whose
prototype is void release(void). All other currently enabled tasks are then
dispatched in turn until the cycle is complete and this task is redispatched
by returning from the call. The si, di, ds, bp registers are restored as
expected. Thus, the release function also acts as the dispatcher. By using
"ret" to dispatch, the dispatcher automatically knows the size of
task dispatch addresses by whether SMALL or LARGE CODE.
$
_release PROC
 push bp
 push ds
 push di
 push si
 mov ax,dseg
 mov ds,ax
 mov si,[task_no] ;Get current task number.
 shl si,1 ;*2 to convert to index for sp word table.
 mov word ptr stk_ptrs[si],sp ;Save sp for next dispatch of this task.

nextsk: dec [task_no]
 jnz dpatch ;If task_no is OK then go dispatch the next task.
;Task 1 (on bottom of stack heap) has just finished; so verify it didn't
; corrupt stack below allotment by checking "end of stack" marker is valid.
 mov bp,mark_pos ;Point to marker location on stack.
 cmp [bp],marker_value ;Is the mark still there?
 jne scrash ;Crash on task_no=0 means task 1 overran its stack.
 mov si,[max_t_no]
 mov [task_no],si ;Rollover task count to upper limit.
dpatch: mov si,[task_no] ;Get current task number.
 cmp byte ptr task_enable[si],0
 je nextsk ;If this task is disabled then try for next one.
 shl si,1 ;*2 to convert to index for sp word table.
 mov bp,mark_pos[si] ;Get this task's stack marker.
 cmp [bp],marker_value ;Did the preceding task overflow its stack?
 jne scrash
 cli ;Don't allow interrupts while monkeying with stack.
 mov sp,word ptr stk_ptrs[si] ;Retrieve sp saved during last dispatch
 ;of this task.
 sti ;Interrupts OK now because any individual stack
 ;should be able to support them.
 pop si
 pop di
 pop ds
 pop bp ;Assume these registers are only ones stored through
 ;release and re-dispatch cycle. However, watch out
 ;for possible register variables.
 ret ;Dispatch task by returning to its release return
 ;address. Note that dispatch/release is implicitly a
 ;loop only by operation with dispatched tasks.
_release ENDP





[LISTING TWO]

COMMENT $ GPIB transaction descriptors.
 Each descriptor lists input destination address, maximum input length that
can be tolerated, type of response (used for concurrency control), checksum
status of the response, response source address and its length.
$
_gpib_trans LABEL WORD
; ----INPUT DESTINATION--------- ------------- RESPONSE -----------------
; address length type CHKSTAT/chkLB address length
;...... analyzer status .......... ............... acknowledge .............
 dw OFFSET _status, SEG _status, 12040, 0f000h, 00fdh, OFFSET ack , SEG ack , 11
;....... command request ......... .............. command .................
 dw 0 , 0 , 12, 0200h, 0100h, OFFSET _cmnd, SEG _cmnd, 245
;....... phase request ........... .............. command .................
 dw 0 , 0 , 12, 0200h, 0100h, OFFSET _cmnd, SEG _cmnd, 245
 .
 .

 .


[LISTING THREE]


;-------------------- Input Available Status ----------------------
_gpib_stat db 0,0,0,0 COMMENT $ _gpib_stat is defined as a 4 byte object so
that a C program can access each byte as unsigned char (UC), any pair as
unsigned short (US), or the entire group as unsigned long (UL). In C, data is
declared by "extern UC gpib_stat[4]". A quick check for any input would be
" if(*(UL*)gpib_stat)". Accessing as smaller objects allows categorical
checking without having to check each bit individually. For example,
"if(*(US*)(gpib_stat+2))" tests for any bit set in the second two bytes. $

;-------------------- Status flag selector table ---------------------
flag_select LABEL WORD ;Flag selectors: byte selector and bit mask.
 dw 0001h ;status: flag byte 0, bit 0.
 dw 0002h ;command request: flag byte 0, bit 1.
 dw 0004h ;phase request: flag byte 0, bit 2.
 .

 .

September, 1991
 ML AND COLORED PETRI NETS FOR MODELING AND SIMULATION


A little language for a big job




Peter D. Varhol


Peter is a freelance writer and an assistant professor of computer science and
mathematics at Rivier College in New Hampshire. He can be contacted through
DDJ.


In a recent operating systems course, one of my graduate students constructed
an analytical model of various hardware configurations running a
multiprocessor computer system. To her surprise, a configuration with a single
fast processor produced a shorter average waiting time for jobs than the same
processor running in tandem with a much slower processor (although
intriguingly, the number of jobs processed was higher with the latter).
Furthermore, two processors of the same power running in tandem produced a
shorter average waiting time than the two unbalanced processors. I confirmed
that the mathematics were right, and we found supporting evidence in a
textbook, but it was not the intuitive answer. Therefore, I decided to build a
model of the two configurations and simulate the flow of jobs through the
processors.
The language I chose to develop the model was ML (short for Meta Language), a
functional programming language developed at the University of Edinburgh by a
group headed by Robin Milner. An ML specification emerged in 1987 to become
Standard ML. Due to its interactive nature, it is an interpreted language.
Interpreters were developed at the University of Edinburgh for Unix machines
and at AT&T Bell Laboratories for the VAX and 680x0 Unix computers. Meta
Software has ported the Edinburgh interpreter to the Macintosh (the platform I
used to develop this article), and makes use of it in their software tools.
An additional reason I chose ML is that I had some experience working with
Meta Software's family of formal design tools, which makes use of ML as the
underlying language for customizing and assembling a Petri net model or
simulation. In particular, Design/CPN (for colored Petri nets) enables a
designer to express much of the model in terms of a graphical representation,
with an extended version of ML used to make certain declarations and set
conditions within the model.


ML Characteristics


Like most functional languages, ML is not a large language. It makes use of
lambda calculus, which contains only variables, function expressions, and
function applications. In ML, functions are treated as first-class data
objects, and it is possible to perform such operations as passing a function
to a function (see Table 1). All computations that are possible in other
languages are possible using this formalism. I liked what I had read about ML
because its formalism had a mathematical logic to it that made programming
seem more like solving a puzzle than doing real work.
Table 1: ML commands and syntax

 Variable Declaration
 val x= 5 (integer)
 val y = 2.7 (real)
 x + y (error -- type mismatch)

 Negation
 val a = ~3

 Function Definition
 fun square (x: int) = x * x (integer argument and function)
 fun square (x: real) = x * x (real argument and function)
 fun square x = x * x (error -- type cannot be deduced)

 Integer to Real Conversion
 real (5)

 Real to Integer Conversion
 floor (2.7) (rounds down to nearest integer)

 Boolean Expressions
 > greater than
 < less than
 = equal to
 <> not equal to
 <= less than or equal to
 >= greater than or equal to

 Equality Operator for Real Numbers
 x == y (to account for approximate representations of real numbers)

 Conditional Expressions
 fun abs x = if x >= 0 then x else ~x


 Conditional Conjunctions
 fun range x = 0 <= x andalso x <= 99

 Characters and Strings
 chr 97 (returns the character "a")
 ord "a" (returns the ASCII value 97)
 "string" (string representation)

 Tuples
 (12,3) (a pair)
 divide (12,3) (returns 4)

 Lists
 [5, 2, 7] (three-element list)

 Datatype Binding

 datatype SEASON = spring | summer | autumn | winter (enumerated type)

To me, ML is
like Lisp with type checking. In both languages it is natural to express a
large number of operations as function declarations. However, in Lisp it is
possible for any variable or function to accept any type of value; the only
type errors come from type mismatches or improper operations on types. ML, on
the other hand, is a strongly typed language.
In Standard ML, type errors can arise from passing a function an incorrect
type argument, for example, because all functions have types associated with
them. One important advantage in ML is that there is rarely a need to declare
variable or function types. This is because its type-checking mechanism is
able to deduce the type of many variables and functions by their composition.
For example, the assignment val x = 3 will return int, because the value
passed to x is an integer. Likewise, fun double x = 2 * x is expecting to
receive an integer and return an integer, because I used the integer 2 rather
than the real 2.0. To make it a real function, I would have to say fun rdouble
x = 2.0 * x. Many of the predefined arithmetic functions in Standard ML are
polymorphic, so that these functions can accept different types as inputs.
I confess I prefer to work in a typeless language such as Lisp, because
declaring types has always seemed to me a waste of time. However, there are
certainly advantages to a strongly typed language in writing and debugging
correct code. A strongly typed language without the need for declarations is
to me the best of both worlds.
Like Lisp, ML has the concept of lists. There are two types of lists: the list
and the tuple. The concept of the tuple in ML is the same as its mathematical
definition. It is a list of set length, but the data types of its members can
be different. A list, on the other hand, can be of any arbitrary length, but
the data types of all its elements must be the same. For this program, I used
the list construct to store random numbers. The ML function map acts like the
Lisp mapcar, which applies an operation consecutively to all elements of a
list.
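For readers who do not know ML, the tuple/list distinction and the behavior of map can be sketched in Python. This is only an illustration in a different language: Python does not enforce ML's rule that all list elements share one type, so the type hints stand in for it, and the names job, times, and scale are hypothetical.

```python
from typing import List, Tuple

# A tuple has a fixed length, and its members may differ in type.
job: Tuple[int, float] = (12, 2.7)

# A list may be any length; in ML every element must share one type.
times: List[int] = [5, 2, 7]


def scale(factor: int):
    """Return a function that multiplies its argument by factor."""
    return lambda t: factor * t


# map applies the function to each element in turn, as ML's map and
# Lisp's mapcar do; the function itself is passed as an ordinary value.
slowed = list(map(scale(3), times))
print(slowed)
```

Passing scale(3) to map is the same move the model makes when it slows one processor: a function is built from a parameter and then applied across a whole list of job times.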


Applying ML to a Multiprocessor Model


The multiprocessor computer system model was quite simple. I used functions
representing each processor, plus a common queue (see Listing One, page 81).
The queue was simply a random number generator that placed values into a list.
When each of the functions had finished processing the current job, they
removed the next value from the queue. Each processor function called the
queue function to generate a new job. The new job consisted of a random number
that determined how long the processor was to spend computing that job. I
varied the relative speed of the processors by multiplying that random number
by the number of times the slow processor was slower than the other processor.
For example, multiplying the random number received by the slower processor by
three made that processor act three times slower than the other one. This
model consists of some functions adapted from Wikstrom's textbook on the
subject (see "References").
There is no predefined random number generator in ML. I considered and
rejected an example provided in Wikstrom's book as inadequate for generating a
large number of statistically correct random numbers. Instead, I used the
congruence method, X[n+1] = 7 ** 9 * X[n] (mod 10 ** 10), with X[0] the seed
value. Schaum's volume on numerical analysis indicates that this equation will
generate 5.1 to the seventh power, or close to 90,000, pseudorandom numbers
with acceptable statistical properties and a fairly uniform distribution
before repeating.
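The recurrence can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the article: it assumes integer states (the ML listing works with reals), and the helper names next_state, uniform_stream, and rand_int are hypothetical.

```python
MULTIPLIER = 7 ** 9      # 40353607
MODULUS = 10 ** 10


def next_state(x: int) -> int:
    """One step of the congruence: X[n+1] = 7**9 * X[n] (mod 10**10)."""
    return (MULTIPLIER * x) % MODULUS


def uniform_stream(seed: int, count: int):
    """Yield count values in [0, 1) by scaling successive states."""
    x = seed
    for _ in range(count):
        x = next_state(x)
        yield x / MODULUS


def rand_int(low: int, high: int, u: float) -> int:
    """Map u in [0, 1) onto an integer in low..high inclusive."""
    return int(u * (high - low + 1)) + low


# Five job lengths in the range 1..10, as in the uniform distribution
# with a range of 10 described in the text.
job_lengths = [rand_int(1, 10, u) for u in uniform_stream(seed=1, count=5)]
print(job_lengths)
```

With a purely multiplicative generator like this one, the seed should share no factor with the modulus, so an odd seed that is not a multiple of 5 is the safe choice.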
There is also no way of accessing the system clock from Standard ML.
Therefore, I could not use the random number in the processor function to
signify an absolute period of time to execute jobs. I couldn't say, for
example, wait three seconds, then go to the random number queue again.
Instead, I had to use relative times. Each processor took the job generated
into the random number list and counted down to 0 before taking another random
number from the queue. This approach meant that the faster processor received
proportionally more jobs than the slower one, which is what would happen in
real life.
The last of the ML functions computed system statistics. These included the
mean and standard deviation for the total number of jobs processed, waiting
time in the system (I did not distinguish between system and queue because the
queue was just a list of pseudorandom numbers) and jobs processed per
processor. These statistics were used to support the results of the analytic
model.
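The mechanics described above (a shared queue, a relative-time countdown, and summary statistics) might look like the following Python sketch. It is not a translation of the ML program. It assumes that every job is already in the queue at time zero, that a job's wait is the tick at which it leaves the queue, and that a slow processor is modeled by multiplying the job length by a slowdown factor. The function simulate and its result keys are hypothetical names.

```python
from collections import deque
from statistics import mean, pstdev


def simulate(job_lengths, slowdowns):
    """Run jobs from one shared queue on processors with relative speeds.

    slowdowns[i] is how many times slower processor i is than the fastest
    (1 means full speed). Time advances in abstract ticks, not seconds.
    """
    queue = deque(job_lengths)
    remaining = [0] * len(slowdowns)   # ticks left on each processor's job
    completed = [0] * len(slowdowns)   # jobs finished per processor
    waits = []                         # tick at which each job left the queue
    clock = 0
    while queue or any(remaining):
        # An idle processor takes the next job from the common queue.
        for i, slowdown in enumerate(slowdowns):
            if remaining[i] == 0 and queue:
                remaining[i] = queue.popleft() * slowdown
                waits.append(clock)
        # Every busy processor counts down by one tick.
        for i in range(len(remaining)):
            if remaining[i] > 0:
                remaining[i] -= 1
                if remaining[i] == 0:
                    completed[i] += 1
        clock += 1
    return {"ticks": clock, "completed": completed,
            "mean_wait": mean(waits), "wait_sd": pstdev(waits)}


jobs = [4, 2, 6, 3, 5, 1]
print(simulate(jobs, [1, 1]))   # two equal processors
print(simulate(jobs, [1, 3]))   # second processor three times slower
```

The two runs show only how the countdown hands more jobs to the faster processor. They do not compare configurations of equal total power, so they should not be read as reproducing the article's results.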
One concern with this model is that I used a uniform distribution with a range
of 10 (in other words, the longest job was ten times longer than the shortest
one) for processing time required by my simulated processes. I have no idea if
this accurately represents the distribution of job times in a computer system.
Certainly, a part of it has to do with the type of work being done. If any
reader has information on this subject, I would like to know it.


Using Meta Software's Design/CPN with ML


Petri nets, developed in the early 1960s by Carl Adam Petri, are defined as a
four-tuple, C = (P,T,I,O), where P is a finite set of places, T is a finite
set of transitions, I the input function (mapping to transitions), and O the
output function (mapping from transitions). A place can contain one or more
tokens, which signify the marking of that place. If all input places to a
transition have markings, then that transition is enabled, and can fire,
transferring tokens from the input places to the output places.
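The enabling and firing rule can be sketched in Python as follows. The sketch assumes every arc carries exactly one token (no arc weights, colors, or time), and the place and transition names are hypothetical; it is not how Design/CPN represents a net.

```python
def enabled(marking, inputs):
    """A transition is enabled when every input place holds a token."""
    return all(marking[place] > 0 for place in inputs)


def fire(marking, inputs, outputs):
    """Fire a transition: take one token from each input place and
    put one token on each output place. Returns a new marking."""
    if not enabled(marking, inputs):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place in inputs:
        new_marking[place] -= 1
    for place in outputs:
        new_marking[place] += 1
    return new_marking


# Hypothetical two-place fragment: a job moves from the ready queue
# to a processor when the "dispatch" transition fires.
marking = {"ready": 2, "cpu": 0}
dispatch = {"inputs": ["ready"], "outputs": ["cpu"]}

marking = fire(marking, dispatch["inputs"], dispatch["outputs"])
print(marking)
```

Here the marking is a dictionary from place name to token count, and the input and output functions I and O are reduced to lists of place names per transition.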
As an example, Figure 1 illustrates the Petri net that I used to model the
multiprocessor computer system. Petri nets are especially appropriate for
modeling processes, as opposed to states, because the logic that can be
codified into the transition can simulate complex processes. Here, I am more
interested in the process of jobs being run in the computer system rather than
what state the system is in at any given time. Unlike most other modeling
methods, Petri nets support the formal analysis of many conditions, such as
reachability and verifiability, which help demonstrate the correctness of the
model and make the simulation process possible.
The colored Petri net model is an extension of standard Petri nets that uses
colors to designate the different types of data a place can hold. Colored
Petri nets can be formally reduced to standard Petri nets, so any provable
conclusions concerning the latter can also be applied to the former. In my
model, all processes (markings) can be routed to either processor, so there
was no need to specify colors as a way of directing jobs to one processor or
another (although that might be useful in examining different job-scheduling
algorithms).
An extended version of ML, called CPN ML, is embedded in Design/CPN and is
used by the programmer to declare types with colors, specify initial markings
on places, and define guards on transitions. In addition, each transition can
have an attached code region, which executes whenever the transition is fired.
CPN ML is also used in the algorithms built into Design/CPN to determine the
enabling and occurrence of bindings resulting from a simulation run.
Developing the Petri net model itself was easy. The ready queue is one place,
the two processors are two more places, and two terminal places signify a
completed job. Two transitions pass jobs to the processors, while two
more transitions connect the processors to the completed-job places.
Design/CPN uses an extension of Petri nets that supports the concept of time,
so I was able to incorporate the amount of time it took for each process to
finish a job without resorting to the jury-rigged method described earlier. A
guard on each transition passing jobs off to the processors simply delayed it
from firing again until the amount of time specified by the random number
expired.
There are several reasons why this version of the model is preferred over the
ML-only version. First, it was easier to develop. In addition to the graphical
representation, it took only a few lines of ML to specify the parameters of
the system. Less knowledge of ML was required, although correspondingly more
knowledge of Design/CPN was necessary. Secondly, it is possible to develop
models of complex interacting systems with more detail and greater accuracy
using a graphical representation. The accuracy comes from Design/CPN's use of
Petri net formal logic to ensure certain aspects of the consistency and
correctness of the system. More detail can be achieved with hierarchical Petri
nets, which I didn't use. Lastly, seeing the model in action is
psychologically more reassuring than trusting entirely to equations. It
provides a reality check that the simulation is actually operating as it was
intended.


Products Mentioned


Design/CPN Meta Software 150 Cambridgepark Drive Cambridge, MA 02140
617-576-6920


And the Results of the Simulation...


The simulation supported the results of the analytical representations of the
multiprocessor systems. In fact, we found through further analytical methods
that in terms of time in the system, given a heterogeneous two-processor
system in which one processor is at least three times faster than the second,
the waiting time is longer than for an equivalent power system of identical
processors. The reason is that the throughput of the second processor is so
slow that it drags down the average waiting time of the entire multiprocessor
system to the point where it just cannot keep up with the other
configurations.
Nevertheless, all the multiprocessor systems tested outperformed several
separate computers of equivalent total power with separate job queues, thus
demonstrating the utility of multiprocessor systems in general. I confirmed
this through simulation with the aforementioned models.



References


Design/CPN: A Reference Manual. Cambridge, Mass.: Meta Software, 1991.
Trivedi, Kishor. Probability and Statistics with Reliability, Queuing, and
Computer Science Applications. Englewood Cliffs, N.J.: Prentice-Hall, 1982.
Wikstrom, Ake. Functional Programming using Standard ML. Englewood Cliffs,
N.J.: Prentice-Hall, 1987.
_ML AND COLORED PETRI NETS FOR MODELING AND SIMULATION_
by Peter D. Varhol


[LISTING ONE]


(* A list generator for the processes *)

fun listGenerator start next 0 = nil
 | listGenerator start next n = start ::
 listGenerator (next start) next (n - 1);


(* Random number generator. Uses the congruence method. *)

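(* power is not predefined; it is assumed to be defined elsewhere. *)
fun random seed = let val m = power 10.0 10.0
 val p = (power 7.0 9.0) * seed
 val x = p - m * real(floor(p / m)) (* p mod m, for reals *)
 in x - real(floor x)
 end;
val seed = random seed;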


(* Seed value for the random number generator. This number should *)
(* be changed for every simulation run to produce a different random *)
(* number list. *)

val seed = 0.5;


(* Functions to produce the list of random numbers. Calls the list *)
(* and the random number function, and places the results in randomList. *)

fun proclist n seed = listGenerator seed random n;
fun randint a b x = floor(x * real(b - a + 1)) + a;
fun randomList x y z seed = map (randint x y) (proclist z seed);


(* Part of a sample processor in the multiprocessor system. CountDown *)
(* is a recursive function that serves as the relative time clock. *)

fun processorOne randomList = countDown (map(randomList))
 statistics randomList;

September, 1991
A BRIEF MACRO PACKAGE FOR EDITING BINARY FILES


Synchronized side-by-side windows -- and mouse support, too


 This article contains the following executables: BRIEF.ARC


James Rodriguez


Jim is a technical support engineer and BRIEF macro enthusiast who has acted
as technical advisor to several third-party developers. He can be reached at
Solution Systems, 372 Washington Street, Wellesley, MA 02181.


The programmer's editor known as BRIEF (Basic Reconfigurable Interactive
Editing Facility) is built on the concept of extensibility and
programmability. Macros are so ingrained in the design of BRIEF that many of
the standard editing functions are implemented in the macro language, rather
than in compiled code. By writing your own BRIEF macros, you can tailor the
standard editor in both small ways (such as setting start-up colors, tab
settings, or window positions), and large ways (such as modifying existing
commands or creating new commands of your own design).
This article describes a package of BRIEF macros that lets you edit binary
files in BRIEF, using two side-by-side windows that display both hex and ASCII
representations of binary data.


The BRIEF Macro Language


CBRIEF, the macro language used here, has a syntax
that resembles C. Earlier versions of BRIEF used a LISP-like macro language,
which is still supported. In fact, CBRIEF macros are translated to the older
LISP syntax before being compiled into a bytecode representation.
The primitives in CBRIEF fall into three categories: language primitives,
editing primitives, and DOS primitives. The language primitives provide the
usual programming language control structures, operators, and data types. The
editing primitives control the core functionality of the text editing engine
-- manipulating resources such as windows, buffers, keyboards, cursor
position, and other state variables. The DOS primitives let you query the
environment for system information, as well as run stand-alone DOS programs from
within BRIEF.


Text Files vs. Binary Data


The core engine that underlies the standard set of BRIEF macros is designed
for editing text files, not binary files. When reading in a file, for example,
it converts all zero-valued characters (that is, the ASCII NUL) to spaces. It
also expects lines to end with the CR/LF sequence (as found in standard DOS
text files), but will also wrap lines when a lone CR is encountered. Editing
beyond the end of a screen line or file automatically repositions the
terminator character and pads the file accordingly.
A file that contains binary data will therefore obviously conflict with
BRIEF's default actions. The default behavior is bound in at a low level --
part of the compiled executable. Unless you want to patch machine-language
code, it is hard to modify this behavior directly. Fortunately, BRIEF provides
a means of intercepting the default processing of a file by using a
"registered" macro. When a macro is registered (with the register_macro
primitive), it will be executed every time a particular event occurs. In our
case, we will intercept processing whenever a buffer is created by the
standard edit_file primitive.
Once this is done, you can spawn an external program to convert the binary
file into a corresponding text representation. The user can then edit the
resulting text file within BRIEF, and, upon completion, convert it back to its
binary form. Actually, our conversion program, BBE, creates two files from the
one binary file: a .HEX (hex representation) file and an .ASC file (printable
ASCII characters). The hex and ASCII files are displayed in two side-by-side
windows and are modified simultaneously by the user. When editing the file,
the keyboard bindings are changed to enable newly defined movement keys, 0-F
typeable keys, and several function keys.


"Language" Macros


Before going over the listing, some background on "language macros" is
necessary. The purpose of language macros is to execute a specific package of
language-specific macros associated with a particular file type (a file's type
is determined by its file extension, such as .HEX, .C, or .ASM).
All package macros are activated by a registered macro named _call_packages
found in LANGUAGE.CB, a standard BRIEF file. This language macro is executed
every time a new file is edited or a different buffer is attached to the
current window. This mechanism is implemented by creating and maintaining a
system buffer database of information specific to file extensions. Whenever a
different file extension is encountered, new information is added to the
system buffer by parsing the BPACKAGES environment variable.
The _call_packages macro then accesses this information and executes the
macros listed for that particular event. A full discussion of this is beyond
the scope of this article. For our purposes, what is important is the parsing
and execution of the _on event. In the system buffer, we first remove any
existing references to .HEX extensions, and then add our package information.
The most important of these insertions is the enhancement of the .hex_on line.
The macros listed on this line will be called by the language macros every
time a .HEX file is attached to the current window.
The name _bin_on was chosen as an example of how _call_packages would have
created the name if the bin package were being loaded via BPACKAGES. In our
case, we modify the system buffer directly, so we can use any arbitrary name.
All of this will become more clear as we discuss the implementation.


The Package Implementation


My macro package for editing binary files is shown in Listings One (page 98)
and Two (page 100). Note that space restrictions do not allow printing all the
source code for the entire package, which amounts to 3000 lines. The complete
version is available in electronic form; see "Availability," on page 3.
Whenever any macro file is loaded into BRIEF, the _init macro within the file
will be executed. In the case of the binary editing macro package, my _init
macro creates a system buffer (__hex_files) which acts as the database for
storing the names and true file extensions of files which have been converted.
The _init macro also registers the macro (_bin_edit) that intercepts the file
editing process and specifies the keyboard assignments used for hex editing.
(The use of a leading underscore in a macro name is to prevent it from being
called by the user with the F10 key -- the execute_macro command.)
The registered macro _bin_edit recognizes a binary file by the error message
"Null characters in file fixed," displayed by BRIEF's edit_file primitive
whenever a binary 0 is converted to a space.
After prompting the user for confirmation, _bin_edit calls _hex_edit. Two
parameters are passed to _hex_edit: the original filename and a flag. The flag
allows a text file to be edited as if it were binary. The system buffer
database __hex_files is searched to see if the file is already controlled by
the macro package. If not, a command line is created with sprintf, and a DOS
program is run to accomplish the file conversion.


Conversion and Reconversion


Writing a program to convert a binary file into formatted hex and ASCII files
and back again is not a difficult exercise. The interesting decisions are: how
to represent the nondisplayable ASCII characters (NUL, CR, TAB), how to
control the program with command-line parameters, and how to format each group
of bytes. Parameters to the stand-alone converter, BBE, specify the conversion
mode and the names of resulting output files.
BBE accommodates conversions from binary to hex, from Unix to DOS line
termination, and from raw data file to fixed-length records. The Unix and
fixed-length record features of the package will not be discussed here. The
conversion from binary to hex creates two files: an ASCII hex representation
with 50 characters per line (representing 25 binary bytes) terminated with a
CR-LF sequence and an ASCII representation with the 25 corresponding ASCII
text characters.
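The binary-to-hex direction can be sketched in Python as follows. This is not the BBE source, which is not reproduced here. The sketch assumes that a short final line is padded with "XX" pairs to keep every hex line at 50 characters, and it shows nonprintable bytes as periods; the function name to_hex_and_ascii is hypothetical. Writing the lines to disk with CR-LF terminators is left out.

```python
BYTES_PER_LINE = 25            # 25 bytes -> 50 hex characters per line


def to_hex_and_ascii(data: bytes):
    """Return (hex_lines, ascii_lines) for a block of binary data."""
    hex_lines, ascii_lines = [], []
    for start in range(0, len(data), BYTES_PER_LINE):
        chunk = data[start:start + BYTES_PER_LINE]
        pad = BYTES_PER_LINE - len(chunk)
        # Assumption: a short final line is padded with "XX" so that every
        # hex line is exactly 50 characters wide.
        hex_lines.append(chunk.hex().upper() + "XX" * pad)
        # Printable ASCII is shown as-is; everything else becomes a period.
        ascii_lines.append(
            "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
            + " " * pad)
    return hex_lines, ascii_lines


hex_lines, ascii_lines = to_hex_and_ascii(b"MZ\x00\x1a\tHello\r\n")
print(hex_lines[0])
print(ascii_lines[0])
```

Keeping both outputs at a fixed width per line means byte n of a line always sits at the same column in each file, which is what lets the two windows stay in step.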
The reconversion process is handled by write_buffer, a "replacement macro." A
replacement macro is a function that can completely replace a standard editor
primitive with the same name. More typically, replacement macros are used to
alter the default behavior of the original function, by doing a little bit of
processing and then calling the original function to complete the processing.
This saves having to reimplement complex code from scratch.
A potential problem for the reconversion utility is encountering a byte with a
value of 26 (Control-Z, which serves as the DOS end-of-file marker). This is
dealt with by ignoring any line whose length is less than 50 characters.
Note that, during the reconversion process, any bytes with the "XX" character
sequence are not converted. This is an easy way to implement byte deletion:
During interactive editing, when the user presses the delete key, the selected
bytes are replaced with Xs in the hex buffer.
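The two reconversion rules (skip any line shorter than 50 characters, and skip any "XX" pair) can be sketched in Python like this. It is an illustration under those two stated rules only, not the BBE implementation, and the function name from_hex_lines is hypothetical.

```python
LINE_WIDTH = 50   # hex characters per complete line


def from_hex_lines(lines):
    """Rebuild binary data from hex text, honoring the two rules above."""
    out = bytearray()
    for line in lines:
        line = line.rstrip("\r\n")
        if len(line) < LINE_WIDTH:
            continue                      # short line: ignore it entirely
        for i in range(0, LINE_WIDTH, 2):
            pair = line[i:i + 2]
            if pair.upper() == "XX":
                continue                  # deleted byte: emit nothing
            out.append(int(pair, 16))
    return bytes(out)


# One full-width line with a deleted byte, followed by a stray short line.
edited = ["4D5AXX1A" + "XX" * 21 + "\r\n", "1A"]
print(from_hex_lines(edited))
```

Because deletion only overwrites digits with Xs, the hex buffer keeps its fixed width during editing, and the bytes are dropped only when the file is written back.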



Editing Converted Files


After the standalone converter has run, the resulting contents are displayed
in two side-by-side windows for interactive editing. The dual-window
presentation simplifies the logic for updating the screen display. In the
ASCII text window, characters such as CR, TAB, and NUL are represented by a
period.
The ASCII buffer is created as a system buffer, in effect hiding it from
direct manipulation. The buffer IDs of both buffers, as well as the original
filename and extension, are then concatenated into a formatted record to be
added to the database. The first two characters in the hex buffer are
highlighted to give the visual effect of a two-character byte. If this is the
first time the macro has been executed during the edit session, the package
information is modified. The removal of existing package information for .HEX
files is important because of any default package or equivalencing which may
exist. The hex_template_first is used to disassociate the current local
keyboard. The _bin_on will then be executed by _call_on_packages due to the
modifications. If the flag parameter is set, _call_on_packages may not have
been executed, so these are run.
The _bin_on macro accesses the __hex_files database to extract the buffer ID
of the system buffer containing the ASCII representation of the converted
file. The macro then creates side-by-side windows and attaches the buffers to
their respective windows. Each window in BRIEF is associated with a window ID.
These values are stored in global variables to allow synchronized updates to
the buffers. The _bin_on macro also enables the hex editing keyboard and
activates the mouse handler before returning. Because _bin_on creates and
changes windows, the language macros must be disabled to avoid recursion
during the process of instantiating windows. The registered macros are
therefore toggled off using the unregister_macro command.


The Mouse Event Handler


Mouse support is new in version 3.1, so specific discussion of this subject is
necessary. BRIEF macros support mouse interaction through mouse events and
event handler macros. BRIEF comes supplied with a default mouse event handler,
_mouse_action, which can be called from within a mouse macro that you create.
My mouse event handler is called _bin_mouse.
Events are passed as parameters to the currently defined mouse macro whenever
a mouse event occurs. There are two kinds of events: simple and complex. My
_bin_mouse macro uses a separate method to deal with each. Simple events have
parameters based on relative position or a static operation. Vertical scroll
bar events are a typical example. The parameters passed to the scroll bar
events define the position on the scroll bar relative to the thumb button. The
processing of simple events in _bin_mouse is done by executing the macro
assigned to the corresponding movement key. The inq_assignment primitive is
used to obtain the macro name to execute. Mouse macros are associated with a
keyboard definition and are only active while that keyboard definition is
used.
Complex events receive literal parameters. A "CLICK" event is passed the
actual line and column location of the mouse cursor within the buffer where
the event occurred. Interpretation of these coordinates is usually required
before the desired action is performed. In this example, the parameters are
manipulated to position the cursor at the parameter coordinates before the
desired action is taken.
Mouse events are passed to the mouse event handler only when they occur in the
current window. When an event occurs in a different window, a SET_WIN event is
passed to the mouse macro. When this event is interpreted, the _bin_mouse
macro uses the set_window primitive to make the window ID parameter the
current window. The CLICK event will not be passed to the mouse macro unless
the window is changed. Positioning is done from either window by manipulating
the column parameter dependent on which window the mouse event occurred in.
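The column adjustment can be illustrated with a short Python sketch. It assumes 1-based columns, two hex digits per byte in the hex window, and one character per byte in the ASCII window; the function names are hypothetical and do not correspond to macros in the package.

```python
def byte_index(column: int, in_hex_window: bool) -> int:
    """Zero-based byte offset within the line for a 1-based column."""
    return (column - 1) // 2 if in_hex_window else column - 1


def hex_column(index: int) -> int:
    """1-based column of the first hex digit of the byte at index."""
    return 2 * index + 1


def ascii_column(index: int) -> int:
    """1-based column of the byte at index in the ASCII window."""
    return index + 1


def sync_columns(column: int, in_hex_window: bool):
    """Return (hex_col, ascii_col) that both windows should move to."""
    index = byte_index(column, in_hex_window)
    return hex_column(index), ascii_column(index)


# A click on the second digit of the fourth byte in the hex window...
print(sync_columns(8, in_hex_window=True))
# ...and a click on the fourth character in the ASCII window.
print(sync_columns(4, in_hex_window=False))
```

Both clicks resolve to the same byte, so both windows end up on the same pair of columns regardless of where the click happened.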


Conclusion


Although BRIEF is a full-featured editor, no editor can cover every possible
use. Fortunately, the macro language in BRIEF is powerful enough to let you
modify even the most intrinsic functions of the standard program.


Products Mentioned


BRIEF, Version 3.1 Solution Systems 372 Washington Street Wellesley, MA 02181
800-677-0001 $249
_A BRIEF MACRO PACKAGE FOR EDITING BINARY FILES_
by James Rodriguez



[LISTING ONE]

/* _init _hex_edit _bin_on _bin_off _bin_mouse write_buffer */

#define HEX

#define SHOW_DELAY 50
#define UP -11
#define DOWN 11
#define LEFT -1
#define RIGHT 1
#define PGUP -21
#define PGDN 21
#define TOP 30
#define BOTTOM -31
#define HOME -40
#define END -41

extern _package_buf, delete_curr_buffer;

int __hex_files, __bin_keyboard, __hex_window, __asc_window, __unix_files;
_init ()
{ __hex_files = create_buffer ("HEXFILES", NULL, 1); //Binary file database
 __unix_files = create_buffer ("UNIXFILE", NULL, 1); //Unix file database
 register_macro (6, "_bin_edit");
 keyboard_push (); // Create a new keyboard
 set_mouse_action("_bin_mouse"); // Define the mouse handler

 add_hex_keys(); // Add key assignments
 __bin_keyboard = inq_keyboard (); // Store the keyboard identifier
 keyboard_pop (1); // Reset the stack
}
// _hex_edit: associates a system buffer with a hex buffer and modifies
// the package information for .HEX files. Called by _bin_edit.
void _hex_edit (string file_name, int in_memory)
{ int __hex_buffer, __asc_buffer, tmp_buf, current_buffer;
 string buffer_name, file_ext, buffer_id, file_path;
 global __hex_buffer, __asc_buffer;

 if (get_parm (0, file_name))
 { get_parm (1, in_memory); // Check any flag passed
 if (index (file_name, "."))
 { file_ext = substr (file_name, 1 + rindex (file_name, "."));
 file_name = substr (file_name, 1, rindex (file_name, ".") - 1);
 }
 else file_ext = "";
 buffer_name = substr (file_name, rindex (file_name, "\\") + 1);
 current_buffer = inq_buffer ();
 if (inq_called () == "_bin_edit")
 delete_buffer (current_buffer);
 // Error check for bad buffer name
 tmp_buf = create_buffer (buffer_name + ".hex", file_name + ".hex", 0);
 if (tmp_buf)
 { set_buffer (__hex_files);
 top_of_buffer ();
 if (search_fwd (file_name + ",", 0, 0))
 edit_file (file_name + ".hex"); // It's already in a buffer
 else
 { delete_buffer (tmp_buf);
 sprintf (file_path, "bbe BH %s.%s %s.hex %s.asc>&nul", file_name,
 file_ext, file_name, file_name);
 // Do the hex files already exist?
 if (!exist (file_name + ".hex") && !exist (file_name + ".asc"))
 { message ("Generating hex file");
 dos (file_path); // spawn standalone program to convert file
 }
 else
 message ("Editing existing files.");
 __hex_buffer = create_buffer (buffer_name + ".hex", file_name +
 ".hex", 0);
 __asc_buffer = create_buffer (buffer_name + ("." + file_ext),
 file_name + ".asc", 1);
 // Save info in the database for later.
 sprintf (buffer_id, "oldb=%10d,newb=%10d,file=%s,ext=%s",
 __hex_buffer, __asc_buffer, file_name, file_ext);
 set_buffer (__hex_files);
 beginning_of_line ();
 insert (buffer_id + "\n");
 set_buffer (__hex_buffer);
 top_of_buffer ();
 drop_anchor ();
 move_rel (0, 1);
 // Access the package buffer and configure for the hex extension.
 if (first_time ())
 { if (!_package_buf) load_macro ("language");
 set_buffer (_package_buf);
 top_of_buffer ();

 while (search_fwd ("<.hex", 1, 0))
 delete_line ();
 // Add the correct lines in packages to make this work.
 insert (".hex_equivalents\n");
 insert (".hex_new;\n");
 insert (".hex_existing;\n");
 insert (".hex_first;\n");
 insert (".hex_on;_on,_bin_on\n"); // Enabled _bin_on
 insert (".hex;=hex,,=hex\n");
 top_of_buffer ();
 }
 set_buffer (__hex_buffer);
 if (! in_memory)
 call_registered_macro (1);
 }
 }
 }
}
// _bin_on: edits the hex and ascii files in side-by-side windows by creating
// and moving an edge. Ensures correct local keyboard by setting it explicitly.
// Looks in hex_files buffer for filename and extracts buffer ids. Associates
// buffers with windows and returns.
string _bin_on ()
{ int i;
 string buf_to_find, exten;
 inq_names (buf_to_find, exten);
 buf_to_find = substr (buf_to_find, 1, (rindex (buf_to_find, exten) -
 strlen (exten)) + 1);
 i = inq_buffer();
 set_buffer (__hex_files); // Edit the system buffer to find buffer id's.
 top_of_buffer ();
 if (search_fwd (buf_to_find + ",", 0, 0)) // Find the line with filename.
 {
 keyboard_flush(); // Remove any pending keystrokes
 use_local_keyboard (0); // Detach the local keyboard
 keyboard_push (__bin_keyboard); // Activate the hex keyboard
 set_mouse_action("_bin_mouse"); // Attach the mouse event handler
 beginning_of_line ();
 buf_to_find = trim (read ()); // Parse out the buffer ids
 buf_to_find = ltrim (substr (buf_to_find, index (buf_to_find, "oldb=")
 + 5, 10));
 __hex_buffer = atoi (buf_to_find, 1);
 beginning_of_line ();
 buf_to_find = read ();
 buf_to_find = ltrim (substr (buf_to_find, index (buf_to_find, "newb=")
 + 5, 10));
 __asc_buffer = atoi (buf_to_find, 1);
 set_buffer (__hex_buffer);
 // Unregister the registered macro so a recursive situation does not
 // arise from the window manipulations.
 unregister_macro (1, "_call_on_packages");
 create_edge (3);
 __hex_window = inq_window ();
 move_edge (1, 12);
 change_window (1);
 __asc_window = inq_window ();
 attach_buffer (__asc_buffer); // Attach the system buffer to a window
 if (!inq_marked ())
 drop_anchor (2);

 else
 refresh ();
 refresh ();
 change_window (3);
 register_macro (1, "_call_on_packages"); // Enable the language macro.
 returns "_bin_off"; // Return the off event for the language package.
 }
 else
 { set_buffer(i); // This is an error check to allow editing
 returns ""; // of files with .hex extensions which are not
 } // under the binary package control.
}
// _bin_off -- deletes the created window and resets the keyboard.
void _bin_off ()
{ keyboard_flush();
 keyboard_pop ();
 keyboard_push ();
 add_hex_keys();
 __bin_keyboard = inq_keyboard ();
 delete_edge (1); // Delete the window
 keyboard_pop (1);
 set_mouse_action("_mouse_action"); // Reset the mouse handler
}
#define BUTTON_1_CLICK 10
#define BUTTON_2_CLICK 11
#define BUTTON_1_DBLCLK 13
#define BUTTON_2_DBLCLK 14
#define VERTICAL_SCROLL 17
#define CLOSE_WINDOW 19
#define SET_WINDOW 20
#define STATUS_AREA 21
#define SCROLLBAR__LINEUP 0
#define SCROLLBAR__LINEDOWN 1
#define SCROLLBAR__PAGEUP 2
#define SCROLLBAR__PAGEDOWN 3
#define SCROLLBAR__TOP 6
#define SCROLLBAR__BOTTOM 7
#define TITLE_BAR 1

void _bin_mouse(int action, int modifier, int line, int col)
{ if (inq_window() == __asc_window)
 col = col*2; // Modify the column parameter if clicked in ascii window
 switch (action)
 {
 case STATUS_AREA: // Go to offset on status area click
 execute_macro(inq_assignment("<Alt-g>"));
 case SET_WINDOW: // A different window was selected
 {
 if (col!=TITLE_BAR && line == __asc_window) // Disregard mouse action
 { // on the border.
 unregister_macro (1, "_call_on_packages");
 set_window (__asc_window);
 }
 }
 case BUTTON_2_CLICK:
 case BUTTON_2_DBLCLK:
 case BUTTON_1_CLICK:
 case BUTTON_1_DBLCLK:
 { int lines,cols;

 if (col % 2) // Modify column parameter to event byte value
 col++;
 if (col > 50) // If past formatted string length go to the end of
 col = 50; // the string.
 unregister_macro (1, "_call_on_packages");
 set_window (__hex_window);
 inq_position(lines, cols);
 raise_anchor ();
 move_rel(0,-1);
 save_position();
 if (move_abs(line,col))
 {
 // If beyond the end of buffer ignore the action
 if (! inq_position(lines,cols) && lines==line && cols == col)
 { restore_position(0);
 move_rel(0,-1);
 set_window (__asc_window);
 raise_anchor ();
 move_abs(line,col / 2);
 drop_anchor (2);
 refresh ();
 set_window (__hex_window);
 }
 else
 restore_position();
 }
 else
 restore_position();
 drop_anchor ();
 move_rel (0, 1);
 register_macro (1, "_call_on_packages");
 refresh ();
 switch (action)
 {
 case BUTTON_2_DBLCLK: // Opens a line (feature not shown)
 { execute_macro(inq_assignment("<Ctrl-Enter>"));
 }
 case BUTTON_1_DBLCLK: // Double click modifies current byte.
 { string sread, character;
 int hex_val;
 raise_anchor ();
 move_rel (0, -1);
 sread = "Enter new value for ";
 if (read (1) != "\n")
 sread += read (2);
 sread += ": ";
 if (get_parm (NULL, character, sread, 2))
 { hex_val = _bin_atoh (character);
 sprintf (character, "%02x", hex_val); // Convert int to hex
 unregister_macro (1, "_call_on_packages");
 set_window (__asc_window);
 switch (hex_val) // Make sure the value is displayable.
 { case 13:
 case 9:
 case 0:
 sread = ".";
 default:
 sprintf (sread, "%c", hex_val);
 }

 /* Insert the value in the hex and ascii buffers
 ** rehighlight both windows and return. */
 insert ("%s", sread);
 raise_anchor ();
 delete_char ();
 move_rel (0, -1);
 drop_anchor (2);
 refresh ();
 set_window (__hex_window);
 insert ("%s", upper (character));
 delete_char (2);
 move_rel (0, -2);
 }
 drop_anchor ();
 move_rel (0, 1);
 refresh ();
 register_macro (1, "_call_on_packages");
 refresh ();
 }
 }
 }
 case VERTICAL_SCROLL: // Vertical scroll bar events.
 {
 switch (line)
 {
 case SCROLLBAR__LINEUP: // Click on up arrow
 execute_macro(inq_assignment("<Left>"));
 case SCROLLBAR__LINEDOWN: // Click on down arrow
 execute_macro(inq_assignment("<Right>"));
 case SCROLLBAR__PAGEUP: // Click above thumb button
 execute_macro(inq_assignment("<PgUp>"));
 case SCROLLBAR__PAGEDOWN: // Click below thumb button
 execute_macro(inq_assignment("<PgDn>"));
 case SCROLLBAR__TOP: // Double click on up arrow
 execute_macro(inq_assignment("<Ctrl-PgUp>"));
 case SCROLLBAR__BOTTOM: // Double click on down arrow
 execute_macro(inq_assignment("<Ctrl-PgDn>"));
 }
 }
 }
}
// write_buffer: A replacement for write_buffer. Checks file extension, cleans
// up system buffers. Note: conversion back to binary is not done if buffer
// has not been modified.
replacement int write_buffer ()
{
 int buf_to_edit, buf_to_delete, file_is_controlled, file_was_modified;
 string file_name, ext, response, response2, old_buffer, write_command;
 inq_names (file_name, ext);
 /* If write_buffer was called from the keyboard and the extension is
 ** .hex or .unx, offer the conversion; otherwise call the primitive. */
 if ("" == inq_called () && (ext == "hex" || ext == "unx"))
 if (get_parm (0, response, "Convert file? ", 1, "Y"))
 if (get_parm (1, response2, "Delete buffer? ", 1, "Y"))
 { int file_is_hex;
 file_is_hex = 0;
 buf_to_edit = inq_buffer (); // Store the current buffer id
 raise_anchor (); // Remove the highlight
 if (inq_modified ()) // Only write and convert if changed

 {
 returns write_buffer (); // Preserve the return value
 file_was_modified = 1;
 }
 else
 file_was_modified = 0; // if not changed don't reconvert
 // Remove the extension.
 if (index (file_name, "."))
 file_name = substr (file_name, 1, rindex (file_name, ".") - 1);
 // Make the system buffer database current
 if (ext == "hex")
 {
 set_buffer (__hex_files);
 file_is_hex = 1;
 }
 else
 set_buffer (__unix_files); // Not used in this example
 top_of_buffer ();
 // Try to find filename record and extract original extension.
 if (search_fwd (file_name + ",ext=", 0, 0))
 { beginning_of_line ();
 ext = trim (read ());
 file_is_controlled=1;
 // Find the ascii buffer associated with the hex buffer
 if (file_is_hex)
 { int temp_buffer=inq_buffer();
 ext = ltrim (substr (ext, index (ext, "newb=") + 5));
 old_buffer = substr (ext, 1, 10);
 // Retrieve the buffer id
 buf_to_delete = atoi (old_buffer, 1);
 set_buffer(buf_to_delete);
 raise_anchor();
 write_buffer();
 drop_anchor(2);
 set_buffer(temp_buffer);
 if (upper (response2) == "Y")
 delete_buffer (buf_to_delete);
 }
 ext = trim (substr (ext, rindex (ext, "=") + 1));
 beginning_of_line ();
 if (upper (response2) == "Y")
 delete_line (); // Delete the record
 }
 // Reset the current buffer
 set_buffer (buf_to_edit);
 if (file_is_hex)
 sprintf (write_command, "bbe HB %s.hex %s.%s>&nul", file_name,
 file_name, ext);
 else
 sprintf (write_command, "bbe DU %s.unx %s.%s>&nul", file_name,
 file_name, ext);
 if (file_was_modified && upper (response) == "Y" &&
 file_is_controlled)
 {
 message ("%s", write_command);
 dos (write_command); // Call DOS to execute the conversion
 message ("Conversion complete");
 }
 // Delete the current buffer and the converted file(s).

 if (upper (response2) == "Y")
 {
 if (1 != delete_curr_buffer())
 {
 file_is_controlled=-1;
 delete_buffer(buf_to_edit);
 }
 if (file_is_hex)
 {
 sprintf (response, "%s.hex", file_name);
 del (response);
 sprintf (response, "%s.asc", file_name);
 }
 else
 sprintf (response, "%s.unx", file_name);
 del (response);
 if (file_is_controlled == -1) // No other buffers
 exit("y");
 }
 else
 if (file_is_hex && file_is_controlled)
 {
 move_rel (0, -1);
 drop_anchor ();
 move_rel (0, 1);
 }
 }
 else ;
 else ;
 else // Call the primitive
 return write_buffer ();
}




[LISTING TWO]

/* _bin_add _bin_atoh add_hex_keys _bin_edit _bin_delete */

// _bin_move -- this macro handles synchronized positioning.
void _bin_move (int direction)
{ int col, line;
 get_parm(0,direction);
 raise_anchor ();
 unregister_macro (1, "_call_on_packages"); // Remove the language control
 set_window (__asc_window);
 raise_anchor ();
 inq_position (line, col);
 switch (direction)
 { case LEFT:
 case RIGHT:
 {
 if ((col == 25 && direction == RIGHT) || (col==1 && direction==LEFT))
 { // Scrolling is needed
 move_rel (direction,0);
 col = (direction == LEFT ? 25 : 1);
 move_abs (0,col);
 }

 else
 move_rel (0,direction);
 }
 case UP:
 case DOWN:
 move_rel (direction % 10,0);
 case PGUP:
 case PGDN:
 { inq_window_size(line); // Get the number of lines in the window
 move_rel ((direction % 20) * line,0);
 }
 case HOME:
 move_abs(0,1); // Beginning of line
 case END:
 move_abs(0,25); // End of line
 case TOP:
 move_abs(1,1); // Top of buffer
 case BOTTOM:
 { end_of_buffer();
 move_abs(0,25); // End of buffer
 }
 }
 while (inq_position (line,col)) // We might be in virtual space if
 move_rel (-1, 0); // so move up until we're not.
 drop_anchor (2);
 refresh ();
 set_window (__hex_window);
 move_abs(line,col * 2); // Reposition in the hex buffer
 move_rel (0, -1); // Highlight the current byte.
 drop_anchor ();
 move_rel (0, 1);
 refresh ();
 register_macro (1, "_call_on_packages");
}
// _bin_add: overwrites the byte in the hex window. It is assigned to all of the
// valid hex editing keys and uses push_back to get the original key.
void _bin_add (string key_read)
{ string character, sread;
 int hex_val, temp_hex;
 // Get the parameter which is key pressed and push it back on keyboard
 // buffer so that it displays in prompt as if it were typed there.
 get_parm (0, key_read);
 push_back (key_to_int (key_read));
 raise_anchor ();
 // Read the text from the buffer to display the old value at the prompt.
 move_rel (0, -1);
 sread = "Enter new value for ";
 if (read (1) != "\n")
 sread += read (2);
 sread += ": ";
 if (get_parm (1, character, sread, 2))
 { // Limit the prompt response to two characters
 hex_val = _bin_atoh(character);
 sprintf (character, "%02x", hex_val); // Convert the int to hex.
 unregister_macro (1, "_call_on_packages"); // Don't disturb the windows.
 set_window (__asc_window);
 switch (hex_val) // Make sure the value typed is displayable
 { case 13:
 case 9:

 case 0:
 sread = ".";
 default:
 sprintf (sread, "%c", hex_val);
 }
 raise_anchor ();
 delete_char (); // Remove the old character and insert the new.
 insert ("%s", sread);
 move_rel(0,-1);
 drop_anchor (2);
 refresh ();
 set_window (__hex_window);
 delete_char (2); // Remove the old byte and insert the new.
 insert ("%s", upper (character));
 }
 drop_anchor ();
 move_rel (0, 1);
 refresh ();
 register_macro (1, "_call_on_packages"); // Reenable the language macro
 refresh ();
 execute_macro(inq_assignment("<Right>")); // Move to the next byte.
}
// _bin_delete: converts the current byte to XX which will be ignored when
// reconverted. Parameter is for deleting a full line.
void _bin_delete (~int line)
{ get_parm(0,line);
 unregister_macro (1, "_call_on_packages");
 set_window (__asc_window);
 raise_anchor ();
 if (line)
 { save_position();
 move_abs(0,1);
 drop_anchor(3);
 translate("?",".",1,1,1,1); // Change any character to a "."
 raise_anchor ();
 restore_position();
 }
 else
 { delete_char ();
 insert (".");
 move_rel (0, -1);
 }
 drop_anchor (2);
 refresh ();
 set_window (__hex_window);
 raise_anchor ();
 if (line)
 { save_position();
 move_abs(0,1);
 drop_anchor(3);
 translate("?","X",1,1,1,1); // Change any character to "X"
 raise_anchor ();
 restore_position();
 move_rel (0, -1);
 }
 else
 { move_rel (0, 1);
 insert ("XX");
 move_rel (0, -4);

 delete_char (2);
 }
 drop_anchor ();
 move_rel (0, 1);
 register_macro (1, "_call_on_packages");
 refresh ();
}
// _bin_atoh -- accepts a "hex" string and returns an integer.
int _bin_atoh (string to_convert)
{ string str_rev;
 int converted, loop_count, t_int;

 get_parm (0, to_convert);

 // Read a character from the end of the string and multiply
 // it by the loop count which is multiplied by 16 on each iteration.
 while (strlen (trim (to_convert)))
 { str_rev = substr (to_convert, strlen (to_convert), 1);
 t_int = index("123456789ABCDEF",upper(str_rev));
 loop_count *= 16;
 if (0 == loop_count)
 loop_count = 1;
 converted += t_int * loop_count;
 to_convert = substr (to_convert, 1, strlen (to_convert) - 1);
 }
 return converted;
}
// add_hex_keys -- adds hex editing keys to the keyboard.
add_hex_keys()
{ string macro_name, key_name;
 int loop;
 for (loop=48; loop<58; loop++) // Assign the numeric keys
 { // to the keyboard
 sprintf (key_name, "<%c>",loop);
 sprintf (macro_name, "_bin_add \"%c\"",loop);
 assign_to_key (key_name, macro_name);
 }
 for (loop=65;loop<71;loop++) // Assign the a-f Hex keys
 { // to the keyboard
 sprintf (key_name, "<%c>",loop);
 sprintf (macro_name, "_bin_add \"%c\"",loop);
 assign_to_key (key_name, macro_name);
 assign_to_key (lower(key_name), macro_name);
 }
 assign_to_key ("<Alt-n>", "edit_next_buffer");
 assign_to_key ("<Alt-e>", "edit_file");
 assign_to_key ("<Alt-w>", "write_buffer");
 assign_to_key ("<Alt-x>", "exit");
 sprintf (macro_name, "_bin_move %d", UP);
 assign_to_key ("<Up>", macro_name);
 sprintf (macro_name, "_bin_move %d", DOWN);
 assign_to_key ("<Keypad-Enter>", macro_name);
 assign_to_key ("<Down>", macro_name);
 sprintf (macro_name, "_bin_move %d", LEFT);
 assign_to_key ("<Left>", macro_name);
 sprintf (macro_name, "_bin_move %d", RIGHT);
 assign_to_key ("<Right>", macro_name);
 sprintf (macro_name, "_bin_move %d", PGUP);
 assign_to_key ("<PgUp>", macro_name);

 sprintf (macro_name, "_bin_move %d", PGDN);
 assign_to_key ("<PgDn>", macro_name);
 sprintf (macro_name, "_bin_move %d", TOP);
 assign_to_key ("<Ctrl-PgUp>", macro_name);
 sprintf (macro_name, "_bin_move %d", BOTTOM);
 assign_to_key ("<Ctrl-PgDn>", macro_name);
 sprintf (macro_name, "_bin_move %d", HOME);
 assign_to_key ("<Home>", macro_name);
 sprintf (macro_name, "_bin_move %d", END);
 assign_to_key ("<End>", macro_name);
}
// _bin_edit -- registered macro for trapping Null files message.
void _bin_edit ()
{ if (inq_message () == "Null characters in file fixed.")
 { string okay;
 string file;
 inq_names (file);
 // See if the conversion is requested.
 if (get_parm (0, okay, "Okay to create HEX file to edit? ", 1, "Y"))
 if (upper (okay) == "Y")
 _hex_edit (file, 0); // Call _hex_edit to do conversion
 else // Otherwise let the user frustrate themselves.
 message ("You asked for it.");
 }
}
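
For readers more at home in standard C than in the macro language, the right-to-left digit loop used by _bin_atoh in the listing above can be sketched as follows. This is a minimal sketch and not part of the original listing: the name hex_to_int is hypothetical, and the code assumes short inputs such as the two-character byte values the prompts accept, so it makes no attempt to guard against integer overflow.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical C counterpart of _bin_atoh (the name is an assumption).
   Walks the string from its last character to its first, weighting each
   digit by a place value that grows by a factor of 16 per step.
   Characters that are not hex digits count as zero, which mirrors the
   macro's index() lookup returning 0 on a miss. */
int hex_to_int(const char *text)
{
    static const char digits[] = "0123456789ABCDEF";
    int value = 0;
    int place = 1;                 /* 16^0 for the rightmost digit */
    size_t i = strlen(text);

    while (i > 0) {
        /* text[i - 1] is never the terminator, so strchr cannot match it */
        int c = toupper((unsigned char)text[--i]);
        const char *hit = strchr(digits, c);
        if (hit != NULL)
            value += (int)(hit - digits) * place;
        place *= 16;               /* next digit to the left is worth 16x */
    }
    return value;
}
```

Calling hex_to_int("1f") adds 15 for the rightmost digit and 1 * 16 for the next, giving 31. The macro reaches the same result by shortening the string one character at a time instead of keeping an index.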



































September, 1991
PROGRAMMING PARADIGMS


Calling Apple's Bluff




Michael Swaine


Dave Winer shouldn't be this relaxed. Sitting by the pool behind his Menlo
Park home, he seems willing to chat as long as I like about Living Videotext,
the company he created from an idea about outlining software and sold to
Symantec at the height of its success, or about UserLand, his new company that
is challenging Apple on its own turf by selling system software, including a
user scripting language called "Frontier" that seems to step on the toes of
Apple's long-rumored AppleScript. UserLand has six employees, including its
one-person Windows division (UserLand is aggressively multiplatform), and this
means that I am tying up a sixth of the company's human resources as we sit by
the pool. But Winer, with all his chips in the pot in this high-stakes game,
looks as cool as a riverboat gambler. Maybe it's because he's been bluffed so
often in the past.
DDJ: You're known for outlining, with products like ThinkTank, Ready!, and
More, but you've told me that your interest in scripting languages is just as
deep.
DW: The first scripting language I did was at the University of Wisconsin on
Unix. It used outline structures to manage program source code, the
observation being that programs are actually hierarchies. [But when] I left
Wisconsin, it struck me that outlining had more general applications, so I got
myself a Cromemco Z-2D and set out to make an integrated database and outline
processor. Essentially, it could handle tables and outlines, and outlines
could be a field type in a database, or you could have an outline index into
your table structure.
DDJ: What was the logic behind that combination?
DW: My feeling was that by having outlines, which are like a table of
contents, combined with a tabular display, which is like an index, the
structure would get into every little bit of information you had. This was
like 1980, Apple was just going public, and I came to the conclusion that
Apple was my ticket, so I picked up and moved to California to sell this
product to Apple. I found out who the guy in charge was --
DDJ: If memory serves --
DW: Steve Jobs. I called him up and I had all these meetings, and they said
they wanted to buy the tables; they didn't want the outlines. I was 25 and
really stupid. I blew them off and they [sent] me to Personal Software,
[where] I committed to something I shouldn't have. The thing I sold them was
to be called VisiText, [and] was built on the model of the Unix line editor,
but then I saw VisiCalc, and it was a religious experience. So I committed to
doing four months' worth of development work on VisiText to add the Visi to it.
DDJ: To make it screen-oriented? That wouldn't have been a trivial task.
DW: It took about a year and a half. And in that year and a half Personal
Software grew from 12 people to about 200 people, and the new management
didn't want to do outlining. They cut me loose, gave me some cash, and gave me
back the product. I went around to every other software company in the
industry, and the doors were all open, because Personal Software was the
leading software company at the time. I had meetings with Seymour Rubenstein,
Bill Gates, Fred Gibbons, the Carlstons at Broderbund. And everybody turned me
down. I had $20,000 left out of the settlement, half a manual written, and the
source code for a program that nobody wanted.
DDJ: So you decided to do it yourself.
DW: On the Apple II when the whole world was going PC. It was a struggle. We
needed a big boost, and we got it from the Mac. We were one of the very first
companies out with a Mac product. My brother Peter and Doug Barron were our
Mac team. Ironically, it wasn't the sales from the Mac product that made us.
It was sales from the PC product, but what attracted attention to the PC
product was the Mac product. [Then in] 1985, the Mac went through a terrible
year. I cut [Peter and Doug] loose and said "Go and do neat stuff, but we're
going to try and grow in the PC direction."
DDJ: In 1985 the PC market was very good.
DW: Very good. In late '85 we came out with Ready!, a resident TSR outliner.
In a lot of ways Ready! was ahead of its time. It would load in 3K of
conventional memory and keep itself up in expanded memory and do the swap when
you switched in. Everything we had learned about outlining went into it. We
had developed the product to be a Sidekick killer. Then I ran into Philippe
Kahn at PC Expo that summer at a dinner and we went aside and he said, "David,
if you position this thing against Sidekick I will put an outliner into
Sidekick and blow you out of the water."
DDJ: Philippe can be intimidating.
DW: His picture was in InfoWorld every week. I made a judgement error: I
decided to take him seriously. I decided to, in effect, position Ready! at
ThinkTank. In September 1985, it shipped. It died in March. We should have
attacked Sidekick, confronted them head-on, and we would have got some market
share. People who used the product were very happy with it.
DDJ: Basically, you were bluffed.
DW: I was bluffed. Another guy who bluffed me was Mitch Kapor. The first time
was with Symphony Text Outliner. He said, "Stop promoting ThinkTank. I'm going
to come along with Symphony Text Outliner and blow you away." And I believed
him. And it didn't happen. Mitch did it to me three times.
DDJ: What got you back into the Mac market?
DW: By early '86, the Mac Plus came out and the Mac market was just
incredible. The run rate on ThinkTank 512 was incredible, but the problem was
that ThinkTank 512 didn't run on the Mac Plus. The machine was selling in
great quantity, but we were out of money, had a head count of over 50, had
support problems [with ThinkTank 512] -- we were in desperate shape. While I
was out playing in PC land, Peter and Doug had produced all these neat little
add-ons to ThinkTank. The question was, can we take all that stuff and put it
together and come out with a new product in '86? We put everything the company
had on that, financially, in terms of personnel, in terms of our morale. Guy
Kawasaki showed us where the market opportunity was with that thing. In June
of '86, we came out with More for the Macintosh, and we never had cash
problems again.
DDJ: But the final chapter of that story is that you sold Living Videotext.
Tell me about that.
DW: We were at a crossroads. The company was accumulating cash, and we had
completely exhausted every trick we had in product development. It was going
to take another two years to load the guns up again. The only thing we could
do was to see what could be bought, but there wasn't very much to buy. In
1985, everybody had sort of abandoned the Macintosh. And at the same time,
everybody was trying to buy us. Then I had a meeting with Bill Gates in, I
guess it was February of '87, and he just blurted out, "Why don't we just buy
you?" We worked out a letter of intent. It was all happening incredibly fast.
And then the deal fell apart, but I had got committed to my board. Meanwhile,
Gordon Eubanks and John Dorr [of Symantec] were saying, "We really want to buy
you guys." So when the Microsoft deal fell through, I called up Gordon and
said, "OK, tell me how much you want to pay. It's yours."
DDJ: You were telling me that you took time off to redefine yourself after
that. You seem to have redefined yourself as president of a software company.
DW: Here I am doing it again, of course. I guess I needed a break. And I
wanted to do [this]; I wanted to do another product.
DDJ: You wanted to do a scripting language?
DW: I've always been interested in programming languages, [but] when I left
Symantec, I had no particular goal in mind. I was just building neat stuff,
things that I thought might fit into something I would do later. Then I
started hanging out with Jean-Louis Gassee. He had been giving speeches about
user scripting languages and how important they were. Bill Gates [said these
same things] in 1981, and Gassee a few years later is giving the same
speeches. [So] I had a breakfast meeting with him and said, "Can I have a look
at your development on this scripting language? Maybe I'll develop some
products for it." And he said, "There is no scripting language." And I said,
"What? You're giving all these speeches and there's no scripting language?"
And he said, "Well, sometimes to manipulate the people inside of Apple I have
to get up on a stage somewhere." So at the next breakfast meeting, I asked him
if he thought there would be any problem if I did one of these things.
DDJ: A scripting language? That must have put him in an interesting position.
DW: He did his job very well. His job was to say, "Well, this is something we
ought to do, but in the meantime we don't have anything, so why don't you go
ahead and do it." So I spent about six months getting something together.
[Apple] made me an offer, and I said, "No way." They said, "You know, we might
do one." I said, "OK, great, go do it." I remembered all the bluffs in the
past. This was the big bluff. And they didn't even realize it was a bluff.
DDJ: Well, at this point UserLand Frontier is ready to ship and there's no
sign of AppleScript, so apparently you were right. So far, anyway.
DW: I think they're two or three years behind where we are today. I always
underestimated how long it would take for us to do this, and I think they're
grossly underestimating how much effort there is in this piece of software.
DDJ: Tell me about this piece of software, Frontier.
DW: If you look at the way an operating system like Unix or MS-DOS launches
applications, you type the application name, followed by a command line, and
it's got all these parameters jammed in there in one place, OK?
DDJ: OK.
DW: The theory here is that launching the application is not a particularly
significant event. Once the application is launched, that's when the
conversation can begin. [Frontier] reaches into the applications and uses each
application as a toolbox. So a word processor becomes basically an API of what
may be hundreds of calls, all of which are incredibly good at manipulating
text. And a spreadsheet becomes a calculating engine. We've found a whole new
approach to integration. Up to this point, if you wanted to integrate a word
processor and a database, you had to have an application company that shared
your vision, and they had to produce both a word processor and a database
under one roof, and they had to put the commands into the menu that did the
integration. So therefore by necessity those commands are going to be sort of
machine-oriented things.
DDJ: How do you mean, "machine-oriented"?
DW: Some people think of Find and Replace as being very high-level things, but
for some people they're very low-level. High-level things would be like,
"Prepare for board meeting." A series of dialog boxes comes up saying, "Have
you got your balance sheet done?" "No." "OK, here's the balance sheet. What
should change?" It's a different style of usage.
DDJ: So you're saying that the user could script such a command, drawing on
the capabilities of existing applications. Or applications modified to support
user scripting. I like the idea of the user being able to use applications as
specialized toolboxes.
DW: We as commercial software developers have solutions to offer these guys,
and we have not found a way of unlocking them. The whole point here is to rig
the applications up with wires, [which] is a very light-weight job. It's a lot
easier than, say for example, putting a third-party spelling checker into a
word processor, because there's no user interface.
DDJ: This is different from the philosophy of applications talking to one
another that we've heard a lot about in discussions of interapplication
communication. I've always thought that if applications have to talk to each
other, it raises some interesting questions about competition.
DW: I don't believe in applications talking to each other. I believe that the
user is the orchestra leader. He talks to one application, asks for something,
then he passes that off to the other application. I don't think it's a trivial
difference. Because it's where all the sticking point has been in the Mac
community. A lot of people are trying to get all these conversations going
between applications, and it's an exercise in frustration. If you do it that
way you haven't broken the bottleneck.
DDJ: What's the bottleneck?
DW: The bottleneck is, here I am Joe User sitting out there in Peoria,
Illinois, and I've got [two] products that between them have all the features
that I need, but they don't work together. So I call application vendor A and
say, "Hey, man, please put a database into your word processor." Well, maybe
they do, maybe they don't. If they do, it isn't going to be the database I
like, it isn't going to be as powerful because the word processor guys don't
know --
DDJ: That's an argument for IAC generally. Applications shouldn't do each
others' jobs.
DW: All right, so we buy this argument. Let's say everybody's really
cooperative, so they say, "How would you like us to do this?" So he [tells
them and] they go to the next guy and he says, "I want a whole 'nother set of
commands. I don't like anything that guy likes." You can't find consensus,
[and] if you look for it, if you try to write a word processor that integrates
with any old database, your least common denominator is just way too low. I
think it's [as] ridiculous for any two applications to talk to each other
directly as it would be for any application to know beforehand who was going
to do the paste. It won't work that way.
DDJ: What intrigues me is that it seems to call for a lot of cooperation among
competitors.
DW: Which, as I think you were pointing out, is a very unlikely thing to
happen. And it removes forever any possibility of competitiveness on the basis
of how good your wires are. I would really much prefer to see competitiveness
based on what your IAC wires look like, because I know that in the software
industry people get into lockstep very quickly. I think a little bit of chaos
would be very useful right now.
DDJ: So how do I, as an application developer, wire up my application for
scripting, if I buy your arguments?

DW: Don't worry about everybody else. Worry about yourself. Put the wires into
your application so that you could imagine writing scripts yourself that drive
your application. The mental exercise that I ask people to go through is:
Imagine that you were integrating a scripting language into your product.
Everybody's thought about doing it at one time or another, right? So imagine
what the verbs in that language would be. That's what you should implement.
DDJ: You're right at the beginning with Frontier. What are your expectations
for the product?
DW: It's easy to tell the story of Living Videotext, because it's all shaken
out, it's all done. Last time I got to do a mysteryware product, outlining,
that I think was way ahead of the curve. I think we're ahead of the curve
again, but I don't believe we're seven years ahead of the curve. I believe
we're enough ahead of the curve that we will be a leader in this category.
That I didn't want to sacrifice, because I don't like the idea of being a
me-too in a category. There are too many things that are left unexplored to go
out and try to do something that somebody else did.
DDJ: So did you choose wisely in redefining yourself? Is this what you want to
be doing?
DW: For me this is almost the most fun I could possibly have, because it's
both really gutsy technology and really, really heavy-duty politics. We're all
fighting over who gets to design what the next generation of applications
looks like. And one of the things I'm looking forward to is playing a little
bit in that field. Three years ago, when [I met with Gassee] I had come
prepared to talk about doing applications assuming the existence of such a
thing as Frontier. Well, now Frontier is almost a reality, and now it looks
like it would be fun to do application software.
DDJ: But you're not worried about Apple coming out with something like
Frontier?
DW: No. I've decided not to be worried about bluffing this time. I believe
we're looking at a bluff. A guy sits down at a poker table and says, I've got
a great hand, I'm going to bet real high. Well, these guys are not betting
real high on the AppleScript hand. I don't spend a lot of time worrying about
that. What I spend time -- I don't know. What do I spend time worrying about?
I don't, really.



September, 1991
C PROGRAMMING


D-Flat and the C Preprocessor




Al Stevens


There comes a time when you want to stop pushing the envelope. My work is done
on two books, the August column looks good, and the D-Flat code is complete. I
took an extended weekend off and went to play the piano at the Dixieland Jazz
Jubilee festival in Sacramento. Those people sure dress funny. Not the
musicians, the patrons. It's a cross between a Jaycees convention and Sunday
on the golf course. The average age is well above the Yupper limits. I didn't
take a laptop this time. I vowed to forget about computers for the weekend.
So there I was, beer-bleary and pounding out "Little Rock Getaway" in the hot
afternoon sun of Old Sacramento, when it hit me. I need to overhaul the D-Flat
class definition process. Up to now it's all been #defines and structure
initializations. Menus and dialog boxes are slicker than that, so I'd better
get the class system into line. I finished the set and went looking for a pad
and pencil. Anita O'Day, the great jazz singer from the '50s, was backstage
complaining about everything and abusing the musicians and festival volunteers.
I guess there was no mailman handy for her to bite.
D-Flat uses the C preprocessor to do most of the program configuration. For
example, to build an application where users do not need to move and size
windows, the programmer removes the INCLUDE_SYSMENU global definition and
recompiles the library. To add a pop-down menu to the application's menu bar,
the programmer adds an entry to a table in a file named menus.c and
recompiles. A file named dialogs.c similarly manages the definition of dialog
boxes. I'll describe how the menus and dialog boxes work and then tell you
what I've done to the class system.


Menus


A Windows program defines menus and dialogs in resource files. You compile
those files separately with a resource compiler and either connect them to the
.EXE file or read them in at runtime. A D-Flat program starts up with
initialized structures that describe the menus and dialog boxes. The
preprocessor manages the initialization of those structures with some macros
that translate statements into structure declarations and initializations. The
macro call statements vaguely resemble the entries in a Windows resource file.
The idea is to make the addition and modification of resources as easy as
possible by hiding the format of the structure from the code.
Listing One, page 136, is menu.h, the header file that defines the macros with
which you build a menu bar and pop-down menus. When you develop an
applications program, you can generally ignore the format of the structures
that describe the menus. There are functions and macros to set and test the
pertinent information, such as whether a selection is active. The important
part is how to invoke the macros that create a menu bar. Listing Two, page
136, is menus.c. It contains the code for the menus the example MEMOPAD
application program uses. Refer to the July issue on page 120 for a figure
that shows what the menu bar and the popped down File menu look like.
You define a menu bar by beginning the definition with the DEFMENU macro and
ending it with the ENDMENU macro. The parameter in the DEFMENU macro names the
menu. Between the two macros, you define each pop-down menu, beginning with
the POPDOWN macro and ending with the ENDPOPDOWN macro. The first parameter
in the POPDOWN macro specifies its title on the menu bar. The title is text.
The tilde (~) character in the text is a prefix that specifies the shortcut
key to the menu. The user presses Alt and this key to pop the menu down with
the keyboard. The second parameter in the POPDOWN macro is a pointer to a
function that will execute immediately before the pop-down menu pops down.
This function can test conditions within the program and decide that toggle
commands should be turned on or off and which menu selections are active and
which ones are inactive. Between the POPDOWN and ENDPOPDOWN macros are the
SELECTION macros that define the selections. Each SELECTION macro has a
parameter that specifies the text of its selection title. The tilde characters
in the title text identify the shortcut keys to the selections when the menu
is popped down.
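As a rough illustration of the technique, and not a copy of menu.h, the following sketch shows how a few macros can turn resource-like statements into an initialized array. The structure layouts, the DEMO_ names, the command and key values, and the CountSelections helper are all assumptions made for the example.

```c
#include <stddef.h>

/* Simplified stand-ins for the D-Flat structures (assumed layout). */
struct Sel {
    const char *title;   /* selection text; ~ marks the shortcut key */
    int id;              /* command code */
    int accel;           /* accelerator key or 0 */
    int attrib;          /* attribute bits */
};

struct Pop {
    const char *title;   /* title on the menu bar */
    void (*prep)(void);  /* called before the menu pops down */
    struct Sel sels[8];  /* terminated by a NULL title */
};

/* The macros hide the brace structure from the menu description. */
#define DEMO_DEFMENU(m)             struct Pop m[] = {
#define DEMO_POPDOWN(ttl, fn)       { ttl, fn, {
#define DEMO_SELECTION(t, id, k, a) { t, id, k, a },
#define DEMO_ENDPOPDOWN             { NULL, 0, 0, 0 } } },
#define DEMO_ENDMENU                { NULL, NULL, { { NULL, 0, 0, 0 } } } };

enum { DEMO_ID_NEW = 1, DEMO_ID_SAVE, DEMO_ID_EXIT };  /* hypothetical codes */
enum { DEMO_ALT_S = 1000, DEMO_ALT_X };                /* hypothetical keys */

DEMO_DEFMENU(DemoMenu)
    DEMO_POPDOWN("~File", NULL)
        DEMO_SELECTION("~New",  DEMO_ID_NEW,  0,          0)
        DEMO_SELECTION("~Save", DEMO_ID_SAVE, DEMO_ALT_S, 0)
        DEMO_SELECTION("E~xit", DEMO_ID_EXIT, DEMO_ALT_X, 0)
    DEMO_ENDPOPDOWN
DEMO_ENDMENU

/* Count the selections on one pop-down menu. */
int CountSelections(const struct Pop *p)
{
    int n = 0;
    while (p->sels[n].title != NULL)
        n++;
    return n;
}
```

Calling CountSelections(&DemoMenu[0]) walks the File menu built by the macros. Adding a selection means adding one DEMO_SELECTION line and recompiling, with no brace counting.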
The second parameter in the SELECTION macro is the command code associated
with the menu selection. When the user chooses a selection on a menu, D-Flat
sends a message to the application window, the parent of all the document
windows. That message is a COMMAND message with the menu selection's command
code as its first parameter. If the application window cannot process the
message, and if a document window has the focus, the application window sends
the COMMAND message to the document window.
The third parameter in the SELECTION macro is either 0 or the key code for an
accelerator key. An accelerator key is a keystroke that will execute the
command from anywhere in the application, without respect to whether a menu is
popped down. The Save command on the File menu has Alt+S as its accelerator
key. Its name will display on the menu next to the selection. You can see that
in the figure from the July issue.
The last parameter in the SELECTION macro is the attribute value for the
selection. Its values, which can be ORed together, can be INACTIVE, TOGGLE,
and CHECKED. An inactive selection does not execute. For example, in the
MEMOPAD program, the Save, Save As, and Print selections on the File menu are
initially inactive. They have no purpose when the program has no document
window in focus. If you select the pop-down menu with a document window in
focus, the applications window observes that and, from its PrepFileMenu
function, which is pointed to up in the POPDOWN macro, the applications window
makes the Save, Save As, and Print selections active. If the in-focus window
is not a document, the applications window makes the selections inactive. A
selection with the TOGGLE attribute does not execute a command. Instead, when
you choose it, the selection toggles its CHECKED attribute on and off. When
the attribute is on, the selection displays with a check mark to the left of
its title on the menu. The Insert and Word Wrap selections on the Options menu
are toggle selections.
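The attribute rules can be sketched in a few lines of C. The bit values match those in menu.h, but the two helper functions are hypothetical and only model the behavior described above; they are not D-Flat functions.

```c
/* Attribute bits, with the same values as in menu.h. */
#define INACTIVE 1
#define CHECKED  2
#define TOGGLE   4

/* Hypothetical helper: activate or deactivate a selection,
   as a prep function might do before the menu pops down. */
int SetActive(int attrib, int active)
{
    return active ? (attrib & ~INACTIVE) : (attrib | INACTIVE);
}

/* Hypothetical helper: model what happens when the user chooses
   a selection. Returns 1 if a command should execute, otherwise 0. */
int ChooseSelection(int *attrib)
{
    if (*attrib & INACTIVE)
        return 0;            /* inactive selections do nothing */
    if (*attrib & TOGGLE) {
        *attrib ^= CHECKED;  /* flip the check mark; run no command */
        return 0;
    }
    return 1;                /* ordinary selection: run its command */
}
```

Because the attributes are single bits, a selection can be both TOGGLE and CHECKED at once, and INACTIVE can be set or cleared without disturbing the other two.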
The SEPARATOR macro that several menus have between some SELECTION macros
defines separator lines on the menu between the selections.
The menu bar described by menus.c is typical of most CUA applications. You
would use the same menu in any application. If the application did not include
text editing, you might not use the Edit menu. A more complete text editing
application might include a Search menu for text searches. You would add other
pop-down menus for the processes that are unique to the application. The CUA
standard specifies that a View menu can be between the Edit and Options menus.
The View menu would allow users to change the way information in the document
windows is viewed. Lists of data might have sort sequence options, for
example. Other application-dependent pop-downs would be between the View and
Options menus.
The last item in menus.c is the definition of the System Menu. It is defined
independently of the Main menu because it does not pop down from the menu bar.
Instead it pops down from the control box in the upper-left corner of windows
that use it. It has commands that let you move, size, minimize, maximize,
restore, and close the window with the keyboard.


Dialog Boxes


Figure 1 in this issue shows the MEMOPAD program with a dialog box displayed,
in this case the usual File Open dialog box. You see lots of interesting
things in this figure. The dialog box has some static text, a single-line text
entry box, two list boxes, and three command buttons. If you've ever written a
Windows application, you've seen the text input to the resource compiler.
D-Flat has something similar, but instead of a special resource language
compiler, D-Flat uses the C preprocessor to implement its dialog boxes.
Listing Three, page 137, is dialbox.h, the header file that defines the macros
for the dialog box definitions. As with menus, you don't need to remember the
structures. The macro language that defines the dialog boxes is what you will
find important. Listing Four, page 138, is dialogs.c, the source file with
definitions of the dialog boxes that the MEMOPAD application uses.
The definition for the File Open dialog box shows the format for describing
dialog boxes. You start with the DIALOGBOX macro and end with the ENDDB macro.
The parameter in the DIALOGBOX macro is the name of the structure that the
program will use when it refers to fields in the dialog box.
The DB_TITLE macro must be the first macro after the DIALOGBOX macro. It
defines the dialog box's title, screen position, and size. Most dialog boxes
will have screen coordinates of -1, -1 to tell D-Flat to center them. The next
two parameters in the DB_TITLE macro are the dialog box's height and width.
The CONTROL macros come next. Each one defines a control window on the dialog
box. The first parameter is the control box class, which can be TEXT, EDITBOX,
LISTBOX, BUTTON, CHECKBOX, or RADIOBUTTON. The dialog box in Figure 1 has all
but the last two control box classes. A checkbox displays as [ ] or [X],
depending on whether it is currently selected or not. A radio button displays
as ( ) or (o).
The second parameter in the CONTROL macro is a text string that the control
window displays. Only TEXT and BUTTON controls have text strings. The others
have NULL as that parameter. The string in a BUTTON control is the label on
the button. The string in a TEXT control is the static string display. Observe
that the File Open dialog box has a TEXT control window with a NULL string.
This control displays the current path, and the program provides the string
value when it sets up the dialog box.
The next two CONTROL macro parameters are the control window's column and row
position relative to the dialog box. After that are the control window's
height and width. The last parameter is the command code that the program uses
to interrogate and modify the values in the control window.
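The following sketch, which simplifies and does not reproduce dialbox.h, shows the same idea for dialog boxes, including the use of the # operator to derive a help mnemonic from the command code's name. The DEMO_ names, structure layouts, and the FindCtl helper are assumptions made for illustration.

```c
#include <stddef.h>

enum { DEMO_TEXT, DEMO_EDITBOX, DEMO_BUTTON };  /* hypothetical classes */
enum { DEMO_ID_NAME = 10, DEMO_ID_OK };         /* hypothetical commands */

struct DemoCtl {
    int class_id;       /* control class */
    const char *text;   /* displayed text or NULL */
    int x, y, h, w;     /* position and size within the dialog box */
    int command;        /* command code */
    const char *help;   /* help mnemonic built from the command name */
};

struct DemoBox {
    const char *name;   /* built from the structure's identifier */
    const char *title;
    int x, y, h, w;
    struct DemoCtl ctl[8];  /* terminated by a NULL help mnemonic */
};

#define DEMO_DIALOGBOX(db)           struct DemoBox db = { #db,
#define DEMO_DB_TITLE(t, x, y, h, w) t, x, y, h, w, {
#define DEMO_CONTROL(ty, tx, x, y, h, w, c) \
    { ty, tx, x, y, h, w, c, #c },
#define DEMO_ENDDB                   { 0, NULL, 0, 0, 0, 0, 0, NULL } } };

DEMO_DIALOGBOX(DemoOpen)
    DEMO_DB_TITLE("Open File", -1, -1, 8, 40)
    DEMO_CONTROL(DEMO_TEXT,    "~Filename", 2, 1, 1,  8, DEMO_ID_NAME)
    DEMO_CONTROL(DEMO_EDITBOX, NULL,       13, 1, 1, 20, DEMO_ID_NAME)
    DEMO_CONTROL(DEMO_BUTTON,  " ~OK ",    16, 5, 1,  8, DEMO_ID_OK)
DEMO_ENDDB

/* Find the first control of a given class with a given command code. */
const struct DemoCtl *FindCtl(const struct DemoBox *db, int class_id, int cmd)
{
    const struct DemoCtl *c;
    for (c = db->ctl; c->help != NULL; c++)
        if (c->class_id == class_id && c->command == cmd)
            return c;
    return NULL;
}
```

The TEXT and EDITBOX controls share DEMO_ID_NAME, so a lookup by class and command code tells them apart, which is how a program could address the edit box that a label's shortcut key refers to.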
When a TEXT control window has a command code, the text is associated with
another control window on the dialog box that has the same command code. The
text value of the TEXT control window will have a tilde character as the
prefix to one of its characters. That character is the shortcut key for moving
the focus to the associated other control window when the dialog box is
displayed. For example, Radio buttons can be grouped by their position on the
dialog box. A radio button must be one of a group, because when one is
selected, the others are deselected just like the station buttons on a car
radio. When some radio buttons have the same column coordinate and are on
adjacent rows, they constitute a group.
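The grouping rule can be modeled with a short function. This is only a sketch of the rule as stated, not D-Flat's code; the DemoRadio structure and the PushRadio function are hypothetical.

```c
#define DEMO_MAXCTL 16

/* A cut-down control record; the fields are assumptions for this sketch. */
struct DemoRadio {
    int is_radio;   /* nonzero for a radio button */
    int x, y;       /* column and row on the dialog box */
    int setting;    /* 1 = selected, 0 = not selected */
};

/* Select ctl[pick] and deselect the other radio buttons in its group.
   A group is a run of radio buttons in the same column on adjacent rows. */
void PushRadio(struct DemoRadio *ctl, int count, int pick)
{
    int i, row, changed;
    char in_group[DEMO_MAXCTL] = { 0 };

    if (pick < 0 || pick >= count || count > DEMO_MAXCTL || !ctl[pick].is_radio)
        return;
    in_group[pick] = 1;
    /* Grow the group until no neighbor on an adjacent row is left out. */
    do {
        changed = 0;
        for (i = 0; i < count; i++) {
            int j;
            if (in_group[i] || !ctl[i].is_radio || ctl[i].x != ctl[pick].x)
                continue;
            for (j = 0; j < count; j++) {
                row = ctl[i].y - ctl[j].y;
                if (in_group[j] && (row == 1 || row == -1)) {
                    in_group[i] = 1;
                    changed = 1;
                    break;
                }
            }
        }
    } while (changed);
    /* Only the picked button in the group stays selected. */
    for (i = 0; i < count; i++)
        if (in_group[i])
            ctl[i].setting = (i == pick);
}
```

With radio buttons in one column on rows 1 to 3 and 5 to 7, as in the Display dialog boxes of dialogs.c, the empty row 4 splits them into two independent groups.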


Classes


With the improvements that came to me that day on a bandstand, you can add
window classes to D-Flat by making an entry in one table, providing color
codes in another, and setting up a window processing function. To begin with,
there is classdef.h, which I published in June. It defines the array of
structures that describes classes. The new class system uses the same
structure except that the colors are no longer a part of the structure, and
the CLASS variable that identifies the class is not needed because it is
implied by the position of the structure entry in the array.
The backbone of the class system is classes.h, Listing Five, page 138. It
contains a table of ClassDef macro invocations with the data items that
describe a class. The first parameter is the name of the class. This
identifier will become a value in an enumerated data type for the rest of the
program to use to reference the type. The second data item is the identifier
for the base class, the one from which the current one is derived. A derived
class inherits the properties of the base class. The third parameter is a
pointer to the window processing module for the class. The last parameter is
the window attribute that windows of the class will open with.
Other source files include classes.h after defining the ClassDef macro to
extract only the pertinent data items that the source file needs. For example,
to define the enumerated CLASS data type, dflat.h now has the code in Example
1(a). The code in Example 1(b) builds an array of class strings for message
logging and to provide default help window mnemonics. This device uses the
preprocessor's # operator to build a string from the class name. For example,
the entry for the TEXTBOX class looks like this: ClassDef(TEXTBOX, NORMAL,
TextBoxProc, 0). The ClassDef macro, as just defined, expands that entry to
this: "TEXTBOX",. The code in Example 1(c) builds the array of class-defining
structures.
Example 1: (a) To define the enumerated CLASS data type, dflat.h now has this
code; (b) this code builds an array of class strings for message logging and
to provide default help window mnemonics; (c) this code builds the array of
class-defining structures.

(a) typedef enum window_class {
 #define ClassDef(c,b,p,a) c,
 #include "classes.h"
 CLASSCOUNT
 } CLASS;


(b) char *ClassNames[] = {
 #undef ClassDef
 #define ClassDef(c,b,p,a) #c,
 #include "classes.h"
 NULL
 };

(c) CLASSDEFS classdefs[] = {
 #undef ClassDef
 #define ClassDef(c,b,p,a) {b,p,a},
 #include "classes.h"
 {0,0,0}
 };

There is a significant performance improvement in this approach. The earlier
method required a search of the table to find the matching class every time a
window processing function called its base processing function or chained to
the default processing function for the window class. With this array, the
search is unnecessary. The entry offset is the same as the CLASS value.
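A self-contained variant of the same device is sketched here. Because a single block cannot include a separate classes.h, a list macro stands in for the file; that substitution, the three window procedures, the DEMO_CLASSDEF layout, and the CallBase helper are assumptions for the example rather than D-Flat code.

```c
#include <stddef.h>

/* Hypothetical window procedures, for the sake of the example. */
static int NormalProc(int msg)  { return msg; }
static int TextBoxProc(int msg) { return msg + 1; }
static int ListBoxProc(int msg) { return msg + 2; }

/* The class table. In D-Flat these lines live in classes.h and are
   pulled in with #include; here a list macro stands in for the file. */
#define DEMO_CLASSES(ClassDef) \
    ClassDef(NORMAL,  -1,      NormalProc,  0) \
    ClassDef(TEXTBOX, NORMAL,  TextBoxProc, 0) \
    ClassDef(LISTBOX, TEXTBOX, ListBoxProc, 0)

/* Pass 1: the enumerated type. */
#define AS_ENUM(c, b, p, a) c,
typedef enum { DEMO_CLASSES(AS_ENUM) CLASSCOUNT } CLASS;

/* Pass 2: the class names, built with the # operator. */
#define AS_NAME(c, b, p, a) #c,
const char *ClassNames[] = { DEMO_CLASSES(AS_NAME) NULL };

/* Pass 3: the class definitions, indexed by CLASS value. */
typedef struct {
    int base;           /* base class, or -1 for none */
    int (*proc)(int);   /* window procedure */
    int attrib;         /* default window attribute */
} DEMO_CLASSDEF;

#define AS_DEF(c, b, p, a) { b, p, a },
DEMO_CLASSDEF classdefs[] = { DEMO_CLASSES(AS_DEF) { -1, NULL, 0 } };

/* Call the base class procedure by direct indexing, with no table search. */
int CallBase(CLASS c, int msg)
{
    int base = classdefs[c].base;
    return base < 0 ? 0 : classdefs[base].proc(msg);
}
```

All three tables come from one list, so the enum values, the name strings, and the array positions cannot drift apart when a class is added.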
Listing Six, page 139, is config.c. This replaces an earlier version because
the method for defining default class colors is different. Class colors are
now an array with three dimensions. There is an entry for each class. If you
add a class to D-Flat, you must add an entry to the color array. Each class
has four sets of foreground and background colors. There is a standard color,
a color for selected items within the window, a color for the window's frame,
and a color for highlighted items. The highlighted color set for the MENUBAR,
POPDOWNMENU, and BUTTON classes holds two foreground colors instead of a
foreground and background pair. The first foreground color is for inactive
items in the windows, and the second is the foreground color for shortcut
keys. Those foregrounds, when used, combine with the windows' standard
background colors.
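A three-dimensional color table of this shape might look like the sketch below. It is not the table from config.c: the two demo classes, the index names, and the chosen colors are assumptions. The color numbers and the attribute-byte packing follow the usual PC text-mode convention.

```c
enum { DEMO_NORMAL, DEMO_MENUBAR, DEMO_CLASSCOUNT };  /* hypothetical classes */
enum { STD_COLOR, SELECT_COLOR, FRAME_COLOR, HILITE_COLOR };
enum { FG, BG };
enum { BLACK = 0, BLUE = 1, CYAN = 3, LIGHTGRAY = 7, DARKGRAY = 8, WHITE = 15 };

/* [class][color set][foreground or background] */
unsigned char DemoColors[DEMO_CLASSCOUNT][4][2] = {
    /* DEMO_NORMAL */
    { { LIGHTGRAY, BLACK },     /* standard */
      { BLACK, LIGHTGRAY },     /* selected */
      { LIGHTGRAY, BLACK },     /* frame */
      { WHITE, BLACK } },       /* highlighted */
    /* DEMO_MENUBAR: the highlight set holds two foregrounds */
    { { BLACK, LIGHTGRAY },     /* standard */
      { BLACK, CYAN },          /* selected */
      { BLACK, LIGHTGRAY },     /* frame */
      { DARKGRAY, WHITE } }     /* inactive fg, shortcut-key fg */
};

/* Pack a foreground and background into a PC text-mode attribute byte. */
unsigned char ClrAttr(int fg, int bg)
{
    return (unsigned char)(((bg & 7) << 4) | (fg & 15));
}

/* Attribute for a shortcut key on the menu bar: the second highlight
   foreground combined with the standard background. */
unsigned char ShortcutAttr(void)
{
    return ClrAttr(DemoColors[DEMO_MENUBAR][HILITE_COLOR][1],
                   DemoColors[DEMO_MENUBAR][STD_COLOR][BG]);
}
```

Adding a class means adding one more four-row entry to the array, in the same order as the class enumeration.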
There are three groups of D-Flat colors. The first is the color group for
color monitors. The second group is the black and white group. The third group
is a reverse black and white group, which seems to work better on some LCD
screens.


What You Can't Preprocess


It is my opinion that the ANSI X3J11 committee missed a beat when they used
the ## character pair for the preprocessor's token-pasting operator and # for
the "stringize" (see the "Programmer's Soapbox") operator. Because the
operators are new to ANSI C, the committee could just as easily have used one of
the symbols that C does not use. The @ and $ characters are available, for
example.
Perhaps the committee was reacting to prior art in some nonstandard compilers,
and perhaps the users of those compilers had strong influence on the
committee. In their rationale document, the committee says that they
introduced the "stringize" # operator to replace a practice where some
compilers replaced an identifier in a string literal that matched a macro
argument name. They say they invented the ## token paster because some
compilers supported token pasting by replacing the /**/ comment with zero
characters instead of one space, as K&R specify. Whatever the reason for the
operators, the choice of # to implement them means that a C macro cannot
include any preprocessor statements in its replacement list. You cannot write
a macro that has #ifdef or #define in it, for example.
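For readers who have not used the two operators, here is a small sketch of what they do when they are used as intended. The STR, PASTE, and DEMO_KEY macros and the key values are invented for the example.

```c
#include <stddef.h>

#define STR(x)       #x       /* "stringize": make a string from the argument */
#define PASTE(a, b)  a ## b   /* token pasting: join two tokens into one */

/* Build a name/value pair from one identifier. */
#define DEMO_KEY(k)  { #k, PASTE(DEMO_, k) }

enum { DEMO_F1 = 187, DEMO_F2 = 188 };  /* hypothetical key values */

struct DemoKeyName {
    const char *name;
    int value;
};

struct DemoKeyName DemoKeys[] = {
    DEMO_KEY(F1),   /* expands to { "F1", DEMO_F1 } */
    DEMO_KEY(F2),   /* expands to { "F2", DEMO_F2 } */
    { NULL, 0 }
};
```

Inside a function-like macro's replacement list, each # must be followed by a parameter name, which is why a replacement list cannot carry a directive of its own.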
D-Flat has a source file named keys.h that defines the values that the
keyboard event sends as a keystroke message. That file was in May's issue.
Another file is keys.c, Listing Seven, page 141. That file initializes a
structure array with entries that D-Flat uses to display accelerator key
combination names on menus. Each entry has one of the values from keys.h and a
string that is the displayable name of the key. I wanted to do all of this
with a macro the way I did with the ClassDef macro described earlier. That way
I could add and delete keys at will without needing to make changes to tables
in different source files. By including only the keys that the application
uses, the program does not use unnecessary string space for unused key
combination names. I envisioned something like a KeyDef macro that was invoked
like Example 2(a).
Example 2: (a) Invoking the KeyDef macro; (b) building the array of structures
that is used in keys.c; (c) getting values defined for the key mnemonics; (d)
getting past the preprocessor; (e) declaring and initializing integers.

(a) KeyDef (ALT_S, 159+OFFSET, "Alt+S")

(b) struct keys keys[] = {
 #undef KeyDef
 #define KeyDef(k,v,s) {v,s},
 #include "keycaps.h"
 {-1,NULL}
 };

(c) #undef KeyDef
 #define KeyDef(k,v,s) #define k v
 #include "keycaps.h"

(d) #undef KeyDef
 #define KeyDef(k,v,s) extern const int k;
 #include "keycaps.h"

(e) #undef KeyDef
 #define KeyDef(k,v,s) const int k = v;
 #include "keycaps.h"

An include file, perhaps named keycaps.h, would contain one of these calls for
every key available to a D-Flat application. The programmer would comment out
the ones that the application did not need. The header file would be called in
different places, with the KeyDef macro defined differently depending on its
purpose. For example, to build the array of structures that you see in keys.c,
I would code as in Example 2(b).
That works fine, but first I need to get values defined for the key mnemonics,
such as ALT_S. Those values are #defined in keys.h from May, but I would
prefer to have all my key definitions in one place, and so, in this
hypothetical method, I would need to use the code in Example 2(c) somewhere.
The problem is that the C preprocessor hits the # at the beginning of the
#define operator, treats it as a "stringize" operator, and preprocessing goes
downhill from there because the misinterpreted define token is not one of
the macro's parameters. If the "stringize" operator were the @, for example,
this would not be a problem.
An alternative solution that gets past the preprocessor is shown in Example
2(d). That sequence of code would appear in a header file that is visible to
all the files that refer to key combination mnemonics. It would define the
existence of external integers with the mnemonics as identifiers. To declare
and initialize the integers, I would put the code in Example 2(e) in one of
the C files in the external scope.
So far, so good. But now we run into yet another obstacle. D-Flat has several
places where static or external arrays are initialized with the key mnemonic
values. Such initializers must be constant values, not variables. The integer
declarations are const, but C (unlike C++) treats them as variables, and you
must initialize a static or extern variable with a true constant. The
work-around is to initialize every such array with assignments. That adds
runtime code and bulk to the program. I decided not to do it.
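One device that standard C does permit, although it is not the approach taken in D-Flat, is to have the macro generate enumeration constants, which are true constant expressions. The sketch below assumes the key values fit in an int. A list macro stands in for the hypothetical keycaps.h, and DEMO_OFFSET, the second key entry, and the KeyName helper are invented for the example.

```c
#include <stddef.h>

#define DEMO_OFFSET 0x1000  /* hypothetical stand-in for OFFSET */

/* Stand-in for the hypothetical keycaps.h. */
#define DEMO_KEYCAPS(KeyDef) \
    KeyDef(ALT_S, 159 + DEMO_OFFSET, "Alt+S") \
    KeyDef(ALT_X, 173 + DEMO_OFFSET, "Alt+X")

/* Pass 1: enumeration constants, which are true constant expressions. */
#define AS_ENUM(k, v, s) k = (v),
enum { DEMO_KEYCAPS(AS_ENUM) DEMO_KEY_END };

/* Pass 2: the displayable names. */
struct DemoKey {
    int value;
    const char *name;
};
#define AS_NAME(k, v, s) { k, s },
struct DemoKey DemoKeys[] = { DEMO_KEYCAPS(AS_NAME) { -1, NULL } };

/* An external array initialized with the mnemonics, which the
   const int declarations would not allow in C. */
int DemoAccelerators[] = { ALT_S, ALT_X };

/* Look up the displayable name for a key value. */
const char *KeyName(int value)
{
    const struct DemoKey *k;
    for (k = DemoKeys; k->name != NULL; k++)
        if (k->value == value)
            return k->name;
    return NULL;
}
```

The trade-off is that the mnemonics become enumerators rather than preprocessor symbols, so they cannot be tested with #ifdef or #if.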
Perhaps my grousing unfairly blames the committee. Three ANSI-compliant
compilers hiccuped on the #define inside a macro, but is that the compiler's
fault or the standard's? This next suggestion probably violates the rules of
precedence for preprocessing, but it seems that the preprocessor could
recognize #define and the other directives as unique tokens and preprocess
them without dropping into the "stringizing" mess. X3J11, however, did not
specify that, and so the compilers' behavior is correct, if not ideal.


Farewell to Power C



Last month I reported that I had ported D-Flat to MIX's Power C. So I did, but
version 4 is the last version that will have that support. I kept running into
the walls, floors, and ceilings in Power C. The most recent one was an
apparent limit on the global #define table space. D-Flat uses a lot of #define
statements. The compiler didn't issue a warning or an error. It simply began
to behave as if certain defined globals did not exist, including the one that
told it that I was using Power C. This abandonment is not an indictment of
Power C. I still think that Power C is a good deal for the budget-minded
programmer, especially the C student.


How to Get D-Flat Now


It has become obvious that I cannot continue to present the D-Flat source code
in orderly monthly installments. The code already published keeps changing as
readers use D-Flat and report back, and as I decide to fix things and make
improvements. So here's the plan: Each column will include listings that are
relevant to the discussion at hand. I will not republish entire listings of
modules that change. You should not set about to collect the monthly listings
with the intention of building a full system at the end of the series.
Instead, you should get the complete package as we go along by downloading it
or writing to me.
The D-Flat source code is on CompuServe in Library 0 of DDJ Forum and on
TelePath. Its name is DFLATn.ARC, where the n is an integer that represents a
loosely-assigned version number. There is another file, named DFn-TXT.ARC,
which contains the Help system database and the documentation for the
programmer's API. At present, everything compiles and works with Turbo C 2.0,
Turbo C++, and Microsoft C 6.0. There is a makefile for the TC and MSC
compilers. There are two example programs: the MEMOPAD program, a multiple
document notepad, and the NOTEPAD program, a single-document text editor,
built with few of the D-Flat features to demonstrate a minimum-feature
compile.
If for some reason you cannot get to either online service, send me a
formatted diskette -- any PC format -- and an addressed, stamped diskette
mailer. Send it to me in care of DDJ. I'll send you the latest copy of the
library. The software is free, but if you care to, stick a dollar bill in the
mailer. I'll match the dollar and give it to the Brevard County Food Bank.
They take care of homeless and hungry children.
If you want to discuss D-Flat with me, don't try to call me at that address. I
am never there. Use CompuServe. My CompuServe ID is 71101,1262, and I monitor
DDJ Forum daily.


The Programmer's Soapbox: The Decline of Language


I'm heading home to get back to work. The Sacramento airport loudspeaker
blares, "Hi. I'm Kevin, and I'll be your passenger boarding person today."
I am preplanning this column about the preprocessor as I wait for Kevin to
finalize preboarding those passengers who have extinguished their smoking
materials, need special assistance, and may now proceed down the jetway. I am
thinking about the words we utilize -- err, use -- hmm, abuse.
X3J11 invented more than a "stringize" operator. They invented a word. An
abomination, this "stringize," but it follows in the tradition that "verbizes"
nouns with the "ize" appendage. The practice is commonplace in military and
computer literature, and seems to be used (utilized) where no suitable verb is
known to (memorized by) the coiner. A good example is "initialize," which I
first saw in K&R, and which has now found its way into dictionaries and my
XyWrite spell checker. The X3J11 committee includes several notable writers
(among its standardizers). How did they allow this "stringize" word to slip
into the culture as they completed (finalized) the draft? Jaeschke uses the
word in his book, Mastering Standard C, as if it were a real word. Plauger
and Brodie wisely avoid it in their book, Standard C.
_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

/* ------------ menu.h ------------- */
#ifndef MENU_H
#define MENU_H

/* ----------- popdown menu selection structure
 one for each selection on a popdown menu --------- */
struct PopDown {
 char *SelectionTitle; /* title of the selection */
 int ActionId; /* the command executed */
 int Accelerator; /* the accelerator key */
 int Attrib; /* INACTIVE CHECKED TOGGLE */
 char *help; /* Help mnemonic */
};

/* ----------- popdown menu structure
 one for each popdown menu on the menu bar -------- */
typedef struct Menu {
 char *Title; /* title on the menu bar */
 void (*PrepMenu)(void *, struct Menu *); /* function */
 struct PopDown Selections[23]; /* up to 23 selections */
 int Selection; /* most recent selection */
} MENU;

/* --------- macros to define a menu bar with
 popdowns and selections ------------- */
#define SEPCHAR "\xc4"
#define DEFMENU(m) MENU m[]= {
#define POPDOWN(ttl,func) {ttl,func,{
#define SELECTION(stxt,id,acc,attr) {stxt,id,acc,attr,#id},
#define SEPARATOR {SEPCHAR},
#define ENDPOPDOWN {NULL},0}},
#define ENDMENU {NULL} };

/* -------- menu selection attributes -------- */

#define INACTIVE 1
#define CHECKED 2
#define TOGGLE 4

/* --------- the standard menus ---------- */
extern MENU MainMenu[];
extern MENU SystemMenu[];
extern MENU *ActiveMenu;

int MenuHeight(struct PopDown *);
int MenuWidth(struct PopDown *);

#endif






[LISTING TWO]

/* -------------- menus.c ------------- */
#include <stdio.h>
#include "dflat.h"

/* --------------------- the main menu --------------------- */
DEFMENU(MainMenu)
 /* --------------- the File popdown menu ----------------*/
 POPDOWN( "~File", PrepFileMenu )
 SELECTION( "~New", ID_NEW, 0, 0 )
#ifdef INCLUDE_DIALOG_BOXES
 SELECTION( "~Open...", ID_OPEN, 0, 0 )
 SEPARATOR
#endif
 SELECTION( "~Save", ID_SAVE, ALT_S, INACTIVE)
#ifdef INCLUDE_DIALOG_BOXES
 SELECTION( "Save ~as...", ID_SAVEAS, 0, INACTIVE)
#endif
 SEPARATOR
 SELECTION( "~Print", ID_PRINT, 0, INACTIVE)
 SEPARATOR
 SELECTION( "~DOS", ID_DOS, 0, 0 )
 SELECTION( "E~xit", ID_EXIT, ALT_X, 0 )
 ENDPOPDOWN
 /* --------------- the Edit popdown menu ----------------*/
 POPDOWN( "~Edit", PrepEditMenu )
 SELECTION( "~Undo", ID_UNDO, ALT_BS, INACTIVE)
#ifdef INCLUDE_CLIPBOARD
 SEPARATOR
 SELECTION( "Cu~t", ID_CUT, SHIFT_DEL, INACTIVE)
 SELECTION( "~Copy", ID_COPY, CTRL_INS, INACTIVE)
 SELECTION( "~Paste", ID_PASTE, SHIFT_INS, INACTIVE)
 SEPARATOR
 SELECTION( "Cl~ear", ID_CLEAR, 0, INACTIVE)
#endif
 SELECTION( "~Delete", ID_DELETETEXT, DEL, INACTIVE)
 SEPARATOR
 SELECTION( "Pa~ragraph", ID_PARAGRAPH, ALT_P,INACTIVE)
 ENDPOPDOWN

 /* ------------- the Options popdown menu ---------------*/
 POPDOWN( "~Options", NULL )
 SELECTION( "~Insert", ID_INSERT, INS, TOGGLE)
 SELECTION( "~Word wrap", ID_WRAP, 0, TOGGLE)
#ifdef INCLUDE_DIALOG_BOXES
 SELECTION( "~Tabs...", ID_TABS, 0, 0 )
 SEPARATOR
 SELECTION( "~Display...", ID_DISPLAY, 0, 0 )
#ifdef INCLUDE_LOGGING
 SEPARATOR
 SELECTION( "~Log Messages ",ID_LOG, 0, 0 )
#endif
#endif
 SEPARATOR
 SELECTION( "~Save Options", ID_SAVEOPTIONS, 0, 0 )
 ENDPOPDOWN

#ifdef INCLUDE_MULTIDOCS
 /* --------------- the Window popdown menu --------------*/
 POPDOWN( "~Window", PrepWindowMenu )
 SELECTION( NULL, ID_CLOSEALL, 0, 0)
 SEPARATOR
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 SELECTION( "~More Windows...", ID_WINDOW, 0, 0)
 SELECTION( NULL, ID_WINDOW, 0, 0 )
 ENDPOPDOWN
#endif
#ifdef INCLUDE_HELP
 /* --------------- the Help popdown menu ----------------*/
 POPDOWN( "~Help", NULL )
 SELECTION( "~Help for help...", ID_HELPHELP, 0, 0 )
 SELECTION( "~Extended help...", ID_EXTHELP, 0, 0 )
 SELECTION( "~Keys help...", ID_KEYSHELP, 0, 0 )
 SELECTION( "Help ~index...", ID_HELPINDEX, 0, 0 )
 SEPARATOR
 SELECTION( "~About...", ID_ABOUT, 0, 0 )
#ifdef INCLUDE_RELOADHELP
 SEPARATOR
 SELECTION( "~Reload Help Database",ID_LOADHELP,0, 0 )
#endif
 ENDPOPDOWN
#endif

ENDMENU

#ifdef INCLUDE_SYSTEM_MENUS
/* ------------- the System Menu --------------------- */
DEFMENU(SystemMenu)
 POPDOWN("System Menu", NULL)

 SELECTION("~Restore", ID_SYSRESTORE, 0, 0 )
 SELECTION("~Move", ID_SYSMOVE, 0, 0 )
 SELECTION("~Size", ID_SYSSIZE, 0, 0 )
 SELECTION("Mi~nimize", ID_SYSMINIMIZE, 0, 0 )
 SELECTION("Ma~ximize", ID_SYSMAXIMIZE, 0, 0 )
 SEPARATOR
 SELECTION("~Close", ID_SYSCLOSE, CTRL_F4, 0 )
 ENDPOPDOWN
ENDMENU

#endif





[LISTING THREE]

/* ----------------- dialbox.h ---------------- */
#ifndef DIALOG_H
#define DIALOG_H

#include <stdio.h>

#define MAXCONTROLS 25

#define OFF FALSE
#define ON TRUE
/* -------- dialog box and control window structure ------- */
typedef struct {
 char *title; /* window title */
 int x, y; /* relative coordinates */
 int h, w; /* size */
} DIALOGWINDOW;
/* ------ one of these for each control window ------- */
typedef struct {
 DIALOGWINDOW dwnd;
 int class; /* LISTBOX, BUTTON, etc */
 char *itext; /* initialized text */
 char *vtext; /* variable text */
 int command; /* command code */
 char *help; /* help mnemonic */
 int isetting; /* initially ON or OFF */
 int setting; /* ON or OFF */
 void *wnd; /* window handle */
} CTLWINDOW;
/* --------- one of these for each dialog box ------- */
typedef struct {
 char *HelpName;
 DIALOGWINDOW dwnd;
 CTLWINDOW ctl[MAXCONTROLS+1];
} DBOX;
/* -------- macros for dialog box resource compile -------- */
#define DIALOGBOX(db) DBOX db={ #db,
#define DB_TITLE(ttl,x,y,h,w) {ttl,x,y,h,w},{
#define CONTROL(ty,tx,x,y,h,w,c) \
 {{NULL,x,y,h,w},ty,tx,NULL,c,#c,(ty==BUTTON?ON:OFF),OFF,NULL},
#define ENDDB }};


#define Cancel " Cancel "
#define Ok " OK "
#define Yes " Yes "
#define No " No "

#endif





[LISTING FOUR]

/* ----------- dialogs.c --------------- */
#include "dflat.h"

#ifdef INCLUDE_DIALOG_BOXES

/* -------------- the File Open dialog box --------------- */
DIALOGBOX( FileOpen )
 DB_TITLE( "Open File", -1,-1,19,48)
 CONTROL(TEXT, "~Filename", 2, 1, 1, 8, ID_FILENAME)
 CONTROL(EDITBOX, NULL, 13, 1, 1,29, ID_FILENAME)
 CONTROL(TEXT, "Directory:", 2, 3, 1,10, 0 )
 CONTROL(TEXT, NULL, 13, 3, 1,28, ID_PATH )
 CONTROL(TEXT, "F~iles", 2, 5, 1, 5, ID_FILES )
 CONTROL(LISTBOX, NULL, 2, 6,11,16, ID_FILES )
 CONTROL(TEXT, "~Directories", 19, 5, 1,11, ID_DRIVE )
 CONTROL(LISTBOX, NULL, 19, 6,11,16, ID_DRIVE )
 CONTROL(BUTTON, " ~OK ", 36, 7, 1, 8, ID_OK)
 CONTROL(BUTTON, " ~Cancel ", 36,10, 1, 8, ID_CANCEL)
 CONTROL(BUTTON, " ~Help ", 36,13, 1, 8, ID_HELP)
ENDDB
/* -------------- the Save As dialog box --------------- */
DIALOGBOX( SaveAs )
 DB_TITLE( "Save As", -1,-1,19,48)
 CONTROL(TEXT, "~Filename", 2, 1, 1, 8, ID_FILENAME)
 CONTROL(EDITBOX, NULL, 13, 1, 1,29, ID_FILENAME)
 CONTROL(TEXT, "Directory:", 2, 3, 1,10, 0 )
 CONTROL(TEXT, NULL, 13, 3, 1,28, ID_PATH )
 CONTROL(TEXT, "~Directories",2, 5, 1,11, ID_DRIVE )
 CONTROL(LISTBOX, NULL, 2, 6,11,16, ID_DRIVE )
 CONTROL(BUTTON, " ~OK ", 36, 7, 1, 8, ID_OK)
 CONTROL(BUTTON, " ~Cancel ", 36,10, 1, 8, ID_CANCEL)
 CONTROL(BUTTON, " ~Help ", 36,13, 1, 8, ID_HELP)
ENDDB
/* -------------- generic message dialog box --------------- */
DIALOGBOX( MsgBox )
 DB_TITLE( NULL, -1,-1, 0, 0)
 CONTROL(TEXT, NULL, 1, 1, 0, 0, 0)
 CONTROL(BUTTON, NULL, 0, 0, 1, 8, ID_OK)
 CONTROL(0, NULL, 0, 0, 1, 8, ID_CANCEL)
ENDDB

#ifdef INCLUDE_MULTIDOCS
#define offset 4
#else
#define offset 0
#endif

/* ------------ VGA Display dialog box -------------- */
DIALOGBOX( DisplayVGA )
 DB_TITLE( "Display", -1, -1, 13+offset, 34)
#ifdef INCLUDE_MULTIDOCS
 CONTROL(CHECKBOX, OFF, 9, 1, 1, 3, ID_TITLE)
 CONTROL(TEXT, "~Title", 15, 1, 1, 5, ID_TITLE)
 CONTROL(CHECKBOX, OFF, 9, 2, 1, 3, ID_BORDER)
 CONTROL(TEXT, "~Border", 15, 2, 1, 6, ID_BORDER)
 CONTROL(CHECKBOX, OFF, 9, 3, 1, 3, ID_TEXTURE)
 CONTROL(TEXT, "Te~xture",15, 3, 1, 7, ID_TEXTURE)
#endif
 CONTROL(RADIOBUTTON, OFF, 9,1+offset,1,3,ID_COLOR)
 CONTROL(TEXT, "Co~lor", 15,1+offset,1,5,ID_COLOR)
 CONTROL(RADIOBUTTON, OFF, 9,2+offset,1,3,ID_MONO)
 CONTROL(TEXT, "~Mono", 15,2+offset,1,4,ID_MONO)
 CONTROL(RADIOBUTTON, OFF, 9,3+offset,1,3,ID_REVERSE)
 CONTROL(TEXT, "~Reverse", 15,3+offset,1,7,ID_REVERSE)
 CONTROL(RADIOBUTTON, OFF, 9,5+offset,1,3,ID_25LINES)
 CONTROL(TEXT, "~25 Lines",15,5+offset,1,8,ID_25LINES)
 CONTROL(RADIOBUTTON, OFF, 9,6+offset,1,3,ID_43LINES)
 CONTROL(TEXT, "~43 Lines",15,6+offset,1,8,ID_43LINES)
 CONTROL(RADIOBUTTON, OFF, 9,7+offset,1,3,ID_50LINES)
 CONTROL(TEXT, "~50 Lines",15,7+offset,1,8,ID_50LINES)
 CONTROL(BUTTON, " ~OK ", 2,9+offset,1,8,ID_OK)
 CONTROL(BUTTON, " ~Cancel ", 12,9+offset,1,8,ID_CANCEL)
 CONTROL(BUTTON, " ~Help ", 22,9+offset,1,8,ID_HELP)
ENDDB
/* ------------ EGA Display dialog box -------------- */
DIALOGBOX( DisplayEGA )
 DB_TITLE( "Display", -1, -1, 12+offset, 34)
#ifdef INCLUDE_MULTIDOCS
 CONTROL(CHECKBOX, OFF, 9, 1, 1, 3, ID_TITLE)
 CONTROL(TEXT, "~Title", 15, 1, 1, 5, ID_TITLE)
 CONTROL(CHECKBOX, OFF, 9, 2, 1, 3, ID_BORDER)
 CONTROL(TEXT, "~Border", 15, 2, 1, 6, ID_BORDER)
 CONTROL(CHECKBOX, OFF, 9, 3, 1, 3, ID_TEXTURE)
 CONTROL(TEXT, "Te~xture",15, 3, 1, 7, ID_TEXTURE)
#endif
 CONTROL(RADIOBUTTON, OFF, 9, 1+offset,1,3,ID_COLOR)
 CONTROL(TEXT, "Co~lor", 15, 1+offset,1,5,ID_COLOR)
 CONTROL(RADIOBUTTON, OFF, 9, 2+offset,1,3,ID_MONO)
 CONTROL(TEXT, "~Mono", 15, 2+offset,1,4,ID_MONO)
 CONTROL(RADIOBUTTON, OFF, 9, 3+offset,1,3,ID_REVERSE)
 CONTROL(TEXT, "~Reverse", 15, 3+offset,1,7,ID_REVERSE)
 CONTROL(RADIOBUTTON, OFF, 9, 5+offset,1,3,ID_25LINES)
 CONTROL(TEXT, "~25 Lines",15, 5+offset,1,8,ID_25LINES)
 CONTROL(RADIOBUTTON, OFF, 9, 6+offset,1,3,ID_43LINES)
 CONTROL(TEXT, "~43 Lines",15, 6+offset,1,8,ID_43LINES)
 CONTROL(BUTTON, " ~OK ", 2, 8+offset,1,8,ID_OK)
 CONTROL(BUTTON, " ~Cancel ", 12, 8+offset,1,8,ID_CANCEL)
 CONTROL(BUTTON, " ~Help ", 22, 8+offset,1,8,ID_HELP)
ENDDB
/* ------------ CGA/MDA Display dialog box -------------- */
DIALOGBOX( DisplayCGA )
 DB_TITLE( "Display", -1, -1, 9+offset, 34)
#ifdef INCLUDE_MULTIDOCS
 CONTROL(CHECKBOX, OFF, 9, 1, 1, 3, ID_TITLE)
 CONTROL(TEXT, "~Title", 15, 1, 1, 5, ID_TITLE)
 CONTROL(CHECKBOX, OFF, 9, 2, 1, 3, ID_BORDER)
 CONTROL(TEXT, "~Border", 15, 2, 1, 6, ID_BORDER)
 CONTROL(CHECKBOX, OFF, 9, 3, 1, 3, ID_TEXTURE)
 CONTROL(TEXT, "Te~xture",15, 3, 1, 7, ID_TEXTURE)
#endif
 CONTROL(RADIOBUTTON, OFF, 9, 1+offset,1,3,ID_COLOR)
 CONTROL(TEXT, "Co~lor", 15, 1+offset,1,5,ID_COLOR)
 CONTROL(RADIOBUTTON, OFF, 9, 2+offset,1,3,ID_MONO)
 CONTROL(TEXT, "~Mono", 15, 2+offset,1,4,ID_MONO)
 CONTROL(RADIOBUTTON, OFF, 9, 3+offset,1,3,ID_REVERSE)
 CONTROL(TEXT, "~Reverse", 15, 3+offset,1,7,ID_REVERSE)
 CONTROL(BUTTON, " ~OK ", 2, 5+offset,1,8,ID_OK)
 CONTROL(BUTTON, " ~Cancel ", 12, 5+offset,1,8,ID_CANCEL)
 CONTROL(BUTTON, " ~Help ", 22, 5+offset,1,8,ID_HELP)
ENDDB

#define TS2 "~2 "
#define TS4 "~4 "
#define TS6 "~6 "
#define TS8 "~8 "
/* ------------ Tab Stops dialog box -------------- */
DIALOGBOX( TabStops )
 DB_TITLE( "Editor Tab Stops", -1,-1, 10, 35)
 CONTROL(RADIOBUTTON, OFF, 2, 1, 1, 3, ID_TAB2)
 CONTROL(TEXT, TS2, 7, 1, 1, 23, ID_TAB2)
 CONTROL(RADIOBUTTON, OFF, 2, 2, 1, 11, ID_TAB4)
 CONTROL(TEXT, TS4, 7, 2, 1, 23, ID_TAB4)
 CONTROL(RADIOBUTTON, OFF, 2, 3, 1, 11, ID_TAB6)
 CONTROL(TEXT, TS6, 7, 3, 1, 23, ID_TAB6)
 CONTROL(RADIOBUTTON, OFF, 2, 4, 1, 11, ID_TAB8)
 CONTROL(TEXT, TS8, 7, 4, 1, 23, ID_TAB8)
 CONTROL(BUTTON, " ~OK ", 1, 6, 1, 8, ID_OK)
 CONTROL(BUTTON, " ~Cancel ", 12, 6, 1, 8, ID_CANCEL)
 CONTROL(BUTTON, " ~Help ", 23, 6, 1, 8, ID_HELP)
ENDDB
/* ------------ Windows dialog box -------------- */
#ifdef INCLUDE_MULTIDOCS
DIALOGBOX( Windows )
 DB_TITLE( "Windows", -1, -1, 19, 24)
 CONTROL(LISTBOX, NULL, 1, 1,11, 20, ID_WINDOWLIST)
 CONTROL(BUTTON, " ~OK ", 2, 13, 1, 8, ID_OK)
 CONTROL(BUTTON, " ~Cancel ", 12, 13, 1, 8, ID_CANCEL)
 CONTROL(BUTTON, " ~Help ", 7, 15, 1, 8, ID_HELP)
ENDDB
#endif

#ifdef INCLUDE_LOGGING
/* ------------ Message Log dialog box -------------- */
DIALOGBOX( Log )
 DB_TITLE( "D-Flat Message Log", -1, -1, 18, 41)
 CONTROL(TEXT, "~Messages", 10, 1, 1, 8, ID_LOGLIST)
 CONTROL(LISTBOX, NULL, 1, 2, 14, 26, ID_LOGLIST)
 CONTROL(TEXT, "~Logging:", 29, 4, 1, 10, ID_LOGGING)
 CONTROL(CHECKBOX, OFF, 31, 5, 1, 3, ID_LOGGING)
 CONTROL(BUTTON, " ~OK ", 29, 7, 1, 8, ID_OK)
 CONTROL(BUTTON, " ~Cancel ", 29, 10, 1, 8, ID_CANCEL)
 CONTROL(BUTTON, " ~Help ", 29, 13, 1, 8, ID_HELP)
ENDDB
#endif


#ifdef INCLUDE_HELP
/* ------------ the Help window dialog box -------------- */
DIALOGBOX( HelpBox )
 DB_TITLE( NULL, -1, -1, 0, 45)
 CONTROL(TEXTBOX, NULL, 1, 1, 0, 40, ID_HELPTEXT)
 CONTROL(BUTTON, " ~Close ", 0, 0, 1, 8, ID_CANCEL)
 CONTROL(BUTTON, " ~Back ", 10, 0, 1, 8, ID_BACK)
 CONTROL(BUTTON, "<< ~Prev ", 20, 0, 1, 8, ID_PREV)
 CONTROL(BUTTON, " ~Next >>", 30, 0, 1, 8, ID_NEXT)
ENDDB
#endif
#endif





[LISTING FIVE]

/* ----------- classes.h ------------ */
/* Class definition source file
 * Make class changes to this source file
 * Other source files will adapt
 * You must add entries to the color tables in
 * CONFIG.C for new classes.
 * Class Name Base Class Processor Attribute
 * ------------ --------- --------------- -----------
 */
ClassDef( NORMAL, -1, NormalProc, 0 )
ClassDef( APPLICATION, NORMAL, ApplicationProc, VISIBLE |
 SAVESELF |
 CONTROLBOX )
ClassDef( TEXTBOX, NORMAL, TextBoxProc, 0 )
ClassDef( LISTBOX, TEXTBOX, ListBoxProc, 0 )
ClassDef( EDITBOX, TEXTBOX, EditBoxProc, 0 )
ClassDef( MENUBAR, NORMAL, MenuBarProc, NOCLIP )
ClassDef( POPDOWNMENU, LISTBOX, PopDownProc, SAVESELF |
 NOCLIP |
 HASBORDER )
#ifdef INCLUDE_DIALOG_BOXES
ClassDef( BUTTON, TEXTBOX, ButtonProc, SHADOW |
 NOCLIP )
ClassDef( DIALOG, NORMAL, DialogProc, SHADOW |
 MOVEABLE |
 CONTROLBOX |
 HASBORDER |
 NOCLIP )
ClassDef( ERRORBOX, DIALOG, DialogProc, SHADOW |
 HASBORDER )
ClassDef( MESSAGEBOX, DIALOG, DialogProc, SHADOW |
 HASBORDER )
#else
ClassDef( ERRORBOX, TEXTBOX, NULL, SHADOW |
 HASBORDER )
ClassDef( MESSAGEBOX, TEXTBOX, NULL, SHADOW |
 HASBORDER )
#endif

#ifdef INCLUDE_HELP

ClassDef( HELPBOX, DIALOG, HelpBoxProc, SHADOW |
 MOVEABLE |
 SAVESELF |
 HASBORDER |
 NOCLIP |
 CONTROLBOX )
#endif

/* ========> Add new classes here <======== */

/* ---------- pseudo classes to create enums, etc. ---------- */
ClassDef( TITLEBAR, -1, NULL, 0 )
ClassDef( DUMMY, -1, NULL, HASBORDER )
ClassDef( TEXT, -1, NULL, 0 )
ClassDef( RADIOBUTTON, -1, NULL, 0 )
ClassDef( CHECKBOX, -1, NULL, 0 )






[LISTING SIX]

/* ------------- config.c ------------- */
#include <conio.h>
#include <string.h>
#include "dflat.h"

/* ----- default colors for color video system ----- */
unsigned char color[CLASSCOUNT] [4] [2] = {
 /* ------------ NORMAL ------------ */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {LIGHTGRAY, BLACK}},/* HILITE_COLOR */
 /* ---------- APPLICATION --------- */
 {{LIGHTGRAY, BLUE}, /* STD_COLOR */
 {LIGHTGRAY, BLUE}, /* SELECT_COLOR */
 {LIGHTGRAY, BLUE}, /* FRAME_COLOR */
 {LIGHTGRAY, BLUE}}, /* HILITE_COLOR */
 /* ------------ TEXTBOX ----------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ------------ LISTBOX ----------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLUE}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ----------- EDITBOX ------------ */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLUE}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ---------- MENUBAR ------------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {BLACK, CYAN}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {DARKGRAY, RED}}, /* HILITE_COLOR Inactive, Shortcut (both FG) */
 /* ---------- POPDOWNMENU --------- */
 {{BLACK, CYAN}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, CYAN}, /* FRAME_COLOR */
 {DARKGRAY, RED}}, /* HILITE_COLOR Inactive ,Shortcut (both FG) */
#ifdef INCLUDE_DIALOG_BOXES
 /* ------------ BUTTON ------------ */
 {{BLACK, CYAN}, /* STD_COLOR */
 {WHITE, CYAN}, /* SELECT_COLOR */
 {BLACK, CYAN}, /* FRAME_COLOR */
 {DARKGRAY, RED}}, /* HILITE_COLOR Inactive ,Shortcut (both FG) */
 /* ------------- DIALOG ----------- */
 {{LIGHTGRAY, BLUE}, /* STD_COLOR */
 {LIGHTGRAY, BLUE}, /* SELECT_COLOR */
 {LIGHTGRAY, BLUE}, /* FRAME_COLOR */
 {LIGHTGRAY, BLUE}}, /* HILITE_COLOR */
#endif
 /* ----------- ERRORBOX ----------- */
 {{YELLOW, RED}, /* STD_COLOR */
 {YELLOW, RED}, /* SELECT_COLOR */
 {YELLOW, RED}, /* FRAME_COLOR */
 {YELLOW, RED}}, /* HILITE_COLOR */
 /* ----------- MESSAGEBOX --------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
#ifdef INCLUDE_HELP
 /* ----------- HELPBOX ------------ */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLUE}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {WHITE, LIGHTGRAY}},/* HILITE_COLOR */
#endif
 /* ---------- TITLEBAR ------------ */
 {{BLACK, CYAN}, /* STD_COLOR */
 {BLACK, CYAN}, /* SELECT_COLOR */
 {BLACK, CYAN}, /* FRAME_COLOR */
 {WHITE, CYAN}}, /* HILITE_COLOR */
 /* ------------ DUMMY ------------- */
 {{GREEN, LIGHTGRAY}, /* STD_COLOR */
 {GREEN, LIGHTGRAY}, /* SELECT_COLOR */
 {GREEN, LIGHTGRAY}, /* FRAME_COLOR */
 {GREEN, LIGHTGRAY}} /* HILITE_COLOR */
};
/* ----- default colors for mono video system ----- */
unsigned char bw[CLASSCOUNT] [4] [2] = {
 /* ------------ NORMAL ------------ */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {LIGHTGRAY, BLACK}},/* HILITE_COLOR */
 /* ---------- APPLICATION --------- */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {LIGHTGRAY, BLACK}},/* HILITE_COLOR */

 /* ------------ TEXTBOX ----------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ------------ LISTBOX ----------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ----------- EDITBOX ------------ */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ---------- MENUBAR ------------- */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {DARKGRAY, WHITE}}, /* HILITE_COLOR Inactive, Shortcut (both FG) */
 /* ---------- POPDOWNMENU --------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {DARKGRAY, WHITE}}, /* HILITE_COLOR Inactive ,Shortcut (both FG) */
#ifdef INCLUDE_DIALOG_BOXES
 /* ------------ BUTTON ------------ */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {WHITE, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {DARKGRAY, WHITE}}, /* HILITE_COLOR Inactive ,Shortcut (both FG) */
 /* ------------- DIALOG ----------- */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {LIGHTGRAY, BLACK}}, /* HILITE_COLOR */
#endif
 /* ----------- ERRORBOX ----------- */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {LIGHTGRAY, BLACK}},/* HILITE_COLOR */
 /* ----------- MESSAGEBOX --------- */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {LIGHTGRAY, BLACK}},/* HILITE_COLOR */
#ifdef INCLUDE_HELP
 /* ----------- HELPBOX ------------ */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {WHITE, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {WHITE, LIGHTGRAY}},/* HILITE_COLOR */
#endif
 /* ---------- TITLEBAR ------------ */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {WHITE, LIGHTGRAY}},/* HILITE_COLOR */

 /* ------------ DUMMY ------------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}} /* HILITE_COLOR */
};
/* ----- default colors for reverse mono video ----- */
unsigned char reverse[CLASSCOUNT] [4] [2] = {
 /* ------------ NORMAL ------------ */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ---------- APPLICATION --------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ------------ TEXTBOX ----------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ------------ LISTBOX ----------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ----------- EDITBOX ------------ */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
 /* ---------- MENUBAR ------------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {DARKGRAY, WHITE}}, /* HILITE_COLOR Inactive, Shortcut (both FG) */
 /* ---------- POPDOWNMENU --------- */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {DARKGRAY, WHITE}}, /* HILITE_COLOR Inactive ,Shortcut (both FG) */
#ifdef INCLUDE_DIALOG_BOXES
 /* ------------ BUTTON ------------ */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {WHITE, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {DARKGRAY, WHITE}}, /* HILITE_COLOR Inactive ,Shortcut (both FG) */
 /* ------------- DIALOG ----------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}}, /* HILITE_COLOR */
#endif
 /* ----------- ERRORBOX ----------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}}, /* HILITE_COLOR */
 /* ----------- MESSAGEBOX --------- */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {BLACK, LIGHTGRAY}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {BLACK, LIGHTGRAY}},/* HILITE_COLOR */
#ifdef INCLUDE_HELP
 /* ----------- HELPBOX ------------ */
 {{BLACK, LIGHTGRAY}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {BLACK, LIGHTGRAY}, /* FRAME_COLOR */
 {WHITE, LIGHTGRAY}},/* HILITE_COLOR */
#endif
 /* ---------- TITLEBAR ------------ */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {LIGHTGRAY, BLACK}}, /* HILITE_COLOR */
 /* ------------ DUMMY ------------- */
 {{LIGHTGRAY, BLACK}, /* STD_COLOR */
 {LIGHTGRAY, BLACK}, /* SELECT_COLOR */
 {LIGHTGRAY, BLACK}, /* FRAME_COLOR */
 {LIGHTGRAY, BLACK}} /* HILITE_COLOR */
};
#define SIGNATURE DFLAT_APPLICATION " " VERSION

/* ------ default configuration values ------- */
CONFIG cfg = {
 SIGNATURE,
 0, /* Color */
 TRUE, /* Editor Insert Mode */
 4, /* Editor tab stops */
 TRUE, /* Editor word wrap */
 TRUE, /* Application Border */
 TRUE, /* Application Title */
 TRUE, /* Textured application window */
 25 /* Number of screen lines */
};
/* ------ load a configuration file from disk ------- */
int LoadConfig(void)
{
 FILE *fp = fopen(DFLAT_APPLICATION ".cfg", "rb");
 if (fp != NULL) {
 /* read only as many bytes as the version field holds */
 fread(cfg.version, sizeof cfg.version, 1, fp);
 if (strcmp(cfg.version, SIGNATURE) == 0) {
 fseek(fp, 0L, SEEK_SET);
 fread(&cfg, sizeof(CONFIG), 1, fp);
 }
 else
 strcpy(cfg.version, SIGNATURE);
 fclose(fp);
 }
 return fp != NULL;
}
/* ------ save a configuration file to disk ------- */
void SaveConfig(void)
{
 FILE *fp = fopen(DFLAT_APPLICATION ".cfg", "wb");
 if (fp != NULL) {
 cfg.InsertMode = GetCommandToggle(MainMenu, ID_INSERT);
 cfg.WordWrap = GetCommandToggle(MainMenu, ID_WRAP);
 fwrite(&cfg, sizeof(CONFIG), 1, fp);
 fclose(fp);
 }
}






[LISTING SEVEN]

/* ------------- keys.c ----------- */

#include <stdio.h>
#include "keys.h"

struct keys keys[] = {
 {F1, "F1"},
 {F2, "F2"},
 {F3, "F3"},
 {F4, "F4"},
 {F5, "F5"},
 {F6, "F6"},
 {F7, "F7"},
 {F8, "F8"},
 {F9, "F9"},
 {F10, "F10"},
 {CTRL_F1, "Ctrl+F1"},
 {CTRL_F2, "Ctrl+F2"},
 {CTRL_F3, "Ctrl+F3"},
 {CTRL_F4, "Ctrl+F4"},
 {CTRL_F5, "Ctrl+F5"},
 {CTRL_F6, "Ctrl+F6"},
 {CTRL_F7, "Ctrl+F7"},
 {CTRL_F8, "Ctrl+F8"},
 {CTRL_F9, "Ctrl+F9"},
 {CTRL_F10, "Ctrl+F10"},
 {ALT_F1, "Alt+F1"},
 {ALT_F2, "Alt+F2"},
 {ALT_F3, "Alt+F3"},
 {ALT_F4, "Alt+F4"},
 {ALT_F5, "Alt+F5"},
 {ALT_F6, "Alt+F6"},
 {ALT_F7, "Alt+F7"},
 {ALT_F8, "Alt+F8"},
 {ALT_F9, "Alt+F9"},
 {ALT_F10, "Alt+F10"},
 {HOME, "Home"},
 {UP, "Up"},
 {PGUP, "PgUp"},
 {BS, "BS"},
 {END, "End"},
 {DN, "Dn"},
 {PGDN, "PgDn"},
 {INS, "Ins"},
 {DEL, "Del"},

 {CTRL_HOME, "Ctrl+Home"},
 {CTRL_PGUP, "Ctrl+PgUp"},
 {CTRL_BS, "Ctrl+BS"},
 {CTRL_END, "Ctrl+End"},
 {CTRL_PGDN, "Ctrl+PgDn"},
 {SHIFT_HT, "Shift+Tab"},
 {ALT_BS, "Alt+BS"},
 {SHIFT_DEL, "Shift+Del"},
 {SHIFT_INS, "Shift+Ins"},
 {CTRL_INS, "Ctrl+Ins"},
 {ALT_A, "Alt+A"},
 {ALT_B, "Alt+B"},
 {ALT_C, "Alt+C"},
 {ALT_D, "Alt+D"},
 {ALT_E, "Alt+E"},
 {ALT_F, "Alt+F"},
 {ALT_G, "Alt+G"},
 {ALT_H, "Alt+H"},
 {ALT_I, "Alt+I"},
 {ALT_J, "Alt+J"},
 {ALT_K, "Alt+K"},
 {ALT_L, "Alt+L"},
 {ALT_M, "Alt+M"},
 {ALT_N, "Alt+N"},
 {ALT_O, "Alt+O"},
 {ALT_P, "Alt+P"},
 {ALT_Q, "Alt+Q"},
 {ALT_R, "Alt+R"},
 {ALT_S, "Alt+S"},
 {ALT_T, "Alt+T"},
 {ALT_U, "Alt+U"},
 {ALT_V, "Alt+V"},
 {ALT_W, "Alt+W"},
 {ALT_X, "Alt+X"},
 {ALT_Y, "Alt+Y"},
 {ALT_Z, "Alt+Z"},
 {-1, NULL}
};



Example 1:

(a)

 typedef enum window_class {
 #define ClassDef(c,b,p,a) c,
 #include "classes.h"
 CLASSCOUNT
 } CLASS;


(b)

 char *ClassNames[] = {
 #undef ClassDef
 #define ClassDef(c,b,p,a) #c,
 #include "classes.h"
 NULL
 };

(c)

 CLASSDEFS classdefs[] = {
 #undef ClassDef
 #define ClassDef(c,b,p,a) {b,p,a},
 #include "classes.h"
 {0,0,0}
 };
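
The three fragments of Example 1 depend on classes.h, so they cannot be compiled on their own. What follows is a minimal, self-contained sketch of the same multiple-expansion trick. It assumes a hypothetical CLASS_LIST macro standing in for the include file, lists only three classes, and uses helper names (AS_ENUM, AS_NAME, AS_BASE, class_name, is_derived_from) that are illustrative and not part of D-Flat.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for classes.h: one X(class, base) entry per class. */
#define CLASS_LIST(X)      \
    X(NORMAL,  -1)         \
    X(TEXTBOX, NORMAL)     \
    X(LISTBOX, TEXTBOX)

/* Pass 1: build the enum; CLASSCOUNT ends up equal to the entry count. */
#define AS_ENUM(c, b) c,
typedef enum window_class { CLASS_LIST(AS_ENUM) CLASSCOUNT } CLASS;

/* Pass 2: build the class names with the stringizing operator. */
#define AS_NAME(c, b) #c,
static const char *ClassNames[] = { CLASS_LIST(AS_NAME) NULL };

/* Pass 3: build a parallel table of base classes. */
#define AS_BASE(c, b) b,
static const int BaseClass[] = { CLASS_LIST(AS_BASE) -1 };

/* Look up a class name; returns "?" for an out-of-range value. */
const char *class_name(int c)
{
    return (c >= 0 && c < CLASSCOUNT) ? ClassNames[c] : "?";
}

/* Return 1 if class c is, or derives from, class base; otherwise 0. */
int is_derived_from(int c, int base)
{
    assert(c >= 0 && c < CLASSCOUNT);
    while (c != -1) {
        if (c == base)
            return 1;
        c = BaseClass[c];
    }
    return 0;
}
```

Because every table is generated from the one list, adding a class in a single place keeps the enum, the names, and the base-class table in step.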



Example 2:

(a)

 KeyDef(ALT_S, 159+OFFSET, "Alt+S")

(b)

 struct keys keys[] = {
 #undef KeyDef
 #define KeyDef(k,v,s) {v,s},
 #include "keycaps.h"
 {-1,NULL}
 };

(c)

 #undef KeyDef
 #define KeyDef(k,v,s) #define k v
 #include "keycaps.h"


(d)

 #undef KeyDef
 #define KeyDef(k,v,s) extern const int k;
 #include "keycaps.h"


(e)

 #undef KeyDef
 #define KeyDef(k,v,s) const int k = v;
 #include "keycaps.h"















September, 1991
STRUCTURED PROGRAMMING


What is this Interrupt Really About?




Jeff Duntemann, KG7JF


Interrupts, like sorrows, come not as single spies, but in battalions. At
10:00 A.M. sharp, Joni's Airpark Catering Truck rolls into our office parking
lot, playing the first three bars of "Dixie" so loud the ground shakes. This
is fine by us; Joni brings things that are cold and wet, which in a Scottsdale
summer is more than a convenience; it's survival.
Joni's arrival, furthermore, is more than just an interrupt. It's a whole
hierarchy of interrupts. When her horn whistles Dixie, work stops. People pour
into the parking lot and begin milling around, buying Cokes and muffins and
waiting for... The Pizza Pride Girl.
Nobody goes back into their offices before the appearance of the Pizza Pride
girl. It's part of the ritual, and there she is, bopping out of the little
office that says Pizza Pride on the window, waist-length blonde hair blowing
around in the hot wind. We're not sure what they do in the Pizza Pride office;
probably broker tons of imitation cheese-flavored Pizza polymer to pizza
parlors around the country. It doesn't matter. They have Pizza Pride Girl.
She's probably 20, with cheekbones up to here, exotic sculpted eyebrows and a
shameful abundance of things that bored young men like to look at. And look
they do (discreetly; this isn't any construction site), while she picks out
her Coke and doughnuts. Some days it's a silk camisole over black leather pants
laced up to the waist on both sides; other days it's a polka-dot turn over
skin-tight cyclist's pants. Some days coolie slippers, some days four-inch
heels, some days barefoot. In a third-rate office park dominated by insurance
agencies and windshield repair places, she represents the only truly random
element in an otherwise totally predictable routine.
Some days Keith and I look down from our second-floor window overlooking the
parking lot, and dig each other in the ribs watching the women watching the
men watching the Pizza Pride Girl. It doesn't really help get the work done,
but it does truly break up the day.
And sometimes, on the way out the door to get a Coke from Joni, we ask each
other, well, are you really going down for a Coke or are you just going down
to see what the Pizza Pride Girl will be wearing today?
In other words, what is this interruption really about?


Prioritizing Interrupts


Here we fold seamlessly into where we left our discussion last month, with a
question the CPU asks the 8259 Programmable Interrupt Controller chip
frequently. Recall that there's only one interrupt input pin on the CPU chip.
One of the 8259's most important jobs is to allow several different hardware
devices to share access to that one interrupt input pin. It has another job at
least as important as (and much more complex than) the first: handling the
situation that comes up when two or more different devices want to interrupt
the CPU at the same time.
The CPU can only do one thing at a time. When an interrupt comes in, it begins
executing the Interrupt Service Routine (ISR) for that interrupt. Completing
the ISR takes a short (one hopes) time, but it still takes time. What happens
if a second interrupt comes in while the CPU is still executing the ISR
belonging to the first interrupt?
That depends on the priority level of the two interrupts.
When the 8259 chip is initialized at power-up time, it stack-ranks IRQ0-IRQ7
in a particular priority order. The default ordering puts IRQ0 at the highest
priority, followed by IRQ1, and so on, with IRQ7 in the priority basement.
This ordering can be changed on-the-fly by sending commands to the 8259, but
on the PC this is almost never done. Assume unless told otherwise in PC work
that IRQ0-7 are priority ordered with IRQ0 at the top and the others in order
down from there.
Now, back to the question of a second interrupt appearing while the first is
being serviced. If the second interrupt to come in has a higher priority than
the first, the second interrupt will interrupt the first, and the second
interrupt's ISR will begin running immediately. Otherwise, the second
interrupt will be ignored.
For example: The COM1: serial port uses IRQ4. The timer tick interrupt uses
IRQ0. If you are running a communications program and your IRQ4 ISR is
currently executing, the timer tick interrupt will interrupt your
communications ISR. IRQ0 has a higher priority than IRQ4. However, the
parallel port uses IRQ7; and if the parallel port tries to interrupt the CPU
while the serial port ISR is executing, the parallel port's interrupt request
will be ignored until the serial port ISR completes execution. IRQ7 has a
lower priority than IRQ4.
Similarly, if two interrupts come in at exactly the same time (and no ISR is
currently executing), the 8259 chooses the one with the higher priority and
allows it to execute.
Priority of interrupts generally isn't a serious problem in PC work. One
potential problem appears if you want to use both COM1 and COM2 at the same
time. COM2, using as it does IRQ3, has a higher priority than COM1 using IRQ4.
COM1 cannot interrupt the machine while a COM2 interrupt is being serviced.
This is one excellent reason to keep your ISRs short and to the point.
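The priority rule above can be modeled in a few lines of C. This is only a software sketch of the decision, not code that talks to a real 8259: it assumes the default ordering (IRQ0 highest), ignores the CPU's own interrupt-enable flag and the 8259's mask register, and the function names highest_priority and next_irq are made up for the illustration.

```c
#include <assert.h>

/* Return the lowest-numbered set bit in an 8-bit mask (IRQ0 is the
   highest priority in the default ordering), or -1 if no bit is set. */
int highest_priority(unsigned mask)
{
    int irq;
    for (irq = 0; irq < 8; irq++)
        if (mask & (1u << irq))
            return irq;
    return -1;
}

/* Decide which pending request, if any, may interrupt right now.
   pending:    one bit per IRQ line currently requesting service
   in_service: one bit per IRQ whose ISR has started but not finished
   Returns the IRQ number to service, or -1 if every request must wait. */
int next_irq(unsigned pending, unsigned in_service)
{
    int candidate, running;
    assert(pending <= 0xFFu && in_service <= 0xFFu);
    candidate = highest_priority(pending);
    running = highest_priority(in_service);
    if (candidate < 0)
        return -1;              /* nothing is asking */
    if (running >= 0 && candidate >= running)
        return -1;              /* equal or lower priority must wait */
    return candidate;
}
```

With the IRQ4 serial ISR running, next_irq(1u << 0, 1u << 4) picks IRQ0 (the timer tick), while next_irq(1u << 7, 1u << 4) returns -1 because the parallel port has to wait.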


The In-Service Register and EOI


The 8259 contains an 8-bit register called the In-Service Register. (Some
folks call it the ISR, but doing so results in a bad case of colliding
acronyms -- forgive me if I spell it out a lot.) There is one bit in the
read-only In-Service Register for each of the eight IRQ lines supported by the
8259. Bit 0 belongs to IRQ0, bit 1 to IRQ1, and so on. When the ISR of an IRQ
is currently executing, that IRQ's bit in the In-Service Register will be set
to 1. When one of the bits is set to 1, the 8259 will ignore any IRQ whose
priority level is equal to or lower than that of the IRQ whose bit is set.
Thus, during the time that your IRQ4 ISR is executing, bit 4 in the In-Service
Register will be set to 1. Until that bit is cleared to 0, no interrupt
from IRQ4-IRQ7 will be recognized by the 8259. Who clears it? Your program
must, by sending a command to the 8259 chip. This command is called an
End-Of-Interrupt (EOI) command.
There are two kinds of EOI commands that the 8259 understands. One is the
nonspecific EOI (NSEOI) and the other is the specific EOI (SEOI). The specific
EOI command tells the 8259 that a particular specified ISR has completed
execution. The nonspecific EOI simply tells the 8259 that whatever ISR was
most recently executing is now complete. Nonspecific EOI is the one most
frequently used. Keep in mind that as long as interrupts are prioritized, the
8259 always knows what IRQ is currently executing: the one with the
highest-priority 1-bit set in the In-Service Register. If IRQ0 and IRQ4 are
both marked (with 1-bits) as being in service, IRQ0 must be the one currently
executing, because it has priority over IRQ4. So when a nonspecific EOI
command comes in, the 8259 clears the set bit with the highest priority in the
In-Service Register; in this case, bit 0, which belongs to IRQ0.
All you need to do to send a nonspecific EOI command to the 8259 is write the
value $20 to an 8259 register called Operation Control Word 2 (OCW2), which on
the PC is located at I/O port $20: Port[$20] := $20;
A nonspecific EOI command must be sent to the 8259 at the conclusion of every
ISR, or lower priority interrupts will simply be blocked.
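To see what the two EOI flavors do to the In-Service Register, here is a small software model in C. It is a sketch under stated assumptions: the Pic structure and the pic_ function names are hypothetical, the default priority order is assumed, and nothing here touches real hardware.

```c
#include <assert.h>

/* Hypothetical software model of the In-Service Register: bit n = IRQn. */
typedef struct { unsigned char in_service; } Pic;

/* Mark an IRQ as in service, as happens when its ISR starts running. */
void pic_begin(Pic *pic, int irq)
{
    assert(irq >= 0 && irq < 8);
    pic->in_service |= (unsigned char)(1u << irq);
}

/* Nonspecific EOI: clear the highest-priority (lowest-numbered) set bit.
   Returns the IRQ whose bit was cleared, or -1 if none was in service. */
int pic_nonspecific_eoi(Pic *pic)
{
    int irq;
    for (irq = 0; irq < 8; irq++) {
        if (pic->in_service & (1u << irq)) {
            pic->in_service &= (unsigned char)~(1u << irq);
            return irq;
        }
    }
    return -1;
}

/* Specific EOI: clear exactly the named bit and no other. */
void pic_specific_eoi(Pic *pic, int irq)
{
    assert(irq >= 0 && irq < 8);
    pic->in_service &= (unsigned char)~(1u << irq);
}
```

If IRQ4 and then IRQ0 are marked in service, the first nonspecific EOI clears bit 0 and the second clears bit 4, matching the IRQ0/IRQ4 case described in the text.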


A Real Interrupt-Driven Terminal Program


There's another aspect to the "What is this interrupt really about?" question
that I haven't addressed yet: The UART can generate an interrupt based on any
of several different occurrences, such as a character coming in, or readiness
to send another character out. But if I go further down that road without
providing some practice, the theory I've been laying out these past few
columns will set into cement without being useful. So we'll come back to that
question in the near future. For now, let's talk about the simplest possible
interrupt-driven communications program, one that recognizes only a single
interrupt trigger: an incoming character from the modem.
Listing One (page 142) is INTTERM.PAS, the interrupt-driven successor to
POLLTERM, which I presented a couple of columns ago. It does pretty much what
POLLTERM does, with the exception that it doesn't poll the serial port for
incoming characters. Instead, when a character appears from the modem, INTTERM
grabs it from the port through the use of an interrupt service routine, and
tucks it away until the program logic can deal with it.
INTTERM is a good, simple working example of how the PC's interrupt machinery
operates. Once you digest that, you'll be much less likely to trip over the
more complex serial port object I'm rolling for a future column.
INTTERM is not object oriented, and should compile under Turbo Pascal 5.0,
5.5, or 6.0.


The Polling Loop


I shouldn't mislead you by implying that there isn't any polling going on in
INTTERM.PAS. INTTERM has a polling loop almost identical to the one in
POLLTERM. The loop is the REPEAT..UNTIL structure at the end of the main
program block. It polls the incoming character buffer. If a character is
ready, it takes the character from the buffer and writes it to the screen. It
then polls the keyboard, and if a keystroke is ready (and if that keystroke is
not a predefined command), INTTERM writes the keystroke to the serial port.
That's the polling loop in a nutshell.
The key difference between POLLTERM and INTTERM is where the incoming
character comes from. POLLTERM actually polls the serial port hardware in its
InStat function, looking at bit 0 of the UART's Line Status Register (LSR). If
LSR bit 0 is 1, a character is ready to be picked up.

INTTERM also has an InStat function, but the serial port hardware is not
involved. In fact, nowhere in the main program block, nor in any routine
called by the main program block is an incoming character read from the serial
port. INTTERM's interrupt service routine fetches the incoming character from
the serial port and places it in a buffer. The polling loop only "talks to"
that buffer--and that buffer is an interesting character all by itself.


Circular Buffers


The variable Buffer is defined as type CircularBuffer -- but there isn't
anything unusual about its declaration as a simple character array. What makes
CircularBuffer special (that is, circular) is not the way it is defined but
the way it is used. Follow along on Figure 1 during the discussion below.
Two integer variables LastSaved and LastRead contain indexes into the Buffer
array. When execution begins, both are set to 0--the index of the first
element of the array. After interrupts are turned on and INTTERM begins
listening for incoming characters, the ISR Incoming picks up any characters
that arrive from the serial port, and writes them to the next position in
Buffer. The next position is defined to be the position immediately after the
index stored in the LastSaved variable.
The key to this whole notion of being circular lies in how that next position
in the array is determined. This is the code (from Incoming) that does it:
 IF LastSaved >= BUFFSIZE THEN
 LastSaved := 0
 ELSE
 Inc(LastSaved);
BUFFSIZE is an integer constant that specifies the highest index of the
CircularBuffer array type. (Here, 1023.) What happens in the test is that if
LastSaved is found to be equal to the highest index in the array, LastSaved is
reset to 0--otherwise it is simply incremented to the next index. In other
words, once you increment LastSaved to the end of the array, it "wraps" around
to the beginning again.
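For readers who think in C, the same wrap test can be sketched as a tiny function. BUFFSIZE keeps the meaning given above (the highest valid index, 1023); the names next_index and next_index_mod are invented for this illustration, and the modulo form is shown only as an equivalent alternative, not as what INTTERM does.

```c
#include <assert.h>

#define BUFFSIZE 1023   /* highest valid index, as in the text */

/* Advance an index around the buffer: wrap to 0 after the last slot. */
int next_index(int i)
{
    assert(i >= 0 && i <= BUFFSIZE);
    return (i >= BUFFSIZE) ? 0 : i + 1;
}

/* The same step written with the remainder operator. */
int next_index_mod(int i)
{
    assert(i >= 0 && i <= BUFFSIZE);
    return (i + 1) % (BUFFSIZE + 1);
}
```

The compare-and-reset form avoids a division, which is one plausible reason to prefer it inside an interrupt service routine.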
In Figure 1(a), four characters have come in, and have been placed into Buffer
by the ISR Incoming. (I've reduced the size of Buffer to 16 here to keep the
figure manageable.) LastSaved "points" to the last character it saved -- hence
its name. The other index pointer, LastRead, is defined as holding the index
of the last character read from the buffer by the polling loop, hence its
name. Here, LastRead has yet to move off zero. Once the polling loop begins to
poll, however, LastRead will pick up one character on each pass through the
REPEAT..UNTIL loop. Each time InChar picks a character out of the buffer,
LastRead is incremented:
 IF LastRead >= BUFFSIZE THEN
 LastRead := 0
 ELSE
 Inc(LastRead)
Look familiar? LastRead and LastSaved are treated identically by the program.
Both are incremented along the Buffer array, and both wrap back to 0 once they
reach the end of the array.
By the time represented in Figure 1(b), LastRead has picked four characters
out of Buffer. LastSaved, however, has gotten way ahead, having placed a full
16 characters into Buffer at the behest of Incoming. When the next interrupt
happens, Incoming will wrap LastSaved around to 0, so that it can begin its
dash through Buffer again.
Note that as soon as it wraps to 0, LastSaved begins "reusing" locations in
Buffer where it placed characters earlier; Figure 1(c). This is OK -- because
LastRead managed to get them out of the buffer and displayed to the screen
before the original characters were overwritten. In Figure 1(c), LastRead is
finally beginning to "catch up" to LastSaved. In Figure 1(b), LastRead was a
full 12 characters behind LastSaved. In Figure 1(c), LastRead has come up from
behind and is only five characters behind LastSaved.
Because both index pointers LastSaved and LastRead are wrapped around to 0
when they reach the end of the Buffer array, we can think (metaphorically) of
the two ends of Buffer brought up and butted nose-to-tail, making the buffer
into a ring, as shown in Figure 1(d). As characters are placed into the buffer
and read from the buffer, the two index pointers zip around and around the
buffer in the direction shown by the arrow.


Catching Up


The notion of "catching up" is important in a circular buffer. By definition,
the circular buffer is empty when the two pointers are equal. Think about it:
If the last character read (at LastRead) is the same as the last character
saved (at LastSaved), then there are no more characters waiting to be read.
The buffer is never "cleared" by being filled with blanks or anything. Whether
or not it is empty and what characters it "contains" is dictated solely by the
relative position of the two pointers. The contents of a circular buffer are
defined as those characters between LastRead and LastSaved, moving
index-forward from LastRead.
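The definition of "empty" and "contents" can be written down directly. The following is a minimal sketch in C under the same conventions as INTTERM (both indexes run from 0 to BUFFSIZE); the function names ring_has_data and ring_count are hypothetical and do not appear in the listing.

```c
#include <assert.h>

#define BUFFSIZE 1023   /* highest index; the array has BUFFSIZE + 1 slots */

/* Nonzero when at least one unread character is waiting. */
int ring_has_data(int last_read, int last_saved)
{
    return last_saved != last_read;
}

/* Number of unread characters, counting index-forward from last_read. */
int ring_count(int last_read, int last_saved)
{
    assert(last_read >= 0 && last_read <= BUFFSIZE);
    assert(last_saved >= 0 && last_saved <= BUFFSIZE);
    if (last_saved >= last_read)
        return last_saved - last_read;
    /* last_saved has already wrapped past the end of the array */
    return last_saved + (BUFFSIZE + 1) - last_read;
}
```

Note that nothing here touches the characters themselves; the two indexes alone decide what the buffer "contains."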
As Figure 1(b) begins to demonstrate, LastSaved can only get so far ahead of
LastRead before information stored in the buffer is overwritten and lost. So
over a period of time, the program had better be capable of reading characters
from the buffer just as fast as the interrupt service routine can put them
there. However, the buffer allows the modem to deliver data in rapid spurts,
perhaps faster (for a few seconds) than the polling loop can grab. Or, the
buffer allows the polling loop to go off and do other things for a few seconds
(like closing or opening a disk file) and not have to worry about missing
incoming characters.
On a fast machine, POLLTERM can run as fast as 2400 baud without missing any
characters. But run any faster -- or begin to do anything more involved than
throw incoming characters at the display -- and the power of INTTERM becomes
mandatory.
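How long is "a few seconds"? A rough worked figure, sketched in C: the function name stall_budget_ms is an assumption for this example, and it assumes ten bits on the wire per character (one start bit, eight data bits, one stop bit, which matches the 8-bit, no-parity format INTTERM programs into the UART). The usable capacity is taken as 1023 characters, because a full lap would make the two pointers equal and the buffer would look empty.

```c
#include <assert.h>

/* Milliseconds the polling loop may stall before unread data is overwritten.
   capacity      = characters the buffer can hold unread
   bits_per_sec  = line speed
   bits_per_char = start + data + stop bits (assumed 10 here) */
long stall_budget_ms(long capacity, long bits_per_sec, long bits_per_char)
{
    assert(capacity > 0 && bits_per_sec > 0 && bits_per_char > 0);
    return capacity * bits_per_char * 1000L / bits_per_sec;
}
```

Under those assumptions, stall_budget_ms(1023, 1200, 10) comes to 8525 milliseconds, and at 2400 bits per second the budget is a little over four seconds.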


Wrapping With AND


Experienced Pascal hackers should be assured that I am aware of a faster way
to wrap an index -- by using the AND operator -- but the whole idea with
INTTERM is to create an interrupt-driven program that the ordinary Pascal
programmer can understand. AND will do the same thing as the IF statement just
shown, like so:
 LastSaved := Succ(LastSaved) AND
 BUFFSIZE;
This, however, is nowhere near as easy to understand. For those who want to
work it out, keep in mind that using AND like this works only for values of
BUFFSIZE that are some power of two decremented by one; for example, 15, 63,
255, and so on.
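For those who would rather see it than work it out, here is a small C sketch that puts the two wrapping methods side by side. The names next_if and next_and are hypothetical; top plays the role of BUFFSIZE.

```c
#include <assert.h>

/* Wrap with a comparison, as the IF statement does. */
int next_if(int i, int top)
{
    return (i >= top) ? 0 : i + 1;
}

/* Wrap with AND. Valid only when top is a power of two minus one,
   so that top is a solid run of 1-bits. */
int next_and(int i, int top)
{
    assert(((top + 1) & top) == 0);
    return (i + 1) & top;
}
```

With top equal to 1023 (binary 1111111111), adding 1 to 1023 gives 1024, and ANDing with 1023 strips the carry bit, leaving 0. With a value such as 10 the trick breaks down at once: (0 + 1) AND 10 is 0 rather than 1.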


The ISR Itself


Functionally, the interrupt service routine itself (Incoming in INTTERM) is
simple. In truth, it's only four statements long.
The first thing most common ISRs must do is turn interrupts back on again at
the CPU. When the 86-family of processors recognizes an interrupt, it clears
the interrupt-enable flag, which blocks further interrupts. Until this flag is
set again, another interrupt will not be recognized by the CPU. Now, in most
cases there is no reason why one ISR cannot be interrupted by another, and in
fact it happens all the time. The timer tick interrupt on IRQ0 interrupts
everybody (given its top priority) and this is an accepted way of life. So
early on in a garden-variety ISR, execute an STI instruction (opcode $FB) to
set the flag and reenable CPU interrupts. (This has nothing to do with the EOI
command that must be sent to the 8259 PIC -- we'll get to that shortly.)
After reenabling CPU interrupts, the ISR increments the LastSaved pointer as
explained earlier. Once LastSaved points to the next location in Buffer, the
incoming character is read from the Receive Buffer Register (RBR) and written
into Buffer at LastSaved. That's the core task of the ISR: to grab a character
from RBR and write it into Buffer.
Finally, with its work done, the ISR must tell the 8259 that it's all
finished, and to clear the appropriate bit in the In-Service Register. This is
the purpose of the line
 Port[OCW2] := $20;
If you don't do this, you won't get another serial port interrupt, nor any
interrupt with a lower priority. Don't forget EOI!
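Why a forgotten EOI also blocks lower-priority interrupts can be seen in a toy model. The C sketch below is a simplification, not real 8259 programming: it assumes the controller's default fully nested priority scheme with IRQ0 highest, models only the In-Service Register, and uses made-up names (Pic, pic_can_deliver, pic_acknowledge, pic_eoi).

```c
#include <assert.h>

/* Bit n of in_service set means IRQ n is still being serviced. */
typedef struct { unsigned char in_service; } Pic;

/* IRQ n reaches the CPU only if no interrupt of equal or higher
   priority (equal or lower number) is still marked in service. */
int pic_can_deliver(const Pic *pic, int irq)
{
    unsigned char blocking;
    assert(irq >= 0 && irq <= 7);
    blocking = (unsigned char)((1u << (irq + 1)) - 1u);
    return (pic->in_service & blocking) == 0;
}

/* The CPU has accepted IRQ n: mark it in service. */
void pic_acknowledge(Pic *pic, int irq)
{
    assert(irq >= 0 && irq <= 7);
    pic->in_service = (unsigned char)(pic->in_service | (1u << irq));
}

/* Non-specific EOI: clear the highest-priority in-service bit. */
void pic_eoi(Pic *pic)
{
    pic->in_service &= (unsigned char)(pic->in_service - 1u);
}
```

In this model, once IRQ4 is acknowledged, IRQ4 through IRQ7 are held off until pic_eoi is called, while IRQ0 (the timer tick) can still get through.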


Turbo Pascal Interrupt Procedures


The Incoming ISR is written entirely in Turbo Pascal. There's no assembly
language involved. The qualifier INTERRUPT identifies an otherwise
unremarkable procedure as an interrupt procedure. The differences are almost
all beneath the surface, in the way the compiler generates entry and exit code
for the procedure.
One thing is for sure: Do not try to execute an interrupt procedure as though
it were a normal procedure. Odd things will happen, up to and including hard
system crashes.
Nothing in its definition attaches an interrupt procedure to any particular
interrupt vector. This has to be done manually, using Turbo Pascal's GetIntVec
and SetIntVec procedures, as shown in the SetupSerialPort procedure:

 GetIntVec(ComInt, OldVector);
 ExitProc := @IntTermExitProc;
 SetIntVec(ComInt,@Incoming);
You could conceivably just peek and poke addresses into the interrupt vector
table itself, but this is a bad idea. What would happen if an interrupt
occurred on a vector while you're half-finished changing it? Nothing good,
surely. Turbo Pascal's library procedures use the "safe" routines for altering
vectors available through DOS.
Another point to notice from the snippet shown above is the use of an exit
procedure in connection with INTTERM. The exit procedure, IntTermExitProc, is
there to restore certain essential conditions in the event that a runtime
error dumps execution from INTTERM back to DOS. Exit procedures are always
executed before the Turbo Pascal runtime code relinquishes control to DOS,
either normally or in the event of a runtime error. You don't have to call the
exit procedure explicitly; the Turbo Pascal runtime code hands control to it
automatically at the appropriate time. Note the $F directives: Exit procedures
must always be declared as FAR, because they must be callable from the runtime
library's code segment rather than any particular program or unit's code
segment.
The first priority within the exit procedure is to put the original vector
back into the interrupt vector used by INTTERM. The vector that was in the
table when INTTERM took control from DOS is saved in the variable OldVector,
and OldVector is stuffed back into the interrupt vector table by the exit
procedure. Additionally, the communications line is brought down, and
interrupts are disabled both at the UART and at the 8259 PIC.
An exit procedure you build into your Pascal communications software may do
more than this, but it should never do less.
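The save, hook, install sequence has a loose analog in standard C, sketched below. This is an analogy and nothing more: vector_table is an ordinary array standing in for the real interrupt vector table, all the names are hypothetical, and C's atexit covers only normal termination, whereas a Turbo Pascal exit procedure also runs after a runtime error.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*Handler)(void);

#define COMINT 11                 /* vector number, as for COM2 in INTTERM */

Handler vector_table[256];        /* stand-in for the interrupt vector table */
Handler old_vector;               /* plays the role of OldVector */

void incoming(void) { /* the ISR body would go here */ }

/* Plays the role of IntTermExitProc: put the saved vector back. */
void restore_vector(void)
{
    vector_table[COMINT] = old_vector;
}

/* Save the current vector, arrange for cleanup, then install the handler.
   Returns 0 on success, -1 if the cleanup routine could not be registered. */
int install_handler(void)
{
    old_vector = vector_table[COMINT];      /* like GetIntVec */
    if (atexit(restore_vector) != 0)        /* like hooking ExitProc */
        return -1;
    vector_table[COMINT] = incoming;        /* like SetIntVec */
    return 0;
}
```

The order matters in both languages: the cleanup is registered before the new handler goes in, so there is never a moment when the vector has been changed but nothing is lined up to change it back.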


What Not to Do in an Interrupt Procedure


And finally, the big question of How To Stay Alive inside an interrupt
routine. Most people have a vague awareness that interrupt service routines
are truly alien territory, where things that nobody would bat an eye about in
an ordinary program can put your session six feet under. The problem is called
reentrancy, and it centers around the possibility of interrupting a piece of
code, and then executing that piece of code again from within the interrupt
service routine. Some code is reentrant (meaning it can be interrupted and
executed again) and some is not.
The biggest hazard is DOS. Once interrupted, DOS should not be executed a
second time unless some very arcane precautions are taken. (I won't attempt to
enumerate them here. That's a subject for an entire book, not a column.) So
don't make DOS calls from within an ISR, or (obviously) call any code that
makes DOS calls. Read/Readln and Write/Writeln use DOS, as do all disk access
routines. The heap is another thing that is not generally reentrant. Do not
call any of the heap management routines. However, I've managed to access data
stored on the heap from within an ISR by dereferencing pointers. This data was
allocated outside the ISR, however. Do not allocate or free memory from within
an ISR.
In general, limit your ISR's activities to simple stuffing of buffers and
manipulation of global variables, as INTTERM does here. Reentrancy is only one
of your problems; putting too much stuff into an ISR makes your ISR run
slowly, which is a hazard all by itself. The bulkier your ISR, the less
quickly your com software can be made to run.


Torment It!


On the other hand, it might well be educational to carefully back up your hard
disk and do a lot of stupid things to INTTERM. Make DOS calls or allocate heap
memory from within the ISR. Forget to issue EOI. Call Incoming directly. Then
watch what happens.
I've found that you learn less when things go well than when they blow up in
your face. And the evil savor of the forbidden can be very satisfying --
especially considering that mayhem is less likely with software than with,
say, lighting off M80s inside beer cans.
That's it on interrupts for a column or so. Study them in the meantime. We'll
be back to them, after covering some additional design issues and taking a
stab at the goblin we know as Turbo Vision.
_STRUCTURED PROGRAMMING COLUMN_
by Jeff Duntemann


[LISTING ONE]

{------------------------------------------------------------------------}
{ INTTERM }
{ by Jeff Duntemann }
{ Turbo Pascal V5.0 or later }
{ Last update 6/2/91 }
{ This is an interrupt-driven "dumb terminal" program for the PC. It }
{ demonstrates the use of Turbo Pascal's INTERRUPT procedures, and in a }
{ lesser fashion the use of serial port hardware. It can be set to use }
{ either COM1: or COM2: by setting the COMPORT constant to 1 (for COM1) }
{ or 2 (for COM2) as appropriate and recompiling. }
{------------------------------------------------------------------------}

PROGRAM INTTERM;

USES DOS,CRT;

CONST
 COMPORT = 2; { 1 = COM1: 2 = COM2: }
 COMINT = 13-COMPORT; { 12 = COM1: (IRQ4) 11 = COM2: (IRQ3) }
 COMBASE = $2F8;
 PORTBASE = COMBASE OR (COMPORT SHL 8); { $3F8 for COM1: $2F8 for COM2: }

 { 8250 control registers, masks, etc. }
 RBR = PORTBASE; { 8250 Receive Buffer Register }
 THR = PORTBASE; { 8250 Transmit Holding Register }
 LCR = PORTBASE + 3; { 8250 Line Control Register }
 IER = PORTBASE + 1; { 8250 Interrupt Enable Register }
 MCR = PORTBASE + 4; { 8250 Modem Control Register }
 LSR = PORTBASE + 5; { 8250 Line Status Register }
 DLL = PORTBASE; { 8250 Divisor Latch LSB }

 DLM = PORTBASE + 1; { 8250 Divisor Latch MSB }
 DLAB = $80; { 8250 Divisor Latch Access Bit }

 BAUD300 = 384; { Value for 300 baud operation }
 BAUD1200 = 96; { Value for 1200 baud operation }
 NOPARITY = 0; { Comm format value for no parity }
 BITS8 = $03; { Comm format value for 8 bits }
 DTR = $01; { Value for Data Terminal Ready }
 RTS = $02; { Value for Request To Send }
 OUT2 = $08; { Bit that enables adapter interrupts }
 BUFFSIZE = 1023;

 { 8259 control registers, masks, etc. }
 OCW1 = $21; { 8259 Operation Control Word 1 }
 OCW2 = $20; { 8259 Operation Control Word 2 }
 { The 8259 mask bit is calculated depending on }
 { which serial port is used... }
 { $10 for COM1: (IRQ4); $08 for COM2: (IRQ3): }
 IRQBIT = $20 SHR COMPORT;

TYPE
 CircularBuffer = ARRAY[0..BUFFSIZE] OF Char; { Input buffer }
VAR
 Quit : Boolean; { Flag for exiting the program }
 HiBaud : Boolean; { True if 1200 baud is being used }
 KeyChar : Char; { Character from keyboard }
 CommChar : Char; { Character from the comm port }
 Divisor : Word; { Divisor value for setting baud rate }
 Clearit : Byte; { Dummy variable }
 Buffer : CircularBuffer; { Our incoming character buffer }
 LastRead, { Index of the last character read }
 LastSaved : Integer; { Index of the last character stored }
 NoShow : SET OF Char; { Don't show characters set }
 OldVector : Pointer; { Global storage slot for the old }
 { interrupt vector }
PROCEDURE EnableInterrupts;
INLINE($FB);
{->>>>Incoming (Interrupt Service Routine)<<<<--------------------------------}
{ This is the ISR (Interrupt Service Routine) for comm ports. DO NOT call     }
{ this routine directly; you'll crash hard. The only way Incoming takes       }
{ control is when a character coming in from the modem triggers a hardware    }
{ interrupt from the serial port chip, the 8250 UART. Note that the register  }
{ pseudo-parameters are not needed here, and you could omit them. However,    }
{ omitting them doesn't really get you any more speed or reliability.         }
{-----------------------------------------------------------------------------}

PROCEDURE Incoming(Flags,CS,IP,AX,BX,CX,DX,SI,DI,DS,ES,BP : Word);
INTERRUPT;
BEGIN
 { First we have to enable interrupts *at the CPU* during the ISR: }
 EnableInterrupts;
 { The first "real work" we do is either wrap or increment the index of the }
 { last character saved. If index is "topped out" at buffer size (here, }
 { 1023) we force it to zero. This makes the 1024-byte buffer "circular," }
 { in that once the index hits the end, it rolls over to beginning again. }
 IF LastSaved >= BUFFSIZE THEN LastSaved := 0 ELSE Inc(LastSaved);

 { Next, we read the actual incoming character from the serial port's}
 { one-byte holding buffer: }

 Buffer[LastSaved] := Char(Port[RBR]);

 { Finally, we must send a control byte to the 8259 interrupt }
 { controller, telling it that the interrupt is finished: }
 Port[OCW2] := $20; { Send EOI byte to 8259 }
END;

{$F+}
PROCEDURE IntTermExitProc;
BEGIN
 Port[IER] := 0; { Disable interrupts at 8250 }
 Port[OCW1] := Port[OCW1] OR IRQBIT; { Disable comm int at 8259 }
 Port[MCR] := 0; { Bring the comm line down }
 SetIntVec(COMINT,OldVector); { Restore previously saved vector }
END;
{$F-}

PROCEDURE SetupSerialPort;
BEGIN
 LastRead := 0; { Initialize the circular buffer pointers }
 LastSaved := 0;

 Port[IER] := 0; { Disable 8250 interrupts while setting them up }

 GetIntVec(ComInt,OldVector); { Save old interrupt vector }
 ExitProc := @IntTermExitProc; { Hook exit proc into chain }
 SetIntVec(ComInt,@Incoming); { Put ISR address into vector table }

 Port[LCR] := Port[LCR] OR DLAB; { Set up 8250 to set baud rate }
 Port[DLL] := Lo(Divisor); { Set baud rate divisor }
 Port[DLM] := Hi(Divisor);
 Port[LCR] := BITS8 OR NOPARITY; { Set word length and parity }
 Port[MCR] := DTR OR RTS OR OUT2; { Enable adapter, DTR, & RTS }
 Port[OCW1] := Port[OCW1] AND (NOT IRQBIT); { Turn on 8259 comm ints }
 Clearit := Port[RBR]; { Clear any garbage from RBR }
 Clearit := Port[LSR]; { Clear any garbage from LSR }
 Port[IER] := $01; { Enable 8250 interrupt on received character }
END;

FUNCTION InStat : Boolean;
BEGIN
 IF LastSaved <> LastRead THEN InStat := True
 ELSE InStat := False;
END;

FUNCTION InChar : Char; { Bring in the next character }
BEGIN
 IF LastRead >= BUFFSIZE THEN LastRead := 0
 ELSE Inc(LastRead);
 InChar := Buffer[LastRead];
END;

PROCEDURE OutChar(Ch : Char); { Send a character to the comm port }
BEGIN
 Port[THR] := Byte(Ch) { Put character into Transmit Holding Register }
END;

PROCEDURE ShowHelp;
BEGIN

 Writeln('>>>IntTerm by Jeff Duntemann >>>>>>>>>>>>>>>>>>>>>>>>>>>');
 Writeln(' Defaults to 1200 Baud; to run at 300 Baud');
 Writeln(' invoke with "300" after "INTTERM" on the command line.');
 Writeln;
 Writeln(' Commands:');
 Writeln(' ALT-X: Exits to DOS');
 Writeln(' CTRL-Z: Clears the screen');
 Writeln('<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<');
END;

{>>>>>INTTERM MAIN PROGRAM<<<<<}
BEGIN
  HiBaud := True;           { Defaults to 1200 baud; if "300" is     }
  Divisor := BAUD1200;      { entered after "INTTERM" on the command }
  IF ParamCount > 0 THEN    { line, then 300 baud is used instead.   }
    IF ParamStr(1) = '300' THEN
      BEGIN
        HiBaud := False;
        Divisor := BAUD300
      END;
  DirectVideo := True;
  NoShow := [#0,#127];      { Don't display NUL or RUBOUT }
  SetupSerialPort;          { Set up serial port & turn on interrupts }
  ClrScr;
  Writeln('>>>INTTERM by Jeff Duntemann');
  Quit := False;            { Exit INTTERM when Quit goes to True }
  REPEAT
    IF InStat THEN          { If a character comes in from the modem }
      BEGIN
        CommChar := InChar;                        { Go get character }
        CommChar := Char(Byte(CommChar) AND $7F);  { Mask off high bit }
        IF NOT (CommChar IN NoShow) THEN           { If we can show it, }
          Write(CommChar)                          { then show it! }
      END;
    IF KeyPressed THEN      { If a character is typed at the keyboard }
      BEGIN
        KeyChar := ReadKey;        { First, read the keystroke }
        IF KeyChar = Chr(0) THEN   { We have an extended scan code here }
          BEGIN
            KeyChar := ReadKey;    { Read second half of extended code }
            CASE Ord(KeyChar) OF
              59 : ShowHelp;       { F1 : Display help screen }
              45 : Quit := True;   { Alt-X: Exit IntTerm }
            END { CASE }
          END
        ELSE
          CASE Ord(KeyChar) OF
            26 : ClrScr;           { Ctrl-Z: Clear the screen }
            ELSE OutChar(KeyChar)
          END; { CASE }
      END
  UNTIL Quit
END.
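
The constant arithmetic near the top of the listing is terse, so here is a sketch in C that recomputes it. The function names are hypothetical. The baud-rate formula assumes the standard PC serial adapter clock of 1.8432 MHz, which after the UART's divide-by-16 gives a divisor of 115200 divided by the desired rate; the listing states only the resulting values.

```c
#include <assert.h>

/* Divisor for the 8250 baud-rate generator (assumes 115200 / baud). */
unsigned baud_divisor(unsigned long baud)
{
    assert(baud > 0 && 115200UL % baud == 0);
    return (unsigned)(115200UL / baud);
}

/* The listing's constants, recomputed for port 1 (COM1) or 2 (COM2). */
unsigned com_int(unsigned port)   { return 13u - port; }             /* COMINT   */
unsigned port_base(unsigned port) { return 0x2F8u | (port << 8); }   /* PORTBASE */
unsigned irq_bit(unsigned port)   { return 0x20u >> port; }          /* IRQBIT   */
```

For COM1 these give interrupt 12, base address $3F8, and mask bit $10; for COM2, interrupt 11, base address $2F8, and mask bit $08. The OR in port_base works because $2F8 already has bit 9 set, so ORing in $200 for COM2 changes nothing, while ORing in $100 for COM1 produces $3F8.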


September, 1991
GRAPHICS PROGRAMMING


256-Color VGA Animation




Michael Abrash


No amusing stories or informative anecdotes to kick off the column this time;
too much ground to cover; gotta hurry. Won't talk about the time a friend made
the mistake of loudly saying "$100 bill" during an animated discussion while
walking among the bums on Market Street in San Francisco one night, thereby
graphically illustrating that context is everything. Can't spare a word about
how my daughter thinks my 11-year-old floppy-based CP/M machine is more
powerful than my 386 with its 100-Mbyte hard disk because the CP/M machine's
word processor loads and runs twice as fast as the 386's Windows-based word
processor, demonstrating that progress is not the neat exponential curve we'd
like to think it is, and that features and performance are often conflicting
notions. And, Lord knows, can't take the time to discuss the dietary or other
habits of small white dogs, notwithstanding that said dogs seem to be relevant
to just about every aspect of computing; check Jeff's column for that. No
lighthearted fluff for us; we have real work to do, for today we animate with
256 colors.


Masked Copying


Over the past two columns, we've put together most of the tools needed to
implement animation in the VGA's undocumented 320 x 240 256-color mode, which
I call "mode X." We now have mode set code, solid and 4 x 4 pattern fills,
system memory-to-display memory block copies, and display memory-to-display
memory block copies. The final piece of the puzzle is the ability to copy a
nonrectangular image to display memory; I call this "masked copying."
Masked copying is sort of like drawing through a stencil, in that only certain
pixels within the destination rectangle are drawn. The object is to fit the
image seamlessly into the background, without the rectangular fringe that
results when nonrectangular images are drawn by block copying their bounding
rectangle. This is accomplished by using a second rectangular bitmap, separate
from the image but corresponding to it on a pixel-by-pixel basis, to control
which destination pixels are set from the source and which are left unchanged.
With a masked copy, only those pixels properly belonging to an image are
drawn, and the image fits perfectly into the background, with no rectangular
border. In fact, masked copying even makes it possible to have transparent
areas within images.
The system memory-to-display memory masked copy routine in Listing One (page
143) implements masked copying in a straightforward fashion. In the main
drawing loop, the corresponding mask byte is consulted as each image pixel is
encountered, and the image pixel is copied only if the mask byte is nonzero.
As with most of the system-to-display code I've presented, Listing One is not
heavily optimized, because it's inherently slow; there's a better way to go
when performance matters, and that's to use the VGA's hardware.
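The idea of the main drawing loop can be sketched in a few lines of C. This is not Listing One: it assumes a plain linear, byte-per-pixel destination rather than mode X's planar layout, and the function name masked_copy and its parameters are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Copy a width x height image into dest, skipping every pixel whose
   mask byte is 0. Image and mask are byte-per-pixel and the same size;
   dest_pitch is the length of one destination row in bytes. */
void masked_copy(unsigned char *dest, size_t dest_pitch,
                 const unsigned char *image, const unsigned char *mask,
                 size_t width, size_t height)
{
    size_t x, y;
    assert(dest && image && mask && width <= dest_pitch);
    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            if (mask[y * width + x] != 0)
                dest[y * dest_pitch + x] = image[y * width + x];
}
```

The per-pixel test inside the inner loop is exactly why this approach is slow, and why handing the masking job to the VGA's hardware pays off.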


Fast Masked Copying


Last month we saw how the VGA's latches can be used to copy four pixels at a
time from one area of display memory to another in mode X. We've further seen
that in mode X the Map Mask register can be used to select which planes are
copied. That's all we need to know to be able to perform fast masked copies;
we can store an image in off-screen display memory, and set the Map Mask to
the appropriate mask value as up to four pixels at a time are copied.
There's a slight hitch, though. The latches can only be used when the source
and destination left edge coordinates, modulo four, are the same, as explained
last month. The solution is to copy all four possible alignments of each image
to display memory, each properly positioned for one of the four possible
destination-left-edge-modulo-four cases. These aligned images must be
accompanied by the four possible alignments of the image mask, stored in
system memory. Given all four image and mask alignments, masked copying is a
simple matter of selecting the alignment that's appropriate for the
destination's left edge, then setting the Map Mask with the 4-bit mask
corresponding to each four-pixel set as we copy four pixels at a time via the
latches.
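To make "the 4-bit mask corresponding to each four-pixel set" concrete, here is one way the mask values for a single row might be generated in C. It is a sketch, not the code from Listing Three; the function name build_map_masks is hypothetical, and it assumes the mode X layout in which the pixel at X coordinate x lives in plane x modulo four.

```c
#include <assert.h>
#include <string.h>

/* Build Map Mask values for one row of a byte-per-pixel mask, for a
   destination left edge whose X coordinate modulo four equals align.
   Bit p of each output byte enables plane p for that four-pixel group.
   Returns the number of groups written to out. */
int build_map_masks(const unsigned char *mask_row, int width, int align,
                    unsigned char *out, int out_size)
{
    int groups, i;
    assert(width > 0 && align >= 0 && align <= 3);
    groups = (align + width + 3) / 4;
    assert(groups <= out_size);
    memset(out, 0, (size_t)groups);
    for (i = 0; i < width; i++) {
        int pos = align + i;        /* pixel position within the aligned row */
        if (mask_row[i] != 0)
            out[pos >> 2] |= (unsigned char)(1u << (pos & 3));
    }
    return groups;
}
```

A five-pixel mask row of 1,1,0,1,1 yields the values 0Bh and 01h at alignment 0, but 08h and 0Dh at alignment 3, which is why all four alignments have to be prepared in advance.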
Listing Two (page 143) performs fast masked copying. This code expects to
receive a pointer to a MaskedImage structure, that in turn points to four
AlignedMaskedImage structures that describe the four possible image and mask
alignments. The aligned images are already stored in display memory, and the
aligned masks are already stored in system memory; further, the masks are
predigested into Map Mask register-compatible form. Given all that
ready-to-use data, Listing Two selects and works with the appropriate
image-mask pair for the destination's left edge alignment.
It would be handy to have a function that, given a base image and mask,
generates the four image and mask alignments and fills in the MaskedImage
structure. Listing Three (page 145), together with the include file in Listing
Four (page 145) and the system memory-to-display memory block-copy routine in
Listing Four from last month, does just that. It would be faster if Listing
Three were in assembly language, but there's no reason to think that
generating aligned images needs to be particularly fast; in such cases, I
prefer to use C, for reasons of coding speed, fewer bugs, and maintainability.


Notes on Masked Copying


Instead of using a separate image mask, Listings One and Three could
operate by not drawing 0-bytes; that is, they could treat 0-bytes in the image
as transparent, by leaving the destination unchanged wherever a 0-byte occurs
in the image. That isn't as flexible as a separate image mask, in that 0-bytes
can't be drawn as part of the image, but it does eliminate the need for a
separate mask, and the loss of one color isn't particularly serious in
256-color mode.
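The 0-byte variant is simple enough to sketch in C. As before, this assumes a linear, byte-per-pixel destination rather than mode X's planes, and the name copy_skip_zero is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Copy count pixels, leaving the destination unchanged wherever the
   source byte is 0 (color 0 is treated as transparent). */
void copy_skip_zero(unsigned char *dest, const unsigned char *image, size_t count)
{
    size_t i;
    assert(dest && image);
    for (i = 0; i < count; i++)
        if (image[i] != 0)
            dest[i] = image[i];
}
```

The image doubles as its own mask, at the cost of never being able to draw color 0.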
Listings One and Two, like all mode X code I've presented, perform no
clipping, because clipping code would take up too much room in the listings.
While clipping can be implemented directly in the low-level mode X routines
(at the beginning of Listing One, for instance), another, potentially simpler
approach would be to perform clipping at a higher level, modifying the
coordinates and dimensions passed to low-level routines such as Listings One
and Two as necessary to accomplish the desired clipping. It is for precisely
this reason that the low-level mode X routines support programmable start
coordinates in the source images, rather than assuming (0,0); likewise for the
distinction between the width of the image and the width of the area of the
image to draw.
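Higher-level clipping of this kind might look like the following C sketch. The CopyRect structure and the clip_copy function are hypothetical; the fields simply mirror the start, end, and destination parameters that the low-level routines take, with the end coordinates exclusive as in Listing One.

```c
#include <assert.h>

typedef struct {
    int src_start_x, src_start_y;   /* upper left corner within the source */
    int src_end_x, src_end_y;       /* lower right corner, not copied */
    int dest_x, dest_y;             /* where the upper left corner lands */
} CopyRect;

/* Clip r to the destination area [0, dest_w) x [0, dest_h).
   Returns 1 if something remains to draw, 0 if nothing is visible. */
int clip_copy(CopyRect *r, int dest_w, int dest_h)
{
    int w, h;
    assert(r && dest_w > 0 && dest_h > 0);
    /* Off the left or top: skip the hidden part of the source. */
    if (r->dest_x < 0) { r->src_start_x -= r->dest_x; r->dest_x = 0; }
    if (r->dest_y < 0) { r->src_start_y -= r->dest_y; r->dest_y = 0; }
    w = r->src_end_x - r->src_start_x;
    h = r->src_end_y - r->src_start_y;
    /* Off the right or bottom: stop the copy early. */
    if (r->dest_x + w > dest_w) r->src_end_x -= r->dest_x + w - dest_w;
    if (r->dest_y + h > dest_h) r->src_end_y -= r->dest_y + h - dest_h;
    return r->src_end_x > r->src_start_x && r->src_end_y > r->src_start_y;
}
```

A 32 x 20 image placed at (-8,230) on a 320 x 240 page comes out with a source start X of 8 and a source end Y of 10, and the adjusted values can be handed to the low-level routine unchanged. This works only because the low level accepts a nonzero source start.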
Lastly, it would be more efficient to make up structures that describe the
source and destination bitmaps, with dimensions and coordinates built in, and
simply pass pointers to these structures to the low level, rather than passing
many separate parameters, as is now the case. I've used separate parameters
for simplicity and flexibility.


Animation


Gosh. There's just no way I can discuss animation fundamentals in any detail
in the space I have. Basically, I'm going to perform page-flipped animation,
in which one page (bitmap large enough to hold a full screen) of display
memory is displayed while another page is drawn to. When the drawing is
finished, the newly modified page is displayed, and the other -- now invisible
-- page is drawn to. The process repeats ad infinitum. For further
information, check out Computer Graphics, by Foley and van Dam
(Addison-Wesley); Principles of Interactive Computer Graphics, by Newman and
Sproull (McGraw Hill); "Real-Time Animation" by Rahner James (January 1990,
DDJ); or my article, "Split Screen Animation for the VGA," (June 1991, PC
Techniques). (Portions of the animation code in Listing Five and all of the
ShowPage function in Listing Six were originally presented in the
last-mentioned article.)
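The bookkeeping behind page flipping can be sketched in C in a few lines. This is a toy model with made-up names: the pages are small arrays in ordinary memory, and flip merely toggles an index, where real code would reprogram the VGA's display start address.

```c
#include <assert.h>

#define PAGE_BYTES 16            /* tiny stand-in for a full screen */

typedef struct {
    unsigned char page[2][PAGE_BYTES];
    int displayed;               /* index of the page currently shown */
} Display;

/* The page that is safe to draw into: the one not being shown. */
unsigned char *hidden_page(Display *d)
{
    assert(d && (d->displayed == 0 || d->displayed == 1));
    return d->page[d->displayed ^ 1];
}

/* Show the freshly drawn page; the old visible page becomes
   the next drawing target. */
void flip(Display *d)
{
    assert(d);
    d->displayed ^= 1;
}
```

Each frame then amounts to drawing into hidden_page and calling flip, so the viewer never sees a half-drawn screen.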


Mode X Animation in Action


Listing Five (page 145) ties together everything I've discussed about mode X
in a compact but surprisingly powerful animation package. Listing Five first
uses solid and patterned fills and system memory-to-screen memory masked
copying to draw a static background containing a mountain, a sun, a plain,
water, and a house with puffs of smoke coming out of the chimney, and sets up
the four alignments of a masked kite image. The background is transferred to
both display pages, and drawing of twenty kite images in the nondisplayed page
using fast masked copying begins. After all images have been drawn, the page
is flipped to show the newly updated screen, and the kites are moved and drawn
in the other page, which is no longer displayed. Kites are erased at their old
positions in the nondisplayed page by block copying from the background page.
(See last month's column for the display memory organization used by Listing
Five.) So far as the displayed image is concerned, there is never any hint of
flicker or disturbance of the background. This continues at a rate of up to 60
times a second until Esc is pressed to exit.
Worth noting: The animation is extremely smooth on a 20-MHz 386. It is
somewhat more jerky on an 8-MHz 286, because only 30 frames a second can be
processed. If animation looks jerky on your system, try reducing the number of
kites.
The kites draw perfectly into the background, with no interference or fringe,
thanks to masked copying. In fact, the kites also cross with no interference
(the last-drawn kite is always in front), although that's not readily apparent
because they all look the same anyway and are moving fast. Listing Five isn't
inherently limited to kites; create your own images and initialize the object
list to display a mix of those images and see the full power of mode X
animation.
The external functions called by Listing Five can be found in Listings One,
Two, Three, and Six (page 146), and in the listings in the July and August
columns.


Farewell to Mode X



With that, we end our exploration of mode X, at least for the time being. Mode
X admittedly has its complexities; that's why I've provided a broad and
flexible primitive set. Still, so what if it is complex? Take a look at
Listing Five in action. That sort of colorful, high-performance animation is
worth jumping through a few hoops for; drawing 20, or even 10, fair-sized
objects at a rate of 60 Hz, with no flicker, interference, or fringe, is no
mean accomplishment, even on a 386. (Yes, I know, Amigas can do that without
breaking a sweat, but there aren't tens of millions of Amigas out there. There
are that many VGAs.)
There's much more I wanted to do with mode X in general and with Listing Five
in particular, but this was all I could squeeze into three columns; next month
it's time to move on, most likely to 15-bit-per-pixel VGAs. In closing, I'd
like to point out that all of the VGA's hardware features, including the
built-in AND, OR, and XOR functions, are available in mode X, just as they are
in the standard VGA modes. If you understand the VGA's hardware in mode 0x12,
try applying that knowledge to mode X; you might be surprised at what you find
you can do.


Glitch Department


William Huber, of Philadelphia, Pennsylvania, wrote to point out that the
method for determining whether a polygon is convex or not that I briefly
described back in June (it's convex if there are exactly two X and two Y
direction reversals) doesn't work for beans. (My words, not his.) Well, yes.
And, not to sound like I'm covering up a goof (he said, sounding exactly like
he was covering up a goof), I knew that. Between the time I wrote that column
and the time I got the galleys from the editors at DDJ, I discovered for
myself that my clever approach didn't work under certain circumstances (if the
edges cross without inducing extra X or Y reversals), and I deleted it from
the galleys. At least I thought I did. The edits are written right on the
galley sheets in my files, but either I forgot to relay the changes to the
editors, or the changes got lost in the editing shuffle, or perhaps an
Einstein among viruses got into the DDJ computer system and rewrote the
article right back to the original wording. Beats me; it's just the sort of
minor system crash that you have to expect once in a while when you're dealing
with fallible, carbon-based life-forms. Sorry.
To repeat: I do not know of an adequately fast way to determine whether a
polygon is convex. If you do know of one, you might want to pass it along,
care of DDJ.


Glitch Department, Part II


While writing the code for this column, I noticed a couple of mistakes from
last month. The simple one is that the comment at the start of Listing Two
that shows the C prototype for FillPatternX incorrectly describes the function
as FillPatternedX. The more interesting goof was in Listing Three from last
month. In that listing, the commented-out code in this month's Listing Seven
(page 147) should be changed to the uncommented code; this causes the Sequence
Controller Index to be loaded with the index of the Map Mask register, in
light of my belated realization that it was not programmed to point to the Map
Mask earlier in that module, contrary to the original comment.
That mistake was mine and mine alone, and it was a bug. (Ah, the hazards of
block-copying code!) Having said that, allow me to point out that it was a bug
that didn't cause -- couldn't cause -- any problems, thereby pointing out an
optimization opportunity. You see, as it happens, the mode X mode set code
leaves the SC Index register pointing to the Map Mask, and no other module
changes that setting. That's not surprising; the Map Mask is the only SC
register that affects data flow (the other SC registers are used only to set
up modes), so it is the only SC register that is normally programmed by
drawing code. Listing Three from last month worked properly, even though it
failed to load the SC Index, because the SC Index pointed to the Map Mask
before Listing Three began, and remained in that state throughout Listing
Three, as is the case with all the mode X drawing code. That leads to an
obvious thought: Why not simply make it official and say that the SC Index
must always point to the Map Mask in your mode X code, then remove the
instructions in each module that point the SC Index to the Map Mask?
No reason at all; it's a good optimization that I've used myself in other
modes (the SC Index can generally be left pointing to the Map Mask register in
all VGA modes). I'd originally chosen not to do it in this column for fear of
causing confusion, but there's really no reason to constantly reload the SC
Index to point to the Map Mask -- as I accidentally proved last month by
neglecting to do so myself.
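To put a number on the savings, here is a minimal C sketch that models the Sequence Controller as an index latch plus a small register file and counts port writes. Everything prefixed sim_ is hypothetical and exists only for this illustration; on a real VGA the writes would be OUT instructions to ports 3C4h and 3C5h, and the sketch assumes, as described above, that the mode set leaves the SC Index pointing to the Map Mask.

```c
#define SC_INDEX 0x3C4  /* Sequence Controller Index port */
#define SC_DATA  0x3C5  /* Sequence Controller Data port */
#define MAP_MASK 0x02   /* index of the Map Mask register in the SC */

/* Hypothetical stand-in for the sequencer: an index latch, a small
   register file, and a running count of port writes. */
unsigned char sim_sc_index;
unsigned char sim_sc_regs[8];
unsigned long sim_out_count;

/* Simulated byte OUT; no real port is touched. */
void sim_out(unsigned int port, unsigned char value)
{
    sim_out_count++;
    if (port == SC_INDEX)
        sim_sc_index = (unsigned char)(value & 7);
    else if (port == SC_DATA)
        sim_sc_regs[sim_sc_index] = value;
}

/* Models the mode set leaving the SC Index pointing to the Map Mask. */
void sim_mode_set(void)
{
    sim_out(SC_INDEX, MAP_MASK);
}

/* Models one drawing routine that selects a plane for each of `pixels`
   pixels. Returns the number of port writes the routine performed. */
unsigned long sim_draw(int pixels, int reload_index)
{
    unsigned long before = sim_out_count;
    int i;
    if (reload_index)
        sim_out(SC_INDEX, MAP_MASK);  /* the write the optimization drops */
    for (i = 0; i < pixels; i++)
        sim_out(SC_DATA, (unsigned char)(1 << (i & 3)));
    return sim_out_count - before;
}
```

After sim_mode_set(), calling sim_draw() with or without the reload leaves the same value in the simulated Map Mask; the only difference is one port write per call, which is the whole of the optimization.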
_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash


[LISTING ONE]

; Mode X (320x240, 256 colors) system memory-to-display memory masked copy
; routine. Not particularly fast; images for which performance is critical
; should be stored in off-screen memory and copied to screen via latches.
; Works on all VGAs. Copies up to but not including column at SourceEndX and
; row at SourceEndY. No clipping is performed. Mask and source image are both
; byte-per-pixel, and must be of same widths and reside at same coordinates in
; their respective bitmaps. Assembly code tested with TASM 2.0.
; C near-callable as:
; void CopySystemToScreenMaskedX(int SourceStartX,
; int SourceStartY, int SourceEndX, int SourceEndY,
; int DestStartX, int DestStartY, char * SourcePtr,
; unsigned int DestPageBase, int SourceBitmapWidth,
; int DestBitmapWidth, char * MaskPtr);

SC_INDEX equ 03c4h ;Sequence Controller Index register port
MAP_MASK equ 02h ;index in SC of Map Mask register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X

parms struc
 dw 2 dup (?) ;pushed BP and return address
SourceStartX dw ? ;X coordinate of upper left corner of source
 ; (source is in system memory)
SourceStartY dw ? ;Y coordinate of upper left corner of source
SourceEndX dw ? ;X coordinate of lower right corner of source
 ; (the column at EndX is not copied)
SourceEndY dw ? ;Y coordinate of lower right corner of source
 ; (the row at EndY is not copied)
DestStartX dw ? ;X coordinate of upper left corner of dest
 ; (destination is in display memory)
DestStartY dw ? ;Y coordinate of upper left corner of dest
SourcePtr dw ? ;pointer in DS to start of bitmap in which source resides
DestPageBase dw ? ;base offset in display memory of page in
 ; which dest resides
SourceBitmapWidth dw ? ;# of pixels across source bitmap (also must
 ; be width across the mask)
DestBitmapWidth dw ? ;# of pixels across dest bitmap (must be multiple of 4)
MaskPtr dw ? ;pointer in DS to start of bitmap in which mask
 ; resides (byte-per-pixel format, just like the source
 ; image; 0-bytes mean don't copy corresponding source
 ; pixel, non-0 bytes mean do copy)
parms ends

RectWidth equ -2 ;local storage for width of rectangle
RectHeight equ -4 ;local storage for height of rectangle
LeftMask equ -6 ;local storage for left rect edge plane mask
STACK_FRAME_SIZE equ 6
 .model small
 .code
 public _CopySystemToScreenMaskedX
_CopySystemToScreenMaskedX proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame
 sub sp,STACK_FRAME_SIZE ;allocate space for local vars
 push si ;preserve caller's register variables
 push di

 mov ax,SCREEN_SEG ;point ES to display memory
 mov es,ax
 mov ax,[bp+SourceBitmapWidth]
 mul [bp+SourceStartY] ;top source rect scan line
 add ax,[bp+SourceStartX]
 mov bx,ax
 add ax,[bp+SourcePtr] ;offset of first source rect pixel
 mov si,ax ; in DS
 add bx,[bp+MaskPtr] ;offset of first mask pixel in DS

 mov ax,[bp+DestBitmapWidth]
 shr ax,1 ;convert to width in addresses
 shr ax,1
 mov [bp+DestBitmapWidth],ax ;remember address width
 mul [bp+DestStartY] ;top dest rect scan line
 mov di,[bp+DestStartX]
 mov cx,di
 shr di,1 ;X/4 = offset of first dest rect pixel in
 shr di,1 ; scan line
 add di,ax ;offset of first dest rect pixel in page
 add di,[bp+DestPageBase] ;offset of first dest rect pixel
 ; in display memory
 and cl,011b ;CL = first dest pixel's plane
 mov al,11h ;upper nibble comes into play when plane wraps
 ; from 3 back to 0
 shl al,cl ;set the bit for the first dest pixel's plane
 mov [bp+LeftMask],al ; in each nibble to 1

 mov ax,[bp+SourceEndX] ;calculate # of pixels across
 sub ax,[bp+SourceStartX] ; rect
 jle CopyDone ;skip if 0 or negative width
 mov [bp+RectWidth],ax
 sub word ptr [bp+SourceBitmapWidth],ax
 ;distance from end of one source scan line to start of next
 mov ax,[bp+SourceEndY]
 sub ax,[bp+SourceStartY] ;height of rectangle
 jle CopyDone ;skip if 0 or negative height
 mov [bp+RectHeight],ax
 mov dx,SC_INDEX ;point to SC Index register
 mov al,MAP_MASK
 out dx,al ;point SC Index reg to the Map Mask
 inc dx ;point DX to SC Data reg
CopyRowsLoop:
 mov al,[bp+LeftMask]
 mov cx,[bp+RectWidth]
 push di ;remember the start offset in the dest
CopyScanLineLoop:
 cmp byte ptr [bx],0 ;is this pixel mask-enabled?
 jz MaskOff ;no, so don't draw it
 ;yes, draw the pixel
 out dx,al ;set the plane for this pixel
 mov ah,[si] ;get the pixel from the source
 mov es:[di],ah ;copy the pixel to the screen
MaskOff:
 inc bx ;advance the mask pointer
 inc si ;advance the source pointer
 rol al,1 ;set mask for next pixel's plane
 adc di,0 ;advance destination address only when
 ; wrapping from plane 3 to plane 0
 loop CopyScanLineLoop
 pop di ;retrieve the dest start offset
 add di,[bp+DestBitmapWidth] ;point to the start of the
 ; next scan line of the dest
 add si,[bp+SourceBitmapWidth] ;point to the start of the
 ; next scan line of the source
 add bx,[bp+SourceBitmapWidth] ;point to the start of the
 ; next scan line of the mask
 dec word ptr [bp+RectHeight] ;count down scan lines
 jnz CopyRowsLoop
CopyDone:
 pop di ;restore caller's register variables
 pop si
 mov sp,bp ;discard storage for local variables
 pop bp ;restore caller's stack frame
 ret
_CopySystemToScreenMaskedX endp
 end
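
The address arithmetic at the heart of Listing One may be easier to follow in C. The following is a minimal sketch, not part of the original listing: ModeXAddress, mode_x_address, and mode_x_next_pixel are hypothetical names, and the sketch keeps the plane mask in the low nibble only, where the assembly uses 11h so that a single ROL carries the wrap from plane 3 back to plane 0.

```c
/* Hypothetical helper type: where a pixel lives in mode X display memory. */
typedef struct {
    unsigned int offset;     /* byte offset within the display segment */
    unsigned char map_mask;  /* Map Mask value selecting the pixel's plane */
} ModeXAddress;

/* Four pixels share each address, one per plane, so a scan line is
   bitmap_width/4 addresses wide and the plane is X modulo 4. */
ModeXAddress mode_x_address(unsigned int x, unsigned int y,
                            unsigned int bitmap_width,
                            unsigned int page_base)
{
    ModeXAddress a;
    a.offset = page_base + y * (bitmap_width / 4) + x / 4;
    a.map_mask = (unsigned char)(1u << (x & 3));
    return a;
}

/* Step one pixel to the right: move to the next plane, and advance the
   address only when wrapping from plane 3 back to plane 0. */
void mode_x_next_pixel(ModeXAddress *a)
{
    if (a->map_mask == 0x08) {
        a->map_mask = 0x01;
        a->offset++;
    } else {
        a->map_mask = (unsigned char)(a->map_mask << 1);
    }
}
```

For a 320-pixel-wide bitmap, the pixel at (5,2) lands at offset 161 in plane 1; three steps to the right arrive at offset 162 in plane 0.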






[LISTING TWO]

; Mode X (320x240, 256 colors) display memory to display memory masked copy
; routine. Works on all VGAs. Uses approach of reading 4 pixels at a time from
; source into latches, then writing latches to destination, using Map Mask
; register to perform masking. Copies up to but not including column at
; SourceEndX and row at SourceEndY. No clipping is performed. Results are not
; guaranteed if source and destination overlap. C near-callable as:
; void CopyScreenToScreenMaskedX(int SourceStartX,
; int SourceStartY, int SourceEndX, int SourceEndY,
; int DestStartX, int DestStartY, MaskedImage * Source,
; unsigned int DestPageBase, int DestBitmapWidth);

SC_INDEX equ 03c4h ;Sequence Controller Index register port
MAP_MASK equ 02h ;index in SC of Map Mask register
GC_INDEX equ 03ceh ;Graphics Controller Index register port

BIT_MASK equ 08h ;index in GC of Bit Mask register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X

parms struc
 dw 2 dup (?) ;pushed BP and return address
SourceStartX dw ? ;X coordinate of upper left corner of source
SourceStartY dw ? ;Y coordinate of upper left corner of source
SourceEndX dw ? ;X coordinate of lower right corner of source
 ; (the column at SourceEndX is not copied)
SourceEndY dw ? ;Y coordinate of lower right corner of source
 ; (the row at SourceEndY is not copied)
DestStartX dw ? ;X coordinate of upper left corner of dest
DestStartY dw ? ;Y coordinate of upper left corner of dest
Source dw ? ;pointer to MaskedImage struct for the source image
DestPageBase dw ? ;base offset in display memory of page in
 ; which dest resides
DestBitmapWidth dw ? ;# of pixels across dest bitmap (must be multiple of 4)
parms ends

SourceNextScanOffset equ -2 ;local storage for distance from end of
 ; one source scan line to start of next
DestNextScanOffset equ -4 ;local storage for distance from end of
 ; one dest scan line to start of next
RectAddrWidth equ -6 ;local storage for address width of rectangle
RectHeight equ -8 ;local storage for height of rectangle
SourceBitmapWidth equ -10 ;local storage for width of source bitmap
 ; (in addresses)
STACK_FRAME_SIZE equ 10
MaskedImage struc
 Alignments dw 4 dup(?) ;pointers to AlignedMaskedImages for the
 ; 4 possible destination image alignments
MaskedImage ends
AlignedMaskedImage struc
 ImageWidth dw ? ;image width in addresses (also mask width in bytes)
 ImagePtr dw ? ;offset of image bitmap in display memory
 MaskPtr dw ? ;pointer to mask bitmap in DS
AlignedMaskedImage ends
 .model small
 .code
 public _CopyScreenToScreenMaskedX
_CopyScreenToScreenMaskedX proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame
 sub sp,STACK_FRAME_SIZE ;allocate space for local vars
 push si ;preserve caller's register variables
 push di

 cld
 mov dx,GC_INDEX ;set the bit mask to select all bits
 mov ax,00000h+BIT_MASK ; from the latches and none from
 out dx,ax ; the CPU, so that we can write the
 ; latch contents directly to memory
 mov ax,SCREEN_SEG ;point ES to display memory
 mov es,ax
 mov ax,[bp+DestBitmapWidth]
 shr ax,1 ;convert to width in addresses
 shr ax,1
 mul [bp+DestStartY] ;top dest rect scan line

 mov di,[bp+DestStartX]
 mov si,di
 shr di,1 ;X/4 = offset of first dest rect pixel in
 shr di,1 ; scan line
 add di,ax ;offset of first dest rect pixel in page
 add di,[bp+DestPageBase] ;offset of first dest rect pixel
 ; in display memory. now look up the image that's
 ; aligned to match left-edge alignment of destination
 and si,3 ;DestStartX modulo 4
 mov cx,si ;set aside alignment for later
 shl si,1 ;prepare for word look-up
 mov bx,[bp+Source] ;point to source MaskedImage structure
 mov bx,[bx+Alignments+si] ;point to AlignedMaskedImage
 ; struc for current left edge alignment
 mov ax,[bx+ImageWidth] ;image width in addresses
 mov [bp+SourceBitmapWidth],ax ;remember image width in
 ; addresses
 mul [bp+SourceStartY] ;top source rect scan line
 mov si,[bp+SourceStartX]
 shr si,1 ;X/4 = address of first source rect pixel in
 shr si,1 ; scan line
 add si,ax ;offset of first source rect pixel in image
 mov ax,si
 add si,[bx+MaskPtr] ;offset of first mask pixel in DS
 mov bx,[bx+ImagePtr] ;offset of first source rect pixel
 add bx,ax ; in display memory

 mov ax,[bp+SourceStartX] ;calculate # of addresses across
 add ax,cx ; rect, shifting if necessary to
 add cx,[bp+SourceEndX] ; account for alignment
 cmp cx,ax
 jle CopyDone ;skip if 0 or negative width
 add cx,3
 and ax,not 011b
 sub cx,ax
 shr cx,1
 shr cx,1 ;# of addresses across rectangle to copy
 mov ax,[bp+SourceEndY]
 sub ax,[bp+SourceStartY] ;AX = height of rectangle
 jle CopyDone ;skip if 0 or negative height
 mov [bp+RectHeight],ax
 mov ax,[bp+DestBitmapWidth]
 shr ax,1 ;convert to width in addresses
 shr ax,1
 sub ax,cx ;distance from end of one dest scan line to start of next
 mov [bp+DestNextScanOffset],ax
 mov ax,[bp+SourceBitmapWidth] ;width in addresses
 sub ax,cx ;distance from end of source scan line to start of next
 mov [bp+SourceNextScanOffset],ax
 mov [bp+RectAddrWidth],cx ;remember width in addresses

 mov dx,SC_INDEX
 mov al,MAP_MASK
 out dx,al ;point SC Index register to Map Mask
 inc dx ;point to SC Data register
CopyRowsLoop:
 mov cx,[bp+RectAddrWidth] ;width across
CopyScanLineLoop:
 lodsb ;get the mask for this four-pixel set
 ; and advance the mask pointer
 out dx,al ;set the mask
 mov al,es:[bx] ;load the latches with 4-pixel set from source
 mov es:[di],al ;copy the four-pixel set to the dest
 inc bx ;advance the source pointer
 inc di ;advance the destination pointer
 dec cx ;count off four-pixel sets
 jnz CopyScanLineLoop

 mov ax,[bp+SourceNextScanOffset]
 add si,ax ;point to the start of
 add bx,ax ; the next source, mask,
 add di,[bp+DestNextScanOffset] ; and dest lines
 dec word ptr [bp+RectHeight] ;count down scan lines
 jnz CopyRowsLoop
CopyDone:
 mov dx,GC_INDEX+1 ;restore the bit mask to its default,
 mov al,0ffh ; which selects all bits from the CPU
 out dx,al ; and none from the latches (the GC
 ; Index still points to Bit Mask)
 pop di ;restore caller's register variables
 pop si
 mov sp,bp ;discard storage for local variables
 pop bp ;restore caller's stack frame
 ret
_CopyScreenToScreenMaskedX endp
 end
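
The width calculation in Listing Two is compact enough to be cryptic, so here is a minimal C sketch of the same arithmetic. The function name rect_addr_width is hypothetical; the logic follows the assembly: shift both edges by the destination alignment, round the left edge down and the right edge up to a multiple of four pixels, and divide by four to get the width in addresses.

```c
/* Number of display memory addresses (4-pixel sets) spanned by a source
   rectangle once it is shifted to match the destination's alignment.
   Returns 0 for an empty or negative-width rectangle. */
unsigned int rect_addr_width(int source_start_x, int source_end_x,
                             int dest_start_x)
{
    int align = dest_start_x & 3;               /* DestStartX modulo 4 */
    int first = (source_start_x + align) & ~3;  /* round left edge down */
    int last = source_end_x + align + 3;        /* round right edge up */

    if (source_end_x <= source_start_x)
        return 0;
    return (unsigned int)((last - first) >> 2);
}
```

A 10-pixel-wide image needs three addresses at alignments 0 through 2, but four at alignment 3, which is why each alignment of the image is stored with its own width.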






[LISTING THREE]

/* Generates all four possible mode X image/mask alignments, stores image
alignments in display memory, allocates memory for and generates mask
alignments, and fills out an AlignedMaskedImage structure. Image and mask must
both be in byte-per-pixel form, and must both be of width ImageWidth. Mask
maps isomorphically (one to one) onto image, with each 0-byte in mask masking
off corresponding image pixel (causing it not to be drawn), and each
non-0-byte allowing corresponding image pixel to be drawn. Returns 0 if
failure, or # of display memory addresses (4-pixel sets) used if success. For
simplicity, allocated memory is not deallocated in case of failure. Compiled
with Borland C++ 2.0 in C compilation mode. */

#include <stdio.h>
#include <stdlib.h>
#include "maskim.h"

extern void CopySystemToScreenX(int, int, int, int, int, int, char *,
 unsigned int, int, int);
unsigned int CreateAlignedMaskedImage(MaskedImage * ImageToSet,
 unsigned int DispMemStart, char * Image, int ImageWidth,
 int ImageHeight, char * Mask)
{
 int Align, ScanLine, BitNum, Size, TempImageWidth;
 unsigned char MaskTemp;
 unsigned int DispMemOffset = DispMemStart;

 AlignedMaskedImage *WorkingAMImage;
 char *NewMaskPtr, *OldMaskPtr;
 /* Generate each of the four alignments in turn */
 for (Align = 0; Align < 4; Align++) {
 /* Allocate space for the AlignedMaskedImage struct for this
 alignment */
 if ((WorkingAMImage = ImageToSet->Alignments[Align] =
 malloc(sizeof(AlignedMaskedImage))) == NULL)
 return 0;
 WorkingAMImage->ImageWidth =
 (ImageWidth + Align + 3) / 4; /* width in 4-pixel sets */
 WorkingAMImage->ImagePtr = DispMemOffset; /* image dest */
 /* Download this alignment of the image */
 CopySystemToScreenX(0, 0, ImageWidth, ImageHeight, Align, 0,
 Image, DispMemOffset, ImageWidth,
 WorkingAMImage->ImageWidth * 4);
 /* Calculate the number of bytes needed to store the mask in
 nibble (Map Mask-ready) form, then allocate that space */
 Size = WorkingAMImage->ImageWidth * ImageHeight;
 if ((WorkingAMImage->MaskPtr = malloc(Size)) == NULL)
 return 0;
 /* Generate this nibble oriented (Map Mask-ready) alignment of
 the mask, one scan line at a time */
 OldMaskPtr = Mask;
 NewMaskPtr = WorkingAMImage->MaskPtr;
 for (ScanLine = 0; ScanLine < ImageHeight; ScanLine++) {
 BitNum = Align;
 MaskTemp = 0;
 TempImageWidth = ImageWidth;
 do {
 /* Set the mask bit for next pixel according to its alignment */
 MaskTemp |= (*OldMaskPtr++ != 0) << BitNum;
 if (++BitNum > 3) {
 *NewMaskPtr++ = MaskTemp;
 MaskTemp = BitNum = 0;
 }
 } while (--TempImageWidth);
 /* Set any partial final mask on this scan line */
 if (BitNum != 0) *NewMaskPtr++ = MaskTemp;
 }
 DispMemOffset += Size; /* mark off the space we just used */
 }
 return DispMemOffset - DispMemStart;
}
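
The mask conversion described in the header comment can be tried in isolation. The following is a minimal sketch for one scan line; pack_mask_row is a hypothetical name, not a routine from the listings. Each output byte holds the Map Mask value for one 4-pixel set, with bit n set when the pixel in plane n should be drawn.

```c
#include <stddef.h>

/* Packs one scan line of a byte-per-pixel mask into Map Mask-ready
   nibbles for the given alignment (0-3). Returns the number of bytes
   written to out, which must hold at least (width + align + 3) / 4. */
size_t pack_mask_row(const unsigned char *mask, int width, int align,
                     unsigned char *out)
{
    size_t n = 0;
    int bit = align & 3;    /* plane of the first pixel */
    unsigned char acc = 0;  /* mask bits gathered for the current set */
    int i;

    for (i = 0; i < width; i++) {
        if (mask[i] != 0)
            acc |= (unsigned char)(1u << bit);
        if (++bit > 3) {    /* finished a 4-pixel set */
            out[n++] = acc;
            acc = 0;
            bit = 0;
        }
    }
    if (bit != 0)           /* partial final set on this scan line */
        out[n++] = acc;
    return n;
}
```

With the mask row 1,0,1,1,1 and an alignment of 1, the first output byte is 0Ah (planes 1 and 3) and the second is 03h (planes 0 and 1).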






[LISTING FOUR]

/* MASKIM.H: structures used for storing and manipulating masked
 images */
/* Describes one alignment of a mask-image pair */
typedef struct {
 int ImageWidth; /* image width in addresses in display memory (also
 mask width in bytes) */
 unsigned int ImagePtr; /* offset of image bitmap in display mem */
 char *MaskPtr; /* pointer to mask bitmap */
} AlignedMaskedImage;
/* Describes all four alignments of a mask-image pair */
typedef struct {
 AlignedMaskedImage *Alignments[4]; /* ptrs to AlignedMaskedImage
 structs for four possible destination
 image alignments */
} MaskedImage;







[LISTING FIVE]

/* Sample mode X VGA animation program. Portions of this code first appeared
in PC Techniques. Compiled with Borland C++ 2.0 in C compilation mode. */

#include <stdio.h>
#include <conio.h>
#include <dos.h>
#include <math.h>
#include "maskim.h"

#define SCREEN_SEG 0xA000
#define SCREEN_WIDTH 320
#define SCREEN_HEIGHT 240
#define PAGE0_START_OFFSET 0
#define PAGE1_START_OFFSET (((long)SCREEN_HEIGHT*SCREEN_WIDTH)/4)
#define BG_START_OFFSET (((long)SCREEN_HEIGHT*SCREEN_WIDTH*2)/4)
#define DOWNLOAD_START_OFFSET (((long)SCREEN_HEIGHT*SCREEN_WIDTH*3)/4)

static unsigned int PageStartOffsets[2] =
 {PAGE0_START_OFFSET,PAGE1_START_OFFSET};
static char GreenAndBrownPattern[] =
 {2,6,2,6, 6,2,6,2, 2,6,2,6, 6,2,6,2};
static char PineTreePattern[] = {2,2,2,2, 2,6,2,6, 2,2,6,2, 2,2,2,2};
static char BrickPattern[] = {6,6,7,6, 7,7,7,7, 7,6,6,6, 7,7,7,7,};
static char RoofPattern[] = {8,8,8,7, 7,7,7,7, 8,8,8,7, 8,8,8,7};

#define SMOKE_WIDTH 7
#define SMOKE_HEIGHT 7
static char SmokePixels[] = {
 0, 0,15,15,15, 0, 0,
 0, 7, 7,15,15,15, 0,
 8, 7, 7, 7,15,15,15,
 8, 7, 7, 7, 7,15,15,
 0, 8, 7, 7, 7, 7,15,
 0, 0, 8, 7, 7, 7, 0,
 0, 0, 0, 8, 8, 0, 0};
static char SmokeMask[] = {
 0, 0, 1, 1, 1, 0, 0,
 0, 1, 1, 1, 1, 1, 0,
 1, 1, 1, 1, 1, 1, 1,
 1, 1, 1, 1, 1, 1, 1,
 1, 1, 1, 1, 1, 1, 1,
 0, 1, 1, 1, 1, 1, 0,
 0, 0, 1, 1, 1, 0, 0};
#define KITE_WIDTH 10
#define KITE_HEIGHT 16
static char KitePixels[] = {
 0, 0, 0, 0,45, 0, 0, 0, 0, 0,
 0, 0, 0,46,46,46, 0, 0, 0, 0,
 0, 0,47,47,47,47,47, 0, 0, 0,
 0,48,48,48,48,48,48,48, 0, 0,
 49,49,49,49,49,49,49,49,49, 0,
 0,50,50,50,50,50,50,50, 0, 0,
 0,51,51,51,51,51,51,51, 0, 0,
 0, 0,52,52,52,52,52, 0, 0, 0,
 0, 0,53,53,53,53,53, 0, 0, 0,
 0, 0, 0,54,54,54, 0, 0, 0, 0,
 0, 0, 0,55,55,55, 0, 0, 0, 0,
 0, 0, 0, 0,58, 0, 0, 0, 0, 0,
 0, 0, 0, 0,59, 0, 0, 0, 0,66,
 0, 0, 0, 0,60, 0, 0,64, 0,65,
 0, 0, 0, 0, 0,61, 0, 0,64, 0,
 0, 0, 0, 0, 0, 0,62,63, 0,64};
static char KiteMask[] = {
 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
 0, 0, 0, 1, 1, 1, 0, 0, 0, 0,
 0, 0, 1, 1, 1, 1, 1, 0, 0, 0,
 0, 1, 1, 1, 1, 1, 1, 1, 0, 0,
 1, 1, 1, 1, 1, 1, 1, 1, 1, 0,
 0, 1, 1, 1, 1, 1, 1, 1, 0, 0,
 0, 1, 1, 1, 1, 1, 1, 1, 0, 0,
 0, 0, 1, 1, 1, 1, 1, 0, 0, 0,
 0, 0, 1, 1, 1, 1, 1, 0, 0, 0,
 0, 0, 0, 1, 1, 1, 0, 0, 0, 0,
 0, 0, 0, 1, 1, 1, 0, 0, 0, 0,
 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 1, 0, 0, 0, 0, 1,
 0, 0, 0, 0, 1, 0, 0, 1, 0, 1,
 0, 0, 0, 0, 0, 1, 0, 0, 1, 0,
 0, 0, 0, 0, 0, 0, 1, 1, 0, 1};
static MaskedImage KiteImage;

#define NUM_OBJECTS 20
typedef struct {
 int X,Y,Width,Height,XDir,YDir,XOtherPage,YOtherPage;
 MaskedImage *Image;
} AnimatedObject;
AnimatedObject AnimatedObjects[] = {
 { 0, 0,KITE_WIDTH,KITE_HEIGHT, 1, 1, 0, 0,&KiteImage},
 { 10, 10,KITE_WIDTH,KITE_HEIGHT, 0, 1, 10, 10,&KiteImage},
 { 20, 20,KITE_WIDTH,KITE_HEIGHT,-1, 1, 20, 20,&KiteImage},
 { 30, 30,KITE_WIDTH,KITE_HEIGHT,-1,-1, 30, 30,&KiteImage},
 { 40, 40,KITE_WIDTH,KITE_HEIGHT, 1,-1, 40, 40,&KiteImage},
 { 50, 50,KITE_WIDTH,KITE_HEIGHT, 0,-1, 50, 50,&KiteImage},
 { 60, 60,KITE_WIDTH,KITE_HEIGHT, 1, 0, 60, 60,&KiteImage},
 { 70, 70,KITE_WIDTH,KITE_HEIGHT,-1, 0, 70, 70,&KiteImage},
 { 80, 80,KITE_WIDTH,KITE_HEIGHT, 1, 2, 80, 80,&KiteImage},
 { 90, 90,KITE_WIDTH,KITE_HEIGHT, 0, 2, 90, 90,&KiteImage},
 {100,100,KITE_WIDTH,KITE_HEIGHT,-1, 2,100,100,&KiteImage},
 {110,110,KITE_WIDTH,KITE_HEIGHT,-1,-2,110,110,&KiteImage},
 {120,120,KITE_WIDTH,KITE_HEIGHT, 1,-2,120,120,&KiteImage},
 {130,130,KITE_WIDTH,KITE_HEIGHT, 0,-2,130,130,&KiteImage},
 {140,140,KITE_WIDTH,KITE_HEIGHT, 2, 0,140,140,&KiteImage},
 {150,150,KITE_WIDTH,KITE_HEIGHT,-2, 0,150,150,&KiteImage},
 {160,160,KITE_WIDTH,KITE_HEIGHT, 2, 2,160,160,&KiteImage},
 {170,170,KITE_WIDTH,KITE_HEIGHT,-2, 2,170,170,&KiteImage},
 {180,180,KITE_WIDTH,KITE_HEIGHT,-2,-2,180,180,&KiteImage},
 {190,190,KITE_WIDTH,KITE_HEIGHT, 2,-2,190,190,&KiteImage},
};
void main(void);
void DrawBackground(unsigned int);
void MoveObject(AnimatedObject *);
extern void Set320x240Mode(void);
extern void FillRectangleX(int, int, int, int, unsigned int, int);
extern void FillPatternX(int, int, int, int, unsigned int, char*);
extern void CopySystemToScreenMaskedX(int, int, int, int, int, int,
 char *, unsigned int, int, int, char *);
extern void CopyScreenToScreenX(int, int, int, int, int, int,
 unsigned int, unsigned int, int, int);
extern unsigned int CreateAlignedMaskedImage(MaskedImage *,
 unsigned int, char *, int, int, char *);
extern void CopyScreenToScreenMaskedX(int, int, int, int, int, int,
 MaskedImage *, unsigned int, int);
extern void ShowPage(unsigned int);

void main()
{
 int DisplayedPage, NonDisplayedPage, Done, i;
 union REGS regset;
 Set320x240Mode();
 /* Download the kite image for fast copying later */
 if (CreateAlignedMaskedImage(&KiteImage, DOWNLOAD_START_OFFSET,
 KitePixels, KITE_WIDTH, KITE_HEIGHT, KiteMask) == 0) {
 regset.x.ax = 0x0003; int86(0x10, &regset, &regset);
 printf("Couldn't get memory\n"); exit(1);
 }
 /* Draw the background to the background page */
 DrawBackground(BG_START_OFFSET);
 /* Copy the background to both displayable pages */
 CopyScreenToScreenX(0, 0, SCREEN_WIDTH, SCREEN_HEIGHT, 0, 0,
 BG_START_OFFSET, PAGE0_START_OFFSET, SCREEN_WIDTH,
 SCREEN_WIDTH);
 CopyScreenToScreenX(0, 0, SCREEN_WIDTH, SCREEN_HEIGHT, 0, 0,
 BG_START_OFFSET, PAGE1_START_OFFSET, SCREEN_WIDTH,
 SCREEN_WIDTH);
 /* Move the objects and update their images in the nondisplayed
 page, then flip the page, until Esc is pressed */
 Done = DisplayedPage = 0;
 do {
 NonDisplayedPage = DisplayedPage ^ 1;
 /* Erase each object in nondisplayed page by copying block from
 background page at last location in that page */
 for (i=0; i<NUM_OBJECTS; i++) {
 CopyScreenToScreenX(AnimatedObjects[i].XOtherPage,
 AnimatedObjects[i].YOtherPage,
 AnimatedObjects[i].XOtherPage +
 AnimatedObjects[i].Width,
 AnimatedObjects[i].YOtherPage +
 AnimatedObjects[i].Height,
 AnimatedObjects[i].XOtherPage,
 AnimatedObjects[i].YOtherPage, BG_START_OFFSET,
 PageStartOffsets[NonDisplayedPage], SCREEN_WIDTH,
 SCREEN_WIDTH);
 }
 /* Move and draw each object in the nondisplayed page */
 for (i=0; i<NUM_OBJECTS; i++) {
 MoveObject(&AnimatedObjects[i]);
 /* Draw object into nondisplayed page at new location */
 CopyScreenToScreenMaskedX(0, 0, AnimatedObjects[i].Width,
 AnimatedObjects[i].Height, AnimatedObjects[i].X,
 AnimatedObjects[i].Y, AnimatedObjects[i].Image,
 PageStartOffsets[NonDisplayedPage], SCREEN_WIDTH);
 }
 /* Flip to the page into which we just drew */
 ShowPage(PageStartOffsets[DisplayedPage = NonDisplayedPage]);
 /* See if it's time to end */
 if (kbhit()) {
 if (getch() == 0x1B) Done = 1; /* Esc to end */
 }
 } while (!Done);
 /* Restore text mode and done */
 regset.x.ax = 0x0003; int86(0x10, &regset, &regset);
}
void DrawBackground(unsigned int PageStart)
{
 int i,j,Temp;
 /* Fill the screen with cyan */
 FillRectangleX(0, 0, SCREEN_WIDTH, SCREEN_HEIGHT, PageStart, 11);
 /* Draw a green and brown rectangle to create a flat plain */
 FillPatternX(0, 160, SCREEN_WIDTH, SCREEN_HEIGHT, PageStart,
 GreenAndBrownPattern);
 /* Draw blue water at the bottom of the screen */
 FillRectangleX(0, SCREEN_HEIGHT-30, SCREEN_WIDTH, SCREEN_HEIGHT,
 PageStart, 1);
 /* Draw a brown mountain rising out of the plain */
 for (i=0; i<120; i++)
 FillRectangleX(SCREEN_WIDTH/2-30-i, 51+i, SCREEN_WIDTH/2-30+i+1,
 51+i+1, PageStart, 6);
 /* Draw a yellow sun by overlapping rects of various shapes */
 for (i=0; i<=20; i++) {
 Temp = (int)(sqrt(20.0*20.0 - (float)i*(float)i) + 0.5);
 FillRectangleX(SCREEN_WIDTH-25-i, 30-Temp, SCREEN_WIDTH-25+i+1,
 30+Temp+1, PageStart, 14);
 }
 /* Draw green trees down the side of the mountain */
 for (i=10; i<90; i += 15)
 for (j=0; j<20; j++)
 FillPatternX(SCREEN_WIDTH/2+i-j/3-15, i+j+51,SCREEN_WIDTH/2+i+j/3-15+1,
 i+j+51+1, PageStart, PineTreePattern);
 /* Draw a house on the plain */
 FillPatternX(265, 150, 295, 170, PageStart, BrickPattern);
 FillPatternX(265, 130, 270, 150, PageStart, BrickPattern);
 for (i=0; i<12; i++)
 FillPatternX(280-i*2, 138+i, 280+i*2+1, 138+i+1, PageStart, RoofPattern);
 /* Finally, draw puffs of smoke rising from the chimney */
 for (i=0; i<4; i++)
 CopySystemToScreenMaskedX(0, 0, SMOKE_WIDTH, SMOKE_HEIGHT, 264,
 110-i*20, SmokePixels, PageStart, SMOKE_WIDTH,SCREEN_WIDTH, SmokeMask);
}
/* Move the specified object, bouncing at the edges of the screen and
 remembering where the object was before the move for erasing next time */
void MoveObject(AnimatedObject * ObjectToMove) {
 int X, Y;
 X = ObjectToMove->X + ObjectToMove->XDir;
 Y = ObjectToMove->Y + ObjectToMove->YDir;
 if ((X < 0) || (X > (SCREEN_WIDTH - ObjectToMove->Width))) {
 ObjectToMove->XDir = -ObjectToMove->XDir;
 X = ObjectToMove->X + ObjectToMove->XDir;
 }
 if ((Y < 0) || (Y > (SCREEN_HEIGHT - ObjectToMove->Height))) {
 ObjectToMove->YDir = -ObjectToMove->YDir;
 Y = ObjectToMove->Y + ObjectToMove->YDir;
 }
 /* Remember previous location for erasing purposes */
 ObjectToMove->XOtherPage = ObjectToMove->X;
 ObjectToMove->YOtherPage = ObjectToMove->Y;
 ObjectToMove->X = X; /* set new location */
 ObjectToMove->Y = Y;
}






[LISTING SIX]

; Shows the page at the specified offset in the bitmap. Page is displayed when
; this routine returns. This code first appeared in PC Techniques.
; C near-callable as: void ShowPage(unsigned int StartOffset);

INPUT_STATUS_1 equ 03dah ;Input Status 1 register
CRTC_INDEX equ 03d4h ;CRT Controller Index reg
START_ADDRESS_HIGH equ 0ch ;bitmap start address high byte
START_ADDRESS_LOW equ 0dh ;bitmap start address low byte

ShowPageParms struc
 dw 2 dup (?) ;pushed BP and return address
StartOffset dw ? ;offset in bitmap of page to display
ShowPageParms ends
 .model small
 .code
 public _ShowPage
_ShowPage proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to local stack frame
; Wait for display enable to be active (status is active low), to be
; sure both halves of the start address will take in the same frame.
 mov bl,START_ADDRESS_LOW ;preload for fastest
 mov bh,byte ptr StartOffset[bp] ; flipping once display
 mov cl,START_ADDRESS_HIGH ; enable is detected
 mov ch,byte ptr StartOffset+1[bp]
 mov dx,INPUT_STATUS_1
WaitDE:
 in al,dx
 test al,01h
 jnz WaitDE ;display enable is active low (0 = active)
; Set the start offset in display memory of the page to display.
 mov dx,CRTC_INDEX

 mov ax,bx
 out dx,ax ;start address low
 mov ax,cx
 out dx,ax ;start address high
; Now wait for vertical sync, so the other page will be invisible when
; we start drawing to it.
 mov dx,INPUT_STATUS_1
WaitVS:
 in al,dx
 test al,08h
 jz WaitVS ;vertical sync is active high (1 = active)
 pop bp ;restore caller's stack frame
 ret
_ShowPage endp
 end
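
Listing Six sets each half of the start address with a single word OUT to the CRTC Index port: the low byte of the word goes to the Index register and the high byte to the Data register. The following is a minimal C sketch of how those two words are assembled from a page's start offset; StartAddressWords and start_address_words are hypothetical names used only for illustration, and the sketch builds the values without touching any port.

```c
#define START_ADDRESS_HIGH 0x0C  /* CRTC index: start address high byte */
#define START_ADDRESS_LOW  0x0D  /* CRTC index: start address low byte */

/* Hypothetical pair of index/data words, as loaded into BX and CX. */
typedef struct {
    unsigned int low_word;   /* register index 0Dh, data = offset low byte */
    unsigned int high_word;  /* register index 0Ch, data = offset high byte */
} StartAddressWords;

StartAddressWords start_address_words(unsigned int start_offset)
{
    StartAddressWords w;
    w.low_word = ((start_offset & 0xFFu) << 8) | START_ADDRESS_LOW;
    w.high_word = (((start_offset >> 8) & 0xFFu) << 8) | START_ADDRESS_HIGH;
    return w;
}
```

For page 1 in Listing Five, which starts at offset 19200 (4B00h), the two words are 000Dh and 4B0Ch.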







[LISTING SEVEN]

;;;<change this in Listing 3 from the August, 1991 column...>
;;; mov dx,SC_INDEX+1 ;point to Sequence Controller Data reg
;;; ; (SC Index still points to Map Mask)
;;;<...to this>
 mov dx,SC_INDEX
 mov al,MAP_MASK
 out dx,al ;point SC Index reg to Map Mask
 inc dx ;point to SC Data reg






September, 1991
PROGRAMMER'S BOOKSHELF


The Emperor's New Mind: Concerning Computers, Minds, and the Laws of Physics




Michael Swaine


I was three-fourths of the way through this book before I began to suspect
that it wasn't headed where I thought. I had been carried along by Penrose's
clear writing and obvious mastery of a wide variety of very interesting
subjects, fully expecting, on the basis of the introduction and cover copy,
Penrose to be leading up to another unconvincing argument for why the mind
cannot be a computer. I've read several such arguments, such as John Searle's
(in)famous "Chinese Room" argument, which runs something like this:
Let's say we have a computer programmed to pass the Turing test, in this
limited sense: We feed it simple stories, like "A man went into a restaurant
and ordered a hamburger; when it arrived, it was burnt to a crisp, and he
stormed out of the restaurant without leaving a tip." Then we ask it yes or no
questions like "Did the man eat the hamburger?" And let's say the stories and
the questions are in Chinese.
In another room is the philosopher John Searle, who knows not a word of
Chinese. We shoot the stories and questions in to him as to the computer, and
he works through the same algorithm, coming up with the same yes or no
answers. Because he knows no Chinese, he obviously doesn't understand the
stories, nor the questions to which he is providing correct answers. If he
doesn't understand the content he's operating on mechanically, then it's
surely silly to suggest that the computer -- performing the same operations --
does.
Penrose finds Searle's argument compelling but not entirely convincing. I'm
probably not doing the argument justice, because as stated here it seems
foolish to me. Whatever understanding might turn out to be, Searle has set up
a problem domain in which it is not required, and then faults the computer for
not employing it. That hardly seems fair.
But Searle's argument is not Penrose's argument. Penrose does lead up to an
argument that the mind cannot be a computer, but by the time I was
three-fourths of the way through the book, it was apparent that he had a new
approach, and that it is not one that can easily be dismissed.
A priori, one would not expect to be able easily to dismiss a Roger Penrose
argument. Penrose is a mathematician who knows a lot about modern physics and
has won a lot of awards, including the 1988 Wolf Prize, which he shared with
Stephen Hawking for their joint contribution to our understanding of the
universe.
In building his case, Penrose touches on the computer science and mathematics
themes of the Turing test, the Church-Turing thesis, Hilbert's problem,
Godel's theorem, complexity theory, and the Mandelbrot set; physics topics
such as special and general relativity, quantum theory, cosmology, entropy,
the big bang, and quantum gravity; brain research issues like split-brain
studies and brain plasticity work; and philosophical issues like quantum vs.
Platonic views of reality, and the distinction between computability and
determinism. Because he really uses all these disciplines in developing his
argument, only someone with a firm grasp on all these subjects could properly
evaluate the argument. That's not me; but the group of people capable of
evaluating the argument may be limited to Penrose.
Penrose's argument, though, is easy to state. To understand something,
according to Penrose, is to make direct contact with Truth.
Some readers may feel that this raises some questions. Penrose actually
invites us to ask, "What is truth?" Penrose's answer is a return to a Platonic
view of a mathematically pure reality "out there" somewhere. This is not a
trendy view. Nick Herbert's book Quantum Reality well characterizes the views
on reality held by or entertained by modern physicists, and none of them take
this naive Platonic approach.
Penrose realizes that this view is not consistent with quantum physics, but it
does seem consistent with classical physics, which seems to describe the world
of everyday experience. Penrose agrees with Einstein and others that quantum
physics must be incomplete, and that there is some generalization waiting to
be discovered which will tie it together with classical physics. Discovered,
not invented. Penrose really thinks there is a reality out there to be
discovered. He cites the Mandelbrot set as an example of the mathematical
reality that is "out there."
If there is an objective reality, then truth is simply the accord of model
with reality, which is what we all thought it was all along, right? Well, we
did if we weren't well versed in quantum views of reality. This naive view of
truth gives Penrose a way to deal with consciousness. Consciousness is contact
with reality.
This is where Penrose pins his argument that the mind cannot be a computer.
One of the things that we attribute to the mind and don't attribute to any
existing computer is consciousness. Nor does anyone have any notion how a
computer might attain consciousness. This is not surprising, since we have no
idea how humans attain it, either, or any consensus on what consciousness is.
Penrose offers a common sense definition of consciousness, and argues that a
computer can't have consciousness so defined. To do so, he has to provide a
limiting definition of what a computer can be, but not too restrictive:
Basically, I think, he's deriving a definition of computer from the accepted
definition of the adjective computable.
Along the way, Penrose presents the Godel theorem argument. We can know that
an assertion in a system is true even though we cannot derive it via
algorithmic procedures of the system. But he is not simply reprising the
unconvincing argument for the superiority of minds over computers: that the
mind can somehow step outside the system to which the computer is
algorithmically bound. That argument has always seemed to me to be wishful
thinking: Computers can operate at different symbolic levels; and the fact
that human minds can step outside certain systems to examine problems at a
higher level, does not mean that the mind can leap to any level. Nothing in
our ability to see an individual problem from the outside, to bring insight to
it, constitutes a proof that thought is not governed by complex algorithms.
But Penrose has something different in mind. It isn't just to a higher model
that we jump, but to the thing modeled. We recognize the truth by comparing
the model with reality. This must be the case, Penrose argues, because no
undirected algorithm can recognize truth. The algorithmic method can lead to
falsehood as easily as to truth.
Penrose doesn't say where this business of the mind touching truth occurs. He
presumably believes that we'll need this new model of physics, the one that
transcends quantum mechanics, to understand the connection of mind and
reality. He has some thoughts about such a model, this being in his line of
work, after all. Some hints, he suspects, are hidden in our perception of
time.
The arrow of time, which pierces us all eventually, is not, apparently, a fact
of nature, but a fact of perception. Why are we cursed to this treadmill?
Penrose doesn't know, but he presents some research from physiological
psychology that was new to me, showing that, while our reflexes act in
fraction-of-a-second times, it takes something like a second or two to decide
to act and follow through, even for something as simple as making a fist.
It's common experience that reflexes are faster than conscious decisions, but
these numbers seem obviously wrong. You can do the experiment yourself, timing
how long it takes you to decide to clench your fist and then to do it. Not two
seconds, surely.
But the experiments were apparently well designed and controlled. What's going
on here? Penrose argues that, if the results are accurate, our perception of
time is even more out of step with reality than we thought. Are there some
stages of conscious decision making that are unconscious that we sort of sleep
through? Or do our minds simply attach a perception of near-instantaneousness
to a phenomenon that actually takes a second or two?
How all this ties into the interrelationships among minds, computers, and the
laws of physics is something even Penrose isn't quite sure about. The book
shows what Penrose is thinking, and not all the thoughts have gelled. But the
least certain of his arguments may be the most important.





























September, 1991
ONE-WAY HASH FUNCTIONS


Using cryptographic algorithms for hashing




Bruce Schneier


Bruce has an MS in Computer Science and has worked in computer and data
security for a number of public and private concerns. He can be reached at 730
Fair Oaks Ave., Oak Park, IL 60302.


Pattern matching is usually a problem more suited for elementary programming
classes than Dr. Dobb's Journal. But what if the strings are 100 Kbytes large?
What if you have to search through a million of them looking for matches? And
what if you have to try and match 1000 of these strings each day? With a
simple string comparison algorithm, you will have to make one billion string
comparisons, each requiring from 1 to 100,000 individual byte comparisons. And
don't forget the hellacious storage and retrieval requirements. There has to
be a better way.
Probabilistic algorithms are useful for jobs like this. They work -- most of
the time -- and they have been applied to finding large random prime numbers,
solutions to particularly convoluted graph theory problems, and pattern
matching. The algorithm I am going to present only fails in one out of 2^128
(that's about 3 x 10^38) tries. There isn't a piece of commercial software on
the market that fails less often. You're much more likely to get a disk error
reading the program, or a power surge that fries your whole system, than you
are to have this algorithm fail. By the time this algorithm fails, our species
will most likely have long killed itself off, or at least mutated into some
different species that has better things to do with its time than sit around
comparing a million 100-Kbyte strings.
We're going to take the computer analog of fingerprints: something small that
uniquely identifies something much larger. Hash functions, which are functions
that take a large input string and produce a smaller output string, perform
the task nicely.


Hash Functions


One of the simplest hash functions is to take the XOR of every 128-bit block
in the file, as in Example 1. This works well with random data; each 128-bit
hash value is equally likely. However, with more rigidly formatted data, such
as ASCII, the results are less than ideal. In most normal text files, the
high-order bit of each byte is always 0. Therefore, 16 bits in the hash (one
for each byte of a 128-bit block) will always be 0. This means that the
effective number of possible hashes is far less than 2^128 (or at least that
each possible hash value is not equally
likely), and different files are more likely to produce identical hashes.
Example 1: A simple hash function that uses the XOR of every 128-bit block in
the file

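 #include <stdio.h>

 /* Assumes unsigned long is 32 bits wide, so four of them hold 128 bits. */
 int main (int argc, char *argv[]) {
   unsigned long hash[4] = {0, 0, 0, 0}, data[4];
   FILE *fp;
   int i;
   if (argc > 1 && (fp = fopen (argv[1], "rb")) != NULL) {
     /* read one 128-bit block at a time; a short final block is skipped */
     while (fread (data, 4, 4, fp) == 4)
       for (i = 0; i < 4; i++)
         hash[i] ^= data[i];
     fclose (fp);
     for (i = 0; i < 4; i++)
       printf ("%08lx", hash[i]);
     printf ("\n");
   }
   return 0;
 }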

An easy way to get around this is to roll the hash value 1 bit after each XOR.
This way each bit is more likely to be thoroughly randomized by the time the
entire file has been hashed, regardless of how the data looks beforehand. The
program in Example 2 rolls each longword 1 bit to the right.
Example 2: Rolling each longword 1 bit to the right

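 #include <stdio.h>

 /* Assumes unsigned long is 32 bits wide, so four of them hold 128 bits. */
 int main (int argc, char *argv[]) {
   unsigned long hash[4] = {0, 0, 0, 0}, data[4];
   FILE *fp;
   int i;
   if (argc > 1 && (fp = fopen (argv[1], "rb")) != NULL) {
     /* read one 128-bit block at a time; a short final block is skipped */
     while (fread (data, 4, 4, fp) == 4)
       for (i = 0; i < 4; i++) {
         hash[i] ^= data[i];
         /* roll the longword 1 bit to the right */
         hash[i] = ((hash[i] >> 1) | (hash[i] << 31)) & 0xFFFFFFFFUL;
       }
     fclose (fp);
     for (i = 0; i < 4; i++)
       printf ("%08lx", hash[i]);
     printf ("\n");
   }
   return 0;
 }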

If the hash function is only going to deal with ASCII data, a further
improvement might be to ignore spaces. This way, "HELLO WORLD" hashes to the
same value as "HELLOWORLD." Of course, it also means that "The house holds one
at seven bells" hashes to the same value as "The household son eats even
bells," but nothing is perfect. Maybe the program should count the first
space, but ignore any additional ones. Additional modifications that ignore
punctuation and case might also be useful, or they might be detrimental. It
all depends on the type of data involved.
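As a rough sketch of the space-skipping idea, the routine below hashes an in-memory buffer one byte at a time, ignoring ASCII spaces and rolling the current word 1 bit to the right after each byte. The function name hash_skip_spaces, the C99 uint32_t type, and the byte-at-a-time rolling are assumptions made for illustration; Examples 1 and 2 work a longword at a time and read from a file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the article's listings): fold a byte
   buffer into a 128-bit hash held in four 32-bit words, skipping ASCII
   spaces. After each byte is XORed in, the word is rolled right 1 bit. */
void hash_skip_spaces(const unsigned char *buf, size_t len, uint32_t hash[4])
{
    size_t i, n = 0;                   /* n counts the bytes actually hashed */

    hash[0] = hash[1] = hash[2] = hash[3] = 0;
    for (i = 0; i < len; i++) {
        uint32_t *w;

        if (buf[i] == ' ')
            continue;                  /* spaces do not contribute */
        w = &hash[(n / 4) % 4];        /* 16 hashed bytes per 128-bit block */
        *w ^= (uint32_t)buf[i] << (8 * (n % 4));
        *w = (*w >> 1) | (*w << 31);   /* rotate right by 1 bit */
        n++;
    }
}
```

With this routine, "HELLO WORLD" and "HELLOWORLD" produce the same four words, and so do the two "seven bells" sentences, because the spaces never reach the hash.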
These hash functions work great as long as there isn't a malicious user trying
to gum up the works. Creating a file that hashes to a particular value is very
easy with any of these functions. It's only slightly harder with more
complicated functions. Usually, all that is required is to calculate the hash
of any particular file, and then append a 128-bit string at the end of it to
force the new file to hash to whatever value you'd like.
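To see how little work the forgery takes against the plain XOR hash of Example 1, consider the sketch below. The names xor_hash and forge_block are hypothetical, and the data is assumed to sit in memory as whole 128-bit blocks of four 32-bit words (C99 uint32_t). The block to append is simply the XOR of the current hash with the hash you want.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR hash in the style of Example 1, over an in-memory array of 32-bit
   words. nwords is assumed to be a multiple of 4 (whole 128-bit blocks). */
void xor_hash(const uint32_t *words, size_t nwords, uint32_t hash[4])
{
    size_t i;

    hash[0] = hash[1] = hash[2] = hash[3] = 0;
    for (i = 0; i < nwords; i++)
        hash[i % 4] ^= words[i];
}

/* Compute the 128-bit block that, appended to data whose hash is
   `current`, forces the new hash to equal `target`. */
void forge_block(const uint32_t current[4], const uint32_t target[4],
                 uint32_t block[4])
{
    int i;

    for (i = 0; i < 4; i++)
        block[i] = current[i] ^ target[i];
}
```

Hashing the original data followed by the forged block yields the target, since XORing in current ^ target cancels the old hash and leaves the new one. The rolled hash of Example 2 needs a little more bookkeeping, but the same idea applies.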
In certain applications this problem is very serious, but it can be solved by
using a "one-way hash function." By "one-way," I mean that given an input
file, it is very easy to calculate the hash value. However, given a particular
hash value, it is, for all practical purposes, impossible to produce an input
file that hashes to that value. The function has a trap door: It's easy to go
one way, but hard to go back the other unless you know the "secret."


The MD5 Algorithm


In theory, any cryptographic algorithm can be used as a one-way hash function.
For example, the Data Encryption Standard (DES) algorithm (see the "C
Programming" column, DDJ, September 1990), endorsed by the National Security
Agency and used by countless government agencies and business interests, works
quite well. Encrypt the entire file using the DES cipher block chaining (CBC)
mode, and the two final 64-bit encrypted blocks make up the hash. (ANSI X9.9
suggests using only the last 32 bits of the last encrypted block, but that
seems inadequate.)
It is random, not reversible, and every bit of the hash is dependent on every
bit of the input file. But in software, DES is slow; the algorithm is not at
all optimized for this task. And even worse, if the DES key is compromised,
then everyone knows the secret to the trap door and the one-way property of
the hash function is lost.
An alternate algorithm, shown in Listing One (page 150), is called MD5.
(Listing Two, page 151, is the header file for MD5.) Cryptographer Ron Rivest
of MIT (he's the "R" in the RSA public-key algorithm) invented MD5
specifically as a one-way hash function. His personal needs for the algorithm
revolve more around secure digital signatures than pattern matching, but we
will come back to that later.
MD5 produces a 128-bit hash (or "Message Digest," hence the name) of an input
file. The algorithm was designed for speed, simplicity, and compactness on
32-bit architectures. It is based on a simple set of primitive operations
(such as LOAD, ADD, and XOR) on 32-bit operands. The algorithm also works on 8-
and 16-bit microprocessors, albeit significantly slower.
The heart of the algorithm is MD5Update, which processes the input file in
512-bit chunks. Transform performs the basic MD5 algorithm. MD5Init kicks off
the whole process. MD5Final is primarily concerned with the partial block at
the end of the input file (only the rare input file can be divided exactly
into 512-bit blocks). The final block is padded with a single 1 bit and a
string of 0 bits, with the last 64 bits (two 32-bit words) reserved for a
binary representation of the number of bits in the original input file. After
the algorithm churns away, the resultant hash is stored in mdContext->digest.
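The padding arithmetic can be sketched on its own. The two helper functions below are hypothetical names, not part of the listings; they mirror the padLen computation in MD5Final, where the padding brings the length to 56 bytes mod 64 and the 8-byte bit count fills out the last 512-bit block.

```c
/* Number of padding bytes (the 0x80 byte plus zeros) appended before
   the 8-byte bit count. Always between 1 and 64. */
unsigned int md5_pad_len(unsigned long msg_bytes)
{
    unsigned int used = (unsigned int)(msg_bytes % 64); /* bytes in last block */

    return (used < 56) ? (56 - used) : (120 - used);
}

/* Total number of bytes run through Transform: message, padding, and
   the 8-byte length. Always a multiple of 64. */
unsigned long md5_padded_len(unsigned long msg_bytes)
{
    return msg_bytes + md5_pad_len(msg_bytes) + 8;
}
```

Note that a message that already ends exactly 56 bytes into a block still gets a full 64 bytes of padding, because at least the single 1 bit must always be appended.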
It is straightforward to implement MD5 to solve the pattern matching problem
at the beginning of this article. First, create and store a hash of each of
the million 100-Kbyte input strings. Each one is only 16 bytes long, so you
only need 16 Mbytes of storage space instead of 100 gigabytes. Every time you
have a new test string, compare its hash with those stored. Not only are the
storage requirements four orders of magnitude smaller, you only have to make
one 128-bit comparison per string pair. You'll get a false match every 10^27
years or so, so you might want to add code that explicitly checks strings that
hash to the same value. I wouldn't even bother.
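The lookup step might look something like the sketch below. It assumes the million 16-byte digests have already been computed and packed end to end in one array; find_digest and DIGEST_LEN are names invented for this illustration, and the linear scan is the simplest possible search rather than a tuned one.

```c
#include <stddef.h>
#include <string.h>

#define DIGEST_LEN 16   /* an MD5 digest is 128 bits */

/* Search a packed table of `count` digests for `digest`.
   Returns the index of the first match, or -1 if there is none. */
long find_digest(const unsigned char *table, size_t count,
                 const unsigned char *digest)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (memcmp(table + i * DIGEST_LEN, digest, DIGEST_LEN) == 0)
            return (long)i;            /* one 128-bit comparison per entry */
    return -1;
}
```

A table of 1,000,000 entries occupies 1,000,000 x 16 bytes, or 16 Mbytes, which is the storage figure quoted above.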


Putting One-Way Hash Functions to Work


One-way hash functions can be used to detect viruses, which make their living
by surreptitiously modifying programs. Numerous security programs take simple
hashes called Cyclic Redundancy Checks (CRCs). These programs hash clean
files, and then compare the stored hash with a new hash of the file. If a
virus infected the file, its hash would be different. The problem with most of
these virus-detection programs is that a clever virus can make sure any
virus-infected file hashes to the same value as the clean file. It would be
harder to do this with CRC fingerprinters than with the simple hash programs
listed earlier, but it would still be possible. MD5 makes it practically
impossible. Even if the virus had complete knowledge of the MD5 algorithm, it
could not modify itself so that the infected program would hash to the same
value. All it could do would be to randomly append bits to the file and try to
duplicate the hash by brute force. But even if we allow the virus to try a
million possible hashes per second, it would take 10^25 years for it to
successfully infect the file. I'm willing to live with the possibility of one
successful virus infection on my system between now and when the sun goes
nova.
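The brute-force estimate is easy to check. The function below is a hypothetical helper that assumes the attacker has to work through all 2^bits possible hash values at a fixed rate, and that a year is 365.25 days.

```c
/* Years needed to try 2^bits candidates at tries_per_second. */
double years_to_exhaust(int bits, double tries_per_second)
{
    const double seconds_per_year = 365.25 * 24.0 * 60.0 * 60.0;
    double tries = 1.0;
    int i;

    for (i = 0; i < bits; i++)
        tries *= 2.0;                  /* tries = 2^bits */
    return tries / tries_per_second / seconds_per_year;
}
```

For a 128-bit hash at a million tries per second, the result is on the order of 10^25 years, in line with the figure quoted above.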
As I mentioned, one-way hash functions have uses in digital signatures. In a
digital signature scheme, one party can "sign" a document in such a way that
no one except the signer could have signed it, and anyone can verify that the
signer actually did sign it. Most digital signature
schemes involve public-key cryptography; they are computational nightmares and
often painfully slow. Speed increases drastically if the digital signature
algorithm operates on a hash of the document rather than the document itself.
Because the chances of two documents having the same hash are only 1 in
2^128, anyone can safely equate a signature of the hash with a signature of
the document. If a conventional hash function were used, it would be a trivial
matter to invent multiple documents that hashed to the same value, so that
anyone signing a particular document would be duped into signing a multitude
of documents. The protocol just could not work without one-way hash functions.
One-way hash functions can also be used by an archival system to verify the
existence of documents without storing their contents. The central database
could just store the hashes of files. It doesn't have to see the files at all;
users submit their hashes to the database, and the database time stamps the
submissions and stores them. If there is any disagreement in the future about
who created a document and when, the database could resolve it by finding the
hash in its files. Enhancements to this scheme, such as having the database
use a digital signature protocol and sign a concatenation of the hash and a
date/time code, would make this protocol even more useful. This has vast
implications with regard to privacy: Someone could copyright a document but
still keep it secret. Only if he wished to prove his copyright would he have
to make it public.
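A bare-bones version of such a database could be sketched as follows. Everything here is an assumption made for illustration: the structure and function names, the fixed-size in-memory table, and the use of time_t for the time stamp. A real service would also need persistent storage and the digital signature enhancement mentioned above.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

#define DIGEST_LEN  16      /* 128-bit hash submitted by the user */
#define MAX_RECORDS 1024    /* arbitrary capacity for this sketch */

struct stamp_record {
    unsigned char digest[DIGEST_LEN];
    time_t when;            /* time the submission was received */
};

struct stamp_db {
    struct stamp_record rec[MAX_RECORDS];
    size_t count;
};

/* Record a submitted hash with its time stamp. The database never
   sees the document. Returns 0 on success, -1 if the table is full. */
int stamp_submit(struct stamp_db *db, const unsigned char *digest, time_t when)
{
    if (db->count >= MAX_RECORDS)
        return -1;
    memcpy(db->rec[db->count].digest, digest, DIGEST_LEN);
    db->rec[db->count].when = when;
    db->count++;
    return 0;
}

/* Find the earliest submission of a hash, or NULL if it was never
   submitted. The earliest record settles a dispute over priority. */
const struct stamp_record *stamp_find(const struct stamp_db *db,
                                      const unsigned char *digest)
{
    const struct stamp_record *best = NULL;
    size_t i;

    for (i = 0; i < db->count; i++) {
        const struct stamp_record *r = &db->rec[i];

        if (memcmp(r->digest, digest, DIGEST_LEN) != 0)
            continue;
        if (best == NULL || difftime(r->when, best->when) < 0)
            best = r;
    }
    return best;
}
```

To prove priority later, the author reveals the document; anyone can hash it and compare the result with the stored record.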
The source code has been successfully compiled and run under MS-DOS, Unix, and
Macintosh OS. Because of space constraints, I've included with this article
only the C source that implements the MD5 algorithm; the include files,
makefiles, and sample test routines are available electronically as described
on page 3. RSA Data Security Inc. has put the MD5 algorithm into the public
domain so that anyone can use it. Their only requirement is that any copies of
their source code or documentation (but not different implementations of the
mathematical algorithm) retain the copyright notice at the beginning of
Listing One. Rivest cautions that while the algorithm improves over MD4, it is
still new and has not been completely cryptanalyzed. It is always possible
that some clever person will find a way to break it, but over the years as
more clever people try and fail, confidence in the algorithm will grow.
_ONE-WAY HASH FUNCTIONS_
by Bruce Schneier


[LISTING ONE]


/***********************************************************************
 ** md5.c -- the source code for MD5 routines **
 ** RSA Data Security, Inc. MD5 Message-Digest Algorithm **
 ** Created: 2/17/90 RLR **
 ** Revised: 1/91 SRD,AJ,BSK,JT Reference C Version **
 ** Revised (for MD5): RLR 4/27/91 **
 ***********************************************************************
 ** Copyright (C) 1990, RSA Data Security, Inc. All rights reserved. **
 ** License to copy and use this software is granted provided that **
 ** it is identified as the "RSA Data Security, Inc. MD5 Message- **
 ** Digest Algorithm" in all material mentioning or referencing this **
 ** software or this function. **
 ** License is also granted to make and use derivative works **
 ** provided that such works are identified as "derived from the RSA **
 ** Data Security, Inc. MD5 Message-Digest Algorithm" in all **
 ** material mentioning or referencing the derived work. **
 ** RSA Data Security, Inc. makes no representations concerning **
 ** either the merchantability of this software or the suitability **
 ** of this software for any particular purpose. It is provided "as **
 ** is" without express or implied warranty of any kind. **
 ** These notices must be retained in any copies of any part of this **
 ** documentation and/or software. **
 **********************************************************************/

#include "md5.h"


/* forward declaration */
static void Transform ();

static unsigned char PADDING[64] = {
 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* F, G, H and I are basic MD5 functions */
#define F(x, y, z) (((x) & (y)) | ((~x) & (z)))
#define G(x, y, z) (((x) & (z)) | ((y) & (~z)))
#define H(x, y, z) ((x) ^ (y) ^ (z))
#define I(x, y, z) ((y) ^ ((x) | (~z)))

/* ROTATE_LEFT rotates x left n bits */
#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32-(n))))

/* FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4 */
/* Rotation is separate from addition to prevent recomputation */
#define FF(a, b, c, d, x, s, ac) \
 {(a) += F ((b), (c), (d)) + (x) + (UINT4)(ac); \
 (a) = ROTATE_LEFT ((a), (s)); \
 (a) += (b); \
 }
#define GG(a, b, c, d, x, s, ac) \
 {(a) += G ((b), (c), (d)) + (x) + (UINT4)(ac); \
 (a) = ROTATE_LEFT ((a), (s)); \
 (a) += (b); \
 }
#define HH(a, b, c, d, x, s, ac) \
 {(a) += H ((b), (c), (d)) + (x) + (UINT4)(ac); \
 (a) = ROTATE_LEFT ((a), (s)); \
 (a) += (b); \
 }
#define II(a, b, c, d, x, s, ac) \
 {(a) += I ((b), (c), (d)) + (x) + (UINT4)(ac); \
 (a) = ROTATE_LEFT ((a), (s)); \
 (a) += (b); \
 }

/* The routine MD5Init initializes the message-digest context
 mdContext. The bit count is set to zero and the scratch buffer is
 loaded with the magic initialization constants. */
void MD5Init (mdContext)
MD5_CTX *mdContext;
{
 mdContext->i[0] = mdContext->i[1] = (UINT4)0;

 /* Load magic initialization constants.
 */
 mdContext->buf[0] = (UINT4)0x67452301;
 mdContext->buf[1] = (UINT4)0xefcdab89;
 mdContext->buf[2] = (UINT4)0x98badcfe;
 mdContext->buf[3] = (UINT4)0x10325476;

}

/* The routine MD5Update updates the message-digest context to
 account for the presence of each of the characters inBuf[0..inLen-1]
 in the message whose digest is being computed. */
void MD5Update (mdContext, inBuf, inLen)
MD5_CTX *mdContext;
unsigned char *inBuf;
unsigned int inLen;
{
 UINT4 in[16];
 int mdi;
 unsigned int i, ii;

 /* compute number of bytes mod 64 */
 mdi = (int)((mdContext->i[0] >> 3) & 0x3F);

 /* update number of bits */
 if ((mdContext->i[0] + ((UINT4)inLen << 3)) < mdContext->i[0])
 mdContext->i[1]++;
 mdContext->i[0] += ((UINT4)inLen << 3);
 mdContext->i[1] += ((UINT4)inLen >> 29);

 while (inLen--) {
 /* add new character to buffer, increment mdi */
 mdContext->in[mdi++] = *inBuf++;

 /* transform if necessary */
 if (mdi == 0x40) {
 for (i = 0, ii = 0; i < 16; i++, ii += 4)
 in[i] = (((UINT4)mdContext->in[ii+3]) << 24) |
 (((UINT4)mdContext->in[ii+2]) << 16) |
 (((UINT4)mdContext->in[ii+1]) << 8) |
 ((UINT4)mdContext->in[ii]);
 Transform (mdContext->buf, in);
 mdi = 0;
 }
 }
}

/* The routine MD5Final terminates the message-digest computation and
 ends with the desired message digest in mdContext->digest[0...15]. */
void MD5Final (mdContext)
MD5_CTX *mdContext;
{
 UINT4 in[16];
 int mdi;
 unsigned int i, ii;
 unsigned int padLen;

 /* save number of bits */
 in[14] = mdContext->i[0];
 in[15] = mdContext->i[1];

 /* compute number of bytes mod 64 */
 mdi = (int)((mdContext->i[0] >> 3) & 0x3F);

 /* pad out to 56 mod 64 */
 padLen = (mdi < 56) ? (56 - mdi) : (120 - mdi);

 MD5Update (mdContext, PADDING, padLen);

 /* append length in bits and transform */
 for (i = 0, ii = 0; i < 14; i++, ii += 4)
 in[i] = (((UINT4)mdContext->in[ii+3]) << 24) |
 (((UINT4)mdContext->in[ii+2]) << 16) |
 (((UINT4)mdContext->in[ii+1]) << 8) |
 ((UINT4)mdContext->in[ii]);
 Transform (mdContext->buf, in);

 /* store buffer in digest */
 for (i = 0, ii = 0; i < 4; i++, ii += 4) {
 mdContext->digest[ii] = (unsigned char)(mdContext->buf[i] & 0xFF);
 mdContext->digest[ii+1] =
 (unsigned char)((mdContext->buf[i] >> 8) & 0xFF);
 mdContext->digest[ii+2] =
 (unsigned char)((mdContext->buf[i] >> 16) & 0xFF);
 mdContext->digest[ii+3] =
 (unsigned char)((mdContext->buf[i] >> 24) & 0xFF);
 }
}

/* Basic MD5 step. Transforms buf based on in. */
static void Transform (buf, in)
UINT4 *buf;
UINT4 *in;
{
 UINT4 a = buf[0], b = buf[1], c = buf[2], d = buf[3];

 /* Round 1 */
#define S11 7
#define S12 12
#define S13 17
#define S14 22
 FF ( a, b, c, d, in[ 0], S11, 0xd76aa478); /* 1 */
 FF ( d, a, b, c, in[ 1], S12, 0xe8c7b756); /* 2 */
 FF ( c, d, a, b, in[ 2], S13, 0x242070db); /* 3 */
 FF ( b, c, d, a, in[ 3], S14, 0xc1bdceee); /* 4 */
 FF ( a, b, c, d, in[ 4], S11, 0xf57c0faf); /* 5 */
 FF ( d, a, b, c, in[ 5], S12, 0x4787c62a); /* 6 */
 FF ( c, d, a, b, in[ 6], S13, 0xa8304613); /* 7 */
 FF ( b, c, d, a, in[ 7], S14, 0xfd469501); /* 8 */
 FF ( a, b, c, d, in[ 8], S11, 0x698098d8); /* 9 */
 FF ( d, a, b, c, in[ 9], S12, 0x8b44f7af); /* 10 */
 FF ( c, d, a, b, in[10], S13, 0xffff5bb1); /* 11 */
 FF ( b, c, d, a, in[11], S14, 0x895cd7be); /* 12 */
 FF ( a, b, c, d, in[12], S11, 0x6b901122); /* 13 */
 FF ( d, a, b, c, in[13], S12, 0xfd987193); /* 14 */
 FF ( c, d, a, b, in[14], S13, 0xa679438e); /* 15 */
 FF ( b, c, d, a, in[15], S14, 0x49b40821); /* 16 */

 /* Round 2 */
#define S21 5
#define S22 9
#define S23 14
#define S24 20
 GG ( a, b, c, d, in[ 1], S21, 0xf61e2562); /* 17 */
 GG ( d, a, b, c, in[ 6], S22, 0xc040b340); /* 18 */
 GG ( c, d, a, b, in[11], S23, 0x265e5a51); /* 19 */
 GG ( b, c, d, a, in[ 0], S24, 0xe9b6c7aa); /* 20 */
 GG ( a, b, c, d, in[ 5], S21, 0xd62f105d); /* 21 */
 GG ( d, a, b, c, in[10], S22, 0x2441453); /* 22 */
 GG ( c, d, a, b, in[15], S23, 0xd8a1e681); /* 23 */
 GG ( b, c, d, a, in[ 4], S24, 0xe7d3fbc8); /* 24 */
 GG ( a, b, c, d, in[ 9], S21, 0x21e1cde6); /* 25 */
 GG ( d, a, b, c, in[14], S22, 0xc33707d6); /* 26 */
 GG ( c, d, a, b, in[ 3], S23, 0xf4d50d87); /* 27 */
 GG ( b, c, d, a, in[ 8], S24, 0x455a14ed); /* 28 */
 GG ( a, b, c, d, in[13], S21, 0xa9e3e905); /* 29 */
 GG ( d, a, b, c, in[ 2], S22, 0xfcefa3f8); /* 30 */
 GG ( c, d, a, b, in[ 7], S23, 0x676f02d9); /* 31 */
 GG ( b, c, d, a, in[12], S24, 0x8d2a4c8a); /* 32 */

 /* Round 3 */
#define S31 4
#define S32 11
#define S33 16
#define S34 23
 HH ( a, b, c, d, in[ 5], S31, 0xfffa3942); /* 33 */
 HH ( d, a, b, c, in[ 8], S32, 0x8771f681); /* 34 */
 HH ( c, d, a, b, in[11], S33, 0x6d9d6122); /* 35 */
 HH ( b, c, d, a, in[14], S34, 0xfde5380c); /* 36 */
 HH ( a, b, c, d, in[ 1], S31, 0xa4beea44); /* 37 */
 HH ( d, a, b, c, in[ 4], S32, 0x4bdecfa9); /* 38 */
 HH ( c, d, a, b, in[ 7], S33, 0xf6bb4b60); /* 39 */
 HH ( b, c, d, a, in[10], S34, 0xbebfbc70); /* 40 */
 HH ( a, b, c, d, in[13], S31, 0x289b7ec6); /* 41 */
 HH ( d, a, b, c, in[ 0], S32, 0xeaa127fa); /* 42 */
 HH ( c, d, a, b, in[ 3], S33, 0xd4ef3085); /* 43 */
 HH ( b, c, d, a, in[ 6], S34, 0x4881d05); /* 44 */
 HH ( a, b, c, d, in[ 9], S31, 0xd9d4d039); /* 45 */
 HH ( d, a, b, c, in[12], S32, 0xe6db99e5); /* 46 */
 HH ( c, d, a, b, in[15], S33, 0x1fa27cf8); /* 47 */
 HH ( b, c, d, a, in[ 2], S34, 0xc4ac5665); /* 48 */

 /* Round 4 */
#define S41 6
#define S42 10
#define S43 15
#define S44 21
 II ( a, b, c, d, in[ 0], S41, 0xf4292244); /* 49 */
 II ( d, a, b, c, in[ 7], S42, 0x432aff97); /* 50 */
 II ( c, d, a, b, in[14], S43, 0xab9423a7); /* 51 */
 II ( b, c, d, a, in[ 5], S44, 0xfc93a039); /* 52 */
 II ( a, b, c, d, in[12], S41, 0x655b59c3); /* 53 */
 II ( d, a, b, c, in[ 3], S42, 0x8f0ccc92); /* 54 */
 II ( c, d, a, b, in[10], S43, 0xffeff47d); /* 55 */
 II ( b, c, d, a, in[ 1], S44, 0x85845dd1); /* 56 */
 II ( a, b, c, d, in[ 8], S41, 0x6fa87e4f); /* 57 */
 II ( d, a, b, c, in[15], S42, 0xfe2ce6e0); /* 58 */
 II ( c, d, a, b, in[ 6], S43, 0xa3014314); /* 59 */
 II ( b, c, d, a, in[13], S44, 0x4e0811a1); /* 60 */
 II ( a, b, c, d, in[ 4], S41, 0xf7537e82); /* 61 */
 II ( d, a, b, c, in[11], S42, 0xbd3af235); /* 62 */
 II ( c, d, a, b, in[ 2], S43, 0x2ad7d2bb); /* 63 */
 II ( b, c, d, a, in[ 9], S44, 0xeb86d391); /* 64 */

 buf[0] += a;
 buf[1] += b;
 buf[2] += c;
 buf[3] += d;
}

/***********************************************************************
 ** End of md5.c **
 **********************************************************************/






[LISTING TWO]


/***********************************************************************
 ** md5.h -- header file for implementation of MD5 **
 ** RSA Data Security, Inc. MD5 Message-Digest Algorithm **
 ** Created: 2/17/90 RLR **
 ** Revised: 12/27/90 SRD,AJ,BSK,JT Reference C version **
 ** Revised (for MD5): RLR 4/27/91 **
 **********************************************************************/
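/* typedef a 32-bit type; unsigned long is assumed to be 32 bits wide here,
 so substitute another 32-bit type on systems where long is wider */
typedef unsigned long int UINT4;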

/* Data structure for MD5 (Message-Digest) computation */
typedef struct {
 UINT4 i[2]; /* number of _bits_ handled mod 2^64 */
 UINT4 buf[4]; /* scratch buffer */
 unsigned char in[64]; /* input buffer */
 unsigned char digest[16]; /* actual digest after MD5Final call */
} MD5_CTX;

void MD5Init ();
void MD5Update ();
void MD5Final ();

/***********************************************************************
 ** End of md5.h **
 **********************************************************************/




















September, 1991
OF INTEREST





ProtoView has released SQL View, a high-level scripting and prototyping
language for the Windows environment that features high- (and, if necessary,
low-) level access to SQL Server database tables, views, stored procedures,
and powerful data and cross-field validation. SQL View provides table
browsing, spreadsheet-like data lens extraction, data formatting and
validation features, and interactive querying and update functionality and is
fully integrated with ProtoView's screen painter and object editor. Once a
prototype has been developed and tested, it can be compiled down to a binary
executable that can be run directly under Windows.
Typical application development time can be cut by more than 100 percent,
compared to other Windows development languages.
Also available now from ProtoView is the Screen Management Facility for
Windows, Version 3.2, new features of which are: the ability to place bitmaps
and icons within pushbuttons using the screen painter; multiple range checking
for data entry fields; currency and date control supporting 26 currencies and
six international formats, respectively, with full validation; and enhanced
code generation for Borland C++ and Zortech C++.
SQL View costs $495; ProtoView 3.2 runs $695. Reader service no. 24.
ProtoView Development Co. 353 Georges Road Dayton, NJ 08810 908-329-8588
ZBasic/Windows, a Windows development kit from 32 Bit Software, is based on
ZBasic-PC, which allows you to write both 386 and 486 programs on any PC.
Programs can be written on non-386 machines and recompiled for 386 and 486
hardware.
ZBasic/Windows allows you to use Windows at three levels of complexity: a
basic level of coexistence with other Windows applications, a more advanced
level with Windows compliance, and a fully integrated Windows level with
multiple boxes and menu bars. This choice of levels allows you to easily write
programs that coexist with other applications in the Windows environment.
ZBasic/Windows' introductory price is $139.95; the regular price will be
$199.95. Bundled with ZBasic-PC, the cost is $229.95. Reader service no. 20.
32 Bit Software 3232 McKinney Ave., LB 14 Dallas, TX 75204 800-32BITSW or
214-720-2051
BRIEForC++, a BRIEF add-on that includes a complete browser for C++, has been
released by Solution Systems. The browser allows you to view classes and
structures in a pop-up window. You can browse according to class hierarchy,
member variables, or member functions. The browser scans multiple files,
allowing access to all files in the system, not just those currently stored in
memory. BRIEForC++ automatically loads the relevant files and positions the
cursor to the appropriate member variable, member function, or class.
Once you make a selection from the window, you can begin editing with BRIEF.
To edit from another class or function, call the browser again.
BRIEForC++ supports the Borland C++ Version 2.0 compiler and requires BRIEF
3.0 or higher, hard disk, and a minimum of 256K RAM. The price is $129; $399
when bundled with BRIEF 3.1. Reader service no. 21.
Solution Systems 372 Washington Street Wellesley, MA 02181 800-677-0001 or
617-431-2313
The ImageTools Document Management Utility (DMU) is now available from
Discorp. Running under Windows 3.0 and fully implementing the Dynamic Data
Exchange, ImageTools facilitates complex imaging functions such as scan and
store, view, print, export, and document management through simple program
interaction. ImageTools affords easy integration of document image processing
into any application and simplified access to users' imaging peripherals.
The DMU is accessed through either a user interface, the Windows clipboard, or
a DDE interface, and uses either single files or Discorp's folder manager to
create, print, or view documents. Viewing and printing are accomplished using
hardware or software decompression with the image data presented in a window.
Each image window allows for image zooming, reduction, rotating, panning,
scrolling, and copying to and from the Windows clipboard. File formats
available include CMP, ImageTools, TIFF, or PDA; you can export to BMP or PCX
formats.
ImageTools costs $400. Reader service no. 22.
Distributed Image Systems Corporation 290 Easy Street #5 Simi Valley, CA 93065
805-584-0688
Integrated communication between remote computers and central applications on
LANs is enabled by the new family of products from Dovetail Communications.
The family includes the File Transfer Server, a hardware/software
communications server which resides in a dedicated PC workstation and manages
bidirectional file transfer activities at speeds of 1200-9600 bps. The server
includes the Script Language Interpreter (SLI), an embedded programming
environment (a dialect of Tiny Basic) that allows you to integrate file
transfer functions into your network applications.
The Communications Co-Processor is an add-on component that allows support for
as many as eight dial-up lines. (The basic server supports only two.) Also
included is REMCOM, a remote API module that optimizes the file transfer
process for operation from within another program, making file transfer
automated and transparent for remote users.
Each LAN server costs $695; REMCOM modules, $15; and the Communications
Co-Processor, $500. OEM pricing is available. Reader service no. 23.
Dovetail Communications Technologies 1050 Northgate Dr., Suite 455 San Rafael,
CA 94903-2540 415-492-1303
A new version (1.62) of the Far Memory Manager library is being shipped by
Dolphin Software. The library now provides the ability to identify the line
number and source file of a statement that generates any of 27 errors
detectable by the Dolphin functions; a utility to display any Dolphin function
declaration from the DOS command line; and revised documentation that presents
the debugging features in practical detail.
This is added to the previously available enhanced memory allocation
functions, multidimensional dynamic array allocation, far heap diagnostic
functions, and a memory allocation log for tracking down far memory bugs.
Ben Barron of the University of California School of Medicine in Davis,
Calif., said of the Far Memory Manager, "I was most impressed by the fact that
it permits you to allocate and deallocate memory in blocks of any size without
any penalty, and by the log that shows you how all your far memory was
allocated."
The library supports both Turbo C/C++ and Microsoft C and comes in small,
medium, and large memory models. The price is $99. Reader service no. 25.
Dolphin Software 48 Shattuck Square #147 Berkeley, CA 94704 415-644-9530
Release 2.0 of NEWT-SDK, a TCP/IP development kit developed specifically for
Windows 3.0, is now shipping from NetManage. NEWT is the only currently
available TCP/IP package implemented as a Windows DLL, as opposed to a TSR;
thus, it is loaded into memory only when needed. Release 2.0 includes two new
Dynamic Link Libraries: FTP, which provides an API to the Internet File
Transfer Protocol for file transfer to and from a remote system; and SMTP,
which provides an API to the Simple Mail Transfer Protocol, allowing mail
exchange with Unix workstations, NEWT users, and other supporting systems.
NEWT supports IP routing between networks -- Ethernet, Token Ring, FDDI, and
now the Serial Line IP (SLIP) -- communicating with multiple network cards
simultaneously and routing IPs between them.
DDJ spoke with Brian Holeman of Maxim Technologies in Vienna, Va., who used
NEWT to connect a server on a Sparc station to a user interface. "NEWT was the
only product we found with the ability to link into the server as a standard
Windows app from within Windows," said Holeman.
NEWT supports Windows' standard and enhanced modes and all network interface
cards based on the NDIS industry standard. The price is $500; site licenses
and multiple-copy discounts are available. Reader service no. 26.
NetManage Inc. 10020 N. DeAnza Blvd. #101 Cupertino, CA 95014 408-257-6404
The C Editor and Menu+ are two new libraries from Automated Business
Consultants. The editor acts as a complete word processor, allowing unlimited
data file size and offering a spell checker, printer drivers, underlining,
bolding, justification, cut and paste and search and replace facilities, and
multiple window margins.
Menu+ is a user interface library with window functions, window status
tracking, pull-down bars, pop-up button menus, a complete set of data entry
functions, file and directory browsers, and picklists.
Bundled together, the C Editor and Menu+ cost $325; Menu+ is sold separately
for $150. Reader service no. 27.
Automated Business Consultants P.O. Box 5642 Evansville, IN 47715 812-477-6482
SLR Systems has announced OPTLINK/Compress 3.0, a linker with compressed,
single-level overlays designed to provide greater file compression at higher
speed. It produces self-extract-to-run EXE files, and when the file is
executed, an automatically embedded decompression routine instantly takes
control and expands the program back to its original form.
Features new to version 3.0 include: up to 8192 overlays, indirect calls, EMS/
XMS caching, compress root and large and small model overlays, and
autoallocation of libraries. Furthermore, OPTLINK/Compress allows overlays to
a separate file and calls from overlay to overlay. With OPTLINK, large
programs can be stored on a single floppy disk -- its EXE files need less than
1K extra memory for the decompression routine.
Kenneth Kiraly of MultiScope Inc. in Mountain View, Calif., chose the product
for its speed. "Compared to its competitors, OPTLINK saves time in developing
large applications and generates smaller executables," Kiraly commented.
OPTLINK/Compress supports Codeview 3.0 and 3.1 and costs $350. Upgrades are
$75 or free, depending on purchase date. Reader service no. 28.
SLR Systems 1622 N. Main Street Butler, PA 16001 412-282-0864
New from Machine Independent Software is the Hash Table Generator, for use in
compilers, interpreters, and other language recognizers. The Generator
produces a table which results in a "perfect hash," the fastest way to
identify keywords. One hash function call and one string compare uniquely
identify any keyword. The hash function generation process is automated: You
provide a list of keywords in a text file, one keyword per line. To add a
keyword, simply add a line to this file and rerun the Hash Table Generator
using the new list.
Features include: no collisions (each keyword produces a unique hash
result); for n keywords, hash values that run from 0 to n-1; a fast
hash function, provided in C source, that uses one table lookup and one XOR
operation for each character in a keyword; a header file including keyword
hash value #defines and an array of keyword strings sorted by hash value; and
dynamic computation of hash table size.
Available for DOS and OS/2. $59 includes table generator, sample input file,
hash function in C source, and user guide. Reader service no. 29.
Machine Independent Software Corp. 1651 Bennington Hollow Lane Reston, VA
22090 703-435-0413
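The kind of hash function described in this announcement (one table lookup and one XOR per character, then a single string compare) can be illustrated with a short C sketch. This is not the product's code or its generation method: the keyword list, the names hash_keyword, build_table, and find_keyword, the stand-in xorshift generator, and the naive random search for a collision-free table are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NKEYS 6

/* Hypothetical keyword list; a real generator would read it from a file. */
static const char *keywords[NKEYS] = {
    "if", "else", "while", "for", "return", "int"
};

static unsigned char table[256]; /* lookup table chosen by build_table()   */
static int slot[NKEYS];          /* slot[h] = index of keyword with hash h */

/* Stand-in pseudo random generator (xorshift32), used only to try tables. */
static uint32_t seed = 1u;
static uint32_t next_random(void)
{
    seed ^= seed << 13;
    seed ^= seed >> 17;
    seed ^= seed << 5;
    return seed;
}

/* One XOR and one table lookup per character; result is 0..NKEYS-1. */
unsigned hash_keyword(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = table[(h ^ (unsigned char)*s++) & 0xFFu];
    return h;
}

/* Naive search: refill the table at random until no two keywords collide.
   Returns 1 on success, 0 if no suitable table was found. */
int build_table(void)
{
    long attempt;
    for (attempt = 0; attempt < 1000000L; attempt++) {
        int i, ok = 1;
        for (i = 0; i < 256; i++)
            table[i] = (unsigned char)(next_random() % NKEYS);
        for (i = 0; i < NKEYS; i++)
            slot[i] = -1;
        for (i = 0; i < NKEYS && ok; i++) {
            unsigned h = hash_keyword(keywords[i]);
            if (slot[h] != -1)
                ok = 0;          /* collision: try another table */
            else
                slot[h] = i;
        }
        if (ok)
            return 1;
    }
    return 0;
}

/* One hash call and one string compare identify a keyword.
   Call only after build_table() has returned 1. */
int find_keyword(const char *s)
{
    int k = slot[hash_keyword(s)];
    return strcmp(keywords[k], s) == 0 ? k : -1;
}
```

After build_table() succeeds, find_keyword("while") returns the keyword's index in the list, and any string that is not a keyword returns -1. Because every table entry already lies in 0..NKEYS-1, no final modulo step is needed.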
Sequiter Software has released CodeBase++ 1.01, a multiuser, dBase-compatible
class library for use with C++ that now comes with a DLL for Microsoft Windows
3.0. The library works directly with the data, index, and memo files of dBase
III-IV, allowing C++ developers to build dBase-compatible applications. The
combination of C++ and CodeBase++ offers high-level database management
capabilities, speed, low memory requirements, portability, flexibility, and
the advantages of a low-level programming language.
Ray Storer, an independent developer from Fort Knox, Ky., said, "The functions
are really powerful and straightforward. Take the function DATAMEMO, for
example: You execute it, and it does everything for you. It opens up buffers
and the database, memo, and index files. You just need to manipulate the
data."
CodeBase++ can be shipped royalty free with any application; price is $295.
Reader service no. 30.
Sequiter Software Inc. #209, 9644-54 Ave. Edmonton, Alberta T6E 5V1 Canada
403-448-0313
Embedded DOS is the new MS-DOS-compatible operating system from General
Software. Originally developed for desktop use, it was revamped to satisfy the
demands of embedded systems developers. It provides the full functionality of
MS-DOS 3.31 with a ruggedized, real-time implementation essential for embedded
applications.
An "adaptation kit" allows you to combine and customize the DOS components to
meet the requirements of your embedded environment; it also includes a special
version of the DOS kernel that supports symbolic debugging during system
development. Source code is provided for all the base device drivers,
COMMAND.COM, and utility programs.
Developer kits cost $495; royalties are based on volume commitments ($2-6 per
copy). Reader service no. 31.

General Software P.O. Box 2571 Redmond, WA 98073 206-391-4285





























































September, 1991
SWAINE'S FLAMES


May the Metaphors Be with You




Michael Swaine


The IBM-Apple agreement seems to answer a few questions. The answer to the
question, "Which Mac-like interface will IBM put on its Unix?" is, apparently,
"Apple's." The Macintosh interface itself will be grafted onto AIX.
The answer to the question, "Which RISC chip will Apple use in future
computers?" is, apparently, "IBM's." Apple will use the IBM Power Chip, a
Motorola/IBM single-chip implementation of RISC System/6000 technology.
The answer to the question, "How will IBM respond to Apple's lead in
multimedia technology?" is, apparently, "By acquiring rights to it." Apple and
IBM will jointly develop platform-independent multimedia software.
It's not so apparent what question is answered by IBM's and Apple's agreement
to agree to form a jointly-owned software company to develop an
object-oriented operating system capable of running software developed for the
Mac, AIX, and OS/2, but the deal as a whole certainly answers the question,
"What can Apple do to draw attention away from layoffs and slipping profits?"
Of course, the primary function of such machinations is not to answer
questions, but to raise them, keeping the computer press busy with such
puzzles as: Does the RISC alliance hurt workstation vendors such as Sun
Microsystems or the Advanced Computing Environment group? If IBM gives the nod
to Motorola, what gesture is it giving to Intel? Whither OS/2? Is all this
legal?
In the present political climate, Apple and IBM probably don't have to worry
about the legality of their agreement as long as they don't put any quotas in
it, but most of the other questions, while worth asking, are not yet
answerable. Apple and IBM have only signed a letter of intent. An actual
agreement could take until Christmas, and any products that come of the
agreement could be four or five years off. Then, too, IBM could decide to live
up to the letter of the agreement while putting its real efforts into
something else entirely; it is a company capable of doing several incompatible
things at once, such as ten years ago when, after having commissioned
Microsoft to develop the operating system for the original PC, it offered its
customers a choice of three operating systems.


Speaking of Microsoft...


About the only thing that everyone is sure of is that this deal is not good
news for Microsoft. It chips away at the Masters of the Universe image that
Microsoft had built up. Back in March, I presented on this page an
alternate-world version of Lord of the Rings, in which Bilbo kept the ring of
power and was consolidating his position as absolute ruler of Middle Earth. It
was intended as a straightforward description of the present state of the
personal computer industry. Six months and one letter of intent later, Bilbo
seems to have dropped the ring.
It was probably to be expected; in fact, Dave Winer expected something like it
when I talked with him this summer. Speaking of Microsoft's hegemony, he said,
"They can't hold onto that indefinitely. It's got to swing back the other way.
Microsoft is too hot. Everybody is taking aim at them."
Winer's view then was that Microsoft could make life very tough for Apple by
making all the right improvements in Windows and DOS in the next versions, and
that Microsoft probably would do just that. The same strategy could still be
the best one for Microsoft, keeping them competitive, if not dominant. In
fact, Microsoft's Steve Ballmer characterized his company's focus in the wake
of the Apple-IBM pact as "Windows, Windows, Windows." Of course, future
versions of Windows might have as much to do with Windows 3.0 as Visual Basic
has to do with Basic. Surely, Microsoft's object linking and embedding
technology will be a part of a future Windows, or it will have no chance
against this IBM-Apple operating system. If IBM and Apple actually get
together to produce an operating system, I think we all hope there will be an
alternative. Maybe it'll be Microsoft's. Forget the Lord of the Rings analogy;
think of Darth Vader joining the rebel alliance.
But Microsoft is no Darth Vader, either. Throughout its involvement with IBM,
Microsoft has, to a remarkable degree, maintained its integrity and steered its
own course. Will Apple do as well?


































October, 1991
EDITORIAL


If You Wanna Play, You're Gonna Have to Pay




Jonathan Erickson


If you believe the world of reusable object-oriented software components is
upon us, you might give some thought to how to measure the value of
software. For OO pioneer Brad Cox, the best way to measure the worth of code
is by the amount of use it receives, not the number of copies sold.
Consequently, says Cox, programming tool vendors wanting to distribute their
wares may want to follow the lead of WKRP, instead of Egghead Software.
It's Cox's view that today's software industry faces some of the same
challenges the music industry confronted years ago when radio stations began
playing music free-of-charge to listeners. Instead of crying the blues, the
melody moguls adopted the concept of pay-per-use, a notion that simply means
radio stations receive music free-of-charge, paying for it as songs are
played. Industry-sponsored clearing houses (ASCAP and BMI) monitor this
process by, among other functions, receiving payment from radio stations and
distributing royalties to publishers.
Cox proposes that software vendors also distribute programs free-of-charge,
allowing users to pay for software only as it's used. No, this isn't shareware
in sheep's clothing. To use the software, you need a unique add-in
board--provided at no charge by an ASCAP-like software clearing house--that
unlocks the software for use and logs it as it's accessed. The user then
uploads (via an 800 phone number) usage information to the clearing house,
receiving an invoice by return mail.
Instead of the phone system, the add-in card can be equipped with an RF modem,
allowing the board to wake up at certain times to automatically broadcast
software-usage data to the host. If the user doesn't pay the piper on time,
the host broadcasts a shut-down message until the bill is paid. Software
updates and other information can also be transmitted.
Pay-per-use seems more apropos for tool developers than end users. First of
all, it's ideally suited for reusable, high-level software components that can
be assembled into working programs. Having a storehouse of thousands of C++
classes at your fingertips is tantalizing; having to pay for each and every
one isn't. With the current pay-per-copy model, you buy the whole enchilada
just for access to a few. With pay-per-use, you have access to a storehouse of
components, but only pay for those you use, as you use them. Implicit in this
concept, of course, is a royalty-free distribution license covering software
passed on to end users.
Just as radio station end users don't pay to hear music, neither will software
end users be inclined to spring for pay-per-use programs. (I certainly don't
want to hear the cash register ring every time I launch my word processor.)
However, pay-per-use might eliminate some headaches for network
administrators. Every time an application program is copied from a server to a
network workstation, the "use" is logged, broadcast, and invoiced--and
documentation is immediately express mailed back to the site. The
administrator only has to manage a single in-house copy. To some degree, this
oversimplified sketch describes technology that's already in place:
Hewlett-Packard's NetLS software, for instance, "floats" licenses around a
network, logging charges based on time used.
Nor is there any reason why pay-per-use and pay-per-copy can't reside on the
same system, particularly for developers. Programmers would likely have
pay-per-copy compilers, roll-your-own classes, and pay-per-use components. (In
fact, we already accept a pay-per-copy/use mixed model with the phone system:
We "pay-per-copy" for local phone calls and "pay-per-use" for long distance.)
Individual programmers might very well want to pay per copy for some
components, particularly for those objects they frequently use.
Before pay-per-use software--or, for that matter, the concept of reusable
components--will be accepted, standardization is a must. We'll want, for
example, to share components across various vendor products: A C++ class must
work with any C++ compiler. This is being addressed by the ANSI committee,
which is making progress. The bad news is that the ANSI process takes time;
witness standard C.
But standards within a language definition aren't enough. We'll also want to
share code across languages; objects common to both Smalltalk and C++, for
example. This issue is also being addressed, in one case by the Object
Management Group, whose goal is to define a framework--a common object
model--to develop a heterogeneous applications environment across various
hardware and operating systems.
The optimum environment for pay-per-use may be in a team or "megaprogramming"
environment rather than the one-person/one-computer shop. Megaprogramming, a
concept described by Michael Floyd in DDJ January 1991, views programming in
terms of designing and composing software components on a grand scale.
Reusable class libraries designed for team programming settings are also
becoming available; ParcPlace's recently released Objectworks\C++ Version 2.4
lets programmers load and browse each other's C++ code without interrupting
each other.
If either megaprogramming or team programming is in our future, pay-per-use
software in one form or another may play an important part in it.





































October, 1991
LETTERS







Multithreaded Processes for 386BSD


Dear DDJ,
I am reading with great interest the series of articles about the 386BSD port.
After reading the 386BSD article in your June 1991 issue, I read Bill
Gallmeister's "Reconciling Unix, Ada, and Realtime Processing." Thinking about
the benefits of multithreaded processes, I formed a simple idea of how to
implement them in BSD-Unix. The base of my idea is something between fork and
vfork which I called sfork. It behaves like vfork for the code and data
segments, but unlike vfork it copies the stack segment and doesn't block the
parent. So, after a call to sfork, both processes (threads) share the same
code and data segments but have a different stack segment.
Now the problem of the integrity of the shared data segment arises. This could
be solved by implementing another kernel function, atom, that takes two
parameters: a pointer to a function and an argument for that function. atom
has the job of calling the function with its argument, with the guarantee that
no other thread can interrupt the function unless it is blocked in a call
to sigpause. With the aid of atom, all synchronization mechanisms, such as
semaphores, mutexes, monitors, and rendezvous, could be implemented. All
functions using common or static data (including some C-library functions such
as printf, malloc, and others) must be synchronized via atom. The data segment
must remain shared even through a call to sbrk. exit should not be called
because of open file structures; _exit must be used instead. As an alternative
to atom, Dijkstra's P and V primitives could be implemented in the kernel.
Listing One, below, should highlight my idea.
I know there may be many problems with my idea which I have not figured out
yet, but because I have no access to a BSD source, I am not able to implement
my idea and test it.
Finally, thanks DDJ, for a great journal.
Gunter Jung
Nuremberg, Germany
Bill and Lynne respond: Your idea does have merit. In fact, the article in the
September issue speaks to this area, and brings up vfork as an example. The
only problems we see with your suggestion are the higher cost of the P( ) and
V( ) primitives, and that signals would not be reliable (see Section 4.7 of the
4.3BSD book by Leffler et al. on the sigvec( ) mechanism). Faster P( ) and
V( ) can be built using "test and set" atomic operations. See Maurice Bach's
book on Unix for more information in this vein.
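To make the "test and set" remark concrete, here is a minimal user-mode sketch in C. It is not code from 386BSD, from the letter, or from either book: it assumes a C11 compiler, uses atomic_flag_test_and_set from <stdatomic.h> as a stand-in for a hardware test-and-set instruction, and the names tas_semaphore, tas_sem_setup, tas_sem_try_p, tas_sem_p, and tas_sem_v are invented for the illustration. P( ) busy-waits rather than sleeping, which is exactly the speed-for-robustness trade discussed in this reply.

```c
#include <assert.h>
#include <stdatomic.h>

/* Counting semaphore guarded by a test-and-set spin lock (illustrative). */
typedef struct {
    atomic_flag guard;  /* set while some thread is inside P or V */
    int count;          /* number of P operations that may still succeed */
} tas_semaphore;

void tas_sem_setup(tas_semaphore *s, int initial)
{
    assert(initial >= 0);
    atomic_flag_clear(&s->guard);
    s->count = initial;
}

/* Spin until the previous value of the flag was clear: we now own it. */
static void guard_acquire(tas_semaphore *s)
{
    while (atomic_flag_test_and_set(&s->guard)) {
        /* busy-wait; another thread is inside the critical section */
    }
}

static void guard_release(tas_semaphore *s)
{
    atomic_flag_clear(&s->guard);
}

/* Non-blocking P: returns 1 if the semaphore was taken, 0 otherwise. */
int tas_sem_try_p(tas_semaphore *s)
{
    int taken = 0;
    guard_acquire(s);
    if (s->count > 0) {
        s->count--;
        taken = 1;
    }
    guard_release(s);
    return taken;
}

/* P: retry until the count is positive. No kernel call, no sleeping. */
void tas_sem_p(tas_semaphore *s)
{
    while (!tas_sem_try_p(s)) {
        /* spin; a kernel-assisted version would block here instead */
    }
}

/* V: release one unit. */
void tas_sem_v(tas_semaphore *s)
{
    guard_acquire(s);
    s->count++;
    guard_release(s);
}
```

A semaphore set up with a count of 1 behaves as a mutex: wrap the critical section in tas_sem_p( ) and tas_sem_v( ). On a uniprocessor a spinning thread simply burns its time slice, so a production version would yield or fall back to the kernel after a few failed attempts.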
You might be interested in knowing that some groups have implemented something
similar to what you suggest, called the "shopping" fork. It allows you to have
variable weight processes ("I'd like a stack segment, a separate process
state, hold the data and text segments"). However, a problem faced by all
multithreaded/LWP implementations is to reimplement the library functions in a
reentrant form (no small task). Another bone of contention is how to
synchronize the threads: If we go through the kernel (yours does slightly,
using sigpause), we lose speed but gain robustness. On the other hand, if we
stay in user mode, we lose robustness and gain speed.
The standards people are trying to wrangle out all the hits here, but we think
that this area will be ripe for experimentation for years to come. Too many
experts in the field have differing opinions.
Now that the networking release, which incorporates most of 386BSD sources, is
out, you may have a chance to actually build your sfork( ) into the kernel and
experiment with it yourself. You can get access. The reason we did 386BSD was
so more people could experiment with ideas such as yours. It may not be
immediately obvious how to do so, since working in the kernel still requires
much training and experience, but you can try. We hope our series helps
clarify the internals of 386BSD, so that more may have a chance to gain such
abilities.
As to the parts of 386BSD that may eventually handle LWPs, we won't forget
you. Stay tuned to coming issues, as we can only type so fast!


About Entropy, or Let's Get Small


Dear DDJ,
A propos Kas Thomas's article on entropy (February 1991), Issue 12 of
Glottometrika (a German computational linguistics journal) contains the
description (in English) of an algorithm for computing character entropy to
any order. The publishers of Glottometrika distribute a fairly snazzy and, it
seems, bug-free PC implementation of this algorithm that lets you compute the
character or word (your choice) entropy of ASCII text files right up to
order-120, as well as generate and retrieve text, without resorting to
extended memory or swapping to disk.
If you want to know more, contact Prof. Dr. Reinhard Kohler (Universitat Trier,
FB II - Linguistische Datenverarbeitung, Postfach 38 25, D5500 Trier, Germany)
who handles software distribution or Dr. Rolf Hammerl, Editor, Glottometrika
(Ruhr-Universitat Bochum, Sprachwissenschaftliches Institut, Postfach 10 21
48, D4630 Bochum 1, Germany).
J.B.M. Guy
Clayton, Australia


The Power of Matrices


Dear DDJ,
In connection with Victor Duvanenko's interesting article (DDJ, June 1991), it
is worth mentioning that a matrix, M, of order N raised to a high integer
power, K, can be expressed as a matrix polynomial in M of no higher degree
than N - 1. This important result follows from the famous Hamilton-Cayley
Theorem, which states that a matrix must satisfy its own characteristic
equation.
By using this result, Duvanenko's illustrative calculation of Fibonacci
numbers can be made even more efficient, by about two to one, by taking
advantage of the structure and symmetry of all the matrices being squared and
all the product matrices being evaluated. Rather than general-purpose matrix
multiplications and subsequent moves of these matrices, more efficient
specialized calculations can be done and array elements overlaid in the
process.
The characteristic equation of any square matrix M may be obtained by
setting the determinant of the matrix expression M - xI equal to zero,
det(M - xI) = 0, where I is the identity matrix of order N and x is a scalar
variable; expanding the determinant gives a polynomial of degree N in x. The
Theorem substitutes M for x in this polynomial. M{N} may then be expressed,
by transposition, as a polynomial in M of degree (N - 1). The expression for
any higher power may be obtained by repeatedly multiplying both sides by M
and substituting for M{N} wherever it occurs.
A matrix, M, of order 2 is used by the author in an illustration of
calculating Fibonacci numbers of higher order, where M is given by Example 1.
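Example 1

\[
M = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} F_{N} \\ F_{N-1} \end{pmatrix}
= M \begin{pmatrix} F_{N-1} \\ F_{N-2} \end{pmatrix}
= M^{N-1} \begin{pmatrix} F_{1} \\ F_{0} \end{pmatrix}
\]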



This matrix to any higher integral power, K, can be expressed as a simple
first-order polynomial in M; for example, M{2} = M + I. It may be verified and
proved by mathematical induction that M{K} = F[K] * M + F[(K-1)] * I, where
F[0] = 0, F[1] = 1, and F[K] = F[(K-1)] + F[(K-2)].
This implies that the matrix M{K} can be expressed as the symmetric matrix in
Example 2, and that its square, shown in Example 3, is also a symmetric matrix
with terms given in preferred calculation sequence by
F[2K] = F[K] * (F[(K+1)] + F[(K-1)]), F[(2K-1)] = F[K]{2} + F[(K-1)]{2},
F[(2K+1)] = F[2K] + F[(2K-1)].
Example 2

\[
M^{K} = \begin{pmatrix} F_{K+1} & F_{K} \\ F_{K} & F_{K-1} \end{pmatrix}
\]

Example 3

\[
\left(M^{K}\right)^{2} = \begin{pmatrix} F_{2K+1} & F_{2K} \\ F_{2K} & F_{2K-1} \end{pmatrix}
\]

Thus, starting with the matrix M in row element order M11, M12, M21, and M22,
the matrices M{2}, M{4}, M{8}, M{16}, etc. can each be successively evaluated,
as needed, by overlaying these elements as follows: M21 := M21 * (M11 + M22),
M22 := M12 * M12 + M22 * M22, M11 := M21 + M22, M12 := M21, where ":=" means
"is replaced by."
The product matrix, P, is a selected product of these successively squared
matrices, but at every stage of development it, too, is a power of the matrix
M, and hence it, too, is symmetric and can be evaluated similarly.
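The squaring and product steps described above can be sketched in C. This is a minimal illustration built from the identities quoted in this letter, not code from the letter or from Duvanenko's article; the names fibmat, fib_square, fib_multiply, and fib are assumptions, and 64-bit unsigned arithmetic limits the result to F[93]. Only three elements are stored, since symmetry makes M21 equal to M12.

```c
#include <assert.h>
#include <stdint.h>

/* A power of M = (1 1; 1 0): m11 = F[K+1], m12 = m21 = F[K], m22 = F[K-1]. */
typedef struct {
    uint64_t m11, m12, m22;
} fibmat;

/* Square in place, using
   F[2K] = F[K] * (F[K+1] + F[K-1]) and F[2K-1] = F[K]^2 + F[K-1]^2. */
void fib_square(fibmat *m)
{
    uint64_t f2k   = m->m12 * (m->m11 + m->m22);
    uint64_t f2km1 = m->m12 * m->m12 + m->m22 * m->m22;
    m->m12 = f2k;
    m->m22 = f2km1;
    m->m11 = f2k + f2km1;            /* F[2K+1] = F[2K] + F[2K-1] */
}

/* p := p * s, where both are powers of M. The product is again a power
   of M, so its top-left element is the sum of the other two. */
void fib_multiply(fibmat *p, const fibmat *s)
{
    uint64_t p12 = p->m11 * s->m12 + p->m12 * s->m22;
    uint64_t p22 = p->m12 * s->m12 + p->m22 * s->m22;
    p->m12 = p12;
    p->m22 = p22;
    p->m11 = p12 + p22;
}

/* F[n] by binary powering of M; valid for n <= 93 with 64-bit integers. */
uint64_t fib(unsigned n)
{
    fibmat p = { 1, 0, 1 };          /* identity, i.e. M to the power 0 */
    fibmat s = { 1, 1, 0 };          /* M itself */
    assert(n <= 93);
    while (n != 0) {
        if (n & 1u)
            fib_multiply(&p, &s);    /* fold this power of two into P */
        n >>= 1;
        if (n != 0)
            fib_square(&s);          /* M, M^2, M^4, M^8, ... */
    }
    return p.m12;
}
```

Each squaring costs three multiplications and each product four, against eight for a general 2 x 2 matrix multiplication, which is where the roughly two-to-one saving comes from.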
It is also worth noting that although the development assumed that F[0] and
F[1] were equal to 0 and 1 respectively, the matrix equation in Example 4
holds for arbitrary values of F[0] and F[1], without changing M even though
this changes the values of the sequence.
Example 4

\[
\begin{pmatrix} F_{N} \\ F_{N-1} \end{pmatrix}
= M^{N-1} \begin{pmatrix} F_{1} \\ F_{0} \end{pmatrix}
\]

John T. Godfrey
East Jordan, Michigan



Patent Possibilities


Dear DDJ,
Use outside experts to check out patent applications/peer reviews, like
scientific journals? Let the experts handle it? Right. Here are a couple of
alternate suggestions. If you still fundamentally believe what your teachers
taught you -- that the world should be centrally controlled by experts --
don't bother to read on.
Keep patent applications secret for a while. (I understand that Japan does
not.) If the patent office gets two patents for the same thing, they cancel
each other out. And if someone shows prior art for a patent, they cancel each
other out. Remember, we citizens, as a communal body, are paying the inventor
by giving him a patent. How often do you pay for something that you get for
free or that you have already paid for?
Well, the patent office people and patent lawyers won't fall in love with that
suggestion. Its object is to cheaply and naturally simplify many patent cases.
But here's a real beauty that the bureaucrats will love: Put a tax on patents.
Yeah, that's right. They have value; so tax 'em. OK, boys, before you call me
a "liberal" and toss me out of the tavern, hear me out. We have property taxes
on "our" land. Of course, that it's "our" land is just a little joke between
us and the public (the government). But we do have more control over our land
than someone in, say, Rumania. With control comes responsibility. And it comes
in two forms: 1. We can get fined, jailed, sued, or shot if we do bad things
with our land; 2. We must prove that we are not wasting the land. We prove it
by coughing up property taxes every so often. If we waste the land, then it's
harder to pay.
So let's make a patent "owner" demonstrate that he's putting the patent to
good use -- not just wasting everybody's time. Make the application as cheap
and easy as is possible. The quarterly tax and paperwork will keep it real.
Would either of these things discourage patents? You bet. Would they slow down
the constitutional I.8.8, "progress of science and useful arts?" I don't think
so. I believe that we could all quit mining for get-rich-quick gold and get
back to building, selling, and using things instead.
B. Alex Robinson
Maple Valley, Washington


Pseudo Random Bridge -- It's All in the Cards


Dear DDJ,
The article by W.L. Maier on the r250 pseudo random number generator (May
1991) has proven to be extremely useful. I used it, after modifying the
initiator as described below, in a program for generating sets of bridge hands
for tournament play (see Example 5). Since the probability of all possible
hand patterns has been calculated from probability theory (Borel and Cheron,
The Mathematical Theory of Bridge), this application provides a good test of
r250's randomness. Sixteen sets of 36 deals (2304 hands, produced by 29376
r250( ) calls at 51 per deal) gave the results shown in Table 1, which
agree closely with the theoretical probabilities. In addition, r250( ) was
noticeably faster than Microsoft's rand( ). Dealing a set of 36 hands with
r250( ) took about three seconds on my 16-MHz 80386, compared with about five
seconds using rand( ).
Table 1

 Pattern   Hands   Observed %   Theoretical %

 4333       227       9.85         10.54
 4432       518      22.48         21.55
 4441        75       3.25          2.99
 5332       349      15.15         15.52
 5422       235      10.20         10.58
 5431       301      13.06         12.93
 ...        ...       ...           ...
 8311         7       0.31          0.12
 8320         4       0.17          0.11
 8410         1       0.04          0.05
 8500         0       0.00          0.003
 9999*        0       0.00          0.04

 Total     2304     100.01        100.013

* All patterns containing 9-, 10-, 11-, 12-, and 13-card suits.
As published, r250_init( ) required 500 calls to rand( ) in order to
initialize the r250_buffer array. I modified r250_init( ) to eliminate those
500 calls for two reasons: 1. to make r250( ) a free-standing pseudo random
number generator; and 2. to maintain r250( )'s speed advantage vs. linear
congruential generators when only a few random numbers are needed.
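The figure of 51 generator calls per deal matches a standard Fisher-Yates shuffle of 52 cards, and the hand patterns in Table 1 are the four suit lengths sorted in descending order. The following C sketch shows both steps under stated assumptions: it is not the letter writer's program, a small xorshift generator stands in for r250( ), the card encoding (suit * 13 + rank) and the names next_random, shuffle_deck, and hand_pattern are invented for the example, and the modulo reduction introduces a slight bias that a careful test of randomness would remove.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in pseudo random generator (xorshift32), NOT r250. */
static uint32_t rng_state = 2463534242u;   /* any nonzero seed */

uint32_t next_random(void)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return rng_state;
}

/* Fisher-Yates shuffle. Cards are numbered 0..51 (suit * 13 + rank).
   Returns the number of generator calls used, which is always 51. */
int shuffle_deck(int deck[52])
{
    int i, calls = 0;
    for (i = 0; i < 52; i++)
        deck[i] = i;
    for (i = 51; i > 0; i--) {
        int j = (int)(next_random() % (uint32_t)(i + 1));  /* 0..i */
        int t = deck[i];
        deck[i] = deck[j];
        deck[j] = t;
        calls++;
    }
    return calls;
}

/* Pattern of a 13-card hand as a four-digit number such as 4333 or 5431.
   Any hand with a suit of nine or more cards is reported as 9999,
   following the catch-all row of Table 1. */
int hand_pattern(const int hand[13])
{
    int len[4] = { 0, 0, 0, 0 };
    int i, j;
    for (i = 0; i < 13; i++) {
        assert(hand[i] >= 0 && hand[i] < 52);
        len[hand[i] / 13]++;                 /* count cards per suit */
    }
    for (i = 1; i < 4; i++) {                /* insertion sort, descending */
        for (j = i; j > 0 && len[j] > len[j - 1]; j--) {
            int t = len[j];
            len[j] = len[j - 1];
            len[j - 1] = t;
        }
    }
    if (len[0] >= 9)
        return 9999;
    return len[0] * 1000 + len[1] * 100 + len[2] * 10 + len[3];
}
```

After shuffle_deck( ), the four hands are deck[0..12], deck[13..25], deck[26..38], and deck[39..51]; tallying hand_pattern( ) over many deals gives observed frequencies that can be compared with the theoretical column of Table 1.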
Kenneth Lindsay
Laie, Hawaii


A Universal Cross Assembler


Dear DDJ,
I was pleased to read Ken Skier's article "Assembly Language Macros" in the
March 1991 issue.
I presented basically the same concept several years ago, except I did it by
making procedural instructions part of the assembly language instruction set.
In 1983 I introduced a universal cross assembler which through the use of
user-defined tables could generate the machine code for any microprocessor. At
the same time I defined a universal set of mnemonics and addressing mode
syntax which was applicable to any processor. The instruction set I defined
included procedural instructions similar to the macros Ken described.
With my universal assembler, an individual was able to change the instruction
set definition table to convert a single assembly language instruction into
any desired sequence of machine code instructions and/or data. I chose this
approach rather than macros to speed up the assembly process and eliminate the
need for having a macro definition file reassembled with every program each
time the program was assembled.
Also an individual could write relocatable subroutines which defined
procedural instructions. These external subroutine instructions had the same
format as normal assembly language instructions except instead of generating
machine code to perform the procedure, a call to the subroutine was generated,
followed by the data to be used by the procedure. The code for the subroutine
was appended to the end of the program's executable machine code as part of
the assembly process, without the need for an external linking loader.
In addition, my cross assembler allowed for words to be inserted within an
assembly language instruction to help clarify the meaning of each parameter
and thus make the source code more readable. There were also many other
special features I included to assist in learning and using assembly language.
My universal cross assembler was called MOPI. I tried from 1983 to 1988 to
distribute the cross assembler myself, under the name VOCS, without much
success. In 1986 I wrote a book titled Universal Assembly Language (Tab Books)
describing my proposed universal instruction set and assembler. Although still
available from Tab Books, the book never made the best seller list.
Unfortunately, at the time my assembler and book were introduced, C was
gaining popularity and nobody was interested in hearing about assembly
language. I couldn't get any support for the concept and eventually gave up
trying.
I always thought it ironical that while everyone complained how hard it was to
read and understand an assembly language program, they embraced C with open
arms. Yet C, with all of its special symbols, is equally hard, if not harder, to read
and understand. Just think, they never had a yearly contest in assembly
language to see "who can write the most unreadable code," as they do with C.
Although now I do a lot more programming in C than assembler, I still like
assembly language.
Glad to see that someone else thinks that assembly language, when used
properly, is as worthy of being used to write complete systems as any
so-called high-level language.
Robert M. Fitz
Plymouth, Minnesota





































October, 1991
A MINIMAL OBJECT-ORIENTED DEBUGGER FOR C++


Interesting things can happen at construction time




William M. Miller


William M. Miller is director of North American Operations for Glockenspiel
Ltd. and vice chair of X3J16, the ANSI Standard C++ Committee. He can be
reached via Internet at wmm@world.std.com and on CompuServe at 72105,1744.


When C++ first appeared outside AT&T Bell Laboratories in 1985, early adopters
of the language enjoyed its power and expressiveness, but suffered from a
dearth of specialized tools. The lack of a C++-specific debugger in particular
hampered these pioneering programmers.
Happily, those days are mostly past. In addition to other tools, most C++
compiler vendors now also provide quite serviceable debuggers. Nevertheless,
there remain some platforms for which only a C or assembly-oriented debugger
(or even no debugger at all, in some embedded development environments) is
available.
With a little foresight, however, C++ source-level debugging is possible even
in these tool-poor environments. That's where MOOD -- the Minimal
Object-Oriented Debugger -- comes in. MOOD, in keeping with its name, makes no
pretense of providing every imaginable debugging service. It does, however,
offer the ability to trace through program execution, set breakpoints, and
interactively display the values of objects, as well as provide a framework on
which more elaborate debugging facilities can be built, as required.


Theory of Operation


In essence, MOOD transfers control to a central routine each time an
"interesting event" occurs. Such interesting events include program start and
termination, function entry and exit, construction and destruction of objects,
and user breakpoints. Whenever the central routine is invoked, it examines its
internal state and the nature of the event to determine whether to initiate an
interactive dialogue with the user, or simply to return and allow the program
to continue its normal execution.
C++ makes it easy to gain control during these "interesting events" through
the semantics of constructor and destructor operation. Whenever an object is
created, its constructor is invoked. This guarantee by the language holds
whether the object is global static (and hence initialized at the very
beginning of the program execution), local static or automatic (and therefore
initialized when control flows through its declaration), or on the heap (and
thus initialized when it is explicitly allocated). The same is true of
destructors whenever an object goes out of existence. Furthermore, not only
are the constructor and destructor of the object itself invoked, but those of
all of its base classes as well.
Normally, the purpose of running an object's constructor is to ensure that the
object is internally consistent and to acquire any resources -- memory, files,
devices, and so on -- it needs; the destructor is responsible for releasing
those resources. However, it is perfectly reasonable for MOOD to piggyback on
top of these normal constructor and destructor operations and then transfer
control to the MOOD kernel at that time.
The application of constructor and destructor semantics to tracing object
creation and destruction is obvious. If the object being created or destroyed
is derived from a class whose constructor and destructor contain calls to the
MOOD kernel, MOOD will have the opportunity to display a message or converse
with the user at those times.
Using constructors and destructors to transfer control to MOOD at function
entry and exit is equally straightforward, but perhaps a bit less obvious. In
this case, a class must be defined whose constructor and destructor call the
MOOD kernel indicating function entry and exit, and an auto object of that
class must be declared as the first statement in each function whose execution
is to be traced. When that object is created, at the beginning of the
function's execution, MOOD will be informed of the function's invocation; when
it is destroyed, that is, when the function returns, the destructor will
notify MOOD that the function has terminated.
Extending these concepts to the start and finish of the entire program is
simply a matter of defining a global static object whose constructor and
destructor perform the necessary MOOD calls.
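The constructor and destructor mechanism described above can be sketched in a few lines. This is an illustration only, not MOOD's own code: the names event_log, scope_tracer, inner, and outer are assumptions made for the example, and events are appended to a vector rather than passed to a debugger kernel.

```cpp
#include <string>
#include <vector>

// Hypothetical event sink standing in for the MOOD kernel.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// The constructor reports entry; the destructor reports exit.
class scope_tracer {
public:
    explicit scope_tracer(const char* name) : _name(name) {
        event_log().push_back(std::string("Enter ") + _name);
    }
    ~scope_tracer() {
        event_log().push_back(std::string("Exit ") + _name);
    }
private:
    const char* _name;
};

void inner() {
    scope_tracer t("inner");
}

void outer() {
    scope_tracer t("outer");   // constructed first: "Enter outer"
    inner();                   // "Enter inner", then "Exit inner"
}                              // t is destroyed here: "Exit outer"
```

Calling outer() leaves four records in the log, nested in the same way the function calls are nested.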


Programming with MOOD


As can be inferred from the preceding section, MOOD is an intrusive debugger;
that is, it is necessary to make certain changes to the source code of the
program in order to debug it (the "foresight" mentioned at the start of this
article). However, these changes are quite minor in nature and extent, as the
following paragraphs demonstrate.
The first requirement placed on code to be debugged with MOOD is that it
include the header file MOOD.hxx (Listing One, page 110). This file contains
all the declarations required by the interface with MOOD.
Function entry and exit tracing is enabled by declaring an object of class
trace as the first statement of each traced function. The trace constructor
takes a character string argument specifying the name of the function; this
name is displayed whenever MOOD is running in verbose mode and can be used to
specify a breakpoint. For example, a function to be traced would be written as
follows:
 void some_func( ) {
     trace t("some_func");
     // ...
 }
Debugging data is a bit more involved but still not terribly burdensome. Three
things are required of each class whose objects are to be made known to MOOD:
First, it must declare class monitored as a virtual base class. Second, it
must pass a string, provided in the object declaration, to the monitored
constructor. This string is intended to be the name of the object being
constructed; like the function name in class trace, it will be displayed in
verbose mode at construction and destruction time, and it can be used for
breakpointing. The third requirement is to provide an override for the virtual
function display( ); it should print out the contents of the object on stderr
in whatever form is most appropriate. A sample class and object declaration
for MOOD is shown in Example 1.
Example 1: A sample class and object declaration for MOOD

 class some_class: public virtual monitored {
 public:
     some_class(const char* obj_name):
         monitored(obj_name) { }
     void display() {
         fprintf(stderr, "i=%d\nj=%d\n", i, j);
     }
 private:
     int i;
     int j;
 };

 some_class an_object("an_object");


There are two important reasons that derivation from class monitored should be
virtual instead of ordinary. First, in an elaborate inheritance hierarchy, the
most-derived class (that is, the class used in the declaration of the object)
may have several base classes, each of which is derived from class monitored.
If ordinary inheritance were used, a verbose mode trace would include
construction and destruction records for each of those classes when the object
was created and destroyed. Making monitored a virtual base class causes the
message to occur only once, because there is only a single instance of
monitored in the object.
Second, deriving virtually from class monitored means that the most-derived
class's constructor is both permitted and required to supply the argument to
the class monitored constructor. This is important because only the object
declaration itself "knows" the name of the object being declared, and virtual
inheritance makes it both easy to pass that information along and impossible
to forget to do so.
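Both reasons can be seen in a small diamond-shaped hierarchy. The sketch below is an assumption-laden illustration, not part of MOOD: base_v and base_o are hypothetical stand-ins for class monitored that merely count constructor calls.

```cpp
#include <string>

// Hypothetical stand-in for class monitored; counts constructor calls.
struct base_v {
    static int ctor_calls;
    std::string name;
    explicit base_v(const char* n) : name(n) { ++ctor_calls; }
};
int base_v::ctor_calls = 0;

// Virtual derivation: one shared base_v subobject per complete object.
struct left_v  : virtual base_v { left_v(const char* n)  : base_v(n) { } };
struct right_v : virtual base_v { right_v(const char* n) : base_v(n) { } };
struct both_v : left_v, right_v {
    // The most-derived class must initialize the virtual base itself;
    // the base_v initializers in left_v and right_v are ignored here.
    both_v(const char* n) : base_v(n), left_v("left"), right_v("right") { }
};

// Ordinary derivation, for contrast: two separate base_o subobjects.
struct base_o {
    static int ctor_calls;
    explicit base_o(const char*) { ++ctor_calls; }
};
int base_o::ctor_calls = 0;

struct left_o  : base_o { left_o(const char* n)  : base_o(n) { } };
struct right_o : base_o { right_o(const char* n) : base_o(n) { } };
struct both_o : left_o, right_o {
    both_o(const char* n) : left_o(n), right_o(n) { }
};
```

Constructing a both_v runs the base constructor once, with the name given in the declaration; constructing a both_o runs it twice. Because base_v has no default constructor, leaving base_v(n) out of the both_v constructor is a compile-time error, which is the "impossible to forget" property.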
The final feature of MOOD.hxx that can be used by a program is insertion of
conditional or unconditional breakpoints. Normally, MOOD allows user
interaction (a "breakpoint") at any time a message would be printed in verbose
tracing mode -- that is, at function entry and exit and object construction
and destruction. If additional breakpoints are required, for instance, at some
interesting point in an algorithm that does not correspond to one of the
traced events, the programmer can insert explicit calls to the cond_break( )
function.
A call to cond_break( ) with no arguments is an unconditional break; MOOD will
notify the user and enter an interactive dialogue, regardless of any other
conditions. To make a breakpoint conditional, use a call that specifies a name
(for example, cond_break(USER_BP, "my breakpoint")). Then, when the user tells
MOOD that execution should proceed to a particular named event, MOOD will
ignore conditional breakpoints whose names do not match the specified name.
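The matching rule for user breakpoints can be written as a single predicate. This is a sketch of the rule as described in the text, under the assumption that an empty string means "no name given to the g command"; the helper name user_break_stops is hypothetical and does not appear in MOOD.

```cpp
#include <cstring>

// Hypothetical helper: should a user breakpoint stop, given the name from
// the last "g" command ("" if none) and the breakpoint's own name (null if
// the call was cond_break() with no arguments)?
bool user_break_stops(const char* go_target, const char* bp_name) {
    if (go_target[0] == '\0') return true;   // plain "g": every user breakpoint stops
    if (bp_name == nullptr) return true;     // unnamed breakpoints are unconditional
    return std::strcmp(go_target, bp_name) == 0;  // named: stop only on a match
}
```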
Listing Two, page 110, shows a simple program that illustrates many of the
features of interfacing with MOOD. The foo::display( ) function simply prints
out the object name to demonstrate that the mapping from printed address to
actual object is correct.


Getting into the MOOD


One other feature of MOOD.hxx is worth discussing. The class init_ctl,
declared at the end of Listing One, along with its associated object
declaration, dbg_init, is vital for enabling the operation of MOOD. Because
dbg_init is global static, it is constructed before program execution begins
and destroyed after the program completes. The constructor of class init_ctl
contains the call to cond_break( ) for the start of the program, which allows
the user to gain interactive control immediately to turn on verbose mode, go
to a particular breakpoint, and so on.
There will be one copy of dbg_init in each compilation unit that includes
MOOD.hxx (it's a static variable), so a way must be found to avoid having
multiple invocations of cond_break() for program start. This once-only
limitation is implemented by means of the static member _count; the init_ctl
constructor increments _count and only calls cond_break( ) when the very first
instance of dbg_init is created.
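The once-only counter can be tried in isolation. In this sketch the class name once_guard and the counters start_calls and end_calls are assumptions made for the example; three local objects stand in for the copies of dbg_init that three compilation units would each contain.

```cpp
// Hypothetical stand-ins: start_calls and end_calls count what would be
// the PROG_START and PROG_END notifications.
static int start_calls = 0;
static int end_calls = 0;

class once_guard {
public:
    once_guard()  { if (_count++ == 0) ++start_calls; }  // first instance only
    ~once_guard() { if (--_count == 0) ++end_calls; }    // last instance only
private:
    static int _count;
};
int once_guard::_count = 0;

// Simulates three compilation units, each holding its own guard object.
void simulate_three_units() {
    once_guard unit_a;
    once_guard unit_b;
    once_guard unit_c;
}
```

The idiom relies on the static counter being zero-initialized before any constructor runs, which the language guarantees for objects of static storage duration.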
MOOD will also never be called if the preprocessor variable DEBUGGER_ON is not
defined; all the classes in MOOD.hxx are conditionally compiled to do nothing
in its absence. (User code calls to cond_break( ) should be similarly
protected.)


Using MOOD


MOOD's user interface is primitive but functional. The available commands are
described in Table 1, so the description here will be limited to an overview
of how to use MOOD.
Table 1: The MOOD Debugging Commands

 In its current form, MOOD recognizes the following commands:

 Command      Description
 -----------  ------------------------------------------------------
 s            Step to the next "interesting event" and return to
              interactive mode.

 g [<name>]   Go until the next event that has the specified name
              (which can be a function, object, or user breakpoint).
              If no name is specified, go until the next user
              breakpoint or until the end of the program.

 d <addr>     Display an object, selected by the address printed
              out at the time the object was constructed. This
              command calls the display() member function for the
              object whose "monitored" subobject is at the
              referenced address, which is displayed at
              construction time if in verbose mode.

 v            Turn on verbose mode. This sends messages to stderr
              upon function entry and exit, and upon object
              construction and destruction.

 q            Turn off verbose mode.

All user commands are implemented in MOOD.cxx (Listing Three, page 110). The
implementation is rudimentary, and is intended only to provide a foundation of
basic functionality. There are many ways in which this functionality can be
extended. Some suggestions for improvement are described at the end of this
article.
When a program compiled for MOOD first begins execution, the user is presented
a welcoming banner and a command prompt. At this point, one possible user
action is to type a "v" (for verbose) command, followed by a "g" (go) command
with no argument. The program will then run without pausing, and as it
executes, MOOD will print (on stderr, to avoid conflict with ordinary program
output on stdout) a descriptive message for each interesting event that
occurs.
Another possibility at the opening prompt is to first type "v" (or not,
depending on the volume of output desired), and then type "g" followed by a
name. The program will execute normally until an interesting event with the
specified name occurs, at which time the user will be presented an opportunity
for further interaction. MOOD makes no distinction among object names,
function names, and breakpoint names, so any of these can be used with the "g"
command.
A third possibility is to "single step" with the "s" command. Due to MOOD's
implementation, the step increment is limited to "interesting events," rather
than individual C++ statements.
The commands mentioned so far are all allowed at any MOOD breakpoint, not just
at the start of the program. You can step from one breakpoint to another using
the "g" command, then turn off verbose mode using the "q" (quiet) command, and
so on. Furthermore, at any breakpoint, any object that has been constructed
but not destroyed can be displayed by passing its address (the one printed by
MOOD at construction) to the "d" command. This command will invoke the
object's display( ) member function. (Note: It is important not to attempt to
display an object during the breakpoint resulting from that object's
construction! MOOD is entered from the monitored constructor, and the virtual
function table pointing to the derived class's display() member function has
not been set up at that point. Similarly, the breakpoint for an object's
destruction occurs after the object has been destroyed and its display( )
member function is no longer available.)
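The construction-time restriction follows from a general C++ rule: while a base class constructor runs, virtual calls resolve to the base class, not to the class being built. The sketch below shows the rule with a non-pure virtual function so that the behavior is well defined; MOOD's display() is pure virtual in class monitored, so calling it at that stage would be undefined rather than merely surprising. All names here are hypothetical.

```cpp
#include <string>
#include <vector>

std::vector<std::string>& seen() {
    static std::vector<std::string> v;
    return v;
}

struct base_part {
    // Runs before the derived_part portion of the object exists.
    base_part() { seen().push_back(describe()); }
    virtual ~base_part() { }
    virtual std::string describe() const { return "base_part"; }
};

struct derived_part : base_part {
    std::string describe() const override { return "derived_part"; }
};

// Returns what describe() reports once construction has finished.
std::string describe_after_construction() {
    derived_part d;
    const base_part& b = d;
    return b.describe();
}
```

The call made inside the base constructor records "base_part"; the same call made through a base reference after construction returns "derived_part".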
A sample debugging session, using the program in Listing Two, is shown in
Example 2. Explanatory comments (text following "//" in the listing) were
added after the log was made and are not part of the program's input or
output.
Example 2: A sample session with MOOD
This is a transcript of a MOOD debugging session. The program being debugged
is tdbg.cxx, shown in Listing Two. Any remarks preceded by a double-slash
(//) are comments which were added after the session was transcribed.

 tdbg // Start the demo program.
 MOOD:
 Minimal Object-Oriented Debugger version 0.0
 cmd> v // Verbose mode is on.

 cmd> g y // Go till we get to y.
 Enter main
 Construct *p @ 27A8
 Enter x // Entering function x().
 Construct xf @ 2746
 Enter y // Entering function y().
 cmd> d 27a8 // Ask to display an object.
 *p // Yep, that's the right one.
 cmd> d 2746 // Display another object.
 xf // Which it does.
 cmd> s // Single step.
 Construct yf @ 271C
 cmd> s // Single step again.
 Destruct yf @ 271C
 cmd> g z // Go till we get to z().
 Exit y
 Destruct xf @ 2746
 Exit x
 Enter z // Entering function z().
 cmd> g // Here we are. So just finish up.
 Construct zf @ 2746
 Destruct zf @ 2746
 Exit z
 Destruct *p @ 27A8
 Exit main
 End of execution



Where to Go from Here


The debugging facilities of MOOD are useful but quite rudimentary. They can be
improved in various ways. For example, you could, without much effort, allow
symbolic lookup of object names by maintaining a dynamic symbol table of live
objects. This would allow display of objects by name rather than by address.
Another suggestion is to allow the user to modify object values interactively
during a breakpoint. A third idea is to allow program execution to be
interrupted by a particular keystroke combination.
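The first suggestion, a symbol table of live objects, might look something like the following. This is one possible sketch, not an extension that ships with MOOD: the names live_objects, named_object, counter, and describe_by_name are hypothetical, a standard map replaces whatever table a 1991 compiler would have required, and object names are assumed to be unique at any one time.

```cpp
#include <map>
#include <string>

class named_object;

// Hypothetical symbol table of live objects, keyed by name.
std::map<std::string, named_object*>& live_objects() {
    static std::map<std::string, named_object*> table;
    return table;
}

class named_object {
public:
    explicit named_object(const char* name) : _name(name) {
        live_objects()[_name] = this;        // register at construction
    }
    virtual ~named_object() {
        live_objects().erase(_name);         // deregister at destruction
    }
    virtual std::string describe() const = 0;
private:
    std::string _name;
};

struct counter : named_object {
    int value;
    counter(const char* name, int v) : named_object(name), value(v) { }
    std::string describe() const override { return "value=" + std::to_string(value); }
};

// Looks an object up by name; returns "" if no live object has that name.
std::string describe_by_name(const std::string& name) {
    auto it = live_objects().find(name);
    return it == live_objects().end() ? std::string() : it->second->describe();
}
```

A "d" command could then accept a name and fall back to the address form when the lookup fails.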
The MOOD technique requires some cooperation from the program being debugged,
but the other side of that coin is that the capabilities of the approach are
limited only by the imagination of the programmer.


Acknowledgment


The techniques embodied in MOOD were described in "Debugging and
Instrumentation of C++ Programs" by Martin O'Riordan, then of Glockenspiel, in
the Proceedings of the 1988 USENIX C++ Conference.

_A MINIMAL OBJECT-ORIENTED DEBUGGER FOR C++_
by William M. Miller


[LISTING ONE]

// MOOD.hxx by William M. Miller, 8/3/91. MOOD user declarations.

/* This header file is included in every module which is to be metered
 * by MOOD. Its actions are controlled by the preprocessor variable
 * DEBUGGER_ON. If this is not defined, no debugger actions will occur.
 */
#ifndef _DEBUGGER_DEFS
#define _DEBUGGER_DEFS
 // Conditions under which cond_break may be called:
enum break_condition {
 PROG_START, PROG_END, FCN_ENTRY, FCN_EXIT, OBJ_CTOR, OBJ_DTOR,
 USER_BP
 };

/* The cond_break() function is called in all the above contexts for two
 * purposes: 1) to display trace information when verbose mode is on, and
 * 2) to break under the appropriate conditions to allow the user to
 * interact with the debugger. The default arguments are to allow a user
 * program to contain the call "cond_break();" with no arguments to perform
 * an unconditional breakpoint into the debugger's interactive mode.
 */
void cond_break(break_condition cond = USER_BP, const char* name = 0,
 void* addr = 0);

/* The trace class is intended to allow function tracing. Each function or
 * block which should be included in the trace should declare an object of
 * class trace at the very beginning. The result, in verbose mode, will be
 * to display the function/block name at entry and exit; this name can also
 * be used to set a breakpoint from the debugger's interactive mode.
 */
class trace {
public:
#ifdef DEBUGGER_ON
 trace( const char* fcn_name): _name(fcn_name)
 { cond_break( FCN_ENTRY, _name); }
 ~trace( ) { cond_break( FCN_EXIT, _name); }
private:
 const char* _name;
#else
 trace( const char* ) { }
#endif
 };

/* The monitored class is intended for use as a virtual base class of any
 * classes whose construction/destruction is to be traced in verbose mode
 * or whose values are to be displayed interactively. Derived classes must
 * pass the object or class name to the constructor and must supply an
 * override to the display() member function.
 */
class monitored {
public:
#ifdef DEBUGGER_ON
 monitored( const char* obj_name ): _name(obj_name)
 { cond_break(OBJ_CTOR, _name, this); }
 ~monitored( ) { cond_break(OBJ_DTOR, _name, this); }
 virtual void display() = 0;
private:
 monitored() { } // keep cfront 2.0 happy -- it requires a default
 // constructor in virtual base classes for no good reason.
 const char* _name;
#else
 monitored( const char* ) { }
 monitored( ) { }
#endif
 };

/* The following class and static object call cond_break exactly once at
 * program start, before any debuggable object is created, to allow the
 * user to set up tracing and breakpoints, and once at the end of execution
 * to allow for any needed cleanup.
 */

#ifdef DEBUGGER_ON
class init_ctl {
public:
 init_ctl( ) { if (_count++ == 0) cond_break(PROG_START); }
 ~init_ctl( ) { if (--_count == 0) cond_break(PROG_END); }
private:
 static int _count;
};

static init_ctl dbg_init;
#endif
#endif






[LISTING TWO]

// tdbg.cxx by William M. Miller, 8/3/91. This is a sample
// program to be debugged with MOOD. A transcript of the debugging
// session is shown in Example 2, accompanying this article.

extern "C" {
#include <stdio.h>
}
#include "MOOD.hxx"

struct foo: virtual monitored {
 foo( const char* nm ): monitored(nm), my_name(nm) { }
 void display( ) { fprintf(stderr, "%s\n", my_name); }
 const char* my_name;
 };
void x();
void y();
void z();

int main() {
 trace tt("main");
 foo* p = new foo("*p");
 x();
 z();
 delete p;
 return 0;
 }

void x() {
 trace tt("x");
 foo xf("xf");
 y();
 }

void y() {
 trace tt("y");
 foo yf("yf");
 }

void z() {
 trace tt("z");
 foo zf("zf");
 }






[LISTING THREE]

// MOOD.cxx, by William M. Miller, 8/3/91. MOOD kernel definitions.

/* This routine implements the user interface of MOOD, the Minimal Object
 * Oriented Debugger. Table 1 in the accompanying article describes the
 * currently available commands: s, g, d, v, and q.
 */
extern "C" {
#include <stdio.h>
#include <string.h>
}
#define DEBUGGER_ON 1
#include "MOOD.hxx"

/* The objp() function does a system-dependent conversion of an ASCII pointer
 * specification into a pointer to a monitored object.
 */
monitored* objp(const char* str);

/* The cond_break() function is called under the various circumstances
 * described by the enumeration break_condition. It prints a message, if
 * required, describing the reason for its call, and optionally enters
 * interactive mode to take commands.
 */
void cond_break(break_condition cond, const char* name, void* addr) {

 static int tracing = 0; // => verbose mode
 static char brk_name[128] = ""; // name on which to break
 static int was_step = 1; // last cmd was "step" => go to
 // interactive mode
 char buff[128]; // command line buffer

/* We enter the display and possible interactive mode code under the following
 * conditions:
 * 1) We are tracing (verbose mode).
 * 2) We are stepping. (Note: this is initially TRUE, which takes care of
 * getting into interactive mode on the PROG_START call.)
 * 3) The breakpoint name was set and the current name matches it, or this
 * is a user breakpoint call with no name
 * 4) The breakpoint name was not set and this is a user breakpoint call.
 */
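 // name may be null (PROG_START, PROG_END, unnamed user breakpoints),
 // so it is tested before being passed to strcmp().
 if (tracing || was_step ||
     (brk_name[0] && ((name && strcmp(brk_name, name) == 0) ||
                      (cond == USER_BP && !name))) ||
     (!brk_name[0] && cond == USER_BP)) {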

 switch(cond) { // Print an appropriate message:
 case PROG_START:
     fprintf(stderr,"MOOD: Minimal Object Oriented Debugger, V. 0.0\n");
     break;
 case PROG_END: fprintf(stderr, "End of execution.\n"); break;
 case FCN_ENTRY: fprintf(stderr, "Enter %s\n", name); break;
 case FCN_EXIT: fprintf(stderr, "Exit %s\n", name); break;
 case OBJ_CTOR: fprintf(stderr, "Construct %s @ %p\n", name, addr);
     break;
 case OBJ_DTOR: fprintf(stderr, "Destruct %s @ %p\n", name, addr);
     break;
 // An unnamed user breakpoint passes a null name; don't hand that to %s.
 case USER_BP: fprintf(stderr, "Breakpoint %s (%p)\n",
                       name ? name : "(unnamed)", addr);
     break;
 } // switch

/* We enter interactive mode if any of the above conditions other than
 * tracing is met. (This implies that named user breakpoints are skipped
 * if the user uses a g <name> command in which the name does not match
 * the breakpoint name, but that unnamed user breakpoints are always
 * effective, as are named user breakpoints after a g command with no name.)
 */
 if (was_step ||
     (brk_name[0] && ((name && strcmp(brk_name, name) == 0) ||
                      (cond == USER_BP && !name))) ||
     (!brk_name[0] && cond == USER_BP)) {

// Reset breakpoint conditions
 was_step = 0;
 brk_name[0] = 0;

// Main command loop
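 do {
     fprintf(stderr, "cmd> ");
     // gets() cannot bound its read and is absent from current C and C++;
     // fgets() is the bounded equivalent. End of input acts like "g".
     if (!fgets(buff, sizeof buff, stdin)) strcpy(buff, "g");
     buff[strcspn(buff, "\n")] = 0; // drop the newline that fgets() keeps
     switch(buff[0]) {

     case 'd': if (buff[1]) objp(buff + 2)->display(); break;
     case 'g': if (buff[1]) strcpy(brk_name, buff + 2); break;
     case 'q': tracing = 0; break;
     case 's': was_step = 1; break;
     case 'v': tracing = 1; break;
     } // switch

 } while(buff[0] != 's' && buff[0] != 'g');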
 } // if (interactive)
 } // if (message)
} // cond_break()

monitored* objp(const char* str) {
 void* p = 0; // "%p" stores through a void**; p stays null if the scan fails
 sscanf(str, "%p", &p);
 return (monitored*) p;
 }

int init_ctl::_count;




October, 1991
PROPOSING A C++ STRING CLASS STANDARD


Here's your chance to influence a class library standard




Steve Teale


Steve is a member of the Zortech development team specializing in class
libraries (C++ Tools, the Database, and the IOStreams implementation). He also
teaches C++ language training courses in the UK and can be reached at
201-691-8203.


As C++ progresses through the ANSI standardization process, the standard is
likely to define library elements (much as the C standard does) that come as
component parts of any conforming C++ implementation. The library elements
specified will differ from the C standard in that they come in the form of
class libraries. Some libraries, such as iostreams, are fairly well defined by
existing implementations. Others are less well defined.
This article describes one such library: the String class. The String class is
a good illustration of what C++ library components might be like. But more
importantly, it could provoke feedback from potential users of standard C++ as
to what facilities should be provided in a string package. In the case of
String, such input can still influence the standard. In fact, if you'd like to
comment on the package presented here, contact me at the number listed in my
biography above; I will collate the responses and forward them to the ANSI
committee.


A Class Specification


The backbone of the specification for a C++ class library consists of header
files that describe the public interface of the classes, that is, the
properties of the new data types that the classes describe.
The draft ANSI specification for the String class is similar to that in
Listing One (page 114). I say similar because the current ANSI draft does not
include the member functions shown in Figure 1(a). Nevertheless, they are
included in Listing One. This is where the feedback begins! Two assignments
have been added to match the argument types of the constructors. String
manipulation has always been a strong point of Basic, and this area has been
addressed by including the insert and remove functions. The NOT operator has
been added to provide a succinct test for empty Strings. If the operator
char*() function is made to return a null pointer in the case of an empty
string, then both notations shown in Figure 1(b) can be used. The operator[]( )
function is discussed later.
Figure 1: (a) Functions added to the ANSI draft; (b) notations to test for an
empty String; (c) explicit friend functions; (d) avoiding constructor calls in
Boolean operations.

 (a)

 String &operator=(char);
 String &operator+=(char);

 String &insert (int, const String&);
 String &insert (int, const char *);
 String &insert (int, char);
 String &remove (int, int);

 int operator!( );
 char operator [] (int);

 (b)

 String a;

 if (!a) ...; // if a is empty string
 if (a) ...; // if a is not empty - uses operator char *()

 (c)

 friend int operator==(const String&, const String&);
 friend int operator==(const String&, const char *);
 friend int operator==(const char *, const String&);

 (d)

 String a;
 ...
 if (a == "whatever")
 ...;


 will cause a sequence something like:

 String temp = "whatever"; // constructor from const char *
 if (operator==(a, temp))
 ...;
 // temp destructor call

Finally, many C programmers are comfortable with the ANSI C string functions,
so it seems courteous to include analogues of these in a new string package
for a related language. Not doing so is much like throwing the baby out with
the bathwater -- it wastes existing expertise.
The use of explicit friend functions for the Boolean tests and concatenation
operations -- see Figure 1(c) -- is in line with the ANSI draft. It is
possible, however, to omit the last two functions shown in Figure 1(c) because
the constructor String::String(const char *) can automatically convert these
argument patterns; see Figure 1(d). On the other hand, the explicit versions
are more efficient because they avoid the overhead of constructor calls.
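The cost difference can be made visible by counting conversions. The two miniature types below are hypothetical and exist only for this sketch; they are not the String class of Listing One. The first offers only the String-to-String comparison, and the second adds the explicit overload for a char* right-hand side.

```cpp
#include <cstring>
#include <string>

// Hypothetical miniature string type that counts conversions from char*.
struct mini_str {
    static int conversions;
    std::string text;
    mini_str(const char* s) : text(s) { ++conversions; }  // converting constructor
};
int mini_str::conversions = 0;

bool operator==(const mini_str& a, const mini_str& b) { return a.text == b.text; }

// A second type that adds the explicit mixed-argument overload.
struct mini_str2 {
    static int conversions;
    std::string text;
    mini_str2(const char* s) : text(s) { ++conversions; }
};
int mini_str2::conversions = 0;

bool operator==(const mini_str2& a, const mini_str2& b) { return a.text == b.text; }
bool operator==(const mini_str2& a, const char* b) {
    return std::strcmp(a.text.c_str(), b) == 0;   // no temporary object needed
}
```

With mini_str, the expression a == "whatever" constructs one temporary; with mini_str2 it constructs none, because the exact-match overload is chosen.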
A public class interface specification says a lot about a new type, but not
everything that a programmer might need to know. Concerning implementation,
the ANSI draft says that no terminator character is reserved for String. It
does not, however, state whether String objects with the same value (that is,
the same number and sequence of characters) may actually occupy the same
storage. It would be more memory efficient, and in some cases faster, if such
common representations were shared. This has been the practice in many
implementations of string classes, and one which will be followed here.
Taking such considerations into account, the actual String class header file
expands beyond the public interface to include something like that shown in
Listing Two (page 114). The extra class srep (string representation) is
entirely private except for its friendship with the String class. It provides
for the shared element of String objects having the same value. The private
constructor effectively prohibits derivation, so the dirty trick of the
nominal-sized body can be used and the new operator overloaded; thus srep
objects of exactly the right size are always allocated. Each srep object keeps
a count of the number of String objects currently sharing it and notes the
length of its contents, thereby substantially speeding up many string
operations.
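The sharing scheme can be illustrated with a much simplified representation class. This sketch is not the srep of Listing Two: it omits the nominal-sized body and the overloaded new operator, stores the characters in a standard string, and uses the hypothetical names shared_rep and shared_text. It shows only the reference count and the cached length.

```cpp
#include <cstddef>
#include <string>

// Hypothetical shared representation: the text, its length, and a count
// of the handles that currently refer to it.
struct shared_rep {
    std::string text;
    std::size_t length;
    int refs;
    explicit shared_rep(const char* s) : text(s), length(text.size()), refs(1) { }
};

class shared_text {
public:
    shared_text(const char* s) : _rep(new shared_rep(s)) { }
    shared_text(const shared_text& other) : _rep(other._rep) { ++_rep->refs; }
    shared_text& operator=(const shared_text& other) {
        ++other._rep->refs;              // increment first: safe for self-assignment
        release();
        _rep = other._rep;
        return *this;
    }
    ~shared_text() { release(); }

    std::size_t length() const { return _rep->length; }  // no scan needed
    int sharers() const { return _rep->refs; }
    bool same_storage(const shared_text& other) const { return _rep == other._rep; }
private:
    void release() { if (--_rep->refs == 0) delete _rep; }
    shared_rep* _rep;
};
```

Copying or assigning a shared_text touches only a pointer and a counter; the characters themselves are never duplicated.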
The private part of the String class is just a pointer to an srep object. The
private function body provides access to the actual storage, and the static
character variable (only one of these for all instances of String in an
application) provides a safe address for uses of the array access operator
which happen to be out of bounds.
Of course, the stated public interface doesn't have to be implemented this
way. It may not even be necessary to state how it is done. For the most part
though, programmers are a suspicious breed; if an implementation is not
clearly explained, they tend to go away and do it again for themselves. We
don't want this to happen. It breaks the first law of object-oriented
programming, which says, "never reinvent the wheel."


Usage


Now that we have a specification for the class, our outline "documentation"
should come up with some examples of how the public interface is intended to
be used. Taking things in the order of the specification, we can start with
the constructors illustrated in Figure 2(a). The constructor from a regular
C-style string has a defaulted second argument. This enables the use of the
function call-style initialization with two arguments, as shown in Figure
2(b). Note that one of the constructors is also called if a function, foo,
that takes a String object as its argument is called. Consider, for example,
the statement foo("argument"). Argument passing is initialization, so a
constructor must be called to initialize the argument object. In this case, it
is the one which converts from a regular C-style string. A constructor is also
called if foo is called with an actual String argument d, as in foo(d). The
argument is passed by value, so the copy constructor gets called to set up the
auto variable which corresponds to the argument within the scope of foo.
Because of this (although a String object is small, in fact just a pointer to
an srep object), functions like foo are often written to take a const
reference argument, foo (const String &s). Then the call foo(d) doesn't
involve a constructor call, and the use of const prevents code in the body of
foo from messing around with the value of d, which is the usual objection to
reference arguments. But that's enough digression.
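The difference between the two argument-passing styles is easy to measure. In this sketch the type counted and the four functions are hypothetical; the type does nothing except count calls to its copy constructor.

```cpp
// Hypothetical type that counts copy-constructor calls.
struct counted {
    static int copies;
    counted() { }
    counted(const counted&) { ++copies; }
};
int counted::copies = 0;

void by_value(counted) { }             // argument is copy-initialized
void by_const_ref(const counted&) { }  // argument is bound, not copied

int copies_made_by_value() {
    counted d;
    counted::copies = 0;
    by_value(d);
    return counted::copies;
}

int copies_made_by_const_ref() {
    counted d;
    counted::copies = 0;
    by_const_ref(d);
    return counted::copies;
}
```

The first function reports one copy and the second reports none.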
Figure 2: (a) Initialization of String objects; (b) using the function call
style of initialization; (c) possible assignment operations for the String
class.

 (a)

 String t; // an empty string
 String a = "good girl"; // initialized from a regular C style string
 String b = a; // the copy constructor
 String c = '.';

This is the idiomatic usage. It would have just the same effect to write:

 String a ("good girl");
 String b(a);
 String c('.');

 (b)

 String d("good girl", 4);
 // same effect as String d = "good";

 (c)

 String v = "abcd";
 v += 1; // result "bcde" ?

 String w = "provided";
 w -= "vide"; // result "prod" ?
 w -= 'r'; // result "pod" ?

 String x = "1234567890";
 x <<= 1; // result "2345678901"

The constructors also get called, of course, if a String object is created
dynamically. The destructor gets called automatically when any auto or static
instances of String go out of scope, or when we explicitly delete dynamically
allocated ones.


Initialization and Assignment


Classes like String need to make separate and explicit provisions for
initialization and for assignment. Assignment operators affect the value of an
existing object. It seems logical that the set of possible assignments should
match the set of possible initializations.

Both C and C++ have many assignment operators. Apart from straight assignment,
only one of them is supported for this String class: the += operation, which
appends a string (with a small s; that is, either a String, a string, or a
character) to an existing String. Others are conceivable, but not necessarily
as intuitive. Figure 2(c) shows additional possible assignment operators. The
most useful of these would probably be -=.


Modifiers


The next group of member functions are "other modifiers." These act on the
String object for which they are called. It is fairly obvious that these, and
the assignment operations discussed previously, must have the effect of
severing the object from any representation to which it is attached in common
with other String objects.
The modifiers provide for insertion of strings into existing Strings and
removal of specified portions of Strings, including truncation.
A class such as String usually needs to make some provision for automatic
conversion. At present, the ANSI draft provides the two conversions shown in
Figure 3(a). Both of these are problematic. The conversion to char causes
problems when the String is empty, as shown in Figure 3(b). The function calls
in this example are all right because the compiler can apply the user-defined
conversion from String to char to coerce the actual arguments to the show
function to match the formal argument. However, String has no reserved
delimiter character, so it is not possible to return a character from the
conversion from an empty String which distinguishes it from a string of nulls.
Figure 3: (a) Automatic conversions currently provided for in the ANSI draft;
(b) the conversion to char causes problems when the String is empty.

 (a)

 operator char(); // conversion to a char
 operator char *(); // conversion to a char *

 (b)

 void show(char c)
 {
 cout << c << ' ' << int (c) << endl;
 }
 char array[4] = { '\0', '\0', '\0', '\0', };

 String nulls(array,4);
 String empty;

 show(nulls);
 show(empty);

Also, a conversion to char is painfully close to a conversion to int. Any
function which took an integer argument would be satisfied by the erroneous
use of a String variable as its argument. The value passed would correspond to
its first character. Such bugs could be difficult to find!
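
Both hazards can be reproduced with a few lines. The sketch below is only an
illustration: MiniString and twice are hypothetical names, std::string stands
in for the real representation, and returning '\0' for an empty string is an
assumption about what the conversion would have to do.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the proposed class, reduced to the one
// conversion under discussion. Not the draft's implementation.
class MiniString {
public:
    MiniString() {}
    MiniString(const char *p, int n)
        : text(p, static_cast<std::string::size_type>(n)) {}
    // Conversion to char: the first character, or '\0' when empty
    // (an assumption; there is no better value to return).
    operator char() const { return text.empty() ? '\0' : text[0]; }
    int length() const { return static_cast<int>(text.length()); }
private:
    std::string text;
};

// A function that was only ever meant to take integers.
int twice(int n) { return 2 * n; }

void demo_char_conversion()
{
    const char zeros[4] = { '\0', '\0', '\0', '\0' };
    MiniString nulls(zeros, 4);
    MiniString empty;
    // Different lengths, yet the char conversion cannot tell them apart.
    assert(nulls.length() == 4 && empty.length() == 0);
    assert(char(nulls) == char(empty));
    // Compiles without complaint: MiniString -> char -> int.
    MiniString a("A", 1);
    assert(twice(a) == 2 * 'A');
}
```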
operator char *( ) is problematic because the obvious thing to do is to return
a pointer to the sequence of characters associated with the String.
Unfortunately, this is not a pointer to a regular C-style string. C++ Strings
don't have a reserved terminator character. The issue could be fudged by
always storing one more character in the String than necessary (a terminating
0) but this would still not get around the bugs caused by strings which
genuinely had embedded null characters. Note that this has bearing on the
provision of functions for Strings which parallel the ANSI C functions for
strings [strlen(const char *), and strlen(const String&)].
It might be preferable to take a step back and have the String class provide
only one conversion: one to void *. This would allow users to get a char
pointer to the first character of the sequence by explicitly recognizing (with
a cast) what they were doing. It is also a fairly standard C++ subterfuge to
allow an if statement to test an object.
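
A minimal sketch of the single-conversion idea follows. TestableString is a
hypothetical name, std::string stands in for the representation, and the
conversion is written as const void * (rather than plain void *) purely as a
const-correctness assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical class offering only one conversion, to const void *.
class TestableString {
public:
    TestableString() {}
    TestableString(const char *p) : text(p) {}
    // Null when empty, otherwise the address of the first character.
    operator const void *() const { return text.empty() ? 0 : text.data(); }
private:
    std::string text;
};

void demo_void_conversion()
{
    TestableString empty;
    TestableString full("Cat");

    bool tested = false;
    if (full)                   // the pointer is tested against null
        tested = true;
    assert(tested);
    assert(!empty);

    // Getting at the characters requires an explicit, visible cast.
    const char *first =
        static_cast<const char *>(static_cast<const void *>(full));
    assert(*first == 'C');
}
```

Because a pointer to void does not convert implicitly to char or int, passing
such an object to a function expecting an integer is rejected at compile time
instead of quietly passing the first character.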


Mutators


The next group of members are mutators: functions which act rather like copy
constructors in that they make new, but in these cases, altered String objects
from existing ones.
The mutators provided by the ANSI draft, including those provided by friend
functions as opposed to members, are shown in Figure 4(a). The first two are
straightforward: Given that String a = "Cat", and that b = a.upper(), b gets
"CAT" and a is unchanged. Also, if c = b.lower(), then c gets "cat" and b is
unchanged. The third mutator in Figure 4(a) extracts a substring and uses it
to form the new String. The first argument indicates the substring offset, and
the second the substring length.
Figure 4: (a) The mutators provided by the ANSI draft; (b) providing access to
individual characters of a String.

 (a)

 String String::upper() const;
 String String::lower() const;
 String String::operator() (int start, int length) const;
 String operator+(const String &a, const String &b);
 String operator+(const String &a, const char *b);
 String operator+(const char *a, const String &b);

 (b)

 String a = "Cat";
 a(0) = 'R'; // a gets "Rat"
 // a.operator()(0) = 'R';

The selection of constructors for class String suggests that the addition
operator, which forms a new String from two other objects, should also be
overloaded to deal with single characters. Then, writing b = '(' + v + ')'
where v is a String would not cause implicit constructor calls to convert the
characters to Strings before addition. This already does not happen with
b = "(" + v + ")". Similar considerations apply to the Boolean operations as
well.



Accessing Individual Chars


The next vexed design decision is how to give access to the individual
characters of a String. The ANSI draft provides char &operator( )(int). This
allows for expressions such as that shown in Figure 4(b). The draft also
provides operator char *( ) for strings, so it is possible to write a[0] = 'R'
as well.
This actually seems a more "natural" notation for a String, which is an
array-like sequence of characters. If the conversion to char pointer were
dropped, it might be desirable to provide operator[](int) to serve as an
lvalue. operator( )(int) could be retained to provide a faster unchecked
report of the value of the ith character. This is the approach adopted in the
example implementation.
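
The division of labor between the two operators might look like the sketch
below. It is an illustration only: IndexedString is a hypothetical name,
std::string stands in for the representation, and throwing an exception on a
bad index is an assumption of the sketch, not necessarily what the example
implementation does.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

class IndexedString {
public:
    IndexedString(const char *p) : text(p) {}

    // Checked access that can serve as an lvalue: a[0] = 'R';
    char &operator[](int i)
    {
        if (i < 0 || i >= static_cast<int>(text.length()))
            throw std::out_of_range("IndexedString: index out of range");
        return text[static_cast<std::string::size_type>(i)];
    }

    // Unchecked report of the value of the i'th character.
    char operator()(int i) const
    {
        return text[static_cast<std::string::size_type>(i)];
    }

private:
    std::string text;
};

void demo_indexing()
{
    IndexedString a("Cat");
    a[0] = 'R';                 // a now holds "Rat"
    assert(a(0) == 'R');

    bool rejected = false;
    try {
        a[3] = '!';             // one past the end: caught by the check
    } catch (const std::out_of_range &) {
        rejected = true;
    }
    assert(rejected);
}
```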
Anyone who started off thinking that the String class was a trivial design
exercise should be having second thoughts! "Simple" low-level classes like
String are always a design minefield.


Searching


The next bunch of functions -- shown in Figure 5(a) -- does search operations
on Strings. The match functions return the index of the first nonmatching
character, or -1 if the match is exact. The index functions return the index
of the starting character of a matching substring or character, or -1 if there
is no match. The index functions may optionally take an offset argument to
determine where the search should start. The remaining member functions test
for an empty String and report the length of a String.
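
The following sketch shows one way to read these rules, written as free
functions over std::string. The names match_of and index_of are hypothetical,
positions are assumed to be zero-based, and an "exact" match is assumed to
mean that every character of the argument matches the start of the String.

```cpp
#include <cassert>
#include <string>

// Index of the first character at which `s` fails to match `pattern`,
// or -1 if all of `pattern` matches the start of `s`.
int match_of(const std::string &s, const std::string &pattern)
{
    for (std::string::size_type i = 0; i < pattern.length(); ++i) {
        if (i >= s.length() || s[i] != pattern[i])
            return static_cast<int>(i);
    }
    return -1;
}

// Index of the first occurrence of `pattern` at or after `start`,
// or -1 if there is none.
int index_of(const std::string &s, const std::string &pattern, int start = 0)
{
    if (start < 0)
        return -1;
    std::string::size_type pos =
        s.find(pattern, static_cast<std::string::size_type>(start));
    return pos == std::string::npos ? -1 : static_cast<int>(pos);
}

void demo_search()
{
    const std::string a = "The quick brown fox jumped over the lazy dogs back";
    assert(match_of(a, "The quick") == -1);
    assert(index_of(a, "lazy") == 36);
    assert(index_of(a, "The", 1) == -1);
    assert(index_of(a, "b", 11) == 46);
}
```

Under this zero-based reading the sketch reports 6 for a comparison against
"The queen", so an implementation that reports 7 for the same case is counting
the position differently.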
Figure 5: (a) Searching strings; (b) Boolean tests on Strings.

 (a)

 String a = "The quick brown fox jumped over the lazy dogs back";
 String b = "The queen";
 String c = "lazy";

 a.match(b); // returns 7
 a.match("The quick"); // returns -1

 a.index(b); // returns -1
 a.index(c); // returns 36
 a.index("The", 1); // returns -1
 a.index("The"); // returns 0
 a.index("the"); // returns 32
 a.index('b'); // returns 10
 a.index('b',11); // returns 46

 (b)

 // C style
 char *s1 = "Joe", *s2 = "Fred";
 if (s1 > s2) ...; // legal, but not usually what was intended!
 if (!strcmp(s1,s2)) ...; // proper test for equality

 // C++ style

 String s1 = "Joe", s2 = "Fred";
 if (s1 > s2) ...; // evaluates to true
 if (s1 == s2) ...; // proper test for equality



Boolean Tests


All that remain are the Boolean friend functions and the iostreams I/O
operations. The Boolean operators are quite intuitive, and in fact allow the
sort of operations on Strings which beginners in C often attempt on regular
C-style strings, as shown in Figure 5(b). As noted, the triplets of Boolean
functions could arguably be expanded to include int operator == (const
String&, char), and so on.


Conclusion


An implementation of the String class library is available along with an
accompanying test suite. Space limitations prohibit a full presentation, but
the library is available electronically (see "Availability" on page 3) or
directly from me.
Finally, I encourage you to make comments on the String class library. I will
collect the responses I receive and forward them to the ANSI committee, as
promised.


_PROPOSING A C++ STRING CLASS_
by Steve Teale


[LISTING ONE]


class String {
public:
// String class constants
 enum string_enum { all = INT_MAX };

// Constructors / destructor
 String(); // make an empty String
 String(const String&); // make a String by copying another
 String(const char *, int count = 0);
 // make a String from a regular C style string
 String(char); // make a String from a single char

 ~String(); // Clean up after a String goes out of scope

// Assignments
 String &operator=(const String&); // Assign from another String
 String &operator=(const char *); // Assign from a C style string
 String &operator=(char); // Assign from a single char
 String &operator+=(const String&);
 // Concatenate a String to an existing String
 String &operator+=(const char *);
 // Concatenate a C style string to a String
 String &operator+=(char); // Concatenate a char

// Other modifiers
 String &insert(int pos, const String&); // Insert into an existing string
 String &insert(int pos, const char *);
 String &insert(int pos, char);
 String &remove(int pos, int count); // Remove from an existing string
 String &truncate(int); // Truncate a String

// Conversions
 operator char*() const;
 // Translate a String to a pointer to its first character
 operator char() const; // Translate a String to its initial character

// Mutators (make an existing String into a variant)
 String upper() const; // Make a new String by conversion to uppercase
 String lower() const;
 String operator()(int start, int len);
 // Make a new String starting from start, of len characters

// Access individual characters
 char &operator[](int); // Access individual characters as if the String
 // were an array
 char operator()(int); // Get the character value of the i'th character.

// Searches
 int match(const String&) const;
 // Return position of first non-matching character
 int match(const char *) const;
 int index(const String&, int pos = 0) const;

 int index(const char *, int = 0) const;
 // Find the position of a substring or character
 int index(char, int = 0) const;

// Statistics
 int operator!() const; // Test for an empty string
 int length() const; // Get the length of the String

// Some of the properties of a String are provided by functions which
// are friends of the String class

friend int strlen(const String&); // Parallel the ANSI C string functions
// etc

// Concatenations to make a new String
friend String operator+(const String&, const String&);
friend String operator+(const String&, const char *);
friend String operator+(const char *, const String&);

// Boolean tests
friend int operator==(const String&, const String&);
friend int operator==(const String&, const char *);
friend int operator==(const char *, const String&);
friend int operator!=(const String&, const String&);
friend int operator!=(const String&, const char *);
friend int operator!=(const char *, const String&);
friend int operator>(const String&, const String&);
friend int operator>(const String&, const char *);
friend int operator>(const char *, const String&);
friend int operator<(const String&, const String&);
friend int operator<(const String&, const char *);
friend int operator<(const char *, const String&);
friend int operator>=(const String&, const String&);
friend int operator>=(const String&, const char *);
friend int operator>=(const char *, const String&);
friend int operator<=(const String&, const String&);
friend int operator<=(const String&, const char *);
friend int operator<=(const char *, const String&);

// Input/ output operations
friend istream &operator>>(istream&, String&);
friend ostream &operator<<(ostream&, const String&);
};





[LISTING TWO]


#ifndef __STRING_HPP
#define __STRING_HPP
#include <limits.h>
#include <stddef.h> // for size_t

class istream; // forward declarations to avoid including all of iostream.hpp
class ostream;

class srep { // actual string representation

friend class String;
private:
 srep(int, const char * = 0);
 void *operator new(size_t cs, size_t ss = 0);

 int refs;
 int length;
 char body[1];
};

class String {
public:
 // as above
private:
 char *body() const { return rp? rp->body: 0; }

 srep *rp;
 static char dummy;
};

inline int strlen(const String &s)
{
 return s.length();
}

// and other friend functions which can conveniently be defined inline

#endif


Figure 1

(a)

 String &operator=(char);
 String &operator+=(char);

 String &insert(int, const String&);
 String &insert(int, const char *);
 String &insert(int, char);
 String &remove(int, int);

 int operator!();


(b)

String a;

if (!a) ...; // if a is empty string
if (a) ...; // if a is not empty - uses operator char *()


(c)

friend int operator==(const String&, const String&);
friend int operator==(const String&, const char *);
friend int operator==(const char *, const String&);



Figure 2

(a)

 String a;
 ...
 if (a == "whatever")
 ...;

will cause a sequence something like:

 String temp = "whatever"; // constructor from const char *
 if (operator==(a, temp))
 ...;
 // temp destructor call


(b)

// two members
 int operator==(const String&);
 int operator==(const char *);

// and a friend
friend int operator==(const char *, const String&);


Figure 3


(a)

String t; // an empty string
String a = "good girl"; // initialized from a regular C style string
String b = a; // the copy constructor
String c = '.';

This is the idiomatic usage. It would have just the same effect to write:

String a("good girl");
String b(a);
String c('.');


(b)


String d("good girl", 4);
// same effect as String d = "good";


(c)


String v = "abcd";
v += 1; // result "bcde" ?

String w = "provided";

w -= "vide"; // result "prod" ?
w -= 'r'; // result "pod" ?

String x = "1234567890";
x <<= 1; // result "2345678901"


Figure 4

(a)


 operator char(); // conversion to a char
 operator char *(); // conversion to a char *

(b)

 void show(char c)
 {
 cout << c << ' ' << int(c) << endl;
 }

 char array[4] = { '\0', '\0', '\0', '\0' };

 String nulls(array,4);
 String empty;

 show(nulls);
 show(empty);

Figure 5

(a)

 String String::upper() const;
 String String::lower() const;
 String String::operator()(int start, int length) const;
 String operator+(const String &a, const String &b);
 String operator+(const String &a, const char *b);
 String operator+(const char *a, const String &b);


(b)


 String a = "Cat";
 a(0) = 'R'; // a gets "Rat"
 // a.operator()(0) = 'R';


Figure 6

(a)

String a = "The quick brown fox jumped over the lazy dogs back";
String b = "The queen";
String c = "lazy";

a.match(b); // returns 7

a.match("The quick"); // returns -1

a.index(b); // returns -1
a.index(c); // returns 36
a.index("The", 1); // returns -1
a.index("The"); // returns 0
a.index("the"); // returns 32
a.index('b'); // returns 10
a.index('b',11); // returns 46


(b)


// C style
 char *s1 = "Joe", *s2 = "Fred";
 if (s1 > s2) ...; // legal, but not usually what was intended!
 if (!strcmp(s1,s2)) ...; // proper test for equality

// C++ style

 String s1 = "Joe", s2 = "Fred";
 if (s1 > s2) ...; // evaluates to true
 if (s1 == s2) ...; // proper test for equality

October, 1991
 OBJECT-ORIENTED SOFTWARE CONFIGURATION MANAGEMENT


The limitations of difference models require a new approach




Richard Harter


Richard is president of Software Maintenance and Development Systems Inc.,
makers of Aide-de-Camp software. He can be reached at P.O. Box 555, Concord,
MA 01742.


One of the most frustrating and least understood aspects of software
development is keeping track of software changes over time. Solving this
problem is the goal of Software Configuration Management (SCM). While in
recent years much of the design and coding of software has been automated,
managing the output of that process is still largely a manual task. This
situation has begun to change.
All developers manage their software at some level, if only to sift through
the lists of source files on their directories. This approach to SCM is much
like flipping through the cards in a Rolodex, except that it's done on a
computer. The programmer relies on his or her own memory, plus perhaps a
naming convention, to know which files belong to which software versions.
While workable on small projects, manual methods can often leave the developer
with too many unanswered questions.
For example: Which files, out of possibly hundreds, implement a new feature in
the software? What other files are affected by a given change and must also be
revised? Which lines of code were deleted, added, or rewritten--and why?
SCM becomes really difficult if several people are working on different parts
of a program at once. What happens, for example, if you and I are both working
on products which include File A, but I change my copy of the file? If you
like the features I have added to File A, how will you know if you can bring
my version of the file into your product?


Physical and Logical Differences


Managing change is difficult because a single change may have many
consequences--at both a physical and a logical level. The term "logical
change" refers to a change in what the program does, while "physical change"
means a change in what the program is. Logical changes include a new feature
or function. Physical changes include edited source lines, rewritten
documentation, and rebuilt executables.
Developers must know how a logical change affects other logical changes. (If I
change what the program does here, how will it affect what the program does
over there?) They must also know which physical changes result from a single
logical change (such as altered source lines, revised documentation, rebuilt
executables, and so on). A complete SCM system, in short, must help the
developer understand the full impact of a logical change.
Formal SCM models have improved on the Rolodex approach by providing methods
to track software changes automatically. Two broad classes of formal models
are in use today: difference models and an object-oriented model. Difference
models represent the traditional approach to SCM. They rely on the physical
differences between successive versions of source files to construct and store
software. The object-oriented model, on the other hand, relies more heavily on
the logical changes that cause these physical differences.
There are two classes of difference models: the Update Model and the
Integrated Differences Model. In the former, the deltas are discrete files
that exist apart from the base version, while in the latter, the deltas are
embedded within a common file (called a "master file") with the base version.
Control records are interleaved with deltas in the Integrated Differences
Model to indicate to which version a delta belongs. Examples of the Update
Model are IBM Update and CDC Update. Examples of the Integrated Differences
Model are the Source Code Control System (SCCS) packaged with Unix System V,
and Digital Equipment Corporation's CMS program. Currently, the only example
of an object-oriented SCM system is my company's Aide-de-Camp.


Limitations of Difference Models


One major limitation of the difference models is that to specify a software
release, the developer must identify and name a collection of files and a
version number for each file. In the difference models, a file version is
conceptually constructed with a single pass algorithm from a base version of
the file and a sequence of differences between the base version and successive
versions. These differences are called "deltas."
A complete software release is built from the base versions of all the files,
and the application of all the deltas implied by the specified file versions.
Logically, a software release can be described by the formula
 Release[N] = {V[b], Deltas[n]}
In this formula, Deltas[n] means a set of some number of deltas, with each
delta containing lines which are added to or subtracted from files in the base
version (known as V[b]).
The next release in the sequence would then be
 Release[N+1] = {V[b], Deltas[n], Deltas[n+1]}
Or in other words, a set of sets of deltas and the base version. To
distinguish between the two layers of sets, we will use the term configuration
to refer to the base version plus all the sets of deltas that comprise a
release.
It would be a genuine benefit if the SCM system could use knowledge of
existing releases to define new releases, as in
 Release[N+1] = {Release[N], Deltas[n+1]}
Difference models do not support this level of SCM automation, because deltas
have no names, and releases are only collections of versions of files.
The lack of an independent identity for deltas presents three handicaps to the
developer. First, the developer cannot use the model to define other releases
that may not have been anticipated when the named releases were created. Say,
for example, that we wanted features that could be implemented by a new
configuration of deltas, and suppose that the current configurations are
defined as
 Deltas[n] = {D1, D2, D3} and Deltas[n+1] = {D4, D5, D6}
We might want to say that the desired, yet unnamed, release would consist of a
new configuration of deltas, such as
 Release[N+1] = {D2, D3, D6}
Unfortunately, the difference models offer no intelligence with which to
directly select a new set of deltas from existing configurations.
A second handicap is that future releases must always carry with them the
deltas from current and past releases, even if those deltas are no longer
active in the program. In other words, the source file for Release[N+3] must
include the deltas that defined Release[N+2], while Release[N+2] must include
the deltas that defined Release[N+1]. Each release depends on the presence of
all prior releases in order to register as a valid release with the SCM model.
If a previous link in the chain is broken (a delta left out), later releases
may no longer be valid. If, for example, somewhere down the road we wanted to
remove the features implemented by Deltas[n], we would do so by writing a
Deltas[n+2] that removes the logical effects of D1, D2, and D3 in the program,
but not the actual code. Therefore, in the difference models,
 Release[N+2] = {V[b], Deltas[n], Deltas[n+1], Deltas[n+2]}
Contrast this with the simpler and more logical
 Release[N+2] = {V[b], Deltas[n+1]}
This simpler expression is not allowed in the difference models because if we
physically remove Deltas[n], we can no longer go back and build Release[N] or
subsequent releases which incorporate it. Optimally, we would like to keep
Deltas[n] available but be able to include or exclude it from the software as
needed.
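
If deltas had names, the release formulas above could be written almost
literally. The sketch below is a hedged illustration with hypothetical names
(Release, extend, without); it models a release as the implied base version
plus a set of named delta sets, which is precisely what the difference models
do not offer.

```cpp
#include <cassert>
#include <set>
#include <string>

// A release is the base version plus a set of named delta sets.
// The base version is implied; only the delta names are stored.
typedef std::set<std::string> Release;

// Release[N+1] = {Release[N], Deltas[n+1]}
Release extend(const Release &prior, const std::string &deltas)
{
    Release next = prior;
    next.insert(deltas);
    return next;
}

// Exclude a named delta set from a new release. Earlier releases keep
// their own selection, so they can still be rebuilt.
Release without(const Release &from, const std::string &deltas)
{
    Release next = from;
    next.erase(deltas);
    return next;
}

void demo_releases()
{
    Release rn;
    rn.insert("Deltas[n]");
    Release rn1 = extend(rn, "Deltas[n+1]");
    Release rn2 = without(rn1, "Deltas[n]");

    assert(rn.count("Deltas[n]") == 1);     // Release[N] is untouched
    assert(rn1.size() == 2);
    assert(rn2.size() == 1 && rn2.count("Deltas[n+1]") == 1);
}
```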
To some extent, this need in the difference models to carry prior changes into
the future, whether they are needed or not, defeats the purpose of SCM. The
developer ought to be able to decide whether or not to retain old code in a
program. Moreover, the developer should not have to expend resources storing,
compiling, managing, and dealing with code that is no longer relevant to the
current release.


Linearity Constraints



The third and final handicap of not being able to deal with deltas
independently from named releases is that software change can only be managed
along a linear path. This linearity constraint flies in the face of the
real-world need to pursue development along multiple paths. Most software
development efforts consist of teams which work on different parts or versions
of software at the same time. One team, for example, may work on a Unix
version of a product, while others work on DOS and OS/2 versions.
When working under linearity constraints, however, an SCM system can support
only one path at a time. An example would be when software is developed along
a time line in which each release is defined by adding its own deltas to
previous ones.
But in situations where multiple releases of the same product undergo parallel
development, not all releases share all prior deltas. In such a situation, the
deltas that comprise one release may not include deltas along a separate
branch or path of development. As far as the SCM system is concerned, the
deltas along one branch have no meaning to the deltas along a different
branch. This means that the SCM cannot build a release along one path that
includes deltas from a separate path.
This may not cause a problem as long as the development paths remain separate.
Development is simply managed independently as two distinct chains which
happen to share a base version and some early deltas. A problem will occur
when the developer needs to migrate a change from a release on one development
path to another path. At that point, what the developer may have is perhaps
hundreds of isolated deltas with no way to logically tie them together. The
following example illustrates this.


Change Along Two Paths


Here is an example of migrating an intermediate change from one development
path to another development path. The baseline release, shown in Example 1, is
a toy program which computes the sum of its arguments. Examples 2, 3, and 4
show changes made to the baseline version. I've marked the lines that were
changed with arrows.
Example 1: The base version of a simple example program

 void main(int argc,char *argv[])
 {
 int sum,i;

 for (i=0;i<argc;i++)
 {
 sum += atoi(argv[i]);
 }
 printf("Sum = %d\n",sum);
 }

Example 2: Change A1 fixes a bug in initialization of a variable.

 void main(int argc,char *argv[])
 {
 -> int sum = 0;
 -> int i;

 for (i=0;i<argc;i++)
 {
 sum += atoi(argv[i]);
 }
 printf("Sum = %d\n",sum);
 }

Example 3: Change A2 is some compiler-dependent speed optimization.

 void main(int argc,char *argv[])
 {
 -> register int sum = 0;
 -> register int i;

 -> for (i=argc; --i >=0; )
 {
 sum += atoi(argv[i]);
 }
 printf("Sum = %d\n",sum);
 }

Example 4: Change B1 modifies the function of the program to sum squares.

 void main(int argc,char *argv[])
 {
 int sum,i;
 ->int k;


 for (i=0;i<argc;i++)
 {
 ->k = atoi(argv[i]);
 ->sum += k*k;
 }
 printf("Sum = %d\n",sum);
 }

There are two modification paths for this program: path A and path B. The
modifications along path A consist of bug fixes and performance enhancements.
Change A1 (shown in Example 2) fixes a problem with an uninitialized variable.
Change A2 (shown in Example 3) tries some compiler-dependent code optimization
for speed. The modification in path B (change B1, shown in Example 4) alters
the program to compute the sum of the squares of its arguments.
Now let's suppose that we want to migrate the first change made in the path A
(that is, the code in Example 2) over to path B. Example 5 shows the file
after migration. This result cannot be achieved with difference-oriented
models. An object-oriented SCM, however, can represent these changes in a way
that makes migration possible.
Example 5: Merging path B with the first part of path A

 main (int argc, char *argv[])
 {
 -> int sum = 0;
 -> int i;
 int k;

 for (i=0; i<argc; i++)
 {
 k = atoi (argv[i]);
 sum += k*k;
 }
 printf ("Sum = %d\n", sum);

 }

In traditional SCMs, the disk file is the primary data type that gets managed.
The basic properties of the disk file are its name, size, location, and
content. By contrast, an object-oriented SCM uses a range of data types with
properties such as program functionality, hardware dependencies, and foreign
languages supported. The advantage here is that object properties have meaning
outside the physical definitions of the software.
File-oriented methods can only say where the physical chunks of a program
reside. An object-oriented system can also say where the functions of a
program reside and which of those functions satisfy specified criteria -- such
as which changes go with which hardware platform or program feature.
Going back to our example, an object-oriented SCM can migrate a change from
one development path to another because each change is made in a separate,
distinct step. Each change is known to the system as an individual object.
Furthermore, changes can be migrated without worrying about changing line
numbers because each line in the file is a distinct object with a permanent
identity. This allows us to formulate the change in terms of logical rather
than physical constructs.


Change Sets


In an object-oriented SCM, all source lines are uniquely named by the system
as they are entered. Lines in the database of program code are either original
or associated with a change set. A change set can either add lines, delete
lines, or both. Although a line is listed as being deleted by a change set,
the line is never physically removed from the database; it is simply excluded
from any software versions which include that change set. To build a version,
the developer selects the change sets to be included. Change sets can be
selected according to any criteria established by the developer, regardless of
whether or not the changes were named in a previously defined configuration.
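
The bookkeeping described above can be sketched in a few lines. This is only
an illustration under assumed names (Line, Database, build, sample_database),
not a description of how any particular product stores its data. The database
holds every line ever entered, in file order, and building a version is a
matter of filtering by the selected change sets. The sample data encodes the
base program together with changes A1 and B1 from Examples 2 and 4.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One source line with a permanent identity (all names are illustrative).
struct Line {
    int id;                            // unique, never reused
    std::string text;
    std::string added_by;              // empty for lines of the base version
    std::set<std::string> deleted_by;  // change sets that exclude this line
};

typedef std::vector<Line> Database;    // every line ever entered, in file order

// Build a version: keep a line if it is original or its change set is
// selected, and no selected change set deletes it. Nothing is removed
// from the database itself.
std::vector<std::string> build(const Database &db,
                               const std::set<std::string> &selected)
{
    std::vector<std::string> out;
    for (const Line &ln : db) {
        bool present = ln.added_by.empty() || selected.count(ln.added_by) > 0;
        for (const std::string &cs : ln.deleted_by)
            if (selected.count(cs) > 0)
                present = false;
        if (present)
            out.push_back(ln.text);
    }
    return out;
}

// The base program plus change sets A1 (initialization fix) and
// B1 (sum of squares).
Database sample_database()
{
    return {
        {1,  "void main(int argc,char *argv[])", "", {}},
        {2,  "{", "", {}},
        {3,  "int sum,i;", "", {"A1"}},
        {4,  "int sum = 0;", "A1", {}},
        {5,  "int i;", "A1", {}},
        {6,  "int k;", "B1", {}},
        {7,  "for (i=0;i<argc;i++)", "", {}},
        {8,  "{", "", {}},
        {9,  "sum += atoi(argv[i]);", "", {"B1"}},
        {10, "k = atoi(argv[i]);", "B1", {}},
        {11, "sum += k*k;", "B1", {}},
        {12, "}", "", {}},
        {13, "printf(\"Sum = %d\\n\",sum);", "", {}},
        {14, "}", "", {}},
    };
}
```

Selecting no change sets yields the base version, selecting only B1 yields
the sum-of-squares program, and selecting A1 together with B1 yields the
merged file of Example 5, without any reference to line numbers.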
A change set has both properties and attributes. Properties are the contents
of the change and can include its name, a text abstract, the author,
associated source code, object code, and binary images. An advantage of
organizing information as objects, rather than as source files, is that no
inherent restriction exists regarding the type of information represented.
Differences in machine-readable files can be handled as readily as differences
in text files.
Attributes are tags that label the change set. They can be both system- and
user-supplied and are used to group changes according to meaningful criteria,
such as all the changes that implement a particular accounting rule.
Attributes allow developers to deal with a change in terms of meaning, rather
than as a chain of deltas. User-supplied attributes can include operating
system names, hardware platform names, and customer names. System-supplied
attributes include release names and dependencies. Dependency attributes, for
example, identify which software modules call, and are called by, other
software modules.
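
Selecting change sets by attribute amounts to a simple filter. The sketch
below assumes a hypothetical ChangeSet record carrying a name and a set of
tags; the tag values shown in the usage function are invented for
illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical record: a change set's name plus the tags that label it.
struct ChangeSet {
    std::string name;
    std::set<std::string> attributes;
};

// Return the names of all change sets carrying the given attribute.
std::set<std::string> select_by_attribute(const std::vector<ChangeSet> &all,
                                          const std::string &tag)
{
    std::set<std::string> names;
    for (const ChangeSet &cs : all)
        if (cs.attributes.count(tag) > 0)
            names.insert(cs.name);
    return names;
}

void demo_attributes()
{
    std::vector<ChangeSet> all = {
        {"A1", {"path-A", "bugfix"}},
        {"A2", {"path-A", "optimization"}},
        {"B1", {"path-B", "sum-of-squares"}},
    };
    std::set<std::string> fixes = select_by_attribute(all, "bugfix");
    assert(fixes.size() == 1 && fixes.count("A1") == 1);

    std::set<std::string> pathA = select_by_attribute(all, "path-A");
    assert(pathA.size() == 2);
}
```

The resulting set of names is exactly the kind of selection a developer would
hand to the system when asking it to build a version.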
The coexistence of properties and attributes in a single object is a powerful
combination. Changes now have an identity independent of any releases in which
they participate. When defining a release, developers are no longer locked
into selecting predetermined change configurations. Later releases no longer
include all changes from early releases.
Changes to a program's code can be represented logically in terms of software
features. They can be selected or deselected at will and grouped according to
whatever criteria are meaningful to the developer.
Developers are now free to migrate changes between parallel development paths
rather than proceeding down a linear chain of differences, in which a break in
the chain results in files being set adrift by the SCM system. If so desired,
a developer can still maintain a linear path simply by selecting all the
change sets that belong to the previous version. But in addition, a software
release can simply be a base version plus any selected changes. A specific
feature can be migrated from one development path to another by selecting only
those change sets tagged as implementing the particular feature. The developer
can even define an entirely new release as a new selection of change sets.
Finally, the whole notion of SCM can be expanded to include whatever
properties might be included within an object -- not just source lines, but
machine readable code, documentation, and even front-end CASE design data.



_OBJECT-ORIENTED SOFTWARE CONFIGURATION MANAGEMENT_
by Richard Harter


Example 1: The base version of a simple example program.

void main(int argc,char *argv[])
{
 int sum,i;

 for (i=0;i<argc;i++)
 {
 sum += atoi(argv[i]);
 }

 printf("Sum = %d\n",sum);
}



Example 2: Change A1 fixes a bug in initialization of a variable.

void main(int argc,char *argv[])
{
----> int sum = 0;
----> int i;


 for (i=0;i<argc;i++)
 {
 sum += atoi(argv[i]);
 }
 printf("Sum = %d\n",sum);
}


Example 3. Change A2 is some compiler-dependent speed optimization.

void main(int argc,char *argv[])
{
----> register int sum = 0;
----> register int i;

----> for (i=argc; --i >=0; )
 {
 sum += atoi(argv[i]);
 }
 printf("Sum = %d\n",sum);
}



Example 4. Change B1 modifies the function of the program to sum squares.

void main(int argc,char *argv[])
{
 int sum,i;
----> int k;

 for (i=0;i<argc;i++)
 {
----> k = atoi(argv[i]);
----> sum += k*k;
 }
 printf("Sum = %d\n",sum);
}


Example 5. Merge of path B with the first part of path A.

main(int argc, char *argv[])
{
----> int sum = 0;
----> int i;

 int k;

 for (i=0; i<argc; i++)
 {
 k = atoi(argv[i]);
 sum += k*k;
 }
 printf("Sum = %d\n",sum);

}




















































October, 1991
THE OBJECT D'ART


Application frameworks are the new frontier




Michael Floyd


Mike is DDJ's senior technical editor and can be reached at the DDJ offices or
on CompuServe at 76703,4057.


Why was Motif designed the way it was? Is multiple inheritance really required
to model real-world problems? What is software correctness and how can I
incorporate it in my programming style? Questions like these represented the
object (or more properly, objet) d'art at the fifth Technology of
Object-Oriented Languages and Systems (TOOLS) conference held recently at the
University of California at Santa Barbara. In most cases, I found surprising
answers to these (and other) questions, surprises that I'll share with you in
this report.
The first two days of the conference offered a variety of tutorials split
between a management track and a technical track. The last two days consisted
of presented papers, panels, and a keynote address from object-oriented
pioneer Adele Goldberg of ParcPlace Systems. One invited paper of note was
presented by Vania Joloboff of OSF. Late afternoons were set aside for
companies showing their wares.
Notably, the Eiffel tutorial was packed. A surprise, however, was that the C++
session given by Stanley Lippman, a technical staff member at AT&T and author
of the well-known C++ Primer, was not. And oddly, the CLOS tutorial received
the least attention and was consequently canceled. (Nevertheless, it provided
an opportunity for me to sit down with Lois Wolf of Franz, and chat about
Lisp. From my school days, I've always thought of Lisp as a highly-recursive,
function-based language. Lois's unique perspective as an in-the-trenches
programmer in the early days at Symbolics provided some interesting insights
on the development of Lisp dialects such as Flavors, Portable Common LOOPs,
and CLOS. I hope Lois will share those insights with us in a future issue.)


A Look at Eiffel


The first thing to say about Eiffel is that it is not just a language, but a
methodology. Developed by Bertrand Meyer, Eiffel embraces a notion of
"software correctness" where the specification and implementation of a class
exactly match. According to Meyer, a software component is written correctly
if it adheres to a basic set of requirements. These requirements are
encouraged by the language and selectively enforced at runtime. For example,
Eiffel supports the notion of "assertions" as a means of rigorously specifying
the features of a class. Assertions are Boolean expressions of three kinds:
preconditions, which must hold before a feature is executed; postconditions,
which are guaranteed to be TRUE after a feature is executed; and class
invariants, which specify properties that an instance must always satisfy.
Including assertions in classes improves program documentation. But more
importantly, they ensure the consistency of the class--a key factor to
correctness. Also in the list of requirements for correct components are
check-correctness, loop-correctness, and error-correctness. The idea is that
together, these concepts allow you to create a specification of a class that
is rigorous in nature, thus ensuring its correctness in design and
implementation. (Also see "Writing Correct Software With Eiffel" by Bertrand
Meyer, DDJ, December 1989.)
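Eiffel expresses assertions in the language itself. As a rough analogy only, the three kinds of checks can be sketched in C with assert(). The bounded counter type and its functions below are hypothetical and are not taken from Eiffel or its libraries.

```c
#include <assert.h>

/* Hypothetical bounded counter, used only to illustrate the three
   kinds of assertion. */
struct counter {
    int value;
    int limit;
};

/* Class invariant: a property every instance must always satisfy. */
static int counter_invariant(const struct counter *c)
{
    return c->limit > 0 && c->value >= 0 && c->value <= c->limit;
}

void counter_init(struct counter *c, int limit)
{
    assert(limit > 0);                  /* precondition */
    c->value = 0;
    c->limit = limit;
    assert(counter_invariant(c));       /* invariant established */
}

void counter_increment(struct counter *c)
{
    int old_value;

    assert(counter_invariant(c));       /* invariant holds on entry */
    assert(c->value < c->limit);        /* precondition: required before the call */
    old_value = c->value;
    c->value++;
    assert(c->value == old_value + 1);  /* postcondition: guaranteed after the call */
    assert(counter_invariant(c));       /* invariant holds on exit */
}
```

Unlike Eiffel, C does nothing to tie these checks to the interface of the type; the analogy is only that a violated assertion stops the program at the point where the contract was broken.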
The development of Eiffel as a nonproprietary language is curious, to say the
least. I was surprised at OOPSLA '90, when Meyer told me that the language
specification had been placed in the public domain. Shortly thereafter, an
independent organization was formed to maintain standards for various
implementations of Eiffel, and to advance its cause. In March of this year,
the Eiffel trademark, held by Meyer's Interactive Software Engineering (270
Stokes Road, Suite 7, Goleta, CA 93117), was passed to the Nonprofit
International Consortium for Eiffel (P.O. Box 6884, Santa Barbara, CA 93160),
which now monitors the language's standards. This is a first -- a
nonproprietary language specification controlled by an "open" industry
consortium!
These developments make it possible for independent developers to implement
their own versions of the Eiffel language, and Eiffel/S is the first example
of a commercially available implementation developed outside of Interactive
Software Engineering (ISE). Eiffel/S, developed by SiG Computer, is a DOS
version based on the Eiffel 3.0 specification (also a first). Interestingly,
it can generate C source code that is compatible with either Microsoft or
Turbo C. Note that Eiffel/S will be marketed in the U.S. by ISE; other planned
versions include Unix V/386 and OS/2.
Another language, Sather, is based on Eiffel but departs from the
specification in several areas. Sather, which currently runs on Sun 4
computers, places more emphasis on efficiency and less on "correct software
construction." Sather was developed by Stephen M. Omohundro, who has also been
involved in past developments of Star Lisp and Mathematica. Sather, which is
currently in beta, is freely available through anonymous ftp, or on floppy
disk or cartridge tape through Rock Solid Software (P.O. Box 163072, Austin,
TX 78716). Sather is offered under a license agreement patterned after the
Free Software Foundation's GNU license.


A UI for You and I


Windows is an event-driven system that is now accessible to object-oriented
programmers using C++, Actor, Turbo Pascal, and the like. But although Windows
supports message passing and subclassing of windows, it does not directly
support object-oriented concepts.
Motif, on the other hand, is an exercise in object-oriented design using C.
Motif, which is based on the X Consortium's Intrinsics Toolkit (Xt), uses Xt to
provide object-oriented mechanisms such as encapsulation of window objects via
widgets and single inheritance. According to Vania Joloboff, who is chiefly
responsible for versions 1 and 2, a primary goal in designing Motif was to
make it as close as possible (in terms of look and feel) to Windows. Although
this makes sense, it is an interesting admission by Joloboff. There are a lot
of DOS and Windows guys out there, and users can move freely from one
environment to another. It also paves the way for multiplatform frameworks.
(Maybe Al Stevens will add support for Motif in his next incarnation of
D-Flat.)
Joloboff used Motif as an elegant example of the types of problems and
decisions involved in object-oriented design. Subclassing, for instance,
provides many benefits, including that of reusability. But when referencing an
instance object in a large class hierarchy, the linker sucks in all code from
parent objects and thus increases code size. Additionally, complex hierarchies
increase program maintenance and the complexity of the documentation. So, the
designer must decide at some point whether to subclass or augment existing
classes.
There seems to be some debate in the object-oriented community over multiple
inheritance. Opponents contend that multiple inheritance compounds the
problems of subclassing (as described earlier) and introduces new problems
such as name clashing and repeated inheritance. Both are usually handled by
object-oriented language compilers, but keep in mind that Motif was written in
C. The experts I spoke with, including Joloboff, feel that real-world modeling
is not possible without the use of multiple inheritance.
I still see problems with this notion because I believe current mechanisms are
not fully representative. Inheritance mechanisms as implemented in current
object-oriented languages do not allow for features of a parent object to be
selectively inherited. Selective inheritance is well known in the AI
community, where links to nodes (objects) in the hierarchy can be
distinguished as is-a (for example, float is-a real number) and a-kind-of
(float is a-kind-of number). But unlike their AI cousins, object-oriented
inheritance hierarchies do not distinguish between the types of links in the
tree. Child is-a-kind-of Parent. It's all or nothing.
Such semantics are quite possibly lost on the programmer trying to bang out
the next killer app--just inherit the feature and ignore it. But aside from
possible unwanted side-effects, there are performance issues. The decision in
Motif, then, was to go with traits, or what Joloboff calls a "poor man's
multiple inheritance." Traits are methods, such as PUSHABLE, which can be
applied to objects. An interesting solution.


The Main Event


Hearing Adele Goldberg at the podium was itself worth the price of admission.
Goldberg's talk on the "need to change" inspired many attendees to rethink
their strategies for management, structuring of programming teams, and the
role of consultants. A recurring theme at the conference (which surfaced in
Goldberg's address) was that of a new reward system for programmers. The
current system is to reward programmers for the number of lines of code they
produce. I'm all for reward, but that's at odds with the desire to reuse code.
(Think about that when you ask for your next raise.)
Noting the high levels of hype in the industry, Goldberg warned that the AI
community also enjoyed its moment in the sun and that (the industry) is not
doing any favors by over-promoting the technology. Indeed, many of us (even
editors and journalists) have hopped onto the object-oriented wagon. The
discussion, however, implied that many products are claiming to have
object-oriented features merely to gain its benefits, alluding to Pascal as a
case in point.
Both Microsoft and Borland have provided extensions, based in part on Apple's
Object Pascal. I caught up with Borland's Eugene Wang to get his take on the
comment. According to Wang, "Turbo Pascal provides three levels of
object-orientation: the language, supporting tools, and application
frameworks." Wang is quick to point out that the syntax of the language
supports encapsulation, inheritance, and polymorphism and notes that Turbo
comes with a browser and object inspector integrated with Turbo Debugger.


Application Frameworks


When asked for a show of hands, about half the keynote audience admitted they
were at least familiar with application frameworks, which provide a high-level
layer between the programmer and the operating environment. Indeed, the
proliferation of GUIs will make application frameworks the new front for
compiler vendors. Witness the heavy investments in the technology by companies
such as ParcPlace and Borland.
Turbo Pascal, for example, provides application frameworks for both DOS (with
Turbo Vision) and Windows (with its Object Windows Library). I've used Object
Windows, and there's a significant learning curve associated with it. As you
might guess, mastering the hierarchy and knowing when (and how) to make calls
directly to the Windows API takes time. But there are a number of fine points
that creep in. For instance, Turbo Pascal now supports null-terminated
strings, which are required by Windows, both as PChars and zero-based
character arrays. These are handled in different ways and are a source of
confusion among programmers used to Turbo's String type. Turbo provides a set
of functions for handling PChars, but no checking is performed by these
functions. This leads to problems, such as dangling pointers, seen only in
languages like C. I'm not sold on the benefits for users of Turbo Vision, but
the investment is definitely worth the time for Windows programmers.


Time Well Spent



Questions I did not hear at TOOLS were, "What are the benefits of
object-oriented programming?" or "What is inheritance?" For the most part, the
attendees were a sophisticated group with an advanced understanding of
object-oriented principles. The presentations were small and informal, making
some of the top experts accessible to all. Also, the emphasis seemed to be
less academic with more focus on applications and practitioners. As noted by
Bertrand Meyer, TOOLS caters to the attendees--not just the presenters.
Meyer also noted that holding conferences more frequently enables organizers
to react more quickly to industry developments. In fact, TOOLS will be held
three times next year, starting with a Sydney, Australia session in December
and a March '92 conference in Dortmund, Germany. I suppose I'll have to wait
for the next conference in Santa Barbara, but there's no question it'll be
worth the wait.




























































October, 1991
PORTING UNIX TO THE 386: THE BASIC KERNEL


Multiprogramming and Multitasking, Part II




William Frederick Jolitz and Lynne Greer Jolitz


Bill was the principal developer of 2.8 and 2.9BSD and the chief architect of
National Semiconductor's GENIX project, the first virtual memory
microprocessor-based UNIX system. Prior to establishing TeleMuse, a market
research firm, Lynne was vice president of marketing at Symmetric Computer
Systems. They conduct seminars on BSD, ISDN, and TCP/IP. Send e-mail questions
or comments to lynne@berkeley.edu. (c) 1991 TeleMuse.


In last month's installment, we embarked upon an exploration of
multiprogramming and multitasking -- two of the important elements which help
make UNIX "UNIX." Having examined these key UNIX conventions and developed
the intellectual framework for multiprogramming, we now proceed to an
examination of some actual code, in particular sleep(), wakeup(), and
swtch(), and how these three routines carry off the illusion of multiple
simultaneous process execution on a single processor.
Following our discussion of the 386BSD switching mechanisms, we will then
discuss some of the requirements for the extensions needed to support
multiprocessor and multithread operations in the monolithic 386BSD kernel.
Finally, we will reexamine the multiprogramming attempts of some other
operating systems in light of what we have learned.


The 386BSD swtch() Routine


The process context switching routine in our 386BSD kernel that provides the
actual functionality is called swtch(). In a nutshell, swtch() stores the
process's context in a pcb (process context block) data structure, finds the
next process to run from those waiting to be run, loads the new process's
state, and then executes a return. Note that this return completes the call
to swtch() that the new process itself made when it was last running.
This mechanism relies on the fact that a process must effectively request a
context switch to the next consecutive process, whatever it is. This way,
either another process is run, or, if it is the only running process, it
switches back to itself again. If there is no process to run, such as when the
sole running process switches while waiting for disk I/O to complete, swtch()
must idle the processor and rescan the run queue when a process is added to
the queue.
The 386BSD swtch() routine attempts to avoid work when possible, because at
times, literally hundreds to thousands of switches per second may be demanded
as we use a large number of processes running on top of the kernel. To reduce
overhead, swtch() does nothing to ensure that the data structures remain
unchanged between the call and subsequent return from swtch(). As such, other
parts of the kernel must incorporate mechanisms concerned with critical
sections (that we can implement as needed at those sites). Similarly, nothing
in swtch() is done to prevent deadlocks or even ensure that a process is ever
executed again! (MULTICS, in contrast, has refined to a fine art the ability
to avoid deadlocks.) By reducing the overhead on context switching, we also
reduce the cost of a process.
Other operating systems attempt to reduce the overhead on process switching by
avoiding switches when possible and planning for them when forced to use them.
The designers of UNIX took a different approach -- make the mechanism for
context switching simple and cheap, use it often, and build new mechanisms for
handling other requirements on an as needed basis. In other words,
multitasking comes first, and everything else falls within our multitasking
framework.


Where Is swtch() Used?


swtch() is called only in the "top" layers of the kernel when a process, in
kernel mode, blocks for a resource. To keep processes that stay in user mode
from locking or rescheduling, swtch() is called as an exception at the end of
interrupt processing, when we would be returning to the user-mode program. A
periodic clock interrupt (used to recalculate the process priorities used by
swtch() to determine who to run next) also ensures that a process can't
overstay its welcome even if no other interrupts occur.
If a process is not running, it is either waiting to return to user mode (for
instance, a higher priority process preempted it) or its kernel mode alter ego
is waiting for some event or resource to "wakeup" so that it can be switched.
In most UNIX systems, a process status command called ps can be used to
differentiate between these two cases. In the first case, a process will be
marked as "runnable" (R or SRUN). In the second, it will be marked as
"sleeping" (S or SSLEEP) and be assigned a "wchan" (or wakeup event number) to
indicate which event it is waiting for.
Listing One (page 118) is a code fragment executed after a system call or
interrupt, when the process is about to return to user mode. At this point, no
resources capable of causing a deadlock condition are being held, so the
process sees no semantic difference between running now or later -- it is safe
to switch. Note that policy decisions of when and who to switch to are made
elsewhere. Also note that signals, including ones that might result in
termination, are processed at this point.
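The check made on the way back to user mode can be sketched in a few lines of C. This is not Listing One: the stub functions, the counters, and the ordering of the reschedule and signal steps are all assumptions made for illustration.

```c
#include <assert.h>

static int want_resched;     /* set by the clock interrupt when a timeslice expires */
static int pending_signals;  /* number of signals waiting for the current process */
static int switches;         /* bookkeeping for this sketch only */
static int signals_handled;  /* bookkeeping for this sketch only */

/* Stand-in for swtch(); a real kernel would run another process here. */
static void swtch_stub(void)
{
    switches++;
}

/* Stand-in for signal delivery, which may terminate the process. */
static void handle_signal_stub(void)
{
    signals_handled++;
}

/* Called after a system call or interrupt, just before returning to
   user mode. No kernel resources are held at this point, so it is a
   safe place to switch. */
void return_to_user(void)
{
    assert(pending_signals >= 0);
    if (want_resched) {
        want_resched = 0;
        swtch_stub();
    }
    while (pending_signals > 0) {
        pending_signals--;
        handle_signal_stub();
    }
}
```

The point of the sketch is only that the policy decision (setting want_resched) is made elsewhere; this path merely acts on it.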
tsleep(), or "timed sleep" (see Listing Two, page 118), is a blocking call in
the BSD kernel that sets a process sleeping and switches away to run another
process (or idle) until the event occurs. It works by marking the process as
waiting for an event, inserting it on a sleep queue (frequently many processes
sleep for the same event), and switching. In addition, tsleep() has options to
abort a sleep after a time limit, as well as the ability to catch "signals"
(that is, someone has killed the process and the code calling tsleep() should
clean up and let other routines, such as system calls, process the incoming
signal).
Unlike tsleep(), wakeup() (see Listing Three, page 118) does not cause control
to pass to another process. It simply removes the block (indicated by the
"chan" event) that is stopping the processes sleeping on it. To reduce time
spent finding processes, the sleep queue is hashed. Found processes are added
to the run queues. If a process is not loaded, the swap scheduler associated
with process 0, the first process, is awakened to bring it in. (Of course, in
that case we won't put it on a run queue and reschedule.) Note that it is not
an error if no processes are waiting for the event. The processes now placed
on the run queues will be run (in order of priority, and in FIFO order within
a priority) as swtch() is called.
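The bookkeeping behind sleeping and waking can be sketched as a user-space simulation. This is not the code from Listings Two and Three: the names sleep_on and wake_up, the bucket count, and the hash function are assumptions, and the actual switch away from the sleeping process is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define SQSIZE 16                        /* number of hash buckets (assumed) */
#define HASH(chan) (((size_t)(chan) >> 4) % SQSIZE)

enum pstate { P_RUN, P_SLEEP };

struct proc {
    enum pstate  state;
    const void  *wchan;                  /* event being waited for, or NULL */
    struct proc *next;                   /* link within one hash bucket */
};

static struct proc *sleepq[SQSIZE];      /* hashed sleep queue */

/* Mark p as waiting for chan and put it on the sleep queue.
   A real tsleep() would then switch away; that part is omitted. */
void sleep_on(struct proc *p, const void *chan)
{
    size_t h = HASH(chan);

    assert(chan != NULL);
    p->state = P_SLEEP;
    p->wchan = chan;
    p->next = sleepq[h];
    sleepq[h] = p;
}

/* Make every process sleeping on chan runnable and return how many
   were awakened. Finding none is not an error, and control does not
   pass to another process here. */
int wake_up(const void *chan)
{
    struct proc **pp = &sleepq[HASH(chan)];
    int n = 0;

    while (*pp != NULL) {
        struct proc *p = *pp;

        if (p->wchan == chan) {
            *pp = p->next;               /* unlink from the bucket */
            p->next = NULL;
            p->wchan = NULL;
            p->state = P_RUN;            /* a kernel would queue it to run here */
            n++;
        } else {
            pp = &p->next;
        }
    }
    return n;
}
```

Because several events may hash to the same bucket, wake_up() compares the stored wchan of each entry rather than waking the whole bucket.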
When swtch() is called by other portions of the kernel, caution must be
observed -- its "simple" nature should not cause more problems than it solves.
For example, to forestall deadlocks and data structure corruption created by
another process that may have been run in the interim, we must make sure that
calls are "safe." These rules, by the way, are generic to the entire kernel.


The Life Cycle of a Process


By now you might have noticed that swtch() relies upon the existence of a
current process, and possibly even other processes, to be summarily executed.
But we still have not discussed how processes come into being and how they
cease to exist -- in other words, the life cycle of a process. We'll briefly
touch on these to complete the picture.
Processes are created by a fork() system call. The microscopic, inner portion
of a fork() call must create a process context that can be swtch()ed to so
that it can be run. This process context is created by a cpu_fork() routine.
It builds the process context into the newly allocated process data structure,
then adds it to the list of executable processes.
Process termination is conducted by a similarly named routine, cpu_exit().
This routine, the inner part of the exit() system call, deallocates a process
and its related data structures, and then switches away. The process no longer
exists, so it never runs the risk of being switched back. This is also true
for "killed" processes, because exit() is called in the course of processing a
signal. Exiting UNIX processes are, in effect, suicides.
For those of you wondering why an apparently "unkillable" process occurred (a
not uncommon occurrence), this is because the process "sees" a termination
signal only when it returns to user mode. If the process is blocked waiting
for an event or a resource, and the blocking mechanism does not notice the
existence of the signal, it will continue to wait in vain for that resource.
This problem usually arises from a programming error in a device driver or as
an unintended deadlock that a new feature in the system just provoked.


The Magic of swtch(): a Simple Scenario


Now that we've outlined what swtch() does, we need to know how it brings off
the illusion of multiprogramming -- the magic, if you will. As a working
example, we will now set up a hypothetical session: Three people, each running
a process (each on a terminal), are typing simultaneously.
User A presses a key. This causes the computer to interrupt User C's read()
system call that was reading a file off the disk. User A's key press is
processed by the system and put on a character queue associated with User A's
process. A wakeup is then issued to this process, and the computer, in
clearing the interrupt, returns to User C's read() system call.
User C's process, in kernel mode, requests a block from the disk. It now
blocks waiting for the disk, and switches. However, User B had a disk I/O
process waiting even before this example began (yes, an argument for
preexisting universes) and is now competing with User C's process.
User B's process now becomes the current running process because it has more
priority than User A's process. User B's process runs, returning to user mode
and continuing the user program until its timeslice is used up. (The
rescheduling clock interrupt routine periodically checks the timeslice as it
monitors the passage of time, typically 60-100 times a second.) On the last
clock interrupt, the rescheduling code sets the wantresched flag, and a call
to swtch() occurs just before the return to user mode. User A's process is
then selected to run. It then processes the received character in the top half
of the system, completes its system call, and returns to the user.
This entire scenario took place in the order of a hundredth of a second.
Unlike the audiophile who claims to dislike CDs because "he can hear the sound
breaking up" 44,000 times a second (good ears!), most mortals can't resolve
time this finely. However, if 70 users on an anemic little machine such as a
PDP-11/70 all hit return at the same time, this tolerable
hundredth-of-a-second delay would grow to encompass seconds. In fact, you
could go get an espresso while running a simple compilation and get back in
time for a prompt. (Except that someone would probably steal your terminal in
the meantime. Yes, it used to be this way.)
This should illustrate the amount of work done by this tiny subroutine. As
such, it's time to dissect the innards of the 386BSD swtch() routine.



A Closer Look at swtch()


For all practical purposes, swtch() boils down to three functions: store,
select, and load. We store the current process's state, select the new process
to be run, and load in the new process's state.
Storing and loading processor state is pretty simple (see Listing Four, page
118). All we really need to do is move processor registers appropriately and
get a new address space. The key global variable within our BSD kernel is the
curproc variable; it points to the active process at all times. From the
process data structure (which acts as a directory of data structures for this
process), we obtain from the p_addr field the address of the pcb to hold the
context we will store and load. One element of that context, the page table
base pointer %cr3 of the process, defines the address space of the user
process page tables.
Floating point is dealt with here by setting a bit on a processor special
register %cr0, which causes a trap on the next floating-point operation. In
this way, we can delay the store and load of the coprocessor state until it's
absolutely needed. Storing and loading the "Numeric Coprocessor Extension"
(NPX) is costly, and usually only one or two processes use it at a time, so
deferring the work minimizes the impact on the remaining processes.
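The effect of deferring the coprocessor switch can be modeled in plain C. Everything below is a simulation under stated assumptions: ts_flag stands in for the trap bit in %cr0, fpu_use() stands in for both a floating-point instruction and the trap handler it may trigger, and none of the names come from the 386BSD sources.

```c
#include <assert.h>
#include <stddef.h>

struct fpu_state { double regs[8]; };    /* stand-in for coprocessor registers */

struct proc {
    struct fpu_state saved_fpu;          /* per-process save area */
};

static struct fpu_state fpu;             /* the single simulated coprocessor */
static struct proc *fpu_owner;           /* process whose state is loaded in it */
static int ts_flag = 1;                  /* models the trap-on-next-use bit */
static int fpu_switches;                 /* counts real save/restore operations */

/* Done on every context switch: cheap, it only arms the trap. */
void switch_arm_fpu_trap(void)
{
    ts_flag = 1;
}

/* Models process p executing a floating-point instruction. If the
   trap is armed, do what the trap handler would do: swap coprocessor
   state only if another process owns it. */
void fpu_use(struct proc *p)
{
    if (!ts_flag) {
        assert(fpu_owner == p);          /* no switch since our last use */
        return;
    }
    ts_flag = 0;
    if (fpu_owner == p)
        return;                          /* our state is still loaded */
    if (fpu_owner != NULL)
        fpu_owner->saved_fpu = fpu;      /* store the previous owner's state */
    fpu = p->saved_fpu;                  /* load ours */
    fpu_owner = p;
    fpu_switches++;
}
```

Any number of context switches among processes that never touch the coprocessor cost only the flag write; the expensive copy happens when ownership actually changes.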
For the system hardware, we have to remember the priority level of hardware
interrupts (the cpl) and restore it as well. Naturally, we must also set
curproc to point to the successor process.
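The store and load steps can be pictured with a simulated processor. This is a sketch only, not Listing Four: struct cpu is an invention standing in for the real registers, the field layout of the pcb is assumed, and the selection step is replaced by a parameter naming the next process.

```c
#include <assert.h>
#include <stddef.h>

#define NSAVED 6                    /* number of general registers saved */

struct pcb {
    unsigned long regs[NSAVED];     /* saved general registers */
    unsigned long cr3;              /* page table base for this process */
    int cpl;                        /* saved interrupt priority level */
};

struct proc {
    struct pcb *p_addr;             /* where this process's context lives */
};

struct cpu {                        /* simulated processor state */
    unsigned long regs[NSAVED];
    unsigned long cr3;
    int cpl;
};

static struct cpu cpu;
static struct proc *curproc;        /* the active process */

/* Store the outgoing context, then load the incoming one. */
void swtch_to(struct proc *next)
{
    int i;

    assert(next != NULL && next->p_addr != NULL);
    if (curproc != NULL) {          /* store */
        for (i = 0; i < NSAVED; i++)
            curproc->p_addr->regs[i] = cpu.regs[i];
        curproc->p_addr->cpl = cpu.cpl;
    }
    for (i = 0; i < NSAVED; i++)    /* load */
        cpu.regs[i] = next->p_addr->regs[i];
    cpu.cr3 = next->p_addr->cr3;    /* new address space */
    cpu.cpl = next->p_addr->cpl;
    curproc = next;
}
```

Switching from one process to another and back again leaves the first process's registers and priority level exactly as they were, which is all the illusion requires.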
The mechanism for selecting the new process is more complicated. In 386BSD we
have 32 run queues of ascending priorities. swtch() is a consumer of the run
queues, taking the leading process off the highest priority run queue and
loading it. It will consume all processes off a high priority queue before
consuming any from a lower queue. To quickly find which queue to consume
first, the queue status variable whichqs records the "filled versus empty"
status of all queues (32 queues corresponding to 32 bits, one bit per queue).
With a single bit-scan instruction, we can determine: 1. whether a process is
on any queue at all (if whichqs is 0, there is no process to run); and 2. the
highest priority queue that has a process in it.
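The role of whichqs can be shown with a portable sketch. Two assumptions are made here: bit 0 is taken to represent the highest priority queue, and a simple loop stands in for the single bit-scan instruction. The helper names are ours.

```c
#include <assert.h>

#define NQS 32                       /* number of run queues */

static unsigned long whichqs;        /* bit n set: run queue n is non-empty */

/* Portable stand-in for a bit-scan instruction: index of the lowest
   set bit in mask, or -1 if no bit is set. */
int first_set_bit(unsigned long mask)
{
    int n;

    for (n = 0; n < NQS; n++)
        if (mask & (1UL << n))
            return n;
    return -1;
}

/* Record that queue q has gained its first process. */
void mark_nonempty(int q)
{
    assert(q >= 0 && q < NQS);
    whichqs |= 1UL << q;
}

/* Record that queue q has lost its last process. */
void mark_empty(int q)
{
    assert(q >= 0 && q < NQS);
    whichqs &= ~(1UL << q);
}

/* Queue to consume from next, or -1 meaning "nothing to run, idle". */
int select_queue(void)
{
    return first_set_bit(whichqs);
}
```

A result of -1 answers the first question (nothing is runnable), and any other result answers the second without walking the queues themselves.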
What if we can't find a process to run? We must "idle," awaiting a process
that can be serviced. In our idle loop, we reenable interrupts and execute the
hlt instruction to stop the processor. In reality, the hlt instruction pauses
until the next interrupt. We actually could do other things at this point
(some systems, for example, scrub dirty pages), but we choose to idle the
processor in an (usually vain) attempt to avoid stealing bandwidth from the
bus (ah...but ISA has no other bus masters or other CPUs), or, in the case of
a CMOS 386 in a laptop, save power (almost none are "low-power" CMOS).
In order to be run for a time from swtch(), processes must first be queued on
a run queue. The setrq() routine (see Listing Five, page 120) places them on
the end or tail of the run queue associated with the process's priority.
Priority may change as a process runs, so we have a remrq() as well (see
Listing Five) to remove a process from a run queue. This allows us to reinsert
the process with setrq() at a different priority, and thus, a different run
queue.
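The queue manipulation can be sketched as follows. These are simplified stand-ins written for illustration, not the routines in Listing Five: the names rq_insert, rq_remove, and rq_requeue, the doubly linked layout, and the use of the priority directly as a queue index are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define NQS 32                        /* number of run queues */

struct proc {
    int pri;                          /* run queue index, 0..NQS-1 (assumed) */
    struct proc *next, *prev;         /* links within one run queue */
};

struct runq {
    struct proc *head, *tail;
};

static struct runq qs[NQS];
static unsigned long whichqs;         /* bit n set: queue n is non-empty */

/* Place p on the tail of the queue for its priority. */
void rq_insert(struct proc *p)
{
    struct runq *q;

    assert(p->pri >= 0 && p->pri < NQS);
    q = &qs[p->pri];
    p->next = NULL;
    p->prev = q->tail;
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    whichqs |= 1UL << p->pri;
}

/* Take p off its queue, clearing the status bit if the queue empties. */
void rq_remove(struct proc *p)
{
    struct runq *q = &qs[p->pri];

    if (p->prev != NULL)
        p->prev->next = p->next;
    else
        q->head = p->next;
    if (p->next != NULL)
        p->next->prev = p->prev;
    else
        q->tail = p->prev;
    p->next = p->prev = NULL;
    if (q->head == NULL)
        whichqs &= ~(1UL << p->pri);
}

/* Change the priority of a queued process by removing and reinserting it. */
void rq_requeue(struct proc *p, int newpri)
{
    rq_remove(p);
    p->pri = newpri;
    rq_insert(p);
}
```

Inserting at the tail and consuming from the head gives FIFO order within a priority, and keeping whichqs current in both routines is what lets the selection step stay cheap.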


Alternative Implementations and Trade-offs


There are a number of other ways to implement this process switching routine.
On the 386, at least three ways of loading and unloading the registers provide
different degrees of functionality at a given cost. Also, we can choose to
reorder the structure of this routine to minimize the costs for common cases
(switch to self and switch to idle, for instance).
Instead of storing and loading the registers with individual move
instructions, we could have used a single pushal instruction. In that case, we
would point the stack at the pcb before issuing the instruction, then restore
it afterward. Something below the pcb might be written on if we got an
interrupt, so we would also have to bracket it with instructions to lock out
interrupts and reenable them.
Another way we could store and load registers on the 386 is to use the
all-encompassing JUMP TSS task switch instruction (ljmp to a TSS descriptor).
This instruction, unique to the 386, stores all register state in a special
data structure and loads new state as well from the new process. It even
switches the page tables and sets the TS bit dealing with the coprocessor.
(Does just about everything but walk the dog!)
To use the JUMP TSS instruction, we would need a TSS descriptor for each
process that could be active at any time. Also, we would have to detect the
case where we might be switching back onto ourselves, and avoid using JMP TSS
at this point. (This instruction is so helpful, it even tracks whether we are
using the very TSS loaded, and gives us an exception if we do that!) This
instruction, true to its calling, really does it all, storing and loading all
registers (sans the coprocessor, thank goodness) all the time.
RISC fanatics tend to have a field day with CISC instructions of this kind
because of the complexity and expense. As a matter of fact, while
comprehensive, we find it too slow to use efficiently for our purposes.
However, to be fair to the designers of the 386, if you need to do all of the
things that JUMP TSS offers, using this instruction is probably your best bet.
(At the moment, 386BSD doesn't really need all this instruction's features,
but our pcb format allows us to use JUMP TSS in subsequent versions, should it
become more desirable.) The 386 also supports exceptions and interrupts
optionally handled with a transparent CALL TSS, an instruction which may also
offer some advantage in certain circumstances.
In comparing our three methods outlined for process switching routines, it is
important to sit down and add up the instruction costs. We know that by saving
fewer registers, we end up doing fewer loads and stores, and hence make our
end-to-end cost lower. In our 386BSD swtch() function (see Listing Four), we
get away with saving only six registers. We don't need to save %eax, %edx, and
%ecx because these are compiler temporary registers which are discarded on
return. We also don't save the segment registers because they don't change in
this version of the system. In contrast, pushal saves eight registers and JMP
TSS saves 20. Adding up the instruction costs, our approach is the best of the
three.
We can also look at structural changes to swtch() itself. For example,
instead of our (store, select, load) sequence we could try a (select, store,
load) sequence. In this example, if we detect the case of switching to
ourselves (perhaps after an idle wait), we can avoid both the store and load
of registers. (This is particularly useful if you have a RISC with 30-100
registers or more.) We actually coded the first version this way and ran the
operating system like this for a year. But due to the paucity of registers on
the 386, little advantage was gained.
The simpler arrangement seems to work better, because with (store, select,
load), we have more registers free to keep the values used by select, thus
reducing the number of memory accesses. Also, there is usually a delay between
when the processor idles and when an interrupt occurs (which will cause a
wakeup() and then a setrq()), so the store occurs during this delay time.
This in turn makes the delay shorter because only the load would need to be
completed to finish the swtch()! Thus, the average real time elapsed would
be shorter, because the store is overlapped with the wait!
Another change one could make is to move the select portion to the setrq()
function, and rely on a single comparison to determine if a switch or idle
needs to be done (having already pre-calculated which one it is). But this
adds to the complexity, and might not result in a gain, because the places
that matter have a setrq ( ) call just prior to swtch ( ). It might even be
slower, because there are more setrq ( ) calls than swtch ( ) calls. Sometimes
clever optimizations just move the problem around rather than improve the
situation.


Multiprocessing


Multiprocessing describes a system capable of managing multiple processors.
(It does not mean running multiple processes, which is called
"multiprogramming.") UNIX kernel paradigms we have mentioned are extensible to
multiprocessors (with semaphores and effort), because many of the problems
we've dealt with (serialization, deadlock prevention, blocking, and context
switching) apply here as well, but on a grander scale. With multiprocessors,
there's obviously even more competition for shared data structures, as each
processor may want the very same data object at the same moment in time.
The degree to which multiprocessors can be applied is often confusing. In the
common case, multiple processors can each run one process at a time --
parallelism exists at the process level. Such UNIX systems have "make"
programs that can run multiple, simultaneous compilations. A smaller set of
systems (including some Mach and Chorus systems) also permit multiple
processors to each do a part of a process simultaneously. This is a form of
"fine grain" parallelism. The mechanisms of multiprocessor operation within a
single process are facilitated by either threads or lightweight processes.
Threads are "nanotasks" -- in other words, the smallest possible state living
in an address space. They are inexpensive mechanisms added to existing processes.
However, most thread programming models use different primitives for dealing
with threads than for processes.
Lightweight processes (LWP) are "nanoprocesses" that may share all, or a
portion, of an address space. They are treated by the system just like
processes, but share resources, so they are "lighter" than UNIX processes.
Lightweight processes are handled with primitives similar or identical to
those used for processes in general.
If you are beginning to notice that the disagreement between these two
approaches comes down to working "inside out" versus "outside in," you're
dead on the money. Reading Swift is an excellent exercise for those unclear on
such conflicts, as the residents of Lilliput and Blefuscu well know!


Adding Multiprocessing to 386BSD


Current versions of 386BSD have the sole goal of running on a uniprocessor
machine, but that's not to say it will always be this way. Should we wish to
extend it to multiple processors, we would need to consider a number of
issues.
386BSD is a "monolithic" kernel: All its functionality is built in a single
program known as the kernel. On a uniprocessor, this kernel program is
multitasking, in a sense, to provide multiple processes with system call
functionality. On a multiprocessor, the same program would be present, but
would have to support multiprocessing as well.
Among the most significant modifications to 386BSD would be changes to the
subroutines that block, unblock, and select processes, so that they serialize
access to process state. System call requests from processes in 386BSD are
always processed with a process pointer, so the state within the process
structure and the process's kernel stack would be uniquely accessed by that
very processor. Other objects, such as the corresponding user process's
address space, files, file descriptors, buffers, and the like, would then be
arbitrated for by a series of "spin" locks and reference-checked on allocation
and deletion. Much of this has already been anticipated in the current version
of the system, unlike previous editions of BSD.
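A spin lock guarding a reference count might look like the following sketch. It is not 386BSD code: it uses the C11 <stdatomic.h> facilities, which postdate this article, as a stand-in for whatever atomic test-and-set the hardware would provide, and the names shared_obj, obj_hold, and obj_release are hypothetical.

```c
#include <stdatomic.h>

struct shared_obj {
    atomic_flag lock;   /* spin lock guarding refcnt */
    int refcnt;         /* number of current holders */
};

void obj_init(struct shared_obj *o)
{
    atomic_flag_clear(&o->lock);
    o->refcnt = 0;
}

void spin_lock(atomic_flag *l)
{
    /* Busy-wait until the previous value of the flag was clear. */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;
}

void spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Take a reference; returns the new count. */
int obj_hold(struct shared_obj *o)
{
    int n;
    spin_lock(&o->lock);
    n = ++o->refcnt;
    spin_unlock(&o->lock);
    return n;
}

/* Drop a reference; returns 1 when the last holder has let go,
 * meaning the caller may now delete the object. */
int obj_release(struct shared_obj *o)
{
    int last;
    spin_lock(&o->lock);
    last = (--o->refcnt == 0);
    spin_unlock(&o->lock);
    return last;
}
```

Spinning is only reasonable here because the critical section is a handful of instructions; anything that might block would need a sleeping lock instead.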
How might such extensions allow multiprocessing inside a single process? Well,
processes could share all or part of an address space, so they could become
lightweight by not requiring a complete copy of all of a process's parts (as
with vfork(), for example). (See the sidebar entitled "Brief Notes: Lightweight
Processes and Threads" in the September issue.) By virtue of "gang"
scheduling, a set of shared processes could all become active, each individual
one per processor. Debugging such an arrangement would be identical in form to
multiple process debugging.


Reflection: Why is it Hard to Add Multiprogramming After the Fact?


UNIX has been around for over 20 years and predates other operating systems
which have come (and gone) such as CP/M, MS-DOS, and Finder. So, why has
multiprogramming come to mass market platforms such as the PC and Mac so late?
Both MS-DOS and Finder were written with some multiprogramming "writing on the
wall" in mind, but it's been a long road between "it's going to be there" and
"it's here," and it's not common yet.
Windows/386 accomplishes some aspects of multiprogramming (somewhat like UNIX)
through a hardware trick: by simulating PCs via the virtual 8086 mode, with
the actual Windows kernel running in protected mode. Multifinder (System 7.0)
on the Macintosh also attempts multiprogramming, but the price is that no
safety nets are held out for naughty programs. (In other words, programs were
doing things they should never have been doing, so they should be changed to
work appropriately. This does not go over well on the applications circuit, to
say the least.)
What was the problem? The experiences of the past were not heeded in many
areas, and an appropriate model was not completely thought out. (In addition,
the cost of an MMU was considered "too high" for the PC and Mac.)
These systems did not separate the application program from the kernel, as
UNIX has done since its early days on the PDP-11/45. This meant that missing
functionality in the system could be bestowed by a clever applications
program, but there was a downside. These applications became far more
intimate with the system's internals than its designers probably desired.
It's hard to believe that MS-DOS redirectors and TSRs were anything but a bad
dream to such designers, who had different plans for the future.
The drivers and other portions of the system were written expecting
synchronous operation without preemption by the system. This precludes
multiprogramming, because the system can't run another program in the idle
time waiting for the disk. (Early versions of UNIX suffered this flaw as
well.) To put this in UNIX terminology, the top half of the system blurred
together with the bottom half, because it didn't matter with nonmultitasking
systems such as MS-DOS and Finder.
In general, the lessons learned from UNIX (and other operating systems) are
often ignored by operating systems designers. The desire to "create from the
bottom up" can result in short-sighted and incomplete designs which are
difficult to rectify later. A good operating systems designer should attempt
to leverage as much as possible from other efforts, sorting out the good from
the bad, and then proceeding onwards with a new set of goals. After all, we
don't go ahead and design the microprocessor, design and build the hardware,
port the operating system, and then write the applications programs all at
once, do we? (Well, we wouldn't recommend this route, but we have actually
done three of the four, and it was not easy.)
Remember also that shortcuts that appear not to matter often come back to
bite, so trade-offs should be carefully hashed out before a decision is made.
That's why we discuss why we didn't do something as well as what we did do in
386BSD. These rules are pertinent to all operating system design, not just
UNIX.
After understanding the broad implications of multiprogramming, along with the
minutiae which make it possible, it's impressive that it's all based on a
simple set of conventions. Through a careful understanding of these
conventions and how they are implemented, we gain appreciation for how a
simple model, carefully arranged, can offer much years down the line.



Onward and Forward


The mechanics of processes and context switching, coupled with a basic
understanding of multiprogramming, multiprocessing, and multitasking, form
some of the key components of a true UNIX system. These fundamental constructs
were incorporated into the original design of UNIX, with the result that
extending it into the realms of multiprocessing, for example, becomes a
plausible goal not buffeted by contradictory design elements (as in MS-DOS,
for example). Any operating system which purports to be multiprogramming must
meet the definitions and constructs of a multiprogramming system, else it
quite simply is not.
As we stated earlier, we are working on many areas of this 386BSD port at
once, so we will be returning to our main ( ) procedure (see DDJ August 1991)
next month to continue our discussion in more detail and focus on the
primitives and organization which impact device drivers. In particular, we
will examine important areas such as auto-configuration, the enabling
operation of the PC hardware devices, splX() (interrupt vector-level
management), and the interrupt vector code. The following month, after having
laid the groundwork for our UNIX device drivers, we will discuss sample device
drivers. In particular, we will examine in detail some of the code required
for the console, disk, and clock interrupt drivers. The basic structure of
these drivers, minimal requirements, and extending the functionality through
procedures such as disklabels will also be discussed.

_PORTING UNIX TO THE 386: MULTIPROGRAMMING AND MULTITASKING_
by William Frederick Jolitz and Lynne Greer Jolitz


[LISTING ONE]

/* code fragment from i386/trap.c (in trap() and syscall()) */
 ...
 if (want_resched) {
 /*
 * Enqueue our current running process first, so
 * that we may eventually run again. Block clock
 * interrupts that may interfere with priority
 * (e.g. we'd rather it not be recalculated part
 * way thru setrun).
 */
 (void) splclock();
 setrq(p);
 (void) splnone();
 p->p_stats->p_ru.ru_nivcsw++;
 swtch();
 while (i = CURSIG(p))
 psig(i);
 }
 ...







[LISTING TWO]

/*-
 * Copyright (c) 1982, 1986, 1990 The Regents of the University of California.
 * Copyright (c) 1991 The Regents of the University of California.
 * All rights reserved.
 */

/*
 * General sleep call.
 * Suspends current process until a wakeup is made on chan.
 * The process will then be made runnable with priority pri.
 * Sleeps at most timo/hz seconds (0 means no timeout).
 * If pri includes PCATCH flag, signals are checked
 * before and after sleeping, else signals are not checked.
 * Returns 0 if awakened, EWOULDBLOCK if the timeout expires.
 * If PCATCH is set and a signal needs to be delivered,
 * ERESTART is returned if the current system call should be restarted
 * if possible, and EINTR is returned if the system call should
 * be interrupted by the signal (return EINTR).
 */

tsleep(chan, pri, wmesg, timo)
 caddr_t chan;
 int pri;
 char *wmesg;
 int timo;
{
 register struct proc *p = curproc;
 register struct slpque *qp;
 register s;
 int sig, catch = pri & PCATCH;
 extern int cold;
 int endtsleep();

 s = splhigh();
 if (cold || panicstr) {
 /*
 * After a panic, or during autoconfiguration,
 * just give interrupts a chance, then just return;
 * don't run any other procs or panic below,
 * in case this is the idle process and already asleep.
 */
 splx(safepri);
 splx(s);
 return (0);
 }

#ifdef DIAGNOSTIC
 if (chan == 0 || p->p_stat != SRUN || p->p_rlink)
 panic("tsleep");
#endif

 p->p_wchan = chan;
 p->p_wmesg = wmesg;
 p->p_slptime = 0;
 p->p_pri = pri & PRIMASK;

 /* Insert onto the tail of a sleep queue list. */
 qp = &slpque[HASH(chan)];
 if (qp->sq_head == 0)
 qp->sq_head = p;
 else
 *qp->sq_tailp = p;
 *(qp->sq_tailp = &p->p_link) = 0;

 /*
 * If time limit to sleep, schedule a timeout
 */
 if (timo)
 timeout(endtsleep, (caddr_t)p, timo);

 /* We put ourselves on the sleep queue and start our timeout
 * before calling CURSIG, as we could stop there, and a wakeup
 * or a SIGCONT (or both) could occur while we were stopped.
 * A SIGCONT would cause us to be marked as SSLEEP
 * without resuming us, thus we must be ready for sleep
 * when CURSIG is called. If the wakeup happens while we're
 * stopped, p->p_wchan will be 0 upon return from CURSIG.
 */
 if (catch) {
 p->p_flag |= SSINTR;
 if (sig = CURSIG(p)) {
 if (p->p_wchan)
 unsleep(p);
 p->p_stat = SRUN;
 goto resume;
 }
 if (p->p_wchan == 0) {
 catch = 0;
 goto resume;
 }
 }

 /* Set process sleeping, go find another process to run */
 p->p_stat = SSLEEP;
 p->p_stats->p_ru.ru_nvcsw++;
 swtch();

resume:
 splx(s);
 p->p_flag &= ~SSINTR;

 /* cleanup timeout case */
 if (p->p_flag & STIMO) {
 p->p_flag &= ~STIMO;
 if (catch == 0 || sig == 0)
 return (EWOULDBLOCK);
 } else if (timo)
 untimeout(endtsleep, (caddr_t)p);

 /* if signal was caught, return appropriately */
 if (catch && (sig != 0 || (sig = CURSIG(p)))) {
 if (p->p_sigacts->ps_sigintr & sigmask(sig))
 return (EINTR);
 return (ERESTART);
 }
 return (0);
}








[LISTING THREE]

/*-
 * Copyright (c) 1982, 1986, 1990 The Regents of the University of California.
 * Copyright (c) 1991 The Regents of the University of California.
 * All rights reserved.
 */

/* Wakeup on "chan"; set all processes
 * sleeping on chan to run state.
 */
wakeup(chan)
 register caddr_t chan;

{
 register struct slpque *qp;
 register struct proc *p, **q;
 int s;

 s = splhigh();
 qp = &slpque[HASH(chan)];

restart:
 for (q = &qp->sq_head; p = *q; ) {
#ifdef DIAGNOSTIC
 if (p->p_rlink || p->p_stat != SSLEEP && p->p_stat != SSTOP)
 panic("wakeup");
#endif
 if (p->p_wchan == chan) {
 p->p_wchan = 0;
 *q = p->p_link;
 if (qp->sq_tailp == &p->p_link)
 qp->sq_tailp = q;
 if (p->p_stat == SSLEEP) {
 /* OPTIMIZED INLINE EXPANSION OF setrun(p) */
 if (p->p_slptime > 1)
 updatepri(p);
 p->p_slptime = 0;
 p->p_stat = SRUN;
 if (p->p_flag & SLOAD)
 setrq(p);
 /*
 * Since curpri is a usrpri,
 * p->p_pri is always better than curpri.
 */
 if ((p->p_flag&SLOAD) == 0)
 wakeup((caddr_t)&proc0);
 else
 need_resched();
 /* END INLINE EXPANSION */
 goto restart;
 }
 } else
 q = &p->p_link;
 }
 splx(s);
}
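The sleep queue handling in Listings Two and Three relies on a tail pointer that addresses the last link field rather than the last process. The following self-contained sketch isolates that idiom. The structures are cut down to the two fields involved, and slpque_init, slpque_append, and slpque_remove are hypothetical names; the kernel does this work inline in tsleep() and wakeup().

```c
#include <stddef.h>

struct proc {
    struct proc *p_link;        /* next process on the same sleep queue */
    int p_id;                   /* illustrative only */
};

struct slpque {
    struct proc *sq_head;       /* first sleeper, or NULL */
    struct proc **sq_tailp;     /* address of the last p_link field */
};

void slpque_init(struct slpque *qp)
{
    qp->sq_head = NULL;
    qp->sq_tailp = &qp->sq_head;
}

/* Append p at the tail, as tsleep() does. */
void slpque_append(struct slpque *qp, struct proc *p)
{
    if (qp->sq_head == NULL)
        qp->sq_head = p;
    else
        *qp->sq_tailp = p;
    qp->sq_tailp = &p->p_link;
    p->p_link = NULL;
}

/* Unlink target, as wakeup() does; returns 1 if it was found. */
int slpque_remove(struct slpque *qp, struct proc *target)
{
    struct proc **q, *p;

    for (q = &qp->sq_head; (p = *q) != NULL; q = &p->p_link) {
        if (p == target) {
            *q = p->p_link;                 /* bypass p */
            if (qp->sq_tailp == &p->p_link)
                qp->sq_tailp = q;           /* p was the tail */
            return 1;
        }
    }
    return 0;
}
```

Walking the list with a pointer to the link field (q) means removal needs no special case for the head of the queue.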






[LISTING FOUR]

/* Copyright (c) 1989, 1990, 1991 William Jolitz. All rights reserved.
 * Written by William Jolitz 6/89
 *
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 */

/* Swtch() */
ENTRY(swtch)

 incl _cnt+V_SWTCH

 /* switch to new process. first, save context as needed */
 movl _curproc, %ecx
 movl P_ADDR(%ecx), %ecx

 /* unload processor registers, we need to use them */
 movl (%esp),%eax
 movl %eax, PCB_EIP(%ecx)
 movl %ebx, PCB_EBX(%ecx)
 movl %esp, PCB_ESP(%ecx)
 movl %ebp, PCB_EBP(%ecx)
 movl %esi, PCB_ESI(%ecx)
 movl %edi, PCB_EDI(%ecx)

 /* save system related details */
 movl $0,_CMAP2 /* blast temporary map PTE */
 movw _cpl, %ax
 movw %ax, PCB_IML(%ecx) /* save ipl */

 /* save is done, now choose a new process or idle */
rescanfromidle:
 movl _whichqs,%edi
2:
 bsfl %edi,%eax /* found a full queue? */
 jz idle /* if nothing, idle waiting for some */

 /* we have a queue with something in it */
 btrl %eax,%edi /* clear queue full status */
 jnb 2b /* if it was clear, look for another */
 movl %eax,%ebx /* save which one we are using */

 /* obtain the run queue header */
 shll $3,%eax
 addl $_qs,%eax
 movl %eax,%esi

#ifdef DIAGNOSTIC
 /* queue was promised to have a process in it */
 cmpl P_LINK(%eax),%eax /* linked to self? (e.g. not on list) */
 je panicswtch /* not possible */
#endif

 /* unlink from front of process q */
 movl P_LINK(%eax),%ecx
 movl P_LINK(%ecx),%edx
 movl %edx,P_LINK(%eax)
 movl P_RLINK(%ecx),%eax
 movl %eax,P_RLINK(%edx)

 /* is the queue truly empty? */
 cmpl P_LINK(%ecx),%esi
 je 3f

 btsl %ebx,%edi /* nope, set to indicate full */
3:
 movl %edi,_whichqs /* update queue status */

 /* notify system we've rescheduled */
 movl $0,%eax
 movl %eax,_want_resched

#ifdef DIAGNOSTIC
 /* process was ensured to be runnable, not sleeping */
 cmpl %eax,P_WCHAN(%ecx)
 jne panicswtch
 cmpb $SRUN,P_STAT(%ecx)
 jne panicswtch
#endif

 /* isolate process from run queues */
 movl %eax,P_RLINK(%ecx)

 /* record details of newproc in our global variables */
 movl %ecx,_curproc
 movl P_ADDR(%ecx),%edx
 movl %edx,_curpcb
 movl PCB_CR3(%edx),%ebx

 /* switch address space */
 movl %ebx,%cr3

 /* restore context */
 movl PCB_EBX(%edx), %ebx
 movl PCB_ESP(%edx), %esp
 movl PCB_EBP(%edx), %ebp
 movl PCB_ESI(%edx), %esi
 movl PCB_EDI(%edx), %edi
 movl PCB_EIP(%edx), %eax
 movl %eax, (%esp)

#ifdef NPX
 /* npx will interrupt next instruction, delay npx switch till then */
#define CR0_TS 0x08
 movl %cr0,%eax
 orb $CR0_TS,%al /* disable it */
 movl %eax,%cr0
#endif

 /* set priority level we were at last time */
 pushl PCB_IML(%edx)
 call _splx
 popl %eax

 movl %edx,%eax /* return (1); (actually, non-zero) */
 ret

/* When no processes are on the runq, Swtch branches to idle
 * to wait for something to come ready.
 */
 .globl Idle
Idle:
idle:

 call _spl0
 cmpl $0,_whichqs
 jne rescanfromidle
 hlt /* wait for interrupt */
 jmp idle
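The queue selection at the top of swtch() can be modelled in portable C. This is a sketch of the logic only: the loop stands in for the 386 bsfl instruction, and lowest_set_bit and pick_queue are hypothetical names that do not appear in the kernel.

```c
#include <stdint.h>

/* Index of the lowest set bit in mask, or -1 if mask is zero.
 * Models "bsfl %edi,%eax" followed by "jz idle". */
int lowest_set_bit(uint32_t mask)
{
    int i;

    for (i = 0; i < 32; i++)
        if (mask & ((uint32_t)1 << i))
            return i;
    return -1;
}

/* Choose the run queue to service. Returns -1 to mean "go idle".
 * On success the chosen bit is cleared in *pending, as btrl does;
 * the caller sets it again if the queue turns out to be non-empty. */
int pick_queue(uint32_t *pending)
{
    int q = lowest_set_bit(*pending);

    if (q >= 0)
        *pending &= ~((uint32_t)1 << q);
    return q;
}
```

Because the queue index is derived from the priority, the lowest set bit always names the best-priority queue that has work.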









[LISTING FIVE]

/* Copyright (c) 1989, 1990 William Jolitz. All rights reserved.
 * Written by William Jolitz 7/91
 * Redistribution and use in source and binary forms are freely permitted
 * provided that the above copyright notice and attribution and date of work
 * and this paragraph are duplicated in all such forms.
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
 * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 */

/*
 * Enqueue a process on a run queue. Process will be on a run queue
 * until run for a time slice (swtch()), or removed by remrq().
 * Should only be called with a running process, and with the
 * processor protecting against rescheduling.
 */
setrq(p) struct proc *p; {
 register rqidx;
 struct prochd *ph;
 struct proc *or;

 /* Rescale 256 priority levels to fit into 32 queue headers */
 rqidx = p->p_pri / 4;

#ifdef DIAGNOSTIC
 /* If this process is already linked on run queue, we're in trouble. */
 if (p->p_rlink != 0)
 panic("setrq: already linked");
#endif

 /* Link this process on the appropriate queue tail */
 ph = qs + rqidx;
 p->p_link = (struct proc *)ph;
 or = p->p_rlink = ph->ph_rlink;
 ph->ph_rlink = or->p_link = p;

 /* Indicate that this queue has at least one process in it */
 whichqs |= (1<<rqidx);
}

/* Dequeue a process from the run queue it's stuck on. Must be called
 * with rescheduling clock blocked.
 */

remrq(p) struct proc *p; {
 register rqidx;
 struct prochd *ph;

 /* Rescale 256 priority levels to fit into 32 queue headers */
 rqidx = p->p_pri / 4;

#ifdef DIAGNOSTIC
 /* If a run queue is empty, something is definitely wrong */
 if ((whichqs & (1<<rqidx)) == 0)
 panic("remrq");
#endif

 /* Unlink process off doubly-linked run queue */
 p->p_link->p_rlink = p->p_rlink;
 p->p_rlink->p_link = p->p_link;

 /* If something is still present on the queue,
 * set the corresponding bit. Otherwise clear it.
 */
 ph = qs + rqidx;
 if (ph->ph_link == ph)
 whichqs &= ~(1<<rqidx);
 else
 whichqs |= (1<<rqidx);

 /* Mark this process as unlinked */
 p->p_rlink = (struct proc *) 0;
}
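The run queue scheme in Listing Five can be exercised outside the kernel with the sketch below. It assumes priorities in the range 0 to 127, so that dividing by four yields one of 32 queue indexes. The type and function names (node, rproc, rq_init, rq_insert, rq_remove) are hypothetical, and a shared node structure replaces the casts between struct proc and struct prochd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NQS 32

struct node { struct node *link, *rlink; };  /* forward and back links */

struct rproc {
    struct node n;      /* first member, so headers and processes chain alike */
    int pri;            /* assumed range 0..127 */
};

static struct node qs[NQS];     /* circular list headers */
static uint32_t whichqs;        /* bit i set: queue i is non-empty */

void rq_init(void)
{
    int i;

    for (i = 0; i < NQS; i++)
        qs[i].link = qs[i].rlink = &qs[i];   /* empty: header points to itself */
    whichqs = 0;
}

/* Link p at the tail of its queue and mark the queue non-empty. */
void rq_insert(struct rproc *p)
{
    struct node *ph, *tail;

    assert(p->pri >= 0 && p->pri < 4 * NQS);
    ph = &qs[p->pri / 4];
    tail = ph->rlink;
    p->n.link = ph;
    p->n.rlink = tail;
    tail->link = &p->n;
    ph->rlink = &p->n;
    whichqs |= (uint32_t)1 << (p->pri / 4);
}

/* Unlink p and clear the queue's bit if nothing remains on it. */
void rq_remove(struct rproc *p)
{
    struct node *ph = &qs[p->pri / 4];

    p->n.link->rlink = p->n.rlink;
    p->n.rlink->link = p->n.link;
    if (ph->link == ph)
        whichqs &= ~((uint32_t)1 << (p->pri / 4));
    p->n.link = p->n.rlink = NULL;
}
```

Note that both routines update whichqs with |= and &=, never with plain assignment, so that the bits belonging to other queues are preserved.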

































October, 1991
XALLOC: AN EXPANDED MEMORY MANAGER


An expanded memory manager for Turbo Pascal




Herbert Gintis


Herb is a professor of economics at the University of Massachusetts, Thompson
Hall, Amherst, MA 01003.


This article describes a Turbo Pascal 5.5 unit that implements the expanded
memory equivalent of the Turbo Pascal dynamic memory functions getmem,
freemem, memavail, and maxavail. The new functions -- xgetmem, xfreemem,
xmemavail, and xmaxavail -- mirror their main memory counterparts as much as
possible (see Table 1). However, the peculiarities of expanded memory do
require some special treatment.
Table 1: The Xalloc unit interface section

 Function                          Description
 ------------------------------------------------------------------------
 type                              An address in memory is a pointer, but
   xaddress = record               in expanded memory, it must hold the
     page : byte                   expanded memory page (page) and the
     offset : word                 offset from the start of the page
   end                             (offset). We call this an xaddress.

 const                             xgetmem returns this page in an
   nilpage = $ff;                  xaddress to indicate that the request
                                   was denied.

 function xalloc_init: boolean;    Initializes the expanded memory
                                   manager. Must be called before any
                                   other routine in the unit. Returns
                                   true if it's okay to use the model,
                                   and false otherwise.

 procedure xalloc_done;            Releases all expanded memory. Invoked
                                   as part of a program's exit procedure.

 procedure xgetmem (var x:         Requests a block of size bytes for
   xaddress; size: word);          variable x. The page and offset are
                                   returned in x. Returns nilpage if
                                   denied.

 procedure xfreemem (var x:        Returns the block of expanded memory to
   xaddress; size : word);         the pool.

 function page_in (var x:          Returns the address in memory of the
   xaddress): pointer;             block of expanded memory owned by
                                   x. Since this function also moves the
                                   page to main memory, it must be
                                   called each time the variable x is
                                   accessed, if there is a chance it
                                   was paged out of memory.

 function xmaxavail:               Returns the maximum block size
   longint;                        available.

 function xmemavail:               Returns the total amount of expanded
   longint;                        memory available.

Expanded memory consists of pages, and a variable in expanded memory is
identified by its page and the offset of the variable's storage in the page.
We represent an expanded memory location by an xaddress record, consisting of
a page and an offset; and xgetmem returns such an xaddress to the calling
program. To access a variable in expanded memory, we must move the page of
expanded memory into main memory and tell the calling program its location in
main memory. This is accomplished by page_in (declared as xpage_in in Listing
Two), which takes the xaddress as an argument and returns the main memory
address as a pointer. At the same time,
it shuffles the proper page into main memory if it isn't already there.
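The need to call page_in before every access can be illustrated with a small simulation, written in C for brevity rather than in Turbo Pascal. It is a model under stated assumptions, not the unit's implementation: real EMS hardware remaps a page into the page frame, whereas this sketch copies pages between a backing array and a single frame buffer. The names backing, frame, and mapped are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 0x4000        /* 16K, the EMS page size */
#define NPAGES    4             /* arbitrary size for the simulation */

struct xaddress {
    unsigned char  page;        /* expanded memory page */
    unsigned short offset;      /* offset from the start of the page */
};

static unsigned char backing[NPAGES][PAGE_SIZE]; /* stand-in for expanded memory */
static unsigned char frame[PAGE_SIZE];           /* stand-in for the page frame */
static int mapped = -1;                          /* page now in the frame */

/* Make x's page visible in the frame and return its main memory address. */
void *page_in(const struct xaddress *x)
{
    if (x->page >= NPAGES || x->offset >= PAGE_SIZE)
        return NULL;
    if (mapped != x->page) {
        if (mapped >= 0)
            memcpy(backing[mapped], frame, PAGE_SIZE);  /* page out */
        memcpy(frame, backing[x->page], PAGE_SIZE);     /* page in */
        mapped = x->page;
    }
    return frame + x->offset;
}
```

Two xaddress values with the same offset on different pages yield the same pointer, so a pointer kept from an earlier page_in call silently refers to the wrong data once another page has been brought in.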
It's very important to call page_in every time an expanded memory location or
page is addressed, because some other routine may have shuffled it back out
(through an interrupt, for example). The best way to ensure this is by
building expanded memory management into the objects that use xalloc. For
example, see the string-type object xline created in Listing One (page 121).
The xalloc unit (see Listing Two, page 121) behaves in a manner almost
identical to Turbo Pascal's memory manager. When a block of memory is
requested by xgetmem, and if there are no "holes" in the expanded memory the
unit controls, the block is taken from the heap -- the unused memory at the
top of a page. If there is not enough room on the heap (there's actually a
different heap for each page), a new page is allocated. When a block is
deallocated by xfreemem, there are two possibilities: If the block is at the
end of the heap, the heap is simply expanded; if the block is in the middle of
an allocated block of memory, it is placed on a list of free memory "holes".
Before checking the heap, xgetmem checks this array for a suitable block. If
it finds one, the list of free memory holes is adjusted accordingly.
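The allocation strategy just described, holes first and then the per-page heap, can be condensed into the following C sketch. It is a simplification under stated assumptions: the page count is fixed, no EMS calls are made, and adjacent holes are not merged on release as the Pascal unit attempts to do. The names x_alloc, x_free, heap_top, and holes are hypothetical.

```c
#define XPAGE_SIZE 0x4000u      /* 16K per page */
#define XNPAGES    4            /* arbitrary for the sketch */
#define XMAXHOLES  16
#define XNILPAGE   0xff         /* returned page when a request is denied */

struct xaddr { unsigned char page; unsigned offset; };
struct hole  { unsigned char page; unsigned start, stop; };

static unsigned heap_top[XNPAGES];      /* first unused byte of each page */
static struct hole holes[XMAXHOLES];    /* free blocks below the heap tops */
static int nholes;

struct xaddr x_alloc(unsigned size)
{
    struct xaddr x = { XNILPAGE, 0 };
    int i;

    /* First fit among the holes. */
    for (i = 0; i < nholes; i++) {
        if (size <= holes[i].stop - holes[i].start) {
            x.page = holes[i].page;
            x.offset = holes[i].start;
            holes[i].start += size;
            if (holes[i].start == holes[i].stop)
                holes[i] = holes[--nholes];     /* hole used up */
            return x;
        }
    }
    /* Otherwise take the block from the top of some page's heap. */
    for (i = 0; i < XNPAGES; i++) {
        if (size <= XPAGE_SIZE - heap_top[i]) {
            x.page = (unsigned char)i;
            x.offset = heap_top[i];
            heap_top[i] += size;
            return x;
        }
    }
    return x;                                   /* denied */
}

void x_free(struct xaddr x, unsigned size)
{
    if (x.page >= XNPAGES)
        return;
    if (x.offset + size == heap_top[x.page]) {
        heap_top[x.page] = x.offset;            /* block was at the heap top */
    } else if (nholes < XMAXHOLES) {
        holes[nholes].page = x.page;
        holes[nholes].start = x.offset;
        holes[nholes].stop = x.offset + size;
        nholes++;
    }
}
```

As in the unit, a block freed when the hole table is full is simply lost, which is one reason the text lists better garbage collection as a possible improvement.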
There are clear ways to improve the unit. One improvement is to switch
automatically to main memory or disk, if expanded memory is not implemented or
is exhausted. Another is to improve garbage collection and memory compaction.

_XALLOC: AN EXPANDED MEMORY MANAGER FOR TURBO PASCAL_
by Herbert Gintis


[LISTING ONE]

unit xlineobj;

{ Typical use:
 program xtest;
 uses xalloc,xlineobj;
 var
 s : xline;
 begin
 if not xalloc_init then halt;
 s.init;
 s.put_text('This goes into expanded memory');
 writeln(s.get_text);
 s.done;
 xalloc_done;
 end.
}
interface

uses xalloc;

type
 xline = object
 len : byte;
 mem : xaddress;
 constructor init;
 destructor done; virtual;
 procedure newsize(ncols : integer);
 function get_text : string;
 procedure put_text(s : string);
 end;

implementation

var
 xs : ^string;

constructor xline.init;
const
 mincols = 8;
begin
 xgetmem(mem,mincols);
 len := mincols-1;
 xs := xpage_in(mem);
 xs^ := '';
end;


destructor xline.done;
begin
 xfreemem(mem,len+1);
end;

procedure xline.newsize(ncols : integer);
begin
 xfreemem(mem,len+1);
 xgetmem(mem,ncols+1);
 xs := xpage_in(mem);
 len := ncols;
end;

function xline.get_text : string;
begin
 xs := xpage_in(mem);
 get_text := xs^;
end;

procedure xline.put_text(s : string);
begin
 if length(s) <> len then newsize(length(s));
 xs := xpage_in(mem);
 xs^ := s;
end;

end.





[LISTING TWO]

unit xalloc;
 {-See the unit xlineobj.pas for typical use of this unit}
interface

const
 nilpage = $ff;
type
 xaddress = record
 page : byte;
 pos : word;
 end;
function xalloc_init : boolean;
procedure xgetmem(var x : xaddress;size : word);
procedure xfreemem(var x : xaddress;size : word);
function xpage_in(var x : xaddress) : pointer;
function xmaxavail : longint;
function xmemavail : longint;
procedure xalloc_done;

implementation

uses crt,dos;

const
 emm_int = $67;
 dos_int = $21;
 maxfreeblock = 4000;
 xblocksize = $4000;
 _get_frame = $41;
 _unalloc_count = $42;
 _alloc_pages = $43;
 _map_page = $44;
 _dealloc_pages = $45;
 _change_alloc = $51;
type
 xheap = array[0..1000] of word;
 fblock = record
 page : byte;
 start,stop : word;
 end;
 fblockarray = array[1..maxfreeblock] of fblock;
var
 regs : registers;
 handle,tot_pages : word;
 xheapptr : ^xheap;
 xfreeptr : ^fblockarray;
 last_page,lastptr : integer;
 map : array[0..3] of integer;
 frame : word;

function ems_installed : boolean;
const
 device_name : string[8] = 'EMMXXXX0';
var
 i : integer;
begin
 ems_installed := false;
 with regs do begin {check for ems present}
 ah := $35; {get code segment pointed to by interrupt 67h}
 al := emm_int;
 intr(dos_int,regs);
 for i := 1 to 8 do if device_name[i] <> chr(mem[es : i + 9]) then exit;
 end;
 ems_installed := true;
end;

function unalloc_count(var available : word): boolean;
begin
 with regs do begin
 ah := _unalloc_count;
 intr(emm_int,regs);
 available := bx;
 unalloc_count := ah = 0 {return the error code}
 end;
end;

function alloc_pages(needed: integer): boolean;
begin
 with regs do begin
 ah := _alloc_pages;
 bx := needed;
 intr(emm_int,regs);
 handle := dx;
 alloc_pages := (ah = 0); {return the error code}
 end;
end;

function xdealloc_pages: boolean;
begin
 with regs do begin
 ah := _dealloc_pages;
 dx := handle;
 intr(emm_int,regs);
 xdealloc_pages := (ah = 0); {return the error code}
 end;
end;

function change_alloc(needed : integer) : boolean;
begin
 with regs do begin
 ah := _change_alloc;
 bx := needed;
 dx := handle;
 intr(emm_int,regs);
 change_alloc := (ah = 0); {return the error code}
 end;
end;

function xmap_page(l_page,p_page: integer): boolean;
begin
 xmap_page := true;
 if map[p_page] <> l_page then with regs do begin
 ah := _map_page;
 al := p_page;
 bx := l_page;
 dx := handle;
 intr(emm_int,regs);
 xmap_page := (ah = 0);
 if ah = 0 then map[p_page] := l_page;
 end;
end;

function xpage_in(var x : xaddress) : pointer;
begin
 if xmap_page(x.page,0) then xpage_in := ptr(frame,x.pos)
 else xpage_in := nil;
end;

function xget_frame(var frame: word): boolean;
begin
 with regs do begin
 ah := _get_frame;
 intr(emm_int,regs);
 frame := bx;
 xget_frame := (ah = 0); {return the error code}
 end;
end;

procedure xgetmem(var x : xaddress;size : word);
var
 i : integer;
begin
 for i := 1 to lastptr do begin
 with xfreeptr^[i] do begin
 if size <= stop - start then begin
 x.page := page;
 x.pos := start;
 inc(start,size);
 if start = stop then begin
 xfreeptr^[i] := xfreeptr^[lastptr];
 dec(lastptr);
 end;
 exit;
 end;
 end;
 end;
 x.page := nilpage;
 i := 0;
 repeat
 inc(i);
 if i > tot_pages then exit;
 if i > last_page then begin
 inc(last_page);
 if not change_alloc(last_page) then exit;
 end;
 until xblocksize - xheapptr^[pred(i)] > size;
 with x do begin
 page := pred(i);
 pos := xheapptr^[page];
 inc(xheapptr^[page],size);
 end;
end;

procedure xfreemem(var x : xaddress;size : word);
var
 i,xstop : integer;
begin
 xstop := x.pos + size;
 i := 0;
 while i < lastptr do begin
 inc(i);
 with xfreeptr^[i] do begin
 if x.page = page then begin
 if x.pos >= start then begin
 if x.pos <= stop then begin
 x.pos := start;
 if xstop < stop then xstop := stop;
 xfreeptr^[i] := xfreeptr^[lastptr];
 dec(lastptr);
 dec(i)
 end;
 end
 else if xstop >= start then begin
 if xstop < stop then xstop := stop;
 xfreeptr^[i] := xfreeptr^[lastptr];
 dec(lastptr);
 dec(i)
 end;
 end;
 end;
 end;
 if lastptr > 0 then with xfreeptr^[lastptr] do
 if start = stop then dec(lastptr);
 if x.pos < xstop then begin
 if xstop = xheapptr^[x.page] then xheapptr^[x.page] := x.pos
 else begin
 if lastptr < maxfreeblock then begin
 inc(lastptr);
 with xfreeptr^[lastptr] do begin
 page := x.page;
 start := x.pos;
 stop := xstop;
 end;
 end;
 end;
 end;
end;

function xmemavail : longint;
var
 s : longint;
 i : integer;
begin
 s := 0;
 for i := 0 to pred(tot_pages) do inc(s,$4000 - xheapptr^[i]);
 for i := 1 to lastptr do with xfreeptr^[i] do inc(s,stop - start);
 xmemavail := s;
end;

function xmaxavail : longint;
var
 s : longint;
 i : integer;
begin
 s := 0;
 for i := 0 to pred(tot_pages) do
 if $4000 - xheapptr^[i] > s then s := $4000 - xheapptr^[i];
 for i := 1 to lastptr do with xfreeptr^[i] do
 if stop - start > s then s := stop - start;
 xmaxavail := s;
end;

procedure xalloc_done;
begin
 if not xdealloc_pages then;
end;

function xalloc_init : boolean;
var
 i : word;
begin
 xalloc_init := false;
 if not ems_installed then exit;
 if not unalloc_count(tot_pages) then exit;
 if tot_pages = 0 then exit;
 if not xget_frame(frame) then exit;
 getmem(xheapptr,tot_pages*sizeof(word));
 if xheapptr = nil then exit;
 new(xfreeptr);
 if xfreeptr = nil then exit;
 for i := 0 to pred(tot_pages) do xheapptr^[i] := 0;
 if not alloc_pages(1) then exit;
 xalloc_init := true;
 lastptr := 0;
 last_page := 1;
 for i := 0 to 3 do map[i] := -1;
end;

end.






















































October, 1991
C++ FOR EMBEDDED SYSTEMS


Implementing embedded systems for the 80x86 takes a leap forward in
affordability




Stuart G. Phillips, N6TT0 and Kevin J. Rowett, N6RCE


Stuart is director of network products development at Tandem Computers in
Cupertino, Calif. Kevin is the project leader for TCP/IP development. You can
reach them at Tandem Computers, MS 201-02, 10501 N. Tantau Ave., Cupertino, CA
95014 or by e-mail at stu@tandem.com or kevinr@tandem.com.


Implementing and debugging embedded systems usually requires the designer to
use a logic analyzer or in-circuit emulator, devices that can make embedded
systems programming an expensive undertaking. With the software and techniques
discussed here, implementing embedded systems for the Intel 80x86 family takes
a quantum leap forward in affordability.
In this article, we describe how you can use Borland C++ (BC++) to develop
applications for embedded systems using the PC. The article is based on our
experience with using BC++ to develop software for an intelligent,
multifunction communications processor (MIO) that uses the NEC V40 (a
functional superset of the Intel 80188). We targeted MIO at the support of
high-speed digital radio links to isolate the PC from real-time processing
requirements. We developed software for MIO on the PC, then downloaded it into
the controller for execution and testing. Because of space constraints, the
complete software package for the MIO is available electronically; see
"Availability" on page 3 for details.
Here, we explain how to customize BC++ to support a non-PC environment and how
to convert DOS executable (EXE) files built using BC++ for an embedded system.
We'll also describe how to interface Borland's Turbo Debugger (TD) to the
system under test to speed up the debugging process.


The Startup Module


Most C (or C++) programmers think that the main( ) procedure is the first
piece of code in a program to be executed. BC++, however, invokes main( ) from
a startup module linked with the program. The startup module establishes the
environment for main( ), setting up the stack and heap, and creating the argc
and argv arguments from the command line. Finally, it calls main( ). The
startup module regains control and tidies up before returning to DOS when the
main program terminates by returning or calling exit( ).
Borland supplies several different versions of the startup module, depending
on the choice of memory model and the use of floating-point operations. When
using the Integrated Development Environment (IDE), the choice of memory model
(set by compiler options) determines the selection of startup module. You can
find the source code for the startup module (C0.ASM) in the EXAMPLES\STARTUP
directory created when you installed BC++.
In most cases, our embedded system won't be running DOS, so we have to modify
the startup module to make it DOS-independent.
A brief description of MIO will help you understand our particular
environment. We designed MIO as a communications processor for high-speed
radio links (see the sidebar entitled "Amateur Packet Radio"). We used the NEC
V40 because of its high degree of integration (DMA, interrupt controller,
timers, and so on -- all on the CPU) and its object code compatibility with
the Intel 80x86 family. We can map either 8- or 64-Kbytes of MIO memory into
the PC memory map as a shared memory window accessible to both MIO and the PC.
We use the shared memory window initially to download control programs into
MIO and subsequently for passing commands and data between the PC and MIO. MIO
hardware allows the PC and MIO to interrupt each other under program control
to provide synchronization between the two processors.
The file BUILD-C0.BAT shows the commands needed to build the startup module
for the various memory models. We support all memory models except the huge
model. Literals control conditional assembly to generate all memory models
from a single source file.
You must specify the startup module as the first module in the object file
list given as a parameter to the linker (TLINK). The startup module specifies
an entry point (STARTX) that the linker uses as the initial start address for
the program. Our startup module for MIO has a jump instruction at the entry
point to take execution to the start of the control program located above the
shared memory window.
We use a simple two-stage loader to take a control program and load it into
MIO. The majority of programs are larger than the size of the shared window,
so the loader first loads a program into MIO that copies data to and from the
shared window. The loader uses this small program to relocate blocks of code
and data into MIO memory, and then causes the control program to be executed.
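The second stage of such a loader amounts to cutting the program image into pieces no larger than the shared window. The following is a minimal sketch of that idea, not the loader's actual source; the names Block and plan_blocks are hypothetical, and the 8-Kbyte window in the usage note is just one of the two window sizes mentioned earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One transfer through the shared window: 'count' bytes of the image,
// starting 'image_offset' bytes from the beginning of the program.
struct Block {
    std::size_t image_offset;
    std::size_t count;
};

// Split an image of 'image_size' bytes into blocks that each fit in a
// window of 'window_size' bytes. (Hypothetical helper, for illustration.)
std::vector<Block> plan_blocks(std::size_t image_size, std::size_t window_size)
{
    assert(window_size > 0);            // a zero-sized window can move nothing
    std::vector<Block> blocks;
    for (std::size_t done = 0; done < image_size; ) {
        std::size_t left = image_size - done;
        std::size_t n = left < window_size ? left : window_size;
        blocks.push_back(Block{done, n});
        done += n;
    }
    return blocks;
}
```

Calling plan_blocks(20000, 8192) yields three blocks: two full windows and a final short one. In a real loader part of the window would also be reserved for the command area, so the usable block size would be somewhat smaller.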
The startup module begins its preparations for main( ) by setting the data
segment register to point to the static data area known as DGROUP. Depending
on the choice of memory model, DGROUP may contain all data and the stack or
only initialized and static data.
Now that we can access DGROUP, we must determine where to locate the stack. We
determine the size of the stack from the variable _stklen. Normally you
wouldn't declare this variable because Borland includes it in the standard C
library. We declare _stklen in the startup module because we make limited use
of the standard library. C programs can set the value of _stklen by
referencing it as an extern variable.
Regardless of the choice of memory model, we must check whether there is
enough memory available for the requested stack size. The startup module
contains a special symbol called edata@, located at the end of the static and
initialized data in DGROUP. We use this symbol to determine the last location
in memory occupied by the control program. We calculate the amount of free
memory by converting the address of edata@ into paragraphs and then
subtracting that value from the size of memory on MIO.
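The free-memory arithmetic is easy to get wrong by one paragraph, so a small worked sketch may help. It is written in C++ rather than the assembler of the startup module, the function names are hypothetical, and the memory size used in the usage note is an assumed figure, not the actual size of MIO memory.

```cpp
#include <cassert>
#include <cstdint>

// Convert a segment:offset address to paragraphs (16-byte units),
// rounding a partly used paragraph up to the next whole one.
std::uint32_t to_paragraphs(std::uint16_t segment, std::uint16_t offset)
{
    return static_cast<std::uint32_t>(segment) +
           (static_cast<std::uint32_t>(offset) + 15u) / 16u;
}

// Paragraphs left between the end of the program's data and the top of
// memory. 'memory_paragraphs' is the total memory size in paragraphs.
std::uint32_t free_paragraphs(std::uint16_t edata_seg,
                              std::uint16_t edata_off,
                              std::uint32_t memory_paragraphs)
{
    std::uint32_t used = to_paragraphs(edata_seg, edata_off);
    assert(used <= memory_paragraphs);  // program must fit in memory
    return memory_paragraphs - used;
}
```

With an assumed 512 Kbytes of memory (0x8000 paragraphs) and the end of data at 2000:0123, free_paragraphs(0x2000, 0x0123, 0x8000) gives 0x5FED paragraphs available for the stack and heap.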
The startup module applies several consistency checks on the size of the stack
and amount of free memory, depending on the choice of memory model. The choice
of memory model determines the location of the stack. (See the Borland C++
Programmer's Guide for a full explanation.) After setting the size and
location of the stack, the startup module calls all the initialization
procedures that the programmer specified through the #pragma startup compiler
directive. The startup module also uses the same code to call the termination
procedures specified using the #pragma exit directive when the main( )
procedure exits or returns. Borland uses self-modifying code to save memory by
sharing the code that calls the startup and exit procedures.
The startup module prepares dummy arguments for argc and argv and then calls
main( ). We pass all configuration and optional data to MIO through the shared
memory window.
You will need to modify the startup module to reflect the hardware environment
of your embedded system. The boot PROM on MIO takes care of initial hardware
configuration such as disabling interrupts, starting memory refresh, and so
on, eliminating the need to duplicate that code in our startup module.


Using the Standard Library


You need to exercise caution when considering the use of procedures in the
standard library. Borland designed the library to support a DOS environment on
the PC; therefore, it makes extensive use of BIOS and DOS function calls.
Using procedures that call the BIOS or DOS will likely cause your program to
enter hyperspace when executed on the embedded system.
You have three options to choose from when you need a library procedure:
Consider buying the library source code from Borland. The price might be an
obstacle, but think how long it would take you to implement all the procedures
you will need.
Consider implementing your own standard library if you don't need too many
support procedures. This isn't as much work as you may think because you
probably don't need 90 percent of the procedures in the standard library that
deal with such things as accessing files, console I/O, and so on.
Finally, before deciding to implement your own version of the procedure, you
can use the following procedure to check whether the library procedure invokes
BIOS or DOS functions.
Write a small test program that calls the procedure of interest; Borland
includes example programs in the library reference manual for each procedure.
Compile and link the example program specifying the options to include
symbolic debugging information. When you have a linked program, invoke the
debugger on the program and open a CPU window to display the CPU registers and
the code in instruction format. Position the display to the start of the
library procedure using the goto command. Page through the instruction decodes
looking for INT instructions; both the BIOS and DOS functions use INT
instructions to transfer execution from your program. The use of either BIOS
or DOS function calls means you will have to write your own version.
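If you would rather not page through the decodes by hand, a crude first filter is to scan the procedure's code bytes for the INT opcode. The sketch below is an alternative to the debugger procedure described above, not part of it; IntHit and find_int_opcodes are hypothetical names, and because the scan does not decode instructions it can report a 0xCD byte that is really data or part of an operand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A possible "INT n" instruction: where it was seen and which vector.
struct IntHit {
    std::size_t offset;
    std::uint8_t vector;
};

// Report every 0xCD byte (the 80x86 "INT imm8" opcode) together with
// the byte after it. This is a naive byte scan, so hits are only
// candidates to confirm in the debugger's CPU window.
std::vector<IntHit> find_int_opcodes(const std::uint8_t* code,
                                     std::size_t length)
{
    assert(code != nullptr || length == 0);
    std::vector<IntHit> hits;
    for (std::size_t i = 0; i + 1 < length; ++i) {
        if (code[i] == 0xCD) {
            hits.push_back(IntHit{i, code[i + 1]});
            ++i;                        // skip the vector byte
        }
    }
    return hits;
}
```

For the byte sequence B4 09 CD 21 C3 (MOV AH,9; INT 21h; RET) the scan reports one hit at offset 2 with vector 21h, the DOS function dispatcher. A clean scan is not proof of independence from DOS, since a procedure may call another library procedure that issues the interrupt.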


Converting DOS Executable Files


Unless your embedded system runs DOS, you won't be able to use the EXE file
generated by the linker without conversion. The linker produces relocatable
object code when it creates the EXE file. Normally the DOS program loader
takes care of "fixing up" the relocatable code based on the initial load
address of the program. Figure 1 shows the layout of the EXE file header
written by the linker. The linker creates a relocation item for every 16-bit
word in the program that requires relocation. Each relocation item contains
the segment and offset (relative to the base address of the program) of the
word needing relocation. The loader reads the word at the specified location
and adds to it the base address at which it loaded the program. The loader
writes back the relocated word to its original location.
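The fix-up step can be sketched in a few lines of C++. This illustrates the general DOS-style relocation just described and is not taken from the article's software; RelocItem and relocate are names made up for the example, and the image is assumed to be the program with the EXE header already stripped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One relocation item: the segment and offset, relative to the start of
// the program, of a 16-bit word that holds a segment value.
struct RelocItem {
    std::uint16_t offset;
    std::uint16_t segment;
};

// Add the load segment to every word named by a relocation item.
// Words are stored low byte first, as on the 80x86.
void relocate(std::vector<std::uint8_t>& image,
              const std::vector<RelocItem>& items,
              std::uint16_t load_segment)
{
    for (const RelocItem& item : items) {
        std::size_t pos =
            static_cast<std::size_t>(item.segment) * 16u + item.offset;
        assert(pos + 1 < image.size()); // item must lie inside the image
        std::uint16_t word = static_cast<std::uint16_t>(
            image[pos] | (image[pos + 1] << 8));
        word = static_cast<std::uint16_t>(word + load_segment);
        image[pos] = static_cast<std::uint8_t>(word & 0xFF);
        image[pos + 1] = static_cast<std::uint8_t>(word >> 8);
    }
}
```

For example, a word holding 1234h at 0001:0002 (byte 18 of the image) becomes 1244h when the program is loaded at segment 0010h.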
Figure 1: Format of EXE file header

 struct EXEHDR {
 unsigned short magic; // EXE file if 0x5a4d
 unsigned short nbytes; // Size of last page in bytes
 unsigned short npages; // Size of image in pages
 // 1 page = 512 bytes
 unsigned short nreloc; // Number of relocation items
 unsigned short hdrsize; // Size of EXE header in paragraphs
 unsigned short endmin; // Minimum memory
 unsigned short maxmem; // Maximum memory
 unsigned short ss_offset; // Stack segment offset
 unsigned short val_sp; // Initial value for SP
 unsigned short chksum;
 unsigned short val_ip; // Initial value for IP
 unsigned short cs_offset; // Code segment offset
 unsigned short rel_offset; // Offset of relocation items
 unsigned short ovl_num; // Overlay number
 };
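
As a worked example of how the size fields combine, the sketch below computes the size of the program image that follows the header. It relies on two properties of the DOS EXE format: the header size is counted in 16-byte paragraphs, and an nbytes value of 0 means the last 512-byte page is full. The function names are hypothetical and are not taken from COMF.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of the EXE file that belong to the program, header included.
std::uint32_t exe_file_bytes(std::uint16_t npages, std::uint16_t nbytes)
{
    assert(npages > 0);
    if (nbytes == 0)                    // last page is completely used
        return npages * 512u;
    return (npages - 1u) * 512u + nbytes;
}

// Bytes of code and data that follow the header.
std::uint32_t exe_image_bytes(std::uint16_t npages, std::uint16_t nbytes,
                              std::uint16_t hdrsize)
{
    std::uint32_t header_bytes = hdrsize * 16u;
    std::uint32_t file_bytes = exe_file_bytes(npages, nbytes);
    assert(header_bytes <= file_bytes); // header cannot exceed the file
    return file_bytes - header_bytes;
}
```

A file with npages = 3, nbytes = 100, and hdrsize = 2 occupies 1124 bytes, of which 1092 are program image.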

We build control programs for MIO just as we would build programs for DOS;
instead of the normal startup module, we use the modified version described
earlier. Before we can use the resulting EXE file on MIO, we must relocate the
object code based on its eventual load address in MIO memory. Rather than
perform the relocation as we download the control program into MIO, we opted
to use a separate utility called COMF. The source code for COMF is included in
the electronic distribution; see "Availability" on page 3 for details.
COMF takes the EXE file created by the linker and creates a relocated version
of the file as a MIO download file. We use the file extension LOD to identify
MIO download files. Each LOD file has a file header that includes a version
number (so the loader can determine whether it can process the file). It also
includes the entry point for the program, the size of the program, and the
time stamp of the original EXE file. Figure 2 shows the LOD header format.
Figure 2: Format of LOD file header

 struct LODHDR {
 unsigned short magic; // LOD file if 0x4655
 unsigned short version; // Version id of LOD header
 unsigned short val_offset; // Initial value for IP
 unsigned short val_seg; // initial value for CS
 long timestamp; // DOS time stamp of orig. EXE file
 long image_size; // Image size in bytes
 };

The logic for COMF is fairly straightforward and needs little explanation.
After validating the EXE file header, COMF builds the LOD header and then
begins the relocation process. The linker does not sort the relocation items
when it writes the EXE header; relocation items appear in random address
order. COMF reads in all relocation items and sorts them into ascending
address order before beginning the relocation process. By sorting the
relocation items we only need to make one pass through the EXE file as we
create the LOD file. We paid little attention to the efficiency of COMF
because the conversion process doesn't take long, even for large programs with
many relocation items.
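One way to picture the sort-then-single-pass approach is the sketch below. It is not the COMF source: RelocItem, linear_address, and relocate_one_pass are hypothetical names, the whole image is held in memory rather than read from a file, and relocation items are assumed not to overlap one another.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct RelocItem {
    std::uint16_t offset;
    std::uint16_t segment;
};

// Position of the item's word, in bytes from the start of the image.
std::uint32_t linear_address(const RelocItem& item)
{
    return static_cast<std::uint32_t>(item.segment) * 16u + item.offset;
}

// Copy 'image' front to back exactly once, adding 'load_segment' to each
// word named by a relocation item as its address comes up.
std::vector<std::uint8_t> relocate_one_pass(
    const std::vector<std::uint8_t>& image,
    std::vector<RelocItem> items,       // by value: sorted locally
    std::uint16_t load_segment)
{
    std::sort(items.begin(), items.end(),
              [](const RelocItem& a, const RelocItem& b) {
                  return linear_address(a) < linear_address(b);
              });
    std::vector<std::uint8_t> out;
    out.reserve(image.size());
    std::size_t next = 0;               // next relocation item to apply
    std::size_t pos = 0;
    while (pos < image.size()) {
        if (next < items.size() && linear_address(items[next]) == pos) {
            assert(pos + 1 < image.size());
            std::uint16_t word = static_cast<std::uint16_t>(
                image[pos] | (image[pos + 1] << 8));
            word = static_cast<std::uint16_t>(word + load_segment);
            out.push_back(static_cast<std::uint8_t>(word & 0xFF));
            out.push_back(static_cast<std::uint8_t>(word >> 8));
            pos += 2;
            ++next;
        } else {
            out.push_back(image[pos]);
            ++pos;
        }
    }
    assert(next == items.size());       // every item must have been used
    return out;
}
```

Because the items arrive in ascending address order, the copy never has to back up, which is what lets a file-based version write its output sequentially.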
The only optional parameter to COMF is the initial load address. By
convention, we load MIO control programs at address 0010:0000; COMF assumes
this address by default.
Loading a LOD file is simply a matter of copying the object code to MIO memory
and beginning execution. The loader will optionally implant the time stamp
from the LOD header at address 0000:00FC if we intend to use Turbo Debugger to
debug software on MIO.
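The two addresses above are easier to relate once they are converted from segment:offset form to linear addresses. The helper below is a small illustration with a hypothetical name; it uses the standard 80x86 real-mode rule that the segment is multiplied by 16 and added to the offset.

```cpp
#include <cstdint>

// Real-mode 80x86 address arithmetic: linear = segment * 16 + offset.
constexpr std::uint32_t linear_address(std::uint16_t segment,
                                       std::uint16_t offset)
{
    return (static_cast<std::uint32_t>(segment) << 4) + offset;
}

// Default load address for control programs.
static_assert(linear_address(0x0010, 0x0000) == 0x100, "load address");
// Location where the loader implants the time stamp.
static_assert(linear_address(0x0000, 0x00FC) == 0x0FC, "time stamp");
```

The default load address is therefore linear address 100h, and a 4-byte time stamp at 0FCh occupies the last four bytes below it.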


Debugging Using Turbo Debugger


The biggest challenge we face with an embedded system is debugging our
download programs. Without keyboard or display, getting a view of what's going
on within the embedded system requires a logic analyzer, in-circuit emulator,
or some other technique. With the aid of MIOTDREM, a program described in
Listing One (page 124) that runs on MIO and emulates TDREMOTE, we can use
Turbo Debugger (TD) to debug our download programs.
In normal DOS environments, Borland supplies a utility called TDREMOTE that
enables TD to control a remote PC over an asynchronous communication link. TD
passes commands to TDREMOTE for execution, and TDREMOTE returns the results.
Because MIOTDREM emulates TDREMOTE, we can use TD in the same way to debug our
download programs.
MIO supports four communications ports using two Zilog 85230 Enhanced Serial
Communication Controllers, or ESCCs. Zilog developed the 85230 as a functional
superset of the 8530; the ESCC has larger FIFOs than the 8530 and a number of
features that make it much easier to program.
The module COMM.C contains all the ESCC-dependent software; you can support
your own embedded system by writing a version of COMM.C that supports your
hardware. You will also have to change MAIN.C to initialize your hardware
rather than that of MIO. The remaining modules of MIOTDREM are
hardware-independent, apart from their 80x86 idiosyncrasies.
TD uses a simple protocol between itself and its remote partner over an
asynchronous link configured for 8-bit characters with no parity and 1 stop
bit. You must use a cable that crosses transmit and receive data between the
PC and embedded system. The TD manual describes the cable configuration in
some detail.
Each exchange of a message between TD and MIOTDREM begins with the sending of
a single byte that contains the length of data to follow. The receiver of the
length byte acknowledges receipt by sending back a NUL advising that the
transfer can continue. The initiator then sends the message. The protocol
provides no facilities such as a checksum or negative acknowledgment for error
detection or recovery. The protocol makes the assumption that the link between
TD and its remote partner is of high quality and not subject to errors. You
shouldn't have any problems if you keep the cable between the PC and embedded
system short (three to five feet of high-quality cable). Figure 3 shows the
protocol exchange graphically.
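The exchange can also be modeled as a small state machine. The sketch below simulates both ends in memory and is not the MIOTDREM source; MessageReceiver and send_message are hypothetical names, and the treatment of a zero length byte as an empty message is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Receiving side: a length byte, then that many data bytes.
class MessageReceiver {
public:
    // Feed one byte from the link. Returns true when the byte was a
    // length byte, meaning the receiver answers with a NUL.
    bool on_byte(std::uint8_t byte)
    {
        if (remaining_ == 0) {          // waiting for a length byte
            message_.clear();
            remaining_ = byte;
            complete_ = (byte == 0);    // assumed: zero means empty
            return true;
        }
        message_.push_back(byte);
        --remaining_;
        complete_ = (remaining_ == 0);
        return false;
    }
    bool complete() const { return complete_; }
    const std::vector<std::uint8_t>& message() const { return message_; }

private:
    std::vector<std::uint8_t> message_;
    unsigned remaining_ = 0;
    bool complete_ = false;
};

// Initiating side: send the length, wait for the NUL, send the data.
// Returns false if the length byte was not acknowledged.
bool send_message(MessageReceiver& peer,
                  const std::vector<std::uint8_t>& payload)
{
    assert(payload.size() <= 255);      // length must fit in one byte
    if (!peer.on_byte(static_cast<std::uint8_t>(payload.size())))
        return false;
    for (std::uint8_t b : payload)
        peer.on_byte(b);
    return true;
}
```

Nothing in this model detects a lost or corrupted byte: if a data byte goes missing, the receiver simply takes the next length byte as data. That is the weakness the text describes, and the reason for keeping the cable short.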
The procedure tdr_processor( ) in TDRPROC.C processes messages that arrive
from TD. tdr_processor( ) loops, waiting for the receive interrupt routine to
append a message to the message queue msgq. The receive interrupt handler
handles the handshake between TD and MIOTDREM, and then simply queues the
message. The structure td_imsg describes the incoming message formats. Each
incoming message starts with a byte command value followed by the parameters
to the command. Commands allow TD to read and write memory or IO registers,
set or move blocks of memory and read or write the processor registers.
tdr_processor( ) calls support procedures, depending on the command sent for
execution.
The module MACHINE.C contains all the machine-level procedures that handle the
transfer of control between MIOTDREM and the program being debugged. The
module includes interrupt service routines for divide by 0, single-step
(trace), and breakpoint interrupts. When TD sends the command to start
executing the program under test, MIOTDREM calls the procedure go_program() to
start execution. go_program( ) saves the current processor state on a private
stack and then sets the processor registers from the local register copy.
Control passes back to MIOTDREM whenever one of the three interrupts occurs or
you interrupt the program by pressing Ctrl-BREAK on the PC executing TD. The
procedures mc_brk0( ), mc_brk1( ), and mc_brk3( ) (the three interrupt service
routines) all call mc_return( ) to pass control back to MIOTDREM. mc_return( )
saves the state of the program under test and then manipulates the stack to
simulate a return from go_program( ) with a return value indicating the cause
for return. MIOTDREM passes the return value back to TD across the
asynchronous link.
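If the faked return seems obscure, the standard setjmp/longjmp pair provides a close analogy: one function saves the caller's context, and a later call "returns" from it a second time with a value. The sketch below is only an analogy and does not reproduce the stack manipulation in MACHINE.C; run_program and return_to_monitor play the roles of go_program( ) and mc_return( ), and the cause codes are invented for the example.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical cause codes reported back to the monitor.
enum StopCause {
    STOP_EXITED = 0,
    STOP_DIVIDE = 1,
    STOP_TRACE = 2,
    STOP_BREAKPOINT = 3
};

static std::jmp_buf monitor_state;      // the monitor's saved context

// Abandon the program under test and resume the monitor as though
// run_program() had just returned 'cause'.
void return_to_monitor(StopCause cause)
{
    assert(cause != STOP_EXITED);       // longjmp cannot deliver 0
    std::longjmp(monitor_state, cause);
}

// Save the monitor's state, then run the program until it stops.
StopCause run_program(void (*program)())
{
    switch (setjmp(monitor_state)) {
    case 0:                             // first pass: state just saved
        program();
        return STOP_EXITED;             // program returned by itself
    case STOP_DIVIDE:
        return STOP_DIVIDE;
    case STOP_TRACE:
        return STOP_TRACE;
    default:
        return STOP_BREAKPOINT;
    }
}
```

A program that calls return_to_monitor(STOP_BREAKPOINT) makes run_program() return STOP_BREAKPOINT to its caller, which never sees that control left through a different path. The real module has to do this by hand because control comes back through an interrupt, with the registers of the program under test to preserve.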
The code in MACHINE.C isn't difficult to understand if you use a piece of
paper to follow the stack manipulations. The Borland C++ Programmer's Guide
explains the entry and exit conventions used by the compiler; you will need to
understand these conventions to follow how the return from go_program( ) is
faked.
Debugging with TD is straightforward: You invoke TD in the normal manner and
specify the name of the EXE file used as the input to COMF. After the initial
handshake, TD sends the name of the file to MIOTDREM with the command to
return the DOS time stamp for the EXE file. TD uses this mechanism to ensure
that the remote has the correct version of the program. In normal DOS
operation, TD will transmit the file to its remote partner if the file time
stamps don't match. MIO doesn't have a disk, and we have already loaded the
program ready for debugging, so MIOTDREM uses the time stamp value implanted
at address 0000:00FC by the loader as the return value to TD. TD then sends a
command to load the program, which MIOTDREM ignores, simply sending back an
acknowledgment that the load was successful. Debugging then continues as
normal.
You can locate code for MIOTDREM in EPROM on your embedded system or load it
as a normal download program. We opted for the latter approach because we only
need MIOTDREM when we are debugging new programs. Our version of MIOTDREM also
provides support to the loader to copy code and data from the shared memory
window to the rest of memory. We first load MIOTDREM, then load the program
that we want to debug.


Conclusion


BC++ makes an excellent toolkit for developing embedded systems software. The
Borland tools are high quality and easily adapted to develop code for a non-PC
environment. Using TD to debug software on an embedded system makes for faster,
less frustrating development. We hope you find these techniques useful.
In a future article, we'll describe the ins and outs of programming the
various versions of the Zilog ESCC.


Amateur Packet Radio



The development of radio technology owes much to Amateur Radio. Originally
assigned the "useless" wavelengths below 200 meters, Amateur Radio operators
(more often called Radio Hams) pioneered the development of much of today's
radio technology. Today the US has almost 500,000 Hams licensed by the Federal
Communications Commission. Hams come from all walks of life and may be found
both having fun at conventions and "hamfests," and providing essential
communications in disaster relief operations.
Increasingly, Hams have come to rely on digital communications to carry
messages important to public safety, health, and welfare. During the Bay Area
earthquake in 1989, Ham Packet Radio carried much of the health and welfare
traffic, allowing literally thousands of people to establish the safety of
their loved ones.
Packet radio is the "new frontier" for today's Hams; it represents a unique
marriage between computers and radio technology, and provides yet another area
where Hams can make a difference. A complex network of packet radio stations,
over which Hams regularly exchange messages, spans the US and much of the
world. However, much of this network is built of links operating at slow
speeds of around 1200 bps. Much opportunity exists to further the state of the
art and provide higher speed links for the Amateur Service.
Hams have demonstrated experimental links at speeds above 2 Mbps. Work is
ongoing to raise the link speed and provide high-speed digital communications
at economical prices.
Recently, the FCC (at the urging of the American Radio Relay League, the
official body of US Hams) approved a new class of Amateur Radio license which
does not require knowledge of Morse code. The new class allows operation on
VHF frequencies and opens the way for many technically competent people to
join Amateur Radio without the obstacle of learning Morse. Hams with the new
license can participate in the marriage of high-speed computer communications
and radio technology, providing a new dimension to public service operations.
For information on how to become a Ham Radio Operator and participate in
developing Amateur Packet Radio, contact the American Radio Relay League
(ARRL) at 225 Main Street, Newington, CT 06111 and mention Dr. Dobb's Journal.
-- S.G.P. and K.J.R.

_C++ FOR EMBEDDED SYSTEMS_
by Stuart G. Phillips and Kevin J. Rowett


[LISTING ONE]

// MIOTDREM
// --------
// Copyright (c) 1991, Stuart G. Phillips. All rights reserved.
// Permission is granted for non-commercial use of this software.
// You are expressly prohibited from selling this software in any form,
// distributing it with another product, or removing this notice.
// THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
// IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
// WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR
// PURPOSE.
// This module contains the main() for the MIO version of TDREMOTE. The
// module is primarily responsible for initialization; TDREMOTE functions
// are handled in other modules.

#include "mio.h"
#include "8530.h"
#include "miotdr.h"

#ifdef SERIAL_DEBUG

// Communication region used to deposit status information about the
// program's progress, etc., in the shared memory window.

struct comm_region { unsigned short status;
 unsigned short scc_status;
 unsigned short scc_special;
 unsigned short int_cnt;
 unsigned short rx_cnt;
 unsigned short tx_cnt;
 unsigned char command;
 };
#endif

struct relo_creg { unsigned short offset;
 unsigned short segment;
 unsigned short count;
 unsigned char command;
 };

// Command values for RELO.LOD
#define RELO_NULL 0x00
#define RELO_COPYTO 0x01
#define RELO_COPYFROM 0x02
#define RELO_EXIT 0x03


// Magic number signifying presence of "RELO" support
#define RELO_MAGIC 0x5a41

extern void tdr_processor();
static void relo_lod();

#ifdef SERIAL_DEBUG
static struct comm_region *creg = (struct comm_region *)0x80;
#endif

// Globals

unsigned char rx_buffer[BUFLEN],
 tx_buffer[BUFLEN];

unsigned _stklen = 512U;

void main()
{
 unsigned char data,val;
 unsigned short i;

 disable();

 // Enable ICU in the V40 peripheral select register
 data = inportb(OPSEL);
 outportb(OPSEL,data | ICU);
 data = inportb(OPSEL);

#ifdef SERIAL_DEBUG
 creg->status = data;
#endif

 /* Initialize the ICU */

 outportb(IULA,ICUBASE); // set base address
 outportb(IMDW,IIW1 | IIW4NR | IEM | IET); // no IIW4, extended mode,
 // edge triggered
 outportb(IMKW,IVEC); // Set vector base for PIC
 outportb(IMKW,SI7); // IRQ7 is slave

 outportb(IMKW,0xff); // Mask off all interrupts
 relo_lod(); // Provide relocation support
 // to MIOLOAD
#ifdef SERIAL_DEBUG
 creg->status = 0;
 creg->scc_status = 0;
 creg->scc_special = 0;
 creg->int_cnt = 0;
 creg->rx_cnt = 0;
 creg->tx_cnt = 0;
 creg->command = 0xff;
#endif

 comm_init();

 tdr_processor();

 /*NOTREACHED*/
}

static void relo_lod()
{
 // volatile: the PC changes these locations while we poll them
 volatile struct relo_creg *creg = (volatile struct relo_creg *) 0x80;
 unsigned short *magic = (unsigned short *) 0x88;
 volatile unsigned char *go = (volatile unsigned char *) 0x9f;
 unsigned char *swin, *p;
 unsigned short i, relo_done = 0;

 // Initialize command field in communication region
 creg->command = 0xff;

 // Implant magic number so that MIOLOAD knows we're up and running
 *magic = RELO_MAGIC;

 while (!relo_done){
 // Wait for command value to change from 0xff
 while (creg->command == 0xff) ;
 switch(creg->command){
 case RELO_COPYTO:
 p = (unsigned char *)(((long)creg->segment << 16) +
 (long)creg->offset);
 swin = (unsigned char *) 0x100;
 for(i = creg->count; i != 0; i--)
 *p++ = *swin++;
 creg->command = 0xff;
 break;
 case RELO_COPYFROM:
 p = (unsigned char *)(((long)creg->segment << 16) +
 (long)creg->offset);
 swin = (unsigned char *) 0x100;
 for(i = creg->count; i != 0; i--)
 *swin++ = *p++;
 creg->command = 0xff;
 break;
 case RELO_EXIT:
 creg->command = 0xff;
 relo_done = 1;
 break;
 default:
 creg->command = 0xff;
 break;
 }
 }
 // Clear our magic marker and then reset go flag to 0xf4 (halt) code.
 // Wait for MIOLOAD to set this to 0x90 to indicate we should continue.
 *magic = 0;
 *go = 0xf4;

 while (*go != 0x90) ;
}




































October, 1991
SMALLTALK AND EMBEDDED SYSTEMS


More than just a pretty face




John Duimovich and Mike Milinkovich


John is manager of embedded systems at Object Technology International (1785
Woodward Drive, Ottawa, Ontario K2C 0P9 Canada). Mike is product manager for
OTI's ENVY/Developer. They can be reached at john@oti.on.ca and
mike@oti.on.ca, respectively.


It may come as a surprise to some people, but Smalltalk and embedded systems
do mix. Before you throw down this magazine and say "this time Dr. Dobb's has
gone too far," you should know that there are commercial products shipping
today (and more to the point, making money) based on the technology described
in this article. Developers are currently using Smalltalk to build and deliver
systems in the following areas: protocol testing, instrumentation, automated
hardware component testing, factory automation, process control, and command
and control systems. Embedded Smalltalk systems exist for VME-based 680x0
platforms and run on top of real-time operating systems (RTOS) such as VRTX
from Ready Systems and VxWorks from Wind River Systems. If you are surprised
that Smalltalk runs in stock VME card cages with commercial cards and RTOSs,
you may also be surprised to learn that Smalltalk executables are ROMable, and
do not require a mouse, keyboard, display screen, or disk.
For most embedded systems developers, object-oriented programming is a
relative unknown. When you begin thinking about using OOPS for your next
product, you likely consider C++ as the only option. However, Smalltalk is not
only a viable language alternative for many embedded applications, but it also
provides better development tools. All implementations of Smalltalk come with
a comprehensive toolset consisting of code and object browsers, symbolic
debuggers, and extensive class libraries.


Embedded and Real-Time Use


Traditionally, embedded development involves coding on a host environment for
a target system, then downloading and executing the code. This is the standard
edit-compile-link-run cycle familiar to all developers using a compiled
language, with the added burden of downloading to a target. Developers are
happy to develop on a Sun workstation with their tools (editors, source code
control), but in the end they have to deliver in an embedded target which
can't or won't run Unix. In fact, targets are usually programmer hostile. The
tools are either inferior to their workstation counterparts or not available
at all. Developers struggle to deliver reliable, often mission-critical
software, while dealing with emulators, downloaders, and cross-compilers --
each of which introduces another step or variable into the development
process.
Embedded developers often delay the move to the target system until the last
possible moment due to the lack of good (or at least familiar) development
tools. The Smalltalk-based development system runs directly on the target. For
example, the entire development environment -- usually thought of as only
available on high-powered workstations -- can be run in a VME card cage using
68000-family processors. Using this approach, the code is developed on the
same machine it will be shipped on, reducing the requirement for ports and
other cross-development headaches. In cases where the target hardware is not
available or in short supply, the system can be coded on the host machine (in
Smalltalk) and the identical code run on the target when it becomes available.
A typical Smalltalk development system contains a host workstation such as a
Unix workstation or PC. For embedded systems, a target may also be added to
the configuration, with a workstation acting as a host. Programmers may
develop code on either the host or target. Code written on the host can be
loaded directly onto the target platform and vice versa. Figure 1 illustrates
a typical machine configuration for embedded Smalltalk development. To provide
an in-target development environment, graphics and I/O services for target
systems which do not have such devices must be provided by the host. A remote
interface can be provided using either Ethernet transport or a shared bus
interface such as the one provided by the BIT3 PC to VME bus interface card.
Where Ethernet is used, a single target system can easily be used by multiple
developers to test applications under development. A transport-independent
stream protocol will allow easy extension to other communications media.
Multiple connections can be served by a single Smalltalk remote server. This
allows a single Smalltalk server to provide graphics for multiple targets,
either on separate target systems or in multi-processor configurations.


Memory Management


Manual versus automated memory management is a subject that will always result
in an interesting discussion (try it at your next dinner party) and should be
of considerable interest to embedded systems programmers. Garbage collection
skeptics often use hypothetical disasters such as an airplane coming in for a
landing just when its avionics decides to garbage collect its memory.
Proponents use the counter-example of a dangling pointer bug to cause the same
catastrophe. The truth is that automated memory management can result in more
predictable and reliable systems than those using handcrafted techniques.
Most embedded systems programmers are used to doing their own memory
management. Efficiency and vague mutterings about "full control" of memory are
usually given as reasons for managing memory manually. The efficiency argument
does not hold up under scrutiny, because modern garbage collectors use only
about three percent of the CPU time in a typical interactive application. This
is actually better than what the average developer can do on his or her own.
Better yet, garbage collection substantially reduces memory management errors,
often the source of catastrophic and hard-to-find bugs. Just as importantly,
garbage collection frees developers from writing complicated memory management
routines and allows them to concentrate on the real task at hand -- shipping a
working product. For a great number of embedded applications, there is little
doubt that automated memory management would be a major improvement over the
current state of the art.
For those developers who have real-time requirements as well, garbage
collection also raises the issue of deterministic runtime behavior. Currently,
there are no commercially available garbage collectors which meet the needs of
developers writing "hard" real-time systems. However, the technology exists,
and as Smalltalk finds its way into more mission-critical applications, they
will become available. In reality, though, the issue is really no different in
Smalltalk than in, say, C. A truly time-critical function will typically be
implemented using statically allocated memory (because malloc isn't
deterministic either) and will often be written using handcrafted assembler
code. Such functions can be called from Smalltalk as easily as from C, and the
garbage collector is guaranteed not to run during their execution.
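The remark about statically allocated memory can be made concrete with a small
fixed-block pool in C. This is only a sketch under stated assumptions: the
names pool_alloc and pool_free, the block count, and the block size are
invented here for illustration and do not come from any Smalltalk system. The
point is that the worst-case cost of each call is bounded by a constant, which
is what a time-critical function needs.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCKS 8    /* hypothetical capacity, chosen for the sketch */
#define BLOCK_BYTES 32   /* hypothetical block size */

/* All storage is reserved at link time; malloc is never called. */
static unsigned char pool_storage[POOL_BLOCKS][BLOCK_BYTES];
static unsigned char pool_in_use[POOL_BLOCKS];

/* Hand out the first free block. The scan visits at most POOL_BLOCKS
   slots, so the worst-case time is fixed. Returns NULL when exhausted. */
void *pool_alloc(void)
{
    size_t i;
    for (i = 0; i < POOL_BLOCKS; i++) {
        if (!pool_in_use[i]) {
            pool_in_use[i] = 1;
            return pool_storage[i];
        }
    }
    return NULL;
}

/* Return a block to the pool. */
void pool_free(void *block)
{
    size_t i;
    for (i = 0; i < POOL_BLOCKS; i++) {
        if (block == (void *)pool_storage[i]) {
            break;
        }
    }
    assert(i < POOL_BLOCKS);   /* block must come from this pool */
    assert(pool_in_use[i]);    /* and must not already be free */
    pool_in_use[i] = 0;
}
```

A caller must be prepared for pool_alloc to return NULL; sizing the pool for
the worst case is part of the design work that the static approach demands.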


Interfacing to Other Languages


Real-world applications require that Smalltalk provide external interfaces to
languages such as C and assembler. Most Smalltalk systems include a mechanism
for calling user primitives written in C (or other languages). User primitives
can be used to improve the performance of execution bottlenecks, to guarantee
the deterministic behavior of critical regions, or to provide a Smalltalk
interface to existing routines written in other languages.
Example 1(a) (taken from OTI's ENVY/Smalltalk) illustrates the use of a user
primitive to compute the checksum of a byte object. The corresponding C code
for this checksum primitive is in Example 1(b).
Example 1: (a) A user primitive that computes the checksum of a byte object;
(b) the corresponding C code for the checksum primitive.

 (a)

 checkSum
 "Answer a twos complement checksum of the receiver."

 <primitive: checkSum>
 ^ self primitiveFailed

 (b)

 #include "userprim.h"
 DEFINE_USER_PRIMITIVE( checkSum )
 {
     char *buffer;                 /* first byte of the receiver */
     unsigned int size;            /* number of bytes in the receiver */
     unsigned int i;               /* loop index */
     unsigned short chkSum = 0;    /* running checksum */

     char *V_objectByteAddress();

     buffer = V_objectByteAddress(PARAM(1));
     size = V_objectSize (PARAM (1));
     for (i=0;i<size;i++) {
         chkSum += *buffer++;
     }
     SUCCEED (TO_SmallInteger (chkSum));
 }

Most Smalltalks support an API which allows user primitives full access to the
objects passed as arguments as well as access to well-known Smalltalk objects
such as true, false, and nil. Conversion routines allow the primitive to
easily convert Smalltalk objects into C data and vice versa. An important
feature in any user primitive facility is the ability to create Smalltalk
objects in primitives. This allows the application developer full flexibility
in coding primitives.
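When developing a primitive like the one in Example 1(b), it can help to test
the core loop on a host machine before wiring it into the virtual machine. The
following sketch does that for the checksum. It is an assumption-laden
stand-in: byte_sum is a name invented here, it takes a plain buffer rather
than a Smalltalk object, and it treats bytes as unsigned, whereas the listing
uses plain char.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the bytes of a buffer into an unsigned short accumulator, as the
   loop in the primitive does. Where unsigned short is 16 bits wide, the
   sum wraps modulo 65536. (Hypothetical helper, not part of userprim.h.) */
unsigned short byte_sum(const unsigned char *buffer, size_t size)
{
    unsigned short sum = 0;
    size_t i;

    assert(buffer != NULL || size == 0);
    for (i = 0; i < size; i++) {
        sum = (unsigned short)(sum + buffer[i]);
    }
    return sum;
}
```

Keeping the computation in a function with no virtual machine dependencies
means the same code can be exercised by ordinary unit tests and then called
from the primitive wrapper.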


Processes and Interrupts


Most Smalltalk virtual machines support interrupts in some form. Some systems
allow signaling of Smalltalk semaphores from primitives, allowing interrupt
routines to start processes; others support numbered virtual machine
interrupts directly. Example 2 (again from OTI's ENVY/Smalltalk) supports the
latter model. Interrupts can be passed directly to interrupt handlers written
in Smalltalk using the VM interrupt facility. The code fragment in Example
2(a) illustrates a process which waits for data and, when enough has arrived,
interrupts the Smalltalk system to process the data. Interrupt values are
arbitrary integer values from 1 to 32,767. The Smalltalk code which would
"hook" this interrupt is shown in Example 2(b). Every time the example code
posts interrupt 20, the enoughDataHasArrived message will be sent to the
dataHandler object. The interrupt executes in the context of the current
process.
Example 2: (a) Direct support of numbered virtual machine interrupts;
(b) Smalltalk code that "hooks" interrupts; (c) dataHandler as an object which
has an instance variable, dataFullSemaphore, which refers to an instance of
the class Semaphore; (d) the corresponding process could be created with this
code.

 (a)

 while (1) {
     wait_for_data();
     if (enough_data) V_interruptVM (20);
 }

 (b)

 Process
 configureInterrupt: 20
 withMessage:
 (DirectedMessage
 selector: #enoughDataHasArrived
 arguments: #()
 receiver: dataHandler).

 (c)

 enoughDataHasArrived
 "Inform the process waiting for the data that the
 data has arrived."

 dataFullSemaphore signal

 (d)

 [[true] whileTrue: [
 dataFullSemaphore wait.
 self processDataBuffer]] forkAt: 2

The Smalltalk process model is very flexible. The basic functionality is
encapsulated in class Process. Processes provide a mechanism for executing
multiple threads of control. Process management uses a priority-based
scheduler with the highest priority processes executing sequentially in a
round-robin fashion. Processes run until they block (typically on a semaphore)
and usually do not timeslice. Full source to the process scheduler mechanism
is provided and may be subclassed or modified to implement your favorite
scheduling algorithm.
Continuing the example, a process-based implementation of
enoughDataHasArrived follows. An alternative approach would be to process the
data directly from the Smalltalk interrupt handler. Example 2(c) instead uses a
process responsible for handling the incoming data, a typical approach in
embedded applications. The code in Example 2(c) assumes that dataHandler is an
object which has an instance variable, dataFullSemaphore, which refers to an
instance of the class Semaphore. The corresponding process could be created
with the code in Example 2(d). This code creates a new process running at
priority 2, which typifies a low-priority background process. The process
waits on dataFullSemaphore until it is signalled by the interrupt handler
enoughDataHasArrived. If the process executing when the interrupt is serviced
has a priority lower than 2, then the signal will result in an immediate
process switch, and the data buffer will be processed. Otherwise, the current
process will continue to run and the example process will handle the data when
resumed by the scheduler.
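The scheduling decision described above can be modeled in a few lines of C.
This is a toy model, not the Smalltalk scheduler: the struct, the function
name signal_and_schedule, and the convention that a larger number means a more
urgent process are assumptions made for the sketch (the last one is consistent
with priority 2 being described as low).

```c
#include <assert.h>
#include <stddef.h>

enum proc_state { READY, WAITING };

struct proc {
    int priority;            /* larger number = more urgent (assumed) */
    enum proc_state state;
};

/* Model a semaphore signal: the waiting process becomes ready, and the
   function returns the index of the process that runs next. A switch
   happens only if the waiter outranks the current process; on a tie the
   current process keeps running. */
int signal_and_schedule(struct proc *procs, size_t count,
                        int current, int waiter)
{
    assert(current >= 0 && (size_t)current < count);
    assert(waiter >= 0 && (size_t)waiter < count);

    procs[waiter].state = READY;
    if (procs[waiter].priority > procs[current].priority) {
        return waiter;       /* immediate process switch */
    }
    return current;          /* waiter runs when the scheduler resumes it */
}
```

With a current process at priority 1 and the data process waiting at priority
2, the function returns the data process; with the current process at priority
3, the current process continues.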


Additional Embedded Features


Access to real memory is important when developing embedded system
applications. Smalltalk is a high-level language, so developers are usually
insulated from the low-level details of the underlying memory system and
devices. For Smalltalk to be used in embedded applications, classes must be
provided to model the underlying hardware. An example of how this may be done
is a VMEMemory class which provides long, word, and byte access to real memory
addresses. This allows Smalltalk code to be written which accesses
low-level data structures without requiring any user primitives.
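To show the kind of access such a class wraps, here is a C sketch of byte,
word, and long reads over a window of memory. Everything in it is
hypothetical: the struct and function names are invented, the window is an
ordinary array rather than a bus address so that the code runs anywhere, and
the big-endian byte order is an assumption based on the MC680x0 targets
mentioned elsewhere in this article.

```c
#include <assert.h>
#include <stddef.h>

/* A stand-in for a window onto real memory. On a target, base would be a
   fixed hardware address; here it points at an ordinary array. */
struct mem_window {
    unsigned char *base;
    size_t size;
};

/* Read one byte at the given offset. */
unsigned char mem_byte_at(const struct mem_window *m, size_t offset)
{
    assert(offset < m->size);
    return m->base[offset];
}

/* Read a 16-bit word, most significant byte first (assumed byte order). */
unsigned int mem_word_at(const struct mem_window *m, size_t offset)
{
    assert(offset + 2 <= m->size);
    return ((unsigned int)m->base[offset] << 8) | m->base[offset + 1];
}

/* Read a 32-bit long as two consecutive words. */
unsigned long mem_long_at(const struct mem_window *m, size_t offset)
{
    assert(offset + 4 <= m->size);
    return ((unsigned long)mem_word_at(m, offset) << 16)
         | mem_word_at(m, offset + 2);
}

/* Store one byte at the given offset. */
void mem_byte_put(struct mem_window *m, size_t offset, unsigned char value)
{
    assert(offset < m->size);
    m->base[offset] = value;
}
```

Bounds checks like these are cheap insurance during development; a class
modeling real hardware would typically make the valid address range part of
each instance in the same way.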
When interfacing to an RTOS, Smalltalk can have full access to the underlying
OS via user primitives. For example, the user primitive in Example 3
demonstrates how to signal an RTOS semaphore. (This example is from a VxWorks
implementation.)

Example 3: Signaling an RTOS semaphore

 DEFINE_USER_PRIMITIVE(semaphoreSignal)
 {
 int semId;
 IF ISNOT_SmallInteger (PARAM (1)) THEN FAIL; ENDIF
 semId = TO_long (PARAM (1));
 semGive (semId);
 SUCCEED (SELF);
 }



Delivery Tools


Once software has been developed, it must be performance tuned, packaged, and
released. Sophisticated packaging tools exist which allow the delivery of
stand-alone executables written in Smalltalk. In addition, for MC680x0
embedded environments, these tools support the ability to ROM portions of the
image. Performance tools allow developers to profile arbitrary pieces of code
or the whole system.


Performance Tuning


A number of profiling tools exist for Smalltalk language implementations.
There are both sampling and full-trace profilers which provide an accurate
picture of runtime processor and memory utilization. Sampling profilers
periodically trace the stack during execution to provide a statistical
analysis of where time is spent. To ensure a high degree of accuracy, the code
must be run long enough to provide a statistically significant number of
samples. The advantage to this type of profiler is that programs execute at
nearly full speed while being observed. A tracing profiler follows the
execution of every instruction and message send and provides an exact picture
of execution history. This type of profiler is much slower but more accurate
and can also provide code coverage information.
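The bookkeeping behind a sampling profiler is simple enough to sketch in C.
The version below is a minimal illustration, assuming that some timer-driven
hook (not shown) calls profile_sample with the identifier of the routine found
on top of the stack; the names and the fixed table size are invented for the
example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ROUTINES 16   /* hypothetical table size */

struct profile {
    unsigned long hits[MAX_ROUTINES];  /* samples seen per routine */
    unsigned long total;               /* samples seen overall */
};

/* Record one sample. A real profiler would call this from a periodic
   timer after inspecting the stack. */
void profile_sample(struct profile *p, size_t routine_id)
{
    assert(routine_id < MAX_ROUTINES);
    p->hits[routine_id]++;
    p->total++;
}

/* Estimated share of run time, in percent, spent in one routine. The
   estimate is only as good as the number of samples behind it. */
double profile_percent(const struct profile *p, size_t routine_id)
{
    assert(routine_id < MAX_ROUTINES);
    if (p->total == 0) {
        return 0.0;
    }
    return 100.0 * (double)p->hits[routine_id] / (double)p->total;
}
```

Because each sample costs only two increments, the observed program runs at
close to full speed, which is the trade-off against a tracing profiler
described above.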
The better tools also provide a graphical user interface to allow developers
to quickly interpret the results of the trace. Data on garbage collection
activity during the trace may also be provided. The profiler can monitor
single processes or groups of processes; the latter are useful when
performance tuning multiple cooperating processes.
Once a bottleneck has been identified, there are several techniques which may
be used to improve performance. The first is the same used in all languages --
analyze the algorithms, data structures, and coding style being used to ensure
that the correct approach is being followed. One of the other options
available for performance tuning is to recode critical sections in C or
assembler. In typical systems the 90/10 rule applies; that is, 90 percent of
the time is spent in 10 percent of the code. A primitive added in the
appropriate place will often dramatically improve system performance.
A timing facility should be provided which can interface to high-resolution
timers on some VME cards and allow the timing of code down to microsecond
precision. The interactive nature of Smalltalk allows an application
programmer to selectively execute and time pieces of the application
individually, as performance bottlenecks are tracked down. Additionally, when
prototyping, a potential piece of code can be tested for both correctness and
performance, before it is integrated with the full system.
Memory usage and garbage collection can be monitored either by using a memory
usage monitor or directly by the application. Memory utilization can be
displayed graphically on the screen during program execution. Memory hot spots
can be discovered by monitoring this visual "memory heartbeat."


Packaging


An important part of object-oriented embedded development is packaging.
Smalltalk programmers typically develop code in a large, rich development
environment, of which the stand-alone application is just a small part.
Packaging is the process of extracting the stand-alone application from the
development environment in order to deliver a minimal executable. Packaging is
not a step required with traditional languages such as C, where the executable
is generated by the compiler.
With Smalltalk, the developer is given a highly productive incremental
development environment. The normal edit-compile-link phases, which allow most
developers to spend a lot of time hanging around the coffee machine, do not
exist. Once the application has been completed, however, it must be extracted
from this environment to be shipped. Smalltalk's approach results in greatly
improved programmer productivity throughout the development cycle, but does
require this additional step at the end.
For embedded OO development, the packager must be more sophisticated than the
standard link-editor tool used for languages such as C. The reason is the
complex structure of objects. In addition to the simple code and initialized
data found in C environments, OO environments include objects which must be
handled during the packaging process. The packager tool should allow any
object to be placed in ROM or RAM, at an absolute address or link time
relocatable address. This includes the placement of classes, methods, and
global objects. C header files are created during the packaging step which
allow C code to directly reference objects.
When packaging a Smalltalk application, you specify which classes, methods,
and variables are to be placed into the packaged application. Analysis tools
aid the packaging engineer in identifying software components which are not
required in the delivered executable. This is more difficult in Smalltalk than
in other languages because of the dynamic binding inherent in polymorphic
languages. For example, if the application sends the message "draw," all
implementations of that method might be required in the packaged application.
Packaging tools compute the potential messages required at runtime by
following message sends. You can refine this by providing additional
information such as which methods and classes may be removed. A good packager
outputs a small executable with minimal input from the developer. Increasing
the information provided by you, however, typically decreases the size of the
delivered application. Consequently, packaging is an iterative process.
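The idea of following message sends can be sketched as a reachability pass
over a table of methods. The C below is a simplified model, not the algorithm
of any particular packager: the struct layout, the function names, and the
limit on sends per method are assumptions. Because the receiver's class is not
known until runtime, every implementation of a sent selector is marked as
required, exactly as in the "draw" example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SENDS 4   /* hypothetical limit on sends recorded per method */

struct method {
    const char *class_name;
    const char *selector;
    const char *sends[MAX_SENDS];  /* selectors sent; NULL ends the list */
    int required;                  /* set once the method must be kept */
};

/* Mark every implementation of `selector` as required, then follow the
   selectors each newly marked method sends. Marking before recursing
   guarantees termination even when methods send to each other. */
void require_selector(struct method *methods, size_t count,
                      const char *selector)
{
    size_t i, s;

    assert(selector != NULL);
    for (i = 0; i < count; i++) {
        if (methods[i].required ||
            strcmp(methods[i].selector, selector) != 0) {
            continue;
        }
        methods[i].required = 1;
        for (s = 0; s < MAX_SENDS && methods[i].sends[s] != NULL; s++) {
            require_selector(methods, count, methods[i].sends[s]);
        }
    }
}

/* Count the methods that survived the pass. */
size_t count_required(const struct method *methods, size_t count)
{
    size_t i, n = 0;
    for (i = 0; i < count; i++) {
        if (methods[i].required) {
            n++;
        }
    }
    return n;
}
```

Starting from an application entry point that sends draw, both a Circle and a
Square implementation of draw would be kept, while a method whose selector is
never sent would be dropped. Telling the packager that Square is never
instantiated is the kind of extra information that shrinks the result further.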
Additional tools provided by packagers allow fine-grained control of the
packaging process. This allows the creation of smaller images -- improving
runtime memory usage. Typically, these tools include the removal of unused
globals, object initialization, and method replacement.
Packaging tools for embedded systems allow the user to specify whether a
class, method, or object is ROMmable. Typically, in an embedded environment,
the major part of a Smalltalk application (80 percent or more) can reside in
ROM. An interesting benefit of ROMing is that system performance usually
improves, because the garbage collector is not required to manage ROM. In
principle, there is no difference between ROMing Smalltalk programs versus
traditional assembler or C programs. In addition to ROMing requirements,
embedded systems developers must be able to place specific structures
(classes, instances, methods) at specific memory locations. Packagers should
give the application engineer fine-grained control over the placement of such
data and code.
The final executable can be output as a binary image and/or as assembler
source code. The assembler output code can be assembled and linked using
conventional tools. This is important because it allows the development of
Smalltalk applications to fit into any existing firmware generation process.


Conclusions


Embedded systems development using Smalltalk is no longer a research
curiosity. Real systems are being shipped today which have used the technology
described here. Some examples include:
A troubleshooting tool for Ethernet and Token Ring local area networks. This
is a portable device which a technician connects to a network. A RISC-based
subsystem snoops on network traffic, capturing and buffering data as the
network runs. A rule-based expert system analyzes the captured data and
provides troubleshooting advice to the technician.
A fast sampling oscilloscope implemented in Smalltalk and C. The Smalltalk
portion consists of approximately 20,000 lines of code in 450 classes. The
system integrates a DSP chip which is used to perform high-speed signal
processing on sampled data.
As microprocessors continue to improve and memory
becomes even cheaper, the complexity of embedded applications will undoubtedly
increase. As this happens, the ability to meet customers' expectations and
management's deadlines with traditional tools and methods will decline. Using
Smalltalk to develop embedded systems is not a panacea. Developing complex
systems will never be easy. But developers who use object-oriented programming
systems such as Smalltalk will be engineering high-quality solutions faster
and cheaper than their competitors.


References


Goldberg, A. and D. Robson. Smalltalk-80: The Language and its Implementation.
Addison-Wesley, 1983.
Smalltalk/V User Manual. Digitalk Inc., 1990.
Barry, B. Using Objects to Design and Build Radar ESM Systems. ACM OOPSLA
Conference Proceedings, 1987.
Williams, T. "Object-Oriented Tools Expand the Repertoire of Real-Time
Developers." Computer Design (May 1991).



Smalltalk Tools for Embedded Systems


Most Smalltalks are not suitable for embedded systems development straight out
of the box -- but then, neither are most C or C++ compilers. To enhance the
existing Smalltalk environment for use in embedded systems, a number of
extensions are required. Three components of Object Technology International's
ENVY tools are of interest to embedded systems developers.
ENVY/Smalltalk is an implementation of Digitalk's Smalltalk/V for MC680x0
platforms which has been tailored to meet the requirements of embedded
systems. A number of features have been added to the runtime environment.
These include: a virtual machine and image which are ROMable; a garbage
collector which is aware of ROM; a high-resolution timer interface;
ENVY/Remote, which provides graphics and input/output services from targets to
host workstations; the ability to link (using standard link tools) the virtual
machine and image into a single executable; and ROM emulation mode, which
allows memory stores to be optionally checked to catch stores into ROM, even
when running a RAM-based development environment. This catches classic "it
works in RAM, but not in ROM" bugs.
ENVY/Developer is a development environment that bundles together ENVY/Manager
(a tool that provides team programming, version control, and configuration
management), ENVY/Stats (a runtime performance analysis tool), and
ENVY/Packager (a tool for creating small, stand-alone executables).
ENVY/Actra is a multiprocessor implementation of Smalltalk in which the
Smalltalk process model has been integrated with the process model of a
message-passing, real-time operating system. ENVY/Actra's extended process
model supports synchronous message send and receive and asynchronous reply
between tasks. Task switches occur transparently when a Smalltalk message is
sent between processes (Tasks). Processes may reside on different processors;
if so, they execute concurrently. Tasks may also send messages to and receive
messages from RTOS processes, which may be implemented in other languages such
as C.
--J.D. and M.M.

_SMALLTALK AND EMBEDDED SYSTEMS_
by John Duimovich and Mike Milinkovich


























October, 1991
FORTH: A STATUS REPORT


X3J14 Technical Committee votes dpANS Forth




Jack J. Woehr


Jack is a senior project manager at Vesta Technology Inc. in Wheat Ridge,
Colorado. He is a member of the ANS/ASC X3J14 Technical Committee for ANS
Forth and is currently Chapter Coordinator for the Forth Interest Group. Jack
can be reached as jax@well.UUCP, as JAX on GEnie, or as the Sysop of the RCFB
BBS, 303-278-0364.


Forth has always been the high-level language most tightly coupled to
hardware, in particular to 16-bit two's-complement hardware. X3J14 has
attempted to describe Forth in an architecture-independent fashion. Although
laudable in theory, this has striking practical implications which are
reverberating throughout the Forth community, as professionals and enthusiasts
ponder the fruit of X3J14's labors. The document's description of Forth, which
will now be passed upwards through the labyrinthine ANSI procedure, is a
radical departure from previous descriptions.
X3J14 has had to reconcile conflicts between Forth theory and modern practice.
Furthermore, the proceedings have faced the perennial Forth problem of
establishing what "common and accepted industry practice" is for a language
community whose favorite joke is "If you have seen one Forth, you have
seen...one Forth;" and whose founder pronounced that "...Forth is more of an
approach than a specification for a programming language," and that "Standards
are wonderful: Everyone should have one."
The X3J14 Technical Committee has taken a different viewpoint: Programming
language environments standardized under recognized proceedings are more
commercially viable than programming language environments not so
standardized. Most members of the Technical Committee hope that an American
National Standard Forth will ameliorate the heretofore negative opinion of
some project managers, who view Forth as merely a hacker's language.
The ANS/ASC X3/X3J14 Technical Committee for ANS Forth met in Boulder, Colo.,
from July 30 to August 3, 1991, and confirmed its 19:1 vote to forward the
present BASIS document to ANSI SPARC (Standards Planning and Requirements
Committee, a supervisory body). Upon SPARC approval, the "dpANS" (draft
proposed ANS) document will enter a 4-month period of public review that will
ultimately lead to the first American National Standard Forth. Any technical
changes that arise during the review period will require another vote by X3J14
to place the revision into a new public review cycle.
Consider first of all that two previous industry (albeit non-ANSI) standards
for Forth explicitly described Forth as a threaded interpreter. That is
certainly the classic implementation of Forth and the strategy originally
chosen by creator Charles Moore, but many modern Forths are not threaded
interpreters. Some are based on a Forth instruction set in microcode or in
silicon. Others compile optimized machine code or execute from token lookup
tables.
Certainly Forth has detached itself in the marketplace from 16-bit machines
and even (in the case of FIG-Forth for the Cray) from two's-complement
architecture, a move recognized by X3J14 in its declaration that "This
standard allows one's complement, two's complement, or sign-magnitude
representations," the minimum commonality allowing all these representations
to co-exist in a Forth standard, being that, "In all these systems, arithmetic
zero is represented as the value of a single cell with all bits clear." (X3J14
dpANS-2 3.3.2.)
Until now, novel implementations (although clearly and recognizably Forth and
often a joy to use in projects) have had to bear the stigma and economic
disadvantage of being nonstandard.
The two notable results of the four-year odyssey of X3J14 have been: 1. A
masterpiece of a dpANS, which states the eternal verities of the Forth
environment while abstracting them from implementation details to the greatest
extent possible; 2. controversy, the spice without which the food of the Forth
programmer loses its savor.
The Forth word NOT is paradigmatic of the inconsistencies in Forth practice
which X3J14 has had to resolve. NOT is not in dpANS. Where did it go? Thereby
hangs a tale.
Forth has, for a long while, possessed the words AND, OR, XOR, and NOT. While
the first three were always bitwise operators, in early Forths NOT was usually
a logical operator. NOT's logical status was enshrined in Forth-79, which
mandated that the result of the expression HEX FFFF NOT 1234 NOT = be a TRUE
flag. (A TRUE flag is 1 in Forth-79, and all bits set in Forth-83 and the
proposed ANS Forth.)
(Note that "all bits set" could be referred to conveniently as "negative one"
[-1] in Forth-79 and Forth-83. Such shorthand is unavailable to ANS Forth,
however, as "all bits set" on a 16-bit one's complement machine is "negative
zero" [!?] and on a 16-bit sign-magnitude machine "HEX -7FFF".)
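The three readings of "all bits set" can be checked with a short C function
that interprets a 16-bit cell pattern under each representation. This is an
illustrative sketch only; the enum and function names are invented, and since
a C long has no negative zero, the one's complement "negative zero" comes back
as plain 0.

```c
#include <assert.h>

enum repr { TWOS_COMPLEMENT, ONES_COMPLEMENT, SIGN_MAGNITUDE };

/* Interpret a 16-bit cell pattern as a signed value under the given
   representation. Negative zero is reported as 0. */
long cell_value(unsigned int bits, enum repr r)
{
    unsigned int sign = bits & 0x8000u;

    assert(bits <= 0xFFFFu);
    switch (r) {
    case TWOS_COMPLEMENT:
        return sign ? (long)bits - 0x10000L : (long)bits;
    case ONES_COMPLEMENT:
        /* a negative value is the bitwise complement of its magnitude */
        return sign ? -(long)(~bits & 0x7FFFu) : (long)bits;
    default: /* SIGN_MAGNITUDE */
        /* the low 15 bits hold the magnitude directly */
        return sign ? -(long)(bits & 0x7FFFu) : (long)bits;
    }
}
```

Calling cell_value with hex FFFF gives -1, 0, and -32767 (hex -7FFF) for the
three representations in turn, while a cell with all bits clear is zero in
every one of them, which is the common ground the dpANS relies on.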
Forth-83 changed the meaning of NOT; it became symmetric with AND, OR and XOR,
a bitwise operator, so that HEX FFFF NOT becomes zero (0) and HEX 1234 NOT
becomes HEX EDCB.
This seemingly reasonable alteration flew in the face of nearly the entire
installed base of Forth systems at the time the change was made. To this day,
the economic and religious conflict between the "logical NOTters" and the
"bitwise NOTters" is unresolved. Therefore, ANS Forth will possess the
operators 0= (the equivalent of logical NOT) and INVERT (the equivalent of
bitwise NOT), but not NOT.
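For readers more at home in C than in Forth, the difference between the two
operators can be modeled as follows. The sketch assumes 16-bit cells, as in
the examples above, and a true flag of all bits set; the C function names are
stand-ins chosen here for the Forth words 0= and INVERT.

```c
#include <assert.h>

#define CELL_MASK 0xFFFFu   /* 16-bit cells, matching the article's examples */

/* Model of 0= (logical NOT): a true flag, all bits set, if the cell is
   zero; otherwise zero. */
unsigned int zero_equals(unsigned int x)
{
    assert(x <= CELL_MASK);
    return x == 0 ? CELL_MASK : 0u;
}

/* Model of INVERT (bitwise NOT): flip every bit of the cell. */
unsigned int invert(unsigned int x)
{
    assert(x <= CELL_MASK);
    return ~x & CELL_MASK;
}
```

Applied to hex FFFF and hex 1234, zero_equals returns 0 for both, so the two
results compare equal; invert returns 0 and hex EDCB, so they do not. That is
the whole of the NOT controversy in two function calls.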
Similarly, Forth-79 mandated that integer division with one negative operand
return a symmetric result, whereas Forth-83 imposed floored division. Both
methods of division yield equally valid results, in that the divisor times the
quotient plus the remainder (or "modulus," as it is called in floored
division) equal the original dividend. The results, however, are different!
Floored integer division is useful in rotary motion and robotics and matrix
operations. Symmetric division, on the other hand, is alleged to return a
result closer to the user's expectations. Partisans of both methods exhibit a
zeal akin to that found in the eleventh-century controversy between the
Eastern and Roman churches over the nature of the paraclete.
X3J14's compromise was this: Primitives FM/MOD (d n1 -- n2 n3, floored) and
SM/REM (d n1 -- n2 n3, symmetric) are to be provided for both methods of
signed integer division. The default behavior of /MOD and related words can be
based on either primitive, as the implementor (read "the implementor's
customer") pleases. The portable Standard Program, which is dependent on the
behavior of signed integer division with negative operands, can either execute
test cases to decide what sort of system it is running under or use the
appropriate primitive to ensure the desired results.
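The two division methods are easy to compare in C, whose / and % operators
truncate toward zero in C99 and later and therefore behave symmetrically. The
sketch below is a simplification: the Forth primitives take a double-cell
dividend, whereas these hypothetical functions, named after them, take an
ordinary long, and overflow cases are ignored.

```c
#include <assert.h>

struct divmod {
    long quot;
    long rem;
};

/* Symmetric division, in the spirit of SM/REM: the quotient is rounded
   toward zero and the remainder takes the sign of the dividend. */
struct divmod sm_rem(long dividend, long divisor)
{
    struct divmod r;

    assert(divisor != 0);
    r.quot = dividend / divisor;   /* truncates toward zero (C99 and later) */
    r.rem = dividend % divisor;
    return r;
}

/* Floored division, in the spirit of FM/MOD: the quotient is rounded
   toward negative infinity and the modulus takes the sign of the divisor. */
struct divmod fm_mod(long dividend, long divisor)
{
    struct divmod r = sm_rem(dividend, divisor);

    /* Adjust only when there is a remainder whose sign differs from the
       divisor's sign. */
    if (r.rem != 0 && ((r.rem < 0) != (divisor < 0))) {
        r.quot -= 1;
        r.rem += divisor;
    }
    return r;
}
```

Dividing -7 by 2 gives quotient -3 and remainder -1 symmetrically, but
quotient -4 and modulus 1 when floored. In both cases the divisor times the
quotient plus the remainder reproduces -7, which is why both camps can claim a
valid result.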
X3J14 has also tackled the thorny problems of Forth features that are by now
extremely widespread but have never appeared in an industry standard; examples
are local variables, floating point mathematics, and multitasking. Of these,
multitasking has proven the most intractable, with the proposed dpANS
providing only guidelines for the Standard Program in the face of widely
varying industry practice.
The proposed ANS Forth elucidates in great detail the twin concepts of the
Standard System and the Standard Program. Extensive documentation requirements
are imposed on both the implementor of a Standard System and the author of a
Standard Program that has any particular environmental dependencies.
The proposed Standard is layered in several word sets, each with its own
glossary and extensions. The word sets are: Core, Block, Double,
Error-Handling, Facility, File-Access, Floating Point, Locals,
Memory-Allocation, Programming-Tools, Search Order, and String. A Standard
System need only possess the Core word set, and even so, "[The] Standard does
not require that all words be provided in executable form. The implementor may
choose to provide word definitions, including definitions of words in the core
word set, in source form only. This is acceptable as long as a clearly
documented mechanism exists for adding the word definitions to the
dictionary." (X3J14 dpANS-2 4.1.)
Several implementations of portions of the proposed Standard already exist.
Former DDJ columnist and current X3J14 member Martin Tracy attempts to keep
pace with the various levels of BASIS in his Zen Forth system for MS-DOS
machines, available for download from many Forth telecom outlets. (See also
"Zen for Embedded Systems," by Martin Tracy, DDJ January 1990.)
Gordon Charlton, secretary of FIG-UK, has informed the Technical Committee
that his organization is implementing the proposed standard on a 6809
platform. I have implemented in my JAX4TH on the Amiga all of the word sets
with their extensions except for Floating Point, Programming-Tools, and
Locals; it should be hitting the BBSs about the time you read this.
Despite the traditional annual announcement of the "Death of Forth" (this year
the fool's cap was snatched from John Dvorak's and Jerry Pournelle's heads by
Steve Gibson, who averred in the July 15 InfoWorld that "Much like Latin,
Forth is a dead language...with a history"), Forth is alive and thriving
around the world, and in particularly good health in the field of embedded
systems. Forth programmers have a great deal at stake in an American National
Standard Forth, and my hope is that as many as possible will participate in
the public review period and lend their experience and wisdom to the task of
bettering the viability of commercial Forth programming.


























October, 1991
ENHANCING THE ACTOR DEVELOPMENT ENVIRONMENT


Adding configuration inheritance to an object-oriented environment




Steve Hatchett


Steve is a senior systems analyst at Tetra Tech Data Systems Inc., a software
development and systems integration house in San Diego, California. He has
worked in applications development for nine years, using object-oriented
languages in the MS-Windows environment for the past two-and-a-half years. He
can be reached at 14352 Mussey Grade Rd., Ramona, CA 92065, or through
CompuServe at 70304,1423.


As object-oriented languages move into the mainstream of the software
industry, developers are expressing their desire for tools to facilitate
multiproject team development. We were faced with this issue when we began our
second project using Whitewater's Actor language. An international airline had
asked for a system that would automate the scheduling of airport personnel in
services such as passenger check-in, baggage handling, and other tasks less
visible to the public. We chose to develop the project as a Windows
application written in Whitewater's Actor language, using an off-the-shelf
database engine. The choice of interface environment was an important
consideration because for most users, this would be their first computer
experience. We chose to develop a Windows application because we felt the
standard graphical interface would make the user's job as simple as possible
and give us the ability to design a pleasing and informative interactive
display. We had used the Actor language on a previous project, and picked it
again because of the much-shortened development time we experienced. We also
expected its object-oriented nature would ease the development of the major
original algorithms.
The resulting application was delivered and installed on networked PCs at the
airline's main airport and now schedules the work of 900 people daily. The
airline relies on the system to determine all the jobs that need to be done
each day by applying user-defined work standards to the ever-changing flight
schedule. The complex determination process includes queueing theory,
predicted passenger arrival profiles, and profile smoothing. An allocation
algorithm assigns the jobs as efficiently as possible to the available staff
to ensure all jobs will be covered. The system deals with schedule changes as
they occur, including dynamically updated displays. Supervisors have the
ability to interactively override the computer's decisions at any time.
Early in the project, I decided to invest some time extending the Actor
development environment to better handle the needs of multiproject team
development. We wanted to easily share portions of code developed during the
previous project. We also wanted a way to easily coordinate and combine the
work of each person. Many of the concepts behind the changes and enhancements
I made are not unique to Actor and could be applied to other object-oriented
languages as well.


Actor Environment Overview


Actor is an object-oriented language that provides a powerful system for rapid
development of Microsoft Windows applications. Despite this power, it gives
only limited support to aid the development of multiple projects by multiple
programmers. After installation, the Actor executable and image files are
placed in their own subdirectory (C:\ACTOR, for example) along with various
source load files. From this directory there are subdirectories for source
.act files, source backups, class source files, resources, and temporary work
source files. A typical directory setup is shown in Figure 1. All projects can
be developed within these directories, or the entire directory group and the
files contained therein can be duplicated for each project.
Figure 1: A typical Actor directory setup

 C:\ACTOR\
     ACT\
     BACKUP\
     CLASSES\
     RES\
     WORK\



Development Environment Shortcomings


The two simple approaches to organizing development of Actor applications do
little to address the needs and problems of multiproject development. If
all projects are developed from a common directory, classes can proliferate to
staggering numbers. No easy scheme exists for distinguishing general classes
from those for specific projects. A common solution resorts to cumbersome
class naming conventions.
Developing projects in separate directories forces duplication of code and the
associated problems. Changes or bug fixes made to heavily used classes must be
manually copied and integrated into each project. The lack of configuration
control can lead to situations where classes are separately modified by
different projects, resulting in divergent versions of classes. These same
problems express themselves when updates of the language come out. Every
project must be separately merged with the new versions.
The problems described so far are compounded when more than one programmer is
doing development. When working on separate computers with local drives, code
duplication occurs within projects as well as across projects. The effort
involved in merging the work of the various programmers hampers efficiency.
Working on a LAN can alleviate code duplication within a project, but there
are no safeguards to keep programmers from stepping on each other's work. This
maintenance headache is made worse by the lack of identification within
classes and methods of when and by whom changes were made. The time savings of
common classes are soon consumed by the effort expended in manual
configuration control.
Actor comes with a rich set of classes, easing the programmer's job.
Occasionally, it is appropriate and necessary to add methods to these existing
classes. This can be done by modifying the class source directly, using the
Browser, or by creating and loading an .act file containing the source code
for the additional methods. Both choices have their drawbacks.
Using the Browser is the most convenient way to add or modify methods in
Actor's standard classes. Changes made this way are automatically placed in
the source file for the affected class. When new versions of Actor are
released, all changes of this type must be extracted and reapplied to the new
class source files. There is no inherent identification of the methods that
have been added or modified, so this can become a tedious task. Problems can
also arise from a buildup of methods added with each project. Many of these
added methods won't be required by a particular project, yet all will end up
in the image when the class source files are loaded.
If changes to the Actor classes are segregated into .act files, it becomes
much easier to deal with Actor updates. The buildup of code can be prevented
by making separate .act files for different purposes. Unfortunately, the
Browser can only find source code in the .cls class source files. When an .act
file that has been loaded contains a modified version of a method existing in
the class source file, a confusing condition occurs. If the method is selected
in the Browser, the source code from the class file is displayed, even though
the image executes the method from the .act file. Source code for methods
loaded from .act files will not be found by the Browser, and must instead be
maintained using a text editor.


Configuration Inheritance


After identifying the inconveniences just described, we looked for a way to
enhance Actor's development environment to better support multiple-project
development by several programmers. The resulting system is centered around
the concept we call configuration inheritance. By extending inheritance into
the dimension of project development, related projects can be organized in
such a way that they inherit access to common classes, methods, and resources.
Project relationships are defined by their organization in the directory
hierarchy. An example of such a hierarchy is shown in Figure 2. The top-level
directory for Actor development (the Actor directory, with its act, backup,
classes, res, and work subdirectories) contains the Actor language and its
associated files in their original form. General additions and changes we make
to Actor occur in the \ACTOR\GENERAL directory and its supporting
subdirectories. Project subdirectories descend from the GENERAL directory. We
modified Actor so it will resolve references to source files by searching the
directory tree from the point of current development up to the original Actor
directories. If you were working on "project2," for example, and you asked
Actor to load a class source file, it would look first in
C:\ACTOR\GENERAL\PROJECT2\CLASSES\. If the file wasn't found there,
C:\ACTOR\GENERAL\CLASSES\ would be searched next. If the file was still not
found, Actor would finally look in C:\ACTOR\CLASSES\. This same mechanism works
with all source, resource, and include files.
Figure 2: Project relationships are defined by their organization in the
directory hierarchy.

 C:\ACTOR\
     ACT\
     BACKUP\
     CLASSES\
     RES\
     WORK\
     GENERAL\
         BACKUP\
         CLASSES\
         RES\
         WORK\
         PROJECT1\
             BACKUP\
             CLASSES\
             RES\
             WORK\
         PROJECT2\
             BACKUP\
             CLASSES\
             RES\
             WORK\

The project hierarchy can be arbitrarily deep, allowing related projects to be
grouped under common subdirectories. Thus, they may inherit access to common
code and resources from the parent directory. Projects can have descendant
subdirectories for testing and experimenting. Changes made in these
subdirectories do not affect the parent directory. Source code and resources
from a descendant subdirectory can be made a permanent part of the project by
moving them to the parent directory.
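The lookup order described above can be sketched in a few lines. The following Python sketch is not the Actor implementation; the function name search_dirs and its arguments are hypothetical, and it only computes the list of directories that would be tried, nearest first. The same walk would apply to the RES subdirectories by passing a different subdir value.

```python
from pathlib import PureWindowsPath

def search_dirs(project_dir, root_dir, subdir="CLASSES"):
    """List the subdir directories to try, from the project up to the root.

    project_dir and root_dir are hypothetical inputs: the directory where
    work is being done and the original Actor directory.
    """
    current = PureWindowsPath(project_dir)
    root = PureWindowsPath(root_dir)
    if current != root and root not in current.parents:
        raise ValueError("project_dir must be at or below root_dir")
    dirs = []
    while True:
        dirs.append(str(current / subdir))   # nearest level first
        if current == root:
            return dirs
        current = current.parent             # step up one level

order = search_dirs(r"C:\ACTOR\GENERAL\PROJECT2", r"C:\ACTOR")
for path in order:
    print(path)
```

A file is taken from the first directory in the list that contains it, which is what lets a project shadow a file held in GENERAL or in the original Actor directories.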


Class Extensions


By itself, the directory hierarchy does not address the need to add or
change methods for classes defined in parent directories. We implemented
this capability by introducing a new type of source file we call the "class
extension file." Actor class files have filenames ending in .cls. Our class
extension files have filenames ending in .clx. Class extension files contain
only method code and class initialization code. Only normal class
files can actually define a class. The power of this feature can be
demonstrated using Actor's String class as an example. The original Actor
class source file (C:\ACTOR\CLASSES\STRING.CLS) defines the String class and
contains the String methods provided by Actor. Generally useful methods added
to String are placed in the C:\ACTOR\GENERAL\CLASSES\STRING.CLX class
extension file. String methods specific to an individual project are contained
in a STRING.CLX file in that project's CLASSES directory. Using this system,
changes to source files are made only in the CLASSES directory of the project
where work is being done. Because the original Actor source code remains
unchanged, installing updated versions of Actor becomes a simple matter of
replacing the original files.
Only one .cls file can exist for a class. This is where the class is actually
defined. The directory level containing the .cls file is said to "own" the
class. Descendant projects can only change or add methods in their own .clx
files for the class. Changes in the definition of the class, such as ancestor
class, instance variables, and class variables can only be made at the level
owning the class. This protects the class from being redefined by descendant
projects.
To implement the class extension feature, we made changes to Actor's Browser.
When the programmer selects a method from the Browser's list of methods
defined for a class, it looks successively up the directory tree until it
finds the method in a .clx file or the .cls file. If the method is found in
one of the project's parent CLASSES directories, the Browser notifies the
programmer that the method is not "owned" by the project. This notification
appears in the title bar as brackets around the method name. If the method is
not owned by the project, and a change is made, the modified method is written
into a .clx file for the class in the project's WORK subdirectory. The
original .cls file remains unchanged. Methods added to classes defined higher
up in the directory hierarchy are also placed in a .clx file for the class in
the project's WORK directory. When a snapshot of the image is performed, the
.clx files in WORK are copied into the project's CLASSES directory, just as
.cls files are.
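The ownership rule can be illustrated with a small model. This Python sketch is only an illustration under stated assumptions, not the Browser code: the source files of one class are stood in for by a list of dictionaries (nearest level first), the function names are hypothetical, and the method names other than findPath are made up. It also skips the WORK directory step and writes changes straight to the project level.

```python
def find_method(levels, method):
    """Return (source_text, owned) for the nearest definition of method.

    levels is a hypothetical in-memory stand-in for the source files of one
    class, ordered from the current project (index 0) up to the directory
    that holds the .cls file.
    """
    for depth, methods in enumerate(levels):
        if method in methods:
            return methods[method], depth == 0
    return None, False

def title_for(method, owned):
    """Brackets around the name mark a method the project does not own."""
    return method if owned else "[" + method + "]"

def save_method(levels, method, text):
    """Record a change at the project level only; parent levels stay untouched."""
    levels[0][method] = text

string_levels = [
    {"findPath": "project version"},                 # project STRING.CLX
    {"trim": "general version"},                     # GENERAL STRING.CLX
    {"size": "original", "trim": "original"},        # original STRING.CLS
]

text, owned = find_method(string_levels, "trim")
print(title_for("trim", owned))                      # shown with brackets
save_method(string_levels, "trim", "project override")
text_after, owned_after = find_method(string_levels, "trim")
```

After the save, the project sees its own version of the method while the copies in the parent levels are exactly as they were.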
To use the class extensions, you'll need the file EXTEND.LOD (Listing One,
page 125) to load the development environment extensions and EXTSOURCE.CLS
(Listing Two, page 125), which allows the source code for a class to be
contained in more than one file. Other files, such as TOOLWIND.CLX in Listing
Three, page 127, modify the existing Actor development classes to use or
support the features I've described. Because of space constraints, not all of
the class extension files are presented here. However, all are provided
electronically, including BEHAVIOR.CLX, CLASSDIA.CLX, CLVARDIA.CLX,
FUNCTION.CLX, METHODBR.CLX, SOURCEFI.CLX, STRING.CLX, SYMBOL.CLX, SYSTEM.CLX,
BROWSER.CLX, and WORKEDIT.CLX; see "Availability" on page 3.


Resource Inheritance


A few changes were made to allow resources to obey the configuration
inheritance system. In the GENERAL\RES directory, the actor.rc file was broken
up into separate files for each type of resource. We created actor.acc for
accelerators, actor.dlg for dialogs, actor.mnu for menus, and actor.str for
strings. A skeleton .rc file was written to pull together the actor resources
using the rcinclude directive. A batch file was written to invoke the resource
compiler using an include path that looks first in the RES directory of the
current project, next in the RES directory of the project's parent directory,
and on up to ACTOR\RES. Using this system, all projects descending from
GENERAL inherit the component actor resource files in GENERAL\RES. Projects
need only copy the skeleton .rc file into their own RES directory, set the
application name, and rcinclude the additional resource files specific to the
project. By doing this, we avoid duplication of resource definitions and
simplify maintenance.


Multiple Programmer Support


The final changes in our enhancement to Actor's development environment were
made to support multiple programmers working on a LAN. Within a project
directory, each programmer has their own image file, usually bearing the
programmer's name with an .ima extension. Normally this would not be feasible
because Actor uses only one file to save the dirty classes list and another
for the changes log. We overcame the problem by changing Actor to use separate
dirty classes and change log files for each image.
Automatic time stamping of method source code changes was implemented to allow
simple tracking of modifications. When a method is accepted in the Browser, a
time stamp is automatically inserted as the first line (overwriting the
previous time stamp if it exists). The time stamp includes the date, time,
programmer initials, class name, and method name, as in the following example:
/*Stamp 1990/10/21 15:48 swh String:findPath */.
The grep utility can be used to extract a list of all the /*Stamp lines in a
CLASSES directory. By using the DOS sort command, the list can be ordered by
date, programmer, or class. The time stamp also makes it easy to differentiate
between original Actor code and methods we have added.
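As an illustration of why the stamp format lends itself to simple tools, the following Python sketch builds and parses stamp lines in the format of the example above. It is a sketch under assumptions, not the code used in the project: the function names are hypothetical, the second stamp in the sample data is made up, and the parser assumes the four space-separated fields shown in the example.

```python
from datetime import datetime

def make_stamp(when, initials, class_name, method, header="Stamp"):
    """Build a stamp line in the format shown in the article's example."""
    return "/*{} {} {} {}:{} */".format(
        header, when.strftime("%Y/%m/%d %H:%M"), initials, class_name, method)

def parse_stamp(line, header="Stamp"):
    """Split a stamp line into its fields, or return None if it is not a stamp."""
    line = line.strip()
    prefix = "/*" + header + " "
    if not (line.startswith(prefix) and line.endswith("*/")):
        return None
    fields = line[len(prefix):-2].split()
    if len(fields) != 4:
        return None
    date, time, initials, target = fields
    class_name, _, method = target.partition(":")
    return {"date": date, "time": time, "initials": initials,
            "class": class_name, "method": method}

lines = [
    make_stamp(datetime(1990, 10, 21, 15, 48), "swh", "String", "findPath"),
    "/*Stamp 1990/09/02 09:05 abc Symbol:asPath */",   # made-up entry
    "Def findPath(self)",                              # not a stamp line
]
stamps = [s for s in map(parse_stamp, lines) if s]
by_date = sorted(stamps, key=lambda s: (s["date"], s["time"]))
```

Because the date is written year first with fixed-width fields, sorting the text of the stamps also sorts them chronologically, which is why a plain text sort is enough.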


Conclusion


In using these enhancements to Actor, we found that the concept of
configuration inheritance adds a powerful and fundamental dimension to
object-oriented development. The result is an environment that enhances the
ability to create and utilize reusable code. This implementation of the
concept was born out of our immediate needs, and constrained by limits of
time. I hope the ideas expressed in this article will serve as a step in the
direction of a standard, publicly available enhancement to Actor that fully
addresses the problems I've described. This will make Actor an even more
productive environment for applications development.

_ENHANCING THE ACTOR DEVELOPMENT ENVIRONMENT_
by Steve Hatchett


[LISTING ONE]



/* EXTEND.LOD - Actor development extensions load file
 * Copyright(C) 1991 Steve Hatchett. All rights reserved.
 * Steve Hatchett 14352 Mussey Grade Rd.
 * CIS: 70304,1423 Ramona, CA 92065
 * Use this file to load the development environment
 * extensions for use with the Actor programming
 * language.
 * load("extend.lod");
 * load();
 */
#define MAXSOURCENEST 5; /* max directory nesting */
Actor[#Programmer] := " "; /* set this in workspace */
Actor[#StampText] := "Stamp";/* time stamp header. Another
 * useful stamp header would be
 * "Copyright (C) 1991 XYZ, Inc."*/
LoadFiles :=
{
/* Your path may be different. */
 load(new(SourceFile),"c:\actor\general\classes\string.clx");
 load(#("classes\extsource.cls"
 "c:\actor\general\classes\symbol.clx"));

/* The minimum code necessary to navigate the directory
 * hierarchy has now been loaded, paths are no longer needed. */
 do(#(Behavior Browser ClassDialog ClVarDialog Function
 SourceFile System ToolWind WorkEdit), {using(cl)
 loadExtensions(cl);
 });
}!!





[LISTING TWO]

/* EXTSOURCE.CLS - Actor development extensions
 Copyright (C) 1991 Steve Hatchett. All rights reserved.
 Provides access to class source code. Allows the source code for a class
 to be contained in more than one file ordered hierarchically by directory
 structure. There is one .cls file containing the actual class definition.
 Additional .clx files may exist at lower levels of the hierarchy. They
 can contain methods and class initialization code. */!!

inherit(Object, #ExtSourceFiler,
 #(clName /* Name of class being handled. */
 ownsClass /* True if the .cls file for class is
 in this session's directory. */
 fileNames /* Source code file names with paths. */
 ownsLast /* True if we owned the last method
 we read. */), 2, nil)!!

now(class(ExtSourceFiler))!!

/* Return a new ExtSourceFiler, initialized to handle source code for the
 class whose name Symbol was given. */
Def openClass(self className)
{

 ^init(new(self),className);
} !!

now(ExtSourceFiler)!!

/* Load class extension files (.clx) for the class, but don't load the class
 file (.cls). This is useful for loading extensions to the standard
 Actor classes. Looks for all files in CLASSES subdirectories. */
Def loadExtensions(self work fNames fnm)
{
 work := loadString(332);
 if size(fileNames) > 0
 cand subString(first(fileNames),0,size(work))=work
 fileNames[0] := loadString(331) + getFileName(self);
 if not(exists(File,fileNames[0],0))
 removeFirst(fileNames);
 endif;
 endif;
 if size(fileNames) > 0
 fNames := copy(fileNames);
 fnm := last(fNames);
 if fnm[size(fnm)-1] == 's'
 pop(fNames); /* removes the .cls file */
 endif;
 ^do(reverse(fNames), {using(fnm)
 load(new(SourceFile),fnm);
 });
 endif;
 ^nil;
}!!

/* Recompile the class by loading all its source code.
 Looks in WORK for own source file if class is dirty. */
Def recompile(self)
{
 ^do(reverse(fileNames), {using(fnm)
 load(new(SourceFile),fnm);
 });
}!!

/* Return open SourceFile with given file name (which should include path). */
Def openSourceFile(self fName sFile)
{
 sFile := new(SourceFile);
 setName(sFile,fName);
 if not(open(sFile,0))
 errorBox(loadString(311), clName
 + loadString(312)+fName+".");
 ^nil;
 endif;
 ^sFile;
}!!

/* If no source file exists for the class at the directory level of this actor
 session, then create an extension file in WORK, and mark class as dirty. */
Def mustHaveOwn(self sFile)
{
 if size(fileNames) == 0
 cor first(fileNames)[0] == '.'

 add(DirtyClasses,clName);
 insert(fileNames,loadString(332)
 +subString(clName,0,8)+".clx",0);
 makeClassExtFile(new(SourceFile),self);
 endif;
}!!

/* Replace the fSym method text with methtext, or add method text if it
 wasn't already in source file. Mode determines whether method is a class or
 object method. These changes will only be made in the source file owned by
 this actor session. Returns nil if not successful. */
Def saveMethText(self methtext fSym mode rFile wFile)
{
 mustHaveOwn(self);
 if (rFile := openClass(SourceFile,self))
 wFile := saveMethText(rFile,methtext,fSym,mode);
 reName(wFile,(fileNames[0] := condDelCFile(rFile,self)));
 endif;
 ^rFile;
}!!

/* Replace the class initialization code with the given code. Change will
 only be made in source file owned by this actor session. Returns nil if
 not successful. */
Def replaceClassInit(self text rFile wFile)
{
 mustHaveOwn(self);
 if (rFile := openClass(SourceFile,self))
 wFile := replaceClassInit(rFile,text);
 reName(wFile,(fileNames[0] := condDelCFile(rFile,self)));
 close(rFile);
 endif;
 ^rFile;
}!!

/* Return the class init code for this class as a TextCollection. */
Def readClassInit(self rFile text)
{
/* look through source files until some class initialization text is found. */
 ownsLast := false;
 do(fileNames,{using(fnm)
 if (rFile := openSourceFile(self,fnm))
 text := readClassInit(rFile);
 close(rFile);
 if size(text) > 0
 ownsLast := (fnm[0] <> '.');
 ^text;
 endif;
 endif;
 });
 ^new(TextCollection,5);
}!!

/* Remove the fSym method text if it is found in source file owned by this
 actor session. Mode determines whether method is a class or object method.
 Returns nil if not successful or if method was not in owned source file. */
Def removeMethod(self fSym mode rFile wFile)
{
 if size(fileNames) > 0

 cand first(fileNames)[0] <> '.'
 cand (rFile := openClass(SourceFile,self))
 if wFile := replaceMethod(rFile,nil,fSym,mode)
 reName(wFile,(fileNames[0] := condDelCFile(rFile,self)));
 endif;
 close(rFile);
 endif;
 ^wFile;
}!!

/* Load the class by loading all its source code. Looks in CLASSES for
 all source files. */
Def load(self work)
{
 work := loadString(332);
 if size(fileNames) > 0
 cand subString(first(fileNames),0,size(work))=work
 fileNames[0] := loadString(331) + getFileName(self);
 if not(exists(File,fileNames[0],0))
 removeFirst(fileNames);
 endif;
 endif;
 ^do(reverse(fileNames), {using(fnm)
 load(new(SourceFile),fnm);
 });
}
!!

/* Return true if the last method read using loadMethText was from a source
 file at directory level of this actor session. */
Def ownsLast(self)
{
 ^ownsLast;
}!!

/* Initialize a new ExtSourceFiler. */
Def init(self nm metaNm)
{
 if class(nm) <> Symbol
 nm := name(nm);
 endif;
 if (metaNm := isMetaName(nm))
 clName := name(value(metaNm));
 else
 clName := nm;
 endif;
 getFileNames(self);
}!!

/* Return true if the .cls file for the class this filer is handling exists
 at the directory level of this actor session. */
Def ownsClass(self)
{
 ^ownsClass;
}!!

/* Return text of aMethod. Note accordingly if source code is missing. Mode
 indicates type of method, either class (BR_CMETH) or object (BR_OMETH). Looks
 through all the source files for the class. */

Def loadMethText(self aMethod mode text rFile)
{
/* look through source files until the text for the given method is found. */
 ownsLast := false;
 rFile := new(SourceFile);
 do(fileNames,{using(fnm)
 if not(rFile := openSourceFile(self,fnm))
 errorBox(loadString(311), clName
 +loadString(312)+fnm+".");
 else
 text := findMethod(rFile,aMethod,mode);
 close(rFile);
 if text
 ownsLast := (fnm[0] <> '.');
 ^leftJustify(text[0]);
 endif;
 endif;
 });
 ^aMethod + loadString(310);
}!!

/* Return the class's name the way Behavior would do it. */
Def name(self)
{
 ^clName;
}!!

/* Return the class's filename the way Behavior would do it. */
Def getFileName(self dir)
{
/* return name + ext */
 ^subString(clName,0,8)
 + if ownsClass
 ".cls" else ".clx" endif;
}!!

/* Get the names of the files containing this class' source code. */
Def getFileNames(self dir fRoot base)
{
 fileNames := new(OrderedCollection,MAXSOURCENEST);
 dir := loadString(if clName in DirtyClasses
 332 else 331 endif);
 base := subString(clName,0,8);
 fRoot := dir + base;
 do(MAXSOURCENEST, {using(i fnm)

/* if the original .cls file or a .clx file is found for class, add it to list
 if it's above this session's directory. */
 if exists(File,(fnm:=fRoot+".cls"),0)
 add(fileNames,fnm);
 if i==0
 ownsClass := true;
 endif;
 ^fileNames;
 endif;
 if exists(File,(fnm:=fRoot+".clx"),0)
 add(fileNames,fnm);
 endif;


/* construct the path name of the next higher level. */
 if i==0
 fRoot := "..\classes\"+base;
 else
 fRoot := "..\"+fRoot;
 endif;
 });
 ^fileNames;
}!!

/* Return the names of the files containing this class' source code
 (including own source file). */
Def fileNames(self)
{
 ^fileNames;
}!!





[LISTING THREE]

/* TOOLWIND.CLX - Actor development extensions
 *
 * Copyright(C) 1991 Steve Hatchett. All rights reserved.
 */!!
now(class(ToolWindow))!!

/* Used by the system to initialize the DirtyClasses
 Set backup file. DirtyClasses is assumed to be
 empty on entry. For each class found in the backup
 file, ask the user whether to re-load the dirty
 class file or use the old class file.

 swh modified to support extended source files.
 swh modified to support unique dirty class file
 names between images.
 */
Def loadDirty(self dName)
{
/* mod for unique dirty file name - base the dirty
 * file name on the image name.
 */
 setName($DFile,subString(imageName(TheApp),0,
 indexOf(imageName(TheApp),'.',0))
 +".drt");
/* end mod */

 if open($DFile, 0)
 then
...
 if size(DirtyClasses) > 0
 then do(copy(DirtyClasses),
 {using(clName)

/* mod for extended source file support - if the dirty
 * class exists in the system, let it tell us its file
 * name (.cls or .clx), otherwise the class was created

 * since the last snapshot, so it should be a .cls file.
 */
 dName := loadString(332) + subString(clName,0,8)
 + if not(Classes[clName])
 cor isOwnClass(Classes[clName])
 ".cls" else ".clx" endif;
/* end mod */

/* original line, superseded by the mod above:
 dName := loadString(332) + subString(clName,0,8) + ".cls"; */
...
}!!

now(ToolWindow)!!

/* Insert/overwrite a time stamp as the first line of
 the method text.
 */
Def stampMethText(self text fSym mode stampHead)
{
 stampHead := "/*"+StampText;

/* if method doesn't already have a stamp, insert
 * a line for it.
 */
 if left(text[0],size(stampHead),' ') <> stampHead
 insert(text,"",0);
 endif;

/* construct the time stamp. */
 text[0] := stampHead+" "+timeStamp(System)+" "
 +programmer(System)+" "+name(selClass)
 +if mode == BR_CMETH
 "(Class)" else "" endif
 +":"+fSym+" */";
}!!

/* Save the new text for current method into the source
 file. The first argument, text, is the method text. The
 second, fSym, is the symbol with the name of the method,
 e.g. #print. The new or revised method text ends up in a
 class source file in WORK. Also, write the text to the
 change log.

 swh modified to support extended source files
 swh modified to support automatic time stamping
 */
Def saveMethText(self, text, fSym | textEnd, nowCl, rFile, wFile)
{...
 changeLog(ew, "now(" + nowCl + ")" + Chunk +
 subText(text, 0, 0, textEnd, size(text[textEnd])));

/* mod for automatic time stamps
 * place time stamp in method text.
 */
 if text
 stampMethText(self,text,fSym,mode);
 endif;
/* end mod */


/* mod to support extended source files */
 saveMethText(openClass(ExtSourceFiler,selClass),
 text, fSym, mode);
/* end mod */

 makeDirty(self);
 endif;
 ^fSym;
}!!

/* ToolWindow class initialization
 Rewritten to override ToolWindow.cls class
 initialization to support unique dirty file
 names between images - base the dirty file
 name on the image name.
 */
$DFile := setName(new(TextFile),
 subString(imageName(TheApp),0,
 indexOf(imageName(TheApp),'.',0))
 +".drt");






October, 1991
MIXED-LANGUAGE WINDOWS PROGRAMMING


Fortran as a DLL, Visual Basic for the front end




John Norwood


John is a technician for Microsoft's Product Support group and can be reached
at One Microsoft Way, Redmond, WA 98052-6399.


In the beginning, there was the punch card deck, the teletypewriter, and
Fortran. Since then, the I/O interaction characteristics of Fortran have in
many ways remained frozen, restricted by READ and WRITE, the only available
ANSI standard I/O statements. But times change. Users are no longer willing to
accept command-line, console-oriented user interfaces. Consequently, many
Fortran programmers want to move from character-based screen management to
graphical user interfaces such as Windows 3.0. With tools such as Microsoft's
Fortran 5.1 QuickWin runtime library (which allows standard, console-oriented
programs to be ported without modification to Windows), programmers can avoid
the immediate plunge into the complexities of GUI programming.
But the QuickWin libraries represent a trade-off between ease of use and
flexibility. The QuickWin user interface does not support features such as
application-specific menus, buttons, or graphics. Therefore, to build Windows
applications, Fortran programmers are expected to integrate Fortran code with
a C Windows program that utilizes the Windows Software Development Kit.


Enter Visual Basic


C can present a serious programming challenge for the average Fortran
programmer. Fortunately, Microsoft's Visual Basic provides an alternative with
easy access to virtually all the sophisticated features of Windows. And if the
standard feature set of Visual Basic is not sufficient or optimal for the task
at hand, the programmer can make direct calls to the Windows kernel, create a
custom Dynamic Link Library (DLL) using another language, or create a custom
control using C and the Visual Basic Control Development Kit.
An area of interest to Fortran programmers, then, is the potential to take
existing Fortran code, modify it for use in a DLL, and then access this DLL
from Visual Basic, which serves as the front end to the Fortran routines. In
this way, each language can contribute in its area of greatest strength.


Communicating through DLLs


The problems of mixed-language programming caused by runtime library conflicts
during static linking aren't an issue, because DLLs are created and linked as
stand-alone, single-language modules. Because global data cannot be shared
between an application and a DLL, all communication between Visual Basic and a
Fortran Windows DLL is through the parameter list.
Visual Basic treats Fortran DLLs essentially as "black boxes," into which data
is passed through the argument list, and out of which modified values are
passed back through arguments or as function return values. All communication
with the user must be through Visual Basic because a Fortran DLL cannot do any
screen I/O (all input and output calls in a Fortran DLL are resolved by
character-based routines and are incompatible with the Windows environment).
Also, in keeping with the "black box" metaphor, a Fortran DLL cannot call back
to routines in the Visual Basic main program. File I/O is possible from the
Fortran DLL as long as care is exercised to close the file before the
application terminates. All code in the Fortran DLL must consist of functions
or subroutines because the Visual Basic program is the main program, or in
Windows terms, the actual "task" (DLLs can never be tasks under Windows).
I find it best to initially test and thoroughly exercise the Fortran code with
a driver main program (either DOS or QuickWin). This eliminates the
possibility of the Fortran/Visual Basic interface as the culprit in any
suspicious program behavior. Also, the CodeView for Windows source-level
debugger that comes with Microsoft Fortran can be used if symbolic information
is included in the Fortran DLL.
To create the actual DLL, you must first create a definitions file to link it.
A sample .DEF file called MATLIB.DEF, which can be used as a guide, is
included with Fortran 5.1. The only modifications necessary are to change the
name in the LIBRARY statement to match the base name of your DLL, and to
include the names of all subroutines or functions that you plan to call
directly from Visual Basic in the EXPORTS statement. Always include the default
WEP DLL termination routine in the EXPORTS section of a DLL definitions file.
Code that will be included in a DLL must be compiled with the /Aw option; code
that will be an entry point into the DLL must be compiled with the /Aw and /Gw
options. The link must use the runtime library LDLLFEW.LIB and the definitions
file. It isn't necessary to create an import library using the IMPLIB utility
because Visual Basic does not require it.


Accessing the DLL


A Visual Basic program is managed as a project which contains form modules
where the layout of the user interface is designed; code modules containing
auxiliary code; and a global module for declaration of elements of global
scope. In order to access a DLL from Visual Basic, a Basic Declare statement
with the special Lib keyword is required.
Fortran is particularly easy to interface with because the default calling
protocols in both languages are closely matched: Almost everything in both
languages is passed as a far pointer, and both languages store arrays in
column major format. Modification is required only in passing strings and
arrays. Strings in Visual Basic can be either dynamic or fixed-length. When
communicating with any Fortran DLL, it is best to use fixed-length strings
because a dynamic-length string is only as long as the last value assigned to it.
If this value is smaller than the length of the string as declared in the
Fortran routine, the Fortran code will write out into nonallocated areas of
memory, wreaking havoc with other memory locations or causing a protection
violation. To pass a string from Visual Basic to any DLL, the keyword ByVal
must be specified on the string argument in the Declare statement for that DLL
routine. Passing strings as ByVal arguments is critical because it cancels the
additional information that Visual Basic uses in passing strings to other
Visual Basic routines and allows the correct generation of the calling
protocols for a mixed-language interface. For example, if the Fortran routine
looks like that in Example 1(a), the Visual Basic sub should look like Example
1(b), and the declaration in the Declarations section of the form or module
should resemble Example 1(c).
Example 1: Passing strings: (a) Fortran routine; (b) Visual Basic sub; (c)
declaration.

 (a)

 SUBROUTINE STRINGER (INSTRING)
 CHARACTER*40 INSTRING
 INSTRING = 'This is from Fortran'
 END

 (b)

 Sub Form_Click ()
 Dim temp As String * 40
 Call STRINGER (temp)
 Debug.Print temp
 End Sub

 (c)


 Declare Sub STRINGER Lib "d:\vb\test\string.dll" (ByVal Mystring As String)

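The reason a fixed-length string is the safe choice can be shown with a small model. This Python sketch is an illustration only and calls no DLL: it mimics what an assignment to a CHARACTER*40 argument does (the value is blank-padded to the declared length and all of it is stored) and reports how many bytes would fall outside the caller's buffer. The function name fortran_assign and the sample buffers are hypothetical.

```python
def fortran_assign(buffer, declared_len, value):
    """Model INSTRING = value for a CHARACTER*declared_len argument.

    The value is truncated or blank-padded to declared_len. The part that
    fits is stored in buffer; the return value is the number of bytes that
    would have been written past the end of the caller's buffer.
    """
    padded = value.encode("ascii")[:declared_len].ljust(declared_len, b" ")
    fits = min(len(buffer), declared_len)
    buffer[:fits] = padded[:fits]
    return declared_len - fits

# A buffer like "Dim temp As String * 40": room for the full declared length.
fixed = bytearray(b" " * 40)
overrun_fixed = fortran_assign(fixed, 40, "This is from Fortran")

# A buffer like a dynamic string that last held a 5-character value.
dynamic = bytearray(b"short")
overrun_dynamic = fortran_assign(dynamic, 40, "This is from Fortran")

print(overrun_fixed, overrun_dynamic)
```

In the model the overrun is merely counted; in the real DLL those bytes would be written into memory that does not belong to the string.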
Care must also be exercised in passing arrays as parameters. Visual Basic has
some special array handling facilities that aid in passing arrays from one
Visual Basic routine to another. This default handling of arrays must be
suppressed when communicating with other languages. When an array is passed
from Visual Basic to a DLL, the first element of the array is specified in the
call. This results in a far pointer to the head of the array being passed to
the routine in the Fortran DLL, which is precisely what Fortran expects to
receive when an array is passed. For example, if the Fortran routine looks
like Example 2(a), the Visual Basic sub will look like Example 2(b), and the
declaration in the Declarations section of the form or module will look like
Example 2(c).
Example 2: Passing arrays: (a) Fortran routine; (b) Visual Basic sub; (c)
declaration.

 (a)

 SUBROUTINE ARRAYTEST (ARR)
 INTEGER*4 ARR(20)
 ARR = 5
 END

 (b)

 Sub Form_Click ()
 Static testarray(1 To 20) As Long
 Call ARRAYTEST(testarray(1))
 For i% = 1 To 20
 Debug.Print testarray(i%)
 Next i%
 End Sub

 (c)

 Declare Sub ARRAYTEST Lib "d:\vb\test\array.dll" (Myarray As Long)

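Passing the first element works because the DLL receives only the address of the head of the array and locates every other element relative to it. For arrays with more than one dimension, the column-major layout mentioned earlier fixes where each element sits. The following Python sketch of that arithmetic is an illustration; the function name and the sample extents are hypothetical, and indices start at 1 as in Example 2.

```python
def column_major_offset(indices, extents, lower=1):
    """Offset, in elements, of an element from the head of the array.

    The first index varies fastest, as in Fortran. Multiply the result by
    the element size (4 for INTEGER*4) to get a byte offset.
    """
    offset = 0
    stride = 1
    for index, extent in zip(indices, extents):
        if not lower <= index < lower + extent:
            raise IndexError("index out of range")
        offset += (index - lower) * stride
        stride *= extent      # the next dimension steps over a whole column
    return offset

# One-dimensional case, as in Example 2: element 20 of ARR(20).
last_1d = column_major_offset((20,), (20,))

# A hypothetical 3-by-4 array: (2,1) sits right after (1,1),
# while (1,2) starts the second column.
second = column_major_offset((2, 1), (3, 4))
next_column = column_major_offset((1, 2), (3, 4))
```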
Finally, one of the more difficult things to pass from Visual Basic to a DLL
is an array of strings. The solution is to declare a user-defined type in
Visual Basic whose only element is a fixed-length string. An array of this type
maps onto an array of strings in Fortran. For example, if the Fortran routine looks
like Example 3(a), the Visual Basic sub will look like Example 3(b), and the
declarations in the global module will look like Example 3(c).
Example 3: Passing an array of strings: (a) Fortran routine; (b) Visual Basic
sub; (c) declaration.

 (a)

 SUBROUTINE ARRAYSTRINGS (ARR)
 CHARACTER*24 ARR(5)
 ARR = 'This is a string also'
 END

 (b)

 Sub Form_Click ()
 Static testarray (1 To 5) As StringArray
 Call ARRAYSTRINGS (testarray(1))
 For i% = 1 To 5
 Debug.Print testarray(i%).strings
 Next i%
 End Sub

 (c)

 Type StringArray
 strings As String * 24
 End Type


 Declare Sub ARRAYSTRINGS Lib "d:\vb\test\array.dll" (Myarray As StringArray)
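
Why a one-member type maps onto a Fortran array of strings can be sketched in C. This is an illustration under stated assumptions, not the article's code: array_strings is a hypothetical stand-in for ARRAYSTRINGS, and the struct models the StringArray type as 24 unterminated characters. Since the struct holds nothing else, an array of five of them is 120 contiguous bytes, which is the layout CHARACTER*24 ARR(5) expects.

```c
#include <stddef.h>
#include <string.h>

#define STR_LEN 24  /* matches String * 24 and CHARACTER*24 */
#define N_STR   5   /* matches ARR(5) */

/* Analogue of the StringArray type: one fixed-length, unterminated field. */
typedef struct {
    char strings[STR_LEN];
} StringArray;

/* Hypothetical stand-in for ARRAYSTRINGS: copy text into every element and
   blank-pad to STR_LEN, as Fortran character assignment does. */
void array_strings(StringArray *arr, const char *text)
{
    size_t len = strlen(text);
    int i;

    if (len > STR_LEN)
        len = STR_LEN;              /* Fortran truncates on the right */
    for (i = 0; i < N_STR; i++) {
        memset(arr[i].strings, ' ', STR_LEN);
        memcpy(arr[i].strings, text, len);
    }
}
```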

Declare statements can be placed in the Declarations section of any form or
module, but the Type statement can only be placed in the global module. Passing
all other data items is very straightforward. Note, however, that Fortran
should never use the construct CHARACTER*(*) to receive adjustable size
strings into a DLL, although there is no problem with using adjustable size
arrays in a Fortran DLL.


Other Considerations


Before describing a sample application, I'll address some limitations of mixed
Visual Basic and Fortran programming. Visual Basic does not support data items
greater than 64 Kbytes in size -- huge data items in Microsoft language terms.
Microsoft Fortran routines can create and manipulate such items, but Visual
Basic will only be able to access them either element by element or in pieces
less than 64 Kbytes. There is no easy solution to this problem if huge data
must be used. C DLLs have been written that provide a whole suite of services
for working with huge arrays. All the array allocation and manipulation are
done by calls from Visual Basic to the C service routines. An equivalent set
of services would be difficult to write in Fortran because the language lacks
pointers.
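The piece-by-piece access mentioned above might look like the following C sketch. It is a guess at the general shape of such a service, not code from any actual DLL: huge_read and MAX_CHUNK are hypothetical names, and the sketch assumes 4-byte float elements so that one piece stays under 64 Kbytes.

```c
#include <stddef.h>
#include <string.h>

/* Largest piece handed back per call: 16,000 four-byte values is 64,000
   bytes, just under the 64-Kbyte limit (assumes sizeof(float) == 4). */
#define MAX_CHUNK 16000

/* Hypothetical service routine: copy up to `count` elements of a large
   array, starting at element `start`, into the caller's small buffer.
   Returns the number of elements actually copied. */
size_t huge_read(const float *huge, size_t total,
                 size_t start, size_t count, float *piece)
{
    if (start >= total)
        return 0;
    if (count > MAX_CHUNK)
        count = MAX_CHUNK;
    if (count > total - start)
        count = total - start;
    memcpy(piece, huge + start, count * sizeof(float));
    return count;
}
```

A caller would loop, advancing start by the returned count until the routine returns zero.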
I've already touched on another mixed Visual Basic and Fortran development
limitation. Fortran must be accessed as a DLL, so it must be a passive
receiver of information from Visual Basic. In many cases, existing Fortran
code containing I/O scattered throughout would have to be extensively modified
to work under these conditions. Using C and the Windows SDK with Fortran isn't
an easy solution because although Fortran need not be accessed as a DLL, all
screen I/O and calls to the Windows API must still be done from C. This still
results in a great deal of recoding on the Fortran side. The one advantage of
using C is that the Fortran code, if statically linked to a C application, can
make callbacks to C routines in the main program or other C functions; this
cannot be done from Fortran to Visual Basic.
Also, Fortran DLLs do not, by default, yield to any other process. Once a
calculation begins in Fortran it will continue to its conclusion without
stopping. This is not compatible with using the application in a multitasking
mode.
Finally, there is the question of performance and flexibility. Performance of
a GUI application is not an issue of benchmarks but of perception: How
responsive does the application seem to the user? Doing the
intensive number-crunching in an optimized Fortran DLL, combined with an
efficient design of the Visual Basic front end, should result in an application
whose performance is comparable to other Windows applications. For total
flexibility nothing can compare to coding at the API level using C and the
Windows SDK; however, the possibilities of extending Visual Basic using custom
controls and direct calls to the Windows kernel help to diminish any perceived
restrictions. When comparing the programming process in Visual Basic and
programming using the Windows SDK, the cost/benefit ratio is completely
situation-specific and dependent on the programming expertise and resources
available.


Building an Eigenvalue Calculator


When I first thought of testing a mixed Visual Basic/Fortran example program,
I looked for Fortran code that was well-tested, concise, nontrivial, and
well-suited for utilization as a DLL. The book Numerical Recipes: The Art of
Scientific Computing (Press, Flannery, Teukolsky, and Vetterling, Cambridge
University Press, ISBN 0-521-30811-9) seemed like a good place to start. I
thought Visual Basic would make matrix data entry easy to implement. Because
Eigenvalues and Eigenvectors mathematically characterize any matrix for which
they exist, I decided to use the Eigenvalue and Eigenvector computation
routines (not too complex, yet not numerically trivial) to build an Eigenvalue
calculator. Also, symmetric matrices suggest a user entry routine that
enforces the matrix symmetry automatically, thus making the Visual Basic side
of the program more interesting. Figure 1 shows the calculator. The Fortran
routine TRED2 (Listing One, page 130) takes a symmetric matrix and returns a
symmetric tridiagonal matrix in two vectors and an orthogonal matrix used in
finding the Eigenvectors. The routine TQLI takes the output of TRED2 and uses
orthogonal transformations with implicit shifts to produce a vector containing
the Eigenvalues and a matrix with Eigenvectors for columns (details are in
Numerical Recipes).
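One simple way to sanity-check the output of routines like these is to confirm that each returned pair satisfies A*v = lambda*v. The C sketch below is my own illustration, not taken from Numerical Recipes or the listing: eigen_residual is a hypothetical helper, fixed at 2x2 for brevity, and it uses double precision where the listing uses real*4.

```c
#define N 2  /* matrix dimension used for this illustration */

/* Largest absolute component of A*v - lambda*v for an N x N matrix.
   A value near zero means (lambda, v) is an eigenpair of A. */
double eigen_residual(double a[N][N], double v[N], double lambda)
{
    double worst = 0.0;
    int i, j;

    for (i = 0; i < N; i++) {
        double r = -lambda * v[i];

        for (j = 0; j < N; j++)
            r += a[i][j] * v[j];
        if (r < 0.0)
            r = -r;                 /* absolute value without math.h */
        if (r > worst)
            worst = r;
    }
    return worst;
}
```

For the symmetric matrix with rows (2, 1) and (1, 2), the vector (1, 1) with eigenvalue 3 and the vector (1, -1) with eigenvalue 1 both give a residual of zero. With single-precision results, a small tolerance rather than an exact zero would be the sensible test.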
On the Visual Basic side, I had to decide which control was appropriate for a
matrix entry routine. I chose the custom control GRID.VBX because it was
designed for precisely this purpose. This control is like a small fragment of
an Excel spreadsheet that can be easily resized and comes with built-in scroll
bars and clipboard functionality. It does not consist of text entry fields so
text entry must be done in a dedicated text control and transferred to the
cells on the grid (just like in Excel). Some attempt was made to validate user
entry, but this could have been more rigorous. The same grid that the matrix
was entered on also displayed the Eigenvectors after calculation, and a
one-dimensional row grid was used to display the corresponding Eigenvalues.
The dimension of the grids was made dynamically resizable, ranging from 1 to
35. A simple menu was constructed to allow pasting to and copying from the
grids to the Windows clipboard. Everything was contained on one form; the
Global.bas module was used to declare various constants, and the Declare
statements were used for communicating with the Fortran DLL.
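The symmetry enforcement mentioned earlier comes down to writing every entry twice. The following C sketch shows the idea only; the sample program does this in Visual Basic against the grid, and set_symmetric and DIM are hypothetical names chosen for the illustration.

```c
#define DIM 35  /* largest matrix dimension the calculator allows */

/* Store one user entry and its mirror image, so the matrix stays symmetric
   no matter which triangle the user types into. Returns 0 on success,
   -1 if the (zero-based) indices fall outside an n x n matrix. */
int set_symmetric(double m[DIM][DIM], int n, int row, int col, double value)
{
    if (n < 1 || n > DIM || row < 0 || row >= n || col < 0 || col >= n)
        return -1;
    m[row][col] = value;
    m[col][row] = value;
    return 0;
}
```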
Because the Fortran code was chosen with an eye to creating a DLL, it required
little modification. One additional argument was added to the TQLI subroutine
to pass out an error flag, and an error output statement was eliminated. The
routines were left as separately exported subroutines instead of being called
from within the DLL from a master subroutine. Having separately callable
routines allows Visual Basic to regain control of the program more often
because Fortran DLLs do not cooperate in multitasking and only relinquish
control of the operating system on return. If desired, some code could be
inserted to allow the user to cancel the operation before the second call to
the DLL. The two routines were tested with a driver program and then made into
a DLL using the menu options in the "Programmer's Workbench." The definition
file was created by modifying the example .DEF file included with the Fortran
5.1 sample source code.
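The benefit of separately callable routines can be sketched as a driver that checks for cancellation only in the gaps between calls. This C sketch is an assumption-laden illustration, not the sample program: run_stages, the callback types, and the placeholder stages are all hypothetical, and in the real program the driver role is played by Visual Basic.

```c
#include <stddef.h>

typedef int (*stage_fn)(void *data);  /* returns 0 on success, negative on error */
typedef int (*cancel_fn)(void);       /* returns nonzero if the user cancelled */

/* Run each stage in turn. A stage cannot be interrupted once started, so
   the cancel check happens only in the gaps between calls. Returns 0 when
   every stage ran, 1 when cancelled, or a stage's negative error code. */
int run_stages(const stage_fn *stages, int count, void *data,
               cancel_fn cancelled)
{
    int i, rc;

    for (i = 0; i < count; i++) {
        if (i > 0 && cancelled != NULL && cancelled())
            return 1;
        rc = stages[i](data);
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Placeholder stages standing in for the TRED2 and TQLI calls; each one
   only counts that it was invoked. */
int stage_reduce(void *data) { ++*(int *)data; return 0; }
int stage_solve(void *data)  { ++*(int *)data; return 0; }

/* Example cancel callbacks. */
int never_cancel(void)  { return 0; }
int always_cancel(void) { return 1; }
```

Had the two routines been wrapped in a single master subroutine, there would be no gap in which to make the check.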
All the remaining work was on the Visual Basic side of the program. A
calculator-style interface was used because it was straightforward to
implement. There is a two-way connection between the grid text entry field and
the matrix grid (grid1). This is where all matrix values are entered,
modified, and validated. Maintaining this relationship and validating data
entry were probably the least obvious aspects of the interface.
The remainder of the interface is mostly button-oriented and fairly
self-evident. Two menus were created to allow the user to copy from and paste
to the grids. This is where the power of the grid custom control was apparent.
Every grid has a clip property that is robust in accepting data from the
clipboard: If the quantity of data in the clipboard and the selected area
don't match, data is automatically truncated or padded with nulls to make a
match with no error invoked. This, combined with the default selection
capability of a grid control (similar to that of an Excel spreadsheet) makes
moving rows, columns, entire matrices, and subsets thereof extremely simple.
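The truncate-or-pad behavior can be modeled in a few lines of C. This is only a model of what the paragraph above describes, not the grid control's implementation: paste_cells is a hypothetical name, cells are reduced to a flat run of numbers, and 0 stands in for a null cell.

```c
#include <stddef.h>

/* Fill a selection of `sel_len` cells from `clip_len` clipboard values.
   Extra clipboard values are dropped; missing ones are filled with 0 as a
   stand-in for null. Returns the number of cells that received real
   clipboard data. */
size_t paste_cells(double *sel, size_t sel_len,
                   const double *clip, size_t clip_len)
{
    size_t i;
    size_t copied = clip_len < sel_len ? clip_len : sel_len;

    for (i = 0; i < copied; i++)
        sel[i] = clip[i];
    for (i = copied; i < sel_len; i++)
        sel[i] = 0.0;
    return copied;
}
```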
Probably the least satisfactory aspect of the program is the lack of features
in navigating the matrix grid. There is no default way to advance to the next
cell on a grid, although arrow keys and most other cursor movement keys are
recognized when the grid is the focus. To implement cell advance, a key (for
example, a tab) would have to be trapped in the text entry field and the
position on the grid maintained dynamically. If scrolling is necessary, this
must also be maintained by the programmer. For the sake of simplicity, this was
not implemented. Considering the powerful properties embedded in the grid control,
in any serious program this would not be much of an obstacle.
Data validation was done in a somewhat quick and dirty manner: The intrinsic
Val function was used to convert whatever text was entered into a number and
then back into text for placement in the grid. An attempt was made to notify
the user when conversions were made. The sample program takes advantage of the
convenient array handling features of Visual Basic. It was possible to
dynamically allocate only as much memory as required for the current dimension
on the form. Fortran programmers will certainly appreciate the ability to
declare an array to be global to the form with global access to upper and
lower bounds and yet dynamically allocate the array in Basic subroutines.
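The round-trip validation described above can be approximated in C as follows. This is a sketch, not the program's code: normalize_entry is a hypothetical name, and strtod is only a rough analogue of Val (the two differ in details such as embedded blanks and radix prefixes).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Convert the entered text to a number and back to text, in the spirit of
   the Val round trip. strtod, like Val, reads a leading numeric prefix and
   yields 0 when there is none. Writes the normalized text to `out` and
   returns 1 if it differs from the input (so the user can be notified),
   0 otherwise. */
int normalize_entry(const char *entered, char *out, size_t out_size)
{
    double value = strtod(entered, NULL);

    snprintf(out, out_size, "%g", value);
    return strcmp(entered, out) != 0;
}
```

An entry such as "12abc" comes back as "12" with the changed flag set, which is the cue to tell the user that a conversion took place.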


Gotchas


Transferring programming experience with a sequential procedural language such
as Fortran to an event-driven programming system such as Visual Basic can lead
to certain pitfalls. It is startlingly easy to throw up a series of controls
that perform the required actions in isolation. Yet difficult and subtle
programming and design issues lie latent in the interaction between controls
and the free-form sequence of user actions allowed by a GUI. The user's
ability to manipulate any control at any time (and often in a variety of ways)
presents the programmer with an embedded sequence of temporal and positional
"ifs" that, if laid out in actual code, would manifest considerable complexity.
On-the-fly programming in Visual Basic without any prior planning or design is
tempting, but can result in ground-up code rewrites due to unforeseen and
subtle interactions between event handlers: Ease of programming partially
masks the complexity of the user interface.
Visual Basic can be a very modularized programming system, yet all properties
and methods of all controls are globally available. Hence, side effects are an
ever-present danger. I often found myself wishing for some means of
determining exactly which control previously had the focus. This is usually a
sign that procedural programming habits have crept in and event handlers are
not being coded in a truly self-contained manner.
An overpowering desire to add just one more global flag is evidence that old
programming methods die a hard and lingering death. I was never able to
eliminate all my own global variables because of the initial sloppy design of
the program.


Wrapping Up


Every hard-core Fortran programmer probably has at least one bad Visual Basic
program waiting to be written. Hopefully, the methodologies and paradigms of
an event-driven interface will begin to be as fundamental and natural as the
use of structured programming and well-documented code. The ease of
interaction between programming systems such as Visual Basic and procedural
languages means that useful Fortran code can exist as a component of a modern
GUI program, thus preserving the value of existing code, and simplifying the
transition to newer styles of programming.

_MIXED-LANGUAGE WINDOWS PROGRAMMING_
by John Norwood



[LISTING ONE]


      subroutine tred2(a,n,np,d,e)
c
c Householder reduction of a real symmetric, nxn matrix a, stored in an
c npxnp physical array. On output, a is replaced by the orthogonal matrix
c q effecting the transformation.
c d returns the diagonal elements of the tridiagonal matrix, and e the off-
c diagonal elements, with e(1)=0.


      implicit none

      integer*4 i,j,k,l ! Loop indices
      integer*4 n ! Actual array size
      integer*4 np ! Physical array size of incoming array
      real*4 a(np,np) ! Matrix, np is used so dimensions match
      real*4 d(np) ! Vector that will have diagonal elements
      real*4 e(np) ! Vector that will have off-diagonal elements
      real*4 h ! Vector norm used in forming projection
      real*4 scale ! Scale factor
      real*4 f ! Temporary variable
      real*4 g ! Temporary variable
      real*4 hh ! Another piece of the projection

      if (n.gt.1) then
        do 18 i=n,2,-1
          l=i-1
          h=0.
          scale=0.
          if(l.gt.1) then
            do 11, k=1,l
              scale=scale+abs(a(i,k))
11          continue
            if(scale.eq.0.) then ! Skip transformation
              e(i)=a(i,l)
            else
              do 12, k=1,l
                a(i,k)=a(i,k)/scale ! Use scaled a's in transformation
                h=h+a(i,k)**2 ! for eigenvectors
12            continue
              f=a(i,l)
              g=-sign(sqrt(h),f)
              e(i)=scale*g
              h=h-(f*g)
              a(i,l)=f-g
              f=0.
              do 15,j=1,l
                a(j,i)=a(i,j)/h
                g=0.
                do 13, k=1,j
                  g=g+a(j,k)*a(i,k)
13              continue
                if(l.gt.j) then
                  do 14,k=j+1,l
                    g=g+a(k,j)*a(i,k)
14                continue
                endif
                e(j)=g/h ! Form element of projection in
                f=f+e(j)*a(i,j) ! temporarily unused element of e
15            continue
              hh=f/(h+h)
              do 17, j=1,l
                f=a(i,j)
                g=e(j)-hh*f
                e(j)=g
                do 16, k=1,j ! Loop to reduce matrix a
                  a(j,k)=a(j,k)-f*e(k)-g*a(i,k)
16              continue
17            continue
            endif
          else
            e(i)=a(i,l)
          endif
          d(i)=h
18      continue

c This starts the eigenvector specific part of the code.

      endif
      d(1)=0.
      e(1)=0.
      do 23, i=1,n ! Begin accumulation of transformation matrices
        l=i-1
        if(d(i).ne.0.) then
          do 21, j=1,l
            g=0.
            do 19, k=1,l ! Use information stored in a to form
              g=g+a(i,k)*a(k,j) ! projection times orthogonal matrix
19          continue
            do 20, k=1,l
              a(k,j)=a(k,j)-g*a(k,i)
20          continue
21        continue
        endif

c This ends the eigenvector specific part of the code.

        d(i)=a(i,i)

c This starts the eigenvector specific part of the code.

        a(i,i)=1. ! Reset row and column of a to identity matrix
        if(l.ge.1)then ! for the next iteration of transformation loop
          do 22,j=1,l
            a(i,j)=0.
            a(j,i)=0.
22        continue
        endif

c This ends the eigenvector specific part of the code.

23    continue
      return
      end

      subroutine tqli(d,e,n,np,z,iterflag)

c This subroutine performs a QL algorithm with implicit shifts, to determine
c the eigenvalues and eigenvectors of a real, symmetric, tridiagonal matrix,
c or of a real, symmetric matrix previously reduced by tred2 above.
c d is a vector of length np. On input, its first n elements are the
c diagonal elements of the tridiagonal matrix. On output, it returns the
c eigenvalues. The vector e inputs the subdiagonal elements of the
c tridiagonal matrix, with e(1) arbitrary. On output, e is destroyed.
c If eigenvectors are desired, the matrix z (nxn stored in an npxnp array)
c is input as the identity matrix or the matrix that is returned from tred2.
c On output, the kth column of z contains the normalized eigenvector
c corresponding to the eigenvalue in d(k).


      implicit none

      integer*4 i,k,l ! Loop indices
      integer*4 iter ! Iteration counter
      integer*4 n ! Logical size of array z
      integer*4 m ! Submatrix size
      integer*4 np ! Physical size of array z
      integer*4 iterflag ! Flag to return error code to main routine
      real*4 d(np) ! Diagonal elements in, eigenvalues out
      real*4 e(np) ! Off diagonal elements in, nothing out
      real*4 z(np,np) ! Matrix from tred2 in, eigenvectors out
      real*4 dd ! Holds small subdiagonal element
      real*4 g ! Holds Givens rotation
      real*4 s ! "Sin" component of Givens rotation
      real*4 c ! "Cos" component of Givens rotation
      real*4 p ! Element of projection matrix
      real*4 f ! Holds result of "sin" applied to element of e
      real*4 b ! Holds result of "cos" applied to element of e
      real*4 r ! Temporary piece of c or s


      if(n.gt.1) then
        do 11, i=2,n ! Renumber elements of e
          e(i-1)=e(i)
11      continue
        e(n)=0.
        do 15,l=1,n
          iter=0
1         do 12,m=l,n-1 ! Search for small subdiagonal element
            dd=abs(d(m))+abs(d(m+1))
            if (abs(e(m))+dd.eq.dd) go to 2
12        continue
          m=n
2         if(m.ne.l) then
            if(iter.eq.30) then
              iterflag = -1
              return
            endif

            iter=iter+1
            g=(d(l+1)-d(l))/(2.*e(l)) ! Calculate shift
            r=sqrt(g**2+1.)
            g=d(m)-d(l)+e(l)/(g+sign(r,g))
            s=1.
            c=1.
            p=0.
            do 14, i=m-1,l,-1 ! Plane rotation followed by Givens
              f=s*e(i) ! rotations to maintain tridiagonal
              b=c*e(i) ! form
              if(abs(f).ge.abs(g))then
                c=g/f
                r=sqrt(c**2+1.)
                e(i+1)=f*r
                s=1./r
                c=c*s
              else
                s=f/g
                r=sqrt(s**2+1.)
                e(i+1)=g*r
                c=1./r
                s=s*c
              endif
              g=d(i+1)-p
              r=(d(i)-g)*s+2.*c*b
              p=s*r
              d(i+1)=g+p
              g=c*r-b

c Start of code specific to forming eigenvectors.

              do 13, k=1,n
                f=z(k,i+1)
                z(k,i+1)=s*z(k,i)+c*f
                z(k,i)=c*z(k,i)-s*f
13            continue

c End of code specific to forming eigenvectors.

14          continue
            d(l)=d(l)-p
            e(l)=g
            e(m)=0.
            go to 1
          endif
15      continue
      endif


      return
      end




Example 1:

(a)

 SUBROUTINE STRINGER(INSTRING)
 CHARACTER*40 INSTRING
 INSTRING = 'This is from Fortran'
 END

(b)



Sub Form_Click ()
 Dim temp As String * 40
 Call STRINGER(temp)
 Debug.Print temp
End Sub

(c)

Declare Sub STRINGER Lib "d:\vb\test\string.dll" (ByVal Mystring As String)





Example 2:

(a)
 SUBROUTINE ARRAYTEST(ARR)
 INTEGER*4 ARR(20)
 ARR = 5
 END

(b)

Sub Form_Click ()
 Static testarray(1 To 20) As Long
 Call ARRAYTEST(testarray(1))
 For i% = 1 To 20
 Debug.Print testarray(i%)
 Next i%
End Sub



(c)
Declare Sub ARRAYTEST Lib "d:\vb\test\array.dll" (Myarray As Long)




Example 3:

(a)

 SUBROUTINE ARRAYSTRINGS(ARR)
 CHARACTER*24 ARR(5)
 ARR = 'This is a string also'
 END

(b)

Sub Form_Click ()
 Static testarray(1 To 5) As StringArray
 Call ARRAYSTRINGS(testarray(1))
 For i% = 1 To 5
 Debug.Print testarray(i%).strings
 Next i%
End Sub



(c)

Type StringArray
 strings As String * 24
End Type

Declare Sub ARRAYSTRINGS Lib "d:\vb\test\array.dll" (Myarray As StringArray)



October, 1991
PROGRAMMING PARADIGMS


The Real Meaning of the Apple-IBM Deal




Michael Swaine


As I flew into Boston, I looked over my schedule of meetings and events for
Macworld Expo. Near the top of the list, or actually down near the middle of
the list but near the top of the nonparty items, were two meals that I was not
looking forward to.
Wednesday breakfast: Listen to analysts discuss the true meaning of the
Apple-IBM deal. Have to go because I promised Nancy.
Wednesday lunch: Listen to Roger Heinen, Apple vice-president and general
manager of the software architectures group, explain the true meaning of the
Apple-IBM deal. Have to go on the off-chance that he might say something
interesting.
Everybody wants to explain the true meaning of the Apple-IBM deal, just like
everybody wants to invent the next Ben & Jerry's ice cream flavor. (Well,
everybody I know. These are the people who invented the ice cream cure for the
Grateful Dead concert munchies, Cherry Garcia. Ben & Jerry, that is, not my
friends. My friends invented the snow cone drenched with Canadian Club whiskey
called a Deadly Do-right.)
Although either meal would have been improved by a dish (or cone) of Ben &
Jerry's, the food was OK. I did consider, when the moderator called for
questions at the breakfast meeting, asking if there was any more coffee, but a
waitperson arrived just then with a fresh pot, so my curiosity was satisfied.
The nonfood features of these meals were not that satisfying. I came away from
each table hungry, neither meal having supplied the recommended daily
allowance of true meaning.
Here, then, is the true meaning of the Apple-IBM deal. Are. Meanings. Plural.


The Toaster is Hot


But first, what it is.
IBM and Apple have agreed to agree to several things.
One of these is to work together to develop a multimedia platform and license
it to others. This software platform is to include Apple's QuickTime, which
defines a Movie data type and handles sound and video synchronization, and
which is generally regarded as one of the hottest developments to date in the
multimedia area.
This is significant, because the multimedia market looks like the computer and
software industry's best opportunity to move beyond office equipment and into
a real market: entertainment. Maybe "real" isn't the best word; "big" might be
more apt. "Huge" would be apter. Replacing the typewriter is a fine thing, but
replacing the TV and the VCR would be totally awesome.
Another of the hottest multimedia developments to date is the engagingly named
Video Toaster, a nifty device based on a Commodore Amiga and allowing the
savvy user to develop professional-looking videos with wild computer effects
and titles and such for something closer to $10,000 than to $100,000. Todd
Rundgren's video "Change Myself," which was available at the show,
demonstrates dramatically what an artist can do with the product. The latest
wrinkle is that the Toaster can now be controlled from a Macintosh computer.
The Toaster certainly has its limitations, but it is fair to say that this
means one can now have a video production studio as a Macintosh peripheral for
under $20,000.
What the Toaster and QuickTime news means is that Apple has a large lead over
Microsoft's Multimedia PC standard in this area, and that IBM is sharing in
and further legitimizing the Mac approach to multimedia. As analysts Tony Bove
and Cheryl Rhodes have pointed out, the multimedia part of the Apple-IBM deal
can pay off quickly, and can hurt Microsoft in this potentially huge market.


Purple Prose


But it is the other things that Apple and IBM agreed to agree to that are
getting most of the attention, because they threaten to redefine the computer
industry as we know it, rather than just to create a new, larger industry.
These things have to do with architectures and operating systems.
Apple will build future machines using IBM's RISC architecture. Apparently
Apple's own RISC development will be dumped or backburnered once Apple fully
commits to the IBM chip. Given Apple's history of keeping its options open, it
probably won't get backburnered. Through his brother Peter, who was at the
show, I heard that Apple veteran Allen Baum, who has invested some time in
Apple's RISC technology, finds the IBM approach technologically unimpressive.
IBM will incorporate the Mac Toolbox into its Unix version, AIX, so that it
can run Mac applications. That doesn't make a lot of sense, I know, and there
seems to be some disagreement between Apple and IBM about just what AIX will
get. IBM has said that AIX will have the Mac interface, and Apple has said no.
I think there may be some details to be ironed out here.
IBM will integrate Macintosh into the IBM client/server architecture. Apple
will get legitimized in corporate America, is the general view of the
significance of this.
And finally, the two companies will get together to develop a new operating
system. This is specifically what everyone wants to talk about, and what my
Macworld Expo meals featured as the main course.
The new operating system is to be based on Apple's Pink project, is to be
object-based rather than file-based, is to have a graphical user interface, is
to be portable, and is to run across networks. There are a couple of levels
here: Underlying everything is to be an object layer, which supports the Pink
operating system, API, and GUI, as well as supporting existing operating
systems, meaning some kind of Unix, OS/2, and the Mac OS. Everyone seems to
want to call it all Pink, although Bove and Rhodes sort of suggest that when
IBM has finished making its contribution the result will be Purple. At the
breakfast meeting, Macromind founder Marc Canter expressed his view of what
the software will look like when IBM gets done in language of the sort that
any day now the Supreme Court will probably decide is not Constitutionally
protected. Marc Canter has a lot of opinions. He would probably like to rename
Ben & Jerry.
The two companies will form a jointly owned company to produce this new
operating system, transferring their respective object-based operating system
groups (about 100 employees each) to the company. Basically, Apple's
contribution is Pink, and IBM's is Metaphor.
Metaphor is David Liddle's company, which last year formed Patriot Partners
with IBM to produce platform portable object-based system software. Liddle's
vision was something called Constellation, which was to be a tool-rich
environment oriented toward user tasks rather than being defined by discrete
application programs. IBM has since bought Metaphor and is delivering it and
Liddle to the new venture; Patriot Partners and Constellation have ceased to
exist as separate entities, and Constellation is supposed to become part of
the new thing.
Just how Pink and Constellation get integrated is unclear, but IBM's idea was
that Liddle would be in charge. It looks as of this writing like Liddle will
be the president of the new company, but Apple is wrangling to maintain some
control over the venture, pushing for Ed Berst to be the CEO. Apple and IBM
are to own 50 percent each of the company.


Your Brain on Drugs


That's the deal. Not surprisingly, it is unpopular among two groups of
computer users: those who have long admired Apple and those who have long
distrusted IBM. During the show, you could buy a t-shirt on the street in
Boston that expressed their common view: On the front, the Apple logo and the
caption: This is your brain.
On the back, the Apple logo done in IBM-blue stripes and the caption: This is
your brain on drugs.
Opinions vary on when we will see the new system software.
At the breakfast meeting, pundit Jeffrey Tarter gave his prediction: five
years from now at the earliest, if at all. This sounds right, but one hears
claims that something could appear in the next year. T/Maker founder Heidi
Roizen cleared that up by pointing out that there was confusion in the press
about when a product might ship and when developers might get a first look.
Roger Heinen predicted that the Mac operating system would be running on
RS/6000 RISC machines in 18 to 24 months, presumably for developers to see.
There are predictions that AIX will be running Mac software by 1993 and that
new RISC-based Macs will be on sale in 1993. And Heinen has predicted that the
Pink market could be where the Mac market is today by 1995. This is surely
hyperbole unless Pink is almost ready to go out the door now, and in any case
would seem to be inconsistent with his prediction that the Mac market has
another good decade.
How much pain will be involved? Well, Marc Canter says that Pink FUD (Fear,
Uncertainty, and Doubt) is holding up software development, but Heinen
insisted over lunch that Apple was committed to binary compatibility for
existing Mac software on new machines and under Pink. Unix support is more
confusing. Apparently Apple will continue to support its Unix, A/UX, on its
hardware, provide Mac Toolbox support for IBM's Unix, AIX, and evolve A/UX
toward AIX. In the long term, though, AIX is likely to be supplanted by this
thing called Pink.
Who wins and who loses? This is one of those questions that journalists like
to pose, to the frustration of those unimaginative enough to be looking for
answers. (The reason for this need not concern us here. Oh, all right; it's
because most journalists are philosophy majors, because journalism school
graduates can't spell.) I certainly don't know the answer, but here are some
cautions regarding some speculative answers being circulated.
Microsoft obviously loses. Well, I can promise that those of us in the press
who have been dumping on Microsoft for doing a better job of taking over the
world than of delivering working software are going to lighten up. And
although Pink will undoubtedly become a product someday, so will Microsoft's
next offering, which will probably fall very close to Pink in the features
spectrum. I see no reason why Microsoft can't do to Pink what it did, in its
extremely successful Windows 3, to the Mac OS.
Apple obviously wins. Unless, of course, IBM decides it doesn't like Apple any
more. While he's gloating over what Apple is going to get out of this deal,
John Sculley should ask himself why IBM is cancelling the Windows versions of
some of its application software when everybody else is rushing to do Windows
development. So should IBM stockholders, for that matter, since it doesn't
look like a move that has anything to do with IBM's profits, but rather like
an attack, albeit feeble, on Microsoft's profits. There are lots of reasons
why the deal could go sour: In InfoWorld, Brian Livingston speculated that
the deal will founder over support for Intel-based hardware, where IBM will
want its own machines to have an edge.



Meanwhile, Back at the Show


Over seafood out by the bay the last night of the show, two Fortran developers
confessed that they were asked not long ago to do a Cobol implementation. One
of their reasons for refusing, they explained, was that "nobody programs in
Cobol." It turns out that they weren't just commenting on the obvious decline
in popularity and fashionability of Cobol; they meant that even those who use
Cobol do their development using other tools.
Cobol defenders often cite the 80 Godzillion lines of Cobol code now in use,
and while it is easy to take potshots at these references--like saying that 80
Godzillion lines of Cobol translates into 3,000 lines of C--there is a huge
body of Cobol code in use, and this argues for Cobol's being around for some
time to come.
But if nobody actually programs in a language, doesn't that make it officially
dead as a programming language? Even if we can't pull the plug on Cobol, can't
we agree on the diagnosis of brain death? And in deference to the 80
Godzillion lines of code, start referring to Cobol as a code repository? It's
just a thought.
Dave Winer was at the show demonstrating the latest beta of Userland Frontier,
his system-level scripting tool for the Mac. To make the product less
intimidating to users who might be intimidated by the idea of programming,
Winer's people have given Frontier the ability to spit out scripts as small
double-clickable files. Formerly, the scripts lived in Frontier's object
database. Double-clickable Frontier scripts are not stand-alone applications,
but Frontier documents, requiring Frontier to run. Their advantage lies in
their reassuring concreteness, and in the possibility of sharing, trading, and
downloading them.
The downloading angle could have something to say about the success or failure
of Frontier. If lots of Frontier scripts start appearing on electronic
services free for the downloading, it could trigger a lot of purchases of
Frontier. On the other hand, if those scripts are bad, they will reflect badly
on Frontier itself, as happened with the glut of bad stacks when HyperCard
first came out.
I've heard this analogy to HyperCard stacks, but it seems to me that it isn't
very good. HyperCard stacks are usually judged to be bad because of design
issues: They are visually awful, they are poorly organized, it's hard to tell
what they do or where you are in them. Frontier scripts will have none of
these problems. Not being visual and not having a structure saves them from
problems of visual design and structure.
Of course, they may have other problems, like crashing your system or trashing
your files. Frontier's language is powerful and reaches deeply into the system
or an application. Frontier scripts can do damage. Scripts from amateur
programmers should be examined or tested before being accepted by sysops and
before being used by downloaders.
And the door is now open for amateurs to do the kind of deliberate damage that
virus creators do. That's the price of power.
As I look over the press coverage after the show, it appears that the favorite
flavor of speculation is that IBM will eventually buy Apple. I'll make a
specific prediction about that: It just won't happen. Feel free to ridicule me
when it does.



October, 1991
C PROGRAMMING


Building a D-Flat Application




Al Stevens


This month we continue the D-Flat project by publishing the last of the
supporting source code modules. Two files provide functions that D-Flat uses
to manage lists and screen rectangles, and two other files describe the
messages and commands that D-Flat and its applications use. For the first
time, we will see an example D-Flat application program. Next month we start
looking at the modules that describe and manage the window classes.


Linked Lists in D-Flat


Listing One, page 166, is lists.c, the source file that maintains two linked
lists of windows. The first linked list, Built, records the D-Flat
application's windows in the sequence in which they are built. The second
linked list, Focus, records the windows in the order in which they take the
user's input focus.
Linked lists are managed by a structure that contains two pointers to window
structures: a FirstWindow pointer and a LastWindow pointer. Every window
structure has nextfocus, prevfocus, nextbuilt, and prevbuilt pointers. The
LinkedList structure is the list head, and the window structure contains the
forward and reverse linked list pointers. These structures are defined in
dflat.h.
There are two linked lists because some processes need to traverse the windows
in the sequence in which they have been in focus, and others need to use the
sequence in which the windows were created.
The functions named SetNextFocus and SetPrevFocus set the focus to the next or
previous window in the Focus linked list. The wnd parameter is the window from
which the functions start looking. These functions are called when windows
close or when the user steps through the windows with the Alt+F6 key or the
Tab key on a dialog box. The SetNextFocus function attempts to find a window
that has the same parent as the one from which the focus is being taken. It
uses the SearchFocusNext function to do that.
Four functions add and remove windows from the two linked lists. They are
AppendBuiltWindow, RemoveBuiltWindow, AppendFocusWindow, and
RemoveFocusWindow.
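The list head and the per-window links can be pictured with a cut-down sketch. Everything in it is an illustration under assumptions: the struct window layout, the id field, and the AppendBuilt and RemoveBuilt names are invented for the example and use plain pointer fields, whereas the real code in lists.c goes through the macros defined in dflat.h. Only the Built list is shown; the Focus list would work the same way with the nextfocus and prevfocus links.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down window record: only the link fields matter here. */
typedef struct window {
    int id;
    struct window *nextfocus, *prevfocus;   /* Focus list links */
    struct window *nextbuilt, *prevbuilt;   /* Built list links */
} *WINDOW;

#define NULLWND ((WINDOW) NULL)

/* The list head holds only the first and last windows. */
struct LinkedList {
    WINDOW FirstWindow;
    WINDOW LastWindow;
};

struct LinkedList Built = { NULLWND, NULLWND };

/* Append a window to the tail of the Built list. */
void AppendBuilt(WINDOW wnd)
{
    assert(wnd != NULLWND);
    wnd->prevbuilt = Built.LastWindow;
    wnd->nextbuilt = NULLWND;
    if (Built.LastWindow != NULLWND)
        Built.LastWindow->nextbuilt = wnd;
    else
        Built.FirstWindow = wnd;        /* list was empty */
    Built.LastWindow = wnd;
}

/* Unlink a window, patching its neighbors and the list head. */
void RemoveBuilt(WINDOW wnd)
{
    assert(wnd != NULLWND);
    if (wnd->prevbuilt != NULLWND)
        wnd->prevbuilt->nextbuilt = wnd->nextbuilt;
    else
        Built.FirstWindow = wnd->nextbuilt;
    if (wnd->nextbuilt != NULLWND)
        wnd->nextbuilt->prevbuilt = wnd->prevbuilt;
    else
        Built.LastWindow = wnd->prevbuilt;
    wnd->nextbuilt = wnd->prevbuilt = NULLWND;
}
```

Because each window carries its own forward and reverse links, removal needs no search: the window's neighbors are patched directly, and the head is touched only when the window was first or last.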
There are four functions for traversing the lists in a first-to-last sequence.
A program calls GetFirstChild or GetFirstFocusChild to get the first child
window for a parent window from one of the two lists. Then it calls
GetNextChild or GetNextFocusChild successively until one of these functions
returns NULL. Example 1 shows the code sequence for traversing a list. In this
example, pwnd is the parent window, and cwnd is each of the child windows in
turn.
Example 1: The code sequence for traversing a list

 WINDOW cwnd = GetFirstChild(pwnd);
 while (cwnd != NULLWND) {
     /* -- process the child window -- */
     cwnd = GetNextChild(pwnd, cwnd);
 }

There are two functions for traversing the Built linked list in reverse order.
They are GetLastChild and GetPrevChild. They work the same as their forward
counterparts. No functions exist to traverse the Focus list in reverse because
there is no need for them.
The SkipSystemWindows function is called to bypass granting the focus to
APPLICATION, MENUBAR, and STATUSBAR windows when the user is using Alt+F6 or
another method to move among the document windows.


Rectangles


To properly manage screen displays, D-Flat needs some rectangle management
functions. Listing Two, page 167, is rect.c, the source file that contains
these rectangle functions. The first function is subVector, which is used only
by the other functions in rect.c. The function accepts two vectors, described
by their integer end points as t1, t2 and o1, o2. It computes the overlap of
the two vectors and stores the result as two end points in the integers
pointed to by the v1 and v2 integer pointer parameters. The function uses the
within macro defined in rect.h, which I published several months ago. That
header file also defines the RECT typedef, which is a structure that contains
the left, top, right, and bottom rectangle coordinates.
The subRectangle function accepts two RECTs and computes the rectangle
resulting from the overlap of the two, returning the computed rectangle in a
RECT structure.
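The overlap computation can be sketched compactly. Note that this is not the code in rect.c: Listing Two works through the cases with the within macro, while the sketch below swaps in the equivalent max/min formulation. The lf, tp, rt, and bt field names and the overlap and intersect function names are assumptions for the example, coordinates are treated as inclusive, and an empty result is returned as all zeros, as subRectangle does.

```c
#include <assert.h>

typedef struct { int lf, tp, rt, bt; } RECT;  /* left, top, right, bottom */

/* Overlap of the closed intervals [t1,t2] and [o1,o2].
   Returns 1 and stores the overlap in *v1,*v2, or returns 0 if disjoint. */
int overlap(int t1, int t2, int o1, int o2, int *v1, int *v2)
{
    int lo = t1 > o1 ? t1 : o1;     /* larger of the two starts */
    int hi = t2 < o2 ? t2 : o2;     /* smaller of the two ends  */

    assert(t1 <= t2 && o1 <= o2);
    if (lo > hi)
        return 0;
    *v1 = lo;
    *v2 = hi;
    return 1;
}

/* A rectangle overlap is an overlap on both axes at once. */
RECT intersect(RECT r1, RECT r2)
{
    RECT r = {0, 0, 0, 0};
    int lf, rt, tp, bt;

    if (overlap(r1.lf, r1.rt, r2.lf, r2.rt, &lf, &rt) &&
        overlap(r1.tp, r1.bt, r2.tp, r2.bt, &tp, &bt)) {
        r.lf = lf;
        r.rt = rt;
        r.tp = tp;
        r.bt = bt;
    }
    return r;
}
```

Treating the horizontal and vertical extents as two independent one-dimensional problems is the same decomposition that subVector and subRectangle use.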
The ClientRect function computes the client rectangle of a specified window by
using the four macros, GetClientLeft, GetClientRight, GetClientTop, and
GetClientBottom. These macros are defined in dflat.h. They compute the client
rectangle from the window's rectangle by adjusting for the window's border,
titlebar, and menubar, if they exist.
The RelativeWindowRect function converts a rectangle from screen coordinates
to coordinates relative to the upper-left corner (0,0) of the specified window.
The ClipRectangle function clips a rectangle that is inside a window so that
the rectangle does not extend beyond the screen borders or the borders of any
ancestor windows.


D-Flat Messages


Several months ago I published the messages.h file that defined the D-Flat
messages as members of an enum definition. That technique has changed. Listing
Three, page 167, is dflatmsg.h, a new header file that contains the messages
expressed as parameters to the DFlatMsg macro. Parts of the program that need
lists of the messages in different formats define the macro to their own needs
and include the dflatmsg.h file. For example, the enumerated list of messages
in dflat.h now works like Example 2. The DFlatMsg macro gets redefined to use
the message values as members of the enumerated list. The MESSAGECOUNT entry
represents the last entry in the list and is, therefore, the number of
messages in the system.
Example 2: The enumerated list of messages in dflat.h works like this.

 typedef enum messages {
 #undef DFlatMsg
 #define DFlatMsg(m) m,
 #include "dflatmsg.h"
 MESSAGECOUNT
 } MESSAGE;


The message logging process needs strings of the messages' names so that it
can display messages in the log window. It uses dflatmsg.h, as in Example 3.
The DFlatMsg macro gets redefined to use the message values as strings with
one leading space. This usage allows the messages to appear in a
multiple-selection list box.
Example 3: The message logging process uses strings of the messages' names to
display messages in the log window.

 static char *message[] = {
 #undef DFlatMsg
 #define DFlatMsg(m) " " #m,
 #include "dflatmsg.h"
 NULL
 };
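The two expansions can be tried in a single file. The sketch below is a variant of the technique rather than the D-Flat code itself: because one source file cannot include a separate dflatmsg.h, a hypothetical MESSAGE_LIST macro stands in for the header, and only three of the messages are listed. The MessageName helper is also an invention for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for dflatmsg.h: one DFlatMsg() entry per message. */
#define MESSAGE_LIST  \
    DFlatMsg(START)   \
    DFlatMsg(STOP)    \
    DFlatMsg(COMMAND)

/* First expansion: each entry becomes an enumerator. */
#define DFlatMsg(m) m,
typedef enum messages {
    MESSAGE_LIST
    MESSAGECOUNT            /* last entry, so also the number of messages */
} MESSAGE;
#undef DFlatMsg

/* Second expansion: each entry becomes its name with one leading space. */
#define DFlatMsg(m) " " #m,
static const char *message[] = {
    MESSAGE_LIST
    NULL
};
#undef DFlatMsg

/* Return the name of a message without the leading space. */
const char *MessageName(MESSAGE m)
{
    assert(m < MESSAGECOUNT);
    return message[m] + 1;
}
```

Because both tables come from the same list, adding a message in one place keeps the enumerators and the name strings in step.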



D-Flat Commands


One of the D-Flat standard messages is the COMMAND message. A program sends
that message to a window with this format:
SendMessage(wnd, COMMAND, cmdcode, 0);
The command code is a unique value that the window must recognize. The menu
system uses command codes to tell the window which menu command was executed.
The dialog box system identifies control windows with command codes. Listing
Four, page 167, is commands.h, the file that specifies the commands in an
enumerated list. As published, the list has all the commands used by D-Flat
and the memopad application. You will add entries to this list to add custom
commands for your application.


An Example Program: MEMOPAD


Listing Five, page 169, is memopad.c, an example of an application that uses
D-Flat. It is a multiple-document notepad program that uses all the standard
D-Flat menus and dialog boxes and lets you view and modify many different text
files at once. Its architecture is typical of a D-Flat application program.
The main function in memopad.c begins by calling the init_messages function to
initialize message processing. Then it stores its argv parameter in an
external variable so that other parts of the program -- specifically the
module that loads the help database -- can determine the subdirectory from
which the program is run.
The memopad program now creates its application window by calling the
CreateWindow function. The APPLICATION argument tells D-Flat the window class.
The next argument is the window's title. Following that are the window's
upper-left coordinates and height and width. By using 0,0 as the coordinates
and -1, -1 as the dimensions, the program tells D-Flat to use the entire
screen for the window. The MainMenu argument is the address of the MENU
structure for the application window's menu bar. The next argument is NULL.
That argument usually specifies the WINDOW handle of the window's parent, but
an APPLICATION window has no parent. The MemoPadProc argument is the address
of the window processing module that will receive messages sent to the window.
An APPLICATION window has a default window processing module, but each
application must provide an additional one to handle the commands that are
unique to the application. The last argument to the CreateWindow function is
the window's attribute word, which consists of attribute values ORed together.
The memopad application window must be movable and sizable, and have a border
and a status bar.
After the window is created, the program sends it a message telling it to take
the user's input focus. Then the program scans the command line to see if the
user named any files. If so, each file specification is sent in turn to the
PadWindow function.
The next step is for the program to enter the message dispatching loop. It
does this by successively calling the dispatch_message function until the
function returns a false value, at which time the program is finished and may
terminate.
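The loop described above has a simple shape, which a toy queue can illustrate. The queue, the MSG_ codes, and the post_message, dispatch_one, and run names are all hypothetical stand-ins; D-Flat's own init_messages and dispatch_message are not reproduced here, and the idea that a STOP message is what makes the dispatcher return false is an assumption of this sketch.

```c
#include <assert.h>

enum { MSG_COMMAND, MSG_STOP };   /* hypothetical message codes */
#define QSIZE 16

static int queue[QSIZE];          /* circular buffer of pending messages */
static int head, count;
static int handled;               /* how many commands were dispatched */

/* Queue a message; returns 0 if the queue is full. */
int post_message(int msg)
{
    if (count == QSIZE)
        return 0;
    queue[(head + count) % QSIZE] = msg;
    count++;
    return 1;
}

/* Dispatch one pending message.
   Returns 0 (false) only when the STOP message is taken off the queue. */
int dispatch_one(void)
{
    int msg;

    assert(count >= 0 && count <= QSIZE);
    if (count == 0)
        return 1;   /* idle; a real dispatcher would poll the devices here */
    msg = queue[head];
    head = (head + 1) % QSIZE;
    count--;
    if (msg == MSG_STOP)
        return 0;
    handled++;      /* stand-in for sending the message to a window */
    return 1;
}

/* The shape of main(): keep dispatching until told to stop. */
int run(void)
{
    while (dispatch_one())
        ;
    return handled;
}
```

The caller never decides what happens next; it only keeps the dispatcher turning until the dispatcher reports that the program is finished.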
The PadWindow function accepts a file specification. The function expands the
specification into filenames one at a time and sends the filenames to the
OpenPadWindow function.
The MemoPadProc function is the window processing module for the memopad
application window. When the system or a window sends a message to the
application window, D-Flat will execute this function, passing it the message.
It does so because the function's address was included in the CreateWindow
call that created the window.
A window processing module can process any message. Usually, however, the
window processing module for the application will only process COMMAND
messages that the menu system sends when the user chooses a menu command and
when the current document window -- if one exists -- does not intercept the
message. MemoPadProc processes those COMMAND messages unique to the memopad
application and not processed by the document windows. The ID_NEW command is
to create a new file in a new document window. The ID_OPEN command is to
select an existing file to load into a new document window. The ID_SAVE and
ID_SAVEAS commands are to save the document window that has the focus back to
disk. ID_DELETEFILE is to delete the file represented by the window that has
the focus. ID_PRINT prints the current document, and ID_ABOUT displays a
message about the application.
When MemoPadProc processes a message, it can simply return a TRUE value. This
action prevents any further processing of the message by the default window
processing module for the APPLICATION class. If MemoPadProc does not want to
intercept and process a message, it calls the DefaultWndProc function, which
passes the message to the default window processing module for the window's
class, in this case the APPLICATION class.
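The intercept-or-pass-on rule can be shown with a stripped-down pair of functions. This is a sketch under assumptions, not the memopad source: the WINDOW and PARAM typedefs, the two-message enum, the command codes, and the AppProc and DefaultProc names are stand-ins, and the counters exist only so the behavior can be observed.

```c
#include <assert.h>
#include <stddef.h>

#define TRUE 1

typedef void *WINDOW;                 /* opaque handle for this sketch */
typedef long PARAM;
enum { COMMAND, PAINT };              /* hypothetical message subset */
enum { ID_NEW, ID_ABOUT };            /* hypothetical command codes */

int default_calls;   /* times the class default module ran */
int new_calls;       /* times ID_NEW was handled by the application */

/* Stand-in for the default window processing module of the class. */
int DefaultProc(WINDOW wnd, int msg, PARAM p1, PARAM p2)
{
    (void) wnd; (void) msg; (void) p1; (void) p2;
    default_calls++;
    return TRUE;
}

/* Application module: handle what is unique, pass on everything else. */
int AppProc(WINDOW wnd, int msg, PARAM p1, PARAM p2)
{
    assert(wnd != NULL);
    if (msg == COMMAND) {
        switch ((int) p1) {
        case ID_NEW:
            new_calls++;
            return TRUE;    /* intercepted: the default never sees it */
        default:
            break;          /* not ours: fall through to the default */
        }
    }
    return DefaultProc(wnd, msg, p1, p2);
}
```

A command the application recognizes stops at AppProc; every other message, including commands it does not recognize, reaches the class default unchanged.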
The NewFile and SelectFile functions both call OpenPadWindow to open a new
document window. NewFile passes "Untitled" as the window's title. SelectFile
executes the File Open dialog box by calling the OpenFileDialogBox function.
If that function returns a true value, the user has selected a file
specification which the function stored in the FileName character array. The
SelectFile function searches the windows that are the children of the memopad
application window to see if the selected file is already loaded into a
document window. If so, SelectFile gives the focus to that window and returns.
Otherwise, SelectFile calls OpenPadWindow, passing it the file specification
of the selected file.
The OpenPadWindow function opens a document window and, if the specified file
is "Untitled," initializes the window with a blank document. Otherwise it
loads the text file into the window. First, the OpenPadWindow function uses
the stat function to see if the file exists and, if it does, to determine its
size. If the file does not exist, OpenPadWindow calls the ErrorMessage
function to display an error message to that effect. If the file does exist,
or if an untitled new document is being built, OpenPadWindow calls
CreateWindow to create the document window. The title of the window comes from
the filename component of the file specification. The window position is
computed so that successive windows are cascaded. The application window is
the parent of the document window, and EditorProc is the document window's
window processing module. After creating the document window, OpenPadWindow
calls LoadFile to load the specified file into the window and then sets the
user input focus to the document window.
Take careful notice of what has just happened. The program creates an
application window and then goes into the dispatch_message loop waiting for
the loop to return false, whereupon the program terminates. That simple logic
is the foundation of event-driven, message-based programming. Everything else
that happens in a D-Flat program is the result of messages sent to the
application window. Other than for the time-of-day display in the status bar,
the program will do nothing until the user takes an action that generates a
message. When the MemoPadProc function receives one of the COMMAND messages
that it recognizes, things begin to happen. Until one of those commands comes
along, the user can move and resize the application window, change options,
and read the help screens, but nothing of significance will occur within the
application itself until MemoPadProc gets a message.
The LoadFile function loads a specified text file into a document window.
First the function allocates a buffer that can contain the text of the file.
Then the function reads the file into the buffer and sends the SETTEXT message
to the window to tell it to use the buffer for its text display.
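The size-then-allocate-then-read sequence looks roughly like the sketch below. ReadTextFile is a hypothetical name, the error handling is an assumption, and the step of handing the buffer to the window with SETTEXT is left out because it needs the D-Flat library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Read a whole text file into a freshly allocated, NUL-terminated buffer.
   Returns NULL if the file is missing, unreadable, or memory runs out.
   The caller owns the buffer and must free() it. */
char *ReadTextFile(const char *path)
{
    struct stat sb;
    FILE *fp;
    char *buf;
    size_t got;

    assert(path != NULL);
    if (stat(path, &sb) != 0)
        return NULL;                        /* file does not exist */
    if ((fp = fopen(path, "r")) == NULL)
        return NULL;
    buf = malloc((size_t) sb.st_size + 1);  /* room for the terminator */
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }
    got = fread(buf, 1, (size_t) sb.st_size, fp);
    buf[got] = '\0';                        /* terminate at what was read */
    fclose(fp);
    return buf;
}
```

The terminator goes at the count that fread returns rather than at the size that stat reported, because a text-mode read on DOS can deliver fewer characters than the file holds on disk once line endings are translated.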
The PrintPad function prints the contents of the current document window by
sending it to the standard print device. If you were going to use this memopad
program for serious text editing, you would want to put tests for printer
ready into this function and then intercept critical errors in case the
printer runs out of paper or otherwise fails while you are printing. Without
these measures, DOS will splash its rude error messages across your orderly
D-Flat screens.
The SaveFile function saves the text in the current document window to a disk
file. If the document window is untitled, or if the user selected the Save As
command on the File menu, the SaveFile function executes the Save As dialog
box by calling the SaveAsDialogBox function. If that function returns false,
the user has neglected to specify a filename for the document, and the
SaveFile function returns without doing anything. Otherwise, the SaveFile
function assigns the filename to the document window and uses the filename
component of the new filename as the new title of the document window. Next,
the SaveFile function writes the text to the disk file. Before opening the
file, the SaveFile function calls the MomentaryMessage function to display a
small message in a window on the screen. This tells the user that something is
going on and to stand by. After closing the file, the SaveFile function sends
the CLOSE_WINDOW message to the momentary message window.
The DeleteFile function deletes the file that is in the current document
window and closes that window. First, it calls the YesNoBox function to allow
the user to confirm the delete.
The ShowPosition function is an example of how an application can display
one-liners in the application window's status bar. It builds a text display
that contains the line and column numbers of where the keyboard cursor is in
the current document window. Then it sends the ADDSTATUS message to the
application window with the address of the text display as the first parameter
in the message.
The EditorProc function is the window processing module for the document
windows. Every message sent to one of these windows comes to this function
first. Observe its treatment of the SETFOCUS and KEYBOARD_CURSOR messages.
When a window is to take the focus, it receives the SETFOCUS message with the
first parameter set to true. When a window is to give up the focus, it
receives the SETFOCUS message with the first parameter set to false. The
EditorProc function uses this message and the KEYBOARD_CURSOR message to
manage the continuing line/column display in the status bar. Whenever the
keyboard cursor is to change, the current window receives the KEYBOARD_CURSOR
message with the column coordinate as the first parameter and the row
coordinate as the second parameter. These messages cause the EditorProc
function to call the ShowPosition function to update the status bar display to
the new line and column. When the document window gives up the focus, the
EditorProc function sends the ADDSTATUS message to the application window with
zero parameters to tell the application window to clear the text display in
the status bar. The SETFOCUS and KEYBOARD_CURSOR cases both call the
DefaultWndProc function before they do their processing. This is the procedure
that a window processing module uses to cause the default message processing
to occur first. The module must then return the value returned from the
DefaultWndProc function when the module is done with its own processing.
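The default-first pattern can be sketched as follows. The names, the status buffer that stands in for the status bar, the one-based display, and the format string are all assumptions made for the example; the real EditorProc sends ADDSTATUS to the application window instead of writing to a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TRUE 1

typedef void *WINDOW;                       /* opaque handle for this sketch */
typedef long PARAM;
enum { SETFOCUS, KEYBOARD_CURSOR, PAINT };  /* hypothetical message subset */

char status[64];                 /* stand-in for the status bar text */
static long cur_row, cur_col;    /* last cursor position seen */

/* Stand-in for the EDITBOX class default processing. */
int DefaultProc(WINDOW wnd, int msg, PARAM p1, PARAM p2)
{
    (void) wnd; (void) msg; (void) p1; (void) p2;
    return TRUE;
}

static void ShowPos(void)
{
    sprintf(status, "Line:%4ld  Column:%3ld", cur_row + 1, cur_col + 1);
}

int EditProc(WINDOW wnd, int msg, PARAM p1, PARAM p2)
{
    int rtn;

    assert(wnd != NULL);
    switch (msg) {
    case SETFOCUS:
        rtn = DefaultProc(wnd, msg, p1, p2);   /* default goes first */
        if (p1)
            ShowPos();                         /* taking the focus */
        else
            status[0] = '\0';                  /* giving it up: clear */
        return rtn;                            /* hand back default's value */
    case KEYBOARD_CURSOR:
        rtn = DefaultProc(wnd, msg, p1, p2);
        cur_col = p1;                          /* p1 = column, p2 = row */
        cur_row = p2;
        ShowPos();
        return rtn;
    default:
        break;
    }
    return DefaultProc(wnd, msg, p1, p2);
}
```

The module adds its own work after the default has run and then returns whatever the default returned, so the sender sees the same result it would have seen without the extra processing.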
The EditorProc function intercepts the CLOSE_WINDOW message to see if the
user has made any changes to the text without saving it. If so, the function
calls the YesNoBox function to allow the user to save the text. If the user
chooses to save the text, the EditorProc function sends a COMMAND message with
the ID_SAVE command to the application window. This is an example of how the
document window simulates a user action by sending a COMMAND to the
application as if the user had chosen the command from a menu.
Any messages that EditorProc does not intercept get sent to the default window
processing module for the EDITBOX class of windows when EditorProc calls
DefaultWndProc before EditorProc returns.
Take another moment to consider again what has happened. Before the user chose
a new untitled document window or loaded a file into a document window, the
program was looping and waiting for something to do. The user's menu selection
sent a message to the application window. The application window created a
document window and gave it the focus. Now almost everything the user does
results in a message to the in-focus document window. The document window is
an EDITBOX, so all the messages that support an EDITBOX go to the document
window. The EDITBOX's default window processing module manages the details of
text editing, and the application's edit box processing module processes only
those messages that are unique to the application.
The memopad application program rests on the underlying concepts of
event-driven, message-based processing. These same concepts permeate the D-Flat
library itself. Help, options, menus, dialog boxes, the status bar, and text
editing all use the same mechanisms that an application will use. They
encapsulate the common processes that most applications use into a user
interface library. The applications software can be trivial by comparison. The
memopad program is implemented in a relatively small source code module --
fewer than 400 lines of code -- yet it is a multiple-document text editor
program, something that would be a significant software development effort if
you had to build the user interface and the text editing functions into the
application itself.


How to Get D-Flat Now


The D-Flat source code is on CompuServe in Library 0 of the DDJ forum and on
M&T Online. Its name is DFLATn.ARC, where the n is an integer that represents
a loosely-assigned version number. There is another file, named DFnTXT.ARC,
which contains a description of what has changed and how to build the
software, the Help system database, and the documentation for the programmer's
API. At present, everything compiles and works with Turbo C 2.0, Turbo C++,
and Microsoft C 6.0. There is a makefile for the TC and MSC compilers. There
is an example program, the MEMOPAD program, which is a multiple-document
notepad.
If for some reason you cannot get to either online service, send me a
formatted disk -- any PC format -- and an addressed, stamped diskette mailer.
Send it to me in care of DDJ. I'll send you the latest copy of the library.
The software is free, but if you care to, stick a dollar bill in the mailer.
I'll match the dollar and give it to the Brevard County Food Bank. They take
care of homeless and hungry children.
If you want to discuss D-Flat with me, use CompuServe. My CompuServe ID is
71101,1262, and I monitor the DDJ forum daily.


A Star is Born



Philippe Kahn has a video. You won't see it on MTV, and I'm not sure how you
get it. It's called "The World of Objects," and it's mostly Philippe
explaining what object-oriented programming is, and why it is the answer to a
programmer's prayer. It begins with Philippe playing the soprano sax, Philippe
playing the piano, Philippe playing the bass violin, and Philippe playing the
drums. If you can make it past that, you are treated to a session with
Philippe's band, a ride with Philippe in his 1960 Corvette, and an
introduction to his dog and her puppies. I liked the puppies.
Next comes an entertaining lecture delivered by Philippe using the Gene Autry
school of method acting and some horrible phrases like, "state of
Corvetteness," "art of car driving," and "well-architected." The best part was
a clamshell, pen-based, multimedia, notebook computer with high-resolution
color graphics and no keyboard. Where do I get one of those? I think it was a
fake.
The show includes cameo appearances by Alan Kay, Marvin Minsky, Bjarne
Stroustrup, Joseph Weizenbaum, and Niklaus Wirth, all telling us that OOP is
wonderful, but none telling us what it is. Niklaus didn't seem as committed as
the others. Alan warned us to get on board or be left in the dust.
I had trouble identifying the audience for the video. It attempts to explain
OOP, but to whom? Nonprogrammers have no idea why there are different
orientations to programming because they do not know what programming is in
the first place. Programmers are not going to associate with the
non-programming analogies to cars, music, and animals. Everyone makes that
mistake. OOP is hard to explain, so those in the know search for other-world
metaphors, which never quite make it. Even Minsky used couches and chairs.
Judy and I watched the video together. Judy is a programmer who did not know
about OOP before seeing this show. She still isn't sure what it is, but she
had this comment about the video: "Except for the mother dog, there are no
women in it."

_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

/* --------------- lists.c -------------- */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <conio.h>
#include <dos.h>
#include "dflat.h"

struct LinkedList Focus = {NULLWND, NULLWND};
struct LinkedList Built = {NULLWND, NULLWND};

/* --- set focus to the window beneath the one specified --- */
void SetPrevFocus(WINDOW wnd)
{
 if (wnd != NULLWND && wnd == inFocus) {
 WINDOW wnd1 = wnd;
 while (TRUE) {
 if ((wnd1 = PrevWindow(wnd1)) == NULLWND)
 wnd1 = Focus.LastWindow;
 if (wnd1 == wnd)
 return;
 if (wnd1 != NULLWND)
 break;
 }
 if (wnd1 != NULLWND)
 SendMessage(wnd1, SETFOCUS, TRUE, 0);
 }
}

/* this function assumes that wnd is in the Focus linked list */
static WINDOW SearchFocusNext(WINDOW wnd, WINDOW pwnd)
{
 WINDOW wnd1 = wnd;

 if (wnd != NULLWND) {
 while (TRUE) {
 if ((wnd1 = NextWindow(wnd1)) == NULLWND)
 wnd1 = Focus.FirstWindow;
 if (wnd1 == wnd)
 return NULLWND;
 if (wnd1 != NULLWND)
 if (pwnd == NULLWND || pwnd == GetParent(wnd1))
 break;
 }
 }
 return wnd1;

}

/* ----- set focus to the next sibling ----- */
void SetNextFocus(WINDOW wnd)
{
 WINDOW wnd1;

 if (wnd != inFocus)
 return;
 if ((wnd1 = SearchFocusNext(wnd, GetParent(wnd)))==NULLWND)
 wnd1 = SearchFocusNext(wnd, NULLWND);
 if (wnd1 != NULLWND)
 SendMessage(wnd1, SETFOCUS, TRUE, 0);
}

/* ---- remove a window from the Built linked list ---- */
void RemoveBuiltWindow(WINDOW wnd)
{
 if (wnd != NULLWND) {
 if (PrevWindowBuilt(wnd) != NULLWND)
 NextWindowBuilt(PrevWindowBuilt(wnd)) =
 NextWindowBuilt(wnd);
 if (NextWindowBuilt(wnd) != NULLWND)
 PrevWindowBuilt(NextWindowBuilt(wnd)) =
 PrevWindowBuilt(wnd);
 if (wnd == Built.FirstWindow)
 Built.FirstWindow = NextWindowBuilt(wnd);
 if (wnd == Built.LastWindow)
 Built.LastWindow = PrevWindowBuilt(wnd);
 }
}

/* ---- remove a window from the Focus linked list ---- */
void RemoveFocusWindow(WINDOW wnd)
{
 if (wnd != NULLWND) {
 if (PrevWindow(wnd) != NULLWND)
 NextWindow(PrevWindow(wnd)) = NextWindow(wnd);
 if (NextWindow(wnd) != NULLWND)
 PrevWindow(NextWindow(wnd)) = PrevWindow(wnd);
 if (wnd == Focus.FirstWindow)
 Focus.FirstWindow = NextWindow(wnd);
 if (wnd == Focus.LastWindow)
 Focus.LastWindow = PrevWindow(wnd);
 }
}

/* ---- append a window to the Built linked list ---- */
void AppendBuiltWindow(WINDOW wnd)
{
 if (wnd != NULLWND) {
 if (Built.FirstWindow == NULLWND)
 Built.FirstWindow = wnd;
 if (Built.LastWindow != NULLWND)
 NextWindowBuilt(Built.LastWindow) = wnd;
 PrevWindowBuilt(wnd) = Built.LastWindow;
 NextWindowBuilt(wnd) = NULLWND;
 Built.LastWindow = wnd;
 }

}

/* ---- append a window to the Focus linked list ---- */
void AppendFocusWindow(WINDOW wnd)
{
 if (wnd != NULLWND) {
 if (Focus.FirstWindow == NULLWND)
 Focus.FirstWindow = wnd;
 if (Focus.LastWindow != NULLWND)
 NextWindow(Focus.LastWindow) = wnd;
 PrevWindow(wnd) = Focus.LastWindow;
 NextWindow(wnd) = NULLWND;
 Focus.LastWindow = wnd;
 }
}

/* -------- get the first child of a parent window ------- */
WINDOW GetFirstChild(WINDOW wnd)
{
 WINDOW ThisWindow = Built.FirstWindow;
 while (ThisWindow != NULLWND) {
 if (GetParent(ThisWindow) == wnd)
 break;
 ThisWindow = NextWindowBuilt(ThisWindow);
 }
 return ThisWindow;
}

/* -------- get the next child of a parent window ------- */
WINDOW GetNextChild(WINDOW wnd, WINDOW ThisWindow)
{
 if (ThisWindow != NULLWND) {
 do {
 if ((ThisWindow = NextWindowBuilt(ThisWindow)) !=
 NULLWND)
 if (GetParent(ThisWindow) == wnd)
 break;
 } while (ThisWindow != NULLWND);
 }
 return ThisWindow;
}

/* -- get first child of parent window from the Focus list -- */
WINDOW GetFirstFocusChild(WINDOW wnd)
{
 WINDOW ThisWindow = Focus.FirstWindow;
 while (ThisWindow != NULLWND) {
 if (GetParent(ThisWindow) == wnd)
 break;
 ThisWindow = NextWindow(ThisWindow);
 }
 return ThisWindow;
}

/* -- get next child of parent window from the Focus list -- */
WINDOW GetNextFocusChild(WINDOW wnd, WINDOW ThisWindow)
{
 while (ThisWindow != NULLWND) {
 ThisWindow = NextWindow(ThisWindow);

 if (ThisWindow != NULLWND)
 if (GetParent(ThisWindow) == wnd)
 break;
 }
 return ThisWindow;
}

/* -------- get the last child of a parent window ------- */
WINDOW GetLastChild(WINDOW wnd)
{
 WINDOW ThisWindow = Built.LastWindow;
 while (ThisWindow != NULLWND) {
 if (GetParent(ThisWindow) == wnd)
 break;
 ThisWindow = PrevWindowBuilt(ThisWindow);
 }
 return ThisWindow;
}

/* ------- get the previous child of a parent window ------- */
WINDOW GetPrevChild(WINDOW wnd, WINDOW ThisWindow)
{
 if (ThisWindow != NULLWND) {
 do {
 if ((ThisWindow = PrevWindowBuilt(ThisWindow)) !=
 NULLWND)
 if (GetParent(ThisWindow) == wnd)
 break;
 } while (ThisWindow != NULLWND);
 }
 return ThisWindow;
}

/* --- bypass system windows when stepping through focus --- */
void SkipSystemWindows(int Prev)
{
 int cl, ct = 0;
 while ((cl = GetClass(inFocus)) == MENUBAR ||
 cl == APPLICATION
#ifdef INCLUDE_STATUSBAR
 || cl == STATUSBAR
#endif
 ) {
 if (Prev)
 SetPrevFocus(inFocus);
 else
 SetNextFocus(inFocus);
 if (++ct == 3)
 break;
 }
}





[LISTING TWO]

/* ------------- rect.c --------------- */


#include <dos.h>
#include "dflat.h"

 /* -- Produce vector end points produced by overlap of two other vectors -- */
static void subVector(int *v1, int *v2, int t1, int t2, int o1, int o2)
{
 *v1 = *v2 = -1;
 if (within(o1, t1, t2)) {
 *v1 = o1;
 if (within(o2, t1, t2))
 *v2 = o2;
 else
 *v2 = t2;
 }
 else if (within(o2, t1, t2)) {
 *v2 = o2;
 if (within(o1, t1, t2))
 *v1 = o1;
 else
 *v1 = t1;
 }
 else if (within(t1, o1, o2)) {
 *v1 = t1;
 if (within(t2, o1, o2))
 *v2 = t2;
 else
 *v2 = o2;
 }
 else if (within(t2, o1, o2)) {
 *v2 = t2;
 if (within(t1, o1, o2))
 *v1 = t1;
 else
 *v1 = o1;
 }
}

 /* -- Return rectangle produced by the overlap of two other rectangles --- */
RECT subRectangle(RECT r1, RECT r2)
{
 RECT r = {0,0,0,0};
 subVector((int *) &RectLeft(r), (int *) &RectRight(r),
 RectLeft(r1), RectRight(r1),
 RectLeft(r2), RectRight(r2));
 subVector((int *) &RectTop(r), (int *) &RectBottom(r),
 RectTop(r1), RectBottom(r1),
 RectTop(r2), RectBottom(r2));
 if (RectRight(r) == -1 || RectTop(r) == -1)
 RectRight(r) =
 RectLeft(r) =
 RectTop(r) =
 RectBottom(r) = 0;
 return r;
}

/* ------- return the client rectangle of a window ------ */
RECT ClientRect(void *wnd)
{

 RECT rc;

 RectLeft(rc) = GetClientLeft((WINDOW)wnd);
 RectTop(rc) = GetClientTop((WINDOW)wnd);
 RectRight(rc) = GetClientRight((WINDOW)wnd);
 RectBottom(rc) = GetClientBottom((WINDOW)wnd);
 return rc;
}

/* ---- return the rectangle relative to its window's screen position ---- */
RECT RelativeWindowRect(void *wnd, RECT rc)
{
 RectLeft(rc) -= GetLeft((WINDOW)wnd);
 RectRight(rc) -= GetLeft((WINDOW)wnd);
 RectTop(rc) -= GetTop((WINDOW)wnd);
 RectBottom(rc) -= GetTop((WINDOW)wnd);
 return rc;
}

/* ----- clip a rectangle to the parents of the window ----- */
RECT ClipRectangle(void *wnd, RECT rc)
{
 RECT sr;
 RectLeft(sr) = RectTop(sr) = 0;
 RectRight(sr) = SCREENWIDTH-1;
 RectBottom(sr) = SCREENHEIGHT-1;
 if (!TestAttribute((WINDOW)wnd, NOCLIP))
 while ((wnd = GetParent((WINDOW)wnd)) != NULLWND)
 rc = subRectangle(rc, ClientRect(wnd));
 return subRectangle(rc, sr);
}







[LISTING THREE]

/* ----------- dflatmsg.h ------------ */

/* message foundation file
 * make message changes here
 * other source files will adapt
 */

/* -------------- process communication messages ----------- */
DFlatMsg(START) /* start message processing */
DFlatMsg(STOP) /* stop message processing */
DFlatMsg(COMMAND) /* send a command to a window */
/* -------------- window management messages --------------- */
DFlatMsg(CREATE_WINDOW) /* create a window */
DFlatMsg(SHOW_WINDOW) /* show a window */
DFlatMsg(HIDE_WINDOW) /* hide a window */
DFlatMsg(CLOSE_WINDOW) /* delete a window */
DFlatMsg(SETFOCUS) /* set and clear the focus */
DFlatMsg(PAINT) /* paint the window's data space*/
DFlatMsg(BORDER) /* paint the window's border */

DFlatMsg(TITLE) /* display the window's title */
DFlatMsg(MOVE) /* move the window */
DFlatMsg(SIZE) /* change the window's size */
DFlatMsg(MAXIMIZE) /* maximize the window */
DFlatMsg(MINIMIZE) /* minimize the window */
DFlatMsg(RESTORE) /* restore the window */
DFlatMsg(INSIDE_WINDOW) /* test x/y inside a window */
/* ---------------- clock messages ------------------------- */
DFlatMsg(CLOCKTICK) /* the clock ticked */
DFlatMsg(CAPTURE_CLOCK) /* capture clock into a window */
DFlatMsg(RELEASE_CLOCK) /* release clock to the system */
/* -------------- keyboard and screen messages ------------- */
DFlatMsg(KEYBOARD) /* key was pressed */
DFlatMsg(CAPTURE_KEYBOARD) /* capture keyboard into a window */
DFlatMsg(RELEASE_KEYBOARD) /* release keyboard to system */
DFlatMsg(KEYBOARD_CURSOR) /* position the keyboard cursor */
DFlatMsg(CURRENT_KEYBOARD_CURSOR) /*read the cursor position */
DFlatMsg(HIDE_CURSOR) /* hide the keyboard cursor */
DFlatMsg(SHOW_CURSOR) /* display the keyboard cursor */
DFlatMsg(SAVE_CURSOR) /* save the cursor's configuration*/
DFlatMsg(RESTORE_CURSOR) /* restore the saved cursor */
DFlatMsg(SHIFT_CHANGED) /* the shift status changed */
DFlatMsg(WAITKEYBOARD) /* waits for a key to be released */
/* ---------------- mouse messages ------------------------- */
DFlatMsg(MOUSE_INSTALLED) /* test for mouse installed */
DFlatMsg(RIGHT_BUTTON) /* right button pressed */
DFlatMsg(LEFT_BUTTON) /* left button pressed */
DFlatMsg(DOUBLE_CLICK) /* left button double-clicked */
DFlatMsg(MOUSE_MOVED) /* mouse changed position */
DFlatMsg(BUTTON_RELEASED) /* mouse button released */
DFlatMsg(CURRENT_MOUSE_CURSOR)/* get mouse position */
DFlatMsg(MOUSE_CURSOR) /* set mouse position */
DFlatMsg(SHOW_MOUSE) /* make mouse cursor visible */
DFlatMsg(HIDE_MOUSE) /* hide mouse cursor */
DFlatMsg(WAITMOUSE) /* wait until button released */
DFlatMsg(TESTMOUSE) /* test any mouse button pressed*/
DFlatMsg(CAPTURE_MOUSE) /* capture mouse into a window */
DFlatMsg(RELEASE_MOUSE) /* release the mouse to system */
/* ---------------- text box messages ---------------------- */
DFlatMsg(ADDTEXT) /* add text to the text box */
DFlatMsg(CLEARTEXT) /* clear the edit box */
DFlatMsg(SETTEXT) /* set address of text buffer */
DFlatMsg(SCROLL) /* vertical scroll of text box */
DFlatMsg(HORIZSCROLL) /* horizontal scroll of text box*/
/* ---------------- edit box messages ---------------------- */
DFlatMsg(EB_GETTEXT) /* get text from an edit box */
DFlatMsg(EB_PUTTEXT) /* put text into an edit box */
/* ---------------- menubar messages ----------------------- */
DFlatMsg(BUILDMENU) /* build the menu display */
DFlatMsg(SELECTION) /* menubar selection */
/* ---------------- popdown messages ----------------------- */
DFlatMsg(BUILD_SELECTIONS) /* build the menu display */
DFlatMsg(CLOSE_POPDOWN) /* tell parent popdown is closing */
/* ---------------- list box messages ---------------------- */
DFlatMsg(LB_SELECTION) /* sent to parent on selection */
DFlatMsg(LB_CHOOSE) /* sent when user chooses */
DFlatMsg(LB_CURRENTSELECTION)/* return the current selection */
DFlatMsg(LB_GETTEXT) /* return the text of selection */
DFlatMsg(LB_SETSELECTION) /* sets the listbox selection */

/* ---------------- dialog box messages -------------------- */
DFlatMsg(INITIATE_DIALOG) /* begin a dialog */
DFlatMsg(ENTERFOCUS) /* tell DB control got focus */
DFlatMsg(LEAVEFOCUS) /* tell DB control lost focus */
DFlatMsg(ENDDIALOG) /* end a dialog */
/* ---------------- help box messages ---------------------- */
DFlatMsg(DISPLAY_HELP)
/* --------------- application window messages ------------- */
DFlatMsg(ADDSTATUS)






[LISTING FOUR]

/* ---------------- commands.h ----------------- */

/* Command values sent as the first parameter
 * in the COMMAND message
 * Add application-specific commands to this enum
 */

#ifndef COMMANDS_H
#define COMMANDS_H

enum commands {
 /* --------------- File menu ---------------- */
 ID_OPEN,
 ID_NEW,
 ID_SAVE,
 ID_SAVEAS,
 ID_DELETEFILE,
 ID_PRINT,
 ID_DOS,
 ID_EXIT,
 /* --------------- Edit menu ---------------- */
 ID_UNDO,
 ID_CUT,
 ID_COPY,
 ID_PASTE,
 ID_PARAGRAPH,
 ID_CLEAR,
 ID_DELETETEXT,
 /* --------------- Search Menu -------------- */
 ID_SEARCH,
 ID_REPLACE,
 ID_SEARCHNEXT,
 /* -------------- Options menu -------------- */
 ID_INSERT,
 ID_WRAP,
 ID_LOG,
 ID_TABS,
 ID_DISPLAY,
 ID_SAVEOPTIONS,
 /* --------------- Window menu -------------- */
 ID_WINDOW,
 ID_CLOSEALL,

 /* --------------- Help menu ---------------- */
 ID_HELPHELP,
 ID_EXTHELP,
 ID_KEYSHELP,
 ID_HELPINDEX,
 ID_ABOUT,
 ID_LOADHELP,
 /* --------------- System menu -------------- */
 ID_SYSRESTORE,
 ID_SYSMOVE,
 ID_SYSSIZE,
 ID_SYSMINIMIZE,
 ID_SYSMAXIMIZE,
 ID_SYSCLOSE,
 /* ---- FileOpen and SaveAs dialog boxes ---- */
 ID_FILENAME,
 ID_FILES,
 ID_DRIVE,
 ID_PATH,
 /* ----- Search and Replace dialog boxes ---- */
 ID_SEARCHFOR,
 ID_REPLACEWITH,
 ID_MATCHCASE,
 ID_REPLACEALL,
 /* ----------- Windows dialog box ----------- */
 ID_WINDOWLIST,
 /* --------- generic command buttons -------- */
 ID_OK,
 ID_CANCEL,
 ID_HELP,
 /* -------------- TabStops menu ------------- */
 ID_TAB2,
 ID_TAB4,
 ID_TAB6,
 ID_TAB8,
 /* ------------ Display dialog box ---------- */
 ID_BORDER,
 ID_TITLE,
 ID_STATUSBAR,
 ID_TEXTURE,
 ID_COLOR,
 ID_MONO,
 ID_REVERSE,
 ID_25LINES,
 ID_43LINES,
 ID_50LINES,
 /* ------------- Log dialog box ------------- */
 ID_LOGLIST,
 ID_LOGGING,
 /* ------------ HelpBox dialog box ---------- */
 ID_HELPTEXT,
 ID_BACK,
 ID_PREV,
 ID_NEXT
};

#endif








[LISTING FIVE]

/* --------------- memopad.c ----------- */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys\types.h>
#include <sys\stat.h>
#include <dos.h>
#ifndef MSC
#include <dir.h>
#endif
#include "dflat.h"

static char Untitled[] = "Untitled";
static int wndpos = 0;

static int MemoPadProc(WINDOW, MESSAGE, PARAM, PARAM);
static void NewFile(WINDOW);
static void SelectFile(WINDOW);
static void PadWindow(WINDOW, char *);
static void OpenPadWindow(WINDOW, char *);
static void LoadFile(WINDOW, int);
static void PrintPad(WINDOW);
static void SaveFile(WINDOW, int);
static void DeleteFile(WINDOW);
static int EditorProc(WINDOW, MESSAGE, PARAM, PARAM);
static char *NameComponent(char *);
char **Argv;

void main(int argc, char *argv[])
{
 WINDOW wnd;
 init_messages();
 Argv = argv;
 wnd = CreateWindow(APPLICATION,
 "D-Flat MemoPad " VERSION,
 0, 0, -1, -1,
 &MainMenu,
 NULL,
 MemoPadProc,
 MOVEABLE |
 SIZEABLE |
 HASBORDER |
 HASSTATUSBAR
 );

 SendMessage(wnd, SETFOCUS, TRUE, 0);
 while (argc > 1) {
 PadWindow(wnd, argv[1]);
 --argc;
 argv++;
 }

 while (dispatch_message())
 ;
}
/* ------ open text files and put them into editboxes ----- */
static void PadWindow(WINDOW wnd, char *FileName)
{
 int ax, criterr = 1;
 struct ffblk ff;
 char path[64];
 char *cp;

 CreatePath(path, FileName, FALSE, FALSE);
 cp = path+strlen(path);
 CreatePath(path, FileName, TRUE, FALSE);
 while (criterr == 1) {
 ax = findfirst(path, &ff, 0);
 criterr = TestCriticalError();
 }
 while (ax == 0 && !criterr) {
 strcpy(cp, ff.ff_name);
 OpenPadWindow(wnd, path);
 ax = findnext(&ff);
 }
}
/* ----- window processing module for the memopad application window ----- */
static int MemoPadProc(WINDOW wnd,MESSAGE msg,PARAM p1,PARAM p2)
{
 switch (msg) {
 case COMMAND:
 switch ((int)p1) {
 case ID_NEW:
 NewFile(wnd);
 return TRUE;
 case ID_OPEN:
 SelectFile(wnd);
 return TRUE;
 case ID_SAVE:
 SaveFile(inFocus, FALSE);
 return TRUE;
 case ID_SAVEAS:
 SaveFile(inFocus, TRUE);
 return TRUE;
 case ID_DELETEFILE:
 DeleteFile(inFocus);
 return TRUE;
 case ID_PRINT:
 PrintPad(inFocus);
 return TRUE;
 case ID_ABOUT:
 MessageBox(
 "About D-Flat and the MemoPad",
 " ZDDDDDDDDDDDDDDDDDDDDDDD?\n"
 " 3 \\\ \\\ \ 3\n"
 " 3 [ [ [ [ [ 3\n"
 " 3 [ [ [ [ [ 3\n"
 " 3 [ [ [ [ [ [ 3\n"
 " 3 ___ ___ __ 3\n"
 " @DDDDDDDDDDDDDDDDDDDDDDDY\n"
 "D-Flat implements the SAA/CUA\n"

 "interface in a public domain\n"
 "C language library originally\n"
 "published in Dr. Dobb's Journal\n"
 " ------------------------ \n"
 "MemoPad is a multiple document\n"
 "editor that demonstrates D-Flat");
 MessageBox(
 "D-Flat Testers and Friends",
 "Jeff Ratcliff, David Peoples, Kevin Slater,\n"
 "Naor Toledo Pinto, Jeff Hahn, Jim Drash,\n"
 "Art Stricek, John Ebert, George Dinwiddie,\n"
 "Damaian Thorne, Wes Peterson, Thomas Ewald,\n"
 "Mitch Miller, Ray Waters, Jim Drash, Eric\n"
 "Silver, Russ Nelson, Elliott Jackson, Warren\n"
 "Master, H.J. Davey, Jim Kyle, Jim Morris,\n"
 "Andrew Terry, Michel Berube, Bruce Edmondson,\n"
 "Peter Baenziger, Phil Roberge, Willie Hutton,\n"
 "Randy Bradley, Tim Gentry, Lee Humphrey,\n"
 "Larry Troxler, Robert DiFalco, Carl Huff,\n"
 "Vince Rice, Michael Kaufman, Donald Cumming,\n"
 "Ross Wheeler, Lou DiNardo, Keith London, Frank\n"
 "Burleigh, Jason Ward, Skip Key, Sam Gentile,\n"
 "Dana P'Simer, Ken North");
 return TRUE;
 default:
 break;
 }
 break;
 default:
 break;
 }
 return DefaultWndProc(wnd, msg, p1, p2);
}
/* --- The New command. Open an empty editor window --- */
static void NewFile(WINDOW wnd)
{
 OpenPadWindow(wnd, Untitled);
}
/* --- The Open... command. Select a file --- */
static void SelectFile(WINDOW wnd)
{
 char FileName[64];
 if (OpenFileDialogBox("*.PAD", FileName)) {
 /* --- see if the document is already in a window --- */
 WINDOW wnd1 = GetFirstChild(wnd);
 while (wnd1 != NULLWND) {
 if (stricmp(FileName, wnd1->extension) == 0) {
 SendMessage(wnd1, SETFOCUS, TRUE, 0);
 SendMessage(wnd1, RESTORE, 0, 0);
 return;
 }
 wnd1 = GetNextChild(wnd, wnd1);
 }
 OpenPadWindow(wnd, FileName);
 }
}
/* --- open a document window and load a file --- */
static void OpenPadWindow(WINDOW wnd, char *FileName)
{

 static WINDOW wnd1 = NULLWND;
 struct stat sb;
 char *Fname = FileName;
 char *ermsg;
 if (strcmp(FileName, Untitled)) {
 if (stat(FileName, &sb)) {
 if ((ermsg = malloc(strlen(FileName)+20)) != NULL) {
 strcpy(ermsg, "No such file as\n");
 strcat(ermsg, FileName);
 ErrorMessage(ermsg);
 free(ermsg);
 }
 return;
 }
 Fname = NameComponent(FileName);
 }
 wndpos += 2;
 if (wndpos == 20)
 wndpos = 2;
 wnd1 = CreateWindow(EDITBOX,
 Fname,
 (wndpos-1)*2, wndpos, 10, 40,
 NULL, wnd, EditorProc,
 SHADOW |
 MINMAXBOX |
 CONTROLBOX |
 VSCROLLBAR |
 HSCROLLBAR |
 MOVEABLE |
 HASBORDER |
 SIZEABLE |
 MULTILINE
 );
 if (strcmp(FileName, Untitled)) {
 if ((wnd1->extension =
 malloc(strlen(FileName)+1)) != NULL) {
 strcpy(wnd1->extension, FileName);
 LoadFile(wnd1, (int) sb.st_size);
 }
 }
 SendMessage(wnd1, SETFOCUS, TRUE, 0);
}
/* --- Load the notepad file into the editor text buffer --- */
static void LoadFile(WINDOW wnd, int tLen)
{
 char *Buf;
 FILE *fp;

 if ((Buf = malloc(tLen+1)) != NULL) {
 if ((fp = fopen(wnd->extension, "rt")) != NULL) {
 memset (Buf, 0, tLen+1);
 fread(Buf, tLen, 1, fp);
 SendMessage(wnd, SETTEXT, (PARAM) Buf, 0);
 fclose(fp);
 }
 free(Buf);
 }
}
/* --- print the current notepad --- */

static void PrintPad(WINDOW wnd)
{
 unsigned char *text;

 /* ---------- print the file name ---------- */
 fputs("\r\n", stdprn);
 fputs(GetTitle(wnd), stdprn);
 fputs(":\r\n\n", stdprn);

 /* ---- get the address of the editor text ----- */
 text = GetText(wnd);

 /* ------- print the notepad text --------- */
 while (*text) {
 if (*text == '\n')
 fputc('\r', stdprn);
 fputc(*text++, stdprn);
 }

 /* ------- follow with a form feed? --------- */
 if (YesNoBox("Form Feed?"))
 fputc('\f', stdprn);
}
/* ---------- save a file to disk ------------ */
static void SaveFile(WINDOW wnd, int Saveas)
{
 FILE *fp;
 if (wnd->extension == NULL || Saveas) {
 char FileName[64];
 if (SaveAsDialogBox(FileName)) {
 if (wnd->extension != NULL)
 free(wnd->extension);
 if ((wnd->extension =
 malloc(strlen(FileName)+1)) != NULL) {
 strcpy(wnd->extension, FileName);
 AddTitle(wnd, NameComponent(FileName));
 SendMessage(wnd, BORDER, 0, 0);
 }
 }
 else
 return;
 }
 if (wnd->extension != NULL) {
 WINDOW mwnd = MomentaryMessage("Saving the file");
 if ((fp = fopen(wnd->extension, "wt")) != NULL) {
 fwrite(GetText(wnd), strlen(GetText(wnd)), 1, fp);
 fclose(fp);
 wnd->TextChanged = FALSE;
 }
 SendMessage(mwnd, CLOSE_WINDOW, 0, 0);
 }
}
/* -------- delete a file ------------ */
static void DeleteFile(WINDOW wnd)
{
 if (wnd->extension != NULL) {
 if (strcmp(wnd->extension, Untitled)) {
 char *fn = NameComponent(wnd->extension);
 if (fn != NULL) {

 char msg[30];
 sprintf(msg, "Delete %s?", fn);
 if (YesNoBox(msg)) {
 unlink(wnd->extension);
 SendMessage(wnd, CLOSE_WINDOW, 0, 0);
 }
 }
 }
 }
}
/* ------ display the row and column in the statusbar ------ */
static void ShowPosition(WINDOW wnd)
{
 char status[30];
 sprintf(status, "Line:%4d Column: %2d",
 wnd->CurrLine, wnd->CurrCol);
 SendMessage(GetParent(wnd), ADDSTATUS, (PARAM) status, 0);
}
/* ----- window processing module for the editboxes ----- */
static int EditorProc(WINDOW wnd,MESSAGE msg,PARAM p1,PARAM p2)
{
 int rtn;
 switch (msg) {
 case SETFOCUS:
 rtn = DefaultWndProc(wnd, msg, p1, p2);
 if ((int)p1 == FALSE)
 SendMessage(GetParent(wnd), ADDSTATUS, 0, 0);
 else
 ShowPosition(wnd);
 return rtn;
 case KEYBOARD_CURSOR:
 rtn = DefaultWndProc(wnd, msg, p1, p2);
 ShowPosition(wnd);
 return rtn;
 case COMMAND:
 if ((int) p1 == ID_HELP) {
 DisplayHelp(wnd, "MEMOPADDOC");
 return TRUE;
 }
 break;
 case CLOSE_WINDOW:
 if (wnd->TextChanged) {
 char *cp = malloc(25+strlen(GetTitle(wnd)));
 SendMessage(wnd, SETFOCUS, TRUE, 0);
 if (cp != NULL) {
 strcpy(cp, GetTitle(wnd));
 strcat(cp, "\nText changed. Save it?");
 if (YesNoBox(cp))
 SendMessage(GetParent(wnd),
 COMMAND, ID_SAVE, 0);
 free(cp);
 }
 }
 wndpos = 0;
 if (wnd->extension != NULL) {
 free(wnd->extension);
 wnd->extension = NULL;
 }
 break;

 default:
 break;
 }
 return DefaultWndProc(wnd, msg, p1, p2);
}
/* -- point to the name component of a file specification -- */
static char *NameComponent(char *FileName)
{
 char *Fname;
 if ((Fname = strrchr(FileName, '\\')) == NULL)
 if ((Fname = strrchr(FileName, ':')) == NULL)
 return FileName; /* no path or drive prefix */
 return Fname + 1;
}




Example 1:

 WINDOW cwnd = GetFirstChild(pwnd);
 while (cwnd != NULLWND) {
     /* -- process the child window -- */
     cwnd = GetNextChild(pwnd, cwnd);
 }


Example 2:


 typedef enum messages {
 #undef DFlatMsg
 #define DFlatMsg(m) m,
 #include "dflatmsg.h"
     MESSAGECOUNT
 } MESSAGE;




Example 3:

 static char *message[] = {
 #undef DFlatMsg
 #define DFlatMsg(m) " " #m,
 #include "dflatmsg.h"
     NULL
 };
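
Examples 2 and 3 expand the same DFlatMsg list twice, once into enum constants and once into strings, so the two can never drift out of step. The sketch below shows the same idea in a single file. It is an illustration only, not the D-Flat source: the three-entry DEMO_MESSAGES list stands in for dflatmsg.h, and the names DemoMsg, DEMO_MESSAGE, and demo_message_name are hypothetical. It also uses a list macro in place of re-including a header, and it omits the leading space that Example 3 puts in front of each name.

```c
#include <stddef.h>

/* Hypothetical three-message list standing in for dflatmsg.h */
#define DEMO_MESSAGES \
    DemoMsg(START)    \
    DemoMsg(STOP)     \
    DemoMsg(COMMAND)

/* First expansion: each entry becomes an enum constant */
typedef enum demo_messages {
#define DemoMsg(m) m,
    DEMO_MESSAGES
#undef DemoMsg
    DEMO_MESSAGECOUNT
} DEMO_MESSAGE;

/* Second expansion: each entry becomes its own name as a string */
static const char *demo_message_names[] = {
#define DemoMsg(m) #m,
    DEMO_MESSAGES
#undef DemoMsg
    NULL
};

/* Map a message code to its name; "?" for anything out of range */
const char *demo_message_name(DEMO_MESSAGE msg)
{
    if ((int)msg < 0 || (int)msg >= DEMO_MESSAGECOUNT)
        return "?";
    return demo_message_names[msg];
}
```

Adding a fourth DemoMsg line to the list extends both the enum and the name table at once, which is the point of keeping the list in one place.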











October, 1991
STRUCTURED PROGRAMMING


Sympathy on the Loss of One of Your Legs




Jeff Duntemann KG7JF


Life is complicated. Art follows life. If greeting cards are indeed art, then
either the greeting card industry has gotten awfully complicated, or else all
men are Socrates.
An industry that once was happy to declare Happy Birthday! Merry Christmas! Be
My Valentine! now has to deal with death, divorce, Halloween, big raises,
parole, sick pets, shared custody, and National Secretary's Day. A recent scan
down the aisle at a Scottsdale Hallmark shop yielded Happy Anniversary to
Mother And Her Husband From Both Of Us as well as Your Father And I Disagree
But We Both Still Love You. The diversity is boggling, but even more boggling
is the specificity. When a card shop contains 10,000 cards, and when 50
million Americans a day buy at least one card in those shops, the industry can
afford to be specific.
Perhaps when you see so much specificity your mind begins to manufacture more
of it. Mine does (perhaps as a reaction against our overwhelming bias in favor
of generalized code), and that may be why I did a double take halfway down the
aisle at the card shop not long ago. A second look at a certain flowery
offering showed it to read Sympathy on the Loss of Your Loved One, but damned
if I didn't first read it as Sympathy on the Loss of One of Your Legs.
Now that's specific.


The Fallacy of Generality


Ridiculous? Well, wait a minute. What if someone you cared about really did
lose one of his or her legs? What card could possibly capture your feelings
better than that? (Assuming you were brave enough to express any feelings at
all--not everyone would.) My hunch is that such a card would sell perhaps
thirty units a year, total--but it would sell to nearly 100 percent of its
intended audience, that is, people who lose one of their legs but retain some
sense of humor.
And therein lies a lesson.
The major applications sold today (word processing, spreadsheets, databases)
are tremendously general in nature. Why? Two reasons: First of all, we who
began and carried the microcomputer revolution are pathological tinkerers. We
love these damned boxes and we can't stop playing with them. For tinkerers,
generality is the necessary condition that allows more tinkering. If you have
an application that does only one thing, there's no more tinkering to be done.
Also, in the beginning the installed base for any given machine was small, so
to be economically viable, a product had to be warpable to serve many very
different needs. When only one dog kennel in a hundred has a computer, you
can't make money on the DogMatic Kennel Management System, so you sell the
breeders dBase III instead and let them futz up their own systems. On the
other hand, once 90 percent of dog kennels have computers, you can create
DogMatic and make a handsome living on it.
Both reasons for application generality have their genesis in a time now gone.
Many or even most people who buy computers today have no desire at all to
tinker. They have their fun playing tennis or sabotaging nuclear power plants.
Hacking never once entered their minds as something you do on a Saturday
night. They want computers to solve their own very specific problems, not
create more problems. They want programs that track tennis scores or manage
athletic shoe outlets or spot trends in orange juice futures. General-purpose
databases would only infuriate them. People who do not tinker would look upon
dBase III as an incomplete product, as something of no earthly use to anyone.
And with an installed base of DOS machines at something like 100,000,000
worldwide, it's getting to the point where the majority of working people are
going to have a family computer.
With an installed base like that, you can target the narrowest of vertical
markets and still make a living, because nearly everybody in that market will
already have a computer. If the market itself exists, it will support a
vertical market application.


Go Vertical, Young Man


Of course, there have been vertical market applications as long as there have
been computers. The earliest were slanted toward big business categories such
as banking, law, and government where margins are high and money flows freely.
Little by little, what we're seeing is that all businesses have become
vertical markets, because our machines are powerful enough to handle any small
business, and cheap enough to be bought by small business without breaking the
bank.
A "f'rinstance." I own and operate a programmers' magazine as my "real" job.
It's a small business, with only five employees and revenues of considerably
less than a million dollars per year. The margins are not high, and we're in
no position to pay $50,000 for a minicomputer and $20,000 for a
magazine-management package. Fortunately, we don't have to. We bought a
vertical-market package for $3500 that runs briskly on a $1300 rotgut 386SX
no-name clone box with a monochrome monitor and a 72-Mbyte hard disk. It
manages circulation promotion and fulfillment for 30,000 customers, as well as
accounting and mailing list management. The package (called QuickFill, from
CWC Software) is very slick, very robust, and beautifully documented. The
phone support is inexpensive and superb. The vendor is still in business and
seems to be doing well. I'd call this a miracle, except that it seems to be a
trend.
If there will be a way to make money in this business for the next ten years,
this is it: Go vertical. Find a market that isn't saturated, study it closely,
and then create an application that is so specific to that market that it
requires little or no tinkering on the part of its purchasers. QuickFill
followed standard practice in the magazine publishing business so closely that
we didn't have to futz it, or change the way we worked to adapt to it.
QuickFill is a very nearly totally accurate reflection of the magazine
publishing business in the mirror of the desktop DOS machine.
Your mission is to accurately reflect your chosen market in the mirror of the
universal DOS machine. If you don't accept it, go back to selling shower
curtains.


The Personal Vertical Application


That's the state of things right now, and I'll come back to the issue of
small-business vertical market applications shortly. Looking ahead just a year
or two, I see another trend unfolding along the same axis: the appearance of
the personal vertical market. By "personal," I mean an application that
doesn't involve some kind of money-making pursuit. By "vertical," I simply
mean specific--highly specific--to a certain type of endeavor, one so specific
that no one right now would admit that it could sell enough copies to make
money.
Forget the legendary recipe management database. There simply isn't enough
management to be done to make such a thing useful. Cookbooks do the job plenty
well, and the resolution on the pictures is better. (Who wants to see jaggies
on their Spicy Cataluna Chicken Wings?)
No. Let me hand you all an idea I had years ago, but shelved because it was
too specific for the installed base at the time. I think its time has come.
The idea is hereby declared public domain (lest some lawyer or other moron try
to patent it) and may the best implementor win.
I call it Card Shark. It is a graphics drawing program specifically tailored
to create greeting cards, and nothing else. It pulls the following elements
together:
1. A large clip-art object library of cute dogs, grotesque caricatures,
crosses, flower arrangements, country cottages, or legally licensed characters
like Ziggy, Snoopy, or Zippy the Pinhead.
2. An intelligent "posing" engine that can not only scale such clip art, but
also rotate and manipulate it within constraints. Click your mouse and pull
Ziggy's pudgy little arm up over his head, waving bye-bye, but still keeping
his essential shape within the constraints of the character as Tom Wilson
designed him. Put Zippy the Pinhead in various poses until you find one you
like. Etc.
3. A clip-joke collection indexed by occasion. Want an insult related to
turning 40? Or one related to getting a big raise? Or an inspirational message
encouraging the reader to carry on, even with one leg missing? You don't have
to tell the user that the little matrix he or she fills out (factors like
raunch quotient, social class, and so on) to go fetch a joke is a
query-by-example form.
4. A bunch of highly versatile fonts in which to cast the card's message.
5. Laser-printable paper stock and envelopes to make the cards real. Forget
dot matrix printers. By 1995 they'll seem as quaint as single-sided diskette
drives.
6. A gimmick. For example, a help system implemented as an animated cartoon
shark, who pops up on command to answer questions inside of a cartoon speech
balloon, or shows up to counsel the user when he or she tries to do something
nonsensical. The shark could be just another piece of clip art, with the help
system--rather than the user--doing the posing of the character.
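
Item 3's claim that the little matrix is really a query-by-example form can be made concrete in a few lines. What follows is a minimal sketch in C under loud assumptions: the record layout, the field names, and the functions joke_matches and find_joke are all invented for illustration, and a real product would sit on a purchased database engine rather than a linear scan of an array. A field left blank on the form matches anything; a filled-in field must agree with the record (for the raunch quotient, "agree" is taken to mean "no raunchier than").

```c
#include <stddef.h>
#include <string.h>

#define QBE_ANY (-1)   /* a numeric field left blank matches anything */

/* Hypothetical joke record; no real schema is implied */
struct joke {
    const char *occasion;
    int raunch;             /* 0 = squeaky clean, larger = raunchier */
    const char *text;
};

/* The filled-out form: mirrors the record, but blanks are allowed */
struct joke_query {
    const char *occasion;   /* NULL means "any occasion" */
    int max_raunch;         /* QBE_ANY means "no limit" */
};

/* A record matches when every filled-in field of the form agrees */
static int joke_matches(const struct joke *j, const struct joke_query *q)
{
    if (q->occasion != NULL && strcmp(j->occasion, q->occasion) != 0)
        return 0;
    if (q->max_raunch != QBE_ANY && j->raunch > q->max_raunch)
        return 0;
    return 1;
}

/* Return the first match at or after index start, or NULL if none */
const struct joke *find_joke(const struct joke *jokes, size_t count,
                             size_t start, const struct joke_query *q)
{
    size_t i;
    for (i = start; i < count; i++)
        if (joke_matches(&jokes[i], q))
            return &jokes[i];
    return NULL;
}
```

The user never sees the word "query." He or she picks an occasion and slides the raunch dial, and the program fills in a joke_query behind the scenes.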
Some of this stuff may be slightly state-of-the-art, but only for the next 20
minutes or so. The hardest part will be the posable clip art--keeping in mind
that if you do it right, that technology could itself become a product for use
in a lot of far better things than silly greeting cards.
I think that many or even most DDJ readers could create a product something
like this, especially if you back away from the posing engine a little.
Creating the product, you may already be objecting, is the easy part. Where
would you sell such a thing?
Don't be silly. You'd sell it in Hallmark card shops. All they do is cards. Do
a little market research, create a proposal that explains the concept and why
it could work in a nation with 75,000,000 PC compatibles, make the packaging
slick and consumer-oriented, and present it to the national Hallmark
franchiser. They might click their tongues and perhaps try it first in Santa
Clara or San Jose, but after the first few thousand copies flew off the
shelves you'd see the thing go national in a big, big way. Consumer products
can be like that.
Remember me after you're rich.



Squeezing the Development Cycle


That's only my most ambitious example; I have lots of other ones. As a
sometime stamp collector, I would like a highly specific graphics program that
only creates custom stamp album pages. Give me that, and maybe add the ability
to link a square on a stamp album page to a database record. But don't make it
a stamp-collector's configurable database. Make it a stamp album page
generator. I collect stamps to help make me forget programming, remember?
I could fill the rest of this column with suggested personal vertical
applications, but by now I hope your own imagination is hard at work, coming
up with things totally alien to my own (somewhat limited) experience. Instead
of that, let's talk about the prospect for making this stuff pay.
In one sense, Card Shark is a bad example. It's vertical by virtue of being
highly specific, but it's also horizontal in the sense that the majority of
Americans enjoy and send greeting cards. It will be a challenging product to
create, but the potential payoff is high because the potential market numbers
in the millions.
Most applications are not that horizontal. And in the near term, at least, the
exploding PC installed base in small business presents the more urgent need.
By nature, a vertical market application has a limited market. A given market
also has a price point above which the fish won't bite. Markets whose fish
bite high have mostly been filled. You'll need to come in cheaper for
industries that are not as cash-flush as law or medicine.
Vertical market developers generally know this, or at least sense it: To make
an application pay in a limited market, you need to make it happen fast.
And here's a related vertical market developer rule that hobbyists and
tinkerers will probably bridle at: Build nothing that you can buy. Building
something from scratch that can be had from Programmer's Paradise for $200
without royalties will be at best a stunt and at worst fatal to your project.
In this category fall user interfaces, database engines, hardcopy output
libraries, and numerous other things. I have serious doubts that you can come
up with an unquestionably better user interface toolkit than you can buy or
get bundled with a compiler.
Ditto a database engine. QuickFill is inexpensive at least in part because its
developers used a database engine called db_Vista (from Raima Inc.) rather
than wasting a year or so creating one themselves.
In a nutshell, the key to success in vertical markets is to create as much of
the application as humanly possible from standardized, over-the-counter parts.
My rule of thumb would be this: Buy at least 75 percent of the application
(measured in terms of source code lines) in library form.
I can hear the objection already: What if one of your library vendors goes out
of business and you don't have source? Yes, you could be in trouble. But then
again, life demands the taking of risks. If you won't take that risk, the
developer down the street who does will beat you to market and plow you into
the soil.


Learning a Market


Actually, the single most crucial development task in creating a vertical
market application is understanding the target market. Compared to that, the
programming is almost trivial. Most of your time, funds, and energy, in fact,
should go toward making your app an accurate reflection of the industry it
serves.
This is mostly not a computer skill. This is mostly a people skill. You may be
a true Renaissance man or woman with the ability to grok a market in its
fullness after a few days observing it. Probably, you are not. Your best bet
in that case is to find a partner who has made his or her life's work in your
chosen market, and split ownership of the product and the proceeds 50-50 with
that partner.
If this seems to be giving away the store, ask yourself how well you would be
able to model a Christmas tree farm in software from your two-flat on the
north side of Chicago. The real value-added in a vertical market application
is the accuracy of its business model. Your partner is the one who must guide
your programming efforts in such a way as to guarantee that accuracy. By
contrast, much of what you bring to the equation is the ability to create the
model quickly and inexpensively, and to maintain and support it after release
without a small army of additional personnel.
It certainly helps if you yourself have some experience in the field you're
trying to model, and you certainly can't remain totally ignorant of the
field's details. But you still need someone with the ability to see the
business from the user's point of view, minus your inherent and inescapable
love of tinkering the box for the hell of it.
But once you've chosen your partner, the way to proceed (culled from my
numerous conversations with people who do or have done this) is this: Spend a
solid month at the actual job site controlled by your partner where the
prospective vertical market app will be used. Initiate an ongoing, two-way
line of inquiry. You must observe the tasks your partner has to perform to
keep the business running, and you must constantly ask why each step is done,
and what each step contributes to the larger business context. Take voluminous
notes. (A small tape recorder works well during the day, assuming you are
willing to spend two hours intelligently filtering its contents to a document
file of some sort that night.)
But you should encourage your partner to ask questions of you as well,
especially if he or she is nontechnical. Your partner knows the work better
than you, and he or she should stop at appropriate points and say, "This is a
real logjam operation for us. What can we do with a computer to help
streamline the work flow here?" The biggest selling points of your eventual
application will be the ways it clears those logjams.
Resist the temptation to redesign the shape of the business to accommodate the
program. Today's machines are capacious and fast enough to do things the
not-quite-optimal way, if that way more closely mirrors the business being
automated. Adhering to the "common practice" prevalent in the business is one
of your design constraints. It may even be the most important one.
Do not even begin your design work until you have been soaked to the eyeballs
in the everyday work of the business for that full month. Learn the jargon,
and incorporate the jargon into the application right at the design level.


"Screen-in" Design


It often makes sense in vertical-market work to design the app from the screen
in, by designing screen layouts (or at least approximating them) before any
other system elements are decided. The screens come first because they are the
places where the user touches the app. The screens should reflect common
business practice, and may even be the video realization of business forms in
common use in the business. Mimic such forms where possible, especially where
a high degree of forms standardization (as in government, tax, or insurance
work) exists.
Find a screen design tool that appeals to you -- and teach your partner how to
use it. Have your partner take the first cut at designing the screens. You may
have to touch them up, but if your partner knows anything about efficient work
methods (and if not, the business is probably in trouble), the best screen
designs will come from your partner and not from you.
Treat your suite of screen designs as a group of flexible constraints. Once
you've got 'em, design to 'em. The screens will probably need to be tweaked as
your design shakes out -- but you must try very hard to let the screens shape
the design, rather than letting design expediency (read here: programmer
laziness) reach back and reshape the screens.


Working Fast


The narrower your market, the faster you had better be able to produce the
goods. This means that you have to be good at your chosen language, and that
your chosen language must allow the sort of mad-dash gonzo programming (what I
call lightning development) you have to accomplish to make money. C is
hopeless for lightning development -- but before you C guys start tying the
noose, let me throw this additional comment in: Until you've bought or
accumulated a truly tremendous high-level toolkit, Pascal isn't a great deal
better. Modula-2 is worse than both, because the tools aren't there, and may
never be.
While researching this column, I looked at a great many programming
environments I had sitting on the shelf, evaluating them strictly in terms of
their appropriateness for lightning development. My eventual conclusion
surprised me, and it will certainly surprise you: The best environment for
lightning development isn't generally considered a language at all (even
though it is). It is Clarion Professional.
Clarion is usually marketed as and reviewed as a relational database manager.
There is certainly a relational database beating at Clarion's heart (a damned
fine one, too) but that's not what makes it so good. What Clarion has that the
Paradox Engine and db_Vista lack is a true structured programming language and
superb interactive design tools.
Clarion seems to have been designed specifically to allow lightning
development of vertical-market, database-oriented applications that run in
text mode. For simple applications, you may not have to actually write any
Clarion language code by hand. Using the provided tools, you draw your
database schema, draw your user interface and report screens, and let the
Clarion engine tie it all up with a bow. Such an application is not very
flexible (it basically allows you to enter data into a database and query or
report it back again), but it works fast and it's rock-solid -- and you can
put it together in an hour and a half once you learn your way around the
system.
On more ambitious projects, you rough out an application as far as you can
take it with the interactive design tools, and then you bring it the rest of
the way by writing code in the Clarion language proper. My friend Kate Daniel
is a seasoned Clarion developer who can do in a weekend what a C programmer
would proudly brag of accomplishing in "only" ten weeks, and a Pascal guy
might do in five.
I've heard gripes that this isn't really programming, sometimes from people
who think nothing of wrestling with CASE tools for weeks at a time. Maybe what
they mean is that it doesn't hurt enough.


Stepson of Cobol


But I think what bothers them about Clarion is that it looks a lot like Cobol
on the surface, primarily because it is traditionally written entirely in
uppercase. (Clarion is not case-sensitive and may be written in lower- or
mixed-case, as desired.) That, and some old-fashioned control structures such
as computed GOTO, have been enough to give it a bad name. Clarion, however, is
a structured language up with the best of them. Let me give you some
specifics.
A Clarion program or module (like a Turbo Pascal unit) has two parts: a
declaration part and a code part, separated by the reserved word CODE. All the
standard Turbo Pascal data types are supported, plus a 15-digit BCD numeric
type. Simple data types may be declared as variables or combined into GROUP
structures, which are analogous to Pascal records.
Report specs and screen specs are implemented as data structures as well. This
allows the interactive design tools to write the state of a designed report or
screen design in such a form as to be directly accessible to the programmer as
source code.
A MAP structure may be defined to summarize all declarations used within the
current program, including declarations of items defined in other modules.
This is roughly analogous to the INTERFACE section of a Pascal unit, and is
how modular compilation is expedited in Clarion.
Remarkably, Clarion has a richer suite of flow control structures than
Pascal's. In addition to IF and CASE, Clarion has a very versatile LOOP
structure, which can act as either a WHILE..DO loop or a REPEAT..UNTIL loop.
Within LOOP, you can use BREAK to break out of the loop before terminating
conditions are met, or CYCLE to begin at the top of an iterative loop with the
next iterator value. The RESTART statement has no analog in Pascal, but does
the job of the setjmp..longjmp pair in C, in that it allows you to set a trap
point to which you can return if you find yourself hopelessly mired in a
morass of control structures that is caving in on itself. CHAIN allows
Basic-style chaining to another Clarion program, and CALL performs a
subroutine-style CHAIN to another Clarion program, with the original program's
state retained so control can be returned to it. RUN shells out to DOS, either
to COMMAND.COM or to some other executable program.
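For readers who haven't met the C pair that RESTART is being compared to, here is a minimal sketch of the setjmp/longjmp idiom. It shows only the C side of the comparison and makes no claim about how Clarion implements RESTART. The names process_with_escape and deep_work, the nesting depth of 3, and the use of a negative value as the "bail out" signal are all assumptions made for illustration.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf trap_point;      /* saved context: the "trap point" */

/* Hypothetical worker buried several calls deep. A negative value
   stands in for "hopelessly mired": it jumps back to the trap point. */
static int deep_work(int depth, int value)
{
    assert(depth >= 0);
    if (value < 0)
        longjmp(trap_point, 1); /* never returns; setjmp() yields 1 */
    if (depth == 0)
        return value * 2;
    return deep_work(depth - 1, value);
}

/* Returns twice the value, or -1 if deep_work() bailed out. */
int process_with_escape(int value)
{
    if (setjmp(trap_point) != 0)
        return -1;              /* arrived here via longjmp() */
    return deep_work(3, value); /* normal path: setjmp() returned 0 */
}
```

Calling process_with_escape with a negative value abandons every nested call at once and resumes at the setjmp test, which is the kind of escape the paragraph above describes.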
There are two different kinds of procedures: One is the kind we're familiar
with in Pascal and C, and the other is a simple GOSUB-style subroutine called
a local routine.
And, of course, there are two different kinds of GOTO.
As with any language, you can write spaghetti code in Clarion, but if you
can't build clean structured modules with an arsenal like that, you oughta be
selling shower curtains, period. If this be Cobol, I'll take a double handful.



Building Applications


The Clarion compiler generates Clarion P-code for a proprietary P-machine. The
P-code is generally what is used during development. Once the application is
solid, a translator utility converts the P-code to standard DOS .OBJ files
that may be linked as a machine-code .EXE.
Clarion has another distinction that is not adequately appreciated: It has
among the best documentation sets of any programming language I've ever used.
The large three-ring manuals are complete, well-written, virtually
indestructible, and lie unconditionally flat on the desk. There's a good
tutorial on the interactive design tools, and, most remarkably, an entire book
containing annotated example programs written in Clarion. These are not toy
programs; one example is 800 lines long. In any new language spec, annotated
program examples are absolutely necessary to give you the big picture of
program structure once you've become conversant with the individual language
statements.


Products Mentioned


Clarion Professional Developer 2.1
Clarion Software
150 East Sample Road
Pompano Beach, FL 33064
800-354-5444

db_Vista III
Raima Inc.
3245 146th Place SE
Bellevue, WA 98007
206-747-5570

With Clarion's permission, I've reproduced a simple mortgage-payment
calculator from the examples book in Listing One, page 171. It's necessarily
small, but it's quite representative of the spirit of a Clarion program.
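For comparison, the arithmetic at the heart of that calculator can be sketched in a few lines of C. This is only a restatement of the formula used in Listing One (monthly rate = annual percentage / 1200, payment = amount * rate / (1 - 1/(1 + rate)^months)); the function name monthly_payment is an assumption, and a multiply loop stands in for Clarion's ^ operator so the sketch needs no math library.

```c
#include <assert.h>

/* Monthly payment on a fixed-rate loan, following the same steps as
   the Clarion listing. Returns 0 when any input is zero, as the
   listing does. */
double monthly_payment(double amount, double annual_rate_pct, int years)
{
    int months = years * 12;
    double mon_rate = annual_rate_pct / 1200.0;
    double growth = 1.0;            /* accumulates (1 + mon_rate)^months */
    int i;

    assert(years >= 0);
    if (amount == 0.0 || annual_rate_pct == 0.0 || years == 0)
        return 0.0;

    for (i = 0; i < months; i++)    /* repeated multiply instead of pow() */
        growth *= 1.0 + mon_rate;

    return amount * (mon_rate / (1.0 - 1.0 / growth));
}
```

None of the screen handling is here, of course; in the Clarion version the SCREEN structure supplies it.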


More Mortgages


Clarion succeeds because it does not try to be too much. It does not support
graphics or event-driven programming, and is best suited for database-central
applications that don't get too exotic. On the other hand, you can pick up the
essentials of the language in an afternoon and be producing reasonable
applications in two or three days. If you have to work fast, there's simply
nothing like it anywhere.
Listing Two (page 171) returns us to Pascal-land, actually in preparation for
next issue, when we launch into (hold your breath) Turbo Vision. MORTGAGE.PAS
is a mortgage object that generates and manipulates a mortgage amortization
table. There's nothing the least bit tricky about it. It lacks a user
interface, but that's OK...because that's what Turbo Vision does best.

_STRUCTURED PROGRAMMING COLUMN_
by Jeff Duntemann


[LISTING ONE]

 TITLE('COMPUTE MONTHLY PAYMENTS')
PAYMENT PROGRAM
 INCLUDE('\CLARION\STD_KEYS.CLA') !STANDARD KEYCODE EQUATES

SCREEN SCREEN HLP('PAYMENT'),HUE(7,0,0)
 ROW(4,25) PAINT(13,32),HUE(7,1)
 COL(25) STRING('<201,205{30},187>')
 ROW(5,25) REPEAT(3);STRING('<186,0{30},186>') .
 ROW(8,25) STRING('<204,205{30},185>')
 ROW(9,25) REPEAT(7);STRING('<186,0{30},186>') .
 ROW(16,25) STRING('<200,205{30},188>')
 ROW(6,28) STRING('CALCULATE MONTHLY PAYMENTS')
 ROW(13,32) STRING('Payment :')
 ROW(15,30) STRING('Press Ctrl-Esc to Exit')
 ROW(9,32) STRING('Principal:')
 COL(44) ENTRY(@N7),USE(AMOUNT),INS,NUM
 ROW(10,32) STRING('Rate {5}:')
 COL(44) ENTRY(@N7.3),USE(RATE),INS,NUM
 ROW(11,32) STRING('Years :')
 COL(49) ENTRY(@N2),USE(YEARS),INS,NUM
PAYMENT ROW(13,42) STRING(@N9.2)
 .
AMOUNT DECIMAL(7) !PRINCIPAL AMOUNT
RATE DECIMAL(7,3) !ANNUAL PERCENTAGE RATE
YEARS DECIMAL(3) !TERM IN YEARS
MONTHS DECIMAL(3) !TERM IN MONTHS
MON_RATE REAL !MONTHLY RATE
TEMP REAL !INTERMEDIATE VALUE

 CODE !START THE CODE SECTION
 ALERT(CTRL_ESC) !ENABLE THE CTRL-ESC KEY

 HELP('PAYMENT') !OPEN THE HELP FILE
 OPEN(SCREEN) !DISPLAY THE SCREEN LAYOUT
 LOOP !LOOP THROUGH THE FIELDS
 ACCEPT !GET A FIELD FROM THE KEYBOARD
 IF KEYCODE() = CTRL_ESC THEN RETURN. !EXIT ON CTRL-ESC

 IF AMOUNT * RATE * YEARS <> 0 !WHEN ALL FIELDS ARE ENTERED:
 MONTHS = YEARS * 12 !COMPUTE MONTHS
 MON_RATE = RATE / 1200 !COMPUTE MONTHLY RATE
 TEMP = 1 / ((1 + MON_RATE) ^ MONTHS) !COMPUTE MONTHLY PAYMENT
 PAYMENT = AMOUNT * (MON_RATE / (1 - TEMP))
 ELSE !OTHERWISE:
 PAYMENT = 0 ! SET MONTHLY PAYMENT TO ZERO
 . !END THE IF STATEMENT
 IF FIELD() = ?YEARS !AFTER THE LAST FIELD
 SELECT(?AMOUNT) ! SELECT THE FIRST FIELD
 . . !END THE IF AND LOOP STATEMENTS






[LISTING TWO]

{ By Jeff Duntemann -- From DDJ for October 1991 }

UNIT Mortgage;

INTERFACE

TYPE
  Payment = RECORD  { One element in the amort. table. }
    PayPrincipal   : Real;
    PayInterest    : Real;
    PrincipalSoFar : Real;
    InterestSoFar  : Real;
    ExtraPrincipal : Real;
    Balance        : Real;
  END;
  PaymentArray   = ARRAY[1..2] OF Payment;  { Dynamic array! }
  PaymentPointer = ^PaymentArray;

  PMortgage = ^TMortgage;
  TMortgage =
    OBJECT
      Periods        : Integer;  { Number of periods in mortgage }
      PeriodsPerYear : Integer;  { Number of periods in a year }
      Principal      : Real;     { Amount of principal in cents }
      Interest       : Real;     { Percentage of interest per *YEAR*}

      MonthlyPI      : Real;     { Monthly payment in cents }
      Payments       : PaymentPointer;  { Array holding payments }
      PaymentSize    : LongInt;  { Size in bytes of payments array }

      CONSTRUCTOR Init(StartPrincipal      : Real;
                       StartInterest       : Real;
                       StartPeriods        : Integer;
                       StartPeriodsPerYear : Integer);

      PROCEDURE SetNewInterestRate(NewRate : Real);
      PROCEDURE Recalc;
      PROCEDURE GetPayment(PaymentNumber   : Integer;
                           VAR ThisPayment : Payment);
      PROCEDURE ApplyExtraPrincipal(PaymentNumber : Integer;
                                    Extra         : Real);
      PROCEDURE RemoveExtraPrincipal(PaymentNumber : Integer);
      DESTRUCTOR Done;
    END;

IMPLEMENTATION
FUNCTION CalcPayment(Principal,InterestPerPeriod : Real;
 NumberOfPeriods : Integer) : Real;
VAR
 Factor : Real;
BEGIN
 Factor := EXP(-NumberOfPeriods * LN(1.0 + InterestPerPeriod));
 CalcPayment := Principal * InterestPerPeriod / (1.0 - Factor)
END;

CONSTRUCTOR TMortgage.Init(StartPrincipal      : Real;
                           StartInterest       : Real;
                           StartPeriods        : Integer;
                           StartPeriodsPerYear : Integer);
VAR
  I : Integer;
  InterestPerPeriod : Real;
BEGIN
  { Set up all the initial state values: }
  Principal := StartPrincipal;
  Interest := StartInterest;
  Periods := StartPeriods;
  PeriodsPerYear := StartPeriodsPerYear;
  { Here we calculate the size that the payment array will occupy. }
  { We retain this because the number of payments may change...and }
  { we'll need to dispose of the array when the object is ditched: }
  PaymentSize := SizeOf(Payment) * Periods;

  { Allocate payment array on the heap: }
  GetMem(Payments,PaymentSize);

  { Initialize extra principal fields of payment array: }
  FOR I := 1 TO Periods DO
    Payments^[I].ExtraPrincipal := 0;
  Recalc; { Calculate the amortization table }
END;

PROCEDURE TMortgage.SetNewInterestRate(NewRate : Real);
BEGIN
 Interest := NewRate;
 Recalc;
END;

{ This method calculates the amortization table for the mortgage. }
{ The table is stored in the array pointed to by Payments. }

PROCEDURE TMortgage.Recalc;
VAR
  I : Integer;

  RemainingPrincipal : Real;
  PaymentCount : Integer;
  InterestThisPeriod : Real;
  InterestPerPeriod : Real;
  HypotheticalPrincipal : Real;
BEGIN
  InterestPerPeriod := Interest/PeriodsPerYear;
  MonthlyPI := CalcPayment(Principal,
                           InterestPerPeriod,
                           Periods);
  { Round the monthly to cents: }
  MonthlyPI := int(MonthlyPI * 100.0 + 0.5) / 100.0;

  { Now generate the amortization table: }
  RemainingPrincipal := Principal;
  PaymentCount := 0;
  FOR I := 1 TO Periods DO
    BEGIN
      Inc(PaymentCount);
      { Calculate the interest this period and round it to cents: }
      InterestThisPeriod :=
        Int((RemainingPrincipal * InterestPerPeriod) * 100 + 0.5) / 100.0;
      { Store values into payments array: }
      WITH Payments^[PaymentCount] DO
        BEGIN
          IF RemainingPrincipal = 0 THEN { Loan's been paid off! }
            BEGIN
              PayInterest := 0;
              PayPrincipal := 0;
              Balance := 0;
            END
          ELSE
            BEGIN
              HypotheticalPrincipal :=
                MonthlyPI - InterestThisPeriod + ExtraPrincipal;
              IF HypotheticalPrincipal > RemainingPrincipal THEN
                PayPrincipal := RemainingPrincipal
              ELSE
                PayPrincipal := HypotheticalPrincipal;
              PayInterest := InterestThisPeriod;
              RemainingPrincipal :=
                RemainingPrincipal - PayPrincipal; { Update running balance }
              Balance := RemainingPrincipal;
            END;
          { Update the cumulative interest and principal fields: }
          IF PaymentCount = 1 THEN
            BEGIN
              PrincipalSoFar := PayPrincipal;
              InterestSoFar := PayInterest;
            END
          ELSE
            BEGIN
              PrincipalSoFar :=
                Payments^[PaymentCount-1].PrincipalSoFar + PayPrincipal;
              InterestSoFar :=
                Payments^[PaymentCount-1].InterestSoFar + PayInterest;
            END;
        END; { WITH }
    END; { FOR }

END; { TMortgage.Recalc }

PROCEDURE TMortgage.GetPayment(PaymentNumber : Integer;
 VAR ThisPayment : Payment);
BEGIN
 ThisPayment := Payments^[PaymentNumber];
END;

PROCEDURE TMortgage.ApplyExtraPrincipal(PaymentNumber : Integer;
 Extra : Real);
BEGIN
 Payments^[PaymentNumber].ExtraPrincipal := Extra;
 Recalc;
END;

PROCEDURE TMortgage.RemoveExtraPrincipal(PaymentNumber : Integer);
BEGIN
 Payments^[PaymentNumber].ExtraPrincipal := 0.0;
 Recalc;
END;

DESTRUCTOR TMortgage.Done;
BEGIN
 FreeMem(Payments,PaymentSize);
END;

END. { MORTGAGE }



































October, 1991
GRAPHICS PROGRAMMING


The Virtues of Affordable Technology: the Sierra Hicolor DAC




Michael Abrash


My, how quickly the PC world changes! Six months ago, I described the Edsun
CEG/DAC as a triumph of inexpensive approximation. That chip was and is an
ingenious bridge between SuperVGA and true color that requires no
modifications to VGA chips or additional memory, yet achieves often-stunning
results. Six months ago, the CEG/DAC was the only affordable path beyond
SuperVGA.
Time and technology march on, and, in this case, technology has marched much
the faster. I have on my desk a SuperVGA card, built around the Tseng Labs
ET4000 VGA chip, 1 Mbyte of RAM, and the Sierra Semiconductor Hicolor DAC
(digital-to-analog converter, the chip that converts pixel values from the VGA
into analog signals for the monitor), that supports an 800x600, 32,768-color
mode. The added cost of the Hicolor DAC over a standard VGA DAC (of which the
Hicolor DAC is a fully compatible superset) to the board manufacturer is less
than $10; I have already seen a Hicolor-based SuperVGA listed in Computer
Shopper for under $200.
To those of us who remember buying IBM EGAs for $1000, there's a certain
degree of unreality to the thought of an 800x600 32K-color VGA for less than
$200.
Understand, now, that I'm not talking about clever bitmap encoding or other
tricky ways of boosting color here. This is the real, 15-bpp, almost
true-color McCoy, beautifully suited to imaging, antialiasing, and virtually
any sort of high-color graphics you might imagine. The Hicolor DAC supports
normal bitmaps that are just like 256-color bitmaps, except that each pixel is
composed of 15 bits spread across 2 bytes. If you know how to program 800x600
256-color mode, you should have no trouble at all programming 800x600
32K-color mode; for the most part, just double the horizontal byte counts.
(Lower-resolution 32K-color modes, such as 640x480, are available. No 1024x768
32K-color mode is supported, not due to any limitation of the Hicolor DAC, but
because no VGA chip currently supports the 1.5 Mbytes of memory that mode
requires. Expect that to change soon.) The 32K-color banking schemes are the
same as in 256-color modes, except that there are half as many pixels in each
bank. Even the complexities of the DAC's programmable palette go away in
32K-color mode, because there is no programmable palette.
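To make "15 bits spread across 2 bytes" and "double the horizontal byte counts" concrete, here is a rough C sketch of the addressing arithmetic. Several details are assumptions rather than facts taken from this column: the 5-5-5 bit order (red in the high bits, blue in the low bits), a 64K banking window, and scan lines with no padding. The function names are hypothetical, and nothing here touches the hardware.

```c
#include <assert.h>

#define BYTES_PER_PIXEL 2           /* 15 bits stored in a 16-bit word */
#define BANK_SIZE       0x10000L    /* assumed 64K banking window */

/* Pack three 5-bit components into one pixel. The bit order (red
   high, blue low) is an assumption; the text above does not state it. */
unsigned int pack_15bpp(unsigned int red, unsigned int green,
                        unsigned int blue)
{
    assert(red < 32 && green < 32 && blue < 32);
    return (red << 10) | (green << 5) | blue;
}

/* Linear byte offset of pixel (x,y): the 256-color formula with the
   horizontal byte count doubled. */
long pixel_offset(int x, int y, int screen_width)
{
    return ((long)y * screen_width + x) * BYTES_PER_PIXEL;
}

/* Which bank the offset falls in, and where within that bank. */
int pixel_bank(long offset)      { return (int)(offset / BANK_SIZE); }
long offset_in_bank(long offset) { return offset % BANK_SIZE; }
```

Under these assumptions each bank holds 32,768 pixels, half of what the same bank holds in a 256-color mode, and an 800x600 screen spans 15 banks (numbered 0 through 14).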
And therein lies the strength of the Hicolor DAC: It's easy to program.
Theoretically, the CEG/DAC can produce higher-color and more precise images
using less display memory than the Hicolor DAC, because CEG color resolutions
of 24-bpp and even higher are possible. Practically speaking, it's hard to
write software -- especially real-time software -- that takes full advantage
of the CEG/DAC's capabilities. On the other hand, it's very easy to extend
existing 256-color SuperVGA code to support the Hicolor DAC, and although 32K
colors is not the same as true color (24-bpp), it's close enough for most
purposes, and astonishingly better than 256 colors. Digitized and rendered
images look terrific on the Hicolor DAC, just as they do on the CEG/DAC -- and
it's a lot easier and much faster to generate such images for the Hicolor DAC.
The Hicolor DAC has three disadvantages. First, it requires twice as much
memory at a given resolution as does an equivalent 256-color or CEG/DAC mode.
This is no longer a significant problem (apart from temporarily precluding a
1024x768 32K-color mode, as explained earlier); memory is cheap, and 1 Mbyte
is becoming standard on SuperVGAs. Secondly, graphics operations can take
considerably longer, simply because there are twice as many bytes of display
memory to be dealt with; however, the latest generation of SuperVGAs provides
for such fast memory access that 32K-color software will probably run faster
than 256-color software did on the first generation of SuperVGAs. Finally, the
Hicolor DAC neither performs gamma correction in hardware nor provides a
built-in look-up table to allow programmable gamma correction.
To refresh your memory, gamma correction is the process of compensating for
the nonlinear response of pixel brightness to input voltage. A pixel with a
green value of 60 is much more than twice as bright as a pixel of value 30.
The Hicolor DAC's lack of built-in gamma correction puts the burden on
software to perform the correction so that antialiasing will work properly,
and images such as digitized photographs will display with the proper
brightness. Software gamma correction is possible, but it's a time-consuming
nuisance; it also decreases the effective color resolution of the Hicolor DAC
for bright colors, because the bright colors supported by the Hicolor DAC are
spaced relatively farther apart than the dim colors.
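A software correction of this kind usually comes down to a small look-up from the intensity you want to the value you store. The sketch below is only an illustration and ASSUMES a display gamma of exactly 2, chosen so that the correction reduces to an integer square root; real monitors differ, and a real table would be built from a measured or published gamma. The names gamma_correct and rounded_sqrt are hypothetical.

```c
#include <assert.h>

#define MAX_LEVEL 31    /* largest 5-bit component value */

/* Square root of n, rounded to the nearest integer. */
static unsigned int rounded_sqrt(unsigned int n)
{
    unsigned int r = 0;
    while ((r + 1) * (r + 1) <= n)
        r++;                        /* r = floor(sqrt(n)) */
    if (n > r * r + r)              /* n lies past (r + 0.5)^2 */
        r++;
    return r;
}

/* Map a desired linear intensity (0..31) to the DAC value to store,
   ASSUMING a display gamma of exactly 2:
   dac = 31 * sqrt(linear / 31) = sqrt(31 * linear). */
unsigned int gamma_correct(unsigned int linear)
{
    assert(linear <= MAX_LEVEL);
    return rounded_sqrt(MAX_LEVEL * linear);
}
```

With this assumed gamma, linear intensities 29 and 30 both map to DAC value 30, while the step from linear 0 to linear 1 jumps from DAC value 0 to 6. That shows in miniature why the correction costs color resolution.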
The lack of gamma correction is, however, a manageable annoyance. On balance,
the Hicolor DAC is true to its heritage: a logical, inexpensive, and painless
extension of SuperVGA. The obvious next steps are 1024x768 in 32K colors, and
800x600 with 24 bpp; heck, 4 Mbytes of display memory (eight 4-Mbit RAMs)
would be enough for 1024x768 24-bpp with room to spare. In short, the Hicolor
DAC appears to be squarely in the mainstream of VGA evolution. (Note that
although most of the first generation of Hicolor boards are built around the
ET4000, which has quietly and for good reason become the preeminent SuperVGA
chip, the Hicolor DAC works with other VGA chips and will surely appear on
SuperVGAs of all sorts in the near future.)
Does that mean that the Hicolor DAC will become a standard? Beats me. I'm out
of the forecasting business; the world changes too fast. The CEG/DAC has a
head start and is showing up in a number of systems, and who knows what else
is in the pipeline? Still, programmers love the Hicolor DAC, and I would be
astonished if there were not an installed base of at least 100,000 by the end
of the year. Draw your own conclusions; but me, I can't wait to do some
antialiased drawing on the Hicolor DAC (and I will, in this column, next
month).
If the CEG/DAC is a triumph of inexpensive approximation, the Hicolor DAC is a
masterpiece of affordable technology. I'd have to call a 1-Mbyte Hicolor
SuperVGA for around $200 the ultimate in graphics cost effectiveness at this
moment -- but don't expect it to hold that title for more than six months.
Things change fast in this industry; $200 true-color in a year, anyone?


Polygon Antialiasing


To my mind, the best thing about the Hicolor DAC is that, for the first time
in the VGA market, it makes fast, general antialiasing possible -- and the
readers of this column will soon see the fruits of that. You see, what I've
been working toward in this column is real-time 3-D, perspective drawing on a
standard PC, without the assistance of any expensive hardware. The object
model I'll be using is polygon-based; hence the fast polygon fill code I've
presented. With mode X (320x240, 256 colors, undocumented by IBM), we now have
a fast, square-pixel, page-flipped, 256-color mode, the best that standard VGA
has to offer. In this mode, it's possible to do not only real-time,
polygon-based perspective drawing and animation, but also relatively
sophisticated effects such as lighting sources, smooth shading, and hidden
surface removal. That's everything we need for real-time 3-D -- but things
could still be better.
Pixels are so large in mode X that polygons have very visibly jagged edges.
These jaggies are the result of the aliasing of which I spoke back in April
and May; that is, distortion of the true image that results from undersampling
at the low pixel rate of the screen. Jaggies are a serious problem; the whole
point of real-time 3-D is to create the illusion of reality, but jaggies
quickly destroy that illusion, particularly when they're crawling along the
edges of moving objects. More frequent sampling (higher resolution) helps, but
not as much as you'd think. What's really needed is the ability to blend
colors arbitrarily within a single pixel, the better to reflect the nature of
the true image in the neighborhood of that pixel -- that is, antialiasing. The
pixels are still as large as ever, but with the colors blended properly, the
eye processes the screen as a continuous image, rather than as a collection of
discrete pixels, and perceives the image at much higher resolution than the
display actually supports.
There are many ways to antialias, some of them fast enough for real-time
processing, and they can work wonders in improving image appearance -- but
they all require a high degree of freedom in choosing colors. For many sorts
of graphics, 256 simultaneous colors is fine, but it's not enough for
generally useful antialiasing (although we will shortly see an interesting
sort of special-case antialiasing with 256 colors). Therefore, the one element
lacking in my quest for affordable real-time 3-D has been good antialiasing.
No longer. The Hicolor DAC provides plenty of colors (although I sure do wish
the software didn't have to do gamma correction!), and makes them available in
a way that allows for efficient programming. In a couple of months, I'm going
to start presenting 3-D code; initially, this code will be for mode X, but you
can expect to see some antialiasing code for the Hicolor DAC soon.


256-Color Antialiasing


Next month, I'll explain how the Hicolor DAC works -- how to detect it, how to
initialize it, the pixel format, banking, and so on -- and then I'll
demonstrate Hicolor antialiasing. This month, I'm going to demonstrate
antialiasing on a standard VGA, partly to introduce the uncomplicated but
effective antialiasing technique that I'll use next month, partly so you can
see the improvement that even quick and dirty antialiasing produces, and
partly to show the sorts of interesting things that can be done with the
palette in 256-color mode.
I'm going to draw a cube in perspective. For reference, Listing One (page 173)
draws the cube in mode 13h (320x200, 256 colors) using the standard polygon
fill routine that I developed back in February and March. No, the perspective
calculations aren't performed in Listing One; I just got the polygon vertices
out of 3-D software that I'm developing and hardwired them into Listing One.
Never fear, though; we'll get to true 3-D soon enough.
Listing One draws a serviceable cube, but the edges of the cube are very
jagged. Imagine the cube spinning, and the jaggies rippling along its edges,
and you'll see the full dimensions of the problem.
Listings Two (page 173) and Three (page 173) together draw the same cube, but
with simple, unweighted antialiasing. The results are much better than Listing
One; there's no question in my mind as to which cube I'd rather see in my
graphics software.
The antialiasing technique used in Listing Two is straightforward. Each
polygon is scanned out in the usual way, but at twice the screen's resolution
both horizontally and vertically (which I'll call "double-resolution,"
although it produces four times as many pixels), with the double-resolution
pixels drawn to a memory buffer, rather than directly to the screen. Then,
after all the polygons have been drawn to the memory buffer, a second pass is
performed; this pass looks at the colors stored in each set of four
double-resolution pixels, and draws to the screen a single pixel that reflects
the colors and intensities of the four double-resolution pixels that make it
up, as shown in Figure 1. In other words, Listing Two temporarily draws the
polygons at double resolution, then uses the extra information from the
double-resolution bitmap to generate an image with an effective resolution
considerably higher than the screen's actual 320x200 capabilities.
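The second pass can be pictured as a 2x2 box filter. The sketch below is a generic illustration of that idea, not the code in Listing Two (which does the blending through the palette, as described next): it assumes a buffer of 8-bit intensities, one byte per double-resolution pixel, and the names blend4 and downsample_2x2 are hypothetical.

```c
#include <assert.h>

/* Unweighted blend of four double-resolution pixels: a rounded average. */
unsigned char blend4(unsigned int a, unsigned int b,
                     unsigned int c, unsigned int d)
{
    return (unsigned char)((a + b + c + d + 2) / 4);
}

/* Collapse a double-resolution buffer to screen resolution. src holds
   src_width * src_height intensities (0..255); dst receives
   (src_width / 2) * (src_height / 2) blended pixels. */
void downsample_2x2(const unsigned char *src, int src_width,
                    int src_height, unsigned char *dst)
{
    int x, y;

    assert(src_width % 2 == 0 && src_height % 2 == 0);
    for (y = 0; y < src_height; y += 2) {
        const unsigned char *row0 = src + (long)y * src_width;
        const unsigned char *row1 = row0 + src_width;
        for (x = 0; x < src_width; x += 2)
            *dst++ = blend4(row0[x], row0[x + 1], row1[x], row1[x + 1]);
    }
}
```

A screen pixel that a polygon edge covers halfway comes out at half intensity, which is what softens the jaggies.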
Two interesting tricks are employed in Listing Two. First, it would be best,
from the standpoint of speed, if the entire screen could be drawn to the
double-resolution intermediate buffer in a single pass. Unfortunately, a
buffer capable of holding one full 640x400 screen would be 64,000 or more
bytes in size -- too much memory for most programs to spare. Consequently,
Listing Two instead scans out the image just two double-resolution scan lines
(corresponding to one screen scan line) at a time. That is, the entire image
is scanned once for every two double-resolution scan lines, and all
information not concerning the two lines of current interest is thrown away.
This banding is implemented in Listing Three, which accepts a full list of
scan lines to draw, but actually draws only those lines within the current
scan line band. Listing Three also draws to the intermediate buffer, rather
than to the screen.
The polygon-scanning code from February was hard-wired to call the function
DrawHorizontalLineList, which drew to the display; this is the polygon-drawing
code called by Listing One. That was fine so long as there was only one
possible drawing target, but now we have two possible targets -- the display
(for nonantialiased drawing), and the intermediate buffer (for antialiased
drawing). It's desirable to be able to mix the two, even within a single
screen, because antialiased drawing looks better but nonantialiased is faster.
Consequently, I have modified Listing One from February -- the function
FillConvexPolygon -- to create FillCnvxPolyDrvr, which is the same as
FillConvexPolygon, except that it accepts as a parameter the name of the
function to be used to draw the scanned-out polygon. FillCnvxPolyDrvr is so
similar to FillConvexPolygon that it's not worth taking up printed space to
show it in its entirety. Listing Four (page 174) shows the differences between
the two, and the new module will be available in its entirety as part of the
code from this issue, under the name FILCNVXD.C.
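The idea of handing the drawing routine to the polygon filler as a parameter can be sketched with an ordinary C function pointer. Everything below is a hypothetical stand-in: struct Span, DrawSpansFn, fill_spans_with, and the two drawing routines are illustrative names, not the types or signatures used in FILCNVXD.C, and the "drawing" merely counts pixels so the sketch can run anywhere.

```c
#include <assert.h>

/* Hypothetical description of one horizontal run of pixels. */
struct Span { int y, x_start, x_end; };     /* x_end is inclusive */

/* Any routine that can draw a list of spans in one color. */
typedef void (*DrawSpansFn)(const struct Span *spans, int count, int color);

/* Pixel counters standing in for real screen and buffer writes. */
static long pixels_to_screen = 0;
static long pixels_to_buffer = 0;

static long span_pixels(const struct Span *spans, int count)
{
    long total = 0;
    int i;
    for (i = 0; i < count; i++)
        total += spans[i].x_end - spans[i].x_start + 1;
    return total;
}

void draw_to_screen(const struct Span *spans, int count, int color)
{
    (void)color;                    /* unused in this stand-in */
    pixels_to_screen += span_pixels(spans, count);
}

void draw_to_buffer(const struct Span *spans, int count, int color)
{
    (void)color;                    /* unused in this stand-in */
    pixels_to_buffer += span_pixels(spans, count);
}

/* Driver: the scan conversion would happen here; the caller decides
   where the scanned-out spans end up. */
void fill_spans_with(const struct Span *spans, int count, int color,
                     DrawSpansFn draw)
{
    assert(draw != 0);
    draw(spans, count, color);
}
```

The same scanned-out polygon can then go to either target simply by passing draw_to_screen or draw_to_buffer.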
The second interesting trick in Listing Two is the way in which the palette is
stacked to allow unweighted antialiasing. Listing Two arranges the palette so
that rather than 256 independent colors, we'll work with four-way combinations
within each pixel of three independent colors (red, green, and blue), with
each pixel accurately reflecting the intensities of each of the four color
components that it contains. This allows fast and easy mapping from four
double-resolution pixels to the single screen pixel to which they correspond.
Figure 2 illustrates the mapping of subpixels (double-resolution pixels)
through the palette to screen pixels. This palette organization converts mode
13h from a 256-color mode to a four-color antialiasing mode.
It's worth noting that many palette registers are set to identical values by
Listing Two, because only the values of the subpixels matter; their
arrangement does not. For example, the pixel values 0x01, 0x04, 0x10, and 0x40 all
map to 25 percent blue. By using a table look-up to map sets of four
double-resolution pixels to screen pixel values, more than half the palette
could be freed up for drawing with other colors.
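One way to picture the stacked palette is to compute what each of the 256 pixel values should display. The sketch below treats a pixel value as four 2-bit subpixels and gives each color component 25 percent of full intensity per subpixel of that color. The text confirms only that subpixel code 1 is blue; the codes assumed for black, green, and red, the 0..63 DAC scale, and all the names are illustrative assumptions, not the values set by Listing Two.

```c
#include <assert.h>

/* One palette register: 6-bit DAC components, 0..63 (assumed scale). */
struct PaletteEntry { unsigned char red, green, blue; };

/* Assumed subpixel codes; only "1 is blue" comes from the text. */
enum { SUB_BLACK = 0, SUB_BLUE = 1, SUB_GREEN = 2, SUB_RED = 3 };

/* Stack four 2-bit subpixels into one screen pixel value. */
unsigned int stack_subpixels(unsigned int s0, unsigned int s1,
                             unsigned int s2, unsigned int s3)
{
    assert(s0 < 4 && s1 < 4 && s2 < 4 && s3 < 4);
    return s0 | (s1 << 2) | (s2 << 4) | (s3 << 6);
}

/* Color a palette register must hold so that the pixel value shows
   the blend of its four subpixels. */
struct PaletteEntry stacked_palette_entry(unsigned int pixel_value)
{
    unsigned int count[4] = { 0, 0, 0, 0 };
    struct PaletteEntry entry;
    int i;

    assert(pixel_value < 256);
    for (i = 0; i < 4; i++)         /* tally subpixels by color code */
        count[(pixel_value >> (i * 2)) & 3]++;

    entry.red   = (unsigned char)(count[SUB_RED]   * 63 / 4);
    entry.green = (unsigned char)(count[SUB_GREEN] * 63 / 4);
    entry.blue  = (unsigned char)(count[SUB_BLUE]  * 63 / 4);
    return entry;
}
```

Because only the tally matters, 0x01, 0x04, 0x10, and 0x40 all produce the same entry, which is the duplication noted above.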


Unweighted Antialiasing: How Good?


Is the antialiasing used in Listing Two the finest possible antialiasing
technique? It is not. It is an unweighted antialiasing technique, meaning that
no accounting is made for how close to the center of a pixel a polygon edge
might be. The edges are also biased a half-pixel or so in some cases, so
registration with the underlying image isn't perfect. Nonetheless, the
technique used in Listing Two produces attractive results, which is what
really matters; all screen displays are approximations, and unweighted
antialiasing is certainly good enough for PC animation applications.
Unweighted antialiasing can also support good performance, although this is
not the case in Listings Two and Three, where I have opted for clarity rather
than performance. Increasing the number of lines drawn on each pass, or
reducing the area processed to the smallest possible bounding rectangle, would
help improve performance, as, of course, would the use of assembly language.
If there's room, I'll demonstrate some of these techniques next month.
For further information on antialiasing, you might check out the standard
reference: Computer Graphics, by Foley and van Dam. Michael Covington's
"Smooth Views," in the May, 1990 Byte, provides a short but meaty discussion
of unweighted line antialiasing.
As relatively good as it looks, Listing Two is still watered-down
antialiasing, even of the unweighted variety. For all our clever palette
stacking, we have only five levels of each color component available; that's a
far cry from the 32 levels of the Hicolor DAC, or the 256 levels of true
color. The limitations of 256-color modes, even with the palette, are showing
through.
Next month, 15-bpp antialiasing.


The Mode X Mode Set Bug, Revisited



Two months back, I added a last-minute note to this column describing a fix to
the mode X mode set code that I presented in the July column. I'd like to
describe how this bug slipped past me, as an illustration of why it's so
difficult to write flawless software nowadays. The key is this: The PC world
is so huge and diverse that it's a sure thing that someone, somewhere, will
eventually get clobbered by even the most innocuous bug -- a bug that you
might well not have found if you had spent the rest of your life doing nothing
but beta testing. It's like the thought that 100 monkeys, typing endlessly,
would eventually write the complete works of Shakespeare; there are 50,000,000
monkeys out there banging on keyboards and mousing around, and they will
inevitably find any holes you leave in your software.
In writing the mode X mode set code, I started by modifying known-good code. I
tried the final version of the code on both of my computers with five
different VGAs, and I had other people test it out on their systems. In short,
I put the code through all the hoops I had available, and then I sent it out
to be beaten on by 100,000 DDJ readers. It took all of one day for someone to
find a bug.
The code I started with used the VGA's 28-MHz clock. Mode X should have used
the 25-MHz clock, a simple matter of setting bit 2 of the Miscellaneous Output
register (3C2h) to 0 instead of 1.
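In C terms, the one-bit difference looks like the sketch below. It only manipulates a byte value and does no port I/O. The port number and bit position come from the text above, while the function names and the idea of passing the register value in as a parameter are assumptions for illustration. A real mode set would write the result to the port with whatever port-output routine the compiler provides.

```c
#include <assert.h>

#define MISC_OUTPUT_WRITE_PORT 0x3C2    /* Miscellaneous Output register */
#define CLOCK_SELECT_BIT2      0x04     /* bit 2: 1 = 28 MHz, 0 = 25 MHz */

/* Nonzero if the value would select the 28-MHz clock. */
int uses_28mhz_clock(unsigned char misc_value)
{
    return (misc_value & CLOCK_SELECT_BIT2) != 0;
}

/* Return the value with bit 2 cleared, selecting the 25-MHz clock
   and leaving every other bit as it was. */
unsigned char select_25mhz_clock(unsigned char misc_value)
{
    unsigned char result = (unsigned char)(misc_value & ~CLOCK_SELECT_BIT2);
    assert(!uses_28mhz_clock(result));
    return result;
}
```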
Alas, I neglected to change that single bit, so frames were drawn at a faster
rate than they should have been; however, both of my monitors are
multifrequency types, and they automatically compensated for the faster frame
rate. Consequently, my clock-selection bug was invisible and innocuous --
until all those monkeys started banging on it.
IBM makes only fixed-frequency VGA monitors, which require very specific frame
rates; if they don't get what you've told them to expect, the image rolls --
and that's what the July mode X mode set code did on fixed-frequency monitors.
The corrected version, shown in Listing Five (page 174), selects the 25-MHz
clock, and works just fine on fixed-frequency monitors.
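To make the one-bit fix concrete, here is a minimal C sketch of the clock-select arithmetic. It assumes the standard VGA layout, in which bits 3-2 of the Miscellaneous Output register select the dot clock (00 for 25 MHz, 01 for 28 MHz). The function and macro names are hypothetical, and 0xE7 is assumed as the faulty value only because it is Listing Five's 0E3h with bit 2 set. The sketch only computes register values; it does not touch the hardware.

```c
/* Clock-select field of the VGA Miscellaneous Output register (3C2h):
   bits 3-2, where 00 selects the 25-MHz clock and 01 the 28-MHz clock.
   These names are illustrative, not taken from the listings. */
#define MISC_CLOCK_SELECT_MASK 0x0C
#define MISC_CLOCK_25MHZ       0x00
#define MISC_CLOCK_28MHZ       0x04

/* Returns misc_value with its clock-select field replaced by clock_bits;
   every other bit (sync polarity, I/O address select, etc.) is kept. */
unsigned char SetClockSelect(unsigned char misc_value,
                             unsigned char clock_bits)
{
    return (unsigned char)((misc_value & ~MISC_CLOCK_SELECT_MASK) |
                           (clock_bits & MISC_CLOCK_SELECT_MASK));
}

/* Returns nonzero if misc_value selects the 28-MHz clock. */
int Selects28MHzClock(unsigned char misc_value)
{
    return (misc_value & MISC_CLOCK_SELECT_MASK) == MISC_CLOCK_28MHZ;
}
```

Under these assumptions, SetClockSelect(0xE7, MISC_CLOCK_25MHZ) yields 0xE3, the value Listing Five writes to the Miscellaneous Output register.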
Why didn't I catch this bug? Neither I nor a single one of my testers had a
fixed-frequency monitor! This nicely illustrates how difficult it is these
days to test code in all the PC-compatible environments in which it might run.
The problem is particularly severe for small developers, who can't afford to
buy every model from every manufacturer; just imagine trying to test
network-aware software in all possible configurations.
When people ask why software isn't bulletproof; why it crashes or doesn't
coexist with certain programs; why PC clones aren't always compatible; why, in
short, the myriad of irritations of using a PC exist -- this is a big part of
the reason. That's just the price we pay for unfettered creativity and vast
choice in the PC market.
Unfettered for the moment; but consider AT&T's patent on backing store, the
"esoteric" idea of storing an obscured area of a window in a buffer so as to
be able to redraw it quickly. It took me all of ten minutes to independently
invent that one five years ago. Better yet, check out the letters to the
editor in the July Programmer's Journal, about which I will say no more
because it sets my teeth on edge. We'd all better hope that no one patents
"patterned tactile-pressure information input," that is, typing. Trust
50,000,000 monkeys to come up with a system as ridiculous as this.

_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash


[LISTING ONE]

/* Demonstrates non-antialiased drawing in 256 color mode. Tested with
 Borland C++ 2.0 in C mode in the small model. */

#include <conio.h>
#include <dos.h>
#include "polygon.h"

/* Draws the polygon described by the point list PointList in color
 Color with all vertices offset by (X,Y) */
#define DRAW_POLYGON(PointList,Color,X,Y) \
   Polygon.Length = sizeof(PointList)/sizeof(struct Point); \
   Polygon.PointPtr = PointList; \
   FillConvexPolygon(&Polygon, Color, X, Y);

void main(void);
extern int FillConvexPolygon(struct PointListHeader *, int, int, int);

/* Palette RGB settings to load the first four palette locations with
 black, pure blue, pure green, and pure red */
static char Palette[4*3] = {0, 0, 0, 0, 0, 63, 0, 63, 0, 63, 0, 0};

void main()
{
   struct PointListHeader Polygon;
   static struct Point Face0[] =
         {{198,138},{211,89},{169,44},{144,89}};
   static struct Point Face1[] =
         {{153,150},{198,138},{144,89},{105,113}};
   static struct Point Face2[] =
         {{169,44},{133,73},{105,113},{144,89}};
   union REGS regset;
   struct SREGS sregs;

   /* Set the display to VGA mode 13h, 320x200 256-color mode */
   regset.x.ax = 0x0013; int86(0x10, &regset, &regset);

   /* Set color 0 to black, color 1 to pure blue, color 2 to pure
      green, and color 3 to pure red */
   regset.x.ax = 0x1012;   /* load palette block BIOS function */
   regset.x.bx = 0;        /* start with palette register 0 */
   regset.x.cx = 4;        /* set four palette registers */
   regset.x.dx = (unsigned int) Palette;
   segread(&sregs);
   sregs.es = sregs.ds;    /* point ES:DX to Palette */
   int86x(0x10, &regset, &regset, &sregs);

   /* Draw the cube */
   DRAW_POLYGON(Face0, 3, 0, 0);
   DRAW_POLYGON(Face1, 2, 0, 0);
   DRAW_POLYGON(Face2, 1, 0, 0);
   getch();   /* wait for a keypress */

   /* Return to text mode and exit */
   regset.x.ax = 0x0003;   /* AL = 3 selects 80x25 text mode */
   int86(0x10, &regset, &regset);
}







[LISTING TWO]

/* Demonstrates unweighted antialiased drawing in 256 color mode.
 Tested with Borland C++ 2.0 in C mode in the small model. */

#include <conio.h>
#include <dos.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "polygon.h"

/* Draws the polygon described by the point list PointList in color
 Color, with all vertices offset by (X,Y), to ScanLineBuffer, at
 double horizontal and vertical resolution */
#define DRAW_POLYGON_DOUBLE_RES(PointList,Color,x,y) \
   Polygon.Length = sizeof(PointList)/sizeof(struct Point); \
   Polygon.PointPtr = PointTemp; \
   /* Double all vertical & horizontal coordinates */ \
   for (k=0; k<sizeof(PointList)/sizeof(struct Point); k++) { \
      PointTemp[k].X = PointList[k].X * 2; \
      PointTemp[k].Y = PointList[k].Y * 2; \
   } \
   FillCnvxPolyDrvr(&Polygon, Color, x, y, DrawBandedList);

#define SCREEN_WIDTH 320
#define SCREEN_HEIGHT 200
#define SCREEN_SEGMENT 0xA000
#define SCAN_BAND_WIDTH (SCREEN_WIDTH*2) /* # of double-res pixels
 across scan band */
#define BUFFER_SIZE (SCREEN_WIDTH*2*2) /* enough space for one scan
 line scanned out at double
 resolution horz and vert */
void main(void);
void DrawPixel(int, int, char);
int ColorComponent(int, int);
extern int FillCnvxPolyDrvr(struct PointListHeader *, int, int, int,
 void (*)());
extern void DrawBandedList(struct HLineList *, int);


/* Pointer to buffer in which double-res scanned data will reside */
unsigned char *ScanLineBuffer;
int ScanBandStart, ScanBandEnd; /* top & bottom of each double-res
 band we'll draw to ScanLineBuffer */
int ScanBandWidth = SCAN_BAND_WIDTH; /* # pixels across scan band */
static char Palette[256*3];

void main()
{
   int i, j, k;
   struct PointListHeader Polygon;
   struct Point PointTemp[4];
   static struct Point Face0[] =
         {{198,138},{211,89},{169,44},{144,89}};
   static struct Point Face1[] =
         {{153,150},{198,138},{144,89},{105,113}};
   static struct Point Face2[] =
         {{169,44},{133,73},{105,113},{144,89}};
   unsigned char Megapixel;
   union REGS regset;
   struct SREGS sregs;

   if ((ScanLineBuffer = malloc(BUFFER_SIZE)) == NULL) {
      printf("Couldn't get memory\n");
      exit(1);   /* nonzero exit status reports the failure */
   }

   /* Set the display to VGA mode 13h, 320x200 256-color mode */
   regset.x.ax = 0x0013; int86(0x10, &regset, &regset);

   /* Stack the palette for the desired megapixel effect, with each
      2-bit field representing 1 of 4 double-res pixels in one of four
      colors */
   for (i=0; i<256; i++) {
      Palette[i*3] = ColorComponent(i, 3);   /* red component */
      Palette[i*3+1] = ColorComponent(i, 2); /* green component */
      Palette[i*3+2] = ColorComponent(i, 1); /* blue component */
   }
   regset.x.ax = 0x1012;   /* load palette block BIOS function */
   regset.x.bx = 0;        /* start with palette register 0 */
   regset.x.cx = 256;      /* set all 256 palette registers */
   regset.x.dx = (unsigned int) Palette;
   segread(&sregs);
   sregs.es = sregs.ds;    /* point ES:DX to Palette */
   int86x(0x10, &regset, &regset, &sregs);

   /* Scan out the polygons at double resolution one screen scan line
      at a time (two double-res scan lines at a time) */
   for (i=0; i<SCREEN_HEIGHT; i++) {
      /* Set the band dimensions for this pass */
      ScanBandEnd = (ScanBandStart = i*2) + 1;
      /* Clear the drawing buffer */
      memset(ScanLineBuffer, 0, BUFFER_SIZE);
      /* Draw the current band of the cube to the scan line buffer */
      DRAW_POLYGON_DOUBLE_RES(Face0, 3, 0, 0);
      DRAW_POLYGON_DOUBLE_RES(Face1, 2, 0, 0);
      DRAW_POLYGON_DOUBLE_RES(Face2, 1, 0, 0);

      /* Coalesce the double-res pixels into normal screen pixels
         and draw them */
      for (j=0; j<SCREEN_WIDTH; j++) {
         Megapixel = (ScanLineBuffer[j*2] << 6) +
                     (ScanLineBuffer[j*2+1] << 4) +
                     (ScanLineBuffer[j*2+SCAN_BAND_WIDTH] << 2) +
                     (ScanLineBuffer[j*2+SCAN_BAND_WIDTH+1]);
         DrawPixel(j, i, Megapixel);
      }
   }

   getch();   /* wait for a keypress */

   /* Return to text mode and exit */
   regset.x.ax = 0x0003;   /* AL = 3 selects 80x25 text mode */
   int86(0x10, &regset, &regset);
}

/* Draws a pixel of color Color at (X,Y) in mode 13h */
void DrawPixel(int X, int Y, char Color)
{
   char far *ScreenPtr;

   ScreenPtr = (char far *)MK_FP(SCREEN_SEGMENT, Y*SCREEN_WIDTH+X);
   *ScreenPtr = Color;
}

/* Returns the gamma-corrected value representing the number of
   double-res pixels containing the specified color component in a
   megapixel with the specified value */
int ColorComponent(int Value, int Component)
{
   /* Palette settings for 0%, 25%, 50%, 75%, and 100% brightness,
      assuming a gamma value of 2.3 */
   static int GammaTable[] = {0, 34, 47, 56, 63};
   int i;

   /* Add up the number of double-res pixels of the specified color
      in a megapixel of this value */
   i = (((Value & 0x03) == Component) ? 1 : 0) +
       ((((Value >> 2) & 0x03) == Component) ? 1 : 0) +
       ((((Value >> 4) & 0x03) == Component) ? 1 : 0) +
       ((((Value >> 6) & 0x03) == Component) ? 1 : 0);
   /* Look up brightness of the specified color component in a
      megapixel of this value */
   return GammaTable[i];
}
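
The megapixel encoding used in Listing Two can be exercised on its own, away from the VGA. What follows is a sketch in portable C, not part of the original listings: PackMegapixel, CountSubpixels, and Brightness are hypothetical names, and the table is simply the gamma table from ColorComponent. It shows how four 2-bit double-res pixels become one palette index, and how the palette entry for that index is derived by counting matching subpixels.

```c
/* Packs four 2-bit colors (0-3) into one megapixel value, in the same
   order Listing Two uses: upper-left in bits 7-6, upper-right in bits
   5-4, lower-left in bits 3-2, lower-right in bits 1-0. */
unsigned char PackMegapixel(int ul, int ur, int ll, int lr)
{
    return (unsigned char)(((ul & 3) << 6) | ((ur & 3) << 4) |
                           ((ll & 3) << 2) | (lr & 3));
}

/* Counts how many of the four 2-bit fields in value equal color. */
int CountSubpixels(int value, int color)
{
    int shift, count = 0;

    for (shift = 0; shift < 8; shift += 2)
        if (((value >> shift) & 3) == color)
            count++;
    return count;
}

/* Returns the palette level (0-63) for a count of 0-4 matching
   subpixels, using the gamma-corrected table from Listing Two. */
int Brightness(int count)
{
    static const int GammaTable[] = {0, 34, 47, 56, 63};

    return GammaTable[count];
}
```

For example, a megapixel whose top two subpixels are color 3 (red), with color 1 (blue) at lower left and black at lower right, packs to 0xF4; palette entry 0xF4 then gets a red level of 47 (two of four subpixels) and a blue level of 34 (one of four).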






[LISTING THREE]

/* Draws pixels from the list of horizontal lines passed in; drawing
 takes place only for scan lines between ScanBandStart and
 ScanBandEnd, inclusive; drawing goes to ScanLineBuffer, with
 the scan line at ScanBandStart mapping to the first scan line in
 ScanLineBuffer. Intended for use in unweighted antialiasing,
 whereby a polygon is scanned out into a buffer at a multiple of the
 screen's resolution, and then the scanned-out info in the buffer is
 grouped into megapixels that are mapped to the closest
 approximation the screen supports and drawn. Tested with Borland
 C++ 2.0 in C mode in the small model */

#include <string.h>
#include <dos.h>
#include "polygon.h"

extern unsigned char *ScanLineBuffer; /* drawing goes here */
extern int ScanBandStart, ScanBandEnd; /* limits of band to draw */
extern int ScanBandWidth; /* # of pixels across scan band */

void DrawBandedList(struct HLineList * HLineListPtr, int Color)
{
   struct HLine *HLinePtr;
   int Length, Width, YStart = HLineListPtr->YStart;
   unsigned char *BufferPtr;

   /* Done if fully off the bottom or top of the band */
   if (YStart > ScanBandEnd) return;
   Length = HLineListPtr->Length;
   if ((YStart + Length) <= ScanBandStart) return;

   /* Point to the XStart/XEnd descriptor for the first (top)
      horizontal line */
   HLinePtr = HLineListPtr->HLinePtr;

   /* Confine drawing to the specified band */
   if (YStart < ScanBandStart) {
      /* Skip ahead to the start of the band */
      Length -= ScanBandStart - YStart;
      HLinePtr += ScanBandStart - YStart;
      YStart = ScanBandStart;
   }
   if (Length > (ScanBandEnd - YStart + 1))
      Length = ScanBandEnd - YStart + 1;

   /* Point to the start of the first scan line on which to draw */
   BufferPtr = ScanLineBuffer + (YStart - ScanBandStart) *
         ScanBandWidth;

   /* Draw each horizontal line within the band in turn, starting with
      the top one and advancing one line each time */
   while (Length-- > 0) {
      /* Draw the whole horizontal line if it has a positive width */
      if ((Width = HLinePtr->XEnd - HLinePtr->XStart + 1) > 0)
         memset(BufferPtr + HLinePtr->XStart, Color, Width);
      HLinePtr++;                  /* point to next scan line X info */
      BufferPtr += ScanBandWidth;  /* point to next scan line start */
   }
}








[LISTING FOUR]

/* The changes required to convert the function FillConvexPolygon,
 from Listing 1 in the Feb, 1991, column, into FillCnvxPolyDrvr.
 FillConvexPolygon was hardwired to call DrawHorizontalLineList to
 draw to the display; FillCnvxPolyDrvr is more flexible because it
 draws via the driver passed in as the DrawListFunc parameter */

/****** Delete this line ******/
extern void DrawHorizontalLineList(struct HLineList *, int);

/****** Change this... ******/
int FillConvexPolygon(struct PointListHeader * VertexList, int Color,
 int XOffset, int YOffset)
/****** ...to this ******/
int FillCnvxPolyDrvr(struct PointListHeader * VertexList, int Color,
 int XOffset, int YOffset, void (*DrawListFunc)())

/****** Change this... ******/
 DrawHorizontalLineList(&WorkingHLineList, Color);
/****** ...to this ******/
 (*DrawListFunc)(&WorkingHLineList, Color);







[LISTING FIVE]


; Mode X (320x240, 256 colors) mode set routine. Works on all VGAs.
; ****************************************************************
; * Revised 6/19/91 to select correct clock; fixes vertical roll *
; * problems on fixed-frequency (IBM 851X-type) monitors. *
; ****************************************************************
; C near-callable as: void Set320x240Mode(void);
; Tested with TASM 2.0.
; Modified from public-domain mode set code by John Bridges.

SC_INDEX equ 03c4h ;Sequence Controller Index
CRTC_INDEX equ 03d4h ;CRT Controller Index
MISC_OUTPUT equ 03c2h ;Miscellaneous Output register
SCREEN_SEG equ 0a000h ;segment of display memory in mode X

 .model small
 .data
; Index/data pairs for CRT Controller registers that differ between
; mode 13h and mode X.
CRTParms label word
 dw 00d06h ;vertical total
 dw 03e07h ;overflow (bit 8 of vertical counts)
 dw 04109h ;cell height (2 to double-scan)
 dw 0ea10h ;v sync start
 dw 0ac11h ;v sync end and protect cr0-cr7
 dw 0df12h ;vertical displayed
 dw 00014h ;turn off dword mode
 dw 0e715h ;v blank start
 dw 00616h ;v blank end
 dw 0e317h ;turn on byte mode
CRT_PARM_LENGTH equ (($-CRTParms)/2)

 .code
 public _Set320x240Mode
_Set320x240Mode proc near
 push bp ;preserve caller's stack frame
 push si ;preserve C register vars
 push di ; (don't count on BIOS preserving anything)

 mov ax,13h ;let the BIOS set standard 256-color
 int 10h ; mode (320x200 linear)

 mov dx,SC_INDEX
 mov ax,0604h
 out dx,ax ;disable chain4 mode
 mov ax,0100h
 out dx,ax ;synchronous reset while setting Misc Output
 ; for safety, even though clock unchanged
 mov dx,MISC_OUTPUT
 mov al,0e3h
 out dx,al ;select 25 MHz dot clock & 60 Hz scanning rate

 mov dx,SC_INDEX
 mov ax,0300h
 out dx,ax ;undo reset (restart sequencer)

 mov dx,CRTC_INDEX ;reprogram the CRT Controller
 mov al,11h ;VSync End reg contains register write
 out dx,al ; protect bit
 inc dx ;CRT Controller Data register
 in al,dx ;get current VSync End register setting
 and al,7fh ;remove write protect on various
 out dx,al ; CRTC registers
 dec dx ;CRT Controller Index
 cld
 mov si,offset CRTParms ;point to CRT parameter table
 mov cx,CRT_PARM_LENGTH ;# of table entries
SetCRTParmsLoop:
 lodsw ;get the next CRT Index/Data pair
 out dx,ax ;set the next CRT Index/Data pair
 loop SetCRTParmsLoop

 mov dx,SC_INDEX
 mov ax,0f02h
 out dx,ax ;enable writes to all four planes
 mov ax,SCREEN_SEG ;now clear all display memory, 8 pixels
 mov es,ax ; at a time
 sub di,di ;point ES:DI to display memory
 sub ax,ax ;clear to zero-value pixels
 mov cx,8000h ;# of words in display memory
 rep stosw ;clear all of display memory

 pop di ;restore C register vars
 pop si
 pop bp ;restore caller's stack frame
 ret

_Set320x240Mode endp
 end




October, 1991
PROGRAMMER'S BOOKSHELF


You Could Look it Up




Andrew Schulman


It has recently come to my attention that "no one reads computer books." In
fact, my coworkers delight in telling me this, especially when I've just
happened to mention that I was up the night before, writing one.
Okay, so maybe curling up with 80x86 Architecture and Programming, Programming
Windows, Using C++, Computer Architecture: A Quantitative Approach, or any of
the other books we've reviewed in these pages isn't your idea of a good time.
Some of you may even be distrustful of all forms of prose, and just want to be
given some sample source code and -- as one reader put it -- "the facts."
No doubt about it, you can get by without prose, that is, without
explanations. But you can't do your job without hard-core reference material.
It's time for "Programmer's Bookshelf" to take on the sort of books that
programmers might actually keep on their desks and use every day: reference
manuals.


The Oxford Dictionary


I'd like to start off with a somewhat unconventional manual, though: a
dictionary of computing. Did you know that such a thing even exists? Even many
of the writers I know don't seem to use one. But a good computer dictionary
can be a remarkably useful tool.
The Oxford Dictionary of Computing, now in its third edition, has recently
come out in paperback. Any reader of Dr. Dobb's will benefit from owning a
copy of this handy, inexpensive volume. The scope of the book can be seen from
the entries on one randomly selected pair of pages; see Example 1.
Example 1: Entries from Oxford's Dictionary of Computing

 Trojan horse
 Tron (real-time operating system nucleus)
 trouble shooting
 true complement
 truncation (see roundoff error)
 trunk
 trunk circuit
 trusted
 truth table
 TSR (see also hot-key)
 TTL
 T-type flip-flop
 Turbo languages
 Turing computability
 Turing machine (TM)

This also provides a glimpse at the wide range of the field of computing
itself: everything from mathematical logic and combinatorics, security issues,
electronics, real-time operating systems, and switching theory, to hacks and
mass-market commodity compilers.
I use this book whenever I come across or need to use a term that, when I'm
being honest with myself, I only half understand. For example, in a manual I
recently wrote for Phar Lap, I needed to explain the difference between
interrupt and exception. What a fool, you say: Everybody knows that! But try
explaining it now, out loud. You probably have a fuzzy "sense" of the
distinction between these two words, but not a precise definition. Well,
that's what dictionaries are for.
The cross-references in a dictionary such as this are useful when you know
only a little bit about a subject, and would like to learn a few of the key
issues and maybe pick up a few of the key terms (perhaps so you can impress
your coworkers). For instance, let's say that I am interested in learning more
about data compression, but don't know (or can't remember) anything specific
about it. The Oxford dictionary entry on "data compression" doesn't say a
whole heck of a lot, but it does refer to the entries for "information
theory," "redundancy," and "source coding." Turning to the entry on source
coding, I can read about variable-length codes, Huffman coding, Shannon-Fano
coding, and Kraft's inequality. Most important, I see that source coding is
contrasted with "channel coding." That, it turns out, is another term for
error detecting and error correcting codes. Turning now to the description of
error correcting codes, I find out a little bit about Hamming codes,
Reed-Solomon, and simplex codes. Thus, simply by flipping through a few pages,
a previous ignoramus has learned something about how data compression (source
coding) on the one hand and error detection/correction (channel coding) on the
other fit into the grand scheme of information theory.
Naturally, the Oxford dictionary has a somewhat academic bent. While
surprisingly topical in some places (such as "TSR" and "Turbo languages"), in
others you may glance at the definition for a term with which you really are
familiar, and find that you have absolutely no idea what they're talking
about! For example, while it is certainly reasonable for the definition of
"regular expression" to make reference to formal language theory, the example
given of a regular expression might have been more useful if it looked a
little more like grep, and a little less like something out of a linguist's or
logician's nightmare.
Naturally, as in any book, there are errors. The definition for "threading,"
for instance, gets completely confused between "threading," a technique used
in interpreted programming languages such as Forth, and "threads," meaning
lightweight processes.
In any dictionary like this, there is a danger of collecting a lot of formal
sounding academic terms (such as "semi-Thue system"), and missing some of the
more colorful jargon of computing. I found, however, that despite some
unfortunate omissions ("lvalue" and "thunk" -- how could they leave those
out?!), many of these phrases did make it into the Oxford dictionary: shell,
pipeline, stub, execute, latch, lazy evaluation, Look and Feel, bit-slice,
rollback, thrashing, garbage collection, remote procedure call, hash, clone,
cache, and carry. Just skimming through a book like this will give you an
appreciation for the vast number of key concepts -- contributions to human
thought, in a way -- produced by the field of computing.
I can't think of a better way to spend $10.95.


Backup Dictionary


One thing is missing from the Oxford dictionary, however. While it has
excellent coverage of the timeless truths of computing (such as Chomsky
hierarchy, partial recursive function, and TSR), it's less helpful with some
of the uglier manifestations of computing in the here and now. For instance,
even after all these years, your average marketing weenie, sales manager, or
software engineer still can't remember the difference between extended and
expanded memory. It sure would be nice to be able to look these up somewhere,
but the Oxford dictionary doesn't have them, nor, I think, should it. Such
topics are simply too ephemeral (despite their surprising persistence year
after year) to justify inclusion in a book of this sort. Likewise for terms
such as "protected mode," "real mode," "Dynamic Data Exchange," and "resource
fork."
So where do you turn for decent explanations of terms such as these? Your
coworkers? No, they don't know what they're talking about! What you need is an
additional, backup dictionary (remember, we've only spent $10.95 so far) like
Microsoft Press Computer Dictionary.
The Microsoft dictionary has short, breezy, but genuinely useful definitions
for many of the terms you come across every day. Again, even if you have some
notion of what these terms mean, you will sharpen your understanding of them
by keeping this book on your desk and using it a few times a week.
My one complaint about the Microsoft dictionary is that it often misses the
richness of the concepts it defines. The definition of "virtual machine" is a
good case in point. Whereas the Oxford dictionary defines a VM as a
"collection of resources that emulates the behavior of an actual machine,"
going on to explain what this means by discussing processes, workspaces, and
isolation, the Microsoft dictionary merely says that a VM is "software that
mimics the performance of a hardware device," giving the not-quite-right
example of running Intel-based software on a Motorola chip. The Microsoft
definition seems to imply that any form of emulation constitutes a virtual
machine; the Oxford definition focuses the definition properly.
However, the scope of the topics covered seems just about right, as indicated
by the entries on another randomly chosen set of pages; see Example 2.

Example 2: Entries from Microsoft Press Computer Dictionary

 emulsion laser storage
 enable
 Encapsulated PostScript (EPS)
 encipher
 encode
 encryption
 end-around carry
 end-around shift
 en dash
 End key
 endless loop (see infinite loop)
 end mark
 end-of-file
 end-of-text
 end-of-transmission
 endpoint
 end user
 engine
 Enhanced Expanded Memory Specification
 Enhanced Graphics Adapter
 enhanced keyboard (101/102-key)

While I can't see myself looking up the definition of "endless loop" (I see
enough of the real thing in my own code), certainly a brief explanation of EPS
or the enhanced keyboard, or even a well-written paragraph explaining what the
overused word "engine" is supposed to mean, is useful to have nearby.


H&S


If you're using C, the next reference book you must get is C: A Reference
Manual, Third Edition, by Samuel P. Harbison and Guy L. Steele, Jr.
It has never been clear to me why, once the first edition of Harbison and
Steele's book was available, K&R remained popular. Sure, every C programmer
owes Kernighan and Ritchie an enormous debt for describing and creating what
to many of us remains the world's most useful programming language. But, once
you know C, the K&R book just isn't all that useful.
H&S is a book that every C programmer will use again and again. Now in its
third edition, the book covers both ANSI C and "traditional C." It also does
a good job of mentioning odd-ball but important variants in the language, such
as the far and huge keywords in Microsoft C and other Intel-based compilers.
One section that I have found particularly useful over the years is the
lengthy discussion of the C preprocessor. Like everything else in H&S, the
preprocessor is defined much more rigorously than in other C books. Perhaps
this is because Steele is an outsider to the C community (he is codesigner of
the beautiful language, Scheme, and of the big language, Common LISP), and
therefore takes much less for granted than might someone from AT&T Bell Labs.
Considering that many (too many!) full-length books on the C runtime library
are available, H&S's 100 page section on the C runtime libraries is
surprisingly useful. The small blocks of sample source code shown are always
illuminating. The explanation of the time and date facilities and of
setjmp/longjmp is the best I've seen.
My one disappointment in the third edition was that H&S dropped an extremely
nice package of functions and macros for set manipulation that had appeared in
the second edition. Every now and then I used to take out the book and ponder
the function in Example 3 for quickly computing the size of a set.
Example 3: H&S function for computing the size of a set

 typedef unsigned SET;
 #define emptyset ((SET) 0)
 int cardinality (SET x) {
     int count = 0;
     while (x != emptyset) {
         x ^= (x & -x);
         ++count;
     }
     return count;
 }

I still don't get how that works, but it sure is beautiful. I guess I'll keep
my copy of the second edition too.
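For readers equally puzzled, one plausible reading is this: in unsigned arithmetic, -x equals ~x + 1, so x & -x keeps only the lowest 1 bit of x. XORing that bit back into x clears it, so the loop runs exactly once per member of the set. The sketch below separates the two steps; LowestBit is a hypothetical helper name, not something from H&S.

```c
typedef unsigned SET;

/* Isolates the lowest set bit of x. For unsigned x, -x is ~x + 1:
   every bit above the lowest 1 is inverted, the lowest 1 stays 1, and
   the zeros below it stay 0, so ANDing with x leaves just that bit. */
SET LowestBit(SET x)
{
    return x & -x;
}

/* Counts the members of the set by removing one member (the lowest
   set bit) per iteration until the set is empty. */
int Cardinality(SET x)
{
    int count = 0;

    while (x != 0) {
        x ^= LowestBit(x);   /* clear the bit just found */
        ++count;
    }
    return count;
}
```

With x = 0x2C (binary 101100), LowestBit returns 0x04, and Cardinality strips 0x04, 0x08, and 0x20 in turn, for a count of 3. The loop's running time depends on the number of members rather than on the width of the word.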


Finally!


For years, Microsoft Corporation has wished that its most successful product,
MS-DOS, would just go away and die. This is a very strange thing for a
corporation to want its cash cow to do.
But no matter how many times they try to abandon this cow, it keeps coming
back. So Microsoft has decided to take care of it -- for a while, anyhow. As
part of its recent release of MS-DOS 5.0, Microsoft has also brought out an
"official" programmer's reference manual for DOS -- MS-DOS Programmer's
Reference: Version 5.0.
You might think that having a widely available official reference for one's
operating system is an obvious thing to do, but it is nonetheless a surprising
and welcome move by Microsoft. It's all part of the company's coming to terms
with the continued outrageous success of DOS. They seem to no longer find it
technically interesting or challenging, but it just won't go away.

Of course, Microsoft Press also sells the standard references on MS-DOS: Ray
Duncan's Advanced MS-DOS Programming, The MS-DOS Encyclopedia, and the
incredibly useful MS-DOS Extensions. This new book, however, has no author (it
was "written, edited, and produced by Microsoft Corporation"), has the word
"official" on it, has a blurb on the back that says "Accept no substitutes,"
and makes no attempt to explain or teach -- it presents nothing but "the
facts."
Thus, we now have Microsoft's official statement of what MS-DOS is. It's not
as good as Ray's books, but if you do any kind of DOS programming, you're
going to have to get this book too.
Naturally, MS-DOS Programmer's Reference includes the new memory management
(INT 21h AH = 58h) and task switching (INT 2Fh AH = 4Bh) functions added in
DOS 5.0. Furthermore, there is good coverage of various INT 2Fh subsystems, so
at least now one knows what Microsoft considers to be part of MS-DOS. There is
an assembly language STRUC for each DOS data structure, with a paragraph of
explanation for each field.
The biggest surprise is that Microsoft has finally officially documented some
of the most commonly used undocumented DOS functions. Of course, previous
Microsoft documentation (such as the chapter on TSRs in The MS-DOS
Encyclopedia) has mentioned these functions, but always with the proviso that
"Microsoft cannot guarantee that the information in this article will be valid
for future versions of MS-DOS," and always without including the functions in
the standard INT 21h references. In other words, everyone knew about the
functions and used them, but they weren't "supported." The previously
undocumented INT 21h functions shown in Example 4 are now supported.
Unfortunately, many crucial functions are still undocumented, but it's a
start.
Example 4: Previously undocumented INT 21h functions

 Get Default DPB (1Fh)
 Get DPB (32h)
 Get InDOS Flag Address (34h)
 Load Program (4B01h)
 Set PSP Address (50h)
 Get PSP Address (51h)
 Set Extended Error (5D0Ah)

It's also unfortunate that version information is placed in a separate section
of the book, away from the functions themselves, and that in some cases the
version information is wrong or misleading. For instance, the important Get
Startup Drive function (INT 21h AX = 3305h) is listed as available in DOS 2.0
and higher; in fact, the function is only available in DOS 4.0 and higher.
(Have you ever tried to find from which disk a DOS 3.3 machine was booted?)
Similarly, some of the previously undocumented functions have been documented
by pretending that they are only available in DOS 5.0 and higher.
Errors really are unavoidable, but in any "official" reference, they can be
exceedingly costly. For example, there is an extremely serious error in the
documentation for the previously undocumented LOAD structure used with the
Load Program function (INT 21h AX = 4B01h): The last two fields, ldCSIP and
ldSSSP, are reversed. This appears in two separate locations in the book, and
is the sort of thing that could cost someone days of lost time.
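To show why the field order matters, here is a hedged C sketch of the parameter block as I understand it to be laid out for INT 21h AX = 4B01h, with the SS:SP field preceding the CS:IP field. The struct and member names are hypothetical (only ldCSIP and ldSSSP come from the book), far pointers are represented as raw 32-bit values, and the compiler is assumed to support #pragma pack.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the Load Program parameter block. Byte packing is needed
   because a 16-bit field is followed by 32-bit fields. The names are
   illustrative only. */
#pragma pack(push, 1)
struct LoadParams {
    uint16_t environment;   /* offset 00h: segment of environment    */
    uint32_t cmd_tail;      /* offset 02h: far ptr to command tail   */
    uint32_t fcb1;          /* offset 06h: far ptr to first FCB      */
    uint32_t fcb2;          /* offset 0Ah: far ptr to second FCB     */
    uint32_t ss_sp;         /* offset 0Eh: returned initial SS:SP    */
    uint32_t cs_ip;         /* offset 12h: returned entry CS:IP      */
};
#pragma pack(pop)
```

A program that trusts the reversed order would read the child's stack pointer as its entry point and vice versa. Checking offsetof(struct LoadParams, ss_sp) and offsetof(struct LoadParams, cs_ip) against 0Eh and 12h at build time is a cheap guard against that kind of mistake.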
All in all, this is an absolutely essential reference for all DOS programmers,
but the "Accept no substitutes" slogan doesn't quite work. Because the book is
a pure reference, without any explanations or sample source code, some of the
material here is simply not usable by itself. Using the task-switcher API,
for instance, would require much more material than this book provides. Thus,
MS-DOS Programmer's Reference should be viewed as the starting point for a
whole new set of DOS programming books.
For Microsoft, too, the MS-DOS Programmer's Reference should be viewed simply
as a starting point. Three nice spin-offs would be a disk with C and ASM
header files, a QuickHelp version of the book, and a DOS test suite that
exercised each function and demonstrated its proper use. DOS really is a cash
cow, and it would be nice to see Microsoft do more to milk it.




October, 1991
OF INTEREST





WindowBuilder/V, an interactive interface builder for Smalltalk/V Windows from
Acumen Software, is a powerful point-and-click interface editor for rapidly
designing Windows user interfaces.
An interface is built by moving, sizing, and editing interface components
directly on screen. You can test interface examples as you create them,
facilitating rapid prototyping. Other features include full-featured menu and
menu bar editors, component alignment and distribution tools, and a sizing
specification editor. WindowBuilder retails for $149.95. Reader service no.
20.
Acumen Software 2140 Shattuck Ave., Suite 1008 Berkeley, CA 94704 415-649-0601
ImageMan is an object-oriented Windows library from Data Techniques that
allows you to add advanced image display and print capabilities to Windows
applications. ImageMan allows applications to access all types of images with
the same set of standard function calls, thus reducing development time. It
supports TIFF, PCX, Encapsulated PostScript, Windows Metafile, and Bitmap
image formats.
Because it is supplied as a Windows DLL, ImageMan can be used with C, C++,
Turbo Pascal for Windows, Visual Basic, Smalltalk/V, Actor, and more.
ImageMan is priced at $395, $995 with source code, and is royalty free. Reader
service no. 21.
Data Techniques Inc. 1000 Business Center Drive, Suite 120 Savannah, GA 31405
800-868-8003 or 912-651-8003
Greenleaf Comm++, a C++ class library for asynchronous communications, has
been released by Greenleaf Software. Classes provided include those for serial
port controls, modem controls, file transfer protocols, and calculation of
check values. Additional classes support hardware dependent features. Comm++
will accommodate interrupt-driven circular buffered service for up to 32 ports
at up to 115K bps.
Comm++, including full source code and technical support, sells for $199.
Reader service no. 22.
Greenleaf Software 16479 Dallas Parkway, Suite 570 Dallas, TX 75248
800-523-9830 or 214-248-2561
The Stork installation software has been released by Island Systems. The
system includes tools for configuring the diskette organization, creating a
complete user interface, specifying systems checks, and building the media
set. Unique features include the ability to tailor the product installation to
the characteristics of the end user's system, a simulation mode to test the
installation and the product without loading a diskette, and automatic
creation of an optimal installation procedure for different disk sizes. When
a product is revised, the Stork can incorporate the changes in a matter of
minutes. Priced at $175. Reader service no. 23.
Island Systems 7 Mountain Road Burlington, MA 01803 617-273-0421
MetWINDOW/PREMIUM, the graphics development system from Metagraphics, is
available for use with either 286 and 386 protected-mode DOS extenders or PC
Unix systems and supports almost all popular PC graphics hardware.
MetWINDOW/PREMIUM adds expanded graphics functionality with complete sets of X
Window-compliant and transparent rasterOps. It implements Metagraphics' new
hyper-cursor technology, supporting solid-state colored cursors up to 32 x 32
pixels in size, for running in high-resolution display modes of 800 x 600 and
1024 x 768 or above.
MetWINDOW/PREMIUM for DOS sells for $595; for Unix, $795. Reader service no.
24.
Metagraphics Software Corp. P.O. Box 66779 Scotts Valley, CA 95066
408-438-1550
The Tigre Programming Environment from Tigre Object Systems is a multiplatform
application builder that lets you build GUI applications to run on Macintosh,
MS-Windows, Sun, IBM R/6000, DEC, HP, and other workstations. Color
applications built with Tigre run without modification on any of these
platforms. Applications generated by Tigre are in Smalltalk, using ParcPlace
Systems' Objectworks\Smalltalk Release 4, which must be purchased separately.
Tigre includes a library of tested user interface object classes (or widgets)
and tools to manipulate them. To build an application, choose from a variety
of buttons, text editors, and picture viewers; place the widget into your new
Tigre program; modify it with drawing tools; connect it to your Smalltalk
code; and test it. Developing with Tigre is quick because an application can
be modified while the program is running. The modification is ready for use
immediately, without restarting or recompiling.
Tom Soon, a research scientist at Pacific Bell, commented, "I have been using
Smalltalk to create proof-of-concept prototypes for over three years. With
Tigre to help in creating the graphical user interfaces, I now spend
significantly less time than it took to program them in Smalltalk."
Also included is Tigris, Tigre's multi-user, object-oriented persistent-object
store. This database allows access to any arbitrary type of data, including
variable length text, icons, images, sounds, or any type of object.
Connections can be established quickly between any interface component and its
underlying data representation, making it easy to build even the most complex
information processing applications.
Tigre is priced at $3500 for a single user license on any of the above
platforms. Reader service no. 25.
Tigre Object Systems 3004 Mission St., Suite D Santa Cruz, CA 95060
408-427-4900
Microsoft has released a Source Profiler that generates four different types
of diagnostic information about a program as it executes: the length of time
taken to execute selected lines or functions; the number of times selected
lines or functions are executed, thus identifying overly used loops or
functions and eliminating the need to insert counters to test code; whether a
line or function was exercised; and a relative comparison of the time spent in
each of the program symbols. Output files of statistics from individual
Profiler sessions can be combined to reveal how the program behaves during
test suites.
The profiler features choice of lines or functions, a low memory requirement,
and extensive capacity and invocation options. It uses compiler-generated
input from Microsoft's CodeView debugger to determine line and function
symbols. You can profile all a program's lines or functions or specify only
certain parts.
Because the profiler occupies only about 80K, you can test memory-intensive
applications. Also, by using virtual memory for its own data collection, the
Source Profiler can keep track of an unlimited number of lines or functions.
Also available is the Windows Debugging Version. A standard component of the
Windows SDK, the debugging version can now be purchased separately, and used
with any Microsoft or third-party Windows development tool. The debugging
version displays information on a second monitor or COM port, which alerts
programmers to problems with the use or management of Windows resources during
application development. It validates parameters, checks API calls, performs
checksums on code segments, tracks Windows handles, and provides other types
of feedback.
The Source Profiler costs $79 ($49.95 for registered Microsoft product users);
it can be used with the following Microsoft products: Microsoft C, Fortran,
Basic, Macro Assembler, Cobol, Pascal compiler, QuickBasic, and QuickC
Compiler with QuickAssembler. The debugging version is $195. Reader service
no. 26.
Microsoft Corp. One Microsoft Way Redmond, WA 98052-6399 206-882-8080
The new release of the 386 DOS-Extender from Phar Lap (version 4.0) is
DPMI-compliant, allowing multimegabyte Extended-DOS applications to run
under Windows 3.0's enhanced mode and any future DPMI-compliant multitasking
environments. Windows support lets you communicate between Windows and
Extended-DOS applications through the Windows Clipboard, facilitating text and
graphics interchange.
In addition to DPMI, the 386 DOS-Extender supports all other industry
standards: INT 15; VCPI, allowing you to work with EMS emulators; XMS,
providing compatibility with DOS 5.0 and Windows' standard mode; and VDS.
The 386 DOS-Extender SDK is available for $495. Updates from Version 3.0 are
free; updates from Version 2.2d are $150 for the 386 DOS-Extender SDK and $25
for 386 VMM, the virtual memory add-in driver. Reader service no. 27.
Phar Lap Software Inc. 60 Aberdeen Ave. Cambridge, MA 02138 617-661-1510
Rational Systems recently released four new products: DOS/4G, BigWin,
WinServe, and DOS/16M. DOS/4G is a 32-bit DOS extender for 80386/486
architecture. It allows DOS extended applications to run as DOS programs under
any DPMI-compliant operating system. Thus, you can run 32-bit applications in
protected mode. DOS/4G requires little memory, and its linear addressing model
provides a single address space, making extended memory easier to access and
resulting in better compiler code. It supports real-time applications and TSRs
and is compatible with the Virtual Control Program Interface (VCPI) and XMS
extended memory standards.
BigWin is a DOS extender for 32-bit Windows programs, with which you can use
Windows' enhanced mode. Features include source code compatibility, compiler
independence, and zero-based flat model addressing. Applications created using
BigWin can be linked with 16- or 32-bit industry standard DLLs, allowing you
to call and debug other programs from within Windows.
The WinServe toolkit is for transforming existing programs into Windows 3.0
applications. It supports communication between Windows front-end and DOS
back-end programs; memory sharing; a standard DLL that exports an API for
Windows application use; an object file containing an API for use by DOS/16M
applications; a built-in debugger; and a Windows 3.0 virtual device driver.
DOS/16M is a 16-bit DOS extender for creating DOS programs that require large
amounts of memory. Using it, you can expand the functionality of DOS
applications with little modification.
Prices for DOS/4G range from $5000-25,000, depending on configuration;
BigWin, WinServe, and DOS/16M begin at $5000. Reader service no. 28.
Rational Systems Inc. 220 No. Main St. Natick, MA 01760 508-653-6006
Wanim, a sprite animation DLL designed for building multimedia, educational,
and video game software, is AND-XOR Systems' latest release. Wanim is a fast
multitasking DLL that can be used with any Windows 3.0-compatible language.
Using Wanim you can change sprite display priority on-the-fly, create and
position multiple animation zones anywhere within a window, scroll background
images, detect collisions, and generate masks. Algorithmic sprites allow you
to create animation objects using Windows' text, line, rectangle, and ellipse
drawing functions. Algorithmic and bitmap sprites may be combined.
Wanim works with any Windows-supported graphics mode. The price is $69, with
source code $99. Demo disks cost $5. Reader service no. 29.
AND-XOR Systems 1107 Fair Oaks Ave., Suite 167 South Pasadena, CA 91030
213-969-4081
A Data Compression Library is now available from PKWARE that lets you add
state-of-the-art data compression technology to your applications. The
routines are flexible and allow complete control over the input and output
data. The application program specifies where the compressed data is sent or
where the data is to be extracted.
The library features an all-purpose data compression algorithm for ASCII or
binary data; application-controlled I/O and memory allocation; a generic
compression routine format; a dictionary of adjustable size; compatibility with
C, Pascal, Basic, Fortran, Assembly, and Clipper compilers, as well as others;
a requirement of 35K of memory for compression and 12K for extraction; and the
ability to work with any 80x86 family CPU in real or protected mode.
The Data Compression Library sells for $295 plus shipping and handling. Reader
service no. 30.
PKWARE Inc. 9025 North Deerwood Drive Brown Deer, WI 53223 414-354-8699
Huntsville Microsystems has released the HMI-200-68020 emulator and the
HMI-240-EC020 adaptor to support the Motorola 68EC020 microprocessor. They
support speeds of 16 and 25 MHz and provide real-time emulation, four complex
break and trigger points, and two 4K x 4-bit trace buffers, including 16
external trace bits and 32 bits of time tag information. The unit has 256K
bytes of emulation memory with an option for 1, 2, or 4 Mbytes.
The emulator is integrated with SourceGate, a window-driven, high-level
language debugger that supports C, Pascal, and Ada compilers and shows the
user's code in either source or assembly code or both. Customizable watch
windows can be displayed to monitor specific memory locations or other
variables.
A transparently operating, real-time performance analysis and test coverage
option is also included. Up to eight modules can be defined; time elapsed in
each module is displayed in histograms relative to the total test time. The
emulator, including SourceGate, costs $12,000 for the 16-MHz version and
$17,500 for the 25-MHz version. Software support for Sun and Apollo
workstations is $2000; the performance analysis option is $2500. Reader
service no. 31.
Huntsville Microsystems Inc. P.O. Box 12415 Huntsville, AL 35815 205-881-6005


October, 1991
SWAINE'S FLAMES


What About Bill?




Michael Swaine


Poor old Bill. What a way to be treated in the year of one's 200th birthday.
The following are some thoughts on old Bill of Rights, as his birthday
approaches. (It's December 15, so there's still time to pick out something
nice.)
Katie Hafner and John Markoff have written a book called Cyberpunk (Simon &
Schuster, New York, 1991), which profiles three people whom I would call
crackers, but whom they call hackers. They justify this usage by claiming that
the term has lost its original positive connotations and has been appropriated
by the lawbreakers and mischief-makers of cyberspace. Possibly so, but the
original MIT hackers and attendees of the annual Hacker's Conference might
take issue with this view. At least they didn't call the book Hackers II. It
was John, incidentally, who broke the story of Robert Morris's trial for
setting loose a virus that accidentally brought down Internet, and Morris is
one of the three profiled crackers. Morris is the "nice" cracker of the book.
The others, Pengo, who sold secrets to the Soviets, and Kevin Mitnick, who
messed around with credit ratings, are more sinister.
Since John is an old friend of mine, this will not be a review of the book.
I'm not unbiased: I hope that you all go out and buy it and make John and
Katie rich. I think they tell the story of their three wayward youths very
well. Still, I came away from Cyberpunk wishing that John and Katie had
written another book.
One standard journalistic technique is to get "the reaction story." Having
broken the Morris case, John could have written about the public reaction. He
and Katie did investigate the reaction to this and to similar cases: public
ignorance and hysteria, governmental ignorance and hysteria, police breaking
into homes to seize computers, software, printouts, and modems (or "modums,"
as John and Katie report the Santa Cruz, California police spelling it).
As the police misspelling example shows, John and Katie did include some of
the results of their investigation of the reaction to these cases of kids
cracking into computer systems, as much as was appropriate to the story they
were telling.
My entirely unreasonable gripe is that they didn't tell the story I wanted
told, which was the reaction story. I asked myself, as I read the book, which
is worse, kids running wild on computer networks or the police running wild
and breaking into homes? For me, the answer is clearly the latter. The rule of
law can survive lawbreakers; it more or less assumes the existence of
lawbreakers. But it suffers badly when the law enforcers break the law. And
they can: The latest example of judicial doublethink is that illegally
obtained evidence can now sometimes be used in court, which is to say, the
government can break the law to get evidence of lawbreaking.
But the more disturbing reaction is the redefinition of fundamental law out of
fear or ignorance. I'm of the opinion that most of what the Supreme Court has
been up to recently fits this description, and I'm alarmed about the state of
basic civil rights in the United States. But one doesn't have to agree with me
about this to see that there is a serious danger of judicial overreaction to
perceived dangers from kids with modems. Professor Laurence Tribe has proposed
a Constitutional cyberspace amendment in an attempt to focus attention on
these matters. Tribe's amendment basically says that cyberspace is just
technology and that the values underlying the Constitution remain constant
across technologies. So, for example, the values of freedom of speech and
assembly, without which democracy cannot exist, are just as important in
cyberspace, but we must understand the implications of the technology to know
precisely what freedom of speech and assembly mean in cyberspace. In other
words, the cyberspace amendment extends the Bill of Rights to cyberspace.
The irony is that, even if Tribe's amendment were passed, it can't mean
anything more than the Bill of Rights means. The cyberspace amendment would be
a pointer to a structure, a structure that is being deleted, bit by bit.


November, 1991
EDITORIAL


Odds and Ends, Some Odder Than Others







Now that murder, mayhem, pollution, affordable housing, and too much taxation
are no longer problems in New Jersey, state legislators are focusing their
bureaucratic bombsights on more heinous criminals -- unlicensed programmers.
In a bill proposed by Assemblywoman Barbara Kalik, anyone calling him- or
herself a software engineer in New Jersey would have to pass a written test
and (more to the point, I bet) pay a license fee. Kalik claims the legislation
would provide an industry-wide standard -- administered by the State Board of
Software Engineers -- for computer programmers, similar to the state
proficiency standards set for barbers, plumbers, acupuncturists, and
morticians. The bill has passed the Assembly, but not yet the Senate. Similar
measures are being considered in California, Texas, Ohio, and Tennessee.
Whence, you might ask, does this proposal spring? Are billions of bytes of
bug-ridden code being written in New Jersey, causing angry mobs to clamor for
justice and structured programming? Not really, a spokesperson in Kalik's
office admitted. Instead, a "couple" of Kalik's constituents observed that
anyone in New Jersey can call himself a programmer. The "solution" to this
intolerable situation, they said (and Kalik agreed), is a certification
process that "tests the applicant's knowledge of software engineering theory
and procedures and any other subjects the board may deem useful." This
examination gambit has run into rough waters even before leaving the dock. A
single test won't work, Kalik realizes, because of the broad nature of
computer science. Requiring exams in all areas of specialization, on the other
hand, would cost more than the state can take in. What's a politician to do?
Here's a suggestion: If a particular programmer can't do the task at hand, let
him or her get more training -- or get someone who can do the job. Now that I
think about it, this idea might work for politicians, too.


386BSD Begets BSD/386


The first commercial implementation of 386BSD UNIX has just been announced and
is due for release within the next few months. See page 152 of this issue for
details.


Win-Win Situations


I'm pleased to announce the recipients of this year's Kent Porter Memorial
Scholarship: Michael Leventhal of the University of California-Berkeley and
Cameron Gordon of the University of Southern Maine. Each will receive a $500
scholarship.
Likewise, a pair of contests we announced several months back -- one sponsored
by DDJ, the other by Symantec Corp. -- have been wrapped up.
The Symantec Think Programming Contest was open to college and high-school
students working in Think C or Pascal. At the college level, Brad Smith of
Beltsville, Md. was the grand prize winner, and runners-up were Atul Butte of
Providence, R.I. and David Harkness of Los Angeles. The winner at the high
school level was Robert Leslie of Concord, N.H., with Adam Miller of Ithaca,
N.Y. and Seth LaForge of Berkeley, Calif. as runners-up. Grand prize winners
received from Symantec a $5000 scholarship, a Macintosh IIci system, and a
lifetime subscription to DDJ. Runners-up received $500 and a one-year
subscription to DDJ.
The other announced competition was the DDJ Data Compression Contest. I won't
go into details about the results, other than to direct you to Mark Nelson's
article on page 62 of this issue. Thanks to all of you who sent in entries.


If Small is Beautiful, Really Small Must be Really Beautiful


I'd be remiss when mentioning data compression not to say something about what
Michael Barnsley has been up to lately. Barnsley, of fractal compression fame
and Iterated Systems Inc., stopped by the office to show off some really
impressive technology. Barnsley has upped the compression ante with
compression ratios of 2456:1 -- with rates of 10,000:1 by next summer! The
techniques that make this phenomenal compression possible are implemented in
(among other programs) a Windows application called the "P.OEM Fractal
Formatter." Equally amazing is that Barnsley is doing this in software; up
until now, you needed a hardware compression board.
As if the extraordinary compression weren't enough, Barnsley's technology is
"resolution independent," meaning that it can be displayed at infinite
resolution, depending on the display hardware. To change from one display
resolution to another, you simply have to change the driver and nothing more.
Additionally, Barnsley is releasing the specification for what he calls the
"Fractal Image Format" (FIF), an alternative to PCX, TIFF, BMP, and the like.
His Fractal Formatter app lets you convert image files back and forth in these
various formats, including FIF.
You gotta see this stuff to believe it.


November, 1991
LETTERS







Patent Perspectives


Dear DDJ,
The League for Programming Freedom's article opposing software patents
("Software Patents," November 1990) rather intrigued me since I have recently
applied for a patent on a new type of software, a new, more efficient method
of displaying text. My partner and I went through the considerable effort and
expense because we realized that only by securing patent protection did we
have any chance of profiting in any way commensurate with our invention's
potential value. If we were to release our program with only the copyright
laws to protect our efforts, as soon as we had established its usefulness
others with more financial resources would quickly rewrite our program using
our unprotected concepts and push us out of the market. A potential licensee
has almost said as much.
In the article there is no consideration of these benefits to the creator of a
new concept. Is this of no value? You seem to want the creator to share with
the world his ideas and allow all the money to go to the marketer and the
financier. The patent has been for 200 years the only device which has given
the lone inventor a chance of controlling the destiny of his invention and
profiting from it. It was one of the keys to America's industrial strength and it
is even more important in the "Information Age."
The article details many of the problems with searching patents and going
through the patent process. I can agree fully with this part of your article
and could add to it. It is a mess. But rather than do away with patent
protection for software, you should be advocating improvement in the system.
The system needs to be completely computerized and internationalized. Similar
charges could also be made against the copyright system. Wasn't it George
Harrison who inadvertently rewrote someone else's song and had to pay damages?
At least the patents are available and can be checked -- try that for
copyright! Should you not also then call for the elimination of copyrighting
of software?
A letter affords little space to answer every charge made against software
patents. However, I want to at least register my strong disagreement with
almost everything in the article relating to the applicability of patents to
software. It must have been written by a very good lawyer, for it is rare to
find writing which sounds so reasonable yet so thoroughly obscures the truth.
I am sure that the author could construct equally compelling arguments
questioning the applicability of copyright law to software as well. And
probably has! A full explanation of why patent law is applicable to some
software is beyond the scope of this letter, but the short version, as I
understand it, is the following: When a program is entered into a computer,
switches may be a "new and useful device," and thus the Constitution
guarantees its inventor patent protection over the whole or parts of his
invention. Software had been thought not to be patentable in the US because of
a case where someone had tried to patent an algorithm which had no specific
usefulness. This misled many to think that all software was not patentable.
The patentability of some software became established not because the Patent
Office wanted to patent software (they already had plenty of work), but
because the courts enforced the Constitution. Only an amendment will properly
succeed.
Do I see no value in the activities of The League for Programming Freedom? No,
it is certainly worthwhile to work against patents which should not have been
granted because of prior art. And I agree that copyright protection should not
be extended to include interfaces. (Patents are applicable for these.) But I
hope that they will cease their efforts to end the patent protection
guaranteed to creators wisely included in our Constitution.
Howard R. Davis III
Atlanta, Georgia
Dear DDJ,
The article "Software Patents" brought to mind something that happened a
couple of years after I graduated (1953). The transistor was then the
promising new technology that was to free us of our everyday drudgery. At the
time, I heard that one of the largest electronics companies was trying to
patent transistor versions of all of the commonly used vacuum tube circuits. I
am under the impression that they did not succeed in getting the patents. In
many respects, the situation was similar to the present software patent
situation -- the patent office was not knowledgeable of the technology being
used in the patent applications. It might be worthwhile for The League for
Programming Freedom to look into what happened 35 years ago.
While getting a law passed may be a laudable endeavor, I think that The League
could make more of an immediate impact by lessening the value of software
patents that they feel are invalid. This could be done by publishing on an
update basis a review of the existing and new software patents. This review
could identify the standard algorithm (with references) that is described in
the patent and give a workaround to avoid infringement. These reviews ideally
should be in a database that could be referenced by standard algorithm name.
This would give the poor soul who is accused of infringing someone's patent an
immediate source of data to fight the action. It would also get into
publication and the public domain (and thus make unpatentable) ways of getting
around these software patents.
One more thing -- the author mentions (page 70) the possibility of the
royalties exceeding the sale price of the software product. People who license
patents do it to make money -- they have an interest in the licensee staying
in business and making money. I really doubt that the situation of royalties
exceeding the sale price would ever happen in real life.
David L. Spooner
Wilmington, Delaware


Bending Bender's Ear


Dear DDJ,
If software engineers such as Andy P. Bender (Letters, June 1991) want more
respect, they had better get real! Comparing putting code in a buffer for
money to neurosurgery is the height of hubris. And to believe that a standard
university curriculum produces better-qualified people than experience is
silly. Universities usually select better people to start with, but they
seldom improve them much. Equating "incompetent" to "learned-it-on-the-job" is
an insult to the pioneers, including Mitch Kapor, whom he quotes.
As far as "requiring...licensure [sic]" is concerned, this is a state function
for the protection of the public and was not meant to be a tool for gouging
the public. The requirements should be on the product, not the person creating
it. (In the case of such personal services as medicine, law, the ministry, and
teaching, the person is the product. This is why they are professions.) For
some years now it has been next to impossible to create new licensing for any
field under either state or federal law. Life and casualty actuaries have been
working on it for years with limited success. When they started, they had a
long history of tough qualifying examinations for membership in their
organizations.
Andy admits that he waxes sarcastic and implies that there are "standards"
(legal ones?) on how the working drawings for construction are produced. Are
there? If so, who enforces them?
The closest you are likely to come to so-called professionalization is private
organizations of people in a "discipline" with entry requirements based on
education, experience, examination, or publication in reviewed journals. For
most, examination is the most appropriate, but, of course, for granting tenure
to professors, publication is key. In any case, efforts in this direction have
had little success. I don't normally use my CDP designation.
So that Andy will know where I am coming from, I have a science and
mathematics background with degrees from major universities. I worked in
insurance for many years with experience in data processing, actuarial,
accounting, and general management. Many of the languages I know and computers
I have programmed for came along after I had already "been in this business
for twenty years." Perhaps nobody could teach me anything, but I could
certainly learn it. I will soon be 65 and am more up-to-date than most of the
so-called professional MIS people I know.
I believe in sound techniques, but I think that standards are a bunch of crap.
I believe in learning and in some (but certainly not all) education, but I
think that certification and licensing are scams.
Harlow B. Staley
Northbrook, Illinois


Favorite Software Books


Dear DDJ,
I am very interested in Ray Duncan's "Programmer's Bookshelf" column in your
June 1991 issue. I was wondering if you could either supply me with the list
of Yourdon's favorite software books or the information I need to get this
list. I do agree that reading programming books can be dry and boring, so I
was happy to read that someone like Yourdon would also recommend
nonprogramming books.
Chamrong Chhut
McLean, Virginia
Editor's note: Contact Edward Yourdon's The American Programmer, 161 West 86
Street, Suite 9A, New York, NY 10024, or by phone at 212-769-9460.


Amen


Dear DDJ,
There were many minor points and assertions in John Derbyshire's letter ("The
Mandarin Middle Management Conspiracy," September 1991) with which I might
raise issue. However, having just reread his letter for the third time, I
think all my reactions can be distilled into a single, concise comment: Amen,
brother.
Kevin D. Weeks
Oak Ridge, Tennessee


Why Buffer the UART?


Dear DDJ,
I read with interest Jeff Duntemann's "Structured Programming" column (June
1991), in which he discusses the UART registers.
On page 134, in the section "FIFO Control Register (FCR)," Jeff made a
statement which puzzled me: "For DOS applications on fast machines they
(buffered UART's) simply aren't necessary. (If they are, I suspect it means
you don't know how to write a terse enough interrupt service routine.)"
We have developed an application for the PC (the "monitor") which uses the
serial port to monitor and control the behavior of another computer system
(the "remote") in real time. The remote and the monitor send data continuously
to each other at 19,200 baud. Due to limitations in the remote, no flow
control is used -- it sends data in a continuous stream in real time. The
monitor is a 20-MHz 286 PC.
Our experience has been that a buffered UART is necessary to avoid loss of
data coming into the monitor in this situation. The problem does not seem to
be the efficiency of the ISR we wrote to service the incoming data (I wish it
were, so we could correct it), but rather the efficiency (or proper design) of
other software in the machine; specifically the BIOS routines which service
the fixed disk and the keyboard interrupt.
As best we can tell, what happens is that the keyboard service and/or fixed
disk software in some BIOS's disable interrupts just long enough to cause
incoming data to occasionally be lost. We have found, for example, that AMI
BIOS works fine (thousands of hours of experience), whereas Compaq loses data.
Please tell me if there is some subtle point I have missed, which would allow
me to avoid data loss without having to use the buffered UART.
By the way, you might mention to your readers that the 16550AF is the chip
they want, not the 16550. The 16550 is buffered, but the buffer does not work
properly. This has been corrected in the 16550AF. This can be tested using
Example 1.
Example 1

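 port[base+2]:=7;                  { write FCR: enable and clear the FIFOs }
 case port[base+2] and $c0 of      { read back IIR; bits 7-6 show FIFO state }
 0:   writeln('8250 or 16450');
 $80: writeln('16550 (bad buffer)');
 $c0: writeln('16550AF (buffer OK to use)');
 else writeln('who knows?');
 end;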

Russ Ether
South Bend, Indiana
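
The test in Example 1, and the timing pressure described in the letter above, can be restated as a short C sketch. The names uart_kind, classify_uart, char_time_us, and latency_budget_us are hypothetical and do not come from the letter. The timing figures also rest on assumptions the letter does not state: 10 bits per character on the wire (one start bit, eight data bits, one stop bit) and a 16-byte receive FIFO that raises an interrupt on the first received byte.

```c
#include <stdint.h>

/* Hypothetical classification names, for illustration only. */
typedef enum {
    UART_NO_FIFO,   /* 8250 or 16450 */
    UART_16550,     /* FIFO present, but reported as unreliable */
    UART_16550AF,   /* FIFO usable */
    UART_UNKNOWN
} uart_kind;

/*
 * Classify a UART from the byte read back at base+2 after 7 has been
 * written there, mirroring the case statement in Example 1.
 * Only the top two bits are examined.
 */
uart_kind classify_uart(uint8_t readback)
{
    switch (readback & 0xC0) {
    case 0x00: return UART_NO_FIFO;
    case 0x80: return UART_16550;
    case 0xC0: return UART_16550AF;
    default:   return UART_UNKNOWN;
    }
}

/* Microseconds needed to receive one character at the given line rate. */
double char_time_us(unsigned long baud, unsigned bits_per_char)
{
    return 1e6 * bits_per_char / baud;
}

/*
 * Rough upper bound, in microseconds, on how long interrupts may stay
 * disabled before received data is lost, assuming the hardware can hold
 * buffer_depth characters that have not yet been read.
 */
double latency_budget_us(unsigned long baud, unsigned bits_per_char,
                         unsigned buffer_depth)
{
    return buffer_depth * char_time_us(baud, bits_per_char);
}
```

Under these assumptions, char_time_us(19200, 10) comes to roughly 521 microseconds, which is about all the slack an unbuffered UART offers, while latency_budget_us(19200, 10, 16) comes to roughly 8.3 milliseconds. That gap is one plausible reason a BIOS routine that briefly disables interrupts could cause losses on one chip and not the other.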


Conversion Routine Optimization


Dear DDJ,
The conversion routine presented in Don Morgan's "Decimal Fractional
Conversion" (August 1991) can be improved significantly. Example 2 provides
the same result in approximately 50 percent of the space. It is essentially
sequential except for one conditional jump instruction. It avoids the penalty
for clobbering the prefetch queue, which is rather severe on the less powerful
Intel CPUs for which the routine was designed. It may not be important, but it
preserves the BX and CX registers. It also corrects a harmless bug in the
carry logic.
Example 2

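 MOV DX,0001h        ; sentinel bit: falls out of DX after 16 shifts
 DigitLoop:
 ADD AL,AL           ; double the low two BCD digits
 DAA                 ; decimal-adjust; CF = carry out of the low byte
 XCHG AL,AH          ; bring in the high digits (XCHG leaves flags alone)
 ADC AL,AL           ; double the high two digits plus the carry
 DAA                 ; CF = next bit of the binary fraction
 XCHG AL,AH          ; restore the byte order
 ADC DX,DX           ; shift the bit into DX; CF = sentinel when done
 JNC DigitLoop
 SUB AX,5000h        ; CF clear if the remainder is at least one half
 SBB DX,-1           ; round: adds 1 to DX when CF is clear
 RET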

A carry from a low byte into a high byte during addition could propagate. The
ADC instruction is designed to handle just such a situation. The convoluted
carry logic used in the article works only because a number added to itself
always yields an even number. A carry into an even number never propagates,
but the INC instruction would not propagate it even if it could occur.
To facilitate reentrancy, the calling routine should store the result.
Otherwise, the revised code will produce identical results without the
unwarranted complexity.
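For readers who want to check the arithmetic without an assembler, the following sketch models Example 2 in Python. It assumes, from reading the listing rather than from the original article, that AX holds a four-digit packed-BCD fraction and that DX returns the rounded 16-bit binary fraction. The function name is hypothetical.

```python
def bcd_fraction_to_binary(bcd):
    """Model of Example 2: packed-BCD fraction in, rounded 16-bit fraction out."""
    text = format(bcd, "04x")
    if len(text) != 4 or not text.isdigit():
        raise ValueError("expected four packed BCD digits")
    frac = int(text)            # stands in for AX: the fraction scaled by 10**4
    dx = 1                      # sentinel bit, as in MOV DX,0001h
    while True:
        frac *= 2               # ADD/DAA and ADC/DAA double the BCD value
        carry, frac = divmod(frac, 10_000)
        dx = (dx << 1) | carry  # ADC DX,DX shifts the new bit in
        if dx > 0xFFFF:         # sentinel shifted out: 16 bits collected
            dx &= 0xFFFF
            break
    if frac >= 5_000:           # SUB AX,5000h / SBB DX,-1 rounds to nearest
        dx = (dx + 1) & 0xFFFF
    return dx


print(hex(bcd_fraction_to_binary(0x2500)))
```

Each pass doubles the decimal fraction; the carry out of the top digit is the next binary digit, so 0.2500 yields 4000h and 0.5000 yields 8000h.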
Berry Ratliff
Ann Arbor, Michigan









November, 1991
A CONVERSATION WITH ROBERT CARR PART I


Designing operating systems for the future




Michael Swaine


Michael is editor-at-large for DDJ and can be contacted at 501 Galveston
Drive, Redwood City, CA 94063.


As GO Corporation scrambled toward initial release of PenPoint, its 32-bit,
object-oriented, multitasking operating system, I stole some time from Robert
Carr, the cofounder and head of development of GO. Pen-based computing looks
like a potentially huge market with many technical challenges and product
opportunities, but the operating system is interesting independently of its
pen support. (The accompanying text box entitled "A Technical Overview of
PenPoint" examines the OS.)
The conversation naturally revolved around pen-based computing and the
operating system, but throughout the discussion, the speakers found themselves
returning again and again to another topic: the proper locus of integration in
software. As the author of Framework, Carr is an expert on integrated
software. The industry as a whole has held differing views over the years on
the proper locus and degree of integration, and this has affected the range
of opportunities for developers. PenPoint's approach to integration, Carr
argues, is good news for the smaller, independent developers.
In the first installment of this two-part interview, Carr focuses on the
PenPoint operating system itself; next month, he discusses the PenPoint
notebook user interface.
DDJ: Maybe we can start from a moment in history that was probably memorable
for you and back up from there. The moment in history that I have in mind is
in early 1984 when you had just made an interesting deal with Ashton-Tate, a
pretty hot company at that time, that would result in your product becoming
their hot product of the year. It must have been an exciting time. How did you
get to that point?
RC: I guess the key step in getting to that point was starting to work on
Framework.
DDJ: You hadn't developed a single piece of application software on your own
before. What led you to take on something like Framework?
RC: What led me to do that? Hmm. I had studied computer science at Stanford
and worked at Xerox PARC, doing a small piece of applications programming on
the Alto. Then I got somewhat disenchanted with software and took a year off
and ended up in Los Angeles doing contract programming on Context MBA. That
was in stark contrast to Xerox, where I had been a small fish in a big pond.
In L.A. I found myself working with a very small startup, just six or seven
people. In spite of the fact that I was quite young, I found that I was able
to make a lot of contributions and play a strong leadership role. So that,
plus my break, it turned out, really energized me and got me very excited
about creating software again. After about a year of working there largely
implementing their vision, and as their vision became apparent, I started
having a lot of my own ideas about how to do integrated software in quite a
different fashion.
In working on Framework, what I was trying to do was to define all these
elements that were in common across applications and abstract them out, and by
doing so both to make it easier to implement the applications, but also to
provide a highly integrated environment where the user would get many
benefits, where they could mix and match, say, spreadsheets and word
processing in a live compound document, or they could have tremendous
consistency. I was always hobbled in my efforts because I was working at the
application level and everything that I was abstracting out really needed to
be put down in the operating system if it was going to work across
applications that could be provided by third-party vendors. And of course up
at the application level I couldn't do that, so it was all part of one larger
application, a so-called "integrated software package."
DDJ: But that's the nature of integrated packages.
RC: That is the Achilles heel of integrated software, that it is a closed
world. The end user can't go out and buy another piece and install it and
extend it or mix and match their favorite applications. Hence integrated
software, evolutionarily speaking, will never be a high-end product, even
though in the '80s that's how it was positioned. Its long-term role is just
more entry-level products for users who want a general-purpose, low-end, more
cost-efficient solution. So with integrated software I had always been stuck
with this problem. And that was one of the things that particularly attracted
me to the opportunity to cofound GO. I felt that it was a rare opportunity
where a major new market was opening up, with significant new technical needs,
such that a new operating system really was needed, and therefore a golden
opportunity for me to try to do an operating system (from my viewpoint)
correctly, such that it could provide a high degree of integration across
applications and yet be open-ended.
DDJ: Norm Meyrowitz, of Brown University, once wrote an article titled, "Why
We Are All Doing Hypertext Wrong." His point was that hypertext needs to exist
at the operating system level. It seems to me that his argument generalizes to
pretty much anything that involves that degree of integration.
RC: Mm-hmm. In fact, I don't know if you know it, but Norm is working here
now.
DDJ: I didn't know that.
RC: Yeah, that was a very good article. He felt that we've implemented here at
GO, in PenPoint, many of the concepts of hypertext and object orientation he
had been talking about. He wants to get on beyond those. He joined us a few
months ago.


The Point of PenPoint


DDJ: Let's talk about PenPoint; the characteristics of the operating system,
why they're there, what you wanted to achieve with this operating system.
RC: Well, what attracted me to GO and to pursuing the pen market was the
notion that here was a new market that needed a new operating system, and that
was a rare opportunity because the world generally does not need twenty-seven
hundred operating systems.
DDJ: How many does it need?
RC: I think it typically needs a slowly shifting constellation of five or
seven or eight. In fact, I think that that's what we've always had
historically, and MS-DOS's dominance for a period in the '80s I think was a
historical anomaly.
DDJ: So what was the design vision behind your OS?
RC: What we've tried to do, since we did have a fresh start, was to build a
very solid foundation that would last for many, many years to come. That was
one going-in mission. And of course we wanted to code in a high-level
language, and we also wanted to make sure that we could have as robust and as
extensible an architecture as possible. That led us to the notion of object
orientation and also to coding it in C. We wanted it, at a detail level, to
work very well with a pen. At a more gestalt level, I've always thought that
one of the impediments to ease of learning and use that other operating
systems have is the file-system model. And of course we wanted to have a GUI
so we could deliver the benefits that those can deliver. But I felt that
beyond the GUI -- the other systems all seemed to me to be stuck in a dead
end. Everybody has a desktop, the desktops all look like the other desktop
metaphors, and the GUIs are all the same, and it's all vintage 1978
technology. And the evidence is very strong that it just isn't good enough for
most people in terms of learnability and usability.
DDJ: One of your steps in increasing learnability and usability was to scuttle
the file-system model?
RC: I think one of the major conceptual stumbling blocks is the presence of a
file system, where the user has got to deal with developing a conceptual
metaphor where they store things in these tiny hidden places that are
represented by little character strings or maybe little icons plus a character
string, and then they have to shuttle it from its invisible place into the
desktop and can only have a few things present. So we really ought to come up
with a conceptual metaphor that would be much richer than the desktop metaphor
and also just do away with the dichotomy between your workplace and your
filing place, and just have, in essence, your workplace.
DDJ: Increasing learnability and usability would seem to be necessary if one
wants to open up the market for computers to people who don't, or can't, or
won't use them today. Pen-based computing seems to be, in large part, intended
for just that market-expanding purpose. It opens computer use up to people who
can't or won't type, who don't sit at a desk. Didn't you also have the idea,
in developing PenPoint, that what you were going to be doing would address a
considerably larger audience than personal computers today?
RC: Absolutely. In fact, that was one of the major motivations for a new
operating system. We felt that the vast majority of our market that we would
be going after would be a market of new users who would be oftentimes using
new applications and would be using it on new hardware that would be fairly
different from existing PC hardware configurations. So when you have a new
market and new users and new applications and new hardware, a new OS was the
only missing part of the puzzle, and that was the business opportunity we
wanted to go after. And to go after new users, we saw we'd have to have an
easier to learn and use system. Another goal was to make application
development easier and also have it provide many of the benefits of integrated
software to end users across all applications on the system. Hence my own
roots in doing integrated software and being an application developer.


Back to the Future


DDJ: So we're back to integrated software.
RC: Basically, what we wanted to do was find as many elements that had
traditionally been done at the application level and done again and again and
again and again at the application level, reinvented by every application
team, and take those out and implement them once in the OS, so that you speed
application development and give them a more consistent user interface. The
way we knit it all together is with something called the Application
Framework.
DDJ: Interesting name.
RC: Hm. And of course we draw on such things as MacApp from the Apple world,
and there's also been some university research in application frameworks, and
we were able to draw on those sources in putting these application building
blocks, if you will, right into the OS and letting all applications inherit
them.
DDJ: Object orientation is a vogue term, and as such is often used pretty
loosely. My understanding is that your approach to PenPoint was object
oriented from the start. What did -- what do -- you mean by object
orientation?
RC: What we mean when we say that PenPoint is object-oriented is that the
programming interfaces are implemented as a sequence of objects that you can
send messages to, and that these objects are instances of classes, and classes
can be subclassed, thereby modifying their behavior, and of course the objects
tend to encapsulate information and behavior and hide the actual data
representation from the folks that want to communicate with them. So I think
anybody who has studied what I'd call "true object orientation" in the
Smalltalk sense will find that when we talk about object orientation, that's
what we mean. Almost all the programming interfaces are objects to send
messages to. It is not a lot of function calls.
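As an illustration of the message-passing style described here, the following is a minimal sketch in Python rather than PenPoint's actual C interfaces. Every name in it (Cls, Obj, send, clsLabel, msgGetText, and so on) is hypothetical. It shows objects that hide their data, classes that map messages to handlers, and a subclass that changes behavior by overriding one message.

```python
class Cls:
    """A class: a table of message handlers plus an optional ancestor."""
    def __init__(self, name, ancestor=None, handlers=None):
        self.name = name
        self.ancestor = ancestor
        self.handlers = dict(handlers or {})


class Obj:
    """An instance: its data is meant to be reached only through messages."""
    def __init__(self, cls, **data):
        self.cls = cls
        self._data = data


def send(obj, msg, *args):
    """Deliver msg to obj, searching its class and then each ancestor."""
    cls = obj.cls
    while cls is not None:
        handler = cls.handlers.get(msg)
        if handler is not None:
            return handler(obj, *args)
        cls = cls.ancestor
    raise LookupError(f"{obj.cls.name} does not understand {msg}")


cls_label = Cls("clsLabel", handlers={
    "msgGetText": lambda self: self._data["text"],
    "msgDraw": lambda self: f"[{send(self, 'msgGetText')}]",
})

# The subclass overrides one message and inherits the rest.
cls_shouting_label = Cls("clsShoutingLabel", ancestor=cls_label, handlers={
    "msgGetText": lambda self: self._data["text"].upper(),
})

print(send(Obj(cls_shouting_label, text="hello"), "msgDraw"))
```

Because callers only send messages, the subclass can change how text is produced without the drawing code or its clients knowing anything about the stored representation.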
What we've done that no one else has done yet, because no one else has done an
entire operating system in an object-oriented fashion, is we've extended the
usual object orientation notions in several areas that are necessary to do a
true modern OS in an object-oriented fashion. For instance, our object model
allows objects in multiple processes to communicate with each other, and we
allow objects to be global and to be shared between application instances,
which is a good example of a technical requisite. If you're going to have a
high degree of integration in an environment, you want to be able to share
objects between different application instances. And yet languages such as C++
don't address the notion of sharing objects or even classes between
applications. They're just designed to structure the guts of a single
application in an object-oriented fashion. They're not designed for a whole
lot of applications sharing a lot of code and behavior between them in an
operating system where applications and classes can be installed and
deinstalled. When a
new application is installed, what actually happens is we have to graft a lot
of their classes onto the class hierarchy.

Well, what happens when a class comes in that is just another version of an
existing class? You have versioning issues, all of which are important in the
real world, where an end user is not necessarily going to want to install new
copies of all their programs any time they get any kind of an update of any
piece of an application.
Another issue is the persistence of objects: the notion that you can file
objects and then later on unfile them and they need to get coupled up with the
code that represents their behavior. Those are all notions that we've built
right into PenPoint's object model that underlies the language implementation
of the object-oriented paradigm.
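The filing and versioning ideas can be sketched as follows. This is an illustration under stated assumptions, not PenPoint's mechanism: the registry, the JSON encoding, and all names are hypothetical. An object is filed as plain data tagged with its class name and version, and unfiling looks up installed code to re-couple the data with its behavior.

```python
import json

CLASS_REGISTRY = {}   # (class name, version) -> class; stands in for installed code


def install(cls):
    """Record a class so that filed objects can find their code again."""
    CLASS_REGISTRY[(cls.__name__, cls.VERSION)] = cls
    return cls


@install
class Note:
    VERSION = 1

    def __init__(self, text):
        self.text = text

    def to_state(self):
        return {"text": self.text}

    @classmethod
    def from_state(cls, state):
        return cls(state["text"])


def file_object(obj):
    """Write an object in passive form: class name, version, and plain data."""
    record = {"class": type(obj).__name__, "version": obj.VERSION,
              "state": obj.to_state()}
    return json.dumps(record)


def unfile_object(blob):
    """Read passive data and re-couple it with the installed class code."""
    record = json.loads(blob)
    key = (record["class"], record["version"])
    if key not in CLASS_REGISTRY:
        raise LookupError(f"no installed class for {key}")
    return CLASS_REGISTRY[key].from_state(record["state"])


restored = unfile_object(file_object(Note("call Norm")))
print(restored.text)
```

Keying the registry on both name and version is one simple way to notice that a filed object expects code that is not installed; a real system would also need a policy for upgrading old data.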


The PenPoint "File System"


DDJ: Tell me about how PenPoint deals with documents and files.
RC: What we've done is, we've hidden the file system from the user. In
PenPoint, you just turn pages in the notebook, and the operating system will
automatically create an application instance and a process for the new page
that you're turning to. Then that application will get its data out of a file
system, but that file system is completely hidden from the user.
DDJ: So you do have a file system. It almost sounded earlier as though you had
dumped the notion completely.
RC: It turns out that the file system is a very sound concept to have in an
operating system. It's the place where the application can store its data in
what I'll call a passive form, which is a form in which you're not using
actual memory addresses. The active form typically cannot survive across a
crash or a restart or an application reload, so to store the data in a robust
fashion so that it can be used again even if the OS restarts or even if you
load the data into a different machine that has a different memory layout, you
need to store it in this passive kind of memory-location neutral format. We
wanted to be sure that PenPoint was a very robust system such that even if the
OS itself crashed we could do an on-the-fly restart and have the user right
back with all the data. So we have applications, behind the scenes, store
their data in a fairly traditional and robust protected file system.
DDJ: So what does the file system look like to a developer?
RC: To the developer, the APIs to the file system are object oriented, so
there are some classes that you talk to, a class directory handle, for
instance. The kinds of functions you can perform on the file system are very
traditional: You can have stream-oriented I/O, you open files, you close
files.
The actual feature set of the file system is that it's a blend of the OS/2
HPFS and the Macintosh system. You can have long filenames of 31 characters;
the files can have attributes attached to them; files have what you might
think of as a resource fork as well as a data fork. It's a good, modern,
sophisticated file system. What we do is we map this file system onto the
local [device]. If you store these files, say, on an MS-DOS floppy or on an
IBM PC or on a Macintosh, we'll preserve all these semantics when we store the
files even on these foreign file systems.
DDJ: And what does the user see?
RC: What the user sees is just a document that they turn to. That document is
a page in the notebook, and one of PenPoint's major innovations is that inside
that document or application instance -- the two are synonymous -- the user
can actually place other applications or documents and create a hierarchy of
applications. This is a common theme again from Framework, where you could
create a hierarchy of objects. The benefit to the user is that they can create
a live compound document just as easily as they can create a single document
that's of a single data type.
DDJ: Compound documents are certainly a hot topic right now.
RC: I don't pretend that all users all of the time need to create compound
documents; in fact, I'm leery that many users very often need to do that in
the traditional ways that our marketing people in this industry show. Of
course, everybody likes to show a spreadsheet and a graph inside a letter,
right?
DDJ: Makes a good demo.
RC: I'm skeptical that that actually gets done very often. There are times
when users mix spreadsheets and graphics into their word processing, but there
are many other kinds of more subtle examples that can be achieved by the same
architecture, and that will be more important as time goes by. Mixing in-line
Post-It notes and signatures into the body of a word processing document, or a
logo or voice annotation Post-It note or building applications that you might
think of as custom or vertical applications, but where you are reusing other
applications as your building blocks.
DDJ: Dave Winer has talked in our pages about achieving that through
interapplication communication on the Macintosh. How do you support it in
PenPoint?
RC: Technically, what you need to do there is to put the building blocks under
the control of your new meta-application. Maybe what you're creating is an
interactive report viewer that's going to show some data on the screen to your
executive. If the operating system supports you sticking a spreadsheet or a
database application right into the middle of your new meta- or vertical
application, that can make your job very easy. But the OS has got to support
the other application running right inside yours or else you're going to be
stuck with having to start from scratch.
DDJ: Tell me what your multitasking is like.
RC: Our multitasking is most similar to OS/2. We provide a model in which
processes are the basic unit of ownership for resources such as memory, and
within a process you can have child threads, so you can get multitasking even
inside of a process if you want. The basic rule is that every application
instance is a separate process. So, if an application crashes, what the
operating system will do is shut that process down, but the rest of the
operating system and other applications will keep going. We have a decent
array of interapplication primitives, semaphores, and the like.
One of the interesting notions about PenPoint from a process viewpoint is that
the application framework defines what you might think of as a life cycle of
an application. The life cycle starts with the notion of an application
instance being created, and in each stage of the life cycle there are explicit
messages that will tell an application object to move from one level of the
cycle to another. So there's a message to create an instance of an
application; there's a message to activate, which means be awake, so that
you've got all your data open and ready to work on and you're ready to
communicate with other application objects but you're not yet displayed on a
screen; and the highest level of nirvana is to actually display yourself. And
then there are backing-off stages for that: You're no longer on the screen but
you're still active; then there's closing you down such that you're completely
filed and you don't have an active process; and lastly there's deleting the
instance.
An application basically needs to respond to these messages, and somebody
outside of the application tells the application when to wake up, when to be
created, when to shut down, when to display itself. So you can imagine that
when you turn a page, the notebook is sending these messages to an
application. But if an application is embedded inside of another application,
then it is the parent that can be sending these messages to the child or
embedded application, thereby controlling its fate.
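The life cycle described above can be sketched as a small state machine. The state and message names below are assumptions made for illustration, not PenPoint's actual messages. The point is that something outside the application (the notebook, or a parent application) drives each transition.

```python
# (current state, message) -> next state; all names are hypothetical.
TRANSITIONS = {
    ("none", "create"): "dormant",        # instance exists, no active process
    ("dormant", "activate"): "active",    # data open, can talk to other objects
    ("active", "display"): "displayed",   # on the screen
    ("displayed", "hide"): "active",      # off the screen but still active
    ("active", "close"): "dormant",       # filed again, no active process
    ("dormant", "delete"): "none",        # instance removed
}


class AppInstance:
    def __init__(self):
        self.state = "none"
        self.children = []                # embedded application instances

    def receive(self, message):
        key = (self.state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {message} while {self.state}")
        self.state = TRANSITIONS[key]


def turn_to(page):
    """What a notebook or parent might send when a page is turned to."""
    for message in ("activate", "display"):
        page.receive(message)
    for child in page.children:           # the parent controls its children
        turn_to(child)


page, chart = AppInstance(), AppInstance()
page.receive("create")
chart.receive("create")
page.children.append(chart)
turn_to(page)
print(page.state, chart.state)
```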


PenPoint Portability and Opportunities


DDJ: You did the original work on PenPoint on the 286, but you've been moving
it over to the 386, which looks like the assumed machine today. Wasn't that
early work painful?
RC: If we were doing things over today, we would pick the 386, to be honest. I
guess we called that one wrong. But in '87 when we chose the 286, from both a
hardware and from a software viewpoint it seemed like a conservative choice to
make.
As a startup back then with just five or six people, we felt we were taking on
enough risk points so that pushing the envelope in the processor choice area
was something we chose not to do. We did not know of a low-power version of
the 386 back in '87; in fact, Intel probably didn't know about it, because
they probably didn't have their SL chipset even underway back then. And back
then I felt that the tools for the 286 were there and were fairly reliable
because OS/2 was driving the 286 protect mode. There were a couple of 386
compilers -- I wasn't familiar with them, so I don't want to denigrate them,
but I was fearful that the tool support for the 386 native mode would be weak.
What happened was that by 1990 we were successful enough, and we learned about
the 386SL chipset that was due to come from Intel, so that it became clear
that we had an opportunity to switch over to the 386 before we ever really
reached the market, and thereby avoided getting trapped in the Intel
segmentation world with an installed base.
DDJ: Was it a difficult port?
RC: We had always designed and programmed PenPoint to be a 32-bit operating
system, so even on the 286 all the programming interfaces were 32 bits wide.
Of course that compiled down sometimes to some costly instruction sets,
because you've got to load the segmentation register and the offset register.
Because we had done that, moving to the 386 was a pretty straightforward thing
to do, and we're just about done doing that. We had to rewrite portions of the
multitasking kernel, because the 386 native mode is different from the 286
mode, and we have added virtual memory support to the kernel. But above that
kernel level, for all the rest of PenPoint, it's been pretty much a
recompilation. And that's all it's going to be for application developers
who've been coding on the developer's release.
DDJ: Other platforms?
RC: We designed PenPoint for portability. You'll see us exploit that to the
greatest degree we can in the next couple of years. First of all, now that we
are a flat-memory model operating system written in C, and a 32-bit one, we
are just as portable as UNIX. Arguably a bit more portable. So you may very
well see us move to other processors, including RISC processors.
To further aid portability we've isolated as many machine dependencies as
possible in a layer we call the MIL, the Machine Interface Layer. It's
somewhat similar to a BIOS layer, but it's a more modern version; it's an
extensible and reentrant form of the BIOS. Our hardware companies are
responsible for implementing the MIL layer. If they want to have some unique
features in their hardware, they can write some new MIL code to provide a
low-level interface, and then PenPoint applications will support that. The
last aspect of portability that's really important to mention is scalability
to different configurations.
DDJ: Meaning different display sizes and aspect ratios?
RC: Well, there are two dimensions to it. One dimension is the memory
configuration. We will run on a machine that has no disk, and therefore is
just what I will call a "single-tier" memory configuration. But we will also
run on a machine that has a disk or backing store of some kind. Of course,
when we do that, we hide the fact that there's a two-tier configuration from
the user. We
hide the fact that there's a disk drive there, because we also hide the notion
of a file system. Hence the virtual memory implementation which makes it
transparent to the user. So that's one way we scale: We'll run on diskless
machines as well as on disk machines.
The other way we scale is that the user interface will adjust to fit different
screen sizes and aspect ratios. So we can run on much less than 640 x 480
pixels. And that's very important because devices the size of your [reporter's
4 x 8-inch] notepad will be very popular I think in just about a year or two.
DDJ: You said "arguably more portable than UNIX." Were you giving me that
argument when you talked about the MIL layer?
RC: Yes, and also, portable while retaining binary compatibility. Because when
we license [PenPoint] to hardware companies, we will not give them source code
access to it. They will be responsible for implementing a MIL, and then the
PenPoint binary will run directly on top of that MIL and what they will get
from us is a copy of PenPoint binary. And our license requires them to provide
for not only API but also full binary compatibility. So the applications
market should grow fairly rapidly with PenPoint, since there is one
application binary that will run on all these PenPoint machines. But the
hardware companies, all of a sudden, can begin designing hardware that's
different from other hardware, whereas in the PC world they've had to clone
the hardware even down to the trace level so that they could be assured of
running the software.
DDJ: You mentioned RISC. Anything in the works?
RC: We're actively evaluating various opportunities to move PenPoint to RISC
processors. That's all I can say now. But we think RISC processors offer some
real benefits to users in terms of their performance, and also low cost and
low power consumption. And if we do move PenPoint to any of these RISC
processors, we'll be ensuring that the same data files work -- that it's one
single data file format for all applications across the different processor
families. But while I praise the RISC opportunities, we're also very pleased
by what we've seen from Intel in terms of their interest and cooperation in
the pen market. They have told us that they are very committed to providing
competitive processors there, and we believe them.
DDJ: GO and Phoenix Technologies have announced an agreement that provides
Phoenix with GO's hardware design -- its electrical and mechanical
specifications. What does the Phoenix deal mean to GO?
RC: I view Phoenix as icing on the cake for GO. Our main business is providing
an operating system to the hardware companies, and there are a couple dozen we
are working with right now. What Phoenix offers us -- and we're very pleased
to be working with them -- is the opportunity to reach many hardware companies
that we could not get to right now. They're selling a pen-based hardware
design to these companies, and also offering both BIOS and MIL
implementations. So I think what they'll basically do is speed the development
of the clone market in pen computers. Whereas in the '80s that took many years
to develop, we may well see some clone companies shipping good pen offerings
next year.
DDJ: You say that pen computing offers a lot of significant opportunities for
software developers. One area in which the technology is certainly still in
development is handwriting recognition. Are we going to see in our lifetimes a
machine that can read this? [Swaine's notes]
RC: I think it's unlikely that we'll see computers recognizing handwriting
that humans can't recognize. Unfortunately today, however, computers can't
recognize handwriting that humans can. So there's definitely room for
improvement. I think we'll see rapid improvement. First of all, we believe
it's a software problem, not a hardware problem, and that the algorithms
should stay in software for quite a few more years. Sometimes people wonder if
there's some sort of special hardware that can help handwriting recognition.
We think that that's not the case. The technology is still very young. Not
that many man-years have been invested in handwriting recognition over the
last ten thousand years. There have been only a few noticeable products,
projects that have gone on during the '70s and '80s.
DDJ: And that's going to change?
RC: Now that Microsoft and GO are offering operating systems in which the
handwriting module can be replaced by a third party, for the first time ever
there's a market for handwriting recognition engines. We think that that'll
unleash a lot of entrepreneurial and technical innovation and that we'll see
rapid progress, both because more talented teams are working on the
problem, but also because the teams in place are continuing to work on it, and
perhaps getting more real-world feedback on their approaches. And then lastly,
much of the handwriting recognition is CPU-bound, and since CPUs are tending
to double their performance every 18 months, time will help solve this
problem.
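The 18-month doubling figure can be turned into a quick estimate. This is simple arithmetic under the assumption that the trend holds steadily; the function name is illustrative.

```python
def speedup(years, doubling_months=18):
    """Relative CPU performance after the given number of years."""
    return 2 ** (years * 12 / doubling_months)


for years in (1.5, 3, 6):
    print(f"after {years} years: about {speedup(years):.0f}x")
```

On that assumption a CPU-bound recognizer would run about four times faster after three years and about sixteen times faster after six, with no change to the software.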


Next Month


Carr continues his conversation in December's "Programming Paradigms,"
covering the PenPoint Notebook User Interface (NUI), imaging model, and more.



A Technical Overview of PenPoint


Ray Valdes
GO's PenPoint environment, in development for four years now, represents
perhaps a hundred person-years of software development effort. What was
accomplished over that time?
After taking the week-long developer course last November, and doing a little
work on my own on the system, I've found that the results of GO's effort are
indeed impressive:
A portable, protected-mode operating system with preemptive multitasking and
much of the power of OS/2
A windowed user interface environment that serves needs similar to Microsoft
Windows
A graphic primitive layer with much of the high-end functionality of Display
PostScript
An object-oriented application framework similar in purpose to Apple's MacApp
or Borland's ObjectWindows
Support for detachable networking, deferred data transfer, and reconnectable
volumes without peer in mainstream environments
Development tools that support object-oriented programming in C; also
utilities like a source-level debugger and outline font editor; and, further,
a prototype stylus computer (the Lombard), available to developers
Oh, by the way, it also supports handwriting recognition.
The dry list above does not hint at the impact of using a PenPoint machine for
the first time -- an exhilarating rush similar to my first experience of using
a Mac, eight years ago.
Given such results, a hundred person-years is a scant amount of time. GO has
been able to reach its ambitious goals by front-loading the team with
high-powered talent. Many, if not most, developers there have earned their
spurs on some significant past project in the PC industry.
Lest you think I've been snowed by GO's technical evangelists, I do have some
criticisms of the system, presented further on. First, a brief tour of this
rich and complex environment.
Unlike other platforms' system components, many of which were designed
independently of each other and now suffer through an uneasy coexistence,
PenPoint can take advantage of the "clean slate" approach. There are no quirky
DOS misfeatures to circumvent or tolerate.
At the higher layers, the object orientation of the system is paramount.
Designing a PenPoint application is similar to creating one for the Smalltalk
environment, in that you instantiate and/or subclass the components of your
app from existing classes in the application framework (in this case, three
classes: clsApp, clsView, and clsObject).
Mobile stylus computers have memory constraints much stricter than those of
desktop machines, so code reuse is essential. PenPoint maximizes code reuse
through its object orientation and via a much higher level of application
integration than that in Windows or Presentation Manager.
Both Microsoft and Apple have now been fostering increased integration among
applications -- through protocols such as Object Linking and Embedding (OLE)
or System 7 AppleEvents. But PenPoint has the advantage here because the layer
is not added on after the fact, but inherent in the design.
A key concept in PenPoint is "scalability," which means, among other things,
that all visual components resize and scale themselves at runtime to fit the
desired display format. This concept is woven into the system at a meticulous
level of detail, from the system font, to the system menus, to other objects;
all these resize and reposition themselves automatically according to
user-defined constraints.
Another concept thoroughly blended into all system behavior is the notion of a
core set of gestures that work in all contexts and modes.
Viewing the layers of PenPoint's software architecture from bottom to top, the
lowest level is the PenPoint Kernel. This component uses a traditional
function-call programming interface, as opposed to an object-oriented
protocol. The Kernel offers the standard features of a modern OS, such as
priority-based preemption, multitasking, multithreading, interprocess
communication and synchronization (through semaphores), and memory protection
of code and data.
To support mobile computing, the OS powers down the CPU when all tasks are
idle, and wakes it up for pen events and timer alarms. To accommodate
diskless, memory-constrained systems, execution of programs does not imply two
copies of an application's code (one loaded into RAM and another stored on the
file system). Rather, a single relocated image is shared across all clients.
Hardware memory protection helps preserve the integrity of code and data in
case of a system crash.
The current developer release of PenPoint supports memory-management functions
specific to Intel's segmented 286 architecture. The commercial version will
include changes to support a 386-style architecture (32-bit flat model).
Sitting next to the Kernel at the lowest levels of the operating system is the
Class Manager. This supports the objects used by higher-level components such
as the UI Toolkit and the File System. The Class Manager provides
message-passing capability for PenPoint objects, as well as enabling object
creation, behavior inheritance, and runtime definition of classes. Although
class definitions are built dynamically at runtime, the specifications are
usually written at compile time.
How is this done without using an object-oriented language? GO has established
a set of C programming conventions, including preprocessor macros and
#defines, and a special utility (the method-table compiler) that together
enable a form of OOP, despite the use of an object-resistant language.
Applications are written in C on DOS machines, compiled with Microsoft C
6.0, and then linked by the OS/2 segmented executable linker with PenPoint
runtime libraries. They are then either downloaded to a PenPoint tablet or
tested on a developer version of PenPoint that boots up on your DOS desktop
machine.
GO documents say that as much as 80 percent of the line-by-line structure of
an application program's code consists of statements that send messages to
objects via Class Manager calls. This, for me, is the most irritating aspect
of writing PenPoint apps. Although the system design is beautiful, the mundane
details of coding are cumbersome and ugly. Briefly, sending a message in
PenPoint involves declaring a struct to hold the message parameters, filling
in the struct members with parameter values, and then making a function call
to the Class Manager. It's possible that GO may offer a true object-oriented
language (C++?) at some point in the future. Why didn't GO choose C++ at the
outset? Four years ago, when the development effort began, there were no C++
compilers available for the 80x86 platform. Another obstacle is that C++
doesn't allow for runtime definition of classes.
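To make those three steps concrete, here is a minimal sketch in plain C. Every name in it (DemoObject, demo_send, DEMO_MSG_SET_LABEL, and so on) is hypothetical and is not GO's Class Manager interface; it only illustrates the general shape of a struct-based message send, with a method table that can be filled in at run time and a lookup that falls back to an ancestor class.

```c
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; this is not GO's Class Manager API. */

typedef int (*DemoMethod)(void *self, void *args);

typedef struct {                 /* one row of a class's method table */
    int        msg;              /* message identifier */
    DemoMethod method;           /* handler for that message */
} DemoMethodEntry;

typedef struct DemoClass {
    const struct DemoClass *ancestor;  /* inherited behavior, or NULL */
    const DemoMethodEntry  *table;     /* may be built at run time */
    size_t                  count;     /* number of rows in table */
} DemoClass;

typedef struct {
    const DemoClass *cls;
    char             label[16];
} DemoObject;

enum { DEMO_MSG_SET_LABEL = 1, DEMO_STS_OK = 0, DEMO_STS_UNKNOWN = -1 };

typedef struct {                 /* parameter block for DEMO_MSG_SET_LABEL */
    const char *text;
} DemoSetLabelArgs;

/* Look the message up in the object's class, then in its ancestors. */
int demo_send(DemoObject *obj, int msg, void *args)
{
    const DemoClass *c;
    size_t i;
    for (c = obj->cls; c != NULL; c = c->ancestor)
        for (i = 0; i < c->count; i++)
            if (c->table[i].msg == msg)
                return c->table[i].method(obj, args);
    return DEMO_STS_UNKNOWN;
}

/* Handler: copy the label out of the parameter block, truncating if needed. */
int demo_set_label(void *self, void *args)
{
    DemoObject *obj = self;
    const DemoSetLabelArgs *a = args;
    strncpy(obj->label, a->text, sizeof obj->label - 1);
    obj->label[sizeof obj->label - 1] = '\0';
    return DEMO_STS_OK;
}

/* The three steps described in the text: declare, fill in, send. */
int demo_label_object(DemoObject *obj, const char *text)
{
    DemoSetLabelArgs args;                             /* 1. declare struct */
    args.text = text;                                  /* 2. fill members  */
    return demo_send(obj, DEMO_MSG_SET_LABEL, &args);  /* 3. make the call */
}
```

Even in this toy form, one logical operation costs three statements at every call site, which is the tedium complained about above.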
PenPoint's file system is strictly hierarchical rather than graph-structured
(that is, like DOS or the Mac, not UNIX). The File System consists of volumes,
directories, and files. One feature addressing mobile computing is the ability
to mount and dismount volumes easily. Files can have user-defined attributes.
A nice feature is the ability to subclass files. This facilitates implementing
application-specific structured files, such as Mac-style resource files.
Interestingly, the File System is a higher-level abstraction that can be
implemented on top of lower-level existing file systems such as DOS or the
Mac. In fact, the version of PenPoint that boots from DOS machines implements
a single PenPoint file with a DOS subdirectory containing multiple DOS files.
Continuing up the layers of the system, there is ImagePoint -- a set of
graphics primitives that go beyond those in Windows' GDI, Presentation
Manager's GPI, or Apple's QuickDraw, to approach Display PostScript's
functionality.
A single imaging model is used for both screen display and printer output. All
objects can be arbitrarily scaled, rotated, and translated. Display objects
include Bezier curves (no PostScript-style paths, however), as well as sampled
images in a variety of types and formats.
Further up, there is the Windowing layer, which supports operations on window
objects, such as moving, sizing, repainting, clipping, filing, enumerating,
and so on. As in Microsoft Windows, you can have a bunch of parent and child
windows instantiated in a tree structure. Unlike other systems, windows in
PenPoint can reposition and resize themselves (and their children)
automatically. This is a key feature that enables system scalability. It is
backed up by some sophisticated algorithms to minimize CPU cycles.
The UI Toolkit implements the middle layer of PenPoint's user interface. The
UI Toolkit calls on the Windowing and Graphics subsystems to draw user
interface components and graphics primitives. In turn, the Application
Framework and the classes implementing GO's Notebook metaphor (NUI) use UI
Toolkit objects extensively.
The UI Toolkit contains user interface "widgets" such as buttons, labels,
borders, tables, menus, pop-ups, scrollbars, list boxes, option sheets (dialog
boxes), editing fields, and icons. Many of these objects can recognize
pen-based commands and capture handwritten text. Nearly every window which has
a label or responds to the pen is a UI component of some sort. Their behavior
and appearance are modified through subclassing.
The Application Framework, the highest level in the PenPoint software
architecture, allows for relationships among application-level entities.
Compound documents can be built by embedding one application within another.
Thus, developers of PenPoint apps need not build all aspects of an application
in order to provide a full-featured solution to their customers. Rather, they
can concentrate on the critical pieces of software that they know best, such
as implementing an efficient scientific algorithm or a complete set of
actuarial rules. With this unprecedented collaboration among application-level
components, GO hopes that the whole will be much more than the sum of its
parts.
Experiencing PenPoint as a user is thoroughly enchanting. As a programmer,
however, there are some thorns in the side. Already stated: the coding of
message sends. Also, it takes about a minute to reboot the system from a DOS
desktop machine, never mind the compile and link cycle. Developing for
PenPoint is similar to Mac development in its infancy, when you needed a Lisa
and had to sit through compile-and-download cycles.
The people at GO, of course, are well aware of these and other criticisms, and
are addressing them. Third-party vendors are also busy here.
From a distance, the design appears well-conceived. Closer up, there are
blemishes now being touched up. I've not worked with the system enough to
discover those irritations and obstacles that only surface over time. When
these appear, I don't think they'll diminish the compelling beauty of the
system's design and implementation.
--R.V.




















November, 1991
 LOADING DEVICE DRIVERS FROM THE DOS COMMAND LINE


DEVLOD loads both character and block device drivers


 This article contains the following executables: DEVLOD.ARC


Jim Kyle


Jim has published more than a dozen books and hundreds of magazine articles,
and has been Primary Forum Administrator of Computer Language magazine's forum
on CompuServe since 1985. Most recently, he is responsible for the revised
editions of Que's DOS Programmer's Reference and Using Assembly Language, and
is a coauthor of Undocumented DOS, edited by Andrew Schulman, from which this
article has been adapted.


Ever have an MS-DOS program that required the presence of a device driver, and
wish you had a way to install the driver from the command-line prompt, rather
than having to edit your CONFIG.SYS file and then reboot the system?
Of course, you can be thankful that it's so much easier to reboot MS-DOS than
it is to rebuild the kernel, which must be done to add a device driver to
UNIX. While DOS 2.x borrowed the idea of installable device drivers from UNIX,
it's often forgotten that DOS in fact improved on the installation of device
drivers by replacing the building of a new kernel with the simple editing of
CONFIG.SYS.
But still, most DOS users occasionally wish they could just type a command
line to load a device driver and be done with it.
Also, developers of device drivers often wish they had a way to debug the
initialization phase of a device driver. This type of debugging usually
requires either a debug device driver that loads before your device driver, or
a hardware in-circuit emulator. But if only we could load device drivers
after the normal CONFIG.SYS stage....
Well, wish no more. Command-line loading of MS-DOS device drivers is not only
possible, it's relatively simple to accomplish, once you know a little about
undocumented DOS. This article presents such a program, DEVLOD, written in a
combination of C and assembly language. All you have to type is DEVLOD,
followed by the name of the driver to be loaded, and any parameters needed,
just as you would supply them in CONFIG.SYS. For example, instead of placing
the following in CONFIG.SYS: device=c:\dos\ansi.sys, you would type
the following on the DOS command line: C:\>devlod c:\dos\ansi.sys.
There are several ways to verify that this worked, but perhaps the simplest is
to write ANSI strings to CON and see if they are properly interpreted as ANSI
commands. For example, after a DEVLOD ANSI.SYS, the following DOS command
should produce a DOS prompt in reverse video: C:\>prompt $e[7m$p$g$e[0m.
DEVLOD loads both character device drivers (such as ANSI.SYS) and block device
drivers (drivers that support one or more drive units, such as VDISK.SYS),
whether located in .SYS or .EXE files.


How DEVLOD Works


To install a device driver, a program must first locate the driver and
determine its size, then reserve space for it. Because this space is almost
certain to be at a higher memory address than the loader itself, the loader
moves itself up above the driver area, so that memory space will not be unduly
fragmented. Once the space is set up, DEVLOD loads the driver file and links
it into the chain of drivers that MS-DOS maintains. Next, the program calls
the driver's own initialization code, and finally returns to DOS, leaving the
driver resident but releasing all space that is no longer needed. The basic
structure of the DEVLOD program is shown in Figure 1.
Figure 1: Basic structure of DEVLOD

 startup code (C0.ASM)
   main (DEVLOD.C)
     Move_Loader
       movup (MOVUP.ASM)
     Load_Drvr
       INT 21h Function 4B03h (Load Overlay)
     Get_List
       INT 21h Function 52h (Get List of Lists)
       based on DOS version number:
         get number of block devices
         get value of LASTDRIVE
         get Current Directory Structure (CDS) base
         get pointer to NUL device
     Init_Drvr
       call DD init routine
         build command packet
         call Strategy
         call Interrupt
     Get_Out
       if block device:
         Put_Blk_Dev
           for each unit:
             Next_Drive
               get next available drive letter
             INT 21h Function 32h (Get DPB)
             INT 21h Function 53h (Translate BPB -> DPB)
             poke CDS
             link into DPB chain
       Fix_DOS_Chain
         link into dev chain
       release environment space
       INT 21h Function 31h (TSR)

DEVLOD loads device drivers into memory using the documented DOS function for
loading overlays, INT 21h Function 4B03h. An earlier version read the driver
into memory using DOS file calls to open, read, and close the driver, but this
made it difficult to handle .EXE driver types. By instead using the EXEC
function, DEVLOD makes DOS take care of properly handling both .SYS and .EXE
files.
DEVLOD then calls undocumented INT 21h Function 52h (Get List of Lists) to
retrieve the number of block devices currently present in the system, the
value of LASTDRIVE, a pointer to the DOS Current Directory Structure (CDS)
array, and a pointer to the NUL device. The location of these variables within
the LoL (List of Lists) varies with the DOS version number. (See Table 1,
which explains LoL, CDS, and similar alphabet soup in more detail.)
Table 1: DOS and BIOS data structures

 Data
 Structure   Description
 -----------------------------------------------------------------------

 BPB         The BIOS uses the BPB (BIOS Parameter Block) to learn the
             format of a block device. Normally, the BPB is part of a
             physical disk's boot record, and contains information such
             as the number of bytes in a sector, the number of root
             directory entries, the number of sectors taken by the File
             Allocation Table (FAT), and so on.

 CDS         The CDS (Current Directory Structure) is an undocumented
             array of structures, sometimes also called the Drive Info
             Table, which maintains the current state of each drive in
             the system. The array is n elements long, where n equals
             LASTDRIVE.

 DPB         For every block device (disk drive) in the system, there is a
             DPB (Drive Parameter Block). These 32-byte blocks contain
             the information that DOS uses to convert cluster numbers into
             Logical Sector Numbers, and also associate the device driver
             for that device with its assigned drive letter.

 LoL         Probably the most commonly used undocumented DOS data
             structure, the List of Lists is the DOS internal variable
             table, which includes, among other things, the LASTDRIVE
             value, the head of the device driver chain, and the CDS
             (Current Directory Structure). A pointer to the LoL is
             returned in ES:BX by undocumented DOS INT 21h Function 52h.
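The version-dependent offsets can be gathered into a single lookup. What follows is a minimal sketch in portable C, not part of DEVLOD: the offsets and element sizes are the ones Get_List( ) uses in Listing One, while the LolLayout type and the lol_layout( ) function are names invented here for illustration.

```c
/* Offsets, relative to the List of Lists pointer, of the fields DEVLOD
   reads. The values are those used by Get_List( ) in Listing One; -1 marks
   a field that the listing does not read for that DOS version. */
typedef struct {
    int nblkdrs;     /* byte: number of block devices */
    int lastdrive;   /* byte: LASTDRIVE value */
    int cds_ptr;     /* far pointer to the CDS array */
    int nul_header;  /* start of the NUL device header */
    int cds_size;    /* size of one CDS element, in bytes */
} LolLayout;

/* Returns 1 and fills *out for a version the listing supports, else 0. */
int lol_layout(unsigned major, unsigned minor, LolLayout *out)
{
    static const LolLayout dos2  = {   -1,   -1,   -1, 0x17, -1 };
    static const LolLayout dos30 = { 0x10, 0x1B, 0x16, 0x28, 81 };
    static const LolLayout dos31 = { 0x20, 0x21, 0x16, 0x22, 81 };
    static const LolLayout dos4  = { 0x20, 0x21, 0x16, 0x22, 88 };

    switch (major) {
    case 2:
        *out = dos2;
        return 1;
    case 3:
        *out = (minor == 0) ? dos30 : dos31;
        return 1;
    case 4:
    case 5:
        *out = dos4;
        return 1;
    default:
        return 0;    /* DOS 1.x, the OS/2 box (10, 20), or unknown */
    }
}
```

Keeping the numbers in one table makes it easier to see that only DOS 3.0 differs from later releases in where the fields sit, and that DOS 4 and 5 differ from 3.x only in the size of a CDS element.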
DEVLOD requires a pointer to the NUL device because NUL acts as the "anchor"
to the DOS device chain. Because DEVLOD's whole purpose is to add new devices
into this chain, it must update this linked list. The other variables from the
List of Lists are needed in case we are loading a block device (which we won't
know until later, after we've called the driver's INIT routine).
If the DOS version indicates operation under MS-DOS 1.x, or in the OS/2
compatibility box, DEVLOD quits with an appropriate message. Otherwise, a
pointer to the name field of the NUL driver is created, and the 8 bytes at
that location are compared to the constant "NUL" (followed by five blanks) to
verify that the driver is present and the pointer is correct.
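The check itself amounts to an 8-byte comparison at offset 0Ah of the device header. Here is a small sketch in portable C that treats the header as a plain byte buffer; the function name and the use of memcmp( ) are choices made for this illustration (DEVLOD itself avoids the C library and loops over far pointers).

```c
#include <string.h>

#define DEV_NAME_OFFSET 0x0A   /* name field within an 18-byte device header */
#define DEV_NAME_LENGTH 8      /* names are blank-padded to eight bytes */

/* Returns 1 if the name field of the header at `hdr` is "NUL" followed by
   five blanks, 0 otherwise. */
int is_nul_header(const unsigned char *hdr)
{
    static const char expected[DEV_NAME_LENGTH + 1] = "NUL     ";
    return memcmp(hdr + DEV_NAME_OFFSET, expected, DEV_NAME_LENGTH) == 0;
}
```

Note that the padding matters: a name field holding "NUL" followed by zero bytes would not match.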
Next, DEVLOD sends the device driver an initialization packet. This is
straightforward: The function Init_Drvr( ) forms a packet with the INIT
command, calls the driver's Strategy routine, and then calls the driver's
Interrupt routine. As elsewhere, DEVLOD merely mimics what DOS does when it
loads a device driver.
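The two-call protocol is easy to model outside DOS. The sketch below is a simulation in portable C: the packet layout mirrors struct packet in Listing One, but the SimDriver type, the toy driver, and sim_init_driver( ) are all invented here, and ordinary function pointers stand in for the far calls that DEVLOD makes with the packet address in ES:BX.

```c
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
typedef struct {            /* mirrors struct packet in Listing One */
    uint8_t  hdrlen;
    uint8_t  unit;
    uint8_t  command;       /* 0 = INIT */
    uint16_t status;        /* bit 15 set means error */
    uint8_t  reserv[8];
    uint8_t  nunits;        /* returned: units supported (block drivers) */
    uint16_t brkofs;        /* returned: break address offset */
    uint16_t brkseg;        /* returned: break address segment */
    uint16_t inpofs;        /* in: offset of the argument text */
    uint16_t inpseg;        /* in: segment of the argument text */
    uint8_t  nextdrv;       /* in: next available drive number */
} InitPacket;
#pragma pack(pop)

/* Stand-in for a loaded driver: its two entry points. */
typedef struct {
    void (*strategy)(InitPacket *pkt);  /* records the packet address */
    void (*interrupt)(void);            /* acts on the recorded packet */
} SimDriver;

/* A toy driver used only for illustration. */
static InitPacket *sim_saved;
static void sim_strategy(InitPacket *pkt) { sim_saved = pkt; }
static void sim_interrupt(void)
{
    sim_saved->brkseg = 0x1234;         /* pretend resident code ends here */
    sim_saved->brkofs = 0x0010;
    sim_saved->status = 0x0100;         /* done bit set, error bit clear */
}
const SimDriver sim_toy_driver = { sim_strategy, sim_interrupt };

/* Returns 1 if the driver accepted INIT, 0 if it reported an error. */
int sim_init_driver(const SimDriver *drv, InitPacket *pkt,
                    uint16_t argseg, uint16_t argofs, uint8_t nextdrv)
{
    memset(pkt, 0, sizeof *pkt);
    pkt->hdrlen  = (uint8_t)sizeof *pkt;
    pkt->command = 0;                   /* INIT */
    pkt->inpseg  = argseg;
    pkt->inpofs  = argofs;
    pkt->nextdrv = nextdrv;

    drv->strategy(pkt);                 /* first call: hand over the packet */
    drv->interrupt();                   /* second call: do the work */

    return (pkt->status & 0x8000u) == 0;
}
```

The split into two calls explains why the Strategy routine does nothing but remember the packet address: all the real work happens in the Interrupt routine, which finds the packet where Strategy left it.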
If the device driver INIT fails, there is naturally nothing we can do but bail
out. It is important to note that we have not yet linked the driver into the
DOS driver chain, so it is easy to exit if the driver INIT fails. If the
driver INIT succeeds, DEVLOD can then proceed with its true mission, which
takes place (oddly enough) in the function Get_Out( ).
It is only at this point that DEVLOD knows whether it has a block or character
device driver, so it is here that DEVLOD takes special measures for block
device drivers, by calling Put_Blk_Dev( ). For each unit provided by the
driver, that function calls undocumented DOS INT 21h Function 32h (Get DPB --
see Table 1) and INT 21h Function 53h (Translate BPB to DPB), alters the CDS
entry for the new drive, and links the new DPB into the DPB chain. In short,
in Put_Blk_Dev( ), DEVLOD takes information returned by a block driver's INIT
routine and produces a new DOS drive.
DEVLOD pokes the CDS in order to install a block device, and needs a drive
letter to assign to the new driver. The function Next_Drive( ) is where DEVLOD
determines the drive letter to assign to a block device (if there is an
available drive letter). One technique for determining the next letter,
#ifdefed out within DEVLOD.C, is simply to read the "Number of Block Devices"
field (nblkdrs) out of the LoL. However, this fails to take account of SUBSTed
or network-redirected drives. Therefore, we walk the CDS instead, looking for
the first free drive. In any case, DEVLOD will update the nblkdrs field if it
successfully loads a block device.
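Walking the CDS can be sketched as follows in portable C, with the array modeled as a flat byte buffer. The flags word sits at offset 43h, after the 67-byte path field shown in the CDS typedef of Listing One. Treating an all-zero flags word as "unused" is an assumption of this sketch, and first_free_drive( ) is a name invented here.

```c
#include <stddef.h>
#include <stdint.h>

#define CDS_FLAGS_OFFSET 0x43   /* flags word follows the 67-byte path field */

/* Walk `lastdrive` CDS elements of `cds_size` bytes each and return the
   index of the first one whose flags word is zero (0 = A:), or -1 if every
   entry is in use. Treating flags == 0 as "free" is an assumption. */
int first_free_drive(const uint8_t *cds_base, int cds_size, int lastdrive)
{
    int i;
    for (i = 0; i < lastdrive; i++) {
        const uint8_t *entry = cds_base + (size_t)i * (size_t)cds_size;
        unsigned flags = entry[CDS_FLAGS_OFFSET]
                       | ((unsigned)entry[CDS_FLAGS_OFFSET + 1] << 8);
        if (flags == 0)
            return i;
    }
    return -1;
}
```

Because the element size is passed in rather than taken from sizeof, the same loop serves both the 81-byte elements of DOS 3.x and the 88-byte elements of DOS 4 and 5.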
Whether loading a block or character driver, DEVLOD also uses the "break
address" (the first byte of the driver's address space which can safely be
turned back to DOS for reuse) returned by the driver. Get_Out( ) converts the
break address into a count of paragraphs to be retained.
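The conversion is simple arithmetic. The listing's Get_Out( ) is the authority on what DEVLOD actually does; the sketch below, with a made-up function name, only shows one way to turn a segment:offset break address into a paragraph count measured from the PSP, rounding the offset up so that a final partial paragraph is kept.

```c
/* Paragraphs (16-byte units) to keep resident, counted from the PSP segment
   up to the driver's break address. The offset is rounded up to a whole
   paragraph so that no byte below the break address is released. */
unsigned paragraphs_to_keep(unsigned psp_seg, unsigned brkseg, unsigned brkofs)
{
    unsigned end_seg = brkseg + ((brkofs + 15u) >> 4);
    return end_seg - psp_seg;
}
```

For example, with the PSP at 0BEFh and a break address of 0BFFh:0100h, the count is 20h paragraphs: 10h for the PSP itself plus 10h for the first 256 bytes of the driver.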
The function copyptr( ) is called three times in succession to first save the
content of the NUL driver's link field, then copy it into the link field of
the new driver, and finally store the far address of the new driver in the NUL
driver's link field. The copyptr( ) function is provided in MOVUP.ASM,
described later in the text. Note again that the DOS linked list is not
altered until after we know that the driver's INIT succeeded.
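Stripped of far pointers, the three copies are an ordinary insertion at the head of a singly linked list. The sketch below uses a simplified DevHeader struct and a link_after_nul( ) function, both invented for illustration; a real device header carries an attribute word and two entry-point offsets between the link and the name.

```c
#include <stddef.h>

typedef struct DevHeader {
    struct DevHeader *next;   /* link field: the first item in a real header */
    char              name[9];
} DevHeader;

/* Insert `drv` immediately after `nul`, in the order the text describes. */
void link_after_nul(DevHeader *nul, DevHeader *drv)
{
    DevHeader *old_next;
    old_next  = nul->next;    /* 1. save NUL's link field */
    drv->next = old_next;     /* 2. new driver points at the old first device */
    nul->next = drv;          /* 3. NUL now points at the new driver */
}
```

Since the new driver lands directly behind NUL, a search that starts at NUL meets it before any older driver of the same name.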
At last, DEVLOD links the device header into DOS's linked list of driver
headers and saves some memory by releasing its environment. (The resulting
"hole in RAM" will cause no harm, contrary to popular belief. It will, in
fact, be used as the environment space for any program subsequently loaded, if
the size of the environment is not increased!) Finally, DEVLOD calls the
documented DOS TSR function INT 21h Function 31h to exit, so as not to release
the memory now occupied by the driver.


The Stuff DEVLOD's Made of


Before we look at how this dynamic loader accomplishes all this in less than
2000 bytes of executable code, let's mention some constraints.
Many confusing details were eliminated by implementing DEVLOD as a .COM
program, using the tiny memory model of Turbo C. The way the program moves
itself up in memory became much clearer when the .COM format removed the need
to individually manage each segment register.
In order to move the program while it is executing, it's necessary to know
every address the program can reach during its execution. This precludes using
any part of the libraries supplied with the compiler. Fortunately, in this
case that's not a serious restriction; nearly everything can be handled
without them. Two assembly language listings take care of the few things that
cannot easily be done in C itself.
The one readily available implementation of C that makes it easy to completely
sever the link to the runtime libraries is Borland's Turbo C, which provides
sample code showing how. (Microsoft also provides such a capability, but the
documentation is quite cryptic.)
Thus the main program, DEVLOD.C (Listing One), requires Turbo C with
its register pseudovariables and geninterrupt( ) and __emit__( ) features.
Register pseudovariables such as _AX provide a way to directly read or load
the CPU registers from C, and both geninterrupt( ) and __emit__( ) simply emit
bytes into the code stream; neither is actually a function.
The smaller assembler module MOVUP (Listing Two) contains two
functions used in DEVLOD: movup( ) and copyptr( ). Recall that in order not to
fragment memory, DEVLOD moves itself up above the area into which the driver
will be loaded. It accomplishes this feat with movup( ).
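The arithmetic behind the move follows Move_Loader( ) in Listing One: the loader's size in paragraphs is subtracted from the top-of-memory segment to get the destination. Here is that calculation as a small portable C sketch; the MovePlan type and plan_move( ) are names made up for this example, and the actual copy and segment-register fixups done by movup( ) are not modeled.

```c
typedef struct {
    unsigned      dest_seg;  /* segment the loader image is copied to */
    unsigned long bytes;     /* number of bytes to copy */
} MovePlan;

/* psp_seg:  segment of the loader's PSP (start of its memory block)
   heap_top: first segment past the loader's code, data, and stack
   mem_top:  first segment past available memory (the word at PSP:0002) */
MovePlan plan_move(unsigned psp_seg, unsigned heap_top, unsigned mem_top)
{
    MovePlan p;
    unsigned paragraphs = heap_top - psp_seg;  /* loader size, 16-byte units */
    p.dest_seg = mem_top - paragraphs;         /* flush against top of RAM */
    p.bytes    = (unsigned long)paragraphs << 4;
    return p;
}
```

A loader of 80h paragraphs in a 640K machine (top of memory at A000h) would therefore be copied to segment 9F80h, leaving everything from PSP+10h upward free for the driver.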
The function copyptr( ) is located here merely because it's written in
assembler. It could have been written in C, but using assembly language to
transfer 4 bytes from source to destination makes the function much easier to
understand.

Finally, startup code appears in C0.ASM (Listing Three), which has
been extensively modified from startup code provided by Borland with Turbo C.
This or similar code forms part of every C program and provides the linkage
between the DOS command line and the C program itself. Normal start-up code,
however, does much more than this stripped-down version: It parses the
argument list, sets up pointers to the environment, and arranges things so
that the signal( ) library functions can operate.
Because our program has no need for any of these actions, our C0.ASM module
omits them. What's left just determines the DOS version in use, saving it in a
pair of global variables, and trims the RAM used by the program down to the
minimum. Then the module calls main( ), PUSHes the returned value onto the
stack, and calls exit( ). If the program succeeds in loading a device driver,
it will never return from main( ).
This sample program includes two assembly language modules in addition to the
C source, so a MAKEFILE (Listing Four) for use with Borland's MAKE
utility greatly simplifies its creation.


How Well Does DEVLOD Work?


Figure 2 shows the use of the utilities MEM (which displays owners of
allocated memory) and DEV (which lists the names of the installed device
drivers) to see what our system looks like after we've loaded up a large
number of device drivers with DEVLOD. (MEM and DEV come from the book
Undocumented DOS, but MAPMEM.COM from TurboPower or a number of other
utilities could also be used.)
Figure 2: Loading device drivers

 C:\UNDOC\KYLE>devlod \dos\smartdrv.sys 256 /a
 Microsoft SMARTDrive Disk Cache version 3.03
 Cache size: 256K in Expanded Memory
 Room for 30 tracks of 17 sectors each
 Minimum cache size will be 0K

 C:\UNDOC\KYLE>devlod \dos\ramdrive.sys
 Microsoft RAMDrive version 3.04 virtual disk D:
 Disk size: 64k
 Sector size: 512 bytes
 Allocation unit: 1 sectors
 Directory entries: 64

 C:\UNDOC\KYLE>devlod \dos\vdisk.sys
 VDISK Version 3.2 virtual disk E:
 Buffer size adjusted
 Sector size adjusted
 Directory entries adjusted
 Buffer size: 64 KB
 Sector size: 128
 Directory entries: 64

 C:\UNDOC\KYLE>devlod \dos\ansi.sys

 C:\UNDOC\KYLE>mem
 Seg Owner Size Env
 -------------------------------------------------------------------------

 09F3 0008 00F4 ( 3904) config [15 2F 4B 67 ]
 0AE8 0AE9 00D3 ( 3376) 0BC1 c:\dos33\command.com [22 23 24 2E ]
 0BBC 0000 0003 ( 48) free
 0BC0 0AE9 0019 ( 400)
 0BDA 0AE9 0004 ( 64)
 0BDF 3074 000D ( 208)
 0BED 0000 0000 ( 0) free
 0BEE 0BEF 0367 ( 13936) 0BE0 \msc\bin\smartdrv.sys 256 /a [13 19 ]
 0F56 0F57 1059 ( 66960) 0BE0 \msc\bin\ramdrive.sys [F1 FA ]
 1FB0 1FB1 104C ( 66752) 0BE0 \dos33\vdisk.sys
 2FFD 2FFE 0075 ( 1872) 0BE0 \dos33\ansi.sys [1B 29 ]
 3073 3074 1218 ( 74112) 0BE0 C:\UNDOC\KYLE\MEM.EXE [00 ]
 428C 0000 7573 (481072) free [30 F8 ]

 C:\UNDOC\KYLE>dev
 NUL
 CON
 Block: 1 unit(s)
 Block: 1 unit(s)
 SMARTAAR

 QEMM386$
 EMMXXXX0
 CON
 AUX
 PRN
 CLOCK$
 Block: 3 unit(s)
 COM1
 LPT1
 LPT2
 LPT3
 COM2
 COM3
 COM4

The output from MEM shows quite clearly that our device drivers really are
resident in memory. The output from DEV shows that they really are linked into
the DOS device chain (for example, "SMARTAAR" is SMARTDRV.SYS). Of course, the
real proof for me is that after loading SMARTDRV, RAMDRIVE, VDISK, and
ANSI.SYS, my disk accesses went a bit faster (because of the new 256K SMARTDRV
disk cache in expanded memory). I also had some additional drives (created by
RAMDRIVE and VDISK), and programs that assume the presence of ANSI.SYS (for
shame!) suddenly started producing reasonable output. Of course, I had less
free memory.
One other interesting item in the MEM output is the environment segment number
displayed for the four drivers. Recall that, in order to save some memory,
DEVLOD releases its environment. The MEM program correctly detects that the
0BE0h environment segment, still shown in the PSP for each resident instance
of DEVLOD, does not in fact belong to them. The name "DEVLOD" does not precede
the names of the drivers, because program names (which only became available
in DOS 3+) are located in the environment segment, not in the PSP. Each
instance of DEVLOD has jettisoned its environment, so its program name is gone
too.
Who then does this environment belong to? Actually, it belongs to MEM.EXE
itself. Because each instance of DEVLOD has released its environment, when MEM
comes along there is a nice environment-sized block of free memory just
waiting to be used, and MEM uses this block of memory for its environment. The
reason 0BE0 shows up as an environment, not only for MEM.EXE, but for each
instance of DEVLOD as well, is that when DEVLOD releases the environment, it
doesn't do anything to the environment segment address at offset 2Ch in its
PSP. Probably DEVLOD (and any other program which frees its environment) ought
to zero out this address.
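That suggested fix is small. The sketch below models the PSP as a byte buffer in portable C; detach_environment( ) is a hypothetical helper, not something in the DEVLOD listing. In a real DOS program the returned segment would then be handed to the DOS free-memory call (INT 21h Function 49h).

```c
#include <stdint.h>

#define PSP_ENV_SEG_OFFSET 0x2C   /* word: segment of the environment block */

/* Read the environment segment from a PSP image, then clear the field so
   that memory walkers no longer attribute the freed block to this program.
   Returns the segment the caller should release (0 if already cleared). */
uint16_t detach_environment(uint8_t *psp)
{
    uint16_t env = (uint16_t)(psp[PSP_ENV_SEG_OFFSET]
                 | (psp[PSP_ENV_SEG_OFFSET + 1] << 8));
    psp[PSP_ENV_SEG_OFFSET]     = 0;
    psp[PSP_ENV_SEG_OFFSET + 1] = 0;
    return env;
}
```

With the field cleared, a utility like MEM would have no stale 0BE0h to report for the resident copies of DEVLOD.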
It should be noted that some device drivers appear not to be properly loaded
by DEVLOD. These include some memory managers and some drivers that use
extended memory. For example, Microsoft's XMS driver HIMEM.SYS often crashes
the system if you attempt to load it with DEVLOD. Furthermore, while DEVLOD
VDISK.SYS definitely works in that a valid RAM disk is created, other programs
that check for the presence of VDISK (such as protected-mode DOS extenders)
often fail mysteriously when VDISK has been loaded in this unusual fashion. In
the MEM display, note that the INT 19h vector is not pointing at VDISK.SYS as
it should.
For another perspective on loading drivers, see the .EXE Magazine article,
"Installing MS-DOS Device Drivers from the Command Line" by Giles Todd. For
background on DOS device drivers in general, two excellent books are the
classic Writing MS-DOS Device Drivers by Robert S. Lai, and the recent Writing
DOS Device Drivers in C by Phillip M. Adams and Clovis L. Tondo.


References


Adams, Phillip and Clovis L. Tondo. Writing DOS Device Drivers in C. Englewood
Cliffs, N.J.: Prentice Hall, 1990.
Lai, Robert S. Writing MS-DOS Device Drivers. Reading, Mass.: Addison-Wesley,
1987.
Todd, Giles. "Installing MS-DOS Device Drivers from the Command Line." .EXE
Magazine (August 1990).



[LISTING ONE]

/********************************************************************
 * DEVLOD.C - Copyright 1990 by Jim Kyle - All Rights Reserved *
 * (minor revisions by Andrew Schulman) *
 * Dynamic loader for device drivers *
 * Requires Turbo C; see DEVLOD.MAK also for ASM helpers.*
 ********************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <dos.h>

typedef unsigned char BYTE;

#define GETFLAGS __emit__(0x9F) /* LAHF: copy low flags byte into AH */
#define FIXDS __emit__(0x16,0x1F)/* PUSH SS, POP DS */
#define PUSH_BP __emit__(0x55)
#define POP_BP __emit__(0x5D)

unsigned _stklen = 0x200;
unsigned _heaplen = 0;

char FileName[65]; /* filename global buffer */
char * dvrarg; /* points to char after name in cmdline buffer */
unsigned movsize; /* number of bytes to be moved up for driver */

void (far * driver)(); /* used as pointer to call driver code */
void far * drvptr; /* holds pointer to device driver */
void far * nuldrvr; /* additional driver pointers */
void far * nxtdrvr;
BYTE far * nblkdrs; /* points to block device count in List of Lists*/
unsigned lastdrive; /* value of LASTDRIVE in List of Lists */
BYTE far * CDSbase; /* base of Current Dir Structure */
int CDSsize; /* size of CDS element */
unsigned nulseg; /* hold parts of ListOfLists pointer */
unsigned nulofs;
unsigned LoLofs;

#pragma pack(1)

struct packet{ /* device driver's command packet */
 BYTE hdrlen;
 BYTE unit;
 BYTE command; /* 0 to initialize */
 unsigned status; /* 0x8000 is error */
 BYTE reserv[8];
 BYTE nunits;
 unsigned brkofs; /* break adr on return */
 unsigned brkseg; /* break seg on return */
 unsigned inpofs; /* SI on input */
 unsigned inpseg; /* _psp on input */
 BYTE NextDrv; /* next available drive */
 } CmdPkt;

typedef struct { /* Current Directory Structure (CDS) */
 BYTE path[0x43];
 unsigned flags;
 void far *dpb;
 unsigned start_cluster;
 unsigned long ffff;
 unsigned slash_offset; /* offset of '\' in current path field */
 // next for DOS4+ only
 BYTE unknown;
 void far *ifs;
 unsigned unknown2;
 } CDS;

extern unsigned _psp; /* established by startup code in c0 */
extern unsigned _heaptop; /* established by startup code in c0 */
extern BYTE _osmajor; /* established by startup code */
extern BYTE _osminor; /* established by startup code */

void _exit( int ); /* established by startup code in c0 */
void abort( void ); /* established by startup code in c0 */

void movup( char far *, char far *, int ); /* in MOVUP.ASM file */
void copyptr( void far *src, void far *dst ); /* in MOVUP.ASM file */

void exit(int c) /* called by startup code's sequence */
{ _exit(c);}

int Get_Driver_Name ( void )
{ char *nameptr;
 int i, j, cmdlinesz;


 nameptr = (char *)0x80; /* check command line for driver name */
 cmdlinesz = (unsigned)*nameptr++;
 if (cmdlinesz < 1) /* if nothing there, return FALSE */
 return 0;
 for (i=0; i<cmdlinesz && nameptr[i]<'!'; i++) /* skip blanks */
 ;
 dvrarg = (char *)&nameptr[i]; /* save to put in SI */
 for ( j=0; i<cmdlinesz && nameptr[i]>' '; i++) /* copy name */
 FileName[j++] = nameptr[i];
 FileName[j] = '\0';

 return 1; /* and return TRUE to keep going */
}

void Put_Msg ( char *msg ) /* replaces printf() */
{
#ifdef INT29
 /* gratuitous use of undocumented DOS */
 while (*msg)
 { _AL = *msg++; /* MOV AL,*msg */
 geninterrupt(0x29); /* INT 29h */
 }
#else
 _AH = 2; /* doesn't need to be inside loop */
 while (*msg)
 { _DL = *msg++;
 geninterrupt(0x21);
 }
#endif
}

void Err_Halt ( char *msg ) /* print message and abort */
{ Put_Msg ( msg );
 Put_Msg ( "\r\n" ); /* send CR,LF */
 abort();
}

void Move_Loader ( void ) /* vacate lower part of RAM */
{
 unsigned movsize, destseg;
 movsize = _heaptop - _psp; /* size of loader in paragraphs */
 destseg = *(unsigned far *)MK_FP( _psp, 2 ); /* end of memory */
 movup ( MK_FP( _psp, 0 ), MK_FP( destseg - movsize, 0 ),
 movsize << 4); /* move and fix segregs */
}

void Load_Drvr ( void ) /* load driver file into RAM */
{ unsigned handle;
 struct {
 unsigned LoadSeg;
 unsigned RelocSeg;
 } ExecBlock;

 ExecBlock.LoadSeg = _psp + 0x10;
 ExecBlock.RelocSeg = _psp + 0x10;
 _DX = (unsigned)&FileName[0];
 _BX = (unsigned)&ExecBlock;
 _ES = _SS; /* es:bx point to ExecBlock */
 _AX = 0x4B03; /* load overlay */

 geninterrupt ( 0x21 ); /* DS is okay on this call */
 GETFLAGS;
 if ( _AH & 1 )
 Err_Halt ( "Unable to load driver file." );
}

void Get_List ( void ) /* set up pointers via List */
{ _AH = 0x52; /* find DOS List of Lists */
 geninterrupt ( 0x21 );
 nulseg = _ES; /* DOS data segment */
 LoLofs = _BX; /* current drive table offset */

 switch( _osmajor ) /* NUL adr varies with version */
 {
 case 0:
 Err_Halt ( "Drivers not used in DOS V1." );
 case 2:
 nblkdrs = NULL;
 nulofs = LoLofs + 0x17;
 break;
 case 3:
 if (_osminor == 0)
 {
 nblkdrs = (BYTE far *) MK_FP(nulseg, LoLofs + 0x10);
 lastdrive = *((BYTE far *) MK_FP(nulseg, LoLofs + 0x1b));
 nulofs = LoLofs + 0x28;
 }
 else
 {
 nblkdrs = (BYTE far *) MK_FP(nulseg, LoLofs + 0x20);
 lastdrive = *((BYTE far *) MK_FP(nulseg, LoLofs + 0x21));
 nulofs = LoLofs + 0x22;
 }
 CDSbase = *(BYTE far * far *)MK_FP(nulseg, LoLofs + 0x16);
 CDSsize = 81;
 break;
 case 4:
 case 5:
 nblkdrs = (BYTE far *) MK_FP(nulseg, LoLofs + 0x20);
 lastdrive = *((BYTE far *) MK_FP(nulseg, LoLofs + 0x21));
 nulofs = LoLofs + 0x22;
 CDSbase = *(BYTE far * far *) MK_FP(nulseg, LoLofs + 0x16);
 CDSsize = 88;
 break;
 case 10:
 case 20:
 Err_Halt ( "OS2 DOS Box not supported." );
 default:
 Err_Halt ( "Unknown version of DOS!");
 }
}

void Fix_DOS_Chain ( void ) /* patches driver into DOS chn */
{ unsigned i;

 nuldrvr = MK_FP( nulseg, nulofs+0x0A ); /* verify the drvr */
 drvptr = "NUL     "; /* device names are blank-padded to 8 chars */
 for ( i=0; i<8; ++i )
 if ( *((BYTE far *)nuldrvr+i) != *((BYTE far *)drvptr+i) )
 Err_Halt ( "Failed to find NUL driver." );

 nuldrvr = MK_FP( nulseg, nulofs ); /* point to NUL driver */
 drvptr = MK_FP( _psp+0x10, 0 ); /* new driver's address */

 copyptr ( nuldrvr, &nxtdrvr ); /* hold old head now */
 copyptr ( &drvptr, nuldrvr ); /* put new after NUL */
 copyptr ( &nxtdrvr, drvptr ); /* and old after new */
}

// returns number of next free drive, -1 if none available
int Next_Drive ( void )
{
#ifdef USE_BLKDEV
 return (nblkdrs && (*nblkdrs < lastdrive)) ? *nblkdrs : -1;
#else
 /* The following approach takes account of SUBSTed and
 network-redirector drives */
 CDS far *cds;
 int i;
 /* find first unused entry in CDS structure */
 for (i=0, cds=CDSbase; i<lastdrive; i++, ((BYTE far *)cds)+=CDSsize)
 if (! cds->flags) /* found a free drive */
 break;
 return (i == lastdrive) ? -1 : i;
#endif
}

int Init_Drvr ( void )
{ unsigned tmp;
#define INIT 0
 CmdPkt.command = INIT; /* build command packet */
 CmdPkt.hdrlen = sizeof (struct packet);
 CmdPkt.unit = 0;
 CmdPkt.inpofs = (unsigned)dvrarg; /* points into cmd line */
 CmdPkt.inpseg = _psp;
 /* can't really check for next drive here, because don't yet know
 if this is a block driver or not */
 CmdPkt.NextDrv = Next_Drive();
 drvptr = MK_FP( _psp+0x10, 0 ); /* new driver's address */

 tmp = *((unsigned far *)drvptr+3); /* STRATEGY pointer */
 driver = MK_FP( FP_SEG( drvptr ), tmp );
 _ES = FP_SEG( (void far *)&CmdPkt );
 _BX = FP_OFF( (void far *)&CmdPkt );
 (*driver)(); /* set up the packet address */

 tmp = *((unsigned far *)drvptr+4); /* COMMAND pointer */
 driver = MK_FP( FP_SEG( drvptr ), tmp );
 (*driver)(); /* do the initialization */

 /* check status code in command packet */
 return (! ( CmdPkt.status & 0x8000 ));
}

int Put_Blk_Dev ( void ) /* TRUE if Block Device failed */
{ int newdrv;
 int retval = 1; /* pre-set for failure */
 int unit = 0;

 BYTE far *DPBlink;
 CDS far *cds;
 int i;

 if ((Next_Drive() == -1) || CmdPkt.nunits == 0)
 return retval; /* cannot install block driver */
 if (CmdPkt.brkofs != 0) /* align to next paragraph */
 {
 CmdPkt.brkseg += (CmdPkt.brkofs >> 4) + 1;
 CmdPkt.brkofs = 0;
 }
 while( CmdPkt.nunits-- )
 {
 if ((newdrv = Next_Drive()) == -1)
 return 1;
 (*nblkdrs)++;
 _AH = 0x32; /* get last DPB and set pointer */
 _DL = newdrv;
 geninterrupt ( 0x21 );
 _AX = _DS; /* save segment to make the pointer */
 FIXDS;
 DPBlink = MK_FP(_AX, _BX);
 (unsigned) DPBlink += (_osmajor < 4 ? 24 : 25 );
 _SI = *(unsigned far *)MK_FP(CmdPkt.inpseg, CmdPkt.inpofs);
 _ES = CmdPkt.brkseg;
 _DS = CmdPkt.inpseg;
 _AH = 0x53;
 PUSH_BP;
 _BP = 0;
 geninterrupt ( 0x21 ); /* build the DPB for this unit */
 POP_BP;
 FIXDS;
 *(void far * far *)DPBlink = MK_FP( CmdPkt.brkseg, 0 );

 /* set up the Current Directory Structure for this drive */
 cds = (CDS far *) (CDSbase + (newdrv * CDSsize));
 cds->flags = 1 << 14; /* PHYSICAL DRIVE */
 cds->dpb = MK_FP(CmdPkt.brkseg, 0);
 cds->start_cluster = 0xFFFF;
 cds->ffff = -1L;
 cds->slash_offset = 2;
 if (_osmajor > 3)
 { cds->unknown = 0;
 cds->ifs = (void far *) 0;
 cds->unknown2 = 0;
 }

 /* set up pointers for DPB, driver */
 DPBlink = MK_FP( CmdPkt.brkseg, 0);
 *DPBlink = newdrv;
 *(DPBlink+1) = unit++;
 if (_osmajor > 3)
 DPBlink++; /* add one if DOS 4 */
 *(long far *)(DPBlink+0x12) = (long)MK_FP( _psp+0x10, 0 );
 *(long far *)(DPBlink+0x18) = 0xFFFFFFFF;
 CmdPkt.brkseg += 2; /* Leave two paragraphs for DPB */
 CmdPkt.inpofs += 2; /* Point to next BPB pointer */
 } /* end of nunits loop */
 return 0; /* all went okay */

}

void Get_Out ( void )
{ unsigned temp;

 temp = *((unsigned far *)drvptr+2); /* attribute word */
 if ((temp & 0x8000) == 0 ) /* if block device, set up tbls */
 if (Put_Blk_Dev() )
 Err_Halt( "Could not install block device" );

 Fix_DOS_Chain (); /* else patch it into DOS */

 _ES = *((unsigned *)MK_FP( _psp, 0x002C ));
 _AH = 0x49; /* release environment space */
 geninterrupt ( 0x21 );

 /* then set up regs for KEEP function, and go resident */
 temp = (CmdPkt.brkofs + 15); /* normalize the offset */
 temp >>= 4;
 temp += CmdPkt.brkseg; /* add the segment address */
 temp -= _psp; /* convert to paragraph count */
 _AX = 0x3100; /* KEEP function of DOS */
 _DX = (unsigned)temp; /* paragraphs to retain */
 geninterrupt ( 0x21 ); /* won't come back from here! */
}

void main ( void )
{ if (!Get_Driver_Name() )
 Err_Halt ( "Device driver name required.");
 Move_Loader (); /* move code high and jump */
 Load_Drvr (); /* bring driver into freed RAM */
 Get_List(); /* get DOS internal variables */
 if (Init_Drvr ()) /* let driver do its thing */
 Get_Out(); /* check init status, go TSR */
 else
 Err_Halt ( "Driver initialization failed." );
}




[LISTING TWO]

 NAME movup
;[]------------------------------------------------------------[]
; MOVUP.ASM -- helper code for DEVLOD.C 
; Copyright 1990 by Jim Kyle - All Rights Reserved 
;[]------------------------------------------------------------[]

_TEXT SEGMENT BYTE PUBLIC 'CODE'
_TEXT ENDS

_DATA SEGMENT WORD PUBLIC 'DATA'
_DATA ENDS

_BSS SEGMENT WORD PUBLIC 'BSS'
_BSS ENDS

DGROUP GROUP _TEXT, _DATA, _BSS


ASSUME CS:_TEXT, DS:DGROUP

_TEXT SEGMENT BYTE PUBLIC 'CODE'

;-----------------------------------------------------------------
; movup( src, dst, nbytes )
; src and dst are far pointers. area overlap is NOT okay
;-----------------------------------------------------------------
 PUBLIC _movup

_movup PROC NEAR
 push bp
 mov bp, sp
 push si
 push di
 lds si,[bp+4] ; source
 les di,[bp+8] ; destination
 mov bx,es ; save dest segment
 mov cx,[bp+12] ; byte count
 cld
 rep movsb ; move everything to high ram
 mov ss,bx ; fix stack segment ASAP
 mov ds,bx ; adjust DS too
 pop di
 pop si
 mov sp, bp
 pop bp
 pop dx ; Get return address
 push bx ; Put segment up first
 push dx ; Now a far address on stack
 retf
_movup ENDP

;-------------------------------------------------------------------
; copyptr( src, dst )
; src and dst are far pointers.
; moves exactly 4 bytes from src to dst.
;-------------------------------------------------------------------
 PUBLIC _copyptr

_copyptr PROC NEAR
 push bp
 mov bp, sp
 push si
 push di
 push ds
 lds si,[bp+4] ; source
 les di,[bp+8] ; destination
 cld
 movsw
 movsw
 pop ds
 pop di
 pop si
 mov sp, bp
 pop bp
 ret
_copyptr ENDP


_TEXT ENDS

 end




[LISTING THREE]



 NAME c0
;[]------------------------------------------------------------[]
; C0.ASM -- Start Up Code 
; based on Turbo-C startup code, extensively modified 
;[]------------------------------------------------------------[]

_TEXT SEGMENT BYTE PUBLIC 'CODE'
_TEXT ENDS

_DATA SEGMENT WORD PUBLIC 'DATA'
_DATA ENDS

_BSS SEGMENT WORD PUBLIC 'BSS'
_BSS ENDS

DGROUP GROUP _TEXT, _DATA, _BSS

; External References

EXTRN _main : NEAR
EXTRN _exit : NEAR

EXTRN __stklen : WORD
EXTRN __heaplen : WORD

PSPHigh equ 00002h
PSPEnv equ 0002ch

MINSTACK equ 128 ; minimal stack size in words

; At the start, DS, ES, and SS are all equal to CS

;/*-----------------------------------------------------*/
;/* Start Up Code */
;/*-----------------------------------------------------*/

_TEXT SEGMENT BYTE PUBLIC 'CODE'

ASSUME CS:_TEXT, DS:DGROUP

 ORG 100h

STARTX PROC NEAR

 mov dx, cs ; DX = GROUP Segment address
 mov DGROUP@, dx
 mov ah, 30h ; get DOS version

 int 21h
 mov bp, ds:[PSPHigh]; BP = Highest Memory Segment Addr
 mov word ptr __heaptop, bp
 mov bx, ds:[PSPEnv] ; BX = Environment Segment address
 mov __version, ax ; Keep major and minor version number
 mov __psp, es ; Keep Program Segment Prefix address

; Determine the amount of memory that we need to keep

 mov dx, ds ; DX = GROUP Segment address
 sub bp, dx ; BP = remaining size in paragraphs
 mov di, __stklen ; DI = Requested stack size
;
; Make sure that the requested stack size is at least MINSTACK words.
;
 cmp di, 2*MINSTACK ; requested stack big enough ?
 jae AskedStackOK ; yes, use it
 mov di, 2*MINSTACK ; no, use minimal value
 mov __stklen, di ; override requested stack size
AskedStackOK:
 add di, offset DGROUP: edata
 jb InitFailed ; DATA segment can NOT be > 64 Kbytes
 add di, __heaplen
 jb InitFailed ; DATA segment can NOT be > 64 Kbytes
 mov cl, 4
 shr di, cl ; $$$ Do not destroy CL $$$
 inc di ; DI = DS size in paragraphs
 cmp bp, di
 jnb TooMuchRAM ; Enough to run the program

; All initialization errors arrive here

InitFailed:
 jmp near ptr _abort

; Set heap base and pointer

TooMuchRAM:
 mov bx, di ; BX = total paragraphs in DGROUP
 shl di, cl ; $$$ CX is still equal to 4 $$$
 add bx, dx ; BX = seg adr past DGROUP
 mov __heapbase, bx
 mov __brklvl, bx
;
; Set the program stack down into RAM that will be kept.
;
 cli
 mov ss, dx ; DGROUP
 mov sp, di ; top of (reduced) program area
 sti

 mov bx,__heaplen ; set up heap top pointer
 add bx,15
 shr bx,cl ; length in paragraphs
 add bx,__heapbase
 mov __heaptop, bx
;
; Clear uninitialized data area to zeroes
;

 xor ax, ax
 mov es, cs:DGROUP@
 mov di, offset DGROUP: bdata
 mov cx, offset DGROUP: edata
 sub cx, di
 rep stosb
;
; exit(main());
;
 call _main ; the real C program
 push ax
 call _exit ; part of the C program too

;----------------------------------------------------------------
; _exit()
; Restore interrupt vector taken during startup.
; Exit to DOS.
;----------------------------------------------------------------

 PUBLIC __exit
__exit PROC NEAR
 push ss
 pop ds

; Exit to DOS

ExitToDOS:
 mov bp,sp
 mov ah,4Ch
 mov al,[bp+2]
 int 21h ; Exit to DOS

__exit ENDP

STARTX ENDP

;[]------------------------------------------------------------[]
; Miscellaneous functions 
;[]------------------------------------------------------------[]

ErrorDisplay PROC NEAR
 mov ah, 040h
 mov bx, 2 ; stderr device
 int 021h
 ret
ErrorDisplay ENDP

 PUBLIC _abort
_abort PROC NEAR
 mov cx, lgth_abortMSG
 mov dx, offset DGROUP: abortMSG
MsgExit3 label near
 push ss
 pop ds
 call ErrorDisplay
CallExit3 label near
 mov ax, 3
 push ax
 call __exit ; _exit(3);

_abort ENDP

; The DGROUP@ variable is used to reload DS with DGROUP

 PUBLIC DGROUP@
DGROUP@ dw ?

_TEXT ENDS

;[]------------------------------------------------------------[]
; Start Up Data Area 
;[]------------------------------------------------------------[]

_DATA SEGMENT WORD PUBLIC 'DATA'

abortMSG db 'Quitting program...', 13, 10
lgth_abortMSG equ $ - abortMSG

;
; Miscellaneous variables
;
 PUBLIC __psp
 PUBLIC __version
 PUBLIC __osmajor
 PUBLIC __osminor

__psp dw 0
__version label word
__osmajor db 0
__osminor db 0

; Memory management variables

 PUBLIC ___heapbase
 PUBLIC ___brklvl
 PUBLIC ___heaptop
 PUBLIC __heapbase
 PUBLIC __brklvl
 PUBLIC __heaptop

___heapbase dw DGROUP:edata
___brklvl dw DGROUP:edata
___heaptop dw DGROUP:edata
__heapbase dw 0
__brklvl dw 0
__heaptop dw 0

_DATA ENDS

_BSS SEGMENT WORD PUBLIC 'BSS'

bdata label byte
edata label byte ; mark top of used area

_BSS ENDS

 END STARTX





[LISTING FOUR]

# makefile for DEVLOD.COM
# can substitute other assemblers for TASM

c0.obj : c0.asm
 tasm c0 /t/mx/la;

movup.obj: movup.asm
 tasm movup /t/mx/la;

devlod.obj: devlod.c
 tcc -c -ms devlod

devlod.com: devlod.obj c0.obj movup.obj
 tlink c0 movup devlod /c/m,devlod
 if exist devlod.com del devlod.com
 exe2bin devlod.exe devlod.com
 del devlod.exe





November, 1991
PORTING UNIX TO THE 386: THE BASIC KERNEL


Device autoconfiguration




William Frederick Jolitz and Lynne Greer Jolitz


Bill was the principal developer of 2.8 and 2.9BSD and was the chief architect
of National Semiconductor's GENIX project, the first virtual memory
microprocessor-based UNIX system. Prior to establishing TeleMuse, a market
research firm, Lynne was vice president of marketing at Symmetric Computer
Systems. They conduct seminars on BSD, ISDN, and TCP/IP. Send e-mail questions
or comments to lynne@berkeley.edu. (c) 1991 TeleMuse.


Last month we examined the mechanics of processes and context switching.
Coupled with a basic understanding of multiprogramming, multiprocessing, and
multitasking (see DDJ, September 1991), we have now covered one of the
fundamental tenets on which our 386BSD operating system kernel relies and on
which everything else is built. With this, we have conquered the "first pitch"
of our mountain.
In essence, we can consider our examination of multiprogramming and
multiprocessing and the details of swtch( ) to be analogous to an examination
of our map (concepts) and a careful laying of anchors before we climb up and
over a treacherous overhang. Why an overhang? Because a cavalier approach to
these basic elements could result in a misdesign which causes a great fall
later. Witness the difficulty in getting other operating systems to accomplish
what UNIX was designed to do from the first.
However, it is time to make tracks and cover new ground. We are now working on
many areas of the 386BSD port at once, so we must return to our main( )
procedure (see DDJ, August 1991) and focus on the organization and primitives
which impact device drivers. In particular, we need to understand the concepts
necessary to the integration of appropriate device drivers. We examine the
UNIX concept of "device interface," the layout and terms used in device
drivers, and how BSD works the miracle of autoconfiguration. We also examine
how our BSD kernel interfaces with its device drivers.
Next month, we will examine actual driver operations. Then, after laying the
groundwork for our UNIX device drivers, we will discuss some sample device
drivers.


Re-examining Our Framework: Kernel Services


In our previous articles on machine-dependent (DDJ, July 1991) and
machine-independent (DDJ, August 1991) initialization, you might have noticed
that we completely bypassed a significant area -- I/O device initialization,
otherwise known as "automatic configuration" or autoconfiguration. This was
done intentionally so that we could present a clear introduction to the basic
operating arrangement of our BSD kernel without gorging on UNIX trivia.
By describing the basic framework of kernel services prior to I/O devices, we
actually chronicled this port as it happened. We took this approach because by
using portions of the kernel services to debug and/or bypass problems we
encounter with the device drivers, we make a lot less work for ourselves. When
needed, we could build a debugging framework around a targeted problem area,
focus on it, try alternatives, and resolve it to conclusion.
In other words, at every point along the climb, we have attempted to belay
ourselves against the foundation of work we have built. (The question now
becomes "Was the mountain there to climb, or did we build the mountain as we
climbed it?" Zen philosophers and systems programmers can debate this question
at their leisure.) The further we delve into the system, the greater the
possibility of a catastrophic misstep, so our anchors (tools) must be
carefully placed to prevent us from minor falls. Our previous work will now
form the basis for our current work on drivers.
And while there are many heartbreaks (and other breaks) which result from
falling, nothing is quite as sweet as conclusively putting the finger on an
obstinate nine-month-old "bug" that has played hide-and-seek through your most
relentless attempts. ("That which does not destroy us, makes us strong."
--Nietzsche.)


UNIX as the Device Driver Interface


Over the course of integrating drivers into an operating system, a programmer
unversed in systems can be intimidated by the device interface problem. The
common approach is to try to "glue" an arbitrarily designed driver onto the
side of the kernel and attempt to minimize the interface to the kernel,
perhaps by doing everything in the driver. This half-hearted approach may
result in a (somewhat) working product, but it does not lead to efficient and
correct design and operation in all cases. Given the frequency with which this
is done, it's no wonder that drivers are frequently considered a black art.
They are never truly finished or fully debugged. ("If carpenters built homes
the way programmers write programs, then the first woodpecker to come along
would destroy civilization.")
Another approach is to actually reverse your perspective and consider the
entire problem as a "bag of drivers" with UNIX as the pervasive interface to
them (see "Brief Notes: UNIX -- Just a 'Bag of Drivers?'"). In other words,
hold UNIX as the given constant and mold the driver design to suit. This is
somewhat unorthodox, but can be quite instructive.
So, instead of dealing with the driver as an independent entity, we take a
broader view of the kernel's interfaces and services provided for the drivers'
use. We can then leverage this knowledge of the kernel to illustrate the
methodology of how the kernel's rich set of services can be lithely used to
integrate device drivers. This approach actually fits in quite nicely with the
heritage of UNIX device-driver integration.


Then What is a Device Driver?


Now that we have shifted our perspective of UNIX, we should really sit down
and define our terms carefully. In general, the term "device driver" refers to
the software that operates a device. Obvious enough, right?
It's when we try to get specific that we run into trouble. For example, if we
extend the definition of device to imply control of a "hardware device," we
find that we have now excluded many drivers that function entirely in
software. These "software devices" are used to mimic a device-driver interface
to simulate the effect of the desired "pseudo-device," such as /dev/pty.
(Pseudo-ttys, which simulate terminal drivers, are used when logging in over
the network with a telnet or rlogin session.) Other device drivers can
redirect references to yet another driver elsewhere in the kernel, bypassing
the "normal" reference. The /dev/tty device, for example, always refers to the
terminal the process is currently associated with, even though this may be
different for most processes on the same system.
In systems other than UNIX, device drivers can vary in role, responsibility,
and form. Under MS-DOS, we can have drivers implemented in the BIOS, as
loadable files (for example, ANSI.SYS), or as TSRs (most mouse drivers). Under
Mach 3.0, device drivers run outside the kernel in separate processes, as
entities completely separate from the operating system.
For our purposes, a driver is a set of functions, compiled into the kernel
when it is generated, that connect to the driver interface mechanism.
Generally, the functions of a driver are all kept in a single source file, and
there is one driver per device. Frequently, the part of the device that the
computer directly interacts with is called the "controller," and it may have
more than one physical device. If the devices can operate autonomously during
operations to a degree, they are called "slaves," because they share
responsibility with the controller "master" for the transfer, unlike "dumb"
devices that have a trivial role.


What are Drivers Made of?


Device drivers are usually responsible for all aspects of device recognition,
initialization, operation, and error recovery. Because the devices may be
mounted on a hierarchy of buses and rely on interrupt mechanisms of the
processor, they interact with many machine-dependent and bus-dependent support
functions. Many times, the characteristics of the support functions are so
different between different computers (such as the Mac and a PC or
workstation) that drivers for similar controllers look radically different.
The required intimacy with the system and the architecture is one reason that
driver code is reinvented all the time (the "have it your way" method gone
mad). Even UNIX drivers on the same architecture may require significant
rework to port them between different flavors of UNIX (such as SVR3, SVR4,
Mach, and BSD). The choice of drivers in 386BSD (as in other UNIX ports) was
significantly affected by our ability to leverage other drivers present on the
same architecture.


Leveraging Other Drivers


Sometimes, when there is a good match between the needs of a porting project
and those of a reliable and well-written "old" driver, it can be leveraged
with a minimum of effort. We can then put all our efforts into refining
something of demonstrated value rather than reinventing the wheel.
Frequently, however, there is little in common between the two, and trying to
glue the old code into the new system becomes more trouble than writing one
from scratch. Worse yet, an "old" driver may purport to be more than it is, by
claiming to support functionality that has not been tested, although on the
surface it may seem to at least pay lip service to needful areas. In fact, we
have seen many such half-hearted drivers, and very few that methodically set
out to extensively support the equipment. The reason is obvious: The drivers
are finished, as far as the programmer is concerned, and never looked at
again.
You can assess drivers by looking for the hallmarks: structure, form, history,
organization, content, correctness, and clarity. The hardest hallmark to
judge, pragmatics of design and appropriate implementation, generally must be
borne out through trial of the software. Being a judge of software is as
difficult as being a judge of character.
For 386BSD, we assessed two strategies for leveraging past work. The first was
to translate driver requests into a series of BIOS commands, then support a
mechanism to temporarily enter real mode to allow the BIOS ROMs to satisfy
those requests. The value of this approach would be to obtain 100 percent
compatibility with any PC-based system (MS-DOS has enforced this from day
one). Had this been strictly a commercial effort, this strategy might have
been satisfactory. For hard disks and display adapters, the BIOS mechanism has
been quite successful in mitigating hardware configuration problems for users.
However, items important to a researcher using 386BSD, such as tape backup,
networking, and serial communications, were not anticipated in the initial
BIOS plan, because at the time, these things were believed to be far in the
future. Also, IBM really only got serious about support for protected-mode
BIOS with the PS/2 ABIOS, so even trying to leverage some of the BIOS requires
the ticklish matter of switching from protected mode, and maintaining a
context for the non-multitasking BIOS to run in while multitasking is going on
around it. Clearly, this would require a colossal kludge, as the BIOS was
never intended for anything but the vagaries of MS-DOS.
So, although there was plenty of MS-DOS driver software available, we
ultimately found little usable code without going well out of the scope of the
project and markedly altering our specification goals (see DDJ, January 1991).
To top it all off, our performance would be shot to hell, because code written
for a 16-bit machine with 64-Kbyte segments doesn't leverage a 32-bit machine
with a 4-gigabyte flat address space very well. Having already learned more
than we ever wanted about ISA and 386, we had no incentive to add BIOS and DOS
trivia as well. Thus, we bid a fond farewell to this strategy, fearing that
the machine might become obsolete before it was fully mastered!
For the second strategy, we looked at drivers contributed to Berkeley which
ran under UNIX on VAX, HP300, NS32000, and 386 PC machines. From this source,
we were able to satisfy more than half of our initial driver requirements, and
base our system on software that had some history of operating on another
platform for a period of time. We could also pick and choose among a number of
drivers for some devices. Ironically, the better drivers came from the less
well-known machines.


Categories of Device Drivers


The BSD operating system's kernel interacts with its device drivers in broadly
different ways, depending on the kind of device and the nature of the
information it provides. Unit record devices, such as keyboards, terminals,
modems, and printers, tend
to fall into one category of device drivers. Mass storage devices such as tape
drives and hard and floppy disks fall into another category. A third category
includes packet transfer devices such as network interfaces (Ethernet and
token ring controller boards, for example). Bitmap display frame buffers could
be considered yet another category.
Often, we would like our system to vary the ways we might configure or
interact with these devices, depending on need. For example, the point of disk
drives is to store and organize both small and large collections of data or
programs, so it is inconvenient to interact with the disk drive on its terms
alone (disk sector address and sector data contents). Therefore, we impose an
abstraction which allows us to name (or key) collections of data as a file.
This file system abstraction is the principal way programs make use of the
disk. We still need to have mechanisms to access the disk as a whole, however,
if for no other reason than to manage and maintain the file system (for
instance, consistency checks, backups, and file recovery).
We could use a file system to organize a tape drive as well, and it might
work, provided we don't mind waiting minutes for a file. However, tapes are
more commonly used as archives, and thus we impose data record formats on top
of the tape. These records are sometimes variable sized, with special
hardware-generated records to denote file separators (or file marks) and
end-of-tape indications.
Unit record devices have little in the way of data structure. The application
program pushes data bytes through them for the desired effect. For the
convenience of the applications programs, the system provides for a variety of
mechanisms to facilitate optional input and output processing. Among these
mechanisms is a kind of "super" or metadriver, called a "line discipline." The
line discipline acts as an intermediary between the device driver and the
operating system. The most common of the line disciplines is the "tty driver,"
which implements the semantics of the UNIX keyboard interface (that is,
backspace, line kill, interrupt/suspend a process) for the user.
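To make the role of a line discipline concrete, here is a minimal sketch of the kind of input editing the "tty driver" performs between the device driver and the rest of the system. It is an illustration only, not the 386BSD code: the names ldisc, ldisc_input, and ldisc_feed are hypothetical, and we assume backspace as the erase character and Ctrl-U as the line-kill character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LDISC_BUFSZ 128
#define CH_ERASE '\b'   /* assumed erase character (backspace) */
#define CH_KILL  0x15   /* assumed line-kill character (Ctrl-U) */

/* Hypothetical per-terminal state: the line being edited. */
struct ldisc {
    char   buf[LDISC_BUFSZ];
    size_t len;
};

/* Accept one raw character from the device driver. Returns 1 when a
   finished line has been copied to out (NUL-terminated), else 0. */
int ldisc_input(struct ldisc *ld, char c, char *out, size_t outsz)
{
    if (outsz == 0)
        return 0;                   /* no room to deliver a line */
    if (c == CH_ERASE) {            /* rub out the last character */
        if (ld->len > 0)
            ld->len--;
        return 0;
    }
    if (c == CH_KILL) {             /* discard the whole line */
        ld->len = 0;
        return 0;
    }
    if (c == '\n') {                /* hand the edited line upward */
        size_t n = ld->len < outsz - 1 ? ld->len : outsz - 1;
        memcpy(out, ld->buf, n);
        out[n] = '\0';
        ld->len = 0;
        return 1;
    }
    if (ld->len < LDISC_BUFSZ)      /* ordinary character: buffer it */
        ld->buf[ld->len++] = c;
    return 0;
}

/* Convenience wrapper: push a string of raw characters through the
   discipline and count the lines completed. */
int ldisc_feed(struct ldisc *ld, const char *raw, char *out, size_t outsz)
{
    int lines = 0;
    for (; *raw != '\0'; raw++)
        lines += ldisc_input(ld, *raw, out, outsz);
    return lines;
}
```

The device driver only supplies raw bytes; the editing semantics live entirely in the intermediary, which is why the same discipline can sit on top of any unit record device.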
Network devices are quite different in nature. Incoming and outgoing packets
are structured in elaborate and (usually) hierarchical ways. Not only is their
content important, but so is the time and means by which they arrive. Also,
unlike the other categories cited, a single data record may end up going to
one of many different destinations, and this may be dynamically altered as the
system software changes routing policies. Thus, the kernel's device interfaces
may look quite different from the other categories.
Bitmap graphics reflect yet another I/O interface need. In
this case, we must regulate access to the frame buffer's physical memory by
arranging to map the memory into an application's (such as an X server)
virtual address space.
Each of these categories interfaces to a different portion of the BSD kernel.
Disk drives are interfaced into the file system of the kernel and into "device
special files" (found in /dev), which allow utility programs to bypass the
file system. These files are, in effect, trap doors out of the single UNIX
file namespace and into a given device driver. Device-special files also allow
devices in general to be operated by applications and utility programs.
Network devices are connected to the network protocol processing mechanism and
are only visible through the network software interface mechanisms. Thus,
network devices don't show up among the device-special files.
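The "trap door" from a device-special file into a driver can be sketched as a table of driver entry points indexed by part of a device number. The sketch below is a simplified illustration under our own assumptions: the 8-bit major/minor split, the table contents, and the names devsw, dev_open, cons_open, and disk_open are all hypothetical, and a real kernel table carries many more entry points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device number: high byte selects the driver (major),
   low byte selects the unit within that driver (minor). */
typedef unsigned int devnum_t;
#define DEV_MAJOR(d)     (((d) >> 8) & 0xffu)
#define DEV_MINOR(d)     ((d) & 0xffu)
#define MAKE_DEV(ma, mi) ((devnum_t)(((ma) << 8) | (mi)))

/* One row per driver compiled into the kernel. */
struct devsw_entry {
    const char *name;
    int (*d_open)(int minor);       /* returns 0 on success, -1 on error */
};

/* Toy drivers standing in for real ones. */
static int cons_open(int minor) { return minor == 0 ? 0 : -1; }
static int disk_open(int minor) { return minor < 4 ? 0 : -1; }

static const struct devsw_entry devsw[] = {
    { "console", cons_open },       /* major 0 */
    { "disk",    disk_open },       /* major 1 */
};
#define NDEVSW (sizeof devsw / sizeof devsw[0])

/* What opening a device-special file boils down to: pick the driver
   by major number and pass it the minor number. */
int dev_open(devnum_t dev)
{
    unsigned major = DEV_MAJOR(dev);

    if (major >= NDEVSW)
        return -1;                  /* no such driver configured */
    return devsw[major].d_open((int)DEV_MINOR(dev));
}
```

Because the table is fixed when the kernel is generated, a device with no row in it cannot be reached through /dev at all, which matches the compiled-in notion of a driver used in this article.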


BSD Autoconfiguration Goals


Versions of UNIX prior to 3BSD had a rather fixed notion of configuration;
systems were configured either by conditional compilation for a given set of
hardware or by manually altering the configuration flags in the driver.
(Usually this was done to save
on the amount of system code taking up space -- this was important if one had
as little as 256 Kbytes, where every Kbyte counted.) If the driver was not
there, but the hardware was, it could not be used. Worse yet, different
systems had to be created for differently configured systems, even if they had
minor differences in interrupt vectors, were missing a redundant card, or had
conflicting controller port assignments.
Early 4BSD versions introduced a more versatile form of configuration that
allowed for runtime configuration shortly after the system's kernel was
loaded, but prior to operation of the kernel. The intent of this configuration
mechanism was to put off wiring-down device-dependent information until the
last moment, then attempt to discover as much of this information from the
hardware itself and apply it to the drivers as needed. The prime motivation
was to factor out as much of the idiosyncratic configuration differences as
possible.
The goal of this work was to minimize the impact of maintaining a diverse
number of computer systems and peripherals within a single version of the
kernel. The more we can achieve with this the better, because the sheer volume
of different kinds of devices that can be configured with systems now is
enormous.
Even more elegant mechanisms to automatically configure the drivers for the
given devices present have been developed over the course of time.
"Autoconfiguration" was an early innovation in Berkeley UNIX, and it remains a
hallmark of a Berkeley-derived version of UNIX to the present.


BSD Autoconfiguration Approach


In our BSD kernel, we implement autoconfiguration by incrementally searching
for all devices that might be supported by the drivers present in our kernel.
This is accomplished by "walking" a table of device information to locate
devices on our target system and calling a routine in each associated driver,
using this information, to check for the presence of a given device. If this
probe( ) routine finds a device, the driver can be wired into the system by
applying the configuration information saved in the table. We can inform the
driver of this, so that it can adjust its own parameters and "fine-tune"
configuration by calling an attach( ) routine in the driver. (In some cases,
the attach( ) routine may find a terminal conflict with the attempted device
configuration, and may deny the configuration attempt.)
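The table walk just described might look roughly like the following. This is a sketch under stated assumptions rather than the actual 386BSD source: the structure layouts, the field names, and the toy driver are hypothetical, and port_responds( ) merely stands in for touching real hardware.

```c
#include <assert.h>
#include <stddef.h>

struct dev_config;                  /* one row of the device table */

/* Hypothetical driver entry points used during configuration. */
struct driver {
    const char *name;
    int (*probe)(const struct dev_config *cf);  /* nonzero: device found */
    int (*attach)(struct dev_config *cf);       /* nonzero: accepted */
};

struct dev_config {
    const struct driver *drv;       /* driver responsible for this row */
    unsigned port;                  /* assumed I/O port address */
    int irq;                        /* assumed interrupt line */
    int alive;                      /* set once the device is wired in */
};

/* Walk the table: probe each candidate, then let the driver attach
   (and possibly refuse) the ones that answered. Returns the number
   of devices configured. */
int autoconfigure(struct dev_config *table, size_t n)
{
    size_t i;
    int found = 0;

    for (i = 0; i < n; i++) {
        struct dev_config *cf = &table[i];

        cf->alive = 0;
        if (!cf->drv->probe(cf))
            continue;               /* nothing at this address */
        if (!cf->drv->attach(cf))
            continue;               /* driver denied the configuration */
        cf->alive = 1;
        found++;
    }
    return found;
}

/* Stand-in for hardware: pretend only these two ports respond. */
static int port_responds(unsigned port)
{
    return port == 0x3f8 || port == 0x1f0;
}

static int toy_probe(const struct dev_config *cf)
{
    return port_responds(cf->port);
}

/* The toy attach routine refuses a row that has no usable interrupt. */
static int toy_attach(struct dev_config *cf)
{
    return cf->irq > 0;
}

static const struct driver toy_driver = { "toy", toy_probe, toy_attach };
```

Note that the table, not the driver, carries the configuration information; the driver only confirms or rejects what the table proposes.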
Sometimes we have a master device that manages a number of slave devices (a
disk controller with multiple drives, for instance). In such a case, when we
find a controller with the probe( ), we iterate through each possible
subdevice that might exist on the controller by means of a slave( ) routine in
the driver. If any slave devices are found, the attach( ) routine is called
for each slave so the drive may be "wired" into the driver.
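The master/slave case adds one more loop to the scheme. The following sketch assumes a controller with at most four drives; the structure, the drive_mask field (a toy substitute for asking the hardware), and the function names are hypothetical.

```c
#include <assert.h>

#define MAX_SLAVES 4

/* Hypothetical controller record. */
struct controller {
    unsigned port;                  /* 0 means no controller present */
    unsigned drive_mask;            /* toy hardware: bit n set = drive n */
    int attached[MAX_SLAVES];       /* which drives were wired in */
};

static int ctlr_probe(const struct controller *c)
{
    return c->port != 0;
}

static int ctlr_slave(const struct controller *c, int unit)
{
    return (int)((c->drive_mask >> unit) & 1u);
}

static void ctlr_attach(struct controller *c, int unit)
{
    c->attached[unit] = 1;
}

/* Probe the controller, then ask about every possible slave and
   attach the ones that exist. Returns the number of drives attached. */
int configure_controller(struct controller *c)
{
    int unit, n = 0;

    if (!ctlr_probe(c))
        return 0;
    for (unit = 0; unit < MAX_SLAVES; unit++) {
        c->attached[unit] = 0;
        if (ctlr_slave(c, unit)) {
            ctlr_attach(c, unit);
            n++;
        }
    }
    return n;
}
```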
Depending on the computer, it's possible to do autoconfiguration with varying
success. Sometimes, much of the device-dependent information can be obtained
by the software cleverly manipulating the device to reveal how it is attached
to the system. At other times, it is nearly impossible to detect the presence
of a device. Worse yet, a hidden conflict between two mutually exclusive
devices could cause them to interfere with each other. (This happens all too
easily on the ISA bus.)
As a result of the configuration pass, a manifest of devices and related
configuration information is tallied on the console device, so that an
operator can observe what the kernel was able to find and make use of. This
can be of great use in diagnosing dead equipment, especially if either a
device known to be present in the computer fails to respond, or if a device
known to be missing mysteriously shows up in its place.


Alternative Autoconfiguration Approaches


BSD's current autoconfiguration scheme is rigidly top-down, not unlike that of
a recursive descent parser. To begin with, all buses directly connected to the
computer are probed successively. While examining each bus, all devices on a
given bus are summarily probed, and in turn, all slave devices on a given
controller device. But this approach has some drawbacks; at the moment a
probe( ) succeeds and we go on to attach( ) the device, we may not yet have
all the information about it.
An alternate solution suggested by Chris Torek (LBL) is to change this
arrangement and instead do successive "depth-first" probe( )s on all
lower-level objects to discover all information about the device and its
hidden requirements before committing to the corresponding attach( ). Thus, a
more complete picture of a device's demands and conflicts can be obtained
before we commit to attaching the device.
Yet another possibility might be using a two-pass, or "bottom-up" method, in
which all devices, resources, and dependencies are found on the first pass in
a kind of "survey" expedition. Having gathered a complete picture of system
requirements, the second pass assembles the pieces as if they were Lego
blocks, incrementally attaching them from peripheral to controller to bus to
driver. A device can be said to exist by its driver if a complete, connected
path is available.
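
The "complete, connected path" rule of the two-pass idea can be sketched in a
few lines of C. This is only an illustration under stated assumptions: the
node structure, the parent-index encoding, and the preset "found" flags (which
stand in for the results of the first, survey pass) are all hypothetical.

```c
#include <stddef.h>

#define NO_PARENT (-1)

/* One object discovered (or not) during the survey pass. */
struct node {
    const char *name;
    int parent;     /* index of the controller or bus above; NO_PARENT for a root */
    int found;      /* pass 1 (survey) result, preset here for the demo           */
    int attached;   /* pass 2 result, filled in by attach_pass()                  */
};

/* Nonzero if node i and every ancestor up to the root were found. */
int path_complete(const struct node *n, int i)
{
    while (i != NO_PARENT) {
        if (!n[i].found)
            return 0;
        i = n[i].parent;
    }
    return 1;
}

/* Pass 2: mark as attached every node whose path to the root is complete.
 * Returns the number of nodes attached. */
int attach_pass(struct node *n, size_t count)
{
    int attached = 0;

    for (size_t i = 0; i < count; i++) {
        n[i].attached = path_complete(n, (int)i);
        attached += n[i].attached;
    }
    return attached;
}

/* Demo survey: the tape controller did not respond, so the tape drive
 * "seen" beneath it has no complete path and must not be attached. */
struct node demo_nodes[] = {
    { "isa0",            NO_PARENT, 1, 0 },
    { "disk controller", 0,         1, 0 },
    { "drive 0",         1,         1, 0 },
    { "tape controller", 0,         0, 0 },
    { "tape drive",      3,         1, 0 },
};
```

With this survey, attach_pass(demo_nodes, 5) attaches the bus, the disk
controller, and drive 0, and leaves the tape drive alone because its
controller was never found.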
Note that with a complete description of dependencies by either of these
mechanisms, we don't need to tie down the processor's interrupts, special
equipment requirements, or other resources -- except when we actually open the
device -- so we don't have to configure solely at boot time. Thus, we could
change drives with the driver file closed, and when it reopens, the system
will discover the change and adapt accordingly.


When Autoconfiguration Comes into Play


The current BSD kernel manages to locate and configure devices upon boot-up
because it must find (at least) the characteristics of the root file system,
paging store, and console device, so that it can begin the most basic
operation. Because it has to do all that, the reasoning goes, it might as well
find everything else. This is adequate for most purposes, but should you wish
to reconfigure a SCSI tape drive, for example, it's a bit of a pain to reboot
the system. (Actually, configuration should be done on device "open" as well
as during boot-up, but this is a lot of work to do correctly and hence is
usually not done.)


Information Required for Autoconfiguration


Autoconfiguration does not stop with just finding the device. More than half
the battle is accumulating all the information possible about the device, in
order to properly attach it. As an example, let's try to capture a general
list of possible information desired. This should extend beyond the needs of
the ISA bus, because we may need to consider other buses.



How and Where to Find Devices


Devices are usually found on a bus of some kind. In fact, it is not unusual
for a computer to have more than one bus, or even buses of more than one type.
The EISA bus, for example, is a kind of bus within a bus, with ISA devices
working by one set of rules and full EISA cards working by a completely
different set (for example, slot-independent vs. slot-relative). Thankfully,
less common is
a hierarchical bus arrangement, where bus adapters themselves are devices on
buses. (There are DEC VAX machines that use this to a depth of two or three.)
In these cases we need a description of how to find the I/O ports or
memory-mapped control and status registers of the given device. We may also
need to locate the shared buffer memory that display adapters and network
interfaces may require. Some bus facilities imply sharing or arbitration among
devices; thus, special care must be taken to avoid conflicts between devices.


Device Signalling


The processor interrupt mechanisms, which usually differ with each style of
bus, must be determined. Many new devices that support shared use among
multiprocessors, or that have multiple data streams (such as disk arrays),
possess hardware "mailbox" mechanisms to report their progress as they
complete lists of operations that the driver may have in progress. As we
demand higher aggregate data rates, the complexity of our hardware I/O system
may require more elaborate mechanisms to synchronize the hardware with
software, and these will necessarily need to be configured and managed by our
operating system.


Bulk I/O Facility Usage


For mass data transport, we may need to find and allocate DMA facilities,
which may be in the form of channels or dedicated buses. Some of these may
require conflict mitigation and perhaps (in the future) bandwidth reservation.
Some facilities also require address translation, as we take a large,
logically contiguous transfer and scatter/gather it to a group of data pages
(seemingly) randomly disposed around the system.


Device Characteristics


We may have a device with no peripherals, dumb peripherals such as printers or
terminals, or those with a master/slave sharing of responsibility. These
devices have configuration-dependent parameters that may be set with hard DIP
switches or soft configuration mechanisms. (Some manufacturers have caught on
to the soft configuration approach. Newer Ethernet cards for the ISA, for
example, utilize clever mechanisms to do this.) Disk drive capacity and
geometry must also be determined. Modern peripheral standards such as ESDI and
SCSI use standard methods to obtain this information. Some devices may have
conflicts with others (for example, dual ported access of a single drive), and
these must be uncovered. The revision of a given device and its
diagnostics/disaster recovery mechanisms is also important information (for
instance, does the disk drive use bad sector sparing?).


Autoconfiguration and Disk Drive Labels


Within our BSD system, we usually subdivide disk drives into partitions that
may contain different kinds of file system abstractions -- all on the same
drive. To describe this and the disk geometry in a device-independent fashion,
we use a "disklabel" embedded in the data on the drive. The actual location of
the disklabel may not be standard across all storage architectures, but the
contents and use of the information in the higher layers above that of the
given disk driver itself is identical in all cases.
The data structure definition of the current BSD disklabel attempts to support
a rather diverse group of mass storage architectures. As a final part of the
autoconfiguration process, the disk driver extracts this data structure from
the disk drive and adjusts its parameters, including drive partitioning
tables, to reflect this information. The kernel uses this information to
determine which portions of the disk have been set aside for paging, which
have various file system types, and the underlying physical storage parameters
implied (such as file system block and fragment size).
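
The following is a deliberately simplified sketch of what a driver might do
with a label once it has been read from the drive. The structure below is NOT
the actual BSD disklabel layout; the field names, the magic value, and the
sample geometry (615 cylinders, 4 heads, 17 sectors per track) are assumptions
chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define LABEL_MAGIC 0x4c41424cu   /* made-up value for this sketch */
#define MAX_PARTS   8

enum part_type { PART_UNUSED = 0, PART_SWAP, PART_FILESYS };

struct partition {
    uint32_t offset;      /* starting sector                         */
    uint32_t size;        /* length in sectors                       */
    enum part_type type;  /* paging area, file system, or unused     */
    uint32_t block_size;  /* file system block size in bytes, or 0   */
    uint32_t frag_size;   /* file system fragment size in bytes, or 0 */
};

struct label {
    uint32_t magic;
    uint32_t sector_size;
    uint32_t sectors_per_track;
    uint32_t tracks_per_cyl;
    uint32_t cylinders;
    uint32_t nparts;
    struct partition parts[MAX_PARTS];
};

/* Total sectors implied by the geometry in the label. */
uint32_t label_total_sectors(const struct label *l)
{
    return l->sectors_per_track * l->tracks_per_cyl * l->cylinders;
}

/* Nonzero if the label looks sane: right magic, and every partition in
 * use lies entirely within the disk. */
int label_valid(const struct label *l)
{
    uint32_t total;

    if (l->magic != LABEL_MAGIC || l->nparts > MAX_PARTS)
        return 0;
    total = label_total_sectors(l);
    for (uint32_t i = 0; i < l->nparts; i++) {
        const struct partition *p = &l->parts[i];

        if (p->type == PART_UNUSED)
            continue;
        if (p->offset > total || p->size > total - p->offset)
            return 0;
    }
    return 1;
}

/* Index of the first partition of the given type, or -1 if none. */
int label_find(const struct label *l, enum part_type type)
{
    for (uint32_t i = 0; i < l->nparts && i < MAX_PARTS; i++)
        if (l->parts[i].type == type)
            return (int)i;
    return -1;
}

/* Demo label: one file system followed by a paging area. */
struct label demo_label = {
    LABEL_MAGIC, 512, 17, 4, 615, 2,
    {
        { 0,     30000, PART_FILESYS, 4096, 512 },
        { 30000, 11820, PART_SWAP,    0,    0   },
    }
};
```

Here label_find(&demo_label, PART_SWAP) locates the paging area, while
label_valid( ) rejects a label whose partitions run off the end of the drive.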


Higher-level Autoconfiguration


Up to this point, we have only outlined the information that the kernel of the
operating system may require to configure itself appropriately. Many systems
do this low-level configuration well, but few go beyond this after the system
boots up and configures itself for use. Other configuration procedures, such
as finding and mounting various file systems, attaching to various computer
networks, and generally embedding itself into the fabric of the local and
regional computer environment, are not usually done.
However, in this modern era of LANs, enterprise networks, and global
internetworks, computer systems no longer stand alone. High-level
configuration of resources has now become a necessity. As a result, one of the
current trends in modern computer systems is resource discovery and
management. The cost of systems management is usually calculated on a
per-computer basis, and as personal computers and workstations replace dumb
terminals, this grows to be a significant factor.
In addition, as the demand for better applications programs increases, more
configuration information needs to be maintained per system. At the same time,
manufacturers are being forced to grant more autonomy to computer usage groups
and move away from the centralized MIS-management mentality that made the
trains run on time. Managing what one consortium describes as the "Distributed
Computing Environment" is going to be quite a challenge over the next few
years.


The ISA in a Nutshell


Now that we have examined how BSD handles configuration, and understand the
interface, we must study the other side of the question -- how to work a
device on a bus. In the 386BSD porting project, the ISA bus was chosen for the
initial port, as it is the most common bus available.
Before we can delve into the code, a review of the ISA bus is necessary. A
driver's view of the ISA bus reveals the mechanisms we must create to work a
device on the bus.


I/O Ports


The ISA has an independent I/O bus, separate from its program and data memory
bus, that is primarily used to twiddle the bits for the control and status
characteristics of devices. It consists of 1024 discrete, byte-sized "ports,"
some of which can be accessed in twos as 16-bit-wide operations. Each port may
be read or written, and a given device usually decodes (or implements) a block
of them (8, 16, or 32). Some devices function exclusively through the I/O
ports -- even the most common hard disk controller (which relies on "string"
instructions that repetitively sequence data through a single port).
The ISA bus, having mere rudiments of configurability, relies in part on
devices being at known port locations, and has no mechanism to discriminate
conflicting devices that may have overlapping or mutually exclusive
assignments (in other words, it does not work). For those devices which do
not have standard port addresses, freely assignable zones serve as catch
basins in
which to place them. Most cards have only a handful of alternative port
assignments (each a different handful, of course), so avoiding conflicts with
a fully stocked box can sometimes be a tedious puzzle. (This is often made
more interesting when a hardware manufacturer cleverly decodes more ports than
are documented.) This leads to the "scraped knuckles" effect, where the
computer's chassis is laid open, and cards shuffled in numerous attempts to
find the "holy grail" -- the correct combination of DIP switches, hardware
options, slots, and cables. (All this, while muttering on the 45th attempted
power-on, the immortal phrase from Bullwinkle, the patron saint of
programmers, "This time for SURE!")
Suffering ISA definitely makes one appreciate EISA or MCA all the more,
although ISA systems and I/O cards are still being produced in massive
numbers. Hard to believe that so much work is still being done with a bus that
was inspired by the Apple II, technological aeons ago!
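
Because the bus itself cannot detect overlapping port assignments, about the
best software can do is check its own configuration table before touching the
hardware. The sketch below assumes a hypothetical table of port blocks; the
structure, the function names, and the sample assignments are illustrative
only. The 1024-port limit is the figure given above.

```c
#include <stddef.h>

#define ISA_NPORTS 1024u   /* number of byte-sized I/O ports, per the text */

/* A block of consecutive ports decoded by one card (hypothetical layout). */
struct port_range {
    const char *owner;
    unsigned base;    /* first port in the block   */
    unsigned count;   /* number of ports decoded   */
};

/* Nonzero if the block is non-empty and fits inside the I/O space. */
int range_valid(const struct port_range *r)
{
    return r->count > 0 && r->base < ISA_NPORTS &&
           r->count <= ISA_NPORTS - r->base;
}

/* Nonzero if the two blocks share at least one port. */
int ranges_overlap(const struct port_range *a, const struct port_range *b)
{
    return a->base < b->base + b->count && b->base < a->base + a->count;
}

/* Count the pairs of table entries that collide. */
int count_conflicts(const struct port_range *r, size_t n)
{
    int conflicts = 0;

    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (ranges_overlap(&r[i], &r[j]))
                conflicts++;
    return conflicts;
}

/* Sample assignments: the last card decodes ports inside the network
 * card's block, the kind of clash described above. */
const struct port_range demo_ranges[] = {
    { "disk controller", 0x1f0, 8  },
    { "serial port",     0x3f8, 8  },
    { "network card",    0x300, 32 },
    { "mystery card",    0x310, 16 },
};
```

A check like this only catches conflicts the table knows about; a card that
decodes more ports than it documents will still slip through.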


Interrupts


Devices commonly have one or no interrupts; they rarely have more than one.
Again, like I/O ports, there are "standard" assignments for common cards, but
the situation is a little more desperate here because we have far fewer
interrupts than ports. Depending on whether we have an XT or AT card, we can
have as many as 6 or 11 unique choices of interrupts, respectively, out of a
net 15 interrupts that the ISA PC fields. This selection is usually
constrained even more because few cards allow more than a selection of two or
three different interrupts. Also, each interrupt has a discrete priority above
higher numbered ones, so choosing a different interrupt can alter the
processing order of the interrupt (the lowest numbered ones always getting
first billing).

The software has no independent way of ascertaining the association of devices
with interrupts, unless it compels a device to interrupt when all other
devices are forced mute. (This assumes that the device can be programmed to
interrupt without external stimulus.) For electronics reasons, cards cannot
reliably share an interrupt. Also, interrupts whose source is too brief to be
recorded get unceremoniously deposited onto one of two interrupts, each of
which may have a device connected as well. (These interrupts do
"double-duty.")


I/O Display/Buffer/ROM Memory


Some devices use a portion of the dedicated region of memory resident on the
ISA bus. This region is frequently called the "hole," as it slices the
machine's RAM into base and extended memory. Unlike the I/O ports mentioned
earlier, this memory is not usually used for device control registers, but for
various other purposes. Display adapters use dedicated regions of this memory
to hold their frame buffer (or, if in higher resolution mode, a "window" or
segment of the frame buffer too big to fit in the "hole"). Network controllers
often have shared-memory buffers that can be selected to steal a portion of
this memory as well. Finally, the BIOS ROMs, also present in the hole, scan it
to find other device ROMs to supplement its functions with. This is how
display adapters retain software compatibility -- by extending the number of
display modes available through the BIOS and hiding the actual register
programming from view. Network and hard disk controller cards use this method
to allow for initial loading of MS-DOS off the network or SCSI hard disk. As a
characteristic of the ISA, this region of memory is apportioned by ad hoc
rules and is the frequent bane of configuration.


Direct Memory Access (DMA)


Various devices implement the direct memory access mechanism of the ISA. Three
16-bit and four 8-bit-wide DMA channels (slave transfers to a single master)
are available for the dedicated use of cards specifically designed to exploit
them. An interesting feature of the original PC/AT was that a string
instruction to move data for the disk controller was faster than the DMA
channel, so the disk controller did not even bother to implement the DMA
channel. Unfortunately, the standards for the ISA have been set by its
progenitor, so the bandwidth hallmarks of DMA transfer are not present with
this bus. Not surprisingly, because of the various restrictions, cards using
the DMA facilities are not as common as with other computers. As with the
interrupt facilities, the software has no direct method to determine which
card is connected to which DMA channel. An even more critical failing for a
386/486 system that uses paging is the lack of a page map to do
"scatter/gather" to the 4 Kbyte-sized pages that might be located at random
physical addresses, yet consecutive virtual addresses. The DMA facility only
works on consecutive physical memory, so the software must improvise a
solution.
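
One way such an improvised solution might look is to break a virtually
contiguous transfer into runs of physically consecutive pages and issue one
DMA operation per run. The sketch below is an assumption-laden illustration,
not the 386BSD code: the segment structure, the function name, and the sample
page addresses are hypothetical, and a real driver would obtain the physical
addresses from the kernel's page tables.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* 386/486 page size, per the text */

/* One physically contiguous piece of a transfer. */
struct dma_seg {
    uint32_t phys;   /* starting physical address */
    uint32_t len;    /* length in bytes           */
};

/* phys_page[i] is the (page-aligned) physical address backing the i-th
 * consecutive virtual page of the buffer. Adjacent physical pages are merged
 * into one segment. Returns the number of segments written to seg[], or -1
 * if more than maxseg would be needed. */
int dma_segments(const uint32_t *phys_page, size_t npages,
                 struct dma_seg *seg, size_t maxseg)
{
    size_t nseg = 0;

    for (size_t i = 0; i < npages; i++) {
        if (nseg > 0 &&
            seg[nseg - 1].phys + seg[nseg - 1].len == phys_page[i]) {
            seg[nseg - 1].len += PAGE_SIZE;       /* extends the current run */
        } else {
            if (nseg == maxseg)
                return -1;                        /* out of segment slots    */
            seg[nseg].phys = phys_page[i];
            seg[nseg].len  = PAGE_SIZE;
            nseg++;
        }
    }
    return (int)nseg;
}

/* Six consecutive virtual pages scattered over physical memory. */
const uint32_t demo_pages[6] = {
    0x10000, 0x11000, 0x40000, 0x41000, 0x42000, 0x08000
};
struct dma_seg demo_segs[8];
```

For the sample buffer, six pages collapse into three separate transfers. The
alternative, copying through a physically contiguous bounce buffer, trades
the extra DMA setups for an extra memory copy.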


386BSD Autoconfiguration Scheme


Having reviewed the key points of our ISA bus, the question becomes "How do we
do autoconfiguration for 386BSD?" Luckily, this is not as involved an answer
as one might think, because our little 386 ISA bus machine is guaranteed to
have just a single bus with a maximum of a few handfuls of hardware devices
that need support. (We only have 8 slots.)
First, we create a configuration table that allows us to encode the
descriptions of where to find the devices on the bus, as well as wild card
values that require us to go out and compel the device to interrupt to locate
which interrupt it's configured for.
To find interrupts, we program the interrupt controller to allow us to poll
the interrupt lines to check for activity on a given line when we probe for a
device, and we wait for a sufficient period before giving up. With some
notorious devices, we just wire them into the designated interrupt in the
table and go on. For all remaining interrupts not found, an interrupt catcher
table will reflect them to an error-logging service of the kernel, so we can
note their occurrence.
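
The polling approach can be sketched as follows. Everything that touches
"hardware" here is simulated: sim_reset( ), prod_device( ), and
read_pending( ) are hypothetical stand-ins for programming the real interrupt
controller and the real card, and the poll count stands in for the timeout.

```c
#include <stdint.h>

#define IRQ_NONE (-1)

/* --- Simulated hardware (hypothetical; stands in for the real thing) --- */
static uint16_t sim_pending;   /* one bit per interrupt line              */
static int sim_irq;            /* line the card is jumpered to            */
static int sim_delay;          /* polls before the card raises its line   */
static int sim_prodded;        /* has the card been asked to interrupt?   */

void sim_reset(int irq, int delay)
{
    sim_pending = 0;
    sim_irq = irq;
    sim_delay = delay;
    sim_prodded = 0;
}

/* Ask the device to generate an interrupt. */
void prod_device(void) { sim_prodded = 1; }

/* Read the set of interrupt lines currently showing activity. */
uint16_t read_pending(void)
{
    if (sim_prodded && sim_delay > 0 && --sim_delay == 0)
        sim_pending = (uint16_t)(sim_pending | (1u << sim_irq));
    return sim_pending;
}

/* --- The probe itself --- */
/* Prod the device, then poll for a line that was quiet before the prod.
 * Returns the interrupt number, or IRQ_NONE if nothing shows up in time. */
int find_irq(void (*prod)(void), uint16_t (*pending)(void), int max_polls)
{
    uint16_t before = pending();    /* lines already active are ignored */

    prod();
    for (int i = 0; i < max_polls; i++) {
        uint16_t fresh = (uint16_t)(pending() & (uint16_t)~before);

        if (fresh != 0) {
            for (int irq = 0; irq < 16; irq++)
                if (fresh & (1u << irq))
                    return irq;
        }
    }
    return IRQ_NONE;                /* give up; fall back to the table */
}
```

When find_irq( ) returns IRQ_NONE, the caller would fall back on the
interrupt wired into the configuration table, as is done for the notorious
devices mentioned above.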
Next we use a probe( ) entry, locate the device, and "prod" it into optionally
generating an interrupt and a DMA request. Sometimes this can be subtle to
write, because we need to determine if anything at all is present with the
supplied parameters, yet we don't want to inadvertently trigger a device we
haven't gotten to in our list of autoconfiguration table entries.
Occasionally, the only way to avoid these conflicts is by ordering
autoconfiguration, as in the case of display adapters. Backwards compatibility
with earlier software was required, so VGA and EGA display adapters would
decode the older CGA/MDA addresses as a part of the auto-sense feature to
support software that only knew of the older boards. If we probe for the
existence of the boards in an oldest-to-newest order, newer boards will
respond as older ones, thus confusing the situation. By checking in order of
newest-to-oldest, we can associate the correct driver with the appropriate
board, even though there may be some ambiguity.
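
The effect of probe ordering is easy to demonstrate with a toy model. In the
sketch below, the simulated board answers to its own standard and to every
older one; that blanket rule, the enum, and the function names are
simplifying assumptions made for illustration, not a description of any
particular adapter.

```c
#include <stddef.h>

/* Adapter generations, oldest to newest. The CGA/MDA class is folded into
 * a single value for this sketch. */
enum adapter { AD_NONE = 0, AD_CGA, AD_EGA, AD_VGA };

static enum adapter sim_installed = AD_VGA;   /* simulated board in the slot */

void sim_install(enum adapter kind) { sim_installed = kind; }

/* Simulated probe: a board responds as itself and as any older standard. */
int responds_as(enum adapter kind)
{
    return kind != AD_NONE && sim_installed != AD_NONE &&
           kind <= sim_installed;
}

/* Try each candidate in the given order; the first that answers wins. */
enum adapter identify(const enum adapter *order, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (responds_as(order[i]))
            return order[i];
    return AD_NONE;
}

const enum adapter newest_first[] = { AD_VGA, AD_EGA, AD_CGA };
const enum adapter oldest_first[] = { AD_CGA, AD_EGA, AD_VGA };
```

With a VGA board installed, identify(oldest_first, 3) stops at the first
match and wrongly reports the oldest adapter, while identify(newest_first, 3)
reports the board for what it is.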
As we find devices, we logically connect interrupts and DMA request signals to
the associated drivers. With interrupts, we point the Interrupt Descriptor
Table (IDT) call-gate entry to the assembly language stub routine associated
with the driver. We then adjust the interrupt mask to disable interrupts for
all devices in the group to which the driver belongs. (In the future, we will
learn more about such interrupt groups.)
To complete the attach of the device, the attach( ) routine in each driver is
called to configure the device appropriately for operation and to report
relevant facts about how the device can be used back to the system. Network
drivers manage to extract link layer addresses embedded in the cards and
inform the network protocol portions of the kernel of characteristics.
Either at the time of attach, or at the subsequent open, disk labels are
extracted off of disk drives, and the system can be made aware of the kinds of
file systems used, including paging areas for virtual memory.


386BSD Autoconfiguration Limitations


We've described the information our BSD kernel might wish to obtain from the
hardware to configure devices, and what the ISA has to offer in this regard.
The two are far from a perfect match. Much information is missing that we
would prefer to have, and the situation regarding configuration conflict
detection between devices seems almost hopeless. But this is assuming we have
no hints at all about the bus; in fact we do, and we are compelled to use
them.
For the more ancient and problematic cases (such as printer parallel ports
that won't generate an interrupt unless a printer is attached and ready), we
can force the configuration table to assume the interrupt associated with the
device. Thus, if the printer is detected during a probe, the software will
dutifully wire down the interrupt vector without verifying that it actually is
attached. These limits are primarily due to the lack of information available
because of the history of the ISA bus.


Other Buses


Much of our current strategy has focused around the ISA bus of our target
machine. However, there are a number of machines which utilize other buses,
such as the Microchannel (MCA), EISA, VME, or other non-ISA buses. To
implement these bus types, this portion of the system would change greatly.
While additional buses were outside of the scope of our project, we did not
desire 386BSD to be limited solely to ISA, so the ISA bus-related code is a
configurable option with a defined interface into the kernel. To add support
for other buses, you can add the functionality alongside the current ISA
code and use it as an example.
In the case of EISA (which extends the functionality of the ISA bus for new
cards designed to this standard), such new code would be interwoven with the
existing ISA autoconfiguration mechanism, as both would be needed to support
old ISA and new EISA devices. For MCA, which uses a completely different
approach to board configuration and is incompatible with ISA, the
autoconfiguration and device drivers would be completely separate from the ISA
code.


Next Time


Now that we have reviewed autoconfiguration and its mechanisms, it is time to
move on to actual driver operations, such as the enabling operation of the PC
hardware devices, splX( ) (interrupt priority-level management), and the
interrupt vector code. After this, we will walk through the code of some
sample drivers, noting the important points in light of our knowledge about
BSD autoconfiguration and interfaces. We will examine in detail some of the
code required for the console, disk, and clock interrupt drivers. The basic
structure and minimal requirements of these drivers, and ways of extending
their functionality through procedures such as disklabels, will also be
discussed.


Brief Notes: UNIX--Just a "Bag of Drivers?"


Interfaces are a rather crucial part of an operating system, yet we've managed
to avoid them up to now. How? Well, it wasn't as hard as you might think. A
look at the heritage of operating systems might be instructive.
Many early operating systems were little more than a "bag" of drivers and
subroutines to make use of them. The operating system provided the "common
unifying" interface between hardware resources and the applications that
consumed them. Initially, these early systems used a handful of physical
resources (disk blocks, RAM, CPU) packaged as abstractions (files, address
space, time slice) which an application would obtain and then relinquish to
the system, as needed.
More advanced systems attempted to "multiplex" resources in an effort to
manage resources more efficiently among a number of competing applications, in
order to get the most use out of expensive hardware (in other words, amortize
the costs over widespread use). As operating systems began to contend with
networks, data exchange formats and conversion, and standard programming
languages, the size and extent of a user's reach extended beyond a single
machine. Computers begot more computers, and only then did the issues of
resource sharing and interface standardization become worthy of notice.
Because resource sharing/multiplexing was done primarily for cost reasons, and
only secondarily for convenience and cooperation with other users, it has
gotten short shrift from designers and standards groups, until now. However,
so many conflicting approaches exist that it appears hopeless that there will
ever be "a standard operating system," let alone "a standard computer
architecture."
The modern bane of technology is that as complexity increases, it starts to
overwhelm and blind us with its bulk. As this occurs, we are required to deal
with the more microscopic elements in ever greater orders of magnitude.
Constant improvement in our algorithms, mechanisms, and paradigms is the only
way we can ever hope to mitigate this deluge.
And we have not even mentioned the new demands on the frontiers of
development, in which video and audio signals, representing hundreds of
megabytes per second of bandwidth, need to be channeled, processed, and
combined for multimedia purposes. Nor have we mentioned the need for
cooperative multiprocessing applications.
Future operating systems challenges will be quite different from those of the
past. The economics of computers no longer require us to use whatever means
necessary to save a handful of bytes here and there. We can opt for a
direction that leads toward increased clarity and scope, instead of recreating
a new version of the old. Paradigm shift sometimes allows us to take a step
back and recognize that the tree leaves we were previously staring at really
are part of a forest.
However, even at this stratospheric level, the operating system still retains
its original heritage of being a "bag of drivers." Damn elaborate drivers
maybe, but drivers nonetheless. In a way, we have been discussing various
aspects of the driver interface all along, because UNIX is the interface.

--B.J. and L.J.



_PORTING UNIX APPLICATIONS TO DOS_
by David N. Glass

Figure 1: Macros to handle text and binary files in DOS.

#ifdef DOS
# define READ_BIN "rb"
# define READ_TXT "r"
# define WRITE_BIN "wb"
# define WRITE_TXT "w"
#else /* the unix way */
# define READ_BIN "r"
# define READ_TXT "r"
# define WRITE_BIN "w"
# define WRITE_TXT "w"
#endif /* DOS */


Figure 2: Mapping the UNIX SIGALRM to a DOS user-defined signal.

#ifdef DOS
# define SIGALRM SIGUSR1
#endif /*DOS*/



Figure 3: Writing files using a specified file handle.

write_files(fp, buffer, bytes)
int fp;
char *buffer;
int bytes;
{
 if (bytes > 0) write(fp, buffer, bytes);
}


Figure 4: Mapping WRITE_TTY to either DOS or UNIX I/O calls.

#ifdef DOS
# define WRITE_TTY write_port
#else /* unix */
# define WRITE_TTY write
#endif /* DOS */














November, 1991
DDJ DATA COMPRESSION CONTEST RESULTS


And the winner is...




Mark Nelson


Mark is vice president of software development at Greenleaf Software and
author of the upcoming The Data Compression Book (M&T Publishing). Mark can be
contacted at 16479 Dallas Parkway, Suite 570, Dallas, TX 75248.


When we announced the DDJ Data Compression Contest in the February 1991 issue,
we expected a big response from readers -- and we weren't disappointed. Over
the course of the following months, compression programs from all over the
world poured into DDJ's offices.
As you might expect from a contest like this, the entries covered the whole
spectrum, from the good and bad to the downright ugly. However, as I reviewed
the code coming in, the one thing that stood out was a refreshingly high level
of creativity and innovation in the approaches to data compression. There were
virtually no "me too" rehashed versions of familiar programs such as UNIX
Compress or LHarc. Instead, programmers who read DDJ seemed to be eager to try
out new algorithms and techniques, frequently with good results. This provides
an interesting counterpoint to those who tell us that to produce good software
today you need $1,000,000 budgets and hordes of interchangeable programmers.
Maybe it doesn't have to be that way.


The Cream of the Crop


After running all the submitted programs through the DDJ compression test
suite, a handful distinguished themselves as exceptional contenders for the
DDJ compression championship:
Charles Ashford submitted a program called COMPRESS that used statistical
modeling combined with an arithmetic coder. His program was written in C for
MS-DOS.
Gene H. Olson submitted a program he posted to USENET comp.sources.misc a few
months earlier. His C implementation of a modified LZW algorithm was intended
to be an improved replacement for the UNIX compress program, until Gene found
he had inadvertently run afoul of the Unisys LZW patent.
Urban Koistinen from Sweden submitted another 32-bit C program targeted for
UNIX systems. Urban's program appears to be a bit-oriented statistical
compression program.
Tom Ehlert, a DDJ reader from Germany, submitted code written in C and
assembly to implement his program HSTEST. HSTEST works under MS-DOS, and uses
a combination of sliding dictionary and Huffman compression techniques.
Philip Gage submitted an ANSI C 16-bit program called SIXPACK, so named
because it uses six different algorithms to compress files.
Finally, we received an MS-DOS assembly language program from Finland. The
authors, Jussi Puttonen, Timo Raita, and Jukka Teuhola, based their submission
on their ongoing research at the University of Turku in Turku, Finland.


The Results


The final test suite for the contest consisted of approximately 2 Mbytes each
of text, graphics, and executable files. Individual file sizes ranged from as
little as 5 Kbytes up to 700 Kbytes. I judged programs based on their ability
to compress each of the three different types of files, as well as overall
compression ability. We also judged speed of compression and expansion.
It took literally weeks of continuous testing on the test machine (a 25-MHz
80386 PC from Everex Systems that had a 160-Mbyte hard disk and 8 Mbytes of
RAM), but after a few false starts, I was finally able to complete testing and
begin tabulation of the results. The final results confirmed what I had
suspected during testing: The championship was going to be decided between two
programs that were as different as night and day.
When I first began testing Charles Ashford's COMPRESS program, I knew it was
going to be a problem. Charles's program had two characteristics that made it
stand out: First, it compressed files significantly better than any other
program submitted, or for that matter, any other program I have seen. Second,
it was by far the slowest compression program submitted; in fact it was three
orders of magnitude slower than the fastest entrant! Testing the Ashford code
took almost four full days on a 25-MHz 386 machine. However, the contest
didn't place any restrictions on the time a program could take, so this was
all perfectly legal.
The second stand-out program was Gene Olson's COMPACT. This program was at the
complete opposite end of the spectrum. COMPACT is written entirely in 32-bit C
for UNIX targets, and by MS-DOS standards is quite a resource hog in its own
right. At a minimum, COMPACT requires 1 Mbyte of RAM for internal table space.
Gene's program makes good use of system resources for managing I/O, however,
as his compression and expansion speeds dwarf any of the competition. Gene has
an advantage in working on a platform that implements overlapping I/O, and he
makes the most of it.
The source code for the programs discussed here is available electronically;
see "Availability," page 3. In addition, I've included a complete listing of
the test results in both text and Lotus formats.


The Judge's Decision


Tables 1 and 2 show the final results in all of the categories judged. As you
can see, the two stand-out programs dominate their respective categories, but
fail to make even an appearance where they are weak. A few other programs,
such as Gage's SIXPACK, show up consistently high in the standings, but are
generally not able to defeat the Ashford code.
Table 1: Compression ratio results

      Overall           Graphics          Text              Executables
---------------------------------------------------------------------------

 1st  Ashford   37.5%   Koistinen 38.6%   Ashford   27.1%   Ashford   45.7%
 2nd  Koistinen 40.5%   Ashford   29.7%   Gage      31.1%   Gage      48.0%
 3rd  Gage      41.8%   S. Heller 41.8%   Koistinen 33.2%   Koistinen 48.6%
 4th  Ehlert    43.8%   T. Isaac  42.61%  Ehlert    34.3%   Ehlert    49.5%
 5th  E. Hatton 47.1%   Gage      47.7%   E. Hatton 34.3%   E. Hatton 56.1%


NOTE: Ratios are expressed as 100 * (compressed size / original size)
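
For readers who want to reproduce the figures with their own files, the ratio
in the note above works out as in this small C function. The function name and
the choice of unsigned long byte counts are assumptions for illustration; note
that under this definition a smaller number means better compression.

```c
/* Compression ratio as defined in the note: 100 * (compressed / original).
 * Returns 0.0 for an empty original to avoid dividing by zero. */
double compression_ratio(unsigned long compressed, unsigned long original)
{
    if (original == 0)
        return 0.0;
    return 100.0 * (double)compressed / (double)original;
}
```

A 1000-byte file that compresses to 375 bytes therefore scores 37.5%.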

Table 2: Compression speed results

      Compression                   Expansion
--------------------------------------------------------------

 1st  Olson      85,613 bytes/sec   Olson     131,968 bytes/sec
 2nd  U. Turku   67,003 bytes/sec   U. Turku   90,575 bytes/sec
 3rd  Ehlert     15,564 bytes/sec   A. Tam     63,668 bytes/sec
 4th  S. Boyd    14,962 bytes/sec   Ehlert     43,341 bytes/sec
 5th  C. Wong    11,862 bytes/sec   C. Wong    27,389 bytes/sec

As neither program is able to claim a decisive victory, the judge's final
decision is that Gene Olson and Charles Ashford will share the Grand Prize. I
hope that Gene and Charles combine their talents in the future, as between the
two of them they have an unbeatable program.


How do Well-Known Compression Programs Stack Up?


During the course of testing the submissions to the DDJ compression contest, I
also ran some well-known freeware and shareware programs through the test
suite to see how well they performed. In every case, I used the default
options to add files to an archive, and I took the archiving program's word
for how many bytes it used to create the compressed file. The programs tested
were:
PKZIP 1.10, a popular MS-DOS shareware archiver
ARJ 2.10, a newer MS-DOS archiver. ARJ is freeware for noncommercial use.
LHA 2.10, a freeware MS-DOS archiver for which full source is available; LHA
is the successor to LHArc.
PAK 2.51, a low-cost shareware archiver
COMPRESS, the UNIX utility, available for numerous platforms
As the results listed in Table 3 show, PKWare may be losing its edge as the
data compression leader in the PC world. ARJ is a relative newcomer, but it
appears that Robert Jung is coming close in his efforts to claim the number
one position in the compression contest. We will have to wait to see if a new
release of PKZIP is launched any time soon.
Table 3: Off-the-shelf compression program results

 Compression Speed            Expansion Speed              Compression Ratio
-------------------------------------------------------------------------
 PKZIP    13987 bytes/sec     PKZIP    39745 bytes/sec     ARJ      41.4%
 ARJ      13623 bytes/sec     ARJ      34652 bytes/sec     LHA      41.8%
 LHA      12168 bytes/sec     LHA      23090 bytes/sec     PKZIP    43.8%
 PAK       9452 bytes/sec     PAK      20979 bytes/sec     PAK      45.0%
 COMPRESS  6132 bytes/sec     COMPRESS  8516 bytes/sec     COMPRESS 52.6%

--M.N.



























November, 1991
PORTING UNIX APPLICATIONS TO DOS


The bigger the job, the more the right tool counts


 This article contains the following executables: PORTUNIX.ARC


David N. Glass


David is vice president of Performance Computing Inc., a custom software
services company specializing in development tools, windows, and applications
support for high-performance architectures. He can be reached at P.O. Box
230995, Portland, OR 97223, or 503-624-8245.


Like many UNIX workstation software engineers, I've watched with surprise (and
horror) as DOS and the PC have spread through the engineering community. That
an operating system with so few safeguards against inadvertent crashes and a
processor that forces the programmer to think like a car renter ("Will that be
the compact or the small model, sir?") could become so popular continues to
amaze me.
Consequently, when our biggest client asked us to port the Free Software
Foundation's GNU/960 Development Tool Suite -- consisting of approximately
240,000 lines of C source code -- to DOS, we took a deep breath and dove in.
Hopefully, what we learned with our port, and what we're sharing with you in
this article, can reduce headaches when you undertake similar tasks.


Facing the Challenge


A number of issues are involved with porting a 32-bit UNIX application to the
DOS world, the most obvious being that DOS is a 16-bit operating system. At
the system-services level, all data reads, writes, and transfers are limited
to 16-bit addressability (a 64K segment). While DOS native applications have
learned to live with this limitation by making multiple data manipulations in
64K chunks, UNIX applications have been written to access as much as 4
gigabytes in one data transfer. Splitting each data manipulation into multiple
64K chunks would be both inefficient and error prone. It's better to use tools
that will handle this for you invisibly. Of course, 16 bits has many more
implications, including the segmented memory model and how it affects
addressing capabilities and performance.
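
To give a feel for the chunking that 16-bit code forces on every large transfer, here is a minimal sketch of a chunked write. The name write_chunked and the 32K chunk size are assumptions chosen for illustration. On a real 16-bit compiler the pointer arithmetic would also need huge pointers, which this sketch does not show.

```c
#include <stdio.h>

#define CHUNK_MAX 0x8000u   /* 32K: an arbitrary size safely under 64K */

/* Write total bytes from buf in pieces of at most CHUNK_MAX bytes.
 * Returns the number of bytes actually written. */
unsigned long write_chunked(FILE *fp, const char *buf, unsigned long total)
{
    unsigned long done = 0;

    while (done < total) {
        unsigned long left = total - done;
        size_t piece = left > CHUNK_MAX ? CHUNK_MAX : (size_t)left;
        size_t wrote = fwrite(buf + done, 1, piece, fp);

        done += wrote;
        if (wrote != piece)
            break;          /* short write: stop and report what we managed */
    }
    return done;
}
```

Every read, write, and copy of a large object would need a loop like this one, which is why a toolset that hides the problem is the better choice.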
Another difference is the size of the int data type between 16-bit DOS and
32-bit UNIX. At first glance, this seems a minor point, but it can actually
cause all kinds of misery during the port. Not only do you have to find and
replace all the ints but, if you miss one and pass it as a parameter, the
stack can be corrupted and cause the application to crash. Fortunately, some
DOS compilers (such as those of Intel, Watcom, and Metaware) use an int size
of 32 bits, which eliminates this worry.
DOS inflicts many memory restrictions on its applications. Without some type
of extended memory support, system and applications must fit into a maximum of
640 Kbytes of code space. Even with extended memory support, memory
availability is limited to the amount of physical memory in the system, minus
memory resident system utilities. This can place a strangle-hold on UNIX
applications that have been written for virtual memory and the attitude that
"memory is cheap." The most common solution is to stuff intermediate data that
would normally be held in memory out to temporary files. Unfortunately, this
can be a major rework, depending on the application. Furthermore, it can slow
the execution speed tremendously, because the data is accessed based on disk
transfer rates instead of physical memory access times.
The UNIX runtime library is a cornucopia of utilities that range from data
manipulation to basic I/O. Corresponding DOS C runtime libraries offer most of
the functions provided on UNIX. However, some UNIX capabilities simply do not
exist under DOS. For example, because DOS supports only a single thread of
execution (no preemptive multitasking), UNIX functions such as fork( ) cannot
be implemented equivalently on DOS. Applications that require cooperative
multitasking between child processes could require major reworking.
Finally, like many applications, ours was constantly under development. A
major requirement was to minimize specific-for-DOS changes in the source code.
Therefore, solving these problems by placing #ifdefs throughout the code was
not acceptable because future upgrades to the application (which continue on
the UNIX host) could result in as much effort to port as the original program.
It was important to plan ahead and devise these sorts of one-time changes that
could be separated into a system-dependent DOS include file. This
well-documented file could be used when planning future upgrades and
enhancements, to make sure all the coding standards are still followed.


GNU/960 Development Tools


The GNU/960 tool suite, targeted for the Intel 80960 32-bit RISC processor
used in commercial applications such as laser printers, network controllers,
terminals, avionics, and radar processing, is a cross-development system based
on the Free Software Foundation's (FSF) generic tool suite. GNU/960 consists
of an optimizing compiler, assembler, linker, archiver, debugger, and
communications package, as well as numerous minor (yet useful) utilities,
including a dump utility, a tool to migrate between the two object file
formats the linker can produce, and a symbol table extractor. All in all,
there are 17 separate development tools, 364 different source files, and over
240,000 lines of code.
The GNU/960 tool suite supports the entire 960 family, including the
superscalar 960CA, which can execute multiple instructions in one clock cycle
when the compiler has scheduled the instructions in the proper order.
The 960 processor generally communicates with the host development system over
an RS-232 connection. This connection is manned on the host side by the GNU
comm utility, and on the 960 side by a bootstrap kernel called "Nindy."
Downloading an application is completed via a packet transfer protocol that
detects data errors and requests that a packet be re-sent, if necessary. These
communications are full-featured, allowing programmers to specify options such
as data size, stop bits, parity, and baud rates from 300 to 38,400 bps.


Project Requirements


Based on our initial evaluation, we came up with a set of criteria to help
determine the DOS-based development tool suite best suited for the porting
task. Obviously, the compiler had to generate 32-bit code capable of executing
in the 386/486 protected mode, but also support virtual memory to alleviate
640K limitations. From my experience, UNIX engineers have been spoiled by
demand paging, and would rather avoid overlaying data and code segments. We
also don't want to deal with extended/expanded memory hooey. We want a real,
"use all the memory you've got, then page from disk" virtual memory. Not all
DOS extenders include virtual memory managers.
Another requirement was that the toolset be a complete integrated package. We
wanted tools that work together seamlessly. We didn't want to get a virtual
memory manager from one vendor, a compiler from another, and a debugger from a
third. In addition, the package had to be well-supported, stable, and work as
promised.
Also high on the list of requirements was that the toolset not have any hidden
costs attached to it. In particular, a royalty-free DOS extender was
considered mandatory. It's difficult to justify charging a fee when
distributing "free" software like the GNU tools. Finally, the environment had
to provide a clear path to Windows 3.0. After all, one of the biggest reasons
for porting the application to the PC is to make it available to the greatest
number of users possible.


C Code Builder


We evaluated several options, including those from Metaware, Watcom, and
Intel. We did not consider the Microsoft and Borland products because they
produce only 16-bit code. The environment that best fit our criteria was
Intel's 386/486 C Code Builder Kit. Code Builder includes a 32-bit compiler,
a full-screen source-level debugger, virtual memory manager, 0.9
DPMI-compatible DOS extender, linker, librarian, and make utility. Our
greatest concern was with the newness of the product. We later learned that
the compiler is an adaptation of Intel's well-established x86 embedded
cross-development compilers; versions of this compiler have been used to write
real-time embedded applications for many years. (This perhaps explains why the
compiler performs so well for a newly released product.) In short, we had no
code-generator-related problems.
The compiler will accept K&R C syntax as well as ANSI standard C. This
flexibility allows us to port the "dusty deck" C, which most of GNU is written
in, while still employing the improvements of ANSI C on any new code we wrote.
The Runtime Library (RTL) complies with the ANSI specification. It also
includes Microsoft, POSIX, and System V UNIX extensions, in that order of
priority. Thus, there is a good chance that most UNIX routines will be
available for use under Code Builder, especially if the code was written under
System V UNIX.
Code Builder also contains a make utility very similar to UNIX make. In fact,
it even contains some rudimentary UNIX shell-like commands (for, cp, and rm,
for example) that do not exist under a standard DOS command-line interpreter.
This makes supporting UNIX make files much easier, and getting builds going
much quicker.


Limitations, Expected and Otherwise


If your application was written using the Berkeley BSD version of UNIX, you
may have more trouble with the Runtime Library. BSD support was apparently
never a design criterion for Code Builder and the less-common or BSD-specific,
system-level functions will probably not exist. Furthermore, even if the
routine you are using has a corresponding routine in the Code Builder RTL (no
matter which UNIX RTL you have been using), you had better check the
documentation. It is always possible that the routine was coded to some other
standard than you expect, and functions a little bit differently than you were
counting on. Making assumptions like these can cause premature gray when
trying to debug some weird porting bug!

We started off porting the compiler and the communications tools, figuring
that the sheer size and complexity of the compiler (over 100,000 lines of
code) and the low-level RS-232 bit twiddling would flush out many of the
problems we would encounter over the life of the project. So, we purchased
Ethernet cards, bought PC-NFS for our DOS boxes, mounted the UNIX source code
disk to be accessed over the net, and prepared to compile our modules.
The problems we encountered fell into five general categories: system
mismatches; sloppy programming practices; C Code Builder limitations;
"DOSisms;" and "library misses." The first three tended to be compilation
failures, while the rest didn't show up until the link stage or at runtime. In
general, the later in the compile/link/run cycle a problem showed up, the
harder it was to track down.


System Mismatches


I call a problem a "system mismatch" when the tool or utility is designed with
some other set of criteria in mind. These problems show up either before
compilation begins or as compilation errors. For example, the make utility
that comes with Code Builder can handle about 80 percent of what you might
expect a UNIX make file to handle. One big difference, however, is that UNIX
make files contain shell instructions that execute when a target has been
recognized. Unfortunately, UNIX instructions do not exist on DOS.
In a few cases, such as echo and for, the make utility seems to add the
functionality of their UNIX shell counterparts. This can be deceiving (and
frustrating) because they're really provided to be Microsoft make-like, which
uses a similar-but-different syntax. Once it's clear that echo is limited with
respect to its UNIX cousin, and that rerouting using >, >>, and >& works, but
only in fairly simple forms, the make files are not too difficult to adjust to
work under both environments.
Another example of a system mismatch is the definition of certain external
global names under C Code Builder. For example, the global value errno, which
is used to return specific error values from certain I/O routines and is
normally implemented as an int, is instead implemented as a macro in Code
Builder. This was done to make the Code Builder runtime library reentrant. A
noble goal, but compilation errors abound whenever the application explicitly
defines errno as extern int errno. This is fairly common in certain
applications.
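
A minimal sketch of the portable way to reach errno follows. It includes the standard header rather than declaring the variable by hand, so it compiles whether errno is a plain int or a macro. The function try_open is a hypothetical helper written for this illustration.

```c
#include <errno.h>   /* declares errno, whether it is a variable or a macro */
#include <stdio.h>

/* Avoid:  extern int errno;   (fails to compile when errno is a macro) */

/* Try to open a file for reading. Returns 0 on success, or the errno
 * value left behind by the failed fopen(). */
int try_open(const char *path)
{
    FILE *fp;

    errno = 0;
    fp = fopen(path, "r");
    if (fp == NULL)
        return errno;
    fclose(fp);
    return 0;
}
```

Replacing each explicit extern declaration with the include is a one-time change that works on both hosts.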
Missing include files present yet another system mismatch related problem.
Under DOS, there is no need for the data definitions and routines defined
within such include files as ioctl.h, termio.h, curses.h, and sys/file.h. Many
of these relate to low-level I/O functions, which on DOS are handled by the
BIOS. Others are definitions of terminal types and capabilities--something
foreign to DOS, which expects only the standard PC monitor. Any data or
routines normally defined in these include files and used in your application
will need to be simulated, stubbed out, or references removed before
compilation can continue.
Unlike UNIX, DOS differentiates between text and binary files. With text
files, data such as control characters and character sequences are interpreted
directly by the I/O routines. Data in binary files are passed through without
interpretation. We dealt with this by defining the macros shown in Figure 1
and modifying the opens to be fp = fopen (filename, READ_BIN) or fp=fopen
(filename, READ_TXT). This works equally well on DOS and UNIX. This approach
centralizes the DOS-specific code into a single location within an include
file.
Figure 1: Macros to handle text and binary files in DOS

 #ifdef DOS
 # define READ_BIN "rb"
 # define READ_TXT "r"
 # define WRITE_BIN "wb"
 # define WRITE_TXT "w"
 #else /* the UNIX way */
 # define READ_BIN "r"
 # define READ_TXT "r"
 # define WRITE_BIN "w"
 # define WRITE_TXT "w"
 #endif /* DOS */
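
As a usage sketch, the routine below writes a buffer and reads it back through the binary-mode macros. The function name roundtrip_size is hypothetical. On DOS C runtimes, text mode commonly translates newlines and treats Ctrl-Z (0x1A) as end of file, so a buffer containing those bytes is the kind of data that needs the binary macros.

```c
#include <stdio.h>

#ifdef DOS
# define READ_BIN  "rb"
# define WRITE_BIN "wb"
#else /* the UNIX way */
# define READ_BIN  "r"
# define WRITE_BIN "w"
#endif /* DOS */

/* Write n bytes to path, then count how many bytes can be read back.
 * Returns the count, or -1 on any I/O failure. */
long roundtrip_size(const char *path, const unsigned char *data, size_t n)
{
    FILE *fp = fopen(path, WRITE_BIN);
    long size = 0;

    if (fp == NULL)
        return -1L;
    if (fwrite(data, 1, n, fp) != n) {
        fclose(fp);
        return -1L;
    }
    if (fclose(fp) != 0)
        return -1L;

    fp = fopen(path, READ_BIN);
    if (fp == NULL)
        return -1L;
    while (getc(fp) != EOF)
        size++;
    fclose(fp);
    return size;
}
```

With the binary macros the count read back should equal the count written on either system.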



Sloppy Programming Practices


These problems occur at compile time and are the easiest to solve because, in
most cases, they are ultimately a result of bad or lazy programming. For
example, we found enumerated types being defined with trailing commas. (One
can only guess it made adding the next new enumeration value quicker.) While
UNIX C compilers are rather lenient in this regard, Code Builder choked on the
trailing comma.
Another example of sloppy code we found broke the preprocessor. In this case,
note the macro definition #define abort( ) fancy_abort( ). The name
fancy_abort( ) was itself defined in terms of abort( ), so the preprocessor
went into an infinite loop trying to resolve the circular recursion. Some
compilers catch circular definitions; Code Builder does not.
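
One possible way to break such a cycle is sketched here with hypothetical names (cleanup and fancy_cleanup stand in for abort and fancy_abort); the article's actual fix is not described. A function-like macro is expanded only when its name is followed directly by an opening parenthesis, so writing (cleanup)() reaches the real function without mentioning the macro.

```c
static int plain_calls = 0;
static int fancy_calls = 0;

/* The real routine, defined before any macro hides its name. */
static int cleanup(void)
{
    plain_calls++;
    return 0;
}

/* Redirect every ordinary call site to the "fancy" version... */
#define cleanup()       fancy_cleanup()

/* ...which ends by calling the real routine. The parentheses around the
 * name keep the preprocessor from treating it as a macro invocation. */
#define fancy_cleanup() (fancy_calls++, (cleanup)())

int run_cleanup(void)
{
    return cleanup();   /* expands to (fancy_calls++, (cleanup)()) */
}
```

Each call to run_cleanup() bumps both counters once, which confirms that the macro chain ends at the real function instead of looping.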


Code Builder Limitations


Limitations inherent to Code Builder tend to be designed-in, artificial
restrictions that no one on the design team ever thought would be questioned.
For example, who would have thought that a macro definition string would
exceed 1K? Unfortunately, the GNU compiler has some incredibly long macro
definitions that are used to define special tables and output formats.
Fortunately, the limitation on macro expansions is much greater (6K). If the
problem is only in the length of the string that follows the macro name in the
definition, you can work around it by splitting the macro into multiple parts.
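
The splitting workaround might look like the following sketch. The table contents and macro names are invented for illustration and are far shorter than the GNU definitions in question. Each piece stays under the definition-length limit, while the combined macro only has to fit the larger expansion limit.

```c
/* Each part is short enough to define on its own. */
#define OPCODE_NAMES_A  "add", "sub", "mul",
#define OPCODE_NAMES_B  "div", "and", "or"

/* The full list exists only as an expansion, never as one long definition. */
#define OPCODE_NAMES    OPCODE_NAMES_A OPCODE_NAMES_B

static const char *const opcode_names[] = { OPCODE_NAMES };

#define OPCODE_COUNT (sizeof opcode_names / sizeof opcode_names[0])
```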
The other two size problems we ran into show up at runtime. With Code Builder,
the programmer has control over the maximum size to which the stack can grow,
and the maximum size of real memory used before going to disk for virtual
memory. Both these problems usually manifest themselves as a runtime abort,
often changing slightly when the debugger is run, or when new routines are
written and linked into the application. The default stack size is determined
by the linker, and can be adjusted using the - s [+-] <size> linker command
line option. Some of the GNU tools use alloca() to allocate dynamic memory on
the stack for entire temporary data files, so we needed to allow the stack to
grow as much as 1 Mbyte.
Something to watch out for when debugging your application is the amount of
virtual memory needed to run the Code Builder application. It is necessary to
anticipate the maximum amount of memory the application will need during
execution, then set the "region size" accordingly. Code Builder defaults to a
region size equal to that of all your system's extended memory. If your
application needs more, malloc() will eventually fail and your application
will take whatever error precautions have been programmed into it, if any. The
region size can be adjusted at compile time by using the -xregion switch on
the compiler's command line.
Another limitation relates to the library routine alloca(), which allocates
dynamic memory directly on the stack so that it is automatically "freed" upon
returning from the current routine. Even though this routine is considered
obsolete by the ANSI C committee, Intel saw fit to include it in its RTL for
Microsoft and K&R C compatibility. This turned out to be good news because GNU
tends to use alloca() with gusto. However, there are some limitations on it
which can cause problems not immediately evident when trying to debug a
failure at runtime. The most damaging is that at least one local variable
needs to be defined in any routine that uses alloca(). Otherwise, the stack
pointer may not be properly restored upon executing a return statement and the
application may branch off to Mars. Of course, as is true with UNIX, nothing
allocated with this routine should be passed to free(), because it will cause
the dynamic memory heap to become corrupted.
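
The following sketch shows the usage pattern these cautions imply: the routine declares ordinary local variables, takes scratch space from alloca(), and never passes that space to free(). The header name is an assumption that holds for glibc and BSD-style systems; other compilers, Code Builder included, may declare alloca() elsewhere. The function is_palindrome is a made-up example.

```c
#include <alloca.h>   /* assumption: header location varies by compiler */
#include <string.h>

/* Returns 1 if s reads the same forward and backward, else 0. */
int is_palindrome(const char *s)
{
    size_t n = strlen(s);              /* at least one ordinary local */
    char *rev = (char *)alloca(n + 1); /* scratch space on the stack */
    size_t i;

    for (i = 0; i < n; i++)
        rev[i] = s[n - 1 - i];
    rev[n] = '\0';

    /* No free(rev): the storage disappears when this routine returns. */
    return strcmp(s, rev) == 0;
}
```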
The Code Builder debugger is useful and flexible, once you get used to the
commands and the rules for moving around in its "windowed" environment. The
only trouble I found with the debugger is really not its fault. Apparently the
compiler does not place the proper debug information into included files that
have executable code in them. When this happens, the debugger points to the
wrong location in the source. It doesn't resynch until the application
executes code from some other source file. If possible, the best way around
this problem is to remove all executable instructions from include files. If
this is not possible, you may have to create a temporary C file in which you
have preincluded all files with executable code in them until that portion of
the application is ported and tested.


DOSisms


A "DOSism" is a problem that arises because DOS simply won't do what you need
it to. Most of these issues relate to limitations of the BIOS routines. We ran
into both speed and accuracy problems when dealing with the RS-232 port via
the usual BIOS calls. Our requirements were that the downloading be able to
run at up to 38.4K bps and not lose any bits. This seems like a reasonable
request, but turned into a nightmare when we looked into BIOS further.
As described in the accompanying text box, "Communicating Around the BIOS,"
the BIOS could not guarantee that it would return to the host program every
character written to the port from Nindy at rates of 9600 baud and above.
Ultimately, we had to write our own RS-232 driver to bypass the BIOS, then mop
up all the ramifications of doing so.
The last limitation we ran into is shared by Code Builder and DOS. It is a
restriction on the number of files that can be opened at any given time. If
the application is failing because it can't open enough files, the DOS
limitation can be removed by modifying the FILES command in the config.sys
file. We found that 45 files were sufficient for our application. We ran into
a bug, however, in DOS 4.01, in which these values were ignored. We were not
able to run our application successfully on DOS 4.01 when it needed to open
more than 20 files. We had no such problems under DOS, Versions 3.x or 5.0.
The Code Builder RTL also has a maximum number of files that can be open at
one time. Unfortunately, this number is not affected by the FILES value set in
config.sys. Instead, the application's main routine must be modified to
include a call to _init_handle_count(num_files), where num_files is the
maximum number of files you need open at any given time. This number should be
less than or equal to the value in config.sys.


Library Misses


"Library misses" are problems that relate to library routines that either
don't exist or aren't the same under DOS and UNIX. A simple example is that
there is no way to turn off local echo on characters typed in from the
keyboard without calling a completely different read() function. The GNU
interactive communications tool that talks over RS-232 to Nindy running on the
80960 processor expects to have Nindy echo the characters it receives. So on
UNIX, it utilizes an ioctl call to turn off echo and calls the standard read
keyboard routine. We wanted to maintain our goal of not changing the read( )
functions, so we had to turn the Nindy echo off, otherwise we would see double
for everything the user typed.

Another problem was that DOS does not have available all the interrupt signals
that can be used under UNIX. The list of available signals is shown in Table
1. Thus, if your UNIX application uses one of the nonmapped signals, an
alternative must be used for running under DOS. For instance, DOS has no
SIGALRM alarm clock capability. To port code that uses it, one must map the
UNIX SIGALRM onto one of the user-definable signals (see Figure 2). Then all
statements that raise(SIGALRM) will really be raising the DOS user-defined
signal. Of course, in this case you also will need to write a version of
alarm( ) which uses the BIOS clock and explicitly raises SIGALRM when the
correct amount of time has elapsed.
Table 1: Mapping of interrupt signals under UNIX and MS-DOS

 MS-DOS Signal   UNIX Signal   Meaning
 ---------------------------------------------------------

 SIGABRT                       Abnormal termination
 SIGBREAK                      Control+Break signal
 SIGFPE          SIGFPE        Floating point exception
 SIGILL          SIGILL        Illegal instruction
 SIGINT          SIGINT        Control C interrupt
 SIGSEGV         SIGSEGV       Segmentation violation
 SIGTERM         SIGTERM       SW termination signal
 SIGUSR1         SIGUSR1       User-defined signal
 SIGUSR2         SIGUSR2       User-defined signal
 SIGUSR3                       User-defined signal
                 SIGHUP        Hangup
                 SIGQUIT       Quit
                 SIGTRAP       Trace trap
                 SIGIOT        IOT instruction
                 SIGEMT        EMT instruction
                 SIGKILL       Process kill
                 SIGBUS        Bus error
                 SIGSYS        Bad arg to system call
                 SIGPIPE       Pipe write with no reader
                 SIGALRM       Alarm clock
                 SIGCLD        Death of child process
                 SIGPWR        Power fail
                 SIGPOLL       Selectable event pending

Figure 2: Mapping the UNIX SIGALRM to a DOS user-defined signal

 #ifdef DOS
 # define SIGALRM SIGUSR1
 #endif /*DOS*/
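
A replacement for alarm( ) could be structured along the lines of this sketch. It is not the routine from the port: it reads the clock through the standard time() call rather than the BIOS, it relies on the program polling from its main loop, and unlike the UNIX alarm( ) it does not return the seconds remaining on a previous alarm. The names port_alarm, port_alarm_poll, PORT_SIGALRM, and on_alarm are hypothetical.

```c
#include <signal.h>
#include <time.h>

#define PORT_SIGALRM SIGUSR1   /* stand-in for the remapped SIGALRM */

static time_t port_deadline;   /* when the pending alarm is due */
static int    port_pending;    /* nonzero while an alarm is armed */

static volatile sig_atomic_t alarm_fired;

/* Example handler: just records that the alarm went off. */
static void on_alarm(int sig)
{
    (void)sig;
    alarm_fired = 1;
}

/* Arm the software alarm, or cancel it when seconds is 0. */
void port_alarm(unsigned seconds)
{
    port_pending = (seconds != 0);
    port_deadline = time(NULL) + (time_t)seconds;
}

/* Call regularly with the current time. Raises the signal once when the
 * alarm is due and returns 1; otherwise returns 0. */
int port_alarm_poll(time_t now)
{
    if (port_pending && now >= port_deadline) {
        port_pending = 0;
        raise(PORT_SIGALRM);
        return 1;
    }
    return 0;
}
```

A program would install the handler with signal(PORT_SIGALRM, on_alarm), call port_alarm(5), and then call port_alarm_poll(time(NULL)) on each pass through its main loop. Passing the time in as a parameter keeps the routine easy to test.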

Furthermore, the workarounds you devise to sidestep restrictions in DOS may
cause routines that exist in Code Builder's RTL to be insufficient. As I
mentioned earlier, we had to bypass the DOS BIOS to guarantee accurate
high-speed RS-232 communications. In doing this, we rendered useless all calls
to standard I/O routines dealing with the RS-232 port through the BIOS. This
included read(), write(), and dup2(), to name a few. We had to go back and
recode these routines to go through our own data structures and RS-232 driver,
then figure out a way to execute our routines when accessing the RS-232 port,
and the normal read(), write(), and dup2() when accessing local files on the
DOS disk.


Conclusion


Obviously, we had some work to do to complete the port, but Code Builder held
up its end of the bargain. Even though I've focused on things to watch for,
there were workarounds. In fact, I largely credit the smoothness of the port
to the development tools we used. In this respect, the Code Builder tool suite
gave the DOS machine the same feeling as a UNIX workstation.


Communicating Around the BIOS


Devices and files are handled differently in DOS than in UNIX. In DOS it is
not possible, for instance, to simply open a stream and begin reading from and
writing to it. Instead, devices and files must be opened, the controller
initialized, and the BIOS tables set up. The BIOS controls all I/O, including
that which is bound for the RS-232 port. The BIOS handles all interrupts that
are raised when a character comes in over the port, and supplies interface
calls to access the data. The problem is, when DOS needs to service certain
other high-priority requests such as disk accesses, it turns off other
interrupt servicing. DOS still receives the interrupt, but does nothing until
the disk access is completed.
In cases where the data doesn't need to travel terribly fast, say less than
9600 baud, chances are that the disk access will be completed before another
character arrives on the port. However, as speeds exceed 9600 baud, there is
an increasing chance that multiple interrupts will be received during a disk
access, with only the most recent character being picked up after the disk
access is over.
This isn't fatal for the port, but it meant that we had to write a driver to
bypass the BIOS routines and talk directly to the UART controlling the RS-232
connection, buffering all characters as they are received. Then, when the disk
access is completed and the interrupt is serviced, we take all characters that
have been placed in the buffer. The driver source code (as well as the other
routines discussed in this section) is available electronically; see
"Availability" on page 3.
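The buffering half of such a driver is commonly built as a ring buffer, sketched below under that assumption. This is not the driver from the listing: the UART and interrupt-controller programming is omitted entirely, and the names rx_put and rx_drain and the 256-byte size are invented for illustration.

```c
#define RX_SIZE 256u   /* power of two, so indexes wrap with a mask */

static volatile unsigned char rx_buf[RX_SIZE];
static volatile unsigned rx_head;   /* next slot the handler writes */
static volatile unsigned rx_tail;   /* next slot the program reads */

/* Called from the (hypothetical) receive interrupt handler for each
 * character. Returns 1 if stored, 0 if the buffer was full. */
int rx_put(unsigned char c)
{
    unsigned next = (rx_head + 1u) & (RX_SIZE - 1u);

    if (next == rx_tail)
        return 0;               /* full: the character is dropped */
    rx_buf[rx_head] = c;
    rx_head = next;
    return 1;
}

/* Called from the main program. Copies up to max buffered characters
 * into out, oldest first, and returns how many were copied. */
int rx_drain(unsigned char *out, int max)
{
    int n = 0;

    while (n < max && rx_tail != rx_head) {
        out[n++] = rx_buf[rx_tail];
        rx_tail = (rx_tail + 1u) & (RX_SIZE - 1u);
    }
    return n;
}
```

Because the handler only advances the head and the reader only advances the tail, the two sides do not need to lock each other out on a single processor.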
Unfortunately, there are ramifications of this solution. Not only do we need
the driver, but we need to modify every I/O routine invocation which could be
going through the RS-232 port. This means we need to create our own versions
of routines such as open(), close(), read(), and write(). Furthermore, some I/O
routines may have data go over the RS-232 port at one invocation, and data for
the disk at another. For example, the subroutine write_files() in Figure 3
takes as input a file handle. This subroutine has no way of knowing whether
the file handle relates to a disk file or an RS-232 connection. Thus, it must
be able to handle both kinds.
Figure 3: Writing files using a specified file handle

 write_files (fp, buffer, bytes)
 int fp;
 char *buffer;
 int bytes;
 {
     if (bytes > 0) write (fp, buffer, bytes);
 }

So the first step was to change all I/O function calls to routines of our own
making. We did this using #ifdefs, as shown in Figure 4, then changing the
appropriate calls to WRITE_TTY. Then we needed to write the open_port(),
read_port(), write_port(), and close_port() routines that would keep track of
which file handles were open to RS-232 files. A simple test could allow those
routines to use the built-in RTL disk-file routines or our own RS-232 driver
routines.
Figure 4: Mapping WRITE_TTY to either DOS or UNIX I/O calls

 #ifdef DOS
 # define WRITE_TTY write_port
 #else /* unix */
 # define WRITE_TTY write
 #endif /* DOS */
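
The "simple test" inside write_port() might be arranged as in this sketch. Everything except the name write_port is an assumption: the handle table, its size, mark_port(), and serial_send(), which here only records bytes in memory in place of the real RS-232 driver. The unistd.h header is the POSIX home of write(); a DOS runtime library may declare it elsewhere.

```c
#include <string.h>
#include <unistd.h>   /* write(); assumed POSIX-style header */

#define MAX_HANDLES 64

static unsigned char is_serial[MAX_HANDLES]; /* 1 if handle is an RS-232 port */
static char serial_log[256];                 /* stand-in for the wire */
static int  serial_len;

/* Stand-in for the real driver's transmit routine. */
static int serial_send(const char *buf, int n)
{
    int room = (int)sizeof serial_log - serial_len;

    if (n > room)
        n = room;
    memcpy(serial_log + serial_len, buf, (size_t)n);
    serial_len += n;
    return n;
}

/* Record whether a handle refers to the RS-232 port (open_port() and
 * close_port() would call this). */
void mark_port(int fd, int on)
{
    if (fd >= 0 && fd < MAX_HANDLES)
        is_serial[fd] = (unsigned char)(on != 0);
}

/* Route the write to our driver or to the RTL, depending on the handle. */
int write_port(int fd, const char *buf, int n)
{
    if (fd >= 0 && fd < MAX_HANDLES && is_serial[fd])
        return serial_send(buf, n);
    return (int)write(fd, buf, (size_t)n);
}
```

Callers such as write_files() stay unchanged apart from the macro: they pass a handle and never need to know which kind it is.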

There was one last hoop we had to jump through before this scheme was
complete. The application we were porting used the runtime library routine
dup2(), which reassigns an open file handle to another. So we needed to write
our own version, dup2_port(). However, the UNIX version of this routine
maintains the reassignment even when child processes have been spawned. To do
this, we
needed to devise a global data structure that maintained the list of file
handles that had been reassigned, and keep track of those that ultimately were
linked to the RS-232 port.
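
A minimal sketch of that bookkeeping follows. Only the name dup2_port comes from the text; the table layout, its size, and the helpers mark_serial() and handle_is_serial() are assumptions, and the handling of spawned child processes is not shown.

```c
#include <unistd.h>   /* dup2(); assumed POSIX-style header */

#define MAX_HANDLES 64

/* Global record of which handles ultimately lead to the RS-232 port. */
static unsigned char is_serial[MAX_HANDLES];

static int in_range(int fd)
{
    return fd >= 0 && fd < MAX_HANDLES;
}

void mark_serial(int fd)
{
    if (in_range(fd))
        is_serial[fd] = 1;
}

int handle_is_serial(int fd)
{
    return in_range(fd) && is_serial[fd];
}

/* Make newfd refer to whatever oldfd refers to. Serial handles are only
 * re-labeled in the table; disk handles go through the RTL's dup2(). */
int dup2_port(int oldfd, int newfd)
{
    if (!in_range(newfd))
        return -1;
    if (handle_is_serial(oldfd)) {
        is_serial[newfd] = 1;   /* newfd is now an alias for the port */
        return newfd;
    }
    is_serial[newfd] = 0;       /* newfd no longer points at the port */
    return dup2(oldfd, newfd);
}
```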
-- D.G.


































































November, 1991
 MONITORING DISTRIBUTED PRINTERS UNDER NOVELL NETWARE


Let your fingers do the walking




V. James Krammes


Jim is a senior information systems specialist for Midland Mutual Life
Insurance. He can be reached there at 250 East Broad Street, Mail Stop 101,
Columbus, OH 43215 or on CompuServe, 75300,1663.


The power and flexibility of printing services under Novell NetWare 3.x can
create headaches for LAN administrators and operations staff used to a
centralized console facility. Print servers can run on the file servers, or on
dedicated computers attached to the network. Printers can be attached directly
to the print server or can reside on any DOS workstation attached to the
network -- or the Internet.
This article describes a program that allows an operator to monitor multiple,
possibly remote printers on a single screen.


Printing Under Novell NetWare


Novell NetWare/386 (later called NetWare 3.0) included (among other
improvements) the ability to print to printers attached to DOS workstations
anywhere on the network. Previously, under Advanced NetWare, Versions 2.1x,
printers had to be attached directly to the file server. This meant that,
unless you had a third-party print spooling system, it was fairly easy for one
person to physically monitor the printers. Now, however, printers can easily
be spread over as far a geographic area as the network itself occupies.
The LAN administrator defines a print server on a specific file server,
configuring queues (identified by name), and printers (identified by number).
Printers can be defined as either local or remote, and have one or more queues
attached to them. Five local and up to 16 total printers are supported per
print server.
The print server software can then be loaded to run on the file server
(PSERVER.NLM), or can be run from a DOS workstation logged into the file
server (PSERVER.EXE). Printers defined as local must be directly connected to
the computer running the print server software. Remote printers, however, are
dynamic. The user controlling a remote printer runs a TSR program
(RPRINTER.EXE) identifying which print server and printer number the
workstation controls. A workstation can control multiple remote printers, but
each printer requires its own copy of RPRINTER.


Defining the Problem, Designing the Solution


I wanted a tool that would enable our operations staff to tell with a single
glance whether printing operations on the monitored printers were proceeding
normally. In addition, I wanted them to be able to tell how far large print
jobs (in this case, bimonthly compensation statements) were from completion.
Some reports have to be released manually by the operators, so this would cut
down on the amount of idle time between print jobs. The specific information
required was:
Printer status (out-of-paper, jammed, offline, and so on)
Information about the active print job (number of copies, copies completed,
size of each, and percent complete with this copy)
In addition, some sort of visual indicator of printer trouble was desired.


What Tools Does Novell Provide?


Novell provides PCONSOLE.EXE, a program used both to create the print server
and to monitor queues and printers. In addition, the PSERVER program displays
summary information about its printers on the console of the machine it is
running on.
The problem with PCONSOLE is that although the printer status information is
provided, the operator must navigate several layers of menus to check on
different printers. This is even more difficult if the printers reside on
different file servers. PCONSOLE is more than sufficient if you only need to
monitor a single printer. It does not allow you to monitor several printers on
a single screen, however.


The NetWare C Interface for DOS


Novell does provide a means to gather the desired information. It's called the
NetWare C Interface for DOS and provides 400 C procedures broken into 20
categories, including -- new to version 1.2 -- print server services. These
services allow a programmer to perform most PCONSOLE functions
programmatically.
For the calls to be effective, you must establish a connection with, and
identify yourself to, the desired print server. Print servers communicate with
clients via SPX, Novell's version of Xerox's SPP protocol. A user may only be
logged into a single print server at a time.


Key Design Decisions


Because of our hardware standards, all DOS workstations have a VGA color
monitor. This allows me to handle interface issues without having to allow for
different monitors. Interface procedures can safely be VGA-specific (see
Listing One, intrface.c, page 100). Listing Two (page 100) is intrface.h, the
include file.
Our network standards require all print servers to run as NLMs and to be named
the same as the file server they run on. I can therefore treat the file server
and print server as a single entity. I can also assume that a user will
request information from no more than eight print servers at a time, because
the NetWare workstation shells (NETx.COM, EMSNETx.EXE, XMSNETx.EXE) allow
connection to only eight file servers at a time.
For simplicity's sake, the print server names and printer numbers will be read
from a file. Entries in the file are in the format <print-server-name> /
<printer-number>. The file can easily be edited with any ASCII editor.

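As a minimal sketch of how one entry in that format might be split apart, consider the routine below. The name ParseConfigLine( ) is hypothetical and is not part of PMON; the 47-character limit is an assumption taken from the 48-byte Name field in Listing Three. Unlike the strtok( ) approach in Listing Four, this version trims blanks around the slash.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of PMON): split "SERVER / PRINTER#" into
   its two parts. Returns 1 on success, 0 if the line is blank, a comment
   (leading ';'), or malformed. */
int ParseConfigLine(const char *line, char *server, size_t serverSize,
                    int *printer)
{
    const char *slash, *end, *p;
    size_t len;

    while (isspace((unsigned char)*line))    /* skip leading blanks */
        line++;
    if (*line == '\0' || *line == ';')       /* blank line or comment */
        return 0;

    slash = strchr(line, '/');
    if (slash == NULL)
        return 0;

    end = slash;                             /* trim blanks before the slash */
    while (end > line && isspace((unsigned char)end[-1]))
        end--;
    len = (size_t)(end - line);
    if (len == 0 || len >= serverSize)
        return 0;
    memcpy(server, line, len);
    server[len] = '\0';

    p = slash + 1;                           /* skip blanks after the slash */
    while (isspace((unsigned char)*p))
        p++;
    if (!isdigit((unsigned char)*p))
        return 0;
    *printer = atoi(p);
    return 1;
}
```

Called with a 48-byte buffer, the routine accepts both "FS1/3" and "FS2 / 12" and rejects comment lines, blank lines, and entries whose printer field does not start with a digit.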


Supporting Data Structures


Information about print servers is kept in a structure called (appropriately
enough) PrintServer (see Listing Three, pmon.h, page 100). (Several of the
field types are defined in the Novell header file "nit.h." WORD is defined as
unsigned int, and BYTE is unsigned char.) The fields in this structure are
shown in Table 1.
Table 1: Fields for PrintServer structure

 Field          Description
 -------------------------------------------------------------------------
 ConnectionID   Nonzero indicates that this structure is valid.
 Name           Null-terminated name of the print server
 SPXConnection  SPX connection number used to communicate with this print
                server
 Error          Nonzero indicates that one of the Print Server Services
                calls failed when attempting to attach to or login to the
                print server.
 Printers       Head of a linked list of printer information structures

Information about individual printers is stored in a structure called Printer
(see Table 2). The NetWare shell allows no more than eight concurrent
connections, so print server information will be stored in an eight-element
array.
Table 2: Fields for Printer structure

 Field        Description
 ----------------------------------------------------------------------
 Printer      Printer number
 Status       Nonzero indicates that one of the Print Server
              Services calls failed when attempting to get printer
              or job status information for this printer.
 Problem      1 = offline, 2 = out of paper, other value = no trouble
 HasJob       True indicates that a job is active on the printer.
 Copies       Number of copies requested
 CopiesDone   Number of copies completed
 JobSize      Size (in bytes) of one copy
 BytesDone    Number of bytes sent so far for this copy
 PercentDone  Percent of this copy that has been sent so far
 Next         Pointer to the next printer for this print server, or
              NULL



Connections: Preferred, Default, and Primary


The shell can communicate with up to eight different file servers. How, then,
does the shell determine at any one time which server to send packets to?
Packets are sent to file servers based on the classification of the connection
to the server.
Packets will be sent to the Preferred server, if a Preferred connection has
been set. If not, packets will go to the Default server, and if there is no
Default server, to the Primary server.
The Preferred server must be explicitly set by calling the
SetPreferredConnectionID( ) API routine.
The Default server is the server that the current default drive maps to, if
the drive is a network drive. For example, if the current drive is Z: and
drive Z: is mapped to file server FS1 directory SYS:PUBLIC, then FS1 is the
default server.
The primary server is used only if the current default drive is not a network
drive, and there is no preferred server. In this case, the primary server is
the server whose entry is the lowest numbered in the shell's tables (usually
the first server that the shell is connected to).
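The precedence just described can be modeled in a few lines. This is only a sketch of the rule, not the shell's own code: the ShellStateSketch structure and its field names are invented for illustration, and a value of 0 is assumed to mean "not set."

```c
/* Hypothetical model of the shell's choice of target server. The field
   names are illustrative and are not NetWare API names; 0 = "not set". */
typedef struct {
    unsigned preferred;    /* set explicitly by the program */
    unsigned defaultSrv;   /* server behind the current drive, if a network drive */
    unsigned primary;      /* lowest-numbered entry in the shell's tables */
} ShellStateSketch;

/* Preferred wins over Default, which wins over Primary. */
unsigned TargetConnection(const ShellStateSketch *s)
{
    if (s->preferred)
        return s->preferred;
    if (s->defaultSrv)
        return s->defaultSrv;
    return s->primary;
}
```

This is why PMON sets the Preferred connection before talking to each print server: it is the only one of the three classifications a program controls directly, and it overrides whatever drive happens to be current.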


The Code: PMON.C


The program PMON.C (Listing Four, page 100) first determines whether a
configuration filename was passed on the command line. If not, the default
"PMON.CFG" is used. Initialize( ) is then called; it prints a copyright banner
and initializes the array of PrintServer structures to 0. It then calls
OpenConfigFile( ), which tries to open the configuration file. If it can't, it
prints an error message and terminates. Initialize( ) then calls
BuildServerList( ) and closes the configuration file.
BuildServerList( ) reads each line of the configuration file, ignoring blank
lines and lines that begin with a semi-colon. It calls strtok( ) to return the
server name, which is checked for validity. Next, the API routine
GetConnectionID( ) is called to get the connection ID for this server. If the
user is not logged into the server (or the server doesn't exist), we print an
error message and exit. Then strtok( ) is called again to retrieve the printer
number from the line read. If strtok( ) returns NULL, or the first character
returned is not a digit, an error message is issued and the program exits.
Otherwise, AddPrinterToServer( ) is called. AddPrinterToServer( ) gets memory
for a Printer structure and links it into the list for the proper print
server.
After returning from Initialize( ), main( ) turns off the cursor and calls
ScreenSetup( ), which clears the screen with a Bright White on Blue attribute
and displays a copyright message, a header for the printer information, and an
exit message. main( ) then loops until a key is pressed, calling MainLoop( )
and then waiting for half a second before calling MainLoop( ) again. If a key
is pressed, the loop exits, the keyboard is flushed, the cursor is turned on
again and placed in the lower-right corner of the screen, and the program
exits to DOS.
MainLoop( ) checks each of the PrintServer array elements to see if it is
valid, then calls LoginToServer( ) and walks the chain of printers for that
server calling GetPrinterInformation( ) and DisplayPrinterInformation( ).
MainLoop( ) then logs out of the print server by calling DetachFromServer( ).
LoginToServer( ) first verifies that the entry is valid. Then
SetPreferredConnectionID( ) is called to make sure that we are communicating
with the correct file server. PSAttachToPrintServer( ) is called and establishes
an SPX connection with the print server. Finally, PSLoginToPrintServer( ) is
called to finish the login process.
GetPrinterInformation( ) is passed an SPX connection and a printer structure.
PSGetPrinterStatus is called, passing the SPX connection and printer number,
returning the Trouble indicator (into Problem) and the HasJob flag. If the
HasJob flag is true, PSGetPrintJobStatus is called, again passing the SPX
connection and printer number, returning all remaining fields in the Printer
structure except PercentDone, which is computed.
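The PercentDone computation is simple enough to show on its own. The function below is a sketch with a hypothetical name, PercentOfCopy( ); it guards against a zero or negative job size before dividing, which is slightly stricter than the equality test in Listing Four.

```c
/* Hypothetical helper: percentage of the current copy already sent.
   Returns 0 when either value is zero or negative, so the division
   is never attempted with a zero job size. */
float PercentOfCopy(long bytesDone, long jobSize)
{
    if (jobSize <= 0 || bytesDone <= 0)
        return 0.0f;
    return (float)bytesDone / (float)jobSize * 100.0f;
}
```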
DisplayPrinterInformation( ) calls sprintf( ) to format the information into
a string. Then the line on which this printer is to be displayed is cleared,
and PutStrNoAttr( ) is called to write the string to the VGA video buffer.
Finally, if the Problem flag is 1 (offline, usually indicating a jam), the
attribute of the line is changed to 0x9C -- blinking Red on Blue; if the
Problem is 2 (out-of-paper), the attribute is changed to 0x9E -- blinking
Yellow on Blue.
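The attribute values follow the standard color text-mode layout: bit 7 is the blink bit, bits 4 through 6 select the background color, and bits 0 through 3 select the foreground color. Each screen cell occupies two bytes, the character followed by its attribute, so an 80-column row takes 160 bytes. The helper names below are hypothetical and do not appear in the listings; they are only a sketch of how the constants are put together.

```c
/* Standard text-mode color numbers used by the values in this article. */
enum { SK_BLUE = 1, SK_LIGHTRED = 12, SK_YELLOW = 14, SK_WHITE = 15 };

/* Hypothetical helper: build an attribute byte from its three fields. */
unsigned char MakeAttr(int blink, int background, int foreground)
{
    return (unsigned char)((blink ? 0x80 : 0x00) |
                           ((background & 0x07) << 4) |
                           (foreground & 0x0F));
}

/* Hypothetical helper: byte offset of a character cell in an 80-column
   text page. The attribute byte sits at this offset plus one. */
unsigned CellOffset(unsigned row, unsigned col)
{
    return row * 160u + col * 2u;
}
```

With these definitions, MakeAttr(1, SK_BLUE, SK_LIGHTRED) yields 0x9C, MakeAttr(1, SK_BLUE, SK_YELLOW) yields 0x9E, and MakeAttr(0, SK_BLUE, SK_WHITE) yields the 0x1F used to clear the screen.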



Products Mentioned


Novell NetWare 3.11 Novell Inc. 112E 1700 South Provo, UT 84606 800-526-5463
$3,495
Novell NetWare C Interface -- DOS Novell Inc. Development Products Division
5918 West Courtyard Dr. Austin, TX 78730 800-733-9673 $295
Note that no error checking is done for a NULL return from calloc( ). The
Printer structure is only 25 bytes, and at most there could be 128 printers
(eight print servers times 16 printers per server), so the greatest amount of
memory requested would be 3200 bytes.
This brings to mind two questions: Why bother with the linked lists at all?
and Why not just allocate an array of 128 Printer entries and be done with it?
Well, there is a large amount of overhead required to log into a print server.
Furthermore, a user can be logged into only one print server at a time.
Therefore, this data structuring allows me to log into each print server once
(per loop in MainLoop( )), get the information for all desired printers on
this print server, and then go on to the next print server.
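For readers who would rather not skip the allocation check, the sketch below shows one way to append to such a list while reporting a failed calloc( ). The PrinterNode type and AppendPrinter( ) are hypothetical stand-ins for the Printer structure and AddPrinterToServer( ); only the fields needed for the illustration are included.

```c
#include <stdlib.h>

/* Cut-down, hypothetical stand-in for the Printer structure. */
typedef struct PrinterNode {
    unsigned char printer;          /* printer number */
    struct PrinterNode *next;       /* next printer on this print server */
} PrinterNode;

/* Append a printer to the end of the list headed by *head.
   Returns 1 on success, 0 if memory could not be allocated. */
int AppendPrinter(PrinterNode **head, unsigned char printer)
{
    PrinterNode *node = calloc(1, sizeof *node);

    if (node == NULL)
        return 0;
    node->printer = printer;
    while (*head != NULL)           /* walk to the last link */
        head = &(*head)->next;
    *head = node;
    return 1;
}
```

Walking a pointer to the link rather than a pointer to the node removes the special case for an empty list, at the cost of one extra level of indirection.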


Finishing Up


This program was a fairly simple effort, involving the more complicated areas
of NetWare programming (IPX/SPX and VAPs/NLMs) only indirectly. However, it
achieved its design goal of allowing the operators to tell at a glance whether
there were any problems with the printers they were monitoring.

_MONITORING DISTRIBUTED PRINTERS UNDER NOVELL NETWARE_
by V. James Krammes


[LISTING ONE]


#include <dos.h>
extern char far *screen;
void CursorOff()
{
 union REGS regs;
 regs.h.ah = 1;
 regs.h.ch = 0x20;
 regs.h.cl = 0;
 int86(0x10,&regs,&regs);
}

void CursorOn()
{
 union REGS regs;
 regs.h.ah = 1;
 regs.h.ch = 11;
 regs.h.cl = 12;
 int86(0x10,&regs,&regs);
}

void CursorAt(r,c)
 unsigned char r,c;
{
 union REGS regs;
 regs.h.ah = 02;
 regs.h.bh = 0;
 regs.h.dh = r;
 regs.h.dl = c;
 int86(0x10,&regs,&regs);
}

void _cls(attr)
 unsigned char attr;
{
 union REGS regs;

 regs.h.ah = 0x06; /* scroll window up */
 regs.h.al = 0; /* clear entire window */
 regs.h.bh = attr;
 regs.x.cx = 0; /* upper left = (0,0) */
 regs.h.dh = 23;
 regs.h.dl = 79;
 int86(0x10,&regs,&regs);
}

void clearline(row,attr)
 unsigned char row,attr;
{
 union REGS regs;
 regs.h.ah = 0x06; /* scroll window up */
 regs.h.al = 1; /* clear 1 line */
 regs.h.bh = attr;
 regs.h.ch = row; /* upper left = (row,0) */
 regs.h.cl = 0;
 regs.h.dh = row; /* lower right = (row,79) */
 regs.h.dl = 79;
 int86(0x10,&regs,&regs);
}

void PutStrNoAttr(row,col,str,len)
 unsigned char row,col;
 int len;
 char *str;
{
 register int offset = (row * 160) + (col << 1);
 while (len--) {
 *(screen + offset) = *str++;
 offset += 2;
 }
}

void putattr(row,col,attr,cnt)
 unsigned char row,col,attr;
 int cnt;
{
 register int offset = (row * 160) + (col << 1) + 1;
 while (cnt--) {
 *(screen + offset) = attr;
 offset += 2;
 }
}




[LISTING TWO]

void CursorOff(void);
void CursorOn(void);
void CursorAt(unsigned char,unsigned char);
void _cls(unsigned char);
void clearline(unsigned char,unsigned char);
void PutStrNoAttr(unsigned char,unsigned char,char *,int);
void putattr(unsigned char,unsigned char,unsigned char,int);






[LISTING THREE]

/* PMON.H - header file for PMON: printer monitor. */

/** define structures **/
typedef struct _Printer {
 BYTE Printer;
 int Status;
 BYTE Problem;
 BYTE HasJob;
 WORD Copies;
 WORD CopiesDone;
 long JobSize;
 long BytesDone;
 float PercentDone;
 struct _Printer *next;
} Printer;

typedef struct _PrintServer {
 WORD ConnectionID;
 char Name[48];
 WORD SPXConnection;
 int Error;
 Printer *Printers;
} PrintServer;





[LISTING FOUR]

#include <stdio.h>
#include <stdlib.h>
#include <dos.h>
#include <nit.h>
#include <npt.h>
#include <string.h>
#include <ctype.h> /* isdigit() */
#include <conio.h> /* kbhit(), getch() */
#include "pmon.h"
#include "intrface.h"

char ConfigFileName[128];
FILE *ConfigFile;

unsigned DelayValue = 500;

PrintServer PS[8];
Printer *P;

char far *screen;

void OpenConfigFile(void)
{
 ConfigFile = fopen(ConfigFileName,"r");
 if (!ConfigFile) {

 printf("Unable to open configuration file \"%s\"\n",ConfigFileName);
 exit(1);
 }
}

void strrep(char *s,char c1,char c2)
{
 while (*s) {
 if (*s == c1)
 *s = c2;
 s++;
 }
}

void AddPrinterToServer(int psnum,BYTE prnum)
{
 register Printer *p = PS[psnum].Printers;
 if (!p) {
 PS[psnum].Printers = calloc(1,sizeof(Printer));
 PS[psnum].Printers->Printer = prnum;
 } else {
 while (p->next)
 p = p->next;
 p->next = calloc(1,sizeof(Printer));
 p = p->next;
 p->Printer = prnum;
 }
}

void BuildServerList(void)
{
 char buf[256];
 char *ptr;
 int status;
 WORD ConnectionID;
 if (!fgets(buf,255,ConfigFile)) buf[0] = 0; /* empty file: skip the loop */
 while (strlen(buf)) {
 strrep(buf,'\n',0);
 if (buf[0] != ';' && strlen(buf)) {
 ptr = strtok(buf,"/");
 if (!ptr || strlen(ptr) > 47) {
 printf("Error: expected <printserver>/<printer#>\n");
 printf(" found \"%s\"\n",buf);
 exit(1);
 }
 status = GetConnectionID(ptr,&ConnectionID);
 if (status) {
 printf("Error: You are not logged in to server \"%s\"\n",ptr);
 exit(1);
 }
 if (!PS[ConnectionID-1].ConnectionID) {
 PS[ConnectionID-1].ConnectionID = ConnectionID;
 strcpy(PS[ConnectionID-1].Name,ptr);
 }
 ptr = strtok(NULL," \n\t;");
 if (!ptr || !isdigit(*ptr)) {
 printf("Error: expected <printserver>/<printer#>\n");
 printf(" found \"%s\"\n",buf);
 exit(1);

 }
 AddPrinterToServer(ConnectionID-1,atoi(ptr));
 }
 if (!fgets(buf,255,ConfigFile)) buf[0] = 0; /* end of file: stop the loop */
 }
}

void Initialize()
{
 register int i;
 printf("PMON 1.00 - (C) 1991 The Midland Mutual Life Insurance Company\n\n");
 screen = (char far *) MK_FP(0xB800,0x0000);
 for (i=0; i < 8; i++)
 memset((char *) &PS[i],0,sizeof(PrintServer));
 OpenConfigFile();
 BuildServerList();
 fclose(ConfigFile);
}

void ScreenSetup(void)
{
 static char *title =
 "(C) 1991 The Midland Mutual Life Insurance Company";
 static char *header =
 " Server /P# Status #Copies Size of 1 Done So Far Percent";
 static char *footer =
 "Press any Key to Exit";
 _cls(0x1F);
 PutStrNoAttr(0,(80-strlen(title)) >> 1,title,strlen(title));
 PutStrNoAttr(2,6,header,strlen(header));
 PutStrNoAttr(23,(80-strlen(footer)) >> 1,footer,strlen(footer));
}

void LoginToServer(int i)
{
 register int status;
 BYTE CAL;
 if (PS[i].ConnectionID) {
 SetPreferredConnectionID(PS[i].ConnectionID);
 status = PSAttachToPrintServer(PS[i].Name,&PS[i].SPXConnection);
 if (status)
 PS[i].Error = status;
 else {
 status = PSLoginToPrintServer(PS[i].SPXConnection,&CAL);
 if (status)
 PS[i].Error = status;
 }
 }
}

void DetachFromServer(int i)
{
 if (PS[i].ConnectionID)
 PSDetachFromPrintServer(PS[i].SPXConnection);
}

void GetPrinterInformation(WORD spxid,Printer *p)
{
 char sdummy[128];

 BYTE dummy;
 WORD wdummy;
 register int status;
 status = PSGetPrinterStatus(spxid,p->Printer,&dummy,&(p->Problem),
 &(p->HasJob),&dummy,&wdummy,sdummy,sdummy);
 p->Status = status;
 if (p->HasJob) {
 PSGetPrintJobStatus(spxid,p->Printer,sdummy,sdummy,
 &wdummy,sdummy,&(p->Copies),&(p->JobSize),
 &(p->CopiesDone),&(p->BytesDone),&wdummy,
 &dummy);
 if (p->JobSize == 0.0 || p->BytesDone == 0.0)
 p->PercentDone = 0.0;
 else
 p->PercentDone = (float) (p->BytesDone) / (float) (p->JobSize) * 100.0;
 }
}
char *TroubleDescription(BYTE code)
{
 static char *Desc[] = { " OK ",
 " JAMMED ",
 "No Paper" };
 if (code == 1)
 return Desc[1];
 else if (code == 2)
 return Desc[2];
 else
 return Desc[0];
}
char *CommaString(long num)
{
 static char buf2[15];
 char buf1[15];
 register int p1,p2;
 int positions = 0;
 memset(buf1,0,15);
 memset(buf2,0,15);
 sprintf(buf1,"%ld",num);
 p1 = strlen(buf1);
 p2 = (p1 > 9) ? p1 + 3 : (p1 > 6) ? p1 + 2 : (p1 > 3) ? p1 + 1 : p1;
 p1--;
 p2--;
 while (p1 >= 0) {
 buf2[p2--] = buf1[p1--];
 if (++positions % 3 == 0)
 buf2[p2--] = ',';
 }
 return buf2;
}

void DisplayPrinterInformation(PrintServer *ps,Printer *p,int row)
{
 char buf[80],s1[15],s2[15];
 if (row <= 21) {
 if (ps->Error)
 sprintf(buf,"%-10s/%02d Error #%d",ps->Name,
 p->Printer,ps->Error);
 else if (p->Status)
 sprintf(buf,"%-10s/%02d Error #%d",ps->Name,

 p->Printer,p->Status);
 else if (!p->HasJob)
 sprintf(buf,"%-10s/%02d %-8s",ps->Name,p->Printer,
 TroubleDescription(p->Problem));
 else {
 strcpy(s1,CommaString(p->JobSize));
 strcpy(s2,CommaString(p->BytesDone));
 sprintf(buf,"%-10s/%02d %-8s %02d/%02d %11s %11s %5.2f",
 ps->Name,p->Printer,TroubleDescription(p->Problem),
 p->CopiesDone,p->Copies,s1,s2,p->PercentDone);
 }
 clearline(row,0x1f);
 PutStrNoAttr(row,6,buf,strlen(buf));
 if (p->Problem == 1)
 putattr(row,0,0x9C,80);
 else if (p->Problem == 2)
 putattr(row,00,0x9E,80);
 }
}

void MainLoop(void)
{
 register Printer *p;
 register int i;
 int CurrentRow = 4;
 for (i=0; i<8; i++)
 if (PS[i].ConnectionID) {
 LoginToServer(i);
 p = PS[i].Printers;
 while (p) {
 GetPrinterInformation(PS[i].SPXConnection,p);
 DisplayPrinterInformation(&PS[i],p,CurrentRow++);
 p = p->next;
 }
 DetachFromServer(i);
 }
}

main(int argc,char *argv[])
{
 if (argc == 1)
 strcpy(ConfigFileName,"PMON.CFG");
 else if (argc == 2)
 strcpy(ConfigFileName,argv[1]);
 else {
 printf("Usage - PMON [<config-file>]\n");
 exit(1);
 }
 Initialize();
 CursorOff();
 ScreenSetup();
 while (!kbhit()) {
 MainLoop();
 delay(DelayValue);
 }
 CursorOn();
 if (!getch())
 getch();
 CursorAt(24,0);

 return 0;
}

November, 1991
C PROGRAMMING


D-Flat Window Classes


 This article contains the following executables: DFLAT9.ARC DF9TXT.ARC


Al Stevens


In past months, I discussed and published most of the D-Flat code that
underpins the design of window classes and their management and display. What
remain are code modules that implement each of the standard D-Flat classes in
the hierarchy. By studying these, you will see how window classes work. This
knowledge will be valuable when you begin to derive your own classes from the
ones that D-Flat provides. This column begins the discussion of specific
window classes by reviewing the classes that D-Flat provides, discussing how
the class hierarchy works, and describing the NORMAL class, which is the base
class from which all others derive.
Table 1 lists the standard D-Flat window classes. Figure 1 shows the windows
classes in their hierarchy. I discussed most of these classes in the June 1991
column.
Table 1: The Standard D-Flat window classes

 Class         Description
 ------------------------------------------------------------------------
 NORMAL        Base window for all window classes
 APPLICATION   Application window -- has the menu
 TEXTBOX       Contains text. Base window for listbox, editbox, and so on
 LISTBOX       Contains a list of text -- base window for menubar
 EDITBOX       Text that a user can edit
 MENUBAR       The application's menu bar
 POPDOWNMENU   Pop-down menu
 BUTTON        Command button in a dialog box
 DIALOG        Dialog box -- parent to editbox, button, listbox, and so on
 ERRORBOX      For displaying an error message
 MESSAGEBOX    For displaying a message
 HELPBOX       Help window
 TEXT          Static text on a dialog box
 RADIOBUTTON   Radio button on a dialog box
 CHECKBOX      Check box on a dialog box
 STATUSBAR     Status bar at the bottom of application window

Figure 1 shows how all window classes derive from the NORMAL base class. In
September, you learned how to add new window classes to the hierarchy by
adding entries to classes.h and config.c.


Window-Processing Module


Each window class needs a window processing module to which D-Flat will send
messages. If a class does not have its own window processing module, D-Flat
will send the messages to the processing module of the window's base class. A
window class's processing module takes the form shown in Example 1.
Example 1: A window class's processing module

 int ClassProc(WINDOW wnd, MESSAGE msg, PARAM p1, PARAM p2)
 {
     switch (msg) {
         /* ---- process the window's messages ---- */
         default:
             break;
     }
     return BaseWndProc(CLASS, wnd, msg, p1, p2);
 }

The ClassProc identifier is unique for each class, and the CLASS constant in
the last statement is the window's class name, for example TEXTBOX or MENUBAR.
D-Flat calls the window processing module to send the window a message. The
message might have one or two parameters that extend its meaning. For example,
the LEFT_BUTTON message occurs when the user presses the left button. The
message parameters contain the x and y coordinates where the mouse cursor was
positioned when the user pressed the button.
D-Flat decides which window gets a message based on the current operating
environment. Keyboard messages go to the window that has the focus. Mouse
messages go to the window where the mouse cursor is located at the time of the
message. Messages from other windows go to the windows to which they are
addressed. For example, a window might send a message to its parent window or
to one of its child windows.

A window processing module can ignore a message. In fact, most window
processing modules process only a small subset of the messages. By calling the
BaseWndProc module, the window processing module guarantees that its base
window class will process the messages it needs. For example, no window class
needs to concern itself with the details of moving itself around the screen.
The window processing module for the NORMAL class takes care of window
movements. Because the NORMAL class is the base for all other classes, the
other classes all share the window movement code that the NORMAL class
provides. On the other hand, most window classes have unique operations for
the PAINT message. Some of them will process the message entirely, some will
superimpose their PAINT operations on top of those of the base classes, and
some will ignore the message and allow the base class to handle the entire
process. In a few cases they will intercept and reject the message.
A window processing module can prevent a message from going to the base
window's processing module simply by returning without calling the BaseWndProc
function. Sometimes the window processing module wants the base class
processes to occur first, so it calls BaseWndProc before doing its own stuff
rather than after.
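The three strategies (intercept, run the base class first, or ignore and pass along) can be seen side by side in a small stand-alone model. Nothing here is D-Flat code: the message numbers, the counters, and the BaseProcSketch( ) and DerivedProcSketch( ) functions are all invented for illustration, and the real modules take a WINDOW and two PARAM arguments as in Example 1.

```c
/* Invented message numbers for the sketch. */
enum { SK_MSG_PAINT, SK_MSG_MOVE, SK_MSG_CLOSE };

static int baseCalls;       /* messages that reached the base class */
static int derivedPaints;   /* PAINT messages handled by the derived class */

/* Stand-in for the base class's window processing module. */
int BaseProcSketch(int msg)
{
    (void)msg;
    baseCalls++;
    return 1;
}

/* Stand-in for a derived class's window processing module. */
int DerivedProcSketch(int msg)
{
    switch (msg) {
    case SK_MSG_PAINT:
        derivedPaints++;
        return 1;                       /* intercept: the base never sees it */
    case SK_MSG_CLOSE: {
        int rtn = BaseProcSketch(msg);  /* base class processing first ... */
        /* ... then any class-specific cleanup would go here */
        return rtn;
    }
    default:
        break;                          /* ignore: fall through to the base */
    }
    return BaseProcSketch(msg);
}
```

Sending the sketch a PAINT, a MOVE, and a CLOSE leaves baseCalls at 2 and derivedPaints at 1: only the intercepted PAINT fails to reach the base class.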
You will remember from the discussion on the CreateWindow function that you
can add a window processing module to an instance of a window. This is not the
same as a derived window's processing module, although it has some of the same
features. An overriding window processing module for an existing class gets
first crack at the messages, employs the same strategies for intercepting,
ignoring, and augmenting the class's processing, and then calls the
DefaultWndProc function instead of the BaseWndProc function. The memopad.c
example application from last month created an application window that used
its own window processing module in addition to the default one that D-Flat
provides.


The Normal Class


You will learn about D-Flat windows from the top down, using the chart in
Figure 1 as a guide. This month we will begin with the NORMAL window class,
which contains the code common to all windows. The NormalProc window
processing module manages the following operations: Showing, hiding, and
maintaining the window in the order of focused windows; moving and resizing
windows; painting blank windows for classes that do not paint themselves;
displaying the window border and title bar; minimizing, maximizing, and
restoring the window; calling and managing the commands from the system menu;
and closing the window. The NormalProc function is included in Listing One,
page 136, normal.c. That source file also includes the code for deciding which
parts of which windows should be repainted when an overlapping window is
closed or moved out of the way, and which parts of a window should be
repainted when it comes to the front of the focus.
We will look at NormalProc first. It uses the format given in Example 1 for
window processing modules. A window processing module is essentially one big
switch statement with cases for each of the messages. We will discuss the
messages one at a time. Remember that by the time the NormalProc function
receives a message, it has passed up through every window processing module in
the class hierarchy, starting at the bottom. So, if a message gets sent to a
POPDOWNMENU window, for example, the window processing modules for the
POPDOWNMENU, LISTBOX, and TEXTBOX classes get it first in turn before the
NormalProc function sees it. Furthermore, if the instance of the window has a
window processing module, that one gets it first.
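The order in which the classes see a message can be modeled with a table that records each class's base. The table and names below are assumptions made for this sketch; D-Flat keeps its real class definitions in classes.h and config.c, which are not reproduced here.

```c
/* Invented class numbers covering only the classes named in the text. */
enum { SK_NORMAL, SK_TEXTBOX, SK_LISTBOX, SK_POPDOWNMENU, SK_CLASSCOUNT };
enum { SK_NO_BASE = -1 };

/* Base class of each class, indexed by class number. */
static const int BaseOfSketch[SK_CLASSCOUNT] = {
    SK_NO_BASE,     /* NORMAL has no base */
    SK_NORMAL,      /* TEXTBOX derives from NORMAL */
    SK_TEXTBOX,     /* LISTBOX derives from TEXTBOX */
    SK_LISTBOX      /* POPDOWNMENU derives from LISTBOX */
};

/* Fill path[] with the classes that see a message, in order, starting at
   the window's own class. Returns the number of entries written. */
int MessagePath(int cls, int path[], int maxLen)
{
    int n = 0;

    while (cls != SK_NO_BASE && n < maxLen) {
        path[n++] = cls;
        cls = BaseOfSketch[cls];
    }
    return n;
}
```

For a POPDOWNMENU window the path has four entries, POPDOWNMENU, LISTBOX, TEXTBOX, and NORMAL, which matches the order described above. The sketch assumes that no module along the way intercepts the message.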


The Messages


The CreateWindow function sends the CREATE_WINDOW message after the function
has built the basic window structure. NormalProc eventually gets the message
and adds the window to the two linked lists I described last month. Then, if
the mouse is not installed, NormalProc removes the window's scroll bar
attributes. There is no point in having scroll bars if you don't have a mouse.
Some windows have the SAVESELF attribute, which means that they will never be
overlapped by another window. These include pop-down menus, message boxes, and
modal dialog boxes. These windows do not require the overhead involved in
repainting the contents of the windows that they overlap because they will
never lose the focus as long as they exist. Instead, they simply save the
video memory that they will overwrite. NormalProc calls the GetVideoBuffer
function for such windows if they have the VISIBLE attribute when they are
created.
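The idea behind saving video memory can be sketched against an ordinary array that stands in for the screen. The functions below are hypothetical and are not the D-Flat GetVideoBuffer and RestoreVideoBuffer functions, whose signatures are not shown in this column; the sketch assumes an 80-column screen in which each cell is one 16-bit character and attribute pair.

```c
#include <stdlib.h>
#include <string.h>

#define SK_SCREEN_COLS 80   /* assumed 80-column text screen */

/* Hypothetical helper: copy the cells under a rectangle into a new
   buffer. Returns NULL if the allocation fails. */
unsigned short *SaveCells(const unsigned short *screen,
                          int left, int top, int width, int height)
{
    unsigned short *save =
        malloc((size_t)width * (size_t)height * sizeof *save);
    int row;

    if (save == NULL)
        return NULL;
    for (row = 0; row < height; row++)
        memcpy(save + (size_t)row * (size_t)width,
               screen + (size_t)(top + row) * SK_SCREEN_COLS + left,
               (size_t)width * sizeof *save);
    return save;
}

/* Hypothetical helper: put the saved cells back where they came from. */
void RestoreCells(unsigned short *screen, const unsigned short *save,
                  int left, int top, int width, int height)
{
    int row;

    for (row = 0; row < height; row++)
        memcpy(screen + (size_t)(top + row) * SK_SCREEN_COLS + left,
               save + (size_t)row * (size_t)width,
               (size_t)width * sizeof *save);
}
```

The trade is memory for speed: a window that saves what it covers can be removed with a block copy instead of a round of PAINT and BORDER messages to its neighbors.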
The SHOW_WINDOW message tells a window to paint itself in full. If the window has a
parent, and the parent is not visible, this message does nothing. Otherwise,
if the window has the SAVESELF attribute and its video save buffer has not
been built, the message calls the GetVideoBuffer function. The message then
sets the VISIBLE attribute for the window and sends PAINT and BORDER messages
to the window.
Take a moment and consider the operation just described. The window processing
module for a window class just sent a message to a window that is either of or
is derived from the class. Why couldn't the module simply avoid the
message-passing overhead and do whatever it does when it receives the message?
You will see a lot of this kind of activity in D-Flat and other message-based
systems. It is one reason why they are generally less efficient than other
programming models. The NormalProc function cannot simply pass control to its
own code to process the PAINT and BORDER messages. The window class is most
certainly derived -- directly or indirectly -- from the NORMAL class. The
derived class might -- probably will -- have its own processing for the PAINT
message, if not for the BORDER message.
After sending itself PAINT and BORDER messages, this window sends the
SHOW_WINDOW message to each of its child windows so that they will be
displayed as well.
The HIDE_WINDOW message processes only if the window has the VISIBLE
attribute. First the message clears the VISIBLE attribute from the window and
all child windows. Then it must redisplay the other windows that the window
covers. If the window has a video save buffer, the message calls the
RestoreVideoBuffer function. Otherwise, it calls the PaintOverLappers function
-- which we will discuss soon -- which removes the window by sending PAINT and
BORDER messages to whatever the window covered. Not quite that simple, but
conceptually correct.
The DISPLAY_HELP message is posted by a HELPBOX window when it is changing
help window displays. NormalProc calls the DisplayHelp function when it gets
this message.
The INSIDE_WINDOW message is sent to find out if the screen coordinates in the
parameters are inside the window.
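A test of this kind comes down to a point-in-rectangle comparison. The sketch below uses a hypothetical RectSketch structure rather than the D-Flat WINDOW type, and assumes that a window is described by its upper-left corner plus a width and height in character cells.

```c
/* Hypothetical description of a window's position and size. */
typedef struct {
    int left, top;        /* upper-left corner, in screen coordinates */
    int width, height;    /* size in character cells */
} RectSketch;

/* Nonzero if (x, y) falls inside the rectangle, edges included. */
int InsideRect(const RectSketch *r, int x, int y)
{
    return x >= r->left && x < r->left + r->width &&
           y >= r->top  && y < r->top + r->height;
}
```

For a rectangle at column 10, row 5 that is 20 cells wide and 8 high, the cells from (10, 5) through (29, 12) are inside; column 30 and column 9 are not.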
The KEYBOARD message processes the keystrokes that the derived classes do not
intercept. The message translates the F1 key into a COMMAND ID_HELP message.
If the window is being moved or resized, the KEYBOARD message processes the
arrow keys, the Enter key and the Esc key for these operations by translating
them into MOUSE_CURSOR and MOUSE_MOVED messages. If the keystroke is the F1
help key, the Alt+F6 window stepper key, the Alt+space bar system menu key, or
the Ctrl+F4 close window key, the program processes those keys here because
their functions are common to all windows. If a particular window class does
not use one or more of these operations, or uses them differently, the class's
window processing module will intercept and process or ignore the key.
All other keystrokes that make it as far as NormalProc get posted as messages
to the parent of the window. This procedure assures that system-wide
keystrokes, such as menu accelerator keys, get processed by the application
window, which is the parent of all other windows. The ADDSTATUS and
SHIFT_CHANGED messages are similarly passed by NormalProc to the parent window
of whatever window originally receives them. If that is not supposed to happen
for a particular derived window class, the window processing module of one of
the windows in the hierarchy will have intercepted and processed the message.
Take another moment to consider the difference between the hierarchy of window
classes and the parent-child relationship of instances of windows. These are
two different logical chains, and it is easy to confuse them. An instance of a
window is created, and that window's definition and behavior are functions of
its class and the base classes from which its class is derived. The same
window can be a child of another window of the same or a different class, and
it can have one or more child windows of its own, each of which may be of
classes that are the same as or different from each other and that of their
parent.
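The parent chain, as opposed to the class chain, can be modeled with nothing more than a parent pointer in each window instance. The structure and function below are invented for this sketch and are not D-Flat code; the handlesAccelerators flag is an assumption standing in for "this window processes system-wide keystrokes."

```c
#include <stddef.h>

/* Hypothetical window instance: the parent pointer links instances,
   and says nothing about which class a window belongs to. */
typedef struct WindowSketch {
    int handlesAccelerators;        /* true for the application window */
    struct WindowSketch *parent;    /* NULL for a window with no parent */
} WindowSketch;

/* Pass an unprocessed keystroke up the parent chain until some window
   takes it. Returns that window, or NULL if nobody does. */
const WindowSketch *RouteKey(const WindowSketch *wnd)
{
    while (wnd != NULL && !wnd->handlesAccelerators)
        wnd = wnd->parent;
    return wnd;
}
```

A keystroke that starts at a button inside a dialog box travels button, dialog box, application window, whatever classes those three windows happen to be.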
The PAINT message for the NORMAL window class simply calls ClearWindow to
paint a blank window. A NORMAL window has no data display in its client data
space.
The BORDER message calls functions to display the window's border and title
bar, and, if the window has a status bar, sends the status bar window a PAINT
message.
The COMMAND message includes a parameter that identifies the command that is
to be processed. As a general rule, command messages result when the user
selects a command from a menu or performs some activity on a dialog box.
D-Flat sends a COMMAND message to the current in-focus document window, dialog
box, or the window that is the parent of the menu. The COMMAND message's first
parameter contains the command's code. NormalProc manages the ID_HELP command
and the commands in Example 2, sent by the system menu.
Example 2: These commands, managed by NormalProc, are sent by the system menu.

 ID_SYSRESTORE   to restore a minimized or maximized window
 ID_SYSMOVE      to move a window
 ID_SYSSIZE      to resize a window
 ID_SYSMINIMIZE  to minimize a window
 ID_SYSMAXIMIZE  to maximize a window
 ID_SYSCLOSE     to close a window

A window receives the SETFOCUS message when it is to receive or relinquish the
user's input focus. For the window to receive the input focus, the message has
a true value in the first parameter. To relinquish the focus, the first
parameter is false.
Clearing the focus from a window consists of setting the global inFocus
variable, which is of type WINDOW, to NULL and sending a BORDER message to
the window. The current in-focus
window identifies itself to the user with a double-line border and a
highlighted title bar, so the border must be repainted when the window loses
the focus.
Setting the focus to a window is more complicated. The Redraw flag will
indicate if the window needs to be selectively repainted for those parts of it
that are overlapped by other windows. If the window is presently visible and
does not have the SAVESELF attribute, it might need to be redrawn. If the
window is the child of an APPLICATION window and has no children itself,
however, it will not need to be repainted selectively. Once that decision is
made, the program sends the SETFOCUS message with a false parameter to the
currently in-focus window to tell it to relinquish the focus. If the Redraw
indicator has been set, the program calls the PaintUnderLappers function to
paint those portions of the window that are overlapped by other windows. Then
it puts the window at the top of the in-focus linked list and puts its WINDOW
handle into the inFocus global variable. If the Redraw indicator is set, the
program sends the window the BORDER message to tell it to change its border to
the in-focus configuration just described. Otherwise, the SHOW_WINDOW message
will tell the window to completely repaint itself. If the window has no parent
or its parent is an application window or dialog box, the SHOW_WINDOW message
goes to the window itself. Otherwise, it goes to the window's parent window.
You might set the focus to a child window when all or part of its parent is
overlapped by other windows.
If the DOUBLE_CLICK message hits the window's control box, the program sends
the CLOSE_WINDOW message to the window. If the LEFT_BUTTON message hits the
window's control box, the program posts the LEFT_BUTTON message to the
window's parent. The LEFT_BUTTON can also hit the minimize, maximize, or
restore boxes, in which case the program sends the corresponding message to
the window. If the message hits the title bar, the program sets up a window
move operation by capturing the mouse and calling the dragborder function. If
the message hits the lower right corner of the window, the program sets up the
window resize operation the same way.
The MOUSE_MOVED message handles moving and sizing windows by calling the
dragborder and sizeborder functions each time the mouse movement occurs.
BUTTON_RELEASED terminates any window move or resize that is in effect.
The MAXIMIZE, MINIMIZE, RESTORE, MOVE, and SIZE messages come next. These
messages perform those operations on the window. The MOVE message sends
corresponding MOVE messages to all the window's child windows. If you resize a
window that has a maximized child window in it, the program sends a
corresponding SIZE message to the child.
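The way a MOVE ripples down to the children can be sketched as follows. The Win structure and the move_window function are hypothetical simplifications, and the sketch scans an array of all windows where the listing walks the focus list and sends each child a MOVE message. The point is that each child moves by the same offset as its parent, and the recursion carries that offset down to grandchildren.

```c
#include <stddef.h>

/* Hypothetical, simplified window: upper left corner plus a parent link. */
typedef struct Win {
    int lf, tp;
    struct Win *parent;
} Win;

/* Move w so that its upper left corner is at (new_lf, new_tp), then move
   every child of w by the same x and y difference. all[] holds every
   window in the system. */
void move_window(Win *all[], size_t count, Win *w, int new_lf, int new_tp)
{
    int xdif = new_lf - w->lf;
    int ydif = new_tp - w->tp;
    size_t i;

    if (xdif == 0 && ydif == 0)
        return;                     /* nothing to do */
    w->lf = new_lf;
    w->tp = new_tp;
    for (i = 0; i < count; i++)
        if (all[i]->parent == w)
            move_window(all, count, all[i],
                        all[i]->lf + xdif, all[i]->tp + ydif);
}
```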
The CLOSE_WINDOW message sends the HIDE_WINDOW message to the window and then
sends CLOSE_WINDOW messages to all the window's child windows. If the window
has the focus, the program calls the SetPrevFocus function to set the focus to
the one that had it just before this window got it. Then the program removes
the window structure from the window linked lists and frees the memory used by
the window structure and its video save buffer.


Other Functions


The normal.c source file includes functions to compute the space for D-Flat's
version of minimized window icons. When the user minimizes a window, D-Flat
simply resizes it to a tiny window and finds room for it somewhere in the
application window's client data space. The LowerRight function computes the
rectangle occupied by the first minimized window, which will display in the
lower right corner of the application window. The PositionIcon function
computes subsequent adjacent icon positions to the left and above the first
one.
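Here is a minimal sketch of that icon arithmetic. It leaves out the search for slots that are already occupied and simply computes where the nth icon would go. The Rect type, the icon_slot function, and the ICON_W and ICON_H values are assumptions made for the example; the real icon dimensions come from D-Flat's header files.

```c
/* Hypothetical inclusive rectangle in character cells. */
typedef struct { int lf, tp, rt, bt; } Rect;

/* Assumed icon size for this sketch. */
enum { ICON_W = 10, ICON_H = 3 };

/* Rectangle of the first icon, in the lower right corner of p. */
Rect lower_right(Rect p)
{
    Rect r;
    r.lf = p.rt - ICON_W;
    r.tp = p.bt - ICON_H;
    r.rt = r.lf + ICON_W - 1;
    r.bt = r.tp + ICON_H - 1;
    return r;
}

/* Rectangle of icon number n (0 is the first). Each step moves one icon
   width to the left; at the left edge the position wraps to the right
   edge of the next row up. If the rows run out, fall back to the lower
   right corner. */
Rect icon_slot(Rect p, int n)
{
    Rect r = lower_right(p);
    while (n-- > 0) {
        r.lf -= ICON_W;
        r.rt -= ICON_W;
        if (r.lf < p.lf + 1) {
            r.lf = p.rt - ICON_W;
            r.rt = r.lf + ICON_W - 1;
            r.tp -= ICON_H;
            r.bt -= ICON_H;
            if (r.tp < p.tp + 1)
                return lower_right(p);
        }
    }
    return r;
}
```

On an 80-by-25 application window, seven icons of this assumed size fit across the bottom row, so the eighth starts a new row directly above the first.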
The TerminateMoveSize, dragborder, and sizeborder functions support window
moving and sizing. When you move or size a window with the keyboard or mouse,
D-Flat draws the border of a dummy window frame around the original window
position. You interactively move or resize that dummy frame with the
dragborder and sizeborder functions. When you are done, the
TerminateMoveSize function gets rid of the dummy frame.
The next several functions manage the redisplay of document windows when there
are several on the screen and one of them gets hidden or brought to the top.
When a window becomes hidden, either because you move it to another position
or close it, the windows it overlaps need to be repainted. D-Flat begins at
the bottom of the in-focus list and sends each window a PAINT message. That
process would work on its own, but it would be time-consuming and visually
distracting if a lot of windows were involved.
Rather than completely repaint every overlapped window, NormalProc calls the
PaintOverLappers function to compute and repaint the overlapping rectangle
space within each overlapped window. The PAINT messages for each window class
and the BORDER message for all window classes acknowledge the first parameter
as a RECT structure that is the relative rectangle within themselves to be
repainted. PaintOverLappers stores the WINDOW handle of the overlapping window
in the HiddenWindow variable and calls PaintOverParents. This function calls
itself recursively for its parent and then calls PaintOver to repaint the
subject window and PaintOverChildren to repaint the window's child windows.
The recursive calls are to assure that the most distant ancestor of the window
and its children are dealt with first. The PaintOverChildren function
processes all the child windows for a window. It calls PaintOver to paint each
child, and then calls itself recursively in case the child window has child
windows. The PaintOver function computes the rectangle that is the
intersection of the one to paint and the one being hidden--the one whose
handle is in the HiddenWindow variable. Then it calls the PaintOverLap
function, which decides from the rectangle which of the component parts of the
window need to be painted.
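The rectangle arithmetic at the heart of PaintOver can be sketched like this. The Rect type and the three functions are hypothetical stand-ins for what subRectangle, ValidRect, and RelativeWindowRect do in the listing; their real definitions are elsewhere in D-Flat and may treat the no-overlap case differently.

```c
/* Hypothetical inclusive rectangle in screen coordinates. */
typedef struct { int lf, tp, rt, bt; } Rect;

static int imax(int a, int b) { return a > b ? a : b; }
static int imin(int a, int b) { return a < b ? a : b; }

/* Intersection of two rectangles. When they do not overlap, the result
   has rt < lf or bt < tp, which rect_valid rejects. */
Rect intersect(Rect a, Rect b)
{
    Rect r;
    r.lf = imax(a.lf, b.lf);
    r.tp = imax(a.tp, b.tp);
    r.rt = imin(a.rt, b.rt);
    r.bt = imin(a.bt, b.bt);
    return r;
}

/* True if r encloses at least one character cell. */
int rect_valid(Rect r)
{
    return r.lf <= r.rt && r.tp <= r.bt;
}

/* Convert screen rectangle r to coordinates relative to the upper left
   corner of window rectangle win, the form a PAINT or BORDER message
   would receive in its first parameter. */
Rect to_window_relative(Rect win, Rect r)
{
    r.lf -= win.lf;
    r.rt -= win.lf;
    r.tp -= win.tp;
    r.bt -= win.tp;
    return r;
}
```

Only the cells inside the intersection need repainting, which is why a window that loses a small overlapping neighbor does not have to redraw all of its text.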
The PaintUnderLappers function is the other side of the coin. The program
calls the function when a window gets the focus and is coming to the front of
the display. Rather than unnecessarily repainting the entire window,
PaintUnderLappers computes the rectangle that includes all overlapped parts of
the window. First, it locates the oldest ancestor of the window and works with
it by calling the PaintUnder function. This function looks at every other
window in the system. If a window is descended from the one being processed,
the function skips it. The descendents will be repainted as a result of the
repainting of their ancestor. If a window is an ancestor, it too is skipped.
This approach assures that only siblings and unrelated windows will be looked
at. Once the function has found such a window, it looks to see if the found
window is ahead of the subject window in the focus chain. If so, the function
saves the handle of the found window in the HiddenWindow variable and calls
the PaintOver function for the subject window. After PaintUnder has processed
all the other windows against the subject window this way, it calls itself
recursively for each of the subject window's child windows.
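The test that PaintUnder applies to each other window boils down to three questions, sketched below. The Win structure and the function names are hypothetical; the next pointer stands in for NextWindow and leads toward the window that most recently received the focus.

```c
#include <stddef.h>

/* Hypothetical, simplified window with a parent link and a focus-chain link. */
typedef struct Win {
    struct Win *parent;
    struct Win *next;    /* next window toward the front of the focus chain */
} Win;

/* True if anc appears somewhere above w in the parent chain. */
static int has_ancestor(const Win *w, const Win *anc)
{
    for (w = w->parent; w != NULL; w = w->parent)
        if (w == anc)
            return 1;
    return 0;
}

/* True if other comes after subject in the focus chain. */
static int is_ahead(const Win *subject, const Win *other)
{
    const Win *w;
    for (w = subject->next; w != NULL; w = w->next)
        if (w == other)
            return 1;
    return 0;
}

/* True if other is a window whose overlap with subject must be repainted:
   not the subject itself, not one of its descendants, not one of its
   ancestors, and ahead of it in the focus chain. */
int may_overlap(const Win *subject, const Win *other)
{
    if (other == subject)
        return 0;
    if (has_ancestor(other, subject))
        return 0;                       /* descendant: painted with subject */
    if (has_ancestor(subject, other))
        return 0;                       /* ancestor */
    return is_ahead(subject, other);
}
```

Every one of those questions is a walk down a linked list, and PaintUnder asks them for every window in the system and then again for each child.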
Considering the previous two paragraphs, do you wonder why event-driven,
message-based windowing systems are generally less snappy than some of their
forebears?


How to Get D-Flat Now



The D-Flat source code is on CompuServe in Library 0 of DDJ Forum and on M&T
Online. Its name is DFLATn.ARC, where the n is an integer that represents a
loosely-assigned version number. There is another file, named DFnTXT.ARC. It
includes a README file that describes the changes and how to build the
software. The file also contains the Help system database and the
documentation for the programmer's API. D-Flat compiles with Turbo C 2.0,
Borland C++ 2.0, Microsoft C 6.0, and Watcom C 8.0. There are makefiles for
the TC, MSC, and Watcom compilers. There is an example program, MEMOPAD, which
is a multiple-document notepad.
If you cannot get to either online service, send me a formatted diskette--360K
or 720K--and an addressed, stamped diskette mailer. Send it to me in care of
DDJ. I'll send you the latest copy of the library. The software is free, but
if you care to, stick a dollar bill in the mailer. I'll match the dollar and
give it to the Brevard County Food Bank. They take care of homeless and hungry
children. This "careware" program of mine has netted about $200 so far for the
Food Bank. They are grateful for your generosity. If you want to discuss
D-Flat with me, use CompuServe. My CompuServe ID is 71101,1262, and I monitor
DDJ Forum daily.
_C PROGRAMMING COLUMN_
by Al Stevens



[LISTING ONE]

/* ------------- normal.c ------------ */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <conio.h>
#include <dos.h>
#include "dflat.h"

#ifdef INCLUDE_MULTIDOCS
static void near PaintOverLappers(WINDOW wnd);
static void near PaintUnderLappers(WINDOW wnd);
#endif
static int InsideWindow(WINDOW, int, int);
#ifdef INCLUDE_SYSTEM_MENUS
static void TerminateMoveSize(void);
static void SaveBorder(RECT);
static void RestoreBorder(RECT);
static RECT PositionIcon(WINDOW);
static void near dragborder(WINDOW, int, int);
static void near sizeborder(WINDOW, int, int);
static int px = -1, py = -1;
static int diff;
static int conditioning;
static struct window dwnd = {DUMMY, NULL, NULL, NormalProc, {-1,-1,-1,-1}};
static int *Bsave;
static int Bht, Bwd;
int WindowMoving;
int WindowSizing;
#endif
/* -------- array of class definitions -------- */
CLASSDEFS classdefs[] = {
 #undef ClassDef
 #define ClassDef(c,b,p,a) {b,p,a},
 #include "classes.h"
};
WINDOW HiddenWindow;

int NormalProc(WINDOW wnd, MESSAGE msg, PARAM p1, PARAM p2)
{
 int mx = (int) p1 - GetLeft(wnd);
 int my = (int) p2 - GetTop(wnd);
 int DoneClosing = FALSE;
 switch (msg) {
 case CREATE_WINDOW:
 AppendBuiltWindow(wnd); /* add to the lists */
 AppendFocusWindow(wnd);
#ifdef INCLUDE_SCROLLBARS

 if (!SendMessage(NULL, MOUSE_INSTALLED, 0, 0))
 ClearAttribute(wnd, VSCROLLBAR | HSCROLLBAR);
#endif
 if (TestAttribute(wnd, SAVESELF) && isVisible(wnd))
 GetVideoBuffer(wnd);
 break;
 case SHOW_WINDOW:
 if ((GetParent(wnd) == NULL || isVisible(GetParent(wnd)))
#ifdef INCLUDE_SYSTEM_MENUS
 && !conditioning
#endif
 ) {
 WINDOW cwnd = Focus.FirstWindow;
 if (TestAttribute(wnd, SAVESELF) &&
 wnd->videosave == NULL)
 GetVideoBuffer(wnd);
 SetVisible(wnd);
 SendMessage(wnd, PAINT, 0, 0);
 SendMessage(wnd, BORDER, 0, 0);
 /* --- show the children of this window --- */
 while (cwnd != NULL) {
 if (GetParent(cwnd) == wnd &&
 cwnd->condition != ISCLOSING)
 SendMessage(cwnd, msg, p1, p2);
 cwnd = NextWindow(cwnd);
 }
 }
 break;
 case HIDE_WINDOW:
 if (isVisible(wnd)
#ifdef INCLUDE_SYSTEM_MENUS
 && !conditioning
#endif
 ) {
 WINDOW cwnd = Focus.LastWindow;
 /* --- hide the children of this window --- */
 while (cwnd != NULL) {
 if (GetParent(cwnd) == wnd)
 ClearVisible(cwnd);
 cwnd = PrevWindow(cwnd);
 }
 ClearVisible(wnd);
 /* --- paint what this window covered --- */
 if (wnd->videosave != NULL)
 RestoreVideoBuffer(wnd);
#ifdef INCLUDE_MULTIDOCS
 else
 PaintOverLappers(wnd);
#endif
 }
 break;
#ifdef INCLUDE_HELP
 case DISPLAY_HELP:
 DisplayHelp(wnd, (char *)p1);
 break;
#endif
 case INSIDE_WINDOW:
 return InsideWindow(wnd, (int) p1, (int) p2);
 case KEYBOARD:

#ifdef INCLUDE_SYSTEM_MENUS
 if (WindowMoving || WindowSizing) {
 /* -- move or size a window with keyboard -- */
 int x, y;
 x=WindowMoving?GetLeft(&dwnd):GetRight(&dwnd);
 y=WindowMoving?GetTop(&dwnd):GetBottom(&dwnd);
 switch ((int)p1) {
 case ESC:
 TerminateMoveSize();
 return TRUE;
 case UP:
 if (y)
 --y;
 break;
 case DN:
 if (y < SCREENHEIGHT-1)
 y++;
 break;
 case FWD:
 if (x < SCREENWIDTH-1)
 x++;
 break;
 case BS:
 if (x)
 --x;
 break;
 case '\r':
 SendMessage(wnd,BUTTON_RELEASED,x,y);
 default:
 return TRUE;
 }
 /* -- use the mouse functions to move/size - */
 SendMessage(wnd, MOUSE_CURSOR, x, y);
 SendMessage(wnd, MOUSE_MOVED, x, y);
 break;
 }
#endif
 switch ((int)p1) {
#ifdef INCLUDE_HELP
 case F1:
 SendMessage(wnd, COMMAND, ID_HELP, 0);
 return TRUE;
#endif
 case ALT_F6:
 SetNextFocus(inFocus);
 SkipSystemWindows(FALSE);
 return TRUE;
#ifdef INCLUDE_SYSTEM_MENUS
 case ' ':
 if ((int)p2 & ALTKEY)
 if (TestAttribute(wnd, HASTITLEBAR))
 if (TestAttribute(wnd, CONTROLBOX))
 BuildSystemMenu(wnd);
 return TRUE;
#endif
 case CTRL_F4:
 SendMessage(wnd, CLOSE_WINDOW, 0, 0);
 SkipSystemWindows(FALSE);
 return TRUE;

 default:
 break;
 }
 /* ------- fall through ------- */
 case ADDSTATUS:
 case SHIFT_CHANGED:
 if (GetParent(wnd) != NULL)
 PostMessage(GetParent(wnd), msg, p1, p2);
 break;
 case PAINT:
 if (isVisible(wnd))
 ClearWindow(wnd, (RECT *)p1, ' ');
 break;
 case BORDER:
 if (isVisible(wnd)) {
 if (TestAttribute(wnd, HASBORDER))
 RepaintBorder(wnd, (RECT *)p1);
 else if (TestAttribute(wnd, HASTITLEBAR))
 DisplayTitle(wnd, (RECT *)p1);
 if (wnd->StatusBar != NULL)
 SendMessage(wnd->StatusBar, PAINT, p1, 0);
 }
 break;
#ifdef INCLUDE_SYSTEM_MENUS
 case COMMAND:
 switch ((int)p1) {
#ifdef INCLUDE_HELP
 case ID_HELP:
 DisplayHelp(wnd,ClassNames[GetClass(wnd)]);
 break;
#endif
 case ID_SYSRESTORE:
 SendMessage(wnd, RESTORE, 0, 0);
 break;
 case ID_SYSMOVE:
 SendMessage(wnd, CAPTURE_MOUSE, TRUE, (PARAM) &dwnd);
 SendMessage(wnd, CAPTURE_KEYBOARD, TRUE, (PARAM) &dwnd);
 SendMessage(wnd, MOUSE_CURSOR, GetLeft(wnd), GetTop(wnd));
 WindowMoving = TRUE;
 dragborder(wnd, GetLeft(wnd), GetTop(wnd));
 break;
 case ID_SYSSIZE:
 SendMessage(wnd, CAPTURE_MOUSE, TRUE, (PARAM) &dwnd);
 SendMessage(wnd, CAPTURE_KEYBOARD, TRUE, (PARAM) &dwnd);
 SendMessage(wnd, MOUSE_CURSOR, GetRight(wnd), GetBottom(wnd));
 WindowSizing = TRUE;
 dragborder(wnd, GetLeft(wnd), GetTop(wnd));
 break;
 case ID_SYSMINIMIZE:
 SendMessage(wnd, MINIMIZE, 0, 0);
 break;
 case ID_SYSMAXIMIZE:
 SendMessage(wnd, MAXIMIZE, 0, 0);
 break;
 case ID_SYSCLOSE:
 SendMessage(wnd, CLOSE_WINDOW, 0, 0);
 break;
 default:
 break;

 }
 break;
#endif
 case SETFOCUS:
 if (p1 && inFocus != wnd) {
 WINDOW pwnd = GetParent(wnd);
 int Redraw = isVisible(wnd) &&
 !TestAttribute(wnd, SAVESELF);
 if (GetClass(pwnd) == APPLICATION) {
 WINDOW cwnd = Focus.FirstWindow;
 /* -- if no children, do not need selective
 redraw --- */
 while (cwnd != NULL) {
 if (GetParent(cwnd) == wnd)
 break;
 cwnd = NextWindow(cwnd);
 }
 if (cwnd == NULL)
 Redraw = FALSE;
 }
 /* ---- setting focus ------ */
 SendMessage(inFocus, SETFOCUS, FALSE, 0);

 if (Redraw)
 PaintUnderLappers(wnd);

 /* remove window from list */
 RemoveFocusWindow(wnd);
 /* move window to end of list */
 AppendFocusWindow(wnd);
 inFocus = wnd;

 if (Redraw)
 SendMessage(wnd, BORDER, 0, 0);
 else {
 if (pwnd == NULL ||
 GetClass(pwnd) == DIALOG ||
 DerivedClass(GetClass(pwnd)) ==
 DIALOG || GetClass(pwnd) ==
 APPLICATION)
 SendMessage(wnd, SHOW_WINDOW, 0, 0);
 else
 SendMessage(pwnd, SHOW_WINDOW, 0, 0);
 }
 }
 else if (!p1 && inFocus == wnd) {
 /* -------- clearing focus --------- */
 inFocus = NULL;
 SendMessage(wnd, BORDER, 0, 0);
 }
 break;
 case DOUBLE_CLICK:
#ifdef INCLUDE_SYSTEM_MENUS
 if (!WindowSizing && !WindowMoving)
#endif
 if (HitControlBox(wnd, mx, my))
 PostMessage(wnd, CLOSE_WINDOW, 0, 0);
 break;
 case LEFT_BUTTON:

#ifdef INCLUDE_SYSTEM_MENUS
 if (WindowSizing || WindowMoving)
 break;
#endif
 if (HitControlBox(wnd, mx, my)) {
 BuildSystemMenu(wnd);
 break;
 }
#ifdef INCLUDE_SYSTEM_MENUS
 if (my == 0 && mx > -1 && mx < WindowWidth(wnd)) {
 /* ---------- hit the top border -------- */
 if (TestAttribute(wnd, MINMAXBOX) &&
 TestAttribute(wnd, HASTITLEBAR)) {
 if (mx == WindowWidth(wnd)-2) {
 if (wnd->condition == ISRESTORED)
 /* --- hit the maximize box --- */
 SendMessage(wnd, MAXIMIZE, 0, 0);
 else
 /* --- hit the restore box --- */
 SendMessage(wnd, RESTORE, 0, 0);
 break;
 }
 if (mx == WindowWidth(wnd)-3) {
 /* --- hit the minimize box --- */
 if (wnd->condition != ISMINIMIZED)
 SendMessage(wnd, MINIMIZE, 0, 0);
 break;
 }
 }
 if (wnd->condition != ISMAXIMIZED &&
 TestAttribute(wnd, MOVEABLE)) {
 WindowMoving = TRUE;
 px = mx;
 py = my;
 diff = (int) mx;
 SendMessage(wnd, CAPTURE_MOUSE, TRUE,
 (PARAM) &dwnd);
 dragborder(wnd, GetLeft(wnd), GetTop(wnd));
 }
 break;
 }
 if (mx == WindowWidth(wnd)-1 &&
 my == WindowHeight(wnd)-1) {
 /* ------- hit the resize corner ------- */
 if (wnd->condition == ISMINIMIZED ||
 !TestAttribute(wnd, SIZEABLE))
 break;
 if (wnd->condition == ISMAXIMIZED) {
 if (TestAttribute(GetParent(wnd),HASBORDER))
 break;
 /* ----- resizing a maximized window over a
 borderless parent ----- */
 wnd = GetParent(wnd);
 }
 WindowSizing = TRUE;
 SendMessage(wnd, CAPTURE_MOUSE,
 TRUE, (PARAM) &dwnd);
 dragborder(wnd, GetLeft(wnd), GetTop(wnd));
 }

#endif
 break;
#ifdef INCLUDE_SYSTEM_MENUS
 case MOUSE_MOVED:
 if (WindowMoving) {
 int leftmost = 0, topmost = 0,
 bottommost = SCREENHEIGHT-2,
 rightmost = SCREENWIDTH-2;
 int x = (int) p1 - diff;
 int y = (int) p2;
 if (GetParent(wnd) != NULL &&
 !TestAttribute(wnd, NOCLIP)) {
 WINDOW wnd1 = GetParent(wnd);
 topmost = GetClientTop(wnd1);
 leftmost = GetClientLeft(wnd1);
 bottommost = GetClientBottom(wnd1);
 rightmost = GetClientRight(wnd1);
 }
 if (x < leftmost || x > rightmost ||
 y < topmost || y > bottommost) {
 x = max(x, leftmost);
 x = min(x, rightmost);
 y = max(y, topmost);
 y = min(y, bottommost);
 SendMessage(NULL,MOUSE_CURSOR,x+diff,y);
 }
 if (x != px || y != py) {
 px = x;
 py = y;
 dragborder(wnd, x, y);
 }
 return TRUE;
 }
 if (WindowSizing) {
 sizeborder(wnd, (int) p1, (int) p2);
 return TRUE;
 }
 break;
 case BUTTON_RELEASED:
 if (WindowMoving || WindowSizing) {
 if (WindowMoving)
 PostMessage(wnd,MOVE,dwnd.rc.lf,dwnd.rc.tp);
 else
 PostMessage(wnd,SIZE,dwnd.rc.rt,dwnd.rc.bt);
 TerminateMoveSize();
 }
 break;
 case MAXIMIZE:
 if (wnd->condition != ISMAXIMIZED) {
 RECT rc = {0, 0, 0, 0};
 RECT holdrc;
 holdrc = wnd->RestoredRC;
 rc.rt = SCREENWIDTH-1;
 rc.bt = SCREENHEIGHT-1;
 if (GetParent(wnd))
 rc = ClientRect(GetParent(wnd));
 wnd->condition = ISMAXIMIZED;
 SendMessage(wnd, HIDE_WINDOW, 0, 0);
 conditioning = TRUE;

 SendMessage(wnd, MOVE, RectLeft(rc), RectTop(rc));
 SendMessage(wnd, SIZE, RectRight(rc), RectBottom(rc));
 conditioning = FALSE;
 if (wnd->restored_attrib == 0)
 wnd->restored_attrib = wnd->attrib;
#ifdef INCLUDE_SHADOWS
 ClearAttribute(wnd, SHADOW);
#endif
 SendMessage(wnd, SHOW_WINDOW, 0, 0);
 wnd->RestoredRC = holdrc;
 }
 break;
 case MINIMIZE:
 if (wnd->condition != ISMINIMIZED) {
 RECT rc;
 RECT holdrc;

 holdrc = wnd->RestoredRC;
 rc = PositionIcon(wnd);
 wnd->condition = ISMINIMIZED;
 SendMessage(wnd, HIDE_WINDOW, 0, 0);
 conditioning = TRUE;
 SendMessage(wnd, MOVE, RectLeft(rc), RectTop(rc));
 SendMessage(wnd, SIZE, RectRight(rc), RectBottom(rc));
 SetPrevFocus(wnd);
 conditioning = FALSE;
 if (wnd->restored_attrib == 0)
 wnd->restored_attrib = wnd->attrib;
 ClearAttribute(wnd,
 SHADOW | SIZEABLE | HASMENUBAR |
 VSCROLLBAR | HSCROLLBAR);
 SendMessage(wnd, SHOW_WINDOW, 0, 0);
 wnd->RestoredRC = holdrc;
 }
 break;
 case RESTORE:
 if (wnd->condition != ISRESTORED) {
 RECT holdrc;
 holdrc = wnd->RestoredRC;
 wnd->condition = ISRESTORED;
 SendMessage(wnd, HIDE_WINDOW, 0, 0);
 wnd->attrib = wnd->restored_attrib;
 wnd->restored_attrib = 0;
 conditioning = TRUE;
 SendMessage(wnd, MOVE, wnd->RestoredRC.lf, wnd->RestoredRC.tp);
 wnd->RestoredRC = holdrc;
 SendMessage(wnd, SIZE, wnd->RestoredRC.rt, wnd->RestoredRC.bt);
 SendMessage(wnd, SETFOCUS, TRUE, 0);
 conditioning = FALSE;
 SendMessage(wnd, SHOW_WINDOW, 0, 0);
 }
 break;
 case MOVE: {
 WINDOW wnd1 = Focus.FirstWindow;
 int wasVisible = isVisible(wnd);
 int xdif = (int) p1 - wnd->rc.lf;
 int ydif = (int) p2 - wnd->rc.tp;

 if (xdif == 0 && ydif == 0)

 return FALSE;
 if (wasVisible)
 SendMessage(wnd, HIDE_WINDOW, 0, 0);
 wnd->rc.lf = (int) p1;
 wnd->rc.tp = (int) p2;
 wnd->rc.rt = GetLeft(wnd)+WindowWidth(wnd)-1;
 wnd->rc.bt = GetTop(wnd)+WindowHeight(wnd)-1;
 if (wnd->condition == ISRESTORED)
 wnd->RestoredRC = wnd->rc;
 while (wnd1 != NULL) {
 if (GetParent(wnd1) == wnd)
 SendMessage(wnd1, MOVE, wnd1->rc.lf+xdif, wnd1->rc.tp+ydif);
 wnd1 = NextWindow(wnd1);
 }
 if (wasVisible)
 SendMessage(wnd, SHOW_WINDOW, 0, 0);
 break;
 }
 case SIZE: {
 int wasVisible = isVisible(wnd);
 WINDOW wnd1 = Focus.FirstWindow;
 RECT rc;
 int xdif = (int) p1 - wnd->rc.rt;
 int ydif = (int) p2 - wnd->rc.bt;

 if (xdif == 0 && ydif == 0)
 return FALSE;
 if (wasVisible)
 SendMessage(wnd, HIDE_WINDOW, 0, 0);
 wnd->rc.rt = (int) p1;
 wnd->rc.bt = (int) p2;
 wnd->ht = GetBottom(wnd)-GetTop(wnd)+1;
 wnd->wd = GetRight(wnd)-GetLeft(wnd)+1;

 if (wnd->condition == ISRESTORED)
 wnd->RestoredRC = WindowRect(wnd);

 rc = ClientRect(wnd);
 while (wnd1 != NULL) {
 if (GetParent(wnd1) == wnd &&
 wnd1->condition == ISMAXIMIZED)
 SendMessage(wnd1, SIZE, RectRight(rc), RectBottom(rc));
 wnd1 = NextWindow(wnd1);
 }

 if (wasVisible)
 SendMessage(wnd, SHOW_WINDOW, 0, 0);
 break;
 }
#endif
 case CLOSE_WINDOW:
 wnd->condition = ISCLOSING;
 if (wnd->PrevMouse != NULL)
 SendMessage(wnd, RELEASE_MOUSE, 0, 0);
 if (wnd->PrevKeyboard != NULL)
 SendMessage(wnd, RELEASE_KEYBOARD, 0, 0);
 /* ----------- hide this window ------------ */
 SendMessage(wnd, HIDE_WINDOW, 0, 0);
 /* --- close the children of this window --- */

 while (!DoneClosing) {
 WINDOW wnd1 = Focus.LastWindow;
 DoneClosing = TRUE;
 while (wnd1 != NULL) {
 WINDOW prwnd = PrevWindow(wnd1);
 if (GetParent(wnd1) == wnd) {
 if (inFocus == wnd1) {
 RemoveFocusWindow(wnd);
 AppendFocusWindow(wnd);
 inFocus = wnd;
 }
 SendMessage(wnd1,CLOSE_WINDOW,0,0);
 DoneClosing = FALSE;
 break;
 }
 wnd1 = prwnd;
 }
 }
 /* --- change focus if this window had it -- */
 SetPrevFocus(wnd);
 /* ------- remove this window from the
 list of open windows ------------- */
 RemoveBuiltWindow(wnd);
 /* ------- remove this window from the
 list of in-focus windows ---------- */
 RemoveFocusWindow(wnd);
 /* -- free memory allocated to this window - */
 if (wnd->title != NULL)
 free(wnd->title);
 if (wnd->videosave != NULL)
 free(wnd->videosave);
 free(wnd);
 break;
 default:
 break;
 }
 return TRUE;
}
/* ---- compute lower right icon space in a rectangle ---- */
#ifdef INCLUDE_SYSTEM_MENUS
static RECT LowerRight(RECT prc)
{
 RECT rc;
 RectLeft(rc) = RectRight(prc) - ICONWIDTH;
 RectTop(rc) = RectBottom(prc) - ICONHEIGHT;
 RectRight(rc) = RectLeft(rc)+ICONWIDTH-1;
 RectBottom(rc) = RectTop(rc)+ICONHEIGHT-1;
 return rc;
}
/* ----- compute a position for a minimized window icon ---- */
static RECT PositionIcon(WINDOW wnd)
{
 RECT rc;
 RectLeft(rc) = SCREENWIDTH-ICONWIDTH;
 RectTop(rc) = SCREENHEIGHT-ICONHEIGHT;
 RectRight(rc) = SCREENWIDTH-1;
 RectBottom(rc) = SCREENHEIGHT-1;
 if (GetParent(wnd)) {
 WINDOW wnd1 = (WINDOW) -1;

 RECT prc;
 prc = WindowRect(GetParent(wnd));
 rc = LowerRight(prc);
 /* - search for icon available location - */
 while (wnd1 != NULL) {
 wnd1 = GetFirstChild(GetParent(wnd));
 while (wnd1 != NULL) {
 if (wnd1->condition == ISMINIMIZED) {
 RECT rc1;
 rc1 = WindowRect(wnd1);
 if (RectLeft(rc1) == RectLeft(rc) &&
 RectTop(rc1) == RectTop(rc)) {
 RectLeft(rc) -= ICONWIDTH;
 RectRight(rc) -= ICONWIDTH;
 if (RectLeft(rc) < RectLeft(prc)+1) {
 RectLeft(rc) =
 RectRight(prc)-ICONWIDTH;
 RectRight(rc) =
 RectLeft(rc)+ICONWIDTH-1;
 RectTop(rc) -= ICONHEIGHT;
 RectBottom(rc) -= ICONHEIGHT;
 if (RectTop(rc) < RectTop(prc)+1)
 return LowerRight(prc);
 }
 break;
 }
 }
 wnd1 = GetNextChild(GetParent(wnd), wnd1);
 }
 }
 }
 return rc;
}
/* ----- terminate the move or size operation ----- */
static void TerminateMoveSize(void)
{
 px = py = -1;
 diff = 0;
 SendMessage(&dwnd, RELEASE_MOUSE, TRUE, 0);
 SendMessage(&dwnd, RELEASE_KEYBOARD, TRUE, 0);
 RestoreBorder(dwnd.rc);
 WindowMoving = WindowSizing = FALSE;
}
/* ---- build a dummy window border for moving or sizing --- */
static void near dragborder(WINDOW wnd, int x, int y)
{
 RestoreBorder(dwnd.rc);
 /* ------- build the dummy window -------- */
 dwnd.rc.lf = x;
 dwnd.rc.tp = y;
 dwnd.rc.rt = dwnd.rc.lf+WindowWidth(wnd)-1;
 dwnd.rc.bt = dwnd.rc.tp+WindowHeight(wnd)-1;
 dwnd.ht = WindowHeight(wnd);
 dwnd.wd = WindowWidth(wnd);
 dwnd.parent = GetParent(wnd);
 dwnd.attrib = VISIBLE | HASBORDER | NOCLIP;
 InitWindowColors(&dwnd);
 SaveBorder(dwnd.rc);
 RepaintBorder(&dwnd, NULL);

}
/* ---- write the dummy window border for sizing ---- */
static void near sizeborder(WINDOW wnd, int rt, int bt)
{
 int leftmost = GetLeft(wnd)+10;
 int topmost = GetTop(wnd)+3;
 int bottommost = SCREENHEIGHT-1;
 int rightmost = SCREENWIDTH-1;
 if (GetParent(wnd)) {
 bottommost = min(bottommost, GetClientBottom(GetParent(wnd)));
 rightmost = min(rightmost, GetClientRight(GetParent(wnd)));
 }
 rt = min(rt, rightmost);
 bt = min(bt, bottommost);
 rt = max(rt, leftmost);
 bt = max(bt, topmost);
 SendMessage(NULL, MOUSE_CURSOR, rt, bt);

 if (rt != px || bt != py)
 RestoreBorder(dwnd.rc);

 /* ------- change the dummy window -------- */
 dwnd.ht = bt-dwnd.rc.tp+1;
 dwnd.wd = rt-dwnd.rc.lf+1;
 dwnd.rc.rt = rt;
 dwnd.rc.bt = bt;
 if (rt != px || bt != py) {
 px = rt;
 py = bt;
 SaveBorder(dwnd.rc);
 RepaintBorder(&dwnd, NULL);
 }
}
#endif
/* ----- adjust a rectangle to include the shadow ----- */
#ifdef INCLUDE_SHADOWS
static RECT adjShadow(WINDOW wnd)
{
 RECT rc;
 rc = wnd->rc;
 if (TestAttribute(wnd, SHADOW)) {
 if (RectRight(rc) < SCREENWIDTH-1)
 RectRight(rc)++;
 if (RectBottom(rc) < SCREENHEIGHT-1)
 RectBottom(rc)++;
 }
 return rc;
}
#endif
/* --- repaint a rectangular subsection of a window --- */
#ifdef INCLUDE_MULTIDOCS
static void near PaintOverLap(WINDOW wnd, RECT rc)
{
 int isBorder, isTitle, isData;
 isBorder = isTitle = FALSE;
 isData = TRUE;
 if (TestAttribute(wnd, HASBORDER)) {
 isBorder = RectLeft(rc) == 0 &&
 RectTop(rc) < WindowHeight(wnd);
 isBorder |= RectLeft(rc) < WindowWidth(wnd) &&
 RectRight(rc) >= WindowWidth(wnd)-1 &&
 RectTop(rc) < WindowHeight(wnd);
 isBorder |= RectTop(rc) == 0 &&
 RectLeft(rc) < WindowWidth(wnd);
 isBorder |= RectTop(rc) < WindowHeight(wnd) &&
 RectBottom(rc) >= WindowHeight(wnd)-1 &&
 RectLeft(rc) < WindowWidth(wnd);
 }
 else if (TestAttribute(wnd, HASTITLEBAR))
 isTitle = RectTop(rc) == 0 &&
 RectLeft(rc) > 0 &&
 RectLeft(rc)<WindowWidth(wnd)-BorderAdj(wnd);

 if (RectLeft(rc) >= WindowWidth(wnd)-BorderAdj(wnd))
 isData = FALSE;
 if (RectTop(rc) >= WindowHeight(wnd)-BottomBorderAdj(wnd))
 isData = FALSE;
 if (TestAttribute(wnd, HASBORDER)) {
 if (RectRight(rc) == 0)
 isData = FALSE;
 if (RectBottom(rc) == 0)
 isData = FALSE;
 }
#ifdef INCLUDE_SHADOWS
 if (TestAttribute(wnd, SHADOW))
 isBorder |= RectRight(rc) == WindowWidth(wnd) ||
 RectBottom(rc) == WindowHeight(wnd);
#endif
 if (isData)
 SendMessage(wnd, PAINT, (PARAM) &rc, 0);
 if (isBorder)
 SendMessage(wnd, BORDER, (PARAM) &rc, 0);
 else if (isTitle)
 DisplayTitle(wnd, &rc);
}
/* ------ paint the part of a window that is overlapped
 by another window that is being hidden ------- */
static void PaintOver(WINDOW wnd)
{
 RECT wrc, rc;
#ifdef INCLUDE_SHADOWS
 wrc = adjShadow(HiddenWindow);
 rc = adjShadow(wnd);
#else
 wrc = HiddenWindow->rc;
 rc = wnd->rc;
#endif
 rc = subRectangle(rc, wrc);
 if (ValidRect(rc))
 PaintOverLap(wnd, RelativeWindowRect(wnd, rc));
}
/* --- paint the overlapped parts of all children --- */
static void PaintOverChildren(WINDOW pwnd)
{
 WINDOW cwnd = GetFirstFocusChild(pwnd);
 while (cwnd != NULL) {
 if (cwnd != HiddenWindow) {
 PaintOver(cwnd);

 PaintOverChildren(cwnd);
 }
 cwnd = GetNextFocusChild(pwnd, cwnd);
 }
}
/* -- recursive overlapping paint of parents -- */
static void PaintOverParents(WINDOW wnd)
{
 WINDOW pwnd = GetParent(wnd);
 if (pwnd != NULL) {
 PaintOverParents(pwnd);
 PaintOver(pwnd);
 PaintOverChildren(pwnd);
 }
}
/* - paint the parts of all windows that a window is over - */
static void near PaintOverLappers(WINDOW wnd)
{
 HiddenWindow = wnd;
 PaintOverParents(wnd);
}
/* --- paint those parts of a window that are overlapped --- */
static void near PaintUnder(WINDOW wnd)
{
 WINDOW hwnd = Focus.FirstWindow;
 while (hwnd != NULL) {
 /* ---- don't bother testing self ----- */
 if (hwnd != wnd) {
 /* --- see if other window is descendent --- */
 WINDOW pwnd = GetParent(hwnd);
 while (pwnd != NULL) {
 if (pwnd == wnd)
 break;
 pwnd = GetParent(pwnd);
 }
 /* ----- don't test descendent overlaps ----- */
 if (pwnd == NULL) {
 /* -- see if other window is ancestor --- */
 pwnd = GetParent(wnd);
 while (pwnd != NULL) {
 if (pwnd == hwnd)
 break;
 pwnd = GetParent(pwnd);
 }
 /* --- don't test ancestor overlaps --- */
 if (pwnd == NULL) {
 /* ---- other window must be ahead in
 focus chain ----- */
 WINDOW fwnd = NextWindow(wnd);
 while (fwnd != NULL) {
 if (fwnd == hwnd)
 break;
 fwnd = NextWindow(fwnd);
 }
 if (fwnd != NULL) {
 HiddenWindow = hwnd;
 PaintOver(wnd);
 }
 }

 }
 }
 hwnd = NextWindow(hwnd);
 }
 /* --------- repaint all children of this window
 the same way ----------- */
 hwnd = Focus.FirstWindow;
 while (hwnd != NULL) {
 if (GetParent(hwnd) == wnd)
 PaintUnder(hwnd);
 hwnd = NextWindow(hwnd);
 }
}
/* paint the parts of a window that are under other windows */
static void near PaintUnderLappers(WINDOW wnd)
{
 WINDOW pwnd = wnd;
 /* find oldest ancestor younger than application window */
 while (pwnd != NULL && GetClass(pwnd) != APPLICATION) {
 if (TestAttribute(wnd, SAVESELF))
 break;
 wnd = pwnd;
 pwnd = GetParent(pwnd);
 }
 PaintUnder(wnd);
}
#endif
#ifdef INCLUDE_SYSTEM_MENUS
/* --- save video area to be used by dummy window border --- */
static void SaveBorder(RECT rc)
{
 Bht = RectBottom(rc) - RectTop(rc) + 1;
 Bwd = RectRight(rc) - RectLeft(rc) + 1;
 if ((Bsave = realloc(Bsave, (Bht + Bwd) * 4)) != NULL) {
 RECT lrc;
 int i;
 int *cp;

 lrc = rc;
 RectBottom(lrc) = RectTop(lrc);
 getvideo(lrc, Bsave);
 RectTop(lrc) = RectBottom(lrc) = RectBottom(rc);
 getvideo(lrc, Bsave + Bwd);
 cp = Bsave + Bwd * 2;
 for (i = 1; i < Bht-1; i++) {
 *cp++ = GetVideoChar(RectLeft(rc),RectTop(rc)+i);
 *cp++ = GetVideoChar(RectRight(rc),RectTop(rc)+i);
 }
 }
}
/* ---- restore video area used by dummy window border ---- */
static void RestoreBorder(RECT rc)
{
 if (Bsave != NULL) {
 RECT lrc;
 int i;
 int *cp;
 lrc = rc;
 RectBottom(lrc) = RectTop(lrc);

 storevideo(lrc, Bsave);
 RectTop(lrc) = RectBottom(lrc) = RectBottom(rc);
 storevideo(lrc, Bsave + Bwd);
 cp = Bsave + Bwd * 2;
 for (i = 1; i < Bht-1; i++) {
 PutVideoChar(RectLeft(rc),RectTop(rc)+i, *cp++);
 PutVideoChar(RectRight(rc),RectTop(rc)+i, *cp++);
 }
 free(Bsave);
 Bsave = NULL;
 }
}
#endif
/* ----- test if screen coordinates are in a window ---- */
static int InsideWindow(WINDOW wnd, int x, int y)
{
 RECT rc;
 rc = WindowRect(wnd);
 if (!TestAttribute(wnd, NOCLIP)) {
 WINDOW pwnd = GetParent(wnd);
 while (pwnd != NULL) {
 rc = subRectangle(rc, ClientRect(pwnd));
 pwnd = GetParent(pwnd);
 }
 }
 return InsideRect(x, y, rc);
}
/* ----- find window that screen coordinates are in --- */
WINDOW inWindow(int x, int y)
{
 WINDOW wnd = Focus.LastWindow;
 while (wnd != NULL) {
 if (SendMessage(wnd, INSIDE_WINDOW, x, y)) {
 WINDOW wnd1 = GetLastChild(wnd);
 while (wnd1 != NULL) {
 if (SendMessage(wnd1, INSIDE_WINDOW, x, y)) {
 if (isVisible(wnd)) {
 wnd = wnd1;
 break;
 }
 }
 wnd1 = GetPrevChild(wnd, wnd1);
 }
 break;
 }
 wnd = PrevWindow(wnd);
 }
 return wnd;
}













November, 1991
STRUCTURED PROGRAMMING


Waves in What?




Jeff Duntemann KG7JF


Toward the end of July, my oldest nephew Brian came visiting from Chicago, and
together we built a one-tube radio. This wasn't difficult; in fact, the
density of radio parts in my garage has gotten so high that radios form
spontaneously out there if you shake the place up a little. (This happened a
lot in California.) So Brian and I shook up a bin containing a 1H4G tube, a
toothpaste pump, some octal relay sockets, a couple of variable capacitors, a
50K pot, some wire, and several nine-volt batteries taped together, and a
radio happened.
This is heavy stuff to an eight-year-old, and I tried gamely to explain what
was going on. He understood how tubes light up (we turned out the garage
lights and watched the thin filament glow a mysterious orange in the dark) and
I think he understood that an antenna catches radio waves that are passing by
and feeds them into the grid of the tube. His understanding of the rest of it
(including regeneration and the nature of that cantankerous "tickler" coil)
will come in time. But while we tinkered at it, I kept expecting the one
question that I knew I simply could not answer, the question that had, 30
years ago, driven me nuts (as well as several fairly knowledgeable adults,
including the old man and my weird Uncle Louie, who knew everything) when I
first put my own one-tube radio together.
The question never came. To an eight-year-old, there is still magic, and
radios make just as much sense as the Teenage Mutant Ninja Turtles. Hence
Brian, confronted with the unmistakable reality of radio waves (in the
screechy form of KOY-AM, playing the golden hits of the '40s, '50s, '60s,
'70s, '80s, and '90s in my WW-II vintage headphones), never thought to ask,
"Waves in what?"
Waves in What?
Sheesh, boy, now there's a question.


To Everything Its Context


Sure. Think about it: Toss a rock in a pond and circles of waves will fan out
from the point of impact, eventually to lap at the opposite shore. They're
waves -- in water. Clap your hands and sound waves carry the rhythm to the
folks across the room. Those are waves in air. But radio waves can cross the
gulfs between galaxies, where there's as close to nothing as anywhere else you
could name. Waves in what? The only truthful way to answer that question is to
cop out and contend that radio waves aren't really waves at all, they're,
well, they're quantum phenomena, which is just God's way of telling us,
because I Say So.
Over the past year, an uncomfortable truth has been dawning on me, as I've
experimented with objects in different languages and different applications.
The truth is that we've been missing something essential in thinking about
objects, or, worse, pretending that that something isn't necessary and doesn't
exist. The something I speak of is the larger context within which objects are
used. In an admittedly loose analogy, if objects are waves, then their context
is what they, as waves, are made of.
We've been very quick to shout about how objects are case-hardened little
nuggets of functionality, totally encapsulated and independent of other
program elements. We've bragged about how the coupling between objects and
other program elements approaches zero. We've even been extrapolating from our
own hype, and claiming that objects will eventually be the software equivalent
of TTL integrated circuits (this is a Brad Cox notion to which I'll return
later on) and will be available off-the-shelf in hundreds of different
standard "packages" that anyone can buy and use. We've been generating all
this yahooha without thinking too hard about how objects interact with their
context, and how that context shapes and limits what may be done with its
objects.


The Language Context


In all but a few DOS OOP languages, an object's context is, in practice,
limited to the language that generates the objects. Can you stick a compiled
Turbo Pascal object on a disk and hand it to a friend who programs in Turbo
C++? It's supposed to be possible (with some restrictions) but hey; judging by
the hOOPla, you'd expect it to be easy.
The new release of JPI's TopSpeed Environment allows much greater sharing of
objects across languages. JPI has carefully defined a language-independent
calling convention, and all languages share a common runtime library. In
theory (and I haven't tested this rigorously), any compiled JPI object should
be usable from any JPI object-oriented language.
This tells us something about objects that should be obvious -- and yet how
soon we forget: An object is no less dependent on its language's runtime
library than any compiled subprogram. We can't even write an object to a disk
file without elaborate and (to my mind) somewhat shaky fooling-around, because
when the object goes to disk, its code doesn't go with it. The code stays in a
library module of some kind, and only instance data is written out to a
stream. Registration with a stream only allows the object to locate its code
in a code segment when the object is read back from disk to memory.
Encapsulation here is more a matter of calling rights than any sort of
physical bundling-up-in-a-ball, as too many of us have uncritically come to
assume.
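To make the point concrete, here is a minimal sketch in C (not Turbo Pascal,
and not how Turbo Vision's streams are actually implemented) of what
registration buys you. Every name in it (Shape, MethodTable, register_type,
store_shape, load_shape) is invented for illustration. Only a type ID and the
instance data reach the buffer; the method table stays behind in the program
and is reattached by ID on the way back in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Shape Shape;

/* The "code" half of an object: it lives in the program, never in the stream. */
typedef struct {
    int   type_id;                      /* ID written to the stream in place of code */
    long (*area)(const Shape *self);
} MethodTable;

struct Shape {
    const MethodTable *methods;         /* reattached on load, never stored */
    long width, height;                 /* instance data: the only part stored */
};

static long rect_area(const Shape *s)     { return s->width * s->height; }
static long triangle_area(const Shape *s) { return s->width * s->height / 2; }

static const MethodTable RectMethods     = { 1, rect_area };
static const MethodTable TriangleMethods = { 2, triangle_area };

/* Hypothetical registry of the types this program knows how to read back. */
#define MAX_TYPES 8
static const MethodTable *registry[MAX_TYPES];
static int registered = 0;

static void register_type(const MethodTable *m)
{
    assert(registered < MAX_TYPES);
    registry[registered++] = m;
}

/* Write the type ID and the instance data; return the byte count. */
static size_t store_shape(const Shape *s, unsigned char *buf)
{
    size_t n = 0;
    memcpy(buf + n, &s->methods->type_id, sizeof(int));  n += sizeof(int);
    memcpy(buf + n, &s->width,  sizeof(long));           n += sizeof(long);
    memcpy(buf + n, &s->height, sizeof(long));           n += sizeof(long);
    return n;
}

/* Read a shape back; return 0 if its type was never registered. */
static size_t load_shape(Shape *s, const unsigned char *buf)
{
    int id, i;
    size_t n = 0;
    memcpy(&id, buf + n, sizeof(int));  n += sizeof(int);
    for (i = 0; i < registered; i++) {
        if (registry[i]->type_id == id) {
            s->methods = registry[i];   /* locate the code by ID */
            memcpy(&s->width,  buf + n, sizeof(long));  n += sizeof(long);
            memcpy(&s->height, buf + n, sizeof(long));  n += sizeof(long);
            return n;
        }
    }
    return 0;
}
```

If the reading program never registered the type, load_shape returns 0: the
data is in the buffer, but the code that gives it meaning is not.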


The Hierarchy Context


If we as a community have misunderstood one element of object-oriented
technology more than any other, I would have to point to the object hierarchy
context as the culprit. This is the origin of that old objection of Scott
Guthery (DDJ, December, 1989) that even if you just want a banana, you get the
whole gorilla. And if you're dealing with a sizeable object hierarchy, he's
right -- you can't necessarily just pick one item off an object hierarchy tree
without the risk of bringing along a lot of unexpected baggage. Scott was
reminding us that encapsulation includes everything on a line between the root
and the particular leaf you instantiate. In other words, once you invoke
inheritance, no object is a banana.
Rather, I think it's fair to say that the object is the hierarchy, and that if
you link an object into your application, you link in much or most of its
hierarchy and all of the hierarchy's assumptions as well. Linkers can only get
so smart, and if you make heavy use of polymorphism and virtual methods,
little or nothing will be stripped out of the hierarchy at link time. This
will be true even if all you intend to use is one or two different classes
from the tree.
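A rough illustration in C, assuming a hand-built method table standing in for
what a compiler generates for virtual methods; the names View, ViewMethods,
ButtonTable, and draw_all are hypothetical. The table for the leaf type holds
the address of a routine it inherits from the root, so a linker that keeps the
table has to keep that routine as well, whether or not any call ever reaches
it.

```c
#include <assert.h>
#include <stddef.h>

typedef struct View View;

/* One slot per virtual method; every slot holds a real code address. */
typedef struct {
    int (*draw)(const View *self);          /* returns the number of cells drawn */
    int (*handle_key)(View *self, int key); /* returns 1 if the key was consumed */
} ViewMethods;

struct View {
    const ViewMethods *methods;
    int width, height;
};

/* Root of the hierarchy. */
static int view_draw(const View *v)        { return v->width * v->height; }
static int view_handle_key(View *v, int k) { (void)v; return k == 27; } /* Esc only */

static const ViewMethods ViewTable = { view_draw, view_handle_key };

/* A leaf that overrides draw and inherits handle_key from the root. */
static int button_draw(const View *v) { return v->width; } /* one row of text */

/* This table is what drags the root's code along with the leaf. */
static const ViewMethods ButtonTable = { button_draw, view_handle_key };

/* Polymorphic call: the caller sees only the table, never the concrete type. */
static int draw_all(const View *const views[], size_t count)
{
    size_t i;
    int cells = 0;
    for (i = 0; i < count; i++) {
        assert(views[i]->methods != NULL);
        cells += views[i]->methods->draw(views[i]);
    }
    return cells;
}
```

Because draw_all reaches every routine through a pointer, nothing in the
program text says which routines will actually run, and that is exactly the
information a linker would need in order to strip the rest.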
The logical extreme of hierarchy context is seen in Smalltalk, where there is
only one hierarchy, and everything is part of that hierarchy. In Smalltalk,
there is only one indivisible gorilla, and an individual class is nothing more
than the gorilla wearing a hand puppet. You can watch the puppet and ignore
the gorilla, but you must not forget that the gorilla is always there, and
that without him, the hand puppet is limp and useless.


Toward a Platform Context


Does it sound like I've become a little disenchanted with objects? I suppose I
have. The Object Wave is a little more than two years old now, and it came
about with a truckful of promises, few of which have been fulfilled. One
promise in particular attracted me, and I've come around to the bitter view
that we just can't get there from here.
That was the promise of standard, universally usable software mechanisms made
possible through object-oriented technology. I first encountered this promise
in Brad Cox's very good book, Object-Oriented Programming: an Evolutionary
Approach. The book was in one respect an apologia for Cox's own language,
Objective C, but it was also a call to produce what Cox calls (and has
trademarked as) Software ICs.
For the last 15 or 20 years engineers have been able to make use of a vast
array of digital logic blocks created as TTL integrated circuits. The logic
blocks are all different, but what remains universally standard is the way the
blocks interact. All chips use a standard power potential of five volts, and
all input and output pins have a standard voltage swing and current
source/sink ability called fan-in and fan-out. Several dozen companies make or
have made TTL ICs, and all of them may be used interchangeably, regardless of
their vendors.
Why can't we do this with objects? Simply put: There is no standard context.
The standard context of TTL ICs is simple and universal: the laws of
electrical physics, bounded by a spare handful of standard assumptions about
voltage and current values.
The best we can do so far is create standard object libraries for use with
particular compilers. Object Professional and Turbo Vision are good examples.
But hey, what's the essential difference between object libraries like that
and the garden-variety procedure and function libraries we've been using for
years? The answer: Not much.
There is promise for creating language-spanning object libraries for JPI's
TopSpeed system, but that's no consequence of OOP technology; you could do the
same with ordinary procedures and functions.
If there is a solution -- or at least a next step -- it will have to be the
broadening of object contexts to embrace the platform, regardless of language.
DOS as we know it is hopeless in that regard, because it can only reliably
treat a piece of code as an executable stored on disk. (DOS's sole motion
toward what I have in mind involves TSRs, which are thoroughly crippled by
DOS's careless internal architecture. A TSR object library, while
theoretically possible, has to jump through flaming hoops to keep itself from
crashing the system. Not cool.) The platform must be able to treat a binary
image containing both code and data (that is, an object) as a loadable library
that can be made safely available to all transient applications.
Surprise! We're halfway there, and what Microsoft does in the next two years
will dictate how close we will eventually come.


Halfway There



The underlying machinery for a language-independent, platform-wide object
context has been with us since the release of Windows 3.0. I'm talking about
Dynamic Link Libraries (DLLs), perhaps the least-appreciated feature Windows
brings us. DLLs are a little like units, and a little like TSRs. Like units,
they can contain both code and data, and they can have an initialization
section (but not an exit procedure). Like TSRs they can be loaded by Windows
and left in memory for the use of other programs. They occupy the same
ecological niche under Windows that TSRs occupy under DOS: that of resident
platform extension. Windows itself is composed of several DLLs that are loaded
by the Windows kernel when Windows takes control, so in creating a DLL you are
in a sense extending Windows.
What DLLs do not have is any high-level knowledge of object-oriented methods.
This is the missing half that must be added, and again, it's a conceptually
simple matter of defining some standards. There is a movement underway at
Microsoft to define some of these platform-wide, object-interface standards,
but one gets the impression that real technology is still years off. I suspect
it probably won't be incorporated into a Microsoft platform before the 32-bit
Windows descendant expected (by this columnist at least) no earlier than
September of 1993.
Once a tenable platform object context appears as Win32, a great deal more of
the promise of OOP should become real. We should be able to pass "canned"
objects from machine to machine on disk or over networks, and expect them to
work identically on all Win32-based machines, from any language that supports
the standard object messaging protocol.
That path won't get us to the non-Win32 machines, but shall we say this doesn't
distress me. At the rate big hardware companies like Apple and IBM continue to
slit their own throats, I anticipate that by the turn of the century the
Gruesome Twosome will be reduced to a couple of niche firms offering
special-purpose, high-end boxes and paying their bills by selling Pacific Rim
80686-based PC clones. Nope. Software rules the future, and you and I both
know who rules software, with no serious challenger in sight.


On to Turbo Vision


I said all that as my way of announcing that I am ceasing my search for
object-oriented truth and will instead settle for a good toolbox. Because
that's where I've found object technology to really shine: in the design of
software tools that may be easily extended and customized without gross
rewriting of source code.
Good examples are beginning to appear, now that developers have digested and
understand object technology. For the next several columns I'll be probing
Turbo Vision, which comes in the can with Turbo Pascal 6.0 and is probably the
most-owned (if not necessarily the most-used) object-oriented toolbox going
right now.
Turbo Vision has taken a lot of heat on the networks and in the user groups
since it appeared. It's quite literally unlike anything Borland has ever
released for Turbo Pascal, and it really does represent a new turn in
technology, not only for Turbo Pascal but for Pascal in general. A lot of
people hate it. Few people truly understand it. But I would like to plead for
everyone to see it on its own terms and give it a fair chance.


Meat and Bones


I'm not really stealing a phrase from Chapter 1 of the Turbo Vision Guide when
I say that Turbo Vision provides the bones of a windowing application. I wrote
some early parts of the Guide, and I like that way of putting it: Turbo Vision
allows you to inherit the bones of an application (things such as menus,
windows, edit fields, and so on) to which you add the meat, which are the
routines that allow the application to do your specific tasks.
It isn't quite accurate to call TV a user-interface toolbox. It's really a
boilerplate application, stripped of all application-specific functionality.
The most visible portions of TV are UI components, obviously -- but behind the
scenes are some other remarkable mechanisms as well. Turbo Vision contains a
very efficient event manager that lets you break away from the "pick a number
from the menu" kind of hierarchical control that leads to what I called The
Cuba Lake Effect a few columns back. All of this is built into a remarkable
object type called TApplication, which is the boilerplate application I spoke
of earlier.
From a height, this is how you use Turbo Vision: You create a child type of
TApplication. You override some of its existing methods and add some new
methods; add a few other objects, define some menus and dialogs, and hook the
various parts together with pointers. That's your application; and your main
program looks like this in almost every case:
BEGIN
 MyApplication.Init;
 MyApplication.Run;
 MyApplication.Done;
END.
The first line sets your application up. The second line runs it. The third
line reasserts system defaults, deallocates memory, and otherwise tidies up
whatever mess the application made.
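The shape of that three-line main program can be sketched in C. This is an
assumption-laden toy, not Turbo Vision: App, Event, the CM_ constants, and the
scripted event queue standing in for the keyboard and mouse are all made up
for the example. What it shows is the division of labor: the skeleton owns the
loop, and the descendant supplies only a handler.

```c
#include <assert.h>
#include <stddef.h>

enum { EV_NOTHING, EV_COMMAND };
enum { CM_QUIT = 1, CM_NEW = 2 };

typedef struct { int what; int command; } Event;

typedef struct App App;
struct App {
    void (*handle_event)(App *self, Event *ev); /* the hook a descendant replaces */
    const Event *script;    /* scripted input standing in for keyboard and mouse */
    size_t script_count;
    size_t next;
    int running;
    int windows_open;
};

static void clear_event(Event *ev) { ev->what = EV_NOTHING; }

/* Inherited behavior: the only command the bare skeleton understands is Quit. */
static void app_handle_event(App *self, Event *ev)
{
    if (ev->what == EV_COMMAND && ev->command == CM_QUIT) {
        self->running = 0;
        clear_event(ev);
    }
}

/* "Init": set the application up. */
static void app_init(App *self, const Event *script, size_t count)
{
    assert(script != NULL || count == 0);
    self->handle_event = app_handle_event;
    self->script = script;
    self->script_count = count;
    self->next = 0;
    self->running = 0;
    self->windows_open = 0;
}

/* Fetch the next event; when the script runs out, act as if the user quit. */
static Event get_event(App *self)
{
    Event quit = { EV_COMMAND, CM_QUIT };
    if (self->next < self->script_count)
        return self->script[self->next++];
    return quit;
}

/* "Run": the event loop belongs to the skeleton, not to your code. */
static void app_run(App *self)
{
    self->running = 1;
    while (self->running) {
        Event ev = get_event(self);
        self->handle_event(self, &ev);
    }
}

/* "Done": tidy up whatever the application left behind. */
static void app_done(App *self)
{
    self->windows_open = 0;
    self->script = NULL;
    self->script_count = 0;
}

/* The "meat": a descendant's handler that adds one command of its own. */
static void my_handle_event(App *self, Event *ev)
{
    app_handle_event(self, ev);          /* let the ancestor go first */
    if (ev->what == EV_COMMAND && ev->command == CM_NEW) {
        self->windows_open++;
        clear_event(ev);
    }
}
```

A caller would invoke app_init, point handle_event at my_handle_event, and
then call app_run and app_done in that order, which is the same
Init/Run/Done sequence as the Pascal main program.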


The Pain and the Gain


I won't be so bold as to say it's simple, nor that it's easy. Turbo Vision was
certainly the most difficult learning experience I ever had in Pascal, so if
you're having trouble with TV, don't kick yourself for it. Nonetheless, for
all the trouble I've had with it (and am still having with it!) I continue to
use it and like it more and more as I do.
I see Turbo Vision as something like a pair of expensive leather boots. They
hurt when you first put them on, but as you wear them two things happen: The
boots adapt to your feet, and your feet adapt to the boots. You need to use TV
intensely for a while to make it stop hurting -- because the hurt comes from
not truly understanding what the damned thing is up to, nor how to make it do
what you want. As you become familiar with Turbo Vision, however, the pain
goes away, not only because you understand how it works, but also because
understanding how it works allows you to bend it in your own directions to
suit your own needs. Over time, it becomes a far better fit -- and eventually,
you'll wonder how you ever did without it.
I've managed to distill some guidelines for working with Turbo Vision. These
are the "rules," in a sense, and if you're not willing to play by the rules
you'll be in a rut and wearing the tread off your tires in no time.
1. Understand pointers. Really. I mean, really really. If you aren't
comfortable with pointers, TV will be a brick wall ten miles high.
Polymorphism depends utterly on pointers to make it work, and TV makes the
most pervasive use of polymorphism that I have ever encountered.
2. Don't try to subset it. You can't just pull a menu procedure out of Turbo
Vision and use it apart from TApplication. TV isn't really a toolbox from
which you can pick one gizmo or the other. It's a boilerplate application, and
the explicit and implicit coupling among the components is very high. About
the only thing I would say is extractable from Turbo Vision is the TCollection
hierarchy, which is basically a linked-list manager that functions tolerably
well on its own.
3. Don't try to modify it. In other words, extend it but don't try to alter
the look or operation of the parts that are already there. The coupling among
the components is high, and this coupling is not simply a matter of procedure
calls or global variables. There are a multitude of very subtle assumptions
underlying Turbo Vision (most of them completely undocumented), and even
innocuous-appearing changes can have completely unexpected consequences in
what seem to be totally unrelated parts of the system.
4. Do things the Turbo Vision way. Whenever you can, let Turbo Vision carry
the ball its own way and in its own direction. All of us have our own ways of
thinking about program design, and what is easy to forget when using Turbo
Vision is that your program is already designed. The event-driven architecture
embodied in TApplication is complete and functional, and it will shape
everything else your program does from top to bottom. Don't fight it. The
thing was created to save you time, and if you persist in trying to twist it
in some direction that aligns with your own biases, you'll be wasting huge
amounts of time and energy.
An unpleasant chap on CompuServe referred to Turbo Vision as Nazi programming.
This is just another manifestation of Not Invented Here, and if he persists in
spending all his days creating his own personal event-driven environment, he's
welcome to it. I personally enjoy the freedom from having to solve such
problems myself.


The Learning Curve


Regardless of how willing you are to work on Turbo Vision's own terms, there
remains the question of how to learn it. Turbo Vision is hard to learn in part
because it's one big, heavily integrated mechanism and not a loose bin of
software odds and ends. It's tough to pull one element of TV up for
examination without having to understand seven hundred other things first.
This is what I call "looking for the front door;" it's the search for a
starting point on a sequential path to mastery of the product.
There's no easy front door to learning TV, and that sequential path is
inevitably going to be cluttered with forward references. As with anything
else, you'll learn best by doing. Here's my suggested strategy:
1. Start by reading the Turbo Vision tutorials in Part 1 of the Turbo Vision
Guide. Pull up the demonstration apps as you go, run them, and see if you can
make any sense at all of the code. (This sort of experience is cumulative.
Eventually the Aha! Insight! epiphanies will come so quickly your head will
spin.) Much of it won't make sense, and a lot of what may seem to make sense
at first won't stick. Don't worry -- and don't get discouraged. Just keep
going.
2. Before you begin tinkering with TV itself, read up on and experiment with
the TCollection class and its children. These stand independent of TV, and can
be learned and used without any knowledge of TV.
3. Read the rest of TV Guide Part 2, which is a detailed description of Turbo
Vision. The first time through, this will be rough going. Again, bull through
it at least once, and twice if you have the intestinal fortitude.
4. Take one of the example apps and begin changing it, one thing at a time.
Start small. Change the wording on a window title. Add a dummy menu item. Add
a dummy command to the status line. Each time you make a change, crank your
brain wide open to the place in the big picture where your one small change
fits in. This is the stage where you have to try to pull all your previous
undigested knowledge together. If you have the leisure, I'd suggest spending
two or three full days doing nothing else; or failing that, a solid week of
evenings.
5. Specify a simple application of your own, and try to make it happen. Steal
freely from the example apps. They work well, and they were written by people
who will probably always know more about TV than you will. Try to shape your
learning app such that it can be added to incrementally, allowing you to test
and learn from it as you go, in small chunks. Don't try to make 1500 lines of
TV code compile at once, the first time.


My Learning App


Listing One (page 143) contains an early version of my own TV learning app,
HCALC.PAS. HCALC ("HouseCalc") is intended to be a collection of utilities for
dealing with home ownership. Listing One only implements a simple mortgage
calculator, allowing you to create several mortgage scenarios in independent
windows and compare them.

I created it in very small stages. I had a menu bar and status line with dummy
commands before the commands did anything. The first windows were empty
windows. I created the menu option to close all windows before the windows
contained any information. Only then did I actually create the dialog box to
gather mortgage parameters, and the last thing I did was actually place the
mortgage information into the windows. The application was compilable and
executable at every stage, which greatly helped me assimilate the knowledge
that I was drinking from the TV fire hose.
It worked for me. It should work for you. Try it! In the next several columns
I'll be explaining the operation of HCALC in detail. You might make a Xerox
copy of Listing One so that you can refer to it in the coming months, because
we won't be reprinting HCALC in every issue.
I haven't forgotten my original goal of designing and building a data
communications application in this column. We're working on it. There's no
easy path to the best goals, and if it takes a year, it takes a year. I'm not
going anywhere. Stay tuned.

_STRUCTURED PROGRAMMING COLUMN_
by Jeff Duntemann


[LISTING ONE]

PROGRAM HCalc; { By Jeff Duntemann; Update of 10/31/91 }
 { Requires Turbo Pascal 6.0! }

USES App,Dialogs,Objects,Views,Menus,Drivers,
 FInput, { By Allen Bauer; on CompuServe BPROGA }
 Mortgage; { By Jeff Duntemann; from DDJ 10/91 }

CONST
 cmNewMortgage = 199;
 cmExtraPrin = 198;
 cmCloseAll = 197;
 cmCloseBC = 196;
 cmPrintSummary = 195;
 WindowCount : Integer = 0;

TYPE
 MortgageDialogData =
 RECORD
 PrincipalData : Real;
 InterestData : Real;
 PeriodsData : Integer;
 END;

 ExtraPrincipalDialogData =
 RECORD
 PaymentNumber : Integer;
 ExtraDollars : Real;
 END;

 THouseCalcApp =
 OBJECT(TApplication)
 InitDialog : PDialog; { Dialog for initializing a mortgage }
 ExtraDialog : PDialog; { Dialog for entering extra principal }
 CONSTRUCTOR Init;
 PROCEDURE InitMenuBar; VIRTUAL;
 PROCEDURE CloseAll;
 PROCEDURE HandleEvent(VAR Event : TEvent); VIRTUAL;
 PROCEDURE NewMortgage;
 END;

 PMortgageTopInterior = ^TMortgageTopInterior;
 TMortgageTopInterior =
 OBJECT(TView)
 Mortgage : PMortgage;
 CONSTRUCTOR Init(VAR Bounds : TRect);
 PROCEDURE Draw; VIRTUAL;
 END;



 PMortgageBottomInterior = ^TMortgageBottomInterior;
 TMortgageBottomInterior =
 OBJECT(TScroller)
 { Points to Mortgage object owned by TMortgageView }
 Mortgage : PMortgage;
 CONSTRUCTOR Init(VAR Bounds : TRect;
 AHScrollBar, AVScrollbar : PScrollBar);
 PROCEDURE Draw; VIRTUAL;
 END;

 PMortgageView = ^TMortgageView;
 TMortgageView =
 OBJECT(TWindow)
 Mortgage : TMortgage;
 CONSTRUCTOR Init(VAR Bounds : TRect;
 ATitle : TTitleStr;
 ANumber : Integer;
 InitMortgageData :
 MortgageDialogData);
 PROCEDURE HandleEvent(Var Event : TEvent); VIRTUAL;
 PROCEDURE ExtraPrincipal;
 PROCEDURE PrintSummary;
 DESTRUCTOR Done; VIRTUAL;
 END;


CONST
 DefaultMortgageData : MortgageDialogData =
 (PrincipalData : 100000;
 InterestData : 10.0;
 PeriodsData : 360);


VAR
 HouseCalc : THouseCalcApp; { This is the application object itself }



{------------------------------}
{ METHODS: THouseCalcApp }
{------------------------------}


CONSTRUCTOR THouseCalcApp.Init;

VAR
 R : TRect;
 aView : PView;

BEGIN
 TApplication.Init; { Always call the parent's constructor first! }

 { Create the dialog for initializing a mortgage: }
 R.Assign(20,5,60,16);
 InitDialog := New(PDialog,Init(R,'Define Mortgage Parameters'));
 WITH InitDialog^ DO
 BEGIN
 { First item in the dialog box is input line for principal: }
 R.Assign(3,3,13,4);

 aView := New(PFInputLine,Init(R,8,DRealSet,DReal,0));
 Insert(aView);
 R.Assign(2,2,12,3);
 Insert(New(PLabel,Init(R,'Principal',aView)));

 { Next is the input line for interest rate: }
 R.Assign(17,3,26,4);
 aView := New(PFInputLine,Init(R,6,DRealSet,DReal,3));
 Insert(aView);
 R.Assign(16,2,25,3);
 Insert(New(PLabel,Init(R,'Interest',aView)));
 R.Assign(26,3,27,4); { Add a static text "%" sign }
 Insert(New(PStaticText,Init(R,'%')));

 { Up next is the input line for number of periods: }
 R.Assign(31,3,36,4);
 aView := New(PFInputLine,Init(R,3,DUnsignedSet,DInteger,0));
 Insert(aView);
 R.Assign(29,2,37,3);
 Insert(New(PLabel,Init(R,'Periods',aView)));

 { These are standard buttons for the OK and Cancel commands: }
 R.Assign(8,8,16,10);
 Insert(New(PButton,Init(R,'~O~K',cmOK,bfDefault)));
 R.Assign(22,8,32,10);
 Insert(New(PButton,Init(R,'Cancel',cmCancel,bfNormal)));
 END;

 { Create the dialog for adding additional principal to a payment: }
 R.Assign(20,5,60,16);
 ExtraDialog := New(PDialog,Init(R,'Apply Extra Principal to Mortgage'));
 WITH ExtraDialog^ DO
 BEGIN
 { First item in the dialog is the payment number to which }
 { we're going to apply the extra principal: }
 R.Assign(9,3,18,4);
 aView := New(PFInputLine,Init(R,6,DUnsignedSet,DInteger,0));
 Insert(aView);
 R.Assign(3,2,12,3);
 Insert(New(PLabel,Init(R,'Payment #',aView)));

 { Next item in the dialog box is input line for extra principal: }
 R.Assign(23,3,33,4);
 aView := New(PFInputLine,Init(R,8,DRealSet,DReal,2));
 Insert(aView);
 R.Assign(20,2,35,3);
 Insert(New(PLabel,Init(R,'Extra Principal',aView)));

 { These are standard buttons for the OK and Cancel commands: }
 R.Assign(8,8,16,10);
 Insert(New(PButton,Init(R,'~O~K',cmOK,bfDefault)));
 R.Assign(22,8,32,10);
 Insert(New(PButton,Init(R,'Cancel',cmCancel,bfNormal)));
 END;

END;


{ This method sends out a broadcast message to all views. Only the }
{ mortgage windows know how to respond to it, so when cmCloseBC is }
{ issued, only the mortgage windows react--by closing. }

PROCEDURE THouseCalcApp.CloseAll;

VAR
 Who : Pointer;

BEGIN
 Who := Message(Desktop,evBroadcast,cmCloseBC,@Self);
END;


PROCEDURE THouseCalcApp.HandleEvent(VAR Event : TEvent);

BEGIN
 TApplication.HandleEvent(Event);
 IF Event.What = evCommand THEN
 BEGIN
 CASE Event.Command OF
 cmNewMortgage : NewMortgage;
 cmCloseAll : CloseAll;
 ELSE
 Exit;
 END; { CASE }
 ClearEvent(Event);
 END;
END;


PROCEDURE THouseCalcApp.NewMortgage;

VAR
 Code : Integer;
 R : TRect;
 Control : Word;
 ThisMortgage : PMortgageView;
 InitMortgageData : MortgageDialogData;

BEGIN
 { First we need a dialog to get the initial mortgage values from }
 { the user. The dialog appears *before* the mortgage window! }
 WITH InitMortgageData DO
 BEGIN
 PrincipalData := 100000;
 InterestData := 10.0;
 PeriodsData := 360;
 END;
 InitDialog^.SetData(InitMortgageData);
 Control := Desktop^.ExecView(InitDialog);
 IF Control <> cmCancel THEN { Create a new mortgage object: }
 BEGIN
 R.Assign(5,5,45,20);
 Inc(WindowCount);
 { Get data from the initial mortgage dialog: }
 InitDialog^.GetData(InitMortgageData);
 { Call the constructor for the mortgage window: }
 ThisMortgage :=
 New(PMortgageView,Init(R,'Mortgage',WindowCount,
 InitMortgageData));

 { Insert the mortgage window into the desktop: }
 Desktop^.Insert(ThisMortgage);
 END;
END;


PROCEDURE THouseCalcApp.InitMenuBar;

VAR
 R : TRect;

BEGIN
 GetExtent(R);
 R.B.Y := R.A.Y + 1; { Define 1-line menu bar }

 MenuBar := New(PMenuBar,Init(R,NewMenu(
 NewSubMenu('~M~ortgage',hcNoContext,NewMenu(
 NewItem('~N~ew','F6',kbF6,cmNewMortgage,hcNoContext,
 NewItem('~E~xtra Principal ','',0,cmExtraPrin,hcNoContext,
 NewItem('~C~lose all','F7',kbF7,cmCloseAll,hcNoContext,
 NewItem('E~x~it','Alt-X',kbAltX,cmQuit,hcNoContext,
 NIL))))),
 NIL)
 )));
END;


{---------------------------------}
{ METHODS: TMortgageTopInterior }
{---------------------------------}

CONSTRUCTOR TMortgageTopInterior.Init(VAR Bounds : TRect);

BEGIN
 TView.Init(Bounds); { Call ancestor's constructor }
 GrowMode := gfGrowHiX; { Permits pane to grow in X but not Y }
END;


PROCEDURE TMortgageTopInterior.Draw;

VAR
 YRun : Integer;
 Color : Byte;
 B : TDrawBuffer;
 STemp : String[20];

BEGIN
 Color := GetColor(1);
 MoveChar(B,' ',Color,Size.X); { Clear the buffer to spaces }
 MoveStr(B,' Principal Interest Periods',Color);
 WriteLine(0,0,Size.X,1,B);

 MoveChar(B,' ',Color,Size.X); { Clear the buffer to spaces }
 { Here we convert payment data to strings for display: }
 Str(Mortgage^.Principal:7:2,STemp);
 MoveStr(B[2],STemp,Color); { At position 2 of buffer B }

 Str(Mortgage^.Interest*100:7:2,STemp);
 MoveStr(B[14],STemp,Color); { At position 14 of buffer B }
 Str(Mortgage^.Periods:4,STemp);
 MoveStr(B[27],STemp,Color); { At position 27 of buffer B }
 WriteLine(0,1,Size.X,1,B);

 MoveChar(B,' ',Color,Size.X); { Clear the buffer to spaces }
 MoveStr(B,
 ' Extra Principal Interest',
 Color);
 WriteLine(0,2,Size.X,1,B);

 MoveChar(B,' ',Color,Size.X); { Clear the buffer to spaces }
 MoveStr(B,
 'Paymt # Prin. Int. Balance Principal So far So far ',
 Color);
 WriteLine(0,3,Size.X,1,B);

END;


{------------------------------------}
{ METHODS: TMortgageBottomInterior }
{------------------------------------}

CONSTRUCTOR TMortgageBottomInterior.Init(VAR Bounds : TRect;
 AHScrollBar, AVScrollBar :
 PScrollBar);

BEGIN
 { Call ancestor's constructor: }
 TScroller.Init(Bounds,AHScrollBar,AVScrollBar);
 GrowMode := gfGrowHiX + gfGrowHiY;
 Options := Options OR ofFramed;
END;


PROCEDURE TMortgageBottomInterior.Draw;

VAR
 Color : Byte;
 B : TDrawBuffer;
 YRun : Integer;
 STemp : String[20];

BEGIN
 Color := GetColor(1);
 FOR YRun := 0 TO Size.Y-1 DO
 BEGIN
 MoveChar(B,' ',Color,80); { Clear the buffer to spaces }
 Str(Delta.Y+YRun+1:4,STemp);
 MoveStr(B,STemp+':',Color); { At beginning of buffer B }
 { Here we convert payment data to strings for display: }
 Str(Mortgage^.Payments^[Delta.Y+YRun+1].PayPrincipal:7:2,STemp);
 MoveStr(B[6],STemp,Color); { At position 6 of buffer B }
 Str(Mortgage^.Payments^[Delta.Y+YRun+1].PayInterest:7:2,STemp);
 MoveStr(B[15],STemp,Color); { At position 15 of buffer B }
 Str(Mortgage^.Payments^[Delta.Y+YRun+1].Balance:10:2,STemp);
 MoveStr(B[24],STemp,Color); { At position 24 of buffer B }

 { There isn't an extra principal value for every payment, so }
 { display the value only if it is nonzero: }
 STemp := '';
 IF Mortgage^.Payments^[Delta.Y+YRun+1].ExtraPrincipal > 0
 THEN
 Str(Mortgage^.Payments^[Delta.Y+YRun+1].ExtraPrincipal:10:2,STemp);
 MoveStr(B[37],STemp,Color); { At position 37 of buffer B }
 Str(Mortgage^.Payments^[Delta.Y+YRun+1].PrincipalSoFar:10:2,STemp);
 MoveStr(B[50],STemp,Color); { At position 50 of buffer B }
 Str(Mortgage^.Payments^[Delta.Y+YRun+1].InterestSoFar:10:2,STemp);
 MoveStr(B[64],STemp,Color); { At position 64 of buffer B }
 { Here we write the line to the window, taking into account the }
 { state of the X scroll bar: }
 WriteLine(0,YRun,Size.X,1,B[Delta.X]);
 END;
END;


{------------------------------}
{ METHODS: TMortgageView }
{------------------------------}

CONSTRUCTOR TMortgageView.Init(VAR Bounds : TRect;
 ATitle : TTitleStr;
 ANumber : Integer;
 InitMortgageData :
 MortgageDialogData);
VAR
 TopInterior : PMortgageTopInterior;
 BottomInterior : PMortgageBottomInterior;
 HScrollBar,VScrollBar : PScrollBar;
 R,S : TRect;

BEGIN
 TWindow.Init(Bounds,ATitle,ANumber); { Call ancestor's constructor }
 { Call the Mortgage object's constructor using dialog data: }
 WITH InitMortgageData DO
 Mortgage.Init(PrincipalData,
 InterestData / 100,
 PeriodsData,
 12);

 { Here we set up a window with *two* interiors, one scrollable, one }
 { static. It's all in the way that you define the bounds, mostly: }
 GetClipRect(Bounds); { Get bounds for interior of view }
 Bounds.Grow(-1,-1); { Shrink those bounds by 1 for both X & Y }

 { Define a rectangle to embrace the upper of the two interiors: }
 R.Assign(Bounds.A.X,Bounds.A.Y,Bounds.B.X,Bounds.A.Y+4);
 TopInterior := New(PMortgageTopInterior,Init(R));
 TopInterior^.Mortgage := @Mortgage;
 Insert(TopInterior);

 { Define a rectangle to embrace the lower of two interiors: }
 R.Assign(Bounds.A.X,Bounds.A.Y+5,Bounds.B.X,Bounds.B.Y);

 { Create scroll bars for both mouse & keyboard input: }
 VScrollBar := StandardScrollBar(sbVertical + sbHandleKeyboard);
 { We have to adjust vertical bar to fit bottom interior: }

 VScrollBar^.Origin.Y := R.A.Y; { Adjust top Y value }
 VScrollBar^.Size.Y := R.B.Y - R.A.Y; { Adjust size }
 { The horizontal scroll bar, on the other hand, is standard: }
 HScrollBar := StandardScrollBar(sbHorizontal + sbHandleKeyboard);

 { Create bottom interior object with scroll bars: }
 BottomInterior :=
 New(PMortgageBottomInterior,Init(R,HScrollBar,VScrollBar));
 { Make copy of pointer to mortgage object: }
 BottomInterior^.Mortgage := @Mortgage;
 { Set the limits for the scroll bars: }
 BottomInterior^.SetLimit(80,InitMortgageData.PeriodsData);
 { Insert the interior into the window: }
 Insert(BottomInterior);
END;


PROCEDURE TMortgageView.HandleEvent(Var Event : TEvent);

BEGIN
 TWindow.HandleEvent(Event);
 IF Event.What = evCommand THEN
 BEGIN
 CASE Event.Command OF
 cmExtraPrin : ExtraPrincipal;
 cmPrintSummary : PrintSummary;
 ELSE
 Exit;
 END; { CASE }
 ClearEvent(Event);
 END
 ELSE
 IF Event.What = evBroadcast THEN
 CASE Event.Command OF
 cmCloseBC : Done
 END; { CASE }
END;


PROCEDURE TMortgageView.ExtraPrincipal;

VAR
 Control : Word;
 ExtraPrincipalData : ExtraPrincipalDialogData;

BEGIN
 { Execute the "extra principal" dialog box: }
 Control := Desktop^.ExecView(HouseCalc.ExtraDialog);
 IF Control <> cmCancel THEN { Update the active mortgage window: }
 BEGIN
 { Get data from the extra principal dialog: }
 HouseCalc.ExtraDialog^.GetData(ExtraPrincipalData);
 Mortgage.Payments^[ExtraPrincipalData.PaymentNumber].ExtraPrincipal :=
 ExtraPrincipalData.ExtraDollars;
 Mortgage.Recalc; { Recalculate the amortization table... }
 Redraw; { ...and redraw the mortgage window }
 END;
END;



PROCEDURE TMortgageView.PrintSummary;

BEGIN
END;


DESTRUCTOR TMortgageView.Done;

BEGIN
 Mortgage.Done; { Dispose of the mortgage object's memory }
 TWindow.Done; { Call parent's destructor to dispose of window }
END;



BEGIN
 HouseCalc.Init;
 HouseCalc.Run;
 HouseCalc.Done;
END.


[THE FOLLOWING IS SOURCE FOR FINPUT.PAS]

unit FInput;
{$X+}
{
 This unit implements a derivative of TInputLine that supports several
 data types dynamically. It also provides formatted input for all the
 numerical types, keystroke filtering and uppercase conversion, field
 justification, and range checking.

 When the field is initialized, many filtering and uppercase conversions
 are implemented pertinent to the particular data type.

 The CheckRange and ErrorHandler methods should be overridden if the
 user wants to implement them.

 This is just an initial implementation and comments are welcome. You
 can contact me via Compuserve. (76066,3202)

 I am releasing this into the public domain and anyone can use or modify
 it for their own personal use.

 Copyright (c) 1990 by Allen Bauer (76066,3202)

 1.1 - fixed input validation functions

 This is version 1.2 - fixed DataSize method to include reals.
 fixed Draw method to not format the data
 while the view is selected.
}

interface
uses Objects, Drivers, Dialogs;

type
 VKeys = set of char;


 PFInputLine = ^TFInputLine;
 TFInputLine = object(TInputLine)
 ValidKeys : VKeys;
 DataType,Decimals : byte;
 imMode : word;
 Validated, ValidSent : boolean;
 constructor Init(var Bounds: TRect; AMaxLen: integer;
 ChrSet: VKeys;DType, Dec: byte);
 constructor Load(var S: TStream);
 procedure Store(var S: TStream);
 procedure HandleEvent(var Event: TEvent); virtual;
 procedure GetData(var Rec); virtual;
 procedure SetData(var Rec); virtual;
 function DataSize: word; virtual;
 procedure Draw; virtual;
 function CheckRange: boolean; virtual;
 procedure ErrorHandler; virtual;
 end;

const
 imLeftJustify = $0001;
 imRightJustify = $0002;
 imConvertUpper = $0004;

 DString = 0;
 DChar = 1;
 DReal = 2;
 DByte = 3;
 DShortInt = 4;
 DInteger = 5;
 DLongInt = 6;
 DWord = 7;
 DDate = 8;
 DTime = 9;

 DRealSet : VKeys = [#1..#31,'+','-','0'..'9','.','E','e'];
 DSignedSet : VKeys = [#1..#31,'+','-','0'..'9'];
 DUnSignedSet : VKeys = [#1..#31,'0'..'9'];
 DCharSet : VKeys = [#1..#31,' '..'~'];
 DUpperSet : VKeys = [#1..#31,' '..'`','{'..'~'];
 DAlphaSet : VKeys = [#1..#31,'A'..'Z','a'..'z'];
 DFileNameSet : VKeys =
[#1..#31,'!','#'..')','-'..'.','0'..'9','@'..'Z','^'..'{','}'..'~'];
 DPathSet : VKeys =
[#1..#31,'!','#'..')','-'..'.','0'..':','@'..'Z','^'..'{','}'..'~','\'];
 DFileMaskSet : VKeys =
[#1..#31,'!','#'..'*','-'..'.','0'..':','?'..'Z','^'..'{','}'..'~','\'];
 DDateSet : VKeys = [#1..#31,'0'..'9','/'];
 DTimeSet : VKeys = [#1..#31,'0'..'9',':'];

 cmValidateYourself = 5000;
 cmValidatedOK = 5001;

procedure RegisterFInputLine;

const
 RFInputLine : TStreamRec = (
 ObjType: 20000;
 VmtLink: Ofs(typeof(TFInputLine)^);
 Load: @TFInputLine.Load;
 Store: @TFinputLine.Store

 );

implementation

uses Views, MsgBox, StrFmt, Dos;

function CurrentDate : string;
var
 Year,Month,Day,DOW : word;
 DateStr : string[10];
begin
 GetDate(Year,Month,Day,DOW);
 DateStr := SFLongint(Month,2)+'/'
 +SFLongInt(Day,2)+'/'
 +SFLongInt(Year mod 100,2);
 for DOW := 1 to length(DateStr) do
 if DateStr[DOW] = ' ' then
 DateStr[DOW] := '0';
 CurrentDate := DateStr;
end;

function CurrentTime : string;
var
 Hour,Minute,Second,Sec100 : word;
 TimeStr : string[10];
begin
 GetTime(Hour,Minute,Second,Sec100);
 TimeStr := SFLongInt(Hour,2)+':'
 +SFLongInt(Minute,2)+':'
 +SFLongInt(Second,2);
 for Sec100 := 1 to length(TimeStr) do
 if TimeStr[Sec100] = ' ' then
 TimeStr[Sec100] := '0';
 CurrentTime := TimeStr;
end;

procedure RegisterFInputLine;
begin
 RegisterType(RFInputLine);
end;

constructor TFInputLine.Init(var Bounds: TRect; AMaxLen: integer;
 ChrSet: VKeys; DType, Dec: byte);
begin
 if (DType in [DDate,DTime]) and (AMaxLen < 8) then
 AMaxLen := 8;

 TInputLine.Init(Bounds,AMaxLen);

 ValidKeys:= ChrSet;
 DataType := DType;
 Decimals := Dec;
 Validated := true;
 ValidSent := false;
 case DataType of
 DReal,DByte,DInteger,DLongInt,
 DShortInt,DWord : imMode := imRightJustify;

 DChar,DString,
 DDate,DTime : imMode := imLeftJustify;
 end;
 if ValidKeys = DUpperSet then
 imMode := imMode or imConvertUpper;
 EventMask := EventMask or evMessage;
end;

constructor TFInputLine.Load(var S: TStream);
begin
 TInputLine.Load(S);
 S.Read(ValidKeys, sizeof(VKeys));
 S.Read(DataType, sizeof(byte));
 S.Read(Decimals, sizeof(byte));
 S.Read(imMode, sizeof(word));
 S.Read(Validated, sizeof(boolean));
 S.Read(ValidSent, sizeof(boolean));
end;

procedure TFInputLine.Store(var S: TStream);
begin
 TInputLine.Store(S);
 S.Write(ValidKeys, sizeof(VKeys));
 S.Write(DataType, sizeof(byte));
 S.Write(Decimals, sizeof(byte));
 S.Write(imMode, sizeof(word));
 S.Write(Validated, sizeof(boolean));
 S.Write(ValidSent, sizeof(boolean));
end;

procedure TFInputLine.HandleEvent(var Event: TEvent);
var
 NewEvent: TEvent;
begin
 case Event.What of
 evKeyDown : begin
 if (imMode and imConvertUpper) <> 0 then
 Event.CharCode := upcase(Event.CharCode);
 if not(Event.CharCode in [#0..#31]) then
 begin
 Validated := false;
 ValidSent := false;
 end;
 if (Event.CharCode <> #0) and not(Event.CharCode in ValidKeys) then
 ClearEvent(Event);
 end;
 evBroadcast: begin
 if (Event.Command = cmReceivedFocus) and
 (Event.InfoPtr <> @Self) and
 ((Owner^.State and sfSelected) <> 0) and
 not(Validated) and not(ValidSent) then
 begin
 NewEvent.What := evBroadcast;
 NewEvent.InfoPtr := @Self;
 NewEvent.Command := cmValidateYourself;
 PutEvent(NewEvent);
 ValidSent := true;
 end;
 if (Event.Command = cmValidateYourself) and
 (Event.InfoPtr = @Self) then

 begin
 if not CheckRange then
 begin
 ErrorHandler;
 Select;
 end
 else
 begin
 NewEvent.What := evBroadCast;
 NewEvent.InfoPtr := @Self;
 NewEvent.Command := cmValidatedOK;
 PutEvent(NewEvent);
 Validated := true;
 end;
 ValidSent := false;
 ClearEvent(Event);
 end;
 end;
 end;
 TInputLine.HandleEvent(Event);
end;

procedure TFInputLine.GetData(var Rec);
var
 Code : integer;
begin
 case DataType of
 Dstring,
 DDate,
 DTime : TInputLine.GetData(Rec);
 DChar : char(Rec) := Data^[1];
 DReal : val(Data^, real(Rec) , Code);
 DByte : val(Data^, byte(Rec) , Code);
 DShortInt : val(Data^, shortint(Rec) , Code);
 DInteger : val(Data^, integer(Rec) , Code);
 DLongInt : val(Data^, longint(Rec) , Code);
 DWord : val(Data^, word(Rec) , Code);
 end;
end;

procedure TFInputLine.SetData(var Rec);
begin
 case DataType of
 DString,
 DDate,
 DTime : TInputLine.SetData(Rec);
 DChar : Data^ := char(Rec);
 DReal : Data^ := SFDReal(real(Rec),MaxLen,Decimals);
 DByte : Data^ := SFLongInt(byte(Rec),MaxLen);
 DShortInt : Data^ := SFLongInt(shortint(Rec),MaxLen);
 DInteger : Data^ := SFLongInt(integer(Rec),MaxLen);
 DLongInt : Data^ := SFLongInt(longint(Rec),MaxLen);
 DWord : Data^ := SFLongInt(word(Rec),MaxLen);
 end;
 SelectAll(true);
end;

function TFInputLine.DataSize: word;
begin

 case DataType of
 DString,
 DDate,
 DTime : DataSize := TInputLine.DataSize;
 DChar : DataSize := sizeof(char);
 DReal : DataSize := sizeof(real);
 DByte : DataSize := sizeof(byte);
 DShortInt : DataSize := sizeof(shortint);
 DInteger : DataSize := sizeof(integer);
 DLongInt : DataSize := sizeof(longint);
 DWord : DataSize := sizeof(word);
 else
 DataSize := TInputLine.DataSize;
 end;
end;

procedure TFInputLine.Draw;
var
 RD : real;
 Code : integer;
begin
 if not((State and sfSelected) <> 0) then
 case DataType of
 DReal : begin
 if Data^ = '' then
 Data^ := SFDReal(0.0,MaxLen,Decimals)
 else
 begin
 val(Data^, RD, Code);
 Data^ := SFDReal(RD,MaxLen,Decimals);
 end;
 end;

 DByte,
 DShortInt,
 DInteger,
 DLongInt,
 DWord : if Data^ = '' then Data^ := SFLongInt(0,MaxLen);

 DDate : if Data^ = '' then Data^ := CurrentDate;
 DTime : if Data^ = '' then Data^ := CurrentTime;

 end;

 if State and (sfFocused+sfSelected) <> 0 then
 begin
 if (imMode and imRightJustify) <> 0 then
 while (length(Data^) > 0) and (Data^[1] = ' ') do
 delete(Data^,1,1);
 end
 else
 begin
 if ((imMode and imRightJustify) <> 0) and (Data^ <> '') then
 while (length(Data^) < MaxLen) do
 insert(' ',Data^,1);
 if (imMode and imLeftJustify) <> 0 then
 while (length(Data^) > 0) and (Data^[1] = ' ') do
 delete(Data^,1,1);


 end;
 TInputLine.Draw;
end;

function TFInputLine.CheckRange: boolean;
var
 MH,DM,YS : longint;
 Code : integer;
 MHs,DMs,YSs : string[2];
 Delim : char;
 Ok : boolean;
begin
 Ok := true;
 case DataType of
 DDate,
 DTime : begin
 if DataType = DDate then Delim := '/' else Delim := ':';
 if pos(Delim,Data^) > 0 then
 begin
 MHs := copy(Data^,1,pos(Delim,Data^));
 DMs := copy(Data^,pos(Delim,Data^)+1,2);
 delete(Data^,pos(Delim,Data^),1);
 YSs := copy(Data^,pos(Delim,Data^)+1,2);
 if length(MHs) < 2 then MHs := '0' + MHs;
 if length(DMs) < 2 then DMs := '0' + DMs;
 if length(YSs) < 2 then YSs := '0' + YSs;
 Data^ := MHs + DMs + YSs;
 end;
 if (length(Data^) >= 6) and (pos(Delim,Data^) = 0) then
 begin
 val(copy(Data^,1,2), MH, Code);
 if Code <> 0 then MH := 0;
 val(copy(Data^,3,2), DM, Code);
 if Code <> 0 then DM := 0;
 val(copy(Data^,5,2), YS, Code);
 if Code <> 0 then YS := 0;
 if DataType = DDate then
 begin
 if (MH > 12) or (MH < 1) or
 (DM > 31) or (DM < 1) then Ok := false;
 end
 else
 begin
 if (MH > 23) or (MH < 0) or
 (DM > 59) or (DM < 0) or
 (YS > 59) or (YS < 0) then Ok := false;
 end;
 insert(Delim,Data^,5);
 insert(Delim,Data^,3);
 end
 else
 Ok := false;
 end;

 DByte : begin
 val(Data^, MH, Code);
 if (Code <> 0) or (MH > 255) or (MH < 0) then Ok := false;
 end;


 DShortint :
 begin
 val(Data^, MH, Code);
 if (Code <> 0) or (MH < -128) or (MH > 127) then Ok := false;
 end;

 DInteger :
 begin
 val(Data^, MH, Code);
 if (Code <> 0) or (MH < -32768) or (MH > 32767) then Ok := false;
 end;

 DWord : begin
 val(Data^, MH, Code);
 if (Code <> 0) or (MH < 0) or (MH > 65535) then Ok := false;
 end;
 end;
 CheckRange := Ok;
end;

procedure TFInputLine.ErrorHandler;
var
 MsgString : string[80];
 Params : array[0..1] of longint;
 Event: TEvent;
begin
 fillchar(Params,sizeof(params),#0);
 MsgString := '';
 case DataType of
 DDate : MsgString := ' Invalid Date Format! Enter Date as MM/DD/YY ';
 DTime : MsgString := ' Invalid Time Format! Enter Time as HH:MM:SS ';
 DByte,
 DShortInt,
 DInteger,
 DWord : begin
 MsgString := ' Number must be between %d and %d ';
 case DataType of
 DByte : Params[1] := 255;
 DShortInt : begin Params[0] := -128; Params[1] := 127; end;
 DInteger : begin Params[0] := -32768; Params[1] := 32767; end;
 DWord : Params[1] := 65535;
 end;
 end;
 end;
 MessageBox(MsgString, @Params, mfError + mfOkButton);
end;

end.














November, 1991
GRAPHICS PROGRAMMING


Antialiasing with the Sierra Hicolor DAC




Michael Abrash


There's an Italian saying, the gist of which is, "It need not be true, so long
as it's well said." This strikes close to the essential truth of antialiasing:
The image need not be accurate, so long as it looks like it is. You don't go
to the trouble of antialiasing in order to get a mathematically precise
representation of an image; you do it so the amazing eye/brain
integrating/pattern matching system will see what you want it to see.
This is a particularly relevant thought at the moment, for we're smack in the
middle of discussing the Sierra Hicolor DAC, which makes classic, high-quality
antialiasing, of the sort that Targa boards have offered for years, available
at mass-market prices. To recap, the Hicolor DAC extends SuperVGA to provide
selection among enough colors for serious rendering and antialiasing; 32,768
simultaneous colors, to be exact. Although the Hicolor DAC falls short of the
24-bpp true color standard, you aren't likely to find a 24-bpp adapter priced
in the $200-300 range.
Last month, we looked at simple, unweighted antialiasing in the context of the
VGA's standard 256-color mode, performing antialiasing between exactly four
colors -- red, green, blue, and black -- with five semi-independent levels of
each of the three primary colors available. This month, we'll start off by
discussing the basic Hicolor programming model, then we'll do the same sort of
antialiasing as last month -- but this time with 32 fully independent levels
of each primary color and resolutions up to 800x600, which makes quite a
difference indeed.


A Brief Primer on the Sierra Hicolor DAC


The operation of the Hicolor DAC in 32K-color mode is remarkably simple.
First, the VGA must be set to a 256-color mode with twice the desired
horizontal resolution; for example, a 1600x600 256-color mode would be
selected if 800x600 32K-color mode were desired. Then, the Hicolor DAC is set
to high-color mode via the command register; in high-color mode, the Hicolor
DAC takes each pair of 256-color pixels, joins them together into one 16-bit
pixel, converts the red, green, and blue components (described shortly)
directly to proportional analog values (no palette is involved), and sends
them to the monitor.
There is a serious problem here, however: There is no standard way for an
application to select a high-color mode. It's not enough to set up the Hicolor
DAC; the VGA must also be set to the appropriate double-resolution 256-color
mode, and the sequence for doing that -- especially selecting high-speed
clocks -- varies from VGA to VGA. There is no VESA mode number for high-color
modes; there is a VESA programming guideline for high-color modes, and I'm
looking into it, but it's certainly not as simple as a mode number. In any
case, the VESA interface isn't always available.
Consequently, high-color mode selection is adapter-dependent. Fortunately,
most of the Hicolor-based boards are built around the Tseng Labs ET4000 VGA
chip (ATI is the only exception I know of, which is why I'm focusing on
ET4000-based Hicolor boards in this column), and Tseng provides a BIOS
interface for high-color modes. (There's no guarantee that manufacturers using
the ET4000 will follow the Tseng interface, but I suspect they will, as it's
the closest thing to a standard at the moment.) Unfortunately, when I run the
ET4000 BIOS function that reports whether a Hicolor DAC is present on my
Toshiba portable without a Hicolor board installed, it hangs my system, so
it's not a good idea to rely on the BIOS functions alone.
My solution, shown in Listing One (page 146), is to first check for a Hicolor
DAC and an ET4000 at the hardware level; if both are present, I call the BIOS
(which is presumably at least not hostile to the Tseng BIOS high-color
extensions at this point) to check for the availability of Hicolor modes, and
finally, if all has gone well, to set the desired high-color mode. This is
probably overkill, but at least this way you get three kinds of chip ID code
to mix and match as you wish.


Programming the Hicolor DAC


The pixel format of the Hicolor DAC is straightforward: Each pixel is stored
in one word of display memory, with the lowest 5 bits forming the blue
component, the next 5 bits forming the green component, the next 5 bits
forming the red component, and bit 15 ignored, as shown in Figure 1. Pixels
start at even addresses. The bits within a word are organized Intel style,
with the byte at the even address containing bits 7-0 (blue and part of
green), and the byte at the odd address containing bits 15-8 (red and the rest
of green).
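As a minimal sketch of the pixel format just described (this is not code from
the listings, and the function names are hypothetical), the three 5-bit
components can be packed into a word and split into the two bytes that land
at the even and odd addresses:

```c
#include <stdint.h>

/* Pack 5-bit red, green, and blue levels (0-31 each) into one 16-bit
   Hicolor pixel: blue in bits 4-0, green in bits 9-5, red in bits 14-10.
   Bit 15 is ignored by the DAC, so it is left at 0 here. */
uint16_t PackHicolor(unsigned red, unsigned green, unsigned blue)
{
    return (uint16_t)(((red & 0x1F) << 10) | ((green & 0x1F) << 5) |
                      (blue & 0x1F));
}

/* Byte stored at the even address: bits 7-0 (blue and part of green). */
uint8_t HicolorLowByte(uint16_t pixel)
{
    return (uint8_t)(pixel & 0xFF);
}

/* Byte stored at the odd address: bits 15-8 (red and the rest of green). */
uint8_t HicolorHighByte(uint16_t pixel)
{
    return (uint8_t)(pixel >> 8);
}
```

With this layout, full-intensity green is the word 0x03E0, stored as the byte
0xE0 at the even address followed by 0x03 at the odd address.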
Pixels proceed linearly for the length of the bitmap; the organization is the
same as 256-color mode, except that each pixel takes up one word, rather than
one byte. As in SuperVGA 256-color modes, the bitmap is too long to be
addressed in the 64K video memory window, so banking must be used; again, the
banking is just like 256-color banking, except that each bank contains only
half as many pixels; 32,768, to be exact.
On the ET4000, the Segment Select register at 3CDh controls banking, as shown
in Figure 2. There are 16 banks, each spanning 64 Kbytes of the total 1-Mbyte
bitmap. Banks can be selected separately for read and write to facilitate
scrolling and screen-to-screen copies, although we won't need that today.
Simple enough, but there's a catch: broken rasters.
Banks are 64K in length. If each Hicolor scan line is 1600 bytes long, then
65,536/1600=40 raster lines (lines 0-39) fit in bank 0 -- with 1536 bytes of
the next line (line 40), also in the first bank. The last 64 bytes of line 40
are in bank 1, as shown in Figure 3, so the line is split by the bank
boundary; hence the term "broken raster." Broken rasters crop up for other
bank crossings as well, and make Hicolor programming somewhat slower (the
extent depends heavily on the quality of the code) and considerably more
complicated.
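The arithmetic behind broken rasters can be sketched as follows. This is an
illustration only, not code from the listings; the names HicolorAddr,
LocatePixel, and IsBrokenRaster are hypothetical, and the sketch assumes 2
bytes per pixel and 64K banks, as described above.

```c
#define BANK_SIZE 65536UL   /* each bank spans 64K bytes */

typedef struct {
    unsigned bank;          /* which 64K bank holds the pixel */
    unsigned long offset;   /* byte offset of the pixel within that bank */
} HicolorAddr;

/* Find the bank and in-bank offset of pixel (x, y), given the length of a
   scan line in bytes. Each pixel occupies 2 bytes. */
HicolorAddr LocatePixel(unsigned x, unsigned y, unsigned long bytesPerLine)
{
    unsigned long linear = (unsigned long)y * bytesPerLine +
                           (unsigned long)x * 2;
    HicolorAddr a;
    a.bank = (unsigned)(linear / BANK_SIZE);
    a.offset = linear % BANK_SIZE;
    return a;
}

/* Returns nonzero if scan line y starts in one bank and ends in another. */
int IsBrokenRaster(unsigned y, unsigned long bytesPerLine)
{
    unsigned long first = (unsigned long)y * bytesPerLine;
    unsigned long last = first + bytesPerLine - 1;
    return (first / BANK_SIZE) != (last / BANK_SIZE);
}
```

For 1600-byte scan lines, IsBrokenRaster reports line 40 as the first broken
raster, and LocatePixel places pixel 768 of that line (byte 1536 of the line)
at offset 0 of bank 1.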
Broken rasters are not unique to Hicolor modes; 800x600 and 640x480 256-color
modes also normally have broken rasters. However, there's a clever workaround
for broken rasters in 256-color modes: Stretch the bitmap width to 1K pixels,
via the Row Offset register, so that banks split between raster lines
(although this works for 800x600 only if the VGA has 1 Mbyte of memory).
Sad to say, stretching the bitmap width to 1K pixels doesn't work in Hicolor
mode. There's not enough memory to do it at 800x600 (at least until 2-Mbyte
VGAs appear), but that's not the problem at 640x480. The problem is that a Row
Offset register setting of 256 would be required to stretch the bitmap width
to 1K pixels -- and the Row Offset register only goes up to 255. I'm sure that
when the VGA was being designed, 255 seemed like plenty, but then, 640K once
seemed like pie in the sky. The upshot is that Hicolor programming unavoidably
requires handling broken rasters. That's a nuisance, but a manageable one;
next, we'll see polygon fill code that deals with broken rasters.
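The Row Offset limit is easy to check numerically. The sketch below assumes
the register counts line width in units of 8 bytes; that unit is an inference
from the figure of 256 quoted above for a 2,048-byte (1K-pixel Hicolor) line,
not something stated in the text, and the function names are hypothetical.

```c
/* Row Offset value needed for a scan line of the given length, ASSUMING
   the register counts in units of 8 bytes (inferred, not documented here). */
unsigned RowOffsetFor(unsigned long bytesPerLine)
{
    return (unsigned)(bytesPerLine / 8);
}

/* Nonzero if the required value fits in the 8-bit register (0-255). */
int FitsRowOffset(unsigned long bytesPerLine)
{
    return RowOffsetFor(bytesPerLine) <= 255;
}
```

Under that assumption, a 1K-pixel line in 256-color mode (1,024 bytes) needs a
setting of 128 and fits, while a 1K-pixel line in Hicolor mode (2,048 bytes)
needs 256 and does not.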


Non-antialiased Hicolor Drawing


Listing Two (page 146) draws a perspective cube in 640x480 32K color mode,
with help from the initialization code in Listing One, the DrawPixel-based
low-level polygon fill code in Listing Three (page 146), and the header file
in Listing Seven, page 149. (FILCNVXD.C from last month and Listing Four from
the March column are also required.) Not surprisingly, the cube drawn by
Listing Two looks a lot like the non-antialiased cube drawn last month, but
isn't as jagged because the resolution is now much higher. Nonetheless,
jaggies are still quite prominent, and they remain clearly visible at 800x600.
(I've used 640x480 mode so that the code will work on fixed-frequency
monitors, but Listing Two can be altered for 800x600 mode simply by changing
the parameter passed to SetHCMode and the value of BitmapWidthInBytes.)
Listing Two doesn't run very fast when linked to Listing Three; I suspect
you'll like the low-level polygon fill code in Listing Four (page 146) much
better. This code handles broken rasters reasonably efficiently, by checking
for them at the beginning of each scan line, then splitting up the fill and
banking appropriately whenever a bank crossing is detected.
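The splitting step can be sketched as below. This is not the code of Listing
Four, just a minimal illustration of the idea under stated assumptions: 2
bytes per pixel, 64K banks, an even scan line length, and a span shorter than
one bank, so that it crosses at most one bank boundary. SpanPiece and
SplitSpan are hypothetical names.

```c
#define BANK_SIZE 65536UL   /* each bank spans 64K bytes */

typedef struct {
    unsigned bank;          /* bank to select before filling this piece */
    unsigned long offset;   /* starting byte offset within the bank */
    unsigned long pixels;   /* number of pixels to fill in this piece */
} SpanPiece;

/* Split a horizontal span of `count` pixels starting at (x, y) into one
   piece per bank it touches. Returns the number of pieces (0, 1, or 2). */
int SplitSpan(unsigned x, unsigned y, unsigned long count,
              unsigned long bytesPerLine, SpanPiece pieces[2])
{
    unsigned long start = (unsigned long)y * bytesPerLine +
                          (unsigned long)x * 2;
    unsigned long bytes = count * 2;
    /* Bytes remaining in the starting bank, from `start` to the bank end */
    unsigned long room = BANK_SIZE - (start % BANK_SIZE);

    if (count == 0) return 0;
    pieces[0].bank = (unsigned)(start / BANK_SIZE);
    pieces[0].offset = start % BANK_SIZE;
    if (bytes <= room) {            /* whole span lies within one bank */
        pieces[0].pixels = count;
        return 1;
    }
    pieces[0].pixels = room / 2;    /* pixels that fit before the boundary */
    pieces[1].bank = pieces[0].bank + 1;
    pieces[1].offset = 0;           /* remainder starts at top of next bank */
    pieces[1].pixels = count - pieces[0].pixels;
    return 2;
}
```

For a full 800-pixel span on line 40 of a bitmap with 1600-byte scan lines,
this yields 768 pixels at offset 64000 of bank 0, followed by 32 pixels at
offset 0 of bank 1.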


Simple Unweighted Antialiasing


As the saying goes, you can never be too rich, too thin, or have too many
colors available. Personally, I only buy one of those three assertions: You
really can't have too many colors. Listings Five (page 148) and Six (page
149), together with Listings One, Three, and Seven, FILCNVXD.C from last
month, and Listing Four from March, show why. This program draws the same
cube, but this time employing the simple, unweighted antialiasing we used last
month -- and taking advantage of the full color range of the Hicolor DAC. The
results are excellent: On my venerable NEC MultiSync, at a viewing distance of
one foot, all but two of the edges look absolutely smooth, with not the
slightest hint of jaggies, and the two imperfect edges show only slight
ripples. At two feet, the cube looks perfect. The difference between the
non-antialiased and antialiased cubes is astounding, considering that we're
working with the same resolution in both cases.
A quick review of the simple antialiasing used this month and last: The image
is drawn to a memory buffer at a multiple of the resolution that the actual
screen supports. Each pixel on the screen maps to a group of hi-res pixels
(subpixels), arranged in a square, in the memory buffer. The colors of the
subpixels in each square are averaged, and the corresponding screen pixel is
set to the average subpixel color.
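A minimal sketch of that averaging step follows. It is not taken from Listing
Five; the TruePixel layout and the AverageSquare name are assumptions made
for illustration, with the band buffer taken to be one screen row high (that
is, `mult` subpixel rows) and stored row by row.

```c
#include <stddef.h>

/* Hypothetical 32-bit subpixel: three 8-bit primaries plus a spare byte */
typedef struct {
    unsigned char red, green, blue, unused;
} TruePixel;

/* Average the mult x mult square of subpixels that maps to screen column
   screenX. `band` holds `mult` rows of `bandWidth` subpixels each. */
TruePixel AverageSquare(const TruePixel *band, size_t bandWidth,
                        unsigned mult, unsigned screenX)
{
    unsigned long r = 0, g = 0, b = 0;
    unsigned long n = (unsigned long)mult * mult;  /* subpixels per square */
    unsigned row, col;
    TruePixel out;

    for (row = 0; row < mult; row++) {
        /* First subpixel of this row that belongs to the square */
        const TruePixel *p = band + (size_t)row * bandWidth +
                             (size_t)screenX * mult;
        for (col = 0; col < mult; col++, p++) {
            r += p->red;
            g += p->green;
            b += p->blue;
        }
    }
    out.red = (unsigned char)(r / n);
    out.green = (unsigned char)(g / n);
    out.blue = (unsigned char)(b / n);
    out.unused = 0;
    return out;
}
```

The result is still a true color value in the range 0-255 per primary;
converting it to an adapter-specific pixel is a separate step.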
There's not enough memory to scan out the entire image at high resolution
(about 50K is required just to scan out one raster line at 4X resolution!), so
Listing Six scans out just those pixels that lie in a specified band. (Each
band corresponds to a single raster line in Listing Five.) Note that Listing
Six draws 32-bit pixels to the memory buffer; this is true color, plus an
extra byte for flexibility. Consequently, Listing Six is a general-purpose
tool, and can be used with any sort of adapter, so long as the main program
knows how to convert from true color to adapter-specific pixels. Listing Five
does this by calculating the average intensity in each subpixel group of each
of the three primary colors, in the range 0-255, then looking up the gamma
corrected equivalent color value for the Hicolor DAC, mapped into the range
0-31.
A quick look at the gamma-corrected color mapping table in Listing Five shows
why hardware gamma correction is sorely missed in the Hicolor DAC. The
brightest half of the color range -- from half intensity to full intensity --
is spanned by only 9 of the Hicolor DAC's 32 color values. That means that for
brighter colors, the Hicolor DAC effectively has only half the color
resolution that you'd expect from 5 bits per color gun, and the resolution is
even worse at the highest intensities.
I'd like to take a moment to emphasize that although Listing Five works with
only the three primary colors, it could just as easily work with the thousands
of colors that can be produced as mixes of the three primaries; there are none
of the limitations of 256-color mode, and no special tricks (such as biasing
the palette according to color frequency) need be used. Inevitably, though,
proportionately fewer intermediate blends are available and hence antialiasing
becomes less precise when there is less contrast between colors; you're not
going to be able to do much antialiasing between a pixel with a green true
color value of 250 and another with a value of 255. This is where the lack of
gamma correction and the difference between 15-bpp and true color become
apparent.


Notes on the Antialiasing Implementation


Listing Five features user-selectable subpixel resolution (the multiple of the
screen resolution at which the image should be drawn into the memory buffer).
A subpixel resolution of two times normal along both axes (2X) looks much
better than nonantialiased drawing, but still has visible jaggies. Subpixel
resolution of 4X looks terrific, as mentioned earlier. Higher subpixel
resolutions are, practically speaking, reserved for 386 protected mode,
because they would require a buffer larger than 64K to hold the high-res
equivalent of a single scan line.

On the downside, Listing Five is very slow, even though the conversion process
from true color pixels to Hicolor pixels is limited to the bounding rectangle
for the cube being drawn, thereby saving the time that was wasted last month
drawing the empty space around the cube. It could easily be sped up by, say,
an order of magnitude, in a number of ways. First, you could implement an ASM
function that's the equivalent of memset, but stores longs (dwords) rather
than chars (bytes). In the absence of any such C library function, Listing Six
uses a loop with a pointer to a long, hardly a recipe for high performance.
Listing Five could also be sped up by doing the screen pixel construction from
each square of subpixels in assembly language using pointers rather than array
look-ups. It would also help to organize the screen pixel drawing more as a
variant rectangle fill, instead of going through DrawPixel every time, so that
the screen pointer doesn't have to be recalculated from scratch and the bank
doesn't need to be calculated and set for every pixel. Clipping each polygon
to the band before rather than after scanning it out would speed things up, as
would building an edge list for the polygons once, ahead of time, then
advancing it incrementally to scan out each band, rather than doing one
complete scan of each polygon for each band. Bigger bands would help; drawing
the whole image to the memory buffer in one burst, then converting the entire
image to Hicolor pixels in a single operation, would be ideal, but would
require a ridiculous amount of memory. (Would you believe, 31 megs for one
full 800x600 Hicolor screen at 4X resolution?)
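The memory figures quoted here follow directly from the buffer geometry. As a
rough check (the function name is hypothetical, and 4 bytes per subpixel is
taken from the 32-bit pixels described earlier):

```c
/* Bytes needed to buffer `rows` screen rows of a screen `width` pixels
   wide, at subpixel multiple `mult` along both axes, 4 bytes/subpixel. */
unsigned long BufferBytes(unsigned long width, unsigned long rows,
                          unsigned long mult)
{
    return width * mult * rows * mult * 4UL;
}
```

BufferBytes(800, 1, 4) gives 51,200 bytes for a one-raster-line band at 4X,
and BufferBytes(800, 600, 4) gives 30,720,000 bytes for the full 800x600
screen at 4X.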
Finally, to alter Listing Five for 800x600 Hicolor mode, change the parameter
passed to SetHCMode, the value of BitmapWidthInBytes, and the value of
SCREEN_WIDTH.


Further Thoughts on Antialiasing


The banded true color approach of Listings Five and Six is easily extended to
other antialiasing approaches. For example, you could, if you wished, average
together all the subpixels not within a square, but rather within a circle of
radius sqrt(2.0)/2 times the resolution multiplier around each pixel center. This
approach is a little more complicated, but it has one great virtue: An image
will be antialiased identically, regardless of its rotation.
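One way to sketch the circular test is shown below. It is only an
illustration under stated assumptions: subpixel centers sit at half-integer
positions, (i, j) are subpixel indices relative to the top-left subpixel of
the pixel's own square, and the function names are hypothetical. Doubling all
coordinates keeps the comparison in integer arithmetic: the squared radius
mult*mult/2 becomes 2*mult*mult.

```c
/* Nonzero if subpixel (i, j) lies within the circle of radius
   sqrt(2)*mult/2 centered on the middle of the pixel's square. */
int InCircle(int i, int j, int mult)
{
    long dx = 2L * i + 1 - mult;   /* twice the x distance from the center */
    long dy = 2L * j + 1 - mult;   /* twice the y distance from the center */
    return dx * dx + dy * dy <= 2L * mult * mult;
}

/* Count the subpixels that would be averaged for one screen pixel,
   scanning the pixel's own square plus a one-pixel border around it. */
int CountCircleSubpixels(int mult)
{
    int i, j, count = 0;
    for (j = -mult; j < 2 * mult; j++)
        for (i = -mult; i < 2 * mult; i++)
            if (InCircle(i, j, mult)) count++;
    return count;
}
```

Because the circle extends past the square, it picks up subpixels belonging
to neighboring screen pixels; at 4X, for instance, 24 subpixels fall inside
the circle rather than the square's 16. A banded implementation would
therefore need access to the subpixel rows just above and below the current
band.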
Why is the shape of the subpixel area that's collected into a screen pixel
important, when the maximum resolution we can actually draw with is the
resolution of the screen? I'll quote William vanRyper, from the
graphics.disp/vga conference on BIX:
"If you anti-alias an edge on the screen, and let the eye-brain pick the edge
somewhere in the gradient between the object color and the background, you can
adjust the placement of that perceptual edge by altering the ramp of the
gradient. If the number of intermediate values you can choose among is greater
than the number of gradient pixels you set (across the edge), you can adjust
the position of the perceptual edge in increments of less than a pixel. This
means you can locate the antialiased object to sub-pixel precision."
In other words, by using blends of color in a smooth, consistent gradient
across a boundary, you can get the eye to pick out the boundary location with
a precision that's greater than the resolution of the screen. This is, of
course, part and parcel of the wonderful eye/brain magic that allows color to
substitute for resolution and makes antialiasing worthwhile.
Given that we can draw images with perceived resolution higher than the
screen, consistency in subpixel placement is very important. Unfortunately,
our simple square antialiasing does not produce the same results (a consistent
color gradient) for an image rotated 45 degrees as it does for an unrotated
image -- but antialiasing based on a circular subpixel area does. So the shape
of the subpixel area used for antialiasing matters because if it's not
symmetric in all directions, boundaries will appear to wiggle as images
rotate, destroying the image of reality that antialiased animation strives to
create.
On the other hand, if you're drawing only static images, use a square subpixel
area for antialiasing; it's fast, easy, and looks just fine in that context.
As I said at the outset, we're not seeking mathematical perfection here, just
a good-looking display for the purpose at hand. If it looks good, it is good.

_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash


[LISTING ONE]

/* Looks for a Sierra Hicolor DAC; if one is present, puts the VGA into the
specified Hicolor (32K color) mode. Relies on the Tseng Labs ET4000 BIOS and
hardware; probably will not work on adapters built around other VGA chips.
Returns 1 for success, 0 for failure; failure can result from no Hicolor DAC,
too little display memory, or lack of an ET4000. Tested with Borland C++ 2.0
in C mode in the small model. */

#include <dos.h>
#define DAC_MASK 0x3C6 /* DAC pixel mask reg address, also Sierra
 command reg address when enabled */
#define DAC_WADDR 0x3C8 /* DAC write address reg address */

/* Mode selections: 0x2D=640x350; 0x2E=640x480; 0x2F=640x400; 0x30=800x600 */
int SetHCMode(int Mode) {
 int i, Temp1, Temp2, Temp3;
 union REGS regset;

 /* See if a Sierra SC1148X Hicolor DAC is present, by trying to
 program and then read back the DAC's command register. (Shouldn't be
 necessary when using the BIOS Get DAC Type function, but the BIOS function
 locks up some computers, so it's safer to check the hardware first) */
 inp(DAC_WADDR); /* reset the Sierra command reg enable sequence */
 for (i=0; i<4; i++) inp(DAC_MASK); /* enable command reg access */
 outp(DAC_MASK, 0x00); /* set command reg (if present) to 0x00, and
 reset command reg enable sequence */
 outp(DAC_MASK, 0xFF); /* command reg access no longer enabled;
 set pixel mask register to 0xFF */
 for (i=0; i<4; i++) inp(DAC_MASK); /* enable command reg access */
 /* If this is a Hicolor DAC, we should read back the 0 in the
 command reg; otherwise we get the 0xFF in the pixel mask reg */
 i = inp(DAC_MASK); inp(DAC_WADDR); /* reset enable sequence */
 if (i == 0xFF) return(0);

 /* Check for a Tseng Labs ET4000 by poking unique regs, (assumes
 VGA configured for color, w/CRTC addressing at 3D4/5) */
 outp(0x3BF, 3); outp(0x3D8, 0xA0); /* unlock extended registers */
 /* Try toggling AC R16 bit 4 and seeing if it takes */
 inp(0x3DA); outp(0x3C0, 0x16 | 0x20);
 outp(0x3C0, ((Temp1 = inp(0x3C1)) | 0x10)); Temp2 = inp(0x3C1);
 outp(0x3C0, 0x16 | 0x20); outp(0x3C0, (inp(0x3C1) & ~0x10));
 Temp3 = inp(0x3C1); outp(0x3C0, 0x16 | 0x20);
 outp(0x3C0, Temp1); /* restore original AC R16 setting */
 /* See if the bit toggled; if so, it's an ET3000 or ET4000 */
 if ((Temp3 & 0x10) || !(Temp2 & 0x10)) return(0);
 outp(0x3D4, 0x33); Temp1 = inp(0x3D5); /* get CRTC R33 setting */
 outp(0x3D5, 0x0A); Temp2 = inp(0x3D5); /* try writing to CRTC */
 outp(0x3D5, 0x05); Temp3 = inp(0x3D5); /* R33 */
 outp(0x3D5, Temp1); /* restore original CRTC R33 setting */
 /* If the register was writable, it's an ET4000 */
 if ((Temp3 != 0x05) || (Temp2 != 0x0A)) return(0);

 /* See if a Sierra SC1148X Hicolor DAC is present by querying the
 (presumably) ET4000-compatible BIOS. Not really necessary after
 the hardware check above, but generally more useful; in the
 future it will return information about other high-color DACs */
 regset.x.ax = 0x10F1; /* Get DAC Type BIOS function # */
 int86(0x10, &regset, &regset); /* ask BIOS for the DAC type */
 if (regset.x.ax != 0x0010) return(0); /* function not supported */
 switch (regset.h.bl) {
 case 0: return(0); /* normal DAC (non-Hicolor) */
 case 1: break; /* Sierra SC1148X 15-bpp Hicolor DAC */
 default: return(0); /* other high-color DAC */
 }

 /* Set Hicolor mode */
 regset.x.ax = 0x10F0; /* Set High-Color Mode BIOS function # */
 regset.h.bl = Mode; /* desired resolution */
 int86(0x10, &regset, &regset); /* have BIOS enable Hicolor mode */
 return (regset.x.ax == 0x0010); /* 1 for success, 0 for failure */
}







[LISTING TWO]

/* Demonstrates non-antialiased drawing in 640x480 Hicolor (32K color) mode on
an ET4000-based SuperVGA with a Sierra Hicolor DAC installed. Tested with
Borland C++ 2.0 in C mode in the small model. */

#include <conio.h>
#include <dos.h>
#include <stdio.h> /* printf */
#include <stdlib.h> /* exit */
#include "polygon.h"
/* Draws the polygon described by the point list PointList in color
 Color, with all vertices offset by (x,y) */
#define DRAW_POLYGON(PointList,Color,x,y) { \
 Polygon.Length = sizeof(PointList)/sizeof(struct Point); \
 Polygon.PointPtr = PointList; \
 FillCnvxPolyDrvr(&Polygon, Color, x, y, DrawHCLineList);}

void main(void);
extern int SetHCMode(int);
extern int FillCnvxPolyDrvr(struct PointListHeader *, int, int, int,
 void (*)());

extern void DrawHCLineList(struct HLineList *, int);
int BitmapWidthInBytes = 640*2; /* # of bytes per raster line */

void main()
{
 struct PointListHeader Polygon;
 static struct Point Face0[] = {{396,276},{422,178},{338,88},{288,178}};
 static struct Point Face1[] = {{306,300},{396,276},{288,178},{210,226}};
 static struct Point Face2[] = {{338,88},{266,146},{210,226},{288,178}};
 union REGS regset;

 /* Attempt to enable 640x480 Hicolor mode */
 if (SetHCMode(0x2E) == 0)
 { printf("No Hicolor DAC detected\n"); exit(0); };

 /* Draw the cube */
 DRAW_POLYGON(Face0, 0x1F, 0, 0); /* full-intensity blue */
 DRAW_POLYGON(Face1, 0x1F << 5, 0, 0); /* full-intensity green */
 DRAW_POLYGON(Face2, 0x1F << 10, 0, 0); /* full-intensity red */
 getch(); /* wait for a keypress */

 /* Return to text mode and exit */
 regset.x.ax = 0x0003; /* AL = 3 selects 80x25 text mode */
 int86(0x10, &regset, &regset);
}
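
The color values passed to DRAW_POLYGON in Listing Two (0x1F, 0x1F << 5,
0x1F << 10) imply a pixel layout of 5 bits each for red, green, and blue,
with blue in the low bits. As a minimal sketch of that packing, the helper
below builds a Hicolor pixel from three 5-bit intensities. PackHicolor is a
hypothetical name used only for illustration; it is not part of the listings.

```c
#include <assert.h>

/* Hypothetical helper (not part of the original listings): packs 5-bit
   red, green, and blue intensities (0-31 each) into one 15-bpp Hicolor
   pixel laid out as 0RRRRRGGGGGBBBBB, matching the shifts used in
   Listing Two. */
unsigned int PackHicolor(unsigned int Red, unsigned int Green,
   unsigned int Blue)
{
   assert(Red <= 31 && Green <= 31 && Blue <= 31);
   return (Red << 10) | (Green << 5) | Blue;
}
```

With this helper, the three faces in Listing Two could be drawn with
PackHicolor(0,0,31), PackHicolor(0,31,0), and PackHicolor(31,0,0).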







[LISTING THREE]

/* Draws all pixels in the list of horizontal lines passed in, in Hicolor
(32K color) mode on an ET4000-based SuperVGA. Uses a slow pixel-by-pixel
approach. Tested with Borland C++ 2.0 in C mode in the small model. */

#include <dos.h>
#include "polygon.h"
#define SCREEN_SEGMENT 0xA000
#define GC_SEGMENT_SELECT 0x3CD

void DrawPixel(int, int, int);
extern int BitmapWidthInBytes; /* # of bytes per raster line */

void DrawHCLineList(struct HLineList * HLineListPtr,
 int Color)
{
 struct HLine *HLinePtr;
 int Y, X;

 /* Point to XStart/XEnd descriptor for the first (top) horizontal line */
 HLinePtr = HLineListPtr->HLinePtr;
 /* Draw each horizontal line in turn, starting with the top one and
 advancing one line each time */
 for (Y = HLineListPtr->YStart; Y < (HLineListPtr->YStart +
 HLineListPtr->Length); Y++, HLinePtr++) {
 /* Draw each pixel in the current horizontal line in turn,
 starting with the leftmost one */
 for (X = HLinePtr->XStart; X <= HLinePtr->XEnd; X++)
 DrawPixel(X, Y, Color);
 }
}

/* Draws the pixel at (X, Y) in color Color in Hicolor mode on an
 ET4000-based SuperVGA */
void DrawPixel(int X, int Y, int Color) {
 unsigned int far *ScreenPtr, Bank;
 unsigned long BitmapAddress;

 /* Full bitmap address of pixel, as measured from address 0 to
 address 0xFFFFF. (X << 1) because pixels are 2 bytes in size */
 BitmapAddress = (unsigned long) Y * BitmapWidthInBytes + (X << 1);
 /* Map in the proper bank. Bank # is upper word of bitmap addr */
 Bank = *(((unsigned int *)&BitmapAddress) + 1);
 /* Upper nibble is read bank #, lower nibble is write bank # */
 outp(GC_SEGMENT_SELECT, (Bank << 4) | Bank);
 /* Draw into the bank */
 FP_SEG(ScreenPtr) = SCREEN_SEGMENT;
 FP_OFF(ScreenPtr) = *((unsigned int *)&BitmapAddress);
 *ScreenPtr = (unsigned int)Color;
}
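
The address arithmetic in DrawPixel can be tried out away from the hardware.
The sketch below repeats the same steps (linear address, bank number from the
upper word, offset from the lower word, bank duplicated into both nibbles of
the segment select value) using shifts and masks instead of pointer casts, so
that it does not depend on byte order or on a 16-bit int. The names
BankedAddress and LocatePixel are hypothetical, and the sketch assumes the
bank number fits in 4 bits, as it does for up to 1 Mbyte of display memory.

```c
#include <assert.h>

/* Hypothetical illustration type; not part of the original listings. */
struct BankedAddress {
   unsigned int Bank;          /* 64K bank number (assumed 0-15) */
   unsigned int Offset;        /* offset within the bank, 0-0xFFFF */
   unsigned int SegmentSelect; /* value for the segment select register:
                                  upper nibble read bank, lower nibble
                                  write bank */
};

/* Computes where pixel (X, Y) lives in banked display memory, given the
   number of bytes per raster line. Pixels are 2 bytes in size. */
struct BankedAddress LocatePixel(int X, int Y, int WidthInBytes)
{
   struct BankedAddress Result;
   unsigned long Address;

   assert(X >= 0 && Y >= 0 && WidthInBytes > 0);
   Address = (unsigned long)Y * (unsigned long)WidthInBytes +
      ((unsigned long)X << 1);
   Result.Bank = (unsigned int)(Address >> 16);
   Result.Offset = (unsigned int)(Address & 0xFFFFUL);
   Result.SegmentSelect = (Result.Bank << 4) | Result.Bank;
   return Result;
}
```

For a 1280-byte raster line, the start of line 52 lands at offset 0x400 in
bank 1, and the last pixel on the screen, (639,479), lands in bank 9.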







[LISTING FOUR]

; Draws all pixels in the list of horizontal lines passed in, in
; Hicolor (32K color) mode on an ET4000-based SuperVGA. Uses REP STOSW
; to fill each line. Tested with TASM 2.0. C near-callable as:
; void DrawHCLineList(struct HLineList * HLineListPtr, int Color);

SCREEN_SEGMENT equ 0a000h
GC_SEGMENT_SELECT equ 03cdh

HLine struc
XStart dw ? ;X coordinate of leftmost pixel in line
XEnd dw ? ;X coordinate of rightmost pixel in line
HLine ends

HLineList struc
Lngth dw ? ;# of horizontal lines
YStart dw ? ;Y coordinate of topmost line
HLinePtr dw ? ;pointer to list of horz lines
HLineList ends

Parms struc
 dw 2 dup(?) ;return address & pushed BP
HLineListPtr dw ? ;pointer to HLineList structure
Color dw ? ;color with which to fill
Parms ends

; Advances both the read and write windows to the next 64K bank.
; Note: Theoretically, a delay between IN and OUT may be needed under
; some circumstances to avoid accessing the VGA chip too quickly, but
; in actual practice, I haven't found any delay to be required.
INCREMENT_BANK macro
 push ax ;preserve fill color
 push dx ;preserve scan line start pointer
 mov dx,GC_SEGMENT_SELECT
 in al,dx ;get the current segment select
 add al,11h ;increment both the read & write banks
 out dx,al ;set the new bank #
 pop dx ;restore scan line start pointer
 pop ax ;restore fill color
 endm

 .model small
 .data
 extrn _BitmapWidthInBytes:word
 .code
 public _DrawHCLineList
 align 2
_DrawHCLineList proc near
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to our stack frame
 push si ;preserve caller's register variables
 push di
 cld ;make string instructions inc pointers
 mov ax,SCREEN_SEGMENT
 mov es,ax ;point ES to display memory for REP STOS
 mov si,[bp+HLineListPtr] ;point to the line list
 mov ax,[_BitmapWidthInBytes] ;point to the start of the
 mul [si+YStart] ; first scan line on which to draw
 mov di,ax ;ES:DI points to first scan line to
 mov al,dl ; draw; AL is the initial bank #
 ;upper nibble of AL is read bank #,
 mov cl,4 ; lower nibble is write bank # (only
 shl dl,cl ; the write bank is really needed for
 or al,dl ; this module, but it's less confusing
 ; to point both to the same place)
 mov dx,GC_SEGMENT_SELECT
 out dx,al ;set the initial bank
 mov dx,di ;ES:DX points to first scan line
 mov bx,[si+HLinePtr] ;point to the XStart/XEnd descriptor
 ; for the first (top) horizontal line
 mov si,[si+Lngth] ;# of scan lines to draw
 and si,si ;are there any lines to draw?
 jz FillDone ;no, so we're done
 mov ax,[bp+Color] ;color with which to fill
 mov bp,[_BitmapWidthInBytes] ;so we can keep everything
 ; in registers inside the loop
 ;***stack frame pointer destroyed!***
FillLoop:
 mov di,[bx+XStart] ;left edge of fill on this line
 mov cx,[bx+XEnd] ;right edge of fill
 sub cx,di
 jl LineFillDone ;skip if negative width
 inc cx ;# of pixels to fill on this line
 add di,di ;*2 because pixels are 2 bytes in size
 add dx,bp ;do we cross a bank during this line?
 jnc NormalFill ;no
 jz NormalFill ;no, line ends exactly at the bank boundary
 ;yes, there is a bank crossing on this
 ; line; figure out where
 sub dx,bp ;point back to start of line
 add di,dx ;offset of left edge of fill
 jc CrossBankBeforeFilling ;raster splits before the left
 ; edge of fill
 add cx,cx ;fill width in bytes (pixels * 2)
 add di,cx ;do we split during the fill area?
 jnc CrossBankAfterFilling ;raster splits after the right
 jz CrossBankAfterFilling ; edge of fill
 ;bank boundary falls within fill area;
 ; draw in two parts, one in each bank
 sub di,cx ;point back to start of fill area
 neg di ;# of bytes left before split
 sub cx,di ;# of bytes to fill to the right of
 ; the bank split
 push cx ;remember right-of-split fill width
 mov cx,di ;# of left-of-split bytes to fill
 shr cx,1 ;# of left-of-split words to fill
 neg di ;offset at which to start filling
 rep stosw ;fill left-of-split portion of line
 pop cx ;get back right-of-split fill width
 shr cx,1 ;# of right-of-split words to fill
 ;advance to the next bank
 INCREMENT_BANK ;point to the next bank (DI already
 ; points to offset 0, as desired)
 rep stosw ;fill right-of-split portion of line
 add dx,bp ;point to the next scan line
 jmp short CountDownLine ; (already advanced the bank)
;======================================================================
 align 2 ;fill area is entirely to the left of
CrossBankAfterFilling: ; the bank boundary
 sub di,cx ;point back to start of fill area
 shr cx,1 ;CX = fill width in pixels
 jmp short FillAndAdvance ;doesn't split until after the
 ; fill area, so handle normally
;======================================================================
 align 2 ;fill area is entirely to the right of
CrossBankBeforeFilling: ; the bank boundary
 INCREMENT_BANK ;first, point to the next bank, where
 ; the fill area resides
 rep stosw ;fill this scan line
 add dx,bp ;point to the next scan line
 jmp short CountDownLine ; (already advanced the bank)
;======================================================================
 align 2 ;no bank boundary problems; just fill
NormalFill: ; normally
 sub dx,bp ;point back to start of line
 add di,dx ;offset of left edge of fill
FillAndAdvance:
 rep stosw ;fill this scan line
LineFillDone:
 add dx,bp ;point to the next scan line
 jnc CountDownLine ;didn't cross a bank boundary
 INCREMENT_BANK ;did cross, so point to the next bank
CountDownLine:
 add bx,size HLine ;point to the next line descriptor
 dec si ;count off lines to fill
 jnz FillLoop
FillDone:
 pop di ;restore caller's register variables
 pop si
 pop bp ;restore caller's stack frame
 ret
;======================================================================
_DrawHCLineList endp
 end
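
The bank-crossing cases that Listing Four handles with carry-flag tests can
be restated in C for checking by hand. The sketch below is not a translation
of the assembly code; it only computes how many pixels of one horizontal fill
land in the bank that contains the left edge and how many spill into the next
bank. FillSplit and SplitFillAtBank are hypothetical names, and the sketch
assumes that a fill is shorter than one bank, so it crosses at most one
boundary.

```c
#include <assert.h>

#define BANK_SIZE 0x10000UL   /* 64K bytes per bank */

/* Hypothetical illustration type; not part of the original listings. */
struct FillSplit {
   unsigned int PixelsInFirstBank; /* pixels in the bank holding XStart */
   unsigned int PixelsInNextBank;  /* pixels past the bank boundary */
};

/* Splits the fill from XStart to XEnd (inclusive) on line Y at the next
   64K bank boundary. Pixels are 2 bytes in size. */
struct FillSplit SplitFillAtBank(int XStart, int XEnd, int Y,
   int WidthInBytes)
{
   struct FillSplit Result = {0, 0};
   unsigned long Start, Bytes, Room;

   assert(XStart >= 0 && Y >= 0 && WidthInBytes > 0);
   if (XEnd < XStart) return Result;   /* negative width: nothing to fill */
   Start = (unsigned long)Y * (unsigned long)WidthInBytes +
      ((unsigned long)XStart << 1);
   Bytes = ((unsigned long)(XEnd - XStart) + 1) << 1;
   /* Bytes remaining between the left edge and the end of its bank */
   Room = BANK_SIZE - (Start & (BANK_SIZE - 1));
   if (Bytes <= Room) {
      Result.PixelsInFirstBank = (unsigned int)(Bytes >> 1);
   } else {
      Result.PixelsInFirstBank = (unsigned int)(Room >> 1);
      Result.PixelsInNextBank = (unsigned int)((Bytes - Room) >> 1);
   }
   return Result;
}
```

With 1280 bytes per line, the first boundary falls on line 51 at pixel 128,
so a fill from X=100 to X=199 on that line puts 28 pixels in bank 0 and 72
in bank 1.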







[LISTING FIVE]

/* Demonstrates unweighted antialiased drawing in 640x480 Hicolor (32K color)
mode. Tested with Borland C++ 2.0 in C mode in the small model. */

#include <conio.h>
#include <dos.h>
#include <stdio.h> /* printf, scanf */
#include <stdlib.h>
#include <string.h>
#include "polygon.h"
/* Draws the polygon described by the point list PointList in the
 color specified by RED, GREEN, and BLUE, with all vertices
 offset by (x,y), to ScanLineBuffer, at ResMul multiple of
 horizontal and vertical resolution. The address of ColorTemp is
 cast to an int to satisfy the prototype for FillCnvxPolyDrvr; this
 trick will work only in a small data model */
#define DRAW_POLYGON_HIGH_RES(PointList,RED,GREEN,BLUE,x,y,ResMul) { \
 Polygon.Length = sizeof(PointList)/sizeof(struct Point); \
 Polygon.PointPtr = PointTemp; \
 /* Multiply all vertical & horizontal coordinates */ \
 for (k=0; k<sizeof(PointList)/sizeof(struct Point); k++) { \
 PointTemp[k].X = PointList[k].X * ResMul; \
 PointTemp[k].Y = PointList[k].Y * ResMul; \
 } \
 ColorTemp.Red=RED; ColorTemp.Green=GREEN; ColorTemp.Blue=BLUE; \
 FillCnvxPolyDrvr(&Polygon, (int)&ColorTemp, x, y, DrawBandedList);}
#define SCREEN_WIDTH 640
#define SCREEN_SEGMENT 0xA000

void main(void);
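extern void DrawPixel(int, int, int); /* color is a 16-bit Hicolor pixel */
extern void DrawBandedList(struct HLineList *, struct RGB *);
extern int SetHCMode(int);
extern int FillCnvxPolyDrvr(struct PointListHeader *, int, int, int,
 void (*)());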

/* Table of gamma corrected mappings of linear color intensities in
 the range 0-255 to the nearest pixel values in the range 0-31,
 assuming a gamma of 2.3 */
static unsigned char ColorMappings[] = {
 0, 3, 4, 4, 5, 6, 6, 6, 7, 7, 8, 8, 8, 8, 9, 9, 9,10,10,10,
 10,10,11,11,11,11,11,12,12,12,12,12,13,13,13,13,13,13,14,14,
 14,14,14,14,14,15,15,15,15,15,15,15,16,16,16,16,16,16,16,16,
 17,17,17,17,17,17,17,17,17,18,18,18,18,18,18,18,18,18,19,19,
 19,19,19,19,19,19,19,19,20,20,20,20,20,20,20,20,20,20,20,21,
 21,21,21,21,21,21,21,21,21,21,21,22,22,22,22,22,22,22,22,22,
 22,22,22,23,23,23,23,23,23,23,23,23,23,23,23,24,24,24,24,24,
 24,24,24,24,24,24,24,24,24,25,25,25,25,25,25,25,25,25,25,25,
 25,25,25,26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,27,27,
 27,27,27,27,27,27,27,27,27,27,27,27,27,27,28,28,28,28,28,28,
 28,28,28,28,28,28,28,28,28,28,28,29,29,29,29,29,29,29,29,29,
 29,29,29,29,29,29,29,29,30,30,30,30,30,30,30,30,30,30,30,30,
 30,30,30,30,30,30,31,31,31,31,31,31,31,31,31,31};
/* Pointer to buffer in which high-res scanned data will reside */
struct RGB *ScanLineBuffer;
int ScanBandStart, ScanBandEnd; /* top & bottom of each high-res
 band we'll draw to ScanLineBuffer */
int ScanBandWidth; /* # subpixels across each scan band */
int BitmapWidthInBytes = 640*2; /* # of bytes per raster line in
 Hicolor VGA display memory */
void main()
{
 int i, j, k, m, Red, Green, Blue, jXRes, kXWidth;
 int SubpixelsPerMegapixel;
 unsigned int Megapixel, ResolutionMultiplier;
 long BufferSize;
 struct RGB ColorTemp;
 struct PointListHeader Polygon;
 struct Point PointTemp[4];
 static struct Point Face0[] =
 {{396,276},{422,178},{338,88},{288,178}};
 static struct Point Face1[] =
 {{306,300},{396,276},{288,178},{210,226}};
 static struct Point Face2[] =
 {{338,88},{266,146},{210,226},{288,178}};
 int LeftBound=210, RightBound=422, TopBound=88, BottomBound=300;
 union REGS regset;

 printf("Subpixel resolution multiplier:");
 scanf("%u", &ResolutionMultiplier);
 SubpixelsPerMegapixel = ResolutionMultiplier*ResolutionMultiplier;
 ScanBandWidth = SCREEN_WIDTH*ResolutionMultiplier;

 /* Get enough space for one scan line scanned out at high
 resolution horz and vert (each pixel is 4 bytes) */
 if ((BufferSize = (long)ScanBandWidth*4*ResolutionMultiplier) >
 0xFFFF) {
 printf("Band won't fit in one segment\n"); exit(0); }
 if ((ScanLineBuffer = malloc((int)BufferSize)) == NULL) {
 printf("Couldn't get memory\n"); exit(0); }

 /* Attempt to enable 640x480 Hicolor mode */
 if (SetHCMode(0x2E) == 0)
 { printf("No Hicolor DAC detected\n"); exit(0); };

 /* Scan out the polygons at high resolution one screen scan line at
 a time (ResolutionMultiplier high-res scan lines at a time) */
 for (i=TopBound; i<=BottomBound; i++) {
 /* Set the band dimensions for this pass */
 ScanBandEnd = (ScanBandStart = i*ResolutionMultiplier) +
 ResolutionMultiplier - 1;
 /* Clear the drawing buffer */
 memset(ScanLineBuffer, 0, BufferSize);
 /* Draw the current band of the cube to the scan line buffer */
 DRAW_POLYGON_HIGH_RES(Face0,0xFF,0,0,0,0,ResolutionMultiplier);
 DRAW_POLYGON_HIGH_RES(Face1,0,0xFF,0,0,0,ResolutionMultiplier);
 DRAW_POLYGON_HIGH_RES(Face2,0,0,0xFF,0,0,ResolutionMultiplier);

 /* Coalesce subpixels into normal screen pixels (megapixels) and draw them */
 for (j=LeftBound; j<=RightBound; j++) {
 jXRes = j*ResolutionMultiplier;
 /* For each screen pixel, sum all the corresponding
 subpixels, for each color component */
 for (k=Red=Green=Blue=0; k<ResolutionMultiplier; k++) {
 kXWidth = k*ScanBandWidth;
 for (m=0; m<ResolutionMultiplier; m++) {
 Red += ScanLineBuffer[jXRes+kXWidth+m].Red;
 Green += ScanLineBuffer[jXRes+kXWidth+m].Green;
 Blue += ScanLineBuffer[jXRes+kXWidth+m].Blue;
 }
 }
 /* Calc each color component's average brightness; convert
 that into a gamma corrected portion of a Hicolor pixel,
 then combine the colors into one Hicolor pixel */
 Red = ColorMappings[Red/SubpixelsPerMegapixel];
 Green = ColorMappings[Green/SubpixelsPerMegapixel];
 Blue = ColorMappings[Blue/SubpixelsPerMegapixel];
 Megapixel = (Red << 10) + (Green << 5) + Blue;
 DrawPixel(j, i, Megapixel);
 }
 }
 getch(); /* wait for a keypress */

 /* Return to text mode and exit */
 regset.x.ax = 0x0003; /* AL = 3 selects 80x25 text mode */
 int86(0x10, &regset, &regset);
}
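
The coalescing step in Listing Five is an unweighted (box) filter: each
screen pixel is the plain average of the square block of subpixels that it
covers. The sketch below isolates that step so it can be run on a small
buffer in memory. AverageMegapixel and DemoEdgePixel are hypothetical names,
and the 8-bit to 5-bit conversion here is a plain shift right by 3, an
assumption made to keep the example short; it stands in for the
gamma-corrected ColorMappings table, which gives different (brighter)
results for partial coverage.

```c
#include <assert.h>

/* Same layout as struct RGB in POLYGON.H */
struct RGB { unsigned char Red, Green, Blue, Spare; };

/* Hypothetical helper: averages the ResMul x ResMul block of subpixels
   that screen column Column covers in a band buffer BandWidth subpixels
   wide, and returns a 15-bpp pixel. Uses a linear >> 3 mapping in place
   of the gamma table (an assumption made for brevity). */
unsigned int AverageMegapixel(const struct RGB *Band, int BandWidth,
   int Column, int ResMul)
{
   long Red = 0, Green = 0, Blue = 0;
   int Row, Sub, Count = ResMul * ResMul;
   const struct RGB *Ptr;

   assert(ResMul > 0 && BandWidth > 0 && Column >= 0);
   for (Row = 0; Row < ResMul; Row++) {
      Ptr = Band + (long)Row * BandWidth + (long)Column * ResMul;
      for (Sub = 0; Sub < ResMul; Sub++, Ptr++) {
         Red += Ptr->Red; Green += Ptr->Green; Blue += Ptr->Blue;
      }
   }
   Red = (Red / Count) >> 3;     /* average, then 8 bits -> 5 bits */
   Green = (Green / Count) >> 3;
   Blue = (Blue / Count) >> 3;
   return (unsigned int)((Red << 10) | (Green << 5) | Blue);
}

/* A band two screen pixels wide at 2X resolution: screen column 0 is
   fully covered by red, column 1 is half covered (a polygon edge). */
unsigned int DemoEdgePixel(int Column)
{
   static const struct RGB Band[8] = {
      {255,0,0,0}, {255,0,0,0}, {255,0,0,0}, {0,0,0,0},
      {255,0,0,0}, {255,0,0,0}, {255,0,0,0}, {0,0,0,0}
   };
   return AverageMegapixel(Band, 4, Column, 2);
}
```

The half-covered column averages to a linear intensity of 127, which the
shift maps to 15; by my reading of the ColorMappings table, Listing Five
would map the same intensity to 23.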







[LISTING SIX]

/* Draws pixels from the list of horizontal lines passed in, to a 32-bpp
buffer; drawing takes place only for scan lines between ScanBandStart and
ScanBandEnd, inclusive; drawing goes to ScanLineBuffer, with the scan line at
ScanBandStart mapping to the first scan line in ScanLineBuffer. Note that
Color here points to an RGB structure that maps directly to the buffer's pixel
format, rather than containing a 16-bit integer. Tested with Borland C++ 2.0
in C mode in the small model */

#include "polygon.h"

extern struct RGB *ScanLineBuffer; /* drawing goes here */
extern int ScanBandStart, ScanBandEnd; /* limits of band to draw */
extern int ScanBandWidth; /* # of subpixels across scan band */

void DrawBandedList(struct HLineList * HLineListPtr,
 struct RGB *Color)
{
 struct HLine *HLinePtr;
 int Length, Width, YStart = HLineListPtr->YStart, i;
 struct RGB *BufferPtr, *WorkingBufferPtr;

 /* Done if fully off the bottom or top of the band */
 if (YStart > ScanBandEnd) return;
 Length = HLineListPtr->Length;
 if ((YStart + Length) <= ScanBandStart) return;

 /* Point to XStart/XEnd descriptor for the first (top) horizontal line */
 HLinePtr = HLineListPtr->HLinePtr;

 /* Confine drawing to the specified band */
 if (YStart < ScanBandStart) {
 /* Skip ahead to the start of the band */
 Length -= ScanBandStart - YStart;
 HLinePtr += ScanBandStart - YStart;
 YStart = ScanBandStart;
 }
 if (Length > (ScanBandEnd - YStart + 1))
 Length = ScanBandEnd - YStart + 1;

 /* Point to the start of the first scan line on which to draw */
 BufferPtr = ScanLineBuffer + (YStart-ScanBandStart)*ScanBandWidth;

 /* Draw each horizontal line within the band in turn, starting with
 the top one and advancing one line each time */
 while (Length-- > 0) {
 /* Fill whole horiz line with Color if it has positive width */
 if ((Width = HLinePtr->XEnd - HLinePtr->XStart + 1) > 0) {
 WorkingBufferPtr = BufferPtr + HLinePtr->XStart;
 for (i = 0; i < Width; i++) *WorkingBufferPtr++ = *Color;
 }
 HLinePtr++; /* point to next scan line X info */
 BufferPtr += ScanBandWidth; /* point to start of next line */
 }
}
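
The band clipping at the top of DrawBandedList is plain integer arithmetic
and can be checked on its own. The sketch below follows the same steps as
Listing Six but returns the result instead of drawing: the index of the
first line descriptor that falls inside the band, and the number of lines to
draw. BandClip and ClipToBand are hypothetical names used only for
illustration.

```c
#include <assert.h>

/* Hypothetical illustration type; not part of the original listings. */
struct BandClip {
   int FirstLine; /* index into the line list of the first line drawn */
   int Count;     /* number of lines to draw (0 if none) */
};

/* Clips a run of Length scan lines starting at YStart to the band
   BandStart..BandEnd, inclusive. */
struct BandClip ClipToBand(int YStart, int Length, int BandStart,
   int BandEnd)
{
   struct BandClip Result = {0, 0};

   assert(BandStart <= BandEnd);
   /* Done if fully off the bottom or top of the band */
   if (YStart > BandEnd || YStart + Length <= BandStart) return Result;
   if (YStart < BandStart) {
      /* Skip ahead to the start of the band */
      Result.FirstLine = BandStart - YStart;
      Length -= Result.FirstLine;
      YStart = BandStart;
   }
   /* Stop at the bottom of the band */
   if (Length > BandEnd - YStart + 1) Length = BandEnd - YStart + 1;
   Result.Count = Length;
   return Result;
}
```

For example, a polygon spanning high-res lines 352 through 451, clipped to
the band 400 through 403, yields FirstLine = 48 and Count = 4.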







[LISTING SEVEN]

/* POLYGON.H: Header file for polygon-filling code */

/* Describes a single point (used for a single vertex) */
struct Point {
 int X; /* X coordinate */
 int Y; /* Y coordinate */
};

/* Describes a series of points (used to store a list of vertices that
describe a polygon; each vertex is assumed to connect to the two adjacent
vertices, and the last vertex is assumed to connect to the first) */
struct PointListHeader {
 int Length; /* # of points */
 struct Point * PointPtr; /* pointer to list of points */
};

/* Describes the beginning and ending X coordinates of a single
 horizontal line */
struct HLine {
 int XStart; /* X coordinate of leftmost pixel in line */
 int XEnd; /* X coordinate of rightmost pixel in line */
};

/* Describes a Length-long series of horizontal lines, all assumed to be on
contiguous scan lines starting at YStart and proceeding downward (used to
describe scan-converted polygon to low-level hardware-dependent drawing
code)*/
struct HLineList {
 int Length; /* # of horizontal lines */
 int YStart; /* Y coordinate of topmost line */
 struct HLine * HLinePtr; /* pointer to list of horz lines */
};

/* Describes a color as an RGB triple, plus one byte for other info */
struct RGB { unsigned char Red, Green, Blue, Spare; };








November, 1991
PROGRAMMER'S BOOKSHELF


Making Contact with Computers




Ray Duncan


Until very recently, the architects of human-machine interfaces have been
unsung heroes (and villains) in a rarely visible and even more rarely
appreciated specialty. For example, if you look inside the case of a classic
black desk telephone, you don't find the names of the patient designers and
researchers at AT&T who spent years refining the ergonomics of that humble
instrument. There was no equivalent of Steven Jobs at Bell Labs to immortalize
these fellows! Similarly, most of the older literature on human-machine
interfaces lies far off the beaten path in obscure journals, conference
proceedings, and graduate student theses.
But the rapid proliferation of PCs and their "productivity" applications over
the last decade has focused public attention on interface issues that were
previously only dimly perceived, because the interaction between a user and a
computer program is so much more intimate and intense than the interface
between a user and a toaster, a telephone, or even a typewriter. You can put
up with a toaster whose darkness control is difficult to adjust, or a
telephone that is shaped like Mickey Mouse, but it's frustrating and
aggravating to work with a computer program whose interface is inconsistent,
inefficient, or unforgiving. When you're forced to use a "bad" program, you
take it personally.
Unfortunately, the track record of the PC software industry with regard to
human interfaces has not been good. Experienced interface designers are often
brought into the software development process late, or are not consulted at
all. As Ted Nelson of Xanadu fame has commented:
Historical accident has kept programmers in control of a field in which most
of them have no aptitude: the artistic integration of the mechanisms they work
with. It is nice that engineers and programmers and software executives have
found a new form of creativity with which to find a sense of personal
fulfillment. It is just unfortunate that they have to inflict the result on
users.
The Macintosh stands out as the one personal computer where human interface
designers, psychologists, and graphics artists were involved with the hardware
and software from its earliest stages. The elegance of the Macintosh System
and Finder and the powerful, yet consistent, user interface found in Macintosh
applications are not a happy accident. They have resulted from a determined
effort by Apple to establish, evolve, and evangelize the user-interface
guidelines that were created with the help of these consultants, and to
stigmatize applications that flout the guidelines as "un-Mac-like."
DOS and Unix software vendors, on the other hand, have only awakened to
esthetic considerations rather recently. Up until a year or two ago, the
"graphical user interface" of a Sun workstation was little more than a clock
icon and a command-line interpreter running in a movable window. Similarly,
the graphical user interface of Microsoft Windows, Versions 1.03 and 2.x, was
notorious for its clumsiness, ugliness, and the counterintuitive assignments
of its so-called "accelerator" keys. (Microsoft publicized its enlistment of
graphics designers in Windows, Version 3, but this mainly called attention to
its failure to consult such experts much earlier.)


The Interface World According to Apple


Addison-Wesley's recently released book The Art of Human-Computer Interface
Design started out as an internal Apple project to organize and document the
company's experience for the benefit of new employees. As the project became
more ambitious, it was transformed into a book proposal, and a call for papers
was sent out. Brenda Laurel, the editor for the project, collected proposals
and commissioned articles over a two-year period. The first drafts of the
articles were distributed to all of the authors, the authors were gathered
together for a three-day conference at Asilomar to share ideas, and then the
papers were revised a second and third time to build in cross-references and
add discussions of recent developments.
The result of this cross-pollination is a massive opus divided rather
arbitrarily into five sections: Creativity and Design, Users and Contexts,
Sermons, Technique and Technology, and New Directions. The 50-odd chapters are
a hodgepodge of essays, interviews, anecdotes, musings, and polemics --
ranging from insightful (Alan Kay and Scott Kim) to vague and pretentious
(Jean-Louis Gassee) to virtually incomprehensible (Timothy Leary). The most
engaging material is often found where you least expect it, such as the
description of a touch-screen/Mac II/voice synthesis interface constructed by
Apple researchers for Koko, a 260-pound gorilla. (The computer's case is built
out of one-half inch carbonate, one-inch by two-inch aluminum girders, and
one-inch tempered glass, with slots for passive ventilation that channel
foreign materials such as bananas and excrement away from the CPU.)
Although The Art of Human-Computer Interface Design is enjoyable reading and
is well worth your time, the quality of the writing is somewhat uneven and the
focus is erratic. Furthermore, the book is not cohesive or detailed enough to
serve as a primary textbook or as a design guide for professionals. The
emphasis on the Apple graphical user interface also limits the book's
usefulness -- I'm happy to stipulate that the Apple System 7 is the best GUI
on a mass-market computer today, but there are certainly worthwhile
innovations in NextStep, Motif, and even (heaven help us) OS/2 Presentation
Manager that could have been discussed, not to mention the stylus-oriented
platforms such as PenPoint that are looming on the horizon.
The most disappointing aspect of this book, for me, is its conservatism.
Within a very few years, near-microscopic self-contained multi-MIPS 32-bit (or
64-bit?) processors will be manufactured for pennies, wireless networking will
be the norm, and real-time access to information services will be available at
any point on the earth's surface, courtesy of satellites and digital cellular
telephony. The implications are literally mind-boggling. But most of the
authors of The Art of Human-Computer Interface Design don't see farther ahead
than two-handed Mac manipulations (a track ball in the left hand and a mouse
in the right), semi-smart voice-mail, and head-mounted displays combined with
solenoid-encrusted gloves for cartoon-like virtual realities.


When Life Imitates Art


Where might we find more adventurous thinking on human-computer interfaces?
The literary genre known as science fiction seems like a reasonable candidate.
Over the years, science fiction authors have "predicted" a host of
technological advances, from geosynchronous communication satellites to
computer viruses. But a preliminary glance into the science fiction classics
isn't likely to impress you. Almost without exception, the "Old Masters" were
unable to project computer technology in any direction other than bigger and
better mainframes, some of which became sentient merely as a function of their
size rather than as the result of any advances in hardware or software. For
illustrations of this focal imaginative deficit, see 2001 (Arthur C. Clarke),
the Foundation series (Isaac Asimov), Cities In Flight (James Blish), Colossus
(D.F. Jones), and The Moon is a Harsh Mistress (Robert A. Heinlein).
However, there is an emerging group of science fiction authors who have grown
up with computer technology and can conceive of more provocative outcomes for
the human-computer interface. The novels written by these authors are
collectively known as "cyberpunk" and are characterized by a vivid and
sometimes disorienting "in-your-face" style, as well as an aggressive
extrapolation and incorporation of cutting-edge technology. The cyberpunk
authors also seem to share a fatalistic outlook that the world will inexorably
become more noisy, dirty, stressful, crowded, and corrupt -- but perhaps this
is just the angst of the "X Generation" rather than a property of cyberpunk
per se.
For those of you who are interested in having your consciousness raised,
broadened, or perhaps even assaulted a little, I'm going to make a few
cyberpunk reading recommendations. The first, last, and most important book to
buy is Neuromancer, by William Gibson. The characters in Neuromancer inhabit a
frantic, depersonalized world where all true political power has devolved to
multinational cartels descended from the Japanese zaibatsus, prosthetic
organs and mind-altering drugs are as readily available as soft drinks (if
you have the money), and the age-old mortal sins have been supplanted by the
new crimes and vices related to the theft or subversion of other people's
data. Gibson crystallized the concept of "cyberspace"--hustlers of his world
"jack in" to "cyberspace decks" with which they can perceive the
world-girdling networks directly as a surreal terrain called "the matrix."
"The matrix has its roots in primitive arcade games," said the voice-over, "in
early graphics programs and military experimentation with cranial jacks." On
the Sony, a two-dimensional space war faded behind a forest of mathematically
generated ferns, demonstrating the spacial possibilities of logarithmic
spirals; cold blue military footage burned through, lab animals wired into
test systems, helmets feeding into fire control circuits of tanks and war
planes. "Cyberspace. A consensual hallucination experienced daily by billions
of legitimate operators, in every nation, by children being taught
mathematical concepts... A graphic representation of data abstracted from the
banks of every computer in the human system. Unthinkable complexity. Lines of
light ranged in the nonspace of the mind, clusters and constellations of data.
Like city lights, receding..."
He settled the black terry sweatband across his forehead, careful not to
disturb the flat Sendai electrodes. He stared at the deck in his lap, not
really seeing it, seeing instead the shop window on Ninsei, the chromed
shuriken burning with reflected neon... He closed his eyes. Found the ridged
face of the power stud. And in the blood-lit dark behind his eyes, silver
phosphenes boiling in from the edge of space, hypnagogic images jerking past
like film compiled from random frames. Symbols, figures, faces, a blurred
fragmented mandala of visual information... A gray disk, the color of Chiba
sky. Disk beginning to rotate, faster, becoming a sphere of paler grey.
Expanding--
And flowed, flowered for him, fluid neon origami trick, the unfolding of his
distanceless home, his country, transparent 3D chessboard to infinity. Inner
eye opening to the stepped scarlet pyramid of the Eastern Seaboard Fission
Authority burning beyond the green cubes of Mitsubishi Bank of America, and
high and very far away he saw the spiral arms of military systems, forever
beyond his reach.
Bruce Sterling's Islands in the Net is, on the surface, a vision of a kinder,
gentler future than Gibson's. His characters jog on the beach and are
ecologically oriented, the nuclear family in some form or other is still
important, a New-Age philosophy of the Optimal Persona and a low-energy,
Gaia-friendly school of architecture prevails. All this, however, plays out
against a backdrop of political fragmentation, high-tech terrorism, tiny
third-world nations that survive by harboring data pirates and skimming a
percentage of their profits, and near-total depletion of the world's natural
resources. And again, the whole world's facilities for information storage,
business transactions, education, and even recreation have been subsumed into
an all-knowing, all-embracing, all-pervading network.
Islands in the Net is one of the most absorbing science fiction books I've
ever read, but I also found it to be one of the most disturbing. This is, I
guess, because the near-future world it paints feels frighteningly plausible;
everything is a reasonable extrapolation of existing trends, with no appeal to
the invention of magical new technologies. In contrast to Sterling, who
manipulates global sociopolitical issues with ease, Pat Cadigan turns the
reader inward and explores the layers of consciousness. Her excellent but
little-known book Mindplayers portrays a society where one of the forbidden
forms of amusement is the induction of artificial psychoses with a direct
computer-to-mind interface called a "madcap." Those who get caught by the
Brain Police are required to undergo corrective therapy.
As the final item in this little cyberpunk sampler, I offer Vernor Vinge's
Marooned in Realtime. Although this book is set in the same future world as
Vinge's earlier book, The Peace War, it addresses a far more cosmic
question--the next evolutionary step for the human race--embedded in a
crackling detective yarn. The narrator, an ex-cop, is a sort of Rip Van Winkle
of the space age; he and a handful of others emerge from stasis bubbles to
find the residues of an unimaginably high-tech infrastructure on an empty
world. With the physical evidence blurred past recognition by the passage of
millions of years, the question of the age is: Where did everyone else go?
Some interpret the event as an epidemic, others as a war between nations or
with invading aliens. But one of the other survivors speculates:
"We were on the exponential track... By 2200, all but the blind could see that
something fantastic lay in our immediate future. We had practical immortality.
We had the beginnings of interstellar travel. We had networks that effectively
increased human intelligence--with bigger increases coming... And intelligence
is the basis of all progress. My guess is that by mid-century, any goal--any
goal you could state objectively without internal contradictions--could be
achieved. And what would life be like fifty years after that? There would
still be goals, and there would still be striving, but not what we could
understand.
"To call that time 'The Extinction' is absurd. It was a Singularity, a place
where extrapolation breaks down and new models must be applied. And those new
models are beyond our intelligence... Mankind simply graduated, and you and I
and the others missed graduation night."


Through the Looking Glass


Vinge's anticipation of a Singularity brings me around to the nonfiction book
I was thinking of when I began writing this column: Disappearing Through the
Skylight, by O.B. Hardison, Jr. This book discusses how the accelerating pace
of developments in computers, quantum physics, cosmology, chaos theory,
molecular biology, and numerous other disciplines has already thrust us
through what amounts to a Singularity. Our understanding of the universe and
our place in it has changed in fundamental ways, and the reverberations of
this change find their expression in such mediums as concrete poetry,
neomodernist architecture, dissonant music, and minimalist styles of painting.
Computers now share the human environment. Most obviously they exhibit
rudimentary intelligence. They also have been equipped with arms and grippers
and legs, and in this form they have begun to act physically on the world
around them and modify it. Inevitably, they affect the sense of human
identity. Is the mind a machine--and a relatively simple one at that, once the
trick of programming with neurons is understood? Is the claim of humanity to
uniqueness disappearing along with the claim of each human to a separate
identity shaped by a local habitation and a name? Is the idea of what it is to
be human disappearing, along with so many other ideas, through the modern
skylight?
In its fearless exploration of inner and outer worlds, modern culture has
evidently reached a turning point--a kind of phase transition from one set of
values to another. Crossing the barrier that separates the phases is another
kind of disappearance. The nature of that barrier is nicely characterized in a
phrase developed by science in connection with the search for extraterrestrial
life: "horizon of invisibility." A horizon of invisibility cuts across the
geography of modern culture. Those who have passed through it cannot put their
experience into familiar words and images because the languages they have
inherited are inadequate to the new worlds they inhabit. They therefore
express themselves in metaphors, paradoxes, contradictions, and abstractions
rather than languages that "mean" in the traditional way--in assertions that
are apparently incoherent or collages using fragments of the old to create
enigmatic symbols of the new. The most obvious case in point is modern
physics, which confronts so many paradoxes that physicists like Paul Dirac and
Werner Heisenberg have concluded that traditional languages are, for better or
worse, simply unable to represent the world that science has forced on them.
In "Quantum Mechanics and a Talk with Einstein," Heisenberg remarks, "I assume
that the mathematical scheme works, but no link with the traditional language
has been established so far." The same comment might be made about the
relation between the twentieth-century languages of Cubism, collage, Dada, and
concrete poetry and the visual and verbal languages that preceded them.
Disappearing Through the Skylight is so wide-ranging that it defies
summarization here. Hardison explains and critiques everything from modern
poetry to experiments in artificial reality with an insight and authority that
most of us would be delighted to be able to apply to a single discipline. The
book's theme is not computers, but computers are found in every part of the
book, because computers are rapidly becoming an integral part of our culture
(and hence, "disappearing"). Hardison's analysis of the current research into
artificial intelligence is fascinating, and his speculations on the evolution
of silicon life are startling. Buy this book.









November, 1991
OF INTEREST





Digital Research has released DR DOS 6.0. This latest version has several
significant new features, including advanced memory management and disk
caching, password protection for files and directories, and on-the-fly
compression. For instance, using Memory-MAX memory management, DR DOS 6.0
automatically loads itself--including buffers, device drivers, memory-resident
programs, and network drivers--into upper and high memory, freeing up to 628K
of main memory for applications.
The DiskMAX compression feature provides a file compression system that can be
optionally invoked to increase disk space by 100 percent or more. Also present
is a disk cache based on Super PC Kwick (from Multisoft), which helps DOS and
Windows applications run faster by keeping data in memory. Data to be written
to disk is buffered in memory, resulting in faster write operations. DiskMAX
can also defragment hard disks, safely reorganizing files into contiguous
blocks.
TaskMAX is a task switcher that allows you to load up to 20 applications
simultaneously and switch between them. It can run both from the operating
system's graphical shell and the command line through a fully configurable
hotkey selection. DR DOS 6.0 can use extended or expanded memory to speed task
switching.
Additionally, Version 6.0 enables you to protect files, subdirectories, and
disk partitions, or even entire environments through a power-on password
function and software keyboard lock; to recover accidentally erased files; and
to consult online documentation via a full hypertext facility.
DR DOS 6.0 costs $99; upgrades are $24.95. Reader service no. 20.
Digital Research Inc. 70 Garden Court Monterey, CA 93942 408-646-6016
P. J. Plauger's latest book, The Standard C Library, is now available from
Prentice Hall. Offering comprehensive treatment of the ANSI and ISO standards
for the C Library, the volume features practical advice on use of all 15
header files in the standard C library. Each chapter is devoted to an
individual section of the library, including topics such as function use,
implementation specifics, testing methods, and complete code for the ANSI
standard C library.
The book also covers library design and implementation; the concepts, design
issues, and trade-offs associated with library building; and
internationalization issues and writing locale-independent programs.
The Standard C Library is $38 (ISBN 0-13-838012-0). Reader service no. 21.
Prentice Hall Order Dept. 200 Old Tappan Rd. Old Tappan, NJ 07675 201-767-5937
QuickC for Windows is the new Windows applications development system from
Microsoft that does not require use of the Windows SDK. It includes an
interface drawing and code generation tool, a C compiler, an integrated
debugger, an editor, and resource tools such as an image editor, a dialog
editor, and a resource compiler. All these tools run from within an integrated
Windows-hosted environment and are incorporated in a Toolbar to simplify
selection of development tasks such as launching the building of a program,
setting breakpoints, and single-stepping through code.
Also included in the package are QuickCASE:W and the QuickWin library.
QuickCASE:W simplifies the interface-building process: You can select standard
interface parts and it will generate the C code required for the interface and
the related files. The code regeneration technology in QuickCASE:W lets you
move back and forth between QuickCASE:W and the QuickC for Windows
environment, hand-customizing C code and making visual changes, and have all
changes automatically incorporated after the C code is generated again.
QuickC offers easy conversion of existing DOS applications to the Windows
environment through two different channels: With the QuickWin runtime library
you can convert DOS C programs directly to Windows programs in minutes, just
by relinking the program; otherwise you can change a DOS application into a
Windows DLL, then build a Windows application interface with QuickCASE:W. This
is convenient if you wish to use different languages for different tasks.
QuickC for Windows includes complete documentation for the environment,
Windows APIs, the C language, and C runtime libraries, and is priced at $199
($89.95 for registered users of selected Microsoft products). Reader service
no. 22.
Microsoft One Microsoft Way Redmond, WA 98052-6399 206-882-8080
The Whitewater Group has released Actor 4.0, a new version of its
object-oriented programming platform. The new object-oriented interface to SQL
lets database developers use Windows-based OOP. DLLs to uniformly access
Paradox, dBase, and Excel are provided, and for an additional fee you can
purchase them for SQL Server, DB2, Oracle, and OS/2 Extended Edition Database.
Other enhancements to Version 4.0 include an improved ObjectWindows (the class
library that facilitates quick and easy creation of Windows interfaces) and
safe multiple inheritance via protocols. The protocols keep all shared data in
one location, making code maintenance easier and eliminating unexpected
dependencies. Data encapsulation is preserved because protocols cannot
directly modify data variables--they request their subscribing classes to make
the changes instead. There is a Protocol Browser for creating, editing, and
tracking protocols.
Two versions are available: Actor 4.0 retails for $249 (upgrades $75); Actor
4.0 Professional costs $495. (Upgrades from previous professional versions are
$75, and from previous Actor versions, $195.) DLLs for access to DB2, Oracle,
SQL Server, and OS/2 Extended Edition Database are priced at $395. Reader
service no. 23.
The Whitewater Group 1800 Ridge Ave. Evanston, IL 60201 708-328-3800
Now shipping from Watcom is the C8.5/386 Optimizing Compiler and Tools, an
ANSI and SAA-compatible, 32-bit development system for DOS and Windows. The
package features a royalty-free 32-bit DOS extender and a true 32-bit Windows
GUI and DLL development kit--debugger, profiler, linker, make utility, and
more--enabling development, debugging, and royalty-free distribution of 32-bit
applications for DOS and Windows. The C8.5/386 supports such 80x86
environments as Windows and 32-bit DOS extenders from Rational, Phar Lap, and
Ergo; Microsoft language extensions to simplify porting 16-bit code are also
included.
Rational Systems' DOS/4GW 32-bit DOS extender is included in the package and
supports DPMI, VCPI, and XMS standards. C8.5/386's list price is $995; it is
offered initially at $795. Reader service no. 24.
Watcom 415 Phillip St. Waterloo, Ontario Canada N2L 3X2 800-265-4555
The Numerical Algorithms Group's Fortran 90 compiler, which supports the new
ISO Fortran 90 standard, is now shipping. The compiler is portable and
provides reusable compiler technology. Compilation is achieved in a four-pass
process, the final of which, the code generation pass, generates K&R C code.
As a result, the compiler will be available in an increasing number of
computing environments, allowing users to migrate their programs. The
generated executable can perform well by relying on the optimization built
into the host system's native C compiler.
The NAG compiler is available for SUN 3s and 4s and HP/Apollo workstations.
License fees begin at $895; academic discounts available. Reader service no.
25.
Numerical Algorithms Group 1400 Opus Place, Suite 200 Downers Grove, IL 60515
708-971-2337
PC-X Windows is the new X-Window server from Intelligent Decisions for PCs and
compatibles running MS-DOS. It allows you to use your PC as a remote graphics
terminal, thus taking advantage of inexpensive PC hardware when adding new
screens to an existing network.
The server uses a VGA or Super-VGA graphics card with Ethernet or Serial Line
IP (SLIP) and supports up to 640x480 or 800x600 resolution graphics. PC-X
Windows implements the full X11R4 Protocol. It also runs in protected mode.
The price is $295; complete preconfigured PC-X stations, including 14-inch
color monitor and 40-Mbyte hard disk, start at $1495. Reader service no. 26.
Intelligent Decisions Inc. 536 Weddell Dr., Suite 2P Sunnyvale, CA 94098
408-734-3730
Ted Gruber Software has released Fastgraph, a graphics library for DOS-based
PC systems that includes more than 150 highly optimized routines. Fastgraph
features video-mode detection and initialization; colors, virtual colors, and
palettes; graphics primitives; a redefinable world space coordinate system;
hardware and software character fonts; physical and virtual video page
management; image display facilities; animation; special effects; keyboard,
mouse, and joystick control; and sound effects and music.
The library is written in assembly language and each routine has been
hand-optimized. It supports VGA, EGA, MCGA, CGA, HGC, and Tandy 1000 graphics
modes as well as the standard color and monochrome text modes. The
documentation includes more than 120 example programs.
Jake Star of CompuTeach in New Haven, Conn., noted that Fastgraph was the
fastest of all the libraries he checked. "It supports the most modes in the
best manner possible, it's easy to program, has good documentation, and the
tech support is great," he commented.
At $149, Fastgraph supports Microsoft C and QuickC, Turbo C and C++,
QuickBasic, and Microsoft Fortran, and includes libraries for small, medium,
and large memory models. Reader service no. 27.
Ted Gruber Software P.O. Box 13408 Las Vegas, NV 89112 702-735-1980
New from Visitech is the GraphLink printer graphics library, which includes
over 100 routines to build and print graphics at the maximum resolution of
dot-matrix and laser printers, instead of the traditional screen dump.
Features include a screen capture module that allows you to print
high-resolution graphics from existing screen graphics and programs; two types
of fonts--stroked and filled-outline--that can be scaled to any size, drawn at
any angle, and rendered in italics and bold-face; a mouse-based font editor;
and printer drivers for various laser, inkjet, and dot-matrix printers.
GraphLink is available for Microsoft C, QuickC, and Turbo C++ and C. The list
price is $145. Reader service no. 28.
Visitech Software 7936 Haymarket Raleigh, NC 27615 919-676-8474
The HALO Image File Format Library (HIFFL) from Media Cybernetics is now
available for both DOS and Windows 3.0. HIFFL is a library of functions that
allow applications to read and write bit-mapped image files in a variety of
formats: TIFF, PCX, BMP, and CUT. Because HIFFL is independent of any graphics
library, such libraries are not necessary to produce applications that are
compatible with many scanning, video capture, image processing, and desktop
publishing products.
HIFFL for DOS is distributed as an object library, supports Borland and
Microsoft C compilers, and costs $249; the Windows version, a DLL, costs $349.
Development licenses and source code are available. Reader service no. 29.
Media Cybernetics 8484 Georgia Ave. Silver Spring, MD 20910 301-495-3305
CommonBase is the new C++ framework for object-oriented development of
database applications from Glockenspiel (distributed by ImageSoft). It
provides a common C++ class interface over a wide choice of ISAM libraries and
SQL databases, increasing developer productivity for Oracle, Microsoft SQL
Server, Gupta SQLBase, and Coromandel's Integra SQL databases.
Applications are ported between databases by relinking. This portable
interface allows you to initially write applications using an inexpensive ISAM
package, and later link them to an industry standard SQL database without
changing the source code.
The CommonBase C++ PC version costs $499; $999 with source code. Non-PC
version: $999 without, or $1999 with source code. Reader service no. 30.
ImageSoft Inc. 2 Haven Ave. Port Washington, NY 11050 800-245-8840 or
516-767-9067
The first commercial implementation of the 386BSD project, the development of
which has been documented in DDJ since January 1991, will soon be available.
Berkeley Software Design Inc. (BSDI) has announced BSD/386, a UNIX-compatible
operating system for the i386/486 architectures. This is a complete system,
with no optional modules, and includes full TCP/IP and OSI networking, a
reimplementation of Sun's Network File System, X-Window System Version 11
Release 5, text processing software, POSIX functionality, GNU ANSI C, and C++.
Complete source is provided for the system as well as the binaries in QIC-150
or QIC-25 tape format. The introductory price will be $999. Reader service no.
31.
Berkeley Software Design Inc. 3110 Fairview Park Drive, Suite 580 Falls
Church, VA 22042 800-ITS-UNIX or 703-876-5040






November, 1991
SWAINE'S FLAMES


Beyond Babblegate




Michael Swaine


The computer industry is introducing words to the language at an alarming
rate. Acronyms, verbified nouns, nounified verbs, neogerundives, portmanteau
words, nested acronyms--new words have to be coined to categorize the new
words being coined.
The military and the underworld continue to be rich sources of euphemism and
neologism, perhaps coined to keep outsiders out. But this motivation only
explains some of the neologizing going on in the computer industry. Much
technoneologizing occurs simply because new things need new names. One word
used to describe these new words is technobabble, but is there a word for the
phenomenon of their proliferation?
The French have a word for it, so the saying says, and if they do it must be
official, since the French maintain a governmental agency to certify new words
as part of the official language. I have a word for it, too. Biblical
authority speaks of a tower of Babel, but I think that the current neologistic
overload is best characterized as Babblegate. I recommend the word to speakers
of all languages, including those watchers at the gates of the French
language.
The term "gate" itself, variously parameterized, is an overloaded operator.
Little by little, new uses of the word have crept into the language. Watergate
led (linguistically) to Contragate, and on to othergates. If you throw in
HeavensGate as a metaphor for a career catastrophe, these gates all lead to a
meaning something like "a serious blunder that will cause the blunderer some
temporary distress, but for which he will later be pardoned."
No doubt the reason we don't speak of AppleGate is that there have been so
many of them. The latest AppleGate in the making is the copy-protection
brainstorm. One sees why computer game companies grudgingly wear the albatross
of copy protection, but why officials of a hardware company should voluntarily
go around making speeches in which they hang this dead bird around the
company's neck is beyond me.
And which of Bill, Daryl, and Robert Gates will give us a new twist on the
word? My guess is Robert, in 1996. Various explanations have been advanced for
the timidity of Democratic Presidential noncandidates, but no one has asked
the obvious question: Would you want to run against someone with so many
friends in the CIA? If Robert Gates runs for President in 1996, my guess is
that he'll run unopposed. Call it coupgate.
These mordant speculations are inspired by reading, in the same day,
Christopher Hitchens's article, "Unlawful, Unelected, and Unchecked: How the
CIA subverts the government at home" in the October Harper's, and John Barry's
Technobabble (MIT Press, 1991). The article being too grim to contemplate,
I'll describe the book. Be forewarned that the author is an old friend and
that I did a technical review of the book in manuscript.
Technobabble is both scholarly and entertaining. Although it deals with the
roots and varieties of technobabble and examines its connections with the
lexicons of sex, drugs, religion, and pop psychology, the pace of the book is
set by amusing examples, anecdotes, poems, songs, and bloopers (discreet
hardware, the geranium transistor). And it pokes a lot of fun at the
euphemisms of press release writers, pointing out, for example, that most
products apparently don't do anything, but are merely "designed to" do things,
and that the thing they are most often designed to do is to "support"
something. Barry spends a whole page detailing the varied uses of the word
support in computer writing.
There's also a chapter on the complex and inconsistent grammar of
technobabble. "Any noun can be verbed," Barry quotes one author, and goes on
to demonstrate, citing perhaps the first uses of "mouse" and "version" as
verbs. Changing verbs into nouns is useful when talking about programming
language constructs: "The install is straightforward" is not necessarily an
unsuccessful attempt to write "the installation is straightforward," if it
refers to a specific command or procedure named "install." Barry gives sound
advice on how to deal with such constructions.
But some grammatical questions remain unresolved. To date, there is no
universally accepted way to spell the -ing form of an acronym treated as a
verb. Do you write PIPing or PIPPing or PIPping or pipping? And what about
ROMed code, or is it ROMMed or ROMmed or rommed or ROM'd? Those of us who use
these terms from beyond Babblegate need some standards. Barry's book, which
doesn't have all the answers but which identifies most of the questions, comes
at a good time.







































December, 1991
EDITORIAL


Staying Off the Horns of a Dilemma




Jonathan Erickson


Just in case you were so wrapped up in mid-October's Supreme Court
confirmation hearings that you missed the rest of the news, be advised that
the Justice Department granted approval for Borland's acquisition of
Ashton-Tate. And even if you did read about it in your paper's business
section, chances are the story was wrong: No less than the Wall Street Journal
reported that the merger was official once the "basic features of
Ashton-Tate's...database software [were] placed in the public domain." This
would be big news if it were true; lamentably, it isn't.
Although the Justice Department did attach strings--in the form of "consent
decrees"--to the merger, those strings weren't tied to the public domain. The
consent decrees stipulate that: 1. For a period of 10 years, Borland cannot
enforce any dBase-related copyrights, including command languages, programming
languages, file structures, command names, menu items, and menu command
hierarchy; 2. Borland must seek prompt resolution of A-T's lawsuit against Fox
Software.
Imagine, if you can, the pickle barrel Borland would be in if the Justice
Department hadn't come up with door #1. Borland would be defending itself
against Lotus, while, by virtue of acquiring A-T and its ongoing litigations,
suing the pants off Fox for exactly the same thing. Borland's defense against
Lotus is that systems, procedures, and methods of operations are not
copyrightable. Ditto for Fox. I think they call this being on the horns of a
dBase dilemma. As you might expect, Borland agreed posthaste to the
government's proposal.
Let's face it, nothing was really resolved, at least in terms of UI copyright
issues; I reckon we'll have to wait for the Lotus vs. Borland suit for
enlightenment on those matters. However, third-party dBase/xBase developers do
get a reprieve, putting them in a better position to compete with Borland and
each other. And the entire market benefits.
Why, you might ask, is Borland so magnanimous with dBase copyrights and not
with, say, Borland C++? (Now, wouldn't that be something!) Probably because
the A-T dBase copyright claims are so flimsy that A-T likely couldn't win the
suit against Fox. I'd add that Borland, to its credit, isn't known as a
litigious company.
I wish I could say, as the WSJ did, that it's gratifying that the "government's
action reflects growing concern over consolidation in the...software industry,
where a handful of companies already account for the bulk of the business."
Unfortunately, this isn't the case either. When pondering mergers like that of
Borland and A-T, the Justice Department considers the impact on the industry
of a specific merger, not industry trends (such as consolidation).
It remains that software development is one of the few enterprises with a low
barrier to entry and high likelihood of success. Just about the only barriers
to entry in the software development industry are, in fact, intellectual
property rights -- copyrights and patents. At least in this isolated case,
these thorny issues have been put on hold, making possible greater third-party
entry and livelihood, until they're sorted out.
In late October, the Justice Department filed with the U.S. District Court for
the Northern District of California in San Francisco a competitive impact
statement concerning the Borland/A-T consent decree and will be accepting
comments for a period of 60 days. If you have anything to say about this,
here's your chance. Contact the Justice Department for details; the docket
number of United States vs. Borland International is C91-3666MHP.


Why is That Cable Going Behind the Curtain?


One reason UIs for pen-based computers are covered in this issue is that, as a
Xerox PARC researcher recently said, pens are where the UI action is and, as
the September issue of Scientific American suggested, stylus-based devices
will soon be everywhere. Let's hope, however, that this emerging segment of
the industry isn't introduced with the same kind of bravado as the Momenta
Computer, briefly described on page 18.
The Momenta Computer, a promising system with powerful features, was
introduced at a lavishly financed and well-orchestrated launch to an
overflowing Silicon Valley audience. However, we noticed a marked discrepancy
between the very fast system performance at Monday's launch extravaganza and
relatively slow performance at Tuesday's OOPSLA '91 conference in Phoenix.
What we found out--and what company spokesmen failed to tell attendees at the
launch--was that the display on the stage-size VGA projection screen was
driven by a high-speed desktop machine, not the Momenta tablet the demo
presenter appeared to be using. The reason Momenta didn't use its own machine,
a PR representative later told us, was that the system doesn't have a VGA-out
port for connecting to an external monitor, so they used the desktop machine
and detachable digitizing tablet instead. Huh? Well, if you say so...

































December, 1991
LETTERS







Antique Software


Dear DDJ,
Jeff Duntemann's October 1991 "Structured Programming" column about
programming for small vertical markets really hit the target. I started
designing and programming computers in 1955. Now I am an antiques dealer. The
two careers meet in a column in an antiques trade journal telling fellow
dealers how to use computers. Software for antiques dealers is submitted to me
for review almost monthly. None of these products has gained general
acceptance, and most are just bad.
Much of this software is a gussied-up version of a program originally
commissioned by one dealer. Invariably these products represent a thin stripe
of the vertical market. The rest of it is written by people who do not know
Chippendale from Limoges but think they know how our businesses ought to work.
Either approach is bound to fail in an industry that has as many
individualists, ways of selling, and bookkeeping methods as this one does. It
is especially insulting to see a program that does not properly account for
inventory value, when the antiques industry is completely inventory-driven.
Give us a break, guys: Get out in the field.
Jeff's support of Clarion for application development is on the mark too.
Several programs I have reviewed and use regularly are written in it. One
communicates with an on-line database service that has no more than 600 users.
The programming costs have to be recovered from a $100 annual fee. With a
modest development cost, Clarion provides a good-looking piece of software for
the basically nontechnical users.
Clarion lacks one thing: a GUI or text-mode GUI look-alike. Of seventeen
horizontal applications I use regularly in my business, eleven have gone to
the common look in the last year. That is another thing vertical market
developers need to think about.
John P. Reid
Bear, Delaware


OWLish Satisfaction


Dear DDJ,
I am a programmer specializing in real-time financial information and price
charting. I really enjoyed Jeff Duntemann's June 1991 "Structured Programming"
column about the new Turbo Pascal for Windows. I've been using the product for
a month now and already have put up a 5000-line MDI application. I agree with
your first impressions about it. Before TPW, I was looking for something to
develop Windows applications. I tried C, but I can't really think using !,*,&,
and ->. Begin and end just feel better. I was very excited the first time I
saw TPW advertised in your magazine, and it surpassed my expectations.
TPW makes it easy to port code from Pascal 6.0 but also allows porting of C
code. I have done both. I ported SNAP3 C code (see DDJ, February 1991) to OWL
in about an hour. I also ported 3000 lines from my Pascal 6.0 charting
program. TPW lets you use both OWL and the conventional C Windows structure.
I have only two complaints about TPW: First, I would like to see the same rich
set of code examples of TP 5.5 and 6.0. Of course I understand Borland was in
a hurry to release this hot product. But an example like MICROCALC would be a
very good source for reference. Examples are the ultimate source of
information for complicated environments like Windows. Particularly, I think
they should have provided some example of DDE Server. This is a good subject
for the "Structured Programming" column. DDE is just too complicated for
average programmers like myself.
This DDEServerWindow object would have methods for handling Windows DDE
messages (WMDDEAdvise, WMDDERequest, WMDDEAck, and so on). These methods would
call virtual methods like:
TopicAvailable(Topic:String):Boolean;
ItemAvailable(Topic,Item:String):Boolean;
GetItem(Topic,Item:String;var CFFormat:word;PValue:Pointer;var Length:integer);
For warm and hot links, the method:
ChangeItem(Topic,Item:String;PNewValue:Pointer;Length:integer);
and so on. This way it would hide the complexities of atoms, global memory
blocks, Advise, AckReq, DeferUpd, and God knows what. I really don't have the
technical skills necessary to write such an object.
I tried to port DDEPOP from Petzold's book with no success. I had a
particularly hard time with BOOL flags in DDEAdvise structs. DDE Servers are
to Windows what TSRs are to DOS: hard to understand and debug. You get a lot
of "Unrecoverable Application Error" (UAE) messages.
This leads to my second complaint. These Windows error messages don't say much
about what generated the error. Even for just a common runtime error, Pascal
cannot find the error point in the source code unless you type in the address.
That's a dumb thing for an environment that's supposed to integrate
applications. Sometimes, you have to start the debugger just to find a simple
UAE.
About the communication ports: I had the same problems you did. I solved them
by writing my own interrupt-driven communication services. I used the same
code my old TP 6.0 comm application did. The only change I had to make was to
use the DATA segment for the circular buffer and head. My old ISR used the
CODE segment for these variables, a clear protection violation. My interrupt
routine is in assembly language but I believe you can write one in Pascal as
well.
Apparently, Microsoft doesn't want to enforce the use of Windows
communications facilities. This fact is confirmed by the permission to access
serial port registers directly (I don't know much about protected mode but I
know it can avoid such accesses) and the absence of documentation about the
use of Windows comm functions in SDK and Petzold's book.
My communications program works fine in real, standard, and 386 enhanced
modes. Windows even warns you if you try to start a non-Windows app that uses
the same serial port. The only problem with this approach is that if you get a
UAE (a very common occurrence while developing), your application terminates,
leaving the interrupt uncovered. In this state, one byte coming in from the
comm port is enough to hang Windows. (Actually, Windows aborts to DOS.) I
couldn't find anything like Pascal's ExitProc in the Windows documentation.
I will try to put the communications routines in a DLL. DLLs don't terminate
violently like applications and have initialization and exit procedures, which
may be used to set and restore the interrupt vector.
Turbo Pascal for Windows is really an important product and I'm happy to see
important magazines like yours interested in providing information about it.
Omar F. Reis
Sao Paulo, Brazil


What About Al?


Dear DDJ,
I just read Al Stevens's September "C Programming" column and I think that he
is giving the ANSI C committee a bad rap. I think the problem is with his
code.
Al seems to have been using a C compiler which made some peculiar decisions
about how to interpret "preprocessor" lines in macro replacements. The 1978
Kernighan and Ritchie seems to be silent on this subject. But Harbison and
Steele (C: A Reference Manual) say explicitly that Al's code should fail: "If
a macro expands into something that looks like a preprocessor command, that
command will not be recognized as a command by the preprocessor." I have
always used Harbison and Steele as gospel when trying to write portable code
for pre-ANSI compilers, since they based their book on many different dialects
of C. The ANSI spec just codifies this behavior.
If Al wants to create macros which look like #define lines, it should be
pretty easy using ANSI C as long as he is willing to create complicated make
files. Using a file like
#define POUND #
#define defMacro(macro, replacement) \
 POUND define macro replacement
#include "whatever.h"
and sending the output of the preprocessor to another .h file should give him
what he wants.
But I think that the real problem is with his style of coding. I also like to
use the C preprocessor for exotic purposes. But after being bitten a number of
times by incompatibilities between different C dialects, bugs in the
preprocessor, and overflowing internal buffers, I have learned to avoid
abusing the preprocessor. I think any C programmer who is producing supposedly
portable code ought to follow this rule: "If it looks like it might fail, it
probably will on some compiler. Would I rather spend my time studying the ANSI
spec and experimenting with my compiler, or would I rather write my own
preprocessor and know exactly how it works? (And, if I don't do it now, I will
have to rip all this code out and write my own preprocessor when I port it!)"
Alan B. Harper
Oakland, California
Dear DDJ,
I am perplexed by "The Programmer's Soapbox" at the end of Al Stevens's
September column. If language is declining, I ain't noticed it. (Is this
oxymoronization?) A few points:
1. The first edition of The C Programming Language by Kernighan and Ritchie
has a copyright of 1978 and contains the word "initialize." My copy of
Webster's New Collegiate Dictionary also contains the word and has a copyright
of 1977. In all fairness, the meaning given indicates the word came into being
as a result of the computer revolution, but it seems all it is an example of
is that Al was unaware of terminology presently in use when he first read K&R.
2. The only complaints that Al seems to have about "stringize" is that the
X3J11 committee coined it and that it is an abomination. If a suitable word or
simple phrase already exists that he felt should have been used instead, a
good editorial would have enlightened us. If he feels the word is lacking in
some respect, he did not convey this. If he felt that he had coined a better
word he did not tell us. The word does sound strange and seems a bit forced,
but it does clearly convey the meaning that X3J11 intended. Is that not the
important issue?
3. As fields of endeavor continue to come into existence and grow, it is
natural for new words to be created to convey the concepts involved. That some
of these words move into the mainstream is not an indication that language is
declining, but merely that society and its language are evolving because of
these endeavors. The word initialize showing up in a spell checker only
indicates that society has evolved to the point where people are using
computers and programmers are using spell checkers.
It is a good thing that Al was not present when man was making the first
tentative snorts and grunts that became language. He would have had nothing to
say in protest.
William R. Ockert
Carrington, North Dakota


Against Subversion


Dear DDJ,
I found Kenneth Roach's article "Using the Real-Time Clock" (June 1991) very
informative, but I would like to make some comments.
One issue that disturbs me is that Mr. Roach suggests replacing the system
services for getting and setting the system time by directly using the
hardware clock. Subverting system services is never a good idea unless there
is some overwhelming reason to do so. Mr. Roach gave the reason that he needed
an accurate timing mechanism. My suggestion would be to use his own Turbo
Pascal Clock() function for this purpose. Otherwise, I fail to see a reason
why getting or setting the system time would be a time critical operation; the
less than one millisecond overhead is simply not going to be an issue with the
value of time kept by the system, or be perceived by the user unless done
repeatedly in test loops.
My objection to subverting the operating system in this case is that MS-DOS
provides you with the ability to override the clock device driver such that it
can be done in a device-independent way. This device driver is used by DOS for
the get/set system time and date services. I have developed clock device
drivers that do this both for the real-time clocks commonly found in XTs and
the AT real-time clock. Sadly, the default clock device driver in MS-DOS
relies on the value updated by the timer tick interrupt. Few DOS users know
that by using a clock device driver designed for a real-time clock, you can
provide the convenience of the ability to set the hardware clock by using the
DOS time and date commands.
Mr. Roach also noted that the get time service on a LAN was considerably
slower than with no LAN installed. I believe this is due to the LAN using a
synchronous time base for all connected machines, thus the request is handled
via the network. This situation would be desirable when comparing time stamps
of network files and other network related activities.
Mr. Roach's use of the AT's periodic interrupt may be a potential problem. The
AT BIOS makes use of this interrupt with the event wait service (interrupt
15h, function 86h), which is intended for use by a multitasking operating
system.
Robert Mashlan
Norman, Oklahoma


Patent Proponent


Dear DDJ,
The article entitled "Software Patents" was basically fear-mongering
propaganda, and so seemed out of place in the usually placid technical pages
of DDJ.
Principally lacking in the article is any recognition of the economic
environment in which products compete. For example, patent license fees are
almost always royalties. Unless a patent holder is an idiot, he or she has no
desire to put a manufacturer out of business, or to force a product to be
crippled in the marketplace. In fact, the pressure is on the patent holder to
negotiate a reasonable fee, so that a new product can compete successfully
with established products, and thus create maximum royalties. Very few patents
are so vital that absolutely no marketplace alternatives are possible.
Deceptively absent from the article is the identification of those who are
most advantaged by patents: independent individual inventors. Without patent
protection, any new idea can be taken and used by those who have the largest
staffs of programmers, the largest production, marketing, and sales
organizations, and the largest advertising budgets; an individual cannot
realistically hope to compete with such organizations other than in small
niche markets. With patent protection, the individual has some amount of
leverage to restrain or harness large organizations and thus reap the rewards
of his or her own efforts. Patent protection can be obtained directly by
individuals, for modest fees.
The best handbook available is Patent it Yourself, Second edition, by David
Pressman ($32.95 ppd. from Nolo Press, 950 Parker St., Berkeley, CA 94710;
800-992-6656). Self-patenting is a lot of hard work, but is probably within
the range of any technical person willing and able to put out the effort.
Although the article begins by pointing out that a patent is a grant of
monopoly in return for public disclosure, it is embarrassingly silent with
respect to the lack of exactly that sort of disclosure in software, and the
problems thus caused. It is no accident that one of the oft-mentioned problems
in software is that programmers continually "reinvent the wheel." It must be
that way: Virtually all of the "good" or economically important software is
available only as object code, rather than source.
Because programs are not generally protected by patent, precious source code
is kept as a vital trade secret; consequently, any especially good techniques
within the source generally remain unavailable to the public forever, instead
of just the limited lifetime of a patent. And when the software product
eventually dies, any special techniques in it die as well.
Because economically important techniques are not publicly disclosed,
ordinary programmers cannot incrementally build upon previous work; most
programmers will not even see that work. In contrast, large organizations can
afford to disassemble competitive code; the secrets thus revealed are, again,
trade secrets, and again unavailable to the general public. Trade secret
software techniques are thus available to large organizations, to make them
even more powerful competitors. (This clearly happened during the early years
of DOS.)
Another point the article overlooks is that patent protection encourages the
investment in research necessary to develop new ideas. True, some developments
are easy, cheap, and obvious. But others may involve extensive library
research, theoretical development, and experimental trial-and-error; such work
can be very expensive. If expensive results are not protected, such research
will be a poor investment, one which will not be made again. In an unprotected
environment it is far more efficient for large companies simply to wait for
someone else to come up with an idea, then steal it. Patents restrain this,
and are thus a tool to help recover research expenses (although most patents
do not earn back their issue fees). Failure to recover such expenses means
less research. Is that what we want?
When we are young and in school, information is provided for us, and the
vehicles of such information are freely available. But as mature individuals
we must understand that we are part of a capitalist society, and all of the
information we have was found, accumulated, and paid for, through the direct
effort of previous generations. One of the ways this was accomplished is by
patent protection. Patents have thus been proven in practice to be an
important tool for encouraging public disclosure in a society which respects
private ownership. Moreover, patents automatically provide economic support
only for worthwhile market applications of research and development, a
situation which seems far better than the idea of tax-supported research
grants under arbitrary and political bureaucratic control.
The natural application of patent concepts to software has the potential for
improving the industry for individuals and small businesses, by allowing them
to restrain the giants. Naturally, large companies may see this as a threat.
Perhaps they will even support "The League for Programming Freedom" to try to
keep this threat at bay.
Terry Ritter
Austin, Texas


December, 1991
THE DESIGN AND IMPLEMENTATION OF PIE MENUS


They're fast, easy, and self-revealing




Don Hopkins


Don is a software engineer for SunSoft and can be contacted at 88 Mercy
Street, Mountain View, CA 94041.


Although the computer screen is two-dimensional, today most users of windowing
environments control their systems with a one-dimensional list of choices --
the standard pull-down or drop-down menus such as those found on Microsoft
Windows, Presentation Manager, or the Macintosh.
This article describes an alternative user-interface technique I call "pie"
menus, which is two-dimensional, circular, and in many ways easier to use and
faster than conventional linear menus. Pie menus also work well with
alternative pointing devices such as those found in stylus or pen-based
systems. I developed pie menus at the University of Maryland in 1986 and have
been studying and improving them over the last five years.
During that time, my colleagues and I have implemented pie menus on four
different platforms: X10 for the uwm window manager, SunView, NeWS for the
Lite Toolkit, and Open Windows for THE NeWS Toolkit. Fellow researchers
have conducted both comparison tests between pie menus and linear menus, and
also tests with different kinds of pointing devices, including mice, pens, and
trackballs.
Included with this article are relevant code excerpts from the most recent
NeWS implementation, written in Sun's object-oriented PostScript dialect.


Pie Menu Properties


In their two-dimensional form, pie menus are round menus containing menu items
positioned around the cursor--as opposed to the rows or columns of traditional
linear menus. The menu item target regions are shaped like the slices of a
pie, and the cursor starts out in the center, in a small inactive region. The
active regions are all adjacent to the cursor, but each in a different
direction. You select from a pie menu by clicking the mouse or tapping the
stylus, and then pointing in a particular direction.
Although there are multiple kinds of pie menus, the most common implementation
uses the relative direction of the pointing device to determine the
selection--as compared with the absolute positioning required by linear menus.
The wedge-shaped slices of the pie, adjacent to the cursor but in different
directions, correspond to the menu selections. Visually, feedback is provided
to the user by highlighting the wedge-shaped slices of the pie. In
the center of the pie, where the cursor starts out, is an inactive region.
When a pie menu pops up, it is centered at the location of the click that
invoked it: where the mouse button was pressed (or the screen was touched, or
the pen was tapped). The center of the pie is inactive, so clicking again
without moving dismisses the menu and selects nothing. The circular layout
minimizes the motion required to make a selection. As the cursor moves into
the wider area of a slice, you gain leverage, and your control of direction
improves. To exploit this property, the active target areas can extend out to
the edges of the screen, so you can move the cursor as far as required to
select precisely the intended item.
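A minimal sketch of this direction test follows, in Python rather than the
NeWS PostScript of the listings. The function name, the 10-pixel inactive
radius, and the convention that slice 0 points straight up with later slices
running counterclockwise are assumptions made for illustration; they are not
taken from any of the implementations described in this article.

```python
import math

def pie_slice(dx, dy, n_items, inactive_radius=10.0, first_angle=90.0):
    """Return the index of the slice the cursor points at, or None.

    dx, dy: cursor offset from the menu center, with y pointing up.
    Slice 0 is centered on first_angle (degrees, counterclockwise from
    the +x axis); later slices continue counterclockwise.
    """
    if math.hypot(dx, dy) < inactive_radius:
        return None  # still inside the inactive center region
    angle = math.degrees(math.atan2(dy, dx))
    width = 360.0 / n_items
    # Shift by half a slice so each slice is centered on its direction.
    offset = (angle - first_angle + width / 2.0) % 360.0
    # The final modulo guards against offset rounding up to exactly 360.
    return int(offset // width) % n_items
```

Because only the angle matters once the cursor leaves the center, a point far
out toward the screen edge selects the same slice as a point just outside the
inactive region.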
You can move into a slice to select it, or move around the menu, reselecting
another slice. As you browse around before choosing, the slice in the
direction of the cursor is highlighted, to show what will happen if you click
(or, if you have the button down, what will happen if you release it). When
the cursor is in the center, none of the items are highlighted, because that
region is inactive.
Pie menus can work with a variety of pointing devices--not just mice, but also
pens, trackballs, touchscreens, and (if you'll pardon the hand waving) data
gloves. The look and feel should, of course, be adapted to fit the qualities
and constraints of the particular device. For example, in the case of the data
glove, the two-dimensional circle of a pie could become a three-dimensional
sphere, and the wedges could become cones in space.
In all cases, a goal of pie menus is to provide a smooth, reliable gestural
style of interaction for novices and experts.


Pie Menu Advantages


Pie menus are faster and more reliable than linear menus, because pointing at
a slice requires very little cursor motion, and the large area and wedge shape
make them easy targets.
For the novice, pie menus are easy because they are a self-revealing gestural
interface: They show what you can do and direct you how to do it. By clicking
and popping up a pie menu, looking at the labels, moving the cursor in the
desired direction, then clicking to make a selection, you learn the menu and
practice the gesture to "mark ahead" ("mouse ahead" in the case of a mouse,
"wave ahead" in the case of a dataglove). With a little practice, it becomes
quite easy to mark ahead even through nested pie menus.
For the expert, they're efficient because -- without even looking -- you can
move in any direction, and mark ahead so fast that the menu doesn't even pop
up. Only when used more slowly, like a traditional menu, does a pie menu pop up
on the screen, to reveal the available selections.
Most importantly, novices soon become experts, because every time you select
from a pie menu, you practice the motion to mark ahead, so you naturally learn
to do it by feel! As Jaron Lanier of VPL Research has remarked, "The mind may
forget, but the body remembers." Pie menus take advantage of the body's
ability to remember muscle motion and direction, even when the mind has
forgotten the corresponding symbolic labels.
Moving farther from the pie menu center assures a more accurate selection.
This feature facilitates mark ahead. Our experience has been that the
expert pie menu user can easily mark ahead on an eight-item menu. Linear menus
don't have this property, so it is difficult to mark ahead more than two
items.
This property is especially important in mobile computing applications and
other situations where the input data stream is noisy because of factors such
as hand jitter, pen skipping, mouse slipping, or vehicular motion (not to
mention tectonic activity).
There are particular applications, such as entering compass directions, time,
angular degrees, and spatially related commands, which work particularly well
with pie menus. However, as we'll see further on, pies win over linear menus
even for ordinary tasks.


Pie Menu Flavors


There are many possible flavors or variants of pie menus. One obvious
variation is to use semicircular pie ("fan") menus at the edge of the screen.
Secondly, although the usual form of pie menus is to use only the directional
angle in determining a selection, there is a variant of pie menus which offers
two parameters of choice with a single user action. In this case, both the
direction and the distance between the two points are used as parameters to
the selection. The ability to specify two input parameters at once can be used
in situations where the input space is two-dimensional.
For example, in a graphics or word processing application, a dual-parameter
pie menu will allow you to specify both the size and style of a typographic
font in one gesture. The direction selects the font style from a set of
styles, and the distance selects the font size from the range of sizes. An
increased distance from the center corresponds to an increase in the size of
the font. Visual feedback can be provided to the user by making a text sample
swell or shrink dynamically as the pointer is moved to and fro. Direction and
distance can be continuous or discrete, as appropriate.
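The font example might be sketched as follows. This is only an illustration
in Python: the function name, the size range, the radii, and the choice to
center the first style on the rightward direction are all assumptions, and
here direction is treated as discrete while distance is continuous.

```python
import math

def font_choice(dx, dy, styles, min_size=8, max_size=72,
                inactive_radius=10.0, max_radius=200.0):
    """Map one gesture to a (style, size) pair, or None in the center.

    dx, dy: cursor offset from the menu center, with y pointing up.
    """
    distance = math.hypot(dx, dy)
    if distance < inactive_radius:
        return None
    # Direction picks the style: one equal slice per style, with the
    # first style centered on the +x axis, counting counterclockwise.
    width = 360.0 / len(styles)
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    index = int((angle + width / 2.0) % 360.0 // width) % len(styles)
    # Distance picks the size, growing linearly and clamped at max_radius.
    t = min(1.0, (distance - inactive_radius) / (max_radius - inactive_radius))
    size = round(min_size + t * (max_size - min_size))
    return styles[index], size
```

Calling such a function on every pointer movement would give the text sample
the values it needs to swell or shrink as the pointer moves.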
A minor variation in the use of pie menus is whether you click-and-drag as the
menu pops up, or whether two clicks are required: one to make the menu appear,
another to make the selection. In fact, it's possible to support both.
Other variants include scrolling spiral pies, rings, pies within square
windows, and continuous circular fields. These variants are discussed in a
later section.


Pie Menu Implementations


As mentioned earlier, several pie menu implementations exist, including: X10,
SunView, and two NeWS implementations (using different toolkits).

I first attempted to implement pie menus in June 1986 on a Sun 3/160 running
the X10 window system by adding them to the "uwm" window manager. The user
could define nested menus in a ".uwmrc" file and bind them to mouse buttons.
The default menu layout was specified by an initial angle and a radius that
you could override in any menu in which labels overlapped. The pop-up menu was
rectangular, large enough to hold the labels, and had a title at the top.
Then I linked the window manager into Mitch Bradley's Sun Forth, to make a
Forth-extensible window manager with pie menus. I used this interactively
programmable system to experiment with pie menu tracking and window management
techniques, and to administer and collect data for Jack Callahan's experiment
comparing pie menus with linear menus.
In January 1987, while snowed in at home, Mark Weiser implemented pie menus
for the SunView window system. They are featured in his renowned "SDI" game,
the source code for which is available free of charge.
I implemented pie menus in round windows for the Lite Toolkit in NeWS 1.0 in
May 1987. The Lite Toolkit is implemented in Sun's object-oriented PostScript
dialect, and pie menus are built on top of the abstract menu class, so they
have the same application interface as linear menus. Therefore, pie menus can
transparently replace the default menu class, turning every menu in the system
into a pie, without having to modify other parts of the system or
applications.
Because of the equivalence in semantics between pie menus and linear menus,
pies can replace linear menus in systems in which menu processing can be
revectored. Both the Macintosh and Microsoft Windows come to mind as possible
candidates for pie menu implementations. Of course, for best results, the
application's menu items should be arranged with a circular layout in mind.
My most recent implementation of pie menus runs under the NeWS Toolkit, the
most modern object-oriented toolkit for NeWS, shipped with Open Windows,
Version 3. The pie menu source code and several special-purpose classes, as
well as sample applications using pie menus are all available for no charge.


Usability Testing


Over the years, there have been a number of research projects studying the
human factors aspects of pie menus.
Jack Callahan's study compares the seek time and error rates in pies versus
linear menus. There is a hypothesis known as Fitts's law, which states that
the "seek time" required to point the cursor at a target depends on the
target's area and distance. The wedge-shaped slices of a pie menu are all
large and close to the cursor, so Fitts's law predicts good times for pie
menus. In comparison, the rectangular target areas of a traditional linear
menu are small, and each is placed at a different distance from the starting
location.
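To make the comparison concrete, here is a small sketch using the common
Shannon formulation of Fitts's law, which works from target width along the
line of motion rather than from area. The constants a and b and the pixel
sizes in the example are invented for illustration; they are not measurements
from Callahan's study.

```python
import math

def index_of_difficulty(distance, width):
    """Shannon formulation: ID = log2(distance / width + 1), in bits."""
    return math.log2(distance / width + 1.0)

def seek_time(distance, width, a=0.1, b=0.1):
    """Predicted seek time a + b * ID; a and b are illustrative constants."""
    return a + b * index_of_difficulty(distance, width)

# Hypothetical geometry: the eighth item of a linear menu with 20-pixel rows
# versus a pie slice about 40 pixels wide whose middle is 30 pixels away.
linear_id = index_of_difficulty(150, 20)
pie_id = index_of_difficulty(30, 40)
```

With these made-up numbers the linear item comes out at about 3.1 bits and the
pie slice at about 0.8 bits, so the predicted seek time for the pie slice is
shorter whatever positive value b takes.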
Callahan's controlled experiment supports the result predicted by Fitts's law.
Three types of eight-item menu task groupings were used: Pie tasks (North, NE,
East, and so on), linear tasks (First, Second, Third, and so on), and
unclassified tasks (Center, Bold, Italic, and so on). Subjects with little or
no mouse experience were presented menus in both linear and pie formats, and
told to make a certain selection from each. Those subjects using pie menus
were able to make selections significantly faster and with fewer errors for
all three task groupings.
The fewer the items, the faster and more reliable pie menus are, because of
their bigger slices. But other factors contribute to their efficiency. Pies
with an even number of items are symmetric, so the directional angles are
convenient to remember and articulate. Certain numbers of items work well with
various metaphors, such as a clock, an on/off switch, or a compass. Eight-item
pies are optimal for many tasks: They're symmetric, evenly divisible along
vertical, horizontal, and diagonal axes, and have distinct, well-known
directions.
Gordon Kurtenbach carried out an experiment comparing pie menus with different
visual feedback styles, numbers of slices, and input devices. One interesting
result was that menus with an even number of items were generally better than
those with odd numbers. Also, menus with eight items were especially fast and
easy to learn, because of their primary and secondary compass directions.
Another result of Kurtenbach's experiment was that, with regard to speed and
accuracy, pens were better than mice, and mice were better than trackballs.
The "Eight Days a Week" menu shown in Figure 1 is a contrived example of
eight-item symmetry: It has seven items for the days of the week, plus one for
today. Monday is on the left, going around counterclockwise to Friday on the
right. Wednesday is at the bottom, in the middle of the week, and the weekend
floats above on the diagonals. Today is at the top, so it's always an easy
choice. THE NeWS Toolkit code that creates this pie menu is shown in Listing
One (page 94).


Pie Menu Disadvantages


The main disadvantage of pie menus is that when they pop up, they can take a
lot of screen space due to their circular layout. Long item labels can make
them very large, while short labels or small icons make them more compact and
take up less screen space.
The layout algorithm should have three goals: to minimize the menu size, to
prevent menu labels from overlapping, and to clearly associate labels with
their directions. It's not necessary to confine each label to the interior of
its slice--that could result in enormous menus. In a naive implementation, you
might use text labels rotated around the center of the pie. But rotated text
turns out not to work well: it exaggerates "jaggies," is hard to read without
rotating your head, and doesn't even satisfy the goal of minimizing menu
size.
One successful layout policy I've implemented justifies each label edge within
its slice, at an inner radius big enough that no two adjacent labels overlap.
To delimit the target areas, short lines are drawn between the slices, inside
the circle of labels, like cuts in a pie crust.
One solution to the problem of pie menus with too many items is to divide up
large menus into smaller, logically related submenus. Nested pies work quite
well, and you can mark ahead quickly through several levels. You remember the
route through the menus in the same way you remember how to drive to a
friend's house: by going down familiar roads and making the correct turn at
each intersection.
Another alternative is to use a scrolling pie menu that encompasses many items
in a spiral but only displays a fixed number of them at once. By winding the
cursor around the menu center, you can scroll through all the items, like
walking up or down a spiral staircase.
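One way to track the winding is to accumulate the change in cursor angle from
one movement to the next. The sketch below is an assumption about how this
could be done, not a description of an existing implementation; the class
name, the counterclockwise-is-forward convention, and the clamping at both
ends of the list are all choices made for illustration.

```python
import math

class SpiralTracker:
    """Accumulate how far the cursor has wound around the menu center."""

    def __init__(self, n_visible, n_total):
        self.n_visible = n_visible   # slices shown at once
        self.n_total = n_total       # items in the whole spiral
        self.last_angle = None
        self.total_turn = 0.0        # radians, counterclockwise positive

    def move(self, dx, dy):
        """Feed a cursor offset from the center; return the current item index."""
        angle = math.atan2(dy, dx)
        if self.last_angle is not None:
            delta = angle - self.last_angle
            # Unwrap across the -pi/+pi seam so small moves stay small.
            if delta > math.pi:
                delta -= 2 * math.pi
            elif delta < -math.pi:
                delta += 2 * math.pi
            self.total_turn += delta
        self.last_angle = angle
        slice_width = 2 * math.pi / self.n_visible
        index = int(math.floor(self.total_turn / slice_width + 0.5))
        # Stop at the first and last items instead of wrapping around.
        return max(0, min(self.n_total - 1, index))
```

With eight visible slices and twenty items, one and a quarter turns
counterclockwise from the starting direction lands on item 10, which an
ordinary eight-slice pie could not reach.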


Other Design Considerations


When you mark ahead quickly to select from a familiar pie, it can be annoying
if the menu pops up after you've already finished the selection, and then pops
down, causing the screen to repaint and slowing down interaction. If you don't
need to see the menu, it shouldn't show itself. When you mark ahead,
interaction is much quicker if the menu display is preempted while the cursor
is in motion, so you never have to stop and wait for the computer to catch up.
If you click up a menu when the cursor is at rest, it should pop up
immediately, but if you press and move, the menu should not display until you
sit still. If you mark ahead, selecting with a smooth continuous motion, the
menu should not display at all. However, it's quite helpful to give some type
of feedback, such as displaying the selected label on an overlay near the
cursor, or previewing the effect of the selection.
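The display rule just described can be reduced to a single test, sketched
here in Python. The function name and the quarter-second rest threshold are
hypothetical; a real toolkit would hook this into its own event and timer
machinery.

```python
def should_display(now, last_motion, selection_done, rest_delay=0.25):
    """Decide whether the pending pie menu should be drawn at time `now`.

    last_motion is the time of the most recent cursor movement, and
    rest_delay (in seconds) is a made-up threshold for "sitting still".
    """
    if selection_done:
        return False  # marked ahead: the menu is never drawn
    return now - last_motion >= rest_delay
```

If the cursor was already at rest when the button went down, the test passes
at once and the menu pops up immediately; if the cursor is moving, the menu
waits until the motion stops.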
When you pop up a pie menu near the edge of the screen, the menu may have to
be moved by a certain offset in order to fit completely on the screen,
otherwise you couldn't see or select all the items. But it would be quite
unexpected were the menu to slip out from under the click, leaving the cursor
pointing at the wrong slice. So whenever the menu is displayed on the screen,
and it must be moved in order to fit, it is important to "warp" the cursor by
the same offset, relative to its position at the time the menu is displayed.
If you mark ahead so quickly that the menu display is preempted, the cursor
shouldn't be warped. Pen- and touchscreen-based pie menus can't warp your pen
or finger, so pie menus along the screen edge could pop up as semicircular
fans. Note that cursor warping is also an issue that linear menus should
address.
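The arithmetic behind the warp is small. The following Python sketch assumes
a circular menu of known radius and a screen whose origin is at one corner;
the function name and the parameters are hypothetical.

```python
def fit_menu(click_x, click_y, radius, screen_w, screen_h):
    """Return ((center_x, center_y), (warp_dx, warp_dy)).

    The menu is a circle of the given radius centered on the click.
    If it would hang off the screen, slide it back on and report the
    same offset so the cursor can be moved along with it.
    """
    center_x = min(max(click_x, radius), screen_w - radius)
    center_y = min(max(click_y, radius), screen_h - radius)
    return (center_x, center_y), (center_x - click_x, center_y - click_y)
```

A click 5 pixels from the left edge with a 100-pixel menu moves the center to
x = 100 and asks for the cursor to be warped 95 pixels to the right. When the
display is preempted by mark ahead, the caller would simply ignore the warp
offset.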
Ideally, pie menu designers should arrange the labels and submenus in
directions that reflect spatial associations and relationships between them,
making it easy to remember the directions. Complementary items can be opposite
each other, and orthogonal pairs at right angles.
It's difficult to mark ahead into a pie menu whose items are not always in the
same direction, because if the number of items changes, and they move around,
you never know in which directions to expect them. Pie menus are better for
selecting from a constant set of items, such as a list of commands, and best
when the items are thoughtfully arranged to exploit the circular layout.


Sample Pie Menu


The pie menu shown in Figure 3 is an example of one that I added to the NeWS
environment. Clicking on the window frame pops up this menu of
window-management commands, designed to take advantage of mark ahead. Because
this menu is so commonly used, you can learn to use it quickly, and save a lot
of time. At the left of the figure is the top-level menu with commonly used
commands and logically related submenus. The Grab item has been selected,
popping up a graphical submenu of corners and edges. The icon for the bottom
edge is highlighted, but has not yet been selected. Clicking in that slice
allows you to grab and stretch the edge of the window frame.
Figure 4 shows a second example, a color wheel that allows you to set the
brightness, and to select a color from a continuous range of hues and
saturations. The code (written in object-oriented PostScript) that implements
this color wheel is shown in Listing Two (page 94). The hue varies smoothly
around the color wheel with direction, and the saturation varies smoothly with
distance, with pure colors in the center fading to gray around the edge.
Outside the pale perimeter is a continuous band of grays from white to black,
that looks like the shadow inside a paint can, and functions as a circular
brightness dial. Dipping into this gray border sets the brightness of the
whole wheel. You may select any shade of gray around the border, or move back
into the paint can, to select a color at the current brightness. As you move
around, the cursor shows the true color selected, and because the cursor is
displayed even before the menu is popped up, you can mark ahead and select a
color without popping up the menu!
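The mapping from cursor position to color can be sketched as follows. This Python fragment is an illustration, not the PostScript from the listing: the function name wheel_color is hypothetical, and the way the gray border maps direction to a gray level is an assumption, since the text says only that the band runs from white to black.

```python
import colorsys
import math


def wheel_color(dx, dy, wheel_radius, brightness):
    """Map a cursor offset from the wheel center to an (r, g, b) triple."""
    distance = math.hypot(dx, dy)
    # Direction selects the hue, expressed as a fraction of a full turn.
    hue = (math.degrees(math.atan2(dy, dx)) % 360.0) / 360.0
    if distance <= wheel_radius:
        # Pure color at the center, fading to gray at the edge.
        saturation = 1.0 - distance / wheel_radius
        return colorsys.hsv_to_rgb(hue, saturation, brightness)
    # Outside the wheel, in the gray border: direction picks a gray level
    # (assumed mapping; the real dial may be laid out differently).
    return (hue, hue, hue)
```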


Conclusion


Pie menus are easy to learn, fast to use, and provide a gestural style of
interaction that suits both novices and experts. And this is one user
interface that's on the house, so enjoy!


References


Card, Stuart. "User Perceptual Mechanisms in the Search of Computer Command
Menus." Proceedings of the ACM (March, 1982).
Hopkins, D., J. Callahan, and M. Weiser. "Pies: Implementation, Evaluation,
and Application of Circular Menus." University of Maryland Computer Science
Department Technical Report, 1988.
Perlman, Gary. "Making the Right Choices with Menus." Interact '84, First IFIP
International Conference on Human Computer Interaction, Amsterdam, 1984.
Shneiderman, Ben. Designing the User Interface. Reading, Mass.:
Addison-Wesley, 1987.



Momenta's Command Compass: A Pie Menu by Any Other Name


One implementation of a pie menu was recently announced by Momenta
Corporation, a Silicon Valley startup developing a "pentop" computer. (A
"pentop" PC supports both keyboard and pen input.) The Momenta Computer is a
dual-mode system that can run standard MS-DOS programs or programs written for
the "Momenta Environment," a Smalltalk/V-based environment that supports the
pen. Central to the Momenta Environment is a system-wide pie menu the company
refers to as the "Command Compass."
The Command Compass operates consistently across all Momenta Environment
applications (notepad, spreadsheet, sketchpad, and so on) by allowing users to
manipulate (move, copy, cut, paste, and so on) text or graphical objects that
have previously been defined. Some operations can therefore be performed in a
single stroke. As Figure 1 illustrates, for instance, you can quickly move a
block of text by defining it, calling up the Command Compass, moving the
stylus through the "move" wedge and on to the "move-to" position, and
releasing the pen.
The figure shows that Momenta's menus are a visually faithful implementation
of pie menus as described in this article. If Momenta succeeds in its
endeavor, pie menus will join pull-down and linear menus as mainstream
user-interface components.
--editors

_THE DESIGN AND IMPLEMENTATION OF PIE MENUS_
by Don Hopkins


[LISTING ONE]

% Code to implement the "8 Days a Week" Pie Menu
% by Don Hopkins

/pie framebuffer /new ClassPieMenu send def
[ (Today)
 (Sunday)
 (Monday) (Tuesday) (Wednesday) (Thursday) (Friday)
 (Saturday)
] /setitemlist pie send
90 /setinitialangle pie send
false /setclockwise pie send

/can framebuffer /new ClassPieMenuCanvas send def
pie /setpiemenu can send
/minsize {100 100} /installmethod can send
/win can framebuffer /new ClassBaseWindow send def
/new ClassEventMgr send /activate win send
/place win send /map win send




[LISTING TWO]

 /Layout { % - => -
 PieGSave self setcanvas
 /LayoutInit self send
 /LayoutValidateItems self send
 /LayoutItemRadius self send
 /LayoutOuterRadius self send
 grestore
 } def

 /LayoutInit { % - => -
 % Deflate the menu.
 /Radius 0 def
 % Figure the slice width.
 /SliceWidth 360 /itemcount self send 1 max div def
 % Point the initial slice in the initial angle.
 /ThisAngle InitialAngle store
 } def



 /LayoutValidateItems { % - => -
 % Loop through the items, validating each one.
 ItemList {
 begin % item

 % Measure the item.
 /DisplayItem load DisplayItemSize
 /ItemHeight exch def
 /ItemWidth exch def

 % Remember the angle and the direction.
 /Angle ThisAngle def
 /DX Angle cos def
 /DY Angle sin def

 % Figure the offset from the tip of the inner radius
 % spoke to the lower left item corner, according to
 % the direction of the item.
 %
 % Items at the very top (bottom) are centered on their
 % bottom (top) edge. Items to the left (right) are
 % centered on their right (left) edge.
 %
 DX abs .05 lt { % tippy top or bippy bottom

 % Offset to the North or South edge of the item.
 /XOffset ItemWidth -.5 mul def
 /YOffset
 DY 0 lt {ItemHeight neg} {0} ifelse
 def

 } { % left or right

 % Offset to the East or West edge of the item.
 /XOffset
 DX 0 lt {ItemWidth neg} {0} ifelse
 def
 /YOffset ItemHeight -.5 mul def

 } ifelse

 % Twist around to the next item.
 /ThisAngle
 ThisAngle SliceWidth
 Clockwise? {sub} {add} ifelse
 NormalAngle
 store

 end % item
 } forall
 } def

 /LayoutItemRadius { % - => -
 % Figure the inner item radius, at least enough to prevent
 % the items from overlapping.
 /ItemRadius RadiusMin def
 /itemcount self send 3 gt { % No sweat if 3 or less.

 % Check each item against its next neighbor.
 0 1 /itemcount self send 1 sub {

 /I exch def
 /NextI I 1 add /itemcount self send mod def

 % See if these two items overlap.
 % If they do, keep pushing the item radius out
 % by RadiusStep until they don't.
 { I /CalcRect self send
 NextI /CalcRect self send
 rectsoverlap not {exit} if % They don't overlap!

 % They overlap. Push them out a notch and try again.
 /ItemRadius ItemRadius RadiusStep add def
 } loop

 } for
 % Now that we've gone around once checking each pair,
 % none of them overlap any more!
 } if

 % Add in some more space to be nice.
 /ItemRadius ItemRadius RadiusExtra add def
 } def

 /LayoutOuterRadius { % - => -
 % Now we need to calculate the outer radius, based on the radius
 % of the farthest item corner. During the loop, Radius actually
 % holds the square of the radius, since we're comparing it against
 % squared item corner radii anyway.
 /Radius ItemRadius dup mul def
 ItemList {
 begin % item

 % Remember the location to center the item edge.
 /x DX ItemRadius mul def
 /y DY ItemRadius mul def

 % Remember the location of the item's SouthWest corner.
 /ItemX x XOffset add round def
 /ItemY y YOffset add round def

 % Figure the distance of the item's farthest corner.
 % This is easy 'cause we can fold all the items into
 % the NorthEast quadrant and get the same result.
 DX abs .05 lt { % tippy top or bippy bottom

 % (x,y) is South edge: radius^2 of NorthEast corner
 x abs ItemWidth .5 mul add dup mul
 y abs ItemHeight add dup mul add

 } { % left or right

 % (x,y) is West edge: radius^2 of NorthEast corner
 x abs ItemWidth add dup mul
 y abs ItemHeight .5 mul add dup mul add

 } ifelse

 % Remember the maximum corner radius seen so far.
 Radius max /Radius exch store
 end % item
 } forall

 % Take the square root and add some extra space.
 /Radius
 Radius sqrt Gap add Border add ceiling cvi
 store % Whew, we're done! Time to party!
 } def


December, 1991
ENHANCING THE X-WINDOW SYSTEM


Adding a paperlike interface and handwriting recognition




Jim Rhyne, Doris Chow, and Michael Sacks


The authors are researchers at the IBM T.J. Watson Research Center and can be
contacted at P.O. Box 704, Yorktown Heights, NY 10598 or at
jrhyne@ibm.com. Note: Parts of this article were presented at the X-Window
technical conference earlier this year.


About four years ago, we began working on enhancements to the X-Window system
to provide a stylus-based user interface for handheld computers. This article
focuses on those X11 extensions, specifically those that support stylus-driven
applications.
We use the term PaperLike Interface (PLI) to distinguish the emerging
generation of notepad computers from those machines that rely on keyboard and
mouse interaction. Our group has been researching the technology associated
with this new class of machines, and we've built several prototypes that run
on AIX and X11.
The specifications for our research machines are a moving target, but our goal
is to build a machine with a 640x480 display (16 gray levels), under 6
pounds, and comparable to a 32-bit personal computer in speed and storage.
Currently, the system software for our prototype consists of AIX and a
modified X11, Release 4. The operating system includes TCP/IP, sockets, and
NFS, and it is quite feasible to run large, compute-intensive applications on
a host machine while running an X server on the notepad prototype.
The distributed nature of X applications is vital to our development plans.
One of our sample applications is a cooperative meeting application in which
several networked users draw on a shared drawing surface (single client,
multiple servers). When we acquire wireless LAN capability early next year,
the distributed computing model will be even more important.


Application Software Architecture


The software architecture of the system is partitioned into three areas:
application, server, and kernel. X systems use a distributed architecture,
with multiple client-side applications communicating over a channel (which can
be a local area network) with one or more X servers that provide graphics
display and input event handling services. The kernel is the component in
which hardware dependencies such as device drivers are contained.
The application layer is itself subdivided into four layers. At the topmost
level is code that is purely application specific. This code calls on services
provided by the next lower layer, the OSF/Motif widget set. (Widgets are user
interface components such as dialogs, list boxes and text-edit fields.) The
third level down is the so-called intrinsics layer (Xt) of the X11 toolkit,
and finally there is the Xlib library of primitives that implement the X
client/server protocol.
Implementing our PLI system required modifications to all these areas of the
system.
PLI applications are built using an extended version of the OSF/Motif widget
set. We've added new widgets to this set, and these widgets connect with an
X11 server that has been modified to support an extended protocol. Dispatching
of stroke events to widgets required the modification of the Xt of the X11
toolkit. And, of course, supporting stylus-oriented interaction required
modifications to digitizer device drivers.
We'll describe the modifications to each of these layers, in turn, starting
with the OSF/Motif widgets.


The Writing Area Widget


A principal new widget we created is called the WritingArea widget. This
widget receives strokes from the server and invokes application-supplied
callback functions. It is subclassed from the Motif DrawingArea widget and
uses that widget's exposure callback and other resources.
The WritingArea widget is basically a primitive stroke-receiver widget
combined with replaceable behavior modules invoked as callbacks. Callbacks are
provided for stroke receipt and acceptance, for stroke processing, and for
exposure events. The stroke receipt callback decides whether to accept or
reject the stroke. If the stroke is accepted, the stroke processing callback
is invoked with the array of coordinates comprising the stroke. The exposure
callback is invoked whenever the server determines that part of the widget's
window needs to be redisplayed by the application. The widget maintains a list
of active strokes and redisplays them after the application's exposure
callback has completed. This widget may also be configured so that it does not
store or display accepted strokes. This configuration is useful for
applications which will store the strokes and redisplay them during exposure
callback processing.
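The callback structure described above might be modeled like this. The sketch is in Python for brevity and is not the widget's actual interface: the class name, method names, and the keep_strokes option are hypothetical stand-ins for the Motif resources and callbacks.

```python
class WritingArea:
    """Toy model of a stroke-receiver widget with replaceable callbacks."""

    def __init__(self, accept, process, expose, keep_strokes=True):
        self.accept = accept            # stroke -> bool: accept or reject
        self.process = process          # called with each accepted stroke
        self.expose_callback = expose   # application redisplay hook
        self.keep_strokes = keep_strokes
        self.active_strokes = []

    def stroke_received(self, stroke):
        """Run the receipt callback, then the processing callback."""
        if not self.accept(stroke):
            return False
        if self.keep_strokes:
            self.active_strokes.append(stroke)
        self.process(stroke)
        return True

    def expose(self, draw_stroke):
        """Let the application redraw first, then redisplay stored strokes."""
        self.expose_callback()
        for stroke in self.active_strokes:
            draw_stroke(stroke)
```

Constructing the object with keep_strokes=False corresponds to the configuration in which the application stores and redisplays strokes itself.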
Exposure event processing follows a typical sequence in which application
graphics are generated, followed by the display of writing baselines when
appropriate. Library procedures are provided for baseline generation and
accommodate user-specific parameters such as horizontal and vertical spacing
and the presence or absence of horizontal segmentation guides, such as
ticmarks.


The Recognition Widget


The WritingReco widget provides resources for configuring the recognizers, in
addition to those provided by the WritingArea widget. The WritingReco widget
relies on services provided by the Recognition/Presentation Toolkit, which is
currently being constructed. The various components are shown in Figure 1.
The Recognition/Presentation toolkit supplies the callback routines needed by
the WritingReco widget. It also simplifies the programming interface to the
recognizers, by providing a consistent user interface to recognition-related
services such as error correction, prototype, and recognizer management. In
addition, it provides a library of reusable functions for recognition and
recognition-related services which would otherwise have to be written by each
application developer.


Toolkit Support for Error Correction


In the PLI interface, the user corrects an error by touching the erroneous
displayed symbol with the pen, both to replace it with the correct symbol
and to correct the recognizer. Error correction is
therefore a special mode in which the toolkit receives and interprets strokes,
rather than passing them to the recognizers and the application.
A possible design for error correction has an error correction button placed
on the title line of the window border. Touching this button places the
toolkit in error correction mode. When the user touches a displayed symbol,
the touch stroke location is used to select the corresponding symbol from the
recognition results.
One of the possible error correction styles is activated; for example, the
next symbol from the set of possibilities might be displayed. The user exits
the error correction mode by again touching the error correction button. The
application designer or user selects an error correction style for each of the
application's recognition objects by defining resource values in the usual
way.
Other functions, such as adjustment of recognition parameters or training to
introduce a new symbol, are accessed by touching another button in the title
bar, then touching anywhere in a WritingArea widget's window. A pop-down menu
appears, from which the user selects the desired function. Subsequently, a
recognizer control panel may appear, or a training window. When the user
dismisses these windows, the toolkit exits the special mode and the
application resumes normal behavior.
Implementation of these functions is complicated because an application main
window may contain several WritingReco widgets. Each one is associated with an
instance of the recognition object which contains recent recognition results,
strokes, and result display regions, as well as the parameters for recognizing
strokes received in the widget's window.
A form for data entry, for example, may be composed of several WritingReco
widgets and their associated recognition toolkit instances. A particular
widget/toolkit pair might select a recognition vocabulary of numbers, if only
entry of numbers is allowed. This sort of restriction is valuable because
recognition accuracy and speed are improved, and the user is alerted to entry
errors by the display of special symbols where the recognizer is unable to
find a suitable match. For example, an "A" entered by the user in a numeric
entry field might appear displayed as a "?".
Touching one of the recognition function buttons causes a global variable to
be set, which is checked by each recognition object. A stroke received while
the variable is set will be routed to the corresponding toolkit function
rather than being sent for recognition.



Extensions to the X Protocol


The X11R4 protocol extension for PLI consists of a stroke event and seven
requests.
The stroke event has several subcases identified by the detail byte. These
subcases include the start of a stroke, motion during a stroke, the end of a
stroke, and proximity (which occurs when the pen position is detectable but
the pen is not touching the display surface).
To help the application determine whether to accept the stroke or request the
stroke path, the stroke event contains the starting and ending coordinates of
the stroke and the maximum and minimum values for X and Y. It also contains a
set of flags which indicate whether the start and end points are inside or
outside the window. These flags were selected because the corresponding tests were
frequently used in previous prototype applications to determine stroke
acceptance.
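The summary carried by a stroke event could be computed along these lines. The sketch is in Python and is not the extension's wire format: the field names are hypothetical, and it assumes the inside/outside flags are tested against the receiving window's rectangle.

```python
from dataclasses import dataclass


@dataclass
class StrokeSummary:
    start: tuple
    end: tuple
    min_x: int
    min_y: int
    max_x: int
    max_y: int
    start_inside: bool
    end_inside: bool


def inside(point, window):
    """True if point lies in window, given as (x, y, width, height)."""
    x, y = point
    wx, wy, ww, wh = window
    return wx <= x < wx + ww and wy <= y < wy + wh


def summarize(points, window):
    """Build the fixed-size summary for a stroke's list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return StrokeSummary(points[0], points[-1],
                         min(xs), min(ys), max(xs), max(ys),
                         inside(points[0], window),
                         inside(points[-1], window))


def fully_contained(summary, window):
    """One possible acceptance test: the bounding box lies in the window."""
    return (inside((summary.min_x, summary.min_y), window)
            and inside((summary.max_x, summary.max_y), window))
```

An application can apply a test like fully_contained to the summary alone and request the full coordinate path only for strokes it intends to keep.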
The stroke event structure is of fixed size, and thus cannot contain the
sequence of coordinates generated by the digitizer. To obtain these
coordinates, an application makes a request which returns a variable-length
data structure. This same request also converts the coordinates from the
screen-relative form retained by the server to a window-relative form.
Using another kind of request, applications can accept or reject a stroke. The
stroke event contains a server-generated ID used to identify the stroke to be
accepted or rejected. The protocol requires that each stroke eventually be
accepted or rejected by the applications that see it. When this condition is
met, the server will erase the stroke ink and delete the stroke from its
queue. The protocol allows strokes to be forced from the server queue, and
this may be needed when a client hangs without accepting or rejecting some
strokes. Strokes are automatically accepted for a client which dies; to reject
them might lead to creation of unwanted pointer events.
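The bookkeeping implied by this rule can be sketched as follows. This Python model is an assumption-laden illustration, not the server's code: the class and method names are hypothetical, and erasing ink is reduced to removing the stroke from a dictionary.

```python
class StrokeQueue:
    """Tracks which clients still owe an accept/reject for each stroke."""

    def __init__(self):
        self._next_id = 1
        self._waiting = {}  # stroke ID -> set of clients yet to answer

    def enqueue(self, clients):
        """Queue a stroke seen by `clients`; return its server-generated ID."""
        stroke_id = self._next_id
        self._next_id += 1
        self._waiting[stroke_id] = set(clients)
        return stroke_id

    def answer(self, stroke_id, client):
        """Record an accept or reject; return True once the stroke retires."""
        waiting = self._waiting.get(stroke_id)
        if waiting is None:
            return False
        waiting.discard(client)
        if waiting:
            return False
        del self._waiting[stroke_id]  # ink erased, stroke leaves the queue
        return True

    def client_died(self, client):
        """Treat every outstanding stroke as accepted by the dead client."""
        return [sid for sid in list(self._waiting) if self.answer(sid, client)]

    def force(self, stroke_id):
        """Forcibly drop a stroke, e.g. when a client hangs."""
        return self._waiting.pop(stroke_id, None) is not None

    def pending(self):
        return sorted(self._waiting)
```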
Stroke replies contain scaled coordinates rather than pixel coordinates (see
the discussion in the "Device Driver" section for details) and cannot be drawn
using the XDrawLine library function. To simplify application programming, the
extension provides an XDrawStroke function and protocol request with similar
parameters. The server converts the stroke coordinates and invokes the
line-drawing procedure.
There is also a request which allows a client to request realignment of the
digitizer and the display. The client that performs the function is typically
invoked from the window manager's menu.
Another similar request allows a client to set the pointer button being
emulated by the stylus. This is not set from the window manager menu, but from
a small icon permanently displayed on the screen. There is a request to
enqueue a stroke, which is used to help debug the server and the toolkits.
Finally, applications can query the server for details about the display and
digitizer capabilities by using yet another request.


Stroke Routing and Pointer Emulation


The stroke processing functions of the X11 server have been grouped into a
server extension, with a corresponding extension to the X11 protocol. The
design of these functions is somewhat surprising, as a stylus is neither a
keyboard nor a mouse, but may be called upon to emulate either.
Experiments with our early prototypes led to the following observations:
A consistent method of pointer emulation is required, so that existing
applications can be driven by the stylus.
A stroke may extend across several windows, and only the applications can
determine whether a stroke is acceptable in one of their windows.
Recognition of strokes will differ from window to window, precluding a simple
strategy of recognizing strokes prior to dispatching them to applications.
Because a handwritten character is two to three times larger than a
presentation font symbol, a user will often wish to continue a string of
handwritten characters beyond the boundary of the window in which the string
began.
Unlike a pointer event, which has a single point of interest, a stroke ranges
over an area of the screen. What point in the stroke should determine the
window routing? For many gestures, it will be the start of the stroke. But for
others such as the arrowhead, the salient point will lie at a point along the
stroke that is found by the recognizer. In the case of the arrowhead, the
natural salient point is its tip.
We observed that users tended to work in a particular window, and this
suggested routing strokes to a particular window until that window's
application rejected a stroke. When the server receives a stroke rejection, it
selects another candidate window for the rejected stroke and all that follow
it. This routing scheme permits an application to capture handwriting which
runs outside of window boundaries. It also permits an application to recognize
a stroke before deciding whether to reject or accept it. However, this
algorithm has the property that a misbehaving client can cause all strokes to
be routed to it and defeat pointer emulation. When this happens, the server
becomes useless until the client is killed by some external means (such as
telneting in from another workstation).
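The routing scheme just described can be modeled in a few lines. This Python sketch is illustrative only: StrokeRouter and its arguments are hypothetical, and the real server works asynchronously through protocol requests rather than a direct accepts function.

```python
class StrokeRouter:
    """Keeps routing strokes to one window until that window rejects one."""

    def __init__(self):
        self.current = None

    def route(self, stroke, candidates, accepts):
        """Return the window that accepts `stroke`, or None if none does.

        `candidates` lists the windows the stroke touches, in preference
        order; `accepts(window, stroke)` stands in for the client's reply.
        """
        order = []
        if self.current is not None:
            # The window in use is tried first, even if the stroke has
            # wandered outside its boundary.
            order.append(self.current)
        order.extend(w for w in candidates if w != self.current)
        for window in order:
            if accepts(window, stroke):
                self.current = window
                return window
        # Nobody wanted it as a stroke.
        self.current = None
        return None
```

A client whose accepts reply is always true captures every stroke once it becomes the current window, which is the misbehaving-client hazard noted above.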
Alternative solutions considered were: moving the recognition function to the
server and using recognition results to assist in the routing decision, or
routing strokes to all windows at the same time and letting them decide
whether to accept or reject the stroke. Moving the recognizer seemed
infeasible because each application requires a distinct symbol set and applies
differing criteria to weight recognition results. In addition, the interface
to the recognition software is quite complicated. We may revisit this decision
in the future, as we better understand the requirements for recognition and
its software architecture. At first glance, routing strokes to all clients at
the same time seems an invitation to chaos. However, applications may be
designed with this behavior in mind and should agree on a unique recipient
virtually all of the time.
There are several cases to consider:
1. The user clicks within a nonstroke window.
2. The user makes a stroke within a stroke window.
3. The user makes a gesture (for example, a caret) which is partly outside the
stroke window.
4. The user makes a pointer drag interaction.
In the first case the stroke has no potential stroke routing candidates
because it is entirely within the nonstroke window. As soon as the end of the
stroke is seen, the server turns the entire stroke into a pointer event.
Because the stroke duration is short, the user never notices that the pointer
emulation decision occurs at the end of the stroke.
In the second case, the stroke lies entirely within the window, so there is
only one routing candidate.
In the third case, in which the stroke is partly outside the stroke window,
there are two variations, depending on whether the other candidate window is a
stroke or nonstroke window.
If it is a stroke window, the acceptance/rejection test is based on where the
salient point of the gesture or character falls. The stroke is recognized by
the primary application and its salient point falls inside the window, so the
application accepts the stroke. The other application may also recognize the
stroke, but finds that the salient point falls outside the visible region of
its window and so rejects the stroke. If the other application is not
performing recognition, then it should reject any stroke which lies partially
outside the visible region of its window. If neither window is performing
recognition, both will reject the stroke and it will disappear. Hopefully the
user will find this response to be reasonably intuitive, and will then make
the stroke again within the proper boundaries.
If the stroke falls partly outside the stroke window onto a nonstroke window,
the stroke is not turned into a pointer event unless there are no stroke
candidates, or all stroke candidates have rejected the stroke. Therefore, the
stroke window will see the stroke events, but the nonstroke window will not
see pointer events unless the stroke window rejects the stroke. A misbehaving
stroke application can prevent a stroke that enters its window from being
turned into pointer events. The user can make the stroke again, avoiding the
window of the misbehaving application, if pointer emulation was intended. The
stroke remains on the display until all candidates have accepted or rejected
it. The user expected the stroke to disappear (as a result of pointer
emulation), and its failure to disappear is a clue that an application is
misbehaving.
The fourth and last case is one in which the user drags the stylus as if it
were a pointer. This case is difficult because the pointer emulation decision
must occur at the start of the stroke. In the meantime, the motion of the
stylus may cross several windows (which can be either stroke or nonstroke
windows).
What will likely trouble the user is that the drag echo won't occur until the
user has lifted the stylus; this is not what is expected.
Special handling is necessary here. If the start of the stroke lies in a
nonstroke window, and the stylus remains relatively stationary for a brief
period (for example, 100 msec), then the stroke is converted to a series of
pointer events and never routed as a stroke. Most users performing a drag
quickly discover that the button-down event appeared at the wrong position,
and they have missed the target they were trying to hit. This behavior is
especially pronounced when trying to drag a window border to resize it,
because of the narrowness of the borders. The mouse is held essentially still
during this wait time (and so is the stylus).
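The dwell test for drags might look like the following. It is a Python sketch under stated assumptions: the function name, the sample format, and the two-pixel tolerance for "relatively stationary" are invented for the example; only the 100-msec figure comes from the text.

```python
def is_pointer_drag(samples, starts_in_nonstroke_window,
                    dwell_ms=100, tolerance=2):
    """Decide whether a stroke should become pointer events immediately.

    `samples` is a list of (time_ms, x, y) reports beginning at pen-down.
    Returns True once the pen has stayed within `tolerance` pixels of its
    starting point for at least `dwell_ms` milliseconds.
    """
    if not starts_in_nonstroke_window or not samples:
        return False
    t0, x0, y0 = samples[0]
    for t, x, y in samples:
        if abs(x - x0) > tolerance or abs(y - y0) > tolerance:
            return False  # the pen moved away before the dwell elapsed
        if t - t0 >= dwell_ms:
            return True
    return False  # not enough data yet to decide
```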


Enabling Windows for Stroke Routing


X11 allows applications to indicate interest in getting reports of various
kinds of events which occur in each of their windows. We extended this
mechanism to stroke events, and used it to trigger pointer emulation. If a
window is tagged for pointer events, but not for stroke events, then a stroke
which would be routed to this window is converted into pointer events.
The conversion is a natural one: The stroke start becomes a button-down, the
stroke end becomes a button-up, and the intermediate reports become pointer
motions. The stylus thus naturally mimics the mouse, and experienced mouse
users rarely make mistakes in employing the stylus. The stylus leaves an ink
trail in this mode and although this is initially noticeable, for instance
while moving or resizing a window, it does not impede the user and none of our
subjects has asked us to eliminate it. The server deinks strokes as soon as it
determines that pointer emulation is active, and the ink is usually gone
within a fraction of a second.
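The conversion from a stroke to pointer events is simple enough to show directly. In this Python sketch the event names are borrowed from the core X11 event types, but the tuple layout and the function itself are hypothetical.

```python
def stroke_to_pointer_events(points, button=1):
    """Convert a stroke's (x, y) points into a list of pointer events.

    Each event is (name, x, y, button); motion events carry button 0.
    """
    if not points:
        return []
    first, last = points[0], points[-1]
    events = [("ButtonPress", first[0], first[1], button)]
    # Every report between the first and last becomes a pointer motion.
    for x, y in points[1:-1]:
        events.append(("MotionNotify", x, y, 0))
    events.append(("ButtonRelease", last[0], last[1], button))
    return events
```

The button argument corresponds to the emulated button the user selects with the on-screen icon.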
We currently provide multiple-button support via a small icon which the user
may touch to select the button being emulated. This provides the needed
function, but encourages frequent user errors because users forget to restore
the original button setting.


The PLI Device Driver


This kernel component manages the hardware interface to the digitizer,
generates ink on the display, and provides a standard interface to the X11
server. Anticipating frequent changes to digitizer and display hardware as
well as the need to support several operating systems, we constructed the PLI
driver in three parts:
1. The digitizer driver, which handles the digitizer and its hardware
connection.
2. The display driver, which initializes the display and provides inking and
deinking functions.
3. The OS driver, which interfaces the other parts to the operating system,
and transfers data to and from the application.
Standard interfaces are defined between these components, making it possible
to support a new digitizer by replacing just the digitizer driver.
Encapsulating the OS functions has resulted in extra procedure calls in the
device driver, but the execution time penalty is small and the driver
portability greatly improved. The existence of three IBM operating systems for
the IBM PS/2 (DOS, OS/2, and AIX) makes portable software quite valuable.
The device driver is opened by the server. Digitizer reports are then read as
a character stream. The application can be notified when data is available; in
AIX the select system call is used. The server may control the behavior of the
device driver by writing to it. If supported by the operating system, the
device driver may place its data directly in a circular buffer accessible to
the application, to avoid the system call overhead and double copying of the
data.
When the pen touches the writing surface, the device driver begins to report a
stream of coordinates to the server. At the same time, the device driver is
generating an ink trace on the display. The stream of coordinates from
pen-down to pen-up is called a stroke, and is the primary data unit reported
by the device driver. To avoid excessive overhead, the device driver buffers
the coordinate stream and occasionally indicates, via select, that data is
available for the server. Our current digitizer provides position reports even
when the stylus is a small distance above the surface. The device driver does
not buffer this data, but periodically reports the current position.
Inking is done in the device driver to provide realtime feedback. The X11
server runs as a single threaded application process and cannot guarantee
realtime attention to the device driver. The device driver saves the critical
display state, performs its inking, and restores the display state; thus, it
can time-share the display with the X11 server. Unfortunately, not all displays
are designed so that the state can be saved and restored, and in this case,
the X11 server will need to be extensively modified to provide a separate
inking thread with locks to control sharing of the display. The server will
erase the ink, which eliminates the need for the device driver to buffer
potentially large amounts of data in its memory.

The Bresenham line algorithm is used to connect successive digitizer points
while the stylus switch is depressed. Because of the high sampling rate of the
digitizer, the stylus rarely moves more than one or two pixel positions on the
display between samples. The inking process is invoked only when the stylus
has moved more than one pixel from the previous sample.
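For reference, here is a compact integer-only form of the Bresenham line algorithm, written in Python for readability. It is the standard textbook formulation, not the driver's code, and the function name is an assumption; a driver would set pixels in the ink plane instead of collecting them in a list.

```python
def bresenham(x0, y0, x1, y1):
    """Return the pixels on the line from (x0, y0) to (x1, y1), inclusive."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    error = dx + dy  # running error term; integer arithmetic only
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        doubled = 2 * error
        if doubled >= dy:   # step along x
            error += dy
            x0 += step_x
        if doubled <= dx:   # step along y
            error += dx
            y0 += step_y
    return points
```

Because it needs only additions, comparisons, and a doubling, the algorithm suits a device driver that has no floating-point services.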
Ink is generated on one of the four planes of the display. The server may
freely use the other three planes, providing eight gray levels. The ink plane
is combined with the display plane using XOR implemented in the display color
map. Other ink-combining functions are possible, but preserving the contrast
between ink and application graphics is critical.


Coordinate Transformation


There are three coordinate systems to contend with: digitizer coordinates,
display screen coordinates, and window-relative coordinates.
The digitizer resolution is typically 2 to 16 times greater than the display
resolution, and the digitizer resolution must be preserved for accurate
recognition. To generate the ink trace, coordinates must be converted to
display screen units. Furthermore, the server and applications want to see
stroke information relative to the display screen or to windows on the display
screen, and not in some coordinate system provided by the digitizer
manufacturer.
The device driver addresses these issues by returning scaled screen
coordinates which have been multiplied by a factor of 2, 4, 8, or 16. The
subpixel resolution of the digitizer is preserved, and the conversion back to
integral pixel coordinates can be done with a right shift.
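The arithmetic on scaled coordinates can be illustrated briefly. This Python sketch assumes a scaling factor of 4 (a shift of 2) and hypothetical function names; it also shows how a screen-relative scaled coordinate might be made window-relative without losing the subpixel bits.

```python
SCALE_SHIFT = 2  # assumed: coordinates are multiplied by 4


def to_pixels(sx, sy, shift=SCALE_SHIFT):
    """Drop the subpixel bits to obtain integral pixel coordinates."""
    return sx >> shift, sy >> shift


def to_window_relative(sx, sy, window_origin, shift=SCALE_SHIFT):
    """Convert scaled screen coordinates to scaled window coordinates.

    `window_origin` is the window's upper-left corner in screen pixels.
    """
    ox, oy = window_origin
    return sx - (ox << shift), sy - (oy << shift)
```

A recognizer would work with the scaled values, while drawing code would call to_pixels first.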
The device driver uses a simple linear model to convert the digitizer
coordinates to scaled display coordinates:
 x'=ax+by+c
 y'=dx+ey+f
The linear model requires eight parameters and compensates for scale,
translation, and rotation between the digitizer and the display coordinate
systems introduced when the display and digitizer are joined together. The
computation uses integer arithmetic, because floating-point services are not
usually available to device drivers.
The coefficients a through f are prescaled to prevent loss of significance
during the computation. The resulting coordinates are pixel values scaled to
preserve the dynamic range of the digitizer. Currently, we use a scaling
factor of 2^2.
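A minimal sketch of the integer-only conversion follows, assuming the scaling factor quoted above is 2^2 = 4 (two subpixel bits) and that the coefficients are prescaled by 2^16. The 16-bit prescale, the 64-bit intermediates, and the names xform, to_scaled, and to_pixel are all assumptions made for this example.

```c
#include <stdint.h>

#define COEF_SHIFT     16  /* assumed: fractional bits in each coefficient */
#define SUBPIXEL_SHIFT 2   /* output is pixel units times 2^2              */

struct xform { int32_t a, b, c, d, e, f; };   /* each value times 2^16 */
struct point { int32_t x, y; };

/* Map raw digitizer coordinates to scaled display coordinates using
 * x' = ax + by + c and y' = dx + ey + f, with integer arithmetic only.
 * Results are assumed to be non-negative. */
struct point to_scaled(const struct xform *t, int32_t x, int32_t y)
{
    int64_t sx = (int64_t)t->a * x + (int64_t)t->b * y + t->c;
    int64_t sy = (int64_t)t->d * x + (int64_t)t->e * y + t->f;
    struct point p;
    p.x = (int32_t)(sx >> COEF_SHIFT);
    p.y = (int32_t)(sy >> COEF_SHIFT);
    return p;
}

/* Drop the subpixel bits to get an integral pixel coordinate. */
int32_t to_pixel(int32_t scaled)
{
    return scaled >> SUBPIXEL_SHIFT;
}
```

For a digitizer with eight times the display resolution, a and e would be 0.5 in this representation (1 << 15), so a raw x of 800 becomes a scaled 400, which is pixel 100 after the right shift.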
The eight parameters must be provided by the server, and are written to the
device driver during its initialization. Generally, the parameters are
obtained by displaying a crosshair at three locations on the display and
asking the user to touch each crosshair. The crosshair coordinates and the
averaged digitizer coordinates fully determine six parameters of the
conversion function. The other two parameters are fixed at design time by the
dynamic range of the digitizer and the resolution ratio between the digitizer
and the display. One writes a command to the device driver to turn off the
inking and set up the unity conversion function, and the driver subsequently
reports the raw digitizer coordinates. After the six parameters are computed,
they are written to the device driver and inking is restored.
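The three-crosshair calibration amounts to solving two 3x3 linear systems, one for (a, b, c) and one for (d, e, f). The sketch below does this with Cramer's rule in floating point, on the assumption that the computation runs in the server rather than the driver; the function names and the use of doubles are illustrative choices, and converting the results to prescaled integers is left out.

```c
struct affine { double a, b, c, d, e, f; };

/* Solve p*x[i] + q*y[i] + r = u[i] for i = 0..2 by Cramer's rule.
 * Returns -1 when the three digitizer points are (nearly) collinear. */
static int solve3(const double x[3], const double y[3], const double u[3],
                  double *p, double *q, double *r)
{
    double det = x[0] * (y[1] - y[2]) - y[0] * (x[1] - x[2])
               + (x[1] * y[2] - x[2] * y[1]);
    if (det > -1e-9 && det < 1e-9)
        return -1;
    *p = (u[0] * (y[1] - y[2]) - y[0] * (u[1] - u[2])
          + (u[1] * y[2] - u[2] * y[1])) / det;
    *q = (x[0] * (u[1] - u[2]) - u[0] * (x[1] - x[2])
          + (x[1] * u[2] - x[2] * u[1])) / det;
    *r = (x[0] * (y[1] * u[2] - y[2] * u[1])
          - y[0] * (x[1] * u[2] - x[2] * u[1])
          + u[0] * (x[1] * y[2] - x[2] * y[1])) / det;
    return 0;
}

/* dig_x/dig_y: averaged raw digitizer readings at the three crosshairs.
 * scr_x/scr_y: where the crosshairs were drawn on the display. */
int calibrate(const double dig_x[3], const double dig_y[3],
              const double scr_x[3], const double scr_y[3],
              struct affine *t)
{
    if (solve3(dig_x, dig_y, scr_x, &t->a, &t->b, &t->c) != 0)
        return -1;
    return solve3(dig_x, dig_y, scr_y, &t->d, &t->e, &t->f);
}
```

The collinearity check is the reason the crosshairs should not be placed on a single line: three collinear points leave the system underdetermined.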
This calibration procedure also compensates for visual parallax. Rather than
calibrate the driver once during initialization, we permit the user to
recalibrate at will as a way to compensate for periodic changes in viewing
position.
The device driver also timestamps the beginning and end of each stroke. In our
system, these timestamps are accurate to one sixtieth of a second. The primary
use for the timestamp is to detect unintended breaks in a stroke. It is
physically difficult for a user to lift and lower the pen in less than 0.07
seconds, so when an application sees a stroke ending and a new one beginning
in an interval smaller than that, it may concatenate the two strokes and
interpolate the missing data values.
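The break test can be sketched as follows, assuming timestamps are tick counts at the sixtieth-of-a-second resolution mentioned above. The stroke structure and function name are hypothetical; the comparison is kept in integers so that 0.07 seconds never has to be represented as a fraction.

```c
#include <stdbool.h>
#include <stdint.h>

#define TICKS_PER_SECOND 60   /* timestamp resolution given in the text */

struct stroke { uint32_t start_tick, end_tick; };

/* True when the pause between two strokes is shorter than 0.07 s, which
 * the text treats as too fast to be a deliberate pen lift.
 * gap / 60 < 7 / 100 is rewritten as gap * 100 < 7 * 60. */
bool is_unintended_break(const struct stroke *prev, const struct stroke *next)
{
    if (next->start_tick < prev->end_tick)
        return false;                       /* out of order; do not merge */
    uint32_t gap = next->start_tick - prev->end_tick;
    return gap * 100u < 7u * TICKS_PER_SECOND;
}
```

At this resolution the threshold works out to a gap of four ticks or fewer, since five ticks is already about 0.083 seconds.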
The device-driver interface is further complicated by the possibility of
internal buffer overflow. Internal buffer overflow causes immediate cessation
of inking to alert the user that something is wrong. The X11 server receives a
status report that the stroke ended prematurely; typically, it will discard
the stroke as we have found that users tend to lift the pen when the ink
ceases and will repeat the stroke when its visible part has been erased. All
the inked coordinates are reported, so that the server can erase them.


Conclusion


The policy of the MIT X Consortium to distribute sample source code for X11R4
has greatly facilitated our work. Other proprietary window systems would not
have permitted the kinds of modifications necessary to support stylus
interaction for a PaperLike Interface.
We have recently contributed a preliminary X11R5 implementation of the PLI for
the IBM RISC System/6000 to the MIT X Consortium. The code is available via
anonymous FTP from MIT. The future of PLI is potentially a bright one. We hope
that others will join us in exploring and developing this technology, and that
computing users will find it fun and effective.


References


Card, S.K., T.P. Moran, and A. Newell. The Psychology of Human-Computer
Interaction. Lawrence Erlbaum Associates, 1983.
Wolf, C.G. "A Comparative Study of Gestural and Keyboard Interfaces."
Proceedings of the Human Factors Society 32nd Annual Meeting, 1988.
Wolf, C.G. and J.R. Rhyne, "A Taxonomic Approach to Understanding Direct
Manipulation." Proceedings of the Human Factors Society 31st Annual Meeting,
1987.




























December, 1991
LINKING USER INTERFACE AND DATABASE OBJECTS


Designing an application environment for a pen-based machine




Eng-Kee Kwang and Christopher Rosebrugh


Eng-Kee Kwang and Christopher Rosebrugh are cofounders of PI Systems
Corporation. Eng-Kee is vice president of software development, and Chris is
vice president of engineering. They can be reached at 10300 SW Greenburg Rd.,
Suite 500, Portland, OR 97223.


It's no great revelation that different types of computers beget--or
require--different user interfaces. Witness the rapidly emerging class of
computers that use a pen or stylus as an input device. While many of these
systems are desk-oriented and use the pen primarily as a pointing device to
complement the keyboard, the system we describe in this article is designed
for mobile users collecting and analyzing data in the field, and uses the pen
as its primary input device. The familiar "desktop" UI metaphor is
inappropriate for an "armtop" computer that's capable of being cradled in the
crook of your arm like a common clipboard or notebook. For this tool an
alternative to the traditional desktop is more fitting.
In this first installment of a two-part article, we describe the software
architecture of the notebook UI we implemented for the Infolio, a pen-based
computer our company PI (Portable Information) Systems recently introduced.
The system is built around object-based user interface and database software.
It uses a notebook metaphor, with applications and other categories of
software accessed as "tab sections," and user data objects accessible as icons
in a "table of contents" section. Because of the high degree of software and
hardware integration, we'll begin with an overview of the complete system to
put into perspective the intricacies of how the UI and database objects are
linked and how they communicate with each other.
In our next installment, we'll describe the Infolio hardware architecture and
our hardware/software development and integration methodology. Although the
design we describe is unique to the Infolio, the concepts behind the design do
have broader applications.


System Overview


The Infolio computer (see Figure 1) is completely pen-based--it has no
keyboard--and uses handwriting recognition to accept alphanumeric characters
and command gestures "written" directly on its bit-mapped LCD display. The
character recognition is stroke-based and includes a training utility that
lets users map their own handprinting to characters and their preferred
gestures to commands. The device itself weighs less than three pounds and
measures 1.2 x 9 x 10 inches. Based on Motorola's 68331 small systems processor
(a variant of the 68020), it operates for up to 12 hours on a charge and can
also use standard AA batteries.
Rather than using floppies or hard disk, the Infolio executes code and
accesses data stored on PCMCIA memory cards inserted in any one of the three
available slots. One slot is dedicated to system memory, holding a
mixed-memory card which contains the system software in 1 Mbyte of ROM and the
system stack and heap in 1 Mbyte of SRAM. (PCMCIA, short for "Personal
Computer Memory Card International Association," is an industry-supported open
standard that defines solid-state, credit card-size IC cards. The recently
announced 2.0 spec supports functions for modems, fax, and networking in
addition to memory and application storage.) The other two slots can hold ROM
or SRAM cards of sizes ranging from 64 Kbytes to 8 Mbytes each. These cards
hold third-party applications and user data. Because the system memory card
also provides user data storage space, additional memory cards are not always
needed. Because the Infolio supports execute-in-place, read-in-place, and
write-in-place, the storage cards are used as continuous flat memory rather
than as a disk-like subsystem.


Design Goals and Requirements


From a hardware point of view, the system design goals included use of
low-cost, off-the-shelf components, minimum power dissipation and platform
weight, maximum database and graphics performance, and true portability.
From the human interaction point of view, the critical design requirement of
this system was that it be simple to use in field environments--as close as
possible to using pen and paper, but also taking advantage of computing
resources to streamline data collection tasks, minimize user errors, and
maximize data integrity.
As we conceptualized the software design, we established several goals:
User interaction should be highly graphical.
The system should be tuned to the pen as the basic input device.
The system should provide object-based data, user interface, and application
modules (the features normally associated with object-oriented architectures
that are most pertinent to the Infolio are the object hierarchy, data and
behavior encapsulation, and object independence).
The system should provide many reusable utilities so that creating a new
application becomes a simple matter of sticking a bunch of standard objects
together with minimal custom code.
The result is that building applications on our platform is straightforward.
Programming the machine consists primarily of connecting standard or
customized UI objects to database objects for viewing and editing. A typical
application consists of one or more forms definitions (which embody database
records), user-defined constraints between the fields in these records, and a
graphical representation of the records for both viewing and editing.
Another early decision was that, in order to meet our market window, the
software and hardware would have to be designed and implemented concurrently.
To make this possible, we defined and created a layer in the system above
which all software would (theoretically) be absolutely portable. This allowed
us to develop most of the software on SPARCStations while the hardware was
being prototyped. Of the megabyte of code we wrote, only 50 Kbytes (less than
five percent) is specific to the particular platform (Infolio, SPARCStation,
or PC).


User Interaction Paradigm


There are a number of features that support Infolio's use as a field data
acquisition and analysis tool.
First, information is input solely with a pen. The system converts the user's
hand-printing and command gestures into ASCII data and system commands. The
character recognition system is designed to learn individual users'
hand-printing idiosyncrasies; this feature also allows the system to be
adapted to process non-Roman languages.
Secondly, information in the system is presented to the user for manipulation
via a clipboard or notebook-style user interface. This user interface was
chosen because it is a familiar paradigm--most people are somewhat familiar
with a physical clipboard or a notebook, while only computer users are
comfortable with a windowing environment consisting of multiple overlapped
windows. The notebook interface simplifies interaction by presenting in the
forefront only information in which the user is interested. All other pieces
of information are easily accessible via tabs, but remain hidden from view and
do not clutter the screen. With the notebook interface, the user does not have
to juggle and manage the organization of multiple, overlapping windows.
Because the computer skills of users range from novice to expert, however, we
also implemented an easily accessible environment that supports multiple
overlapping windows. This mechanism allows overlapping windows to sit on top
of the basic clipboard view. Any application can be "torn-off" the clipboard
and become a floating window, which allows the user to look at more than one
piece of information at a time. There can be as many torn-off windows as the
user wishes. Figure 2 shows an example of a floating window.


Database Facilities


The software provides the foundation for a massively-linked database machine.
Some of the concepts present are an evolution of the Zog system, a hypertext
program that came out of research at Carnegie-Mellon in the early '80s.
Infolio supports both conventional searches of the system's database contents
and hyper-linking. Using the hyper-linking mechanism, a
database object can point to any other database object without the need to
understand context. For example, an icon can be embedded within an application
to link to another application--perhaps a note with text recorded in it. By
embedding a link icon to the note in an appointment entry, the user can access
background information about that appointment simply by tapping the pen on the
link icon.
Links also serve another, more fundamental function: The database architecture
makes use of the linking mechanism to build hierarchical objects. For example,
an address card object can be built by simply tying together a set of more
primitive database objects. This is shown in Figure 3. Each child object
retains its original behavior without the need for any additional code. In
effect, database linking provides the glue by which the entire software system
is constructed on top of the Infolio's kernel.



A Layered Architecture


We decided early in the design process to create a layered software
architecture that presents both a simple procedural interface to kernel
services and an object-oriented interface to the database and
user-interface functions. The rationale behind this decision was:
Don't complicate simple services (such as time and date calculation) with
complex interfaces.
Simplify complex services (such as database record creation and manipulation)
with simple interfaces.
This layered structure is shown in Figure 4.
Layering is more than simply a design philosophy. We strive to maintain the
integrity of layer boundaries, to foster a strongly modular system with
well-defined dependencies. References are almost always from a higher to a
lower layer. Interaction between adjacent layers dominates, with bottom-up and
multilayer references intentionally kept to a minimum. By minimizing random
references between any two modules, modules affected by changes to external
interfaces are more easily identified.
Layering in the system also serves to create a disciplined approach to
software design typically found only in hardware design. The layered structure
reflects the top-down approach used in designing large hardware systems. The
idea is to present to the developer (whether of hardware or software) a view
of the world that contains a limited number of discrete, easily managed pieces
of information to work with at any one time. This makes the system more
understandable. For example, if a developer is working on the Core Object
Framework layer (as shown in Figure 4), over 90 percent of the time that
developer need be concerned about services in the Kernel Services layer only;
only occasionally must he or she deal with other layers' contents.
An additional benefit of layering is that we can port the entire software
system to a different operating system, if so desired, with minimal code
changes. In most instances, only code in the kernel will need to be changed to
support a different platform and operating system.


Object Orientation


One of our software design goals was to create an object-based environment in
which the task of creating an application means simply instantiating and
connecting UI editing or viewing objects, and then attaching these UI objects
to appropriate database objects. The application only needs to provide code
that actually deals with its specific purpose rather than with handling
information flow in the system. All of the data manipulation and bookkeeping
mechanisms are already part of the objects' default behavior.
This object orientation also extends to how the system interacts with
applications themselves. From Infolio's point of view, applications are
objects. This enables a standard way for the system to interact with
applications. It also allows application objects to instantiate and embed
other applications within them without explicitly knowing implementation
details of the embedded application.
While much of the system design has an object-oriented flavor, we chose not to
use an object-oriented development language. Instead, we chose C as the
implementation language because it is flexible and widely-known, and because
it is well supported with respect to cross-compilers and debuggers. Other
considerations were code space limitations and the desire to minimize the
execution overhead inherent in object-oriented languages such as C++. Our
intent was to provide an object-oriented interface to the database, user
interface, and applications--not to provide a generally object-oriented
programming system. We call modules having procedural interfaces "toolkits,"
and those having object-oriented interfaces "toolboxes."
The object or class hierarchy mechanism in Infolio allows complex objects to
be built using simpler objects without duplicating the code associated with
the simpler objects. We can thus build complex applications incrementally
without incurring the cost of duplicated code.
Encapsulation of data and behavior protects the integrity of the object.
Changes to the data contained within an object can only be made through a
standard predefined message interface.


Event Handling


The interaction between the system and the user is based on events, which act
as catalysts for activity in the system.
Events in the Infolio include both physical and logical events. Physical
events include, for example, pen tip down, pen tip up, side switch down, side
switch up, card door open, low battery, failing battery, wake up, and time-out
alarms. Logical events are generated by the system to trigger a response
without actually targeting the responder. Examples of logical events include
inserting characters, deleting characters, selecting a region, closing a
window, moving a window, and refreshing windows.
The system uses the hierarchical organization of the user-interface objects to
determine the appropriate response to an event that occurs in a specific area
of the screen. Events that are not position-specific are handled by whatever
handler exists at the time the event is to be processed. If no handler exists,
the event is simply ignored and discarded.
As shown in Figure 5, the system event manager passes an event to the window
manager, which in turn passes it to the root UI object via the user-interface
framework. The event is then passed down the object hierarchy until it reaches
a "leaf" UI object (one that has no children). This UI object can then choose
to handle the event or bounce it back up the hierarchy. This mechanism gives
the lowest-level object (the most specific object) the first shot at handling
the event. If the event remains unhandled, it becomes a null event and is
discarded.
Because locality is the basis for handling events and passing messages to UI
objects, new objects can be installed without affecting any existing objects.
This also allows applications to change the way a UI object handles an event.
To illustrate how an application can change event handling, here's an example
using a text input object. The text object's normal behavior when passed an
Insert event is to take the data (in this case a string) and append it to its
internal storage. To make it a Read-Only text object, all the application has
to do is disable the handler for that message. When the object receives an
Insert event, it will then ignore the event and pass it back up the hierarchy,
as shown in Figure 5.
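The leaf-first dispatch and the read-only trick can be sketched in a few lines of C. Everything here (the ui_object layout, absolute screen coordinates, find_leaf, dispatch, count_event) is a hypothetical model built for this illustration and is not the Infolio framework's actual interface.

```c
#include <stdbool.h>
#include <stddef.h>

enum event_type { EV_INSERT, EV_PEN_DOWN, EV_TYPE_COUNT };

struct event { enum event_type type; int x, y; const char *text; };

struct ui_object;
typedef bool (*handler_fn)(struct ui_object *self, const struct event *ev);

struct ui_object {
    int x, y, w, h;                      /* absolute screen rectangle     */
    struct ui_object *parent;
    struct ui_object *children[4];
    int child_count;
    handler_fn handlers[EV_TYPE_COUNT];  /* NULL means "not handled here" */
    int handled;                         /* counter used by count_event   */
};

static bool contains(const struct ui_object *o, int x, int y)
{
    return x >= o->x && x < o->x + o->w && y >= o->y && y < o->y + o->h;
}

/* Walk down the hierarchy to the most specific object under (x, y). */
struct ui_object *find_leaf(struct ui_object *root, int x, int y)
{
    struct ui_object *o = root;
    for (;;) {
        struct ui_object *next = NULL;
        for (int i = 0; i < o->child_count; i++) {
            if (contains(o->children[i], x, y)) {
                next = o->children[i];
                break;
            }
        }
        if (next == NULL)
            return o;
        o = next;
    }
}

/* Offer the event to the leaf first, then bounce it up toward the root.
 * Returns the object that handled it, or NULL if it was discarded. */
struct ui_object *dispatch(struct ui_object *root, const struct event *ev)
{
    struct ui_object *o;
    for (o = find_leaf(root, ev->x, ev->y); o != NULL; o = o->parent) {
        handler_fn h = o->handlers[ev->type];
        if (h != NULL && h(o, ev))
            return o;
    }
    return NULL;
}

/* Example handler: accept any event and count it. */
bool count_event(struct ui_object *self, const struct event *ev)
{
    (void)ev;
    self->handled++;
    return true;
}
```

In this model, making a text object read-only is a one-line change: set its EV_INSERT slot to NULL, and the next Insert event bounces to the parent.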


Behavior Encapsulation


Objects are wholly responsible for implementing their own behavior, while
conforming to a minimal fixed external interface. The definition of an object
includes not only its appearance and its data storage, but also how it
interacts with events and with other objects.
For example, a number input object will only respond positively to an Insert
event that contains numeric characters, namely, 0-9. Any other characters will
be ignored. This object always exhibits the same behavior, regardless of where
it is instantiated. Because behavior is an integral part of the object, any
application that instantiates the object does not have to provide any code to
filter the input data.
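A number input object's Insert behavior might look something like the sketch below, in which the filtering lives inside the object so that no application has to repeat it. The structure, the 15-character limit, and the function name are assumptions for illustration only.

```c
#include <ctype.h>
#include <stddef.h>

#define NUM_INPUT_MAX 15   /* assumed capacity for this sketch */

struct num_input {
    char text[NUM_INPUT_MAX + 1];
    size_t len;
};

/* Handle an Insert: append the digits 0-9 and silently ignore every
 * other character. Returns the number of characters accepted. */
size_t num_input_insert(struct num_input *n, const char *s)
{
    size_t accepted = 0;
    for (; *s != '\0'; s++) {
        if (!isdigit((unsigned char)*s))
            continue;                 /* not numeric: ignored */
        if (n->len >= NUM_INPUT_MAX)
            break;                    /* storage full */
        n->text[n->len++] = *s;
        accepted++;
    }
    n->text[n->len] = '\0';
    return accepted;
}
```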
Objects with container ability have a specific interaction style with the
objects they contain. For example, three container objects are currently
defined: Bag, Scaffold, and OneOf. All three can contain zero or more objects
of any type, including other Bags, Scaffolds, or OneOfs. However, each
organizes its children differently, so that changes imposed on the container
objects will be reflected differently by their children.
A Bag provides for a visual organization of objects that is unconstrained or
freeform--its component objects can sit anywhere. When a Bag is resized, the
size and position of all its children with respect to the origin of the Bag's
space are unchanged. Objects sitting outside the visible portions of the Bag's
space are simply invisible to the user.
A Scaffold provides a structured row/column organization for the objects it
contains. Objects in a Scaffold are organized either in a row or in a column
structure. When a Scaffold is resized, it will attempt to distribute the
change in dimension to all its children that are set up to change. So, if
there are four objects in a row Scaffold, and two of the four objects are set
up to change, and if the row shrinks in the horizontal dimension by two
inches, each of the two resizable children will be made to shrink in
proportion to the new horizontal dimension.
A OneOf provides an overlapping organization of objects. All objects contained
in a OneOf occupy overlapping visual space, and only one child object is
visible at any one time. When a OneOf is resized, all its children are resized
to the same new size.
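The Scaffold's resize rule can be read in more than one way. The sketch below takes one plausible reading for a row Scaffold: the change in width is shared among the resizable children in proportion to their current widths, and fixed children are left alone. The proportional split, the rounding rule, and all names are assumptions, not the documented Infolio behavior.

```c
struct child { int width; int resizable; };

/* Distribute delta (negative to shrink) among the resizable children of
 * a row, in proportion to their current widths. Returns the change that
 * was actually applied. */
int scaffold_resize_row(struct child *kids, int n, int delta)
{
    int total = 0, applied = 0, last = -1;

    for (int i = 0; i < n; i++)
        if (kids[i].resizable)
            total += kids[i].width;
    if (total <= 0)
        return 0;                 /* nothing is able to absorb the change */
    if (delta < -total)
        delta = -total;           /* children cannot go below zero width  */

    for (int i = 0; i < n; i++) {
        if (!kids[i].resizable)
            continue;
        int share = delta * kids[i].width / total;
        kids[i].width += share;
        applied += share;
        last = i;
    }
    kids[last].width += delta - applied;   /* rounding remainder */
    return delta;
}
```

A Bag would need no such routine, since its children keep their size and position, and a OneOf would simply assign the new size to every child.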


Message Handling


All objects support a common set of messages, and can send these common
messages to any other objects. Common messages are akin to a base class
definition. Common messages can be passed around without any explicit
knowledge of the objects required. New objects can thus be defined, created,
and installed without having to recompile the system.
The message handling mechanism is implemented using function tables. All
object classes provide function tables in which a predetermined number of
slots in the table represent the common messages. The common messages which
all UI objects support are shown in Table 1.
Table 1: Common messages supported by all UI objects

 Message         Description
 --------------  ---------------------------------------------------
 UI_NEW_M        Creates a new instance of the object class
 UI_DELETE_M     Deletes an object instance
 UI_DRAW_M       Causes a redraw of the object instance
 UI_MOVETO_M     Changes the location of the object
 UI_RESIZE_M     Changes the dimension of the object
 UI_1CLICK_M     Passes a pen tap event to the object for processing
 UI_PEN_DOWN_M   Passes a pen down event to the object for processing
 UI_PEN_UP_M     Passes a pen up event to the object for processing
 UI_GESTURE_M    Passes a gesture event to the object for processing
 UI_GET_INFO_M   Inquires about attribute information of the object




Sending Messages To Objects


Every object instantiated in the system has a unique identifier that also
contains its object type.
For example, to pass a message do_something to the object named object_x, the
following will suffice: ui_do_something (object_x,...).
If object_x chooses not to handle the message do_something, the call will
simply fail without serious consequence. The caller therefore need not
worry about whether an object actually understands a particular message.
Examples 1 and 2 present pseudocode fragments that demonstrate how a new
message, Do_Something, is created by the developer with the system's help.
Example 1(a) shows how to define the new message and map it to a handler by
using a standard template. Example 1(b) shows how this template-based
definition is automatically expanded into implementation code by the PI "file
compiler." Example 2(a) shows the message-handler code written by the
developer, and Example 2(b) shows the system code that resolves the new
message and passes it to the object's new message-handling function.
Example 1: (a) Starting with common message template definition, the developer
defines the message and maps it to its handler function; (b) the system
implementation code that is automatically generated by the PI file compiler
from the definition in Example 1(a).

(a)
 /* message template description */
 %func_template bool UI_DO_SOMETHING_M (
     obj_id     /* IN: object pointer */
 );
 /* Object class message description mapping. It is NOT necessary to declare
  * parameters to message since they are already defined in template. */
 %function [UI_DO_SOMETHING_M] ui_Bag_do_something;

 _________________________________________________________________________

(b)
 /* These macros are used by the UI framework to pass messages to particular
  * objects without requiring explicit knowledge of the object classes.
  * The typedef and macros are AUTOMATICALLY generated from the message
  * template description. These macros, etc. are only used by the
  * framework. */
 #define UI_DO_SOMETHING_M_ID 1
 typedef void (*UI_DO_SOMETHING_M_fn_t) ( obj_id );
 #define UI_DO_SOMETHING_M(_ftbl) \
     (*(UI_DO_SOMETHING_M_fn_t)_ftbl[UI_DO_SOMETHING_M_ID])

 /* Messages are implemented as re-directed MACRO'ed functions through the
  * function table. This typedef and define are AUTOMATICALLY generated from
  * a high-level message description. */
 typedef void (*ui_Bag_do_something_fn_t) ( obj_id );
 #define ui_Bag_do_something (*(ui_Bag_do_something_fn_t)bag_ftbl [1])

 /* Message function table for object class Bag. This table is AUTOMATICALLY
  * generated from the high-level message description. */
 void *bag_ftbl [19] = {
     (void *)...,
     (void *) _ui_Bag_do_something,
     ...
 };

Example 2: (a) Developer's message handler code for the object; (b) the system
framework code that resolves the message and passes it to its
handler

 (a)
 void _ui_Bag_do_something(
     ui_Bag_t *bag    /* corresponds to an obj_id */
 )
 {
     ...
 }

 (b)
 /* Implementation of messaging mechanism for sending message DO_SOMETHING */
 bool ui_do_something(
     obj_id id,
     ...
 )
 {
     void **ftbl;

     /* look for message function table associated with the object ID. */
     ftbl = __find_object_ftbl (id);
     if (ftbl && &UI_DO_SOMETHING_M (ftbl)) {
         /* If function table exists for the object -> known object type
          * and message DO_SOMETHING is defined, pass the message to the
          * object (call the function!!) */
         return UI_DO_SOMETHING_M (ftbl) (id, ...);
     }
     return FALSE;
 }
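For readers who want something that compiles as it stands, here is a small stand-alone sketch of the same function-table idea. It is not the PI file compiler's output: the slot numbering, the struct layouts, and names such as send_message and bag_class are assumptions, and the table holds typed function pointers instead of void pointers so that no casts are needed.

```c
#include <stdbool.h>
#include <stddef.h>

enum { MSG_DRAW, MSG_DO_SOMETHING, MSG_COUNT };   /* assumed slot numbers */

struct object;
typedef bool (*msg_fn)(struct object *self);

struct object_class {
    const char *name;
    msg_fn ftbl[MSG_COUNT];        /* one slot per message; NULL if unset */
};

struct object {
    const struct object_class *cls;   /* the object's type */
    int value;
};

/* Send a message to an object. If the object, its class, or the slot is
 * missing, the call fails quietly by returning false. */
bool send_message(struct object *obj, int msg)
{
    if (obj == NULL || obj->cls == NULL || msg < 0 || msg >= MSG_COUNT)
        return false;
    msg_fn fn = obj->cls->ftbl[msg];
    return fn != NULL ? fn(obj) : false;
}

/* Developer-written handler for the Bag class. */
static bool bag_do_something(struct object *self)
{
    self->value++;
    return true;
}

/* Bag understands DO_SOMETHING; Text in this sketch understands nothing. */
const struct object_class bag_class  = { "Bag",  { NULL, bag_do_something } };
const struct object_class text_class = { "Text", { NULL, NULL } };
```

Sending MSG_DO_SOMETHING to a Text object here simply returns false, which mirrors the point that a caller does not need to know in advance whether an object understands a message.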



Data/Viewer Binding


The relationship between a data object and a viewer/editor object is loosely
constrained. This means that just because a data object contains an integer
value, it is not constrained to be viewed only as a number. Different viewers
can be attached to the same data object at the same time. For example, an
integer data object can be viewed as a number using a simple text or number
object, or graphically, as a gauge, by using a gauge object as the viewer.
In addition to being able to view the database information differently,
viewers can also be constructed to view distinct portions--or hide certain
pieces--of a complex database object.
For example, to create two views into an integer object, you would link user
interface objects to a database object as shown in Example 3. The graphical
result of these links is shown in Figure 6.
Example 3: Creating two views into an integer object by linking a UI object to
a database object

 db_handle = db_Int16_new(123);
 viewer_1 = ui_Num_new(...);
 viewer_2 = ui_Gauge_new(...);
 ui_attach_data(viewer_1, db_handle);
 ui_attach_data(viewer_2, db_handle);
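What might sit behind calls like these is sketched below: a data object keeps a list of attached viewers and refreshes each one whenever its value changes. The structures, the four-viewer limit, and the function names are hypothetical stand-ins and do not describe the Infolio toolbox itself.

```c
#include <stddef.h>

#define MAX_VIEWERS 4   /* assumed limit for this sketch */

struct viewer;
typedef void (*refresh_fn)(struct viewer *self, int value);

struct viewer {
    refresh_fn refresh;   /* how this viewer presents the value   */
    int shown;            /* what the viewer is currently showing */
};

struct db_int16 {
    short value;
    struct viewer *viewers[MAX_VIEWERS];
    int viewer_count;
};

/* Attach a viewer to a data object and show the current value at once. */
int attach_data(struct viewer *v, struct db_int16 *d)
{
    if (v == NULL || d == NULL || d->viewer_count >= MAX_VIEWERS)
        return -1;
    d->viewers[d->viewer_count++] = v;
    v->refresh(v, d->value);
    return 0;
}

/* Change the data and let every attached viewer redraw itself. */
void db_int16_set(struct db_int16 *d, short value)
{
    d->value = value;
    for (int i = 0; i < d->viewer_count; i++)
        d->viewers[i]->refresh(d->viewers[i], d->value);
}

/* A number viewer shows the value as is. */
void num_refresh(struct viewer *self, int value)
{
    self->shown = value;
}

/* A gauge viewer shows the value as a percentage of full scale. */
void gauge_refresh(struct viewer *self, int value)
{
    self->shown = value * 100 / 32767;
}
```

The data object knows nothing about how it is displayed, which is what allows a number viewer and a gauge viewer to share one integer.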



Conclusion


The layered software architecture and object-oriented implementation of
Infolio have enabled us to build a simple but powerful software system for our
portable pen-based computer. Next month, we'll examine the system's hardware
architecture.






































December, 1991
FS: A FILE STATUS UTILITY FOR UNIX


Using system calls to extract open files




Jeff Reagen


Jeff is a special projects engineer for Banyan Systems and can be reached at
28 Grant Street, Milford, MA 01757.


In the UNIX environment, just about every operation you can think of involves
a file in some way. For example, the interface to the hard disk occurs through
some special device file in the /dev directory, and running a command requires
reading the contents of a file in order to execute it. The number and types of
open files depend entirely on the environment of the
system.
Because UNIX is a file-based operating system by design, understanding the
system's working set of open files can provide a wealth of information when
exploring a running UNIX system.
In this article, I'll demonstrate how the working set of open files can be
extracted from the kernel using system calls and library routines from
sections two and three of the UNIX programmer's manual. In addition,
various include files are also
used, because many kernel data structures are either defined or alluded to in
these files. UNIX System V, Release 3.2, Version 2.2 from Interactive Systems
was used to develop the software for this article.
The file status (fs) utility that is the focus of this article allows the user
to examine the current status of the UNIX file table in much the same way ps
allows users to look at the list of active processes.
In UNIX, there exist two distinct address spaces or levels: user and kernel.
At user level, a program is allowed to execute nonprivileged instructions
only. Privileged operations tend to involve those instructions which access
hardware devices such as the CPU or hard disk. The kernel level is where
system software executes. This software is a collection of instructions and
data structures used to manage the computer's resources. Some of these
resources include process scheduling, memory management, and the file system.
System software can do anything! There are no protection schemes operating in
kernel mode to protect against misbehaving system software.
The file table is part of the system software. It maintains a list of all open
files in the UNIX system. Each file table entry contains information about the
entry as well as a pointer to an information node, or in UNIX terminology an
"inode." The inode in turn contains detailed information about the file it
represents. fs uses the combination of these two kernel data structures to
report the file status for every active file. The structure declarations for a
file entry and an inode entry can be found in /usr/include/sys/file.h and
/usr/include/sys/inode.h respectively.
Why all the fuss, you might wonder. Well, in UNIX a user-level program cannot
reference the data structures located in the kernel address space. To bridge
this gap, a pseudo device driver, referenced through /dev/kmem, is used to
treat kernel memory as if it were a file. This driver comes bundled with all
flavors of the UNIX system. Walking through the implementation of fs will
provide insight into how this interface is used.
Before going any further, I should say a fair amount of detective work is
required when referencing kernel data structures. You cannot just open a
manual and expect it to tell you that the internal name of the file table is
"file." Rather, the best advice I can offer involves exploration of the
/usr/include/sys directory, where an enormous amount of kernel internal
information exists. In the case of the file table, file.h includes an external
reference to an array of file structures called "file."


Implementation


As I said earlier, the memory interface /dev/kmem was provided so application
programs such as fs could have access to kernel data. Some of the structures
fs needs to reference can be found in the include files shown in Listing One
(page 96). Of particular interest are nlist.h, file.h, inode.h, and var.h.
nlist.h contains the definition of a name list structure. fs declares an array
of these nlist structures called kern_syms, which contains three nlist
structures. Two of these structures are used to search for the "v" and "file"
kernel data structures. These two structures play an important role for fs, as
we'll see shortly. The third entry is a null declaration the sole purpose of
which is to terminate the list. fs is only interested in using the n_name and
n_value symbols in the nlist structure. The n_name field is passed in,
instructing nlist to look up the symbol matching this string. n_value is the
corresponding value of the symbol in n_name that nlist has retrieved from the
kernel's symbol table.
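The lookup convention just described can be illustrated with a small sketch. It assumes, as the nlist manual page states, that a symbol which cannot be found is left with a value of 0. The structure and function names (sym_req, first_unresolved) are hypothetical stand-ins for illustration and are not part of the real nlist interface:

```c
#include <stddef.h>

/* Hypothetical stand-in for the two nlist fields fs cares about. */
struct sym_req {
    const char *name;   /* plays the role of n_name  */
    long        value;  /* plays the role of n_value */
};

/*
 * Return the index of the first requested symbol whose value is still 0
 * (not found), or -1 if every entry before the terminator was resolved.
 * The table ends at the first entry whose name is NULL or empty.
 */
int first_unresolved(const struct sym_req *tab)
{
    int i;

    for (i = 0; tab[i].name != NULL && tab[i].name[0] != '\0'; i++)
        if (tab[i].value == 0)
            return i;
    return -1;
}
```

A check of this kind covers every requested symbol, not just the first entry in the table.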
fs begins by calling uname( ) to determine the name of the current kernel; see
Listing Two (page 96). Remember, most UNIX bootstraps provide the capability
to boot alternative kernel images. The kernel name is used by nlist( ) to
identify the disk file containing the kernel image. fs continues by calling
nlist( ). When the call to nlist completes, valid offsets to our kernel data
structures can be taken from the n_value fields.
Next, /dev/kmem is opened twice. The first open, referenced through file
descriptor fd, is used for general-purpose work such as file seeks, reads, and
file table extraction. The second open creates a private file descriptor,
inofd, used to directly access a specified "in core" inode.
Now that the initialization stuff is out of the way, fs can get down to
business. First we want to get a copy of the "v" data structure into the user
space occupied by fs. To do so, fs creates a data structure large enough to
hold the contents of "v." Next, fs must position the file pointer to the base
of the "v" structure maintained by the kernel. Don't get confused here! There
are two references to the "v" structure. One is the kernel "v" structure --
this is the structure fs wishes to extract. The second is the local fs copy --
which is where the contents of the kernel version get copied to. Easy enough,
right?
To position the file pointer, fs calls lseek using the value field returned by
nlist. The pointer is now located at the beginning of the kernel's "v"
structure.
Next, a read is issued to read the kernel "v" into the local "v" structure
declared by fs. Voila, fs has successfully pulled the contents of a kernel
data structure out of kernel space and placed it into user space.
With the "v" structure in hand, fs calls file_entry (Listing Three, page 98)
to examine each entry of the file table. file_entry begins by positioning the
file pointer associated with fd to the beginning of the file table in the same
manner fs did for "v."
A loop is used to read in each file table entry one at a time. Because the
size of a file table can vary from UNIX system to system, it doesn't make
sense to "hard code" the value. Fortunately, UNIX puts the number of file
table entries into the "v" structure where the kernel and other programs such
as fs can gain easy access to the information. file_entry uses this count as a
loop control variable in the body of its function.
With the loop in place, file_entry just reads the next sequential file table
entry. If the reference count is greater than or equal to one, the entry is
considered valid and fs begins to explore.
UNIX permits a file to be opened in many different modes, including read
and/or write, write append only, and read only. file_entry( ) displays this
mode, which is taken from the f_flag field from the file structure. Next, the
file reference count and the position of the file pointer get displayed. The
reference count may be more than one! When UNIX forks (creates) a new process,
the file descriptors of the process being forked are inherited by the new
"child" process which causes the file entry reference count to be incremented.
With all of the file entry-specific information displayed, file_entry calls
inode_entry to take a look at the inode-specific information before moving on
to the next entry in the file table.
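Because the open mode is a set of flag bits, it can also be decoded one bit at a time rather than by matching whole combinations. The sketch below shows that alternative. The DEMO_ constants are made-up values for illustration only; the real definitions are in sys/file.h, and describe_flags is a hypothetical name:

```c
#include <string.h>

/* Illustrative bit values only; the real ones live in sys/file.h. */
#define DEMO_FREAD   0x01
#define DEMO_FWRITE  0x02
#define DEMO_FNDELAY 0x04
#define DEMO_FAPPEND 0x08

/*
 * Build a space-separated description of every flag bit set in 'flag'.
 * 'buf' must have room for 'len' bytes; the result is always terminated.
 */
void describe_flags(int flag, char *buf, size_t len)
{
    static const struct { int bit; const char *name; } tab[] = {
        { DEMO_FREAD,   "read"    },
        { DEMO_FWRITE,  "write"   },
        { DEMO_FNDELAY, "nodelay" },
        { DEMO_FAPPEND, "append"  }
    };
    size_t i;

    if (len == 0)
        return;
    buf[0] = '\0';
    for (i = 0; i < sizeof tab / sizeof tab[0]; i++) {
        if (flag & tab[i].bit) {
            if (buf[0] != '\0')
                strncat(buf, " ", len - strlen(buf) - 1);
            strncat(buf, tab[i].name, len - strlen(buf) - 1);
        }
    }
}
```

With this approach, a combination that nobody anticipated still produces a readable description instead of a bare hexadecimal value.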
inode_entry (Listing Four, page 98) is used to query the inode passed to it by
file_entry( ). The inode entry contains specific information about the file,
including its type, size, user ID, and group ID, as well as the inode number
itself.
UNIX supports many different file types. Among these are device files, which
couple the physical hardware to a file interface; FIFO files, which allow
programs to communicate with one another; directories, which maintain a list
of files; and of course plain files which may contain machine instructions or
text.
To gather the inode information associated with the current file entry,
inode_entry uses lseek to adjust the position of the inofd file pointer to the
offset found in the inode field of the file structure. On System V, Release
3.2 UNIX machines, this field is referenced through f_up.f_uinode. inode_entry
now transfers the current inode from the kernel into the user "inode" buffer
using the read system call. inode_entry( ) completes its work by displaying
the fields of interest described earlier.
Sample output produced by fs is shown in Figures 3 and 4. Figure 3 illustrates
how UNIX associates a piece of hardware, such as the video display and
keyboard in this example, with a file. Notice the operation being performed is
read and/or write, and the next operation is going to take place at
hexadecimal offset 4b16.
Figure 3: Sample output describing a character device

 Contents of file table entry #1
 File operation flag ...... File opened for read/write
 Reference count .......... 12
 File pointer position .... 4b16
 Inode number ............. 65
 File type ................ Character device (Maj 5/Min 0)
 File size ................ 0
 User id .................. 0
 Group id ................. 0

Figure 4: Sample output describing a regular file

 Contents of file table entry #17

 File operation flag ................. File opened for read/write
 Reference count ..................... 1
 File pointer position ............... 1200
 Inode number ........................ 974
 File type ........................... Regular file
 File exists on Major/Minor device ... 0/3
 File size ........................... 11264
 User id ............................. 0
 Group id ............................ 0

If I did not know that the major number for the video and keyboard driver is
5, it would still be possible to determine which driver is associated with
major number 5. Because fs says the device is a character type, it must be
declared in the character device switch table, also called "cdevsw".
The cdevsw is an array of all the character devices known to the UNIX system.
The major number is used as an index into the cdevsw. So in Figure 3,
the character device of interest is referenced by entry number 5 in the
cdevsw table.
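The indexing can be sketched as follows. The bit layout (major number in the high byte, minor number in the low byte) is an assumption made for illustration, and the one-field switch entry is hypothetical; a real cdevsw entry holds the driver's entry points, and the real macros are major( ) and minor( ) from sys/sysmacros.h:

```c
/* Assumed layout for illustration: major in the high byte, minor in the low. */
#define DEMO_MAJOR(dev) (((unsigned)(dev) >> 8) & 0xFF)
#define DEMO_MINOR(dev) ((unsigned)(dev) & 0xFF)

/* Hypothetical one-field switch entry standing in for a cdevsw slot. */
struct demo_cdevsw {
    const char *d_name;
};

/* Return the name of the driver selected by the major number, or 0 if
   the major number is beyond the end of the table. */
const char *demo_driver_name(const struct demo_cdevsw *tab, unsigned ntab,
                             unsigned dev)
{
    unsigned maj = DEMO_MAJOR(dev);

    return (maj < ntab) ? tab[maj].d_name : 0;
}
```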
Another piece of sample output from fs describing a regular file is shown in
Figure 4. Attributes of interest in this example include the special device
that holds the file system containing the file under inspection, and the
inode number. With these two pieces of information, the ASCII name of the file
can be obtained using existing UNIX tools. On my UNIX system, Major number 0
indicates we're dealing with a hard disk, and minor number 3 means it's disk 0
partition 3. Minor number 3 on most i386/i486 versions of UNIX is usually the
/usr file system. There are two ways the ASCII filename can be located. The
first involves the find command. Use find to recursively search through /usr
looking at the inode number of all regular files. Figure 1 describes the
command syntax one may use. Alternatively, the ncheck command could be used to
search a file system for the specified inode and return the ASCII name. Figure
2 illustrates the use of ncheck.
Figure 1: Using find, ls, and grep to get filename

 find /usr -type f -exec /bin/ls -i \{\} \; | grep 974

Figure 2: Using ncheck to obtain name of file

 ncheck -i 974 /dev/dsk/0s3



Installation


Because fs relies upon kernel memory access, it must execute with superuser
privileges. One way to do this is to log in as root every time you want to run
the program. This is not desirable because every user would need to know the
root password! A better approach is to use the set user ID bit, or setuid for
short. setuid works by making root the owner of fs and then changing the mode
of fs to include the setuid permission bit. Then, when fs is invoked, the
setuid bit instructs UNIX to run it with the effective user ID of root instead
of that of the user who actually issued the command. Figure 5 illustrates how
to initially install fs.
Figure 5: Installation of fs

 cp fs /bin/fs
 chown root /bin/fs
 chmod u+s /bin/fs



Extensions


This version of fs is primitive in that it displays a verbose listing of each
active file table entry. It's a trivial exercise to tailor the output of fs by
adding option switches. For example, an option could be used to
limit the display to only those files belonging to a specified user ID or
group ID. Another option switch might display files associated with a
particular file system.
A creative programmer may also decide to display the ASCII name of each file
system as opposed to the major and minor number of the disk partition housing
the file system. To accomplish this, the mounted on device name taken from the
inode is used as a key to search the mount table referred to by /etc/mnttab.
It's now a simple matter to extract the corresponding ASCII name from
/etc/mnttab.
Actual physical disk sectors occupied by a regular file or directory can be
obtained rather easily. To implement this feature, simply pick up the pointer
to the file system-specific inode from the "in core" inode. Then, display the
information referenced through the disk block field. For example, a 1K System
V file system uses the s5inode structure to represent the file system specific
portion of the inode. s5i_a describes which sectors contain the file contents.
The declaration of an s5inode can be found in /usr/include/sys/fs/s5inode.h.


Uses


fs is probably most useful to the people whose job it is to take care of UNIX
systems: system administrators. Administrators could use fs to profile their
file systems and determine whether the current layout is breaking down due to
an unbalanced load. For example, if one file system always
seems to have the majority of files active, then the other file systems are
essentially doing nothing to enhance overall system performance. Knowing this
information, a sharp administrator might move some of the directories from the
busy file system to one of the quiet ones.
Another use of fs is locating open files on a particular file system.
Remember, a file system cannot be unmounted until all files are closed!


Conclusion


Although fs provides a specific service, much of the software and many of the
algorithms can be used to build new programs which require information
maintained by the UNIX kernel. For example, with the help of a text editor, fs
could be transformed into a program used to report the configuration of
various static data structures such as the process table, incore inode table,
and even the file table.
Many existing UNIX programs use this algorithm to extract kernel information
on a regular basis. For example, the ps command displays the state of each
process currently known to UNIX, and under BSD 4.3, systat utilizes nlist to
extract the load averages maintained by the kernel.
The file status program developed here adds another useful utility to the
system administrator's toolbox, or possibly provides another means for the
curious programmer to explore UNIX in greater detail.

_FS: A FILE STATUS UTILITY FOR UNIX_
by Jeff Reagen



[LISTING ONE]

/* Copyright (c) 1991 Jeff Reagen ALL RIGHTS RESERVED. */
#include "sys/types.h"
#include "sys/fcntl.h"
#include "sys/param.h"
#include "sys/immu.h"
#include "sys/fs/s5dir.h"
#include "sys/signal.h"
#include "sys/user.h"
#include "sys/errno.h"
#include "sys/cmn_err.h"
#include "sys/buf.h"
#include "nlist.h"
#include "sys/stat.h"
#include "sys/region.h"
#include "sys/proc.h"
#include "sys/var.h"
#include "sys/sysmacros.h"
#include "sys/file.h"
#include "sys/inode.h"
#include "sys/utsname.h"

long lseek();

/* Symbols fs will request from the kernel. */
struct nlist kern_syms[] = {
 "v",
 0,
 0,
 0,
 0,
 0,

 "file",
 0,
 0,
 0,
 0,
 0,

 "",
 0,
 0,
 0,
 0,
 0
};

struct nlist *pinfo;
struct var v;
int fd;
int inofd; /* file pointer for inode lookup */








[LISTING TWO]

/* Copyright (c) 1991 Jeff Reagen ALL RIGHTS RESERVED. */
main ()
{
 int i;
 long res;
 struct utsname name;
 char realname[10];

 /* Obtain name of current kernel. It's not necessarily /unix! */
 if (uname (&name) < 0)
 {
 printf ("Cannot identify the current Unix system!\n");
 exit (1);
 }

 /* Prefix name of kernel with a "/" */
 sprintf (realname, "/%s", name.sysname);

 if (nlist (realname, kern_syms) < 0)
 {
 printf ("Could not get name list\n");
 exit (1);
 }
 pinfo = &kern_syms[0];

 /* Check that nlist resolved the "v" symbol (0 means not found). */
 if (pinfo->n_value == 0)
 {
 printf ("nlist call failed.\n");
 exit (2);
 }
 if ( (fd = open ("/dev/kmem", O_RDONLY)) < 0)
 {
 printf ("Cannot open /dev/kmem.\n");
 exit (3);
 }
 if ( (inofd = open ("/dev/kmem", O_RDONLY)) < 0)
 {
 printf ("Cannot open /dev/kmem.\n");
 exit (3);
 }

 /* Get Unix system variable structure. */
 if ( (res=lseek (fd, (long)kern_syms[0].n_value, 0)) == -1)
 {
 printf ("Can't seek /dev/kmem.\n");
 exit (4);
 }
 if (read (fd, &v, sizeof (struct var)) != sizeof (struct var))
 {
 printf ("Can't read sysinfo struct from /dev/kmem.\n");
 exit (5);
 }

 /* Display system information. */

 file_entry();
}







[LISTING THREE]

/* Copyright (c) 1991 Jeff Reagen ALL RIGHTS RESERVED. */
/* Examine each entry in the Unix file table. */
file_entry()
{
 int res;
 int fno;
 struct file file;

 /* Position to base of file table. */
 if ( (res=lseek (fd, (long)kern_syms[1].n_value, 0)) == -1)
 {
 printf ("Can't seek /dev/kmem.\n");
 exit (4);
 }

 /* read each entry one after the other. */
 for (fno = 0; fno < v.v_file; fno++)
 {
 /* get next slot. */
 if (read (fd, &file, sizeof (struct file)) != sizeof (struct file))
 {
 printf ("Can't read file slot.\n");
 exit (5);
 }
 /* Display entry info only if the reference count indicates
 the file entry is valid. */
 if (!file.f_count)
 {
 continue; /* abort this entry! */
 }
 printf ("Contents of file table entry #%d\n",fno);
 printf ("\tFile operation flag ................. ");
 switch (file.f_flag)
 {
 case FOPEN: printf ("File is opened\n");
 break;
 case FREAD: printf ("File is being read\n");
 break;
 case FWRITE: printf ("File is being written\n");
 break;
 case FREAD|FWRITE:
 printf ("File opened for read/write\n");
 break;
 case FWRITE|FAPPEND:
 printf ("File opened for write append\n");
 break;
 case FREAD|FWRITE|FAPPEND:
 printf ("File opened for read/write appends\n");
 break;
 case FNDELAY: printf ("File operating in delayed mode\n");
 break;
 case FAPPEND: printf ("File opened for appended operation\n");
 break;
 case FSYNC: printf ("File writes are synchronous\n");
 break;
 case FMASK: printf ("File masked ?????\n");
 break;
 default: printf ("%x\n",file.f_flag);
 break;
 }
 printf ("\tReference count ..................... %d\n",
 file.f_count);
 printf ("\tFile pointer position ............... %lx\n",
 file.f_un.f_off);

 /* process the inode */
 inode_entry (file);

 /* Pause so user can read the output. */
 getchar(); printf("\n\n");

 }
}






[LISTING FOUR]

/* Copyright (c) 1991 Jeff Reagen ALL RIGHTS RESERVED. */
/* Get the inode associated with the current file table entry. */
inode_entry (file)
 struct file file;
{
 int res;
 struct inode inode; /* current core inode */
 if ( (res=lseek (inofd, (long)file.f_up.f_uinode, 0)) == -1)
 {
 printf ("Can't seek /dev/kmem for inode.\n");
 exit (4);
 }
 if (read (inofd, &inode, sizeof (struct inode)) != sizeof (struct inode))
 {
 printf ("Can't read in core inode.\n");
 exit (5);
 }
 printf ("\tInode number ........................ %d\n", inode.i_number);
 printf ("\tFile type ........................... ");
 switch (inode.i_ftype)
 {
 case IFDIR: printf ("Directory\n");
 break;
 case IFCHR: printf ("Character device (Maj %d/Min %d)\n",
 major(inode.i_rdev),minor(inode.i_rdev));
 break;

 case IFBLK: printf ("Block device\n");
 printf ("\tBlock device .... %d\n",inode.i_dev);
 break;
 case IFREG: printf ("Regular file\n");
 printf ("\tFile exists on Major/Minor device ... %d/%d\n",
 major(inode.i_dev), minor(inode.i_dev) );
 break;
 case IFIFO: printf ("FIFO special\n");
 break;
 case IFMPC: printf ("Multiplexed Character special\n");
 break;
 case IFMPB: printf ("Multiplexed Block special\n");
 break;
 case IFNAM: printf ("Special Named file\n");
 break;
 default: printf ("Unknown (%o)\n", inode.i_ftype);
 break;
 }
 printf ("\tFile size ........................... %ld\n",(long)inode.i_size);
 printf ("\tUser id ............................. %d\n", inode.i_uid);
 printf ("\tGroup id ............................ %d\n", inode.i_gid);
}











































December, 1991
A SIMPLE HANDLE-BASED MEMORY MANAGER


Avoid memory allocation fragmentation




David Betz


David is a technical editor for DDJ and the author of XLisp, XScheme, and the
M&T Online conferencing system. He can be reached at DDJ, 501 Galveston Drive,
Redwood City, CA 94063.


One of the problems that plagues programs that do a lot of runtime memory
allocation is fragmentation. If a program allocates and deallocates blocks of
varying sizes, memory will tend to become fragmented. Eventually, an
allocation request will fail, not because there isn't enough free memory, but
because there isn't a large enough contiguous block of free memory to satisfy
the request.
One solution to this problem is to move all of the blocks of memory that are
in use to one end of the heap. This leaves all of the free space at the other
end in one contiguous block. The obvious problem with this approach is that
the program that allocated the memory still maintains pointers to the
allocated blocks which become invalid as the blocks are moved.
The way both Windows 3 and the Macintosh operating system overcome this
problem is to provide a way for programs to access allocated memory indirectly
through "handles." A handle, rather than being a direct pointer to a block of
allocated memory, is an indirect pointer. On the Macintosh, handles are
actually pointers to pointers, whereas under Windows, they are 16-bit integer
values that designate a pointer that is maintained in a table managed by the
memory allocator. With handles, the memory manager is free to move around
allocated blocks to free up contiguous space to satisfy an allocation request.
It just needs to update the single pointer associated with the handle of each
block of memory still in use.
The cost of this approach is that each time a program references data through
a handle, an extra level of indirection is required. Usually, this isn't a
major concern and doesn't impact performance much. The benefit is the
elimination of memory fragmentation problems.
In this article, I present a simple handle-based memory manager. The package
includes the include file (Listing One, page 66), C source (Listing Two,
page 66), makefile (Listing Three, page 151), and a test program (Listing
Four, page 151). There are really only four entry points to this package.
NewHeap( ) initializes a new heap. It takes two arguments: the initial number
of handles and the size of the heap in bytes. Its counterpart is FreeHeap( ),
which takes a heap allocated with NewHeap( ) and frees all of the memory
associated with it. Then there is HeapAlloc( ) which allocates a block of
memory from a heap and returns a handle to the block and HeapFree( ), which
frees a block of memory allocated with HeapAlloc( ).
A handle is just a pointer to a pointer to a character (a char **), so to
refer to a block of memory through a handle, an extra * is required.
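The point of the extra level of indirection can be shown with a short sketch. DEMO_HANDLE and demo_relocate are names invented for this example; the block is moved and only the single master pointer is updated, yet every holder of the handle sees the new location on its next dereference:

```c
#include <string.h>

typedef char **DEMO_HANDLE;   /* same shape as the article's HANDLE */

/*
 * Move the 'n'-byte block that 'h' designates to 'newloc' and update the
 * master pointer. Anyone holding 'h' sees the new location on the next *h.
 */
void demo_relocate(DEMO_HANDLE h, char *newloc, size_t n)
{
    memmove(newloc, *h, n);   /* source and destination may overlap */
    *h = newloc;
}
```

The corollary is that a program should not keep a copy of *h across a call that might move memory; it should dereference the handle again each time.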
The heap is organized with the table of handles at its base and the memory
available for allocation above that. Three variables control allocation. The
variables base and top point to the base and top of the area available for
allocation. The variable next points to the top of the next block that will be
returned by the allocator. Initially, it points to the same address as top.
When it receives an allocation request, the allocator checks to see if
next-base is greater than or equal to the amount requested (the block size
plus its bookkeeping prefix and suffix). If it is, the allocator returns a
block whose base is at next minus that amount and sets next equal to that
address. In this way, memory is allocated from the top of the heap toward
the bottom.
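A stripped-down sketch of this top-down allocation step is shown here. It leaves out the handle table and the per-block prefix and suffix, and the names demo_region and demo_bump_alloc are hypothetical:

```c
#include <stddef.h>

/* Hypothetical miniature of the heap header: just the three pointers. */
struct demo_region {
    char *base;  /* lowest address available for allocation     */
    char *next;  /* bottom of the most recently allocated block */
    char *top;   /* one past the highest available address      */
};

/*
 * Carve 'size' bytes off the top of the free area.
 * Returns the block's base address, or NULL if it does not fit.
 */
char *demo_bump_alloc(struct demo_region *r, size_t size)
{
    if ((size_t)(r->next - r->base) < size)
        return NULL;
    r->next -= size;
    return r->next;
}
```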
When the allocator finds that there is not enough memory between base and next
to satisfy a new allocation request, it compacts the heap by copying all of
the blocks still in use toward the top of the heap, leaving all of the free
space between base and next. The allocator then checks the free space between
base and next again. If there is enough space, the variables are updated as
described above and the new block is returned. If not, the allocation request
fails.
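The compaction pass can be sketched in simplified form. This version is not the article's implementation: it assumes a separate array of block descriptors, listed from the highest offset to the lowest, in place of the prefix and suffix stored with each block, and all of the names are hypothetical. The sliding idea is the same:

```c
#include <string.h>

/* Hypothetical block descriptor; the article's code keeps this
   information in a prefix and suffix around each block instead. */
struct demo_blk {
    int off;   /* offset of the block within the heap */
    int size;  /* size of the block in bytes          */
    int used;  /* nonzero while the block is in use   */
};

/*
 * Slide every in-use block toward the top of 'heap'. 'blk' must list the
 * blocks from highest offset to lowest. Returns the new value of "next":
 * the offset of the lowest in-use byte after compaction.
 */
int demo_compact(char *heap, int top, struct demo_blk *blk, int nblk)
{
    int dst = top;
    int i;

    for (i = 0; i < nblk; i++) {
        if (!blk[i].used)
            continue;               /* free block: its space is reclaimed */
        dst -= blk[i].size;
        if (blk[i].off != dst) {
            memmove(heap + dst, heap + blk[i].off, (size_t)blk[i].size);
            blk[i].off = dst;       /* the analogue of updating the handle */
        }
    }
    return dst;
}
```

Blocks are visited from the top down so that each move lands in space that has already been vacated or was free to begin with.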
Handles are allocated in a much simpler fashion. The table of handles is
searched to find a handle that is not in use. If no free handles are
available, the compactor is called and the base variable is moved up enough to
provide space to allow the handle table to be expanded. Of course, this too
could fail. If it does, the allocator returns NULL to indicate that the
allocation request has failed.

_A SIMPLE HANDLE-BASED MEMORY MANAGER_
by David Betz



[LISTING ONE]

/* hmm.h - definitions for a simple handle based memory manager */

typedef char **HANDLE;

/* heap header structure */
typedef struct heaphdr {
 int hh_nhandles; /* number of handles */
 char *hh_next; /* next free location */
 char *hh_base; /* base of heap memory */
 char *hh_top; /* top of heap memory */
 HANDLE hh_handles; /* base of handle array */
} HEAPHDR;

HEAPHDR *NewHeap(int,int);
void FreeHeap(HEAPHDR *);
HANDLE HeapAlloc(HEAPHDR *,int);
void HeapFree(HEAPHDR *,HANDLE);






[LISTING TWO]

/* hmm.c - a simple handle based memory manager -- by David Betz */


#include <stdio.h>
#include <stdlib.h>
#include "hmm.h"

/* number of handles to add when expanding the handle table */
#define HINC 32

/* block prefix structure */
typedef struct blockpfx {
 HANDLE bp_handle; /* handle for this block */
} BLOCKPFX;

/* block suffix structure */
typedef struct blocksfx {
 int bs_size; /* size of block */
} BLOCKSFX;

static HANDLE FindMemory(HEAPHDR *,HANDLE,int);
static HANDLE NewHandle(HEAPHDR *);
static HANDLE UnusedHandle(HEAPHDR *);
static void ExpandHandleTable(HEAPHDR *,int);
static void CompactHeap(HEAPHDR *);

/* NewHeap - allocate and initialize a new heap */
HEAPHDR *NewHeap(nhandles,nbytes)
 int nhandles; /* initial number of handles */
 int nbytes; /* initial number of free bytes */
{
 HEAPHDR *h;
 int tsize;
 HANDLE p;
 tsize = nhandles * sizeof(char *);
 if ((h = (HEAPHDR *)malloc(sizeof(HEAPHDR) + tsize + nbytes)) == NULL)
 return (NULL);
 h->hh_nhandles = nhandles;
 h->hh_handles = p = (HANDLE)((char *)h + sizeof(HEAPHDR));
 while (--nhandles >= 0) *p++ = NULL;
 h->hh_base = (char *)h->hh_handles + tsize;
 h->hh_top = h->hh_base + nbytes;
 h->hh_next = h->hh_top;
 return (h);
}

/* FreeHeap - free a heap allocated by NewHeap() */
void FreeHeap(h)
 HEAPHDR *h; /* heap to free */
{
 free(h);
}

/* HeapAlloc - allocate a block of memory from the heap */
HANDLE HeapAlloc(h,size)
 HEAPHDR *h; /* the heap */
 int size; /* size of block to allocate */
{
 HANDLE p;
 if ((p = NewHandle(h)) == NULL)
 return (NULL);

 return (FindMemory(h,p,size));
}

/* HeapFree - free a block of memory allocated by HeapAlloc() */
void HeapFree(h,p)
 HEAPHDR *h; /* the heap */
 HANDLE p; /* the handle to free */
{
 BLOCKPFX *bp;
 bp = (BLOCKPFX *)(*p - sizeof(BLOCKPFX));
 bp->bp_handle = NULL;
 *p = NULL;
}

static HANDLE NewHandle(h)
 HEAPHDR *h; /* the heap */
{
 HANDLE p;
 if ((p = UnusedHandle(h)) == NULL) {
 ExpandHandleTable(h,HINC);
 p = UnusedHandle(h); /* still NULL if the table could not grow */
 }
 return (p);
}

static HANDLE UnusedHandle(h)
 HEAPHDR *h; /* the heap */
{
 HANDLE p;
 int n;
 for (p = h->hh_handles, n = h->hh_nhandles; --n >= 0; ++p)
 if (*p == NULL)
 return (p);
 return (NULL);
}

static void ExpandHandleTable(h,n)
 HEAPHDR *h; /* the heap */
 int n; /* number of handles to add */
{
 char *base;
 HANDLE p;
 CompactHeap(h);
 base = h->hh_base + (n * sizeof(char *));
 if (base <= h->hh_next) {
 p = (HANDLE)h->hh_base;
 h->hh_base = base;
 h->hh_nhandles += n;
 while (--n >= 0)
 *p++ = NULL;
 }
}

static HANDLE FindMemory(h,p,size)
 HEAPHDR *h; /* the heap */
 HANDLE p; /* the handle to allocate space for */
 int size; /* size of block to allocate */
{
 BLOCKPFX *bp;
 BLOCKSFX *bs;
 int tsize;

 char *d;
 tsize = sizeof(BLOCKPFX) + size + sizeof(BLOCKSFX);
 if ((d = h->hh_next - tsize) < h->hh_base) {
 CompactHeap(h);
 if ((d = h->hh_next - tsize) < h->hh_base)
 return (NULL);
 }
 h->hh_next = d;
 bp = (BLOCKPFX *)d;
 bp->bp_handle = p;
 d += sizeof(BLOCKPFX);
 bs = (BLOCKSFX *)(d + size);
 bs->bs_size = size;
 *p = d;
 return (p);
}

static void CompactHeap(h)
 HEAPHDR *h; /* the heap */
{
 char *src,*dst;
 BLOCKPFX *hp;
 BLOCKSFX *hs;
 src = dst = h->hh_top;
 while (src > h->hh_next) {
 hs = (BLOCKSFX *)(src - sizeof(BLOCKSFX));
 hp = (BLOCKPFX *)((char *)hs - hs->bs_size - sizeof(BLOCKPFX));
 if (hp->bp_handle) {
 if (src == dst)
 src = dst = (char *)hp;
 else {
 while (src > (char *)hp)
 *--dst = *--src;
 *hp->bp_handle = dst + sizeof(BLOCKPFX);
 }
 }
 else
 src = (char *)hp;
 }
 h->hh_next = dst;
}






[LISTING THREE]


OFILES=hmmtest.obj hmm.obj

hmmtest.exe: $(OFILES)
 cl $(OFILES)







[LISTING FOUR]

#include <stdio.h>
#include <stdlib.h>
#include "hmm.h"

HANDLE allocshow(int);
void showheap(void);
void pause(void);

#define HMAX 100

HEAPHDR *h;
HANDLE handles[HMAX];

main()
{
 int i;

 /* allocate a heap */
 h = NewHeap(16,4096);
 showheap();
 pause();

 /* allocate a bunch of blocks of memory from the heap */
 printf("Allocating...\n");
 for (i = 0; i < HMAX; ++i) {
 printf("%2d: ",i);
 handles[i] = allocshow(32);
 sprintf(*handles[i],"%d",i); /* put something in the block */
 putchar('\n');
 }

 /* show the state of the heap after the allocations */
 showheap();
 pause();

 /* free every other block (to test the compaction algorithm) */
 printf("Deallocating...\n");
 for (i = 0; i < HMAX; i += 2)
 HeapFree(h,handles[i]);

 /* show the state of the heap after the deallocations */
 showheap();
 pause();

 /* now reallocate the blocks we freed (causes compaction) */
 printf("Reallocating...\n");
 for (i = 0; i < HMAX; i += 2) {
 printf("%2d: ",i);
 handles[i] = allocshow(32);
 sprintf(*handles[i],"%d",i);
 putchar('\n');
 }

 /* show the state of the heap after the new allocations */
 showheap();
 pause();


 /* make sure all of the blocks contain what we expect */
 printf("Checking...\n");
 for (i = 0; i < HMAX; ++i) {
 printf("%2d: %04x->%04x=",i,handles[i],*handles[i]);
 printf("%s",*handles[i]);
 if (atoi(*handles[i]) != i)
 printf(" *** ERROR");
 putchar('\n');
 }

 /* free the heap and exit */
 FreeHeap(h);
}

HANDLE allocshow(size)
 int size;
{
 HANDLE p;
 if (p = HeapAlloc(h,size))
 printf("%04x->%04x",p,*p);
 return (p);
}

void pause()
{
 while (getchar() != '\n')
 ;
}

void showheap()
{
 printf("nhandles: %d\n",h->hh_nhandles);
 printf("handles: %04x\n",h->hh_handles);
 printf("base: %04x\n",h->hh_base);
 printf("next: %04x\n",h->hh_next);
 printf("top: %04x\n",h->hh_top);
}

























December, 1991
STATISTICAL PERFORMANCE ANALYSIS


Looking for quality time


 This article contains the following executables: STAT.ARC


Fred Motteler


Fred has been a software engineer at Applied Microsystems Corporation since
receiving a Ph.D. in physics from the University of Washington in 1986. In
addition to his work at Applied Microsystems, he has worked on a number of
embedded real-time controller applications. He can be contacted at Applied
Microsystems Corporation, 5020 148th Ave. N.E., P.O. Box 97002, Redmond, WA
98073-9702.


Statistical performance analysis is a practical technique for determining
which modules in a program require the most execution time. By knowing this,
you can improve performance of time-critical applications. Consecutive trials
can then help zero in on effective coding changes for execution time
reduction. While the techniques presented here are ideally suited for embedded
applications--and were in fact developed in part to squeeze more performance
out of embedded systems development tools, as described later--the
implementation is general-purpose enough to apply to just about any
time-critical application.
Where I work (Applied Microsystems), statistical performance analysis has been
used to improve time-critical application performance on several occasions.
The most notable was the development of the SCSI communication option for our
ES 1800 emulator for 680x0, 80x86, and Z800x processors. Use of statistical
performance analysis allowed both host and emulator software to be optimized
for speed. This resulted in data transfer rates two to five times faster than
would otherwise have been possible.
Another example of statistical performance analysis use involved development
of an arbitrary-precision, portable C, IEEE-P754 format-compatible
floating-point arithmetic package. Again, performance analysis was used to
determine which routines to optimize for speed and how effective the
modifications were.


Performance Analysis Steps


The basic idea of statistical performance analysis is to periodically sample
the program counter while a program is executing. Once the program or the
sampling is finished, the program counter samples are sorted according to the
address range of each module in the program. The number of samples that lie
within each module is tallied. From these tallies, an approximate value for
the relative amount of execution time spent in each module can be determined
and displayed. (See Figure 1.)
Determining the module to which each program counter sample belongs requires
the program's link map or memory map to be examined. Most linkers produce
memory maps that give information about modules' physical locations in memory.
To a good approximation, the number of samples falling within each module is
proportional to the amount of time spent executing the code within the module.
The final tallies for each module can be displayed as a percentage of the
total number of samples recorded for the entire program.
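The tally-to-percentage step can be sketched in C as follows. This is only a minimal illustration, not code from the pamsdos package: the names percent_of_total and print_histogram_row, the rounding rule, and the bar scale are all assumptions.

```c
#include <stdio.h>

/* Hypothetical helper: percentage of all samples that fell in one module,
 * rounded to the nearest whole percent. Returns 0 if no samples exist. */
unsigned percent_of_total(unsigned long module_samples,
                          unsigned long total_samples)
{
    if (total_samples == 0)
        return 0;
    return (unsigned)((module_samples * 100UL + total_samples / 2)
                      / total_samples);
}

/* Hypothetical helper: print one row of a module table (name, sample
 * count, percent) followed by a bar of asterisks. stars_per_percent is an
 * assumed display scale, not a value taken from the article. */
void print_histogram_row(const char *name, unsigned long module_samples,
                         unsigned long total_samples,
                         unsigned stars_per_percent)
{
    unsigned pct = percent_of_total(module_samples, total_samples);
    unsigned i;

    printf("%-16s %6lu %3u ", name, module_samples, pct);
    for (i = 0; i < pct * stars_per_percent; i++)
        putchar('*');
    putchar('\n');
}
```

With 4096 total samples, a module holding 919 samples works out to 22 percent and one holding 20 samples rounds down to 0 percent, which is the kind of output shown in Figure 3.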
Figure 2 contains examples of statistical performance analysis applied to a
simple test program. The test program consists of a main( ) function and 15
delay_loop( ) functions. The delay loop functions are configured so that the
second function's delay loop takes twice as much time as the first. The third
function's delay loop takes three times as much time as the first, and so on.
The main function contains a loop that contains one call to each of the 15
subfunctions. Good statistics should show the second function taking twice as
much time as the first, the third function taking three times as much time as
the first, and so on. Figure 2(b) shows each program counter sample sorted
according to the function within which it lies. Trial program configuration
(patest.c) was: looptoloopN = 5120, scaleN = 2, delay_one = 1, delay_two = 2,
delay_three = 3, and so on. This was run using an ES 1800 68020 emulator with
a 12.5-MHz "Demon" target for two seconds, with 4000 bus cycles between samples.
For a complete listing of the test program patest.c, see Listing One (page
100).
In contrast, Figure 3 presents statistical performance analysis results done
on a real PC program. The program being tested is the regression tester for
the floating-point arithmetic package mentioned earlier. Only the results for
the first 20 functions are included here.
Figure 3. Floating-point arithmetic package regression tester. Trial program
configuration: fmtest.exe running on an 8-MHz AT clone. Command-line
invocation: pamsdos fmtest.map lattice.cfg fmtest.

 Sample output:
 Module table sorted by sample counts:
 4096 samples collected

 module samples % 0 5 10 15 20 25 30
 ushftlm 919 22 **************************************
 ushftrm 846 21 ************************************
 PA_BOT_MEM 670 16 ***************************
 ucheckm 463 11 *******************
 _eufbs 266 6 **********
 ucmpm 169 4 ******
 isubm 138 3 *****
 iaddm 94 2 ***
 ffbitext 74 2 ***
 umultm 62 2 ***
 _nmalloc2 42 1 *
 _nfreex 42 1 *
 udivm 37 1 *
 itestm 33 1 *
 _nchkb 23 1 *
 ffbitins 21 1 *
 _nnohold 20 0
 _CO_pfmt 18 0
 prints 18 0
 calloc 14 0


From this example, it is clear that the routines ushftlm and ushftrm together
take up more than 40 percent of the total program execution time. These
routines do bitwise shifts left and right. As expected, they are used
extensively and are prime candidates for execution-time optimization. The
routine PA_BOT_MEM actually represents program counter samples that were below
the program being tested. Typically, this represents the fraction of time
spent in DOS support functions. The regression tester outputs a significant
amount of results to the display, so time spent scrolling the PC's screen, for
example, is significant.
Other routines that show up significantly (ucheckm, ucmpm, isubm, iaddm,
umultm, udivm, and itestm) are all integer math routines used to support the
floating-point package. Some Lattice library routines also show up (_eufbs,
_nmalloc2, _nfreex, and _nchkb) due to the extensive use of dynamic memory
allocation by the floating-point routines.


Statistical Performance Analysis Techniques


The examples in Figures 2 and 3 demonstrate that the relative time required by
different modules can be determined from periodic program counter sampling. A
logical question is, "So why isn't statistical performance analysis more
widely accepted?"
The main argument against statistical performance analysis is that the entire
execution path of a program is not sampled. Only isolated snapshots of where
the program is executing are sampled, so the results of statistical
performance analysis may not be accurate.
The nature of possible inaccuracies is difficult to characterize and depends
on the program being analyzed, the sampling period, and total number of
samples taken. Most statistical methodology assumes the sampling process is a
series of independent events (such as repeated flipping of a coin). The
results of such random sampling are well characterized by the binomial
distribution, as is the range of variation in possible outcomes.
In contrast, programs are not random: There is a well-defined sequence of
events. Starting the same program twice from the same initial conditions
produces the same results. Sampling the program counter at periodic intervals
is not a series of independent events.
The assumption for statistical performance analysis is that for sufficiently
complex programs, appropriate sampling periods, and appropriate sampling
durations, the program counter sampling will approximate independent sampling.
The nature of problems encountered with statistical performance analysis
samples is somewhat similar to that of problems associated with pseudorandom
number generation programs. Fortunately, the general complexity of most
programs makes selection of reasonable parameters for statistical performance
analysis much easier than selection of reasonable pseudorandom number
generation parameters.
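If the independent-sampling approximation holds, the binomial distribution mentioned above gives a rough error bar for each module's share of the samples. The following is a worked sketch under that assumption only, using the ushftlm count from Figure 3 (k = 919 samples out of n = 4096); it is not a calculation taken from the analysis programs themselves.

```latex
\[
\hat{p} = \frac{k}{n}, \qquad
\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}
  \approx \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
\]
\[
\text{ushftlm: } \hat{p} = \frac{919}{4096} \approx 0.224, \qquad
\sigma_{\hat{p}} \approx \sqrt{\frac{0.224 \times 0.776}{4096}} \approx 0.0065
\]
```

In other words, about 22.4 percent with a standard error of roughly 0.65 percentage points. Where the program has a periodicity related to the sampling period, the real error can be much larger than this estimate suggests.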
Several factors affect the variation of results between one statistical
performance analysis trial and the next:
Change in the sampling period
Change in the total sampling time
Nonidentical initial conditions: The program sampling may not be identically
synchronized with the program being sampled.
Identical initial conditions, but varying synchronization relative to the
program being sampled. This is true when the program sampling is driven off a
separate clock circuit from the processor clock circuit running the program.
External events that alter the execution path of the program being sampled:
unsynchronized inputs, interrupts, DMA, and so on.
A poor choice of either the sampling frequency or the total sampling time can
result in biased results. The conditions in Table 1 are required for good
statistics.
Table 1: Conditions for good statistics

 Where

 t = program counter sampling interval
 T = total sampling time
 N = number of program modules
 n = total number of program samples (n = T/t)

 then for good sampling

 t << T (take a lot of samples)
 N << n (there are many samples in each module)
 t != some basic periodicity of the program,
 T >> basic periodicity of the program, or
 T == some multiple of the basic periodicity of the program,
 or part of the program being tested.

The examples in Figures 4 and 5 illustrate the difference between good and bad
statistics. In both cases, a simple test program is analyzed. The program
consists of a main function and 15 identical delay loop subfunctions. The main
function contains a loop that contains one call to each of the 15
subfunctions. Good statistics should show all 15 functions requiring the same
fraction of the total execution time. (See Listing One for the test program
patest.c.)
Figure 4(a) illustrates a "good" sampling period in which all of the delay
loop functions should have about the same number of samples. Trial program
configuration (patest.c) is: looptoloopN = 5120, scaleN = 2, delay_one = 8,
delay_two = 8, delay_three = 8, and so on. This was run using an ES 1800 68020
emulator with a 12.5-MHz "Demon" target for two seconds, with 3850 bus cycles
between samples.
Figure 4(b) illustrates a "bad" sampling period, in which the sampling period
matches a fundamental periodicity of the program. The delay loop functions are
short delays, and the number of main loop iterations is large. Delay functions
delay_eight and delay_fifteen have a disproportionately large share of
samples. Trial program configuration (patest.c) is: looptoloopN = 5120,
scaleN = 2, delay_one = 8, delay_two = 8, delay_three = 8, and so on. This
was run using an ES 1800 68020 emulator with a 12.5-MHz "Demon" target for two
seconds, with 3900 bus cycles between samples.
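The effect of a sampling period that matches a program periodicity can be reproduced with a small simulation. The sketch below assumes an idealized program that loops forever over 15 functions of equal length and ignores the time spent in main( ) and in call overhead. The function name simulate_sampling and the cycle counts used with it are hypothetical, not taken from the trials above.

```c
#include <string.h>

#define NUM_FUNCS 15

/* Simulate periodic sampling of an idealized program that loops over
 * NUM_FUNCS functions, each taking seg_cycles cycles (an assumed model).
 * A sample is taken every sample_period cycles, starting at cycle 0, and
 * counts[f] is incremented for the function f executing at that instant. */
void simulate_sampling(long seg_cycles, long sample_period, long num_samples,
                       long counts[NUM_FUNCS])
{
    long loop_cycles = seg_cycles * NUM_FUNCS; /* natural period of program */
    long phase = 0;                            /* current time modulo loop  */
    long k;

    memset(counts, 0, NUM_FUNCS * sizeof counts[0]);
    for (k = 0; k < num_samples; k++) {
        counts[phase / seg_cycles]++;
        phase = (phase + sample_period) % loop_cycles;
    }
}
```

With seg_cycles = 100, the loop takes 1500 cycles. Sampling every 1500 cycles puts every sample in the first function. Sampling every 1501 cycles spreads 1500 samples evenly, 100 per function.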
Figure 4: Sampling periods: (a) good (b) bad

 (a)

 Sample output:
 Module table sorted by sample counts:
 1377 samples collected

 module samples % 0 1 2 3 4 5 6 7 8 9 10
 .main 91 7 *****************************
 .delay_seven 87 6 *************************
 .delay_thirteen 87 6 *************************
 .delay_one 86 6 *************************
 .delay_eleven 86 6 *************************
 .delay_fourteen 86 6 *************************
 .delay_twelve 86 6 *************************
 .delay_ten 86 6 *************************
 .delay_five 86 6 *************************
 .delay_six 86 6 *************************
 .delay_nine 85 6 *************************
 .delay_fifteen 85 6 *************************
 .delay_three 85 6 *************************
 .delay_four 85 6 *************************
 .delay_two 84 6 *************************
 .delay_eight 84 6 *************************

 (b)

 Sample output:
 Module table sorted by sample counts:
 1359 samples collected

 module samples % 0 1 2 3 4 5 6 7 8 9 10
 .delay_eight 160 12 ************************
 .delay_fifteen 158 12 ************************
 .delay_four 81 6 *************
 .delay_nine 80 6 *************
 .delay_eleven 80 6 *************
 .delay_five 80 6 *************
 .delay_twelve 80 6 *************
 .delay_seven 80 6 *************
 .delay_thirteen 80 6 *************
 .delay_one 80 6 *************
 .delay_two 80 6 *************
 .delay_fourteen 80 6 *************
 .delay_ten 80 6 *************
 .delay_three 79 6 *************
 .delay_six 79 6 *************

Figure 5(a) illustrates "good" total sampling time, in which all of the delay
loop functions should have about the same number of samples. Trial program
configuration (patest.c) is: looptoloopN = 5, scaleN = 10, delay_one = 8,
delay_two = 8, delay_three = 8, and so on. This was run using an ES 1800 68020
emulator with a 12.5-MHz "Demon" target for two seconds, with 4000 bus cycles
between samples.
Figure 5(b) shows "bad" total sampling time. Half of the delay loop functions
have significantly more samples than expected and the other half have fewer. In
this case, the total sampling time was not a multiple of a fundamental
periodicity of the program, the delay loop functions are long delays, and the
number of main loop iterations is small (delay = 8192, loop iterations = 5).
Trial program configuration (patest.c) is: looptoloopN = 5, scaleN = 10,
delay_one = 8, delay_two = 8, delay_three = 8, and so on. This was run using
an ES 1800 68020 emulator with a 12.5-MHz "Demon" target for two seconds, with
2500 bus cycles between samples.
Figure 5: Sampling time: (a) good (b) bad

 (a)

 Sample output:
 Module table sorted by sample counts:
 1343 samples collected

 module samples % 0 1 2 3 4 5 6 7 8 9 10
 .delay_one 85 6 *************************
 .delay_eleven 85 6 *************************
 .delay_six 85 6 *************************
 .delay_eight 83 6 *************************
 .delay_three 83 6 *************************
 .delay_fourteen 83 6 *************************
 .delay_four 82 6 *************************
 .delay_nine 82 6 *************************
 .delay_thirteen 82 6 *************************
 .delay_ten 80 6 *************************
 .delay_twelve 80 6 *************************
 .delay_two 80 6 *************************
 .delay_seven 80 6 *************************
 .delay_fifteen 80 6 *************************
 .delay_five 80 6 *************************

 (b)


 Sample output:
 Module table sorted by sample counts:
 2000 samples collected

 module samples % 0 1 2 3 4 5 6 7 8 9 10
 .delay_fourteen 132 7 ******************************
 .delay_nine 131 7 ******************************
 .delay_ten 131 7 ******************************
 .delay_one 131 7 ******************************
 .delay_eleven 131 7 ******************************
 .delay_fifteen 131 7 ******************************
 .delay_twelve 131 7 ******************************
 .delay_thirteen 130 7 ******************************
 .delay_eight 106 5 **********************
 .delay_six 106 5 **********************
 .delay_four 105 5 **********************
 .delay_two 105 5 **********************
 .delay_three 105 5 **********************
 .delay_seven 105 5 **********************
 .delay_five 104 5 **********************

In actual practice, selection of the total sampling time and the sampling
frequency is rather subjective. Typically the total sampling time is either:
A. The total execution time of the program. For a program that has definite,
finite start and end points (such as the floating-point package regression
tester presented earlier) and runs for seconds or minutes, this is a
reasonable choice.
B. The execution time of a particular operation within a much larger program.
Here the goal is to isolate the operation being tested from the rest of the
program. A good example of this is the actual download of program data from a
host system to the target system via an emulator.
Typically the sampling frequency is either fixed in hardware or variable. On
the PC, the clock-tick frequency is fixed. This makes checking for the effects
of natural program periodicity difficult. Using a PC with a different CPU
clock frequency or processor has the same effect as changing the sampling
rate. If either the interrupt frequency or the rate of external hardware
sampling can be varied, then checking for the effects of natural program
periodicity is relatively easy.
Unless there are periodic interrupts in the system being tested, natural
periodicities in the program are difficult to determine. The best way to check
for natural periodicity is to vary the sampling frequency and see how it
affects the results. If periodic interrupts are present, and/or an obvious
natural periodicity is present, then the sampling period should be changed
such that t' = (irrational #)*t where t = natural period (or previous sampling
period) and t' = sampling period to try.
Irrational numbers such as pi, pi/2, pi/3, e, and e/2 are good initial
choices. This assumes that the natural periodicity is approximately equal to
the sampling period and is much less than the total sampling time, where t =
natural period << T.
Using an irrational number minimizes the probability that the new sampling
period is also some multiple of the program's natural period. However, due to
the finite nature of the sampling, even choosing an irrational period ratio
does not guarantee good statistics.
The best way to guarantee reasonable results is to try several different
sampling periods.
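Generating a set of trial sampling periods from the multipliers suggested above might look like the following sketch. The helper name candidate_periods, the constant names, and the choice to round to whole bus cycles are assumptions made for illustration.

```c
/* Irrational multipliers suggested in the text: pi, pi/2, pi/3, e, e/2. */
#define PA_PI 3.14159265358979323846
#define PA_E  2.71828182845904523536
#define NUM_CANDIDATES 5

/* Hypothetical helper: fill out[] with trial sampling periods t' = r * t,
 * where t is the natural period (or the previous sampling period) in bus
 * cycles. Results are rounded to the nearest whole cycle. */
void candidate_periods(long t, long out[NUM_CANDIDATES])
{
    static const double ratio[NUM_CANDIDATES] = {
        PA_PI, PA_PI / 2.0, PA_PI / 3.0, PA_E, PA_E / 2.0
    };
    int i;

    for (i = 0; i < NUM_CANDIDATES; i++)
        out[i] = (long)(ratio[i] * (double)t + 0.5);
}
```

For t = 4000 this yields 12566, 6283, 4189, 10873, and 5437 cycles. Once rounded to whole cycles the ratios are no longer truly irrational, which is one more reason to compare the results from several of these periods rather than trusting any single one.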


Program Counter Sampling Methods


The most critical step in statistical performance analysis is collecting the
sample program counter values. The most common ways to do this are:
Native interrupt-driven sampling. A periodic, hardware-generated interrupt is
used to trigger an interrupt routine which reads the program counter value
pushed onto the system stack when the interrupt occurred. The program counter
value is written to a part of the system's memory that the program being
tested does not affect.
External hardware sampling. External hardware (such as an emulator with trace
memory) periodically samples the processor's address and status lines. The
address lines during instruction fetch cycles are a fairly accurate method of
determining the current program counter. The instruction fetch cycles are
recorded into trace memory contained in the external hardware.


Native Interrupt Sampling on PCs


The advantage of using a PC is that it is a common software development
platform. Good quality compilers, editors, and other development tools are
readily available and inexpensive.
The PC has a clock-tick hardware interrupt (INT 8) that occurs every 54.92
milliseconds (18.21 times per second). This interrupt is used for time keeping
by the PC's BIOS routines and by MS-DOS. (See either the Microsoft MS-DOS
Programmer's Reference, the XT Technical Reference, or the AT Technical
Reference.) The value of 18.21 times per second is used so that there are
approximately 2^16 (65,536) clock ticks per hour.
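The arithmetic behind the tick rate can be checked as follows. This sketch assumes the standard PC timer input clock of about 1.193182 MHz and the default 16-bit divisor of 65,536; neither value is stated in this article.

```latex
\[
f_{\text{tick}} = \frac{1\,193\,182\ \text{Hz}}{65\,536} \approx 18.2065\ \text{Hz},
\qquad
t = \frac{1}{f_{\text{tick}}} \approx 54.925\ \text{ms}
\]
\[
f_{\text{tick}} \times 3600\ \text{s} \approx 65\,543\ \text{ticks per hour} \approx 2^{16}
\]
```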
The clock-tick interrupt vector normally points to a timekeeper service
routine in either the BIOS or MS-DOS. Prior to running the program to test,
the vector is modified to point to a local program counter sampler interrupt
routine. This routine samples the pre-interrupt program counter value on the
stack and saves it into a local data buffer not affected by the program being
tested. After sampling the PC value, control is transferred to the original
timekeeper interrupt service routine. The interrupt vector is restored after
the program being tested completes execution. An example is the source code
file patick.asm (Listing Two, page 101).
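The buffer bookkeeping done inside the sampler interrupt routine can be modeled in portable C. The sketch below mirrors only the store, wrap, and count logic of Listing Two. It does not hook any interrupt vector, and the structure and function names are hypothetical.

```c
#include <stddef.h>

/* Portable model of the sample buffer kept by the clock-tick handler.
 * Each sample occupies two 16-bit words: IP first, then CS. */
struct pa_buffer {
    unsigned short *words; /* caller-supplied storage, 2 words per sample */
    size_t capacity;       /* capacity in samples; must be at least 1     */
    size_t next;           /* index of the next sample slot to write      */
    int wrapped;           /* nonzero once the buffer has filled          */
};

void pa_buffer_init(struct pa_buffer *b, unsigned short *storage,
                    size_t capacity)
{
    b->words = storage;
    b->capacity = capacity;
    b->next = 0;
    b->wrapped = 0;
}

/* Record one pre-interrupt CS:IP pair. After the buffer fills, writing
 * restarts at the beginning and the wrap flag is set. */
void pa_buffer_record(struct pa_buffer *b, unsigned short cs,
                      unsigned short ip)
{
    b->words[2 * b->next] = ip;
    b->words[2 * b->next + 1] = cs;
    if (++b->next == b->capacity) {
        b->next = 0;
        b->wrapped = 1;
    }
}

/* Number of valid samples: the whole buffer if it has wrapped, otherwise
 * the current write index. */
size_t pa_buffer_count(const struct pa_buffer *b)
{
    return b->wrapped ? b->capacity : b->next;
}
```

As in the assembly version, a wrapped buffer overwrites its oldest samples, so the buffer should be sized for the expected run time at about 18.2 samples per second.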
The most significant limitation of doing native, interrupt-driven sampling on
a PC is that the interrupt rate is fixed at a relatively slow rate. A fixed
sampling rate does not allow determination of possible effects of natural
periodicity in the program being tested. The only way to investigate this is
to use several PCs with different processor clock rates. The sampling rate
stays the same, but the execution speed of the program changes, depending on
the PC used.
Another problem with using a PC for performance analysis is that programs are
dynamically relocated at runtime. Memory maps generally give module addresses
relative to the origin of the program. The origin of the program is determined
at runtime. This requires either a modified loader that returns the program
origin address, or use of a trial program that returns where it has been
loaded into memory. The latter approach is presented in the source code files
pawhere.c (Listing Three, page 103) and pamsdos.c (Listing Four, page 103).
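Once the program origin is known, each sampled CS:IP pair has to be turned into an address relative to that origin before it can be compared with the memory map. A minimal sketch of that conversion, assuming real-mode addressing, is shown below. The function names are hypothetical, and the Lattice-specific adjustments made in Listing Four are deliberately left out.

```c
/* Convert a real-mode segment:offset pair to a linear address. */
unsigned long pa_linear(unsigned int segment, unsigned int offset)
{
    return ((unsigned long)segment << 4) + (unsigned long)offset;
}

/* Hypothetical helper: compute the offset of a sampled CS:IP from the
 * program origin. Returns 1 and stores the offset in *relative when the
 * sample lies at or above the origin. Returns 0 when the sample lies below
 * the program (for example, in DOS or BIOS code). */
int pa_relative(unsigned int cs, unsigned int ip, unsigned long origin,
                unsigned long *relative)
{
    unsigned long linear = pa_linear(cs, ip);

    if (linear < origin)
        return 0;
    *relative = linear - origin;
    return 1;
}
```

Samples for which pa_relative returns 0 correspond to the time reported as PA_BOT_MEM in Figure 3.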
A more general limitation of using native, interrupt-driven sampling is that
it requires system resources. Memory is required for the sampling program and
for the program counter sample buffer. This memory usage competes for memory
with the program being tested. A periodic interrupt must be provided to drive
the sampling. The sampling interrupt service routine requires some execution
time. This execution time requirement competes with that of the program being
tested. While in most cases, use of some system resources does not
significantly impact the program being tested, in some it will.


Analysis Methods


Once the program counter samples have been collected by either native
interrupt sampling or external hardware, the samples can be processed on
either the native system or an alternate host system.
Reading the memory map for the program being tested is an easy way to
determine the beginning and ending address of each module within it. The file
pardmap.c (accessible electronically; see "Availability" at the end of this
article) is an example of a generic map file reader program. The map file
reader reads the map file and creates a data table of module names, address
ranges, and number of samples. It then sorts the table according to address
values.
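The table-building step might be sketched as follows. The real table layout is defined in padef.h, which is not reproduced here, so the structure pa_module and both helper functions are hypothetical. The sketch also assumes that the map supplies only a start address for each module, so that each module's end has to be derived from the start of the next one.

```c
#include <stdlib.h>

/* Hypothetical module table entry (not the layout used in padef.h). */
struct pa_module {
    char name[32];
    unsigned long start;   /* first address of the module            */
    unsigned long end;     /* last address of the module (inclusive) */
    unsigned long samples; /* program counter samples tallied so far */
};

/* qsort comparison function: order modules by ascending start address. */
static int compare_start(const void *a, const void *b)
{
    const struct pa_module *ma = (const struct pa_module *)a;
    const struct pa_module *mb = (const struct pa_module *)b;

    if (ma->start < mb->start) return -1;
    if (ma->start > mb->start) return 1;
    return 0;
}

/* Sort the table by address so that it can be binary searched later. */
void pa_sort_modules(struct pa_module *table, size_t count)
{
    qsort(table, count, sizeof table[0], compare_start);
}

/* Assumed step: derive each module's end address from the next module's
 * start. last_end is the final address of the program. The table must
 * already be sorted and contain distinct start addresses. */
void pa_fill_ends(struct pa_module *table, size_t count,
                  unsigned long last_end)
{
    size_t i;

    for (i = 0; i + 1 < count; i++)
        table[i].end = table[i + 1].start - 1;
    if (count > 0)
        table[count - 1].end = last_end;
}
```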
For each program counter sample, a binary search is done to determine which
module's sample count to increment. The file pautil.c does the binary search
and increments the appropriate sample count. This file is available
electronically, along with the make file, the header file padef.h required by
the code discussed in this article, and sample configuration files for
Microtec and Intermetrics cross linkers and for Lattice, Zortech, and
Microsoft PC native linkers.
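The binary search and tally can be sketched as shown below. This is an illustration of the technique rather than the contents of pautil.c. The structure pa_bin and the function names are hypothetical, and the table is assumed to be sorted by start address with non-overlapping ranges.

```c
#include <stddef.h>

/* Hypothetical address bin for one module. */
struct pa_bin {
    unsigned long start;   /* first address of the module            */
    unsigned long end;     /* last address of the module (inclusive) */
    unsigned long samples; /* samples tallied so far                 */
};

/* Binary search for the bin containing addr. Returns the bin's index, or
 * -1 if addr falls outside every module. */
long pa_find_bin(const struct pa_bin *bins, size_t count, unsigned long addr)
{
    size_t lo = 0, hi = count; /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (addr < bins[mid].start)
            hi = mid;
        else if (addr > bins[mid].end)
            lo = mid + 1;
        else
            return (long)mid;
    }
    return -1;
}

/* Tally one program counter sample. Returns 1 if it was counted, or 0 if
 * it lies outside every module. */
int pa_tally(struct pa_bin *bins, size_t count, unsigned long addr)
{
    long i = pa_find_bin(bins, count, addr);

    if (i < 0)
        return 0;
    bins[i].samples++;
    return 1;
}
```

Each lookup costs on the order of log2(N) comparisons for N modules, so even several thousand samples are processed quickly.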


So, Why Aren't You Using Statistical Performance Analysis?



For execution time critical programs, the benefits of using statistical
performance analysis far outweigh the potential problems due to questionable
accuracy of results. With proper understanding of the statistical performance
analysis technique, most potential problems can be eliminated, making
statistical performance analysis a useful, practical tool.


Availability


In addition to the code availability sources listed on page 3, complete source
code and PC executable versions of pamsdos.exe and paes1800.exe are available
on a 5 1/4-inch, 360K floppy disk from Applied Microsystems. The executable
versions were compiled from the given source code using the PC Lattice C
compiler, Version 6.01. Both programs are fully commented and include fairly
complete usage messages.

_STATISTICAL PERFORMANCE ANALYSIS_
by Fred Motteler


[LISTING ONE]

/* patest.c -- A collection of simple routines to test the accuracy of
** statistical performance analysis programs for the PC and ES 1800 emulator.
*/

/* Default delay timing parameters */
#define DELAY_ONE 1
#define DELAY_TWO 2
#define DELAY_THREE 3
#define DELAY_FOUR 4
#define DELAY_FIVE 5
#define DELAY_SIX 6
#define DELAY_SEVEN 7
#define DELAY_EIGHT 8
#define DELAY_NINE 9
#define DELAY_TEN 10
#define DELAY_ELEVEN 11
#define DELAY_TWELVE 12
#define DELAY_THIRTEEN 13
#define DELAY_FOURTEEN 14
#define DELAY_FIFTEEN 15

#define DELAY_SCALE 10 /* Shift count: effectively multiplies values by 1024 */
#define DELAY_LOOPS 5 /* Default number of times thru main loop */

/* Loop delay parameters. These are done as globals to allow easy access to
** the timing parameters via the ES 1800 emulator. This allows different
** timing configurations to be tested without having to recompile and link
** this code. Kludgy, but it encourages easy experimentation. */
long final_sumL = 0;
int time_oneN = DELAY_ONE;
int time_twoN = DELAY_TWO;
int time_threeN = DELAY_THREE;
int time_fourN = DELAY_FOUR;
int time_fiveN = DELAY_FIVE;
int time_sixN = DELAY_SIX;
int time_sevenN = DELAY_SEVEN;
int time_eightN = DELAY_EIGHT;
int time_nineN = DELAY_NINE;
int time_tenN = DELAY_TEN;
int time_elevenN = DELAY_ELEVEN;
int time_twelveN = DELAY_TWELVE;
int time_thirteenN = DELAY_THIRTEEN;
int time_fourteenN = DELAY_FOURTEEN;
int time_fifteenN = DELAY_FIFTEEN;
int scaleN = DELAY_SCALE;
int looptoloopN = DELAY_LOOPS;

/* Function: long delay_xxxx(int delayN)
** Description: These are simple functions designed to allow varied delays.
** The code in each delay function is identical to the code in all of
** the other delay functions. This allows accurate comparison of the
** relative execution time of each function. Fifteen of these functions should
** be a reasonable number to represent a simple "real" program.
*/
long
delay_one(delayN)
int delayN;
{
 int i;
 long sumL;

 sumL = 0L;
 delayN <<= scaleN;
 for (i = 0; i < delayN; i++)
 sumL += (long) i;

 return(sumL);
}
long
delay_two(delayN)
int delayN;
{
 int i;
 long sumL;
 sumL = 0L;
 delayN <<= scaleN;
 for (i = 0; i < delayN; i++)
 sumL += (long) i;
 return(sumL);
}

 .
 .
 .
 .

long
delay_fifteen(delayN)
int delayN;
{
 int i;
 long sumL;
 sumL = 0L;
 delayN <<= scaleN;
 for (i = 0; i < delayN; i++)
 sumL += (long) i;
 return(sumL);
}

/* Function: void main()
** Description: This is a simple routine to run the various delay routines.
** The delay time variables are all globals to allow experimentation with
** the timing parameters using the ES 1800 emulator.
*/
void
main()
{
 int i;
 final_sumL = 0L;
 for (i = 0; i < looptoloopN; i++)
 {
 final_sumL += delay_one(time_oneN);
 final_sumL += delay_two(time_twoN);
 final_sumL += delay_three(time_threeN);
 final_sumL += delay_four(time_fourN);
 final_sumL += delay_five(time_fiveN);
 final_sumL += delay_six(time_sixN);
 final_sumL += delay_seven(time_sevenN);
 final_sumL += delay_eight(time_eightN);
 final_sumL += delay_nine(time_nineN);
 final_sumL += delay_ten(time_tenN);
 final_sumL += delay_eleven(time_elevenN);
 final_sumL += delay_twelve(time_twelveN);
 final_sumL += delay_thirteen(time_thirteenN);
 final_sumL += delay_fourteen(time_fourteenN);
 final_sumL += delay_fifteen(time_fifteenN);
 }
}




[LISTING TWO]

TITLE patick - IBM PC / Clone Clock Tick CS:IP Grabber
; File: patick.asm--Fred Motteler and Applied Microsystems Corporation
; Copyright 1990. All Rights Reserved
; Description:
; This file contains three functions:
; C callable:
; void paopen(bufferLP, lengthN) Initialize grabber interrupt vector
; int paclose() Close grabber interrupt vector
; Interrupt routine, this is treated like part of paopen():
; patick Grab CS:IP value
; These functions are configured for small model.
; Stack frame structure for paopen():
;
stkfr STRUC
 OLD_FR DW ? ; Previous stack frame pointer
 RETADDR DW ? ; Return address to caller
 BUFFERP DW ? ; Pointer to buffer to use
 BUFLEN DW ? ; Length of buffer (in longwords)
stkfr ENDS
;
; Stack frame structure for clock tick timer routine.
intfr STRUC
 INT_FR DW ? ; Pre-interrupt stack frame pointer
 IP_VAL DW ? ; Pre-interrupt IP value
 CS_VAL DW ? ; Pre-interrupt CS value
intfr ENDS
TIMER EQU 8h ; Timer interrupt vector number

DGROUP GROUP _DATA
_DATA SEGMENT WORD PUBLIC 'DATA'
 ASSUME DS:DGROUP
bufptr DW 0 ; Starting point of buffer
bufsiz DW 0 ; Number of longwords in the buffer
bufindx DW 0 ; Next location of buffer to use
bufwrap DB 0 ; Flag if buffer has wrapped...
_DATA ENDS
_TEXT SEGMENT BYTE PUBLIC 'CODE'
 ASSUME CS:_TEXT
;
; void paopen (unsigned long *bufferLP, int lengthN)
; This is a C callable function to initialize the CS:IP grabber and
; start it up. bufferLP points to the buffer in which to write CS:IP
; values. lengthN is the length of the buffer in longwords.
 PUBLIC paopen
paopen PROC NEAR
 push bp
 mov bp,sp
 push si
 push di
 push es
;
; Set up the local buffer pointer values from those passed on the stack.
 mov ax,[bp].BUFFERP ; Get pointer to start of buffer
 mov bufptr,ax
 mov ax,[bp].BUFLEN ; Get length of the buffer
 shl ax,1 ; convert longword length to byte length
 shl ax,1
 mov bufsiz,ax
 xor ax,ax ; Start at the beginning of the buffer
 mov bufindx,ax
 mov bufwrap,al ; Reset buffer wrap flag
;
; Save the original clock tick interrupt vector.
 mov al,TIMER ; interrupt number into al
 mov ah,35h ; DOS function = get vector
 int 21h ; DOS returns old vector in es:bx
 mov cs:oldseg,es ; save old segment
 mov cs:oldoff,bx ; save old offset
;
; Disable interrupts while changing the interrupt vector.
 cli
;
; Change clock tick interrupt routine to point at local interrupt routine.
 mov al,TIMER ; vector number
 mov ah,25h ; DOS function = set vector
 mov dx,OFFSET patick ; point to our interrupt handler
 push ds ; don't lose ds, we need to get to local data
 push cs ; move this cs to ds
 pop ds ;
 int 21h ; set the new vector
 pop ds ; restore ds
;
; Enable interrupts and return;
 pop es
 pop di
 pop si
 pop bp
 sti
 ret
;
; Clock tick grabber routine. This routine samples CS:IP that were pushed
; on to the stack when the interrupt occurs.
patick: push bp
 mov bp,sp ; Treat CS:IP values like stack frame
 push ax
 push bx
 push ds
;
; Get the local ds to allow access to local variables.
 mov ax,DGROUP
 mov ds,ax
;
; Use bx as a pointer to the recording buffer
 mov bx,bufptr
 add bx,bufindx
;
; Grab the pre-interrupt CS:IP values off the stack
 mov ax,[bp].IP_VAL ; grab the IP
 mov [bx],ax ; save the IP in the recording buffer
 inc bx
 inc bx
 mov ax,[bp].CS_VAL ; grab the CS
 mov [bx],ax ; save the CS in the recording buffer
 inc bx
 inc bx
;
; Check if we are at the end of the buffer
 sub bx,bufptr ; get the byte offset index back again
 mov ax,bufsiz ; get the buffer byte length
 cmp ax,bx
 jne notend ; jump if not at the end of the buffer
;
; At the end of the buffer
 mov bx,0 ; reset the buffer index
 mov al,0ffh ; set flag to indicate buffer wrap
 mov bufwrap,al
;
; Write out modified buffer index
notend: mov bufindx,bx
;
; Clean up
 pop ds
 pop bx
 pop ax
 pop bp
;
; Jump to the original interrupt service routine. An immediate jump
; is used so no segment registers are required.
 DB 0eah ; jmp immediate, to the offset:segment
; selected below (brute force approach).
; Original interrupt handler's offset and segment values. These are
; in the current code segment to allow the interrupt routine given
; here to directly jump to the original interrupt routine.
oldoff DW 0 ; Room for original timer interrupt offset
oldseg DW 0 ; Room for original timer interrupt segment


paopen ENDP
;
; int paclose() This is a C callable function to close CS:IP grabber and
; return the number of CS:IP values grabbed.
 PUBLIC paclose
paclose PROC NEAR
 push bp
 mov bp,sp
 push si
 push di
 push es
;
; Disable interrupts while the original interrupt vector is restored.
 cli
 mov al,TIMER ; get interrupt number
 mov ah,25h ; DOS function = set vector
 push ds ;
 mov dx,cs:oldoff ; old timer offset
 mov ds,cs:oldseg ; old timer segment
 int 21h ; restore old vector
 pop ds ;
;
; Enable interrupts.
 sti
;
; Calculate the number of CS:IP values
 cmp bufwrap,0 ; check if the buffer has wrapped
 jne wrapped ; jump if it has wrapped
 mov ax,bufindx ; no wrap, return buffer index as count
 jmp done
wrapped: mov ax,bufsiz ; wrapped, return buffer size as count
;
; Clean up stack and return
done: shr ax,1 ; Return count in number of CS:IP pairs
 shr ax,1
 pop es
 pop di
 pop si
 pop bp
 ret

paclose ENDP
_TEXT ENDS
 END






[LISTING THREE]

/* pawhere.c -- contains a very simple program that returns its segment base
** address. Note that this program is Lattice version 6.01 specific in that
** the Lattice small model has "main" at the beginning of the executable
** portion of the program. Other compiler/linker packages may require that the
** program map be examined for the module that starts the program.
** Copyright 1990 Fred Motteler and Applied Microsystems Corporation
*/

#include <dos.h>
#include <stdio.h>
#include <stdlib.h> /* for exit() */

unsigned int
main()
{
 FILE *fp;

 fp = fopen("pawhere.tmp", "w");
 if (fp == (FILE *) NULL) /* cannot report the load address without the file */
 exit(1);
 fprintf(fp, "%x %x\n",
 (FP_SEG((char far *) main)), (FP_OFF((char far *) main)));
 fclose(fp);
 exit(0);
}






[LISTING FOUR]

/* pamsdos.c -- Utility functions used by MS-DOS version of the statistical
** performance analysis package.
** Copyright 1990 Fred Motteler and Applied Microsystems Corporation
*/

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <string.h>
#include "padef.h"

/* Function: int main( argcN, argvAS )
** Description: MS-DOS based statistical performance analysis program.
** Command line arguments: pamsdos prog.map prog.cfg prog.exe options
** Where: prog.map = memory map for program; prog.cfg = memory map
** configuration; prog.exe = program to run; options = command options
** for the program to run
*/
int
main( argcN, argvAS )
int argcN;
char *argvAS[];
{
 int errorN; /* Error code */
 unsigned int segmentW; /* Starting load address of program to run */
 unsigned int offsetW;
 unsigned long originL;
 int processedN; /* Number of map globals processed */
 int i; /* General index */
 FILE *mapFP; /* Map file to read */
 FILE *formatFP; /* File with map file format information */
 char commandAC[PA_LINE_LEN]; /* Complete command line for program */
 int pagelinesN; /* Number of lines on output page, 0 if
 * continuous, -1 if no display output, else
 * n if n lines per page. */
 FILE *listFP; /* Results output file */
 char listAB[80]; /* Optional results listing file path/name */

 char pagelinesAB[8]; /* String for number of lines/page */

 printf("pamsdos - Statistical performance analysis tool for MS-DOS\n");
 printf("Version %s\n", PA_VERSION);
 printf("Copyright (C) 1990 Fred Motteler and Applied Microsystems Corp\n");
 if (argcN < 4)
 {
 printf("Usage: pamsdos prog.map prog.cfg prog.exe [options]\n");
 printf(" Where: prog.map memory map for program\n");
 printf(" prog.cfg memory map configuration file\n");
 printf(" prog.exe program to run\n");
 printf(" [options] command line options for program to run\n");
 exit(-100);
 }

 /* Determine where the program to run is to be located. */
 if ((errorN = pa_locate(&segmentW, &offsetW)) != 0)
 {
 pa_error(errorN);
 exit(errorN);
 }
 /* Calculate origin of program. Room must be allowed for memory
 * malloc()'d off the heap. */
 originL = (unsigned long) (segmentW + 1);
 originL <<= 4;
 originL += (unsigned long) (offsetW - 2);
 originL += (unsigned long) (PA_BUFLEN << 2);

 if ((pa_debugN & PA_GENERAL) != 0)
 {
 printf("program start segment:offset %x:%x\n", segmentW, offsetW);
 printf(" linear address %lx\n",originL);
 }

 /* Get the complete command line to invoke the program. */
 strcpy(commandAC, argvAS[3]);
 if (argcN > 4)
 {
 for (i = 4; i < argcN; i++)
 {
 strcat(commandAC," ");
 strcat(commandAC,argvAS[i]);
 }
 }

 /* Run the program and collect samples. */
 printf("Starting %s\n", argvAS[3]);
 if ((errorN = pa_pcsample(commandAC, PA_SAMPLE, PA_BUFLEN)) != 0)
 {
 pa_error(errorN);
 exit(errorN);
 }

 /* Read in the configuration file to get map format information and
 * to get number of lines / display page and option listing file. */
 if ((formatFP = fopen(argvAS[2], "r")) == (FILE *) NULL)
 {
 pa_error(PA_NO_CFG_E);
 exit(PA_NO_CFG_E);

 }

 /* Read in display lines, and optional output file configuration data
 * from the configuration file. */
 if (((errorN = paconfig(formatFP, PA_PAGELINES, pagelinesAB)) != 0) ||
 ((errorN = paconfig(formatFP, PA_LISTFILE, listAB)) != 0))
 {
 pa_error(errorN);
 fclose(formatFP);
 exit(errorN);
 }

 /* Determine the number of lines/page to display */
 if (sscanf(pagelinesAB, "%d", &pagelinesN) != 1)
 {
 pa_error(PA_BAD_ARG_E);
 fclose(formatFP);
 exit(PA_BAD_ARG_E);
 }

 /* Open the optional listing file */
 if (listAB[0] == '\0')
 listFP = (FILE *) NULL;
 else if ((listFP = fopen(listAB, "w")) == (FILE *) NULL)
 {
 pa_error(PA_NO_LST_E);
 fclose(formatFP);
 exit(PA_NO_LST_E);
 }

 /* Read program's memory map and create "bins" for program counter samples.
*/
 if ((mapFP = fopen(argvAS[1], "r")) == (FILE *) NULL)
 {
 pa_error(PA_NO_MAP_E);
 fclose(formatFP); /* mapFP is NULL here; close the file that is open */
 exit(PA_NO_MAP_E);
 }
 if ((errorN = pardmap(mapFP, formatFP, originL, &processedN)) != 0)
 {
 pa_error(errorN);
 fclose(mapFP);
 fclose(formatFP);
 exit(errorN);
 }

 /* Process the samples and sort the bins according to the PC hits in
 * each bin. */
 printf("Processing samples\n");
 if ((errorN = pa_bstuff(PA_SAMPLE, patableAHP, &processedN)) != 0)
 {
 pa_error(errorN);
 fclose(mapFP);
 fclose(formatFP);
 exit(errorN);
 }

 /* Display the results */
 padisply(patableAHP, processedN, pagelinesN, listFP);
 fclose(mapFP);

 fclose(formatFP);
 exit(0);
}

/* Function: int pa_locate(unsigned int *segmentPW, unsigned int *offsetPW)
** Description: This function figures out where in memory the program to be
** analyzed is to be run. MS-DOS executables are dynamically located at
** runtime. In order to avoid the complexity of writing a DOS ".exe" loader
** program, a simpler approach is used here. This function uses the ANSI
** system() library function to execute a trial program, "pawhere.exe" that
** writes its starting code segment and offset to a temporary file
** "pawhere.tmp". After "pawhere.exe" has finished, this function opens the
** temporary file and reads the starting segment and offset value. It is
** assumed that the desired program to be tested will have the same starting
** code segment and offset. If all operations were successful, then 0 is
** returned. Otherwise a non-zero error code will be returned.*/
int
pa_locate(segmentPW, offsetPW)
unsigned int *segmentPW;
unsigned int *offsetPW;
{
 FILE *fp;

 /* First figure out where the program will be loaded. Run "pawhere.exe"
 * via a system() function call. */
 if ((system("pawhere")) != 0)
 return(PA_NO_WHERE_E);

 /* Read in the result from pawhere.tmp. */
 if ((fp = fopen("pawhere.tmp", "r")) == (FILE *) NULL)
 return(PA_NO_TMP_E);
 if ((fscanf(fp, "%x %x", segmentPW, offsetPW)) != 2)
 {
 fclose(fp); /* Do not leak the file handle on a bad read */
 return(PA_BAD_TMP_E);
 }
 fclose(fp);
 if (remove("pawhere.tmp") != 0)
 return(PA_TMP_RM_E);

 return(0);
}

/* Function: int pa_pcsample(char *programS, char *sampfileS, int samplesN)
** Description: This function runs the program (entire command line) pointed
** to by programS, while sampling its program counter every PC clock tick.
** Up to samplesN program counter samples are collected, and then written
** out in binary format to the file sampfileS.*/
int
pa_pcsample(programS, sampfileS, samplesN)
char *programS; /* Command line of program to run */
char *sampfileS; /* File to use to write out pc samples */
int samplesN; /* Maximum number of samples to collect */
{
 unsigned long *pcbufferPL; /* Long pointer to local pc sample buffer */
 unsigned int *pcbufferPW; /* Word pointer to local pc sample buffer */
 unsigned long *pcorgPL; /* Original copy of pointer to pc sample buf */
 unsigned int segmentW; /* Starting segment of program to run */
 unsigned int offsetW; /* Starting offset of program to run */
 int handleN; /* pc sample file handle */
 unsigned long sampleL; /* segment:offset sample converted to linear */
 int i; /* general index */


 /* Grab memory for the sample buffer */
 if ((pcbufferPL = (unsigned long *) malloc((4*samplesN)))
 == (unsigned long *) NULL)
 return(PA_NO_MEM_E);
 /* Copy buffer pointer to allow word (int) access as well as long access.*/
 pcbufferPW = (unsigned int *) pcbufferPL;
 pcorgPL = pcbufferPL;

 /* Start CS:IP sampling */
 paopen(pcbufferPW, samplesN);

 /* Run the desired program. */
 if (system(programS) != 0)
 {
 paclose();
 free(pcorgPL); /* Release the sample buffer before bailing out */
 return(PA_NO_EXEC_E);
 }

 /* Stop sampling */
 samplesN = paclose();

 /* Convert the samples from segment:offset pairs to linear addresses relative
 * to the origin of the loaded program. */
 if ((pa_debugN & PA_GENERAL) != 0)
 printf("pa_pcsample: number of samples: %d\n", samplesN);
 for (i = 0; i < samplesN; i++)
 {
 /* Read segment:offset value from the table. */
 offsetW = *pcbufferPW++;
 segmentW = *pcbufferPW++;

 if ((pa_debugN & PA_GENERAL) != 0)
 printf("pa_pcsample: sample segment:offset %x:%x\n",
 segmentW,offsetW);

 /* Convert it to a linear address. */
 sampleL = ((unsigned long) offsetW)
 + (((unsigned long) segmentW) << 4);
 /* Write the linear address back to the table. */
 *pcbufferPL++ = sampleL;
 if ((pa_debugN & PA_GENERAL) != 0)
 printf("pa_pcsample: linear sample %lx\n",sampleL);
 }

 /* Write the samples to a binary file. */
 if ((handleN = open (sampfileS, (O_CREAT | O_WRONLY | O_RAW), 0))
 == (-1))
 {
 free(pcorgPL);
 return(PA_NO_PC_FILE_E);
 }
 if ((write( handleN, ((char *) pcorgPL), (samplesN << 2)))
 != (samplesN << 2))
 {
 close(handleN);
 free(pcorgPL);
 return(PA_NO_PC_WR_E);
 }


 close(handleN);
 free(pcorgPL);
 return(0);
}
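
A note on the address arithmetic in Listing Four: pa_pcsample() turns each sampled segment:offset pair into a linear address by shifting the segment left four bits and adding the offset. The fragment below is a minimal sketch of that one step in isolation. The function name pa_linear is hypothetical and is not part of the package; the sketch assumes both values fit in 16 bits, as they do in real mode.

```c
#include <stdio.h>

/* Convert a real-mode segment:offset pair to a linear address.
 * A segment value counts 16-byte paragraphs, hence the shift by 4.
 * pa_linear is a hypothetical name used only for this illustration. */
unsigned long pa_linear(unsigned int segmentW, unsigned int offsetW)
{
    return (((unsigned long) segmentW) << 4) + (unsigned long) offsetW;
}

/* Print one sample in the same style as the listing's debug output. */
void pa_show(unsigned int segmentW, unsigned int offsetW)
{
    printf("sample segment:offset %x:%x linear %lx\n",
           segmentW, offsetW, pa_linear(segmentW, offsetW));
}
```

Note that different segment:offset pairs can name the same byte: 1234:0010 and 1235:0000 both convert to 12350 hex, which is why the samples are compared as linear addresses rather than as pairs.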



December, 1991
VISIBLE RESULTS WITH VISUAL BASIC


Building a network mail application--in four days!


 This article contains the following executables: EMAIL.ARC


Al Stevens


Al is a DDJ contributing editor and can be contacted at 501 Galveston Dr.,
Redwood City, CA 94063.


A small revolution is quietly taking place in the world of the Windows/DOS
programmer. Microsoft has flattened the Windows programming ramp into a gentle
slope with a product called Visual Basic, a program development environment
that makes Windows programming not only a breeze, but a pleasure as well.
Visual Basic is a milestone product that defines the next generation for
desktop software development. It integrates the interactive design of the
application's user interface with Basic language code that processes the
user's actions. A programmer designs screen windows and dialog boxes by using
onscreen, interactive tools and integrates this visual design with fragments
of code to create a Windows application. What is so important about this
integration?
A large part of the development of most interactive programs is the design and
implementation of the user interface. The Windows API provides hooks into the
function libraries that display windows and collect the user's actions. Using
that API, however, is not easy, particularly if you are new to Windows. It
takes some time for a newcomer to climb the learning ramp. The Software
Development Kit (SDK) has a lot of functions and tools for the programmer to
learn -- more than most programmers care to tackle. Once you have all the
functions, messages, resources, and so on under your belt, you write code -- a
lot of code -- to display and manage windows, menus, and dialog boxes. Most of
your work is involved with user interface code.
Writing a Windows program is an iterative process of designing the user
interface, writing the code, compiling, testing, and changing your mind about
what looks good. Visual Basic does not eliminate the iterations, but it
assumes most of the tedium. It targets the tasks that occupy most of a
programmer's time and greases the two toughest parts of Windows programming --
designing the interface and writing the code to support it. There are no
longer any valid reasons why a programmer should be unable or unwilling to
write Windows programs.
Visual Basic is not the first developer's tool to integrate design and coding,
but it will have more impact on the programming community, first because it
comes from Microsoft and, more importantly, because it supports Windows
programming. Its origins at Microsoft virtually guarantee that it will be
aggressively promoted and supported. Its support of Windows opens a flood gate
for new applications that were not developed before because Windows
programming was too difficult. Visual Basic might not be the tool to build the
next horizontal "killer app," but it is more than sufficient for all those
custom and vertical applications that have gone begging because nobody wanted
to jump feet-first into the SDK quagmire. We have come to believe that Windows
programming is too difficult for the average programmer, giving rise to a
Windows programming priesthood. That perception is wrong, but until now
becoming a productive Windows programmer required a significant investment in
time.
The example electronic mail program that accompanies this article is my first
Windows application of any significance whatsoever. There are only about 640
lines of code in the program, and it took about four days to write. That
included learning Visual Basic, relearning the fundamentals of the Basic
language, learning something more about Windows programming, designing the
program and its data structures, and, oh yes, writing the code. More about the
mail program later.


Designing the User Interface


You write Visual Basic programs by interactively building the screen windows
the user will see and adding the code that processes the user's actions. Each
window is called a "form" in Visual Basic. When you start Visual Basic, you
see a typical Windows menu bar across the top, a collection of tool icons down
the left edge of the screen, a project window on the right, and a big fat
blank form in the middle. The project window lists elements of the program. At
first, there are two items in the list -- the form and an empty file called
GLOBAL.BAS. The form represents the first thing the user will deal with when
your program starts.
You write your program by changing the form's appearance and adding visual
components and code to it. This process involves changing the form's size and
position and adding "controls" and menus. Controls are text-entry boxes, check
boxes, radio and command buttons, scroll bars, list boxes, labels, and so on --
all typical of Windows applications. Visual Basic groups these forms and
controls into "objects." Besides the usual controls, there are specialized
drive, directory, and list boxes that navigate the disk and file system,
display the current values, and let the user change them. The Picture is
another control type. It lets you include bitmap images in your application.
You can add the images when you design the application and you can load them
at runtime from bitmap and icon files. Visual Basic comes with a collection of
canned icon files. One of its sample applications is an icon editor and viewer
program, so you can build your own icons from scratch or modify one of the
canned ones. You can draw pictures at runtime as well by using the Visual
Basic graphics commands. There is a Timer control, too, into which you build
the code that executes after specified time-outs.
To add a control to a form during design, you select its icon from the tool
box and position the control on the form, using the mouse to drag the control
around and fix its size. Each object has a unique control identifier that the
program uses to address the control. Objects have properties that describe
their appearance and behavior. You determine some of the properties when you
design the program, and you can change some of them at runtime from within the
code. Among the form's properties are its form name, which the program uses to
address the form; its caption, which appears in the title bar; whether it has
a control box and minimize and maximize buttons; what the mouse cursor looks
like; the form's color; its border style; whether the user can change its
size; and so on. Controls have properties relevant to their type. A text box
might have scroll bars, for example, and it can use any of several fonts and
font sizes. A check box might be initially checked or unchecked. You can
organize controls into groups, too, which affects how the user addresses them,
and you can define the order that the user tabs among the controls.
A menu is a menu bar with pull-down menus, typical in Windows applications. To
build it, you open the Menu Design Window and define each item on the menu bar
and each item in the pull-down menus in an outline-style indented table. You
must specify a control name for each selection as well as other menu
properties, such as whether the selection is checked, enabled, or visible when
the menu first displays. You can assign shortcut and accelerator keys to the
menu selections.
Any time during this design process you can tell Visual Basic to run your
program. Even though you haven't written any code yet, the form displays just
as the user will see it. You can pull down menus, tab among the controls, key
in some text, turn the check boxes on and off, push the radio buttons, and so
on.


Writing the Code


During the form design process, you can double-click an object and Visual
Basic will open a code window for the form with the object selected. The code
window has "Object" and "Proc" drop-down list boxes at the top. The Object
list contains an entry for the form and one for each of the controls and menu
selections that you have designed. The entries identify the items by their
control names. Visual Basic assigns default names: Form1, Form2, Check1,
Edit1, and so on. You soon learn to use more meaningful names. Each entry in
the Object list has an associated Proc list. The Proc entries are fixed
according to the type of the object. Check boxes have one set of Proc entries,
list boxes have another, and so on. Each Proc entry is an event, and each
event has a code fragment that the event executes. Initially, these code
fragments are empty. It is your job to write the code.
Suppose that you design an application with two forms, and the first form
loads the second form when the user selects a particular command button --
call it the NewForm button. You would write the code that processes the Click
event for the NewForm button. Visual Basic names the subroutines by using the
object name and the event. The subroutine for the Click event on your NewForm
button would look like this:
 Sub NewForm_Click

 End Sub
You would put the one or two lines of code into the subroutine that loads and
displays the second form. That's all there is to it. Your program is mostly a
collection of small code fragments that react to events.
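
For readers who think in C, the same structure can be pictured as a table that binds an object name and an event name to a small handler. The sketch below is only an analogy under that assumption; it does not show how Visual Basic is implemented, and every name in it other than NewForm_Click is hypothetical.

```c
#include <string.h>

typedef void (*handler_fn)(void);

int forms_loaded = 1;   /* the first form is already on the screen */

/* Handler for the Click event of the NewForm button. Incrementing the
 * counter stands in for loading and showing the second form. */
void NewForm_Click(void)
{
    forms_loaded++;
}

/* One row per event procedure that actually contains code. */
struct binding {
    const char *object;
    const char *event;
    handler_fn fn;
};

static const struct binding bindings[] = {
    { "NewForm", "Click", NewForm_Click },
};

/* Run the handler bound to object/event. Returns 1 if one was found,
 * 0 if the event procedure is empty and nothing happens. */
int dispatch(const char *object, const char *event)
{
    size_t i;

    for (i = 0; i < sizeof bindings / sizeof bindings[0]; i++) {
        if (strcmp(bindings[i].object, object) == 0 &&
            strcmp(bindings[i].event, event) == 0) {
            bindings[i].fn();
            return 1;
        }
    }
    return 0;
}
```

The point of the analogy is that there is no main loop for you to write. You supply only the rows of the table, and the environment decides when each one runs.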
Besides the code fragments for events, there are places for you to put common
subroutines and functions and global declarations.


The Application Definition Files


A Visual Basic application stores its design in several files. The definition
of each form and the code for its objects are in a module with the form's
CtlName property for a filename and .FRM as the extension. The GLOBAL.BAS
module has global declarations and constants. You can insert a Visual Basic
file named CONSTANT.TXT into the GLOBAL.BAS module to get some standard
definitions, and you put your own global definitions there as well. Most
programs will have at least one other .BAS module that contains functions and
subroutines that are common across forms. This collection of .BAS and .FRM
modules constitutes the design and implementation of your Windows application.
The modules are listed in the Project Window. The contents of the Project
Window are recorded in the project's .MAK file.


Debugging


Visual Basic's debugger is a primitive one, slightly better than GWBASIC
debugging techniques and almost as good as QuickBasic's Debug mode. You can
set breakpoints, step through the program one statement at a time, step
through procedures, and break into the program's execution. You can make
source code changes, often without restarting. You can use the Immediate
Window to enter and execute Basic commands while the program is interrupted,
perhaps to view or change the values of variables.
Don't be put off by the shortage of debugging features. Most of the reasons
why you need full-functioned debuggers in DOS do not apply to Visual Basic
programs. The code that will have the bugs in it is in those small event-driven
modules. The logic that drives their execution is a function of your visual
design rather than the structure of a program. Visual Basic manages the parts
of a Windows program that usually have the difficult bugs.



The Runtime Application


To prepare your application for distribution to users, you tell Visual Basic
to make the .EXE file. This file, along with the VBRUN100.DLL file supplied
with Visual Basic, is all the user needs. Microsoft allows you to freely
distribute the runtime DLL. No doubt future versions of Windows will include
this file or one of its successors.


Distributing Source Code


If you print source code for documentation or distribution, you are in
trouble. The source code is a small part of a Visual Basic application. Most
of the programmer's efforts are in the design of objects, which is buried
inside the .FRM files. You can print source code or save it to .TXT files. You
can print mostly useless primitive line drawings that resemble your forms, but
do not display everything. No matter, a screen capture to the Clipboard and a
session with Windows Paint will get your screens into print.
Printing the source code and pictures of the forms is not enough, though. The
glue that holds everything together is the design of the objects. Each object
has an object type, a name, and properties. For example, readers of your
documentation will not understand the FooBar_Click subroutine if they do not
know that FooBar is a command button, and they will not understand when
another part of the program changes the FooBar.Enabled property. Visual Basic
provides no way for you to view these properties other
than to look at them on the screen. Documentation of Visual Basic programs
needs to be on magnetic media. Print the pictures and describe the more
interesting code fragments, but give them the rest on a diskette.


VBMAIL: An Example


To wring out Visual Basic and get up to speed, I developed a simple electronic
mail program for networks and called it VBMAIL. I'll explain the program here
to give you an idea of the scope of an application that one can develop in a
short time.
The VBMAIL program starts with an AddrBook form for the address book shown in
Figure 1. You sign on by selecting your user name from the address book. If
your name is not already in the address book, you can enter your userid and
name into InputBox$ dialog boxes, and VBMAIL will add you to the book. Each
user has a file named userid.USR with the user's name and a subdirectory to
store mail messages and the mail database.
An environment variable such as USR=JUDY bypasses the sign-on process.


Reading and Saving Mail


After you are logged on, VBMAIL displays the MailBox form shown in Figure 2.
Mail is stored in your subdirectory in files named EMAILnnn.MSG. The nnn is a
number from 000 to 999. You cannot have more than 1000 messages. The mailbox
displays the date, sender, and subject of each message. The text box at the
bottom of the form shows the text of whichever message is currently selected
in the list box. The user can scroll through the message list and the message
text.
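
The naming scheme is simple enough to sketch. VBMAIL itself is written in Visual Basic, so the C fragment below is only an illustration of the EMAILnnn.MSG convention and its 1000-message ceiling; the function name and the MAX_MESSAGES constant are hypothetical.

```c
#include <stdio.h>

#define MAX_MESSAGES 1000   /* nnn runs from 000 to 999 */

/* Build "EMAILnnn.MSG" for message number n into buf.
 * Returns 0 on success, -1 if n is out of range or buf is too small
 * (13 bytes are needed, counting the terminating NUL). */
int mail_file_name(int n, char *buf, size_t buflen)
{
    int written;

    if (n < 0 || n >= MAX_MESSAGES)
        return -1;
    written = snprintf(buf, buflen, "EMAIL%03d.MSG", n);
    if (written < 0 || (size_t) written >= buflen)
        return -1;
    return 0;
}
```

Because the number is always padded to three digits, every name fits the DOS 8.3 filename format and the files sort in message order.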
The File menu on the MailBox form lets you open an Editor form (Figure 3) to
write and send mail. It also lets you open the Mail Files form to view
incoming messages that you have saved. Viewing the Mail Files is the same as
viewing the MailBox, except that the program displays files with the .FIL
extension rather than .MSG. The Save command from the MailBox form's File menu
renames the selected .MSG file to a .FIL file. The Reply command opens the
Reply form initialized to reply to the selected message. The Delete command on
the File menu deletes the message file after you respond to a confirmation
message box. Visual Basic includes a standard set of message and string input
dialog boxes that you can invoke with a single function or subroutine call.
VBMAIL includes a form named Holder that is never seen by the user. Its
purpose is to host two icons for the mailbox and a timer control that watches
for incoming mail so VBMAIL can alert the user.


Sending Mail


To write a new message from the Editor form, you type the message, address it,
and send it. The address must be one of the user names from the Address Book.
You can open the AddrBook form to retrieve a user name or type it in yourself.
If you leave the name blank or type an invalid one, VBMAIL automatically opens
the AddrBook form for you.
Writing a Reply is similar to writing a new message, except that the address
is always the sender of the message to which you are replying, and you can
read that message while you write the reply. New messages and replies are
written into the receiving user's subdirectory with the .SND file extension.


Watching for Incoming Mail


The timer control in the Holder form runs every five seconds and uses the Dir$
function to see if there are any .SND files in the user's subdirectory. If so,
VBMAIL renames the file to .MSG so that it will not see it again, reads it,
and adds it to the MailBox form's list-box control. It then changes the
MailBox's icon from an empty mailbox with the flag down to one with a letter
in it and the flag up, and alerts the user with the Beep statement. If there
are no .SND files, the timer changes the icon back to the empty mailbox. Both
icons come from the Visual Basic icon set.
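
The rename is what keeps the timer from reporting the same message twice. The C fragment below is a rough sketch of that one step, not a translation of the VBMAIL code: it assumes uppercase DOS-style names, and the function names snd_to_msg and claim_message are hypothetical. The only library call involved is the standard rename() from stdio.h.

```c
#include <stdio.h>
#include <string.h>

/* Derive the ".MSG" name for a ".SND" file. Returns 0 on success,
 * -1 if the name does not end in ".SND" or does not fit in out. */
int snd_to_msg(const char *snd, char *out, size_t outlen)
{
    size_t len = strlen(snd);

    if (len < 4 || strcmp(snd + len - 4, ".SND") != 0 || len + 1 > outlen)
        return -1;
    memcpy(out, snd, len - 4);
    strcpy(out + len - 4, ".MSG");
    return 0;
}

/* Claim one incoming message by renaming it, so that the next timer
 * tick no longer finds it. Returns 0 if the file was claimed, -1 if
 * the name is wrong or the rename fails (for example, no such file). */
int claim_message(const char *snd, char *msg, size_t msglen)
{
    if (snd_to_msg(snd, msg, msglen) != 0)
        return -1;
    return rename(snd, msg) == 0 ? 0 : -1;
}
```

A caller would run claim_message() once per .SND file found on each tick and only then read the renamed file and update the list box.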


Using the Clipboard


The Editor and Reply forms have Edit menus with Cut, Copy, Paste, and Delete
commands. Visual Basic text controls already have these functions built into
the Shift+Del, Ctrl+Ins, Shift+Ins, and Del accelerator keys. The code
fragments for the menu selections use the SendKeys statement to execute the
operations. Visual Basic takes care of the rest.


Read-Only Text Boxes


The message-reading textboxes on the MailBox and Reply forms are read-only.
Visual Basic does not include a read-only property for text controls, so you
must do it yourself. VBMAIL allows the user to Copy text from these read-only
messages into the Clipboard, to Paste the text into replies, for example, but
the user must not add or delete text in the read-only text controls. By adding
code to the KeyPress and KeyDown events for these controls, I was able to head
off any keystrokes that would change the text and still pass through the keys
that move the cursor around, mark blocks, and copy them to the Clipboard.
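
The decision the KeyDown code has to make can be sketched as a small filter. The C fragment below is an illustration only, not the VBMAIL code: the key values are the Windows virtual-key codes, the modifier bits are an assumption modeled on the Shift argument that the KeyDown event receives, and the function name is hypothetical. A companion KeyPress handler would simply swallow every printable character.

```c
/* Windows virtual-key codes for the keys of interest. */
enum {
    K_PGUP = 0x21, K_PGDN = 0x22, K_END = 0x23, K_HOME = 0x24,
    K_LEFT = 0x25, K_UP = 0x26, K_RIGHT = 0x27, K_DOWN = 0x28,
    K_INSERT = 0x2D, K_DELETE = 0x2E
};

/* Modifier bits (assumed layout: 1 = Shift, 2 = Ctrl). */
enum { M_SHIFT = 1, M_CTRL = 2 };

/* Return 1 if the key may reach the read-only text control,
 * 0 if it must be swallowed because it would change the text. */
int readonly_key_allowed(int key, int mods)
{
    switch (key) {
    case K_PGUP: case K_PGDN: case K_END: case K_HOME:
    case K_LEFT: case K_UP: case K_RIGHT: case K_DOWN:
        return 1;   /* cursor movement; with Shift held it marks a block */
    case K_INSERT:
        /* Ctrl+Ins copies to the Clipboard; Shift+Ins would paste. */
        return (mods & M_CTRL) != 0 && (mods & M_SHIFT) == 0;
    default:
        return 0;   /* Del, Shift+Del, and anything else is refused */
    }
}
```

Refusing by default is the safer design here: a key that is forgotten in the list fails to work, rather than quietly editing a message that is supposed to be read-only.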
Due to space limitations, the source code for the VBMAIL program is provided
electronically (see "Availability," page 3). By examining the listings and
looking at the figures, you can see how the program is designed.
If you are not on a network, you can run two copies of VBMAIL with different
userids and send messages back and forth. If the receiver's MailBox form is
minimized, you will see its icon raise its flag within five seconds from the
time the sender sends a message. If the receiver's MailBox is active, you will
see the message appear in the list-box control.


Improving VBMAIL



VBMAIL is a compact e-mail application. How could you improve it? First, you
could combine the messages into indexed files to improve access time and disk
space use. VBMAIL does not keep copies of messages that you send to other
users, and that feature would be an improvement. You could add message
threads, text searches, multiple database folders, and private address books.
VBMAIL could copy messages to one or more carbon copy receivers. How about
blind carbon copies? Distribution lists? Enclosures? You could add gateways to
delivery agents, such as MHS, and to other mail systems, such as CompuServe
and MCI Mail. You might want to rearrange the users' subdirectory structure to
use network protection strategies so that users could not read or delete one
another's mail.
Despite these potential improvements, VBMAIL is a significant application for
a four-day project with only 640 lines of code to debug. The design phase of
the project took no more -- probably much less -- time than it would have
taken if I had written VBMAIL in C and used the SDK. After a design is done
with the SDK, I have to code all those window-processing modules with all
those switch statements and all those bugs -- all the time with my head buried
in the Windows Programmer's Reference to keep track of the hundreds of API
functions and messages.


What Do You Need to Know?


To use Visual Basic, you need to know what Windows programs look like. You'll
get that exposure by installing Windows and playing with the accessory
programs. You need to understand event-driven programming. That will come
naturally. There is no other way to write a Visual Basic program, and learning
event-driven techniques will be a side-effect of your first development task.
It will be helpful to have and know how to read the Windows Programmer's
Reference. From time to time you will want to do something that Visual Basic
does not support, and you might be able to do it with the Windows API. You can
get the Programmer's Reference at the book store. Microsoft Press publishes it
as a book.
Finally, you will need to know or learn the QuickBasic dialect of Basic. More
about that later.


Problems


Visual Basic is a well-conceived, well-implemented product, particularly for
version 1, and more particularly for such a radically different product. It
has its problems and its deficiencies, but most of these have workarounds.
Where there are no workarounds, one learns to do without or be careful. Here
are some of the knots I hit.
If you accidentally bump the Del key while you have a control selected, Visual
Basic deletes the control. The Undo command on the Edit menu does nothing
following such a deletion. You have to rebuild the control from scratch.
Visual Basic does save the deleted object's code fragments in the Proc list of
the form's General object, but the visual part of the control is gone. If you
delete a frame that contains several other controls, you've just lost a lot of
work. At the very least, Visual Basic should ask you to confirm the deletion.
The Color Palette has a number of nonsolid colors, and you can build custom
colors. These colors can be the background of a control. But when that control
is a label, the background of the character boxes is always the solid
component of the color. The workaround is to use the PRINT statement to create
labels on a form, but then you cannot get Click events for the pseudolabels.
If you make changes to your forms and delete controls, Visual Basic keeps some
residue in its form definition files. The compiled .EXE file is bigger than it
needs to be. To get the smallest .EXE size, you must save the source code as
text and then load it back in, replacing the original code. Then, immediately
build your .EXE. I saw VBMAIL.EXE shrink by 3K after one such session.
For some reason, the Save Text command on the Code module sometimes defaults
to the subdirectory for icons rather than the subdirectory of the .FRM or .BAS
file. I went through several Saves and Loads before I realized I wasn't
getting anywhere.
When you start a Windows program from the Program Manager, Windows sets the
startup subdirectory to where the .EXE file is. When you run a program from
within the Visual Basic design process, there is no way to tell it where you
want the current subdirectory to be. For this reason, programs that look for
subdirectories below their own path will need a ChDir in the start-up code
just for testing within Visual Basic. VBMAIL has such a statement in its Main
function. You'll need to take that statement out before you build the .EXE
file.
So far, I've had only two of the dreaded and infamous Windows Unrecoverable
Application Errors (UAE) while running Visual Basic. I don't know how I got
the first one, and I could not reproduce it, but it cost me about an hour's
work. I save my work more often now. The other UAE came when I tried to merge
all of a big Microsoft file named WINAPI.TXT into the GLOBAL.BAS file. From
that point on, Visual Basic reported that it was out of memory, no matter what
I did to clean things up. After a while I got the UAE. UAEs are a fact of life
among Windows programmers. Unfortunately, Windows users see them as well, all
too often. Microsoft should install UAE-activated cattle prods into its
programmers' chairs. They'd work harder to make Windows more reliable.


How Good is Basic?


Basic bashing is probably more popular than C bashing, mainly because most C
programmers do it, and they (we) seem to outnumber everybody else. If anything
will keep programmers away from Visual Basic, it will be the "Basic" part.
Visions of line numbers, GOTOs, no local scope, no structure, all bring to
mind the Basic of yore. Visual Basic is not the same Basic that some of us
remember. It is a dialect of QuickBasic with most of the structured
programming constructs that a programmer needs. The language is effective,
robust, and complete.
There is one insidious feature that has been in Basic from its beginnings, and
Visual Basic has it too. A program can implicitly declare variables by
referencing them. The compiler sees a new variable name and declares it with
whatever type the code expects. Visual Basic supports explicit declarations of
local and global variables, but the old way works, too, automatically
declaring variables that you do not explicitly declare elsewhere. Misspell a
variable name and you declare a new variable, like it or not. You can do a lot
of head scratching when the program acts up. Other than that, I don't see
much wrong with the Visual Basic version of the Basic language.


The Windows API


Eventually, you will hit the limit with Visual Basic. You will want to do
something that you know other Windows programs can do but that Visual Basic
does not support. Inasmuch as Visual Basic programs run in the Windows
operating environment, most of the function calls of the Windows API are
available to the program, and there is a good chance that you can do what you
want.
WINAPI.TXT is a file that declares the functions for the Windows API. You have
to get that file from Microsoft. (It is on CompuServe in the MSLANG
libraries.) To use the file, you must determine which API functions you need
and put those declarations into GLOBAL.BAS. With the API declarations in
GLOBAL.BAS, you can call the Windows API functions from your Basic code by
using the subroutine calling conventions.


Add-ons


Visual Basic does not have everything. For example, it does not support the
Multiple Document Interface or multiple-line selection list boxes. You'll need
to look for add-on libraries of Visual Basic tools from other vendors if you
need these features. As programmers dream up things they'd like to have,
somebody builds the add-ons. You can add custom DLLs, controls, and reusable
forms to your application. There are several third-party vendors providing
such add-ons, and the list of vendors and products is growing. There are
already Visual Basic add-ons for database engines, communications drivers, and
interfaces to network APIs.
Not every application will use only the small set of controls that Visual
Basic includes. Some vendors sell add-on tools that implement new controls.
Visual Basic accepts .VBX files that describe a control with its format and
properties. When you add such a file to an application, Visual Basic
integrates the control's icon into the Toolbox window, and you can add the
control to the forms in your application. You can also incorporate other
Visual Basic source files into your programs. For example, code for the
Windows standard File Open and Save dialog boxes is available.


Products Mentioned


Visual Basic
Microsoft Corporation
One Microsoft Way
Redmond, WA 98052-6399
206-882-8080
$199
Minimum requirements: 80286, DOS 3.1, Windows 3.0, 1 Mbyte RAM, CGA display,
hard disk, and a Microsoft-compatible mouse


The Visual Change of Life?


Programmers and reviewers have been rhapsodizing for several months about
Visual Basic, suggesting that it has changed their lives forever. To a certain
extent -- not including forever -- this can be true. I cannot promise what it
will do for you, but I can offer some guesses from my own experience. For one
thing, I spend more time in Windows than ever before. And, for the first time
in years, I am writing programs in something other than C. But these are
personal experiences. How will Visual Basic change the world of programming as
a whole?
First, Visual Basic will do more toward legitimatizing Windows as the dominant
GUI -- perhaps the dominant UI -- than anything else Microsoft has tried,
simply because it hurls open the doors for the development of Windows
programs. It would seem that Visual Basic is as much a Microsoft marketing
ploy to entrench Windows as it is to cut another notch in the language market.
No matter, all to the good. There will be more public domain, shareware, and
commercial software for Windows than ever before -- not only developer's
add-on tools, but applications, too. The increase in Windows applications will
increase the acceptance of Windows by users, which will create a demand for
more applications, thereby creating opportunities for Windows programmers.
Second, Visual Basic does for Windows programming what dBase did for DOS
applications development -- it puts the technology within easy reach of more
programmers. You do not have to be a technical wizard to write Windows
programs any more. You might hear that Visual Basic is everyman's programming
tool. Not so. You still need to know how to write a program. Nonetheless, the
Windows priesthood is, it seems, finally where it belongs -- snapping at the
heels of the snail darter, the eight-inch floppy disk, and the Communist
Party. There will be more Windows programmers now than ever before.
Third, the success of Visual Basic -- which I predict will be substantial --
redefines how programming is done. There will be an outpouring of similar
visual language products with other languages and for other platforms, and
that is good for all of us.


December, 1991
GRAPHICAL DATA VISUALIZATION


Implementing an interactive GUI




Marian G. Williams and Peter D. Varhol


Peter is a freelance writer and assistant professor of computer science and
mathematics at Rivier College in New Hampshire. Marian is a research scientist
at the Center for Productivity Enhancement at the University of Massachusetts
at Lowell. They can be contacted through DDJ.


As professionals trained in the traditional numerical methods of data
analysis, we've come to be impressed with the potential of various data
visualization techniques in filling in some of the blind spots inherent in
these methods. In particular, numerical methods are notoriously poor at taking
highly multivariate data and identifying interesting relationships worthy of
further investigation.
This area, known generically as exploratory data analysis, is important when
theories have not yet caught up with experimentation, or when large numbers of
interrelationships between variables may confound straightforward predictions.
For example, it is well known that there is a relationship between smoking and
lung disease, but the relationship is complicated by other aspects of a
person's lifestyle, socio-economic standing, other health considerations, and
even ethnic and racial background. Determining which factors affect which
others can be difficult unless the researcher knows exactly what to look for.
Furthermore, few software applications do this type of data analysis well. We
would like to discuss both a technique for exploratory data analysis using
visualization and an applications development environment that we found useful
in prototyping it.


Taking the Solution at Face Value


A unique method of performing exploratory data analysis was introduced in 1973
by a Stanford University statistician named Herman Chernoff. His idea was
conceptually simple but very sensible. He proposed a straightforward
caricature of the human face as an innovative tool for the visualization of
multivariate data -- long before the term "data visualization" came into
common use. The face itself represents different aspects of the data. In
practice, it looks much like the so-called "happy face" that has adorned
posters for years.
Chernoff's goal, and the goal of data visualization in general, is to develop
a graphical method of representing data points in n-dimensional space that
allows an observer to easily recognize strong and complex relationships in the
data. The observer can then perform statistical analyses to study the
relationships.
Chernoff's premise is that a person spends years learning how to read other
people's facial expressions. Why not make use of this highly developed skill
for data visualization? Use one face to represent each data point in
n-dimensional space, and let the facial features be determined by the fields
of that data point. Let one field determine the curve of the smiling or
frowning mouth, another determine the size of the pupils, and so forth. Then
arrange all of the faces in a display. A person can spot regularities and
peculiarities in the data by recognizing similarities and differences among
the faces.
Chernoff's faces are potentially useful for cluster analysis and for detecting
changes in time series. Chernoff used faces for cluster analysis of Eocene
Yellow Limestone fossils and for discrimination analysis of mineral data from
core drillings. In the fossil example, each fossil was represented by a face.
Chernoff showed how the fossils could be sorted by grouping the faces
according to similarity of expression. There may be a measure of subjectivity
involved, however, as a colleague of Chernoff's could not entirely replicate
Chernoff's own groupings. In the core drilling example, a sequence of faces
represented data from specimens taken sequentially along a core. Changes in
expression indicated where critical changes occurred in the core.
The usefulness of a display of faces depends very much on which data field is
matched with which facial feature, yet there is no algorithm for doing the
matching. In a recent article about the pace of life in various American
cities, the editors of American Scientist used Chernoff's faces to represent
the author's data. They deliberately mapped the data fields onto the facial
features in such a way that the cities with a slow pace of life (Los Angeles,
for instance) generated smiling, relaxed expressions, while cities with a fast
pace of life (such as Boston and New York) were represented by scowling faces.
A different mapping of data fields onto facial features could have resulted in
a less meaningful representation of the data.
Because the right way, by which we mean the most informative way, to match
data fields to facial features may not be known in advance of drawing the
faces, software for generating displays of faces needs to have an interface
that lets the observer experiment interactively. This was one of the goals of
our graphical prototype, discussed shortly.
Chernoff says of his faces, "This approach is an amusing reversal of a common
one in artificial intelligence. Instead of using machines to discriminate
between human faces by reducing them to numbers, we discriminate between
numbers by using the machine to do the brute labor of drawing faces and
leaving the intelligence to the humans, who are still more flexible and
clever."
There are a number of visualization research projects that exploit basic
perceptual capabilities in ways similar to Chernoff's ideas. The Exvis project
at the University of Massachusetts at Lowell (formerly the University of
Lowell) is one such project. The researchers are studying the use of texture
perception, sound perception, and color perception in creating representations
that help scientists to locate the important information in their data.
Instead of faces, they use small stick figures that have graphical, sound, and
color attributes.


A Graphical Implementation of Chernoff's Faces


While the concept seems difficult to implement using traditional software
development tools, recent advances in object-oriented technologies have made
the development of highly graphical user interfaces far easier than in the
past. In particular, when screen objects have a corresponding object
representation in the development system, implementing an interactive
graphical interface is conceptually clean.
We used VZ Programmer for Windows and OS/2 to implement a rapid prototype of a
data visualization tool based on Chernoff's faces. VZ Programmer lets you use
a graphical object editor to draw a picture of the user interface of your
application. Each component of the picture is an instance of an object class
in the VZ class library, and has certain characteristic data and behaviors
associated with it. Furthermore, through VZ's implementation of a C
interpreter/incremental compiler, the behaviors of instances can be modified
through the introduction of new methods and data attributes.
We chose to represent a single data object as an individual Chernoff face.
Six different characteristics of the face and its position were used to
represent up to six different variables, or measurements, from each data
source. Each measurement is represented on a Cartesian plane by one of the
following characteristics: position on the X-axis, position on the Y-axis,
length of mouth, position of mouth, size of eyes, and distance between eyes.
The initial interface in VZ Programmer is highly graphical, and uses a file
cabinet metaphor. Clicking on a drawer of the file cabinet displays the files
in that drawer as file folders. Opening one of these file folders
takes the programmer into the development environment, which centers around
the object editor.
VZ Programmer provides the object editor for drawing the user interface. This
editor includes a tools palette, which lets you draw virtually any shape on
the screen, including lines, circles, controls, and scroll bars. Each of these
shapes is a VZ object and can have methods and data associated with it to
define its behavior.
Attributes are used to store methods and data within an object. Data
attributes simply have a name and an untyped value. Methods include a script
of C++ code, such as that shown in Example 1. For example, for a button to
open a window, a method has to be associated with that button to accomplish
that task.
Example 1: (a) Entering and error checking a data value; (b) using a menu to
call up a new data analysis window

 (a)

 EndEdit( void )
 {
     int index = int(textString);
     int newValue = this=>(textString);

     Erase(0);
     if (newValue >= 0) {
         // Accept: store the value and redisplay it.
         chartData[index] = newValue;
         this=>(textString) = newValue;
     }
     else {
         // Reject: restore the previous value and warn the user.
         this=>(textString) = chartData[index];
         NoticeBox("Range Check Error",
                   "Data must be larger than zero.", 0);
     }
     Draw(0);
 }

 (b)

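 MenuCommand( short id ) {
     string pageName;
     VZ_PAGEWINDOW** window;
     VZ_PAGE* page;

     // Map the menu item to a page name and the slot that holds its window.
     switch(id) {
     case 100:
         pageName = "faces 1";
         window = &mdiChild1;
         break;
     case 200:
         pageName = "faces 2";
         window = &mdiChild2;
         break;
     case 300:
         pageName = "faces 3";
         window = &mdiChild3;
         break;
     default:
         return;    // unrecognized id: window was never set, so do nothing
     }

     // Create and show the window only if it is not already open.
     if (!(*window) && (page = FindPage(pageName))) {
         *window = new VZ_PAGEWINDOW (page);
         if (*window) {
             (*window)->Show();
         }
     }
 }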

The first consideration in our application was data entry. While the most
convenient method, the spreadsheet metaphor, is not normally thought of as
object oriented, VZ Programmer provides two different types of text-field
object classes that can be readily adapted to the task. Therefore, setting up
a limited-functionality spreadsheet interface took only a few minutes. The
columns corresponded to variables within a single data object, while the rows
represented new data objects. Each field had a variable attribute to intercept
the entered value.
We then created a library of faces using the object editor. The library
consisted of a faces object class, with subclasses representing each of the
different components of the face used. All our faces were therefore
predefined in these classes, so that displaying a face was simply a matter of
creating a new instance of the appropriate class or classes.
Ideally, the application would draw each face from scratch at runtime, using
the input data provided. However, in addition to simplifying the
implementation, using predefined components recognizes that there are
limitations to both the resolution of the screen and the ability of the data
analyst to discern differences in similar values. We believe that we did not
lose a great deal of physical or perceptual resolution by grouping similar
values together into the same graphical representation.
New classes are created from within the object editor, by selecting the Edit
Class item from the Utilities menu. After creating the new class, we defined
the appearance of the faces classes by drawing them from within the object
editor. By working with class attributes, we also set the global behaviors
associated with these classes.


Putting the Pieces Together


The application came together in the following way. Each value associated with
a data object in the input spreadsheet was normalized so that it represented a
value between 0 and 1. Each subclass of the faces class had a corresponding
range of values between 0 and 1. The components for the faces were selected
out of the class libraries according to this value, and the resulting
instances were assembled on the screen.
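The normalization and lookup just described can be sketched in standard C++ (not VZ's dialect). This is only a minimal sketch under stated assumptions: we assume min-max scaling as the normalization and equal-width ranges for the component subclasses, neither of which is specified above, and the names normalize and variantFor are hypothetical rather than part of VZ Programmer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Min-max scale one spreadsheet column into [0, 1].
// Assumption: min-max scaling stands in for the unspecified normalization.
std::vector<double> normalize(const std::vector<double>& column)
{
    std::vector<double> out(column.size(), 0.0);
    if (column.empty()) return out;
    auto mm = std::minmax_element(column.begin(), column.end());
    const double lo = *mm.first;
    const double range = *mm.second - lo;
    if (range == 0.0) return out;          // all values equal: map them to 0
    for (std::size_t i = 0; i < column.size(); ++i)
        out[i] = (column[i] - lo) / range;
    return out;
}

// Pick which of `variants` predrawn components covers a normalized value.
// Equal-width ranges are an assumption; 1.0 falls in the last range.
std::size_t variantFor(double normalized, std::size_t variants)
{
    assert(variants > 0);
    const double clamped = std::min(1.0, std::max(0.0, normalized));
    const std::size_t index = static_cast<std::size_t>(clamped * variants);
    return std::min(index, variants - 1);
}
```

With four mouth variants, for instance, a normalized value of 0.5 would select the third variant (index 2). Grouping values this way is what lets a small set of predrawn components stand in for a continuous range.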
Furthermore, the user can select the facial component used to represent a
particular variable. We used a dialog box for the user to name both the
variable and the corresponding component class. Both menus and dialog boxes
are created within the VZ object editor, using the predefined control classes
from the palette. Of course, both also need the appropriate code to enable
menu and dialog selections to accomplish the necessary actions.
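The user-selectable pairing of variables and facial components can be pictured as a small lookup table. The following is a hedged sketch in standard C++, not the VZ code we wrote: the Feature enumeration simply lists the six characteristics named earlier, while the Face structure, the buildFace function, and the table layout are hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// The six characteristics used to represent variables.
enum Feature { PositionX, PositionY, MouthLength, MouthPosition,
               EyeSize, EyeDistance, FeatureCount };

// One face: a normalized value in [0, 1] for each characteristic.
struct Face {
    std::array<double, FeatureCount> value{};
};

// mapping[i] names the feature that column i of a data row drives.
// It plays the role of the dialog box's choices (hypothetical layout).
Face buildFace(const std::vector<double>& row,
               const std::vector<Feature>& mapping)
{
    assert(row.size() == mapping.size());
    Face face;                       // features left unmapped stay at 0
    for (std::size_t i = 0; i < row.size(); ++i)
        face.value[mapping[i]] = row[i];
    return face;
}
```

Because the mapping is data rather than code, the analyst can swap which variable drives the mouth and which drives the eyes and redraw without touching the drawing logic.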
The result is that the user brings up the spreadsheet with a menu selection in
order to enter data, then brings up the dialog box to associate each variable
with a facial component. Completing this step automatically normalizes the
data and matches the normalized data with the appropriate
facial components. Then the application opens up a new window that displays
Chernoff faces on a coordinate plane. Each face comes from the same class
library and takes up the same amount of space. The facial components come from
separate libraries, and are added to the facial objects. Figure 1, for
instance, shows an example Chernoff faces display that contains demographic
data from the most recent US Census. The faces represent 49 different cities
and towns in Massachusetts. The X- and Y-axes position them according to their
relative geographic location (X, north and south; Y, east and west). The smile
represents population growth over the last ten years: the larger the
smile, the greater the relative growth. The space between the eyes represents
growth in per capita income over the same period, with more income depicted by
more space.
Is this a useful way of representing data for the purpose of discovering new
relationships between variables? It almost seems too tongue-in-cheek to be of
real use, but it may inspire data analysts to look at other nonconventional
ways of visualizing data. While the concept itself is intuitively appealing,
it's difficult to tell at this time if the approach has value in exploratory
data analysis. With graphical prototype building tools such as VZ Programmer,
examining other approaches can be done quickly and easily.


How VZ Programmer Contributed to the Task


VZ Corporation refers to their development language as C++. That is something
of a misnomer, because it is not completely compliant with either ANSI C or
AT&T C++ 2.0. In both cases there is a workable subset of the language.
However, even under the best of circumstances, it is unlikely that a
significant amount of code can be ported into or out of the environment. The
problem is that the code is used to supplement the object class library,
especially for graphical screen objects. No standard C++ class library exists,
so true code compatibility is out of the question. This is likely to be the
case in any object-oriented C++ development efforts until that standard class
library does exist.
As a result of its class library, a VZ Programmer executable is not a
stand-alone application. It must be executed either in the VZ Programmer
development environment, or along with a runtime module that contains
essential class bindings. The methods in a VZ executable file are actually
compiled C code, with hooks for predefined screen objects defined in the
runtime module. The executable file also contains the complete definitions of
any user-defined classes, such as our faces classes.
Versions of VZ Programmer run on both Windows and OS/2 Presentation Manager.
In our tests, compatibility between the two versions appears to be very good.
All of the code and graphical objects developed under the Windows version ran
without modification on the OS/2 version. We worked in MS Windows using
version 2.0 of VZ Programmer. In this environment, it requires Windows running
in standard or enhanced mode, and 2 Mbytes of memory. It takes up about 1.5
Mbytes of disk space.
Without a graphical development tool such as VZ Programmer, this project would
have required months of dedicated coding, and we probably would not have
undertaken the effort at all. The application described took a bit more than a
month of part-time effort, a portion of which was spent learning VZ Programmer
itself. While the resulting prototype was satisfactory for experimental uses,
it may not be everyone's idea of a general distribution product. There is
still resistance from some developers to bundling a runtime module with an
application, but as object-oriented development with predefined class
libraries continues to grow in popularity, the need to include class bindings
as a runtime component will grow.



References


Chernoff, Herman. "The Use of Faces to Represent Points in n-Dimensional Space
Graphically." Journal of the American Statistical Association (June 1973).
Levine, Robert V. "The Pace of Life." American Scientist (Sept./Oct. 1990).
Williams, Marian G., Stuart Smith, and Giampiero Pecelli. "Experimentally
Driven Visual Language Design: Texture Perception Experiments for Iconographic
Displays." Proceedings of the 1989 IEEE International Workshop on Visual
Languages (Oct. 1989).


Products Mentioned


VZ Programmer
VZ Corporation
57 West South Temple Street
Salt Lake City, UT 84101
800-627-8851 or 801-595-1352




December, 1991
PROGRAMMING PARADIGMS


A Conversation with Robert Carr Part II




Michael Swaine


In last month's column, Robert Carr and I discussed the design of PenPoint, GO
Corporation's 32-bit, object-oriented, multitasking operating system. This
month, we pick up where we left off, focusing on the PenPoint Notebook User
Interface (NUI) and imaging model.
DDJ: There are three aspects of your current work that I'd like to ask you
about. There are the opportunities for developers that PenPoint offers.
There's the operating system itself. And then there's this new paradigm of
using computers, pen-based computing. Of course, a new paradigm translates
into new opportunities, but it's probably useful to step back and view it
strictly as a paradigm and think about the fundamental differences it
represents, such as what it means to use a computer without a cursor, before
even trying to think about what this means in terms of markets.
RC: It certainly is a paradigm shift. Part of the shift has to do with the use
of the pen, but some other parts have to do with other elements. With regard
to the pen, you mentioned one of the good points, which is that the pen is a
cursorless device. That's wonderful. Cursors are an artificial concept and
users indeed have got to go through some learning and some motor skill
development in order to manipulate them. Nevertheless, it takes some real work
on the software side to overcome some of the disadvantages that come from
throwing the cursor away.
DDJ: Such as?
RC: One of the advantages of the cursor is that it is an accurate pointing
device, because as you are positioning it, it is giving you feedback as to
exactly where you are pointing it, whereas with a pen, as you come close to
the screen you still don't know where the software thinks you're pointing. It
turns out that you can overcome nearly all of that pointing resolution loss,
if you will, by putting more intelligence into your software. But that's just
one of many examples of needing the right software for the pen versus for the
cursor.
DDJ: So how do you do that? How do you solve this particular problem of
pointing resolution loss?
RC: Intelligent targeting algorithms. "The user's trying to point to
something. Let me look around and see what's close to this pixel that they
probably wanted. The command they just drew is a command that always acts on
words instead of characters; let me look for the word that's closest to here
and not just the character that's closest." There are a lot of targeting
heuristics that you can put in the user interface, and it can take a lot of
testing to get those right. So loss of the cursor is part of the paradigm
shift. Another part is the opportunity to support gestures, to invent a whole
new way of controlling software. We found that gestures work very well with
users because a well-designed gesture set tends to have some amount of
intuitive obviousness to users.
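As an illustration only, and not PenPoint code, the targeting idea Carr describes can be sketched in standard C++ as a nearest-target search. Everything here is an assumption made for the sketch: the Rect structure, the function names, and the use of squared pixel distance as the measure of "closest."

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Rect { int left, top, right, bottom; };   // pixel bounds of a target

// Squared distance from a point to a rectangle (0 if the point is inside).
long distanceSquared(const Rect& r, int x, int y)
{
    const long dx = x < r.left ? r.left - x : (x > r.right ? x - r.right : 0);
    const long dy = y < r.top ? r.top - y : (y > r.bottom ? y - r.bottom : 0);
    return dx * dx + dy * dy;
}

// Return the index of the target nearest the pen-down point.
// The caller passes word bounds for a gesture that acts on words and
// character bounds for one that acts on characters; that choice of
// candidate set is where the heuristic lives.
std::size_t nearestTarget(const std::vector<Rect>& targets, int x, int y)
{
    assert(!targets.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < targets.size(); ++i)
        if (distanceSquared(targets[i], x, y) <
            distanceSquared(targets[best], x, y))
            best = i;
    return best;
}
```

A pen that lands in the gap between two words then resolves to whichever word is nearer, rather than to nothing at all.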
DDJ: Still, it's an awfully open-ended problem, isn't it? You're in the
position of having to build a language that people will take to naturally, and
there isn't anything universal to build on. Or is there?
RC: There is no standardized mark-up language in our society, and in fact most
lay people have never learned any mark-up language. So you need to design a
gesture set more to harmonize with their "collective unconscious" with regard
to what mark-up commands to use. We found that users can learn a well-designed
gesture set very quickly, both because it has strong mnemonic value visually
and also because, as you draw gestures out with your fingers, what I call
finger memory gets called into play.
DDJ: Motor memory.
RC: Right. And of course motor memory is a very real memory. So gestures are
quickly learned, but they're also very efficient, since they collapse the
two-step of selecting and then acting down into a single step.
DDJ: I hadn't thought of that.
RC: That's the technical reason why users also experience the "funness," or
immediateness. Which brings me to the other major aspect of the paradigm
shift. Good user interfaces have the attribute of transparency: The user tends
to forget that there's a user interface mediating between them and the thing
they're working on. It's extremely difficult for desktop computers, or even
laptop computers, to ever have a very high degree of immediacy because of a
variety of reasons. First of all, the mouse and the keyboard are
remote-control devices: You're doing work down here to work on the screen up
there. But then also the very nature of the device is that it tends to control
you. You have to come to the device, you have to sit in a certain requisite
position, you have to hold your hands out in front of you: It's dictating a
whole lot of your behavior patterns. The fact is that we've all accepted this
and we never think about it, but I believe at something of an unconscious
level we tend to...resent is probably too strong of a word, but we notice the
friction and the costs that come from having to come to the computer. "Oh,
gee; I have to do some work on my computer." What always goes through our
minds is, "...and therefore I must go up to my home office and sit at the
desk. I can't do it here on the couch in front of the Saturday afternoon
football game," even though, perhaps from a concentration point of view that
would be just fine. So the computer forces us to come to it on its terms.
DDJ: Which is an area in which pen-based notebook computers have the
advantage: You can take them with you. But if I understand you right, you're
arguing that this advantage is more than just a matter of convenience, that
it's a genuine paradigm shift in computer use. How is it a paradigm shift?
RC: With pen-based computers, in the mobile market at least, with devices that
are physically rather small -- tablet-sized or smaller -- all of a sudden that
equation has shifted for the first time. The human is in control again. The
human can use the device wherever they like, in almost any posture that they
like, and they can wave the device around, they can look down at it. And I
think that that's the other half of this paradigm shift. There's the pen half,
but there's also this physical relationship half, in which now you are
dictating how you use the device, and it's much more inert and you are the
active agent. Does that make sense?
DDJ: It does. I was wondering. A keyboard is obviously a discrete input
device. There are only so many keys. Do you have a consciousness when using a
pen-based system of it being more continuous? When you draw a circle around
something do you have more of a sense of moving in a continuum?
RC: No, you don't think so much "circle," you think "edit," since that's the
operation you're doing. And that's the transparency that starts occurring. A
part of this transparency of gestures is that we found with our gesture set
that we were able to oftentimes arrive at gestures that made -- I'll call it
physical common sense. Namely, that the direct motor actions that you're
taking with your pen tip had some kind of an analog to the semantics of the
operation that you're making.
DDJ: You'll have to give me an example.
RC: The most obvious case is our scroll gestures, in which, if you want to
scroll, you actually shove the window contents up and down, left and right,
with the pen tip. Another good example would be that our gestures for
inserting spaces and carriage returns actually tend to end with a movement in
the direction that you're opening the space up in. That's part of this
mnemonic value that makes them quick to learn, but it also contributes to the
transparency in which you actually start thinking, "I'm opening up space
here," or "I'm cutting this object in half."
DDJ: You also wrote your own imaging system for PenPoint. Tell me about
ImagePoint.
RC: In the graphics area, like many areas, PenPoint is set up to be highly
configurable, so in fact we can support multiple graphics subsystems being
installed; but ImagePoint is the one that we've developed, and it's installed
by default with PenPoint. And currently all of PenPoint's user interface
elements use it, so PenPoint applications are typically all using ImagePoint,
at least the early ones. But it is conceivable that we could install a Display
Postscript graphics subsystem and folks who wanted to use that could then talk
to that.
DDJ: Is ImagePoint Postscript-like?
RC: In its architecture it is most similar to Display Postscript, but it's
much lighter weight. Which is the reason we didn't use Postscript itself.
ImagePoint runs in less than 200K of memory and displays pretty good
performance on the 16-MHz 286s that we were showing you last winter. Like
Postscript it unifies text as a graphics primitive along with all other
graphics primitives, and all graphics primitives are translatable, rotatable,
and scalable, which makes it very straightforward for applications to produce
graphically rich user interfaces that have a rich mixture of text. The text in
ImagePoint is based on outline font technology, so you use a minimum amount of
memory to store the fonts, but we can display them on demand at any point
size.
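To make "translatable, rotatable, and scalable" concrete, here is a generic sketch in standard C++ of a 2-D affine transform applied to the control points of an outline. It is not the ImagePoint API, which is not shown in this conversation; the Point and Transform types and the transformOutline function are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point { double x, y; };

// A 2-D affine transform: x' = a*x + c*y + e,  y' = b*x + d*y + f.
struct Transform {
    double a, b, c, d, e, f;

    static Transform scale(double s) {
        assert(s != 0.0);            // a zero scale would collapse the outline
        return {s, 0, 0, s, 0, 0};
    }
    static Transform translate(double tx, double ty) {
        return {1, 0, 0, 1, tx, ty};
    }
    static Transform rotate(double radians) {
        const double co = std::cos(radians), si = std::sin(radians);
        return {co, si, -si, co, 0, 0};
    }
    Point apply(Point p) const {
        return {a * p.x + c * p.y + e, b * p.x + d * p.y + f};
    }
};

// Transform every control point of an outline. One stored outline can then
// be produced at any size, position, or angle on demand.
std::vector<Point> transformOutline(const std::vector<Point>& outline,
                                    const Transform& t)
{
    std::vector<Point> out;
    out.reserve(outline.size());
    for (const Point& p : outline) out.push_back(t.apply(p));
    return out;
}
```

The memory saving Carr mentions follows from this: only the outline is stored, and each point size is just a different scale factor applied when the text is drawn.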
DDJ: How difficult would it be to put Display Postscript on a PenPoint-running
machine?
RC: It's analogous to a lot of porting. Since PenPoint supports ANSI standard
C, it's typically very straightforward to port what I'll call engine code;
code that mostly talks to itself. Most applications come in two halves: the
engine half and the UI half. The UI half makes numerous calls into the
underlying operating system to present its user interface, and user interfaces
are very difficult to port across different operating systems. Engines,
however, by design, tend to be extremely portable. This is one of the reasons
why you'll find a lot of applications coming over quite readily. Not because
they try to port the user interface, because that actually needs to be
rewritten and rethought for the pen. And that's true even if you work in an
existing OS: You'll find that with Microsoft Windows for Pen; increasingly,
Microsoft is not only admitting but arguing that, yes, Windows applications need
to be rewritten for the pen. So once you're rewriting the UI for the pen, and
if the engine is easily portable, shouldn't you be coming over to an OS that
from the ground up was designed for mobility and for the pen? That's our basic
argument.
DDJ: So to port Display Postscript...
RC: Postscript would be ported over to PenPoint as a new instance of our
imaging class, and it would subclass it so it would respond to its Postscript
messages, whereas our ImagePoint -- technically it's actually called "class
Sys-Graf"--responds to its imaging model. Supporting Postscript would be
pretty straightforward; what would be difficult is to get PenPoint code in
applications that are currently talking to ImagePoint to convert over to
talking to Postscript. Although we're architecturally similar to Postscript,
we certainly are not API-compatible and we're not intended to be. Otherwise we
would have licensed Postscript.
DDJ: But if someone were developing a dedicated machine and they wanted
Postscript and they were responsible for their own applications--
RC: Yep.
DDJ: What opportunities do you see for software developers that fall out of
the pen market?
RC: I think there are a wide variety of opportunities for developers in the
pen market. I think the analogies with the PC market are very close. Looking
back, I think that many people see that if they had been first or second in a
given application category on the PC or Macintosh they could have done pretty
well, and now that they're facing being the 27th entry in the category, it's
going to be very difficult for them to come up with a product where they could
compete or find a publisher to work with. The salient thing with pen computing
is it's a brand-new market and it'll be quite a few years before it's "too
late" to really be out there competing with other software companies. That's
one point: It's a new market, and therefore you can compete on something of a
level field with many of the established software companies. The second point
is that, because of the embedding architecture of PenPoint, and because it's
object oriented, it allows, in fact almost forces, applications to be smaller
and more focused and not to be these large monoliths that do 27 things, which
today's spreadsheets and word processors and presentation graphics packages
are.
DDJ: We seem to be at a stage where every application needs to be a Framework
-- an integrated software package.
RC: They're drawing programs, they're typesetting programs, they're
spreadsheets, they're graphics programs; they don't call them integrated
software, but they are. No small software developer or company can compete
with a Lotus or Microsoft on that playing field. Well, in PenPoint, because
the OS does the integration for you, all of a sudden the status quo, the
expected norm, is that your application does one thing and does it well. And
all of a sudden a small development team can compete against, and will always
be able to compete with, a large production organization.
DDJ: And a lot of these machines are going to be small machines, with limited
resources.
RC: Right. So well-designed and well-crafted software will be highly valued.
So the first resource is a new market, where most categories are still open,
so there are good opportunities. Secondly, applications tend to be smaller and
more focused. The third thing is that many application categories are waiting
to be invented. There's both the opportunity and need, and also the
satisfaction of tremendous creativity. We believe that most of the
best-selling applications five years from now in pen computing are still to be
invented. We have some ideas about what those are, but our basic faith is that
it's the creativity of the application developer that'll drive the market
growth, and also the invention of these new application categories.
DDJ: Do you believe in the concept of the Killer App?
RC: No, no one killer app. I don't think we'll ever again see something as
striking as VisiCalc on the Apple II in terms of being so clearly identifiable
as an application that helped to birth an industry or marketplace. We do have
a killer data type in mind, which is ink. It's become real clear over the last
couple of years that most pen-based applications will and should support ink
as a data type, just as today they support, perhaps, ASCII text and floating
point as data types that they manipulate in various fields. So you'll see ink
markup layers, acetate markup layers, ink annotations, ink Post-It notes,
entire ink editors, which you might think of as a note-taker or something,
where you are never really translating ink, but perhaps you are reformatting
it and editing it.
DDJ: You'd better tell me what ink is.
RC: Oh, yeah; ink is really just the path that the pen followed, captured from
a digitizer and stored as a sequence of strokes or polylines or curves,
however the software wants to represent it. That data structure, that sequence
of strokes, can then be displayed on the screen or printed out, ultimately.
But like an object-oriented drawing program, an ink editor would let the user
edit these strokes on the screen, perhaps cut them in half, delete some of
them, maybe rescale some of them, move them around. So if you take a page of
notes and then you want to go back to the middle of the page and add some more
thoughts, on paper of course you'd have to write in the margin and draw a
line. With an ink editor you'd simply give a quick gesture, a flick of the pen
tip, and that would open up some white space, because it would have shoved all
the strokes below the middle of the page further down. And you can start
seeing some real benefits to adding the computer to your handwritten notes.
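The "open up some white space" operation described above can be sketched in C.
This is only an illustration and not PenPoint code: the InkPoint and InkStroke
types and the ink_open_space function are hypothetical names, and the sketch
assumes strokes are stored as simple polylines with y increasing down the page.

```c
#include <stddef.h>

/* Hypothetical ink representation: a stroke is a polyline of pen samples. */
typedef struct { int x, y; } InkPoint;

typedef struct {
    InkPoint *pts;   /* samples captured from the digitizer */
    size_t    npts;  /* number of samples in this stroke */
} InkStroke;

/* Topmost y coordinate of a non-empty stroke (y grows downward here). */
static int stroke_top(const InkStroke *s)
{
    int top = s->pts[0].y;
    size_t i;
    for (i = 1; i < s->npts; i++)
        if (s->pts[i].y < top)
            top = s->pts[i].y;
    return top;
}

/* Open `gap` units of white space at height `at_y`: every stroke whose
   top is at or below at_y is shoved down; strokes above it stay put.
   Returns the number of strokes moved. */
size_t ink_open_space(InkStroke *strokes, size_t n, int at_y, int gap)
{
    size_t i, j, moved = 0;
    for (i = 0; i < n; i++) {
        if (strokes[i].npts == 0 || stroke_top(&strokes[i]) < at_y)
            continue;
        for (j = 0; j < strokes[i].npts; j++)
            strokes[i].pts[j].y += gap;
        moved++;
    }
    return moved;
}
```

Nothing in the sketch interprets the strokes as letters; it only moves marks,
which is the distinction drawn in the exchange that follows.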
DDJ: So an ink editor wouldn't interpret the strokes, wouldn't treat them like
letters the way a word processor or text editor would, but would just let you
manipulate the marks as marks.
RC: Jerry Kaplan makes a good analogy. He points out that as the word
processor was to the typewriter, in terms of letting us reformat typeset text,
ink editors are to ink on paper. An ink editor or note-taker lets you
reformat, edit, open up your handwritten notes or diagrams. That's an easy
example of a new application category that does not exist on the desktop
PCs--but will be a hotbed of innovation.


The PenPoint UI



Roland Alden and Tony Hoeber
Roland is one of the principal implementors of PenPoint. Tony is the Notebook
UI architect. They can be contacted at GO, 950 Tower Lane, Suite 1400, Foster
City, CA 94404.
The current generation of GUIs was designed to be used with a keyboard, mouse,
and desk-bound workstation. In designing PenPoint we assumed that the primary
input device is a pen, and the hardware form factor is a notebook or
pocket-sized pad. Starting from these premises made a world of difference. In
PenPoint the organizing metaphor is the notebook instead of the desktop, and
the interaction style is gestural as well as graphical. The interface is also
inherently scalable to accommodate different display sizes and resolutions.
Everyone is familiar with real-world notebooks, forms, sheets, pads, tabs,
bookmarks, sticky notes, and so on. The elements of the PenPoint Notebook User
Interface (NUI) all make sense within the context of these familiar objects.
The user of a PenPoint machine begins by seeing a notebook on the screen that
has a table of contents page. Instead of launching an application, the user
simply turns to the desired page in the notebook. Instead of window-oriented
control panels, data entry areas and radio buttons, the user deals with option
sheets, writing pads, and checklists.
The notebook is a very flexible organizing model. Users can temporarily
"unsnap" pages from the notebook to view multiple documents at once, or chunk
their data into multiple notebooks. PenPoint also allows users to create
compound documents by embedding one document within another, and to create
hypertext-style buttons allowing quick navigation from one document to
another.
One of the strengths of the pen as an input device is that it allows the user
to indicate both the operand and the operation with a single gesture. This is
often easier than first selecting the object, then locating the command on a
pulldown menu. Gestures are thoroughly integrated into all aspects of the
system. There is a core set of 11 gestures that work the same across all
applications. These core functions include delete, insert, move/copy, edit,
scroll, and so on. An example is the "flick" gesture -- a short line moving
up, down, left, or right. Its primary function is to scroll, by moving the
line of text being flicked to the top or bottom of the page. Flicks work in
contexts other than scrolling. Flicking left or right on the title line of the
notebook turns to the next or previous page; flicking up or down on the title
line of a floating notebook zooms or unzooms; flicking on overlapping notebook
tabs brings obscured tabs into view, and so on. In each case the same user
model is maintained: to shove the object and bring more information into view.
Note that PenPoint doesn't force gestures on the user. The system always
offers a dual command path that allows either a gesture or a selection from a
control to precipitate an action.
The PenPoint UI is completely scalable. There are a number of reasons for
this. Because PenPoint-based computers will come in a wide variety of form
factors and display types, the amount of text in a UI component that can be
displayed without scrolling can vary widely. In addition, everyone has
different preferences for viewing text on a screen, concerning both size and
font style. These personal decisions can change from one time or place to the
next. For instance, someone who normally reads menu text at 9 points may
prefer to double the size to 18 points when trying to read the computer
display in a moving vehicle, or walking from a bright room to a dimly lit one.
Everyone will make a different trade-off between font style and font size to
fit constraints of display resolution, lighting, and so on. Because of this,
the user of PenPoint can control both the size and style of the "system" font
used by most user-interface components to display text.
Further, different parts of the UI may need to be displayed in different
formats. For instance, at low resolutions, Asian letterforms (Japanese Kanji
and Korean Hangul) must be displayed at a slightly larger size than Latin
letterforms. PenPoint therefore allows certain UI components to display text
in sizes relative to (smaller or larger than) the reference size chosen by the
user. Even the thickness of separator lines on menus can be relative to the
size of the system font.
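One way to picture font-relative sizing is a size specification that is stored
relative to the system font and resolved to pixels only at run time. The sketch
below is not the PenPoint API; SizeSpec, resolve_size, and the
percent-of-font-height unit are assumptions made purely for illustration.

```c
/* Hypothetical constraint: a size given relative to the system font
   height, with a floor so that thin elements never vanish. */
typedef struct {
    int percent_of_font;  /* 100 = same height as the system font */
    int min_pixels;       /* lower bound applied after scaling */
} SizeSpec;

/* Resolve a relative size against the font height the user has chosen. */
int resolve_size(SizeSpec spec, int font_pixels)
{
    int px = (font_pixels * spec.percent_of_font + 50) / 100; /* rounded */
    return px < spec.min_pixels ? spec.min_pixels : px;
}
```

With a specification such as {8, 1} for a menu separator, the line stays one
pixel thick at small font sizes and thickens only once the user picks a much
larger system font.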
Another dynamically alterable variable is the display proportion: whether
display hardware that is rectangular operates in "landscape" or "portrait"
mode. PenPoint-based computers are small and portable so the user can easily
physically reorient them.
It's beneficial to place these choices in the hands of the user; however, this
can place a heavy burden on the user-interface toolkit, and it requires the
application developer to abandon some ideas about user-interface design.
For instance, user-interface tools such as those found on the Macintosh or MS
Windows platforms treat the user interface as a two-dimensional graphic design
problem. The size and position of UI components as well as their spatial
relationships are decided by the application programmer using tools that
resemble a drawing program. We call this the "pictorial model" of
user-interface construction.
PenPoint replaces this pictorial model with what could be called a layout
model. An application developer describes the relationships between different
UI components in general terms, including constraints regarding size and
position. At runtime the system optimizes the layout to fit the prevailing
display conditions.
The PenPoint layout model operates over trees of windows. Special window
classes are used to provide layout behavior for their child windows. Every UI
component (button, scrollbar, and so on) is a subclass of Class Window
(clsWin) in PenPoint's object-based programming system. The Layout window
classes do not draw anything (except borders); they simply organize child
windows. Application programmers can create new types of Layout windows, and
these can lay out standard component windows. Conversely, standard Layout
windows can be used to lay out custom component windows. PenPoint provides two
standard Layout classes: one that supports tables, and one that supports
arbitrary window arrangements.
Object-oriented programming is the key to building a flexible system such as
this. There are two dimensions to this flexibility. The first, and arguably
the most important, is subclassing. Through subclassing, a programmer can
adapt an existing tool without having to reinvent complex behavior that does
not need to be changed. And sharing as much behavior as possible makes
programs smaller and more reliable, and their user interfaces more consistent.
The other key to flexibility is keeping objects packaged as small units of
functionality and allowing them to be combined into larger aggregate objects.
Happily, this is a service naturally provided by the layout classes.



December, 1991
C PROGRAMMING


D-Flat Text Boxes




Al Stevens


We are into the eighth month of the D-Flat project now. For those of you who
tuned in late, I'll briefly explain the project. D-Flat is a DOS text-mode
library that implements the IBM SAA/CUA user interface with an event-driven,
message-based programming model. CUA is the interface used in Windows,
Presentation Manager, QuickBasic, Turbo products, and many other programs. CUA
is fast becoming the de facto standard for interactive software user
languages. I started the project last year to provide CUA capabilities to DOS
C programmers. My inspiration was Borland's Turbo Vision, which was available
with Turbo Pascal. Borland now offers Turbo Vision with Turbo C++, but there
is still no version for the Borland C compiler. This is because Turbo Vision
is based on the object-oriented classes of Turbo Pascal and C++. There are
other libraries that offer CUA for DOS text-mode programmers. D-Flat is an
alternative that is free with source code, supports several compilers, and has
no restrictions on its use or distribution.


The D-Flat TEXTBOX Class


Last month I described the NORMAL window class, the base class for all others.
It manages the messages common to all windows, dealing with window moving,
sizing, overlapping, and so on. This month we move to the TEXTBOX class, the
next class in the hierarchy, which is a base to several others. A text box is
a window that can display text. As you will see in later installments, list
boxes, menus, edit boxes, and other classes derive from the text box. A window
class that has text in all or part of its client area is likely to be derived
from a text box.
Just as the NORMAL class manages operations common to all windows, the TEXTBOX
class manages the operations on text in text box windows. The class displays,
scrolls, and pages through the text by processing the messages that support
those operations.


Text-box Data Fields


There are a number of members in the window structure for a text box that the
class uses to manage the text. The structure is defined in dflat.h. The text
pointer points to the heap memory where the text is stored. Text in a text box
consists of displayable lines of text, each terminated by a newline character.
The complete body of text is null-terminated. The memory occupied by the text
is taken from the heap. The textlen integer is the length of the text buffer,
not necessarily the length of the text in the buffer, but greater than or
equal to that length. The wlines integer is the number of lines of text in the
buffer. The textwidth integer is the length of the longest line in the buffer
and is, therefore, the effective width of the text file. The wtop integer is
the line number, relative to 0, that is displayed at the top of the window.
This value changes when the user scrolls and pages the text. The wleft integer
is the column number, relative to 0, that is displayed in the left margin of
the window. This value changes when the user scrolls the text horizontally.
The window structure also contains the TextPointers member, which points to
an array of integers. The integers are offsets to the lines of text in the
buffer. This array improves performance in scrolling, paging, and
deleting text. It eliminates the need for the program to constantly scan the
text, looking for the addresses of lines.
There are four integer values in the window structure that specify the line
and column boundaries of a marked block of text, if one exists. These integers
are named BlkBegLine, BlkEndLine, BlkBegCol, and BlkEndCol.
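As a reading aid, the fields described above can be pictured as the simplified
structure below. It is a stand-in for illustration only: the real window
structure in dflat.h has many more members, the field types here are guesses,
and textbox_measure is a hypothetical helper, not a D-Flat function.

```c
#include <stddef.h>

/* Simplified stand-in for the text-related members of the window
   structure; names follow the column, types are assumptions. */
struct textbox_fields {
    char     *text;          /* heap buffer: lines end in '\n', body in '\0' */
    unsigned  textlen;       /* size of the buffer, >= length of the text */
    int       wlines;        /* number of lines in the buffer */
    int       textwidth;     /* length of the longest line */
    int       wtop, wleft;   /* zero-based top line and left column shown */
    int      *TextPointers;  /* offset of each line within text */
    int       BlkBegLine, BlkBegCol;  /* start of the marked block */
    int       BlkEndLine, BlkEndCol;  /* end of the marked block */
};

/* Hypothetical helper: recompute wlines and textwidth by scanning text. */
void textbox_measure(struct textbox_fields *tb)
{
    const char *cp = tb->text;
    int width = 0;
    tb->wlines = 0;
    tb->textwidth = 0;
    if (cp == NULL)
        return;
    for (; *cp != '\0'; cp++) {
        if (*cp == '\n') {
            tb->wlines++;
            if (width > tb->textwidth)
                tb->textwidth = width;
            width = 0;
        } else {
            width++;
        }
    }
}
```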
Listing One, page 144, is textbox.c, the source file that implements the
TEXTBOX class. It begins with a number of functions that process the messages
for the text box. Earlier versions of this code and the code for other window
classes had the message-processing statements mostly in the cases for the one
big switch statement that the class's window-processing module used. Several
readers reported to me that the TopSpeed C compiler was bombing out on these
big functions. I did not support TopSpeed C at the time, and JPI, the vendor,
suggested breaking the program into smaller functions. At the same time, I
observed that the Watcom C compiler was reporting that it did not have enough
memory to fully optimize the functions with the big switch statements. For
these reasons, I decided to put the message-processing code into individual
functions. Watcom no longer reports the optimization problems, and the
executable from Watcom has shrunk by about 5K. The code is easier to read,
too, because there are fewer levels of indentation. If you download the D-Flat
library, compare the normal.c source file with the version of it that I
published last month. You'll see the difference.
Each function has a comment that says which message the function processes.
The window-processing module is TextBoxProc, and you can see that the cases of
its switch statement usually just call the functions for the messages. I will
discuss each of the messages and you can follow along by finding the function
in the listing.
The ADDTEXT message appends text to a text box. Its first parameter is a
pointer to the text string to be appended. The window makes its own copy of
the text, so a program that sends the ADDTEXT message need not worry about
the string going out of scope while the window is still open.
The SETTEXT message assigns a new block of text to the text-box window. The
first parameter points to the new text, which replaces any existing text in
the text box. Just as with the ADDTEXT message, the window makes its own copy
of the text string.
The CLEARTEXT message clears the text in a text box window. It frees the
text's memory space and sets all the text control fields to 0.
The KEYBOARD message processes keystrokes for the text box. Text boxes react
to keys that page or scroll the text. The function that processes these keys
simply sends messages to the window that tell it to scroll or page, up or
down, horizontally or vertically, a line or a page or the entire document.
The LEFT_BUTTON message also scrolls and pages the text box if the mouse is
positioned in the horizontal or vertical scroll bar. If the user clicks a
scroll button, this message sends the appropriate SCROLL or HORIZSCROLL
message to scroll the text one line or column vertically or horizontally. If
the user clicks in the scroll bar but not on the scroll slider box, this
message sends the appropriate up or down or left or right page scrolling
message. If the user clicks on the scroll slider box, this message sets either
the VSliding or HSliding indicator so the window knows it is sliding
vertically or horizontally. Then the message sends the MOUSE_TRAVEL message to
the system to make the mouse stay inside the scroll bar. This technique keeps
the mouse cursor from going all over the screen and confusing the user and the
rest of the system while the user slides the slider box. This condition will
persist until the user releases the mouse button.
The MOUSE_MOVED message is important only if the VSliding or HSliding
indicator is set, in which case the program moves the corresponding slider box
by writing a SCROLLBARCHAR where the box was previously and the SCROLLBOXCHAR
where it is now.
When the BUTTON_RELEASED message occurs with either the VSliding or HSliding
indicator set, the program uses the new slider box position to compute new
page-display parameters for the text box. The program also sends the
MOUSE_TRAVEL message to permit the mouse to roam the entire screen again.
The SCROLL message adjusts the text box's wtop offset. Then, if the window is
visible, the message scrolls the window up or down and displays the bottom or
top line, depending on which way the scroll is going. Finally, the message
computes a new position for the slider box and displays it.
The HORIZSCROLL message adjusts the text box's wleft offset and sends the PAINT
message to the window.
The SCROLLPAGE and HORIZPAGE messages work like the SCROLL and
HORIZSCROLL messages, except that they scroll the number of lines or columns
visible in the window--one page--and then send the PAINT message.
The SCROLLDOC message scrolls the text box to the beginning or end of the
document.
The PAINT message causes the text box to repaint its client area. The first
parameter can be a pointer to a RECT structure relative to the screen
rectangle of the window. The parameter specifies that the PAINT message is
only to paint the portion of the window represented by the RECT structure. If
there is no RECT pointer in the parameter, the program computes the rectangle
to be the entire client area of the window. For each line in the rectangle,
the program calls the WriteTextLine function to display the line. If there are
fewer lines of text than will fill the window, the program calls the writeline
function to pad the client area with blank lines. Finally, the message
computes new positions for the horizontal and vertical scroll bar slider boxes
and displays them.
The CLOSE_WINDOW message sends the CLEARTEXT message to remove the text from
the window and then frees the memory allocated for the text line pointers.
There are several functions in textbox.c that support the message processing.
The ComputeHScrollBox and ComputeVScrollBox functions compute positions on the
scroll bars for the slider boxes based on the length and width of the
document, the height and width of the text box window, and the current visible
portion of the document in the window, as indicated by the wtop and wleft
variables. The ComputeWindowTop and ComputeWindowLeft functions are the
inverses of the ComputeVScrollBox and ComputeHScrollBox functions. They compute
values for the wtop and wleft variables from the current slider box positions.
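The relationship between the two pairs of functions can be illustrated with a
proportional mapping. The code for these functions is not reproduced in this
section, so what follows is a sketch under assumptions rather than D-Flat's
implementation: scroll_box_pos and window_top_from_box are hypothetical names,
and the slider is assumed to travel over `track` character cells.

```c
/* Map the top visible line to a slider position in 0..track-1. */
int scroll_box_pos(int wtop, int wlines, int visible, int track)
{
    int max_top = wlines - visible;   /* largest legal wtop */
    if (max_top <= 0 || track <= 1)
        return 0;                     /* everything fits: park at the top */
    if (wtop > max_top)
        wtop = max_top;
    return (int)((long)wtop * (track - 1) / max_top);
}

/* Inverse: map a slider position back to a top visible line. */
int window_top_from_box(int box, int wlines, int visible, int track)
{
    int max_top = wlines - visible;
    if (max_top <= 0 || track <= 1)
        return 0;
    if (box > track - 1)
        box = track - 1;
    return (int)((long)box * max_top / (track - 1));
}
```

Because a scroll bar usually has far fewer cells than the document has lines,
a round trip through both functions is generally lossy; the same arithmetic
applies horizontally with wleft and the text width.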
The GetTextLine function locates a specified line of text in the document
buffer, allocates enough memory from the heap to hold the line, copies the
line from the document buffer to the allocated space, and returns the address
of the allocated line buffer. This function is used only from within the code
in textbox.c. The calling function must free the allocated memory. The
function uses the TextLine macro to locate the address of the line in the
document buffer. That macro is defined in dflat.h. It uses the specified line
number, the address of the document buffer, and the array of offsets pointed
to by the TextPointers member of the window structure.
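A minimal sketch of that lookup-and-copy step follows, assuming an offset table
like the one TextPointers refers to. The copy_text_line name and its parameters
are hypothetical; the real GetTextLine takes a WINDOW and a line number and
relies on the TextLine macro. Dropping the trailing newline from the copy is
also an assumption of this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of line `ln` (zero-based) without its newline,
   or NULL on a bad line number or failed allocation.
   The caller must free() the result. */
char *copy_text_line(const char *text, const int *offsets,
                     int nlines, int ln)
{
    const char *start, *end;
    size_t len;
    char *line;

    if (text == NULL || offsets == NULL || ln < 0 || ln >= nlines)
        return NULL;
    start = text + offsets[ln];     /* what a TextLine-style macro yields */
    end = strchr(start, '\n');
    len = end ? (size_t)(end - start) : strlen(start);
    line = malloc(len + 1);
    if (line != NULL) {
        memcpy(line, start, len);
        line[len] = '\0';
    }
    return line;
}
```

The offset table turns the lookup into a single addition, which is the
performance benefit the column attributes to the TextPointers array.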
The WriteTextLine function is a complicated piece of code. Its job is to
prepare a displayable buffer for a specified line of text in the text box's
document buffer. The displayable buffer will be clipped on either or both ends
if the RECT parameter does not encompass the whole line. The text line or a
portion of it might fall within a currently selected block, and the function
must insert the appropriate color controls into the text line for the video
functions to use to display the text. D-Flat assumes that a window's text will
display with its default client area colors unless an escape sequence appears
in the text.
This sequence, which must displace no screen character positions, consists of
a CHANGECOLOR character value followed by a foreground color byte and a
background color byte. This sequence will tell the display functions to switch
to the specified color until it finds a RESETCOLOR control byte or reaches the
end of the text line. These control bytes must be taken into consideration
when the function is mapping the clipping rectangle onto the text line. This
is one of those pieces of code that defies description. It took me a long time
to get it working, and I am not sure I could describe its operation now. I am
not sure I even understand it any more. I pray regularly that it never needs
work and will fight to the bitter end any proposed modification to D-Flat that
would affect this code. I counsel you to never paint yourself into such a code
corner, but I know you will from time to time.
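To make the bookkeeping concrete, here is a small sketch of one piece of the
problem: finding the displayed width of a line that contains color sequences.
It is not the WriteTextLine code. The byte values chosen for CHANGECOLOR and
RESETCOLOR are placeholders (the real values come from D-Flat's headers), and
the sketch assumes the two color bytes are never zero.

```c
#include <stddef.h>

/* Placeholder control bytes for this sketch only. */
#define CHANGECOLOR 0x01   /* followed by foreground and background bytes */
#define RESETCOLOR  0x02   /* returns to the window's default colors */

/* Count the screen columns a line occupies, skipping color controls. */
size_t display_width(const char *line)
{
    const unsigned char *cp = (const unsigned char *)line;
    size_t cols = 0;
    while (*cp != '\0' && *cp != '\n') {
        if (*cp == CHANGECOLOR) {
            /* skip the control byte and its two color bytes,
               stopping early if the line is truncated */
            int skip = 3;
            while (skip-- > 0 && *cp != '\0')
                cp++;
        } else if (*cp == RESETCOLOR) {
            cp++;
        } else {
            cols++;
            cp++;
        }
    }
    return cols;
}
```

Clipping has to work in these screen columns rather than in buffer offsets,
which is why the control bytes complicate the mapping of the rectangle onto
the text line.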
The SetAnchor function accepts column and row coordinates for the window and
sets the anchor point for the marking of a block of text.
The ClearTextPointers function establishes the TextPointers array by
reallocating it to the size of a single offset integer, which is given a 0
value. The BuildTextPointers function scans the document text buffer looking
for the addresses of the beginnings of text lines. It puts offset values into
the TextPointers array for each line found. The program calls this function
whenever some editing operation--deleting a block, for example--changes the
buffer in a way that would alter the text line offsets.
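The scan that BuildTextPointers performs can be sketched as follows. This
illustrates the technique rather than reproducing the D-Flat source, which is
not shown in this section; build_line_offsets is a hypothetical name, and the
sketch returns the array instead of storing it in the window structure.

```c
#include <stdlib.h>

/* Scan newline-terminated text and return a heap array holding the
   offset of the start of each line. *nlines receives the line count.
   Returns NULL if there are no lines or memory runs out. */
int *build_line_offsets(const char *text, int *nlines)
{
    int *offsets = NULL;
    int count = 0, off = 0, start = 0;

    *nlines = 0;
    if (text == NULL)
        return NULL;
    for (; text[off] != '\0'; off++) {
        if (text[off] == '\n') {
            int *grown = realloc(offsets, (count + 1) * sizeof *offsets);
            if (grown == NULL) {
                free(offsets);
                return NULL;
            }
            offsets = grown;
            offsets[count++] = start;   /* line began at `start` */
            start = off + 1;            /* next line begins after '\n' */
        }
    }
    *nlines = count;
    return offsets;
}
```

Growing the array one entry at a time keeps the sketch short; a production
version would more likely count the lines first or grow the array in chunks.
Text after the last newline is ignored here, on the column's premise that
every line is newline-terminated.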


How to Get D-Flat Now


The D-Flat source code is on CompuServe in Library 0 of the DDJ Forum and on
M&T Online. Its name is DFLATn.ARC, where the n is an integer that represents
a loosely assigned version number. There is another file, named DFnTXT.ARC. It
includes a README.DOC file that describes the changes and how to build the
software. There is a calendar in the README.DOC file that shows the past and
scheduled monthly columns and the D-Flat subjects that they address. The
DFnTXT.ARC file also contains the Help system database and the documentation
for the programmer's API. D-Flat compiles with Turbo C 2.0, Borland C++ 2.0,
Microsoft C 6.0, and Watcom C 8.0. There are makefiles for the TC, MSC, and
Watcom compilers. There is an example program, the MEMOPAD program, which is a
multiple document notepad.
If you cannot use the online service, send me a formatted diskette--360K or
720K--and an addressed, stamped diskette mailer. Send it to me in care of DDJ.
I'll send you the latest copy of the library. The software is free, but if you
care to, stick a dollar bill in the mailer for the Brevard County Food Bank.
They take care of homeless and hungry families. We've collected about $375 so
far from generous D-Flat "careware" users. I took a pile of money over there
today. They are very grateful.
If you want to discuss D-Flat with me, use CompuServe. My CompuServe ID is
71101,1262, and I monitor the DDJ Forum daily.



Great Moments in Tech Support


The scene: Tech Support person (TS) has taken a phone call from a Customer (C)
who does not know how to install the software product.
TS: Is the diskette in drive A?
C: No.
TS: Put the diskette in drive A.
C: OK, I did.
TS: Now close the door.
C: OK, hold on a minute.
(Sound of phone being put down. Footsteps. SLAM! Footsteps.)
C: OK, now what?
Next month we'll get into the D-Flat EDITBOX, the window class that lets the
user key text into a window after the fashion of a text editor or word
processor.

_C PROGRAMMING COLUMN_
by Al Stevens


[LISTING ONE]

/* ------------- textbox.c ------------ */

#include "dflat.h"

#ifdef INCLUDE_SCROLLBARS
static void ComputeWindowTop(WINDOW);
static void ComputeWindowLeft(WINDOW);
static int ComputeVScrollBox(WINDOW);
static int ComputeHScrollBox(WINDOW);
static void MoveScrollBox(WINDOW, int);
#endif
static char *GetTextLine(WINDOW, int);

#ifdef INCLUDE_SCROLLBARS
int VSliding;
int HSliding;
#endif

/* ------------ ADDTEXT Message -------------- */
static void AddTextMsg(WINDOW wnd, PARAM p1)
{
 /* --- append text to the textbox's buffer --- */
 unsigned adln = strlen((char *)p1);
 if (adln > (unsigned)0xfff0)
 return;
 if (wnd->text != NULL) {
 /* ---- appending to existing text ---- */
 unsigned txln = strlen(wnd->text);
 if ((long)txln+adln > (unsigned) 0xfff0)
 return;
 if (txln+adln > wnd->textlen) {
 wnd->text = realloc(wnd->text, txln+adln+3);
 wnd->textlen = txln+adln+1;
 }
 }
 else {
 /* ------ 1st text appended ------ */
 wnd->text = calloc(1, adln+3);

 wnd->textlen = adln+1;
 }
 if (wnd->text != NULL) {
 /* ---- append the text ---- */
 strcat(wnd->text, (char*) p1);
 strcat(wnd->text, "\n");
 BuildTextPointers(wnd);
 }
}

/* ------------ SETTEXT Message -------------- */
static void SetTextMsg(WINDOW wnd, PARAM p1)
{
 /* -- assign new text value to textbox buffer -- */
 char *cp;
 unsigned int len;
 cp = (void *) p1;
 len = strlen(cp)+1;
 if (wnd->text == NULL || wnd->textlen < len) {
 wnd->textlen = len;
 if ((wnd->text=realloc(wnd->text, len+1)) == NULL)
 return;
 wnd->text[len] = '\0';
 }
 strcpy(wnd->text, cp);
 BuildTextPointers(wnd);
}

/* ------------ CLEARTEXT Message -------------- */
static void ClearTextMsg(WINDOW wnd)
{
 /* ----- clear text from textbox ----- */
 if (wnd->text != NULL)
 free(wnd->text);
 wnd->text = NULL;
 wnd->textlen = 0;
 wnd->wlines = 0;
 wnd->textwidth = 0;
 wnd->wtop = wnd->wleft = 0;
 ClearBlock(wnd);
 ClearTextPointers(wnd);
}

/* ------------ KEYBOARD Message -------------- */
static int KeyboardMsg(WINDOW wnd, PARAM p1)
{
 switch ((int) p1) {
 case UP:
 return SendMessage(wnd,SCROLL,FALSE,0);
 case DN:
 return SendMessage(wnd,SCROLL,TRUE,0);
 case FWD:
 return SendMessage(wnd,HORIZSCROLL,TRUE,0);
 case BS:
 return SendMessage(wnd,HORIZSCROLL,FALSE,0);
 case PGUP:
 return SendMessage(wnd,SCROLLPAGE,FALSE,0);
 case PGDN:
 return SendMessage(wnd,SCROLLPAGE,TRUE,0);

 case CTRL_PGUP:
 return SendMessage(wnd,HORIZPAGE,FALSE,0);
 case CTRL_PGDN:
 return SendMessage(wnd,HORIZPAGE,TRUE,0);
 case HOME:
 return SendMessage(wnd,SCROLLDOC,TRUE,0);
 case END:
 return SendMessage(wnd,SCROLLDOC,FALSE,0);
 default:
 break;
 }
 return FALSE;
}

#ifdef INCLUDE_SCROLLBARS
/* ------------ LEFT_BUTTON Message -------------- */
static int LeftButtonMsg(WINDOW wnd, PARAM p1, PARAM p2)
{
 int mx = (int) p1 - GetLeft(wnd);
 int my = (int) p2 - GetTop(wnd);
 if (TestAttribute(wnd, VSCROLLBAR) &&
 mx == WindowWidth(wnd)-1) {
 /* -------- in the right border ------- */
 if (my == 0 || my == ClientHeight(wnd)+1)
 /* --- above or below the scroll bar --- */
 return FALSE;
 if (my == 1)
 /* -------- top scroll button --------- */
 return SendMessage(wnd, SCROLL, FALSE, 0);
 if (my == ClientHeight(wnd))
 /* -------- bottom scroll button --------- */
 return SendMessage(wnd, SCROLL, TRUE, 0);
 /* ---------- in the scroll bar ----------- */
 if (!VSliding && my-1 == wnd->VScrollBox) {
 RECT rc;
 VSliding = TRUE;
 rc.lf = rc.rt = GetRight(wnd);
 rc.tp = GetTop(wnd)+2;
 rc.bt = GetBottom(wnd)-2;
 return SendMessage(NULL, MOUSE_TRAVEL,
 (PARAM) &rc, 0);
 }
 if (my-1 < wnd->VScrollBox)
 return SendMessage(wnd,SCROLLPAGE,FALSE,0);
 if (my-1 > wnd->VScrollBox)
 return SendMessage(wnd,SCROLLPAGE,TRUE,0);
 }
 if (TestAttribute(wnd, HSCROLLBAR) &&
 my == WindowHeight(wnd)-1) {
 /* -------- in the bottom border ------- */
 if (mx == 0 || mx == ClientWidth(wnd)+1)
 /* ------ outside the scroll bar ---- */
 return FALSE;
 if (mx == 1)
 return SendMessage(wnd, HORIZSCROLL,FALSE,0);
 if (mx == WindowWidth(wnd)-2)
 return SendMessage(wnd, HORIZSCROLL,TRUE,0);
 if (!HSliding && mx-1 == wnd->HScrollBox) {
 /* --- hit the scroll box --- */

 RECT rc;
 rc.lf = GetLeft(wnd)+2;
 rc.rt = GetRight(wnd)-2;
 rc.tp = rc.bt = GetBottom(wnd);
 /* - keep the mouse in the scroll bar - */
 SendMessage(NULL,MOUSE_TRAVEL,(PARAM)&rc,0);
 HSliding = TRUE;
 return TRUE;
 }
 if (mx-1 < wnd->HScrollBox)
 return SendMessage(wnd,HORIZPAGE,FALSE,0);
 if (mx-1 > wnd->HScrollBox)
 return SendMessage(wnd,HORIZPAGE,TRUE,0);
 }
 return FALSE;
}

/* ------------ MOUSE_MOVED Message -------------- */
static int MouseMovedMsg(WINDOW wnd, PARAM p1, PARAM p2)
{
 int mx = (int) p1 - GetLeft(wnd);
 int my = (int) p2 - GetTop(wnd);
 if (VSliding) {
 /* ---- dragging the vertical scroll box --- */
 if (my-1 != wnd->VScrollBox) {
 foreground = FrameForeground(wnd);
 background = FrameBackground(wnd);
 PutWindowChar(wnd, WindowWidth(wnd)-1,
 wnd->VScrollBox+1,SCROLLBARCHAR);
 wnd->VScrollBox = my-1;
 PutWindowChar(wnd, WindowWidth(wnd)-1,
 my, SCROLLBOXCHAR);
 }
 return TRUE;
 }
 if (HSliding) {
 /* --- dragging the horizontal scroll box --- */
 if (mx-1 != wnd->HScrollBox) {
 foreground = FrameForeground(wnd);
 background = FrameBackground(wnd);
 PutWindowChar(wnd, wnd->HScrollBox+1,
 WindowHeight(wnd)-1, SCROLLBARCHAR);
 wnd->HScrollBox = mx-1;
 PutWindowChar(wnd, mx, WindowHeight(wnd)-1,
 SCROLLBOXCHAR);
 }
 return TRUE;
 }
 return FALSE;
}

/* ------------ BUTTON_RELEASED Message -------------- */
static void ButtonReleasedMsg(WINDOW wnd)
{
 if (HSliding || VSliding) {
 RECT rc;
 rc.lf = rc.tp = 0;
 rc.rt = SCREENWIDTH-1;
 rc.bt = SCREENHEIGHT-1;

 /* release the mouse outside the scroll bar */
 SendMessage(NULL, MOUSE_TRAVEL, (PARAM) &rc, 0);
 VSliding ? ComputeWindowTop(wnd) :
 ComputeWindowLeft(wnd);
 SendMessage(wnd, PAINT, 0, 0);
 SendMessage(wnd, KEYBOARD_CURSOR, 0, 0);
 VSliding = HSliding = FALSE;
 }
}
#endif

/* ------------ SCROLL Message -------------- */
static int ScrollMsg(WINDOW wnd, PARAM p1)
{
 /* ---- vertical scroll one line ---- */
 if (p1) {
 /* ----- scroll one line up ----- */
 if (wnd->wtop+ClientHeight(wnd) >= wnd->wlines)
 return FALSE;
 wnd->wtop++;
 }
 else {
 /* ----- scroll one line down ----- */
 if (wnd->wtop == 0)
 return FALSE;
 --wnd->wtop;
 }
 if (isVisible(wnd)) {
 RECT rc;
 rc = ClipRectangle(wnd, ClientRect(wnd));
 if (ValidRect(rc)) {
 /* ---- scroll the window ----- */
 scroll_window(wnd, rc, (int)p1);
 if (!(int)p1)
 /* -- write top line (down) -- */
 WriteTextLine(wnd,NULL,wnd->wtop,FALSE);
 else {
 /* -- write bottom line (up) -- */
 int y=RectBottom(rc)-GetClientTop(wnd);
 WriteTextLine(wnd, NULL,
 wnd->wtop+y, FALSE);
 }
 }
#ifdef INCLUDE_SCROLLBARS
 /* ---- reset the scroll box ---- */
 if (TestAttribute(wnd, VSCROLLBAR)) {
 int vscrollbox = ComputeVScrollBox(wnd);
 if (vscrollbox != wnd->VScrollBox)
 MoveScrollBox(wnd, vscrollbox);
 }
#endif
 return TRUE;
 }
 return FALSE;
}

/* ------------ HORIZSCROLL Message -------------- */
static void HorizScrollMsg(WINDOW wnd, PARAM p1)
{

 /* --- horizontal scroll one column --- */
 if (p1) {
 /* --- scroll left --- */
 if (wnd->wleft + ClientWidth(wnd)-1 <
 wnd->textwidth)
 wnd->wleft++;
 }
 else
 /* --- scroll right --- */
 if (wnd->wleft > 0)
 --wnd->wleft;
 SendMessage(wnd, PAINT, 0, 0);
}

/* ------------ SCROLLPAGE Message -------------- */
static void ScrollPageMsg(WINDOW wnd, PARAM p1)
{
 /* --- vertical scroll one page --- */
 if ((int) p1 == FALSE) {
 /* ---- page up ---- */
 if (wnd->wtop) {
 wnd->wtop -= ClientHeight(wnd);
 if (wnd->wtop < 0)
 wnd->wtop = 0;
 }
 }
 else {
 /* ---- page down ---- */
 if (wnd->wtop+ClientHeight(wnd) < wnd->wlines) {
 wnd->wtop += ClientHeight(wnd);
 if (wnd->wtop>wnd->wlines-ClientHeight(wnd))
 wnd->wtop=wnd->wlines-ClientHeight(wnd);
 }
 }
 SendMessage(wnd, PAINT, 0, 0);
}

/* ------------ HORIZSCROLLPAGE Message -------------- */
static void HorizScrollPageMsg(WINDOW wnd, PARAM p1)
{
 /* --- horizontal scroll one page --- */
 if ((int) p1 == FALSE) {
 /* ---- page left ----- */
 wnd->wleft -= ClientWidth(wnd);
 if (wnd->wleft < 0)
 wnd->wleft = 0;
 }
 else {
 /* ---- page right ----- */
 wnd->wleft += ClientWidth(wnd);
 if (wnd->wleft>wnd->textwidth-ClientWidth(wnd))
 wnd->wleft=wnd->textwidth-ClientWidth(wnd);
 }
 SendMessage(wnd, PAINT, 0, 0);
}

/* ------------ SCROLLDOC Message -------------- */
static void ScrollDocMsg(WINDOW wnd, PARAM p1)
{

 /* --- scroll to beginning or end of document --- */
 if ((int) p1)
 wnd->wtop = wnd->wleft = 0;
 else if (wnd->wtop+ClientHeight(wnd) < wnd->wlines){
 wnd->wtop = wnd->wlines-ClientHeight(wnd);
 wnd->wleft = 0;
 }
 SendMessage(wnd, PAINT, 0, 0);
}

/* ------------ PAINT Message -------------- */
static int PaintMsg(WINDOW wnd, PARAM p1)
{
 /* ------ paint the client area ----- */
 RECT rc, rcc;
 int y;
 char blankline[201];

 /* ----- build the rectangle to paint ----- */
 if ((RECT *)p1 == NULL)
 rc=RelativeWindowRect(wnd, WindowRect(wnd));
 else
 rc= *(RECT *)p1;
 if (TestAttribute(wnd, HASBORDER) &&
 RectRight(rc) >= WindowWidth(wnd)-1) {
 if (RectLeft(rc) >= WindowWidth(wnd)-1)
 return FALSE;
 RectRight(rc) = WindowWidth(wnd)-2;
 }
 rcc = AdjustRectangle(wnd, rc);

 /* ----- blank line for padding ----- */
 memset(blankline, ' ', SCREENWIDTH);
 blankline[RectRight(rcc)+1] = '\0';

 /* ------- each line within rectangle ------ */
 for (y = RectTop(rc); y <= RectBottom(rc); y++){
 int yy;
 /* ---- test outside of Client area ---- */
 if (TestAttribute(wnd,
 HASBORDER | HASTITLEBAR)) {
 if (y < TopBorderAdj(wnd))
 continue;
 if (y > WindowHeight(wnd)-2)
 continue;
 }
 yy = y-TopBorderAdj(wnd);
 if (yy < wnd->wlines-wnd->wtop)
 /* ---- paint a text line ---- */
 WriteTextLine(wnd, &rc,
 yy+wnd->wtop, FALSE);
 else {
 /* ---- paint a blank line ---- */
 SetStandardColor(wnd);
 writeline(wnd, blankline+RectLeft(rcc),
 RectLeft(rcc)+1, y, FALSE);
 }
 }
#ifdef INCLUDE_SCROLLBARS

 /* ------- position the scroll box ------- */
 if (TestAttribute(wnd, VSCROLLBAR | HSCROLLBAR)) {
 int hscrollbox = ComputeHScrollBox(wnd);
 int vscrollbox = ComputeVScrollBox(wnd);
 if (hscrollbox != wnd->HScrollBox ||
 vscrollbox != wnd->VScrollBox) {
 wnd->HScrollBox = hscrollbox;
 wnd->VScrollBox = vscrollbox;
 SendMessage(wnd, BORDER, p1, 0);
 }
 }
#endif
 return TRUE;
}

/* ------------ CLOSE_WINDOW Message -------------- */
static void CloseWindowMsg(WINDOW wnd)
{
 SendMessage(wnd, CLEARTEXT, 0, 0);
 if (wnd->TextPointers != NULL) {
 free(wnd->TextPointers);
 wnd->TextPointers = NULL;
 }
}

/* ----------- TEXTBOX Message-processing Module ----------- */
int TextBoxProc(WINDOW wnd, MESSAGE msg, PARAM p1, PARAM p2)
{
 switch (msg) {
 case CREATE_WINDOW:
 wnd->HScrollBox = wnd->VScrollBox = 1;
 ClearTextPointers(wnd);
 break;
 case ADDTEXT:
 AddTextMsg(wnd, p1);
 break;
 case SETTEXT:
 SetTextMsg(wnd, p1);
 break;
 case CLEARTEXT:
 ClearTextMsg(wnd);
 break;
 case KEYBOARD:
#ifdef INCLUDE_SYSTEM_MENUS
 if (WindowMoving || WindowSizing)
 return FALSE;
#endif
 if (KeyboardMsg(wnd, p1))
 return TRUE;
 break;
 case LEFT_BUTTON:
#ifdef INCLUDE_SYSTEM_MENUS
 if (WindowSizing || WindowMoving)
 return FALSE;
#endif
#ifdef INCLUDE_SCROLLBARS
 if (LeftButtonMsg(wnd, p1, p2))
 return TRUE;
 break;

 case MOUSE_MOVED:
 if (MouseMovedMsg(wnd, p1, p2))
 return TRUE;
 break;
 case BUTTON_RELEASED:
 ButtonReleasedMsg(wnd);
#endif
 break;
 case SCROLL:
 ScrollMsg(wnd, p1);
 return TRUE;
 case HORIZSCROLL:
 HorizScrollMsg(wnd, p1);
 return TRUE;
 case SCROLLPAGE:
 ScrollPageMsg(wnd, p1);
 return TRUE;
 case HORIZPAGE:
 HorizScrollPageMsg(wnd, p1);
 return TRUE;
 case SCROLLDOC:
 ScrollDocMsg(wnd, p1);
 return TRUE;
 case PAINT:
 if (isVisible(wnd) && wnd->wlines) {
 PaintMsg(wnd, p1);
 return FALSE;
 }
 break;
 case CLOSE_WINDOW:
 CloseWindowMsg(wnd);
 break;
 default:
 break;
 }
 return BaseWndProc(TEXTBOX, wnd, msg, p1, p2);
}

#ifdef INCLUDE_SCROLLBARS
/* ------ compute the vertical scroll box position from
 the text pointers --------- */
static int ComputeVScrollBox(WINDOW wnd)
{
 int pagelen = wnd->wlines - ClientHeight(wnd);
 int barlen = ClientHeight(wnd)-2;
 int lines_tick;
 int vscrollbox;

 if (pagelen < 1 || barlen < 1)
 vscrollbox = 1;
 else {
 if (pagelen > barlen)
 lines_tick = pagelen / barlen;
 else
 lines_tick = barlen / pagelen;
 vscrollbox = 1 + (wnd->wtop / lines_tick);
 if (vscrollbox > ClientHeight(wnd)-2 ||
 wnd->wtop + ClientHeight(wnd) >= wnd->wlines)
 vscrollbox = ClientHeight(wnd)-2;

 }
 return vscrollbox;
}

/* ---- compute top text line from scroll box position ---- */
static void ComputeWindowTop(WINDOW wnd)
{
 int pagelen = wnd->wlines - ClientHeight(wnd);
 if (wnd->VScrollBox == 0)
 wnd->wtop = 0;
 else if (wnd->VScrollBox == ClientHeight(wnd)-2)
 wnd->wtop = pagelen;
 else {
 int barlen = ClientHeight(wnd)-2;
 int lines_tick;

 if (pagelen > barlen)
 lines_tick = pagelen / barlen;
 else
 lines_tick = barlen / pagelen;
 wnd->wtop = (wnd->VScrollBox-1) * lines_tick;
 if (wnd->wtop + ClientHeight(wnd) > wnd->wlines)
 wnd->wtop = pagelen;
 }
 if (wnd->wtop < 0)
 wnd->wtop = 0;
}

/* ------ compute the horizontal scroll box position from
 the text pointers --------- */
static int ComputeHScrollBox(WINDOW wnd)
{
 int pagewidth = wnd->textwidth - ClientWidth(wnd);
 int barlen = ClientWidth(wnd)-2;
 int chars_tick;
 int hscrollbox;

 if (pagewidth < 1 || barlen < 1)
 hscrollbox = 1;
 else {
 if (pagewidth > barlen)
 chars_tick = pagewidth / barlen;
 else
 chars_tick = barlen / pagewidth;
 hscrollbox = 1 + (wnd->wleft / chars_tick);
 if (hscrollbox > ClientWidth(wnd)-2 ||
 wnd->wleft + ClientWidth(wnd) >= wnd->textwidth)
 hscrollbox = ClientWidth(wnd)-2;
 }
 return hscrollbox;
}

/* ---- compute left column from scroll box position ---- */
static void ComputeWindowLeft(WINDOW wnd)
{
 int pagewidth = wnd->textwidth - ClientWidth(wnd);

 if (wnd->HScrollBox == 0)
 wnd->wleft = 0;

 else if (wnd->HScrollBox == ClientWidth(wnd)-2)
 wnd->wleft = pagewidth;
 else {
 int barlen = ClientWidth(wnd)-2;
 int chars_tick;

 if (pagewidth > barlen)
 chars_tick = pagewidth / barlen;
 else
 chars_tick = barlen / pagewidth;
 wnd->wleft = (wnd->HScrollBox-1) * chars_tick;
 if (wnd->wleft + ClientWidth(wnd) > wnd->textwidth)
 wnd->wleft = pagewidth;
 }
 if (wnd->wleft < 0)
 wnd->wleft = 0;
}
#endif

/* ----- get the text to a specified line ----- */
static char *GetTextLine(WINDOW wnd, int selection)
{
 char *line;
 int len = 0;
 char *cp, *cp1;
 cp = cp1 = TextLine(wnd, selection);
 while (*cp && *cp != '\n') {
 len++;
 cp++;
 }
 line = malloc(len+6);
 if (line != NULL) {
 memmove(line, cp1, len);
 line[len] = '\0';
 }
 return line;
}

/* ------- write a line of text to a textbox window ------- */
void WriteTextLine(WINDOW wnd, RECT *rcc, int y, int reverse)
{
 int len = 0;
 int dif = 0;
 unsigned char *line;
 RECT rc;
 unsigned char *lp, *svlp;
 int lnlen;
 int i;
 int trunc = FALSE;

 /* ------ make sure y is inside the window ----- */
 if (y < wnd->wtop || y >= wnd->wtop+ClientHeight(wnd))
 return;

 /* ---- build the rectangle within which can write ---- */
 if (rcc == NULL) {
 rc = RelativeWindowRect(wnd, WindowRect(wnd));
 if (TestAttribute(wnd, HASBORDER) &&
 RectRight(rc) >= WindowWidth(wnd)-1)

 RectRight(rc) = WindowWidth(wnd)-2;
 }
 else
 rc = *rcc;

 /* ----- make sure rectangle is within window ------ */
 if (RectLeft(rc) >= WindowWidth(wnd)-1)
 return;
 if (RectRight(rc) == 0)
 return;
 rc = AdjustRectangle(wnd, rc);
 if (y-wnd->wtop<RectTop(rc) || y-wnd->wtop>RectBottom(rc))
 return;

 /* --- get the text and length of the text line --- */
 lp = svlp = GetTextLine(wnd, y);
 if (svlp == NULL)
 return;
 lnlen = LineLength(lp);

 /* -------- insert block color change controls ------- */
 if (BlockMarked(wnd)) {
 int bbl = wnd->BlkBegLine;
 int bel = wnd->BlkEndLine;
 int bbc = wnd->BlkBegCol;
 int bec = wnd->BlkEndCol;
 int by = y;

 /* ----- put lowest marker first ----- */
 if (bbl > bel) {
 swap(bbl, bel);
 swap(bbc, bec);
 }
 if (bbl == bel && bbc > bec)
 swap(bbc, bec);

 if (by >= bbl && by <= bel) {
 /* ------ the block includes this line ----- */
 int blkbeg = 0;
 int blkend = lnlen;
 if (!(by > bbl && by < bel)) {
 /* --- the entire line is not in the block -- */
 if (by == bbl)
 /* ---- the block begins on this line --- */
 blkbeg = bbc;
 if (by == bel)
 /* ---- the block ends on this line ---- */
 blkend = bec;
 }
 /* ----- insert the reset color token ----- */
 memmove(lp+blkend+1,lp+blkend,strlen(lp+blkend)+1);
 lp[blkend] = RESETCOLOR;
 /* ----- insert the change color token ----- */
 memmove(lp+blkbeg+3,lp+blkbeg,strlen(lp+blkbeg)+1);
 lp[blkbeg] = CHANGECOLOR;
 /* ----- insert the color tokens ----- */
 SetReverseColor(wnd);
 lp[blkbeg+1] = foreground | 0x80;
 lp[blkbeg+2] = background | 0x80;

 lnlen += 4;
 }
 }
 /* - make sure left margin doesn't overlap color change - */
 for (i = 0; i < wnd->wleft+3; i++) {
 if (*(lp+i) == '\0')
 break;
 if (*(unsigned char *)(lp + i) == RESETCOLOR)
 break;
 }
 if (*(lp+i) && i < wnd->wleft+3) {
 if (wnd->wleft+4 > lnlen)
 trunc = TRUE;
 else
 lp += 4;
 }
 else {
 /* --- it does, shift the color change over --- */
 for (i = 0; i < wnd->wleft; i++) {
 if (*(lp+i) == '\0')
 break;
 if (*(unsigned char *)(lp + i) == CHANGECOLOR) {
 *(lp+wnd->wleft+2) = *(lp+i+2);
 *(lp+wnd->wleft+1) = *(lp+i+1);
 *(lp+wnd->wleft) = *(lp+i);
 break;
 }
 }
 }
 /* ------ build the line to display -------- */
 if ((line = malloc(200)) != NULL) {
 if (!trunc) {
 if (lnlen < wnd->wleft)
 lnlen = 0;
 else
 lp += wnd->wleft;
 if (lnlen > RectLeft(rc)) {
 /* ---- the line exceeds the rectangle ---- */
 int ct = RectLeft(rc);
 char *initlp = lp;
 /* --- point to end of clipped line --- */
 while (ct) {
 if (*(unsigned char *)lp == CHANGECOLOR)
 lp += 3;
 else if (*(unsigned char *)lp == RESETCOLOR)
 lp++;
 else
 lp++, --ct;
 }
 if (RectLeft(rc)) {
 char *lpp = lp;
 while (*lpp) {
 if (*(unsigned char*)lpp==CHANGECOLOR)
 break;
 if (*(unsigned char*)lpp==RESETCOLOR) {
 lpp = lp;
 while (lpp >= initlp) {
 if (*(unsigned char *)lpp ==
 CHANGECOLOR) {

 lp -= 3;
 memmove(lp,lpp,3);
 break;
 }
 --lpp;
 }
 break;
 }
 lpp++;
 }
 }
 lnlen = LineLength(lp);
 len = min(lnlen, RectWidth(rc));
 dif = strlen(lp) - lnlen;
 len += dif;
 if (len > 0)
 strncpy(line, lp, len);
 }
 }
 /* -------- pad the line --------- */
 while (len < RectWidth(rc)+dif)
 line[len++] = ' ';
 line[len] = '\0';
 dif = 0;
 /* ------ establish the line's main color ----- */
 if (reverse) {
 char *cp = line;
 SetReverseColor(wnd);
 while ((cp = strchr(cp, CHANGECOLOR)) != NULL) {
 cp += 2;
 *cp++ = background | 0x80;
 }
 if (*(unsigned char *)line == CHANGECOLOR)
 dif = 3;
 }
 else
 SetStandardColor(wnd);
 /* ------- display the line -------- */
 writeline(wnd, line+dif,
 RectLeft(rc)+BorderAdj(wnd),
 y-wnd->wtop+TopBorderAdj(wnd), FALSE);
 free(line);
 }
 free(svlp);
}

/* ----- set anchor point for marking text block ----- */
void SetAnchor(WINDOW wnd, int mx, int my)
{
 if (BlockMarked(wnd)) {
 ClearBlock(wnd);
 SendMessage(wnd, PAINT, 0, 0);
 }
 /* ------ set the anchor ------ */
 wnd->BlkBegLine = wnd->BlkEndLine = my;
 wnd->BlkBegCol = wnd->BlkEndCol = mx;
}

/* ----- clear and initialize text line pointer array ----- */

void ClearTextPointers(WINDOW wnd)
{
 wnd->TextPointers = realloc(wnd->TextPointers, sizeof(int));
 if (wnd->TextPointers != NULL)
 *(wnd->TextPointers) = 0;
}

#define INITLINES 100

/* ---- build array of pointers to text lines ---- */
void BuildTextPointers(WINDOW wnd)
{
 char *cp = wnd->text, *cp1;
 int incrs = INITLINES;
 unsigned int off;
 wnd->textwidth = wnd->wlines = 0;
 while (*cp) {
 if (incrs == INITLINES) {
 incrs = 0;
 wnd->TextPointers = realloc(wnd->TextPointers,
 (wnd->wlines + INITLINES) * sizeof(int));
 if (wnd->TextPointers == NULL)
 break;
 }
 off = (unsigned int) (cp - wnd->text);
 *((wnd->TextPointers) + wnd->wlines) = off;
 wnd->wlines++;
 incrs++;
 cp1 = cp;
 while (*cp && *cp != '\n')
 cp++;
 wnd->textwidth = max(wnd->textwidth,
 (unsigned int) (cp - cp1));
 if (*cp)
 cp++;
 }
}

#ifdef INCLUDE_SCROLLBARS
static void MoveScrollBox(WINDOW wnd, int vscrollbox)
{
 foreground = FrameForeground(wnd);
 background = FrameBackground(wnd);
 PutWindowChar(wnd, WindowWidth(wnd)-1,
 wnd->VScrollBox+1, SCROLLBARCHAR);
 PutWindowChar(wnd, WindowWidth(wnd)-1,
 vscrollbox+1, SCROLLBOXCHAR);
 wnd->VScrollBox = vscrollbox;
}
#endif












December, 1991
STRUCTURED PROGRAMMING


The Tragedy of the Black Box




Jeff Duntemann, KG7JF


I get at least one or two letters a week from people demanding to know what
the "KG7JF" after my name means. One chap guessed it was a lodge slogan, in
the manner of IOOF, while another reader wanted to know if it was a clue in
some sort of national treasure hunt. Sorry, gang. I don't do lodges and I
don't do treasure hunts. (I don't even buy lottery tickets, since I only take
sucker bets.) KG7JF is my amateur radio (ham) call-sign, meaning that I've
earned the right to buy or build my own radio transmitter and thereby (within
a body of accepted practice) disturb the ether.
It's what I do when I can't stand computers anymore, and I give ham radio
credit for keeping my head from exploding on numerous occasions. Not everybody
builds their own radios anymore, which is a damned shame. In fact, my current
radio project is designing a two-way FM radio from integrated circuits
designed for the cordless phone industry, in order to make building amateur
gear simpler and less expensive. (Not to mention a way of asking the Fallen
Viking to forgive me...) The project has been an amazing education in a lot of
ways, not the least of which is my discovery of the parallel between modern
radio design and object-oriented programming.


Line-by-line and Part-by-part


I've been involved with radios a lot longer than I've been involved with
programming. I have 25 years' experience building radios the old way; that is,
part by part, transistor by transistor, soldering a component in here and
another there, each addition a custom job, and the whole project taking weeks
or even months of loose moments to complete.
Does this sound familiar, in a metaphorical way? Lord knows, it should. Most
of us, especially when we're in a hurry, build programs line by line, each
line conscious of the one before, and in its own way a custom job. Moreover,
most of us get this nagging feeling that it's all way too much work and that
there has got to be a better way.
In designing the radio I'm calling Chipper, I set out with full intent of
buying whole radio subsystems in the form of integrated circuits, rather than
building subsystems a transistor and a resistor at a time. I expected it to
take four or five ICs (rather than 25 or so transistors) to create a useful
dual-conversion narrow-band FM receiver. Instead (to my shock) I found that it
took two.
The boss IC is something called the MC3362 from Motorola. To build a radio,
you add components to its 24 pins. If you want a bare-bones radio, you only
add a few components. To create a more elaborate radio, you add more
components. The important point is that you add things; you don't change them.
There are some fundamental behaviors exhibited by the MC3362 that come through
no matter what, and there are some other behaviors that can at best be masked
or reinterpreted.


The Downside of Black Boxes


The MC3362 is a literal black box. A signal emerges from one pin for
intermediate frequency filtering; you send it through a filter of your own
design, and then feed the filtered signal back into the little black chip
through another pin. You can only control what emerges from the chip. The
stuff inside is beyond your control.
But it's worse than that. The old saw of "out of sight, out of mind" comes
into play with a new twist: What you can't see is pure hell to understand. I
used to be able to point to a resistor in a circuit and say, "That's a bias
resistor. It sets the operating point for transistor Q14. To change the
operating point, change the value of the resistor." No more. The MC3362 has
plenty of transistors (about 75, in fact) and plenty of bias resistors. But
they're buried in the middle of the chip where you can't change them. You not
only can't change the operating points, you can't even sample them to see what
they are. You can't "tweak" them and watch the consequences. You remain in a
state of enforced ignorance. Encapsulation in black polyethylene is pretty
total.
And the upshot is that I don't understand the MC3362. I know what goes in and
what comes out, sort of, but the dynamics of what goes on inside is
undocumented and might as well be magic. Worst of all, my lack of
understanding of the hidden parts of the system cripples my understanding of
those parts of the system that I can see.
This is the Tragedy of the Black Box--which is a great deal of what is wrong
with Turbo Vision. So much happens behind the scenes that even the stuff that
we see on the surface becomes mysterious, legendary, and contrary to
conventional wisdom. The Black Box problem will dog you throughout your
experience with Turbo Vision. Get used to it. Strive whenever you can to
understand even the parts of the system that don't require your direct
intervention. Knowledge is power, and (less obviously) knowledge is
cumulative.


The Stuff Apps are Made Of


So. Let's start here. There are actually two necessary technical descriptions
of Turbo Vision: One is of what it's made of, and the other is of what it
does. Both are mutually connected in a multitude of ways, but it's marginally
easier to first approach Turbo Vision from the standpoint of its structure.
This will probably take an entire column. It's a big, subtle, and confusing
subject.
From a height: A Turbo Vision application is a whole crew of objects,
allocated on the heap and linked by pointers. There is one boss object, the
application object. The application object owns all other objects present in
the application. This ownership is a question of pointer referents, and has
nothing to do with object hierarchy relationships. Don't get the two confused!
We're going to discuss object ownership first, long before we get into the
details of the Turbo Vision object hierarchy.
We have to define some technical terms here: A view in Turbo Vision is any
object that can display itself to the screen. Anything that can't display
itself is not a view, and is what I call a mute object. The term "mute object"
was introduced very early in the Turbo Vision Guide (by me) and then forgotten
about when I passed the project into other hands. It's a good term, however,
and I'll continue to use it.
The application object can own other objects because it is a special kind of
object called a group. A group is a special kind of view that owns other
views. A group is the root node of a linked list of views (or other groups)
that it owns, and we say that it owns them by virtue of their being part of
that linked list.
The structure of Turbo Vision can best be understood in terms of groups. In a
sense, Turbo Vision is made of groups and very little else.


An Example: TApplication


You can instantiate and run an object of type TApplication, and it's an
interesting thing to do. Not much happens (remember, TApplication is a
boilerplate application that does no actual work) but you will see something
on your screen. What you see is a menu bar at the top of the screen, a status
line at the bottom of the screen, and a pattern of halftone characters in the
middle, completely covering the rest of the screen. You're not actually
looking at TApplication. The TApplication object itself is not a typical view
and has no on-screen presence. Instead, it is a group that owns (at minimum)
three views: A menu bar view, a status line view, and a desktop view, and
these are what you're seeing.
For the sake of clarity, I lied a moment ago. The desktop view isn't really a
view. It's another group, and what you see is not actually the desktop object
but another object owned by the desktop group, called a background. The
background view is simply a way of displaying a pattern on all parts of the
screen not taken up by other things. This is a good example of an important
truth: A group can own other groups. The application group (an instance of
TApplication) owns the desktop group (an instance of TDesktop).
The desktop group, in turn, owns the background object (at very minimum) but
it also owns all the visible elements you create for your program: windows,
dialog boxes, and so on. See Figure 1. Note in Figure 1 that only groups (the
elliptical objects) can own other things. A window is a group, and we don't
actually see the window object itself on the screen. Instead, we see the
component views that the window owns: its pane, its frame, and its scroll
bars.


Ownership


The ownership relationship between two objects is not something defined at
compile time, but is something that happens strictly at runtime. At runtime,
an object can be inserted into a group by way of an Insert method present in
every group: MyGroup.Insert(PtrToMyObject);. This is the way that any
arbitrary object is inserted into a group. A pointer to the object is passed
to the group's Insert method, and the referent of PtrToMyObject is inserted
into the linked list of objects whose root node lies in MyGroup. (There are a
lot of operations like this that deal with objects only through pointers. You
must get comfortable with pointers before you have a ghost of a chance of
understanding Turbo Vision!)
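Here is a minimal sketch of that runtime linking. It is written in C rather
than Turbo Pascal, and it is not Turbo Vision's actual declaration of anything:
the View structure, its fields, and the group_insert and group_count functions
are all hypothetical names made up for illustration. The only point is that
ownership is nothing more than pointers being hooked into a linked list while
the program runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a view; NOT Turbo Vision's real TView. */
typedef struct View {
    const char  *name;
    struct View *owner;  /* group that owns this view, NULL if none  */
    struct View *next;   /* next view in the owner's linked list     */
    struct View *first;  /* groups only: first view owned            */
    struct View *last;   /* groups only: most recently inserted view */
} View;

/* Insert a view into a group at run time. The view is appended after
   everything inserted earlier, so insertion order is preserved. */
void group_insert(View *group, View *view)
{
    assert(group != NULL && view != NULL && view->owner == NULL);
    view->owner = group;
    view->next  = NULL;
    if (group->last != NULL)
        group->last->next = view;   /* link behind the previous tail */
    else
        group->first = view;        /* first view this group owns    */
    group->last = view;
}

/* Count the views a group owns directly (not its grandchildren). */
int group_count(const View *group)
{
    int n = 0;
    const View *v;
    for (v = group->first; v != NULL; v = v->next)
        n++;
    return n;
}
```

Nothing in the type says who owns whom. A desktop owns a window only after
something like group_insert(&desktop, &window) has actually executed.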
If you've compiled and run my HCALC program published last month, you'll get a
feel for this. When HCALC begins running, there are no windows on the desktop.
When you pull down the Mortgage menu and select New, you create a new mortgage
window and insert it into the desktop group. The desktop group owns that new
window, and all other windows you may create later on. This is why we can say
that the desktop group owns literally everything in your application except
the status line and menu bar. Typically, over the life of an application
session your desktop group will insert into itself and later delete numerous
objects as windows are opened and closed.



Commanding the Troops


The notion of a group is critical when we begin putting windows together. A
window is a group--and therefore, we can insert into the window group whatever
"standard parts" the window needs--and only those parts it needs. If a window
doesn't need any scroll bars, don't insert them. A window may need more than a
single pane--so insert two or three or however many will do the job. The
versatility of the TV group concept is stunning.
As objects, groups aren't very "bright." They don't have a lot of native
intelligence. Mostly, groups exist as glue to tie other objects together. The
smarts in a group are pretty much all in the objects owned by the group. The
group itself acts as a fairly dumb linked list manager, and that's all.
There are times when all the components of a group must act together. When you
move a window, the parts of the window all have to move at once, and in the
same direction for the same distance, or the window will come apart while you
watch. The group must thus have some way of telling all the objects it owns to
do the same thing at the same time.
In formal terms, the group object has the power to iterate an operation over
all the objects it owns. In other words, you define a procedure to be
performed (typically by creating a pointer to said procedure) and then have
the group command each object it owns to execute that procedure.
Every group has a method named ForEach to do the job. ForEach takes as its
only parameter a pointer to (and this is extremely important!) a far local
procedure to be performed. That procedure can't be a method, but the procedure
can be local to a method and also call a method. (This is one ugly shortcoming
of Turbo Pascal: Methods cannot be accessed directly through procedure
pointers.) Don't fret the details for now. It's enough to understand at this
point that a group can force all the objects it owns to execute a given
procedure: MyGroup.ForEach(@DoSomething);. Here, every object owned by
MyGroup is instructed to execute the DoSomething procedure. If MyGroup owns
any groups, each group then, in turn, orders all the objects that it owns to
execute DoSomething.
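The iteration idea can be sketched in C, where an ordinary function pointer
plus a context pointer stands in for the procedure pointer handed to ForEach.
This is an illustration under stated assumptions, not Turbo Vision's
implementation: View, ViewAction, group_for_each, Delta, and move_by are all
hypothetical names, and the recursion into owned groups simply follows the
behavior as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view record; a group is a view whose 'first' is non-NULL. */
typedef struct View {
    int x, y;             /* screen position of the view              */
    struct View *next;    /* next view owned by the same group        */
    struct View *first;   /* groups only: first owned view, else NULL */
} View;

/* The operation to perform on each owned view. */
typedef void (*ViewAction)(View *view, void *context);

/* Apply 'action' to every view the group owns; owned groups pass the
   order along to the views that they own in turn. */
void group_for_each(View *group, ViewAction action, void *context)
{
    View *v;
    assert(group != NULL && action != NULL);
    for (v = group->first; v != NULL; v = v->next) {
        action(v, context);
        if (v->first != NULL)
            group_for_each(v, action, context);
    }
}

/* Example action: shift a view by the same distance as its siblings. */
typedef struct { int dx, dy; } Delta;

void move_by(View *view, void *context)
{
    const Delta *d = (const Delta *)context;
    view->x += d->dx;
    view->y += d->dy;
}
```

Calling group_for_each(&desktop, move_by, &delta) moves every owned view by the
same amount, which is the "move all the parts at once" case. Note that the
group the call starts from is not itself passed to the action, only the
objects it owns.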


Z Law and Z Order


Structurally, that's the greater part of understanding Turbo Vision: the
inserting and deleting of groups into your desktop, and the creation of groups
to be windows of various kinds.
There's another subtlety, however, involving the way that groups are displayed
on the screen. Views can plainly overlap--create two mortgage windows in HCALC
and drag them over one another--and something has to dictate which one is on
the top. That something is called Z order, and it's a concept you have to
master before trying to put a group together.
We're used to thinking of the screen as a Cartesian plane with X and Y
coordinates. The notion of depth is a foreign one, especially in text mode
screens such as the one used by Turbo Vision. TV, however, adds the dimension
of depth to the text screen. In a sense, it provides a third axis--the Z axis,
coming after X and Y--to define that dimension of depth. When two views
overlap on the screen, one is "underneath" another--hence the depth dimension.
Again, look to TApplication for a simple example. When an application object
first runs, it only owns three things by default: the desktop, the status
line, and the menu bar. The desktop is "on the bottom" and both of the other
views sit on top of it, partially obscuring it.
The first view inserted into a group is the view "behind" all the other views
inserted later. By this rule, the desktop is inserted into the application
object before both the menu bar and the status line. If you inserted the menu
bar and status line first and then inserted the desktop, neither the menu bar
nor the status line would be visible--because the desktop would be in front of
both and block them from view!
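A small C sketch makes the rule concrete. It assumes, purely for illustration,
that a group paints its views in insertion order onto a tiny character grid, so
that later views overwrite earlier ones wherever they overlap. The names
(View, Group, group_insert, group_draw, SCR_W, SCR_H) are hypothetical and are
not taken from Turbo Vision.

```c
#include <assert.h>
#include <string.h>

#define SCR_W 8        /* toy screen width  */
#define SCR_H 4        /* toy screen height */
#define MAX_VIEWS 8

typedef struct {
    int  x, y, w, h;   /* rectangle the view covers      */
    char fill;         /* character the view paints with */
} View;

typedef struct {
    View views[MAX_VIEWS];  /* index 0 = first inserted = bottom */
    int  count;
} Group;

/* Insert a view on top of everything inserted so far.
   Returns 1 on success, 0 if the group is full. */
int group_insert(Group *g, View v)
{
    assert(g != NULL);
    if (g->count >= MAX_VIEWS)
        return 0;
    g->views[g->count++] = v;
    return 1;
}

/* Paint bottom to top: each later view overwrites the ones beneath it. */
void group_draw(const Group *g, char screen[SCR_H][SCR_W])
{
    int i, x, y;
    memset(screen, ' ', SCR_H * SCR_W);
    for (i = 0; i < g->count; i++) {
        const View *v = &g->views[i];
        for (y = v->y; y < v->y + v->h && y < SCR_H; y++)
            for (x = v->x; x < v->x + v->w && x < SCR_W; x++)
                if (x >= 0 && y >= 0)
                    screen[y][x] = v->fill;
    }
}
```

Insert a full-screen desktop first and a one-line menu bar second, and the top
row shows the menu bar. Reverse the order and the desktop paints last, covering
the menu bar completely.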
You can assign a numerical correspondence to Z order. The first view inserted
is #1, and the number increases as you stack views one atop another. (This
becomes explicit when you apply numbers to open windows that overlap: Until
you start swapping them around, the window with the highest number is the
window on the top of the heap.) The component parts of a window illustrate Z
order fairly well. Consider Figure 2.
A Turbo Vision window is a group. Most windows you will encounter consist of
at least two objects: a frame object and a pane object. Many windows also have
a scroll bar object, as shown here. (Some windows have two scroll bar objects.
A window can also have more than one pane.) The figure shows the Z order in
terms of looking "down" into the text screen from above. The first object to
be encountered is the pane object. Beneath the pane is the scroll bar, and
beneath everything is the frame.
The pane is set up to be one character position smaller than the frame in both
X and Y. This is why the pane doesn't hide the frame even though the pane is
"above" the frame in Z order. The scroll bar, however, does overlap the right
edge of the frame and hides that edge from view.
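The geometry is easy to check with a few lines of C. This sketch assumes that
"one character position smaller" means the pane is inset by one cell on every
side of the frame, and that the scroll bar sits in the frame's rightmost
column between the corners. The Rect type and the rect_inset and rect_contains
functions are hypothetical helpers, not Turbo Vision calls.

```c
#include <assert.h>

typedef struct { int x, y, w, h; } Rect;

/* Shrink a rectangle by n cells on every side. */
Rect rect_inset(Rect r, int n)
{
    Rect out;
    assert(n >= 0 && r.w >= 2 * n && r.h >= 2 * n);
    out.x = r.x + n;
    out.y = r.y + n;
    out.w = r.w - 2 * n;
    out.h = r.h - 2 * n;
    return out;
}

/* Nonzero if cell (x, y) lies inside the rectangle. */
int rect_contains(Rect r, int x, int y)
{
    return x >= r.x && x < r.x + r.w &&
           y >= r.y && y < r.y + r.h;
}
```

With a 20-by-10 frame at the origin, rect_inset(frame, 1) gives an 18-by-8 pane
starting at (1,1). No border cell of the frame falls inside the pane, so the
frame stays visible even though the pane is above it. A scroll bar rectangle
placed in column 19 does contain the frame's right-edge cells, which is why
that edge disappears.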
Z order is set initially by the order that objects are inserted into a group.
The first object inserted is on the bottom, and all subsequent objects should
be thought of as in layers, with the last object inserted into the group on
the top. There are some circumstances under which the Z order can change, but
this usually involves the order of opened windows on the desktop. The
component parts of a window have one Z order, established at compile time,
that never changes.
If you ever try to put a group of views together and some of the views don't
appear on the screen, check your Z order. You may have unwittingly put one of
the views behind another, larger view, totally obscuring the "missing" view.
Remember: First view inserted is on the bottom, last view inserted is on the
top.


Views and Their Children


One of the criticisms I have of Turbo Vision is that it has no easily
graspable set of Big Principles--you might say it's a collection of exceptions
that totally overwhelms its rules. This is especially evident when you look
closely at TV's family of views and how they have to be used. Some views need
to be subclassed to use them, and some may be used as is. Some views must not
be subclassed--and there's really no way to keep it all straight except to
remember a whole raft of special rules. Let's run down the list to get
oriented.
TView is one of those "abstract classes" that isn't intended to be
instantiated and used. It contains all the essential common characteristics of
a view, and you subclass it to make a specific kind of view.
In fact, most of the time you won't even directly subclass TView, but will
instead subclass one of TView's more specific children. TView is the ancestor
class of all views, and Turbo Vision provides numerous child classes that you
either use as is or subclass further.
TGroup is technically a view (because it is a child class of TView) but you
should think of groups as special cases among views. A group has no screen
presence of its own (as mentioned earlier) but instead is a group of views
bound together into a linked list. As with TView, TGroup is an abstract class
that isn't useful in and of itself. You create custom groups that do the work
you require by subclassing TGroup. Most of the Turbo Vision components you'll
be using frequently are groups. TApplication, TWindow (as well as their
subclasses) and TDialog are all groups.
TDesktop is a group, owned by TApplication. You don't typically instantiate
TDesktop, because TApplication does it for you, and the only reason you would
subclass TDesktop is to create some arcane variant of the standard
desktop--which I promise you is advanced Turbo Vision and not something to be
approached lightly.
TBackground (a view) provides the textured background displayed behind all your
other windows in a TV application. It's not good for much other than that, and
in most TV work you'll neither use it directly (the TDesktop class controls
the one you see) nor subclass it.
TWindow (a group) will act as the parent type for about half of the objects
you'll end up creating under TV. (The other half will be dialog boxes and
everything else.) TWindow is another abstract class that serves no use if
instantiated directly. You have to subclass it and flesh out the subclass with
some meat. In HCALC.PAS, for example, I created the TMortgageView window type
by subclassing TWindow. The child class is given a mortgage object and a
mortgage-specific constructor (among other things) to enable it to display a
mortgage table on the screen.
But what about scroll bars and interior panes? You need to remember that the
TWindow type is a group--and you flesh it out in part by giving it new fields
and methods, but also in part by inserting subviews (such as scroll bars and
panes) into it by using the Insert method. The key to knowing which way to
flesh out a window view is to insert things that are also views or
groups--like the panes and scroll bars--and add anything else (that is,
ordinary types, mute objects, and sub-programs) as fields and methods. That's
why TMortgageView gets a TMortgage field-- TMortgage is a mute object, not a
view or a group.
TDialog is a child class of TWindow (and hence a group) but it is handled in a
radically different way from the typical TWindow subclasses you'll create.
(Ahh, the grand confusion of it all!) A dialog box is a special kind of window
that asks questions of the user, and based on the answers to those questions,
carries back some important information to the application that owns it.
For reasons I'll explain shortly, you never subclass TDialog. You use it as it
is, and flesh out the provided TDialog class by inserting views or groups into
it. Mostly, what you insert are called controls: buttons, text entry fields,
or check boxes, and other user-action input devices.
Dialog boxes differ from windows in a number of ways. One is that the size of
a dialog box is set when its constructor is called, and cannot be changed
thereafter. That is, you cannot zoom or resize a dialog box.


Resources


There are other more technical differences as well. But the biggest difference
between windows and dialog boxes is that dialog boxes belong to a special
class of objects called resources. A resource is a standard Turbo Vision
object with a standard, random-access way of being written to or read from a
special Turbo Vision stream. (Streams are the canonical way of storing Turbo
Pascal objects in files -- and yet another column.) If you have
frequently used "standard parts" in your programs, you can store them in a
resource file on disk and not have to initialize them when the application
using them runs. Instead, you simply read them off the resource stream, whole
and intact and ready to use.
You can arrange to get your custom objects onto or off of streams by providing
custom code within the objects. With resources, there's no arranging to be
done. All the Turbo Vision standard types may be used as resources, because
the standard types know how to write themselves to and read themselves from streams. And
that's why you never subclass a dialog box. Subclass it, and the subclass
won't be able to work with TV's stream I/O system. It won't be "standard"
anymore.
For similar reasons, you shouldn't try to insert a nonstandard view or group
into a dialog box. Stick with the TButton, TCheckBoxes, TRadioButtons, and
other standard controls if you intend to make your dialog box into a resource
stored in a resource file.
HCALC doesn't use resources, and its one dialog box is instantiated at
run-time, every time the program is run. (I saved a considerable amount of
code by using a nonstandard control in my dialog box, preventing me from
easily considering the dialog box a resource.) However, once the dialog box is
created, it is "tethered" by a global pointer variable and the same dialog box
may be used again and again for the duration of the application session. If
you're curious to see how this is done, look at the code implementing the
constructor for the HCALC application object, THouseCalcApp.Init.


The Turbo Vision Development Toolkit


This is a good place to jump in and mention a new product from Blaise
Computing, the Turbo Vision Development Toolkit. The greater part of the TVDT
is in fact a resource editing and management system for Turbo Vision.
The idea is that you can create standard dialog boxes and other elements that
may be used by any number of Turbo Vision applications, just by reading them
from their resource file. The TVDT gives you interactive design tools for
creating dialog boxes, menus, and strings, along with all the machinery you'll
need to quickly read them from disk.
The difference between using the TVDT and doing it by hand is astonishing. For
one thing, setting up a dialog box in a program requires code to initialize
the box in the right size, insert all the controls, etc., etc. If you do that
in a resource editor, you can carve all the setup code out of your apps, and
bring in your resources from the resource file with a single easily
documentable Pascal statement.
But mostly, futzing dialog boxes by hand is a miserable, trial-and-error kind
of ordeal in which you set up a rectangle, set up the coordinates for the
various controls, then compile and run the app to see what sort of mess you've
made. With the TVDT, you just draw the dialog box on the screen, drag controls
to their correct positions, then save it all out to a resource file when it
looks the way you want it.
The TVDT has one additional trick up its sleeve: It can convert a Turbo Vision
resource to a Microsoft Windows resource -- which, of course, means a Turbo
Pascal for Windows resource. So while it's not practical to create a single
application source file that serves both Turbo Pascal platforms, you can at
least automatically convert a Turbo Vision resource to a TPW resource, and
thus the more elements of your TV system you place in a resource file, the
more easily you'll be able to port a TV application to TPW.

The TVDT manual is clear and complete, and the software hasn't revealed any
important bugs. The product is unique, and if you intend to do any work at all
in Turbo Vision, it's just plain essential.


Partway There


I'll return to resources in a later column. This month I've been able to give
you an overview of Turbo Vision structure. We're about a quarter of the way
there. The best and worst of TV is tied up in how it operates behind the
scenes: its event-driven programming mode. We'll take a hacksaw to the corner
of the black box next column, and see what we can see. I hate to say it, but
it won't be all that much -- you may find (as I'm finding) that programming in
Turbo Vision is as much an act of faith as it is a matter of skill.


Products Mentioned


The Turbo Vision Development Toolkit Blaise Computing Inc. 819 Bancroft Way
Berkeley, CA 94710 415-540-5441 $149.00


December, 1991
GRAPHICS PROGRAMMING


CATCHING UP




Michael Abrash


It's been nigh on a year now since I started this column, and it's time to
catch up on some interesting odds and ends that I've been unsuccessfully trying
to squeeze in for the better part of that time. No, I haven't forgotten that I
said I'd start in on 3-D animation this month, but it's been put off until
next month. Now, calm down; I know I promised, it's just that these things
take time to do properly; you wouldn't want me to start off prematurely and
end up with, God forbid, slow 3-D animation, would you? Don't send nasty
letters; I'll get to it next month, honest I will.
The funny thing of it is, 3-D perspective drawing is basically pretty easy.
Shading (at least in its simpler aspects) is relatively easy, too. Even hidden
surface handling isn't bad--given lots of memory and processor power. The
memory is easy enough to come by in 386 protected mode, but the processor
power isn't. We're talking major silicon here, Intel i860s or, better yet,
Crays; 386s don't cut the mustard, so some sleight of hand is in order. Fixed
point arithmetic can replace floating point. Table look-ups can stand in for
excruciatingly slow sine and cosine functions. Fast, approximate antialiasing
can be used instead of precise but slow techniques. The real trick, though, is
some combination of restrictions and techniques that allows real-time hidden
surface removal. There are at least a dozen ways to remove hidden surfaces,
but the fast ones don't always work, and the ones that always work are slow.
Workstations deal with this by using dedicated hardware to perform
z-buffering, or by applying MIPS by the bushel to sorting, or the like.
Because we can't do any of that, I'm still pondering the best way to wring
real-time hidden surface handling out of a 386.
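To make the first two substitutions concrete, here is a minimal sketch of 16.16 fixed-point multiplication and a table-driven sine. It is written in present-day C (it relies on the 64-bit type from stdint.h, which a 1991 compiler would not have), and the names fixed_t, fixed_mul, fixed_sin, and fixed_cos are hypothetical. The eight-entry table in 45-degree steps is an assumption chosen to keep the sketch short; a real table would be much finer.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_t;                 /* 16.16 fixed point */
#define FIXED_ONE ((fixed_t)65536)       /* 1.0 in 16.16 format */

/* Convert a small integer to 16.16; larger values would overflow 32 bits. */
fixed_t int_to_fixed(int n)
{
    assert(n > -32768 && n < 32768);
    return (fixed_t)n * FIXED_ONE;
}

/* Multiply two 16.16 values. The 64-bit intermediate holds the full product,
   which is then scaled back down to 16.16 (truncating toward zero). */
fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * (int64_t)b) / FIXED_ONE);
}

/* Sine of 0, 45, 90, ... 315 degrees in 16.16 format; 46341 is the
   rounded value of 0.70710678 * 65536. */
static const fixed_t sine_table[8] = {
    0, 46341, 65536, 46341, 0, -46341, -65536, -46341
};

/* Angle is given in 45-degree steps and wraps every 8 steps. */
fixed_t fixed_sin(unsigned step) { return sine_table[step & 7]; }

/* cos(a) = sin(a + 90 degrees), which is two table steps further on. */
fixed_t fixed_cos(unsigned step) { return sine_table[(step + 2) & 7]; }
```

Scaling a coordinate by a sine then costs one table index and one integer multiply, with no floating-point call anywhere.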
So, this month, catching up; next month, 3-D animation. If I bug out again
next month, then write nasty letters. At least I'll know you care.


Nomenclature Blues


Bill Huber wrote to take me to task--and a well-deserved kick in the fanny it
was, I might add--for my use of non-standard terminology in describing polygons
in the February, March, and June columns. The X-Window System defines three
categories of polygons: complex, nonconvex, and convex. These three
categories, each a specialized subset of the preceding category,
not-so-coincidentally map quite nicely to three increasingly fast polygon
filling techniques. Therefore, I used the XWS names to describe the sorts of
polygons that can be drawn with each of the polygon filling techniques.
The problem is that those names don't accurately describe all the sorts of
polygons that the techniques are capable of drawing. Convex polygons are those
for which no interior angle is greater than 180 degrees. The "convex" drawing
approach described in February and March actually handles a number of polygons
that are not convex; in fact, it can draw any polygon through which no
horizontal line can be drawn that intersects the boundary more than twice. (In
other words, the boundary reverses Y direction exactly twice, disregarding
polygons that have degenerated into horizontal lines, which I'm going to
ignore.) Bill was kind enough to send me the pages out of Computational
Geometry, An Introduction (Springer-Verlag, 1988) that describe the correct
terminology; such polygons are, in fact, "monotone with respect to a vertical
line" (which unfortunately makes a rather long #define variable). Actually, to
be a tad more precise, I'd call them "monotone with respect to a vertical line
and simple," where "simple" means "not self-intersecting." Similarly, the
polygon type I called "nonconvex" is actually "simple," and I suppose what I
called "complex" should be referred to as "nonsimple," or maybe just "none of
the above."
This may seem like nit-picking, but actually, it isn't; what it's really about
is the tremendous importance of having a shared language. In one of his books,
Richard Feynman describes having developed his own mathematical framework,
complete with his own notation and terminology, in high school. When he got to
college and started working with other people who were at his level, he
suddenly understood that people can't share ideas effectively unless they
speak the same language; otherwise, they waste a great deal of time on
misunderstandings and explanation. Or, as Bill Huber put it, "You are free to
adopt your own terminology when it suits your purposes well. But you risk
losing or confusing those who could be among your most astute readers--those
who already have been trained in the same or a related field." Ditto.
Likewise. D'accord. And mea culpa; I shall endeavor to watch my language in
the future.


Nomenclature in Action


Just to show you how much difference proper description and interchange of
ideas can make, consider the case of identifying convex polygons. Several months
back, a nonfunctional method for identifying such polygons--checking for
exactly two X direction changes and two Y direction changes around the
perimeter of the polygon--crept into this column by accident. That method, as
I have since noted, does not work. Still, a fast method of checking for convex
polygons would be highly desirable, because such polygons can be drawn with
the fast code from the March column, rather than the relatively slow,
general-purpose code from the June column.
Now consider Bill's point that we're not limited to drawing convex polygons in
our "convex fill" code, but can actually handle any simple polygon that's
monotone with respect to a vertical line. Additionally, consider Anton
Treuenfels's point, made back in the August column, that life gets simpler if
we stop worrying about which edge of a polygon is the left edge and which is
the right, and instead just scan out each raster line starting at whichever
edge is left-most. Now, what do we have?
What we have is an approach passed along by Jim Kent, of Autodesk Animator
fame. If we modify the low-level code to check which edge is left-most on each
scan line and start drawing there, as just described, then we can handle any
polygon that's monotone with respect to a vertical line regardless of whether
the edges cross. (I'll call this "monotone-vertical" from now on; if anyone
wants to correct that terminology, jump right in.) In other words, we can then
handle nonsimple polygons that are monotone-vertical; self-intersection is no
longer a problem. Then, we just scan around the polygon's perimeter looking
for exactly two direction reversals along the Y axis only, and if that proves
to be the case, we can handle the polygon at high speed. Figure 1 shows
polygons that can be drawn by a monotone-vertical capable filler; Figure 2
shows some that cannot.
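The reversal count can be tried on a few shapes without the rest of the polygon code. The following is a self-contained sketch of the idea, not a replacement for Listing One; struct Pt, sign_of, and is_monotone_vertical are hypothetical names, and the sketch assumes the vertex list is closed implicitly (the last vertex connects back to the first).

```c
#include <assert.h>

struct Pt { int x, y; };   /* hypothetical; the listings use struct Point */

/* Returns -1, 0, or 1 according to the sign of v. */
int sign_of(int v) { return (v > 0) - (v < 0); }

/* Returns 1 if the boundary reverses Y direction at most twice in one
   full trip around the perimeter, 0 otherwise. Horizontal edges are
   ignored, and self-intersection does not matter. */
int is_monotone_vertical(const struct Pt *v, int n)
{
    int i, prev = 0, reversals = 0;

    assert(n >= 0);
    if (n < 4) return 1;   /* too few points to reverse more than twice */

    /* Start from the direction of the last non-horizontal edge, so the
       comparison at the first edge wraps around the end of the list. */
    for (i = n - 1; i >= 0 && prev == 0; i--)
        prev = sign_of(v[(i + 1) % n].y - v[i].y);
    if (prev == 0) return 1;   /* every edge is horizontal */

    for (i = 0; i < n; i++) {
        int s = sign_of(v[(i + 1) % n].y - v[i].y);
        if (s != 0 && s != prev) {
            if (++reversals > 2) return 0;
            prev = s;
        }
    }
    return 1;
}
```

A diamond passes, and so does a bow tie whose edges cross, because the boundary still goes down once and up once. A shape with two separate peaks along its top fails, since the boundary changes Y direction more than twice.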
Listing One (page 149) shows code to test whether a polygon is appropriately
monotone. This test lends itself beautifully to assembly language
implementation, because it's basically nothing but pointers and conditionals;
unfortunately, I don't have room for an assembly version this month, but
translation from Listing One is pretty straight-forward, should you care to do
so yourself. Listings Two and Three (page 149) are variants of the fast convex
polygon fill code from March, modified to be able to handle all
monotone-vertical polygons, including nonsimple ones; the edge-scanning code
(Listing Four from March) remains the same, and so is not shown here. Listing
Four (page 150) shows the changes needed to convert Listing One from June to
employ the vertical-monotone detection test and use the fast vertical-monotone
drawing code whenever possible; note that Listing Five from June is also
required in order for this code to link. Listing Five (page 150) this month is
the latest version of the polygon.h header file.
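The one change the low-level drawing code needs, starting each scan line at whichever edge is left-most, comes down to a swap before the fill. Here is a minimal sketch of that step in C, drawing into an ordinary byte array instead of display memory. The names Span and fill_span and the 16-pixel row are hypothetical, and the sketch assumes the larger of the two coordinates is already in +1 format (one past the last pixel to draw).

```c
#include <assert.h>
#include <string.h>

#define SKETCH_WIDTH 16   /* hypothetical width of the one-row "screen" */

struct Span { int x_start, x_end; };   /* stand-in for struct HLine */

/* Fills one scan line, accepting the two edge coordinates in either
   order. The larger coordinate is treated as one past the last pixel.
   Returns the number of pixels drawn (0 when the coordinates are equal). */
int fill_span(unsigned char *row, struct Span s, unsigned char color)
{
    int left = s.x_start, right = s.x_end;

    if (left > right) {            /* edges crossed: swap them */
        int t = left;
        left = right;
        right = t;
    }
    assert(left >= 0 && right <= SKETCH_WIDTH);
    if (right > left)
        memset(row + left, color, (size_t)(right - left));
    return right - left;
}
```

Because the swap happens per scan line, the edge-scanning code never needs to know which edge is "really" the left one, and a polygon whose edges cross partway down is filled correctly on both sides of the crossing.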
Is monotone-vertical polygon detection worth all this trouble? Under the right
circumstances, you bet. In a situation where a great many polygons are being
drawn, and the application either doesn't know whether they're
monotone-vertical or has no way to tell the polygon filler that they are,
performance can be increased considerably if most polygons are, in fact,
monotone-vertical. This potential performance advantage is helped along by the
surprising fact that Jim's test for monotone-vertical status is simpler and
faster than my original, nonfunctional test for convexity.
See what accurate terminology and effective communication can do?


Graphics Debugging with Turbo Debugger


I don't know what debugger you use, but I'd be willing to wager that more of
you than not use the same one I do, Turbo Debugger. It's powerful, the
interface is good (although arguably it lost some ease of use in the
transition to mouse support and CUA compatibility), and it's obviously the
debugger of choice if you're using a Borland compiler. I would be remiss,
however, if I failed to warn you of a serious problem with TD when it comes to
debugging graphics programs.
The problem, simply put, is that TD mucks about with the VGA's registers when
it gets control, even when you're running TD on a monochrome screen and your
app on a color screen, courtesy of the -do switch. Although TD has no business
fooling around with the VGA's registers in that case, seeing as how it's not
using the color screen, that's not the problem; the problem is that when TD
resumes execution of the app, it doesn't put all the VGA's registers back the
way it found them. Given that the VGA's registers are readable, this
interference is unnecessary; it's also unfortunate, because it makes it
impossible to debug some kinds of graphics with TD without a second computer
available.
What sorts of graphics can't TD debug? Page flipping, for one; when it gets
control, TD seems to force the start address of the displayed portion of the
bitmap back to 0, thereby displaying page 0 whether you like it or not.
Setting VGA registers manually via the I/O feature in the CPU window is also
interfered with; often, hand-entered register settings just don't stick. Mode
X (320x240, 256 colors, as discussed in the July through September columns) is
messed up quite royally at times; some of the nonstandard register settings
required to create mode X are apparently undone. There may be other problems,
but those I've mentioned are enough to limit TD's ability to debug many sorts
of graphics, especially animation.
Example 1: The proper sequence for setting write mode 1

 mov dx,3ceh ;Graphics Controller Index
 mov al,5 ;Graphics Mode reg index
 out dx,al ;point GC index to G_MODE
 inc dx ;Graphics Controller Data
 in al,dx ;get current mode setting
 and al,not 3 ;mask off write mode field
 or al,1 ;set write mode field to 1
 out dx,al ;set write mode 1

There is a solution, as it happens: Get another computer, run a serial link
from your main system to the second computer, and debug your programs running
on that system via TDREMOTE. When running in remote mode, TD doesn't mess with
any registers, and becomes an ideal graphics debugging tool. The downside, of
course, is that you have to have a second computer; also TDREMOTE is slower in
almost every respect than TD.


Hi-Res VGA Page Flipping



This is one of those odd little items that might come in handy someday. The
background is this: On a standard VGA, hi-res mode is mode 12h, which offers
640x480 resolution with 16 colors. That's a nice mode, with plenty of pixels,
and square ones at that, but it lacks one thing--page flipping. The problem is
that the mode 12h bitmap is 150 Kbytes in size, and the VGA has only 256
Kbytes total, too little memory for two of those monster mode 12h pages. With
only one page, flipping is obviously out of the question, and without page
flipping, top-flight, hi-res animation can't be implemented. The standard
fallback is to use the EGA's hi-res mode, mode 10h (640x350, 16 colors) for
page flipping, but this mode is less than ideal for a couple of reasons: It
offers sharply lower vertical resolution, and it's lousy for handling
scaled-up CGA graphics, because the vertical resolution is a fractional
multiple--1.75 times, to be exact--of that of the CGA. CGA resolution may not
seem important these days, but many images were originally created for the
CGA, as were many graphics packages and games, and it's at least convenient to
be able to handle CGA graphics easily.
There are a couple of interesting, if imperfect, solutions to the problem of
hi-res page flipping. One is to use the split screen to enable page flipping
only in the top two-thirds of the screen; see "VGA Split-Screen Animation," in
the June, 1991 issue of PC Techniques, for details (and for details on the
mechanics of page flipping, as well). This doesn't address the CGA problem,
but it does yield square pixels and a full 640x480 screen resolution, although
not all those pixels are flippable.
A second solution is to program the screen to a 640x400 mode. Such a mode uses
almost every byte of display memory (64,000 bytes, actually; you could add
another few lines, if you really wanted to), and thereby provides the highest
resolution possible on the VGA for a fully page-flipped display. It maps well
to CGA resolutions, being either identical or double in both dimensions. As an
added benefit, it offers an easy-on-the-eyes 70-Hz frame rate, as opposed to
the 60 Hz that is the best that mode 12h can offer, due to the design of
standard VGA monitors. Best of all, perhaps, is that 640x400 16-color mode is
easy to set up.
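The memory arithmetic behind these choices is worth checking by hand. The sketch below assumes a standard 256-Kbyte VGA and a 16-color planar mode, in which each pixel takes one bit in each of four planes; the function names page_bytes and pages_that_fit are hypothetical.

```c
#include <assert.h>

#define VGA_MEMORY_BYTES (256L * 1024L)   /* 262,144 bytes on a standard VGA */
#define VGA_PLANES 4                      /* 16 colors = 4 bits per pixel */

/* Bytes needed for one page of a 16-color planar mode: one bit per
   pixel per plane, so width/8 bytes per scan line in each plane. */
long page_bytes(long width, long height)
{
    assert(width > 0 && height > 0 && width % 8 == 0);
    return (width / 8) * height * VGA_PLANES;
}

/* Number of whole pages of the given size that fit in display memory. */
int pages_that_fit(long width, long height)
{
    return (int)(VGA_MEMORY_BYTES / page_bytes(width, height));
}
```

A 640x480 page needs 153,600 bytes, so only one fits. A 640x400 page needs 128,000 bytes and a 640x350 page needs 112,000 bytes, so two of either fit, which is what page flipping requires.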
The key to 640x400 mode is understanding that on a VGA, mode 10h (640x350) is,
at heart, a 400-scan-line mode. What I mean by that is that in mode 10h, the
Vertical Total register, which controls the total number of scan lines, both
displayed and nondisplayed, is set to 447, exactly the same as in the VGA's
text modes, which do in fact support 400 scan lines. A properly sized and
centered display is achieved in mode 10h by setting the polarity of the sync
pulses to tell the monitor to scan vertically at a faster rate (to make fewer
lines fill the screen), by starting the overscan after 350 lines, and by
setting the vertical sync and blanking pulses appropriately for the faster
vertical scanning rate. Changing those settings is all that's required to turn
mode 10h into a 640x400 mode, and that's easy to do, as illustrated by Listing
Six (page 150), which provides mode set code for 640x400 mode.
In 640x400, 16-color mode, page 0 runs from offset 0 to offset 31,999 (7CFFh),
and page 1 runs from offset 32,000 (7D00h) to 63,999 (0F9FFh). Page 1 is
selected by programming the Start Address registers (CRTC registers 0Ch, the
high 8 bits, and 0Dh, the low 8 bits) to 7D00h. Actually, because the low byte
of the start address is 0 for both pages, you can page flip simply by writing
0 or 7Dh to the Start Address High register (CRTC register 0Ch); this has the
benefit of eliminating a nasty class of potential synchronization bugs that
can arise when both registers must be set. Listing Seven (page 150)
illustrates simple 640x400 page flipping.
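The single-register flip can be sketched in C as follows. This is only an outline of the idea, not the code from Listing Seven: the port-writing routine is passed in as a callback because port I/O is compiler-specific, and the names page_start_offset, start_address_high, and flip_to_page are hypothetical. The sketch assumes a color mode, where the CRTC index and data ports are at 3D4h and 3D5h.

```c
#include <assert.h>

#define CRTC_INDEX_PORT    0x3D4    /* CRTC index port in color modes */
#define CRTC_DATA_PORT     0x3D5    /* CRTC data port in color modes */
#define START_ADDRESS_HIGH 0x0C     /* CRTC register holding the high 8 bits */
#define PAGE_LENGTH        32000u   /* bytes per plane per 640x400 page */

/* Offset in display memory at which the given page (0 or 1) starts. */
unsigned page_start_offset(int page)
{
    assert(page == 0 || page == 1);
    return page ? PAGE_LENGTH : 0u;
}

/* High byte of the page's start address: 00h for page 0, 7Dh for page 1.
   The low byte is 0 for both pages, so it never has to be rewritten. */
unsigned char start_address_high(int page)
{
    return (unsigned char)(page_start_offset(page) >> 8);
}

/* Caller-supplied routine that writes one byte to an I/O port. */
typedef void (*port_write_fn)(unsigned port, unsigned char value);

/* Flips to the given page by writing only the Start Address High register. */
void flip_to_page(int page, port_write_fn out_byte)
{
    out_byte(CRTC_INDEX_PORT, START_ADDRESS_HIGH);
    out_byte(CRTC_DATA_PORT, start_address_high(page));
}
```

With a DOS compiler, the callback would typically wrap that compiler's port-output function. Because only one register changes, there is no window in which the high and low bytes disagree.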
The 640x400 mode isn't exactly earthshaking, but it can come in handy for page
flipping and CGA emulation, and I'm sure that some of you will find it useful
at one time or another.


Modifying VGA Registers


EGA registers are not readable. VGA registers are readable. This revelation
will not come as news to most of you, but many programmers still insist on
setting entire VGA registers even when they're modifying only selected bits,
as if they were programming the EGA. This comes to mind because I recently
received a query inquiring why write mode 1 (in which the contents of the
latches are copied directly to display memory) didn't work in mode X.
Actually, write mode 1 does work in mode X; it didn't work when this
particular correspondent enabled it because he did so by writing the value 01h
to the Graphics Mode register. As it happens, the write mode field is only one
of several fields in that register, as shown in Figure 3. In 256-color modes,
one of the other fields--bit 6, which enables 256-color pixel formatting--is
not 0, and setting it to 0 messes the screen up quite thoroughly.
The correct way to set a field within a VGA register is, of course, to read
the register, mask off the desired field, insert the desired setting, and
write the result back to the register. In the case of setting the VGA to write
mode 1, consult Example 1.
This approach is more of a nuisance than simply setting the whole register,
but it's safer. It's also slower; for cases where you must set a field
repeatedly, it might be worthwhile to read and mask the register once at the
start, and save it in a variable, so that the value is readily available in
memory and need not be repeatedly read from the port. This approach is
especially attractive because INs are much slower than memory accesses on 386
and 486 machines.
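Both the read-modify-write sequence and the cached variant reduce to a little bit arithmetic, sketched here in C with the port accesses left out so that only the masking is shown. The names set_write_mode, ModeShadow, shadow_capture, and shadow_with_write_mode are hypothetical, and the value 40h used in the note afterward is an assumption about what the Graphics Mode register would read back in a 256-color mode with write mode 0 and read mode 0.

```c
#include <assert.h>

#define WRITE_MODE_MASK 0x03   /* write mode field: bits 0 and 1 */
#define SHIFT_256_BIT   0x40   /* bit 6: 256-color pixel formatting */

/* Returns the register value with only the write mode field replaced;
   every other field is carried over from the value that was read. */
unsigned char set_write_mode(unsigned char graphics_mode,
                             unsigned char write_mode)
{
    assert(write_mode <= 3);
    return (unsigned char)((graphics_mode & ~WRITE_MODE_MASK) | write_mode);
}

/* Cached copy of the register with the write mode field cleared, read
   and masked once so later mode changes need no IN instruction. */
struct ModeShadow { unsigned char other_bits; };

struct ModeShadow shadow_capture(unsigned char graphics_mode)
{
    struct ModeShadow s;
    s.other_bits = (unsigned char)(graphics_mode & ~WRITE_MODE_MASK);
    return s;
}

/* Combines the cached bits with a new write mode, ready to be written. */
unsigned char shadow_with_write_mode(struct ModeShadow s,
                                     unsigned char write_mode)
{
    return (unsigned char)(s.other_bits | (write_mode & WRITE_MODE_MASK));
}
```

If the register reads back 40h, the masked update yields 41h and bit 6 survives. Writing a bare 01h instead clears bit 6, which is the failure described above. The cached form is only safe as long as nothing else changes the register behind the program's back.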
Astute readers may wonder why I didn't put a delay sequence, such as JMP $+2,
between the IN and OUT involving the same register. There are, after all,
guidelines from IBM, specifying that a certain period should be allowed to
elapse before a second access to an I/O port is attempted, because not all
devices can respond as rapidly as a 286 or faster chip can access a port. My
answer is that while I can't guarantee that a delay isn't needed, I've never
found a VGA that required one; I suspect that the delay specification has more
to do with motherboard chips such as the timer, the interrupt controller, and
the like, and I sure hate to waste the delay time if it's not necessary.
However, I've never been able to find anyone with the definitive word on
whether delays might ever be needed when accessing VGAs, so if you know the
gospel truth, or if you know of a VGA/processor combo that does require
delays, please drop me a line. You'd be doing a favor for a whole generation
of graphics programmers who aren't sure whether they're skating on thin ice
without delays.
_GRAPHICS PROGRAMMING COLUMN_
by Michael Abrash


[LISTING ONE]

/* Returns 1 if polygon described by passed-in vertex list is monotone with
respect to a vertical line, 0 otherwise. Doesn't matter if polygon is simple
(non-self-intersecting) or not. Tested with Borland C++ 2 in small model. */

#include "polygon.h"

#define SIGNUM(a) (((a)>0)?1:(((a)<0)?-1:0))

int PolygonIsMonotoneVertical(struct PointListHeader * VertexList)
{
    int i, Length, DeltaYSign, PreviousDeltaYSign;
    int NumYReversals = 0;
    struct Point *VertexPtr = VertexList->PointPtr;

    /* Three or fewer points can't make a non-vertical-monotone polygon */
    if ((Length=VertexList->Length) < 4) return(1);

    /* Scan to the first non-horizontal edge */
    PreviousDeltaYSign = SIGNUM(VertexPtr[Length-1].Y - VertexPtr[0].Y);
    i = 0;
    while ((PreviousDeltaYSign == 0) && (i < (Length-1))) {
        PreviousDeltaYSign = SIGNUM(VertexPtr[i].Y - VertexPtr[i+1].Y);
        i++;
    }

    if (i == (Length-1)) return(1); /* polygon is a flat line */

    /* Now count Y reversals. Might miss one reversal, at the last vertex, but
       because reversal counts must be even, being off by one isn't a problem.
       The loop stops after the edge that ends at the last vertex, so
       VertexPtr[i+1] never indexes past the end of the vertex list */
    do {
        if ((DeltaYSign = SIGNUM(VertexPtr[i].Y - VertexPtr[i+1].Y))
                != 0) {
            if (DeltaYSign != PreviousDeltaYSign) {
                /* Switched Y direction; not vertical-monotone if
                   reversed Y direction as many as three times */
                if (++NumYReversals > 2) return(0);
                PreviousDeltaYSign = DeltaYSign;
            }
        }
    } while (++i < (Length-1));
    return(1); /* it's a vertical-monotone polygon */
}





[LISTING TWO]

/* Color-fills a convex polygon. All vertices are offset by (XOffset, YOffset).
"Convex" means "monotone with respect to a vertical line"; that is, every
horizontal line drawn through the polygon at any point would cross exactly two
active edges (neither horizontal lines nor zero-length edges count as active
edges; both are acceptable anywhere in the polygon). Right & left edges may
cross (polygons may be nonsimple). Polygons that are not convex according to
this definition won't be drawn properly. (Yes, "convex" is a lousy name for
this type of polygon, but it's convenient; use "monotone-vertical" if it makes
you happier!)
*******************************************************************
NOTE: the low-level drawing routine, DrawHorizontalLineList, must be able to
reverse the edges, if necessary to make the correct edge left edge. It must
also expect right edge to be specified in +1 format (the X coordinate is 1 past
highest coordinate to draw). In both respects, this differs from low-level
drawing routines presented in earlier columns; changes are necessary to make it
possible to draw nonsimple monotone-vertical polygons; that in turn makes it
possible to use Jim Kent's test for monotone-vertical polygons.
*******************************************************************
Returns 1 for success, 0 if memory allocation failed */

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include "polygon.h"

/* Advances the index by one vertex forward through the vertex list,
wrapping at the end of the list */
#define INDEX_FORWARD(Index) \
 Index = (Index + 1) % VertexList->Length;

/* Advances the index by one vertex backward through the vertex list,
wrapping at the start of the list */
#define INDEX_BACKWARD(Index) \
 Index = (Index - 1 + VertexList->Length) % VertexList->Length;

/* Advances the index by one vertex either forward or backward through
the vertex list, wrapping at either end of the list */
#define INDEX_MOVE(Index,Direction) \
 if (Direction > 0) \
 Index = (Index + 1) % VertexList->Length; \
 else \
 Index = (Index - 1 + VertexList->Length) % VertexList->Length;

extern void ScanEdge(int, int, int, int, int, int, struct HLine **);
extern void DrawHorizontalLineList(struct HLineList *, int);

int FillMonotoneVerticalPolygon(struct PointListHeader * VertexList,
 int Color, int XOffset, int YOffset)
{
 int i, MinIndex, MaxIndex, MinPoint_Y, MaxPoint_Y;
 int NextIndex, CurrentIndex, PreviousIndex;
 struct HLineList WorkingHLineList;
 struct HLine *EdgePointPtr;
 struct Point *VertexPtr;

 /* Point to the vertex list */
 VertexPtr = VertexList->PointPtr;

 /* Scan the list to find the top and bottom of the polygon */
 if (VertexList->Length == 0)
 return(1); /* reject null polygons */
 MaxPoint_Y = MinPoint_Y = VertexPtr[MinIndex = MaxIndex = 0].Y;
 for (i = 1; i < VertexList->Length; i++) {
 if (VertexPtr[i].Y < MinPoint_Y)
 MinPoint_Y = VertexPtr[MinIndex = i].Y; /* new top */
 else if (VertexPtr[i].Y > MaxPoint_Y)
 MaxPoint_Y = VertexPtr[MaxIndex = i].Y; /* new bottom */
 }

 /* Set the # of scan lines in the polygon, skipping the bottom edge */
 if ((WorkingHLineList.Length = MaxPoint_Y - MinPoint_Y) <= 0)
 return(1); /* there's nothing to draw, so we're done */
 WorkingHLineList.YStart = YOffset + MinPoint_Y;

 /* Get memory in which to store the line list we generate */
 if ((WorkingHLineList.HLinePtr =
 (struct HLine *) (malloc(sizeof(struct HLine) *
 WorkingHLineList.Length))) == NULL)
 return(0); /* couldn't get memory for the line list */

 /* Scan the first edge and store the boundary points in the list */
 /* Initial pointer for storing scan converted first-edge coords */
 EdgePointPtr = WorkingHLineList.HLinePtr;
 /* Start from the top of the first edge */
 PreviousIndex = CurrentIndex = MinIndex;
 /* Scan convert each line in the first edge from top to bottom */
 do {
 INDEX_BACKWARD(CurrentIndex);
 ScanEdge(VertexPtr[PreviousIndex].X + XOffset,
 VertexPtr[PreviousIndex].Y,
 VertexPtr[CurrentIndex].X + XOffset,
 VertexPtr[CurrentIndex].Y, 1, 0, &EdgePointPtr);
 PreviousIndex = CurrentIndex;
 } while (CurrentIndex != MaxIndex);

 /* Scan the second edge and store the boundary points in the list */
 EdgePointPtr = WorkingHLineList.HLinePtr;
 PreviousIndex = CurrentIndex = MinIndex;
 /* Scan convert the second edge, top to bottom */
 do {
 INDEX_FORWARD(CurrentIndex);
 ScanEdge(VertexPtr[PreviousIndex].X + XOffset,
 VertexPtr[PreviousIndex].Y,
 VertexPtr[CurrentIndex].X + XOffset,
 VertexPtr[CurrentIndex].Y, 0, 0, &EdgePointPtr);
 PreviousIndex = CurrentIndex;

 } while (CurrentIndex != MaxIndex);

 /* Draw the line list representing the scan converted polygon */
 DrawHorizontalLineList(&WorkingHLineList, Color);

 /* Release the line list's memory and we're successfully done */
 free(WorkingHLineList.HLinePtr);
 return(1);
}





[LISTING THREE]

; Draws all pixels in list of horizontal lines passed in, in mode 13h, VGA's
; 320x200 256-color mode. Uses REP STOS to fill each line.
; ******************************************************************
; NOTE: is able to reverse the X coords for a scan line, if necessary to make
; XStart < XEnd. Expects whichever edge is rightmost on any scan line to be in
; +1 format; that is, XEnd is 1 greater than rightmost pixel to draw. if
; XStart == XEnd, nothing is drawn on that scan line.
; ******************************************************************
; C near-callable as:
; void DrawHorizontalLineList(struct HLineList * HLineListPtr, int Color);
; All assembly code tested with TASM 2.0 and MASM 5.0

SCREEN_WIDTH equ 320
SCREEN_SEGMENT equ 0a000h

HLine struc
XStart dw ? ;X coordinate of leftmost pixel in line
XEnd dw ? ;X coordinate of rightmost pixel in line
HLine ends

HLineList struc
Lngth dw ? ;# of horizontal lines
YStart dw ? ;Y coordinate of topmost line
HLinePtr dw ? ;pointer to list of horz lines
HLineList ends

Parms struc
 dw 2 dup(?) ;return address & pushed BP
HLineListPtr dw ? ;pointer to HLineList structure
Color dw ? ;color with which to fill
Parms ends
 .model small
 .code
 public _DrawHorizontalLineList
 align 2
_DrawHorizontalLineList proc
 push bp ;preserve caller's stack frame
 mov bp,sp ;point to our stack frame
 push si ;preserve caller's register variables
 push di
 cld ;make string instructions inc pointers

 mov ax,SCREEN_SEGMENT

 mov es,ax ;point ES to display memory for REP STOS

 mov si,[bp+HLineListPtr] ;point to the line list
 mov ax,SCREEN_WIDTH ;point to the start of the first scan
 mul [si+YStart] ; line in which to draw
 mov dx,ax ;ES:DX points to first scan line to draw
 mov bx,[si+HLinePtr] ;point to the XStart/XEnd descriptor
 ; for the first (top) horizontal line
 mov si,[si+Lngth] ;# of scan lines to draw
 and si,si ;are there any lines to draw?
 jz FillDone ;no, so we're done
 mov al,byte ptr [bp+Color] ;color with which to fill
 mov ah,al ;duplicate color for STOSW
FillLoop:
 mov di,[bx+XStart] ;left edge of fill on this line
 mov cx,[bx+XEnd] ;right edge of fill
 cmp di,cx ;is XStart > XEnd?
 jle NoSwap ;no, we're all set
 xchg di,cx ;yes, so swap edges
NoSwap:
 sub cx,di ;width of fill on this line
 jz LineFillDone ;skip if zero width
 add di,dx ;offset of left edge of fill
 test di,1 ;does fill start at an odd address?
 jz MainFill ;no
 stosb ;yes, draw the odd leading byte to
 ; word-align the rest of the fill
 dec cx ;count off the odd leading byte
 jz LineFillDone ;done if that was the only byte
MainFill:
 shr cx,1 ;# of words in fill
 rep stosw ;fill as many words as possible
 adc cx,cx ;1 if there's an odd trailing byte to
 ; do, 0 otherwise
 rep stosb ;fill any odd trailing byte
LineFillDone:
 add bx,size HLine ;point to the next line descriptor
 add dx,SCREEN_WIDTH ;point to the next scan line
 dec si ;count off lines to fill
 jnz FillLoop
FillDone:
 pop di ;restore caller's register variables
 pop si
 pop bp ;restore caller's stack frame
 ret
_DrawHorizontalLineList endp
 end
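
The fill rules stated in the header comment of Listing Three (swap the X
coordinates if needed, treat the right edge as one past the last pixel, draw
nothing when the two are equal) can also be sketched in portable code. This
is an illustrative sketch only, not part of the original listings:
FillHLineListToBuffer is a hypothetical name, the caller-supplied buffer
stands in for display memory at segment A000h, and, like the assembly
version, it does no clipping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const int SCREEN_WIDTH = 320;   // bytes per scan line in mode 13h

struct HLine { int XStart; int XEnd; };
struct HLineList { int Length; int YStart; HLine *HLinePtr; };

// Hypothetical stand-in for DrawHorizontalLineList: fills into Buffer
// (SCREEN_WIDTH bytes per row) instead of into VGA display memory.
void FillHLineListToBuffer(const HLineList *List, int Color,
                           unsigned char *Buffer)
{
    assert(List && Buffer);
    unsigned char *Row =
        Buffer + static_cast<std::size_t>(List->YStart) * SCREEN_WIDTH;
    for (int i = 0; i < List->Length; i++, Row += SCREEN_WIDTH) {
        int Left = List->HLinePtr[i].XStart;
        int Right = List->HLinePtr[i].XEnd;
        if (Left > Right) {             // reverse the X coords if necessary
            int Temp = Left; Left = Right; Right = Temp;
        }
        if (Right > Left)               // right edge is in +1 format
            std::memset(Row + Left, Color,
                        static_cast<std::size_t>(Right - Left));
    }
}
```

A buffer filled this way can be compared byte for byte against what the
assembly routine leaves in display memory when checking a port.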






[LISTING FOUR]

/*** Replace this... ***/
extern int FillConvexPolygon(struct PointListHeader *, int, int, int);

/*** ...with this... ***/

extern int FillMonotoneVerticalPolygon(struct PointListHeader *,
 int, int, int);
extern int PolygonIsMonotoneVertical(struct PointListHeader *);

/*** Replace this... ***/
#ifdef CONVEX_CODE_LINKED
 /* Pass convex polygons through to fast convex polygon filler */
 if (PolygonShape == CONVEX)
 return(FillConvexPolygon(VertexList, Color, XOffset, YOffset));
#endif

/*** ...with this... ***/
#ifdef CONVEX_CODE_LINKED
 /* Pass convex polygons through to fast convex polygon filler */
 if ((PolygonShape == CONVEX) ||
 PolygonIsMonotoneVertical(VertexList))
 return(FillMonotoneVerticalPolygon(VertexList, Color, XOffset,
 YOffset));
#endif






[LISTING FIVE]

/* POLYGON.H: Header file for polygon-filling code */

#define CONVEX 0
#define NONCONVEX 1
#define COMPLEX 2

/* Describes a single point (used for a single vertex) */
struct Point {
 int X; /* X coordinate */
 int Y; /* Y coordinate */
};

/* Describes series of points (used to store a list of vertices that describe
a polygon; each vertex is assumed to connect to the two adjacent vertices, and
last vertex is assumed to connect to the first) */
struct PointListHeader {
 int Length; /* # of points */
 struct Point * PointPtr; /* pointer to list of points */
};

/* Describes beginning and ending X coordinates of a single horizontal line */
struct HLine {
 int XStart; /* X coordinate of leftmost pixel in line */
 int XEnd; /* X coordinate of rightmost pixel in line */
};

/* Describes a Length-long series of horizontal lines, all assumed to be on
contiguous scan lines starting at YStart and proceeding downward (used to
describe scan-converted polygon to low-level hardware-dependent drawing
code)*/
struct HLineList {
 int Length; /* # of horizontal lines */
 int YStart; /* Y coordinate of topmost line */

 struct HLine * HLinePtr; /* pointer to list of horz lines */
};

/* Describes a color as an RGB triple, plus one byte for other info */
struct RGB { unsigned char Red, Green, Blue, Spare; };






[LISTING SIX]

/* Mode set routine for VGA 640x400 16-color mode. Tested with
 Borland C++ 2, in C compilation mode. */

#include <dos.h>

void Set640x400()
{
 union REGS regset;

 /* First, set to standard 640x350 mode (mode 10h) */
 regset.x.ax = 0x0010;
 int86(0x10, &regset, &regset);

 /* Modify the sync polarity bits (bits 7 & 6) of the
 Miscellaneous Output register (readable at 0x3CC, writable at
 0x3C2) to select the 400-scan-line vertical scanning rate */
 outp(0x3C2, ((inp(0x3CC) & 0x3F) | 0x40));

 /* Now, tweak the registers needed to convert the vertical
 timings from 350 to 400 scan lines */
 outpw(0x3D4, 0x9C10); /* adjust the Vertical Sync Start register
 for 400 scan lines */
 outpw(0x3D4, 0x8E11); /* adjust the Vertical Sync End register
 for 400 scan lines */
 outpw(0x3D4, 0x8F12); /* adjust the Vertical Display End
 register for 400 scan lines */
 outpw(0x3D4, 0x9615); /* adjust the Vertical Blank Start
 register for 400 scan lines */
 outpw(0x3D4, 0xB916); /* adjust the Vertical Blank End register
 for 400 scan lines */
}






[LISTING SEVEN]

/* Sample program to exercise VGA 640x400 16-color mode page flipping, by
drawing a horizontal line at the top of page 0 and another at bottom of page
1, then flipping between them once every 30 frames. Tested with Borland C++ 2,
in C compilation mode. */

#include <dos.h>
#include <conio.h>


#define SCREEN_SEGMENT 0xA000
#define SCREEN_HEIGHT 400
#define SCREEN_WIDTH_IN_BYTES 80
#define INPUT_STATUS_1 0x3DA /* color-mode address of Input Status 1
 register */
/* The page start addresses must be even multiples of 256, because page
flipping is performed by changing only the upper start address byte */
#define PAGE_0_START 0
#define PAGE_1_START (400*SCREEN_WIDTH_IN_BYTES)

void main(void);
void Wait30Frames(void);
extern void Set640x400(void);

void main()
{
 int i;
 unsigned int far *ScreenPtr;
 union REGS regset;

 Set640x400(); /* set to 640x400 16-color mode */

 /* Point to first line of page 0 and draw a horizontal line across screen */
 FP_SEG(ScreenPtr) = SCREEN_SEGMENT;
 FP_OFF(ScreenPtr) = PAGE_0_START;
 for (i=0; i<(SCREEN_WIDTH_IN_BYTES/2); i++) *ScreenPtr++ = 0xFFFF;

 /* Point to last line of page 1 and draw a horizontal line across screen */
 FP_OFF(ScreenPtr) =
 PAGE_1_START + ((SCREEN_HEIGHT-1)*SCREEN_WIDTH_IN_BYTES);
 for (i=0; i<(SCREEN_WIDTH_IN_BYTES/2); i++) *ScreenPtr++ = 0xFFFF;

 /* Now flip pages once every 30 frames until a key is pressed */
 do {
 Wait30Frames();

 /* Flip to page 1 */
 outpw(0x3D4, 0x0C | ((PAGE_1_START >> 8) << 8));

 Wait30Frames();

 /* Flip to page 0 */
 outpw(0x3D4, 0x0C | ((PAGE_0_START >> 8) << 8));
 } while (kbhit() == 0);

 getch(); /* clear the key press */

 /* Return to text mode and exit */
 regset.x.ax = 0x0003; /* AL = 3 selects 80x25 text mode */
 int86(0x10, &regset, &regset);
}

void Wait30Frames()
{
 int i;

 for (i=0; i<30; i++) {
 /* Wait until we're not in vertical sync, so we can catch leading edge */

 while ((inp(INPUT_STATUS_1) & 0x08) != 0) ;
 /* Wait until we are in vertical sync */
 while ((inp(INPUT_STATUS_1) & 0x08) == 0) ;
 }
}

























































December, 1991
PROGRAMMER'S BOOKSHELF


Libraries and "the One Right Way"




Andrew Schulman


Any fool can turn out a book on the C runtime library--those functions such as
printf(), strlen(), sqrt(), and setjmp()--and a quick trip to your local
bookstore will reveal that many fools have. All it takes is the ability to
copy, mistakes and all, the vendor's original documentation, and then serve up
the resulting pointlessness as The Microsoft C Bible, The Complete Guide to
Turbo C, or The Turbo C++ Programmer's Reference.
Our first book this month is a book on the C standard library that, wonder of
wonders, has things you won't find in the manuals that came with your
compiler. In fact, P.J. Plauger's The Standard C Library aims not so much to
show how to use the C standard library, but to show how it is implemented.
This is an incredible book! Instead of wasting paper just showing you how to
call printf(), Plauger (who chaired the Library subcommittee of the ANSI C
committee) shows you how to write printf--and scanf, malloc, sqrt, and every
other function in the library approved by the ANSI and ISO standards for C.
Besides presenting about 9,000 lines of C code for a complete portable C
runtime library, Plauger explains it all pretty much line-by-line, with
special emphasis given to international considerations (multibyte character
sets are given extensive coverage here), portability, testing, and handling of
errors and oddities. For example, the chapter on <math.h> stresses the way in
which even a seemingly trivial operation such as fabs (floating-point absolute
value) is complicated and made nontrivial by the need to handle screwball
values such as infinity and NaN (Not a Number). While authors of computer
books frequently engage in "exercise left for the reader" cop-outs, Plauger
seems to delight in tackling such nasties.
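To make the fabs point concrete, here is a minimal sketch of an absolute-value
routine that deals with the special values explicitly. It is not Plauger's
code: checked_fabs is a hypothetical name, and reporting the odd cases through
errno is an assumption of this sketch rather than something the C standard
requires of fabs.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <limits>

// Hypothetical helper (not from the book): absolute value with the
// special cases spelled out one by one.
double checked_fabs(double x)
{
    if (std::isnan(x)) {        // NaN has no magnitude; pass it through
        errno = EDOM;           // assumption: flag it as a domain error
        return x;
    }
    if (std::isinf(x)) {        // either infinity becomes +infinity
        errno = ERANGE;         // assumption: flag it as a range error
        return std::numeric_limits<double>::infinity();
    }
    if (x == 0.0)               // true for -0.0 as well; return +0.0
        return 0.0;
    return x < 0.0 ? -x : x;    // the "trivial" case
}
```

Only the last line is the operation most programmers have in mind; the rest
is the handling of screwball values.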
Certainly there are places where Plauger decides not to deal with an issue.
The chapters explaining machine-specific facilities such as <signal.h> and
<setjmp.h> are, in the interests of portability, somewhat vague. The
implementation of realloc() does not try to expand a block in place. But
Plauger is generally quite explicit about such limitations.
Some of the material in The Standard C Library has appeared in Plauger's
monthly "Standard C" column in the C User's Journal. The book contains one
chapter for each of the 15 .h files that make up the ANSI and ISO standard C
library. Each chapter contains the following sections:
A "Background" section that includes useful historical information (for
example, the chapter on <stdio.h> discusses the evolution of the
device-independent input/output model)
"What the C Standard Says" (a reprint in very tiny type of the relevant pages
from the ISO 9899:1990 standard)
A section on using the facility
A meaty section on implementing the facility (for example, implementing
<stdio.h> gets 50 pages)
A section on testing--yes, testing--the facility
An extremely thoughtful bibliography (for example, the <signal.h> chapter
refers the reader to the 1976 PDP-11 manual, because PDP-11 traps and
interrupts inspired the signals defined for C)
Exercises (some of the ones for <time.h> look really difficult!)
Within the "implementing" section, things seem explained in exactly the right
order. However, it is extremely annoying that the chapters themselves are
arranged in strictly alphabetical order. While the interdependencies of the 15
facilities that make up the C standard library probably make any ordering
somewhat arbitrary, it seems absurd that, for instance, <math.h> and <float.h>
are not treated together, and that the book begins with <assert.h> rather than
with, say, <stdlib.h>. This is my only complaint about this otherwise superb
book.


C++ 3.0


It is interesting to consider what isn't in the C standard library. While it
contains generic searching and sorting facilities (bsearch() and qsort()),
there is nothing for doing linked lists, hash tables, or circular buffers. The
manipulation of such structures is simply too difficult to "library-ize." Yet,
not having these available as components just as standard as printf() and
strlen() is a shocking waste.
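For reference, the genericity that bsearch() and qsort() do offer rests on
void pointers and a caller-supplied comparison function. The following is a
minimal sketch of that style, written so that it compiles as C++;
compare_ints and sort_and_find are hypothetical names chosen for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Comparison callback in the form qsort() and bsearch() expect:
// negative, zero, or positive for less, equal, or greater.
int compare_ints(const void *a, const void *b)
{
    int x = *static_cast<const int *>(a);
    int y = *static_cast<const int *>(b);
    return (x > y) - (x < y);
}

// Hypothetical helper: sort the array in place, then return the index
// of key, or -1 if key is not present.
int sort_and_find(int *values, std::size_t count, int key)
{
    std::qsort(values, count, sizeof values[0], compare_ints);
    void *hit = std::bsearch(&key, values, count, sizeof values[0],
                             compare_ints);
    return hit ? static_cast<int>(static_cast<int *>(hit) - values) : -1;
}
```

The element type appears only in the callback and in the sizeof expression;
nothing stops a caller from passing a comparison function that does not match
the data.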
The C++ programming language is in many ways an attempt to solve this large
problem. In C++, generic facilities such as linked lists are much more
amenable to being put in a library that programmers can really use with the
same ease that they use strlen(). But to make the masses of programmers switch
from C, C++ needs to provide immediately tangible benefits, such as actual,
inexpensive, shrinkwrapped libraries, rather than the mere ability to create
such libraries.
Reading the new second edition of Stroustrup's The C++ Programming Language, I
for the first time felt that this might really happen. The overall theme of
the many changes to C++, Stroustrup notes, has been to make it "a better
language for writing and using libraries." Libraries really are the key to the
whole business. Some of the changes in C++, such as multiple inheritance, are
truly of interest only to connoisseurs of object-oriented design. But others,
such as templates, seem like such a big win that some day--when there are
several inexpensive C++ 3.0 compilers available--not using C++ may seem as
conservative and hidebound as refusing to use function prototypes in C would
seem today.
Whether or not you use C++--and, let's face it, very little commercial
software shipping today has been written in C++--you owe it to yourself to get
the new edition of Stroustrup's book. (Note that this is the second edition of
The C++ Programming Language, and not the hardcover Annotated C++ Reference
Manual, which was reviewed in the May 1991 DDJ.) Not only is the second
edition well written and quite readable by the average C programmer, but it
covers many topics of wide interest.
In particular, The C++ Programming Language now contains three chapters (over
100 pages) on general issues of design. This includes such topics as the
development cycle, management, reuse, hybrid design, components, and the
design of libraries. Throughout, Stroustrup is concerned with making the right
compromises between conflicting goals. As he notes several times in the book,
"This reflects my view that there is no 'one right way.'"
That sounds trite, but when you think about it, many programmers in fact act
as if there were such a thing as "the right way," and as if the goal of
software design were to come up with "elegant" solutions. Wrong! Stroustrup
clearly designed C++ itself as a series of trade-offs. This is exactly why it,
rather than "purer" languages such as Smalltalk, actually has a chance of
becoming a tool that's used by masses of programmers.
Two other new chapters discuss the most recent additions to C++: templates and
exceptions. Proper exception handling is required by anyone building or using
industrial-strength libraries. The new exception-handling mechanism in C++ 3.0
(which adds the keywords try, catch, and throw) feels much too complex, but
Stroustrup's chapter on this subject is brilliant. Any programmer even
remotely interested in the three Es--errors, events, and exceptions--will want
to read Stroustrup's discussion of this subject.
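As a rough illustration of the three keywords, here is a minimal sketch of a
small fixed-size stack that reports misuse by throwing. The names
checked_stack and pop_reports_underflow are hypothetical, and the exception
classes come from the <stdexcept> header of today's standard library, not
from anything in the book.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical bounded stack of ints that reports misuse by throwing
class checked_stack {
    int data[4];
    int count;
public:
    checked_stack() : count(0) {}
    void push(int a) {
        if (count == 4)
            throw std::length_error("stack full");
        data[count++] = a;
    }
    int pop() {
        if (count == 0)
            throw std::out_of_range("stack empty");
        return data[--count];
    }
};

// Returns true if popping an empty stack was reported by an exception
bool pop_reports_underflow()
{
    checked_stack s;
    try {
        s.pop();                        // nothing to pop: throws
    } catch (const std::out_of_range &) {
        return true;                    // the handler runs instead
    }
    return false;
}
```

The library detects the error, and the caller decides what to do about it.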
The chapter on templates presents what is probably the most exciting new
facility of C++ 3.0. Templates allow very generic facilities, such as linked
lists or sort procedures, to be defined without being tied down to a specific
class. While providing many of the benefits of typeless languages, templates
do not undermine static type checking or runtime efficiency.
This sounds rather vague, so see the stack template in Example 1, which
defines a stack as a completely abstract data type, with no mention of whether
it is a stack of ints, chars, or widgets. That gets decided when someone
creates a stack: the same declaration can be used to create, for example, a
stack of ints and a stack of chars (as shown in Example 2). In essence,
templates provide all the typeless flexibility of preprocessor macros, but
without the many problems associated with macros (naturally, templates do
introduce some new problems).
Example 1: A stack template in C++ 3.0

 // stack.h

 template<class T> class stack
 {
 T *v, *p;
 int sz;

 public:

 // all the following functions are inline
 stack(int s) { v = p = new T[sz = s]; }
 ~stack() { delete[] v; }
 void push (T a) { *p++ = a; }
 T pop() { return *--p; }
 int size() const { return p-v; }

 };

Example 2: Creating two different classes of stack from the same template

 #include <iostream.h>
 #include "stack.h"

 main()
 {
 stack<int> si(100); // stack of 100 ints
 stack<char> sc(100); // stack of 100 chars

 for (int i=0; i<10; i++)
 {
 si.push(i);
 sc.push(i);
 }

 while (si.size()) // pop everything and
 cout << si.pop() << ' ' ; // display it
 cout << '\n' ;
 }
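
For readers trying this with a current compiler, here is a sketch of the same
stack template with a few defensive additions. The assert() bounds checks,
the disabled copying, and the pop_all helper are assumptions of this sketch
and are not part of Stroustrup's example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

template<class T> class stack {
    T *v, *p;
    int sz;
    stack(const stack &);               // copying is not supported here
    stack &operator=(const stack &);
public:
    explicit stack(int s) : v(new T[s]), p(v), sz(s) {}
    ~stack() { delete[] v; }
    void push(T a) { assert(size() < sz); *p++ = a; }
    T pop() { assert(size() > 0); return *--p; }
    int size() const { return static_cast<int>(p - v); }
};

// Hypothetical helper mirroring the loop in Example 2: pops everything
// and returns the values as text, most recently pushed first.
std::string pop_all(stack<int> &s)
{
    std::ostringstream out;
    while (s.size())
        out << s.pop() << ' ';
    return out.str();
}
```

Declaring stack<int> si(100) and stack<char> sc(100) produces two unrelated
classes from the one declaration, just as in Example 2.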

The implications of templates for library building should be clear, or at
least they will be, once compilers are available to implement templates. I
used a beta version of such a forthcoming C++ compiler and was particularly
impressed at how templates interact nicely with the C++ inline facility; the
assembly language output from Example 2 was exactly what one would have
achieved doing all this by hand in C. Example 3, for instance, shows what the
statement si.push(i) became.
Example 3: Assembly language output from Example 2

 mov bx, word ptr [si.p]
 mov ax, word ptr [i]
 mov word ptr [bx], ax ; *p = i
 add word ptr [si.p], 2 ; p += sizeof(int)

With the language described in Stroustrup's second edition, it truly seems
possible that we could one day have extensive libraries of ready-to-use,
high-level classes, that are as easy and intuitive to use and as efficient as
strlen() is today. Yet this will require not only that C++ provide the right
set of features, but that the designers of these libraries show some of the
same level-headedness that Stroustrup reveals in his book. "The conviction
that there is no one right way permeates the design of C++." In this spirit,
future C++ libraries might have to look a lot more like the present C standard
library than their designers might like.






























December, 1991
OF INTEREST





CNS has released multiplatform versions of C++/Views, an object-oriented
development environment now available for Windows, OS/2, and the Macintosh
platforms.
With nearly 100 C++ classes, C++/Views also includes a C++ class browser that
supports project management, class browsing and editing, class hierarchy
management, and integrated compile and go. With C++/Views you can write your
application once and simply recompile and execute it on all the supported
platforms.
The Windows version costs $495; Mac and OS/2 versions sell for $895. Reader
service no. 20.
CNS Inc. 1250 Park Road Chanhassen, MN 55317-9260 612-474-7600
The FUSION TCP/IP Developer's Kit (FDK) for Windows, which allows developers
to distribute Windows applications across multivendor networks via TCP/IP, has
been released by Network Research. The FDK provides TCP/IP network and
transport services by supporting Windows' DLL. Linking the FUSION DLL to the
application at runtime allows access to FUSION's TCP/IP protocols.
The FDK's unique feature is a set of modules you can link with your networking
interface. The modules help isolate differences between TCP/IP socket-oriented
and Windows' message-based communications protocols. When compiled and linked
with a Windows program, the modules improve the interface with the FUSION
TCP/IP DLL and help speed up development.
To obtain the FDK, you must purchase FUSION TCP/IP for DOS ($295) and a
maintenance contract ($100). Reader service no. 21.
Network Research 2380 N. Rose Ave. Oxnard, CA 93030 805-485-2700 or
800-541-9508
New versions of SPARCworks and the SPARCompiler are now available from SunPro.
Developed for Solaris 2.0, the UNIX System V Release 4-based operating
environment from SunSoft, SPARCworks and the SPARCompiler offer significant
enhancements for development in C++, ANSI C, Fortran, and Pascal.
SPARCworks now allows you to browse and graphically display program
structures, including C++ class hierarchies; debug optimized code; collect and
analyze application and system performance data; compare and merge multiple
versions of files; view and control system build procedures; and launch and
manage individual session tasks. SPARCworks also now uses ToolTalk, the
interapplication communication protocol from SunSoft.
New releases of SPARCompiler products are SPARCompiler C++ 3.0, based on the
UNIX System Laboratories de facto standard, Cfront 3.0, and offering
parameterized types; SPARCompiler C 2.0, supporting ANSI and K&R C, 64-bit
integer data types, and migration tools for moving from SVR3 to SVR4 and from
K&R to ANSI C; and SPARCompiler Fortran 2.0, featuring faster I/O performance,
64-bit integer data types, and improved viewing of array data within the
SPARCworks debugger. Prices range from $1750 for SPARCworks Professional C to
$2195 for SPARCworks Professional Fortran. The SPARCompiler runs from $695 for
C to $1075 for Fortran. Volume discounts available. Reader service no. 22.
SunPro 2550 Garcia Ave. Mountain View, CA 94043-1100 415-960-1300
Blaise Computing now supports Turbo Vision applications in Turbo C++ with
version 2.0 of its Turbo Vision Development Toolkit. The toolkit is a set of
utilities and an object class library that includes a resource editor for
interactively creating or changing dialog boxes and other resources, a utility
to convert Turbo Vision resources into Windows resource script files, and
object and class libraries that extend Turbo Vision's capabilities.
Source code for all objects and utilities is included for the price of $149.
Reader service no. 23.
Blaise Computing Inc. 819 Bancroft Way Berkeley, CA 94710 415-540-5441
MULTI, the new source-level debugger from Oasys, is multitarget,
multilanguage, multiwindow, and multiprocess, and can be used for native and
cross development. With it, you can simultaneously debug applications written
in C++, C, Pascal, Fortran, and assembly.
MULTI lets you display variable, class, and reference windows, and set
breakpoints on overloaded or member functions. It automatically demangles
symbol names, associates C++ source lines with addresses in the executable,
disambiguates references to overloaded functions and operators, and allows you
to debug inline procedures. In conjunction with the Green Hills native or
cross C++ compilers, MULTI also provides automatic resolution of class base
types such as virtual functions and displays complex class structures,
including multiple inheritance.
MULTI has a completely customizable graphical user interface based on the
industry standard X Window environment, making its user interface easy to
learn and consistent across platforms. MULTI can be used to debug single or
multiple processes native or remote to the processor it is running on, and the
same debugger can interface to several remote devices without relinking or
modification. You can test your applications by configuring MULTI to interface
directly with various debug execution environments via a UNIX ptrace command
interface or your own ptrace-based interface.
MULTI is available for Sun-4 SPARC and DECstations. Prices for cross
development start at $2000; for native development, $1000. Reader service no.
24.
Oasys One Cranberry Hill Lexington, MA 02173 617-862-2002
First Class Software has announced Profile/V, an interactive performance
analysis tool for Smalltalk/V. Profile/V shows where time is being spent:
which methods are most expensive and which statements within each method are
the costliest.
Profile/V operates by periodically sampling the Smalltalk activation stack to
create a tree of the methods used while executing a block of code, weighted by
which methods took the most time. The tree is displayed as an indented list in
a browser. By selecting a node in the tree you can bring up the source code of
the associated method in one text pane and a profile of time spent in the
method's statements in another.
Also included is a filtering mechanism called "gathering," which merges all
the profiles of a selected method into a single tree. This allows you to
compare the profile of mutually recursive methods from the perspective of any
method and can also collect and show the performance impact of utility
routines.
Profile/V comes with a tutorial explaining optimization strategies in
Smalltalk and costs $299. Reader service no. 25.
First Class Software P.O. Box 226 Boulder Creek, CA 95006-0226 408-338-4649
The LPI-C++ 32-bit compiler for commercial application development in
multi-platform UNIX environments is now available from Liant. Packaged with
CodeWatch, Liant's source-level debugger, LPI-C++ allows you to build portable
C++ applications.
LPI-C++ is a complete "2.1" implementation of C++ as Ellis and Stroustrup
describe it in The Annotated C++ Reference Manual, and has compile-time switch
compatibility with Cfront, ANSI C, and K&R C. The compiler includes an
optimizer that reduces program size and improves runtime performance using
global common subexpression elimination, branch chaining, loop induction, loop
unrolling, and subroutine inlining. LPI-C++ gives you error messages that
clearly identify the type, severity, and exact location of programming errors
and attempt to specify probable corrections.
DDJ spoke with Matthias Neugebauer of BKS Software in Berlin, who is
developing an OOP database development environment for C++. "LPI-C++ is good
for making the final version of a product but not for development. There are
performance problems with the compiler, but not with the generated code," says
Neugebauer. "I will use it to test conformance to C++ specifications, because
this is the only compiler that has that capability."
LPI-C++ is initially available on 386- and 486-based systems running UNIX,
Version 3, and will soon be available for Version 4 and for SPARC systems
running SunOS. Interaction with the debugger is through either the X-Window
graphical interface or a character-based interface. The introductory price is
$695 per copy; $1295 with CodeWatch. Reader service no. 26.
Liant Software Corp. 959 Concord St. Framingham, MA 01701 800-237-1874
Anthora is the new relational database development tool for Windows 3.0 from
Exxitus Corp. Among Anthora's many features are two report generators, one of
which creates a .dbf file with the report; the ability to run on a network or
as a stand-alone product; a browser that lets you display and update indexed
and related databases simultaneously; a dialog editor that allows you to work
at the object level, with access to windows and custom controls; the ability
to store graphical information as part of the field inside a database, with
support for Bitmaps and Metafiles; DDE and DLL; an interpreted language
similar in structure to C; and a compiler.
Anthora costs $295; the compiler is $89. Reader service no. 27.
Evolution Trading Inc. 7206 NW 31st St. Miami, FL 33122 305-593-1516
A new set of parallel-processing development tools for the TMS320C40 parallel
DSP chip is available from Texas Instruments. The set includes tools for
hardware development and verification such as the XDS510 scan-based parallel
emulator, and a Parallel-Processing Development System (PPDS) that simplify
design and debugging. Also included are code generation tools that support
writing code using the parallel-processing capability of the C40. An example
is TI's ANSI C compiler with parallel-processing runtime support. The
additional component of the set is software tools, which include a C and
assembly source debugger that support the development, debugging, and
benchmarking of parallel processors.
Contact Texas Instruments for pricing. Reader service no. 28.
Texas Instruments P.O. Box 809066 Dallas, TX 75380 800-336-5236 x 700
Magic Fields from Blue Sky Software is a new tool for Windows data field
validation. With it, you can point and click to add pre- and custom-defined
data entry fields, instead of writing the code to do input checking.
Magic Fields consists of a collection of objects that perform data field
validation, including numeric, text, alphanumeric, date, currency, dollars,
phone numbers, social security numbers, and so on. The custom-defined object
can be used to define fields not in the collection. Magic Fields supports
international date and currency formats and can be used with any standard
Windows dialog editor. The price is $295. Reader service no. 29.
Blue Sky Software 1224 Prospect St., Suite 155 La Jolla, CA 92037 619-459-6365
A low-cost logic circuit simulator has been designed by K.E. Ayers and
Associates specifically for beginners in digital electronics. The LogicLab
Explorer software package simulates on the PC a logic workbench that includes
a wire-wrap "breadboard" with a capacity of up to 40 16-pin DIP packages; a
"parts cabinet" with common 74LS-series IC devices, push-button switches, LED
and 7-segment indicators, and a programmable signal source; a 16-channel logic
analyzer; and a wire wrap "gun."
To aid in constructing circuits, a graphical windowing system is also included.
LogicLab Explorer costs $49.95. Reader service no. 30.
K.E. Ayers & Associates 7825 Larchwood St. Dublin, OH 43017 614-792-2473
Two new books from Addison-Wesley are Visual Design with OSF/Motif, by Shiz
Kobara, and The X Window System: A User's Guide, by Niall Mansfield. Kobara's
book shows planners and designers how to take advantage of the 3-D graphic
capabilities of the OSF/Motif user interface via step-by-step tutorials and
over 200 illustrations.
Mansfield's volume gives an overview of the X Window system, a list of
available software, an installation guide, and a road map to the
documentation. Many code examples, screen shots, tips, and warnings are
included. Reader service no. 31.
Addison-Wesley Publishing Co. 1 Jacob Way Reading, MA 01867 617-944-3700
ProductOne SnapShot from Digital ChoreoGraphics is a new 3-D CAD visualization
tool. SnapShot allows you to view a shaded 3-D CAD file in proper perspective
and lighting over any digitized background scene with optional animation. It
uses an interactive menu structure to achieve quick renderings on a wide range
of systems. SnapShot also allows you to envision any object or group of
objects at any stage of completion in either wireframe, hidden surface,
shaded, or dithered view modes. Additional features include an intuitive GUI,
a virtual color palette, and the ability to simulate photographic effects on
the PC screen.
Background images can be in GIF, PCX, or PDS formats; CAD files can be in
AutoCAD DXF, Design CAD-3D, and NASA PDS formats. SnapShot sells for $295.
Educational discounts are available. Reader service no. 32.
Digital ChoreoGraphics 1763 Orange Ave. Costa Mesa, CA 92627 800-326-1969






December, 1991
SWAINE'S FLAMES


Truth or Dare




Michael Swaine


Welcome to "Doped Junction," the show that dares to ask the dumb questions.
Each week, Sadie Pentathol interviews industry leaders in her trademarked
abrasive style. Tonight, Sadie abrades IBM President Jack Kuehler and Apple
Chief Executive Officer John Sculley regarding certain scurrilous rumors. And
now, here's Sadie.
Sadie: Jack Kunstler. You're the president of IBM. Bigger than God. You could
have stepped on Apple and crushed it like a bug. Why get into bed with John
Sculley?
Kuehler: IBM believes that there are great mutual benefits to be realized from
joint ventures in today's business climate. By the way, it's Kuehler.
Sadie: I've read all the news stories about the deal, all about the pink
RISC-platformable object-oriented operating systems in enterprise computing
environments for preferred clients -- but it's all Greek to me. Let's cut to
the chase, Jack. Exactly what did you license from Apple, and why?
Kuehler: It's pretty simple, really. Our sales staff kept hearing that our
customers wanted Macs. IBM listens to its customers, so we researched just
what features of Macs they liked. That's basically what we've licensed.
Sadie: So now when your customers ask for those features--
Kuehler: We've got 'em.
Sadie: By the Coke-can hairs. John Scuzzy, what did Apple get out of the deal?
Sculley: That's Sculley. Well, Sadie, what we primarily got is acceptance in
the corporate world. IBM has said, "Hey, Corporate World -- Apple is OK."
Sadie: But according to Jack Kunstler, their sales force is saying something
quite different.
Sculley: Well, that's their job, and they do it well. But the point is the
acceptance. Also, we get protection from the Beatles.
Sadie: Beagles?
Sculley: Beatles. British rock group, you remember. Steve and Steve licensed
the name Apple from them with the proviso that Apple stay out of the music
business. Well, we have a lot of music-related products in the works, and
there's this lawsuit now, and the stockholders were making a lot of noise
about the company's future being in the hands of Ringo Starr --
Sadie: So what you're saying is that you're going to hide all Apple's music
products in your multimedia joint venture, Colitis?
Sculley: We considered giving them to Claris, but we didn't think the judge
would fall for that.
Sadie: Long term. Beyond the '90s. The next millennium. What's the picture,
Jack Cruller?
Kuehler: In the long term, IBM is going to sell off its hardware and software
and focus on service and support.
Sadie: Really? Wow.
Kuehler: Ross Perot begged IBM to go into DP services when he worked for us
back in '62. I just got his memo last week. Bureaucracy, you know. Service and
support is the fastest growing sector of the industry, the margins are better
than hardware, and frankly, we're tired of parasite companies making money off
IBM's shortcomings.
Sadie: So you're going to make money off IBM's shortcomings. Who are you going
to sell off your hardware and software products to?
Kuehler: The third world: Southeast Asia, Latin America, England. Software
will follow hardware and semiconductors offshore in a few years. But service
and support will stay domestic, because while Americans will buy any amount of
foreign products, they will never learn to understand people with accents.
Sadie: What's the big challenge in the short term -- John Shirley?
Sculley: That's Scuzzy. I can only speak for Apple, and for us, the big
challenge in the next few years will be to continue to sell Macintoshes until
the new machines come out, now that we've effectively said that the Mac is
obsolete.
Sadie: Can Apple pull it off, John?
Sculley: Frankly, Sadie, our research tells us that Apple customers are
fanatically loyal to obsolete hardware. The Apple II line is apparently
unkillable. So we're not worried.
Sadie: Where does all of this leave OS/2, Jack Cooler?
Kuehler: We will continue to fully support OS/2 for the foreseeable future.
Sculley: Just like we will continue to support TrueType.
Kuehler: Oh, thanks a lot, John.
That's it for this week. Tune in next week, when Sadie will interview Bill
Gates and Philippe Kahn regarding allegations of a Microsoft-Borland merger.



















Special Issue, 1991
EDITORIAL


What? Me Worry About Windows Programming?




Michael Floyd


Fenestracryptophobia: The fear of Windows programming. Symptoms range from
mild headaches to outright disorientation and confusion. In worst cases,
suffering programmers find it difficult to manage even simple events such as
opening a window.
To truly understand the above-described phobia you have to place yourself in
the afflicted programmer's development environment. For starters, Windows 3.0
programmers must face some 578 API function calls. With enhancements come new
API functions; Windows 3.1 is expected to bring the total up to 771 calls. Add
to this 197 multimedia extensions and 78 for pen functions and you're staring
down the barrel of 1,046 API calls. It's no wonder tools such as Asymetrix's
ToolBook and visual programming environments like Visual Basic are attracting
so much attention.
Indeed, visual programming tools are the current prescription for ailing
programmers. But one problem I see with visual programming is that it favors
the look of an application over its functionality -- the visual programming
GUI tail, in effect, is wagging the dog. Furthermore, visual languages that
necessarily support the environment are likely to forgo other language
features that programmers might expect. "No problem!" the quick to answer
might say, "Just design the interface and call a DLL written in your favorite
language." Unfortunately, this approach does not, for instance, alleviate the
need to make calls to the GDI to draw graphics or to support DDE.
Windows 3.1 promises relief for some of these programmer phobias while, at the
same time, introducing significant enhancements. One big change is the
introduction of the Dynamic Data Exchange Management Library (DDEML) which,
according to Microsoft, simplifies DDE message passing while standardizing the
DDE protocol. DDEML provides a set of functions (26 in all) that insulate you
from the gory details of client/server conversations, transaction handling,
memory management, and the like. This is good news to new and seasoned
fenestracryptographers because both approaches work. If you're new to Windows
or just tired of hacking DDE code, go with DDEML. If you've already made the
investment, the old message-based DDE is still there.
And then there's Object Linking and Embedding (OLE) which allows you to create
a document (called a compound document) consisting of embedded objects. These
embedded objects can be linked to and managed by applications. Thus, the
document takes center stage and multiple apps can do what they do best. But
what's good for the user -- via 64 new functions -- isn't fun for the
programmer. Consider too that you should at least entertain the ideas of
making applications TrueType-aware, Pen-aware, adding support for Drag and
Drop, preparing your code for 32-bit Windows, and ...I feel another
fenestracryptophobia attack coming on!
The complexity of GUIs is the very reason why application frameworks are
important. As you might expect, Microsoft has announced an application
framework to support rapid UI development. It will provide a basic set of
"foundation" classes that lie on top of Windows, while specific UI classes
will provide the interface to the application. Most notable is that the
"open" design allows for third-party classes at the UI level.
If you've already invested in Borland's Object Windows Library (OWL), don't
panic just yet. Borland earlier this year submitted its specification for OWL
to the Object Management Group (OMG). OWL is a library of reusable objects (or
classes) that allow you to build the UI components of an application. More
importantly, OWL is language independent. Object Windows is already available
with Actor, Turbo Pascal for Windows, and Borland C++. And although it's
specific to Windows, it is hoped that OWL will evolve to a
platform-independent framework under the guidance of the OMG.
With Microsoft looking ahead to 32-bit Windows and Windows NT, there's little
doubt that fenestracryptophobia will spread. A standard application framework
that is both language- and platform-independent may help, but new innovations,
dissimilar environments, and different hardware make a sure cure unlikely.








































Special Issue, 1991
 QUICK APPROXIMATIONS OF POLYGONAL AREAS USING BITBLT
 This article contains the following executables: BITLIT.ASC


Nancy Nicolaisen


Nancy has worked for the U.S. Geological Survey and the National Ocean Service
in Alaska. She is the author of a hypertext publishing system for maps and
geographically related data.


Several years ago, Garrison Keillor visited Alaska, putting on a July 4th show
that included a side-splitting monologue about a couple visiting from the
Midwest who used a roadmap to decide where to go sightseeing. The humor in
this is a bit of an inside joke because on one hand, our entire state has
fewer miles of road than a small county in the Lower 48; on the other hand,
Alaska has one-half of the coastline and one-fifth the landmass of the entire
U.S. If we are short on roads, we are even shorter on destinations.
As Gertrude Stein said, "When you get there there is no there there," unless
you count our majestic wilderness, which, of course, counts very much to us.
In the job of managing natural resources and doing environmental research, the
biggest challenges we have involve the logistics of putting scientists in the
field, getting back their data, analyzing the information, and communicating
it to decision makers and operational users.
Over the years it has become clear that we need to put as much data as
possible in the hands of the users in an approachable form. Natural scientists
tend to use Geographic Information System (GIS) tools, and invariably, these
tools mean a big computer and a data processing bureaucracy. For complex,
exacting projects, this is a good arrangement. But for quick, general answers
to uncomplicated questions, things become unsatisfactory for both users and DP
professionals.
In short, what we need is a "backpack" GIS that can be carried into the field,
the meeting room, or the press briefing. We need to do things such as display
cartographically faithful maps and employ "hypermaps" to enrich the
information content with pictures, charts, graphs, narrative text, and raw
data. We need to interact with the data in simple ways. What follows is a look
at how we might provide a quick, reasonably accurate answer to a simple
question such as, "How much of this habitat zone was covered by the oil
spill?"
In this article, I examine the use of graphics and Boolean raster operations
to estimate areas. The strategy for doing this relies on determining the unit
area represented by a single pixel, distinguishing which pixels belong to the
visual representation of the area we want to estimate, counting them, and
multiplying by our unit area. For example, Figure 1 and Figure 2 show
irregularly shaped areas we can combine to find various areas. Figure 3 shows
the pull-down menu listing the various combinations of areas we can do. Figure
4 shows the result of estimating the non-intersecting area of Areas 1 and 2.
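
The arithmetic behind this strategy can be sketched in a few lines of portable
C, independent of Windows. The function names and the sample extents are
illustrative assumptions, not part of the listings.

```c
#include <assert.h>

/* Area, in logical units squared, represented by one pixel when a
   logical extent of x_extent by y_extent is mapped onto a client
   area of width_px by height_px pixels. */
double area_per_pixel(double x_extent, double y_extent,
                      int width_px, int height_px)
{
    assert(width_px > 0 && height_px > 0);
    return (x_extent / width_px) * (y_extent / height_px);
}

/* Estimated area: pixels counted inside the shape times the unit area. */
double estimate_area(unsigned long pixels_inside, double unit_area)
{
    return (double)pixels_inside * unit_area;
}
```

For instance, a 1000 x 500 logical extent mapped onto a 500 x 250 pixel client
area gives a unit area of 4, so 100 counted pixels estimate an area of 400.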


Examining the Source Code


Let's start by looking at the C source in Listing Two, page 14. (The entire
system, including header files, icons, and so on is available electronically;
see "Availability" on page 3.) The program begins with the normal
initialization tasks. For the first instance of the program, we need to
register the window class and create any generic objects from which all
instances could benefit. Notice that when we register this class, the
rClass.cbWndExtra field is set to the size of a WORD, which is defined in
Windows.h as being an unsigned short. This reserves space for us in the Window
structure of each separate instance of the application. This word will hold
the value of the constant representing the user's choice from the pull-down
menu, and is used to switch the cases in the paint procedure.
Next we need to do the initialization required for every instance of the
program. We assign the instance handle to a global variable and create the
window based upon the previously registered class. We show the new window,
using the cmdShow parameter provided to us by Windows.
At this point, we are waiting in the main program, WinMain, for the messages
to come rolling in. WinMain contains the message dispatch loop. When they
arrive, those messages will be delivered to the WindowProc. The WM_COMMAND
message tells us that the user has spoken; naturally, we will want to do
something about that as quickly as possible. The wParam (for Word Parameter)
contains a constant that identifies the pull-down menu choice of the user. We
will store this value in the space allocated for this purpose in the window
structure and call the paint procedure.


Painting the Window


When we enter the paint procedure there are some routine things to do. First,
we need to get a handle to a display context and find the dimension in pixels
of the client area of our window. We want the size of a unit in the logical
coordinate system to be the same in both X and Y directions, so we set the
mapping mode to MM_ISOTROPIC (or, to put it another way,
xUnitLength/yUnitLength = 1). This is important, because we estimate areas
of shapes by counting the number of pixels in the area and multiplying by the
value fAreaPerPixel, which we define as the square area in the logical
coordinate system represented by one pixel.
When we define the size of the viewport and the window, notice that the signs
of the Y extents differ. This aligns the lower-left corner of our window on
the display with the origin of our logical coordinate system. As a result of
this alignment, the X axis will increase to the right and the Y axis will
increase, going upward.
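
To see why opposite signs on the Y extents flip the axis, here is a rough
sketch of the usual window-to-viewport arithmetic in portable C. The Mapping
structure and function names are hypothetical stand-ins for the state a DC
holds, and the sketch ignores the extent adjustment that MM_ISOTROPIC applies.

```c
#include <assert.h>

/* Hypothetical stand-in for the origin/extent pairs a DC holds. */
typedef struct {
    long win_org_x, win_org_y;   /* logical origin  */
    long win_ext_x, win_ext_y;   /* logical extents */
    long vp_org_x,  vp_org_y;    /* device origin   */
    long vp_ext_x,  vp_ext_y;    /* device extents  */
} Mapping;

/* Scale the offset from the logical origin by the ratio of the
   extents, then shift to the device origin. */
long logical_to_device_x(const Mapping *m, long lx)
{
    assert(m->win_ext_x != 0);
    return (lx - m->win_org_x) * m->vp_ext_x / m->win_ext_x + m->vp_org_x;
}

long logical_to_device_y(const Mapping *m, long ly)
{
    assert(m->win_ext_y != 0);
    return (ly - m->win_org_y) * m->vp_ext_y / m->win_ext_y + m->vp_org_y;
}
```

With a viewport origin of (0, 400), a viewport extent of (600, -400), and a
1000 x 1000 logical window starting at (0, 0), the logical origin lands on
device row 400 (the bottom of the client area) and logical Y = 1000 lands on
device row 0 (the top).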
Next we create a memory display context with the same default properties as
the DC we got by calling BeginPaint. It's worth stressing that while the
default attributes of the memory DC are like those of our visual DC, the size,
location, and properties of the drawing surface are not yet defined. The newly
created DC has a monochrome "display surface" composed of one pixel. We must
explicitly define its coordinate systems and mapping properties. The memory
"display surface" itself comes into being when we create a bitmap and select
it into the memory DC.
We create memory bitmaps to hold an image of Area 1 and Area 2, and play the
previously recorded metafiles onto them to construct those images. The
metafiles are used here for the sake of convenience. Notice the calls to
SaveDC and RestoreDC that bracket the PlayMetaFile call. Metafiles can and
frequently do affect the settings of a DC. It is a good practice to anticipate
this and ensure that the state of the DC will be acceptable after the metafile
has played.
At this point, there are only two real jobs left. First we need to combine the
bitmaps, and then we need to count the bits that represent the area.


The Versatile BitBlt


A quick look at the source lines for the paint procedure in Listing Two will
reveal that we use BitBlt to combine the bitmaps into a single image. BitBlt
is one of the most lovable graphics tools around -- it's fast, flexible, and
though this example is a little short on spectacular graphics, it can provide
them. BitBlt takes several parameters: a handle to a destination DC; X and Y
coordinates for the logical origin of the bitmap on the destination; logical
width and height of the bitmap on the destination; a handle to a source DC; X
and Y coordinates for the logical origin of the bitmap on the source; and,
most interestingly, a Raster Operation (ROP) code.
The ROP code determines which of 256 possible Boolean functions will be
performed on three sets of bits to produce the final image. The first set of
bits is called the "Pattern." The Pattern is a bitmap which is the currently
selected "brush" in the destination display context. The other two sets of
bits are the bitmaps selected in the source and destination display contexts.
Of the 256 possible ROP code constants, 15 have common names, and it is good
form to use the name in place of the constant if one exists. Not surprisingly,
the named ROP codes are the ones most frequently used. There are some cases in
which it is pretty obvious which one will do the job. For instance, SRCCOPY
will simply "wallpaper" the source bitmap onto the destination at the
location you specify. For the occasions when it is not so clear which Boolean
function will produce the result you want, here is how to determine the
appropriate ROP code.


Forming the Raster Operation Code


As an example, we'll find a ROP code which will cause all the pixels outside
the combination of Area 1 and Area 2 to be black, and all the interior pixels
to be white. First construct the following truth table in Table 1(a).
Table 1: Truth tables for interior pixels (a dot marks a Result bit not yet
determined)

 (a) Pattern:     1 1 1 1 0 0 0 0
     Source:      1 1 0 0 1 1 0 0
     Destination: 1 0 1 0 1 0 1 0

 (b) Pattern:     1 1 1 1 0 0 0 0
     Source:      1 1 0 0 1 1 0 0
     Destination: 1 0 1 0 1 0 1 0
     ---------------------------------------
     Result:      . . . 1 . . . 1

 (c) Pattern:     1 1 1 1 0 0 0 0
     Source:      1 1 0 0 1 1 0 0
     Destination: 1 0 1 0 1 0 1 0
     ---------------------------------------
     Result:      . 1 1 1 . 1 1 1

 (d) Pattern:     1 1 1 1 0 0 0 0
     Source:      1 1 0 0 1 1 0 0
     Destination: 1 0 1 0 1 0 1 0
     ---------------------------------------
     Result:      0 1 1 1 0 1 1 1


You may wonder about the content and meaning of the bit patterns in this
table. (I am always curious when I come across a new "sacred number.") The
patterns have no physical meaning, but do result in 256 unique ROP code
indexes upon Boolean combination. The two hex digits that result from the
combination of the Pattern, Source, and Destination bit strings give us an
index into the table of ternary raster operation codes in the Windows
Software Development Kit Reference literature.
Recall that each of the areas is black and the background is white. In Table
1, 0 is black and 1 is white. Where source and destination are both 0
(black), we need a 1 (white) in the Result, because such a bit falls inside
both areas; refer to Table 1(b). Where either source or destination is 0
(black), we need a 1 (white) in the Result, because such a bit falls inside
one of the areas, as in Table 1(c).
Where source and destination are both 1 (white), we need a 0 (black) in the
Result, because such a bit falls outside both of the areas; the completed row
appears in Table 1(d). Our Result is 77H.
Our result is called the "Boolean Function Number." We can find this number in
the table of ternary raster operation codes, volume 2, chapter 11 of the
Windows SDK Reference. The full ROP code, 7700e6H, and a Reverse Polish
description of the Boolean function (DSan) are listed there. For this code,
there is no common name.
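
The truth-table walk can also be done mechanically. The sketch below, in
portable C, packs the eight Result bits into the Boolean function number,
taking the columns in the same left-to-right order as Table 1. The names
BoolFn, rop_function_number, and outside_black are hypothetical; nothing here
comes from the SDK.

```c
#include <assert.h>

/* A Boolean function of one Pattern, Source, and Destination bit. */
typedef unsigned (*BoolFn)(unsigned p, unsigned s, unsigned d);

/* Walk the eight columns of the truth table, left (1,1,1) to right
   (0,0,0), and pack the results into the Boolean function number. */
unsigned rop_function_number(BoolFn fn)
{
    unsigned number = 0;
    int column;
    assert(fn != 0);
    for (column = 7; column >= 0; --column) {
        unsigned p = (column >> 2) & 1u;
        unsigned s = (column >> 1) & 1u;
        unsigned d = column & 1u;
        number = (number << 1) | (fn(p, s, d) & 1u);
    }
    return number;
}

/* White (1) unless both Source and Destination are white: NOT (D AND S). */
unsigned outside_black(unsigned p, unsigned s, unsigned d)
{
    (void)p;                 /* the Pattern does not matter here */
    return !(d & s);
}
```

Passing outside_black to rop_function_number yields 77H, the same number we
reached by hand.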
For each of the cases in the paint switch, blackening the bits we want to
count is just a matter of determining the ROP code and BitBlting the two
images.
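
For experimenting away from Windows, a ternary raster operation can be
imitated on packed monochrome bytes. This is only a sketch: rop3_bit and
rop3_byte are hypothetical helpers, and they assume the bit ordering implied
by Table 1, in which the Pattern, Source, and Destination bits together select
one bit of the two-digit Boolean function number. The numbers CCH, 88H, and
EEH are those of SRCCOPY, SRCAND, and SRCPAINT.

```c
#include <assert.h>

/* Result bit for one pixel: p, s, d are each 0 or 1. */
unsigned rop3_bit(unsigned index, unsigned p, unsigned s, unsigned d)
{
    assert(index <= 0xFFu && p <= 1u && s <= 1u && d <= 1u);
    return (index >> ((p << 2) | (s << 1) | d)) & 1u;
}

/* Apply the same function to eight pixels packed in a byte. */
unsigned char rop3_byte(unsigned index, unsigned char p,
                        unsigned char s, unsigned char d)
{
    unsigned char out = 0;
    int bit;
    for (bit = 7; bit >= 0; --bit) {
        unsigned r = rop3_bit(index, (p >> bit) & 1u,
                              (s >> bit) & 1u, (d >> bit) & 1u);
        out = (unsigned char)(out | (r << bit));
    }
    return out;
}
```

Because 0 is black, ANDing two images (88H) leaves black wherever either area
is black, which is the union, while ORing them (EEH) leaves black only where
both are black, which is the intersection.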
Counting the bits is now fairly straightforward. When we call the function
FindArea, we pass the handle to the visual DC because the pixels we need to
count have been BitBlted to the screen. We create a single-plane bitmap so
that the final array of bits will have a one-to-one correspondence with the
pixels they represent. Because we created our other bitmaps to be compatible
with the visual DC, they could have had some other color organization
(multiple bits per pixel or more than one color plane, for example). Now we
want them to be translated to a monochrome bitmap that has the dimensions of
the client area of our window. In the translation that will take place, if the
source was organized as a color bitmap, all source pixels which are the same
color as the source background will be white in the monochrome destination.
All other pixels will be black. For our example, this is just right. BitBlt
performs this translation when it detects a difference in the color
organization of the two bitmaps.
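
The color-to-monochrome rule is simple enough to mimic in a few lines of
portable C. This is a sketch of the rule as described above, not of what the
display driver actually does; color_to_mono is a hypothetical name, and colors
are treated as opaque values.

```c
#include <assert.h>
#include <stddef.h>

/* Pixels equal to the background color become white (1); all others
   become black (0). */
void color_to_mono(const unsigned long *src, unsigned char *dst,
                   size_t count, unsigned long background)
{
    size_t i;
    assert(src != NULL && dst != NULL);
    for (i = 0; i < count; ++i)
        dst[i] = (unsigned char)(src[i] == background ? 1 : 0);
}
```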
Next, we get information about the bitmap we just created and put it in the
BITMAP structure we call bmArea. (In this case, this call was not strictly
necessary.) We determine how much space we need to retrieve a copy of the bits
and make a call to LocalAlloc. After we lock the block and get a near pointer
to it, we retrieve our data with a call to GetBitmapBits.
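
The space calculation can be sketched in portable C, assuming scan lines are
padded to 16-bit word boundaries (the same assumption the bit counter relies
on). The function names are hypothetical; in the program itself the
corresponding values come from the BITMAP structure.

```c
#include <assert.h>

/* Bytes in one scan line, padded to a 16-bit word boundary. */
unsigned long scan_line_bytes(unsigned width_px, unsigned bits_per_pixel)
{
    unsigned long bits = (unsigned long)width_px * bits_per_pixel;
    assert(bits_per_pixel > 0);
    return ((bits + 15UL) / 16UL) * 2UL;
}

/* Bytes needed to hold a copy of every scan line in every plane. */
unsigned long bitmap_buffer_bytes(unsigned width_px, unsigned height_px,
                                  unsigned bits_per_pixel, unsigned planes)
{
    return scan_line_bytes(width_px, bits_per_pixel) * height_px * planes;
}
```

A 640 x 480 single-plane monochrome bitmap, for example, needs 80 bytes per
scan line and 38,400 bytes in all.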


Counting the Bits


A little counting and a little multiplying and we're home free. The procedure
_COUNT_BITS, shown in Listing One (page 14), is an assembly language routine
that does the following things: It takes the bits in word size hunks; it
shifts the word left one bit at a time; and for each 0 (or black) bit that
"falls off" the top and into the Carry Flag, it increments an accumulator. The
value of the accumulator is passed back to the caller as SetBits. I'll be the
first to admit that this isn't reminiscent of rocket science or Handel's Water
Music. It does accomplish a couple of things, though. First, counting bits is
the sort of job where performance can be drastically improved through the use
of assembly language, and it's a tool we shouldn't avoid when its use is
justified. Second, you'll notice that we went to some length to avoid changing
any segment registers. It was all right to alter and restore segment registers
in previous versions of Windows, but this is no longer true. With respect to
mixed-language Windows programming, it's worthwhile to follow most of the
rules your mother instructed you to practice: Be careful with things that
don't belong to you (segment registers), pick up after yourself (make sure you
clear the stack of local data and saved registers), and don't be getting into
things you haven't asked to borrow (by writing beyond the end of allocated
memory objects). After we count the bits and multiply by fAreaPerPixel, we
report our estimate using a MessageBox. Whew!
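
For checking the assembly routine's results, a plain C counterpart is handy.
The sketch below is not a translation of Listing One; count_black_bits is a
hypothetical name, and the code assumes the same data layout: one bit per
pixel, leftmost pixel in the most significant bit of each byte, and scan lines
padded to 16-bit boundaries.

```c
#include <assert.h>
#include <stddef.h>

/* Count 0 (black) bits in a monochrome bitmap whose scan lines are
   padded to 16-bit boundaries. Pad bits are never examined. */
unsigned long count_black_bits(const unsigned char *bits,
                               unsigned height, unsigned width_px)
{
    size_t stride = ((size_t)width_px + 15u) / 16u * 2u; /* bytes per line */
    unsigned long total = 0;
    unsigned row, x;
    assert(bits != NULL);
    for (row = 0; row < height; ++row) {
        const unsigned char *line = bits + (size_t)row * stride;
        for (x = 0; x < width_px; ++x) {
            unsigned mask = 0x80u >> (x & 7u);
            if ((line[x >> 3] & mask) == 0)
                ++total;
        }
    }
    return total;
}
```

It is far slower than shifting whole words through the Carry Flag, but it
makes a convenient cross-check on small test bitmaps.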


Closings and Caveats


There we have the concept of finding areas with raster operations. With a
little adjustment, the strategy should work in about any environment. Of
course, the success of this approach is highly dependent on the original
coordinate data. The larger the areas, the less accurate the estimates. Much
digital map data is geometrically manipulated to optimize for its use in all
sorts of different applications. Inherently, when we try to measure flat
squares on our somewhat spherical and bumpy earth, things get complicated. Not
all cartographic projections would yield coordinates that would work well in
this scheme. Another deficiency of this example is the expectation of
monochrome areas. Dealing with colored areas would be more useful, but more
involved. Resizing the window and round-off error will result in minor changes
in the reported area.


Acknowledgments


Thanks to Stan Moll of the U.S. Geological Survey/National Mapping Division,
for spending a morning rounding up digital data for Area 1 and Area 2. Also
many thanks to Bill Clark, colleague and professor of Computer Science at the
University of Alaska, for his advice and service as a sounding board. And, as
are many Windows programmers, I am greatly in the debt of a wise stranger,
Charles Petzold, author of the indispensable Programming Windows 3.

_QUICK APPROXIMATIONS OF POLYGONAL AREAS USING BITBLT_
by Nancy Nicolaisen


[LISTING ONE]


; Parameters:
; BP+6 = Offset to bits
; BP+8 = Bitmap Height in scan lines
; BP+10 = Bitmap width in pixels
; Local Data
; BP-2 = hiword of the bit count
; BP-4 = loword of the bit count

.8086

.MODEL MEDIUM
memM EQU 1
INCLUDE CMACROS.INC

.CODE
 PUBLIC _COUNT_BITS

_COUNT_BITS PROC
 PUSH BP ; Preserve BP
 MOV BP, SP ; Set stack frame pointer
 SUB SP, 4 ; This will hold our bit count
 PUSH DI ; Preserve DI
 PUSH SI ; Preserve SI

 SUB AX, AX ;Zero the accumulator reg
 MOV [BP-2], AX ;Zero the local
 MOV [BP-4], AX ; storage
 MOV DI, [BP+8] ;Height of the bitmap in scan lines
 MOV SI, [BP+6] ;DS:offset to bits

scanning_line: MOV DX, [BP+10] ;Number of bits to check in each
 ; scan line
scanning_bytes: MOV BX, [SI] ;Get a word
 XCHG BH, BL ;Swap bytes so the leftmost pixel is the high bit
 MOV CL, 16 ;prepare to scan 16 bits
shifting_bits: SAL BX,1 ;shift most significant remaining bit
 JC shift_again ;if it was set, shift again
 INC AX ;increment ax for each 0 bit

shift_again: DEC DX ;Decrement bits/line
 JZ new_line ;Process a new scan line?
 DEC CL ;do we have bits left in this word?
 JNZ shifting_bits ;keep shifting this word
 INC SI ;if not, advance to the
 INC SI ; next word
 JMP scanning_bytes ; and look for black bits

new_line: ADD [BP-4], AX ;Add this line's total to our
 ADC WORD PTR [BP-2], 0 ; accumulator

 DEC DI ;Decrement the line counter
 JZ pass_bit_count ;If we're done, do exit things
 INC SI ;Else bump the pointer to bits
 INC SI ; past the word we just scanned
 ; and any pad bytes
 SUB AX, AX ;Zero the accumulator
 JMP scanning_line ;Do the next line

pass_bit_count:
 POP SI ;Restore SI
 POP DI ;Restore DI
 POP AX ;Loword of bitcount
 POP DX ;Hiword of bitcount
 POP BP ;Restore BP

 RET ;Return the bit count in DX:AX
 ; and let the caller clear the stack


_COUNT_BITS ENDP
END






[LISTING TWO]

/***************************************************************************/
/* I N C L U D E F I L E S */
/***************************************************************************/

#include "\windev\include\windows.h"
#include "areas.h"


/***************************************************************************/
/* T H E P R O G R A M ' S G L O B A L V A R I A B L E S */
/***************************************************************************/

static HANDLE hInst; /* Data that can be referenced thruout */
static HWND hWnd; /* the program, but is not normally */
long float fAreaPerPixel;
HDC hDCMem;
HBITMAP hOldBitmap;
/***************************************************************************/
/* M A I N P R O G R A M */
/***************************************************************************/

int PASCAL WinMain (hInstance,
 hPrevInstance,
 lpszCmdLine,
 cmdShow)

HANDLE hInstance, hPrevInstance;
LPSTR lpszCmdLine; /* Pointer to the command line. */
int cmdShow; /* Iconic or Tiled when start. */
{
 MSG msg;

 hInst = hInstance;
 if( hPrevInstance )
 {
 return (FALSE);
 }

 Init (hInstance, cmdShow); /* Initialization rtn.*/

 while /* The main loop: */
 (GetMessage((LPMSG)&msg, NULL, 0, 0)) /* (terminated by a QUIT) */
 {
 TranslateMessage(&msg); /* Have Windows translate */
 DispatchMessage(&msg); /* Have Windows give mess */
 /* to the window proc. */
 }
 exit(msg.wParam); /* End of the program. */
}


/***************************************************************************/
/* I N I T I A L I Z A T I O N */
/***************************************************************************/

int FAR PASCAL Init (hInstance, cmdShow)

HANDLE hInstance;
int cmdShow;
{
 WNDCLASS rClass; /* Window class structure. */
 int FullScreenX;
 int FullScreenY;

 rClass.lpszClassName = (LPSTR) "NN:AREA";
 rClass.hInstance = hInstance;
 rClass.lpfnWndProc = WindowProc;
 rClass.hCursor = LoadCursor (NULL, IDC_ARROW) ;
 rClass.hIcon = LoadIcon (hInstance, "AREAS");
 rClass.lpszMenuName = (LPSTR) "AreaFinder";
 rClass.hbrBackground = GetStockObject (WHITE_BRUSH) ;
 rClass.style = CS_HREDRAW | CS_VREDRAW | CS_DBLCLKS;
 rClass.cbClsExtra = 0;
 rClass.cbWndExtra = sizeof( WORD );

 RegisterClass ( &rClass); /* Register the class. */

 hInst = hInstance;
 FullScreenY = GetSystemMetrics( SM_CYFULLSCREEN );
 FullScreenX = GetSystemMetrics( SM_CXFULLSCREEN );
 hWnd = CreateWindow((LPSTR) "NN:AREA", /* Window class name. */
 "Using BitBlt to Estimate Areas - Dr. Dobbs",
 /* Window Title */
 WS_OVERLAPPEDWINDOW | WS_MAXIMIZE,
 /* Type of window. */
 0, /* Where the window should */
 0, /* go when the app opens... */
 (FullScreenX / 16 ) * 16,
 /* Make a scan line fill an */
 /* even # words */
 FullScreenY, /* */
 NULL, /* No parent for this wind */
 NULL, /* Use the class menu. */
 hInstance, /* Who created this window. */
 NULL /* No params to pass on. */
 ) ;
 ShowWindow( hWnd, cmdShow );
 return TRUE;
 }


 /***************************************************************************/
 /* T H E W I N D O W P R O C E D U R E */
 /***************************************************************************/


long FAR PASCAL WindowProc (hWnd, message, wParam, lParam )

HWND hWnd; /* Handle of the window */

unsigned message; /* Message type */
WORD wParam; /* Message 16 bit param */
LONG lParam; /* Message 32 bit param */
{


 switch (message) /* Check the mess type */
 {
 case WM_COMMAND:
 switch(wParam)
 { /* Store wParam in the */
 case ID_AREA1: /* WindowWord and tell */
 case ID_AREA2: /* the paint proc about */
 case ID_UNION: /* it... */
 case ID_INTERSECTION:
 case ID_EXCLUSIVE:
 case ID_OUTSIDE:
 case ID_SHOWPOLYS:
 SetWindowWord(hWnd, 0, wParam );
 InvalidateRect( hWnd, NULL, TRUE );
 UpdateWindow( hWnd );
 break;

 default:
 break;
 }
 break;

 case WM_CREATE:
 InvalidateRect( hWnd, NULL, TRUE );
 UpdateWindow( hWnd );
 break;

 case WM_PAINT:
 PaintAreaWindow( hWnd );
 break;

 case WM_SIZE: /* Don't let the window change */
 break; /* size... */

 case WM_DESTROY:
 PostQuitMessage(0); /* send yourself a QUIT */
 break; /* message. */

 default:
 return(DefWindowProc(hWnd, message, wParam, lParam));
 break;
 }
 return(0L);
}
RECT rWorkRect;
/***************************************************************************/
/* T H E P A I N T P R O C E D U R E */
/***************************************************************************/

int FAR PASCAL PaintAreaWindow (hWnd)

HWND hWnd; /* Handle of the window. */
{


PAINTSTRUCT ps;
HDC hDC;
HANDLE hArea1Meta;
HANDLE hArea2Meta;
HBITMAP hArea1;
HBITMAP hArea2;


WORD WhatToEstimate;


hDC = BeginPaint( hWnd, &ps);
GetClientRect( hWnd, &rWorkRect );
SetMapMode( hDC, MM_ISOTROPIC ); /* X and Y are dimensionally equal */
SetViewportOrg( hDC, 0, rWorkRect.bottom );
 /* The viewport origin is at the */
 /* left corner of the screen... */
SetViewportExt( hDC, rWorkRect.right, -rWorkRect.bottom );
 /* X increases to the right and Y */
 /* increases going up... */
SetWindowOrg( hDC, X_ORIGIN, Y_ORIGIN );
SetWindowExt( hDC, X_EXTENT, Y_EXTENT );
 /* Logical dimensions depend on the */
 /* data which defines the areas. */
 /* The constants are defined in */
 /* Areas.h */
hDCMem = CreateCompatibleDC( hDC );
SetMapMode( hDCMem, MM_ISOTROPIC );

SetViewportOrg( hDCMem, 0, rWorkRect.bottom );
SetViewportExt( hDCMem, rWorkRect.right, -rWorkRect.bottom );
SetWindowOrg( hDCMem, X_ORIGIN, Y_ORIGIN );
SetWindowExt( hDCMem, X_EXTENT, Y_EXTENT );
 /* Create a memory display context */
 /* that simulates the visible DC...*/

hArea1 =
 CreateCompatibleBitmap( hDCMem, rWorkRect.right , rWorkRect.bottom );
hOldBitmap = SelectObject( hDCMem, hArea1 );
 /* Create a bitmap with the same */
 /* organization as this device and */
 /* select it into the memory DC... */
hArea1Meta = GetMetaFile( "Area1.bas" );
SaveDC( hDCMem );
PlayMetaFile( hDCMem, hArea1Meta );
RestoreDC( hDCMem, -1 );
DeleteMetaFile( hArea1Meta );

hArea2 =
 CreateCompatibleBitmap( hDCMem, rWorkRect.right, rWorkRect.bottom );
SelectObject( hDCMem, hArea2 );
 /* For convenience, we'll construct our area bitmaps using prerecorded */
 /* metafiles. Since the metafile can change the attributes of the DC, */
 /* it's often a good practice to use the context stack to preserve the */
 /* DC before playing the metafile, and restore it afterward... */
hArea2Meta = GetMetaFile( "Area2.bas" );
SaveDC( hDCMem );
PlayMetaFile( hDCMem, hArea2Meta );
RestoreDC( hDCMem, -1 );
DeleteMetaFile( hArea2Meta );
fAreaPerPixel =
 ( (float)X_EXTENT / (float)rWorkRect.right ) *
 ( (float)Y_EXTENT / (float)rWorkRect.bottom );
 /* Calculate the area of 1 Pixel... */


WhatToEstimate = GetWindowWord( hWnd, 0 );
switch( WhatToEstimate )
 {
 case ID_AREA1:
 SelectObject( hDCMem, hArea1 );
 DeleteObject( hArea2 );
 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, SRCCOPY);
 SelectObject( hDCMem, hOldBitmap );
 DeleteObject( hArea1 );
 FindArea( hDC, &rWorkRect );
 break; /* Estimate the area of Area 1 */

 case ID_AREA2:
 DeleteObject( hArea1 );
 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, SRCCOPY);
 SelectObject( hDCMem, hOldBitmap );
 DeleteObject( hArea2 );
 FindArea( hDC, &rWorkRect ); /* Estimate the area of Area 2... */
 break;

 case ID_UNION:
 SelectObject( hDCMem, hArea1 );
 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, SRCCOPY);
 SelectObject( hDCMem, hArea2 );
 DeleteObject( hArea1 );

 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, SRCAND );
 SelectObject( hDCMem, hOldBitmap );
 DeleteObject( hArea2 );
 FindArea( hDC, &rWorkRect ); /* Estimate the area of the union */
 break;

 case ID_INTERSECTION:
 SelectObject( hDCMem, hArea1 );
 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, SRCCOPY);
 SelectObject( hDCMem, hArea2 );
 DeleteObject( hArea1 );
 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, SRCPAINT );
 SelectObject( hDCMem, hOldBitmap );
 DeleteObject( hArea2 );
 FindArea( hDC, &rWorkRect ); /* Estimate the area of the intersection... */
 break;

 case ID_EXCLUSIVE:

 SelectObject( hDCMem, hArea1 );
 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, SRCCOPY);
 SelectObject( hDCMem, hArea2 );
 DeleteObject( hArea1 );

 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, 0x990066 );
 SelectObject( hDCMem, hOldBitmap );
 DeleteObject( hArea2 );
 FindArea( hDC, &rWorkRect ); /* Estimate the area of an exclusive
 combination... */
 break;

 case ID_OUTSIDE:
 SelectObject( hDCMem, hArea1 );
 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, SRCCOPY);
 SelectObject( hDCMem, hArea2 );
 DeleteObject( hArea1 );

 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, 0x7700e6 );
 SelectObject( hDCMem, hOldBitmap );
 DeleteObject( hArea2 );
 FindArea( hDC, &rWorkRect ); /* Estimate the area outside the
 combined areas...*/
 break;

 case ID_SHOWPOLYS:
 SelectObject( hDCMem, hArea1 );

 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, SRCCOPY);
 MessageBox(hWnd, "This is Area 1... ", "Areas", MB_OK );
 SelectObject( hDCMem, hArea2 );
 DeleteObject( hArea1 );

 BitBlt( hDC, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDCMem, X_ORIGIN, Y_ORIGIN, SRCCOPY );
 SelectObject( hDCMem, hOldBitmap );
 DeleteObject( hArea2 );
 MessageBox(hWnd, "This is Area 2... ", "Areas", MB_OK );
 /* Let's have a look at the areas... */
 break;


 default:
 break;
 }
DeleteDC( hDCMem ); /* Always relinquish GDI leftovers!! */
DeleteObject( hOldBitmap );
ValidateRect( hWnd, NULL );
 /* Validate the client area... */
EndPaint (hWnd, &ps); /* Finished painting for now. */
SetWindowWord(hWnd, 0, NULL );
return TRUE;
}


/***************************************************************************/
/* F I N D A R E A */
/***************************************************************************/
void PASCAL FindArea( hDC, lprWork )

 HDC hDC;
 LPRECT lprWork;

{

 HBITMAP hArea;
 HANDLE hAreaMemory;
 BITMAP bmArea;
 PSTR pAreaBits;
 unsigned int NumberBytes;


 long float fArea;
 long SetBits;
 char szApproxArea[12];



 hArea = CreateBitmap( lprWork->right, lprWork->bottom, 1, 1, NULL );
 /* Create a monochrome bitmap with the
 same dimensions as the client area...
 */
 SelectObject( hDCMem, hArea );

 BitBlt( hDCMem, X_ORIGIN, Y_ORIGIN, X_EXTENT, Y_EXTENT,
 hDC, X_ORIGIN, Y_ORIGIN, SRCCOPY);
 /* Select it into the memory DC and copy
 the client area to it...
 */
 GetObject( hArea, sizeof( BITMAP ), &bmArea );
 /* Get the dimensions and color organization
 information about the monochrome bitmap...
 */

 NumberBytes = bmArea.bmPlanes * bmArea.bmHeight * bmArea.bmWidthBytes;
 hAreaMemory = LocalAlloc( LMEM_FIXED | LMEM_ZEROINIT, NumberBytes );
 if( hAreaMemory == NULL )
 {
 MessageBox(hWnd, "Try closing other windows or resizing Areas...",
 "Unable to allocate memory!", MB_OK | MB_ICONHAND );
 return;
 }
 pAreaBits = LocalLock( hAreaMemory );
 GetBitmapBits( hArea, (DWORD)NumberBytes, (LPSTR)pAreaBits );
 /* Allocate memory and get bits... */
 SetBits = COUNT_BITS( pAreaBits, bmArea.bmHeight, bmArea.bmWidth);

 /* Count the Black ( 0H ) bits... */
 LocalUnlock( hAreaMemory ); /* Release memory... */
 LocalFree( hAreaMemory );
 SelectObject( hDCMem, hOldBitmap );
 DeleteObject( hArea );
 /* Delete the monochrome bitmap... */
 fArea = SetBits * fAreaPerPixel;
 sprintf( szApproxArea, "%.2f", fArea );

 MessageBox(hWnd, szApproxArea, "Approximate area in square meters... ",
 MB_OK );
 /* Calculate and display area... */
 return;
}
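
COUNT_BITS is used by FindArea but is not defined in this listing. The
following is a minimal portable sketch of what such a routine might do, under
two assumptions: scan lines are padded to a 16-bit boundary (as bmWidthBytes
implies), and pixels are packed most-significant bit first. The names
count_black_bits, width_bytes, and area_per_pixel are hypothetical and are
not taken from the original source.

```c
#include <stddef.h>

/* Bytes per scan line when each line is padded to a 16-bit boundary
   (assumption: this matches the bitmap's bmWidthBytes). */
static size_t width_bytes(int width_px)
{
    return (size_t)((width_px + 15) / 16) * 2;
}

/* Count the 0 (black) bits in a 1-bit-per-pixel bitmap. Pixels are assumed
   to be packed most-significant bit first; padding bits that lie beyond
   width_px are ignored so they do not inflate the count. */
long count_black_bits(const unsigned char *bits, int height, int width_px)
{
    size_t stride = width_bytes(width_px);
    long count = 0;
    int x, y;

    for (y = 0; y < height; y++) {
        const unsigned char *row = bits + (size_t)y * stride;
        for (x = 0; x < width_px; x++) {
            unsigned char mask = (unsigned char)(0x80 >> (x % 8));
            if ((row[x / 8] & mask) == 0)
                count++;
        }
    }
    return count;
}

/* Logical area covered by one pixel, mirroring the fAreaPerPixel
   calculation in PaintAreaWindow. */
double area_per_pixel(double x_extent, double y_extent,
                      int client_width, int client_height)
{
    return (x_extent / client_width) * (y_extent / client_height);
}
```

The estimated area is then the black-bit count multiplied by the area of one
pixel, which is the product FindArea computes before displaying the result.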



Special Issue, 1991
PROPVIEW: A WINDOWS FAMILY BROWSER
 This article contains the following executables: PROGEDIT.ARC


Mike Klein


Mike is a software engineer and specializes in Microsoft Windows, HP New Wave,
and Novell Netware. He is also the author of several books and numerous
magazine articles, and can be reached at 500 Cole St., San Francisco, CA
94117, via CompuServe at 73750,2152, and on M&T Online as MikeKlein.


"Subclassing: A window or set of windows that belong to the same window class,
and whose messages are intercepted and processed by another window function
(or functions) before being passed to the class window function." -- Microsoft
Windows SDK
"Subclassing: A legal means by which a programmer can appropriate and use code
and objects developed by others." -- Mike Klein
Subclassing is a method of intercepting and possibly processing the messages
going to an object, whether it be an application's menu bar or a custom
control. Messages going to an object may be logged (examined and passed on),
acted on and passed along, acted on and then discarded, or just discarded
altogether.
You don't have to be using C++ or Smalltalk to benefit from subclassing
techniques, since Microsoft Windows supports subclassing and several other
object-oriented programming methods as well, including easy code reuse and
inheritance. All you need is a Windows-approved C compiler and the Windows SDK
to start programming in an object-oriented environment.
Any menu or window on the desktop can be hooked into and subclassed. This
means that anything is fair game, whether it's the listbox in your application
that you need to enhance, or the menu bar in Aldus PageMaker that needs an
extra command or two. I have to admit, at first this made me think a little
about the legal ramifications. However, as long as you've actually purchased
the application, nobody can really complain of any wrongdoing. After all, you
haven't actually modified anybody's code -- just the way it interacts with
Windows and other objects. One heck of a lot of control can be gained by
subclassing an application or control. With the SDK's EXEHDR and Spy
utilities, which dump a program's header information and show the messages
its windows receive, you can learn much of what you need to know about an
application and how it was developed.


Manipulating Objects


The key to subclassing is Windows' open architecture -- every window has an
open and documented message-based interface through which creation,
manipulation, display, and destruction can be accomplished. Not everything,
however, can be done with a message; sometimes you need to modify a window's
internal structure. The benefits of subclassing can be achieved in one of two
ways: by hooking into a window function chain and passing unprocessed messages
down the line; or by creating a new window class.
To illustrate subclassing, I'm including with this article two programs:
ProgEdit, a spawned copy of Notepad (with the important distinction of having
an extra menu option for selecting tab stops, see Figure 1); and
BetterListBox, an example of a superclassed control.
All too often I've wanted to view program source code with Notepad and ended
up with poorly formatted output. ProgEdit was a quick fix to an annoying
problem. In fact, I think that this classifies as one of those few times that
a "quick" project actually ends up being finished quickly, and turns into a
useful utility.
BetterListBox, on the other hand, is an example of subclassing a listbox
control to add data entry and other input enhancements. Too many Windows
applications take the cheap way out with Windows' built-in listbox class,
which is hardly designed for speedy data input. Windows itself is very
inconsistent in how it handles listboxes and combo boxes in general.
BetterListBox lets you build a control that will enhance future applications.
After reading this article, you'll see that subclassing is a powerful and easy
technique for developing code. The benefit of not having to debug the other
developer's code (which hopefully already works) alone justifies the simple
interfacing required to subclass an object. Windows per se doesn't make
subclassing difficult; it's the poorly laid-out SDK manuals (not enough
cross-referencing) and lack of complete descriptions for all the different
window messages. The manuals just aren't clear enough when it comes to the
Windows nitty-gritty, meaning heavy memory management, subclassing, owner
draw, MDI, and complicated graphics issues.


The Inside Skinny


For both ProgEdit and BetterListBox, we need to take a look at the structure
common to all windows. This structure, WNDCLASS, is shown in Example 1.
Example 1: The WNDCLASS structure, common to all windows

 typedef struct tagWNDCLASS
 {
 WORD style; /* Window class style */
 long (FAR PASCAL *lpfnWndProc)(); /* Window class func */
 int cbClsExtra; /* Class extra data */
 int cbWndExtra; /* Window extra data */
 HANDLE hInstance; /* Program instance */
 HICON hIcon; /* Class icon to use */
 HCURSOR hCursor; /* Class cursor to use */
 HBRUSH hbrBackground; /* Class bckgrnd brush */
 LPSTR lpszMenuName; /* Class menu name */
 LPSTR lpszClassName; /* Window class name */
 } WNDCLASS;

We're most interested in the window's class function, which is responsible for
processing messages for the window class. The functions in Example 2
manipulate a window's internal structure and provide the hooks needed for
subclassing an object.
Example 2: Functions that manipulate a window's internal structure and provide
the hooks needed for subclassing an object

 BOOL GetClassInfo (HANDLE hInst, LPSTR lpClassName, LPWNDCLASS
 lpWndClass);
 HMENU GetMenu (HWND hWnd);
 HMENU GetSubMenu (HMENU hMenu, int nPos);
 HMENU GetSystemMenu(HWND hWnd, BOOL bRevert);
 LONG GetWindowLong(HWND hWnd, int nIndex);

 LONG SetWindowLong(HWND hWnd, int nIndex, DWORD dwNewLong);
 WORD GetWindowWord(HWND hWnd, int nIndex);
 WORD SetWindowWord(HWND hWnd, int nIndex, WORD wNewWord);



ProgEdit: Hooking Into a Foreign Application


ProgEdit demonstrates how to attach to a window function, passing any
unprocessed messages down the chain to other handlers, and eventually to
Windows itself. It doesn't require too much description because it's a pretty
simple program. Listings One through Five show the actual source code for
ProgEdit, including the header file, make file, definition file, and so on.
ProgEdit starts by creating a procedure instance for the window function that
will intercept messages destined for Notepad's window function. A procedure
instance, or "thunk," is how Windows resolves its dynamic linking problems:
it binds the function to the program's data segment, and the address is
computed at runtime. Next, ProgEdit uses the handle of Notepad's main window
to set a new value for that window's function. We're essentially POKEing in a
new value for the window function to pick up and use. In return, we get a
PEEK indicating the old value of whatever we changed, which in this case is
the main window's original function. We'll use this function pointer later in
our own window proc when we need to pass on message processing to the
window's original function.
Next, a couple of menu-related functions query the Notepad main window for a
menu handle, and append a new menu item on the menu bar. It's the "item ID" of
this menu item that we're filtering and trapping for our custom window
procedure.
It's as simple as that. Any messages saying somebody clicked on the "Tabs"
menu item are processed by us; any other messages are passed on to the window's
original procedure. Tabs are set by a simple dialog box that pops up and asks
the user for a tab amount. The tab amount is remembered by a custom profile
statement. I have to admit, the profile setting initially seemed like a good
idea, but now doesn't really make much sense. I mean, how often do I switch
from my default (which is four)?
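
The hook-and-forward pattern described above can be sketched in portable C
without the Windows headers. This is only a simulation under stated
assumptions: struct Window, set_window_proc, send_message, MSG_COMMAND, and
ID_TABS are hypothetical stand-ins for a window, SetWindowLong with
GWL_WNDPROC, SendMessage, WM_COMMAND, and the menu item ID. As in Listing
Five, the filter acts on the one command it cares about and then forwards
every message to the procedure it replaced.

```c
#include <stddef.h>

#define MSG_COMMAND 1      /* hypothetical stand-in for WM_COMMAND */
#define ID_TABS     100    /* hypothetical menu item ID */

struct Window;
typedef long (*WndProc)(struct Window *w, unsigned msg, unsigned param);

struct Window {
    WndProc proc;           /* procedure that currently receives messages */
    int     tabs_clicked;   /* times the filter saw the Tabs command */
    int     original_calls; /* times the original procedure ran */
};

/* Saved so the filter can pass messages down the chain. */
WndProc original_proc = NULL;

/* Install a new procedure and hand back the old one (the POKE and PEEK). */
WndProc set_window_proc(struct Window *w, WndProc new_proc)
{
    WndProc old = w->proc;
    w->proc = new_proc;
    return old;
}

/* The application's own procedure; it knows nothing about the new item. */
long original_wnd_proc(struct Window *w, unsigned msg, unsigned param)
{
    (void)msg;
    (void)param;
    w->original_calls++;
    return 0;
}

/* The filter: note the command it cares about, then forward everything. */
long filter_wnd_proc(struct Window *w, unsigned msg, unsigned param)
{
    if (msg == MSG_COMMAND && param == ID_TABS)
        w->tabs_clicked++;
    return original_proc(w, msg, param);
}

/* Deliver a message through whatever procedure is currently installed. */
long send_message(struct Window *w, unsigned msg, unsigned param)
{
    return w->proc(w, msg, param);
}
```

To use the sketch, call set_window_proc with filter_wnd_proc, store the
returned pointer in original_proc, and deliver messages with send_message;
only the ID_TABS command increments tabs_clicked, while every message still
reaches the original procedure.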


BetterListBox: Making the Bad Better


Windows is an incredible "software tinkertoys" set. The building blocks may be
extremely simple, but then again so is the atom, and look what can be created
from enough of them! From Windows' base object window classes, several new
classes may be created, including combo boxes (at a low level) and
spreadsheets (at a slightly higher level). The beauty of Windows is that it
usually provides several ways to approach a problem, each with its own
trade-offs. Windows' listboxes are a perfect starting point for building a new
control.
At first glance, it would seem that listboxes are extremely powerful, but in
fact it's quite the opposite, as they depend upon Windows' GDI for most of
their flash. Listboxes have several serious shortcomings, the first of which is
primitive data entry capabilities. Listboxes are by default read-only -- they
can't be edited and traversed like an Excel or Wingz spreadsheet. Second,
there are too many inconsistencies between single-, multiple-, and
extended-selection listboxes. Single-selection listboxes don't have an
initially highlighted item; however, once you set the highlight you can't
remove it (using the keyboard). The API is also lacking a few key listbox
messages. So what does this mean?


Subclass and Live to Tell the Story!


Although creating BetterListBox only took a couple of days, getting used to
the different messaging quirks made programming a nightmare. (Listings Six
through Ten show the actual source code for BetterListBox, including the
header file, make file, definition file, and so on.) All too many listboxes in
commercial Windows applications have the same three buttons to the right of
them: Add, Delete, and Edit. While this is fine for the keyboard illiterate
and mouse retentive, it isn't so good for clerks updating inventory or temps
being paid for their productivity. Not only does Microsoft need to make
Windows work fast, but it needs Windows to allow people to work fast too -- an
important difference that's been overlooked by too many GUI applications.
BetterListBox was written specifically for single-selection, single-column
listboxes. However, with the addition of probably ten lines of code, it should
function with multicolumn and multiple-selection listboxes as well.
The key to creating a good listbox is to trap for some common-sense
characters, such as DEL, INS, ENTER, PGUP, PGDN, and so on. Although Windows
has a listbox style called LBS_WANTKEYBOARDINPUT, it will only send you
keystrokes when it has the input focus. Note that listboxes don't get the
focus when they are empty, which is a big problem when you want to add
something to an empty listbox.
Besides allowing a lot of keyboard shortcuts, I wanted it to be easy to edit
items inside the listbox. Whereas most applications pop up a dialog box to
enter a new item, BetterListBox creates an edit control exactly over the
currently highlighted listbox cell and copies the contents of the listbox
entry into the edit window, afterwards setting input focus to itself. With
BetterListBox, pressing DEL deletes the current listbox cell, pressing INS
copies the current entry and inserts a new listbox cell at your current
position, and pressing the ENTER key switches in and out of edit mode for the
listbox cell. The nice thing about the edit mode is that all the normal
listbox navigation keys function as normal, including the up and down arrow,
PGUP, and PGDN. Another added benefit is that you're kept in edit mode the
whole time, as in a spreadsheet. The middle and right mouse buttons (single-
or double-clicking) act just like the left mouse button, putting them to good
use. The other edit control window class that I created, BetterEditCtrl, is
also a good start into designing an even more enhanced version.
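
The key handling just described can be modeled in portable C, independent of
the Windows listbox messages. This is a minimal sketch, not the code from
Listing Ten: struct ListModel, enum Key, and handle_key are hypothetical
names, and the sketch makes no attempt to handle insertion into an empty
list. It shows DEL removing the current cell, INS duplicating the current
entry at the current position, ENTER toggling edit mode, and the arrow keys
moving the highlight without leaving edit mode.

```c
#include <string.h>

#define MAX_ITEMS 32
#define MAX_TEXT  50

/* Hypothetical key codes; a real control would see virtual-key codes. */
enum Key { KEY_DEL, KEY_INS, KEY_ENTER, KEY_UP, KEY_DOWN };

struct ListModel {
    char items[MAX_ITEMS][MAX_TEXT];
    int  count;      /* number of cells in use */
    int  current;    /* index of the highlighted cell */
    int  edit_mode;  /* nonzero while the cell is being edited */
};

/* Apply one keystroke to the model. Returns 1 if the key was handled. */
int handle_key(struct ListModel *m, enum Key key)
{
    int i;

    switch (key) {
    case KEY_DEL:                       /* delete the current cell */
        if (m->count == 0)
            return 0;
        for (i = m->current; i < m->count - 1; i++)
            strcpy(m->items[i], m->items[i + 1]);
        m->count--;
        if (m->current >= m->count && m->current > 0)
            m->current--;
        return 1;

    case KEY_INS:                       /* duplicate the current cell */
        if (m->count == 0 || m->count == MAX_ITEMS)
            return 0;
        for (i = m->count; i > m->current; i--)
            strcpy(m->items[i], m->items[i - 1]);
        m->count++;
        return 1;

    case KEY_ENTER:                     /* switch in and out of edit mode */
        m->edit_mode = !m->edit_mode;
        return 1;

    case KEY_UP:                        /* navigation leaves edit_mode alone */
        if (m->current > 0)
            m->current--;
        return 1;

    case KEY_DOWN:
        if (m->current < m->count - 1)
            m->current++;
        return 1;
    }
    return 0;
}
```

Keeping the model separate from the window code is a design choice made for
this sketch only; it makes the keystroke rules easy to check on their own.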
If multiple controls are present in the same dialog box, you'll need to
respond to the window message WM_GETDLGCODE, which is used by Windows to
allow a particular control to process the navigation and interaction between
the other controls in the dialog. There can be several window procedures in a
chain filtering messages for the same window, so Windows has to ask which one
will be responsible for processing this input. This means that you too can be
in charge of processing the TAB, reverse TAB, and arrow keys. This really
doesn't present a problem, though. In the case of the TAB key, you'd want to
call GetNextDlgTabItem and set the focus to whatever window handle is
returned. What this function does is tell you, based on the control you pass,
which control is ahead of or behind you (counting only controls with the
WS_TABSTOP setting, that is). In this manner, you can process the TAB and
reverse TAB keys quite effortlessly.
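
The tab-order search that GetNextDlgTabItem performs can be illustrated with
a small portable sketch. It is a simulation under stated assumptions, not the
Windows implementation: struct Control and next_tab_item are hypothetical
names, and the dialog's controls are modeled as a plain array in tab order.
Starting from one control, the function walks forward (or backward for
reverse TAB), wraps around the end of the array, and stops at the first
control marked as a tab stop.

```c
struct Control {
    int id;        /* control ID */
    int tab_stop;  /* nonzero if the control has the tab-stop style */
};

/* Return the index of the next control (or the previous one, if previous
   is nonzero) that is a tab stop, wrapping around the array. If no other
   control qualifies, the starting index is returned. */
int next_tab_item(const struct Control *ctl, int count, int start,
                  int previous)
{
    int step = previous ? count - 1 : 1;  /* count - 1 steps back by one */
    int i = start;
    int n;

    for (n = 0; n < count; n++) {
        i = (i + step) % count;
        if (ctl[i].tab_stop)
            return i;
    }
    return start;
}
```

In a real control, the equivalent step would be to pass the returned window
handle to SetFocus when the TAB or reverse TAB key arrives.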


Conclusion


I welcome any suggestions for improving BetterListBox, and will be posting any
changes I make to the code to the Online BBS, which contains quite a number of
other Windows programming utilities and source as well. The listings for both
ProgEdit and BetterListBox are commented quite extensively, so start reading,
playing, and incorporating subclassing techniques into your own applications.

_SUBCLASSING APPLICATIONS_
by Mike Klein


[LISTING ONE]

# Standard Windows make file. The utility MAKE.EXE compares the
# creation date of the file to the left of the colon with the file(s)
# to the right of the colon. If the file(s) on the right are newer
# than the file on the left, Make will execute all of the command lines
# following this line that are indented by at least one tab or space.
# Any valid MS-DOS command line may be used.

# This line allows NMAKE to work as well
all: progedit.exe

# Update the resource if necessary
progedit.res: progedit.rc progedit.h progedit.ico
 rc -r progedit.rc

# Update the object file if necessary
progedit.obj: progedit.c progedit.h
 cl -W4 -c -AS -Gsw -Oad -Zp progedit.c


# Update the executable file if necessary, and if so, add the resource back in.
progedit.exe: progedit.obj progedit.def
 link /NOD progedit,,, libw slibcew, progedit.def
 rc progedit.res

# If the .res file is new and the .exe file is not, update the resource.
# Note that the .rc file can be updated without having to either
# compile or link the file.
progedit.exe: progedit.res
 rc progedit.res






[LISTING TWO]

#define IDC_TABAMT 100

int PASCAL WinMain(HANDLE, HANDLE, LPSTR, int);

LONG FAR PASCAL MyMainWndProc(HWND, unsigned, WORD, LONG);
BOOL FAR PASCAL TabAmount(HWND, unsigned, WORD, LONG);








[LISTING THREE]

; module-definition file for ProgEdit -- used by LINK.EXE

NAME PROGEDIT ; application's module name
DESCRIPTION 'Progedit - Programming Editor'
EXETYPE WINDOWS ; required for all Windows applications
STUB 'WINSTUB.EXE' ; Generates error message if application
 ; is run without Windows

;CODE can be moved in memory and discarded/reloaded
CODE PRELOAD MOVEABLE DISCARDABLE

;DATA must be MULTIPLE if program can be invoked more than once
DATA PRELOAD MOVEABLE MULTIPLE DISCARDABLE

HEAPSIZE 1024
STACKSIZE 5120 ; recommended minimum for Windows applications

; All functions that will be called by any Windows routine
; MUST be exported.

EXPORTS
 MyMainWndProc @1
 TabAmount @2









[LISTING FOUR]

#include <windows.h>
#include "progedit.h"

ProgEdit ICON PROGEDIT.ICO

TabAmount DIALOG 11, 25, 75, 24
CAPTION "Tab Amount"
STYLE WS_POPUPWINDOW | WS_CAPTION
BEGIN
 CONTROL "Tab Amt:", -1, "static",
 SS_RIGHT | WS_CHILD,
 10, 6, 30, 12
 CONTROL "4", IDC_TABAMT, "edit",
 ES_LEFT | WS_BORDER | WS_TABSTOP | WS_CHILD,
 45, 6, 20, 12
END









[LISTING FIVE]

/*****************************************************************************
PROGRAM: ProgEdit -- AUTHOR: Mike Klein -- VERSION: 1.0
FILE: progedit.exe -- REQUIREMENTS: Windows 3.x
PURPOSE: Example of adding a menu item to a "foreign" application. In this
case, the program is Windows' NotePad, and the extension added is a definable
tab stop setting to Notepad's menu bar.
*****************************************************************************/

#define _WINDOWS
#define NOCOMM

#include <windows.h>
#include <stdio.h>
#include <string.h>

#include "progedit.h"

/* Handles & vars needed for ProgEdit */
HANDLE hInstProgEdit;
FARPROC lpfnMyMainWndProc;
/* Handles & vars needed for Notepad */
HMENU hMenuNotepad;
HWND hWndNotepadMain;

HWND hWndNotepadEdit;
FARPROC lpfnNotepadMainWndProc;

int TabAmt;

BYTE Text[100];

/*****************************************************************************
 FUNCTION: WinMain
 PURPOSE : Calls initialization function, processes message loop
*****************************************************************************/
int PASCAL WinMain(HANDLE hInstance, HANDLE hPrevInstance, LPSTR lpCmdLine,
 int nCmdShow)
{
 MSG msg;
 struct
 {
 WORD wAlwaysTwo;
 WORD wHowShown;
 }
 HowToShow;

 struct
 {
 WORD wEnvSeg;
 LPSTR lpCmdLine;
 LPVOID lpCmdShow;
 DWORD dwReserved;
 }
 ParameterBlock;
 HowToShow.wAlwaysTwo = 2;
 HowToShow.wHowShown = SW_SHOWNORMAL;
 ParameterBlock.wEnvSeg = 0;
 ParameterBlock.lpCmdLine = "";
 ParameterBlock.lpCmdShow = (LPVOID) &HowToShow;
 ParameterBlock.dwReserved = NULL;
 hInstProgEdit = hInstance;

 /* Run a copy of NotePad */
 if(LoadModule("notepad.exe", (LPVOID) &ParameterBlock) < 32)
 {
 MessageBox(NULL, "Running instance of NotePad", "ERROR",
 MB_OK | MB_ICONSTOP);
 return(FALSE);
 }

 /* Get handles to Notepad's two main windows */
 hWndNotepadMain = GetActiveWindow();
 hWndNotepadEdit = GetFocus();

/* Set up different function pointers. Get a ptr to my hWnd func, then
** plug it into the other application's struct so it calls my func. Of
** course, at end of my func, I call func that I stole in first place. */
 lpfnMyMainWndProc=MakeProcInstance((FARPROC) MyMainWndProc, hInstProgEdit);
 lpfnNotepadMainWndProc=(FARPROC) SetWindowLong(hWndNotepadMain,GWL_WNDPROC,
 (DWORD) lpfnMyMainWndProc);

 /* Get handle to Notepad's menu and add Tabs to main menu */
 hMenuNotepad = GetMenu(hWndNotepadMain);

 AppendMenu(hMenuNotepad, MF_STRING, IDC_TABAMT, "&Tabs");
 DrawMenuBar(hWndNotepadMain);

 /* Read in tab amt from win.ini */
 GetProfileString("ProgEdit", "Tabs", "4", Text, 2);
 TabAmt = (HIWORD(GetDialogBaseUnits()) * (Text[0] - '0')) / 4;
 SendMessage(hWndNotepadEdit, EM_SETTABSTOPS, 1, (LONG) (LPINT) &TabAmt);

 /* Acquire and dispatch messages until a WM_QUIT message is received. */
 while(GetMessage(&msg, NULL, NULL, NULL))
 {
 TranslateMessage(&msg);
 DispatchMessage(&msg);
 }
 FreeProcInstance(lpfnMyMainWndProc);
 return(FALSE);
}

/*****************************************************************************
 FUNCTION: MyMainWndProc
 PURPOSE : Filter/replacement function for Notepad's MainWndProc()
*****************************************************************************/
LONG FAR PASCAL MyMainWndProc(HWND hWnd,unsigned wMsg,WORD wParam,LONG lParam)
{
 FARPROC lpProc;
 switch(wMsg)
 {
 case WM_COMMAND :
 switch(wParam)
 {
 case IDC_TABAMT :
 /* Set tab stops in edit window */
 lpProc = MakeProcInstance(TabAmount, hInstProgEdit);
 DialogBox(hInstProgEdit, "TabAmount", hWnd, lpProc);
 FreeProcInstance(lpProc);
 break;
 default :
 break;
 }
 break;
 case WM_DESTROY :
 SendMessage(hWndNotepadMain, WM_QUIT, 0, 0L);
 PostQuitMessage(0);
 break;
 default :
 break;
 }
 return(CallWindowProc(lpfnNotepadMainWndProc, hWnd, wMsg, wParam, lParam));
}

/*****************************************************************************
 FUNCTION: TabAmount
 PURPOSE : Processes messages for edit window that gets tab amount
*****************************************************************************/
BOOL FAR PASCAL TabAmount(HWND hWnd, unsigned wMsg, WORD wParam, LONG lParam)
{
 switch(wMsg)
 {
 case WM_INITDIALOG :

 /* Display the current tab setting */
 SendDlgItemMessage(hWnd, IDC_TABAMT, EM_LIMITTEXT, 1, 0L);
 SetDlgItemInt(hWnd, IDC_TABAMT, (TabAmt * 4) /
 HIWORD(GetDialogBaseUnits()), FALSE);
 return(TRUE);
 case WM_COMMAND :
 switch(wParam)
 {
 case IDOK :
 case IDCANCEL :

 /* Get number of tabs and calculate it in dialog units */
 GetDlgItemText(hWnd, IDC_TABAMT, Text, sizeof(Text));
 TabAmt = (HIWORD(GetDialogBaseUnits()) * (Text[0] - '0')) / 4;
 /* Set the tab stops in the edit window */
 SendMessage(hWndNotepadEdit, EM_SETTABSTOPS, 1,
 (LONG) (LPINT) &TabAmt);
 InvalidateRect(hWndNotepadEdit, NULL, TRUE);
 UpdateWindow(hWndNotepadEdit);
 /* Save the tab amt in WIN.INI profile */
 WriteProfileString("ProgEdit", "Tabs", Text);
 EndDialog(hWnd, TRUE);
 return(TRUE);
 default :
 break;
 }
 break;
 default :
 break;
 }
 return(FALSE);
}







[LISTING SIX]

# Standard Windows make file. The utility MAKE.EXE compares the
# creation date of the file to the left of the colon with the file(s)
# to the right of the colon. If the file(s) on the right are newer
# than the file on the left, Make will execute all of the command lines
# following this line that are indented by at least one tab or space.
# Any valid MS-DOS command line may be used.

# This line allows NMAKE to work as well
all: subclass.exe

# Update the resource if necessary
subclass.res: subclass.rc subclass.h subclass.ico
 rc -r subclass.rc

# Update the object file if necessary
subclass.obj: subclass.c subclass.h
 cl -W4 -c -AS -Gsw -Oad -Zip subclass.c


# Update the executable file if necessary, and if so, add the resource back in.
subclass.exe: subclass.obj subclass.def
 link /NOD /CO subclass,,, libw slibcew, subclass.def
 rc subclass.res

# If the .res file is new and the .exe file is not, update the resource.
# Note that the .rc file can be updated without having to either
# compile or link the file.
subclass.exe: subclass.res
 rc subclass.res








[LISTING SEVEN]


/* Standard defines */
#define FIRST (0L)
#define LAST (0x7fff7fffL)
#define ALL (0x00007fffL)

#define IDC_LISTBOX 100
#define IDC_INPUTBOX 100

/* Function prototypes */
int PASCAL WinMain(HANDLE, HANDLE, LPSTR, int);

LONG FAR PASCAL MainWndProc(HWND, unsigned, WORD, LONG);
LONG FAR PASCAL HandleListBoxes(HWND, unsigned, WORD, LONG);
LONG FAR PASCAL HandleEditCtrls(HWND, unsigned, WORD, LONG);
VOID PASCAL CloseEditWindow(VOID);
VOID PASCAL OpenEditWindow(DWORD);








[LISTING EIGHT]

; module-definition file for SubClass -- used by LINK.EXE

NAME Test ; application's module name
DESCRIPTION 'Test'
EXETYPE WINDOWS ; required for all Windows applications
STUB 'WINSTUB.EXE' ; Generates error message if application
 ; is run without Windows

;CODE can be moved in memory and discarded/reloaded
CODE PRELOAD MOVEABLE DISCARDABLE

;DATA must be MULTIPLE if program can be invoked more than once

DATA PRELOAD MOVEABLE MULTIPLE

HEAPSIZE 1024
STACKSIZE 5120 ; recommended minimum for Windows applications

; All functions that will be called by any Windows routine
; MUST be exported.

EXPORTS
 MainWndProc @1
 HandleListBoxes @2
 HandleEditCtrls @3







[LISTING NINE]

/* Include files needed for .RC file */
#include "windows.h"
#include "subclass.h"

/* The program's icon (not that it needs one) */
SubClass ICON SUBCLASS.ICO

/* Main dialog w/listbox used by SubClass */
SubClass DIALOG 36, 34, 100, 100
CAPTION "SubClass"
CLASS "SubClass"
STYLE WS_POPUPWINDOW | WS_CAPTION | WS_MINIMIZEBOX | DS_LOCALEDIT
BEGIN
 CONTROL "", IDC_LISTBOX, "BetterListBox",
 LBS_HASSTRINGS | LBS_NOTIFY | LBS_NOINTEGRALHEIGHT |
 WS_BORDER | WS_VSCROLL | WS_CHILD,
 10, 10, 80, 80
END






[LISTING TEN]

/*****************************************************************************
 PROGRAM: SubClass -- AUTHOR: Mike Klein -- VERSION: 1.0
 FILE : subclass.exe -- REQUIREMENTS: Windows 3.x
 PURPOSE: An example of a subclassed listbox, providing enhanced input
 and data-entry facilities.
*****************************************************************************/

/* Some std defines needed */
#define _WINDOWS
#define NOCOMM

/* INCLUDE files */

#include <windows.h>
#include "subclass.h"

/* Global variables */
HANDLE hInstSubClass;
HWND hDlgSubClass;
HWND hWndListBox;
HWND hWndEdit;

int CurrentIndex;
int NumListBoxItems;
RECT CurrentItemRect;

BOOL InsideEditMode = FALSE;
DWORD dwEditPos;

BYTE InputString[50];

/* Far pointers to Windows' class functions for listboxes and edit ctrls */
FARPROC lpfnListBox;
FARPROC lpfnEditCtrl;

/*****************************************************************************
 FUNCTION: WinMain
 PURPOSE : Calls initialization function, processes message loop
*****************************************************************************/
int PASCAL WinMain(HANDLE hInstance, HANDLE hPrevInstance, LPSTR lpCmdLine,
 int nCmdShow)
{
 WNDCLASS wc;
 MSG msg;
 if(!hPrevInstance)
 {
 hInstSubClass = hInstance;
 /* Fill in window class structure with parameters that describe the
 ** main window */
 wc.style = CS_DBLCLKS;
 wc.lpfnWndProc = MainWndProc;
 wc.cbClsExtra = 0;
 wc.cbWndExtra = DLGWINDOWEXTRA;
 wc.hInstance = hInstSubClass;
 wc.hIcon = LoadIcon(hInstSubClass, "SubClass");
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = NULL;
 wc.lpszClassName = "SubClass";
 if(!RegisterClass(&wc))
 return(FALSE);
 /* Fill in window class structure with parameters that describe our
 ** custom list box -- BetterListBox */
 wc.style = CS_DBLCLKS;
 wc.lpfnWndProc = HandleListBoxes;
 wc.cbClsExtra = 0;
 wc.cbWndExtra = 0;
 wc.hInstance = hInstSubClass;
 wc.hIcon = NULL;
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = NULL;

 wc.lpszClassName = "BetterListBox";
 if(!RegisterClass(&wc))
 return(FALSE);
 /* Fill in window class structure with parameters that describe our
 ** custom edit control -- BetterEditCtrl */
 wc.style = CS_DBLCLKS;
 wc.lpfnWndProc = HandleEditCtrls;
 wc.cbClsExtra = 0;
 wc.cbWndExtra = 0;
 wc.hInstance = hInstSubClass;
 wc.hIcon = NULL;
 wc.hCursor = LoadCursor(NULL, IDC_IBEAM);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = NULL;
 wc.lpszClassName = "BetterEditCtrl";
 if(!RegisterClass(&wc))
 return(FALSE);
 /* Get information on listbox class, so we can find out what its
 ** class window function is. */
 if(GetClassInfo(NULL, "listbox", &wc) == FALSE)
 return(FALSE);
 else
 lpfnListBox = (FARPROC) wc.lpfnWndProc;
 /* Get information on edit control class, so we can find out what its
 ** class window function is. */
 if(GetClassInfo(NULL, "edit", &wc) == FALSE)
 return(FALSE);
 else
 lpfnEditCtrl = (FARPROC) wc.lpfnWndProc;

 /* Create the main window */
 if((hDlgSubClass = CreateDialog(hInstSubClass, "SubClass",
 NULL, 0L)) == NULL)
 {
 return(FALSE);
 }
 /* Get an oft used handle */
 hWndListBox = GetDlgItem(hDlgSubClass, IDC_LISTBOX);
 /* Put in some test strings. */
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "computer");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "telephone");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "lcd");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "ochessica");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "heeyah");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "video");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "smoke");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "sky");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "lovely");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "windows");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "nunez");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "beer");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "pug");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "query");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "remote");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "party");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "mixer");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "skate");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "varied");
 SendMessage(hWndListBox, LB_ADDSTRING, 0, (LONG) (LPSTR) "interests");

 /* Give the listbox an initial selection */
 SendMessage(hWndListBox, LB_SETCURSEL, 0, 0L);
 ShowWindow(hDlgSubClass, nCmdShow);
 UpdateWindow(hDlgSubClass);
 }
 else
 {
 /* If there was another instance of SubClass running, then switch
 ** to it by finding any window of class = "SubClass". Then, if it's
 ** an icon, open the window, otherwise just make it active. */
 hDlgSubClass = FindWindow("SubClass", NULL);
 if(IsIconic(hDlgSubClass))
 ShowWindow(hDlgSubClass, SW_SHOWNORMAL);
 SetActiveWindow(hDlgSubClass);
 return(FALSE);
 }
 /* Acquire and dispatch messages until a WM_QUIT message is received. */
 while(GetMessage(&msg, NULL, NULL, NULL))
 {
 TranslateMessage(&msg);
 DispatchMessage(&msg);
 }
 return(msg.wParam);
}

/*****************************************************************************
 FUNCTION: MainWndProc
 PURPOSE : Processes messages for SubClass dialog box
*****************************************************************************/
LONG FAR PASCAL MainWndProc(HWND hWnd, unsigned wMsg, WORD wParam,
 LONG lParam)
{
 switch(wMsg)
 {
 case WM_CLOSE :
 DestroyWindow(hDlgSubClass);
 return(0L);
 case WM_SETFOCUS :
 SetFocus(hWndListBox);
 return(0L);
 case WM_DESTROY :
 PostQuitMessage(0);
 return(0L);
 default :
 break;
 }
 return(DefDlgProc(hWnd, wMsg, wParam, lParam));
}

/*****************************************************************************
 FUNCTION: OpenEditWindow
 PURPOSE : Opens edit window inside listbox
*****************************************************************************/
VOID PASCAL OpenEditWindow(DWORD CharSel)
{
 /* Flag telling us we're in edit mode */
 InsideEditMode = TRUE;
 /* Find out current index into listbox */
 CurrentIndex = (int) SendMessage(hWndListBox, LB_GETCURSEL, 0, 0L);
 if(CurrentIndex == LB_ERR)
 CurrentIndex = 0;
 /* Find out what the text is in selected listbox cell */
 SendMessage
 (
 hWndListBox,
 LB_GETTEXT,
 CurrentIndex,
 (LONG) (LPSTR) InputString
 );
 /* Get client dimensions of listbox cell with respect to the entire
 ** listbox and create an edit window right inside of it. */
 SendMessage
 (
 hWndListBox,
 LB_GETITEMRECT,
 CurrentIndex,
 (DWORD) (LPRECT) &CurrentItemRect
 );
 hWndEdit = CreateWindow
 (
 "BetterEditCtrl",
 "",
 ES_AUTOHSCROLL | ES_LEFT | WS_VISIBLE | WS_CHILD,
 CurrentItemRect.left + 2,
 CurrentItemRect.top,
 CurrentItemRect.right - CurrentItemRect.left - 2,
 CurrentItemRect.bottom - CurrentItemRect.top,
 hWndListBox,
 IDC_INPUTBOX,
 hInstSubClass,
 0L
 );
 /* Pre-fill the edit control with what was in the listbox cell */
 SetWindowText(hWndEdit, InputString);
 SetFocus(hWndEdit);
 SendMessage(hWndEdit, EM_SETSEL, 0, CharSel);
}

/*****************************************************************************
 FUNCTION: CloseEditWindow
 PURPOSE : Closes edit window inside listbox
*****************************************************************************/
VOID PASCAL CloseEditWindow(VOID)
{
 /* Flag telling us we aren't in edit mode anymore */
 InsideEditMode = FALSE;
 /* Get text of what was entered into edit control */
 GetWindowText(hWndEdit, InputString, sizeof(InputString));
 if(!GetWindowTextLength(hWndEdit))
 {
 DestroyWindow(hWndEdit);
 SendMessage(hWndListBox, WM_KEYDOWN, VK_DELETE, 0L);
 return;
 }
 /* Turn redrawing for the listbox off. */
 SendMessage(hWndListBox, WM_SETREDRAW, 0, 0L);
 /* Find out the RECT of the currently selected listbox item */
 SendMessage(hWndListBox, LB_GETITEMRECT, CurrentIndex, (DWORD) (LPRECT)
 &CurrentItemRect);

 /* Insert the new string, then delete the old one (now one slot lower) */
 SendMessage(hWndListBox, LB_INSERTSTRING, CurrentIndex,
 (LONG) (LPSTR) InputString);
 SendMessage(hWndListBox, LB_DELETESTRING, CurrentIndex + 1, 0L);
 /* Destroy the old edit window. */
 DestroyWindow(hWndEdit);
 /* Validate the whole listbox and then invalidate only the list box rect
 ** that we put the edit window into. */
 ValidateRect(hWndListBox, NULL);
 InvalidateRect(hWndListBox, &CurrentItemRect, TRUE);

 /* Turn re-drawing for the listbox back on and send a WM_PAINT for the
 ** entry changes to take effect. */
 SendMessage(hWndListBox, WM_SETREDRAW, 1, 0L);
 UpdateWindow(hWndListBox);
 SetFocus(hWndListBox);
}

/*****************************************************************************
 FUNCTION: HandleListBoxes
 PURPOSE : Process keystrokes and mouse for list boxes
*****************************************************************************/
LONG FAR PASCAL HandleListBoxes(HWND hWnd, unsigned wMsg, WORD wParam,
 LONG lParam)
{
 switch(wMsg)
 {
 case WM_LBUTTONDBLCLK :

 /* Go into edit mode and put caret at end of edit ctrl. */
 if(SendMessage(hWnd, LB_GETCOUNT, 0, 0L))
 {
 OpenEditWindow(LAST);
 }
 return(0L);
 case WM_LBUTTONDOWN :
 if(InsideEditMode)
 {
 /* Find out cursor pos from the edit ctrl, so we can
 ** use same positioning for cell we're moving into */
 dwEditPos = SendMessage(hWndEdit, EM_GETSEL, 0, 0L);
 CloseEditWindow();
 /* Tell listbox to move cur ptr up or down based on
 ** the mouse position, and open a new edit window */
 SendMessage(hWndListBox, wMsg, wParam, lParam);
 OpenEditWindow
 (
 MAKELONG(LOWORD(dwEditPos), LOWORD(dwEditPos))
 );
 return(0L);
 }
 break;
 case WM_MBUTTONDBLCLK :
 case WM_RBUTTONDBLCLK :
 /* Make middle & right mouse buttons like left mouse button. */
 SendMessage(hWnd, WM_LBUTTONDBLCLK, wParam, lParam);
 break;
 case WM_MBUTTONDOWN :
 case WM_RBUTTONDOWN :

 /* Make middle & right mouse buttons like left mouse button. */
 SendMessage(hWnd, WM_LBUTTONDOWN, wParam, lParam);
 SendMessage(hWnd, WM_LBUTTONUP, wParam, lParam);
 break;
 case WM_KEYDOWN :
 switch(wParam)
 {
 case VK_RETURN :
 /* Enter was pressed, so go into edit mode and put the
 ** caret at the end of the edit ctrl */
 if(SendMessage(hWnd, LB_GETCOUNT, 0, 0L))
 {
 OpenEditWindow(LAST);
 }
 return(0L);
 case VK_INSERT :
 /* The INS key (add a new string). First, get currently
 ** selected entry. If none exists, assume that focus is on
 ** the first cell */
 CurrentIndex =
 (int) SendMessage(hWnd, LB_GETCURSEL, 0, 0L);
 if(CurrentIndex == LB_ERR)
 {
 CurrentIndex = 0;
 }
 /* Find out what the text is in selected listbox cell */
 if(SendMessage(hWnd, LB_GETCOUNT, 0, 0L))
 {
 SendMessage
 (
 hWnd,
 LB_GETTEXT,
 CurrentIndex,
 (LONG) (LPSTR) InputString
 );
 }
 else
 {
 /* If nothing's in the listbox, then copy a null to
 ** the edit control. */
 InputString[0] = '\0';
 }

 /* Insert new entry */
 SendMessage
 (
 hWnd,
 LB_INSERTSTRING,
 CurrentIndex,
 (LONG) (LPSTR) InputString
 );
 /* Let our "edit current cell" function take over */
 OpenEditWindow(ALL);
 return(0L);
 case VK_DELETE :
 /* The DEL key. If no items are in the listbox, then
 ** return. Else, get the currently selected item. */
 if(!(NumListBoxItems = (int)
 SendMessage(hWnd, LB_GETCOUNT, 0, 0L)))
 {
 break;
 }
 if((CurrentIndex = (int)
 SendMessage(hWnd, LB_GETCURSEL, 0, 0L)) == LB_ERR)
 {
 CurrentIndex = 0;
 }
 /* Delete the string. Tried to get rid of annoying
 ** focus rect; couldn't. Too many inconsistencies in
 ** the way Windows handles list and combo boxes. */
 SendMessage(hWnd, LB_DELETESTRING, CurrentIndex, 0L);
 if(CurrentIndex == NumListBoxItems - 1)
 {
 --CurrentIndex;
 }
 /* Reset our listbox selection. */
 SendMessage(hWnd, LB_SETCURSEL, CurrentIndex, 0L);
 return(0L);
 default :
 break;
 }
 break;
 default :
 break;
 }
 /* Return any unprocessed messages to window's original class procedure. */
 return(CallWindowProc(lpfnListBox, hWnd, wMsg, wParam, lParam));
}

/*****************************************************************************
 FUNCTION: HandleEditCtrls
 PURPOSE : Process keystrokes and mouse for edit controls
*****************************************************************************/
LONG FAR PASCAL HandleEditCtrls(HWND hWnd, unsigned wMsg, WORD wParam,
 LONG lParam)
{
 switch(wMsg)
 {
 case WM_LBUTTONDBLCLK :
 /* Turn off edit mode, closing the edit window */
 CloseEditWindow();
 return(0L);
 case WM_KEYDOWN :
 switch(wParam)
 {
 case VK_RETURN :
 /* Turn off edit mode, closing the edit window */
 CloseEditWindow();
 return(0L);
 case VK_DELETE :
 /* Delete a character if one exists in the edit ctrl.
 ** Otherwise, if the cell is blank, delete the entire
 ** cell. */
 if(!GetWindowTextLength(hWnd))
 {
 CloseEditWindow();
 SendMessage(hWndListBox, wMsg, wParam, lParam);
 return(0L);
 }
 break;
 case VK_DOWN :
 case VK_UP :
 case VK_PRIOR :
 case VK_NEXT :
 /* Find out cursor pos from edit ctrl, so we can
 ** use same positioning for cell we're moving into */
 dwEditPos = SendMessage(hWndEdit, EM_GETSEL, 0, 0L);
 CloseEditWindow();
 /* Tell listbox to move cur ptr up or down, and
 ** open an edit window at the new pos. */
 SendMessage(hWndListBox, wMsg, wParam, lParam);
 OpenEditWindow
 (
 MAKELONG(LOWORD(dwEditPos), LOWORD(dwEditPos))
 );
 return(0L);
 default :
 break;
 }
 break;
 default :
 break;
 }
 /* Return any unprocessed messages to window's original class procedure. */
 return(CallWindowProc(lpfnEditCtrl, hWnd, wMsg, wParam, lParam));
}

Special Issue, 1991
WINDOWS MEETS C++


Scott Robert Ladd


Scott operates The Ladd Group, a consulting firm that specializes in
developing scientific software. He can be reached at 3652 County Road 730,
Gunnison, CO 81230-9726.


Programmers have recently been examining Windows-compatible C++ compilers in
hope of finding a magic key that will unlock the door to easily written
Windows applications. The question they are asking is whether or not the
object-oriented power of C++ can be used to simplify Windows programming. The
answer is "yes," but with a caveat. Although it supports the notion of
subclassing, Windows was not necessarily designed with object-oriented
programming in mind. So despite Windows' event-driven architecture,
Microsoft's software engineers necessarily approached Windows from a
"procedural" (Pascal/C) perspective.
The conflict between C++ and Windows is a contrast of simple solutions and
difficult problems. Once the cantankerous nature of Windows has been tamed by
a set of C++ classes, object-oriented Windows applications can be built easily
and efficiently. Toward that end, this article presents a class hierarchy that
will ease you into the development of a generic Windows application.


Designing Classes


A Windows Application (or WinApp) has two basic components: the main message
dispatch loop and windows. The message loop takes incoming messages and
dispatches them to the appropriate window functions. In every WinApp, the
constants are the message loop, the instance handle, and the need to track the
error status of the application. These constants can easily be encapsulated
into a class.
Windows are created in two steps. First, you register a "window class." (This
is not a C++ class; rather, it is a description of a category of windows that
share similar characteristics.) Second, you generate a window using the
window class as an argument to the CreateWindow function. A window class can
be used in the creation of many windows. Because window instances and window
classes are separate (but related) entities, I designed my class hierarchy
with C++ classes for both window classes and windows themselves.
When writing a WinApp in C, two techniques associate data with a specific
window. The first, extra data (as defined by the window's class), can be
allocated in the window's data structure. The GetWindowWord/SetWindowWord and
GetWindowLong/SetWindowLong functions can store and retrieve 16- and 32-bit
values, respectively, in these so-called "extra bytes." This is convenient for
small amounts of data, but doesn't work well when storing structures or large
amounts of data.
The second method of associating data with a window is via a property list.
Each data item in a property list is given a text name. Data for a window can
be stored and retrieved by name; almost any amount of information can be
associated with a window via a property list. However, property lists aren't
fast because special functions access each piece of data through an
inefficient linear look-up table. If you need quick access to your data, avoid
property lists.
Whether you use extra bytes or property lists, you end up with
reliability problems. Extra bytes are accessed via specific offsets into the
internal structure of a window. One small mistake in the offset specification
can return invalid data, or even worse, destroy other data. Property lists
require maintenance of a list of text names for various data elements; again,
an error-prone and onerous task.
Using C++ classes solves the problem of window-specific data storage by
encapsulating the data elements for a window. Instead of generating extra
window bytes or creating a property list, you can encapsulate data elements
for a specific kind of window in a class description. There's no overhead; the
member functions defined for a window data type can directly access the
associated data without messy offsets or special access functions.
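To make the contrast concrete, here is a minimal, portable sketch that uses no Windows headers. ExtraBytes is a hypothetical stand-in for a window's extra-byte area, accessed by offset much as GetWindowWord and SetWindowWord would access it, and ClickCounterData is an invented example class; neither name appears in the listings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a window's "extra bytes": a raw buffer that is
// read and written by byte offset.
struct ExtraBytes {
    unsigned char raw[8];
    ExtraBytes() { std::memset(raw, 0, sizeof raw); }

    void setWord(std::size_t offset, unsigned short value) {
        assert(offset + sizeof value <= sizeof raw);
        std::memcpy(raw + offset, &value, sizeof value);
    }
    unsigned short getWord(std::size_t offset) const {
        unsigned short value = 0;
        assert(offset + sizeof value <= sizeof raw);
        std::memcpy(&value, raw + offset, sizeof value);
        return value;
    }
};

// The class-based alternative: the same kind of data as named, typed members.
class ClickCounterData {
public:
    ClickCounterData() : clicks_(0) {}
    void click()                           { ++clicks_; }
    unsigned short clicks() const          { return clicks_; }
    void setLabel(const std::string& text) { label_ = text; }
    const std::string& label() const       { return label_; }
private:
    unsigned short clicks_;   // no offset to get wrong
    std::string    label_;    // not limited to a 16- or 32-bit slot
};
```

With ExtraBytes, writing a word at offset 1 instead of 0 or 2 silently overwrites half of each neighboring value. With ClickCounterData there is no offset to mistype, and the compiler checks every access.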


Basic Classes


The design methodology just described specifies three basic classes: a class
to encapsulate application-global information; a class to define the generic
structure and function of window classes; and a simple window class. All of
these classes are required for any WinApp, so they can be defined together in
the same files. I'll begin by describing the class definitions in winclass.h
(see Listing One, page 33).
The WinApp class defines an application's global characteristics. The instance
handle and error status for an application are stored in a WinApp object, of
which there is only one in any application.
I didn't just define the instance and error status as global variables because
global variables can be modified accidentally from anywhere within a program.
In a Windows application, where thousands of lines of code are not executed in
an obviously structured fashion, tracking down unexpected changes in global
variables can be a nightmare. Once the WinApp object is created, the value for
Instance can't be changed. Read-only access protects the program from itself.
The global error indicator is also handled in an object-oriented fashion; it
can only be modified via the appropriate WinApp member functions.
WinApp also defines functional attributes. Every application has a message
loop; WinApp encapsulates and standardizes the loop in the Dispatcher member
function. Once Dispatcher has been defined, all applications can use it
identically; it need not be redeveloped for every program.
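The two ideas in the preceding paragraphs, a set-once instance handle and a single shared message loop, can be sketched in portable C++ as follows. This is an illustration under stated assumptions, not the code from Listing Two: FakeHandle, FakeMsg, FAKE_QUIT, Post, and CountMessage are hypothetical stand-ins for HANDLE, MSG, WM_QUIT, and the real message queue, and the simulated loop stops when its queue is empty instead of blocking.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-ins; a real program would use HANDLE, MSG, GetMessage,
// TranslateMessage, and DispatchMessage from windows.h.
typedef int FakeHandle;
struct FakeMsg { int id; int wParam; };
const int FAKE_QUIT = 0;

class App {
public:
    // The instance handle can be set once; later calls are ignored.
    static void SetInstance(FakeHandle h) { if (instance_ == 0) instance_ = h; }
    static FakeHandle GetInstance()       { return instance_; }

    static void Post(const FakeMsg& m)    { queue_.push_back(m); }

    // One shared loop: pump messages to the handler until the quit message
    // arrives, then return that message's wParam.
    static int Dispatcher(void (*handler)(const FakeMsg&)) {
        assert(handler != 0);
        while (!queue_.empty()) {
            FakeMsg m = queue_.front();
            queue_.pop_front();
            if (m.id == FAKE_QUIT) return m.wParam;
            handler(m);
        }
        return -1;   // simulation only: queue drained without a quit message
    }
private:
    static FakeHandle instance_;
    static std::deque<FakeMsg> queue_;
};
FakeHandle App::instance_ = 0;
std::deque<FakeMsg> App::queue_;

// Example handler that only counts what it receives.
static int g_handled = 0;
void CountMessage(const FakeMsg&) { ++g_handled; }
```

Because instance_ is private and SetInstance ignores every call after the first, no other part of the program can change the handle by accident.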
C++ classes for windows provide a foundation for building application-specific
classes. WindowClass defines the characteristics common to all window classes.
The constructor assigns values to a WNDCLASS structure, obtaining the
application instance handle from the app parameter. Registration is handled by
the Register/Unregister member functions. The GetName method returns a pointer
to the character string that identifies the class. WindowClass combines the
complete data and functional definitions of a window class into a single
package.
The BasicWindow class is not meant to be used as it is defined, but rather as
a general base class from which other window types can be derived. Windows,
controls, and dialog boxes are all windows, although each has unique
characteristics. So the BasicWindow class defines the characteristics common
to all windows. To prevent BasicWindow objects from being created outside of
their derived classes, BasicWindow's constructor is a protected member
function. The constructor accepts the name of a window class, which is
assigned to the data member ClassName. All windows have a handle identifier
and style value, which are maintained in the Handle and Style data members of
the BasicWindow class. In addition, every window must keep track of the
program instance it is associated with; this value is stored in the Instance
data member. The constructor sets the data members to "empty" values,
expecting the constructor for a derived class to assign specific values to
these components.
The first three public member functions, GetHandle, GetStyle, and GetInstance,
retrieve the Handle, Style, and Instance data members, respectively, for a
BasicWindow object.
Show and Update call the ShowWindow and UpdateWindow API functions,
respectively. Show is called to hide or display a window, depending on the
value of cmdShow. (Valid cmdShow arguments are listed under the ShowWindow
function in the Windows API Reference manual.) UpdateWindow sends a WM_PAINT
message to the window's
callback function, requesting that the client area be redrawn. By making these
functions inline, calls to the corresponding API function are directly
embedded in the code when Show and Update are called, eliminating the overhead
of a double function call.
GetText copies a window's text into a caller-supplied buffer. The text of a
programmer-defined window is its heading; for a predefined control, the
placement of the text depends upon the type of control. SetText is a
complementary function that changes the text associated with a window.
The Disable function disables a window, and Enable enables a window. Like most
BasicWindow functions, these functions are inline. The methods GetExtents and
Move retrieve and change, respectively, the x-y position and height/width of a
window.


Window Class


Window is the base class for application (nondialog box, noncontrol) windows.
The public interface to a Window consists of four functions. The constructor
must be passed application and window class information. These objects provide
the instance handle and class name required when creating the object. The
constructor does not generate the window; it is the Actualize member function
that calls CreateWindow.
I had a long debate with myself about how to construct and create Window
objects. My first design implemented the CreateWindow call in the constructor,
in effect combining the constructor with Actualize. There was no need to
maintain arguments for CreateWindow in data members, because CreateWindow was
called in a constructor that defined those arguments.
That scheme didn't work well. A window object is often defined before it needs
to be displayed, and the all-in-one constructor automatically calls
CreateWindow whether or not I want it. Furthermore, when deriving new Window
classes, there was a problem with the order of construction. Base class
constructors are called before derived class constructors; with the
CreateWindow call in the base class constructor, the window was automatically
generated using the default parameters before the derived class's constructor
could make changes.
Creating windows works better with a split generation process. When a derived
class object is created, the base class constructor is called to set default
values for the CreateWindow arguments. Immediately thereafter, the derived
class constructor is called to change some or all of those defaults. Actualize
is called once the construction is complete, using the values of the data
members as arguments to CreateWindow. Inefficiency stemming from possible
reassignment of values to argument data members is a small price to pay for
added flexibility.
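The split between construction and actualization can be sketched without any Windows calls. In this hypothetical example, TwoPhaseWindow and WideWindow are invented names, and the created_ flag stands in for a successful CreateWindow call.

```cpp
#include <cassert>
#include <string>

// Portable sketch of the constructor/Actualize split.
class TwoPhaseWindow {
public:
    // Phase 1: the base constructor only records default arguments.
    TwoPhaseWindow() : width_(100), height_(50), created_(false) {}
    virtual ~TwoPhaseWindow() {}

    // Phase 2: generate the window from whatever the data members now hold.
    bool Actualize(const std::string& title) {
        if (created_) return false;          // already generated
        assert(width_ > 0 && height_ > 0);   // sanity check on the arguments
        title_ = title;
        created_ = true;   // a real version would call CreateWindow here
        return true;
    }
    bool created() const { return created_; }
    int  width() const   { return width_; }
    int  height() const  { return height_; }
protected:
    int width_, height_;   // defaults a derived constructor may override
private:
    std::string title_;
    bool created_;
};

class WideWindow : public TwoPhaseWindow {
public:
    // Runs after the base constructor, so it can adjust the defaults
    // before any window exists.
    WideWindow() { width_ = 400; }
};
```

A WideWindow object exists, with its width already changed to 400, before Actualize is ever called. Had the base constructor created the window itself, the window would have been generated with the default width of 100.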
The five protected function members are called by the private member function
MessageHandler -- the callback function common to all Window objects. You may
wonder why I defined MessageHandler as part of the Window class, when it is
defined as part of a window class in traditional C applications, and why I
defined it as a static member instead of a normal one. To answer these questions,
you must examine how a callback function works, and understand the
difficulties that face C++ programs working with the Windows programming
model.


Considerations


As mentioned earlier, the Windows API was designed with C programmers in mind;
as such, it makes certain assumptions. For example, a window callback function
must have four parameters of a specific type in a specific order. C++ member
functions, however, have a "secret" parameter. All nonstatic member functions
have a hidden pointer parameter referenced by the keyword this. The this
pointer addresses the object for which the function was called. Were
MessageHandler defined as a normal member function, the hidden this parameter
would prevent it from being called correctly by the message dispatch loop. To
eliminate the this pointer, MessageHandler must be a static member function.
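The type mismatch is easy to see in a small portable example. The typedefs below are hypothetical stand-ins for HWND, WORD, DWORD, and the callback type, and the _far _pascal calling convention is omitted.

```cpp
#include <cassert>

// Hypothetical stand-ins for HWND, WORD, DWORD, and the callback type.
typedef int           Hwnd;
typedef unsigned      Word;
typedef unsigned long Dword;
typedef long (*Callback)(Hwnd, Word, Word, Dword);

class Handler {
public:
    // A static member has no hidden this parameter, so its address has
    // exactly the type Callback.
    static long StaticProc(Hwnd, Word message, Word, Dword) {
        return static_cast<long>(message) + 1;
    }
    // A nonstatic member has the type
    //   long (Handler::*)(Hwnd, Word, Word, Dword)
    // which does not convert to Callback.
    long MemberProc(Hwnd, Word message, Word, Dword) {
        return static_cast<long>(message) + 2;
    }
};

// Plays the role of the system calling back through a stored pointer.
long CallThrough(Callback cb, Word message) {
    assert(cb != 0);
    return cb(0, message, 0, 0);
}

Callback goodCallback = &Handler::StaticProc;     // compiles
// Callback badCallback = &Handler::MemberProc;   // error: incompatible type
```

Uncommenting the last line produces a compile-time error, which is the language's way of reporting the hidden this parameter.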
This leads us to a chicken-and-egg problem. When a window is created with
CreateWindow, it is assigned a callback function based on the callback defined
for its window class. CreateWindow sends several messages to the new window
before returning. This is all automatic, and there is no way to change the
callback address to point to MessageHandler until after CreateWindow has been
called to return a window HANDLE! The question is which callback function to
use for the window class, so that the initial message sent by CreateWindow can
be processed.
The Windows API provides a default callback function named DefWindowProc. The
messages sent to a window by CreateWindow are not processed by the programmer;
they are usually passed on to DefWindowProc. The only exception is WM_CREATE,
a message usually processed to perform initialization of window-specific data.
I solved the problem in Listing Two, page 34, by assigning DefWindowProc as
the default callback function for a WindowClass object. When CreateWindow is
called for a BasicWindow object, the initial message is passed on to
DefWindowProc. Once CreateWindow is done, I use the SetWindowLong API function
to change the callback for that window from DefWindowProc to MessageHandler. A
WM_CREATE message is then sent explicitly to the window, which now calls
MessageHandler for all its messages.
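The sequence just described can be simulated in a few lines of portable C++. This is only a sketch of the ordering, not the code in Listing Two. FakeCreateWindow, FakeWindow, and the FAKE_ message codes are hypothetical, and the callback type is reduced to a single argument.

```cpp
#include <cassert>
#include <vector>

typedef long (*WndProcSketch)(int message);   // simplified callback type
enum { FAKE_CREATE = 1, FAKE_SIZE = 2 };      // hypothetical message codes

static std::vector<int> g_seenByDefault;   // messages the default proc got
static std::vector<int> g_seenByHandler;   // messages our handler got

long DefaultProc(int m) { g_seenByDefault.push_back(m); return 0; }
long MyHandler(int m)   { g_seenByHandler.push_back(m); return 0; }

struct FakeWindow { WndProcSketch proc; };

// Stand-in for CreateWindow: it sends its start-up messages to whatever
// callback the window class registered, before the caller gets a handle.
FakeWindow FakeCreateWindow(WndProcSketch classProc) {
    assert(classProc != 0);
    FakeWindow w;
    w.proc = classProc;
    w.proc(FAKE_CREATE);
    w.proc(FAKE_SIZE);
    return w;
}

// The workaround: create with the default callback, redirect to our
// handler, then send the creation message ourselves.
FakeWindow ActualizeSketch() {
    FakeWindow w = FakeCreateWindow(DefaultProc);
    w.proc = MyHandler;     // real code: SetWindowLong(..., GWL_WNDPROC, ...)
    w.proc(FAKE_CREATE);    // real code: SendMessage(..., WM_CREATE, ...)
    return w;
}
```

After ActualizeSketch returns, the default procedure has absorbed both start-up messages, and the handler has seen exactly one creation message: the one sent explicitly.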
This leads to another problem: To make classes truly polymorphic, each class
should be able to define class-specific message handling using virtual
functions. A static member function such as MessageHandler, however, cannot be
declared virtual! How, then, can we define MessageHandler to be both a static
function and polymorphic?
The solution involves sleight-of-hand techniques. When a callback function
receives a message, its first argument contains the HANDLE of the window for
which the message is intended. The window HANDLE can be used as a way of
obtaining a pointer to the Window object. Earlier, I described how Windows can
allocate extra data space in the internal structure of a window. In this extra
space, a pointer to the Window object can be stored. When MessageHandler is
called, it can extract this pointer via the window HANDLE and use it to invoke
virtual methods. Here's how it works. First, a Window object is constructed.
Then, the Actualize member function is called to generate the window. In
Actualize, once the call to CreateWindow is successful, the this pointer for
the object is stored in the window's extra bytes with the macro SetWindowPtr.
When MessageHandler is called, it can extract this pointer using its window
HANDLE argument and the GetWindowPtr macro. Memory model-specific versions of
the SetWindowPtr and GetWindowPtr macros are defined in the winclass.h header
file. The WindowClass must also declare a number of extra window bytes
(defined by the SIZEOF_THIS macro) for holding the this pointer.
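The pointer round trip can be sketched in portable C++ by simulating the extra bytes with a map keyed on the window handle. Everything here is an assumption made for illustration: g_extra stands in for storage that Windows itself would own, and PtrWindow, DoublingWindow, and the one-argument Handle function are invented names, not the classes in the listings.

```cpp
#include <cassert>
#include <cstring>
#include <map>

typedef int Hwnd;   // hypothetical stand-in for HWND

// Simulated per-window extra bytes, sized to hold one pointer (the role
// SIZEOF_THIS plays when the window class is registered).
struct ExtraArea { unsigned char bytes[sizeof(void*)]; };
static std::map<Hwnd, ExtraArea> g_extra;

class PtrWindow {
public:
    explicit PtrWindow(Hwnd h) : handle_(h) {}
    virtual ~PtrWindow() { g_extra.erase(handle_); }

    // After "creation" succeeds, stash this in the window's extra bytes.
    void Actualize() {
        PtrWindow* self = this;
        std::memcpy(g_extra[handle_].bytes, &self, sizeof self);
    }
    // Static callback: recover the object from the handle, then go virtual.
    static long MessageHandler(Hwnd h, unsigned message) {
        std::map<Hwnd, ExtraArea>::iterator it = g_extra.find(h);
        if (it == g_extra.end()) return -1;   // no object attached yet
        PtrWindow* self = 0;
        std::memcpy(&self, it->second.bytes, sizeof self);
        assert(self != 0);
        return self->Handle(message);
    }
protected:
    virtual long Handle(unsigned message) {
        return static_cast<long>(message);
    }
private:
    Hwnd handle_;
};

class DoublingWindow : public PtrWindow {
public:
    explicit DoublingWindow(Hwnd h) : PtrWindow(h) {}
protected:
    virtual long Handle(unsigned message) {
        return 2L * static_cast<long>(message);
    }
};
```

MessageHandler never needs to know which derived class it is talking to; the stored pointer and the virtual call select the right Handle function at run time.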
Once MessageHandler has a pointer to the Window object, it can call virtual
member functions to process specific messages. The Window class defines five
virtual functions for handling messages. Four of these functions handle
specific messages commonly operated upon by all windows: Create handles
WM_CREATE; Close processes WM_CLOSE; Paint handles WM_PAINT, which must be
processed by all windows that display text or graphics; and Command handles
WM_COMMAND messages generated by actions involving menu items and control
windows. Any other messages are passed to the AuxMsgHandler function, defined
by the base class as a call to DefWindowProc. WM_CREATE, WM_CLOSE, WM_PAINT,
and WM_COMMAND are the only messages directly handled by most windows I use in
my applications.
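A rough, portable sketch of this routing follows. The message codes, the log_ string, and the RoutedWindow and BeepWindow classes are hypothetical; real code would switch on WM_CREATE, WM_CLOSE, WM_PAINT, and WM_COMMAND, and the base AuxMsgHandler would call DefWindowProc.

```cpp
#include <cassert>
#include <string>

// Hypothetical message codes for the sketch.
enum Msg { MSG_CREATE = 1, MSG_CLOSE, MSG_PAINT, MSG_COMMAND, MSG_OTHER };

class RoutedWindow {
public:
    virtual ~RoutedWindow() {}
    // Plays the role of MessageHandler once the object pointer is known.
    long Route(Msg m, unsigned wParam) {
        assert(m >= MSG_CREATE && m <= MSG_OTHER);   // codes from Msg only
        switch (m) {
            case MSG_CREATE:  Create();        return 0;
            case MSG_CLOSE:   Close();         return 0;
            case MSG_PAINT:   Paint();         return 0;
            case MSG_COMMAND: Command(wParam); return 0;
            default:          return AuxMsgHandler(m, wParam);
        }
    }
    const std::string& log() const { return log_; }
protected:
    virtual void Create()          { log_ += "C"; }
    virtual void Close()           { log_ += "X"; }
    virtual void Paint()           { log_ += "P"; }
    virtual void Command(unsigned) { log_ += "M"; }
    // Base version stands in for a call to DefWindowProc.
    virtual long AuxMsgHandler(Msg, unsigned) { log_ += "?"; return -1; }
    std::string log_;   // records which handler ran, for demonstration
};

// A derived window overrides only the handler it cares about.
class BeepWindow : public RoutedWindow {
protected:
    virtual void Command(unsigned id) { log_ += (id == 1 ? "beep" : "M"); }
};
```

BeepWindow overrides only Command and inherits the other four handlers unchanged, which is how application-specific window classes are meant to build on Window.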
The remaining implementation issue involves implementing the callback as part
of Window. The callback function is more closely tied to the window itself
than to the window class. How messages are processed is defined by the menus
and controls defined within a window. Quite often in C-based applications, a
WinApp is forced to change the callback function assigned from the window
class in order to process window-specific messages. Implementing a callback
function as a member of the Window class directly associates the callback
subfunctions with the window they are working with.


Classes for Controls


Controls are specialized windows created using predefined window classes. All
controls share certain characteristics; this suggests that a common
base class should be used. My control base class is named Control; it is
derived from the BasicWindow class. Control provides a set of common features
to all classes derived from it.
There is no need for a program to create Control objects; such objects are
useless in and of themselves. Only derived classes need to construct Control
objects. So the constructor for Control is protected; it can only
be called from within the Control class scope or by classes derived from
Control. This constructor, in fact, does nothing but call the constructor for
the BasicWindow class from which Control is derived.
Like Window, Control has an Actualize member function that displays the
Control on the screen. The GetID function returns the ID code for a control.
SubClass is a method that changes the callback function for a control.
Microsoft calls this "subclassing," but "redirecting" or "intercepting" would
be more appropriate. Subclassing is used by sophisticated WinApps that need
direct control over how a Control's messages are being handled. The uses of
subclassing, however, are outside the scope of this article, so I'll leave it
to you to investigate the subject further.
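For orientation only, the mechanics of SubClass (install a new callback, hand back the old one so the new one can chain to it) can be sketched in portable C++. SketchControl, CtlProc, and the message code 13 are hypothetical. In real code the swap is done by SetWindowLong with GWL_WNDPROC, as in Listing One, and the forwarding call would go through CallWindowProc.

```cpp
#include <cassert>

typedef long (*CtlProc)(unsigned message);   // simplified callback type

// Stand-in for a control whose callback can be redirected.
class SketchControl {
public:
    explicit SketchControl(CtlProc p) : proc_(p) {}
    // Install a new callback and return the previous one.
    CtlProc SubClass(CtlProc newProc) {
        assert(newProc != 0);
        CtlProc old = proc_;
        proc_ = newProc;
        return old;
    }
    long Send(unsigned message) { return proc_(message); }
private:
    CtlProc proc_;
};

// The control's original behavior: echo the message code.
long OriginalProc(unsigned message) { return static_cast<long>(message); }

static CtlProc g_oldProc = 0;   // saved so the interceptor can chain to it

// Intercepts message 13 (an arbitrary example code) and forwards the rest.
long InterceptProc(unsigned message) {
    if (message == 13) return -1;
    return g_oldProc(message);
}
```

The caller must store the value returned by SubClass in g_oldProc before any message reaches InterceptProc; otherwise the forwarding call has nowhere to go.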
Two simple control classes are derived from Control: StaticLeft and
PushButton. The former is a simple text box; the latter is the grey push button
used for "Okay" and "Cancel" buttons. The implementation of these classes is
extremely simple; they inherit a majority of their functionality from Control
and BasicWindow.


A Simple Application


Listings Three through Six (page 36) are the source files required for a C++
WinApp I call "woop." Application-specific classes based on the WindowClass
and Window classes provide the core of the program. The WOOPWindow class
defines and creates two control windows inside the client area of its window.
Pressing PushButton elicits a beep from the computer's speaker; choosing menu
items changes the contents of the StaticLeft control.
The source code compiles correctly under Borland C++ 2.0. While the code
compiles without warning under Zortech C++, a minor code generation problem
prevents it from executing. A Zortech representative told me the problem would
be corrected by the time you read this article.
The classes presented here are merely a beginning; real WinApps are far more
complex than woop. My goal was to provide you with a basic understanding of
the marriage between C++ and Windows. You can, however, use woop as a starting
point to create your own extended classes and customize them for your specific
applications.

_WINDOWS MEETS C++_
by Scott Robert Ladd


[LISTING ONE]


// WINDOWS CLASSES
// winclass.h -- Class definitions for Windows programming
// Copyright 1991 by Scott Robert Ladd. All Rights Reserved.

#if !defined(__WINCLASS_H)
#define __WINCLASS_H

#include "windows.h"

// macro resolving to size of this pointer
#define SIZEOF_THIS (sizeof(void *))
// macro resolving to number of extra bytes required by this window
#define PTR_BYTES (SIZEOF_THIS)
// macros used to store and extract class pointer from extra window bytes
#if sizeof(void *) == 4
 #define SetWindowPtr(wdw,ptr) SetWindowLong((wdw),0,(DWORD)(ptr))
 #define GetWindowPtr(wdw) (Window *)GetWindowLong(wdw,0)
#else
 #define SetWindowPtr(wdw,ptr) SetWindowWord((wdw),0,(WORD)(ptr))
 #define GetWindowPtr(wdw) (Window *)GetWindowWord(wdw,0)
#endif
// callback function types
typedef long (_far _pascal * WinCallBack)(HWND,WORD,WORD,DWORD);
typedef int (_far _pascal * DlgCallBack)(HWND,WORD,WORD,DWORD);
//------------- CLASS WinApp -------------
class WinApp
 {
 public:
 static int Dispatcher();
 static void SetInstance(HANDLE inst);
 static HANDLE GetInstance();
 private:
 static HANDLE Instance;
 };
// set instance handle
inline void WinApp::SetInstance(HANDLE inst)
 {
 if (Instance == NULL)
 Instance = inst;
 }
// get instance handle
inline HANDLE WinApp::GetInstance()
 {
 return Instance;
 }
//------------------ CLASS WindowClass ------------------
class WindowClass : public WinApp
 {
 public:
 WindowClass(); // constructor

 BOOL Register(); // register
 BOOL Unregister(); // unregister

 const char * GetName() const; // get class name string
 protected:
 WNDCLASS ClassData;
 };
//------------------ CLASS BasicWindow ------------------
class BasicWindow : private WinApp
 {
 public:
 HWND GetHandle() const;
 HANDLE GetInstance() const;
 DWORD GetStyle() const;
 void Show(int cmdShow = SW_SHOW);
 void Update();
 void SetText(char * text);
 void GetText(char * buffer, int n) const;
 void Disable();
 void Enable();
 void GetExtents(int * x, int * y, int * width, int * height) const;
 void Move(int x, int y, int width, int height);
 protected:
 BasicWindow(const char * cname);
 HWND Handle;
 HANDLE Instance;
 DWORD Style;
 char ClassName[32];
 };
inline HWND BasicWindow::GetHandle() const
 {
 return Handle;
 }
inline HANDLE BasicWindow::GetInstance() const
 {
 return Instance;
 }

inline DWORD BasicWindow::GetStyle() const
 {
 return Style;
 }
inline void BasicWindow::Show(int cmdShow)
 {
 ShowWindow(Handle,cmdShow);
 }
inline void BasicWindow::Update()
 {
 UpdateWindow(Handle);
 }
inline void BasicWindow::SetText(char * text)
 {
 SetWindowText(Handle,text);
 }
inline void BasicWindow::GetText(char * buffer, int n) const
 {
 GetWindowText(Handle,buffer,n);
 }
inline void BasicWindow::Disable()
 {
 EnableWindow(Handle,FALSE);
 }
inline void BasicWindow::Enable()
 {
 EnableWindow(Handle,TRUE);
 }
inline void BasicWindow::GetExtents(int * x, int * y, int * width,
 int * height) const
 {
 RECT r;
 GetWindowRect(Handle,&r);
 *x = r.left;
 *y = r.top;
 *width = r.right - r.left + 1;
 *height = r.bottom - r.top + 1;
 }
inline void BasicWindow::Move(int x, int y, int width, int height)
 {
 MoveWindow(Handle,x,y,width,height,TRUE);
 }
//------------- CLASS Window -------------
class Window : public BasicWindow
 {
 public:
 Window(const char * cname);

 BOOL Actualize(const char * title,
 int x = CW_USEDEFAULT,
 int y = CW_USEDEFAULT,
 int width = CW_USEDEFAULT,
 int height = CW_USEDEFAULT);
 HDC OpenDC();
 void CloseDC(HDC dc);
 protected:
 virtual void Create(HWND wdw, CREATESTRUCT _far * cs);
 virtual void Close(HWND wdw);
 virtual void Paint(HDC dc);

 virtual void Command(HWND wdw, WORD wParam, DWORD lParam);

 virtual long AuxMsgHandler(HWND wdw, WORD message,
 WORD wParam, DWORD lParam);
 private:
 static long _far _pascal _export MessageHandler(HWND wdw,
 WORD message, WORD wParam, DWORD lParam);
 };
inline HDC Window::OpenDC()
 {
 return GetDC(Handle);
 }
inline void Window::CloseDC(HDC dc)
 {
 ReleaseDC(Handle,dc);
 }
//-------------- CLASS Control --------------
class Control : public BasicWindow
 {
 public:
 BOOL Actualize(const Window & parent, const char * text, WORD id,
 int x, int y, int width, int height);
 WORD GetID() const;
 WinCallBack SubClass(WinCallBack newHandler);
 protected:
 Control(const char * cname);
 };
inline WORD Control::GetID() const
 {
 return GetWindowWord(Handle,GWW_ID);
 }
inline WinCallBack Control::SubClass(WinCallBack newHandler)
 {
 return (WinCallBack)SetWindowLong(Handle,GWL_WNDPROC,(DWORD)newHandler);
 }
inline Control::Control(const char * cname) : BasicWindow(cname)
 {
 // does nothing else
 }
//----------------- CLASS StaticLeft -----------------
class StaticLeft : public Control
 {
 public:
 StaticLeft();
 };
//----------------- CLASS PushButton -----------------
class PushButton : public Control
 {
 public:
 PushButton();
 };
#endif // __WINCLASS_H





[LISTING TWO]


// WINDOWS CLASSES
// winclass.cpp -- Windows class implementations.
// Copyright 1991 by Scott Robert Ladd. All Rights Reserved.

#include "winclass.h"
#include "string.h"

//------------- CLASS WinApp -------------

// define values for static member of WinApp class
HANDLE WinApp::Instance = NULL;
// message dispatcher
int WinApp::Dispatcher()
 {
 if (Instance == NULL)
 return !0;
 MSG msg;
 while (GetMessage(&msg, NULL, NULL, NULL))
 {
 TranslateMessage(&msg);
 DispatchMessage(&msg);
 }

 return msg.wParam;
 }
//------------------ CLASS WindowClass ------------------
WindowClass::WindowClass()
 {
 #if defined(__BCPLUSPLUS__)
 ClassData.lpfnWndProc = DefWindowProc;
 #else
 ClassData.lpfnWndProc = (long (_far _pascal *)())DefWindowProc;
 #endif
 ClassData.hInstance = WinApp::GetInstance();
 ClassData.style = 0;
 ClassData.cbClsExtra = 0;
 ClassData.cbWndExtra = PTR_BYTES;
 ClassData.hIcon = NULL;
 ClassData.hCursor = LoadCursor(NULL,IDC_ARROW);
 ClassData.hbrBackground = GetStockObject(WHITE_BRUSH);
 ClassData.lpszMenuName = NULL;
 ClassData.lpszClassName = "DefaultWindowClass";
 }
BOOL WindowClass::Register()
 {
 if (!RegisterClass((WNDCLASS _far *)&ClassData))
 return FALSE;
 else
 return TRUE;
 }
BOOL WindowClass::Unregister()
 {
 if (!UnregisterClass(ClassData.lpszClassName,ClassData.hInstance))
 return FALSE;
 else
 return TRUE;
 }
const char * WindowClass::GetName() const
 {

 return (const char *)ClassData.lpszClassName;
 }
//------------------ CLASS BasicWindow ------------------
BasicWindow::BasicWindow(const char * cname)
 {
 Handle = NULL;
 Instance = WinApp::GetInstance();
 Style = 0UL;

 if (cname != NULL)
 {
 strncpy(ClassName,cname,31);
 ClassName[31] = 0; // strncpy does not terminate a truncated copy
 }
 else
 ClassName[0] = 0; // blank class name
 }
//------------- CLASS Window -------------
Window::Window(const char * cname) : BasicWindow(cname)
 {
 Style = WS_OVERLAPPEDWINDOW;
 }
BOOL Window::Actualize(const char * title, int x, int y, int width,
 int height)
 {
 Handle = CreateWindow(ClassName, (char *)title, Style, x, y,
 width, height, NULL, NULL, Instance, 0L);
 if (Handle == NULL)
 return FALSE;
 FARPROC fp = MakeProcInstance((FARPROC)((DWORD)(Window::MessageHandler)),
 Instance);
 SetWindowLong(Handle,GWL_WNDPROC,(DWORD)fp);
 SetWindowPtr(Handle,this);
 SendMessage(Handle,WM_CREATE,0,0L);
 return TRUE;
 }
void Window::Create(HWND wdw, CREATESTRUCT _far * cs)
 {
 // by default, does NOTHING!
 }
void Window::Close(HWND wdw)
 {
 // by default, does NOTHING!
 }
void Window::Paint(HDC dc)
 {
 // by default, does NOTHING!
 }
void Window::Command(HWND wdw, WORD wParam, DWORD lParam)
 {
 // by default, does NOTHING!
 }
long Window::AuxMsgHandler(HWND wdw, WORD message, WORD wParam, DWORD lParam)
 {
 return DefWindowProc(wdw,message,wParam,lParam);
 }
long _far _pascal _export Window::MessageHandler(HWND wdw, WORD message,
 WORD wParam, DWORD lParam)
 {
 HDC dc;
 PAINTSTRUCT ps;
 Window * wptr;
 wptr = GetWindowPtr(wdw);

 switch (message)
 {
 case WM_CREATE:
 wptr->Create(wdw, (CREATESTRUCT _far *)lParam);
 break;
 case WM_CLOSE:
 wptr->Close(wdw);
 break;
 case WM_PAINT:
 dc = BeginPaint(wdw,&ps);
 wptr->Paint(dc);
 EndPaint(wdw,&ps);
 break;
 case WM_COMMAND:
 wptr->Command(wdw,wParam,lParam);
 break;
 default:
 return wptr->AuxMsgHandler(wdw,message,wParam,lParam);
 }
 return 0L;
 }
//-------------- CLASS Control --------------
BOOL Control::Actualize(const Window & parent, const char * text, WORD id,
 int x, int y, int width, int height)
 {
 Handle = CreateWindow(ClassName, (char *)text, Style, x,y,width,height,
 parent.GetHandle(), id, parent.GetInstance(), 0L);
 if (Handle == NULL)
 return FALSE;
 return TRUE;
 }
//----------------- CLASS StaticLeft -----------------
StaticLeft::StaticLeft() : Control("STATIC")
 {
 Style = WS_CHILD | SS_LEFT | SS_NOPREFIX | WS_BORDER;
 }
//----------------- CLASS PushButton -----------------
PushButton::PushButton() : Control("BUTTON")
 {
 Style = WS_CHILD | BS_PUSHBUTTON;
 }







[LISTING THREE]

// WINDOWS CLASSES
// woop.cpp -- A program to test basic windows classes.
// Copyright 1991 by Scott Robert Ladd. All Rights Reserved.

#include "winclass.h"
#include "woop.h"

#define IDC_BUTTON 1000


//---------------------- CLASS WOOPWindowClass ----------------------
class WOOPWindowClass : public WindowClass
 {
 public:
 WOOPWindowClass();
 };
WOOPWindowClass::WOOPWindowClass() : WindowClass()
 {
 ClassData.hIcon = LoadIcon(WinApp::GetInstance(),"WOOPIcon");
 ClassData.hbrBackground = GetStockObject(LTGRAY_BRUSH);
 ClassData.lpszMenuName = "WOOPMenu";
 ClassData.lpszClassName = "MainWindowClass";
 }
//----------------- CLASS WOOPWindow -----------------
class WOOPWindow : public Window
 {
 public:
 WOOPWindow(const char * cname);

 virtual BOOL Actualize();
 protected:
 virtual void Close(HWND wdw);
 virtual void Command(HWND wdw, WORD wParam, DWORD lParam);
 StaticLeft SCtl;
 PushButton BCtl;
 };
WOOPWindow::WOOPWindow(const char * cname) : Window(cname)
 {
 }
BOOL WOOPWindow::Actualize()
 {
 BOOL res;
 res = Window::Actualize("WOOP Window"); // call base class member
 if (res == FALSE)
 return FALSE;
 res = SCtl.Actualize(*this,"Static control",0,100,10,120,16);
 if (res == FALSE)
 return FALSE;
 SCtl.Show();
 SCtl.Update();
 res = BCtl.Actualize(*this,"Button",IDC_BUTTON,100,50,60,40);
 if (res == FALSE)
 return FALSE;
 BCtl.Show();
 BCtl.Update();
 return TRUE;
 }
void WOOPWindow::Close(HWND wdw)
 {
 PostQuitMessage(0);
 }
void WOOPWindow::Command(HWND wdw, WORD wParam, DWORD lParam)
 {
 switch (wParam)
 {
 case IDM_EXIT:
 PostQuitMessage(0);
 break;
 case IDM_ABOUT:

 MessageBox(GetFocus(),"WOOP version 1.00","About...",MB_OK);
 break;
 case IDM_SET:
 SCtl.SetText("Set!");
 break;
 case IDM_RESET:
 SCtl.SetText("Reset!");
 break;
 case IDC_BUTTON:
 MessageBeep(0);
 MessageBeep(0);
 }
 }
//----------------- FUNCTION WinMain -----------------
int _pascal WinMain(HANDLE instance,HANDLE prevInst,LPSTR cmdLine,int cmdShow)
 {
 WinApp::SetInstance(instance);
 WOOPWindowClass wc;
 if (prevInst == NULL)
 wc.Register();
 WOOPWindow wdw(wc.GetName());

 wdw.Actualize();
 wdw.Show(cmdShow);
 wdw.Update();

 return WinApp::Dispatcher();
 }







[LISTING FOUR]

#include "windows.h"
#include "woop.h"

WOOPIcon ICON WOOP.ICO

WOOPMenu MENU
 BEGIN
 POPUP "&Menu"
 BEGIN
 MENUITEM "Set", IDM_SET
 MENUITEM "Reset", IDM_RESET
 MENUITEM SEPARATOR
 MENUITEM "&About...", IDM_ABOUT
 MENUITEM SEPARATOR
 MENUITEM "E&xit", IDM_EXIT
 END
 END








[LISTING FIVE]

#define IDM_SET 100
#define IDM_RESET 101
#define IDM_ABOUT 102
#define IDM_EXIT 103







[LISTING SIX]

NAME OOPAPP

DESCRIPTION 'OOP Windows App'

EXETYPE WINDOWS
STUB 'WINSTUB.EXE'
CODE PRELOAD MOVEABLE DISCARDABLE
DATA PRELOAD MOVEABLE MULTIPLE

HEAPSIZE 4096
STACKSIZE 4096



[MAKE FILE]

windows = y

woop.exe : woop.obj winclass.obj woop.res woop.def

woop.obj : woop.cpp winclass.h woop.h

winclass.obj : winclass.cpp winclass.h

woop.res : woop.rc woop.h woop.ico




















Special Issue, 1991
PROGRAMMING WINDOWS USING STATE TABLES


Michael A. Bertrand and William R. Welch


Mike teaches mathematics and programming at Madison Area Technical College,
Madison, WI 53704. Bill is a freelance writer and programmer who holds a Ph.D.
in biological science. He can be reached at 201 Virginia Terrace, Madison, WI
53705.


This article presents a Windows-based program called "Draw" that uses state
tables to implement interactive drawing tools in an economical, consistent
fashion. Draw renders four kinds of geometric figures: rectangles, rounded
rectangles, ellipses, and lines. Each type is associated with a drawing tool
that's accessed by means of a menu choice (see Figure 1). Our implementation
uses state tables to encapsulate program control flow in a single data
structure (an array of pointers to functions). Using this technique, you can
easily extend the program to support other kinds of geometric figures, as long
as the user interaction for the new types is similar to the types described
here.
Before discussing the details of our implementation, it is useful to review
some of the basic concepts behind Windows programs.


Event-Driven Programming


As more and more programmers are finding out, writing programs for Microsoft
Windows and other event-driven GUIs is very different from writing traditional
DOS programs. In a Windows program, your program does not have a single line
of control, flowing from beginning to middle to end. Rather, it responds to
all manner of events (or, in Windows parlance, messages) that are sent by the
system to all applications, at arbitrary or unpredictable times. This
event-driven structure follows the pattern of interaction of a real-world user
driving an interactive graphics application: Any one event, such as a mouse
movement, is about as likely to occur as any other (say, a keystroke or a menu
choice).
Using window procedures (called Wndprocs), your application is able to respond
to all of these events or messages as they occur. This is not merely a
suggestion, but an implementation requirement. Each type of window in a
Microsoft Windows application must have a procedure associated with it that
receives all messages sent by the environment to that class of window. The
messages correspond to external events (mouse movements, mouse clicks,
keystrokes) as well as internal events (for example, the message that asks the
application to redraw its screen display, or a message sent by another
application, and so on).
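The relationship between a message loop and a window procedure can be sketched in portable C, without any Windows headers. This is only an illustration of the idea, not code from Draw: the Event structure, the EV_ codes, and the functions window_proc and dispatch_all are hypothetical stand-ins for MSG, the WM_ messages, a real window procedure, and the GetMessage/DispatchMessage loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event codes; real programs use the WM_ constants. */
enum { EV_MOUSE_MOVE, EV_KEY_DOWN, EV_PAINT, EV_QUIT };

typedef struct {
    int  id;     /* which event occurred */
    long param;  /* event-specific data  */
} Event;

typedef struct {
    int moves, keys, paints;  /* counters standing in for real work */
} AppState;

/* The "window procedure": called once per event, in arrival order. */
static void window_proc(AppState *app, const Event *ev)
{
    switch (ev->id) {
    case EV_MOUSE_MOVE: app->moves++;  break;
    case EV_KEY_DOWN:   app->keys++;   break;
    case EV_PAINT:      app->paints++; break;
    default:            break;  /* unhandled events are ignored */
    }
}

/* The "message loop": hands events to the procedure until EV_QUIT
 * is seen or the queue runs out. Returns the number dispatched. */
static size_t dispatch_all(AppState *app, const Event *queue, size_t n)
{
    size_t i;
    assert(app != NULL && queue != NULL);
    for (i = 0; i < n && queue[i].id != EV_QUIT; i++)
        window_proc(app, &queue[i]);
    return i;
}
```

The point of the sketch is that the procedure never decides what happens next; it only reacts to whatever event the loop hands it.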
With this bit of background, we can now discuss the Draw program. Draw
consists of a single header file, Draw.h (Listing One, page 45), and a single
C-language source file, Draw.c (Listing Two, page 45). There is also a
makefile (Listing Three, page 46) and two files required by Windows: the
definition file, Draw.def (Listing Four, page 46), and the resource file,
Draw.rc (Listing Five, page 46).


The Main Window Procedure


In general, every application has a main window and an associated main window
procedure. If the application has other kinds of windows (known as child
windows), each of these kinds will have a window procedure defined for it as
well. Draw creates only one kind of window, so it has only a single window
procedure, WndProc.
The function WndProc contains code to respond to Windows messages such as
selecting a drawing tool from the menu, responding to mouse events, and
repainting the window when it is moved or resized. WndProc passes mouse-button
and mouse-move events to the function Tool, which manages drawing. Tool
provides a template for interactive drawing tools and is the real heart of
Draw.
When using Draw, you interactively display geometric figures by invoking three
mouse events: left-button-down, mouse-move, and left-button-up. These three
events produce the Windows messages WM_LBUTTONDOWN, WM_MOUSEMOVE, and
WM_LBUTTONUP, respectively. As is common in Windows programs, Tool uses these
messages as case constants in a switch statement. With the rectangle tool, for
example, you first depress the left mouse button (WM_LBUTTONDOWN) to define
the x and y coordinates (x1 and y1) of the initial corner of the figure. Then,
as you move the mouse without releasing the button (dragging the cursor and
producing a series of WM_MOUSEMOVEs), the program repeatedly erases and
redraws the rectangle while the current mouse position defines the x and y
coordinates (x2 and y2) of the rectangle corner opposite the initial corner.
The final figure appears when you release the left mouse button
(WM_LBUTTONUP).


A Simplifying Technique


Draw's four tools require a minimal amount of code. The key to this economy is
the data structure DrawFig, which is an array of pointers to functions -- one
for each tool. All four tools work in exactly the same way (that is,
left-button-down, mouse-move, left-button-up), and their functions have the
same parameters and return a value of the same type. In choosing a tool
through the menu, the program sets the value of the DrawFig index, iFigType.
This value, in turn, determines which function is pointed to by the DrawFig
array and used for the actual drawing in Tool.
Two of the functions that the DrawFig array points to, Rectangle and Ellipse,
are standard Windows functions, that is, part of the native Application
Program Interface (API). The other two functions that the DrawFig array points
to, DrawRoundRect and DrawLine, are our own. This is because the native
Windows functions to draw rounded rectangles (RoundRect) and lines (MoveTo and
LineTo) have different parameters than Rectangle and Ellipse. To deal with
this difference, we wrote the DrawRoundRect and DrawLine functions. These two
have the same parameters as Rectangle and Ellipse, so all four functions can
be included in the same array of pointers to functions, the DrawFig array.
This scheme of using the DrawFig array to point to tool functions that use the
same three mouse events to draw figures has an important ramification: Other
similarly behaving tools can be added to Draw by simply including pointers to
the appropriate functions in the list of the DrawFig array initializers and
including them in the menu. Additions might, for example, be tools for
isosceles triangles, regular polygons, and parabolic segments.
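The adapter idea behind DrawRoundRect and DrawLine can be sketched in portable C. The sketch assumes a stand-in Canvas structure in place of a device context, and the native_ functions are hypothetical imitations of calls whose parameter lists differ (a rounded rectangle with two extra corner arguments, and a line drawn by a move-to/line-to pair). Only the FT_ names come from Listing One; everything else is invented for illustration.

```c
#include <assert.h>

/* Stand-in for a device context: it only remembers the last primitive. */
enum { OP_NONE, OP_RECT, OP_ROUND_RECT, OP_ELLIPSE, OP_LINE };

typedef struct {
    int op;               /* last primitive drawn (OP_ codes above) */
    int x1, y1, x2, y2;   /* coordinates of that primitive */
    int pen_x, pen_y;     /* current pen position */
} Canvas;

static int record(Canvas *c, int op, int x1, int y1, int x2, int y2)
{
    c->op = op;
    c->x1 = x1; c->y1 = y1; c->x2 = x2; c->y2 = y2;
    return 1;             /* nonzero means success, like a BOOL */
}

/* Hypothetical "native" primitives. Two already share one shape ... */
static int native_rectangle(Canvas *c, int x1, int y1, int x2, int y2)
{ return record(c, OP_RECT, x1, y1, x2, y2); }

static int native_ellipse(Canvas *c, int x1, int y1, int x2, int y2)
{ return record(c, OP_ELLIPSE, x1, y1, x2, y2); }

/* ... and two do not: extra corner arguments, and a two-call line. */
static int native_round_rect(Canvas *c, int x1, int y1, int x2, int y2,
                             int corner_w, int corner_h)
{
    (void)corner_w; (void)corner_h;   /* not recorded in this sketch */
    return record(c, OP_ROUND_RECT, x1, y1, x2, y2);
}

static void native_move_to(Canvas *c, int x, int y)
{ c->pen_x = x; c->pen_y = y; }

static int native_line_to(Canvas *c, int x, int y)
{
    int ok = record(c, OP_LINE, c->pen_x, c->pen_y, x, y);
    c->pen_x = x; c->pen_y = y;
    return ok;
}

/* Adapters that give the odd ones the common five-argument shape. */
static int draw_round_rect(Canvas *c, int x1, int y1, int x2, int y2)
{
    /* Corner size derived from the box; an arbitrary choice here. */
    return native_round_rect(c, x1, y1, x2, y2, (x2 - x1) / 4, (y2 - y1) / 4);
}

static int draw_line(Canvas *c, int x1, int y1, int x2, int y2)
{
    native_move_to(c, x1, y1);
    return native_line_to(c, x2, y2);
}

/* One table, one signature, indexed by figure type. */
typedef int (*DrawFn)(Canvas *, int, int, int, int);
enum { FT_RECT, FT_ROUND_RECT, FT_ELLIPSE, FT_LINE, FT_COUNT };

static const DrawFn draw_fig[FT_COUNT] = {
    native_rectangle, draw_round_rect, native_ellipse, draw_line
};

static int draw(Canvas *c, int fig_type, int x1, int y1, int x2, int y2)
{
    assert(fig_type >= 0 && fig_type < FT_COUNT);
    return draw_fig[fig_type](c, x1, y1, x2, y2);
}
```

Adding a new tool in this arrangement means writing one more five-argument function and appending it to the table; the caller of draw does not change.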


Storing Figure Coordinates


In any Windows application, whenever the user moves a window or changes its
size, Windows sends a WM_PAINT message to the application to erase and
redisplay the entire output area of the window. Any figures produced by Draw
will be erased, and Draw must redraw them if they are to stay on the screen as
the location or size of the window is changed.
This restoration of the window contents can be accomplished only if Draw in
some way saves the figures. It does so in the externally defined array faList,
whose elements are structures of type FIGURE. Each FIGURE in faList contains a
field (named iType) that indicates the type of figure (rectangle, rounded
rectangle, ellipse, or line) and a structure (rsCoord) that contains the x and
y coordinates of the two endpoints that define the location of the figure.
Values for these variables are assigned -- that is, a new figure is saved -- in
the case block WM_LBUTTONUP of function Tool. Whenever WndProc gets a WM_PAINT
message, it traverses faList, a simple graphics database, to restore the
screen. The array faList is characteristic of the graphics programming
approach known as vector-based or display-list oriented. (This technique is
also sometimes loosely called object-oriented.) In this approach,
a geometric figure is represented in the database, or display list, by a set
of drawing commands and endpoint coordinates that determine how the list is
traversed to display the figures. In Draw, the drawing command in the list is
the type of figure (iType); more elaborate systems include attributes such as
line width and line color.
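A display list of this kind can be sketched in portable C. The Figure layout and the MAX_FIGS and FT_ names follow Listing One, with a local Rect standing in for the Windows RECT. The DisplayList wrapper and the functions list_init, list_add, list_repaint, and tally_figure are hypothetical names chosen for this sketch, and the repaint callback simply counts figures where a real WM_PAINT handler would draw them.

```c
#include <assert.h>

#define MAX_FIGS 1000   /* capacity, as in Listing One */
enum { FT_RECT, FT_ROUND_RECT, FT_ELLIPSE, FT_LINE, FT_COUNT };

typedef struct { int left, top, right, bottom; } Rect;  /* RECT stand-in */

typedef struct {
    int  iType;     /* which kind of figure (FT_ code) */
    Rect rsCoord;   /* the two defining endpoints      */
} Figure;

typedef struct {
    Figure items[MAX_FIGS];
    int    count;   /* number of figures stored so far */
} DisplayList;

static void list_init(DisplayList *dl) { dl->count = 0; }

/* Save one finished figure. Returns 1 on success, 0 if the list is full. */
static int list_add(DisplayList *dl, int type, int x1, int y1, int x2, int y2)
{
    Figure *f;
    assert(type >= 0 && type < FT_COUNT);
    if (dl->count >= MAX_FIGS)
        return 0;
    f = &dl->items[dl->count++];
    f->iType = type;
    f->rsCoord.left  = x1;  f->rsCoord.top    = y1;
    f->rsCoord.right = x2;  f->rsCoord.bottom = y2;
    return 1;
}

typedef void (*PaintFn)(const Figure *fig, void *user);

/* Replay every stored figure in the order it was created.
 * Returns the number of figures replayed. */
static int list_repaint(const DisplayList *dl, PaintFn paint, void *user)
{
    int i;
    for (i = 0; i < dl->count; i++)
        paint(&dl->items[i], user);
    return dl->count;
}

/* Example callback: tallies figures by type instead of drawing them.
 * 'user' must point to an int array of FT_COUNT elements. */
static void tally_figure(const Figure *fig, void *user)
{
    int *counts = (int *)user;
    counts[fig->iType]++;
}
```

Because the list stores only a type and two endpoints per figure, the repaint cost grows with the number of figures rather than with the size of the window.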


Rubber Banding Figures


Draw uses the standard Windows raster operation (ROP2) codes in order to
"rubber band" a figure as the mouse cursor is dragged. This occurs in Tool, in
the case block WM_MOUSEMOVE, which calls the Windows function SetROP2 with
argument R2_NOTXORPEN. This argument sets the XOR (eXclusive OR) drawing mode.
Using the previous values of x2 and y2, the XOR mode causes the tool function
selected from the DrawFig array to erase the existing figure. The same
function is then called again, using the current values of x2 and y2 to draw
the new figure. Because drawing the same figure twice in the same place in XOR
drawing mode cancels out, figures in the background are left unchanged. Case
WM_LBUTTONUP calls SetROP2 with argument R2_COPYPEN, setting the COPY drawing
mode. In COPY mode,
the background color fills the interior of the figure, erasing overlapped
portions of any underlying figures.
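Why drawing twice restores the background can be demonstrated on a small in-memory pixel grid. The sketch below is not GDI code: the Grid type and the functions rop_notxor, outline_notxor, grid_pattern, and grids_equal are hypothetical, and the per-pixel rule NOT(pen XOR screen) is used as a model of what R2_NOTXORPEN does to each pixel.

```c
#include <assert.h>

#define GRID_W 16
#define GRID_H 8

typedef struct { unsigned char px[GRID_H][GRID_W]; } Grid;

/* Model of the NOTXOR raster operation on one 8-bit pixel. */
static unsigned char rop_notxor(unsigned char dst, unsigned char pen)
{
    return (unsigned char)~(dst ^ pen);
}

/* Outline the box with corners (x1,y1) and (x2,y2), inclusive.
 * Every outline pixel is touched exactly once, so a second identical
 * call undoes the first. */
static void outline_notxor(Grid *g, int x1, int y1, int x2, int y2,
                           unsigned char pen)
{
    int x, y, t;
    if (x1 > x2) { t = x1; x1 = x2; x2 = t; }
    if (y1 > y2) { t = y1; y1 = y2; y2 = t; }
    assert(x1 >= 0 && x2 < GRID_W && y1 >= 0 && y2 < GRID_H);

    for (x = x1; x <= x2; x++) {            /* top and bottom edges */
        g->px[y1][x] = rop_notxor(g->px[y1][x], pen);
        if (y2 != y1)
            g->px[y2][x] = rop_notxor(g->px[y2][x], pen);
    }
    for (y = y1 + 1; y < y2; y++) {         /* sides, corners excluded */
        g->px[y][x1] = rop_notxor(g->px[y][x1], pen);
        if (x2 != x1)
            g->px[y][x2] = rop_notxor(g->px[y][x2], pen);
    }
}

/* Fill the grid with an arbitrary non-uniform "background". */
static void grid_pattern(Grid *g)
{
    int x, y;
    for (y = 0; y < GRID_H; y++)
        for (x = 0; x < GRID_W; x++)
            g->px[y][x] = (unsigned char)((x * 7 + y * 13) & 0xFF);
}

static int grids_equal(const Grid *a, const Grid *b)
{
    int x, y;
    for (y = 0; y < GRID_H; y++)
        for (x = 0; x < GRID_W; x++)
            if (a->px[y][x] != b->px[y][x])
                return 0;
    return 1;
}
```

With a pen value of 0 the first call inverts each outline pixel and the second call inverts it back, which is the erase step of rubber banding. No saved copy of the background is needed.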


System State Tables


The concept of "system state" is central to understanding Draw. It is the
current state of the application that determines the response of the program
to a mouse event. We use a single variable (iState, in Tool) to represent the
system state.
Table 1 shows Draw's state table. It shows how the two system states (WAITING
and DRAWING) are related to the three mouse events (WM_LBUTTONDOWN,
WM_MOUSEMOVE, WM_LBUTTONUP) that Draw's tools use to display figures. When you
use Draw, you send a series of mouse events to Tool. Tool's response to a
given mouse event depends not only on that event, but also on the sequence of
previous events. Tool records this sequence of mouse events as transitions in
system state, and the state table documents these transitions.
Table 1: State table shows the changes from one system state to another
(Waiting to Drawing and back), as triggered by the mouse events
(left-button-down, move, left-button-up).


                               Mouse Event
 System State   WM_LBUTTONDOWN   WM_MOUSEMOVE   WM_LBUTTONUP
 ------------------------------------------------------------

 WAITING        DRAWING          --             --
 DRAWING        --               --             WAITING

To use Table 1, enter at the initial system state, WAITING, and read across to
see the effect of mouse events. WM_MOUSEMOVE and WM_LBUTTONUP have no effect,
but WM_LBUTTONDOWN causes a transition to a new system state, DRAWING, and
starts the tool. Reenter the table at the new system state, DRAWING, and again
read across. Now WM_LBUTTONDOWN and WM_MOUSEMOVE cause no change of state
(although each WM_MOUSEMOVE still rubber bands the figure), but
WM_LBUTTONUP causes a state transition back to WAITING, and stops the tool.
Tool responds to only one sequence of mouse events: WM_LBUTTONDOWN,
WM_MOUSEMOVE, WM_LBUTTONUP. This sequence is reflected in only one path
through the state table: WAITING --> DRAWING --> WAITING.
System state can both determine the response to a mouse event and be
determined by a mouse event. For example, in case WM_MOUSEMOVE of Tool, if
iState equals DRAWING, the old figure is erased and the new one is drawn; if
iState equals WAITING, there is no effect (break). By contrast, in case
WM_LBUTTONDOWN, if iState equals WAITING, iState is changed to DRAWING and the
endpoint coordinates are assigned.
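The two-state behavior of Table 1 can be sketched as nested switch statements in portable C. This is an illustration only, not the Tool function from Listing Two: the EV_ codes are hypothetical stand-ins for the WM_ messages, and the ToolState structure with its nSaved counter replaces Tool's static variables and the code that draws and saves a figure.

```c
#include <assert.h>

enum { WAITING, DRAWING };                             /* system states */
enum { EV_LBUTTONDOWN, EV_MOUSEMOVE, EV_LBUTTONUP };   /* mouse events  */

typedef struct {
    int iState;           /* WAITING or DRAWING            */
    int x1, y1, x2, y2;   /* endpoints of the figure       */
    int nSaved;           /* figures completed so far      */
} ToolState;

static void tool_init(ToolState *t)
{
    t->iState = WAITING;
    t->x1 = t->y1 = t->x2 = t->y2 = 0;
    t->nSaved = 0;
}

/* Outer switch on the event, inner switch on the state. */
static void tool(ToolState *t, int event, int x, int y)
{
    assert(t != 0);
    switch (event) {
    case EV_LBUTTONDOWN:
        switch (t->iState) {
        case WAITING:               /* start the tool */
            t->iState = DRAWING;
            t->x1 = t->x2 = x;
            t->y1 = t->y2 = y;
            break;
        case DRAWING:               /* already drawing; ignore */
            break;
        }
        break;

    case EV_MOUSEMOVE:
        switch (t->iState) {
        case WAITING:               /* tool not started; nothing to do */
            break;
        case DRAWING:               /* rubber band: track the far corner */
            t->x2 = x;
            t->y2 = y;
            break;
        }
        break;

    case EV_LBUTTONUP:
        switch (t->iState) {
        case WAITING:
            break;
        case DRAWING:               /* finish the figure, stop the tool */
            t->x2 = x;
            t->y2 = y;
            t->nSaved++;
            t->iState = WAITING;
            break;
        }
        break;
    }
}
```

Each inner case corresponds to one cell of the state table, so an empty cell in the table shows up in the code as a case that does nothing but break.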
Draw's tools are simple; thus Table 1 is correspondingly simple. Table 2 is a
slightly more involved example that describes what would happen if two mouse
events, right-button-down (WM_RBUTTONDOWN) and right-button-up (WM_RBUTTONUP),
were added to translate (that is, change the location of) the figure being
drawn. In this example, when you depress the right button while drawing, mouse
moves translate the figure without changing its shape.
Table 2: This state table extends the relationships in Table 1 by adding two
mouse events and another system state.

                                           Mouse Event
 System State  WM_LBUTTONDOWN  WM_MOUSEMOVE  WM_LBUTTONUP  WM_RBUTTONDOWN  WM_RBUTTONUP
 ---------------------------------------------------------------------------------------

 WAITING       DRAWING         --            --            --              --
 DRAWING       --              --            WAITING       TRANSLATING     --
 TRANSLATING   --              --            WAITING       --              DRAWING

To use Table 2, enter at the initial system state, WAITING, and read across.
WM_LBUTTONDOWN causes a transition to DRAWING and starts the tool, as before.
WM_LBUTTONUP causes a transition back to WAITING, as before, and stops the
tool. An intervening WM_RBUTTONDOWN, however, changes the state to
TRANSLATING. In state TRANSLATING, WM_MOUSEMOVEs cause translations rather
than rubber banding. WM_RBUTTONUP changes the state back to DRAWING. You can
alternate between rubber banding (state DRAWING) and translating (state
TRANSLATING) until the final WM_LBUTTONUP.
This expanded tool responds to the mouse-event sequence: WM_LBUTTONDOWN,
WM_MOUSEMOVE, WM_RBUTTONDOWN, WM_MOUSEMOVE, WM_RBUTTONUP, WM_MOUSEMOVE,
WM_LBUTTONUP, and this sequence is reflected in a path through the state table
as: WAITING --> DRAWING --> TRANSLATING --> DRAWING --> WAITING.
In implementing the state tables, we coded them as two-dimensional switches,
that is, nested switch statements. More elaborate tables might require an
array-based approach. In Draw, the mouse event controls the outer switch
statement, and the state variable controls the inner one. For Table 2, the
skeleton for case WM_MOUSEMOVE is shown in Example 1. To fully flesh out the
example, case blocks for WM_RBUTTONDOWN and WM_RBUTTONUP would have to be
added to the switch statement that selects from iMessage in Tool. Also, the
state variable iState would have to be changed accordingly.
Example 1: The skeleton for case WM_MOUSEMOVE

 case WM_MOUSEMOVE:
   switch (iState)
     {
     case WAITING:
       /* Tool not started; nothing to do. */
       break; /* WAITING */
     case DRAWING:
       /* User is rubber banding. Erase old figure and draw new
          figure. Reset statics (x2,y2) to mouse coordinates. */

       ............ rubber banding code here ............
       break; /* DRAWING */
     case TRANSLATING:
       /* User is translating. Erase old figure and draw new
          figure. Reset statics (x1,y1) and (x2,y2) to translated
          values. */

       ............. translating code here ............
       break; /* TRANSLATING */
     } /* switch (iState) */
   break; /* WM_MOUSEMOVE */
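
The array-based alternative to nested switches can be sketched by writing Table 2 directly as a two-dimensional array. The sketch is portable C and is not part of Draw: the EV_ codes stand in for the WM_ messages, and the names next_state, SAME, step, and run are hypothetical. It tracks state transitions only; the drawing and translating work done in each state is left out.

```c
#include <assert.h>
#include <stddef.h>

enum { WAITING, DRAWING, TRANSLATING, N_STATES };
enum { EV_LBUTTONDOWN, EV_MOUSEMOVE, EV_LBUTTONUP,
       EV_RBUTTONDOWN, EV_RBUTTONUP, N_EVENTS };

#define SAME (-1)   /* the "--" cells of the table: no transition */

/* Table 2, one row per state and one column per mouse event. */
static const int next_state[N_STATES][N_EVENTS] = {
    /*                LDOWN    MOVE  LUP      RDOWN        RUP     */
    /* WAITING     */ { DRAWING, SAME, SAME,    SAME,        SAME    },
    /* DRAWING     */ { SAME,    SAME, WAITING, TRANSLATING, SAME    },
    /* TRANSLATING */ { SAME,    SAME, WAITING, SAME,        DRAWING }
};

/* Apply one event and return the resulting state. */
static int step(int state, int event)
{
    int next;
    assert(state >= 0 && state < N_STATES);
    assert(event >= 0 && event < N_EVENTS);
    next = next_state[state][event];
    return next == SAME ? state : next;
}

/* Apply the first n events of a sequence and return the final state. */
static int run(int state, const int *events, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        state = step(state, events[i]);
    return state;
}
```

Feeding run the sequence left-down, move, right-down, move, right-up, move, left-up from WAITING walks the path WAITING, DRAWING, TRANSLATING, DRAWING, WAITING described above. Adding a state or an event then means adding a row or a column rather than editing several switch statements.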

Table 2 demonstrates that, as permissible sequences of mouse events and
consequent system states are added to a program being developed, the
complexity of the interactions increases rapidly. The code that describes
these interactions necessarily becomes equally complex. Poorly managed
complexity leads to intractability. State tables are a way to cut through this
complexity. If state tables describing the interactions are
constructed before the code is written, they provide a guide for writing the
code. New features can be added with only a minimal alteration of working
code. State tables become a means of managing complexity and are therefore a
valuable aid in writing and documenting Windows applications that make heavy
use of mouse events.

_PROGRAMMING WINDOWS USING STATE TABLES_
by Michael A. Bertrand and William R. Welch


[LISTING ONE]

/***********DRAW.H : header file for DRAW.C **************************/

#define WAITING 0 /* the possible values for variable iState in */

#define DRAWING 1 /* Tool() are WAITING and DRAWING */

/* These constants are the possible values for iMenuChoice, the variable
 * recording the user's menu choice. The old menu choice must be stored
 * so the check mark can be removed from the menu when a new menu choice
 * is made. Do not change. */
#define IDM_RECT 100
#define IDM_ROUND_RECT 101
#define IDM_ELLIPSE 102
#define IDM_LINE 103
#define IDM_ABOUT 104

/* These constants are the possible values for iFigType, the variable
 * recording the current FIGURE, as chosen through the menu. The value is
 * also stored in the iType field in faList[] and is used to determine
 * which drawing function is called upon from DrawFig[], the array of
 * pointers to functions; since these values are indices into an array,
 * starting at 0, they may not be changed. */
#define FT_RECT (IDM_RECT - IDM_RECT)
#define FT_ROUND_RECT (IDM_ROUND_RECT - IDM_RECT)
#define FT_ELLIPSE (IDM_ELLIPSE - IDM_RECT)
#define FT_LINE (IDM_LINE - IDM_RECT)

/* maximum number of FIGUREs in faList[] */
#define MAX_FIGS 1000

/* FIGUREs in faList[]: rectangle, rounded rectangle, ellipse, line */
typedef struct
{ int iType;
 RECT rsCoord;
} FIGURE;

/* global variables */
FIGURE faList[MAX_FIGS]; /* List of FIGUREs */
int iListSize; /* tally number of displayed FIGUREs */
HANDLE hInst; /* current instance */
RECT rClient; /* client area in scr coords for ClipCursor() */

/* function prototypes */
long FAR PASCAL WndProc(HWND hWnd, unsigned iMessage, WORD wParam,
 LONG lParam);
void NEAR PASCAL Tool(HWND hWnd, unsigned iMessage, LONG lParam,int iFigType);
BOOL FAR PASCAL DrawRoundRect(HDC hDC, int x1, int y1, int x2, int y2);
BOOL FAR PASCAL DrawLine(HDC hDC, int x1, int y1, int x2, int y2);
BOOL FAR PASCAL AboutDraw(HWND hDlg, unsigned message, WORD wParam,
 LONG lParam);
/* DrawFig[] is an array of pointers to FAR PASCAL functions, each with parms
 * (HDC,int,int,int,int) and returning BOOL. Note Rectangle() and Ellipse() are
 * MS Windows GDI calls, while DrawRoundRect() and DrawLine() are our calls.
 */
BOOL (FAR PASCAL *DrawFig[4])(HDC hDC, int x1, int y1, int x2, int y2)
 = {Rectangle, DrawRoundRect, Ellipse, DrawLine};









[LISTING TWO]

/******* DRAW.C by Michael A. Bertrand and William R. Welch. *******/

#include <windows.h>
#include "draw.h"

int PASCAL WinMain(HANDLE hInstance, HANDLE hPrevInstance, LPSTR lpszCmdLine,
 int nCmdShow)
 /* hInstance : current instance handle
 * hPrevInstance : previous instance handle
 * lpszCmdLine : current command line
 * nCmdShow : display either window or icon
 */
{ static char szAppName [] = "Draw";
 static char szIconName[] = "DrawIcon";
 static char szMenuName[] = "DrawMenu";

 HWND hWnd; /* handle to WinMain's window */
 MSG msg; /* message dispatched to window */
 WNDCLASS wc; /* for registering window */

 /* Save instance handle in global var so can use for "About" dialog box. */
 hInst = hInstance;

 if (!hPrevInstance) /* Register application window class. */
 { wc.style = CS_HREDRAW | CS_VREDRAW;
 wc.lpfnWndProc = WndProc; /* function to get window's messages */
 wc.cbClsExtra = 0;
 wc.cbWndExtra = 0;
 wc.hInstance = hInstance;
 wc.hIcon = LoadIcon(hInstance, szIconName);
 wc.hCursor = LoadCursor(NULL, IDC_ARROW);
 wc.hbrBackground = GetStockObject(WHITE_BRUSH);
 wc.lpszMenuName = szMenuName; /* menu resource in RC file */
 wc.lpszClassName = szAppName; /* name used in call to CreateWindow() */
 if (!RegisterClass(&wc))
 return(FALSE);
 }
 /* Initialize specific instance. */
 hWnd = CreateWindow(szAppName, /* window class */
 szAppName, /* window caption */
 WS_OVERLAPPEDWINDOW, /* normal window style */
 CW_USEDEFAULT, /* initial x-position */
 CW_USEDEFAULT, /* initial y-position */
 CW_USEDEFAULT, /* initial x-size */
 CW_USEDEFAULT, /* initial y-size */
 NULL, /* parent window handle */
 NULL, /* window menu handle */
 hInstance, /* program instance handle */
 NULL); /* create parameters */

 ShowWindow(hWnd, nCmdShow); /* display the window */
 UpdateWindow(hWnd); /* update client area; send WM_PAINT */

 /* Read msgs from app queue and dispatch them to appropriate win function.
 * Continues until GetMessage() returns NULL when it receives WM_QUIT. */
 while (GetMessage(&msg, NULL, NULL, NULL))
 { TranslateMessage(&msg); /* process char input from keyboard */

 DispatchMessage(&msg); /* pass message to window function */
 }
 return(msg.wParam);
}
/****************************************************************/
long FAR PASCAL WndProc(HWND hWnd,unsigned iMessage, WORD wParam, LONG lParam)
 /* IN: hWnd : handle to window
 * iMessage : m^C





[LISTING THREE]

THIS LISTING IS CURRENTLY UNAVAILABLE




[LISTING FOUR]

THIS LISTING IS CURRENTLY UNAVAILABLE




[LISTING FIVE]

THIS LISTING IS CURRENTLY UNAVAILABLE




